8 years ago+ 0 comments

There are three aspects to solving this problem:

How to calculate the correct values without running out of universe
How to implement the calculations without running out of stack space
How to handle the very large numbers involved, using modulo nonsense

1. Calculating the correct values without running out of universe

The solution was designed to be solved using dynamic programming and memo-isation. This means:

It can be broken down into smaller sub-problems
The results of those smaller sub-problems can be cached and reused

The first problem is identification of the sub-problems, which is done as follows:

Each level of the city has four (corner) sub-cities and two connecting nodes that join the four sub-cities together in an ‘H’
Each sub-city has a node that is the access point from that sub-city to the rest of the sub-cities (and the two connecting nodes)

At level 1, each sub-city is a single node

At level 2, each sub-city is a level 1 city

At level 3, each sub-city is a level 2 city

Etc…

  *   * <----- sub-city
  |   |
  o - o <----- connecting node
  |   |
  *   *

To start off, I’ll leave the added complication of different costs to one side. I’ll bring it back in at the end of the explanation.

To make the problem composable, we have to devise a way to break the calculations of the distances between nodes up so that we can make use of the results of sub-problems. I propose that this can be done as follows:

A full node-to-node cost calculation for each city / sub-city (F)
A node-to-junction-node cost calculation for each city / sub-city (J)
A diagonal top-left node to bottom-right node calculation (D)
A node count per city calculation

In the following ascii art diagrams, you’ll see F, J and D. These are short-hand for Full, Junction and Diagonal. When they have numbers following them, it means that we are referring to a specific level of D. D1, means D for level 1, for example. When there is no number, we mean the previous level of F, J or D. If we are calculating level 5, D refers to D4 (D for level 4).

Node count per city calculation

This calculation is the simplest calculation so let’s consider it first. It is simply the number of nodes in a city of level n

For level 0, there is a single node so the answer is 1
For level 1, there are four sub-cities of a single point each and the two connecting nodes so the answer is 6
For level 2, each sub-city has 6 nodes so the answer is 6 * 4 + 2 = 26
For level n, the equation is count(n) = 4 * count(n-1) + 2

Diagonal top-left node to bottom-right node calculation

This calculation is required by the other calculations, so let’s consider it first. For this calculation, we want to travel diagonally from one corner of the city to the opposite corner of the city:

For level 0, this cost is 0 (base case). We know it is zero because if you consider the H shape for level one, there is only one node per sub-city and so no paths to traverse
For levels 1+ this is the cost of traversing two sub-cities plus the two centre connecting nodes
- For level 1, the cost of traversing each sub-city is 0 (level 0) + 3 for the centre connecting nodes
- For level 2, the cost of traversing each sub-city is 3 (level 1) + 3 for the centre connecting nodes = 9
- For level n, the equation is diagonal(n) = 2 * diagonal(n-1) + 3

Level 1   Level 2   Level n

o         D1        D
|         |         |
. - .     . - .     . - .
    |         |         |
    o        D1         D

Node to junction node calculation

Here is where things get interesting. We need to calculate the cost of paths from every node in a city to the junction node. The junction node is the node that will join the sub-city to the rest of the city for the next city level. For convention, we will pick the node at the bottom-right of the H as the sub-city containing the junction node:

For level 0, this cost is 0
For levels 1+, there are four sub-cities. We are calculating the cost of getting from every other node to the junction node, so the cost overall is:
- The cost of getting from the top-left sub-city to the junction node
- The cost of getting from the top-right sub-city to the junction node
- The cost of getting from the bottom_left sub-city to the junction node
- The cost of getting from nodes in the bottom right sub-city to the junction nodes
- The cost of getting from the connecting nodes to the junction node

    J   .        .   J       .   .       .   .       .   .       .   .
    |   .        .   |       .   .       .   .       .   .       .   .
    . - .   +    . . .   +   . - .   +   . . .   +   o - .   +   . . o
    .   |        .   |       |   |       .   .       .   |       .   |
    .   D        .   D       J   D       .   J       .   D       .   D

Full node-to-node cost calculation

This is the hardest calculation to visualise. We need the costs for:

Node to node inside each sub-city
Node to node for each pair of sub-cities (a node in one sub-city connected to a node in another sub-city)
Connecting nodes to each sub-city’s nodes

The problem is calculation of the costs for nodes in each pair of subcities. We could calculate it by iterating over each node in each sub-city but that is O(p^2) where p is the number of nodes in the sub-city. It gets very, very big very, very fast. Instead, we can exploit the following:

Each node in each sub-city will be used p times as it must connect to p nodes in the other sub-city
The cost of traversing from one sub-city to the other is always the same for a pair of sub-cities
As there are p * p paths between the two sub-cities, we just multiply the traversal cost by p^2

Our calculation ends up being J * 2 * P + (P * P * t) where:

J is the node-to-junction cost of the sub-city
P is the number of nodes in the sub-city
T is the traversal cost between the sub-cities

We can now calculate the whole cost of the full node-to-node calculation:

For level 0, this is 0
For level 1, this is 29 * the cost for this city level, as explained in the problem statement
For level 2+, this is calculated as follows:
- The full node-to-node cost of each sub-city
- The node to node cost of each pair of sub-cities
- The connecting node to sub-city node costs

    F   .       .   .       .   F       .   .
    .   .       .   .       .   .       .   .
    . . .   +   . . .   +   . . .   +   . . . +
    .   .       .   .       .   .       .   .
    .   .       F   .       .   .       .   F


    J   .       J   J       J   .       .   J       .   .       .   J
    |   .       |   |       |   .       .   |       .   .       .   |
    . . .   +   . - .   +   . - .   +   . - .   +   . - .   +   . . .   +
    |   .       .   .       .   |       |   .       |   |       .   |
    J   .       .   .       .   J       J   .       J   J       .   J


    J   .       .   J       .   .       .   .
    |   .       .   |       .   .       .   .
    o . .   +   o - .   +   o . .   +   o - .   +
    .   .       .   .       |   .       .   |
    .   .       .   .       J   .       .   J


    J   .       .   J       .   .       .   .
    |   .       .   |       .   .       .   .
    . - o   +   . . o   +   . - o   +   . . o
    .   .       .   .       |   .       .   |
    .   .       .   .       J   .       .   J

    PHEW!

Bringing costs into the mix

Costs may seem like an annoying additional complexity, but really they just require you to multiply the calculation for a given level by the cost for a given level. For example, the full node-to-node cost for a city at level 1 is normally 29. Multiplying this calculation by 2, as required for the second example gives you 58 rather than 29. Store 58 for the level 1 full cost and use it in subsequent calculations, and it all comes out fine.

2. Implement calculations without running out of stack space

You might be tempted to implement your solution using recursion, that is:

city_cost(n)
  city_cost(n - 1)
    city_cost(n - 2)
      …
        …
          city_cost(1)
            #calculate the base case

You will very-likely run out of stack space when n is 1000000, as this can generate a call-stack a million calls deep. Only minor changes are required to avoid this problem, however, so I leave it as an exercise to the reader.

3. Handling the very large numbers involved, using modulo nonsense

This was the aspect of the test that annoyed me. You can use a language that supports big numbers, but the numbers get very big and you are likely to hit timeout on the more demanding tests. If you are using python, you can get around this easily enough, but if you are using a language without native big integer support, such as c++, you have to watch out for intermediate steps of your calculations overflowing. Just ruin the beautiful, aesthetic formulae by ensuring that you don’t overflow intermediate calculations and you’ll be fine :)