• + 1 comment

    Hi Virsalus.


    Hummm... Forget my timings since I've compiled the code without optimizations. My bad! :-(


    After having recompiled with -O3 and ran on my computer, my algorithm took 0.003s and my implementation of Floyd's algorithm (based on dbrenton23's) took 0.459s. Of course the timings vary (even for different runs on the same machine) but this gives an idea of orders of magnitude.


    Now, please, allow me to comment on your implementation.


    1) Firstly it allocates memory and compute the whole sequence a[0], ..., a[N-1]. This imposes a complexity of (at least) O(N) in space and time. If you don't do this and, following [1], compute the values on the fly then the implementation gets complexity of O(1) in space and O(lam + mu) in time (where lam and mu are as in [1]).


    2) To avoid an out-of-bound access to a, you've introduced the stopping criterion (2*i)+1 > N-1 in which case your algorithm outputs N. This is a bug as the test case "5 1 2048 0" shows. Indeed, in this case the sequence, modulo 2^31, is 1, 2048, 4194304, 0, 0. Hence, there are 4 distinct elements but when i == 2, we have (2*i)+1 == 5 > 4 == N-1. At this point your implementation stops and outputs 5 (not 4). Notice that the first loop of Floyd's algorithm looks for a cycle but not necessarilly the first one. This means that the algorithm might need to go far beyond N to find this cycle and your implementation is wrongly preventing it from doing so. (AFAIK the standard problem of cycle-finding is not bound by any value of N, it's Hackerrank's problem that introduced that twist.) One way to fix this is, again, computing the values on the fly, allowing the algorithm to go as far as it needs to.


    Look at dbrenton23's implementation. He has managed to adapt the original Floyd's algorithm (with no bounds) to the Hackerrank's problem (bounded by N).


    Going back to my algorithm: it has O(1) in space and time. To be very precise, the algorithm has just one loop that is executed no more that min(lg N, 60) times. Since N < 10^8, we have lg N < lg (10^8) < 27. Hence, the loop is executed no more than 27 times. At this point, one can argue that the algorithm is O(lg N) in time but because of the min with 60 it means that even if N was infinity (or, in other words, if there were no bound) the loop would be executed at most 60 times. This guarantees O(1) time complexity.


    A natural question is where does the number 60 comes from. As I said in my original post the math is a bit lengthy to be done here but I can tell that it comes from 60 = 2 * (31 - 1) where the number 31 is used because we are interested in integers modulo 2^31.


    My implementation is available at [2].


    Thanks.


    [1] https://en.wikipedia.org/wiki/Cycle_detection#Tortoise_and_hare

    [2] http://ideone.com/gFQgqT