Everyone knows that testing is one of the essential components of a production code. There are lots of different metrics on testing to support the importance of it. But it is rare to see any good examples in the real world.

That changed, when I was told about SQLite!

Marking connected graphs

Problem statement

Suppose you have a graph and you want to know how many nodes are connected in a cluster. How can we find out?

[Stop here if you want to think about it]

Outline of ideas

Essentially we want to visit a node, mark it as visited and add it to the counter. After this we would like to continue to its neighbours. This inspires us to use recursion.

Python code

Suppose a graph G is a dictionary given in the following form.

G[v1] = [neighbour1, neighbour2, ... ]

We can use the following code to visit each node systematically.

def MarkComponent(G, v, marked):
  # Using a dictionary is particularly convenient,
  # as when in recursion we can just pass it over.
  # If we use a list, we will need to initialize it somewhere
  # in the client code to do marked.append(v)
  marked[v] = True
  counter = 1
  for neighbour in G[v]:
    if neighbour not in marked:
      counter += MarkComponent(G, v, marked)

  return counter

Cycle detection: the tortoise and the hare

Problem statement

Consider a function f(x). We know it is a periodic function and would like to know the period. How can we find it?

Let’s add more conditions so that this question is manageable. Let’s assume f(x) is a discreet function, and each step is given recursively in the following sense:

x_1 = f(x_0)
x_2 = f(x_1) = f^2(x_0)

More generally,

x_n = f^n(x_0)

How can we find the period and the first element that kicks off the repetition?

[Stop here if you would like to think about it.]

Outline of the solution

There are various ways to solve this but a simple yet effective algorithm was discovered by Floyd. It is often called the tortoise and the hare algorithm. The algorithm has three steps.

1. Discover a repeating element.
2. Discover the first repeating element.
3. Discover the length of the repetition.

The ideas on each step comes in the following form.

1. Since we know that at some point there is a repetition with length l, we know that

x_i = x_{i+jl}

In particular when i = jl, we have x_{jl} = x_{2jl}. Essentially, if we search for forms x_k = x_{2k}, then we are guaranteed to find a repeating element.

2. Now we have found a repeating element. We want to find the first repeating element.

Let’s denote the first repeating element as x_m. Then we have the following relation:

x_m = x_{m+nl}.

Note that in step 1, we already have x_{jl}. Therefore, by searching from x_0 and x_{jl} as a starting point, and performing the search by incrementing in the following fashion:

Check whether: x_0 = x_{jl}
Check whether: x_1 = x_{jl+1}

we can find the first repeating element.

3. Once we know the first repeating element, finding the length is straightforward.

Python code

Suppose f(x) is given by an array A such that A = [5, 1, 3, 0, 4, 6, 2]. The function is defined as

x_1 = A[0] = 5
x_2 = A[x_1] = 6
x_3 = A[x_2] = 2

etc. The Python code for the Floyd’s algorithm is the following.

def FloydCycleDetection(A):
  # First we look for x_i = x_2i entries.
  tortoise = A[0] # x_1
  hare = A[A[0]] # x_2
  while tortoise != hare:
    tortoise = A[tortoise]
    hare = A[A[hare]]

  # At this point we found one element that is repeating.
  # Furthermore, it is of the following form x_jl where l is the length of period.
  # We find the first repeating element x_m such that x_m = x_{jl+m}
  # Let's send the hare to the start and search.
  hare = 0
  while tortoise != hare:
    tortoise = A[tortoise]
    hare = A[hare]

  # Now we know what is the first repeating element; hare is sitting on it.
  # We send the tortoise around to find the length of repetition.
  # In the following, note that we start from length 1 rather than 0.
  # This is because if we start from 0 the while loop will immediately terminate.
  length = 1
  tortoise = A[hare]
  while tortoise != hare:
    length += 1
    tortoise = A[tortoise]

return length, tortoise
Follow

Get every new post delivered to your Inbox.