Backtracking (and recursion)
Sum to $15$
We would like to solve the following problem - first by hand, and then later by computer. Consider a $3\times 3$ matrix containing numbers $1..9$ with the following constraints.
- Each number $1..9$ should occur once in the matrix.
- The sum of all rows, columns and diagonals should be $15$.
We will try to reason carefully about the solution, in such a way that we will be able to translate our arguments to code later in the lesson.
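The two constraints can be written down as a check on a completed matrix. The following is only a minimal sketch: the function name `is_solution` and the list-of-rows representation are choices made here for illustration and are not used in the later code of this lesson.

```python
def is_solution(m):
    """Return True if the 3x3 matrix m (a list of three rows) meets both constraints."""
    entries = [x for row in m for x in row]
    # Constraint 1: each of the numbers 1..9 occurs exactly once.
    if sorted(entries) != list(range(1, 10)):
        return False
    rows = [sum(row) for row in m]
    cols = [sum(m[r][c] for r in range(3)) for c in range(3)]
    diags = [m[0][0] + m[1][1] + m[2][2], m[0][2] + m[1][1] + m[2][0]]
    # Constraint 2: every row, column and diagonal sums to 15.
    return all(s == 15 for s in rows + cols + diags)


print(is_solution([[2, 7, 6], [9, 5, 1], [4, 3, 8]]))  # True
print(is_solution([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # False
```

A check like this only judges finished matrices; the reasoning that follows works with partial information instead.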
At the beginning we have no information about what the entries in the matrix should contain. We can express this by saying that our partial solution is $$\begin{pmatrix}1\dots 9 & 1\dots 9 & 1\dots 9 \\ 1\dots 9 & 1\dots 9 & 1\dots 9 \\ 1\dots 9 & 1\dots 9 & 1\dots 9\end{pmatrix}$$
To start searching for a solution, we can make an assumption on the content in one or more of the entries in the matrix. For instance, a solution will have to be of one of the following two forms. $$\begin{pmatrix}2\dots 9 & 1\dots 9 & 1\dots 9 \\ 1\dots 9 & 1\dots 9 & 1\dots 9 \\ 1\dots 9 & 1\dots 9 & 1\dots 9\end{pmatrix}\mbox{ or }\begin{pmatrix}1 & 1\dots 9 & 1\dots 9 \\ 1\dots 9 & 1\dots 9 & 1\dots 9 \\ 1\dots 9 & 1\dots 9 & 1\dots 9\end{pmatrix}$$ Dividing the solution into different cases in this way is called branching. The division of the problem we chose here is one particular approach to branching - there are many.
Having assumed that the first entry is $1$, we can propagate (i.e. apply) the constraints to eliminate $1$ from all the other entries in the matrix. We can also eliminate further values using the constraint that sums must equal $15$: the two entries sharing a row, column or diagonal with the $1$ must sum to $14$, which rules out $2,3,4$ in those positions. In other words, our partial solution has to be of the form $$\begin{pmatrix}1 & 5\dots 9 & 5\dots 9 \\ 5\dots 9 & 5\dots 9 & 2,3,4 \\ 5\dots 9 & 2,3,4 & 5\dots 9\end{pmatrix}$$
We see that only two entries can contain the three numbers $2,3,4$, which is impossible. Our assumption of having $1$ in the top-left corner was therefore wrong, and we discard this possibility. We go back to our list of partial solutions (we backtrack) and branch again, giving us the two partial solutions $$\begin{pmatrix}3\dots 9 & 1\dots 9 & 1\dots 9 \\ 1\dots 9 & 1\dots 9 & 1\dots 9 \\ 1\dots 9 & 1\dots 9 & 1\dots 9\end{pmatrix}, \begin{pmatrix}2 & 1\dots 9 & 1\dots 9 \\ 1\dots 9 & 1\dots 9 & 1\dots 9 \\ 1\dots 9 & 1\dots 9 & 1\dots 9\end{pmatrix}.$$
We propagate constraints on the last matrix. First of all, we can remove $2$ as a possibility in all entries except the top-left corner. Now $1$ and $3$ cannot be used with $2$ to form a sum of $15$, and so there are only two entries which can contain $1$ and $3$. The partial solution then has to have the form $$\begin{pmatrix}2 & 4\dots 9 & 4\dots 9 \\ 4\dots 9 & 4\dots 9 & 1,3 \\ 4\dots 9 & 1,3 & 4\dots 9\end{pmatrix}$$
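Both elimination steps above (for $1$ and for $2$ in the top-left corner) can be checked mechanically. The sketch below lists the values that can share a row, column or diagonal with a given corner value; the helper name `partners` is hypothetical and introduced only for this illustration.

```python
def partners(corner):
    """Values that can share a row, column or diagonal with `corner`.

    A line through the corner holds corner + a + b = 15, where a and b are
    distinct, differ from corner, and are taken from 1..9.
    """
    values = set(range(1, 10)) - {corner}
    return sorted(a for a in values
                  if (15 - corner - a) in values and (15 - corner - a) != a)


print(partners(1))  # [5, 6, 8, 9]
print(partners(2))  # [4, 5, 6, 7, 8, 9]
```

This check is slightly sharper than the ranges used in the text: next to $1$ it also drops $7$, because $7$ would need a second $7$ to reach $14$. The hand argument does not need this extra pruning.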
We could argue further to remove more entries, but instead we choose to branch again, this time on the middle right entry, so that our list now holds the following three partial solutions $$\begin{pmatrix}3\dots 9 & 1\dots 9 & 1\dots 9 \\ 1\dots 9 & 1\dots 9 & 1\dots 9 \\ 1\dots 9 & 1\dots 9 & 1\dots 9\end{pmatrix},\begin{pmatrix}2 & 4\dots 9 & 4\dots 9 \\ 4\dots 9 & 4\dots 9 & 3 \\ 4\dots 9 & 1,3 & 4\dots 9\end{pmatrix},\begin{pmatrix}2 & 4\dots 9 & 4\dots 9 \\ 4\dots 9 & 4\dots 9 & 1 \\ 4\dots 9 & 1,3 & 4\dots 9\end{pmatrix}$$
We apply constraints to the last matrix, this time with a slightly more involved argument. Since the middle right entry is $1$, the bottom middle entry must be $3$. If the lower right corner is odd, then the bottom left and top right corners are odd as well, because the bottom row and the right column must each sum to $15$. The diagonal through the top-left corner then forces the middle entry to be even, so the other diagonal would consist of two odd numbers and one even number. Its sum would be even, which contradicts the sum being $15$. We can conclude that the lower right corner is even, which implies that the other two corners are even as well. The middle tile must then be odd, and so must the middle left and middle top entries. We have reduced to the following situation $$\begin{pmatrix}2 & 5,7,9 & 6,8 \\ 5,7,9 & 5,7,9 & 1 \\ 4,6,8 & 3 & 6,8\end{pmatrix}$$ We can continue the argument. The top right and lower right corners take $6$ and $8$ in some order, so the lower left corner must be $4$, which gives $9$ in the middle left and $5$ in the middle. But then $7$ is at the middle top and we can place $6$ and $8$, giving us a solution $$\begin{pmatrix}2 & 7 & 6 \\ 9 & 5 & 1 \\ 4 & 3 & 8\end{pmatrix}$$
This is by no means the only solution to the problem. To find the other solutions we go back to our list of partial solutions and retrieve the last entry $$\begin{pmatrix}2 & 4\dots 9 & 4\dots 9 \\ 4\dots 9 & 4\dots 9 & 3 \\ 4\dots 9 & 1,3 & 4\dots 9\end{pmatrix}$$ Applying constraints to this partial solution gives us the solution $$\begin{pmatrix}2 & 9 & 4 \\ 7 & 5 & 3 \\ 6 & 1 & 8\end{pmatrix}$$
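The hand computation can be cross-checked independently by brute force, that is, by testing all $9!$ arrangements rather than by backtracking. The sketch below assumes the matrix is stored as a flat tuple in row-major order; the names `LINES` and `solutions` are illustrative only.

```python
from itertools import permutations

# The eight lines of the square, as index triples into a flat row-major tuple.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

solutions = [p for p in permutations(range(1, 10))
             if all(p[a] + p[b] + p[c] == 15 for a, b, c in LINES)]

print(len(solutions))  # 8
# Show the solutions with 2 in the top-left corner, one matrix per line.
for p in solutions:
    if p[0] == 2:
        print(p[0:3], p[3:6], p[6:9])
```

The search finds eight solutions in total, none of them with $1$ in the top-left corner, and the ones printed can be compared with the matrices obtained by hand.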
In this way we can continue until we find all possible solutions to the problem. In pseudo-code, we can summarise the method as
loop until list of partial solutions is empty
    remove the last entry from the list
    reason about the partial solution (i.e. propagate constraints)
    there are now three cases
        1. the solution is invalid, in which case it is discarded
        2. we have obtained a solution, which is recorded
        3. divide the partial solution into cases and add them to our list.
This method, which we will call backtracking, can be used to solve a whole range of problems.
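The pseudo-code above can be sketched as a small generic driver. This is an illustration under stated assumptions, not the implementation used later in the lesson: the function `backtrack` and its three callback parameters `propagate`, `status` and `branch` are names invented here. The toy usage builds distinct lists of length 3 over $0..2$ and uses a propagator that does nothing, so all the work is done by branching.

```python
def backtrack(initial, propagate, status, branch):
    """Generic driver for the loop above (all four names are illustrative).

    propagate(p) -> p after applying constraints
    status(p)    -> "invalid", "solved" or "partial"
    branch(p)    -> list of partial solutions that together cover p
    """
    stack = [initial]
    solutions = []
    while stack:
        partial = propagate(stack.pop())
        state = status(partial)
        if state == "solved":
            solutions.append(partial)
        elif state == "partial":
            stack.extend(branch(partial))
        # "invalid": discard by doing nothing
    return solutions


# Toy use: tuples of length 3 over 0..2 with no repeated value.
def status(p):
    if len(set(p)) < len(p):
        return "invalid"
    return "solved" if len(p) == 3 else "partial"


found = backtrack((), lambda p: p, status,
                  lambda p: [p + (v,) for v in range(3)])
print(sorted(found))
# [(0, 1, 2), (0, 2, 1), (1, 0, 2), (1, 2, 0), (2, 0, 1), (2, 1, 0)]
```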
When implementing the method we need to make specific choices about how to propagate constraints and how to branch. For larger problems, typical of real-world applications, considerable effort is spent on designing a propagator and a brancher suited to the specific problem. The two tasks interact: good propagators lead to less branching, and good branchers simplify the work of the propagator. Writing good propagators can be difficult, and they may be expensive to run, but simple propagators can lead to intolerable amounts of branching. We will not go any deeper into these issues here, but instead use fairly simple propagators and branchers in our example implementations.
Distinct list
Consider the following problem. We would like to construct all lists of length $5$ in which each entry is an integer $0..4$ and no integer occurs twice, using the approach above. We can think of this problem as computing the sample space for $5$ people in a queue, which was discussed in module 1. We will use Python to implement a solution to this problem using backtracking. To start with, we define three data structures which we will work with
current = [set(range(5)) for i in range(5)]
stack = [current]
result = []
The list current is the partial solution we work with at any time. Each of its entries is the set of values still possible in that position, so at the moment there are no restrictions on the entries in the list. The stack will be our list of partial solutions above. A stack is a list where we restrict ourselves to adding and removing elements at the back of the list. Python lists have two methods which let us do exactly that
# adding to the back of the list
stack.append(x)
# removing from the back of the list and returning the value
stack.pop()
The stack is initialised with the initial partial solution. At the start result is empty, but we will be adding solutions to the problem as we find them.
The main loop of the program is as follows
from copy import deepcopy

while stack:
    # Pop the next element off the stack
    current = stack.pop()
    # propagate constraints until nothing changes
    done = False
    while not done:
        done = applyConstraints(current)
    # Query state of current solution
    minSize = 100
    maxSize = -100
    maxIndex = -1
    for i in range(len(current)):
        length = len(current[i])
        if minSize > length:
            minSize = length
        if maxSize < length:
            maxSize = length
            maxIndex = i
    # Discard, store or branch
    if minSize > 0:
        # store if we have a solution
        if maxSize == 1:
            result.append(current)
        # branch if no solution yet
        else:
            prev = deepcopy(current)
            current[maxIndex] = set([prev[maxIndex].pop()])
            stack.append(prev)
            stack.append(current)
Propagating constraints is done in the function applyConstraints(), which takes a partial solution as an argument. The function returns False if the partial solution was changed by applying the constraints and True otherwise, and so we loop until no more changes are possible.
After propagating constraints we are in one of three cases. Either we have an invalid solution, or a valid complete solution, or the solution is still partial, in which case we will need to do more branching. We find out which of these three situations we are in by finding the sizes of the largest and the smallest set in the partial solution current. If there is an empty set in current, then the solution is invalid and can be discarded. If all sets are singletons, then the solution is valid and complete, and we store it in result. Otherwise the solution is still partial and we branch.
Branching is done here by splitting off one element from the largest set in current. For instance, a partial solution such as
[{0,1},{2,3,4},...]
would be branched into the two partial solutions
[{0,1},{4},...] [{0,1},{2,3},...]
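The branching step can be isolated in a helper to make the split explicit. This is a sketch only: the function name `branch` is introduced here and does not appear in the program above. Note that `set.pop()` removes an arbitrary element, so which value is split off is not guaranteed.

```python
from copy import deepcopy


def branch(partial):
    """Split off one element from the largest set and return the two cases."""
    index = max(range(len(partial)), key=lambda i: len(partial[i]))
    rest = deepcopy(partial)
    chosen = rest[index].pop()      # arbitrary element of the largest set
    single = deepcopy(partial)
    single[index] = {chosen}
    return rest, single


rest, single = branch([{0, 1}, {2, 3, 4}])
print(rest, single)
```

The two cases are disjoint and together cover the original partial solution, which is what guarantees that every solution is found exactly once.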
We look a little closer at the applyConstraints function
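def applyConstraints(solution):
    done = True
    for i in range(len(solution)):
        if len(solution[i]) == 1:
            # take the value out of the singleton, so that the
            # loop below leaves solution[i] itself untouched
            temp = solution[i].pop()
            # prune the value from all the other sets
            for y in solution:
                oldlen = len(y)
                y.discard(temp)
                if oldlen != len(y): done = False
            # put the singleton back
            solution[i] = set([temp])
    return done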
The function searches the partial solution for singletons, and then removes those singletons from any of the other subsets in the solution.
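A single pass is not always enough, because pruning can create new singletons at positions the pass has already visited. The sketch below repeats the propagator so that it is self-contained and counts the passes needed on a hand-picked partial solution; the variable `passes` and the example input are introduced here for illustration.

```python
def applyConstraints(solution):
    """One propagation pass; returns True if nothing changed."""
    done = True
    for i in range(len(solution)):
        if len(solution[i]) == 1:
            temp = solution[i].pop()
            for y in solution:
                oldlen = len(y)
                y.discard(temp)
                if oldlen != len(y):
                    done = False
            solution[i] = {temp}
    return done


# Each pass creates a new singleton to the left of the previous one.
partial = [set(range(5)), set(range(5)), {0, 1, 2}, {0, 1}, {0}]
passes = 0
done = False
while not done:
    done = applyConstraints(partial)
    passes += 1

print(passes)                                      # 4
print(partial == [{3, 4}, {3, 4}, {2}, {1}, {0}])  # True
```

Three passes each remove one more value, and the fourth pass confirms that nothing changes any more.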
The program computes all the $5!=120$ lists (or queues) that solve the problem. This code can be modified to compute all the sample spaces for the queue-problems discussed earlier in module 1.
Distinct ordered list
We will compute the $C(7,4)=35$ lists of length $4$ with strictly increasing entries, the entries being constrained to $0..6$. We can use a similar setup as before.
current = [set(range(7)) for i in range(4)]
stack = [current]
result = []
The only function we need to rewrite is the propagator, which we implement as
def applyConstraints(solution):
    done = True
    for i in range(len(solution)):
        if len(solution[i]) == 1:
            # prune the later entries: keep only the values
            # that are greater than the singleton
            sing = solution[i].pop()
            for j in range(i+1,len(solution)):
                oldlen = len(solution[j])
                solution[j]=set(filter(lambda x : x>sing, solution[j]))
                if oldlen != len(solution[j]): done = False
            solution[i] = set([sing])
    return done
Here, after having identified a singleton, we remove that value and all smaller values from every subsequent entry in the partial solution.
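To see what this propagator does and does not prune, it can be applied to a single partial solution. The sketch below restates the propagator with a set comprehension so that it is self-contained; the example input is chosen here for illustration, and the last line cross-checks the expected count with `itertools.combinations`.

```python
from itertools import combinations


def applyConstraints(solution):
    """One pass: later entries keep only values above each earlier singleton."""
    done = True
    for i in range(len(solution)):
        if len(solution[i]) == 1:
            (sing,) = solution[i]
            for j in range(i + 1, len(solution)):
                kept = {x for x in solution[j] if x > sing}
                if kept != solution[j]:
                    solution[j] = kept
                    done = False
    return done


partial = [{2}, set(range(7)), set(range(7)), {5}]
while not applyConstraints(partial):
    pass

print(partial == [{2}, {3, 4, 5, 6}, {3, 4, 5, 6}, {5}])  # True
print(len(list(combinations(range(7), 4))))               # 35
```

The singleton $5$ in the last position does not prune the entries before it, so values such as $5$ and $6$ survive in the middle. They are only eliminated after further branching, which illustrates the trade-off between simple propagators and the amount of branching.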
The program can be modified to compute all the sample spaces for the card examples in module 1.
Latin squares
A Latin square with $4$ letters is a $4\times 4$ matrix with entries $0..3$, each occurring once in each row and column. We will compute the 576 different $4\times 4$ Latin squares using backtracking. We look at the applyConstraints function
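def applyConstraints(latin):
    done = True
    for i in range(16):
        if len(latin[i]) == 1:
            x = list(latin[i])[0]
            for j in range(4):
                # cells in the same row and in the same column as cell i
                for k in (4*(i//4)+j, (i%4)+j*4):
                    if k != i and x in latin[k]:
                        latin[k].discard(x)
                        done = False
    # as in the previous examples, report whether anything changed
    return done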
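where we choose a slightly simpler implementation compared with the previous examples. The square is stored as a flat list of 16 sets in row-major order, so entry i lies in row i//4 and column i%4.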
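For completeness, here is a self-contained sketch that puts a propagator of this kind together with a backtracking driver and counts the squares. It makes several assumptions that are not part of the text above: the propagator reports whether anything changed, the driver is wrapped in a function called `solve`, and sets are copied with `set(s)` instead of `deepcopy` purely for speed.

```python
def applyConstraints(latin):
    """One pass over a flat 4x4 grid of sets; returns True if nothing changed."""
    done = True
    for i in range(16):
        if len(latin[i]) == 1:
            (x,) = latin[i]
            row, col = divmod(i, 4)
            peers = [4 * row + j for j in range(4)] + [col + 4 * j for j in range(4)]
            for k in peers:
                if k != i and x in latin[k]:
                    latin[k].discard(x)
                    done = False
    return done


def solve(initial):
    """Backtracking driver: propagate, then discard, store or branch."""
    stack, result = [initial], []
    while stack:
        current = stack.pop()
        while not applyConstraints(current):
            pass
        sizes = [len(s) for s in current]
        if min(sizes) == 0:
            continue                                   # invalid: discard
        if max(sizes) == 1:
            result.append([min(s) for s in current])   # complete: store values
            continue
        index = sizes.index(max(sizes))                # branch on a largest set
        rest = [set(s) for s in current]
        current[index] = {rest[index].pop()}
        stack += [rest, current]
    return result


squares = solve([set(range(4)) for _ in range(16)])
print(len(squares))  # 576
```

Each stored square is a flat list of 16 values, so row r is `square[4*r:4*r+4]` and column c is `square[c::4]`.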