# Pipes, Clojure and Choco - Optimisation

Earlier posts here (ending with part 5) have described using
Clojure and Choco to solve this pipes problem. Today’s
post uses similar ideas to look at *optimisation*.

Optimisation is useful when the constraints allow for many possible outcomes, but we want to select “the best” in some way.

To demonstrate this I have adapted the previous puzzle to answer a slightly different question:

If you can choose

anytile for the empty spaces, which choices give the shortest length of tubing?

Otherwise, the constraints are as before: all tiles must contain some pipes; pipes must connect across tiles; the question may specify some fixed, initial tiles.

An advantage of using Choco over a “hand written” DFS is immediately apparent: it’s easy to adapt Choco to solve this new problem. I am not sure what the best approach would be for adapting the code I wrote here - surely there’s something better than generating all constraints and then sorting them? But what?

Given Choco’s general API there are various ways to approach the problem. I
tried two: one using `costRegular`

(DFAs) and one using additional
cost variables. In both cases, most of the code is taken from the earlier
solution which uses binary relations to constrain neighbours.

## The DFA Approach

In a sense, this approach is “cheating”, or at least misleading, because the
DFA I use accepts all possible combinations. The only reason for using a DFA
is that I noticed that the `costRegular`

call in the Choco API provides an
easy way to specify a cost for each variable. And since that method requires
a DFA, I simply used one that accepts all patterns!

There is one wrinkle: specifying the DFA as a regular expression like `".*"`

does not work. I am not sure if this is a limitation in the DFA library, or
is driven by the need to know more details about possible values, but a
support post from the Choco team explained that the easiest
way to generate an “accept anything” DFA is to generate an empty one (that
accepts nothing) and then take the complement.

Apart from the trick above, the rest of the code is pretty simple: we
introduce a new variable, which is the sum of the costs (set by the
`costRegular`

API call, and indicate that it is an “objective variable” (ie
should be minimised); then we call the method, passing in the costs and
variables:

```
(defn str-array [values]
(into-array String values))
(defn ivar-array [values]
(into-array IntegerVariable values))
(defn cost [mask]
(cond
(#{0 1 2 4 8} mask) 99
(#{14 13 11 7} mask) 3
(= mask 15) 4
:else 2))
(defn constrain-sum-dfa [model n masks variables]
(let [sum
(Choco/makeIntVar "length" (int (* n n)) (int (* n n 4))
(into-array String ["cp:objective"]))
costs (int-array (map cost (range unknown)))
automaton (FiniteAutomaton.)]
(doseq [m masks] (.addToAlphabet automaton m))
(.addConstraint model
(Choco/costRegular sum (ivar-array (vals variables))
(.complement automaton)
(into-array (for [i (keys variables)] costs))))))
```

## The Cost Variables Approach

When I first started using Choco I was worried about the number of variables I was introducing. But it seems that adding variables is often the natural way to adapt a problem so that the constraints match the API. As long as the new variables are strongly determined (ie depend rigidly on existing variables) they don’t seem to “cost” much.

So for this approach, I add a new variable for each empty cell, which
represents the cost (pipe length) of that cell. I then use a simple lookup
table (via the `nth`

API call) to associate the cost with a particular tile
(so a tile with mask `3`

is of length `2`

because it has two bits set - two
“radial” section of pipe).

Finally, another variable, flagged as the objective, is set to the sum of the costs:

```
(defn cost-variable [lo hi variable]
(Choco/makeIntVar (str "cost " (.getName variable))
(int lo) (int hi) (str-array [])))
(defn constrain-sum-nth [model n lo hi variables]
(let [costs (int-array (map cost (range unknown)))
cost-variables
(ivar-array
(for [v (vals variables)]
(let [cost (cost-variable lo hi v)]
(.addConstraint model (Choco/nth v costs cost))
cost)))
sum
(Choco/makeIntVar "length" (int (* n n)) (int (* n n 4))
(str-array ["cp:objective"]))]
(.addConstraint model
(Choco/eq (Choco/sum cost-variables) sum))))
```

## Results

Both approaches work well. They both return puzzles with minimal (as far as I can see) lengths of piping, and do so in similar times, except for the 8x8 case. Here are some solutions and timings, when starting with a completely empty puzzle:

```
2x2 3x3 4x4 5x5 6x6 7x7 8x8
╔╗ ╔╦╗ ╔╗╔╗ ╔╗╔═╗ ╔╗╔╗╔╗ ╔╗╔╗╔╦╗ ╔╗╔╗╔╗╔╗
╚╝ ║╚╣ ╚╝║║ ╚╝╚═╣ ╚╝║║║║ ╚╝╚╝║║║ ╚╝║║║║║║
╚═╝ ╔═╝║ ╔╗╔╗║ ╔═╝║║║ ╔╗╔╗║║║ ╔═╝║║║║║
╚══╝ ║╚╣║║ ╚═╗║║║ ╚╝╚╝║║║ ╚═╗║║║║║
╚═╝╚╝ ╔═╝║║║ ╔╗╔╗║║║ ╔═╝║║║║║
╚══╝╚╝ ║╚╣║║║║ ╚═╗║║║║║
╚═╝╚╝╚╝ ╔═╝║║║║║
╚══╝╚╝╚╝
```

Algorithm | 2x2 | 3x3 | 4x4 | 5x5 | 6x6 | 7x7 | 8x8 |
---|---|---|---|---|---|---|---|

DFA | 0.92ms | 1.4ms | 1.9ms | 30ms | 3.7ms | 240,000ms | >1hr |

Cost variables | 0.89ms | 1.3ms | 2.0ms | 25ms | 3.6ms | 150,000ms | 6.3ms |

There are two interesting things here.

First, the odd sized squares are, in general, taking much longer than the evens. Looking at the solutions, the even squares all have the minimum possible pipe length (every tile has length 2), but that’s not possible with the odd sizes. So my guess is that Choco somehow “knows” the minimal cost and so finishes the the even problems as soon as it has a match, but for the odd squares it needs to search more (all?) alternatives before it can be sure it has the lowest value possible (which apparently requires two “cost 3” tiles).

Second, the 8x8 DFA?! It looks like we passed the cutoff for a small number optimisation?

As ever, latest code is in the repo.

Andrew

## Related Posts

- Pipes, Clojure and Choco (part 5: Choco)
- Choco and Regular Expressions
- Pipes, Clojure and Choco (part 3: Smart Search)
- An ORM for C?
- Pipes, Clojure and Choco (part 2: Direct Search)
- Pipes, Clojure and Choco (part 1: Limits of Laziness)
- Pipes, Clojure and Choco (part 4: Benchmarking)
- How to Average Complex Responses
- CMake with Check Unit Tests
- OpenSSH Certificates