Pipes, Clojure and Choco - Optimisation

Earlier posts here (ending with part 5) have described using Clojure and Choco to solve this pipes problem. Today’s post uses similar ideas to look at optimisation.

Optimisation is useful when the constraints allow for many possible outcomes, but we want to select “the best” in some way.

To demonstrate this I have adapted the previous puzzle to answer a slightly different question:

If you can choose any tile for the empty spaces, which choices give the shortest length of tubing?

Otherwise, the constraints are as before: all tiles must contain some pipes; pipes must connect across tiles; the question may specify some fixed, initial tiles.

An advantage of using Choco over a “hand written” DFS is immediately apparent: it’s easy to adapt Choco to solve this new problem. I am not sure what the best approach would be for adapting the code I wrote here - surely there’s something better than generating all constraints and then sorting them? But what?

Given Choco’s general API there are various ways to approach the problem. I tried two: one using costRegular (DFAs) and one using additional cost variables. In both cases, most of the code is taken from the earlier solution which uses binary relations to constrain neighbours.

The DFA Approach

In a sense, this approach is “cheating”, or at least misleading, because the DFA I use accepts all possible combinations. The only reason for using a DFA is that I noticed that the costRegular call in the Choco API provides an easy way to specify a cost for each variable. And since that method requires a DFA, I simply used one that accepts all patterns!

There is one wrinkle: specifying the DFA as a regular expression like ".*" does not work. I am not sure if this is a limitation in the DFA library, or is driven by the need to know more details about possible values, but a support post from the Choco team explained that the easiest way to generate an “accept anything” DFA is to generate an empty one (that accepts nothing) and then take the complement.

Apart from the trick above, the rest of the code is pretty simple: we introduce a new variable, which is the sum of the costs (set by the costRegular API call, and indicate that it is an “objective variable” (ie should be minimised); then we call the method, passing in the costs and variables:

(defn str-array [values]
  (into-array String values))

(defn ivar-array [values]
  (into-array IntegerVariable values))

(defn cost [mask]
    (#{0 1 2 4 8} mask) 99
    (#{14 13 11 7} mask) 3
    (= mask 15) 4
    :else 2))

(defn constrain-sum-dfa [model n masks variables]
  (let [sum
        (Choco/makeIntVar "length" (int (* n n)) (int (* n n 4))
          (into-array String ["cp:objective"]))
        costs (int-array (map cost (range unknown)))
        automaton (FiniteAutomaton.)]
    (doseq [m masks] (.addToAlphabet automaton m))
    (.addConstraint model
      (Choco/costRegular sum (ivar-array (vals variables))
        (.complement automaton)
        (into-array (for [i (keys variables)] costs))))))

The Cost Variables Approach

When I first started using Choco I was worried about the number of variables I was introducing. But it seems that adding variables is often the natural way to adapt a problem so that the constraints match the API. As long as the new variables are strongly determined (ie depend rigidly on existing variables) they don’t seem to “cost” much.

So for this approach, I add a new variable for each empty cell, which represents the cost (pipe length) of that cell. I then use a simple lookup table (via the nth API call) to associate the cost with a particular tile (so a tile with mask 3 is of length 2 because it has two bits set - two “radial” section of pipe).

Finally, another variable, flagged as the objective, is set to the sum of the costs:

(defn cost-variable [lo hi variable]
  (Choco/makeIntVar (str "cost " (.getName variable))
    (int lo) (int hi) (str-array [])))

(defn constrain-sum-nth [model n lo hi variables]
  (let [costs (int-array (map cost (range unknown)))
          (for [v (vals variables)]
            (let [cost (cost-variable lo hi v)]
              (.addConstraint model (Choco/nth v costs cost))
        (Choco/makeIntVar "length" (int (* n n)) (int (* n n 4))
          (str-array ["cp:objective"]))]
    (.addConstraint model
      (Choco/eq (Choco/sum cost-variables) sum))))


Both approaches work well. They both return puzzles with minimal (as far as I can see) lengths of piping, and do so in similar times, except for the 8x8 case. Here are some solutions and timings, when starting with a completely empty puzzle:

2x2  3x3   4x4    5x5     6x6      7x7       8x8
╔╗   ╔╦╗   ╔╗╔╗   ╔╗╔═╗   ╔╗╔╗╔╗   ╔╗╔╗╔╦╗   ╔╗╔╗╔╗╔╗
╚╝   ║╚╣   ╚╝║║   ╚╝╚═╣   ╚╝║║║║   ╚╝╚╝║║║   ╚╝║║║║║║
     ╚═╝   ╔═╝║   ╔╗╔╗║   ╔═╝║║║   ╔╗╔╗║║║   ╔═╝║║║║║
           ╚══╝   ║╚╣║║   ╚═╗║║║   ╚╝╚╝║║║   ╚═╗║║║║║
                  ╚═╝╚╝   ╔═╝║║║   ╔╗╔╗║║║   ╔═╝║║║║║
                          ╚══╝╚╝   ║╚╣║║║║   ╚═╗║║║║║
                                   ╚═╝╚╝╚╝   ╔═╝║║║║║
Algorithm 2x2 3x3 4x4 5x5 6x6 7x7 8x8
DFA 0.92ms 1.4ms 1.9ms 30ms 3.7ms 240,000ms >1hr
Cost variables 0.89ms 1.3ms 2.0ms 25ms 3.6ms 150,000ms 6.3ms

There are two interesting things here.

First, the odd sized squares are, in general, taking much longer than the evens. Looking at the solutions, the even squares all have the minimum possible pipe length (every tile has length 2), but that’s not possible with the odd sizes. So my guess is that Choco somehow “knows” the minimal cost and so finishes the the even problems as soon as it has a match, but for the odd squares it needs to search more (all?) alternatives before it can be sure it has the lowest value possible (which apparently requires two “cost 3” tiles).

Second, the 8x8 DFA?! It looks like we passed the cutoff for a small number optimisation?

As ever, latest code is in the repo.


Related Posts

blog comments powered by