Thursday, April 15, 2010

Solving Combinatory Problems with Haskell

Today my Visual Studio start page had a link to an interesting blog post by Octavio Hernandez and I couldn't pass up the opportunity to solve the problem with Haskell. Please note that this is not in anyway a commentary on language X is better than Language Y, rather looking at the same problem in many different languages is very helpful in understanding the problem itself better. I'll try and make this a literate Haskell post so you should be able to copy and paste the text into a text file with the ".lhs" extension and have everything run. So we begin by importing an operator that we will use later:

> import Data.List ((\\))

We will need a list of the digits.

> num = [1..9]


The first generation is all of the one digit numbers that satisfy n mod 1 == 0. That's an easy one, all of them satisfy that condition!

> m1 = num

We can see this in the C# version as from i1 in oneToNine without any where clause. The next step is to look at two digit numbers without repeating digits where n mod 2 == 0. So we write this:

> m2 = [n | a <- num
>         , b <- num \\ [a]
>         , n <- [a*10 + b]
>         , n `mod` 2 == 0
>         ]

This is a Haskell list comprehension and it can be read as "list ([) of n's (n) such that (|)...". Where the "..." is a series of conditions. The conditions are either pulling a value from a list (the ones with the left arrow <-), or are predicate statements. It will only use values that cause all the predicate statements to be true. We specifically have a being a digit. The second value b is using the list difference operator // which takes a list on the left and a list of values to remove on the right.

We will pause for a second and notice how similar this is to the mathematical statement:



If we run the code so far in ghci we can see that evaluating m2 gives us:

*Main> m2
[12,14,16,18,24,26,28,32,34,36,38,42,46,48,52,54,56,58,62,64,68,72,74,76,78,82,8
4,86,92,94,96,98]

Notice no digits repeat and all the numbers are even. Evaluating length m2 we see:

*Main> length m2
32

Since there are 9 digits (greater than 0) half of which are even (round down because 1 is odd) there should be four repeating digit even two digit number (22,44,66,88). Since we are not considering zero, there are nine possibilities for the first digit and four for the second (same as our repeats). This gives four times nine (36) minus our repeats (4) which equals 32. We now should be convinced that m2 is correct.

At this point we might be tempted to do the same thing for m3 but replacing our a to draw from m2. Instead we will capture the idea of that step in a more generic way. First we need a helper:

> toDigits a   = reverse $ h a
>    where h 0 = []
>          h a = a `mod` 10 : (h $ a `div` 10)

Back to ghci:

*Main> :t toDigits
toDigits :: (Integral a) => a -> [a]

This Haskell type shows that toDigits takes an Integral and gives a list of Integrals. It could be helpful to construct a sort of inverse to toDigits called fromDigits which could be defined as:

> fromDigits = foldl1 (\a b -> 10*a + b)

The details are not important, but it is helpful to notice that these two functions are nearly inverses of each other:

*Main> :t fromDigits . toDigits
fromDigits . toDigits :: Integer -> Integer
*Main> :t toDigits . fromDigits
toDigits . fromDigits :: [Integer] -> [Integer]

Here the . operator is composing the two functions.

*Main> fromDigits [1,2,3]
123
*Main> toDigits 123
[1,2,3]
*Main> (fromDigits . toDigits) 123
123

When we use fromDigits on a list with numbers larger than nine toDigits fails to be an inverse:

*Main> fromDigits [12,3]
123
*Main> toDigits 123
[1,2,3]
*Main> fromDigits [1,23]
33

This "failure" would still be useful in the case where only the first item in the list is greater than nine. In general it is nearly always instructive to try and construct an inverse function as it tells you quite a lot about your original function. You might observe similarities between fromDigits and n in m2. It turns out that there is another way to generalize n which we will come back to later.

Looking back at our definition of m2 we want to make that work for any step so we look for the parts that need to change. We know that we want to have a pull from the numbers from the previous step. And we know that we will be "modding" by a higher number each time. This implies two inputs to our step function which we will call ns and m respectively. We also have a problems that our helper function will fix for us. First the numbers from the previous step need to be split into digits for them to be subtracted from the pool of digits for b. So that becomes b <- num \\ (toDigits a). Now we can write our function down:

> step ns m = [n | a <- ns
>                , b <- num \\ (toDigits a)
>                , n <- [10*a + b]
>                , n `mod` m == 0
>                ]

We can test this out and see if it produces the next step:

*Main> let m3 = step m2 3
*Main> m3
[123,126,129,147,162,165,168,183,186,189,243,246,249,261,264,267,285,321,324,327
,342,345,348,369,381,384,387,423,426,429,462,465,468,483,486,489,528,543,546,549
,561,564,567,582,621,624,627,642,645,648,681,684,687,723,726,729,741,762,765,768
,783,786,789,825,843,846,849,861,864,867,921,924,927,942,945,948,963,981,984,987
]
*Main> length m3
80

I'll leave it as an exercise to the reader to determine if that is correct for m3.

We can now write out all the steps to get to m9 but we will now call them s1...s9 for steps one through nine:

> s1 = num
> s2 = step s1 2
> s3 = step s2 3
> s4 = step s3 4
> s5 = step s4 5
> s6 = step s5 6
> s7 = step s6 7
> s8 = step s7 8
> s9 = step s8 9

We can verify s9 is in fact the same result as the C# code gives:

*Main> s9
[381654729]

But we are not done yet! Writing out all the steps is repetitive. We will do a little bit of Haskell trickery to see how we can easily write this more simpily:

> s 1 = num
> s 2 = step (s 1) 2
> s 3 = step (s 2) 3
> s 4 = step (s 3) 4
> s 5 = step (s 4) 5
> s 6 = step (s 5) 6
> s 7 = step (s 6) 7
> s 8 = step (s 7) 8
> s 9 = step (s 8) 9

Here we used Haskell's pattern matching style of function definition to turn our s1 through s9 functions into a single function s that takes one parameter. All we had to do was add a few spaces and parentheses Now it should be clear how to collapse this down to two statements:

> s' 1 = num
> s' m = step (s (m-1)) m

We have s' 1 as the base case and s' m as the mth case.

*Main> s' 9
[381654729]


Next time we will go further and look at some other ways to generalize this kind of problem.

No comments: