Archive for November 2009

 
 

Muscle Injury and Healing

Pretty good article. Loads of references.

From C# to F# – Part 10

Today we’ll talk about a super-awesome feature of F# called Discriminated Unions.

From time to time you want to put N types under a single cap. The standard way to do this in object-oriented languages to is to create a common interface and then implement that interface in a number of classes. That’s perfect if N is an open-ended number and if those types have something in common. But sometimes the types are completely unrelated and OO doesn’t help.

Here are some examples:

1. Binary tree. Some nodes are parent nodes, some are leaves. The two have nothing in common. Sure you can tack on the visitor pattern to make the tree walk easier, but if you have to actually manipulate the tree in some way then interfaces don’t help and you need to know whether a node is a parent or a leaf.

2. XML. Your node may have a text children, or other nodes, etc. Not much in common.

3. Language syntax trees. What does an if node have in common with a variable declaration node?

Etc., etc. In those cases what you really want is to have a collection of types (LeafNode and ParentNode in the case of a binary tree) placed under a single type (let’s call it BinaryTreeNode) that can be one or the other, but nothing else.

The way to do that in F# is using discriminated unions. Say our tree is supposed to store integers. Each ParentNode will contain two references to child BinaryTreeNodes, and each LeafNode will contain an integer. The way to write all that in F# is:

type BinaryTreeNode = LeafNode of int | ParentNode of BinaryTreeNode*BinaryTreeNode

(In case you forgot BinaryTreeNode*BinaryTreeNode is a type signature for a pair of BinaryTreeNodes.)

This should be enough for you to try and tease apart the general syntax:

type <union name> =
    <union name member1> of <argument type> |
    <union name member2> of <argument type> |
    ...

To construct a variable of type BinaryTreeNode, you simply write the union member name followed by the argument:

let (x : BinaryTreeNode) = LeafNode 5

of course, the type declaration in the above example is unnecessary, I just wanted to make it clear that when you write “LeafNode 5” the resulting type is BinaryTreeNode. Usually you just write:

let x = LeafNode 5

To tease apart a variable that’s a discriminated union, we use pattern matching, demonstrated below with a function that computes the sum of all integers in a tree:

let rec sumOfAllNodes node =
    match node with
    | LeafNode x -> x
    | ParentNode (child1, child2) -> sumOfAllNodes child1 + sumOfAllNodes child2

and then we can test this pup:

let test1 = sumOfAllNodes (LeafNode 5)

let test2 = sumOfAllNodes (ParentNode (LeafNode 5, LeafNode 6))

Of course, as it is right now our tree needs to be perfectly balanced, with each parent having two children. We can correct that by adding a third member of the union, call it NoNode, which we’ll use to specify that a node is a missing:

type BinaryTreeNode =
     LeafNode of int
     | ParentNode of BinaryTreeNode*BinaryTreeNode
     | NoNode

As you can see from NoNode, an union member doesn’t have to take arguments, so NoNode is not decorated with of <type>. Thus we can construct an empty tree simply by saying:

let emptyTree = NoNode

With the above addition to BinaryTreeNode, the compiler will complain that we are not handling NoNode in our sumOfAllNodes function, so we need to correct that:

let rec sumOfAllNodes node =
    match node with
    | LeafNode x -> x
    | ParentNode (child1, child2) -> sumOfAllNodes child1 + sumOfAllNodes child2
    | NoNode -> 0

And now you know pretty much everything about discriminating unions.

Discriminating unions are so delicious that most functional languages have managed to be completely free of NullPointer exceptions. Pointers, by default, are never null. If you want to express the fact that a value may be missing, you use a discriminating union between your type and None. When you want use this type, the language forces you to pattern match between the two possibilities and handle all the cases properly. More here.

From C# to F# – Part 9

Functional languages have an obsession with lists just like imperative languages use arrays as the most frequent data structure building block. LISP the father of them all, its name even stands for LISt Processing. Sometimes those lists are simple linked lists like in LISP, sometimes they are more advanced things like in Haskell, where, due to its execution strategy, they are more like über-efficient enumerators. In F# they are simple immutable singly-linked list.

The easiest way to create a linked list is using List literals:

let someList = [1; 2; 3; 4; 5]

An empty list is just brackets:

let emptyList = []

Creating a new list by adding a new node at the front of another list is cheap, and is done using the :: operator:

let fromZero = 0 :: someList

In fact, you can construct list literals that way if you’re into pain:

let someList2 = 1::2::3::4::5::[]

You can merge two lists together using the @ operator:

let mergedList = someList @ someList2

You can get the first element (head) of a non-empty list using function List.head, and the rest of the list (tail) using List.tail function:

let head = List.head someList
let tail = List.tail someList

You can figure out if a list is empty using List.isEmpty function:

let isEmpty = List.isEmpty emptyList

You can determine the length of a list using the List.length function:

let megedListLen = List.length mergedList

But of course, by now you know how to write the equivalent of List.length by yourself:

let listLen lst =
    let rec listLenHelper lst count =
        if List.isEmpty lst then
            count
        else
            listLenHelper (List.tail lst) (count + 1)
    listLenHelper lst 0

The List class is full of other little gems. Say you want to print all elements of a list. Use List.iter to call a function f for all elements of a list:

let printInteger (x : int) = Console.WriteLine x
List.iter printInteger someList

Or you can convert all integers in a list to some other value using List.map:

let makeNegative x = - x
let negativeList = List.map makeNegative someList

One of the most used function is List.fold, to which you give a function, a seed value, and a list. Say you give it a function f, seed 0 and a list [1; 2; 3]. It’s going to first call f(0, 1), then with the result of that it’s going to call f (previous result, 2), and then with the result of that it’s going to call f (previous result, 3) and finally return that.

For example, let’s assume we want to compute a factorial and a sum of a list. For factorial we’ll use the seed one and the function will be multiply, and for sum we’ll use the seed of zero and the function will be add.

let add x y = x + y
let multiply x y = x * y

let sum = List.fold add 0 someList
let product = List.fold multiply 1 someList

Or let’s step away from boring mathematics and go into the real world. Say you want to produce a file with comma-separated values. If you have a list of column values that go into a line, you can get the comma-separated line using a short sequence like this:

let columns = ["one"; "two"; "three"; "four"; "five"]

let combine s1 s2 = s1 + "," + s2

let result = List.fold combine(List.head columns) (List.tail columns)

The first line of the above example is just an input list, consisting of five strings. Next is the combine function which takes two strings, and returns a single string with the two values separated by a comma. Finally, we create the comma-separated line by calling List.fold and passing it the head of our input (“one”) as seed, and the rest of the list as the list (“two”, “three”, “four” and “five”.) List.fold then combines “one” with “two”, then “one,two” with “three”, etc. till the end of the list.

Finally, you can use pattern matching on lists. Let’s rewrite our listLen function above using pattern matching:

let listLen2 lst =
    let rec listLenHelper lst count =
        match lst with
        | [] -> count
        | hd::tl -> listLenHelper tl (count + 1)
    listLenHelper lst 0

The most common patterns when using lists are [] and hd::tl because most of the time that’s all you care about. And they have an advantage over if List.isEmpty list then … because a) the code is cleaner, and b) head and tail are already broken down for you by the pattern matching engine instead of you having to call List.head and List.tail.

If you need to you can do patterns like a :: b :: [], or a :: b :: c, etc., though those are rarely used.

Edit: Corrected the body of the multiply method per plomu’s comment.

Edit: Wrote more about List.fold as prompted by the above comment.

F# Bug Rates

Pretty interesting comment by somebody that goes under the name “SSP” about his/her bug rates per LOC between various languages.

F# Performance

One thing I care about deeply is performance. Since I’m looking to start using F# for my commercial projects when VS10 comes out, I did some Googling and took a look at the code that the F# compiler produces when higher-order functions are used extensively.

This page on Stack Overflow has a bunch of good links to various posts about F# perf. And I was pleasantly surprised by the quality of the code that the F# compiler produces, even in Debug mode.

But no conclusion yet. Need to do more spelunking.

Quick Note on Writing Tutorials

Writing the F# tutorial reminded me of Guy Steele’s famous talk, Growing a Language. The secret of good tutorials, for me at least, is never to introduce more than one new non-trivial concept at a time.

I still remember the confusion of learning about ML discriminating unions, pattern matching, and tuple types all at once. Hopefully I managed to avoid making that kind of mistake with with what I’ve written so far.

From C# to F# – Part 8

Today we’ll talk about patter matching. Pattern matching is basically your C# switch statement on steroids. You won’t really appreciate pattern matching until we cover discriminating unions and lists, but it’s a central piece of the language and one that you’ll be using frequently.

Pattern matching is an expression with an input variable and a list of patterns and corresponding results that should be returned if a pattern matches the input.

For example:

let binaryDigitAsString number =
    match number with
    | 0 -> "Zero"
    | 1 -> "One"
    | _ -> "Too large for me"

let _ =
    Console.WriteLine (binaryDigitAsString 0)
    Console.WriteLine (binaryDigitAsString 1)
    Console.WriteLine (binaryDigitAsString 15)
    ignore (Console.ReadLine ()

The binaryDigitOfString function thus produces the equivalent of the following C# code:

string BinaryDigitAsString (int number) {
    switch (number) {
        case 0: return "Zero";
        case 1: return "One";
        default: return "Too large for me"
    }
}

So far this is plain vanilla stuff. The underscore character is used as a “catch all”, or a rough equivalent of default, and everything is simply matched by value.

Let’s try something more advanced:

let distanceFromOrigin point =
    match point with
    | (x, 0.0) –> abs x
    | (0.0, y) –> abs y
    | (x, y) -> Math.Sqrt (x * x + y * y)

The above function calculates the distance between (0, 0) and a point in the two-dimensional plane, with shortcuts for coordinates that lie on the X and Y axes (“abs” is a standard library function, basically the same thing as Math.Abs.). Its rough C# equivalent is:

float DistanceFromOrigin (Tuple<float, float> point) {
    if (point.Item2 == 0) {
        float x = Math.Abs (point.Item1);
        return x;
    }
    else if (point.Item1 == 0) {
        float y = Math.Abs (point.Item2);
        return y;
    }
    else {
        float x = point.Item1;
        float y = point.Item2;
        return Math.Sqrt (x * x + y * y);
    }
}

As you can see here, pattern matching is not limited to simple comparisons like the C# switch statement is. It can “break apart” the matched value and do comparisons on individual members of the matched value. Also, during the matching phase, it can bind variables (x and y in the above example) and those variables can then be used in the body of a case statement.

There’s more to pattern matching than this, but this is all we can do with our current F# vocabulary. More next time.

(Edit: Fixed the distance function to return the absolute value in the shortcut paths. Thanks Chris.)

Part 9

From C# to F# – Part 7

In F# it is dead simple to put multiple values into a single one. The runtime library has types Tuple<T1, T2>, Tuple<T1, T2, T3>, etc. Then the syntax for creating one of these objects is straightforward: simply wrap multiple values in parenthesis and separate them by commas.

For example:

(1, 2)

is a Tuple<int, int>. And:

(“Test”, 1, 1.2)

is a Tuple<string, int, float>.

Receiving tuples is similarly easy. Just wrap your arguments into parenthesis and separate them by commas:

let sumpair (a, b) = a + b

Alternatively, you can receive the whole tuple as a single argument, and then use the fst and snd standard library functions to break the pair into its constituents:

let sumpair x = fst x + snd x

The definition of fst and snd functions in the standard library is very simple:

let fst (a, b) = a
let snd (a, b) = b

Let’s test the whole round-trip:

let first (a, b) = a 

let _ =
    Console.WriteLine (first (1, 2))
    ignore (Console.ReadLine ())

This packs two integers into a tuple, sends that tuple into the function first, gets an integer back and prints it out.

One weird thing is how F# displays the type signature for tuples. Where C# reports the type of a tuple for two integers as Tuple<int, int>, F# reports it as int * int.

Again, just a convention, stemming from the fact that tuples are first-class citizens in F#, and in C# they are just another class.

This packaging of stuff is really useful in the functional style of programming. A lot of “glue” and processing functions in the standard library are created to work on a single argument, so having the ability to pack a bunch of values into one becomes really useful when creating pipelines of processing instructions. But more on that later.

Non-Annoying Ringtones

Microsoft has posted some decent ringtones for download on the Windows Mobile site. If you have one of those annoying ringtones from your favorite rock band, please do everybody around you a favor and go download one.

From C# to F# – Part 6

Time to dive into F#’s support for generics. If you define a function like this:

let increment x = x + 1

then you already know that F# will infer the type of x to be int, and if you hover over the variable x in Visual Studio it will display a tooltip saying val x : int.

But if you define a function that can work with any type:

let identity x = x

then the tooltip for x will say val x: ‘a. What in the name of God is that weird apostrophe doing there?

It goes like this. Say you were defining the identity function in C#. You’d write something like this:

T Identity<T> (T t) {
    return t;
}

In other words, you would come up with the name of the generic type inside this scope, and it would be T. But in this case the F# compiler is inventing the generic type name for you because it’s doing type inference, so it plays it safe and invents the name ‘a.

Of course, if you want to go with the name T you can specify the type in the function’s signature, but you have to add the apostrophe in front to tell the compiler that this is a generic type:

let identity (x : 'T) = x

If you hover over x now VS will use the generic type you specified, so it will report x to be val x : ‘T as you instructed it.

Let’s play with this a little bit. What if we write the following two functions:

let first x y = x
let second x y = y

and hover over x and y, the compiler will report them as ‘a and ‘b.

And again, if you want to give them more descriptive names you can:

let first (x : 'FirstArgType) (y : 'SecondArgType) = x
let second (x : 'FirstArgType) (y : 'SecondArgType) = y

Most F# programmers simply don’t bother. I found it helpful in cases where I had a function with a bunch of arguments, though that’s a rare case since with currying there’s usually not much need.

One place where this is used, though, is to intentionally constrain the types. Say you want x and y to be of the same type. In that case you’d write something like:

let first (x : 'T) (y : 'T) = x

and the compiler would obey your constraint.

Part 7