From C# to F# – Part 10
Today we’ll talk about a super-awesome feature of F# called Discriminated Unions.
From time to time you want to put N types under a single cap. The standard way to do this in object-oriented languages to is to create a common interface and then implement that interface in a number of classes. That’s perfect if N is an open-ended number and if those types have something in common. But sometimes the types are completely unrelated and OO doesn’t help.
Here are some examples:
1. Binary tree. Some nodes are parent nodes, some are leaves. The two have nothing in common. Sure you can tack on the visitor pattern to make the tree walk easier, but if you have to actually manipulate the tree in some way then interfaces don’t help and you need to know whether a node is a parent or a leaf.
2. XML. Your node may have a text children, or other nodes, etc. Not much in common.
3. Language syntax trees. What does an if node have in common with a variable declaration node?
Etc., etc. In those cases what you really want is to have a collection of types (LeafNode and ParentNode in the case of a binary tree) placed under a single type (let’s call it BinaryTreeNode) that can be one or the other, but nothing else.
The way to do that in F# is using discriminated unions. Say our tree is supposed to store integers. Each ParentNode will contain two references to child BinaryTreeNodes, and each LeafNode will contain an integer. The way to write all that in F# is:
type BinaryTreeNode = LeafNode of int | ParentNode of BinaryTreeNode*BinaryTreeNode
(In case you forgot BinaryTreeNode*BinaryTreeNode is a type signature for a pair of BinaryTreeNodes.)
This should be enough for you to try and tease apart the general syntax:
type <union name> =
<union name member1> of <argument type> |
<union name member2> of <argument type> |
...
To construct a variable of type BinaryTreeNode, you simply write the union member name followed by the argument:
let (x : BinaryTreeNode) = LeafNode 5
of course, the type declaration in the above example is unnecessary, I just wanted to make it clear that when you write “LeafNode 5” the resulting type is BinaryTreeNode. Usually you just write:
let x = LeafNode 5
To tease apart a variable that’s a discriminated union, we use pattern matching, demonstrated below with a function that computes the sum of all integers in a tree:
let rec sumOfAllNodes node =
match node with
| LeafNode x -> x
| ParentNode (child1, child2) -> sumOfAllNodes child1 + sumOfAllNodes child2
and then we can test this pup:
let test1 = sumOfAllNodes (LeafNode 5)
let test2 = sumOfAllNodes (ParentNode (LeafNode 5, LeafNode 6))
Of course, as it is right now our tree needs to be perfectly balanced, with each parent having two children. We can correct that by adding a third member of the union, call it NoNode, which we’ll use to specify that a node is a missing:
type BinaryTreeNode =
LeafNode of int
| ParentNode of BinaryTreeNode*BinaryTreeNode
| NoNode
As you can see from NoNode, an union member doesn’t have to take arguments, so NoNode is not decorated with of <type>. Thus we can construct an empty tree simply by saying:
let emptyTree = NoNode
With the above addition to BinaryTreeNode, the compiler will complain that we are not handling NoNode in our sumOfAllNodes function, so we need to correct that:
let rec sumOfAllNodes node =
match node with
| LeafNode x -> x
| ParentNode (child1, child2) -> sumOfAllNodes child1 + sumOfAllNodes child2
| NoNode -> 0
And now you know pretty much everything about discriminating unions.
Discriminating unions are so delicious that most functional languages have managed to be completely free of NullPointer exceptions. Pointers, by default, are never null. If you want to express the fact that a value may be missing, you use a discriminating union between your type and None. When you want use this type, the language forces you to pattern match between the two possibilities and handle all the cases properly. More here.
