Computer Science Canada [Haskell-tut] Custom Data Types |
Author: | wtd [ Sat Mar 12, 2005 1:53 am ] | ||||||||||||||||||||||||||||||||||||||||||||
Post subject: | [Haskell-tut] Custom Data Types | ||||||||||||||||||||||||||||||||||||||||||||
Introduction Eventually everyone wants to know how to create their own composite data types. Java/C#/Ruby/C++/etc. prpogrammers create classes. C programmers create structs. Turing programmers create records. It's natural to want to group related bits of data together and give a name to that basic data form. Learning by Example In a recent debate with rizzix, I used the example of a Name data type. It's one I frequently use, so I thought I'd use it again here, and perhaps explain its construction in greater detail. Declaration and Constructors We start out by declaring the type.
The keyword data tells the compiler that we're introducing a new data type. In this case that type is called "Name". The ellipsis stands in for constructors which wil take any number of arguments and in excange give us a Name. These constructors may have any name (including the name of the type), as long they all have unique names. Starting simply, let's have a single constructor which takes a single argument, giving us a Name that consists of a single String.
The 1 in the constructor's name denotes that we're only storing one name. Now, with that done, let's add two more constructors. The first of these will handle a Name with both first and last names, and the second will handle the possibility of no name at all.
This last constructor is known as a unary constructor, since it takes no arguments. Pattern Matching You should note that in the constructor declarations I didn't give the arguments names, but rather simply specified their types. The reasoning behind this is clear as we see pattern matching in action. Let's say I want to get the last name from a Name. How do I do that? How can I tell if a Name is a Name1, Name2 or NoName? The answer is simple. Aside from simply saying that a function has an argument, and that argument is of a given type, we can very specifically overload the function to only match one form of that data type. First we declare a function "lastName" as taking a Name and returning a String.
Now we define the function.
The "first" and "last" here are patterns which stand in for the two String arguments lassed to Name2 originally. We can then use these names in the function body. In fact, a bit more on pattern matching: we never cared about the value of the individual's first name, so instead of giving that a name, we can use an underscore to indicate that we don't care about its value.
We can use the same type of pattern matching to get a first name. Let's say this time, though, that we want to define the function for the Name1 formof the Name type as well.
The Case Expression It's important to note that this kind of pattern matching is also how Haskell's case expression works. We could have written firstName as follows.
Robustness Of course, this isn't very robust. There's no definition of lastName for Name1 or NoName, and firstName isn't defined for NoName. Obviously, this is because those forms of Name don't have the data we're trying to extra. Haskell has a data type we can use for this, and seeing it will give you further idea of how data types can be used. The Maybe type is a generic type which is either Just a value, or Nothing. It's declared like so:
With it we can declare firstName and lastName as:
We can thend efine the functions as:
We can now use pattern matching on these values.
Polymorphism Let's take a simple example. We want a function which will converta Name into a suitable String for output. In Haskell we normally use the show function to do just this. Getting show (and functions which rely on it) to accept a Name is a matter of polymorphism. Polymorphism in Haskell involves classes, as it does in so many languages. However, Haskell's notion of classes is a bit different from what Java programmers, for instance, normally think of. Instead it is more closely related to that language's idea of interfaces or abstract classes. A class in Haskell is simply a function or set of functions which must be defined for a data type in order for that data type to belong to that class. A data type which does belong to that class is said to be an instance of it. The show function is required to be defined for something to be an instance of the Show class. The Show class' declaration looks something like:
The "a" stands in as a generic type which can represent any type. We specify that type when we declare our Name type to be an instance of Show.
Classes are later used to constrain genericity. For instance, the print function is a generic function which takes any type, so long as it's an instance of the Show class. Its declaration is:
Deriving Similarly, we can check for equality between names by making Name an instance of the Eq class. This class is interesting because it declares two functions: the == and /= operators, but it provides a default definition for the /= operator.
Consequently, we need only define the == operator for Name.
Of course, we'd have to write several definitions to cover the various combinations. For a relatively simple comparison like this, it becomes quite useful to use the deriving keyword which automatically creates definitions required by basic classes like Eq. Using deriving is done when the data type is declared.
Of course, for finer grained control, you should write the code yourself. For instance, the default behavior of deriving the Ord class, which involves comparisons, won't give us the ability to compare based on last name first.
Polymorphism with a Custom Class Of course, nothing limits us to classes included in the standard library. Let's define a new data type which incorporates a Name. We then have two new functions to get the first and last names.
Of course, it's silly to introduce two new functions rather than reusing the firstName and lastName functions. To do this we'll have to create a HasName class for these two functions.
We can then make Name and Person both instances of HasName.
Further Details If further details on this subject are desired, please let me know. |