[Haskell-tut] Custom Data Types
Author |
Message |
wtd
|
Posted: Sat Mar 12, 2005 1:53 am Post subject: [Haskell-tut] Custom Data Types |
|
|
Introduction
Eventually everyone wants to know how to create their own composite data types. Java/C#/Ruby/C++/etc. prpogrammers create classes. C programmers create structs. Turing programmers create records. It's natural to want to group related bits of data together and give a name to that basic data form.
Learning by Example
In a recent debate with rizzix, I used the example of a Name data type. It's one I frequently use, so I thought I'd use it again here, and perhaps explain its construction in greater detail.
Declaration and Constructors
We start out by declaring the type.
The keyword data tells the compiler that we're introducing a new data type. In this case that type is called "Name". The ellipsis stands in for constructors which wil take any number of arguments and in excange give us a Name. These constructors may have any name (including the name of the type), as long they all have unique names.
Starting simply, let's have a single constructor which takes a single argument, giving us a Name that consists of a single String.
code: | data Name = Name1 String |
The 1 in the constructor's name denotes that we're only storing one name. Now, with that done, let's add two more constructors. The first of these will handle a Name with both first and last names, and the second will handle the possibility of no name at all.
code: | data Name = Name1 String | Name2 String String | NoName |
This last constructor is known as a unary constructor, since it takes no arguments.
Pattern Matching
You should note that in the constructor declarations I didn't give the arguments names, but rather simply specified their types. The reasoning behind this is clear as we see pattern matching in action.
Let's say I want to get the last name from a Name. How do I do that? How can I tell if a Name is a Name1, Name2 or NoName? The answer is simple. Aside from simply saying that a function has an argument, and that argument is of a given type, we can very specifically overload the function to only match one form of that data type.
First we declare a function "lastName" as taking a Name and returning a String.
code: | lastName :: Name -> String |
Now we define the function.
code: | lastName (Name2 first last) = last |
The "first" and "last" here are patterns which stand in for the two String arguments lassed to Name2 originally. We can then use these names in the function body.
In fact, a bit more on pattern matching: we never cared about the value of the individual's first name, so instead of giving that a name, we can use an underscore to indicate that we don't care about its value.
code: | lastName (Name2 _ last) = last |
We can use the same type of pattern matching to get a first name. Let's say this time, though, that we want to define the function for the Name1 formof the Name type as well.
code: | firstName :: Name -> String
firstName (Name1 first) = first
firstName (Name2 first _) = first |
The Case Expression
It's important to note that this kind of pattern matching is also how Haskell's case expression works.
We could have written firstName as follows.
code: | firstName :: Name -> String
firstName name =
case name of
Name1 first -> first
Name2 first _ -> first |
Robustness
Of course, this isn't very robust. There's no definition of lastName for Name1 or NoName, and firstName isn't defined for NoName. Obviously, this is because those forms of Name don't have the data we're trying to extra.
Haskell has a data type we can use for this, and seeing it will give you further idea of how data types can be used.
The Maybe type is a generic type which is either Just a value, or Nothing. It's declared like so:
code: | data Maybe a = Just a | Nothing |
With it we can declare firstName and lastName as:
code: | firstName :: Name -> Maybe String
lastName :: Name -> Maybe String |
We can thend efine the functions as:
code: | firstName (Name1 first) = Just first
firstName (Name2 first _) = Just first
firstName NoName = Nothing
lastName (Name1 _) = Nothing
lastName (Name2 _ last) = Just last
lastName NoName = Nothing |
We can now use pattern matching on these values.
code: | let bob = Name1 "Bob" in
case lastName bob of
Just name -> print name
Nothing -> print "No last name." |
Polymorphism
Let's take a simple example. We want a function which will converta Name into a suitable String for output. In Haskell we normally use the show function to do just this. Getting show (and functions which rely on it) to accept a Name is a matter of polymorphism.
Polymorphism in Haskell involves classes, as it does in so many languages. However, Haskell's notion of classes is a bit different from what Java programmers, for instance, normally think of. Instead it is more closely related to that language's idea of interfaces or abstract classes.
A class in Haskell is simply a function or set of functions which must be defined for a data type in order for that data type to belong to that class. A data type which does belong to that class is said to be an instance of it. The show function is required to be defined for something to be an instance of the Show class.
The Show class' declaration looks something like:
code: | class Show a where
show :: a -> String |
The "a" stands in as a generic type which can represent any type. We specify that type when we declare our Name type to be an instance of Show.
code: | instance Show Name where
show (Name1 first) = first
show (Name2 first last) = first ++ " " ++ last
show NoName = "no name" |
Classes are later used to constrain genericity. For instance, the print function is a generic function which takes any type, so long as it's an instance of the Show class. Its declaration is:
code: | print :: (Show a) => a -> IO () |
Deriving
Similarly, we can check for equality between names by making Name an instance of the Eq class. This class is interesting because it declares two functions: the == and /= operators, but it provides a default definition for the /= operator.
code: | class Eq a where
(==) :: a -> a -> Bool
(/=) :: a -> a -> Bool
m /= n = not (m == n) |
Consequently, we need only define the == operator for Name.
code: | instance Eq Name where
Name1 n1 == Name1 n2 = n1 == n2
Name2 f1 l1 == Name2 f2 l2 = f1 == f2 && l1 == l2
NoName == NoName = True |
Of course, we'd have to write several definitions to cover the various combinations. For a relatively simple comparison like this, it becomes quite useful to use the deriving keyword which automatically creates definitions required by basic classes like Eq.
Using deriving is done when the data type is declared.
code: | data Name = Name1 String | Name2 String String | NoName
deriving (Eq) |
Of course, for finer grained control, you should write the code yourself. For instance, the default behavior of deriving the Ord class, which involves comparisons, won't give us the ability to compare based on last name first.
code: | instance Ord Name where
compare (Name1 f1) (Name1 f2) =
compare f1 f2
compare (Name1 f1) Name2 f2 _) =
compare f1 f2
compare (Name2 f1 l1) (Name1 f2) =
compare f1 f2
compare (Name2 f1 l1) (Name2 f2 l2) =
if l1 == l2
then compare f1 f2
else compare l1 l2
compare NoName NoName = EQ
compare _ NoName = GT
compare NoName _ = LT |
Polymorphism with a Custom Class
Of course, nothing limits us to classes included in the standard library. Let's define a new data type which incorporates a Name. We then have two new functions to get the first and last names.
code: | type Age = Int
data Person = Person Name Age
deriving (Eq)
personFirstName :: Person -> Maybe String
personFirstName (Person name _) = firstName name
personLastName (Person name _) = lastName name |
Of course, it's silly to introduce two new functions rather than reusing the firstName and lastName functions. To do this we'll have to create a HasName class for these two functions.
code: | class HasName a where
firstName :: a -> Maybe String
lastName :: a -> Maybe String |
We can then make Name and Person both instances of HasName.
code: | instance HasName Name where
firstName (Name1 first) = Just first
firstName (Name2 first _) = Just first
firstName NoName = Nothing
lastName (Name1 _) = Nothing
lastName (Name2 _ last) = Just last
lastName NoName = Nothing
instance HasName Person where
firstName (Person name _) = firstName name
lastName (Person name _) = lastName name |
Further Details
If further details on this subject are desired, please let me know. |
|
|
|
|
|
Sponsor Sponsor
|
|
|
|
|