Programming C, C++, Java, PHP, Ruby, Turing, VB
Computer Science Canada 
Programming C, C++, Java, PHP, Ruby, Turing, VB  

Username:   Password: 
 RegisterRegister   
 [Haskell-tut] Custom Data Types
Index -> Programming, General Programming -> Functional Programming
View previous topic Printable versionDownload TopicSubscribe to this topicPrivate MessagesRefresh page View next topic
Author Message
wtd




PostPosted: Sat Mar 12, 2005 1:53 am   Post subject: [Haskell-tut] Custom Data Types

Introduction

Eventually everyone wants to know how to create their own composite data types. Java/C#/Ruby/C++/etc. prpogrammers create classes. C programmers create structs. Turing programmers create records. It's natural to want to group related bits of data together and give a name to that basic data form.

Learning by Example

In a recent debate with rizzix, I used the example of a Name data type. It's one I frequently use, so I thought I'd use it again here, and perhaps explain its construction in greater detail.

Declaration and Constructors

We start out by declaring the type.

code:
data Name = ...


The keyword data tells the compiler that we're introducing a new data type. In this case that type is called "Name". The ellipsis stands in for constructors which wil take any number of arguments and in excange give us a Name. These constructors may have any name (including the name of the type), as long they all have unique names.

Starting simply, let's have a single constructor which takes a single argument, giving us a Name that consists of a single String.

code:
data Name = Name1 String


The 1 in the constructor's name denotes that we're only storing one name. Now, with that done, let's add two more constructors. The first of these will handle a Name with both first and last names, and the second will handle the possibility of no name at all.

code:
data Name = Name1 String | Name2 String String | NoName


This last constructor is known as a unary constructor, since it takes no arguments.

Pattern Matching

You should note that in the constructor declarations I didn't give the arguments names, but rather simply specified their types. The reasoning behind this is clear as we see pattern matching in action.

Let's say I want to get the last name from a Name. How do I do that? How can I tell if a Name is a Name1, Name2 or NoName? The answer is simple. Aside from simply saying that a function has an argument, and that argument is of a given type, we can very specifically overload the function to only match one form of that data type.

First we declare a function "lastName" as taking a Name and returning a String.

code:
lastName :: Name -> String


Now we define the function.

code:
lastName (Name2 first last) = last


The "first" and "last" here are patterns which stand in for the two String arguments lassed to Name2 originally. We can then use these names in the function body.

In fact, a bit more on pattern matching: we never cared about the value of the individual's first name, so instead of giving that a name, we can use an underscore to indicate that we don't care about its value.

code:
lastName (Name2 _ last) = last


We can use the same type of pattern matching to get a first name. Let's say this time, though, that we want to define the function for the Name1 formof the Name type as well.

code:
firstName :: Name -> String
firstName (Name1 first) = first
firstName (Name2 first _) = first


The Case Expression

It's important to note that this kind of pattern matching is also how Haskell's case expression works.

We could have written firstName as follows.

code:
firstName :: Name -> String
firstName name =
  case name of
    Name1 first -> first
    Name2 first _ -> first


Robustness

Of course, this isn't very robust. There's no definition of lastName for Name1 or NoName, and firstName isn't defined for NoName. Obviously, this is because those forms of Name don't have the data we're trying to extra.

Haskell has a data type we can use for this, and seeing it will give you further idea of how data types can be used.

The Maybe type is a generic type which is either Just a value, or Nothing. It's declared like so:

code:
data Maybe a = Just a | Nothing


With it we can declare firstName and lastName as:

code:
firstName :: Name -> Maybe String
lastName :: Name -> Maybe String


We can thend efine the functions as:

code:
firstName (Name1 first) = Just first
firstName (Name2 first _) = Just first
firstName NoName = Nothing

lastName (Name1 _) = Nothing
lastName (Name2 _ last) = Just last
lastName NoName = Nothing


We can now use pattern matching on these values.

code:
let bob = Name1 "Bob" in
   case lastName bob of
      Just name -> print name
      Nothing -> print "No last name."


Polymorphism

Let's take a simple example. We want a function which will converta Name into a suitable String for output. In Haskell we normally use the show function to do just this. Getting show (and functions which rely on it) to accept a Name is a matter of polymorphism.

Polymorphism in Haskell involves classes, as it does in so many languages. However, Haskell's notion of classes is a bit different from what Java programmers, for instance, normally think of. Instead it is more closely related to that language's idea of interfaces or abstract classes.

A class in Haskell is simply a function or set of functions which must be defined for a data type in order for that data type to belong to that class. A data type which does belong to that class is said to be an instance of it. The show function is required to be defined for something to be an instance of the Show class.

The Show class' declaration looks something like:

code:
class Show a where
   show :: a -> String


The "a" stands in as a generic type which can represent any type. We specify that type when we declare our Name type to be an instance of Show.

code:
instance Show Name where
   show (Name1 first) = first
   show (Name2 first last) = first ++ " " ++ last
   show NoName = "no name"


Classes are later used to constrain genericity. For instance, the print function is a generic function which takes any type, so long as it's an instance of the Show class. Its declaration is:

code:
print :: (Show a) => a -> IO ()


Deriving

Similarly, we can check for equality between names by making Name an instance of the Eq class. This class is interesting because it declares two functions: the == and /= operators, but it provides a default definition for the /= operator.

code:
class Eq a where
   (==) :: a -> a -> Bool
   (/=) :: a -> a -> Bool
   m /= n = not (m == n)


Consequently, we need only define the == operator for Name.

code:
instance Eq Name where
   Name1 n1 == Name1 n2 = n1 == n2
   Name2 f1 l1 == Name2 f2 l2 = f1 == f2 && l1 == l2
   NoName == NoName = True


Of course, we'd have to write several definitions to cover the various combinations. For a relatively simple comparison like this, it becomes quite useful to use the deriving keyword which automatically creates definitions required by basic classes like Eq.

Using deriving is done when the data type is declared.

code:
data Name = Name1 String | Name2 String String | NoName
   deriving (Eq)


Of course, for finer grained control, you should write the code yourself. For instance, the default behavior of deriving the Ord class, which involves comparisons, won't give us the ability to compare based on last name first.

code:
instance Ord Name where
   compare (Name1 f1) (Name1 f2) =
      compare f1 f2
   compare (Name1 f1) Name2 f2 _) =
      compare f1 f2
   compare (Name2 f1 l1) (Name1 f2) =
      compare f1 f2
   compare (Name2 f1 l1) (Name2 f2 l2) =
      if l1 == l2
         then compare f1 f2
         else compare l1 l2
   compare NoName NoName = EQ
   compare _ NoName = GT
   compare NoName _ = LT


Polymorphism with a Custom Class

Of course, nothing limits us to classes included in the standard library. Let's define a new data type which incorporates a Name. We then have two new functions to get the first and last names.

code:
type Age = Int
data Person = Person Name Age
   deriving (Eq)

personFirstName :: Person -> Maybe String
personFirstName (Person name _) = firstName name

personLastName (Person name _) = lastName name


Of course, it's silly to introduce two new functions rather than reusing the firstName and lastName functions. To do this we'll have to create a HasName class for these two functions.

code:
class HasName a where
   firstName :: a -> Maybe String
   lastName :: a -> Maybe String


We can then make Name and Person both instances of HasName.

code:
instance HasName Name where
   firstName (Name1 first) = Just first
   firstName (Name2 first _) = Just first
   firstName NoName = Nothing
   lastName (Name1 _) = Nothing
   lastName (Name2 _ last) = Just last
   lastName NoName = Nothing

instance HasName Person where
   firstName (Person name _) = firstName name
   lastName (Person name _) = lastName name


Further Details

If further details on this subject are desired, please let me know. Smile
Sponsor
Sponsor
Sponsor
sponsor
Display posts from previous:   
   Index -> Programming, General Programming -> Functional Programming
View previous topic Tell A FriendPrintable versionDownload TopicSubscribe to this topicPrivate MessagesRefresh page View next topic

Page 1 of 1  [ 1 Posts ]
Jump to:   


Style:  
Search: