Computer Science Canada Education by Example: Linked List |
Author: | wtd [ Tue Mar 29, 2005 2:39 pm ] | ||||||||||||||||||||||||||||||||||||||||||||||
Post subject: | Education by Example: Linked List | ||||||||||||||||||||||||||||||||||||||||||||||
Introduction One of the crazy things about learning, is that you generally have to learn stuff others already have. You have to reinvent the wheel to really understand how the wheel works. Computer science is a perfect example of this. The great, all-important question is how to reinvent the wheel. What tools should someone use to do so? The following takes a look at implementing basic data structures (and functions of them) in three different programming languages. Two of them I am fond of, and the third is Java, which has risen to quite a bit of popularity in computer science education. Noteworthy is the fact that all of the included languages use garbage collection, and thus free the programmer from thinking about memory management issues. I may expand this and change that later. Linked List The linked list is a simple data structure, and one that is quite useful for having collections of data that can grow beyond preset bounds. It's is composed of nodes. In a singly linked list each node is connected to a succeeding node. Link a chain, it's then possible to get from one end of the list to the other by following links. A doubly linked list simply lets one easily go backward as well as forward. For the same of this discussion we'll look at singly linked lists. For the purposes of this discussion the linked list need only store integers. Subsequent discussion will give us the ability to store many other types of data. Onto the code! Haskell
Here I've succinctly laid out the rules for a linked list. A node either holds an int and another node, or is empty. I need now a way of determining if the list is empty. Again, we're basically just spelling out the rules for a linked list.
Now, what if we want to find out if a node is the last in a list? Clearly it is if the node is empty, or if the next node is Empty.
A slightly more complicated question: how long is the list? To find this out we have to walk through the list. Basically we start with a node. If it's not empty we count one, then add that to the result of finding the length of the list for the next node. An empty node has a length of zero.
The next question is: how do we print everything in the list? Perhaps the question should be: how do we generate a string representation of the list? Once we have a string, we can use standard printing functions. An empty list gives us an empty string. Otherwise tack on a string representation of the value, and if it's not the last node, also tack on a comma and the results of stringifying the rest of the list.
Let's create a function which checks to see if a value is in a list. For this we walk over the list. If a node has the value we're looking for, it's true. Otherwise check to see if it's in the rest of the list. Obviously, an empty node will never contain the desired value.
Now let's add something to the end of the list. If the existing list is empty, then we'll simply create a new node holding that value. Otherwise we take the exising node, but switch its next node to the result of adding the new value to the rest of the list.
One last trick. Let's check to see if two lists are equal. This means they must contain all of the same values in the same order. Obviously two empty lists are equal. An empty list is not equal to anything else. For nodes containing values, the values have to be equal, and the remaining lists have to also be equal.
So, our linked list code looks like:
The most important thing to take away from this example implementation is that Haskell allowed me to fairly directly describe the problem. Next we look at a Java solution. Java Java is a rather heavily object-oriented language, so we start by creaing a class to represent a node. Each node should contain an integer and anothing node. That node can be null, indidating that there's no next node. We'll also create a constructor.
Next we create accessors to allow us to get the value, and the next node.
Now, since we're using a null Node object to represent the empty state, we can't call any methods on it, and testing for the null state will take the place of the "empty" function in the Haskell example. Next in the previous example, I created a function to see if the current node was the last in the list. For this we can simply test to see if the next node is null.
Now, let's create a method to walk a list and determine its length. Haskell has no looping syntax, so we used the more natural recursion. Java does have loops, so we'll use those. Here we use a temporary node which points to each node in turn until the node is null.
We can use the same walking pattern to generate a string representation of the list. We'll also use a buffer to prevent memory problems.
Again, we'll apply the walking pattern to the check to see if the list contains a value. The problem here, that's rather easy to see is that this relies heavily on the particular behavior of the language. The false value is returned if nothing is returned from within the loop.
I now need to be able to add some value to the end of the list. Again, I need to use a loop, but I don't want to use the same strategy, since that strategy walks off the end of the list and I need to keep a reference to the last node.
One last trick, again. I want to be able to test for equality. We simply walk both lists simultaneously. If they're of different lengths, we can assume false. If any one value is not equal, we automatically assume a result of false. If it walks off the end of either list without returning, we can assume they've passed all equality tests and return true.
The full code:
Ruby The Ruby example will be similar to the Java example, for the most part, save that Ruby's added expressiveness will allow perhaps for more understandable code. As the various methods have been documented in the Java example,I'll simply post the code here.
Conclusions? We've seen three implementations of a linked list. One in Haskell that primarily uses recursion, and the others in Java and Ruby, which use an object-oriented approach, but more importantly, a looping approach to walking a list. Which is the best for educational purposes? This is very subjective. On one hand, Haskell allows us to most clearly state the solution without having to worry overly about the state of any given variable at run-time. Thus we can say that pattern-matching and a simple run-time environment are a plus for Haskell. On the other hand, Java is a popular language, and this example clearly demonstrated some of its quirks, like being able to return early from a function and skip executing parts of it. Additionally, since we can'toverload functions (or "methods" in Java parlance) based on the state of a data structure, it forces us to delve more deeply into run-time boolean checks. The Ruby example has merit as well. Though it does not fare nearly so well as Haskell in very simply stating the solution, it fares better than Java. The simpler syntax really shines here, letting us state more clearly how we're solving the problem.
Certainly reads more clearly, and more immediately conveys its purpose than:
Uniform access is another aspect of Ruby syntax which is clearly demonstrated here. The ability to call methods without using parentheses, and the ability to use the names of instance variables for accessors means that we can pretend to be directly accessing those instance variables, even though they are safely hidden away inside the object. The fact that we can write:
Instead of:
Is simply an added benefit, but again shows that we're more cleary stating what we're doing, rather than exactly how we're doing it. Ruby fares less well in this regard than Haskell, but better than Java. How important that clarity is in education should remain open to debate. I encourage all of you to debate this one with me. |
Author: | wtd [ Thu Mar 31, 2005 10:18 pm ] | ||||||||||||||||||||||||||||||||||
Post subject: | Education by Example: Binary Tree | ||||||||||||||||||||||||||||||||||
Binary Trees For the second installment, let's look at creating a generic binary tree. Please note that this one requires Java 5.0. A binary tree is composed of nodes. Each node contains a value and links to two other trees. Any tree may be empty, and this state can be described in various ways depending on the language. Haskell First let's establish that a tree is either a branch with a value and two other trees branching off from it, or empty. The values it can store are any type, as represented by a.
Next, let's create a function which determines the size of a tree. We do this in a recursive manner. Each branch has a value of one, plus the sizes of the branches coming off of it. Empty has a value of zero.
Now, let's create a function to determine if a value is within a tree. Clearly an empty tree does not. A branch has the value if its value is the desired value, or if one of its branches has the desired value.
We'll want to be able to test for equality. Again, a recursive solution. Obviously two empty trees are equal. An empty tree and anything else, though clearly is an unequal situation. Comparing two branches involves first seeing if the values they hold are equal, then checking to see if their corresponding sub-trees are equal.
Now, we've seen that we can determine whether or not a tree has a particular value, but what if we want to get the tree that starts with a particular value? There's an additional twist here. The value we're looking for might not be in the tree at all. If it isn't, then we can't return a tree. So we'll use a Maybe value. Maybe is another data type the declaration of which looks like:
Clearly, looking for a tree starting with any value in an Empty tree returns nothing. Doing the same with a branch involves a couple of things. Clearly, if the branch's value is the desired value, that's the desired tree. Otherwise look in the first branch. If it's not there, then look in the second branch.
One last trick and then I'll show off the complete module and move onto Java and Ruby. Let's say I want all trees that begin with a desired value? For this, and in other languages I'll use list types provided by the language. Clearly, again, a search on an empty tree finds nothing. Searching on a branch should check for matching trees in the subtrees, and, if the current branch matches return that as well.
Wait, I changed my mind. One more trick. We want a string representation of a tree. Again, not surprisingly, this is a recursive solution. The output from this will be replicated in the Java and Ruby examples.
The full source:
Java Now, let's look at a comparable program in Java. As I mentioned at the top, this requires Java 5.0, which includes generics. The constructors should give us the ability to set just the value, with the branches being null, or to set the value and the branches. We'll also need read-only (for simplicity) accessors.
The next step is to add a recursive method for calculating the size of the tree.
Next we'll want to determine if a tree has a particular value.
Now, we'll want to be able to determine if two trees are equal.
Getting a tree that starts with a particular value. Returns null if it can't find one. Here, as before, we have to diligently check to make sure the left and right branches are not null before we can call a method on them.
The corresponding method of course finds all trees that start with a particular value.
The final code looks like:
Ruby Having fairly well documented both the Haskell and Java example implementations, I feel safe in providing little documentation for the Ruby example. The only thing I'll point out is that variables in Ruby dn't have types themselves (values do). Thus no extra syntax is required for the tree to be generic.
Conclusions? The single biggest thing to note here is the use of the null or nil values in the Java and Ruby examples. This may seem okay, but it really presents us with a problem. We can't treat an empty tree as a tree at all, so we can't have a method that will find the size of an empty tree. Instead we have to test for the tree being null (or "nil") and then treat it differently. With Haskell, we can treat an empty tree as a tree, meaning we need not test at run-time for null or nil. As a direct result of this, the recursion flows far more naturally. Please, debate! |