Education by Example: Linked List
Author |
Message |
wtd
|
Posted: Tue Mar 29, 2005 2:39 pm Post subject: Education by Example: Linked List |
|
|
Introduction
One of the crazy things about learning, is that you generally have to learn stuff others already have. You have to reinvent the wheel to really understand how the wheel works. Computer science is a perfect example of this.
The great, all-important question is how to reinvent the wheel. What tools should someone use to do so?
The following takes a look at implementing basic data structures (and functions of them) in three different programming languages. Two of them I am fond of, and the third is Java, which has risen to quite a bit of popularity in computer science education. Noteworthy is the fact that all of the included languages use garbage collection, and thus free the programmer from thinking about memory management issues. I may expand this and change that later.
Linked List
The linked list is a simple data structure, and one that is quite useful for having collections of data that can grow beyond preset bounds. It's is composed of nodes. In a singly linked list each node is connected to a succeeding node. Link a chain, it's then possible to get from one end of the list to the other by following links. A doubly linked list simply lets one easily go backward as well as forward. For the same of this discussion we'll look at singly linked lists.
For the purposes of this discussion the linked list need only store integers. Subsequent discussion will give us the ability to store many other types of data. Onto the code!
Haskell
code: | data Node = Node Int Node | Empty |
Here I've succinctly laid out the rules for a linked list. A node either holds an int and another node, or is empty.
I need now a way of determining if the list is empty. Again, we're basically just spelling out the rules for a linked list.
code: | empty Empty = True
empty _ = False |
Now, what if we want to find out if a node is the last in a list? Clearly it is if the node is empty, or if the next node is Empty.
code: | lastNode Empty = True
lastNode (Node _ Empty) = True
lastNode _ = False |
A slightly more complicated question: how long is the list? To find this out we have to walk through the list. Basically we start with a node. If it's not empty we count one, then add that to the result of finding the length of the list for the next node. An empty node has a length of zero.
code: | listLength Empty = 0
listLength (Node _ next) = 1 + length next |
The next question is: how do we print everything in the list?
Perhaps the question should be: how do we generate a string representation of the list? Once we have a string, we can use standard printing functions.
An empty list gives us an empty string. Otherwise tack on a string representation of the value, and if it's not the last node, also tack on a comma and the results of stringifying the rest of the list.
code: | instance Show Node where
show Empty = ""
show (Node value Empty) = show value
show (Node value next) = show value ++ ", " ++ show next |
Let's create a function which checks to see if a value is in a list. For this we walk over the list. If a node has the value we're looking for, it's true. Otherwise check to see if it's in the rest of the list. Obviously, an empty node will never contain the desired value.
code: | hasValue _ Empty = False
hasValue desiredValue (Node value next) =
value == desiredValue || hasValue desiredValue next |
Now let's add something to the end of the list. If the existing list is empty, then we'll simply create a new node holding that value. Otherwise we take the exising node, but switch its next node to the result of adding the new value to the rest of the list.
code: | addToList newValue Empty = Node newValue Empty
addToList newValue (Node existingValue next) =
Node existingValue (addToList newValue next) |
One last trick. Let's check to see if two lists are equal. This means they must contain all of the same values in the same order. Obviously two empty lists are equal. An empty list is not equal to anything else. For nodes containing values, the values have to be equal, and the remaining lists have to also be equal.
code: | instance Eq Node where
Empty == Empty = True
Empty == _ = False
_ == Empty = False
Node value1 next1 == Node value2 next2 =
value1 == value2 && next1 == next2 |
So, our linked list code looks like:
code: | module LinkedList where
data Node = Node Int Node | Empty
instance Show Node where
show Empty = ""
show (Node value Empty) = show value
show (Node value next) = show value ++ ", " ++ show next
instance Eq Node where
Empty == Empty = True
Empty == _ = False
_ == Empty = False
Node value1 next1 == Node value2 next2 =
value1 == value2 && next1 == next2
addToList newValue Empty = Node newValue Empty
addToList newValue (Node existingValue next) =
Node existingValue (addToList newValue next)
hasValue _ Empty = False
hasValue v (Node nv next) = v == nv || hasValue v next
empty Empty = True
empty _ = False
lastNode Empty = True
lastNode (Node _ Empty) = True
lastNode _ = False
listLength Empty = 0
listLength (Node _ next) = 1 + listLength next |
The most important thing to take away from this example implementation is that Haskell allowed me to fairly directly describe the problem. Next we look at a Java solution.
Java
Java is a rather heavily object-oriented language, so we start by creaing a class to represent a node. Each node should contain an integer and anothing node. That node can be null, indidating that there's no next node.
We'll also create a constructor.
code: | class Node
{
private int value;
private Node next;
public Node(int newValue)
{
value = newValue;
next = null;
}
} |
Next we create accessors to allow us to get the value, and the next node.
code: | class Node
{
private int value;
private Node next;
public Node(int newValue)
{
value = newValue;
next = null;
}
public int getValue() { return value; }
public Node getNext() { return next; }
public void setNext(Node newNextNode) { next = newNextNode; }
} |
Now, since we're using a null Node object to represent the empty state, we can't call any methods on it, and testing for the null state will take the place of the "empty" function in the Haskell example.
Next in the previous example, I created a function to see if the current node was the last in the list. For this we can simply test to see if the next node is null.
code: | public boolean last()
{
return next == null;
} |
Now, let's create a method to walk a list and determine its length. Haskell has no looping syntax, so we used the more natural recursion. Java does have loops, so we'll use those. Here we use a temporary node which points to each node in turn until the node is null.
code: | public int length()
{
Node temp = this;
int count = 0;
while (temp != null)
{
temp = temp.getNext();
count++;
}
return count;
} |
We can use the same walking pattern to generate a string representation of the list. We'll also use a buffer to prevent memory problems.
code: | public String toString()
{
StringBuffer sb = new StringBuffer();
Node temp = this;
while (temp != null)
{
sb.append(temp.getValue());
if (!temp.last())
sb.append(", ");
temp = temp.getNext();
}
return sb.toString();
} |
Again, we'll apply the walking pattern to the check to see if the list contains a value. The problem here, that's rather easy to see is that this relies heavily on the particular behavior of the language. The false value is returned if nothing is returned from within the loop.
code: | public boolean hasValue(int desiredValue)
{
Node temp = this;
while (temp != null)
{
if (temp.getValue() == desiredValue)
return true;
temp = temp.getNext();
}
return false;
} |
I now need to be able to add some value to the end of the list. Again, I need to use a loop, but I don't want to use the same strategy, since that strategy walks off the end of the list and I need to keep a reference to the last node.
code: | public void append(int newValue)
{
Node temp = this;
while (!temp.last())
{
temp = temp.getNext();
}
temp.setNext(new Node(newValue));
} |
One last trick, again. I want to be able to test for equality. We simply walk both lists simultaneously. If they're of different lengths, we can assume false. If any one value is not equal, we automatically assume a result of false. If it walks off the end of either list without returning, we can assume they've passed all equality tests and return true.
code: | public boolean equals(Node otherList)
{
Node temp1 = this;
Node temp2 = otherList;
if (temp1.length() != temp2.length())
return false;
while (temp1 != null && temp2 != null)
{
if (temp1.getValue() != temp2.getValue())
return false;
temp1 = temp1.getNext();
temp2 = temp2.getNext();
}
return true;
} |
The full code:
code: | class Node
{
private int value;
private Node next;
public Node(int newValue)
{
value = newValue;
next = null;
}
public int getValue() { return value; }
public Node getNext() { return next; }
public void setNext(Node newNextNode) { next = newNextNode; }
public boolean last()
{
return next == null;
}
public int length()
{
Node temp = this;
int count = 0;
while (temp != null)
{
temp = temp.getNext();
count++;
}
return count;
}
public String toString()
{
StringBuffer sb = new StringBuffer();
Node temp = this;
while (temp != null)
{
sb.append(temp.getValue());
if (!temp.last())
sb.append(", ");
temp = temp.getNext();
}
return sb.toString();
}
public boolean hasValue(int desiredValue)
{
Node temp = this;
while (temp != null)
{
if (temp.getValue() == desiredValue)
return true;
temp = temp.getNext();
}
return false;
}
public void append(int newValue)
{
Node temp = this;
while (!temp.last())
{
temp = temp.getNext();
}
temp.setNext(new Node(newValue));
}
public boolean equals(Node otherList)
{
Node temp1 = this;
Node temp2 = otherList;
if (temp1.length() != temp2.length())
return false;
while (temp1 != null && temp2 != null)
{
if (temp1.getValue() != temp2.getValue())
return false;
temp1 = temp1.getNext();
temp2 = temp2.getNext();
}
return true;
}
} |
Ruby
The Ruby example will be similar to the Java example, for the most part, save that Ruby's added expressiveness will allow perhaps for more understandable code. As the various methods have been documented in the Java example,I'll simply post the code here.
code: | class Node
attr_reader :value
attr_accessor :next
def initialize(value)
@value, @next = value, nil
end
def last?
@next.nil?
end
def length
temp = self
count = 0
until temp.nil?
count += 1
temp = temp.next
end
count
end
def to_s
temp = self
str = ""
until temp.nil?
str += temp.value.to_s
str += ", " unless temp.last?
temp = temp.next
end
str
end
def has_value?(desired_value)
temp = self
until temp.nil?
return true if temp.value == desired_value
temp = temp.next
end
false
end
def <<(new_value)
temp = self
temp = temp.next until temp.last?
temp.next = Node.new(new_value)
end
def ==(other_node)
temp1 = self
temp2 = other_node
return false unless temp1.length == temp2.length
until temp1.nil? or temp2.nil?
return false unless temp1.value == temp2.value
temp1, temp2 = temp1.next, temp2.next
end
true
end
end |
Conclusions?
We've seen three implementations of a linked list. One in Haskell that primarily uses recursion, and the others in Java and Ruby, which use an object-oriented approach, but more importantly, a looping approach to walking a list.
Which is the best for educational purposes?
This is very subjective. On one hand, Haskell allows us to most clearly state the solution without having to worry overly about the state of any given variable at run-time. Thus we can say that pattern-matching and a simple run-time environment are a plus for Haskell.
On the other hand, Java is a popular language, and this example clearly demonstrated some of its quirks, like being able to return early from a function and skip executing parts of it. Additionally, since we can'toverload functions (or "methods" in Java parlance) based on the state of a data structure, it forces us to delve more deeply into run-time boolean checks.
The Ruby example has merit as well. Though it does not fare nearly so well as Haskell in very simply stating the solution, it fares better than Java. The simpler syntax really shines here, letting us state more clearly how we're solving the problem.
code: | until temp1.nil? or temp2.nil? |
Certainly reads more clearly, and more immediately conveys its purpose than:
code: | while (temp1 != null && temp2 != null) |
Uniform access is another aspect of Ruby syntax which is clearly demonstrated here. The ability to call methods without using parentheses, and the ability to use the names of instance variables for accessors means that we can pretend to be directly accessing those instance variables, even though they are safely hidden away inside the object.
The fact that we can write:
code: | attr_reader :value
attr_accessor :next |
Instead of:
code: | public int getValue() { return value; }
public Node getNext() { return next; }
public void setNext(Node newNextNode) { next = newNextNode; } |
Is simply an added benefit, but again shows that we're more cleary stating what we're doing, rather than exactly how we're doing it. Ruby fares less well in this regard than Haskell, but better than Java. How important that clarity is in education should remain open to debate.
I encourage all of you to debate this one with me. |
|
|
|
|
|
Sponsor Sponsor
|
|
|
wtd
|
Posted: Thu Mar 31, 2005 10:18 pm Post subject: Education by Example: Binary Tree |
|
|
Binary Trees
For the second installment, let's look at creating a generic binary tree. Please note that this one requires Java 5.0.
A binary tree is composed of nodes. Each node contains a value and links to two other trees. Any tree may be empty, and this state can be described in various ways depending on the language.
Haskell
First let's establish that a tree is either a branch with a value and two other trees branching off from it, or empty. The values it can store are any type, as represented by a.
code: | data Tree a = Branch a (Tree a) (Tree a) | Empty |
Next, let's create a function which determines the size of a tree. We do this in a recursive manner. Each branch has a value of one, plus the sizes of the branches coming off of it. Empty has a value of zero.
code: | treeSize Empty = 0
treeSize (Branch _ b1 b2) =
1 + treeSize b1 + treeSize b2 |
Now, let's create a function to determine if a value is within a tree. Clearly an empty tree does not. A branch has the value if its value is the desired value, or if one of its branches has the desired value.
code: | hasValue _ Empty = False
hasValue desiredValue (Branch value b1 b2) =
value == desiredValue ||
hasValue desiredValue b1 ||
hasValue desiredValue b2 |
We'll want to be able to test for equality. Again, a recursive solution. Obviously two empty trees are equal. An empty tree and anything else, though clearly is an unequal situation. Comparing two branches involves first seeing if the values they hold are equal, then checking to see if their corresponding sub-trees are equal.
code: | instance (Eq a) => Eq (Tree a) where
Empty == Empty = True
Empty == _ = False
_ == Empty = False
Branch v1 b11 b12 == Branch v2 b21 b22 =
v1 == v2 && b11 == b21 && b21 == b22 |
Now, we've seen that we can determine whether or not a tree has a particular value, but what if we want to get the tree that starts with a particular value?
There's an additional twist here. The value we're looking for might not be in the tree at all. If it isn't, then we can't return a tree. So we'll use a Maybe value. Maybe is another data type the declaration of which looks like:
code: | data Maybe a = Just a | Nothing |
Clearly, looking for a tree starting with any value in an Empty tree returns nothing. Doing the same with a branch involves a couple of things. Clearly, if the branch's value is the desired value, that's the desired tree. Otherwise look in the first branch. If it's not there, then look in the second branch.
code: | treeStartingWith _ Empty = Nothing
treeStartingWith desiredValue branch@(Branch value b1 b2)
| value == desiredValue = Just branch
| otherwise =
case result1 of
Just _ -> result1
Nothing -> result2
where
result1 = treeStartingWith desiredValue b1
result2 = treeStartingWith desiredValue b2 |
One last trick and then I'll show off the complete module and move onto Java and Ruby. Let's say I want all trees that begin with a desired value?
For this, and in other languages I'll use list types provided by the language.
Clearly, again, a search on an empty tree finds nothing. Searching on a branch should check for matching trees in the subtrees, and, if the current branch matches return that as well.
code: | allTreesBeginningWith _ Empty = []
allTreesBeginningWith desiredValue branch@(Branch value b1 b2)
| value == desiredValue = branch:all
| otherwise = all
where
t1 = allTreesBeginningWith desiredValue b1
t2 = allTreesBeginningWith desiredValue b2
all = t1 ++ t2 |
Wait, I changed my mind.
One more trick. We want a string representation of a tree. Again, not surprisingly, this is a recursive solution. The output from this will be replicated in the Java and Ruby examples.
code: | instance (Show a) => Show (Tree a) where
show Empty = "Empty"
show (Branch value b1 b2) =
"(Branch " ++ show value ++ " "
++ show b1 ++ " "
++ show b2 ++ ")" |
The full source:
code: | module BTree where
data Tree a = Branch a (Tree a) (Tree a) | Empty
treeSize Empty = 0
treeSize (Branch _ b1 b2) =
1 + treeSize b1 + treeSize b2
hasValue _ Empty = False
hasValue desiredValue (Branch value b1 b2) =
value == desiredValue ||
hasValue desiredValue b1 ||
hasValue desiredValue b2
instance (Eq a) => Eq (Tree a) where
Empty == Empty = True
Empty == _ = False
_ == Empty = False
Branch v1 b11 b12 == Branch v2 b21 b22 =
v1 == v2 && b11 == b21 && b12 == b22
treeStartingWith _ Empty = Nothing
treeStartingWith desiredValue branch@(Branch value b1 b2)
| value == desiredValue = Just branch
| otherwise =
case result1 of
Just _ -> result1
Nothing -> result2
where
result1 = treeStartingWith desiredValue b1
result2 = treeStartingWith desiredValue b2
allTreesBeginningWith _ Empty = []
allTreesBeginningWith desiredValue branch@(Branch value b1 b2)
| value == desiredValue = branch:all
| otherwise = all
where
t1 = allTreesBeginningWith desiredValue b1
t2 = allTreesBeginningWith desiredValue b2
all = t1 ++ t2
instance (Show a) => Show (Tree a) where
show Empty = "Empty"
show (Branch value b1 b2) =
"(Branch " ++ show value ++ " "
++ show b1 ++ " "
++ show b2 ++ ")" |
Java
Now, let's look at a comparable program in Java. As I mentioned at the top, this requires Java 5.0, which includes generics.
The constructors should give us the ability to set just the value, with the branches being null, or to set the value and the branches. We'll also need read-only (for simplicity) accessors.
code: | class Branch<T>
{
private T value;
private Branch<T> left, right;
public Branch(T initValue)
{
value = initValue;
left = null;
right = null;
}
public Branch(T initValue, Branch<T> initLeft, Branch<T> initRight)
{
value = initValue;
left = initLeft;
right = initRight;
}
public T getValue() { return value; }
public Branch<T> getLeft() { return left; }
public Branch<T> getRight() { return right; }
} |
The next step is to add a recursive method for calculating the size of the tree.
code: | public int size()
{
int count = 1;
if (left != null) count += left.size();
if (right != null) count += right.size();
return count;
} |
Next we'll want to determine if a tree has a particular value.
code: | public boolean hasValue(T desiredValue)
{
return value.equals(desiredValue) ||
(left != null && left.hasValue(desiredValue)) ||
(right != null && right.hasValue(desiredValue));
} |
Now, we'll want to be able to determine if two trees are equal.
code: | public boolean equals(Branch<T> otherTree)
{
if (otherTree == null)
return false;
return value.equals(otherTree.getValue()) &&
((left != null && left.equals(otherTree.getLeft()))
|| (left == null && otherTree.getLeft() == null)) &&
((right != null && right.equals(otherTree.getRight()))
|| (right == null && otherTree.getRight() == null));
} |
Getting a tree that starts with a particular value. Returns null if it can't find one. Here, as before, we have to diligently check to make sure the left and right branches are not null before we can call a method on them.
code: | public Branch<T> treeStartingWith(T desiredValue)
{
if (value.equals(desiredValue))
return this;
else
{
Branch<T> result =
left == null ? null
: left.treeStartingWith(desiredValue);
if (result != null) return result;
result =
right == null ? null
: right.treeStartingWith(desiredValue);
if (result != null) return result;
return null;
}
} |
The corresponding method of course finds all trees that start with a particular value.
code: | public ArrayList<Branch<T>> allTreesBeginningWith(T desiredValue)
{
ArrayList<Branch<T>> list = new ArrayList<Branch<T>>();
if (value.equals(desiredValue))
list.add(this);
if (left != null)
list.addAll(left.allTreesBeginningWith(desiredValue));
if (right != null)
list.addAll(right.allTreesBeginningWith(desiredValue));
return list;
} |
The final code looks like:
code: | class Branch<T>
{
private T value;
private Branch<T> left, right;
public Branch(T initValue)
{
value = initValue;
left = null;
right = null;
}
public Branch(T initValue, Branch<T> initLeft, Branch<T> initRight)
{
value = initValue;
left = initLeft;
right = initRight;
}
public T getValue() { return value; }
public Branch<T> getLeft() { return left; }
public Branch<T> getRight() { return right; }
public int size()
{
int count = 1;
if (left != null) count += left.size();
if (right != null) count += right.size();
return count;
}
public boolean hasValue(T desiredValue)
{
return value.equals(desiredValue) ||
(left != null && left.hasValue(desiredValue)) ||
(right != null && right.hasValue(desiredValue));
}
public boolean equals(Branch<T> otherTree)
{
if (otherTree == null)
return false;
return value.equals(otherTree.getValue()) &&
((left != null && left.equals(otherTree.getLeft()))
|| (left == null && otherTree.getLeft() == null)) &&
((right != null && right.equals(otherTree.getRight()))
|| (right == null && otherTree.getRight() == null));
}
public Branch<T> treeStartingWith(T desiredValue)
{
if (value.equals(desiredValue))
return this;
else
{
Branch<T> result =
left == null ? null : left.treeStartingWith(desiredValue);
if (result != null)
return result;
result =
right == null ? null : right.treeStartingWith(desiredValue);
if (result != null)
return result;
return null;
}
}
public ArrayList<Branch<T>> allTreesBeginningWith(T desiredValue)
{
ArrayList<Branch<T>> list = new ArrayList<Branch<T>>();
if (value.equals(desiredValue))
list.add(this);
if (left != null)
list.addAll(left.allTreesBeginningWith(desiredValue));
if (right != null)
list.addAll(right.allTreesBeginningWith(desiredValue));
return list;
}
public String toString()
{
StringBuffer sb = new StringBuffer("(Branch ");
sb.append(value);
sb.append(" ");
sb.append(left == null ? "Empty" : left);
sb.append(" ");
sb.append(right == null ? "Empty" : right);
sb.append(")");
return sb.toString();
}
} |
Ruby
Having fairly well documented both the Haskell and Java example implementations, I feel safe in providing little documentation for the Ruby example.
The only thing I'll point out is that variables in Ruby dn't have types themselves (values do). Thus no extra syntax is required for the tree to be generic.
code: | class Branch
attr_reader :value, :left, :right
def initialize(value, left = nil, right = nil)
@value, @left, @right = value, left, right
end
def size
1 + (@left.nil? ? 0 : @left.size) + (@right.nil? ? 0 : @right.size)
end
def has_value?(desired_value)
@value == desired_value or
(@left.nil? ? false : (@left.has_value? desired_value)) or
(@right.nil? ? false : (@right.has_value? desired_value))
end
def ==(other_tree)
return false if other_tree.nil?
@value == other_tree.value and
(@left.nil? ? other_tree.left.nil? : @left == other_tree.left) and
(@right.nil? ? other_tree.right.nil? : @right == other_tree.right)
end
def tree_starting_with(desired_value)
if @value == desired_value
self
else
(@left.nil? ? nil : @left.tree_starting_with(desired_value)) or
(@right.nil? ? nil : @right.tree_starting_with(desired_value))
end
end
def all_trees_beginning_with(desired_value)
list = []
list.push(self) if @value == desired_value
list.push(*@left.all_trees_beginning_with(desired_value)) unless @left.nil?
list.push(*@right.all_trees_beginning_with(desired_value)) unless @right.nil?
list
end
def to_s
"(Branch #{@value} #{@left.nil? ? "Empty" : @left} #{@right.nil? ? "Empty" : @right})"
end
end |
Conclusions?
The single biggest thing to note here is the use of the null or nil values in the Java and Ruby examples. This may seem okay, but it really presents us with a problem. We can't treat an empty tree as a tree at all, so we can't have a method that will find the size of an empty tree. Instead we have to test for the tree being null (or "nil") and then treat it differently.
With Haskell, we can treat an empty tree as a tree, meaning we need not test at run-time for null or nil. As a direct result of this, the recursion flows far more naturally.
Please, debate! |
|
|
|
|
|
|
|