Computer Science Canada

Education by Example: Linked List

Author:  wtd [ Tue Mar 29, 2005 2:39 pm ]
Post subject:  Education by Example: Linked List

Introduction

One of the crazy things about learning, is that you generally have to learn stuff others already have. You have to reinvent the wheel to really understand how the wheel works. Computer science is a perfect example of this.

The great, all-important question is how to reinvent the wheel. What tools should someone use to do so?

The following takes a look at implementing basic data structures (and functions of them) in three different programming languages. Two of them I am fond of, and the third is Java, which has risen to quite a bit of popularity in computer science education. Noteworthy is the fact that all of the included languages use garbage collection, and thus free the programmer from thinking about memory management issues. I may expand this and change that later.

Linked List

The linked list is a simple data structure, and one that is quite useful for having collections of data that can grow beyond preset bounds. It's is composed of nodes. In a singly linked list each node is connected to a succeeding node. Link a chain, it's then possible to get from one end of the list to the other by following links. A doubly linked list simply lets one easily go backward as well as forward. For the same of this discussion we'll look at singly linked lists.

For the purposes of this discussion the linked list need only store integers. Subsequent discussion will give us the ability to store many other types of data. Onto the code!

Haskell

code:
data Node = Node Int Node | Empty


Here I've succinctly laid out the rules for a linked list. A node either holds an int and another node, or is empty.

I need now a way of determining if the list is empty. Again, we're basically just spelling out the rules for a linked list.

code:
empty Empty = True
empty _ = False


Now, what if we want to find out if a node is the last in a list? Clearly it is if the node is empty, or if the next node is Empty.

code:
lastNode Empty = True
lastNode (Node _ Empty) = True
lastNode _ = False


A slightly more complicated question: how long is the list? To find this out we have to walk through the list. Basically we start with a node. If it's not empty we count one, then add that to the result of finding the length of the list for the next node. An empty node has a length of zero.

code:
listLength Empty = 0
listLength (Node _ next) = 1 + length next


The next question is: how do we print everything in the list?

Perhaps the question should be: how do we generate a string representation of the list? Once we have a string, we can use standard printing functions.

An empty list gives us an empty string. Otherwise tack on a string representation of the value, and if it's not the last node, also tack on a comma and the results of stringifying the rest of the list.

code:
instance Show Node where
  show Empty = ""
  show (Node value Empty) = show value
  show (Node value next) = show value ++ ", " ++ show next


Let's create a function which checks to see if a value is in a list. For this we walk over the list. If a node has the value we're looking for, it's true. Otherwise check to see if it's in the rest of the list. Obviously, an empty node will never contain the desired value.

code:
hasValue _ Empty = False
hasValue desiredValue (Node value next) =
  value == desiredValue || hasValue desiredValue next


Now let's add something to the end of the list. If the existing list is empty, then we'll simply create a new node holding that value. Otherwise we take the exising node, but switch its next node to the result of adding the new value to the rest of the list.

code:
addToList newValue Empty = Node newValue Empty
addToList newValue (Node existingValue next) =
  Node existingValue (addToList newValue next)


One last trick. Let's check to see if two lists are equal. This means they must contain all of the same values in the same order. Obviously two empty lists are equal. An empty list is not equal to anything else. For nodes containing values, the values have to be equal, and the remaining lists have to also be equal.

code:
instance Eq Node where
  Empty == Empty = True
  Empty == _ = False
  _ == Empty = False
  Node value1 next1 == Node value2 next2 =
    value1 == value2 && next1 == next2


So, our linked list code looks like:

code:
module LinkedList where

data Node = Node Int Node | Empty

instance Show Node where
  show Empty = ""
  show (Node value Empty) = show value
  show (Node value next) = show value ++ ", " ++ show next

instance Eq Node where
  Empty == Empty = True
  Empty == _ = False
  _ == Empty = False
  Node value1 next1 == Node value2 next2 =
    value1 == value2 && next1 == next2

addToList newValue Empty = Node newValue Empty
addToList newValue (Node existingValue next) =
  Node existingValue (addToList newValue next)

hasValue _ Empty = False
hasValue v (Node nv next) = v == nv || hasValue v next

empty Empty = True
empty _ = False

lastNode Empty = True
lastNode (Node _ Empty) = True
lastNode _ = False

listLength Empty = 0
listLength (Node _ next) = 1 + listLength next


The most important thing to take away from this example implementation is that Haskell allowed me to fairly directly describe the problem. Next we look at a Java solution.

Java

Java is a rather heavily object-oriented language, so we start by creaing a class to represent a node. Each node should contain an integer and anothing node. That node can be null, indidating that there's no next node.

We'll also create a constructor.

code:
class Node
{
   private int value;
   private Node next;

   public Node(int newValue)
   {
      value = newValue;
      next  = null;
   }
}


Next we create accessors to allow us to get the value, and the next node.

code:
class Node
{
   private int value;
   private Node next;

   public Node(int newValue)
   {
      value = newValue;
      next  = null;
   }

   public int getValue() { return value; }
   public Node getNext() { return next; }
   public void setNext(Node newNextNode) { next = newNextNode; }
}


Now, since we're using a null Node object to represent the empty state, we can't call any methods on it, and testing for the null state will take the place of the "empty" function in the Haskell example.

Next in the previous example, I created a function to see if the current node was the last in the list. For this we can simply test to see if the next node is null.

code:
public boolean last()
{
   return next == null;
}


Now, let's create a method to walk a list and determine its length. Haskell has no looping syntax, so we used the more natural recursion. Java does have loops, so we'll use those. Here we use a temporary node which points to each node in turn until the node is null.

code:
public int length()
{
   Node temp = this;
   int count = 0;
   
   while (temp != null)
   {
      temp = temp.getNext();
      count++;
   }

   return count;
}


We can use the same walking pattern to generate a string representation of the list. We'll also use a buffer to prevent memory problems.

code:
public String toString()
{
   StringBuffer sb = new StringBuffer();
   Node temp = this;
   
   while (temp != null)
   {
      sb.append(temp.getValue());
      if (!temp.last())
         sb.append(", ");

      temp = temp.getNext();
   }

   return sb.toString();
}


Again, we'll apply the walking pattern to the check to see if the list contains a value. The problem here, that's rather easy to see is that this relies heavily on the particular behavior of the language. The false value is returned if nothing is returned from within the loop.

code:
public boolean hasValue(int desiredValue)
{
   Node temp = this;
     
   while (temp != null)
   {
      if (temp.getValue() == desiredValue)
         return true;

      temp = temp.getNext();
   }

   return false;
}


I now need to be able to add some value to the end of the list. Again, I need to use a loop, but I don't want to use the same strategy, since that strategy walks off the end of the list and I need to keep a reference to the last node.

code:
public void append(int newValue)
{
   Node temp = this;
   
   while (!temp.last())
   {
      temp = temp.getNext();
   }

   temp.setNext(new Node(newValue));
}


One last trick, again. I want to be able to test for equality. We simply walk both lists simultaneously. If they're of different lengths, we can assume false. If any one value is not equal, we automatically assume a result of false. If it walks off the end of either list without returning, we can assume they've passed all equality tests and return true.

code:
public boolean equals(Node otherList)
{
   Node temp1 = this;
   Node temp2 = otherList;

   if (temp1.length() != temp2.length())
      return false;

   while (temp1 != null && temp2 != null)
   {
      if (temp1.getValue() != temp2.getValue())
         return false;

      temp1 = temp1.getNext();
      temp2 = temp2.getNext();
   }

   return true;
}


The full code:

code:
class Node
{
   private int value;
   private Node next;

   public Node(int newValue)
   {
      value = newValue;
      next  = null;
   }

   public int getValue() { return value; }
   public Node getNext() { return next; }
   public void setNext(Node newNextNode) { next = newNextNode; }

   public boolean last()
   {
      return next == null;
   }

   public int length()
   {
      Node temp = this;
      int count = 0;
     
      while (temp != null)
      {
         temp = temp.getNext();
         count++;
      }

      return count;
   }

   public String toString()
   {
      StringBuffer sb = new StringBuffer();
      Node temp = this;
   
      while (temp != null)
      {
         sb.append(temp.getValue());
         if (!temp.last())
            sb.append(", ");

         temp = temp.getNext();
      }

      return sb.toString();
   }

   public boolean hasValue(int desiredValue)
   {
      Node temp = this;
     
      while (temp != null)
      {
         if (temp.getValue() == desiredValue)
            return true;

         temp = temp.getNext();
      }

      return false;
   }

   public void append(int newValue)
   {
      Node temp = this;
   
      while (!temp.last())
      {
         temp = temp.getNext();
      }

      temp.setNext(new Node(newValue));
   }

   public boolean equals(Node otherList)
   {
      Node temp1 = this;
      Node temp2 = otherList;

      if (temp1.length() != temp2.length())
         return false;

      while (temp1 != null && temp2 != null)
      {
         if (temp1.getValue() != temp2.getValue())
            return false;

         temp1 = temp1.getNext();
         temp2 = temp2.getNext();
      }

      return true;
   }
}


Ruby

The Ruby example will be similar to the Java example, for the most part, save that Ruby's added expressiveness will allow perhaps for more understandable code. As the various methods have been documented in the Java example,I'll simply post the code here.

code:
class Node
   attr_reader :value
   attr_accessor :next

   def initialize(value)
      @value, @next = value, nil
   end

   def last?
      @next.nil?
   end

   def length
      temp = self
      count = 0

      until temp.nil?
         count += 1
         temp = temp.next
      end

      count
   end

   def to_s
      temp = self
      str = ""

      until temp.nil?
         str += temp.value.to_s
         str += ", " unless temp.last?

         temp = temp.next
      end

      str
   end

   def has_value?(desired_value)
      temp = self
     
      until temp.nil?
         return true if temp.value == desired_value
         temp = temp.next
      end

      false
   end

   def <<(new_value)
      temp = self

      temp = temp.next until temp.last?

      temp.next = Node.new(new_value)
   end

   def ==(other_node)
      temp1 = self
      temp2 = other_node

      return false unless temp1.length == temp2.length

      until temp1.nil? or temp2.nil?
         return false unless temp1.value == temp2.value
         temp1, temp2 = temp1.next, temp2.next
      end

      true
   end
end


Conclusions?

We've seen three implementations of a linked list. One in Haskell that primarily uses recursion, and the others in Java and Ruby, which use an object-oriented approach, but more importantly, a looping approach to walking a list.

Which is the best for educational purposes?

This is very subjective. On one hand, Haskell allows us to most clearly state the solution without having to worry overly about the state of any given variable at run-time. Thus we can say that pattern-matching and a simple run-time environment are a plus for Haskell.

On the other hand, Java is a popular language, and this example clearly demonstrated some of its quirks, like being able to return early from a function and skip executing parts of it. Additionally, since we can'toverload functions (or "methods" in Java parlance) based on the state of a data structure, it forces us to delve more deeply into run-time boolean checks.

The Ruby example has merit as well. Though it does not fare nearly so well as Haskell in very simply stating the solution, it fares better than Java. The simpler syntax really shines here, letting us state more clearly how we're solving the problem.

code:
until temp1.nil? or temp2.nil?


Certainly reads more clearly, and more immediately conveys its purpose than:

code:
while (temp1 != null && temp2 != null)


Uniform access is another aspect of Ruby syntax which is clearly demonstrated here. The ability to call methods without using parentheses, and the ability to use the names of instance variables for accessors means that we can pretend to be directly accessing those instance variables, even though they are safely hidden away inside the object.

The fact that we can write:

code:
attr_reader :value
attr_accessor :next


Instead of:

code:
public int getValue() { return value; }
public Node getNext() { return next; }
public void setNext(Node newNextNode) { next = newNextNode; }


Is simply an added benefit, but again shows that we're more cleary stating what we're doing, rather than exactly how we're doing it. Ruby fares less well in this regard than Haskell, but better than Java. How important that clarity is in education should remain open to debate.

I encourage all of you to debate this one with me. Smile

Author:  wtd [ Thu Mar 31, 2005 10:18 pm ]
Post subject:  Education by Example: Binary Tree

Binary Trees

For the second installment, let's look at creating a generic binary tree. Please note that this one requires Java 5.0.

A binary tree is composed of nodes. Each node contains a value and links to two other trees. Any tree may be empty, and this state can be described in various ways depending on the language.

Haskell

First let's establish that a tree is either a branch with a value and two other trees branching off from it, or empty. The values it can store are any type, as represented by a.

code:
data Tree a = Branch a (Tree a) (Tree a) | Empty


Next, let's create a function which determines the size of a tree. We do this in a recursive manner. Each branch has a value of one, plus the sizes of the branches coming off of it. Empty has a value of zero.

code:
treeSize Empty = 0
treeSize (Branch _ b1 b2) =
  1 + treeSize b1 + treeSize b2


Now, let's create a function to determine if a value is within a tree. Clearly an empty tree does not. A branch has the value if its value is the desired value, or if one of its branches has the desired value.

code:
hasValue _ Empty = False
hasValue desiredValue (Branch value b1 b2) =
  value == desiredValue ||
  hasValue desiredValue b1 ||
  hasValue desiredValue b2


We'll want to be able to test for equality. Again, a recursive solution. Obviously two empty trees are equal. An empty tree and anything else, though clearly is an unequal situation. Comparing two branches involves first seeing if the values they hold are equal, then checking to see if their corresponding sub-trees are equal.

code:
instance (Eq a) => Eq (Tree a) where
  Empty == Empty = True
  Empty == _ = False
  _ == Empty = False
  Branch v1 b11 b12 == Branch v2 b21 b22 =
    v1 == v2 && b11 == b21 && b21 == b22


Now, we've seen that we can determine whether or not a tree has a particular value, but what if we want to get the tree that starts with a particular value?

There's an additional twist here. The value we're looking for might not be in the tree at all. If it isn't, then we can't return a tree. So we'll use a Maybe value. Maybe is another data type the declaration of which looks like:

code:
data Maybe a = Just a | Nothing


Clearly, looking for a tree starting with any value in an Empty tree returns nothing. Doing the same with a branch involves a couple of things. Clearly, if the branch's value is the desired value, that's the desired tree. Otherwise look in the first branch. If it's not there, then look in the second branch.

code:
treeStartingWith _ Empty = Nothing
treeStartingWith desiredValue branch@(Branch value b1 b2)
  | value == desiredValue = Just branch
  | otherwise =
    case result1 of
      Just _ -> result1
      Nothing -> result2
  where
    result1 = treeStartingWith desiredValue b1
    result2 = treeStartingWith desiredValue b2


One last trick and then I'll show off the complete module and move onto Java and Ruby. Let's say I want all trees that begin with a desired value?

For this, and in other languages I'll use list types provided by the language.

Clearly, again, a search on an empty tree finds nothing. Searching on a branch should check for matching trees in the subtrees, and, if the current branch matches return that as well.

code:
allTreesBeginningWith _ Empty = []
allTreesBeginningWith desiredValue branch@(Branch value b1 b2)
  | value == desiredValue = branch:all
  | otherwise = all
  where
    t1 = allTreesBeginningWith desiredValue b1
    t2 = allTreesBeginningWith desiredValue b2
    all = t1 ++ t2


Wait, I changed my mind. Smile

One more trick. We want a string representation of a tree. Again, not surprisingly, this is a recursive solution. The output from this will be replicated in the Java and Ruby examples.

code:
instance (Show a) => Show (Tree a) where
  show Empty = "Empty"
  show (Branch value b1 b2) =
    "(Branch " ++ show value ++ " "
               ++ show b1    ++ " "
               ++ show b2    ++ ")"


The full source:

code:
module BTree where

data Tree a = Branch a (Tree a) (Tree a) | Empty

treeSize Empty = 0
treeSize (Branch _ b1 b2) =
  1 + treeSize b1 + treeSize b2

hasValue _ Empty = False
hasValue desiredValue (Branch value b1 b2) =
  value == desiredValue ||
  hasValue desiredValue b1 ||
  hasValue desiredValue b2

instance (Eq a) => Eq (Tree a) where
  Empty == Empty = True
  Empty == _ = False
  _ == Empty = False
  Branch v1 b11 b12 == Branch v2 b21 b22 =
    v1 == v2 && b11 == b21 && b12 == b22

treeStartingWith _ Empty = Nothing
treeStartingWith desiredValue branch@(Branch value b1 b2)
  | value == desiredValue = Just branch
  | otherwise =
    case result1 of
      Just _ -> result1
      Nothing -> result2
  where
    result1 = treeStartingWith desiredValue b1
    result2 = treeStartingWith desiredValue b2

allTreesBeginningWith _ Empty = []
allTreesBeginningWith desiredValue branch@(Branch value b1 b2)
  | value == desiredValue = branch:all
  | otherwise = all
  where
    t1 = allTreesBeginningWith desiredValue b1
    t2 = allTreesBeginningWith desiredValue b2
    all = t1 ++ t2

instance (Show a) => Show (Tree a) where
  show Empty = "Empty"
  show (Branch value b1 b2) =
    "(Branch " ++ show value ++ " "
               ++ show b1    ++ " "
               ++ show b2    ++ ")"


Java

Now, let's look at a comparable program in Java. As I mentioned at the top, this requires Java 5.0, which includes generics.

The constructors should give us the ability to set just the value, with the branches being null, or to set the value and the branches. We'll also need read-only (for simplicity) accessors.

code:
class Branch<T>
{
  private T value;
  private Branch<T> left, right;

  public Branch(T initValue)
  {
    value = initValue;
    left = null;
    right = null;
  }

  public Branch(T initValue, Branch<T> initLeft, Branch<T> initRight)
  {
    value = initValue;
    left = initLeft;
    right = initRight;
  }

  public T getValue() { return value; }
  public Branch<T> getLeft() { return left; }
  public Branch<T> getRight() { return right; }
}


The next step is to add a recursive method for calculating the size of the tree.

code:
public int size()
{
  int count = 1;
  if (left != null) count += left.size();
  if (right != null) count += right.size();
  return count;
}


Next we'll want to determine if a tree has a particular value.

code:
public boolean hasValue(T desiredValue)
{
  return value.equals(desiredValue) ||
         (left != null && left.hasValue(desiredValue)) ||
         (right != null && right.hasValue(desiredValue));
}


Now, we'll want to be able to determine if two trees are equal.

code:
public boolean equals(Branch<T> otherTree)
{
  if (otherTree == null)
    return false;
 
  return value.equals(otherTree.getValue()) &&
         ((left != null && left.equals(otherTree.getLeft()))
          || (left == null && otherTree.getLeft() == null)) &&
         ((right != null && right.equals(otherTree.getRight()))
          || (right == null && otherTree.getRight() == null));
}


Getting a tree that starts with a particular value. Returns null if it can't find one. Here, as before, we have to diligently check to make sure the left and right branches are not null before we can call a method on them.

code:
public Branch<T> treeStartingWith(T desiredValue)
{
  if (value.equals(desiredValue))
    return this;
  else
  {
    Branch<T> result =
      left == null ? null
                   : left.treeStartingWith(desiredValue);

    if (result != null) return result;

    result =
      right == null ? null
                    : right.treeStartingWith(desiredValue);

    if (result != null) return result;

    return null;
  }
}


The corresponding method of course finds all trees that start with a particular value.

code:
public ArrayList<Branch<T>> allTreesBeginningWith(T desiredValue)

  ArrayList<Branch<T>> list = new ArrayList<Branch<T>>();

  if (value.equals(desiredValue))
    list.add(this);

  if (left != null)
    list.addAll(left.allTreesBeginningWith(desiredValue));

  if (right != null)
    list.addAll(right.allTreesBeginningWith(desiredValue));
     
  return list;
}


The final code looks like:

code:
class Branch<T>
{
  private T value;
  private Branch<T> left, right;

  public Branch(T initValue)
  {
    value = initValue;
    left = null;
    right = null;
  }

  public Branch(T initValue, Branch<T> initLeft, Branch<T> initRight)
  {
    value = initValue;
    left = initLeft;
    right = initRight;
  }

  public T getValue() { return value; }
  public Branch<T> getLeft() { return left; }
  public Branch<T> getRight() { return right; }

  public int size()
  { 
    int count = 1;
    if (left != null) count += left.size();
    if (right != null) count += right.size();
    return count;
  }

  public boolean hasValue(T desiredValue)
  {
    return value.equals(desiredValue) ||
           (left != null && left.hasValue(desiredValue)) ||
           (right != null && right.hasValue(desiredValue));
  }

  public boolean equals(Branch<T> otherTree)
  {
    if (otherTree == null)
      return false;
 
    return value.equals(otherTree.getValue()) &&
           ((left != null && left.equals(otherTree.getLeft()))
            || (left == null && otherTree.getLeft() == null)) &&
           ((right != null && right.equals(otherTree.getRight()))
            || (right == null && otherTree.getRight() == null));
  }

  public Branch<T> treeStartingWith(T desiredValue)
  {
    if (value.equals(desiredValue))
      return this;
    else
    {
      Branch<T> result =
        left == null ? null : left.treeStartingWith(desiredValue);

      if (result != null)
        return result;

      result =
        right == null ? null : right.treeStartingWith(desiredValue);

      if (result != null)
        return result;

      return null;
    }
  }

  public ArrayList<Branch<T>> allTreesBeginningWith(T desiredValue)
  {
    ArrayList<Branch<T>> list = new ArrayList<Branch<T>>();

    if (value.equals(desiredValue))
      list.add(this);

    if (left != null)
      list.addAll(left.allTreesBeginningWith(desiredValue));

    if (right != null)
      list.addAll(right.allTreesBeginningWith(desiredValue));
     
    return list;
  }

  public String toString()
  {
    StringBuffer sb = new StringBuffer("(Branch ");
    sb.append(value);
    sb.append(" ");
    sb.append(left == null ? "Empty" : left);
    sb.append(" ");
    sb.append(right == null ? "Empty" : right);
    sb.append(")");
    return sb.toString();
  }
}


Ruby

Having fairly well documented both the Haskell and Java example implementations, I feel safe in providing little documentation for the Ruby example.

The only thing I'll point out is that variables in Ruby dn't have types themselves (values do). Thus no extra syntax is required for the tree to be generic.

code:
class Branch
  attr_reader :value, :left, :right

  def initialize(value, left = nil, right = nil)
    @value, @left, @right = value, left, right
  end

  def size
    1 + (@left.nil? ? 0 : @left.size) + (@right.nil? ? 0 : @right.size)
  end

  def has_value?(desired_value)
     @value == desired_value or
    (@left.nil? ? false : (@left.has_value? desired_value)) or
    (@right.nil? ? false : (@right.has_value? desired_value))
  end

  def ==(other_tree)
    return false if other_tree.nil?

     @value == other_tree.value and
    (@left.nil? ? other_tree.left.nil? : @left == other_tree.left) and
    (@right.nil? ? other_tree.right.nil? : @right == other_tree.right)
  end

  def tree_starting_with(desired_value)
    if @value == desired_value
      self
    else
      (@left.nil? ? nil : @left.tree_starting_with(desired_value)) or
      (@right.nil? ? nil : @right.tree_starting_with(desired_value))
    end
  end

  def all_trees_beginning_with(desired_value)
    list = []

    list.push(self) if @value == desired_value
    list.push(*@left.all_trees_beginning_with(desired_value)) unless @left.nil?
    list.push(*@right.all_trees_beginning_with(desired_value)) unless @right.nil?

    list
  end

  def to_s
    "(Branch #{@value} #{@left.nil? ? "Empty" : @left} #{@right.nil? ? "Empty" : @right})"
  end
end


Conclusions?

The single biggest thing to note here is the use of the null or nil values in the Java and Ruby examples. This may seem okay, but it really presents us with a problem. We can't treat an empty tree as a tree at all, so we can't have a method that will find the size of an empty tree. Instead we have to test for the tree being null (or "nil") and then treat it differently.

With Haskell, we can treat an empty tree as a tree, meaning we need not test at run-time for null or nil. As a direct result of this, the recursion flows far more naturally.

Please, debate! Smile


: