Computer Science Canada

Easily Sorting Complex Data

Author:  wtd [ Fri Jun 24, 2005 7:13 pm ]
Post subject:  Easily Sorting Complex Data

Let's say we have a bunch of data, like an array of names. Let's further specify that each of those names is a hash, like:

code:
{"first" => "Bob", "last" => "Smith"}


Now, let's programmatically build a list of, oh... 10 names.

code:
names = []

10.times do
   print "First name? "
   first_name = gets.strip
   print "Last name? "
   last_name = gets.strip
   names.push({"first" => first_name, "last" => last_name})
end


Now we have 10 names in an array. Let's just quickly look at a bit of block usage to clean this up. Blocks are very important and powerful in Ruby.

code:
names = Array.new 10 do
   print "First name? "
   first_name = gets.strip
   print "Last name? "
   last_name = gets.strip
   {"first" => first_name, "last" => last_name}
end


Now, how do we sort this based on first or last name?

Well, let's look at the hard way first. Let's start with sorting by last name. First we need to extract a list of all of the last names.

code:
last_names = names.collect { |name| name["last"] }


We also need to sort that.

code:
last_names = names.collect { |name| name["last"] }.sort


We also want to remove duplicates, which we might as well do before sorting.

code:
last_names = names.collect { |name| name["last"] }.uniq.sort


Now, we have a list of sorted last names, and only unique last names. Let's print the names in this sorted order.

We need to iterate over the sorted last names. For each sorted last name we need to iterate over the original set of name data and only print a name if the last names match.

code:
last_names.each do |last_name|
   names.each do |name|
      p name if name["last"] == last_name
   end
end


Now, if we want to put all of the names in sorted order into a new array, we do something very similar.

code:
sorted_names = []

last_names.each do |last_name|
   names.each do |name|
      sorted_names.push(name) if name["last"] == last_name
   end
end


We can use the select method to clean up that inner loop.

code:
sorted_names = []

last_names.each do |last_name|
   names.select { |name| name["last"] == last_name }.each do |name|
      sorted_names.push(name)
   end
end


Or...

We could simply use the sort method, with a block. You see, sorting inevitably comes down to a comparison between two adjacent elements. Using a block we can tell Ruby's sort method exactly how to make that comparison.

code:
sorted_names = names.sort { |a, b| a["last"] <=> b["last"] }


The double-ended arrow operator is simply Ruby's way of saying "compare". For less than it returns -1; for equals it returns 0; for greater than, 1.

Sorting by first name is then as simple as:

code:
sorted_names = names.sort { |a, b| a["first"] <=> b["first"] }


And if we want to sort by both...

code:
sorted_names = names.sort { |a, b| a["first"] <=> b["first"] }.sort { |a, b| a["last"] <=> b["last"] }


Any questions?

Author:  wtd [ Fri Jun 24, 2005 8:33 pm ]
Post subject: 

As a further example, let's represent a name as a class.

code:
class Name
   attr_reader :first, :last
 
   def initialize(first, last)
      @first, @last = first, last
   end
end


code:
sorted_names = names.sort { |a, b| a.last <=> b.last }


: