O'Caml Flexibility: Why You Should Check It Out

wtd · **Posted:** Mon May 02, 2005 7:26 pm

I'll say it up front. O'Caml isn't the prettiest of programming languages. It has a lot of syntactic warts.

That said, should you learn it? Absolutely.

O'Caml is winning over more and more people because it's quite flexible. The language itself offers all of the usual functional goodness. Functions are first class and can be partially applied. Types are inferred wherever possible. It's statically typed, catching virtually all type errors at compile time. It has pattern matching. In addition, there are imperative features, good support for object-oriented programming, and a good module system.

Running an O'Caml program is another place where its flexibility can be seen. O'Caml programs can be compiled to bytecode and interpreted. They can be compiled to native machine code. Or, for learners, there's a powerful interactive interpreter.

wtd · **Posted:** Tue May 03, 2005 2:34 am

Since O'Caml is often seen as a cleaner, more expressive alternative to C++ that's quicker to program in, and often produces faster code, please feel free to ask for examples implemented in both. Smile

For a starter, let's consider programs to list out all command-line arguments.

Ocaml:

open Array
open Sys;;

Array.iter print_endline Sys.argv;;

c++:

#include <iostream>

int main(int argc, char **argv)
{
for (int i = 0; i < argc; ++i)
{
std::cout << argv[i] << std::endl;
}

return 0;
}

Or, if we were to take the imperative approach in O'Caml.

Ocaml:

open Array
open Sys;;

let argc = Array.length Sys.argv in
for i = 0 to argc - 1 do
print_endline Sys.argv.(i)
done;;

Certainly the first program is the cleanest of the three, but even with the imperative O'Caml program, the for loop is certainly more expressive than C++'s.

wtd · **Posted:** Tue May 03, 2005 5:25 pm

The classic grades problem. Read ten grades. Find the average, highest grade and lowest grade.

Ocaml:

open Printf
open Array

include Printf;;

let grades = Array.create 10 0
and sum = Array.fold_left (+) 0
and max = Array.fold_left (fun a b -> if a > b then a else b) 0
and min = Array.fold_left (fun a b -> if a < b then a else b) 10000
and get_grade () = try read_int () with _ -> 0
in
for i = 0 to Array.length grades - 1 do
grades.(i) <- get_grade ()
done;
let avg = float_of_int (sum grades) /. float_of_int (Array.length grades)
and max_grade = max grades
and min_grade = min grades
in
printf "Average: %f\n" avg;
printf "Max: %d\n" max_grade;
printf "Min: %d\n" min_grade;;

c++:

#include <iostream>

int main()
{
int grades[10];

for (int i(0); i < 10; ++i)
{
std::cin >> grades[i];
if (std::cin.fail())
{
grades[i] = 0;
}
}

int sum(0);

for (int i(0); i < 10; ++i)
{
sum += grades[i];
}

double average(static_cast<double>(sum) / 10);

int max(grades[0]);

for (int i(1); i < 10; ++i)
{
if (grades[i] > max)
{
max = grades[i];
}
}

int min(grades[0]);

for (int i(1); i < 10; ++i)
{
if (grades[i] < max)
{
max = grades[i];
}
}

std::cout << "Average: " << average << std::endl
<< "Max: " << max << std::endl
<< "Min: " << min << std::endl;

return 0;
}

Naveg · **Posted:** Tue May 03, 2005 6:01 pm

did i miss something important or are those two posts strikingly similar?

wtd · **Posted:** Tue May 03, 2005 10:47 pm

No. I had some instability just as I posted that screwed things up.

Fortunately I'm a moderator so I can fix my goofs. Wink

wtd · **Posted:** Wed May 04, 2005 12:36 am

Ocaml:

class name f l =
object (self)
val first = f
val last = l
method first = first
method last = last
method full_name = first ^ " " ^ last
end

class formal_name t f l =
object (self)
inherit name f l as super
val title = t
method title = title
method full_name = title ^ " " ^ super#full_name
end

let bobs_name = new formal_name "Sir" "Bob" "Smith" in
print_endline bobs_name#full_name;;

c++:

class name
{
private:
const std::string _first, _last;
public:
name(std::string f, std::string l);

std::string first() const;
std::string last() const;
std::string full_name() const;
};

class formal_name : public name
{
private:
const std::string _title;
public:
formal_name(std::string t, std::string f, std::string l);

std::string title() const;
std::string full_name() const;
};

int main()
{
formal_name bobs_name("Sir", "Bob", "Smith");

std::cout << bobs_name.full_name() << std::endl;

return 0;
}

name::name(std::string f, std::string l)
: _first(f)
, _last(l)
{ }

std::string name::first() const
{
return _first;
}

std::string name::last() const
{
return _last;
}

std::string name::full_name() const
{
return _first + " " + _last;
}

formal_name::formal_name(std::string t, std::string f, std::string l)
: name(f, l)
, _title(t)
{ }

std::string formal_name::title() const
{
return _title;
}

std::string formal_name::full_name() const
{
return _title + " " + name::full_name();
}

wtd · **Posted:** Wed May 04, 2005 2:26 pm

A basic "pyramid of asterisks program, imperatively in both languages.

Ocaml:

let default_levels = 8
let levels =
let default_levels = 8
in
try int_of_string Sys.argv.(1)
with
Failure "int_of_string" ->
print_endline "Please enter an integer.";
exit 1
| _ -> default_levels

let draw_pyramid ch sz =
for i = 0 to sz - 1 do
let num_chars = i * 2 + 1
and num_spaces = sz - i - 1 in
let chars = String.make num_chars ch
and spaces = String.make num_spaces ' '
in
print_endline (spaces ^ chars)
done;;

draw_pyramid '*' levels

c++:

#include <iostream>
#include <string>
#include <sstream>

void draw_pyramid(char ch, int sz)
{
for (int i(0); i < sz; ++i)
{
int num_chars = i * 2 + 1;
int num_spaces = sz - i - 1;
std::string chars(num_chars, ch);
std::string spaces(num_spaces, ' ');

std::cout << spaces << chars << std::endl;
}
}

int main(int argc, char **argv)
{
const int default_levels(8);
int levels;

if (argc >= 2)
{
std::stringstream ss;
ss << argv[1];
ss >> levels;
if (ss.fail())
{
std::cerr << "Please enter an integer." << std::endl;
return 1;
}
}
else
{
levels = default_levels;
}

draw_pyramid('*', levels);

return 0;
}

md · **Posted:** Wed May 04, 2005 3:35 pm

Personally I find the C++ code in all your examples much more strait forward. They might be more lines, but their easy to follow. As for flexibility, does being able to do things in less lines (at the cost of a horrible syntax...) make a language moreflexible then another language that does not make such sacrafices? Given that both languages are turing complete, they are really both equally flexible...

wtd · **Posted:** Wed May 04, 2005 5:08 pm

Well, when I speak of flexibility, I actually am primarily referring to the ways to run O'Caml programs. You can compile to native code (ocamlopt is a very good optimizing compiler for 9 different CPU architectures and very frequently produces faster executables than C++ compilers). You can compile to bytecode and use a virtual machine, like Java, but with better garbage collection. You can use the "top-level" to interactively execute commands and test code.

Only one of those is an option with C++.

For the language itself, you have a number of advantages.

The language is statically-typed, and more type-safe than C++. If it compiles, you're pretty much guaranteed there aren't any type problems.

Types are inferred. Did I once actually type out the name of a type in the above code? The compiler is able to figure out from how I used values what type they are.

You get functions as first class items. See:

code:

Array.iter print_endline Sys.argv

The "print_endline" function is passed as a value to the iter function (in the Array module).

You have sane loops. So does C++, you're tempted to say. The problem, though, is that the index in a C++ loop can be mutated within the loop body. Consider:

code:

for (int i(0); i < 10; ++i)
{
i--;
}

Why should this kind of madness be allowed?

I personally also find the O'Caml for loop much more expressive.

code:

for i = 0 to 9 do ... done

Consider the ability also to have what a C++ programmer would think of as "control structures" return values.

So instead of:

code:

int a;

if (pigs.fly())
a = 42;
else
a = 67;

We can have something like:

code:

let a = if pigs#fly then 42 else 67

Now, granted, we could have written:

code:

int a(pigs.fly() ? 42 : 67);

But I find the 'Caml version more expressive.

Functions can easily return more than one value.

Let's say we want to feed a number to a C++ function, then get that number, that number times 2, and that number squared.

Well, we need a type that can hold more than one value.

c++:

struct Result
{
const int original, times_two, squared;
Result(int o, int t, int s)
: original(o)
, times_two(t)
, squared(s)
{ }
};

Result foo(int n)
{
return Result(n, n * 2, n * n);
}

Now, I need to call the function extract the second value and print it.

c++:

int main()
{
Result r(foo(42));

std::cout << r.times_two << std::endl;

return 0;
}

Now, let's see that in O'Caml.

code:

let foo n = (n, n * 2, n * n)
let (_, times_two, _) = foo 42;;

printf "%d\n" times_two

The above also nicely demonstrates pattern-matching. The underscores are values in the pattern that I don't really care about. I can also use much deeper pattern matching.

code:

let bar (m, n, (x, y, z), ((e, f), g)) =
(* do something with all of those values *)

It's not about lines of code saved, it's about ease of development. The biggest source of lost clock cycles in computing is the time programmers spend typing. Losing that time for trivial stuff like this is silly.

Let's look at OO programming, and inheritance specifically. Resolving naming clashes and referring to methods in a parent class is fairly tedious in C++.

c++:

struct A
{
std::string a() const { return "a"; }
};

struct B
{
std::string a() const { return "b"; }
};

struct C : public A, public B
{
std::string a() const { return A::a() + " " + B::a(); }
};

code:

class a =
object
method a = "a"
end

class b =
object
method a = "b"
end

class c =
object
inherit a as a
inherit b as b
method a = a#a ^ " " ^ b#a
end

I don't mind your questions, and hopefully you don't mind my explanation. I can understand your feelings about the syntax. It's very different. I felt the same way too, but I've come to embrace its potential. A lot of people are using it, after all, so there has to be some benefit (that's the philosophy that's driven me to learn new programming languages). Smile

Feel free to keep the questions coming.

wtd · **Posted:** Wed May 04, 2005 5:17 pm

An additional benefit is the module system. Let's take a simple example. Let's say in C++ I have a namespace holding a "hello world" function and a string containing the greeting for it to print.

c++:

namespace Hello
{
std::string greeting("hello world");

void hello_world()
{
std::cout << greeting << std::endl;
}
}

Now, let's say I want to hide the greeting variable, so it can't be accessed outside of the namespace. Hmmm.... can't do it.

Let's look at O'Caml. First, everything available.

code:

module Hello =
struct
let greeting = "hello world"
let hello_world () = print_endline greeting
end

Now, if I only want the function to be visible...

code:

module Hello : sig
val hello_world : unit -> unit
end =
struct
let greeting = "hello world"
let hello_world () = print_endline greeting
end

wtd · **Posted:** Wed May 04, 2005 9:55 pm

Let's take two lists of integers and check to see if each element in the first list is bigger than its counterpart in the second list. We'll assume they're equal in length.

c++:

bool all_greater_than(const std::vector<int>& list1, const std::vector<int>& list2)
{
for (int i(0); i < list1.size(); ++i)
{
if (list1[i] <= list2[i]) return false;
}

return true;
}

Now, let's look at how we can do this in O'Caml.

We can use the for_all2 function in the List module. It loops over two lists, and passes their corresponding elements to a function which returns a boolean value. If all of those boolean values are true, it returns true; otherwise false.

We also need a function to tell us if one element is greater than the other.

code:

let greater_than a b = a > b

let all_greater_than list1 list2 =
List.for_all2 greater_than list1 list2

But we notice in looking at this that "greater_than" is exactly the same as (>), making it redundant.

code:

let all_greater_than list1 list2 =
List.for_all2 (>) list1 list2

Now, what's great about O'Caml (and other functional languages) is partial function application. That is, if we don't provide all of the necessary arguments, we get back a function which takes the remaining arguments.

In other words, we can avoid passing the lists explicitly as arguments.

code:

let all_greater_than = List.for_all2 (>)

At this point, of course, we question the point in even making this a separate function. The new function isn't much shorter and it's actually less expressive than simply writing:

code:

List.for_all2 (>) [1;2;3] [0;1;2]

Unfortunately, the same is not possible in C++.

wtd · **Posted:** Thu May 05, 2005 2:22 pm

Of course, it's easy enough for me to just say: "use the built-in library function." So, let's create our own function. We'll start a bit less ambitious, and just test for equality.

First we need to define a function. Let's call it "all_equal2" and have it take two lists as arguments.

code:

let all_equal2 list1 list2 = ...

This is obviously psuedo-code until we can fill in the rest.

Now, how do we actually tell if two lists are equal?

Well, obviously two empty lists are equal.
If the first list is empty, and the second is not, they're not equal.
If the first list is not empty, and the second is, they're not equal.
If they're both non-empty, check to see if their first elements are equal, and ifthey are, and running the function on both remaining lists yields true, then they're equal.

A pretty simple set of rules, don't you think?

Note the:

Quote:

running the function on both remaining lists yields true, then they're equal.

Yep, it's a recursive function. So we add the "rec" keyword.

code:

let rec all_equal2 list1 list2 = ...

Now, how do we determine whether or not the list is empty, and if not, how do we get the first element, and the remaining list?

Well, pattern matching comes to the rescue. But first, a few things:

An empty list in O'Caml looks like:

code:

[]
The :: operator is the "list construction" operator. For instance:

code:

[4]

Is really just:

code:

4 :: []

And:

code:

[4;5;6;7;8]

Is:

code:

4 :: [5;6;7;8]

So we can pattern match with that.

code:

match [4;3;5] with x::xs -> print_int x
Matches proceed in order, and only one match is made. Therefore we can consider any previous patterns rejected when we write a pattern.

So, let's start filling in our rules:

code:

let rec all_equal2 list1 list2 =
match (list1, list2) with
([], []) -> (* two empty lists *)
| ([], _) -> (* an empty list and a non-empty list *)
| (_, []) -> (* a non-empty list and an empty list *)
| (x::xs, y::ys) -> (* two non-empty lists *)

Now, we know enough to fill in their of those.

code:

let rec all_equal2 list1 list2 =
match (list1, list2) with
([], []) -> true
| ([], _) -> false
| (_, []) -> false
| (x::xs, y::ys) -> (* two non-empty lists *)

But what about the fourth? We need to check to see if x equals y, and then run the function of xs and ys.

code:

let rec all_equal2 list1 list2 =
match (list1, list2) with
([], []) -> true
| ([], _) -> false
| (_, []) -> false
| (x::xs, y::ys) ->
x = y && all_equal2 xs ys

And that's all there is to it. Not only is it concise, but it's perfectly logical. There's no mucking around with pointers or loops that we need to break out of early. And yet, it's optimized. The function will go no further than absolutely necessary because there are exactly three out four conditions which cause it to stop dead in its tracks and return a value.

Of course, there are a couple more things we can do with this. First, instead of returning false when a length mismatch is encountered, and that's what happens with conditions 2 and 3, let's raise an exception.

The Invalid_argument exception should work nicely.

code:

let rec all_equal2 list1 list2 =
match (list1, list2) with
([], []) -> true
| ([], _) -> raise (Invalid_argument "List2 too long")
| (_, []) -> raise (Invalid_argument "List1 too long")
| (x::xs, y::ys) ->
x = y && all_equal2 xs ys

Also, we really should be able to use this with any function, as the List.for_all2 function does.

It's pretty simple to do this. Our function need only take an additional argument, which is the function to use as a test. "pred" is short for "predicate".

code:

let rec test_all2 pred list1 list2 =
match (list1, list2) with
([], []) -> true
| ([], _) -> raise (Invalid_argument "List2 too long")
| (_, []) -> raise (Invalid_argument "List1 too long")
| (x::xs, y::ys) ->
pred x y && all_equal2 pred xs ys

We now need only write:

code:

test_all2 (=) [1;2;3] [1;2;3]

Or, if we have an exception, we can handle it.

code:

try test_all2 (=) [1;2;3] [1;2;3]
with Invalid_argument _ -> false

rizzix · **Posted:** Thu May 05, 2005 2:25 pm

i still prefer haskell.. Rolling Eyes

wtd · **Posted:** Thu May 05, 2005 2:29 pm

rizzix wrote:

i still prefer haskell

They both have their merits, and it still has a lot of stuff other languages lack (real type-safety, decent generics, pattern-matching, type inferencing, partial function application, really good list support, an interactive interpreter, etc.). Additionally, it has speed and efficiency, a damn good object layer, and for those new to functional programming it's probably a getler intro than Haskell. It's how I wrapped my mind around the concepts.

Have some faith in me. Smile

I wouldn't be talking it up if there wasn't something there.

wtd · **Posted:** Fri May 06, 2005 12:10 pm

Mutually recursive functions and good habits

Lets look at a set of mutually recursive functions in C++.

c++:

void a()
{
std::cout << "Hello world" << std::endl;
b();
}

void b()
{
std::cout << "foo bar!" << std::endl;
a();
}

Useless, but also incorrect. The function "a" doesn't know about "b" existing, so we have to use a forward declaration, which can be a pain in the butt.

c++:

void b();

void a()
{
std::cout << "Hello world" << std::endl;
b();
}

void b()
{
std::cout << "foo bar!" << std::endl;
a();
}

Now, let's see O'Caml's version.

code:

let rec a () =
print_endline "Hello world";
b ()
and b () =
print_endline "foo bar!";
a()

The "and" is the critical element here. With it we can define mutually recursive functions. But why do I say this enforces good habits?

Well, in general, if you have mutually recursive functions, they're going to be a potential source of trouble, and should be grouped closely together for easier maintenance. O'Caml not only promotes that, but it enforces it by leaving no other options for defining mutually recursive functions.

	Computer Science Canada Programming C, C++, Java, PHP, Ruby, Turing, VB Username: Password: Register
Wiki Blog Search Turing Chat Room Members