Author:  wtd [ Sun Apr 03, 2005 5:13 pm ]
Post subject:  Haskell Example: Sieve of Eratosthenes

 code: module Sieve where sieve [] = [] sieve (x:xs) = x : sieve remaining   where     remaining = [y | y <- xs, y `mod` x /= 0]

Very simply, running the sieve on a list of no numbers will clearly give us no resulting numbers.

In the second case, we run the sieve on a list on numbers, where x is the first umber, and xs represents everything else, which may be an empty list.

We create a new list by prepending x onto the result of running the sieve on whatever's left.

It's in the where clause that we see the real work taking place. Here we filter out all numbers in xs which are multiples of x, using a list comprehension. The list comprehension iterates over xs, binding each element to y, which it returns if y is not (/=) evenly disivible by x.

You'll note that the fourth step of the sieve algorithm is not implemented. There is a very good reason for this. Haskell is a lazily-evaluated language. This means, for instance, that I can create an infinitely long list and the language has no problem with it... until I try to find the end of it.

The fourth step of the sieve, though, would ask me to find the end of the list. With an infinitely long list, though, finding the end means an infinite loop.

But this is also an advantage. It makes it exceptionally easy to write something like:

 code: take 1000 \$ sieve [2..]

Which gets the first thousand primes from the sieve.

You may look at this and wonder how it works. Surely I'm calculating an infinite numbers of primes, since I've specified no end to the input list.

Well, again, Haskell is lazily evaluated, and this function is recursive. It will do no more work than is necessary and stop once the first thousand primes have been calculated.

Author:  wtd [ Sun Apr 03, 2005 7:10 pm ]
Post subject:

Here we see a solution which incorporates the

 code: sieve' n = snd \$ until   (\(input, _) -> input == [])   (\(input@(x:xs), output) ->     if x ^ 2 > last input       then ([], output ++ input)       else ([y | y <- xs, y `mod` x /= 0], output ++ [x]))   ([2..n], [])

This is rather more complex. Let's look at the until function first.

The first argument is a function which determines when the until function should terminate. In this case, if terminates if the input list is empty.

The second argument updates the input. In this case that input (the third argument) is an input list full of numbers, and an output list, for known primes. The if enforces the 4th component of the sieve. If the largest prime squared is greater than the largest number in the input list, then everything in the input list is prime.

Otherwise we remove anything divisible by the largest known prime from the input list and go again.

The

 code: snd \$

Merely get the second element in the tuple. In this case, the output list.

Author:  wtd [ Sun Apr 03, 2005 7:17 pm ]
Post subject:

For a slightly cleaner function, we can use:

 code: ([] ==) . fst

In place of:

 code: \(input, _) -> input == []

The former is simply an example of partial application and composition, vs. a lambda expression in the latter.

Author:  GordonBGood [ Thu Dec 15, 2016 3:30 am ]
Post subject:  Re: Haskell Example: Sieve of Eratosthenes - a real array based version...

What has been posted so far **IS NOT** the Sieve of Eratosthenes (SoE) neither by algorithm nor by performance as explained so very well in the following article: http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf. Essentially, when an algorithm needs to use the `mod` function which implies division it is not the SoE, which only uses a data structure and repeated additions to mark composite number representations in that data structure. The Turner sieve is only a very poor version of a Trial Division sieve, although very concise.?

A true Sieve of Eratosthenes function in Haskell using a monoid-protected mutable array which is very fast up to a reasonable range is the following:

 code: {-# OPTIONS_GHC -O2 -rtsopts -fllvm #-} import Data.Word import Data.Bits import Data.Array.ST (runSTUArray) import Data.Array.Base soe :: Word32 -> [Word32] soe top = 2 : [fromIntegral i * 2 + 3 | (i, False) <- assocs bufb] where  bufb = runSTUArray \$ do   let bfLmt = (top - 3) `div` 2   cmpstsb <- newArray (0, bfLmt) False :: ST s (STUArray s Int Bool)   let cullp i =          let p = i + i + 3 in          let s = (p * p - 3) `div` 2 in          if s > bfLmt then loop (n - 1) else do            isCmpst <- unsafeRead cmpstsb i            if isCmpst then cullp (i + 1) else -- is Prime            let cull j = -- very tight inner loop where all the time is spent                  if j > bfLmt then cullp (i + 1) else do                    unsafeWrite cmpstsb j True                    cull (j + p) in cull s in cullp 0

It works very well up to the limit of the CPU cache sizes, say 256 Kilobytes for many modern CPU's = so two Megabits and a sieving range of about four million (sieves odds only, as two is the only even prime); after that it will get progressively slower for a given range as it starts to use CPU L3 cache (if there is one) and then main RAM memory, which is relatively very slow. The above code will cull the array for a sieving range up to four million in about four milliseconds on a modern CPU, although it takes at least that long again to list the resulting primes such as by:

 code: main = print \$ length \$ soe 4000000

For sieving ranges much higher than that, a page segmented sieve is required.

 :