Sieve of Eratosthenes in Haskell

It’s the second famous Haskell example to illustrate its elegance and succinctness, only after quicksort. Non-surprisingly, it’s not implementing the original algorithm just like the quicksort case. Well, in Haskell’s defense, it’s actually implementing the non-refined version specified in https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes.

This paper gives a though analysis on this problem, and is fairly readable. The following code is basically a reprint of the code in that paper.

import Data.Time

primes = 2 : [x | x <- [3..], is_prime x]
  where
    is_prime x = all (\f -> x `mod` f /= 0) $ factors x
    factors x = takeWhile (\n -> n*n <= x) primes

slow_seive =  primes [2..]
  where
    primes (x:xs) = x : (primes $ filter (\y -> y `mod` x /= 0) xs)
    primes _ = error "finite int"

fast_seive = 2 : [3..] `minus` composites
  where
    composites = union [map (x*) [x..] | x <- fast_seive]

minus (x:xs) (y:ys)
  | x < y = x : minus xs (y:ys)
  | x == y = minus xs ys
  | x > y = minus (x:xs) ys
minus _ _ = error "i"

union = foldr merge []

merge (x:xs) ys = x : merge' xs ys
  where
    merge' (x:xs) (y:ys)
      | x < y = x : merge' xs (y:ys)
      | x == y = x : merge' xs ys
      | x > y = y : merge' (x:xs) ys
    merge' _ _ = error "i"
merge _ _ = error "i"

force e = seq e $ return ()
-- force e = seq e $ print e
index = 10*1000

main = do
  start <- getCurrentTime
  force $ primes !! index
  stop <- getCurrentTime
  print $ diffUTCTime stop start

  start <- getCurrentTime
  force $ slow_seive !! index
  stop <- getCurrentTime
  print $ diffUTCTime stop start

  start <- getCurrentTime
  force $ fast_seive !! index
  stop <- getCurrentTime
  print $ diffUTCTime stop start

$ ghc -O prime.hs; ./prime
0.040767s
3.273944s
0.045448s

The benchmark result is a bit different from the one in the paper, that it claims that fast_seive runs faster for practical size n, while trivial division on my machine is slightly faster. (Using ghc 7.10.3 on ubuntu 14.04.)