Functional Sieve of Eratosthenes

   I was thinking of how I can use functional constructs to implement Sieve of Eratosthenes and I bet that millions of readers are dying to learn what I came up with.

   First the C# LINQ version:

   class Program
   {
       public static List<int> Sieve(int max)
       {
           //initialize the array
           int[] numbers = new int[max - 1];
           for (int i = 0; i < max - 1; i++)
           {
               numbers[i] = i + 2;
           }

           //get the ceiling so we do not check unnecessary
           var ceiling = Math.Sqrt(max);
           var primes = new List<int>();


           while (numbers[0] <= ceiling)
           {
               //add the current first number to primes because we're sure it's a prime
               primes.Add(numbers[0]);

               //filter the remaining numbers and create a new filtered array
               numbers = (from r in numbers
                          where r % numbers[0] != 0 //note that numbers[0] is actually catured in a closure here
                          select r).ToArray();

               //alternatively
               //numbers = numbers.Where(i => i % numbers[0] != 0).ToArray();
           }

           //if we've reached the ceiling add the remaining numbers to primes
           primes.AddRange(numbers);
           return primes;
       }

       static void Main(string[] args)
       {
           foreach (int i in Sieve(121))
           {
               Console.WriteLine(i);
           }

           //time the code for great number of primes
           DateTime start = DateTime.Now;
           Console.WriteLine(Sieve(10000000).Count);
           Console.WriteLine("Time = " + DateTime.Now.Subtract(start).TotalMilliseconds);
       }
   }

So this code runs for 10-15 seconds on my machine for input 10000000.

   And here is the F# code:

   let sieve max =
       //get the ceiling
       let ceiling = sqrt (float max) in
       
       //declare local recursive function to filter the numbers
       let rec sieveStep (numbers, currentPrimes) =
           //check if we should exit the recursion
           if float (List.hd numbers) > ceiling
           then
               (numbers, currentPrimes)
           else
               //append the head of the list to the primes list
               let newPrimes = currentPrimes @ [numbers.Head] in
               //filter the numbers list and make a recursive call
               sieveStep (List.filter (fun x -> x % numbers.Head <> 0) numbers, newPrimes) in
       
       //call the sieve step with list initializer and empty list and get back a tuple
       let (primesLeft, primes) = sieveStep ([2..max], []) in
       //append the numbers left to the primes list and return
       primes @ primesLeft;;
   
   
   //call with argument 121 and print
   121 |> sieve |> List.iter (fun x -> printfn "%d" x);;

   //time the call for large numbers
   let start = System.DateTime.Now;;
   10000000 |> sieve |> List.length |> printfn "%d";;
   printfn "%f" (System.DateTime.Now.Subtract(start)).TotalMilliseconds;;

And this runs for about 90-100 seconds on my machine. Some guy on the HubFS forums came up with the explaination that creating many small objects (linked list) instead of few large objects (arrays) creates some problems for the GC. Here is the array version in F# it uses the light syntax.

   let sieveArray max =
       let ceiling = sqrt (float max)
       
       let rec sieveStep (numbers : array<_>, currentPrimes) =
           if float numbers.[0] > ceiling
           then
               (numbers, currentPrimes)
           else
               let newPrimes = numbers.[0] :: currentPrimes
               sieveStep (Array.filter (fun x -> x % numbers.[0] <> 0) numbers, newPrimes)
       
       let (primesLeft, primes) = sieveStep ([|2..max|], [])
       primes @ (List.of_array primesLeft)

This code runs for 14-17 seconds on my machine for the 10000000 input. Of course none of this implementations is the most efficient. Note that the F# and C# code are not equivalent in any way so these numbers cannot be used to compare F# and C# in terms of performance.

   Of course the most efficient implementation for this (and many more) algorithm does things in place to a bool array. I shamelessly copied it for you from here:

   public static List<int> InPlaceSieve(int max)
   {
       // initially assume all integers are prime
       bool[] isPrime = new bool[max + 1];
       for (int i = 2; i <= max; i++)
       {
           isPrime[i] = true;
       }

       // mark non-primes <= N using Sieve of Eratosthenes
       for (int i = 2; i * i <= max; i++)
       {

           // if i is prime, then mark multiples of i as nonprime
           // suffices to consider mutiples i, i+1, ..., N/i
           if (isPrime[i])
           {
               for (int j = i; i * j <= max; j++)
               {
                   isPrime[i * j] = false;
               }
           }
       }

       // count primes
       List<int> result = new List<int>();

       for (int i = 2; i < max; i++)
       {
           if (isPrime[i])
           {
               result.Add(i);
           }
       }

       return result;
   }

This runs for less than a second for the 10000000 input but I would not have come up with it on my own and I would argue that it is harder to understand and read. I wonder what this code will look like in functional style.
Tags:   english programming 
Posted by:   Stilgar
01:07 13.10.2008

Comments:



No comments yet.




Post as:



Post a comment: