I was thinking of how I can use functional constructs to implement
Sieve of Eratosthenes and I bet that millions of readers are dying to learn what I came up with.
First the C# LINQ version:
class Program
{
public static List<int> Sieve(int max)
{
//initialize the array
int[] numbers = new int[max - 1];
for (int i = 0; i < max - 1; i++)
{
numbers[i] = i + 2;
}
//get the ceiling so we do not check unnecessary
var ceiling = Math.Sqrt(max);
var primes = new List<int>();
while (numbers[0] <= ceiling)
{
//add the current first number to primes because we're sure it's a prime
primes.Add(numbers[0]);
//filter the remaining numbers and create a new filtered array
numbers = (from r in numbers
where r % numbers[0] != 0 //note that numbers[0] is actually catured in a closure here
select r).ToArray();
//alternatively
//numbers = numbers.Where(i => i % numbers[0] != 0).ToArray();
}
//if we've reached the ceiling add the remaining numbers to primes
primes.AddRange(numbers);
return primes;
}
static void Main(string[] args)
{
foreach (int i in Sieve(121))
{
Console.WriteLine(i);
}
//time the code for great number of primes
DateTime start = DateTime.Now;
Console.WriteLine(Sieve(10000000).Count);
Console.WriteLine("Time = " + DateTime.Now.Subtract(start).TotalMilliseconds);
}
}
So this code runs for 10-15 seconds on my machine for input 10000000.
And here is the F# code:
let sieve max =
//get the ceiling
let ceiling = sqrt (float max) in
//declare local recursive function to filter the numbers
let rec sieveStep (numbers, currentPrimes) =
//check if we should exit the recursion
if float (List.hd numbers) > ceiling
then
(numbers, currentPrimes)
else
//append the head of the list to the primes list
let newPrimes = currentPrimes @ [numbers.Head] in
//filter the numbers list and make a recursive call
sieveStep (List.filter (fun x -> x % numbers.Head <> 0) numbers, newPrimes) in
//call the sieve step with list initializer and empty list and get back a tuple
let (primesLeft, primes) = sieveStep ([2..max], []) in
//append the numbers left to the primes list and return
primes @ primesLeft;;
//call with argument 121 and print
121 |> sieve |> List.iter (fun x -> printfn "%d" x);;
//time the call for large numbers
let start = System.DateTime.Now;;
10000000 |> sieve |> List.length |> printfn "%d";;
printfn "%f" (System.DateTime.Now.Subtract(start)).TotalMilliseconds;;
And this runs for about 90-100 seconds on my machine. Some guy on the
HubFS forums came up with the explaination that creating many small objects (linked list) instead of few large objects (arrays) creates some problems for the GC. Here is the array version in F# it uses the light syntax.
let sieveArray max =
let ceiling = sqrt (float max)
let rec sieveStep (numbers : array<_>, currentPrimes) =
if float numbers.[0] > ceiling
then
(numbers, currentPrimes)
else
let newPrimes = numbers.[0] :: currentPrimes
sieveStep (Array.filter (fun x -> x % numbers.[0] <> 0) numbers, newPrimes)
let (primesLeft, primes) = sieveStep ([|2..max|], [])
primes @ (List.of_array primesLeft)
This code runs for 14-17 seconds on my machine for the 10000000 input. Of course none of this implementations is the most efficient. Note that the F# and C# code are not equivalent in any way so these numbers cannot be used to compare F# and C# in terms of performance.
Of course the most efficient implementation for this (and many more) algorithm does things in place to a bool array. I shamelessly copied it for you from
here:
public static List<int> InPlaceSieve(int max)
{
// initially assume all integers are prime
bool[] isPrime = new bool[max + 1];
for (int i = 2; i <= max; i++)
{
isPrime[i] = true;
}
// mark non-primes <= N using Sieve of Eratosthenes
for (int i = 2; i * i <= max; i++)
{
// if i is prime, then mark multiples of i as nonprime
// suffices to consider mutiples i, i+1, ..., N/i
if (isPrime[i])
{
for (int j = i; i * j <= max; j++)
{
isPrime[i * j] = false;
}
}
}
// count primes
List<int> result = new List<int>();
for (int i = 2; i < max; i++)
{
if (isPrime[i])
{
result.Add(i);
}
}
return result;
}
This runs for less than a second for the 10000000 input but I would not have come up with it on my own and I would argue that it is harder to understand and read. I wonder what this code will look like in functional style.