Computer Science Canada

[Performance] Beat the pants off of my best time

Author:  DemonWasp [ Thu Jul 31, 2008 8:34 pm ]
Post subject:  [Performance] Beat the pants off of my best time

I'm sure some of you are much smarter and better at coding than I am. I'm really curious optimisation of code ...and how to make the computer run fast, overall. So, I'd like to see some code that's better than mine (and some code worse than mine, so I don't feel too badly about being a huge noob). The problem I pose is fairly simple:


Using the language and method of your choice, find and list all the prime numbers between 1 and (a very large) N...as fast as you possibly can.

You will need to be able to output your results to a file or not, optionally (you can use a command-line parameter for example).
You will need to give me some way of modifying N when running your program (I want to run several different values of N, so I can make a pretty graph or somesuch).

Ideally, beat my implementation:
code:

WITH FILE OUTPUT
N       Time (seconds, approximate)
10M     < 0.8
100M    9
1B      92

WITH NO OUTPUT
N       Time (seconds, approximate)
10M     < 0.3
100M    6
1B      65


When the contest is over, I will post my implementation. I'm pretty sure it works right, but I could be wrong - I only checked the numbers up to around 100 manually.

Expected File Sizes, as per my implementation (each number on its own line)
10M -> ~5.0 MB
100M -> 48.7 MB
1B -> 478.7 MB


Rules:
1. This is mostly for fun (I have little to donate, though I suppose bits are in order). Final winner will be judged on one of my computers (probably the laptop or the new desktop).

2. I can run anything in C, C++, or Java code, provided that:
2.a) anything in C(++) compiles with g++ or gcc (version 4.2.3)
2.b) anything in Java must be able to run with JVM 1.6.0_06

3. I can run alternate languages, provided that:
3.a) You point me to a tutorial on installing the necessary tools
3.b) The tools required are free, and available on Ubuntu or Windows XP. I'd prefer to keep the tests all in Ubuntu, however.

4. Your program must compile and execute with no errors, segfaults, or other naughtiness. Programs that fail will not be counted.

5. Epic win if you beat my times by more than 50%.

6. Yes, I'm aware that optimisation of this problem is silliness, but I find it interesting.

Computer specifications, for those curious:
Desktop: E6600 @ 2.4Ghz, 2GB DDR2 RAM, 64-bit Ubuntu 8.04.1, ReiserFS on a Seagate 7200.10
Laptop: T5400 @ 1.66Ghz (locked), 2GB DDR2 RAM, 32-bit Ubuntu 8.04.1, ext3 on a WD 120GB drive
>> Before you remind me that I'd be better off with Arch or Gentoo for sheer performance, let me forestall you: I'm a huge Linux noob, so it's a miracle that Ubuntu is installed properly at all.

Author:  rizzix [ Thu Jul 31, 2008 10:48 pm ]
Post subject:  RE:[Performance] Beat the pants off of my best time

Well, raw speed will (most likely) only be obtained through low-level C + SIMD extensions. There's no way to beat that speed assuming all implementations use the same algorithm (of course assuming your algorithm can take advantage of SIMD extensions).

And, secondly, Ubuntu is actually well optimised. Smile

Author:  DemonWasp [ Thu Jul 31, 2008 11:35 pm ]
Post subject:  RE:[Performance] Beat the pants off of my best time

So? Go ahead and post something, I'd be interested to see it. For reference, my implementation is in Java, so I'm sure someone can figure out a way to beat it (I have a C version, but I haven't done File IO for it yet).

Author:  rizzix [ Thu Jul 31, 2008 11:51 pm ]
Post subject:  RE:[Performance] Beat the pants off of my best time

Quote:
So?
Well, I was questioning the purpose of this challenge.

Author:  DemonWasp [ Thu Jul 31, 2008 11:54 pm ]
Post subject:  RE:[Performance] Beat the pants off of my best time

I wanted to see something like that. I'm curious as to what's done in apps that are actually performance-critical, rather than the stuff I'm working on right now (test-code is lame).

Plus this was my half-assed attempt to breathe some life back into compsci.ca ... this forum gets pretty boring around summertime.

Author:  rizzix [ Fri Aug 01, 2008 12:00 am ]
Post subject:  RE:[Performance] Beat the pants off of my best time

Ah very well then. However may I suggest you recommend that everyone participating should exclude the "startup" time of their program, from their benchmarks? You did that after all, in your Java implementation. Additionally specify the number of runs and the number of different data sets used in each run.

Author:  DemonWasp [ Fri Aug 01, 2008 12:57 am ]
Post subject:  RE:[Performance] Beat the pants off of my best time

Ah, okay, probably a good plan:

Exclude your startup times (but NOT memory-allocation):
I was running my test from within Eclipse (using the same JVM), so you can exclude startup time (probably negligible in the case of C programs; it takes practically no time to get into main() ). Do all the timing yourself, as accurately as you can, or else rely on me to use the time command correctly...

Trial Process
When testing each program, I will take 5 runs and average their results for each of N = 10M, 100M and 1B. Assuming I have time, I'll do both with and without File IO, but I'll do with File IO first. You're welcome to run your own tests, of course. Tests which run more than five times as long as my own tests will be terminated and receive the grade "lolcat" instead of an actual time (I'm very lazy sometimes).

Author:  TheFerret [ Fri Aug 01, 2008 6:17 am ]
Post subject:  RE:[Performance] Beat the pants off of my best time

http://ferret22ca.googlepages.com/soln.txt is my solution in Java, it can basically cut your time in half on my system... (Amd x2 4200+, 4gb Ram, Vista 64bit) To get it to run without breaking Java, you have to use -Xmx1224M when you are compiling it for the 1 billion one or java will complain about being out of memory... To change the number, just change MAX to whatever number is needed...

P.s. I was going to try the harder Sieve but then I got lazy... lol

Author:  zylum [ Fri Aug 01, 2008 7:06 am ]
Post subject:  RE:[Performance] Beat the pants off of my best time

My C solution runs in about 37 seconds when compiled using the -O flag and no output on the 1B case... It only uses 1B bits so its not sooo bad space wise. It's the basic sieve with a couple optimizations for speed and space.

C:
#include <stdio.h>
#include <time.h>
int main(int argc, char *argv[]) {
        clock_t time = clock();
        int N = atoi(argv[1]);
        uint p[N/32+1];
        int i, j, NP = 0;
        for (i = 0; i <= N>>5; i++) p[i] = 0;
        for (i = 2; i < sqrt(N); i++)
                if (0 == (p[i>>5] & (1 << ((i-1) & 31))))
                        for (j = i<<1; j < N; j += i)
                                p[j>>5] |= 1 << ((j-1) & 31);
        for (i = 2; i < N; i+=2)
                if (0 == (p[i>>5] & (1 << ((i-1) & 31)))) {
                        NP++;
                        //printf("\n%d", i);
                }
        printf("\nPRIMES: %d\n", NP+1);
        printf("TIME: %d\n", clock() - time);
        return 0;
}

Author:  Zeroth [ Fri Aug 01, 2008 8:15 am ]
Post subject:  Re: [Performance] Beat the pants off of my best time

DemonWasp where's your code? The idea is that we should know what you did, and which algorithm you used. I'm leaving on a trip soon, so I can't participate. So, instead, I'll leave you all with the paper containing the fastest prime finding algorithm.

http://citeseer.ist.psu.edu/atkin99prime.html

If I did have time, I'd implement that in assembly. Twisted Evil

Author:  DemonWasp [ Fri Aug 01, 2008 8:26 am ]
Post subject:  RE:[Performance] Beat the pants off of my best time

@TheFerret: Your solution does seem to run fast (I don't have the reference machine at work, so I tested that on a server-class machine; I'll have to check when I get home. It appears to do well, but I'm having a hard time believing less than half the time when it's the exact same algorithm.

Perhaps I should manually inline the accessor methods and output methods I have. The method calls may be making a huge difference, and Java may not be inlining them like it should.

@Zylum: Your solution is refusing to compile on the machine at work; it complains that uint and atoi aren't known. I can only assume this is my fault, as I'm a huge C noob...helps?

Nicely done, both...assuming your solutions are as fast as you claim, you have well and truly beaten me.

Author:  DemonWasp [ Fri Aug 01, 2008 8:31 am ]
Post subject:  RE:[Performance] Beat the pants off of my best time

@Zeroth: I used a simple seive algorithm (it's named after someone, but I don't remember their name). It goes like this:

1. Assume every number is prime.
2. Starting at n = 2, eliminate the multiples of n whenever n is prime, going sequentially; list each n that you eliminate for as you do this.

My code at the moment is both a mess, and not with me. I implemented the thing a lot more enterprisey than I had to (for example, using a method for output instead of just having the lines there). I'll clean it up, test it again, then permit it to show it's ugly face here.

Author:  zylum [ Fri Aug 01, 2008 11:32 am ]
Post subject:  RE:[Performance] Beat the pants off of my best time

I compiled with gcc on cygwin so it should compile fine under linux Razz

Author:  Vermette [ Fri Aug 01, 2008 11:42 am ]
Post subject:  Re: RE:[Performance] Beat the pants off of my best time

DemonWasp @ August 1st 2008, 08:31 wrote:
@Zeroth: I used a simple seive algorithm (it's named after someone, but I don't remember their name.

Sieve of Erastothenes

Author:  DemonWasp [ Fri Aug 01, 2008 12:44 pm ]
Post subject:  RE:[Performance] Beat the pants off of my best time

@zylum: Could you give the exact command line? This may just not be linking something that you were. Would it matter what folder I put it in?

@Vermette: Thanks.

Author:  zylum [ Fri Aug 01, 2008 9:43 pm ]
Post subject:  RE:[Performance] Beat the pants off of my best time

gcc p.c -O -Wl,--stack,150000000 -o p

The -Wl,--stack,150000000 changes the stack size to 150MB.

Author:  Brightguy [ Sat Aug 02, 2008 7:47 am ]
Post subject:  Re: [Performance] Beat the pants off of my best time

Eratosthenes with a twist. It has not been optimized, except for tight bit packing. Did the first billion primes in just under a minute on my Eee PC.
c:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

int main(int argc, char* argv[])
{   time_t timer = time(NULL);
   
    int i,j,np=0,n=atoi(argv[1]);
    char* p = (char*)malloc((n>>3)+1);
    if(p==NULL)
        exit(1);
    memset(p,255,(n>>3)+1);

    for(i=2;i*i<=n;i++)
        if(p[i>>3]&(1<<(i&7)))
            for(j=n/i;j>=i;j--)
                if(p[j>>3]&(1<<(j&7)))
                    p[(i*j)>>3]&=~(1<<((i*j)&7));

    /*for(i=2; i<=n; i++)
        if(p[i>>3]&(1<<(i&7)))
        {   printf("%u\n", i);
            np++;
        }
    printf("%u/%u primes\n", np, n);*/

    printf("%u seconds\n", time(NULL) - timer);

    free(p);
    return 0;
}

Author:  wtd [ Sat Aug 02, 2008 12:49 pm ]
Post subject:  RE:[Performance] Beat the pants off of my best time

While I do not have a proper environment to test this in optimized form, I ran a naive Sieve of Erastothenes through the O'Caml top-level.

code:
# let t = Sys.time ()
  and a = Array.make 2_097_151 true in
      for i = 1 to Array.length a - 1 do
          if a.(i) then
              for j = 2 to Array.length a / (i + 1) do
                  a.(j * (i + 1) - 1) <- false
              done
      done;
      Printf.printf "%f\n" (Sys.time () -. t);;
3.875000
- : unit = ()
#

Author:  zylum [ Sat Aug 02, 2008 1:07 pm ]
Post subject:  RE:[Performance] Beat the pants off of my best time

Brightguy, pretty much what I did but I used ints instead of chars XD

Author:  wtd [ Mon Aug 04, 2008 1:17 am ]
Post subject:  RE:[Performance] Beat the pants off of my best time

code:
chris@frankenstein:~/Documents$ cat primes.ml
let t = Sys.time ()
  and a = Array.make 2_097_151 true in
      for i = 1 to Array.length a - 1 do
          if a.(i) then
              for j = 2 to Array.length a / (i + 1) do
                  a.(j * (i + 1) - 1) <- false
              done
      done;
      Printf.printf "%f\n" (Sys.time () -. t);;
chris@frankenstein:~/Documents$ ocamlopt primes.ml -o primes
chris@frankenstein:~/Documents$ ./primes
0.536032
chris@frankenstein:~/Documents$

Author:  wtd [ Mon Aug 04, 2008 9:48 pm ]
Post subject:  RE:[Performance] Beat the pants off of my best time

code:
chris@frankenstein:~/Documents$ cat primes2.sml
structure ArrayUtils =
struct
    fun fromToByDo(array, startIndex, endIndex, incrementBy, func) =
        if startIndex <= endIndex then
           ( func(array, startIndex, Array.sub(array, startIndex));
             fromToByDo(array, startIndex + incrementBy, endIndex, incrementBy, func) )
        else ()
end;

let
    val t = Time.now()
    val a = Array.array(2097151, true)
in
    Array.update(a, 0, false);

    Array.appi(fn (i, x) =>
        if x then
            let
                val number = i + 1
            in
                ArrayUtils.fromToByDo(a, number * 2 - 1, Array.length a - 1, number,
                                      fn (a, i, _) => Array.update(a, i, false))
            end
        else ()) a;

    let
        val t' = Time.now()
        val diff = Time.-(t', t)
    in
        print ((IntInf.toString(Time.toMilliseconds diff)) ^ "\n") 
    end
end;

chris@frankenstein:~/Documents$ mlton -output sml_primes2 primes2.sml
chris@frankenstein:~/Documents$ ./sml_primes2
584
chris@frankenstein:~/Documents$ ./sml_primes2
584
chris@frankenstein:~/Documents$ ./sml_primes2
584
chris@frankenstein:~/Documents$

Author:  TheFerret [ Mon Aug 04, 2008 11:56 pm ]
Post subject:  RE:[Performance] Beat the pants off of my best time

Ok, I found another way of doing this, the sieve of Atkins, so here is a solution for that, http://ferret22ca.googlepages.com/soln2.txt

You need to use the same mem flag as for the first one and this one runs about 3 seconds faster than the first one...

Author:  andrew. [ Thu Aug 07, 2008 10:41 am ]
Post subject:  RE:[Performance] Beat the pants off of my best time

Here's mine:
Turing:
var n : int
put "Enter a number."
get n
for i : 1..n
put i
end for


Sorry, just had to Razz. Turing is so lame.

Author:  gitoxa [ Thu Aug 07, 2008 4:08 pm ]
Post subject:  Re: RE:[Performance] Beat the pants off of my best time

andrew. @ Thu Aug 07, 2008 10:41 am wrote:
Sorry, just had to Razz. Turing is so lame.


That doesn't even work...

Author:  andrew. [ Fri Aug 08, 2008 11:44 am ]
Post subject:  RE:[Performance] Beat the pants off of my best time

My bad, didn't read the part about the prime numbers. Hmm...that is a hard task to do....


: