If you don't know what this is all about, these binaries are for the Great Internet Mersenne Prime Search. The project's main site at mersenne.org has background info, plus code and binaries for most Intel platforms.
The binaries here are compiled from Ernst Mayer's Mlucas source code. His README file has links to the actual source code, as well as instructions and binaries for Alpha Linux, Sparc Solaris, VMS, and Irix. If you are familiar with the project but not with this particular implementation, Ernst's README should get you up and running in short order.
The binaries:
The mlucas.cfg file is a new feature of Mlucas-2.7b; optimal values for each
FFT size vary among different CPU models and even among machines with the same
CPU but different memory characteristics (cache size, bandwidth, latency, etc).
It's probably best to make your own for each type of machine you'll be using,
but here are some sample files:
Here are some sample timings using the Ev5+ binary on Tru64 5.1 systems, with
error checking disabled:
2 Feb 2001
4.x binaries will work on 5.x systems, but the 5.x binaries are significantly faster.
$ f90 -O4 -fast -assume accuracy_sensitive -Olimit 100000 -non_shared -pipeline \
> -unroll 1 -speculate none -arch generic -tune generic -o Mlucas-2.7b *.f90
$ f90 -O4 -fast -assume accuracy_sensitive -Olimit 100000 -non_shared -pipeline \
> -unroll 1 -speculate none -arch ev4 -tune ev4 -o Mlucas-2.7b-ev4 *.f90
5.x binaries will not work on 4.x systems.
All binaries are compiled non_shared, because it's faster; if you'd like a
shared binary, send me an email and I'll hook you up.
All are compiled to output every 2000 iterations. If you'd like a binary
compiled with a different output increment or with different compiler flags,
shoot me an email.
If you'd like to construct your own mlucas.cfg file, which will probably
increase your performance a good bit, you'll need the
timings.txt file. Run the trial runs specified in there, and for each FFT
size put the best value for your machine in your mlucas.cfg file (download
one of the sample files above to see the format).
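The selection step described above can be sketched in a few lines of Python. This is only an illustration: the trial labels and timings below are made-up placeholders, not values from timings.txt, and the sketch does not try to write the actual mlucas.cfg format (copy that from one of the sample files).

```python
# Hypothetical trial results: (FFT size, trial label, measured time).
# Labels and numbers are placeholders; substitute your own measurements.
trials = [
    (256, "trial-a", 0.0612),
    (256, "trial-b", 0.0585),
    (288, "trial-a", 0.0746),
    (288, "trial-b", 0.0771),
]


def best_per_fft(trials):
    """Return {fft_size: (label, time)}, keeping the fastest trial per size."""
    best = {}
    for fft, label, seconds in trials:
        if fft not in best or seconds < best[fft][1]:
            best[fft] = (label, seconds)
    return best


if __name__ == "__main__":
    for fft, (label, seconds) in sorted(best_per_fft(trials).items()):
        print(f"FFT {fft}: {label} ({seconds:.4f})")
```

Whatever wins for each FFT size is what goes into your own mlucas.cfg for that machine type.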
Speed    731MHz    667MHz    500MHz    466MHz    500MHz         333MHz
CPU      EV67      EV67      EV6       EV6       EV56           EV5
OS       5.1       5.1       5.0       5.0a      5.0            5.1
Mlucas   2.7b      2.7b      2.7b      2.7b      2.7b           2.7b
Cache    4MB L2    4MB L2    4MB L2    2MB L2    ?? L2 8MB L3   96k L2 2MB L3
FFT
256      0.03608   0.03823   0.05345   0.05848   0.09831        0.13480
288      0.04275   0.04523   0.06363   0.07458   0.11433        0.16906
320      0.04790   0.05093   0.07153   0.08843   0.12781        0.20493
350      0.05646   0.06056   0.08456   0.10878   0.14616        0.24110
352      0.05650   0.06031   0.08448   0.10575   0.15395        0.26653
384      0.05661   0.06043   0.08525   0.11210   0.16523        0.28818
420      0.06873   0.07320   0.10280   0.13968                  0.34251
448      0.07091   0.07520   0.10571   0.14296                  0.36913
480      0.07525   0.07985   0.12001   0.15500                  0.40075
512      0.07966   0.08095   0.12521   0.16320                  0.41561
576 0.09598
640 0.10650
700 0.12768
704 0.12728
768 0.12715
840 0.15413
896 0.15811
960 0.16946
I'll be adding to these as I have time, plus running some comparisons
with error checking enabled and also against Mlucas-2.7a.
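As a quick way to read the table, the sketch below works out how much slower each machine is than the 731MHz EV67 at a given FFT size. The timings are copied from the 256 and 384 rows above; the machine labels, dictionary layout, and function name are just illustrative choices.

```python
# Timings copied from the 256 and 384 rows of the table above,
# in the same left-to-right machine order as the table header.
machines = ["EV67/731", "EV67/667", "EV6/500", "EV6/466", "EV56/500", "EV5/333"]
timings = {
    256: [0.03608, 0.03823, 0.05345, 0.05848, 0.09831, 0.13480],
    384: [0.05661, 0.06043, 0.08525, 0.11210, 0.16523, 0.28818],
}


def slowdown(fft, baseline=0):
    """Ratio of each machine's time to the baseline machine's time."""
    row = timings[fft]
    return [t / row[baseline] for t in row]


if __name__ == "__main__":
    for fft in sorted(timings):
        ratios = ", ".join(
            f"{name} {ratio:.2f}x" for name, ratio in zip(machines, slowdown(fft))
        )
        print(f"FFT {fft}: {ratios}")
```

Comparing the two rows shows the older EV5 falling further behind as the FFT size grows.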
Paul Victor Novarese
novarese@novarese.net