[Shootout-list] Numerical medium-sized benchmarks

Sebastien Loisel <sloisel@gmail.com>
Fri, 25 Mar 2005 09:53:34 -0800


> This is a dead link for me (shows up as Google not

Sorry, go to groups.google.com and search for

replacing c++ with haskell: aborted

> > as if the more abstract code I intended to write
> > might be harder to write in ML and Haskell.
> 
> This is a great finding.  Most of the languages
> in the shootout do reasonably well in the various
> problems on the site.  Perhaps it's time to push

These are Xavier Leroy's results:
(* Compilation options and performance (Athlon XP 1600):
     1.64s   g++ -O2 -ffast-math cpp-version.C -o cpp-version
     2.01s   g++ -O2 cpp-version.C -o cpp-version
     3.65s   ocamlopt -ffast-math -inline 100 -unsafe heap.ml bench.ml
     4.24s   ocamlopt -inline 100 -unsafe heap.ml bench.ml
     3.83s   ocamlopt -ffast-math -inline 100 heap.ml bench.ml
     4.29s   ocamlopt -inline 100 heap.ml bench.ml
*)

> I would suggest jettisoning our existing matrix
> test and just substitute Sebastien's determinate
> test.

Sounds good to me. I should also say a couple of things about this benchmark.

When N breaks the cache size barrier, you will notice a large
performance degradation. I don't know how big the caches are on our
computers, and even if I did, it varies from computer to computer. The
solution is to write a version of eval_A_times_u that works on mxm
subblocks of A, such that blocks of length m of both u and v (and any
temporary vectors) will fit in the cache. This is precisely what BLAS
and LAPACK do, and you can get amazing performance improvements out of
it. In practice this means that when the matlab guy shows you how fast
it is in matlab, he's really showing you how fast it is in fortran,
because LAPACK is written in fortran. And if you do break the cache
barrier, rewriting eval_A_times_u in block form yourself, without
using LAPACK at all, will probably let you beat LAPACK again.
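To make the blocking idea concrete, here is a minimal C sketch. It
assumes the matrix entry from the shootout spectral benchmark,
A(i,j) = 1/((i+j)(i+j+1)/2 + i + 1); eval_A_times_u_blocked is a
hypothetical name for the blocked variant. Both versions do the same
arithmetic in the same order; only the traversal changes, so slices of
u and Au of length m stay in cache while a block of A is processed.

```c
/* Matrix entry, as in the shootout benchmark (assumed here):
   A(i,j) = 1 / ((i+j)(i+j+1)/2 + i + 1). */
static double eval_A(int i, int j)
{
    return 1.0 / ((i + j) * (i + j + 1) / 2 + i + 1);
}

/* Naive version: each row of A streams over all of u, so once N is
   past the cache size, u is reloaded from memory N times. */
static void eval_A_times_u(int N, const double u[], double Au[])
{
    for (int i = 0; i < N; i++) {
        Au[i] = 0.0;
        for (int j = 0; j < N; j++)
            Au[i] += eval_A(i, j) * u[j];
    }
}

/* Blocked version: walk A in m x m sub-blocks, so each pass touches
   only length-m slices of u and Au.  Same sums, same order, same
   result; only the loop structure differs. */
static void eval_A_times_u_blocked(int N, int m,
                                   const double u[], double Au[])
{
    for (int i = 0; i < N; i++)
        Au[i] = 0.0;
    for (int ib = 0; ib < N; ib += m)         /* block row    */
        for (int jb = 0; jb < N; jb += m)     /* block column */
            for (int i = ib; i < N && i < ib + m; i++)
                for (int j = jb; j < N && j < jb + m; j++)
                    Au[i] += eval_A(i, j) * u[j];
}
```

Picking m so that a few length-m double vectors fit in L1/L2 is the
whole game; LAPACK's tuned block sizes are doing exactly this.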

Second, if you ramp up N above 10000 or something, you stop gaining
precision as roundoff takes over. So your assessment that the nice fat
number is easy to check for correctness is true, but the fat number
becomes screwy for very large values of N.

Cheers,

Sebastien Loisel