[Shootout-list] Stuff

Jon Harrop jon@ffconsultancy.com
Fri, 22 Apr 2005 20:47:55 +0100


On Friday 22 April 2005 16:42, Isaac Gouy wrote:
> The mystery is why you "don't want to do" it on a website that's
> always been about including a wild range of languages - GAWK to
> Haskell.

We need to ossify "a wide range of languages": My guess is that all of my 
programs should be implementable in D, SML, C#, OCaml, C++, Lisp, Java, 
Haskell, Lua, Scheme, Perl, Python and Felix. Is that wide enough?

The main missing languages are C, Fortran and Pascal because I don't think the 
necessary data structures will be easy enough to implement in those 
languages, but I may be wrong. Other data structures can be substituted but 
are likely to be slower.

I don't want to "change" the problems to make it easy to implement them in C 
and Fortran as the problems will no longer represent anything relevant or 
useful.

> Is it possible that your "practically important sets of problems" are
> much less "practically important" to programmers working in other
> domains?

Yes, absolutely. Analogously, the object*, tcp* and threads* benchmarks are 
all completely irrelevant for me but I don't think they should be removed as 
a consequence.

> Obviously, absence of implementation is not evidence that an
> implementation is even difficult, let alone impossible.

It is not proof. I take it to be evidence.

> Perspective matters - nsieve has working implementations in ~22
> different language implementations. If there's a problem then it isn't
> that nsieve is so complicated and demanding that it excludes many
> languages.

Yes.

> (The obvious problem benchmark is pidigits because it requires
> arbitrary precision arithmetic.)

I wouldn't have said that was a problem. Indeed, I think pidigits is one of 
the more interesting programs currently on the shootout.

> > If you go down that route, you'll end up with assembler programs
> > written in all languages. I don't think anyone wants that.
>
> While we're speculating, does anyone want implementations using
> different algorithms with different running times?

I'd like to see "natural" implementations. For example, I think the C++ sieve 
should use vector<bool>. Also, I'd be happy to see OCaml use the Set module 
and C++ use the STL set<> to implement sets, even though these are quite 
different.

For non-trivial tasks, it will be impractical to restrict implementations to 
using identical data structures and algorithms. Specifically, when the data 
structures and algorithms in standard libraries become useful.

> Some folk don't -
> http://groups-beta.google.com/group/comp.lang.functional/msg/ddb2894d9e3d80
>24?hl=en

I agree with what Ralph said there.

I would add that the performance vs verbosity trade-off could be addressed by 
allowing multiple versions of each program in the shootout. The performance 
result would then be the maximum performance of the submitted programs and 
the LOC count would be the length of the shortest working program.

> > So if you want to avoid bias then I'd say why is there an nbody
> > program which is clearly geared up for Fortran
>
> You assume too much - the nbody program was worked through in Java
> without a thought to Fortran.
>
> >(it even requires Fortran-style implementation, IIRC)
>
> Really - in what way?

By "geared up for Fortran" I mean it is a simple array-based numerical 
algorithm which, I believe, is enforced by the benchmark specification rather 
than objectively selected as the most appropriate method (which it probably 
is because the task is too simple to be representative of most scientific 
computing). Also, enforcing the hard coding of magic numbers is not only poor 
programming style but also devalues the LOC measure in the shootout. I think 
this benchmark could be greatly improved upon.

I think a better benchmark would be a program to relax a box of atoms by 
minimizing (using a specified gradient-descent variant) the potential energy 
with respect to the atomic coordinates (given as input to the program). Use 
something like a truncated Lennard-Jones potential or the Dzugutov potential 
so the interactions are short range. Programs should be run on 10^5 
particles. The programs will then be practically useful and, therefore, will 
be much better for objective measures of language performance, verbosity, 
clarity and memory use.

If you want something harder then introduce electrostatic interactions and the 
fast multipole method. There is actually a simple 1D example along these 
lines in my book (67 LOC in OCaml). I'm happy to submit that as well, if 
anyone is interested.

> > Oh, I had another idea. What about a program to solve x^3 - x - 1
> > using Newton Raphson?
>
> What work does it do that isn't already covered by other benchmarks?

1. Basic numerical analysis (mach_eps, numerical derivative, newton raphson).
2. Higher-order functions (numerical derivative, newton raphson).

My "relax" benchmark is more complicated and more CPU intensive but not 
dissimilar.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists