[Shootout-list] Stuff

Fri, 22 Apr 2005 01:45:04 +0100

Woohoo! I seem to be on the list (had a lot of problems getting here...)

Right, I've got lots of things I'd like to discuss. Firstly, you all need to 
buy copies of my book. ;-)

Someone asked if the programs from my book are simply more array+float 
problems. The answer is that they are most definitely not! The book 
emphasises the strong points of OCaml in the context of scientific computing, 
so it often uses non-trivial data structures (sets etc.) and lots of integers 
as well as floats. These kinds of things are so difficult to implement in 
most other languages that programs are either at worst intractable (Fortran) 
or at best error-prone (C++).

I think the simulated annealing for travelling salesman and "n"th-nearest 
neighbour programs would be the most interesting to get on the shootout.

The simulated annealing approach to practically solving the travelling 
salesman problem is a pedagogical task set on undergraduate science courses. 
However, the vast majority of the implementations out there are dire, 
precisely because they stick to arrays. The OCaml implementation given in my 
book uses nested closures and, consequently, is much faster at the slowest 
operations. As a functional construct, nested closures do not have a single 
obvious representation in imperative languages. I can think of object-based 
and array-based versions which could be used in C++ and Fortran, 
respectively, for this problem but neither are as elegant as the OCaml 
solution. Even more sophisticated entries could use a balanced binary tree to 
represent the current path. Brooks Moses has kindly converted my OCaml 
implementation into Fortran. A C++ conversion should be easy enough (and I 
mean a proper one, not the junk you'll find in Numerical Recipies).

The "n"th-nearest neighbour program is graph theoretic with the added twist 
that the graph is potentially infinite (periodically tiled). In the book, the 
problem is phrased as a set-theoretic recurrence relation. This is 
implemented directly (+ memoizing) in the OCaml program and the result is 
very succinct and much more easily seen to be correct than the nearest 
equivalents in imperative languages. I've already written a C++ version which 
uses the STL set, but the STL's set union is vastly slower than OCaml's...

Regarding some other discussions I've read about in the archives. I think it 
is vitally important to allow people to use common libraries (and to not 
count the library as submitted code). For example, lots of scientific 
programs are based upon Fourier series, computed with the FFT. It simply 
isn't feasible to write a decent FFT yourself (O(n log n) for any n), you 
just use FFTW. Similarly for LAPACK. Practically, it would be very useful to 
see how easily languages can exploit FFTW and LAPACK. Just as importantly for 
OCaml, the vast majority of my programs lex and parse their input properly 
using ocamllex and ocamlyacc. IMHO, these should be allowed in the shootout.

Reading the March archive I've come up with a few more ideas for benchmarks:

1. Sphere: Recursively subdivide an icosahedron to produce an arbitrarily 
accurate approximation to a sphere (described in the OpenGL Red Book).

2. Empty program: It sounds silly, but wouldn't the minimal program in each 
language be an interesting thing for newbies to look at?

3. Discrete wavelet transform: This can be implemented really elegantly using 
lists, or much more efficiently but less clearly using arrays (at least in 
OCaml).

4. OpenGL: Everyone loves graphics, why not have a section for code which 
shows off OpenGL?

5. Graphics algorithms: I like the idea of a ray tracer someone suggested. How 
about some more algorithms from graphics?

6. Calculator or BASIC: a simple lexer, parser and interpreter.

7. Symbolic manipulation: write a program which can symbolically differentiate 
x^x or multiply out polynomials with arbitrary-precision integer coefficients 
or something.

On Friday 22 April 2005 00:57, Brent Fulgham wrote:
> I suppose we could set it up so that it expects to
> find three files for each solution:
>
> 1.  The main implementation
> 2.  A Lexer
> 3.  A yacc/bison rule set.
>
> These files would then be built by the respective
> programs for that language implementation.

I like this, provided you're talking about the lex and yacc sources (.mll 
and .mly) and not the compiled ".ml", which would be huge! :-)

BTW, do you know the Fortran Mandelbrot is suffering from the 1 LOC bug? I 
can't think why it's just that one with the problem?!

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists