[Shootout-list] Stuff
Jon Harrop
jon@ffconsultancy.com
Mon, 25 Apr 2005 12:44:52 +0100
On Monday 25 April 2005 10:34, Bengt Kleberg wrote:
> Jon Harrop wrote:
> > For example, here is a minimal core of the "nth" program in OCaml:
> >
> > let rec nth_nn n (i, io) = match n with
> > 0 -> singleton (i, io)
> > | 1 -> fold (fun (j, jo) s ->
> > add (j, add_i io jo) s) bonds.(i) empty
> > | n ->
> > let pprev = nth_nn (n - 2) (i, io) in
> > let prev = nth_nn (n - 1) (i, io) in
> > let aux j t = union (nth_nn 1 j) t in
> > diff (diff (fold aux prev empty) prev) pprev
>
> would it be possible to explain this program without using ocaml (or any
> other programming languge)?
Sure, NP.
Basically, given a graph (the sets of nearest neighbours of each vertex), you
want to find the "n"th-nearest neighbours of the "i"th vertex in a graph.
This is most naturally written as a recurrence relation.
You define the 0th neighbour of a vertex "i" to be the set containing only i:
{i}. The first neighbours of "i" were given in the input so you just look
them up. The 2nd-nearest neighbours of "i" are the neighbours of the
neighbours of "i" which are not the neighbours of "i" and which are not "i"
itself.
In general, the "n"th-nearest neighbours may be written in terms of the
"n-1"th- and "n-2"th-nearest neighbours: The "n"th-nearest neighbours of "i"
are the neighbours of the "n-1"th nearest neighbours of "i" without the
"n-1"th- and "n-2"th-nearest neighbours of "i".
That would lead to code like this:
let rec nth_nn n i = match n with
0 -> singleton i
| 1 -> bonds.(i)
| n ->
let pprev = nth_nn (n - 2) i in
let prev = nth_nn (n - 1) i in
let aux j t = union (nth_nn 1 j) t in
diff (diff (fold aux prev empty) prev) pprev
However, there is an additional quirk. Scientists simulate atomic structures
by "wrapping them around" a cube, known as periodic boundary conditions. This
means the atoms on one face of the cubic simulation are taken to be bonded to
the atoms on the opposite face. So the number of atoms simulated is finite i
= [1..N] but this actually models an infinite number of atoms, tiled to cover
all of space.
In scientific computing, programmers often gloss over this and their programs
silently produce erroneous results when they are required to work over
structures approaching the size of the cube used in the simulation.
This can be attributed to the use of only the index "i" to refer to an atom,
allowing you to refer only to one of "N" atoms. I get around this caveat by
using both "i" and the index of the cube that the atom lies in, allowing you
to refer to any of the infinite number of atoms uniquely. So, the atom
denoted "(1, (0, 0, 0))" is the first atom in the cube at the origin and "(2,
(-1, 0, 0))" is the second atom in the cube just to the left of the cube at
the origin.
For example, if atom "1" is on the left-face of the cube and its neighbouring
atom "2" is on the right face, I do not simply say that "1" is bonded to "2"
but, rather, that (1, (0, 0, 0)) is bonded to (2, (-1, 0, 0)).
In the OCaml code, I've used (i, io) to denote the "i"th atom in the cube at
offset "io". As you can see, the original code is only slightly longer but
correctly handles this common source of errors.
Scientists studying a wide variety of atomic structures using similar programs
to look at all sorts of interesting phenomena. These programs are reasonably
computationally expensive (they often take an hour to run) but a factor of
two isn't so much of a big deal, writing a correct program in less time than
it takes to run is more important.
So I think this would make a great benchmark for the shootout because it is a
new class of problem not seen before on the shootout, it emphasises set union
(which no other benchmark does AFAIK), it is somewhat computationally
expensive and it is of practical importance.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists