[Shootout-list] Binary trees

Jon Harrop jon@ffconsultancy.com
Mon, 20 Jun 2005 00:41:43 +0100


On Sunday 19 June 2005 18:47, Einar Karttunen wrote:
> Jon Harrop <jon@ffconsultancy.com> writes:
> > This is not totally unlike a real program. In particular, this tree can
> > be empty (containing only one T object, Empty). The other implementations
> > (e.g. OCaml, MLton, Haskell) can't represent an empty tree. Does this
> > make them invalid?
>
> The Haskell implementation which is currently the fastest has:

It says Clean, D and MLton are faster ATM.

> data Tree = Node  Int Tree Tree | Nil
> which is perfectly capable of representing an empty tree. If the other
> ones are not they should be fixed.

You are quite right. This is putting Haskell at a disadvantage. But what can 
we do about it?

> >   type int_tree = Empty of int | Node of int_tree * int * int_tree
>
> Seems unfair to me.

Yes, it is arbitrarily "unfair" but unavoidable with the current approach, 
i.e. there is no clear C equivalent to either of those - the underlying 
representation is up to GHC, ocamlopt, MLton etc.

Some people have said that we are trying to measure the compilers' efficiency 
at optimising. This is a perfectly valid point but doesn't get around the 
fact that this type of benchmark is totally subjective.

Also, I haven't found anyone who knows how to code a Fortran equivalent to the 
tree in my ray tracer. Consequently, the current Fortran entry implements 
quite a different data structure which is simpler and substantially less 
powerful than that found in most other languages. I don't object to this, of 
course, because the program still works but other people should object 
strongly (if they are self-consistent).

> Are you saying that measuring one datastructure cannot be done if there
> exists a faster datastructure for the same purpose?

I am saying that this test is pointless because you cannot define the data 
structure concretely in a way that can be enforced. You either give your 
artistic impression of the shape of the data structure (as the task 
description for binary trees currently does) or you give a concrete 
definition in terms of memory contents which can either not be enforced or 
cannot be feasibly implemented in high-level languages.

> > Or how about this, I replace the tree data structure in OCaml with a
> > recursive closure? That won't even need to build the tree, there will be
> > nothing to GC and it will probably be orders of magnitude faster as a
> > consequence. Even better, it's subtle, and Isaac probably won't notice.
> > ;-)
>
> Or you could try to play nice so that whining wouldn't be required.

No! This is my fundamental point. It is not possible to define "play nice" 
and, consequently, we cannot expect anyone to "play nice". So there will 
always be subjective whining with the current approach.

> > For me, the propect of having Isaac throw a submission back in my face
> > because of a rule he just imagined is the strongest reason to not join in
> > development of the shootout. If people were free to submit their own
> > crazy optimisations in their favourite language then I think the shootout
> > would be much more popular (just look at the software texture mapping
> > competitions in days of old).
>
> If you do something you suspect that people won't like, you can post to
> the list first *asking* whether that is considered ok.

IMHO, good programmers will want to start from a solid task description. I 
know lots of great programmers who are definitely not going to waste their 
time writing programs which may or may not be rejected by the shootout on the 
basis of people's feelings.

> > Also, these problems with binarytrees will disappear if we follow
> > Skaller's advise for designing the tests. This test shouldn't state that
> > you must use a binary tree, it should simply require you to solve a
> > problem that is best attacked with a binary tree.
>
> That sounds very hard. First we have a benchmark and people start
> contributing to it. After half a year someone finds a faster algorithm
> and now all the implementations need to be rewritten. This is especially
> hard for the more exotic languages with few contributors and in the
> meantime the languages with the most activic contributors seem to be
> faster in the test.

Yes. It is still not ideal but I believe that is better than having someone 
choose which implementations in which languages are allowed and which are 
not.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists