[Shootout-list] Directions of various benchmarks
John Skaller
skaller@users.sourceforge.net
Fri, 20 May 2005 15:21:57 +1000
On Fri, 2005-05-20 at 05:49 +0100, Jon Harrop wrote:
> On Friday 20 May 2005 05:36, Brent Fulgham wrote:
> > On May 19, 2005, at 6:56 PM, John Skaller wrote:
> > > On Thu, 2005-05-19 at 13:30 -0700, Brent Fulgham wrote:
> > >> --- Bengt Kleberg <bengt.kleberg@ericsson.com> wrote:
> > >>
> > >> How can we judge if a declarative language meets
> > >> an imperative specification?
> > >
> > > By the output.
> >
> > What if the output was precomputed?
>
> I think you two may be talking at cross purposes here. When Brent said "How
> can we judge if a declarative language meets an imperative specification?", I
> think he expected the answer "you can't".
>
> When Skaller said "By the output", I think he meant "You can't, you should
> just check the output".
>
> So I predict that you're both going to say "let's ditch imperative
> specifications by specifying the problem to be solved rather than the CPU ops
> which must be performed to get there after we buy Jon a beer". Am I
> right? :-)
Almost. Lol :)
I have no problem with the output being specified by giving
an imperative algorithm. In fact giving an actual Java program
is a very good technique because
(a) it is written in a formal language that is readily understood, and
(b) the specification is operational (executable).
So if you change the wording from 'do it the same way as this
Java program' to 'produce the same result as this Java program'
I'd be reasonably happy. Perhaps there is an argument for pseudo-code
instead, or for using Haskell in some cases, or just plain English:
for example the specification for the regexp phone number test
is clearly given in plain English. The critical difference is
the change of wording, so that the requirement is simply to produce
the same results. I prefer 'formal English' for specifications,
but I know from experience on the C++ committee that it isn't
an easy language for specification writing ;(
For some numerical tests, the 'same' result is too strong;
instead one might prefer a result that is 'close', together with
some way of measuring 'close', such as a matrix whose elements
are all within 'epsilon' of the result computed by a
reference Java program.
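A 'close' criterion of that kind is easy to state mechanically. Here is a minimal sketch in Python; the function name, the epsilon value, and the use of plain nested lists for matrices are all illustrative assumptions, not part of any actual harness:

```python
# Sketch: accept a candidate matrix if every element is within
# 'epsilon' of the reference result (e.g. one computed by a
# reference Java program).

def close_enough(candidate, reference, epsilon=1e-9):
    """Return True if the matrices have the same shape and every
    pair of corresponding elements differs by at most epsilon."""
    if len(candidate) != len(reference):
        return False
    for row_c, row_r in zip(candidate, reference):
        if len(row_c) != len(row_r):
            return False
        if any(abs(c - r) > epsilon for c, r in zip(row_c, row_r)):
            return False
    return True

reference = [[1.0, 2.0], [3.0, 4.0]]
candidate = [[1.0 + 1e-12, 2.0], [3.0, 4.0 - 1e-12]]
print(close_enough(candidate, reference))   # True: all within epsilon
print(close_enough([[1.0, 2.5]], reference))  # False: shapes differ
```

The point is only that 'close' becomes an observable: any two implementations can be compared against the same reference output without arguing about how the result was computed.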
For some tests, one may want to impose additional constraints
over and above 'correctness', such as 'constant memory';
this can also be measured. The key thing is that the criteria should
'in principle' be observables in the scientific sense.
LOC is an observable. Perhaps not a good one. 'Well commented'
is not, but it could be made one -- at some stretch of the imagination
we could install the OCaml Demexp program and actually VOTE on such
issues .. (this is an online GTK-based GUI voting system).
In practice, 'common sense' would prevail. The real point
is that with a clear *principle* test case authors could
proceed without worrying whether they were offending Isaac's
opinion about what constituted exception handling, Brent's
idea of what an external library is, Haskell's ideas of
how to be lazy, Skaller's diatribes, etc etc .. :)
For external libraries, this ties in with my argument
that a 'compiler' means a tool with a fixed set of
command line switches. gcc is free to use PCRE as a regexp
engine provided the library is linked on EVERY test line.
To pass all the tests might need 15 external libraries
and that would show up on the command line as a complexity.
Note that requiring this also simplifies the test harness: you only
get to pick 'one' set of switches and libraries per 'translator',
so choose wisely, and pay the price where there is one in return
for the gains (gcc sometimes crashes with -O3 -fomit-frame-pointer ..)
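The 'one fixed command line per translator' rule is simple enough to capture directly in a harness. A sketch in Python follows; the translator names, switches, and libraries shown are made-up examples for illustration, not a recommendation of any particular flag set:

```python
# Sketch: each 'translator' is one fixed command line.  Every test
# is built with exactly that line -- external libraries (e.g. PCRE)
# are linked on EVERY test, so their cost shows up as complexity.

TRANSLATORS = {
    # hypothetical entries -- flags and libraries are illustrative only
    "gcc":   ["gcc", "-O3", "-lpcre", "-lm"],
    "ocaml": ["ocamlfind", "ocamlopt", "-package", "str", "-linkpkg"],
}

def build_command(translator, source_file):
    """Return the full build command: the translator's fixed switch
    set plus the test source file.  No per-test switch tweaking."""
    if translator not in TRANSLATORS:
        raise KeyError(f"unknown translator: {translator}")
    return TRANSLATORS[translator] + [source_file]

print(build_command("gcc", "regexmatch.c"))
# ['gcc', '-O3', '-lpcre', '-lm', 'regexmatch.c']
```

Because the switch set is chosen once per translator, a flag that crashes on one test (or a library that slows another down) cannot be quietly swapped out per-benchmark; the trade-off is visible and paid everywhere.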
--
John Skaller, skaller at users.sf.net
PO Box 401 Glebe, NSW 2037, Australia Ph:61-2-96600850
Download Felix here: http://felix.sf.net