[Shootout-list] Directions of various benchmarks

18 May 2005 21:51:28 +1000

On Wed, 2005-05-18 at 17:19, Jos=C3=A9 Bollo wrote:

> > Bottom line of argument: "same way" tests make no sense
> > because by their very nature they defeat the whole purpose
> > of comparisons, to compare how similar problems are
> > solved in DIFFERENT ways .. and what the consequences
> > for performance, code clarity, ease of use, etc, are.
>=20
> that would be a long debate, here is my humble opinion: relaxing the 's=
ame=20
> way' will improve the competition and then every implementer will copy =
the=20
> algorithm of the fastest realisation. So the 'no same way' will tend to=
 'same=20
> way'.=20

In that case, the test should be thrown out as a bad test.

I agree this will happen to some extent and it should.
However, consider just one example: a regexp search.

Perl uses PCRE. Ocaml uses Str. C hand codes the solution
using char *. Felix uses the builtin regular expression
facility.

What happens: Perl solution is terse, performs moderately,
but is hard to read. The Ocaml solution is similar but a little
faster because Leroy is smarter than Wall, but a little
harder to read because it needs extra \ because there
is no 'regexp' mode for strings. The C code is extremely
fast but huge and unmaintainable. The Felix solution is almost as
fast as the C solution, uses the clearest notation, but is=20
a bit longer than the Perl.

SO here we have 4 distinct solutions to the same problem,
with 3 radically different approaches: on the Perl and Ocaml
solution use a builtin NFA, and they're distinct implementations.

Now if you go and insist on 'do it the same way' and someone disallows
the Felix solution because it uses a regular definition system
without captures -- and the C solution is 'obviously out' because
it implements the state automaton by hand.

This is absurd. All 4 solutions should be allowed, C should
be penalised for being verbose and unmaintainable, Perl and Ocaml
should be given credit for terseness but penalised for being
somewhat unmaintainable (Ocaml being hurt more than Perl because=20
Perl has special lexical support for regexps), Felix is penalised
for being verbose but gets points for being clear and maintainable,
and C and Felix get points for hands down clobbering both Ocaml
and Perl because they both generate extremely fast low level
linear machine code.

For a problem without captures but a complex regexp,
if Felix doesn't win this competition the test is flawed.
For a simple regexp, Perl should win. For a heavy search,
C and Felix should win.

The point is that the languages are NOT all copying each
other's algorithms, and they're going to give different
results entirely on speed, depending on both the regexp
being processed and on the data being processed.

Although Ocaml can generate a DFA dynamically,
so is too many lines to include -- I know because
that code is precisely what the Felix compiler uses=20
to generate the transition matrix. In practice
the Ocaml user wanting high performance would
use ocamllex to generate the automaton,
because there is C code to process it faster
than native Ocaml could.

So I accept completely your argument, but not as an argument
in favour of 'same way' requirements, but as an argument
in favour of throwing out most tests than lead to similar results
for all languages, given 'same thing' arguments (that is,
tests that simply require a correct result no matter
how it is obtained).

--=20
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850,=20
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net