[Shootout-list] Directions of various benchmarks

skaller skaller@users.sourceforge.net
18 May 2005 22:51:26 +1000


On Wed, 2005-05-18 at 17:48, Bengt Kleberg wrote:
> skaller wrote:

> 
> if i understand you correctly we will have some langugaes that have pcre 
> builtin. they always register as pcre. then we have langugaes without 
> pcre builtin. they register as fail.

Yes and so a test requiring PCRE is worthless: languages using
it are using the same code and those not doing so are all failing.

So the requirement should be to solve a particular problem for
which PCRE can provide a good solution, but ANY solution should
be allowed (provided it isn't too long or too slow of course).

Same way tests are semantic nonsense. The only way to
do something 'the same way as this Java program' -- which is
the specification of some tests right at the moment --
is obviously to actually use THAT Java program.

How the heck can you judge if a solution in C or Ocaml
or Haskell is 'the same way' as a Java program when one
language is imperative with variables and the other
is functional?

This isn't a rhetorical question or philosophising: already
Isaac had the audacity to ban the Felix solution to the
EH test because it was orders of magnitude faster than every
other solution, with eth excuse that "by my(skaller) own 
admission it wasn't using the same mechanism as C++/Java" 
-- but failed to similarly ban the C solution and 
the Scheme solution because they
both also use variants of the same technique -- all three
languages use continuation passing, C with Jumpbufs,
Scheme with Call/CC, and Felix with a non-local goto.

Similarly 'concurrency' tests are in question, because
a requirement to use Posix thread is absurd: should we
really measure OS thread handling limitations? 

And how would this be implemented in Felix? -- You have to
understand that with Felix microthreading, routines
are called by yielding .. it is nonsense to spawn
a posix thread in Felix because both that thread and
its creator would immediately yield the same
continuation object to the user space scheduler
 .. causing it to crash. The spawning has to be
done in the sheduler .. and I can't implement that until
I have at least one *real* reason to do so.

OTOH if you allow user space context switching, as supported by 
several systems (including Felix and I think Haskell?), and also
supported by Ocaml bytecode interpreter and Python .. then there's
no way to know if you will really get concurrency .. and in the
test cases there was none .. the Felix optimiser just eliminated
the overhead, again outperforming other solutions because even
the dumb optimiser I have implemented is smarter than the testers :)
<no offense intended>

A similar problem with 'repeat the same code' and Haskell cleverly
'lazifying' the repitition away. Again a test designed in with
a dumb imperative programming model in mind can't be expected
to mean much for an advanced system like Haskell.

> if the normal way to use these languages is with pcre, why is there no 
> builtin pcre? if pcre is not the normal way, why is there no 
> alternative? (perhaps they are not supposed to be used for the task at 
> hand? perhaps should they fail then?)

Because in many languages, such as Lua and C, the 'normal' way to solve
higher level problems IS to use an external library. The standard
libraries are *deliberately* minimalist, and the language is instead
made very good for both building libraries and binding to them.

In such systems the 'normal' way is to choose the best available tool,
which precludes bundling any special tool into the language.

In fact in ISO C++ version 1998/9 there is no regexp but in
ISO C++ version 'sometime in 200x if we're lucky' there will
be a regexp library (templated of course).

And in Felix the answer is 'why not just bundle PCRE into
the package to satisfy the Shootout requirements and pass
the test' because as the vendor I could do that. I'm the one
that authoritatively defines what is 'builtin' and what is not.

Yet .. for Felix I cannot. The whole language is designed to take
the C concept of a minimal language with bindings to external libraries
much further, so that almost everything is user defined: right
now without external bindings there aren't even any integers
in the language!

So the very distiction being made is deliberately being challenged
in a fundamental way by Felix. By design it is BOTH nothing more
than a way of binding things together, and also a package of
example bindings that are useful.

In fact per chance, regular expressions are handled directly
as low level primitives which is a LOT more than you can say
about Perl: the funny thing is that in Felix regexps are
primitive constructions .. but strings are NOT!!!

So in my opinion .. if you want to get pedantic Felix is the
ONLY language I know of that can truly be said to handle
regexps natively. Yet the whole facility is intended as a plugin
extension (and the treatment as a primitive is really a design
bug :)

If you tried to read this and got confused .. well me too!!
The point is that isn't really well defined what is in and what
is not in a language, what is 'the same' and what is not:
is a tail recursive loop really 'the same as' an imperative loop?

To really answer these questions you must be a theorist working
with two similar mathematical models and be able to connect
the models to show equivalence.. such result are often vitally
important to theoreticians -- but for practical programming 
languages the question of whether things are done 'the same way'
just can't be answered.

Example. Suppose you want to say 'sort this file using
a bubble sort'. Do it 'the same way' at the algorithmic level.

Well you CANNOT sanely require that. So, what you require instead
is 'sort this array with less than 5 lines of code without allocating
more than a constant amount of memory'.

And you can add that 'the intent is to use a bubble sort'. 
(Maybe its 6, but you can't implement a quicksort in such a small loc :)

So anywhere you really want to specify a particular algorithm,
it can almost always be done by specifying limits on the space
and time complexity order plus perhaps a constraint on the LOC of code.
There is a way to get what we want *without* gratuitous and arbitrary
limititations.

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net