[Shootout-list] Directions of various benchmarks

Einar Karttunen ekarttun@cs.helsinki.fi
Wed, 18 May 2005 23:14:18 +0300


skaller <skaller@users.sourceforge.net> writes:
> where 'acka' contains all the known values of Ack(3,x).
> Clearly this algorithm is (a) correct and (b) faster than
> all others by a MASSIVE margin .. but it defeats the real
> intent of the test.

The test specification is "calculate with the naive recursive algorithm".
How do you see "return a precomputed result" as fulfilling that?
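
Just to make that concrete, the whole naive version is only a few lines.
A Haskell sketch (the argument 8 below is just an example input, not
anything taken from the spec):

    -- Naive recursive Ackermann: no memoisation, no precomputed
    -- table, every call really recurses.
    ack :: Int -> Int -> Int
    ack 0 n = n + 1
    ack m 0 = ack (m - 1) 1
    ack m n = ack (m - 1) (ack m (n - 1))

    main :: IO ()
    main = print (ack 3 8)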

> The solution is: drop Ackermann as a test. The problem 
> is that it does lots of function calls in naive version,
> which we want BUT there are not enough possible inputs.

No. Ackermann is a simple and easy test that measures one small thing.
Replacing it with something complicated would be quite
counterproductive.

> We must choose instead a function that is heavily recursive
> but which has MANY inputs so the results can't be pre-calculated.
> I do not know such a function off hand, but I'm confident
> suitable ones exist.

Lots of inputs means lots of worthless code for parsing input, and
potentially IO-bound programs.

> So, this is not 'dynamic exception handling' of the same
> kind as in C++ and Java, and that is quite deliberate:
> I don't supply general dynamic EH because it is extremely
> BAD to have, it is way too powerful. The connection between
> the throw and catch points is not just 'open', it may
> not even exist.

And, more precisely, it is not invisible to the code layers it may be
passing through: they still have to be written to survive it. But this
is a dead horse, since the exceptions are no longer there.


> There is no other way to get pre-emptive threading on linux
> that I know of.

Both GHC and Erlang have pre-emptive soft threads on Linux...
The trick is to use co-routines, timer signals, non-blocking IO
or a separate thread pool for IO, and lots of black magic.
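
A rough sketch of what that looks like from the user's side with GHC's
forkIO (the worker code here is only my illustration, not from any
benchmark): the runtime multiplexes the soft threads over a few OS
threads and pre-empts them itself, so user code never yields explicitly.

    import Control.Concurrent (forkIO, threadDelay)
    import Control.Concurrent.MVar

    worker :: MVar Int -> IO ()
    worker done = do
      threadDelay 10000                  -- pretend to do some IO
      modifyMVar_ done (return . (+ 1))  -- check in when finished

    main :: IO ()
    main = do
      done <- newMVar 0
      let n = 1000 :: Int
      mapM_ (const (forkIO (worker done))) [1 .. n]
      -- crude join: poll until every worker has checked in
      let wait = do c <- readMVar done
                    if c < n then threadDelay 1000 >> wait
                             else return ()
      wait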

>> There is a semantic difference between them and coroutines.
>
> Actually not really. The only difference is that with
> coroutines every point of the code except a yield
> is a critical section, whereas with pre-emptive threading,
> this behaviour is obtained by explicitly locking at the start
> of the thread and yield is just a release and re-acquisition
> of the lock -- so coroutines can be emulated by pre-emptive
> threads. OTOH you can yield anywhere in a coroutine ..
> so semantically they're actually equivalent, or more
> precisely each can emulate the other.

The semantic difference becomes visible the instant one thread takes
too long. The liveness properties are also quite different.

> However, threading is NOT concurrency, that requires
> a multi-cpu computer, and you can often run two processes,
> gain concurrency, and not have any threading.

By that argument there is no concurrency at all on a single-CPU
computer. But pre-emptive scheduling is equivalent to concurrency
(minus the choice of scheduling algorithm).

> Finally, the 'requirements' on at least one of the tests
> allowed microthreding in a sloppy way: several systems
> actually used it and were permitted despite not using 
> Posix threads.

How would you classify a system which uses one POSIX thread to run the
computations with a pre-emptive scheduler and other threads for IO? What
if the system is capable of launching new POSIX threads that run the
computations on an SMP system? And all of this without the threads
noticing a thing.

>> All other languages could switch to using co-routines and have a similar
>> performance boost 
>
> Nope! Which other compiled language than Felix HAVE coroutines??
>
> Um .. not C .. not Ocaml .. not C++ ..  Haskell might I think?

One can implement coroutines with call/cc if I remember correctly -- so
many ML implementations, Scheme, Haskell, ... Also the OCaml encoding is
likely to be quite short.
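
call/cc is the heavyweight way to get there; the coroutine structure it
buys you can also be written out directly. A hand-rolled Haskell sketch
(an explicit-yield encoding, not the call/cc construction):

    -- A coroutine that yields values of type o before returning an r.
    data Coroutine o r = Done r | Yield o (Coroutine o r)

    -- A producer: counts down from n, yielding each value in turn.
    countdown :: Int -> Coroutine Int ()
    countdown 0 = Done ()
    countdown n = Yield n (countdown (n - 1))

    -- A trivial scheduler that resumes the coroutine until it is done.
    drive :: Show o => Coroutine o r -> IO r
    drive (Done r)    = return r
    drive (Yield o k) = print o >> drive k

    main :: IO ()
    main = drive (countdown 5)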

> The proof is simple -- Felix does the work *without* calling
> Posix threads and creams it in! It also isn't limited to 350 or
> so threads (that's about when my old LInux box started to refuse
> to create more threads) .. Felix easily runs 10,000 threads.
> Again, it was designed to, the original design was for a
> telephony environment (600 calls/second is the nominal
> rating of a telephony switch .. average call length 3 minutes,
> the load is shared between about 10 CPUs)

Actually, you can get 10,000 POSIX threads with NPTL...

>> You cannot get a sorting algorithm like that very easily in Haskell. It
>> needs mutable arrays for the constant memory and thus the LOC will climb.
>
> Then it will fail, if you can't write a sort in 5-6 loc.
> Imperative programming does have some advantages?

Getting guaranteed constant memory usage is non-trivial in many
high-level languages. Why do you think it is important? I think Haskell
will be allocating short-lived data and still kill the scripting
languages. I think "implement sorting in 5 lines with constant memory"
is a much more evil specification than "implement bubblesort".
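
To make the contrast concrete: the folk Haskell "quicksort" is about as
short as a sort gets, but it allocates fresh lists at every level, so it
would fail any constant-memory requirement while still being fine in
practice.

    -- Short, but nowhere near constant memory: each level builds new
    -- lists instead of sorting in place.
    qsort :: Ord a => [a] -> [a]
    qsort []       = []
    qsort (p : xs) = qsort [x | x <- xs, x < p]
                     ++ [p]
                     ++ qsort [x | x <- xs, x >= p]

    main :: IO ()
    main = print (qsort [3, 1, 4, 1, 5, 9, 2, 6])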

You seem to advocate treating tests as black boxes: "do whatever you
want inside as long as it satisfies X, Y and Z". I would prefer an
approach where the what and the how are in the specification and can be
discussed on the list if one is not sure whether some solution fulfills
it.

- Einar Karttunen