[Shootout-list] Many tests cheat

02 Nov 2004 15:37:15 +1100

On Tue, 2004-11-02 at 01:59, Bengt Kleberg wrote:

> as long as the about-this-test claims that some things should hold, then 
> the test programs should follow this if they are to be measured. imho.

I agree with that, but the issue is what things it makes
sense for the tests to claim should hold.

Some of the things don't make sense to me.

For example, there is a limit on how much buffering
is allowed in some tests that do I/O (4K).

Unfortunately, in many languages the programmer has little
control over that, so I would contend the requirement
should be replaced somehow.

In both C++ and Ocaml, there can be double buffering of I/O. 
This is done by the system: iostreams for C++, and Ocaml channels
in Ocaml. Similar buffering may or may not occur in
other languages. So there is no way to actually check
if the constraint is met.

The correct way to handle this is simply to place
a limit on memory such that loading the whole file
in will exceed the limit and make the program fail
the test.

The fact is several languages cheat on this limit.
In one Ocaml program the buffer size is blatantly set to 8K.

Suppose Felix uses mmap? Then how much of the file
is actually 'read' is entirely up to the OS virtual
memory manager.

What we *really* want to require is a constant memory solution.
So make a hard limit of 10 Meg and provide a 200 Meg input :)

> this is what i too would prefer. however, i have found it to be beyond 
> my ability to create such a test in some cases. therefore i think it is 
> ok to have 'same way' tests.

An alternative is to consider designing the test an open
problem for the community. I agree some things will
be hard to test.

> i think what you have described is a multiply-matrices test. it will 
> tell us something about accessing matrices, but we would be in the dark 
> about array performance. 

Matrices are arrays aren't they?

The idea is this: an array is characterized by O(1) random
access. That's the definition of an array. This is a performance
based definition. Any program dominated by random accesses
will measure array performance. It doesn't matter if the
actual implementation used is a hash table, which is also
O(1) random access. 

In fact, hashtables handle *sparse* arrays with memory
linear in the number of non-empty elements, whereas
arrays are linear in the maximum number of elements.

So it isn't so hard to tell hash tables and contiguous
memory apart by performance checks.

Whether some data structure is an array or not
in the underlying implementation isn't an observable,
so it isn't sensible to make it part of a benchmark.

Benchmarks have to meaure observables .. this is
self-evident and follows by definition.

You may *think* you know what an array is, but I guess
you're thinking about C.. these tests should apply fairly
to all languages.

In particular, Haskell doesn't have arrays in the C sense:
you can't 'write' to a purely functional data structure.
Functional arrays are actually an active research topic.

So *reading* arrays in Haskell can be just as fast as 
in C, so the speed of read-only access should be measured
separately from mixed read./write. Haskell will do fine
on the read only test. It will have trouble with read/write,
but the answer may not be to use a functional array at all.
Depending on the problem, a Haskell programmer might have
to use lists, some kind of balanced tree, etc.

You can say this doesn't measure array access in the C
sense and the answer is: you're right, you have to
disqualify Haskell completely from this test,
because even a functional array is NOT actually an array
like a C array.

I think any way to produce the correct answer is
acceptable if it doesn't exceed time/memory limits.
I also think the default 'CRAPS' rating should
include memory and LOC = 1, not 0. This means a
high performance array access test solution in Haskell
might exist, which is very fast .. but it may require
a hand coded data structure and algorithm, and the
test should be required to include all that code.

After all speed is only one reason for choosing a language,
convenience and expressivess are important in these tests too.

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net