[Shootout-list] Re: OCaml harmonic benchmark (unreliable measurements)

Bengt Kleberg bengt.kleberg@ericsson.com
Mon, 02 May 2005 10:13:04 +0200


Jon Harrop wrote:
...deleted
> Anyway, the time is hugely noisy because it is so short:

this is a major problem.

it is strongly related to the fact that all (often only 3) measurements
are much too close (no spread). iirc this is because all languages have
to use the same value for n in the table (ie, a slow language must be
able to finish within the time limit for the same 'n' that is used by
the fast programs). the time limit can not be increased since we have a
finite amount of time to run the tests in. neither the time limit nor
the total time available is documented anywhere that i can find. i have
searched the faq, and it is not there.

without this knowledge i find it difficult to solve the problem.
moreover, i think the concept of a single value (instead of a graph) is
a bad idea anyway. so i am doubly unfit to solve this. however, to get
the discussion started i will try anyway.


my suggestion is to move away from the current system of the same 'n'
for all languages. instead i suggest a system with the best 'n' for each
language. ie, measure several, well spread out 'n' (up until a time
limit). compute n/sec for each run. put the best n/sec in the table.
this way we would no longer be hobbled by too long times for a few
languages making the timings of the fast languages unreliable.
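
to make the suggestion concrete, here is a minimal sketch in python. it
is not how the shootout scripts work; the names best_rate, run, sizes
and time_limit are made up for illustration, and harmonic is only a
stand-in workload. it also assumes that the run which first reaches the
time limit still counts, and that measuring stops after it.

```python
import time


def best_rate(run, sizes, time_limit, clock=time.perf_counter):
    """Return (n, n_per_sec) for the size with the highest throughput.

    run(n) executes the benchmark for problem size n.
    sizes is an increasing sequence of well spread out n values.
    Measuring stops after the first run that takes time_limit seconds
    or more (assumption: that last run is still counted).
    """
    best = None
    for n in sizes:
        start = clock()
        run(n)
        elapsed = clock() - start
        if elapsed > 0:  # skip runs too short for the clock to resolve
            rate = n / elapsed
            if best is None or rate > best[1]:
                best = (n, rate)
        if elapsed >= time_limit:
            break
    return best


def harmonic(n):
    # stand-in workload: partial sum of the harmonic series
    total = 0.0
    for i in range(1, n + 1):
        total += 1.0 / i
    return total


if __name__ == "__main__":
    # well spread out sizes: 10^3 .. 10^7
    sizes = [10 ** k for k in range(3, 8)]
    print(best_rate(harmonic, sizes, time_limit=1.0))
```

a slow language would simply stop at a smaller 'n' than a fast one, and
each language would be represented by its own best n/sec.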


bengt