[Shootout-list] bench mark test run time, max bad score

Brent Fulgham bfulg@pacbell.net
Tue, 28 Sep 2004 21:37:50 -0700


I'm not really interested in changing the current logarithmic scoring 
system.  It works like this:

1.  All languages are run for a particular test.
2.  For each test, the "best score" is determined by locating the
    lowest non-zero score. (This is done for CPU, Memory, and LOC
    scores.)
3.  For each language, compute its score (for each test) in
    logarithmic space:

           score(x) = 1 / (1 + log2(x / b))

           where:

               x = Current test's score
               b = Best score for this test

       Since x >= b, this yields a value greater than 0 and at most 1,
       with 1 being the "best score".
4.  Each such score is then multiplied by the weight of the test (a
    number between 0 and 5), yielding the language's score for that
    test. If a language does not have an entry for a test, its score
    is zero. (Again, we do the same for Memory and LOC.)
5.  Then the CPU/Memory/LOC scores are multiplied by their respective
    multipliers, and the resulting scores are added together to get
    the final score.
6.  Add up all the scores for each language for each test.
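
The steps above can be sketched roughly as follows. This is only an
illustration, not the shootout's actual scoring code: the function
names, the nested-dictionary layout, and the sample numbers are all
made up for the example, and treating a zero measurement the same as a
missing entry is an assumption.

```python
import math


def log_score(x, best):
    """score(x) = 1 / (1 + log2(x / b)); 0 for a missing entry."""
    # Assumption: a zero or absent measurement counts as "no entry".
    if x is None or x <= 0 or best is None:
        return 0.0
    return 1.0 / (1.0 + math.log2(x / best))


def best_score(values):
    """Lowest non-zero measurement for one test, or None if there is none."""
    return min((v for v in values if v is not None and v > 0), default=None)


def total_scores(results, weights, multipliers):
    """Sum weighted log scores per language.

    results:     {test: {metric: {language: measurement}}}
    weights:     {test: weight between 0 and 5}
    multipliers: {metric: multiplier}, e.g. for "cpu", "memory", "loc"
    """
    totals = {}
    for test, metrics in results.items():
        for metric, by_lang in metrics.items():
            b = best_score(by_lang.values())
            for lang, x in by_lang.items():
                s = log_score(x, b) * weights[test] * multipliers[metric]
                totals[lang] = totals.get(lang, 0.0) + s
    return totals


# Hypothetical data: one test, CPU seconds only.
example = {"test1": {"cpu": {"A": 1.0, "B": 2.0, "C": 4.0}}}
totals = total_scores(example, {"test1": 1}, {"cpu": 1})
```

With these made-up numbers, A is the best entry and scores 1, B takes
twice as long and scores 1/2, and C takes four times as long and
scores 1/3.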

The advantage of this approach (conceived by Stephen Weeks) is that
high-performing entries get spread out a bit (amplifying their
relative advantages), which helps determine the "best".

If this approach is used on all machines running the shootout, the
scoring should be very consistent, because the scores are based on
each entry's performance relative to the best entry, which is somewhat
less likely to change with the hardware than the particular value of
a given run.
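
A small sketch of why that holds, under the idealized assumption that
a different machine speeds up every entry by the same factor (real
hardware will not be this uniform, so actual scores can still drift a
little). The timings and names below are hypothetical.

```python
import math


def log_score(x, best):
    """score(x) = 1 / (1 + log2(x / b))"""
    return 1.0 / (1.0 + math.log2(x / best))


def scores(times):
    """Score every entry against the fastest one in the same run."""
    b = min(times.values())
    return {lang: log_score(t, b) for lang, t in times.items()}


# Hypothetical CPU seconds on one machine.
machine_1 = {"A": 1.5, "B": 3.0, "C": 12.0}
# Assumed: a second machine that runs everything exactly twice as fast.
machine_2 = {lang: t / 2 for lang, t in machine_1.items()}

s1 = scores(machine_1)
s2 = scores(machine_2)
# Only the ratio x / b enters the formula, so the common factor cancels.
same = all(math.isclose(s1[k], s2[k]) for k in s1)
```

Here both machines give A, B, and C scores of 1, 1/2, and 1/4, even
though the raw timings differ.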

-Brent