[Shootout-list] X per second scoring system, resume

Brandon J. Van Every vanevery@indiegamedesign.com
Thu, 30 Sep 2004 13:55:06 -0700


Einar Karttunen wrote:
> Brandon J. Van Every wrote:
>
> > - yes it would be nice to have some larger, roughly
> > constant amount of time for all tests
>
> Does this mean to fix input sizes?

No, as it says, it means roughly fixing the amount of time all tests
run.  That's the basis of an 'X per second' scoring system.  Whatever
amount of work you can get done in a defined chunk of time.  I'm
thinking the chunk of time needs to be 30..60 seconds for accuracy, but
I could be mistaken.
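
To make 'X per second' concrete, here is a minimal Python sketch of
the idea: run the work in batches until a fixed time budget is used
up, then report units of work per second.  The names `x_per_second`,
`work_unit`, `batch` and `clock` are made up for illustration, and
the 30 second default is just the low end of the range guessed at
above.

```python
import time

def x_per_second(work_unit, budget_seconds=30.0, batch=1,
                 clock=time.perf_counter):
    """Run work_unit() in batches until the time budget is spent.

    Returns completed units of work per second of elapsed time.
    The clock is read once per batch, so a larger batch means less
    timer overhead inside the measured region.
    """
    if budget_seconds <= 0 or batch < 1:
        raise ValueError("need a positive budget and a batch of at least 1")
    done = 0
    start = clock()
    elapsed = 0.0
    while elapsed < budget_seconds:
        for _ in range(batch):
            work_unit()
        done += batch
        elapsed = clock() - start
    return done / elapsed
```

The score is whatever amount of work fit into the chunk of time, so
a faster language simply gets a bigger number rather than a shorter
run.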

> Many test won't scale in a linear
> fashion... So this will penalize in most case faster languages.
>
> > - 'guessing N' never goes away, it's just per test or per
> > invasive timer iteration.
>
> So we don't reduce the amount of work.

No we don't.  The point is to increase accuracy, not reduce
implementation work.  It costs work.  Personally, I value the accuracy
more than avoiding the work.

Or by 'work' do you mean, 'the benchmarks will still take a long time to
run?'  In an invasive micro system, not necessarily.  For calibration,
we only need to pick the inner loop N so that the inner loop takes
(perhaps) 1 second.  1 second between START() STOP() is plenty good for
accuracy in any environment with decent timers.  You'd need to start at
N=1, do a run, double it until the inner START() STOP() time exceeds 1
second, then halve/double/binary squeeze until it takes sufficiently
close to 1 second.  Use that N.  Of course if N=1 already takes much
longer than 1 second to perform, you can't sample at the rate you want
and you have a more basic problem.  You could either use N=1 or flunk
the test as too slow.
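
The doubling-then-squeeze calibration described above might look
roughly like this in Python.  It is only a sketch: `timed_run`,
`calibrate`, the 10% tolerance, and the 'more than twice the target
means flunk' cutoff are illustrative assumptions, not anything the
Shootout implements.

```python
import time

def timed_run(work, n):
    """Seconds spent in work(n): the gap between START() and STOP()."""
    start = time.perf_counter()           # START()
    work(n)
    return time.perf_counter() - start    # STOP()

def calibrate(work, target=1.0, tolerance=0.1, measure=timed_run,
              max_steps=64):
    """Pick N so that work(N) takes roughly `target` seconds.

    Returns None ("flunk") if N=1 already takes more than twice the
    target; that cutoff is an arbitrary choice for this sketch.
    """
    n = 1
    elapsed = measure(work, n)
    if elapsed > 2 * target:
        return None

    # Doubling phase: grow N until one run reaches the target time.
    lo = n
    while elapsed < target:
        lo = n
        n *= 2
        elapsed = measure(work, n)
    hi = n

    # Binary squeeze between lo (too short) and hi (long enough).
    for _ in range(max_steps):
        if abs(elapsed - target) <= tolerance * target or hi - lo <= 1:
            break
        mid = (lo + hi) // 2
        elapsed = measure(work, mid)
        if elapsed < target:
            lo = mid
        else:
            hi = mid
        n = mid
    return n
```

Passing `measure` as a parameter keeps the search separate from the
timer, so a language with an awkward timer only has to supply its
own START()/STOP() pair.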

This whole process of convergence might take 5 seconds or so.  So if you
want 60 seconds of actual run, your whole micro-invasive test takes 65
seconds to run.  I'm seeing 31 tests at present, if I'm good at
counting, so that's 33.58 minutes per language.  There's 55 languages,
so that's 30.78 hours of testing for the whole thing.

Hm, that's pretty high.  Seems there's a motive to pick either a lower
number of seconds per test, or to tailor it according to how fast the
language actually is.  Anyways, in a micro-invasive approach, the
calibration is both easily automated and quick.  'Guessing N' is never
'somebody-else's-problem'.  I suppose if you did 5 seconds of
calibration and 10 seconds of actual testing, you'd get the whole thing
done in about 7.1 hours.  A night's rest.
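
The time budget is easy to recompute for other choices of run
length.  A small Python sketch, taking the 31 tests, 55 languages and
5 seconds of calibration from the figures above as defaults (the
function names are made up for illustration):

```python
def minutes_per_language(run_seconds, calibration_seconds=5, tests=31):
    """Minutes to run every test once for a single language."""
    return (run_seconds + calibration_seconds) * tests / 60

def total_hours(run_seconds, calibration_seconds=5, tests=31,
                languages=55):
    """Hours to run every test for every language."""
    per_language = minutes_per_language(run_seconds,
                                        calibration_seconds, tests)
    return per_language * languages / 60

for run in (60, 10):
    print(f"{run:2d} s runs: "
          f"{minutes_per_language(run):6.2f} min per language, "
          f"{total_hours(run):5.2f} hours in total")
```

Tailoring the run length per language, as suggested above, would
amount to calling this with a different `run_seconds` for each
language and summing the results.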

> > - Ideally we guess N automatically.  'Somebody-else's-problem'
> > works but creates labor for any new user of the Shootout test
> > suite.
>
> Choosing a different N can make the order vary.

You mean like, some language dies once N hits a certain threshold?  Or
do you mean small noisy variances?  Noise isn't important.

> > - invasive timers are usually more accurate, but are gruntwork to
> > implement
>
> Invasive timers are more accurate depending on the language,
> making timer scales vary with language.

Languages already vary with languages.  :-)  Any timer on the order of
microseconds will do fine.  We're talking about START() STOP() gaps of 1
second.  Even milliseconds is tolerable if the language is slow.

> > - we haven't checked if some language poses special implementation
> > difficulties
>
> Yes, they do.

Example then please?

> > - it doesn't matter much if a slow language has a slow timer.
> >   Only whether a fast language has a slow timer.
>
> Yes it does. It makes the slow language appear even slower..

Penny wise, pound foolish.  Slow relative to what?  If it's at the
bottom of the dogpile anyways it doesn't matter.

The crux of your argument is, you believe some languages will have
timers so bad, that invasive timing overhead will be unacceptable in
some cases.  I'm not ready to buy your premise yet.  What's your
specific example?

I will also wager that all the (relatively) high-performance languages
have a high-performance timer available.  Either natively or through a C
call.


Cheers,                     www.indiegamedesign.com
Brandon Van Every           Seattle, WA

"The pioneer is the one with the arrows in his back."
                          - anonymous entrepreneur