[Shootout-list] X per second scoring system, resume
Brandon J. Van Every
vanevery@indiegamedesign.com
Thu, 30 Sep 2004 12:28:58 -0700
Isaac Gouy wrote:
>
> Bengt & Brandon
>
> Humbly suggest you set up IRC and talk to each other directly.
I agree the volume got a bit much yesterday, but it's not really a
private conversation. There needs to be the possibility of input from
others and also an archive. Plus a lot is said at once and that's
better for the medium of e-mail than chat.
I think where we're at right now is:
- yes it would be nice to have some larger, roughly constant amount of
time for all tests
- we need to measure to decide if it should be 30 seconds, 60 seconds,
something else
- 'guessing N' never goes away; it just happens either per test or per
invasive timer iteration.
- Ideally we guess N automatically. 'Somebody-else's-problem' works but
creates labor for any new user of the Shootout test suite.
- invasive timers are usually more accurate, but are gruntwork to
implement
- we haven't checked if some language poses special implementation
difficulties
- it doesn't matter much if a slow language has a slow timer.
It only matters if a fast language has a slow timer.
- startup/shutdown times vary both by language and by test.
How much is unknown.
- test startup/shutdown times can be guessed by noninvasive means,
but only 2 invasive timers can prove them
- I'm not convinced that garbage collection presents a special
benchmarking difficulty.
Tests simply need to be longer no matter what the problem. Algorithms
can vary too.
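
To make the 'guess N automatically' item concrete, here is a rough sketch in Python, chosen purely for illustration. Everything named here (calibrate_n, probe_fraction, the injectable clock) is hypothetical and not part of the Shootout. It also assumes run time grows roughly linearly with N, which won't hold for every test.

```python
import time


def calibrate_n(workload, target_seconds=30.0, probe_fraction=0.1,
                clock=time.perf_counter, max_doublings=40):
    """Guess an N for which workload(N) runs for about target_seconds.

    Hypothetical helper: doubles N until a single probe run is long
    enough to be measured reliably, then scales N linearly to the target.
    """
    n = 1
    for _ in range(max_doublings):
        start = clock()
        workload(n)
        elapsed = clock() - start
        # Stop probing once a run is a meaningful fraction of the target.
        if elapsed >= target_seconds * probe_fraction:
            # Assumption: run time is roughly proportional to N.
            return max(1, round(n * target_seconds / elapsed))
        n *= 2
    # Never got a long enough probe; return the largest N tried.
    return n


if __name__ == "__main__":
    def busy_loop(n):
        total = 0
        for i in range(n):
            total += i * i
        return total

    # Short target so the demo finishes quickly.
    print(calibrate_n(busy_loop, target_seconds=1.0))
```

The probing runs cost about as much time as the final probe itself, so total calibration overhead stays near probe_fraction of the target. A test whose cost is not linear in N would need a different scaling step.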
Unless someone has further arguments, I think a wise course of action
would be to pick a language and a test that is expected to have the most
variance in startup/shutdown times. In other words, deal with a likely
worst case. Implement invasive timers for that language, see if it
makes much difference for test accuracy. Also deal with "how many
seconds should the test be?" while one is at it.
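
One way such an experiment could look, again sketched in Python only as an illustration: an invasive timer inside the test reports the time spent in the test body, and a noninvasive timer outside the process measures the whole run. The difference is an estimate of startup plus shutdown cost. The child program and the name measure_overhead are made up for this sketch, and a real Shootout test would take the child's place.

```python
import subprocess
import sys
import time

# Hypothetical test program with an invasive timer around its body.
CHILD = r"""
import time
t0 = time.perf_counter()
total = 0
for i in range(200000):
    total += i * i
print(time.perf_counter() - t0)
"""


def measure_overhead():
    """Return (wall, inner, overhead) in seconds for one child run."""
    t0 = time.perf_counter()
    result = subprocess.run([sys.executable, "-c", CHILD],
                            capture_output=True, text=True, check=True)
    wall = time.perf_counter() - t0       # noninvasive: whole process
    inner = float(result.stdout.strip())  # invasive: test body only
    return wall, inner, wall - inner      # overhead = startup + shutdown


if __name__ == "__main__":
    wall, inner, overhead = measure_overhead()
    print(f"wall={wall:.4f}s inner={inner:.4f}s overhead={overhead:.4f}s")
```

The overhead figure here also includes the cost of spawning the process from the parent, so it is an upper bound rather than a precise startup/shutdown time. Repeating the run and comparing the spread would show whether the variance matters for the chosen test length.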
Can anyone recommend what language + test should be the guinea pig?
Note I personally have a dependency: I need to get the Shootout working
on Windows. Alternatively I could set up Linux on a removable HD, but
that + learning unfamiliar tools takes time too, so I'd rather take a
stab at getting it to work on Windows first.
If the PITA factor of a Windows port turns out to be large, then I must
return to the question of standard scoring systems before bothering.
That means defining an 'LCD Benchmark' or 'Simple Benchmark', as opposed
to the current CRAPS system. Lotsa stuff we haven't ironed out there. Like
how many tests, of what kind, how they should be categorized, how they
should be weighted....
Cheers,                     www.indiegamedesign.com
Brandon Van Every           Seattle, WA
"The pioneer is the one with the arrows in his back."
- anonymous entrepreneur