[Shootout-list] testbed normalization

Brandon J. Van Every vanevery@indiegamedesign.com
Tue, 28 Sep 2004 11:51:02 -0700


Brandon J. Van Every wrote:
> Bengt Kleberg wrote:
> >
> > i suggested that the run time for the fastest implementation on a
> > particular test should always be (about) 1 second.
>
> That will play havoc with historical archives.  Computers are going to
> get faster.  In 2008 some machine is going to be blowing the doors off
> of some test.  Are you just going to renormalize everything and make it
> difficult to compare with the results from 2004?  This is a subtle
> reason why additive systems are better.  In an additive system, you're
> just dealing with bigger and bigger numbers for 'work per second' as
> computers get faster.

Actually, computer-to-computer normalization is a difficulty here too.  I
am, for instance, running a 'dinosaur' 866 MHz Pentium III, with a
'modest' 512MB of RAM, on Windows 2000 SP4.  Essentially year 2000 HW in
2004.  Well, the 128MB GeForce4 Ti is a little more recent, but
definitely not a current card.  And I was doing fine with the 32MB
GeForce2; I've noticed no change.  I find my machine is still quite a
capable workhorse.  I've yet to run into a performance limitation for
general programming tasks; it's just not my dominant problem right now.
Learning newfangled languages and tools tends to be my dominant problem,
and they all seem to be fast enough on my system.

So, what to normalize against?  How about the performance of the C
benchmark on any given machine?
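
To put a number on it, here's a rough sketch of the kind of ratio I have
in mind; the timings are entirely made up, not anything from the actual
Shootout data:

# Normalize a language's time against the C time measured on the
# *same* machine, so raw hardware speed mostly cancels out.
def normalized_score(lang_seconds, c_seconds):
    # ratio > 1.0 means slower than the C entry on that machine
    return lang_seconds / c_seconds

# Hypothetical timings for one test on two very different machines:
# on my 866 MHz P3, C takes 4.0s and OCaml 5.0s; on a newer box,
# C takes 1.6s and OCaml 2.0s.  Same ratio either way.
print(normalized_score(5.0, 4.0))   # 1.25
print(normalized_score(2.0, 1.6))   # 1.25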

Uuuh, of course this becomes problematic once one creates a Garbage
Collection or Concurrency benchmark, as C has neither garbage collection
nor built-in concurrency.  I guess for garbage collection it would be
reasonable to normalize on Java, as historically that was the first
popular GC language for the masses.  I'm not convinced Smalltalk or Lisp
have ever been popular the way Java has become.

I suppose the choice of a normalization baseline for Concurrency is more
impolitic, as there's no clear candidate.  Well, one could make a choice
and renormalize on something else if it proves to be a bad one.  That
would mean recomputing the archives, however, which is a hassle.

What is the database format of the archives, incidentally?  Would it
be easy enough to 'change one field in a spreadsheet' to renormalize
everything, if such were needed?  Also, numerical accuracy would need to
be good; we don't want rounding errors to change historical results.  I
suppose if absolute scores are always stored, and a normalized view of
the data is generated from them, this problem is avoided.  You never
touch the original data.
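
In other words, something like the following toy sketch; the table of
times and the choice of C as baseline are invented purely for
illustration:

results = {
    # (test, language) -> absolute seconds; values are made up
    ("nbody", "c"):     4.0,
    ("nbody", "ocaml"): 5.2,
    ("nbody", "java"):  6.8,
}

def normalized_view(results, test, baseline="c"):
    # Derive the normalized figures on demand; the stored absolute
    # times are never rewritten, so renormalizing later just means
    # calling this with a different baseline.
    base = results[(test, baseline)]
    return {lang: secs / base
            for (t, lang), secs in results.items() if t == test}

print(normalized_view(results, "nbody"))
# {'c': 1.0, 'ocaml': 1.3, 'java': 1.7}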


Cheers,                     www.indiegamedesign.com
Brandon Van Every           Seattle, WA

"The pioneer is the one with the arrows in his back."
                          - anonymous entrepreneur