[Shootout-list] A different way of measuring

Brandon J. Van Every vanevery@indiegamedesign.com
Sat, 11 Sep 2004 11:50:24 -0700


Peter Hinely wrote:
>
> I think what the shootout really should do is measure the slope of the
> graph of CPU time vs. N iterations.

Hello.  I am new to the Shootout, but experienced in the technology and
marketing of OpenGL Viewperf / GLPerf benchmarking.  As a 3D device
driver writer once upon a time, I have lied with benchmarks
professionally.  :-)

You are proposing a particular kind of composite number.  All composite
numbers summarize stuff and gloss over details; that is their nature.
Thus you can always lie with them.  "We've got 120 on CDRS-3!"  Well,
that covers up all kinds of dirty deeds.  The only way you ever
understand a benchmark fully is to look at all the subtests, all the
graphs of all the different samples vs. memory plots, etc.  In short,
you only understand if you're an engineer putting a lot of time into
finding out exactly where someone lied.
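
A minimal sketch of how a composite can cover up the subtests, assuming a
plain weighted sum.  The subtest names, weights, and scores here are all
made up for illustration; they are not real Viewperf or CDRS definitions.

```python
# Hypothetical subtest weights, in percent (assumed, not from any real benchmark).
WEIGHTS = {"wireframe": 50, "shaded": 30, "textured": 20}


def composite(scores):
    """Weighted sum of subtest scores, using only addition and multiplication."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Two invented drivers: one is even across the board, the other is tuned
# hard for the heavily weighted subtest and is poor everywhere else.
driver_a = {"wireframe": 100, "shaded": 100, "textured": 100}
driver_b = {"wireframe": 180, "shaded": 30, "textured": 5}

if __name__ == "__main__":
    print("driver A composite:", composite(driver_a))
    print("driver B composite:", composite(driver_b))
    # Only the per-subtest view shows where they differ.
    for name in WEIGHTS:
        print(name, driver_a[name], driver_b[name])
```

Both invented drivers land on the same composite, and only the subtest
listing shows that one of them falls apart on textured rendering.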

It is tempting to think you can solve the problem by offering up some
more elegant, mathematically continuous composite number.  But you
cannot do so.  You are just changing the way you lie.

My concern about your particular composite number is that it's difficult to
understand.  At least on Viewperf, the way of constructing the composite
number is very simple.  If you look up the test definitions, you know
exactly how much each subtest is weighted.  So then it's easy to isolate
how people are cheating.  It's all just simple addition and
multiplication.  Your scheme, in contrast, requires an understanding of
calculus.  Many people do have that understanding, but many don't, or
have lost what they once had.  I am thinking of all the managers and
Pointy Haired Bosses who might someday make policy decisions based on
Shootout scores.
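
For comparison, here is a rough sketch of what measuring the slope of CPU
time vs. N could look like, assuming an ordinary least-squares line fit.
That choice of fit is an assumption, since the proposal does not say how
the slope would be computed, and the timing data is invented.

```python
def fitted_slope(ns, times):
    """Least-squares slope of times against ns (assumed fitting method)."""
    n_mean = sum(ns) / len(ns)
    t_mean = sum(times) / len(times)
    # Covariance of (N, time) divided by the variance of N.
    num = sum((n - n_mean) * (t - t_mean) for n, t in zip(ns, times))
    den = sum((n - n_mean) ** 2 for n in ns)
    return num / den


# Invented measurements: 0.5 s of fixed startup cost plus 0.002 s per iteration.
iterations = [1000, 2000, 4000, 8000]
cpu_seconds = [0.5 + 0.002 * n for n in iterations]

if __name__ == "__main__":
    # The slope ignores the fixed startup cost and reports only the
    # per-iteration cost.
    print("seconds per iteration:", fitted_slope(iterations, cpu_seconds))
```

The fit drops the fixed startup cost and keeps only the per-iteration
cost, which is the appeal of the idea, but a reader has to know what a
fitted slope is before the number means anything to them.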

I would greatly prefer a Keep It Simple Stupid composite scoring index.


Cheers,                         www.indiegamedesign.com
Brandon Van Every               Seattle, WA

20% of the world is real.
80% is gobbledygook we make up inside our own heads.