[Shootout-list] Directions of various benchmarks

Thu, 02 Jun 2005 05:29:30 +1000

On Tue, 2005-05-31 at 17:09 +0200, Bengt Kleberg wrote:
> On 2005-05-25 14:28, John Skaller wrote:
> ...deleted
> > In my opinion, on the benchmarking side (as opposed
> > to the web site), it is quite easy to write any
> > code to do anything. The HARD part is design (as usual).
> 
> how about the following file system layout?

That's one possibility. Good to actually code stuff though.
I have a Python script that does benchmarking now.
It isn't complete, of course. Here is the output:

Rosella 2005/06/02 04:04 ocamlb takfp 2 0.0063
Rosella 2005/06/02 04:04 gcc_3_4 takfp 5 0.0052
Rosella 2005/06/02 04:04 gccopt_4_0 takfp 6 0.0070
Rosella 2005/06/02 04:04 gccopt_3_4 ack 4 0.0049
Rosella 2005/06/02 04:04 ocamlopt takfp 7 0.0109
Rosella 2005/06/02 04:04 gccopt_4_0 ack 5 0.0050
Rosella 2005/06/02 04:04 gcc_3_4 takfp 8 0.0440
Rosella 2005/06/02 04:04 gcc_3_4 takfp 9 0.2575
Rosella 2005/06/02 04:04 gcc_3_4 ack 6 0.0084
Rosella 2005/06/02 04:04 ocamlb ack 8 0.2846
Rosella 2005/06/02 04:04 ocamlopt ack 9 0.0411
Rosella 2005/06/02 04:04 ocamlb ack 10 4.5203
Rosella 2005/06/02 04:04 gccopt_4_0 takfp 10 0.7318
Rosella 2005/06/02 04:04 felix ack 10 0.1298
Rosella 2005/06/02 04:04 gcc_4_0 ack 12 9.4169
Rosella 2005/06/02 04:04 gccopt_3_4 ack 12 2.2309
Rosella 2005/06/02 04:04 gcc_3_4 takfp 10 1.9112
Rosella 2005/06/02 04:04 gcc_3_4 takfp 10 1.8929
Rosella 2005/06/02 04:04 felix takfp 10 1.0539
Rosella 2005/06/02 04:04 gcc_3_4 ack 12 9.3695
Rosella 2005/06/02 04:04 ocamlopt ack 12 2.4685
Rosella 2005/06/02 04:04 gccopt_4_0 takfp 11 4.8558
Rosella 2005/06/02 04:04 gccopt_3_4 ack 11 0.5327
Rosella 2005/06/02 04:04 gcc_3_4 takfp 10 1.9130
Rosella 2005/06/02 04:04 gccopt_4_0 ack 11 0.6415

Some notes now: the first field is the hostname,
the second the test date, the third the test time.
The fourth is the translator key, the fifth the test key,
the sixth the value of n, and the last field is the elapsed time.
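
As an illustration of the field layout just described, here is a minimal
sketch that turns one result line into a typed record. It is written for
Python 3 (the script further down is Python 2), and the record and field
names (Result, host, clock, xlator and so on) are made up for this
example; the output format itself only fixes the order of the fields.

```python
from collections import namedtuple

# Field names are illustrative; the format only defines the field order.
Result = namedtuple("Result", "host date clock xlator test n elapsed")

def parse_result(line):
    """Split one result line on single spaces and convert n and the time."""
    host, date, clock, xlator, test, n, elapsed = line.rstrip("\n").split(" ")
    return Result(host, date, clock, xlator, test, int(n), float(elapsed))

r = parse_result("Rosella 2005/06/02 04:04 ocamlb takfp 2 0.0063\n")
print(r.xlator, r.test, r.n, r.elapsed)
```

A line with a different number of fields raises ValueError in this
sketch, which is one simple way to notice a damaged results file.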

Here is what the test process does, roughly:

We start with a minimum and maximum allowed time per test,
and a minimum and maximum initial n value (one pair for
each test).

The procedure randomly picks a translator, test, and n,
and measures the time.

If the time is too low, the minimum n is raised to n + 1.
If the time is too high, the maximum n is lowered to n - 1.
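
The range update can be sketched as a small function. This mirrors the
last few lines of the main loop in the script further down; the function
name adjust_range is invented for this example, the defaults are the
script's mintime and maxtime, and the code is Python 3.

```python
def adjust_range(fst, lst, n, elapsed, mintime=0.5, maxtime=5.0):
    """Return the new (fst, lst) range of n after one timed run."""
    if elapsed > maxtime:
        lst = n - 1      # too slow: do not pick an n this large again
    if elapsed < mintime:
        fst = n + 1      # too fast: do not pick an n this small again
    if fst > lst:
        lst = fst        # keep the range non-empty
    return fst, lst

print(adjust_range(1, 5, 3, 0.01))   # (4, 5)
print(adjust_range(1, 5, 5, 0.2))    # (6, 6)
print(adjust_range(4, 12, 12, 9.4))  # (4, 11)
```

The second call shows how the range moves upward: once the minimum
passes the maximum, the maximum is dragged along with it.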

The test process runs until a total elapsed time has 
expired, a test crashes, or you press Ctrl-C.

The results of each test are appended to a single file,
which accumulates results for ever. If someone else 
does some tests they can mail you the file as an attachment
and you can just append it to other data you have.

The test procedure is such that a modification to
allow for concurrency may be possible.
--------------------------------------------------

The key properties of this procedure are:

(a) it is automatically adaptive: it raises the minimum n
for tests that run too quickly, and it finds the maximum
n value automatically.

(b) it kills tests that exceed a time limit

(c) it runs for a fixed amount of time and stops

(d) it can also be stopped with a Ctrl-C

(e) It runs the tests randomly to avoid any
biases such as pre-loaded cache memory
from a previous test

(f) the results are infinitely cumulative

(g) it can merge results from different
architectures.

(h) the set of tests, source files,
and translators can be varied at any time

(i) the procedure measures *real* time

(j) The procedure does not measure memory use

(k) The procedure does not check the results

(l) The procedure does not analyse the results

(m) The data can be parsed by 'readline' and then 'split'
on a single space.
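
The procedure itself does no analysis, but as an example of how little
is needed to consume the data, here is a hedged sketch that reads result
lines, splits each on a single space, and keeps the fastest time seen
for each (host, translator, test, n) combination. The function name
fastest and the choice of "minimum time" as the summary are assumptions
made for this example; the code is Python 3.

```python
import io

def fastest(lines):
    """Map (host, xlator, test, n) to the smallest elapsed time seen."""
    best = {}
    for line in lines:
        fields = line.rstrip("\n").split(" ")
        if len(fields) != 7:
            continue                      # skip blank or malformed lines
        host, _date, _clock, xlator, test, n, elapsed = fields
        key = (host, xlator, test, int(n))
        t = float(elapsed)
        if key not in best or t < best[key]:
            best[key] = t
    return best

# Three lines taken from the output above; a real run would pass an
# open results file instead of this in-memory stand-in.
sample = io.StringIO(
    "Rosella 2005/06/02 04:04 gcc_3_4 takfp 10 1.9112\n"
    "Rosella 2005/06/02 04:04 gcc_3_4 takfp 10 1.8929\n"
    "Rosella 2005/06/02 04:04 felix takfp 10 1.0539\n"
)
for key, t in sorted(fastest(sample).items()):
    print(key, t)
```

Because the host name is part of the key, files from different machines
can be appended together and still be summarised separately.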

----------------------------------------------------
Here is how the measurement is done: grab the current time.

Launch two processes without waiting:
(1) the test, and
(2) a 'sleep' command, which is set a few seconds above
the maximum allowable time.

Then wait for one of the child processes to return, and
record the termination time.

Kill the other process and wait for it.
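
For comparison, here is a minimal Python 3 sketch of the same idea.
Note that it swaps the technique: instead of racing the test against a
'sleep' child, it uses the timeout argument of subprocess.Popen.wait,
which did not exist when the script below was written. The function name
timed_run and the stand-in test commands are invented for this example.

```python
import subprocess
import sys
import time

def timed_run(argv, limit):
    """Run argv; return (elapsed, returncode), or (elapsed, None) on timeout."""
    start = time.time()
    child = subprocess.Popen(argv)
    try:
        code = child.wait(timeout=limit)
    except subprocess.TimeoutExpired:
        child.kill()
        child.wait()     # reap the killed child
        code = None
    return time.time() - start, code

# Stand-in "tests": one Python child that exits at once, one that sleeps.
quick = [sys.executable, "-c", "pass"]
slow = [sys.executable, "-c", "import time; time.sleep(30)"]
print(timed_run(quick, 5.0))
print(timed_run(slow, 0.5))
```

As in the original procedure, the time measured is real (wall clock)
time and includes process start-up.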

--------------------------------------------------
The script is written in Python, and requires
a configuration file that looks like this:
---------------------------------------------------

# this config file defines default translators on your platform
# to be used by the performance test module
#
# The file 'config/speed_xlators.py' can be edited,
# it will not be clobbered once created
#
# define the translators
def mk_gcc_3_4(k,p):
  return "gcc-3.4 -o speed/exes/%s/%s speed/src/c/%s.c" % (k,p,p)

def mk_gcc_3_4_opt(k,p):
  x = "gcc-3.4 -O3 -fomit-frame-pointer "
  x = x + "-o speed/exes/%s/%s speed/src/c/%s.c" % (k,p,p)
  return x

def mk_gcc_4_0(k,p):
  return "gcc-4.0 -o speed/exes/%s/%s speed/src/c/%s.c" % (k,p,p)

def mk_gcc_4_0_opt(k,p):
  x = "gcc-4.0 -O3 -fomit-frame-pointer "
  x = x + "-o speed/exes/%s/%s speed/src/c/%s.c" % (k,p,p)
  return x

def mk_ocamlopt(k,p):
  return "ocamlopt.opt -o speed/exes/%s/%s speed/src/ocaml/%s.ml" % (k,p,p)

def mk_ocamlb(k,p):
  return "ocamlc.opt -o speed/exes/%s/%s speed/src/ocaml/%s.ml" % (k,p,p)

def mk_felix(k,p):
  x = "bin/flx --test --force --static --optimise -c -DFLX_PTF_STATIC_POINTER "
  x = x + "speed/src/felix/%s && " % p
  x = x + "mv speed/src/felix/%s speed/exes/%s/%s" % (p,k,p)
  return x

xlators = [
  ('felix',mk_felix,'felix'),
  ('gcc_3_4',mk_gcc_3_4,'c'),
  ('gccopt_3_4',mk_gcc_3_4_opt,'c'),
  ('gcc_4_0',mk_gcc_4_0,'c'),
  ('gccopt_4_0',mk_gcc_4_0_opt,'c'),
  ('ocamlopt',mk_ocamlopt,'ocaml'),
  ('ocamlb',mk_ocamlb,'ocaml'),
]
------------------------------------------------
The file layout is:

speed/src/<language>/<tests>
speed/exes/<xlator>/<executables>

And here is the actual script:
------------------------------------
import os
import time
import sys
import random
import signal
import socket

# utility to make a directory
def mkdir(p):
  try: os.mkdir(p)
  except OSError: pass  # most likely the directory already exists

execfile("config/speed_xlators.py")

# make the output directories
mkdir('speed/exes')
for key,mk,src in xlators:
  mkdir('speed/exes/'+key)

#define the tests
tests = {
  'ack':(1,5),
  'takfp':(1,5),
}

#compile the programs
for xl,mk,src in xlators:
  for test in tests.keys():
    fst,last = tests[test]
    cmd = mk(xl,test)
    os.system(cmd)

# maximum allowed time per test, seconds
maxtime = 5.0
mintime = 0.5
sleep_time = 10

#total time for testing, seconds:
max_test_time = 200.0
start_time = time.time()

#hostname
hostname = socket.gethostname()

#test date
etime = time.time()
ltime = time.localtime(etime)
date = time.strftime("%Y/%m/%d %H:%M",ltime)

ntests = len(tests)
nlators = len(xlators)

pid = 0
spid = 0

f = open("speed/results.dat","at")
while time.time() - start_time < max_test_time:
  # pick a random test and a random translator
  test_ix = random.randint(0,ntests-1)
  xlator_ix = random.randint(0,nlators-1)
  test = tests.keys()[test_ix]
  fst,lst = tests[test]
  xl,mk,src = xlators[xlator_ix]
  n = random.randint(fst,lst+1)

  test_file = "speed/exes/"+xl+"/"+test
  test_arg = "%d" % n

  print test_file,test_arg
  start = time.time()
  pid = os.spawnl(os.P_NOWAIT,test_file,"DUMMY",test_arg)
  spid = os.spawnlp(os.P_NOWAIT,"sleep","DUMMY","%d" % sleep_time)
  pidx,status = os.wait()
  finish = time.time()
  elapsed = finish - start

  if pidx == spid:
    os.kill(pid,signal.SIGKILL)
    print "TIMEOUT"
  else:
    os.kill(spid,signal.SIGKILL)
  os.wait()

  if pidx == pid:
    signalled = os.WIFSIGNALED(status)
    exited = os.WIFEXITED(status)
    if not (signalled or exited):
      print "WHAT?? Neither exited nor signalled?"
      sys.exit(2)

    if signalled:
      sig = os.WTERMSIG(status)
      if sig == signal.SIGINT:
        raise KeyboardInterrupt

      if sig == signal.SIGSEGV:
        print "SEGMENTATION FAULT, TERMINATING"
        sys.exit(1)

      if sig != 0:
        print "UNKNOWN SIGNAL",sig,": TERMINATING"
        sys.exit(1)
    else:
      if status == 0:
        x = hostname + " " + date + " %s %s %d %6.4f" % (xl,test,n,elapsed)
        print x
        f.write(x+"\n")
        f.flush()
      else:
        ret = os.WEXITSTATUS(status)
        if ret != 0:
          print "TEST RETURNED ERROR CODE ",ret,": TERMINATING"
          sys.exit(1)
        else:
          print "WHAT?? Exit code is 0 and not 0?"
          sys.exit(2)

  if elapsed > maxtime: lst = n - 1
  if elapsed < mintime: fst = n + 1
  if fst > lst: lst = fst
  tests[test] = (fst,lst)

f.close()

-----------------------------------

-- 
John Skaller, skaller at users.sf.net
PO Box 401 Glebe, NSW 2037, Australia Ph:61-2-96600850 
Download Felix here: http://felix.sf.net