[Shootout-list] Fix Regexp data

skaller skaller@users.sourceforge.net
06 May 2005 03:19:08 +1000


The supposedly correct output for the regex test is:

1: (111) 111-1111
2: (111) 222-2222
3: (111) 333-3333
4: (111) 444-4444
5: (111) 555-5555
6: (111) 666-6666
7: (111) 777-7777
8: (111) 888-8888
9: (111) 999-9999
10: (111) 000-0000
11: (111) 232-1111
12: (111) 242-1111


As mentioned MANY times in previous posts, this is WRONG.
The correct output is in fact:

1: (111) 111-1111
2: (111) 222-2222
3: (111) 333-3333
4: (111) 444-4444
5: (111) 555-5555
6: (111) 666-6666
7: (111) 777-7777
8: (111) 888-8888
9: (111) 999-9999
10: (111) 000-0000
11: (111) 232-1111
12: (111) 242-1111
13: (213) 222-2222

Line 13 above is wrongly rejected by most of the
programs because they all use the same
INCORRECT regexp.

The regexp used by Felix is correct.

The following languages should now FAIL this test.

SML MLton           0.44    1,020    0.44   767
OCaml               0.53      844    0.53    24
Scheme Bigloo       1.32    5,176    1.32    60
D Digital Mars      1.78      552    1.78    32
SML SML/NJ          1.83    3,476    1.77   766
Icon                1.97   14,108    1.96    23
AWK mawk            2.04   11,172    2.04    22
Pike                2.09    3,424    1.97    16
Perl                2.56    1,452    2.56    24
Clean               2.60    3,188    2.60   164
Java#2              2.73   11,664    2.56    42
OCaml (bytecode)    2.75    1,052    2.75    24
C gcc               2.98      420    2.98    98
Python Psyco        3.06    4,184    3.00    28
Python              3.23    2,244    3.20    27
Ada 95 GNAT         3.62    1,272    3.62    57
Lua                 3.67   11,496    3.67    16
Nice                3.83   18,492    3.54    36
Lisp Newlisp        3.93      828    3.92    22
Lisp librep         5.16    1,240    5.15    31
AWK gawk            5.72      924    5.72    22
Scheme Chicken      6.23    1,348    6.22    16
PHP                 6.55    3,484    6.52    21
Ruby                7.77    1,692    7.76    23
Scheme MZC         11.49    4,888   11.40    45
Scheme MzScheme    13.70    4,924   13.61    45
Forth GForth       13.79      792   13.78    78
Ada 95 GNAT#2      14.24    1,184   14.23    61
Scheme Guile       14.50    4,380   14.45    44
Tcl                14.96    1,568   14.95    11
Haskell GHC        23.38    1,816   23.38    29
Erlang             24.21    5,068   23.99    71

The following line, taken from the OCaml example:

 "[^0-9(]*" (* must be preceded by non-digit *)

shows what the error is. ( is NOT a digit.
[There is a similar error at the end too]
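
To make the difference concrete, here is a rough sketch in Python.
It is not the code of any shootout entry. The phone-number body
(BODY), the trailing check, the anchoring at the start of the line,
and the sample line "((213) 222-2222" are all assumptions made up
for illustration; only the leading class [^0-9(]* is taken from the
fragment quoted above.

```python
import re

# Assumed phone-number body: "(ddd) ddd-dddd" or "ddd ddd-dddd",
# followed by end of line or a non-digit. This is illustrative only.
BODY = r"(?:\(\d{3}\)|\d{3}) \d{3}[ -]\d{4}(?:$|[^0-9])"

# Leading class as quoted from the OCaml entry: excludes '(' as well as digits.
STRICT = re.compile(r"[^0-9(]*" + BODY)

# Leading class that excludes digits only, which is what the comment says.
LOOSE = re.compile(r"[^0-9]*" + BODY)


def accepts(pattern, line):
    """Return True if the pattern matches from the start of the line."""
    return pattern.match(line) is not None


# Made-up sample lines, not the real test input.
plain = "(111) 111-1111"
extra_paren = "((213) 222-2222"

for line in (plain, extra_paren):
    print(line, accepts(STRICT, line), accepts(LOOSE, line))
```

Under these assumptions both patterns accept the plain line, but
only the digits-only class accepts the line with a stray "(" in
front of the number, since "(" is a perfectly good non-digit.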

The Perl example is totally bogus. It tries to use
a backreference. This is nasty, wrong, and cannot
be made to work. 

I realise that this test is probably going to be
deprecated, but until it is, the correct test
data should be used.

------------------------

In addition, the testing script is buggy somehow.
The page says OCaml failed this test, but it is still
given a score.

regexmatch.ocaml_run %A

************************************************************
*   TEST (regexmatch.ocaml_run) FAILED:  regexmatch.ocaml_out differs from Output.100
************************************************************

OCaml 0.53 844 0.53 24

---------------------------

Finally, this is utterly unacceptable as a specification:

"Each program should be implemented the same way - 
the same way as this Perl program."

The original specification should be used;
it is also included in the 'about this test'.

--------------------------

A comment on the replacement of this test: as with
the numerical tests, it isn't enough to have just one
regexp test, especially one which exercises
a feature KNOWN to be unspecified, that is,
without proper mathematical foundations.


-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net