[Shootout-list] mandelbrot problem
John Skaller
skaller@users.sourceforge.net
Thu, 16 Jun 2005 16:37:44 +1000
On Wed, 2005-06-15 at 17:02 -0700, Stephen Weeks wrote:
> The mandelbrot benchmark is flawed because it uses floating point
> computations with exact comparisons to yield bits in the output and
> then requires an exact match on the output.
This point has already been discussed: the current 'acceptance
test' is too strict.
> On x86, floating point
> numbers are kept internally at 80 bits and writes to memory round them
> to 64 bits; hence, the specified correct output of the benchmark is an
> accident of how the compiler that generated that executable made
> certain low-level decisions (register allocation, instruction
> selection, ...) that affect reads/writes of doubles to memory.
See below please ..
> As evidence of the problem, observe that mandelbrot.mlton produces
> different output depending on whether the program is compiled with
> "-ieee-fp false" or "-ieee-fp true", where the latter forces writes to
> memory after every floating point operation.
.. if that is so, it is a bug in MLton and you should file a bug
report. There is a requirement to round after every operation.
That isn't the same as writing to memory. I provide the man page
data for gcc version 4.0 for you below.
As you can see there are a LOT of switches
to help trade off standards conformance against performance.
None of them 'writes the result to memory after every operation'
though; that sounds like a code generator hack.
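To make the excess-precision effect under discussion concrete, here is a minimal C sketch. It is my own illustration, not what MLton or gcc emit, and the helper names are hypothetical. It assumes IEEE double with a 53-bit significand, and it uses volatile purely as a portable way to force each sum to be rounded to its declared type. Where long double is the x87 80-bit format (64-bit significand), 1 + 2^-60 survives in the wider type but is rounded away in a double.

```c
#include <float.h>

/* Hypothetical helpers, for illustration only. */

/* Returns 1 if adding 2^-60 to 1.0 is lost once the sum is rounded to
   double. volatile forces the sum to be stored as a double, discarding
   any excess precision a register might have held. */
int tiny_lost_in_double(void)
{
    volatile double one = 1.0;
    volatile double tiny = 0x1p-60;       /* C99 hex float literal: 2^-60 */
    volatile double sum = one + tiny;
    return sum == 1.0;
}

/* The same sum carried out at long double precision. */
int tiny_lost_in_long_double(void)
{
    volatile long double one = 1.0L;
    volatile long double tiny = 0x1p-60L;
    volatile long double sum = one + tiny;
    return sum == 1.0L;
}

/* Returns 1 if long double has enough significand bits (61 or more)
   to represent 1 + 2^-60 exactly on this platform. */
int long_double_can_hold_sum(void)
{
    return LDBL_MANT_DIG >= 61;
}
```

On a platform where long double is no wider than double, both helpers report the tiny term as lost, so the difference only shows up where a wider format is really in use.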
I note in passing .. who said how big a float must be used?
I noted the C Takfp test was using float not double .. I changed
Felix from using double to floats .. and lo speed improved
dramatically. Ocaml uses doubles and is still marginally
faster though .. :]
And finally, a 'hack' to solve the problem would be to
compare floats as characters .. by requiring the output
to be text formatted to 'rounded to so many decimal places'
as with a %10.3f format in C or something similar. This would
require no changes in the current Shootout .. the output just
wouldn't be very interesting to look at :)
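A minimal sketch of that text-comparison idea in C, assuming the %10.3f format mentioned above. The helper name same_when_rounded is hypothetical and not part of any Shootout harness.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if a and b produce identical text under
   "%10.3f", 0 otherwise. snprintf truncates safely if a value is too
   large for the buffer. */
int same_when_rounded(double a, double b)
{
    char sa[64];
    char sb[64];

    snprintf(sa, sizeof sa, "%10.3f", a);
    snprintf(sb, sizeof sb, "%10.3f", b);
    return strcmp(sa, sb) == 0;
}
```

Two values that differ only in their low-order bits usually compare equal this way, for example 1.0 and 1.0 + 1e-12. It is not airtight: two values that straddle a rounding boundary of the chosen format can still print differently.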
-ffloat-store
Do not store floating point variables in registers,
and inhibit other options that might change
whether a floating point value is taken from a
register or memory.
This option prevents undesirable excess precision
on machines such as the 68000 where the floating
registers (of the 68881) keep more precision than
a "double" is supposed to have. Similarly for the
x86 architecture. For most programs, the excess
precision does only good, but a few programs rely
on the precise definition of IEEE floating point.
Use -ffloat-store for such programs, after modifying
them to store all pertinent intermediate computations
into variables.
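As an illustration of "store all pertinent intermediate computations into variables", here is a sketch of a Mandelbrot-style escape loop written that way. It is not the Shootout program: the function name, the parameter names and the usual |z|^2 > 4 escape test are my assumptions. Whether the output then matches bit for bit across compilers still depends on the flags used.

```c
/* Hypothetical helper: number of iterations of z = z*z + c performed
   before |z|^2 exceeds 4, capped at max_iter. Every intermediate result
   is assigned to a named double, so that an option such as -ffloat-store
   has a variable to round through at each step. */
int escape_count(double cr, double ci, int max_iter)
{
    double zr = 0.0;
    double zi = 0.0;
    int i;

    for (i = 0; i < max_iter; i++) {
        double zr2 = zr * zr;
        double zi2 = zi * zi;
        double mag2 = zr2 + zi2;
        if (mag2 > 4.0)
            return i;

        double zrzi = zr * zi;
        double twice = zrzi + zrzi;   /* 2 * zr * zi */
        double diff = zr2 - zi2;      /* zr^2 - zi^2 */
        zi = twice + ci;
        zr = diff + cr;
    }
    return max_iter;
}
```

For instance, the point c = 0 never escapes and returns max_iter, while c = 1 escapes after three iterations.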
-ffast-math
Sets -fno-math-errno, -funsafe-math-optimizations,
-fno-trapping-math, -ffinite-math-only,
-fno-rounding-math, -fno-signaling-nans and
-fcx-limited-range.
This option causes the preprocessor macro
"__FAST_MATH__" to be defined.
This option should never be turned on by any -O
option since it can result in incorrect output for
programs which depend on an exact implementation
of IEEE or ISO rules/specifications for math
functions.
-fno-math-errno
Do not set ERRNO after calling math functions that
are executed with a single instruction, e.g.,
sqrt. A program that relies on IEEE exceptions
for math error handling may want to use this flag
for speed while maintaining IEEE arithmetic
compatibility.
This option should never be turned on by any -O
option since it can result in incorrect output for
programs which depend on an exact implementation
of IEEE or ISO rules/specifications for math
functions.
The default is -fmath-errno.
-funsafe-math-optimizations
Allow optimizations for floating-point arithmetic
that (a) assume that arguments and results are
valid and (b) may violate IEEE or ANSI standards.
When used at link-time, it may include libraries
or startup files that change the default FPU control
word or other similar optimizations.
This option should never be turned on by any -O
option since it can result in incorrect output for
programs which depend on an exact implementation
of IEEE or ISO rules/specifications for math
functions.
The default is -fno-unsafe-math-optimizations.
-ffinite-math-only
Allow optimizations for floating-point arithmetic
that assume that arguments and results are not
NaNs or +-Infs.
This option should never be turned on by any -O
option since it can result in incorrect output for
programs which depend on an exact implementation
of IEEE or ISO rules/specifications.
The default is -fno-finite-math-only.
-fno-trapping-math
Compile code assuming that floating-point operations
cannot generate user-visible traps. These
traps include division by zero, overflow, underflow,
inexact result and invalid operation. This
option implies -fno-signaling-nans. Setting this
option may allow faster code if one relies on
"non-stop" IEEE arithmetic, for example.
This option should never be turned on by any -O
option since it can result in incorrect output for
programs which depend on an exact implementation
of IEEE or ISO rules/specifications for math
functions.
The default is -ftrapping-math.
-frounding-math
Disable transformations and optimizations that
assume default floating point rounding behavior.
This is round-to-zero for all floating point to
integer conversions, and round-to-nearest for all
other arithmetic truncations. This option should
be specified for programs that change the FP
rounding mode dynamically, or that may be executed
with a non-default rounding mode. This option
disables constant folding of floating point
expressions at compile-time (which may be affected
by rounding mode) and arithmetic transformations
that are unsafe in the presence of sign-dependent
rounding modes.
The default is -fno-rounding-math.
This option is experimental and does not currently
guarantee to disable all GCC optimizations that
are affected by rounding mode. Future versions of
GCC may provide finer control of this setting
using C99's "FENV_ACCESS" pragma. This command
line option will be used to specify the default
state for "FENV_ACCESS".
-fsignaling-nans
Compile code assuming that IEEE signaling NaNs may
generate user-visible traps during floating-point
operations. Setting this option disables optimizations
that may change the number of exceptions
visible with signaling NaNs. This option implies
-ftrapping-math.
This option causes the preprocessor macro
"__SUPPORT_SNAN__" to be defined.
The default is -fno-signaling-nans.
This option is experimental and does not currently
guarantee to disable all GCC optimizations that
affect signaling NaN behavior.
-fsingle-precision-constant
Treat floating point constant as single precision
constant instead of implicitly converting it to
double precision constant.
-fcx-limited-range
-fno-cx-limited-range
When enabled, this option states that a range
reduction step is not needed when performing complex
division. The default is -fno-cx-limited-range,
but it is enabled by -ffast-math.
This option controls the default setting of the
ISO C99 "CX_LIMITED_RANGE" pragma. Nevertheless,
the option applies to all languages.
--
John Skaller <skaller at users dot sf dot net>
Download Felix: http://felix.sf.net