[Shootout-list] mandelbrot problem

John Skaller skaller@users.sourceforge.net
Thu, 16 Jun 2005 16:37:44 +1000


On Wed, 2005-06-15 at 17:02 -0700, Stephen Weeks wrote:
> The mandelbrot benchmark is flawed because it uses floating point
> computations with exact comparisons to yield bits in the output and
> then requires an exact match on the output.

This point has already been discussed: the current 'acceptance
test' is too strict.

>  On x86, floating point
> numbers are kept internally at 80 bits and writes to memory round them
> to 64 bits; hence, the specified correct output of the benchmark is an
> accident of how the compiler that generated that executable made
> certain low-level decisions (register allocation, instruction
> selection, ...) that affect reads/writes of doubles to memory.

See below please ..

> As evidence of the problem, observe that mandelbrot.mlton produces
> different output depending on whether the program is compiled with
> "-ieee-fp false" or "-ieee-fp true", where the latter forces writes to
> memory after every floating point operation.

.. if that is so, it is a bug in MLton and you should file a bug
report. There is a requirement to round after every operation.
That isn't the same as writing to memory. I provide the man page
data for gcc version 4.0 for you below.

As you can see there are a LOT of switches
to help trade off standards conformance with performance.

None of them 'writes the result to memory after every operation',
though; that sounds like a code generator hack.
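
For what it is worth, the 'write to memory after every operation'
hack can be imitated by hand in C. The following is only a sketch:
round_to_double and escape_count are names made up for illustration,
not code from the Shootout or from MLton, and the loop is just a
generic Mandelbrot escape test.

```c
/* Hypothetical helper: push x through a volatile object so the
   compiler has to store it as a 64-bit double and reload it,
   discarding any extra precision held in a wider register. */
static double round_to_double(double x)
{
    volatile double tmp = x;
    return tmp;
}

/* Hypothetical Mandelbrot escape test, z = z*z + c, with every
   intermediate result forced through memory.  Returns the number
   of iterations completed before |z|^2 exceeded 4, or max_iter. */
static int escape_count(double cr, double ci, int max_iter)
{
    double zr = 0.0, zi = 0.0;
    int i;

    for (i = 0; i < max_iter; i++) {
        double zr2 = round_to_double(zr * zr);
        double zi2 = round_to_double(zi * zi);
        double diff, prod, new_zr;

        if (round_to_double(zr2 + zi2) > 4.0)
            break;

        diff   = round_to_double(zr2 - zi2);
        new_zr = round_to_double(diff + cr);
        prod   = round_to_double(zr * zi);
        zi     = round_to_double(round_to_double(2.0 * prod) + ci);
        zr     = new_zr;
    }
    return i;
}
```

With this, escape_count(0.0, 0.0, 50) runs all 50 iterations, while
escape_count(2.0, 2.0, 50) stops after the first. Note that storing
an 80 bit result as 64 bits rounds a second time, so in rare cases
this may still differ from rounding each operation directly to
double, which is why I call it a hack.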

I note in passing .. who said how big a float must be used?
I noticed the C Takfp test was using float, not double .. I changed
Felix from using doubles to floats .. and lo, speed improved
dramatically. OCaml uses doubles and is still marginally
faster though .. :]

And finally, a 'hack' to solve the problem would be to
compare floats as characters .. by requiring the output
to be text formatted to 'rounded to so many decimal places'
as with a %10.3f format in C or something similar. This would
require no changes in the current Shootout .. the output just
wouldn't be very interesting to look at :)
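
A rough sketch of that text comparison in C, assuming the %10.3f
format mentioned above. The helper names format_rounded and
same_as_text are invented for this example and are not part of
any Shootout harness.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print x rounded to three decimal places.
   snprintf truncates safely if the text does not fit in buf. */
static void format_rounded(double x, char *buf, size_t n)
{
    snprintf(buf, n, "%10.3f", x);
}

/* Hypothetical helper: returns 1 when a and b produce the same
   text at three decimal places, 0 otherwise. */
static int same_as_text(double a, double b)
{
    char sa[64], sb[64];

    format_rounded(a, sa, sizeof sa);
    format_rounded(b, sb, sizeof sb);
    return strcmp(sa, sb) == 0;
}
```

Here same_as_text(0.1 + 0.2, 0.3) gives 1 even if the two values
differ in their last bits, while same_as_text(1.0, 1.001) gives 0.
Two values that straddle a rounding boundary would still print
differently, so this makes mismatches much rarer rather than
impossible.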

       -ffloat-store
           Do not store floating point variables in registers,
           and inhibit other options that might change
           whether a floating point value is taken from a
           register or memory.

           This option prevents undesirable excess precision
           on machines such as the 68000 where the floating
           registers (of the 68881) keep more precision than
           a "double" is supposed to have.  Similarly for the
           x86 architecture.  For most programs, the excess
           precision does only good, but a few programs rely
           on the precise definition of IEEE floating point.
           Use -ffloat-store for such programs, after modifying
           them to store all pertinent intermediate computations
           into variables.

       -ffast-math
           Sets -fno-math-errno, -funsafe-math-optimizations,
           -fno-trapping-math, -ffinite-math-only,
           -fno-rounding-math, -fno-signaling-nans and
           -fcx-limited-range.

           This option causes the preprocessor macro
           "__FAST_MATH__" to be defined.

           This option should never be turned on by any -O
           option since it can result in incorrect output for
           programs which depend on an exact implementation
           of IEEE or ISO rules/specifications for math
           functions.

       -fno-math-errno
           Do not set ERRNO after calling math functions that
           are executed with a single instruction, e.g.,
           sqrt.  A program that relies on IEEE exceptions
           for math error handling may want to use this flag
           for speed while maintaining IEEE arithmetic
           compatibility.

           This option should never be turned on by any -O
           option since it can result in incorrect output for
           programs which depend on an exact implementation
           of IEEE or ISO rules/specifications for math
           functions.

           The default is -fmath-errno.

       -funsafe-math-optimizations
           Allow optimizations for floating-point arithmetic
           that (a) assume that arguments and results are
           valid and (b) may violate IEEE or ANSI standards.
           When used at link-time, it may include libraries
           or startup files that change the default FPU control
           word or other similar optimizations.

           This option should never be turned on by any -O
           option since it can result in incorrect output for
           programs which depend on an exact implementation
           of IEEE or ISO rules/specifications for math
           functions.

           The default is -fno-unsafe-math-optimizations.

       -ffinite-math-only
           Allow optimizations for floating-point arithmetic
           that assume that arguments and results are not
           NaNs or +-Infs.

           This option should never be turned on by any -O
           option since it can result in incorrect output for
           programs which depend on an exact implementation
           of IEEE or ISO rules/specifications.

           The default is -fno-finite-math-only.

       -fno-trapping-math
           Compile code assuming that floating-point operations
           cannot generate user-visible traps.  These
           traps include division by zero, overflow, underflow,
           inexact result and invalid operation.  This
           option implies -fno-signaling-nans.  Setting this
           option may allow faster code if one relies on
           "non-stop" IEEE arithmetic, for example.

           This option should never be turned on by any -O
           option since it can result in incorrect output for
           programs which depend on an exact implementation
           of IEEE or ISO rules/specifications for math
           functions.

           The default is -ftrapping-math.

       -frounding-math
           Disable transformations and optimizations that
           assume default floating point rounding behavior.
           This is round-to-zero for all floating point to
           integer conversions, and round-to-nearest for all
           other arithmetic truncations.  This option should
           be specified for programs that change the FP
           rounding mode dynamically, or that may be executed
           with a non-default rounding mode.  This option
           disables constant folding of floating point
           expressions at compile-time (which may be affected
           by rounding mode) and arithmetic transformations
           that are unsafe in the presence of sign-dependent
           rounding modes.

           The default is -fno-rounding-math.

           This option is experimental and does not currently
           guarantee to disable all GCC optimizations that
           are affected by rounding mode.  Future versions of
           GCC may provide finer control of this setting
           using C99's "FENV_ACCESS" pragma.  This command
           line option will be used to specify the default
           state for "FENV_ACCESS".

       -fsignaling-nans
           Compile code assuming that IEEE signaling NaNs may
           generate user-visible traps during floating-point
           operations.  Setting this option disables optimizations
           that may change the number of exceptions
           visible with signaling NaNs.  This option implies
           -ftrapping-math.

           This option causes the preprocessor macro
           "__SUPPORT_SNAN__" to be defined.

           The default is -fno-signaling-nans.

           This option is experimental and does not currently
           guarantee to disable all GCC optimizations that
           affect signaling NaN behavior.

       -fsingle-precision-constant
           Treat floating point constant as single precision
           constant instead of implicitly converting it to
           double precision constant.

       -fcx-limited-range
       -fno-cx-limited-range
           When enabled, this option states that a range
           reduction step is not needed when performing complex
           division.  The default is -fno-cx-limited-range,
           but is enabled by -ffast-math.

           This option controls the default setting of the
           ISO C99 "CX_LIMITED_RANGE" pragma.  Nevertheless,
           the option applies to all languages.




-- 
John Skaller <skaller at users dot sf dot net>
Download Felix: http://felix.sf.net