Floating point bug? Or Feature?

Author Message
Tim Mensch
Posted: Sat Mar 08, 2008 6:59 am
Guest
Can someone tell me why this isn't a bug?

double foo = INT_MAX ;
double bar = 0 ;
double bax = foo + bar ;

ASSERT(foo==INT_MAX); // This succeeds; easily enough precision to hold 2^31
ASSERT(bax==INT_MAX); // This fails!? INT_MAX+0 == INT_MAX+1!?

I built this on Visual C++ 2005 with /fp:precise ("precise" floating point). It created very simple floating-point assembly language (this is from a debug build, but it fails in both debug and release):

double foo = INT_MAX ;
00513C21 fld qword ptr [__real@41dfffffffc00000 (576208h)]
00513C27 fstp qword ptr [ebp-48h]
double bar = 0 ;
00513C2A fldz
00513C2C fstp qword ptr [ebp-58h]
double bax = foo + bar ;
00513C2F fld qword ptr [ebp-48h]
00513C32 fadd qword ptr [ebp-58h]
00513C35 fstp qword ptr [ebp-68h]

Sure enough, when the fadd is executed, the stack floating point value goes from INT_MAX to INT_MAX+1, even though it's adding a zero. This happens on both an Intel Pentium Core2Duo and an AMD Turion64 (running in 32-bit mode).

Please CC my email address when replying, or it may take me a while to notice. Yes, it's a real address as-is, at least for now--if it ends up blocked in the future, or I don't reply to it, then change the number and it will work again. :)

Thanks in advance!

Tim Mensch
 
glen herrmannsfeldt
Posted: Mon Mar 10, 2008 2:35 am
Guest
Tim Mensch wrote:

Quote:
Can someone tell me why this isn't a bug?

double foo = INT_MAX ;
double bar = 0 ;
double bax = foo + bar ;

ASSERT(foo==INT_MAX); // This succeeds; easily enough precision to hold 2^31
ASSERT(bax==INT_MAX); // This fails!? INT_MAX+0 == INT_MAX+1!?

Does it depend on the rounding mode?

Can you store the temporary real (80 bit) internal values and show
the hex value of them?

-- glen


 
Quadibloc
Posted: Wed Mar 12, 2008 7:35 pm
Guest
On Mar 7, 6:59 pm, "Tim Mensch" <tim-usenet-4...@bitgems.com> wrote:
Quote:
Can someone tell me why this isn't a bug?

      double foo = INT_MAX ;
      double bar = 0 ;
      double bax = foo + bar ;

      ASSERT(foo==INT_MAX);  // This succeeds; easily enough precision to hold 2^31
      ASSERT(bax==INT_MAX); // This fails!?  INT_MAX+0 == INT_MAX+1!?

That is definitely unexpected behavior on most computing systems.

Unlike most decimal fractions, integers of moderate size are
represented exactly in floating-point. Integers are typically 32 bits
long, while IEEE 754 double-precision floating-point occupies 64 bits,
with 11 bits for the exponent and a 53-bit significand, so every
32-bit integer fits exactly.

Even if INT_MAX turned out to be 2^63-1 because your system used
64-bit integers, so that the floats involved were not exact, adding
zero ought not to have produced a result differing from the original
number, so both assertions should have either succeeded or failed
together.
 
purnnamu
Posted: Sat Mar 15, 2008 2:00 am
Guest
On Mar 7, 7:59 pm, "Tim Mensch" <tim-usenet-4...@bitgems.com> wrote:
Quote:
Can someone tell me why this isn't a bug?

      double foo = INT_MAX ;
      double bar = 0 ;
      double bax = foo + bar ;

      ASSERT(foo==INT_MAX);  // This succeeds; easily enough precision to hold 2^31
      ASSERT(bax==INT_MAX); // This fails!?  INT_MAX+0 == INT_MAX+1!?

I built this on Visual C++ 2005 with /fp:precise ("precise" floating point). It created very simple floating-point assembly language (this is from a debug build, but it fails in both debug and release):

      double foo = INT_MAX ;
00513C21  fld         qword ptr [__real@41dfffffffc00000 (576208h)]
00513C27  fstp        qword ptr [ebp-48h]
      double bar = 0 ;
00513C2A  fldz
00513C2C  fstp        qword ptr [ebp-58h]
      double bax = foo + bar ;
00513C2F  fld         qword ptr [ebp-48h]
00513C32  fadd        qword ptr [ebp-58h]
00513C35  fstp        qword ptr [ebp-68h]

Sure enough, when the fadd is executed, the stack floating point value goes from INT_MAX to INT_MAX+1, even though it's adding a zero. This happens on both an Intel Pentium Core2Duo and an AMD Turion64 (running in 32-bit mode).

Please CC my email address when replying, or it may take me a while to notice. Yes, it's a real address as-is, at least for now--if it ends up blocked in the future, or I don't reply to it, then change the number and it will work again. :)

Thanks in advance!

Tim Mensch


I tested the code in my environments (Intel Core 2 Duo, VC++ 6.0 and
gcc 3.xx, 32-bit). It works correctly in both.

In some compiler environments, the FPU carries out all calculations in
80-bit extended precision, and the results are converted to the 64-bit
format only when they are stored with an FST instruction. I have run
into cases where the output differed from what IEEE 754 double
arithmetic would give because of this 80-bit internal operation (in my
case, with gcc 3.xx, 32-bit).

However, your code works correctly in the gcc 3.xx 32-bit environment;
I checked.

I don't know why you got the bizarre result in your environment.
I agree that your code should work as you expected.

Inwook

----------- the tested code -----------

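#include <limits.h>
#include <assert.h>
#include <stdio.h>   /* needed for printf */

int main(void)
{
    double foo = INT_MAX;
    double bar = 0;
    double bax = foo + bar;

    /* Report each comparison before asserting it. */
    if (foo == INT_MAX) printf("foo=INT_MAX\n");
    if (bax == INT_MAX) printf("bax=INT_MAX\n");

    assert(foo == INT_MAX);
    assert(bax == INT_MAX);
    return 0;
}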
 
Tim Mensch
Posted: Sat Mar 15, 2008 6:21 am
Guest
First, thanks for all the replies. I didn't notice them at first; I need to get a more convenient news reader set up.

When I again attempted to reproduce the problem, it actually worked correctly this time. A bit more experimentation leads me to the conclusion that the error only happens after initializing a CppUnit class: CppUnit::TextTestRunner. Does anyone know of something it might be doing that would change the behavior of the FPU? (forcing it to use 32-bit floats internally, perhaps?) My assembly language skills don't extend to the arcana of current Intel FPU modes.

To answer a few questions:

1. When I was first debugging the problem, I was able to watch the floating point registers in the debugger, and I could see the floating point add cause the displayed value to change from INT_MAX to INT_MAX+1:

+2.1474836470000000e+0009 changed to
+2.1474836480000000e+0009

And to answer a related question about precision, the raw representation in hex of the 64-bit double is:

41dfffffffc00000 for INT_MAX
and
41e0000000000000 for INT_MAX+1, so clearly there's plenty of precision to spare.

2. I'm building with 32-bit integers.

3. I'm building on Visual Studio 2005.

4. I'm embarrassed to admit I don't know how to set the rounding mode. My code (that currently works) is using the compiler flag /fp:precise, though /fp:fast and /fp:strict both also work.

Thanks again,

Tim
 
glen herrmannsfeldt
Posted: Mon Mar 17, 2008 1:19 am
Guest
Tim Mensch wrote:
(snip)

Quote:
4. I'm embarrassed to admit I don't know how to set the rounding mode. My code (that currently works) is using the compiler flag /fp:precise, though /fp:fast and /fp:strict both also work.

This was for another question, but it shows how to set precision
and rounding modes on some systems.

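#include <stdio.h>
#include <assert.h>
#include <limits.h>
#include <floatingpoint.h>   /* FreeBSD: fpsetround(), fpsetprec(), ... */

int main() {
    double foo = 1.0;
    double bar = 0.9999999999999;
    double bax;
    int i, j;

    /* Select rounding toward positive infinity and print the mode read back. */
    fpsetround(FP_RP);
    printf("%d\n", fpgetround());

    /* Try every rounding mode (i) with every precision setting (j). */
    for (i = 0; i < 4; i++) for (j = 0; j < 4; j++) {
        fpsetround(i);
        fpsetprec(j);
        bax = foo - bar;
        printf("%20.14e %d %d\n", bax, fpgetround(), fpgetprec());
    }
}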

This is on my FreeBSD system; other gcc systems
don't have these functions.

-- glen
 
 