Floating point bug? Or Feature?
| Tim Mensch |
Posted: Sat Mar 08, 2008 6:59 am |
Can someone tell me why this isn't a bug?
double foo = INT_MAX ;
double bar = 0 ;
double bax = foo + bar ;
ASSERT(foo==INT_MAX); // This succeeds; easily enough precision to hold 2^31
ASSERT(bax==INT_MAX); // This fails!? INT_MAX+0 == INT_MAX+1!?
I built this on Visual C++ 2005 with /fp:precise ("precise" floating point). It generated very simple floating-point assembly (this is from a debug build, but it fails in both debug and release):
double foo = INT_MAX ;
00513C21 fld qword ptr [__real@41dfffffffc00000 (576208h)]
00513C27 fstp qword ptr [ebp-48h]
double bar = 0 ;
00513C2A fldz
00513C2C fstp qword ptr [ebp-58h]
double bax = foo + bar ;
00513C2F fld qword ptr [ebp-48h]
00513C32 fadd qword ptr [ebp-58h]
00513C35 fstp qword ptr [ebp-68h]
Sure enough, when the fadd is executed, the stack floating point value goes from INT_MAX to INT_MAX+1, even though it's adding a zero. This happens on both an Intel Pentium Core2Duo and an AMD Turion64 (running in 32-bit mode).
Please CC my email address when replying, or it may take me a while to notice. Yes, it's a real address as-is, at least for now--if it ends up blocked in the future, or I don't reply to it, then change the number and it will work again.
Thanks in advance!
Tim Mensch |
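The constant name in the listing, `__real@41dfffffffc00000`, appears to carry the raw bit pattern of the double being loaded. A minimal sketch for checking that, assuming IEEE 754 binary64 doubles; the helper name `bits_to_double` is hypothetical and not part of the original post:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret a 64-bit pattern as a double.
   Assumes double is IEEE 754 binary64 with the same byte order as uint64_t. */
static double bits_to_double(uint64_t bits)
{
    double value;
    assert(sizeof value == sizeof bits);   /* both must be 8 bytes */
    memcpy(&value, &bits, sizeof value);   /* well-defined way to copy the bits */
    return value;
}
```

Under those assumptions, `bits_to_double(0x41dfffffffc00000ULL)` is exactly 2147483647.0, and the next power of two, `0x41e0000000000000ULL`, is exactly 2147483648.0, so the stored constant itself is not the problem.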
| glen herrmannsfeldt |
Posted: Mon Mar 10, 2008 2:35 am |
Tim Mensch wrote:
Quote: Can someone tell me why this isn't a bug?
double foo = INT_MAX ;
double bar = 0 ;
double bax = foo + bar ;
ASSERT(foo==INT_MAX); // This succeeds; easily enough precision to hold 2^31
ASSERT(bax==INT_MAX); // This fails!? INT_MAX+0 == INT_MAX+1!?
Does it depend on the rounding mode?
Can you store the temporary real (80 bit) internal values and show
the hex value of them?
-- glen
| Quadibloc |
Posted: Wed Mar 12, 2008 7:35 pm |
On Mar 7, 6:59 pm, "Tim Mensch" <tim-usenet-4...@bitgems.com> wrote:
Quote: Can someone tell me why this isn't a bug?
double foo = INT_MAX ;
double bar = 0 ;
double bax = foo + bar ;
ASSERT(foo==INT_MAX); // This succeeds; easily enough precision to hold 2^31
ASSERT(bax==INT_MAX); // This fails!? INT_MAX+0 == INT_MAX+1!?
That is definitely unexpected behavior on most computing systems.
Unlike most decimal fractions, integers of this size are represented
exactly in floating-point: integers are typically 32 bits long, while
IEEE 754 double precision occupies 64 bits, with 11 bits for the
exponent and a 53-bit significand.
Even if INT_MAX turned out to be 2^63-1 because your system used 64-
bit integers, so that the floats involved were not exact, adding zero
ought not to have produced a result differing from the original
number, so both assertions should have either succeeded or failed
together. |
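The exactness argument can be checked directly with a round trip through `double`. This is only a sketch, assuming IEEE 754 binary64 doubles and the default round-to-nearest mode; `is_exact_in_double` is a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: does n survive a round trip through double?
   Restricted to |n| < 2^62 so the cast back to int64_t cannot overflow. */
static int is_exact_in_double(int64_t n)
{
    assert(n > -(INT64_C(1) << 62) && n < (INT64_C(1) << 62));
    volatile double d = (double)n;   /* volatile forces a real 64-bit store */
    return (int64_t)d == n;
}
```

With a 53-bit significand, every 32-bit integer passes this check, including 2147483647. The first positive integer that fails is 2^53 + 1 (9007199254740993), which rounds to 2^53.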
| purnnamu |
Posted: Sat Mar 15, 2008 2:00 am |
On Mar 7, 7:59 pm, "Tim Mensch" <tim-usenet-4...@bitgems.com> wrote:
Quote: Can someone tell me why this isn't a bug?
double foo = INT_MAX ;
double bar = 0 ;
double bax = foo + bar ;
ASSERT(foo==INT_MAX); // This succeeds; easily enough precision to hold 2^31
ASSERT(bax==INT_MAX); // This fails!? INT_MAX+0 == INT_MAX+1!?
I tested the code in my environments (Intel Core 2 Duo, VC++ 6.0 and
gcc 3.xx 32-bit).
It works correctly in both environments.
With some compilers, the FPU performs all calculations in 80-bit
extended precision, and the results are then converted to the 64-bit
format by the FST instruction.
I have run into cases where the output differs from strict IEEE-754
double arithmetic because of this 80-bit internal operation
(in my case, gcc 3.xx 32-bit).
However, your code works correctly in the gcc 3.xx 32-bit environment.
I checked.
I don't know why you got the bizarre result in your environment.
I also agree that your code should work as you expected.
Inwook
----------- the tested code -----------
#include <assert.h>
#include <limits.h>
#include <stdio.h>   /* printf */
int main(void)
{
    double foo = INT_MAX;
    double bar = 0;
    double bax = foo + bar;
    if (foo == INT_MAX) printf("foo=INT_MAX\n");
    if (bax == INT_MAX) printf("bax=INT_MAX\n");
    assert(foo == INT_MAX);
    assert(bax == INT_MAX);
    return 0;
} |
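Whether a given compiler keeps intermediates in the 80-bit format can be queried from C99's `FLT_EVAL_METHOD` macro in `<float.h>`. A small sketch of reading it follows; the helper names and the description strings are made up for illustration, and the macro reports the compiler's evaluation model rather than the FPU's run-time precision setting:

```c
#include <float.h>

/* Hypothetical helper: turn a FLT_EVAL_METHOD value into a description. */
static const char *eval_method_name(int method)
{
    switch (method) {
    case 0:  return "each operation is rounded to its own type";
    case 1:  return "float and double are evaluated as double";
    case 2:  return "all operations are evaluated as long double";
    default: return "indeterminable or implementation-defined";
    }
}

/* Description for the compiler and flags this file is built with. */
static const char *current_eval_method(void)
{
    return eval_method_name(FLT_EVAL_METHOD);
}
```

A value of 2 is what one would expect from a 32-bit x87 build of the kind described above, while SSE2-based builds typically report 0.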
| Tim Mensch |
Posted: Sat Mar 15, 2008 6:21 am |
First, thanks for all the replies. I didn't notice them at first; I need to get a more convenient news reader set up.
When I again attempted to reproduce the problem, it actually worked correctly this time. A bit more experimentation leads me to the conclusion that the error only happens after initializing a CppUnit class: CppUnit::TextTestRunner. Does anyone know of something it might be doing that would change the behavior of the FPU? (forcing it to use 32-bit floats internally, perhaps?) My assembly language skills don't extend to the arcana of current Intel FPU modes.
To answer a few questions:
1. When I was first debugging the problem, I was able to watch the floating point registers in the debugger, and I could see the floating point add change the value from INT_MAX to INT_MAX+1:
+2.1474836470000000e+0009 changed to
+2.1474836480000000e+0009
And to answer a related question about precision, the raw hex representation of the 64-bit double is:
41dfffffffc00000 for INT_MAX
and
41e0000000000000 for INT_MAX+1, so clearly there's plenty of precision to spare.
2. I'm building with 32-bit integers.
3. I'm building on Visual Studio 2005.
4. I'm embarrassed to admit I don't know how to set the rounding mode. My code (that currently works) is using the compiler flag /fp:precise, though /fp:fast and /fp:strict both also work.
Thanks again,
Tim |
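The guess about reduced internal precision would at least be consistent with the symptom: if every x87 result were rounded to a 24-bit significand, INT_MAX (which needs 31 bits) could not survive an add, even an add of zero. The sketch below only imitates that rounding with a cast to `float`; it assumes IEEE 754 binary32 and round-to-nearest, uses a hypothetical helper name, and says nothing about what CppUnit actually does:

```c
#include <assert.h>
#include <float.h>

/* Hypothetical helper: round a double to a 24-bit significand, standing in
   for an FPU whose precision control is assumed to be set to single. */
static double round_to_single(double x)
{
    assert(FLT_MANT_DIG == 24);           /* IEEE 754 binary32 assumed */
    volatile float narrowed = (float)x;   /* volatile blocks excess precision */
    return narrowed;
}
```

Here `round_to_single(2147483647.0)` gives 2147483648.0, the same INT_MAX to INT_MAX+1 jump seen in the debugger. On Visual C++, `_controlfp` with the `_MCW_PC` mask is the documented way to inspect or reset the x87 precision control; it is left out of the sketch to keep it portable.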
| glen herrmannsfeldt |
Posted: Mon Mar 17, 2008 1:19 am |
Tim Mensch wrote:
(snip)
Quote: 4. I'm embarrassed to admit I don't know how to set the rounding mode. My code (that currently works) is using the compiler flag /fp:precise, though /fp:fast and /fp:strict both also work.
This was for another question, but it shows how to set precision
and rounding modes on some systems.
#include <stdio.h>
#include <assert.h>
#include <limits.h>
#include <floatingpoint.h>
int main() {
    double foo = 1.0;
    double bar = 0.9999999999999;
    double bax;
    int i, j;
    fpsetround(FP_RP);
    printf("%d\n", fpgetround());
    for (i = 0; i < 4; i++)
        for (j = 0; j < 4; j++) {
            fpsetround(i);
            fpsetprec(j);
            bax = foo - bar;
            printf("%20.14e %d %d\n", bax, fpgetround(), fpgetprec());
        }
}
This is on my FreeBSD system, but other gcc systems
don't have them.
-- glen |