ieee 754 - Converting from floating-point to decimal with floating-point computations -
i trying convert floating-point double-precision value x
decimal 12 (correctly rounded) significant digits. assuming x
between 10^110 , 10^111 such decimal representation of form x.xxxxxxxxxxxe110
. and, fun, trying use floating-point arithmetic only.
i arrived pseudo-code below, operations double-precision operations, notation 1e98
double nearest mathematical 10^98, , 1e98_2
double nearest result of mathematical subtraction 10^98-1e98
. notation fmadd(x * y + z)
fused multiply-add operation operands x
,y
, z
.
y = x * 2^-1074; // exact q = y / 1e98; // q denormal , significand of q interpreted // integer our candidate 12 decimal // digits of x r = fmadd(q * 1e98 - y); // close 1e98 * (error made during division) // if 1e98_2 >= 0, divided number smaller wished // correct answer may q or q+1. if (r , 1e98_2 have opposite signs) { return significand of q; } s = copysign(2^-1074, r); r1 = abs(r); r2 = abs(1e98_2); h = 1e98 * 0.5 * 2^-1074; set rounding mode downwards r3 = fmadd(r2 * q + r1); if (r3 < h) { return significand of q; } else { return significand of (q + s) }
i apologize confusion pervades above pseudo-code, not clear me yet, hence following questions:
does first fmadd work intended (to compute 1e98 * (error made during division))?
the signs. cannot convince myself right. cannot convince myself wrong either.
any idea, perhaps arguments, frequency algorithm might produce wrong result?
if works @ all, there chance algorithm continue work if “q = y / 1e98” changed “q = y * 1e-98” (leaving other instructions same)?
i have not tested algorithm. not have computer fmadd instruction, although hope find 1 can execute above.
let y/d
exact operation, , q=rnd(y/d)
result rounded nearest float.
true error multiplied d rt=(rnd(y/d)-y/d)*d=q*d-y
, operation performed fmadd r=rnd(q*d-y)
why q*d-y
exact (fmadd no final rounding) less clear explain, q*d
has limited number of bits (<nbits(q)+nbits(d)
), exponent of y
of q*d
(+/- 1) , since error |rt|<0.5*ulp(q)*d
, means first nbits(q)
vanishing... answers question 1.
so q*1e98 - y = r
, |r|*2^1074 <= 0.5e98 < 5*10^98
(2nd inequality lucky)
q*(10^98) - y = r + (10^98-1e98)*q
|10^98-1e98|*q*2^1074 <= 0.5e95
(assuming @ least 15 digits precision, log(2^53)/log(10) > 15
)
so ask whether |q*(10^98)-y|*2^1074>5*10^97
you have approximation of |q*(10^98)-y|
r+1e98_2*q
since |r| < 5*10^98
, , |r+(10^98-1e98)*q|<|r|
if signs opposite, think answers positively question 2. wouldn't sure if 1e98_2 < 0.
if r
, 1e98_2
have same sign might exceed 5*10^97
, further handling discussion of r3 = 1e98_2*q + r
versus h=0.5e98*2^-1074
for question 3, @ first sight, i'd 2 things might make algorithm fail:
1e98_2
not exact (10^98-1e98-1e98_2 = -3.6e63
approx.)and
h
notht=0.5*10^98*2^-1074
bit smaller saw above.
the true error r3t
approximately (1e98_2-3e63)*q + r < r3
(and case when >0 interesting us, because 1e98_2>0).
so approximation of error r3 falling above approximated tie h when true error r3t below true tie ht lead incorrect rounding. possible, , if yes how frequent question 3?
to mitigate above inequality risk, tried truncate magnitude of r3, r3 <= 1e98_2*q + r
. felt bit tired perform true analysis of error bounds...
so scanned error, , first failing example found 1.0000000001835e110 (i assume correctly rounded nearest double, in fact 1000000000183.49999984153799821120915424942630528225695526491963291846957919215885146546696544423465444842668032e98).
in case, r
, 1e98_2
have same sign, ,
(x/1e98) > 1000000000183.50000215
q
significand rounded1000000000184
r3>h
(r3*2^1074
approx. 5.000001584620017e97) , incorrectly incrementedq+s
, when should have beenq-s
, definitely bug.
my answers are:
yes,
r=fmadd(q * 1e98 - y)
1e98*(error made during division), don't care of division, it's providing guess, counts subtraction exact.yes, sign correct because
|r| < 5*10^98
, ,|r+(10^98-1e98)*q|<|r|
if signs opposite. wouldn't sure if 1e98_2 < 0.taking first failing example
(1.0000000001835e110 - 1.0e110)/1.0e110 ulp -> 1.099632e6
, very naive conjecture 1 case out of million, r3 falling on h... once q+s corrected q-s, occurence ofr3>h
whiler3t<ht
much smaller 1/1,000,000 in case... there more 10^15 doubles in range of interest, consider not serious answer...yes, discussion above solely guess q, independently of way produced, , subtraction in 1. still exact...
Comments
Post a Comment