Conversation
|
@codex proved it, but you need to modify the constants in order for it to work. The present constants are unsound. |
Thanks a lot. |
Gas Snapshot Comparison Report
|
|
@duncancmt can you again verify new impl? Thanks a lots. |
duncancmt
left a comment
There was a problem hiding this comment.
The formal verification checks out.
| // shifted estimate is 0. | ||
| let b := sub(255, clz(x)) | ||
| z := or(shr(7, shl(div(b, 3), byte(add(mod(b, 3), 29), 0x90b5e5))), 1) | ||
| // final error. This gives >97 bits of precision after only 5 |
There was a problem hiding this comment.
The comment is subtly wrong. The worst-case error analysis bound is 91 bits, not 97. This doesn't matter for implementation correctness, though
| // final error. This gives >97 bits of precision after only 5 | |
| // final error. This gives >91 bits of precision after only 5 |
There was a problem hiding this comment.
@duncancmt are u sure? As per my it is 97 bit accuracy
Best formula: c_s = (89 + 26*s) / 128, applied with s = mod(n+2, 3)
case n mod 3 = 0:
n = 3k + 0
z0 = 2^floor((3k+0+2)/3) = 2^(k+0) = 1 * 2^k
x^(1/3) in [1.0000, 1.2599) * 2^k
ratio r in (0.7937, 1.0000]
error eps in (-0.2063, +0.0000]
worst |eps| = 0.2063 (2.28 bits)
with constant multiplier:
s = (n+2) mod 3 = (3k+2) mod 3 = 2
c[s] = (89 + 26*2) / 128 = 141/128 = 1.101562
new ratio r in (0.8743, 1.1016]
new eps in (-0.1257, +0.1016]
worst |eps| = 0.1257 (2.99 bits)
case n mod 3 = 1:
n = 3k + 1
z0 = 2^floor((3k+1+2)/3) = 2^(k+1) = 2 * 2^k
x^(1/3) in [1.2599, 1.5874) * 2^k
ratio r in (1.2599, 1.5874]
error eps in (+0.2599, +0.5874]
worst |eps| = 0.5874 (0.77 bits)
with constant multiplier:
s = (n+2) mod 3 = (3k+3) mod 3 = 0
c[s] = (89 + 26*0) / 128 = 89/128 = 0.695312
new ratio r in (0.8760, 1.1037]
new eps in (-0.1240, +0.1037]
worst |eps| = 0.1240 (3.01 bits)
case n mod 3 = 2:
n = 3k + 2
z0 = 2^floor((3k+2+2)/3) = 2^(k+1) = 2 * 2^k
x^(1/3) in [1.5874, 2.0000) * 2^k
ratio r in (1.0000, 1.2599]
error eps in (+0.0000, +0.2599]
worst |eps| = 0.2599 (1.94 bits)
with constant multiplier:
s = (n+2) mod 3 = (3k+4) mod 3 = 1
c[s] = (89 + 26*1) / 128 = 115/128 = 0.898438
new ratio r in (0.8984, 1.1320]
new eps in (-0.1016, +0.1320]
worst |eps| = 0.1320 (2.92 bits)
Optimal integer multipliers (by s = mod(b, 3)):
c[0] = 89/128 = 0.695312
c[1] = 115/128 = 0.898438
c[2] = 141/128 = 1.101562
Worst |eps_0| (with best c) = 0.1320 (2.92 bits)
Newton trace from worst eps_0 = +0.1320:
step 0: eps = +1.3196e-01, bits = 2.92
step 1: eps = +1.4786e-02, bits = 6.08
step 2: eps = +2.1439e-04, bits = 12.19
step 3: eps = +4.5948e-08, bits = 24.38
step 4: eps = +2.1112e-15, bits = 48.75
step 5: eps = +4.4573e-30, bits = 97.50 <-- target
There was a problem hiding this comment.
The sign of the relative error matters because there is an un-squared term in the relative error recurrence relation
def f(e):
return e**2 * (3 + 2 * e) / (3 * (1 + e) ** 2)
def f5(e):
return f(f(f(f(f(e)))))
def log_2(x, prec=Decimal('0.0001')):
return (x.ln() / Decimal(2).ln()).quantize(prec)
es = [Decimal('-0.12396'), Decimal('+0.10374'), Decimal('-0.10156'), Decimal('+0.13196'), Decimal('-0\
.12569'), Decimal('+0.10156')]
max([log_2(f5(abs(e))) for e in es])
# Decimal('-97.5018')
max([log_2(f5(e)) for e in es])
# Decimal('-91.8558')
the culprit is probably the stack scheduling because now |
duncancmt
left a comment
There was a problem hiding this comment.
Looks good. Formal verification checks out.
thanks a lot. |

Description
smaller bytecode and less gas
@duncancmt can you please formally verify this impl.
this is will give up to >94 bits in 5 iteration
Python script : https://gist.github.com/atarpara/6b9c18596fba67ebf17b93e33c7901db
Checklist
Ensure you completed all of the steps below before submitting your pull request:
forge fmt?forge test?Pull requests with an incomplete checklist will be thrown out.