View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000927 | MetaPost | bug | public | 2015-03-05 20:46 | 2015-03-20 02:19 |
Reporter | toby | Assigned To | |||
Priority | normal | Severity | minor | Reproducibility | always |
Status | new | Resolution | open | ||
Platform | Intel | OS | OSX | OS Version | 10.9.2 |
Product Version | 1.890 | ||||
Summary | 0000927: normaldeviate is not ~ N(0,1) with new number systems | ||||
Description | Using the new number systems "decimal" or "double", I get unexpected results from `normaldeviate`. The expected behaviour is described on p.183 of the Metafont book. The random numbers returned are supposed to be distributed ~ N(0,1), ie "the so-called normal distributon with mean zero and variance one." The sample programme in the next box calculates the mean and variance for a sample of 1000 `normaldeviate` numbers. With the "old" number system you get the expected behaviour (mean about 0 and variance about 1). With either of the new number systems you get wildly extravagant variance. I'm using MP version 1.902 | ||||
Steps To Reproduce | sum := 0; sumsq := 0; n := 1000; for i=1 upto n: r := normaldeviate; sum := sum + r; sumsq := sumsq + r**2; endfor mu = sum/n; var = sumsq/n - mu**2; show mu, var; show mpversion; end. | ||||
Additional Information | With "mpost": >> 0.06042 >> 1.00555 >> "1.902" ) with "mpost -numbersystem=double" >> -0.65570211372579323 >> 765.47814649151951 >> "1.902" ) with "mpost -numbersystem=decimal" >> 0.5380032282704515183638352250419854 >> 213.6313527928249793656705740805164 >> "1.902" ) | ||||
Tags | No tags attached. | ||||
|
Looking at mp.w I see that the algorithm to generate normal deviates uses two constants. These are defined in mpmath.w:mp_initialize_scaled_math as mp_new_number (mp, &math->sqrt_8_e_k, mp_scaled_type); math->sqrt_8_e_k.data.val = 112429; /* $2^{16}\sqrt{8/e}\approx 112428.82793$ */ mp_new_number (mp, &math->twelve_ln_2_k, mp_fraction_type); math->twelve_ln_2_k.data.val = 139548960; /* $2^{24}\cdot12\ln2\approx139548959.6165$ */ but in mpmathdouble.w:mp_initialize_double_math we have mp_new_number (mp, &math->sqrt_8_e_k, mp_scaled_type); math->sqrt_8_e_k.data.dval = 112429 / 65536.0; /* $2^{16}\sqrt{8/e}\approx 112428.82793$ */ mp_new_number (mp, &math->twelve_ln_2_k, mp_fraction_type); math->twelve_ln_2_k.data.dval = 139548960 / 65536.0; /* $2^{24}\cdot12\ln2\approx13954895... but the second one is a *fraction* not a *scaled* number, so simply dividing by 2^16 gives the wrong value and hence the `normaldeviate` algorithm is all messed up. Scanning down the rest of this initialization function it appears that you have assumed *all* the fractions were |scaled| numbers - I guess that this is likely to be the source of many bugs with the new number systems. |
|
Thank you for the report. I cannot guarantee that I will able to fix for the next TL. |
|
If I've understood the code correctly the particular constant "twelve_ln_2_k" is required because we've passed a |fraction| to m_log but it expects a |scaled|. Therefore it computes mlog(4096u) = mlog(4096)+mlog(u) = 12 mlog(2) + mlog(u). We have then to subtract this from "twelve_ln_2" to get -mlog(u). Since -mlog(u) is -256 log(u) then the ab_vs_cd line is using 1024*-256 log(u) = -2^{18} log u which is (I think!) -4 log u as a scaled number (which is what we want). So (I think) the "twelve_ln_2_k" constant needs to be whatever 12 mlog(2) is with the new number systems. However it might be a better idea to simply rewrite the "norm_rand" routine for doubles and decimals so that they avoid all these complicated scaled vs fraction manoeuvres. Does the structure of the code allow you to do that? The Ratio Method Knuth is using is pretty simple without the complexities of scaled arithmetic. [Although I note that there was a mis-print in the old edition of Seminumerical Algorithms that I have - it has -4/log(u) instead of -4*log(u). This was corrected in the third edition] |
|
R4[Final test.] at 3.4.1 has -4ln(U) in my third Edition --- do you mean this one ? |
|
Yes, the third edition is correct - it should indeed be -4 ln(U) - some simple algebra shows this. Equation (24) has v <= 2u \sqrt{ ln (1/u) } so v^2 <= 4u^2 ln (1/u) and v^2/u^2 <= 4 ln (1/u) (since u^2 must be > 0) hence X^2 <= -4 ln(U) |
|
Here's a user land version (which is reasonably quick...) % Ratio method normal deviates, following TAOCP 3.4.1 Algorithm R, except that % the inner loop ensures u is not too small, so that xa does not overflow. vardef rm_normaldeviate = save u, v, xa; forever: forever: u := uniformdeviate 1; exitif (u>1/64); endfor v := sqrt(8/mexp(256)) * ( -1/2 + uniformdeviate 1 ); xa := v/u; exitif ( xa**2 <= -mlog(u)/64 ); endfor xa enddef; |
|
The problem is the function ab_vs_cd. In the double number system it uses a cast down to int, and this introduces mistakes. The best solution is to split as in the case of the function exp. |
|
Fixed in beta-0.80.0 |
Date Modified | Username | Field | Change |
---|---|---|---|
2015-03-05 20:46 | toby | New Issue | |
2015-03-05 21:36 | toby | Note Added: 0001320 | |
2015-03-09 11:01 | luigi scarso | Note Added: 0001323 | |
2015-03-10 17:47 | toby | Note Added: 0001324 | |
2015-03-10 17:57 | luigi scarso | Note Added: 0001325 | |
2015-03-10 22:19 | toby | Note Added: 0001326 | |
2015-03-11 00:18 | toby | Note Added: 0001327 | |
2015-03-18 14:20 | luigi scarso | Note Added: 0001341 | |
2015-03-20 02:19 | luigi scarso | Note Added: 0001342 |