0000927: normaldeviate is not ~ N(0,1) with new number systems - MantisBT

ID	Project	Category	View Status	Date Submitted	Last Update

0000927	MetaPost	bug	public	2015-03-05 19:46	2015-03-20 01:19

Reporter	toby	Assigned To
Priority	normal	Severity	minor	Reproducibility	always
Status	new	Resolution	open
Platform	Intel	OS	OSX	OS Version	10.9.2
Product Version	1.890

Summary	0000927: normaldeviate is not ~ N(0,1) with new number systems
Description	Using the new number systems "decimal" or "double", I get unexpected results from `normaldeviate`. The expected behaviour is described on p.183 of the Metafont book. The random numbers returned are supposed to be distributed ~ N(0,1), ie "the so-called normal distributon with mean zero and variance one." The sample programme in the next box calculates the mean and variance for a sample of 1000 `normaldeviate` numbers. With the "old" number system you get the expected behaviour (mean about 0 and variance about 1). With either of the new number systems you get wildly extravagant variance. I'm using MP version 1.902
Steps To Reproduce	sum := 0; sumsq := 0; n := 1000; for i=1 upto n: r := normaldeviate; sum := sum + r; sumsq := sumsq + r2; endfor mu = sum/n; var = sumsq/n - mu2; show mu, var; show mpversion; end.
Additional Information	With "mpost": >> 0.06042 >> 1.00555 >> "1.902" ) with "mpost -numbersystem=double" >> -0.65570211372579323 >> 765.47814649151951 >> "1.902" ) with "mpost -numbersystem=decimal" >> 0.5380032282704515183638352250419854 >> 213.6313527928249793656705740805164 >> "1.902" )
Tags	No tags attached.

toby 2015-03-05 20:36 reporter ~0001320	Looking at mp.w I see that the algorithm to generate normal deviates uses two constants. These are defined in mpmath.w:mp_initialize_scaled_math as mp_new_number (mp, &math->sqrt_8_e_k, mp_scaled_type); math->sqrt_8_e_k.data.val = 112429; /* $2^{16}\sqrt{8/e}\approx 112428.82793$ / mp_new_number (mp, &math->twelve_ln_2_k, mp_fraction_type); math->twelve_ln_2_k.data.val = 139548960; / $2^{24}\cdot12\ln2\approx139548959.6165$ / but in mpmathdouble.w:mp_initialize_double_math we have mp_new_number (mp, &math->sqrt_8_e_k, mp_scaled_type); math->sqrt_8_e_k.data.dval = 112429 / 65536.0; / $2^{16}\sqrt{8/e}\approx 112428.82793$ / mp_new_number (mp, &math->twelve_ln_2_k, mp_fraction_type); math->twelve_ln_2_k.data.dval = 139548960 / 65536.0; / $2^{24}\cdot12\ln2\approx13954895... but the second one is a fraction not a scaled number, so simply dividing by 2^16 gives the wrong value and hence the `normaldeviate` algorithm is all messed up. Scanning down the rest of this initialization function it appears that you have assumed all the fractions were \|scaled\| numbers - I guess that this is likely to be the source of many bugs with the new number systems.

luigi scarso 2015-03-09 10:01 manager ~0001323	Thank you for the report. I cannot guarantee that I will able to fix for the next TL.

toby 2015-03-10 16:47 reporter ~0001324	If I've understood the code correctly the particular constant "twelve_ln_2_k" is required because we've passed a \|fraction\| to m_log but it expects a \|scaled\|. Therefore it computes mlog(4096u) = mlog(4096)+mlog(u) = 12 mlog(2) + mlog(u). We have then to subtract this from "twelve_ln_2" to get -mlog(u). Since -mlog(u) is -256 log(u) then the ab_vs_cd line is using 1024-256 log(u) = -2^{18} log u which is (I think!) -4 log u as a scaled number (which is what we want). So (I think) the "twelve_ln_2_k" constant needs to be whatever 12 mlog(2) is with the new number systems. However it might be a better idea to simply rewrite the "norm_rand" routine for doubles and decimals so that they avoid all these complicated scaled vs fraction manoeuvres. Does the structure of the code allow you to do that? The Ratio Method Knuth is using is pretty simple without the complexities of scaled arithmetic. [Although I note that there was a mis-print in the old edition of Seminumerical Algorithms that I have - it has -4/log(u) instead of -4log(u). This was corrected in the third edition]

luigi scarso 2015-03-10 16:57 manager ~0001325	R4[Final test.] at 3.4.1 has -4ln(U) in my third Edition --- do you mean this one ?

toby 2015-03-10 21:19 reporter ~0001326	Yes, the third edition is correct - it should indeed be -4 ln(U) - some simple algebra shows this. Equation (24) has v <= 2u \sqrt{ ln (1/u) } so v^2 <= 4u^2 ln (1/u) and v^2/u^2 <= 4 ln (1/u) (since u^2 must be > 0) hence X^2 <= -4 ln(U)

toby 2015-03-10 23:18 reporter ~0001327	Here's a user land version (which is reasonably quick...) % Ratio method normal deviates, following TAOCP 3.4.1 Algorithm R, except that % the inner loop ensures u is not too small, so that xa does not overflow. vardef rm_normaldeviate = save u, v, xa; forever: forever: u := uniformdeviate 1; exitif (u>1/64); endfor v := sqrt(8/mexp(256)) * ( -1/2 + uniformdeviate 1 ); xa := v/u; exitif ( xa**2 <= -mlog(u)/64 ); endfor xa enddef;

luigi scarso 2015-03-18 13:20 manager ~0001341	The problem is the function ab_vs_cd. In the double number system it uses a cast down to int, and this introduces mistakes. The best solution is to split as in the case of the function exp.

luigi scarso 2015-03-20 01:19 manager ~0001342	Fixed in beta-0.80.0

Date Modified	Username	Field	Change
2015-03-05 19:46	toby	New Issue
2015-03-05 20:36	toby	Note Added: 0001320
2015-03-09 10:01	luigi scarso	Note Added: 0001323
2015-03-10 16:47	toby	Note Added: 0001324
2015-03-10 16:57	luigi scarso	Note Added: 0001325
2015-03-10 21:19	toby	Note Added: 0001326
2015-03-10 23:18	toby	Note Added: 0001327
2015-03-18 13:20	luigi scarso	Note Added: 0001341
2015-03-20 01:19	luigi scarso	Note Added: 0001342