View Issue Details

IDProjectCategoryView StatusLast Update
0000927MetaPostbugpublic2015-03-20 02:19
Reportertoby Assigned To 
PrioritynormalSeverityminorReproducibilityalways
Status newResolutionopen 
PlatformIntelOSOSXOS Version10.9.2
Product Version1.890 
Summary0000927: normaldeviate is not ~ N(0,1) with new number systems
DescriptionUsing the new number systems "decimal" or "double", I get unexpected results from `normaldeviate`.
The expected behaviour is described on p.183 of the Metafont book. The random numbers returned are supposed to be distributed ~ N(0,1), ie "the so-called normal distributon with mean zero and variance one."

The sample programme in the next box calculates the mean and variance for a sample of 1000 `normaldeviate` numbers. With the "old" number system you get the expected behaviour (mean about 0 and variance about 1). With either of the new number systems you get wildly extravagant variance.

I'm using MP version 1.902
Steps To Reproducesum := 0;
sumsq := 0;
n := 1000;
for i=1 upto n:
  r := normaldeviate;
  sum := sum + r;
  sumsq := sumsq + r**2;
  endfor
mu = sum/n;
var = sumsq/n - mu**2;
show mu, var;
show mpversion;
end.
Additional InformationWith "mpost":
>> 0.06042
>> 1.00555
>> "1.902" )

with "mpost -numbersystem=double"
>> -0.65570211372579323
>> 765.47814649151951
>> "1.902" )

with "mpost -numbersystem=decimal"
>> 0.5380032282704515183638352250419854
>> 213.6313527928249793656705740805164
>> "1.902" )
TagsNo tags attached.

Activities

toby

2015-03-05 21:36

reporter   ~0001320

Looking at mp.w I see that the algorithm to generate normal deviates uses two constants.

These are defined in mpmath.w:mp_initialize_scaled_math as

  mp_new_number (mp, &math->sqrt_8_e_k, mp_scaled_type);
  math->sqrt_8_e_k.data.val = 112429; /* $2^{16}\sqrt{8/e}\approx 112428.82793$ */
  mp_new_number (mp, &math->twelve_ln_2_k, mp_fraction_type);
  math->twelve_ln_2_k.data.val = 139548960; /* $2^{24}\cdot12\ln2\approx139548959.6165$ */

but in mpmathdouble.w:mp_initialize_double_math we have

  mp_new_number (mp, &math->sqrt_8_e_k, mp_scaled_type);
  math->sqrt_8_e_k.data.dval = 112429 / 65536.0; /* $2^{16}\sqrt{8/e}\approx 112428.82793$ */
  mp_new_number (mp, &math->twelve_ln_2_k, mp_fraction_type);
  math->twelve_ln_2_k.data.dval = 139548960 / 65536.0; /* $2^{24}\cdot12\ln2\approx13954895...

but the second one is a *fraction* not a *scaled* number, so simply dividing by 2^16 gives the wrong value and hence the `normaldeviate` algorithm is all messed up.

Scanning down the rest of this initialization function it appears that you have assumed *all* the fractions were |scaled| numbers - I guess that this is likely to be the source of many bugs with the new number systems.

luigi scarso

2015-03-09 11:01

manager   ~0001323

Thank you for the report. I cannot guarantee that I will able to fix for the next TL.

toby

2015-03-10 17:47

reporter   ~0001324

If I've understood the code correctly the particular constant "twelve_ln_2_k" is required because we've passed a |fraction| to m_log but it expects a |scaled|. Therefore it computes mlog(4096u) = mlog(4096)+mlog(u) = 12 mlog(2) + mlog(u). We have then to subtract this from "twelve_ln_2" to get -mlog(u).

Since -mlog(u) is -256 log(u) then the ab_vs_cd line is using 1024*-256 log(u) = -2^{18} log u which is (I think!) -4 log u as a scaled number (which is what we want).

So (I think) the "twelve_ln_2_k" constant needs to be whatever 12 mlog(2) is with the new number systems.

However it might be a better idea to simply rewrite the "norm_rand" routine for doubles and decimals so that they avoid all these complicated scaled vs fraction manoeuvres. Does the structure of the code allow you to do that?

The Ratio Method Knuth is using is pretty simple without the complexities of scaled arithmetic. [Although I note that there was a mis-print in the old edition of Seminumerical Algorithms that I have - it has -4/log(u) instead of -4*log(u). This was corrected in the third edition]

luigi scarso

2015-03-10 17:57

manager   ~0001325

R4[Final test.] at 3.4.1 has -4ln(U) in my third Edition --- do you mean this one ?

toby

2015-03-10 22:19

reporter   ~0001326

Yes, the third edition is correct - it should indeed be -4 ln(U) - some simple algebra shows this.
Equation (24) has v <= 2u \sqrt{ ln (1/u) }

so v^2 <= 4u^2 ln (1/u)

and v^2/u^2 <= 4 ln (1/u) (since u^2 must be > 0)

hence X^2 <= -4 ln(U)

toby

2015-03-11 00:18

reporter   ~0001327

Here's a user land version (which is reasonably quick...)

% Ratio method normal deviates, following TAOCP 3.4.1 Algorithm R, except that
% the inner loop ensures u is not too small, so that xa does not overflow.
vardef rm_normaldeviate =
    save u, v, xa;
    forever:
        forever:
            u := uniformdeviate 1;
            exitif (u>1/64);
        endfor
        v := sqrt(8/mexp(256)) * ( -1/2 + uniformdeviate 1 );
        xa := v/u;
        exitif ( xa**2 <= -mlog(u)/64 );
    endfor
    xa
enddef;

luigi scarso

2015-03-18 14:20

manager   ~0001341

The problem is the function ab_vs_cd. In the double number system it uses a cast down to int, and this introduces mistakes. The best solution is to split as in the case of the function exp.

luigi scarso

2015-03-20 02:19

manager   ~0001342

Fixed in beta-0.80.0

Issue History

Date Modified Username Field Change
2015-03-05 20:46 toby New Issue
2015-03-05 21:36 toby Note Added: 0001320
2015-03-09 11:01 luigi scarso Note Added: 0001323
2015-03-10 17:47 toby Note Added: 0001324
2015-03-10 17:57 luigi scarso Note Added: 0001325
2015-03-10 22:19 toby Note Added: 0001326
2015-03-11 00:18 toby Note Added: 0001327
2015-03-18 14:20 luigi scarso Note Added: 0001341
2015-03-20 02:19 luigi scarso Note Added: 0001342