   7.3.9  Random variables : random_variable randvar

randvar (alias: random_variable) takes a probability distribution specification as its argument and returns an object representing a random variable. Its value(s) can be generated subsequently by calling sample, rand, randvector or randmatrix.

The probability distribution is specified as a sequence of arguments. The supported types are : uniform, normal, binomial, multinomial, negbinomial, Poisson, Student, Fisher-Snedecor, Cauchy, Weibull, beta, gamma, chi-square, geometric, exponential and discrete.

Continuous distributions.

The usual way to specify a continuous distribution is to pass the probability density function as the first argument, followed by one or more (numeric) parameters. However, it can also be defined by specifying its type and first and/or second moment (the mean and/or the standard deviation/variance); the supported types are : normal, uniform, binomial, Poisson, geometric, exponential, gamma, beta and Weibull. Additionally, a uniform distribution can be defined by specifying its range as an interval. The arguments are entered in form:

• mean=µ
• stddev=σ
• variance=σ2
• [range=]a..b or range=[a,b]
Discrete distributions.

To create a discrete random variable one can pass either

• a list W=[w1,w2,…,wn] of nonnegative weights as the first argument, optionally followed by a list of values V=[v1,v2,…,vn],
• a list of of object-weight pairs : [[v1,w1],[v2,w2],…,[vn,wn]], or
• a nonnegative function f followed by a range specification [range=]a..b and optionally either a positive integer N (with a,b∈ℝ) or a list of values V=[v0,v1,v2,…,vn] where n=ba, a<b and a,b∈ℤ.

The weights are automatically scaled by the inverse of their sum to obtain the values of the probability mass function. If a function f is given instead of a list of weights, then wk=f(a+k) for k=0,1,…,ba unless N is given, in which case wk=f(xk) where xk=a+(k−1) ba/N and k=1,2,…,N. The resulting random variable X has values in {0,1,…,n−1} for 0-based modes (e.g. Xcas) resp. in {1,2…,n} for 1-based modes (e.g. Maple). If the list V of custom objects is given, then V[X] is returned instead of X. If N is given, then vk=xk for k=1,2,…,N.

Examples.

To define a random variable with a Fisher-Snedecor distribution (two degrees of freedom), input :

X:=random_variable(fisher,2,3)

Output :

fisherd(2,3)

To generate some values of X, input :

rand(X) // alternative : sample(X)

Output :

2.0457

Input :

randvector(5,X) // alternative : sample(X,5)

Output :

[3.9823,0.50771,0.44836,0.79225,0.088813]

To define a random variable with multinomial distribution, input :

M:=randvar(multinomial,[1/2,1/3,1/6],[a,b,c])

Output :

’multinomial’,[1/2,1/3,1/6],[a,b,c]

Input :

randvector(10,M)

Output :

[a,c,b,a,b,b,a,b,b,b]

Some continuous distributions can be defined by specifying its first and/or second moment. Input :

randvector(10,randvar(poisson,mean=5))

Output :

[5,4,4,8,3,8,3,3,5,9]

Input :

randvector(5,randvar(weibull,mean=5.0,stddev=1.5))

Output :

[3.6483,3.4194,6.8166,4.3778,2.4178]

Input :

X:=randvar(binomial,mean=18,stddev=4)

Output :

binomial(162,1/9)

Input :

X:=randvar(weibull,mean=12.5,variance=1)

Output :

weibulld(3.0857,13.98)

Input :

mean(randvector(1000,X))

Output :

12.582

Input :

G:=randvar(geometric,stddev=2.5)

Output :

geometric(0.32792)

Input :

evalf(stddev(randvector(1000,G)))

Output :

2.4245

Input :

Output :

Uniformly distributed random variables can be defined by specifying the support as an interval. Input :

randvector(5,randvar(uniform,range=15..81))

Output :

[61.97,76.427,37.939,69.639,40.325]

Input :

rand(randvar(uniform,e..pi))

Output :

3.0434

The following examples demonstrate various ways to define a discrete random variable. Input :

X:=randvar([["apple",1/3],["orange",1/4], ["pear",1/5],["plum",13/60]]):;
randvector(5,X)

Output :

["apple","plum","pear","orange","apple","pear"]

Input :

W:=[1,4,5,3,1,1,1,2]:; X:=randvar(W):;
approx(W/sum(W))

Output :

[0.055556,0.22222,0.27778,0.16667, 0.055556,0.055556,0.055556,0.11111]

Input :

frequencies(randvector(10000,X))

Output :

[[0,0.0566],[1,0.2152],[2,0.2798],[3,0.1683], [4,0.0594],[5,0.0564],[6,0.0568],[7,0.1075]]

Input :

X:=randvar(k->1-(k/10)`^`2,range=-10..10):;
histogram(randvector(10000,X),-10,0.33,display=filled)

Output : Input :

X:=randvar([3,1,2,5],[alpha,beta,gamma,delta]):;
randmatrix(5,4,X)

Output :

 α β δ δ δ α α α δ γ α δ δ α δ α α β δ δ

Discrete random variables can be used to approximate custom continuous random variables. For example, consider a probability density function f as a mixture of two normal distributions on the support S=[−10,10]. We sample f in N=10000 points in S. Input :

F:=normald(3,2,x)+normald(-5,1,x):; c:=integrate(F,x=-10..10):;
f:=unapply(1/c*F,x):;
X:=randvar(f,range=-10..10,10000):;

Now we generate 25000 values of X and plot a histogram :

R:=sample(X,25000):;
hist:=histogram(R,-10,0.1):;
PDF:=plot(f(x),display=red+line_width_2):; hist,PDF

Output : Sampling from discrete distributions is fast : generating 25 million samples from the distribution of X which has about 10000 outcomes takes only couple of seconds. In fact, the sampling complexity is constant. Also observe that the process isn’t slowed down by spreading it across 1000 calls of randvector. Input :

for k from 1 to 1000 do randvector(25000,X); od:;

Evaluation time: 2.12

Independent random variables can be combined in an expression, yielding a new random variable. In the example below, we define a log-normally distributed variable Y from a variable X with standard normal distribution. Input :

X:=randvar(normal):; mu,sigma:=1.0,0.5:;
Y:=exp(mu+sigma*X):;
L:=randvector(10000,Y):; histogram(L,0,0.33)

Output : It is known that E[Y]=eµ+σ2/2. The mean of L should be close to that number. Input :

mean(L); exp(mu+sigma`^`2/2)

Output:

3.0789,3.0802

In case a compound random variable is defined as an expression containing several independent random variables X,Y,… of the same type, it is sometimes needed to prevent its evaluation when passing it to randvector or randmatrix. Input :

Y:=randvar(normal):;

X/Y is wrapped by eval because otherwise it would automatically reduce to 1 as X and Y are both normald(0,1). Input :

randvector(5,eval(X/Y,0))

Output :

[0.2608,-0.056913,-4.7966,-1.2622,-1.2997]

To save typing, one can define Z with eval(∗,0) and pass eval(Z,1) to randvector or randmatrix. Input :

Z:=eval(X/Y,0):; randvector(5,eval(Z,1))

Output :

[0.19015,-2.4509,-1.4277,-1.1452,1.2935]

Parameters of a distribution can be entered as symbols to allow (re)assigning them at any time. Input :

purge(lambda):; X:=randvar(exp,lambda):;
lambda:=1:;

Now execute the following command line several times in a row. The parameter λ is updated in each iteration :

r:=rand(X); lambda:=sqrt(r)

Output (by executing the above command line three times) :

8.5682,2.9272
1.5702,1.2531
0.53244,0.72968   