Numerical Integration | Vose Software

Numerical Integration

See also: Model design introduction, Vose Integrate

A slightly deeper understanding of what you are actually doing when performing a Monte Carlo simulation will allow you to expand and improve your use of ModelRisk/Excel greatly: to find shortcuts that improve a model's accuracy and efficiency; to perform many complex statistical analyses in a simple and intuitive way; perform mathematical integrals (admittedly a less common task for most of us); and it will help you understand and explain risk analysis models much better.

Numerical integration encompasses a broad range of techniques that replace algebraic integration with simulation: the relative frequency of the result of some mathematical operation performed on randomly generated values is developed that would be the equivalent of performing the integral of some function. It's a horrible definition - the best explanation is through examples, as provided below.

See also: Monte Carlo simulation introduction

The Monte Carlo simulation modeling that one would typically carry out with ModelRisk is an example of numerical integration. For example it is quite simple to calculate the probability distribution function of the sum of two independent distributions.

Let X be the first random variable with cumulative distribution function FX(x), and let Y be the second distribution with density g(y). Then the cumulative distribution function of the sum of X and Y, FX+Y, is given by:

The sum of two independent distributions is sometimes known as the convolution of the distributions.

For all but the simplest combinations of distributions this integral is very complicated or even impossible to determine algebraically. Monte Carlo simulation replaces the integral with the relative frequency with which values are generated. So, for example, we create a model with two cells, each with a distribution representing variable X and Y, and a third cell that adds together the generated values for X and Y. The cumulative distribution function of the sum of X and Y, FX+Y, is replaced in simulation by the empirical cumulative distribution of the generated values in this third cell.

Comparison of two or more random variables or uncertain parameters

Monte Carlo simulation gives you an easy way to compare two or more comparable random variables, or two or more comparable uncertain parameters. The technique for comparing uncertain parameters is as follows:

1. Construct a distribution for each uncertain parameter in separate cells. For example:

A1: Uncertainty distribution for parameter X

A2: Uncertainty distribution for parameter Y

A3: Uncertainty distribution for parameter Z

where, of course, X, Y and Z are comparable: i.e. they have the same units (e.g. km/h) and refer to the same characteristic (e.g. maximum car speed of model X,Y,Z)

2. In a separate cell construct a logical function that makes the required comparison between the values, and returns a 1 if true, and a 0 otherwise.

For example, we might want to know which car is fastest, so we would write:

B1: =IF(A1=MAX(A1:A3),1,0)            for car X

B2: =IF(A2=MAX(A1:A3),1,0)            for car Y

B3: =IF(A3=MAX(A1:A3),1,0)            for car Z

Or we might wish to know whether a car is at least 10% faster than the other two:

B1: =IF(A1>MAX(A2,A3)*1.1,1,0)      for car X

B2: =IF(A2>MAX(A1,A3)*1.1,1,0)      for car Y

B3: =IF(A3>MAX(A1,A2)*1.1,1,0)      for car Z

3. In another cell monitor the mean of the values generated for these cells using VoseSimMean. These values after a simulation run are the confidence we have that each car meets the tested requirement (e.g. fastest, fastest by 10%, etc.)

The direct relationship between this type of simulation model and an algebraic integration is discussed in two examples:

Binomial probability example

Mixing calculation and simulation

As a general rule, it is much better to be able to create a probability model that calculates, rather than simulates, the required probability or probability distribution. Calculation is preferable because the model answer is updates immediately if a parameter value changes (rather than requiring a re-simulation of the model) and more importantly within this context, it is far more efficient.