Crude sensitivity analysis for identifying important input distributions | Vose Software

# Crude sensitivity analysis for identifying important input distributions

With ModelRisk  you can begin your investigation into the inter-relationships between input and output variables sensitivity analysis by performing a crude sensitivity analysis, as a precursor to performing a more advanced analysis on these key variables using spider plots. It achieves this by performing one of two statistical analyses on data that have been generated from input distributions and data calculated for the selected output. Built into this operation are two important assumptions:

1. All the tested input parameters have either a purely positive or negative statistical correlation with the output; and

2. Each uncertain variable is modelled with a single distribution

Assumption 1. is rarely invalid, but would be incorrect if the output value was at a maximum or minimum for an input value somewhere in the middle of its range, e.g.:

Assumption 2. is very often incorrect. For example, the impact of a risk event might be modelled as:

=VoseBernoulli(20%,U1)*VoseTriangle(10,20,50,U2)

Using ModelRisk to generate the Uniform(0,1) variates and then performing the standard sensitivity analysis will evaluate the effect of the Bernoulli and the Triangle distributions separately, so the measured effect on the output will be divided between these two distributions. However. ModelRisk offers the function VoseRiskEvent to help get round this problem:

=VoseRiskEvent(20%),VoseTriangle(10,20,50), U1)

so only one Uniform(0,1) variate is being used to drive the sampling of the risk impact.

Similarly, if you were an insurance company you might be interested in the impact on your corporate cashflow of the aggregate claims distribution for some particular policy. ModelRisk offers a number of aggregate distribution functions that internally calculate the aggregation of claim size and frequency distributions. So, for example, one can write:

=VoseAggregatePanjer("VosePoisson((5500)","VoseLognormal(2350,1285)",,,U1)

which will return the aggregate cost of Poisson(5500) claims each drawn independently from a Lognormal(2350,1285) distribution, and the generated aggregate cost value will be controlled by the U1 variate.

Assumption 2 also means that this method of sensitivity analysis is invalid for a variable that is modelled over a series of cells, like a time series of exchange rates or sales volumes. The automated analysis will evaluate the sensitivity of the output to each distribution separately. You can still evaluate the sensitivity of a time series by running two simulations: one with all the distributions simulating random values, and another with the distributions of the time series locked to their expected value. If the distributions vary significantly, the variable time series is important.

### Two statistical analyses

Tornado charts for two different methods of sensitivity analysis are in common use. Both methods plot the variable against a statistic that takes values from -1 (the output is wholly dependent on this input, but when the input is large, the output is small), through 0 (no influence) to +1 (the output is wholly dependent on this input, and when the input is large, the output is also large):

1. Stepwise least - squares regression between collected input distribution values and the selected output values. The assumption here is that there is a relationship between each input I and the output O (when all other inputs are held constant) of the form O = I*m + c where m, c are constants. That assumption is correct for additive and subtractive models, and will give very accurate results in those circumstances, but is otherwise less reliable and somewhat unpredictable. The r-squared statistic is then used as the measure of sensitivity in a Tornado chart.

2. Rank order correlation. This analysis replaces each collected value by its rank amongst other values generated for that input or output, and then calculates the Spearman's rank order correlation coefficient between each input and the output. Since this is a non-parametric analysis, it is considerably more robust than the regression analysis option where there are complex relationships between the inputs and output.