Statistical descriptions of model outputs

Statistical measures of simulation results can be categorized into three broad groups:

Measures of location - where the distribution is 'centred'
Measures of spread - how broad the distribution is
Measures of shape - how lopsided or peaked the distribution is

Measures of location

There are essentially three measures of central tendency (i.e. measures of the central location of a distribution) that are commonly provided in statistics reports:

The mode is the location of a distribution's peak, i.e. the value with the highest probability or probability density;
The median is the value that the variable has a 50% probability of being above or below;
The mean is the probability-weighted average of all possible outcomes.

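Interestingly, the mode, median and mean usually follow a particular order in terms of their values: for a single-peaked distribution with a long right tail one typically finds mode < median < mean, and the order is reversed when the long tail is on the left.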
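As a rough illustration of the three measures of location and how they tend to be ordered, the sketch below estimates each one from simulated output values. The lognormal sample, the sample size and the helper name estimate_mode are assumptions made for this example only; the histogram-based mode is a crude estimate, not a definitive method.

```python
import random
import statistics
from collections import Counter

def estimate_mode(values, bins=50, upper_fraction=0.95):
    """Crude mode estimate: midpoint of the fullest histogram bin.

    The far right tail is ignored when binning so the bins stay narrow.
    """
    ordered = sorted(values)
    lo = ordered[0]
    hi = ordered[int(upper_fraction * (len(ordered) - 1))]
    width = (hi - lo) / bins
    counts = Counter(
        min(int((v - lo) / width), bins - 1) for v in ordered if v <= hi
    )
    peak_bin = max(counts, key=counts.get)
    return lo + (peak_bin + 0.5) * width

# Hypothetical simulation output: a right-skewed (lognormal) sample.
rng = random.Random(1)
samples = [rng.lognormvariate(0.0, 1.0) for _ in range(50_000)]

mode = estimate_mode(samples)
median = statistics.median(samples)
mean = statistics.fmean(samples)
print(f"mode ~ {mode:.2f}, median ~ {median:.2f}, mean ~ {mean:.2f}")
```

For a right-skewed sample like this one, the printed values should appear in the order mode, median, mean.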

The conditional mean is another useful measure that you can readily calculate for yourself.
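A conditional mean is simply the mean of those output values that satisfy some condition. The minimal sketch below assumes a small, made-up list of profit outcomes and a hypothetical helper named conditional_mean; it computes the mean loss given that a loss occurs.

```python
import statistics

def conditional_mean(values, condition):
    """Mean of the values for which condition(value) is true."""
    selected = [v for v in values if condition(v)]
    if not selected:
        raise ValueError("no values satisfy the condition")
    return statistics.fmean(selected)

# Made-up output values (profit; negative means a loss).
profits = [-30.0, -10.0, 5.0, 20.0, 40.0]

mean_loss_given_loss = conditional_mean(profits, lambda v: v < 0)
overall_mean = statistics.fmean(profits)
print(mean_loss_given_loss, overall_mean)
```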

Measures of spread

The three measures of spread commonly provided in statistics reports are standard deviation, variance and range. Frankly, they all have their problems, so we discuss various other measures of spread you can calculate for yourself that are often more useful:

Inter-percentile range

The distance between two chosen percentiles of the output, for example the 5th and 95th. It offers a consistent interpretation of the measure of spread, and is the one we most commonly use.
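An inter-percentile range can be sketched as the difference between an upper and a lower percentile of the output values. The choice of the 5th and 95th percentiles, the helper name inter_percentile_range and the toy data are assumptions for illustration.

```python
import statistics

def inter_percentile_range(values, lower=5, upper=95):
    """Difference between the upper and lower percentiles (whole numbers 1..99)."""
    # quantiles(n=100) returns the 99 cut points P1..P99.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return cuts[upper - 1] - cuts[lower - 1]

# Toy output values 0, 1, ..., 100.
outputs = list(range(101))
ipr_90 = inter_percentile_range(outputs)          # 5th to 95th percentile
ipr_50 = inter_percentile_range(outputs, 25, 75)  # inter-quartile range
print(ipr_90, ipr_50)
```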

Mean deviation

The average absolute distance of the output values from their mean. It is like a standard deviation, but places less emphasis on tail values.
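The following sketch compares the mean deviation with the standard deviation on a small made-up data set containing one tail value. The helper name mean_deviation is an assumption for this example.

```python
import statistics

def mean_deviation(values):
    """Average absolute distance of the values from their mean."""
    centre = statistics.fmean(values)
    return statistics.fmean(abs(v - centre) for v in values)

# Made-up outputs; 10.0 acts as a tail value.
outputs = [1.0, 2.0, 3.0, 4.0, 10.0]

md = mean_deviation(outputs)
sd = statistics.pstdev(outputs)  # population standard deviation
print(md, sd)
```

Because the standard deviation squares each distance before averaging, the tail value pulls it further up than it pulls the mean deviation.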

Semi-variance and semi-standard deviation

The spread of one part of the distribution. This is useful when you want to look at, say, the spread of losses if you don't make a profit, or the spread of finish times if a project exceeds its deadline.
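A semi-variance can be sketched as the average squared distance from a threshold, using only the values on one side of it. Conventions differ; this example assumes that the average is taken over the values below the threshold only (some definitions divide by the total number of values instead). The helper name and the data are hypothetical.

```python
import math
import statistics

def semi_variance_below(values, threshold):
    """Average squared shortfall from threshold, over values below it only.

    Assumption: the divisor is the count of values below the threshold.
    """
    shortfalls = [(v - threshold) ** 2 for v in values if v < threshold]
    if not shortfalls:
        raise ValueError("no values below the threshold")
    return statistics.fmean(shortfalls)

# Made-up profit outcomes; a loss is any value below zero.
profits = [-30.0, -10.0, 5.0, 20.0, 40.0]

semi_var = semi_variance_below(profits, threshold=0.0)
semi_sd = math.sqrt(semi_var)
print(semi_var, semi_sd)
```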

Normalized measures of spread

Dividing a measure of spread by a measure of location gives a statistic that is scaled to the variable, which allows for a better comparison between variables.
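To show why normalizing helps with comparison, the sketch below divides the standard deviation by the mean for two made-up variables that differ only in scale. Using the mean as the measure of location, and the helper name normalized_sd, are assumptions for this example.

```python
import statistics

def normalized_sd(values):
    """Standard deviation divided by the mean (dimensionless)."""
    return statistics.pstdev(values) / statistics.fmean(values)

# Two made-up options: the second is the first scaled by a factor of 10.
option_a = [90.0, 100.0, 110.0]
option_b = [900.0, 1000.0, 1100.0]

sd_a, sd_b = statistics.pstdev(option_a), statistics.pstdev(option_b)
nsd_a, nsd_b = normalized_sd(option_a), normalized_sd(option_b)
print(sd_a, sd_b)    # raw spreads differ by the scale factor
print(nsd_a, nsd_b)  # normalized spreads agree
```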

Measures of shape

These are measures that indicate how lopsided or peaked the distribution is. The best-known and most often used examples are Skewness (S) and Kurtosis (K).
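Skewness and kurtosis can be sketched from the central moments of the output values. This example assumes the moment-based definitions S = m3 / m2^1.5 and K = m4 / m2^2 (under which a normal distribution has K = 3); statistics reports may use slightly different sample-corrected formulas. The helper names and data are hypothetical.

```python
import statistics

def central_moment(values, order):
    """Average of (value - mean) ** order."""
    centre = statistics.fmean(values)
    return statistics.fmean((v - centre) ** order for v in values)

def skewness(values):
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def kurtosis(values):
    return central_moment(values, 4) / central_moment(values, 2) ** 2

symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_skewed = [1.0, 1.0, 1.0, 2.0, 10.0]

s_sym, k_sym = skewness(symmetric), kurtosis(symmetric)
s_right = skewness(right_skewed)
print(s_sym, k_sym, s_right)
```

A symmetric data set gives a skewness of zero, while a long right tail gives a positive value.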

Don't use too many

However, in general, at Vose Software we use very few statistical measures in writing our reports. The following statistics are easy to understand and, for nearly any problem, communicate all the information one needs to get across:

The mean, which tells you where the distribution is located and has some important properties for comparing and combining risks;
Cumulative percentiles which give the probability statements that decision-makers need (like the probability of being above, or below X or between X and Y);
Relative measures of spread: normalized standard deviation (occasionally) for comparing the level of uncertainty of different options relative to their size (i.e. as a dimensionless measure) where the outputs are roughly normal, and normalized inter-percentile range (more commonly) for the same purpose where the outputs being compared are not all normal.
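The probability statements mentioned in the list above can be estimated directly from simulated output values as simple proportions. The sketch below assumes a toy set of outputs and hypothetical helper names; it is one straightforward way to read cumulative probabilities off a sample, not a prescribed method.

```python
def prob_at_or_below(values, x):
    """Estimated probability that the output is at or below x."""
    return sum(v <= x for v in values) / len(values)

def prob_above(values, x):
    """Estimated probability that the output is above x."""
    return 1.0 - prob_at_or_below(values, x)

def prob_between(values, x, y):
    """Estimated probability that the output is above x and at or below y."""
    return prob_at_or_below(values, y) - prob_at_or_below(values, x)

# Toy output values 1, 2, ..., 100.
outputs = list(range(1, 101))

p_below = prob_at_or_below(outputs, 25)
p_above = prob_above(outputs, 75)
p_mid = prob_between(outputs, 25, 75)
print(p_below, p_above, p_mid)
```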

Read on: Mode
