Distributions used in modeling expert opinion | Vose Software

Distributions used in modeling expert opinion

See also: Modeling expert opinion introduction, Sources of error in subjective estimation

Various types of probability distributions play a role in modeling expert opinion. A useful distinction to keep in mind when choosing among them is the one between parametric and non-parametric distributions.

Non-parametric and Parametric Distributions 

See also: Distributions introduction, Modeling expert opinion introduction, Parametric and non-parametric distributions

Parametric distributions are based on a mathematical function whose shape and range are determined by one or more distribution parameters. These parameters often have little obvious or intuitive relationship to the distribution shapes they define. Examples of parametric distributions are: Lognormal, Normal, Beta, Weibull, Pareto, Loglogistic, Hypergeometric - most distribution types, in fact.

Non-parametric distributions, on the other hand, have their shape and range determined by their parameters directly in an obvious and intuitive way. Their distribution function is simply a mathematical description of their shape. Non-parametric distributions are: Uniform, Relative, Triangle, Cumulative and Discrete.

As a rule, non-parametric distributions are far more reliable and flexible for modelling expert opinion about a model parameter. The questions that the analyst poses to the expert to determine the distribution's parameters are intuitive and easy to respond to. Changes to these parameters also produce an easily predicted change in the distribution's shape and range. The application of each non-parametric distribution type to modelling expert opinion is discussed below.

There are three common exceptions to the above preference for using non-parametric distributions to model expert opinion:

1. The PERT distribution is frequently used to model an expert's opinion. Although it is, strictly speaking, a parametric distribution, it has been adapted so that the expert need only provide estimates of the minimum, most likely and maximum values for the variable and the PERT function finds a shape that fits these restrictions. The PERT distribution is explained more fully below.

2. The expert may occasionally be very familiar with using the parameters that define the particular distribution. For example, a toxicologist may regularly determine the mean and standard error of a chemical concentration in a set of samples. It might be quite helpful to ask the expert for the mean and standard deviation of her uncertainty about some concentration in this case.

3. The parameters of a parametric distribution are sometimes intuitive and the analyst can therefore ask for their estimation directly. For example, a Binomial distribution is defined by n, the number of trials that will be conducted, and p, the probability of success of each trial. In cases where you consider the Binomial distribution to be the most appropriate, you can ask the expert for estimates of n and p, recognizing that you will have to insert them into a Binomial distribution, but try to avoid any discussion of the Binomial distribution that might cause confusion. Note that the estimates of n and p can also be distributions themselves.

There are other problems associated with using parametric distributions for modelling expert opinion:

The model that includes parametric distributions to represent opinion is more difficult to review later because the parameters of the distribution may have no intuitive appeal.

It is very difficult to get the precise shape right when using parametric distributions to model expert opinion as the effects of changes in the parameters are not usually obvious.

Overview of the most important distributions

A brief overview of the most important distributions for modeling expert opinions follows below. All of these distributions are available in ModelRisk.






Uniform distribution


The Uniform distribution is generally a very poor model of expert opinion since all values within its range have equal probability density, but that density falls sharply to zero at the minimum and maximum in an unnatural way. The uniform distribution obeys the Maximum Entropy Formalism where only the minimum and maximum are known, but in our experience it is rare indeed that the expert will be able to define the minimum and maximum but have no opinion to offer on a most likely value.

The Uniform distribution can, however, be used to highlight or exaggerate the fact that little is known about the parameter. It can also be used to model circular variables (like the direction of wind from 0 to 2π) or a random position between two points.

Triangular distribution


The Triangular distribution is the most commonly used distribution for modeling expert opinion. It is defined by its minimum (a), most likely (b) and maximum (c) values. The figure below shows three Triangle distributions: Triangle(0,10,20), Triangle(0,10,50), Triangle(0,50,50) which are symmetric, right-skewed and left-skewed respectively.

The Triangle distribution has a very obvious appeal because it is so easy to think about the three defining parameters and to envisage the effect of any changes.

The mean and standard deviation of the Triangle distribution are determined from its three parameters:

            $$\text{mean} = \frac{a + b + c}{3}$$

            $$\text{standard deviation} = \sqrt{\frac{a^2 + b^2 + c^2 - ab - ac - bc}{18}}$$

These formulae show that the mean and standard deviation are equally sensitive to all three parameters. Many models involve parameters for which it is fairly easy to estimate the minimum and most likely values, but for which the maximum is almost unbounded and could be enormous.

The Central Limit Theorem tells us that, when adding up a large number of distributions (for example adding costs or task durations), it is the distributions' means and standard deviations that are most important because they determine the mean and standard deviation of the risk analysis result. In situations where the maximum is difficult to determine, the Triangle distribution is not usually appropriate since the values generated from it will depend a great deal on how the estimation of the maximum is approached. For example, if the maximum is assumed to be the absolutely largest possible value, the risk analysis output will have a far larger mean and standard deviation than if the maximum is assumed to be a 'practical' maximum by the estimating experts.
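The sensitivity to the choice of maximum can be checked numerically. The following is a minimal sketch that uses the standard Triangle formulas, mean = (a + b + c) / 3 and variance = (a^2 + b^2 + c^2 - ab - ac - bc) / 18. The function names and the task figures (minimum 10, most likely 12, maximum 20 or 60) are hypothetical and chosen only for illustration.

```python
import math


def triangle_mean(a, b, c):
    """Mean of a Triangle(a, b, c) distribution: (a + b + c) / 3."""
    return (a + b + c) / 3.0


def triangle_sd(a, b, c):
    """Standard deviation: sqrt((a^2 + b^2 + c^2 - ab - ac - bc) / 18)."""
    return math.sqrt((a * a + b * b + c * c - a * b - a * c - b * c) / 18.0)


# Hypothetical task: minimum 10, most likely 12 (illustrative numbers only).
# Only the interpretation of the maximum differs between the two cases.
for label, maximum in [("practical maximum", 20.0), ("absolute maximum", 60.0)]:
    mean = triangle_mean(10.0, 12.0, maximum)
    sd = triangle_sd(10.0, 12.0, maximum)
    print(f"{label}: mean = {mean:.2f}, sd = {sd:.2f}")
```

Because the maximum enters the mean with the same weight as the other two parameters, moving it from a practical to an absolute value shifts the mean by one third of the change.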

The Triangle distribution is often considered to be appropriate where little is known about the parameter outside an approximate estimate of its minimum, most likely and maximum values. On the other hand, its sharp, very localized peak and straight lines produce a very definite, unusual and very unnatural shape, which could be said to conflict with the assumption of little knowledge of the parameter.

Generalized Trapezoid Uniform (GTU) distribution

VoseGTU(min, leftmode, rightmode, max, m, n)

The Trapezoid Uniform distribution is a generalization of the Triangle distribution. Its PDF has a flat section in the middle rather than a peak at the mode, so it consists of three stages (a rising slope, a plateau and a falling slope) and looks like a trapezoid, hence the name. It takes four parameters: min, left mode, right mode and max. In the limiting case min = left mode and max = right mode it becomes a Uniform distribution, and for left mode = right mode it becomes a Triangle distribution.

We can generalize this trapezoidal distribution by allowing the slopes to be curved, governed by two extra parameters m and n: m for the curvature on the left and n on the right. m=1 and n=1 correspond to straight slopes. The GTU distribution is well suited for subjective estimation: it is often natural to provide "soft" minimum and maximum mode parameters based on expert opinion, together with "hard" min and max boundaries guided by natural/practical limits.
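The limiting cases can be illustrated with a small sketch of the straight-sloped case only (m = 1 and n = 1). This is not the ModelRisk implementation and does not cover the curved slopes; the function name trapezoid_pdf and its argument names are hypothetical.

```python
def trapezoid_pdf(x, lo, left_mode, right_mode, hi):
    """Density of a straight-sloped trapezoid on [lo, hi] (the m = n = 1 case)."""
    if not (lo <= left_mode <= right_mode <= hi) or lo == hi:
        raise ValueError("need lo <= left_mode <= right_mode <= hi and lo < hi")
    # Plateau height chosen so that the total area under the curve is 1.
    height = 2.0 / ((hi - lo) + (right_mode - left_mode))
    if x < lo or x > hi:
        return 0.0
    if x < left_mode:   # rising slope (only reached when left_mode > lo)
        return height * (x - lo) / (left_mode - lo)
    if x > right_mode:  # falling slope (only reached when hi > right_mode)
        return height * (hi - x) / (hi - right_mode)
    return height       # flat section between the two modes


# Limiting cases described in the text:
print(trapezoid_pdf(3.0, 0.0, 0.0, 10.0, 10.0))  # Uniform(0, 10) density
print(trapezoid_pdf(5.0, 0.0, 5.0, 5.0, 10.0))   # Triangle(0, 5, 10) peak
```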

PERT distribution

VosePERT(min, mode, max)

The PERT distribution gets its name because it uses the same assumption about the mean (see below) as PERT networks (PERT = Program Evaluation and Review Technique - used in the past for project planning). It is a version of the Beta distribution and requires the same three parameters as the Triangle distribution, namely minimum (a), most likely (b) and maximum (c). The figure below shows three PERT distributions whose shape can be compared to the Triangle distributions described above.

The equation of a PERT distribution is related to the Beta distribution as follows:

  $$\text{PERT}(a, b, c) = \text{Beta}(\alpha_1, \alpha_2) \cdot (c - a) + a$$

where:

  $$\alpha_1 = \frac{(\mu - a)(2b - a - c)}{(b - \mu)(c - a)}$$

  $$\alpha_2 = \frac{\alpha_1 (c - \mu)}{\mu - a}$$

            The mean:  $$\mu = \frac{a + 4b + c}{6}$$

The last equation for the mean is a restriction that is assumed in order to be able to determine values for α1 and α2. It also shows how the mean for the PERT distribution is four times more sensitive to the most likely value than to the minimum and maximum values. This should be compared with the Triangle distribution where the mean is equally sensitive to each parameter. The PERT distribution therefore does not suffer to the same extent from the potential systematic bias problems of the Triangle distribution, that is, in producing too great a value for the mean of the risk analysis results where the maximum for the distribution is very large.
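As a rough sketch of how the relationship to the Beta distribution can be used, the code below derives the two Beta shape parameters from the mean restriction (a + 4b + c) / 6 and the requirement that the mode equals b, then draws samples with Python's standard library. The simplified expressions 1 + 4(b - a)/(c - a) and 1 + 4(c - b)/(c - a) are algebraically equivalent forms that avoid a 0/0 when the distribution is symmetric. The function names are hypothetical and this is not ModelRisk's VosePERT.

```python
import random


def pert_shape_parameters(a, b, c):
    """Return (mean, alpha1, alpha2) for PERT(a, b, c), assuming a < c and a <= b <= c."""
    mu = (a + 4.0 * b + c) / 6.0           # PERT network assumption for the mean
    alpha1 = 1.0 + 4.0 * (b - a) / (c - a)
    alpha2 = 1.0 + 4.0 * (c - b) / (c - a)
    return mu, alpha1, alpha2


def sample_pert(a, b, c, rng):
    """Draw one value as Beta(alpha1, alpha2) * (c - a) + a."""
    _, alpha1, alpha2 = pert_shape_parameters(a, b, c)
    return a + (c - a) * rng.betavariate(alpha1, alpha2)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
mu, alpha1, alpha2 = pert_shape_parameters(0.0, 10.0, 50.0)
draws = [sample_pert(0.0, 10.0, 50.0, rng) for _ in range(20000)]
print(f"alpha1 = {alpha1:.2f}, alpha2 = {alpha2:.2f}")
print(f"theoretical mean = {mu:.2f}, sample mean = {sum(draws) / len(draws):.2f}")
```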

The standard deviation of a PERT distribution is also less sensitive to the estimate of the extremes. Although the equation for the PERT standard deviation is rather complex, the point can be illustrated very well graphically. The figure below compares the standard deviations of the Triangle and PERT distributions that have the same a, b and c values.

To illustrate the point, the figure uses values of zero and one for a and c respectively and allows b (the most likely value; the x-axis in the above figure) to vary between zero and one, although the observed pattern extends to any {a,b,c} set of values. It can be seen that the PERT distribution produces a systematically lower standard deviation than the Triangle distribution, particularly where the distribution is highly skewed (i.e. b is close to zero or one in this case). As a general rough rule of thumb, cost and duration distributions for project tasks often have a ratio of about 2:1 between the (maximum - most likely) and (most likely - minimum), equivalent to b = 0.3333 on the figure above. The standard deviation of the PERT distribution at this point is about 88% of that for the Triangle distribution. This implies that using PERT distributions throughout a cost or schedule model, or any other additive model, will display about 10% less uncertainty than the equivalent model using Triangle distributions.
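The ratio quoted above can be reproduced with a short calculation. This sketch assumes the standard Triangle variance formula and a PERT built from a Beta distribution with shape parameters 1 + 4(b - a)/(c - a) and 1 + 4(c - b)/(c - a); the function names are hypothetical.

```python
import math


def triangle_sd(a, b, c):
    """Standard deviation of Triangle(a, b, c)."""
    return math.sqrt((a * a + b * b + c * c - a * b - a * c - b * c) / 18.0)


def pert_sd(a, b, c):
    """Standard deviation of PERT(a, b, c) via its underlying Beta distribution."""
    alpha1 = 1.0 + 4.0 * (b - a) / (c - a)
    alpha2 = 1.0 + 4.0 * (c - b) / (c - a)
    total = alpha1 + alpha2  # equals 6 for the standard PERT
    beta_variance = alpha1 * alpha2 / (total ** 2 * (total + 1.0))
    return (c - a) * math.sqrt(beta_variance)


# a = 0, c = 1 and b = 1/3 gives the 2:1 ratio discussed in the text.
ratio = pert_sd(0.0, 1.0 / 3.0, 1.0) / triangle_sd(0.0, 1.0 / 3.0, 1.0)
print(f"PERT sd / Triangle sd = {ratio:.3f}")
```

At b = 1/3 the ratio of the two variances is exactly 11/14, so the ratio of standard deviations is a little under 0.89.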

Some readers would perhaps argue that the increased uncertainty that occurs with Triangle distributions will compensate to some degree for the 'over-confidence' that is often apparent in subjective estimating. The argument is quite appealing at first sight but is not conducive to the long term improvement of the organization's ability to estimate. We would rather see an expert's opinion modelled as precisely as is practicable. Then, if the expert is consistently over-confident, this will become apparent with time and the estimating can be corrected.

ModelRisk offers an adjustable version of the PERT distribution we designed at Vose, called the Modified PERT distribution.

Modified PERT distribution

VoseModPERT(min, mode, max, gamma)

We have developed a modified PERT distribution to produce shapes with varying degrees of uncertainty for the same minimum, most likely and maximum, by changing the assumption about the mean:

  $$\mu = \frac{a + \gamma b + c}{\gamma + 2}$$

In the standard PERT, γ = 4, which is the PERT network assumption that μ = (a + 4b + c) / 6. However, if we increase the value of γ, the distribution becomes progressively more peaked and concentrated around b (and therefore less uncertain). Conversely, if we decrease γ the distribution becomes flatter and more uncertain. The figure below illustrates the effect of three different values of γ for a modified PERT(5,7,10) distribution.

This modified PERT distribution can be very useful in modeling expert opinion. The expert is asked to estimate the same three values as before (i.e. minimum, most likely and maximum). Then a set of modified PERT distributions are plotted and the expert is asked to select the shape that fits his/her opinion most accurately.
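The effect of the extra parameter can be tabulated before plotting. The sketch below assumes that the modified PERT keeps the mode at the most likely value and sets the mean to (a + gamma * b + c) / (gamma + 2), which leads to Beta shape parameters 1 + gamma(b - a)/(c - a) and 1 + gamma(c - b)/(c - a). The function names are hypothetical and this is not ModelRisk's VoseModPERT.

```python
import math


def modpert_parameters(a, b, c, gamma=4.0):
    """Return (mean, alpha1, alpha2) for a modified PERT; gamma = 4 is the standard PERT."""
    mu = (a + gamma * b + c) / (gamma + 2.0)
    alpha1 = 1.0 + gamma * (b - a) / (c - a)
    alpha2 = 1.0 + gamma * (c - b) / (c - a)
    return mu, alpha1, alpha2


def modpert_sd(a, b, c, gamma=4.0):
    """Standard deviation derived from the underlying Beta distribution."""
    _, alpha1, alpha2 = modpert_parameters(a, b, c, gamma)
    total = alpha1 + alpha2  # equals gamma + 2
    return (c - a) * math.sqrt(alpha1 * alpha2 / (total ** 2 * (total + 1.0)))


# Same minimum, most likely and maximum as the example in the text;
# the gamma values are illustrative choices.
for gamma in (2.0, 4.0, 10.0):
    mu, _, _ = modpert_parameters(5.0, 7.0, 10.0, gamma)
    sd = modpert_sd(5.0, 7.0, 10.0, gamma)
    print(f"gamma = {gamma:>4}: mean = {mu:.3f}, sd = {sd:.3f}")
```

A table like this, or the corresponding plots, gives the expert a small set of candidate shapes to choose from.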

Relative Distribution

VoseRelative(min, max, {values}, {weights})

The Relative distribution is the most flexible of all of the continuous distribution functions. It enables the analyst and expert to tailor the shape of the distribution to reflect, as closely as possible, the opinion of the expert. The Relative distribution takes the form VoseRelative(min, max, {values}, {weights}) in ModelRisk where {values} is an array of x-values with probability densities (or weights) {weights} and where the distribution falls between the minimum and maximum. The {weights} values are not constrained to give an area under the curve of 1 since the software recalibrates the probability scale. The figure below gives an example.
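The recalibration of the probability scale can be sketched as follows. This is an illustration under stated assumptions, not the ModelRisk algorithm: it assumes the density is linearly interpolated between the supplied points and falls to zero at the minimum and maximum. The function name relative_density is hypothetical.

```python
def relative_density(minimum, maximum, values, weights):
    """Build a normalized piecewise-linear density from unnormalized weights."""
    # Assumption: the density is zero at the minimum and at the maximum.
    xs = [minimum] + list(values) + [maximum]
    ws = [0.0] + list(weights) + [0.0]
    if any(x1 <= x0 for x0, x1 in zip(xs, xs[1:])):
        raise ValueError("need minimum < values (strictly increasing) < maximum")
    # Area under the unnormalized curve by the trapezoid rule (exact here).
    area = sum((x1 - x0) * (w0 + w1) / 2.0
               for x0, x1, w0, w1 in zip(xs, xs[1:], ws, ws[1:]))
    if area <= 0.0:
        raise ValueError("weights must give a positive area")

    def pdf(x):
        if x < minimum or x > maximum:
            return 0.0
        for x0, x1, w0, w1 in zip(xs, xs[1:], ws, ws[1:]):
            if x0 <= x <= x1:
                t = (x - x0) / (x1 - x0)
                return (w0 + t * (w1 - w0)) / area
        return 0.0

    return pdf


# Multiplying every weight by the same factor leaves the density unchanged.
pdf_small = relative_density(0.0, 10.0, [2.0, 5.0, 8.0], [1.0, 3.0, 2.0])
pdf_large = relative_density(0.0, 10.0, [2.0, 5.0, 8.0], [10.0, 30.0, 20.0])
print(pdf_small(5.0), pdf_large(5.0))
```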

Cumulative distribution


The Cumulative (ascending) distribution has the form VoseCumulA(minimum, maximum, {xi}, {Pi}) where {xi} is an array of x-values with cumulative probabilities {Pi} and where the distribution falls between the minimum and maximum. The figure below shows the distribution VoseCumulative(0, 10, {1, 4, 6}, {0.1, 0.6, 0.8}) as it is defined in its cumulative form and how it looks as a relative frequency plot:

The Cumulative distribution is used in some texts to model expert opinion. However, we have found it largely unsatisfactory because of the insensitivity of its probability scale. A small change in the shape of the Cumulative distribution that would pass unnoticed produces a radical change in the corresponding relative frequency plot that would not be acceptable, as illustrated in this topic.
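The insensitivity of the cumulative scale can be seen by converting the cumulative points into densities. The sketch below assumes the cumulative curve is linearly interpolated between points, so that the density is constant within each segment; the function name is hypothetical and the perturbed value 0.65 is an illustrative choice.

```python
def cumulative_to_density(minimum, maximum, xs, ps):
    """Return (start, end, density) for each segment of a piecewise-linear CDF."""
    grid_x = [minimum] + list(xs) + [maximum]
    grid_p = [0.0] + list(ps) + [1.0]
    segments = []
    for i in range(len(grid_x) - 1):
        width = grid_x[i + 1] - grid_x[i]
        if width <= 0:
            raise ValueError("x-values must be strictly increasing")
        # Slope of the cumulative curve equals the density on this segment.
        density = (grid_p[i + 1] - grid_p[i]) / width
        segments.append((grid_x[i], grid_x[i + 1], density))
    return segments


# The example from the text, then the same curve with one point nudged
# from a cumulative probability of 0.6 to 0.65.
original = cumulative_to_density(0, 10, [1, 4, 6], [0.1, 0.6, 0.8])
nudged = cumulative_to_density(0, 10, [1, 4, 6], [0.1, 0.65, 0.8])
for (x0, x1, d0), (_, _, d1) in zip(original, nudged):
    print(f"[{x0}, {x1}]: density {d0:.3f} -> {d1:.3f}")
```

A shift of 0.05 on the cumulative scale is hard to see on a plot, yet it moves the density on the segment from 4 to 6 from 0.100 to 0.075, a change of one quarter.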

Discrete distribution

VoseDiscrete({xi}, {pi})

The Discrete distribution has the form VoseDiscrete({xi}, {pi}) where {xi} is an array of the possible values of the variable with probability weightings {pi}. The {pi} values do not have to add up to unity as the software will normalise them automatically. It is actually often useful just to consider the ratio of likelihood of the different values and not to worry about the actual probability values. The Discrete distribution can be used to model a discrete parameter (that is, a parameter that may take one of two or more distinct values), e.g. the number of turbines that will be used in a power station.
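The normalization of the weightings can be sketched in a few lines. The turbine counts and the weights 1 : 3 : 2 below are hypothetical, and the function name normalise_weights is an assumption made for illustration; sampling uses random.choices from the Python standard library, which accepts unnormalized weights.

```python
import random


def normalise_weights(values, weights):
    """Map each possible value to a probability by dividing by the total weight."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and of equal length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = float(sum(weights))
    if total <= 0.0:
        raise ValueError("at least one weight must be positive")
    return {v: w / total for v, w in zip(values, weights)}


# Hypothetical expert view: 3 turbines is three times as likely as 2,
# and 4 turbines is twice as likely as 2.
turbines = [2, 3, 4]
ratios = [1, 3, 2]
probabilities = normalise_weights(turbines, ratios)
print(probabilities)

rng = random.Random(1)  # fixed seed so the sketch is repeatable
draws = rng.choices(turbines, weights=ratios, k=5)
print(draws)
```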

Read on: Incorporating differences in expert opinions