Polya distribution | Vose Software

Polya distribution

Format: Polya(α, β)

Uses

There are several types of distribution in the literature that have been given the Pólya name. We employ the name for a distribution that is very common in the insurance field.

A standard initial assumption of the frequency distribution of the number of claims is Poisson:

Number of claims = Poisson(λ)

where λ is the expected number of claims during the period of interest. The Poisson distribution has mean and variance both equal to λ. Historic claim frequencies often show a variance greater than the mean, in which case the Poisson model underestimates the randomness of claim numbers. A standard way to incorporate greater variance is to assume that λ is itself a random variable; the claim frequency distribution is then called a mixed Poisson model. Because of its flexibility in shape and its mathematical convenience, a Gamma(α, β) distribution is most commonly used to describe the random variation of λ between periods, so:

Claims = Poisson(Gamma(α, β))                                (1)

This is the Pólya(α, β) distribution.
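As a rough numerical check of Equation 1, the sketch below evaluates the Pólya probability mass function in closed form and confirms that its variance exceeds its mean. It assumes that β is the scale parameter of the Gamma distribution (so the Gamma mean is αβ), which is consistent with Equation 2. The function name polya_pmf and the parameter values are illustrative only and are not ModelRisk functions.

```python
import math

def polya_pmf(k, alpha, beta):
    """P(X = k) for X = Poisson(Gamma(alpha, beta)), beta being the Gamma scale (assumed)."""
    p = 1.0 / (1.0 + beta)
    log_pmf = (math.lgamma(alpha + k) - math.lgamma(alpha) - math.lgamma(k + 1)
               + alpha * math.log(p) + k * math.log(1.0 - p))
    return math.exp(log_pmf)

alpha, beta = 2.5, 1.2          # illustrative values
probs = [polya_pmf(k, alpha, beta) for k in range(400)]  # tail beyond 400 is negligible here
total = sum(probs)
mean = sum(k * q for k, q in enumerate(probs))
variance = sum((k - mean) ** 2 * q for k, q in enumerate(probs))

print(f"total={total:.6f} mean={mean:.4f} variance={variance:.4f}")
```

Under these assumptions the mean should come out as αβ and the variance as αβ(1 + β), so the variance is always larger than the mean, which is the overdispersion the mixed Poisson model is meant to capture.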

For a process in which the number of events in a unit of time follows a Pólya(α, β) distribution, the time between successive events follows a Pareto2(1/β, α) distribution. This is analogous to a Poisson process, in which the number of events in a unit of time follows a Poisson(λ) distribution and the time between successive events follows an Exponential(1/λ) distribution.
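The waiting-time result can be illustrated with a small simulation. The sketch below draws a rate from a Gamma distribution with shape α and scale β, then draws the time to the first event from an Exponential distribution with that rate, and compares the simulated survival probabilities with a Pareto2(1/β, α) curve. It assumes that Pareto2(b, q) has survival function (b / (b + t))^q. The function names, seed, and parameter values are hypothetical choices made for this example.

```python
import random

def simulate_first_event_times(alpha, beta, n, seed=12345):
    """Draw n waiting times: rate ~ Gamma(alpha, scale=beta), time ~ Exponential(rate)."""
    rng = random.Random(seed)
    times = []
    for _ in range(n):
        rate = rng.gammavariate(alpha, beta)   # second argument is the scale
        times.append(rng.expovariate(rate))    # argument is the rate, not the mean
    return times

def pareto2_survival(t, b, q):
    """P(T > t) for an assumed Pareto2(b, q) form: (b / (b + t)) ** q."""
    return (b / (b + t)) ** q

alpha, beta = 2.0, 0.5          # illustrative values
times = simulate_first_event_times(alpha, beta, n=100_000)
for t in (0.5, 1.0, 4.0):
    empirical = sum(x > t for x in times) / len(times)
    theoretical = pareto2_survival(t, 1.0 / beta, alpha)
    print(f"t={t}: simulated {empirical:.4f}, Pareto2 {theoretical:.4f}")
```

The two columns should agree to within sampling error. Note that this only simulates the time to the first event for a freshly drawn rate, which is the simplest case of the statement above.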

Relationship to the Negative Binomial

If α is an integer, we have:

Number of events = Poisson(Gamma(α, β)) = NegBin(α, 1/(1+β))   (2)

so one can say that the Negative Binomial distribution is a special case of the Pólya.
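Equation 2 can be checked numerically by integrating the Poisson probability against the Gamma density and comparing the result with the Negative Binomial probability. The sketch below uses a simple midpoint rule. It assumes that β is the Gamma scale parameter and that NegBin(s, p) counts the failures before the s-th success. The function names, integration limits, and parameter values are illustrative.

```python
import math

def mixture_pmf(k, alpha, beta, upper=80.0, steps=20_000):
    """Midpoint-rule integral of Poisson(k; lam) times the Gamma(alpha, scale=beta) density."""
    width = upper / steps
    log_const = -math.lgamma(k + 1) - math.lgamma(alpha) - alpha * math.log(beta)
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * width
        # log of: lam^k e^-lam / k!  *  lam^(alpha-1) e^(-lam/beta) / (Gamma(alpha) beta^alpha)
        log_f = log_const + (k + alpha - 1) * math.log(lam) - lam * (1.0 + 1.0 / beta)
        total += math.exp(log_f)
    return total * width

def negbin_pmf(k, s, p):
    """P(k failures before the s-th success), success probability p (assumed convention)."""
    return math.comb(s + k - 1, k) * p ** s * (1.0 - p) ** k

alpha, beta = 3, 2.0            # illustrative values; alpha must be an integer here
for k in range(6):
    print(k, round(mixture_pmf(k, alpha, beta), 6),
          round(negbin_pmf(k, alpha, 1.0 / (1.0 + beta)), 6))
```

The upper integration limit of 80 is ample for these parameter values but would need to be increased for larger β or larger counts.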

Another parameterization of the Pólya

Another common actuarial parameterization of the Pólya distribution comes from rewriting Equation 1 as follows:

Number of events = Poisson(λ * Gamma(h, 1/h))                        (3)

The Gamma(h, 1/h) distribution has mean 1, so the Gamma distribution in Equation 3 adds random variation about the expected rate λ. Using this parameterization with our Pólya distribution you would write:

Number of events = Pólya(h, λ/h)
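To see why this parameterization is convenient, the mean and variance of the count N can be written out for shape h and scale λ/h, where λ is the expected rate. The moment formulas used here are the standard mixed Poisson results (the mean of the count equals the mean of the rate, and the variance of the count equals the mean of the rate plus the variance of the rate); they are not stated on this page and assume the second Pólya parameter β is the Gamma scale.

```latex
\begin{aligned}
\mathrm{E}[N]   &= \alpha\beta = h \cdot \frac{\lambda}{h} = \lambda, \\
\mathrm{Var}[N] &= \alpha\beta(1+\beta)
                 = \lambda\left(1 + \frac{\lambda}{h}\right)
                 = \lambda + \frac{\lambda^{2}}{h}.
\end{aligned}
```

So the expected rate alone fixes the mean, while h controls only the extra-Poisson variance: as h grows, the variance approaches the mean and the Poisson model is recovered.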

Other common Poisson mixture models

Two other variations of Equation 1 are of interest:

a) Number of events = Poisson(Exponential(β))

which is a special case of Equation 1 with α = 1. From Equation 2, and recognizing that NegBin(1, p) = Geometric(p), this simplifies to:

Number of events = Geometric(1/(1+β))

b) Number of events = Poisson(λ + Gamma(α, β))

which is introduced to give more flexibility, e.g. to try to match the variance and skewness of claim frequencies as well as the mean rate. The result is a Delaporte distribution:

Poisson(λ + Gamma(α, β)) = Delaporte(α, β, λ)

We can split this equation up:

Poisson(λ + Gamma(α, β)) = Poisson(λ) + Poisson(Gamma(α, β))

                         = Poisson(λ) + Pólya(α, β)

So the Delaporte distribution models a frequency as having two components: a Poisson variable with a fixed expected rate; and an independent Poisson variable with an expected rate that is itself a random variable.
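The two-component view of the Delaporte distribution suggests one way to compute its probabilities: convolve a Poisson probability mass function with a Pólya one. The sketch below does this directly and checks the resulting mean and variance. It assumes the Gamma scale convention for the Pólya parameters, and the function names and parameter values are illustrative rather than ModelRisk functions.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def polya_pmf(k, alpha, beta):
    """P(X = k) for Poisson(Gamma(alpha, beta)), beta being the Gamma scale (assumed)."""
    p = 1.0 / (1.0 + beta)
    return math.exp(math.lgamma(alpha + k) - math.lgamma(alpha) - math.lgamma(k + 1)
                    + alpha * math.log(p) + k * math.log(1.0 - p))

def delaporte_pmf(k, alpha, beta, lam):
    """Convolution of a Poisson(lam) count with an independent Polya(alpha, beta) count."""
    return sum(poisson_pmf(j, lam) * polya_pmf(k - j, alpha, beta) for j in range(k + 1))

alpha, beta, lam = 1.5, 2.0, 4.0    # illustrative values
probs = [delaporte_pmf(k, alpha, beta, lam) for k in range(300)]
total = sum(probs)
mean = sum(k * q for k, q in enumerate(probs))
variance = sum((k - mean) ** 2 * q for k, q in enumerate(probs))
print(f"total={total:.6f} mean={mean:.4f} variance={variance:.4f}")
```

Because the two components are independent, the mean and variance should simply add: the fixed-rate Poisson part contributes its rate to both, and the Pólya part contributes αβ to the mean and αβ(1 + β) to the variance.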

Use in microbiology

If the number of clusters of organisms in a sample is Poisson(λ) distributed and the number of organisms in a random cluster follows a Logarithmic(θ) distribution, then the total number of organisms in the sample follows a Pólya distribution.
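This compound result can be verified numerically. The sketch below builds the distribution of the total number of organisms with the Panjer recursion for a compound Poisson sum and compares it with a Pólya probability mass function. The page does not state which Pólya parameters result; the mapping used here, α = -λ / ln(1 - θ) and β = θ / (1 - θ), is derived from the probability generating functions and assumes the Logarithmic(θ) form P(Y = y) = -θ^y / (y ln(1 - θ)) together with the Gamma scale convention. All function names and values are illustrative.

```python
import math

def logarithmic_pmf(y, theta):
    """P(Y = y) = -theta**y / (y * ln(1 - theta)) for y = 1, 2, ... (assumed form)."""
    return -theta ** y / (y * math.log(1.0 - theta))

def compound_poisson_pmf(n_max, lam, theta):
    """Panjer recursion for the total of a Poisson(lam) number of Logarithmic(theta) clusters."""
    f = [0.0] + [logarithmic_pmf(y, theta) for y in range(1, n_max + 1)]
    g = [math.exp(-lam)]                       # no clusters means no organisms
    for k in range(1, n_max + 1):
        g.append(lam / k * sum(j * f[j] * g[k - j] for j in range(1, k + 1)))
    return g

def polya_pmf(k, alpha, beta):
    """P(X = k) for Poisson(Gamma(alpha, beta)), beta being the Gamma scale (assumed)."""
    p = 1.0 / (1.0 + beta)
    return math.exp(math.lgamma(alpha + k) - math.lgamma(alpha) - math.lgamma(k + 1)
                    + alpha * math.log(p) + k * math.log(1.0 - p))

lam, theta = 3.0, 0.4                          # illustrative values
alpha = -lam / math.log(1.0 - theta)           # derived mapping, not from the source page
beta = theta / (1.0 - theta)
totals = compound_poisson_pmf(20, lam, theta)
for k in (0, 1, 5, 10):
    print(k, round(totals[k], 6), round(polya_pmf(k, alpha, beta), 6))
```

The two columns should match to rounding error, which is consistent with the statement that the total count is Pólya distributed.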

Zero-modified versions

When modeling or analyzing count data, it is often desirable to modify the probability of zero in the discrete distribution being used, so that it more accurately reflects the probability of "no event occurring". Two types of modification are available:

• Zero-inflated model - we increase the probability of zero.

• Zero-truncated model - we entirely remove the probability of zero events occurring.
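The two modifications can be sketched as simple transformations of the base probability mass function. The code below uses the common textbook definitions: a zero-inflated model mixes a point mass at zero with the base distribution, and a zero-truncated model rescales the probabilities of the positive counts. The parameter name extra_zero and the function names are hypothetical, and the parameterization used by the ModelRisk zero-modified functions may differ.

```python
import math

def polya_pmf(k, alpha, beta):
    """P(X = k) for Poisson(Gamma(alpha, beta)), beta being the Gamma scale (assumed)."""
    p = 1.0 / (1.0 + beta)
    return math.exp(math.lgamma(alpha + k) - math.lgamma(alpha) - math.lgamma(k + 1)
                    + alpha * math.log(p) + k * math.log(1.0 - p))

def zero_inflated_pmf(k, alpha, beta, extra_zero):
    """Mix a point mass at zero (weight extra_zero) with a Polya(alpha, beta)."""
    base = (1.0 - extra_zero) * polya_pmf(k, alpha, beta)
    return extra_zero + base if k == 0 else base

def zero_truncated_pmf(k, alpha, beta):
    """Condition a Polya(alpha, beta) on at least one event having occurred."""
    if k == 0:
        return 0.0
    return polya_pmf(k, alpha, beta) / (1.0 - polya_pmf(0, alpha, beta))

alpha, beta = 2.0, 1.5          # illustrative values
print("base P(0):          ", polya_pmf(0, alpha, beta))
print("zero-inflated P(0): ", zero_inflated_pmf(0, alpha, beta, extra_zero=0.2))
print("zero-truncated P(0):", zero_truncated_pmf(0, alpha, beta))
```

In both cases the modified probabilities still sum to one; only the balance between zero and the positive counts changes.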

ModelRisk functions added to Microsoft Excel for the Pólya distribution

VosePolya generates random values from this distribution for Monte Carlo simulation, or calculates a percentile if used with a U parameter.

VosePolyaObject constructs a distribution object for this distribution.

VosePolyaProb returns the probability mass or cumulative distribution function for this distribution.

VosePolyaProb10 returns the log10 of the probability mass or cumulative distribution function.

VosePolyaFit generates values from this distribution fitted to data, or calculates a percentile from the fitted distribution.

VosePolyaFitObject constructs a distribution object of this distribution fitted to data.

VosePolyaFitP returns the parameters of this distribution fitted to data.

ModelRisk functions for the Zero-Inflated Pólya distribution

VoseZIPolya generates random values from this distribution for Monte Carlo simulation, or calculates a percentile if used with a U parameter.

VoseZIPolyaObject constructs a distribution object for this distribution.

VoseZIPolyaProb returns the probability mass or cumulative distribution function for this distribution.

VoseZIPolyaProb10 returns the log10 of the probability mass or cumulative distribution function.

VoseZIPolyaFit generates values from this distribution fitted to data, or calculates a percentile from the fitted distribution.

VoseZIPolyaFitObject constructs a distribution object of this distribution fitted to data.

VoseZIPolyaFitP returns the parameters of this distribution fitted to data.

ModelRisk functions for the Zero-Truncated Pólya distribution

VoseZTPolya generates random values from this distribution for Monte Carlo simulation, or calculates a percentile if used with a U parameter.

VoseZTPolyaObject constructs a distribution object for this distribution.

VoseZTPolyaProb returns the probability mass or cumulative distribution function for this distribution.

VoseZTPolyaProb10 returns the log10 of the probability mass or cumulative distribution function.

VoseZTPolyaFit generates values from this distribution fitted to data, or calculates a percentile from the fitted distribution.

VoseZTPolyaFitObject constructs a distribution object of this distribution fitted to data.

VoseZTPolyaFitP returns the parameters of this distribution fitted to data.