Copulas | Vose Software

# Copulas

Quantifying dependence has long been a major topic in finance and insurance risk analysis and has led to an intense interest in, and development of, copulas.

But copulas are now enjoying increasing popularity in other areas of risk analysis where one has considerable amounts of data. The rank order correlation employed by most Monte Carlo simulation tools is certainly a meaningful measure of dependence but is very limited in the patterns it can produce.

Copulas offer a far more flexible method for combining marginal distributions into multivariate distributions and offer an enormous improvement in capturing the real correlation pattern. Understanding the mathematics is a little more onerous but is not all that important if you just want to use it as a correlation tool.

In what follows we use the formulae for bivariate copulas to keep them reasonably readable, and show graphs of bivariate copulas, but keep in mind that the ideas extend to multivariate copulas too. I start with an introduction to some copulas from a theoretical viewpoint, and then look at how we can use them in models. Cherubini et al. (2004) is a very thorough and readable exploration of copulas and gives algorithms for their generation and estimation, some of which we use in ModelRisk.

A d-dimensional copula C is a multivariate distribution on [0,1]^d whose marginals are all uniformly distributed U(0,1). Every multivariate distribution F with marginals $F_1, F_2, \ldots, F_d$ can be written as:

$$F(x_1, x_2, \ldots, x_d) = C\big(F_1(x_1), F_2(x_2), \ldots, F_d(x_d)\big)$$

for some copula C (this is known as Sklar's theorem). Because the copula of a multivariate distribution describes its dependence structure, we can use measures of dependence that are copula-based. The concordance measures Kendall's tau and Spearman's rho, as well as the coefficient of tail dependence, can, unlike the linear (Pearson) correlation coefficient, be expressed in terms of the underlying copula alone. I will focus particularly on Kendall's tau, τ, because the relationships between τ and the parameters of the copulas discussed in this section are quite straightforward.

The general relationship between Kendall's tau, τ, of two variables X and Y and the copula C(u,v) of the bivariate distribution function of X and Y is:

$$\tau = 4 \iint_{[0,1]^2} C(u,v)\, dC(u,v) - 1$$

This relationship gives us a tool for fitting a copula to a data set: we simply determine Kendall's tau for the data and then apply a transformation to get the appropriate parameter value(s) for the copula being fitted.
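As a minimal sketch of that fitting route, the snippet below computes Kendall's tau by direct pair counting and then applies the standard textbook transformations for three common copulas: Clayton (parameter = 2τ/(1−τ)), Gumbel (parameter = 1/(1−τ)) and Normal (ρ = sin(πτ/2)). The function names and the small data set are hypothetical, and whether ModelRisk parameterises each copula in exactly this way is an assumption.

```python
import math

def kendall_tau(x, y):
    """Kendall's tau by direct pair counting (O(n^2); assumes no tied values)."""
    n = len(x)
    score = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return 2.0 * score / (n * (n - 1))

def clayton_alpha(tau):
    """Clayton parameter from tau; only defined here for positive dependence."""
    if not 0.0 < tau < 1.0:
        raise ValueError("Clayton needs 0 < tau < 1")
    return 2.0 * tau / (1.0 - tau)

def gumbel_theta(tau):
    """Gumbel parameter from tau (theta = 1 is independence)."""
    if not 0.0 <= tau < 1.0:
        raise ValueError("Gumbel needs 0 <= tau < 1")
    return 1.0 / (1.0 - tau)

def normal_rho(tau):
    """Correlation parameter of a bivariate Normal copula from tau."""
    if not -1.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [-1, 1]")
    return math.sin(math.pi * tau / 2.0)

# Hypothetical paired observations, for illustration only
x = [1.2, 2.5, 0.7, 3.9, 3.1, 4.4]
y = [0.9, 2.0, 1.1, 3.0, 3.6, 4.1]

tau = kendall_tau(x, y)
print("tau   =", tau)
print("alpha =", clayton_alpha(tau))
print("theta =", gumbel_theta(tau))
print("rho   =", normal_rho(tau))
```

For real data sets with ties or many observations, a library routine such as `scipy.stats.kendalltau` would be a more practical choice than the quadratic loop used here.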

### Modeling with copulas

In order to make use of copulas in your risk analysis, you need three things:

1. A method to estimate its parameter(s), which has been described above;
2. A model that generates the copula. In this section I will show you how this is done for the Archimedean and elliptical copulas described above;
3. Functions that use the inversion method to generate values from the marginal distributions to which you wish to apply the copula. Excel offers only a small number of such functions, and they are notoriously inaccurate and unstable.

Let's say that we have a data set of 1000 joint observations of five variables. We fit a Gamma distribution to each variable and correlate the variables with a Normal copula.

The following example model shows how to do this with ModelRisk:

Example model Correlation_with_fitted_copula: using a fitted Normal copula to correlate Gamma variables.
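For readers without ModelRisk to hand, the same three ingredients (fit the marginals, fit the copula, generate through the inverse marginals) can be sketched in Python with NumPy and SciPy. This is not the ModelRisk example model: the data set is synthetic, the Gamma location is fixed at zero, and fitting the Normal copula through pairwise Kendall's tau is just one possible estimation choice.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_obs, n_vars = 1000, 5

# Synthetic stand-in for the 1000 x 5 data set (assumption: not real data)
true_corr = 0.6 * np.ones((n_vars, n_vars)) + 0.4 * np.eye(n_vars)
z = rng.multivariate_normal(np.zeros(n_vars), true_corr, size=n_obs)
data = stats.gamma.ppf(stats.norm.cdf(z), a=2.0, scale=3.0)

# 1. Fit a Gamma distribution to each variable (location fixed at 0)
margins = [stats.gamma.fit(data[:, j], floc=0) for j in range(n_vars)]

# 2. Fit the Normal copula: pairwise Kendall's tau, then rho = sin(pi*tau/2)
tau = np.eye(n_vars)
for i in range(n_vars):
    for j in range(i + 1, n_vars):
        t, _ = stats.kendalltau(data[:, i], data[:, j])
        tau[i, j] = tau[j, i] = t
rho = np.sin(np.pi * tau / 2.0)
# Note: a matrix built pairwise is not guaranteed to be positive definite;
# it is for this data set, but real data may need an adjustment step.

# 3. Generate: correlated Normals -> uniforms -> inverse Gamma CDFs
z_new = rng.multivariate_normal(np.zeros(n_vars), rho, size=5000)
u = stats.norm.cdf(z_new)
samples = np.column_stack(
    [stats.gamma.ppf(u[:, j], *margins[j]) for j in range(n_vars)]
)

print(np.round(rho, 2))
print(samples[:3])
```

Step 3 is the inversion method from the list above: each column of `u` is Uniform(0,1) with the copula's dependence, and `stats.gamma.ppf` maps it onto the fitted marginal.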

### Making a special case of bivariate copulas

In the standard formulation for copulas there is no distinction between a bivariate (only two marginals) and a multivariate (more than two) copula. However, we can manipulate a bivariate copula to greatly extend its applicability.

Sometimes, when creating a certain model, one is interested in a particular copula (say the Clayton copula), but with a greater dependence in the positive tails than in the negative (a Clayton copula has greater dependence in the negative tail than in the positive).

For a bivariate copula it is possible to change the direction of the copulas by calculating 1-X where X is one of the copula outputs. For example, if we have:

| Cell | Contents |
|---|---|
| {A1:A2} | Clayton Copula with a = 8 |
| B1 | =1-A1 |
| B2 | =1-A2 |

A scatter plot of B1:B2 now gives:

ModelRisk offers an extra parameter to control the direction. For Clayton and Gumbel copulas there are four possible directions, but for the Frank copula there are just two because it is symmetric about its centre. The following plots illustrate the four possible bivariate Clayton copulas (1000 samples) with parameter a = 15 and the two possible bivariate Frank copulas (1000 samples) with parameter 21.
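The four Clayton directions can be reproduced with a short sketch. It samples the bivariate Clayton copula by the standard conditional-inversion method and then replaces one or both outputs by 1 minus their value. The direction labels 1 to 4 are hypothetical and are not claimed to match ModelRisk's numbering; `alpha` plays the role of the parameter a in the text.

```python
import random

def clayton_pair(alpha, rng):
    """One (u, v) draw from a bivariate Clayton copula, alpha > 0."""
    u = 1.0 - rng.random()  # in (0, 1], avoids a zero base below
    w = 1.0 - rng.random()
    # Invert the conditional distribution of v given u
    v = (1.0 + u ** -alpha * (w ** (-alpha / (1.0 + alpha)) - 1.0)) ** (-1.0 / alpha)
    return u, v

# Hypothetical labels: which outputs are replaced by 1 - value
FLIPS = {
    1: (False, False),  # standard Clayton: clustering near (0, 0)
    2: (True, True),    # both flipped: clustering near (1, 1)
    3: (True, False),   # clustering near (1, 0)
    4: (False, True),   # clustering near (0, 1)
}

def directed_clayton(alpha, direction, n, seed=0):
    """n samples from the Clayton copula in the requested direction."""
    rng = random.Random(seed)
    flip_u, flip_v = FLIPS[direction]
    samples = []
    for _ in range(n):
        u, v = clayton_pair(alpha, rng)
        samples.append((1.0 - u if flip_u else u, 1.0 - v if flip_v else v))
    return samples

for d in FLIPS:
    pts = directed_clayton(15.0, d, 1000)
    print(d, pts[:2])
```

Plotting each of the four sample sets as a scatter plot shows the tight tail moving from one corner of the unit square to another.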

Estimation of which direction gives the closest fit to data simply requires that one repeat the fitting methods described above, calculate the likelihood of the data for each direction, and select the direction with the maximum likelihood. ModelRisk has bivariate copula functions that do this directly, returning either the parameters of the fitted copula or generating values from a fitted copula.
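The selection procedure can be sketched as follows for the Clayton copula. For each candidate direction the data are flipped back to the standard orientation, the parameter is estimated from Kendall's tau, and the Clayton log-likelihood is evaluated; the direction with the largest log-likelihood wins. The direction labels, function names and test data are hypothetical, and this is not a description of how ModelRisk's own functions are implemented.

```python
import math
import random

EPS = 1e-12  # keeps values strictly inside (0, 1) so the logs stay finite
DIRECTIONS = {1: (False, False), 2: (True, True), 3: (True, False), 4: (False, True)}

def sample_clayton(alpha, n, rng):
    """Conditional-inversion sampler for the bivariate Clayton copula."""
    out = []
    for _ in range(n):
        u, w = 1.0 - rng.random(), 1.0 - rng.random()
        v = (1.0 + u ** -alpha * (w ** (-alpha / (1.0 + alpha)) - 1.0)) ** (-1.0 / alpha)
        out.append((u, v))
    return out

def flip(pairs, flip_u, flip_v):
    return [(1.0 - u if flip_u else u, 1.0 - v if flip_v else v) for u, v in pairs]

def kendall_tau(pairs):
    """Kendall's tau by pair counting (O(n^2); assumes no ties)."""
    n, score = len(pairs), 0
    for i in range(n - 1):
        ui, vi = pairs[i]
        for j in range(i + 1, n):
            prod = (ui - pairs[j][0]) * (vi - pairs[j][1])
            score += (prod > 0) - (prod < 0)
    return 2.0 * score / (n * (n - 1))

def clayton_loglik(pairs, alpha):
    """Log-likelihood of the Clayton density, computed on the log scale."""
    total = 0.0
    for u, v in pairs:
        u = min(max(u, EPS), 1.0 - EPS)
        v = min(max(v, EPS), 1.0 - EPS)
        a, b = -alpha * math.log(u), -alpha * math.log(v)
        m = max(a, b)
        # log(u^-alpha + v^-alpha - 1) without overflowing
        log_s = m + math.log(math.exp(a - m) + math.exp(b - m) - math.exp(-m))
        total += (math.log1p(alpha) + (1.0 + alpha) * (a + b) / alpha
                  - (2.0 + 1.0 / alpha) * log_s)
    return total

def fit_direction(pairs):
    """Return (best direction, its alpha, all results) by maximum likelihood."""
    results = {}
    for d, (fu, fv) in DIRECTIONS.items():
        base = flip(pairs, fu, fv)  # undo the candidate flip
        tau = kendall_tau(base)
        if not 0.0 < tau < 1.0:
            continue  # Clayton cannot represent this orientation
        alpha = 2.0 * tau / (1.0 - tau)
        results[d] = (alpha, clayton_loglik(base, alpha))
    best = max(results, key=lambda d: results[d][1])
    return best, results[best][0], results

# Hypothetical data: Clayton(8) with both outputs flipped
data = flip(sample_clayton(8.0, 300, random.Random(1)), True, True)
best, alpha_hat, results = fit_direction(data)
print(best, alpha_hat)
```

Directions whose flipped data show negative concordance are skipped outright, since a Clayton copula with a positive parameter cannot describe them.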

### An empirical copula

Despite the extra flexibility afforded by copulas over rank order correlation, you can see that they still rely on a symmetrical relationship between the variables: draw a line between (0,0) and (1,1) and you get a symmetric pattern about that line (assuming you didn't alter the copula direction).

Unfortunately, real-world variables tend to have other ideas.

As risk analysts, we put ourselves in a difficult situation if we try to squeeze data into a model that just doesn't fit. The empirical copula gives us a solution. Provided we have a good number of observations, we could bootstrap the ranks of the data to construct an approximation to an empirical copula. We would then use the empirical estimate rank/(n+1) for the quantile that should be associated with a value in a set of n data points.

The main drawback to this method occurs when we have relatively few observations. For example, if we have just nine observations, the empirical copula will only generate values of {0.1, 0.2, ... 0.9} and we can only generate between the 10th and 90th percentiles of the marginal distributions.
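A minimal sketch of the rank bootstrap makes both the method and its limitation concrete. Each joint observation is converted to its vector of rank/(n+1) values, and sampling picks whole observations at random so that the joint pattern is preserved. The data and function names are hypothetical, and this plain bootstrap is not the order-statistics approach used by VoseCopulaData.

```python
import random

def ranks(values):
    """Rank of each value (1 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def empirical_copula(columns):
    """One point per observation, each coordinate being rank / (n + 1)."""
    n = len(columns[0])
    col_ranks = [ranks(col) for col in columns]
    return [tuple(cr[i] / (n + 1) for cr in col_ranks) for i in range(n)]

def sample_empirical_copula(columns, size, seed=0):
    """Bootstrap whole observations so the joint rank pattern is kept."""
    rng = random.Random(seed)
    points = empirical_copula(columns)
    return [rng.choice(points) for _ in range(size)]

# Nine hypothetical joint observations of two variables
x = [12.1, 7.4, 9.9, 15.2, 8.8, 11.0, 13.7, 6.5, 10.3]
y = [30.5, 22.0, 24.1, 41.9, 27.3, 26.2, 35.0, 20.8, 29.4]

draws = sample_empirical_copula([x, y], 1000)
levels = sorted({u for u, _ in draws})
print(levels)
```

With nine observations every generated value is a multiple of 0.1 between 0.1 and 0.9, so the tails of the marginal distributions below the 10th and above the 90th percentile are never sampled.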

This problem can be corrected by applying some order statistics thinking.

The ModelRisk function VoseCopulaData encapsulates that thinking and constructs an empirical copula based on the data. In the model below there are just 21 observations, so any correlation structure is only vaguely known.

The following plots show how VoseCopulaData performs. The large grey dots are the data and the small dots are 3000 samples from the empirical copula. Notice that the copula extends over (0,1) for all variables and fills in the areas between the observations, with the greatest density concentrated around the observations: