Model design | Vose Software

Model design

See also: Building models that are easy to check and modify, Building models that are efficient, Colour coding models for clarity

Risk analysis is about supporting decisions by answering questions about risk. A good risk analyst provides qualitative, and where time and knowledge permit, quantitative, information to decision-makers that are pertinent to their questions. Inevitably, decision-makers must deal with other factors that may not be quantified in a risk analysis, which can be frustrating for a risk analyst when they see their work being 'ignored'. Don't let it be: the best risk analysts remain professionally neutral to the decisions that are made from their work.

The first step to designing a good model is to put yourself in the position of the decision-maker by understanding how the information you might provide connects to the questions they are asking. A decision-maker often does not appreciate all that comes with asking a question in a certain way, and may not initially have worked out all the possible options for handling the risk (or opportunity).

When you believe that you properly understand the risk question or questions that need(s) answering, it is time to brainstorm with colleagues, stakeholders, and the managers about how you might put an analysis together that satisfies the managers' needs. Effort put into the brainstorm stage pays back ten fold: everyone is clear on the purpose of your analysis; the participants will be more co-operative in providing information and estimates; and you can discuss the feasibility of any risk analysis approach.

We recommend you think of mapping out your ideas with Venn diagrams and event trees. Then look at the data (and perhaps expertise for subjective estimates) you believe are available to populate the model. If there are data gaps (there usually are), consider whether you will be able to get the necessary data to fill the gap, and quickly enough to be able to produce an analysis within the decision-maker's time frame.  If the answer is 'no', look for other ways to produce an analysis that will meet the decision maker's needs, or perhaps a sub-set of those needs.

But whatever you do, don't embark on a risk analysis where you know that data gaps will remain and your decision maker is left with no useful support. Some scientists argue that risk analysis can also be for research purposes - to determine where the data gaps lie. We see the value in that determination, of course, but if that is your purpose, state it clearly and don't leave any expectation from the managers that will be unfulfilled.

The following topics offer some quality control tips to help you produce an efficient model that is the most easy to understand, modify and check:

There is a tendency to settle on the form that a risk analysis model will take too early on in the risk analysis process. In part that will be because of a limited knowledge of the available options, but also because people tend not to take a step back and ask themselves what the purpose of the analysis is, and also how it might evolve over time. In this topic we give a short guide to various types of model structure used in risk analysis.

Choice of model structure

Spreadsheet modeling

Spreadsheets, and by that we mean Excel these days, are the most natural and first choice for most people because it is perceived that relatively little additional knowledge is required  to produce a risk analysis model. Monte Carlo add-ins for Excel have made adding uncertainty into a spreadsheet as simple as clicking a few buttons. You can run a simulation and look at the distribution results in a few seconds and a few more button clicks. Monte Carlo simulation software tools for Excel have focused very much on the graphical interfaces to make risk analysis modelling easy: combine that with the ability to track formulae across spreadsheets, imbed graphs, format sheets in many ways, VBA and data importing capabilities and we can see why Excel is so popular. I have even seen a whole trading floor run on Excel using VBA, and not a single recognisable spreadsheet appeared on any dealer's screen.

But Excel has its limitations. ModelRisk overcomes many of them for high level financial and insurance modelling, but there are many types of problems for which Excel is not suitable. Project cost and schedule risk analysis can be done in spreadsheets at a crude level, and a crude level is often enough for large scale risk analysis since we are rarely interested in the minutia that can be built into a project planning model (like you might make with Primevera or Microsoft Project). However, a risk register is better constructed in an electronic database with various levels of access. The problem with building a project plan in a spreadsheet is that expanding the model into greater detail becomes mechanically very awkward, whilst it is a simple matter in project planning software.

In other areas, risk analysis models with spreadsheets have a number of limitations:

1.        They scale very badly meaning that spreadsheets can become really huge when one has a lot of data, or when one is performing repetitive calculations that could be succinctly written in another language (e.g. a looping formula), although one can get round this to some degree with Visual Basic. Our company reviews many risk models built in spreadsheets and they can be vast, often unnecessarily so because there are shortcuts to achieving the same result if one knows a bit of probability mathematics. The next version of Excel will handle even bigger sheets so I predict this problem will only get worse;

2.        They are limited to the two dimensions of a grid, three at a push if one uses sheets as a third dimension: if you have a multidimensional problem you should really think hard before deciding on a spreadsheet. There are a lot of other modelling environments one could use: C++ is highly flexible, but opaque to anyone who is not a C++ programmer. Matlab and, to a lesser extent, Mathematica and Maple, are highly sophisticated mathematical modelling software with very powerful built-in modelling capabilities that will handle many dimensions and can also perform simulations;

3.        They are really slow. Running a simulation in Excel will take hundreds or more times longer than specialised tools. That's a problem if you have a huge model, or if you need to achieve a high level of precision (i.e. require many iterations);

4.        Simulation models built in spreadsheets calculate in one direction meaning that if one acquires new data that can be matched to a forecast in the model, the data cannot be integrated into the model to update the estimates of parameters on which the model was based and therefore produce a more accurate forecast.

5.        Spreadsheets cannot easily handle modelling dynamic systems. There are a number of flexible and user-friendly tools like Simul8 which give very good approximations to continuously varying stochastic systems with many interacting components. I give an example later in this topic. Attempting to achieve the same in Excel is not worth the pain.

Alternatives to spreadsheet modeling
Influence diagrams

Influence diagrams are quite popular - they essentially replicate the mathematics you can build in a spreadsheet, but the modelling environment is quite different. Analytica is the most popular influence diagram tool. Variables (called nodes) are represented as graphical objects (circles, squares, etc) and connected together with arrows (called arcs) which show the direction of interaction between these variables. The visual result is a network that shows the viewer which variables affect which, but you can imagine that such a diagram quickly becomes overly complex, so one builds sub-models. Click on a model object and it opens another view to show a lower level of interaction. A big drawback is that the mathematics and data behind the model are hard to get to, but some people love them. They are certainly very visual.

Event trees

Event trees offer a way to describe a sequence of probabilistic events, together with their probabilities and impacts. They are perhaps the most useful of all the methods for depicting a probabilistic sequence, because they are very intuitive, the mathematics to combine the probabilities is simple and the diagram helps ensure the necessary discipline. Event trees are built out of nodes (boxes) and arcs (arrows):


Example of a simple event tree

The tree starts from the left with a node (in the above diagram, Select Animal to denote the random selection of an animal from some population) and arrows to the right indicated possible outcomes (here whether the animal is infected with some particular disease agent, or not) and their probabilities (p, which would be the prevalence of infected animals in the population, and (1-p) respectively).  Branching out from these boxes are arrows to the next probability event (the testing of an animal for the disease), and attached to these arrows are the conditional probabilities of the next level of event occurring. The conditional nature of the probabilities in an event tree are extremely important to underline. In this example:

     Se = P(Test positive for disease given the animal is infected); and

     Sp = P(Test negative for disease given the animal is not infected)

 Thus, following the rules of conditional probability algebra, we can say, for example:

P(animal is infected and tests positive) = p*Se

P(animal is infected and tests negative) = p*(1-Se)

P(animal tests positive) = p*Se + (1-p)*(1-Sp)

 Event trees are very useful for building up your probability thinking, although they will get quite complex rather quickly. We use them a great deal to help understand and communicate a problem.

Decision trees

Decision trees are like event trees but add possible decision options. They have a role in risk analysis and in fields like petroleum exploration are very popular. They sketch the possible decisions that one might make and the outcomes that might result. Decision tree software (which can also produce event trees) can also calculate the best option to take under the assumption of some user-defined utility function. It often is difficult for decision-makers to be comfortable with defining a utility curve, but they are useful for communicating the logic of a problem.

Example of a simple decision tree. The decision options are to make either of two investments or do nothing with associated revenues as a result. More involved decision trees would include two or more sequential decisions depending on how well the investment went.

Fault trees

Fault trees start from the reverse approach to an event tree. An event tree looks forward from a starting point and considers the possible future outcomes. A fault tree starts with the outcome and looks at the ways it could have arisen. A fault tree is therefore constructed from the right with the outcome and moves to the left with the possible immediate events that could have made that outcome arise, and continues backwards with the possible events that could have made the first set of events arise, etc.

Fault trees are very useful for focusing attention on what might go wrong and why. It has been used in reliability engineering for a long time, but also has applications in areas like terrorism. For example, one might start with the risk of deliberate contamination of a city's drinking water supply and then consider routes that the terrorist could use (pipeline, treatment plant, reservoir, etc) and the probabilities of being able to do that given the security in place.

Discrete event simulation

Discrete Event Simulation (DES) differs from Monte Carlo simulation mainly in that it models the evolution of a (usually stochastic) system over time. It does this by allowing the user to define equations for each element in the model for how it changes, moves and interacts with other elements. Then it steps the system through small time increments and keeps track of where all elements are at any time (e.g. parts in a manufacturing system, passengers in an airport or ships in a harbour). More sophisticated tools can increase the clock steps when nothing is happening, then decrease again to get a more accurate approximation to the continuous behavior it is modelling.

Monte Carlo Simulation

See also: Monte Carlo simulation introduction

This technique involves the random sampling of each probability distribution within the model to produce hundreds or even thousands of scenarios (also called iterations or trials). Each probability distribution is sampled in a manner that reproduces the distribution's shape. The distribution of the values calculated for the model outcome therefore reflects the probability of the values that could occur. Monte Carlo simulation offers many advantages over the other techniques presented above:

·      The distributions of the model's variables do not have to be approximated in any way.

·      Correlation and other inter-dependencies can be modelled.

·      The level of mathematics required to perform a Monte Carlo simulation is quite basic.

·      The computer does all of the work required in determining the outcome distribution.

·      Software is commercially available to automate the tasks involved in the simulation.

·      Complex mathematics can be included (e.g. power functions, logs, IF statements, etc.) with no extra difficulty.

·      Monte Carlo simulation is widely recognised as a valid technique so its results are more likely to be accepted.

·      The behaviour of the model can be investigated with great ease.

·      Changes to the model can be made very quickly and the results compared with previous models.

Monte Carlo simulation is often criticised as being an approximate technique. However, in theory at least, any required level of precision can be achieved by simply increasing the number of iterations in a simulation. The limitations are in the number of random numbers that can be produced from a random number generating algorithm and, more commonly, the time a computer needs to generate the iterations. For a great many problems, these limitations are irrelevant or can be avoided by structuring the model into sections.

We can conclude that, except for the situations described above, that spreadsheet modeling combined with Monte Carlo simulation is the best approach in the majority of risk analyses. Monte Carlo simulation is explained in more depth in the Monte Carlo simulation section.

Read on: Building models that are easy to check and modify