Friday, December 11, 2009

The Gas Pedal: Rise of Structured Finance (Part 2)

The Gaussian Copula

Insurance actuaries have long been aware of a phenomenon called "Broken Heart Syndrome", in which the surviving partner in a romantic relationship tends to die sooner (statistically speaking) than normal after his or her companion dies. There are numerous causal explanations for this, from a rise in catecholamines, cortisol, and other physiological stress agents that would reduce immune system function, to a sheer psychological fatigue and loss of the spark, the will to live. These are all very important details for the medical and counseling professions to contend with.

For a life insurance actuary working with pure statistical data, however, the more pressing problem is how to determine the strengths of the co-dependencies: in the Broken Heart Syndrome context, the lives of two individuals in a romantic couple are not independent variables, but subject to an "if...then" clause that states that the chances of one dying in a given year are greater if his or her partner has died.

The statistical technique used to link two variables is called the "copula". In the late 1990s, a statistician named David Li (trained as an actuary and familiar with the Broken Heart Syndrome modeling problem) came up with the idea that determination of the default probabilities for two mortgage derivatives could be approached from the same perspective that actuaries were using for BHS co-dependencies in human beings. Li did not, as has occasionally been reported, invent the Gaussian copula theorem; he was the first to use it in this particular financial application.

As a general rule, you should become very suspicious when you see the term "Gaussian" applied to anything in financial risk measurement. Gaussian means "normally distributed", and financial market prices feature far too many extreme events for us to state that they behave in accordance with this distribution. For example, the CFO of Goldman Sachs reported in August 2007 that the bank had been hit with "25-sigma events several days in a row." Sigma indicates "standard deviation", the measure of dispersion used in the Gaussian/normal distribution. A single 25-sigma event should never occur in the history of the universe; the chances of rolling several in a row are inconceivable, beyond the laws of nature. If events like this are occurring (and large moves happen fairly frequently in the markets), then it indicates that the wrong statistical distribution is being used and the model is seriously underestimating the risks of extreme events.
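To put a rough number on that claim, here is a minimal Python sketch that computes the one-sided tail probability of a k-sigma move under a standard normal distribution. The function name is mine, purely for illustration, and the calculation assumes that daily moves really are Gaussian, which is exactly the assumption being questioned.

```python
import math


def upper_tail_probability(k_sigma: float) -> float:
    """P(Z > k_sigma) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(k_sigma / math.sqrt(2.0))


# Illustrative thresholds only; nothing here is calibrated to market data.
for k in (2, 3, 5, 10, 25):
    p = upper_tail_probability(k)
    print(f"{k:>2}-sigma daily move: probability of roughly {p:.3e} per day")
```

Under these assumptions the 25-sigma probability comes out on the order of 10^-138 per day, while the universe is only about 5 x 10^12 days old. Seeing even one such day is a strong hint that the distribution, not the market, is what is broken.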

Calculating the risk of a bond default could be done in one of two general ways: top-down or bottom-up. In a bottom-up approach, an analyst would go through the books of the company issuing the bonds and use various techniques, such as debt-coverage ratios and liquidity factors, to try to determine the chances that the company would default on its bond obligations. In a top-down approach, the analyst could look at many similarly rated companies and decide what the chances were of a default based on how many similar companies out there in the investable universe had defaulted in a given year.

In the case of a complex instrument like a CDO, however, an analyst is left with serious problems in the event that either approach is taken. Determining the probability of a particular individual homeowner defaulting by looking up that person's name and history is essentially impossible if one starts with a structured product that has bundled thousands of mortgages together and securitized them; in fact, the purpose of doing the bundling was to create an actuarial regularity that did not depend on the behaviors of any particular homeowners. On the other hand, a top-down approach could also be difficult because CDOs and credit default swaps are relatively new financial instruments, and thus there may not be enough data on actual defaults by issuers to try to statistically determine true investment risks under a variety of market conditions (the sample size would be too small to make statements that carried the necessary degrees of precision).

The solution was to look at the prices of the quasi-insurance contracts---the credit default swaps---that could be purchased to protect the owner against default by a particular issuer. The idea here was that the pricing of the insurance would indicate the risk of default (not an unreasonable assumption, really). The probability of default was backed out of the CDS prices and then fed into the Gaussian copula mechanism.
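As a rough illustration of what "backing out" a default probability can look like, the sketch below uses the common textbook shortcut sometimes called the credit triangle: the annual default intensity is approximated as the CDS spread divided by the loss given default. This is my simplification, not necessarily what any bank or ratings agency actually ran, and the 40% recovery rate and 300 basis point spread are hypothetical inputs chosen only for the example.

```python
import math


def implied_default_probability(spread_bps: float,
                                recovery_rate: float = 0.40,
                                horizon_years: float = 1.0) -> float:
    """Approximate default probability implied by a CDS spread.

    Assumes a constant default intensity (hazard rate) equal to
    spread / (1 - recovery_rate), the so-called credit triangle.
    """
    spread = spread_bps / 10_000.0                   # basis points -> decimal
    hazard_rate = spread / (1.0 - recovery_rate)     # defaults per year
    return 1.0 - math.exp(-hazard_rate * horizon_years)


# Hypothetical quote: 300 bps per year, 40% assumed recovery.
p_one_year = implied_default_probability(300)
p_five_year = implied_default_probability(300, horizon_years=5.0)
print(f"1-year implied default probability: {p_one_year:.2%}")
print(f"5-year implied default probability: {p_five_year:.2%}")
```

Note that the answer is only as good as the CDS price going in: if the insurance is mispriced, the "probability" inherits the error.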

Remember the co-dependency issue that is described by the Broken Heart Syndrome. The probability of a mortgage default may be similarly connected to the probability of another mortgage defaulting; perhaps a general macro scenario, like a recession, could cause both default risks to increase at the same time (there is a bit of a jump in logic involved here: in the Broken Heart Syndrome case, the two lives have a co-dependence because the death of a spouse may indirectly cause the other spouse to die. In the case of a mortgage default, you have a third variable---recession, asteroid, Lex Luthor, whatever---that is creating the lack of independence).

The traditional statistical approach to examining this would be to look for the correlations between the credit default swap prices on the issuers of the mortgage-backed securities contained in a CDO. The problem with using so-called pair-wise correlations in this way is that you end up with a very complicated situation on your hands---the number of correlations grows quadratically as you add more issuers to a CDO, because you are examining how each issuer interacts with every other issuer. The mathematical expression is N(N-1)/2; if you had 100 issuers in a CDO, you'd have to calculate 4,950 different correlation relationships and then come up with a way to figure out your total default correlation risk. The model would become unwieldy very quickly. What the Gaussian copula allowed was for a single correlation input, perhaps the average correlation, to be used. In return, the copula would spit out a single correlation number from which you could determine the riskiness of the tranche.
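To make the "single correlation input" idea concrete, here is a minimal sketch of a one-factor Gaussian copula simulation. It is the simplified textbook version, not Li's exact formulation or any agency's production model: every name shares one common "market" factor, the single parameter rho sets how strongly each name loads on it, and a name defaults when its latent variable falls below a threshold set by its stand-alone default probability. The pool size, 5% default probability, and rho values are all hypothetical.

```python
import math
import random
from statistics import NormalDist


def pairwise_correlation_count(n_names: int) -> int:
    """Number of distinct pairs among n_names issuers: N(N-1)/2."""
    return n_names * (n_names - 1) // 2


def simulate_default_counts(n_names, default_prob, rho, n_trials, seed=0):
    """Simulate the number of defaults per trial under a one-factor Gaussian copula.

    Requires 0 <= rho <= 1. Each latent variable is
    sqrt(rho) * market + sqrt(1 - rho) * idiosyncratic noise.
    """
    rng = random.Random(seed)
    threshold = NormalDist().inv_cdf(default_prob)   # default if latent < threshold
    w_common, w_own = math.sqrt(rho), math.sqrt(1.0 - rho)
    counts = []
    for _ in range(n_trials):
        market = rng.gauss(0.0, 1.0)                 # shared macro shock
        defaults = sum(
            1 for _ in range(n_names)
            if w_common * market + w_own * rng.gauss(0.0, 1.0) < threshold
        )
        counts.append(defaults)
    return counts


print("pairs to estimate for 100 names:", pairwise_correlation_count(100))
for rho in (0.0, 0.3, 0.6):
    counts = simulate_default_counts(100, 0.05, rho, n_trials=2000)
    mean_defaults = sum(counts) / len(counts)
    bad_share = sum(1 for c in counts if c >= 20) / len(counts)
    print(f"rho={rho:.1f}: mean defaults {mean_defaults:.2f}, "
          f"share of trials with 20+ defaults {bad_share:.3f}")
```

The average number of defaults barely moves as rho changes; what changes is how often a large cluster of defaults shows up at once, which is the quantity a tranche holder actually cares about.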

Why was this so important? Because the ratings agencies needed to ascertain the riskiness of these tranches in order to assign them credit ratings. The perceived riskiness was assessed by their default probabilities, which in turn were determined through the use of the Gaussian copula theorem.

There are all kinds of intricacies involved in how the Gaussian copula works and was used to assess the risk of CDO tranches, but the main point to drive home is that the GC, by making extensive use of the attractive, user-friendly properties of the normal distribution, is trading simplicity and ease-of-employment for a (known) tendency of this family of statistical distributions to seriously underestimate the risks of extreme events in the financial markets. Two related, additional problems: "non-stationarity", which means that the key statistical parameter used in these models---correlation---is not stable, but can change without notice; and "insufficiently time homogeneous" data, which means that the data used to obtain the inputs for the model may come from a time period in which prices behaved differently than they can be depended on to behave in the future. The Economist: "There was no guarantee that the future would be like the past, if only because the American housing market had never before been buoyed up by a frenzy of CDOs."

I note that these are not problems that are unique to the Gaussian copula approach: much of modern finance, including Modern Portfolio Theory and the Black-Scholes-Merton Option Pricing Model, is based on the assumption that markets live in a Gaussian world and that parameters are stable.

Many of the articles I have read have been very condescending in suggesting that there was something intrinsically stupid about using the Gaussian copula to determine the joint-default probabilities and arrive at a single scalar for determining risk. I don't agree---the mathematics involved may be straightforward once the particulars are encoded in a valuation algorithm, but getting to that point involved some subtleties that are the province of a small mathematical priesthood. A few serious quants---Paul Wilmott, Nassim Taleb---were pointing out that this was a dangerous practice, but the regulators certainly were not.

How This Helped to Cause Big Blow Ups

The financial instruments involved in these modeling efforts were extremely sensitive to changes in correlation assumptions. Imagine a line of 1,000 dominos, and think of correlation, simplistically but usefully for our purposes, as measuring the inverse of the distance between them (i.e., a high correlation means two dominos are close together; a low correlation means a greater distance). When a domino is standing, it is paying you money; when it falls, you lose money. You push the first domino; your losses depend on how many subsequently fall with it. If you get the *average* correlation even slightly wrong, you may find that you have far greater losses than you ever thought you would.

Many of the investors in CDOs had purchased them because they had been given an investment-grade rating; when default probabilities were shown to be higher than assumed, the ratings agencies had to mark them down. What you basically had here was a grand financial experiment taking place---no one was really sure how these things would behave.

It may be surprising that some of the firms that took very serious losses on CDOs and synthetic CDOs lost money in the senior tranches that were supposed to be insulated from risk and very safe. The reason for this seeming anomaly is that those senior tranches were assumed to have lower correlations and were thus more sensitive to errors in the risk modeling process. Most people knew that the junk level "equity" stuff was pretty dangerous---primary customers were hedge funds who would engage in "long equity, short mezz" trades (buying the high-yield low tranche and shorting the middle, or mezzanine, tranche).

As mortgage-backed securities were purchased by CDOs, the leverage increased. A CDO's highly-rated, "safe" senior tranche could be comprised of risky, subprime mortgages if the modeling assumptions had given them a low joint default probability. When a correlation assumption was shown to be too low, however, the tranche could essentially be wiped out with a speed and violence that was completely unexpected, at least to the holder of a AAA-rated security.
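The sensitivity of a senior tranche to the correlation input can be sketched with a stylized calculation. The code below uses the large-homogeneous-pool approximation to the one-factor Gaussian copula (often attributed to Vasicek), which is my choice for brevity rather than anything the source describes. The 10% default probability, zero recovery, and the 30% and 3% attachment points are hypothetical numbers picked to make the effect visible.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def pool_loss_given_market(market, default_prob, rho):
    """Fraction of a very large pool that defaults, given the market factor.

    Assumes zero recovery and 0 <= rho < 1.
    """
    threshold = STD_NORMAL.inv_cdf(default_prob)
    return STD_NORMAL.cdf((threshold - math.sqrt(rho) * market) / math.sqrt(1.0 - rho))


def expected_tranche_loss(attach, detach, default_prob, rho, steps=4000):
    """Expected loss as a fraction of tranche size, integrating over the market factor."""
    lo, hi = -8.0, 8.0                       # covers essentially all normal mass
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        market = lo + (i + 0.5) * width      # midpoint rule
        pool_loss = pool_loss_given_market(market, default_prob, rho)
        hit = min(max(pool_loss - attach, 0.0), detach - attach) / (detach - attach)
        total += hit * STD_NORMAL.pdf(market) * width
    return total


for rho in (0.05, 0.20, 0.40, 0.60):
    senior = expected_tranche_loss(0.30, 1.00, 0.10, rho)
    equity = expected_tranche_loss(0.00, 0.03, 0.10, rho)
    print(f"rho={rho:.2f}: senior expected loss {senior:.5f}, "
          f"equity expected loss {equity:.3f}")
```

In this toy setup the senior tranche looks nearly riskless at low correlation and gets steadily worse as correlation rises, while the equity tranche moves the other way. A senior tranche rated on a correlation estimate that is too low is therefore carrying far more risk than its rating suggests.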

In my opinion, the problem with applying things like the Gaussian copula to derivatives pricing is not that the attempts are congenitally deranged, it is that they create an aura of scientific respectability that lends itself to false precision. When a non-quant manager, probably an MBA and/or Oxbridge PPE type who is quite clever but lacks formal training in, say, things like Ito's lemma or Taylor series expansions, is fed an output number from a group of math or physics PhDs working in a bank's quantshop, that number may be used with a confidence that is inappropriate (business schools tend to push the Gaussian distribution and "frequentist" statistics, instead of the overarching Bayesian approach that I feel is more appropriate for decision-making under these conditions...much more on this in the future, as it is the real purpose for this whole blog).

There is always a tension between trading profit centers and risk management cells---if the risk managers allowed the firm to always scale to the worst-case scenario, no trades would ever be taken. The "just give me a number" mentality leads to problems even in non-leveraged, plain-vanilla financial models, because the single number that is selected is probably going to be the average of the range. Let's say that a firm is trying to decide on how much production capacity to buy, and that depends on what the annual sales forecast is. The management team takes the average sales forecast of 50,000 widgets and builds its production to meet this. Sounds reasonable, except that it almost guarantees disappointment: if widget sales fall below 50,000, which they will 50% of the time, the firm will have purchased excess production capacity. If widget sales are happily brisk and exceed 50,000, which they will the other 50% of the time, the firm will not be able to take advantage of this because it didn't buy enough production capacity.
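The widget example can be checked with a few lines of simulation. The sketch below assumes demand is normally distributed around the 50,000 forecast with a standard deviation of 10,000; that spread is a number I made up for illustration, and the symmetric shape is what makes the "50% of the time" statement hold.

```python
import random


def average_units_sold(capacity, mean_demand, demand_sd, n_trials=100_000, seed=1):
    """Average units actually sold when sales are capped by production capacity."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        demand = max(rng.gauss(mean_demand, demand_sd), 0.0)  # no negative demand
        total += min(demand, capacity)                        # cannot sell more than built
    return total / n_trials


# Hypothetical inputs: capacity built to the average forecast.
sold = average_units_sold(capacity=50_000, mean_demand=50_000, demand_sd=10_000)
print(f"planned sales: 50,000; average simulated sales: {sold:,.0f}")
```

The plan built on the average forecast expects 50,000 units, but average sales come out lower, because the bad years count in full while the good years are capped at capacity.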

Variations on this "flaw of averages" problem confound all kinds of business modeling attempts---they are present all over the place, although there are methods that can be employed to try to mitigate their effects (Monte Carlo simulation, Real Options Analysis, game theory, etc.). But these all can have their own problems, too: for instance, Monte Carlo simulation requires that you have a good handle on the underlying statistical distribution. Financial markets present some pathological distributions; there is no clear agreement about how best to model them. Running thousands of iterations sampled from the wrong distribution would just give you the same problem of false precision that we have already described.
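Here is a small sketch of how the choice of distribution drives a Monte Carlo answer. It draws the same number of samples from a standard normal and from a Student-t distribution with 3 degrees of freedom rescaled to the same variance, then counts moves beyond five standard deviations. The Student-t is only a convenient stand-in for "fat tails"; I am not claiming that markets follow it, and the sample size and threshold are arbitrary.

```python
import math
import random


def count_extremes(samples, threshold):
    """Count samples whose absolute value exceeds the threshold."""
    return sum(1 for x in samples if abs(x) > threshold)


def normal_samples(n, rng):
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def unit_variance_student_t_samples(n, df, rng):
    """Student-t draws rescaled to variance 1 (requires df > 2).

    Uses t = Z / sqrt(chi2 / df), where chi2(df) is Gamma(shape=df/2, scale=2).
    """
    scale = math.sqrt((df - 2.0) / df)
    draws = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        chi2 = rng.gammavariate(df / 2.0, 2.0)
        draws.append(scale * z / math.sqrt(chi2 / df))
    return draws


rng = random.Random(42)
n = 200_000
normal_hits = count_extremes(normal_samples(n, rng), 5.0)
fat_tail_hits = count_extremes(unit_variance_student_t_samples(n, 3.0, rng), 5.0)
print(f"moves beyond 5 sigma in {n:,} draws: "
      f"normal {normal_hits}, fat-tailed {fat_tail_hits}")
```

Both simulations produce a clean-looking number; only one of them would prepare you for a market that actually has fat tails.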

The methods that are least dependent on getting the assumptions about the distribution right are called "non-parametric." Our firm strongly believes in using non-parametric approaches wherever possible, and they do have the advantage of allowing for a Kalashnikov-like, battlefield-ready robustness. However, non-parametric methods still require that the past be at least somewhat indicative of the future, so you need to test against a very wide range of market conditions. If you have a situation in which past asset price behavior has been mild-mannered, even tame, non-parametric methods will not tell you how to deal with a future that is aggressively hostile and wild. Building a reserve for never-before-seen, catastrophic risks would have prevented these instruments from getting investment-grade quality ratings, which would have made them far less marketable, which would have made entities on all sides of these deals unhappy.

In the next section, we will put the fractional-reserve engine, monetary policy fuel, government affordable housing mandate steering wheel, and structured finance gas pedal together and finally have our spectacular crash.

For Further Reading: Pablo Triana's "Lecturing Birds on Flying" and Riccardo Rebonato's "Plight of The Fortune Tellers" are both excellent. A more accessible and conceptual discussion of the financial risks that come with false precision can be found in Nassim Taleb's "Fooled by Randomness" and "The Black Swan". Those who want to get into exploded detail regarding technical aspects of derivatives pricing will enjoy the forums at, which is the Oxford mathematician Paul Wilmott's site.
