Seduced by Data in the Financial Industry

A while back I wrote about the trouble that can occur when the managers of large organizations overestimate the utility of large data sets and sophisticated statistical tools, with Robert McNamara’s problems in Vietnam as a poster child. In 2010 that example seems remote, so let’s talk about an example that’s closer to home: the financial crisis.

We’ve seen that decision-makers are at greatest risk of being seduced by data when there are several layers of abstraction between themselves and the people on the “front lines.” Consider the case of making a loan. In a traditional small bank, you might have a president who establishes guidelines for the kinds of loans that are issued, and a handful of loan officers who evaluate individual applications and make decisions based on the guidelines. There’s not much danger of the bank president getting out of touch with reality here: if a loan officer thinks the rules don’t make sense, he can probably go down the hall and talk to the president about his concerns.

Things get trickier as the bank gets bigger. There will be more applications to evaluate, and that requires more branches, more loan officers, and a more complex bureaucracy to manage them. This can create problems. For example, if loan officers are paid on commission, they might not have much incentive to report that the rules are too lax.

Still, things get a whole lot worse when securitization comes along. Securitization works like this: a bank approves a bunch of loans to various customers, combines those loans into a big bundle called a Collateralized Debt Obligation, slices them into “tranches” (so that each slice has a piece of each loan), and sells the slices to a bunch of third parties.
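The mechanics are easier to see in a toy sketch. Everything here is invented for illustration: the loan count, default probabilities, and three-tranche split are made up, and real CDOs were vastly more complex. But the waterfall logic, where losses hit the equity tranche first and the senior tranche last, is the core of the structure:

```python
import random

random.seed(0)

# A toy pool: 1,000 identical-principal loans, each with its own
# (made-up) chance of defaulting over the period.
loans = [{"principal": 100_000, "default_prob": random.uniform(0.01, 0.20)}
         for _ in range(1000)]
pool_value = sum(l["principal"] for l in loans)

# Simulate repayment: each loan either pays in full or defaults entirely.
repaid = sum(l["principal"] for l in loans
             if random.random() > l["default_prob"])

# The waterfall: the senior tranche is paid first, then mezzanine,
# then equity. Equivalently, losses are absorbed in the reverse order.
tranches = [("senior", 0.70), ("mezzanine", 0.20), ("equity", 0.10)]
payouts = {}
remaining = repaid
for name, share in tranches:
    owed = share * pool_value
    payouts[name] = min(owed, remaining)
    remaining -= payouts[name]

for name, paid in payouts.items():
    print(f"{name}: received ${paid:,.0f}")
```

With even a modest default rate, the senior slice is paid in full while the equity slice absorbs the losses, which is how so many slices of risky loans could plausibly be sold as safe.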

Now there are many layers of abstraction and bureaucracy between the guy evaluating an individual loan application and the guy who will ultimately lose his money if the loan goes bad. And the incentives have become extremely perverse. The bank earns fees the moment it originates a loan, but it may bear little or no long-term risk for making bad loans. Meanwhile, the assets are so fragmented that it’s not practical for buyers to do due diligence. In hindsight, this seems like an obviously terrible idea. Yet lots of otherwise smart people endorsed the concept. What happened, in a nutshell, is that the financial industry fell prey to the same intellectual error that befell Robert McNamara: mistaking large amounts of data for knowledge.

In the last decades of the 20th century, Wall Street developed what it thought were sophisticated statistical tools that allowed it to accurately estimate the riskiness of complex portfolios without firsthand knowledge of the underlying assets. First, as banks got larger, they increasingly relied on numerical standards like income and credit scores, rather than more subjective personal factors, to decide which loans to approve. This made a certain amount of sense: a larger bank needs consistent standards across the organization.

Second, investors increasingly relied on a handful of large credit ratings agencies who evaluated CDOs for riskiness. The firms creating the CDOs carefully packaged these bundles of securities so that as many slices as possible would get an “investment grade” rating. This was important not only because it gave buyers confidence, but also because government regulations mandated that financial institutions hold a certain fraction of their balance sheets in investment grade assets.

Finally, the buyers themselves developed (ostensibly) sophisticated statistical techniques like value at risk that purported to give an institution a high-confidence estimate of the maximum amount it could lose on a given portfolio. Wall Street had armies of well-paid “quants” whose job it was to compute these figures.
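In its simplest textbook form (“historical simulation”), a value-at-risk number is just a percentile of observed returns. Here is a minimal sketch with invented numbers; note that the returns are drawn from a normal distribution, which is exactly the kind of assumption that failed in practice, since real market returns have much fatter tails:

```python
import random

random.seed(1)

# Hypothetical daily portfolio returns. Gaussian returns are an
# illustrative (and historically dangerous) simplification.
returns = [random.gauss(0.0005, 0.01) for _ in range(1000)]

portfolio_value = 10_000_000
confidence = 0.99

# Historical-simulation VaR: the loss at the worst 1% of observed days.
sorted_returns = sorted(returns)
cutoff = sorted_returns[int((1 - confidence) * len(returns))]
var_99 = -cutoff * portfolio_value
print(f"1-day 99% VaR: ${var_99:,.0f}")
```

The number means “on 99% of days, you won’t lose more than this.” It says nothing about how bad the remaining 1% of days can get, and it is only as good as the historical window it was computed from.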

As the system became more complex and centralized in Wall Street, it became increasingly difficult for the people creating the securities to do a “reality check.” The Wall Street quant who never looked at individual mortgage applications was in precisely the same situation as a general who had never been to the front lines in Vietnam. In both cases, the people on the front lines had a strong incentive to skew the data to make themselves look good. And they began doing just that. Loan officers began encouraging their customers to fudge the information on their loan applications to ensure they’d be approved. Companies came up with ever more elaborate CDO structures designed to convince ratings agencies to give a large fraction of their securities investment-grade ratings. So for Wall Street executives, the numbers looked great right up until the moment their balance sheets imploded.

The fundamental problem, I think, was the size of the firms and the complexity of the financial instruments. No statistic can perfectly summarize a messy, real-world asset. There’s no substitute for understanding the (literal) “facts on the ground”: for knowing something about the individual properties and property owners who are the ultimate anchor for any loan. And the more complex and fragmented an asset is, the harder it is to perform due diligence. Yet as firms grow larger, senior management is forced to rely on increasingly abstract statistical measures. Ken Lewis couldn’t possibly have developed an in-depth understanding of all the CDOs (and other exotic financial instruments) Bank of America was buying in the years before the financial crisis, both because they were extremely complex and because there were just too many of them.

There’s been a lot of talk about the need for a systemic risk regulator to monitor the entire financial sector and raise the alarm if major financial institutions are making loans that are too risky. But if the financial system continues to be structured the way it was in 2007, then it’s not clear how much good such a regulator can do, because he’ll be fundamentally in the same boat as the CEO. The balance sheets of the largest banks were so complex that regulators had no real alternative but to rely on statistics supplied by the bank itself. And that means that the regulator is going to have the same blind spots as the bank’s management.

So the most important part of any reform package has to be limiting the size of financial firms. A financial institution with $2 trillion of assets under management is a recipe for disaster: no executive can possibly manage it responsibly, and no regulator can possibly understand it well enough to conduct meaningful oversight. Smaller firms not only reduce the risk of too-big-to-fail problems, which is important in its own right, but also make it easier for everyone—bank executives, regulators, and members of the general public—to understand what these institutions are doing.

Unfortunately, the Brown-Kaufman amendment to the financial reform bill, which would have limited the growth of large financial firms, failed in May. All signs point to continued consolidation on Wall Street. Which seems like a recipe for another crisis and more bailouts in the coming years.


10 Responses to Seduced by Data in the Financial Industry

  1. Erich Schmidt says:

    Good well-reasoned post, Tim. The solution you outline, though, makes too much sense to have a chance.

  2. Aaron Massey says:

    Great post. I liked the one on McNamara as well. They both point out important problems in data analysis, but the bottom-up solutions appear to be more preventative than responsive. Pretend you’re the guy in charge of a major bank just before the crash and you just realized that you’ve been seduced by data (which may be a pretty difficult task in and of itself). How do you go from too much of the wrong kind of data to just enough of the right kind?

  3. Barry says:

The one caveat about this is that the people running the firm were making the sort of ‘mistakes’ that boosted their pay. To me, that is a sign that these weren’t mistakes. Note that many of these ‘errors’ can be (at least mostly) controlled – for example, a firm making mortgages through brokers can audit a sample, and see what sort of ‘errors’ are in the data, as well as sending testers (mystery shoppers) to the various brokerage firms.

  4. Aaron,

    There are a few things a CEO can do. For one thing, he can insist that the bank focus on investing in simple and transparent assets. If the CEO gets a report that’s full of math that’s too complicated for him to understand, he should think about whether that’s a good business to be in. And as Barry suggests, he can invest more in auditing, creating a separate department whose job it is to go through the underlying assets with a fine-toothed comb and point out ways the data might be misleading.

More fundamentally, though, the problem is that these institutions are just too large. My sense is that it would be in shareholders’ interests (at least, it would be if not for the promise of future bailout money) for a huge bank like Bank of America or Citigroup to break itself up into several smaller business units, each of which would be more manageable. The problem is that asking what the CEO should do is a little bit like asking what Barack Obama can do to reduce executive power. There are a wide variety of perks that go along with running a big company that the CEO is not likely to give up even if doing so is in shareholders’ interests. It has happened: the old AT&T voluntarily broke itself up into several pieces in the late 1990s. But it’s pretty rare.

But the important question is what people who aren’t the CEO should do. First and most obviously, the government should have made some kind of divestiture plan a pre-condition of receiving TARP money. If you’re too big to fail, then you’re too big, period, and you should get smaller. Similarly, boards of directors should be more skeptical of mergers and more friendly to spin-offs. Beyond a certain point, the CEO’s ego is in conflict with the shareholders’ interests.

  5. Don Marti says:

    You can’t ignore the social factors, though. Making risky, high-interest loans to individuals used to be a violation of the norms of mainstream finance. It was something low or disreputable: “usury” or “loan sharking.” The norm collapsed in the 1970s, and Citibank persuaded South Dakota to legalize usury. Now the industry is mainly driven by betting on debtor families in bulk, and counting on high interest to make up for the default rate, instead of making higher-quality loans. The higher the interest rates, the more statistics a bank needs (and gets buried in).

  6. Don: Absolutely, but the two points are not unrelated. A small bank that held its loans on its own books wouldn’t have made the kind of crazy subprime loans that securitization made possible, because the risks would have been too obvious. Once there were faraway hedge funds to take on the long-term risk, banks increasingly saw their role as mere loan origination, with risk management being someone else’s problem. And so the culture shifted to one that prized originating loans as an end in itself.

  7. Timon says:

I agree with your conclusion (no big banks) but I actually think we haven’t even begun to scratch the surface of companies exploiting massive-scale data.

Suppose a cell phone company got together with a major supermarket, and then got together with a credit card issuer that was also evaluating credit risk for a mortgage. If they just apply two years’ worth of absolutely all available data — every place the person has gone and when, a good chunk of what they’ve bought, everyone they’ve called and when — it is inconceivable to me that they would not find monstrously good indicators of who’s a credit risk. I.e., never loan to people who buy Carlo Rossi after 1am; don’t loan to people who call such and such numbers; don’t sell life or health insurance to people whose cars are found moving faster than 80 mph after 2am (easily determined via cell tower triangulation or GPS); do loan to people who buy multigrain bread and never leave the house after 8. The thing is, these clusters and patterns will emerge from the data with no prodding. If you have a bunch of people who hang out at the same location late at night, it could be a shift at the hospital or a biker bar, and you won’t have to look at a map to know which is which: these kinds of inferences will just happen.

I think if anything the risk is that data will be so unreasonably effective that legislatures will be tempted to fine-tune what is legally usable to such a degree that there will effectively be no free market in anything where forward-looking risks are taken into account. We have just recently completely evacuated the concept of “insurance” in the health care market — a pre-existing condition is not the kind of thing which can be insured, any more than a sunken ship can be insured against losses. It is not a stretch to say, for example, that geodata will be ruled out as soon as someone is turned down for a job because their cell-phone spinoff credit report showed that they cruise areas where streetwalkers congregate, and leave fleabag motels late at night a few times a month.

    Also, I think it is too charitable to Wall St. to proceed as if they earnestly applied the most rigorous mathematical analysis in a competitive market for the benefit of their customers. The pseudo-science of CDOs was just a convenient foil to quickly steal as much as possible from as many people as possible, and was dreamt up by people who knew perfectly well that if they were wrong, they would long since have skimmed enough of the booty to not care.

Thanks for the interesting comment, Timon. A couple of thoughts.

    First, one of the big problems here is that (as they say) past performance is not a predictor of future results. The problem wasn’t (just) that the banks were bad at distinguishing (relatively) good credit risks from (relatively) bad ones. FICO scores probably did correlate relatively well with default rates. The problem was that the models they built based on data from 2003, in a rising market, didn’t do a good job of predicting default rates when the bubble started popping in 2008. Throwing more data on the pile wouldn’t have helped with that problem.
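That regime-change problem can be made concrete with a toy example. All the numbers below are invented for illustration; the point is that a model fit during a boom can rank borrowers correctly (worse credit scores really do default more) while being wildly wrong about the levels once the regime changes:

```python
# Invented default rates by credit-score bucket under two regimes.
# When prices rise, distressed borrowers can sell or refinance, so
# defaults stay low everywhere; when prices fall, that escape hatch closes.
boom = {600: 0.04, 650: 0.02, 700: 0.01, 750: 0.005}  # boom-era data
bust = {600: 0.25, 650: 0.15, 700: 0.08, 750: 0.04}   # bust-era data

# A model "trained" on the boom just memorizes boom-era rates per bucket.
model = dict(boom)

for score in sorted(model):
    predicted, actual = model[score], bust[score]
    print(f"Score {score}: predicted {predicted:.1%}, "
          f"actual {actual:.1%}, off by {actual / predicted:.1f}x")
```

More data from the same boom years just makes the model more precisely wrong: it tightens the estimates of boom-era default rates without saying anything about what happens when the regime shifts.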

    On giving Wall Street too much credit: you’re probably right that some people on Wall Street figured out what was happening and used it to their advantage. But I also think a lot of people really were fooled. One data point is that (IIRC) a lot of banks actually held onto a non-trivial fraction of the CDOs they originated. That suggests to me that they actually believed their own smoke and mirrors.

  9. Timon says:

The banks’ shareholders held the crap assets, not the individuals who created them. The bamboozle-embezzlement was from, in order, the customers, the shareholders, then the taxpayers. I am sure many 22-year-old traders thought they were on to something and overestimated the quality of their knowledge, but I guess their role could be chalked up to “useful idiots” (useful to the managers collecting real bonuses based on imaginary earnings).

I completely agree about the inadequacy of 5 years of historic data for making projections; it is also true that the few people who bothered to look at statistically meaningful samples collected over long periods of time, i.e. Case and Shiller, saw the bubble as clear as day. My thought is just that there are, in the phantasmagorically vast new world of data that technology is opening up, possibilities for all kinds of statistical bulls-eyes we aren’t thinking about yet, and it is a bit unfair to attribute the failings of the Wall St. frauds to a data-oriented worldview. What I mean is, FICO scores and mortgage repayment history over 5 boom years may not be meaningful at all — but your cell-phone data over 3 days may tell a person asking the right question everything they need to know, if they have first analyzed a few billion or trillion or quadrillion other records. I really hope someone at Facebook is collating foreclosure data and connecting some dots — wouldn’t be surprised if they could give Fair Isaac a run for his money just on some social-geographical vectors, with no payment history at all.

The other big-picture thing is that a statistical analysis should simplify rather than obscure: yes, this guy is paying 70% of his income in mortgage payments, but hey, this magic machine says he’ll make good. Substituting a complex, abstract process for an obvious and time-tested one is not something you should do lightly, and the enthusiasm banks had for it is probably itself evidence for the prosecution.

  10. Don Marti says:

    Adam Smith’s answer was to cap interest rates at a little above the rate for low-risk loans.

The “consumer credit” racket is not just a waste of mathematical talent in the private sector, but a drain on law enforcement, too.
