The likelihood that a bank loan will default is of interest to both regulators and investors. Under the Basel II regulatory guidelines, a bank must hold capital in proportion to the riskiness of its assets. The probability of default is a primary determinant of the riskiness of a loan. Investors, in turn, price a loan in the secondary market based on its expected cash flow, which again depends on the default probability.
How should market participants assess the default probability on a pool of bank loans? It is natural to consider historical data on loan conditions and default rates, and to estimate a statistical model that can be used to predict defaults going forward. Such statistical models have been used widely across the financial markets, to enhance market liquidity and impose capital requirements on financial institutions.
The accuracy of predictions from statistical models was especially poor in the subprime mortgage market in the period from August 2007 onwards. We argue that one cause for this failure was that these models relied entirely on hard information variables, and ignored changes in the incentives of lenders to collect soft information about borrowers. That is, they failed to account for the change in the relationship between observable borrower characteristics and default likelihood caused by a fundamental change in lender behavior. Such a failure is in the spirit of the Lucas critique (Robert Lucas, 1976): a purely statistical model ignores the idea that a change in the incentives of agents who generate the data may change the very nature of the data.
What changed the behavior of lenders in the subprime market? There was a tremendous growth in securitization in the subprime sector after 2000. Securitization increases the distance between the originator of the loan and the party that bears the default risk inherent in the loan. As Jeremy Stein (2002) points out, soft information is unverifiable to a third party. We argue that the increase in distance therefore results in lenders choosing not to collect soft information (such as the likelihood of future income shocks) about borrowers. Consequently, among borrowers with similar hard information characteristics, the set that receives loans changes in a fundamental way as the securitization regime changes. This leads to a breakdown in the quality of predictions from default models whose parameters are estimated using data from a period in which a low proportion of loans are securitized. Importantly, the breakdown is systematic, and therefore predictable: it occurs in the set of borrowers on whom, after conditioning on the hard information, soft information is potentially important.
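The selection argument can be made concrete with a stylized numerical sketch. It is an illustration only, not the theoretical model outlined in this piece, and every name and number in it is an assumption: borrowers who share the same hard information default with probability `base` if their soft signal is good and `base + gamma` if it is bad, a share `bad_share` of applicants has the bad signal, and a lender who screens funds only the good type. A statistical model fitted to screened loans then predicts `base`, while an unscreened pool defaults at a higher rate.

```python
def default_rate(base, gamma, bad_share, screened):
    """Default rate among funded borrowers with identical hard information.

    base      -- default probability of a borrower with a good soft signal (assumed)
    gamma     -- extra default probability for a bad soft signal (assumed);
                 a proxy for how much soft information matters
    bad_share -- fraction of applicants with a bad soft signal (assumed)
    screened  -- True if the lender collects soft information and rejects
                 the bad type; False if every applicant is funded
    """
    if screened:
        # Only good-signal borrowers receive loans.
        return base
    # The funded pool mixes both types in population proportions.
    return base + gamma * bad_share


def prediction_error(base, gamma, bad_share):
    """Realized minus predicted default rate after the regime change.

    The prediction comes from a model fitted to loans made under screening;
    the realization comes from loans made without screening.
    """
    predicted = default_rate(base, gamma, bad_share, screened=True)
    realized = default_rate(base, gamma, bad_share, screened=False)
    return realized - predicted


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    for gamma in (0.0, 0.1, 0.2):
        err = prediction_error(base=0.05, gamma=gamma, bad_share=0.3)
        print(f"gamma={gamma:.1f}  underprediction={err:.3f}")
```

In this sketch the model never overpredicts, and the shortfall equals `gamma * bad_share`: it vanishes when soft information is irrelevant and grows with its importance, which mirrors the claim that the breakdown is systematic rather than random.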
In this piece, we outline a simple theoretical model that develops this argument, building on the intuition of Gary Gorton and George Pennacchi (1995) that a bank that makes and sells loans is subject to a moral hazard problem with respect to screening borrowers. We then comment on the empirical tests reported in a companion paper (Uday Rajan, Amit Seru, and Vikrant Vig, 2009).