
The Question of Simple Models and Little Data

For some time, I have been wondering why many highly competent authors in quantitative economics work with very stylized models and with very limited data. A few examples:


  • Hsieh/Hurst/Jones/Klenow (Econometrica, forthcoming) could use panel data to get a better idea of individuals' comparative advantage for particular occupations.
  • Lagakos/Waugh (AER) could use panel data to see whether rural-to-urban migrants experience large wage gains (similar to Glaeser/Mare using US data).
  • An extreme example: Manuelli/Seshadri (AER) use essentially no data at all.

Could one not pin down key model parameters, such as "the elasticity" in Manuelli/Seshadri, more precisely with more/better data?

One possible resolution of the puzzle: these papers really point out the possibility that a particular cause-effect mechanism could be empirically important.

To do so, they write down a simple model and calibrate it in a simple way. The point being made would then be rather limited: one can write down a non-nonsensical model and stick in non-nonsensical parameter values and find that the mechanism under study is "big."

Of course, this is not the way the papers are written. They typically contain quantitative statements, such as "the entire rise in the US college wage premium can be accounted for by the changing relative abilities of college graduates" (Hendricks/Schoellman, JME 2014; to point a finger at myself).

An innocent reader (like myself, until recently) might take the quantitative results at face value.

But then it is puzzling that the models are so stylized and that more data are not used to discipline them.

But then: if the papers merely point out a possibility, this puzzle is resolved.

Perhaps the authors understand that possibilities are all we can get from quantitative models. So we might as well proceed with simple examples instead of complicated models and detailed data.

But this leaves us with a major problem: quantitative economics is limited to accumulating potentially important explanations for what we observe.

In many cases, the number of explanations is quite large. Take the case of cross-country income gaps. If we add up the fractions explained by physical capital, human capital, misallocation, capital import frictions, etc., we end up explaining the observed income gaps many times over.
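To make the adding-up problem concrete, here is a minimal sketch of the tally. The shares below are made-up placeholders, not results from any paper; the only point is that shares estimated one mechanism at a time need not sum to one.

```python
# Hypothetical shares of the cross-country income gap, each "explained" by a
# different mechanism in a separate (imaginary) paper. These numbers are
# placeholders for illustration and are not taken from any study.
claimed_shares = {
    "physical capital": 0.5,
    "human capital": 0.8,
    "misallocation": 0.6,
    "capital import frictions": 0.4,
}

# Add up the claimed shares across mechanisms.
total_explained = sum(claimed_shares.values())

print(f"Sum of claimed shares: {total_explained:.1f}")
if total_explained > 1:
    print("The gap is 'explained' more than once over.")
```

With these placeholder values the total exceeds one, which is the pattern described above: each mechanism looks "big" on its own, but together they over-explain the gap.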

Something seems fundamentally wrong here.