Our obsession with using survey data for economic research is ruining economics…

Jonathan Newman has a piece arguing that economic research relies too much on survey data. Most surveys are biased, and even large samples cannot remove these biases:

In a recent Bloomberg article, Noah Smith celebrates the increasing trend of empirical work in economics over the years. Purely theoretical papers are on the decline as a share of all published work. More and more economists are utilizing data to estimate the magnitude of various effects or to estimate specific parameters in theoretical models.

Empirical work is on the rise

The following figure from Angrist et al. (2017) backs up Smith’s claims: empirical work is on the rise.


Of course, the distinction between a purely theoretical paper and an empirical paper in the mainstream is quite different from the Misesian distinction between economic theory and empirical work (history). But the trend is undeniable — economists are using data in more of their research than they used to.

Not only is empirical work in general on the rise, but one particular source of data is more popular than ever: surveys. To proxy the growth in popularity of surveys, I’ve plotted the National Longitudinal Survey citations by year since 1968:


I’d like to focus on labor economics because survey data is especially popular in that field. Other fields also use survey data, though sometimes in an indirect way. Macroeconomists, for example, indirectly and perhaps unwittingly use survey data whenever they use price indices and unemployment rate data from the Bureau of Labor Statistics.

Problems with surveys

Surveys, it turns out, have big problems. The biggest problem is that we can’t trust people to accurately report their own personal information. The accuracy of survey data depends on the respondent’s own attention, memory, and attitude toward the survey.

Economists discounted surveys in the 70s and 80s for this very reason: survey respondents cannot be trusted to reveal accurate information about themselves. People tend to have an inflated view of themselves and exaggerate their personal information. Surveys also suffer from other issues, such as selection bias and the difficulty of identifying causation in panel data.

Selection bias happens when people have certain characteristics that make them more likely to be included in the analysis than the rest of the population. It’s a problem for surveys because it’s difficult to administer surveys randomly. It is easy, by contrast, to survey people from a certain region, at a certain school, who are of a certain age and are willing to take a survey (maybe because they need the extra credit in their economics or psychology class), and then make the flimsy assumption that the sample is representative of the population.
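To make the selection problem concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration and none of it comes from Newman's article: the lognormal "wage" population, the rule that higher earners are less likely to respond, and the function name `simulate_sample_means`. It compares a truly random sample with a convenience sample of the same size.

```python
import random
import statistics


def simulate_sample_means(n, seed=0, population_size=200_000):
    """Return (population mean, random-sample mean, convenience-sample mean).

    Hypothetical setup: hourly wages are lognormal, and the chance that a
    person answers the survey falls as their wage rises.
    """
    rng = random.Random(seed)

    # Assumed population of hourly wages (median around 20).
    population = [rng.lognormvariate(3.0, 0.5) for _ in range(population_size)]

    # A proper random sample: everyone is equally likely to be included.
    random_sample = rng.sample(population, n)

    # A convenience sample: response probability is min(1, 10 / wage),
    # so low earners are over-represented (an illustrative assumption).
    convenience_sample = []
    for wage in population:
        if rng.random() < min(1.0, 10.0 / wage):
            convenience_sample.append(wage)
            if len(convenience_sample) == n:
                break

    return (
        statistics.fmean(population),
        statistics.fmean(random_sample),
        statistics.fmean(convenience_sample),
    )


if __name__ == "__main__":
    for n in (100, 1_000, 50_000):
        pop, rand, conv = simulate_sample_means(n)
        print(f"n={n:>6}  population={pop:.2f}  random={rand:.2f}  convenience={conv:.2f}")
```

Under these assumptions, the random-sample mean settles near the population mean as n grows, while the convenience-sample mean settles near a different, lower value. A larger sample only makes the wrong answer more precise.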

It’s difficult to wrench cause and effect from panel survey data because life events can be related to each other and because of interpersonal differences between the respondents. Since the survey respondents’ lives are the subject matter for the labor economist, there is still no pure laboratory experiment in which one life event is administered to a treatment group and then the researcher compares that group to a control group.

People live their lives and select themselves into college, moving from place to place, marriage, having kids, etc., and these life events are often dependent on other life events that already happened and the individual’s own preferences and personality. These kinds of biases aren’t the type that “average out” if you just get a large enough sample size. Thus isolating the ultimate cause of some labor market outcome is an impossibly complex task, even if survey respondents have perfect memories and are totally honest.
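The self-selection point can be sketched the same way. The simulation below is purely illustrative: the unobserved "ability" variable, the wage equation, the assumed true college effect of 5, and the function name `naive_college_gap` are all hypothetical choices, not anything estimated from real survey data. Respondents are perfectly honest here, yet the naive comparison of college and non-college wages is still off.

```python
import random
import statistics


def naive_college_gap(n, true_effect=5.0, seed=0):
    """Difference in mean wages between college and non-college respondents.

    Hypothetical data-generating process: an unobserved trait ("ability")
    raises wages directly and also makes college attendance more likely.
    """
    rng = random.Random(seed)
    college, no_college = [], []

    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)  # never seen by the researcher

        # Self-selection: people with higher ability are more likely to attend.
        attends = ability + rng.gauss(0.0, 1.0) > 0.0

        # Wages are reported without error; college adds exactly true_effect.
        wage = 20.0 + 4.0 * ability + rng.gauss(0.0, 2.0)
        if attends:
            wage += true_effect

        (college if attends else no_college).append(wage)

    return statistics.fmean(college) - statistics.fmean(no_college)


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        print(f"n={n:>7}  naive gap={naive_college_gap(n):.2f}  (assumed true effect: 5.00)")
```

In this setup the naive gap stays well above the assumed true effect no matter how many respondents are added, because the two groups differ in ability as well as in schooling.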


2 Responses to “Our obsession with using survey data for economic research is ruining economics…”

  1. phuenermund Says:

    The article mentions four general problems with using survey data: (1) social desirability bias, (2) inaccuracy due to inattention of respondents or problems of recall, (3) selection bias, and (4) causality. However, the last two are by no means specific to survey data, and the first two are well understood by researchers. In any form of data analysis you have to pay close attention to the quality of your data. And of course you should not hide behind the label of an “official government survey that is widely used in the literature”. But that doesn’t mean we should dismiss survey data altogether.

    • Amol Agrawal Says:

      Hi. Apologies for the late reply. The idea is not to dismiss survey-based research but to stop thinking of it as the only useful source of research. The problem, as you rightly say, is that most papers hide behind the “official government survey” label.
