
Some Ideas for Honors Thesis Projects

This page contains a collection of ideas that may be useful starting points for honors theses.

Disclaimer: I haven't thought about these ideas carefully, so they may not be good ideas. Or they may have been done (note the date next to each idea). Or they may not be doable. Or boring, or fundamentally mistaken, or any of the other things that tend to derail research projects.

Don't take the ideas as given. Use them to get started thinking about a set of questions.

A good thesis topic explores cause and effect (e.g., whether minimum wages affect unemployment) with some clever identification strategy (e.g., instrumental variables). Most of the topics I am listing here don't fit that pattern. They are really just data descriptions. The reason is that I work with structural models. For the questions that I study, getting answers from data alone is usually not possible.


Do colleges raise local incomes? (2023)

Question: If a city hosts a high quality college (think Harvard), does this raise local incomes or earnings? Does this change the composition of the local economy (think Silicon Valley)?


The key issue is identification. Fortunately, pretty much all high quality colleges were established a long time ago. Perhaps one could argue that their locations are not correlated with other factors that determine today's earnings. For example, the location of land grant universities seems fairly random. Harvard was founded when Boston was just a small settlement.

If this identification argument is compelling, one could simply correlate economic outcomes with the presence of a high quality college and claim causality.
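A minimal sketch of that naive comparison, assuming made-up city records (the tuple layout, the city names, and all numbers are hypothetical). The difference in means equals the OLS coefficient on a college dummy; it only has a causal reading if the identification argument above holds.

```python
from statistics import mean

# Hypothetical city records: (name, hosts a high quality college?, log median income).
# All values are invented for illustration.
cities = [
    ("A", True, 11.2),
    ("B", True, 11.0),
    ("C", False, 10.8),
    ("D", False, 10.9),
    ("E", False, 10.7),
]


def college_income_gap(rows):
    """Mean log income in college cities minus mean log income elsewhere."""
    with_college = [y for _, has_college, y in rows if has_college]
    without_college = [y for _, has_college, y in rows if not has_college]
    return mean(with_college) - mean(without_college)


gap = college_income_gap(cities)
print(f"log income gap: {gap:.2f}")
```

With real data one would add controls and check whether the gap survives, but the raw gap is the natural first number to look at.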

How big a problem is student debt? (2023)

Many reports claim that student debt is a huge problem with many students holding large amounts of debt that never get repaid.

This is strange because

  • the college earnings premium is very high (at least for graduates!)
  • student loan limits are quite low


Questions:

  • who holds large student debt? (grad students; for-profit college students)
  • where do they get large loans? Many of the reported debt amounts are much greater than federal student loan limits.
  • who does not repay? Do they lack funds? Can one see in the data why they may fail to repay?

This is largely a descriptive project, but it might have high value added. I have not seen a good summary of student debt that clearly answers the questions posed above.

Data sources:

  • Survey of Consumer Finances: has info on original loan amounts, remaining debt, income, assets
  • Where to find info by type of college?

Why do students choose specific colleges? (2023)

A lot of college students attend colleges that are "worse" than others they could probably have gotten into. The literature calls this "undermatch." Why does this happen?

More generally, why do students choose specific colleges? How important are academics and earnings vs other factors (close to home, financial aid, ...)?

HERI's CIRP freshmen survey has asked students questions along these lines for many years. What do students say? Why are they undermatched? How do the answers differ by parental background and college quality?

Why do students drop out of college? (2023)

Almost half of all college starters fail to earn a BA degree in six years. They drop out. Why?

Some possibilities:

  • financial shocks
  • poor academic performance
  • students learn that they hate college
  • students planned to drop out
  • students get job offers

Use survey data to find out what students say about why they dropped out. Try to find evidence that dropping out is correlated with observable shocks.

Also of interest: It appears (in NLSY data) that dropouts take few courses, starting in their first year. It would be of interest to show that this predicts dropping out pretty well. Then the question is: why do students take so few courses?
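A rough sketch of how one might check whether a light first-year course load predicts dropping out. The student records and the 20-credit cutoff are assumptions made for illustration; they are not taken from the NLSY.

```python
# Hypothetical student records: (first-year credits attempted, dropped out?).
students = [
    (30, False), (28, False), (12, True), (15, True),
    (24, False), (18, True), (14, False), (31, False),
]

CUTOFF = 20  # assumed threshold for a "light" first-year course load


def dropout_rate(rows):
    """Fraction of students in rows who dropped out."""
    return sum(dropped for _, dropped in rows) / len(rows)


light = [s for s in students if s[0] < CUTOFF]
full = [s for s in students if s[0] >= CUTOFF]
rate_light = dropout_rate(light)
rate_full = dropout_rate(full)

# Share of students classified correctly by the rule "light load => dropout".
accuracy = sum((credits < CUTOFF) == dropped for credits, dropped in students) / len(students)

print(rate_light, rate_full, accuracy)
```

A large gap between the two dropout rates, and high accuracy of the simple rule, would support the claim that course load is a strong early predictor.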

Why do good students attend bad colleges? (2023)

In NLSY97 data, only about 40 percent of students in the top AFQT quartile attend colleges in the top quality quartile (roughly the quality of public flagship universities). Why?

Some possibilities:

  • students could not get accepted at a good school (did they apply?)
  • students chose a specific college because a friend or family member went there
  • students got good financial aid from a mediocre college
  • good colleges are too expensive

Use survey responses about why specific colleges were chosen. Look at other colleges where students were accepted.

Colleges without earnings gains (2022)

A New York Times article discusses why about half of all colleges appear to produce no earnings gains relative to typical high school graduates.

It would be interesting to look more deeply into the (publicly available) data to figure out:

  • how many students do these colleges actually enroll?
  • how many are colleges that specialize in professions with low earnings but high amenities (e.g., the arts)?
  • who attends these colleges? Are there earnings gains relative to what similar students would have earned without college?

College qualities over time

Are college qualities highly persistent over time? Do colleges move around in the quality distribution?

Which colleges move up and why? Is there a mechanism that "rewards" better colleges and forces "worse" colleges to exit?

Related: which colleges grow and which shrink? Do "successful" colleges grow? Does it have anything to do with student earnings?
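One simple way to summarize persistence is the rank correlation of a quality index across two years. The sketch below assumes a hypothetical quality index (for example, mean test scores of entering students) for five made-up colleges; the years and numbers are placeholders.

```python
# Hypothetical quality index for the same colleges in two years.
quality_1990 = {"A": 1200, "B": 1100, "C": 1000, "D": 900, "E": 800}
quality_2020 = {"A": 1250, "B": 1000, "C": 1100, "D": 950, "E": 820}


def ranks(scores):
    """Rank colleges from 1 (highest quality) downward; assumes no ties."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: i + 1 for i, name in enumerate(ordered)}


def spearman(before, after):
    """Spearman rank correlation for colleges present in both years (no ties)."""
    r0, r1 = ranks(before), ranks(after)
    n = len(r0)
    d2 = sum((r0[c] - r1[c]) ** 2 for c in r0)
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


rho = spearman(quality_1990, quality_2020)

# Positive numbers mean the college moved up in the ranking.
rank_1990, rank_2020 = ranks(quality_1990), ranks(quality_2020)
movers = {c: rank_1990[c] - rank_2020[c] for c in quality_1990}

print(rho, movers)
```

A correlation near one means qualities are highly persistent; the `movers` dictionary identifies which colleges to look at more closely.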

Drawbacks: this is descriptive.

College Stratification

Hoxby (2009) shows that colleges became more stratified in the 1960s. Her data end in 2006. They are also not publicly available.

Moreover, Hoxby's data show that initially highly selective colleges became more selective over time, while initially less selective colleges became less selective. A different, but related, question is: did colleges become more homogeneous? How did the CDF of college "qualities" change over time? What happened more recently?

Possible data sources: IPEDS (since about 1985) and HERI freshmen surveys.

Drawback: this is descriptive.

Cross-country Income Differences

Occupational downgrading of immigrants (2020)

Idea: If immigrants from poor countries have less human capital (given schooling), they should be employed in jobs that require less human capital. Those are jobs held by natives with lower schooling.

Quantify this:

  • Construct average native schooling by [occupation, industry].

  • For each source/host pair: compute the average gap between immigrant and native schooling in [occ, ind] cells. This is a measure of occupational downgrading.
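The two steps above can be sketched as follows. This is one possible reading of the measure: for each immigrant, take own schooling minus average native schooling in the same [occ, ind] cell, then average within each source/host pair. The worker records, country codes, and cell labels are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical worker records: (source, host, occupation, industry, years of schooling).
# source == host marks a native worker. All values are invented.
workers = [
    ("US", "US", "cashier", "retail", 12),
    ("US", "US", "cashier", "retail", 10),
    ("US", "US", "engineer", "manufacturing", 16),
    ("US", "US", "engineer", "manufacturing", 18),
    ("MX", "US", "cashier", "retail", 14),
    ("MX", "US", "engineer", "manufacturing", 17),
    ("IN", "US", "cashier", "retail", 16),
]


def native_cell_means(records):
    """Average native schooling by (host, occupation, industry) cell."""
    cells = defaultdict(list)
    for source, host, occ, ind, school in records:
        if source == host:
            cells[(host, occ, ind)].append(school)
    return {cell: mean(vals) for cell, vals in cells.items()}


def downgrading_by_pair(records):
    """Average (immigrant schooling - native cell mean) per (source, host) pair."""
    native_mean = native_cell_means(records)
    gaps = defaultdict(list)
    for source, host, occ, ind, school in records:
        cell = (host, occ, ind)
        # Skip natives and cells without any native workers.
        if source != host and cell in native_mean:
            gaps[(source, host)].append(school - native_mean[cell])
    return {pair: mean(vals) for pair, vals in gaps.items()}


gaps = downgrading_by_pair(workers)
print(gaps)
```

A positive gap means immigrants have more schooling than the natives who hold the same jobs, which is what downgrading would look like in the data.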

To what extent is the wage gap between immigrants and similar natives explained by this?

Jones (2014) has strong claims about downgrading. How do those hold up?

The task content of immigrant jobs (2020)

Todd Schoellman may have done this.


Where do immigrants do well?

... and how has this changed over time? And how about their children?

Related to Abramitzky & Boustan's "Streets of Gold".


Hsieh/Klenow for Immigrants

Hsieh et al. (2019) show that women and black men were underrepresented in certain occupations in the 1960s. Over time, the gaps diminished, suggesting that the allocation of talent improved.

Is there evidence for a similar convergence among immigrants?

Drawback: this is really just a replication of Hsieh et al. (2019) for a different population group.

Sources of earnings "shocks" (2021)

Administrative data show that earnings "shocks" are asymmetric (frequent small positive and rare large negative shocks).

What observable events are associated with earnings shocks?

What fraction of the "shocks" are due to

  • employer changes including layoffs
  • occupation changes
  • big changes in hours worked
  • family events (e.g., having children)
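A small sketch of the accounting exercise, assuming person-year records that pair the annual change in log earnings with a set of event flags. The records, the event names, and the 0.25 cutoff for a "large" change are all hypothetical.

```python
LARGE = 0.25  # assumed cutoff for a "large" change in log earnings

# Hypothetical person-year records: (change in log earnings, observed events).
records = [
    (0.02, set()),
    (0.04, set()),
    (0.03, {"occupation_change"}),
    (-0.60, {"employer_change"}),
    (-0.45, {"employer_change", "hours_change"}),
    (0.30, {"occupation_change"}),
    (-0.35, set()),
    (0.01, {"family_event"}),
]


def event_shares(rows, cutoff=LARGE):
    """Share of large earnings changes that coincide with each event type."""
    large = [events for dlog, events in rows if abs(dlog) > cutoff]
    names = sorted(set().union(*(events for _, events in rows)))
    shares = {name: sum(name in events for events in large) / len(large) for name in names}
    # Large changes with no observed event at all.
    shares["no_observed_event"] = sum(not events for events in large) / len(large)
    return shares


shares = event_shares(records)
print(shares)
```

Events can overlap, so the shares need not sum to one. The "no_observed_event" share is the part that a structural model would have to treat as a pure shock.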

The goal is to inform how one could model earnings shocks (in structural models). There is a recent paper (for which I cannot find a reference) that does something related using administrative data from a Nordic country.


Drawbacks:

  • may be hard to do with publicly available data
  • descriptive

How predictable are lifetime earnings? (2020)

It is fairly easy to get a lower bound on predictability. Take a panel dataset. Use half to fit a statistical model. Use the other half to perform out of sample prediction.
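A minimal sketch of the split-sample idea, using simulated data in place of a real panel. The data-generating process, the single schooling regressor, and the sample sizes are assumptions chosen only to make the example run. The out-of-sample R² is a lower bound because a richer model could predict better.

```python
import random
from statistics import mean

random.seed(0)

# Simulated stand-in for a panel: one row per person with years of schooling
# and log lifetime earnings. The data-generating process is invented.
people = []
for _ in range(2000):
    school = random.randint(10, 18)
    log_earn = 10.0 + 0.1 * school + random.gauss(0.0, 0.5)
    people.append((school, log_earn))

random.shuffle(people)
train, test = people[:1000], people[1000:]


def fit_ols(sample):
    """One-regressor OLS; returns (intercept, slope)."""
    x_bar = mean(x for x, _ in sample)
    y_bar = mean(y for _, y in sample)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in sample)
    sxx = sum((x - x_bar) ** 2 for x, _ in sample)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def out_of_sample_r2(params, sample):
    """R-squared of predictions on a sample that was not used for fitting."""
    intercept, slope = params
    y_bar = mean(y for _, y in sample)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in sample)
    sst = sum((y - y_bar) ** 2 for _, y in sample)
    return 1.0 - sse / sst


params = fit_ols(train)
r2 = out_of_sample_r2(params, test)
print(params, r2)
```

Fitting on one half and evaluating on the other keeps the R² honest for a given specification, but trying many specifications on the same test half reintroduces overfitting.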

Specification search is a problem.

Uncertainty about aggregate shocks (basically uncertainty about the relationship between individual characteristics and earnings) is not measured. But the same is true in structural models.

Skill premium variation across U.S. states / cities

Dispersion has supposedly decreased. Could one empirically explore possible explanations?

This is very open ended without a clear hypothesis or method.

Giannone, Elisa. n.d. “Skill-Biased Technical Change and Regional Convergence”