January 25, 2017

Poverty and Crime

Summary: Many people argue that economic variables such as poverty, unemployment, and income inequality cause crime. Most studies do find that poor individuals have higher than average crime rates. However, many studies do not find that poor areas have higher than average crime rates, or that an area’s crime rate rises when its economic situation worsens. Further still, the mean effect size reported for the relationship between crime and poverty is small, suggesting a weak statistical association. Concerns about the direction of causality, as well as evidence from certain natural experiments, cast further doubt on the idea that poverty has any meaningful causal impact on crime.

Poverty and Crime Among Individuals

Let’s begin by asking a simple question: are poor people more criminal than rich people? The answer is unequivocally yes. Ellis, Beaver, and Wright (2009) reviewed the literature and demonstrated that it consistently shows that the poorer a person is, the more likely they are to be a criminal:


Poverty and Crime Variation Among Areas

Given this, you would expect that the more poor people a region has, the higher its crime rate will tend to be. But the relevant empirical evidence does not clearly show that this is so. Before we go any further, I want you to ask yourself a question: what proportion of studies should find an effect in order for us to have weak, moderate, and strong confidence in it? 50% obviously offers no confidence, since the effect is just as likely to be found as not to be found, and so we have no good reason to suppose it exists. But how much higher, in your subjective opinion, should it be in order to justify varying levels of confidence? My own intuition is that a replication rate of 65% would justify weak confidence, 75% moderate confidence, and 85% strong confidence. What numbers would you choose for these thresholds? Whatever they are, keep them in mind. (Maybe even write them down.)

Proponents of the poverty-causes-crime theory often cite Hsieh and Pugh (1993), which meta-analyzed 34 studies and found that 97% of the 76 relationships reported, all but 2, indicated that poorer areas had higher crime rates. The average correlation between poverty and violent crime was 0.44 and statistically significant. Hsieh and Pugh also analyzed studies on income inequality. In a shocking coincidence, they once again found that 97% of the reported effect sizes were positive and that the average correlation was 0.44.
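To see why a 97% replication rate reads as overwhelming, note that under the null hypothesis of no relationship, each study would be roughly as likely to report a positive sign as a negative one, so 74 or more positives out of 76 would be astronomically unlikely by chance. A minimal sign-test sketch (standard library only; note that simple vote counting of this kind has well-known limitations as a meta-analytic method):

```python
import math

def binom_sf(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the chance of seeing at
    least k positive results out of n if signs were coin flips."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

# Hsieh and Pugh report 74 of 76 poverty-crime relationships positive.
print(f"P(74+ of 76 positive by chance) = {binom_sf(74, 76):.1e}")
```

The probability is effectively zero, which is exactly why a meta-analysis that silently dropped negative studies would look so persuasive.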

However, Hsieh and Pugh’s work is over 20 years old and, more importantly, not consistent with what more recent and larger meta-analyses have found. Consider, for instance, Vieraitis (2000), which meta-analyzed 45 studies, 9 more than Hsieh and Pugh (1993), and found the following:



An even larger meta-analysis on poverty and crime was done in Ellis, Beaver, and Wright’s 2009 Handbook of Crime Correlates. The results can be seen below:





Chiricos (1987) carried out an even larger meta-analysis, of 288 studies, on the relationship between unemployment and crime:


As far as mean effect sizes go, Pratt and Cullen (2005) meta-analyzed 153 studies on poverty and estimated a mean effect size of .253, with 59% of results being statistically significant; 167 studies on inequality, with a mean effect size of .207 and 55% of effects statistically significant; and 204 studies on unemployment, with a mean effect size of .135 and 44% of findings statistically significant. Similarly, Nivette (2011) meta-analyzed 37 studies looking at predictors of national crime rates. For national wealth, the mean effect size was -.055 and not statistically significant. For income inequality, the mean effect size ranged from .224 to .416 depending on how income inequality was measured; in both cases it was statistically significant. Unemployment’s relationship with crime (across only 4 studies) was .043 and not significant.
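One reason effects of this size so often fail to reach significance is statistical power: with a true correlation around .25, a study needs on the order of 120+ cases before it will detect the effect most of the time. A rough power calculation using the Fisher z approximation (a sketch, assuming bivariate normality and a two-sided test at α = .05; the sample sizes are illustrative):

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def power_for_r(r, n, crit=1.96):
    """Approximate power to detect a true correlation r with n cases,
    two-sided alpha = .05, via the Fisher z transformation.
    (Ignores the negligible opposite-tail rejection probability.)"""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 1 - norm_cdf(crit - z)

for n in (60, 120, 240):
    print(f"n = {n:3d}: power to detect r = .25 is {power_for_r(0.25, n):.2f}")
```

At n = 60 such a study is roughly a coin flip to reach significance, which is consistent with only 44–59% of findings being significant in these meta-analyses.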

(An effect size indicates how much we would expect a predictor variable to rise, in standard deviations, given a 1 standard deviation increase in the variable we are predicting. Put another way, it is the % of real world differences statistically accounted for by a predictor variable. [No, this is not what r^2 is.]. So, an effect size of .2 would indicate that 20% of differences in the predicted variables can be statistically explained by the predictor variable. By convention, 0.2 is considered the threshold for a weak effect, 0.5 for a moderate effect, and 0.8 for a strong effect.)

Across these meta-analyses, what we consistently see is that most studies do find that poorer areas are more criminal, but most studies do not find a statistically significant effect. This is likely because, as Pratt and Cullen, and Nivette, demonstrated, the mean effect size is too weak to reliably reach significance in studies of typical size.

Hsieh and Pugh’s analysis is a major outlier, and it is worth asking why. A meta-analysis is supposed to give us a representative look at all the studies on a given association, and Hsieh and Pugh clearly failed to do that. This result is especially puzzling given that, in their literature review, Hsieh and Pugh state: “In the late 1970s and early 1980s, a number of important literature reviews cataloged a variety of studies on crime and economic conditions and all too frequently could not reach a clear conclusion on the existence of even the most simple bivariate associations (Box, 1987; Braithwaite & Braithwaite, 1980; Chiricos, 1987; Clelland & Carter, 1980; Elliott & Ageton, 1980; Tittle, Villemez, & Smith, 1978).” Thus, they were able to cite 6 reviews which complained about the inconsistency of the evidence, but could only find 3 studies that did not find a positive relationship between poverty and crime!

Given the inconsistency of their findings with their own literature review and with every other meta-analysis on the topic that I could find, and given that their study produced identical results for poverty and income inequality, both in the percent of studies that found an effect and in the mean effect size down to the second decimal, I am, quite frankly, inclined to think that their results are fraudulent. That is, they literally threw out studies that didn’t show that poverty was associated with crime. Can I prove this? No. But it is the most plausible explanation I can think of for this bizarre set of facts. Either way, their study is clearly not representative, and the totality of evidence suggests that the relationship between crime and poverty is weak at best.

Poverty, Unemployment, Income Inequality, and Crime Over Time

Of course, what we have looked at so far are just simple cross-sectional relationships. And, as people on the internet are so fond of pointing out, correlation is not causation. So, let’s look at how crime and poverty co-vary over time.

First, here is how poverty and crime in the US have changed over the last 50 years (1):





Over this time period, poverty was actually negatively correlated with the crime rate. That is, as the poverty rate went down, crime went up. Similarly, crime increased during the Roaring Twenties and fell during the Great Depression (Brearley 1932; Wilson 2011). Further in line with this, Ellis, Beaver, and Wright (2009) analyzed 8 studies on the relationship between national wealth and crime over time, and the results were highly inconsistent.
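To make the longitudinal claim concrete, here is how such a correlation is computed. The yearly figures below are invented placeholders shaped like the general postwar pattern (falling poverty, rising-then-falling crime), not the actual Census or UCR numbers:

```python
# Hypothetical yearly figures for illustration only -- NOT the actual
# Census poverty rates or UCR crime rates discussed above.
poverty = [19.0, 17.3, 14.7, 12.6, 12.1, 11.9, 13.0, 14.0, 13.5, 13.1]  # % poor
crime = [1887, 2449, 3243, 3985, 5282, 5950, 5820, 5898, 5275, 4125]    # offenses per 100k

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

r = pearson_r(poverty, crime)
print(f"poverty-crime correlation over this period: r = {r:.2f}")
```

On data with this shape the correlation comes out strongly negative: declining poverty coinciding with rising crime, exactly the opposite of what the poverty-causes-crime theory predicts.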


They also looked at changes in unemployment and crime over time:


Rufrancos et al. (2013) meta-analyzed 35 reported associations between income inequality and crime over time and found only 60% of them to be positive and statistically significant.

Thus, the longitudinal relationship between crime and poverty seems inconsistent at best.

Direction of causality

It is somewhat striking that no consistent or large impact of economics on crime is to be found in the literature. Often, sociologists and criminologists act as if economics is the main driving force behind crime. And yet, the theory doesn’t really even get off the ground. But let’s say that it did. Let’s say that researchers consistently found significant and large cross-sectional and longitudinal effects of unemployment, poverty, and inequality on crime. The poverty-causes-crime theory would still face a major obstacle: the direction of causality.

Imagine that you’re a store owner and crime rises in your area. As a result, your store gets broken into more often, and fewer people are willing to venture into your area, so you get fewer customers. Initially, this cut in profits causes you to give people fewer hours. Then you let a few people go. Eventually, as crime continues to rise, you end up going broke or moving your store to a better area, taking the wealth that you brought to the area with you. And you’re not alone: this happened to several of the local store owners. In this plausible story, rather than poverty causing crime, crime has caused poverty. If this were to occur, we would see a correlation, both at one point in time and over time, between poverty, unemployment, inequality, and crime.

Thus, there are two very different interpretations of the relationship between economics and crime which would explain strong statistical relationships between these variables if they existed. And there is no research (that I know of) strongly suggesting one over the other.

Confounding Variables

As with any statistical association, the other major difficulty here is confounding variables. Many studies on poverty and crime try to control for all sorts of confounders. However, they consistently do not control for psychological variables. This is a serious flaw, because it is easily imaginable that the psychological variables which cause poverty (aggression, stupidity, low self-control, etc.) also cause crime.

Sariaslan et al. (2014) provided evidence that the weak statistical link between poverty and crime is not causal. This study analyzed over half a million Swedes and related their childhood income levels to their future criminality. In line with previous research, it found that children from poor families were more likely than average to grow up to become criminals. However, the exact same thing was true of these poor kids’ siblings, even when their families had become wealthier by the time those siblings were growing up. If poverty itself were what caused poor people’s high crime rates, then kids of ex-poor families that got rich should not be prone to criminality. But they are just as prone to criminality as their siblings who were raised in poverty. This strongly suggests that there is some set of facts about these families which at once makes them more likely to be poor and more likely to be criminal, rather than poverty actually causing crime itself.
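The logic of this sibling design can be illustrated with a toy simulation (all parameters are invented for illustration and are not taken from Sariaslan et al.): a latent family-level trait raises both the odds of childhood poverty and each child's crime risk, while poverty itself does nothing causally. Children of poor families still show elevated crime rates, and siblings raised after the family escapes poverty come out just as criminal as siblings raised in it:

```python
import math
import random

random.seed(42)

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def simulate(n_families=200_000):
    """Toy model of a sibling-comparison design (invented parameters).
    A family-level trait raises both the odds of childhood poverty and
    each child's crime risk; poverty itself has NO causal effect here."""
    crime_poor = n_poor = crime_rich = n_rich = 0
    sib_a = sib_b = n_changed = 0
    for _ in range(n_families):
        trait = random.gauss(0, 1)                       # family-level trait
        poor_early = random.random() < sigmoid(trait - 1)
        escaped = poor_early and random.random() < 0.3   # family later got richer
        # each child's crime risk depends only on the family trait:
        child_a = random.random() < sigmoid(trait - 2)   # raised while family was poor
        child_b = random.random() < sigmoid(trait - 2)   # raised after income rose
        if poor_early:
            n_poor += 1
            crime_poor += child_a
        else:
            n_rich += 1
            crime_rich += child_a
        if escaped:
            n_changed += 1
            sib_a += child_a
            sib_b += child_b
    return (crime_poor / n_poor, crime_rich / n_rich,
            sib_a / n_changed, sib_b / n_changed)

poor_rate, rich_rate, sib_in_poverty, sib_after_escape = simulate()
print(f"children of poor families:     {poor_rate:.3f}")
print(f"children of non-poor families: {rich_rate:.3f}")
print(f"sibling raised in poverty:     {sib_in_poverty:.3f}")
print(f"sibling raised after escape:   {sib_after_escape:.3f}")
```

The poor-family children are markedly more criminal than the non-poor ones, yet the two siblings' rates are statistically indistinguishable, reproducing the Sariaslan-style pattern without any causal effect of poverty.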

Interpreting the Data

So, how should we interpret all this data? Well, here is one interpretation which I find plausible: there is some factor about a subset of people which both makes them poor and makes them criminal. But it isn’t poverty itself. This is why the relationship between poverty and crime at the individual level is strong. The relationship at higher levels of aggregation is weaker because being in a poor area doesn’t make someone any more criminal; it just exposes them more often to that subset of people who are criminal, and poor, for some other reason. And crime doesn’t change consistently with poverty over time because temporary increases in poverty aren’t caused by people’s psychology changing to match the minds of those who are normally poor.

Obviously, this interpretation is not proven. But it fits the data better than any other I can think of. Maybe you can think of a better one. Or, maybe you are aware of some important empirical evidence I am ignoring. If so, feel free to leave a comment.


  1. Poverty rates were taken from the Census and crime rates from the Uniform Crime Reports.


Facebook Comments
  • BadgerWA

    Excellent article. One gripe. “Effect size” is a vague term: https://en.m.wikipedia.org/wiki/Effect_size. Please replace it with the exact statistical metric you are referring to.

  • Jalmari Ikävalko

    Perhaps the root cause is education. Lack of education correlates strongly with both poverty and crime likelihood (http://eml.berkeley.edu/~moretti/lm46.pdf ). Also, given that most crime is committed by relatively young people, but the biggest income gains from education don’t really show up until much later in life (since you’ll be more likely to spend some years on higher degrees, and age-related salary increases in high-tech jobs are larger than in low-paying jobs, et cetera), it would make sense that changes in the quality and availability of education do not show up as quickly in the poverty rate as in the crime rate. An improvement in education could have a lag of ten, even twenty years, on income, while the first effect on crime could be visible within only a few years.

    There’s also evidence that education while incarcerated strongly lowers recidivism rates (http://www.rand.org/pubs/research_reports/RR266.html ), which might give further credence to the interpretation that lack of education is the common cause of both poverty and crime.

  • Emil Kirkegaard

    “An effect size indicates how much we would expect a predictor variable to rise, in standard deviations, given a 1 standard deviation increase in the variable we are predicting. Put another way, it is the % of real world differences statistically accounted for by a predictor variable. [No, this is not what r^2 is.]. So, an effect size of .2 would indicate that 20% of differences in the predicted variables can be statistically explained by the predictor variable.”

    No, that is exactly what r^2 is. This mistake was also made in another post, so I will address it. The confusion is perhaps due to the fact that predictive validity is a non-linear function of the proportion of variance accounted for. In general, the non-squared versions are the most important values to use, but in some cases, one must use the squared versions (e.g. eta^2). One must use the squared versions when one must calculate e.g. how much variance is leftover in a path model.

    A correlation of .50 with an outcome gives you 50% predictive validity, but it is only 25% of the variance. So, one can in fact have 4 uncorrelated variables each with a .50 correlation to the outcome and each accounting for 25% of the variance. These would account for a total of 200% variance according to your claim. A correlation of .10 is 1% variance, so one can have 100 of them, which by your error sums to 1000% variance.

    You cannot meta-analyze studies by counting the number of significant p values like this. This method produces very biased results. You can find the details in Hunter and Schmidt’s book on meta-analysis. The TL;DR version is that the p value depends on sample size too, and studies tend to be too small (have low power), so you get a lot of ‘false negatives’ i.e. studies with p values > alpha (usually .05), but which studied an effect that was not ~0. In this case, the individual-level correlation between poverty and crime is fairly small, so only very large studies would consistently ‘find it’ (produce p values below .05).

    You can however meta-analyze studies by looking at the direction of effect and in this case, one can clearly see the poverty link. However, as Sariaslan (you misspelled his name) showed, this statistical link is not causal.

    • You are confused by the use of the term “differences”. Differences are not equivalent to the statistical concept of variance. I understand why this would be confusing, since in statistics variance is normally what is talked about, but if you say “variance” to most people, they think you mean actual differences, which you do not. That is why I spelled out what I meant before saying it. I didn’t say that a correlation of .2 accounted for 20% of variation; I said it accounted for 20% of differences.

      This may come off as dishonest to you, but I think this is what most people intuitively think you mean when you talk about “variance” in the first place, and so, to most people, this is actually clearer than talking about variance.

      • Emil Kirkegaard

        But a predictor with .20 does not account for 20% of differences. You can have 25 independent .20 predictors. That sums to 500% differences explained. It doesn’t work no matter how you phrase it in terms of % (the math is only right in the specific case of r = 1.00).

        A .20 predictor gives 20% of max predictive validity. That is usually what matters. It’s a very neat thing. For instance, the current best polygenic scores for education can account for some 9% of variance, i.e. correlation of .30. The heritability of educational attainment is about 40% i.e. gene x phenotype correlation is .63. So, despite us only having 22% of the possible genetic variance accounted for, we can already predict with near 50% of max validity (.30/.63).