“Figures often beguile me, particularly when I have the arranging of them myself; in which case the remark attributed to Disraeli would often apply with justice and force: ‘There are three kinds of lies: lies, damned lies, and statistics.’”
— Mark Twain
The debate over America’s response to the ongoing crisis of gun violence reached a fever pitch last month following the mass shooting in Parkland, Florida. I published an article last week calling for all sides of the issue to consider a different perspective: that in America, the rates of violent crimes like homicide and robbery seem more sensitive to differences in economic inequality than to differences in the rate of gun ownership.
The article quickly generated tens of thousands of views and thousands of shares on social media. As expected, it was received with a mix of praise, ambivalence, and criticism. Some critics accused me of manipulating data to arrive at my conclusion. Others questioned the validity of the source data. Some suggested that my central research question was flawed, that in choosing to study all homicides I had somehow(?) excluded gun homicides. Others wondered why I wasn’t studying mass shootings exclusively. Some disputed my choice of analysis techniques, and others went so far as to say that I have no understanding of statistics or data analysis at all — a claim that my clients, colleagues, and former professors might find amusing.
There are many people on all sides of this issue who will struggle to accept that their current opinion might be misinformed and subject to confirmation bias. They have various ways of shooting the messenger, no pun intended, and that’s fine. I’m doing this work because I’m intellectually curious about what lies at the root of the problem and am eager to revise my thinking based on what I learn, not because I expect that everyone who reads it will applaud. Some have found it helpful, and I do appreciate the many honest questions and kind words of encouragement I have received.
To achieve some scale in explaining the nuances of the work to date, I’m addressing some of the most-frequently-asked questions I’ve received. My hope is that readers will use this information to inform their own opinions on the issue and their discourse in this and other forums.
Death and violent crime rates. The best data source that exists for the various death rates by state, intent, cause, and a multitude of other factors is the CDC’s WONDER database. The FBI also provides data on violent crimes through its Uniform Crime Reporting (UCR) database. Both of these are updated annually and provide several decades of historical data.
Firearm ownership rates. There is great variability in the reported statistics for firearm ownership in the United States:
- Estimated guns per capita. This measure, cited in Wikipedia and elsewhere, holds that there are ~101 guns per 100 people in the United States, a number far greater than that of any other country. This is a top-down estimate at the country level. No state-level data exists for this measure, so it is not particularly useful in our analysis.
- Household firearm ownership rate. This measure is derived from primary research that asks respondents something along the lines of, “Do you have any guns in your house?” The measure represents the number of households out of 100 that report having guns. Several different sources for this data are widely cited:
- The journal Injury Prevention publishes an annual survey of ~4,000-5,000 US households that includes questions about gun ownership. The weakness in this survey is a sample size that is insufficient to draw any statistically significant inference at the state level. For example, a national sample of 4,000 implies only 8 or 9 respondents from Washington, DC. One person who read my work went so far as to contact the lead author of a recent study, who confirmed that the sample size was insufficient for state-level reporting; the authors discourage the study from being used in this way due to its design limitations. Nevertheless, state-level data from these studies has worked its way into popular articles on Wikipedia, Vox, and other sites.
- The CDC’s Behavioral Risk Factors Surveillance System (BRFSS) used a much larger sample size, ~200,000 households, and published household firearm ownership rates; The Washington Post republished the 2001 BRFSS ownership data. But the CDC dropped all firearm questions from the BRFSS about 15 years ago, so these comprehensive state-level surveys of household firearm ownership are somewhat dated. However, the BRFSS state-level household ownership estimates are still highly correlated with the most recent Injury Prevention data. I used BRFSS data in my analysis for its larger sample size.
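As a back-of-the-envelope check on the sample-size point above, here is a short Python sketch of how a ~4,000-household national sample spreads across states under proportional sampling. This is my own illustration, not from the original study; the population figures are rough approximations.

```python
# Illustration (not from the cited study): expected respondents per state
# when a national sample is allocated in proportion to population.
# Population figures below are rough approximations.

def expected_respondents(national_sample: int, state_pop: int, us_pop: int) -> float:
    """Expected number of respondents from a state under proportional sampling."""
    return national_sample * state_pop / us_pop

US_POP = 309_000_000   # approximate US population
DC_POP = 600_000       # approximate Washington, DC population (~0.2% of US)

dc_n = expected_respondents(4_000, DC_POP, US_POP)
print(f"Expected DC respondents: {dc_n:.1f}")  # roughly 8
```

Eight or nine respondents is far too few to estimate a state-level ownership rate with any useful precision, which is the crux of the objection to using these surveys at the state level.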
Central research questions
There seems to be much confusion surrounding the various measures being used to quantify the problem:
- Mass shootings are currently defined as shooting incidents having four or more deaths, excluding the perpetrator, and excluding gang or domestic violence incidents. These events occur infrequently, only 90 times in the United States between 1966 and 2012, and generally dominate the news cycle for days or weeks after each event. While the US has more mass shootings than any other country (31% of all incidents worldwide), fewer than 100 people per year die in these tragedies. The numbers are so small and the events so rare that it’s not feasible to do a meaningful study of mass shootings using quantitative statistical analysis techniques on secondary data; any well-designed study of mass shootings will likely require qualitative primary research, which is well outside my budget at present.
- Firearm homicides, including all mass shootings. There are ~10,000 of these per year in the United States.
- Non-firearm homicides, committed by any means other than a firearm. There are ~7,000 of these per year.
- Homicides = firearm homicides + non-firearm homicides
- Firearm suicides. There are ~20,000 of these per year in the United States.
- Non-firearm suicides, which also total ~20,000 per year.
- Suicides = firearm suicides + non-firearm suicides
- “Gun deaths” = firearm homicides + firearm suicides, and sometimes unintentional firearm deaths. These total ~30,000 per year, or about as many people as die in automobile accidents every year in the United States.
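The arithmetic relationships among these measures can be made explicit in a few lines of Python, using the rounded annual counts above:

```python
# Compound measures from the definitions above, using the article's
# rounded annual counts (illustrative; actuals vary year to year).

firearm_homicides     = 10_000
non_firearm_homicides = 7_000
firearm_suicides      = 20_000
non_firearm_suicides  = 20_000

homicides  = firearm_homicides + non_firearm_homicides   # all homicides
suicides   = firearm_suicides + non_firearm_suicides     # all suicides
gun_deaths = firearm_homicides + firearm_suicides        # "gun deaths" (excluding accidents)

print(homicides, suicides, gun_deaths)  # 17000 40000 30000
```

Note that “gun deaths” (~30,000) is dominated by suicides, while cutting across the homicide/suicide boundary — which is exactly the definitional problem discussed below.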
Here is a graphical representation of these measures, to scale, including 10-year average death rates per 100,000 people. Data is from CDC WONDER, as cited above:
Many — perhaps most — gun-control advocates focus on gun deaths, the blue area in the chart above. This is the broadest measure of the firearm-related death rate and includes gun suicides (about 2/3 of the total), gun homicides (about 1/3), and sometimes unintentional gun deaths (that little blue sliver below homicides that’s too small for a label).
Problems with the “gun deaths” measure
Studying “gun deaths” to inform policy decisions is problematic for several reasons.
First, studying how gun deaths are related to guns is about as interesting as studying how drowning is related to water. If there were no guns, there would be no gun deaths. States and countries with fewer guns generally have fewer gun deaths, and those with fewer oceans, rivers, lakes, ponds, swimming pools, bathtubs, and 5-gallon buckets generally have fewer drownings. Got it.
The problem is that factors other than guns can predict the overall suicide rate and homicide rate, and plenty of gun-substitutes exist that can be used to take one’s life or that of another.
What happens to the overall rates of homicide and suicide when there are fewer guns among a population? Do these rates go down, stay the same, or go up, and by how much?
These questions are much more interesting and valuable to study, but the “gun deaths” measure frames the issue in a way that ignores them. Perhaps the goal of a policy change should be to reduce all homicide and suicide deaths as much as possible, not just those related to guns.
Second, there is no statistical relationship between a state’s gun homicide rate and its gun suicide rate. This observation suggests that the idea of combining these two measures into one for the purpose of informing public policy decisions may be misguided.
Here is a scatterplot of each state’s gun suicide rate vs. its gun homicide rate from the same dataset cited above:
There is no pattern here, no apparent relationship between these two variables. The factors that might influence a state’s gun suicide rate appear to be different from those that might influence its gun homicide rate. It might seem reasonable to conclude that gun homicide and gun suicide are two unrelated problems.
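The “no apparent relationship” claim boils down to a Pearson correlation coefficient near zero. Here is a minimal sketch of that computation, using made-up placeholder rates rather than the CDC WONDER data analyzed in the article:

```python
# Sketch: Pearson correlation between two state-level rate series.
# The rates below are invented placeholders, not the article's data.
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical gun-suicide vs. gun-homicide rates for a handful of states:
gun_suicide_rate  = [8.1, 12.4, 5.0, 10.2, 7.3, 11.8]
gun_homicide_rate = [3.2, 1.1, 4.8, 2.0, 3.9, 1.5]

r = pearson_r(gun_suicide_rate, gun_homicide_rate)
print(f"r = {r:.2f}")
```

An r near zero, as in the scatterplot, means knowing a state’s gun suicide rate tells you essentially nothing about its gun homicide rate.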
What about the relationship between the non-gun homicide rate and the gun homicide rate?
That’s a much more interesting correlation. A strong relationship seems to exist between these two variables. As the gun homicide rate goes up, so does the non-gun homicide rate. If you know the value of one variable for a given state, you could guess the value of the other variable within a narrow margin of error. This relationship may imply the existence of one or more common factors that explain the variance in both forms of homicide. And does this strong correlation hint at a substitution effect — meaning that if guns didn’t exist, murderers would find another way to do the deed?
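The “guess the value of the other variable” idea corresponds to fitting a least-squares line and reading off a prediction. A minimal sketch, with invented numbers rather than the actual state data:

```python
# Sketch: ordinary least-squares line predicting non-gun homicide rate
# from gun homicide rate. The rates below are invented placeholders.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Hypothetical state rates where non-gun homicide tracks gun homicide:
gun_hom = [1.0, 2.0, 3.0, 4.0, 5.0]
non_gun = [0.9, 1.6, 2.4, 3.1, 3.9]

slope, intercept = fit_line(gun_hom, non_gun)
predicted = slope * 2.5 + intercept  # predicted non-gun rate at gun rate 2.5
print(f"y = {slope:.2f}x + {intercept:.2f}")
```

When the correlation is strong, as in the article’s scatterplot, the residuals around this line are small — which is what “within a narrow margin of error” means in practice.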
Given a choice between studying gun homicides and gun suicides as a compound measure vs. all homicides as a compound measure, which do you think would be most interesting? Or the most useful, i.e. most likely to discover factors that could explain what’s going on and how we might address these problems effectively with policy change?
I think the answer is obvious. And I’m going to continue studying homicide and suicide, of all causes, as two separate and independent problems.
I think anyone talking about an easy solution to “gun deaths” is likely ignorant — however good their intentions — of just how meaningless this measure might be. It fails to appreciate the differences between its underlying problems of suicide and homicide. It fails to account for any possibility of a substitution effect in either problem.
It does make for a bigger problem to solve and easy sloganeering, supported by clean first-order statistics easily packaged into social media memes that tout the promise of a simple gun control narrative.
But managing policy to a measure that muddles the structure of its underlying problems for marketing purposes? That will lead only to disappointment.
Edit: Sam Fisher asked in a comment below for a non-gun vs. gun suicide scatterplot. Here it is, along with a matrix showing the correlation coefficients for all the measures discussed above.
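For readers who want to reproduce a matrix like that one, here is a generic sketch that computes Pearson r for every pair of measures. The values are placeholders, not the article’s CDC figures:

```python
# Sketch: pairwise Pearson correlation matrix for several measures.
# The state-level rates below are invented placeholders.
import math
from itertools import combinations

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

measures = {                       # placeholder state-level rates
    "gun_homicide":    [3.2, 1.1, 4.8, 2.0, 3.9],
    "nongun_homicide": [2.5, 1.0, 3.9, 1.8, 3.0],
    "gun_suicide":     [8.1, 12.4, 5.0, 10.2, 7.3],
}

matrix = {(a, b): pearson_r(measures[a], measures[b])
          for a, b in combinations(measures, 2)}
for (a, b), r in matrix.items():
    print(f"{a} vs {b}: r = {r:.2f}")
```

Swapping in real per-state rates from CDC WONDER would reproduce the kind of coefficient matrix shown in the edit above.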