## Criminal graphics graphical crime

##### Nov 06, 2024

One of my Twitter followers disliked the following chart showing FBI crime statistics for 2023 (link):

If read quickly, the clear message of the chart is that something spiked on the right side of the curve.

But that isn't the message of the chart. The originator applied this caption: "The age-crime curve last year looked pretty typical. How about this year? Same as always. Victims and offenders still have highly similar, relatively young ages."

So the intended message is that the blue and the red lines are more or less the same.

***

What about the spike on the far right?

If read too quickly, one might think that the oldest segment of Americans went on a killing spree last year. One must read the axis label to learn that elders weren't committing more homicides; what spiked was the number of murderers with "unknown" age.

A quick fix for this is to plot the unknowns as a column chart on the right, disconnecting them from the age distribution. Like this:

***

This spike in unknowns appears consequential: the count is over 2,000, larger than the numbers for most age groups.

Curiously, unknown ages spiked only for offenders, not for victims. So perhaps those are unsolved cases, for which the offender's age is unknown but the victim's age is known.

If that hypothesis is correct, then the same pattern will be seen year upon year. I checked this in the FBI database, and found that every year about 2,000 offenders have unknown age.

In other words, the unknowns cannot be the main story here. Instead of dominating our attention, they should be pushed to the background, e.g. in a footnote.

***

Next, because the number of unknowns is so different between the offenders and victims, comparing two curves of counts is problematic. Such a comparison is based on the assumption that there are similar total numbers of offenders and victims. (There were in fact 5% more offenders than there were victims in 2023.)
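To make the alternative concrete, here is a minimal Python sketch of converting counts into shares of known-age cases, so that the two curves sit on a common scale. The function name `shares_excluding_unknown` and the toy counts are invented for illustration; they are not the FBI figures.

```python
def shares_excluding_unknown(counts, unknown_key="Unknown"):
    """Convert counts by age group into proportions of known-age cases."""
    known = {k: v for k, v in counts.items() if k != unknown_key}
    total = sum(known.values())
    if total == 0:
        raise ValueError("no known-age cases to normalize")
    return {k: v / total for k, v in known.items()}


# Toy counts, NOT the FBI data: chosen only to show the mechanics.
toy_offenders = {"10-19": 30, "20-29": 50, "30-39": 20, "Unknown": 25}
toy_victims = {"10-19": 20, "20-29": 40, "30-39": 40, "Unknown": 1}

offender_shares = shares_excluding_unknown(toy_offenders)
victim_shares = shares_excluding_unknown(toy_victims)

for group in offender_shares:
    print(f"{group}: offenders {offender_shares[group]:.0%}, "
          f"victims {victim_shares[group]:.0%}")
```

Because each set of shares sums to one, the comparison no longer depends on how many unknowns each series carries or on the two totals being equal.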

The red and blue lines are not as similar as one might think.

Take the 40-49 age group. The blue value is 1,746 while the red value is 2,431, a difference of 685, which is about 40 percent of 1,746! If we convert each to proportions, ignoring unknowns, the blue value is 12% compared to the red value of 15%, a difference of 3 percentage points, which is a quarter of 12%.

By contrast, in the 10-19 age group, the blue value is 3,101 while the red value is 2,147, a difference of about 1,000, which is roughly a third of 3,101. Converted to proportions, ignoring unknowns, the blue value is 21% compared to the red value of 13%, a difference of 8 percentage points, which is almost 40% of 21%.
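The arithmetic in the two paragraphs above can be checked with a few lines of Python. This is only a sketch: the helper `compare` is a made-up name, and the shares are entered as the rounded percentages quoted in the text, not recomputed from the underlying totals.

```python
def compare(blue, red):
    """Return (absolute gap, gap relative to the blue value)."""
    gap = abs(blue - red)
    return gap, gap / blue


# Counts quoted in the text.
gap_40s, rel_40s = compare(1746, 2431)
gap_teens, rel_teens = compare(3101, 2147)

# Shares quoted in the text, in percent (unknowns excluded).
pp_40s, rel_share_40s = compare(12, 15)
pp_teens, rel_share_teens = compare(21, 13)

print(f"40-49 counts: gap {gap_40s}, {rel_40s:.0%} of blue")
print(f"10-19 counts: gap {gap_teens}, {rel_teens:.0%} of blue")
print(f"40-49 shares: gap {pp_40s} points, {rel_share_40s:.0%} of blue")
print(f"10-19 shares: gap {pp_teens} points, {rel_share_teens:.0%} of blue")
```

Whether measured in counts or in shares, the relative gaps in both age groups run from a quarter to about 40 percent.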

It's really hard to argue that these age distributions are "similar".

As seen from the above, offenders are much more likely to be younger (10-29 years old) than victims, and they are also much more likely to be 90+! Meanwhile, the victims are more likely to be 60-89.