Minority Report in real life: Predicting crime using data instead of psychics

The weather was quite warm on the evening of March 22. It had been a particularly pleasant walk from the train station to the bar in Lakeview, a pretty decent neighborhood in Chicago. As I was sipping my gin and tonic, I was listening to Brett Goldstein, the Chief Data Officer of Chicago, talk about using distress call data to predict violent events before they occur. This was a meetup of the Data Science Chicago group, and Adam Pah commented on it last week in his blog post. Along with about a hundred other people, I enjoyed hearing more about the serious efforts to increase transparency, make data publicly available through practical interfaces, and find new ways of utilizing the City’s tremendous amounts of data. But I was particularly interested in his work on predictive analytics. The police have limited resources. If we could use past distress call and crime records to accurately calculate the probability of a new event occurring in each neighborhood in each time slot, we could maximize the efficiency of these limited resources. Instead of reacting to a crime, increased patrols in an area could prevent it.
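To make the idea of "probabilities per neighborhood per time slot" concrete, here is a minimal sketch in Python. It is not the City's method, which was not described in detail; it simply counts, for each (neighborhood, hour) slot, the fraction of observed days on which at least one incident was recorded. The records, the neighborhood labels "A" and "B", and the function name `slot_probabilities` are all made up for illustration.

```python
from collections import Counter

# Hypothetical past incident records: (neighborhood, day index, hour of day).
# These values are invented for illustration only.
incidents = [
    ("A", 0, 23), ("A", 1, 23), ("A", 3, 23),
    ("A", 2, 14), ("B", 1, 2),
]
days_observed = 4  # assumed length of the observation window, in days


def slot_probabilities(records, n_days):
    """Estimate P(at least one incident) for each (neighborhood, hour) slot
    as the fraction of observed days on which that slot saw an incident."""
    # A set collapses repeat incidents in the same slot on the same day.
    active = {(hood, day, hour) for hood, day, hour in records}
    days_hit = Counter((hood, hour) for hood, _day, hour in active)
    return {slot: hits / n_days for slot, hits in days_hit.items()}


probs = slot_probabilities(incidents, days_observed)

# Rank slots from most to least risky, as a crude guide to where patrols go first.
ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
for (hood, hour), p in ranked:
    print(f"{hood} at {hour:02d}:00 -> {p:.2f}")
```

A real system would need far more than raw frequencies (seasonality, spatial spillover, reporting bias), but even this toy version shows how a ranked list of slots could guide where limited patrols are sent.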

I left the bar and felt perfectly safe as I went home. I live in Rogers Park, a not-so-decent neighborhood in Chicago. Last night, a man was stabbed half a block away from my building. Today, at 2:30 pm, as I lay sick in bed, a person was shot dead right in front of my building. The police say that two sub-factions of a gang are at war, and apparently I live in the middle of the battlefield. Right now, I am biting my nails, strongly wishing that the data scientists in the city would hurry up and start predicting this stuff accurately, so that I don’t die from a stray bullet as I carry oranges home from the grocery store.

The academic approach to predicting and preventing violent events has all the characteristics of the new Big Data Age. Our capability to collect, store, and analyze large amounts of data has increased exponentially. When it comes to efficient use of information, not all of our mathematical models and predictive algorithms can surpass human instinct, our brains’ powerful, complex heuristics of pattern recognition. However, these (perhaps not as smart) algorithms have an advantage over our almighty brains: the sheer amount of data they can incorporate without any bias. As we come up with cleverer ways to treat all this data, it might just be possible to get one step further than human detectives’ current expert analysis built on personal experience. An even more likely path is to develop models that can simplify and convert huge datasets into a form that expert humans can understand instinctively, thereby utilizing all the tools we have: math, computers, data, and brains with experience.

The U.S. military has been using this approach for some time now. Data on insurgent attacks were used to predict new incidents in the Iraq and Afghanistan campaigns. However, it is still unclear how effective these new counter-insurgency techniques turned out to be. In a military campaign, it is difficult to decouple other effects and evaluate such efforts precisely. A recent Nature news article reports that these military methods are now being evaluated on domestic ground: the police are stopping gangs and drug dealers using military tactics. The counter-insurgency doctrine is being tested in Springfield, Massachusetts. It includes data collection techniques developed by the military, social network analysis to identify key gang leaders, and computational methods to predict possible crimes. Kevin Kit Parker from Harvard University and his team are collecting data of their own to assess the success of this approach. Preliminary results are impressive. According to John Barbieri, deputy chief of police, crime rates in Springfield have dropped 62% since the first year of implementation.

Clearly, there is an interesting ethical discussion here on treating citizens as potential criminals, or how far predicting violent events could go, even perhaps into human rights and presumption of innocence territory. But steering away from it, and focusing on potential crime rates in specific locations, current developments in data analysis sound very promising. It may take some time to perfect, though. The Big Data Age is quite new and shiny, and its tools are still improving. There are many pitfalls we are still learning to avoid. I can’t help but think of an imaginary, terrible episode of Numb3rs—the TV show where they used math to solve crimes. I imagine a scientist looking at a lot of points on a graph depicting crime statistics. He says “Let’s look at this in a log-log plot.” And when he does, his eyes shine brightly. As he draws a straight line in the middle of the clearly curved, horribly noisy data over a single decade, “I’ve got it,” he shouts with joy. “It’s a power law!”
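The joke has a concrete basis. As a small sketch (using NumPy and synthetic data I made up, not any real crime statistics), take a curve that is exponential, which is definitely not a power law, sample it over a single decade, and fit a straight line in log-log space. The fit still yields a high R^2, which is why a nice-looking straight line over one decade proves very little.

```python
import numpy as np

# Synthetic data that is exponential, NOT a power law (invented for illustration).
x = np.logspace(0, 1, 50)   # a single decade: from 1 to 10
y = np.exp(-x / 3.0)

# Fit a straight line in log-log space, as the imaginary scientist does.
log_x, log_y = np.log10(x), np.log10(y)
slope, intercept = np.polyfit(log_x, log_y, 1)

# Coefficient of determination of the straight-line fit.
residuals = log_y - (slope * log_x + intercept)
r_squared = 1.0 - residuals.var() / log_y.var()

print(f"fitted 'exponent': {slope:.2f}, R^2: {r_squared:.2f}")
```

Distinguishing a real power law from its look-alikes takes data spanning several decades and a comparison against alternative models, not just a line through the middle of the points.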

I have no doubt that one day, soon, we will master predictive analytics and our neighborhoods will be safer. This is not a case of “In a future with jetpacks, Robocop will stop crime before it happens.” Preventing violence is within our reach. The only question in my mind is “Will I survive the gang wars to see the day?” With a little luck, and a little help from my friends in the City of Chicago, I hope so.

— Irmak Sirer