Why predictive policing doesn't work
A Carnegie Mellon University simulation shows that even switching from arrest datasets to victim crime reports, the algorithms still flag the areas "at greatest risk" in a skewed and ultimately useless way
(photo: Ann Hermes / The Christian Science Monitor)

Even when fed different starting data, for example victim crime reports instead of arrests, predictive policing systems, the ones that indicate where and when certain crimes are likely to occur based on what they can learn from the past, do not work. They return a picture very different from what actually happened and are substantially useless, because they carry prejudices and imbalances of various kinds. Indeed, while algorithmic bias has been central to the analysis and criticism of automated systems, from facial recognition to platforms for hiring or, precisely, for urban safety, a Carnegie Mellon University study now shows that changing the input data is not enough to solve the problem. Quite the opposite. One of the recurring escape routes for developers of these hotspot-analysis systems, such as Keystats, HunchLab, Palantir, the German Precobs or PredPol (the most widespread in the United States, born in 2008 from a collaboration between the Los Angeles Police Department and UCLA researcher Jeff Brantingham), is to point out that they feed the algorithms with the complaints collected in a given neighborhood rather than with police reports or data on arrests made in that specific area of the city. The Italian KeyCrime, for its part, works differently.
In the paper titled "The effect of differential victim crime reporting on predictive policing systems", Nil-Jana Akpinar and Alexandra Chouldechova of Carnegie Mellon University in Pittsburgh and Maria De-Arteaga of the University of Texas show that using that kind of starting data also produces predictions that are biased, to say the least. In essence, it is of little use, if not outright harmful and discriminatory, in the fight against crime. The researchers built their own algorithm following the same model used by some very popular platforms, including PredPol, which is used by several dozen police departments. They then trained the model on victim crime reports, drawing on data from Bogotá, Colombia, one of the few cities for which independent, district-level data on crime reporting is available.
When the three researchers compared the predictions of their PredPol-like platform with the real data for each district of the Colombian capital, they found a large number of errors. Among them: in a district where few crimes had been reported, the platform predicted roughly 20 percent fewer of the highest-crime areas than there actually were, while in another district with a high number of reported cases it inflated its forecasts by a further 20 percent. In both cases the system gave incorrect, even counterproductive, indications, especially in the first. "Our findings suggest that differential crime reporting rates may shift the areas flagged for greater attention from high-crime but low-reporting districts to high- or medium-crime and high-reporting ones," the researchers write, "and this can lead to bad decisions in terms of both excessive and insufficient surveillance."
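To see the mechanism in concrete terms, here is a minimal sketch (in Python, with invented numbers; it is not the researchers' code and does not use the Bogotá data) of how a naive hotspot model that ranks districts by the number of victim reports it sees shifts attention away from a high-crime but low-reporting district and toward a lower-crime district where reporting rates are higher.

```python
import random

random.seed(0)

# Two hypothetical districts (the numbers are invented for illustration):
# "true_crimes" is the real number of crimes, "reporting_rate" the share that gets reported.
districts = {
    "A": {"true_crimes": 100, "reporting_rate": 0.3},  # high crime, low trust in police
    "B": {"true_crimes": 60,  "reporting_rate": 0.9},  # lower crime, high reporting
}

def simulate_reports(true_crimes, reporting_rate):
    """Each crime is independently reported with probability `reporting_rate`."""
    return sum(random.random() < reporting_rate for _ in range(true_crimes))

# A naive "hotspot" model: allocate attention in proportion to observed reports,
# roughly what happens when a system is trained only on victim-report counts.
reports = {name: simulate_reports(d["true_crimes"], d["reporting_rate"])
           for name, d in districts.items()}
total_reports = sum(reports.values())
total_crimes = sum(d["true_crimes"] for d in districts.values())

for name, d in districts.items():
    predicted_share = reports[name] / total_reports  # what the model "sees"
    actual_share = d["true_crimes"] / total_crimes   # what actually happens
    print(f"District {name}: reports={reports[name]:3d}  "
          f"predicted share={predicted_share:.0%}  actual share={actual_share:.0%}")

# Typical output: district A accounts for over 60% of the real crime but only about
# a third of the reports, so a report-driven model under-patrols A and over-patrols B.
```

In the study's terms, district A plays the role of the high-crime but low-reporting area that ends up under-policed, while district B is over-policed.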
If basing these systems on arrest data leads the algorithms astray, because that data carries the prejudices of police forces (in the United States, African Americans and members of other minorities are arrested in disproportionate numbers) and results in heavier patrolling of certain areas, which in turn produces even more arrests there, relying on victim reports does not change the picture. Staying in the United States, the main field of application of these systems and of research into their distortions worldwide, African American people are reported more often than white people; white people and people from higher social classes report poor Black people more often than the other way around; and African Americans are also more likely to report one another. In short, just as with arrest data, using these alternative datasets still leads to identifying majority-Black neighborhoods as crime hotspots more often than the real figures warrant, and to triggering the usual chain of consequences that in some cases has pushed police departments to give up such unbalanced algorithms.
According to lawyer and researcher Rashida Richardson, who specializes in the study of algorithmic bias at the AI Now Institute in New York, these results reinforce what is already clear evidence about the datasets used in predictive policing analyses: "They lead to biased outcomes that do not contribute to public safety," she told MIT Technology Review. "I believe that many vendors such as PredPol fundamentally do not understand how structural and social conditions influence or distort many forms of crime data."
There are in fact many other factors that can distort the starting data on which the algorithms are trained. Staying with the case examined in the study, whether people report a crime to the police depends largely on how much the community trusts them and on the prospect of actually obtaining justice, or at least an investigation into what was reported. So, Richardson adds, "if you live in a community with historically corrupt or racist police, that will affect whether and how people are willing to report crime." In such a case, a predictive tool of this kind would end up underestimating crime levels and would fail to concentrate the necessary attention on that area, mistaking the lack of complaints for a safer context.