A House Built on Discriminatory Sand
Strategies for Pushing Back on Data-Driven Policing Trends
by Anthony W. Accurso
The National Association of Criminal Defense Lawyers (“NACDL”) released its Task Force report on data-driven predictive policing in September 2021, highlighting the failures of predictive policing and making policy recommendations regarding its use.
In 2017, the NACDL created a Task Force to study data-driven predictive policing in use around the U.S. For two years, Task Force members interviewed numerous witnesses, including technology and industry experts, law enforcement personnel, academics, attorneys, advocates, and community stakeholders.
The Task Force defined “data-driven policing” as “the tools that analyze data to determine where, how, and who to police.” This primarily means software developed for, and marketed to, law enforcement agencies around the country, which often uses artificial intelligence (“AI”) to analyze data and make predictions about when and where crime is likely to occur, or who is likely to be involved in crime.
This topic has received a lot of attention lately as communities around the country are demanding a reimagining of policing, and often-hidden tools and tactics are being revealed to the public, usually as a result of investigative reporting by activists and reform groups.
Recently, Criminal Legal News featured a cover article on this issue written by Matt Clarke, entitled “The Real Minority Report: Predictive Policing Algorithms Reflect Racial Bias Present in Corrupt Historic Databases,” in the November 2021 issue. While this article was a comprehensive accounting of the topic, it could not include the perspective and recommendations provided by the NACDL’s Task Force due to the report’s release date. And while the NACDL’s report reaches many of the same conclusions as those contained in Clarke’s article, it also includes ideas about actions that can be taken by communities to address this problem.
How It Works
The “Holy Grail” of policing is to prevent crime entirely, deterring it by having officers present at a crime scene just before crime occurs or by incapacitating persons attempting to commit crime. After all, everyone wants to live in a safe community. How “safety” is achieved, however, is a matter of some dispute.
Law enforcement has always tried to predict where crime is likely to occur, and various methods of predicting crime—many claiming to have a basis in science—have been used by police. The difference with new, AI-driven systems is the sheer amount of information that can be analyzed at speeds previously unattainable by human analysts alone.
“The application of data mining technologies to domestic security is the attempt to automate certain analytic tasks to allow for better and more timely analysis of existing datasets by identifying and cataloging various threads and pieces of information that may already exist but remain unnoticed using traditional means of investigation. Data mining can provide answers to questions that have not been asked, or even elicit questions for problems that have not yet been identified.”
Applying this idea to the problem of directing police resources, on its face, sounds like a good idea. But these systems rely on massive sets of police information to train them, and this information has been demonstrated to be error-prone and definitely not race-neutral.
Further, to predict future crime based on past crime would necessarily require a complete understanding of all crime in a community. Yet the data used to train predictive policing software are “at best a partial representation of crime in the community, and at worst, a record containing falsified crimes, planted evidence, and racially biased arrests.”
“Though many assume that police data is objective, it is embedded with political, social, and other biases,” according to legal scholar Rashida Richardson. “Indeed, police data is a reflection of the department’s practices and priorities; local, state, or federal interests; and institutional and individual biases. In fact, even calling this information ‘data’ could be considered a misnomer, since ‘data’ implies some type of consistent scientific measurement or approach. In reality, there are no standardized procedures or methods for the collection, evaluation, and use of information captured during the course of law enforcement activities, and police practices are fundamentally disconnected from democratic controls, such as transparency or oversight.”
Elizabeth Joh, Professor of Law at UC Davis, further elaborates on this concept by stating, “[t]he difference between crime data and actual crime reflects a longstanding observation about the police. Policing is not the passive collection of information, nor the identification of every violation of the law. Every action—or refusal to act—on the part of a police officer, and every single decision made by a police department, is also a decision about how and whether to generate data. Crime data doesn’t simply make itself known ... [it] is the end result of many processes and filters that capture some aspect of the crime that actually occurs.”
AI-driven software is trained on these flawed, misrepresentative datasets and cannot automatically remove the biases present in the data. Even where programmers attempt to “correct” for these biases, it is exceedingly difficult, if not impossible, to do so.
“As a result, the presence of bias in the initial dataset leads to predictions that are subject to the same biases that already exist within the dataset,” writes statistician William Isaac. “Further, these biased forecasts can often become amplified if practitioners begin to concentrate resources on an increasingly smaller subset of these forecasted targets.”
The Task Force found that this resulted in “self-perpetuating feedback loops of crime predictions, in which officers patrol neighborhoods that had been disproportionately targeted by law enforcement in the past, and were thus overrepresented in the historical crime data used to train and build predictive crime algorithms.”
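To make this feedback loop concrete, consider the toy simulation below, a minimal sketch that is not drawn from the report or from any vendor’s actual product. It assumes two neighborhoods with an identical underlying crime rate, but one starts with more recorded arrests because it was historically patrolled more heavily. Since the model allocates patrols in proportion to past records, and officers only record what they are present to observe, the over-patrolled neighborhood’s share of the data keeps growing.

import random

TRUE_CRIME_RATE = 0.05           # identical underlying rate in both neighborhoods (assumption)
recorded = {"A": 120, "B": 40}   # biased historical arrest counts: A was over-patrolled
PATROLS_PER_DAY = 100

random.seed(0)
for day in range(365):
    total = sum(recorded.values())
    for area in recorded:
        # The "predictive" model sends patrols in proportion to past records.
        patrols = round(PATROLS_PER_DAY * recorded[area] / total)
        # Crime is only recorded where officers are present to observe it,
        # so more patrols mean more records, even though true rates are equal.
        for _ in range(patrols):
            if random.random() < TRUE_CRIME_RATE:
                recorded[area] += 1

print(recorded)  # Neighborhood A's share of the "crime data" grows steadily

Run year after year, the gap between the two neighborhoods widens even though nothing about the underlying behavior differs, which is precisely the self-perpetuating loop the Task Force describes.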
The end result is that officers do the same thing they have always done, which is to overpolice poor and BIPOC communities. The only difference is that data-driven predictions allow police to obscure how the racist history of policing produces continued racist results, by hiding behind a seemingly scientific, objective facade.
These automated-decision systems are then “trusted above human judgement while simultaneously concealing potential unchecked errors.”
Law enforcement, when left to their own devices and absent any transparency or oversight, will always approach the problem of community safety by increasing the amount, and harshness, of policing. As legal scholar Andrew Ferguson writes, “a black-box futuristic answer is a lot easier than trying to address generations of economic and social neglect, gang violence, and a large-scale underfunding of the educational system.”
Recommendations
The NACDL Task Force produced several ideas about the use of data-driven predictive policing systems, but its overall recommendation is simply that law enforcement should not use such systems at all because the demonstrated drawbacks outweigh any claimed benefits.
Indeed, some police departments have already made this choice. Often, companies peddling these products offer free trials of software and hardware, but after a few years, the police begin to experience “spiraling prices, hard-to-use software, opaque terms of service, and [a] failure to deliver products.”
These experiences culminate at a time when communities are pushing back against this kind of surveillance regime, demanding change and accountability. San Diego was the first city to enact a legislative ban on predictive policing systems, and others have since followed.
Yet the companies that sell these systems approach police departments with grand claims and no up-front costs, a potent and seductive mix, especially when these sales pitches occur outside of the community’s awareness (and may stay that way until budget funds are spent on these systems).
Ultimately, the Task Force acknowledges that police departments will continue to adopt and employ these systems for the time being. It made several supplemental recommendations that can, to some extent, mitigate the worst abuses of power.
Transparency
Police departments are generally reluctant to notify communities when they use novel forms of surveillance or investigative tools. But if they are to gain the trust of the communities they are sworn to protect and serve, they must be transparent in their use of tools like AI-driven predictive systems, as well as the methods of data collection and surveillance used to support or “train” them.
The NACDL supports model legislation proposed by the ACLU that communities can enact at the city level. This legislation, named Community Control Over Police Surveillance (“CCOPS”), requires police to be transparent about using new technologies (including predictive policing systems) and obtain community approval for their use, followed by periodic effectiveness reviews.
This process would ideally include public forums where the police or the proposed vendor presents and describes the new tech to the communities that would be subjected to it.
If a community collectively finds the possible benefit outweighs the social and economic costs, then community stakeholders collaborate with the police to develop written policies governing the use of the tech.
Also, companies that provide predictive policing systems are known for taking measures to prevent meaningful review of how their systems operate, usually invoking trade-secret privilege to prevent disclosure during court cases or public records requests. But this kind of secrecy is incompatible with democratic oversight and scientific review of a tool’s effectiveness. Any vendor who wants to benefit from tax dollars must agree, in writing, that doing business with police departments means giving up their right to assert trade-secret privilege when it comes to court filings and records requests.
During the review process, the police or vendor should also be required to show not only that the product is race-neutral in its proposed methodology and inner workings, but also, on subsequent review, that its actual effect continues to be so.
These kinds of reviews, aided by experts in statistics, computer science, and criminology, would have counseled against predictive policing algorithms built on datasets containing obvious racial biases, and spared communities the economic and social costs of their implementation.
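As one hypothetical illustration of what such an expert-aided, ongoing review might check, an auditor could compare each neighborhood’s share of algorithm-directed stops with its share of independently reported crime and flag large disparities. The sketch below is offered for illustration only; the function name, figures, and the two-to-one threshold are assumptions, not anything prescribed by the Task Force or any statute.

# Hypothetical audit sketch; names, figures, and the 2x threshold are assumptions.
def disparity_audit(stops_by_area, reports_by_area, threshold=2.0):
    """Flag areas whose share of algorithm-directed stops exceeds their
    share of citizen-reported crime by more than `threshold` times."""
    total_stops = sum(stops_by_area.values())
    total_reports = sum(reports_by_area.values())
    flagged = {}
    for area, stops in stops_by_area.items():
        stop_share = stops / total_stops
        report_share = reports_by_area.get(area, 0) / total_reports
        ratio = stop_share / report_share if report_share else float("inf")
        if ratio > threshold:
            flagged[area] = round(ratio, 2)
    return flagged

# Example: one area draws 60% of stops but generates only 20% of citizen reports.
print(disparity_audit({"Area 1": 600, "Area 2": 400},
                      {"Area 1": 200, "Area 2": 800}))   # {'Area 1': 3.0}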
Accountability
Were a community to approve predictive policing systems, part of the ongoing review of their effectiveness would involve requiring police to publish at least annual reports on how the technology is used and on the results of audits of its effectiveness.
Because police databases are riddled with errors, whether from innocent mistakes or from intentional fraud by officers trying to meet quotas, persons must be allowed to know what information the police have accumulated about them, and they must have an effective means of challenging inaccurate, obsolete, or illegally retained data.
This is especially true with regards to gang databases, such as California’s CalGang database or the Chicago Police Department’s (“CPD”) Strategic Subjects List (“SSL”). Activists and journalists have documented a myriad of abuses regarding such databases, and people included on them frequently have no means to challenge their inclusion in the database or any obvious errors in the data.
Former police officers have admitted to falsifying data on persons, whether to meet quotas or to gain the ability to monitor or harass individuals perceived as participating in gang activity. This is a clear violation of the Fourth and Fifth Amendments to the U.S. Constitution, but the lack of police transparency surrounding these secretive lists prevents accountability.
These lists often include information on children, and the data persist after the person becomes an adult. Science has demonstrated that human brains progressively develop decision-making ability through age 25 and that the vast majority of crime-involved youth eventually “age out” of criminal activity. This is why justice systems treat children differently and expunge crime data after a person becomes an adult.
Yet these gang databases do not similarly expunge the data. CalGang “listed 42 infants under the age of one as active gang members,” and CPD’s SSL included “7,700 people who were added to the database before they turned 18, including 52 children who were only 11 or 12 years old at the time of their inclusion.” These persons will face collateral consequences of inclusion on these lists for decades unless there is a means of holding police accountable for the data. These consequences are not limited to harassment by police and heightened scrutiny by the justice system—including tighter bail restrictions and lengthier sentencing—but also extend to other areas of a person’s life.
According to the Task Force report, “an investigation in 2019 revealed that the NYPD does, in fact, routinely disseminate sealed data to third parties, including prosecutors, the news media and housing, immigration and family-court officials.” Further, “[a]llegations of gang involvement have also been found to deter students in Chicago from accessing their neighborhood schools, with researchers at the University of Illinois at Chicago claiming that Chicago Public Schools can refuse to admit young people with alleged gang designations.”
Yet this need not be the case, as communities can enact legislation, like CCOPS, that limits the kind of data the police may collect and retain, holding these departments accountable for collection of “big data.”
Courts
Because of the so-far secretive nature of predictive policing systems and surveillance programs that support them, criminal defendants have been denied many of the protections that are guaranteed by the U.S. Constitution.
Defense attorneys are rarely provided evidence relating to use of predictive policing or surveillance technologies, only infrequently discovering them in the course of a trial. This frustrates the right of defendants to challenge the nature and veracity of any evidence used against them, including how it was collected.
Police must possess reasonable suspicion to believe a person may be engaged in a crime in order to stop or detain them. But if a person is on a gang list and is present in a location where software predicts a crime will occur, do police then have the reasonable suspicion required to conduct an investigative detention? What if the information used to support the software’s determinations is erroneous or fraudulent? Should police be allowed to plead good-faith reliance on these flawed, and often racist, conclusions?
These questions raise concerns about the Fourth and Fifth Amendment rights of criminal defendants, and answering them is often frustrated by police reluctance to disclose the use of software tools when making decisions to detain people. Even when their use is disclosed, vendors often assert trade-secret privilege to prevent defense attorneys from exposing flaws in the data or the algorithm itself.
While CCOPS legislation is not the solution to all these problems, it would at least alert defense attorneys to the use of police tech tools and prevent vendors from asserting trade-secret privilege in court.
The NACDL also advocates that judges and defense attorneys be trained in how these tools work, including their common flaws, so that courts can properly assess the use of the tools and afford defendants the due process they deserve. It also recommends making resources available so defense attorneys can consult experts in these systems to better defend their clients.
Finally, when police insist on maintaining vast databases on community members, those members should be notified, within a reasonable timeframe, about the information collected on them and provided resources, including access to free attorney representation, to help challenge inaccurate or obsolete information. Police departments must understand that the cost of using “big data” includes the cost of doing it well, which involves accountability to the communities they serve.
Summary
Reviewing the research into the outcomes of data-driven predictive policing systems and their accompanying surveillance tools, a person would be forgiven for questioning why groups like the National Institute of Justice funded these programs in their early stages. In addition to being sold as race-neutral, these programs often included a social outreach component.
The work of Andrew Papachristos, a sociologist who focuses on how gun violence spreads within social networks, was the inspiration for Chicago’s SSL database. He “envisioned a system in which social and medical providers could provide immediate guidance to those in harm’s way, and where service providers could reach out to young people who would be better served through diversion as opposed to detention.”
Papachristos later distanced himself from the program, citing the CPD’s “fixation on identifying offenders in communities” as simply reinforcing the ways in which “America devalues the lives of young people of color.”
Until America collectively, and communities individually, reckon with the intersection of policing and racism in this country, even ideas backed with the best of intentions will have a discriminatory effect.
Mathematician Ben Green writes that “[i]n the hands of police, even algorithms intended for unbiased and nonpunitive purposes are likely to be warped or abused. For whatever its underlying capabilities, every technology is shaped by the people and institutions that wield it. Unless cities alter the police’s core functions and values, use by police of even the most fair and accurate algorithms is likely to enhance discriminatory and unjust outcomes.”
These reforms need not await a national political sea change. Each community can gather a critical mass of citizens and push for local changes to policing practices, and thereby prevent uses or misuses of technologies like data-driven predictive policing.
A predictive policing “house” built on the “sand” of discriminatory data cannot stand for long in the face of community pressure to change the way their officers carry out their duty to protect and serve.
Source: nacdl.org