Can algorithms improve judges’ decisions? Daniel Chen’s research has provided extensive evidence documenting bias in the US legal system. Here, he argues that integrating machine learning (ML) tools with legal data offers a mechanism to detect in real time – and thereby remedy – judicial behavior that undermines the rule of law.
Until now, most empirical work has focused on observing the influences on judges’ behavior, helping to diagnose the problem of bias but offering little in terms of solutions. There is a substantial literature showing that features that ought to be legally irrelevant – such as race, the weather, or judicial attributes – are in fact predictive of legal outcomes in a variety of settings.
Daniel’s insight is that judges are most likely to allow these extra-legal biases to influence their decision-making when they are least swayed by legally relevant circumstances. In asylum courts, Daniel finds that judges with the highest and lowest grant rates are much more predictable than others. “However, less predictable judges tend to have middling grant rates. It may be that they lack strong preferences, and are therefore guided by random factors when making a decision – essentially flipping a coin.”
ML offers a way to automatically detect such cases of judicial indifference – where judges’ decisions appear to ignore the circumstances of the case – because these are precisely the contexts in which ML tools are likely to be least accurate at predicting the decisions.
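The logic can be illustrated with a minimal, hypothetical sketch: it simulates a “decisive” judge whose rulings track a case-merit score and an “indifferent” judge who decides at random, then measures each judge’s out-of-sample predictability with a simple threshold rule standing in for a full ML model. All data and names here are synthetic; this is not Daniel’s actual methodology.

```python
import random

random.seed(0)

def simulate_judge(n_cases, follows_merit):
    """Generate (merit_score, decision) pairs for one hypothetical judge."""
    cases = []
    for _ in range(n_cases):
        merit = random.random()
        if follows_merit:
            decision = 1 if merit > 0.5 else 0   # decisive: tracks case merit
        else:
            decision = random.randint(0, 1)      # indifferent: a coin flip
        cases.append((merit, decision))
    return cases

def predictability(cases):
    """Out-of-sample accuracy of the best single-threshold rule on merit."""
    train, test = cases[: len(cases) // 2], cases[len(cases) // 2 :]
    best_t, best_acc = 0.5, 0.0
    for t in [i / 20 for i in range(21)]:        # fit threshold on training half
        acc = sum((m > t) == bool(d) for m, d in train) / len(train)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return sum((m > best_t) == bool(d) for m, d in test) / len(test)

decisive = predictability(simulate_judge(400, follows_merit=True))
indifferent = predictability(simulate_judge(400, follows_merit=False))
# A decisive judge is highly predictable; an indifferent judge hovers near
# chance (0.5) -- the low-accuracy signature of "flipping a coin".
```

Low out-of-sample accuracy for a particular judge is thus itself a signal: it marks decisions that the observable circumstances of the case fail to explain.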
Equally, Daniel’s research has demonstrated that ML can automate the detection of inconsistencies between judges driven by legally irrelevant factors. In asylum courts, Daniel finds these influences are highly prevalent, including: the time of day; the size of the applicant’s family; whether genocide has been in the news; and the date of the decision.
Identifying judges whose behavior is predictable at relatively early procedural stages may enable policy intervention. For example, training programs could be targeted toward these judges, either to de-bias their decisions or to improve the hearing process. Simply alerting judges that their behavior may indicate unfairness could be enough to change it.
Advances in data analysis may permit more targeted interventions. “It may be possible,” Daniel suggests, “to establish the most predictable combinations of case and judicial characteristics. When such pairs are found, judges can be given a ‘red flag’ warning, as a counter-weight to confirmation bias or other non-legal sources of influence.”
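As a toy illustration of such a warning, the sketch below assumes only a model that outputs a grant probability for a given judge–case pairing before the hearing; the cutoff value and the example pairings are invented for illustration, not taken from the research.

```python
# Hypothetical red-flag check: flag judge/case pairings whose predicted
# outcome is near-certain before legally relevant evidence has been heard.
FLAG_THRESHOLD = 0.95  # illustrative cutoff, not from the research

def red_flag(p_grant):
    """Flag when the predicted grant probability is extreme either way."""
    return p_grant >= FLAG_THRESHOLD or p_grant <= 1 - FLAG_THRESHOLD

# Invented example predictions for judge/case-characteristic pairings.
predictions = {
    "judge A / large applicant family": 0.97,
    "judge B / morning hearing": 0.52,
    "judge C / genocide in the news": 0.03,
}
flagged = [pair for pair, p in predictions.items() if red_flag(p)]
# Judges A and C would receive a warning; judge B would not.
```

The design choice is deliberate: the flag fires on near-certain outcomes in either direction, since a judge who is predictably generous is just as much a rule-of-law concern as one who is predictably harsh.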
Informing judges about the predictions made by a model decision-maker could help reduce judge-level variation and arbitrariness. “If brought to a judge’s attention, potential biases could be subjected to higher-order cognitive scrutiny. Such efforts would build on the push to integrate risk assessment into the criminal justice process.”
An additional pathway for ML to improve legal decisions is judicial education. “The first goal would be to expose judges to findings concerning the effects of legally relevant and legally irrelevant factors on decisions, with the goal of general rather than specific debiasing. For example, Pope, Price, and Wolfers (2013) found that awareness of racial bias among NBA referees subsequently reduced that bias. The second goal would be to educate legal decision-makers about inference, prediction, and the tools of data analysis, so that they can better understand available information, and the conscious and unconscious factors that may influence their decisions.”
Judicial education has had considerable success. By 1990, 40% of federal judges had attended a two-week economics training program founded in 1976. Daniel’s research has found that this training led to the language of economics rapidly becoming prevalent in judicial opinions. More tangibly, training changed how judges perceived the consequences of their decisions. Judges in economic regulation cases shifted their votes in an anti-regulatory direction by 10%. In district courts, when given discretion in sentencing, economics-trained judges gave 20% longer sentences than their non-economics counterparts.
Daniel believes this training is likely to have provided structure for judges to understand patterns. “The next challenge is to see whether ML, text-as-data analysis, and other developments allow for a further step. If judges are shown behavioral findings, will they be less prone to behavioral biases? If judges are taught the theoretical structure that drives behavioral bias, will they be better judges? Could a new generation of theory and evidence from behavioral and social sciences provide better justice and increase cooperation, trust, recognition and respect?”
Extract from TSE Mag #19, Summer 2019