Stats and judicial bias

In a highly unusual move, applicants have presented statistics to support their allegation that those seeking a judicial review of migration decisions had virtually no chance of succeeding in Judge Street’s court between January and June this year.  […] Of 254 rulings delivered, Judge Street rejected migration appeals in 252 cases. [Natasha Robinson, ‘Federal Circuit Court judge Alexander Street accused of bias after rejecting hundreds of migration cases’, ABC News 10 September 2015]

Forget the details and focus on the core issue: can we demonstrate bias with statistics?

On one intuition, the justice system is supposed to be person-independent.  By this, we mean that regardless of the judge hearing your case, the police officer who arrested you, the jury that deliberated on the evidence, the lawyer who represented you, &c., you should basically end up with the same legal outcome.

Of course, this is a bit of a fiction.  Wealthy people pay for expensive lawyers because those lawyers are more likely to get them a favourable outcome than one appointed by the State.  Police officers have significant discretion and often use it to the benefit of white females and to the detriment of black males.  Juries are basically incompetent, and judges change their verdict depending on how long it has been since their last meal.

The legal system is a human system, and it comes with all the failings, quibbles, and eccentricities of the humans who administer it.  Fancy talk about the ‘rule of law’ is designed to distract us from the core reality of the legal system, making it seem more abstract and noble than it actually is.  But the more we see it as a flesh-and-blood system rather than a supernatural creature of pure light, the more seriously we can tackle the issues it creates.

When we come across situations like that of Judge Street, we instinctively think that something’s gone wrong.  Could it really be the case that Street happened to get a disproportionately large number of cases that lacked merit?

This intuition depends on the statistical unlikelihood that cases of a similar class would have wildly different qualities.  If Judge A and Judge B each hear 100 criminal cases all involving automobile theft, we would express surprise if Judge A had guilty verdicts in 99 of those cases, and Judge B had not guilty verdicts in 99.  Similarly, if both caseloads split 50/50 guilty, we would be surprised if Judge A ordered prison sentences and Judge B gave out community service.
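To put a rough number on that surprise, here is a minimal sketch in Python.  It assumes something the real world doesn't guarantee: that the 200 car theft cases are interchangeable and were handed to the two judges at random, so that the 100 guilty outcomes should scatter between them by chance (a hypergeometric model).  The function names are mine and purely illustrative.

```python
from math import comb


def prob_exact(cases_a, cases_b, guilty_total, guilty_a):
    """Probability that Judge A ends up with exactly `guilty_a` of the
    `guilty_total` guilty outcomes, ASSUMING cases were allocated to the
    two judges at random (hypergeometric model)."""
    total = cases_a + cases_b
    ways = comb(guilty_total, guilty_a) * comb(total - guilty_total, cases_a - guilty_a)
    return ways / comb(total, cases_a)


def prob_at_least(cases_a, cases_b, guilty_total, guilty_a):
    """Probability that Judge A ends up with `guilty_a` or more guilty
    outcomes under the same random-allocation assumption."""
    upper = min(guilty_total, cases_a)
    return sum(
        prob_exact(cases_a, cases_b, guilty_total, k)
        for k in range(guilty_a, upper + 1)
    )


# The hypothetical from the text: 100 cases each, 100 guilty verdicts overall,
# 99 of them from Judge A (which leaves exactly 1 for Judge B).
lopsided = prob_at_least(100, 100, 100, 99)

# For comparison: the chance that the guilty verdicts split exactly 50/50.
even = prob_exact(100, 100, 100, 50)

print(f"99 or more guilty for Judge A: {lopsided:.3g}")
print(f"exactly 50 guilty for Judge A: {even:.3g}")
```

A vanishingly small number here only tells us that the random-allocation assumption is implausible.  It doesn't tell us which judge (if either) is the odd one out, or whether the two caseloads really were alike in the first place.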

But things aren’t so clear cut.  A few weeks ago, I wrote about Bennion’s — frankly racist — argument about diversity of judges.  His argument was tricky.  Imagine that we increase the diversity of backgrounds on the Court.  If the decisions start to change markedly, then that’s evidence of cultural bias.  But if the decisions stay the same, what was the point of diversity?

I concluded:

Bennion is fine with ‘normal’ variation amongst educated whites, but diversity more broadly (or representatively) is an existential threat to the legal tradition.  There is no principled reason why we should agree with Bennion, and his position is overtly racist.

This is where statistical anomalies aren’t a good indicator of bias or prejudice.  Over in the medical profession, statistical indicators are used to flag potential problems.  Two doctors in the same area are assumed to have similar billing profiles.  If they differ, it signals that one of the doctors may be doing something unconventional and contrary to accepted practice.  For example, if one doctor starts ordering far more CT scans than other doctors in the area, that doctor is flagged for review.

But the statistical indicators are not the end of the story.  Rather than being treated as definitive, they trigger a professional review of the doctor.  The relevant question is not whether they bill differently to other doctors, but whether their practice is considered acceptable by others in the profession.

I think this approach is better when talking about issues to do with Judge Street.  Sure, the statistical profile is weird, but if you go through the cases, does Judge Street deviate from the bounds of acceptable professional practice?  That can’t be answered by reference to the statistics alone.

Statistics like this are a dangerous tool.  The asylum seeker activists, for example, absolutely do not want a review of the tribunals that hear appeals about asylum seeker claims despite the fact that the decisions are disproportionately favourable in comparison to similar countries.  Back in 2013, I corrected Adam McBeth’s assertions about asylum seeker decision making.  He claimed that the high proportion of Department of Immigration negative assessments that were overturned indicated a ‘culture of no’ within the Department.  He didn’t factor in the high success rate within the Department (showing that the Department was actually favourable to asylum seeker claims and a small proportion needed to be appealed).  But it was also worth noting that the Tribunal’s decisions are oddly favourable.  When he was Foreign Minister, Bob Carr tried to begin a discussion about why the success rate was so high in comparison to other countries, but was immediately savaged by the lefties.  Nobody wants to know why the success rate is so high, just in case it leads to decisions to make the success rate lower.

The only way to review the Tribunal is to analyse the process by which decisions are made, not to look at the statistics.  Are they all within the scope of acceptable professional practice, or are there systemic flaws resulting in weird statistical anomalies?  Although Judge Street’s numbers jump out at us, we should wait to see whether they’re signal or noise.


