Two notable AI mishaps dent reputation, IBM and AWS the culprits

AI’s glowing reputation has been tarnished a little this past week, with two major players managing to draw ire for falling short of the mark. IBM got flak for dodgy cancer diagnoses, while Amazon’s Rekognition system reckoned 28 members of the US Congress were criminals, based on their faces, in a trial carried out by the American Civil Liberties Union (ACLU).

Amazon’s failure here, part of the ACLU’s attempt to flag the problem of using such facial recognition systems in law enforcement, prompted five of those members of Congress to demand an immediate briefing with CEO Bezos. However, because we can’t see the data set, there’s a definite chance that they end up with egg on their faces, if it transpires that they do resemble the mugshots in the data set.

Regardless, the ACLU is hoping to demonstrate that there are serious problems with these recognition systems, which are not being appropriately considered – mostly because of the hype surrounding AI and the underlying supposition that the computer can’t be wrong.

The ACLU, well known for its work on racial problems in the US, was most concerned by Rekognition’s apparent racial bias. Its test showed that a disproportionate number of black and Latino members of Congress were flagged, which isn’t surprising, given the number of reports out there of AI systems inheriting biases from the data they are trained on. In this case, if the recognition system is using a data set where a disproportionate share of the images are of black and Latino people, it will inherently get more training time for spotting those faces.

The question then is how you address the issue of bias in the data set. With something like facial recognition, presumably you ensure you use a balanced number of faces, so that the AI process gets equal training time – otherwise it will get ‘better’ at spotting the faces it sees more of.
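To make the idea of a balanced set of faces concrete, here is a minimal sketch of one way to even out group sizes before training, by randomly downsampling every group to the size of the smallest one. It is purely illustrative and says nothing about how Rekognition is actually trained; the function name `balance_by_group`, the group labels, and the image IDs are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def balance_by_group(samples, seed=0):
    """Downsample so every group has as many items as the smallest group.

    `samples` is a list of (image_id, group_label) pairs. Both fields are
    hypothetical placeholders for whatever a real data set would hold.
    """
    by_group = defaultdict(list)
    for image_id, group in samples:
        by_group[group].append(image_id)

    if not by_group:
        return []

    smallest = min(len(ids) for ids in by_group.values())
    rng = random.Random(seed)  # fixed seed so the selection is repeatable

    balanced = []
    for group, ids in by_group.items():
        # Pick `smallest` items from each group without replacement.
        for image_id in rng.sample(ids, smallest):
            balanced.append((image_id, group))
    return balanced


# Made-up example: six images in one group, two in another.
samples = [(f"img{i}", "group_a") for i in range(6)]
samples += [(f"img{i}", "group_b") for i in range(6, 8)]

balanced = balance_by_group(samples)
counts = Counter(group for _, group in balanced)
print(counts)
```

In this made-up example each group ends up with two images. Downsampling throws data away, so in practice it is only one option among several, and it does nothing about bias in how the images were selected or labelled in the first place.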

But this challenge of bias extends into data that might seem objective, such as binary or numerical data, if there has been a conscious human decision about what data is ‘correct.’ Human bias has to be considered at the selection level too, and it’s pretty hard for people to spot when they, or others, have been subjective about something and brought their own opinions (conscious or not) into the matter.

The five members of Congress want answers for “how to address the defects of this technology in order to prevent inaccurate outcomes.” They’re quite right in wanting clarification, especially as there are so many police forces interested in using the technology – to spot a criminal among the crowd, and apprehend them. However, as the South Wales Police trial showed, the false positive rate can be huge.

In its trial, the police force has been testing an automated system to help it manage large events, such as football games. At the UEFA Champions League Final, held in Cardiff in June 2017, the plan was to be able to identify known troublemakers in the crowd, and step in when needed. However, a Freedom of Information Act enquiry found that of the 2,470 matches the system found, 2,297 were false positives. That left 173 correct matches; a success rate of just 7%.
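The arithmetic behind those figures can be checked in a few lines. This is only a sketch using the two counts quoted from the Freedom of Information enquiry; the variable names are illustrative, and "success rate" is taken here to mean the share of the system's matches that were correct.

```python
# Counts quoted in the article for the South Wales Police trial.
total_matches = 2470
false_positives = 2297

# Whatever was not a false positive was a correct match.
true_positives = total_matches - false_positives

# Share of all flagged matches that were right, and that were wrong.
share_correct = true_positives / total_matches
share_wrong = false_positives / total_matches

print(f"correct matches: {true_positives}")
print(f"share of matches that were correct: {share_correct:.1%}")
print(f"share of matches that were wrong: {share_wrong:.1%}")
```

Note that this measures how often a flagged match was right, not how many troublemakers in the crowd the system missed; the quoted figures say nothing about the latter.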

The technology hadn’t improved when it was used at later events, and other UK police forces are interested in using it too. A Chinese deployment, used to target people on train networks, made a lot of headlines, and the London Metropolitan Police are looking into significant deployments in the city.

So long as manual police action is required, there won’t be many objections to the systems. While Orwellian, it is being used in a public place at least. But think of the potential damage that this sort of system could cause if it were tied into other elements of the criminal justice system. Imagine if a system that is wrong more than nine times in ten was dishing out fines to people supposedly breaching bail or restraining order conditions – the legal costs of challenging these decisions could be crippling.

Then think of how these systems might be used by governments. Sure, our movements through public spaces could be monitored by the state now, using detectives to follow us and log our comings and goings. But with technologies like these, the practical expense of such surveillance is effectively eliminated. A city could track what its occupants are up to in real time, which is very appealing to many politicians.

Tying such systems into things like China’s social credit program could quickly become extremely oppressive – something the ACLU would certainly object to. There’s also a slim chance it could trigger an age of robotic benevolent dictators, but we’re back to square one here: having to be sure that the data we feed to these systems isn’t carrying over some problematic biases or selection factors.

So while Amazon tries to deflect the wrath of the US Congress, which could definitely throw a spanner in the works, IBM has been battling the fallout from an investigation by Stat News, which found that its Watson AI was giving incorrect diagnoses to cancer doctors.

The leaked internal documents suggest that the Watson system is being trained on synthetic medical records, rather than real patient records. This could be a very big problem for IBM, which has said publicly that the training used real data. One doctor from Florida’s Jupiter Hospital is quoted as saying that “this product is a piece of shit,” and that the hospital had “bought it for marketing and with the hopes that you would achieve the vision. We can’t use it for most cases.”

IBM’s Watson sounded like a great fit for hospital work, able to automate a lot of the image scanning and studying functions that doctors are currently required to do, and in turn freeing up time that those doctors could spend with patients. However, it appears that technology companies are running into problems surrounding privacy, and don’t have as much access to medical records and patients as they would like.

This appears to have been the cause of IBM’s problems, and while it made a big song and dance about the projects, Big Blue has begun layoffs at its health division. Its numbers here had swelled through some major acquisitions, but the reported 50% to 70% cuts to the workforce seem pretty savage by any measure. IBM had paid $2.6bn for Truven in 2016, as well as $1bn for Merge in 2015 – but hasn’t disclosed how much it paid for Explorys. Having sunk north of $4bn into the project, IBM might soon lose patience with its healthcare wing – especially as rivals are going to begin crowding into the space on the back of open source hardware and software, with many nimble enough to target specific hospital projects.