Google’s AI-building AI program, AutoML, has created its own AI offspring, NASNet, which can outperform human-built AI systems in image recognition tasks. Using a technique called reinforcement learning, AutoML (itself human-built) would tell NASNet when it struck gold, in a fully automated process. It’s a big step forward for machine-learning optimization, but still a long way from Skynet.
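As a rough illustration of that reward loop, the Python sketch below has a controller sample candidate designs, score each one, and shift its preferences toward the higher-scoring choices using the REINFORCE policy-gradient rule. Everything in it is an assumption made for illustration rather than Google's implementation: the three-knob `SEARCH_SPACE`, the synthetic `toy_reward` (standing in for training a child network and measuring its accuracy), and the `search` helper are all hypothetical.

```python
import math
import random

# Hypothetical search space: one choice per design decision.
SEARCH_SPACE = {
    "layers": [2, 4, 6, 8],
    "filters": [16, 32, 64],
    "kernel": [1, 3, 5],
}


def toy_reward(arch):
    """Synthetic stand-in for 'train the child network, measure accuracy'.

    Peaks at 6 layers, 64 filters and a kernel size of 3.
    """
    penalty = (
        0.3 * abs(arch["layers"] - 6) / 4
        + 0.3 * abs(arch["filters"] - 64) / 48
        + 0.3 * abs(arch["kernel"] - 3) / 2
    )
    return 1.0 - penalty


def softmax(logits):
    """Turn raw preference scores into probabilities."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def search(steps=2000, lr=0.1, seed=0):
    """Return the option the controller prefers for each decision."""
    rng = random.Random(seed)
    logits = {key: [0.0] * len(opts) for key, opts in SEARCH_SPACE.items()}
    baseline = 0.0  # running average reward, used to reduce variance
    for _ in range(steps):
        # 1. The controller samples one candidate architecture.
        sampled = {}
        for key, opts in SEARCH_SPACE.items():
            probs = softmax(logits[key])
            idx = rng.choices(range(len(opts)), weights=probs)[0]
            sampled[key] = (idx, probs)
        arch = {key: SEARCH_SPACE[key][idx] for key, (idx, _) in sampled.items()}

        # 2. The candidate is scored (here: instantly, by a toy function).
        reward = toy_reward(arch)
        advantage = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward

        # 3. REINFORCE update: d log p(idx) / d logit_j = 1[j == idx] - p_j.
        for key, (idx, probs) in sampled.items():
            for j, p in enumerate(probs):
                grad = (1.0 if j == idx else 0.0) - p
                logits[key][j] += lr * advantage * grad

    return {
        key: SEARCH_SPACE[key][max(range(len(vals)), key=vals.__getitem__)]
        for key, vals in logits.items()
    }
```

Calling `search()` returns the controller's preferred option for each knob. In a real system the expensive part is step 2, since every sampled candidate means training a network before it can be scored.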
If this process can be expanded to other machine-learning systems, then Google will have solved one of the biggest problems with building these AI systems – the huge amount of human intervention needed to tweak them until they produce a satisfactory output. It’s a very complex process, and one that needs a lot of tinkering, but if the AutoML approach can be ported to other tasks (natural language processing, data analytics, etc.), then the industry is poised to take a big step forward.
NASNet currently holds the top score on both the ImageNet image classification and COCO object detection datasets, both of which are respected large-scale academic resources. It scored 82.7% in the ImageNet test, 1.2% higher than the previous best. A stripped-down version, which can run on mobile platforms with lower computation power, scored 74% accuracy – around 3.1% better than previous equivalent attempts. In terms of efficiency, that 82.7% score was achieved using around half the computational cost of the previous best.
Google says that the image features learned by NASNet in the ImageNet and COCO tests could be reused in other computer vision applications, and has consequently open-sourced NASNet “for inference on image classification and for object detection in the Slim and Object Detection TensorFlow repositories. We hope that the larger machine-learning community will be able to build on these models to address multitudes of computer vision problems we have not yet imagined.”
In related news, fellow Alphabet division DeepMind reported that it has now created a way to get a generalized version of its AlphaGo AI system, called AlphaZero, to teach itself chess and shogi (simpler games than Go) using no human game data – just the rules, random initial play, and repeated self-play. That capability suggests that DeepMind has worked out how to transfer its model, which achieved superhuman Go performance through self-play, to other games – adaptability that has not been possible until now.
But this kind of self-directed learning compounds the black-box problem of current AI technologies – researchers can tweak the inputs and outputs until a system performs as expected, but find it extremely hard to explain how a neural network actually functions. Similarly, there are techniques for fooling these systems that are very hard for humans to spot, and if researchers like those behind the Kyushu University report are able to trick image recognition systems so easily, it calls AI capabilities into question.
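To make the idea of a hard-to-spot fooling technique concrete, here is a minimal Python sketch that searches for a single pixel whose change flips a classifier's answer. It is a toy under stated assumptions: the 3x3 "image", the linear `predict` classifier, its weights, and the brute-force `one_pixel_flip` search are all hypothetical, and none of it is taken from the Kyushu University report or from any real image recognition system.

```python
def predict(weights, bias, image):
    """Toy linear classifier: returns class 1 if the weighted sum is positive."""
    score = sum(w * x for w, x in zip(weights, image)) + bias
    return 1 if score > 0 else 0


def one_pixel_flip(weights, bias, image, values=(0.0, 1.0)):
    """Brute-force search for one pixel change that alters the prediction.

    Returns (pixel_index, new_value, perturbed_image), or None if no
    single-pixel change flips the label.
    """
    original = predict(weights, bias, image)
    for i in range(len(image)):
        for value in values:
            if value == image[i]:
                continue
            candidate = list(image)
            candidate[i] = value
            if predict(weights, bias, candidate) != original:
                return i, value, candidate
    return None


# Hypothetical 3x3 grayscale image (flattened) and classifier parameters.
weights = [0.2, -0.1, 0.3, -0.2, 1.5, 0.1, -0.3, 0.2, -0.1]
bias = -0.5
image = [0.5] * 9

result = one_pixel_flip(weights, bias, image)
```

With these made-up numbers the classifier leans heavily on the centre pixel, so darkening that one pixel is enough to change the label while the other eight stay untouched. Real networks are far less transparent, which is why such weak points are hard to find by inspection.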
The industry will have to travel a long way before these AI-spawned AI systems can be trusted with mission-critical tasks – such as emergency shut-offs, autonomous vehicle navigation, or healthcare functions. It will have to embrace penetration testing from whitehat hackers, and probably organize an ecosystem of events like the Black Hat conferences – which expose just how poorly many security systems actually perform in the wild.
But the black-box element complicates this. Traditionally, a whitehat would be able to find and exploit an error in code, and then report it to the company along with a fix – or even fix it themselves. However, an error inside the black box isn’t subject to this approach, and its developers might not know why it is mistaking a particular road sign for something else. Sure, they should be able to correct the model, but if they can’t peer under the hood of the black box, they’re going to be left vulnerable to a similar error being reproduced – and potentially not spotted until it is too late.
Alphabet and its subsidiaries have been leading the charge in pushing these machine-learning systems, with DeepMind unveiling a relational reasoning system back in June that enabled complex questions to be asked – such as “what size is the cylinder that is left of the brown metal thing that is left of the big sphere.” Google’s MultiModel system has a similar function, being a new neural network architecture that can handle multiple domains – such as images, speech, and text. DeepMind has also created a self-learning function that could guess what sounds to match to video.
The final piece of Google-based AI news comes from a new tool called DeepVariant, which Google has released to help with decoding the human genome. The open source tool, available on GitHub, has been designed to better process the massive amount of data generated through analyzing a person’s genome – the complete set of DNA sequences carried on their chromosomes.
According to the MIT Technology Review, it can automatically identify small insertion and deletion mutations, as well as single-base-pair mutations, in genome sequencing data, to provide a picture of a full genome. The high-throughput sequencing techniques of old were apparently error-prone and only offered limited snapshots; DeepVariant is a significant improvement on them, allowing scientists to discern small legitimate mutations from errors in the sequencing process.
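The underlying problem, telling a real mutation from a one-off read error, can be sketched with a much simpler technique than the one DeepVariant uses. The Python below is a plain counting heuristic, not DeepVariant's method or API: it stacks aligned reads over a reference and reports a single-base change only when enough reads agree on it. The `call_variants` function, its `min_depth` and `min_fraction` thresholds, and the sample reads are all hypothetical, and insertions and deletions are not handled.

```python
from collections import Counter


def call_variants(reference, reads, min_depth=4, min_fraction=0.3):
    """Report single-base differences supported by enough reads.

    reads: list of (start, sequence) pairs, aligned to the reference
    without gaps. Returns tuples of
    (position, reference_base, observed_base, supporting_reads, depth).
    """
    # Count which bases the reads show at each reference position.
    pileup = [Counter() for _ in reference]
    for start, sequence in reads:
        for offset, base in enumerate(sequence):
            position = start + offset
            if 0 <= position < len(reference):
                pileup[position][base] += 1

    calls = []
    for position, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to say anything
        for base, support in counts.items():
            if base != reference[position] and support / depth >= min_fraction:
                calls.append(
                    (position, reference[position], base, support, depth)
                )
    return calls


# Hypothetical data: position 3 differs in half the reads (a plausible
# mutation); position 6 differs in only one read (a plausible error).
reference = "ACGTACGT"
reads = [
    (0, "ACGTACGT"),
    (0, "ACGAACGT"),
    (0, "ACGAACGT"),
    (0, "ACGTACGT"),
    (0, "ACGTACCT"),
    (2, "GAACGT"),
]

calls = call_variants(reference, reads)
```

On this made-up data the change at position 3 is reported because three of six reads support it, while the lone mismatch at position 6 falls below the threshold and is treated as noise. Fixed thresholds like these are crude, which is the gap a learned model is meant to close.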