ARM makes Project Trillium public for mobile machine learning

ARM has unveiled Project Trillium, an initiative that aims to add new processor designs that cater for AI and machine learning (ML) applications. Supported by new software tools, the two designs, ML and OD (Object Detection), are being pushed as a way of adding advanced AI capabilities to mobile devices.

Of course, ARM has rivals in this wave of ML-focused silicon. Google has its TPU, Intel is developing chips based on its Nervana acquisition, and CEVA and Ambarella launched new products before this year’s Consumer Electronics Show. A host of smaller start-ups will hope to pile on the pressure – including Cerebras, DeePhi, Graphcore, Horizon Robotics, and Mythic.

ARM ML is the first new processor design, and boasts two main features – the Fixed-Function Engine (FFE) and the Programmable Layer Engine (PLE). ARM says that additional PLEs can be added to the design to support non-convolutional layers, meaning that the chip can be customized to suit the needs of the developer.

In terms of specs, ARM says the ML Processor can provide 4.6 TOPS, at an efficiency of 3 TOPS per Watt (TOPS/W). The silicon specialist says that this provides a massive uplift in systems that might have previously just relied on CPUs, GPUs or DSPs, and while it doesn’t mention pricing, it says that this provides “unmatched performance in thermal and cost-constrained environments”.
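As a rough sanity check, the two quoted figures can be combined to estimate the power draw they imply. This is only a back-of-envelope sketch: it assumes the 4.6 TOPS and 3 TOPS/W numbers describe the same operating point, which ARM's announcement does not confirm, and the function name is purely illustrative.

```python
def implied_power_watts(tops: float, tops_per_watt: float) -> float:
    """Power implied by a throughput figure and an efficiency figure.

    Assumes both numbers were measured at the same operating point.
    """
    return tops / tops_per_watt


# Figures quoted by ARM for the ML processor.
power = implied_power_watts(tops=4.6, tops_per_watt=3.0)
print(f"Implied power draw: {power:.2f} W")  # Implied power draw: 1.53 W
```

A figure of roughly 1.5 W would sit within the power envelope usually associated with mobile silicon, which is consistent with the thermally constrained devices ARM is targeting.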

In terms of applications, smartphones are highlighted, but so are smart cameras and AR/VR devices. IoT-type devices are also prominent, with smart home, drones, wearables, robotics, and medical all apparently catered for. ARM says that the new suite will be available for early preview in April, leading to general availability in mid-2018.

ARM OD is the second new design, and has been built to recognize features and movements of human bodies, a capability likely derived from ARM’s 2016 acquisition of Apical. ARM is keen on the idea of using both the ML and OD in the same device: the OD chip parses the incoming images from a camera and flags areas of interest for the ML processor to focus on. In ARM’s example, the OD flags faces in an image, which are then identified precisely by the ML.
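The division of labour between the two designs can be sketched as a two-stage pipeline. The code below is a toy illustration only: `toy_detect` and `toy_identify` are hypothetical stand-ins, not ARM APIs, and the frame is a small grid of integers rather than real camera data.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)
Frame = List[List[int]]          # rows of pixel values


def crop(frame: Frame, box: Box) -> Frame:
    """Cut the flagged region out of the full frame."""
    x, y, w, h = box
    return [row[x:x + w] for row in frame[y:y + h]]


def two_stage(frame: Frame,
              detect: Callable[[Frame], List[Box]],
              identify: Callable[[Frame], str]) -> List[Tuple[Box, str]]:
    """Stage 1 flags regions; stage 2 runs only on those regions."""
    return [(box, identify(crop(frame, box))) for box in detect(frame)]


# Hypothetical stand-ins for the OD and ML stages.
def toy_detect(frame: Frame) -> List[Box]:
    return [(1, 1, 2, 2)]  # pretend one region of interest was found


def toy_identify(region: Frame) -> str:
    return "bright" if sum(map(sum, region)) > 0 else "dark"


frame = [[0] * 4 for _ in range(4)]
frame[1][1] = 9
results = two_stage(frame, toy_detect, toy_identify)
print(results)  # [((1, 1, 2, 2), 'bright')]
```

The point of the structure is that the more expensive second stage only ever sees the cropped regions, not the whole frame.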

For specs, ARM says the OD processor can handle 1080p streams at 60fps, ideal for high-quality video applications, and can detect objects measuring just 50×60 pixels inside those feeds – around 0.14% of the visible image. ARM says that it can track virtually unlimited objects per frame, and uses a detailed model to detect direction, trajectory, pose and gesture.
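The 0.14% figure can be reproduced with a few lines of arithmetic. The sketch below assumes that "1080p" means a 1920×1080 frame; the variable names are illustrative.

```python
# Assumed frame size for "1080p"; the 50x60 object size and 60fps are quoted by ARM.
FRAME_W, FRAME_H = 1920, 1080
OBJ_W, OBJ_H = 50, 60
FPS = 60

frame_pixels = FRAME_W * FRAME_H            # 2,073,600 pixels per frame
fraction = (OBJ_W * OBJ_H) / frame_pixels   # share of the frame covered by the object
pixels_per_second = frame_pixels * FPS      # raw pixel rate the processor must ingest

print(f"Object covers {fraction:.2%} of the frame")    # Object covers 0.14% of the frame
print(f"Pixel rate: {pixels_per_second:,} pixels/s")   # Pixel rate: 124,416,000 pixels/s
```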

The OD is a second-generation product. According to a blog post from Jem Davies, general manager of the firm’s ML Group, the first iteration was used in Hive’s security cameras. Davies writes that the combination of ML and OD processors will soon enable AR experiences in connected diving goggles (for spotting fish in coral reefs and alerting tourists to hazards), with the OD processor filtering the scene so that only the most important data is displayed on screen.

Davies goes on to say that ML represents the biggest inflection point in computing for more than a generation, which will touch just about every market segment. Edge processing will dominate here, according to Davies, who writes: “I say this because I have the laws of physics, the laws of economics, and many laws of the land on my side. The world doesn’t have the bandwidth to cope with real time analysis of all the video being shot today, and the power and cost of transmitting that data to be processed in the cloud is simply prohibitive.”

The third component of the release is a new set of open source Linux and Android development software, called ARM NN SDK, described as a translation layer between existing neural network frameworks and the chips. Currently, the NN SDK supports the new ML design, as well as ARM’s Mali GPUs, and Cortex-A and Cortex-M CPUs. On the framework side, ARM NN is listed as supporting Caffe (originally developed at UC Berkeley), TensorFlow (Google), and MXNet (Apache), although TensorFlow support is also described as scheduled to be added soon.

In terms of the application stack, which ARM measures in five layers, the ARM NN SDK (3) sits above the compute library (2) running on either its Cortex CPUs or Mali GPUs (1), and below the TensorFlow or Caffe Neural Network (4) that powers the actual ML Application (5).
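The five layers can be written down as a simple ordered structure, which makes the position of the ARM NN SDK explicit. This is only a sketch of the layering as described in the text; the `neighbours` helper is a hypothetical convenience, not part of any ARM tooling.

```python
from typing import Optional, Tuple

# Layer numbers and names as given in the description of the stack.
STACK = [
    (1, "Cortex CPU or Mali GPU"),
    (2, "Compute library"),
    (3, "ARM NN SDK"),
    (4, "TensorFlow or Caffe neural network"),
    (5, "ML application"),
]


def neighbours(level: int) -> Tuple[Optional[str], Optional[str]]:
    """Return the names of the layers directly below and above `level`."""
    names = dict(STACK)
    return names.get(level - 1), names.get(level + 1)


below, above = neighbours(3)
print(f"ARM NN SDK sits above '{below}' and below '{above}'")
```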

“The rapid acceleration of artificial intelligence into edge devices is placing increased requirements for innovation to address compute while maintaining a power efficient footprint. To meet this demand, ARM is announcing its new ML platform, Project Trillium,” said Rene Haas, president of ARM’s IP Products Group. “New devices will require the high performance ML and AI capabilities these new processors deliver. Combined with the high degree of flexibility and scalability that our platform provides, our partners can push the boundaries of what will be possible across a broad range of devices.”