Nvidia joins ARM’s Project Trillium, Intel’s brow gets visibly sweatier

ARM and Nvidia have announced a partnership that will combine Nvidia’s open source Deep Learning Accelerator software and framework (NVDLA) with ARM’s recently announced Project Trillium, to create a chip design that can be used for machine-learning and AI applications at the network edge. Essentially, this is a way for developers to take advantage of Nvidia’s development ecosystem to more easily create applications for ARM-based chips.

Project Trillium, announced in the run-up to MWC, continues ARM’s strategy of designing chips that other companies can then license and manufacture. ARM had said that third-party designs could be integrated under the Trillium umbrella, and the Nvidia NVDLA seems to be the first instance of this. Strangely, Trillium won’t be the final brand name. The new family has two core designs – Machine-Learning (ML) and Object Detection (OD).

ARM says that ML provides a massive uplift in systems that might previously have relied on CPUs, GPUs, or DSPs alone, and has an efficiency of 3 TOPS (trillion operations per second) per watt – meaning that it is very well suited for edge devices. The OD design is derived from ARM’s acquisition of Apical, and is intended for machine-vision and image processing.
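To put the 3 TOPS per watt figure in context, the short sketch below turns it into a throughput ceiling for a given power budget and an energy cost per inference. Only the 3 TOPS/W figure comes from ARM; the 0.5 W budget and the 3 billion operations per inference are hypothetical numbers chosen for illustration, and the arithmetic assumes ideal peak efficiency with no memory or I/O overhead.

```python
# ARM's quoted efficiency for the ML design (the only figure taken from the article).
TOPS_PER_WATT = 3.0


def peak_tops(power_budget_watts: float, tops_per_watt: float = TOPS_PER_WATT) -> float:
    """Upper bound on throughput (in TOPS) for a given power budget."""
    if power_budget_watts < 0:
        raise ValueError("power budget must be non-negative")
    return power_budget_watts * tops_per_watt


def joules_per_inference(ops_per_inference: float,
                         tops_per_watt: float = TOPS_PER_WATT) -> float:
    """Ideal energy cost of one inference.

    1 TOPS/W is the same as 1e12 operations per joule, so the energy is
    simply the operation count divided by operations-per-joule.
    """
    ops_per_joule = tops_per_watt * 1e12
    return ops_per_inference / ops_per_joule


if __name__ == "__main__":
    # Hypothetical edge device with a 0.5 W budget for the accelerator.
    print(peak_tops(0.5), "TOPS")
    # Hypothetical network needing 3 billion operations per inference.
    print(joules_per_inference(3e9), "J per inference")
```

On those assumed numbers, a half-watt budget caps out at 1.5 TOPS and each inference costs about a millijoule, which is the sort of envelope a battery-powered device can live with.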

ARM is keen on using the two parts in the same device or chip, so that the OD element parses incoming data and the ML component then runs machine-learning functions on it. ARM’s example is a camera feed, in which the OD element flags areas of an image that require attention from the ML chip – for things like advanced motion detection or facial analytics.
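The division of labor between the two parts can be sketched as a two-stage pipeline: a cheap detection pass flags regions of a frame, and the expensive model runs only on those regions. Everything in the sketch is an assumption for illustration: `detect_regions`, `run_pipeline`, and the brightness threshold are hypothetical stand-ins, not ARM’s OD or ML interfaces, and a real detector would look for far more than bright tiles.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Frame = List[List[int]]  # grayscale image as rows of 0-255 pixel values


@dataclass(frozen=True)
class Region:
    top: int
    left: int
    height: int
    width: int


def detect_regions(frame: Frame, tile: int = 4, threshold: float = 128.0) -> List[Region]:
    """Stand-in for the OD stage: flag tiles whose mean brightness exceeds a threshold."""
    if not frame or not frame[0]:
        return []
    regions = []
    for top in range(0, len(frame), tile):
        for left in range(0, len(frame[0]), tile):
            rows = frame[top:top + tile]
            pixels = [p for row in rows for p in row[left:left + tile]]
            if sum(pixels) / len(pixels) > threshold:
                width = min(tile, len(frame[0]) - left)
                regions.append(Region(top, left, len(rows), width))
    return regions


def crop(frame: Frame, region: Region) -> Frame:
    """Cut the flagged region out of the frame."""
    rows = frame[region.top:region.top + region.height]
    return [row[region.left:region.left + region.width] for row in rows]


def run_pipeline(frame: Frame,
                 classify: Callable[[Frame], str]) -> List[Tuple[Region, str]]:
    """Run the (expensive) ML stage only on regions the detection stage flagged."""
    return [(region, classify(crop(frame, region))) for region in detect_regions(frame)]
```

The point of the split is that `classify`, standing in for the ML stage, is never called on the parts of the frame the detector ignored, so the costly work scales with how much of the scene is interesting rather than with its full resolution.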

Nvidia’s contribution, the NVDLA, is an open source architecture that Nvidia is hoping will help promote a standard way to design a deep learning inference accelerator. It combines both hardware (based on Nvidia’s Xavier SoC) and software (available on GitHub), but uses an Nvidia license for the open source elements – not the more common Apache, MIT, or BSD licenses.

“Accelerating AI at the edge is critical in enabling ARM’s vision of connecting a trillion IoT devices,” said Rene Haas, EVP and president of the IP Group at ARM. “Today we are one step closer to that vision by incorporating NVDLA into the ARM Project Trillium platform, as our entire ecosystem will immediately benefit from the expertise and capabilities our two companies bring in AI and IoT.”

Nvidia uses ARM CPUs inside its SoCs, the Tegra family being a good example, as a means of coordinating its GPUs, but has (until now) been pitching its offerings at larger devices – such as cars, drones, and cloud-computing arrays. With the ARM partnership, Nvidia is hoping that its presence in the low-power ARM ecosystem might help boost its adoption among IoT device developers – who might then use Nvidia hardware in future designs. In short, Nvidia is looking to enter the smaller end of the IoT market.

The NVDLA is focused on Convolutional Neural Networks (CNNs), which are used in most image processing and object recognition applications – especially in cars and drones, the two markets Nvidia is hoping to corner. The NVDLA was opened up last year, and should give developers a leg-up when it comes to designing their own CNN-based applications. Nvidia will be hoping to sell them the GPU processing power, or the entire SoC, needed to power those apps.

Strategically, the move from Nvidia might shut out a lot of CNN and machine-learning focused startups – especially those working on AI frameworks and chip designs. For the high-end data center market, more developers are likely to use the Nvidia implementation, which should drive demand for Nvidia’s cloud computing GPU products.

At the network edge, being enmeshed in ARM, the first choice for most developers looking for low power consumption, will help drive demand in the data center too, as the neural networks that power a device’s inference processing need to be trained in the first place – and that training requires hundreds of GPUs in racks of servers. There will also be opportunities in the middle ground.

In theory, developing using the NVDLA framework should allow you to scale up to very powerful chips and down to low-power designs. Currently, ARM says its designs are used in 95% of smartphones, 95% of wearables, and 85% of automotive in-vehicle infotainment (IVI) systems too.

The partnership turns the screw on Intel, which has seen its once-certain future in the data center rocked by the popularity of GPUs for AI-based applications. For years, thanks to its CPU dominance, Intel’s growth tracked the growth of computing as a whole – with its Xeon range powering the increased demand for compute resources.

But the emergence of GPUs has upended that relationship, with Intel now countering an incursion from chips that were traditionally used only in video workloads. Behind the GPUs lurk other specialized machine-learning designs, such as Google’s TPU and Imagination’s PowerVR NNA, as well as recent Intel acquisitions Nervana and Movidius.

Some co-processor designs have also emerged, such as CEVA’s NeuPro, which has a configurable DSP, as well as similar implementations in Qualcomm’s Snapdragon 845 and Huawei’s Kirin 970. There are a few startups in this space too, including Graphcore and BrainChip, and FPGA specialists like Xilinx and Altera (the latter owned by Intel) are also looking to push their designs as configurable answers to the question of AI and ML compute requirements.

Intel has clearly been worried that these new computing demands could push its Xeons out of the market, which has led it to acquire companies that might keep it at the forefront. But seeing its two most prominent rivals getting cozy may well trigger some sleepless nights for the company.