Nvidia used to be the outsider knocking on the door of Intel’s castle with its GPUs (graphics processing units). Now, the GPU has grown well beyond its roots in powering PC visuals and become a key building block in high performance computing and cloud architectures – to the extent that Intel has been forced to abandon its former hostility to the approach, and announce its own GPUs.
But Nvidia has a clear head start here, and has been steadily bringing its architecture into the telecoms world. Its opportunity lies in the very high end infrastructure that will be needed to run a cloud-native RAN, in particular, to the same level of performance as a conventional access network. Even Intel accepts that this cannot be done on x86 CPUs (central processing units) alone, but will need a complex set of accelerators to offload and optimize the most demanding tasks. Some of this will be done with FPGAs (field programmable gate arrays) – Intel acquired Altera to add these chips to its arsenal – but the GPU increasingly has a mainstream role to support demanding cloud functions in the telecoms world, such as virtualized RAN and network AI (artificial intelligence).
Nvidia recently announced a joint vRAN development project with Ericsson (though the Swedish firm remains non-committal on whether GPUs really can run a RAN to the same standards as a hardware solution). Now the GPU giant has got closer to ARM, introducing a reference design platform for GPU-accelerated servers based on ARM processor core IP.
This could help take ARM-based designs into Intel-dominated areas such as supercomputing, AI and cloud telecoms networks. There have been various attempts to do this, in order to provide an alternative to x86 on the merchant market (which, in turn, could slow the webscale giants’ attempts to build their own processor designs). So far, ARM has had limited impact on the cloud server world – Marvell is the most successful player, courtesy of its acquisition of Cavium, but Qualcomm put its own ARM-based server development on the back burner. Now, Nvidia claims five OEMs have already committed to build high end servers based on the new reference design – Ampere, Cray, Fujitsu, HPE and Marvell itself.
The move follows Nvidia’s announcement, last summer, that it would provide full support for ARM architectures with its CUDA-X software platform, and it has now released its ARM-compatible software development kit as a preview. The company also announced that several research centers are already using ARM/Nvidia platforms in supercomputing, including the USA’s Oak Ridge and Sandia National Laboratories, the UK’s University of Bristol, and Riken (Japan).
Last month, Nvidia discussed how it is applying recent breakthroughs in GPU technology for supercomputing to the vRAN challenge. It has worked with Ericsson to build the “world’s first software-defined 5G RAN”. Its new EGX Edge Supercomputing Platform is designed to be flexible, to support a variety of high performance use cases including 5G, massive IoT and AI. It is cloud-native, and powered by the company’s CUDA Tensor Core GPUs, which deliver up to 15 teraflops of compute and can process up to 140 simultaneous high definition video streams.
“All of this processing basically translates to one single node that is equivalent to hundreds of nodes in a data center,” CEO Jensen Huang said during his keynote speech at the Mobile World Congress Americas show.
Among the partners announced for EGX, in addition to Ericsson on the vRAN side, are Red Hat, which will help implement Kubernetes container orchestration and deliver a carrier-grade software stack; and Microsoft, which will integrate its Azure cloud platform to help with AI computation and other intensive workloads, distributed from edge to cloud as required.
Huang described how Nvidia sees 5G and edge computing as being inextricable – twin enablers of many applications that need to process huge amounts of data, often with very quick response times, including AI analytics, cloud gaming, mixed reality, the IoT and software-defined networking (SDN).
“The future is software-defined, and these low latency applications that have to be delivered at the edge can now be provisioned at the edge,” he went on. “That future will become software-defined high performance computing.”
While Nvidia was focusing on use cases, such as factory robotics, which require a combination of 5G connectivity, edge computing and AI, Ericsson was emphasizing how a GPU architecture could make RAN functions themselves far more efficient and scalable. Despite the talk about their collaboration on vRAN, both firms said they were not yet ready to share any details of their approach or progress.
Thomas Noren, Ericsson’s head of 5G commercialization, used dynamic spectrum sharing (DSS) as an example of an intensive, time-sensitive RAN workload which would require very high performance chips before it could be virtualized on cloud infrastructure. DSS allows an operator to run 4G and 5G flexibly on the same band and dynamically allocate spectrum between the two. DSS reschedules spectrum every millisecond, which makes it very challenging for the hardware to handle all the time and frequency variables.
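The shape of the problem Noren describes can be sketched in a few lines. This is an illustrative toy, not Ericsson's scheduler: it simply shows that a DSS-style system must re-decide, every one-millisecond transmission interval, how the physical resource blocks (PRBs) of a shared carrier are split between 4G and 5G. All names and demand figures below are invented.

```python
# Toy sketch of a dynamic spectrum sharing decision (not a real
# implementation): each 1 ms TTI, re-split the carrier's PRBs
# between LTE (4G) and NR (5G) in proportion to queued demand.

TOTAL_PRBS = 100   # PRBs available in the shared carrier (assumed)

def split_prbs(lte_demand: int, nr_demand: int, total: int = TOTAL_PRBS):
    """Allocate PRBs proportionally to queued demand for one TTI."""
    demand = lte_demand + nr_demand
    if demand == 0:
        return total // 2, total - total // 2   # idle: split evenly
    lte = min(max(round(total * lte_demand / demand), 0), total)
    return lte, total - lte

# One simulated millisecond in which demand has shifted toward 5G:
print(split_prbs(30, 70))   # -> (30, 70)
```

The real workload is vastly harder than this sketch suggests: the split must be computed alongside per-user scheduling across time and frequency, within a hard 1 ms deadline, which is why it has so far demanded purpose-built silicon.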
So far, such workloads require special-purpose, fully optimized processors, argued Noren. He said: “We have a multicore baseband architecture that allows us to do parallel processing; we have the capacity to introduce 5G and DSS on the same baseband. Our purpose-built baseband unit is the fastest, most power-efficient unit in the industry. That’s why we can do DSS.”
So it is a huge concession by Ericsson to say it believes a vRAN, and sophisticated tasks like DSS, could in future run on an off-the-shelf chip. The same goes for the distributed unit (DU) within a disaggregated vRAN. Most vRAN architectures will run some network functions on a centralized server, while others will be distributed closer to the cell site on a distributed unit, often because they require lower latency.
Ericsson already has roadmaps to virtualize the centralized unit (CU) on off-the-shelf hardware, and has worked on this with Intel. But while the CU requires a great deal of horsepower, the DU brings different challenges because many units will be deployed, so cost and power consumption must be minimized, even while supporting very low latency response and potentially high bandwidth.
Noren said: “If you look at all the available technologies, they are too expensive, too power hungry and too big to be effective compared to our purpose-built hardware for DU. Nvidia has a platform and development framework that we can potentially use.”
He was clear that the co-development of a GPU-based vRAN was only in the experimental stage, and remained cautious about success – understandably, given the decades of R&D that have gone into creating specialized platforms. He pointed out that Ericsson’s existing baseband, running on specialized silicon, could already handle the “incredibly computer intensive” software to support a workload like DSS, while in the case of the Nvidia platform, he was merely “open to the idea”. He said: “We will explore if we can develop a distributed baseband product (DU) with Nvidia GPU. We don’t have any committed product plans, but we think this is a very interesting idea.”
Verizon has also been working hard on ways in which GPUs might help support very high performance 5G/edge networks and applications, either working as accelerators or, through parallel processing, supporting a high end cloud platform in their own right. A group of engineers at Verizon has been working for the past two years on ways to orchestrate and load-balance data over a 5G network onto an edge processing unit based on one or more GPUs.
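Verizon has not published details of its orchestration system, but the load-balancing element it describes can be illustrated with a minimal sketch: incoming data chunks from the 5G network are dispatched to whichever edge GPU currently has the least queued work. The GPU names and workload sizes here are invented.

```python
# Hypothetical sketch of least-loaded dispatch across edge GPUs
# (not Verizon's system): a min-heap tracks queued work per GPU,
# and each arriving chunk goes to the least-loaded device.
import heapq

def dispatch(chunks, gpu_ids):
    """Assign each (name, size) chunk to the least-loaded GPU."""
    heap = [(0, g) for g in gpu_ids]   # (queued work, gpu id)
    heapq.heapify(heap)
    placement = {}
    for name, size in chunks:
        load, gpu = heapq.heappop(heap)
        placement[name] = gpu
        heapq.heappush(heap, (load + size, gpu))
    return placement

frames = [("frame-0", 8), ("frame-1", 3), ("frame-2", 5)]
print(dispatch(frames, ["gpu0", "gpu1"]))
```

A production system would also have to account for data locality, GPU memory pressure and per-stream ordering, which is where the two years of engineering effort presumably went.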
The effort was initially focused on XR and AI workloads, under the leadership of TJ Vitolo, head of the telco’s XR Lab. They were looking to reduce power consumption in the smartphone or other device by removing its GPU, and moving those workloads to the edge cloud. The low latency of edge-plus-5G would enable the same experience for the user as they currently get from a device-based GPU, and the relaxation of power constraints would support more advanced applications based on XR, AI or massive IoT.
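The latency argument can be made concrete with back-of-envelope arithmetic. The figures below are illustrative assumptions, not Verizon's numbers: a motion-to-photon budget of roughly 20 ms is a commonly cited target for comfortable XR, and the remote rendering loop must fit inside it.

```python
# Illustrative latency budget for edge-rendered XR (all figures
# are assumptions for the sake of the arithmetic):
BUDGET_MS = 20.0      # rough motion-to-photon target for XR
five_g_rtt = 8.0      # assumed round trip over 5G to an edge site
encode_decode = 6.0   # assumed video encode (edge) + decode (device)
render = 4.0          # assumed GPU render time at the edge

remaining = BUDGET_MS - (five_g_rtt + encode_decode + render)
print(f"headroom: {remaining} ms")   # -> headroom: 2.0 ms
```

On 4G, where round trips of 30-50 ms are typical, the same sum goes deeply negative, which is why the approach depends on the combination of 5G and edge compute rather than either alone.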
This led to the development of a GPU-based orchestration system which has recently gone into the test phase, amid promises that it will be able to revolutionize XR markets.