V-Nova recruited by MPEG-5 for codec enhancement

There will be something for everybody in MPEG’s announcements at NAB 2019, which range from holographic compression with V-PCC, and even non-video genomic data sets with MPEG-G, through to license-free codecs with MPEG-5. Not surprisingly the last of these is garnering most interest and is also where there is greatest intrigue in the air at NAB. This could be said to be the definitive coming of age and industry acceptance for London-based V-Nova, whose Perseus technology has been embraced within the working draft of MPEG-5 as an extension to existing codecs to improve processing and compression efficiency.

At the time of writing, just before NAB, MPEG would naturally not comment formally, but has confirmed the subjects that will be covered. It is clear that V-Nova’s involvement in MPEG-5 was recommended by Thierry Fautier, Harmonic’s VP of video strategy and co-chair of the MPEG Roadmap committee charged with presenting the 2020 MPEG roadmap to the industry.

MPEG-5 comes in two parts. Part 1, otherwise known as Essential Video Coding (EVC), is an emerging and fully independent codec pitched as a royalty-free competitor not just to AV1 from the Alliance for Open Media, which represents particularly the big technology firms like Google and Netflix with Apple also on board, but also to MPEG’s own HEVC and its forthcoming Versatile Video Coding (VVC). Note here that it may be just the basic tools that are royalty free, but that is not yet clear.

MPEG-5 part 2 is completely independent of part 1 and designed as a codec-agnostic enhancement, which is where Perseus fits in. It is a wrapper capable of adapting to any other codec, whether or not from the MPEG stable. In all cases it improves compression, or video quality at a given bit rate, while reducing processing power consumed, although to varying degrees.
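The wrapper idea can be illustrated with a toy sketch. This is not the Perseus or MPEG-5 part 2 algorithm, only a minimal illustration of a codec-agnostic enhancement layer under stated assumptions: the signal is a one-dimensional list of integers of even length, the base codec is a hypothetical stand-in (coarse quantization), and the residual is carried losslessly, whereas a real system would quantize and entropy code it. All function names are invented for the example.

```python
from typing import Callable, List, Tuple

Signal = List[int]
BaseCodec = Callable[[Signal], Signal]


def downsample(x: Signal) -> Signal:
    """Halve the resolution by averaging neighbouring pairs (even length assumed)."""
    return [(x[i] + x[i + 1]) // 2 for i in range(0, len(x), 2)]


def upsample(x: Signal) -> Signal:
    """Double the resolution by repeating each sample."""
    return [v for v in x for _ in range(2)]


def toy_base_codec(x: Signal, step: int = 8) -> Signal:
    """Hypothetical stand-in for any lossy base codec: a coarse quantization round trip."""
    return [(v // step) * step for v in x]


def encode(x: Signal, base_codec: BaseCodec) -> Tuple[Signal, Signal]:
    """Return the base-layer reconstruction and the full-resolution residual."""
    base = base_codec(downsample(x))      # low-resolution picture from the base codec
    prediction = upsample(base)           # what a decoder can predict from the base alone
    residual = [a - b for a, b in zip(x, prediction)]
    return base, residual


def decode(base: Signal, residual: Signal) -> Signal:
    """Upsample the base layer and add the correction on top."""
    return [p + r for p, r in zip(upsample(base), residual)]


if __name__ == "__main__":
    source = [10, 12, 50, 54, 200, 198, 0, 3]
    base, residual = encode(source, toy_base_codec)
    print(decode(base, residual) == source)
```

Because the base codec is passed in as a plain callable, swapping `toy_base_codec` for any other function leaves `encode` and `decode` unchanged, which is the sense in which the enhancement is codec agnostic in this sketch.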

Part 2 itself has some interesting potential use cases since it can enhance legacy codecs, notably AVC/H.264 and Google’s VP9, the ancestor of AV1, which are both widely installed. MPEG-5 Part 2 can boost efficiency of those by perhaps 50%, which will enable a number of traditional operators, as well as streaming providers, to delay upgrading to new codecs. As V-Nova has argued, operators such as Twitch, the world’s leading streaming provider for gaming, will not be able to scale up to millions of users with new codecs such as AV1, VVC or for that matter MPEG-5 part 1 for many years. With part 2, streaming providers can halve their encoding opex, although this part is not royalty free and indeed V-Nova hopes to recoup a significant amount from its deployment.

It is worth summarizing the evolution of V-Nova’s Perseus to see how it got to this position. The original Perseus, launched on April 1st, 2015 and now called Perseus Pro, was pitched initially as an intra-only codec primarily for high-end contribution. It was for this purpose that V-Nova’s launch partner, Sky Italia, deployed Perseus Pro, although it was later also adopted for distribution, again intra-only. V-Nova then developed a version that can also work in temporal mode under the new name Perseus Pro. This is a complete codec which effectively combines the intra functions of, say, JPEG and PNG with the temporal side of H.264, but with different underlying technology that in principle could be applied to any complex data set and not just video.

Perseus Plus then evolved from Pro, with the initial design goal being to cater for live use cases where decoding requirements are much tighter and where compatibility with more complex workflows, perhaps involving dynamic ad insertion or metadata management, is required. This initially simplified the original Perseus by taking just two correction layers, starting with intra-only and then adding temporal prediction.

A key point, though, is that because it assumes poor decoder hardware, it was designed to be extra light and avoided some tools of the Pro version, to the extent, according to V-Nova, that it can even decode ultra HD streams using JavaScript on some PCs.

Perseus Plus then is the foundation of MPEG-5 part 2, at least in the working version ahead of final standardization later in 2019. The working group is in fact being co-chaired by Dolby alongside V-Nova, so it is unlikely now that Perseus Plus will fail to make the final version, although enhancements are anticipated from other MPEG participants. The latter may include Comcast as a shareholder in V-Nova through its acquisition of Sky, while Amazon and Technicolor are definitely involved.

This leaves various possibilities, given that MPEG-5 Part 2 is independent of Part 1. What we think quite likely is that Part 1 itself will integrate a version of Part 2 that would give it a greater competitive edge. After all, Part 2 is an enhancement rather than a standalone codec.

It is also possible that VVC and AV1 will do something similar since it would accelerate processing and therefore make them cheaper and easier to adopt. These camps may even decide to incorporate what V-Nova would present as a fresh and relatively clean body of valuable IP (Intellectual Property) to build on, since it would dovetail well with their existing technology.

But meanwhile MPEG itself is embroiled in internal conflicts between MPEG-5 part 1 and VVC, which is its designated direct successor to HEVC. Otherwise known as MPEG-I Part 3, VVC is due for release in 2020 by the Joint Video Experts Team (JVET), a combined MPEG and ITU consortium. It has sometimes also been referred to as Future Video Coding (FVC) or ITU H.266, and we may recall the German Fraunhofer Institute staging a demonstration at IBC 2018 showing 40% to 50% improvement in efficiency over HEVC.

But the institute noted that AV1 had already shown similar performance while being at least closer to royalty free and already implemented. The institute expressed doubt that VVC would be able to keep pace with AV1 and, just to muddy the waters, at NAB Harmonic will be demonstrating its version of AV1 for full HD. This means MPEG might do well to favor MPEG-5 part 1 amplified by Perseus.

At least MPEG has other toys to show at NAB, such as V-PCC (Video-based Point Cloud Compression), which is part of MPEG-I, an emerging suite of standards targeting immersive media. This has been a major research area involving 3D point clouds representing data points in space that can have attributes such as color and luminance associated with them, as well as movement.
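To make the data volumes concrete, a point cloud can be sketched as a list of points, each with a position and a color, captured frame by frame to represent movement. The layout below is a hypothetical illustration, not the V-PCC bitstream format, and the default bit depths (32 bits per coordinate, 8 bits per color channel) are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Point:
    position: Tuple[float, float, float]  # x, y, z in space
    color: Tuple[int, int, int]           # 8-bit R, G, B attribute


@dataclass
class PointCloudFrame:
    timestamp: float       # seconds; a sequence of frames captures movement
    points: List[Point]


def raw_bits_per_second(points_per_frame: int, fps: int,
                        position_bits: int = 3 * 32,
                        color_bits: int = 3 * 8) -> int:
    """Uncompressed bit rate for a point cloud stream under the assumed bit depths."""
    return points_per_frame * (position_bits + color_bits) * fps


if __name__ == "__main__":
    frame = PointCloudFrame(0.0, [Point((0.0, 1.0, 2.0), (255, 128, 0))])
    print(len(frame.points))
    # Hypothetical stream: one million points per frame at 30 frames per second.
    print(raw_bits_per_second(1_000_000, 30) / 1e9, "Gbit/s")
```

Under these assumed figures the hypothetical stream comes to 3.6 Gbit/s before compression, which gives a sense of why dedicated compression is needed.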

The aim is to address full Extended Reality and ultimately holographic projection, which will generate far more data even than 8K UHD services at 120fps, by facilitating compression around six degrees of freedom (6DoF). This would address fully immersive movement in three-dimensional space and provide a foundation for virtual and blended reality to take off. Current VR implementations are limited to three degrees of freedom (3DoF) by compression constraints, among other things.
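The difference between 3DoF and 6DoF can be stated as a data structure: a 3DoF viewer pose carries only head rotation, while a 6DoF pose adds position in space. The class and field names below are illustrative assumptions and are not taken from any MPEG specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose3DoF:
    """Rotation only: the viewer can look around from a fixed point."""
    yaw: float
    pitch: float
    roll: float


@dataclass(frozen=True)
class Pose6DoF:
    """Rotation plus translation: the viewer can also move through the scene."""
    yaw: float
    pitch: float
    roll: float
    x: float
    y: float
    z: float

    def orientation_only(self) -> Pose3DoF:
        """Drop the translation, which is all a 3DoF system can make use of."""
        return Pose3DoF(self.yaw, self.pitch, self.roll)


if __name__ == "__main__":
    pose = Pose6DoF(yaw=90.0, pitch=0.0, roll=0.0, x=1.5, y=0.0, z=-2.0)
    print(pose.orientation_only())
```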