8 October 2020

VC-6 standard marks paradigm shift in AI-based compression

Faultline teased readers earlier this year that October 2020 would deliver some mouthwatering developments from the video compression underworld, mainly in the form of standards and primarily driven by one company.

That company is V-Nova, and this week its VC-6 video production codec was published as a standard by SMPTE. V-Nova claims it is the first AI-based compression codec not built on the transformation techniques widely used in data compression, such as DCT (discrete cosine transform) or wavelets. Instead, VC-6 uses a hierarchical system of S-Trees – where nodes form a tree structure similar to the more common quadtree data structure – and achieves compression by reusing the same structures at different resolutions, which makes VC-6 faster in software-based implementations.
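For readers unfamiliar with the quadtree structure that S-Trees are compared to here, the following is a minimal Python sketch. It is not the VC-6 S-Tree format, whose details are not described in this article; the `Node`, `build`, `count_nodes` and `expand` names are hypothetical and chosen purely for illustration. It shows only the general idea that a tree can represent a mostly empty grid of values with fewer nodes than there are samples.

```python
from dataclasses import dataclass
from typing import List, Optional

Grid = List[List[int]]


@dataclass
class Node:
    size: int                           # side length of the square region
    value: Optional[int] = None         # set on leaves (uniform region)
    children: Optional[list] = None     # four quadrants: NW, NE, SW, SE


def build(grid: Grid, x: int = 0, y: int = 0, size: Optional[int] = None) -> Node:
    """Recursively split a square, power-of-two grid into a quadtree."""
    if size is None:
        size = len(grid)
    first = grid[y][x]
    uniform = all(
        grid[y + dy][x + dx] == first
        for dy in range(size)
        for dx in range(size)
    )
    if uniform:
        # A uniform region collapses into a single leaf.
        return Node(size, value=first)
    half = size // 2
    kids = [
        build(grid, x + ox, y + oy, half)
        for oy in (0, half)
        for ox in (0, half)
    ]
    return Node(size, children=kids)


def count_nodes(node: Node) -> int:
    """Total number of nodes in the tree, leaves included."""
    if node.children is None:
        return 1
    return 1 + sum(count_nodes(c) for c in node.children)


def expand(node: Node) -> Grid:
    """Rebuild the full grid from the tree."""
    if node.children is None:
        return [[node.value] * node.size for _ in range(node.size)]
    nw, ne, sw, se = (expand(c) for c in node.children)
    top = [a + b for a, b in zip(nw, ne)]
    bottom = [a + b for a, b in zip(sw, se)]
    return top + bottom


# A 4x4 grid with a single non-zero sample in the bottom-right corner.
grid = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 5],
]
tree = build(grid)
print(count_nodes(tree), "nodes for", len(grid) * len(grid), "samples")
```

In this toy case three of the four quadrants collapse into single leaves, so the tree round-trips the grid exactly while storing fewer nodes than samples. How VC-6 actually encodes and reuses its S-Trees across resolutions is defined by the SMPTE standard, not by this sketch.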

The novel AI-based hierarchical approach makes VC-6 better suited to 4K and 8K production, with V-Nova highlighting how the codec can surmount challenges tied to the emerging 8K format. By enabling editing at multiple resolutions from a single file, VC-6 allows users to capture in 8K from RAW without having to cross-convert to other resolutions.

The London-based video compression vendor cites initial VC-6 use cases showing up to a 60% reduction in disk access for 4K files, along with decoding that is 2x to 4x faster than existing systems based on ProRes. For 8K, results show processing reductions achieved by de-mosaicking only 12.5% of the full resolution (de-mosaicking, also known as debayering, is a color reconstruction process).

The journey to VC-6 as we know it today, and indeed V-Nova’s development over the years, has made for fascinating observation. The best part? V-Nova isn’t done yet.

We must keep in mind that V-Nova’s Perseus video codec was reborn as SMPTE ST 2117, or VC-6 as it became more widely known, in late 2019. That was some six months after the game-changing reveal that this same compression technology was instrumental in MPEG-5 Part 2, now more commonly known as LCEVC (low complexity enhancement video codec). Any minute now, LCEVC will be subject to editorial review from standards bodies. Once those reviews are complete, LCEVC libraries will be made publicly available online.

Looking back, for Perseus Pro, now rechristened PPro, V-Nova developed machine learning algorithms based on neural network convolution, which is particularly well suited to analysis of visual images. Originally, Perseus was oblivious to the structure of the underlying image, operating almost purely on mathematical redundancy within the digitized data so that it could equally well compress any set of binary information, not just video. But V-Nova saw the potential and eventual necessity for achieving further gains in compression efficiency by taking account of the spatial data structure within images.

The machine learning algorithms dovetail with the hierarchical multilayer structuring of the images performed by Perseus, manipulating connections and their weights to differentiate finely between objects at different scales. This enables objects, or rather their boundaries, to be identified with the smallest possible residuals, which in essence means the fewest bits. The benefits are better overall compression compared with equivalent production and imaging formats and, in particular, enhanced rendering at multiple scales, such that effectively lossless video can be reconstructed after decoding given sufficiently high bit rates. At lower bit rates, the codec can still avoid blocky artifacts after reconstruction to yield video that appears almost visually lossless.
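The relationship between a layered hierarchy, residuals and lossless reconstruction can be illustrated with a generic residual pyramid. The Python sketch below is an assumption-laden toy and not V-Nova's algorithm: it uses plain 2x2 averaging and nearest-neighbour upsampling where the article describes machine-learned processing, and the function names (`downsample`, `upsample`, `encode`, `decode`) are hypothetical. It shows only that a small base image plus per-layer residuals can be decoded at several resolutions, and exactly at full resolution when every residual is kept.

```python
def downsample(img):
    """Halve each dimension by integer-averaging 2x2 blocks."""
    h, w = len(img), len(img[0])
    return [
        [
            (img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) // 4
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def upsample(img):
    """Double each dimension by repeating every sample (nearest neighbour)."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def encode(img, levels):
    """Return (base image, residual layers ordered coarse to fine)."""
    residuals = []
    current = img
    for _ in range(levels):
        low = downsample(current)
        pred = upsample(low)
        # Residual = what the upsampled prediction gets wrong at this layer.
        residuals.append(
            [[c - p for c, p in zip(cr, pr)] for cr, pr in zip(current, pred)]
        )
        current = low
    return current, residuals[::-1]


def decode(base, residuals, upto=None):
    """Rebuild the image; `upto` limits how many residual layers are applied."""
    img = base
    for res in residuals[:upto]:
        pred = upsample(img)
        img = [[p + r for p, r in zip(pr, rr)] for pr, rr in zip(pred, res)]
    return img


image = [
    [10, 12, 50, 52],
    [11, 13, 51, 53],
    [90, 90, 20, 20],
    [90, 90, 20, 24],
]
base, layers = encode(image, levels=2)
half_res = decode(base, layers, upto=1)   # 2x2 proxy from the same data
full_res = decode(base, layers)           # exact 4x4 reconstruction
```

A real codec would quantize and entropy-code the residuals, which is where lossy operation and bit-rate trade-offs come in; this sketch keeps them exact so that full-resolution decoding is lossless.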

Perseus Plus then applies some of the same algorithms to enhance existing codecs. In this guise, the product is overcoming the stigma of failing to win over big customers, since it can be employed readily as a software enhancement to other codecs already installed.

VC-6 was last heard making moves into AWS in April this year, with the launch of a new contribution system based on the video codec. Pitched as offering an “important new option” for broadcasters and producers to drive up quality while cutting out the need for costly leased connections, the collaboration combines V-Nova’s P.Link contribution system, based on VC-6 compression, with AWS Elemental MediaConnect and AWS Direct Connect. P.Link offers low-latency intra-frame coding and claims equivalent quality at up to 70% lower bitrates than JPEG 2000 systems.

The issue here is that premium live sports and event contribution is almost wholly reliant on dedicated dark fiber backbones. These steeply priced dedicated connections often mean that lower-quality, higher-latency encoders are used in deployments. Delivering contribution feeds via AWS enables more cost-effective one-to-one or one-to-many links, while SMPTE VC-6 encoding slims down processing power requirements, providing higher density and lower operating costs.

Later this year, V-Nova plans to make a multi-platform SDK and a cloud service deployment of VC-6 available via an Early Access Program. VC-6 is currently shipping as part of V-Nova’s P.Link for contribution and remote production applications.