1 July 2021

Tencent teases in-house codec R&D, focusing on VVC over AV1

For such a small virtual industry event, Chinese giant Tencent had a domineering presence at this week’s Picture Coding Symposium 2021.

Faultline sat through back-to-back presentations from four engineers and researchers within Tencent’s Media Lab – flexing some serious muscles in areas from video codec patents and AI, to 4K, 8K and immersive formats.

Kicking off proceedings was Dr. Stephan Wenger, Senior Director of Intellectual Property and Standards at Tencent America, who described a licensing landscape that is getting worse from an implementer's perspective. While video codec royalties for AVC (H.264) are charged at between $0.10 and $0.20 per device, depending on volume, royalties for HEVC (H.265) are rising to more than a dollar per device, again depending on volume and including additional per-unit fees for higher HEVC profiles, much to Tencent’s displeasure.

“Even when counting inflation, that is a significant increase,” Wenger deplored.

Looking ahead to VVC (H.266), Wenger said that royalties remain unknown, even to someone with his extensive industry knowledge and contacts, but he expects them to be of a similar magnitude to HEVC, perhaps slightly cheaper.

We should remind ourselves that Tencent is a board member at AOMedia, so there is an underlying agenda here to promote AV1, although thankfully Wenger did not dwell too much on the royalty-free angle of AV1.

He did note that current codecs are largely interchangeable, with fundamentally not all that much difference between them. At the end of the day, the choice of codec is driven by the commercial environment rather than the technical one, with Wenger highlighting the importance of efficiency, complexity, available hardware support, IP costs, and risk assessment.

“Many codecs need support by helper specs to be useful, such as file formats, transmission systems, and RTP payload formats. Some of these are quite complex and powerful,” he said.

With AVC patents due to expire in the coming years, Wenger explained that, in most countries, patents can be enforced for 20 years past the priority date, but in some countries that time can be extended substantially.

“In the US, it’s not uncommon to see several hundred days of patent term extensions. I have seen ~3 years added through this mechanism. Many AVC patents should expire around 2023, but there’s a good number that will stay with us for considerably longer, reading on advanced profiles or supplemental enhancement information (SEI) messages,” continued Wenger.
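To put rough numbers on that arithmetic, here is a minimal sketch of the rule of thumb as Wenger described it: 20 years from the priority date, plus any extension days. The function names, the example priority date and the 700-day extension are all hypothetical, chosen only for illustration; real expiry dates depend on jurisdiction-specific rules, maintenance fees and other factors this sketch ignores.

```python
from datetime import date, timedelta


def nominal_expiry(priority: date, term_years: int = 20) -> date:
    """Priority date plus the nominal term (20 years, per the talk)."""
    try:
        return priority.replace(year=priority.year + term_years)
    except ValueError:
        # A 29 February priority date landing in a non-leap year.
        return priority.replace(year=priority.year + term_years, day=28)


def adjusted_expiry(priority: date, extension_days: int = 0) -> date:
    """Nominal expiry pushed back by a number of extension days."""
    return nominal_expiry(priority) + timedelta(days=extension_days)


# Hypothetical patent with a 2003 priority date (not a real filing).
example_priority = date(2003, 5, 1)

print("Nominal expiry:    ", nominal_expiry(example_priority))
print("With 700 extra days:", adjusted_expiry(example_priority, 700))
```

Under these assumed inputs, a patent that would nominally lapse in 2023 remains enforceable into 2025, which is the sort of tail Wenger was warning about.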

Tucking into Tencent’s own video streaming operations now, Soo-Chul Han, Principal Software Engineer at Tencent America, continued the narrative of hardware support being essential to Tencent’s streaming business.

We learned that Tencent, despite being a software company, works closely with hardware vendors on codec support. So closely, in fact, that Tencent has a say in tailoring future hardware support for its own purposes, which goes to show the type of influence that software giants with a vested interest in certain video codecs can wield over the hardware sector.

With Tencent in the process of transitioning from current video codecs to the latest state-of-the-art codecs, Han revealed that Tencent has deployed its own in-house AV1 encoder for VoD and real-time live streaming, to support its Tencent Video streaming platform as well as the Tencent Cloud. The encoder provides the various rate control modes these services need, and Han showcased several built-in presets intended to make it much easier to use. For example, the slowest preset gives the best quality, while the fastest can be used for real-time applications.

Tencent is also developing its O266 real-time decoder for VVC. Hardware decode support is not yet available for newer codecs, so software decoding can add value. Tencent says O266 can decode not only HD but 4K content too, achieving 25% faster decode time than the VTM reference decoder, which is available on GitHub. A VVC encoder is also in the works at the Tencent Media Lab, with an eye on 4K, 8K and 360-degree video.

Moving on to Tencent’s exploits in artificial intelligence, Songnan Li, Deputy Director at the multi-faceted firm, presented some of the latest goings-on around AI-based video enhancement and restoration.

Boasting that deep learning can outperform other methods, Li claimed Tencent has made strides in achieving higher dynamic range, higher frame rate, and wider color gamut using these AI-based models.

“Videos should keep pace with the display device and the higher requirements of customers today. Many are already using deep learning, especially those in the cloud where there is enough power using GPU cores,” noted Li.

He spun up a couple of demos showing user-generated videos where quality had been degraded by overexposure and weak colors. Tencent’s AI is able to detect imperfections in low-quality videos and correct them automatically, he explained, saving a lot of manpower.

For even higher quality, 4K+ video enhancement using AI is normally very slow and memory intensive, according to Li. Tencent therefore uses AI-based techniques for offline video transcoding, which he says is not particularly sensitive to processing speed. For video processing on mobile devices, Tencent is attempting to develop AI-based methods, but Li believes traditional signal processing-based methods will continue to dominate this space, for now.

Tencent’s AI algorithms can also automatically edit videos, with Li highlighting soccer video footage extraction as a prime example of demand for such automated features, where Tencent can extract 19 separate types of in-game event.

Finally, Dr. Bing Jian of Tencent Media Lab presented some immersive multimedia technologies, mainly around 360-degree video and live sports replays. Such systems usually consist of an array of high-resolution cameras set up around a stadium, which are connected to a network and controlled by software to capture events from multiple viewpoints. Jian is proposing improved immersion through better view synthesis. We all know about 6DoF (six degrees of freedom) as a way to replicate the freedom of body movement, but Jian argues that movement remains limited and that, in many cases, a 2D image is ultimately still what gets created. Another brief demo showed Tencent’s implementation of a synthesized view on a mobile platform for its Freeview product, which aims to build end-to-end systems and handle distributed encoding and camera synchronization.

One overarching takeaway was clear: a company the size of Tencent needs to occupy a gross amount of floorspace at major trade events to showcase its products and services properly, because online demonstrations cannot come close to doing them justice.