In a market drunk on AI video upscaling and compression claims, how many companies have stopped to question whether any of this claimed quality improvement actually looks better to the human eye?
The answer, according to video optimization firm Beamr, cannot be found in the industry's comfort blanket of objective metrics like PSNR, SSIM, and even VMAF, which can be routinely "fooled" and are increasingly detached from perceptual reality.
At NAB 2026, Beamr is launching an expanded version of its Vista platform, positioned as a scalable subjective testing service. The company frames Vista as a bridge between fast-but-flawed metrics and slow-but-expensive lab-based human testing, by building an environment where real human viewers are brought into the loop.
Speaking to Faultline, Chief Product Officer Dani Megrelishvili made it clear this is a business model pivot for Beamr, and a much-needed one.
Originally built to validate Beamr's own Content-Adaptive Bitrate (CABR) technology, the platform has refined a methodology for comparing videos that look nearly identical and determining which one is worse. Only in the past six months, Megrelishvili said, have customers begun pushing Beamr to externalize the capability.
That demand has triggered a productization effort, transforming what was effectively an internal QA tool into something closer to perceptual quality as a service.
The mechanics are simple. Viewers are shown side-by-side clips and told to choose which looks worse. This forced-choice method is designed to eliminate bias and surface even marginal differences.
Behind the scenes, Beamr seeds tests with verification pairs to weed out inattentive or fraudulent participants, a necessary step when relying on crowdsourced labor pools.
And yes, Vista is crowdsourced. Beamr is sourcing “a few dozen” viewers per test via external platforms, paying them to participate, and filtering results statistically. Results are returned within days, much quicker than lab-based testing, with confidence levels designed to satisfy engineering teams making production decisions.
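The verification-pair screening described above can be sketched in a few lines. This is a hypothetical illustration of the general technique, not Beamr's implementation: the data shapes, rater IDs, and the 80% accuracy threshold are all assumptions.

```python
# Hypothetical sketch of verification-pair filtering for a crowdsourced
# forced-choice test. Some trials are hidden "verification pairs" with a
# known-correct answer; raters who miss too many are dropped before
# their votes on the real comparison pairs are counted.

def filter_raters(responses, verification_keys, min_accuracy=0.8):
    """responses: {rater_id: {trial_id: choice}}.
    verification_keys: {trial_id: correct_choice} for the seeded trials.
    Returns only the raters who pass the accuracy threshold."""
    qualified = {}
    for rater, answers in responses.items():
        checks = [t for t in verification_keys if t in answers]
        if not checks:
            continue  # rater saw no verification pairs; cannot be qualified
        correct = sum(1 for t in checks if answers[t] == verification_keys[t])
        if correct / len(checks) >= min_accuracy:
            qualified[rater] = answers
    return qualified
```

An attentive rater who answers the seeded pairs correctly survives the filter; a random clicker is statistically unlikely to, which is the point of seeding multiple verification pairs per session.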
Every tweak, from codec switches (e.g. AV1 vs. HEVC) to bitrate ladder changes to super resolution models, introduces risk. Rollbacks can be costly for video service providers, and visual regressions can slip through unnoticed if teams rely purely on metrics. Vista is something of a final sanity check.
Beamr’s forthcoming NAB demo will use Vista to validate Nvidia’s RTX Video Super Resolution. According to Beamr, AI upscaling outperformed bicubic alternatives with 95% statistical confidence across 60 qualified viewers.
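Beamr does not disclose the statistics behind the 95% figure, but a one-sided exact binomial (sign) test is one standard way such a confidence claim could be derived from forced-choice votes. A minimal stdlib sketch, offered as an assumption about the general method rather than Beamr's actual calculation:

```python
from math import comb

def binom_p_one_sided(k, n):
    """Exact one-sided p-value: probability of k or more out of n raters
    agreeing by chance, under the null hypothesis that both clips are
    equally likely to be picked (p = 0.5)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

# With 60 qualified viewers, roughly 37 or more agreeing votes pushes the
# p-value just under 0.05, i.e. the 95% confidence threshold.
```

The smaller the true quality gap, the closer the vote split sits to 30/30 and the more viewers are needed to reach significance, which is why marginal differences demand larger panels.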
Studios, streamers, and codec vendors have long relied on “golden eyes” panels to validate quality. This traditional subjective testing is slow, expensive, and tightly controlled (calibrated displays, lighting conditions, lab environments). Beamr is telling customers it’s okay to trade some of that control for speed and accessibility.
Yet Beamr is not alone.
Netflix, for example, has invested heavily in aligning VMAF with human perception through extensive subjective testing datasets. Other vendors are layering machine learning on top of subjective metrics, rather than replacing metrics altogether. Meanwhile, crowdsourced UX testing platforms outside the video domain have been operating at scale for years.
There are also questions around consistency. Crowdsourced viewers and device variability (particularly as HDR and high-end displays enter the mix) introduce noise that traditional lab testing is designed to eliminate. Beamr admits HDR testing is not yet supported, which is a notable omission.
So while Beamr positions Vista as closing the gap between objective and subjective testing, it may be more accurate to say it is sacrificing some objective precision for speed, scale, and cost.
Another point is "hackability". Megrelishvili addressed how scores like VMAF can be optimized to look good on paper without actually improving what viewers perceive on screen. This is a byproduct of how these metrics are trained and tuned, creating an incentive for encoder developers and AI models to chase higher scores. In that sense, Vista is a corrective layer, designed to expose when a pipeline has been engineered to satisfy an algorithm instead of a human eye.
Last year, following Beamr CEO Sharon Carmel’s presentation at Mile High Video in Denver, Faultline wrote that the company was doubling down on Nvidia’s GPU ecosystem, convinced that the latest architectures would deliver the speed and efficiency once promised.
At the time, Beamr’s market cap was sitting pretty at $43.4 million. Today, it has fallen to $28 million. Beamr is hoping that cuddling up closer with Nvidia will reverse the downward trend.

