x265’s ability to leverage AI to accelerate encoding for Adaptive Bitrate (ABR) streaming has been presented in the paper titled “Adaptive Multi-Resolution Encoding Scheme for ABR Streaming”, which is to appear in the proceedings of the IEEE International Conference on Image Processing (ICIP), 2018. Adaptive streaming enables dynamic adaptation to changes in network conditions by encoding video content at multiple bitrates and resolutions. Multi-pass x265 encodes exploit the structural redundancy across multiple resolutions by sharing analysis data in the order of increasing resolution, thereby reducing the computational burden significantly.
Static and dynamic refinement features were conceptualized for the adaptive streaming topology in x265. As the first step, the input video sequence is scaled down to the lowest resolution of the bitrate ladder and encoded. Coding metadata such as the CU quadtree structure, PU predictions, coding modes and motion vectors (MVs) are stored during this first-pass encode at CTU-level abstraction. The subsequent higher-resolution encodes invoke either the static or the dynamic refinement algorithm. The figure below depicts this architecture for a three-pass system. Theoretically, this can be extended to N passes.
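To make the idea of reusing first-pass metadata more concrete, here is a minimal sketch in Python. It is not x265 code: the CuHint record, the seed_from_lower_pass function and the mapping rule (one quadtree level shallower and motion vectors multiplied by the scale factor when the resolution doubles) are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CuHint:
    """Hypothetical per-CU record saved by a lower-resolution pass."""
    depth: int            # quadtree depth inside the CTU (0 = whole CTU)
    mode: str             # coding mode, e.g. "intra", "inter" or "skip"
    mv: Tuple[int, int]   # motion vector in quarter-pel units


def seed_from_lower_pass(prev: CuHint, scale: int = 2) -> CuHint:
    """Map a hint from the lower-resolution pass onto the next resolution.

    Assumption: the same picture area covers `scale` times more pixels per
    side, so the CU sits log2(scale) levels shallower in the quadtree and
    its motion vector grows by `scale`.
    """
    if scale < 1 or scale & (scale - 1):
        raise ValueError("scale must be a power of two")
    levels = scale.bit_length() - 1  # log2(scale) for a power of two
    return CuHint(
        depth=max(0, prev.depth - levels),
        mode=prev.mode,
        mv=(prev.mv[0] * scale, prev.mv[1] * scale),
    )


# Example: a depth-2 inter CU from the lowest-resolution pass.
hint = seed_from_lower_pass(CuHint(depth=2, mode="inter", mv=(3, -5)))
```

The higher-resolution pass would treat such a hint only as a starting point and refine it, which is what the static and dynamic refinement algorithms are for.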
The static refinement algorithm may further be classified as intra-picture and inter-picture refinement wherein the information from the previous pass is used at different levels of granularity thereby giving the current encode a head start. The encoder intelligently refines the analysis data based on certain heuristics and later uses it in the Rate Distortion Optimization (RDO) process. By tweaking the degree of information that is being reused, the user can control the performance vs quality balance of the encode.
The dynamic refinement algorithm applies Bayesian classification on the information saved in the previous pass and identifies patterns between the complexity of the content and the refinement level being used. Based on these patterns, the dynamic refinement algorithm switches between the static refinement levels. This will allow the encoder to find the optimal balance between performance and quality while taking the content complexity into account, making it more efficient than the static refinement methods.
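As a rough illustration of how a Bayesian classifier could pick a refinement level from content complexity, here is a small Gaussian naive Bayes sketch in Python. The features (a texture variance and a motion magnitude), the training samples and the level labels are all hypothetical; the actual features and training procedure used in x265 are described in the paper and are not reproduced here.

```python
import math
from collections import defaultdict


def fit(samples):
    """Estimate a prior plus per-feature mean and variance for each level.

    `samples` is a sequence of (features, level) pairs.
    """
    grouped = defaultdict(list)
    for features, level in samples:
        grouped[level].append(features)
    model = {}
    for level, rows in grouped.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small floor keeps the variance strictly positive.
        variances = [
            sum((x - m) ** 2 for x in col) / n + 1e-6
            for col, m in zip(columns, means)
        ]
        model[level] = (n / len(samples), means, variances)
    return model


def pick_level(model, features):
    """Return the level with the highest log-posterior for `features`."""
    def log_posterior(level):
        prior, means, variances = model[level]
        score = math.log(prior)
        for x, m, v in zip(features, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return score
    return max(model, key=log_posterior)


# Hypothetical training data: (texture variance, motion magnitude) -> level.
training = [
    ((10.0, 1.0), 1), ((12.0, 1.5), 1), ((11.0, 0.5), 1),
    ((80.0, 9.0), 3), ((90.0, 11.0), 3), ((85.0, 10.0), 3),
]
model = fit(training)
simple_block = pick_level(model, (11.0, 1.0))
complex_block = pick_level(model, (88.0, 10.0))
```

With this toy data, blocks that resemble the low-complexity samples are assigned level 1 and blocks that resemble the high-complexity samples are assigned level 3. A real encoder would make this decision per block as it goes, using statistics gathered from the previous pass.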
Compared to conventional single-resolution approaches, the computational complexity of higher-resolution encodes is drastically reduced with the use of the AI-based adaptive multi-resolution technique. The adaptive multi-resolution scheme is able to achieve a whopping speedup of about 2.5X, with a negligible drop in quality. The above table from the research paper compares the speed vs quality of static and dynamic refinement levels. Since the AI tools used in x265 are not based on CNN models, x265 is able to leverage the advances in CPU technology to achieve these improvements.
As it turns out, the x265 project turned 5 a couple of months back; our first commits from the HM encoder date back to March, 2013. And being the geeks that we are, we didn’t even realize it!
Many thanks to all those who have enabled x265 to accomplish all that it has over 5 years! We continue to innovate inside x265 to improve both quality and performance, and look forward to celebrating our 10th anniversary with you.
By Pradeep Ramachandran
Now that the dust has settled, it is time to thank all the contributors for enabling a great showing by x265 at NAB 2018. We showed off our new ML-accelerated content adaptive encoding for ABR, AVX-512 acceleration, and the recently added support for HDR10+/HLG at NAB. We received great feedback on what people would like to see in the coming releases, and will be working hard to continue to innovate in that space. We’ve also formed a committee to guide the future development of x265 and other open source media codecs, which we will blog about more in the coming weeks; read this article for an initial idea of what this is about.
By Pradeep Ramachandran
Finally, the acceleration that we’ve all been waiting for is here! We’ve been working extensively with Intel for the last few months to use Intel Advanced Vector Extensions 512 (AVX-512) to accelerate x265. After much effort, we’re delighted to share that we’ve been able to accelerate 4K HDR encoding in main10 profile by over 15% for high-quality offline encoding. Check out this white paper on the Intel site for more information.
The patches will be pushed to the default branch soon. Let us know the results of your tests – you know where to find us!
By Pradeep Ramachandran
After what seems to have been a long delay, AV1 finally froze its bitstream last week! Like many folks in the industry, we have been waiting for this moment for a long time to see what a truly ‘royalty-free’ codec can bring in terms of tools to the encoding space.
A few months ago, we stopped looking further into how AV1’s tools compare to those of HEVC due to this peer-reviewed paper published in a leading journal by leading researchers in the field of video coding. That paper reported that the HM encoder provides average bitrate savings of 30% relative to AV1 with an encoding speed that was 25X that of the AV1 encoder; the informed would know that x265’s veryslow is very comparable in encoder efficiency to the HM encoder. While optimizations could bridge the gap in speed, bridging the gap in encoder efficiency would be hard unless some fundamental improvements (that are not covered by the HEVC patents, mind you) are made. Now that the bitstream is frozen, we will be digging in to see what new tools have been brought to the table that were not included in this comparison, in the hope of answering the question “Is AV1 fundamentally better than HEVC as a standard?”. Stay tuned here to hear more, or share your thoughts on the doom9/x265-devel mailing list.
And of course, there is the issue of patent royalties and licensing, but we will leave that up to the lawyers to deliberate and decide; let’s talk more tech here!
By Pradeep Ramachandran
MulticoreWare, the developers of x265, will be available to discuss all things media at NAB 2018 in Las Vegas from April 9th – 12th in their booth at SU-14708. Swing by to talk about the soon-to-publish AVX-512 acceleration, content adaptive optimizations for ABR encoding with x265, or if you just want a selfie with the creators of the world’s most popular HEVC encoder.
See you in Vegas!
PS: Make sure to mention this blog post if you stop by to stand a chance to win some open-source memorabilia!
By Pradeep Ramachandran
MSU recently released their 2017 codec comparison, where they compared x265 against other HEVC encoders, VP9 encoders, and the AV1 encoder. While the efforts that go into such large scale tests are appreciated, we as the x265 developers have to respectfully disagree with the conclusions drawn from this MSU report as we believe it is incomplete.
If you notice, these tests use v1.9 of x265, which is over 20 months old! Since then, x265 has had 7 versions with an imminent 8th version. As anyone would expect, x265 has made considerable progress in speed and quality during this time. Specifically, we’ve made big changes to the lambda tables which considerably improved visual quality as reported by both consumers and customers.
That said, it is possible that maybe AV1 is a better codec than HEVC (at least in quality); maybe so is VP9. Maybe they have tools that are competent and can challenge HEVC. However, for the reasons stated above, these results do not conclusively prove so, in our opinion!
Now as to why MSU used a 20-month-old encoder, there is some history there about the reservations of the x265 team regarding the validity of MSU’s past tests, as only objective scores were looked at. Quoting my dear friend, ‘we look at video and not at graphs!’. It is encouraging to see the MSU tests taking a turn towards doing subjective testing, which, in our opinion, is the right direction. Hopefully we will work with their newly mended ways in future for a more fair and realistic evaluation!
By Pradeep Ramachandran
By Tom Vaughan
Recently, a competitor (Beamr) published a blog post comparing their HEVC encoder to x265. They claimed that their HEVC encoder is faster, AND it produces better video quality.
Of course, we’re used to people comparing other HEVC encoders to x265.
- x265 is available under an open source license, and therefore it is widely available for any competitor to use in such comparison tests.
- x265 is by far the most widely known and widely used HEVC encoder, and so other companies want to build their brand by comparing to x265.
- But most importantly, they know that x265 is the “gold standard” of HEVC encoders. It’s the benchmark that all others must try to beat.
So, did Beamr’s encoder really beat x265? Or are their claims just a bunch of marketing bluster, that isn’t backed up by the facts?
This latest test was conducted in a way that mostly followed the encoder comparison guidelines we recently published. But there were many flaws.
- The bit rates produced were not identical. On average the Beamr encodes had 6% higher bit rates. In one case, the Beamr encode had a bit rate that was 18% higher. This was due to the fact that the test video sequences were very short, and one-pass encoding was used (giving the encoder less time to “dial in” to the target bit rate). This problem could have been avoided by using 2 pass encoding, which can achieve a more accurate bit rate while leveling quality across the encode.
- All of the test parameters were hand-picked by Beamr: the test content, the test hardware, and the settings used, including the bit rate, the number of hardware threads used and the x265 presets. This is known as “cherry picking” (the fallacy of incomplete evidence). We wonder how many tests were run under different conditions, in order to determine the tests that showed the Beamr encoder under the most favorable light.
- Beamr chose to use x265’s ultrafast, medium and veryslow presets. They claimed a big speed advantage against x265’s veryslow preset. Of course, as the name implies, we designed our veryslow preset to be very slow. It is focused on achieving the highest possible quality, and no compromises are made that would improve performance. The next preset, slower, is twice as fast, but has nearly identical encoding efficiency. Why didn’t Beamr compare their encoder to x265 with the slower preset? [Because they would have lost on a speed comparison, as well as on quality.]
- Beamr only chose video sequences with relatively low motion and detail. Clips like Bar Scene, Dinner Scene, and Wind And Nature are exceptionally easy to encode. Why no sports, or other high detail + high motion content? Perhaps it’s because Beamr relies on using Tiles, which cannot use inter-prediction across tile boundaries. x265 uses Wavefront Parallel Processing, which is more efficient.
- Beamr chose only 4K content, at 7 Mbps bit rates. Why didn’t they compare the 2 encoders for 1080P, 720P and lower picture sizes? Why not compare the encoders at a wider range of bit rates?
- Beamr chose a 4 year old hardware architecture (Xeon E5 v2, code named “Ivy Bridge”) to run their comparison test on. These processors don’t support AVX2 instructions, which newer Xeon generations (Haswell, Broadwell, Purley) do. We wonder how the Beamr encoder compares to x265 on modern machines.
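The first flaw in the list above is simple to quantify. As a rough sketch (the function name and the kbps figures are hypothetical, picked only to reproduce the 6% and 18% deviations mentioned above for a 7 Mbps target), the bit rate overshoot of an encode relative to its target can be computed like this:

```python
def overshoot_percent(target_kbps: float, achieved_kbps: float) -> float:
    """Return how far the achieved bit rate is above (+) or below (-) the target, in percent."""
    if target_kbps <= 0:
        raise ValueError("target bit rate must be positive")
    return 100.0 * (achieved_kbps - target_kbps) / target_kbps


# Hypothetical numbers for a 7 Mbps target; not taken from the actual test files.
average_case = overshoot_percent(7000, 7420)
worst_case = overshoot_percent(7000, 8260)
```

When two encodes differ by this much in bit rate, any quality comparison between them is no longer like-for-like, because the encode with more bits has an advantage.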
Beamr claims a speed advantage over x265 under these carefully selected conditions. Most commercial companies use a preset like “slow” or “slower”, but rarely use our veryslow preset for their high quality offline encoding. For real-time encoding, we have a more advanced encoding library called UHDkit that can run multiple encoding instances in parallel to achieve high performance on a many-core server, allowing for higher quality settings than x265’s ultrafast preset. On modern Haswell, Broadwell or Purley generation Xeon powered servers, customers who have tried competing HEVC encoders tell us that UHDkit outperforms the competition under a range of scenarios.
Beamr made the bitstreams available, so that anyone could compare the quality of the video produced. You should download the videos and compare for yourself. Beamr claimed that the visual quality of their encodes was clearly superior, but we are confused by this claim. Perhaps their definition of higher visual quality is different from ours. Perhaps by “higher quality”, they mean “softer, with less detail.” If you prefer video with more detail and accuracy, x265 is the clear winner.
To compare video, you can’t look at still images – you need to actually watch the video at full speed. We have a special video comparison tool (UHDcode Pro Player) that we make available to customers and partners which can play 2 streams simultaneously, letting you hide or reveal more or less of each stream. This makes it easy to see which encode is better. But the screen shots below are fairly representative of the difference in detail you will see in the competing encodes. Take a look at the texture on all of the surfaces (the road, the buildings, the water). Take a look at the sharpness of the detail on every object. Everyone we’ve talked to so far agrees that x265 produced the better video. That matches the feedback we get from our customers and prospective customers that have compared x265 to Beamr, and all of our other competitors. We consistently win these customer shoot-outs, based on the quality and performance of x265 and our premium encoding library, UHDkit.
On the Beamr blog, they’ve posted some screen shots, which you can download to see for yourself how much more detail the x265 encodes have. Here are some additional examples…