By Pradeep Ramachandran
By Tom Vaughan
Recently, a competitor (Beamr) published a blog post comparing their HEVC encoder to x265. They claimed that their HEVC encoder is faster, AND it produces better video quality.
Of course, we’re used to people comparing other HEVC encoders to x265.
- x265 is available under an open source license, and therefore it is widely available for any competitor to use in such comparison tests.
- x265 is by far the most widely known and widely used HEVC encoder, and so other companies want to build their brand by comparing to x265.
- But most importantly, they know that x265 is the “gold standard” of HEVC encoders. It’s the benchmark that all others must try to beat.
So, did Beamr’s encoder really beat x265? Or are their claims just a bunch of marketing bluster that isn’t backed up by the facts?
This latest test was conducted in a way that mostly followed the encoder comparison guidelines we recently published. But there were many flaws.
- The bit rates produced were not identical. On average the Beamr encodes had 6% higher bit rates, and in one case the Beamr encode had a bit rate that was 18% higher. This happened because the test video sequences were very short and one-pass encoding was used (giving the encoder less time to “dial in” to the target bit rate). This problem could have been avoided by using two-pass encoding, which can achieve a more accurate bit rate while leveling quality across the encode.
- All of the test parameters were hand-picked by Beamr: the test content, the test hardware, and the settings used, including the bit rate, the number of hardware threads and the x265 presets. This is known as “cherry picking” (the fallacy of incomplete evidence). We wonder how many tests were run under different conditions, in order to determine which tests showed the Beamr encoder in the most favorable light.
- Beamr chose to use x265’s ultrafast, medium and veryslow presets. They claimed a big speed advantage against x265’s veryslow preset. Of course, as the name implies, we designed our veryslow preset to be very slow. It is focused on achieving the highest possible quality, and no compromises are made that would improve performance. The next preset, slower, is twice as fast, but has nearly identical encoding efficiency. Why didn’t Beamr compare their encoder to x265 with the slower preset? [Because they would have lost on a speed comparison, as well as on quality.]
- Beamr only chose video sequences with relatively low motion and detail. Clips like Bar Scene, Dinner Scene, and Wind And Nature are exceptionally easy to encode. Why no sports, or other high detail + high motion content? Perhaps it’s because Beamr relies on using Tiles, which cannot use inter-prediction across tile boundaries. x265 uses Wavefront Parallel Processing, which is more efficient.
- Beamr chose only 4K content, at 7 Mbps bit rates. Why didn’t they compare the two encoders at 1080P, 720P and lower picture sizes? Why not compare the encoders at a wider range of bit rates?
- Beamr chose a four-year-old hardware architecture (Xeon E5 v2, code-named “Ivy Bridge”) to run their comparison test on. These processors don’t support AVX2 instructions, which newer Xeon generations (Haswell, Broadwell, Purley) do. We wonder how the Beamr encoder compares to x265 on modern machines.
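To illustrate the two-pass approach mentioned above, here is a minimal sketch of how a target bit rate can be hit accurately with x265; the file names and the 7 Mbps target are illustrative, not taken from the Beamr test:

```shell
# Pass 1: analyze the whole sequence and write rate-control statistics
# (x265 writes x265_2pass.log by default)
x265 --input source_4k.y4m --preset slower --bitrate 7000 --pass 1 -o /dev/null

# Pass 2: reuse the statistics to hit the target bit rate accurately
# while leveling quality across the encode
x265 --input source_4k.y4m --preset slower --bitrate 7000 --pass 2 -o out.hevc
```

With one-pass rate control on a very short clip, the encoder has little time to converge on the target, which is exactly the effect described above.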
Beamr claims a speed advantage over x265 under these carefully selected conditions. Most commercial companies use a preset like “slow” or “slower”, but rarely use our veryslow preset for their high quality offline encoding. For real-time encoding, we have a more advanced encoding library called UHDkit that can run multiple encoding instances in parallel to achieve high performance on a many-core server, allowing for higher quality settings than x265’s ultrafast preset. On modern Haswell, Broadwell or Purley generation Xeon powered servers, customers who have tried competing HEVC encoders tell us that UHDkit outperforms the competition under a range of scenarios.
Beamr made the bitstreams available, so that anyone could compare the quality of the video produced. You should download the videos and compare for yourself. Beamr claimed that the visual quality of their encodes was clearly superior, but we are confused by this claim. Perhaps their definition of higher visual quality is different from ours. Perhaps by “higher quality”, they mean “softer, with less detail.” If you prefer video with more detail and accuracy, x265 is the clear winner.
To compare video, you can’t look at still images – you need to actually watch the video at full speed. We have a special video comparison tool (UHDcode Pro Player) that we make available to customers and partners which can play 2 streams simultaneously, letting you hide or reveal more or less of each stream. This makes it easy to see which encode is better. But the screen shots below are fairly representative of the difference in detail you will see in the competing encodes. Take a look at the texture on all of the surfaces (the road, the buildings, the water). Take a look at the sharpness of the detail on every object. Everyone we’ve talked to so far agrees that x265 produced the better video. That matches the feedback we get from our customers and prospective customers that have compared x265 to Beamr, and all of our other competitors. We consistently win these customer shoot-outs, based on the quality and performance of x265 and our premium encoding library, UHDkit.
On the Beamr blog, they’ve posted some screen shots, which you can download to see for yourself how much more detail the x265 encodes have. Here are some additional examples…
By Tom Vaughan
Whether you want to compare two encoders, or compare different settings for the same encoder, it’s important to understand how to set up and run a valid test. These guidelines are designed to allow anyone to conduct a good test, with useful results. If you publish the results of an encoder comparison and you violate these rules, you shouldn’t be surprised when video professionals point out the flaws in your test design.
- You must use your eyes. Comparing video encoders without visually comparing the video quality is like comparing wines without tasting them. While it’s tempting to use mathematical quality metrics, like peak signal to noise ratio (PSNR), or Structural Similarity (SSIM), these metrics don’t accurately measure what you are really trying to judge: subjective visual quality. Only real people can judge whether test sample A looks better than test sample B, or whether two samples are visually identical. Video encoders can be optimized to produce the highest possible PSNR or SSIM scores, but then they won’t produce the highest possible visual quality at a given bit rate. If you publish PSNR and SSIM values, but you don’t make the encoded video available for others to compare visually, you’re not conducting a valid test at all.
Note: If you’re running a test with x264 or x265, and you wish to publish PSNR or SSIM scores (hopefully in addition to, and not instead of, conducting subjective visual quality tests), you MUST use --tune psnr or --tune ssim, or your results will be completely invalid. Even with these options, PSNR and SSIM scores are not a good way to compare encoders. x264 and x265 were not optimized to produce the best PSNR and SSIM scores. They include a number of algorithms that are proven to improve subjective visual quality, while at the same time reducing PSNR and SSIM scores. Only subjective visual quality, as judged by real humans, matters.
Of course subjective visual quality testing is very time consuming. But it’s the only valid method for true comparison tests. If, however, you have a very large quantity of video content, and you need to compare the quality of content A against content B, or set up an early warning indicator in an automated quality control system, objective metrics are useful. Netflix has done some valuable work in this area, and we would recommend their VMAF (Video Multimethod Assessment Fusion) metric as the best available today. At best, objective metric scores should be considered only a rough indication of visual quality.
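A minimal sketch of how these metrics are typically gathered; the input file names are placeholders, and the VMAF example assumes an ffmpeg build with libvmaf enabled:

```shell
# If you intend to publish PSNR scores from x265, the encode must use the
# matching tune, or the numbers are invalid
x265 --input source.y4m --preset medium --bitrate 3000 \
     --tune psnr --psnr -o out_psnr.hevc

# VMAF (first input is the distorted encode, second is the reference)
ffmpeg -i distorted.mp4 -i reference.y4m -lavfi libvmaf -f null -
```

Even then, treat the resulting scores as a rough indication of quality, per the guideline above, not as a substitute for watching the video.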
- Video must be evaluated as video, not still frames. It’s relatively easy to compare the visual quality of two decoded frames, but that’s not a valid comparison. Video encoders are designed to encode moving images. Things that are obvious when you are examining a single frame may be completely invisible to any observer when viewed as part of a sequence of images at 24 frames per second or faster. Similarly, you’ll never spot motion inaccuracy or other temporal issues such as pulsing grain or textures if you’re only comparing still frames.
- Use only the highest quality source sequences. A source “sequence” is a video file (a sequence of still pictures) that will serve as the input to your test. It’s important for your source video files to be “camera quality”. You can’t use video that has already been compressed by a camcorder, or video that was uploaded to a video sharing site like YouTube or Vimeo that compresses the video to consumer bit rates so that it can be streamed efficiently from those websites. Important high frequency spatial details will be lost, and the motion and position of objects in the video will be inaccurate if the video was already compressed by another encoder.
In the early days of digital video, film cameras were able to capture higher quality images than video cameras, and so the highest quality source sequences were originally shot with film movie cameras, and then scanned one frame at a time. Today, high quality digital video cameras are able to capture video images that rival the highest quality film images. Modern professional digital video cameras can either record uncompressed (RAW) or very lightly compressed high bit rate (Redcode, CinemaDNG, etc.) video, or transfer uncompressed video via HDMI or SDI to an external recording device (Atomos Shogun, Blackmagic Video Assist), which can store the video in a format that utilizes very little video compression (ProRes HQ, DNxHD, DNxHR). Never use the compressed video from a consumer video camera (GoPro, Nikon, Canon, Sony, Panasonic, etc.). The quality of the embedded video encoder chips in consumer video cameras, mobile devices and DSLRs is not good enough. To test video encoders, you need video that does not already include any video compression artifacts.
- Use a variety of source sequences. You should include source video that is a representative sample of all of the scenarios that you are targeting. This may include different picture sizes (4K, 1080P, 720P, etc), different frame rates, and different content types (fixed camera / talking heads, moving camera, sports/action, animation or computer generated video), and different levels of complexity (the combination of motion and detail).
- Reproducibility matters. Ideally, you should choose source sequences that are available to others so that they can replicate your test, and reproduce and validate your results. A great source of test sequences can be found at https://media.xiph.org/video and https://media.xiph.org/video/derf/. Otherwise, if you are conducting a test that you will publish and you have your own high quality source sequences, you should make them available for others to replicate your test.
- Speed matters. Video encoders try many ways to encode each block of video, choosing the best one. They can be configured to go faster, but this will always have a trade-off in encoding efficiency (quality at a given bit rate). Typically, encoders provide presets that enable you to choose a point along the speed vs. efficiency tradeoff function (x264 and x265 provide ten performance presets, ranging from --preset ultrafast to --preset placebo). It’s not a valid test to compare two encoders unless they are configured to run at a similar speed (the frames per second that they encode) on identical hardware systems. If encoder A requires a supercomputer to compare favorably with encoder X running on a standard PC or server, or if both encoders are not tested with similar configurations (fast, medium, or slow/high quality), the result is not valid.
- Eliminate confounding factors. When comparing encoding performance (speed), it’s crucial to eliminate other factors, such as decoding, multiplexing and disk I/O. Encoders only take uncompressed video frames as input, so you can decode a high quality source sequence to raw YUV, storing it on a fast storage system such as an SSD or array of SSDs so that I/O bandwidth will be adequate to avoid any bottlenecks.
- Bit Rate matters. If encoders are run at high bit rates, the quality may be “visually lossless”. In other words, an average person will not be able to see any quality degradation between the source video and the encoded test video. Of course, it isn’t possible to determine which encoder or which settings are best if both test samples are visually lossless, and therefore, visually identical. The bit rates (or quality level, for constant quality encoding) you choose for your tests should be in a reasonable range. This will vary with the complexity of the content, but for typical 1080P30 content, for HEVC encoder testing, you should test at bit rates ranging roughly from 400 kbps to 3 Mbps, and for 4K30 you should cover a range of roughly 500 kbps to 15 Mbps. It will be easiest to see the differences at low bit rates, but a valid test will cover the full range of quality levels applicable to the conditions you expect the encoder to be used for.
- Rate Control matters. Depending on the method that the video will be delivered, and the devices that will be used to decode and display the video, the bit rate may need to be carefully controlled in order to avoid problems. For example, a satellite transmission channel will have a fixed bandwidth, and if the video bit rate exceeds this channel bandwidth, the full video stream will not be able to be transmitted through the channel, and the video will be corrupted in some way. Similarly, most video is decoded by hardware video decoders built into the device (TV, PC, mobile device, etc.), and these decoders have a fixed amount of memory to hold the incoming compressed video stream, and to hold the decoded frames as they are reordered for display. Encoding a video file to an overall average target bit rate is relatively easy. Maintaining limits on bit rate throughout the video, so as not to overfill a transmission channel, or overflow a video decoder memory buffer is critical for professional applications.
- Encoder Settings matter. There are many, many settings available in a good video encoder, like x265. We have done many experiments to determine the optimal combinations of settings that trade off encoder speed for encoding efficiency, and the result is our ten performance presets, which make it easy to run valid encoder comparison tests. If you are comparing x265 with another encoder and you believe you need to modify default settings, contact us to discuss your test parameters, and we’ll give you the guidance you need.
- Show your work. Before you believe any published test or claim (especially from one of our competitors), ask for all of the information and materials needed to reproduce those results. It’s easy to make unsubstantiated claims, and it’s easy for companies to run hundreds of tests, cherry-picking the tests that show their product in the most favorable light. Unless you are given access to the source video, the encoded bitstreams, the settings, the system configuration, and you are able to reproduce the results independently with your own test video sequences under conditions that meet your requirements, don’t believe everything you read.
- Speak for yourself. Don’t claim to be an expert in the design and operation of a particular video encoder if you are not. Recognize that your experience with each encoder is limited to the types of video you work with, while encoders are generally designed to cover a very wide range of uses, from the highest quality archiving of 8K masters or medical images, to extremely low bit rate transmission of video through wireless connections. If you want to know what an encoder can or can’t do, or how to optimize it for a particular scenario, you should ask the developers of that encoder.
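Putting the points above about confounding factors and rate control together, a test pipeline might look like the following sketch; the file names and bit rate values are illustrative:

```shell
# Decode the mezzanine file once to raw Y4M on a fast SSD, so decoding,
# demultiplexing, and disk I/O are removed from the encoder timing
ffmpeg -i mezzanine_prores.mov -pix_fmt yuv420p /ssd/source.y4m

# Encode with VBV constraints so the bit rate never overflows a fixed
# transmission channel or a decoder's memory buffer
x265 --input /ssd/source.y4m --preset slow \
     --bitrate 3000 --vbv-maxrate 3000 --vbv-bufsize 6000 -o out.hevc
```

Timing only the second command measures the encoder itself, on input that both encoders under test can consume identically.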
Hardware vs. Software encoders.
It is a bit silly to compare hardware encoders to software encoders. While it’s interesting to know how a hardware encoder compares to a software encoder at any given point in time on a given hardware configuration, there are vast differences between the two types of encoders. Each type has distinct advantages and disadvantages. Hardware encoders are not cross platform; they are either built in or added on to the platform. Hardware encoders are typically designed to run in real-time, and with lower power consumption than software encoders, but for the highest quality video encoding, hardware encoders can NEVER beat software encoders, because their algorithms are fixed (designed into the hardware), while software encoders are infinitely flexible, configurable and upgradeable. There are many situations where only a hardware encoder makes sense, such as in a video camera or cell phone. There are also many situations where only a software encoder makes sense, such as when it comes to high quality video encoding in the cloud, on virtual machines.
By Pradeep Ramachandran
- Improved grain handling with the --tune grain option, by throttling VBV operations to limit QP jumps.
- Frame threads are now decided based on the number of threads specified in --pools, as opposed to the number of hardware threads available. The mapping was also adjusted to improve quality of the encodes with minimal impact to performance.
- CSV logging feature (enabled by --csv) is now part of the library; it was previously part of the x265 application. Applications that integrate libx265 can now extract frame-level statistics for their encodes by exercising this option in the library.
- Globals that track min and max CU sizes, number of slices, and other parameters have now been moved into instance-specific variables. Consequently, applications that invoke multiple instances of x265 library are no longer restricted to use the same settings for these parameter options across the multiple instances.
- x265 can now generate a separate library that exports the HDR10+ parsing API. Other libraries that wish to use this API may do so by linking against this library. Enable ENABLE_HDR10_PLUS in the CMake options and build to generate this library.
- SEA motion search receives a 10% performance boost from AVX2 optimization of its kernels.
- The CSV log is now more elaborate, with additional fields such as PU statistics, average-min-max luma and chroma values, etc. Refer to the documentation of --csv for details of all fields.
- x86inc.asm cleaned-up for improved instruction handling.
- New API x265_encoder_ctu_info() introduced to specify suggested partition sizes for various CTUs in a frame. To be used in conjunction with --ctu-info to react to the specified partitions appropriately.
- Rate-control statistics passed through the x265_picture object for an incoming frame are now used by the encoder.
- Options to scale, reuse, and refine analysis for incoming analysis shared through the x265_analysis_data field in x265_picture for runs that use --analysis-reuse-mode load; use options
- VBV now has a deterministic mode. Use
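As a sketch of how a couple of the options above combine in practice (the file names and CRF value are illustrative, not from the release notes):

```shell
# Grain-heavy film content: --tune grain keeps QP stable so grain doesn't
# pulse, while --csv logs per-frame statistics for later inspection
x265 --input film_grain.y4m --preset slow --crf 20 \
     --tune grain --csv stats.csv --csv-log-level 2 -o out.hevc
```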
By Pradeep Ramachandran
Today Intel launched the next generation of Xeon processors, the Intel Xeon Scalable Processor Family (code-named “Purley”), based on the Skylake CPU architecture. The Intel Xeon Scalable Processor Family is a powerful new generation of 14nm chips which provide significant improvements over the previous generation of Xeon processors (Xeon E5 v4 and E7 v4, code named “Broadwell”), including many fundamental CPU architectural improvements, a much faster internal data transfer architecture (a mesh architecture with 2x the bandwidth instead of a ring architecture), AVX-512 vector processing, improved cache, and improved I/O architecture with six DDR4 memory channels and 48 PCIe lanes.
With x265 pushing the previous generation processors to the edge for memory bandwidth and threading, the benefits that these new Xeons provide for x265 users will be game changing. Our initial results with the latest build of x265 show a 67% average per-core gain for encoding using HEVC Main profile, and a 50% average gain with Main10 profile across different presets. In particular, off-line encoding of 4K content is seeing tremendous benefits due to the higher memory bandwidth that the CPUs are able to utilize from cache and system memory. Intel’s Xeon Scalable Processor Family makes x265 and UHDkit the ideal option for a wider range of scenarios including both live and offline HEVC encoding, and they double the performance/cost you’ll get with our software-based encoding libraries. We’re also seeing significant performance improvements with x264 – roughly 40% higher performance per core on average.
As we enhance x265 to take advantage of the new technologies that these new processors bring, including AVX-512, we expect that users of x265 will love the benefits that they see with these new Xeons. This even extends to the Core i9 (Skylake-X) consumer processor family, which is based on the same Purley architecture. Give them a spin, and let us know what you think!
By Pradeep Ramachandran
- HDR10+ supported. Dynamic metadata may be either supplied as a bitstream via the userSEI field of x265_picture, or as a JSON file that can be parsed by x265 and inserted into the bitstream; use --dhdr10-info to specify the JSON file name, and --dhdr10-opt to enable optimization of inserting tone-map information only at IDR frames, or when the tone-map information changes.
- Lambda tables for 8, 10, and 12-bit encoding revised, resulting in significant enhancement to subjective visual quality.
- Enhanced HDR10 encoding with HDR-specific QP optimizations for chroma and luma planes of WCG content enabled; use
- Ability to accept analysis information from other previous encodes (that may or may not be x265), and selectively reuse and refine analysis for encoding subsequent passes enabled with the
- Slow and veryslow presets receive a 20% speed boost at iso-quality by enabling the
- The bitrate target for x265 can now be dynamically reconfigured via the reconfigure API.
- Performance-optimized SAO algorithm introduced via the --limit-sao option; seeing 10% speed benefits at faster presets.
- x265_reconfigure API now also accepts rate-control parameters for dynamic reconfiguration.
- Several additions to data fields in x265_analysis to support --refine-level; see x265.h for more details.
- Avoid negative offsets in x265 lambda2 table with SAO enabled.
- Fix mingw32 build error.
- Seek now enabled for pipe input, in addition to file-based input.
- Fix issue of statically linking core-utils not working on Linux.
- Fix visual artifacts with
- Fix bufferFill stats reported in csv.
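A sketch of the HDR10+ workflow described in the first item; this assumes a build with ENABLE_HDR10_PLUS, and metadata.json is a placeholder file name:

```shell
# Insert HDR10+ dynamic metadata from a JSON file; --dhdr10-opt limits
# tone-map SEI insertion to IDR frames (or to frames where it changes)
x265 --input hdr_source.y4m --preset slow --crf 18 \
     --dhdr10-info metadata.json --dhdr10-opt -o out.hevc
```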
By Tom Vaughan
MONTREAL, CANADA – APRIL 19, 2017 – At the 2017 NAB Show, Haivision will demonstrate a breakthrough in live 4Kp60 HEVC software-only video streaming, leveraging the unparalleled quality of x265 software encoding while running at a performance level that was previously only possible with dedicated hardware. This demonstration will be presented by Haivision’s HaiGear Labs, the company’s technologies research group, at the Renaissance Hotel (suite Ren Deluxe – B) next to the Las Vegas Convention Center.
Through the use of commodity off-the-shelf processing capabilities, Haivision will showcase how x265 software encoding, running on readily available dual-socket servers from the Intel® Xeon® Processor E5-2600 v4 product family, addresses the growing demand for high-quality 4K video streaming. This development brings down the costs associated with encoding live 4K video and enables 4K video streaming on ubiquitous Intel cloud compute architectures.
The foundation for these video streaming innovations comes from the company’s four years of active involvement in the x265 open source project, a commercially backed initiative founded with the goal of producing the highest performance, most efficient HEVC/H.265 video encoder software implementation. Haivision is an original charter licensee of the x265 project and has made significant contributions to the x265 initiative through tight technology collaboration with MulticoreWare, the primary developer of the widely adopted open-source codec.
Haivision’s quality-to-performance in its live 4K HEVC demonstration leverages UHDKit, MulticoreWare’s extended encoding library built on top of the x265 HEVC encoder. By heavily investing in advancing the UHDKit for low-latency live encoding, Haivision has been able to push the boundaries on what has been possible in HEVC software encoding.
“Haivision has been an active contributor to x265 and UHDKit and has helped MulticoreWare push the envelope with regard to live encoding performance,” said Tom Vaughan, vice president, general manager for video, MulticoreWare. “Haivision’s numerous contributions are invaluable to every user of x265.”
“Haivision’s long-term association with MulticoreWare’s x265 project and our tuning of the UHDKit for high performance streaming on the Intel platform has enabled our customers to benefit from software-only or CPU/GPU balanced performance,” said Mahmoud Al-Daccak, chief technology officer, Haivision. “We will continue to pioneer and contribute to these development communities that rely on open-source initiatives to move the streaming video industry forward.”
As a pioneer in high performance streaming solutions, Haivision innovates in the areas of live hardware and software encoding/decoding, video stream transport and management. The company is dedicated to pushing the technology envelope, and fostering partnerships and collaboration within the industry to expand the ecosystems of performance video that its customers depend on. To learn more or book a demonstration, visit haivision.com/nab.
Haivision, a private company founded in 2004, provides media management and video streaming solutions that help the world’s leading organizations communicate, collaborate and educate. Haivision is recognized as one of the most influential companies in video by Streaming Media and one of the fastest growing companies by Deloitte’s Technology Fast 500. Haivision is headquartered in Montreal and Chicago, with regional offices located throughout the United States, Europe, Asia and South America. Learn more at haivision.com.
By Tom Vaughan
MulticoreWare, the developers of x265, will be at the National Association of Broadcasters convention in Las Vegas, Nevada, April 24-27th. You can find us in the South Hall, Upper, booth SU14002. We’ll be demonstrating the latest advances to x265, and our premium video encoding framework, UHDkit (which includes both x264 and x265, plus many extended capabilities). If you haven’t registered, you can get a free guest pass by using Guest Pass Code: LV6642. Contact us through our x265 Facebook page if you would like to schedule a meeting.
By Pradeep Ramachandran
Release date – 15th February, 2017.
- New SSIM-based RD-cost computation for improved visual quality and efficiency; use
- Multi-pass encoding can now share analysis information from prior passes (in addition to rate-control information) to improve performance and quality of subsequent passes; to your multi-pass command lines, add --multi-pass-opt-distortion to share distortion information, and --multi-pass-opt-analysis to share other analysis information.
- A dedicated thread pool for lookahead can now be specified with
- New option --dynamic-rd dynamically increases analysis in areas where the bitrate is being capped by VBV; works for both CRF and ABR encodes with VBV settings.
- The number of bits used to signal the delta-QP can be optimized with the --opt-cu-delta-qp option; found to be useful in some scenarios for lower bitrate targets.
- Experimental option --aq-motion adds new QP offsets based on relative motion of a block with respect to the movement of the frame.
- Reconfigure API now supports signalling new scaling lists.
- x265 application’s csv functionality now reports time (in milliseconds) taken to encode each frame.
- --strict-cbr enables stricter bitrate adherence by adding filler bits when the achieved bitrate is lower than the target; earlier, it only reacted when the achieved rate was higher.
- --hdr can be used to ensure that max-cll and max-fall values are always signaled (even if 0,0).
- Fixed incorrect HW thread counting on MacOS platform.
- Fixed scaling lists support for 4:4:4 videos.
- Inconsistent output fix for --opt-qp-pps by removing the last slice’s QP from cost calculation.
- VTune profiling (enabled using ENABLE_VTUNE CMake option) now also works with 2017 VTune builds.
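A sketch of the multi-pass analysis sharing described above, using a two-pass ABR encode; file names and the bitrate are illustrative:

```shell
# Pass 1 writes rate-control and analysis data; pass 2 reuses both,
# improving the speed and quality of the second pass
x265 --input source.y4m --preset slow --bitrate 4000 --pass 1 \
     --multi-pass-opt-distortion --multi-pass-opt-analysis -o /dev/null
x265 --input source.y4m --preset slow --bitrate 4000 --pass 2 \
     --multi-pass-opt-distortion --multi-pass-opt-analysis -o out.hevc
```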
By Pradeep Ramachandran
Release date – 26th December, 2016.
- Enhancements to TU selection algorithm with early-outs for improved speed; use
- New motion search method SEA (Successive Elimination Algorithm) supported now as --me 4.
- Bit-stream optimizations to improve fields in PPS and SPS for bit-rate savings through
- Enabled using VBV constraints when encoding without WPP.
- All param options dumped in SEI packet in bitstream when info selected.
- x265 now supports POWERPC-based systems. Several key functions also have optimized ALTIVEC kernels.
- Options to disable SEI and optional-VUI messages from bitstream made more descriptive.
- New option --scenecut-bias to enable controlling the bias to mark scene-cuts via the CLI.
- Support mono and mono16 color spaces for y4m input.
- --min-cu-size of 64 is no longer supported for reasons of visual quality (it was crashing earlier anyway).
- API for CSV now expects version string for better integration of x265 into other applications.
- Several fixes to slice-based encoding.
- --log2-max-poc-lsb’s range limited according to the HEVC spec.
- Restrict MVs to within legal boundaries when encoding.
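A sketch exercising two of the options above together; the input file and CRF value are placeholders:

```shell
# Try the new SEA motion search (--me 4) and tune the scene-cut bias
# (--scenecut-bias takes a percentage; 5 is the default)
x265 --input source.y4m --preset medium --crf 22 \
     --me 4 --scenecut-bias 5 -o out.hevc
```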