Next-Gen Video Encoding: x265 Tackles HEVC/H.265

Introducing HEVC And x265

So much of what we do at Tom’s Hardware depends on an evolving benchmark suite. Sometimes I put up news stories or Twitter posts asking for what you want to see from our reviews, and we’ve added a ton of testing based on that feedback. But we also keep up with industry trends and adopt testing for taxing new technologies as soon as we can.

Now, you’re already familiar with the H.264 video codec, which is instrumental in compressing high-definition video for distribution. Most of the devices you watch movies on employ fixed-function logic to accelerate decoding of H.264-based content, minimizing the host processor workload and, at least on mobile devices, extending battery life. But high-quality software-based encoding can still be pretty taxing, which is why we have Adobe’s Media Encoder, HandBrake, and TotalCode Studio in our standard benchmark suite.

What’s the point of three different benchmarks that involve H.264? As it turns out, each encoding algorithm is different, and at a given quality level, bit rate can vary quite a bit. The following chart, which comes from a comparison conducted by Lomonosov Moscow State University’s Graphics and Media Lab, demonstrates the x264 encoder’s efficiency compared to other popular options.

x264 benefits from years of development and optimization. It’s freely available under the terms of the GNU GPL for internal use, or you can license it commercially if your company is concerned about linking proprietary applications to GPL code. So, big companies like Netflix, Hulu, Amazon, and YouTube are leveraging it to get more quality from lower-bit-rate files, preserving bandwidth and delivering a better experience. Meanwhile, enthusiasts and power users get to use it at home without paying anything, and open source front-ends like HandBrake employ it for H.264-based encoding.

But of course, we’re entering an era of higher-definition displays, higher dynamic range, and larger color spaces, all of which have to be represented by more data. That means larger video files if you want better quality. You can already see how streaming the nicest-looking content is getting increasingly bandwidth-intensive. Fortunately, the standard for H.264’s successor, High Efficiency Video Coding (HEVC), was recently published. It’s more computationally intensive, but should increase coding efficiency dramatically compared to H.264.

Instead of H.264’s 16x16-pixel macroblocks, HEVC employs something called a Coding Tree Unit that can be as large as 64x64, describing less complex areas more efficiently. Even so, 1080p encodes are expected to be five to 10 times more taxing, while 4K video multiplies those demands by another 4 to 16x. Fortunately, a lot of effort went into making sure that encoding can be parallelized, and I’ll illustrate the impact of this shortly.
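To put the block sizes in perspective, here is a rough sketch in Python that counts how many 16x16 macroblocks versus 64x64 Coding Tree Units it takes to tile a single frame. The function name and the frame dimensions (1920x1080 for 1080p, 3840x2160 for 4K) are assumptions made for illustration. Keep in mind that 64x64 is only the largest size a Coding Tree Unit can take, and an encoder is free to split it into smaller units in detailed areas, so the 64x64 figures are a best case rather than what an encoder would actually emit.

```python
import math


def blocks_per_frame(width, height, block):
    """Count the block x block units needed to tile a width x height frame.

    Partial blocks at the right and bottom edges are counted as whole
    blocks, since the frame still has to be covered there.
    """
    return math.ceil(width / block) * math.ceil(height / block)


# Assumed frame sizes; the article only says "1080p" and "4K".
RESOLUTIONS = {"1080p": (1920, 1080), "4K": (3840, 2160)}


def summarize():
    """Return block counts per resolution for 16x16 and 64x64 tiling."""
    rows = {}
    for name, (w, h) in RESOLUTIONS.items():
        rows[name] = {
            "16x16": blocks_per_frame(w, h, 16),
            "64x64": blocks_per_frame(w, h, 64),
        }
    return rows


if __name__ == "__main__":
    for name, counts in summarize().items():
        print(name, counts)
```

Under those assumptions, a 1080p frame needs 8160 blocks at 16x16 but only 510 at 64x64, and a 4K frame needs 32400 versus 2040. That is roughly a sixteenfold reduction in the number of top-level units to describe wherever the picture is simple enough to use the largest size.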

How, you ask? Today, MulticoreWare (the company responsible for creating an OpenCL-accelerated version of x264 for Telestream’s Episode Encoder) is making pre-alpha code for its HEVC encoder available at Bitbucket. Its commercially funded project began earlier this year, and it’ll employ the same business model as x264, meaning you can download and compile x265 under the GNU GPL as well. Leveraging source code from x264 (and indeed, with that project’s lead developer as an adviser), MulticoreWare is hoping to see x265 become a true successor.

Chris Angelini
Chris Angelini is an Editor Emeritus at Tom's Hardware US. He edits hardware reviews and covers high-profile CPU and GPU launches.
  • Jindrich Makovicka
    In addition to PSNR comparison, I'd be much more interested in the SSIM metric, which is better suited for codecs using psychovisual optimizations.

    PSNR can be useful when testing varying parameters for one codec, but not so much when comparing two completely different codecs.
  • CaedenV
    Nice intro to the new codec!
    And to think that this is unoptimized... Once this is finalized it will really blow 264 out of the water and open new doors for 4K content streaming, or 1080p streaming with much better detail and contrast. This is especially important with the jump to 4K video. The 16x16 grouping limit in H.264 is great for 1080p, but with 4K and 8K coming down the pipe in the industry we need something better. The issue is that we really do not have many more objects on the screen than we did back in the days of 480i video; it is merely that each object is more detailed. Funny thing is that a given object will typically have more homogeneous data across its surface area, and when you jump from 1080p to 4K (or 8K, as is being done for movies), it takes a lot more 16x16 groupings, which may all relay the same information if they are describing a large simple object. Moving up to 64x64 alone allows for 8K groupings that take up the same percentage of the screen as 16x16 groupings do in 1080p.
  • nibir2011
    Considering the CPU load, I think it won't be a viable solution for almost any home user within the next 2-3 years unless CPUs get exceptionally fast.

    Of course then we have the Quantum Computer. ;)
  • Shawna593767
    Quantum computers aren't fast enough for this; they get their speed by doing fewer calculations.
    For instance, a faster-per-clock x86 computer might have to do, say, 10 million calculations to find something, whereas the quantum computer is slower per clock but would only need 100,000 calculations.
  • Cryio
    At the rate Intel is NOT improving their CPUs, quantum computers are far, far away
  • nibir2011
    Shawna593767 said:
    Quantum computers aren't fast enough for this; they get their speed by doing fewer calculations.
    For instance, a faster-per-clock x86 computer might have to do, say, 10 million calculations to find something, whereas the quantum computer is slower per clock but would only need 100,000 calculations.


    Well, a practical quantum computer does not exist. lol

    I don't think that is the case with calculations. I think what you mean is accuracy. The number of calculations won't be different; it will be how many times the same calculations need to be done. In theory, a quantum computer should be able to make perfect calculations, as it can get all the possible results through the parallelism of bits. A normal CPU can't do that; it has to evaluate each result separately. So a quantum computer is much more efficient than any traditional CPU. Speed is different; it depends on both algorithm and architecture. Quantum algorithms are in their infancy. Last year, maybe, a quantum algorithm for finding primes was theorized. I do not know if we will see a quantum computer capable of doing what regular computers do in the next 30 years.

    thanks

  • InvalidError
    Most of the 10-bit HDR files I have seen seem to be smaller than their 8-bit encodes for a given quality. I'm guessing this is due to lower quantization error: less bandwidth wasted on fixing color and other cumulative errors and noise over time.
  • ddpruitt
    I know it's a minor detail but it's important:

    H.264 and H.265 are NOT encoding standards, they are DECODING standards. The standards don't care how the video is encoded, just how it's decoded. I think that should be made clear, because the article implies they are encoding standards and people incorrectly assume one implies the other. x264 and x265 are just open source encoders that encode to formats that can be decoded properly by H.264 and H.265 decoders.

    x264 has noticeable issues with blacks, they tend to come out grey. I would like to see if x265 resolves the problem. I would also like to see benchmarks on the decoding end (CPU Load, power usage, etc) as I see this becoming an issue in the future with streaming video on mobile devices and laptops.
  • chuyayala
    I truly hope this is optimized for OpenCL encoding in the future.
  • Nintendo Maniac 64
    You guys should really include VP9 in here as well, since unlike VP8 it's actually competitive according to the most recent testing done on the Doom9 forums, though apparently the reference encoder's 2-pass mode is uber slow.