YouTube is now building its own video transcoding chips | GeekComparison

A Google Argos VCU. It transcodes video very quickly.

Google has decided that YouTube’s transcoding workload is so massive that it needs to build its own server chips. The company detailed its new “Argos” chips in a YouTube blog post, a CNET interview, and a paper for ASPLOS, the Architectural Support for Programming Languages and Operating Systems conference. Just as there are GPUs for graphics workloads and Google’s TPU (tensor processing unit) for AI workloads, YouTube’s infrastructure team says it has created the “VCU,” or “Video (trans)Coding Unit,” which lets YouTube transcode a single video into the more than a dozen versions it needs to provide a smooth, bandwidth-efficient, profitable video site.

Google’s Jeff Calow said the Argos chip has “delivered up to 20-33x improvements in compute efficiency compared to our previous optimized system, which ran software on traditional servers.” The VCU package is a full-length PCI-E card that looks a lot like a graphics card. Each board has two Argos ASIC chips buried under a giant, passively cooled aluminum heat sink. There’s even what looks like an 8-pin power connector on the end, because the PCI-E slot alone can’t supply enough power.

Google provided a nice chip diagram showing 10 “encoder cores” on each chip, with Google’s whitepaper adding that “all other elements are ready-made IP blocks.” Google says that “each encoder core can encode 2160p in real time, up to 60 FPS (frames per second) using three reference frames.”
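As a rough sense of scale, those figures can be turned into a peak pixel rate. The sketch below assumes that 2160p means 3840×2160, that both chips on a card have all 10 cores busy, and that throughput scales linearly across cores. None of those assumptions come from Google’s paper, so treat the result as an upper bound rather than a measured number.

```python
# Back-of-the-envelope peak throughput for one VCU card.
# From the article: 10 encoder cores per chip, 2 chips per card,
# each core encoding 2160p at up to 60 FPS in real time.
# Assumptions (not from the source): 2160p is 3840x2160, and
# throughput scales linearly across cores.

CORES_PER_CHIP = 10
CHIPS_PER_CARD = 2
WIDTH, HEIGHT, FPS = 3840, 2160, 60


def core_pixels_per_second(width: int = WIDTH, height: int = HEIGHT, fps: int = FPS) -> int:
    """Pixels one encoder core handles per second at the quoted rate."""
    return width * height * fps


def card_realtime_streams() -> int:
    """Number of simultaneous 2160p60 real-time encodes per card."""
    return CORES_PER_CHIP * CHIPS_PER_CARD


def card_pixels_per_second() -> int:
    """Peak pixels per second for a whole card under the assumptions above."""
    return card_realtime_streams() * core_pixels_per_second()


if __name__ == "__main__":
    print(f"2160p60 streams per card: {card_realtime_streams()}")
    print(f"Peak pixels/s per card:   {card_pixels_per_second():,}")
```

Under these assumptions a single card tops out at 20 simultaneous 2160p60 encodes, or just under 10 billion pixels per second.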

The cards are specially designed to fit into Google’s warehouse-scale computer system. Each compute cluster in the YouTube system will have a section of special “VCU machines” loaded with the new cards, saving Google from having to break open every server and load it with a new card. Google says the cards resemble GPUs in that they fit into existing accelerator trays. CNET reports that “thousands of chips are currently running in Google’s data centers,” and the cards allow individual video workloads such as 4K video to be “available to view in hours rather than the days previously required.”

Even taking into account the research and development cost of the chips, Google says this VCU plan will save the company a ton of money. The benchmark below shows the setup’s TCO (total cost of ownership) compared to running the algorithm on Intel Skylake chips and Nvidia T4 Tensor core GPUs.

Google's benchmark and cost-of-ownership table from the whitepaper.


YouTube’s unfathomably large transcoding problem

As the largest video site in the world, YouTube was initially seen as impossible to keep running, until Google bought the company in 2006. Since then, Google has fought aggressively to keep the site’s costs down, often by reworking internet infrastructure to make it work. Today, the primary infrastructure problem YouTube needs to solve for end users is delivering video that works exactly right for your device and bandwidth while maintaining quality. That means using a codec supported by your device and choosing a resolution that matches your display (and not blowing up your internet connection with a huge file).

For Google, that means transcoding a single video into a lot of other videos. You can see some of this work for yourself by just clicking the gear for an 8K video, where you’ll see nine total resolutions created from a single upload: 144p, 240p, 360p, 480p, 720p, 1080p, 1440p, 2160p and 4320p. These are all different video files, and each must be created from the original uploaded 8K file so that the right one is available for your specific device.
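The fan-out from one upload to many outputs can be sketched in a few lines. The ladder below is the list of resolutions named above. The rule for picking rungs (everything at or below the source height, never upscaling) is an assumption made for illustration, not something Google has described, and `target_resolutions` is a hypothetical helper name.

```python
from __future__ import annotations

# Resolution ladder as listed in the article (frame heights in pixels).
LADDER = [144, 240, 360, 480, 720, 1080, 1440, 2160, 4320]


def target_resolutions(source_height: int) -> list[int]:
    """Return every ladder rung at or below the source height.

    Assumption: uploads are never upscaled, so rungs taller than
    the source are skipped.
    """
    return [height for height in LADDER if height <= source_height]


if __name__ == "__main__":
    for source in (4320, 1080):
        rungs = target_resolutions(source)
        print(f"{source}p upload -> {len(rungs)} outputs: {rungs}")
```

With this rule an 8K (4320p) upload produces all nine outputs, while a 1080p upload produces six.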

Google also needs to offer some of those nine resolutions in multiple codecs, which determine how the video is compressed as it travels across the Internet. The company wants to offer videos in the most advanced, efficient codec available to save on bandwidth, which is a huge chunk of YouTube’s cost. However, decoding a video codec consumes processing power, and on cheaper mobile devices decoding will not be smooth and efficient without dedicated hardware acceleration support for each new codec. That means Google can use the best codecs only on new devices and has to keep copies of the video in older codecs for older devices.

These days, modern devices usually get the efficient VP9 codec, while the more compatible H.264 is kept for devices that aren’t up to date. No one really knows the depth of YouTube’s video codec selection, but the site generally also supports devices that are nearly 10 years old, including “low-res flip phones,” according to the ASPLOS paper. So there are some pre-H.264 formats, like 3GP, for old devices.
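The trade-off between efficiency and compatibility amounts to a preference list. The sketch below is only an illustration: VP9 and H.264 are the two codecs named here, but the function names, the preference order as a tuple, and the idea of a per-device “supported” set are hypothetical and say nothing about how YouTube actually negotiates formats.

```python
from __future__ import annotations

# Most bandwidth-efficient codec first. Only VP9 and H.264 are named in
# the article; the selection logic itself is a hypothetical illustration.
CODEC_PREFERENCE = ("VP9", "H.264")


def choose_codec(device_supported: set[str]) -> str | None:
    """Pick the most efficient codec the device reports it can decode."""
    for codec in CODEC_PREFERENCE:
        if codec in device_supported:
            return codec
    return None  # no common codec; an older fallback format would be needed


def max_versions(resolutions: int, codecs: int) -> int:
    """Upper bound on stored files if every resolution exists in every codec."""
    return resolutions * codecs


if __name__ == "__main__":
    print(choose_codec({"VP9", "H.264"}))  # newer device
    print(choose_codec({"H.264"}))         # older device
    print(max_versions(9, 2))
```

Nine resolutions in two codecs would give at most 18 files per upload. Since only some resolutions are offered in multiple codecs, the real count is lower, which is consistent with the “more than a dozen versions” Google cites.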

A labeled die shot of an Argos chip.


Google’s YouTube computing challenge becomes even more unfathomably large when you consider that codecs are constantly being pushed forward. Again, since bandwidth is such a huge cost of running the site, Google has every incentive to push these new codecs and upgrade as soon as possible. Upgrading to a new codec means transcoding every video (or at least a majority of them) to the hot new codec, and oh yeah, this has to happen every few years for every new codec.

How many videos do you think there are on YouTube? Google probably only gives stats about growth (such as “500 hours of video are uploaded to YouTube every minute”) because the total number of videos is so large that it’s an unknowable number. Not to mention YouTube Live (imagine all this transcoding happening live, with a 100ms delay) and the additional workload of Drive and Google Photos. Google has the largest transcoding job in the world.
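The upload statistic alone gives a feel for the load. The arithmetic below uses only the quoted 500 hours per minute. The number of output versions per video is a hypothetical parameter, set to 12 in the example purely because Google speaks of “more than a dozen versions.”

```python
# Scale of the incoming workload, from the quoted upload rate.
UPLOAD_HOURS_PER_MINUTE = 500  # figure quoted in the article


def realtime_multiple(hours_per_minute: int = UPLOAD_HOURS_PER_MINUTE) -> int:
    """Minutes of video arriving per minute of wall-clock time."""
    return hours_per_minute * 60


def upload_hours_per_day(hours_per_minute: int = UPLOAD_HOURS_PER_MINUTE) -> int:
    """Hours of video uploaded in a 24-hour day."""
    return hours_per_minute * 60 * 24


def realtime_encodes_needed(versions_per_video: int) -> int:
    """Concurrent 1x-real-time encodes needed just to keep pace.

    versions_per_video is a hypothetical input, not a published figure.
    """
    return realtime_multiple() * versions_per_video


if __name__ == "__main__":
    print(f"Video arrives at {realtime_multiple():,}x real time")
    print(f"{upload_hours_per_day():,} hours uploaded per day")
    print(f"{realtime_encodes_needed(12):,} real-time encodes at 12 versions each")
```

In other words, video arrives 30,000 times faster than it plays back, and that is before counting live streams or re-encoding the back catalog for a new codec.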

Codecs are so important to YouTube’s success that Google is actually taking the lead in developing them. In 2009, Google bought codec developer On2 Technologies (the company that provided the VP6 codec used in Flash video, which powered YouTube at the time), and the search giant has been a major codec developer ever since. After pushing and upgrading to VP8 and VP9, Google is moving forward with its next codec, dubbed “AV1,” which it hopes will see a wide rollout one day. AV1 was created through an industry coalition.

Regarding AV1, Calow told the YouTube blog, “One of the things about this is that it wasn’t a one-off program. It was always the intention to have multiple generations of the chip with fine-tuning of the systems between them. What we’re doing in the next generation chip is adding AV1, a new advanced encoding standard that compresses more efficiently than VP9 and has an even higher computational burden to encode.” AV1 is experimentally available on YouTube and several other video sites, but mass adoption is currently being held back by limited support on client devices. These second-generation chips are already being phased into Google’s server farms, according to CNET.
