TikTok parent company used AI to optimize Linux kernel, boosting performance and efficiency

A technical presentation from Chinese tech company ByteDance — which is best known for creating TikTok — detailed how it used AI and machine learning to make the Linux kernel run better on any hardware (via TechSpot). ByteDance believes that in the future computer engineers will likely have to lean on AI for kernel optimization. And with the gains touted in the presentation, those claims might be right.

ByteDance gave its presentation at the Linux Plumbers Conference on Nov. 14 — and though you might think the developer of TikTok is out of place here, you'd be wrong. The presentation, delivered by ByteDance engineer Cong Wang, was heavily detailed both technically and academically (it was made for computer engineers, after all).

The general gist of the presentation: ByteDance used AI to make the Linux kernel (the core of the operating system) much more efficient and performant across all kinds of hardware. We've recently seen AI help make GPU drivers more efficient, but doing the same thing in the kernel of an OS is a significant step up as a technical feat.

That this AI-powered solution worked universally is a big deal, as hardware-specific optimizations are often required to achieve good performance — and that can be challenging for developers because there are so many possible combinations of components.

The presentation detailed how AI optimizations were able to reduce memory usage by 30% — and that was using existing Linux tools, just more efficiently. Network latency was also improved by up to 12% with AI that has prior knowledge (which wouldn't be hard to obtain on a computer used regularly).

ByteDance concluded that AI-assisted kernel optimization could also help balance CPU usage, use cache more effectively, and even detect malware. At the same time, it acknowledged that machine learning and AI aren't a silver bullet: human engineers will apparently not be replaced by computers any time soon when it comes to writing kernel code.

Matthew Connatser

Matthew Connatser is a freelance writer for Tom's Hardware US. He writes articles about CPUs, GPUs, SSDs, and computers in general.

  • Order 66
    I wonder how much this improves gaming performance, if at all.
  • hotaru.hino
    My hope is ByteDance will honor the GPL requirements and upload their changes somewhere so that whatever changes the AI did can be studied.

    But given the political implications...
  • TJ Hooker
    hotaru.hino said:
    My hope is ByteDance will honor the GPL requirements and upload their changes somewhere so that whatever changes the AI did can be studied.

    But given the political implications...
    GPL applies when distributing software. Unless this tweaked kernel is present in software they actually release, I don't think they have any obligation under GPL to make the source code changes available.
  • Grobe
    TJ Hooker said:
    Unless this tweaked kernel is present in software they actually release, I don't think they have any obligation to make the source code changes available.
    Also, my gut feeling is this was more proof of concept and not a stable and proven version ready for everyday use.
  • NinoPino
    Without source code to review and, most importantly, Torvalds' approval, optimizations are worthless.
  • wakuwaku
    hotaru.hino said:
    My hope is ByteDance will honor the GPL requirements and upload their changes somewhere so that whatever changes the AI did can be studied.

    But given the political implications...

    TJ Hooker said:
    GPL applies when distributing software. Unless this tweaked kernel is present in software they actually release, I don't think they have any obligation to make the source code changes available.

    Grobe said:
    Also, my gut feeling is this was more proof of concept and not a stable and proven version ready for everyday use.

    NinoPino said:
    Without source code to review and, most importantly, Torvalds' approval, optimizations are worthless.

    TLDR:
    They do not need to open source their kernel or modifications simply because they did not modify the Linux kernel. They used AI to modify existing parameters or settings of the Linux kernel, instead of having humans do it manually.

    This article on Tom's is dumbed down too much. You guys should at least breeze through the quoted source article from TechSpot, and also the source article quoted in the TechSpot article, which is from ZDNet.
    If you did, you would know that the ByteDance engineer did not modify the Linux kernel, nor make their own version or anything like that.

    The Linux kernel has thousands of parameters or settings that a user can modify and tune to suit a specific type of work and improve performance. One of the best examples: if you have dabbled in custom ROMs and kernels, rooting, and Magisk for your Android phone, you will have found lots of guides on different kernel settings for different types of phone usage, as well as apps or GUIs that help you tweak them yourself or use preset settings that the community thinks are best.

    What ByteDance did is use AI to analyze the type of work that a user is doing and decide the best parameters to apply to the Linux Kernel to maximize performance of that work. If I understand correctly from the source article, it is also doing this dynamically, meaning it can probably do this on the fly and set the best parameters every time a different new type of work starts.
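    To make the idea concrete, here is a minimal Python sketch of workload-driven parameter selection. It is not ByteDance's implementation: the `classify_workload` rule is a hypothetical stand-in for a trained model, the metric names and profile values are made up for illustration, and only the sysctl names (`vm.swappiness`, `vm.dirty_ratio`, `net.core.somaxconn`) are real kernel parameters. It defaults to a dry run, so nothing is changed unless you ask for it.

```python
import os

# Illustrative profiles. The sysctl names are real; the values are
# placeholders for this example, not recommendations from the talk.
PROFILES = {
    "network": {"net.core.somaxconn": "4096", "vm.swappiness": "10"},
    "memory": {"vm.swappiness": "60", "vm.dirty_ratio": "10"},
}


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name to its file under /proc/sys."""
    return os.path.join(root, *name.split("."))


def classify_workload(metrics):
    """Hypothetical stand-in for the ML model: one threshold rule."""
    if metrics["net_packets_per_s"] > metrics["page_faults_per_s"]:
        return "network"
    return "memory"


def apply_profile(profile, root="/proc/sys", dry_run=True):
    """Return the planned (path, value) writes; perform them if not a dry run."""
    changes = []
    for name, value in sorted(profile.items()):
        path = sysctl_path(name, root)
        changes.append((path, value))
        if not dry_run:
            # Writing under the real /proc/sys requires root privileges.
            with open(path, "w") as f:
                f.write(value)
    return changes


if __name__ == "__main__":
    kind = classify_workload({"net_packets_per_s": 5000, "page_faults_per_s": 100})
    for path, value in apply_profile(PROFILES[kind]):
        print(f"would write {value} to {path}")
```

    A dynamic tuner would rerun the classification whenever the workload changes and reapply the matching profile. The `root` argument exists only so the sketch can be tried against a scratch directory instead of the live system.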

    Since ByteDance is only using AI to change EXISTING Linux kernel settings, not directly modifying the kernel itself, there are not really any GPL violations or obligations. There are no source code changes; they did not do anything to the source code of the Linux kernel. There are no proof-of-concept or stability issues beyond what already exists in the Linux kernel.

    Maybe they can release the AI model itself, make it open source. Maybe release an app or a distro integrating the AI and make that open source.

    Edit: Also note that they presented at the Linux Plumbers Conference, so it is not like they are trying to hide or be secretive about what they achieved.
  • NinoPino
    wakuwaku said:
    TLDR:
    They do not need to open source their kernel or modifications simply because they did not modify the Linux kernel. They used AI to modify existing parameters or settings of the Linux kernel, instead of having humans do it manually.

    Sorry, I did not check the source.
    Thanks.
  • bit_user
    Does anyone have a link to a PDF version of the slides or a paper? (see below) I don't want to watch the whole talk and hate trying to click through a YouTube video, especially since they took the lazy approach of just recording an 8-hour block.

    Edit: found a link to the slides, thanks to the TechSpot article!
    https://lpc.events/event/17/contributions/1520/attachments/1152/2582/Linux%20Kernel%20Autotuning.pdf
    The slides then have a link to a paper (note: not theirs - thanks @TJ Hooker !):
    https://arxiv.org/pdf/2111.11554.pdf
    The paper then has a link to their code (again, not theirs):
    https://github.com/sbu-fsl/kernel-ml
    Anyway, it's probably good to start with some background. Seven years ago, a startup called Concertio released a toolkit that uses AI to dynamically tune Linux kernel parameters (after some 18+ months of development). They've since been acquired by Synopsys:
    https://www.phoronix.com/news/Concertio-Optimizer-Studio
    https://concertio.com/
    Far more recently, we heard about an effort by Oracle to use eBPF (think of it as little "plug-in" programs you can load into the kernel) to do something similar. Essentially, this represents a more tightly-integrated version of the above.
    https://www.phoronix.com/news/Oracle-bpftune
    The downside of both approaches is that they're limited to adjusting parameters which have already been exposed by various subsystems. I believe the research referenced by this article explores even tighter integration of AI into the kernel, which ultimately would let you optimize things that can't easily be exposed as a set of simple parameters. For instance, if you're doing prefetching or swapping, you might use AI to predict the likelihood that a given block or page is going to be referenced in the near future.
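    As a rough illustration of that last point, here is a small Python sketch that estimates how likely one page is to follow another. It swaps in a simple first-order Markov (transition-counting) predictor rather than the neural networks discussed in the paper, and the class name, method names, and the 0.5 threshold are all assumptions made for the example.

```python
from collections import Counter, defaultdict


class NextPagePredictor:
    """Counts which page tends to follow which (first-order Markov model)."""

    def __init__(self):
        self.transitions = defaultdict(Counter)
        self.last_page = None

    def record(self, page):
        """Feed one observed page access into the model."""
        if self.last_page is not None:
            self.transitions[self.last_page][page] += 1
        self.last_page = page

    def likelihood(self, current, candidate):
        """Estimated probability that `candidate` is accessed right after `current`."""
        counts = self.transitions.get(current)
        if not counts:
            return 0.0  # no history for this page yet
        return counts[candidate] / sum(counts.values())

    def should_prefetch(self, current, candidate, threshold=0.5):
        """Prefetch only when the estimate clears an (assumed) threshold."""
        return self.likelihood(current, candidate) >= threshold


if __name__ == "__main__":
    predictor = NextPagePredictor()
    for page in [1, 2, 1, 2, 1, 3]:
        predictor.record(page)
    print(predictor.likelihood(1, 2), predictor.should_prefetch(1, 3))
```

    A learned model would replace the counting table with something that generalizes to unseen patterns, but the interface stays the same: observe accesses, then ask how likely a candidate is before spending I/O on it.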

    As with any sort of adaptive, predictive optimization, the potential exists to hit some pathological case where the optimization does more harm than good (for a simple example of this, see cache thrashing). Then there's the overhead of running the AI algorithms themselves, which needs to be consistently less than whatever benefits you're gaining.
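    That trade-off can be written as a back-of-the-envelope calculation. The Python sketch below is only an assumed cost model (a correct prediction saves time, a wrong one costs a penalty, and every decision pays the inference cost); the function name and the numbers in the example are hypothetical.

```python
def net_gain_per_decision(accuracy, saved_us, penalty_us, inference_us):
    """Expected microseconds gained per prediction under a simple cost model.

    accuracy     -- fraction of predictions that turn out right (0.0 to 1.0)
    saved_us     -- time saved by a correct prediction
    penalty_us   -- time lost to a wrong prediction (e.g. evicting useful data)
    inference_us -- cost of running the model, paid on every decision
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return accuracy * saved_us - (1.0 - accuracy) * penalty_us - inference_us


if __name__ == "__main__":
    # Made-up numbers: a positive result means the optimization pays for itself.
    print(net_gain_per_decision(0.9, saved_us=100, penalty_us=200, inference_us=20))
    print(net_gain_per_decision(0.5, saved_us=100, penalty_us=200, inference_us=20))
```

    With these made-up inputs the first case comes out ahead and the second is a net loss, which is the pathological situation described above: the predictor is wrong often enough that its mistakes and its own running cost outweigh the savings.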
  • bit_user
    NinoPino said:
    Without source code to review and, most importantly, Torvalds' approval, optimizations are worthless.
    I think it's just research. They're at the stage of exploring what's possible and then finding what benefits it can provide. That can inform others who want to do a full production implementation of such ideas.

    In other words, it's a little like the difference between science and engineering.

    Although they didn't necessarily have to (however, maybe it's a requirement for publication or presentation in/at certain journals or conferences), they did in fact release their code:
    https://github.com/sbu-fsl/kernel-ml
  • bit_user
    wakuwaku said:
    TLDR:
    They do not need to open source their kernel or modifications simply because they did not modify the Linux kernel.
    Actually, they did. The paper goes into great detail about their design; however, even the presentation makes the case for why you really want to have integration directly with the kernel.

    wakuwaku said:
    This article on tom's is dumbed down too much. You guys should at least breeze through the quoted source article from techspot,
    Thanks for pointing me at the TechSpot link, but you should have taken your own advice and kept going until you found the original paper or at least the slides. My experience has been that articles on sites like this are either written by authors not sufficiently versed in what they're trying to cover, or they oversimplify it, as you said.

    wakuwaku said:
    Since ByteDance is only using AI to change EXISTING Linux kernel settings, not directly modifying the kernel itself, there are not really any GPL violations or obligations.
    The license on their main repo is Apache, which you wouldn't be able to load into the kernel without tainting it. However, the code for their kernel module appears to be in a copy they made of some kernel rev, but they committed the entire, modified tree in one shot, so I haven't found their code to check whether it has a different license.

    What's kind of interesting about their GitHub repo is that it all seems about 2 years old. I'm guessing they haven't been able to present before now, due to COVID-related travel restrictions that were lifted much later in China than elsewhere.

    wakuwaku said:
    There are no source code changes; they did not do anything to the source code of the Linux kernel.
    LOL, which is why they made a copy of the entire kernel tree on GitHub, and then their README explains how to build it. Right.