Former Tesla AI Director reproduces GPT-2 in 24 hours for only $672 — GPT-4 costs $100 million to train

OpenAI launched GPT-2 in 2019, and training it reportedly cost $256 per hour. Five years later, the field has already moved on to GPT-4o, and advances in hardware, software, and data mean that training the same model now takes far less time and money, as Andrej Karpathy, the developer behind the llm.c project to reproduce GPT-2, has proven.

The primary driver of the cost savings is using a single 8XH100 node for the training, which brought the price down to just $28 an hour, a reduction of almost 90% in only five years. Nvidia launched the H100 in 2023, so OpenAI likely used far less powerful hardware when it originally worked on GPT-2, although the number of hours that original training run took is unknown. For comparison, training GPT-4 cost more than $100 million.
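
The headline numbers are easy to sanity-check. The sketch below is a back-of-envelope calculation in C using only the figures quoted above: Karpathy's reported $28-per-hour node rate and 24-hour run time, plus the $256-per-hour rate reported for 2019 (the comparison is of hourly rates only, since the original run's duration isn't known).

```c
#include <stdio.h>

int main(void) {
    /* Figures quoted in the article; see the lead-in for caveats. */
    const double h100_rate = 28.0;   /* $/hour for one 8XH100 node */
    const double gpt2_rate = 256.0;  /* $/hour reported for the original 2019 run */
    const double run_hours = 24.0;   /* Karpathy's reported wall-clock time */

    double run_cost  = h100_rate * run_hours;                 /* 28 * 24 = 672 */
    double reduction = 100.0 * (1.0 - h100_rate / gpt2_rate); /* ~89% cheaper per hour */

    printf("Cost of the 24-hour llm.c run: $%.0f\n", run_cost);
    printf("Hourly-rate reduction vs. 2019: %.1f%%\n", reduction);
    return 0;
}
```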

Another thing that made llm.c quick to spin up and train is that it implements GPT training directly, with almost no software dependencies. Karpathy said, “Because llm.c is a direct implementation of GPT training in C/CUDA, the requirements are minimal — there is no need for conda environments, Python interpreters, pip installs, etc. You spin up a cloud GPU node, optionally install NVIDIA cuDNN, NCCL/MPI, download the .bin data shards, compile and run, and you’re stepping in minutes.” He added, “You then wait 24 hours and enjoy samples about English-speaking Unicorns in the Andes.”
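
llm.c itself is far too large to reproduce here, but the compile-and-run workflow Karpathy describes can be illustrated with a deliberately toy, hypothetical example. The sketch below fits a tiny linear model with hand-written gradients in plain C, following the same forward, backward, and update loop structure a GPT trainer uses, with no Python interpreter, conda environment, or pip install involved. It stands in for the style of llm.c, not its actual code.

```c
/* Toy illustration only: a dependency-free training loop in plain C.
 * llm.c trains a full GPT-2 this way (in C/CUDA); here we just fit
 * y = w*x + b to a few points with gradient descent, to show the
 * "compile and run, no Python stack required" workflow. */
#include <stdio.h>

int main(void) {
    const float xs[] = {1.0f, 2.0f, 3.0f, 4.0f};
    const float ys[] = {3.0f, 5.0f, 7.0f, 9.0f};   /* target: y = 2x + 1 */
    const int n = 4;
    float w = 0.0f, b = 0.0f, lr = 0.01f;

    for (int step = 0; step < 5000; step++) {
        float dw = 0.0f, db = 0.0f;
        for (int i = 0; i < n; i++) {
            float err = (w * xs[i] + b) - ys[i];   /* forward pass and loss gradient */
            dw += 2.0f * err * xs[i] / n;          /* backward pass, written by hand */
            db += 2.0f * err / n;
        }
        w -= lr * dw;                              /* parameter update (SGD) */
        b -= lr * db;
    }
    printf("learned w=%.3f b=%.3f (expected w=2, b=1)\n", w, b);
    return 0;
}
```

Any C compiler builds and runs it in one step (for example, gcc toy_train.c && ./a.out), which is the same property that lets llm.c go from a fresh cloud GPU node to training in minutes.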

The llm.c project started life as part of an educational video, but it soon turned into something Karpathy built from scratch after he got ‘stuck with some PyTorch things.’ It shows his passion for AI and the lengths he was willing to go to in order to finish the project. Still, he didn’t accomplish this alone; he had the support of several developers from across the globe.

AI training isn’t getting cheaper

Jowi Morales
Contributing Writer

Jowi Morales is a tech enthusiast with years of experience working in the industry. He has been writing for several tech publications since 2021, covering tech hardware and consumer electronics.