Elon Musk's Grok 3 is now available, beats ChatGPT in some benchmarks — LLM took 10x more compute to train versus Grok 2


Elon Musk has launched Grok 3, the latest version of xAI’s LLM, which was trained at the Colossus Supercluster in Memphis, Tennessee, using 100,000 Nvidia H100 GPUs. About a week earlier, he had said that its full release was imminent and claimed that it would outperform its rivals. Today he unveiled the model in a live stream on X (formerly Twitter) that showcased impressive benchmark results.

Early Grok-3 benchmarks show it dominating the field. pic.twitter.com/KXubPhaA5x February 18, 2025

Musk began the presentation by saying “The mission of xAI and Grok is to understand the universe,” and explaining that he wants to answer questions like, “What’s going on? Where are the aliens? What is the meaning of life? How does the universe end? How did it start?” He added, “Of course, that’s to be a maximally truth-seeking AI even if that truth is sometimes at odds with what is politically correct.”

https://t.co/hEfQ31gANQ February 18, 2025

After speaking about his goals for AI, Musk proclaimed that Grok 3 is an order of magnitude more capable than Grok 2, and that it was trained in a very short period. This was likely possible because of the massive number of GPUs xAI used for parallelized training. The cluster itself took just 19 days to set up, a record time given that Nvidia CEO Jensen Huang said such a build usually takes four years.

Grok 3 isn’t just a single LLM, though. It’s a family of several models, with Grok 3 and Grok 3 mini launching first. xAI also showed off Grok 3 Reasoning and Grok 3 mini Reasoning, which are similar to OpenAI’s o3-mini and DeepSeek’s R1 models and solve problems through a step-by-step logical process.


Benchmarks shown by the xAI team reveal the Grok-3 and Grok-3 mini models outperforming their competition, including Gemini-2 Pro, DeepSeek-V3, Claude 3.5 Sonnet, and GPT-4o, in several tests, including Math (AIME), Science (GPQA), and Coding (LCB). The reasoning models, which are accessible via the Grok app, also outperform the competition on the same benchmarks. The Grok app will also get a new feature called DeepSearch, which scours the internet in response to a question and distills the information it finds into a single answer.

Some outside experts were given access to Grok 3 in advance and were able to test these claims. For example, former Tesla Director of AI and OpenAI co-founder Andrej Karpathy shared his test results on X, saying that Grok 3 + Thinking feels similar to OpenAI’s o1-pro model while being a bit better than DeepSeek-R1 and Gemini 2.0 Flash Thinking. This is quite a feat, especially since OpenAI and Google have had a massive head start over xAI.

I was given early access to Grok 3 earlier today, making me I think one of the first few who could run a quick vibe check. Thinking ✅ First, Grok 3 clearly has an around state of the art thinking model ("Think" button) and did great out of the box on my Settler's of Catan… pic.twitter.com/qIrUAN1IfD February 18, 2025

Grok 3 will be available to X Premium+ subscribers first. However, those who want to access more advanced features will need to sign up for SuperGrok, which is rumored to cost around $30 a month or $300 annually.


Jowi Morales is a tech enthusiast with years of experience working in the industry. He has been writing for several tech publications since 2021, with a particular interest in tech hardware and consumer electronics.

6 comments from the forums

Crazyy8

Ew, politics. Anyway, I wonder what people will use Grok3 for. The only times I've used AI in recent days was translation(it seemed pretty good compared to Google translate). People seem to promote specifically Grok's lack of bias(IE not being hypocritical in (ew)politics and not being censored like the new Chinese AI)which might be useful. I've been burned out from the late '23 and early '24 AI being everywhere and being on everything. Yall know any good uses for AI, or uses that are improved with Grok3?
Gururu

I love how Deepseek and Grok mini have the same numbers
Jabberwocky79

""and explaining that he wants to answer questions like, “What’s going on? Where are the aliens? What is the meaning of life? How does the universe end? How did it start?” He added, “Of course, that’s to be a maximally truth-seeking AI even if that truth is sometimes at odds with what is politically correct.”"
I cannot even begin to quantify the levels of irony in expecting truth regarding these matters from artificial intelligence.
jg.millirem

Crazyy8 said:
Ew, politics. Anyway, I wonder what people will use Grok3 for. The only times I've used AI in recent days was translation(it seemed pretty good compared to Google translate). People seem to promote specifically Grok's lack of bias(IE not being hypocritical in (ew)politics and not being censored like the new Chinese AI)which might be useful. I've been burned out from the late '23 and early '24 AI being everywhere and being on everything. Yall know any good uses for AI, or uses that are improved with Grok3?
Ew, avoidance of the realities in which Musk and all the bros operate and cement power from.
Crazyy8

jg.millirem said:
Ew, avoidance of the realities in which Musk and all the bros operate and cement power from.
I'm avoiding talk about politics purposefully. 1. I don't want to ruin the discussion and 2. Politics(taken to the xtreme)aren't taken well in the forums.
Lamarr the Strelok

Eh.. what Russian stuff are you talking about? How exactly should they cover the story? And yes, everything was better in the "good 'ol days". As far as criticism goes, I'd rather have more honest reviews on AMD cards and quit being obsessed with the ray tracing BS. Send me an email when Nvidia doesn't charge hundreds for RT and it's the standard everyone says it will be.