Tesla Brags About In-House Supercomputer, Now With 7,360 A100 GPUs

From Nvidia A100 to Tesla Dojo

Tesla has boosted its in-house AI supercomputer with thousands of additional Nvidia A100 GPUs. The Tesla supercomputer had 5,760 A100 GPUs about a year ago, and that count has since risen to 7,360 A100 GPUs — that's an additional 1,600 GPUs, or about a 28% increase. 
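The growth figure above can be checked with a few lines of arithmetic. This is only a sketch using the two GPU counts quoted in the article; the variable names are illustrative.

```python
# GPU counts as reported in the article.
previous_gpus = 5_760
current_gpus = 7_360

# Absolute and relative growth of the cluster.
added_gpus = current_gpus - previous_gpus
percent_increase = added_gpus / previous_gpus * 100

print(f"Added GPUs: {added_gpus:,}")
print(f"Increase: {percent_increase:.1f}%")
```

The unrounded result is just under 28%, which matches the "about 28%" quoted above.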

According to Tesla Engineering Manager Tim Zaman, this upgrade makes the firm's AI system a top-7 supercomputer worldwide by GPU count. 

An Nvidia A100 GPU is a powerful Ampere architecture solution aimed at data centers. Yes, it uses the same GPU architecture as GeForce RTX 30 series GPUs, which are some of the best graphics cards currently available. However, the A100 has no close consumer equivalent: it comes with 80GB of HBM2e memory on board, offers up to 2 TB/s of memory bandwidth, and requires up to 400W of power. The architecture of the A100 has also been tweaked for accelerating tasks common in AI, data analytics, and high-performance computing (HPC) applications.

The first system Nvidia showed wielding the A100 was the Nvidia DGX A100, which packed in eight A100 GPUs linked via six NVSwitches with 4.8 TBps of bi-directional bandwidth for up to 10 PetaOPS of INT8 performance, 5 PFLOPS of FP16, 2.5 PFLOPS of TF32, and 156 TFLOPS of FP64 in a single node.
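To put the node-level figures in per-GPU terms, the sketch below divides the quoted DGX A100 peaks evenly across its eight GPUs. It assumes each GPU contributes an equal share, which is a simplification, and it covers only the INT8, FP16, and FP64 figures; the dictionary keys are illustrative labels.

```python
# Peak DGX A100 node figures quoted in the article, converted to tera-units
# (10 PetaOPS = 10,000 TOPS, 5 PFLOPS = 5,000 TFLOPS).
GPUS_PER_NODE = 8
node_peak = {
    "INT8 (TOPS)": 10_000.0,
    "FP16 (TFLOPS)": 5_000.0,
    "FP64 (TFLOPS)": 156.0,
}

# Assumption: each of the eight GPUs contributes an equal share of the peak.
per_gpu = {name: value / GPUS_PER_NODE for name, value in node_peak.items()}

for name, value in per_gpu.items():
    print(f"{name}: {value:g} per GPU")
```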

That was eight A100 GPUs — Tesla's AI supercomputer now has 7,360 of these. Tesla hasn't publicly benchmarked its AI supercomputer, but the similarly-equipped GPU-based NERSC Perlmutter, which has 6,144 Nvidia A100 GPUs, achieves 70.87 Linpack petaflops. Using this and data from other A100 GPU supercomputers as performance reference points, HPC Wire estimates the Tesla AI supercomputer is capable of achieving about 100 Linpack petaflops. 
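For a rough sense of scale, the sketch below extrapolates from Perlmutter's Linpack result by GPU count alone. This is a naive linear-scaling assumption, not HPC Wire's method, and the function name is hypothetical. Real Linpack results depend on interconnect, CPUs, and tuning, so treat the output as a ballpark only.

```python
def linear_linpack_estimate(ref_pflops: float, ref_gpus: int, target_gpus: int) -> float:
    """Scale a reference Linpack score by GPU count (naive linear assumption)."""
    per_gpu_pflops = ref_pflops / ref_gpus
    return per_gpu_pflops * target_gpus


# Reference point quoted in the article: NERSC Perlmutter.
perlmutter_pflops = 70.87
perlmutter_gpus = 6_144

# Tesla's reported GPU count.
tesla_gpus = 7_360

estimate = linear_linpack_estimate(perlmutter_pflops, perlmutter_gpus, tesla_gpus)
print(f"Naive linear estimate: {estimate:.1f} Linpack petaflops")
```

Scaling from Perlmutter alone gives roughly 85 petaflops, somewhat below the roughly 100 petaflops HPC Wire arrived at using additional A100 systems as reference points.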

Tesla doesn’t intend to continue down the Nvidia GPU architecture path for its in-house AI supercomputers long-term. This top-7 machine by GPU count is merely a precursor to the upcoming Dojo supercomputer, which was first announced by Elon Musk back in 2020. A year ago we got a look at the Tesla D1 Dojo chip, which is designed to supplant Nvidia's GPUs for “maximum performance, throughput and bandwidth at every granularity.”

The Tesla Dojo D1 is a custom ASIC (application-specific integrated circuit) designed for AI training, and it is one of the first ASICs in this field. Current D1 test chips are manufactured on TSMC N7 and pack in about 50 billion transistors.

More information about the Dojo D1 chip, and the Dojo system, might be revealed at next week's Hot Chips Symposium: three Tesla presentations are scheduled for next Tuesday, addressing Dojo D1 chip architecture, Dojo and ML training, and enabling AI through system integration.

Mark Tyson
News Editor

Mark Tyson is a news editor at Tom's Hardware. He enjoys covering the full breadth of PC tech, from business and semiconductor design to products approaching the edge of reason.

  • Mandark
    Nothing like vertical integration. ALWAYS make your OWN stuff to control your destiny. Don’t believe the fools that say it’s cheaper to outsource it’s not. Outsource companies need to make a profit too. And when you outsource you lose the ability to innovate

    I love to see Tesla vertically integrate. It’s one of the main reasons for their success
  • Wisecracker
    Takes a bunch of HPC compute from one million China vehicles to figure out all of those Autopilot crashes . . .
  • cirdecus
    Mandark said:
    Nothing like vertical integration. ALWAYS make your OWN stuff to control your destiny. Don’t believe the fools that say it’s cheaper to outsource it’s not. Outsource companies need to make a profit too. And when you outsource you lose the ability to innovate

    I love to see Tesla vertically integrate. It’s one of the main reasons for their success


    Completely agree. It does, however, tend to condense power which could mean less consumer choice, but vertical integration is where it's at. I never thought I'd see Amazon buying their own fleet of transport vehicles, planes and ships lol.
  • jtenorj
    Noticed a mistake in your article. You state that Tesla's D1 chip has 50 million transistors when Telsa's slide for the chip clearly shows 50 billion. That's a difference of 3 orders of magnitude. 50 billion is also much more in line with the chip's size in mm squared as well as its 400w power draw on a modern fairly compact process node.
  • bit_user
    Which is the ASIC Jim Keller designed at Tesla, after he left AMD and before he went to Intel?

    Edit: I guess it's Hardware 3 ?
  • bit_user
    Mandark said:
    Nothing like vertical integration. ALWAYS make your OWN stuff to control your destiny. Don’t believe the fools that say it’s cheaper to outsource it’s not.
    This only works if you're big enough. And even some of the biggest companies (Amazon, Google, Microsoft) merely licensed IP from ARM to make their CPUs. In fact, even Tesla licensed ARM A72 cores for Hardware 3, according to what I'm reading.

    Mandark said:
    Outsource companies need to make a profit too. And when you outsource you lose the ability to innovate
    Doing everything yourself is sometimes referred to as "Not-Invented-Here (NIH) Syndrome". A lot of companies have gone bankrupt, that way.

    You have to be cognizant of where you're adding value and how you plan to compete in the marketplace. The most successful strategy is not always by vertical integration.

    Mandark said:
    I love to see Tesla vertically integrate. It’s one of the main reasons for their success
    SpaceX is a counter-example, from what I've heard. They were able to undercut everyone else on price & time-to-market by sourcing quite a lot of commodity parts.

    I think it makes sense for Tesla to DIY if they simply couldn't find anything on the market that met their needs. Otherwise, it's an incredibly expensive way to gain like ~6 months lead on your competition.
  • Mandark
    Yeah I see you’ve been fooled by all those MBA and financial types. out there out there. Obviously you have to be big enough but there are no other drawbacks

    Making your own electronics even under license is still better than paying somebody else to do it

    The minute you rely on outsource companies you lose the ability to innovate and you’re at their mercy. They can raise prices for any reason or no reason at all and you’re screwed

    And by doing this themselves they’re not relying on TSMC who is already overburdened. Do you see what I’m getting at with this? Is it really that hard to understand? Now they’re not constrained by other companies. Do you still not get it?

    I completely dis agree with you on every point you’ve made.

    And please provide links for what you say about SpaceX because I don’t believe they’re outsourcing all that much. Basically you fell for a LOAD OF HOGWASH and obviously still believe it. Sad.

    Tesla Obviously knows the market they’re competing in, and they are the most successful EV company out there and will continue to be if they continue to operate the way they are. They are the class leaders in the Ev market.

    Obviously a smaller company isn’t going to do massive vertical integration until it becomes more successful of course. Remember this thread in my comment was about Tesla and no other company so please stop with the generalizations of why I’m wrong. I’m talking about Tesla and there’s no reason they shouldn’t vertically integrate. They are making tons of money, they are a money making machine and they are huge and they plan to be gigantic and they will be

    And do you want to know why they are doing this? When they started they had to outsource basically everything that they couldn’t do themselves and nobody wanted to deal with them and basically force them into the situation where they are today where they have to do everything themselves and now they don’t need anybody else so good for them

    Elon said that they will mine materials if necessary if they can’t get them for their batteries and such.
    they are making their own batteries the 4680s but they need materials for that and they have agreements with CATL for the manufacture of batteries and to acquire the materials necessary that currently aren’t being mined in the US.

    Through vertical integration Tesla is achieving their goals because now they can keep the lion share of the money for every car sold and the data is out there just go look. They make more money on every car sold that’s just about everybody there at the top of the heap

    You can bet your arse that vertical integration makes complete sense for auto builders. Even Toyota does it. They own or mostly own their suppliers
  • bit_user
    Mandark said:
    Obviously you have to be big enough but there are no other drawbacks
    The other drawbacks are that you lack the competency, the talent, or the patience and investment to develop it. And then the whole thing turns into an expensive debacle.

    For instance, take Qualcomm and Samsung. They both used to design their own CPU cores, but ended up killing those programs in favor of licensing ARM's cores. When you're in the SoC business and you have to license somebody else's CPU cores, that's not vertical integration!

    Mandark said:
    The minute you rely on outsource companies you lose the ability to innovate and you’re at their mercy. They can raise prices for any reason or no reason at all and you’re screwed
    That's really a statement about lack of competition among your suppliers, rather than outsourcing in general. Sure, in a situation where you have little/no leverage over your suppliers, then DIY becomes more attractive.

    Mandark said:
    And by doing this themselves they’re not relying on TSMC who is already overburdened.
    Exactly who are you saying isn't relying on TSMC? Do you think Tesla fabs its own chips, too??

    Mandark said:
    Do you see what I’m getting at with this? Is it really that hard to understand? Now they’re not constrained by other companies. Do you still not get it?
    That last statement causes me to wonder how much you really know about this stuff. I mean, probably 90% of the components in a Tesla car are sourced from other companies. Do you think Tesla makes its own airbags? What about all the thousands of sensors in its cars? How about all the wires or the foam in their seats?

    And true & complete vertical integration would have them mining their own iron, producing their own vulcanized rubber, and making their own tires. And why not? ...because they're strategic, not dumb. They know where they're trying to add value or competitive advantage.

    Mandark said:
    I completely dis agree with you on every point you’ve made.
    Cool. When you start a successful business, be sure to let us all know.

    Mandark said:
    And please provide links for what you say about SpaceX because I don’t believe they’re outsourcing all that much. Basically you fell for a LOAD OF HOGWASH and obviously still believe it. Sad.

    https://www.ft.com/content/4961bd6f-bb4b-4ffd-8de8-9b65aebccfd3
    "The second part of the Silicon Valley formula involved a disruptive economic model. The secret of SpaceX’s success has been its command of fixed-price contracting, a new technique that a cash-strapped Nasa has used to stretch its budget further. Space contractors brought up in the previous cost-plus world have struggled to adapt. A Nasa study credited SpaceX with leaner staffing, fewer levels of management and a new supply chain that didn’t depend on the automatic mark-ups contractors applied in the past. The extent of this economic disruption is likely to reverberate through the space sector for years. Nasa calculated that SpaceX’s Falcon 9 rocket took under $400m to develop, less than a tenth of what it would have cost to create the rocket under its traditional contracting method."

    Mandark said:
    Tesla Obviously knows the market they’re competing in, and they are the most successful EV company out there
    You can't overlook their first-mover advantage.

    Mandark said:
    I’m talking about Tesla and there’s no reason they shouldn’t vertically integrate. They are making tons of money, they are a money making machine and they are huge and they plan to be gigantic and they will be
    Another thing you need to understand about businesses is profit margins. If doing something in-house can save enough money, then sure. But, it has to be done well, and in the highly-regulated automotive world, that can be a high bar - which means requiring lots of investment to get it right. That's why they don't develop tires, airbags, or even the chips powering the entertainment consoles in their cars, which are made by AMD.
  • For things like seats that are usually outsource Tesla makes their own. Now of course we understand that they need materials to make the seats and that has to be sourced of course but I’m just saying the more you can do in-house the higher the profit per car can be because you’re not having to pay somebody else for producing it for you.

    In the case of seats, it makes sense because that is a super expensive part that the big OEM manufacturers usually outsource but seats are hugely expensive. From what I have seen in reviews and what I have read their seats are actually really good and improving

    Also once they scale up there battery production they won’t need CATL to be producing batteries for them. As long as they can get the materials they need to make the batteries they should be all set. Musk has stated that he will get into the mining business if he has to

    I also agree that if done in the house it has to be done well.