Tachyum's Monster 128 Core 5.7GHz 'Universal Processor' Does Everything
A processor that can do HPC, AI, and Machine Learning all on one chip!
Tachyum has created one of the most powerful processors in the world: The Prodigy T16128 Universal Processor. The Prodigy T16128 has 128 64-bit CPU cores operating at up to 5.7GHz, 16 DDR5 memory controllers, and 64 PCIe 5.0 lanes, and can handle general-purpose computing, high-performance computing (HPC), and AI workloads — all on a single chip.
Tachyum calls Prodigy the world's first "universal processor," and says it was designed from the ground up to be a multi-purpose CPU capable of running many of the world's most intensive computing applications. According to Tachyum, Prodigy not only handles all of these different tasks on a single chip, it does so with a power budget that's 10 times lower than that of traditional hardware, and at one-third the cost.
Tachyum boldly claims the Prodigy supercomputer chip offers four times the performance of Intel's fastest Xeon on the market and triple the raw performance of Nvidia's H100 in high-performance computing applications. All while being 10 times more power efficient.
To create such impressive performance within a single core architecture, Tachyum says it built Prodigy with matrix and vector processing capabilities from the ground up — rather than making them an afterthought. Prodigy supports a range of data types, including FP64, FP32, TF32, BF16, Int8, FP8, and TAI, all from the individual CPU cores themselves.
The Prodigy processors could be game-changers when they arrive in 2023. The latest server hardware from AMD, Intel, and Nvidia relies on separate pieces of hardware, even within a single CPU or GPU, to perform these different workloads. One example is Nvidia's RTX series GPUs, which use dedicated Tensor cores for machine learning and AI work and dedicated RT cores for ray tracing applications.
Prodigy, on the other hand, will be able to run ray tracing and AI applications on its individual CPU cores, and won't need to divert data to a separate dedicated block inside the processor.
Running all of these different HPC workloads inside a single chip could drastically change the server landscape: Companies would be able to pack many more chips into a server farm with lower power requirements and less cooling.
The Prodigy T16128 runs on a 5nm process technology of unknown origin, and operates within a very small (for the power it provides) 64mm x 84mm FCLGA package. Tachyum says the chip is capable of performing 12 AI PetaFLOPS and 90 TeraFLOPS when it comes to HPC workloads. The Prodigy chip can also run binaries for x86, ARM, RISC-V, and ISA. For some perspective, an Nvidia DGX A100 system with eight A100 GPUs is rated for 5 AI PetaFLOPS.
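For a sense of scale, the package dimensions and throughput claims quoted above can be turned into a couple of derived numbers. The sketch below is only arithmetic on figures from this article; the variable names are illustrative, and the area it computes is the footprint of the FCLGA package, not the area of the silicon die, which the article does not give.

```python
# Figures quoted in the article (Tachyum's claims, not measurements).
PACKAGE_WIDTH_MM = 64
PACKAGE_HEIGHT_MM = 84
AI_PETAFLOPS = 12
HPC_TERAFLOPS = 90

# Footprint of the FCLGA package. This is NOT the die area:
# the die has to fit inside the package.
package_area_mm2 = PACKAGE_WIDTH_MM * PACKAGE_HEIGHT_MM

# Ratio of claimed AI throughput to claimed HPC throughput,
# after converting PetaFLOPS to TeraFLOPS.
ai_to_hpc_ratio = (AI_PETAFLOPS * 1000) / HPC_TERAFLOPS

print(f"Package footprint: {package_area_mm2} mm^2")
print(f"Claimed AI-to-HPC throughput ratio: {ai_to_hpc_ratio:.1f}x")
```

The large gap between the two throughput figures is typical of claims that compare low-precision AI math against higher-precision HPC math, so the two numbers should not be read as directly comparable.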
Each core has two 1024-bit vector units, supports 4096-bit matrix operations, and can issue four out-of-order instructions per clock. Virtualization and Advanced RAS are also supported. The chip also includes over 128MB of L2+L3 cache with error correction capabilities. To feed all of its cores, the chip comes with 16 DDR5 memory controllers rated for up to 7200MT/s, with a maximum capacity of 8TB per socket.
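The memory figures above imply a theoretical peak bandwidth that is easy to estimate. The following is a rough sketch, not a Tachyum specification: it assumes each of the 16 controllers drives a 64-bit (8-byte) data path, which the article does not state, and it ignores real-world overheads such as refresh and bus turnaround.

```python
# Figures quoted in the article.
MEMORY_CONTROLLERS = 16
TRANSFER_RATE_MT_S = 7200  # mega-transfers per second, per controller
CORES = 128

# ASSUMPTION (not from the article): 64-bit data path per controller,
# i.e. 8 bytes moved per transfer.
BYTES_PER_TRANSFER = 8

# MT/s * bytes per transfer = MB/s; divide by 1000 for GB/s.
per_controller_gb_s = TRANSFER_RATE_MT_S * BYTES_PER_TRANSFER / 1000
total_gb_s = per_controller_gb_s * MEMORY_CONTROLLERS
per_core_gb_s = total_gb_s / CORES

print(f"Per controller: {per_controller_gb_s:.1f} GB/s")
print(f"Whole socket:   {total_gb_s:.1f} GB/s")
print(f"Per core:       {per_core_gb_s:.1f} GB/s")
```

Under that assumption the socket tops out at a little over 900 GB/s in theory. Sustained bandwidth in practice would be lower, and a different channel width would scale the result proportionally.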
The T16128 is the flagship model in Tachyum's Prodigy lineup, with the 64-core T864 and the 32-core T832 filling the mid-range and entry-level slots, respectively, in the product stack. Production starts in 2023, so we should see actual benchmarks of these chips sometime next year.
Aaron Klotz is a contributing writer for Tom’s Hardware, covering news related to computer hardware such as CPUs and graphics cards.
thisisaname Was looking good until I read at the end "Production starts in 2023, so we should see actual benchmarks of these chips sometime next year." So a dream CPU in more than one meaning of the word.
atmapuri AVX512 looks shy in comparison to 2x 1024-bit vector and 1x 4096-bit matrix registers. Yes, on paper one core is 4x faster than Intel for vector math. 16 x 50GB/s = 800GB/s memory bandwidth, if you can afford to run a minimum of 32 threads to make use of that.
mdd1963 If this CPU was x86-64 compatible, and one could install Windows Server 2022 Datacenter, it should cost only $36K in Windows licensing fees, at 16 cores per license, ... x 8! SQL Server Enterprise would run just $879,872!
I'd assume all these cores would need to be spread out into an area the size of a piece of bread, or two.... at a minimum...
jasonf2
mdd1963 said: If this CPU was x86-64 compatible, and one could install Windows Server 2022 Datacenter, it should cost only $36K in Windows licensing fees, at 16 cores per license, ... x 8! SQL Server Enterprise would run just $879,872! I'd assume all these cores would need to be spread out into an area the size of a piece of bread, or two.... at a minimum...
I am pretty sure looking at the arc on this thing it is going to require a custom OS. By the time that port is done that $900,000 figure you just threw will look like a deal.
Friesiansam If the reality lives up to the bold claims, it will be a highly impressive chip but, if something seems too good to be true, it usually is.
farnell121 "The Prodigy chip can also run binaries for x86, ARM, RISC-V, and ISA".
I'm assuming you meant POWER ISA, instead of just Instruction Set Architecture, as that would be redundant.
Historical Fidelity I’m calling bs on this. Factors more powerful, factors more energy efficient, yet the silicon is 5400mm2 in area. Doesn’t add up, but then again you gotta try the snake oil before you can judge haha
hotaru.hino "Tachyum boldly claims the Prodigy supercomputer chip offers four times the performance of Intel's fastest Xeon on the market and triple the raw performance of Nvidia's H100 in high-performance computing applications. All while being 10 times more power efficient."
Where have I heard this before and it ended up falling flat on its face?