6,000 RISC-V Cores on a Xilinx FPGA Break the CoreScore World Record

The high-end Xilinx Virtex UltraScale+ VCU128 FPGA used to achieve the CoreScore record. (Image credit: Xilinx)

A new world record for the densest arrangement of RISC-V cores, as measured by the CoreScore benchmark, has been set by fitting 6,000 SERV RISC-V cores onto one of Xilinx's most powerful FPGA boards, the VCU128. CoreScore measures how many SERV cores can be deployed on a single device, and the Virtex UltraScale+ FPGA on the VCU128 has room for 6,000 of them in its reconfigurable logic. The previous record holder fit 5,087 cores on Xilinx's VCU118.
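To put the two figures side by side, here is a quick back-of-the-envelope calculation. It uses only the core counts quoted above; the function name `record_gain` is made up for illustration.

```python
def record_gain(new_cores, old_cores):
    """Return the absolute and percentage increase in core count."""
    extra = new_cores - old_cores
    percent = 100.0 * extra / old_cores
    return extra, percent


# Core counts as reported for the VCU128 and VCU118 entries.
extra, percent = record_gain(6000, 5087)
print(f"{extra} more cores, a {percent:.1f}% increase")
```

By this arithmetic, the new entry fits 913 more cores than the previous one, an increase of a little under 18%.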

FPGAs (field-programmable gate arrays) are unusual pieces of hardware because they have very few fixed-function elements. Instead, they are built to be reprogrammed after manufacturing (in the field), with configurable logic blocks wired together to implement whatever circuit the designer describes. That makes FPGAs the closest thing we have to adaptive processing hardware, able to be reconfigured to suit the workload at hand (this is a simplified explanation).

"What do you do when you have the award-winning SERV, the world's smallest RISC-V CPU?" asks Olof Kindgren, designer of both the SERV core and the CoreScore benchmark. "Well, among other things, we, of course, want to see how many SERV cores you can fit into various devices. This is what CoreScore is for. And on top of that list of currently 30 boards, we can now find Sylvain Lefebvre and his Xilinx VCU128 board that fits 6000 SERV cores."

These cores aren't what you'd find in the best CPUs for gaming from Intel or AMD; they are stripped-down, barebones bit-serial units that include as few extraneous functions as possible. That approach minimizes the area each core occupies on the chip. The design gets its performance from running many cores in parallel, not from the processing grunt of any single core.
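For readers unfamiliar with the term, "bit-serial" means the core works through a value one bit at a time instead of handling all 32 bits at once. The sketch below illustrates the general idea with a software model of a one-bit adder plus a carry register. It is a conceptual illustration only, not SERV's actual design, and the function name `bit_serial_add` is made up for this example.

```python
def bit_serial_add(a, b, width=32):
    """Add two unsigned integers one bit per step, using a 1-bit full adder."""
    carry = 0
    result = 0
    for i in range(width):  # each iteration stands in for one clock cycle
        bit_a = (a >> i) & 1
        bit_b = (b >> i) & 1
        # Full-adder logic: sum bit and carry-out for this bit position.
        sum_bit = bit_a ^ bit_b ^ carry
        carry = (bit_a & bit_b) | (carry & (bit_a ^ bit_b))
        result |= sum_bit << i
    return result  # the final carry is dropped, so the sum wraps modulo 2**width


print(bit_serial_add(5, 7))
```

The trade-off is visible in the loop: the adder itself is tiny, but a 32-bit addition takes 32 steps. That is the kind of exchange of speed for size that lets thousands of such cores share one FPGA.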

"We are nearing the max," Lefebvre says of his 6,000-core record, "with 98.5% LUTs [Lookup Tables] (and 100% BRAM [Block RAM]) of the VCU128 FPGA utilized. It's been great fun working with Olof Kindgren on this, and it was a perfect intro to our Xilinx VCU128 monster."