Nvidia co-organizes a contest to help build AI dataset to accelerate GPU design


Despite their impressive capabilities in generating content, large language models (LLMs) are not so great at designing hardware. Believing this weakness is due to a lack of hardware design data to train the models, Nvidia, Georgia Institute of Technology, and others have organized a contest to help create the needed open-source public dataset.

Nvidia’s director of design automation research, Haoxing (Mark) Ren, recently announced the collaboration on X (formerly known as Twitter). Ren said the shortage of high-quality data specific to hardware design was “one of the bottlenecks for LLM-Assisted Hardware Design.” To address these shortcomings, Nvidia and others organized the ICCAD Contest on LLM-Assisted Hardware Code Generation.

Current efforts to design GPUs and other hardware with LLM assistance require “extensive human interaction.” The designs created by the LLMs are often either non-synthesizable or non-functional, or they are too simplistic or impractical. Researchers believe this is because of insufficient exposure to high-quality hardware design data during pretraining.

The lack of high-quality data is commonly regarded as one of the bottlenecks for LLM-Assisted Hardware Design. To develop an open-source, large-scale, high-quality dataset, we are co-organizing the LLM4HWDesign contest @ @ICCAD with EIC Lab @GaTech. https://t.co/qvwLZUlOVh (July 8, 2024)

After noting one Nvidia project’s success using an in-house large-scale Verilog code dataset, the organizers decided to enrich the current Verilog code dataset. The contest aims to build a large-scale, high-quality hardware design code dataset that will eventually be open-source.

The LLM4HWDesign contest runs in two phases. The first, data sample collection, ends August 10, 2024. In Phase I, contest participants will begin with the existing Verilog dataset and expand it. The second phase, which runs from August 20 until October 1, will improve and refine the data collected in Phase I.

In Phase II, participants will use data filtering to remove low-quality data and develop techniques to automatically generate more accurate descriptions for the collected data samples. Finally, they’ll create labeling strategies to help the learning process for LLMs.
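To give a rough sense of what "data filtering" can mean for a Verilog code dataset, here is a minimal Python sketch. It is not the contest's method: the function names (`strip_comments`, `looks_usable`, `filter_samples`), the three-line minimum, and the balanced `module`/`endmodule` check are all illustrative assumptions. It simply drops samples that are obviously incomplete and removes duplicates that differ only in comments or whitespace.

```python
import hashlib
import re

# Hypothetical heuristics for illustration only; not the contest's actual rules.
MODULE_RE = re.compile(r"\bmodule\s+\w+")
ENDMODULE_RE = re.compile(r"\bendmodule\b")


def strip_comments(code: str) -> str:
    """Remove /* ... */ block comments and // line comments (naive, ignores strings)."""
    code = re.sub(r"/\*.*?\*/", "", code, flags=re.DOTALL)
    return re.sub(r"//[^\n]*", "", code)


def looks_usable(code: str, min_lines: int = 3) -> bool:
    """Cheap sanity check: enough non-blank lines and balanced module/endmodule."""
    body = strip_comments(code)
    lines = [ln for ln in body.splitlines() if ln.strip()]
    n_open = len(MODULE_RE.findall(body))
    n_close = len(ENDMODULE_RE.findall(body))
    return len(lines) >= min_lines and n_open > 0 and n_open == n_close


def filter_samples(samples):
    """Keep usable samples, dropping duplicates that differ only in comments/whitespace."""
    seen = set()
    kept = []
    for code in samples:
        if not looks_usable(code):
            continue
        normalized = " ".join(strip_comments(code).split())
        key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if key in seen:
            continue
        seen.add(key)
        kept.append(code)
    return kept
```

A real pipeline would likely go further, for example by running each sample through a synthesis or simulation tool, but text-level checks like these are a cheap first pass.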

The winners of the LLM4HWDesign contest will be announced during IEEE/ACM’s International Conference on Computer-Aided Design at the end of October 2024. Nvidia and the National Science Foundation are sponsoring the contest, which includes some valuable awards on top of the recognition afforded by the contest:

1st Place Award: 1 x Nvidia RTX 4080 GPU plus $2,000 per team.
2nd Place Award: 1 x Nvidia RTX 4080 GPU plus $1,000 per team.
3rd Place Award: 1 x Nvidia RTX 4070 GPU plus $500 per team.

Jeff Butts has been covering tech news for more than a decade, and his IT experience predates the internet. Yes, he remembers when 9600 baud was “fast.” He especially enjoys covering DIY and Maker topics, along with anything on the bleeding edge of technology.

Comments from the forums

hotaru251

that "prize" is a joke considering what benefit Nvidia gets from their work.

they get paid a fraction of a fraction of the value it is worth long term.

tennis2

$7,000 in total prizes?? (not to mention the GPUs don't cost Nvidia MSRP, so less than that)
I know it's not going to be a turnkey solution handed to Nvidia, but there will ostensibly be many (hundreds/thousands) of submissions to something like this. All of which will undoubtedly be fed into Nvidia's AI training module. So Nvidia stands to gain millions/billions, for what amounts to less than 1 employee's monthly salary.

Highway robbery this is.