AMD unveils its first small language model, AMD-135M — AI performance enhanced by speculative decoding
General and coding-optimized models released.
As AMD flexes its muscles in the AI game, it is introducing not only new hardware but also betting on software, trying to reach market segments not already dominated by Nvidia.
To that end, AMD has unveiled its first small language model, AMD-135M, which belongs to the Llama family and is aimed at private business deployments. It is unclear whether the new model has anything to do with the company's recent acquisition of Silo AI (the deal still has to be finalized and cleared by various authorities, so probably not), but this is a clear step toward addressing the needs of specific customers with a model pre-trained by AMD, using AMD hardware for inference.
The main reason AMD's models are fast is that they use so-called speculative decoding. Speculative decoding introduces a smaller 'draft model' that cheaply generates several candidate tokens, which are then passed to a larger, more accurate 'target model' that verifies or corrects them in a single forward pass. On the one hand, this approach allows multiple tokens to be produced per target-model pass; on the other hand, it comes at a cost in power consumption due to the increased data transactions.
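To make the verify-or-correct loop concrete, here is a minimal greedy sketch of the idea, not AMD's implementation: the two "models" are stand-in functions that map a token sequence to its next token, and the helper `greedy_speculative_step` is a hypothetical name introduced for illustration.

```python
# Toy sketch of greedy speculative decoding. Real systems use neural LMs;
# here draft_next/target_next are simple stand-in functions (seq -> next token).

def greedy_speculative_step(draft_next, target_next, tokens, k=4):
    """One speculative-decoding step: draft k tokens, then verify them."""
    # 1) The cheap draft model proposes k candidate tokens autoregressively.
    draft = list(tokens)
    for _ in range(k):
        draft.append(draft_next(tuple(draft)))
    candidates = draft[len(tokens):]

    # 2) The target model checks the candidates (conceptually one forward pass).
    out = list(tokens)
    for cand in candidates:
        expected = target_next(tuple(out))
        if cand == expected:
            out.append(cand)       # accepted: draft agreed with the target
        else:
            out.append(expected)   # rejected: take the target's token and stop
            break
    else:
        # All k candidates accepted; the target's pass yields one extra token.
        out.append(target_next(tuple(out)))
    return out

# Stand-in models: both simply count upward, so they always agree.
target = lambda seq: (seq[-1] + 1) % 10
draft_good = lambda seq: (seq[-1] + 1) % 10
draft_bad = lambda seq: 7  # a bad draft that always proposes 7

print(greedy_speculative_step(draft_good, target, [0], k=4))  # [0, 1, 2, 3, 4, 5]
print(greedy_speculative_step(draft_bad, target, [0], k=4))   # [0, 1]
```

When the draft agrees with the target, one verification pass emits up to k+1 tokens instead of one; when it disagrees, progress falls back to a single corrected token, which is why a well-matched draft model is essential.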
AMD's new release comes in two versions: AMD-Llama-135M and AMD-Llama-135M-code, each designed to accelerate inference for specific tasks using speculative decoding, a logical approach for a small-language-model-based AI service. Unsurprisingly, both prevail in the performance tests conducted by AMD.
- The base model, AMD-Llama-135M, was trained from the ground up on 670 billion tokens of general data. This process took six days using four 8-way AMD Instinct MI250-based nodes (in AMD's nomenclature these are just 'four AMD MI250 nodes').
- In addition, AMD-Llama-135M-code was fine-tuned with an extra 20 billion tokens specifically focused on coding, completing this task in four days using the same hardware.
AMD believes that further optimizations can lead to even better performance. Yet, as the company shares benchmark numbers only for its previous-generation GPUs, we can only imagine what its current-generation (MI300X) and next-generation (MI325X) accelerators could do.
Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers and from modern process technologies and latest fab tools to high-tech industry trends.
hotaru251: I know it's the "AI" LLM fad for businesses (most customers don't give two hoots), but I really wish they'd have spent the time and money on something else, like improving the GPU drivers/features that more people actually want.
Pierce2623: Man… until somebody other than Nvidia successfully monetizes AI, I'd be rather skeptical of investing too much into it as a business.
m3city: Aaaand, can somebody tell me what it can be used for? I'm not joking, I tried to use ChatGPT for anything but fun and failed. What are the actual use cases for LLMs?
ThomasKinsley:
m3city said: "Aaaand, can somebody tell me what it can be used for? I'm not joking, I tried to use ChatGPT for anything but fun and failed. What are the actual use cases for LLMs?"
LLMs excel at questions asked in natural human language. For example, an LLM is better than a search engine when you're trying to remember an old show or game that you can describe but can't name. It's quite good at subjective tasks: it can spit out templates, such as professional documents and emails, quite well. It's also very good for bouncing ideas off of. For example, you can describe a problem, mention the various ways you've already tried to solve it, and then ask for further ideas on how to fix it.
It's also good for learning a language. Some might object and say the grammar skills aren't perfect (this is true), but conversing with the LLM in the language you're learning and receiving responses back in that language is extremely helpful at progressing (especially when you're at that awkward level where native speakers don't want to speak with you!). You can prompt it to formulate quizzes or ask it to provide you sentences at the level you're learning (A1, B2, etc.).
Many naturally get upset and say you shouldn't be relying on AI, and to that I agree. The sad reality, though, is that today's ChatGPT writes better than the average person online. That's an indictment of the culture. I would never treat ChatGPT as an authority. It's just a tool: how well you formulate the prompts determines how much you can extract from it.
m3city:
ThomasKinsley said: "LLMs excel at questions asked in natural human language. [...]"
That's actually a set of interesting ideas, thanks. I knew about the stuff you wrote in the first paragraph, but that example kind of opened my mind about such use cases. I would never have thought of your second idea (language learning), but it's great as well. So an LLM may not be useful for things that your serious decisions rely on, but it may get you somewhere when it's merely an assistant whose output you double-check occasionally.