
AMD Licenses ZRAM Gen2


In theory AMD could put a 15 Meg L3 on its k10 for no additional cost.

zram gen2 may be fast enough for L3....

i highly doubt we will see it anytime soon.... but i have been wrong before

My no. 1 question, leaving aside whether zram2 can or can't be fast enough for today's CPUs, is: will zram gen2 be able to offset Penryn's power consumption advantage if AMD uses it for L2 and L3 cache (or possibly for L3 only)?

I ask this without much worry about zram speed because in the server space, power consumption and core count may give better value than raw CPU speed in Hz.

More on zram gen2 from Edimag

Quote:

Z-RAM Gen2 stores significantly more charge in the memory bitcell. The additional charge provides an order-of-magnitude improvement in both cell margin (the difference between a "1" and a "0"), and in data retention time. This higher margin also provides much faster data read and write times, yet reduces power consumption significantly. As a result, Z-RAM Gen2 substantially broadens the range of applications able to take advantage of Z-RAM's density: these include high-performance applications requiring greater than 1GHz operation (when pipelined); and low-power applications requiring long battery life.

Z-RAM was already the densest memory technology in the world; Z-RAM Gen2, however, is now more than twice as fast and cuts memory read power by 75% and memory write power by an impressive 90%. It also exhibits extreme flexibility, since the technology can be 'tuned' for a very wide range of speed/power operating points, from ultra-low power to very high performance. Z-RAM Gen2 also exhibits an ultra-high density greater than 5Mbits per mm² at 65nm, and greater than 10Mbits per mm² at 45nm. This is effectively double the density of an eDRAM and up to six times the density of an SRAM. Other salient features include random array access greater than 400MHz, and very low active power consumption of under 10µW/MHz.

Quote:
http://www.cieonline.co.uk/cie2/articlen.asp?pid=1486&i...

Don't know if this was posted... But looks like AMD is actually testing ZRAM2 on its chips now


Sorta old news --- but the Gen2 is still being talked about in the 400 MHz range....

Frankly, knowing AMD's endeavor into embedded and consumer electronics, I can see them leveraging the SOI/Zram for some very application specific products off the bat --- but for L3 cache, I think this is still too slow....

But I could be wrong.

well, it comes down to this: is 15 megs of 400 MHz zram L3 cache faster than having to hop to main RAM once you pass the 3 meg limit on the K10?

i am positive AMD is testing this

15 megs of zram L3 cache would consume a lot less power than 3 megs of standard L3 cache

it's beautiful all around... the only hindrance is the clock speed of zram...

Jack is more of the expert..... i will talk to my boy BITMEOFF...he is also a cache-ologist... see what he thinks

What if they used ZRAM with dual or quad channel interface; this would allow 800-1600MHz and if they made the L3 16 or 32MB it would definitely be better than a core clocked 2M L3.

Quote:
What if they used ZRAM with dual or quad channel interface; this would allow 800-1600MHz and if they made the L3 16 or 32MB it would definitely be better than a core clocked 2M L3.


could be done i would think... i yield to jack

Quote:
http://www.cieonline.co.uk/cie2/articlen.asp?pid=1486&i...

Don't know if this was posted... But looks like AMD is actually testing ZRAM2 on its chips now


It's interesting, as IBM is going with eDRAM for its next chips. Maybe AMD is hedging its bets, as either one will do a great job. For now this would be good for cell phones and handhelds.

like m25 said... maybe they can quad rail it..... 400 MHz x 4 = 1.6 GHz, so 15 megs of 1.6 GHz L3


True, but they also licensed Gen 1 at 90nm but didn't use it so I think it will be for cell phones until it can run at core speed. But then I don't know if the K10 L3 is running at core speed or not.

Quote:

However, there is the latency piece --- even though it would run at the speed of the core, signal propagation, setting the bit, etc. would take more than one tick or tock... this is latency. The major portion of latency is the signal propagation to the bit cell --- this is why the general rule of thumb is the larger the cache, the larger the latency. The reason? The fastest you can reliably get data from the cache depends on the time it takes to get the last bit of data, the one physically located farthest from the core. So a 2 meg cache may only take up 30% of the die area, but a 4 meg cache may take up 50% of the die area. The larger 4 meg cache will have transistors and bits physically farther from the core --- hence the delay getting the signal from the bit to the core will be longer --- higher latency.

There is more to it though.
Latency has an impact on performance only for sparse memory accesses.
For access to adjacent memory locations, SRAM (and even DRAM) supports burst modes where you can get a pipelined access which yields a throughput of 1 data element per clock, after locating the first element of the burst.
Since cache access is always performed per block (or cache line), the burst mode can be applied all the time and it hides most of the negative effects of (high) latency.
For example, let's say that you want to transfer a block of 128 bytes from the L3 to the L2 cache, on a 128-bit bus (16 bytes), and that the L3 cache has a latency of 20 clocks.
So with SRAM and a pipeline burst mode, you get the first 16 bytes after 20 clocks, but you get the rest in chunks of 16 bytes per clock.
Overall, you'd transfer your whole cache block in (1x20 + 7x1) = 27 clocks.
If your cache is running at, say, 3GHz, that's 9ns (27 clocks at about 0.333ns each).
Now let's take our ZRAM running at 400MHz, and let's suppose that the latency is only 1 clock... still, to transfer our cache block, we'd need 8 clocks, and with a clock period of 2.5ns, that would require a whopping (8 x 2.5 =) 20ns, more than double what we get with high frequency SRAM.
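
The arithmetic in this example can be checked with a short Python sketch. The function name `burst_transfer_ns` and its parameters are made up for illustration; the inputs (128-byte block, 16-byte bus, 20-clock SRAM latency at 3GHz, 1-clock ZRAM latency at 400MHz) are the ones supposed in the post. It assumes the first bus-wide chunk arrives after the stated latency and every remaining chunk takes one more clock.

```python
def burst_transfer_ns(block_bytes, bus_bytes, latency_clocks, clock_hz):
    """Time in ns to move one cache block using a pipelined burst.

    Assumption: the first bus-wide chunk arrives after `latency_clocks`,
    and each following chunk arrives exactly one clock later.
    """
    chunks = block_bytes // bus_bytes          # 128 / 16 = 8 chunks
    clocks = latency_clocks + (chunks - 1)     # first chunk + 7 more
    return clocks * 1e9 / clock_hz

# Numbers taken from the example in the post
sram_ns = burst_transfer_ns(128, 16, latency_clocks=20, clock_hz=3e9)
zram_ns = burst_transfer_ns(128, 16, latency_clocks=1, clock_hz=400e6)

print(f"SRAM at 3GHz:   {sram_ns:.1f} ns")   # 27 clocks
print(f"ZRAM at 400MHz: {zram_ns:.1f} ns")   # 8 clocks
```

With these inputs the SRAM case takes 27 clocks and the ZRAM case takes 8 clocks, so the slower clock dominates even with an optimistic 1-clock latency.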

Theoretically, you could just use a wider cache interface, or organize it in multiple banks to do some kind of interleaving, so yes you could get a higher throughput even from Z-Ram.
But since i'm not a "semiconductor guru", i have no idea about the feasibility of such a design. :) 
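
To put rough numbers on the wider-interface or multi-bank idea, here is a minimal sketch. Everything in it is idealized: `interleaved_transfer_ns` is a hypothetical helper, the 1-clock latency is the supposition from the post, and it assumes perfect interleaving where `banks` banks each deliver one 16-byte chunk per 400MHz clock with no conflicts or extra muxing delay.

```python
import math

def interleaved_transfer_ns(block_bytes, bus_bytes, banks, latency_clocks, clock_hz):
    """Block transfer time in ns with ideal bank interleaving.

    Assumption: `banks` chunks are delivered per clock after the
    initial latency; real designs would add arbitration and wiring delay.
    """
    chunks = math.ceil(block_bytes / bus_bytes)
    beats = math.ceil(chunks / banks)           # clocks needed to stream all chunks
    clocks = latency_clocks + (beats - 1)
    return clocks * 1e9 / clock_hz

for banks in (1, 2, 4):
    ns = interleaved_transfer_ns(128, 16, banks, latency_clocks=1, clock_hz=400e6)
    print(f"{banks} bank(s): {ns:.1f} ns")
```

On paper, four ideal banks at 400MHz would move the 128-byte block in 2 clocks, but that says nothing about whether such a design is practical or what the real access latency would be.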

can you speak on the possibility of double or quad piping the L3 Zram.....

we think in theory it can be done...

double or quad pump it ....fat bandwidth...

could it be done?

or is it better to just wait for zram3 or 4

I know ZRAM is much denser and more energy efficient than conventional cache (if my memory isn't badly off, I think it's something like 32MB in about 50mm^2). I don't know if you can do this with eDRAM.
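
The 32MB in about 50mm^2 figure can be sanity-checked against the densities quoted earlier in the thread (greater than 5Mbit per mm² at 65nm, greater than 10Mbit per mm² at 45nm). This is a rough sketch: `array_area_mm2` is a made-up helper, it treats 1MB as 8Mbit, and it counts only the raw bitcell array, ignoring tags, ECC, and peripheral circuitry.

```python
# Lower-bound densities quoted for Z-RAM Gen2, in Mbit per mm^2
MBIT_PER_MM2 = {"65nm": 5.0, "45nm": 10.0}

def array_area_mm2(megabytes, mbit_per_mm2):
    """Raw bitcell array area, assuming 1 MB = 8 Mbit and no overhead."""
    return megabytes * 8 / mbit_per_mm2

for size_mb in (15, 32):
    for node, density in MBIT_PER_MM2.items():
        area = array_area_mm2(size_mb, density)
        print(f"{size_mb} MB at {node}: about {area:.1f} mm^2")
```

At the quoted 65nm density, 32MB comes to roughly 51mm^2 of raw array, which lines up with the figure recalled above; since the quoted densities are stated as "greater than", these areas are upper estimates for the array alone.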

We use Kilograms over here :lol: 

5.5/8 is the same as 11/16 right? :D 

I'm fairly used to lbs simply because the combat robots I build over here match the US standards as well, so we can compete internationally. I build 30lb (13.6kg) Featherweight Combat Robots 8)


Wow... Soooo off topic...