AMD Piledriver rumours ... and expert conjecture

October 27, 2011 1:26:24 PM

We have had several requests for a sticky on AMD's yet to be released Piledriver architecture ... so here it is.

I want to make a few things clear though.

Post a question relevant to the topic, or information about the topic, or it will be deleted.

Post any negative personal comments about another user ... and they will be deleted.

Post flame baiting comments about the blue, red and green team and they will be deleted.

Enjoy ...
October 27, 2011 1:59:10 PM

Hope to see more civilized posts here.
October 27, 2011 5:31:04 PM

^^^^^ How are any of the above posts constructive expert conjecture or rumors?
Just sayin'...

AFAIK, the only thing we know from AMD slides is that AMD estimates Piledriver to be 10% better than BD-based Zambezi/FX. FWIW, one test of Zambezi on Win 8 vs. Win 7 did show ~5% improvement for Zambezi/FX-8150. So if we see a 15+% improvement over the current FX CPU performance, it's a start.

http://www.pcstats.com/articleview.cfm?articleID=2622
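
As a rough sanity check on those figures, here is a minimal sketch of how the two estimates would combine if they were independent and multiplicative. That independence is an assumption, not something AMD has stated, and `combined_gain` is just a made-up helper name for illustration.

```python
def combined_gain(*gains):
    """Compound independent fractional speedups (0.10 means +10%)."""
    total = 1.0
    for g in gains:
        total *= 1.0 + g
    return total - 1.0


# AMD's ~10% Piledriver estimate combined with the ~5% Win 8 result
print(f"{combined_gain(0.10, 0.05):.1%}")  # 15.5%
```

So the "15+%" figure holds only if both gains materialize and stack cleanly on the same workloads.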
October 27, 2011 5:41:27 PM

AMD Is Already Testing the First Trinity APU Samples

Quote:
A series of processors from AMD's upcoming Trinity APU series were just spotted online in a benchmark database suggesting that the Sunnyvale-based chip maker has already started testing these CPUs and may have even sent them to its partners.

Results of tests run using these processors were added to the OpenBenchmarking.org database and included not just the names and clock speeds of the APUs, but also information regarding their performance.

All these details however have been removed from the database by the Phoronix Test Suite author “to save these engineers on too much embarrassment or trouble.”

Despite this measure, Phoronix has disclosed a few basic details about the chips. There are four of them, and all are Engineering Sample processors.

The APUs include either two or four processing cores, have their clock speeds set between 2.5GHz and 3.3GHz, working integrated graphics and seem to function well under Linux with the proprietary AMD Catalyst driver.

AMD's next-generation Trinity APUs are based on the Piledriver core which is said to offer 10% better performance than Bulldozer and feature a VLIW4 GPU derived from the Cayman graphics used inside the Radeon HD 6900 series.

Much like the current Llano APUs, the chips will lack any sort of Level 3 cache memory as AMD wanted to increase the die area available to the on-board GPU.

According to AMD, Piledriver based APUs will be divided into three main versions for specific price-points and markets.

All the chips will be manufactured by Globalfoundries using the 32nm fabrication process and early estimates indicate that the quad-core version of the chip will feature more than 2 billion transistors.

The first Trinity APUs are expected to arrive at the end of Q1 2012 or in early Q2.
October 27, 2011 5:43:41 PM

^^ Not shocked about the lack of L3. Will also cut down power somewhat, but that still has to be a MAJOR concern, given Trinity's primary market.
October 27, 2011 5:49:40 PM

Globalfoundries Discloses Peculiarities of AMD's Trinity: Piledriver x86 Cores, Radeon HD 7000 Graphics. Globalfoundries Talks Next-Gen 32nm SOI AMD Fusion Chip

Quote:
At its annual Global Technology Conference (GTC 2011), Globalfoundries officially disclosed peculiarities of AMD's next-generation Fusion A-series accelerated processing unit code-named Trinity. As expected, the chip will be based on enhanced Bulldozer/Piledriver x86 cores as well as AMD's next-gen Radeon HD 7000-series graphics technology.

As reported, Advanced Micro Devices' second-generation code-named Trinity APU for mainstream personal computers (Comal for notebooks and Virgo for desktops) will be made using 32nm SOI HKMG process technology at Globalfoundries. The APU will feature up to four x86 cores powered by enhanced Bulldozer/Piledriver architecture, AMD Radeon HD 7000-series "Southern Islands" graphics core with DirectX 11-class graphics support and other improvements.

AMD and Globalfoundries claim that Trinity will offer up to 50% improvement in GFLOPS performance with the same power consumption as currently available A-series "Llano" APUs or similar GFLOPS horsepower with 50% reduction of power consumption.

While basic specs and peculiarities are known, it is still unclear how much more powerful the enhanced Bulldozer (Piledriver) x86 cores will be compared to the first-generation Bulldozer offerings and what will differ between the two iterations of the micro-architecture. It also remains to be seen whether AMD Radeon HD 7000-series will rely on VLIW4 architecture (Cayman-like), will sport a new graphics/compute architecture or something hybrid. It will be interesting to find out whether AMD will implement unified address space for CPU and GPU cores and/or other enhancements for heterogeneous multi-core solutions in its second-generation Fusion or will be a little more conservative.


October 27, 2011 5:50:48 PM

Quote:
^^ Not shocked about the lack of L3. Will also cut down power somewhat, but that still has to be a MAJOR concern, given Trinity's primary market.


True, but seeing as how the L3 (and L2 for that matter) don't help BD much, maybe it won't matter :p .
October 27, 2011 8:58:51 PM

So is AM3+ a dead socket with PD moving to FM2?

I still think the module concept can work well for AMD if they manage a better implementation this time around.

Things on my wish list for Piledriver (for all those at AMD who aren't listening):
-Improved branch prediction
-Reconfigured cache ratios
-Return to hand-design (if the speculation is true that the apparent excess of transistors on BD, 2 billion!, was the result of machine-design)
October 27, 2011 9:10:47 PM

Trinity is FM2: it has onboard graphics, so it needs a new socket. I would think they will release an AM3+ version too, but only because that would be logical, and there was little logic in Bulldozer's release.
October 27, 2011 10:26:28 PM

So PD is based on the BD architecture, but tweaked, right? How extensive is the "tweaking"? Is it a major arch change?
October 28, 2011 1:02:10 AM

Will Piledriver still use the AM3+ socket and 9xx series mobos?
October 28, 2011 1:35:05 AM

Quote:
Will Piledriver still use the AM3+ socket and 9xx series mobos?

Yes. Trinity, however, is an APU and will use the FM2 socket if you were confused with that.
October 28, 2011 2:56:02 AM

What's with all the socket changing now?
October 28, 2011 3:03:36 AM

Quote:
What's with all the socket changing now?

APUs use the FM sockets. CPUs use the AM socket.
October 28, 2011 3:23:51 AM

Quote:
Trinity is FM2: it has onboard graphics, so it needs a new socket. I would think they will release an AM3+ version too, but only because that would be logical, and there was little logic in Bulldozer's release.


There will be an AM3+ part which will be like BD, CPU only. FM2 (not sure on that yet) will be Trinity, which is PD (was BD but changed) with a GPU. That's the whole reason for the FM1 socket: the on-die GPU changed the pinout too much to make it work on AM3.
October 28, 2011 4:19:37 AM

Let me get this straight:

Zambezi(BD) will be replaced by Vishera(PD)
Stars(A series) will be replaced by Trinity
Stars is the Llano platform that will be replaced by Trinity, a whole new circuit.

Correct me if I am wrong?
October 28, 2011 5:36:14 AM



I doubt 30% CPU-wise. I can see it GPU-wise.

I don't think the GPU will be next gen but a refresh of the HD6K series, still VLIW4 rather than MIMD, since I think only the HD79XX series will actually use MIMD. I also think that the Trinity GPU will be the HD7K equivalent to current Llano GPUs but will allow for higher clocks in some way.

The entire HD7K series, apart from the HD79XX, will be a die shrink of Cayman with higher clocks thanks to the 28nm process. Look at the rumored specs of the HD7870: exactly the same specs as the HD6970 (same shader count, 2GB of GDDR5) but with higher core and memory clocks, made possible by a drop of about 100W, for a slight performance boost.

The HD7970, on the other hand, will be MIMD, with 2GB of XDR2 @ 8000MHz, a 1000MHz core, 2048 SPUs and a 512-bit memory interface (reminds me of the HD2900).

It will also more than double the GPixels/s fill rate (64 vs 28.2) and increase the GTexels/s by 50% (128 vs 84.5). On top of that, it will boast a whopping 338.2 GB/s of memory bandwidth vs 176GB/s (almost double). Hell, it's more powerful in almost every area than the HD6990 except GTexels/s (159.2), which means we could possibly have another HD5870, a card that performed the same as a HD4870X2 for the most part, except this one could possibly beat the HD6990.

All that aside, that's why I don't think Trinity will be as big a jump as they say it will be.
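
For anyone who wants to check the "double", "50%" and "almost double" claims, here is a quick sketch that just divides the rumored HD7970 numbers by the HD6970 numbers quoted in this post. The figures are rumors taken as given, not confirmed specs, and the `rumored` table and `ratio` helper are only illustrative names.

```python
# (rumored HD7970 value, HD6970 value) as quoted in this post
rumored = {
    "GPixels/s": (64.0, 28.2),
    "GTexels/s": (128.0, 84.5),
    "GB/s": (338.2, 176.0),
}


def ratio(new, old):
    """How many times larger the new figure is than the old one."""
    return new / old


for unit, (new, old) in rumored.items():
    print(f"{unit}: {ratio(new, old):.2f}x")
```

The ratios come out to roughly 2.27x, 1.51x and 1.92x, so the wording matches the quoted numbers, whatever the final silicon turns out to be.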
October 28, 2011 5:51:39 AM

Just a quick question here: if they did a die shrink of PHII, how much estimated performance would that have given? Would it have been able to clock a lot higher? And is there a chance we'll see that? Because I kinda want to see it.
October 28, 2011 10:59:58 AM

Quote:
Just a quick question here: if they did a die shrink of PHII, how much estimated performance would that have given? Would it have been able to clock a lot higher? And is there a chance we'll see that? Because I kinda want to see it.

I would like to see that too but I don't think we will. If they took a Phenom X4, added a die shrink plus BD's turbo and memory controller, and clocked it at 4GHz with a 4.6GHz turbo, then you might get a CPU that comes close to a 2500K for maybe a cheaper price, and definitely better than a 4-core BD. Again, this is pure speculation and it may not work, but it would have been worth a go instead of releasing BD.
October 28, 2011 2:57:16 PM

I don't think there is any hope for a 32nm Phenom III either. I think AMD is ceding the IPC game to Intel and is going to try to develop products that push the market in new directions instead of constantly trying to play catch-up.

If you can't win the game, change it.

I think the APU and module game they are playing now is a good strategic choice, but they have to be able to implement their innovations better than they did with BD.

If AMD can deliver on their designs and ARM is able to make any headway in the server market I think Intel may be the one left scratching their head for once.
October 28, 2011 3:04:21 PM

http://www.engadget.com/2011/10/27/amd-reports-1-69-bil...

Things were starting to look pretty bleak in Q2 for AMD, but Q3 is an entirely different story. The company reported a revenue of $1.69 billion, up 7 percent from last quarter. More importantly, net income climbed to $97 million, up from just $61 million in Q2 and a far cry from the $118 million loss posted this time last year. Even the graphics division had good news to share. After the former ATI ran at an operating loss of $7 million last quarter, it netted $12 million in operating income in Q3. We wouldn't exactly call this the second coming of the CPU underdog, but it certainly should make fans and investors sleep a little better at night. Check out the complete PR after the break.
October 28, 2011 3:11:10 PM

Quote:
I think a game changer is a good start but performance is still the no1 thing to have for us enthusiasts and gamers alike.


Which goes back to implementation. The goal for BD was to hold the line or slightly improve IPC. If they had been able to do that I think we would have seen a much different reception for BD.

Instead we are seeing IPC that is worse than Phenom II and BD is considered a major disappointment.
October 28, 2011 3:48:04 PM

Quote:
I doubt 30% CPU-wise. I can see it GPU-wise.

I don't think the GPU will be next gen but a refresh of the HD6K series, still VLIW4 rather than MIMD, since I think only the HD79XX series will actually use MIMD. I also think that the Trinity GPU will be the HD7K equivalent to current Llano GPUs but will allow for higher clocks in some way.

The entire HD7K series, apart from the HD79XX, will be a die shrink of Cayman with higher clocks thanks to the 28nm process. Look at the rumored specs of the HD7870: exactly the same specs as the HD6970 (same shader count, 2GB of GDDR5) but with higher core and memory clocks, made possible by a drop of about 100W, for a slight performance boost.

The HD7970, on the other hand, will be MIMD, with 2GB of XDR2 @ 8000MHz, a 1000MHz core, 2048 SPUs and a 512-bit memory interface (reminds me of the HD2900).

It will also more than double the GPixels/s fill rate (64 vs 28.2) and increase the GTexels/s by 50% (128 vs 84.5). On top of that, it will boast a whopping 338.2 GB/s of memory bandwidth vs 176GB/s (almost double). Hell, it's more powerful in almost every area than the HD6990 except GTexels/s (159.2), which means we could possibly have another HD5870, a card that performed the same as a HD4870X2 for the most part, except this one could possibly beat the HD6990.

All that aside, that's why I don't think Trinity will be as big a jump as they say it will be.


I see what you mean, but why doesn't AMD do a clean circuit design?

Also, I have to agree with you on Trinity. I only expect a 10% boost on it from Llano.
October 28, 2011 3:48:58 PM

I have already deleted any posts ( 10 ) without some connection to Bulldozer's successor ... or suggestions of a shrink of a previous K series CPU.

Swing the topic back to Piledriver please?

How about putting up some roadmaps?

Anyone popped over to see what Charlie D or Anand or Fuad has found?

Let's pool resources and display them here.

Hopefully JF-AMD will return soon - as he had heavy work commitments with the Interlagos release.

October 28, 2011 3:56:32 PM


GPU performance increase touted as around 30% as the new 7 series GPU will be used ... as opposed to the current 6 series (read 5 series) GPU.

CPU performance would surely increase ... how much is unknown.

Just getting a decrease in cache latency and a VCE will probably yield a good return though an increase in L1 Cache size ...


http://www.forum-3dcenter.org/vbulletin/showthread.php?...

Engineering samples are likely out of the oven even now.

I didn't troll for more info on the desktop variants.

I would imagine a better stepping for the desktops might be out shortly.

October 28, 2011 4:08:29 PM

Quote:
I put up some links with piledriver info.



cheers for that !!

:) 
October 28, 2011 4:40:50 PM

Increased L1 DTLB size from 32 entries to 64 entries

http://support.amd.com/us/Processor_TechDocs/47414.pdf (Bulldozer)

AMD Family 15h processors have multiple compute units, each containing its own L2 cache and two cores. The cores share their compute unit’s L2 cache. Each core incorporates the complete x86 instruction set logic and L1 data cache. Compute units share the processor’s L3 cache and Northbridge (see Chapter 2, Microarchitecture of AMD Family 15h Processors).

16KB per cluster L1 caches

http://www.donanimhaber.com/islemci/haberleri/Dunyada-i...
October 28, 2011 5:04:27 PM

Piledriver really needs a triple or quad channel memory controller. Unfortunately that doesn't look like it will happen. As we saw with the Llano APU the DDR3 speed has significant effect on the graphics performance.

Beefing up the GPU 50% and adding Bulldozer cores is really going to squeeze the memory bus. DDR3 2133 will help but it's expensive.
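
To put rough numbers on the squeeze, here is a minimal sketch of theoretical peak DDR3 bandwidth, assuming the standard 64-bit (8-byte) channel and nominal transfer rates. Real-world throughput will be lower, and the triple-channel column is purely hypothetical since nothing suggests Trinity will have it. `ddr3_peak_gbs` is an illustrative name.

```python
def ddr3_peak_gbs(mt_per_s, channels=2, bus_bytes=8):
    """Theoretical peak in GB/s: transfers per second * bytes per transfer * channels."""
    return mt_per_s * 1e6 * bus_bytes * channels / 1e9


for speed in (1333, 1866, 2133):
    dual = ddr3_peak_gbs(speed)
    triple = ddr3_peak_gbs(speed, channels=3)  # hypothetical configuration
    print(f"DDR3-{speed}: dual {dual:.1f} GB/s, triple {triple:.1f} GB/s")
```

Going from dual-channel DDR3-1333 to DDR3-2133 lifts the ceiling from about 21 GB/s to about 34 GB/s, and that ceiling is shared between the CPU cores and the GPU.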
October 28, 2011 5:16:53 PM

I don't see any performance benefit gained wrt more channels ... look at Nehalem and SB ... they don't seem to need (or benefit from) the increased bandwidth.



October 28, 2011 5:36:16 PM

The CPU side would not benefit much but the graphics performance of Llano is fairly sensitive to bandwidth.

I think some are mixing discussion of the CPU vs the APU which is understandable since the next APU will be using PD cores as will 'FX-next'.
October 28, 2011 7:26:32 PM

To my knowledge, Trinity has always been set for PD cores.
Charlie saw one a while back, and it's unconfirmed whether it'll be VLIW4 only, with no GCN added.
It's possible I suppose, but adding several designs, besides just dumb shrinks, stretches their already stretched capacity, unless it works in their favor?
October 28, 2011 7:32:13 PM

Could they stack another CPU on the existing one? If they do, they will possibly be ahead of Intel. Possibly.
October 28, 2011 7:39:17 PM

Interesting. Since BD IPC was actually lower than the prior core used on Llano, they must be banking on a significant core frequency bump on Trinity-- either from Turbo 3 or just better power/thermals.
October 28, 2011 7:42:13 PM

Quote:
Interesting. Since BD IPC was actually lower than the prior core used on Llano, they must be banking on a significant core frequency bump on Trinity-- either from Turbo 3 or just better power/thermals.


What are you referring to?
October 28, 2011 7:49:46 PM

Quote:
What are you referring to?

Sorry, should have been clearer... the link you posted regarding Trinity on softpedia.
October 28, 2011 7:53:56 PM

Remember Llano's lower clocks, and an improved 32nm process for PD, which should have a few bugs worked out, meaning possibly higher IPC as well.
October 28, 2011 8:04:49 PM

Is the 10% boost really any improvement? It doesn't seem much compared to how behind BD is ATM. Any news about power consumption and heat?
October 28, 2011 8:07:50 PM

Quote:
http://vr-zone.com/articles/report-amd-trinity-details-...

http://news.softpedia.com/news/AMD-Confirms-Trinity-APU...

It seems Piledriver is not getting much attention (another BD?). However, with 2012 around the corner, Trinity will begin to appear. This is odd because Llano has only been out for a short time.


I have a feeling AMD will try to push out APUs at nearly the same rate that they do GPUs since the two are probably sharing some of the design cycle. I think the ex-AMD engineer claimed that is why they moved to machine generated layouts so they could get the GPU and CPU sides on similar development schedules.
October 28, 2011 8:30:45 PM

Quote:

It seems Piledriver is not getting much attention (another BD?). However, with 2012 around the corner, Trinity will begin to appear. This is odd because Llano has only been out for a short time.



That makes sense as AMD's mantra is "The Future is Fusion." The mainstream market is the APU and it's where they are making the most money.
October 28, 2011 8:59:47 PM



I so called it. AMD is having 32nm yield issues across the board, not just Llano. Llano has a bit more mainly due to the GPU, probably not the CPU.

Looks like low supply for BD until GF gets 32nm mature.

Quote:
GPU performance increase touted as around 30% as the new 7 series GPU will be used ... as opposed to the current 6 series (read 5 series) GPU.

CPU performance would surely increase ... how much is unknown.

Just getting a decrease in cache latency and a VCE will probably yield a good return though an increase in L1 Cache size ...


http://www.forum-3dcenter.org/vbulletin/showthread.php?...

Engineering samples are likely out of the oven even now.

I didn't troll for more info on the desktop variants.

I would imagine a better stepping for the desktops might be out shortly.


As I said before, I would assume a 10% increase if Trinity doesn't follow BD's suit. The 30% GPU gain, I would imagine, would be due to higher clocks.

Wish AMD would put more info out there. Kinda annoying to announce something with very little info on it.
October 28, 2011 9:13:25 PM

On the L3 approach to it...

I got an A8 and I can tell you guys it performs very very well for its low speed. Been experimenting a lot with it and it can max out my 4890 (which would be near a 6850) running at 2.9GHz. Hey, the Athlon II's are a sample for them too :p 

There are few apps that actually need L3 cache that bad, so it's not a bad trade off at all for more space to get GPU muscle in.

On the arch itself, I haven't seen any diagram with tweaks so far that makes it differ from Zambezi, so the performance should be in the same ballpark if not the actual same.

Cheers!

EDIT: Deleted a word.
October 28, 2011 10:19:20 PM

Wow, interesting posts I guess. From what I've seen, PD\Trinity will pick up where B3 leaves off. PD is set to get FMA3 and some bit manip instructions.

I've been in search of an errata list because it's been said that two threads sharing a module tend to get the wrong prefetch data which definitely causes thrashing and will really test the OoO engine for mispredictions. You could lose 5% efficiency which will translate to much more perf.

Also, the affinity tests are showing more perf than it should by not sharing L2. I do believe that a fix in PD\Trinity will be to tweak the L2 WCC (write coalescing cache) to align better for each INT unit.

That I believe is causing the low L2 bandwidth. More to come
October 28, 2011 11:50:16 PM

Quote:
I so called it. AMD is having 32nm yield issues across the board, not just Llano. Llano has a bit more mainly due to the GPU, probably not the CPU.

Looks like low supply for BD until GF gets 32nm mature.



As I said before, I would assume a 10% increase if Trinity doesn't follow BD's suit. The 30% GPU gain, I would imagine, would be due to higher clocks.

Wish AMD would put more info out there. Kinda annoying to announce something with very little info on it.



That article doesn't say there are BD yield issues. Everyone KEEPS saying it's not BD but Llano. IF they are wasting wafers there are fewer for FX.

As far as Trinity:




Also the word is that UVD is getting a new feature to challenge QuickSync. Searching for the reference. It's also possible that Trinity is better because the GPU is already 28nm so it's actually the opposite of a shrink and has the simpler VLIW4 arrangement. Also, they can optimize compilers more for LWP (Lightweight Profiling) and XOP\AVX.

Again, the problems with FX were caused by a perfect storm of SW optimizations (or the lack of them) and Win 7 not understanding modules. Of course PD\Trinity can't fix that, but I'm pretty certain about the issues I believe are occurring in L1 and L2. FX is the first shared-L2 arch. PD should tweak the write-through and eviction scheme to increase bandwidth.


It's really difficult to get too in-depth with what they may have done with PD, but I don't think they will get 20% more clock speed at this point. I would think that if they got the first Trinity chips back in June, they have probably finalized max clocks.

BTW, I see everyone started right back in with delete worthy posts. Perhaps I'll come to a SB post and set a good example.
October 29, 2011 3:43:44 AM

Why can't they shrink the processor to 28nm if the GPU will also be 28nm?