Moore's Law is alive and well

Tags:
  • CPUs
  • Intel
  • Processors
February 28, 2006 3:31:05 AM

Wired is running an article about future lithography techniques.

Quote:
Intel said there may be as many as 100 cores packed on a single processor within 10 years.


Good, quick read.


February 28, 2006 3:36:29 AM

Cool.

The only problem I foresee is memory bandwidth. 8 dies trying to access 2 DIMMs? That's going to be a bitch. Both Intel and AMD need to work on a faster and bigger memory subsystem soon.
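
Back of the envelope, assuming a dual-channel DDR2-667 board (illustrative figures on my part, not from the article):

Code:
# Per-core bandwidth when many cores share a few memory channels.
# The DDR2-667 figure and channel count are illustrative assumptions.

DDR2_667_GBPS = 5.3   # approx. peak of one 64-bit DDR2-667 channel (667 MT/s * 8 B)
CHANNELS = 2          # typical dual-channel desktop board

for cores in (1, 2, 4, 8, 100):
    per_core = CHANNELS * DDR2_667_GBPS / cores
    print(f"{cores:>3} cores sharing {CHANNELS} channels: {per_core:5.2f} GB/s each")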

-mpjesse
February 28, 2006 5:14:35 AM

LOL in ten years, Intel will have had 40 different chipsets and 20 different CPU sockets.

In the near future, AMD is looking good. Multiple HT controllers could easily be built into desktop systems to address multiple memory banks. Eight CPUs on one die addressing two (or 4) separate dual-channel 2-gig memory banks; the tech is in production right now. It would just be a matter of AMD implementing it. 8-way Opterons have 2 (I believe) HT controllers to communicate with the rest of the system.

I'm really not sure what Intel has going on their side. I know multi-Xeon setups aren't as efficient as similar Opteron ones with regard to memory subsystem usage. I'm not really sure of the particulars, though.
February 28, 2006 5:20:03 AM

All Opteron CPUs (1xx, 2xx, and 8xx) have 3 HT links. On 1xx Opterons, 2 are either disabled or not in use (nowhere for them to go, lol). On 2xx Opterons, it depends on the setup. In a board with 4 DIMMs per CPU, 1 HT link goes to RAM and 1 is used for coherency with the other CPU. That's 2; on some motherboards, CPU0's third HT link runs at 600MHz (1200MHz effective) and goes to the PCI-X bridge, while CPU1 has its 3rd link disabled or not in use. On 8xx Opterons, each Opteron has 1 HT link for its separate 4 DIMMs (all 8-way Opteron boards I've seen have 4 DIMMs per CPU). The rest of the HT links just connect to other CPUs.

Then again, 8-way Opteron boards are really 3 boards: 2 boards, each with 4 CPUs and 16 DIMMs, attached via a bus and sitting on top of the 3rd board, which is the actual motherboard. At least the ones I've seen are like this.

~~Mad Mod Mike, pimpin' the world 1 rig at a time
February 28, 2006 5:26:12 AM

So, in theory, you could take an 8-way and put it in a desktop environment, with the proper mobo, and use the other two HT links to address two more memory banks. Correct?
February 28, 2006 5:29:25 AM

In theory, yes. In practice, not likely. A single Opteron CPU would get flooded by 12 DIMMs (3 banks of 4), but for something like a dual-core or quad-core that would be very beneficial. But think of it like this, comparing RAM-over-FSB to RAM-over-HT: with the on-die memory controller in an A64 or Opteron, you have a 1GHz HT link on one side and then a direct connection to the RAM at RAM speed. If you look at the FSB, it has an 8GB/s maximum (1066MHz FSB), and DDR2 is already past 10GB/s. So, technically, you could read data from RAM faster on an A64 (as fast as the RAM will allow), but writing is only as fast as the HT link to it. I find that pretty interesting for DDR2.

HT is already a better candidate for DDR2: at 1400MHz it's something like 11GB/s, which is enough for DDR2 667 at full speed, and even DDR2 800 will perform better. Intel's 1333MHz FSB will be an interesting addition, and the only reason I'd buy an Intel with DDR2; otherwise you're wasting bandwidth.
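
Rough numbers behind that comparison, treating HT as a 16-bit link with DDR signalling and counting both directions (a sketch, not exact spec figures):

Code:
# Sanity check on the bus numbers in this thread (approximate peak figures).

def fsb_gbps(mt_per_s, width_bytes=8):
    # Intel FSB: quad-pumped, 64 bits wide; mt_per_s is the effective rate.
    return mt_per_s * width_bytes / 1000

def ht_gbps(clock_mhz, width_bytes=2):
    # HyperTransport: DDR signalling, 16 bits per direction; aggregate of both.
    return clock_mhz * 2 * width_bytes * 2 / 1000

def ddr2_dual_gbps(mt_per_s):
    # Two 64-bit DDR2 channels.
    return mt_per_s * 8 * 2 / 1000

print(f"1066 FSB:      {fsb_gbps(1066):.1f} GB/s")      # ~8.5
print(f"1333 FSB:      {fsb_gbps(1333):.1f} GB/s")      # ~10.7
print(f"HT @ 1.0 GHz:  {ht_gbps(1000):.1f} GB/s")       # ~8.0 aggregate
print(f"HT @ 1.4 GHz:  {ht_gbps(1400):.1f} GB/s")       # ~11.2 aggregate
print(f"Dual DDR2-667: {ddr2_dual_gbps(667):.1f} GB/s") # ~10.7
print(f"Dual DDR2-800: {ddr2_dual_gbps(800):.1f} GB/s") # ~12.8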

~~Mad Mod Mike, pimpin' the world 1 rig at a time
February 28, 2006 5:35:16 AM

Right, in the post we are referring to, I said
Quote:
Eight CPUs on one die addressing two (or 4) separate dual-channel 2-gig memory banks...


So we agree :D 
February 28, 2006 5:36:40 AM

Yep. That'd be quite interesting, and it would put AMD so far ahead of Intel. The only way Intel could counter is with multiple northbridges, which doesn't seem very likely or plausible.

~~Mad Mod Mike, pimpin' the world 1 rig at a time
February 28, 2006 5:41:50 AM

I hate to reply to my own replies, but check this:

After reading my own post (hehe) I got an idea. What if there were multiple sub-northbridges, each with 1 memory controller, each addressing its own memory? Those sub-NBs would then feed information to the main NB. Or Intel could implement DIB (Dual Independent Bus) as they said, and have 4 FSBs in a quad-core CPU, each FSB linking to its own sub-NB; that gives each core its own memory, and the sub-NBs link to the main NB. Then, when data comes in from the SB or other devices on the main NB, it could find whichever bus was least busy and use that for processing. I think it seems pretty interesting and would temporarily solve Intel's dilemma. What do you guys think?
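
In toy form, the "least busy bus" arbitration might look something like this (all the names here are invented for illustration; nothing Intel has announced):

Code:
from collections import deque

# Toy model of the sub-northbridge idea: requests arriving at the main NB get
# dispatched to whichever sub-NB (each with its own memory controller) has the
# shortest queue. Purely illustrative; class and function names are made up.

class SubNorthbridge:
    def __init__(self, name):
        self.name = name
        self.queue = deque()   # pending memory requests on this bus

    def load(self):
        return len(self.queue)

def dispatch(request, sub_nbs):
    # Route the request to the least-busy sub-northbridge.
    target = min(sub_nbs, key=SubNorthbridge.load)
    target.queue.append(request)
    return target

buses = [SubNorthbridge(f"sub-NB{i}") for i in range(4)]
for r in range(10):
    nb = dispatch(f"req{r}", buses)
    print(f"req{r} -> {nb.name} (queue depth {nb.load()})")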

~~Mad Mod Mike, pimpin' the world 1 rig at a time
February 28, 2006 5:45:32 AM

Lol, funny you should mention multiple northbridges. I read an article about implementing multiple ULI chips via HT (way too lazy to find it right now) to achieve more PCIe lanes. I'm not positive, but I think it was regarding an Intel chipset.
No, it isn't likely though, considering the cost of production, the thermal characteristics, and the pure real estate on the MB.
February 28, 2006 6:35:29 AM

Quote:
All Opteron CPUs (1xx, 2xx, and 8xx) have 3 HT links. On 1xx Opterons, 2 are either disabled or not in use... [snip]

Are you sure?
I thought Opterons used the same type of memory bus as the A64s: that the 2xx series used 1 HT link for the northbridge and one went to the other chip, and that the 4xx and 8xx used the 3rd HT link to interconnect the multiple chips.
February 28, 2006 5:52:03 PM

2-Way Opteron Diagram
4-Way Opteron Diagram
4-Way Opteron Diagram (More Legible)
8-Way Opteron Diagram

As you can see from the diagrams, I was mis-stating some things, but my basic outline was the same: Opterons have 3 HT links and some are disabled. As you can see, CPU 1 has 1 of its HT links disabled in the 2-way diagram. There is no such thing as a 4xx Opteron; there are 4-way boards, but you need 8xx Opteron CPUs to put in them. (8xx Opterons can be used in 2-way boards as well, which provides a tiny performance increase over 2xx, so I am told.)

Note: All 4-way and 8-way Opteron boards use AMD 8xxx chipsets.

~~Mad Mod Mike, pimpin' the world 1 rig at a time
February 28, 2006 7:18:24 PM

The basic silicon for the HT links in all Opterons and Athlons is the same.
AMD just disables a feature option on some links to prevent you from using them to build bigger systems.
8xx – All links function with coherency support and can be connected to three neighbouring CPUs, or to fewer CPUs with more links for IO (e.g. chipset). You can also put an 8xx series into a 2-socket MB with a 2xx.
2xx – 1 link for CPU-to-CPU with coherency support; the rest can be used for IO.
1xx – No links for CPU-to-CPU (no links with coherency support), so all 3 can be used for IO.
Athlon 64 – No links for CPU-to-CPU (no links with coherency support), so all 3 can be used for IO.
If coherency support for a link is disabled at the factory (when they make it a 1xx, 2xx, 8xx or Athlon 64), then you can't re-enable it. Coherency support allows CPUs to share each other's memory (which is directly attached to each) via messages.
For example, I want data at #### :
* I’m the owner so I don’t care to ask others for permission;
* I’m the owner but others are sharing it so I’ll let them know I want to change it;
* I want data at ####, does anyone have it in their cache;
* I have it and you can read it or here’s the update;
* I have an exclusive lock so you can’t have it until I’m finished with it;

Without the coherency traffic, chaos reigns.
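
That message exchange is essentially what MESI-style protocols formalize. A toy version (illustrative only; real Opterons use the richer MOESI protocol over HyperTransport):

Code:
# Toy MESI walk-through of the message exchange above.

M, E, S, I = "Modified", "Exclusive", "Shared", "Invalid"

class CacheLine:
    def __init__(self):
        self.state = I

def read(line, others):
    # "I want data at ####, does anyone have it in their cache?"
    if line.state == I:
        if any(o.state in (M, E, S) for o in others):
            for o in others:
                if o.state in (M, E):
                    o.state = S      # owner downgrades and supplies the data
            line.state = S
        else:
            line.state = E           # nobody else has it: exclusive copy

def write(line, others):
    # "Others are sharing it, so I'll let them know I want to change it."
    for o in others:
        o.state = I                  # invalidate every other copy
    line.state = M

cpu0, cpu1 = CacheLine(), CacheLine()
read(cpu0, [cpu1]);  print(cpu0.state, cpu1.state)   # Exclusive Invalid
read(cpu1, [cpu0]);  print(cpu0.state, cpu1.state)   # Shared Shared
write(cpu0, [cpu1]); print(cpu0.state, cpu1.state)   # Modified Invalid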
February 28, 2006 7:29:43 PM

Where did you read that A64s have 3 HT links? Because I do not believe that to be true. And on 1xx Opterons, only 1 is used; the other 2 are not, AFAIK.

EDIT: Diagram This also shows that not all CPUs use their 3 HT links in a 4-way board.
EDIT: Diagram Here's an illustration of a 1-way A64/Opteron motherboard.
EDIT: Diagram Here is why dual Xeons suck
EDIT: Picture I wish this was mine ;)
EDIT: Link Look at the bottom of the page; you can see what I was talking about with 8-way Opterons.
EDIT: Diagram A nice diagram of the architecture of an AMD Opteron 64 CPU
EDIT: Diagram Here's a 2-way with CPU1 only using 1 HT link.
EDIT: Diagram Another reason why Xeons suck
EDIT: Diagram A64 architecture diagram

~~Mad Mod Mike, pimpin' the world 1 rig at a time
March 1, 2006 7:00:37 PM

Intel? Hmm... I could have sworn I've heard that name before. Weren't those the same guys who, while "discovering" (hyper)pipelining, predicted 10GHz CPUs by now? Until they noticed the catch: indeed, 10GHz CPUs run kind of hot.

So now they've "discovered" that by adding another core you get an easy performance boost, and they think this will keep helping until eternity? Sure... until they discover the catch behind this one: software.

After 4 decades of supercomputers, there is still no OS that can gracefully schedule anywhere near that number of simultaneous processes efficiently in a single-image system. Linux has a hard time feeding 8 cores, and Windows is barely better.

Let alone any applications that are of any interest to consumers, other than very specific streaming tasks that are much better handled by cheap, dedicated chips anyway (physics, 3D video, video encoding, rendering, ...).

IMO this is nothing other than a sign of defeat, an admission that you have run out of ideas to improve ILP or clock speed, because no matter how you look at it, single-threaded performance (application latency) will always matter. If not, we'd have had those 100-core 486s today. IBM isn't stupid for pursuing per-thread performance and clock speed with their POWER6 rather than just adding cores like most others.

To summarize: I certainly believe process scaling can make it possible to produce 100-core chips in a decade (Sun already has a relatively small and cool 8-core chip on the market today), but if this is what Intel is betting on for the future, they will run into a brick wall that will make Prescott look like the best idea since sliced bread.
March 1, 2006 7:05:44 PM

Quote:
Lol, funny you should mention multiple northbridges. I read an article about implementing multiple ULI chips via HT (way too lazy to find it right now) to achieve more PCIe lanes. I'm not positive, but I think it was regarding an Intel chipset.
No, it isn't likely though, considering the cost of production, the thermal characteristics, and the pure real estate on the MB.


Adding memory controllers (and therefore bandwidth) to the CPU isn't that hard; designing the motherboards is the real problem. It's not without reason that AMD didn't double the number of memory controllers when going dual-core: quad channel would make motherboards incredibly expensive to design and manufacture.

It doesn't matter how you do it (multiple northbridges connected through HT or otherwise, or more on-die memory controllers); the problem remains the same. FBDIMM might help here though, as might other serial memory protocols like Rambus.
March 1, 2006 7:07:04 PM

Quote:
Intel? Hmm... I could have sworn I've heard that name before. Weren't those the same guys who, while "discovering" (hyper)pipelining, predicted 10GHz CPUs by now? [snip] ...they will run into a brick wall that will make Prescott look like the best idea since sliced bread.


Hahahaha, I like you, man. AMD with the K10 is promising 10GHz operation as well; we'll have to see how that goes. As for multi-core, I've been stating on these forumz for as long as I've been here that I do not believe multi-core is the future for desktop and normal users, but rather for high-end servers. There needs to be more concentration on making better apps, better and more secure OSes, and, face it, better video games. But I do believe the push to increase speed is necessary, because I like bragging rights :>).

BTW: Prescott + Sliced Bread = Toast :-D.

~~Mad Mod Mike, pimpin' the world 1 rig at a time
March 1, 2006 7:43:27 PM

Quote:
Adding memory controllers (and therefore bandwidth) to the CPU isn't that hard; designing the motherboards is the real problem. [snip] Quad channel would make motherboards incredibly expensive to design and manufacture.

It wouldn't be expensive for quad channel (look at dual Opteron boards with 8 DIMMs, 4 per CPU, easily found for <$250), and it isn't hard on the motherboard side; the hard part is on the CPU. The CPU would require 736 pins dedicated to memory for DDR1 quad channel (or 2 dual-channel controllers per CPU) on a dual-core CPU with independent memory controllers, and that means a new socket. That brings in Socket F for Opterons with 1207 pins: DDR2 quad channel with independent memory controllers would require 960 pins from the CPU, so we might see it in Socket F, which would be awesome.
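
The arithmetic behind those pin counts, using the simplification in this post that each channel costs a full DIMM's worth of pins routed from the CPU (in reality many DIMM pins are power/ground, so this overstates it):

Code:
# Pin counts for quad-channel memory, per the simplification above.

DDR1_DIMM_PINS = 184
DDR2_DIMM_PINS = 240

print("DDR1 quad channel:", DDR1_DIMM_PINS * 4, "pins")   # 736, as stated
print("DDR2 quad channel:", DDR2_DIMM_PINS * 4, "pins")   # 960, as stated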

~~Mad Mod Mike, pimpin' the world 1 rig at a time
March 2, 2006 10:23:39 AM

Quote:

It wouldn't be expensive for quad channel (look at dual Opteron boards with 8 DIMMs, 4 per CPU, easily found for <$250)


It's much easier to route 4 memory channels to 2 different sockets than to one and the same. Also, $250 is completely outrageous for an MB if you are talking mainstream. I suspect a typical desktop MB costs around $10-15 to manufacture. If quad channel increases that cost by just $10, it's definitely a showstopper.
March 2, 2006 5:50:58 PM

P4Man, it isn't EASIER in the sense of "routing" (whatever the hell you're talking about with that); adding dual memory controllers to a CPU means a new socket, because each CPU would have dual channel and require 4 times the pins of a single DIMM (184x4 for DDR1, 240x4 for DDR2). Socket F can bring either quad-channel RAM, or dual memory controllers on a dual-core with dual channel per core. I would rather have the latter.

~~Mad Mod Mike, pimpin' the world 1 rig at a time
March 3, 2006 9:05:14 PM

I believe he's referring to the traces on the mobo... routing 4 memory channels to a single socket is more difficult because of the number of traces required. Get too many too close together and you get crosstalk.
March 4, 2006 8:15:38 AM

Quote:
Intel said there may be as many as 100 cores packed on a single processor within 10 years. [snip]

If there's one thing to learn from Intel, it's never to listen to their future predictions.
March 4, 2006 11:01:51 AM

Quote:
Intel said there may be as many as 100 cores packed on a single processor within 10 years. [snip]

If there's one thing to learn from Intel, it's never to listen to their future predictions.

We've also learned that you can sell anything as long as you've got dancing aliens and funny little blue men.
March 4, 2006 12:32:12 PM

Quote:
Intel said there may be as many as 100 cores packed on a single processor within 10 years. [snip]

They also said they could be at 10GHz+ by now with Prescott!