Report: Intel Haswell-E Halo Platform Will Have Eight Cores

June 17, 2013 11:27:20 AM

'Four-core configurations of GPU less dies'?? I think this needs a correction
June 17, 2013 11:31:47 AM

swordrage said:
'Four-core configurations of GPU less dies'?? I think this needs a correction


Makes perfect sense to me.

June 17, 2013 11:34:27 AM

2014? This should have been what we saw this year already. Come on Intel, this is sounding much better than your current stuttering releases.
June 17, 2013 11:39:48 AM

20MB L3 cache and DDR4 support sounded good.
Let's see if Intel could bring down that 140W TDP to about 70W.
June 17, 2013 11:44:38 AM

Aren't these the new ones that are soldered to the board or is that the next generation?
June 17, 2013 11:47:19 AM

zeek the geek said:
2014? This should have been what we saw this year already. Come on Intel, this is sounding much better than your current stuttering releases.

It's Haswell-E. I think they like releasing the Extreme edition chips a year later, don't ask me why...

milktea said:
20MB L3 cache and DDR4 support sounded good.
Let's see if Intel could bring down that 140W TDP to about 70W.

Are you even serious? They're going from 6 cores within 130 W to 8 within 140 W (that's roughly 21.7 W per core down to 17.5 W per core), and that's not enough for you? They're not going to pull physics from their backside, you know. Wait till they hit 5 nm; maybe then you'll get 8 cores @ 4 GHz under 70 W.

Look at their standard mainstream parts if you'd like to see stuff around 70w.
June 17, 2013 11:48:28 AM

Can we just skip Ivy-E and go to Haswell-E?
June 17, 2013 11:53:07 AM

kenyee said:
Aren't these the new ones that are soldered to the board or is that the next generation?

Dude, did you even read the whole thing? Socket LGA 2011-3. LGA. Land Grid Array. Not BGA. Socket LGA 2011-3. SOCKET.

All current chips come in some sort of BGA packaging, Broadwell will be the entirely soldered generation, but then Broadwell won't release for the desktop (at least not in a socketed form), and we'll skip straight to Skylake that WILL NOT BE SOLDERED.

Hope that helps.
June 17, 2013 11:57:48 AM

Still only 40 lanes of PCIe? Grrr... I need 48, darnit. I need it to handle dual video cards, a RAID controller and 10GbE, and that takes 48 lanes.
June 17, 2013 12:00:48 PM

It looks really nice...except for the inevitable $1000+ pricetag.

for the price of one cpu + motherboard, we'll be able to build 2 complete 8-core AMD desktops
June 17, 2013 12:05:00 PM

I'll be picking up 2 of these.
June 17, 2013 12:25:52 PM

ojas said:
zeek the geek said:
2014? This should have been what we saw this year already. Come on Intel, this is sounding much better than your current stuttering releases.

It's Haswell-E. I think they like releasing the Extreme edition chips a year later, don't ask me why...


Extreme series parts are based on the Server EP line of parts, which lags a year behind desktop parts. As such, Ivy Bridge server is coming out this year, while Haswell server is coming out next year (assuming they maintain their ~yearly cadence). The reason for this is that server parts require much more validation because there is so much more on the die. In fact, it takes about a year of extra validation for server parts to be released.


June 17, 2013 12:27:47 PM

Between the 8 cores and DDR4, I'm definitely jumping from Sandy to Haswell-E in 2014 :)  Nothing in between is worth the upgrade.
June 17, 2013 1:47:35 PM

I'd also love to see the power requirement go down from 130W to 80W on 8 cores, maybe even 96W. With consoles finally up to date later this year, I am hoping that, for gaming at least, this means our systems will be fully utilized.

However, with DDR4, quad channel, etc., I still have not seen in which configurations my system can be maxed out.
June 17, 2013 1:48:08 PM

dgingeri said:
Still only 40 lanes of PCIe? Grrr... I need 48, darnit. I need it to handle dual video cards, a RAID controller and 10GbE, and that takes 48 lanes.

There is no benefit to going beyond x8 on PCIe 3.0 for GPUs (almost none over x4 most of the time) so you can use 2x x8 for GPUs, which leaves you with 24 lanes (192Gbps) for everything else.

Alternatively, you can get a board that uses PLX or similar switch chips to use the CPU's lanes more efficiently... 10GbE only requires 1.25 PCIe 3.0 lanes, so dedicating 4 lanes from the CPU to it would waste 2.75 of them. Using a 40-lane PLX switch to expand the 8 "extra" lanes would make 64 Gbps available for IO; enough to handle all but the craziest prosumer IO needs.
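
For anyone who wants to sanity-check the lane math, here's a rough sketch (Python, purely illustrative; it assumes roughly 8 Gbps of usable bandwidth per PCIe 3.0 lane and just mirrors dgingeri's device list):

    # Rough PCIe 3.0 lane-budget sketch for a 40-lane Haswell-E style CPU.
    # Assumes ~8 Gbps of usable bandwidth per PCIe 3.0 lane.
    GBPS_PER_LANE = 8
    TOTAL_LANES = 40

    devices = {
        "GPU #1": 8,             # x8 is plenty for a single GPU in most games
        "GPU #2": 8,
        "RAID controller": 8,
        "10GbE NIC": 4,          # 10 Gbps needs ~1.25 lanes, so x4 wastes ~2.75
    }

    used = sum(devices.values())
    spare = TOTAL_LANES - used

    for name, lanes in devices.items():
        print(f"{name:>16}: x{lanes:<2} = {lanes * GBPS_PER_LANE} Gbps")

    print(f"Lanes used: {used}/{TOTAL_LANES}, spare: {spare} "
          f"({spare * GBPS_PER_LANE} Gbps left over)")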
June 17, 2013 2:50:53 PM

This means I'll be waiting until 2016 until I adopt DDR4, which means I'll be comfy until then. Sweet.
June 17, 2013 4:11:24 PM

ojas said:
kenyee said:
Aren't these the new ones that are soldered to the board or is that the next generation?

Dude, did you even read the whole thing? Socket LGA 2011-3. LGA. Land Grid Array. Not BGA. Socket LGA 2011-3. SOCKET.

All current chips come in some sort of BGA packaging, Broadwell will be the entirely soldered generation, but then Broadwell won't release for the desktop (at least not in a socketed form), and we'll skip straight to Skylake that WILL NOT BE SOLDERED.

Hope that helps.


You are just as wrong as the guy you are correcting.

Broadwell will offer LGA just like Intel always has.

AMD would love for Intel to only offer BGA, but it's not going to happen.
June 17, 2013 5:23:12 PM

40 Gen 3 PCIe lanes, now you're talking, Intel! I'd like to see 48 or more, but it's a big step in the right direction.
June 17, 2013 9:41:55 PM

InvalidError said:
dgingeri said:
Still only 40 lanes of PCIe? Grrr... I need 48, darnit. I need it to handle dual video cards, a RAID controller and 10GbE, and that takes 48 lanes.

There is no benefit to going beyond x8 on PCIe 3.0 for GPUs (almost none over x4 most of the time) so you can use 2x x8 for GPUs,
This is a very absolutist statement and is rooted very much in the current state of affairs, rather than looking towards 2014 and beyond.

The first exception I'd cite is that of partially-resident textures, where main memory is used to hold textures too large to fit on the video card. This puts much more strain on the bus.

Second, consider the fact that scene geometry continues to increase and the GPU is increasingly involved in tasks like physics and AI. Whenever the CPU and GPU are collaborating like that, it's not just the bandwidth that counts, but also the latency. The sooner the CPU gets the answer back from the GPU, the sooner it can start performing the next operation (and faster bus speeds reduce transmission time).

Finally, consider applications specifically involving GPU-compute. Depending on the application, the bus can quickly become a bottleneck.

If you look back through previous advances in bus architecture, you'll see that the first couple generations of games and graphics cards didn't benefit much from a new standard (I'll make an exception for AGP, which was long overdue). But well before the next generation comes along, products and applications have evolved to take advantage of the capacity of the previous generation.

Since both of the major new consoles have APUs with extremely high-bandwidth CPU <-> GPU communication, I suspect we're in for a wave of games that are increasingly sensitive to GPU bus bandwidth.
June 17, 2013 9:53:03 PM

milktea said:
20MB L3 cache and DDR4 support sounded good.
Let's see if Intel could bring down that 140W TDP to about 70W.
Don't you guys get it? Intel can & does build lower power chips, but they're not as fast. They could build more power-hungry, faster chips than they currently do, but they're under no competitive pressure to do so.

Look at it this way: Intel & AMD will both build high-end CPUs that burn as much power as the market will accept. If Intel decides to make their fastest CPU burn only 70 W, then AMD will come along and blow it out of the water with a 140 W chip. In fact, this is what AMD is currently trying with the 220 W chip they just announced, but I think they've considerably overshot with that one.
June 17, 2013 11:56:24 PM

bit_user said:
This is a very absolutist statement and is rooted very much in the current state of affairs, rather than looking towards 2014 and beyond.

I still stand by what I said.

Partially resident textures? With 2-3GB GPUs? Really? How many games use enough textures at sufficiently ludicrous resolutions to actually require that?

Latency? Going over PCIe already requires hundreds of cycles so the drivers and software already need to be written to accommodate very high latencies. A few cycles more or less should make little to no difference to properly written software unless you are attempting to do something that exceeds on-card resources but then you would be screwed anyway since system RAM is nowhere near as fast as on-board RAM even on the best of days... this effectively becomes a case of "wrong hardware for the job."

x8 currently has very little benefit over x4 most of the time even though PCIe 3.0 has been out for nearly two years already so it will still be a few more years before x8 starts becoming a bottleneck actually worth worrying about. By the time it does, PCIe 4.0 will likely be out since the optimistic ETA is late 2014.

BTW, PCIe 4.0 is coming out in late-2014 or early-2015 so, "looking toward 2014 and beyond", PCIe 4.0 would be my answer. By the time PCIe 3.0 x4 becomes a bottleneck, PCIe 4.0 x4 will be available and enthusiasts will likely have an urge to upgrade regardless of what they have today.
June 18, 2013 4:09:45 AM

8 cores, DDR4, 140W... sounds pretty tempting to me.

How about an 8-core non-HT and non-iGPU part at a lower price point specifically for gamers? I am not a huge gamer, but at this point I would much rather have a native 8-core part than a quad core with HT or an 8-core with HT. AMD's CPU division has not gotten a whole lot right over the last few years, but having their version of HT, which can be used by any software, compared to Intel's HT, which has to be specifically programmed for, seems like a great direction to go. Now if only they could get the rest of their platform to be just as innovative...
June 18, 2013 4:49:09 AM

Wonder how well it will compare with AMD's 8-core Jaguar APU.
June 18, 2013 4:56:33 AM

We should've had an 8-core processor already this year, with Haswell, not Haswell-E.
June 18, 2013 6:00:48 AM

CaedenV said:
AMD's CPU division has not gotten a whole lot right over the last few years, but having their version of HT, which can be used by any software, compared to Intel's HT, which has to be specifically programmed for, seems like a great direction to go.

HT works exactly the same as having more physical cores as far as software is concerned and does not require any HT-specific programming to use.

What does require architecture-specific optimizations is fancy stuff like using the HT thread (or AMD's 2nd int-core) as an intelligent cache pre-fetcher or some other similar support function for the thread/core that does the heavy lifting.

Unless you go out of your way to implement architecture-specific tweaks, every HT thread and every int-core looks exactly the same to software.
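
For what it's worth, here's a minimal sketch (Python, purely illustrative) of that point: ordinary code only ever sees logical CPUs, and there is no HT-specific call to make.

    # Ordinary software only sees "logical CPUs". Whether those are HT
    # siblings, AMD module integer cores, or full physical cores is invisible
    # unless you deliberately query CPU topology from the OS.
    import os

    print("Logical CPUs visible to this process:", os.cpu_count())
    # There is no "enable HT" API to call; a thread pool sized to
    # os.cpu_count() treats HT threads and physical cores exactly the same.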
June 18, 2013 7:02:01 AM

Wake me when they transition all the USB ports to 3.0
June 18, 2013 10:19:22 AM

ScrewySqrl said:
It looks really nice...except for the inevitable $1000+ pricetag.

for the price of one cpu + motherboard, we'll be able to build 2 complete 8-core AMD desktops

Should be a $600 part too, just like the...3930K, is it?

Blandge said:
ojas said:
zeek the geek said:
2014? This should have been what we saw this year already. Come on Intel, this is sounding much better than your current stuttering releases.

It's Haswell-E. I think they like releasing the Extreme edition chips a year later, don't ask me why...


Extreme series parts are based on the Server EP line of parts, which lags a year behind desktop parts. As such, Ivy Bridge server is coming out this year, while Haswell server is coming out next year (assuming they maintain their ~yearly cadence). The reason for this is that server parts require much more validation because there is so much more on the die. In fact, it takes about a year of extra validation for server parts to be released.

Ah. Interesting. Thanks! :) 

Grandmastersexsay said:
ojas said:
kenyee said:
Aren't these the new ones that are soldered to the board or is that the next generation?

Dude, did you even read the whole thing? Socket LGA 2011-3. LGA. Land Grid Array. Not BGA. Socket LGA 2011-3. SOCKET.

All current chips come in some sort of BGA packaging, Broadwell will be the entirely soldered generation, but then Broadwell won't release for the desktop (at least not in a socketed form), and we'll skip straight to Skylake that WILL NOT BE SOLDERED.

Hope that helps.


You are just as wrong as the guy you are correcting.

Broadwell will offer LGA just like Intel always has.

AMD would love for Intel to only offer BGA, but it's not going to happen.

Um. See, I really can't prove anything a year in advance. I could post a few links and dump the info gathered over the last few months into this post, but I'm just too lazy at the moment.

But in short,
Broadwell has more or less been confirmed to be BGA/PGA only, without any LGA part.
There will be a Haswell "Refresh" next year for the desktop with the 9-series chipsets. These will be LGA. Compatible with 8-series? Most probably.
Haswell-E will also come out next year and fit into LGA 2011-3.
Skylake comes next for the desktop in 2015.

That's all folks.

p.s. I do remember seeing one indication of a Broadwell LGA part, but everything since has pointed to a Haswell Refresh. Unless this Haswell Refresh turns out to be Broadwell, which I don't think it is, the above info seems to be the most likely case as of now.

p.p.s. AMD wouldn't love Intel even if they decided not to release anything at all for a year.
June 18, 2013 3:44:15 PM

This is when I will finally ditch my i7-980X. Finally, a worthy upgrade.
June 21, 2013 8:43:23 PM

JOSHSKORN said:
We should've had an 8-core processor already this year, with Haswell, not Haswell-E.
Well, more cores need more memory bandwidth. That's why the Sandy Bridge-E platform has quad-channel memory. Maybe when DDR4 hits, dual-channel might be enough to keep 6 cores fed.

Then again, the iGPU can probably consume all of that memory bandwidth all by itself. So, you probably need both on-die graphics memory (which is making an appearance in mobile Haswell and, I think, the Xbox One CPU) and DDR4.

Also, most desktop users don't need 4 cores, let alone 8. Even most games don't benefit much from going beyond quad-core. I think Intel got it right with their market segmentation: 6+ cores needs a high-end platform, and only makes much sense for high-end users and workstations/servers.
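
Some quick back-of-the-envelope numbers on the bandwidth point (peak theoretical figures only, computed as transfer rate x 8 bytes x channels; the specific speed grades below are just illustrative assumptions):

    # Peak theoretical memory bandwidth = transfer rate (MT/s) x 8 bytes x channels.
    def bandwidth_gbs(mts, channels):
        return mts * 8 * channels / 1000  # GB/s

    configs = {
        "DDR3-1600, dual channel": (1600, 2),
        "DDR3-1866, quad channel": (1866, 4),
        "DDR4-2133, dual channel": (2133, 2),
        "DDR4-2133, quad channel": (2133, 4),
    }

    for name, (mts, channels) in configs.items():
        bw = bandwidth_gbs(mts, channels)
        print(f"{name}: {bw:.1f} GB/s total, {bw / 8:.1f} GB/s per core on an 8-core chip")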
June 21, 2013 9:34:49 PM

InvalidError said:
bit_user said:
This is a very absolutist statement and is rooted very much in the current state of affairs, rather than looking towards 2014 and beyond.
I still stand by what I said.
I think I'd be surprised if you didn't.
;) 

InvalidError said:
Latency? Going over PCIe already requires hundreds of cycles so the drivers and software already need to be written to accommodate very high latencies.
We're talking about very different things, here. Any GPU-compute task is currently driven by the CPU. The computation model is that the CPU prepares some work and ships it off to the GPU. In the meantime, the CPU can go do something else or queue up more work, but if we're talking about something like a game, the CPU often needs to get the results before it can finish that frame and move on to the next. So, pipelining opportunities are limited.

Now, the CPU is waiting for the data to transfer to the card, then for the GPU to process it, and finally for the GPU to return some result. If any of those steps includes a significant amount of data transfer, then the time the CPU must wait for that data directly affects the amount of time to process a frame, which directly affects the framerate.

And we're not talking about "a few cycles more or less", but rather something like feeding 100 MB of scene geometry to the physics engine, for instance. And let's just worry about one direction. When you're talking about 4, 8, or 16 GB/sec, that amounts to 25 ms, 12.5 ms, or 6.25 ms. No big deal, eh? Well, if you're doing it for every frame, then the maximum framerate you could reach would be 40, 80, or 160 fps. Right now, you're probably thinking that 80 fps sounds pretty good, so 8x it is! Well, let's not forget that we're assuming exclusive use of the bus, but there's a lot of graphics data being shipped over it, as well. It also assumes that the GPU's computation time is zero, which it's not. And the CPU has to do other chores that are either dependent on the results of the GPU computation or a dependency of it. So, maybe you can see why bus speed is so important - because transaction time can be as much or more important than the total amount of stuff you can cram over it, if you were simply sending nonstop.

Of course, I have no idea how realistic 100 MB is, and clever game devs will try to keep as much data on the graphics card as possible and overlap as many things as they can. But even if you cut my number down to 10 MB and assume that bus channel is available 50% of the time, you still come out with spending 30% of the time potentially waiting on just the bus @ 60 fps, in the PCIe 3.0 x4 case.
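
If anyone wants to poke at the arithmetic, here's the same calculation as a quick sketch (the 100 MB and 10 MB payloads are the same made-up figures as above, and PCIe 3.0 x4/x8/x16 are approximated as 4/8/16 GB/s):

    # Transfer time = payload / effective bandwidth, compared to a frame budget.
    def transfer_ms(payload_mb, bandwidth_gbs, bus_availability=1.0):
        return (payload_mb / 1000) / (bandwidth_gbs * bus_availability) * 1000

    # 100 MB per frame over PCIe 3.0 x4 / x8 / x16:
    for bw in (4, 8, 16):
        t = transfer_ms(100, bw)
        print(f"100 MB over {bw} GB/s: {t:.2f} ms -> framerate capped at {1000 / t:.0f} fps")

    # The conservative case: 10 MB per frame, bus free 50% of the time, x4,
    # against a 60 fps (16.7 ms) frame budget.
    t = transfer_ms(10, 4, bus_availability=0.5)
    print(f"10 MB at 50% availability: {t:.2f} ms = {100 * t / (1000 / 60):.0f}% of a 60 fps frame")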

Don't think we won't start seeing this, since the APUs used in the upcoming consoles have no bus separating the CPU and GPU. They both have fast, wide datapaths to unified memory, as well as possibly a shared L3 cache. Game devs will certainly be doing a lot more heterogeneous computing, to use AMD's parlance.

InvalidError said:
x8 currently has very little benefit over x4 most of the time even though PCIe 3.0 has been out for nearly two years already so it will still be a few more years before x8 starts becoming a bottleneck actually worth worrying about.
Fortunately, this point can quite easily be refuted, on the basis of PCIe scaling studies Tom's has actually done. In an article they wrote about PCIe 2.0 scaling, nearly three years ago, they stated:

Quote:
we did see a fairly large difference between x8 and x16 slots

Source: http://www.tomshardware.com/reviews/pcie-geforce-gtx-480-x16-x8-x4,2696-17.html
Which flies in the face of your above assertion.

InvalidError said:
By the time it does, PCIe 4.0 will likely be out since the optimistic ETA is late 2014.

BTW, PCIe 4.0 is coming out in late-2014 or early-2015 so, "looking toward 2014 and beyond", PCIe 4.0 would be my answer. By the time PCIe 3.0 x4 becomes a bottleneck, PCIe 4.0 x4 will be available and enthusiasts will likely have an urge to upgrade regardless of what they have today.
That's completely out of context. We're talking about this platform, not something coming out in 2015 or 2016! I was saying that you have to look at the demands of the software that will be out during the first couple of years after this platform is released, in order to judge whether PCIe x4 or x8 would be a bottleneck on it, because that's the minimum period of time when people who buy these will actually be using them.
June 22, 2013 4:50:08 AM

bit_user said:
And we're not talking about "a few cycles more or less", but rather something like feeding 100 MB of scene geometry to the physics engine, for instance.

You shouldn't be feeding geometry data to the GPU; you upload it to the GPU's RAM once and then reuse that. The only traffic afterward is scene updates and the CPU does not need to update the whole scene at once. Even without caching scene geometry to GPU RAM, it can still upload geometry one object at a time as it gets done processing each of them and process the next object while geometry for the previous one is being transferred.

bit_user said:
Of course, I have no idea how realistic 100 MB is, and clever game devs will try to keep as much data on the graphics card as possible and overlap as many things as they can. But even if you cut my number down to 10 MB and assume that bus channel is available 50% of the time, you still come out with spending 30% of the time potentially waiting on just the bus @ 60 fps, in the PCIe 3.0 x4 case.

If you keep everything you can possibly leave in GPU RAM in GPU RAM, the bus would be free for control and data ~100% of the time and if programmers do their job right, they would be interleaving GPU commands/data transfers with other processing to avoid stalling on IO backlog.
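
As a toy sketch of what "interleaving transfers with other processing" looks like (a generic queue-and-worker pipeline standing in for asynchronous uploads; no real graphics API involved):

    # Prepare object N+1 on the CPU while object N is still "in flight".
    import queue
    import threading
    import time

    transfer_queue = queue.Queue(maxsize=2)   # small buffer, like double-buffering

    def transfer_worker():
        while True:
            item = transfer_queue.get()
            if item is None:
                break
            time.sleep(0.002)                 # stands in for the bus transfer
            transfer_queue.task_done()

    worker = threading.Thread(target=transfer_worker)
    worker.start()

    for obj in range(8):
        time.sleep(0.002)                     # stands in for CPU-side scene work
        transfer_queue.put(obj)               # hand off instead of waiting

    transfer_queue.join()
    transfer_queue.put(None)
    worker.join()
    print("CPU work and transfers overlapped instead of strictly alternating.")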

bit_user said:
Don't think we won't start seeing this, since the APUs used in the upcoming consoles have no bus separating the CPU and GPU.

No matter how fast the interconnect between CPU and GPU is, you still wouldn't want to process a large geometry/scene blob before starting to send it to the GPU, even if you use shared memory with a zero-copy API that technically has 0 ms latency, since the GPU still needs early access to scene data to start rendering the next frame, and the GPU likely needs more cycles to render stuff than the time it takes the software to prepare the necessary data.

bit_user said:
Fortunately, this point can quite easily be refuted, on the basis of PCIe scaling studies Tom's has actually done. In an article they wrote about PCIe 2.0 scaling, nearly three years ago, they stated:

Quote:
we did see a fairly large difference between x8 and x16 slots

Source: http://www.tomshardware.com/reviews/pcie-geforce-gtx-480-x16-x8-x4,2696-17.html
Which flies in the face of your above assertion.

"fairly large" being 1-5% in most games they used, which is negligible in my book.

Here is a newer review of PCIe performance scaling using Ivy Bridge, a HD7970 and a GTX680...
http://www.techpowerup.com/reviews/Intel/Ivy_Bridge_PCI...

In most cases, there is only a 1-2% difference between x4 and x8.
June 22, 2013 8:18:09 AM

This is quickly descending into technical nitpicking that's of questionable relevance to the original point. I'm just saying...

InvalidError said:
bit_user said:
And we're not talking about "a few cycles more or less", but rather something like feeding 100 MB of scene geometry to the physics engine, for instance.

You shouldn't be feeding geometry data to the GPU; you upload it to the GPU's RAM once and then reuse that.
Of course you wouldn't spoon-feed primitives to the GPU, one at a time. That's obviously not what I meant. But you need to send/re-send geometry that changed, hence my revised figure of 10 MB. However, not being a game dev on a AAA title, I can't say for sure how much scene geometry must be updated by the CPU for every frame, or even comment on whether a "retained mode" is what modern games actually use.

InvalidError said:
If you keep everything you can possibly leave in GPU RAM in GPU RAM, the bus would be free for control and data ~100% of the time and if programmers do their job right, they would be interleaving GPU commands/data transfers with other processing to avoid stalling on IO backlog.
In pretty much any game worth playing, stuff is happening in the game world around the player. Therefore, it would not be possible just to dump everything in GPU memory (assuming it's big enough) and simply sit back and tweak camera angles. And if the GPU is being used to accelerate physics or AI, as in my example, then some data must flow back to the CPU, hence the blocking.

InvalidError said:
bit_user said:
Fortunately, this point can quite easily be refuted, on the basis of PCIe scaling studies Tom's has actually done. In an article they wrote about PCIe 2.0 scaling, nearly three years ago, they stated:

Quote:
we did see a fairly large difference between x8 and x16 slots

Source: http://www.tomshardware.com/reviews/pcie-geforce-gtx-480-x16-x8-x4,2696-17.html
Which flies in the face of your above assertion.

"fairly large" being 1-5% in most games they used, which is negligible in my book.

Here is a newer review of PCIe performance scaling using Ivy Bridge, a HD7970 and a GTX680...
http://www.techpowerup.com/reviews/Intel/Ivy_Bridge_PCI...

In most cases, there is only a 1-2% difference between x4 and x8.
The article I cited is nearly three years old. If they saw a difference way back then, surely the disparity is larger now. As for the article you cited, the problem with their conclusions is that they averaged over all resolutions - including ones that are heavily fill-rate limited and too slow for any serious gamer to actually use. If you restricted it to resolutions that are actually usable, I think you'd probably see the gap has increased slightly since 2010.

Now, by the time Haswell-E systems hit the streets, even that article will probably be 2.5 years old and the situation will have shifted further. And that's just on the release date of Haswell-E. In the year+ that follows, while it's still the highest-end platform available, the situation will continue to evolve. So, all of this is very speculative and only time will tell the answer.
June 22, 2013 12:13:42 PM

bit_user said:
The article I cited is nearly three years old. If they saw a difference way back then, surely the disparity is larger now. As for the article you cited, the problem with their conclusions is that they averaged over all resolutions - including ones that are heavily fill-rate limited and too slow for any serious gamer to actually use.

The article I cited does not average over all resolutions. Each test case has separate graphs for every resolution used and for the most common modern resolution (1920x1080), there is less than 5% difference between x4 and x16 in most games tested.

As for "the disparity getting larger", I'm not really seeing that. Drivers, APIs, game frameworks/engines and programmers are getting better at working around latency and non-deterministic GPU/driver/API behavior. To make SLI/CFX work better, they have no choice but to improve on that since everything is becoming that much more unpredictable in the process with more hardware and software involved. Getting the most out of future GPUs will likely need it just as much due to the large amount of compute resources that need scheduling and the inherent challenge of keeping 2000+ GPU threads busy without overwhelming the host CPU. You cannot do that if your software is written to keep the GPU on a tight leash so the whole rendering/physics pipeline needs to become more loosely coupled to make things more efficient.

Sound familiar? This is fundamentally the same sort of effort programmers need to make to leverage multi-core/multi-thread CPUs. The more large-scale parallelism you are attempting to leverage, the more vitally important it becomes to reduce or eliminate inter-dependencies between threads and stages while structuring algorithms to tolerate more latency. Code with multiple or complex inter-dependencies and low tolerance to latency will usually scale miserably; sometimes to the point of performing worse than its non-threaded equivalent.
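
A tiny illustration of that scaling point (pure sketch; the sleeps stand in for latency-bound waits, which is exactly the kind of thing concurrent threads can overlap but a dependency chain cannot):

    # Independent tasks overlap their latency; a dependency chain serializes it.
    import time
    from concurrent.futures import ThreadPoolExecutor

    def latency_bound_step(x):
        time.sleep(0.05)          # stands in for waiting on a GPU/driver/IO result
        return x + 1

    start = time.time()
    with ThreadPoolExecutor(max_workers=8) as pool:
        list(pool.map(latency_bound_step, range(8)))
    print(f"independent tasks: {time.time() - start:.2f} s")   # ~0.05 s

    start = time.time()
    value = 0
    for _ in range(8):
        value = latency_bound_step(value)     # each step needs the previous result
    print(f"dependency chain:  {time.time() - start:.2f} s")   # ~0.40 s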

Even if you eliminate communications-bound latency, you still have to deal with computation-bound latency, so you still need to write code with (usually unpredictable) latency in mind. This is not going away even with GPGPU or whatever its brand-specific name and complementary features might be.

As for PCIe 4.0, with Broadwell seemingly getting delayed, I would not be too surprised if it launched with the necessary hardware on-chip, just waiting for 4.0 certification if it does not have it from the start much like SB-E did.