Low Quad Core Optimization For Recent Gaming

June 3, 2011 9:11:30 PM

Hello everyone. I have a problem here, or rather something I realized over the last month.

I recently got an nVidia GTX 560 Ti (nice card, by the way) and I started playing some of the most demanding games on the market.
The Witcher 2, for example, and Crysis 2.

I realized that my GPU usage never went over 50 or 60%, and the average was around 35%. Why, I asked myself.
I started thinking that maybe my CPU was the bottleneck. I have a Quad-Core Q8300, 2.5 GHz running at 2.67 GHz. (This is the Quad-Core with the big cache size, so it should be fast; it has even been shown outrunning Core i5s in a lot of benchmarks.)

Well, here is the thing: I said IT COULD BE MY BOTTLENECK! and started to check the CPU usage of all of today's games: Dirt 3, Witcher 2, Crysis 2, Shogun 2, etc.

Here is the result:

- None of the games used more than 50 or 60% of the CPU across all cores. So the average CPU usage is about 50% at all times.

Conclusion:

- None of today's games really uses 4 cores. They use only 2 cores, and Windows spreads that load over the four cores at about 50% each: 4 × 50% = 200%, the same as 2 × 100% = 200%.

- So in the end, it's not important whether you have 4 cores, 8, or even 16. You will not be using them. None of these games will use 100% of all 4 cores of a quad-core, not even close to 85%.

Again, in the end the only thing that matters is the CPU speed per core. In this case 2.67 GHz.

Example:
For gaming, having a 2.67 GHz Quad-Core is the same as having a 2.67 GHz Core 2 Duo. I tested it myself. SAME FRAME RATES.

Even a Core i7 downclocked to 2.67 GHz gives the same frame rates with the same video card (GTX 560 Ti). The only difference is that a Core 2 Duo will show 100% CPU usage, the Quad 50%, and the Core i7 with its 8 logical cores 25%, provided your video card(s) meet the requirements.
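
A minimal sketch of the arithmetic behind those figures, assuming the game keeps exactly two threads fully busy and nothing else is running. The function name is made up for illustration.

```python
def overall_cpu_usage(busy_threads: int, logical_cores: int) -> float:
    """Overall CPU usage in percent when `busy_threads` threads are each 100% busy."""
    # A thread can occupy at most one logical core at a time.
    saturated = min(busy_threads, logical_cores)
    return 100.0 * saturated / logical_cores


# Two busy game threads on the three CPUs mentioned in the post.
for name, cores in [("Core 2 Duo", 2), ("Quad-Core", 4), ("Core i7, 8 logical cores", 8)]:
    print(f"{name}: {overall_cpu_usage(2, cores):.0f}% overall usage")
```

The overall figure is an average over all logical cores, so the same two busy threads read as 100%, 50% and 25%.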

There is no way the game companies should be selling games as QUAD-CORE OPTIMIZED, because they are not.

Please run a game, check your CPU usage, and post it here with your CPU model along with the video card usage. You will realize, like most people, that unless you have a core clock higher than 3.0 GHz you may not be using all your system resources, which means that if you have a HIGH-end video card you may not be using it fully.

Thanks in advance and please comment.
June 4, 2011 5:05:52 AM

The Q8300 has a different architecture from the Core i series, and the Core i chips are more powerful. Some games today do use all four cores, and dual cores struggle to keep up good frame rates. I still have my Q8400 overclocked to 3.2 GHz, but it's not that fast anymore compared to the 2500K I have now. I think if you overclock yours you may squeeze a bit more out of it to get by with your GTX 560. I can give you one example: the poorly optimized GTA 4 sucks on dual cores but runs decently on quad cores.
June 4, 2011 5:28:53 AM

That is the problem, actually. Games run "decent" on PC, and I mean all games. I overclocked my quad-core to 3.0 GHz and saw no real difference, just a few FPS (7-10). The issue remains that even on a Core i7 or a quad-core not all of the CPU is used. This has nothing to do with architecture. I'm an electronic engineer; architectures may change and the overall performance will differ between processors, but in the end, if you have 50% usage on one core, that means half of the time it is doing NOTHING at all.

So 55% CPU usage in The Witcher or NFS: Shift 2 means that 45% of the time my CPU is doing nothing.

Yeah, if I overclock it I might get better FPS, because it will be at the same 55% but at a higher clock speed.

The issue remains this: mathematically, if your CPU is like mine at 2.5 GHz and you overclock it to 3.0 GHz, you will get a 20% increase in FPS. If you are getting 35 FPS you can get 35 + 20% = 42 FPS. (Approximately, if your video card and memory can handle the work.)

Now, if you were using 98% of your CPU, that would mean an increase of 48% in games like The Witcher (which only takes my CPU to 50%). So if you are getting 35 FPS, 35 + 48% = 52 FPS (again, approximately).

That means overclocking gives you 7 FPS, and building the game right gives you almost 20 FPS.
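
The numbers above can be reproduced with a few lines. This is only a sketch: it assumes FPS scales linearly with CPU clock, which is optimistic, and it takes the 48% figure as the post's own estimate. Both function names are hypothetical.

```python
def fps_after_clock_change(fps: float, old_ghz: float, new_ghz: float) -> float:
    """Assumes FPS scales linearly with CPU clock (an optimistic simplification)."""
    return fps * new_ghz / old_ghz


def fps_after_percent_gain(fps: float, gain_percent: float) -> float:
    """Applies a percentage gain to a frame rate."""
    return fps * (1.0 + gain_percent / 100.0)


overclocked = fps_after_clock_change(35, 2.5, 3.0)   # 2.5 GHz -> 3.0 GHz is +20%
better_threading = fps_after_percent_gain(35, 48)    # the post's 48% estimate
print(f"overclock 2.5 -> 3.0 GHz: {overclocked:.1f} FPS")
print(f"48% gain from better CPU usage: {better_threading:.1f} FPS")
```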

The point here is that companies are selling these as Quad-Core OPTIMIZED GAMES when they are NOT OPTIMIZED at all. The CPU just sits there half of the time doing NOTHING at all, just resting. This means very bad resource usage and poor engineering work.

So here I am, looking for a way to improve this USAGE (not FPS by overclocking) by using your actual resources the way they were built to be used, instead of charging people money for something they are not getting at all.
June 6, 2011 7:19:57 PM

Hmm, I don't know where you get all your information, but with my 2500K, when I play BC2 my CPU usage goes close to 100% on all four cores... Most newer games will use all the cores. My CPU usage is even around 100% while playing Counter-Strike: Source, and that game is old.

Crysis 2 might be unoptimized because, even though it is new, it was made for consoles, not PC, and it is not that demanding compared to the original Crysis.

June 6, 2011 8:44:35 PM

http://www.grandtheftpc.com/2010/03/list-of-quad-core-o...
It takes a lot of work to optimize a game for 4 cores. My understanding is that in most games the environment takes one core and the character takes another, then they have to find other things to dedicate to the other cores, but they can have 2 cores do the environment. That is my understanding, but I could be wrong.
June 6, 2011 8:53:11 PM

You are wrong; you don't attach instructions to cores, it doesn't work that way.
You just have to build your game out of multiple threads, and make those threads small so none of them holds a core for a long time. That way they are spread evenly over all the cores.

For example, I create a thread for grass, another for trees, another for faces, etc., and make sure that all the threads take about the same time to execute (in CPU ticks).

If you make one huge thread that generates the whole environment, it will take one core and run step by step, leaving the other cores free (if they have finished their tasks). This is "bad optimization".

If you cut that thread into multiple threads, Core 1 can be processing trees while Core 2 does water, Core 3 something else, and so on.
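
As a rough illustration of that idea, here is a small simulation rather than real game threads. It assumes each task simply goes to whichever core frees up first and that tasks do not depend on each other. The function name and the task costs are invented for the example.

```python
import heapq


def run_on_cores(task_costs, cores=4):
    """Greedy scheduling: each task goes to the core that becomes free first.

    Returns (frame_time, average_core_usage_percent).
    """
    free_at = [0.0] * cores          # time at which each core becomes free
    heapq.heapify(free_at)
    for cost in task_costs:
        start = heapq.heappop(free_at)       # earliest available core
        heapq.heappush(free_at, start + cost)
    frame_time = max(free_at)
    usage = 100.0 * sum(task_costs) / (frame_time * cores)
    return frame_time, usage


one_big = [12.0]            # the whole environment in one huge thread
many_small = [1.0] * 12     # the same work cut into 12 equal pieces

print(run_on_cores(one_big))     # one core busy, three idle
print(run_on_cores(many_small))  # work spread over all four cores
```

With the same total work, the single huge task leaves three cores idle for the whole frame, while the equal-sized pieces keep all four busy and finish in a quarter of the time.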

The problem is that engines are not optimized by default because, depending on your game environment and number of objects, the right optimization will change. Companies need to start focusing on optimizing their games so that the engine can handle them, or on modifying the engine for better performance.
June 6, 2011 8:58:03 PM

DarthSidious said:
For example, I create a thread for grass, another for trees, another for faces, etc., and make sure that all the threads take about the same time to execute (in CPU ticks).

That's what I remember reading about, and if I remember correctly it was that some threads take longer to complete, or something along those lines...
June 6, 2011 9:17:33 PM

Yeah, exactly, so don't make threads so huge, because they will take a long time to complete and the other cores will be doing nothing at all. Create multiple threads to do the task, so all cores can work at the same time.
June 7, 2011 12:59:41 AM

Well....

I tested The Witcher 2 update v1.2 today. It has some CPU improvements; I got about a 15% improvement on average in the game. Try it out.

See, this is what I'm talking about: companies trying to improve gaming. I hope they develop engines that can reach 100% CPU usage soon.
June 7, 2011 6:43:43 PM

DarthSidious said:
Well....

I tested The Witcher 2 update v1.2 today. It has some CPU improvements; I got about a 15% improvement on average in the game. Try it out.

See, this is what I'm talking about: companies trying to improve gaming. I hope they develop engines that can reach 100% CPU usage soon.

Much of the time the reason your CPU doesn't reach 100% usage is a GPU bottleneck; or, if your GPU usage isn't at 100%, then your CPU is the bottleneck.
June 7, 2011 6:59:30 PM

Haha... the problem, man, is that none of them is at 100%. See, there is no bottleneck, just bad usage.
June 8, 2011 12:32:10 PM

DarthSidious said:
Haha... the problem, man, is that none of them is at 100%. See, there is no bottleneck, just bad usage.



Well then the problem is your computer, not the game.
June 8, 2011 1:09:12 PM

ineedaname8 said:
Well then the problem is your computer, not the game.

lol, care to enlighten us what the problem is when neither the CPU nor the GPU is at 100% load when gaming?

Let me give you a hint: think of a game from 2001. Take that game and run it on your modern rig; think you will get 100% CPU load?

So unless there's a clear bottleneck somewhere else on the OP's machine (which you have yet to point out), there's no reason to believe that hardware is at fault for not running at full capacity.
June 8, 2011 4:13:20 PM

That's right, at least one part of the PC should run at 100% usage on long-running tasks like gaming, video encoding or rendering.

The 4 important parts in rendering are RAM, CPU, HDD and video card.

With 4 GB of RAM and no virtual memory, RAM and HDD are out of the question for being at 100%. So that leaves the video card and the CPU.

If none of them is at 100%, that means bad resource usage.

I'll give you an example: a 1998 game will use Core 0 at 100% and Cores 1, 2, 3 at 0%. That is bad CPU usage. If you split this game into 4 "equally heavy" tasks, you will get 100% on all cores, making the 1-core game run 4 times faster.

Now let's take Need for Speed: Shift 2, for example: Core 0 is at 90% usage while Cores 1, 2, 3 are at 25%. This means that the tasks do not carry the same load.

In this case you can say the game is not well programmed, or at least not Quad-Core optimized or multi-threading optimized.

Even in games that say they run better on a Core 2 Duo, you will not see 100% usage on all cores, but at least something like 99% and 85%. The problem is that developers still use the same coding techniques, and scaled to 4 or more cores this is a disaster.
June 8, 2011 4:58:55 PM

Well from one point of view I can see why this is the case, multi cores are still relatively new. A lot of game devs focus on developing the game content first and thinking about optimizing things later. Trouble is once you got the game working it's harder to chop it into bits and pieces. There are certain practices that developers have to follow to make sure the initial code write is optimized, the initial engine is meant to work well with a multi core system and then add game content on top of that.

So once all devs get around to learning HOW to write software for multicores properly, and then actually follow the rules for code development right from the start, we might hope to see good multicore utilization.
June 8, 2011 7:17:22 PM

If you take an old game that does not use multiple cores, then either at least one of your cpu cores should be at 100% or your gpu should be at 100% or both.

The only way neither one would be around 100% is if the game uses an FPS cap but your computer can produce more than the cap; then your CPU and GPU usage will both be under 100%.

If your game only uses two cores and you have 4 cores, then yes, your CPU will max out at 50% overall usage because it cannot use the other cores, but the cores being used will be at 100% usage.

But once again, unless there is an FPS cap, either your GPU should be at 100% usage or each core the game does use will be around 100% usage.
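
That rule of thumb can be written down as a small check. This is a sketch only: the function name and the 95% threshold are assumptions chosen for the example, and real monitoring tools report averages that can hide short spikes.

```python
def likely_bottleneck(core_usage, gpu_usage, fps_capped=False, threshold=95.0):
    """Classify a usage reading according to the rule of thumb in the post above.

    core_usage: per-core CPU usage in percent, e.g. [100, 100, 5, 3]
    gpu_usage:  GPU usage in percent
    """
    cpu_bound = max(core_usage) >= threshold   # at least one core saturated
    gpu_bound = gpu_usage >= threshold
    if cpu_bound and gpu_bound:
        return "CPU and GPU"
    if gpu_bound:
        return "GPU"
    if cpu_bound:
        return "CPU"
    if fps_capped:
        return "FPS cap"
    return "unclear"


print(likely_bottleneck([100, 100, 5, 3], gpu_usage=60))    # game uses two cores
print(likely_bottleneck([40, 40, 40, 40], gpu_usage=45))    # the case argued about in this thread
```

The second reading is the one this thread is about: nothing is saturated and there is no frame cap, so the simple rule gives no answer.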
June 8, 2011 8:22:28 PM

That's all true, ineedaname8.

The problem today's games are showing is that, without reaching an FPS cap, neither the GPU nor even one of the cores is at 100%. So you can see bad usage there. The CPU spends time doing nothing, waiting for other threads within the same game to finish on another core.

For example, let's say there is a thread to do trees, another to do water, and a last one to do the character, but water needs the trees to be built before it can start working, and the character needs the water to generate reflections. If we load the 3 of them onto the Quad-Core at the same time, this will happen:

Step 1: Core 0 -> running trees | Core 1 -> building water polygons | Core 2 -> building char polygons
Step 2: Core 0 -> running tree shaders | Core 1 -> waiting for trees (idle) | Core 2 -> waiting for water (idle)
Step 3: Core 0 -> finished | Core 1 -> building water polygons | Core 2 -> waiting for water (idle)
Step 4: Core 1 -> finished | Core 2 -> building char shaders
Step 5: Core 2 -> finished

Now we can see that the CPU sat idle 3 times on 2 of the cores and executed only 6 tasks in total (polygon building and shaders), so about 50% CPU usage here.
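
The 50% figure can be checked by counting busy and idle slots in the timeline above. This sketch assumes every step takes the same amount of time and counts a core that has already finished as idle; the grid simply copies the task labels from the post.

```python
# Rows are time steps, columns are Core 0, Core 1, Core 2; None means idle.
timeline = [
    ["trees",        "water polygons", "char polygons"],
    ["tree shaders", None,             None],            # water waits for trees, char waits for water
    [None,           "water polygons", None],            # Core 0 has finished
    [None,           None,             "char shaders"],  # Core 1 has finished
]


def usage_percent(grid):
    """Share of (core, step) slots in which a core was doing work."""
    slots = sum(len(row) for row in grid)
    busy = sum(1 for row in grid for cell in row if cell is not None)
    return 100.0 * busy / slots


print(usage_percent(timeline))  # 6 busy slots out of 12
```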

This is the problem today's games show in CPU usage. If the game were well optimized it could be doing other stuff with those cores (actually, Windows will hand the core to something else during that idle time), but here is the problem.

As you can see, the tasks are blocking each other, so until the trees are done, nothing else can finish. In a huge program you will reach a point at which, until Core 0 is done, nothing else can be done in the current frame, and in the end you will have poor CPU usage. New drawing methods are required in the new engines, letting other tasks start based on approximations or cached environments.

Game companies are just lazy and try to develop games fast with a lot of content. In the end, nothing is better than a well-designed game. Is content important? Yeah, but it's useless if you run the game at 20 FPS with lag and artifacts.

People remember well-developed games and graphical innovation. Performance should go hand in hand with all this at all times.
For example, I have to say that Crysis 2 was not the super heavy game everyone was expecting: better graphics than Crysis 1 and better performance. I enjoyed Crysis 2 much more than Crysis 1; I was not annoyed by lag or artifacts on the screen. CryEngine 3 is almost completely developed for multi-core; it reaches almost 90% usage on a Quad-Core.

The difference: CryEngine 3 took several years to build and a lot of engineering work, and they delivered one of the most wanted games of 2011, while games like NFS: Shift 2 are just a pathetic excuse to take money away from people.

I say that companies should do their homework and stop playing with us. I work my ass off to get a Quad-Core and a GTX 560 Ti, and their titles are not exactly cheap at $50 for each crappy game, only to realize they suck at performance.

It's impossible that a graphically huge game like Crysis 2 runs at 70 FPS on my PC while lower-end games like Shift 2 or The Witcher 2 (one older and one newer than Crysis 2) won't even reach 40 FPS, with a lot of lag when it rains (Witcher) or when it's dark (NFS).
June 8, 2011 8:53:55 PM

GTA 4 uses all four cores; you lose FPS for every core you disable. F1 2010 is quad-core optimized: with all but 2 cores disabled the game is unplayable at about 15 FPS with lows of 1 or 2, but with 4 cores plus HT it gets 36 FPS. When I didn't restrict the cores it was at about 30% usage, but with only 2 cores it was at 15%.
My suggestion is that you get a job at a developer and show them how to do it right.
June 12, 2011 5:39:05 AM

Haha, I tried that; they seem to be a closed community, hehehe. I really would like to see some good stuff in upcoming games like the new Elder Scrolls. I have some solutions for CPU usage here, and I'll be posting them as they come along.

Mafia 2: Delete the cloth folder if you are going to use PhysX. Why? Because it is poorly done on the CPU.

The Witcher 2: Install update 1.2, turn ubersampling off, force anti-aliasing in the NVIDIA control panel and turn it off in the game settings, don't use Vsync, turn off blur.
June 13, 2011 1:22:47 PM

DarthSidious said:
Haha, I tried that; they seem to be a closed community, hehehe. I really would like to see some good stuff in upcoming games like the new Elder Scrolls. I have some solutions for CPU usage here, and I'll be posting them as they come along.

Mafia 2: Delete the cloth folder if you are going to use PhysX. Why? Because it is poorly done on the CPU.

The Witcher 2: Install update 1.2, turn ubersampling off, force anti-aliasing in the NVIDIA control panel and turn it off in the game settings, don't use Vsync, turn off blur.

Oh, dude, that is useful information. You might want to start another topic for such recommendations, with a more appropriate title than the current one.

Otherwise, I just checked out a presentation the DX team did in December 2010. Multithreading is still a wild beast in the gaming community, and they are still improving DirectX itself to get better performance in games, let alone the developers trying to use it.
June 15, 2011 12:46:13 AM

Hehe, yeah, I'll make a post with all the games I play or have played and, if I can, a detailed performance hit for every in-game option, plus mods for config files and fixes.

I'll keep this thread for news on CPU usage improvements in today's engines. Hoping the new Elder Scrolls teaches people how to do stuff.

And I loved your joke. In the same spirit I say, there are some people that don't know how to use 100 cores. :)

BTW, I tested some stuff and came up with this.

DirectX 9 sucks at quad cores, hehe.

DirectX 10 sucks at quad cores.

DirectX 11 sucks at quad cores.

Updates are needed.
June 15, 2011 8:39:23 PM

Quote:
DirectX 9 sucks at quad cores, hehe.

DirectX 10 sucks at quad cores.

DirectX 11 sucks at quad cores.


The DX API has NOTHING to do with CPU core usage.

It's clear from the comments here that people have very little idea how threads are actually managed by the Windows scheduler. It's also clear that people don't understand that constantly running 100% loads on all processing cores is NOT a good thing [quite the opposite, actually...].

Windows essentially works like this: the highest-priority thread that is ready to run (not blocked on I/O) will ALWAYS run. Priority is a value between 0 and 31 [levels 16-31 are the real-time range, which ordinary applications do not normally use], and threads that are ready to run but are not being executed gradually have their priorities raised so they get a turn to run. [Essentially, Windows is a mix between Round Robin and a Priority scheme, insofar as it's impossible for high-priority threads to block out a low-priority one forever.]
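
A toy model of the rule described above, to make the "waiting threads get boosted" part concrete. This is NOT the actual Windows algorithm: it assumes a single core, one tick per time slice, a +1 boost per tick spent waiting, a cap of 15, and a drop back to base priority after running. All names and numbers are invented for the example.

```python
def simulate(base_priority, ticks, boost=1, max_priority=15):
    """Return the name of the thread that runs on each tick (toy model, single core)."""
    current = dict(base_priority)
    history = []
    for _ in range(ticks):
        # The ready thread with the highest current priority runs.
        # Ties go to the alphabetically first name so the result is deterministic.
        running = max(sorted(current), key=lambda name: current[name])
        history.append(running)
        for name in current:
            if name == running:
                current[name] = base_priority[name]   # back to base after running
            else:
                current[name] = min(max_priority, current[name] + boost)  # waiting: boost
    return history


history = simulate({"game": 10, "background": 4}, ticks=8)
print(history)
```

In this run the high-priority thread gets most of the ticks, but the low-priority one still gets a turn once its boosted priority catches up, so it is never starved completely.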

This is important to understand, because games especially do a LOT of IO [reading texture data, a LOT of matrix manipulation, etc], and thus many game threads are constantly swapped out of the CPU while waiting to complete an IO operation. As such, you shouldn't expect to ever see max CPU usage, simply because the threads are always being swapped out of execution.

Of course, this gets a bit more complicated in actual use; threads that require user interaction get a priority boost, you have low-level kernel threads with boosted priority, you have the OS attempting to figure out on its own what "type" of thread is being run, you have various locks between threads that impede execution, etc.

And all this is BEFORE you get into how the CPU actually assigns threads to a specific core...

Finally, I note this: when a core is NOT at 100% load, what that means is that the CPU is doing work faster than it is being assigned, which means that there is no need to increase speed any further. So when I see a game that uses two cores @ 50%, all that means is that the game is no longer CPU bound, and barring CPU architecture changes, one should not expect any increase in speed from overclocking, using more cores, or any other reason. So I find it hilarious when people complain a game is NOT using 100% of the CPU, as 100% usage actually points to a horrifically coded game that wastes a lot of CPU resources, rather than one that "threads well".

People REALLY need to learn how Windows works under the hood...
June 15, 2011 10:00:36 PM

I think everyone here understands that 50-50 on a dual core is bad usage, but are you saying that running 100-25-25-25 is better than running 100-100-75-75?
June 16, 2011 12:30:22 AM

No, I'm saying that I run at 40-40-40-40, haha, and in some games at 80-25-20-10. Now that is bad, no matter what you do. It should be 100-100-100-100 or at least 85-85-85-85.

That's the point, man. If you use a Core 2 Duo you will get 100-100, and on a quad 50-50-50-50, which is the same. Yeah, maybe the quad-core's architecture is faster than a Core 2 Duo's, and 100% usage on a Core 2 Duo may give you 5 FPS less than 50% usage on a Quad-Core, but still, if the games were designed to use 100% you would get double the frame rate, assuming your video card(s) can handle it.
June 16, 2011 12:56:44 AM

Well, there has to be a bottleneck somewhere, because if the CPU can work at 100% on at least 1 core and the GPU can handle that, then there's a bottleneck somewhere else, which is why I would assume they both work at under 100%, with the GPU and CPU working almost the same amount. Maybe the hard drive?
June 16, 2011 3:31:22 AM

No, man... I tested it all... it's game optimization, that's all.
June 16, 2011 3:56:04 AM

Just looking at it from an outside view, I see it like this: if a game is optimized for two cores and one core's job takes 1 unit of time and the other's takes 2 units of time, then the core with 2 units would be working at 100% and the other at 50%. That is true only if the other parts of the computer can keep up with it.
June 16, 2011 12:02:01 PM

Quote:
I think everyone here understands that 50-50 on a dual core is bad usage, but are you saying that running 100-25-25-25 is better than running 100-100-75-75?


Yes, and running 50-50-25-0 is even better, because you're not CPU bottlenecked. A CPU core running at 100% simply means the CPU is overworked, either because it's too slow, because the game is very badly coded, or because there is part of the code that could be run on another core to better balance the load.

If no CPU core is at 100% usage, then you are not CPU bottlenecked, and thus no further action needs to be taken to improve performance.

Quote:
No, I'm saying that I run at 40-40-40-40, haha, and in some games at 80-25-20-10. Now that is bad, no matter what you do. It should be 100-100-100-100 or at least 85-85-85-85.


No, it shouldn't, and that's what people don't understand. A game running 100-100-100-100 will run slow, because you have a major CPU bottleneck. Doing more work for no reason does not increase performance; just the opposite.

If I run a game on one core at 75% usage, then no amount of threading or using more cores is going to have ANY IMPACT WHATSOEVER on performance. Why? Because the CPU isn't being bottlenecked. It's that simple.
June 17, 2011 2:24:24 AM

You just don't get it. Here, let me explain.

Let's say you want to draw 1000 pics.

Steps for 1 pic:

1. Fetch data from HDD
2. Send data to memory
3. Process data on CPU
4. Send data to video card

Now, if you send all the pics you can, there should always be a bottleneck. For example:

HDD is slow, -> HDD at 100%, Memory at 40%, CPU at 40%, GPU at 40% (These are not real numbers, just an example)
CPU is slow, -> HDD at 40% , Memory at 40%, CPU at 100% , GPU at 40%
Memory is slow, -> HDD at 40% , Memory at 100%, CPU at 40%, GPU at 40%
GPU is slow, -> HDD at 40% , Memory at 40%, CPU at 40%, GPU at 100%
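
The pattern in those four lines can be sketched as a simple pipeline model. It assumes every picture passes through all four stages and that the slowest stage sets the pace for the rest. The capacities (pictures per second) are invented, like the percentages in the post, and the function name is hypothetical.

```python
def stage_usage(capacity):
    """Usage of each stage (%) when the slowest stage limits the whole pipeline.

    capacity: dict mapping stage name -> pictures per second that stage could handle.
    """
    throughput = min(capacity.values())   # the bottleneck sets the pace
    return {stage: 100.0 * throughput / cap for stage, cap in capacity.items()}


slow_hdd = {"HDD": 40, "Memory": 100, "CPU": 100, "GPU": 100}
slow_cpu = {"HDD": 100, "Memory": 100, "CPU": 40, "GPU": 100}

print(stage_usage(slow_hdd))
print(stage_usage(slow_cpu))
```

In this model exactly the slowest stage sits at 100% and everything else waits on it, which is the claim the post builds on.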

This means that in every computer operation there is always a bottleneck.
In games the bottleneck is usually the CPU or GPU, so if you are not frame-capping, either the CPU or the GPU should be at 100%.

The issue here is that neither the CPU nor the GPU is at 100%.

Memory is not the bottleneck, the HDD is fully defragmented, and I even tested on a solid-state drive.

The bottleneck here is the tasks on the CPU.

A core can only do 1 task at a time.
Imagine this for a moment. I tell you that you have to do these operations:

2+2
1+1
1+2
189,87 + (893,2 * 50)

You have 4 people and you give each of them one of these tasks. You tell them that when they all finish, a 5th person will write the answers on a screen and win a prize.

Now, the 5th person can't write until all the answers are ready.

It becomes clear that the top 3 tasks will be done in about 1 sec, but the last one will take longer. So 3 of the guys will finish in 1 sec and the last one will take 2 min, because the other 3 guys can't help him.

Now 3 guys will do nothing for 1:59 while 1 guy is working.

The final time for posting the answer will be 2:00 min.

This is what is happening: the bottleneck is in the distribution, see. 1 core is doing all the hard work because the other 3 finish fast, which leads to 100% - 25% - 20% - 20% CPU usage per core. On average that is 41% usage.
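
A small sketch of that example, assuming each core runs exactly one task and everyone waits for the slowest one. The function name is made up. The 100-25-20-20 figures are taken from the post as given and only averaged here.

```python
def per_core_usage(task_seconds):
    """Usage of each core (%) when every core runs one task and all wait for the slowest."""
    frame_time = max(task_seconds)
    return [100.0 * t / frame_time for t in task_seconds]


# The four-people example: three 1-second sums and one 2-minute calculation.
usage = per_core_usage([1, 1, 1, 120])
print([round(u, 1) for u in usage])

# Averaging the per-core figures quoted in the post.
post_figures = [100, 25, 20, 20]
print(sum(post_figures) / len(post_figures))  # 41.25
```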

Now, if you could take the work from the cores and distribute it evenly, you would get better CPU usage, and this would lead to more frames ready for the video card to process.
Yes, maybe you won't get to 100% on all cores, because the video card won't be able to handle the work and will reach 100% usage itself, but you will get a much higher frame rate.

Example:
The Witcher 2 had this problem in version 1.00. CPU usage was 40% and GPU usage was 45%. Update 1.1 improved CPU performance. They improved the way they send tasks to the cores. How? By making tasks smaller. This makes all the cores finish their tasks for each frame in more or less the same time.
This allowed CPU usage to reach 55%. That means more frames ready to be processed by the GPU, which leads to a GPU usage of 60% now.

If you could bring all the games, or even just this one, closer to 100% usage on all cores (and I insist, GIVEN THAT ALL your other PC components can keep up with the work) you would get huge FPS improvements.

Now, it's clear that you can't make all tasks the same in all game environments; some have more grass, or more trees. But at least reach 80% usage on average over the whole game.

This is the point, man: today's gaming bottleneck is the software itself, not the hardware. Yes, if you improve your hardware, frame rates will improve, but here is why:

In the math example, imagine that I give the tasks to 10-year-old kids or to 25-year-olds. Yes, the 25-year-olds will finish faster, but why? Because the kid that has to do the huge math operation will take longer than the 25-year-old on the same operation. Even with the 25-year-olds, 3 of the guys will do nothing. In the CPU it is the same. Yes, in a Core i7 each core is faster than in a Quad-Core. But you are still not using the group of guys the right way. If we improve the way the tasks are done by the 4 guys, I bet you can make the Quad-Core beat the badly optimized Core i7.

Example:

Imagine that on the Core i7 the 3 guys finish in 0.8 sec and the last one finishes in 1:30 min.
Now, that will beat the Quad-Core's 2:00, right.

Now imagine this: on the Quad-Core I make the 3 guys finish in 1 sec, and I have a way (clearly not possible with humans, hehe) to make all 4 guys work on the huge math operation. There are 4 of them, so they will finish it in 1/4 of the time. That is 30 secs. Now the total time will be 30 sec. That is 3 times faster than the Core i7.

Now, clearly, if I use this same idea on the Core i7 it will do it in 20 sec maybe, showing that it is better, yes, but we took the time from 2:00 min down to 30 secs.
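
The same numbers as a quick calculation. It assumes the big task splits perfectly across the workers with no overhead, which real code rarely achieves, and it ignores the 1 second spent on the small tasks, as the post does. The function name is hypothetical.

```python
def split_time(big_task_seconds: float, workers: int) -> float:
    """Time for the big task if it could be split perfectly across `workers`."""
    return big_task_seconds / workers


quad_unsplit = 120.0                      # 2:00 on the Quad-Core
i7_unsplit = 90.0                         # 1:30 on the Core i7
quad_split = split_time(quad_unsplit, 4)  # all 4 guys share the work
i7_split = split_time(i7_unsplit, 4)

print(f"Quad-Core, split 4 ways: {quad_split:.1f} s")
print(f"Speedup over unsplit Core i7: {i7_unsplit / quad_split:.1f}x")
print(f"Core i7, split 4 ways: {i7_split:.1f} s")
```

Under these assumptions the split Core i7 lands at 22.5 s, close to the "20 sec maybe" guessed in the post.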

THIS is the problem: developers need to discover a way to make all cores (guys) work together, or at least a way to improve their "working together". That is exactly what multi-core stands for.


I STRONGLY recommend that you read something about microprocessor pipelines, algorithms, and the ALU. That will explain my point further, and of course the most important part to read about is interlocked tasks in microprocessors.

I can't explain myself any better; this subject takes a lot of knowledge of microprocessors and architecture. Even at a fast-paced university it would take you at least 6 months to a year of study, but Wikipedia and Google can give you a quick idea, like my example, which I hope helps.

Sorry if I failed to explain it well; my English is not that good, and it's a hard subject to write about.
June 17, 2011 1:14:59 PM

Quote:
This means that in every computer operation there is always a bottleneck.
In games the bottleneck is usually the CPU or GPU, so if you are not frame-capping, either the CPU or the GPU should be at 100%.


Yes, in every PC, for a given application, one component will by definition be a bottleneck. Your assumption however that the bottleneck is either the CPU or GPU in games is horribly flawed, because you ignore other bottlenecks [specifically memory I/O]. You also make the assumption that at least one of your game threads is always being executed, which may or may not be true at any given point in time.

Quote:
This is what is happening, the bottleneck is in the distribution see. 1 Core is doing all the hard work because the other 3 finish fast, this leads you to 100% - 25% - 20% - 20% cpu usage in each core. In average it is 41% usage.


In THIS situation, yes, you do have a bottleneck, as one core is getting more work than it can do. I'm not arguing that. I'm arguing that this is a BAD thing, and better core loading could give you closer to a 75%-50%-20%-20% mix, eliminating the CPU as a bottleneck.

What I'm arguing is that seeing 100-100-100-100 is a BAD thing, as it indicates the CPU is a bottleneck, and that 50-50-25-20 indicates the CPU is NOT a bottleneck. In this case, situation two is far better coded than situation one, and will be significantly faster to boot [especially if some other process comes in and demands immediate CPU time, where you run a real risk of a critical game thread being forced out, forcing everyone else to wait].

Quote:
Imagine this for a moment. I tell you that you have to do these operations:

2+2
1+1
1+2
189,87 + (893,2 * 50)

You have 4 people and you give each of them one of these tasks. You tell them that when they all finish, a 5th person will write the answers on a screen and win a prize.

Now, the 5th person can't write until all the answers are ready.

It becomes clear that the top 3 tasks will be done in about 1 sec, but the last one will take longer. So 3 of the guys will finish in 1 sec and the last one will take 2 min, because the other 3 guys can't help him.

Now 3 guys will do nothing for 1:59 while 1 guy is working.

The final time for posting the answer will be 2:00 min.


Remember, Windows is smart enough to know when a thread is IO blocked, so in this case, the thread that actually writes the answer in question will not be loaded until all its parts are complete. So in your example, this is what the cores are actually doing:

Core0: Spends 1 second doing math operation, then spends 1:59 doing some other work.
Core1: Spends 1 second doing math operation, then spends 1:59 doing some other work.
Core2: Spends 1 second doing math operation, then spends 1:59 doing some other work.
Core3: Spends 2:00 doing math operation.

This gets you to 2:00 after you started, where the thread that actually outputs the answer is no longer IO blocked. In this case, any one of the available cores [probably Core3, based on the fact it needs a new thread to work on] will load that thread and output the result.

Now, here's the main point I want to make: for the first three cores, for a full 1:59, they are NOT working on your program, as they've already been assigned new threads. They are doing some amount of work; could be 5% of their processing capacity, could be 100%. The only thing you can guarantee in regards to your program is that Core3 will be doing work for 2:00 at 100% of its capacity [a CPU bottleneck].

What the other three cores are doing between 0:01 and 1:59 is IRRELEVANT as far as your program is concerned, as they are not working on your program. So the argument that making them go to 100% will somehow improve performance is laughable.

The above is a perfect example of badly designed software. If you have a thread that takes that long to execute, and some other thread dependent on those results, they should be combined into a single thread. The extra threading does not give any performance benefit, due to the inherent CPU bottleneck. In fact, the extra thread could REDUCE performance, as some other higher-priority system thread could theoretically pre-empt the thread that outputs the results you've waited 2:00 for, even though it's finally not IO-blocked, and that's before considering the overhead of creating/managing a thread in the first place.

BTW, this example is also why I favor Intel's "strong core" approach over AMD's "massively multicore" approach, as AMD's approach chokes on heavy process workloads.

Quote:
THIS is the problem, developers need to discover this way to make all cores (guys) work together. Or at least a way to improve their "working together" that is exactly what multi-core stands for.


You can't, at least as far as CPUs are currently designed. The example you gave is an example of a situation where more cores does not lead to faster performance, due to the bottleneck on Core3.*

*Ok, technically, the math operation COULD be broken up into smaller segments, each run on its own core, but because the OS operates at the thread level, it would be up to the CPU hardware to recognize the math operation and break it up across all the cores. The downside, of course, is that other threads don't get a chance to run, which decreases system performance for the benefit of application performance and leads you into a cost-benefit analysis.

Quote:
I STRONGLY recommend that you read something about microprocessor pipelines, algorithms, and the ALU. That will explain my point further, and of course the most important part to read about is interlocked tasks in microprocessors.


Aside from 15 years of coding programs for Windows and Linux, I happen to specialize in working with Real Time Operating Systems, where time is the most important thing to consider when writing an application. This is especially true given how most hardware I work with still operates on comparatively slow processors [MHz] with limited amounts of RAM available [KBs].

The point being, I damn well know how Windows handles threads and the like internally.
June 17, 2011 4:01:42 PM

Ok, you have good points there. The problem is that in graphics the cores can't do anything if the frame is not completely ready; they can't start another frame.

The problem seems to be that they don't really optimize the game. They chop it into some tasks, but there is still a huge one that occupies 1 core almost all the time.

I'm sorry if my opinion offended you. I'm an electronic engineer and I also work with low-MHz microprocessors (like 8 MHz, hehe), and I know optimization is hard to achieve, but it must be done if you want to improve. I also worked in game development once. It's hard to make a game ride a Quad-Core the way it should.