Does Windows know how to maximally assign workloads to threads/cores in a quad-core processor that has Hyper-Threading on?

December 22, 2013 1:03:42 AM

So, I have been trying to google for this answer, but I cannot find a direct technical answer to the question. (Or if I did, it was explained in a confusing way.)

So, say you have a program/game that uses 2 threads, and only 2 threads. With a quad-core CPU and Hyper-Threading on, will it always know to still use two PHYSICAL cores instead of just 2 hardware threads, using 50% of the CPU instead of 25% of it? Does Hyper-Threading ever cause problems of this type, artificially using half of what it really could/should use? How does Windows know how to handle all of this?

Does having the cores split up in threads decrease the efficiency vs. using 1 thread for 1 core, is essentially what I'm asking. I have seen some discussions and benchmarks related to hyper threading not affecting performance too much in one way or the other, but I am curious how it works.

Thanks! Sorry if the way I asked the question was odd, just trying to be clear over concise.

December 22, 2013 2:31:44 AM

Hyper-Threading is there to increase performance. Windows knows there are 4 cores and 8 threads, so my guess is that it would use threads #0 and #1 first, which would be what the BIOS assigns as cores #0 and #1.
December 24, 2013 7:47:24 PM

Okay, anyone want to get more technical with this? Is it able to let one thread take a full core and another thread take the second full core?
January 24, 2014 4:16:20 AM

To be clear, let's define these terms first:

Core - The concept of what an original CPU is. A core executes a software thread.

Software Thread - A sequence of instructions that a program follows.

Hyper-Threading - It's a fake core. Only a minute part of the core is cloned (namely the registers). In essence it simulates an extra core, but it can only use the real core's "free time" to run another software thread. It shares all other resources (ALU, cache, memory access, bus, etc.).

Core Thread - Similar to Hyper-Threading in the sense that it's a fake core, i.e. an incomplete clone of a core. The real difference is that a core thread includes more private resources than exist with Hyper-Threading alone (another set of registers). How many more resources depends on the actual technology being used (for example, it could have its own instruction fetcher).

CPU - Represents a processing unit, or in other words a processor of a stream of instructions. So you need something that knows how to process instructions (a core) and a stream of instructions (a thread). This can be made by putting Hyper-Threading together with an existing core, or by using a core thread in an existing core.

Windows knows "CPU"s only!
Windows does NOT know what a "Core" or "Core Thread" (in CPU terms) is.

A typical application can have tens of software threads.

An operating system refers to an application as a "process", and you can see these listed in Windows Task Manager, in the "Processes" tab. A process is composed of one or more software threads.
A software thread is also called a lightweight process, since it shares memory with the process but has a different, parallel execution sequence.
The operating system schedules processes (by scheduling their threads) by priority.
By default a process can run on any of the existing CPUs.

If you open Task Manager, go to the "Processes" tab and right-click on a process, there is an option called "Set Affinity", which allows you to define which CPUs your process (its software threads) can run on.

In some very special cases, it can be faster to force a process to run on a single CPU, to minimize context switching.
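
For anyone who wants to see what an affinity setting boils down to: the checkboxes in that dialog amount to a bitmask with one bit per CPU, which is also the form the Win32 call SetProcessAffinityMask takes. Here is a minimal Python sketch of just the mask arithmetic. The function names `affinity_mask` and `cpus_in_mask` are made up for illustration, and the sketch does not actually change any process's affinity.

```python
def affinity_mask(cpus):
    """Build an affinity bitmask: bit n set means the process may run on CPU n."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu
    return mask


def cpus_in_mask(mask):
    """Inverse of affinity_mask: list the CPU indices whose bits are set."""
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]
```

So allowing only CPU 0 and CPU 2 gives the mask 0b101, and allowing a single CPU always gives a mask with exactly one bit set.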

So the best of all worlds is to have full cores, sometimes just called "cores".
So having 4 cores and 4 threads (one thread per core) is better than having 2 cores and 4 threads (2 threads per core).

NOTE: Assuming the same core speed, the less you share, the faster you get.

To sum up, imagine this:
- A core is like a pipe. More MHz means a larger-diameter pipe, hence more water (or instructions) goes through.
- A core thread is like a tube or hose that carries water (or instructions).
- A CPU is a pair: a pipe with a hose inside.

So Windows only sees CPUs (a pipe+hose pair), i.e. output streams (water or instructions).

If you have 2 pipes and 2 hoses, there is only 1 hose inside each pipe, hence hoses can be as large as the pipe.
You have 2 output streams (water or instructions), since you have 2 hoses.

If you have 1 pipe and 2 hoses, there are 2 hoses inside the same pipe, hence hoses have to be smaller to fit inside the same pipe.
But you still have two streams of output (water), since you still have 2 hoses, although smaller.
January 24, 2014 5:06:12 AM

Yes, but that does not really answer his question, which is an interesting question after all. I am not sure how aware Windows actually is of the difference between real and fake cores. All it displays, e.g. in Task Manager, are "cores". I know for sure that older generations of Windows, such as NT 4.0 Server, knew nothing about Hyper-Threading. You could still use HT on them; they simply regarded every fake core as a real CPU and treated it accordingly, all the way down to licensing! (The cost of your Windows Server license depends on how many CPUs you are employing. If your Windows treats every core of your modern processor as a separate CPU, then things quickly get expensive.) Which is why this was fixed in later Windows versions: they recognize multi-core processors as such, but still display every core as a core, fake or not.

Returning to wulfay's question: He talks about an application which can utilize a maximum of, say, 2 cores (Starcraft 2 would be an example for this). So the game allocates two "cores" from the CPU. wulfay now argues that this is a matter of luck: The cores that are being allocated could actually belong to the same physical (hyperthreaded) core, so that this single core must do the work of both threads, and all other 3 cores are sitting idle. This is most obviously less efficient than if the 2 allocated cores belong to 2 different physical cores. The question is now whether Windows - or the CPU - is intelligent enough to make sure the first case never occurs.

All I know is that when an application can use fewer cores than are available, e.g. only one core, Windows detects that this core is at high load while the others are idle. In order to better balance the load, Windows shifts the application to another core. However, all this achieves is that the latter core goes to high load and the first one goes idle. This way Windows keeps desperately rotating the application from core to core within milliseconds without ever achieving anything in the process, a long-criticized weakness of Windows. Funnily enough, this causes Task Manager to display a pretty equal utilization of all cores, so that it looks as if the application scaled nicely over several cores when it does not.
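
To put numbers on that Task Manager effect, here is a toy Python model. It assumes, purely for illustration, that one always-busy thread is moved to the next CPU after every time slice in strict round-robin order; the real migration pattern of the Windows scheduler is not something this sketch claims to reproduce. The function name `apparent_utilization` is hypothetical.

```python
def apparent_utilization(num_cpus, time_slices):
    """Rotate one always-busy thread over the CPUs, one time slice per CPU.

    Returns, per CPU, the fraction of slices during which it was busy. This is
    roughly what a per-CPU graph averaged over the whole period would show.
    """
    if num_cpus <= 0 or time_slices <= 0:
        raise ValueError("num_cpus and time_slices must be positive")
    busy = [0] * num_cpus
    for t in range(time_slices):
        busy[t % num_cpus] += 1  # assumed round-robin migration
    return [count / time_slices for count in busy]
```

With 4 CPUs and 1000 slices every CPU comes out at 0.25, i.e. four graphs at 25% each, although only one thread's worth of work is being done in total.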

I am not sure whether this core-shifting has an impact on performance (apparently it does not), but it sure has an impact on power saving, because it prevents the other cores from entering the deeper sleep states that they could use if they were not being used every few milliseconds. I remember reading that with the introduction of Windows 8.1 Microsoft reportedly altered the allocation algorithm to detect these situations and refrain from pointless core-switching. A pretty late reaction, seeing that the problem has been known since at least Windows XP.

Personally, I always endeavor to determine how many cores a game can use, and then set its core affinity to the corresponding number of cores. That way I prohibit shifting this application to any of the other cores, allowing these to go to sleep without losing the faintest amount of performance.

The only game I know that does this automatically is Medieval 2 - Total War. When you Alt + Tab out of the game and check its CPU affinity in Task Manager, you find that it allocates itself to core 0 alone.

wulfay's question remains unanswered though (unless you count bouncedk's short and reasonless answer).
January 24, 2014 2:25:55 PM

Here is the simplest answer: it's closed source, so NOBODY but Microsoft can know (without hacking). You can either trust it, if you use applications that have more threads than your CPU has cores, or you can turn it off and you won't have to worry.
January 24, 2014 5:35:39 PM

ganon11000 said:
Here is the simplest answer: it's closed source, so NOBODY but Microsoft can know (without hacking).

You are a little quick with affirming such a thing. Besides the fact that hackers do exist (and they are good enough to disable the forced activation system), there are other ways to find out. For instance, you could try an application that can use exactly 2 cores (such as Starcraft 2) and set affinity to cores 0 and 1 (so that it may not use any other cores). Then measure its performance. Then pick 2 different cores that it may run on. Whenever you pick 2 cores which belong to the same physical core, you should notice a significant performance drop.

Once you have recorded what performance looks like on two different physical cores as opposed to 2 hyperthreads on the same core, remove the affinity setting, so that Windows may assign any cores of its liking to the application. Then measure again. Windows (below 8.1) keeps rotating the application across all its cores in a vain attempt to balance the load (this is a known and provable fact). If you see a performance less than what you measured when you forced 2 different physical cores to be used, you know that at some times the same physical core is being assigned. On the other hand, if Windows is intelligent enough to detect hyperthreaded cores and tries first to disperse the load on the real cores, then the performance should match your best case.

I am honest enough to say that I cba to do this, but that does not mean that it cannot be done.

Best solution

January 24, 2014 5:47:07 PM

Windows can tell you whether a processor has Hyper-Threading and can therefore plan the thread distribution accordingly; this is built into the Windows system libraries:
To determine if hyperthreading is enabled for the processor, compare NumberOfLogicalProcessors and NumberOfCores. If hyperthreading is enabled in the BIOS for the processor, then NumberOfCores is less than NumberOfLogicalProcessors. For example, a dual-processor system that contains two processors enabled for hyperthreading can run four threads or programs simultaneously. In this case, NumberOfCores is 2 and NumberOfLogicalProcessors is 4.
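
That comparison can be written down in a few lines. NumberOfCores and NumberOfLogicalProcessors are properties of the WMI Win32_Processor class; the Python sketch below takes them as plain integers and deliberately leaves out how you fetch them. The function names `hyperthreading_enabled` and `threads_per_core` are hypothetical helpers, not part of any Windows API.

```python
def hyperthreading_enabled(number_of_cores, number_of_logical_processors):
    """Apply the rule quoted above: HT is on when cores < logical processors."""
    if number_of_cores <= 0:
        raise ValueError("number_of_cores must be positive")
    if number_of_logical_processors < number_of_cores:
        raise ValueError("cannot have fewer logical processors than cores")
    return number_of_cores < number_of_logical_processors


def threads_per_core(number_of_cores, number_of_logical_processors):
    """Hardware threads per physical core, assuming every core has the same count."""
    if number_of_cores <= 0:
        raise ValueError("number_of_cores must be positive")
    return number_of_logical_processors // number_of_cores
```

For the quad-core from the original question, 4 cores and 8 logical processors mean Hyper-Threading is on with 2 threads per core; 4 and 4 mean it is off.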

If you know that Hyper-Threading is enabled, and you know whether the hyper-threaded logical cores alternate with the primary ones or are all placed at the end, you can easily aim your threads at the primary cores and avoid pushing multiple tasks onto a single physical core before you have made use of all the others.
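
As a rough illustration of that idea, the Python sketch below orders the logical CPUs so that every physical core receives one thread before any core receives a second. It assumes 2-way Hyper-Threading and one of the two numbering schemes just mentioned; the layout names "interleaved" and "at_end" and both function names are invented for this example, and which scheme a given machine actually uses has to be checked on that machine.

```python
def primary_logical_cpus(num_cores, layout):
    """Return one logical CPU index per physical core (2-way HT assumed).

    "interleaved": siblings are adjacent, so 0 and 1 share core 0, 2 and 3 core 1.
    "at_end": logical CPUs 0..num_cores-1 are primaries, the siblings follow them.
    """
    if layout == "interleaved":
        return [2 * core for core in range(num_cores)]
    if layout == "at_end":
        return list(range(num_cores))
    raise ValueError("layout must be 'interleaved' or 'at_end'")


def place_threads(num_threads, num_cores, layout):
    """Pick logical CPUs for the threads, using every physical core once first."""
    primaries = primary_logical_cpus(num_cores, layout)
    if layout == "interleaved":
        siblings = [cpu + 1 for cpu in primaries]
    else:
        siblings = [cpu + num_cores for cpu in primaries]
    order = primaries + siblings
    if not 0 <= num_threads <= len(order):
        raise ValueError("thread count must fit the number of logical CPUs")
    return order[:num_threads]
```

For the two-thread game on a quad-core, this picks logical CPUs 0 and 2 under the interleaved numbering and 0 and 1 under the at-end numbering. In both cases the two threads land on different physical cores.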

It is ignorant to say that nobody but Microsoft can know. Computer science/architecture isn't a mystery field. Microsoft has tons of APIs for Windows that give you lots of information about the system, so that you can better tune your application to the environment if you don't like how Windows organizes loads by default.
March 29, 2014 4:59:45 PM

wulfay said:
Okay, anyone want to get more technical with this? Is it able to let one thread take a full core and another thread take the second full core?

I should imagine it would depend on the programmer's design for the game: instructing the processor to make up its own mind, or ordering it to do one or the other.

technochi computers
October 7, 2014 7:57:24 PM

The Windows scheduler is not very intelligent. Windows NT 6.1 (e.g. Windows 7 and Server 2008 R2) will rebalance processes across threads periodically, but all it knows about is virtual processors, so all threads across all cores look equal to it. If two threads are on the same CPU, or if they are backed by fractional VMware resources, Windows has no clue.

In general, soft threads (i.e. instruction pipelines) are assigned sequentially: 0 and 1 would be off of core 0; 2 and 3 would be off of core 1; etc.
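
Taking that sequential numbering at face value, the mapping from a logical thread index to its physical core is just an integer division. A small Python sketch follows; `core_of` and `share_a_core` are hypothetical names, and the numbering itself is an assumption that real hardware and firmware do not have to follow.

```python
def core_of(logical_cpu, threads_per_core=2):
    """Physical core index of a logical CPU under sequential numbering."""
    if logical_cpu < 0 or threads_per_core <= 0:
        raise ValueError("logical_cpu must be >= 0 and threads_per_core > 0")
    return logical_cpu // threads_per_core


def share_a_core(cpu_a, cpu_b, threads_per_core=2):
    """True when both logical CPUs are hardware threads of the same core."""
    return core_of(cpu_a, threads_per_core) == core_of(cpu_b, threads_per_core)
```

Under this numbering, logical CPUs 0 and 1 share a core while 1 and 2 do not, so an affinity of "0 and 1" would be the worst pick for two heavy threads and "0 and 2" a better one.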

If you run two high-resource workloads on a multi-core HT system, Windows will NOT make the best use of the CPU. You'll have one core working at 100%, with both of its threads spinning away, competing for the same logic units.

Compare to higher-end server operating systems, such as AIX, which are generally aware of the underlying architecture, and will rebalance workloads for peak performance, taking into account I/O wait (waited cycles), memory affinity, cache affinity, and wattage usage. High-CPU-Time processes which run in user space most of the time would be separated to different cores automatically as necessary by the OS scheduler.

For lower end operating systems like Windows, you can play with the processor affinity masks for your processes, but most software I've seen does not go to this effort, rather leaving it as a manual exercise for the admin/user.