CPU Cores or threads?

bigsteely101

Distinguished
Nov 15, 2010
18
0
18,510
Hey everyone, I am running a software program which apparently does not take advantage of multi-cores and multi-threads. The software program is "Metatrader 4". I use this to study and trade currencies. I am in the process of picking up a second computer system to run all my historical back testing on it. The historical data sits on my drive. I would like to know what should I be looking for in terms of CPU cores/threads when I do my research?

This situation is a little different than just saying it is not a multi-core/thread software program so don't concern yourself with this. The thing is, I run approximately 10 instances of the program at one time, testing various strategies. So there will be a large load on the programs, I just don't know how the cores/threads work in this type of situation. Do they dedicate/share resources to the multiple instances? How can I make a purchase that would maximize this type of computing scenario?

Thanks CH
 

nordlead

Distinguished
Aug 3, 2011
692
0
19,060
Cores are independent of each other and can run at the exact same time. They don't share resources until you reach the higher levels of cache or RAM.

Threads (regular software threads) share the same physical cores and swap in and out to make it look like they are running at the same time. Every processor is capable of doing this.

Hyper threading is Intel's technology that lets two threads use the same physical core at once. Unlike regular threads, hyper threading duplicates some of the core's resources instead of sharing them. This duplication makes hyper threading faster than regular thread switching.

Since it is possible to instruct certain programs to run on specific cores (or virtual cores) it does make a difference what kind of processor you buy.

Let's assume all of the specs of a CPU are identical other than core count and whether it supports hyper threading. Since you want to run 10 instances at the fastest speed possible, ultimately you'd want a 10-core processor. Since that isn't available, you'd have to settle for a 6-core with hyper threading (12 virtual cores) for the next best results.
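To make the counting concrete, here is a minimal Python sketch of how 10 instances would spread over different numbers of logical cores. It assumes a simple round-robin placement, which is only an approximation of what the Windows scheduler really does, and it treats a hyper threading virtual core as a full core, which it is not. The function names are made up for illustration.

```python
from collections import defaultdict

def distribute(instances, logical_cores):
    """Round-robin instances over logical cores; returns {core: [instance, ...]}."""
    placement = defaultdict(list)
    for i in range(instances):
        placement[i % logical_cores].append(i)
    return dict(placement)

def worst_case_sharing(instances, logical_cores):
    """Largest number of instances that end up sharing one logical core."""
    return max(len(group) for group in distribute(instances, logical_cores).values())

# Compare a plain quad core, a quad core with hyper threading, and a 6-core with hyper threading.
for name, cores in [("4 cores, no HT", 4), ("4 cores + HT", 8), ("6 cores + HT", 12)]:
    busiest = worst_case_sharing(10, cores)
    print(f"{name}: up to {busiest} instance(s) on the busiest logical core")
```

Only the 12 virtual core case gives every instance a logical core to itself, which is the reasoning behind the 6-core suggestion.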

The other alternative is to use two CPUs with at least 5 cores each.

So, on the many-core front I'd recommend the i7-970 or higher. However, everything isn't equal: the Sandy Bridge CPUs are incredibly fast, and even an overclocked i7-2600k (4 cores with hyper threading) might just beat its much more expensive 6-core (with hyper threading) counterparts.

Just to show how much more efficient the Sandy Bridge architecture is, you can look at the Tom's Hardware chart where everything is single core @ 3GHz - http://www.tomshardware.com/charts/x86-core-performance-comparison/All-time-based-values-added,2779.html
 

bigsteely101



Thanks so much for your recommendations.

So, I am unsure whether to go desktop or laptop. I currently have a desktop with an i5-760 and thought I should diversify a little, so I was thinking of a laptop with the i7-2630QM chip. Apparently it has 4 cores/8 threads http://ark.intel.com/products/52219. Unfortunately, its score in the CPU benchmark at http://www.cpubenchmark.net/high_end_cpus.html is far from comparable to the desktop versions.

Any real downside to the laptop path, other than it is expected to get hot faster?
 

bigsteely101



Would a quad core i7 have better computational qualities (algorithms and all that jargon!) than a quad core i5? Maybe I could lessen the cost with an i5 but increase overall performance via memory and SSD drive.
 
nordlead said: "cores are independent of each other and can run at the exact same time. They don't share resources until you reach higher levels of cache or possibly RAM."

Not necessarily. Bulldozer is a perfect example where each pair of cores [on a BD module] shares significant management resources [scheduler, FPU, etc]. Whether this affects performance is another matter...

bigsteely101 said: "Would a quad core i7 have better computational qualities (algorithms and all that jargon!) than a quad core i5? Maybe I could lessen the cost with an i5 but increase overall performance via memory and SSD drive."

Maybe. You do gain hyperthreading, but I don't know how much that will help you. I'd just stick with an i5 personally.
 


I don't think it matters.

But I do agree with the point being made about good RAM and fast disk I/O. Not sure how 'techie' you are, but it would not be too difficult to set up an SSD RAID, not only for your OS/Apps but for your data (dependent upon the size of your data, of course).

Because your intention is to run multiple instances, I suspect you will actually 'move' the stress from the CPU to your disk I/O.

And from a practical standpoint in pure GFLOPs, it is hard to top an x6 Thuban. The overall issue I see (with either AMD or Intel - not counting the disk I/O and RAM) is how well Windows would work with your software in 'load-balancing' across CPU cores.

AND ... you could actually try a little experimenting right now by firing up multiple instances, go to the processes tab in task manager, and set different core affinities.

See how that floats yer boat.

 

nordlead

As a note, suggestions are tough because I'm missing information. For example, is speed the #1 priority over price? Will a single instance of this program max out a single core (or come close)? Do all 10 instances typically use the CPU at the same time?

The i5-2500k and i7-2600k are identical except that the i7 has hyper threading. The i7-2600k costs $100 more. They are both based on the Sandy Bridge architecture. If the CPU is the bottleneck (which is the impression I'm under based on the request) then an i7-2600k will be superior to the i5-2500k. I don't know which would win between the i7-9xx and the i7-2600k, but based on the price difference and the easy overclocking I'd go with the much cheaper i7-2600k. If the reality is that you don't max out the CPU (or come close), then an i5-2500k would be sufficient and the $100 can be spent elsewhere.

As for the i7-2630QM, it is significantly slower than the i7-2600k, with a stock clock of 2GHz and a turbo boost up to 2.9GHz. For comparison, the i5-760 that you have is a 2.8GHz Lynnfield core.

Investing in an SSD and RAM is also a good idea as pointed out above, but the reality is that you should be investing in removing bottlenecks whatever they happen to be.
 

nordlead



Yes, Bulldozer doesn't fit the description I gave, but Bulldozer is essentially implementing hyper threading and calling the results cores. Considering the chip isn't out yet, I'll consider my very simple answer good enough :-D
 

bigsteely101



On my i5-760, I have set up a RAID 0 with four SSD drives on my 3Gb/s ports...but it's performing relatively poorly. I get 500 read and 450 write speeds. I think I currently have a bottleneck across the DMI of the i5-760 chip, which only allows about 1GB/s of bandwidth one way. As for my peripherals, I have 4 SSDs in RAID 0, 1 SSD as main drive, 1 spindle drive, and a CD-ROM drive. A new chip like the Sandy Bridge has a bandwidth of 20Gb/s across the DMI 2.0.
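A rough back-of-the-envelope check of that bottleneck theory, sketched in Python. It rests on several assumptions: that the 500/450 figures are MB/s (no unit is given), that each 3Gb/s SATA port tops out at about 300 MB/s once encoding overhead is taken off, and that the one-way DMI limit is the roughly 1GB/s quoted above. The function name is made up for illustration.

```python
SATA_3G_MB_S = 300         # approximate usable ceiling of one 3Gb/s SATA port
DMI_MB_S = 1000            # one-way DMI limit, figure taken from the post
OBSERVED_READ_MB_S = 500   # assumed to be MB/s; the post gives no unit

def raid0_ceiling(drives, per_port_mb_s, link_mb_s):
    """Best-case RAID 0 throughput: sum of the ports, capped by the shared link."""
    return min(drives * per_port_mb_s, link_mb_s)

ceiling = raid0_ceiling(4, SATA_3G_MB_S, DMI_MB_S)
print(f"theoretical ceiling: {ceiling} MB/s")
print(f"observed read is {OBSERVED_READ_MB_S / ceiling:.0%} of that ceiling")
```

If those assumptions hold, the array is reading well below what the DMI alone could carry, so the controller, the drives themselves, or the other devices sharing the link may be part of the picture too.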

I am not exactly sure what you mean by "set different affinities".

I have attached a screen shot of my CPU processes, which are at 100%; I have never seen them this high before (Okay, maybe I can't attach a file...not sure how to do it here!). You'll see that there are 10 metatrader instances up. Five are running just regularly and five are doing historical back-testing, BUT only 4 on the CPU resource monitor show that they are being taxed hard (21%, 20%, 14.36%, and 13.37%).
 

bigsteely101



Yes, speed is my goal, cost is not an issue, other than getting into the ridiculously priced Intel "extreme" chipsets.
 
CPU-affinity_01.jpg



If you right-click on the instance in the process tab you may select an individual core (or more correctly, 'deselect' individual cores) for the instance to be executed.
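If clicking through Task Manager for ten instances gets tedious, the same pinning can be done at launch with the Windows `start /affinity` option, which takes a hexadecimal bitmask of allowed cores. Here is a small Python sketch that only builds (does not run) such command lines. The install folders are hypothetical, and it assumes each MT4 copy is launched through its own `terminal.exe`.

```python
def affinity_mask(cores):
    """Hex bitmask with bit n set for each logical core n, as `start /affinity` expects."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return format(mask, "X")

def start_command(exe_path, cores):
    """Build (not run) a cmd.exe line that launches exe_path pinned to the given cores."""
    return f'start "" /affinity {affinity_mask(cores)} "{exe_path}"'

# Hypothetical install folders, one per instance; adjust to your own layout.
instances = [rf"C:\MT4\instance{n}\terminal.exe" for n in range(1, 5)]

# Pin instance 1 to core 0, instance 2 to core 1, and so on.
for core, exe in enumerate(instances):
    print(start_command(exe, [core]))
```

Each printed line can be pasted into a batch file. Passing several cores, for example `[0, 1]`, lets an instance float between them.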

How big are your data sets? Will a single set reside in RAM?

 

bigsteely101




Thanks for the pointer.

The entire data set file is 3GB (Sept 2007 - Sept 2011). Though each of the 5 back-testing instances is seeking data for a period of 2 months.
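As a rough estimate of what that means for RAM, here is a short Python calculation. It assumes the 3GB is spread evenly over the 48 months from Sept 2007 to Sept 2011, which real tick data almost certainly is not, and it ignores whatever extra memory MT4 itself needs. The names are made up for illustration.

```python
TOTAL_MB = 3 * 1024   # the 3GB history file
TOTAL_MONTHS = 48     # Sept 2007 to Sept 2011
WINDOW_MONTHS = 2     # period each back-testing instance reads
INSTANCES = 5         # back-testing instances running at once

def window_size_mb(total_mb, total_months, window_months):
    """Size of one test window if the data were evenly spread over time."""
    return total_mb / total_months * window_months

per_instance = window_size_mb(TOTAL_MB, TOTAL_MONTHS, WINDOW_MONTHS)
print(f"per instance: about {per_instance:.0f} MB")
print(f"all {INSTANCES} instances: about {per_instance * INSTANCES:.0f} MB")
```

On those assumptions the working data for all five back-tests is well under 1GB, so it should fit in RAM on most current machines.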
 

bigsteely101



It appears that the CPU is the limiting factor and bottleneck. I closed down five of the instances and am now only running the historical back-tests. The CPU usage is now as follows for the five remaining instances: 24%, 24%, 24%, 12%, 12%.

When we say that I have "100% consumption of CPU by the processes", what exactly is it lacking that would otherwise reduce that load? Is there one variable that contributes to the high usage, such as speed, I/O bandwidth...etc.?
 
Boy ... this is a fun one :)

The more I snoop into 'Metatrader 4' the more interesting it gets. You are certainly swimming in competitive waters. Hopefully, I can give you a few ideas with which you can run.

First, I'll take a wild swing at one of your original questions ("Do they dedicate/share resources to the multiple instances?") and speculate on its effects on the hardware debate so far, and what I think might help 'fix' it.

It seems to me that when you launch multiple instances, the result you are seeing in hardware can be called 'chunking' and 'thrashing'. The data set is small enough to permanently reside in memory address space BUT it is continuously being written, flushed, rewritten, flushed, rewritten, flushed, rewritten, flushed ... (I think you get the point!) What I think happens is that when MT4 computes/reads the returned processed data, it releases the data set from the memory address space it established.

Then, if you take this 'chunking' and start adding multiple instances, you simply start multi-chunking, or 'thrashing' as each instance conducts its own R/Ws to address space. I suspect this is where your cpu utilization is coming from.

The good news is it looks as if MT4 is highly script-able, and I'll venture a guess that someone has figured a way to dedicate a chunk of address space without it constantly being released, thereby freeing up process resources.

In the Task Manager Process tab under the 'View' drop-down, check out 'Select Columns' -- there are a dozen or more related to memory and I/O. Load 'em all up and see what jumps out at you.

Any input on the core affinity? What tweaks did you try?





 

bigsteely101




Sorry, but I still can't figure out how to attach a screen shot of my task manager with all the memory and I/O details.

I tried changing my "priority" and "affinity" settings to dedicate CPU power to the terminals which were running. In this case I had two terminals running. The results were the same as if I had left it at the default. So essentially, I cannot allocate any more CPU power than 25% to any one terminal. I guess that says that each terminal is maxing out a single core.
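That 25% ceiling follows from how Task Manager reports usage: each process is shown as a share of the whole machine, so on a four-core chip one fully busy core shows as 25%. Here is a minimal Python sketch of the conversion, using the back-test readings posted earlier in the thread and assuming the i5-760 exposes 4 logical cores (four cores, no hyper threading). The function name is made up for illustration.

```python
LOGICAL_CORES = 4  # i5-760: four cores, no hyper threading

def cores_used(task_manager_percent, logical_cores):
    """Convert a whole-machine CPU percentage into a number of cores kept busy."""
    return task_manager_percent / 100 * logical_cores

readings = [24, 24, 24, 12, 12]  # per-instance readings from the five back-tests
for pct in readings:
    print(f"{pct}% of the machine = {cores_used(pct, LOGICAL_CORES):.2f} of one core")

total = cores_used(sum(readings), LOGICAL_CORES)
print(f"total: {total:.2f} of {LOGICAL_CORES} cores busy")
```

Read that way, three instances each have a core nearly to themselves and the remaining two appear to be splitting the fourth, which fits the idea that a single-threaded terminal cannot use more than one core no matter what priority it is given.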