bl4ckninj4

Distinguished
Jun 22, 2011
3
0
18,510
Hey all.

I'm a network engineer studying for the CCIE Routing & Switch lab.

I'm going to use Dynamips/Dynagen to emulate the routers (around 10), and connect to real Cisco 3550/3560 switches.

Dynamips is very CPU intensive, so I need some advice on which you think is better suited for the job - best bang for the buck.

I'll be running just OpenBSD & Dynamips/Dynagen (no GUI, CLI only) & connecting to it via SSH.
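
For reference, my planned dynagen topology (.net) file will look roughly like this - the image path and hostnames are just placeholders, not a final config:

[localhost]
    [[7200]]
        image = /opt/ios/c7200-image.bin   # placeholder path, not my real IOS image
        ram = 256
        # idlepc = ...  (I'll add a real idle-pc value after running "idlepc get" in dynagen)
    [[ROUTER R1]]
        f0/0 = R2 f0/0
    [[ROUTER R2]]
        # link back to R1 is defined on R1's side; more routers to follow, up to ~10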

For any of the choices below, I'd be running RAID 0 SCSI & at least 6 GB of RAM:

HP ProLiant servers:

4 x Xeon 3.2 GHz (single-core CPUs), 4 MB L2
4 x Xeon 3.0 GHz (single-core CPUs), 4 MB L2
8 x Xeon 3.0 GHz (single-core CPUs), 4 MB L2
2 x Dual-Core AMD 2.2 GHz
4 x AMD Opteron 852 2.6 GHz, 1 MB L2

Dell PowerEdge 860:
1 x Xeon Quad-Core - 2.4 GHz


Or I could upgrade a box I'm using for FreeNAS 8:

MB - Gigabyte GA-G31M-S2L

I could upgrade the CPU to a Core 2 Quad Q9650, but it would cost the same as buying one of the above boxes (except the 8-CPU box).


The most important question: is a machine running 4 single-core CPUs just as fast, or nearly as fast, as one running 2 x dual-core or 1 x quad-core, if they were all, say, 3.0 GHz? Or are they on a totally different performance level altogether?

Many thanks in advance :hello:
 

lucasrivers

Distinguished
Jul 3, 2011
89
0
18,630
I don't know about the boxes above, but I can say that 4 separate single-core CPUs will run faster than 2 x dual-core or 1 x quad-core. Both dual-core and quad-core processors have one clock speed, not one for each core.

Whereas 4 single-core CPUs have separate clock speeds, which in the end will result in a faster machine.

I could be wrong though.



 

Timop

Distinguished
The single-core Xeons are all NetBurst, so the best performance out of the bunch would be the quad-core Xeon @ 2.4 GHz, with the dual Opterons being a little behind.

With highly threaded apps, a core is a core, so whether it's 2 x 2 or 1 x 4, performance would be almost the same, give or take some interconnect bottlenecks.

However, everything here, including the Q9650, would be beaten by a ~$110 Phenom II X4, which comes out to less than $250 for the whole rig if you reuse the case/PSU.





 

compulsivebuilder

Distinguished
Jun 10, 2011
578
1
19,160
It depends.

If you are putting each of the single-core chips on its own motherboard, then each will have all of the resources of the motherboard (memory, I/O) to itself - this may make them faster.

If all the single-core chips are on the same motherboard, then they share the same resources, but they will have one severe disadvantage compared with the multi-core chip. On the multi-core chip, a given location in memory will be accessed once and cached in L3 cache on the CPU die, whereas each of the single-core CPUs will have to fetch that location independently.

I suspect that will make a dramatic difference if all the cores are running the same code: the first core on the multi-core chip will incur the hit of accessing main RAM, but the other cores will benefit from the cache. L3 cache is a lot faster than main RAM.
 

bl4ckninj4

Distinguished
Jun 22, 2011
3
0
18,510
Thanks for all the responses.

I'm still no further forward :(

A colleague, who is also studying for the CCIE R&S, is running an AMD Phenom II X6 1100T Black Edition 3.3 GHz and is getting great performance with 8 GB of 1333 RAM & an Asus M4A87TD/USB3 870 Socket AM3 board.

I can set Dynamips up to run six instances, to use all six cores.
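
Roughly what I have in mind (the port numbers and the 6-way split are just an illustration, not something I've tested yet):

# one dynamips hypervisor per core, each listening on its own TCP port
dynamips -H 7200 &
dynamips -H 7201 &
dynamips -H 7202 &
dynamips -H 7203 &
dynamips -H 7204 &
dynamips -H 7205 &
# the dynagen .net file then gets one [localhost:PORT] section per hypervisor,
# with a couple of routers under each, so the ~10 routers spread across the six cores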

What are your thoughts on the above setup?

Thanks
 

Timop

Distinguished

The 1100T would be great for your case if you can afford it. How much money do you have anyways?
 

bl4ckninj4

Distinguished
Jun 22, 2011
3
0
18,510
Money is not really a problem, but value for money is.

I'm not going to throw £1000 at it when £150 would suffice.

The machine will only be used for Dynamips, and it needs at least 3 PCI slots (12 Ethernet ports are required to go to the switches, so 3 x 4-port NICs).
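
For reference, the breakout to the switches would be done with pcap bindings in the .net file, something like this (em0/em1 are placeholder names for whatever OpenBSD calls the ports on the quad-port cards):

# map each emulated FastEthernet straight onto a physical NIC port via libpcap
# (NIO_gen_eth is the pcap-based binding, which should work on OpenBSD)
[[ROUTER R1]]
    f0/0 = NIO_gen_eth:em0   # first port on the first 4-port NIC
[[ROUTER R2]]
    f0/0 = NIO_gen_eth:em1   # second port, and so on for all 12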

I'd be unhappy to spend over £300 for processor, MB & memory though.
 


The issue with multiple CPU sockets is memory addressing. A single quad (or X6) only has to deal with a single DIMM bank, which essentially leads to better performance.

With multiple sockets, a CPU first checks its own DIMM bank, and if it misses, the result is a page fault and it has to go out looking in the other DIMM banks. This is called non-uniform memory access (NUMA).

On 2-/4-/8-socket systems, when the software is not optimized for NUMA, those page faults drive I/O reads/writes through the roof, substantially reducing the anticipated efficiency of all those CPUs. A single socket/DIMM bank has no such issues (except when your software is poorly coded).

Like FireFox :lol: