Power of ARM

montosaurous

Honorable
Aug 21, 2012
The director of regional Cisco classes came in today and spoke to my class. He seemed quite optimistic about the internet. However, he had nothing nice to say about the future of PCs. He asked one girl why she elected to take the course, and she said it was because she has an interest in computers. To my shock, he said "Computers are dead. Microsoft can't sell their programs and Windows 8." as well as "This phone, this iPad is stronger than this PC. They have 4GB memory and dual cores. Phones are having quad cores and memory of 16GB, 32GB, 128GB, 256GB...", all while referring to a PC with an i5 2400 and 4GB of RAM right in front of him. He also said some other stuff that makes little sense, but that is a different story. We can all agree with him, though, when he said the future for careers in IT is bright. Now, I know that x86 is much more powerful than ARM, which I've heard some say has the IPC of a Pentium III. So, exactly how powerful is ARM relative to an x86 model that we're all familiar with? Is it a Pentium III, Pentium 4, Athlon 64 X2, Pentium D, etc.?
 
I can't say exactly how powerful ARM is. I believe the ARM Cortex-A15 has about the same number of transistors as a Pentium.

In general, the number of transistors in a CPU (x86 or ARM) indicates its performance potential, but so does the architecture. More powerful CPUs are generally more complex than weaker CPUs, so they have more transistors. For now it is difficult to say exactly how powerful the Cortex-A15 is, since the same software cannot be run on both a Windows and an Android device.

I would say ARM is more powerful than the Pentium 3 since it can play back HD video. However, ARM has dedicated hardware to do that. HD video did not even exist back in the Pentium 3 era, so naturally that CPU would choke when attempting to play it back. For the moment, ARM-based devices are nice for consumption: watching videos, playing simple games, listening to music, reading a book, etc. However, when it comes to doing "real work" in the office, they are not even close to the capabilities of a PC. For example, there are spreadsheet apps for Android, but they are simple, and the ARM processor would not be capable of keeping up with even a P4 (perhaps even a P3) when it comes to complex calculations or financial modelling. Even when you want to develop apps for Android, you use a computer.

It will be interesting to see Apple's ARM processor for their next-generation laptops. They plan on dropping Intel for good and are working on their own processors for Macs. When is it expected to be released? Not sure, since I don't really pay attention to Apple products. But once it is released, it can be benchmarked against a Mac running an Intel CPU to see how it performs.



 

PhaTzie

Honorable
Oct 1, 2013
ARM 64 is a disruptive technology aimed at several market areas. The first is consumer devices, which demand a full day's use plus evening use without a charge. This restricts ARM 64 SoCs to roughly 0.25-5 watts, with power-saving modes that shut down cores which are not in use and greatly reduce the speed of a single core when it is idle. These power-saving modes will carry across all ARM 64 products and be many times more effective than x86 power-saving modes.

The second target is server infrastructure. ARM 64-bit SoCs aimed at servers have 2-8 cores and use 2-15 watts of power. At 2 watts it would be possible to place 96 ARM 64 SoCs in a 1 rack unit (1U) server chassis. At 15 watts it would be possible to place 16 SoCs in a system that consumes between 225 and 250 watts nominally. Why would it be more beneficial to place 96 2 W SoCs in a 1U server rather than 16 15 W ARM SoCs? For services which mainly serve data, 96 SoCs with a shared memory architecture would run circles around the 16 15 W muscle-car SoCs. It's about efficiency, fast access to data, and the ability to move data. It would also make sense to add a switch ASIC to the 96 x 2 W ARM SoC server, which will be a highly disruptive technology and will change the cloud in the near future. Shared memory architectures are the future of the server! Workloads that combine moving data and calculation are where the 16 15 W SoCs would go up against a 4-processor Sandy Bridge system. With a shared memory architecture, the 16 15 W SoCs will run circles around the 4-CPU Sandy Bridge system!
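As a rough check on the power figures above, here is a minimal sketch of the chassis arithmetic. It only multiplies SoC count by per-SoC wattage. The function name is made up for illustration, and memory, fans, any switch ASIC, and power-supply losses are ignored, so real chassis draw would be higher than these numbers.

```python
def chassis_soc_power_watts(soc_count: int, watts_per_soc: float) -> float:
    """SoC-only power draw for one chassis (hypothetical helper).

    Ignores memory, fans, networking, and PSU losses.
    """
    return soc_count * watts_per_soc


# The two 1U layouts described in the post.
dense_layout = chassis_soc_power_watts(96, 2)    # 96 SoCs at 2 W each
muscle_layout = chassis_soc_power_watts(16, 15)  # 16 SoCs at 15 W each

print(f"96 x 2 W  = {dense_layout} W")   # 192 W
print(f"16 x 15 W = {muscle_layout} W")  # 240 W
```

On SoC power alone, the 96 x 2 W layout draws less than the 16 x 15 W layout, and 240 W sits inside the 225-250 W range quoted above. This says nothing about throughput per watt, which would need actual benchmarks.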

Let us examine a system from SGI. The SGI UV is the largest-scale shared memory system currently in existence. It was originally created for one purpose: to scale up the number of transactions per second. Banks use the SGI UV to increase the number of transactions without problems. Distributed computing puts time lags into credit requests. The shotgun technique, where criminals used thousands of hijacked computers to place credit requests, caused banks to lose hundreds of millions of dollars. This shows that transactions per second are very important when looking at shared memory architecture.

I have seen storage systems by Foxconn which use Calxeda ARM clusters and provide improved throughput while reducing overall operational energy costs. ARM 64 in the right configuration will cost 10 times less per transaction while being able to flat-out out-transact x86 servers, period! This is why we are about to see a huge change in the cloud from CISC x86 processors to RISC ARM-based processors.

Until some benchmarks are completed, this is only a projection, but the AMD 64-bit 15 W 8-core SoCs could out-transact the x86 architecture because of the opportunities that come with a shared memory architecture. So if a cabinet with 160 x86 processors can sustain 25 TFlops, then the AMD system should be able to do at least 5 times that, or 125 TFlops per cabinet. That is 1/5th the cost of x86 for the same throughput!
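The projection above is simple arithmetic, sketched below. The 5x factor is the post's own estimate rather than a benchmark result, and the cost comparison assumes both cabinets cost the same, which is an assumption added here for illustration. All names are hypothetical.

```python
X86_CABINET_TFLOPS = 25.0   # from the post: 160 x86 processors per cabinet
X86_PROCESSORS = 160
CLAIMED_SPEEDUP = 5.0       # the post's estimate, not a measured figure

# Sustained throughput per x86 processor implied by the cabinet figure.
tflops_per_x86_processor = X86_CABINET_TFLOPS / X86_PROCESSORS

# Projected cabinet throughput if the claimed speedup holds.
projected_cabinet_tflops = X86_CABINET_TFLOPS * CLAIMED_SPEEDUP


def relative_cost_per_tflop(speedup: float, cabinet_cost_ratio: float = 1.0) -> float:
    """Cost per TFlop relative to x86.

    cabinet_cost_ratio is (new cabinet price / x86 cabinet price);
    1.0 means equal price, which is an assumption.
    """
    return cabinet_cost_ratio / speedup


print(tflops_per_x86_processor)                  # 0.15625
print(projected_cabinet_tflops)                  # 125.0
print(relative_cost_per_tflop(CLAIMED_SPEEDUP))  # 0.2, i.e. 1/5th
```

The "1/5th the cost" figure only follows if the cabinet prices are equal. If the ARM cabinet cost twice as much, `relative_cost_per_tflop(5.0, 2.0)` would give 0.4 instead.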
 
Solution
Wow, that Cisco fella has little knowledge about how architectures work... How the friggin' hell he is in that position blows my mind in a lot of ways...

Comparing different ISAs is literally like comparing two different kinds of food: milk and fruit, for example. You can't say that milk alone will keep you alive and neglect fruit. In this case, x86 serves its purpose for certain types of loads, whereas the ARM ISAs cover another part (low power envelopes are currently their strongest point). Both aim to "calculate" stuff and produce results on a screen for you, but comparing them directly without context is, at least for me, a dumb thing to do.

To answer how it could compare, you have to go down to the metal: current ARM design wins (Samsung's Exynos series, Apple's A series and Qualcomm's Snapdragon series, for example) can do much more than an Athlon XP or a Pentium III. Not because of one simple catchy phrase like "because they are old", but because of the number of transistors you can pack into today's designs (the "nanometer" race), plus dedicated hardware for the stuff you know your design is not good at emulating (decoding, for example). Also, software evolution helps a lot. Packing more stuff into the hardware allows new software to depend less on "emulating" things through the raw processing power of the CPU or GPU. This is a little more complex to explain, but x86, being a CISC approach, packs more potential for certain number-crunching schemes than ARM, being a RISC approach. Think of what you usually hear as "floating point operations" or "streamed instructions". ARM, being a RISC approach, can't compete with the "fixed pipe" way a CISC approach can solve one specific need (in very simple terms), but both approaches can crunch numbers and provide the exact same results. I won't go into the "what if ARM goes to the same power envelope as x86" question, because I'm not even 100% sure which is better. I believe you have to pick the right tool for the job at hand, and that implies a case-by-case analysis.

Now, compare what's inside the Samsung Galaxy S4 to your run-of-the-mill i5 desktop computer, for example. There's not even a competition there. The i5 floors the S4 in raw power (transistor count is higher, design power is higher, all the hardware and software accompanying the CPU is stronger), but you can't put the i5 PC inside your pocket. It's a trade-off for going portable, no surprises there. Also important is the OS. He mentioned that Microsoft failed to "sell" Windows 8... Well, I bet neither Android nor iOS has even HALF the features of even Win XP or Win 2000. He's comparing a light OS to a full-blown desktop-designed one. I do agree that 90% of regular people are happy with what Android or iOS offers, but from an operational point of view, there's not even a comparison to be made. In the particular case of Android (Linux kernel), it's like comparing it to Fedora or Gentoo.

Anyway, what the Cisco dude didn't tell you is that when you want to go low power and high performance, you will always have intersection points with old tech, but current to current, the parts with higher power consumption and higher transistor counts will prevail and be better. That's the rule of thumb. This is also the case with OSes. To get hold of the new goodies, you have to update parts of the OS that might not work well with all the old stuff in the code. Think of the jump from Win 98 to Win XP or from the Linux kernel 2.x to 3.x and so on.

Hope this helps the discussion. Sounds like a fun thread to read later on, haha.

Cheers!
 

8350rocks

Distinguished
ARM is really great at processing short, simple instructions quickly. That's why your phone feels really snappy. RISC (ARM's type of architecture) is designed around short instructions that take approximately 1 clock cycle to complete. CISC (x86-64's type of architecture) is designed around the capability to run complex, long instructions that can take several clock cycles to complete.
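One common way to reason about this trade-off is the textbook relation: execution time = instruction count x cycles per instruction / clock rate. The sketch below applies it to two hypothetical builds of the same program. The instruction counts, cycle counts, and clock speed are invented purely for illustration and are not measurements of any real ARM or x86 chip.

```python
def execution_time_s(instruction_count: int,
                     cycles_per_instruction: float,
                     clock_hz: float) -> float:
    """Textbook relation: time = instructions * CPI / clock rate."""
    return instruction_count * cycles_per_instruction / clock_hz


# Hypothetical workload (all numbers are made up):
# the RISC-style build needs more instructions at about 1 cycle each,
# the CISC-style build needs fewer instructions that average more cycles.
risc_time = execution_time_s(1_500_000, 1.0, 1.0e9)
cisc_time = execution_time_s(1_000_000, 2.0, 1.0e9)

print(f"RISC-style build: {risc_time * 1e3:.2f} ms")
print(f"CISC-style build: {cisc_time * 1e3:.2f} ms")
```

With different made-up numbers the result flips, which is the point: neither style wins automatically, and it depends on how many instructions the job needs and how many cycles each one really takes.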

RISC will run what it runs fast. Think of it like this... remember when everyone said Macs were faster than Windows machines years ago (if not, it may be before your time)? Anyway, Macs from the PowerPC era (Mac OS X Leopard was the last release to support them) were on a RISC architecture. It couldn't do some of the things x86 could, but what it could do... it did quite fast.

ARM is aimed at lower power envelopes, specifically because it's good at simple operations and scales downward *far* better than x86 does. However, once you start adding all the extra features needed to make ARM compete with x86 on long, complex operations, your transistor counts rise, your thermal envelopes increase, and your power consumption spikes. For this reason, x86 scales upward better than ARM, because it has already been through the growing pains of adding all those things. The x86 architecture has already had to get power consumption under control and make the more complex components work together smoothly and seamlessly, which can take a few product generations to nail down.

The difference is that ARM can scale well in parallel, meaning you can run many ARM CPUs at once and consume less power while doing parallel tasks well. The issue with that, though, is that modern GPUs are actually even better at those tasks than ARM is, and they can run GPGPU functions in x86 systems.

So, essentially, x86 CPUs do serial tasks well, and run complex parallel tasks that have to be tied together well too. Modern GPUs can run massively parallel calculations like physics and other things better than anything else, and consume relatively low power in relation to their raw computing power for such tasks. ARM cores are good for running simpler instruction sets and being efficient at it, which helps make phone batteries last longer.
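The serial versus parallel split above can be illustrated with Amdahl's law, which is not mentioned in this thread but is the standard formula for how much extra cores help when part of a job stays serial. This is a minimal sketch; the parallel fractions and core counts are arbitrary examples, not measurements.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's law: speedup = 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Example fractions of the work that can run in parallel (arbitrary values).
for fraction in (0.50, 0.95, 0.99):
    few = amdahl_speedup(fraction, 4)     # a handful of big cores
    many = amdahl_speedup(fraction, 96)   # many small cores
    print(f"p={fraction:.2f}: 4 cores -> {few:.1f}x, 96 cores -> {many:.1f}x")
```

When only half the work is parallel, going from 4 to 96 cores barely helps, so fast serial cores matter more. When almost all of it is parallel, piling on many small cores (or a GPU) pays off.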

They all have a place, purpose and function. Though, ARM will not surpass x86 in computing anytime soon. Additionally, that guy from Cisco has made the mistake of writing off the PC Gaming industry, which accounted for $18 billion in profits last year. It is also the only growing segment in desktop PCs.

(In reality, I think he's an ignorant twit who was over promoted, but I tried to be diplomatic about it)