Intel Xeon Phi Performance
Throughout its press day, Intel repeatedly stressed the importance of optimized code when comparing the performance of a CPU to an accelerator. One of the company's first examples involved a bit of Fortran code. First, we saw results from the unoptimized single-threaded code, followed by a simple Xeon Phi port. The difference showed the Phi to be somewhere around 300x faster. Then, the Intel team demonstrated why that first comparison was flawed. When the same code was re-run on dual Xeon E5s, the Phi was only about twice as fast.
The purpose of this exercise seemed to be expectation management. It's in the best interest of companies like Nvidia to run parallelizable code in a single thread as a baseline, and then run the same code on a graphics processor to claim more than two orders of magnitude improvement. But if you allow optimized code to take advantage of a multi-core CPU's resources, the real delta between CPU and accelerator is much smaller.
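To make the arithmetic behind that point concrete, here is a minimal sketch in Python. It takes the two approximate ratios quoted above (roughly 300x and roughly 2x) and works out how much of the headline gain comes simply from letting the CPU run the code across all of its cores. The function name is ours, and the sketch assumes both ratios describe the same workload; Intel did not publish exact figures.

```python
def implied_cpu_speedup(phi_vs_naive: float, phi_vs_tuned_cpu: float) -> float:
    """Speedup of the tuned CPU run over the naive baseline.

    If the accelerator is A times faster than the naive run and B times
    faster than the tuned CPU run, the tuned CPU run is A / B times
    faster than the naive run.
    """
    if phi_vs_tuned_cpu <= 0:
        raise ValueError("speedup ratios must be positive")
    return phi_vs_naive / phi_vs_tuned_cpu


# Approximate ratios from the demo described in the article (assumed, not exact).
PHI_VS_SINGLE_THREAD = 300.0  # Phi vs. unoptimized single-threaded code
PHI_VS_DUAL_XEON_E5 = 2.0     # Phi vs. the same code on dual Xeon E5s

if __name__ == "__main__":
    cpu_gain = implied_cpu_speedup(PHI_VS_SINGLE_THREAD, PHI_VS_DUAL_XEON_E5)
    print(f"Dual Xeon E5 run vs. single-threaded baseline: ~{cpu_gain:.0f}x")
```

Under those assumptions, nearly all of the 300x figure is accounted for by moving from one unoptimized thread to a fully used pair of CPUs, which is exactly the point Intel was making.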
Then, Intel shared some of the real-world performance improvements seen from comparisons between dual-socket Xeon-based machines and Xeon Phi.
Financial services professionals are probably salivating at these numbers. Monte Carlo models are often used to estimate outcomes that depend on many uncertain inputs by sampling those inputs at random, over and over. I've personally used them to suggest the risk and financial impact of large projects and product programs. And, after the 2001 dot-com crash, Black-Scholes became a preferred option valuation method. This was a huge deal in the mid-2000s because Silicon Valley companies that gave employees options instead of higher salaries were under increased pressure to pin a value on those options.
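For readers who haven't worked with either technique, the following is a minimal Python sketch, not Intel's benchmark code, that prices a European call option two ways: with the closed-form Black-Scholes formula and with a simple Monte Carlo simulation. The function names and the sample parameters (spot price, strike, rate, volatility, term) are illustrative assumptions of ours. Each simulated path is independent of the others, which is why this kind of workload spreads so readily across many cores.

```python
import math
import random


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(spot, strike, rate, vol, years):
    """Closed-form Black-Scholes price of a European call (no dividends)."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / (
        vol * math.sqrt(years)
    )
    d2 = d1 - vol * math.sqrt(years)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * years) * norm_cdf(d2)


def monte_carlo_call(spot, strike, rate, vol, years, paths=100_000, seed=1):
    """Estimate the same price by averaging discounted payoffs over random paths."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    drift = (rate - 0.5 * vol ** 2) * years
    diffusion = vol * math.sqrt(years)
    total_payoff = 0.0
    for _ in range(paths):
        # One independent draw of the terminal price under geometric Brownian motion.
        terminal = spot * math.exp(drift + diffusion * rng.gauss(0.0, 1.0))
        total_payoff += max(terminal - strike, 0.0)
    return math.exp(-rate * years) * total_payoff / paths


if __name__ == "__main__":
    # Illustrative inputs only: at-the-money call, 5% rate, 20% volatility, 1 year.
    args = (100.0, 100.0, 0.05, 0.2, 1.0)
    print(f"Black-Scholes: {black_scholes_call(*args):.4f}")
    print(f"Monte Carlo:   {monte_carlo_call(*args):.4f}")
```

The Monte Carlo estimate converges on the closed-form value as the number of paths grows, but it needs many thousands of independent samples to get there, which is the kind of work a many-core part is built for.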
Intel also brought in representatives from Altair, a software and technology provider, to describe how easy it was for them to port code to the Xeon Phi architecture and to show examples of workloads, such as crash test simulations, which generally saw a 2.5x performance improvement.
In the absence of hardware and software we can test ourselves, Intel's discussion of performance is at least plausible. Optimization can move the needle in one direction or the other, and certain applications are going to realize more gain from what Intel is doing with Xeon Phi than others. But, with that said, a 2x to 2.5x improvement seems reasonable in environments able to benefit from parallelized computing.