Power your breakthrough innovations with the highly parallel processing of the Intel® Xeon Phi™ coprocessor. We’ve packed over a teraFLOPS of double precision peak performance into every chip—the highest parallel performance per watt of any Intel® Xeon® processor.1,2,3,4 Now you can think “reuse” rather than “recode” with x86 compatibility. Languages, tools, and applications run smoothly across the full spectrum of Intel® Xeon® processor family-based platforms.
A broad ecosystem of programming languages, models, and tools supports the Intel® architecture, and all of them can be used with both Intel Xeon processors and Intel Xeon Phi coprocessors. Applications that run on one processor family will run on the other. This uniformity can greatly reduce the complexity of software development. Existing applications will need to be tuned and recompiled to maximize throughput, but your developers won’t need to rethink the entire problem or master new tools and programming models. Instead, they can reuse existing code and maintain a common code base using familiar tools and methods.
Intel Xeon Phi coprocessors provide up to 61 cores, 244 threads, and 1.2 teraFLOPS of peak double precision performance, and they come in a variety of configurations to address diverse hardware, software, workload, performance, and efficiency requirements.
The Intel® Xeon Phi™ Coprocessor 3100 family provides outstanding parallel performance. It is an excellent choice for compute-bound workloads, such as Monte Carlo, Black-Scholes, HPL, LifeSc, and many others.
The Intel® Xeon Phi™ Coprocessor 5100 family is optimized for high-density computing. It is well suited to workloads that are memory-bandwidth bound (such as STREAM), memory-capacity bound (such as ray tracing), or both (such as reverse time migration, or RTM).
The Intel® Xeon Phi™ Coprocessor 7100 family provides the most features and the highest performance and memory capacity of the Intel Xeon Phi product family. The family supports Intel® Turbo Boost Technology 1.0, which increases core frequencies during peak workloads when thermal conditions allow.
Get over a teraFLOPS of double precision peak performance.1,2,3
Compared with Intel® Xeon® processor E5 family-based servers, the Intel® Xeon Phi™ coprocessor delivers:
Increase server density with up to 8x more FLOPS per rack by adding Intel® Xeon Phi™ coprocessors to your Intel® Xeon® processor E5 family-based servers.1,3,10
Applications can support both Intel® Xeon® processors and Intel® Xeon Phi™ coprocessors which use common languages, models, and tools, including:
Leverage the compute flexibility of Intel® Xeon Phi™ coprocessors:
1. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations, and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information, go to www.intel.com/performance.
2. Claim based on calculated theoretical peak double precision performance capability for a single coprocessor. 16 DP FLOPS/clock/core * 61 cores * 1.238 GHz = 1.208 TeraFLOPS.
3. Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance.
4. 2 socket Intel® Xeon® processor E5-2670 server vs. a single Intel® Xeon Phi™ coprocessor 7120P (Intel measured DGEMM perf/watt score: 309 GF/s @ 335W vs. 829 GF/s @ 195W).
5. 2 socket Intel® Xeon® processor E5-2600 product family server vs. Intel® Xeon Phi™ coprocessor (2.52x: Measured by Los Alamos Labs June 2012. 2 socket E5-2687 (8 core, 3.1GHz) vs. 1 pre-production Intel® Xeon Phi™ coprocessor (60 cores, 1.0GHz) on a Molecular Dynamics application. Workload completion time of 4hr 7m 10s vs. 1hr 38m 16s) (2.53x: Measured by Sinopec October 2012. 2 socket E5-2680 (8 core, 2.7GHz) server without a coprocessor vs. same server with 2 pre-production Intel® Xeon Phi™ coprocessors (61 cores, 1.091GHz) on a Seismic Imaging application. Workload completion time of 1342 seconds vs. 528.6 seconds).
6. Intel does not control or audit the design or implementation of third party benchmarks or web sites referenced in this document. Intel encourages all of its customers to visit the referenced web sites or others where similar performance benchmarks are reported and confirm whether the referenced benchmarks are accurate and reflect performance of systems available for purchase.
7. Calculated theoretical peak double precision FLOPS (2 x Intel® Xeon® processor E5-2670; 8C, 2.6 GHz vs. 1 x Intel® Xeon Phi™ coprocessor 7120P; 61C, 1.238 GHz).
8. 2 socket Intel® Xeon® processor E5-2600 product family server vs. Intel® Xeon Phi™ coprocessor (2.2x: Measured by Intel October 2012. 2 socket E5-2670 (8 core, 2.6GHz) vs. 1 Intel® Xeon Phi™ coprocessor 7120P (61 cores, 1.238GHz) on STREAM Triad benchmark. 79.5 GB/s vs. 175 GB/s).
9. 2 socket Intel® Xeon® processor E5-2600 product family server vs. Intel® Xeon Phi™ coprocessor (10.75x: Measured by Intel October 2012. 2 socket E5-2670 (8 core, 2.6GHz) vs. 1 Intel® Xeon Phi™ coprocessor SE10P (61 cores, 1.1GHz) on a Single Precision Monte Carlo Simulation. 45,501 options/sec vs. 489,354 options/sec).
10. (Phi FLOPS/Rack) 2 socket Intel® Xeon® processor E5-2670 server vs. same 2 socket server with 2 Intel® Xeon Phi™ coprocessor 7120P installed (Calculated theoretical peak double precision FLOPS: 332.8 GF/s vs. (332.8 + (2 x 1208 GF/s))).
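
The arithmetic behind footnotes 2, 4, and 10 can be reproduced with a short sketch. It is illustrative only: every input is copied from those footnotes, the function and variable names are hypothetical, and the results are theoretical peaks or measured ratios as stated in the footnotes, not predictions for any particular workload.

```python
# Illustrative check of the figures quoted in footnotes 2, 4, and 10.
# All inputs come from those footnotes; the names below are hypothetical.


def peak_gflops(flops_per_clock_per_core: float, cores: int, clock_ghz: float) -> float:
    """Theoretical peak in GFLOPS = FLOPS/clock/core * cores * clock (GHz)."""
    return flops_per_clock_per_core * cores * clock_ghz


# Footnote 2: a single coprocessor with 16 DP FLOPS/clock/core, 61 cores, 1.238 GHz.
phi_peak_gflops = peak_gflops(16, 61, 1.238)

# Footnote 10: a 2 socket E5-2670 server (332.8 GF/s peak) with and without
# two coprocessors installed.
xeon_server_peak_gflops = 332.8
server_with_phi_gflops = xeon_server_peak_gflops + 2 * phi_peak_gflops
density_ratio = server_with_phi_gflops / xeon_server_peak_gflops

# Footnote 4: measured DGEMM throughput divided by power draw.
xeon_gflops_per_watt = 309 / 335
phi_gflops_per_watt = 829 / 195
perf_per_watt_ratio = phi_gflops_per_watt / xeon_gflops_per_watt

if __name__ == "__main__":
    print(f"Coprocessor peak: {phi_peak_gflops / 1000:.3f} TeraFLOPS")
    print(f"Peak FLOPS per server, with vs. without coprocessors: {density_ratio:.2f}x")
    print(f"DGEMM perf/watt, coprocessor vs. 2 socket server: {perf_per_watt_ratio:.2f}x")
```

The density ratio lands slightly above 8, which is consistent with the "up to 8x" wording used for the per-rack claim, assuming the same number of servers per rack in both configurations.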