
Intel® Xeon Phi™ Product Family

Highly parallel processing to power your breakthrough innovations

Intel® Xeon Phi™ Coprocessor

Power your breakthrough innovations with the highly parallel processing of the Intel® Xeon Phi™ coprocessor. We’ve packed over a teraFLOPS of double-precision peak performance into every chip.

Manufacturing Applications

The manufacturing vertical uses many software packages for static, dynamic, and structural analysis of a wide range of design problems. ANSYS Mechanical* is a widely used mechanical design software package, while MiniFE* is a finite element analysis tool.

Intel Measured as of May 2014

Configuration Details

ANSYS Mechanical (v15): (V145sp-5 Turbine SP5 Data Set)

Platform hosting the coprocessor and platform for two-socket Intel® Xeon® processor baseline: 

Two 8-core 2.6 GHz Intel Xeon processor E5-2670 (only 2 cores enabled for optimal performance vs. licensing costs)

64 GB DDR3-1600, 8.0 GT/s, OS version 2.6.32-279.el6.x86_64, Intel® Turbo Boost Technology enabled, Intel® Hyper-Threading Technology disabled

Coprocessor details

Intel® Xeon Phi™ coprocessor 7120A: 61 cores, 1.238 GHz, 16 memory channels, 16 GB memory at 5.5 GT/s, 300W thermal design power (TDP) C-step (Intel Turbo Boost Technology off, error-correcting code (ECC) on)

Flash version 2.1.03.0386, SMC Firmware 1.15.4830, SMC Boot Loader Version: 1.8.4326, Intel® Manycore Platform Software Stack (Intel® MPSS) 2.1.6720-15, uOS version: 2.6.38.8-g2593b11

Intel® Composer XE 12.1, Intel® MPI Library 4.1.1.036 (Update 1)

Source:  Intel Internal Testing  TR2083
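
The "over a teraFLOPS" peak figure can be cross-checked against the coprocessor specification listed above. The following is a back-of-the-envelope sketch, not Intel's published derivation. The core count and clock speed come from the configuration details, while the rate of 16 double-precision operations per core per cycle (eight 64-bit lanes in a 512-bit vector unit, counted twice for fused multiply-add) is an assumption that this page does not state. The function name `peak_dp_gflops` is hypothetical.

```python
# Theoretical peak double-precision throughput of one coprocessor.
# cores and clock_ghz are taken from the configuration details above.
# flops_per_cycle=16 is an ASSUMPTION (8 DP vector lanes x 2 for FMA),
# not a number stated on this page.

def peak_dp_gflops(cores: int, clock_ghz: float, flops_per_cycle: int = 16) -> float:
    """Return peak GFLOPS as cores x clock (GHz) x FLOPs per cycle per core."""
    return cores * clock_ghz * flops_per_cycle

xeon_phi_7120a = peak_dp_gflops(cores=61, clock_ghz=1.238)
print(f"Intel Xeon Phi 7120A peak: {xeon_phi_7120a:.1f} GFLOPS")
```

Under that assumption the estimate lands slightly above 1,200 GFLOPS, which is consistent with the teraFLOPS claim. Peak figures are an upper bound and say nothing about how much of that rate a given application sustains.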

 

Sandia Labs MiniFE Solver:

Platform hosting the coprocessor and platform for two-socket Intel® Xeon® processor baseline:

Two-socket Intel® Software Development Platform: 2x Intel Xeon processor E5-2697 v2 (12 core, 30M cache, 2.7 GHz, 8.0 GT/s Intel® QuickPath Interconnect (Intel® QPI), 130W TDP), 64 GB memory at 1600 MHz, Red Hat Enterprise Linux* (RHEL*) 6.4, Fabric: Mellanox MCX353A-FCAT Fourteen Data Rate (FDR) InfiniBand* 8x, firmware 2.30.3200

Coprocessor details:

Intel® Xeon Phi™ coprocessor 7120A: 61 cores, 1.238 GHz, 16 memory channels, 16 GB memory at 5.5 GT/s, 300W TDP C-step (Intel Turbo Boost Technology off, ECC on)

Software stack (Intel Xeon Phi coprocessor):

Intel® Manycore Platform Software Stack (Intel® MPSS) 6720-16 (Flash: 2.1.03.0386; coprocessor OS: 2.6.38.8-g2593b11)

Intel Composer XE 14.0.0 (build 20130728), Intel MPI Library 4.1.1.036

MiniFE version 2.0

 

 

Mantevo MiniFE (400 x 400 x 400 grid, 8 nodes), total time in CG solver:

2-socket Intel Xeon processor: 8.27 seconds

2-socket Intel Xeon processor + one Intel Xeon Phi coprocessor: 4.06 seconds

Source:  Intel Internal Testing TR2049A
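
The two MiniFE solver times reported above can be turned into a speedup factor and a percentage of time saved. The sketch below only restates the arithmetic on the published numbers. The variable names are illustrative, and nothing here is part of MiniFE itself.

```python
# MiniFE total CG solver time in seconds, copied from the results above
# (400 x 400 x 400 grid, 8 nodes).
baseline_s = 8.27          # 2-socket Intel Xeon processor
with_coprocessor_s = 4.06  # 2-socket Intel Xeon processor + one Intel Xeon Phi coprocessor

# Speedup is baseline time divided by accelerated time (higher is better).
solver_speedup = baseline_s / with_coprocessor_s

# Fraction of the baseline solver time that the coprocessor removes.
time_saved = 1.0 - with_coprocessor_s / baseline_s

print(f"Speedup: {solver_speedup:.2f}x")
print(f"Solver time reduced by {time_saved:.1%}")
```

The figure covers only the CG solver phase that the source reports, so it should not be read as a whole-application speedup.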

 

Princeton/GTC-P: (Version 2.0)

Platform hosting the coprocessor and platform for two-socket Intel® Xeon® processor baseline:

Two-socket Intel Software Development Platform: 2x Intel Xeon processor E5-2697 v2 (12 core, 30M cache, 2.7 GHz, 8.0 GT/s Intel QPI, 130W TDP, Intel Turbo Boost Technology on, Intel Hyper-Threading Technology on), 64 GB memory at 1600 MHz, RHEL 6.4

Coprocessor/GPU details:

Intel® Xeon Phi™ coprocessor 7120A: 61 cores, 1.238 GHz, 16 memory channels, 16 GB memory at 5.5 GT/s, 300W TDP C-step (Intel Turbo Boost Technology off, ECC on)

NVIDIA K20X*: 2688 SP cores at 732 MHz (896 DP cores), 6 GB memory (12 channels) at 5.2 GT/s, ECC on, GK110 die

Software stack (Intel Xeon Phi coprocessor):

Intel MPSS 2.1.6720-16 (Flash: 2.1.03.0386; coprocessor OS:  2.6.38.8-g2593b11)

Intel Composer XE 14.0.1.106 build 20131008, Intel MPI Library 4.1.1.036, CUDA* 5.5

 

 

GTC-P results (run time in seconds; lower is better):

NVIDIA K20X score (native): 39.57 seconds (1 node, workload A); N/A (4 nodes, workload B); N/A (8 nodes, workload B)

Two-socket Intel Xeon processor score: 26.723 seconds (1 node, workload A); 143.778 seconds (4 nodes, workload B); 72.91 seconds (8 nodes, workload B)

Two-socket Intel Xeon processor + Intel Xeon Phi coprocessor score: 23.93 seconds (1 node, workload A); 121.53 seconds (4 nodes, workload B); 65.94 seconds (8 nodes, workload B)

Intel Xeon Phi coprocessor score (native): 36.71 seconds (1 node, workload A); N/A (4 nodes, workload B); N/A (8 nodes, workload B)

 

Fabric:  Ethernet

Source: Intel Internal Testing TR2081
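
The GTC-P timings above support two simple derived metrics: the speedup from adding a coprocessor at each node count, and how well workload B scales from 4 to 8 nodes. The sketch below is illustrative arithmetic on the published numbers only. The helper names `speedup` and `scaling_efficiency` are hypothetical, and the efficiency calculation assumes that workload B is the same problem size at both node counts, which the source implies but does not state outright.

```python
# GTC-P run times in seconds, copied from the results above.
# Keys are node counts: 1 node ran workload A; 4 and 8 nodes ran workload B.
xeon_only = {1: 26.723, 4: 143.778, 8: 72.91}
xeon_plus_phi = {1: 23.93, 4: 121.53, 8: 65.94}

def speedup(baseline_s: float, accelerated_s: float) -> float:
    """Ratio of baseline time to accelerated time (higher is better)."""
    return baseline_s / accelerated_s

def scaling_efficiency(times: dict, small: int, large: int) -> float:
    """Strong-scaling efficiency going from `small` to `large` nodes.

    ASSUMPTION: the problem size is identical at both node counts.
    1.0 means run time fell exactly in proportion to the added nodes.
    """
    ideal = large / small
    return (times[small] / times[large]) / ideal

for nodes in sorted(xeon_only):
    s = speedup(xeon_only[nodes], xeon_plus_phi[nodes])
    print(f"{nodes} node(s): {s:.2f}x with coprocessor")

for label, times in (("Xeon only", xeon_only), ("Xeon + Xeon Phi", xeon_plus_phi)):
    eff = scaling_efficiency(times, small=4, large=8)
    print(f"{label}: {eff:.1%} scaling efficiency from 4 to 8 nodes")
```

The 1-node result uses workload A, so it is not comparable with the 4-node and 8-node runs and is left out of the scaling calculation.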

Additional information: 1 2 3 4 5

Product and Performance Information

1. Software and workloads used in performance tests may have been optimized for performance only on Intel® microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations, and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information, go to www.intel.com/performance.

2. Intel does not control or audit the design or implementation of third party benchmarks or websites referenced in this document. Intel encourages all of its customers to visit the referenced websites or others where similar performance benchmarks are reported and confirm whether the referenced benchmarks are accurate and reflect performance of systems available for purchase.

3. Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. See www.intel.com/content/www/us/en/processors/processor-numbers.html for details.

4. Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel® microprocessors. These optimizations include SSE2 and SSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel® microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product user and reference guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804

5. Different hardware architectures may require different source code. Results are based on Intel’s best efforts to use code optimized to run on all architectures and perform the same work. Future code optimizations may result in different results. Microprocessor-dependent optimizations in this product are intended for use with Intel® microprocessors. Certain optimizations not specific to Intel® microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product user and reference guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804