Intel® Xeon Phi™ Product Family

Power your breakthrough innovations with the highly parallel processing of the Intel® Xeon Phi™ coprocessor. We have packed over a teraFLOPS of double-precision peak performance into every chip.

Throughput with SGEMM* and DGEMM*

The SGEMM* and DGEMM* benchmarks measure the ability of the processor and coprocessors to multiply two matrices using either single-precision (SGEMM) or double-precision (DGEMM) arithmetic. Several matrix sizes were tested, and the size that produced the best result on each platform was chosen.
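As a rough illustration of how a GEMM run is usually turned into a throughput figure, the sketch below times the multiplication of two n x n matrices and converts the elapsed time to GFLOPS using the conventional count of 2n^3 floating-point operations. It is a pure-Python stand-in, not the Intel MKL benchmark behind the published results, and the names `matmul` and `gemm_gflops` are hypothetical.

```python
import time


def matmul(a, b):
    """Naive product of two square matrices given as lists of rows."""
    bt = list(zip(*b))  # transpose b so each column can be iterated directly
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def gemm_gflops(n, seconds):
    """Throughput in GFLOPS for an n x n GEMM, counting 2 * n^3 operations."""
    return 2.0 * n ** 3 / seconds / 1e9


if __name__ == "__main__":
    n = 64  # tiny size so the pure-Python loop finishes quickly
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    start = time.perf_counter()
    matmul(a, b)
    elapsed = time.perf_counter() - start
    print(f"n={n}: {gemm_gflops(n, elapsed):.4f} GFLOPS")
```

The number printed here reflects the Python interpreter, not the hardware; only the conversion from matrix size and run time to GFLOPS carries over to a real benchmark.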

Intel measured as of December 2013

Intel® Xeon® processor platform:

Two-socket Intel® Server Board S2600CP software development platform: two Intel® Xeon® processor E5-2670 (20M cache, 2.6 GHz, 8.0 GT/s Intel® QuickPath Interconnect (Intel® QPI), 115 W thermal design power (TDP)), 64 GB memory at 1600 MHz, Red Hat Enterprise Linux* (RHEL) 6.3, Intel® Turbo Boost Technology on, Intel® Hyper-Threading Technology (Intel® HT Technology) off, Enhanced Intel SpeedStep® Technology enabled, HW and ACL prefetch on, C-state enabled: Performance mode; Intel® Compiler 14.0, Intel® Math Kernel Library (Intel® MKL) 11.1.1 (Intel Xeon processor results only).

Two-socket Intel Server Board S2600CP software development platform: two Intel Xeon processor E5-2697 v2 (30M cache, 2.7 GHz, 8.0 GT/s Intel QPI, 130 W TDP), 64 GB memory at 1866 MHz, RHEL 6.3, Intel Turbo Boost Technology on, Intel HT Technology off, Enhanced Intel SpeedStep Technology enabled, HW and ACL prefetch on, C-state enabled: Performance mode; Intel Compiler 14.0, Intel MKL 11.1.1 (Intel Xeon processor results only).


Platform hosting the coprocessor and coprocessor details:

Two-socket Intel Server Board S2600CP software development platform: two Intel Xeon processor E5-2697 v2 (12 core, 30M cache, 2.7 GHz, 8.0 GT/s Intel QPI, 130 W TDP), 64 GB memory at 1600 MHz, RHEL 6.4, Intel Turbo Boost Technology on, Intel HT Technology off, Enhanced Intel SpeedStep Technology enabled, Power: Performance mode

Intel® Xeon Phi™ coprocessor 7120D: 61 cores, 1.238 GHz, 16 memory channels, 16 GB memory at 5.5 GT/s, 270 W TDP C-step (Intel Turbo Boost Technology on)

Intel Xeon Phi coprocessor 7120P: 61 cores, 1.238 GHz, 16 memory channels, 16 GB memory at 5.5 GT/s, 300 W TDP C-step (Intel Turbo Boost Technology on)

Intel Xeon Phi coprocessor 5120D: 60 cores, 1.053 GHz, 16 memory channels, 8 GB memory at 5.5 GT/s, 245 W TDP C-step

Intel Xeon Phi coprocessor 5110P: 60 cores, 1.053 GHz, 16 memory channels, 8 GB memory at 5.0 GT/s, 225 W TDP B1-step

Intel Xeon Phi coprocessor 3120P: 57 cores, 1.098 GHz, 12 memory channels, 6 GB memory at 5.0 GT/s, 300 W TDP C-step
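The core counts and clock speeds listed above can be related to the "over a teraFLOPS" peak claim with a short calculation. The sketch below multiplies cores by clock by operations per cycle; the figure of 16 double-precision operations per core per cycle (an 8-wide vector fused multiply-add) is an assumption that does not appear in this document, and `peak_dp_gflops` is a hypothetical helper name.

```python
# Cores and clock speed (GHz) as listed in the coprocessor details.
COPROCESSORS = {
    "7120D": (61, 1.238),
    "7120P": (61, 1.238),
    "5120D": (60, 1.053),
    "5110P": (60, 1.053),
    "3120P": (57, 1.098),
}

# Assumption (not stated in this document): each core completes 16
# double-precision operations per cycle (8 vector lanes x fused multiply-add).
DP_FLOPS_PER_CYCLE = 16


def peak_dp_gflops(cores, ghz, flops_per_cycle=DP_FLOPS_PER_CYCLE):
    """Theoretical double-precision peak in GFLOPS."""
    return cores * ghz * flops_per_cycle


for name, (cores, ghz) in COPROCESSORS.items():
    print(f"{name}: {peak_dp_gflops(cores, ghz):.1f} GFLOPS")
```

Under that assumption every listed coprocessor comes out slightly above 1,000 GFLOPS. This is a theoretical ceiling, not a measured DGEMM result.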

 

Matrix sizes:

Platform      SGEMM            DGEMM            Source
7120D         30720 x 30720    15360 x 15360    Internal Intel Testing TR2039C
7120P         30720 x 30720    15360 x 15360    Internal Intel Testing TR2039C
5120D         15360 x 15360    15360 x 15360    Internal Intel Testing TR2039C
5110P         15360 x 15360    15360 x 15360    Internal Intel Testing TR2039C
3120P         15360 x 15360    14366 x 14366    Internal Intel Testing TR2039C
E5-2670       38912 x 38912    18944 x 18944    Internal Intel Testing TR 1390
E5-2697 v2    38400 x 38400    20480 x 20480    Internal Intel Testing TR 1391
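To put the matrix sizes in context, the sketch below estimates the memory needed to hold the three n x n matrices (A, B and the result C) of one GEMM run, at 4 bytes per element for single precision and 8 bytes for double precision. The helper name `gemm_footprint_gib` is hypothetical, and the calculation ignores any workspace the library itself allocates. The source does not say why particular sizes were chosen beyond "best result", so the figures are for comparison against the listed card memory only.

```python
SINGLE, DOUBLE = 4, 8  # bytes per element


def gemm_footprint_gib(n, bytes_per_element):
    """Memory for the three n x n matrices A, B and C of a GEMM, in GiB."""
    return 3 * n * n * bytes_per_element / 2 ** 30


# (label, matrix dimension, element size) taken from the matrix sizes above.
RUNS = [
    ("7120P SGEMM", 30720, SINGLE),
    ("7120P DGEMM", 15360, DOUBLE),
    ("3120P SGEMM", 15360, SINGLE),
    ("3120P DGEMM", 14366, DOUBLE),
]

for label, n, size in RUNS:
    print(f"{label} {n} x {n}: {gemm_footprint_gib(n, size):.2f} GiB")
```

Each of these footprints is below the memory capacity listed for the corresponding coprocessor.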

 

Software stack (Intel® Xeon Phi™ coprocessor results):

Intel® Manycore Platform Software Stack (Intel® MPSS) 3.1: (Flash*: 2.1.03.0386; SMC firmware: 1.15.4830, coprocessor OS: 2.6.38.8+mpss3.1; SMC Flash 1.15.4830); Intel® C++ Composer XE for Linux* 2013 SP1 update 1 (Intel® C++ Compiler 14.0.1, Intel® MKL: 11.1.1, Intel® Integrated Performance Primitives (Intel® IPP) 8.0.1, Intel® Threading Building Blocks (Intel® TBB) 4.2.1)

Additional information: 1 2 3 4 5

Product and Performance Information

1

Software and workloads used in performance tests may have been optimized for performance only on Intel® microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations, and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information, go to www.intel.com/performance.

2

Intel does not control or audit the design or implementation of third party benchmarks or websites referenced in this document. Intel encourages all of its customers to visit the referenced websites or others where similar performance benchmarks are reported and confirm whether the referenced benchmarks are accurate and reflect performance of systems available for purchase.

3

Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. See www.intel.com/content/www/us/en/processors/processor-numbers.html for details.

4

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel® microprocessors. These optimizations include SSE2 and SSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel.

Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel® microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product user and reference guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804

5

Different hardware architectures may require different source code. Results are based on Intel’s best efforts to use code optimized to run on all architectures and perform the same work. Future code optimizations may yield different results.
