Building Software Acceleration Features in the Intel® Quick Assist Technology (Intel® QAT) Engine for OpenSSL* 1.1.1

Published: 11/12/2020  

Last Updated: 03/29/2021

By John P Mechalas

Updated 5/6/2021 with performance data for the Intel Xeon Scalable processor family.

Updated 3/29/2021 for release 0.6.5 of the Intel® Quick Assist Technology Engine for OpenSSL

Intel® Quick Assist Technology (Intel® QAT) has been expanded to provide software-based acceleration of cryptographic operations through instructions in the Intel® Advanced Vector Extensions 512 (Intel® AVX-512) family. This software-based acceleration has been incorporated into the Intel QAT Engine for OpenSSL*, a dynamically loadable module that uses the OpenSSL ENGINE framework, allowing administrators to add this capability to OpenSSL without having to rebuild or replace their existing OpenSSL libraries.

Software acceleration is provided for the following algorithms:

  • RSA with 2048, 3072, and 4096 bit keys
  • ECDH for the Montgomery Curve X25519 and NIST Prime Curves P-256 and P-384
  • ECDSA for the NIST Prime Curves P-256 and P-384
  • AES-GCM with 128, 192, and 256 bit keys

About This Guide

This guide steps you through the process of building the Intel QAT Engine for OpenSSL on the following Linux distributions, but it can be adapted to others.

  • CentOS* Linux* 8.2
  • Ubuntu* Server 20.04 LTS

Two build procedures are provided: one that uses the distribution-provided build of OpenSSL, and one that creates a customized installation using OpenSSL built from source. Each approach has its pros and cons, and you should choose the procedure that works best for your environment. Using the distribution-provided OpenSSL means less complexity, because you are running OpenSSL out of its standard system path, but it ties you to the specific version that is integrated with the OS. Building OpenSSL from source lets you control the version that you deploy independently of the distribution-provided build, which makes it possible to perform version updates as needed without disrupting system operations. This added flexibility comes at a cost, however: you'll need to add the OpenSSL binary directory to your PATH and update LD_LIBRARY_PATH to include the shared library directories for OpenSSL and its dependency libraries.

Each build option is covered in its own section below.

Using the Distribution-Provided OpenSSL

This section describes how to build the Intel QAT Engine for OpenSSL for your OS distribution's pre-packaged OpenSSL. If you want to build the engine for a custom build of OpenSSL that is made from source code, see the Building OpenSSL from Source section, below.

Build Requirements

To build the Intel QAT Engine for OpenSSL you'll need to ensure that your distribution's default version of OpenSSL is 1.1.1e or later, as the engine is not compatible with earlier releases. You can check your distribution's OpenSSL version by running:

openssl version
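The version requirement can also be checked in a script. The following is a minimal sketch, assuming GNU sort with version-sort support (-V); the function name openssl_version_ok is hypothetical and is not part of OpenSSL or the engine.

```shell
# Hypothetical helper: succeeds if the given OpenSSL version string is 1.1.1e or later.
# Relies on GNU sort's -V (version sort), which orders 1.1.1d < 1.1.1e < 1.1.1k.
openssl_version_ok() {
    min="1.1.1e"
    lowest=$(printf '%s\n%s\n' "$min" "$1" | sort -V | head -n 1)
    # If the minimum sorts first (or the two are equal), the version is new enough.
    [ "$lowest" = "$min" ]
}

# Example: the version is the second field of "openssl version".
# openssl_version_ok "$(openssl version | awk '{print $2}')" && echo "OpenSSL is new enough"
```

This only compares version ordering, so a 3.x release would also pass even though this guide targets the 1.1.1 series.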

You'll also need some prerequisite software packages in order to build both the engine and its dependencies.

Ubuntu 20.04 LTS

To build the QAT engine and its dependencies on Ubuntu, you’ll need to install the following packages from apt:

sudo apt install autoconf build-essential libtool cmake cpuid libssl-dev

The libssl-dev package provides the header files for OpenSSL, and ensures that the OpenSSL libraries are present.

You’ll also need to install version 2.15 or later of nasm*, which is not provided by default for the Ubuntu 20.04 distribution. You must fetch and install this package manually:

wget http://archive.ubuntu.com/ubuntu/pool/universe/n/nasm/nasm_2.15.04-1_amd64.deb
sudo dpkg -i nasm_2.15.04-1_amd64.deb

CentOS 8.2

CentOS requires some repository updates before the necessary software packages can be installed. Specifically, you'll need to add the Extra Packages for Enterprise Linux (EPEL) repository.

sudo dnf install epel-release
sudo dnf update

Now install the following packages using dnf:

sudo dnf group install "Development Tools"
sudo dnf install cpuid cmake openssl-devel

The openssl-devel package provides the header files for OpenSSL, and ensures that the OpenSSL libraries are present.

You’ll also need to install version 2.15 or later of nasm*, which is not provided by default for the CentOS 8.2 distribution. You must fetch and install this package manually:

wget https://www.nasm.us/pub/nasm/releasebuilds/2.15.02/linux/nasm-2.15.02-0.fc31.x86_64.rpm
sudo rpm -i nasm-2.15.02-0.fc31.x86_64.rpm

Runtime Requirements

To make use of the software acceleration features in the Intel QAT Engine for OpenSSL, you’ll need a system that supports Intel® AVX-512 with the following instruction set extensions:

  • AVX512F
  • AVX512_IFMA
  • VAES
  • VPCLMULQDQ

The latter two extensions were introduced with certain 10th Generation Intel® Core™ processors and 3rd Generation Intel® Xeon® Scalable processors (products formerly codenamed Ice Lake). A quick way to verify that your system supports the necessary features is to run the cpuid command. Run the following and check that the output matches.

$ cpuid -1 | egrep 'VAES|VPCLM|GFNI|AVX512F|AVX512IFMA'
      AVX512F: AVX-512 foundation instructions = true
      AVX512IFMA: fused multiply add           = true
      VAES instructions                        = true
      VPCLMULQDQ instruction                   = true

All features must be present.

These output fields are only present in cpuid version 20200211 or later. This is the default version provided in Ubuntu 20.04 and CentOS 8.2.
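If you want to automate this check, the sketch below reads cpuid output and reports any required feature that is not listed as true. The function name check_avx512_features is hypothetical, and the field labels it matches are taken from the sample output above, so it assumes a cpuid version that prints them in that form.

```shell
# Hypothetical helper: read "cpuid -1" output on stdin and succeed only if all
# four required features are reported as "= true".
check_avx512_features() {
    input=$(cat)
    missing=0
    for feature in 'AVX512F:' 'AVX512IFMA:' 'VAES instructions' 'VPCLMULQDQ instruction'; do
        if ! printf '%s\n' "$input" | grep -E "${feature}.*= true" >/dev/null; then
            echo "missing: ${feature}"
            missing=1
        fi
    done
    return "$missing"
}

# Example:
# cpuid -1 | check_avx512_features && echo "all required features present"
```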

Building the Intel QAT OpenSSL Engine for Software Acceleration

The software acceleration support in the Intel QAT Engine for OpenSSL depends on the following two libraries. They must be built first, but they may be built in any order:

  • Intel® Integrated Performance Primitives Cryptography (multi-buffer crypto library)
  • Intel® Multi-Buffer Crypto for IPsec Library

Once these libraries are installed, you can build the Intel Quick Assist Technology OpenSSL Engine.

We’ll step through how to build each one.

Building Intel® Integrated Performance Primitives Cryptography

First, check out the source code repository from GitHub*:

git clone https://github.com/intel/ipp-crypto.git
cd ipp-crypto

Ensure you are building against a fixed release of the code, and not the development branch. At the time of this writing, the latest release was 2020, update 3:

git checkout ippcp_2020u3

You only need to build the multi-buffer portion of the Intel IPP package, so change to the multi-buffer crypto library subdirectory. Then, prepare the build by running cmake:

cd sources/ippcp/crypto_mb
cmake . -Bbuild -DCMAKE_INSTALL_PREFIX=/usr

This will configure the library to install into /usr. To perform the full build, run:

cd build
make -j
sudo make install

This will put the shared library in /usr/lib, which means we won’t need to set LD_LIBRARY_PATH.

Building the Intel® Multi-Buffer Crypto for IPsec Library

First, check out the source code repository from GitHub:

git clone https://github.com/intel/intel-ipsec-mb.git
cd intel-ipsec-mb

Ensure you are building against a fixed release of the code, and not a development branch. At the time of this writing, the latest release was 0.55:

git checkout v0.55

There is no configuration step. Build the library using:

make -j SAFE_DATA=y SAFE_PARAM=y SAFE_LOOKUP=y

To install:

sudo make install NOLDCONFIG=y

This will place the shared libraries in /usr/lib, which again means no LD_LIBRARY_PATH modifications.

Building the Intel Quick Assist Technology (Intel QAT) Engine for OpenSSL

Check out the software repository from GitHub:

git clone https://github.com/intel/QAT_Engine.git
cd QAT_Engine

Next, ensure you are building against a fixed release of the code, and not a development branch. At the time of this writing, the latest release was 0.6.5:

git checkout v0.6.5

Intel QAT hardware devices take precedence over the multibuffer software implementations, and the engine does not support runtime selection of the operating mode at the current time. To use software acceleration on a system with Intel QAT hardware, the hardware offload support must be explicitly disabled in the library at compile time by adding the --disable-qat_hw configuration option.

To configure the Intel QAT Engine for OpenSSL for all software acceleration features:

./autogen.sh
./configure --enable-qat_sw
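On a system that also has Intel QAT hardware, the --disable-qat_hw option described above is added to the same configure command. The sketch below shows one way to select the flags in a build script; the function name qat_sw_configure_flags and its yes/no argument are hypothetical, and only the two configure options come from this guide.

```shell
# Hypothetical helper: print the configure flags for a software-acceleration build.
# $1 is "yes" if the build host has Intel QAT hardware, anything else otherwise.
qat_sw_configure_flags() {
    flags="--enable-qat_sw"
    if [ "$1" = "yes" ]; then
        # Hardware offload takes precedence, so it must be disabled at compile time.
        flags="$flags --disable-qat_hw"
    fi
    echo "$flags"
}

# Example (the unquoted expansion is intentional so that each flag is a separate word):
# ./configure $(qat_sw_configure_flags yes)
```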

To install, run:

make -j
sudo make install

Since this makes use of the system-provided OpenSSL installation, we won’t need to modify LD_LIBRARY_PATH to use the engine.

After the installation has completed, you should see the engine present in OpenSSL’s engine directory. On Ubuntu 20.04, this is /usr/lib/x86_64-linux-gnu/engines-1.1. For non-standard builds, the engine directory can be obtained by running “openssl version -e”:

$ openssl version -e
ENGINESDIR: "/usr/lib/x86_64-linux-gnu/engines-1.1"

Verify that the engine is present by running “ls”. You should see qatengine.so in the directory list:

$ ls -l /usr/lib/x86_64-linux-gnu/engines-1.1
total 396
-rw-r--r-- 1 root root  23104 Apr 20  2020 afalg.so
-rw-r--r-- 1 root root  14120 Apr 20  2020 capi.so
-rw-r--r-- 1 root root  26688 Apr 20  2020 padlock.so
-rwxr-xr-x 1 root root 334160 Sep 28 13:27 qatengine.so
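The two steps above can be combined so that the engine directory does not have to be typed by hand. This is a small sketch that assumes the ENGINESDIR line has the format shown above; the function name engines_dir is hypothetical.

```shell
# Hypothetical helper: read "openssl version -e" output on stdin and print the
# engine directory without the surrounding quotes.
engines_dir() {
    sed -n 's/^ENGINESDIR: "\(.*\)"$/\1/p'
}

# Example: list the engine in whatever directory this OpenSSL build uses.
# ls -l "$(openssl version -e | engines_dir)/qatengine.so"
```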

Post-Build Configuration

On CentOS 8.2, you'll need to update the dynamic linker cache:

sudo ldconfig
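To confirm that the dynamic linker can now see both dependency libraries, you can inspect the cache with "ldconfig -p". The sketch below is one way to do that. The library file name libcrypto_mb.so appears in the engine's error output later in this guide; libIPSec_MB.so is an assumption about the file name installed by the Intel Multi-Buffer Crypto for IPsec Library, and the function name is hypothetical.

```shell
# Hypothetical helper: read "ldconfig -p" output on stdin and report whether the
# two dependency libraries are present in the linker cache.
libs_in_linker_cache() {
    cache=$(cat)
    rc=0
    # libIPSec_MB.so is an assumed file name; adjust it if your build differs.
    for lib in libcrypto_mb.so libIPSec_MB.so; do
        if printf '%s\n' "$cache" | grep -F "$lib" >/dev/null; then
            echo "found: $lib"
        else
            echo "not in cache: $lib"
            rc=1
        fi
    done
    return "$rc"
}

# Example:
# ldconfig -p | libs_in_linker_cache
```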

Building OpenSSL from Source

This section describes how to build the Intel QAT Engine for OpenSSL when OpenSSL is built from source code. If you want to build the engine using your distribution's pre-packaged version of OpenSSL, see the Using the Distribution-Provided OpenSSL section, above.

Build Requirements

You'll need some prerequisite software packages to build OpenSSL, the Intel QAT Engine for OpenSSL, and the engine's dependencies.

Ubuntu 20.04 LTS

To build the Intel QAT Engine for OpenSSL and its dependencies on Ubuntu, you’ll need to install the following packages from apt:

sudo apt install autoconf build-essential libtool cmake cpuid

You’ll also need to install version 2.15 or later of nasm*, which is not provided by default for the Ubuntu 20.04 distribution. You must fetch and install this package manually:

wget http://archive.ubuntu.com/ubuntu/pool/universe/n/nasm/nasm_2.15.04-1_amd64.deb
sudo dpkg -i nasm_2.15.04-1_amd64.deb

CentOS 8.2

CentOS requires some repository updates before the necessary software packages can be installed. Specifically, you'll need to add the Extra Packages for Enterprise Linux (EPEL) repository.

sudo dnf install epel-release
sudo dnf update

Now install the following packages using dnf:

sudo dnf group install "Development Tools"
sudo dnf install cpuid cmake

You’ll also need to install version 2.15 or later of nasm*, which is not provided by default for the CentOS 8.2 distribution. You must fetch and install this package manually:

wget https://www.nasm.us/pub/nasm/releasebuilds/2.15.02/linux/nasm-2.15.02-0.fc31.x86_64.rpm
sudo rpm -i nasm-2.15.02-0.fc31.x86_64.rpm 

Runtime Requirements

To make use of the software acceleration features in the Intel QAT Engine for OpenSSL, you’ll need a system that supports Intel® AVX-512 with the following instruction set extensions:

  • AVX512F
  • AVX512_IFMA
  • VAES
  • VPCLMULQDQ

The latter two extensions were introduced with certain 10th Generation Intel® Core™ processors and 3rd Generation Intel® Xeon® Scalable processors (products formerly codenamed Ice Lake). A quick way to verify that your system supports the necessary features is to run the cpuid command. Run the following and check that the output matches.

$ cpuid -1 | egrep 'VAES|VPCLM|GFNI|AVX512F|AVX512IFMA'
      AVX512F: AVX-512 foundation instructions = true
      AVX512IFMA: fused multiply add           = true
      VAES instructions                        = true
      VPCLMULQDQ instruction                   = true

All features must be present.

These output fields are only present in cpuid version 20200211 or later. This is the default version provided in Ubuntu 20.04 and CentOS 8.2.

Choose a Directory Structure

Before we proceed with the build, we need to decide where to install the completed packages and libraries for both OpenSSL and the Intel QAT Engine for OpenSSL. Since the goal of building from source code is to produce a build that can be upgraded as needed without interfering with existing applications, we want a directory hierarchy that allows for parallel installations of multiple versions of the same tool. To keep the filesystem tidy, we'll place these packages in /opt using the following structure:

/opt/tool/version
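As an illustration, this layout can be expressed as a one-line helper for build scripts. The function name install_prefix is hypothetical; the tool names and versions in the comments are the ones used later in this guide.

```shell
# Hypothetical helper: print the installation prefix for a tool and version,
# following the /opt/tool/version layout.
install_prefix() {
    printf '/opt/%s/%s\n' "$1" "$2"
}

# The three prefixes used in this guide:
# install_prefix openssl 1.1.1k     -> /opt/openssl/1.1.1k
# install_prefix crypto_mb 2020u3   -> /opt/crypto_mb/2020u3
# install_prefix ipsec-mb 0.55      -> /opt/ipsec-mb/0.55
```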

Building OpenSSL v1.1.1

Fetch the most recent build of OpenSSL 1.1.1. You can either check out the source code repository from GitHub, or download one of the pre-packaged tarballs. We'll do the latter, and at the time of this writing the latest version was 1.1.1k:

wget https://www.openssl.org/source/openssl-1.1.1k.tar.gz
tar xf openssl-1.1.1k.tar.gz
cd openssl-1.1.1k

To configure OpenSSL, run the config program and set the --prefix and --openssldir options to our desired installation directory, which will be /opt/openssl/1.1.1k:

./config --prefix=/opt/openssl/1.1.1k --openssldir=/opt/openssl/1.1.1k

Then build and install:

make -j
sudo make install

Because we have installed OpenSSL into a non-standard build directory, we'll need to make some environment changes. To ensure you get this version of OpenSSL and not your system one, prepend the OpenSSL directory to your PATH variable:

export PATH=/opt/openssl/1.1.1k/bin:$PATH

You also need to set LD_LIBRARY_PATH in your environment to run the binary:

export LD_LIBRARY_PATH=/opt/openssl/1.1.1k/lib

To verify everything, run the following:

$ openssl version -v -e
OpenSSL 1.1.1k  25 Mar 2021
ENGINESDIR: "/opt/openssl/1.1.1k/lib/engines-1.1"

You should see the correct version, and ENGINESDIR should be pointing to your installation in /opt/openssl.

You can set these environment variables in scripts to ensure that OpenSSL runs from the intended location.

If you don't set LD_LIBRARY_PATH, OpenSSL will load the equivalent libraries from the distribution's default location, resulting in a mismatch of library and binary versions. This will prevent the QAT engine from loading in OpenSSL, and it can also cause sporadic runtime errors in OpenSSL itself.

Building the Intel QAT OpenSSL Engine for Software Acceleration

The software acceleration support in the Intel QAT Engine for OpenSSL depends on the following two libraries. They must be built first, but they may be built in any order:

  • Intel® Integrated Performance Primitives Cryptography (multi-buffer crypto library)
  • Intel® Multi-Buffer Crypto for IPsec Library

Once these libraries are installed, you can build the Intel Quick Assist Technology OpenSSL Engine.

We’ll step through how to build each one.

Building Intel® Integrated Performance Primitives Cryptography

First, check out the source code repository from GitHub*:

git clone https://github.com/intel/ipp-crypto.git
cd ipp-crypto

Ensure you are building against a fixed release of the code, and not the development branch. At the time of this writing, the latest release was 2020, update 3:

git checkout ippcp_2020u3

You only need to build the multi-buffer portion of the Intel IPP package, so change to the multi-buffer crypto library subdirectory.

This library needs to know where to find your OpenSSL installation, and it looks for the installation path in the environment variable OPENSSL_ROOT_DIR:

export OPENSSL_ROOT_DIR=/opt/openssl/1.1.1k/

Then, prepare the build by running cmake:

cd sources/ippcp/crypto_mb
cmake . -Bbuild -DCMAKE_INSTALL_PREFIX=/opt/crypto_mb/2020u3

Note that we are using crypto_mb as our tool name, since we aren't building the entire IPP package.

To perform the build, run:

cd build
make -j
sudo make install

We'll also need to update LD_LIBRARY_PATH to include this new library directory:

export LD_LIBRARY_PATH=/opt/openssl/1.1.1k/lib:/opt/crypto_mb/2020u3/lib

Building the Intel® Multi-Buffer Crypto for IPsec Library

First, check out the source code repository from GitHub:

git clone https://github.com/intel/intel-ipsec-mb.git
cd intel-ipsec-mb

Ensure you are building against a fixed release of the code, and not a development branch. At the time of this writing, the latest release was v0.55:

git checkout v0.55

There is no configuration step. Build the library using:

make -j SAFE_DATA=y SAFE_PARAM=y SAFE_LOOKUP=y

To install to our destination, define PREFIX when running "make install":

sudo make install NOLDCONFIG=y PREFIX=/opt/ipsec-mb/0.55

We need to add this new installation dir to LD_LIBRARY_PATH as well:

export LD_LIBRARY_PATH=/opt/openssl/1.1.1k/lib:/opt/crypto_mb/2020u3/lib:/opt/ipsec-mb/0.55/lib
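Because this variable has to be set in every shell that runs the custom OpenSSL, it can be convenient to build it from the installation prefixes in a small script. The following is a sketch only; the function name lib_path_for is hypothetical, and it assumes each prefix keeps its shared libraries in a lib subdirectory, as the three installations in this guide do.

```shell
# Hypothetical helper: join the lib directory of each given installation prefix
# into a colon-separated list suitable for LD_LIBRARY_PATH.
lib_path_for() {
    result=""
    for prefix in "$@"; do
        # Add a ":" separator only when the list is not empty.
        result="${result:+$result:}$prefix/lib"
    done
    echo "$result"
}

# Example, using the prefixes from this guide:
# export LD_LIBRARY_PATH="$(lib_path_for /opt/openssl/1.1.1k /opt/crypto_mb/2020u3 /opt/ipsec-mb/0.55)"
# export PATH=/opt/openssl/1.1.1k/bin:$PATH
```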

Building the Intel Quick Assist Technology Engine for OpenSSL

Check out the software repository from GitHub:

git clone https://github.com/intel/QAT_Engine.git
cd QAT_Engine

Next, ensure you are building against a fixed release of the code, and not a development branch. At the time of this writing, the latest release was 0.6.5:

git checkout v0.6.5

Intel QAT hardware devices take precedence over the multibuffer software implementations, and the engine does not support runtime selection of the operating mode at the current time. To use software acceleration on a system with Intel QAT hardware, the hardware offload support must be explicitly disabled in the library at compile time by adding the --disable-qat_hw configuration option.

To configure the Intel QAT OpenSSL engine for all software acceleration features, we need to do the following:

  • Set LDFLAGS and CPPFLAGS to ensure the Intel IPP Crypto and Multi-Buffer Library for IPSec libraries are found by configure. Though the option --with-qat_sw_install_dir is provided for this purpose, it requires both libraries to be installed in the same location.
  • Supply the location of our OpenSSL library via the --with-openssl_install_dir option.
  • Add the --with-openssl_dir option, which points to the OpenSSL source code. This will regenerate the error files from OpenSSL's source.

Here, we assume that you are building from your home directory. Be sure to update the path to match your build environment.

./autogen.sh
LDFLAGS="-L/opt/ipsec-mb/0.55/lib -L/opt/crypto_mb/2020u3/lib" \
CPPFLAGS="-I/opt/ipsec-mb/0.55/include -I/opt/crypto_mb/2020u3/include" \
./configure --prefix=/opt/openssl/1.1.1k \
    --with-openssl_install_dir=/opt/openssl/1.1.1k \
    --with-openssl_dir=$HOME/openssl-1.1.1k \
    --enable-qat_sw

To build and install, we also need to set PERL5LIB on the command line, so that Perl can find the configdata.pm file in the OpenSSL source directory. This step is necessary due to a bug in the QAT Engine build configuration, and it will be fixed in a future release.

PERL5LIB=$HOME/openssl-1.1.1k make -j
sudo PERL5LIB=$HOME/openssl-1.1.1k make install

After the installation has completed, you should see the engine present in OpenSSL’s engine directory:

$ ls -l /opt/openssl/1.1.1k/lib/engines-1.1/qatengine.*
-rwxr-xr-x 1 root root   1011 Mar 26 15:16 /opt/openssl/1.1.1k/lib/engines-1.1/qatengine.la
-rwxr-xr-x 1 root root 170520 Mar 26 15:16 /opt/openssl/1.1.1k/lib/engines-1.1/qatengine.so

Remember to set LD_LIBRARY_PATH and to prepend the OpenSSL binary directory to your PATH, or the following examples will fail.

Testing the Engine

Once the engine is in place, you can proceed with functionality tests.

The first test is to ensure the engine loads.

$ openssl engine -v qatengine
(qatengine) Reference implementation of QAT crypto engine(qat_sw) v0.6.5
     ENABLE_EXTERNAL_POLLING, POLL, ENABLE_HEURISTIC_POLLING, 
     GET_NUM_REQUESTS_IN_FLIGHT, INIT_ENGINE

If the above command instead returns errors such as the following, use the checklist after the error output that matches your build:

139667965596992:error:25066067:DSO support routines:dlfcn_load:could not load the shared library:../crypto/dso/dso_dlfcn.c:118:filename(/usr/lib/x86_64-linux-gnu/engines-1.1/qatengine.so): libcrypto_mb.so: cannot open shared object file: No such file or directory
139667965596992:error:25070067:DSO support routines:DSO_load:could not load the shared library:../crypto/dso/dso_lib.c:162:
139667965596992:error:260B6084:engine routines:dynamic_load:dso not found:../crypto/engine/eng_dyn.c:414:
139667965596992:error:2606A074:engine routines:ENGINE_by_id:no such engine:../crypto/engine/eng_list.c:334:id=qatengine

If you are using the distribution-provided OpenSSL

  • Make sure the Intel IPP Cryptography multi-buffer library and the Intel® Multi-Buffer Crypto for IPsec Library are both installed into /usr/lib. If you did not set a prefix for the former, it will install into /usr/local and you’ll need to set LD_LIBRARY_PATH in your environment.
  • If you're running on CentOS 8.2, make sure you have run ldconfig.

If you built OpenSSL from source

  • Make sure LD_LIBRARY_PATH is set to the paths for OpenSSL, the Intel IPP Cryptography multi-buffer library, and the Intel Multi-Buffer Crypto for IPsec Library. All three paths must be present.
  • Verify your installation paths and make sure they follow the /opt/tool/version structure.
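When working through these checks, it helps to know exactly which library the loader could not find. The sketch below pulls that name out of the error text; it assumes the message format shown above, and the function name missing_shared_lib is hypothetical.

```shell
# Hypothetical helper: read the error text from "openssl engine" on stdin and
# print the name of each shared library that could not be opened.
missing_shared_lib() {
    sed -n 's/.*: \([^ :]*\): cannot open shared object file.*/\1/p'
}

# Example:
# openssl engine -v qatengine 2>&1 | missing_shared_lib
```

For the error output shown above, this prints libcrypto_mb.so. Running ldd on qatengine.so and looking for "not found" entries is another way to list unresolved dependencies.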

Assuming the engine loads correctly, you can test the software acceleration for each of the enabled algorithms. To do that, we'll run “openssl speed” on the individual algorithms and compare the engine performance to the baseline.

On Intel Xeon Scalable processor families, you must use processor affinity (also known as CPU pinning) to bind these processes to a core. The multibuffer implementations make use of AVX-512 features that produce internal power transitions, and if the CPU scheduler moves these jobs to other cores during execution then multiple power transitions will occur, countering performance gains. Most server applications support CPU affinity masks in some form, but for “openssl speed” we must rely on the taskset command to do this for us.
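The taskset command takes a hexadecimal bitmask in which bit N selects logical CPU N, so the mask 0x1 used in the commands that follow pins the process to CPU 0. As a sketch, the mask for any single CPU can be computed as shown here; the function name core_mask is hypothetical.

```shell
# Hypothetical helper: print the taskset bitmask that selects a single logical CPU.
# Bit N of the mask corresponds to CPU N, so CPU 0 -> 0x1 and CPU 3 -> 0x8.
core_mask() {
    printf '0x%x\n' "$((1 << $1))"
}

# Example: run the RSA baseline pinned to logical CPU 2.
# taskset "$(core_mask 2)" openssl speed rsa2048
```

taskset also accepts a CPU list through its -c option (for example, "taskset -c 2"), which avoids the mask arithmetic.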

RSA

The RSA acceleration makes use of an asynchronous scheduling algorithm, which Intel calls multi-buffer, that processes multiple operations in parallel. To test the accelerator performance, you must supply the -async_jobs parameter to “openssl speed”. On current Intel architectures, 8 asynchronous jobs deliver optimal performance.

While both sign and verify operations are accelerated, the largest gains are in signing. This translates to performance gains for servers processing TLS handshakes with RSA certificates.

2048-bit keys

Baseline

taskset 0x1 openssl speed rsa2048

Intel QAT Engine for OpenSSL

taskset 0x1 openssl speed -engine qatengine -async_jobs 8 rsa2048

3072-bit keys

Baseline

taskset 0x1 openssl speed rsa3072

Intel QAT Engine for OpenSSL

taskset 0x1 openssl speed -engine qatengine -async_jobs 8 rsa3072

4096-bit keys

Baseline

taskset 0x1 openssl speed rsa4096

Intel QAT Engine for OpenSSL

taskset 0x1 openssl speed -engine qatengine -async_jobs 8 rsa4096

ECDH

Like RSA, the ECDH acceleration makes use of an asynchronous scheduling algorithm.

Montgomery EC Curve X25519

Baseline

taskset 0x1 openssl speed ecdhx25519

Intel QAT Engine for OpenSSL

taskset 0x1 openssl speed -engine qatengine -async_jobs 8 ecdhx25519

NIST Curve P-256

Baseline

taskset 0x1 openssl speed ecdhp256

Intel QAT Engine for OpenSSL

taskset 0x1 openssl speed -engine qatengine -async_jobs 8 ecdhp256

NIST Curve P-384

Baseline

taskset 0x1 openssl speed ecdhp384

Intel QAT Engine for OpenSSL

taskset 0x1 openssl speed -engine qatengine -async_jobs 8 ecdhp384

ECDSA

The ECDSA algorithms also make use of asynchronous scheduling.

NIST Curve P-256

Baseline

taskset 0x1 openssl speed ecdsap256

Intel QAT Engine for OpenSSL

taskset 0x1 openssl speed -engine qatengine -async_jobs 8 ecdsap256

NIST Curve P-384

Baseline

taskset 0x1 openssl speed ecdsap384

Intel QAT Engine for OpenSSL

taskset 0x1 openssl speed -engine qatengine -async_jobs 8 ecdsap384

AES GCM Encryption

The AES GCM encryption acceleration is a purely vectorized implementation of the respective EVP ciphers. Key sizes of 128, 192, and 256 bits are supported, but 192-bit encryption is almost never used in practice.

The “openssl speed” command reports performance for several block sizes, but real-world applications tend to use buffers that are 8k or larger for increased efficiency and performance. This is also where the largest gains are seen.

128-bit keys

Baseline

taskset 0x1 openssl speed -evp aes-128-gcm

Intel QAT Engine for OpenSSL

taskset 0x1 openssl speed -engine qatengine -evp aes-128-gcm

256-bit keys

Baseline

taskset 0x1 openssl speed -evp aes-256-gcm

Intel QAT Engine for OpenSSL

taskset 0x1 openssl speed -engine qatengine -evp aes-256-gcm
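The baseline and engine commands in this section all follow the same pattern, so they can be generated rather than typed individually. The following sketch prints every command pair listed above for a given taskset mask; the function name print_speed_commands is hypothetical, and the algorithm names and options are the ones used in this section.

```shell
# Hypothetical helper: print the baseline and engine "openssl speed" commands
# from this section. $1 is the taskset mask (defaults to 0x1).
print_speed_commands() {
    mask="${1:-0x1}"
    # Public-key algorithms use the asynchronous multi-buffer path with 8 jobs.
    for alg in rsa2048 rsa3072 rsa4096 ecdhx25519 ecdhp256 ecdhp384 ecdsap256 ecdsap384; do
        echo "taskset $mask openssl speed $alg"
        echo "taskset $mask openssl speed -engine qatengine -async_jobs 8 $alg"
    done
    # AES-GCM is tested through the EVP interface without -async_jobs.
    for cipher in aes-128-gcm aes-256-gcm; do
        echo "taskset $mask openssl speed -evp $cipher"
        echo "taskset $mask openssl speed -engine qatengine -evp $cipher"
    done
}

# Example: review the list, then run it by piping to a shell.
# print_speed_commands 0x1
# print_speed_commands 0x1 | sh
```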

Typical Performance Gains

The performance of each algorithm, as measured by “openssl speed”, will vary based on your hardware, system configuration, and BIOS settings. There will also be variations between runs due to fluctuations in normal system activity.

Client System Performance

The results given below are typical values, obtained from the system described in Table 1.

Table 1: Client system configuration

| Component | Details |
|---|---|
| System | Dell Inc.* XPS* 13 7390 2-in-1 Laptop |
| CPU | Intel® Core™ i7-1065G7 (4 cores, 8 threads) @ 1.30 GHz |
| CPU features | Intel® Hyper-Threading Technology enabled; Intel® Turbo Boost Technology 2.0 disabled |
| Memory | 16 GB (2x 8GB) LPDDR4 SDRAM 3733 MT/s |
| Storage | 512 GB M.2 NVMe SSD |
| OS | Ubuntu 20.04 LTS |

Note that Intel® Turbo Boost Technology 2.0 was disabled for these runs so that the performance gains shown came solely from the Intel QAT Engine for OpenSSL, and not from variations in clock speed.

This is the complete output from OpenSSL for RSA using 2048-bit keys, without the Intel QAT Engine:

$ openssl speed rsa2048
Doing 2048 bits private rsa's for 10s: 5799 2048 bits private RSA's in 10.00s
Doing 2048 bits public rsa's for 10s: 202497 2048 bits public RSA's in 10.00s
OpenSSL 1.1.1k  25 Mar 2021
built on: Fri Mar 26 22:09:23 2021 UTC
options:bn(64,64) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr) 
compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -Wall -O3 -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DNDEBUG
                  sign    verify    sign/s verify/s
rsa 2048 bits 0.001724s 0.000049s    579.9  20249.7

This is the output with the Intel QAT Engine for OpenSSL:

$ openssl speed -engine qatengine -async_jobs 8 rsa2048
engine "qatengine" set.
Doing 2048 bits private rsa's for 10s: 26640 2048 bits private RSA's in 9.99s
Doing 2048 bits public rsa's for 10s: 577880 2048 bits public RSA's in 9.37s
OpenSSL 1.1.1k  25 Mar 2021
built on: Fri Mar 26 22:09:23 2021 UTC
options:bn(64,64) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr) 
compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -Wall -O3 -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DNDEBUG
                  sign    verify    sign/s verify/s
rsa 2048 bits 0.000375s 0.000016s   2666.7  61673.4

The number of sign operations per second jumps from 579.9 to 2666.7, showing a roughly 4.6x performance gain when using the Intel QAT Engine for OpenSSL.
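The same arithmetic can be scripted when comparing several runs. This sketch extracts the sign/s column from the RSA result line and divides the engine figure by the baseline; both function names are hypothetical, and the parsing assumes the result-line layout shown in the output above.

```shell
# Hypothetical helper: read "openssl speed rsaNNNN" output on stdin and print
# the sign/s value (the second-to-last field of the "rsa NNNN bits" line).
rsa_signs_per_sec() {
    awk '/^rsa +[0-9]+ bits/ { print $(NF-1) }'
}

# Hypothetical helper: print engine/baseline as a ratio with two decimals.
# $1 is the baseline value, $2 is the engine value.
speedup() {
    awk -v base="$1" -v eng="$2" 'BEGIN { printf "%.2f\n", eng / base }'
}

# Example with the figures from the two runs above:
# speedup 579.9 2666.7    # prints 4.60
```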

The results for all the enabled algorithms are provided in Table 2.

Table 2: Results from "openssl speed" (client system)

| Algorithm | Operation | Baseline | Intel QAT Engine for OpenSSL | Measure | Speedup |
|---|---|---|---|---|---|
| RSA 2048 | sign | 579.9 | 2666.7 | signs/sec | 4.60x |
| RSA 3072 | sign | 194 | 673.3 | signs/sec | 3.47x |
| RSA 4096 | sign | 88 | 348.5 | signs/sec | 3.96x |
| ECDH X25519 | n/a | 10796 | 49200 | ops/sec | 4.56x |
| ECDH P-256 | n/a | 7191 | 22783 | ops/sec | 3.17x |
| ECDH P-384 | n/a | 417 | 5939.4 | ops/sec | 14.24x |
| ECDSA P-256 | sign | 17059 | 44524.4 | signs/sec | 2.61x |
| ECDSA P-384 | sign | 394 | 13905.2 | signs/sec | 35.29x |
| AES-128-GCM | encrypt (8k blocks) | 2368791 | 4723040.3 | kB/sec | 1.99x |
| AES-256-GCM | encrypt (8k blocks) | 2013831 | 4093460.5 | kB/sec | 2.03x |

On the target system, the performance gains in RSA signing range from roughly 3.5x to 4.6x. The gains in ECDH X25519 operations exceed 4x. AES-GCM shows a 2x gain for both 128- and 256-bit keys. The NIST Curve P-384 algorithms show much larger gains (14x to 35x) than the NIST Curve P-256 algorithms, which improve by only 2.6x to 3.2x. This is because the starting points for P-384 were general-purpose software implementations with no other code optimizations. All of the other algorithms began with AVX2 code paths, so there was significantly more room for improvement in the P-384 code.

Server System Performance

Certain SKUs of 3rd Generation Intel Xeon Scalable processors contain two Fused Multiply-Add (FMA) units instead of one, which translates to better performance. The results given below are typical values for the system described in Table 3, which has two FMA units.

Table 3: Server system configuration

| Component | Details |
|---|---|
| System | Intel® Server Board M50CYP Family |
| CPU | 2x Intel® Xeon® Platinum 8368 CPU @ 2.40GHz |
| CPU features | Intel® Hyper-Threading Technology enabled; Intel® Turbo Boost Technology 2.0 disabled |
| Memory | 64 GB (4x 16GB) DDR4 Registered SDRAM 3200 MT/s |
| Storage | 960 GB M.2 NVMe SSD |
| OS | Ubuntu 20.04 LTS |

Table 4: Results from "openssl speed" (server system)

| Algorithm | Operation | Baseline | Intel QAT Engine for OpenSSL | Measure | Speedup |
|---|---|---|---|---|---|
| RSA 2048 | sign | 1134.6 | 6908.3 | signs/sec | 6.09x |
| RSA 3072 | sign | 371.2 | 1294.6 | signs/sec | 3.49x |
| RSA 4096 | sign | 167.6 | 802.1 | signs/sec | 4.79x |
| ECDH X25519 | n/a | 20158.6 | 120003.8 | ops/sec | 5.95x |
| ECDH P-256 | n/a | 13490.5 | 44741.4 | ops/sec | 3.32x |
| ECDH P-384 | n/a | 803.6 | 12882.6 | ops/sec | 16.03x |
| ECDSA P-256 | sign | 32103.9 | 81653.3 | signs/sec | 2.54x |
| ECDSA P-384 | sign | 779.4 | 29880.2 | signs/sec | 38.34x |
| AES-128-GCM | encrypt (8k blocks) | 4374242 | 9467142 | kB/sec | 2.16x |
| AES-256-GCM | encrypt (8k blocks) | 3721579 | 8326487 | kB/sec | 2.24x |

Note that “openssl speed” measures the raw performance of the algorithm itself. The performance of real-world applications will vary depending on the workload and other factors. Configuring an application to use the QAT engine is an application-specific procedure, and at minimum, it requires that the application support OpenSSL’s asynchronous interface. Consult your application’s documentation for guidance.

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.