Building Software Acceleration Features in the Intel® Quick Assist Technology (Intel® QAT) Engine for OpenSSL* 1.1.1

ID 658256
Updated 9/16/2022
Version Latest
Public

Intel® Quick Assist Technology (Intel® QAT) has been expanded to provide software-based acceleration of cryptographic operations through instructions in the Intel® Advanced Vector Extensions 512 (Intel® AVX-512) family. This software-based acceleration has been incorporated into the Intel QAT Engine for OpenSSL*, a dynamically loadable module that uses the OpenSSL ENGINE framework, allowing administrators to add this capability to OpenSSL without having to rebuild or replace their existing OpenSSL libraries.

Software acceleration is provided for the following algorithms:

  • RSA with 2048, 3072, and 4096-bit keys
  • ECDH for the Montgomery Curve X25519 and NIST Prime Curves P-256 and P-384
  • ECDSA for the NIST Prime Curves P-256 and P-384
  • AES-GCM with 128, 192, and 256-bit keys

About This Guide

This guide steps you through the process of building the Intel QAT Engine for OpenSSL on the following Linux distributions, but it can be adapted to others.

  • Ubuntu* Server 20.04 LTS
  • Rocky* Linux* 8.6

Building OpenSSL

Two build procedures are provided: one that uses the distribution-provided build of OpenSSL, and one that creates a customized installation using OpenSSL built from source. Each methodology has its pros and cons, and you should choose the procedure that works best for your environment. Using the distribution-provided OpenSSL means less complexity as you are running OpenSSL out of its standard system path, but it ties you to a specific version that is integrated with the OS. Building OpenSSL from source lets you control the version that you deploy independent of the distribution-provided build, and that makes it possible to perform version updates as needed without disrupting system operations. This added flexibility comes at a cost, however, as you'll need to add the OpenSSL binary directory to your PATH and update LD_LIBRARY_PATH to include the shared library directories for OpenSSL and its dependent libraries.

Follow the section below that matches the desired build option.

Using the Distribution-Provided OpenSSL

This section describes how to build the Intel QAT Engine for OpenSSL for your OS distribution's pre-packaged OpenSSL. If you want to build the engine for a custom build of OpenSSL that is made from source code, skip to the Building OpenSSL from Source section.

Build Requirements

To build the Intel QAT Engine for OpenSSL you'll need to ensure that your distribution's default version of OpenSSL is 1.1.1e or later, as the engine is not compatible with earlier releases. You can check your distribution's OpenSSL version by running:

openssl version
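
The version check can also be scripted. The following is a minimal sketch, assuming GNU sort with the -V (version sort) option is available; the function name openssl_at_least_111e is hypothetical and used only for illustration.

```shell
# Hypothetical helper: succeeds if the given version string is 1.1.1e or later.
# Assumes GNU sort -V, which orders strings such as 1.1.1d < 1.1.1e < 1.1.1f.
openssl_at_least_111e() {
    lowest=$(printf '%s\n%s\n' "1.1.1e" "$1" | sort -V | head -n 1)
    [ "$lowest" = "1.1.1e" ]
}

# Example: check the installed OpenSSL, if one is on the PATH.
# The version is the second field of the "openssl version" output.
if command -v openssl >/dev/null 2>&1; then
    installed=$(openssl version | awk '{print $2}')
    if openssl_at_least_111e "$installed"; then
        echo "OpenSSL $installed is new enough"
    else
        echo "OpenSSL $installed is too old; 1.1.1e or later is required"
    fi
fi
```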

You'll also need some prerequisite software packages to build both the engine and its dependencies.

Ubuntu 20.04 LTS

To build the QAT engine and its dependencies on Ubuntu, you’ll need to install the following packages from apt:

sudo apt install autoconf build-essential libtool cmake cpuid libssl-dev pkg-config

The libssl-dev package provides the header files for OpenSSL and ensures that the OpenSSL libraries are present.

You’ll also need to install version 2.15 or later of nasm*, which is not provided by default for the Ubuntu 20.04 distribution. You must fetch and install this package manually:

wget http://archive.ubuntu.com/ubuntu/pool/universe/n/nasm/nasm_2.15.05-1_amd64.deb
sudo dpkg -i nasm_2.15.05-1_amd64.deb

Rocky Linux 8.6

Rocky Linux requires some repository updates before the necessary software packages can be installed. Specifically, you'll need to add the Extra Packages for Enterprise Linux (EPEL) repository.

sudo dnf install epel-release
sudo dnf update

Now install the following packages using dnf:

sudo dnf group install "Development Tools"
sudo dnf install wget cpuid cmake openssl-devel

The openssl-devel package provides the header files for OpenSSL and ensures that the OpenSSL libraries are present.

You’ll also need to install version 2.15 or later of nasm*, which is not provided by default for the Rocky Linux 8.6 distribution. You must fetch and install this package manually:

wget https://www.nasm.us/pub/nasm/releasebuilds/2.15.05/linux/nasm-2.15.05-0.fc31.x86_64.rpm
sudo rpm -i nasm-2.15.05-0.fc31.x86_64.rpm

Runtime Requirements

To make use of the software acceleration features in the Intel QAT Engine for OpenSSL, you’ll need a system that supports Intel® AVX-512 with the following instruction set extensions:

  • AVX512F
  • AVX512_IFMA
  • VAES
  • VPCLMULQDQ

The latter two extensions were introduced with certain 10th Generation Intel® Core™ processors and 3rd Generation Intel® Xeon® Scalable processors (products formerly codenamed Ice Lake). A quick way to verify that your system supports the necessary features is to run the cpuid command. Run the following and check that the output matches.

$ cpuid -1 | egrep 'VAES|VPCLM|AVX512F|AVX512IFMA'
      AVX512F: AVX-512 foundation instructions = true
      AVX512IFMA: fused multiply add           = true
      VAES instructions                        = true
      VPCLMULQDQ instruction                   = true

All features must be present.

These output fields are only present in cpuid version 20200211 or later. This is the default version provided in Ubuntu 20.04.
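
Rather than comparing the output by eye, the check can be scripted. This is a sketch only: it assumes the cpuid output format shown above, and the function name check_qat_sw_features is hypothetical.

```shell
# Hypothetical helper: reads "cpuid -1" output on stdin, prints each required
# feature that is not reported as "= true", and returns non-zero if any is missing.
check_qat_sw_features() {
    input=$(cat)
    missing=0
    for feature in 'AVX512F:' 'AVX512IFMA:' 'VAES instructions' 'VPCLMULQDQ instruction'; do
        if ! printf '%s\n' "$input" | grep -E "$feature.*= true" >/dev/null; then
            echo "missing: $feature"
            missing=1
        fi
    done
    return "$missing"
}

# Example: run the check only if the cpuid tool is installed.
if command -v cpuid >/dev/null 2>&1; then
    if cpuid -1 | check_qat_sw_features; then
        echo "all required features are present"
    fi
fi
```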

Building the Intel QAT OpenSSL Engine for Software Acceleration

The software acceleration support in the Intel QAT Engine for OpenSSL depends on the following two libraries. They must be built first, but they may be built in any order:

  • Intel® Integrated Performance Primitives Cryptography (only its multi-buffer crypto_mb library is needed)
  • Intel® Multi-Buffer Crypto for IPsec Library

Once these libraries are installed, you can build the Intel QAT Engine for OpenSSL.

We’ll step through how to build each one.

Building Intel® Integrated Performance Primitives Cryptography

First, check out the source code repository from GitHub*:

git clone https://github.com/intel/ipp-crypto.git
cd ipp-crypto

Ensure you are building against a fixed release of the code, and not the development branch. At the time of this writing, the latest release was ippcp_2021.6:

git checkout ippcp_2021.6

You only need to build the multi-buffer portion of the Intel IPP package, so change to the multi-buffer crypto library subdirectory. Then, prepare the build by running cmake:

cd sources/ippcp/crypto_mb
cmake . -Bbuild -DCMAKE_INSTALL_PREFIX=/usr

This will configure the library to install into /usr. To perform the full build, run:

cd build
make -j
sudo make install

This will put the shared library in /usr/lib, which means we won’t need to set LD_LIBRARY_PATH.
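
Note that GNU make treats -j with no number as "no limit on parallel jobs", which can exhaust memory on smaller systems. A bounded job count is one alternative. The sketch below assumes the coreutils nproc command is available; the helper name job_count is hypothetical.

```shell
# Hypothetical helper: prints the number of parallel make jobs to use,
# one per available CPU, falling back to 1 if nproc is not installed.
job_count() {
    nproc 2>/dev/null || echo 1
}

# Print the command rather than running it, so this is safe to try anywhere.
echo "make -j$(job_count)"
```

The same substitution applies to every "make -j" step in this guide.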

Building the Intel® Multi-Buffer Crypto for IPsec Library

First, check out the source code repository from GitHub:

git clone https://github.com/intel/intel-ipsec-mb.git
cd intel-ipsec-mb

Ensure you are building against a fixed release of the code, and not a development branch. At the time of this writing, the latest release was v1.2:

git checkout v1.2

There is no configuration step. Build the library using:

make -j

To install:

sudo make install NOLDCONFIG=y

This will place the shared libraries in /usr/lib, which again means no LD_LIBRARY_PATH modifications.

Building the Intel Quick Assist Technology (Intel QAT) Engine for OpenSSL

Check out the software repository from GitHub:

git clone https://github.com/intel/QAT_Engine.git
cd QAT_Engine

Next, ensure you are building against a fixed release of the code, and not a development branch. At the time of this writing, the latest release was v0.6.15:

git checkout v0.6.15

To configure the Intel QAT Engine for OpenSSL for all software acceleration features:

./autogen.sh
./configure --enable-qat_sw

Then build and install:

make -j
sudo make install

Since this makes use of the distribution-provided OpenSSL installation, we won’t need to modify LD_LIBRARY_PATH to use the engine.

After the installation has completed, you should see the engine present in OpenSSL’s engine directory. In Ubuntu 20.04, this is in /usr/lib/x86_64-linux-gnu/engines-1.1. For non-standard builds, the engine directory can be obtained by running “openssl version” with the -e option:

$ openssl version -e
ENGINESDIR: "/usr/lib/x86_64-linux-gnu/engines-1.1"
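
The engine directory can also be extracted in a script, for example to check for qatengine.so without hard-coding the path. This is a minimal sketch that assumes the ENGINESDIR output format shown above; the function name engines_dir is hypothetical.

```shell
# Hypothetical helper: reads "openssl version -e" output on stdin and prints
# the directory named on the ENGINESDIR line, without the surrounding quotes.
engines_dir() {
    sed -n 's/^ENGINESDIR: "\(.*\)"$/\1/p'
}

# Example: report whether the engine is installed, if openssl is on the PATH.
if command -v openssl >/dev/null 2>&1; then
    dir=$(openssl version -e | engines_dir)
    if [ -f "$dir/qatengine.so" ]; then
        echo "qatengine.so found in $dir"
    else
        echo "qatengine.so not found in $dir"
    fi
fi
```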

Verify that the engine is present by running “ls”. You should see qatengine.so in the directory list:

$ ls -l /usr/lib/x86_64-linux-gnu/engines-1.1
total 396
-rw-r--r-- 1 root root  23104 Apr 20  2020 afalg.so
-rw-r--r-- 1 root root  14120 Apr 20  2020 capi.so
-rw-r--r-- 1 root root  26688 Apr 20  2020 padlock.so
-rwxr-xr-x 1 root root 334160 Sep 28 13:27 qatengine.so

Post-Build Configuration

On Rocky Linux 8.6, you'll need to update the dynamic linker cache:

sudo ldconfig

 

Building OpenSSL from Source

This section describes how to build the Intel QAT Engine for OpenSSL when OpenSSL is built from source code. If you want to build the engine using your distribution's pre-packaged version of OpenSSL, see the Using the Distribution-Provided OpenSSL section, above.

Build Requirements

You'll need some prerequisite software packages to build OpenSSL, the Intel QAT Engine for OpenSSL, and the engine's dependencies.

Ubuntu 20.04 LTS

To build the Intel QAT Engine for OpenSSL and its dependencies on Ubuntu, you’ll need to install the following packages from apt:

sudo apt install autoconf build-essential libtool cmake cpuid pkg-config

You’ll also need to install version 2.15 or later of nasm*, which is not provided by default for the Ubuntu 20.04 distribution. You must fetch and install this package manually:

wget http://archive.ubuntu.com/ubuntu/pool/universe/n/nasm/nasm_2.15.05-1_amd64.deb
sudo dpkg -i nasm_2.15.05-1_amd64.deb

Rocky Linux 8.6

Rocky Linux requires some repository updates before the necessary software packages can be installed. Specifically, you'll need to add the Extra Packages for Enterprise Linux (EPEL) repository.

sudo dnf install epel-release
sudo dnf update

Now install the following packages using dnf:

sudo dnf group install "Development Tools"
sudo dnf install wget cpuid cmake

You’ll also need to install version 2.15 or later of nasm*, which is not provided by default for the Rocky Linux 8.6 distribution. You must fetch and install this package manually:

wget https://www.nasm.us/pub/nasm/releasebuilds/2.15.05/linux/nasm-2.15.05-0.fc31.x86_64.rpm
sudo rpm -i nasm-2.15.05-0.fc31.x86_64.rpm

Runtime Requirements

To make use of the software acceleration features in the Intel QAT Engine for OpenSSL, you’ll need a system that supports Intel® AVX-512 with the following instruction set extensions:

  • AVX512F
  • AVX512_IFMA
  • VAES
  • VPCLMULQDQ

The latter two extensions were introduced with certain 10th Generation Intel® Core™ processors and 3rd Generation Intel® Xeon® Scalable processors (products formerly codenamed Ice Lake). A quick way to verify that your system supports the necessary features is to run the cpuid command. Run the following and check that the output matches.

$ cpuid -1 | egrep 'VAES|VPCLM|AVX512F|AVX512IFMA'
      AVX512F: AVX-512 foundation instructions = true
      AVX512IFMA: fused multiply add           = true
      VAES instructions                        = true
      VPCLMULQDQ instruction                   = true

All features must be present.

These output fields are only present in cpuid version 20200211 or later. This is the default version provided in Ubuntu 20.04.

Choose a Directory Structure

Before we proceed with the build, we need to decide where to install the completed packages and libraries for both OpenSSL and the Intel QAT Engine for OpenSSL. Since the goal of building from source code is to produce a build that can be upgraded as needed without interfering with existing applications, we want a directory hierarchy that allows for parallel installations of multiple versions of the same tool. To keep the filesystem tidy, we'll place these packages in /opt using the following structure:

/opt/tool/version
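
As a concrete illustration, the sketch below derives the three installation prefixes used in the rest of this section. The helper name install_prefix and the variable names are hypothetical; the tool names and versions are the ones this guide uses.

```shell
# Hypothetical helper: builds an installation prefix of the form /opt/tool/version.
install_prefix() {
    printf '/opt/%s/%s\n' "$1" "$2"
}

OPENSSL_PREFIX=$(install_prefix openssl 1.1.1q)
CRYPTO_MB_PREFIX=$(install_prefix crypto_mb 2021.6)
IPSEC_MB_PREFIX=$(install_prefix ipsec-mb 1.2)

echo "$OPENSSL_PREFIX"
echo "$CRYPTO_MB_PREFIX"
echo "$IPSEC_MB_PREFIX"
```

Installing a newer release later only adds a sibling directory under the same tool name, so the existing version stays in place.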

Building OpenSSL v1.1.1

Fetch the most recent build of OpenSSL 1.1.1. You can either check out the source code repository from GitHub or download one of the pre-packaged tarballs. We'll do the latter, and at the time of this writing the latest version was 1.1.1q:

wget https://www.openssl.org/source/openssl-1.1.1q.tar.gz
tar xf openssl-1.1.1q.tar.gz
cd openssl-1.1.1q

To configure OpenSSL, run the config program and set the --prefix and --openssldir options to our desired installation directory, which will be /opt/openssl/1.1.1q:

./config --prefix=/opt/openssl/1.1.1q --openssldir=/opt/openssl/1.1.1q

Then build and install:

make -j
sudo make install

Because we have installed OpenSSL into a non-standard directory, we'll need to make some environment changes. To ensure you get this version of OpenSSL and not your system one, prepend the OpenSSL binary directory to your PATH variable:

export PATH=/opt/openssl/1.1.1q/bin:$PATH

You also need to set LD_LIBRARY_PATH in your environment to run the binary, but for the purposes of this document, we will wait until all components are compiled and installed before doing so.

To verify OpenSSL is built and installed correctly, run the following:

$ LD_LIBRARY_PATH=/opt/openssl/1.1.1q/lib openssl version -v -e
OpenSSL 1.1.1q  5 Jul 2022
ENGINESDIR: "/opt/openssl/1.1.1q/lib/engines-1.1"

You should see the correct version, and ENGINESDIR should be pointing to your installation in /opt/openssl.

You can set these environment variables in scripts to ensure that OpenSSL is run from the intended location.

If you don't set LD_LIBRARY_PATH, OpenSSL will load the equivalent libraries from the distribution's default location, resulting in a mismatch of library and binary versions. This will prevent the QAT engine from loading in OpenSSL, and it can also cause sporadic runtime errors in OpenSSL itself.

Building the Intel QAT OpenSSL Engine for Software Acceleration

The software acceleration support in the Intel QAT Engine for OpenSSL depends on the following two libraries. They must be built first, but they may be built in any order:

  • Intel® Integrated Performance Primitives Cryptography (only its multi-buffer crypto_mb library is needed)
  • Intel® Multi-Buffer Crypto for IPsec Library

Once these libraries are installed, you can build the Intel QAT Engine for OpenSSL.

We’ll step through how to build each one.

Building Intel® Integrated Performance Primitives Cryptography

First, check out the source code repository from GitHub*:

git clone https://github.com/intel/ipp-crypto.git
cd ipp-crypto

Ensure you are building against a fixed release of the code, and not the development branch. At the time of this writing, the latest release was 2021.6:

git checkout ippcp_2021.6

You only need to build the multi-buffer portion of the Intel IPP package, so change to the multi-buffer crypto library subdirectory:

cd sources/ippcp/crypto_mb

The build needs to know where to find your OpenSSL installation. CMake looks for OpenSSL when it configures the build, so set OPENSSL_ROOT_DIR before running cmake:

export OPENSSL_ROOT_DIR=/opt/openssl/1.1.1q/

Then, prepare the build by running cmake:

cmake . -Bbuild -DCMAKE_INSTALL_PREFIX=/opt/crypto_mb/2021.6

Note that we are using crypto_mb as our tool name, since we aren't building the entire IPP package.

To perform the build, run:

cd build
make -j
sudo make install

We'll also need to update LD_LIBRARY_PATH to include this new library directory and will do so after all components have been compiled and installed.

Building the Intel® Multi-Buffer Crypto for IPsec Library

First, check out the source code repository from GitHub:

git clone https://github.com/intel/intel-ipsec-mb.git
cd intel-ipsec-mb

Ensure you are building against a fixed release of the code, and not a development branch. At the time of this writing, the latest release was v1.2:

git checkout v1.2

There is no configuration step. Build the library using:

make -j

To install to our destination, define PREFIX when running "make install":

sudo make install NOLDCONFIG=y PREFIX=/opt/ipsec-mb/1.2

We'll also need to update LD_LIBRARY_PATH to include this new library directory and will do so after all components have been compiled and installed.

Building the Intel Quick Assist Technology Engine for OpenSSL

Check out the software repository from GitHub:

git clone https://github.com/intel/QAT_Engine.git
cd QAT_Engine

Next, ensure you are building against a fixed release of the code, and not a development branch. At the time of this writing, the latest release was 0.6.15:

git checkout v0.6.15

To configure the Intel QAT OpenSSL engine for all software acceleration features, we need to do the following:

  • Set LDFLAGS and CPPFLAGS to ensure the Intel IPP Cryptography and Intel Multi-Buffer Crypto for IPsec libraries are found by configure. The options --with-qat_sw_crypto_mb_install_dir and --with-qat_sw_ipsec_mb_install_dir are provided for this purpose.
  • Supply the location of our OpenSSL library via the --with-openssl_install_dir option.
  • Add the --with-openssl_dir option, which points to the OpenSSL source code. This will regenerate the error files from OpenSSL's source.

Here, we assume that the OpenSSL source tree was unpacked in your home directory. Be sure to update the path to match your build environment.

./autogen.sh
LDFLAGS="-L/opt/ipsec-mb/1.2/lib -L/opt/crypto_mb/2021.6/lib" CPPFLAGS="-I/opt/ipsec-mb/1.2/include -I/opt/crypto_mb/2021.6/include" ./configure --prefix=/opt/openssl/1.1.1q --with-openssl_install_dir=/opt/openssl/1.1.1q --with-openssl_dir=$HOME/openssl-1.1.1q --enable-qat_sw

To build and install, we also need to set PERL5LIB on the command line, so that Perl can find the configdata.pm file in the OpenSSL source directory. This step is necessary due to a bug in the QAT Engine build configuration, and it will be fixed in a future release.

PERL5LIB=$HOME/openssl-1.1.1q make -j
sudo PERL5LIB=$HOME/openssl-1.1.1q make install

After the installation has completed, you should see the engine present in OpenSSL’s engine directory:

$ ls -l /opt/openssl/1.1.1q/lib/engines-1.1/qatengine.*
-rwxr-xr-x 1 root root   1022 Sep  1 11:07 /opt/openssl/1.1.1q/lib/engines-1.1/qatengine.la
-rwxr-xr-x 1 root root 802112 Sep  1 11:07 /opt/openssl/1.1.1q/lib/engines-1.1/qatengine.so

Remember to set LD_LIBRARY_PATH and update your PATH, or the following examples will fail. This can be done either in the user’s environment or the application’s environment. The example below demonstrates exporting LD_LIBRARY_PATH in the user’s environment. Note that this method may impact other applications or libraries that depend on a specific version of OpenSSL’s libcrypto or libssl shared libraries delivered as part of the distribution (as is the case with the Kerberos library libk5crypto.so on Rocky Linux 8.6).

export PATH=/opt/openssl/1.1.1q/bin:$PATH
export LD_LIBRARY_PATH=/opt/openssl/1.1.1q/lib:/opt/crypto_mb/2021.6/lib:/opt/ipsec-mb/1.2/lib

To avoid impacting other applications or libraries that depend on older versions of the OpenSSL libraries, the safer option is to set LD_LIBRARY_PATH as part of the application’s environment, as shown in the following example.

LD_LIBRARY_PATH=/opt/openssl/1.1.1q/lib:/opt/crypto_mb/2021.6/lib:/opt/ipsec-mb/1.2/lib openssl version -e -v
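
To avoid retyping the per-command form, it can be wrapped in a small shell function. This is a sketch under the directory layout used in this guide; the function name with_custom_openssl is hypothetical. It uses env so that the variables are set only for the launched process and never for the calling shell.

```shell
# Hypothetical helper: runs the given command with PATH and LD_LIBRARY_PATH
# pointing at the /opt installations from this guide, for that process only.
with_custom_openssl() {
    env PATH="/opt/openssl/1.1.1q/bin:$PATH" \
        LD_LIBRARY_PATH="/opt/openssl/1.1.1q/lib:/opt/crypto_mb/2021.6/lib:/opt/ipsec-mb/1.2/lib" \
        "$@"
}

# Example: show the library path that the child process sees.
with_custom_openssl sh -c 'echo "$LD_LIBRARY_PATH"'
```

With this in place, a call such as "with_custom_openssl openssl version -e -v" is equivalent to the command above.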

 

Testing the Engine

Once the engine is in place, you can proceed with functionality tests.

The first test is to ensure the Intel QAT Engine loads correctly.

$ openssl engine -v qatengine
(qatengine) Reference implementation of QAT crypto engine(qat_sw) v0.6.15
     ENABLE_EXTERNAL_POLLING, POLL, ENABLE_HEURISTIC_POLLING,
     GET_NUM_REQUESTS_IN_FLIGHT, INIT_ENGINE, SW_ALGO_BITMAP

If the above command returns errors such as the following, work through the checks listed after the error output:

139667965596992:error:25066067:DSO support routines:dlfcn_load:could not load the shared library:../crypto/dso/dso_dlfcn.c:118:filename(/usr/lib/x86_64-linux-gnu/engines-1.1/qatengine.so): libcrypto_mb.so: cannot open shared object file: No such file or directory
139667965596992:error:25070067:DSO support routines:DSO_load:could not load the shared library:../crypto/dso/dso_lib.c:162:
139667965596992:error:260B6084:engine routines:dynamic_load:dso not found:../crypto/engine/eng_dyn.c:414:
139667965596992:error:2606A074:engine routines:ENGINE_by_id:no such engine:../crypto/engine/eng_list.c:334:id=qatengine

If you are using the distribution-provided OpenSSL:

  • Make sure the Intel® Multi-Buffer Crypto for IPsec Library and the Intel IPP CryptoMB Library are both installed into /usr/lib. If you did not set CMAKE_INSTALL_PREFIX when building the latter, it will install into /usr/local and you’ll need to set LD_LIBRARY_PATH in your environment.
  • If you're running on Rocky Linux 8.6, make sure you have run ldconfig.

If you built OpenSSL from source:

  • Make sure LD_LIBRARY_PATH is set to the paths for OpenSSL, the Intel IPP CryptoMB Library, and the Intel Multi-Buffer Crypto for IPsec Library. All three paths must be present.
  • Verify your installation paths and make sure they are in /opt/tool/version
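
To see exactly which dependency the dynamic linker cannot resolve, the ldd tool can be run against the engine. The sketch below filters its output; the function name missing_libs is hypothetical, and the engine path is the Ubuntu 20.04 location from the distribution-provided procedure, so adjust it for other setups.

```shell
# Hypothetical helper: reads "ldd" output on stdin and prints the name of
# every shared library that is reported as "not found".
missing_libs() {
    awk '/=> not found/ {print $1}'
}

# Example: inspect the engine if it exists at the assumed location.
engine=/usr/lib/x86_64-linux-gnu/engines-1.1/qatengine.so
if [ -f "$engine" ]; then
    ldd "$engine" | missing_libs
fi
```

No output means every dependency was resolved with the current environment. For the from-source build, run ldd with the same LD_LIBRARY_PATH that the application will use.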

Assuming the engine loads correctly, you can test the software acceleration for each of the enabled algorithms. To do that, we'll run “openssl speed” on the individual algorithms and compare the engine performance to the baseline.

On Intel Xeon Scalable processor families, you must use processor affinity (also known as CPU pinning) to bind these processes to a core. The multi-buffer implementations make use of AVX-512 features that produce internal power transitions, and if the CPU scheduler moves these jobs to other cores during execution then multiple power transitions will occur, countering performance gains. Most server applications support CPU affinity masks in some form, but for “openssl speed” we must rely on the taskset command to do this for us.
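
The taskset argument used in the commands that follow is a hexadecimal bitmask in which bit N selects core N, so 0x1 pins the process to core 0. The sketch below shows how the mask for any single core is formed; the helper name core_mask is hypothetical.

```shell
# Hypothetical helper: prints the taskset hex mask that selects one core.
# Core 0 -> 0x1, core 1 -> 0x2, core 2 -> 0x4, and so on.
core_mask() {
    printf '0x%x\n' $((1 << $1))
}

core_mask 0
core_mask 3
```

taskset also accepts a core list through its -c option, so "taskset -c 0" is an equivalent way to write "taskset 0x1".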

RSA

The RSA acceleration makes use of an asynchronous scheduling technique that Intel calls multi-buffer, in which multiple operations are processed in parallel. To test the accelerator performance, you must supply the -async_jobs parameter to “openssl speed”. On current Intel architectures, 8 asynchronous jobs deliver optimal performance.

While both sign and verify operations are accelerated, the largest gains are in signing. This translates to performance gains for servers processing TLS handshakes with RSA certificates.

2048-bit keys

Baseline:

taskset 0x1 openssl speed rsa2048

Intel QAT Engine for OpenSSL:

taskset 0x1 openssl speed -engine qatengine -async_jobs 8 rsa2048

3072-bit keys

Baseline:

taskset 0x1 openssl speed rsa3072

Intel QAT Engine for OpenSSL:

taskset 0x1 openssl speed -engine qatengine -async_jobs 8 rsa3072

4096-bit keys

Baseline:

taskset 0x1 openssl speed rsa4096

Intel QAT Engine for OpenSSL:

taskset 0x1 openssl speed -engine qatengine -async_jobs 8 rsa4096

ECDH

Like RSA, the ECDH acceleration makes use of an asynchronous scheduling algorithm.

Montgomery EC Curve X25519

Baseline:

taskset 0x1 openssl speed ecdhx25519

Intel QAT Engine for OpenSSL:

taskset 0x1 openssl speed -engine qatengine -async_jobs 8 ecdhx25519

NIST Curve P-256

Baseline:

taskset 0x1 openssl speed ecdhp256

Intel QAT Engine for OpenSSL:

taskset 0x1 openssl speed -engine qatengine -async_jobs 8 ecdhp256

NIST Curve P-384

Baseline:

taskset 0x1 openssl speed ecdhp384

Intel QAT Engine for OpenSSL:

taskset 0x1 openssl speed -engine qatengine -async_jobs 8 ecdhp384

ECDSA

The ECDSA algorithms also make use of asynchronous scheduling.

NIST Curve P-256

Baseline:

taskset 0x1 openssl speed ecdsap256

Intel QAT Engine for OpenSSL:

taskset 0x1 openssl speed -engine qatengine -async_jobs 8 ecdsap256

NIST Curve P-384

Baseline:

taskset 0x1 openssl speed ecdsap384

Intel QAT Engine for OpenSSL:

taskset 0x1 openssl speed -engine qatengine -async_jobs 8 ecdsap384
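
The RSA, ECDH, and ECDSA commands above all follow the same pattern, so they can be generated in a loop. The sketch below only prints the command lines, using the options shown in this guide; the function name speed_commands is hypothetical.

```shell
# Hypothetical helper: prints the baseline and engine "openssl speed" command
# lines for one algorithm, with the same options used in this guide.
speed_commands() {
    echo "taskset 0x1 openssl speed $1"
    echo "taskset 0x1 openssl speed -engine qatengine -async_jobs 8 $1"
}

for algo in rsa2048 rsa3072 rsa4096 ecdhx25519 ecdhp256 ecdhp384 ecdsap256 ecdsap384; do
    speed_commands "$algo"
done
```

Piping the output to sh runs the whole series. Printing first makes it easy to review the commands before committing to the run time.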

AES GCM Encryption

The AES GCM encryption acceleration is a purely vectorized implementation of the respective EVP ciphers. Key sizes of 128, 192, and 256 bits are supported but 192-bit encryption is almost never used in practice.

The “openssl speed” command reports performance for several block sizes, but real-world applications tend to use buffers that are 8k or larger for increased efficiency and performance. This is also where the largest gains are seen.

128-bit keys

Baseline:

taskset 0x1 openssl speed -evp aes-128-gcm

Intel QAT Engine for OpenSSL:

taskset 0x1 openssl speed -engine qatengine -evp aes-128-gcm

256-bit keys

Baseline:

taskset 0x1 openssl speed -evp aes-256-gcm

Intel QAT Engine for OpenSSL:

taskset 0x1 openssl speed -engine qatengine -evp aes-256-gcm

Typical Performance Gains

The performance of each algorithm, as measured by “openssl speed”, will vary based on your hardware, system configuration, and BIOS settings. There will also be variations between runs due to fluctuations in normal system activity.

Client System Performance

The results given below are typical values, obtained from the system described in Table 1: Client system configuration.

Table 1: Client system configuration

System        Dell Inc.* XPS* 13 7390 2-in-1 Laptop
CPU           Intel® Core™ i7-1065G7 (4 cores, 8 threads) @ 1.30 GHz
CPU features  Intel® Hyper-Threading Technology enabled; Intel® Turbo Boost Technology 2.0 disabled
Memory        16 GB (2x 8GB) LPDDR4 SDRAM 3733 MT/s
Storage       512 GB M.2 NVMe SSD
OS            Ubuntu 20.04 LTS

Note that Intel® Turbo Boost Technology 2.0 was disabled for these runs so that the performance gains shown came solely from the Intel QAT Engine for OpenSSL, and not from variations in clock speed.

This is the complete output from OpenSSL for RSA using 2048-bit keys, without the Intel QAT Engine:

$ openssl speed rsa2048
Doing 2048 bits private rsa's for 10s: 5799 2048 bits private RSA's in 10.00s
Doing 2048 bits public rsa's for 10s: 202497 2048 bits public RSA's in 10.00s
OpenSSL 1.1.1k  25 Mar 2021
built on: Fri Mar 26 22:09:23 2021 UTC
options:bn(64,64) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr) 
compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -Wall -O3 -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DNDEBUG
                  sign    verify    sign/s verify/s
rsa 2048 bits 0.001724s 0.000049s    579.9  20249.7

This is the output with the Intel QAT Engine for OpenSSL:

$ openssl speed -engine qatengine -async_jobs 8 rsa2048
engine "qatengine" set.
Doing 2048 bits private rsa's for 10s: 26640 2048 bits private RSA's in 9.99s
Doing 2048 bits public rsa's for 10s: 577880 2048 bits public RSA's in 9.37s
OpenSSL 1.1.1k  25 Mar 2021
built on: Fri Mar 26 22:09:23 2021 UTC
options:bn(64,64) rc4(16x,int) des(int) aes(partial) idea(int) blowfish(ptr) 
compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -Wall -O3 -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DNDEBUG
                  sign    verify    sign/s verify/s
rsa 2048 bits 0.000375s 0.000016s   2666.7  61673.4

The number of sign operations per second jumps from 579.9 to 2666.7, showing a roughly 4.6x performance gain when using the Intel QAT Engine for OpenSSL.
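
The arithmetic behind that figure can be reproduced from the two summary lines. The sketch below extracts the sign/s column and divides the engine result by the baseline; both function names are hypothetical, and the input lines are copied from the output above.

```shell
# Hypothetical helper: reads "openssl speed" output on stdin and prints the
# sign/s value from the "rsa 2048 bits" summary line (second-to-last field).
rsa2048_signs_per_sec() {
    awk '/^rsa 2048 bits/ {print $(NF-1)}'
}

# Hypothetical helper: prints engine/baseline as a speedup with two decimals.
speedup() {
    awk -v base="$1" -v eng="$2" 'BEGIN {printf "%.2fx\n", eng / base}'
}

baseline=$(echo 'rsa 2048 bits 0.001724s 0.000049s    579.9  20249.7' | rsa2048_signs_per_sec)
engine=$(echo 'rsa 2048 bits 0.000375s 0.000016s   2666.7  61673.4' | rsa2048_signs_per_sec)

speedup "$baseline" "$engine"   # prints 4.60x
```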

The results for all the enabled algorithms are provided in Table 2: Results from "openssl speed".

Table 2: Results from "openssl speed"

Algorithm    Operation            Baseline  Intel QAT Engine for OpenSSL  Measure    Speedup
RSA 2048     sign                 579.9     2666.7                        signs/sec  4.60x
RSA 3072     sign                 194       673.3                         signs/sec  3.47x
RSA 4096     sign                 88        348.5                         signs/sec  3.96x
ECDH X25519  n/a                  10796     49200                         ops/sec    4.56x
ECDH P-256   n/a                  7191      22783                         ops/sec    3.17x
ECDH P-384   n/a                  417       5939.4                        ops/sec    14.24x
ECDSA P-256  sign                 17059     44524.4                       signs/sec  2.61x
ECDSA P-384  sign                 394       13905.2                       signs/sec  35.29x
AES-128-GCM  encrypt (8k blocks)  2368791   4723040.3                     kB/sec     1.99x
AES-256-GCM  encrypt (8k blocks)  2013831   4093460.5                     kB/sec     2.03x

On the target system, the performance gains in RSA signs range from 3.5x to 4.6x. The gains in ECDH X25519 operations exceed 4x. GCM shows a 2x gain for both 128-bit and 256-bit keys. The NIST Curve P-384 algorithms show much larger gains (roughly 14x to 35x) than the NIST Curve P-256 algorithms, which see improvements of only 2.6x to 3.2x. This is because the starting points for P-384 were general-purpose software implementations with no other code optimizations. All the other algorithms began with AVX2 code paths, and thus there was significantly more room for improvement in the P-384 code.

Server System Performance

Certain SKUs of 3rd Generation Intel Xeon Scalable processors contain two Fused Multiply-Add (FMA) units instead of one, and this translates to better performance. The results given below are typical values for the system described in Table 3, which has two FMA units.

Table 3: Server system configuration

System        Intel® Server Board M50CYP Family
CPU           2x Intel® Xeon® Platinum 8368 CPU @ 2.40GHz
CPU features  Intel® Hyper-Threading Technology enabled; Intel® Turbo Boost Technology 2.0 disabled
Memory        64 GB (4x 16GB) DDR4 Registered SDRAM 3200 MT/s
Storage       960 GB M.2 NVMe SSD
OS            Ubuntu 20.04 LTS

Table 4: Results from "openssl speed" on the server system

Algorithm    Operation            Baseline  Intel QAT Engine for OpenSSL  Measure    Speedup
RSA 2048     sign                 1134.6    6908.3                        signs/sec  6.09x
RSA 3072     sign                 371.2     1294.6                        signs/sec  3.49x
RSA 4096     sign                 167.6     802.1                         signs/sec  4.79x
ECDH X25519  n/a                  20158.6   120003.8                      ops/sec    5.95x
ECDH P-256   n/a                  13490.5   44741.4                       ops/sec    3.32x
ECDH P-384   n/a                  803.6     12882.6                       ops/sec    16.03x
ECDSA P-256  sign                 32103.9   81653.3                       signs/sec  2.54x
ECDSA P-384  sign                 779.4     29880.2                       signs/sec  38.34x
AES-128-GCM  encrypt (8k blocks)  4374242   9467142                       kB/sec     2.16x
AES-256-GCM  encrypt (8k blocks)  3721579   8326487                       kB/sec     2.24x

Note that “openssl speed” measures the raw performance of the algorithm itself. The performance of real-world applications will vary depending on the workload and other factors. Configuring an application to use the QAT engine is an application-specific procedure, and at minimum it requires that the application support OpenSSL’s asynchronous interface. Consult your application’s documentation for guidance.