Storage: Accelerate Hash Function Performance Using the Intel® Intelligent Storage Acceleration Library

Published: 09/24/2015  

Last Updated: 09/23/2015

By Quoc-Thai V Le

Abstract

With the growing number of devices connected to the cloud and the Internet, data is being generated from many different sources, including smartphones, tablets, and Internet of Things devices, and the demand for storage grows every year. For cloud storage developers looking for ways to speed up storage performance, the optimized hash functions in the Intel® Intelligent Storage Acceleration Library (Intel® ISA-L) accelerate hash computation, providing up to an 8x performance gain over OpenSSL* algorithms. A performance study using version 2.14, the latest version of Intel ISA-L, shows the gain developers could potentially obtain by applying Intel ISA-L to their existing applications.

This article captures the performance data and the system configuration for developers interested in reproducing this experiment in their own environment. Intel ISA-L can run on various Intel® server processors and provides operation acceleration through the following instruction sets:

  • Intel® Advanced Encryption Standard New Instructions (Intel® AES-NI)
  • Intel® Streaming SIMD Extensions (Intel® SSE)
  • Intel® Advanced Vector Extensions (Intel® AVX)
  • Intel® Advanced Vector Extensions 2 (Intel® AVX2)

Benefits

Intel ISA-L multibinary support functions select an appropriate version at first run, based on the instruction sets the processor supports, and can be called instead of the architecture-specific versions. Developers can therefore deploy a single binary containing multiple versions of each function and have the right one chosen at runtime. If code size is a concern, call the architecture-specific version directly instead. In default mode the base functions are written in C, and the multibinary function calls them if none of the required instruction sets are enabled.

For example, suppose the code runs on a processor in the Intel® Xeon® E5 v3 family and there are three versions of a particular function: func1_sse(), func1_avx(), and func1_avx2(). The multibinary function func1() determines which of these to call from the instruction sets the processor supports; because this processor family supports Intel AVX2, that would be func1_avx2(). There is also a base function, func1_base(), which the multibinary function calls if none of the required instruction sets are enabled.
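The selection logic described above can be sketched in C as follows. This is a minimal illustration and not Intel ISA-L source code: the feature flags, detect_features(), func1_select(), and the stand-in function bodies are all hypothetical, and the sketch assumes the dispatcher prefers the most capable variant the processor supports. Feature detection relies on the GCC/Clang builtin __builtin_cpu_supports() on x86 and falls back to the base version elsewhere.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature flags for this sketch (not ISA-L identifiers). */
enum cpu_feature { CPU_SSE = 1u << 0, CPU_AVX = 1u << 1, CPU_AVX2 = 1u << 2 };

typedef uint32_t (*func1_fn)(const uint8_t *buf, size_t len);

/* Shared scalar body: a real library would hand-optimize each variant. */
static uint32_t byte_sum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

/* Stand-in variants; all return the same result in this sketch. */
uint32_t func1_base(const uint8_t *b, size_t n) { return byte_sum(b, n); }
uint32_t func1_sse(const uint8_t *b, size_t n)  { return byte_sum(b, n); }
uint32_t func1_avx(const uint8_t *b, size_t n)  { return byte_sum(b, n); }
uint32_t func1_avx2(const uint8_t *b, size_t n) { return byte_sum(b, n); }

/* Pick the most capable variant allowed by the feature mask (assumed policy). */
func1_fn func1_select(unsigned features)
{
    if (features & CPU_AVX2) return func1_avx2;
    if (features & CPU_AVX)  return func1_avx;
    if (features & CPU_SSE)  return func1_sse;
    return func1_base; /* none of the required instruction sets enabled */
}

/* Query the running CPU; returns 0 (base only) on unsupported toolchains. */
unsigned detect_features(void)
{
    unsigned f = 0;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse2")) f |= CPU_SSE; /* SSE level is an assumption */
    if (__builtin_cpu_supports("avx"))  f |= CPU_AVX;
    if (__builtin_cpu_supports("avx2")) f |= CPU_AVX2;
#endif
    return f;
}

/* Multibinary-style entry point: resolves once, on the first call.
 * Not thread-safe as written; a real implementation would need to be. */
uint32_t func1(const uint8_t *buf, size_t len)
{
    static func1_fn resolved = NULL;
    if (resolved == NULL)
        resolved = func1_select(detect_features());
    return resolved(buf, len);
}
```

Callers only ever use func1(); the cost of detection is paid once, and every later call goes straight through the cached function pointer. Calling one of the variants directly skips the dispatcher and lets the linker drop the unused versions, which is the code-size trade-off mentioned above.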

By using the Intel® instruction extensions listed above, Intel ISA-L reduces the number of instructions executed, because a single instruction can operate on multiple data elements. See the reference section below to learn more about the extensions. Selecting the right instruction extension for the processor allows the application to take full advantage of the system bandwidth. Figure 1 shows one way a developer can apply the Intel ISA-L functions in a deduplication application. In a quick study (see Figure 2), the hash functions achieved up to an 8x performance gain on the Intel® Xeon® processor E5-2650 v3.


Figure 1. One method of applying Intel® Intelligent Storage Acceleration Library into the data deduplication process.
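To make the role of the hash function in deduplication concrete, here is a generic sketch of hash-based deduplication over fixed-size chunks. It is not necessarily the method shown in Figure 1 and it does not call Intel ISA-L: the function names, the fixed-size chunking, the MAX_CHUNKS cap, and the use of 64-bit FNV-1a as a stand-in hash are all assumptions made for illustration. A production system would use a cryptographic hash so that two different chunks are not mistaken for duplicates.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CHUNKS 1024 /* illustrative cap on tracked digests */

/* 64-bit FNV-1a, used here only as a stand-in for an optimized hash. */
uint64_t fnv1a64(const uint8_t *p, size_t n)
{
    uint64_t h = 0xcbf29ce484222325ULL; /* FNV offset basis */
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;          /* FNV prime */
    }
    return h;
}

/* Split data into fixed-size chunks, hash each chunk, and count how many
 * distinct digests appear. Chunks whose digest was already seen are the
 * duplicates a storage system would avoid writing again. */
size_t dedup_unique_chunks(const uint8_t *data, size_t len, size_t chunk_size)
{
    uint64_t seen[MAX_CHUNKS];
    size_t unique = 0;

    if (chunk_size == 0)
        return 0;

    for (size_t off = 0; off < len; off += chunk_size) {
        size_t n = (len - off < chunk_size) ? len - off : chunk_size;
        uint64_t digest = fnv1a64(data + off, n);

        size_t i;
        for (i = 0; i < unique; i++)    /* linear scan keeps the sketch short */
            if (seen[i] == digest)
                break;

        if (i == unique) {              /* first time this digest appears */
            if (unique == MAX_CHUNKS)
                break;                  /* table full: stop tracking */
            seen[unique++] = digest;
        }
    }
    return unique;
}
```

Every chunk is hashed exactly once, so hash throughput directly bounds how fast data can be ingested; that is the step an optimized hash implementation would accelerate.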


Figure 2. Hash functions’ relative performance using OpenSSL* versus Intel® Intelligent Storage Acceleration Library.

Setting Up Intel® Intelligent Storage Acceleration Library On the System

  1. To access the full suite of Intel ISA-L functions, please fill out and submit this request form.
    You will receive an email that provides information on how to get the complete ISA-L zip file.
  2. Download and unzip the library source onto the system.
  3. Read the ISA-L_Getting_Started.pdf and Release_notes.txt supplied with the source. From the Guide, choose and follow the instructions to build the source depending on your needs.

Running the Provided Benchmarks

  1. Install “automake” to build the library and included unit tests.
  2. Run “make perfs”. This builds all of the unit function performance tests, configured for the ‘cache cold’ case, in which the data set is larger than the last-level cache (LLC).
  3. Run “make perf”. This runs each unit test supported by the platform architecture. Performance results are output to the console.
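For readers who want to see what a throughput figure of this kind represents, the following is a rough sketch of a throughput measurement. It is not the code used by the Intel ISA-L performance tests: the function names, the byte-sum stand-in for the function under test, and the use of clock() (CPU time) are assumptions chosen to keep the example portable.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

typedef uint32_t (*hash_fn)(const uint8_t *buf, size_t len);

/* Stand-in for the function under test (hypothetical). */
uint32_t byte_sum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

/* Convert a byte count and elapsed time to MB/s (1 MB = 1024 * 1024 bytes).
 * Returns 0 if the elapsed time is too small to measure. */
double throughput_mb_s(size_t bytes, double seconds)
{
    if (seconds <= 0.0)
        return 0.0;
    return ((double)bytes / (1024.0 * 1024.0)) / seconds;
}

/* Run fn over a buffer of buf_size bytes for the given number of passes
 * and report the throughput. Returns 0 if the buffer cannot be allocated. */
double measure_mb_s(hash_fn fn, size_t buf_size, int passes)
{
    uint8_t *buf = malloc(buf_size);
    volatile uint32_t sink = 0; /* keeps the calls from being optimized away */

    if (buf == NULL)
        return 0.0;
    for (size_t i = 0; i < buf_size; i++)
        buf[i] = (uint8_t)i;

    clock_t start = clock();
    for (int p = 0; p < passes; p++)
        sink += fn(buf, buf_size);
    clock_t end = clock();

    (void)sink;
    free(buf);
    return throughput_mb_s(buf_size * (size_t)passes,
                           (double)(end - start) / CLOCKS_PER_SEC);
}
```

To approximate the cache-cold case, buf_size would be chosen larger than the processor's last-level cache, so that each pass streams data from memory instead of reusing cached lines. With a small buffer the same code measures the cache-warm case.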

Optional: Run “make igzip/igzip_file_perf” and “make igzip/igzip_stateless_file_perf”. This builds additional compression functions and unit tests. The compression tests (igzip_file_perf and igzip_stateless_file_perf) are run using each file of a standard corpus, the publicly available Calgary Corpus, as input.

Table 1 describes the platform configuration we used in our testing.

Table 1. Tested System Configuration

Related Links and Resources
