Service mesh ensures that the communication layer between microservices is fast, reliable, and secure. A service mesh is a configurable, low‑latency infrastructure layer designed to handle a high volume of network‑based inter-process communication among application infrastructure services using application programming interfaces (APIs). Managing microservices is a job with a lot of moving parts.
Istio* exists to extend Kubernetes* and reduce complexity for developers looking to deal with traffic management, telemetry and security in complex environments. You might be surprised, given Istio’s role in abstracting lower-level components away from developers, to learn how much of Istio development is at “lower” layers of the stack. Let’s look at how improvements in Istio spring from community work on deeper technologies.
Intel has been an active contributor in the Istio community since 2020. Based upon customer feedback, we have contributed multiple enhancements in multi-tenancy, security, performance, Internet Protocol version 6 (IPv6) and many other areas.
We see moving Istio under the Cloud Native Computing Foundation* (CNCF) as an opportunity for the ecosystem to further collaborate and make cloud native more secure, reliable, and resilient. Intel welcomes the recent announcement of Istio’s acceptance as an incubation project within CNCF.
Engineers from Intel have served in various roles across the Istio community (as maintainers, a release manager, co-chair for IstioCon, etc.). Intel has helped expand the Istio community over the years by sponsoring Istio meetups in China, IstioCon 2022 and other events, giving talks at conferences, and engaging with customers and partners.
Intel has focused on helping Istio make gains in performance, security and network functionality.
Here’s a closer look at our contributions.
Technical Contributions
Performance
Performance has been an area of ongoing effort in Istio deployments utilizing Envoy* as a sidecar proxy. Following feedback from multiple customers and partners, Intel delivered solutions that aim to improve performance and reduce latency in service mesh deployments.
The three main areas Intel is working on are: Transport Layer Security (TLS) handshake acceleration using Intel® Advanced Vector Extensions 512 (Intel® AVX-512), Istio/Envoy routing acceleration using Hyperscan*, and Transmission Control Protocol/Internet Protocol (TCP/IP) bypass using eBPF*.
1. TLS Handshake Performance Improvement using Intel AVX-512
Available on Istio release: 1.14
Intel® AVX-512 can improve throughput by 20-30% in queries per second (QPS) and latency by 20-23%.
Crypto operations can be both symmetric and asymmetric in nature. Intel’s optimizations implement the solution using asynchronous TLS to take advantage of the hardware offload acceleration benefits that also save CPU cycles.
Intel® AVX-512 brings single instruction, multiple data (SIMD) vector instruction capabilities to the CPU. Crypto instructions have recently been added to the Intel® AVX-512 vector instruction set. When accelerated with Intel® AVX-512, multiple handshakes are executed in parallel, which can boost performance.
Envoy uses BoringSSL* as the default TLS library. BoringSSL supports setting private key methods for offloading asynchronous private key operations, and Envoy implements a private key provider framework that allows the creation of Envoy extensions that handle TLS handshake private key operations (signing and decryption) using the BoringSSL hooks.
CryptoMB private key provider is an Envoy extension that handles BoringSSL TLS Rivest–Shamir–Adleman (RSA) operations using Intel® AVX-512 multi-buffer acceleration. When a new handshake happens, BoringSSL invokes the private key provider to request the cryptographic operation and then control returns to Envoy. These RSA requests are gathered in a buffer. When the buffer is full or the timer expires, the private key provider invokes Intel® AVX-512 to process the buffer. After processing, Envoy is notified that cryptographic operations are finished so it can continue with handshakes.
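The gather-then-flush behavior described above can be pictured with a small model. This is only an illustrative sketch, not Envoy's or CryptoMB's actual code: the class name MultiBufferQueue, the capacity and poll_delay_s parameters, and the process_batch callback (standing in for the Intel® AVX-512 multi-buffer RSA routine) are all hypothetical.

```python
import time
from typing import Callable, List, Optional, Tuple


class MultiBufferQueue:
    """Toy model of request batching: flush when full or when a timer expires."""

    def __init__(self, capacity: int, poll_delay_s: float,
                 process_batch: Callable[[List[bytes]], List[bytes]]):
        self.capacity = capacity            # hypothetical: requests per batch
        self.poll_delay_s = poll_delay_s    # hypothetical: max wait for a partial batch
        self.process_batch = process_batch  # stand-in for the vectorized crypto call
        self.pending: List[Tuple[bytes, Callable[[bytes], None]]] = []
        self.first_enqueued_at: Optional[float] = None

    def submit(self, request: bytes, on_done: Callable[[bytes], None],
               now: Optional[float] = None) -> None:
        """Queue one operation; control returns to the caller immediately."""
        now = time.monotonic() if now is None else now
        if not self.pending:
            self.first_enqueued_at = now
        self.pending.append((request, on_done))
        if len(self.pending) >= self.capacity:
            self.flush()  # buffer is full: process the whole batch at once

    def on_timer(self, now: Optional[float] = None) -> None:
        """Called periodically; flushes a partial batch once it has waited long enough."""
        now = time.monotonic() if now is None else now
        if self.pending and now - self.first_enqueued_at >= self.poll_delay_s:
            self.flush()

    def flush(self) -> None:
        batch, self.pending = self.pending, []
        self.first_enqueued_at = None
        results = self.process_batch([request for request, _ in batch])
        for (_, on_done), result in zip(batch, results):
            on_done(result)  # notify the caller so the handshake can continue
```

The timer bounds how long a handshake can wait when traffic is too light to fill the buffer, which is the trade-off between batch efficiency and added latency.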
2. Performance - TCP/IP Bypass using eBPF*
A TCP/IP bypass using eBPF can result in latency improvements of 11-17%.
The current implementation of service mesh in Istio and Envoy incurs TCP/IP stack overhead. The Istio data plane is implemented with Envoy as a sidecar proxy. Data packets traverse the TCP/IP stack at least three times: on the inbound path, on the outbound path, and between Envoy proxies on the same host.
These repeated TCP/IP stack traversals degrade data plane performance when Envoy is deployed as a sidecar proxy.
Intel’s proposed solution is to bypass the TCP/IP networking stack in the Linux* kernel by using the eBPF module. This solution has shown a reduction in latency (11-17%) because the data is written directly to the socket in the user space. For more information: https://github.com/intel/istio-tcpip-bypass
3. Istio/Envoy Routing Acceleration with Hyperscan*
Using Hyperscan in Envoy can result in an increase of 20% in performance (QPS) and latency improvements of 16%.
Envoy performs filtering, access control and routing operations in a service mesh environment. During filtering operations, it selects different filters for different Hypertext Transfer Protocol (HTTP) requests. For access control operations, Envoy utilizes access control to block suspicious requests by matching their characteristics with security policies. The routing actions involve parsing the URL paths and other components of HTTP requests that need to be routed upstream to different clusters or services.
During the above operations, regex matching is a key factor in helping Envoy decide how to route a request. These regex operations consume expensive CPU cycles. Using Hyperscan to optimize regex matching can save CPU cycles and reduce request latency.
Matching is the basic but core module that helps Envoy decide where a request is directed, and regex matching is one of the most expensive matching methods, consuming far more CPU than prefix or exact matching.
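To make the three matching styles concrete, here is a minimal sketch of a first-match route table. It is an illustration only: the Route class, the pick_cluster function, and the sample paths are hypothetical, and Python's built-in re module stands in for the regex engine (it is not Hyperscan and not Envoy's matcher). The regex is assumed to have to match the whole path.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Route:
    """One route entry; exactly one of exact, prefix, or regex is expected to be set."""
    cluster: str
    exact: Optional[str] = None
    prefix: Optional[str] = None
    regex: Optional[re.Pattern] = None

    def matches(self, path: str) -> bool:
        if self.exact is not None:
            return path == self.exact              # a single string comparison
        if self.prefix is not None:
            return path.startswith(self.prefix)    # compares only the leading bytes
        if self.regex is not None:
            # Runs the pattern over the whole path; typically the costliest option.
            return self.regex.fullmatch(path) is not None
        return False


def pick_cluster(routes: List[Route], path: str) -> Optional[str]:
    """Return the cluster of the first route that matches, or None."""
    for route in routes:
        if route.matches(path):
            return route.cluster
    return None


routes = [
    Route("health", exact="/healthz"),
    Route("api", prefix="/api/"),
    Route("users", regex=re.compile(r"/users/\d+/profile")),
]
print(pick_cluster(routes, "/users/42/profile"))
```

Because every request may be tested against several regex routes before one matches, the per-match cost of the regex engine is multiplied across the route table, which is where a faster engine pays off.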
Integration into Istio is ongoing. Hyperscan support is available in Envoy release 1.2.
Security
Security of cloud native applications in service mesh is critical to enterprises and cloud infrastructure providers.
Intel’s contributions are in these three main areas: data plane private key protection, Trusted Certificate Service (to protect certificate authority (CA) private keys), and multi-CA support in Istio.
1. Data plane private key protection
In Istio, Envoy serves as a sidecar proxy responsible for the data path. Communication between microservices through the Envoy sidecar proxy usually uses the Mutual Transport Layer Security (mTLS) protocol. Private keys are normally stored in plain text in the Kubernetes environment. To secure the data path, Intel® Software Guard Extensions (Intel® SGX) provides a solution in which the keys are protected inside Intel® SGX enclaves.
2. Trusted Certificate Service – Solution to protect CA private keys
Trusted Certificate Service (TCS) is a Kubernetes certificate signing solution that uses the security capabilities provided by Intel® Software Guard Extensions (Intel® SGX). The signing key is stored and used inside the Intel SGX enclave(s) and is never stored in clear anywhere in the system. TCS is implemented as a cert-manager external issuer by providing support for both cert-manager and Kubernetes certificate signing APIs.
Architecture
TCS contains two independent parts: Trusted Certificate Issuer and Trusted Attestation Controller (proxy).
- Trusted Certificate Issuer
Trusted certificate issuer (later TCS issuer) signs certificate signing requests (CSRs). The signing key is stored and used inside the SGX enclave(s) and is never stored in clear anywhere in the system. The TCS issuer is implemented as a cert-manager external issuer by providing support for both cert-manager and Kubernetes certificate signing APIs.
- Trusted Attestation Controller
Trusted Attestation Controller (later TCS attestation controller) is a proxy service providing an integration point to external SGX attestation and key management services. The integration is done via a vendor-specific container (plugin).
Trusted Certificate Issuer handles securing the private keys for the CA Server. The Trusted Attestation Controller provides a service to ensure that the SGX environment deployed in the cluster is secure.
For more information: https://github.com/intel/trusted-certificate-issuer
3. Multi-CA support in Istio
Multi-tenancy is a common scenario in cloud native and service mesh. By default, all workloads in Istio service mesh will have a single CA. This brings challenges for multi-tenancy support.
The multi-CA (multi-signer) feature addresses this: each tenant can have its own dedicated CA, and traffic between workloads of different tenants is not allowed by default.
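The isolation that per-tenant CAs provide can be pictured with a toy trust model. This is a conceptual sketch only, not Istio's API or implementation: the Workload class, the TENANT_CA mapping, and the tenant and CA names are hypothetical, and a real mTLS handshake verifies certificate chains rather than comparing strings.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class Workload:
    name: str
    tenant: str


# Hypothetical mapping: each tenant's workloads get certificates from their own CA.
TENANT_CA: Dict[str, str] = {"tenant-a": "ca-a", "tenant-b": "ca-b"}


def issuer_of(workload: Workload) -> str:
    """CA that signed this workload's certificate."""
    return TENANT_CA[workload.tenant]


def trust_bundle(workload: Workload) -> Set[str]:
    """CAs this workload accepts; here, only its own tenant's CA."""
    return {TENANT_CA[workload.tenant]}


def mtls_allowed(client: Workload, server: Workload) -> bool:
    """Mutual TLS succeeds only if each side trusts the other's issuer."""
    return (issuer_of(client) in trust_bundle(server)
            and issuer_of(server) in trust_bundle(client))


cart_a = Workload("cart", "tenant-a")
pay_a = Workload("payments", "tenant-a")
cart_b = Workload("cart", "tenant-b")
print(mtls_allowed(cart_a, pay_a), mtls_allowed(cart_a, cart_b))
```

With a single mesh-wide CA, every trust bundle would contain the same issuer and the second check would also pass, which is the multi-tenancy challenge the text describes.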
Networking Features
Intel is working closely with the community to bring IPv6 dual stack support to Istio.
Today Istio supports IPv4 and IPv6 well when each is used on its own, but handling communication between IPv4 and IPv6 services in Istio is a requirement, especially for telecom operators deploying 5G. Dual stack, a technology compatible with both IPv4 and IPv6 addresses, fills this gap.
IPv6-only cluster support was improved significantly in 2021. Dual-stack cluster support, however, does not work. As of 1.12, when Istio is installed on a dual-stack cluster, it operates only over the default IP family of a service. Depending on how the Kubernetes service object is configured, this can be either an IPv4 or an IPv6 address. As a result, many core components need to change to support dual stack in Istio on dual-stack Kubernetes clusters, including Envoy bootstrapping, the control plane, and pilot-agent. Given the complexity, the dual-stack implementation will happen in a series of phases.
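The single-family limitation can be illustrated with Python's standard ipaddress module. This is a sketch under stated assumptions, not Istio code: the function names and sample addresses are hypothetical, and the first address in the list is assumed to be the service's default (primary) family, as with the first entry of a Kubernetes Service's cluster IP list.

```python
import ipaddress
from typing import List, Tuple


def ip_family(addr: str) -> str:
    """Classify an address string as IPv4 or IPv6."""
    return "IPv6" if ipaddress.ip_address(addr).version == 6 else "IPv4"


def single_family_view(cluster_ips: List[str]) -> Tuple[str, List[str], List[str]]:
    """Model a mesh that only uses the service's default IP family.

    Returns (default family, addresses used, addresses ignored).
    """
    default_family = ip_family(cluster_ips[0])
    used = [ip for ip in cluster_ips if ip_family(ip) == default_family]
    ignored = [ip for ip in cluster_ips if ip_family(ip) != default_family]
    return default_family, used, ignored


# Hypothetical dual-stack service with one address per family.
print(single_family_view(["10.96.0.10", "fd00::10"]))
```

Swapping the order of the two addresses flips which family is used, which is why the behavior depends on how the service object is configured.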
Intel is excited about what we can contribute to Istio in the coming years.
About the authors
Iris Ding, now in Intel’s Software and Advanced Technology Group (SATG), has a rich background in open source development, cloud computing, and middleware development and design. She currently serves as an Istio maintainer and focuses on research in cloud native areas such as Kubernetes and service mesh.
Grace Lian, Senior Director leading the Cloud Software Engineering team at Intel, focuses on the development of cloud native technologies including Kubernetes, service mesh, container runtimes, and microservices optimizations. Her team actively engages with open source communities.
Ramesh Masavarapu is a Technical Product Manager at Intel focused on cloud native technologies such as service mesh. He works on understanding pain points and delivering key solutions to customers and partners through open source projects such as Istio and Envoy.
Photo by Vidar Nordli-Mathisen on Unsplash