Executive Summary
Pico is the leading provider of technology services for financial markets around the world. With lower network latencies and increasing message rates and trading volume in financial markets, Pico’s Corvil Analytics appliances needed to deliver higher sustained throughput. Collaborating with Intel, Pico engineers and developers integrated 3rd Gen Intel® Xeon® Scalable processors into their 7th generation Corvil 10000 appliance. Although the market does not yet demand it, Pico states that the new CPU’s enhanced performance and latest features will enable the Corvil 10000 to achieve a throughput of 100 million packets per second to support ever-rising trading rates and volumes.
Challenge
In financial trading, speed is everything. Over the years, the feeds for the Consolidated Tape Association’s (CTA) Securities Information Processor (SIP) have become faster, and lower latencies enable higher trading volumes. Today CTA latencies are under 20 microseconds, delivering nearly a million quotes and hundreds of thousands of trades every 100 milliseconds. To accommodate increased rates, Options Price Reporting Authority (OPRA) messages are passed to customers over a 40 Gbps network, demanding higher performance from Financial Services Industry (FSI) infrastructure.
Pico provides mission critical technology, data, and analytic services for the financial markets community. The company’s resilient proprietary network, PicoNet, is a globally comprehensive, low-latency, and fully redundant network. PicoNet interconnects all major financial data centers around the world, including all major public cloud providers. Customers include the world’s largest banks, exchanges, quantitative hedge funds, electronic market makers, and asset managers.
Pico Corvil Analytics bare-metal appliances run in Pico’s co-located data centers and in private data centers to ingest, timestamp, process, and store trade information used by financial strategists and traders.
“Quotes and trades on the exchanges update in microsecond time frames,” explained Donal O’Sullivan, Head of Product Management at Pico. “And the fastest FPGA engines run in sub-microsecond timeframes. Our appliances monitor the trading activity and customers’ algorithms, ingesting large volumes of data in real-time, timestamping, correlating, analyzing, and then locally storing all that data to disk. The data is used for troubleshooting, risk management, regulatory reporting, and client retention.”
Pico’s current flagship appliance, the Corvil 9000, supports 40 Gbps Ethernet (with a future-ready 100 Gbps software upgrade) for analytics, packet capture, and export. The appliance can support 100 Gbps for shorter intervals. However, as data volumes have continued to grow, and with significant adoption of 100 Gbps Ethernet in the trading network, Pico needed a more performant solution. The company sought to develop a new higher speed appliance, providing their customers with the analytics throughput to cope with traffic rates for the next few years.
Solution
“At the outset of designing our product line,” added O’Sullivan, “we decided to base our appliances on the Intel platform, a decision we have doubled down on with each new Intel architecture release. The Corvil 10000 is part of the 7th generation of market-leading analytics appliances.”
According to O’Sullivan, 100 Gbps wire speed translates to about 100 million packets per second for typical market data packets. Corvil systems deployed in co-located facilities, which sit in low-latency market environments, therefore need to process up to 100 million packets per second sustained.
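The link between wire speed and packet rate can be checked with simple arithmetic. The sketch below assumes the standard Ethernet per-frame wire overhead of 20 bytes (preamble, start-of-frame delimiter, and inter-frame gap). The 105-byte frame size is an illustrative assumption standing in for a small market data packet, not a figure supplied by Pico.

```python
# Standard Ethernet overhead on the wire for every frame:
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def packets_per_second(link_gbps: float, frame_bytes: int) -> float:
    """Maximum packet rate at line rate for frames of the given size."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_on_wire


def wire_bytes_per_packet(link_gbps: float, pps: float) -> float:
    """Average bytes each packet occupies on the wire at a given packet rate."""
    return link_gbps * 1e9 / pps / 8


# Hypothetical 105-byte market data frame on a 100 Gbps link.
print(f"{packets_per_second(100, 105) / 1e6:.1f} Mpps for 105-byte frames")
# Minimum-size 64-byte Ethernet frames give the theoretical upper bound.
print(f"{packets_per_second(100, 64) / 1e6:.1f} Mpps for 64-byte frames")
# Wire footprint per packet implied by 100 Mpps at 100 Gbps.
print(f"{wire_bytes_per_packet(100, 100e6):.0f} bytes on the wire per packet")
```

Under these assumptions, 100 million packets per second at 100 Gbps corresponds to packets that each occupy about 125 bytes on the wire, which is consistent with small market data messages.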
“To do the analytics we provide, we analyze 100 percent of packets,” commented O’Sullivan. “That’s ingesting, analyzing, timestamping, and storing to disk 100 million packets each second.”
To reach this metric, the next-generation Corvil Analytics 10000 appliance is built on 3rd Gen Intel Xeon Scalable processors with their new microarchitecture. Compared to the previous generation of Intel Xeon Scalable processors, these processors deliver up to:
- 1.46x average performance gains [1]
- 1.6x higher memory bandwidth with 2.66x more memory capacity [1]
- 1.33x more PCIe lanes [1]
Plus, the new processors include Intel® Speed Select technology and support PCIe Gen 4 for faster I/O. In collaboration with Intel, Pico received early samples of the next-generation server processors in 2021 and began porting their code.
“On the initial port of the code, we measured a 40 percent increase [2] in our streaming packet processing engine—the core engine of the appliances,” stated O’Sullivan. “That was achieved with little specific tuning of the code to utilize the new capabilities.”
Result
The 3rd Gen Intel Xeon Scalable processors deliver the capabilities Corvil appliances need to continue delivering their industry-leading services. With the new processors, the Corvil 10000 is targeting a sustained 100 million packets per second from ingest to storage, rather than supporting that rate only in bursts, a level of performance unachievable without the new technology.
“The increased instructions per clock of the 3rd Gen Intel Xeon Scalable processors provided an immediate boost to our processing engine,” explained O’Sullivan. “But, that was just the beginning. Our systems do a lot of memory accesses for ingest, timestamping, algorithm processing, and other processing. The increased memory bandwidth and memory speed allow sustained processing within the cores.”
Other bottlenecks eliminated by the new processors included disk I/O and compression.
“To support sustained throughput to storage, we needed to add more storage capacity,” added O’Sullivan. “With PCIe Gen 4, there are more and faster PCIe lanes, which can support more disks to deliver the throughput we need.”
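A rough sense of why PCIe generation matters for sustained capture to disk comes from a back-of-the-envelope calculation. The sketch below uses the published per-lane signalling rates for PCIe Gen 3 and Gen 4 (8 and 16 GT/s, both with 128b/130b encoding). It assumes, purely for illustration, that every byte of a 100 Gbps stream is written uncompressed and that there is no protocol or file-system overhead; the appliance's real storage layout is not described in the source.

```python
import math

# Raw signalling rate per lane in GT/s; both generations use 128b/130b encoding.
PCIE_GT_PER_S = {3: 8.0, 4: 16.0}
ENCODING_EFFICIENCY = 128 / 130


def lane_gbytes_per_s(gen: int) -> float:
    """Theoretical payload bandwidth of one PCIe lane in GB/s."""
    return PCIE_GT_PER_S[gen] * ENCODING_EFFICIENCY / 8


def lanes_needed(write_gbytes_per_s: float, gen: int) -> int:
    """Minimum lane count that can carry the given write rate."""
    return math.ceil(write_gbytes_per_s / lane_gbytes_per_s(gen))


# Assumption: the full 100 Gbps stream is written to disk uncompressed.
capture_gbytes_per_s = 100 / 8

for gen in (3, 4):
    print(
        f"PCIe Gen {gen}: {lane_gbytes_per_s(gen):.2f} GB/s per lane, "
        f"{lanes_needed(capture_gbytes_per_s, gen)} lanes for "
        f"{capture_gbytes_per_s} GB/s"
    )
```

These are theoretical minimums. Real drives and controllers deliver less than the link maximum, so a practical design needs more lanes and more disks than the figures printed here.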
Given the volume of data being processed, the Corvil appliance compresses data on the CPU cores before storing it on disk. According to O’Sullivan, support for Bit Algebra and Vector Bit Manipulation instructions in the new processors gives them a significant improvement in compression to disk at the sustained speed of the box.
Corvil appliances deliver their sustained performance by scaling out across a distributed infrastructure and using the high core counts and hyper-threading offered by Intel® architecture. Many threads are off the critical path, but some hot threads always need more performance, and spreading a task across multiple threads is not always the best approach.
“Timestamp processing is computationally only moderately expensive, but when you want to process 100 million per second, that’s a lot! Multithreading introduces a lot of headaches, so there are benefits to doing this sequentially on a single thread. The Intel Speed Select technology lets us run those threads at higher speeds—even higher than turbo—while keeping below the power/heat envelope. We could not have achieved the sustained throughput we need without these new CPUs,” concluded O’Sullivan.
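The pressure on a single hot thread is easier to appreciate as a per-packet time budget. The sketch below converts a packet rate into nanoseconds and clock cycles per packet. The clock frequencies are hypothetical values chosen for illustration and are not specifications of the processors in the Corvil 10000.

```python
def ns_per_packet(pps: float) -> float:
    """Time available per packet, in nanoseconds, at a given packet rate."""
    return 1e9 / pps


def cycles_per_packet(pps: float, clock_ghz: float) -> float:
    """Clock cycles available per packet on one core at a given frequency."""
    return clock_ghz * 1e9 / pps


rate = 100e6  # 100 million packets per second, the target quoted above

print(f"{ns_per_packet(rate):.0f} ns per packet")
# Hypothetical core frequencies: a higher clock widens the per-packet budget.
for ghz in (2.6, 3.0, 3.4):
    print(f"{ghz} GHz -> {cycles_per_packet(rate, ghz):.0f} cycles per packet")
```

If one thread handled the full rate, it would have roughly 10 ns per packet, so every extra few hundred megahertz on that thread adds a meaningful share of the available cycles.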
Solution Summary
Due to their sustainable high message throughput, Pico’s Corvil Analytics appliances deliver industry-leading performance for financial markets services. Corvil has built its product lines on Intel architecture for many years. With continually advancing network bandwidths and message volumes in the financial markets, the 7th generation Corvil 10000 appliance was built on 3rd Gen Intel Xeon Scalable processors. This new generation of processors offers increased instructions per clock, greater memory bandwidth and capacity, PCIe Gen 4 for faster I/O, and Intel Speed Select technology. The enhanced feature set allows the new appliance to target a throughput of 100 million packets per second with headroom for next-generation network bandwidths in the trading infrastructure.
Solution Ingredients
- Pico Corvil Analytics 10000 appliance
- Dual-socket Intel Xeon Platinum 8358 processors with 36 cores each
- Intel Speed Select technology
- Goal of 100 million packets per second processing