Core Utilization in DPDK Apps
- Application: a DPDK testpmd app running on one core and performing L2 forwarding. The application is compiled against DPDK with VTune Profiler profiling enabled.
- DPDK with VTune Profiler profiling support enabled. VTune Profiler profiling support has been integrated into DPDK since version 18.11. When using earlier versions, apply the attached patches (available for versions 17.11, 18.02, and 18.05). To enable profiling on the DPDK side, allow VTune Profiler to attach to the DPDK polling cycle. For this, reconfigure and recompile DPDK (and the target application) with the CONFIG_RTE_ETHDEV_RXTX_CALLBACKS and CONFIG_RTE_ETHDEV_PROFILE_WITH_VTUNE flags enabled (located in the config/common_base configuration file).
- Intel® VTune™ Profiler: Input and Output analysis
- Starting with the 2020 release, Intel® VTune™ Amplifier has been renamed to Intel® VTune™ Profiler.
- Most recipes in the Intel® VTune™ Profiler Performance Analysis Cookbook are flexible. You can apply them to different versions of Intel® VTune™ Profiler. In some cases, minor adjustments may be required.
- Test system: a traffic generator (GEN in the picture below) providing 64-byte frames and a packet receiver (SUT, system under test), connected via a 40 GbE link. The SUT performs L2 forwarding of packets.
- CPU: Intel® Xeon® Platinum 8180 (38.5M Cache, 2.5 GHz, 28 cores)
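Before rebuilding DPDK, it can help to confirm that both profiling flags named in the list above are actually switched on. The following is a minimal sketch, assuming the configuration file uses plain `NAME=y` / `NAME=n` lines; the function name `missing_profiling_flags` and the sample text are hypothetical and serve only as illustration.

```python
# The two flags named in the DPDK requirements above.
REQUIRED_FLAGS = (
    "CONFIG_RTE_ETHDEV_RXTX_CALLBACKS",
    "CONFIG_RTE_ETHDEV_PROFILE_WITH_VTUNE",
)


def missing_profiling_flags(config_text):
    """Return the required flags that are not set to 'y' in config_text."""
    settings = {}
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            # A later assignment overrides an earlier one.
            settings[name.strip()] = value.strip()
    return [flag for flag in REQUIRED_FLAGS if settings.get(flag) != "y"]


# Hypothetical sample content, not copied from a real DPDK tree.
sample = """
# illustrative fragment
CONFIG_RTE_ETHDEV_RXTX_CALLBACKS=y
CONFIG_RTE_ETHDEV_PROFILE_WITH_VTUNE=n
"""

print(missing_profiling_flags(sample))
```

To check a real tree, pass the contents of config/common_base instead of the sample string; an empty list means both flags are enabled.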
Run Input and Output Analysis
amplxe-cl -collect io -knob kernel-stack=false -knob dpdk=true -knob collect-pcie-bandwidth=true -knob collect-memory-bandwidth=false -knob dram-bandwidth-limits=false --target-process=testpmd
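When the collection is launched from a script, keeping the knobs in one table makes it easier to toggle them between runs. The sketch below only reassembles the command shown above from its parts; the helper name `build_io_collection_command` is hypothetical, and no options beyond those in the command above are assumed.

```python
import shlex

# Knob values taken from the command line above.
KNOBS = {
    "kernel-stack": "false",
    "dpdk": "true",
    "collect-pcie-bandwidth": "true",
    "collect-memory-bandwidth": "false",
    "dram-bandwidth-limits": "false",
}


def build_io_collection_command(knobs, target_process):
    """Build the argument list for an Input and Output analysis run."""
    argv = ["amplxe-cl", "-collect", "io"]
    for name, value in knobs.items():
        argv += ["-knob", f"{name}={value}"]
    argv.append(f"--target-process={target_process}")
    return argv


command = build_io_collection_command(KNOBS, "testpmd")
# Print a shell-safe rendering of the argument list.
print(" ".join(shlex.quote(arg) for arg in command))
```

The resulting list can be passed directly to `subprocess.run(command)` on a system where the collector is installed.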
Analyze Core Utilization with the DPDK Rx Spin Time Metric
Analyze Packets Retrieval with DPDK Rx Batch Statistics Histogram
Understand Rx Operations and Investigate Rx Peaks
- 4 x 32 Byte descriptors or 8 x 16 Byte descriptors are completed.
- A descriptor is invalidated in the internal NIC cache.
- 32 Byte Rx descriptor: Most of rte_eth_rx_burst() calls receive 4 packets.
- 16 Byte Rx descriptor: Most of rte_eth_rx_burst() calls receive 8 packets.
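The peaks at 4 and 8 packets follow from simple arithmetic: both completion conditions listed above cover the same 128 bytes of descriptors (4 x 32 = 8 x 16). The sketch below reproduces that arithmetic and builds a batch-size histogram from per-call packet counts. The burst sizes are made-up sample data, the function names are hypothetical, and the share of empty polls is used only as a rough proxy for spinning; it is an assumption of this sketch, not the definition of the DPDK Rx Spin Time metric.

```python
from collections import Counter


def batch_histogram(burst_sizes):
    """Count how many rte_eth_rx_burst() calls returned each batch size."""
    return dict(sorted(Counter(burst_sizes).items()))


def empty_poll_fraction(burst_sizes):
    """Fraction of calls that returned zero packets (rough spin proxy)."""
    if not burst_sizes:
        return 0.0
    return sum(1 for n in burst_sizes if n == 0) / len(burst_sizes)


def descriptors_per_writeback(descriptor_bytes, writeback_bytes=128):
    """Descriptors that fit in one 128-byte group (4 x 32 = 8 x 16 = 128)."""
    return writeback_bytes // descriptor_bytes


# Hypothetical per-call packet counts, for illustration only.
sample_bursts = [0, 0, 4, 4, 0, 4, 8, 4, 0, 0]

print(batch_histogram(sample_bursts))     # {0: 5, 4: 4, 8: 1}
print(empty_poll_fraction(sample_bursts)) # 0.5
print(descriptors_per_writeback(32))      # 4
print(descriptors_per_writeback(16))      # 8
```

With real data, a histogram dominated by the bin that matches `descriptors_per_writeback()` for the configured descriptor size would be consistent with the peaks described above.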