Intel® oneAPI DPC++/C++ Compiler Developer Guide and Reference

ID 767253
Date 3/22/2024
Public

Instrumented Profile-Guided Optimization

This content describes traditional Instrumented Profile-Guided Optimization (IPGO).

With this method, profile collection is done in software, and the steps are essentially the same on all platforms.

The instrumentation has significant overhead, which can limit the scenarios in which profile collection can be performed.

Hardware Profile-Guided Optimization (HWPGO) may be a better alternative when profile collection overhead is a concern or Performance Monitoring Unit (PMU)-based feedback is needed.

Please refer to the LLVM Project's Clang Compiler User Manual for more details and information on other software feedback mechanisms.

Usage

  1. Compile with optimizations plus -fprofile-generate=app.profraw.

    This option adds instrumentation code that records the executable's execution profile. Expect the instrumentation to slow down execution considerably:

    icx -xCORE-AVX512 -Ofast -fprofile-generate=app.profraw app.c -o app

    There is no requirement that a particular linker be used. On Linux, if the linker is invoked directly, you must add the libclang_rt.profile.a library as an input and specify -u__llvm_profile_runtime as a command-line flag.

  2. Create a profile by executing the instrumented executable:

    ./app

    This should leave raw profile data on disk at the location given by the -fprofile-generate option. The app.profraw file name can be overridden by setting the LLVM_PROFILE_FILE environment variable.

    NOTE:
    The profile file name supports special specifiers, such as %m (see Profiling with Instrumentation for more information on %m and the other specifiers), which can help ensure unique file or directory names when multiple processes use the same file system. There is also an icx-specific specifier, %e, which expands to a timestamp: the number of seconds since the Epoch, 1970-01-01 00:00:00 +0000 (UTC).

  3. Use the raw instrumentation profile(s) to create an LLVM profile:

    llvm-profdata merge app.profraw --output app.prof

    If the program was run more than once, multiple profraw files can be listed on the command line and are merged into a single profile.

    NOTE:
    This merge also converts the raw profile to a format understood by the compiler, so this step is required even in the case of a single profraw file.

  4. Recompile specifying the profile information to the compiler:

    icx -xCORE-AVX512 -Ofast -fprofile-use=app.prof app.c -o app
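
The four steps above can be combined into one script. The following is a minimal sketch rather than part of the compiler documentation. It assumes a single source file app.c, a hypothetical profiles directory for raw data, two training runs of ./app, and that icx and llvm-profdata are both on the PATH. It sets LLVM_PROFILE_FILE with the %p specifier (process ID, as described in the Clang manual) so that each run writes its own raw profile; whether that naming scheme suits a given workload is a judgment call.

```shell
#!/bin/sh
# Sketch of the four IPGO steps as one script.
# Assumptions (not from the guide): single source file app.c, raw profiles
# stored in ./profiles, two training runs, tools available on the PATH.
set -eu

COMPILER=icx
OPT_FLAGS="-xCORE-AVX512 -Ofast"   # optimization flags used in the steps above
RAW_DIR=profiles                   # hypothetical directory for raw profiles

# Compile app.c with the shared flags plus one profile-related option.
compile() {
    # $OPT_FLAGS is intentionally unquoted so it splits into separate flags.
    "$COMPILER" $OPT_FLAGS "$1" app.c -o app
}

ipgo_build() {
    # Step 1: instrumented build.
    compile -fprofile-generate=app.profraw

    # Step 2: training runs. LLVM_PROFILE_FILE overrides the raw profile
    # name; %p (process ID) keeps the file from each run distinct.
    mkdir -p "$RAW_DIR"
    rm -f "$RAW_DIR"/*.profraw     # drop raw profiles left by older builds
    LLVM_PROFILE_FILE="$RAW_DIR/app-%p.profraw" ./app
    LLVM_PROFILE_FILE="$RAW_DIR/app-%p.profraw" ./app

    # Step 3: merge every raw profile into one profile the compiler can read.
    llvm-profdata merge "$RAW_DIR"/*.profraw --output app.prof

    # Step 4: optimized rebuild guided by the merged profile.
    compile -fprofile-use=app.prof
}

# Build only when the toolchain and the source file are present.
if command -v "$COMPILER" >/dev/null 2>&1 &&
   command -v llvm-profdata >/dev/null 2>&1 &&
   [ -f app.c ]; then
    ipgo_build
else
    echo "icx, llvm-profdata, or app.c not found; nothing was built"
fi
```

In practice the two training runs would use representative inputs, since the quality of the optimized build depends on how closely the profiled runs match real usage.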