Intel® oneAPI DPC++/C++ Compiler Developer Guide and Reference

ID 767253
Date 9/08/2022
Public


Performance and Large Program Considerations

IPO-related Performance Issues

There are some general optimization guidelines for using IPO that you should keep in mind:

  • Using IPO on very large programs might trigger internal limits of other compiler optimization phases.

  • Applications where the compiler does not have sufficient intermediate representation (IR) coverage to do whole program analysis might not perform as well as those where IR information is complete.

In addition to these general guidelines, there are some practices to avoid while using IPO. The following list summarizes the activities to avoid:

  • Do not run the link phase of an IPO compilation on mock object files produced for your application by a different compiler. Intel® compilers cannot inspect mock object files generated by other compilers for optimization opportunities.

  • Do not call the default system linkers when using IPO from makefiles or scripts. Update them to call the appropriate Intel linkers: on Linux, replace all instances of ld with xild; on Windows, replace all instances of link with xilink.
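
As a rough illustration of the linker substitution described above, the following shell sketch maps a default linker name to its IPO-aware counterpart. The helper name ipo_linker and the LD variable are hypothetical, and passing LD on the make command line only helps if your makefile actually invokes the linker through $(LD).

```shell
# Hypothetical helper (not part of the compiler): map a default linker name
# to the Intel linker that the guideline above says to use with IPO.
ipo_linker() {
    case "$1" in
        ld)   echo "xild" ;;    # Linux
        link) echo "xilink" ;;  # Windows
        *)    echo "$1" ;;      # anything else is left unchanged
    esac
}

# Assumption: the makefile calls the linker through $(LD).
# This only prints the command line; it does not run make.
echo "make LD=$(ipo_linker ld)"
```

Makefiles that hard-code ld or link in their rules still need to be edited by hand, since a command-line variable override does not reach them.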

IPO for Large Programs

In most cases, IPO generates a single true object file for the link-time compilation. This behavior is not optimal for very large programs and can even make it impossible to use the [Q]ipo compiler option on the application.

The compiler provides two methods to avoid this problem. The first method is an automatic size-based heuristic, which causes the compiler to generate multiple true object files for large link-time compilations. The second method is to manually instruct the compiler to perform multi-object IPO:

  • Use the [Q]ipoN compiler option and pass an integer value in the place of N.

Regardless of the method used, it is best to use the compiler defaults first and examine the results. If the defaults do not provide the desired results, experiment with generating a different number of object files.

Use [Q]ipoN to Create Multiple Object Files

If you specify [Q]ipo0, which is the same as not specifying a value, the compiler uses heuristics to determine whether to create one or more object files based on the expected size of the application. The compiler generates one object file for small applications, and two or more object files for large applications. If you specify any value greater than 0, the compiler generates that number of object files, unless the value exceeds the number of source files. In that case, the compiler creates one object file for each source file and then stops generating object files. The generated object files follow OS-specific naming conventions.

The following example commands demonstrate how to use the [Q]ipo2 option to compile large programs.

Linux

icpx -fsycl -ipo2 -c a.cpp b.cpp
The resulting object files are ipo_out.o, ipo_out1.o, and ipo_out2.o.

Windows

icx -fsycl /Qipo2 /c a.cpp b.cpp
The resulting object files are ipo_out.obj, ipo_out1.obj, and ipo_out2.obj.

Link the resulting object files as shown in Use Interprocedural Optimization.
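
Because the number of generated files can vary, a build script may prefer to gather them by name pattern instead of listing them by hand. The sketch below assumes the ipo_out*.o naming shown in the Linux example. The function name collect_ipo_objects, the output name app, and the commented-out link line are illustrative assumptions; the authoritative link command is the one in Use Interprocedural Optimization.

```shell
# Hypothetical helper: print every ipo_out*.o file found in a directory,
# one per line. Prints nothing if the directory holds no such files.
collect_ipo_objects() {
    dir=${1:-.}
    for f in "$dir"/ipo_out*.o; do
        # When nothing matches, the unexpanded pattern is skipped here.
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Assumed link step, left as a comment so the sketch runs without the compiler:
# icpx -fsycl $(collect_ipo_objects .) -o app
```

The pattern also matches ipo_out.o itself, so all three files from the -ipo2 example are picked up.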

Code Layout and Multi-object IPO

One of the optimizations performed during an IPO compilation is code layout. The analysis performed by the compiler during multi-file IPO determines a layout order for all of the routines for which it has intermediate representation (IR) information. For a multi-object IPO compilation, the compiler must tell the linker about the desired order.

The compiler first puts each routine in a named text section that varies depending on the operating system:

Linux

The first routine is placed in .text00001, the second is placed in .text00002, and so on.

Windows

The first routine is placed in .text$00001, the second is placed in .text$00002, and so on.
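
The naming scheme above can be sketched as a small shell function. This is only a model of the two examples given: the five-digit zero padding is inferred from .text00001 and .text00002, and the function name ipo_section_name is hypothetical.

```shell
# Hypothetical helper: print the text section name for the routine at a given
# 1-based position in the layout order.
#   $1 = platform (linux or windows)
#   $2 = position, written without leading zeros
ipo_section_name() {
    case "$1" in
        linux)   printf '.text%05d\n' "$2" ;;
        windows) printf '.text$%05d\n' "$2" ;;  # '$' is literal in single quotes
        *)       return 1 ;;                    # unknown platform
    esac
}

ipo_section_name linux 1     # prints .text00001
ipo_section_name windows 2   # prints .text$00002
```

On Linux, a tool such as readelf -S applied to one of the generated object files lists its sections, which is one way to check the names the compiler actually emitted.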
