Performance and Large Program Considerations

IPO-related Performance Issues

There are some general optimization guidelines for using IPO that you should keep in mind:
  • Using IPO on very large programs might trigger internal limits of other compiler optimization phases.
  • Applications where the compiler does not have sufficient intermediate representation (IR) coverage to do whole program analysis might not perform as well as those where IR information is complete.
In addition to these general guidelines, observe the following practices when using IPO:
  • Do not run the link phase of an IPO compilation on mock object files that a different compiler produced for your application. Intel® compilers cannot inspect mock object files generated by other compilers for optimization opportunities.
  • Update makefiles and build scripts to call the appropriate Intel linker when using IPO. On Linux, replace all instances of ld with xild; on Windows, replace all instances of link with xilink.
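The linker substitution described in the last item can be sketched as a small script. This is a minimal illustration, not an Intel-provided tool: the helper name use_intel_linker and the LINKER_MAP table are hypothetical, and the pattern only rewrites the linker name where it appears as a standalone word, so paths such as /usr/bin/ld or names such as link.exe are left untouched and need manual review.

```python
import re

# Hypothetical mapping taken from the guideline above:
# system linker name -> Intel IPO-aware linker name.
LINKER_MAP = {"linux": ("ld", "xild"), "windows": ("link", "xilink")}


def use_intel_linker(makefile_text: str, target_os: str) -> str:
    """Replace standalone uses of the system linker with the Intel linker."""
    old, new = LINKER_MAP[target_os]
    # Match the linker name only when it is not part of a longer word,
    # path, file name, or option (for example "build", "xild", "-ld").
    pattern = re.compile(rf"(?<![\w./-]){re.escape(old)}(?![\w./-])")
    return pattern.sub(new, makefile_text)


if __name__ == "__main__":
    print(use_intel_linker("LD = ld\n\tld -o app a.o b.o\n", "linux"))
```

Because an already converted name such as xild is not matched again, the sketch can be run repeatedly on the same file without changing it further.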

IPO for Large Programs

In most cases, IPO generates a single true object file for the link-time compilation. This behavior is not optimal for very large programs and can even make it impossible to use the [Q]ipo compiler option on the application.
The compiler provides two methods to avoid this problem. The first method is an automatic size-based heuristic, which causes the compiler to generate multiple true object files for large link-time compilations. The second method is to manually instruct the compiler to perform multi-object IPO.
  • Use the [Q]ipoN compiler option and pass an integer value in the place of N.
Regardless of the method used, it is best to use the compiler defaults first and examine the results. If the defaults do not provide the desired results then experiment with generating a different number of object files.

Using [Q]ipoN to Create Multiple Object Files

If you specify [Q]ipo0, which is the same as not specifying a value, the compiler uses heuristics to determine whether to create one or more object files based on the expected size of the application. The compiler generates one object file for small applications, and two or more object files for large applications. If you specify any value greater than 0, the compiler generates that number of object files, unless the value exceeds the number of source files. In that case, the compiler creates one object file for each source file and then stops generating object files.
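The rule described above can be modeled in a few lines. This is only a sketch of the documented behavior, not compiler code: the function name requested_object_files is hypothetical, and the size-based heuristic used for a value of 0 is internal to the compiler, so the sketch returns None in that case rather than guessing a count.

```python
from typing import Optional


def requested_object_files(n: int, num_source_files: int) -> Optional[int]:
    """Model the number of object files produced for [Q]ipoN.

    Returns None when N is 0, because the compiler then chooses the count
    with its own size-based heuristic (one file for small applications,
    two or more for large ones).
    """
    if n < 0 or num_source_files < 1:
        raise ValueError("N must be >= 0 and there must be at least one source file")
    if n == 0:
        return None
    # A value larger than the number of source files is capped:
    # the compiler creates one object file per source file and stops.
    return min(n, num_source_files)


if __name__ == "__main__":
    for n in (0, 2, 8):
        print(n, requested_object_files(n, num_source_files=3))
```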
The following example commands demonstrate how to use the [Q]ipo2 option to compile large programs.
Operating System    Example Command
Linux*              dpcpp -ipo2 -c a.cpp b.cpp
Windows*            dpcpp-cl /Qipo2 /c a.cpp b.cpp
In executing the above commands, the compiler generates object files using an OS-dependent naming convention. On Linux*, the example command results in object files named ipo_out.o, ipo_out1.o, and ipo_out2.o. On Windows*, the file names follow the same convention; however, the file extensions will be .obj.
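The naming pattern in the file names listed above can be illustrated with a short helper, which may be useful when a build script needs to collect the generated files. This is a sketch under assumptions: the function name ipo_object_names is hypothetical, the numbering pattern is inferred only from the three names shown, and the number of files is taken as an input because the compiler decides how many it actually emits.

```python
from typing import List


def ipo_object_names(count: int, target_os: str) -> List[str]:
    """List object file names following the pattern shown in the example.

    The first file has no numeric suffix; later files are numbered from 1.
    The extension is .o on Linux and .obj on Windows.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    ext = ".obj" if target_os == "windows" else ".o"
    return ["ipo_out" + ("" if i == 0 else str(i)) + ext for i in range(count)]


if __name__ == "__main__":
    print(ipo_object_names(3, "linux"))
    print(ipo_object_names(3, "windows"))
```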
Link the resulting object files as shown in Using IPO.

Understanding Code Layout and Multi-Object IPO

One of the optimizations performed during an IPO compilation is code layout. The analysis performed by the compiler during multi-file IPO determines a layout order for all of the routines for which it has intermediate representation (IR) information. For a multi-object IPO compilation, the compiler must tell the linker about the desired order.
The compiler first puts each routine in a named text section that varies depending on the operating system:
Linux:
  • The first routine is placed in .text00001, the second is placed in .text00002, and so on.
Windows:
  • The first routine is placed in .text$00001, the second is placed in .text$00002, and so on.
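The section naming scheme above can be expressed as a small function. This is an illustrative sketch, not the compiler's implementation: the name text_section_name is hypothetical, and the five-digit zero padding is inferred from the two examples given for each operating system.

```python
def text_section_name(position: int, target_os: str) -> str:
    """Return the named text section for the routine at a 1-based layout position."""
    if position < 1:
        raise ValueError("position is 1-based")
    # Windows section names carry a '$' between the base name and the number.
    separator = "$" if target_os == "windows" else ""
    return f".text{separator}{position:05d}"


if __name__ == "__main__":
    for os_name in ("linux", "windows"):
        print([text_section_name(i, os_name) for i in (1, 2, 3)])
```

With zero-padded numbers of equal width, sorting the section names as strings gives the same order as sorting the routines by layout position.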

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.