Invoke the Profiler Runtime Wrapper to Obtain Profiling Data
After compiling your SYCL* program using the Intel® oneAPI DPC++/C++ Compiler, you can profile your FPGA design using the Profiler Runtime Wrapper. The Profiler Runtime Wrapper calls your executable and collects profile information at a given sample rate. The performance counter data is saved in a profile.mon monitor description file that the Profiler Runtime Wrapper post-processes and outputs into a readable profile.json file. You are encouraged to use profile.json for further data processing instead of profile.mon. However, both files are available for use after host execution completes.
To invoke the Profiler Runtime Wrapper, execute the following command:
aocl profile [options] /path/to/executable [executable options]
where:
- [options] are any additional flags you want to pass to the wrapper. Refer to aocl profile --help for a list of options and their uses.
- /path/to/executable is the path to the executable generated by the compiler.
- [executable options] are any options or arguments that need to be passed along to the executable.
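The command layout above can be sketched as a small Python helper that assembles the argument list in the documented order (wrapper options, then the executable, then the executable's own arguments). This is a minimal sketch, not part of the Intel tooling: the function names build_profile_command and run_profile are hypothetical, and the sketch assumes aocl is on your PATH.

```python
import shlex
import subprocess


def build_profile_command(executable, wrapper_options=(), executable_args=()):
    """Return the argv list for: aocl profile [options] executable [executable options]."""
    # Wrapper options must come before the executable; anything after the
    # executable path is passed along to the executable itself.
    return ["aocl", "profile", *wrapper_options, executable, *executable_args]


def run_profile(executable, wrapper_options=(), executable_args=()):
    """Run the wrapper; raises CalledProcessError if it exits with a non-zero status."""
    command = build_profile_command(executable, wrapper_options, executable_args)
    print("Running:", shlex.join(command))
    subprocess.run(command, check=True)
```

For example, run_profile("./fpga_app", executable_args=["input.dat"]) would invoke aocl profile ./fpga_app input.dat, where fpga_app and input.dat are placeholder names.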
Because of slow network disk accesses, running the host application from a networked directory might introduce delays between kernel executions. These delays might increase the overall execution time of the host application. In addition, they might introduce delays during kernel executions while the runtime stores profile output data to disk.
Split the Execution and Data Post-Processing
By default, the Profiler Runtime Wrapper automatically runs a post-processing step on your profile.mon monitor file to produce a readable profile.json file. In some situations, the post-processing step may take longer than expected. In that case, you can split execution and data post-processing into two manual steps by using the --no-json and --no-run <path to profile.mon file> Profiler Runtime Wrapper options.
- The --no-json flag only runs your executable and produces a profile.mon monitor file without post-processing it.
- The --no-run <path to profile.mon file> flag does not invoke your executable and instead just calls the post-processing step on the supplied profile.mon file.
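The two-step flow described above might be scripted as follows. This is a sketch under stated assumptions: the helper names are hypothetical, and the default mon_path assumes that profile.mon is written to the current working directory, which you should confirm for your setup.

```python
import subprocess


def split_profile_commands(executable, mon_path="profile.mon", executable_args=()):
    """Return (run_only, post_process_only) argv lists for the two manual steps."""
    # Step 1: run the executable and keep the raw profile.mon, skipping post-processing.
    run_only = ["aocl", "profile", "--no-json", executable, *executable_args]
    # Step 2: post-process an existing profile.mon without running the executable.
    post_process_only = ["aocl", "profile", "--no-run", mon_path]
    return run_only, post_process_only


def profile_in_two_steps(executable, mon_path="profile.mon", executable_args=()):
    """Run both steps back to back, stopping if either one fails."""
    run_only, post_process_only = split_profile_commands(
        executable, mon_path, executable_args
    )
    subprocess.run(run_only, check=True)
    subprocess.run(post_process_only, check=True)
```

Keeping the two commands separate also lets you run the second step later or on a different machine, as long as the profile.mon file is available there.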
Temporal Performance Collection
During the run of your host application, the Profiler collects performance counter data at a given sample rate n. After n cycles, the Profiler collects the performance counter data and outputs it to the profile.mon monitor file.
- You can control the rate at which the Profiler counters are sampled by setting the Profiler Runtime Wrapper's -period flag. The specified period is the minimum number of kernel pipeline clock cycles between profiling samples. If you do not set a period, the default behavior is to profile as often as possible.
  For particularly large or long-running designs, the amount of data generated by the default temporal period might result in very large profile.mon and profile.json files. To reduce this file size, increase the sampling period or turn off temporal profiling.
- To turn off temporal profiling and instead collect performance data only once a kernel has finished executing, you can set the Profiler Runtime Wrapper's -no-temporal flag. If you collect the performance data only at the end of execution, the data is an average representation of the kernel's overall execution.
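The choice between the two flags, and the effect of the period on output size, can be illustrated with a short sketch. The helper names are hypothetical. The sketch assumes the period value follows -period as a separate argument, so check the wrapper's help output for the exact syntax. The sample-count estimate is only a rough upper bound derived from the statement that the period is a minimum gap between samples.

```python
def temporal_options(period=None, temporal=True):
    """Return wrapper flags for the chosen sampling behavior."""
    if not temporal:
        # Collect data only once each kernel has finished executing.
        if period is not None:
            # Rejected here only to keep the sketch unambiguous.
            raise ValueError("choose either a sampling period or no temporal profiling")
        return ["-no-temporal"]
    if period is None:
        # Default behavior: sample as often as possible.
        return []
    if period <= 0:
        raise ValueError("period must be a positive number of clock cycles")
    # Assumption: the value is passed as a separate argument after the flag.
    return ["-period", str(period)]


def max_sample_count(kernel_cycles, period):
    """Rough upper bound on temporal samples: at most one sample per period."""
    return kernel_cycles // period
```

As a worked example, a kernel that runs for 1,000,000,000 clock cycles with a period of 10,000 cycles yields at most 100,000 temporal samples. Raising the period to 1,000,000 cycles lowers that bound to 1,000, which is why a larger period reduces the size of the profile.mon and profile.json files.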