Intel® VTune™ Profiler

User Guide

ID 766319
Date 3/22/2024

Instrument Your Application

To get the most out of the ITT APIs when collecting performance data with Intel® VTune™ Profiler, add API calls to your code that designate logical tasks. These calls help you visualize how the tasks in your code relate to one another, including when each task starts and ends relative to other CPU and GPU tasks.

At the highest level, a task is a logical group of work that executes on a specific thread. A task can correspond to any grouping of code within your program that you consider important. Mark up your code by identifying the beginning and end of each logical task with __itt_task_begin and __itt_task_end calls. For example, to track "smoke rendering" separately from "detailed shadows", add API tracking calls to the code modules for these specific features.

To get started, use the following API calls:

  • __itt_domain_create() creates a domain, which most ITT API calls require. You need to define at least one domain.
  • __itt_string_handle_create() creates string handles for identifying your tasks. Identifying traces by string handle is more efficient than identifying them by string.
  • __itt_task_begin() marks the beginning of a task.
  • __itt_task_end() marks the end of a task.
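
As a minimal sketch of the four calls above, the following single-threaded C++ fragment tags the "smoke rendering" and "detailed shadows" features as separate tasks. The domain name Example.Renderer, the functions render_smoke, render_shadows, and render_frame, and the USE_REAL_ITT build flag are hypothetical names chosen for illustration. When that flag is not defined, the block substitutes recording stand-ins for the ITT calls so that it can run without the ITT headers; the stand-ins are not Intel's implementation.

```cpp
#include <cassert>
#include <string>
#include <vector>

#ifdef USE_REAL_ITT  // hypothetical flag: define it when building against the ITT library
#include <ittnotify.h>
#else
// Stand-ins with the same names and argument order as the calls listed above.
// They only record events so the sketch runs anywhere; NOT the real library.
struct __itt_domain { std::string name; };
struct __itt_string_handle { std::string name; };
struct __itt_id { int unused; };
static const __itt_id __itt_null = {0};
static std::vector<std::string> g_events;  // event log kept by the stand-ins

__itt_domain* __itt_domain_create(const char* name) {
    return new __itt_domain{name};  // lives for the whole process, never freed
}
__itt_string_handle* __itt_string_handle_create(const char* name) {
    return new __itt_string_handle{name};
}
void __itt_task_begin(const __itt_domain*, __itt_id, __itt_id, __itt_string_handle* h) {
    assert(h != nullptr);
    g_events.push_back("begin " + h->name);
}
void __itt_task_end(const __itt_domain*) { g_events.push_back("end"); }
#endif

// One domain for the application, and one string handle per feature to track.
static __itt_domain* g_domain = __itt_domain_create("Example.Renderer");
static __itt_string_handle* g_smoke = __itt_string_handle_create("smoke rendering");
static __itt_string_handle* g_shadows = __itt_string_handle_create("detailed shadows");

// Hypothetical feature code: each function wraps its work in one task.
void render_smoke() {
    __itt_task_begin(g_domain, __itt_null, __itt_null, g_smoke);
    // ... smoke rendering work would go here ...
    __itt_task_end(g_domain);
}

void render_shadows() {
    __itt_task_begin(g_domain, __itt_null, __itt_null, g_shadows);
    // ... shadow rendering work would go here ...
    __itt_task_end(g_domain);
}

// Runs both features in sequence, producing two separate tasks on this thread.
void render_frame() {
    render_smoke();
    render_shadows();
}
```

Creating the domain and the string handles once, at startup, keeps the per-task cost down to the begin and end calls themselves.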


The following sample shows how these four basic ITT API functions are used in a multithreaded application:

#include <windows.h>
#include <ittnotify.h>
#include <atomic>

// Forward declaration of a thread function.
DWORD WINAPI workerthread(LPVOID);
// Set by main to tell the worker threads to stop.
std::atomic<bool> g_done(false);
// The narrow string literals below assume a non-Unicode (ANSI) build.
// Create a domain that is visible globally: we will use it in our example.
__itt_domain* domain = __itt_domain_create("Example.Domain.Global");
// Create string handles for the "main" and "CreateThread" tasks.
__itt_string_handle* handle_main = __itt_string_handle_create("main");
__itt_string_handle* handle_createthread = __itt_string_handle_create("CreateThread");

int main(int, char* argv[])
{
  // Create a task associated with the "main" routine.
  __itt_task_begin(domain, __itt_null, __itt_null, handle_main);
  // Now we'll create 4 worker threads.
  for (int i = 0; i < 4; i++)
  {
    // We might be curious about the cost of CreateThread. We add tracing to do the measurement.
    __itt_task_begin(domain, __itt_null, __itt_null, handle_createthread);
    ::CreateThread(NULL, 0, workerthread, (LPVOID)(INT_PTR)i, 0, NULL);
    __itt_task_end(domain);
  }
  // Wait a while (the duration is arbitrary), then tell the workers to stop.
  ::Sleep(5000);
  g_done = true;
  // Mark the end of the main task.
  __itt_task_end(domain);
  return 0;
}

// Create string handle for the work task.
__itt_string_handle* handle_work = __itt_string_handle_create("work");
DWORD WINAPI workerthread(LPVOID data)
{
  // Set the name of this thread so it shows up in the UI as something meaningful.
  char threadname[32];
  wsprintf(threadname, "Worker Thread %d", (int)(INT_PTR)data);
  __itt_thread_set_name(threadname);
  // Each worker thread does some number of "work" tasks.
  while (!g_done)
  {
    __itt_task_begin(domain, __itt_null, __itt_null, handle_work);
    ::Sleep(150);  // Placeholder for real work; the duration is arbitrary.
    __itt_task_end(domain);
  }
  return 0;
}
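
Because every __itt_task_begin call needs a matching __itt_task_end call, one possible way to keep the pair balanced in C++ is a small scope guard that begins the task in its constructor and ends it in its destructor. The sketch below is one such design, not part of the ITT API: the ScopedTask class, the do_work function, and the USE_REAL_ITT build flag are hypothetical names. When the flag is not defined, recording stand-ins replace the ITT calls so that the block runs without the ITT headers; they are not Intel's implementation.

```cpp
#include <cassert>
#include <string>
#include <vector>

#ifdef USE_REAL_ITT  // hypothetical flag: define it when building against the ITT library
#include <ittnotify.h>
#else
// Recording stand-ins with the same names and argument order as the ITT
// calls used in the sample above. They are NOT the real library.
struct __itt_domain { std::string name; };
struct __itt_string_handle { std::string name; };
struct __itt_id { int unused; };
static const __itt_id __itt_null = {0};
static std::vector<std::string> g_events;  // event log kept by the stand-ins

__itt_domain* __itt_domain_create(const char* name) {
    return new __itt_domain{name};  // lives for the whole process, never freed
}
__itt_string_handle* __itt_string_handle_create(const char* name) {
    return new __itt_string_handle{name};
}
void __itt_task_begin(const __itt_domain*, __itt_id, __itt_id, __itt_string_handle* h) {
    assert(h != nullptr);
    g_events.push_back("begin " + h->name);
}
void __itt_task_end(const __itt_domain*) { g_events.push_back("end"); }
#endif

// Hypothetical helper: begins a task on construction and ends it on
// destruction, so every exit path from the enclosing scope ends the task.
class ScopedTask {
public:
    ScopedTask(const __itt_domain* domain, __itt_string_handle* name)
        : domain_(domain) {
        __itt_task_begin(domain_, __itt_null, __itt_null, name);
    }
    ~ScopedTask() { __itt_task_end(domain_); }

    // Copying would end the same task twice, so it is disabled.
    ScopedTask(const ScopedTask&) = delete;
    ScopedTask& operator=(const ScopedTask&) = delete;

private:
    const __itt_domain* domain_;
};

static __itt_domain* g_domain = __itt_domain_create("Example.Domain.Global");
static __itt_string_handle* g_work = __itt_string_handle_create("work");

// Hypothetical work function with two exit paths; both end the task.
bool do_work(int input) {
    ScopedTask task(g_domain, g_work);
    if (input < 0) {
        return false;  // early exit: the destructor still ends the task
    }
    // ... real work would go here ...
    return true;
}
```

The guard must be a named local variable, as in do_work. An unnamed temporary would be destroyed at the end of its own statement, which would end the task immediately.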