Tracing Distributed Non-MPI Applications
- The application handles startup and termination of all processes itself. Both startup with a fixed number of processes and dynamic spawning of processes are supported, but spawning is an expensive operation and should not be done too frequently.
- For a reliable startup, the application has to gather a short string from every process in one place to bootstrap the TCP/IP communication in Intel Trace Collector. Alternatively, one process is started first and its string is passed to the others. In this case you can assume that the string is always the same for each program run, but this is less reliable because the string encodes a dynamically chosen port which may change.
- Map the hostname to an IP address that all processes can connect to.
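The hostname mapping in the last item can be checked before any process starts. The following is a minimal sketch in Python, not part of Intel Trace Collector: the helper names `resolve_ipv4` and `reachable_candidates` are hypothetical, and the sketch assumes IPv4 and treats loopback addresses as unusable because processes on other machines cannot connect to them.

```python
import ipaddress
import socket


def resolve_ipv4(hostname):
    """Return the distinct IPv4 addresses that `hostname` resolves to, in resolver order."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    addresses = []
    for info in infos:
        addr = info[4][0]  # for AF_INET the sockaddr is (address, port)
        if addr not in addresses:
            addresses.append(addr)
    return addresses


def reachable_candidates(hostname):
    """Drop loopback addresses, which processes on other machines cannot connect to."""
    return [addr for addr in resolve_ipv4(hostname)
            if not ipaddress.ip_address(addr).is_loopback]
```

If `reachable_candidates(socket.gethostname())` returns an empty list on some machine, its hostname probably resolves only to a loopback address, and the name resolution on that machine needs to be adjusted before tracing.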
- The application initializes itself and its communication.
- The application initializes communication between VTserver and processes.
- Trace data is collected locally by each process.
- VT data collection is finalized, which moves the data from the processes to the VTserver, where it is written into a file.
- The application terminates.
- it requires a more complex communication between the application and VTserver
- the startup time for 2 is expected to be sufficiently small
- reusing the existing communication would only work well if the selection of active processes does not change
Initializing and Finalizing
- The application server initiates its processes.
- Each process calls VT_clientinit().
- VT_clientinit() allocates a port for TCP/IP communication with the VTserver or other clients and generates a string which identifies the machine and this port.
- Each process gets its own string as the result of VT_clientinit().
- The application collects these strings in one place and calls VTserver with all strings as soon as all clients are ready. The VT configuration is given to the VTserver as a file or through command-line options.
- Each process calls VT_initialize() to actually establish communication.
- The VTserver establishes communication with the processes, then waits for them to finalize the trace data collection.
- Trace data collection is finalized when all processes have called VT_finalize().
- Once the VTserver has written the trace file, it quits with a return code indicating success or failure.
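How the strings are collected in one place is left to the application. As one possible illustration, the sketch below gathers one newline-terminated string per process over plain TCP. It is written in Python and does not call the VT API: the functions `gather_contacts` and `report_contact` are hypothetical names, and in a real application the string each process sends would be the one it received from VT_clientinit().

```python
import socket
import threading


def gather_contacts(listener, expected):
    """Accept `expected` connections and read one newline-terminated string from each."""
    contacts = []
    for _ in range(expected):
        conn, _addr = listener.accept()
        with conn, conn.makefile("r", encoding="utf-8") as reader:
            contacts.append(reader.readline().rstrip("\n"))
    return contacts


def report_contact(address, contact):
    """Send this process's contact string to the collecting process."""
    with socket.create_connection(address) as conn:
        conn.sendall((contact + "\n").encode("utf-8"))


if __name__ == "__main__":
    # Demonstration on one machine: threads stand in for the traced processes,
    # and the strings are placeholders, not real VT_clientinit() output.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(8)
    placeholders = ["placeholder-contact-0", "placeholder-contact-1"]
    senders = [threading.Thread(target=report_contact,
                                args=(listener.getsockname(), text))
               for text in placeholders]
    for sender in senders:
        sender.start()
    print(sorted(gather_contacts(listener, len(placeholders))))
    for sender in senders:
        sender.join()
    listener.close()
```

The collector returns only after every expected process has reported, which matches the requirement that VTserver is called once all clients are ready. The strings arrive in connection order, so the result is not sorted by process.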
Running without VTserver
- a new VT_COMM_WORLD which contains all of the spawned processes, but not the spawning process
- a communicator which contains the spawning process and the spawned ones; the spawning process gets it as the result from VT_attach() and the spawned processes by calling VT_get_parent()
VTserver <contact infos> [config options]
VTserver <contact1> <contact2> --logfile-name example.stf
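A launcher script can assemble the invocation shown above from the gathered contact strings. The following Python sketch is one way to do that, under stated assumptions: `build_vtserver_command` and `run_vtserver` are hypothetical helpers, `VTserver` is assumed to be on the search path, only the `--logfile-name` option from the example is used, and a return code of 0 is assumed to mean success.

```python
import subprocess


def build_vtserver_command(contacts, logfile_name=None, executable="VTserver"):
    """Build the argument list: executable, all contact strings, then options."""
    if not contacts:
        raise ValueError("at least one contact string is required")
    command = [executable, *contacts]
    if logfile_name is not None:
        command += ["--logfile-name", logfile_name]
    return command


def run_vtserver(contacts, logfile_name=None):
    """Run VTserver and wait for it; assumes return code 0 indicates success."""
    result = subprocess.run(build_vtserver_command(contacts, logfile_name))
    return result.returncode == 0
```

Passing the arguments as a list instead of a single shell string avoids quoting problems if a contact string contains characters that the shell would interpret.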