What is a "Region"?
A "region" is a slice of execution of a program. It is demarcated by dynamic "start" (EVENT_START) and "stop" (EVENT_STOP) triggers as a program is running.
e.g., the SDE-PinPlay logger starts recording a pinball on an EVENT_START and stops recording it on an EVENT_STOP.
Specifying "Region of interest" for Pin and SDE based analyses
Pintools can include $PIN_ROOT/source/tools/InstLib/control_manager.H to define a "CONTROL_MANAGER". See the example usage in $PIN_ROOT/source/tools/InstLibExamples/control.cpp.
Similar use of the "CONTROL_MANAGER" in SDE can be seen in $SDE_BUILD_KIT/pinkit/sde-example/example/controller-example.cpp.
CONTROL_MANAGER makes a number of switches available to the pintool (with an optional prefix). These switches specify when "EVENT_START" and "EVENT_STOP" callbacks are delivered to the pintool, which can then choose to take some action (such as 'enable analysis' or 'disable analysis').
Region triggers from a pintool
A pintool can start/stop regions programmatically by calling the following CONTROL_MANAGER method:
//trigger all registered control handlers
//eventID - the Id of the event
//tid - the triggering thread
//bcast - whether this event affects all threads
VOID Fire(EVENT_TYPE eventID, CONTEXT* ctx, VOID * ip, THREADID tid, BOOL bcast);
For example, the 'gdb_record' script invokes the underlying pinplay-driver tool with "-log:controller_default_start 0", and the pinplay-driver tool then triggers EVENT_START/EVENT_STOP whenever the user issues the "(gdb) pin record on" and "(gdb) pin record off" commands.
What is the controller?
The controller is a flexible, powerful way for users to define a slice of execution that they want SDE or an SDE-based tool to profile. The tool is responsible for supporting the controller events and for defining its behavior in response to incoming events. See the details about controller support in SDE-based tools below.
The user can define a set of alarms which will trigger the events to be fired by the controller. The tool will receive these events, and act accordingly.
The controller provides a set of predefined alarms which include:
- icount: instruction count
- address: a real virtual address, a function name (symbol), image+offset
- itext: sequence of raw bytes which is interpreted as an instruction
- ssc: a code sequence consisting of 2 instructions, the first of which has an immediate that identifies this marker
- int3: an embedded int3 instruction
- isa-extension/isa-category: the execution of an instruction that belongs to this XED group (ISA extension or ISA category)
- cpuid: a cpuid instruction with a special input (as defined by the input registers)
- magic: a code sequence also used by other simulation tools which has two input values that identify the marker
- pcontrol: entering the MPI_Pcontrol function with a specific string argument that identifies the marker
- enter_func: entering a function with this name
- exit_func: returning from a function with this name
- interactive: the user interactively sends the event to the process from another window on the same machine
- timeout: number of seconds to trigger event
The controller exposes a knob, "-control", which takes the alarm definition as its string argument. This is a flexible way for the user to control the region being profiled by the tool.
The following describes the behavior for single-threaded applications. Multi-threaded applications are described in a later paragraph.
Each instance of the "-control" knob defines an alarm-chain as:
A full definition of the syntax, and the available alarm-types can be found below.
This defines that the <event> will be fired by the controller once the <alarm> identified by the <value> is executed. If ":count<int>" is used, the event will be fired only when the alarm identified by the <value> is executed for the <int>-th time.
The controller will fire a start event, once 100 instructions are executed.
The controller will fire a stop event, once we reach the symbol 'foo' for the 3rd time.
Repeat - Repeating an alarm multiple times
By default the alarm is "armed" only once for each thread. Once the event is fired, this thread won't trigger an additional event, for this alarm. If one adds the ",repeat:<int>" to the end of the alarm, it will be activated <int> times. A repeat token without an argument means that the alarm will be fired every time it is executed.
The controller will fire a start event once 100 instructions are executed. No more events will be fired.
The controller will fire a start event once 100 instructions are executed. It will then "re-arm" the icount alarm and fire another start event after an additional 100 instructions are executed, and then repeat this once more. So 3 'start' events are fired, one every 100 instructions. This differs from the ":count3" syntax, which fires a single start event only once the condition of 100 instructions is reached for the 3rd time.
Omitting the repeat count
The controller will fire a start event, every 100 instructions that are executed.
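The difference between ":count<int>" and ",repeat:<int>" can be illustrated with a small Python model. This is only a sketch under stated assumptions: the function name and its parameters are hypothetical, it models a single thread and counts individual instructions rather than basic blocks, and it is not the controller's actual implementation.

```python
def icount_alarm_events(total_instructions, threshold, count=1, repeat=1):
    """Return the instruction counts at which the modeled event fires.

    threshold: the icount value of the alarm (e.g. 100).
    count:     fire only on the count-th time the threshold is reached.
    repeat:    how many times the alarm is re-armed; None models a bare
               'repeat' token (fire every time, without limit).
    All names here are illustrative assumptions, not SDE identifiers.
    """
    fired = []
    since_arm = 0      # instructions executed since the alarm was last armed
    hits = 0           # times the threshold was reached since the last event
    remaining = repeat
    for executed in range(1, total_instructions + 1):
        if remaining is not None and remaining == 0:
            break      # alarm is no longer armed
        since_arm += 1
        if since_arm == threshold:
            since_arm = 0
            hits += 1
            if hits == count:
                fired.append(executed)
                hits = 0
                if remaining is not None:
                    remaining -= 1
    return fired


print(icount_alarm_events(1000, 100))             # default: armed once
print(icount_alarm_events(1000, 100, repeat=3))   # re-armed three times
print(icount_alarm_events(1000, 100, count=3))    # only the 3rd occurrence
print(icount_alarm_events(450, 100, repeat=None)) # bare repeat token
```

In this model the default alarm fires once, repeat=3 fires three events 100 instructions apart, and count=3 fires a single event at the third occurrence.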
Multiple alarms in a chain
Each instance of the "-control" knob defines an “alarm chain”.
An alarm chain is a sequence of alarms separated by the ',' character, where each alarm is activated only after the previous one has fired. The ',' character therefore imposes an order between the preceding alarm and the following alarm.
The controller will fire a start event for each thread which reaches icount 100. Then it will 'arm' the next alarm for that thread. So once a thread reaches the symbol foo for the 3rd time (after reaching icount 100), the controller will fire a stop event for that thread. This will be repeated twice, for each thread.
In addition, one can set the entire chain to start only after some other chain has finished, using the ",name:<name>" and the ",waitfor:<name>" syntax.
In some cases, we want the event to be fired only after a certain condition is met. For example, we want the start event to be fired when the function foo is called from the function bar. This can be done with the pre-condition event type. A pre-condition does not actually call into the tool (i.e., it fires no event); it only arms the next alarm in the chain.
The controller will arm the start event after the bar function is called. Now, when foo is called, the start event will be fired and the region starts. The stop event will be triggered after 100000 instructions have been executed.
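The ordering imposed by an alarm chain, and the way a pre-condition arms the next alarm without firing an event, can be sketched as a small Python model. The function name, the tuple layout, and the use of plain strings as triggers are all assumptions made for illustration; the model covers one thread and ignores counts, repeats, and thread modifiers.

```python
def run_chain(chain, observed):
    """Walk a chain of (event, trigger) pairs over observed triggers.

    chain:    list of (event, trigger) tuples in chain order, where event
              is 'start', 'stop' or 'precond' (hypothetical representation).
    observed: sequence of trigger names in the order they occur.
    Returns the (event, position) pairs that would reach the tool.
    """
    fired = []
    armed = 0  # index of the only alarm that is currently armed
    for position, trigger in enumerate(observed):
        if armed == len(chain):
            break  # the whole chain has completed
        event, expected = chain[armed]
        if trigger == expected:
            if event != "precond":
                fired.append((event, position))  # precond fires nothing
            armed += 1  # arm the next alarm in the chain
    return fired


chain = [("precond", "bar"), ("start", "foo"), ("stop", "baz")]
print(run_chain(chain, ["foo", "bar", "foo", "baz", "foo"]))
```

The first occurrence of foo is ignored because the start alarm is not yet armed; only the occurrence after bar produces a start event.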
The alarm-chains are handled separately for each thread. The controller "arms" the <alarm-type> separately for each thread, and the alarm's value is counted separately for each thread.
If the ":tid<int>" syntax is used, the preceding alarm is armed only for the thread with the defined thread ID.
If the ":global" token is used the alarm will be counted in all threads.
Going back to the example above:
For each thread that reaches an icount of 100, the controller will fire a start event to the tool. The event will be delivered with the tid of the triggering thread.
Adding the ":tid<int>" syntax
Only when the thread with Pin thread ID 3 reaches an icount of 100 does the controller fire the start event to the tool.
Adding the ":bcast" syntax causes the controller to mark the event with the broadcast attribute when it is delivered. The tool can use this information to decide whether to profile all threads based on this event or only the triggering thread, so the effect of "bcast" depends on how the tool handles it. In addition, an alarm defined with the "bcast" token will 'arm' the following alarm in the chain for all threads.
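The three counting scopes described in this section (per thread by default, a single thread with ":tid<int>", all threads together with ":global") can be compared with a short Python model. This is a sketch only: the function and parameter names are hypothetical, and the assumption that a global alarm fires once, attributed to the thread that made the final hit, is an interpretation of the text rather than documented behavior.

```python
from collections import Counter


def firing_points(hits, count=1, tid=None, is_global=False):
    """Return (position, thread) pairs at which the modeled alarm fires.

    hits: thread IDs in the order in which threads reach the alarm.
    """
    if is_global and tid is not None:
        raise ValueError("global cannot be combined with tid")
    per_thread = Counter()  # hits counted separately for each thread
    total = 0               # hits counted across all threads
    done = set()            # threads whose alarm has already fired
    fired = []
    for position, thread in enumerate(hits):
        if is_global:
            total += 1
            if total == count:
                fired.append((position, thread))
                break
            continue
        if tid is not None and thread != tid:
            continue        # alarm is armed only for the given thread
        if thread in done:
            continue        # armed only once per thread by default
        per_thread[thread] += 1
        if per_thread[thread] == count:
            fired.append((position, thread))
            done.add(thread)
    return fired


hits = [0, 1, 0, 1, 1, 0]
print(firing_points(hits, count=3))                  # per thread
print(firing_points(hits, count=3, tid=1))           # only thread 1
print(firing_points(hits, count=3, is_global=True))  # all threads together
```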
The full definition of an alarm chain:
alarm-chain ::= <alarm>[,<alarm-chain>]
alarm ::= <event>:<type>:<value>[:count<int>][:tid<int>][:bcast][:global]
event ::= start|stop|threadid|precond
type ::= icount|address|ssc|itext|isa-extension|isa-category|int3|magic|pcontrol|enter_func|exit_func
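To make the grammar above concrete, here is a minimal Python sketch of a parser for it. Everything in it (the Alarm class, the function names, the field names) is hypothetical and written for illustration. It handles only the per-alarm grammar shown above, accepts the optional modifiers in any order, and does not handle chain-level tokens such as repeat, name, or waitfor.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

EVENTS = {"start", "stop", "threadid", "precond"}


@dataclass
class Alarm:
    event: str
    type: str
    value: Optional[str] = None  # some types (e.g. int3) take no value
    count: int = 1
    tid: Optional[int] = None
    bcast: bool = False
    is_global: bool = False


def parse_alarm(text: str) -> Alarm:
    """Parse '<event>:<type>:<value>[:count<int>][:tid<int>][:bcast][:global]'."""
    fields = text.split(":")
    if len(fields) < 2 or fields[0] not in EVENTS:
        raise ValueError(f"bad alarm: {text!r}")
    alarm = Alarm(event=fields[0], type=fields[1])
    for field in fields[2:]:
        match = re.fullmatch(r"(count|tid)(\d+)", field)
        if match:
            setattr(alarm, match.group(1), int(match.group(2)))
        elif field == "bcast":
            alarm.bcast = True
        elif field == "global":
            alarm.is_global = True
        elif alarm.value is None:
            alarm.value = field
        else:
            raise ValueError(f"unexpected field {field!r} in {text!r}")
    if alarm.is_global and alarm.tid is not None:
        raise ValueError("global cannot be combined with tid")
    return alarm


def parse_chain(text: str) -> List[Alarm]:
    """Split an alarm chain on ',' and parse each alarm in order."""
    return [parse_alarm(part) for part in text.split(",")]


for alarm in parse_chain("start:icount:100,stop:address:foo:count3"):
    print(alarm)
```

The chain string in the usage line is constructed from the grammar for illustration. A value that itself looks like a modifier (for example a function literally named "bcast") would be misread by this sketch.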
Values per type
0x<hex address> [pc address]
<name>+<offset> [image + offset]
<name> [symbol/function name]
ssc: hex (SSC markers are special no-OP instructions built into the binary)
itext: hex (raw bytes of the instruction)
isa-extension: string (the instructions extension as specified by XED)
isa-category: string (the instructions category as specified by XED)
int3: no argument (the int3 instruction is not really executed)
magic: int.int (the instruction xchg ebx,ebx and input/output values are defined by the numbers)
pcontrol: string (the second argument to the MPI_Pcontrol function called by the application)
enter_func: string (bare function name - without namespace and params)
exit_func: string (bare function name - without namespace and params)
interactive: no argument (see below)
timeout: int (number of seconds)
When using the address alarm with an image and offset, the '+' sign is what distinguishes an image name from a function name. The image name can be a full path or only the base name of the image.
Using the interactive controller requires two windows: one to run sde with the application and the interactive controller, the second to specify the start/stop event by using the controller client. Here is an example:
** using file: /tmp/ctrl_file.32564
** listening to port: 34106
> python <kit>/misc/cntrl_client.py /tmp/ctrl_file.32564
Stop event> python <kit>/misc/cntrl_client.py /tmp/ctrl_file.32564
The controller will fire the event once the alarm reaches the triggering condition. The instruction count alarm specifically works at basic-block granularity. The event will be fired at the beginning of a basic block when the current icount plus the number of instructions within the basic block exceeds the value defined in the controller for triggering the event.
Some alarms were modified to be more accurate, and they trigger at the specific instruction rather than at the basic block: address, ssc, isa-extension, isa-category.
Please note that using the controller to specify a region of interest for tracing with pinplay has special handling. The start and stop events always act on all the threads. Due to implementation limitations, there is a small delay between the exact instruction on which the event is fired and the point at which the region actually starts or stops.
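The basic-block granularity of the icount alarm can be illustrated with a few lines of Python. This is a sketch, not the controller's code: the function name is hypothetical, and reading "exceeds" as a strict comparison is an assumption.

```python
def icount_fire_point(bbl_sizes, threshold):
    """Return (block_index, icount_at_block_start) where the event fires.

    bbl_sizes: number of instructions in each executed basic block, in order.
    threshold: the icount value given to the alarm.
    Returns None if the condition is never met.
    """
    icount = 0
    for index, size in enumerate(bbl_sizes):
        # The check happens at the beginning of the block, before it runs.
        if icount + size > threshold:
            return index, icount
        icount += size
    return None


# With 40-instruction blocks and a threshold of 100, the event fires at the
# start of the third block, when only 80 instructions have executed.
print(icount_fire_point([40, 40, 40], 100))
```

The consequence is that the event may be delivered a few instructions away from the exact count requested, depending on where block boundaries fall.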
Special alarm - Uniform:
period: number of instructions before next sampling starts.
length: number of instructions to sample.
count: number of samples.
The alarm can be used to define multiple regions based on instruction count.
repeat[<int>]: number of iterations of the chain (when no number provided - execute in endless loop).
count<int>: delay firing the event until the Nth execution of the alarm; the counting happens for each thread unless global is also specified.
tid<int>: the thread to monitor, events on other threads are ignored.
bcast: inform the tool that the event should be processed for all threads (this behavior is tool specific, and it is in the tool's responsibility).
name<string>: the name of the chain; other chains can “wait” for this chain to finish before they start.
waitfor<string>: start the chain only after the chain with the specified name has finished.
global: count alarm occurrences across all threads rather than per thread. This token cannot appear together with the tid token.
If no start event is defined by the user in the command line, a default start event is armed for each thread.
Controller support in SDE tools
As mentioned above, the controller is a self-contained component within SDE, and everything described so far concerns when and which events the controller fires. This section discusses which SDE tools support the controller and how they handle the events it fires.
List of tools which support the controller API:
- analysis tools: mix, footprint, align-check, chip-check, debugtrace, dynamic-mask-profiler, icount, memory-area-cross, sse-checker
- tracing tool
All analysis tools behave similarly: the tool collects data per thread. It starts collecting data for a thread when a start event triggered by that thread arrives, and it stops collecting data for that thread when a stop event arrives for that thread. If the tool receives an event with "bcast" as the tid, rather than the triggering tid, the tool will apply this event to all threads.
These events are effective right at the point of arrival.
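The per-thread behavior of the analysis tools can be summarized in a small Python model. The class and method names are hypothetical and chosen for illustration; the model only tracks whether each thread is currently collecting data and says nothing about how any particular SDE tool is implemented.

```python
class AnalysisToolModel:
    """Tracks, per thread, whether data collection is currently enabled."""

    def __init__(self, thread_ids):
        self.collecting = {tid: False for tid in thread_ids}

    def on_event(self, event, tid, bcast=False):
        """Apply a 'start' or 'stop' event immediately on arrival.

        With bcast set, the event applies to every known thread;
        otherwise it applies only to the triggering thread.
        """
        targets = list(self.collecting) if bcast else [tid]
        for target in targets:
            self.collecting[target] = (event == "start")


tool = AnalysisToolModel([0, 1, 2])
tool.on_event("start", 1)              # only thread 1 starts collecting
tool.on_event("start", 0, bcast=True)  # all threads start collecting
tool.on_event("stop", 2)               # only thread 2 stops
print(tool.collecting)
```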
The tracing tool handles a global region. Meaning, at each given time, we are either in a region, and tracing all threads, or we are outside a region, and not tracing any thread.
Another special behavior of the tracing tool is the transition into and out of a region. The effect of the event arrival isn't immediate. Following is a description of what happens in the tracing tool upon the arrival of a start event:
- Once the thread which "caught" the event reaches the end of its current Basic-Block (BBL), it stops and calls all other threads to stop too.
- Each one of the threads will stop at the end of its current BBL.
- Once all threads are stopped, we change mode to "in-region", meaning we'll start generating the trace.
- Resume all threads
The same goes for stopping the trace generation.
Please note that the tracing tool ignores any event that arrives between steps 1 and 4. The controller behavior is orthogonal to the way the tool handles the events, so the controller will continue firing events based on the alarms defined by the user in the command line. The tracing tool ignores them if a previous event arrived and its processing hasn't been completed yet.
By default, SDE analysis tools follow the same behavior when they run together with the tracing tool. You can cancel this by adding '-pinplay-control 0', which keeps the per-thread behavior of the SDE analysis tools described above.
Here are some examples of command line usage for defining the alarms/events.
>sde -control start:address:foo,stop:address:bar -- <app>
start at symbol foo, stop at symbol bar
>sde -control start:icount:1000000,stop:icount:100000 -- <app>
Fire a 'start' event after 1M instructions, fire a 'stop' after an additional 100K instructions
>sde -skip 1000000 -length 100000 -- <app>
>sde -control precond:ssc:11223344,start:address:foo,stop:address:bar -- <app>
start at symbol foo, only after ssc mark 0x11223344, stop at symbol bar
(The precond means we don't fire any event, but 'arm' the address:foo alarm only after we reach ssc:11223344)
>sde -control start:address:foo,bcast,stop:address:bar,bcast -- <app>
start at symbol foo, fire an event with 'bcast' as the triggering thread, stop at symbol bar, fire the event with 'bcast' as the triggering thread.
* It's the tool's decision/responsibility to handle the 'bcast' differently than the default case.
>sde -control start:address:foo:tid0,stop:address:bar:tid0 -- <app>
start at symbol foo, stop at symbol bar but monitor only tid0
(The controller ignores the fact that tid1 reaches the symbol foo)
>sde -control precond:icount:200,uniform:800:500:5 -- <app>
start uniform sampling after 200 instructions.
equivalent to old usage:
>sde -uniform-skip 200 -uniform-period 500 -uniform-length 500 -uniform-count 5
>sde -control precond:ssc:11223344,uniform:800:500:5 -- <app>
start uniform sampling after ssc mark 0x11223344.
>sde -control precond:address:foo,precond:address:bar,repeat:2,name:c1 -control start:address:boo,waitfor:c1 -- <app>
start at symbol boo only after the sequence foo,bar,foo,bar
>sde -control start:enter_func:foo,stop:exit_func:foo -- <app>
start at beginning of function foo, stop when exiting from function foo
for this call sequence: A->B->foo->C->D
capturing functions foo,C,D
>sde -control start:enter_func:foo,stop:exit_func:bar -- <app>
for this call sequence: A->bar->foo->C->bar
start at beginning of function foo, stop when exiting the first bar function
* For the exit function event, we always refer to the topmost caller.
>sde -control start:enter_func:foo,stop:exit_func:foo -- <app>
for this call sequence: A->foo->foo->foo->foo->C
start at beginning of the first function foo, stop when exiting the first foo function
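The enter_func/exit_func examples above, including the recursive case, can be reproduced with a small Python model. It is a sketch under assumptions: the function names and the trace representation are hypothetical, and the rule that exit_func fires when the outermost active invocation of the function returns is an interpretation of the "topmost caller" note.

```python
def nested_calls(names):
    """Build a trace in which each function calls the next, then all return."""
    return [("call", n) for n in names] + [("ret", n) for n in reversed(names)]


def region_events(trace, start_func, stop_func):
    """Model start:enter_func:<start_func>,stop:exit_func:<stop_func>.

    trace: list of ('call' | 'ret', function_name) tuples.
    Returns the (event, trace_position) pairs that would be fired.
    """
    depth = {}       # active invocations per function name
    events = []
    started = False
    for position, (kind, name) in enumerate(trace):
        if kind == "call":
            depth[name] = depth.get(name, 0) + 1
            if not started and name == start_func:
                started = True
                events.append(("start", position))
        else:
            depth[name] -= 1
            # Fire only when the outermost invocation returns.
            if started and name == stop_func and depth[name] == 0:
                events.append(("stop", position))
                break
    return events


# A->foo->foo->foo->foo->C: start at the first foo, stop when it returns.
print(region_events(nested_calls(["A", "foo", "foo", "foo", "foo", "C"]), "foo", "foo"))
# A->bar->foo->C->bar: stop when the first (outermost) bar returns.
print(region_events(nested_calls(["A", "bar", "foo", "C", "bar"]), "foo", "bar"))
```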