DynamoRIO
Multi-Instrumentation Manager

The drmgr DynamoRIO Extension provides a mediator for combining and coordinating multiple instrumentation passes. It replaces certain DynamoRIO events and API routines with its own versions that mediate among multiple components, typically several libraries and one client, though it is also useful for splitting a client up into modules. drmgr facilitates developing instrumentation frameworks and libraries that can be composed and combined.

Setup

To use drmgr with your client, simply include this line in your client's CMakeLists.txt file:

use_DynamoRIO_extension(clientname drmgr)

That will automatically set up the include path and library dependencies.

The drmgr_init() function may be called multiple times; subsequent calls are no-ops that return true for success. This allows a library to use drmgr without coordinating with the client over who invokes drmgr_init().
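The repeated-initialization behavior can be pictured with a small stand-alone model. This is only a sketch of the idea, not drmgr's code: mgr_init(), lib_init(), client_init(), and mgr_setup_runs() are hypothetical names invented for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a shared manager; not drmgr's actual code. */
static bool initialized = false;
static int setup_runs = 0; /* how many times the one-time setup executed */

/* Safe to call from both a library and the client: only the first call
 * performs setup; later calls are no-ops that still report success. */
bool mgr_init(void)
{
    if (initialized)
        return true;
    setup_runs++; /* one-time setup would go here */
    initialized = true;
    return true;
}

/* Hypothetical library entry point that initializes the manager itself. */
bool lib_init(void)
{
    return mgr_init();
}

/* Hypothetical client entry point; it does not need to know whether the
 * library has already called mgr_init(). */
bool client_init(void)
{
    return lib_init() && mgr_init();
}

/* Reports how many times setup really ran, for checking the model. */
int mgr_setup_runs(void)
{
    return setup_runs;
}
```

In this model, calling client_init() triggers two calls to mgr_init(), yet the setup runs only once and both calls report success.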

Event Replacement

In order to provide ordering control over event callbacks, drmgr replaces a number of DynamoRIO's events. For many of these, simply replacing dr_ with drmgr_ is sufficient, as that will then use a default priority. To request a priority, use the _ex version of the drmgr_register_ routine. The basic block event is a special case as it is completely replaced with a new set of multiple events for different stages of instrumentation.

Instrumentation Stages

drmgr divides code changes into four types:

  1. Application-to-application transformations: changes to the application code itself that are meant to affect application behavior or application performance
  2. Instrumentation insertion: monitoring code added between the application instructions
  3. Instrumentation-to-instrumentation transformations: typically, optimizations applied to the full set of inserted instrumentation
  4. Meta-instrumentation transformations: typically, debugging of the full set of inserted instrumentation

Instrumentation insertion is split into two pieces: analysis of the full application code (after any changes from its original form), followed by insertion of instrumentation, one instruction at a time. The result is five separate, sequential stages:

  1. Application-to-application transformations
  2. Application code analysis
  3. Instrumentation insertion, one instruction at a time
  4. Instrumentation-to-instrumentation transformations
  5. Meta-instrumentation transformations

Each component that registers with drmgr can register for some or all of the five stages. In each stage, each registered component's callback is invoked. This groups the different types of changes together and allows them to assume that no later change will invalidate their analysis or actions. The instrumentation insertion is performed in one forward pass: for each instruction, each registered component is invoked. This simplifies register allocation, which is provided by a separate Extension, drreg.
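The invocation order described above can be sketched as a small stand-alone model. It does not use the DynamoRIO API: component_t, run_block(), and the stage tags are hypothetical names, and the model only records which callback would run when.

```c
#include <stddef.h>
#include <stdio.h>

enum {
    STAGE_APP2APP,
    STAGE_ANALYSIS,
    STAGE_INSERTION,
    STAGE_INSTRU2INSTRU,
    STAGE_META,
    STAGE_COUNT
};

static const char *const stage_tags[STAGE_COUNT] = {
    "app2app", "analysis", "insert", "instru2instru", "meta"
};

/* A component is a name plus one flag per stage it registered for. */
typedef struct {
    const char *name;
    int wants[STAGE_COUNT];
} component_t;

/* Appends "tag:name " or "tag:name@instr " to the log; returns the new length. */
static size_t log_call(char *log, size_t cap, size_t used, int stage,
                       const char *who, int instr)
{
    int n;
    if (used >= cap)
        return used;
    if (instr >= 0)
        n = snprintf(log + used, cap - used, "%s:%s@%d ", stage_tags[stage], who, instr);
    else
        n = snprintf(log + used, cap - used, "%s:%s ", stage_tags[stage], who);
    if (n < 0)
        return used;
    if ((size_t)n >= cap - used)
        return cap; /* output was truncated; stop logging */
    return used + (size_t)n;
}

/* Runs the five stages in order over a block of num_instrs instructions. */
void run_block(const component_t *comps, int num_comps, int num_instrs,
               char *log, size_t cap)
{
    size_t used = 0;
    if (cap == 0)
        return;
    log[0] = '\0';
    for (int stage = 0; stage < STAGE_COUNT; stage++) {
        if (stage == STAGE_INSERTION) {
            /* One forward pass: for each instruction, every component. */
            for (int i = 0; i < num_instrs; i++) {
                for (int c = 0; c < num_comps; c++) {
                    if (comps[c].wants[stage])
                        used = log_call(log, cap, used, stage, comps[c].name, i);
                }
            }
        } else {
            /* Whole-block stages: every registered component, once. */
            for (int c = 0; c < num_comps; c++) {
                if (comps[c].wants[stage])
                    used = log_call(log, cap, used, stage, comps[c].name, -1);
            }
        }
    }
}
```

With a component A registered for all five stages and a component B registered only for analysis and insertion, the log for a two-instruction block shows both components finishing analysis before either inserts anything, and A and B alternating per instruction during insertion.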

Emulation-Aware Instrumentation

drmgr's emulation marking and reading support allows one client library or component to emulate or refactor application code while another observes application behavior.

An emulation-creating library uses drmgr_insert_emulation_start() and drmgr_insert_emulation_end() to mark the emulation region and make a copy of the original instruction being replaced.

An emulation-aware observational client then wants to instrument the original copy, rather than the emulation sequence present in the instruction list. The client should use the drmgr_orig_app_instr_for_fetch() and drmgr_orig_app_instr_for_operands() routines inside the insertion event to determine the instruction to observe, ignoring the instruction passed to the insertion event except as a placeholder in the instruction list for where to place added instructions for instrumentation.

The refactoring of application code for simpler instrumentation performed by drutil_expand_rep_string() and drx_expand_scatter_gather() is also marked as emulation, but only for instructions, not data: i.e., DR_EMULATE_INSTR_ONLY is set for these refactorings. These are typically used in address tracing clients, and we recommend that such clients also use the emulation-aware instrumentation approach just outlined.

If expansions or true emulation is present and a client is not emulation-aware, it may not accurately observe the application's behavior. For example, an opcode recording client might record the opcodes for the expansion sequence emulating a scatter instruction, rather than the original scatter opcode.
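The opcode-recording example can be illustrated with a toy model of an instruction list that contains an emulation region. This is a sketch under simplifying assumptions and does not call the drmgr routines named above: entry_t, record_opcodes_aware(), and record_opcodes_naive() are hypothetical, and the start marker is assumed to carry the opcode of the original instruction.

```c
#include <stddef.h>

typedef enum { ENTRY_APP, ENTRY_EMUL_START, ENTRY_EMUL_END } entry_kind_t;

typedef struct {
    entry_kind_t kind;
    /* ENTRY_APP: opcode present in the list.
     * ENTRY_EMUL_START: opcode of the original instruction being emulated.
     * ENTRY_EMUL_END: unused. */
    int opcode;
} entry_t;

/* Emulation-aware recording: inside an emulation region, observe the saved
 * original once and skip the emulation sequence. Stores at most out_cap
 * opcodes and returns the number observed. */
size_t record_opcodes_aware(const entry_t *list, size_t n, int *out, size_t out_cap)
{
    size_t count = 0;
    int in_emulation = 0;
    for (size_t i = 0; i < n; i++) {
        if (list[i].kind == ENTRY_EMUL_END) {
            in_emulation = 0;
            continue;
        }
        if (list[i].kind == ENTRY_EMUL_START)
            in_emulation = 1; /* fall through to record the original */
        else if (in_emulation)
            continue; /* part of the emulation sequence: do not observe */
        if (count < out_cap)
            out[count] = list[i].opcode;
        count++;
    }
    return count;
}

/* Non-aware recording: observes whatever is in the list, including the
 * emulation sequence, and never sees the original instruction. */
size_t record_opcodes_naive(const entry_t *list, size_t n, int *out, size_t out_cap)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (list[i].kind != ENTRY_APP)
            continue;
        if (count < out_cap)
            out[count] = list[i].opcode;
        count++;
    }
    return count;
}
```

For a list holding opcode 10, then an emulation region for original opcode 99 expanded into 20, 21, 22, then opcode 11, the aware recorder reports 10, 99, 11 while the naive recorder reports 10, 20, 21, 22, 11.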

Ordering

The proper ordering of instrumentation passes depends on the particulars of what each pass is doing. drmgr supports naming each pass and specifying relative ordering by requesting that one pass occur before and/or after another named pass. Numeric priorities are also supported for resolving the order among passes with identical placement once named ordering is resolved.

Some ordering rules do apply. For example, function replacing should occur before most other application transformations. Ordering of instrumentation insertion and especially instrumentation-to-instrumentation transformations can be highly dependent on the exact transformations involved. Care should be taken when ordering passes within each stage.
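One way to picture named ordering with a numeric tie-breaker is the stand-alone model below. It is a sketch only and should not be read as drmgr's actual placement algorithm: pass_t, pass_list_t, and pass_register() are hypothetical, requests that name a pass not yet registered are simply ignored, and "lower priority runs earlier" is an assumption of the model.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_PASSES 16

typedef struct {
    const char *name;   /* unique name of this pass */
    const char *before; /* pass this one must precede, or NULL */
    const char *after;  /* pass this one must follow, or NULL */
    int priority;       /* tie-breaker; lower runs earlier (assumption) */
} pass_t;

typedef struct {
    pass_t passes[MAX_PASSES];
    size_t count;
} pass_list_t;

/* Returns the index of the named pass, or list->count if it is absent. */
static size_t find_pass(const pass_list_t *list, const char *name)
{
    size_t i;
    for (i = 0; i < list->count; i++) {
        if (strcmp(list->passes[i].name, name) == 0)
            break;
    }
    return i;
}

/* Inserts new_pass so that its before/after requests hold with respect to
 * already-registered passes; numeric priority picks the slot within the
 * allowed window. Returns false on a full list, a duplicate name, or
 * conflicting requests. */
bool pass_register(pass_list_t *list, pass_t new_pass)
{
    size_t lo = 0, hi = list->count, pos, i;
    if (list->count >= MAX_PASSES || new_pass.name == NULL)
        return false;
    if (find_pass(list, new_pass.name) < list->count)
        return false;
    if (new_pass.after != NULL) {
        i = find_pass(list, new_pass.after);
        if (i < list->count)
            lo = i + 1; /* must land after this pass */
    }
    if (new_pass.before != NULL) {
        i = find_pass(list, new_pass.before);
        if (i < list->count)
            hi = i; /* must land before this pass */
    }
    if (lo > hi)
        return false; /* the two requests cannot both be satisfied */
    pos = lo;
    while (pos < hi && list->passes[pos].priority <= new_pass.priority)
        pos++;
    memmove(&list->passes[pos + 1], &list->passes[pos],
            (list->count - pos) * sizeof(pass_t));
    list->passes[pos] = new_pass;
    list->count++;
    return true;
}
```

Registering A and B with equal priority, then C with a request to run before B, then D with a request to run after A and a lower priority value, yields the order A, D, C, B in this model. Existing passes are never reordered, so earlier requests stay satisfied.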

Traces

drmgr does not mediate trace instrumentation. Those interested in hot code should use the drmgr basic block events and act only when the for_trace parameter is set. Those wanting to optimize the longer code sequences in traces are on their own for register allocation, and must be careful to handle instrumentation that has already been added from the basic block events.

IT Blocks

To facilitate simple instrumentation of IT blocks, when in Thumb mode drmgr automatically adds the predicate of the application instruction being operated on in the instrumentation insertion stage to all meta instructions added by callbacks during that stage. Furthermore, drmgr automatically adds IT instructions after all stages are complete, to ensure that all conditional instructions are legal in Thumb mode.

Auto Predication

Most client instrumentation wants to be predicated to match the app instruction, so we do it by default. Clients may opt out by calling drmgr_disable_auto_predication() at the start of the insertion bb event. Clients may also control auto predication with finer granularity by directly calling instrlist_set_auto_predicate() and instrlist_get_auto_predicate().

Thread-Local and Callback-Local Storage

drmgr also coordinates sharing of the thread-local-storage field among multiple components and provides automated support for callback-private fields on Windows. It replaces the single dr_get_tls_field() pointer with two separate arrays of pointers: one for callback-shared fields, and one for callback-private fields. When a field is requested, an integer index is returned to the caller for use in retrieving the appropriate pointer.
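The index-based scheme can be sketched with a stand-alone model in which slot reservation is shared by all components while values are stored per thread. The types and functions here (field_registry_t, thread_fields_t, field_register(), field_set(), field_get()) are hypothetical and are not the drmgr interface.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_FIELDS 8

/* Shared by all components: which slot indices have been handed out. */
typedef struct {
    bool taken[MAX_FIELDS];
} field_registry_t;

/* One per thread: the pointer stored in each slot. */
typedef struct {
    void *value[MAX_FIELDS];
} thread_fields_t;

/* Reserves a free slot and returns its index, or -1 if none remain. */
int field_register(field_registry_t *reg)
{
    for (int i = 0; i < MAX_FIELDS; i++) {
        if (!reg->taken[i]) {
            reg->taken[i] = true;
            return i;
        }
    }
    return -1;
}

static bool field_valid(const field_registry_t *reg, int idx)
{
    return idx >= 0 && idx < MAX_FIELDS && reg->taken[idx];
}

/* Stores a pointer in the given thread's slot; fails for unreserved indices. */
bool field_set(const field_registry_t *reg, thread_fields_t *thread, int idx, void *v)
{
    if (!field_valid(reg, idx))
        return false;
    thread->value[idx] = v;
    return true;
}

/* Reads the given thread's slot; returns NULL for unreserved indices. */
void *field_get(const field_registry_t *reg, const thread_fields_t *thread, int idx)
{
    if (!field_valid(reg, idx))
        return NULL;
    return thread->value[idx];
}
```

Two components that each call field_register() receive different indices, so neither can overwrite the other's pointer, and a value stored for one thread is not visible through another thread's array.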

Callback-Local Storage

On Windows, events such as keypresses or mouse movements are delivered to applications as callbacks. These callbacks interrupt a thread's execution in order to handle the event. The interrupted context is saved and a new context entered. When the event handling is finished, the interrupted context is resumed. Callbacks can interrupt other callbacks, resulting in a stack of contexts.

When a tool maintains state across application execution, it must handle callback contexts. Thread-local storage (tls) is per-thread and is thus callback-shared. Callbacks interrupt thread execution to execute arbitrary amounts of code in a new context before returning to the interrupted context. Thread-local storage fields that persist across application execution can be overwritten during callback execution, resulting in incorrect values when returning to the original context. Callback-local storage, rather than thread-local storage, should be used for any fields that store information specific to the application's execution.

Callbacks are Windows-specific. The cls interfaces are not marked as Windows-only, however, to facilitate cross-platform code. We recommend that cross-platform code be written using cls fields on both platforms; the fields on Linux will never be stacked and will function as tls fields. Technically the same context interruption can occur with a Linux signal, but Linux signals typically execute small amounts of code and avoid making stateful changes; furthermore, there is no guaranteed end point to a signal. The drmgr_push_cls() and drmgr_pop_cls() interface can be used to provide a stack of contexts on Linux, or to provide a stack of contexts for any other purpose such as layered wrapped functions. These push and pop functions are automatically called on Windows callback entry and exit.
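The stacking behavior can be pictured with a stand-alone model of a context stack. It is a sketch and not drmgr's implementation: cls_stack_t and the cls_* functions are hypothetical, the depth is fixed, and a freshly pushed context is assumed to start with empty fields.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_CLS_FIELDS 4
#define MAX_DEPTH 8

typedef struct {
    void *field[MAX_DEPTH][NUM_CLS_FIELDS];
    int depth; /* index of the active context; 0 is the base context */
} cls_stack_t;

/* Starts with only the base context, all fields empty. */
void cls_init(cls_stack_t *s)
{
    s->depth = 0;
    for (int i = 0; i < NUM_CLS_FIELDS; i++)
        s->field[0][i] = NULL;
}

/* Enters a new context (e.g. on callback entry) with empty fields. */
bool cls_push(cls_stack_t *s)
{
    if (s->depth + 1 >= MAX_DEPTH)
        return false;
    s->depth++;
    for (int i = 0; i < NUM_CLS_FIELDS; i++)
        s->field[s->depth][i] = NULL;
    return true;
}

/* Leaves the active context and resumes the interrupted one. */
bool cls_pop(cls_stack_t *s)
{
    if (s->depth == 0)
        return false;
    s->depth--;
    return true;
}

/* Stores a pointer in a field of the active context. */
bool cls_set(cls_stack_t *s, int idx, void *v)
{
    if (idx < 0 || idx >= NUM_CLS_FIELDS)
        return false;
    s->field[s->depth][idx] = v;
    return true;
}

/* Reads a field of the active context; NULL for an invalid index. */
void *cls_get(const cls_stack_t *s, int idx)
{
    if (idx < 0 || idx >= NUM_CLS_FIELDS)
        return NULL;
    return s->field[s->depth][idx];
}
```

A value stored before cls_push() is hidden while the pushed context is active and reappears unchanged after cls_pop(), which is the property that protects per-context state from being overwritten during a callback.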

Instruction Note Fields

Instrumentation passes often need to mark instructions with information for later passes. One method of doing this is to use the note field built into the instr_t type. For example, labels can be inserted with their note fields corresponding to pre-defined constants to indicate insertion points. To keep these note constants from overlapping and conflicting among different components or passes, drmgr provides mediation of the namespace.

drmgr does not make use of its note mediation mandatory and does not override instr_set_note() or instr_get_note(). Instrumentation passes should feel free to use pointer values in the note field. The note constant value mediation is intended for small constants that will not be confused with pointer values.
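Namespace mediation for small note constants can be sketched as a simple range allocator. This stand-alone model is illustrative only: note_reserve_range(), NOTE_NONE, and NOTE_LIMIT are hypothetical names and values, chosen so that the handed-out constants stay far below any plausible pointer value.

```c
#include <stddef.h>
#include <stdint.h>

#define NOTE_NONE 0u       /* hypothetical "no value" marker */
#define NOTE_LIMIT 0x1000u /* hypothetical cap keeping values small */

static uintptr_t next_note = 1; /* first value that will be handed out */

/* Reserves `size` consecutive note values and returns the first one, or
 * NOTE_NONE if size is zero or the range would exceed NOTE_LIMIT. */
uintptr_t note_reserve_range(size_t size)
{
    uintptr_t start;
    if (size == 0 || size > NOTE_LIMIT - next_note)
        return NOTE_NONE;
    start = next_note;
    next_note += size;
    return start;
}
```

A component that needs three distinct label kinds would reserve a range of three and use start, start + 1, and start + 2 as its note constants; a second component's reservation begins past that range, so the two sets cannot collide.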