DynamoRIO
Client Transparency

DynamoRIO must avoid interfering with the semantics of a program while it executes. Shifting execution from the original application code into a cache that occupies the application's own address space provides flexibility but complicates transparency. To achieve transparency, DynamoRIO cannot make any assumptions about a program's stack usage, heap usage, or any of its dependencies on the instruction set architecture or operating system. DynamoRIO's transparency restrictions necessarily apply to a client as well.

Contents:

We first describe each of the three transparency categories for DynamoRIO and their ramifications for clients using the code cache.

Resource Usage Conflicts

Ideally, DynamoRIO and its client's resources should be completely disjoint from the application's, to avoid conflicts in the usage of libraries, heap, input/output, and locks. Library conflicts are the most relevant to a client.

Library Transparency

Sharing libraries with the application can cause problems with re-entrancy and corruption of persistent state like error codes. DynamoRIO's dispatch code can execute at arbitrary points in the middle of application code; a client's instrumentation is similarly executed. If both the application and DynamoRIO use the same non-re-entrant library routine, DynamoRIO might call the routine while the application is inside it, causing incorrect behavior. We have learned this lesson the hard way, having run into it several times. The solution is for DynamoRIO's external resources to come only from system calls and never from user libraries. This is straightforward to accomplish on Linux, and most operating systems, where the system call interface is a standard mechanism for requesting services (a in the figure below). However, on Windows, the documented method of interacting with the operating system is not via system calls but instead through an application programming interface (the {Win32 API}) built with user libraries on top of the system call interface (b in the figure). If DynamoRIO uses this interface, re-entrancy and other resource usage conflicts can, and will, occur. To achieve full transparency on Windows, the system call interface (c in the figure) must be used, rather than the API layer.

DynamoRIO provides access to resources to clients via a cross-platform API. We do not recommend that a client invoke its own system calls as this bypasses DynamoRIO's monitoring of changes to the process address space and changes to threads or control flow.

DynamoRIO does have its own loader and through it support is provided to clients to use external libraries while isolating them from the libraries used by the application: see Using External Libraries. This support is limited, however, and there are libraries that are not supported today.

In addition to Library Transparency, several other types of transparency are of concern to a client:

Heap Transparency

Sharing heap allocation routines with the application violates Library Transparency. Most heap allocation routines are not re-entrant (they are thread-safe, but not re-entrant). Additionally, DynamoRIO should not interfere with the data layout of the application (Data Transparency: see below) or with application memory bugs (Error Transparency: see below). DynamoRIO obtains its memory directly from system calls and parcels it out internally with a custom memory manager. It exports its heap allocation routines through its API to ensure that clients maintain Heap Transparency (see Common Utilities). Calls to malloc in external libraries used by clients are also redirected to DynamoRIO's internal heap.

Input/Output Transparency

DynamoRIO uses its own input/output routines to avoid interfering with the application's buffering. As with heap transparency, DynamoRIO exports its input/output routines to clients to ensure that transparency is not violated (see Common Utilities).

Synchronization Transparency

DynamoRIO and its clients must avoid acquiring locks that the application also acquires, such as the LoaderLock on Windows. Additionally, there are restrictions imposed by DynamoRIO on when its own locks (including client locks) can be acquired, to allow it to safely synchronize with multiple threads: no lock should be held while application code is executing in the code cache. Locks can be used while inside client code reached from clean calls out of the code cache, but they must be released before returning to the cache. Failing to follow these restrictions can lead to deadlocks.

Leaving the Application Unchanged When Possible

As many aspects of the application as possible should be left unchanged:

X87 Floating Point State Transparency

Because it is expensive to do so and rarely necessary, on x86 DynamoRIO does NOT save or restore the x87 floating point state or MMX (64-bit) registers during a context switch away from the application. The larger multimedia registers (XMM, YMM, ZMM, whether holding floating-point or integer values) are preserved, but if at any time a client wishes to use x87 floating point or MMX operations, it must explicitly preserve the state. The dr_insert_clean_call() routine takes a boolean indicating whether x87 floating point state should be preserved across the call; this is the most convenient method for saving the state.

The state can alternatively be saved explicitly from C code using:

Saving can be done from inlined code in the code cache using:

These routines require a buffer that is 16-byte-aligned and of a certain size (512 bytes for processors with the FXSR feature, and 108 bytes for those without). Here is a sample usage:

byte fp_raw[512 + 16];
byte *fp_align = (byte *) ( (((ptr_uint_t)fp_raw) + 16) & ((ptr_uint_t)-16) );

Note that floating point operations include almost any operation that acts on a float, even printing one with %f, on older compilers. Modern compilers typically do not use x87 operations.

The XMM (128-bit) and larger registers are saved by DynamoRIO on context switches and clean calls. See also the dr_mcontext_xmm_fields_valid() routine.

On ARM/AArch64, DynamoRIO saves and restores all the registers, including the SIMD/FP registers, during a context switch away from the application. The functions proc_save_fpstate and proc_restore_fpstate are provided, but do nothing.

Thread Transparency

For full transparency, if DynamoRIO or a client creates extra threads they should be hidden from any introspection performed by the application.

Executable Transparency

The program binary and shared library files on disk should not be modified.

Data Transparency

DynamoRIO leaves application data unmodified, including heap layout.

Stack Transparency

The application stack must look exactly like it does natively. It is tempting to use the application stack for scratch space, but we have seen applications like Microsoft Office access data beyond the top of the stack (i.e., the application stores data on the top of the stack, moves the stack pointer to the previous location, and then accesses the data). Using the application stack for scratch space would clobber such data. Additionally, hand-crafted code might use the stack pointer as a general-purpose register. Other and better options for temporary space are available. DynamoRIO provides thread-local storage through its API.

DynamoRIO provides a separate stack to use for itself and for the client, and never assumes even that the application stack is valid. Many applications examine their stack and may not work properly if something is slightly different than expected.

Pretending The Application Is Unchanged When It Is Not

For changes that are necessary (such as executing out of a code cache), DynamoRIO must warp events like interrupts, signals, and exceptions such that they appear to have occurred natively.

Cache Consistency

DynamoRIO must keep its cached copies of the application code consistent with the actual copy in memory. If the application unloads a shared library and loads a new one in its place, or modifies its own code, DynamoRIO must change its code cache to reflect those changes to avoid incorrectly executing stale code. If the client needs to modify application code, it should do so through the basic block event, rather than directly.

Address Space Transparency

DynamoRIO must pretend that it is not perturbing the application's address space. An application bug that writes to invalid memory and generates an exception should do the same thing under DynamoRIO, even if we have allocated memory at that location that would natively have been invalid. This requires protecting all DynamoRIO memory from inadvertent (or malicious) writes by the application. Furthermore, DynamoRIO hides itself from introspection by manipulating memory queries.

Application Address Transparency

Although the application's code is moved into a cache, every address manipulated by the application must remain an original application address. DynamoRIO must translate indirect branch targets from application addresses to code cache addresses, and conversely if a code cache address is ever exposed to the application, DynamoRIO must translate it back to its original application address. The latter occurs when the operating system hands a machine context to a signal or exception handler. In that case both the faulting or interrupted address and the complete register state must be made to look like the signal or exception occurred natively, rather than inside the code cache where it actually occurred.

To save space, DynamoRIO does not store any mappings from code cache state to application state. Since our cache consistency guarantees that the original application code cannot have changed since we built a fragment, we re-create the fragment from the original code, applying all the same transformations we applied when we first copied it into our code cache. We then walk through the reproduction and the code cache version in lockstep until we reach the target instruction. In order to accomplish this algorithm in code that has been modified or re-arranged by a client, DynamoRIO needs client support. We have not yet implemented this support.

Error Transparency

Application errors under DynamoRIO must occur as they would natively. We accomplish this by maintaining Heap Transparency, Data Transparency, Stack Transparency, and Address Space Transparency, We have seen many cases of applications that access invalid memory natively, handle the exception, and carry on. Without Error Transparency such applications would not work properly under DynamoRIO.

Timing Transparency

We would like to make it impossible for the application to determine whether it is executing inside of DynamoRIO. However, this may not be attainable for some aspects of execution, such as the exact timing of certain operations. This brings efficiency into the transparency equation.

Changing the timing of multi-threaded applications can uncover behavior that does not normally happen natively. We have encountered race conditions while executing under DynamoRIO that are difficult to reproduce outside of our system. These are not strictly speaking transparency violations, as the errors could have occurred without us, but are best avoided.

Debugging Transparency

A debugger should be able to attach to a process under DynamoRIO's control just like it would natively. Previously discussed transparency issues overlap with Debugging Transparency. Stack Transparency and Data Transparency ensure that callstacks and application memory show up correctly. However, DynamoRIO can't entirely hide the fact that the application is running out of a code cache from the debugger. The debugger will see the wrong eip value when breaking in and execution break points will not work correctly (and in fact can lead to corruption of the DynamoRIO code cache). Debugging Tools for Windows and gdb are known to work with DynamoRIO with the exception of the issues noted above.

Alertable System Calls

On Windows, DynamoRIO does not support a client making alertable system calls. See Avoid Alertable System Calls for more information.

DR_API void dr_insert_save_fpstate(void *drcontext, instrlist_t *ilist, instr_t *where, opnd_t buf)
DR_API void proc_restore_fpstate(byte *buf)
DR_API size_t proc_save_fpstate(byte *buf)
DR_API void dr_insert_restore_fpstate(void *drcontext, instrlist_t *ilist, instr_t *where, opnd_t buf)