DynamoRIO
Creating New Analysis Tools

drcachesim provides a drmemtrace analysis tool framework to make it easy to create new trace analysis tools. A new tool should subclass analysis_tool_t.

Concurrent processing of traces is supported by logically splitting a trace into "shards", each of which is processed sequentially. The default shard is a traced application thread, but the tool interface can support other divisions. Tools that support concurrent processing of shards and do not need to see a single time-sorted, interleaved, merged trace should override the interface functions with the parallel_ prefix and return true from parallel_shard_supported(). parallel_shard_init_stream() is invoked for each shard before any parallel_shard_memref() calls for that shard; the data structure it returns is then passed to parallel_shard_memref() for each trace entry in that shard. The concurrency model guarantees that all entries from any one shard are processed by the same single worker thread, so no synchronization is needed inside the parallel_ functions. A single worker thread invokes print_results() as well.
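As an illustration, here is a minimal sketch of a shard-parallel tool that counts instruction fetches, based on the interface described above. The class name, counter layout, and include paths are illustrative assumptions, and namespace qualification (dynamorio::drmemtrace in recent versions) is omitted for brevity:

#include <atomic>
#include <cstdint>
#include <iostream>

#include "drmemtrace/analysis_tool.h"
#include "drmemtrace/memref.h"

// Hypothetical example tool: counts instruction fetches across shards.
class instr_count_tool_t : public analysis_tool_t {
public:
    bool
    parallel_shard_supported() override
    {
        return true;
    }
    void *
    parallel_shard_init_stream(int shard_index, void *worker_data,
                               memtrace_stream_t *shard_stream) override
    {
        // Per-shard state: no locking is needed, as every entry of a shard
        // is processed by the same single worker thread.
        return new uint64_t(0);
    }
    bool
    parallel_shard_memref(void *shard_data, const memref_t &memref) override
    {
        if (type_is_instr(memref.instr.type))
            ++(*static_cast<uint64_t *>(shard_data));
        return true;
    }
    bool
    parallel_shard_exit(void *shard_data) override
    {
        // Different shards exit on different worker threads: aggregate atomically.
        total_ += *static_cast<uint64_t *>(shard_data);
        delete static_cast<uint64_t *>(shard_data);
        return true;
    }
    bool
    process_memref(const memref_t &memref) override
    {
        // Serial fallback; see below for the typical way to combine modes.
        if (type_is_instr(memref.instr.type))
            ++total_;
        return true;
    }
    bool
    print_results() override
    {
        // Called just once, by a single worker thread, after all shards finish.
        std::cerr << "Total instruction fetches: " << total_.load() << "\n";
        return true;
    }

private:
    std::atomic<uint64_t> total_{0};
};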

For serial operation, process_memref() operates on one trace entry at a time from a single, sorted, interleaved stream of trace entries. In the default mode of operation, the analyzer_t class iterates over the trace and calls the process_memref() function of each tool. A tool that wants its own control over advancing the trace can instead build on the scheduler described below, which replaced the framework's old "external iterator" mode.

Both parallel and serial operation can be supported by a tool, typically by having process_memref() create shard data when it first sees a new traced thread and then invoke parallel_shard_memref() to do the actual work, as sketched below.
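A minimal sketch of that pattern, assuming hypothetical members shard_map_ (e.g., a std::unordered_map<memref_tid_t, void *> from thread id to shard data) and serial_stream_ (saved by initialize_stream()):

bool
process_memref(const memref_t &memref) override
{
    // Serial mode: lazily create shard data the first time a traced thread
    // is seen, then delegate to the parallel hook for the real work.
    void *shard;
    auto it = shard_map_.find(memref.data.tid);
    if (it == shard_map_.end()) {
        shard = parallel_shard_init_stream(static_cast<int>(shard_map_.size()),
                                           /*worker_data=*/nullptr, serial_stream_);
        shard_map_[memref.data.tid] = shard;
    } else
        shard = it->second;
    return parallel_shard_memref(shard, memref);
}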

For both parallel and serial operation, the function print_results() should be overridden. It is called just once, after all trace data has been processed, and should present the results of the analysis. For parallel operation, any desired aggregation across the whole trace should occur here as well, while shard-specific results can be presented in parallel_shard_exit().

Today, parallel analysis is only supported for offline traces. Support for online traces may be added in the future.

As explained in Trace Format, each trace entry is of type memref_t and represents one instruction or data reference or a metadata operation such as a thread exit or marker. There are built-in scheduling markers providing the timestamp and cpu identifier on each thread transition. Other built-in markers indicate disruptions in user mode control flow such as signal handler entry and exit.
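For example, a tool can react to these markers inside its memref-processing function. This fragment assumes hypothetical members last_timestamp_ and last_cpu_ for storing the most recent values:

// Fragment of process_memref() or parallel_shard_memref():
if (memref.marker.type == TRACE_TYPE_MARKER) {
    if (memref.marker.marker_type == TRACE_MARKER_TYPE_TIMESTAMP) {
        // Timestamp applying to the subsequent records of this thread.
        last_timestamp_ = memref.marker.marker_value;
    } else if (memref.marker.marker_type == TRACE_MARKER_TYPE_CPU_ID) {
        // The cpu on which the subsequent records were collected.
        last_cpu_ = memref.marker.marker_value;
    }
}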

The absolute ordinals for trace records and instruction fetches are available via the memtrace_stream_t interface passed to the initialize_stream() function for serial operation and parallel_shard_init_stream() for parallel operation. If the iterator skips over some records that are not passed to the tools, these ordinals will include those skipped records. If a tool wishes to count only those records or instructions that it sees, it can add its own counters.
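For instance, a serial tool can save the stream pointer at initialization and query the ordinals while processing (serial_stream_ is an illustrative member name):

std::string
initialize_stream(memtrace_stream_t *serial_stream) override
{
    // Save the stream interface to query absolute ordinals later.
    serial_stream_ = serial_stream;
    return ""; // An empty string means success.
}

bool
process_memref(const memref_t &memref) override
{
    // These ordinals include any records skipped by the iterator.
    uint64_t record_ord = serial_stream_->get_record_ordinal();
    uint64_t instr_ord = serial_stream_->get_instruction_ordinal();
    // ... analysis using the ordinals ...
    return true;
}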

In some cases, a tool may want to observe the exact sequence of trace_entry_t in an offline trace stored on disk. To support such use cases, the trace_entry_t specialization of analysis_tool_tmpl_t and analyzer_tmpl_t can be used. Specifically, such tools should subclass record_analysis_tool_t, and use the record_analyzer_t class.
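A record-granularity tool has the same overall shape as a memref_t tool. Here is a minimal sketch, with a hypothetical tool name and illustrative include paths:

#include <cstdint>
#include <iostream>

#include "drmemtrace/analysis_tool.h"
#include "drmemtrace/trace_entry.h"

// Hypothetical tool observing the raw trace_entry_t records.
class entry_count_tool_t : public record_analysis_tool_t {
public:
    bool
    process_memref(const trace_entry_t &entry) override
    {
        // trace_entry_t exposes the raw type/size/addr fields as stored on disk.
        if (entry.type == TRACE_TYPE_INSTR)
            ++instr_entries_;
        return true;
    }
    bool
    print_results() override
    {
        std::cerr << "Instruction fetch entries: " << instr_entries_ << "\n";
        return true;
    }

private:
    uint64_t instr_entries_ = 0;
};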

CMake support is provided for including the headers and linking the libraries of the drmemtrace framework. A new CMake function is defined in the DynamoRIO package which sets the include directory for using the drmemtrace/ headers:

use_DynamoRIO_drmemtrace(mytool)

The drmemtrace_analyzer library exported by the DynamoRIO package is the main library to link when building a new tool. The tools described above are also exported as the libraries drmemtrace_basic_counts, drmemtrace_view, drmemtrace_opcode_mix, drmemtrace_histogram, drmemtrace_reuse_distance, drmemtrace_reuse_time, drmemtrace_simulator, and drmemtrace_func_view and can be created using the basic_counts_tool_create(), opcode_mix_tool_create(), histogram_tool_create(), reuse_distance_tool_create(), reuse_time_tool_create(), view_tool_create(), cache_simulator_create(), tlb_simulator_create(), and func_view_create() functions.
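Putting the pieces together, a minimal CMakeLists.txt for an external tool might look like the following sketch (the project layout, target names, and minimum CMake version are illustrative):

cmake_minimum_required(VERSION 3.7)
project(mytool)

# Locate an installed or built DynamoRIO package.
find_package(DynamoRIO REQUIRED)

add_executable(mytool mytool.cpp)
# Sets the include directory for the drmemtrace/ headers.
use_DynamoRIO_drmemtrace(mytool)
# Main library to link when building a new tool.
target_link_libraries(mytool drmemtrace_analyzer)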

Scheduler

In addition to the analysis tool framework, which targets running multiple tools at once either in parallel across all traced threads or in a serial fashion, we provide a scheduler which will map inputs to a given set of outputs in a specified manner. This allows a tool such as a core simulator, or just a tool wanting its own control over advancing the trace stream (unlike the analysis tool framework where the framework controls the iteration), to request the next trace record for each output on its own.

Here is a simple example of a single-output, serial stream. This also serves as an example of how to replace the analysis tool framework's old "external iterator" interface, which has been removed:

scheduler_t scheduler;
std::vector<scheduler_t::input_workload_t> sched_inputs;
sched_inputs.emplace_back(trace_directory);
if (scheduler.init(sched_inputs, 1, scheduler_t::make_scheduler_serial_options()) !=
    scheduler_t::STATUS_SUCCESS) {
    FATAL_ERROR("failed to initialize scheduler: %s",
                scheduler.get_error_string().c_str());
}
auto *stream = scheduler.get_stream(0);
memref_t record;
for (scheduler_t::stream_status_t status = stream->next_record(record);
     status != scheduler_t::STATUS_EOF; status = stream->next_record(record)) {
    if (status != scheduler_t::STATUS_OK)
        FATAL_ERROR("scheduler failed to advance: %d", status);
    if (!my_tool->process_memref(record)) {
        FATAL_ERROR("tool failed to process entire trace: %s",
                    my_tool->get_error_string().c_str());
    }
}