drcachesim provides a drmemtrace analysis tool framework to make it easy to create new trace analysis tools. A new tool should subclass analysis_tool_t.
Concurrent processing of traces is supported by logically splitting a trace into "shards" which are each processed sequentially. The default shard is a traced application thread, but the tool interface can support other divisions. For tools that support concurrent processing of shards and do not need to see a single time-sorted interleaved merged trace, the interface functions with the parallel_ prefix should be overridden, and parallel_shard_supported() should return true. parallel_shard_init() will be invoked for each shard prior to invoking parallel_shard_memref() for each entry in that shard; the data structure returned from parallel_shard_init() will be passed to parallel_shard_memref() for each trace entry for that shard. The concurrency model used guarantees that all entries from any one shard are processed by the same single worker thread, so no synchronization is needed inside the parallel_ functions. A single worker thread invokes print_results() as well.
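The shard lifecycle described above can be sketched as follows. This is a minimal, self-contained illustration and does not use the real framework headers: toy_memref_t, shard_counts_t, toy_instr_counter_t, and run_shard() are hypothetical stand-ins, and the parallel_ function signatures are simplified assumptions that only mirror the names in the text.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical, simplified stand-in for a trace entry (not the real memref_t).
struct toy_memref_t {
    int64_t tid;   // traced thread that produced this entry
    bool is_instr; // true for an instruction fetch, false for anything else
};

// Per-shard state: only the worker that owns the shard ever touches it.
struct shard_counts_t {
    int shard_index = 0;
    uint64_t instrs = 0;
    bool exited = false;
};

// Toy tool that reuses the parallel_ function names from the text.
class toy_instr_counter_t {
public:
    bool parallel_shard_supported() { return true; }

    // Called once per shard; the returned pointer is handed back on every entry.
    void *parallel_shard_init(int shard_index)
    {
        auto shard = std::make_unique<shard_counts_t>();
        shard->shard_index = shard_index;
        void *raw = shard.get();
        // The container is shared by all shards, so this sketch guards it.
        // The per-shard data itself needs no lock.
        std::lock_guard<std::mutex> guard(shards_mutex_);
        shards_.push_back(std::move(shard));
        return raw;
    }

    // Called for each entry in the shard, always with that shard's data.
    bool parallel_shard_memref(void *shard_data, const toy_memref_t &memref)
    {
        auto *shard = static_cast<shard_counts_t *>(shard_data);
        if (memref.is_instr)
            ++shard->instrs;
        return true;
    }

    // Called once after the last entry of the shard.
    bool parallel_shard_exit(void *shard_data)
    {
        static_cast<shard_counts_t *>(shard_data)->exited = true;
        return true;
    }

    const shard_counts_t &shard(std::size_t i) const { return *shards_[i]; }

private:
    std::mutex shards_mutex_;
    std::vector<std::unique_ptr<shard_counts_t>> shards_;
};

// Hypothetical driver showing what one worker does for one shard.
bool run_shard(toy_instr_counter_t &tool, int shard_index,
               const std::vector<toy_memref_t> &entries)
{
    void *shard_data = tool.parallel_shard_init(shard_index);
    for (const toy_memref_t &entry : entries) {
        if (!tool.parallel_shard_memref(shard_data, entry))
            return false;
    }
    return tool.parallel_shard_exit(shard_data);
}
```

In this sketch the only lock protects the container that records the shards; the counters are updated without synchronization because each shard's data is visited by a single worker.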
For serial operation, process_memref() operates on a trace entry in a single, sorted, interleaved stream of trace entries. In the default mode of operation, the analyzer_t class iterates over the trace and calls the process_memref() function of each tool. An alternative mode is supported which exposes the iterator and allows a separate control infrastructure to be built. This alternative mode does not support parallel operation at this time.
A tool can support both parallel and serial operation, typically by having process_memref() create per-shard data the first time it sees a traced thread and then invoke parallel_shard_memref() to do the actual work.
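One way to picture this delegation is the sketch below. It assumes a hypothetical, simplified entry type (toy_memref_t) and tool class (toy_dual_mode_counter_t) in place of the real framework types; only the function names process_memref() and parallel_shard_memref() come from the text, and their signatures here are assumptions.

```cpp
#include <cstdint>
#include <memory>
#include <unordered_map>

// Hypothetical, simplified stand-in for a trace entry (not the real memref_t).
struct toy_memref_t {
    int64_t tid;   // traced thread that produced this entry
    bool is_instr; // true for an instruction fetch
};

// State kept per traced thread.
struct shard_counts_t {
    uint64_t instrs = 0;
};

class toy_dual_mode_counter_t {
public:
    // Parallel path: all of the real work lives here.
    bool parallel_shard_memref(void *shard_data, const toy_memref_t &memref)
    {
        auto *shard = static_cast<shard_counts_t *>(shard_data);
        if (memref.is_instr)
            ++shard->instrs;
        return true;
    }

    // Serial path: find or create the data for the entry's thread, then
    // delegate to the parallel function so the logic is written only once.
    bool process_memref(const toy_memref_t &memref)
    {
        std::unique_ptr<shard_counts_t> &slot = serial_shards_[memref.tid];
        if (!slot)
            slot = std::make_unique<shard_counts_t>();
        return parallel_shard_memref(slot.get(), memref);
    }

    // Helper for inspecting the result (hypothetical, for illustration only).
    uint64_t instrs_for(int64_t tid) const
    {
        auto it = serial_shards_.find(tid);
        return it == serial_shards_.end() ? 0 : it->second->instrs;
    }

private:
    std::unordered_map<int64_t, std::unique_ptr<shard_counts_t>> serial_shards_;
};
```

Keeping the counting logic in parallel_shard_memref() means the serial and parallel modes cannot drift apart; the serial path adds only the lookup by thread.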
For both parallel and serial operation, the function print_results() should be overridden. It is called just once after processing all trace data and it should present the results of the analysis. For parallel operation, any desired aggregation across the whole trace should occur here as well, while shard-specific results can be presented in parallel_shard_exit().
Today, parallel analysis is only supported for offline traces. Support for online traces may be added in the future.
In the default mode of operation, the analyzer_t class iterates over the trace and calls the appropriate analysis_tool_t functions for each tool. An alternative mode is supported which exposes the iterator and allows a separate control infrastructure to be built.
As explained in Trace Format, each trace entry is of type memref_t and represents one instruction or data reference or a metadata operation such as a thread exit or marker. There are built-in scheduling markers providing the timestamp and cpu identifier on each thread transition. Other built-in markers indicate disruptions in user mode control flow such as signal handler entry and exit.
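To illustrate how a tool might distinguish these kinds of entries, the sketch below walks one thread's entries and reacts to each kind. The toy_entry_kind_t enumeration and the toy_entry_t and thread_summary_t structures are hypothetical simplifications; the real memref_t layout and marker type names are documented in Trace Format and are not reproduced here.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical entry kinds loosely modeled on the categories in the text.
enum class toy_entry_kind_t {
    INSTR,            // instruction fetch
    DATA,             // data reference
    MARKER_TIMESTAMP, // scheduling marker carrying a timestamp
    MARKER_CPU_ID,    // scheduling marker carrying a cpu identifier
    MARKER_SIGNAL,    // disruption in user-mode control flow
    THREAD_EXIT       // metadata: the thread has exited
};

// Hypothetical entry: 'value' holds an address, timestamp, or cpu id.
struct toy_entry_t {
    toy_entry_kind_t kind;
    uint64_t value;
};

struct thread_summary_t {
    uint64_t instrs;
    uint64_t data_refs;
    uint64_t cpu_changes;   // times the cpu id differed from the previous one
    uint64_t signal_events;
    uint64_t last_timestamp;
    bool exited;
};

// Walks the entries of a single thread and tallies each kind.
thread_summary_t summarize_thread(const std::vector<toy_entry_t> &entries)
{
    thread_summary_t summary = {};
    bool have_cpu = false;
    uint64_t last_cpu = 0;
    for (const toy_entry_t &entry : entries) {
        switch (entry.kind) {
        case toy_entry_kind_t::INSTR: ++summary.instrs; break;
        case toy_entry_kind_t::DATA: ++summary.data_refs; break;
        case toy_entry_kind_t::MARKER_TIMESTAMP:
            summary.last_timestamp = entry.value;
            break;
        case toy_entry_kind_t::MARKER_CPU_ID:
            if (have_cpu && entry.value != last_cpu)
                ++summary.cpu_changes;
            last_cpu = entry.value;
            have_cpu = true;
            break;
        case toy_entry_kind_t::MARKER_SIGNAL: ++summary.signal_events; break;
        case toy_entry_kind_t::THREAD_EXIT: summary.exited = true; break;
        }
    }
    return summary;
}
```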
CMake support is provided for including the headers and linking the libraries of the drmemtrace framework. A CMake function defined in the DynamoRIO package sets the include directory for using the drmemtrace headers.
The drmemtrace_analyzer library exported by the DynamoRIO package is the main library to link when building a new tool. The tools described above are also exported as libraries, including drmemtrace_func_view, and can be created using the basic_counts_tool_create(), opcode_mix_tool_create(), histogram_tool_create(), reuse_distance_tool_create(), reuse_time_tool_create(), view_tool_create(), cache_simulator_create(), tlb_simulator_create(), and func_view_create() functions.