DynamoRIO
Simulator Parameters

drcachesim's behavior can be controlled through options passed after -t drcachesim but before the "--" delimiter on the command line:

$ bin64/drrun -t drcachesim <options> <to> <drcachesim> -- /path/to/target/app <args> <for> <app>

Boolean options can be disabled using a "-no_" prefix.
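
The placement rules above can be sketched as a small Python helper that assembles the argument list. The helper name build_drrun_argv and its dict-based option encoding are assumptions made for illustration and are not part of DynamoRIO; only the "bin64/drrun -t drcachesim <options> -- <app>" layout and the "-no_" prefix come from the text.

```python
def build_drrun_argv(app_argv, options=None, drrun="bin64/drrun"):
    """Assemble a drrun command line with drcachesim options before "--".

    options maps option names (without the leading "-") to values:
    True/False for boolean options; any other value is passed as a string.
    This encoding is a convention of this sketch only.
    """
    argv = [drrun, "-t", "drcachesim"]
    for name, value in (options or {}).items():
        if value is True:
            argv.append(f"-{name}")
        elif value is False:
            # Boolean options are disabled with a "-no_" prefix.
            argv.append(f"-no_{name}")
        else:
            argv.extend([f"-{name}", str(value)])
    # Everything after "--" belongs to the target application.
    argv.append("--")
    argv.extend(app_argv)
    return argv


if __name__ == "__main__":
    print(" ".join(build_drrun_argv(
        ["/path/to/target/app"],
        {"offline": True, "split_windows": False, "cores": 2},
    )))
```

Building a list rather than a single string avoids shell-quoting problems if the result is later handed to something like subprocess.run.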

The parameters available are described below:

  • -offline
    default value: false
    By default, traces are processed online, sent over a pipe to a simulator. If this option is enabled, trace data is instead written to files in -outdir for later offline analysis. No simulator is executed.
  • -ipc_name
    default value: drcachesimpipe
    For online tracing and simulation (the default, unless -offline is requested), specifies the name of the named pipe used to communicate between the target application processes and the caching device simulator. On Linux this can include an absolute path (if it doesn't, a default temp directory will be used). A unique name must be chosen for each instance of the simulator being run at any one time. On Windows, the name is limited to 247 characters.
  • -outdir
    default value: .
    For the offline analysis mode (when -offline is requested), specifies the path to a directory where per-thread trace files will be written.
  • -subdir_prefix
    default value: drmemtrace
    For the offline analysis mode (when -offline is requested), specifies the prefix for the name of the sub-directory where per-thread trace files will be written. The sub-directory is created inside -outdir and has the form 'prefix.app-name.pid.id.dir'.
  • -indir
    default value: ""
    After a trace file is produced via -offline into -outdir, it can be passed to the simulator via this flag pointing at the subdirectory created in -outdir. The -offline tracing produces raw data files which are converted into final trace files on the first execution with -indir. The raw files can also be manually converted using the drraw2trace tool. Legacy single trace files with all threads interleaved into one are not supported with this option: use -infile instead.
  • -infile
    default value: ""
    Directs the simulator to use a single all-threads-interleaved-into-one trace file. This is a legacy file format that is no longer produced.
  • -jobs
    default value: -1
    By default, both post-processing of offline raw trace files and analysis of trace files are parallelized. This option controls the number of concurrent jobs. 0 disables concurrency and uses a single thread to perform all operations. A negative value sets the job count to the number of hardware threads, with a cap of 16.
  • -module_file
    default value: ""
    The opcode_mix tool needs the modules.log file (generated by the offline post-processing step in the raw/ subdirectory) in addition to the trace file. If the file is named modules.log and is in the same directory as the trace file, or a raw/ subdirectory below the trace file, this parameter can be omitted.
  • -alt_module_dir
    default value: ""
    Specifies a directory containing libraries referenced in -module_file for analysis tools, or in the raw modules file for post-processing of offline raw trace files. This directory takes precedence over the recorded path.
  • -chunk_instr_count
    default value: 10000000
    Specifies the size in instructions of the chunks into which a trace output file is split inside a zipfile. This is the granularity of a fast seek. This only applies when generating .zip-format traces; when built without support for writing .zip files, this option is ignored. For 32-bit this cannot exceed 4G.
  • -funclist_file
    default value: ""
    The func_view tool needs the mapping from function name to identifier that was recorded during offline tracing. This data is stored in its own separate file in the raw/ subdirectory. If the file is named funclist.log and is in the same directory as the trace file, or a raw/ subdirectory below the trace file, this parameter can be omitted.
  • -cores
    default value: 4
    Specifies the number of cores to simulate.
  • -line_size
    default value: 64
    Specifies the cache line size, which is assumed to be identical for L1 and L2 caches. Must be a power of 2.
  • -L1I_size
    default value: 32K
    Specifies the total size of each L1 instruction cache. Must be a power of 2 and a multiple of -line_size.
  • -L1D_size
    default value: 32K
    Specifies the total size of each L1 data cache. Must be a power of 2 and a multiple of -line_size.
  • -L1I_assoc
    default value: 8
    Specifies the associativity of each L1 instruction cache. Must be a power of 2.
  • -L1D_assoc
    default value: 8
    Specifies the associativity of each L1 data cache. Must be a power of 2.
  • -LL_size
    default value: 8M
    Specifies the total size of the unified last-level (L2) cache. Must be a power of 2 and a multiple of -line_size.
  • -LL_assoc
    default value: 16
    Specifies the associativity of the unified last-level (L2) cache. Must be a power of 2.
  • -LL_miss_file
    default value: ""
    If non-empty, when running the cache simulator, requests that every last-level cache miss be written to a file at the specified path. Each miss is written in text format as a <program counter, address> pair. If this tool is linked with zlib, the file is written in gzip-compressed format. If non-empty, when running the cache miss analyzer, requests that prefetching hints based on the miss analysis be written to the specified file. Each hint is written in text format as a <program counter, stride, locality level> tuple.
  • -L0_filter
    default value: false
    DEPRECATED: Use the -L0I_filter and -L0D_filter options instead.
  • -L0I_filter
    default value: false
    Filters out instruction hits in a 'zero-level' cache during tracing itself, shrinking the final trace to only contain instructions that miss in this initial cache. This cache is direct-mapped with size equal to -L0I_size. It uses virtual addresses regardless of -use_physical. The dynamic (pre-filtered) per-thread instruction count is tracked and supplied via a TRACE_MARKER_TYPE_INSTRUCTION_COUNT marker at thread buffer boundaries and at thread exit.
  • -L0D_filter
    default value: false
    Filters out data hits in a 'zero-level' cache during tracing itself, shrinking the final trace to only contain data accesses that miss in this initial cache. This cache is direct-mapped with size equal to -L0D_size. It uses virtual addresses regardless of -use_physical.
  • -L0I_size
    default value: 32K
    Specifies the size of the 'zero-level' instruction cache for -L0I_filter. Must be a power of 2 and a multiple of -line_size, unless it is set to 0, which disables instruction fetch entries from appearing in the trace.
  • -L0D_size
    default value: 32K
    Specifies the size of the 'zero-level' data cache for -L0D_filter. Must be a power of 2 and a multiple of -line_size, unless it is set to 0, which disables data entries from appearing in the trace.
  • -instr_only_trace
    default value: false
    If enabled, only instruction fetch entries are included in the trace and data entries are omitted.
  • -coherence
    default value: false
    Writes to cache lines will invalidate other private caches that hold that line.
  • -use_physical
    default value: false
    If available, metadata with virtual-to-physical-address translation information is added to the trace. This is not possible from user mode on all platforms. The regular trace entries remain virtual, with a pair of markers of types TRACE_MARKER_TYPE_PHYSICAL_ADDRESS and TRACE_MARKER_TYPE_VIRTUAL_ADDRESS inserted at some prior point for each new or changed page mapping to show the corresponding physical addresses. If translation fails, a TRACE_MARKER_TYPE_PHYSICAL_ADDRESS_NOT_AVAILABLE is inserted. This option may incur significant overhead, both for the physical translation and because it requires disabling optimizations. For -offline, this option must be passed to both the tracer (to insert the markers) and the simulator (to use the markers).
  • -virt2phys_freq
    default value: 0
    This option only applies if -use_physical is enabled. The virtual to physical mapping is cached for performance reasons, yet the underlying mapping can change without notice. This option controls the frequency with which the cached value is ignored in order to re-access the actual mapping and ensure accurate results. The units are the number of memory accesses per forced access. A value of 0 uses the cached values for the entire application execution.
  • -cpu_scheduling
    default value: false
    By default, the simulator schedules threads to simulated cores in a static round-robin fashion. This option causes the scheduler to instead use the recorded CPU that each thread executed on (at a granularity of the trace buffer size) for scheduling, mapping traced CPUs to cores and running each segment of each thread on the core that owns the recorded CPU for that segment.
  • -max_trace_size
    default value: 0
    If non-zero, this sets a maximum size on the amount of raw trace data gathered for each thread. This is not an exact limit: it may be exceeded by the size of one internal buffer. Once reached, instrumentation continues for that thread, but no further data is recorded.
  • -max_global_trace_refs
    default value: 0
    If non-zero, this sets a maximum on the number of trace entry references (of any type: instructions, loads, stores, markers, etc.) recorded. Once reached, instrumented execution continues, but no further data is recorded. This is similar to -exit_after_tracing but without terminating the process. The reference count is approximate.
  • -trace_after_instrs
    default value: 0
    If non-zero, this causes tracing to be suppressed until this many dynamic instruction executions are observed from the start of the application. At that point, regular tracing is put into place. The threshold should be considered approximate, especially for larger values. Use -trace_for_instrs, -max_trace_size, or -max_global_trace_refs to set a limit on the subsequent trace length. Use -retrace_every_instrs to trace repeatedly.
  • -trace_for_instrs
    default value: 0
    If non-zero, this stops recording a trace after the specified number of instructions are traced. Unlike -exit_after_tracing, which kills the application (and counts data as well as instructions), the application continues executing. This can be combined with -retrace_every_instrs. The actual trace period may vary slightly from this number due to optimizations that reduce the overhead of instruction counting.
  • -retrace_every_instrs
    default value: 0
    This option augments -trace_for_instrs. After tracing concludes, this option causes non-traced instructions to be counted and after the number specified by this option, tracing will start up again for the -trace_for_instrs duration. This process repeats itself. This can be combined with -trace_after_instrs for an initial period of non-tracing. Each tracing window is delimited by TRACE_MARKER_TYPE_WINDOW_ID markers. For -offline traces, each window is placed into its own separate set of output files, unless -no_split_windows is set.
  • -split_windows
    default value: true
    By default, offline traces in separate windows from -retrace_every_instrs are written to a different set of files for each window. If this option is disabled, all windows are concatenated into a single trace, separated by TRACE_MARKER_TYPE_WINDOW_ID markers.
  • -exit_after_tracing
    default value: 0
    If non-zero, after tracing the specified number of references, the process is exited with an exit code of 0. The reference count is approximate. Use -max_global_trace_refs instead to avoid terminating the process.
  • -raw_compress
    default value: lz4
    Specifies the compression type to use for raw offline files: "snappy", "snappy_nocrc" (snappy without checksums, which is much faster), "gzip", "zlib", "lz4", or "none". Whether this reduces overhead depends on the storage type: for an SSD, zlib and gzip typically add overhead and would only be used if space is at a premium; snappy_nocrc and lz4 are nearly always performance wins.
  • -online_instr_types
    default value: false
    By default, offline traces include some information on the types of instructions, branches in particular. For online traces, this comes at a performance cost, so it is turned off by default.
  • -replace_policy
    default value: LRU
    Specifies the replacement policy for caches. Supported policies: LRU (Least Recently Used), LFU (Least Frequently Used), FIFO (First-In-First-Out).
  • -data_prefetcher
    default value: nextline
    Specifies the hardware data prefetcher policy. The currently supported policies are 'nextline' (fetch the subsequent cache line) and 'none' (disables hardware prefetching). The prefetcher is located between the L1D and LL caches.
  • -page_size
    default value: 4K
    Specifies the virtual/physical page size.
  • -TLB_L1I_entries
    default value: 32
    Specifies the number of entries in each L1 instruction TLB. Must be a power of 2.
  • -TLB_L1D_entries
    default value: 32
    Specifies the number of entries in each L1 data TLB. Must be a power of 2.
  • -TLB_L1I_assoc
    default value: 32
    Specifies the associativity of each L1 instruction TLB. Must be a power of 2.
  • -TLB_L1D_assoc
    default value: 32
    Specifies the associativity of each L1 data TLB. Must be a power of 2.
  • -TLB_L2_entries
    default value: 1024
    Specifies the number of entries in each unified L2 TLB. Must be a power of 2.
  • -TLB_L2_assoc
    default value: 4
    Specifies the associativity of each unified L2 TLB. Must be a power of 2.
  • -TLB_replace_policy
    default value: LFU
    Specifies the replacement policy for TLBs. Supported policies: LFU (Least Frequently Used).
  • -simulator_type
    default value: cache
    Specifies the type of the simulator. Supported types: cache, miss_analyzer, TLB, reuse_distance, reuse_time, histogram, basic_counts, or invariant_checker.
  • -verbose
    default value: 0
    Verbosity level for notifications.
  • -show_func_trace
    default value: true
    In the func_trace tool, this controls whether every traced call is shown or instead only aggregate statistics are shown.
  • -test_mode
    default value: false
    Run extra analyses for sanity checks on the trace.
  • -test_mode_name
    default value: ""
    Run extra analyses for specific sanity checks by name on the trace.
  • -disable_optimizations
    default value: false
    Disables various optimizations where information is omitted from offline trace recording when it can be reconstructed during post-processing. This is meant for testing purposes.
  • -dr
    default value: ""
    Specifies the path of the DynamoRIO root directory.
  • -dr_debug
    default value: false
    Requests use of the debug build of DynamoRIO rather than the release build.
  • -dr_ops
    default value: ""
    Specifies the options to pass to DynamoRIO.
  • -tracer
    default value: ""
    The full path to the tracer library.
  • -tracer_alt
    default value: ""
    The full path to the tracer library for the other bitwidth, for use on child processes with a different bitwidth from their parent. If empty, such child processes will die with fatal errors.
  • -only_thread
    default value: 0
    For simulator types that support it, limits analysis to the single thread with the given identifier. 0 enables all threads.
  • -skip_refs
    default value: 0
    Specifies the number of references to skip in the beginning of the application execution. These memory references are dropped instead of being simulated.
  • -warmup_refs
    default value: 0
    Specifies the number of memory references to warm up caches before simulation. The warmup references come after the skipped references and before the simulated references. This flag is incompatible with -warmup_fraction.
  • -warmup_fraction
    default value: 0
    Specifies the fraction of last-level cache blocks to be loaded such that the cache is considered to be warmed up before simulation. The warmup fraction is computed after the skipped references and before the simulated references. This flag is incompatible with -warmup_refs.
  • -sim_refs
    default value: 8589934592G
    Specifies the number of memory references to simulate. The simulated references come after the skipped and warmup references, and the references following the simulated ones are dropped.
  • -view_syntax
    default value: att/arm/dr
    Specifies the syntax to use when viewing disassembled offline traces. The option can be set to one of "att" (AT&T style), "intel" (Intel style), "dr" (DynamoRIO's native style with all implicit operands listed), and "arm" (32-bit ARM style). An invalid specification falls back to the default, which is "att" for x86, "arm" for ARM (32-bit), and "dr" for AArch64.
  • -config_file
    default value: ""
    The full path to the cache hierarchy configuration file.
  • -report_top
    default value: 10
    Specifies the number of top results to be reported.
  • -reuse_distance_threshold
    default value: 100
    Specifies the reuse distance threshold for reporting the distant repeated references. A reference is a distant repeated reference if the distance to the previous reference on the same cache line exceeds the threshold.
  • -reuse_distance_histogram
    default value: false
    By default only the mean, median, and standard deviation of the reuse distances are reported. This option prints out the full histogram of reuse distances.
  • -reuse_skip_dist
    default value: 500
    Specifies the distance between nodes in the skip list. For optimal performance, set this to a value close to the estimated average reuse distance of the dataset.
  • -reuse_verify_skip
    default value: false
    Verifies every skip list-calculated reuse distance with a full list walk. This incurs significant additional overhead. This option is only available in debug builds.
  • -record_function
    default value: ""
    Records an invocation trace for the function(s) specified in the option value. The value should have the format function_name|func_args_num (e.g., -record_function "memset|3"), with an optional suffix "|noret" (e.g., -record_function "free|1|noret"). The trace contains, for each function invocation, the return address, the function argument value(s), and (unless "|noret" is specified) the function return value. If multiple requested functions map to the same address and differ in whether "noret" was specified, the attribute from the first one requested is used; if they differ in the number of arguments, the minimum value is used. Only pointer-sized arguments and return values are recorded. The trace identifies which function is involved via a numeric ID entry prior to each set of value entries. The mapping from numeric ID to library-qualified symbolic name is recorded during tracing in a file "funclist.log" whose format is described by the drmemtrace_get_funclist_path() function's documentation. If the target function is in the dynamic symbol table, then the function_name should be a mangled name (e.g. "_Znwm" for "operator new", "_ZdlPv" for "operator delete"). Otherwise, the function_name should be a demangled name. Multiple functions can be recorded by using the separator "&" (e.g., -record_function "memset|3&memcpy|3") or by specifying multiple -record_function options (e.g., -record_function "memset|3" -record_function "memcpy|3"). Each provided function name should be unique and, if -record_heap is enabled, must not collide with the existing heap functions (see -record_heap_value).
  • -record_heap
    default value: false
    A convenience option that enables recording a trace for the heap functions defined in -record_heap_value. Specifying this option is equivalent to -record_function [heap_functions], where [heap_functions] is the value in -record_heap_value.
  • -record_heap_value
    default value: malloc|1&free|1|noret&tc_malloc|1&tc_free|1|noret&__libc_malloc|1&__libc_free|1|noret&calloc|2&_Znwm|1&_ZnwmRKSt9nothrow_t|2&_ZnwmSt11align_val_t|2&_ZnwmSt11align_val_tRKSt9nothrow_t|3&_ZnwmPv|2&_Znam|1&_ZnamRKSt9nothrow_t|2&_ZnamSt11align_val_t|2&_ZnamSt11align_val_tRKSt9nothrow_t|3&_ZnamPv|2&_ZdlPv|1|noret&_ZdlPvRKSt9nothrow_t|2|noret&_ZdlPvSt11align_val_t|2|noret&_ZdlPvSt11align_val_tRKSt9nothrow_t|3|noret&_ZdlPvm|2|noret&_ZdlPvmSt11align_val_t|3|noret&_ZdlPvS_|2|noret&_ZdaPv|1|noret&_ZdaPvRKSt9nothrow_t|2|noret&_ZdaPvSt11align_val_t|2|noret&_ZdaPvSt11align_val_tRKSt9nothrow_t|3|noret&_ZdaPvm|2|noret&_ZdaPvmSt11align_val_t|3|noret&_ZdaPvS_|2|noret
    Functions recorded by -record_heap. The option value should fit the same format required by -record_function. These functions will not be traced unless -record_heap is specified.
  • -record_dynsym_only
    default value: false
    Symbol lookup can be expensive for large applications and libraries. This option causes the symbol lookup for -record_function and -record_heap to look in the dynamic symbol table only.
  • -record_replace_retaddr
    default value: false
    Function wrapping can be expensive for large concurrent applications. This option causes the post-function control point to be located using return address replacement, which has lower overhead, but runs the risk of breaking an application that examines or changes its own return addresses in the recorded functions.
  • -miss_count_threshold
    default value: 50000
    Specifies the minimum number of LLC misses of a load for it to be eligible for analysis in search of patterns in the miss address stream.
  • -miss_frac_threshold
    default value: 0.005
    Specifies the minimum fraction of LLC misses of a load (from all misses) for it to be eligible for analysis in search of patterns in the miss address stream.
  • -confidence_threshold
    default value: 0.75
    Specifies the minimum confidence to include a discovered pattern in the output results. Confidence in a discovered pattern for a load instruction is calculated as the fraction of the load's misses with the discovered pattern over all the load's misses.
  • -enable_drstatecmp
    default value: false
    When true, this option enables the drstatecmp library that performs state comparisons to detect instrumentation-induced bugs due to state clobbering.
  • -enable_kernel_tracing
    default value: false
    By default, offline tracing only records a userspace trace. If this option is enabled, offline tracing records each syscall's kernel PT (Processor Trace) data and writes every syscall's PT and metadata to files in -outdir/kernel.raw/ for later offline analysis. This feature is available only on Intel CPUs that support Intel® Processor Trace.
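
The value format shared by -record_function and -record_heap_value (entries separated by "&", each of the form function_name|func_args_num with an optional "|noret" suffix) can be illustrated with a minimal Python parser. This is a sketch of the documented format only: the names RecordedFunc and parse_record_function are hypothetical, skipping empty entries is an assumption, and the tracer's actual parsing and error handling may differ.

```python
from typing import List, NamedTuple


class RecordedFunc(NamedTuple):
    name: str       # function_name as written in the option value
    num_args: int   # func_args_num
    noret: bool     # True if the "|noret" suffix was given


def parse_record_function(value: str) -> List[RecordedFunc]:
    """Split an option value such as "memset|3&free|1|noret" into entries."""
    funcs = []
    for entry in value.split("&"):
        if not entry:
            # Assumption of this sketch: silently skip empty entries.
            continue
        fields = entry.split("|")
        if len(fields) not in (2, 3):
            raise ValueError(f"expected name|num_args[|noret], got {entry!r}")
        if len(fields) == 3 and fields[2] != "noret":
            raise ValueError(f"unknown suffix in {entry!r}")
        # int() raises ValueError if the argument count is not a number.
        funcs.append(RecordedFunc(fields[0], int(fields[1]), len(fields) == 3))
    return funcs


if __name__ == "__main__":
    for func in parse_record_function("memset|3&free|1|noret"):
        print(func)
```

The sketch works purely on the option text, so it cannot reproduce the rule for functions that map to the same address; that resolution depends on symbol lookup in the traced process.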