DrCacheSim Offline Trace Debugging

This page contains tips for troubleshooting issues when using the DrCacheSim tool's offline memory address traces.

Thread Interleaving Granularity

Each thread has a 32K buffer and when it fills up, or when a system call is about to be executed, the buffer is written out to a per-thread file with a header attached which has a timestamp and cpu identifier. Thus the thread interleaving and cpu scheduling can be reconstructed with a granularity of these buffers.

Viewing binary trace files

Viewing raw, pre-processed files

Each offline_entry_t struct is 8 bytes, so it's easy to view the records as 8-byte entries:

$ od -t x8 -A x drmemtrace.simple_app.09291.0000.dir/raw/drmemtrace.simple_app.09291.0000.raw | tail
06ed10 200080080007c170 00007fd5a8a49700
06ed20 00007fd5a8a496a8 2000e0080007c187
06ed30 00007ffdee8cd0f8 00007ffdee8cd100
06ed40 00007ffdee8cd108 00007ffdee8cd110
06ed50 00007ffdee8cd118 200060080003c15b
06ed60 200040080003c164 00007ffdee8cd118
06ed70 2000a008000c1180 00007fd5a8a47e68
06ed80 20006008000c11b1 802e9f9036480c5d
06ed90 c100000000000000

Of course, if the raw file is compressed, it must first be uncompressed. Use unlz4 for lz4-compressed raw files.

Cheat sheet:

  • c040000000000004 is a header (offline version 4; type 0x40==OFFLINE_FILE_TYPE_ARCH_X86_64)
  • c100000000000000 is a footer
  • 8* is a timestamp
  • 6* is a pid
  • 4* is a tid
  • 2* are PC entries
  • a* is an arm iflush
  • c2* is a marker
  • c203* is a cpu id
  • c20a* is the cache line size
  • c212* is the page size
  • c200* is kernel event; c201* is kernel xfer

Viewing post-processed files

trace_entry_t is 12 bytes and is turned into memref_t by reader_t and its subclasses. To view the 12-byte entries I use od or hexdump to split into 6 2-byte entries and then combine the final 4 into an 8-byte little-endian field using awk:

$ zcat drmemtrace.tool.drcacheoff.burst_malloc.211917.2237.dir/trace/drmemtrace.tool.drcacheoff.burst_malloc.211917.8542.trace.gz | od -A x -t x2 -w12 | awk '{printf "%s | %s %s %s%s%s%s\n", $1, $2, $3, $7, $6, $5, $4}' | head
000000 | 0019 0000 0000000000000001
00000c | 0016 0004 0000000000033bcd
000018 | 0018 0004 0000000000033bcd
000024 | 001c 0002 002eff22e15562f3
000030 | 001c 0003 0000000000000000
00003c | 000a 0003 0000555ec5db2e40
000048 | 0000 0004 00007ffed4b546ac

The printed columns are "offset | type size addr". Type cheat sheet (from trace_type_t enum):

  • 0x19 header
  • 0x16 thread
  • 0x18 pid
  • 0x1c marker: 2=timestamp; 3=cpuid; 0xc=version; 9=filetype
  • 0x0a instr (non-cti)
  • 0x0e direct call
  • 0x00 load
  • 0x01 store
  • 0x1d non-fetched instr
  • 0x1a footer
  • 0x2f instr encoding

type_is_instr: 0xa-0x10 + 0x1e

For .zip files, the data is split into the component files within the archive, in order. Each component repeats enough information (the final timestamp + cpu from the prior chunk, instruction encodings, etc.) to avoid having to examing prior chunks. Extract each component file in turn to view the data. For example:

$ unzip -p drmemtrace.app.194439.1516.dir/trace/drmemtrace.app.194439.7393.trace.zip chunk.0000 | hexdump -v -e '"%010_ax | " 6/2 "%04x " "\n"' | awk '{printf "%s %s %s %s %s%s%s%s\n", $1, $2, $3, $4, $8, $7, $6, $5}' | head -3
0000000000 | 0019 0000 0000000000000004
000000000c | 001c 000c 0000000000000004
0000000018 | 001c 0009 0000000000000040

Viewing instruction disassembly

You can use the view analysis tool with skip_refs and sim_refs parameters to select a window, or for small traces you can re-post-process with drraw2trace with high verbosity (-verbose 4 is good).