drcachesim is one of the few simulators to support multiple processes. This feature requires an out-of-process simulator and inter-process communication, whereas a single-process design would incur less overhead. Thus, we expect
drcachesim to pay for its multi-process support with potentially worse performance than single-process simulators.
When comparing cache hits, misses, and miss rates across simulators, keep in mind that the way each simulator counts accesses can vary substantially. For example, some other simulators (such as
cachegrind) do not split memory references that cross cache lines into multiple hits or misses, while
drcachesim does split them. Instructions that reference multiple memory words on the same cache line (such as
ldm on ARM) are considered to be single accesses by
drcachesim, while other simulators (such as
cachegrind) may split the accesses into separate pieces. A final example involves string loop instructions on x86.
drcachesim considers only the first iteration to involve an instruction fetch. It presents subsequent iterations as "non-fetched instructions", which the simulator ignores, although the basic_counts tool does report them as a separate statistic. Other simulators (incorrectly) issue a fetch to the instruction cache on every iteration of the string loop.
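
The effect of two of these accounting choices on raw counts can be illustrated with a small sketch. This is a toy model written for illustration only, not code from drcachesim or cachegrind: the function names, the 64-byte line size, and the boolean policy flags are all assumptions.

```python
# Toy model of two accounting policies; not taken from any real simulator.
LINE_SIZE = 64  # assumed cache line size in bytes


def lines_touched(addr, size, line_size=LINE_SIZE):
    """Return the indices of the cache lines covered by one memory reference."""
    first = addr // line_size
    last = (addr + size - 1) // line_size
    return list(range(first, last + 1))


def count_data_accesses(addr, size, split_across_lines, line_size=LINE_SIZE):
    """Count cache lookups for one reference under a given splitting policy.

    split_across_lines=True models a simulator that issues one lookup per
    cache line touched; False models one that issues a single lookup.
    """
    if split_across_lines:
        return len(lines_touched(addr, size, line_size))
    return 1


def count_string_loop_fetches(iterations, fetch_every_iteration):
    """Count instruction fetches for a string loop with iterations >= 1.

    fetch_every_iteration=False models counting only the first iteration
    as a fetch; True models issuing a fetch on every iteration.
    """
    return iterations if fetch_every_iteration else 1


if __name__ == "__main__":
    # An 8-byte reference at address 60 straddles lines 0 and 1.
    print(lines_touched(60, 8))
    print(count_data_accesses(60, 8, split_across_lines=True))
    print(count_data_accesses(60, 8, split_across_lines=False))
    # A string loop that runs for 100 iterations.
    print(count_string_loop_fetches(100, fetch_every_iteration=False))
    print(count_string_loop_fetches(100, fetch_every_iteration=True))
```

Under these assumptions, the same reference stream yields different hit and miss totals depending on the policy, so miss rates from two simulators are directly comparable only when their counting rules match.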