DynamoRIO
Google Workload Traces

With the rapid growth of internet services and cloud computing, workloads on warehouse-scale computers (WSCs) have become an important segment of today’s computing market. These workloads differ from others in their requirements for on-demand scalability, elasticity, and availability. Their characteristics are fundamentally different from those of traditional benchmarks, and they require changes to modern computer architecture to achieve optimal efficiency. Google is sharing instruction and memory address traces from workloads running in Google data centers so that computer architecture researchers can study and develop new architecture ideas to improve the performance and efficiency of this important class of workloads.

Trace Format

The Google workload traces are captured using DynamoRIO's drmemtrace. The traces are records of instruction fetches and memory accesses, as described at Trace Format. The instruction and memory access records from each software thread are stored in a separate file (.memtrace.gz). In addition, for each software thread, we provide a branch_trace file which contains execution data (taken/not taken, branch target) for each branch instruction (conditional, unconditional, calls, etc.). Finally, for each workload trace, we provide a thread statistics file (.threadstats.csv) which contains, for each thread, the thread ID (tid), instruction count, non-fetched instruction count (e.g., implicit instructions generated from microcode), load count, store count, and prefetch count.
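As an illustration of how the thread statistics file might be consumed, the following Python sketch sums each numeric column across all threads. It assumes the CSV starts with a header row and that the thread ID column is named "tid"; the column names in the sample data are hypothetical and should be checked against a real .threadstats.csv file.

```python
import csv
import io
from collections import Counter


def sum_thread_stats(csv_text, id_column="tid"):
    """Sum every integer-valued column across threads, skipping the ID column.

    Assumes the first row is a header; the column names are taken from the
    file itself rather than hard-coded.
    """
    totals = Counter()
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        for name, value in row.items():
            # Skip the thread ID, missing cells, and overflow cells (name None).
            if name is None or name == id_column or value is None:
                continue
            try:
                totals[name] += int(value)
            except ValueError:
                # Ignore non-integer cells rather than guessing their meaning.
                pass
    return dict(totals)


# Hypothetical sample data; real header names may differ.
sample = "tid,instructions,loads,stores\n101,1000,300,100\n102,500,120,40\n"
totals = sum_thread_stats(sample)
```

For a file on disk, the text can be read with open(path, newline="") and passed to the same function.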

Getting the Traces

The Google Workload Traces can be downloaded from:

Directory convention:

  • workload/trace-X/
    where X is a sequence number starting from 1

Filename convention:

  • Memory trace file:
      <uuid>.<tid>.memtrace.gz
  • Branch trace file:
      <uuid>.branch_trace.<tid>.csv.gz
  • Thread statistics summary:
      <uuid>.threadstats.csv
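
The filename convention above can be matched mechanically. The following Python sketch classifies a filename and extracts its <uuid> and <tid> parts. It assumes that <uuid> contains no dots and that <tid> is a decimal integer; neither is stated above, and the function and pattern names are illustrative only.

```python
import re

# One pattern per file kind in the naming convention.
# Assumption: <uuid> has no '.' characters and <tid> is decimal digits.
_PATTERNS = {
    "memtrace": re.compile(r"^(?P<uuid>[^.]+)\.(?P<tid>\d+)\.memtrace\.gz$"),
    "branch_trace": re.compile(
        r"^(?P<uuid>[^.]+)\.branch_trace\.(?P<tid>\d+)\.csv\.gz$"
    ),
    "threadstats": re.compile(r"^(?P<uuid>[^.]+)\.threadstats\.csv$"),
}


def classify_trace_file(filename):
    """Return (kind, uuid, tid) for a recognized filename, else None.

    tid is None for the thread statistics summary, which is per workload
    trace rather than per thread.
    """
    for kind, pattern in _PATTERNS.items():
        match = pattern.match(filename)
        if match:
            groups = match.groupdict()
            tid = groups.get("tid")
            return kind, groups["uuid"], int(tid) if tid is not None else None
    return None
```

A caller could use the returned (uuid, tid) pair to match each memory trace file with the branch trace file for the same thread.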

Getting Help and Reporting Bugs

The Google Workload Traces are inputs that drive third-party tools (such as analyzers or simulators, including those provided here: Analysis Tool Suite). If you encounter a crash in a tool provided by a third party, please locate the issue tracker for that tool and report the crash there. If you believe the issue is with the Google Workload Traces, with DynamoRIO, or with tools provided with DynamoRIO, you can file an issue as described at Reporting Problems.

For general questions, or if you are not sure whether the problem you hit is a bug in your own code or in provided code, use the DynamoRIO users group mailing list/discussion forum rather than opening an issue in the tracker. The users list will reach a wider audience of people who might have an answer, and it will reach other users who may find the information beneficial.

Contributing

We welcome contributions to the Google workload trace project. The goal of providing the Google workload traces is to enable computer architecture researchers to develop insights and new architecture ideas to improve the performance and efficiency of workloads that run on warehouse-scale computers.

You can contribute to the project in many ways:

  • Providing suggestions for improving trace formats.
  • Sharing and collaborating on architecture research.
  • Reporting issues: see Getting Help and Reporting Bugs.