With the rapid growth of internet services and cloud computing, workloads on warehouse-scale computers (WSCs) have become an important segment of today’s computing market. These workloads differ from others in their requirements for on-demand scalability, elasticity, and availability. They have fundamentally different characteristics from traditional benchmarks and require changes to modern computer architecture to achieve optimal efficiency. Google is sharing instruction and memory address traces from workloads running in Google data centers so that computer architecture researchers can study and develop new architecture ideas to improve the performance and efficiency of this important class of workloads. To protect Google's intellectual property, these traces have had their original ISA replaced with a synthetic ISA that we call DR_ISA_REGDEPS. This synthetic ISA removes architecture-specific details (e.g., the opcode of instructions), while still providing enough information (e.g., register dependencies, instruction categories) to perform meaningful analyses and simulations.

Public Trace Format

The Google workload traces are captured using DynamoRIO's drmemtrace. The traces are records of instruction and memory accesses as described at Trace Format. While memory accesses are left unchanged compared to the original trace, instructions follow the DR_ISA_REGDEPS synthetic ISA.

DR_ISA_REGDEPS is designed to preserve register dependencies and to hint at the type of operation an instruction performs. For this reason, all operands that are not registers (e.g., memory and immediate operands) are omitted. Memory operations of a DR_ISA_REGDEPS instr_t can be inferred from the subsequent dynamorio::drmemtrace::TRACE_TYPE_READ and dynamorio::drmemtrace::TRACE_TYPE_WRITE records. Note that if a memory operand of the original ISA instruction uses a register, its corresponding virtual register will be present in the list of source register operands of the corresponding DR_ISA_REGDEPS instruction.

Because DR_ISA_REGDEPS is a synthetic ISA, some routines that work on instructions coming from an actual ISA (such as DR_ISA_AMD64) are not supported (e.g., decode_sizeof()). We do support decode() and decode_from_copy() to decode an encoded DR_ISA_REGDEPS instruction into an instr_t.

A DR_ISA_REGDEPS instr_t contains the following information:

Categories: composed of dr_instr_category_t values, they indicate the type of operation performed (e.g., a load, a store, a floating point math operation, a branch, etc.). Note that categories are composable, hence more than one category can be set. This information can be obtained using instr_get_category().
Arithmetic flags: we don't distinguish between different flags. We only report whether at least one arithmetic flag was read (all arithmetic flags will be set to read) and/or written (all arithmetic flags will be set to written). This information can be obtained using instr_get_arith_flags().
Number of source and destination operands: we only consider register operands. This information can be obtained using instr_num_srcs() and instr_num_dsts(). Memory operands can be deduced by subsequent read and write records in the trace.
Source operation size: the size of the largest source operand the original ISA instruction operated on. This information can be obtained using instr_get_operation_size().
List of register operand identifiers: they are contained in opnd_t lists, separated into source and destination. Note that these reg_id_t identifiers are virtual and it should not be assumed that they belong to any DR_REG_ enum value of any specific architecture. These identifiers are meant for tracking register dependencies with respect to other DR_ISA_REGDEPS instructions only. These lists can be obtained by walking the instr_t operands with instr_get_dst() and instr_get_src().
ISA mode: always DR_ISA_REGDEPS. This information can be obtained using instr_get_isa_mode().
Encoding bytes: an array of bytes containing the DR_ISA_REGDEPS instr_t encoding. Note that this information is present only for decoded instructions (i.e., instr_t generated by decode() or decode_from_copy()). This information can be obtained using instr_get_raw_bits().
Length: the length of the encoded instruction in bytes. Note that this information is present only for decoded instructions (i.e., instr_t generated by decode() or decode_from_copy()). This information can be obtained using instr_length(). Be aware that in Google Workload Traces the instruction fetch size of a dynamorio::drmemtrace::memref_t and the instr_length() of the corresponding fetched instruction do not match! To allow analyses and simulations of front-end behavior to have a realistic size of fetched instructions, we kept the instruction fetch size to be the same as in the original ISA instruction.

Note that all routines that operate on instr_t and opnd_t are also supported for DR_ISA_REGDEPS instructions and their operands. However, querying information other than that described above will return the zeroed value set by instr_create() or instr_init() when the instr_t was created.
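
To illustrate how the virtual register identifiers and composable categories described above might be used, here is a minimal sketch in Python. It does not call the DynamoRIO API: the Instr class, the Category flag names and their bit values, and the register numbers are all hypothetical stand-ins for what instr_get_category(), instr_get_src() and instr_get_dst() would return. It tracks read-after-write dependencies between instructions by remembering the last writer of each virtual register.

```python
from dataclasses import dataclass, field
from enum import IntFlag


class Category(IntFlag):
    # Hypothetical stand-ins for dr_instr_category_t values; the names and
    # bit values here are illustrative only.
    LOAD = 1
    STORE = 2
    BRANCH = 4
    FP_MATH = 8


@dataclass
class Instr:
    # Hypothetical model of the DR_ISA_REGDEPS fields used in this sketch.
    category: Category
    srcs: list = field(default_factory=list)  # virtual source register ids
    dsts: list = field(default_factory=list)  # virtual destination register ids


def raw_dependencies(instrs):
    """For each instruction, return the sorted indices of the earlier
    instructions that last wrote one of its source registers."""
    last_writer = {}  # virtual register id -> index of most recent writer
    deps = []
    for index, instr in enumerate(instrs):
        producers = {last_writer[r] for r in instr.srcs if r in last_writer}
        deps.append(sorted(producers))
        # Record writes after reads, so an instruction that reads and writes
        # the same register does not depend on itself.
        for r in instr.dsts:
            last_writer[r] = index
    return deps


# Made-up instruction sequence with made-up virtual register ids.
trace = [
    Instr(Category.LOAD, srcs=[3], dsts=[7]),
    Instr(Category.FP_MATH, srcs=[7, 9], dsts=[9]),
    Instr(Category.LOAD | Category.STORE, srcs=[9, 7]),
    Instr(Category.BRANCH, srcs=[4]),
]
deps = raw_dependencies(trace)
```

The identifiers are only compared with each other, never mapped to architectural registers, which matches their intended use for dependency tracking.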

On top of instructions and memory accesses, traces also have dynamorio::drmemtrace::trace_marker_type_t markers. All markers of the original trace are present, except for:

Which have been removed.

Because tracing overhead results in inflated context switches, the dynamorio::drmemtrace::TRACE_MARKER_TYPE_CPU_ID values have been modified to "unknown CPU" to avoid confusion. We recommend that users use our scheduler (see Trace Scheduler) for a realistic schedule of a trace's threads.

Also, we preserved the following markers, but only for SYS_futex functions:

Every trace has a v2p.textproto and an info.textproto file associated with it.

The v2p.textproto file provides a plausible virtual to physical mapping of the virtual addresses present in a trace for more realistic TLB simulations. This is a static virtual to physical mapping with 2 MB pages. Users can generate different mappings (e.g., smaller page size) by modifying this file, or create their own mapping following the same v2p.textproto format.
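
As a rough illustration of what a static, page-granular mapping with 2 MB pages implies for address translation, consider the following Python sketch. It does not parse v2p.textproto (the file's schema is not reproduced here); the page_table dictionary and all addresses in it are made up for the example.

```python
PAGE_SIZE_2MB = 2 * 1024 * 1024


def translate(vaddr, page_table, page_size=PAGE_SIZE_2MB):
    """Translate a virtual address using a static page-granular mapping.

    page_table maps a virtual page base address to a physical page base
    address; both are assumed to be multiples of page_size.
    """
    offset = vaddr % page_size
    vpage = vaddr - offset
    if vpage not in page_table:
        raise KeyError(f"no mapping for virtual page {vpage:#x}")
    # The offset within the page is unchanged by translation.
    return page_table[vpage] + offset


# Hypothetical mapping with made-up addresses, for illustration only.
page_table = {
    0x7F0000000000: 0x40000000,
    0x7F0000200000: 0x40600000,
}
paddr = translate(0x7F0000201234, page_table)
```

With a smaller page size, the same address range needs proportionally more entries, which is what makes the page size relevant to TLB simulation.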

To run the TLB_simulator tool leveraging the provided v2p.textproto, use:

drrun -t drmemtrace -tool TLB_simulator -indir ${PATH_TO_TRACE} -use_physical -v2p_file ${PATH_TO_TRACE}/aux/v2p.textproto

The info.textproto file provides users with additional, human-readable information about a trace. Unlike v2p.textproto, this file is currently not intended to be used with any DynamoRIO tool. The file contains information on phases of the underlying workload and the number of peak live cores (i.e., the maximum number of cores used at the same time during the execution of the traced workload). We recommend that users set their trace simulations to use the provided number of peak live cores and, if that number is smaller than the whole socket, scale the LLC accordingly.
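
One possible reading of "scale the LLC accordingly" is to shrink the simulated LLC in proportion to the fraction of the socket's cores in use. The sketch below shows that arithmetic in Python; proportional scaling is an assumption made here, not a rule stated by this document, and the socket parameters are made-up numbers.

```python
def scaled_llc_bytes(socket_llc_bytes, socket_cores, peak_live_cores):
    """Scale the LLC capacity by the fraction of the socket's cores in use.

    Assumes simple proportional scaling; a simulator may additionally
    require the result to be rounded to a size it supports.
    """
    if not 0 < peak_live_cores <= socket_cores:
        raise ValueError("peak_live_cores must be in 1..socket_cores")
    return socket_llc_bytes * peak_live_cores // socket_cores


# Hypothetical socket: 32 cores sharing a 64 MiB LLC, 8 peak live cores.
llc = scaled_llc_bytes(64 * 1024 * 1024, socket_cores=32, peak_live_cores=8)
```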

These traces are supported starting from DynamoRIO 11.3.

Getting the Traces

The Google Workload Traces can be downloaded from:

Google workload trace folder

Directory structure:

CHANGELOG.txt
CONTRIBUTING.txt
LICENSE.txt
README.txt
workload_name/
  trace/
    <uuid>.<tid>.memtrace.zip
  aux/
    info.textproto
    v2p.textproto
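
Given the layout above, a small Python helper can locate the per-thread trace files and the auxiliary files of one workload. This is only a sketch: the function names are hypothetical and the directory passed in is whatever local path the workload was downloaded to.

```python
from pathlib import Path


def list_trace_files(workload_dir):
    """Return the per-thread trace archives under <workload_dir>/trace/."""
    return sorted(Path(workload_dir, "trace").glob("*.memtrace.zip"))


def aux_files(workload_dir):
    """Return the expected paths of info.textproto and v2p.textproto."""
    aux = Path(workload_dir, "aux")
    return aux / "info.textproto", aux / "v2p.textproto"
```

The number of files returned by list_trace_files() is also a quick way to see how many file descriptors a tool that opens the whole trace may need.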

Getting Help and Reporting Bugs

The Google Workload Traces are essentially inputs to drive third-party tools (such as analyzers or simulators, including those provided here: Analysis Tool Suite). If you encounter a crash in a tool provided by a third party, please locate the issue tracker for the tool you are using and report the crash there. If you believe the issue is with the Google Workload Traces or with DynamoRIO or tools provided with DynamoRIO, you can file an issue as described at Reporting Problems.

For general questions, or if you are not sure whether the problem you hit is a bug in your own code or in provided code, use the DynamoRIO users group mailing list/discussion forum rather than opening an issue in the tracker. The users list will reach a wider audience of people who might have an answer, and it will reach other users who may find the information beneficial.

Contributing

We welcome contributions to the Google workload trace project. The goal of providing the Google workload traces is to enable computer architecture researchers to develop insights and new architecture ideas to improve the performance and efficiency of workloads that run on warehouse-scale computers.

You can contribute to the project in many ways:

Providing suggestions for improving trace formats.
Sharing and collaborating on architecture research.
Reporting issues: see Getting Help and Reporting Bugs.

Cite Google Workload Traces

If you would like to cite this work, you can use the following BibTeX entry:

@misc{Google_Workload_Traces_Version_2,
  title = {Google Workload Traces Version 2},
  howpublished = {\url{https://console.cloud.google.com/storage/browser/external-traces-v2}},
  note = {Accessed: yyyy-mm-dd}
}

Common Issues

Here we collect common issues that some users might experience when using the Google Workload Traces.

Depending on the Linux distribution used, there might be a "nofile" threshold for the number of files a single process can open that is set too low (e.g., 1024). This threshold can be increased using "ulimit -n 8192". A threshold of 8192 is enough for Google Workload Traces. Consider adding this command to your ".bashrc". A low "nofile" threshold is problematic when a DynamoRIO analyzer or scheduler operates on a whole Google trace, which can have over 2000 trace files (one per software thread). This can result in "Failed to initialize scheduler: Failed to open PATH/TO/MEMTRACE.ZIP". If this happens, please increase the "nofile" threshold.
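
For scripts that launch analyses, the same limit can be checked and raised programmatically. The sketch below uses Python's standard resource module (Unix only); the function name is hypothetical, and the value 8192 is the threshold mentioned above. A process can raise its soft limit only up to its hard limit, so the function reports the value that was actually granted.

```python
import resource


def ensure_nofile_limit(wanted=8192):
    """Raise the soft "nofile" limit to `wanted` if it is lower.

    Returns the soft limit in effect afterwards, which may be lower than
    `wanted` when the hard limit does not allow more.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    unlimited = resource.RLIM_INFINITY
    if soft == unlimited or soft >= wanted:
        return soft  # already high enough
    new_soft = wanted if hard == unlimited else min(wanted, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return new_soft


granted = ensure_nofile_limit(8192)
```

The new limit applies to the calling process and to processes it starts afterwards, so a tool would need to be launched from the same script (for example with subprocess) to benefit from it.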

Deprecated Google Workload Traces (Version 1)

The previous version of Google workload traces contains a subset of the information of the current traces and has been deprecated. Please use the current version described above.

The previous version can still be found at:

Google workload trace folder (Version 1)

DynamoRIO 11.0 is the latest version that supports these traces.