- Use debug builds (pass "-debug" to drrun) with notifications turned on to diagnose errors. In general, it is much easier to debug with the debug build of DynamoRIO. A release build crash may show up as an earlier, easier-to-diagnose assert in a debug build.
- Try running without any client to isolate where the error is.
- Asserts can be suppressed with the
-ignore_assert_list * option or with
- Logging can be enabled with
-loglevel N. Logs are written to DR-install/logs/. See our logging documentation for information on what is contained in the logs.
- Use the load_syms or load_syms64 script to locate client and private libraries on Windows (-no_hide is no longer necessary; it never helped with the client library anyway). Pre-install, these are in tools/windbg-scripts/load_syms and tools/windbg-scripts/load_syms64. I simply change my windbg shortcuts to point at these via -c (please note: these are meant for attaching to a running process that is under DynamoRIO control):
"C:\Program Files (x86)\Debugging Tools for Windows\windbg.exe" -pt 1 -c "$><E:\derek\dr\git\src\tools\windbg-scripts\load_syms"
"C:\Program Files\Debugging Tools for Windows (x64)\windbg.exe" -pt 1 -c "$><E:\derek\dr\git\src\tools\windbg-scripts\load_syms64"
For debugging a 32-bit process with the 64-bit windbg, use the load_symsWOW64 variant. However, this is less well-tested than using a 32-bit windbg on a 32-bit process.
- To locate client and private libraries on Linux, use the add-symbol-file commands printed out at start time (see below for more information).
- Use read watchpoints instead of breakpoints in application code, as the trap instruction inserted by the debugger into the application code can end up copied into DynamoRIO's code cache, resulting in an unhandled trap.
- On Windows, if an application invokes OutputDebugString() while under a debugger, DynamoRIO can end up losing control of the application.
- A SIGSEGV or Access Violation observed in a debugger does not necessarily indicate a problem. DynamoRIO uses "safe read" operations to access untrusted application addresses and can incur faults which are handled and continued past.
- DynamoRIO disables itself when Windows is booted in safe mode (without networking). Thus, if a crash occurs in a Windows service under DynamoRIO, rebooting in safe mode will allow recovery.
If a client library doesn't seem to function for a given process, it is possible that the client library wasn't loaded due to permissions errors.
One of the common situations where this happens is when the target application runs as a different user than the user who created the client library. This results in the application process not having the right permissions to access the client library.
Try running the process under the debug mode of DynamoRIO (see dr_register_process()), where diagnostic messages are raised on errors like client library permissions. To see all messages, set notification options such as -msgbox_mask and -stderr_mask to 0xf (see DynamoRIO Runtime Options). This will alert you to the problem.
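On POSIX systems, a quick pre-check of the file mode bits can rule out the most common permission problem before launching the target. The following is a minimal sketch, not part of DynamoRIO: the function names are hypothetical, it only looks at the "other" permission bits, and it does not cover Windows ACLs or group membership.

```python
import os
import stat


def others_can_read(mode):
    """True if the 'other' read bit is set in a POSIX st_mode value."""
    return bool(mode & stat.S_IROTH)


def check_client_library(path):
    """Return a list of human-readable problems for a client library path.

    Hypothetical helper: it checks only the 'other' permission bits.
    """
    problems = []
    if not others_can_read(os.stat(path).st_mode):
        problems.append("%s is not readable by other users" % path)
    # Every directory leading to the library must be searchable (x bit).
    directory = os.path.dirname(os.path.abspath(path))
    while True:
        if not os.stat(directory).st_mode & stat.S_IXOTH:
            problems.append("%s is not searchable by other users" % directory)
        parent = os.path.dirname(directory)
        if parent == directory:
            break
        directory = parent
    return problems
```

An empty list from check_client_library() only means the simple mode-bit check passed; the debug-build diagnostics described above remain the authoritative way to confirm the cause.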
On Linux we use gdb for debugging. Note that we now use debug information files that are separate from their corresponding shared libraries: e.g.,
libdynamorio.so.debug. There is a section inside the shared library that identifies the debug file.
On Linux, DR sends itself a SIGILL signal during initialization, in order to measure the size of the signal frame used by the kernel. If you are under gdb, simply continue past this signal. SIGILL (SIGFPE on MacOS; SIGSTKFLT inside QEMU where the others crash QEMU (but note the poor support for SIGSTKFLT in gdb)) is also sent to other threads when attaching to a multi-threaded application and to suspend or terminate them later on during execution.
Additionally, DR uses various "safe read" strategies where it may raise a SIGSEGV or SIGBUS while examining application memory. Load symbols and look for
safe_read variants on the call stack (such as
safe_read_fast) to identify these. Just like with the init-time SIGILL, simply continue past these.
Use the DynamoRIO runtime option
-msgbox_mask with a desired mask, optionally along with
-pause_via_loop if the application uses stdin, to cause a target application to wait and let you attach gdb. To attach early on, use a debug internal build and set the mask for informational messages. See the option descriptions in the API documentation.
For fast iterative debugging of small applications, it's nice to be able to launch the app under DR under gdb with one command:
However, be sure to run this command within gdb prior to running the app to ensure success:
One drawback over attaching is that gdb's breakpoints in the loader can interfere with the dlopen() execution, and you will have to continue through them. DR's safe_read faults may also show up. It's best to ignore them via:
If you have control of the target application you can build it with the start/stop interface so that it invokes DynamoRIO and then run it under gdb just like a native application.
To isolate clients and their libraries from the application, DR uses its own private loader to load clients. By default, gdb can only see libraries loaded by the glibc loader (ld-linux.so). To work around this problem, DR prints the gdb commands necessary to load symbols in a debug build. It looks like this:
In a release build, that message is not printed. However, the same string of commands is available in the global variable
gdb_priv_cmds, which can be displayed within gdb with something like this:
You can also use the
-no_private_loader option to use the system loader to load the client, although this will break many clients and apps and is no longer officially supported.
To manually generate the symbol file loading commands, you want to tell gdb where the
.text segment is.
You will need to adjust that address based on where the library was actually loaded: simply add the current base address from the maps file or DR's diagnostic printout and subtract the preferred base.
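The adjustment is plain arithmetic, sketched below in Python. The function names, the library path, and the example addresses are illustrative assumptions; the only gdb syntax relied on is add-symbol-file followed by a file name and the .text address.

```python
def relocated_text_address(text_vaddr, preferred_base, actual_base):
    """Address of .text at run time for a library loaded at actual_base.

    text_vaddr is the .text address as recorded in the file, which is
    relative to the library's preferred base.
    """
    return text_vaddr - preferred_base + actual_base


def add_symbol_file_command(path, text_vaddr, preferred_base, actual_base):
    """Build the gdb command that loads symbols at the relocated address."""
    addr = relocated_text_address(text_vaddr, preferred_base, actual_base)
    return "add-symbol-file {} {:#x}".format(path, addr)


if __name__ == "__main__":
    # Illustrative values only: take the real ones from the library's
    # section headers and from the maps file or DR's diagnostic printout.
    print(add_symbol_file_command("/path/to/libclient.so",
                                  0x71001000, 0x71000000, 0x7F0000000000))
```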
The add-symbol-file commands shown in the prior section include symbols for the DynamoRIO library. If you need symbols for the DR library itself before DR sets up those commands, in some cases the debugger loads them properly for you and you do not need to do anything special. However, some versions of gdb load DR's symbols at the wrong address when you launch a process from within the debugger and you may need to clear the symbols first, before executing any
add-symbol-file ... commands, by running:
The debugger also gets DR's symbols wrong when DR reloads itself to avoid gaps between its segments. The repository contains a gdb python script to load the libdynamorio.so symbols which is the simplest way to automagically get the correct symbols.
Another useful script provided in the repository is a memory query script to print the line in the maps file matching a given address.
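The idea behind such a query is simple enough to sketch. The following is not the repository's script, just a minimal illustration that assumes the standard /proc/<pid>/maps line format (a hexadecimal start-end range as the first field, with the end address exclusive); the function name is hypothetical.

```python
def find_mapping(maps_text, address):
    """Return the maps line whose address range contains address, or None."""
    for line in maps_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        start_str, sep, end_str = fields[0].partition("-")
        if not sep:
            continue
        try:
            start, end = int(start_str, 16), int(end_str, 16)
        except ValueError:
            continue  # Not a mapping line.
        if start <= address < end:
            return line
    return None


if __name__ == "__main__":
    import sys
    # Usage: python query.py <pid> <hex address>
    with open("/proc/{}/maps".format(sys.argv[1])) as maps_file:
        print(find_mapping(maps_file.read(), int(sys.argv[2], 16)))
```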
gdb has trouble with generating call stacks at various points during DR execution, such as when at a system call in generated code. Manually setting the stack and frame pointers can solve this:
For examining the stack, I find this function useful:
The frame command has never worked for me: it always prints
#0 0x00000000 in ?? (). Instead I set
$ebp, and sometimes have to also set
$eip (which can only be set from frame 0). Or you can manually walk frame pointers:
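The walk relies on the usual x86 frame layout when code is compiled with frame pointers: the word at the frame pointer holds the caller's saved frame pointer, and the next word holds the return address. The Python sketch below illustrates only that logic. It is not a gdb script, and read_word is a hypothetical callback that returns the word at an address (or None if unreadable), which in practice you would obtain by examining memory in the debugger.

```python
def walk_frame_pointers(read_word, frame_pointer, word_size=4, max_frames=64):
    """Follow a chain of saved frame pointers and collect return addresses.

    read_word(addr) must return the integer stored at addr, or None if the
    address cannot be read. word_size is 4 for 32-bit and 8 for 64-bit.
    """
    return_addresses = []
    while frame_pointer and len(return_addresses) < max_frames:
        saved_fp = read_word(frame_pointer)
        return_address = read_word(frame_pointer + word_size)
        if saved_fp is None or return_address is None:
            break
        return_addresses.append(return_address)
        # The stack grows down, so a caller's frame must sit at a higher
        # address; anything else means the chain has ended or is corrupt.
        if saved_fp <= frame_pointer:
            break
        frame_pointer = saved_fp
    return return_addresses
```

For a 64-bit process, pass word_size=8 and start from the value of $rbp instead of $ebp.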
To get line numbers use
info line instead:
There is no way to view segment bases directly within gdb, but you can create a core file and mine it to find the bases. Here is how to do it for gs for 64-bit stored in its MSR (rather than the GDT):
The Visual Studio debugger is completely inadequate for debugging applications running under DynamoRIO. It frequently fails due to our manipulation of its injected thread, it has no command-line interface, etc. We use WinDbg (and its counterparts ntsd and cdb) exclusively. WinDbg is a low-level debugger that provides symbolic debugging too.
Install the Debugging Tools for Windows to get the full WinDbg. For debugging 32-bit applications, we recommend using WinDbg version 6.3.0017, not the newer versions 6.4 through 6.11, as they have problems displaying callstacks involving DynamoRIO code. However, if you cannot obtain 6.3.0017 (it is no longer supported), get the latest version. You'll have to go to extra effort to get a callstack when attaching at a DynamoRIO messagebox midway through execution for a 32-bit process (see below). For 64-bit, use the most recent version of WinDbg.
Most of the information about how to use WinDbg is available from its excellent help file. Some key commands include:
- Display callstacks of all threads:
- Select thread number N:
- Display callstack of current thread with numbered frames:
- Select frame number N:
- Display local variables (only useful in debug builds):
- Display local variable x:
- Display all symbols in module M starting with XYZ:
- Display stack with symbolic references:
- Switch context (e.g., to an exception context):
.cxr <CONTEXT address>
- Display loaded modules:
- Open a log file:
- Set a breakpoint:
- Continue execution:
See the tools/windbg-scripts scripts and the WinDbg Tips section below for further examples.
You also need access to the symbol files both for DynamoRIO and for the operating system libraries. You can use the
.symfix command to automatically download symbols from the Microsoft symbol server. You should specify a local cache directory so that WinDbg doesn't query the server every time:
.symfix c:\my\symbol\dir. The symbol path and cache directory are stored in a global environment variable
_NT_SYMBOL_PATH. Here is an example path:
Note that you may want to remove any network paths once you have the symbols of interest locally to speed up debugging.
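For orientation, a symbol-server entry in a symbol path conventionally takes the form srv*<local cache>*<server>, and multiple entries are separated by semicolons. The Python sketch below just assembles such a string. The helper names and the DynamoRIO library directory are hypothetical, and the server URL is Microsoft's public symbol server.

```python
MICROSOFT_SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"


def symbol_server_entry(cache_dir, server_url=MICROSOFT_SYMBOL_SERVER):
    """One symbol-server entry that caches downloads in cache_dir."""
    return "srv*{}*{}".format(cache_dir, server_url)


def build_symbol_path(entries):
    """Join symbol path entries the way _NT_SYMBOL_PATH expects."""
    return ";".join(entries)


if __name__ == "__main__":
    # The second entry is a hypothetical directory holding DynamoRIO's pdbs.
    print(build_symbol_path([
        symbol_server_entry(r"c:\my\symbol\dir"),
        r"c:\path\to\dynamorio\lib32\debug",
    ]))
```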
To attach to a process on Windows, use the
-msgbox_mask option and attach the debugger while the dialog box has paused the application. Use
-msgbox_mask 15 to attach at program startup, or
-msgbox_mask 12 to attach at a later error.
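The two values differ in which message categories trigger a dialog. The sketch below decodes a mask under the assumption, taken from the DynamoRIO runtime options documentation, that the bits mean information (0x1), warning (0x2), error (0x4), and critical (0x8); check the option descriptions for your DynamoRIO version before relying on it. The function name is hypothetical.

```python
# Assumed bit layout of -msgbox_mask and -stderr_mask.
MASK_BITS = (
    (0x1, "information"),
    (0x2, "warning"),
    (0x4, "error"),
    (0x8, "critical"),
)


def decode_notification_mask(mask):
    """List the message categories enabled by a notification mask."""
    return [name for bit, name in MASK_BITS if mask & bit]


if __name__ == "__main__":
    for value in (15, 12):
        print(value, decode_notification_mask(value))
```

Under that assumption, 15 (0xf) enables every category, so the first informational message at startup pauses the application, while 12 (0xc) pauses only on error and critical messages.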
In order to attach press
F6; this will show a list of all processes available and you can choose your application. You can also view the attach list through the File->Attach menu item. After you have attached click 'ok' on the pop up window.
To get control of DynamoRIO when it starts initializing, invoke WinDbg from the cygwin command line as follows. Make sure WinDbg is in your path.
Once the debugger command line comes up, type the following commands, substituting your path to the directory where dynamorio.dll is located:
The dynamorio.dll library is not loaded by the system loader, and so WinDbg does not automatically find it. Once DynamoRIO has initialized enough, a script that we provide will automatically load the symbols for DynamoRIO and all client libraries in use. See the top of this file for how to load these scripts at WinDbg startup. From within WinDbg, for 32-bit:
For a 32-bit application but a 64-bit windbg:
These scripts will fail if the process is not running under DynamoRIO or if it has not finished DynamoRIO initialization: thus, they will not work at the early attach point described under Launching Within WinDbg above.
To manually point WinDbg at the library, you will need its directory and its base address. Then use these two commands:
Libraries loaded for a client are not on the system library list. To identify these modules use the following command:
Symbols for private libraries can be automatically loaded using the load_syms script. For 32-bit:
For a 32-bit application but a 64-bit windbg:
Windbg versions from 6.4 onward refuse to show a callstack if it's not part of the main stack for that thread, for 32-bit processes. You'll see just the top frame, often
ntdll!NtRaiseHardError if you attached at a message box. Something like this should show the right callstack although it won't let you examine frames:
Another trick is to clobber the TEB stack fields:
The k commands will then work nicely (I've done this in 6.11.001.402; it should work in all other versions). Of course, only do this when you are not planning to continue application execution, or else restore the TEB fields afterward, just in case (that's what the
!teb is for, to print out the original values for restoring).
When DynamoRIO catches an unhandled exception, the callstack should have a
dynamorio!intercept_exception frame. Navigate to that frame, where we have a
CONTEXT* local var named
cxt. Then change the thread's context and ask for a stack trace. So if you see in the initial callstack:
Execute these commands:
And now you should see the callstack as of the exception itself.
You can create a core dump from windbg that allows others, or you at a later date, to re-analyze the bug:
It is good practice to always create a dump file when attaching windbg, just in case the bug is not reproducible.
.ldmp files are generated automatically by the core in certain failure situations. The failure situations on which a .ldmp file is created are specified by the
-dumpcore_mask option (see the enum in core/os_shared.h), which defaults to 0x1ff in debug builds and to 0x0 in release builds. An additional parameter
-dumpcore_violation_threshold controls the maximum number of violation .ldmp files to generate.
Once you have a .ldmp file you can use it to recreate the process for debugging purposes. To do so use the ldmp.exe utility in the tools module. The syntax is:
ldmp_file is the ldmp you wish to view and
dummy_executable is a fully qualified absolute windows style path to
dummy.exe (also located in the tools module).
A cygwin example:
A cmd shell example:
Ldmp will print out a process id and a bunch of mapping information. Use WinDbg to attach to the process id specified, making sure to attach NON-INVASIVELY (you will crash WinDbg with an invasive attach). You should now be ready to debug.
Note that process ids and thread ids will differ from the original process as well as peb and teb addresses (though not contents, so you can still use !teb and !peb). Ldmp.exe provides mapping information from original thread ids and teb addrs to new thread ids and teb addrs as well as mappings for any other memory regions that were moved. Note that ldmp.exe can only recreate threads that were in our all_threads list at the time the ldmp was created, though ldmp.exe will try to detect teb regions associated with the missing threads. Handle information will not be available. MEM_TYPE information is also lost, but it is available in the human readable parts of the ldmp file. It is expected that ldmp.exe will be unable to copy over the shared_user_data/vsyscall page (so note that if debugging across os versions).
Once you are finished you will need to use task manager or DRkill.exe to kill the recreated process (will show up as dummy.exe unlike in WinDbg). Ldmp files are also somewhat human readable.
Some debugging scenarios include services that start very early in the system initialization before we are able to start any programs - including debuggers. Also, sometimes even at a later stage a machine becomes completely unresponsive. Our only resort to getting control over such a machine is with a kernel debugger. Later we'll add notes here on how to use a kernel debugger.
If a Windows machine is hung without a chance of attaching a user mode debugger we can still get a full system dump with a keyboard command.
This dump file requires a pagefile on your boot drive that is at least as large as your main system memory. Also verify that you have free space on your drive for the dump file itself. In total, you need disk space of at least twice your RAM.
How to set it up:
- Go to Control Panel | System | Advanced | Startup and Recovery Settings
- Choose a Complete Memory Dump and note the location: it should be
- Keep the overwrite setting checked, but remember after getting a dump to rename it with a meaningful name if you want to keep it
- Go to Performance Settings | Advanced | Virtual Memory and make sure you have a large enough pagefile
To set up the key combination, set this registry key (under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters):
The magic key combination is
See http://support.microsoft.com/kb/244139/ for details on changing which keys are used, in case you're using a KVM that messes it up.
There is also an easy way to debug remotely with WinDbg when you can actually start it on the target machine. This saves you the trouble of VNCing to the target and is worth a try. Of course, this mode is not secure at all: read the docs for better ways of doing this if you like it.
You can also set this in
However, be careful not to try this on services like
winlogon.exe on which it doesn't work. You will need a recovery CD if you mess it up.
Note that if you have already started a debug session you can easily convert it into a remote server with
and such debugging is much faster than over RDP or VNC, and helpful for sharing a debug session with someone else.
These apply only to debug builds:
- Dump stats:
- See info on all locks - most importantly contention stats:
dt innermost_lock -l next_process_lock
These work in both release and debug:
- Always save to a logfile via the .logopen command. After debugging, copy the logfile to the bug directory to keep a record of your analysis. This is particularly important for a bug you gave up on or didn't have time to complete analysis of.
- Dump all callstacks:
- In user mode debugger:
- In kernel mode debugger on XP+:
!process 0 1f lsass.exe
- In user mode debugger:
- Check which build you're using in the target:
lm vm dynamorio
- Dump heap units:
You need to obtain the heap.units address and then dump the first two fields in the !list command:
```
> dt dynamorio!heap units .
   +0x000 units : 0x1a151000
      +0x000 start_pc : 0x1a15101c "1X@"
      +0x004 end_pc : 0x1a160fff ""
```
```
> ?? heap.units
struct theHeapUnit * 0x1a151000

> !list -t theHeapUnit.next_global -x "dd" -a "L2" 0x1a151000
dd 1a151000 L2
1a151000 1a15101c 1a160fff

dd 1a131000 L2
1a131000 1a13101c 1a140fff
```
- See current memory state with initial allocation information:
Use the scripts in the tools/windbg-scripts module to simplify analyses of DynamoRIO data structures.
Invoke by setting up parameters in the pseudo-registers and then using the
The address_query.pl script in the tools module can be used for a command-line address-to-line utility. It requires that you have already built DRload.exe in the tools module. It will use cdb if you've installed the Debugging Tools for Windows in the standard location; otherwise it uses the ntsd present on every machine and goes through a temporary log file.
It can take as input stdin, a file, or command-line arguments listing the addresses to be queried. Here's the usage:
Here's an example using a command-line argument:
As another example, if you have a callstack in the file "cstack" with two columns (frame pointers followed by addresses), you could do something like this:
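One way to pull the address column out of such a file is sketched below in Python. This is not taken from address_query.pl: the function name is hypothetical, and the sketch only assumes whitespace-separated columns with the address in the second one. The resulting addresses can then be supplied to the script through any of the input methods it accepts.

```python
def addresses_from_callstack(text):
    """Return the second column (the addresses) of a two-column callstack."""
    addresses = []
    for line in text.splitlines():
        columns = line.split()
        if len(columns) >= 2:
            addresses.append(columns[1])
    return addresses


if __name__ == "__main__":
    # "cstack" is the callstack file named in the text above.
    with open("cstack") as callstack_file:
        print("\n".join(addresses_from_callstack(callstack_file.read())))
```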
See the comments at the top of the script for more information.