We’ve all done a lot of print-based debugging: if the code doesn’t work, stick a print statement in the middle to see what it’s doing.
This is a blunt tool, though a very easy one to use. For tricky cases, look to an interactive debugger before sticking in a few dozen print statements. Debugging is hard without the right tools:
“Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.” – Brian Kernighan
Debugging is like the Scientific Method #
- Formulate hypotheses
- Make predictions about what you will see
- Test your hypotheses
- Record your observations!
- Update your hypotheses and repeat.
Rubber Ducking #
A true story, a rubber duck, and a very long day.
Print statements/Logging #
Low tech and clunky, but still useful.
Some packages allow debugging output to be turned on and off with a flag.
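As a rough illustration of debugging output behind a flag, here is a minimal sketch using Python's standard logging and argparse modules. The logger name "ingest", the parse_record function, and the --verbose flag are made up for this example; real packages will name these differently.

```python
import argparse
import logging

# Hypothetical logger name, chosen for this sketch.
logger = logging.getLogger("ingest")


def parse_record(line):
    """Split a comma-separated line into fields, logging what was seen."""
    fields = line.strip().split(",")
    # Emitted only when the logger's level is DEBUG or lower.
    logger.debug("parsed %d fields from %r", len(fields), line)
    return fields


def main(argv=None):
    parser = argparse.ArgumentParser()
    # Hypothetical flag that switches the debugging output on.
    parser.add_argument("--verbose", action="store_true",
                        help="show debugging output")
    args = parser.parse_args(argv)

    # Attach a default handler, then pick the level from the flag.
    logging.basicConfig(format="%(levelname)s %(name)s: %(message)s")
    logger.setLevel(logging.DEBUG if args.verbose else logging.WARNING)

    return parse_record("a,b,c\n")
```

Unlike print statements, the logger.debug() calls can stay in the code: without --verbose they produce no output, so nothing has to be deleted once the bug is fixed.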
Using your tests #
A good first debugging step is to make sure you have tests for the function you’re debugging.
You’re going to be trying all sorts of changes, tinkering with things, refactoring, and generally messing with the function – if you have tests, you can easily check if you fixed the bug without introducing any new ones. If you don’t, you have to laboriously try various inputs until you’re satisfied.
Tests also ensure that you actually know what the function is supposed to do.
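For example, before touching a misbehaving function, you might pin down its intended behavior with a couple of small tests. This is only a sketch: normalize is a hypothetical function standing in for whatever you are debugging, and the tests are written as plain functions with assert so that a runner such as pytest could collect them.

```python
def normalize(values):
    """Scale values so they sum to 1 (hypothetical function being debugged)."""
    total = sum(values)
    if total == 0:
        raise ValueError("cannot normalize values that sum to zero")
    return [v / total for v in values]


def test_normalize_sums_to_one():
    # States what the function is supposed to do on ordinary input.
    assert normalize([1, 1, 2]) == [0.25, 0.25, 0.5]


def test_normalize_rejects_zero_total():
    # States what should happen on the awkward edge case.
    try:
        normalize([0, 0])
    except ValueError:
        return
    raise AssertionError("expected ValueError for a zero total")
```

Rerunning these after every change shows right away whether a fix for one case has broken another.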
Interactive debuggers #
An interactive debugger halts program execution and allows you to inspect the current state: display local variables, view the call stack, set breakpoints, and even run new code. You can step through the code line-by-line to examine how it works.
Debuggers can often be configured to open automatically when your program crashes or throws an exception (like Python’s pdb). IDEs also let you set breakpoints and run debuggers whenever you’d like, or you can add code to invoke the debugger when desired.
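In Python, one way to get the "open on crash" behavior from inside a script is to replace sys.excepthook with a function that calls pdb.post_mortem(). The sketch below assumes you want this only in an interactive terminal; the names debug_on_crash and install_crash_debugger are invented for this example.

```python
import pdb
import sys
import traceback


def debug_on_crash(exc_type, exc_value, tb):
    """Show the traceback, then open pdb at the frame that raised."""
    interactive = sys.stdin is not None and sys.stdin.isatty()
    if not interactive:
        # No terminal to debug in: fall back to the default behavior.
        sys.__excepthook__(exc_type, exc_value, tb)
        return
    traceback.print_exception(exc_type, exc_value, tb)
    pdb.post_mortem(tb)


def install_crash_debugger():
    """Route uncaught exceptions to debug_on_crash."""
    sys.excepthook = debug_on_crash
```

Calling install_crash_debugger() near the top of a script means any uncaught exception drops you into pdb at the point of failure, with the local variables still available to inspect.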
Debugging in R #
First, RStudio provides an integrated debugger that’s useful for running code step-by-step or inspecting a specific function.
If you’re not using RStudio, the debug() function can be used to tell R to enter a debugger whenever a certain function is called (see also debugonce()); or you can insert a call to browser() wherever you want your breakpoint, and when R reaches this, it will stop the code and let you explore.
R can also enter the debugger automatically when an error occurs.
In Errors and Exceptions, we discuss how errors in R are actually “conditions”, and how you can define “handlers” that run when a condition is signaled. R provides one such handler that lets you inspect the entire call stack, print out variables, and so on: recover().
To use it, set options(error = recover) at the top of your script. This tells R to use recover() as the default handler for all errors.
This feature is extremely useful if you have a long-running script that dies 75% of the way through, since you can catch it at the moment of failure.
If you use testthat for unit tests, it supports opening a debugger automatically when a test fails. You do this by setting a special test “reporter” that reports failures by debugging them:
library(testthat)
test_file("test_foo.R", reporter = "debug")
Debugging in Python #
In Python, the pdb debugger is built right in. Python editors like Spyder can set breakpoints just like in RStudio; check the documentation for your editor to find out how to use it.
You can also use pdb from the command line:
# Instead of
python ingest_crimes.py -s 2707.1 data/example_data.txt
# Run
python -m pdb ingest_crimes.py -s 2707.1 data/example_data.txt
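A breakpoint can also be written into the code itself, much like browser() in R. Here is a minimal sketch using the built-in breakpoint() function (available since Python 3.7), which by default starts pdb at that line. The running_mean function and the choice to stop on negative values are hypothetical.

```python
def running_mean(values):
    """Return the running mean of values, pausing on suspicious input."""
    total = 0.0
    means = []
    for count, value in enumerate(values, start=1):
        total += value
        if value < 0:
            # Hypothetical condition: stop only when something looks wrong,
            # so the debugger does not open on every iteration.
            breakpoint()
        means.append(total / count)
    return means
```

When the debugger opens, commands such as p total, n (next line), and c (continue) let you look around before resuming. Setting the environment variable PYTHONBREAKPOINT=0 turns every breakpoint() call into a no-op, which is handy if one is left in by accident.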
Some unit testing tools (like Python’s pytest) can automatically open a debugger when a test fails, so you can figure out exactly what happened. This can help you diagnose finicky tests:
# Instead of
pytest test_stuff.py
# Run
pytest --pdb test_stuff.py
Resources #
- pdb, the Python debugger. If you use Jupyter, look at the %debug magic command.
- RStudio’s debugging documentation
- gdb and lldb for compiled languages (C, C++, Objective-C, whatever GCC or LLVM support)
- A debugging and profiling story.