Efficient Debugging in Python: A Guide to Using pdb Over print

Debugging is an important part of software development. As humans, we create things, and mistakes can happen. Acknowledging this helps us quickly identify errors, track them, and correct them.

This article is about why debugging in Python using the built-in debugger called pdb is better than using print statements. We will learn how to use pdb to manipulate a running program and inspect some errors.

If you are interested in more content covering topics like this, subscribe to my newsletter for regular updates on software programming, architecture, and tech-related insights.

Debugging in Python

Debugging in Python usually starts with a few simple questions. Is an error being raised? Does a variable have the wrong format? Is a variable unexpectedly None? Is a block of code not running as expected?

In my early days as a Python developer, using print() was my go-to choice. And I still use it; it is pretty effective, let's be honest.

To showcase this, let's analyze a simple program.

def add_numbers(a, b):
    return a + b

def main():
    num1 = 10
    num2 = "20"  # Intentional error: num2 should be an integer
    print("num1:", num1)
    print("num2:", num2)
    result = add_numbers(num1, num2)
    print("Result:", result)

if __name__ == "__main__":
    main()

Running this program will generate a TypeError.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'str'

To debug this, we can inspect the type of the variables.

print("num1:", num1, type(num1))
print("num2:", num2, type(num2))

Running this program will show that num2 is a string, leading to a TypeError when attempting to add it to an integer.
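Once the mismatch is visible, one possible fix is to convert the value before adding it. The sketch below assumes the string is meant to hold an integer. The helper name to_int is hypothetical and is not part of the original program.

```python
def add_numbers(a, b):
    return a + b

def to_int(value):
    """Hypothetical helper: convert value to int, raising a clearer error on failure."""
    try:
        return int(value)
    except (TypeError, ValueError) as exc:
        raise TypeError(f"expected an integer-like value, got {value!r}") from exc

def main():
    num1 = 10
    num2 = "20"  # Still a string, but converted before use
    result = add_numbers(num1, to_int(num2))
    print("Result:", result)
    return result

if __name__ == "__main__":
    main()
```

Whether to convert the value or reject it outright depends on where the value comes from. For user input, conversion is often reasonable; for internal values, failing early may be the better choice.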

But what happens when you have a much more complex program? Let's take the example of a multithreaded application that calculates factorials of multiple numbers.

import threading

def factorial(n):
    if n == 1:  # Intentional error: should be if n == 0
        return 1
    else:
        return n * factorial(n - 1)

def calculate_factorial(num):
    result = factorial(num)
    print(f"Factorial of {num} is {result}")

def main():
    numbers = [0, 5, 7, 10, 12]
    threads = []

    for number in numbers:
        thread = threading.Thread(target=calculate_factorial, args=(number,))
        threads.append(thread)
        thread.start()

    for thread in threads:
        thread.join()

if __name__ == "__main__":
    main()

Try running this script in Python, and you will receive an error.

koladev@koladev-2 ~ % python3 factorial.py
Exception in thread Thread-1 (calculate_factorial):
Factorial of 5 is 120
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
Factorial of 7 is 5040
Factorial of 10 is 3628800
Factorial of 12 is 479001600
    self.run()
  File "/opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/threading.py", line 1010, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/koladev/factorial.py", line 10, in calculate_factorial
    result = factorial(num)
             ^^^^^^^^^^^^^^
  File "/Users/koladev/factorial.py", line 7, in factorial
    return n * factorial(n - 1)
               ^^^^^^^^^^^^^^^^
  File "/Users/koladev/factorial.py", line 7, in factorial
    return n * factorial(n - 1)
               ^^^^^^^^^^^^^^^^
  File "/Users/koladev/factorial.py", line 7, in factorial
    return n * factorial(n - 1)
               ^^^^^^^^^^^^^^^^
  [Previous line repeated 993 more times]
RecursionError: maximum recursion depth exceeded

Let's add print statements to trace the execution flow and identify the cause of the error.

import threading

def factorial(n):
    print(f"Calculating factorial for {n}")  # Debugging statement
    if n == 1:  # Intentional error: should be if n == 0
        print(f"Reached base case for {n}")  # Debugging statement
        return 1
    else:
        result = n * factorial(n - 1)
        print(f"Factorial for {n} after recursion is {result}")  # Debugging statement
        return result

def calculate_factorial(num):
    print(f"Starting calculation for {num}")  # Debugging statement
    result = factorial(num)
    print(f"Factorial of {num} is {result}")

def main():
    numbers = [0, 5, 7, 10, 12]
    threads = []

    for number in numbers:
        print(f"Creating thread for number {number}")  # Debugging statement
        thread = threading.Thread(target=calculate_factorial, args=(number,))
        threads.append(thread)
        thread.start()

    for thread in threads:
        thread.join()

if __name__ == "__main__":
    main()

When you run the program, the output shows factorial being called with negative numbers.

koladev@koladev-2 ~ % python3 factorial.py
Creating thread for number 0
Starting calculation for 0
Calculating factorial for 0
Calculating factorial for -1
Creating thread for number 5
Calculating factorial for -2
Calculating factorial for -3
Starting calculation for 5

From the output, you can see that when calculating the factorial for 0, the function keeps calling itself with decreasing values until it hits negative numbers, leading to a recursion error because the base case if n == 1 is never reached for n = 0.

So, we can just modify the condition from if n == 1 to if n == 0.
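A minimal sketch of the corrected function is shown here. The check for negative input is an extra guard added for illustration and is not part of the original program.

```python
def factorial(n):
    # Extra guard (an assumption, not in the original): reject negative input
    # early instead of recursing until the interpreter gives up.
    if n < 0:
        raise ValueError("factorial is not defined for negative numbers")
    if n == 0:  # Corrected base case
        return 1
    return n * factorial(n - 1)

if __name__ == "__main__":
    for number in [0, 5, 7, 10, 12]:
        print(f"Factorial of {number} is {factorial(number)}")
```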

However, looking at the print statements in the console is quite clumsy. We have some issues with using print for debugging:

  • Output Overload: The console output becomes cluttered with debugging statements, making it difficult to track the flow of execution.

  • Interleaved Output: In multithreaded applications, print statements from different threads are not synchronized and can interleave, leading to jumbled output that's hard to interpret.

  • Limited Insight: Print statements provide only a snapshot of the program's state at specific points, which might not be sufficient to understand complex issues such as race conditions.
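If you stay with print for a while, the interleaving problem can be reduced by tagging each message with the thread name and printing under a lock. This is only a sketch: debug_print, print_lock, and messages are hypothetical names introduced for illustration, and it does nothing about the other two issues.

```python
import threading

print_lock = threading.Lock()  # Ensures one thread prints at a time
messages = []                  # Keeps the lines so they can be inspected later

def debug_print(message):
    # Prefix the message with the name of the thread that produced it.
    line = f"[{threading.current_thread().name}] {message}"
    with print_lock:
        messages.append(line)
        print(line)

def worker(num):
    debug_print(f"Starting calculation for {num}")

threads = [
    threading.Thread(target=worker, args=(n,), name=f"worker-{n}")
    for n in (0, 5)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

The order of the lines still depends on thread scheduling, but each line stays intact and shows which thread wrote it.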

Let's see how pdb can make things better for us.

Using pdb for Debugging in Python

pdb is the Python debugger, a built-in module for interactive debugging of Python programs. It allows you to set breakpoints, step through code, inspect variables, and control execution flow to identify and fix issues.

Let's modify the code by introducing pdb.

import threading
import pdb

def factorial(n):
    pdb.set_trace()  # Setting a breakpoint
    if n == 1:  # Intentional error: should be if n == 0
        return 1
    else:
        return n * factorial(n - 1)

def calculate_factorial(num):
    result = factorial(num)
    print(f"Factorial of {num} is {result}")

def main():
    numbers = [0, 5]
    threads = []

    for number in numbers:
        thread = threading.Thread(target=calculate_factorial, args=(number,))
        threads.append(thread)
        thread.start()

    for thread in threads:
        thread.join()

if __name__ == "__main__":
    main()

Now run the program.

$ python3 factorial.py
> factorial.py(5)factorial()
-> pdb.set_trace()  # Setting a breakpoint
(Pdb) n
> factorial.py(6)factorial()
-> if n == 1:  # Intentional error: should be if n == 0
(Pdb) p n
0
(Pdb) n
> factorial.py(9)factorial()
-> return n * factorial(n - 1)
(Pdb) p n
0
(Pdb) n
> factorial.py(5)factorial()
-> pdb.set_trace()  # Setting a breakpoint
(Pdb) p n
-1
(Pdb) n

From the session above, you can see that the value of n decreases below 0 due to the incorrect base case (if n == 1 instead of if n == 0). This confirms the error in the base case.

Correct the condition, and the program should run without any errors, printing the correct factorials.

Debugging with pdb offers a cleaner, more controlled way to inspect and correct errors, especially in complex and multithreaded applications.
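In the session above, the debugger stops on every call to factorial. A possible refinement is to stop only when something looks wrong. The sketch below uses the built-in breakpoint() function (available since Python 3.7), which opens pdb by default. The condition n < 0 is an assumption about what counts as suspicious in this program.

```python
def factorial(n):
    if n < 0:
        # Only drop into the debugger when n has gone negative.
        breakpoint()
    if n == 1:  # Intentional error kept for the demonstration
        return 1
    return n * factorial(n - 1)

if __name__ == "__main__":
    # Valid input never reaches the breakpoint, so this runs straight through.
    print(factorial(5))
    # Calling factorial(0) here would open the debugger at n == -1.
```

Setting the environment variable PYTHONBREAKPOINT=0 turns breakpoint() calls into no-ops, which helps if one is accidentally left in the code.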

Conclusion

Debugging is an essential skill for any developer, and mastering tools like pdb can significantly enhance your ability to diagnose and fix issues efficiently. In this article, we saw that while print() statements have their place, using an interactive debugger such as pdb allows for a more nuanced and thorough examination of your code's behavior, especially in complex or multithreaded environments.


If you enjoyed this article and want to stay updated with more content, subscribe to my newsletter. I send out a weekly or bi-weekly digest of articles, tips, and exclusive content that you won't want to miss 🚀