11  Debugging with AI

Debugging is often one of the most challenging and time-consuming aspects of programming. AI assistants can be powerful allies in the debugging process, often helping you identify and fix issues faster than you would working alone. This chapter explores how to use intentional prompting techniques specifically for debugging tasks.

11.1 Common Debugging Scenarios

11.1.1 Syntax Errors

Syntax errors occur when code violates the rules of the programming language. These are typically the easiest errors to fix, as they’re caught by compilers or interpreters before the code runs.

AI assistants can:
- Explain cryptic error messages in plain language
- Identify the exact location of syntax errors
- Suggest corrections based on context

Example prompt:

I'm getting this syntax error in my Python code:

```python
def calculate_total(items):
    total = 0
    for item in items
        total += item.price
    return total
```

SyntaxError: invalid syntax

Can you identify what’s wrong and how to fix it?
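
For reference, the fix an assistant would be expected to suggest here is the missing colon at the end of the `for` statement. The sketch below shows the corrected function; the `Item` class is a hypothetical stand-in, added only so the example runs on its own.

```python
class Item:
    """Hypothetical stand-in for whatever objects the original list holds."""

    def __init__(self, price):
        self.price = price


def calculate_total(items):
    total = 0
    for item in items:  # the colon at the end of this line was missing
        total += item.price
    return total


print(calculate_total([Item(2.5), Item(4.0)]))  # 6.5
```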


### Logic Errors

Logic errors occur when the code runs without errors but doesn't produce the expected results. These are often more difficult to identify and fix than syntax errors.

AI assistants can:
- Trace through code execution step by step
- Identify flawed assumptions or logical gaps
- Suggest alternative approaches

**Example prompt:**

My binary search function seems to work for some cases but fails for others:

def binary_search(arr, target):
    left = 0
    right = len(arr) - 1
    
    while left <= right:
        mid = (left + right) // 2
        
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    
    return -1

It correctly finds 5 in [1, 3, 5, 7, 9], but when I search for 7 in [1, 3, 5, 7, 9], it returns -1 instead of 3. Can you help me find the bug?


### Runtime Errors

Runtime errors occur during program execution and, if unhandled, cause the program to terminate unexpectedly. These include exceptions, segmentation faults, and other crashes.

AI assistants can:
- Analyze error messages and stack traces
- Identify common causes for specific exceptions
- Suggest defensive programming techniques to prevent crashes

**Example prompt:**

My code is throwing the following exception:

IndexError: list index out of range

Here’s the relevant function:

def process_data(data_list):
    result = []
    for i in range(len(data_list)):
        result.append(data_list[i] + data_list[i+1])
    return result

It crashes when I call it with process_data([1, 2, 3, 4]). Why is this happening and how can I fix it?
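
The crash in this prompt comes from `data_list[i+1]` reading one position past the end of the list on the final iteration. A minimal sketch of one possible fix follows, assuming the intent is to sum each adjacent pair of values:

```python
def process_data(data_list):
    result = []
    # Stop one element early so data_list[i + 1] is always a valid index.
    for i in range(len(data_list) - 1):
        result.append(data_list[i] + data_list[i + 1])
    return result


print(process_data([1, 2, 3, 4]))  # [3, 5, 7]
```

With this loop bound, an empty or single-element list simply produces an empty result instead of raising.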


### Performance Issues

Performance issues occur when code runs correctly but takes too long to execute or consumes excessive resources.

AI assistants can:
- Identify performance bottlenecks
- Suggest algorithmic improvements
- Recommend more efficient data structures or libraries

**Example prompt:**

My function to find duplicate values in a list becomes extremely slow with large inputs:

def find_duplicates(values):
    duplicates = []
    for i in range(len(values)):
        for j in range(i+1, len(values)):
            if values[i] == values[j] and values[i] not in duplicates:
                duplicates.append(values[i])
    return duplicates

How can I optimize this to handle lists with thousands of items efficiently?
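
The nested loops above compare every pair of elements, and the `not in duplicates` check adds a further scan of the result list. One possible optimization, sketched below under the assumption that the values are hashable, counts each value in a single pass with `collections.Counter`:

```python
from collections import Counter


def find_duplicates(values):
    # One pass to count occurrences; Counter keeps first-seen order (Python 3.7+).
    counts = Counter(values)
    return [value for value, count in counts.items() if count > 1]


print(find_duplicates([1, 2, 2, 1, 3]))  # [1, 2]
```

Because `Counter` preserves insertion order, the duplicates come back in the order their first occurrence appears, which matches the original function.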


## Prompt Templates for Debugging

### Error Diagnosis Template

When you encounter an error message:

I’m getting this error message:

[paste the complete error message including stack trace]

From this code:

[paste the relevant code section]

1. What is causing this error?
2. How can I fix it?
3. Is there a better approach to what I’m trying to do?

### Code Review Template

When your code runs but doesn't behave as expected:

My code should [describe expected behavior], but instead it [describe actual behavior]:

[paste the code]

Here’s an example of the input: [provide a specific example]

Expected output: [describe what you expect]

Actual output: [describe what actually happens]

Can you help me identify what’s wrong?


### Step-by-Step Tracing Template

For understanding exactly where logic goes wrong:

Could you help me trace through this function step by step with the input [specific input]?

[paste function code]

I’d like to see the value of each variable at each step to understand where my logic is failing.


### Performance Debugging Template

For optimizing slow code:

This function works correctly but becomes slow with larger inputs:

[paste code]

1. What is the time complexity of this function?
2. Where are the performance bottlenecks?
3. How can I optimize it while maintaining the same functionality?

## Effective Debugging Workflows

### The Divide and Conquer Approach

When debugging complex issues, breaking the problem down is often the most effective strategy:

  1. Isolate the problem:

    I'm not sure which part of my code is causing the issue.
    If I comment out the sections marked #A, #B, and #C, does anything stand out as a likely culprit?
  2. Create a minimal reproduction:

    Here's a simplified version of my code that still produces the error.
    Can you identify the issue in this smaller example?
  3. Binary search the code:

    If I comment out the first half of the function, the error disappears.
    Can you help me narrow down which part of the first half is problematic?

11.1.2 The Hypothesis Testing Approach

Debugging by forming and testing specific hypotheses:

  1. Form a hypothesis:

    I suspect the issue might be related to how I'm handling null values.
    Does that seem like a plausible cause based on the symptoms?
  2. Design a test:

    How can I modify my code to verify whether null values are causing the issue?
  3. Interpret results:

    I added print statements before and after the suspected line,
    and I'm seeing [specific output]. What does this tell us?
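
The "design a test" step can often be a few lines of code rather than a change to the program itself. As a rough sketch, assuming the data is a Python list and "null" means `None`, a hypothetical helper like this confirms or rules out the hypothesis before any fix is attempted:

```python
def find_none_positions(records):
    """Return the indexes of records that are None (hypothetical helper)."""
    return [index for index, record in enumerate(records) if record is None]


sample = [3, None, 7, None]  # made-up data for illustration
positions = find_none_positions(sample)
print(f"None found at positions: {positions}")
```

An empty result means the null-value hypothesis can be discarded and a new one formed.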

11.1.3 The Comparative Analysis Approach

Debugging by comparing working and non-working code:

  1. Identify differences:

    This code works correctly:
    [working code]
    
    But this similar code fails:
    [failing code]
    
    What key differences explain why one works and the other doesn't?
  2. Incremental changes:

    If I change my code from A to B incrementally, at what point does it break?
    I'll start by changing X and see if that affects the behavior.
  3. Reference implementation:

    Here's my implementation of algorithm X that isn't working:
    [my code]
    
    And here's a reference implementation that works:
    [reference code]
    
    What am I doing differently that could cause my issues?
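
Before asking an assistant to compare two versions, it can help to produce the differences mechanically so the prompt focuses on the lines that changed. The sketch below uses the standard library's `difflib.unified_diff`; the two `total` functions are made-up examples, not code from this chapter.

```python
import difflib

# Made-up working and failing versions of the same small function.
working = """def total(xs):
    result = 0
    for x in xs:
        result += x
    return result
""".splitlines()

failing = """def total(xs):
    result = 0
    for x in xs:
        result = x
    return result
""".splitlines()

diff = list(
    difflib.unified_diff(
        working, failing, fromfile="working.py", tofile="failing.py", lineterm=""
    )
)

# Keep only the added and removed lines, skipping the file header lines.
changed = [
    line
    for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
print("\n".join(changed))
```

Pasting only the changed lines, along with a little surrounding context, keeps the comparison prompt short and specific.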

11.2 Understanding Error Messages with AI

Error messages can often be cryptic, especially for beginners. AI assistants can translate these messages into actionable information.

11.2.1 Anatomy of Error Messages

When sharing error messages with AI, include:

  1. The error type/name
  2. The error message
  3. The line number or location
  4. The stack trace (if available)
  5. The context surrounding the error

Example prompt:

I'm getting this error message, but I don't understand what it means:

Traceback (most recent call last):
  File "my_script.py", line 27, in main
    results = process_data(points)
  File "my_script.py", line 42, in process_data
    key, value = data_point
TypeError: cannot unpack non-iterable int object

The data_point variable is coming from this loop:
for data_point in processed_points:
    key, value = data_point
    # rest of code...

Can you explain what this error means and how to fix it?
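
This error means that one of the items being unpacked is a plain integer rather than a two-element pair. The sketch below first reproduces the message in isolation, then shows one defensive option. The body of `process_data` is an assumption (the prompt elides it), so the dictionary it builds is purely illustrative.

```python
# Reproduce the error in isolation: an int cannot be unpacked into two names.
try:
    key, value = 42
except TypeError as exc:
    message = str(exc)
    print(message)


def process_data(points):
    """Illustrative version: collects (key, value) pairs into a dict."""
    results = {}
    for data_point in points:
        # Fail early with a clearer message if an item is not a pair.
        if not isinstance(data_point, (tuple, list)) or len(data_point) != 2:
            raise ValueError(f"expected a (key, value) pair, got {data_point!r}")
        key, value = data_point
        results[key] = value
    return results


print(process_data([("a", 1), ("b", 2)]))
```

The guard does not fix the upstream code that produced the bare integer, but it reports which item was malformed, which is usually the next thing to investigate.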

11.2.2 Common Error Patterns

AI can help identify patterns in errors that might indicate systemic issues:

Example prompt:

I keep getting different errors in different parts of my codebase, but they all seem related to type mismatches. Here are three recent examples:

[Error 1 details]
[Error 2 details]
[Error 3 details]

Is there a common root cause that might explain all of these errors? Should I be looking for a specific pattern in my code?

11.3 Debugging Strategies with AI

11.3.1 Rubber Duck Debugging

Rubber duck debugging involves explaining your code line by line, which often helps you spot the issue yourself. AI can serve as an advanced “rubber duck” that can also respond with insights.

Example prompt:

I'm going to walk through this function line by line to try to understand why it's not working. Please let me know if you spot any issues in my explanation:

```python
def merge_sorted_lists(list1, list2):
    result = []
    i = j = 0
    
    while i < len(list1) and j < len(list2):
        if list1[i] < list2[j]:
            result.append(list1[i])
            i += 1
        else:
            result.append(list2[j])
            j += 1
    
    # At this point, either list1 or list2 might have remaining elements
    # I expect this to add any remaining elements from list1
    result.extend(list1)
    
    # And this should add any remaining elements from list2
    result.extend(list2)
    
    return result
```

When I call this with [1, 3, 5] and [2, 4, 6], I expect [1, 2, 3, 4, 5, 6] but get [1, 2, 3, 4, 5, 1, 3, 5, 2, 4, 6]. What am I missing?
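
Walking through the explanation exposes the gap: the comments say "remaining elements", but `extend(list1)` and `extend(list2)` append the whole lists. A minimal sketch of the corrected function slices from the current positions instead:

```python
def merge_sorted_lists(list1, list2):
    result = []
    i = j = 0

    while i < len(list1) and j < len(list2):
        if list1[i] < list2[j]:
            result.append(list1[i])
            i += 1
        else:
            result.append(list2[j])
            j += 1

    # Append only the elements that have not been consumed yet.
    result.extend(list1[i:])
    result.extend(list2[j:])

    return result


print(merge_sorted_lists([1, 3, 5], [2, 4, 6]))  # [1, 2, 3, 4, 5, 6]
```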


### Print Debugging with AI

Print debugging involves adding print statements to track variable values. AI can suggest strategic places to add these statements.

**Example prompt:**

I suspect my recursive function is not terminating correctly. Where should I add print statements to debug it effectively?

def factorial(n):
    if n <= 1:
        return 1
    else:
        return n * factorial(n-1)

It works for small inputs but crashes with large ones.
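
For a recursive function, the most informative print statements usually sit at the entry of each call and just before each return, indented by recursion depth. The sketch below adds a hypothetical `depth` parameter for that purpose. It also includes an iterative variant, on the assumption that the crash for large inputs is Python's recursion limit being exceeded rather than a faulty base case.

```python
import sys


def factorial_traced(n, depth=0):
    # Indent by recursion depth so the call structure is visible.
    indent = "  " * depth
    print(f"{indent}factorial({n}) called")
    if n <= 1:
        print(f"{indent}factorial({n}) returns 1")
        return 1
    result = n * factorial_traced(n - 1, depth + 1)
    print(f"{indent}factorial({n}) returns {result}")
    return result


def factorial_iterative(n):
    # A loop avoids deep call stacks entirely.
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


factorial_traced(4)
print(f"Recursion limit on this interpreter: {sys.getrecursionlimit()}")
print(factorial_iterative(5))  # 120
```

If the trace shows the base case being reached for small inputs, the recursion itself is terminating correctly and the depth limit becomes the more likely explanation.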


### Root Cause Analysis

AI can help trace errors back to their root causes:

**Example prompt:**

My web application is throwing this error:

DatabaseError: too many database connections

This happens sporadically, usually during peak usage hours. The database connection code looks like this:

def get_db_connection():
    return psycopg2.connect(DATABASE_URL)

def fetch_user_data(user_id):
    conn = get_db_connection()
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
    return cursor.fetchone()

What could be the root cause, and how can I fix it?
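
One plausible root cause is visible in the code itself: `fetch_user_data` opens a new connection on every call and never closes it, so connections accumulate under load. The sketch below shows the close-after-use pattern with `contextlib.closing`. To keep it runnable without a database server, it swaps in the standard library's `sqlite3` for `psycopg2` and uses a hypothetical temporary database file, so the placeholder style (`?` instead of `%s`) differs from the original.

```python
import os
import sqlite3
import tempfile
from contextlib import closing

# Hypothetical demo database; the original code connects via psycopg2 instead.
DATABASE_PATH = os.path.join(tempfile.mkdtemp(), "demo.db")


def get_db_connection():
    return sqlite3.connect(DATABASE_PATH)


def fetch_user_data(user_id):
    # closing() guarantees close() runs even if the query raises.
    with closing(get_db_connection()) as conn:
        with closing(conn.cursor()) as cursor:
            cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))
            return cursor.fetchone()


# Set up a tiny table so the sketch can be run end to end.
with closing(get_db_connection()) as setup_conn:
    setup_conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    setup_conn.execute("INSERT INTO users VALUES (1, 'Alice')")
    setup_conn.commit()

print(fetch_user_data(1))  # (1, 'Alice')
```

In a busy web application, a connection pool is a common next step, since it reuses a bounded number of connections instead of opening one per request.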


### Debugging AI-Generated Code

A common scenario is debugging AI-generated code that contains subtle errors. Working through these errors is a good learning opportunity because it requires you to analyze the code critically rather than accept it as written.

**Example: Debugging a Gradebook Program**

Consider this AI-generated code for a student gradebook:

```python
students = {
    "Alice": "85, 90, 78",
    "Bob": "70, 88, 92"
}

for name in students:
    avg = sum(students[name]) / len(students[name])
    print(f"{name}: {avg}")
```

Analysis of Errors:

This code contains several issues:

1. The scores are stored as strings rather than numbers
2. Calling sum() on a string raises a TypeError
3. len() on a string counts characters, not scores, so the divisor would be wrong even if the sum worked

Effective Debugging Prompt:

Your code stores scores as strings. Can you fix it so each student has a list of numbers instead?

AI Response (improved but still with issues):

students = {
    "Alice": [85, 90, 78],
    "Bob": [70, 88, 92]
}

for name in students:
    total = 0
    for score in students[name]:
        total += score
    avg = total / 3
    print(f"{name} average is {avg}")

Further Analysis: The code now works but has a critical flaw: it assumes each student has exactly 3 scores.

Follow-up Debugging Prompt:

Can you make a function to calculate the average of any list of scores, instead of assuming 3?

AI Response (further improved):

def average(scores):
    return sum(scores) / len(scores)

for name, scores in students.items():
    print(f"{name} average: {average(scores)}")

This iterative debugging process demonstrates key skills:

1. Identifying specific errors in AI-generated code
2. Creating targeted prompts to address those errors
3. Iteratively improving the solution through focused feedback
4. Thinking critically about assumptions and edge cases

This approach is particularly valuable for learning, as it requires understanding both the problem and the implementation details rather than just accepting whatever code the AI generates.
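
One edge case the final version still leaves open is a student with no scores, where `len(scores)` is zero and the division raises `ZeroDivisionError`. The sketch below shows one way to handle it. Returning `None` for an empty list is an assumption about the desired behavior, and "Cara" is a hypothetical student added to exercise that path.

```python
def average(scores):
    # Return None for an empty list instead of raising ZeroDivisionError.
    if not scores:
        return None
    return sum(scores) / len(scores)


students = {
    "Alice": [85, 90, 78],
    "Bob": [70, 88, 92],
    "Cara": [],  # hypothetical student with no scores yet
}

for name, scores in students.items():
    avg = average(scores)
    print(f"{name} average: {'n/a' if avg is None else round(avg, 2)}")
```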

11.4 Avoiding Debugging Pitfalls

11.4.1 The XY Problem

The XY Problem occurs when you ask about your attempted solution (Y) rather than your actual problem (X), leading to inefficient debugging. AI can help identify this pattern.

Wrong approach:

How do I make this regex work for validating email addresses?

/^[a-zA-Z0-9]+@[a-zA-Z0-9]+$/

Better approach:

I need to validate email addresses in my application. 
I tried using this regex: /^[a-zA-Z0-9]+@[a-zA-Z0-9]+$/

But it doesn't handle cases like:
- Dots in the local part (john.doe@example.com)
- Hyphens in domain names (example-site.com)
- TLDs with more than one segment (.co.uk)

What's the best approach for validating email addresses, considering these requirements?

11.4.2 The Correlation ≠ Causation Trap

This occurs when you attribute a bug to something that happened around the same time, but isn’t actually the cause.

Example:

After adding a new CSS file to my project, my JavaScript form validation stopped working. 
Here's the new CSS I added:

[CSS code]

And here's my validation function:

[JS code]

How could the CSS be breaking my JavaScript?

AI might respond:

> The CSS you’ve added is unlikely to directly break your JavaScript validation. These are separate concerns in web development. Let’s look for other possible causes:
>
> 1. Did you make any other changes at the same time?
> 2. Could there be a JS syntax error that’s preventing execution?
> 3. Are you using CSS selectors in your JS that might no longer match?
> 4. Did you accidentally change any IDs or classes used by both?
>
> Let’s examine your JavaScript for the actual issue…

11.4.3 The Bandage Solution

This occurs when you fix the symptoms of a bug rather than the underlying cause. AI can help identify deeper issues.

Example prompt:

My function sometimes returns NaN, so I added a check to return 0 instead when that happens:

```javascript
function calculateAverage(numbers) {
  let sum = 0;
  for (let i = 0; i < numbers.length; i++) {
    sum += numbers[i];
  }
  let average = sum / numbers.length;
  
  // Fix for the NaN issue
  if (isNaN(average)) {
    return 0;
  }
  
  return average;
}
```

This fixes the immediate problem, but is there a better approach?


## Case Study: Debugging a Complex Issue

Let's walk through a complete debugging process using AI assistance:

### Initial Problem Statement

**Human prompt:**

My sorting algorithm isn’t working properly. It’s supposed to sort an array of numbers in ascending order, but some arrays end up partially sorted:

def bubble_sort(arr):
    n = len(arr)
    for i in range(n):
        for j in range(0, n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return arr

It works for [5, 3, 8, 1, 2], giving [1, 2, 3, 5, 8]. But for [5, 3, 8, 1, 2, 9, 4], it gives [1, 2, 3, 5, 4, 8, 9]. What’s going wrong?


### Tracing Execution

**AI response:**
> Let me trace through the bubble sort algorithm step by step with the input [5, 3, 8, 1, 2, 9, 4].
> 
> In bubble sort, we compare adjacent elements and swap them if they're in the wrong order. After each complete pass through the array, the largest element "bubbles up" to its correct position.
> 
> Let's trace through each pass:
> 
> Initial array: [5, 3, 8, 1, 2, 9, 4]
> 
> Pass 1 (i=0):
> - Compare 5 and 3: Swap → [3, 5, 8, 1, 2, 9, 4]
> - Compare 5 and 8: No swap → [3, 5, 8, 1, 2, 9, 4]
> - Compare 8 and 1: Swap → [3, 5, 1, 8, 2, 9, 4]
> - Compare 8 and 2: Swap → [3, 5, 1, 2, 8, 9, 4]
> - Compare 8 and 9: No swap → [3, 5, 1, 2, 8, 9, 4]
> - Compare 9 and 4: Swap → [3, 5, 1, 2, 8, 4, 9]
> 
> After Pass 1: [3, 5, 1, 2, 8, 4, 9]
> 
> Pass 2 (i=1):
> - Compare 3 and 5: No swap → [3, 5, 1, 2, 8, 4, 9]
> - Compare 5 and 1: Swap → [3, 1, 5, 2, 8, 4, 9]
> - Compare 5 and 2: Swap → [3, 1, 2, 5, 8, 4, 9]
> - Compare 5 and 8: No swap → [3, 1, 2, 5, 8, 4, 9]
> - Compare 8 and 4: Swap → [3, 1, 2, 5, 4, 8, 9]
> 
> After Pass 2: [3, 1, 2, 5, 4, 8, 9]
> 
> Wait, I need to review the algorithm. In a correct bubble sort, the largest elements should bubble to the end, and we should get closer to the correct sort with each pass. Let me carefully check your implementation...
> 
> I don't see anything wrong with your bubble sort implementation. Let me continue tracing:
> 
> Pass 3 (i=2):
> - Compare 3 and 1: Swap → [1, 3, 2, 5, 4, 8, 9]
> - Compare 3 and 2: Swap → [1, 2, 3, 5, 4, 8, 9]
> - Compare 3 and 5: No swap → [1, 2, 3, 5, 4, 8, 9]
> - Compare 5 and 4: Swap → [1, 2, 3, 4, 5, 8, 9]
> 
> After Pass 3: [1, 2, 3, 4, 5, 8, 9]
> 
> Wait, this is now correctly sorted! Let me trace one more pass to be sure:
> 
> Pass 4 (i=3):
> - Compare 1 and 2: No swap → [1, 2, 3, 4, 5, 8, 9]
> - Compare 2 and 3: No swap → [1, 2, 3, 4, 5, 8, 9]
> - Compare 3 and 4: No swap → [1, 2, 3, 4, 5, 8, 9]
> 
> No more swaps are needed. The array is now [1, 2, 3, 4, 5, 8, 9], which is correctly sorted.
> 
> It seems the algorithm is working correctly after all. The expected output for [5, 3, 8, 1, 2, 9, 4] should be [1, 2, 3, 4, 5, 8, 9], not [1, 2, 3, 5, 4, 8, 9] as you mentioned.
> 
> There may be an issue elsewhere in your code, or perhaps there was a mistake in the observed output. Could you verify the output again, or provide more context about how you're calling and testing this function?

### Debugging the Test Case

**Human prompt:**

You’re right, I made a mistake in reporting the output. Let me recheck…

I ran the function again with [5, 3, 8, 1, 2, 9, 4], and now I’m getting [1, 2, 3, 4, 5, 8, 9], which is correct.

However, I’m running this inside a larger application where I call the function like this:

def process_data(data_sets):
    results = []
    for data in data_sets:
        sorted_data = bubble_sort(data.copy())
        results.append(sorted_data)
    return results

And I’m still seeing some incorrect sorts in the output. Could there be an issue with how I’m calling it?


### Identifying the Root Cause

**AI response:**
> Based on your updated information, the bubble sort implementation itself appears to be working correctly. The issue might be in how it's being called or how the data is being processed within your larger application.
> 
> Let's analyze the `process_data` function:
> 
> ```python
> def process_data(data_sets):
>     results = []
>     for data in data_sets:
>         sorted_data = bubble_sort(data.copy())
>         results.append(sorted_data)
>     return results
> ```
> 
> This function:
> 1. Takes a list of data sets
> 2. Creates a copy of each data set
> 3. Sorts the copy using bubble_sort
> 4. Appends the sorted data to a results list
> 5. Returns all results
> 
> This approach seems correct. You're creating a copy of each data set before sorting, which is good practice to avoid modifying the original data.
> 
> Here are some potential issues to check:
> 
> 1. **Data type consistency**: Are all elements in your arrays numbers? Sorting mixed types (like strings and numbers) can lead to unexpected results.
> 
> 2. **Mutable objects**: If your arrays contain mutable objects rather than simple numbers, the sorting might be affected by reference issues.
> 
> 3. **Preservation of original data**: Double-check that `data.copy()` is creating a proper copy. For nested structures, you might need a deep copy.
> 
> 4. **Concurrency issues**: If your application is multi-threaded, there could be race conditions affecting the sorting.
> 
> 5. **Data transformation**: Is the data being transformed between the sorting and when you observe the results?
> 
> 6. **Visualization/display issue**: Sometimes the issue is not with the sorting but with how results are displayed or accessed.
> 
> Can you share:
> 1. A specific example of a dataset that's being incorrectly sorted in your application
> 2. How you're verifying the results (e.g., printing them, displaying in UI)
> 3. Any processing that happens to the results after sorting

### Resolution

**Human prompt:**

I think I found the issue. In another part of the code, there’s a function that modifies the results after sorting:

def highlight_outliers(data_list):
    for i in range(len(data_list)):
        # Swap positions of certain elements to highlight them
        for j in range(len(data_list[i])-1):
            if data_list[i][j] > data_list[i][j+1] * 2:  # Highlight elements 2x larger than next
                data_list[i][j], data_list[i][j+1] = data_list[i][j+1], data_list[i][j]
    return data_list

This is being called on the results from process_data! It’s intentionally un-sorting some elements to highlight outliers, but I forgot it was there.

Thank you for helping me trace through the algorithm; it confirmed that my bubble sort is correct and led me to look elsewhere for the issue.
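
The same conclusion can be reached mechanically by checking the data after each stage of the pipeline. The sketch below restates the three functions from the case study and adds two hypothetical helpers, `is_sorted` and `first_unsorted_stage`. The input is made up and uses negative numbers because, on ascending data, the swap condition in `highlight_outliers` can only be true when the right-hand neighbor is negative.

```python
def bubble_sort(arr):
    n = len(arr)
    for i in range(n):
        for j in range(0, n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return arr


def process_data(data_sets):
    return [bubble_sort(data.copy()) for data in data_sets]


def highlight_outliers(data_list):
    for row in data_list:
        for j in range(len(row) - 1):
            if row[j] > row[j + 1] * 2:
                row[j], row[j + 1] = row[j + 1], row[j]
    return data_list


def is_sorted(values):
    """Hypothetical helper: True if values are in ascending order."""
    return all(a <= b for a, b in zip(values, values[1:]))


def first_unsorted_stage(data_sets):
    """Return the name of the first stage whose output is not sorted, or None."""
    stages = [
        ("process_data", process_data),
        ("highlight_outliers", highlight_outliers),
    ]
    current = data_sets
    for name, stage in stages:
        current = stage(current)
        if not all(is_sorted(row) for row in current):
            return name
    return None


print(first_unsorted_stage([[-3, -5, -1]]))  # highlight_outliers
```

A check like this narrows the search to a single stage without tracing any algorithm by hand.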

11.5 Key Takeaways

  • Effective debugging with AI requires clear communication about errors and context
  • Different debugging scenarios call for different prompting strategies
  • Templates can streamline the debugging process for common issues
  • Structured debugging approaches like divide-and-conquer and hypothesis testing work well with AI
  • Always verify AI’s debugging suggestions with your own testing
  • Root cause analysis is more valuable than symptom mitigation
  • Debugging is a process of investigation, not just code fixing

11.6 Moving Forward

In the next chapter, we’ll explore refactoring strategies with AI assistance, building on the debugging skills we’ve developed here to improve existing code rather than just fixing bugs.