18  Testing

18.1 The Wall

AI said the code worked. You ran it, and it seemed fine. Then you gave it to someone else and it crashed on inputs you never tried. AI tested its own code with the happy path only: the obvious, expected inputs. It missed every edge case.

You asked AI to “add tests” and it generated tests that passed trivially. One test checked that add(2, 3) returns 5. It did not test what happens when you pass None, a string, or a negative number. The tests gave you false confidence.

This chapter fixes that.

18.2 Thinking Session

18.2.1 Getting Oriented

Note: Thinking Session Prompt

Why do we test code? I know AI can generate tests, but what makes a test actually useful? What is the difference between a test that gives you confidence and a test that just takes up space? And what is pytest, and why do Python developers prefer it over the built-in unittest module?

Your AI should explain: good tests check edge cases, not just happy paths. A test is useful if its failure tells you something is broken. pytest is preferred because it uses plain assert statements instead of special methods (assertEqual, assertTrue), making tests more readable.
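To see the readability difference for yourself, here is the same check written both ways. This is a sketch; the add function is a made-up example, not from the chapter.

```python
import unittest


def add(a, b):
    return a + b


# unittest style: tests live in a class and use special assertion methods
class TestAdd(unittest.TestCase):
    def test_add(self):
        self.assertEqual(add(2, 3), 5)
        self.assertIsInstance(add(2, 3), int)


# pytest style: a plain function with plain assert statements
def test_add():
    assert add(2, 3) == 5
    assert isinstance(add(2, 3), int)
```

Both tests check the same thing; the pytest version reads like ordinary Python, which is why the rest of this chapter uses it.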

18.2.2 Go Deeper

Note: Thinking Session Prompt

How do I write a pytest test? Walk me through the structure: where do test files go, how do I name them, what does a test function look like, and how do I run tests? Also explain what “arrange, act, assert” means.

Tip: What to Look For

Test files go in a tests/ directory, named test_*.py. Test functions start with test_. Arrange-Act-Assert: set up the input (arrange), call the function (act), check the result (assert). pytest discovers and runs tests automatically. Your AI should show this structure, not just individual assert statements.
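The structure your AI should show might look like this. The function and file names here (apply_discount, tests/test_cart.py) are illustrative, not from your chatbot.

```python
# tests/test_cart.py


def apply_discount(price, percent):
    """Return the price after a percentage discount."""
    return price * (1 - percent / 100)


def test_apply_discount():
    # Arrange: set up the input
    price, percent = 100.0, 20
    # Act: call the function under test
    result = apply_discount(price, percent)
    # Assert: check the result
    assert result == 80.0
```

Running pytest from the project root discovers this file by its test_ prefix and runs every test_ function inside it.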

Note: Thinking Session Prompt

What are edge cases and how do I identify them? When AI generates a function, what inputs should I always test? Give me a checklist I can use for any function.
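Here is what such a checklist looks like when applied to code. The parse_age function below is a hypothetical example, not part of the chatbot; each test covers one checklist item (empty input, None, boundary values).

```python
import pytest


def parse_age(text):
    """Parse an age string, rejecting missing or out-of-range values."""
    if text is None:
        raise TypeError("age is required")
    value = int(text)  # non-numeric strings raise ValueError here
    if value < 0 or value > 150:
        raise ValueError("age out of range")
    return value


def test_empty_string_rejected():
    with pytest.raises(ValueError):
        parse_age("")


def test_none_rejected():
    with pytest.raises(TypeError):
        parse_age(None)


def test_boundary_values():
    assert parse_age("0") == 0
    assert parse_age("150") == 150
```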

18.2.3 Challenge It

Note: Thinking Session Prompt

Here is a function and its AI-generated tests. Are the tests good enough?

def calculate_average(numbers):
    return sum(numbers) / len(numbers)

def test_average():
    assert calculate_average([1, 2, 3]) == 2.0
    assert calculate_average([10, 20]) == 15.0

Tip: What to Look For

The tests miss: empty list (ZeroDivisionError), single-item list, negative numbers, very large numbers, non-numeric values. The function itself has a bug: it crashes on empty input. Good tests would catch this. AI-generated tests often cover only the happy path.
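One way to close the gap might look like this: guard the empty-list case in the function, then test it alongside the other missed inputs. This is a sketch of the fix, not the only valid design (you could also return None or 0.0 for an empty list).

```python
import pytest


def calculate_average(numbers):
    # Guard against the empty-list bug the original tests missed
    if not numbers:
        raise ValueError("cannot average an empty list")
    return sum(numbers) / len(numbers)


def test_average_empty_list():
    with pytest.raises(ValueError):
        calculate_average([])


def test_average_single_item():
    assert calculate_average([7]) == 7.0


def test_average_negative_numbers():
    assert calculate_average([-1, -2, -3]) == -2.0
```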

18.2.4 What You Should Have Learned

  • Good tests check edge cases, not just expected inputs
  • pytest uses plain assert, simple and readable
  • Test files: tests/test_*.py, functions: test_*()
  • Arrange, Act, Assert. The test structure pattern
  • Edge case checklist: empty input, None, wrong type, boundary values, single item
  • Tests give confidence to modify code. That is their real value

18.3 The Gap

Testing is what separates code that works from code you can trust. When AI generates code, you cannot verify it by running it once with expected input. You need tests for the edge cases AI did not think of. Now you know how to write those tests.

In the Building Session, you will write tests for your chatbot’s core functions.

18.4 Building Session

18.4.1 The Spec

Add pytest tests for your chatbot:

  • Test classify_input() with greetings, questions, farewells, and unknown input
  • Test get_response() returns the right type (tuple of string and bool)
  • Test edge cases: empty input, very long input, input with only spaces
  • Test load_history() when the file does not exist

18.4.2 Prompt It

Note: Building Session Prompt

Create a test file tests/test_chatbot.py for my chatbot. Write pytest tests for:

  • classify_input("hello") should return "greeting"
  • classify_input("bye") should return "farewell"
  • classify_input("what?") should return "question"
  • classify_input("random text") should return "default"
  • classify_input("") should return "default" (not crash)
  • get_response() should return a tuple of (str, bool)
  • get_response("bye") should return (str, True), with True as the exit flag

Use the arrange/act/assert pattern. Include at least one test that checks an edge case AI would miss.

18.4.3 Read the Code

Your AI will produce something like this:

"""Tests for the chatbot response system."""
from chatbot import classify_input, get_response


def test_classify_greeting():
    assert classify_input("hello") == "greeting"
    assert classify_input("hi there") == "greeting"
    assert classify_input("Hey!") == "greeting"


def test_classify_farewell():
    assert classify_input("bye") == "farewell"
    assert classify_input("goodbye") == "farewell"


def test_classify_question():
    assert classify_input("what is Python?") == "question"


def test_classify_default():
    assert classify_input("random text") == "default"


def test_classify_empty():
    """Edge case: empty input should not crash."""
    assert classify_input("") == "default"


def test_classify_whitespace():
    """Edge case: whitespace-only input."""
    assert classify_input("   ") == "default"


def test_response_type():
    """get_response returns (str, bool)."""
    response, should_exit = get_response("hello")
    assert isinstance(response, str)
    assert isinstance(should_exit, bool)


def test_response_farewell_exits():
    """Farewell responses should signal exit."""
    _, should_exit = get_response("bye")
    assert should_exit is True


def test_response_greeting_continues():
    """Greeting responses should not exit."""
    _, should_exit = get_response("hello")
    assert should_exit is False

Tip: What to Notice

Each test function does one thing and has a descriptive name. Edge cases (empty input, whitespace) have their own tests with docstrings explaining why. isinstance() checks the type without caring about the specific value. The _ in _, should_exit discards a value you do not need. Run with pytest tests/ from the project root.

18.4.4 Stretch It

Note: Building Session Prompt

Add a test for load_history() that uses pytest’s tmp_path fixture to test with a temporary JSON file. Test both the case where the file exists with valid data and where it does not exist.
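One way the stretch might come out is sketched below. It assumes load_history(path) takes a file path and returns an empty list when the file is missing; your chatbot's actual signature may differ, so adapt the tests to match.

```python
import json


def load_history(path):
    """Load chat history from a JSON file, returning [] if it is missing."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return []


def test_load_history_valid_file(tmp_path):
    # tmp_path is a pytest fixture: a fresh temporary directory per test
    history_file = tmp_path / "history.json"
    history_file.write_text(json.dumps(["hello", "hi there"]))
    assert load_history(history_file) == ["hello", "hi there"]


def test_load_history_missing_file(tmp_path):
    assert load_history(tmp_path / "missing.json") == []
```

Because tmp_path gives each test its own directory, the tests never touch your real history file and clean up after themselves.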

18.5 Your Chatbot So Far

  • Ch 1-16: Full features with persistence and error handling
  • Ch 17: Debug mode
  • Ch 18: pytest test suite for core functions

18.6 Quick Reference

# Test file: tests/test_example.py
def test_addition():
    # Arrange
    x, y = 2, 3
    # Act
    result = x + y
    # Assert
    assert result == 5

# Run tests
# pytest                    # all tests
# pytest tests/test_x.py   # specific file
# pytest -v                 # verbose output

# Common assertions
assert result == expected
assert result is True
assert result is None
assert isinstance(result, str)
assert len(items) > 0

# Expected exceptions
import pytest
def test_invalid_input():
    with pytest.raises(ValueError):
        int("not a number")