18  Testing

18.1 The Wall

AI said the code worked. You ran it, and it seemed fine. Then you gave it to someone else and it crashed on inputs you never tried. AI tested its own code with the happy path only: the obvious, expected inputs. It missed every edge case.

You asked AI to “add tests” and it generated tests that passed trivially. One test checked that add(2, 3) returns 5. It did not test what happens when you pass None, a string, or a negative number. The tests gave you false confidence.

This chapter fixes that.

18.2 Thinking Session

18.2.1 Getting Oriented

Note: Thinking Session Prompt

Why do we test code? I know AI can generate tests, but what makes a test actually useful? What is the difference between a test that gives you confidence and a test that just takes up space? And what is pytest, and why do Python developers prefer it over the built-in unittest module?

Your AI should explain: good tests check edge cases, not just happy paths. A test is useful if its failure tells you something is broken. pytest is preferred because it uses plain assert statements instead of special methods (assertEqual, assertTrue), making tests more readable.
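To see the readability difference for yourself, here is the same check written both ways. This is a sketch; the add function is a made-up example, not from the chapter.

```python
import unittest


def add(a, b):
    return a + b


# unittest style: tests live in a class and use special assertion methods
class TestAdd(unittest.TestCase):
    def test_add(self):
        self.assertEqual(add(2, 3), 5)
        self.assertIsInstance(add(2, 3), int)


# pytest style: a plain function with plain assert statements
def test_add():
    assert add(2, 3) == 5
    assert isinstance(add(2, 3), int)
```

Both tests check the same thing; the pytest version reads like ordinary Python, which is why the rest of this chapter uses it.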

18.2.2 Go Deeper

Note: Thinking Session Prompt

How do I write a pytest test? Walk me through the structure: where do test files go, how do I name them, what does a test function look like, and how do I run tests? Also explain what “arrange, act, assert” means.

Tip: What to Look For

Test files go in a tests/ directory, named test_*.py. Test functions start with test_. Arrange-Act-Assert: set up the input (arrange), call the function (act), check the result (assert). pytest discovers and runs tests automatically. Your AI should show this structure, not just individual assert statements.
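The structure your AI should show might look like this. The function and file names here (apply_discount, tests/test_cart.py) are illustrative, not from your chatbot.

```python
# tests/test_cart.py


def apply_discount(price, percent):
    """Return the price after a percentage discount."""
    return price * (1 - percent / 100)


def test_apply_discount():
    # Arrange: set up the input
    price, percent = 100.0, 20
    # Act: call the function under test
    result = apply_discount(price, percent)
    # Assert: check the result
    assert result == 80.0
```

Running pytest from the project root discovers this file by its test_ prefix and runs every test_ function inside it.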

Note: Thinking Session Prompt

What are edge cases and how do I identify them? When AI generates a function, what inputs should I always test? Give me a checklist I can use for any function.
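Here is what such a checklist looks like when applied to code. The parse_age function below is a hypothetical example, not part of the chatbot; each test covers one checklist item (empty input, None, boundary values).

```python
import pytest


def parse_age(text):
    """Parse an age string, rejecting missing or out-of-range values."""
    if text is None:
        raise TypeError("age is required")
    value = int(text)  # non-numeric strings raise ValueError here
    if value < 0 or value > 150:
        raise ValueError("age out of range")
    return value


def test_empty_string_rejected():
    with pytest.raises(ValueError):
        parse_age("")


def test_none_rejected():
    with pytest.raises(TypeError):
        parse_age(None)


def test_boundary_values():
    assert parse_age("0") == 0
    assert parse_age("150") == 150
```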

18.2.3 Challenge It

Note: Thinking Session Prompt

Here is a function and its AI-generated tests. Are the tests good enough?

def calculate_average(numbers):
    return sum(numbers) / len(numbers)

def test_average():
    assert calculate_average([1, 2, 3]) == 2.0
    assert calculate_average([10, 20]) == 15.0

Tip: What to Look For

The tests miss: empty list (ZeroDivisionError), single-item list, negative numbers, very large numbers, non-numeric values. The function itself has a bug: it crashes on empty input. Good tests would catch this. AI-generated tests often cover only the happy path.
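One way to close the gap might look like this: guard the empty-list case in the function, then test it alongside the other missed inputs. This is a sketch of the fix, not the only valid design (you could also return None or 0.0 for an empty list).

```python
import pytest


def calculate_average(numbers):
    # Guard against the empty-list bug the original tests missed
    if not numbers:
        raise ValueError("cannot average an empty list")
    return sum(numbers) / len(numbers)


def test_average_empty_list():
    with pytest.raises(ValueError):
        calculate_average([])


def test_average_single_item():
    assert calculate_average([7]) == 7.0


def test_average_negative_numbers():
    assert calculate_average([-1, -2, -3]) == -2.0
```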

18.2.4 What You Should Have Learned

  • Good tests check edge cases, not just expected inputs
  • pytest uses plain assert, simple and readable
  • Test files: tests/test_*.py, functions: test_*()
  • Arrange, Act, Assert. The test structure pattern
  • Edge case checklist: empty input, None, wrong type, boundary values, single item
  • Tests give confidence to modify code. That is their real value

18.3 The Gap

Testing is what separates code that works from code you can trust. When AI generates code, you cannot verify it by running it once with expected input. You need tests for the edge cases AI did not think of. Now you know how to write those tests.

In the Building Session, you will write tests for your chatbot’s core functions.

18.4 Building Session

18.4.1 The Spec

Add pytest tests for your chatbot:

  • Test classify_input() with greetings, questions, farewells, and unknown input
  • Test get_response() returns the right type (tuple of string and bool)
  • Test edge cases: empty input, very long input, input with only spaces
  • Test load_history() when the file does not exist

18.4.2 Prompt It

Note: Building Session Prompt

Create a test file tests/test_chatbot.py for my chatbot. Write pytest tests for:

  • classify_input("hello") should return "greeting"
  • classify_input("bye") should return "farewell"
  • classify_input("what?") should return "question"
  • classify_input("random text") should return "default"
  • classify_input("") should return "default" (not crash)
  • get_response() should return a tuple of (str, bool)
  • get_response("bye") should return (str, True), with True as the exit flag

Use the arrange/act/assert pattern. Include at least one test that checks an edge case AI would miss.

18.4.3 Read the Code

Your AI will produce something like this:

"""Tests for the chatbot response system."""
from chatbot import classify_input, get_response


def test_classify_greeting():
    assert classify_input("hello") == "greeting"
    assert classify_input("hi there") == "greeting"
    assert classify_input("Hey!") == "greeting"


def test_classify_farewell():
    assert classify_input("bye") == "farewell"
    assert classify_input("goodbye") == "farewell"


def test_classify_question():
    assert classify_input("what is Python?") == "question"


def test_classify_default():
    assert classify_input("random text") == "default"


def test_classify_empty():
    """Edge case: empty input should not crash."""
    assert classify_input("") == "default"


def test_classify_whitespace():
    """Edge case: whitespace-only input."""
    assert classify_input("   ") == "default"


def test_response_type():
    """get_response returns (str, bool)."""
    response, should_exit = get_response("hello")
    assert isinstance(response, str)
    assert isinstance(should_exit, bool)


def test_response_farewell_exits():
    """Farewell responses should signal exit."""
    _, should_exit = get_response("bye")
    assert should_exit is True


def test_response_greeting_continues():
    """Greeting responses should not exit."""
    _, should_exit = get_response("hello")
    assert should_exit is False

Tip: What to Notice

Each test function does one thing and has a descriptive name. Edge cases (empty input, whitespace) have their own tests with docstrings explaining why. isinstance() checks the type without caring about the specific value. The _ in _, should_exit discards a value you do not need. Run with pytest tests/ from the project root.

18.4.4 Stretch It

Note: Building Session Prompt

Add a test for load_history() that uses pytest’s tmp_path fixture to test with a temporary JSON file. Test both the case where the file exists with valid data and where it does not exist.
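One way the stretch might come out is sketched below. It assumes load_history(path) takes a file path and returns an empty list when the file is missing; your chatbot's actual signature may differ, so adapt the tests to match.

```python
import json


def load_history(path):
    """Load chat history from a JSON file, returning [] if it is missing."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return []


def test_load_history_valid_file(tmp_path):
    # tmp_path is a pytest fixture: a fresh temporary directory per test
    history_file = tmp_path / "history.json"
    history_file.write_text(json.dumps(["hello", "hi there"]))
    assert load_history(history_file) == ["hello", "hi there"]


def test_load_history_missing_file(tmp_path):
    assert load_history(tmp_path / "missing.json") == []
```

Because tmp_path gives each test its own directory, the tests never touch your real history file and clean up after themselves.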

18.5 Your Chatbot So Far

  • Ch 1-16: Full features with persistence and error handling
  • Ch 17: Debug mode
  • Ch 18: pytest test suite for core functions

18.6 Quick Reference

# Test file: tests/test_example.py
def test_addition():
    # Arrange
    x, y = 2, 3
    # Act
    result = x + y
    # Assert
    assert result == 5

# Run tests
# pytest                    # all tests
# pytest tests/test_x.py   # specific file
# pytest -v                 # verbose output

# Common assertions
assert result == expected
assert result is True
assert result is None
assert isinstance(result, str)
assert len(items) > 0

# Expected exceptions
import pytest
def test_invalid_input():
    with pytest.raises(ValueError):
        int("not a number")