9 Case Study: From Notebook to Package with nbdev
This case study parallels Chapter 4 (SimpleBot), but follows a notebook-first workflow. We’ll build TextKit—a text analysis library—entirely in Jupyter notebooks, then ship it as a published Python package using nbdev.
9.1 Project Overview
TextKit is a lightweight text analysis library that provides simple utilities for analyzing text. Key features include:
- Word and character statistics
- Readability scoring (Flesch-Kincaid, etc.)
- Basic sentiment indicators
- Text cleaning utilities
This project is ideal for our notebook case study because:
- Natural notebook fit: Text analysis involves exploration and visualization
- Keeps the theme: Complements SimpleBot’s chatbot focus (analyzing what bots produce)
- Real utility: Functions you’d actually use in data analysis
- Right size: Small enough to complete, complex enough to demonstrate the workflow
By the end of this chapter, you’ll have a package published to PyPI—built entirely from notebooks.
9.2 Why nbdev for This Project?
In Chapter 8, we introduced nbdev as a way to develop libraries from notebooks. Here’s why it fits TextKit:
| Traditional Workflow | nbdev Workflow |
|---|---|
| Write code in .py files | Write code in notebooks |
| Write separate test files | Tests live next to code |
| Write docs separately | Docs generated from notebooks |
| Context switching | Single environment |
For exploratory, iterative work like text analysis, nbdev keeps everything together.
9.3 1. Setting Up the nbdev Project
9.3.1 Installing nbdev
```bash
pip install nbdev
```
9.3.2 Creating the Project
```bash
nbdev_new --lib_name textkit --user yourusername --author "Your Name"
cd textkit
```

This creates:

```
textkit/
├── nbs/                 # Your notebooks live here
│   ├── 00_core.ipynb    # Main module
│   ├── index.ipynb      # Becomes README and docs homepage
│   └── _quarto.yml      # Documentation config
├── textkit/             # Generated Python package (don't edit directly)
├── settings.ini         # Project configuration
├── setup.py             # Generated for pip install
└── pyproject.toml
```
9.3.3 Key Insight: You Edit Notebooks, Not .py Files
The textkit/ directory contains generated code. Your source of truth is nbs/*.ipynb.
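The settings.ini file drives everything nbdev generates, from package metadata to docs and export paths. A trimmed sketch of what it typically contains — exact keys vary by nbdev version, so treat this as illustrative:

```
[DEFAULT]
lib_name = textkit
user = yourusername
author = Your Name
version = 0.0.1
description = Simple text analysis for Python
nbs_path = nbs
lib_path = textkit
```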
9.4 2. Building the Core Module
9.4.1 The First Notebook: 00_core.ipynb
Open nbs/00_core.ipynb in Jupyter. The structure:
```python
# Cell 1: Module header
#| default_exp core
```

This directive tells nbdev: “export cells from this notebook to textkit/core.py”.
9.4.2 Exporting Functions
```python
#| export
def word_count(text: str) -> int:
    """Count words in text.

    Parameters
    ----------
    text : str
        Input text to analyze

    Returns
    -------
    int
        Number of words

    Examples
    --------
    >>> word_count("Hello world")
    2
    >>> word_count("")
    0
    """
    if not text or not text.strip():
        return 0
    return len(text.split())
```

The #| export directive marks this cell for inclusion in the generated module.
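The docstring examples can also be exercised directly with Python's built-in doctest machinery, independent of nbdev. A self-contained sketch (re-defining `word_count` so the cell runs on its own):

```python
import doctest

def word_count(text: str) -> int:
    """Count words in text.

    >>> word_count("Hello world")
    2
    >>> word_count("")
    0
    """
    if not text or not text.strip():
        return 0
    return len(text.split())

# Collect and run the doctests attached to word_count
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner()
for test in finder.find(word_count):
    runner.run(test)

print(f"ran {runner.tries} examples, {runner.failures} failed")
```

nbdev_test does essentially this across every notebook in nbs/.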
9.4.3 Exploring as You Build
This is where notebooks shine. Between exported cells, add exploration:
```python
# Not exported - just exploration
sample_text = """
The quick brown fox jumps over the lazy dog.
This is a sample paragraph for testing our text analysis functions.
"""
print(f"Word count: {word_count(sample_text)}")
```

Your notebook becomes both implementation AND documentation of your thinking.
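Exploration cells you want to run but keep out of the rendered docs can be flagged with nbdev's `#| hide` directive:

```python
#| hide
# Scratch check - executed by nbdev_test but omitted from the docs
sample = "The quick brown fox jumps over the lazy dog."
assert len(sample.split()) == 9
print("scratch check passed")
```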
9.5 3. Adding Tests with nbdev
9.5.1 Inline Doctests
The docstring examples above ARE tests. nbdev runs them automatically:
```bash
nbdev_test
```
9.5.2 Dedicated Test Cells
For more complex tests, add a regular (non-exported) cell with assertions — nbdev_test executes every cell in the notebook:

```python
# Test cell - not exported, but run by nbdev_test
def test_word_count_edge_cases():
    assert word_count("") == 0
    assert word_count("   ") == 0
    assert word_count("one") == 1
    assert word_count("one two three") == 3
    # Unicode handling
    assert word_count("café résumé") == 2

test_word_count_edge_cases()
```
9.5.3 Running Tests
```bash
# Run all tests
nbdev_test

# Run tests for specific notebook
nbdev_test --path nbs/00_core.ipynb
```
9.6 4. Building More Functionality
9.6.1 Readability Scores
```python
#| export
def flesch_reading_ease(text: str) -> float:
    """Calculate Flesch Reading Ease score.

    Scores typically range from 0-100:
    - 90-100: Very easy (5th grade)
    - 60-70: Standard (8th-9th grade)
    - 0-30: Very difficult (college graduate)

    Examples
    --------
    >>> score = flesch_reading_ease("The cat sat on the mat.")
    >>> 90 <= score <= 120  # Simple sentence = high score
    True
    """
    words = word_count(text)
    sentences = sentence_count(text)
    syllables = syllable_count(text)
    if words == 0 or sentences == 0:
        return 0.0
    return (
        206.835
        - 1.015 * (words / sentences)
        - 84.6 * (syllables / words)
    )
```
9.6.2 Helper Functions
```python
#| export
def sentence_count(text: str) -> int:
    """Count sentences in text.

    Examples
    --------
    >>> sentence_count("Hello. World!")
    2
    >>> sentence_count("No punctuation here")
    1
    """
    import re
    if not text.strip():
        return 0
    # Split on sentence-ending punctuation
    sentences = re.split(r'[.!?]+', text)
    # Filter empty strings
    return len([s for s in sentences if s.strip()])
```

```python
#| export
def syllable_count(text: str) -> int:
    """Estimate syllable count (English approximation).

    Examples
    --------
    >>> syllable_count("hello")
    2
    >>> syllable_count("beautiful")
    3
    """
    import re
    text = text.lower()
    words = text.split()
    count = 0
    for word in words:
        word = re.sub(r'[^a-z]', '', word)
        if not word:
            continue
        # Simple heuristic: count vowel groups
        syllables = len(re.findall(r'[aeiouy]+', word))
        # Adjust for silent e
        if word.endswith('e') and syllables > 1:
            syllables -= 1
        count += max(1, syllables)
    return count
```
9.7 5. Visualizations in Your Notebook
Notebooks excel at visual exploration. Add analysis cells (not exported):
```python
# Visualization - not exported, but shows in docs
import matplotlib.pyplot as plt

def visualize_readability(texts: dict[str, str]):
    """Compare readability across multiple texts."""
    names = list(texts.keys())
    scores = [flesch_reading_ease(t) for t in texts.values()]

    plt.figure(figsize=(10, 5))
    plt.barh(names, scores, color='steelblue')
    plt.xlabel('Flesch Reading Ease Score')
    plt.title('Readability Comparison')
    plt.axvline(x=60, color='red', linestyle='--', label='Standard difficulty')
    plt.legend()
    plt.tight_layout()
    plt.show()

# Demo with sample texts
samples = {
    "Children's book": "The cat sat. The dog ran. They played.",
    "News article": "The committee announced sweeping regulatory changes affecting multiple industries.",
    "Academic paper": "The epistemological ramifications of quantum indeterminacy necessitate reconceptualization.",
}
visualize_readability(samples)
```

This visualization appears in your generated documentation—showing users what the library can do.
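The chart's values can be sanity-checked by hand. For the children's-book sample, the tokenizers above find 8 words, 3 sentences, and (by the vowel-group heuristic) 8 syllables, so the Flesch formula gives:

```python
# Hand-computed Flesch score for "The cat sat. The dog ran. They played."
# Counts follow the word_count / sentence_count / syllable_count logic above
words, sentences, syllables = 8, 3, 8
score = 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)
print(round(score, 1))
```

Note the result lands above 100: the formula isn't capped, so trivially simple text can exceed the nominal 0-100 range (which is why the earlier doctest checks `90 <= score <= 120`).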
9.8 6. Building the Text Analyzer Class
For a more complete API, add a class that combines functionality:
```python
#| export
class TextAnalyzer:
    """Analyze text with multiple metrics.

    Examples
    --------
    >>> analyzer = TextAnalyzer("Hello world. How are you?")
    >>> analyzer.word_count
    5
    >>> analyzer.sentence_count
    2
    """
    def __init__(self, text: str):
        self.text = text
        self._word_count = None
        self._sentence_count = None

    @property
    def word_count(self) -> int:
        if self._word_count is None:
            self._word_count = word_count(self.text)
        return self._word_count

    @property
    def sentence_count(self) -> int:
        if self._sentence_count is None:
            self._sentence_count = sentence_count(self.text)
        return self._sentence_count

    @property
    def avg_words_per_sentence(self) -> float:
        if self.sentence_count == 0:
            return 0.0
        return self.word_count / self.sentence_count

    @property
    def readability(self) -> float:
        return flesch_reading_ease(self.text)

    def summary(self) -> dict:
        """Return all metrics as a dictionary."""
        return {
            "words": self.word_count,
            "sentences": self.sentence_count,
            "avg_words_per_sentence": round(self.avg_words_per_sentence, 1),
            "flesch_reading_ease": round(self.readability, 1),
        }
```
9.9 7. Adding an Interactive Widget
End with something users can interact with—demonstrating the notebook as an application:
```python
# Interactive demo (not exported - for notebook/docs only)
import ipywidgets as widgets
from IPython.display import display

def create_analyzer_widget():
    """Create an interactive text analyzer."""
    text_input = widgets.Textarea(
        value='Enter your text here...',
        placeholder='Paste text to analyze',
        description='Text:',
        layout=widgets.Layout(width='100%', height='150px')
    )
    output = widgets.Output()

    def analyze(change):
        output.clear_output()
        with output:
            if text_input.value.strip():
                analyzer = TextAnalyzer(text_input.value)
                results = analyzer.summary()
                print("📊 Analysis Results")
                print("-" * 30)
                for key, value in results.items():
                    print(f"{key.replace('_', ' ').title()}: {value}")

    text_input.observe(analyze, names='value')
    display(widgets.VBox([
        widgets.HTML("<h3>📝 Text Analyzer</h3>"),
        text_input,
        output
    ]))

# Show the widget
create_analyzer_widget()
```

When viewed in Colab or Binder, users can interact with your library without installing anything.
9.10 8. Generating the Package
9.10.1 Export to Python Modules
```bash
nbdev_export
```

This generates textkit/core.py from your notebook’s #| export cells.
9.10.2 Verify Everything Works
```bash
# Run tests
nbdev_test

# Check for issues
nbdev_clean
nbdev_prepare
```
9.10.3 The Generated Code
Look at textkit/core.py—it contains clean Python code generated from your notebooks, with proper imports and structure.
9.11 9. Documentation
9.11.1 The Index Notebook
nbs/index.ipynb becomes both your README.md and documentation homepage. Include:
- Installation instructions
- Quick start example
- Feature overview
In nbs/index.ipynb:

````markdown
# TextKit
> Simple text analysis for Python

## Installation

```bash
pip install textkit
```

## Quick Start

```python
from textkit.core import TextAnalyzer

text = "Your text here. Analyze it easily."
analyzer = TextAnalyzer(text)
print(analyzer.summary())
```
````

9.11.2 Build Documentation

```bash
nbdev_docs
```

This generates a Quarto-based documentation site in _docs/.
9.13 10. Publishing to PyPI
9.13.1 Prepare for Release
```bash
# Clean and prepare
nbdev_prepare

# Build distribution
python -m build
```
9.13.2 Publish
```bash
# Test PyPI first
twine upload --repository testpypi dist/*

# Then real PyPI
twine upload dist/*
```
9.13.3 The Result

```bash
pip install textkit
```

You’ve shipped a Python package—developed entirely in notebooks.
9.15 Comparing Workflows
Here’s how this case study compares to the SimpleBot approach (Chapter 4):
| Aspect | SimpleBot (Scripts) | TextKit (nbdev) |
|---|---|---|
| Source files | .py in src/ | .ipynb in nbs/ |
| Tests | Separate tests/ directory | Inline with code |
| Documentation | Separate docs/ | Generated from notebooks |
| Exploration | Separate REPL/scratch files | Integrated in notebooks |
| Output | Package on PyPI | Package on PyPI |
| Best for | Traditional dev, teams | Exploratory, teaching |
Both workflows produce the same result: a published package. Choose based on how you like to work.
9.16 When to Use This Workflow
The nbdev approach works best when:
- Exploration is central: You’re figuring things out as you build
- Teaching matters: Others will learn from your notebooks
- Docs should show execution: You want live examples in documentation
- Solo or small team: Git conflicts in notebooks are real
Consider traditional scripts when:
- Large teams: Notebook diffs are harder to review
- Complex architecture: Many interconnected modules
- Heavy IDE reliance: Refactoring tools work better with .py files
- Existing codebase: Converting to nbdev is non-trivial
9.17 Summary
- nbdev inverts the workflow: Notebooks are source, .py files are generated
- Tests live with code: Doctests and in-notebook test cells eliminate context switching
- Exploration becomes documentation: Your investigative work helps users
- Same destination: Published package, installable via pip
- Different journey: Iterative, visual, integrated
9.18 Exercises
1. Extend TextKit: Add a `sentiment_words()` function that counts positive/negative words from a simple word list. Include doctests.
2. Add a notebook: Create `01_advanced.ipynb` with functions for text comparison (e.g., similarity between two texts).
3. Publish to TestPyPI: Go through the full publication workflow to TestPyPI.
4. Create a Voilà dashboard: Convert the interactive widget section into a standalone Voilà dashboard.
5. Compare workflows: Take one function from TextKit and rewrite it in the traditional script workflow. Reflect on the differences.