Workflow Configuration Validation
Curriculum Curator provides a robust validation system for workflow configurations to catch errors early in the development process.
Overview
The validation system uses Pydantic models to enforce schema validation for workflow configurations. This ensures that:
- All required fields are present
- Field values have the correct types
- Step-specific fields are used correctly
- Defaults are applied properly
- Detailed error messages are provided when validation fails
Benefits
Using the validation system provides several benefits:
- Early Error Detection: Configuration errors are caught before execution begins
- Clear Error Messages: Detailed error messages identify exactly what's wrong
- Better Developer Experience: Autocomplete and type hints help when writing configurations
- Safer Production Deployments: Validated configurations are less likely to fail at runtime
- Reduced LLM Costs: Catching errors early prevents unnecessary LLM calls for invalid workflows
Validating Workflows
Using the Validation Tool
Curriculum Curator includes a validation tool to check workflow configurations:
# Validate a single workflow file
python -m curriculum_curator.tools.validate_workflow examples/workflows/minimal_module.yaml
# Validate all discovered workflows
python -m curriculum_curator.tools.validate_workflow --all
Programmatic Validation
You can also validate workflows programmatically:
from curriculum_curator.workflow.workflows import load_workflow_config
# Load and validate a workflow
workflow_config = load_workflow_config("examples/workflows/minimal_module.yaml")
if workflow_config:
print(f"Workflow '{workflow_config.name}' is valid")
else:
print("Workflow validation failed")
Configuration Schema
The workflow configuration schema is defined using Pydantic models:
Top-Level Structure
name: "workflow_name" # Required: Unique identifier for the workflow
description: "Description" # Required: Human-readable description of the workflow
defaults: # Optional: Defaults to apply to all steps unless overridden
llm_model_alias: "default_smart" # Default LLM model for prompt steps
output_format: "raw" # Default output format for prompt steps
validators: ["readability", "structure"] # Default validators for validation steps
steps: # Required: List of steps to execute
- name: "step_name" # Each step must have a name and type
type: "prompt" # Type determines what other fields are required
# ...additional fields based on step type
Step Types and Required Fields
Prompt Step
- name: "generate_content"
type: "prompt"
prompt: "path/to/prompt.txt" # Required: Path to prompt template
output_variable: "result_variable" # Required: Where to store result
llm_model_alias: "default_smart" # Optional: Override default
output_format: "raw" # Optional: Override default (raw, json, list, html)
transformation_rules: {} # Optional: Additional transformation rules
Validation Step
- name: "validate_content"
type: "validation"
content_variable: "content_to_validate" # Required: Content to validate
output_variable: "validation_issues" # Required: Where to store issues
validators: ["readability", "structure"] # Required: Validators to apply
validation_config: # Optional: Additional validator configuration
similarity:
threshold: 0.8
Remediation Step
- name: "fix_issues"
type: "remediation"
content_variable: "content_to_fix" # Required: Content to fix
issues_variable: "validation_issues" # Required: Issues to fix
output_variable: "fixed_content" # Required: Where to store fixed content
actions_variable: "remediation_actions" # Optional: Store remediation actions
remediation_config: {} # Optional: Additional remediator configuration
Output Step
- name: "generate_files"
type: "output"
output_mapping: # Required: Maps variables to file names
variable_name: "output_file.md"
output_dir: "output/path" # Required: Output directory
output_variable: "output_files" # Optional: Store output file paths
Common Validation Errors
Here are some common validation errors and how to fix them:
Missing Required Fields
This error means a required field is missing. Check the schema to identify which field is required for the step type.
Type Errors
This error means a field has the wrong type. Make sure your field values match the expected types in the schema.
Invalid Enum Values
This error means you're using an invalid value for a field with limited options (like output_format or step type).
Unknown Fields
This error means you're using fields that aren't defined in the schema. Check for typos or remove the unknown fields.
Extension and Customization
The validation system is designed to be extended as new step types are added to the workflow engine. When adding a new step type:
- Define a new Pydantic model for the step type in
models.py
- Add the new model to the
StepConfig
union type - Update the
parse_steps
method to handle the new step type - Update the workflow engine to create and execute the new step type
This ensures that validation remains robust as the system evolves.