ADR 0003: Workflow Configuration Format

Status

Accepted

Context

The Curriculum Curator system relies on a workflow engine to orchestrate the generation, validation, remediation, and output of educational content. We need a standardized format for defining these workflows that is:

Human-readable and easy to edit
Extensible to accommodate future workflow patterns
Capable of being loaded dynamically at runtime
Provides defaults and overrides for common properties

Decision

We will use YAML as the configuration format for workflow definitions with a standardized schema that supports defaults and step-specific overrides. Workflow definitions will be stored in standalone files that can be discovered at runtime.

Schema

The workflow configuration format will follow this structure:

# Top-level workflow definition
name: "workflow_name"  # Required: Unique identifier for the workflow (should follow snake_case)
description: "Workflow description"  # Required: Human-readable description of the workflow

# Optional global defaults that apply to all steps unless overridden
defaults:
  llm_model_alias: "default_smart"  # Default LLM model to use for all prompt steps
  output_format: "raw"  # Default output format for all prompt steps
  validators: ["readability", "structure"]  # Default validators for all validation steps
  remediators: ["format_corrector", "sentence_splitter"]  # Default remediators

# Required: List of steps defining the workflow execution
steps:
  # Each step defines an action in the workflow
  - name: "step_name"  # Required: Unique name for this step
    type: "prompt"  # Required: Step type (prompt, validation, remediation, output)
    # Additional parameters based on step type...

Step Types and Parameters

Prompt Step

Used to generate content by executing a prompt with an LLM.

- name: "generate_overview"
  type: "prompt"
  prompt: "course/overview.txt"  # Required: Path to the prompt template
  llm_model_alias: "default_smart"  # Optional: LLM model to use
  output_format: "raw"  # Optional: Format for output (raw, json, list, html)
  output_variable: "course_overview"  # Required: Variable to store the result
  transformation_rules: {}  # Optional: Special rules for content transformation

Validation Step

Used to validate content against a set of rules.

- name: "validate_lecture"
  type: "validation"
  content_variable: "lecture_content"  # Required: Content to validate
  validators: ["readability", "structure"]  # Required: Validators to apply
  output_variable: "validation_issues"  # Required: Variable to store issues
  validation_config:  # Optional: Additional configuration for validators
    similarity:
      threshold: 0.8
    structure:
      required_sections: ["introduction", "conclusion"]

Remediation Step

Used to fix issues found during validation.

- name: "remediate_lecture"
  type: "remediation"
  content_variable: "lecture_content"  # Required: Content to remediate
  issues_variable: "validation_issues"  # Required: Issues to fix
  output_variable: "remediated_lecture"  # Required: Variable to store fixed content
  actions_variable: "remediation_actions"  # Optional: Variable to store actions taken
  remediation_config:  # Optional: Additional configuration for remediators
    format_corrector:
      aggressive: true

Output Step

Used to generate output files from content.

- name: "generate_outputs"
  type: "output"
  output_mapping:  # Required: Maps variables to output filenames
    course_overview: "overview.md"
    remediated_lecture: "lecture.md"
  output_dir: "output/{course_slug}/{module_id}"  # Required: Output directory
  output_variable: "output_files"  # Optional: Variable to store output file paths

Future Step Types (Planned)

Conditional Step

Executes steps conditionally based on a condition.

- name: "conditional_step"
  type: "conditional"
  condition: "validation_issues|length > 0"  # Required: Condition to evaluate
  if_steps:  # Required: Steps to execute if condition is true
    - name: "remediate_content"
      type: "remediation"
      # ...
  else_steps:  # Optional: Steps to execute if condition is false
    - name: "skip_remediation"
      type: "prompt"
      # ...

Loop Step

Executes steps repeatedly for each item in a collection.

- name: "process_modules"
  type: "loop"
  items_variable: "modules"  # Required: Variable containing items to iterate over
  item_variable: "current_module"  # Required: Variable to store current item
  steps:  # Required: Steps to execute for each item
    - name: "process_module"
      type: "prompt"
      # ...

Parallel Step

Executes steps in parallel.

- name: "parallel_processing"
  type: "parallel"
  max_concurrency: 3  # Optional: Maximum number of concurrent tasks
  steps:  # Required: Steps to execute in parallel
    - name: "task1"
      type: "prompt"
      # ...
    - name: "task2"
      type: "prompt"
      # ...

Discovery Mechanism

Workflow configurations will be discovered from files in the following locations: 1. /examples/workflows/ - For built-in example workflows 2. /workflows/ - For user-defined workflows

YAML files in these directories will be loaded automatically when the application starts.

Implementation Details

The workflow engine will support defaults by merging workflow-level defaults with step-specific configurations.
Step-specific configurations will override workflow-level defaults.
The system will validate workflow configurations against a schema to catch errors early.
Workflow configurations will be versionable and shareable between users.