API Reference¶

Auto-generated reference for the public hands_on_ai API, built from the package's docstrings. For task-oriented walkthroughs, see the guides under Learn; this page is the exhaustive symbol-level reference.

Package¶

hands_on_ai: Your Hands-on AI Toolkit

A modular toolkit for learning AI concepts through hands-on experimentation.

Chat¶

The chat module provides the high-level conversational API, the bot personalities, and a stateful Conversation helper.

`chat.get_response`¶

Core response functionality for the chat module.

`chat_completion` ¶

Send a list of chat messages to the LLM and return (content, usage).

This is the low-level, multi-message primitive used by both `get_response` (single-turn) and `Conversation` (multi-turn).

Parameters:

Name	Type	Description	Default
`messages`	`list`	OpenAI-style message dicts, e.g. `[{"role": "system", ...}, {"role": "user", ...}]`.	required
`model`	`str`	LLM model to use (defaults to config setting).	`None`
`personality`	`str`	Used to pick a fallback message during retries.	`'friendly'`
`stream`	`bool`	Whether to request streaming output.	`False`
`retries`	`int`	Number of attempts before giving up.	`2`

Returns:

Name	Type	Description
`tuple`		`(content, usage)` where `usage` is a token-count dict, or `None` when the provider does not report usage (e.g. streaming).
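The retry behaviour described by `retries` and `personality` can be illustrated in isolation. The following is a minimal sketch, not the package's code: `call_with_retries`, `flaky_request`, `fallbacks`, and `pause` are hypothetical names standing in for the real client call, the personality fallback messages, and `time.sleep`.

```python
import random


def call_with_retries(request, retries=2, fallbacks=("Retrying...",), pause=lambda seconds: None):
    """Call request() up to `retries` times and return a (content, usage) tuple.

    `request`, `fallbacks`, and `pause` are hypothetical stand-ins for the
    real client call, the personality fallback messages, and time.sleep.
    """
    for attempt in range(1, retries + 1):
        try:
            return request()
        except Exception as exc:
            if attempt < retries:
                # Not the last attempt: show a fallback line, wait, try again.
                print(random.choice(fallbacks))
                pause(1.0)
            else:
                # Last attempt failed: report the error instead of raising.
                return (f"❌ Error: {exc}", None)


calls = {"count": 0}


def flaky_request():
    """Fail on the first call, succeed on the second."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise TimeoutError("model is still loading")
    return ("Hello!", {"total_tokens": 12})


result = call_with_retries(flaky_request, retries=2)
always_failing = call_with_retries(lambda: 1 / 0, retries=1)
```

Note that `retries` counts attempts rather than extra tries: with `retries=1` the first failure is already the last one, so the caller gets an error string and `None` for usage.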

Source code in src/hands_on_ai/chat/get_response.py

def chat_completion(
    messages: list,
    model: str = None,
    personality: str = "friendly",
    stream: bool = False,
    retries: int = 2,
):
    """
    Send a list of chat messages to the LLM and return ``(content, usage)``.

    This is the low-level, multi-message primitive used by both
    :func:`get_response` (single-turn) and :class:`Conversation` (multi-turn).

    Args:
        messages: OpenAI-style message dicts, e.g.
            ``[{"role": "system", ...}, {"role": "user", ...}]``.
        model: LLM model to use (defaults to config setting).
        personality: Used to pick a fallback message during retries.
        stream: Whether to request streaming output.
        retries: Number of attempts before giving up.

    Returns:
        tuple: ``(content, usage)`` where ``usage`` is a token-count dict, or
        ``None`` when the provider does not report usage (e.g. streaming).
    """
    if model is None:
        from ..config import get_model
        model = get_model()

    _warm_up(model)

    global _last_usage

    # Stream if explicitly requested, or if live REPL printing is enabled.
    do_stream = stream or _print_stream

    for attempt in range(1, retries + 1):
        try:
            client = _build_client()
            response = client.chat.completions.create(
                model=model,
                messages=messages,
                stream=do_stream,
                timeout=10,
            )

            if do_stream:
                # Collect all chunks (usage is not available when streaming),
                # printing each one live when REPL streaming is on.
                content = ""
                for chunk in response:
                    delta = chunk.choices[0].delta.content
                    if delta:
                        if _print_stream:
                            print(delta, end="", flush=True)
                        content += delta
                if _print_stream:
                    print()  # final newline after the streamed response
                _last_usage = None
                return (content or "⚠️ No response from model.", None)

            content = response.choices[0].message.content or "⚠️ No response from model."
            usage = _usage_dict(response)
            _last_usage = usage
            return (content, usage)

        except Exception as e:
            log.warning(f"Error during request (attempt {attempt}): {e}")
            if attempt < retries:
                fallback = _fallbacks.get(personality, _fallbacks.get("default", ["Retrying..."]))
                print(random.choice(fallback))
                time.sleep(1.0)
            else:
                return (f"❌ Error: {str(e)}", None)

`get_last_usage` ¶

Return token usage from the most recent get_response/bot call (or None).

Source code in src/hands_on_ai/chat/get_response.py

def get_last_usage():
    """Return token usage from the most recent get_response/bot call (or None)."""
    return _last_usage

`get_response` ¶

Send a single prompt to the LLM and retrieve the model's response.

This is a stateless, single-turn helper: it sends exactly one system message and one user message, with no memory of previous calls. For a multi-turn chat that remembers history, use `Conversation`.

Parameters:

Name	Type	Description	Default
`prompt`	`str`	The text prompt to send to the model	required
`model`	`str`	LLM model to use (defaults to config setting)	`None`
`system`	`str`	System message defining bot behavior	`'You are a helpful assistant.'`
`personality`	`str`	Used for fallback character during retries	`'friendly'`
`stream`	`bool`	Whether to request streaming output (default False)	`False`
`retries`	`int`	Total number of attempts before giving up (the default of 2 allows one retry)	`2`
`return_usage`	`bool`	If True, return `(response, usage)` where `usage` is a token-count dict (or None if unavailable)	`False`

Returns:

Name	Type	Description
`str`	`str`	AI response or error message. If `return_usage` is True, a `(response, usage)` tuple instead.

Source code in src/hands_on_ai/chat/get_response.py

def get_response(
    prompt: str,
    model: str = None,
    system: str = "You are a helpful assistant.",
    personality: str = "friendly",
    stream: bool = False,
    retries: int = 2,
    return_usage: bool = False,
) -> str:
    """
    Send a single prompt to the LLM and retrieve the model's response.

    This is a stateless, single-turn helper: it sends exactly one system
    message and one user message, with no memory of previous calls. For a
    multi-turn chat that remembers history, use :class:`Conversation`.

    Args:
        prompt (str): The text prompt to send to the model
        model (str): LLM model to use (defaults to config setting)
        system (str): System message defining bot behavior
        personality (str): Used for fallback character during retries
        stream (bool): Whether to request streaming output (default False)
        retries (int): Number of times to retry on error
        return_usage (bool): If True, return ``(response, usage)`` where
            ``usage`` is a token-count dict (or None if unavailable)

    Returns:
        str: AI response or error message. If ``return_usage`` is True, a
        ``(response, usage)`` tuple instead.
    """
    # Check for empty prompt
    if not prompt.strip():
        return ("⚠️ Empty prompt.", None) if return_usage else "⚠️ Empty prompt."

    # Resolve the model now so it is part of the cache key.
    if model is None:
        from ..config import get_model
        model = get_model()

    # Opt-in disk cache (HANDS_ON_AI_CACHE). Skip while streaming to the REPL,
    # where the printed output, not the return value, is what the user sees.
    from .. import cache
    use_cache = not stream and not _print_stream
    if use_cache:
        cached = cache.get(model, system, prompt)
        if cached is not None:
            return (cached, None) if return_usage else cached

    messages = [
        {"role": "system", "content": system},
        {"role": "user", "content": prompt},
    ]
    content, usage = chat_completion(
        messages,
        model=model,
        personality=personality,
        stream=stream,
        retries=retries,
    )

    # Only cache real responses, not error/empty placeholders.
    if use_cache and not content.startswith(("❌", "⚠️")):
        cache.put(model, system, prompt, content)

    return (content, usage) if return_usage else content

`set_stream_printing` ¶

Enable or disable live token printing to stdout (used by the chat REPL).

Source code in src/hands_on_ai/chat/get_response.py

def set_stream_printing(enabled: bool = True):
    """Enable or disable live token printing to stdout (used by the chat REPL)."""
    global _print_stream
    _print_stream = enabled

`stream_response` ¶

Like `get_response`, but yields the response in chunks as it arrives.

This lets you show text as the model generates it, instead of waiting for the whole answer:

for chunk in stream_response("Tell me a short story"):
    print(chunk, end="", flush=True)

Parameters:

Name	Type	Description	Default
`prompt`	`str`	The text prompt to send to the model.	required
`model`	`str`	LLM model to use (defaults to config setting).	`None`
`system`	`str`	System message defining bot behavior.	`'You are a helpful assistant.'`
`personality`	`str`	Unused here; kept for signature parity with get_response.	`'friendly'`
`retries`	`int`	Unused here; streaming makes a single attempt.	`2`

Yields:

Name	Type	Description
`str`		Pieces of the response as they arrive.
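A common need when streaming is to show text live and still keep the full answer afterwards. The sketch below assumes nothing about the package beyond "an iterator of strings": `fake_stream` is a hypothetical stand-in for `stream_response`, and `collect` is an illustrative helper, not part of the API.

```python
def fake_stream(text, size=5):
    """Hypothetical stand-in for stream_response: yield `text` in small pieces."""
    for start in range(0, len(text), size):
        yield text[start:start + size]


def collect(chunks, show=True):
    """Print each chunk as it arrives and return the full text."""
    parts = []
    for chunk in chunks:
        if show:
            print(chunk, end="", flush=True)
        parts.append(chunk)
    if show:
        print()  # final newline once the stream ends
    return "".join(parts)


full_text = collect(fake_stream("Once upon a time, a robot learned to paint."))
```

Since the real generator yields an error string rather than raising, a caller that needs to detect failure would have to inspect the collected text.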

Source code in src/hands_on_ai/chat/get_response.py

def stream_response(
    prompt: str,
    model: str = None,
    system: str = "You are a helpful assistant.",
    personality: str = "friendly",
    retries: int = 2,
):
    """
    Like :func:`get_response`, but yields the response in chunks as it arrives.

    This lets you show text as the model generates it, instead of waiting for the
    whole answer:

        for chunk in stream_response("Tell me a short story"):
            print(chunk, end="", flush=True)

    Args:
        prompt: The text prompt to send to the model.
        model: LLM model to use (defaults to config setting).
        system: System message defining bot behavior.
        personality: Unused here; kept for signature parity with get_response.
        retries: Unused here; streaming makes a single attempt.

    Yields:
        str: Pieces of the response as they arrive.
    """
    if not prompt.strip():
        yield "⚠️ Empty prompt."
        return

    if model is None:
        from ..config import get_model
        model = get_model()

    _warm_up(model)

    try:
        client = _build_client()
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": system},
                {"role": "user", "content": prompt},
            ],
            stream=True,
            timeout=10,
        )
        for chunk in response:
            delta = chunk.choices[0].delta.content
            if delta:
                yield delta
    except Exception as e:
        log.warning(f"Error during streaming request: {e}")
        yield f"❌ Error: {str(e)}"

`chat.bots`¶

Bot personality discovery and retrieval.

`get_bot` ¶

Retrieve a specific bot by name.

Parameters:

Name	Type	Description	Default
`name`	`str`	Bot name	required

Returns:

Name	Type	Description
`function`		Bot function or None if not found

Source code in src/hands_on_ai/chat/bots.py

def get_bot(name):
    """
    Retrieve a specific bot by name.

    Args:
        name (str): Bot name

    Returns:
        function: Bot function or None if not found
    """
    return list_available_bots().get(name)

`get_bot_description` ¶

Get the first non-empty line of a bot's docstring.

Parameters:

Name	Type	Description	Default
`bot_func`	`function`	Bot function	required

Returns:

Name	Type	Description
`str`		Bot description
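To make the "first non-empty line" rule concrete, here is a small self-contained sketch. `first_doc_line` follows the same logic as the listing below, while `pirate_bot` and `undocumented_bot` are hypothetical bots invented for the example.

```python
def first_doc_line(func):
    """Return the first non-empty docstring line, or a default."""
    if not func.__doc__:
        return "No description."
    lines = (line.strip() for line in func.__doc__.splitlines())
    return next((line for line in lines if line), "No description.")


def pirate_bot(prompt):
    """
    Answers like a cheerful pirate.

    Longer notes that should not appear in the description.
    """
    return f"Arr! {prompt}"


def undocumented_bot(prompt):
    return prompt


described = first_doc_line(pirate_bot)
fallback = first_doc_line(undocumented_bot)
```

The leading blank line in a triple-quoted docstring is skipped, so bots can use either docstring layout.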

Source code in src/hands_on_ai/chat/bots.py

def get_bot_description(bot_func):
    """
    Get the first non-empty line of a bot's docstring.

    Args:
        bot_func (function): Bot function

    Returns:
        str: Bot description
    """
    if not bot_func.__doc__:
        return "No description."
    return next((line.strip() for line in bot_func.__doc__.splitlines() if line.strip()), "No description.")

`list_available_bots` ¶

Discover available bot functions defined in the personalities module. Bots must be public (no leading underscore), have a name ending in `_bot`, and accept a single `prompt` argument.

Returns:

Name	Type	Description
`dict`		Dictionary of bot names and functions
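The discovery rules can be tried out without the package installed. This sketch applies the same three checks to a throwaway module object; `fake_personalities` and every function attached to it are hypothetical.

```python
import inspect
import types

# Hypothetical stand-in for the personalities module.
fake_personalities = types.ModuleType("fake_personalities")
fake_personalities.friendly_bot = lambda prompt: f"Hi! {prompt}"
fake_personalities._secret_bot = lambda prompt: prompt        # private: skipped
fake_personalities.helper = lambda prompt: prompt             # no _bot suffix: skipped
fake_personalities.chatty_bot = lambda prompt, mood: prompt   # two parameters: skipped


def discover_bots(module):
    """Collect public callables named *_bot that take exactly one `prompt` argument."""
    bots = {}
    for name, obj in inspect.getmembers(module):
        if callable(obj) and not name.startswith("_") and name.endswith("_bot"):
            params = list(inspect.signature(obj).parameters.values())
            if len(params) == 1 and params[0].name == "prompt":
                bots[name] = obj
    return bots


found = discover_bots(fake_personalities)
```

Only `friendly_bot` survives all three checks, which is why a custom bot with an extra parameter will not show up in the list.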

Source code in src/hands_on_ai/chat/bots.py

def list_available_bots():
    """
    Discover available bot functions defined in personalities module.
    Bots must accept a single 'prompt' argument and not be private.

    Returns:
        dict: Dictionary of bot names and functions
    """
    bots = {}
    for name, obj in inspect.getmembers(personalities):
        if (
            callable(obj)
            and not name.startswith("_")
            and name.endswith("_bot")  # Enforce _bot suffix
        ):
            sig = inspect.signature(obj)
            params = list(sig.parameters.values())
            if len(params) == 1 and params[0].name == "prompt":
                bots[name] = obj
    return bots

`chat.conversation`¶

Multi-turn conversation memory for the chat module.

An LLM is stateless: each request only sees the messages you send it. To make a bot that "remembers" earlier turns, you keep the running transcript and resend it every time. Conversation does exactly that bookkeeping for you, so you can focus on the conversation instead of the plumbing.

Example

from hands_on_ai.chat import Conversation

chat = Conversation(system="You are a helpful tutor.")
chat.ask("My name is Sam.")
chat.ask("What's my name?")   # remembers "Sam"
print(chat.total_tokens)      # tokens used across the whole chat
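The "keep the transcript and resend it" idea is small enough to sketch without a model. In the following illustration, `MiniConversation` is a simplified stand-in for `Conversation`, and `fake_send` is a hypothetical model that only reports how much history it was shown.

```python
class MiniConversation:
    """Keep a running transcript and resend all of it on every turn."""

    def __init__(self, system, send):
        self.messages = [{"role": "system", "content": system}]
        self.send = send  # hypothetical callable: list of messages -> reply text

    def ask(self, prompt):
        self.messages.append({"role": "user", "content": prompt})
        reply = self.send(self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def fake_send(messages):
    """Hypothetical model: report how many user turns it was shown."""
    user_turns = [m for m in messages if m["role"] == "user"]
    return f"I have seen {len(user_turns)} user message(s)."


chat = MiniConversation("You are a helpful tutor.", fake_send)
first = chat.ask("My name is Sam.")
second = chat.ask("What's my name?")
```

The second call sees both user turns because the whole list is passed each time. This is also why token usage grows with every turn of a long chat.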

`Conversation` ¶

A stateful chat that remembers the conversation history.

Source code in src/hands_on_ai/chat/conversation.py

class Conversation:
    """A stateful chat that remembers the conversation history."""

    def __init__(
        self,
        system: str = "You are a helpful assistant.",
        model: str = None,
        personality: str = "friendly",
    ):
        """
        Args:
            system: System message that defines the bot's behavior.
            model: LLM model to use (defaults to config setting).
            personality: Used for fallback character during retries.
        """
        self.system = system
        self.model = model
        self.personality = personality
        # The transcript we resend each turn. The system message stays first.
        self.messages = [{"role": "system", "content": system}]
        # Token accounting (None until the provider reports usage).
        self.last_usage = None
        self.total_tokens = 0

    def ask(self, prompt: str, stream: bool = False) -> str:
        """
        Send ``prompt`` as the next user turn and return the model's reply.

        The user message and the reply are both appended to the history, so the
        next call automatically includes everything said so far.
        """
        self.messages.append({"role": "user", "content": prompt})
        content, usage = chat_completion(
            self.messages,
            model=self.model,
            personality=self.personality,
            stream=stream,
        )
        self.messages.append({"role": "assistant", "content": content})

        self.last_usage = usage
        if usage and usage.get("total_tokens"):
            self.total_tokens += usage["total_tokens"]

        return content

    def reset(self):
        """Clear the conversation history, keeping the original system prompt."""
        self.messages = [{"role": "system", "content": self.system}]
        self.last_usage = None
        self.total_tokens = 0

    def history(self) -> list:
        """Return the user/assistant turns (excluding the system message)."""
        return [m for m in self.messages if m["role"] != "system"]

    def save(self, path):
        """Save the conversation (system prompt, history, token total) to JSON."""
        data = {
            "system": self.system,
            "model": self.model,
            "personality": self.personality,
            "messages": self.messages,
            "total_tokens": self.total_tokens,
        }
        Path(path).write_text(json.dumps(data, indent=2), encoding="utf-8")

    @classmethod
    def load(cls, path):
        """Recreate a conversation previously written with :meth:`save`."""
        data = json.loads(Path(path).read_text(encoding="utf-8"))
        conv = cls(
            system=data.get("system", "You are a helpful assistant."),
            model=data.get("model"),
            personality=data.get("personality", "friendly"),
        )
        conv.messages = data.get("messages", conv.messages)
        conv.total_tokens = data.get("total_tokens", 0)
        return conv

`__init__` ¶

Parameters:

Name	Type	Description	Default
`system`	`str`	System message that defines the bot's behavior.	`'You are a helpful assistant.'`
`model`	`str`	LLM model to use (defaults to config setting).	`None`
`personality`	`str`	Used for fallback character during retries.	`'friendly'`

Source code in src/hands_on_ai/chat/conversation.py

def __init__(
    self,
    system: str = "You are a helpful assistant.",
    model: str = None,
    personality: str = "friendly",
):
    """
    Args:
        system: System message that defines the bot's behavior.
        model: LLM model to use (defaults to config setting).
        personality: Used for fallback character during retries.
    """
    self.system = system
    self.model = model
    self.personality = personality
    # The transcript we resend each turn. The system message stays first.
    self.messages = [{"role": "system", "content": system}]
    # Token accounting (None until the provider reports usage).
    self.last_usage = None
    self.total_tokens = 0

`ask` ¶

Send prompt as the next user turn and return the model's reply.

The user message and the reply are both appended to the history, so the next call automatically includes everything said so far.

Source code in src/hands_on_ai/chat/conversation.py

def ask(self, prompt: str, stream: bool = False) -> str:
    """
    Send ``prompt`` as the next user turn and return the model's reply.

    The user message and the reply are both appended to the history, so the
    next call automatically includes everything said so far.
    """
    self.messages.append({"role": "user", "content": prompt})
    content, usage = chat_completion(
        self.messages,
        model=self.model,
        personality=self.personality,
        stream=stream,
    )
    self.messages.append({"role": "assistant", "content": content})

    self.last_usage = usage
    if usage and usage.get("total_tokens"):
        self.total_tokens += usage["total_tokens"]

    return content

`history` ¶

Return the user/assistant turns (excluding the system message).

Source code in src/hands_on_ai/chat/conversation.py

def history(self) -> list:
    """Return the user/assistant turns (excluding the system message)."""
    return [m for m in self.messages if m["role"] != "system"]

`load` `classmethod` ¶

Recreate a conversation previously written with `save`.

Source code in src/hands_on_ai/chat/conversation.py

@classmethod
def load(cls, path):
    """Recreate a conversation previously written with :meth:`save`."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    conv = cls(
        system=data.get("system", "You are a helpful assistant."),
        model=data.get("model"),
        personality=data.get("personality", "friendly"),
    )
    conv.messages = data.get("messages", conv.messages)
    conv.total_tokens = data.get("total_tokens", 0)
    return conv

`reset` ¶

Clear the conversation history, keeping the original system prompt.

Source code in src/hands_on_ai/chat/conversation.py

def reset(self):
    """Clear the conversation history, keeping the original system prompt."""
    self.messages = [{"role": "system", "content": self.system}]
    self.last_usage = None
    self.total_tokens = 0

`save` ¶

Save the conversation (system prompt, history, token total) to JSON.
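For orientation, the saved file is ordinary JSON with five keys. The sketch below writes and reads a file of that shape using only the standard library. The values are made up, and it does not call `Conversation` itself.

```python
import json
import tempfile
from pathlib import Path

# Same keys that the save listing writes; the values here are made up.
data = {
    "system": "You are a helpful tutor.",
    "model": None,
    "personality": "friendly",
    "messages": [
        {"role": "system", "content": "You are a helpful tutor."},
        {"role": "user", "content": "My name is Sam."},
        {"role": "assistant", "content": "Nice to meet you, Sam!"},
    ],
    "total_tokens": 42,
}

with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "chat.json"
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    restored = json.loads(path.read_text(encoding="utf-8"))
```

`last_usage` is not among the saved keys, so a conversation recreated with `load` starts with `last_usage` set to `None` while keeping its `total_tokens`.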

Source code in src/hands_on_ai/chat/conversation.py

def save(self, path):
    """Save the conversation (system prompt, history, token total) to JSON."""
    data = {
        "system": self.system,
        "model": self.model,
        "personality": self.personality,
        "messages": self.messages,
        "total_tokens": self.total_tokens,
    }
    Path(path).write_text(json.dumps(data, indent=2), encoding="utf-8")

RAG¶

Retrieval-augmented generation helpers — chunking, embedding, indexing, and similarity search.

`rag.utils`¶

Core RAG utilities for document loading, chunking, embedding, and retrieval.

`chunk_text` ¶

Split text into chunks of `chunk_size` words each; the final chunk may be shorter.

Parameters:

Name	Type	Description	Default
`text`		Text to chunk	required
`chunk_size`		Words per chunk (default from config)	`None`

Returns:

Name	Type	Description
`list`		List of text chunks
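A quick worked example shows what word-based chunking produces. `chunk_words` below repeats the same slicing logic with an explicit `chunk_size`, so it runs without the package's config; the name is chosen for this illustration only.

```python
def chunk_words(text, chunk_size):
    """Split text into consecutive chunks of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


chunks = chunk_words("the quick brown fox jumps over the lazy dog", 4)
empty = chunk_words("", 4)
```

Nine words with a chunk size of four give two full chunks and a one-word remainder. Whitespace, including line breaks, is collapsed to single spaces along the way.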

Source code in src/hands_on_ai/rag/utils.py

def chunk_text(text, chunk_size=None):
    """
    Split text into chunks of approximately equal size.

    Args:
        text: Text to chunk
        chunk_size: Words per chunk (default from config)

    Returns:
        list: List of text chunks
    """
    if chunk_size is None:
        chunk_size = get_chunk_size()

    words = text.split()
    return [" ".join(words[i:i+chunk_size]) for i in range(0, len(words), chunk_size)]

`copy_sample_docs` ¶

Copy sample documents to a destination directory.

Parameters:

Name	Type	Description	Default
`destination`		Path to copy documents to (default: a `sample_docs` folder in the current directory)	`None`

Returns:

Name	Type	Description
`Path`		Path to the destination directory

Source code in src/hands_on_ai/rag/utils.py

def copy_sample_docs(destination=None):
    """
    Copy sample documents to a destination directory.

    Args:
        destination: Path to copy documents to (default: current directory)

    Returns:
        Path: Path to the destination directory
    """
    if destination is None:
        destination = Path.cwd() / 'sample_docs'
    else:
        destination = Path(destination)

    destination.mkdir(exist_ok=True, parents=True)
    sample_path = get_sample_docs_path()

    # Copy all files
    for file_path in sample_path.iterdir():
        if file_path.is_file():
            shutil.copy2(file_path, destination / file_path.name)

    return destination

`get_embeddings` ¶

Get embeddings for text chunks using the OpenAI-compatible embeddings API.

This uses the same /v1 endpoint and client as the rest of the package, so it works with any OpenAI-compatible provider (Ollama, OpenAI, etc.) rather than only Ollama's native /api/embeddings endpoint.

Parameters:

Name	Type	Description	Default
`chunks`		List of text chunks	required
`model`		Embedding model to use (default from config)	`None`

Returns:

Name	Type	Description
`ndarray`		Array of embedding vectors

Raises:

Type	Description
`Exception`	If embedding request fails
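Two small details in the listing below are easy to miss: the server URL gets a `/v1` suffix when it lacks one, and the returned items are sorted by `index` so vectors line up with the input chunks. A sketch of both steps in isolation, with hypothetical helper names and hand-made response items in place of a real API response:

```python
from types import SimpleNamespace


def with_v1_suffix(server_url):
    """Append /v1 unless the URL already ends with it."""
    if server_url.endswith("/v1"):
        return server_url
    return server_url.rstrip("/") + "/v1"


def in_input_order(items):
    """Return embedding vectors sorted by each item's `index` field."""
    return [item.embedding for item in sorted(items, key=lambda item: item.index)]


# Hypothetical response items arriving out of order.
items = [
    SimpleNamespace(index=1, embedding=[0.0, 1.0]),
    SimpleNamespace(index=0, embedding=[1.0, 0.0]),
]
vectors = in_input_order(items)
base_url = with_v1_suffix("http://localhost:11434/")
```

Sorting matters because the index stores chunks and vectors in parallel arrays; a mismatch would silently pair text with the wrong embedding.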

Source code in src/hands_on_ai/rag/utils.py

def get_embeddings(chunks, model=None):
    """
    Get embeddings for text chunks using the OpenAI-compatible embeddings API.

    This uses the same ``/v1`` endpoint and client as the rest of the package,
    so it works with any OpenAI-compatible provider (Ollama, OpenAI, etc.)
    rather than only Ollama's native ``/api/embeddings`` endpoint.

    Args:
        chunks: List of text chunks
        model: Embedding model to use (default from config)

    Returns:
        ndarray: Array of embedding vectors

    Raises:
        Exception: If embedding request fails
    """
    if model is None:
        model = get_embedding_model()

    server_url = get_server_url()
    # Add /v1 suffix for OpenAI-compatible endpoints
    if not server_url.endswith("/v1"):
        server_url = server_url.rstrip("/") + "/v1"

    client = OpenAI(
        base_url=server_url,
        api_key=get_api_key() or "hands-on-ai",
        timeout=30,
    )

    response = client.embeddings.create(model=model, input=list(chunks))
    # Sort by index so vector order matches the input chunk order.
    ordered = sorted(response.data, key=lambda item: item.index)
    return np.array([item.embedding for item in ordered])

`get_sample_docs_path` ¶

Get the path to the sample document directory.

Returns:

Name	Type	Description
`Path`		Path object to the sample documents directory

Source code in src/hands_on_ai/rag/utils.py

def get_sample_docs_path():
    """
    Get the path to the sample document directory.

    Returns:
        Path: Path object to the sample documents directory
    """
    try:
        # For Python 3.9+
        with importlib.resources.path('hands_on_ai.rag.data', 'samples') as path:
            return path
    except Exception:
        # Fallback for older Python or direct file access
        module_path = Path(__file__).parent
        return module_path / 'data' / 'samples'

`get_top_k` ¶

Retrieve top k similar chunks for a query.

Parameters:

Name	Type	Description	Default
`query`		Search query	required
`index_path`		Path to index file	required
`k`		Number of results to return	`3`
`return_scores`		Whether to include similarity scores	`False`

Returns:

Name	Type	Description
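`list`		List of `(chunk, source)` tuples, most similar first. If `return_scores` is True, a `(results, scores)` tuple instead, where `scores` lists the similarity of each result.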
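The ranking step can be followed by hand on a toy example. This sketch swaps the NumPy and `cosine_similarity` calls from the listing for plain Python so that it has no dependencies; `cosine`, `top_k`, and the two-dimensional "embeddings" are all invented for the illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vector, vectors, chunks, sources, k=3):
    """Return the k most similar (chunk, source) pairs and their scores, best first."""
    scores = [cosine(query_vector, v) for v in vectors]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return [(chunks[i], sources[i]) for i in order], [scores[i] for i in order]


# Made-up 2-D "embeddings" for three chunks.
vectors = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
chunks = ["cats purr", "stocks fell", "cats and markets"]
sources = ["pets.md", "news.md", "mixed.md"]

results, scores = top_k([1.0, 0.1], vectors, chunks, sources, k=2)
```

The query points almost along the first axis, so the first chunk ranks highest and the mixed chunk comes second; the chunk on the other axis is dropped by `k=2`.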

Source code in src/hands_on_ai/rag/utils.py

def get_top_k(query, index_path, k=3, return_scores=False):
    """
    Retrieve top k similar chunks for a query.

    Args:
        query: Search query
        index_path: Path to index file
        k: Number of results to return
        return_scores: Whether to include similarity scores

    Returns:
        list: List of (chunk, source) tuples, optionally with scores
    """
    vectors, chunks, sources = load_index_with_sources(index_path)
    query_vector = get_embeddings([query])[0].reshape(1, -1)
    sims = cosine_similarity(query_vector, vectors)[0]
    top_indices = sims.argsort()[-k:][::-1]

    top_chunks = [chunks[i] for i in top_indices]
    top_sources = [sources[i] for i in top_indices]
    top_scores = [sims[i] for i in top_indices]

    if return_scores:
        return list(zip(top_chunks, top_sources)), top_scores
    return list(zip(top_chunks, top_sources))

`list_sample_docs` ¶

List all available sample documents.

Returns:

Name	Type	Description
`list`		List of sample document filenames

Source code in src/hands_on_ai/rag/utils.py

def list_sample_docs():
    """
    List all available sample documents.

    Returns:
        list: List of sample document filenames
    """
    sample_path = get_sample_docs_path()
    return [f.name for f in sample_path.iterdir() if f.is_file()]

`load_index_with_sources` ¶

Load RAG index with source tracking.

Parameters:

Name	Type	Description	Default
`path`		Path to index file	required

Returns:

Name	Type	Description
`tuple`		(vectors, chunks, sources)

Source code in src/hands_on_ai/rag/utils.py

def load_index_with_sources(path):
    """
    Load RAG index with source tracking.

    Args:
        path: Path to index file

    Returns:
        tuple: (vectors, chunks, sources)
    """
    # allow_pickle=False prevents arbitrary code execution from a malicious
    # index file. chunks/sources are saved as plain string arrays, which load
    # fine without pickling.
    data = np.load(path, allow_pickle=False)
    return data["vectors"], data["chunks"], data["sources"]

`load_text_file` ¶

Load text from various file formats.

Parameters:

Name	Type	Description	Default
`path`	`Path`	Path to file	required

Returns:

Name	Type	Description
`str`	`str`	Extracted text content

Raises:

Type	Description
`ImportError`	If required dependencies are missing
`ValueError`	If file type is unsupported

Source code in src/hands_on_ai/rag/utils.py

def load_text_file(path: Path) -> str:
    """
    Load text from various file formats.

    Args:
        path: Path to file

    Returns:
        str: Extracted text content

    Raises:
        ImportError: If required dependencies are missing
        ValueError: If file type is unsupported
    """
    ext = path.suffix.lower()

    if ext in [".txt", ".md"]:
        return path.read_text(encoding="utf-8")

    elif ext == ".docx":
        try:
            import docx
        except ImportError:
            raise ImportError("`python-docx` is needed to read .docx files (it ships with hands-on-ai). Try reinstalling, or: pip install python-docx")
        doc = docx.Document(path)
        return "\n".join(p.text for p in doc.paragraphs if p.text.strip())

    elif ext == ".pdf":
        try:
            import fitz  # PyMuPDF
        except ImportError:
            raise ImportError("`pymupdf` is needed to read .pdf files (it ships with hands-on-ai). Try reinstalling, or: pip install pymupdf")
        with fitz.open(path) as doc:
            return "\n".join(page.get_text() for page in doc)

    else:
        raise ValueError(f"❌ Unsupported file type: {ext}. Supported: .txt, .md, .docx, .pdf")

`save_index_with_sources` ¶

Save RAG index with source tracking.

Parameters:

Name	Type	Description	Default
`vectors`		Embedding vectors	required
`chunks`		Text chunks	required
`sources`		Source information for each chunk	required
`path`		Path to save index file	required

Source code in src/hands_on_ai/rag/utils.py

def save_index_with_sources(vectors, chunks, sources, path):
    """
    Save RAG index with source tracking.

    Args:
        vectors: Embedding vectors
        chunks: Text chunks
        sources: Source information for each chunk
        path: Path to save index file
    """
    np.savez(path, vectors=vectors, chunks=np.array(chunks), sources=np.array(sources))

Agent¶

Tool-using agent core: register tools, list them, and run the agent loop.

`agent.core`¶

Core agent functionality for ReAct-style reasoning and tool use.

`list_tools` ¶

List all registered tools.

Returns:

Name	Type	Description
`list`		List of tool information dictionaries

Source code in src/hands_on_ai/agent/core.py

def list_tools():
    """
    List all registered tools.

    Returns:
        list: List of tool information dictionaries
    """
    return [
        {"name": info["name"], "description": info["description"]}
        for info in _tools.values()
    ]

`register_tool` ¶

Register a tool with the agent.

Parameters:

Name	Type	Description	Default
`name`	`str`	Tool name	required
`description`	`str`	Tool description	required
`function`	`Callable`	Tool function	required
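The registry behind `register_tool` and `list_tools` is a dictionary keyed by tool name. The following stand-alone sketch mirrors that shape with its own `_tools` dictionary, so it does not touch the package's registry; `word_count` is a hypothetical tool.

```python
_tools = {}  # stand-in registry, keyed by tool name


def register_tool(name, description, function):
    """Store a tool under its name; registering the same name again replaces it."""
    _tools[name] = {"name": name, "description": description, "function": function}


def list_tools():
    """Return name and description only, leaving the callables out."""
    return [{"name": t["name"], "description": t["description"]} for t in _tools.values()]


def word_count(text):
    """Hypothetical tool: count whitespace-separated words."""
    return str(len(text.split()))


register_tool("word_count", "Count the words in a piece of text.", word_count)
listing = list_tools()
answer = _tools["word_count"]["function"]("one two three")
```

The description is what the model reads when deciding which tool to call, so it is worth writing it as a clear one-line instruction.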

Source code in src/hands_on_ai/agent/core.py

def register_tool(name: str, description: str, function: Callable):
    """
    Register a tool with the agent.

    Args:
        name: Tool name
        description: Tool description
        function: Tool function
    """
    _tools[name] = {
        "name": name,
        "description": description,
        "function": function
    }
    log.debug(f"Registered tool: {name}")

`run_agent` ¶

Run the agent with the given prompt.

Parameters:

Name	Type	Description	Default
`prompt`	`str`	User question or instruction	required
`model`	`Optional[str]`	LLM model to use, defaults to configured model	`None`
`format`	`str`	Format to use ("react", "json", or "auto")	`'auto'`
`max_iterations`	`int`	Maximum number of tool use iterations	`5`
`verbose`	`bool`	Whether to print intermediate steps	`False`

Returns:

Name	Type	Description
`str`	`str`	Final agent response

Source code in src/hands_on_ai/agent/core.py

def run_agent(
    prompt: str, 
    model: Optional[str] = None, 
    format: str = "auto",
    max_iterations: int = 5, 
    verbose: bool = False
) -> str:
    """
    Run the agent with the given prompt.

    Args:
        prompt: User question or instruction
        model: LLM model to use, defaults to configured model
        format: Format to use ("react", "json", or "auto")
        max_iterations: Maximum number of tool use iterations
        verbose: Whether to print intermediate steps

    Returns:
        str: Final agent response
    """
    # Get model from config if not specified
    if model is None:
        model = get_model()

    # Determine which format to use if set to auto
    if format == "auto":
        format = detect_best_format(model)

    if verbose:
        log.info(f"Using {format} format for model {model}")

    # Use JSON format for smaller models
    if format == "json":
        return run_json_agent(prompt, _tools, model, max_iterations, verbose)

    # Otherwise use the original ReAct format
    return _run_react_agent(prompt, model, max_iterations, verbose)

Workflow¶

Multi-step pipeline runner for chaining steps together.

`workflow.runner`¶

A tiny file-based workflow runner: the Interpretable Context Methodology (ICM).

Instead of a coordination framework, a workflow is just a folder of numbered stages. Each stage has a `CONTEXT.md` (its instructions) and an `output/` folder. One orchestrating model reads each stage's instructions plus the previous stage's output, and writes a new readable file. A human reviews (and can edit) the output between stages.

workspace/
├── CONTEXT.md            # optional: shared system prompt / overall goal
├── references/           # optional: stable rules (the "factory")
└── stages/
    ├── 01_research/
    │   ├── CONTEXT.md    # what this stage should do
    │   └── output/       # output.md is written here
    └── 02_draft/
        ├── CONTEXT.md
        └── output/

Run one stage at a time and review the output file before continuing. This runner is deliberately sequential and human-in-the-loop, not an autonomous loop:

from hands_on_ai.workflow import Pipeline

pipe = Pipeline("workspace")
pipe.status()        # show stages and which are done
pipe.run_next()      # runs stage 01, writes output.md, stops for review
# ...open stages/01_research/output/output.md, edit if needed...
pipe.run_next()      # runs stage 02 using stage 01's reviewed output

`Pipeline` ¶

Run a folder-based (ICM) workflow one reviewable stage at a time.

Source code in src/hands_on_ai/workflow/runner.py

class Pipeline:
    """Run a folder-based (ICM) workflow one reviewable stage at a time."""

    def __init__(self, path):
        self.root = Path(path)
        self.stages_dir = self.root / "stages"
        if not self.stages_dir.is_dir():
            raise FileNotFoundError(f"No 'stages/' folder found in {self.root}")

    # --- structure helpers ---

    def _stages(self):
        """Stage folders in numbered order."""
        return sorted(
            (p for p in self.stages_dir.iterdir() if p.is_dir()),
            key=lambda p: p.name,
        )

    @staticmethod
    def _output_path(stage):
        return stage / "output" / OUTPUT_NAME

    def _is_done(self, stage):
        out = self._output_path(stage)
        return out.exists() and out.read_text(encoding="utf-8").strip() != ""

    @staticmethod
    def _read(path):
        return path.read_text(encoding="utf-8").strip() if path.exists() else ""

    def _references(self, stage):
        """Concatenate workspace-level and stage-level reference files (the 'factory')."""
        texts = []
        for refs_dir in (self.root / "references", stage / "references"):
            if refs_dir.is_dir():
                for f in sorted(refs_dir.glob("*.md")):
                    texts.append(self._read(f))
        return "\n\n".join(t for t in texts if t)

    def _build_messages(self, stage, prev_stage):
        """Assemble the (system, prompt) for a stage from its contract + context."""
        system = self._read(self.root / "CONTEXT.md") or _DEFAULT_SYSTEM
        contract = self._read(stage / "CONTEXT.md") or f"# {stage.name}"

        parts = [contract]
        refs = self._references(stage)
        if refs:
            parts.append("## References (rules to follow)\n\n" + refs)
        if prev_stage is not None:
            prev = self._read(self._output_path(prev_stage))
            if prev:
                parts.append("## Input (output of the previous stage)\n\n" + prev)

        return system, "\n\n".join(parts)

    # --- running ---

    def status(self):
        """Print and return ``[(stage_name, done), ...]``."""
        rows = [(s.name, self._is_done(s)) for s in self._stages()]
        for name, done in rows:
            print(f"  [{'x' if done else ' '}] {name}")
        return rows

    def run_next(self, model: str = None):
        """
        Run the next not-yet-completed stage, write its output, and stop.

        This is the human-in-the-loop default: run one stage, then review (and
        optionally edit) ``output/output.md`` before calling ``run_next`` again.

        Returns:
            dict with ``stage``, ``output_path`` and ``output``, or ``None`` if
            every stage is already done.
        """
        stages = self._stages()
        for i, stage in enumerate(stages):
            if not self._is_done(stage):
                prev = stages[i - 1] if i > 0 else None
                system, prompt = self._build_messages(stage, prev)
                result = get_response(prompt, system=system, model=model)

                out = self._output_path(stage)
                out.parent.mkdir(parents=True, exist_ok=True)
                out.write_text(result, encoding="utf-8")
                return {"stage": stage.name, "output_path": str(out), "output": result}
        return None

    def run_all(self, model: str = None, max_steps: int = 50):
        """
        Run every remaining stage in order (no review pause between them).

        Use this only once you trust the pipeline. The review-first ``run_next``
        is the recommended way to drive it. ``max_steps`` is a safety bound.
        """
        results = []
        for _ in range(max_steps):
            r = self.run_next(model=model)
            if r is None:
                break
            results.append(r)
        return results

    def reset(self):
        """Delete all stage outputs so the workflow can be re-run from the start."""
        for stage in self._stages():
            out = self._output_path(stage)
            if out.exists():
                out.unlink()
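
The prompt assembly in `_build_messages` can be sketched without a model call. This standalone version takes plain strings instead of reading files; `build_prompt` and its arguments are illustrative names, not part of the package.

```python
from typing import Optional


def build_prompt(
    contract: str,
    references: str = "",
    previous_output: Optional[str] = None,
) -> str:
    """Join the stage contract, optional references, and optional prior output."""
    parts = [contract]
    if references:
        parts.append("## References (rules to follow)\n\n" + references)
    if previous_output:
        parts.append("## Input (output of the previous stage)\n\n" + previous_output)
    return "\n\n".join(parts)


# First stage: no previous output, no references.
first = build_prompt("# Research\nList three sources.")

# Later stage: references and the reviewed output of the stage before it.
second = build_prompt(
    "# Draft\nWrite one paragraph.",
    references="Cite every claim.",
    previous_output="1. Source A\n2. Source B",
)
print(second)
```

In the class itself the previous stage's file is read when the next stage runs, so any edits made during review flow into the next prompt.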

`reset` ¶

Delete all stage outputs so the workflow can be re-run from the start.

Source code in src/hands_on_ai/workflow/runner.py

def reset(self):
    """Delete all stage outputs so the workflow can be re-run from the start."""
    for stage in self._stages():
        out = self._output_path(stage)
        if out.exists():
            out.unlink()

`run_all` ¶

Run every remaining stage in order (no review pause between them).

Use this only once you trust the pipeline. The review-first `run_next` is the recommended way to drive it. `max_steps` caps the number of stages run in one call, as a safety bound.

Source code in src/hands_on_ai/workflow/runner.py

def run_all(self, model: str = None, max_steps: int = 50):
    """
    Run every remaining stage in order (no review pause between them).

    Use this only once you trust the pipeline. The review-first ``run_next``
    is the recommended way to drive it. ``max_steps`` is a safety bound.
    """
    results = []
    for _ in range(max_steps):
        r = self.run_next(model=model)
        if r is None:
            break
        results.append(r)
    return results

`run_next` ¶

Run the next not-yet-completed stage, write its output, and stop.

This is the human-in-the-loop default: run one stage, then review (and optionally edit) `output/output.md` before calling `run_next` again.

Returns:

Type	Description
	dict with `stage`, `output_path` and `output`, or `None` if every stage is already done.

Source code in src/hands_on_ai/workflow/runner.py

def run_next(self, model: str = None):
    """
    Run the next not-yet-completed stage, write its output, and stop.

    This is the human-in-the-loop default: run one stage, then review (and
    optionally edit) ``output/output.md`` before calling ``run_next`` again.

    Returns:
        dict with ``stage``, ``output_path`` and ``output``, or ``None`` if
        every stage is already done.
    """
    stages = self._stages()
    for i, stage in enumerate(stages):
        if not self._is_done(stage):
            prev = stages[i - 1] if i > 0 else None
            system, prompt = self._build_messages(stage, prev)
            result = get_response(prompt, system=system, model=model)

            out = self._output_path(stage)
            out.parent.mkdir(parents=True, exist_ok=True)
            out.write_text(result, encoding="utf-8")
            return {"stage": stage.name, "output_path": str(out), "output": result}
    return None

`status` ¶

Print and return `[(stage_name, done), ...]`.

Source code in src/hands_on_ai/workflow/runner.py

def status(self):
    """Print and return ``[(stage_name, done), ...]``."""
    rows = [(s.name, self._is_done(s)) for s in self._stages()]
    for name, done in rows:
        print(f"  [{'x' if done else ' '}] {name}")
    return rows

`init_workspace` ¶

Create a starter workspace with numbered stage folders.

Parameters:

Name	Type	Description	Default
`path`		Directory to create the workspace in.	required
`stages`		List of stage names, e.g. `["research", "draft"]` → `stages/01_research`, `stages/02_draft`.	required
`system`	`str`	Optional shared instruction written to the workspace `CONTEXT.md`.	`None`

Returns:

Name	Type	Description
`Path`		the workspace root.

Source code in src/hands_on_ai/workflow/runner.py

def init_workspace(path, stages, system: str = None):
    """
    Create a starter workspace with numbered stage folders.

    Args:
        path: Directory to create the workspace in.
        stages: List of stage names, e.g. ``["research", "draft"]`` →
            ``stages/01_research``, ``stages/02_draft``.
        system: Optional shared instruction written to the workspace ``CONTEXT.md``.

    Returns:
        Path: the workspace root.
    """
    root = Path(path)
    (root / "references").mkdir(parents=True, exist_ok=True)
    if system:
        (root / "CONTEXT.md").write_text(system, encoding="utf-8")

    for i, name in enumerate(stages, start=1):
        stage = root / "stages" / f"{i:02d}_{name}"
        (stage / "output").mkdir(parents=True, exist_ok=True)
        contract = stage / "CONTEXT.md"
        if not contract.exists():
            contract.write_text(
                f"# Stage {i:02d}: {name}\n\n"
                "Describe what this stage should do with the input it receives.\n",
                encoding="utf-8",
            )
    return root
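
Stage order comes from sorting folder names as strings, which is why `init_workspace` zero-pads the index. A short sketch of the difference, using made-up stage names:

```python
names = ["research", "outline", "draft", "review", "edit",
         "factcheck", "format", "title", "summary", "publish"]

# Folder names as init_workspace builds them, and the same without padding.
padded = [f"{i:02d}_{name}" for i, name in enumerate(names, start=1)]
unpadded = [f"{i}_{name}" for i, name in enumerate(names, start=1)]

# Pipeline._stages sorts by folder name, so padded names keep numeric order.
assert sorted(padded) == padded

# Without padding, string sorting puts stage 10 ahead of stage 1.
print(sorted(unpadded)[:3])  # ['10_publish', '1_research', '2_outline']
```

Two-digit padding keeps the order stable for up to 99 stages; folders created by hand should follow the same naming scheme.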

Evaluation¶

LLM-as-judge scoring for bot outputs.

`eval.judge`¶

LLM-as-judge: ask a language model to score an output against criteria.

This is how a lot of modern AI evaluation works: instead of hand-writing graders, you ask a capable model to score a response. It is fast and flexible, but not infallible, so treat the score as a signal, not a verdict.

`judge` ¶

Ask an LLM to score `output` against `criteria`.

Parameters:

Name	Description	Default
`output`	The text to evaluate.	required
`criteria`	What "good" means here (e.g. "accurate, concise, and friendly").	required
`question`	The original question the output answers (optional context).	`None`
`model`	LLM model to use (defaults to config).	`None`
`scale`	Top of the scoring scale (default 5; 1 is worst).	`5`

Returns:

Name	Type	Description
`dict`		`{"score": int \| None, "reasoning": str, "raw": str}`.

Source code in src/hands_on_ai/eval/judge.py

def judge(output, criteria, question=None, model=None, scale=5):
    """
    Ask an LLM to score ``output`` against ``criteria``.

    Args:
        output: The text to evaluate.
        criteria: What "good" means here (e.g. "accurate, concise, and friendly").
        question: The original question the output answers (optional context).
        model: LLM model to use (defaults to config).
        scale: Top of the scoring scale (default 5; 1 is worst).

    Returns:
        dict: ``{"score": int | None, "reasoning": str, "raw": str}``.
    """
    system = (
        "You are a strict but fair evaluator. Score the response against the "
        f"criteria on a scale of 1 to {scale}, where {scale} is best. "
        "Reply with exactly two lines:\n"
        "SCORE: <number>\n"
        "REASONING: <one short sentence>"
    )

    parts = []
    if question:
        parts.append(f"Question:\n{question}")
    parts.append(f"Criteria:\n{criteria}")
    parts.append(f"Response to evaluate:\n{output}")

    reply = get_response("\n\n".join(parts), system=system, model=model)
    return _parse_verdict(reply, scale)
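
`_parse_verdict` is not shown on this page. As a rough sketch of what parsing the two-line reply could look like, the hypothetical `parse_verdict` below uses regular expressions and leaves the score as `None` when no in-range number is found. The real helper may behave differently.

```python
import re
from typing import Optional


def parse_verdict(reply: str, scale: int = 5) -> dict:
    """Hypothetical parser for a 'SCORE: n / REASONING: text' reply."""
    score: Optional[int] = None
    match = re.search(r"SCORE:\s*(\d+)", reply, flags=re.IGNORECASE)
    if match:
        value = int(match.group(1))
        if 1 <= value <= scale:  # out-of-range scores are treated as unparsed
            score = value

    reasoning = ""
    match = re.search(r"REASONING:\s*(.+)", reply, flags=re.IGNORECASE)
    if match:
        reasoning = match.group(1).strip()

    # Keep the raw reply so a failed parse can still be inspected by hand.
    return {"score": score, "reasoning": reasoning, "raw": reply}


good = parse_verdict("SCORE: 4\nREASONING: Accurate but slightly wordy.")
bad = parse_verdict("I think this is great!")
print(good["score"], bad["score"])
```

Returning `None` rather than raising matches the documented return shape, where `score` may be `int` or `None`.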

Core utilities¶

Shared configuration, model discovery, and response caching used across the toolkit.

`config`¶

Shared configuration for all hands-on-ai modules. Handles server settings, paths, and fallback messages.

`ensure_config_dir` ¶

Create config directory if it doesn't exist.

Source code in src/hands_on_ai/config.py

def ensure_config_dir():
    """Create config directory if it doesn't exist."""
    CONFIG_DIR.mkdir(exist_ok=True)

`get_api_key` ¶

Get the API key from config if available.

Source code in src/hands_on_ai/config.py

def get_api_key():
    """Get the API key from config if available."""
    return load_config().get("api_key", "")

`get_chunk_size` ¶

Get the default chunk size from config.

Source code in src/hands_on_ai/config.py

def get_chunk_size():
    """Get the default chunk size from config."""
    return load_config()["chunk_size"]

`get_embedding_model` ¶

Get the default embedding model from config.

Source code in src/hands_on_ai/config.py

def get_embedding_model():
    """Get the default embedding model from config."""
    return load_config()["embedding_model"]

`get_model` ¶

Get the default model from config.

Source code in src/hands_on_ai/config.py

def get_model():
    """Get the default model from config."""
    return load_config()["model"]

`get_server_url` ¶

Get the server URL from config.

Source code in src/hands_on_ai/config.py

def get_server_url():
    """Get the server URL from config."""
    return load_config()["server"]

`load_config` ¶

Load configuration by layering the packaged defaults, the user config file, and environment variables, in increasing order of precedence.

Returns:

Name	Type	Description
`dict`		Configuration settings

Source code in src/hands_on_ai/config.py

def load_config():
    """
    Load configuration from config file or environment variables.

    Returns:
        dict: Configuration settings
    """
    # Precedence (lowest to highest): defaults < config file < environment variables.

    # Start with default configuration
    config = load_default_config()

    # User config file overrides the defaults
    if CONFIG_PATH.exists():
        try:
            with open(CONFIG_PATH, encoding="utf-8") as f:
                file_config = json.load(f)
                # Update only keys that exist in file
                for key in file_config:
                    config[key] = file_config[key]
        except Exception as e:
            log.warning(f"Failed to read config.json: {e}")

    # Environment variables have the highest priority (override file and defaults)
    if "HANDS_ON_AI_SERVER" in os.environ:
        config["server"] = os.environ["HANDS_ON_AI_SERVER"]

    if "HANDS_ON_AI_MODEL" in os.environ:
        config["model"] = os.environ["HANDS_ON_AI_MODEL"]

    if "HANDS_ON_AI_EMBEDDING_MODEL" in os.environ:
        config["embedding_model"] = os.environ["HANDS_ON_AI_EMBEDDING_MODEL"]

    if "HANDS_ON_AI_API_KEY" in os.environ:
        config["api_key"] = os.environ["HANDS_ON_AI_API_KEY"]

    return config
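
The precedence order can be sketched with plain dictionaries. `layered_config` is an illustrative stand-in that takes the three layers as arguments instead of reading files or `os.environ`; the server URL and model names are placeholders.

```python
def layered_config(defaults: dict, file_config: dict, env: dict) -> dict:
    """Merge defaults < file < environment, mirroring load_config's order."""
    config = dict(defaults)
    config.update(file_config)  # file values override defaults

    # Only these four settings have environment overrides in load_config.
    env_keys = {
        "HANDS_ON_AI_SERVER": "server",
        "HANDS_ON_AI_MODEL": "model",
        "HANDS_ON_AI_EMBEDDING_MODEL": "embedding_model",
        "HANDS_ON_AI_API_KEY": "api_key",
    }
    for var, key in env_keys.items():
        if var in env:
            config[key] = env[var]  # environment wins over file and defaults
    return config


defaults = {"server": "http://example.test:8000", "model": "model-a", "chunk_size": 500}
result = layered_config(
    defaults,
    file_config={"model": "model-b", "chunk_size": 800},
    env={"HANDS_ON_AI_MODEL": "model-c"},
)
print(result)
```

Here `model` ends up as the environment value, while `chunk_size` keeps the file value because no environment variable overrides it.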

`load_default_config` ¶

Load the default configuration packaged with HandsOnAI.

Returns:

Name	Type	Description
`dict`		Default configuration settings

Source code in src/hands_on_ai/config.py

def load_default_config():
    """
    Load the default configuration packaged with HandsOnAI.

    Returns:
        dict: Default configuration settings
    """
    try:
        from importlib.resources import files
        path = files("hands_on_ai.data") / "default_config.json"
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except Exception as e:
        log.warning(f"Failed to read default config: {e}")
        # Fallback to hardcoded defaults if file can't be loaded
        return {
            "server": DEFAULT_SERVER,
            "model": DEFAULT_MODEL,
            "embedding_model": DEFAULT_EMBEDDING_MODEL,
            "chunk_size": DEFAULT_CHUNK_SIZE,
        }

`load_fallbacks` ¶

Load fallback personality messages from the user config directory, falling back to the packaged defaults.

Parameters:

Name	Type	Description	Default
`module`	`str`	Module name to load fallbacks for	`'chat'`

Returns:

Name	Type	Description
`dict`		Fallback messages by personality

Source code in src/hands_on_ai/config.py

def load_fallbacks(module="chat"):
    """
    Load fallback personality messages from the user config directory,
    falling back to the packaged defaults.

    Args:
        module (str): Module name to load fallbacks for

    Returns:
        dict: Fallback messages by personality
    """
    # First try the user override in the config directory
    user_file = CONFIG_DIR / f"{module}_fallbacks.json"
    if user_file.exists():
        try:
            with user_file.open("r", encoding="utf-8") as f:
                return json.load(f)
        except Exception as e:
            log.warning(f"Failed to read user fallbacks: {e}")

    # Otherwise use built-in fallbacks from package data
    try:
        from importlib.resources import files
        path = files(f"hands_on_ai.{module}.data") / "fallbacks.json"
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except Exception as e:
        log.warning(f"Failed to read built-in fallbacks: {e}")
        return {"default": ["Retrying..."]}

`save_config` ¶

Save configuration to config file.

Parameters:

Name	Type	Description	Default
`config`	`dict`	Configuration settings to save	required

Source code in src/hands_on_ai/config.py

def save_config(config):
    """
    Save configuration to config file.

    Args:
        config (dict): Configuration settings to save
    """
    ensure_config_dir()
    try:
        with open(CONFIG_PATH, "w", encoding="utf-8") as f:
            json.dump(config, f, indent=2)
    except Exception as e:
        log.warning(f"Failed to write config.json: {e}")

`models`¶

Core model utilities for Hands-on AI.

This module provides centralized functionality for working with LLM models:

- Listing available models
- Checking if a model exists
- Getting model information
- Normalizing model names
- Detecting model capabilities

`check_model_exists` ¶

Check if a model exists on the server.

Parameters:

Name	Type	Description	Default
`model_name`	`str`	Name of the model	required

Returns:

Name	Type	Description
`bool`	`bool`	True if the model exists, False otherwise

Source code in src/hands_on_ai/models.py

def check_model_exists(model_name: str) -> bool:
    """
    Check if a model exists on the server.

    Args:
        model_name: Name of the model

    Returns:
        bool: True if the model exists, False otherwise
    """
    return get_model_info(model_name) is not None

`detect_best_format` ¶

Determine the best format for the given model based on its capabilities.

Parameters:

Name	Type	Description	Default
`model_name`	`str`	Name of the model	required

Returns:

Name	Type	Description
`str`	`str`	"react" or "json" (default)

Source code in src/hands_on_ai/models.py

def detect_best_format(model_name: str) -> str:
    """
    Determine the best format for the given model based on its capabilities.

    Args:
        model_name: Name of the model

    Returns:
        str: "react" or "json" (default)
    """
    capabilities = get_model_capabilities(model_name)

    if capabilities["react_format"]:
        return "react"
    return "json"

`get_model_capabilities` ¶

Determine the capabilities of a given model.

Parameters:

Name	Type	Description	Default
`model_name`	`str`	Name of the model	required

Returns:

Type	Description
`Dict[str, bool]`	Dictionary of capability flags

Source code in src/hands_on_ai/models.py

def get_model_capabilities(model_name: str) -> Dict[str, bool]:
    """
    Determine the capabilities of a given model.

    Args:
        model_name: Name of the model

    Returns:
        Dict[str, bool]: Dictionary of capability flags
    """
    # Initialize with default capabilities (conservative)
    capabilities = {
        "react_format": False,
        "json_format": True,
        "function_calling": False,
        "tool_use": False,
        "vision": False
    }

    # Get model info
    model_info = get_model_info(model_name)
    if not model_info:
        return capabilities

    # Check parameters field for model size
    if "parameters" in model_info:
        parameters = model_info.get("parameters", {})

        # Extract model size info
        model_size = 0
        if "num_params" in parameters:
            model_size = parameters["num_params"]
        elif "parameter_count" in parameters:
            model_size = parameters["parameter_count"]

        # Models with at least 30B parameters can likely handle ReAct format
        if model_size >= 30_000_000_000:  # 30B or larger
            capabilities["react_format"] = True
            capabilities["function_calling"] = True
            capabilities["tool_use"] = True

    # Check template/system prompt for function calling capabilities
    template = model_info.get("template", "")
    if "function" in template.lower() or "tool" in template.lower():
        capabilities["react_format"] = True
        capabilities["function_calling"] = True
        capabilities["tool_use"] = True

    # Check model families based on name
    model_name_lower = model_name.lower()

    # Models known to support vision
    vision_models = ["llava", "bakllava", "moondream", "cogvlm"]
    if any(vision_model in model_name_lower for vision_model in vision_models):
        capabilities["vision"] = True

    # Models known to support function calling / tool use
    function_models = [
        "gpt-4", "gpt4", "claude-2", "claude-3", "claude3",
        "llama3-70b", "llama-70b", "mixtral-8x7b"
    ]

    if any(pattern.lower() in model_name_lower for pattern in function_models):
        capabilities["react_format"] = True
        capabilities["function_calling"] = True
        capabilities["tool_use"] = True

    return capabilities
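
As listed on this page, `get_model_info` returns an empty `parameters` dict and an empty `template`, so for a model that is found only the name-based checks can switch flags on. The sketch below isolates those checks; it skips the server lookup entirely, and `name_based_capabilities` and `best_format` are illustrative names.

```python
VISION_PATTERNS = ["llava", "bakllava", "moondream", "cogvlm"]
TOOL_PATTERNS = [
    "gpt-4", "gpt4", "claude-2", "claude-3", "claude3",
    "llama3-70b", "llama-70b", "mixtral-8x7b",
]


def name_based_capabilities(model_name: str) -> dict:
    """Substring checks only; the size and template checks are left out."""
    name = model_name.lower()
    supports_tools = any(pattern in name for pattern in TOOL_PATTERNS)
    return {
        "react_format": supports_tools,
        "json_format": True,
        "function_calling": supports_tools,
        "tool_use": supports_tools,
        "vision": any(pattern in name for pattern in VISION_PATTERNS),
    }


def best_format(model_name: str) -> str:
    """Same decision as detect_best_format, applied to the name-only flags."""
    return "react" if name_based_capabilities(model_name)["react_format"] else "json"


for candidate in ["mixtral-8x7b-instruct", "llama3:70b", "llava:13b"]:
    print(candidate, best_format(candidate))
```

Matching is by substring, so an Ollama-style tag such as `llama3:70b` does not match the `llama3-70b` pattern and falls back to `"json"`.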

`get_model_info` ¶

Look up a model via the OpenAI-compatible endpoint and return basic information about it.

Parameters:

Name	Type	Description	Default
`model_name`	`str`	Name of the model	required

Returns:

Type	Description
`Optional[Dict[str, Any]]`	Basic model information, or None if not found

Source code in src/hands_on_ai/models.py

def get_model_info(model_name: str) -> Optional[Dict[str, Any]]:
    """
    Look up a model via the OpenAI-compatible endpoint and return basic info.

    Args:
        model_name: Name of the model

    Returns:
        Optional[Dict]: Basic model information or None if not found
    """
    # Try variations of the model name
    original_name = model_name
    normalized_name = normalize_model_name(model_name)

    model_variations = [original_name]
    if normalized_name != original_name:
        model_variations.append(normalized_name)

    server_url = get_server_url()

    # Add /v1 suffix for OpenAI-compatible endpoints
    if not server_url.endswith('/v1'):
        server_url = server_url.rstrip('/') + '/v1'

    # Try each variation
    for model_variant in model_variations:
        log.debug(f"Checking model: {model_variant}")

        try:
            # Create OpenAI client
            client = OpenAI(
                base_url=server_url,
                api_key=get_api_key() or "hands-on-ai"
            )

            # Get list of models and check if our model exists
            models_response = client.models.list()

            for model in models_response.data:
                if model.id == model_variant:
                    log.debug(f"Found model: {model_variant}")
                    # Return basic model info in expected format
                    return {
                        "name": model.id,
                        "parameters": {},  # Not available in OpenAI format
                        "template": "",    # Not available in OpenAI format
                        "created": getattr(model, 'created', 0)
                    }

        except Exception as e:
            log.debug(f"Error accessing model API for {model_variant}: {e}")
            continue

    # No matching model found
    log.debug(f"Model not found: {model_name}")
    return None
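
The `/v1` suffix handling can be sketched on its own. The helper name `with_v1_suffix` and the example host are assumptions. This variant strips trailing slashes before checking the suffix, which is slightly more defensive than the listing above: there, a URL ending in `/v1/` fails the `endswith('/v1')` test and would be suffixed a second time.

```python
def with_v1_suffix(server_url: str) -> str:
    """Return the base URL for an OpenAI-compatible API, ending in /v1."""
    url = server_url.rstrip("/")  # drop trailing slashes before checking
    return url if url.endswith("/v1") else url + "/v1"


for raw in [
    "http://example.test:8000",
    "http://example.test:8000/",
    "http://example.test:8000/v1",
    "http://example.test:8000/v1/",
]:
    print(raw, "->", with_v1_suffix(raw))
```

All four inputs normalize to the same base URL, so the client is built consistently regardless of how the server setting was typed.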

`list_models` ¶

List all available models using OpenAI-compatible endpoint.

Returns:

Type	Description
`List[Dict[str, Any]]`	List of model information dictionaries

Source code in src/hands_on_ai/models.py

def list_models() -> List[Dict[str, Any]]:
    """
    List all available models using OpenAI-compatible endpoint.

    Returns:
        List[Dict]: List of model information dictionaries
    """
    server_url = get_server_url()

    # Add /v1 suffix for OpenAI-compatible endpoints
    if not server_url.endswith('/v1'):
        server_url = server_url.rstrip('/') + '/v1'

    try:
        # Create OpenAI client
        client = OpenAI(
            base_url=server_url,
            api_key=get_api_key() or "hands-on-ai"
        )

        # Use OpenAI-compatible models endpoint
        models_response = client.models.list()

        # Convert to the expected format
        models = []
        for model in models_response.data:
            models.append({
                "name": model.id,
                "size": 0,  # Size not available in OpenAI format
                "digest": "",  # Digest not available in OpenAI format
                "modified_at": getattr(model, 'created', 0)
            })

        return models

    except Exception as e:
        log.warning(f"Error listing models: {e}")
        return []

`normalize_model_name` ¶

Normalize the model name to the format expected by Ollama.

Parameters:

Name	Type	Description	Default
`model_name`	`str`	Original model name	required

Returns:

Name	Type	Description
`str`	`str`	Normalized model name

Source code in src/hands_on_ai/models.py

def normalize_model_name(model_name: str) -> str:
    """
    Normalize the model name to the format expected by Ollama.

    Args:
        model_name: Original model name

    Returns:
        str: Normalized model name
    """
    # If model name already has a tag (contains a colon), use it as is
    if ":" in model_name:
        return model_name

    # Otherwise append :latest tag
    return f"{model_name}:latest"

`cache`¶

Optional on-disk response cache.

Caching is off by default. Enable it by setting the `HANDS_ON_AI_CACHE` environment variable to a directory:

export HANDS_ON_AI_CACHE=~/.hands-on-ai/cache

When enabled, `hands_on_ai.chat.get_response` returns a saved answer for an identical (model, system, prompt) triple instead of calling the model again. This is useful in classrooms: reruns are reproducible, repeated calls cost nothing, and a warmed cache works offline.

The cache is intentionally simple: one plain-text file per entry, named by a hash of the inputs. Delete the directory to clear it.
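
A self-contained sketch of the one-file-per-entry idea, run against a temporary directory. The `_key` helper is not shown on this page, so the `key` function here (SHA-256 over the three inputs) is an assumption, and `get`/`put` take the cache directory as an explicit argument rather than reading `HANDS_ON_AI_CACHE`.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Optional


def key(model: str, system: str, prompt: str) -> str:
    """Hypothetical key function: SHA-256 over the three inputs."""
    joined = "\x00".join([model or "", system or "", prompt])
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def put(cache: Path, model: str, system: str, prompt: str, response: str) -> None:
    """Write the response to a file named after the hash of the inputs."""
    cache.mkdir(parents=True, exist_ok=True)
    (cache / f"{key(model, system, prompt)}.txt").write_text(response, encoding="utf-8")


def get(cache: Path, model: str, system: str, prompt: str) -> Optional[str]:
    """Return the stored response, or None on a miss."""
    f = cache / f"{key(model, system, prompt)}.txt"
    return f.read_text(encoding="utf-8") if f.exists() else None


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "cache"
    miss = get(cache, "model-a", "Be brief.", "What is RAG?")
    put(cache, "model-a", "Be brief.", "What is RAG?", "Retrieval-augmented generation.")
    hit = get(cache, "model-a", "Be brief.", "What is RAG?")
    other = get(cache, "model-b", "Be brief.", "What is RAG?")

print(miss, hit, other)
```

Changing any of the three inputs produces a different file name, which is why the same prompt sent to a different model is a miss.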

`cache_dir` ¶

Return the cache directory as a Path if caching is enabled, else None.

Source code in src/hands_on_ai/cache.py

def cache_dir():
    """Return the cache directory as a Path if caching is enabled, else None."""
    d = os.environ.get("HANDS_ON_AI_CACHE")
    return Path(d).expanduser() if d else None

`get` ¶

Return a cached response string, or None on a miss (or when disabled).

Source code in src/hands_on_ai/cache.py

def get(model, system, prompt):
    """Return a cached response string, or None on a miss (or when disabled)."""
    d = cache_dir()
    if d is None:
        return None
    f = d / f"{_key(model, system, prompt)}.txt"
    return f.read_text(encoding="utf-8") if f.exists() else None

`put` ¶

Store a response in the cache. No-op when caching is disabled.

Source code in src/hands_on_ai/cache.py

def put(model, system, prompt, response):
    """Store a response in the cache. No-op when caching is disabled."""
    d = cache_dir()
    if d is None:
        return
    d.mkdir(parents=True, exist_ok=True)
    (d / f"{_key(model, system, prompt)}.txt").write_text(response, encoding="utf-8")
