
API Reference

Auto-generated reference for the public hands_on_ai API, built from the package's docstrings. For task-oriented walkthroughs, see the guides under Learn; this page is the exhaustive symbol-level reference.


Package

hands_on_ai: Your Hands-on AI Toolkit

A modular toolkit for learning AI concepts through hands-on experimentation.


Chat

The chat module provides the high-level conversational API, the bot personalities, and a stateful Conversation helper.

chat.get_response

Core response functionality for the chat module.

chat_completion

Send a list of chat messages to the LLM and return (content, usage).

This is the low-level, multi-message primitive used by both get_response (single-turn) and Conversation (multi-turn).
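
The retry behaviour driven by the retries parameter can be pictured with a minimal sketch. This is not the library's code: `call_with_retries` and `flaky_backend` are hypothetical names, and the pause between attempts is made configurable so the snippet runs instantly.

```python
import time


def call_with_retries(send, retries=2, delay=0.0):
    """Call send() up to `retries` times and return (content, usage)."""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return send()
        except Exception as exc:  # broad on purpose, like the source shown below
            last_error = exc
            if attempt < retries:
                time.sleep(delay)  # pause before the next attempt
    # Every attempt failed: report the error as content, with no usage.
    return (f"Error: {last_error}", None)


calls = {"count": 0}


def flaky_backend():
    """Hypothetical backend that fails once, then succeeds."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise TimeoutError("model still loading")
    return ("Hello!", {"total_tokens": 12})


content, usage = call_with_retries(flaky_backend, retries=2)
print(content, usage)
```

Note that retries counts total attempts, so retries=1 means a single try with no second chance.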

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| messages | list | OpenAI-style message dicts, e.g. [{"role": "system", ...}, {"role": "user", ...}]. | required |
| model | str | LLM model to use (defaults to config setting). | None |
| personality | str | Used to pick a fallback message during retries. | 'friendly' |
| stream | bool | Whether to request streaming output. | False |
| retries | int | Number of attempts before giving up. | 2 |

Returns:

| Type | Description |
|---|---|
| tuple | (content, usage) where usage is a token-count dict, or None when the provider does not report usage (e.g. streaming). |

Source code in src/hands_on_ai/chat/get_response.py
def chat_completion(
    messages: list,
    model: str = None,
    personality: str = "friendly",
    stream: bool = False,
    retries: int = 2,
):
    """
    Send a list of chat messages to the LLM and return ``(content, usage)``.

    This is the low-level, multi-message primitive used by both
    :func:`get_response` (single-turn) and :class:`Conversation` (multi-turn).

    Args:
        messages: OpenAI-style message dicts, e.g.
            ``[{"role": "system", ...}, {"role": "user", ...}]``.
        model: LLM model to use (defaults to config setting).
        personality: Used to pick a fallback message during retries.
        stream: Whether to request streaming output.
        retries: Number of attempts before giving up.

    Returns:
        tuple: ``(content, usage)`` where ``usage`` is a token-count dict, or
        ``None`` when the provider does not report usage (e.g. streaming).
    """
    if model is None:
        from ..config import get_model
        model = get_model()

    _warm_up(model)

    global _last_usage

    # Stream if explicitly requested, or if live REPL printing is enabled.
    do_stream = stream or _print_stream

    for attempt in range(1, retries + 1):
        try:
            client = _build_client()
            response = client.chat.completions.create(
                model=model,
                messages=messages,
                stream=do_stream,
                timeout=10,
            )

            if do_stream:
                # Collect all chunks (usage is not available when streaming),
                # printing each one live when REPL streaming is on.
                content = ""
                for chunk in response:
                    delta = chunk.choices[0].delta.content
                    if delta:
                        if _print_stream:
                            print(delta, end="", flush=True)
                        content += delta
                if _print_stream:
                    print()  # final newline after the streamed response
                _last_usage = None
                return (content or "⚠️ No response from model.", None)

            content = response.choices[0].message.content or "⚠️ No response from model."
            usage = _usage_dict(response)
            _last_usage = usage
            return (content, usage)

        except Exception as e:
            log.warning(f"Error during request (attempt {attempt}): {e}")
            if attempt < retries:
                fallback = _fallbacks.get(personality, _fallbacks.get("default", ["Retrying..."]))
                print(random.choice(fallback))
                time.sleep(1.0)
            else:
                return (f"❌ Error: {str(e)}", None)

get_last_usage

Return token usage from the most recent get_response/bot call (or None).

Source code in src/hands_on_ai/chat/get_response.py
def get_last_usage():
    """Return token usage from the most recent get_response/bot call (or None)."""
    return _last_usage

get_response

Send a single prompt to the LLM and retrieve the model's response.

This is a stateless, single-turn helper: it sends exactly one system message and one user message, with no memory of previous calls. For a multi-turn chat that remembers history, use Conversation.
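
The single-turn flow, including the rule that only real answers are cached, can be sketched as follows. This is an illustration under assumptions: `cached_single_turn` and `fake_complete` are hypothetical names, and a plain dict stands in for the opt-in disk cache.

```python
def cached_single_turn(prompt, system, model, complete, cache):
    """Single-turn request with a dict cache keyed on (model, system, prompt)."""
    if not prompt.strip():
        return "⚠️ Empty prompt."  # same guard as the source shown below
    key = (model, system, prompt)
    if key in cache:
        return cache[key]
    messages = [
        {"role": "system", "content": system},
        {"role": "user", "content": prompt},
    ]
    content = complete(messages)
    # Keep only real answers; error and warning placeholders are not cached.
    if not content.startswith(("❌", "⚠️")):
        cache[key] = content
    return content


sent = []


def fake_complete(messages):
    """Hypothetical stand-in for the model call; records what it was sent."""
    sent.append(messages)
    return "Paris"


cache = {}
first = cached_single_turn("Capital of France?", "Be brief.", "demo-model", fake_complete, cache)
second = cached_single_turn("Capital of France?", "Be brief.", "demo-model", fake_complete, cache)
print(first, second, len(sent))
```

The second call is answered from the cache, so the stand-in backend is contacted only once.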

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| prompt | str | The text prompt to send to the model | required |
| model | str | LLM model to use (defaults to config setting) | None |
| system | str | System message defining bot behavior | 'You are a helpful assistant.' |
| personality | str | Used for fallback character during retries | 'friendly' |
| stream | bool | Whether to request streaming output (default False) | False |
| retries | int | Total number of attempts before giving up (passed through to chat_completion) | 2 |
| return_usage | bool | If True, return (response, usage) where usage is a token-count dict (or None if unavailable) | False |

Returns:

| Type | Description |
|---|---|
| str | AI response or error message. If return_usage is True, a (response, usage) tuple instead. |

Source code in src/hands_on_ai/chat/get_response.py
def get_response(
    prompt: str,
    model: str = None,
    system: str = "You are a helpful assistant.",
    personality: str = "friendly",
    stream: bool = False,
    retries: int = 2,
    return_usage: bool = False,
) -> str:
    """
    Send a single prompt to the LLM and retrieve the model's response.

    This is a stateless, single-turn helper: it sends exactly one system
    message and one user message, with no memory of previous calls. For a
    multi-turn chat that remembers history, use :class:`Conversation`.

    Args:
        prompt (str): The text prompt to send to the model
        model (str): LLM model to use (defaults to config setting)
        system (str): System message defining bot behavior
        personality (str): Used for fallback character during retries
        stream (bool): Whether to request streaming output (default False)
        retries (int): Number of times to retry on error
        return_usage (bool): If True, return ``(response, usage)`` where
            ``usage`` is a token-count dict (or None if unavailable)

    Returns:
        str: AI response or error message. If ``return_usage`` is True, a
        ``(response, usage)`` tuple instead.
    """
    # Check for empty prompt
    if not prompt.strip():
        return ("⚠️ Empty prompt.", None) if return_usage else "⚠️ Empty prompt."

    # Resolve the model now so it is part of the cache key.
    if model is None:
        from ..config import get_model
        model = get_model()

    # Opt-in disk cache (HANDS_ON_AI_CACHE). Skip while streaming to the REPL,
    # where the printed output, not the return value, is what the user sees.
    from .. import cache
    use_cache = not stream and not _print_stream
    if use_cache:
        cached = cache.get(model, system, prompt)
        if cached is not None:
            return (cached, None) if return_usage else cached

    messages = [
        {"role": "system", "content": system},
        {"role": "user", "content": prompt},
    ]
    content, usage = chat_completion(
        messages,
        model=model,
        personality=personality,
        stream=stream,
        retries=retries,
    )

    # Only cache real responses, not error/empty placeholders.
    if use_cache and not content.startswith(("❌", "⚠️")):
        cache.put(model, system, prompt, content)

    return (content, usage) if return_usage else content

set_stream_printing

Enable or disable live token printing to stdout (used by the chat REPL).

Source code in src/hands_on_ai/chat/get_response.py
def set_stream_printing(enabled: bool = True):
    """Enable or disable live token printing to stdout (used by the chat REPL)."""
    global _print_stream
    _print_stream = enabled

stream_response

Like get_response, but yields the response in chunks as it arrives.

This lets you show text as the model generates it, instead of waiting for the whole answer:

for chunk in stream_response("Tell me a short story"):
    print(chunk, end="", flush=True)
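
If the full text is also needed after streaming, the chunks can be collected as they arrive. The sketch below uses a hypothetical `fake_stream` generator in place of stream_response so that it runs without a model server.

```python
def fake_stream(text, size=4):
    """Hypothetical stand-in for stream_response: yields `text` in small pieces."""
    for start in range(0, len(text), size):
        yield text[start:start + size]


pieces = []
for chunk in fake_stream("Once upon a time."):
    pieces.append(chunk)  # a real UI would also print each chunk here

full_text = "".join(pieces)
print(full_text)
```

Per the source shown further down, a failed request does not raise: the error message arrives as a chunk, so callers that need to detect failures have to inspect the collected text.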

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| prompt | str | The text prompt to send to the model. | required |
| model | str | LLM model to use (defaults to config setting). | None |
| system | str | System message defining bot behavior. | 'You are a helpful assistant.' |
| personality | str | Unused here; kept for signature parity with get_response. | 'friendly' |
| retries | int | Unused here; streaming makes a single attempt. | 2 |

Yields:

| Type | Description |
|---|---|
| str | Pieces of the response as they arrive. |

Source code in src/hands_on_ai/chat/get_response.py
def stream_response(
    prompt: str,
    model: str = None,
    system: str = "You are a helpful assistant.",
    personality: str = "friendly",
    retries: int = 2,
):
    """
    Like :func:`get_response`, but yields the response in chunks as it arrives.

    This lets you show text as the model generates it, instead of waiting for the
    whole answer:

        for chunk in stream_response("Tell me a short story"):
            print(chunk, end="", flush=True)

    Args:
        prompt: The text prompt to send to the model.
        model: LLM model to use (defaults to config setting).
        system: System message defining bot behavior.
        personality: Unused here; kept for signature parity with get_response.
        retries: Unused here; streaming makes a single attempt.

    Yields:
        str: Pieces of the response as they arrive.
    """
    if not prompt.strip():
        yield "⚠️ Empty prompt."
        return

    if model is None:
        from ..config import get_model
        model = get_model()

    _warm_up(model)

    try:
        client = _build_client()
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": system},
                {"role": "user", "content": prompt},
            ],
            stream=True,
            timeout=10,
        )
        for chunk in response:
            delta = chunk.choices[0].delta.content
            if delta:
                yield delta
    except Exception as e:
        log.warning(f"Error during streaming request: {e}")
        yield f"❌ Error: {str(e)}"

chat.bots

Bot personality discovery and retrieval.

get_bot

Retrieve a specific bot by name.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Bot name | required |

Returns:

| Type | Description |
|---|---|
| function | Bot function or None if not found |

Source code in src/hands_on_ai/chat/bots.py
def get_bot(name):
    """
    Retrieve a specific bot by name.

    Args:
        name (str): Bot name

    Returns:
        function: Bot function or None if not found
    """
    return list_available_bots().get(name)

get_bot_description

Get the first non-empty line of a bot's docstring.
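
A minimal sketch of the same idea, using hypothetical bot functions (`pirate_bot`, `undocumented_bot`) rather than the package's own personalities:

```python
def first_doc_line(func):
    """Return the first non-empty docstring line, or a placeholder."""
    doc = func.__doc__ or ""
    return next((line.strip() for line in doc.splitlines() if line.strip()), "No description.")


def pirate_bot(prompt):
    """
    Answers like a pirate.

    Longer notes that the description ignores.
    """
    return f"Arr, {prompt}"


def undocumented_bot(prompt):
    return prompt


print(first_doc_line(pirate_bot))
print(first_doc_line(undocumented_bot))
```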

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| bot_func | function | Bot function | required |

Returns:

| Type | Description |
|---|---|
| str | Bot description |

Source code in src/hands_on_ai/chat/bots.py
def get_bot_description(bot_func):
    """
    Get the first non-empty line of a bot's docstring.

    Args:
        bot_func (function): Bot function

    Returns:
        str: Bot description
    """
    if not bot_func.__doc__:
        return "No description."
    return next((line.strip() for line in bot_func.__doc__.splitlines() if line.strip()), "No description.")

list_available_bots

Discover available bot functions defined in the personalities module. A bot must be public (no leading underscore), have a name ending in _bot, and accept exactly one argument named 'prompt'.
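
The filtering rules can be tried out on a stand-in namespace. Everything here is illustrative: `discover_bots`, `fake_personalities`, and the four sample functions are hypothetical and only mirror the checks visible in the source below.

```python
import inspect
from types import SimpleNamespace


def friendly_bot(prompt):
    """Replies in a warm, encouraging tone."""
    return f"Happy to help with: {prompt}"


def shout_bot(prompt, volume=11):  # two parameters, so it is skipped
    return prompt.upper()


def _hidden_bot(prompt):  # private name, so it is skipped
    return prompt


def helper(prompt):  # no _bot suffix, so it is skipped
    return prompt


# Hypothetical stand-in for the personalities module.
fake_personalities = SimpleNamespace(
    friendly_bot=friendly_bot,
    shout_bot=shout_bot,
    _hidden_bot=_hidden_bot,
    helper=helper,
)


def discover_bots(namespace):
    """Collect public callables named *_bot that take exactly one `prompt` argument."""
    bots = {}
    for name, obj in inspect.getmembers(namespace):
        if callable(obj) and not name.startswith("_") and name.endswith("_bot"):
            params = list(inspect.signature(obj).parameters.values())
            if len(params) == 1 and params[0].name == "prompt":
                bots[name] = obj
    return bots


bots = discover_bots(fake_personalities)
print(sorted(bots))
```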

Returns:

| Type | Description |
|---|---|
| dict | Dictionary of bot names and functions |

Source code in src/hands_on_ai/chat/bots.py
def list_available_bots():
    """
    Discover available bot functions defined in personalities module.
    Bots must accept a single 'prompt' argument and not be private.

    Returns:
        dict: Dictionary of bot names and functions
    """
    bots = {}
    for name, obj in inspect.getmembers(personalities):
        if (
            callable(obj)
            and not name.startswith("_")
            and name.endswith("_bot")  # Enforce _bot suffix
        ):
            sig = inspect.signature(obj)
            params = list(sig.parameters.values())
            if len(params) == 1 and params[0].name == "prompt":
                bots[name] = obj
    return bots

chat.conversation

Multi-turn conversation memory for the chat module.

An LLM is stateless: each request only sees the messages you send it. To make a bot that "remembers" earlier turns, you keep the running transcript and resend it every time. Conversation does exactly that bookkeeping for you, so you can focus on the conversation instead of the plumbing.

Example

from hands_on_ai.chat import Conversation

chat = Conversation(system="You are a helpful tutor.")
chat.ask("My name is Sam.")
chat.ask("What's my name?")  # remembers "Sam"
print(chat.total_tokens)     # tokens used across the whole chat

Conversation

A stateful chat that remembers the conversation history.
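
To see why resending the transcript produces "memory", here is a stripped-down sketch. `MiniConversation` and `echo_backend` are hypothetical; the backend is injected so the example runs without a model and simply reports how many messages it was shown.

```python
class MiniConversation:
    """Keeps the transcript and resends it on every turn."""

    def __init__(self, complete, system="You are a helpful assistant."):
        self.complete = complete  # callable: list of messages -> reply text
        self.messages = [{"role": "system", "content": system}]

    def ask(self, prompt):
        self.messages.append({"role": "user", "content": prompt})
        reply = self.complete(self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def echo_backend(messages):
    """Hypothetical model: reports how many messages it was shown."""
    return f"I can see {len(messages)} messages."


chat = MiniConversation(echo_backend)
print(chat.ask("My name is Sam."))
print(chat.ask("What's my name?"))
```

Each turn adds two entries (user and assistant), so the request grows with every call; long chats cost more tokens per turn.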

Source code in src/hands_on_ai/chat/conversation.py
class Conversation:
    """A stateful chat that remembers the conversation history."""

    def __init__(
        self,
        system: str = "You are a helpful assistant.",
        model: str = None,
        personality: str = "friendly",
    ):
        """
        Args:
            system: System message that defines the bot's behavior.
            model: LLM model to use (defaults to config setting).
            personality: Used for fallback character during retries.
        """
        self.system = system
        self.model = model
        self.personality = personality
        # The transcript we resend each turn. The system message stays first.
        self.messages = [{"role": "system", "content": system}]
        # Token accounting (None until the provider reports usage).
        self.last_usage = None
        self.total_tokens = 0

    def ask(self, prompt: str, stream: bool = False) -> str:
        """
        Send ``prompt`` as the next user turn and return the model's reply.

        The user message and the reply are both appended to the history, so the
        next call automatically includes everything said so far.
        """
        self.messages.append({"role": "user", "content": prompt})
        content, usage = chat_completion(
            self.messages,
            model=self.model,
            personality=self.personality,
            stream=stream,
        )
        self.messages.append({"role": "assistant", "content": content})

        self.last_usage = usage
        if usage and usage.get("total_tokens"):
            self.total_tokens += usage["total_tokens"]

        return content

    def reset(self):
        """Clear the conversation history, keeping the original system prompt."""
        self.messages = [{"role": "system", "content": self.system}]
        self.last_usage = None
        self.total_tokens = 0

    def history(self) -> list:
        """Return the user/assistant turns (excluding the system message)."""
        return [m for m in self.messages if m["role"] != "system"]

    def save(self, path):
        """Save the conversation (system prompt, history, token total) to JSON."""
        data = {
            "system": self.system,
            "model": self.model,
            "personality": self.personality,
            "messages": self.messages,
            "total_tokens": self.total_tokens,
        }
        Path(path).write_text(json.dumps(data, indent=2), encoding="utf-8")

    @classmethod
    def load(cls, path):
        """Recreate a conversation previously written with :meth:`save`."""
        data = json.loads(Path(path).read_text(encoding="utf-8"))
        conv = cls(
            system=data.get("system", "You are a helpful assistant."),
            model=data.get("model"),
            personality=data.get("personality", "friendly"),
        )
        conv.messages = data.get("messages", conv.messages)
        conv.total_tokens = data.get("total_tokens", 0)
        return conv

__init__

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| system | str | System message that defines the bot's behavior. | 'You are a helpful assistant.' |
| model | str | LLM model to use (defaults to config setting). | None |
| personality | str | Used for fallback character during retries. | 'friendly' |

Source code in src/hands_on_ai/chat/conversation.py
def __init__(
    self,
    system: str = "You are a helpful assistant.",
    model: str = None,
    personality: str = "friendly",
):
    """
    Args:
        system: System message that defines the bot's behavior.
        model: LLM model to use (defaults to config setting).
        personality: Used for fallback character during retries.
    """
    self.system = system
    self.model = model
    self.personality = personality
    # The transcript we resend each turn. The system message stays first.
    self.messages = [{"role": "system", "content": system}]
    # Token accounting (None until the provider reports usage).
    self.last_usage = None
    self.total_tokens = 0

ask

Send prompt as the next user turn and return the model's reply.

The user message and the reply are both appended to the history, so the next call automatically includes everything said so far.

Source code in src/hands_on_ai/chat/conversation.py
def ask(self, prompt: str, stream: bool = False) -> str:
    """
    Send ``prompt`` as the next user turn and return the model's reply.

    The user message and the reply are both appended to the history, so the
    next call automatically includes everything said so far.
    """
    self.messages.append({"role": "user", "content": prompt})
    content, usage = chat_completion(
        self.messages,
        model=self.model,
        personality=self.personality,
        stream=stream,
    )
    self.messages.append({"role": "assistant", "content": content})

    self.last_usage = usage
    if usage and usage.get("total_tokens"):
        self.total_tokens += usage["total_tokens"]

    return content

history

Return the user/assistant turns (excluding the system message).

Source code in src/hands_on_ai/chat/conversation.py
def history(self) -> list:
    """Return the user/assistant turns (excluding the system message)."""
    return [m for m in self.messages if m["role"] != "system"]

load classmethod

Recreate a conversation previously written with save.

Source code in src/hands_on_ai/chat/conversation.py
@classmethod
def load(cls, path):
    """Recreate a conversation previously written with :meth:`save`."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    conv = cls(
        system=data.get("system", "You are a helpful assistant."),
        model=data.get("model"),
        personality=data.get("personality", "friendly"),
    )
    conv.messages = data.get("messages", conv.messages)
    conv.total_tokens = data.get("total_tokens", 0)
    return conv

reset

Clear the conversation history, keeping the original system prompt.

Source code in src/hands_on_ai/chat/conversation.py
def reset(self):
    """Clear the conversation history, keeping the original system prompt."""
    self.messages = [{"role": "system", "content": self.system}]
    self.last_usage = None
    self.total_tokens = 0

save

Save the conversation (system prompt, history, token total) to JSON.
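
The on-disk layout is plain JSON, so a round trip can be sketched with the standard library alone. The `state` dict below mirrors the keys written by the source that follows; the file name `chat.json` and the sample messages are made up for illustration.

```python
import json
import tempfile
from pathlib import Path

state = {
    "system": "You are a helpful tutor.",
    "model": None,  # None means "use the configured default"
    "personality": "friendly",
    "messages": [
        {"role": "system", "content": "You are a helpful tutor."},
        {"role": "user", "content": "My name is Sam."},
        {"role": "assistant", "content": "Nice to meet you, Sam!"},
    ],
    "total_tokens": 42,
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "chat.json"  # hypothetical file name
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")
    restored = json.loads(path.read_text(encoding="utf-8"))

print(restored == state)
```

last_usage is not among the saved keys, so a conversation restored with load starts with last_usage set to None.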

Source code in src/hands_on_ai/chat/conversation.py
def save(self, path):
    """Save the conversation (system prompt, history, token total) to JSON."""
    data = {
        "system": self.system,
        "model": self.model,
        "personality": self.personality,
        "messages": self.messages,
        "total_tokens": self.total_tokens,
    }
    Path(path).write_text(json.dumps(data, indent=2), encoding="utf-8")

RAG

Retrieval-augmented generation helpers — chunking, embedding, indexing, and similarity search.

rag.utils

Core RAG utilities for document loading, chunking, embedding, and retrieval.

chunk_text

Split text into consecutive, non-overlapping chunks of chunk_size words each; the last chunk may be shorter.
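
A small worked example makes the word-based splitting concrete. `chunk_words` is a hypothetical stand-alone copy of the logic that takes the chunk size explicitly instead of reading it from the config.

```python
def chunk_words(text, chunk_size):
    """Split text into chunks of at most `chunk_size` whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


chunks = chunk_words("the quick  brown fox\njumps over the lazy dog", chunk_size=4)
print(chunks)
```

Because the text is split on any whitespace and rejoined with single spaces, original line breaks and repeated spaces do not survive chunking.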

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| text | | Text to chunk | required |
| chunk_size | | Words per chunk (default from config) | None |

Returns:

| Type | Description |
|---|---|
| list | List of text chunks |

Source code in src/hands_on_ai/rag/utils.py
def chunk_text(text, chunk_size=None):
    """
    Split text into chunks of approximately equal size.

    Args:
        text: Text to chunk
        chunk_size: Words per chunk (default from config)

    Returns:
        list: List of text chunks
    """
    if chunk_size is None:
        chunk_size = get_chunk_size()

    words = text.split()
    return [" ".join(words[i:i+chunk_size]) for i in range(0, len(words), chunk_size)]

copy_sample_docs

Copy sample documents to a destination directory.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| destination | | Path to copy documents to (default: a sample_docs folder in the current directory) | None |

Returns:

| Type | Description |
|---|---|
| Path | Path to the destination directory |

Source code in src/hands_on_ai/rag/utils.py
def copy_sample_docs(destination=None):
    """
    Copy sample documents to a destination directory.

    Args:
        destination: Path to copy documents to (default: current directory)

    Returns:
        Path: Path to the destination directory
    """
    if destination is None:
        destination = Path.cwd() / 'sample_docs'
    else:
        destination = Path(destination)

    destination.mkdir(exist_ok=True, parents=True)
    sample_path = get_sample_docs_path()

    # Copy all files
    for file_path in sample_path.iterdir():
        if file_path.is_file():
            shutil.copy2(file_path, destination / file_path.name)

    return destination

get_embeddings

Get embeddings for text chunks using the OpenAI-compatible embeddings API.

This uses the same /v1 endpoint and client as the rest of the package, so it works with any OpenAI-compatible provider (Ollama, OpenAI, etc.) rather than only Ollama's native /api/embeddings endpoint.
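
Two details of this function are easy to check in isolation: how the server URL is normalised, and how results are put back into input order. The helper names and the localhost URL below are illustrative assumptions, and `SimpleNamespace` objects stand in for the provider's response items.

```python
from types import SimpleNamespace


def with_v1_suffix(server_url):
    """Append /v1 unless the URL already ends with it."""
    if not server_url.endswith("/v1"):
        server_url = server_url.rstrip("/") + "/v1"
    return server_url


def in_input_order(items):
    """Return embedding vectors sorted by each item's `index` field."""
    return [item.embedding for item in sorted(items, key=lambda item: item.index)]


# Hypothetical response items that arrived out of order.
items = [
    SimpleNamespace(index=1, embedding=[0.0, 1.0]),
    SimpleNamespace(index=0, embedding=[1.0, 0.0]),
]

print(with_v1_suffix("http://localhost:11434/"))
print(in_input_order(items))
```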

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| chunks | | List of text chunks | required |
| model | | Embedding model to use (default from config) | None |

Returns:

| Type | Description |
|---|---|
| ndarray | Array of embedding vectors |

Raises:

| Type | Description |
|---|---|
| Exception | If embedding request fails |

Source code in src/hands_on_ai/rag/utils.py
def get_embeddings(chunks, model=None):
    """
    Get embeddings for text chunks using the OpenAI-compatible embeddings API.

    This uses the same ``/v1`` endpoint and client as the rest of the package,
    so it works with any OpenAI-compatible provider (Ollama, OpenAI, etc.)
    rather than only Ollama's native ``/api/embeddings`` endpoint.

    Args:
        chunks: List of text chunks
        model: Embedding model to use (default from config)

    Returns:
        ndarray: Array of embedding vectors

    Raises:
        Exception: If embedding request fails
    """
    if model is None:
        model = get_embedding_model()

    server_url = get_server_url()
    # Add /v1 suffix for OpenAI-compatible endpoints
    if not server_url.endswith("/v1"):
        server_url = server_url.rstrip("/") + "/v1"

    client = OpenAI(
        base_url=server_url,
        api_key=get_api_key() or "hands-on-ai",
        timeout=30,
    )

    response = client.embeddings.create(model=model, input=list(chunks))
    # Sort by index so vector order matches the input chunk order.
    ordered = sorted(response.data, key=lambda item: item.index)
    return np.array([item.embedding for item in ordered])

get_sample_docs_path

Get the path to the sample document directory.

Returns:

| Type | Description |
|---|---|
| Path | Path object to the sample documents directory |

Source code in src/hands_on_ai/rag/utils.py
def get_sample_docs_path():
    """
    Get the path to the sample document directory.

    Returns:
        Path: Path object to the sample documents directory
    """
    try:
        # For Python 3.9+
        with importlib.resources.path('hands_on_ai.rag.data', 'samples') as path:
            return path
    except Exception:
        # Fallback for older Python or direct file access
        module_path = Path(__file__).parent
        return module_path / 'data' / 'samples'

get_top_k

Retrieve top k similar chunks for a query.
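
The ranking step can be sketched in pure Python. This is a simplified illustration, not the library's implementation: it swaps the array-based cosine_similarity call for a hand-written `cosine` helper, and the two-dimensional vectors, chunk texts, and file names are invented.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, vectors, chunks, sources, k=3):
    """Return the k (chunk, source) pairs most similar to the query, best first."""
    sims = [cosine(query_vec, vec) for vec in vectors]
    order = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k]
    return [(chunks[i], sources[i]) for i in order], [sims[i] for i in order]


# Toy two-dimensional "embeddings"; real ones come from the embedding model.
vectors = [[1.0, 0.0], [0.0, 1.0], [0.8, 0.6]]
chunks = ["about cats", "about tax law", "about pets"]
sources = ["cats.txt", "tax.md", "pets.txt"]

results, scores = top_k([1.0, 0.0], vectors, chunks, sources, k=2)
print(results)
```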

Parameters:

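| Name | Type | Description | Default |
|---|---|---|---|
| query | | Search query | required |
| index_path | | Path to index file | required |
| k | | Number of results to return | 3 |
| return_scores | | Whether to include similarity scores | False |

Returns:

| Type | Description |
|---|---|
| list | List of (chunk, source) tuples, most similar first. When return_scores is True, a (results, scores) tuple whose second element lists the matching similarity scores. |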

Source code in src/hands_on_ai/rag/utils.py
def get_top_k(query, index_path, k=3, return_scores=False):
    """
    Retrieve top k similar chunks for a query.

    Args:
        query: Search query
        index_path: Path to index file
        k: Number of results to return
        return_scores: Whether to include similarity scores

    Returns:
        list: List of (chunk, source) tuples, optionally with scores
    """
    vectors, chunks, sources = load_index_with_sources(index_path)
    query_vector = get_embeddings([query])[0].reshape(1, -1)
    sims = cosine_similarity(query_vector, vectors)[0]
    top_indices = sims.argsort()[-k:][::-1]

    top_chunks = [chunks[i] for i in top_indices]
    top_sources = [sources[i] for i in top_indices]
    top_scores = [sims[i] for i in top_indices]

    if return_scores:
        return list(zip(top_chunks, top_sources)), top_scores
    return list(zip(top_chunks, top_sources))

list_sample_docs

List all available sample documents.

Returns:

| Type | Description |
|---|---|
| list | List of sample document filenames |

Source code in src/hands_on_ai/rag/utils.py
def list_sample_docs():
    """
    List all available sample documents.

    Returns:
        list: List of sample document filenames
    """
    sample_path = get_sample_docs_path()
    return [f.name for f in sample_path.iterdir() if f.is_file()]

load_index_with_sources

Load RAG index with source tracking.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| path | | Path to index file | required |

Returns:

| Type | Description |
|---|---|
| tuple | (vectors, chunks, sources) |

Source code in src/hands_on_ai/rag/utils.py
def load_index_with_sources(path):
    """
    Load RAG index with source tracking.

    Args:
        path: Path to index file

    Returns:
        tuple: (vectors, chunks, sources)
    """
    # allow_pickle=False prevents arbitrary code execution from a malicious
    # index file. chunks/sources are saved as plain string arrays, which load
    # fine without pickling.
    data = np.load(path, allow_pickle=False)
    return data["vectors"], data["chunks"], data["sources"]

load_text_file

Load text from various file formats.
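
The dispatch on file extension can be illustrated with a reduced sketch that handles only the plain-text formats. `read_plain_text` is a hypothetical helper; the .docx and .pdf branches are left out because they need third-party packages.

```python
import tempfile
from pathlib import Path


def read_plain_text(path):
    """Read .txt/.md files; reject anything else, as this sketch has no converters."""
    ext = path.suffix.lower()
    if ext in (".txt", ".md"):
        return path.read_text(encoding="utf-8")
    raise ValueError(f"Unsupported file type: {ext}")


with tempfile.TemporaryDirectory() as tmp:
    note = Path(tmp) / "NOTES.MD"  # upper-case suffix still matches
    note.write_text("# Heading\nBody text.", encoding="utf-8")
    text = read_plain_text(note)

    try:
        read_plain_text(Path(tmp) / "table.csv")
        error = None
    except ValueError as exc:
        error = str(exc)

print(text)
print(error)
```

The argument must be a Path object rather than a string, since the function reads its suffix attribute.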

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| path | Path | Path to file | required |

Returns:

| Type | Description |
|---|---|
| str | Extracted text content |

Raises:

| Type | Description |
|---|---|
| ImportError | If required dependencies are missing |
| ValueError | If file type is unsupported |

Source code in src/hands_on_ai/rag/utils.py
def load_text_file(path: Path) -> str:
    """
    Load text from various file formats.

    Args:
        path: Path to file

    Returns:
        str: Extracted text content

    Raises:
        ImportError: If required dependencies are missing
        ValueError: If file type is unsupported
    """
    ext = path.suffix.lower()

    if ext in [".txt", ".md"]:
        return path.read_text(encoding="utf-8")

    elif ext == ".docx":
        try:
            import docx
        except ImportError:
            raise ImportError("`python-docx` is needed to read .docx files (it ships with hands-on-ai). Try reinstalling, or: pip install python-docx")
        doc = docx.Document(path)
        return "\n".join(p.text for p in doc.paragraphs if p.text.strip())

    elif ext == ".pdf":
        try:
            import fitz  # PyMuPDF
        except ImportError:
            raise ImportError("`pymupdf` is needed to read .pdf files (it ships with hands-on-ai). Try reinstalling, or: pip install pymupdf")
        with fitz.open(path) as doc:
            return "\n".join(page.get_text() for page in doc)

    else:
        raise ValueError(f"❌ Unsupported file type: {ext}. Supported: .txt, .md, .docx, .pdf")

save_index_with_sources

Save RAG index with source tracking.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| vectors | | Embedding vectors | required |
| chunks | | Text chunks | required |
| sources | | Source information for each chunk | required |
| path | | Path to save index file | required |

Source code in src/hands_on_ai/rag/utils.py
def save_index_with_sources(vectors, chunks, sources, path):
    """
    Save RAG index with source tracking.

    Args:
        vectors: Embedding vectors
        chunks: Text chunks
        sources: Source information for each chunk
        path: Path to save index file
    """
    np.savez(path, vectors=vectors, chunks=np.array(chunks), sources=np.array(sources))

Agent

Tool-using agent core: register tools, list them, and run the agent loop.

agent.core

Core agent functionality for ReAct-style reasoning and tool use.

list_tools

List all registered tools.

Returns:

| Type | Description |
|---|---|
| list | List of tool information dictionaries |

Source code in src/hands_on_ai/agent/core.py
def list_tools():
    """
    List all registered tools.

    Returns:
        list: List of tool information dictionaries
    """
    return [
        {"name": info["name"], "description": info["description"]}
        for info in _tools.values()
    ]

register_tool

Register a tool with the agent.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| name | str | Tool name | required |
| description | str | Tool description | required |
| function | Callable | Tool function | required |

Source code in src/hands_on_ai/agent/core.py
def register_tool(name: str, description: str, function: Callable):
    """
    Register a tool with the agent.

    Args:
        name: Tool name
        description: Tool description
        function: Tool function
    """
    _tools[name] = {
        "name": name,
        "description": description,
        "function": function
    }
    log.debug(f"Registered tool: {name}")
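
A minimal, self-contained sketch of the registry pattern that register_tool and list_tools implement. It mirrors the listings above with a local dictionary instead of importing the package; the `tools` dict and the `add` tool are illustrative assumptions, not part of hands_on_ai.

```python
from typing import Callable, Dict, List

# Local stand-in for the module-level registry (assumption: a plain dict keyed by tool name).
tools: Dict[str, dict] = {}


def register_tool(name: str, description: str, function: Callable) -> None:
    """Store the tool under its name; registering the same name again overwrites it."""
    tools[name] = {"name": name, "description": description, "function": function}


def list_tools() -> List[dict]:
    """Return only the public fields, leaving the callable out."""
    return [{"name": t["name"], "description": t["description"]} for t in tools.values()]


def add(a: float, b: float) -> float:
    """Hypothetical example tool."""
    return a + b


register_tool("add", "Add two numbers", add)
register_tool("add", "Add two numbers together", add)  # same name: replaces the first entry

listing = list_tools()
result = tools["add"]["function"](2, 3)
```

Because the registry is keyed by name, a second registration under the same name silently replaces the first, so unique tool names matter.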

run_agent

Run the agent with the given prompt.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| prompt | str | User question or instruction | required |
| model | Optional[str] | LLM model to use, defaults to configured model | None |
| format | str | Format to use ("react", "json", or "auto") | 'auto' |
| max_iterations | int | Maximum number of tool use iterations | 5 |
| verbose | bool | Whether to print intermediate steps | False |

Returns:

| Type | Description |
|------|-------------|
| str | Final agent response |

Source code in src/hands_on_ai/agent/core.py
def run_agent(
    prompt: str, 
    model: Optional[str] = None, 
    format: str = "auto",
    max_iterations: int = 5, 
    verbose: bool = False
) -> str:
    """
    Run the agent with the given prompt.

    Args:
        prompt: User question or instruction
        model: LLM model to use, defaults to configured model
        format: Format to use ("react", "json", or "auto")
        max_iterations: Maximum number of tool use iterations
        verbose: Whether to print intermediate steps

    Returns:
        str: Final agent response
    """
    # Get model from config if not specified
    if model is None:
        model = get_model()

    # Determine which format to use if set to auto
    if format == "auto":
        format = detect_best_format(model)

    if verbose:
        log.info(f"Using {format} format for model {model}")

    # Use JSON format for smaller models
    if format == "json":
        return run_json_agent(prompt, _tools, model, max_iterations, verbose)

    # Otherwise use the original ReAct format
    return _run_react_agent(prompt, model, max_iterations, verbose)
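
The format selection at the top of run_agent can be sketched in isolation. This is a simplified illustration rather than the package's code: `resolve_format` and `fake_detect` are hypothetical names, and the detector is a stub standing in for detect_best_format.

```python
from typing import Callable


def resolve_format(requested: str, model: str, detect: Callable[[str], str]) -> str:
    """Return the concrete format: 'auto' defers to the detector, anything else passes through."""
    return detect(model) if requested == "auto" else requested


def fake_detect(model: str) -> str:
    """Hypothetical detector: pretend only names containing '70b' handle ReAct."""
    return "react" if "70b" in model.lower() else "json"


auto_small = resolve_format("auto", "llama3:8b", fake_detect)
auto_large = resolve_format("auto", "llama3-70b", fake_detect)
forced = resolve_format("react", "llama3:8b", fake_detect)
```

In the listing above, any resolved value other than "json" falls through to the ReAct path, so an unrecognized format string behaves like "react".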

Workflow

Multi-step pipeline runner for chaining steps together.

workflow.runner

A tiny file-based workflow runner: the Interpretable Context Methodology (ICM).

Instead of a coordination framework, a workflow is just a folder of numbered stages. Each stage has a CONTEXT.md (its instructions) and an output/ folder. One orchestrating model reads each stage's instructions plus the previous stage's output, and writes a new readable file. A human reviews (and can edit) the output between stages.

workspace/
├── CONTEXT.md            # optional: shared system prompt / overall goal
├── references/           # optional: stable rules (the "factory")
└── stages/
    ├── 01_research/
    │   ├── CONTEXT.md    # what this stage should do
    │   └── output/       # output.md is written here
    └── 02_draft/
        ├── CONTEXT.md
        └── output/

Run one stage at a time and review the output file before continuing. This runner is deliberately sequential and human-in-the-loop, not an autonomous loop:

from hands_on_ai.workflow import Pipeline

pipe = Pipeline("workspace")
pipe.status()        # show stages and which are done
pipe.run_next()      # runs stage 01, writes output.md, stops for review
# ...open stages/01_research/output/output.md, edit if needed...
pipe.run_next()      # runs stage 02 using stage 01's reviewed output

Pipeline

Run a folder-based (ICM) workflow one reviewable stage at a time.

Source code in src/hands_on_ai/workflow/runner.py
class Pipeline:
    """Run a folder-based (ICM) workflow one reviewable stage at a time."""

    def __init__(self, path):
        self.root = Path(path)
        self.stages_dir = self.root / "stages"
        if not self.stages_dir.is_dir():
            raise FileNotFoundError(f"No 'stages/' folder found in {self.root}")

    # --- structure helpers ---

    def _stages(self):
        """Stage folders in numbered order."""
        return sorted(
            (p for p in self.stages_dir.iterdir() if p.is_dir()),
            key=lambda p: p.name,
        )

    @staticmethod
    def _output_path(stage):
        return stage / "output" / OUTPUT_NAME

    def _is_done(self, stage):
        out = self._output_path(stage)
        return out.exists() and out.read_text(encoding="utf-8").strip() != ""

    @staticmethod
    def _read(path):
        return path.read_text(encoding="utf-8").strip() if path.exists() else ""

    def _references(self, stage):
        """Concatenate workspace-level and stage-level reference files (the 'factory')."""
        texts = []
        for refs_dir in (self.root / "references", stage / "references"):
            if refs_dir.is_dir():
                for f in sorted(refs_dir.glob("*.md")):
                    texts.append(self._read(f))
        return "\n\n".join(t for t in texts if t)

    def _build_messages(self, stage, prev_stage):
        """Assemble the (system, prompt) for a stage from its contract + context."""
        system = self._read(self.root / "CONTEXT.md") or _DEFAULT_SYSTEM
        contract = self._read(stage / "CONTEXT.md") or f"# {stage.name}"

        parts = [contract]
        refs = self._references(stage)
        if refs:
            parts.append("## References (rules to follow)\n\n" + refs)
        if prev_stage is not None:
            prev = self._read(self._output_path(prev_stage))
            if prev:
                parts.append("## Input (output of the previous stage)\n\n" + prev)

        return system, "\n\n".join(parts)

    # --- running ---

    def status(self):
        """Print and return ``[(stage_name, done), ...]``."""
        rows = [(s.name, self._is_done(s)) for s in self._stages()]
        for name, done in rows:
            print(f"  [{'x' if done else ' '}] {name}")
        return rows

    def run_next(self, model: str = None):
        """
        Run the next not-yet-completed stage, write its output, and stop.

        This is the human-in-the-loop default: run one stage, then review (and
        optionally edit) ``output/output.md`` before calling ``run_next`` again.

        Returns:
            dict with ``stage``, ``output_path`` and ``output``, or ``None`` if
            every stage is already done.
        """
        stages = self._stages()
        for i, stage in enumerate(stages):
            if not self._is_done(stage):
                prev = stages[i - 1] if i > 0 else None
                system, prompt = self._build_messages(stage, prev)
                result = get_response(prompt, system=system, model=model)

                out = self._output_path(stage)
                out.parent.mkdir(parents=True, exist_ok=True)
                out.write_text(result, encoding="utf-8")
                return {"stage": stage.name, "output_path": str(out), "output": result}
        return None

    def run_all(self, model: str = None, max_steps: int = 50):
        """
        Run every remaining stage in order (no review pause between them).

        Use this only once you trust the pipeline. The review-first ``run_next``
        is the recommended way to drive it. ``max_steps`` is a safety bound.
        """
        results = []
        for _ in range(max_steps):
            r = self.run_next(model=model)
            if r is None:
                break
            results.append(r)
        return results

    def reset(self):
        """Delete all stage outputs so the workflow can be re-run from the start."""
        for stage in self._stages():
            out = self._output_path(stage)
            if out.exists():
                out.unlink()

reset

Delete all stage outputs so the workflow can be re-run from the start.

Source code in src/hands_on_ai/workflow/runner.py
def reset(self):
    """Delete all stage outputs so the workflow can be re-run from the start."""
    for stage in self._stages():
        out = self._output_path(stage)
        if out.exists():
            out.unlink()

run_all

Run every remaining stage in order (no review pause between them).

Use this only once you trust the pipeline. The review-first run_next is the recommended way to drive it. max_steps caps how many stages a single call will run (default 50), as a safety bound.

Source code in src/hands_on_ai/workflow/runner.py
def run_all(self, model: str = None, max_steps: int = 50):
    """
    Run every remaining stage in order (no review pause between them).

    Use this only once you trust the pipeline. The review-first ``run_next``
    is the recommended way to drive it. ``max_steps`` is a safety bound.
    """
    results = []
    for _ in range(max_steps):
        r = self.run_next(model=model)
        if r is None:
            break
        results.append(r)
    return results

run_next

Run the next not-yet-completed stage, write its output, and stop.

This is the human-in-the-loop default: run one stage, then review (and optionally edit) output/output.md before calling run_next again.

Returns:

A dict with stage, output_path and output, or None if every stage is already done.

Source code in src/hands_on_ai/workflow/runner.py
def run_next(self, model: str = None):
    """
    Run the next not-yet-completed stage, write its output, and stop.

    This is the human-in-the-loop default: run one stage, then review (and
    optionally edit) ``output/output.md`` before calling ``run_next`` again.

    Returns:
        dict with ``stage``, ``output_path`` and ``output``, or ``None`` if
        every stage is already done.
    """
    stages = self._stages()
    for i, stage in enumerate(stages):
        if not self._is_done(stage):
            prev = stages[i - 1] if i > 0 else None
            system, prompt = self._build_messages(stage, prev)
            result = get_response(prompt, system=system, model=model)

            out = self._output_path(stage)
            out.parent.mkdir(parents=True, exist_ok=True)
            out.write_text(result, encoding="utf-8")
            return {"stage": stage.name, "output_path": str(out), "output": result}
    return None
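
To see the one-stage-at-a-time behaviour without a model server, here is a stripped-down sketch of the same idea. It is not the package's implementation: `run_next` below is a standalone function, references and the workspace-level CONTEXT.md are left out, and `echo_model` is a hypothetical stand-in for the LLM call.

```python
import tempfile
from pathlib import Path
from typing import Callable, Optional


def run_next(root: Path, model_fn: Callable[[str], str]) -> Optional[str]:
    """Run the first stage whose output/output.md is missing or empty; return its name."""
    stages = sorted(p for p in (root / "stages").iterdir() if p.is_dir())
    for i, stage in enumerate(stages):
        out = stage / "output" / "output.md"
        if out.exists() and out.read_text(encoding="utf-8").strip():
            continue  # this stage is already done
        prompt = (stage / "CONTEXT.md").read_text(encoding="utf-8")
        if i > 0:
            # Feed the previous stage's (possibly human-edited) output forward.
            prev = (stages[i - 1] / "output" / "output.md").read_text(encoding="utf-8")
            prompt += "\n\n## Input (output of the previous stage)\n\n" + prev
        out.parent.mkdir(parents=True, exist_ok=True)
        out.write_text(model_fn(prompt), encoding="utf-8")
        return stage.name
    return None


def echo_model(prompt: str) -> str:
    """Hypothetical stand-in for the LLM call: reports the prompt length."""
    return f"received {len(prompt)} characters"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("01_research", "02_draft"):
        (root / "stages" / name).mkdir(parents=True)
        (root / "stages" / name / "CONTEXT.md").write_text(f"# {name}", encoding="utf-8")

    first = run_next(root, echo_model)   # runs stage 01 and stops
    second = run_next(root, echo_model)  # runs stage 02 using stage 01's output
    third = run_next(root, echo_model)   # nothing left to run
```

Completion is judged purely by whether the output file has content, which is why deleting or emptying an output file makes that stage eligible to run again.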

status

Print and return [(stage_name, done), ...].

Source code in src/hands_on_ai/workflow/runner.py
def status(self):
    """Print and return ``[(stage_name, done), ...]``."""
    rows = [(s.name, self._is_done(s)) for s in self._stages()]
    for name, done in rows:
        print(f"  [{'x' if done else ' '}] {name}")
    return rows

init_workspace

Create a starter workspace with numbered stage folders.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| path | | Directory to create the workspace in. | required |
| stages | | List of stage names, e.g. ["research", "draft"] → stages/01_research, stages/02_draft. | required |
| system | str | Optional shared instruction written to the workspace CONTEXT.md. | None |

Returns:

| Type | Description |
|------|-------------|
| Path | The workspace root. |

Source code in src/hands_on_ai/workflow/runner.py
def init_workspace(path, stages, system: str = None):
    """
    Create a starter workspace with numbered stage folders.

    Args:
        path: Directory to create the workspace in.
        stages: List of stage names, e.g. ``["research", "draft"]`` →
            ``stages/01_research``, ``stages/02_draft``.
        system: Optional shared instruction written to the workspace ``CONTEXT.md``.

    Returns:
        Path: the workspace root.
    """
    root = Path(path)
    (root / "references").mkdir(parents=True, exist_ok=True)
    if system:
        (root / "CONTEXT.md").write_text(system, encoding="utf-8")

    for i, name in enumerate(stages, start=1):
        stage = root / "stages" / f"{i:02d}_{name}"
        (stage / "output").mkdir(parents=True, exist_ok=True)
        contract = stage / "CONTEXT.md"
        if not contract.exists():
            contract.write_text(
                f"# Stage {i:02d}: {name}\n\n"
                "Describe what this stage should do with the input it receives.\n",
                encoding="utf-8",
            )
    return root
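
The folder naming used by init_workspace can be shown on its own. This sketch only reproduces the numbering pattern from the listing above; `stage_folder_names` is a hypothetical helper, not a function in the package.

```python
def stage_folder_names(stages):
    """Number stages from 01 in the order given, using the {i:02d}_{name} pattern."""
    return [f"{i:02d}_{name}" for i, name in enumerate(stages, start=1)]


names = stage_folder_names(["research", "draft", "review"])

# Zero-padding keeps alphabetical order equal to numeric order up to 99 stages.
many = stage_folder_names([f"s{n}" for n in range(1, 12)])
```

The zero-padding matters because Pipeline orders stages by sorting folder names, so an unpadded "10_..." folder would sort before "2_...".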

Evaluation

LLM-as-judge scoring for bot outputs.

eval.judge

LLM-as-judge: ask a language model to score an output against criteria.

This is how a lot of modern AI evaluation works: instead of hand-writing graders, you ask a capable model to score a response. It is fast and flexible, but not infallible, so treat the score as a signal, not a verdict.

judge

Ask an LLM to score output against criteria.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| output | | The text to evaluate. | required |
| criteria | | What "good" means here (e.g. "accurate, concise, and friendly"). | required |
| question | | The original question the output answers (optional context). | None |
| model | | LLM model to use (defaults to config). | None |
| scale | | Top of the scoring scale (default 5; 1 is worst). | 5 |

Returns:

| Type | Description |
|------|-------------|
| dict | {"score": int or None, "reasoning": str, "raw": str}. |

Source code in src/hands_on_ai/eval/judge.py
def judge(output, criteria, question=None, model=None, scale=5):
    """
    Ask an LLM to score ``output`` against ``criteria``.

    Args:
        output: The text to evaluate.
        criteria: What "good" means here (e.g. "accurate, concise, and friendly").
        question: The original question the output answers (optional context).
        model: LLM model to use (defaults to config).
        scale: Top of the scoring scale (default 5; 1 is worst).

    Returns:
        dict: ``{"score": int | None, "reasoning": str, "raw": str}``.
    """
    system = (
        "You are a strict but fair evaluator. Score the response against the "
        f"criteria on a scale of 1 to {scale}, where {scale} is best. "
        "Reply with exactly two lines:\n"
        "SCORE: <number>\n"
        "REASONING: <one short sentence>"
    )

    parts = []
    if question:
        parts.append(f"Question:\n{question}")
    parts.append(f"Criteria:\n{criteria}")
    parts.append(f"Response to evaluate:\n{output}")

    reply = get_response("\n\n".join(parts), system=system, model=model)
    return _parse_verdict(reply, scale)
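
The source of _parse_verdict is not shown on this page, so the following is only a guess at how a two-line "SCORE / REASONING" reply could be turned into the documented return shape. `parse_verdict` is a hypothetical function, and rejecting out-of-range scores is an assumption of this sketch.

```python
import re
from typing import Optional


def parse_verdict(reply: str, scale: int = 5) -> dict:
    """Hypothetical parser: extract an integer score within 1..scale plus the reasoning line."""
    score: Optional[int] = None
    m = re.search(r"SCORE:\s*(\d+)", reply, flags=re.IGNORECASE)
    if m and 1 <= int(m.group(1)) <= scale:
        score = int(m.group(1))
    r = re.search(r"REASONING:\s*(.+)", reply, flags=re.IGNORECASE)
    reasoning = r.group(1).strip() if r else ""
    return {"score": score, "reasoning": reasoning, "raw": reply}


good = parse_verdict("SCORE: 4\nREASONING: Accurate and concise.")
bad = parse_verdict("I think this is pretty good overall.")
out_of_range = parse_verdict("SCORE: 9\nREASONING: Great.", scale=5)
```

Whatever the real parser does, callers should handle a score of None, since the documented return type allows it when the model does not follow the requested format.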

Core utilities

Shared configuration, model discovery, and response caching used across the toolkit.

config

Shared configuration for all hands-on-ai modules. Handles server settings, paths, and fallback messages.

ensure_config_dir

Create config directory if it doesn't exist.

Source code in src/hands_on_ai/config.py
def ensure_config_dir():
    """Create config directory if it doesn't exist."""
    CONFIG_DIR.mkdir(exist_ok=True)

get_api_key

Get the API key from config if available.

Source code in src/hands_on_ai/config.py
def get_api_key():
    """Get the API key from config if available."""
    return load_config().get("api_key", "")

get_chunk_size

Get the default chunk size from config.

Source code in src/hands_on_ai/config.py
def get_chunk_size():
    """Get the default chunk size from config."""
    return load_config()["chunk_size"]

get_embedding_model

Get the default embedding model from config.

Source code in src/hands_on_ai/config.py
def get_embedding_model():
    """Get the default embedding model from config."""
    return load_config()["embedding_model"]

get_model

Get the default model from config.

Source code in src/hands_on_ai/config.py
def get_model():
    """Get the default model from config."""
    return load_config()["model"]

get_server_url

Get the server URL from config.

Source code in src/hands_on_ai/config.py
def get_server_url():
    """Get the server URL from config."""
    return load_config()["server"]

load_config

Load configuration by layering three sources, lowest to highest precedence: packaged defaults, the user config file, and environment variables.

Returns:

| Type | Description |
|------|-------------|
| dict | Configuration settings |

Source code in src/hands_on_ai/config.py
def load_config():
    """
    Load configuration from config file or environment variables.

    Returns:
        dict: Configuration settings
    """
    # Precedence (lowest to highest): defaults < config file < environment variables.

    # Start with default configuration
    config = load_default_config()

    # User config file overrides the defaults
    if CONFIG_PATH.exists():
        try:
            with open(CONFIG_PATH, encoding="utf-8") as f:
                file_config = json.load(f)
                # Update only keys that exist in file
                for key in file_config:
                    config[key] = file_config[key]
        except Exception as e:
            log.warning(f"Failed to read config.json: {e}")

    # Environment variables have the highest priority (override file and defaults)
    if "HANDS_ON_AI_SERVER" in os.environ:
        config["server"] = os.environ["HANDS_ON_AI_SERVER"]

    if "HANDS_ON_AI_MODEL" in os.environ:
        config["model"] = os.environ["HANDS_ON_AI_MODEL"]

    if "HANDS_ON_AI_EMBEDDING_MODEL" in os.environ:
        config["embedding_model"] = os.environ["HANDS_ON_AI_EMBEDDING_MODEL"]

    if "HANDS_ON_AI_API_KEY" in os.environ:
        config["api_key"] = os.environ["HANDS_ON_AI_API_KEY"]

    return config
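
The precedence order (defaults, then config file, then environment variables) can be sketched with plain dictionaries. `layer_config` is a hypothetical helper and the server URL and model names are placeholder values; only the four environment variable names come from the listing above.

```python
from typing import Mapping

# Environment variables recognised in the listing above, mapped to their config keys.
ENV_KEYS = {
    "HANDS_ON_AI_SERVER": "server",
    "HANDS_ON_AI_MODEL": "model",
    "HANDS_ON_AI_EMBEDDING_MODEL": "embedding_model",
    "HANDS_ON_AI_API_KEY": "api_key",
}


def layer_config(defaults: Mapping, file_config: Mapping, environ: Mapping) -> dict:
    """Merge the three layers; later layers override earlier ones."""
    config = dict(defaults)
    config.update(file_config)
    for env_name, key in ENV_KEYS.items():
        if env_name in environ:
            config[key] = environ[env_name]
    return config


merged = layer_config(
    defaults={"server": "http://localhost:11434", "model": "default-model", "chunk_size": 500},
    file_config={"model": "file-model"},
    environ={"HANDS_ON_AI_MODEL": "env-model"},
)
```

Note that chunk_size has no environment override in the listing, so it can only be changed through the config file.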

load_default_config

Load the default configuration packaged with HandsOnAI.

Returns:

| Type | Description |
|------|-------------|
| dict | Default configuration settings |

Source code in src/hands_on_ai/config.py
def load_default_config():
    """
    Load the default configuration packaged with HandsOnAI.

    Returns:
        dict: Default configuration settings
    """
    try:
        from importlib.resources import files
        path = files("hands_on_ai.data") / "default_config.json"
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except Exception as e:
        log.warning(f"Failed to read default config: {e}")
        # Fallback to hardcoded defaults if file can't be loaded
        return {
            "server": DEFAULT_SERVER,
            "model": DEFAULT_MODEL,
            "embedding_model": DEFAULT_EMBEDDING_MODEL,
            "chunk_size": DEFAULT_CHUNK_SIZE,
        }

load_fallbacks

Load fallback personality messages from the user override file if present, otherwise from the built-in package data.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| module | str | Module name to load fallbacks for | 'chat' |

Returns:

| Type | Description |
|------|-------------|
| dict | Fallback messages by personality |

Source code in src/hands_on_ai/config.py
def load_fallbacks(module="chat"):
    """
    Load fallback personality messages from user, local, or default locations.

    Args:
        module (str): Module name to load fallbacks for

    Returns:
        dict: Fallback messages by personality
    """
    # First try the user override file
    user_file = CONFIG_DIR / f"{module}_fallbacks.json"

    if user_file.exists():
        try:
            with user_file.open("r", encoding="utf-8") as f:
                return json.load(f)
        except Exception as e:
            log.warning(f"Failed to read user fallbacks: {e}")

    # Otherwise use built-in fallbacks from package data
    try:
        from importlib.resources import files
        path = files(f"hands_on_ai.{module}.data") / "fallbacks.json"
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except Exception as e:
        log.warning(f"Failed to read built-in fallbacks: {e}")
        return {"default": ["Retrying..."]}

save_config

Save configuration to config file.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| config | dict | Configuration settings to save | required |

Source code in src/hands_on_ai/config.py
def save_config(config):
    """
    Save configuration to config file.

    Args:
        config (dict): Configuration settings to save
    """
    ensure_config_dir()
    try:
        with open(CONFIG_PATH, "w", encoding="utf-8") as f:
            json.dump(config, f, indent=2)
    except Exception as e:
        log.warning(f"Failed to write config.json: {e}")

models

Core model utilities for Hands-on AI.

This module provides centralized functionality for working with LLM models:

- Listing available models
- Checking if a model exists
- Getting model information
- Normalizing model names
- Detecting model capabilities

check_model_exists

Check if a model exists on the server.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| model_name | str | Name of the model | required |

Returns:

| Type | Description |
|------|-------------|
| bool | True if the model exists, False otherwise |

Source code in src/hands_on_ai/models.py
def check_model_exists(model_name: str) -> bool:
    """
    Check if a model exists on the server.

    Args:
        model_name: Name of the model

    Returns:
        bool: True if the model exists, False otherwise
    """
    return get_model_info(model_name) is not None

detect_best_format

Determine the best format for the given model based on its capabilities.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| model_name | str | Name of the model | required |

Returns:

| Type | Description |
|------|-------------|
| str | "react" or "json" (default) |

Source code in src/hands_on_ai/models.py
def detect_best_format(model_name: str) -> str:
    """
    Determine the best format for the given model based on its capabilities.

    Args:
        model_name: Name of the model

    Returns:
        str: "react" or "json" (default)
    """
    capabilities = get_model_capabilities(model_name)

    if capabilities["react_format"]:
        return "react"
    return "json"

get_model_capabilities

Determine the capabilities of a given model.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| model_name | str | Name of the model | required |

Returns:

| Type | Description |
|------|-------------|
| Dict[str, bool] | Dictionary of capability flags |

Source code in src/hands_on_ai/models.py
def get_model_capabilities(model_name: str) -> Dict[str, bool]:
    """
    Determine the capabilities of a given model.

    Args:
        model_name: Name of the model

    Returns:
        Dict[str, bool]: Dictionary of capability flags
    """
    # Initialize with default capabilities (conservative)
    capabilities = {
        "react_format": False,
        "json_format": True,
        "function_calling": False,
        "tool_use": False,
        "vision": False
    }

    # Get model info
    model_info = get_model_info(model_name)
    if not model_info:
        return capabilities

    # Check parameters field for model size
    if "parameters" in model_info:
        parameters = model_info.get("parameters", {})

        # Extract model size info
        model_size = 0
        if "num_params" in parameters:
            model_size = parameters["num_params"]
        elif "parameter_count" in parameters:
            model_size = parameters["parameter_count"]

        # Models with at least 30B parameters can likely handle ReAct format
        if model_size >= 30_000_000_000:  # 30B or larger
            capabilities["react_format"] = True
            capabilities["function_calling"] = True
            capabilities["tool_use"] = True

    # Check template/system prompt for function calling capabilities
    template = model_info.get("template", "")
    if "function" in template.lower() or "tool" in template.lower():
        capabilities["react_format"] = True
        capabilities["function_calling"] = True
        capabilities["tool_use"] = True

    # Check model families based on name
    model_name_lower = model_name.lower()

    # Models known to support vision
    vision_models = ["llava", "bakllava", "moondream", "cogvlm"]
    if any(vision_model in model_name_lower for vision_model in vision_models):
        capabilities["vision"] = True

    # Models known to support function calling / tool use
    function_models = [
        "gpt-4", "gpt4", "claude-2", "claude-3", "claude3",
        "llama3-70b", "llama-70b", "mixtral-8x7b"
    ]

    if any(pattern.lower() in model_name_lower for pattern in function_models):
        capabilities["react_format"] = True
        capabilities["function_calling"] = True
        capabilities["tool_use"] = True

    return capabilities
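
The name-matching part of this heuristic can be tried in isolation. The sketch below copies the two pattern lists from the listing above and omits the server lookup and the size and template checks; `capabilities_from_name` and `best_format` are hypothetical names.

```python
VISION_HINTS = ["llava", "bakllava", "moondream", "cogvlm"]
TOOL_HINTS = [
    "gpt-4", "gpt4", "claude-2", "claude-3", "claude3",
    "llama3-70b", "llama-70b", "mixtral-8x7b",
]


def capabilities_from_name(model_name: str) -> dict:
    """Name-based part of the heuristic only; the server lookup is left out."""
    name = model_name.lower()
    tool_capable = any(hint in name for hint in TOOL_HINTS)
    return {
        "react_format": tool_capable,
        "json_format": True,
        "function_calling": tool_capable,
        "tool_use": tool_capable,
        "vision": any(hint in name for hint in VISION_HINTS),
    }


def best_format(model_name: str) -> str:
    """Mirror detect_best_format: ReAct only when the flag is set, JSON otherwise."""
    return "react" if capabilities_from_name(model_name)["react_format"] else "json"


small = best_format("llama3:8b")
large = best_format("Mixtral-8x7B-Instruct")
colon_tag = best_format("llama3:70b")
vision = capabilities_from_name("llava:13b")["vision"]
```

Two things are worth noticing. Matching is by substring, so a colon-tagged name such as "llama3:70b" does not match the "llama3-70b" pattern and falls back to "json". And judging by the get_model_info listing on this page, which returns an empty parameters dict and an empty template, the size and template checks appear unlikely to fire, leaving the name patterns as the deciding signal.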

get_model_info

Look up a model through the OpenAI-compatible models endpoint and return its basic information, or None if it is not found.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| model_name | str | Name of the model | required |

Returns:

| Type | Description |
|------|-------------|
| Optional[Dict[str, Any]] | Basic model information, or None if not found |

Source code in src/hands_on_ai/models.py
def get_model_info(model_name: str) -> Optional[Dict[str, Any]]:
    """
    Check if a model exists using OpenAI-compatible endpoint.

    Args:
        model_name: Name of the model

    Returns:
        Optional[Dict]: Basic model information or None if not found
    """
    # Try variations of the model name
    original_name = model_name
    normalized_name = normalize_model_name(model_name)

    model_variations = [original_name]
    if normalized_name != original_name:
        model_variations.append(normalized_name)

    server_url = get_server_url()

    # Add /v1 suffix for OpenAI-compatible endpoints
    if not server_url.endswith('/v1'):
        server_url = server_url.rstrip('/') + '/v1'

    # Try each variation
    for model_variant in model_variations:
        log.debug(f"Checking model: {model_variant}")

        try:
            # Create OpenAI client
            client = OpenAI(
                base_url=server_url,
                api_key=get_api_key() or "hands-on-ai"
            )

            # Get list of models and check if our model exists
            models_response = client.models.list()

            for model in models_response.data:
                if model.id == model_variant:
                    log.debug(f"Found model: {model_variant}")
                    # Return basic model info in expected format
                    return {
                        "name": model.id,
                        "parameters": {},  # Not available in OpenAI format
                        "template": "",    # Not available in OpenAI format
                        "created": getattr(model, 'created', 0)
                    }

        except Exception as e:
            log.debug(f"Error accessing model API for {model_variant}: {e}")
            continue

    # No matching model found
    log.debug(f"Model not found: {model_name}")
    return None

list_models

List all available models using the OpenAI-compatible endpoint.

Returns:

| Type | Description |
|------|-------------|
| List[Dict[str, Any]] | List of model information dictionaries |

Source code in src/hands_on_ai/models.py
def list_models() -> List[Dict[str, Any]]:
    """
    List all available models using OpenAI-compatible endpoint.

    Returns:
        List[Dict]: List of model information dictionaries
    """
    server_url = get_server_url()

    # Add /v1 suffix for OpenAI-compatible endpoints
    if not server_url.endswith('/v1'):
        server_url = server_url.rstrip('/') + '/v1'

    try:
        # Create OpenAI client
        client = OpenAI(
            base_url=server_url,
            api_key=get_api_key() or "hands-on-ai"
        )

        # Use OpenAI-compatible models endpoint
        models_response = client.models.list()

        # Convert to the expected format
        models = []
        for model in models_response.data:
            models.append({
                "name": model.id,
                "size": 0,  # Size not available in OpenAI format
                "digest": "",  # Digest not available in OpenAI format
                "modified_at": getattr(model, 'created', 0)
            })

        return models

    except Exception as e:
        log.warning(f"Error listing models: {e}")
        return []

normalize_model_name

Normalize the model name to the format expected by Ollama.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| model_name | str | Original model name | required |

Returns:

| Type | Description |
|------|-------------|
| str | Normalized model name |

Source code in src/hands_on_ai/models.py
def normalize_model_name(model_name: str) -> str:
    """
    Normalize the model name to the format expected by Ollama.

    Args:
        model_name: Original model name

    Returns:
        str: Normalized model name
    """
    # If model name already has a tag (contains a colon), use it as is
    if ":" in model_name:
        return model_name

    # Otherwise append :latest tag
    return f"{model_name}:latest"
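
A short illustration of how normalization feeds the lookup in get_model_info, which tries the original name first and then the normalized one. The function is re-declared locally so the snippet runs without the package; `name_variations` is a hypothetical helper that mirrors the variation list built in that listing.

```python
def normalize_model_name(model_name: str) -> str:
    """Append ':latest' unless the name already carries a tag."""
    return model_name if ":" in model_name else f"{model_name}:latest"


def name_variations(model_name: str) -> list:
    """Original name first, then the normalized form if it differs."""
    normalized = normalize_model_name(model_name)
    return [model_name] if normalized == model_name else [model_name, normalized]


untagged = name_variations("llama3")
tagged = name_variations("llama3:8b")
```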

cache

Optional on-disk response cache.

Caching is off by default. Enable it by setting the HANDS_ON_AI_CACHE environment variable to a directory:

export HANDS_ON_AI_CACHE=~/.hands-on-ai/cache

When enabled, hands_on_ai.chat.get_response returns a saved answer for an identical (model, system, prompt) instead of calling the model again. This is useful in classrooms: reruns are reproducible, repeated calls cost nothing, and a warmed cache works offline.

The cache is intentionally simple: one plain-text file per entry, named by a hash of the inputs. Delete the directory to clear it.
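
The one-file-per-entry design can be sketched as follows. The real _key function is not shown on this page, so the hashing scheme here (SHA-256 over a JSON encoding of the three inputs) is an assumption, and `entry_key`, `cache_put`, and `cache_get` are hypothetical names that take the directory explicitly instead of reading HANDS_ON_AI_CACHE.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Optional


def entry_key(model: str, system: str, prompt: str) -> str:
    """Hypothetical key: SHA-256 of the JSON-encoded (model, system, prompt) triple."""
    payload = json.dumps([model, system, prompt])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cache_put(cache_dir: Path, model: str, system: str, prompt: str, response: str) -> None:
    """Write the response to a text file named after the entry key."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    (cache_dir / f"{entry_key(model, system, prompt)}.txt").write_text(response, encoding="utf-8")


def cache_get(cache_dir: Path, model: str, system: str, prompt: str) -> Optional[str]:
    """Return the stored response, or None when no file exists for these inputs."""
    f = cache_dir / f"{entry_key(model, system, prompt)}.txt"
    return f.read_text(encoding="utf-8") if f.exists() else None


with tempfile.TemporaryDirectory() as tmp:
    d = Path(tmp) / "cache"
    miss = cache_get(d, "m", "be brief", "What is RAG?")
    cache_put(d, "m", "be brief", "What is RAG?", "Retrieval-augmented generation.")
    hit = cache_get(d, "m", "be brief", "What is RAG?")
    other = cache_get(d, "m", "be brief", "What is RAG")  # different prompt, different key
```

Because the key depends on the exact inputs, even a one-character change to the prompt, system message, or model name is a cache miss.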

cache_dir

Return the cache directory as a Path if caching is enabled, else None.

Source code in src/hands_on_ai/cache.py
def cache_dir():
    """Return the cache directory as a Path if caching is enabled, else None."""
    d = os.environ.get("HANDS_ON_AI_CACHE")
    return Path(d).expanduser() if d else None

get

Return a cached response string, or None on a miss (or when disabled).

Source code in src/hands_on_ai/cache.py
def get(model, system, prompt):
    """Return a cached response string, or None on a miss (or when disabled)."""
    d = cache_dir()
    if d is None:
        return None
    f = d / f"{_key(model, system, prompt)}.txt"
    return f.read_text(encoding="utf-8") if f.exists() else None

put

Store a response in the cache. No-op when caching is disabled.

Source code in src/hands_on_ai/cache.py
def put(model, system, prompt, response):
    """Store a response in the cache. No-op when caching is disabled."""
    d = cache_dir()
    if d is None:
        return
    d.mkdir(parents=True, exist_ok=True)
    (d / f"{_key(model, system, prompt)}.txt").write_text(response, encoding="utf-8")