talk-buddy

Speech-to-Text Setup

Configure speech recognition services to enable voice input in Talk Buddy. This guide covers both online and local STT (Speech-to-Text) service options.

Understanding STT Services

What is Speech-to-Text?

STT services convert your spoken words into text that Talk Buddy can process.

Service Options

Online Services (Default)

Pre-configured services: Ready to use immediately

Self-hosted services: Run on your own computer

Quick Start (Online Services)

Default Configuration

Talk Buddy comes pre-configured with working STT services.

Check Current Status

  1. Look at the status footer: The STT indicator should be green (●)
  2. If green: You’re ready to practice with voice input
  3. If red or gray: Follow the troubleshooting steps below

Test STT Service

  1. Go to Settings: Click “Settings” in sidebar
  2. Find STT section: Look for Speech-to-Text configuration
  3. Click “Test STT”: Verify service is working
  4. Speak into microphone: Say a test phrase
  5. Check results: Verify your speech was recognized correctly

Troubleshooting Online Services

Connection Issues

Microphone Problems

Local STT Setup (Speaches)

Why Use Local Services?

Privacy benefits:

Performance benefits:

Installing Speaches

System Requirements

Installation Steps

Option 1: Docker Installation (Recommended)

# Pull the Speaches Docker image
docker pull ghcr.io/speaches-ai/speaches:latest

# Run the Speaches container in the background, exposing port 8000
docker run -d \
  --name speaches \
  -p 8000:8000 \
  ghcr.io/speaches-ai/speaches:latest

Option 2: Python Installation

# Check that Python 3.8+ is installed (install it first if this fails)
python --version

# Install Speaches via pip
pip install speaches

# Start Speaches server
speaches serve --host 0.0.0.0 --port 8000

Option 3: Binary Installation

  1. Download: Get binary from Speaches releases
  2. Extract: Unzip to preferred location
  3. Run: Execute the binary to start server
  4. Configure: Set to run on port 8000
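
Whichever option you choose, the server may need a few seconds after starting before it accepts requests. The following is a minimal sketch of a readiness poll, assuming the server answers on the /health path used in the Network Diagnostics section of this guide. The function name wait_for_stt and the default of 10 attempts are illustrative and not part of Speaches or Talk Buddy.

```shell
# Poll a health URL until it responds or the attempts run out.
# wait_for_stt is a hypothetical helper, not a Speaches command.
# Usage: wait_for_stt URL [ATTEMPTS]
wait_for_stt() {
  url="$1"
  attempts="${2:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -s: quiet, -f: treat HTTP errors as failure, --max-time: cap each try
    if curl -sf --max-time 2 -o /dev/null "$url"; then
      echo "STT service is ready at $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "STT service did not respond at $url" >&2
  return 1
}
```

For example, `wait_for_stt http://localhost:8000/health 30` waits up to roughly 30 seconds before giving up.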

Configuring Talk Buddy for Local STT

Update Service URL

  1. Open Talk Buddy Settings
  2. Find STT Service URL field
  3. Change to local address: http://localhost:8000
  4. Save settings

Test Local Connection

  1. Click “Test STT” in settings
  2. Verify connection: Should show successful connection
  3. Test speech recognition: Speak a test phrase
  4. Check status footer: STT indicator should be green
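
The same connection check can be run from a terminal before opening Talk Buddy. This is a small sketch, assuming the service exposes the /health path shown in the Network Diagnostics section. The helper names stt_health_url and check_stt are hypothetical.

```shell
# Build the health-check URL from a service base URL,
# tolerating one trailing slash (http://localhost:8000/ -> .../health).
stt_health_url() {
  base="${1%/}"
  printf '%s/health\n' "$base"
}

# Report whether the service at the given base URL answers its health check.
check_stt() {
  if curl -sf --max-time 5 -o /dev/null "$(stt_health_url "$1")"; then
    echo "OK: $1 is reachable"
  else
    echo "FAIL: $1 is not reachable"
    return 1
  fi
}
```

Running `check_stt http://localhost:8000` with the same address you entered in the STT Service URL field helps separate a server problem from a Talk Buddy settings problem.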

Speaches Configuration Options

Model Selection

Speaches supports multiple STT models:

Fast Models (Lower accuracy, faster processing)

Accurate Models (Higher accuracy, slower processing)

Language Configuration

Configure for your language:

# Example: Configure for Spanish
speaches serve --language es --port 8000

# Example: Configure for French
speaches serve --language fr --port 8000

Advanced Configuration

Create a configuration file named speaches.yaml:

server:
  host: "0.0.0.0"
  port: 8000
  
stt:
  model: "Systran/faster-whisper-medium"
  language: "en"
  device: "auto"  # auto, cpu, cuda
  
tts:
  enabled: true
  model: "speaches-ai/piper-en_US-amy-low"

Advanced STT Configuration

Multiple Service Setup

Backup Services

Configure multiple STT services for redundancy:

  1. Primary service: Local Speaches for regular use
  2. Backup service: Online service for when local is unavailable
  3. Testing: Verify both services work independently
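
To decide which service URL to enter in Talk Buddy, you can probe the primary and backup services in order and use the first one that responds. The sketch below assumes every candidate exposes a /health path, which may not hold for an online service. The function names are hypothetical and this guide does not state that Talk Buddy performs this failover itself.

```shell
# Return success if the service at base URL $1 answers its health check.
stt_is_up() {
  curl -sf --max-time 3 -o /dev/null "${1%/}/health"
}

# Print the first reachable URL from the arguments (primary first, then backups).
# Returns non-zero if none of them respond.
pick_stt_service() {
  for candidate in "$@"; do
    if stt_is_up "$candidate"; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  echo "no STT service reachable" >&2
  return 1
}
```

Call it with the local address first, for example `pick_stt_service http://localhost:8000 <backup-url>`, where the backup URL is whatever online service you have configured.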

Service Switching

In Talk Buddy settings:

Performance Optimization

Hardware Optimization

For better local STT performance:

Model Selection

Choose appropriate models:

Security and Privacy

Local Service Security

Secure your local installation:

Data Privacy

Understand data handling:

Troubleshooting STT Issues

Common Problems

Microphone Not Working

Symptoms: No speech detected, silent input

Solutions:

  1. Check system permissions: Grant microphone access to Talk Buddy
  2. Test hardware: Verify microphone works in other applications
  3. Check input device: Ensure correct microphone selected in system settings
  4. Adjust sensitivity: Increase microphone volume if too quiet

Poor Recognition Accuracy

Symptoms: Speech transcribed incorrectly, frequent mistakes

Solutions:

  1. Improve audio quality: Use better microphone, reduce background noise
  2. Speak clearly: Slower, more deliberate speech
  3. Check language settings: Ensure STT service configured for your language
  4. Try different model: Some models work better for specific accents/voices

Service Connection Errors

Symptoms: Red STT indicator, connection timeouts

Solutions:

  1. Verify service running: Check if Speaches or online service is available
  2. Test network connectivity: Ensure internet access for online services
  3. Check firewall: Confirm Talk Buddy can access STT service
  4. Restart services: Stop and start STT service, restart Talk Buddy

Slow Response Times

Symptoms: Long delays between speech and recognition

Solutions:

  1. Use faster models: Switch to smaller, quicker STT models
  2. Optimize hardware: Close other applications, upgrade hardware
  3. Check network: Ensure stable, fast internet for online services
  4. Local processing: Switch to a local STT service to remove network latency (speed then depends on your hardware)
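
Before changing models or hardware, it can help to measure how long the service takes to answer at all. The sketch below times a single request to the /health path (assumed from the Network Diagnostics section) and compares it with a threshold. The helper names and the 1.0 second default are illustrative. A health check measures network and server responsiveness only, not transcription time.

```shell
# Print the total request time in seconds for one health-check request.
stt_latency() {
  curl -s -o /dev/null --max-time 10 -w '%{time_total}\n' "${1%/}/health"
}

# Succeed if the latency in seconds ($1) exceeds the limit ($2, default 1.0).
is_slow() {
  awk -v t="$1" -v limit="${2:-1.0}" 'BEGIN { exit !(t > limit) }'
}
```

For example, `is_slow "$(stt_latency http://localhost:8000)" && echo "service is slow to respond"` prints a warning only when the request takes longer than one second.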

Advanced Troubleshooting

Log Analysis

Check service logs for errors:

# View Speaches logs
docker logs speaches

# Check system microphone logs (macOS)
log show --predicate 'subsystem == "com.apple.coreaudio"' --last 5m

# Windows microphone troubleshooting
# Use Windows Audio troubleshooter in Settings
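
Container logs can be long, so filtering them for likely problem lines is often quicker than reading them in full. The following sketch assumes the log uses common keywords such as "error" or "warn". The actual Speaches log format is not confirmed here, and filter_log_problems is a hypothetical helper name.

```shell
# Keep only lines that look like errors or warnings (case-insensitive).
filter_log_problems() {
  grep -iE 'error|warn|exception|traceback'
}

# Example: scan the last 200 lines of the container log.
# docker logs can write to both stdout and stderr, so merge them first:
#   docker logs --tail 200 speaches 2>&1 | filter_log_problems
```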

Network Diagnostics

Test service connectivity:

# Test local Speaches service
curl http://localhost:8000/health

# Test the transcription endpoint with a sample WAV file
curl -X POST http://localhost:8000/stt \
  -H "Content-Type: audio/wav" \
  --data-binary @test-audio.wav
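
When a request fails, the HTTP status code usually narrows down the cause. The sketch below prints only the status code for a URL and maps it to a troubleshooting hint. The helper names and the wording of the hints are illustrative, not output from Speaches or Talk Buddy.

```shell
# Print only the HTTP status code for a URL ("000" means curl got no response).
stt_status() {
  curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$1"
}

# Turn a status code into a troubleshooting hint.
explain_status() {
  case "$1" in
    2??) echo "service responded normally" ;;
    000) echo "no response: is the service running and the port correct?" ;;
    404) echo "path not found: check the endpoint in the URL" ;;
    5??) echo "server error: check the service logs" ;;
    *)   echo "unexpected status $1" ;;
  esac
}
```

For example, `explain_status "$(stt_status http://localhost:8000/health)"` prints a one-line hint for the local service.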

Performance Monitoring

Monitor resource usage:

Service Comparison

Online vs Local STT

| Aspect    | Online Services       | Local Services        |
|-----------|-----------------------|-----------------------|
| Setup     | Ready immediately     | Requires installation |
| Privacy   | Data sent externally  | Complete privacy      |
| Accuracy  | Often very high       | Varies by model       |
| Speed     | Network dependent     | Hardware dependent    |
| Cost      | May have usage limits | Free after setup      |
| Offline   | Requires internet     | Works offline         |
| Languages | Many supported        | Depends on models     |

For Students

For Teachers

For Professionals


Quick Setup Checklist

Online STT (5 minutes)

Local STT (30 minutes)

Troubleshooting (15 minutes)


With proper STT setup, you’ll have accurate voice recognition for natural conversation practice. Choose the option that best fits your privacy needs and technical comfort level! 🎤
