
Testing

Daniel Ellison edited this page Mar 26, 2026 · 3 revisions

Kai has 1173 tests across 25 test files covering every source module. This page explains how to run them, how the test suite is organized, and what patterns to follow when writing new tests.

Running tests

make test          # run full test suite
make lint          # ruff check + pyright
make fmt           # ruff format (auto-fix)

Or directly:

.venv/bin/pytest tests/
.venv/bin/pytest tests/test_bot.py -k "test_help"   # run specific tests
.venv/bin/pytest tests/ -x                          # stop on first failure

CI runs lint, format check, and tests on every push and PR via GitHub Actions.

Test file mapping

Each source module has a corresponding test file; some modules have multiple test files covering distinct feature areas:

| Source | Test file(s) | Notes |
| --- | --- | --- |
| bot.py | test_bot.py, test_bot_totp.py | TOTP auth tests separated for clarity |
| claude.py | test_claude.py | Subprocess mocking, streaming, workspace switching |
| config.py | test_config.py | Env var loading, validation, protected file reading |
| cron.py | test_cron.py | Job scheduling, execution, auto-remove logic |
| history.py | test_history.py | JSONL logging, recent history retrieval |
| install.py | test_install.py | Config, apply, status subcommands; generated files |
| locks.py | test_locks.py | Per-chat locking, stop events |
| main.py | test_main.py | Startup sequence, signal handling |
| review.py | test_review.py | PR review pipeline, prior-comment fetching, prompt building |
| services.py | test_services.py | Service proxy, auth types, error handling |
| sessions.py | test_sessions.py | SQLite operations, settings, workspace history |
| totp.py | test_totp.py, test_totp_cli.py | Verification logic separate from CLI setup/reset |
| transcribe.py | test_transcribe.py | ffmpeg + whisper subprocess mocking |
| triage.py | test_triage.py | Issue triage pipeline, JSON parsing, label creation |
| tts.py | test_tts.py | Piper TTS subprocess mocking, voice selection |
| pool.py | test_pool.py | Subprocess pool, idle eviction, workspace restoration |
| prompt_utils.py | test_prompt_utils.py | Shared prompt formatting utilities |
| telegram_utils.py | test_telegram_utils.py | Telegram-specific helper functions |
| webhook.py | test_webhook.py, test_webhook_api.py | GitHub/generic webhooks separate from REST API |
| (cross-cutting) | test_phase2_isolation.py | Multi-user phase 2 integration tests |
| (cross-cutting) | test_user_config.py | User configuration loading and validation |
| (cross-cutting) | test_workspace_config.py | Workspace configuration loading and validation |

Common patterns

Filesystem isolation

Tests that touch the filesystem use pytest's tmp_path fixture. Sessions, config, and history tests create real SQLite databases and config files in temp directories rather than mocking the filesystem. This catches real path-handling bugs that mocks would hide.
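A minimal sketch of the pattern; the filename, table, and columns here are hypothetical stand-ins, not Kai's actual schema:

```python
import sqlite3

def test_sessions_db_roundtrip(tmp_path):
    """A real SQLite file under tmp_path catches path bugs a mocked filesystem would hide."""
    db_path = tmp_path / "sessions.db"  # hypothetical filename
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE sessions (chat_id INTEGER, workspace TEXT)")
    conn.execute("INSERT INTO sessions VALUES (?, ?)", (1, "default"))
    conn.commit()
    rows = conn.execute("SELECT workspace FROM sessions").fetchall()
    conn.close()
    assert db_path.exists()
    assert rows == [("default",)]
```

pytest creates a fresh `tmp_path` per test and cleans it up, so tests stay isolated without any teardown code.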

Async testing

Most of Kai's code is async. Tests use pytest.mark.asyncio and AsyncMock from unittest.mock:

import pytest
from unittest.mock import AsyncMock

@pytest.mark.asyncio
async def test_something(self):
    mock_fn = AsyncMock(return_value="result")
    assert await mock_fn() == "result"

Telegram handler mocking

Bot tests use factory functions to create mock Telegram Update and Context objects:

  • _make_update(text, user_id) - creates a mock Update with a message
  • _make_context(bot) - creates a mock CallbackContext

These are defined in test_bot.py and test_bot_totp.py. They set up the minimum attributes handlers need (message text, user ID, chat ID, bot instance) without pulling in the full python-telegram-bot object graph.

Subprocess mocking

Claude, transcribe, and TTS tests mock asyncio.create_subprocess_exec (or subprocess.run for sync code) to simulate subprocess behavior without actually running external binaries. This makes tests fast and deterministic.
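A sketch of the approach, assuming the code under test awaits asyncio.create_subprocess_exec and reads output via communicate() (the helper and its arguments here are illustrative, not Kai's actual code):

```python
import asyncio
from unittest.mock import AsyncMock

def make_fake_exec(stdout=b"", returncode=0):
    """Build a stand-in for asyncio.create_subprocess_exec returning canned output."""
    async def fake_exec(*args, **kwargs):
        proc = AsyncMock()
        proc.communicate.return_value = (stdout, b"")
        proc.returncode = returncode
        return proc
    return fake_exec

async def demo():
    fake = make_fake_exec(stdout=b"transcribed text")
    # In a real test: monkeypatch.setattr(asyncio, "create_subprocess_exec", fake)
    proc = await fake("ffmpeg", "-i", "voice.ogg")
    out, _ = await proc.communicate()
    return out, proc.returncode
```

Because the fake returns instantly with fixed bytes, tests never depend on ffmpeg, whisper, or piper being installed.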

HTTP mocking

Service proxy tests use aioresponses to mock outbound HTTP requests:

from aioresponses import aioresponses

with aioresponses() as mocked:
    mocked.post("https://api.example.com/v1/chat", payload={"result": "ok"})
    # call the service proxy...
(aioresponses intercepts requests made through aiohttp, so the test exercises real request-building code while the network call is faked.)

Attribute patching

monkeypatch.setattr is the preferred way to patch module-level attributes and globals. It's cleaner than unittest.mock.patch for module globals and auto-restores after each test.

Writing new tests

  1. Put tests in the corresponding file - if you're testing webhook.py, add to test_webhook.py or test_webhook_api.py
  2. Group with classes - use class TestFeatureName to organize related tests
  3. Use pytest.mark.asyncio for async tests
  4. Use real databases when testing database interactions (via tmp_path), mock databases only when the test isn't about database behavior
  5. Add docstrings to tests that verify non-obvious behavior - explain what the test is checking and why
  6. Follow the commenting style used in the existing tests (see the project's commenting style guide)
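Putting the guidelines together, a new test might look like the following; the class, handler, and retry behavior are hypothetical, chosen only to show the structure:

```python
import asyncio
import pytest
from unittest.mock import AsyncMock

class TestWebhookDelivery:
    """Groups related webhook tests (guideline 2)."""

    @pytest.mark.asyncio
    async def test_delivery_retries_once_on_failure(self):
        """Verifies non-obvious behavior (guideline 5): one retry, then give up."""
        # First call times out, second succeeds
        send = AsyncMock(side_effect=[TimeoutError(), "ok"])

        async def deliver(send_fn):
            # Hypothetical delivery logic: retry a single time on failure
            for _ in range(2):
                try:
                    return await send_fn()
                except TimeoutError:
                    continue
            return None

        assert await deliver(send) == "ok"
        assert send.await_count == 2
```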
