NikolasMarkou/fsm_llm

FSM-LLM

Stateful LLM programs on a typed λ-calculus runtime. One executor, two surfaces, one verb.

fsm-llm is a Python framework for building stateful LLM programs — dialog bots, agents, reasoning chains, workflows, and long-context pipelines — that all compile to and execute on the same typed λ-calculus runtime. You author programs in whichever surface fits the problem; one verb (Program.invoke) runs all of them.

from fsm_llm import Program

# Surface A — FSM JSON: dialog with persistent per-turn state.
prog = Program.from_fsm("my_bot.json")
result = prog.invoke(message="Hi, I'd like to book a flight")
print(result.value)            # "Sure — where to?"
print(result.conversation_id)  # auto-started session id

# Surface B — λ-DSL: a one-shot pipeline, agent, or recursion.
from fsm_llm import react_term
prog = Program.from_term(react_term(decide_prompt=..., synth_prompt=...))
result = prog.invoke(inputs={"question": "What is 17 * 23?"})
print(result.value)            # the agent's final answer
print(result.oracle_calls)     # 2  — exactly what the planner predicted

The last line reflects a design guarantee: result.oracle_calls matches the planner's static prediction. This is Theorem 2 of the design: every Fix subtree carries a closed-form cost, and the executor honors it.

Why fsm-llm

  • One runtime. FSM JSON dialogs and λ-DSL pipelines compile to the same AST. There is no separate "agents engine" plus "workflows engine" plus "FSM engine" — there is one β-reduction interpreter, and everything is a λ-term.
  • Theorem-2 cost prediction. For any program with planner-bounded recursion, the executor's oracle-call count equals the planner's prediction. Budget LLM calls before running.
  • Provider-agnostic. Built on LiteLLM — 100+ providers (OpenAI, Anthropic, Ollama, Google, Bedrock, Together, …) behind one interface. Switch with a string.
  • Layered API. Four documented layers: L1 substrate, L2 composition, L3 authoring, L4 invoke. Use only what you need.
  • Typed throughout. Pydantic v2 models for AST, definitions, results. Frozen, JSON-roundtrippable.

Install

pip install fsm-llm                        # core: dialog, runtime, stdlib
# Extras are quoted so that shells such as zsh do not expand the brackets.
pip install "fsm-llm[reasoning]"           # reasoning engine (no extra deps)
pip install "fsm-llm[agents]"              # agents (no extra deps)
pip install "fsm-llm[workflows]"           # workflows (no extra deps)
pip install "fsm-llm[monitor]"             # web dashboard (fastapi, uvicorn)
pip install "fsm-llm[mcp]"                 # MCP tool provider
pip install "fsm-llm[otel]"                # OpenTelemetry exporter
pip install "fsm-llm[oolong]"              # OOLONG long-context bench loader
pip install "fsm-llm[all]"                 # everything

Requires Python 3.10–3.12. Set OPENAI_API_KEY (or the key for whichever provider you use) in .env or your shell.

Three constructors, one verb

Program is the unified entry point. Three constructors fix the mode at construction time; the same .invoke(...) returns a Result in every mode.

1. FSM JSON — dialogs with state

Author a state machine as JSON, compile to a λ-term, run turn by turn:

from fsm_llm import Program

prog = Program.from_fsm("intake_bot.json")            # path or dict or FSMDefinition
result = prog.invoke(message="hello", conversation_id=None)
# result.value             — the response string
# result.conversation_id   — auto-started or echoed back

See examples/basic, examples/intermediate, and examples/advanced for runnable FSMs.

2. λ-term — pipelines, agents, reasoning, recursion

Author a term directly in the DSL:

from fsm_llm import Program, leaf, let_, var

term = let_(
    "summary", leaf(template="Summarise: {doc}", input_vars=("doc",)),
    leaf(template="Translate to French: {summary}", input_vars=("summary",)),
)
prog = Program.from_term(term)
result = prog.invoke(inputs={"doc": "..."})
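
To show what the let_ binding does, here is a minimal toy evaluator. It is a sketch, not the fsm-llm executor: the Leaf and Let classes, evaluate, and stub_oracle are hypothetical stand-ins for whatever leaf and let_ build internally, and the oracle is a plain function instead of an LLM.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class Leaf:
    template: str
    input_vars: tuple[str, ...]


@dataclass(frozen=True)
class Let:
    name: str
    value: "Term"
    body: "Term"


Term = Union[Leaf, Let]


def evaluate(term: Term, env: dict[str, str], oracle: Callable[[str], str]) -> str:
    if isinstance(term, Leaf):
        # A leaf fills its template from the environment and makes one oracle call.
        prompt = term.template.format(**{v: env[v] for v in term.input_vars})
        return oracle(prompt)
    # A let evaluates `value`, binds the result to `name`, then evaluates `body`.
    bound = evaluate(term.value, env, oracle)
    return evaluate(term.body, {**env, term.name: bound}, oracle)


term = Let(
    "summary",
    Leaf("Summarise: {doc}", ("doc",)),
    Leaf("Translate to French: {summary}", ("summary",)),
)

prompts: list[str] = []


def stub_oracle(prompt: str) -> str:
    prompts.append(prompt)  # record what the stand-in "LLM" was asked
    return f"[{prompt}]"


result = evaluate(term, {"doc": "hello"}, stub_oracle)
print(prompts)
```

The second prompt contains the first leaf's output, which is the whole point of the binding: the name summary exists only inside the body of the let.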

Or use a stdlib factory:

from fsm_llm import Program, react_term

term = react_term(
    decide_prompt="Given {question}, propose a tool call as JSON.",
    synth_prompt="Tool returned {observation}. Final answer:",
)
prog = Program.from_term(term)
result = prog.invoke(inputs={"question": "Capital of France?", "tool_dispatch": my_tools})
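
The decide, dispatch, synthesise shape behind those two prompts can be sketched without the library. Everything here is an assumption made for illustration: react_once, ScriptedOracle, and this my_tools are hypothetical, and the real contract for tool_dispatch (its arguments and return type) is whatever the fsm-llm documentation specifies.

```python
import json
from typing import Any, Callable


def react_once(
    question: str,
    oracle: Callable[[str], str],
    tool_dispatch: Callable[[str, dict[str, Any]], str],
) -> str:
    # Oracle call 1: decide which tool to call, as JSON.
    decision = json.loads(oracle(f"Given {question}, propose a tool call as JSON."))
    # Host-side step: run the tool. This is plain Python, not an oracle call.
    observation = tool_dispatch(decision["tool"], decision["args"])
    # Oracle call 2: synthesise the final answer from the observation.
    return oracle(f"Tool returned {observation}. Final answer:")


def my_tools(name: str, args: dict[str, Any]) -> str:
    """Hypothetical dispatcher: maps a tool name and arguments to an observation."""
    if name == "multiply":
        return str(args["a"] * args["b"])
    raise KeyError(f"unknown tool: {name}")


class ScriptedOracle:
    """Stands in for an LLM: replays canned replies and records each prompt."""

    def __init__(self, replies: list[str]) -> None:
        self.replies = list(replies)
        self.prompts: list[str] = []

    def __call__(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return self.replies.pop(0)


oracle = ScriptedOracle(['{"tool": "multiply", "args": {"a": 17, "b": 23}}', "391"])
answer = react_once("What is 17 * 23?", oracle, my_tools)
print(answer, len(oracle.prompts))  # 391 2
```

In this sketch the tool runs on the host between the two oracle calls, so a single decide/synthesise pass costs exactly two calls.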

3. Factory — late-bound term construction

Program.from_factory calls a factory at construction time with explicit args:

from fsm_llm import Program
from fsm_llm.stdlib.long_context import niah_term

prog = Program.from_factory(
    niah_term,
    factory_kwargs={"question": "Where is the artefact stored?", "tau": 256, "k": 2},
)
result = prog.invoke(inputs={"document": long_doc})

Provider switching with HarnessProfile / ProviderProfile

Bundle prompt prefixes, leaf overrides, and provider kwargs at construction:

from fsm_llm import HarnessProfile, ProviderProfile, Program, register_harness_profile

register_harness_profile(
    "ollama:qwen3.5:4b",
    HarnessProfile(
        system_prompt_base="You are a precise, terse assistant.",
        leaf_template_overrides={"leaf_001_summarise": "Be brief: {doc}"},
        provider_profile_name="ollama:qwen3.5:4b",
    ),
)

prog = Program.from_term(my_term, profile="ollama:qwen3.5:4b")

Profiles apply once at construction; Theorem-2 strict equality is preserved.
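
One way to read that guarantee: a profile rewrites what each leaf says, not how many leaves there are. The sketch below illustrates this under that assumption. The Leaf class and apply_profile are hypothetical, and how fsm-llm actually combines system_prompt_base with a leaf template is not specified here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Leaf:
    leaf_id: str
    template: str


def apply_profile(
    leaves: tuple[Leaf, ...],
    system_prompt_base: str,
    overrides: dict[str, str],
) -> tuple[Leaf, ...]:
    """Return new leaves with overridden templates and a shared prefix."""
    out = []
    for leaf in leaves:
        template = overrides.get(leaf.leaf_id, leaf.template)
        out.append(replace(leaf, template=f"{system_prompt_base}\n{template}"))
    return tuple(out)


leaves = (
    Leaf("leaf_001_summarise", "Summarise: {doc}"),
    Leaf("leaf_002_translate", "Translate to French: {summary}"),
)
profiled = apply_profile(
    leaves,
    system_prompt_base="You are a precise, terse assistant.",
    overrides={"leaf_001_summarise": "Be brief: {doc}"},
)
print(len(leaves), len(profiled))  # 2 2
```

The leaf count is the same before and after, so a call count predicted from the term's structure is unaffected by the rewrite.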

Handlers

Hook into 8 timing points across an FSM turn or a term reduction. Two timings (PRE_PROCESSING, POST_PROCESSING) splice into the AST via compose; the other six dispatch host-side.

from fsm_llm import HandlerBuilder, HandlerTiming, Program

audit = (
    HandlerBuilder("audit")
    .at(HandlerTiming.PRE_PROCESSING)
    .do(lambda **kw: log_event(kw))
    .build()
)

prog = Program.from_fsm("bot.json", handlers=[audit])
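
For intuition about host-side dispatch, here is a small self-contained registry keyed by timing point. It is a sketch only: Timing, HandlerRegistry, and run_turn are hypothetical, only the two timing names mentioned above are modelled, the convention that a handler may return a dict of context updates is an assumption, and the AST splicing done by compose is not modelled at all.

```python
from collections import defaultdict
from enum import Enum
from typing import Any, Callable, Optional

Context = dict[str, Any]
Handler = Callable[..., Optional[Context]]


class Timing(Enum):
    PRE_PROCESSING = "pre_processing"
    POST_PROCESSING = "post_processing"


class HandlerRegistry:
    def __init__(self) -> None:
        self._handlers: dict[Timing, list[Handler]] = defaultdict(list)

    def register(self, timing: Timing, fn: Handler) -> None:
        self._handlers[timing].append(fn)

    def fire(self, timing: Timing, context: Context) -> Context:
        # Handlers run in registration order; a returned dict updates the context.
        for fn in self._handlers[timing]:
            context = {**context, **(fn(**context) or {})}
        return context


def run_turn(registry: HandlerRegistry, message: str, respond: Callable[[str], str]) -> Context:
    ctx = registry.fire(Timing.PRE_PROCESSING, {"message": message})
    ctx = {**ctx, "response": respond(ctx["message"])}
    return registry.fire(Timing.POST_PROCESSING, ctx)


log: list[tuple[str, str]] = []
registry = HandlerRegistry()
registry.register(Timing.PRE_PROCESSING, lambda **kw: log.append(("pre", kw["message"])))
registry.register(Timing.PRE_PROCESSING, lambda **kw: {"message": kw["message"].strip()})
registry.register(Timing.POST_PROCESSING, lambda **kw: log.append(("post", kw["response"])))

final = run_turn(registry, "  hello  ", lambda m: m.upper())
print(final)
```

The audit-style handlers return None and only observe, while the second PRE_PROCESSING handler returns an update that the rest of the turn sees.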

See docs/handlers.md.

CLI

The package ships five console scripts:

fsm-llm --mode run --fsm path/to/fsm.json              # interactive run
fsm-llm --mode validate --fsm path/to/fsm.json         # schema check
fsm-llm --mode visualize --fsm path/to/fsm.json        # ASCII state graph

# Single-purpose console scripts: same code, simpler signatures.
fsm-llm-validate  --fsm path/to/fsm.json
fsm-llm-visualize --fsm path/to/fsm.json
fsm-llm-monitor                                        # web dashboard (requires fsm-llm[monitor])
fsm-llm-meta                                           # interactive artifact builder

Architecture at a glance

        FSM JSON (Category A)              λ-DSL (Category B / C)
              │                                    │
              ▼  fsm_llm.dialog.compile_fsm        ▼  fsm_llm.runtime.dsl
        ┌─────────────────────────────────────────────────────┐
        │                  λ-AST (typed Term)                 │
        │  Var · Abs · App · Let · Case · Combinator · Fix    │
        │                       · Leaf                        │
        └─────────────────────────────────────────────────────┘
                                │
                                ▼
        ┌──────────────────────────────────────────┐
        │ Executor (β-reduction, depth-bounded)    │
        │ Planner  (closed-form k*, τ*, d, calls)  │
        │ Oracle   (one per Program — uniform)     │
        │ Session  (per-conversation persistence)  │
        │ Cost     (per-leaf accumulator)          │
        └──────────────────────────────────────────┘
                                │
                                ▼
                       Program.invoke(...)  →  Result

The kernel (runtime/) is closed against the dialog surface — no upward edges. The dialog surface (dialog/) is the FSM-JSON compiler and orchestrator. The standard library (stdlib/) is named factories built on the kernel. Program (in program.py) is the L4 facade.
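
A "no upward edges" rule like this can be checked mechanically. The sketch below is one possible approach and not a tool shipped with fsm-llm: imported_modules and upward_edges are hypothetical helpers, the file names and module paths in the sample are made up, and only absolute imports are inspected.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Collect absolute module names imported by a piece of Python source."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            found.add(node.module)
    return found


def upward_edges(sources: dict[str, str], forbidden_prefix: str) -> list[tuple[str, str]]:
    """Return (file, module) pairs where a file imports from the forbidden package."""
    edges = []
    for path, source in sources.items():
        for module in sorted(imported_modules(source)):
            if module == forbidden_prefix or module.startswith(forbidden_prefix + "."):
                edges.append((path, module))
    return edges


# Made-up kernel files, keyed by path, for illustration only.
kernel_sources = {
    "runtime/clean_module.py": "import json\nfrom fsm_llm.runtime.something import thing\n",
    "runtime/leaky_module.py": "from fsm_llm.dialog.something import other\n",
}

violations = upward_edges(kernel_sources, forbidden_prefix="fsm_llm.dialog")
print(violations)
```

In a real checkout the dictionary would be filled by reading the files under runtime/, and a non-empty result would mean the kernel depends on the dialog surface.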

See docs/architecture.md for the full picture.

Documentation

| Doc | What it covers |
| --- | --- |
| docs/quickstart.md | Five-minute tour: install, FSM hello-world, λ-term hello-world, handlers, profiles |
| docs/api_reference.md | Every public name across L1–L4 with signatures and examples |
| docs/architecture.md | The runtime, the layers, Theorem 2, the cross-cutting decisions |
| docs/handlers.md | All 8 timing points; AST-side vs host-side; HandlerBuilder cookbook |
| docs/fsm_design.md | Patterns and anti-patterns for authoring FSM JSON |
| docs/migration_0.7_to_0.8.md | Migration guide: every removed surface, before/after, FAQ |
| docs/lambda.md | The architectural thesis: why λ-calculus is the substrate |
| docs/lambda_fsm_merge.md | Canonical merge contract: invariants I1–I6, falsification gates G1–G5, deprecation calendar |
| docs/threat_model.md | Trust boundaries, T-01..T-11, dismissed proposals |
| CHANGELOG.md | Release notes |

Examples

172 runnable examples across 10 trees. Run with:

python examples/basic/echo_bot/run.py
python examples/pipeline/react/run.py
python examples/long_context/niah_demo/run.py

All examples support OpenAI and Ollama out of the box. See examples/README.md for the index.

Migrating from 0.7.x

0.8.0 is the cleanup release that follows 0.7.0. It removes eight more surfaces outright at the source-tree level. No new deprecation cycle was introduced; these are the removals that 0.7.0 deferred:

  • from fsm_llm import Handler → from fsm_llm import FSMHandler (the alias is gone; BaseHandler is the canonical base class).
  • from fsm_llm import LLMInterface → from fsm_llm.runtime._litellm import LLMInterface (top-level re-export removed).
  • from fsm_llm import BUILTIN_OPS → from fsm_llm.runtime import BUILTIN_OPS (registry is still architecturally closed).
  • has_workflows() / has_reasoning() / has_agents() and the matching get_* helpers → gone; the stdlib subpackages are not optional since 0.7.0.
  • from fsm_llm.dialog.definitions import FSMError (and the 4 other re-exports) → from fsm_llm.types import FSMError (canonical home since 0.7.0; the back-compat re-export was removed in 0.8.0).
  • Program(_api=..., _profile=...) private kwargs → gone; the public constructor is term-mode only. Use Program.from_fsm for FSM mode.
  • Program.from_fsm(**api_kwargs) catch-all → replaced with explicit kwargs (model, api_key, temperature, max_tokens, max_history_size, max_message_length, handler_error_mode, transition_config) plus **llm_kwargs for LiteLLM passthrough.
  • Reasoning factory parameter renames: analytical_term(prompt_a, prompt_b, prompt_c) → analytical_term(decomposition_prompt, analysis_prompt, integration_prompt). Every reasoning factory in stdlib/reasoning/lam_factories.py was migrated to descriptive parameter names matching its bind_names.

Two private modules also moved: dialog/extraction.py (extracted from turn.py; holds ExtractionEngine) and runtime/_handlers_ast.py (holds compose + AST splicers, moved from handlers.py). Public API is unchanged — from fsm_llm import compose and from fsm_llm.handlers import compose continue to work as re-exports.

See docs/migration_0.7_to_0.8.md for the detailed before/after walkthrough per surface and CHANGELOG.md for the full diff.

Contributing

make install-dev      # editable install with all extras + pre-commit
make test             # pytest -v
make lint format      # ruff
make type-check       # mypy

make test should report ~3300 tests passing on a clean checkout. Verify the exact count with pytest --collect-only -q | tail -3.

License

GPL-3.0-or-later. See LICENSE.
