API Reference

Complete module reference for the TerAgent library.

Core Module (`teragent.core`)

TAPRequest

from teragent import TAPRequest

request = TAPRequest(
    meta={"task_id": "1.1", "intent": "code_generation"},  # Task metadata
    context={"design": "...", "plan": "...", "memory": "..."},  # Reference material
    instruction="Implement user login module",  # Core instruction
    constraints=["Python 3.10+"],  # Hard constraints
    output_format_hint="<file path='...'>complete code</file>",  # Desired format
    thinking_mode="high",               # Extended: thinking mode (auto/deep/quick/high/max)
    multimodal_context=[...],           # Extended: list of MultimodalContent (images/video)
    long_horizon=None,                  # Extended: LongHorizonConfig for long-running tasks
    cache_preference=None,              # Extended: cache preference hints for cache-aware compilers
)

Methods:

estimate_prompt_tokens() -> int — Rough token count estimation
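
This reference does not specify the heuristic behind estimate_prompt_tokens(). As a rough illustration only, a character-based estimate (about four characters per token, a common rule of thumb for English text) could look like the sketch below. The function name `rough_token_estimate` and the ratio are assumptions, not TerAgent's implementation.

```python
import math


def rough_token_estimate(*parts: str, chars_per_token: float = 4.0) -> int:
    """Estimate a token count from total character length (hypothetical heuristic)."""
    total_chars = sum(len(part) for part in parts)
    return math.ceil(total_chars / chars_per_token)


# Example: the instruction and the single constraint from the request above.
estimate = rough_token_estimate("Implement user login module", "Python 3.10+")
```

An estimate like this is only useful for coarse budget checks; real tokenizers vary by model and language.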

Extended Fields:

thinking_mode: Optional[Literal["deep", "quick", "auto"]] — Controls reasoning depth. Set per request, it overrides the driver-level default. The additional values "high" and "max" are supported for GLM-5.2.
multimodal_context: Optional[list[MultimodalContent]] — List of multimodal content items (images, video) for models that support vision (M3, GLM-5.2 + 5V-Turbo). Each item has type ("image_url" or "video_url") and corresponding URL/data.
long_horizon: Optional[LongHorizonConfig] — Configuration for long-horizon autonomous tasks (GLM-5/5.2). Includes max_duration_hours, checkpoint_interval_minutes, evaluation_interval_steps, etc.
cache_preference: Optional[Literal["auto", "aggressive", "none"]] — Cache preference hint for cache-aware compilers.

TAPResponse

from teragent import TAPResponse

response = TAPResponse(
    raw_text="...",  # Model's raw text output (None = API error)
    usage={"prompt_tokens": 100, "completion_tokens": 200},  # Token usage
    tool_calls=[...],  # Structured tool calls from API
    finish_reason="stop",  # Why the model stopped
    cache_hit_tokens=3000,        # Extended: tokens served from cache (cache-aware models)
    thinking_content="...",       # Extended: reasoning trace content (thinking mode models)
    long_horizon_status=None,     # Extended: status info for long-horizon tasks
)

Properties:

prompt_tokens -> int
completion_tokens -> int
total_tokens -> int

Extended Fields:

cache_hit_tokens: int | None — Number of tokens served from cache (DeepSeek V4 with cache_aware=true). Useful for cost tracking — cache hits are significantly cheaper.
thinking_content: str | None — The model's internal reasoning trace, available when thinking mode is active (DeepSeek V4 deep mode, GLM-5.2 high/max mode).
long_horizon_status: dict | None — Status information for long-horizon task steps, including checkpoint info, sub-goal progress, and strategy switch notifications.

CompiledPrompt

from teragent import CompiledPrompt

# Mode A: Messages list (OpenAI / GLM / DeepSeek)
prompt = CompiledPrompt(
    messages=[
        {"role": "system", "content": "..."},
        {"role": "user", "content": "..."},
    ],
    tools=[...],
)

# Mode B: System + User separation (Anthropic native)
prompt = CompiledPrompt(
    system_prompt="...",
    user_message="...",
    tools=[...],
)

Properties:

mode -> str — Returns "messages", "system_user", or "empty"
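
The rule that maps fields to a mode is not spelled out here. One plausible reading, sketched below with a hypothetical stand-in class (`PromptSketch` is not the library's CompiledPrompt), is that a non-empty messages list wins, then either of the system/user fields, and otherwise the prompt is "empty".

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PromptSketch:
    """Hypothetical stand-in for CompiledPrompt, for illustration only."""

    messages: list = field(default_factory=list)
    system_prompt: Optional[str] = None
    user_message: Optional[str] = None

    @property
    def mode(self) -> str:
        # Assumed precedence: messages first, then system/user, else empty.
        if self.messages:
            return "messages"
        if self.system_prompt is not None or self.user_message is not None:
            return "system_user"
        return "empty"
```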

extra field: A dict of compiler-specific parameters that adapters read to customize their behavior:

Key	Compiler	Adapter	Description
`cache_aware`	`deepseek_v4`	`openai_compatible`	Whether to freeze tool definitions for cache hit optimization
`variant`	`deepseek_v4`	`openai_compatible`	`"flash"` or `"pro"` — controls prompt strategy
`minimax_video_mode`	`minimax_m3`	`minimax_native`	`"understand"` or `"summarize"` — video processing mode
`minimax_frame_sampling`	`minimax_m3`	`minimax_native`	`"auto"`, `"uniform"`, `"keyframe"`, or `"dense"`
`thinking_mode`	`glm_52`	`openai_compatible`	`"high"` or `"max"` — dual thinking mode
`preserved_thinking`	`glm_52`	`openai_compatible`	Whether to inject preserved reasoning traces
`vision_coordination`	`glm_52`	`openai_compatible`	Whether 5V-Turbo vision coordination is active

ModelProvider

from teragent import ModelProvider, create_provider

# Create via factory function
provider = create_provider(
    compiler="glm_5",
    adapter="openai_compatible",
    model="glm-5",
    base_url="https://open.bigmodel.cn/api/paas/v4",
    api_key_env="GLM_API_KEY",
)

Methods:

execute_tap(request) -> TAPResponse — Execute a TAP request (compile → send)
stream_tap(request) -> AsyncIterator[str] — Stream a TAP request (compile → stream)
chat(messages, tools=None) -> dict — Simple chat (bypasses Compiler)
execute_tap_with_retry(request, max_retries=2) -> TAPResponse — TAP with retry + circuit breaker
chat_with_fallback(messages, tools=None) -> dict — Chat with fallback provider
set_tracer(tracer) — Attach a TAPTracer for auto-tracing
set_fallback(fallback_provider) — Set fallback provider
get_cost_summary() -> dict — Get aggregated cost summary by provider
close() — Close adapter connections

Properties:

tracer -> TAPTracer | None
fallback_provider -> ModelProvider | None
has_fallback -> bool
cost_records -> list[TAPCostRecord]
capabilities -> dict

Async context manager: ModelProvider supports `async with` for automatic resource cleanup:

async with create_provider(...) as provider:
    response = await provider.execute_tap(request)

TAPCompiler (ABC)

from teragent import CompiledPrompt, TAPCompiler, TAPRequest

class MyCompiler(TAPCompiler):
    def compile(self, request: TAPRequest) -> CompiledPrompt:
        # Transform TAPRequest into model-specific CompiledPrompt
        ...

Methods:

compile(request) -> CompiledPrompt — Abstract. Compile TAP request
get_system_prompt(intent) -> str — Get intent-specific system prompt

TAPAdapter (ABC)

from collections.abc import AsyncIterator

from teragent import CompiledPrompt, TAPAdapter, TAPResponse

class MyAdapter(TAPAdapter):
    async def send(self, compiled: CompiledPrompt, model: str) -> TAPResponse:
        # Send compiled prompt to model API
        ...

    async def stream(self, compiled: CompiledPrompt, model: str) -> AsyncIterator[str]:
        # Stream compiled prompt to model API
        ...

Properties:

capabilities -> dict — Feature detection (streaming, tool_calling, etc.)
required_mode -> str — Expected CompiledPrompt mode ("any", "messages", "system_user")

Security Module (`teragent.security`)

EnhancedPermissionManager

from teragent.security import EnhancedPermissionManager, PermissionRule, PermissionEffect
# PermissionLevel (used below) must also be imported; its module is not shown in this reference.

epm = EnhancedPermissionManager(
    default_level=PermissionLevel.PLAN,
    default_effect=PermissionEffect.DENY,
    ai_classifier=None,
)

Methods:

add_rule(rule) — Add a permission rule
add_rules(rules) — Batch add rules
remove_rules_by_source(source) -> int — Remove rules by source
clear_rules() — Clear all rules
check(tool_name, path="") -> (bool, str) — Sync permission check (Layers 1-5, 7)
acheck(tool_name, path="", context="") -> (bool, str) — Async check (Layers 1-7, includes AI classifier)
check_tool_params(tool_name, params) -> (bool, str) — Check from tool params (auto-extract path)
acheck_tool_params(tool_name, params, context="") -> (bool, str) — Async version
elevate(new_level) — Elevate permission level
deactivate() — Reset to default level
set_level(level) — Directly set level
load_from_config(config) — Load from config dict
default_rules() -> list[PermissionRule] — Get built-in default rules
get_status_report() -> dict — Status report for debugging
get_rules_summary() -> list[dict] — List all rules sorted by priority
reset() — Reset all state

Sandbox

from teragent.security import execute_in_sandbox, check_command_safety

# Check command safety (no execution)
is_safe, reason = check_command_safety("rm -rf /")
# → (False, "命令匹配危险模式: ...")  (the message means "command matches a dangerous pattern: ...")

# Execute in sandbox
exit_code, output = await execute_in_sandbox(
    cmd="python script.py",
    workdir="/project",
    level=0,  # 0=subprocess, 1=Docker, 2=Firecracker
    timeout=60,
)
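
The patterns that check_command_safety uses are not listed in this reference. A minimal sketch of pattern-based screening, assuming a small hypothetical deny-list (`DANGEROUS_PATTERNS` and `check_command_sketch` are illustrative names, not library API), might look like this:

```python
import re

# Hypothetical deny-list for illustration; the library's real patterns are not shown here.
DANGEROUS_PATTERNS = [
    (re.compile(r"\brm\s+-rf\s+/(\s|$)"), "recursive delete of the filesystem root"),
    (re.compile(r"\bmkfs(\.\w+)?\b"), "filesystem format"),
]


def check_command_sketch(cmd: str) -> tuple[bool, str]:
    """Return (is_safe, reason) without executing the command."""
    for pattern, label in DANGEROUS_PATTERNS:
        if pattern.search(cmd):
            return False, f"command matches dangerous pattern: {label}"
    return True, ""
```

Pattern matching of this kind is a first filter only; it does not replace the isolation provided by the sandbox levels.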

File Writer

from teragent.security import write_files_safely, atomic_write_file

# Write multiple files atomically (2PC)
success, results = write_files_safely(
    files=[
        {"path": "/project/src/main.py", "content": "..."},
        {"path": "/project/src/utils.py", "content": "..."},
    ],
    workspace_root="/project",
)

# Write single file atomically
success = atomic_write_file("/project/src/main.py", "content")
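
How atomic_write_file achieves atomicity is not described here. A common technique, shown as a sketch under that assumption, is to write to a temporary file in the same directory and then rename it over the target with `os.replace`. The name `atomic_write_sketch` is hypothetical.

```python
import os
import tempfile


def atomic_write_sketch(path: str, content: str) -> bool:
    """Write content to a temp file, then atomically rename it onto path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(content)
            handle.flush()
            os.fsync(handle.fileno())  # make sure bytes reach the disk first
        os.replace(tmp_path, path)  # atomic when both paths share a filesystem
        return True
    except OSError:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        return False
```

Readers of the target path see either the old file or the complete new one, never a partially written file.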

Reliability Module (`teragent.reliability`)

CircuitBreakerManager

from teragent.reliability import CircuitBreakerManager

manager = CircuitBreakerManager(bus=event_bus)

# Record a model call
result = manager.record_model_call(
    prompt_tokens=500,
    completion_tokens=200,
    stage="plan",
    latency_ms=3500,
)

# Record success/failure
manager.record_success()
manager.record_failure("API timeout")

# Record agent step progress
manager.record_agent_step("read_file", had_effect=True)

# Check budget before call
result = manager.check_before_call(estimated_prompt_tokens=1000)

# Get status
status = manager.get_status()

StepBudget

from teragent.reliability import StepBudget

budget = StepBudget(max_steps=50)

if budget.consume():  # Returns True if budget remaining
    # Do work
    pass

# Properties
budget.current_steps  # Steps consumed
budget.remaining      # Steps remaining
budget.exhausted      # Whether budget is exhausted

RecoveryManager

from teragent.reliability import RecoveryManager, RecoveryType

manager = RecoveryManager()

# Check if recovery is needed
if manager.should_continue_after_truncation(finish_reason, attempt):
    manager.record_recovery(RecoveryType.LENGTH)

# Check error types
manager.is_context_overflow(error)
manager.is_retryable(error)
manager.should_retry_streaming(attempt)

# Get stats
stats = manager.get_stats()

Context Module (`teragent.context`)

ContextWindow

from teragent.context import ContextWindow

window = ContextWindow(model_token_limit=128_000)

# Estimate tokens
tokens = window.estimate(messages)

# Check if compaction needed
if window.should_compact(messages):
    # Trigger compaction
    pass

# Properties
window.available_budget
window.utilization
window.last_estimated_tokens

AutoCompactor

from teragent.context import AutoCompactor

compactor = AutoCompactor(
    context_window=window,
    model=provider,
    retain_count=8,  # Keep last 8 messages
    max_compacts=5,   # Max 5 compactions per session
)

# Check and compact if needed
compacted = await compactor.maybe_compact(messages, system_prompt)

# Get stats
stats = compactor.get_stats()
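
The effect of retain_count can be pictured as splitting the history into an older part to be summarized and a recent tail kept verbatim. The sketch below assumes exactly that behavior; `split_for_compaction` is a hypothetical helper, and how the library summarizes the older part is not covered.

```python
def split_for_compaction(messages: list, retain_count: int = 8) -> tuple[list, list]:
    """Split messages into (older, recent), keeping the last retain_count intact."""
    cut = max(len(messages) - retain_count, 0)
    return list(messages[:cut]), list(messages[cut:])


# Ten placeholder messages: the first two would be summarized, the last eight kept.
older, recent = split_for_compaction([f"msg {i}" for i in range(10)], retain_count=8)
```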

Pipeline Module (`teragent.pipeline`)

Extractor

from teragent import extract_files_from_response

files = extract_files_from_response(response_text, task_id="1.1")
# → [{"path": "src/main.py", "content": "..."}, ...]
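
The parsing rules of extract_files_from_response are not documented here. Assuming the model follows the `<file path='...'>...</file>` shape from the output_format_hint example, a regex-based sketch could be written as follows (`extract_files_sketch` is a hypothetical name, and real responses may need more tolerant parsing):

```python
import re

# Matches <file path='...'>...</file> with either quote style.
FILE_BLOCK = re.compile(
    r"<file\s+path=(['\"])(?P<path>.+?)\1\s*>(?P<content>.*?)</file>",
    re.DOTALL,
)


def extract_files_sketch(response_text: str) -> list[dict]:
    """Collect every file block as a {"path", "content"} dict."""
    return [
        {"path": m.group("path"), "content": m.group("content").strip("\n")}
        for m in FILE_BLOCK.finditer(response_text)
    ]
```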

PromptBuilder

from teragent import build_prompt, validate_prompt_tokens

# Build from template
messages = build_prompt(
    system_template="You are {role}. Task: {task}",
    context={"role": "engineer", "task": "implement login"},
)

# Validate token budget
errors = validate_prompt_tokens(messages, max_tokens=4000)

Checklist

from teragent import run_deterministic_checks, TaskInfo

task_list = [TaskInfo(id="1.1", title="Login module", status="completed")]
report, data = run_deterministic_checks("/project", task_list)

Retry

from teragent import retry_with_backoff

async def _call():
    return await provider.chat(messages=[...])

result = await retry_with_backoff(
    fn=_call,
    max_retries=3,
    validate=lambda r: [] if r else ["empty response"],
)
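
To make the roles of fn, max_retries, and validate concrete, here is a minimal sketch of retrying with exponential backoff. It is not the library's implementation: `retry_sketch`, the `base_delay` parameter, and the choice to raise RuntimeError when all attempts fail are assumptions.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def retry_sketch(
    fn: Callable[[], Awaitable[object]],
    max_retries: int = 3,
    validate: Optional[Callable[[object], list]] = None,
    base_delay: float = 0.01,
) -> object:
    """Call fn until it succeeds and validate() reports no errors."""
    last_problem: object = None
    for attempt in range(max_retries + 1):
        try:
            result = await fn()
            errors = validate(result) if validate else []
            if not errors:
                return result
            last_problem = errors
        except Exception as exc:  # treat any failure as retryable in this sketch
            last_problem = exc
        if attempt < max_retries:
            # Delay doubles after each failed attempt.
            await asyncio.sleep(base_delay * (2 ** attempt))
    raise RuntimeError(f"all attempts failed: {last_problem!r}")
```

In this sketch an empty list from validate means the result is acceptable, matching the lambda in the example above.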

TAPTracer

from teragent import TAPTracer

tracer = TAPTracer(trace_dir="/project/.agent/traces")

# Auto-tracing via ModelProvider
provider.set_tracer(tracer)

# Manual tracing
trace_id = await tracer.record_request(tap_request)
await tracer.record_response(tap_response, task_id="1.1", trace_id=trace_id)
await tracer.record_checklist("1.1", checklist_data)

# Export
pairs = tracer.export_dpo_pairs()
tracer.export_dpo_pairs_jsonl()
traces = tracer.export_traces()
stats = tracer.get_trace_stats()

Streaming Module (`teragent.streaming`)

StreamingToolExecutor

from teragent.streaming import StreamingToolExecutor

executor = StreamingToolExecutor(
    tool_registry=registry,
    permission_level=0,
    max_concurrent=10,
)

# Execute with streaming
results, streaming_result, stats = await executor.execute_streaming(
    stream=model.adapter.stream(compiled, model.model),
    on_text_delta=lambda text: print(text, end=""),
    on_tool_complete=lambda tc, result: print(f"Tool {tc['name']}: {result.success}"),
)

# Batch fallback
results, stats = await executor.execute_batch_fallback(tool_calls)

# Check streaming capability
can_stream = executor.can_stream_with_tools(model)

Orchestration Module (`teragent.orchestration`)

Agent

from teragent.orchestration import Agent, Handoff

agent = Agent(
    name="code_reviewer",
    provider=provider,
    tools=[review_tool, lint_tool],
    handoffs=[
        Handoff(target_agent="editor", tool_name="transfer_to_editor"),
    ],
)

Orchestrator

from teragent.orchestration import Orchestrator, OrchestrationMode

orchestrator = Orchestrator(
    agents=[designer, planner, coder, reviewer],
    mode=OrchestrationMode.SEQUENTIAL,
)

# Run orchestration
result = await orchestrator.run("Build a REST API for user management")

# Stream orchestration
async for event in orchestrator.run_stream("Build a REST API"):
    print(event)

Handoff

from teragent.orchestration import Handoff, HandoffInputFilter

handoff = Handoff(
    target_agent="reviewer",
    tool_name="transfer_to_reviewer",
    input_filter=HandoffInputFilter.remove_all_tools,
)

SharedState

from teragent.orchestration import SharedState

state = SharedState()
state.set("key", "value", scope="session")
value = state.get("key", scope="session")

CancellationToken

from teragent.orchestration import CancellationToken

token = CancellationToken()
# Request cancellation
token.cancel()
# Check if cancelled
if token.is_cancelled:
    print("Cancelled!")
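
Cancellation of this kind is cooperative: the running work has to check the token between steps. The sketch below shows the pattern with a hypothetical stand-in class (`TokenSketch` is not the library's CancellationToken) and an illustrative worker loop.

```python
import asyncio


class TokenSketch:
    """Hypothetical stand-in exposing cancel() and is_cancelled."""

    def __init__(self) -> None:
        self._cancelled = False

    def cancel(self) -> None:
        self._cancelled = True

    @property
    def is_cancelled(self) -> bool:
        return self._cancelled


async def worker(token: TokenSketch, max_steps: int) -> int:
    """Run up to max_steps, stopping early once the token is cancelled."""
    done = 0
    for _ in range(max_steps):
        if token.is_cancelled:
            break
        done += 1
        await asyncio.sleep(0)  # yield so other tasks can request cancellation
    return done


async def main() -> int:
    token = TokenSketch()
    task = asyncio.create_task(worker(token, max_steps=1000))
    await asyncio.sleep(0)  # let the worker start
    token.cancel()
    return await task
```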

Tools Module (`teragent.tools`)

BaseTool

from teragent.tools import BaseTool, ToolResult
from teragent.core.types import ToolSafety

class MyTool(BaseTool):
    name = "my_tool"
    description = "Does something useful"
    parameters_schema = {
        "type": "object",
        "properties": {
            "input": {"type": "string", "description": "Input text"},
        },
        "required": ["input"],
    }
    _safety = ToolSafety.READ_ONLY
    _concurrency_safe = True

    async def execute(self, params, progress_callback=None):
        return ToolResult(
            success=True,
            data={"result": params["input"].upper()},
            safety=ToolSafety.READ_ONLY,
        )

ToolRegistry

from teragent.tools import ToolRegistry

registry = ToolRegistry()
registry.register(MyTool())

# Query
tool = registry.get("my_tool")
names = registry.list_tool_names()
summary = registry.get_summary()

ToolOrchestrator

from teragent.tools import ToolOrchestrator

orchestrator = ToolOrchestrator(
    tool_registry=registry,
    permission_level=0,
    hook_manager=hook_mgr,
    enhanced_perm_manager=epm,
)

# Execute batch
results = await orchestrator.execute_batch(tool_calls)

# Execute single
result = await orchestrator._execute_single(tool_call_dict)

Intent Module (`teragent.intent`)

IntentClassifier

from teragent.intent import IntentClassifier, IntentType

classifier = IntentClassifier(provider)

intent = await classifier.classify("Build me a web app")
# → IntentType.CREATE_PROJECT

intent = await classifier.classify("What does this code do?")
# → IntentType.CHAT

intent = await classifier.classify("Fix the bug in main.py")
# → IntentType.DEBUG

ConfirmationGate

from teragent.intent import ConfirmationGate

gate = ConfirmationGate()

confirmed = await gate.confirm_create_project("Build a new web app")
# → True/False (asks user for approval)

Hooks Module (`teragent.hooks`)

HookManager

from teragent.hooks import HookManager, HookDecision

manager = HookManager()

# Register a hook
manager.register_hook("pre_execute", my_hook)

# Run hooks
decision = await manager.run_hooks("pre_execute", context)
# → HookDecision.ALLOW / DENY / MODIFY

Built-in Hooks

AuditHook: Logs all tool executions for audit trail
DangerousCommandHook: Blocks dangerous shell commands using the 6-layer defense

Session Module (`teragent.session`)

SessionPersistence

from teragent.session import SessionPersistence

persistence = SessionPersistence(db_path=".agent/sessions.db")

# Create session
session_id = persistence.create(title="My Session", intent="chat")

# Save message
persistence.save_message(session_id, message)

# Restore session
messages = persistence.restore(session_id)

# List sessions
sessions = persistence.list_sessions()

Router Module (`teragent.router`)

RoutingReason

from teragent.router import RoutingReason

# Enum values indicating why a model was selected
RoutingReason.INTENT                    # Default intent-based routing
RoutingReason.MULTIMODAL_OVERRIDE       # Has multimodal content → M3
RoutingReason.DESKTOP_OVERRIDE          # Has desktop context → M3
RoutingReason.VIDEO_OVERRIDE            # Has video content → M3
RoutingReason.CONTEXT_LENGTH_OVERRIDE   # Context >200K → V4/M3
RoutingReason.LONG_HORIZON_OVERRIDE     # Long-horizon task → GLM-5
RoutingReason.COST_OPTIMIZATION         # Budget constraint → cheaper model
RoutingReason.DEGRADATION               # Primary unavailable → fallback
RoutingReason.PIPELINE_PROFILE          # Explicit pipeline profile assignment
RoutingReason.EXPLICIT                  # User explicitly specified model

RoutingDecision

from teragent.router import RoutingDecision, RoutingReason

decision = RoutingDecision(
    selected_driver="openai_compatible.deepseek_v4_pro",
    selected_compiler="deepseek_v4",
    reason=RoutingReason.INTENT,
    intent="design",
)

Properties:

selected_driver: str — Full driver name (e.g., "openai_compatible.deepseek_v4_pro")
selected_compiler: str — Compiler name (e.g., "deepseek_v4")
reason: RoutingReason — Primary routing reason
intent: str — The request's intent type
trace: list[tuple[str, str, str]] — Ordered list of (reason, candidate, accepted/rejected) tuples
timestamp: float — Decision timestamp (epoch seconds)
estimated_cost: float — Estimated cost for this request
context_tokens: int — Estimated context token count

Methods:

add_trace(reason, candidate, result) — Append a trace entry for debugging

RoutingTable

from teragent.router import RoutingTable

table = RoutingTable(
    multimodal_driver="openai_compatible.minimax_m3",
    desktop_driver="openai_compatible.minimax_m3",
    long_horizon_driver="openai_compatible.glm_5",
)

Key attributes:

intent_routing: dict[str, str] — Maps intent → default driver name
multimodal_driver: str — Driver for multimodal content (default: M3)
desktop_driver: str — Driver for desktop context (default: M3)
long_horizon_driver: str — Driver for long-horizon tasks (default: GLM-5)
long_context_candidates: list[str] — Models supporting >200K context
cost_fallback_order: list[str] — Cheapest-to-most-expensive model order
degradation_map: dict[str, str] — Primary → fallback mapping
model_pricing: dict[str, dict[str, float]] — Per-model CNY pricing per million tokens
max_context_per_model: dict[str, int] — Per-model max context tokens
compiler_map: dict[str, str] — Driver name → compiler name mapping

Methods:

resolve_compiler(driver_name) -> str — Resolve compiler name from driver name
get_intent_driver(intent) -> str — Get default driver for an intent type
get_pricing(driver_name) -> dict[str, float] — Get pricing dict for a model
from_dict(data) -> RoutingTable — Create RoutingTable from a config dict
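
Since model_pricing is expressed per million tokens, a request's cost follows from simple arithmetic. The sketch below is an assumption-laden illustration: the pricing keys ("input", "output", "cache_hit"), the prices themselves, and the premise that cache-hit tokens are a subset of prompt tokens are all hypothetical, and `estimate_cost_cny` is not a library function.

```python
def estimate_cost_cny(
    pricing: dict[str, float],
    prompt_tokens: int,
    completion_tokens: int,
    cache_hit_tokens: int = 0,
) -> tuple[float, float]:
    """Return (cost, saved) in CNY given per-million-token prices."""
    uncached = prompt_tokens - cache_hit_tokens  # assumes hits are part of the prompt
    cost = (
        uncached * pricing["input"]
        + cache_hit_tokens * pricing["cache_hit"]
        + completion_tokens * pricing["output"]
    ) / 1_000_000
    saved = cache_hit_tokens * (pricing["input"] - pricing["cache_hit"]) / 1_000_000
    return cost, saved


# Made-up prices, for illustration only.
example_pricing = {"input": 2.0, "output": 8.0, "cache_hit": 0.5}
cost, saved = estimate_cost_cny(example_pricing, 5000, 2000, cache_hit_tokens=3000)
```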

ModelRouter

from teragent.router import ModelRouter, RoutingTable, RoutingDecision

router = ModelRouter(
    available_providers={"openai_compatible.glm_5": glm_provider, ...},
    routing_table=RoutingTable(),
)

decision = router.route(tap_request)
provider = router.get_provider(decision.selected_driver)

Methods:

route(request) -> RoutingDecision — Route a TAP request through the 6-step decision flow
route_for_stage(stage, request) -> RoutingDecision — Route using active pipeline profile for a stage
get_decision_log() -> list[RoutingDecision] — Get the log of all routing decisions
get_provider(driver_name) -> ModelProvider | None — Get a provider by driver name
set_monthly_budget(limit_cny, warning_threshold, auto_downgrade) — Configure monthly budget

PipelineProfile

from teragent.router import PipelineProfile

profile = PipelineProfile(
    name="default",
    description="Default pipeline configuration",
    design_driver="openai_compatible.deepseek_v4_pro",
    plan_driver="openai_compatible.glm_5",
    execute_driver="openai_compatible.glm_5",
    review_driver="openai_compatible.deepseek_v4_pro",
)

Methods:

get_driver_for_stage(stage) -> str — Get driver name for a pipeline stage
from_dict(name, data) -> PipelineProfile — Create from a config dict

PipelineManager

from teragent.router import PipelineManager, PipelineProfile

pm = PipelineManager()

# Register profiles
pm.register_profile(PipelineProfile(name="budget", ...))

# Switch profiles at runtime
pm.set_active_profile("budget")

# Get driver for a stage
driver = pm.get_driver("execute")

Methods:

register_profile(profile) -> None — Register a pipeline profile
set_active_profile(name) -> bool — Switch to a named profile (returns True if found)
get_driver(stage) -> str — Get driver name for a stage from active profile
list_profiles() -> list[str] — List all registered profile names
get_profile(name) -> PipelineProfile | None — Get a profile by name
from_config(config, routing_table) -> PipelineManager — Create from TOML config dict

Properties:

active_profile_name -> str — Name of the currently active profile
active_profile -> PipelineProfile — The currently active PipelineProfile

Long-Horizon Module (`teragent.long_horizon`)

SubGoal

from teragent.long_horizon import SubGoal

goal = SubGoal(
    id="sg_1",
    description="Design database schema",
    completion_criteria="All tables defined with proper constraints",
    estimated_steps=10,
    dependencies=["sg_0"],  # Depends on sg_0
    status="pending",        # pending | in_progress | completed | failed
)

Attributes:

id: str — Unique identifier
description: str — Sub-goal description
completion_criteria: str — Measurable completion criteria
estimated_steps: int — Estimated number of steps
dependencies: list[str] — IDs of sub-goals this depends on (DAG topology)
status: str — Current status: pending | in_progress | completed | failed
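
Because dependencies form a DAG, a valid execution order is any topological order of the sub-goal IDs. The sketch below uses Python's standard `graphlib` for this; the IDs are hypothetical and the library's own scheduling logic is not shown in this reference.

```python
from graphlib import TopologicalSorter

# Hypothetical sub-goal IDs mapped to the IDs they depend on.
dependencies = {
    "sg_0": [],
    "sg_1": ["sg_0"],
    "sg_2": ["sg_0"],
    "sg_3": ["sg_1", "sg_2"],
}

# static_order() yields each ID only after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
```

A cyclic dependency makes `static_order()` raise `graphlib.CycleError`, which is a convenient way to reject invalid plans early.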

PhaseResult

from teragent.long_horizon import PhaseResult

result = PhaseResult(
    sub_goal_id="sg_1",
    success=True,
    result_text="Database schema designed with 5 tables...",
    steps_taken=8,
    files_created=["src/models/user.py", "src/models/role.py"],
    files_modified=["src/db/init.py"],
    errors=[],
)

Attributes:

sub_goal_id: str — Corresponding sub-goal ID
success: bool — Whether the phase succeeded
result_text: str — Model output text
steps_taken: int — Steps consumed in this phase
files_created: list[str] — Files created
files_modified: list[str] — Files modified
errors: list[str] — Error messages

LongHorizonResult

from teragent.long_horizon import LongHorizonResult

result = LongHorizonResult(
    task_id="task_001",
    goal="Implement user management system",
    success=True,
    total_steps=120,
    total_elapsed_minutes=95.5,
    completed_sub_goals=5,
    total_sub_goals=5,
    strategy_switches=1,
    phase_results=[...],
    final_summary="All sub-goals completed successfully",
    checkpoints_saved=6,
)

Attributes:

task_id: str — Unique task identifier
goal: str — Original goal description
success: bool — Overall success
total_steps: int — Total steps consumed
total_elapsed_minutes: float — Total elapsed time
completed_sub_goals: int — Completed sub-goals count
total_sub_goals: int — Total sub-goals count
strategy_switches: int — Number of strategy switches
phase_results: list[PhaseResult] — Detailed phase results
final_summary: str — Final summary text
checkpoints_saved: int — Number of checkpoints saved

LongHorizonTaskManager

from teragent.long_horizon import LongHorizonTaskManager
from teragent.core.tap import LongHorizonConfig

manager = LongHorizonTaskManager(
    goal="Implement a complete user management system",
    model_provider=glm_provider,
    config=LongHorizonConfig(max_duration_hours=4),
)

result = await manager.execute_long_task()

Methods:

decompose_goal() -> list[SubGoal] — Break down the goal into sub-goals
execute_phase(sub_goal) -> PhaseResult — Execute one sub-goal phase
execute_long_task() -> LongHorizonResult — Run the full long-horizon task
save_checkpoint() -> str — Save current state as checkpoint
evaluate_progress() -> SelfEvaluationResult — Run self-evaluation
recover_from_checkpoint(checkpoint_id) -> bool — Resume from checkpoint

Checkpoint & CheckpointStore

from teragent.long_horizon.checkpoint import Checkpoint, CheckpointStore

store = CheckpointStore(base_dir=".teragent/checkpoints")

# Save checkpoint
checkpoint = Checkpoint(
    checkpoint_id=store.generate_checkpoint_id(),
    task_id="task_001",
    timestamp=store.now_iso(),
    phase="executing",
    completed_sub_goals=["sg_0", "sg_1"],
    current_sub_goal="sg_2",
    steps_completed=50,
    elapsed_minutes=30.0,
    strategy_switches=0,
    state_data={},
)
checkpoint_id = await store.save(checkpoint)

# Load latest
latest = await store.load_latest("task_001")

# List all
checkpoints = await store.list_checkpoints("task_001")

# Cleanup old (keep last 5)
deleted = await store.cleanup("task_001", keep_last=5)

SelfEvaluator

from teragent.long_horizon import SelfEvaluator, SelfEvaluationResult

evaluator = SelfEvaluator(
    model_provider=provider,
    evaluation_interval_steps=10,
    evaluation_interval_minutes=30.0,
)

if evaluator.should_evaluate(steps_since_last=10, minutes_since_last=30.0):
    result = await evaluator.evaluate(goal, progress_report, recent_results)
    if result.should_switch_strategy:
        # Trigger strategy switch
        ...

SelfEvaluationResult attributes:

goal_alignment: int — Goal alignment score (1-5)
output_quality: int — Output quality score (1-5)
bottleneck_identified: str — Bottleneck description
strategy_review: str — Strategy effectiveness review
next_step_plan: str — Recommended next steps
overall_score: float — Weighted overall score
should_switch_strategy: bool — Whether to switch strategy
raw_response: str — Raw model response text

SelfEvaluator methods:

evaluate(goal, progress_report, recent_results) -> SelfEvaluationResult — Run self-evaluation
should_evaluate(steps_since_last, minutes_since_last) -> bool — Check if evaluation is due
reset_evaluation_timer(current_steps) — Reset evaluation timer

StrategySwitcher

from teragent.long_horizon import StrategySwitcher, StrategySwitchRecord

switcher = StrategySwitcher(
    model_provider=provider,
    stagnation_threshold=3,
    no_progress_threshold=5,
    similarity_threshold=0.8,
)

is_stagnant, reason = switcher.detect_stagnation(recent_results, recent_steps)
if is_stagnant:
    new_strategy, record = await switcher.switch_strategy(
        current_strategy, reason, goal, progress_report
    )

StrategySwitchRecord attributes:

timestamp: str — ISO format timestamp
reason: str — Switch reason
previous_strategy: str — Previous strategy description
new_strategy: str — New strategy description
risk_assessment: str — Risk assessment
effectiveness: str — Post-hoc effectiveness evaluation

StrategySwitcher methods:

detect_stagnation(recent_results, recent_steps) -> (bool, str) — Detect stagnation
switch_strategy(current_strategy, reason, goal, progress_report) -> (str, StrategySwitchRecord) — Execute strategy switch
get_switch_history() -> list[StrategySwitchRecord] — Get switch history
assess_switch_effectiveness(record_index, subsequent_results) -> str — Evaluate switch effectiveness

Properties:

current_strategy -> str — Current strategy description
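
The exact stagnation test is not documented here. One way a similarity_threshold could be applied, offered purely as a sketch, is to compare consecutive recent results with `difflib.SequenceMatcher` and flag stagnation when all of them are near-duplicates. The function `looks_stagnant` and its `window` parameter are hypothetical.

```python
from difflib import SequenceMatcher


def looks_stagnant(
    recent_results: list[str],
    window: int = 3,
    similarity_threshold: float = 0.8,
) -> tuple[bool, str]:
    """Flag stagnation when the last `window` results are all near-duplicates."""
    if len(recent_results) < window:
        return False, ""
    tail = recent_results[-window:]
    ratios = [SequenceMatcher(None, a, b).ratio() for a, b in zip(tail, tail[1:])]
    if all(ratio >= similarity_threshold for ratio in ratios):
        return True, f"last {window} results are at least {similarity_threshold:.0%} similar"
    return False, ""
```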

ProgressTracker & ProgressReport

from teragent.long_horizon.progress import ProgressTracker, ProgressReport

tracker = ProgressTracker(task_id="task_1", goal="Implement user system")
tracker.start_sub_goal("sg_1", "Design database")
tracker.record_step("Create User table")
tracker.complete_sub_goal("sg_1", "User table created")
report = tracker.get_report()

ProgressReport attributes:

task_id: str — Task identifier
goal: str — Original goal
total_sub_goals: int — Total sub-goals
completed_sub_goals: int — Completed sub-goals
current_phase: str — Current phase (planning/executing/evaluating/stagnant)
steps_completed: int — Steps taken
elapsed_minutes: float — Elapsed time
strategy_switches: int — Strategy switch count
estimated_remaining_minutes: float — Estimated time remaining

Budget Module (`teragent.reliability.budget`)

StepBudget

from teragent.reliability.budget import StepBudget

budget = StepBudget(max_steps=50)

if budget.consume():  # Returns True if budget remaining
    # Do work
    pass

budget.resume(extra_steps=10)  # Add more steps after user confirmation

Properties:

current_steps -> int — Steps consumed
remaining -> int — Steps remaining
exhausted -> bool — Whether budget is exhausted
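
The interplay of consume(), resume(), and the three properties can be sketched with a small stand-in class. This is an illustration of the documented behavior under the assumption that consume() counts a step only when budget remains; `StepBudgetSketch` is not the library class.

```python
class StepBudgetSketch:
    """Hypothetical stand-in mirroring the documented StepBudget surface."""

    def __init__(self, max_steps: int) -> None:
        self.max_steps = max_steps
        self.current_steps = 0

    @property
    def remaining(self) -> int:
        return max(self.max_steps - self.current_steps, 0)

    @property
    def exhausted(self) -> bool:
        return self.remaining == 0

    def consume(self) -> bool:
        """Use one step; return False when nothing is left."""
        if self.exhausted:
            return False
        self.current_steps += 1
        return True

    def resume(self, extra_steps: int) -> None:
        """Extend the budget, for example after user confirmation."""
        self.max_steps += extra_steps
```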

CostRecord

from teragent.reliability.budget import CostRecord

record = CostRecord(
    driver_name="openai_compatible.deepseek_v4_pro",
    compiler="deepseek_v4",
    model="deepseek-v4-pro",
    intent="design",
    prompt_tokens=5000,
    completion_tokens=2000,
    cache_hit_tokens=3000,
    cost_cny=0.052,
    cost_saved_cny=0.012,
    success=True,
    latency_ms=3500.0,
)

Properties:

date_str -> str — Date string (YYYY-MM-DD) for date-dimension reporting
total_tokens -> int — Total tokens (prompt + completion)

MonthlyBudgetConfig

from teragent.reliability.budget import MonthlyBudgetConfig

config = MonthlyBudgetConfig(
    limit_cny=500.0,                     # Monthly budget cap (0 = no limit)
    warning_threshold=0.8,                # Warn at 80%
    critical_threshold=0.95,              # Auto-downgrade at 95%
    auto_downgrade_driver="openai_compatible.deepseek_v4_flash",  # Fallback driver
    notify_on_warning=True,               # Emit events on warning
)
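
The thresholds above suggest a mapping from budget utilization to the status levels returned by check_budget ("ok", "warning", "critical", "exhausted"). The sketch below assumes the levels switch at the configured thresholds and at 100%; `budget_level` is a hypothetical helper, not the tracker's code.

```python
def budget_level(
    spent_cny: float,
    limit_cny: float,
    warning_threshold: float = 0.8,
    critical_threshold: float = 0.95,
) -> dict:
    """Map spend against a monthly cap to a status level."""
    if limit_cny <= 0:  # a limit of 0 means no cap
        return {"level": "ok", "utilization": 0.0}
    utilization = spent_cny / limit_cny
    if utilization >= 1.0:
        level = "exhausted"
    elif utilization >= critical_threshold:
        level = "critical"
    elif utilization >= warning_threshold:
        level = "warning"
    else:
        level = "ok"
    return {"level": level, "utilization": utilization}
```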

CrossModelCostTracker

from teragent.reliability.budget import CrossModelCostTracker, MonthlyBudgetConfig, CostRecord

tracker = CrossModelCostTracker()
tracker.set_monthly_budget(MonthlyBudgetConfig(limit_cny=500.0))

# Record a cost
budget_status = tracker.record(CostRecord(
    driver_name="openai_compatible.deepseek_v4_pro",
    compiler="deepseek_v4",
    model="deepseek-v4-pro",
    intent="design",
    prompt_tokens=5000,
    completion_tokens=2000,
    cost_cny=0.052,
))

# Record from TAP response (convenience method)
status = tracker.record_from_tap_response(
    driver_name="openai_compatible.deepseek_v4_pro",
    compiler="deepseek_v4",
    model="deepseek-v4-pro",
    intent="design",
    prompt_tokens=5000,
    completion_tokens=2000,
    cache_hit_tokens=3000,
    latency_ms=3500.0,
)

# Check budget
status = tracker.check_budget()
# → {"level": "ok"|"warning"|"critical"|"exhausted", "utilization": float, ...}

# Generate report
report = tracker.generate_report(group_by="model")
model_stats = tracker.get_model_stats("openai_compatible.deepseek_v4_pro")
all_stats = tracker.get_all_model_stats()
cache_savings = tracker.get_cache_savings()

Key methods:

record(record) -> dict — Record a cost entry and check budget
record_from_tap_response(...) -> dict — Record from TAP response data (calculates cost)
check_budget() -> dict — Check current budget status
set_monthly_budget(config) — Configure monthly budget
generate_report(group_by, start_date, end_date) -> dict — Generate cost report
get_model_stats(driver_name) -> dict — Get stats for a specific model
get_all_model_stats() -> dict — Get stats for all models
get_cache_savings() -> dict — Get cache savings statistics
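
How record_from_tap_response() prices a call is not spelled out in this reference. The sketch below shows one plausible calculation, assuming cache_hit_tokens is a subset of prompt_tokens that is billed at a lower rate. The function name estimate_cost_cny and every price in the example are hypothetical placeholders, not TerAgent's API and not any provider's real pricing.

```python
def estimate_cost_cny(
    prompt_tokens: int,
    completion_tokens: int,
    cache_hit_tokens: int = 0,
    *,
    input_price: float,         # CNY per 1M uncached input tokens (placeholder)
    cached_input_price: float,  # CNY per 1M cached input tokens (placeholder)
    output_price: float,        # CNY per 1M output tokens (placeholder)
) -> dict:
    """Estimate call cost and the amount saved by cache hits (illustrative only)."""
    # Assumption: cache hits are counted inside prompt_tokens.
    cached = min(cache_hit_tokens, prompt_tokens)
    uncached = prompt_tokens - cached
    total = (
        uncached * input_price
        + cached * cached_input_price
        + completion_tokens * output_price
    ) / 1_000_000
    savings = cached * (input_price - cached_input_price) / 1_000_000
    return {"cost_cny": total, "cache_savings_cny": savings}


# Same token counts as the example above, with made-up prices.
estimate = estimate_cost_cny(
    5000, 2000, 3000,
    input_price=2.0, cached_input_price=0.5, output_price=8.0,
)
```

The savings figure is the difference between what the cached tokens would have cost at the uncached rate and what they cost at the cached rate, which is the kind of number get_cache_savings() might aggregate.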

Properties:

is_budget_warning -> bool — Whether budget is in warning state
is_budget_exhausted -> bool — Whether budget is exhausted
should_auto_downgrade -> bool — Whether to auto-downgrade
total_records -> int — Number of cost records
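
The levels reported by check_budget() can be related to the MonthlyBudgetConfig thresholds roughly as follows. This is a minimal sketch, not the library's implementation: the function budget_level is hypothetical, and treating 100% utilization as "exhausted" is an assumption.

```python
def budget_level(
    spent_cny: float,
    limit_cny: float,
    warning_threshold: float = 0.8,
    critical_threshold: float = 0.95,
) -> dict:
    """Map monthly spend to a budget level (illustrative only)."""
    if limit_cny <= 0:
        # limit_cny=0 means "no limit", so the budget can never be exceeded.
        return {"level": "ok", "utilization": 0.0}
    utilization = spent_cny / limit_cny
    if utilization >= 1.0:  # assumption: exhausted at or above 100%
        level = "exhausted"
    elif utilization >= critical_threshold:
        level = "critical"
    elif utilization >= warning_threshold:
        level = "warning"
    else:
        level = "ok"
    return {"level": level, "utilization": utilization}


status = budget_level(spent_cny=420.0, limit_cny=500.0)
```

With a 500 CNY limit, 420 CNY of spend is 84% utilization, which falls between the warning and critical thresholds.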

Circuit Breaker Module (`teragent.reliability.circuit_breaker`)

ModelCircuitBreakerManager

from teragent.reliability.circuit_breaker import ModelCircuitBreakerManager, ModelBreakerConfig

manager = ModelCircuitBreakerManager()

# Record a failure (returns fallback model name if breaker just opened)
fallback = manager.record_failure("deepseek_v4_pro", "API timeout")

# Record a success
manager.record_success("deepseek_v4_pro")

# Check if a model can be called
if manager.can_call("deepseek_v4_pro"):
    # Safe to call
    ...

# Get fallback model
fallback = manager.get_fallback("deepseek_v4_pro")

# Get all model states
states = manager.get_all_states()
# → {"deepseek_v4_pro": "closed", "minimax_m3": "half_open", ...}

# Reset a specific model or all
manager.reset("deepseek_v4_pro")  # Reset specific
manager.reset()                    # Reset all

ModelBreakerConfig

from teragent.reliability.circuit_breaker import ModelBreakerConfig

config = ModelBreakerConfig(
    model_name="deepseek_v4_pro",
    max_consecutive_failures=5,       # Open after N consecutive failures
    window_seconds=300.0,            # Sliding window duration
    cooldown_seconds=60.0,           # Time before half-open transition
    failure_threshold_percent=0.5,   # Open if failure rate in window exceeds 50% (given as a fraction)
    half_open_max_calls=3,           # Test calls allowed in half-open
)
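
The state transitions these settings imply (closed, open, half_open) can be sketched as a small state machine. This is a simplified illustration under stated assumptions: the class SketchBreaker is hypothetical, it only models the consecutive-failure rule, and it leaves out the sliding-window failure rate controlled by window_seconds and failure_threshold_percent.

```python
import time
from typing import Callable


class SketchBreaker:
    """Minimal circuit breaker: closed -> open -> half_open -> closed."""

    def __init__(
        self,
        max_consecutive_failures: int = 5,
        cooldown_seconds: float = 60.0,
        half_open_max_calls: int = 3,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_consecutive_failures = max_consecutive_failures
        self.cooldown_seconds = cooldown_seconds
        self.half_open_max_calls = half_open_max_calls
        self.clock = clock
        self.state = "closed"
        self._failures = 0
        self._opened_at = 0.0
        self._half_open_calls = 0

    def can_call(self) -> bool:
        # After the cooldown, an open breaker admits a limited number of probes.
        if self.state == "open" and self.clock() - self._opened_at >= self.cooldown_seconds:
            self.state = "half_open"
            self._half_open_calls = 0
        if self.state == "half_open":
            if self._half_open_calls >= self.half_open_max_calls:
                return False
            self._half_open_calls += 1
            return True
        return self.state == "closed"

    def record_success(self) -> None:
        # Any success resets the failure streak and closes the breaker.
        self._failures = 0
        self.state = "closed"

    def record_failure(self) -> None:
        self._failures += 1
        # A failed probe in half_open reopens the breaker immediately.
        if self.state == "half_open" or self._failures >= self.max_consecutive_failures:
            self.state = "open"
            self._opened_at = self.clock()
```

Passing the clock in as a parameter keeps the sketch testable without sleeping.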

Recovery Module (`teragent.reliability.recovery`)

DegradationChain

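from teragent.reliability.circuit_breaker import ModelCircuitBreakerManager
from teragent.reliability.recovery import DegradationChain

# Breaker manager consulted when choosing fallbacks (see Circuit Breaker Module)
breaker_mgr = ModelCircuitBreakerManager()
chain = DegradationChain(breaker_manager=breaker_mgr)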

# Get next available fallback
fallback = chain.get_fallback("deepseek_v4_pro", task_type="heavy")
# → "glm_5"

# Get full chain
full_chain = chain.get_full_chain("heavy")
# → ["deepseek_v4_pro", "glm_5", "deepseek_v4_flash"]

# Add custom chain
chain.add_chain("light", ["deepseek_v4_flash", "glm_5"])

Default chains:

"heavy": V4-Pro → GLM-5 → V4-Flash
"multimodal": M3 → V4-Pro (degrades to text-only)
"default": V4-Pro → GLM-5 → V4-Flash
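
Fallback selection along these chains can be pictured as walking the list past the failed model and skipping any candidate whose breaker is open. The sketch below is an assumption about the behavior, not the library's code: next_fallback and the can_call callback are hypothetical, and the chain contents simply mirror the defaults listed above.

```python
from typing import Callable, Dict, List, Optional

# Mirrors the default chains listed above.
DEFAULT_CHAINS: Dict[str, List[str]] = {
    "heavy": ["deepseek_v4_pro", "glm_5", "deepseek_v4_flash"],
    "multimodal": ["minimax_m3", "deepseek_v4_pro"],
    "default": ["deepseek_v4_pro", "glm_5", "deepseek_v4_flash"],
}


def next_fallback(
    model: str,
    task_type: str,
    can_call: Callable[[str], bool],
    chains: Dict[str, List[str]] = DEFAULT_CHAINS,
) -> Optional[str]:
    """Return the first callable model after `model` in the chain, or None."""
    chain = chains.get(task_type, chains["default"])
    # Start after the failed model; if it is not in the chain, try the whole chain.
    start = chain.index(model) + 1 if model in chain else 0
    for candidate in chain[start:]:
        if can_call(candidate):
            return candidate
    return None


# With every breaker closed, the next entry in the chain is chosen.
choice = next_fallback("deepseek_v4_pro", "heavy", can_call=lambda name: True)
```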

LongHorizonRecoveryManager

from teragent.reliability.recovery import LongHorizonRecoveryManager
from teragent.long_horizon.checkpoint import CheckpointStore

store = CheckpointStore()
recovery_mgr = LongHorizonRecoveryManager(checkpoint_store=store)

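# Recover from the latest checkpoint. Run this inside an async function;
# task_manager is an existing task manager instance (e.g. a LongHorizonTaskManager).
success = await recovery_mgr.recover_from_checkpoint(task_manager)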

# Check if should downgrade to standard mode
if recovery_mgr.should_downgrade_to_standard(recovery_attempts=3, elapsed_time=1800):
    print("Switching to standard mode")

# Get reconnection delay (exponential backoff with jitter)
delay = recovery_mgr.get_reconnection_delay(attempt=2)
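
Exponential backoff with jitter, as mentioned for get_reconnection_delay(), can be sketched as below. The function reconnection_delay and its parameters base, cap, and jitter are hypothetical, and the library's actual constants are not documented here.

```python
import random
from typing import Callable


def reconnection_delay(
    attempt: int,
    base: float = 1.0,    # delay for attempt 0, in seconds (assumed)
    cap: float = 60.0,    # upper bound on the delay (assumed)
    jitter: float = 0.25, # +/- fraction of random spread (assumed)
    rng: Callable[[], float] = random.random,
) -> float:
    """Return base * 2**attempt seconds, randomized by +/- jitter and capped."""
    raw = min(cap, base * (2 ** attempt))
    # rng() in [0, 1) maps to a factor in [1 - jitter, 1 + jitter).
    factor = 1.0 - jitter + 2.0 * jitter * rng()
    return min(cap, raw * factor)


delay = reconnection_delay(attempt=2)  # somewhere between 3 and 5 seconds
```

The jitter spreads reconnect attempts from many clients so they do not all retry at the same instant.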

RateLimitHandler & RateLimitInfo

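from teragent.reliability.circuit_breaker import ModelCircuitBreakerManager
from teragent.reliability.recovery import RateLimitHandler, RateLimitInfo

# Breaker manager shared with the handler (see Circuit Breaker Module)
breaker_mgr = ModelCircuitBreakerManager()
handler = RateLimitHandler(breaker_manager=breaker_mgr)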

# Parse rate limit response (normalizes different provider formats)
info = handler.parse_rate_limit_response(
    model_name="deepseek_v4_pro",
    status_code=429,
    headers={"Retry-After": "30"},
    body=None,
)

# Check if should retry
if handler.should_retry("deepseek_v4_pro", info):
    delay = handler.get_backoff_delay("deepseek_v4_pro", attempt=1, rate_limit_info=info)

RateLimitInfo attributes:

model_name: str — Model that returned the rate limit response
requests_remaining: int | None — Remaining requests in current window
tokens_remaining: int | None — Remaining tokens in current window
reset_time: float | None — Unix timestamp when window resets
retry_after: float | None — Seconds to wait before retrying
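
To illustrate how provider headers might be normalized into these attributes, here is a minimal sketch. SketchRateLimitInfo and parse_headers are hypothetical stand-ins, the x-ratelimit-* header names are an assumption (providers differ), and only the numeric seconds form of Retry-After is handled, not the HTTP-date form.

```python
import time
from dataclasses import dataclass
from typing import Callable, Mapping, Optional


@dataclass
class SketchRateLimitInfo:
    model_name: str
    requests_remaining: Optional[int] = None
    tokens_remaining: Optional[int] = None
    reset_time: Optional[float] = None
    retry_after: Optional[float] = None


def parse_headers(
    model_name: str,
    headers: Mapping[str, str],
    now: Optional[float] = None,
) -> SketchRateLimitInfo:
    """Build a normalized record from rate limit headers (illustrative only)."""
    now = time.time() if now is None else now
    # Header names are case-insensitive, so compare in lowercase.
    lowered = {key.lower(): value for key, value in headers.items()}

    def number(key: str, cast: Callable):
        value = lowered.get(key)
        if value is None:
            return None
        try:
            return cast(value)
        except ValueError:
            return None  # Unparseable values are treated as missing.

    retry_after = number("retry-after", float)
    return SketchRateLimitInfo(
        model_name=model_name,
        requests_remaining=number("x-ratelimit-remaining-requests", int),
        tokens_remaining=number("x-ratelimit-remaining-tokens", int),
        reset_time=None if retry_after is None else now + retry_after,
        retry_after=retry_after,
    )


info = parse_headers("deepseek_v4_pro", {"Retry-After": "30"}, now=1000.0)
```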

MiniMax Adapter (`teragent.core.adapters.minimax_native`)

MiniMaxNativeAdapter

from teragent.core.adapters.minimax_native import MiniMaxNativeAdapter

adapter = MiniMaxNativeAdapter(
    base_url="https://api.minimaxi.com/v1",
    api_key="your-api-key",
    group_id="your-group-id",       # Optional: required for some endpoints
    timeout=300.0,
    multimodal_timeout=600.0,       # Longer timeout for video processing
    enable_fake_tools=False,
)

MiniMax-specific methods:

send_desktop_command(command, params, screenshot, interactive_elements, active_window, model) -> dict — Send desktop command via dedicated endpoint
- Returns {"action": ..., "reasoning": ..., "raw_response": ...}
- Falls back to chat completions if desktop endpoint unavailable (404)

MiniMax-specific properties:

billing_summary -> dict — Cumulative billing tracker (input/output/cache tokens, request count)
rate_limit_info -> MiniMaxRateLimitInfo — Current rate limit information from headers

Capabilities:

multimodal: True
desktop: True
video: True
msa_efficient: True
max_context_tokens: 1_000_000

MiniMaxRateLimitInfo

from teragent.core.adapters.minimax_native import MiniMaxRateLimitInfo

info = MiniMaxRateLimitInfo()
# Updated automatically from response headers

Properties:

limit: int — Maximum requests in current window
remaining: int — Remaining requests
reset: float — Timestamp when window resets
is_exhausted -> bool — Whether rate limit is exhausted
reset_in_seconds -> float — Seconds until window resets

GLM Adapter (`teragent.core.adapters.glm_native`)

GLMNativeAdapter

from teragent.core.adapters.glm_native import GLMNativeAdapter

adapter = GLMNativeAdapter(
    base_url="https://open.bigmodel.cn/api/paas/v4",
    api_key="your-api-key",
    timeout=300.0,
    multimodal_timeout=600.0,       # Longer timeout for 1M context requests
)

GLM-specific capabilities:

Supports GLM-5 and GLM-5.2 model endpoints
Compatible with Zhipu AI's native message format
Handles High/Max thinking mode response parsing
Supports PreservedThinking trace injection

GLM-specific properties:

capabilities -> dict — Includes streaming: True, tool_calling: True, thinking_modes: ["high", "max"]

Note: GLM models can also be used via the openai_compatible adapter with the appropriate compiler (glm_5 or glm_52). The glm_native adapter adds optimizations specific to Zhipu AI.

Compilers (`teragent.core.compilers`)

DeepSeekV4Compiler

from teragent.core.compilers.deepseek_v4 import DeepSeekV4Compiler

compiler = DeepSeekV4Compiler()
# The variant (flash/pro) is set via the `variant` parameter in compile() or driver config

Features:

Cache-aware prompt layout (freezes system prompt and tool definitions at the beginning)
Thinking mode support (auto, quick, deep)
1M context optimization
Flash/Pro variant switching via variant parameter (not separate compiler names)

Key extra dict keys: cache_aware, variant
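
The idea behind a cache-aware layout is that provider-side prefix caches only help when the start of the prompt is byte-identical across requests. The sketch below illustrates that principle under stated assumptions: build_cache_friendly_messages is a hypothetical helper, the message shape is the common chat format, and none of this is DeepSeekV4Compiler's actual output.

```python
import json
from typing import Dict, List


def build_cache_friendly_messages(
    system_prompt: str,
    tools: List[dict],
    history: List[Dict[str, str]],
    user_turn: str,
) -> List[Dict[str, str]]:
    """Put stable content first so consecutive requests share a prefix."""
    # sort_keys gives a deterministic serialization regardless of dict order.
    tool_text = json.dumps(tools, sort_keys=True, ensure_ascii=False)
    frozen_prefix = {
        "role": "system",
        "content": f"{system_prompt}\n\nTools:\n{tool_text}",
    }
    # Volatile content (history, the new user turn) goes after the prefix.
    return [frozen_prefix, *history, {"role": "user", "content": user_turn}]


tools = [{"name": "read_file", "description": "Read a file"}]
first = build_cache_friendly_messages("You are a coder.", tools, [], "Fix bug A")
second = build_cache_friendly_messages("You are a coder.", tools, [], "Fix bug B")
```

Both requests start with the same system message, so a prefix cache could reuse it even though the user turns differ.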

GLM5Compiler

from teragent.core.compilers.glm_5 import GLM5Compiler

compiler = GLM5Compiler()

Features:

Recency effect optimization (key instruction placed last)
Long-horizon task support with self-evaluation injection
200K context optimization
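
Placing the key instruction last can be pictured with a small prompt builder. This is only a sketch of the ordering idea: build_recency_prompt and its section labels are hypothetical and do not reflect GLM5Compiler's real template.

```python
from typing import Dict, List


def build_recency_prompt(
    context: Dict[str, str],
    constraints: List[str],
    instruction: str,
) -> str:
    """Order sections so the core instruction is the final thing the model reads."""
    # Reference material first; empty sections are skipped.
    parts = [f"[{name}]\n{text}" for name, text in context.items() if text]
    if constraints:
        parts.append("[constraints]\n" + "\n".join(f"- {c}" for c in constraints))
    # The instruction always goes last.
    parts.append("[instruction]\n" + instruction)
    return "\n\n".join(parts)


prompt = build_recency_prompt(
    context={"design": "Auth service overview", "memory": ""},
    constraints=["Python 3.10+"],
    instruction="Implement user login module",
)
```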

GLM52Compiler

from teragent.core.compilers.glm_52 import GLM52Compiler

compiler = GLM52Compiler()

Features:

1M context with context degradation support (1M → 200K)
Dual thinking modes: High (default) and Max
PreservedThinking: preserves reasoning traces across coding sessions
5V-Turbo vision coordination: enables vision→code→verify cycles with GLM-5V-Turbo

Key extra dict keys: thinking_mode, preserved_thinking, vision_coordination

GLM5VTurboCompiler

from teragent.core.compilers.glm_5v_turbo import GLM5VTurboCompiler

compiler = GLM5VTurboCompiler()

Features:

Vision analysis compilation for GLM-5V-Turbo model
Converts multimodal content into GLM-5V-Turbo's expected format
Used in coordination with GLM52Compiler for vision→code→verify cycles

MiniMaxM3Compiler

from teragent.core.compilers.minimax_m3 import MiniMaxM3Compiler

compiler = MiniMaxM3Compiler()

Features:

MSA full-text injection mode (1/20 compute cost at 1M context)
Multimodal content encoding (image_url, video_url content blocks)
Desktop context conversion (DesktopContext → M3 desktop operation instructions)
Video processing hints (automatically added when using MiniMaxNativeAdapter)

Key extra dict keys: minimax_video_mode, minimax_frame_sampling
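
Encoding multimodal items as image_url and video_url content blocks might look like the sketch below. The helper to_content_blocks is hypothetical, and the nested {"url": ...} shape follows the widely used OpenAI-style chat format, which is an assumption about what M3 expects.

```python
from typing import Dict, List

SUPPORTED_TYPES = ("image_url", "video_url")


def to_content_blocks(text: str, items: List[Dict[str, str]]) -> List[dict]:
    """Combine a text prompt and media items into one list of content blocks."""
    blocks: List[dict] = [{"type": "text", "text": text}]
    for item in items:
        kind = item["type"]
        if kind not in SUPPORTED_TYPES:
            raise ValueError(f"unsupported multimodal type: {kind!r}")
        # Assumed shape: the payload is nested under a key named after the type.
        blocks.append({"type": kind, kind: {"url": item["url"]}})
    return blocks


blocks = to_content_blocks(
    "Describe what happens in this clip.",
    [{"type": "video_url", "url": "https://example.com/clip.mp4"}],
)
```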

Desktop Tool (`teragent.tools.desktop`)

DesktopTool

from teragent.tools.desktop import DesktopTool, DesktopSafetyConfig

safety = DesktopSafetyConfig(
    safe_zones=[],               # Forbidden click zones (x1, y1, x2, y2)
    min_interval=0.5,            # Min seconds between operations
    max_consecutive_ops=50,      # Max consecutive operations
    screenshot_quality=75,       # JPEG quality (1-100)
    screenshot_format="jpeg",    # "jpeg" or "png"
)

tool = DesktopTool(safety_config=safety)

Supported actions (7 total):

| Action | Description | Required Parameters |
|---|---|---|
| `screenshot` | Capture screen screenshot | None |
| `click` | Click at coordinates | `x`, `y`, `button` (optional) |
| `type_text` | Type text string | `text` |
| `scroll` | Scroll in direction | `direction`, `scroll_amount` |
| `hotkey` | Press keyboard shortcut | `keys` (comma-separated, e.g., `"ctrl,c"`) |
| `move_mouse` | Move mouse to coordinates | `x`, `y` |
| `drag` | Drag from start to end | `x`, `y`, `end_x`, `end_y` |

Safety features (5 layers):

Permission level — All operations require DESTRUCTIVE-level confirmation
Safe zones — Configurable forbidden click areas
Rate limiting — Minimum interval between operations
Consecutive ops cap — Max consecutive operations before forced pause
Blocked shortcuts — Dangerous key combinations (Alt+F4, Ctrl+Alt+Del, etc.) are blocked
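
Three of these layers (safe zones, rate limiting, blocked shortcuts) are simple enough to sketch as pre-flight checks. The code below is an illustration only: check_action and its helpers are hypothetical, the block list contains just the two shortcuts named above, and the real DesktopTool may order or implement its checks differently.

```python
from typing import Iterable, Optional, Tuple

Zone = Tuple[int, int, int, int]  # (x1, y1, x2, y2)

# Only the two examples named in the list above; the real list is longer.
BLOCKED_HOTKEYS = {frozenset({"alt", "f4"}), frozenset({"ctrl", "alt", "del"})}


def in_forbidden_zone(x: int, y: int, zones: Iterable[Zone]) -> bool:
    """True if (x, y) lies inside any forbidden rectangle (edges included)."""
    return any(x1 <= x <= x2 and y1 <= y <= y2 for x1, y1, x2, y2 in zones)


def is_blocked_hotkey(keys: str) -> bool:
    """`keys` is comma-separated, e.g. "alt,f4"; order and case are ignored."""
    combo = frozenset(k.strip().lower() for k in keys.split(",") if k.strip())
    return combo in BLOCKED_HOTKEYS


def check_action(
    action: dict,
    zones: Iterable[Zone],
    last_op_time: float,
    now: float,
    min_interval: float = 0.5,
) -> Optional[str]:
    """Return None if the action may proceed, otherwise a short reason."""
    if now - last_op_time < min_interval:
        return "rate_limited"
    if action["action"] == "click" and in_forbidden_zone(action["x"], action["y"], zones):
        return "forbidden_zone"
    if action["action"] == "hotkey" and is_blocked_hotkey(action["keys"]):
        return "blocked_hotkey"
    return None


verdict = check_action(
    {"action": "click", "x": 100, "y": 200},
    zones=[(0, 0, 150, 250)],
    last_op_time=0.0,
    now=1.0,
)
```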

Properties:

simulation_mode -> bool — Whether in simulation mode (pyautogui unavailable)
safety_config -> DesktopSafetyConfig — Current safety configuration

Example:

result = await tool.execute({
    "action": "screenshot",
})
# Returns ToolResult with base64-encoded screenshot in data

result = await tool.execute({
    "action": "click",
    "x": 100,
    "y": 200,
    "button": "left",
})

Event Bus (`teragent.event_bus`)

EventBus

from teragent import EventBus

bus = EventBus()

# Subscribe
bus.on("agent_done", lambda **kw: print("Done!"))

# Subscribe once (handler is any callable that accepts the emitted keyword arguments)
bus.once("agent_done", handler)

# Emit (fire-and-forget)
await bus.emit("agent_done", total_steps=10)

# Emit and wait
await bus.emit_and_wait("agent_done", total_steps=10)

# Wait for event
args, kwargs = await bus.wait_for("agent_done", timeout=30)

# Query
names = bus.get_event_names()
history = bus.get_event_history(limit=50)
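
The difference between on() and once() can be seen in a stripped-down bus. The class SketchEventBus below is a hypothetical illustration built on asyncio, not the EventBus implementation. It covers only on, once, and an awaited emit, and omits wait_for and history.

```python
import asyncio
from collections import defaultdict
from typing import Callable, DefaultDict, List, Tuple


class SketchEventBus:
    """Tiny event bus: persistent and one-shot handlers, sync or async."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Tuple[Callable, bool]]] = defaultdict(list)

    def on(self, name: str, handler: Callable, *, once: bool = False) -> None:
        self._handlers[name].append((handler, once))

    def once(self, name: str, handler: Callable) -> None:
        self.on(name, handler, once=True)

    async def emit_and_wait(self, name: str, **kwargs) -> None:
        handlers = list(self._handlers[name])
        # Drop one-shot handlers before calling them so they fire exactly once.
        self._handlers[name] = [(h, one) for h, one in handlers if not one]
        for handler, _ in handlers:
            result = handler(**kwargs)
            if asyncio.iscoroutine(result):
                await result  # Async handlers are awaited in registration order.


calls = []
bus = SketchEventBus()
bus.on("agent_done", lambda **kw: calls.append(("on", kw["total_steps"])))
bus.once("agent_done", lambda **kw: calls.append(("once", kw["total_steps"])))

asyncio.run(bus.emit_and_wait("agent_done", total_steps=10))
asyncio.run(bus.emit_and_wait("agent_done", total_steps=11))
```

After the two emits, the persistent handler has run twice and the one-shot handler once.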

Rate Limiting (`teragent.core.rate_limiter`)

| Class / Function | Description |
|---|---|
| `TokenBucketRateLimiter` | Token bucket rate limiter with `acquire()` and `wait_and_acquire()` |
| `SlidingWindowRateLimiter` | Sliding window rate limiter with timestamp tracking |
| `AdaptiveRateLimiter` | Adaptive rate limiter that learns from `X-RateLimit-*` headers; auto-backs off on 429 |
| `RateLimitConfig` | Configuration dataclass (strategy, max_tokens, refill_rate, window, safety_factor) |
| `RateLimitStrategy` | Enum: `TOKEN_BUCKET`, `SLIDING_WINDOW`, `ADAPTIVE` |
| `RateLimitStatus` | Current status with `is_limited` and `wait_seconds` properties |
| `RateLimiter` | Type alias: `TokenBucketRateLimiter \| SlidingWindowRateLimiter \| AdaptiveRateLimiter` |
| `create_rate_limiter(config=None)` | Factory function: returns appropriate limiter based on config |
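
For readers unfamiliar with the token bucket strategy, here is a minimal sketch of the algorithm. SketchTokenBucket is a hypothetical class that reuses the max_tokens and refill_rate names from RateLimitConfig. It is synchronous and single-threaded, and it is not the TokenBucketRateLimiter implementation.

```python
import time
from typing import Callable


class SketchTokenBucket:
    """Bucket holds up to max_tokens and refills at refill_rate tokens per second."""

    def __init__(
        self,
        max_tokens: float,
        refill_rate: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_tokens = float(max_tokens)
        self.refill_rate = float(refill_rate)
        self._clock = clock
        self._tokens = float(max_tokens)  # Start full.
        self._last = clock()

    def _refill(self) -> None:
        now = self._clock()
        earned = (now - self._last) * self.refill_rate
        self._tokens = min(self.max_tokens, self._tokens + earned)
        self._last = now

    def acquire(self, cost: float = 1.0) -> bool:
        """Take `cost` tokens if available; return whether the call may proceed."""
        self._refill()
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False

    def wait_seconds(self, cost: float = 1.0) -> float:
        """Seconds until `cost` tokens will be available (0.0 if available now)."""
        self._refill()
        return max(0.0, (cost - self._tokens) / self.refill_rate)
```

A bucket allows short bursts up to max_tokens while holding the long-run rate to refill_rate, which is why it suits APIs that tolerate bursts.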
