Recursive Problem Decomposition Engine
A multi-agent system that breaks complex problems into solvable dimensions, validates solutions through adversarial critique, and synthesizes computed answers, all in real time.
ProblemSolver.ai is a Centralized Processing Engine that applies structured, recursive problem decomposition to any complex question. Instead of asking an LLM to answer in one shot, it orchestrates a pipeline of specialized agents, each with a distinct cognitive role, that collaborate to produce validated, high-confidence solutions.
The system decomposes problems into dimensions (independent facets), phases (ordered execution steps), and tasks (atomic work items), then solves against those structures and validates the result through adversarial review.
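Concretely, the decomposition bottoms out in plain data structures. Below is a minimal sketch of what dimensions, phases, and tasks might look like as Pydantic models (the field names are illustrative, not the project's actual schema in `backend/models/`):

```python
from pydantic import BaseModel, Field


class Task(BaseModel):
    """Atomic work item: the smallest unit the Solver executes."""
    id: str
    description: str
    depends_on: list[str] = Field(default_factory=list)


class Phase(BaseModel):
    """Ordered execution step within a dimension."""
    name: str
    order: int
    tasks: list[Task] = Field(default_factory=list)


class Dimension(BaseModel):
    """Independent facet of the problem; may be decomposed recursively."""
    name: str
    phases: list[Phase] = Field(default_factory=list)
    sub_dimensions: list["Dimension"] = Field(default_factory=list)
    depth: int = 0  # bounded by MAX_DECOMPOSITION_DEPTH
```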
The orchestrator coordinates five agents through a LangGraph StateGraph with revision loops and quality gates:
```
┌──────────────────────────────────────────────┐
│                 ORCHESTRATOR                 │
│            (Top-Level StateGraph)            │
└──────────────────────────────────────────────┘
                       │
                       ▼
┌──────────────────────────────────────────────┐
│                   THINKER                    │
│ Analyzes the problem, identifies ambiguity,  │
│ asks clarifying questions or makes           │
│ assumptions in autonomous mode.              │
│                                              │
│ analyze_input → route_by_mode →              │
│ generate_questions | make_assumptions →      │
│ refine_understanding                         │
└──────────────────────────────────────────────┘
                       │
                       ▼
┌──────────────────────────────────────────────┐
│                   PLANNER                    │
│ Decomposes the problem into dimensions,      │
│ phases, and tasks. Evaluates complexity      │
│ and recursively decomposes if needed.        │
│                                              │
│ identify_dimensions → evaluate_complexity →  │
│ decompose_deeper (loop) | finalize_plan      │
└──────────────────────────────────────────────┘
                       │
                       ▼
┌──────────────────────────────────────────────┐
│                    CRITIC                    │
│ Evaluates the plan for completeness,         │
│ feasibility, and quality. Scores across      │
│ dimensions and decides: accept, revise,      │
│ or reject.                                   │
│                                              │
│ ←──── revise (back to PLANNER)               │
│ evaluate_solution → score_dimensions →       │
│ render_decision                              │
└──────────────────────────────────────────────┘
                       │ accept
                       ▼
┌──────────────────────────────────────────────┐
│                    SOLVER                    │
│ Executes the validated plan to produce       │
│ a computed answer. Synthesizes concrete      │
│ results from the decomposed structure.       │
└──────────────────────────────────────────────┘
                       │
                       ▼
┌──────────────────────────────────────────────┐
│                    JUDGE                     │
│ Final validation against original problem    │
│ constraints. Checks dimension coverage and   │
│ constraint satisfaction. Issues verdict      │
│ with confidence score.                       │
│                                              │
│ ←──── retry (re-runs plan → solve)           │
│ validate_completeness → render_verdict       │
└──────────────────────────────────────────────┘
                       │ accept
                       ▼
                   FINALIZE
```
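In LangGraph terms, the top-level flow reduces to a `StateGraph` with conditional edges at the two decision points. Here is a simplified sketch; the node functions and state fields are placeholders, not the repository's actual implementations:

```python
from typing import TypedDict
from langgraph.graph import StateGraph, START, END


class PipelineState(TypedDict):
    problem: str
    plan: dict
    answer: str
    critic_decision: str   # "accept" | "revise"
    judge_verdict: str     # "accept" | "retry"


# Stub nodes: each returns a partial state update, as LangGraph expects.
def thinker(state: PipelineState) -> dict:
    return {}  # would refine understanding / record assumptions

def planner(state: PipelineState) -> dict:
    return {"plan": {"dimensions": []}}

def critic(state: PipelineState) -> dict:
    return {"critic_decision": "accept"}

def solver(state: PipelineState) -> dict:
    return {"answer": "..."}

def judge(state: PipelineState) -> dict:
    return {"judge_verdict": "accept"}


graph = StateGraph(PipelineState)
for name, fn in [("thinker", thinker), ("planner", planner),
                 ("critic", critic), ("solver", solver), ("judge", judge)]:
    graph.add_node(name, fn)

graph.add_edge(START, "thinker")
graph.add_edge("thinker", "planner")
graph.add_edge("planner", "critic")
# Critic loop: revision feedback flows back to the Planner
graph.add_conditional_edges(
    "critic", lambda s: s["critic_decision"],
    {"accept": "solver", "revise": "planner"})
graph.add_edge("solver", "judge")
# Judge loop: a low-confidence verdict re-runs the plan-solve cycle
graph.add_conditional_edges(
    "judge", lambda s: s["judge_verdict"],
    {"accept": END, "retry": "planner"})

app = graph.compile()
```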
The pipeline has two feedback loops that enforce quality:
- Critic Loop: If the Critic rejects or requests revision, the plan is sent back to the Planner with specific feedback. A degradation guard prevents plan collapse across revision passes. Capped at `MAX_CRITIC_ITERATIONS` (default 3).
- Judge Loop: If the Judge's confidence falls below the user-set threshold, or if the verdict is "reject", the entire plan-solve cycle is retried with accumulated feedback. Budget protection prevents runaway iterations.
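Both guards can be pictured as routing functions that consult counters carried in the graph state. A hedged sketch follows; the state keys `critic_iterations`, `judge_retries`, and `retry_budget` are invented for illustration:

```python
MAX_CRITIC_ITERATIONS = 3   # configurable via environment
CONFIDENCE_THRESHOLD = 0.8  # user-adjustable in the dashboard


def route_after_critic(state: dict) -> str:
    """Send the plan back to the Planner unless the revision budget is spent."""
    if (state["critic_decision"] == "revise"
            and state["critic_iterations"] < MAX_CRITIC_ITERATIONS):
        return "planner"
    return "solver"  # accepted, or force-forwarded once the cap is hit


def route_after_judge(state: dict) -> str:
    """Retry the plan-solve cycle on low confidence, within budget."""
    low_confidence = state["confidence"] < CONFIDENCE_THRESHOLD
    rejected = state["judge_verdict"] == "reject"
    if (low_confidence or rejected) and state["judge_retries"] < state["retry_budget"]:
        return "planner"
    return "finalize"
```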
Each agent runs its own internal LangGraph. The orchestrator streams internal node completions through an `asyncio.Queue` side-channel, which the WebSocket layer drains concurrently. The frontend renders live SVG mini-graphs showing exactly which internal node each agent is executing: the Thinker's `analyze_input`, the Planner's `decompose_deeper`, and so on.
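The side-channel pattern looks roughly like this: sub-graph nodes put events on a shared `asyncio.Queue`, and the WebSocket endpoint drains it in parallel with the running pipeline. A minimal sketch assuming FastAPI, with an invented event shape and endpoint path:

```python
import asyncio
from fastapi import FastAPI, WebSocket

app = FastAPI()
events: asyncio.Queue[dict] = asyncio.Queue()


async def emit_subnode_event(agent: str, node: str) -> None:
    """Called by agent sub-graph nodes as each internal node completes."""
    await events.put({"agent": agent, "node": node, "status": "completed"})


async def run_pipeline() -> None:
    """Placeholder for invoking the orchestrator graph."""
    await emit_subnode_event("thinker", "analyze_input")  # ...and so on


@app.websocket("/ws/pipeline")
async def pipeline_ws(ws: WebSocket) -> None:
    await ws.accept()
    pipeline = asyncio.create_task(run_pipeline())
    try:
        while True:
            try:
                event = await asyncio.wait_for(events.get(), timeout=0.5)
            except asyncio.TimeoutError:
                if pipeline.done():
                    break  # pipeline finished and the queue is drained
                continue
            await ws.send_json(event)  # frontend lights up the SVG mini-graph
    finally:
        await ws.close()
```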
| Layer | Technology |
|---|---|
| Agent Framework | LangGraph 0.3+ (StateGraph with conditional edges and revision loops) |
| LLM Providers | Anthropic Claude, OpenAI GPT (configurable per-agent) |
| Backend API | FastAPI 0.115+ with async/await throughout |
| Real-Time | WebSocket streaming with per-session event multiplexing |
| Database | PostgreSQL 16 with async SQLAlchemy, Alembic migrations |
| Frontend | Next.js 15, React 19, TypeScript 5.7, Tailwind CSS |
| Agent Toolkit | 18 tools across 5 categories (math, analysis, research, document, system) |
| Package Manager | uv (Python), npm (Node.js) |
The Solver has access to a registry of executable tools that are injected into agent system prompts and tracked with per-tool metrics:
| Category | Tools | Libraries |
|---|---|---|
| Math | Calculator, Symbolic Math, Unit Converter, Matrix Operations | sympy, numpy, scipy, pint |
| Analysis | Data Profiler, Statistical Analysis, Pattern Detector, Aggregation Engine | pandas, scikit-learn |
| Research | Tavily Web Search, Fact Checker, Domain Knowledge | tavily-python |
| Document | Markdown Generator, Schema Builder, Data Formatter, Code Generator | jinja2, jsonschema |
| System | Resource Estimator, Quality Scorer, Risk Analyzer | (none) |
Each tool follows a common `BaseTool` interface with JSON Schema input/output validation, execution timing, success/failure tracking, and per-agent usage metrics.
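The shape of that interface is straightforward to sketch. A hedged approximation follows; the real `BaseTool` in `backend/tools/base.py` will differ in its details:

```python
import time
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any

from jsonschema import validate  # JSON Schema input validation


@dataclass
class ToolMetrics:
    calls: int = 0
    failures: int = 0
    total_seconds: float = 0.0
    by_agent: dict[str, int] = field(default_factory=dict)


@dataclass
class ToolResult:
    ok: bool
    output: Any = None
    error: str | None = None
    elapsed: float = 0.0


class BaseTool(ABC):
    name: str
    input_schema: dict  # JSON Schema describing the tool's arguments

    def __init__(self) -> None:
        self.metrics = ToolMetrics()

    @abstractmethod
    def run(self, **kwargs: Any) -> Any: ...

    def execute(self, agent: str, **kwargs: Any) -> ToolResult:
        """Validate input, time the call, and record per-agent usage."""
        validate(instance=kwargs, schema=self.input_schema)
        self.metrics.calls += 1
        self.metrics.by_agent[agent] = self.metrics.by_agent.get(agent, 0) + 1
        start = time.perf_counter()
        try:
            output = self.run(**kwargs)
            return ToolResult(ok=True, output=output,
                              elapsed=time.perf_counter() - start)
        except Exception as exc:
            self.metrics.failures += 1
            return ToolResult(ok=False, error=str(exc),
                              elapsed=time.perf_counter() - start)
        finally:
            self.metrics.total_seconds += time.perf_counter() - start
```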
The dashboard provides a real-time view of the entire pipeline:
- Problem Input: title, description, and mode selection (interactive vs. autonomous)
- Pipeline Topology: SVG graph showing agent flow with live status indicators
- Confidence Gauge: radial gauge with adjustable threshold slider
- LLM Token Counter: per-agent token usage breakdown
- Agent Panels: Thinker (assumptions/constraints), Planner (dimensions/phases), Critic (scores/iterations), Solver (markdown-rendered answers), Judge (verdict/confidence)
- Sub-Graph Explorer: expandable SVG mini-graphs of each agent's internal LangGraph nodes
- Agent Timeline: chronological event feed with sub-node activity
- Clarification Modal: interactive Q&A when the Thinker needs user input
- Dark/Light Mode: full theme support with CSS custom properties
```
ProblemSolver.ai/
├── backend/
│   ├── agents/
│   │   ├── base/             # BaseAgent, AgentRegistry, JSON parser
│   │   ├── orchestrator/     # Top-level StateGraph (9 nodes, 3 conditional edges)
│   │   ├── thinker/          # Problem understanding (4 nodes)
│   │   ├── planner/          # Recursive decomposition (4 nodes, loop)
│   │   ├── critic/           # Solution evaluation (3 nodes)
│   │   └── judge/            # Final validation (2 nodes)
│   ├── api/
│   │   ├── main.py           # FastAPI app factory with lifespan
│   │   ├── deps.py           # Dependency injection
│   │   └── routes/           # REST + WebSocket endpoints
│   ├── core/
│   │   ├── config.py         # Pydantic settings
│   │   ├── llm_provider.py   # Anthropic + OpenAI provider abstraction
│   │   ├── prompt_engine.py  # Agent system prompt compilation
│   │   └── subnode_events.py # Async event queue for sub-graph streaming
│   ├── db/
│   │   ├── database.py       # Async SQLAlchemy engine
│   │   ├── models.py         # Session, Problem, Solution tables
│   │   └── repositories/     # Async CRUD repositories
│   ├── models/               # Pydantic domain models
│   ├── services/
│   │   └── toolkit.py        # Tool execution service with timeouts
│   └── tools/
│       ├── base.py           # BaseTool, ToolResult, ToolMetrics
│       ├── registry.py       # ToolRegistry singleton
│       ├── errors.py         # Tool-specific exceptions
│       ├── math/             # Calculator, symbolic, units, matrix
│       ├── analysis/         # Profiler, statistics, patterns, aggregation
│       ├── research/         # Tavily search, fact checker, domain knowledge
│       ├── document/         # Markdown, schema, formatter, code gen
│       └── system/           # Estimator, quality scorer, risk analyzer
├── frontend/
│   ├── src/
│   │   ├── app/              # Next.js app router (layout, page, providers)
│   │   ├── components/
│   │   │   ├── agents/       # ThinkerPanel, PlannerPanel, CriticPanel, SolverPanel, JudgePanel
│   │   │   ├── layout/       # Navbar, ThemeToggle
│   │   │   ├── pipeline/     # PipelineGraph, SubGraphExplorer, AgentTimeline, etc.
│   │   │   └── shared/       # StatusBadge, ProgressBar, ClarificationModal
│   │   ├── hooks/            # usePipeline (state + reducer), useWebSocket
│   │   └── lib/              # Types, constants, sub-graph topology
│   └── package.json
├── tests/                    # 262 tests across agents, tools, models, core, API
├── alembic/                  # Database migrations
├── docker/                   # Docker Compose (PostgreSQL 16)
├── scripts/                  # start.sh, stop.sh, restart.sh, push.sh
├── images/                   # Assets
└── pyproject.toml            # Project config (uv / hatch)
```
- Python 3.12+ with uv package manager
- Node.js 20+ with npm
- PostgreSQL 16 (via Docker or local install)
- Anthropic API key (or OpenAI; configurable)
```bash
git clone https://github.com/iotlodge/problemsolver.ai.git
cd problemsolver.ai/ProblemSolver.ai
cp .env.example .env
# Edit .env: add your ANTHROPIC_API_KEY at minimum

# Start PostgreSQL
docker compose -f docker/docker-compose.yml up -d

# Backend
uv sync
uv run alembic upgrade head
uv run uvicorn backend.api.main:app --reload --host 0.0.0.0 --port 8000

# Frontend (in a separate terminal)
cd frontend
npm install
npm run dev
```

Open http://localhost:3000; the dashboard connects to the backend via WebSocket automatically.
Once running, interactive API docs are available at http://localhost:8000/api/docs (Swagger) and http://localhost:8000/api/redoc (ReDoc).
All configuration is through environment variables (.env file):
| Variable | Default | Description |
|---|---|---|
| `ANTHROPIC_API_KEY` | (none) | Anthropic API key (required for Claude) |
| `OPENAI_API_KEY` | (none) | OpenAI API key (optional, for GPT models) |
| `DEFAULT_LLM_PROVIDER` | `anthropic` | Which LLM provider to use |
| `DEFAULT_LLM_MODEL` | `claude-sonnet-4-5-20250929` | Model identifier |
| `DATABASE_URL` | `postgresql+asyncpg://...` | PostgreSQL connection string |
| `MAX_DECOMPOSITION_DEPTH` | `3` | Max recursive decomposition depth |
| `MAX_CRITIC_ITERATIONS` | `3` | Max plan revision cycles |
| `THINKER_DEFAULT_MODE` | `interactive` | `interactive` (asks questions) or `autonomous` (makes assumptions) |
| `NEXT_PUBLIC_API_URL` | `http://localhost:8000` | Backend URL for frontend proxy |
| `NEXT_PUBLIC_WS_URL` | `ws://localhost:8000` | WebSocket URL |
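Since `backend/core/config.py` uses Pydantic settings, loading these variables plausibly reduces to something like the following sketch (field list abbreviated; defaults taken from the table above, and the `pydantic-settings` package is assumed):

```python
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    """Typed view of the .env file; missing required fields fail fast at startup."""
    model_config = SettingsConfigDict(env_file=".env")

    anthropic_api_key: str            # required
    openai_api_key: str | None = None
    default_llm_provider: str = "anthropic"
    default_llm_model: str = "claude-sonnet-4-5-20250929"
    database_url: str                 # full asyncpg connection string
    max_decomposition_depth: int = 3
    max_critic_iterations: int = 3
    thinker_default_mode: str = "interactive"


settings = Settings()
```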
```bash
# Full suite (262 tests)
uv run pytest tests/ -v

# With coverage
uv run pytest tests/ --cov=backend --cov-report=term-missing

# Specific category
uv run pytest tests/test_agents/ -v   # Agent pipeline tests
uv run pytest tests/test_tools/ -v    # Tool implementation tests
uv run pytest tests/test_models/ -v   # Domain model tests
uv run pytest tests/test_core/ -v     # Prompt engine tests
uv run pytest tests/test_api/ -v      # WebSocket tests
```

Input: "Optimal Meeting Schedule: find availability windows during a single 8-hour workday for 5 people with overlapping constraints"
Pipeline Execution:

1. Thinker analyzes the problem, identifies ambiguities (time zones? priority ordering? lunch breaks?), and either asks clarifying questions or makes assumptions.
2. Planner decomposes into dimensions: Temporal Constraints, Participant Availability, Room/Resource Allocation, Priority Optimization. Each dimension gets phases and atomic tasks.
3. Critic evaluates the plan, scoring completeness, feasibility, and specificity. If the plan is too vague or misses edge cases, it sends revision feedback back to the Planner.
4. Solver executes the validated plan, computing concrete time windows, conflict resolutions, and a recommended schedule.
5. Judge validates the answer against the original constraints: did it actually address all 5 people? Are the time windows valid? Does it respect the 8-hour boundary? It then issues a verdict with a confidence score.
The entire flow streams to the frontend in real time, with sub-node activity visible in the Sub-Graph Explorer.
Apache License 2.0; see LICENSE for details.
Built by @iotlodge
