ProblemSolver.ai

Recursive Problem Decomposition Engine
A multi-agent system that breaks complex problems into solvable dimensions, validates solutions through adversarial critique, and synthesizes computed answers — all in real time.

What is ProblemSolver.ai?

ProblemSolver.ai is a Centralized Processing Engine that applies structured, recursive problem decomposition to any complex question. Instead of asking an LLM to answer in one shot, it orchestrates a pipeline of specialized agents — each with a distinct cognitive role — that collaborate to produce validated, high-confidence solutions.

The system decomposes problems into dimensions (independent facets), phases (ordered execution steps), and tasks (atomic work items), then solves against those structures and validates the result through adversarial review.
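
To make the three levels concrete, here is a minimal sketch of how dimensions, phases, and tasks could nest. The class and field names (`Dimension`, `Phase`, `Task`, `order`, `ordered_tasks`) are assumptions for illustration and are not taken from the project's actual Pydantic models.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Atomic work item (hypothetical shape)."""
    description: str


@dataclass
class Phase:
    """Ordered execution step that groups tasks."""
    order: int
    name: str
    tasks: list[Task] = field(default_factory=list)


@dataclass
class Dimension:
    """Independent facet of the problem."""
    name: str
    phases: list[Phase] = field(default_factory=list)

    def ordered_tasks(self) -> list[Task]:
        """Flatten tasks, respecting phase order rather than insertion order."""
        ordered = sorted(self.phases, key=lambda phase: phase.order)
        return [task for phase in ordered for task in phase.tasks]


# Phases are deliberately listed out of order to show the sort.
temporal = Dimension(
    name="Temporal Constraints",
    phases=[
        Phase(order=2, name="resolve", tasks=[Task("intersect availability windows")]),
        Phase(order=1, name="collect", tasks=[Task("list busy intervals per person")]),
    ],
)
task_names = [task.description for task in temporal.ordered_tasks()]
print(task_names)
```

Dimensions are independent of each other, so ordering only matters inside a dimension; that is why the sort lives on `Dimension` rather than on a global plan object in this sketch.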

Architecture

Agent Pipeline

The orchestrator coordinates five agents through a LangGraph StateGraph with revision loops and quality gates:

                    ┌──────────────────────────────────────────────┐
                    │              ORCHESTRATOR                     │
                    │         (Top-Level StateGraph)                │
                    └──────────────────────────────────────────────┘
                                       │
                                       ▼
                    ┌──────────────────────────────────────────────┐
                    │  THINKER                                      │
                    │  Analyzes the problem, identifies ambiguity,  │
                    │  asks clarifying questions or makes            │
                    │  assumptions in autonomous mode.               │
                    │                                                │
                    │  analyze_input → route_by_mode →               │
                    │    generate_questions | make_assumptions →      │
                    │    refine_understanding                        │
                    └──────────────────────────────────────────────┘
                                       │
                                       ▼
                    ┌──────────────────────────────────────────────┐
                    │  PLANNER                                      │
                    │  Decomposes the problem into dimensions,      │
                    │  phases, and tasks. Evaluates complexity      │
                    │  and recursively decomposes if needed.        │
                    │                                                │
                    │  identify_dimensions → evaluate_complexity →   │
                    │    decompose_deeper (loop) | finalize_plan    │
                    └──────────────────────────────────────────────┘
                                       │
                                       ▼
                    ┌──────────────────────────────────────────────┐
                    │  CRITIC                                       │
                    │  Evaluates the plan for completeness,        │
                    │  feasibility, and quality. Scores across     │
                    │  dimensions and decides: accept, revise,     │
                    │  or reject.                                   │
                    │                                  ◄──── revise │
                    │  evaluate_solution → score_dimensions →       │
                    │    render_decision                             │
                    └──────────────────────────────────────────────┘
                                       │ accept
                                       ▼
                    ┌──────────────────────────────────────────────┐
                    │  SOLVER                                       │
                    │  Executes the validated plan to produce      │
                    │  a computed answer. Synthesizes concrete     │
                    │  results from the decomposed structure.      │
                    └──────────────────────────────────────────────┘
                                       │
                                       ▼
                    ┌──────────────────────────────────────────────┐
                    │  JUDGE                                        │
                    │  Final validation against original problem   │
                    │  constraints. Checks dimension coverage and  │
                    │  constraint satisfaction. Issues verdict     │
                    │  with confidence score.                       │
                    │                                  ◄──── retry  │
                    │  validate_completeness → render_verdict       │
                    └──────────────────────────────────────────────┘
                                       │ accept
                                       ▼
                                   FINALIZE

Revision Loops

The pipeline has two feedback loops that enforce quality:

Critic Loop — If the Critic rejects or requests revision, the plan is sent back to the Planner with specific feedback. A degradation guard prevents plan collapse across revision passes. Capped at MAX_CRITIC_ITERATIONS (default 3).
Judge Loop — If the Judge's confidence falls below the user-set threshold, or if the verdict is "reject", the entire plan-solve cycle is retried with accumulated feedback. Budget protection prevents runaway iterations.
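
The control flow of the two loops can be sketched in plain Python, without LangGraph. This is an illustrative approximation only: the function names, the 0.8 threshold, and the `max_judge_retries` budget are assumptions, the degradation guard is omitted, and the sketch simply proceeds with the last plan once the Critic cap is reached.

```python
from typing import Callable

MAX_CRITIC_ITERATIONS = 3  # default from the Configuration table


def run_pipeline(
    plan_fn: Callable[[list[str]], str],
    critic_fn: Callable[[str], tuple[str, str]],
    solve_fn: Callable[[str], str],
    judge_fn: Callable[[str], tuple[str, float, str]],
    confidence_threshold: float = 0.8,  # assumed value; user-set in the real system
    max_judge_retries: int = 2,         # assumed budget, not from the source
) -> tuple[str, float]:
    feedback: list[str] = []
    answer, confidence = "", 0.0
    for _attempt in range(max_judge_retries + 1):       # Judge loop
        plan = plan_fn(feedback)
        for _revision in range(MAX_CRITIC_ITERATIONS):  # Critic loop
            decision, note = critic_fn(plan)
            if decision == "accept":
                break
            feedback.append(note)                       # send feedback to the Planner
            plan = plan_fn(feedback)
        answer = solve_fn(plan)
        verdict, confidence, note = judge_fn(answer)
        if verdict == "accept" and confidence >= confidence_threshold:
            break
        feedback.append(note)                           # retry with accumulated feedback
    return answer, confidence


# Stub agents: the Critic rejects the first plan and accepts the revision.
result = run_pipeline(
    plan_fn=lambda fb: f"plan v{len(fb) + 1}",
    critic_fn=lambda plan: ("revise", "add edge cases") if plan == "plan v1" else ("accept", ""),
    solve_fn=lambda plan: f"answer from {plan}",
    judge_fn=lambda answer: ("accept", 0.9, ""),
)
print(result)
```

With these stubs the Planner is called twice, the Critic accepts the second plan, and the Judge accepts on the first attempt.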

Real-Time Sub-Graph Visualization

Each agent runs its own internal LangGraph. The orchestrator streams internal node completions through an asyncio.Queue side-channel, which the WebSocket layer drains concurrently. The frontend renders live SVG mini-graphs showing exactly which internal node each agent is executing — Thinker's analyze_input, Planner's decompose_deeper, etc.
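
The side-channel pattern described above can be sketched with the standard library alone. In this hedged example, `run_agent` stands in for an agent sub-graph and `drain` stands in for the WebSocket layer; both names, the event dictionary shape, and the `None` sentinel are assumptions rather than the project's actual code.

```python
import asyncio


async def run_agent(name: str, nodes: list[str], events: asyncio.Queue) -> None:
    """Stand-in for an agent sub-graph: emits one event per completed node."""
    for node in nodes:
        await asyncio.sleep(0)  # placeholder for real node work
        await events.put({"agent": name, "node": node})


async def drain(events: asyncio.Queue, sent: list[dict]) -> None:
    """Stand-in for the WebSocket layer: forwards events until a None sentinel."""
    while (event := await events.get()) is not None:
        sent.append(event)  # a real handler would send this over the socket


async def main() -> list[dict]:
    events: asyncio.Queue = asyncio.Queue()
    sent: list[dict] = []
    consumer = asyncio.create_task(drain(events, sent))  # drains concurrently
    await run_agent("thinker", ["analyze_input", "route_by_mode"], events)
    await run_agent("planner", ["identify_dimensions"], events)
    await events.put(None)  # sentinel: pipeline finished
    await consumer
    return sent


streamed = asyncio.run(main())
print([event["node"] for event in streamed])
```

Because the queue is first-in, first-out, the consumer sees node completions in the order the agents produced them, which is what lets a frontend highlight the currently executing node.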

Tech Stack

Layer	Technology
Agent Framework	LangGraph 0.3+ (StateGraph with conditional edges and revision loops)
LLM Providers	Anthropic Claude, OpenAI GPT (configurable per-agent)
Backend API	FastAPI 0.115+ with async/await throughout
Real-Time	WebSocket streaming with per-session event multiplexing
Database	PostgreSQL 16 with async SQLAlchemy, Alembic migrations
Frontend	Next.js 15, React 19, TypeScript 5.7, Tailwind CSS
Agent Toolkit	15 tools across 5 categories (math, analysis, research, document, system)
Package Manager	uv (Python), npm (Node.js)

Agent Toolkit — 15 Tools

The Solver has access to a registry of executable tools that are injected into agent system prompts and tracked with per-tool metrics:

Category	Tools	Libraries
Math	Calculator, Symbolic Math, Unit Converter, Matrix Operations	sympy, numpy, scipy, pint
Analysis	Data Profiler, Statistical Analysis, Pattern Detector, Aggregation Engine	pandas, scikit-learn
Research	Tavily Web Search, Fact Checker, Domain Knowledge	tavily-python
Document	Markdown Generator, Schema Builder, Data Formatter, Code Generator	jinja2, jsonschema
System	Resource Estimator, Quality Scorer, Risk Analyzer	—

Each tool follows a common BaseTool interface with JSON Schema input/output validation, execution timing, success/failure tracking, and per-agent usage metrics.
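
A rough sketch of what such a common interface might look like follows. It reuses the names `BaseTool` and `ToolResult` from the project layout, but every field, method, and the `Adder` example tool are assumptions, and a simple required-keys check stands in for real JSON Schema validation.

```python
import time
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ToolResult:
    success: bool
    output: Any = None
    error: Optional[str] = None
    elapsed_ms: float = 0.0


class BaseTool(ABC):
    name: str = "base"
    required_inputs: tuple[str, ...] = ()  # simplified stand-in for a JSON Schema

    def __init__(self) -> None:
        self.calls = 0      # total executions
        self.failures = 0   # executions that raised or failed validation

    @abstractmethod
    def _run(self, **kwargs: Any) -> Any:
        """Tool-specific logic implemented by subclasses."""

    def execute(self, **kwargs: Any) -> ToolResult:
        """Validate inputs, run the tool, and record timing and outcome."""
        self.calls += 1
        start = time.perf_counter()
        try:
            missing = [key for key in self.required_inputs if key not in kwargs]
            if missing:
                raise ValueError(f"missing inputs: {missing}")
            result = ToolResult(success=True, output=self._run(**kwargs))
        except Exception as exc:
            self.failures += 1
            result = ToolResult(success=False, error=str(exc))
        result.elapsed_ms = (time.perf_counter() - start) * 1000
        return result


class Adder(BaseTool):
    """Hypothetical example tool, not one of the tools in the registry."""
    name = "adder"
    required_inputs = ("a", "b")

    def _run(self, a: float, b: float) -> float:
        return a + b


tool = Adder()
ok = tool.execute(a=2, b=3)
bad = tool.execute(a=2)
print(ok.output, bad.error, tool.calls, tool.failures)
```

Keeping validation, timing, and counters in `execute` means each concrete tool only implements `_run`, so metrics stay uniform across categories.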

Frontend

The dashboard provides a real-time view of the entire pipeline:

Problem Input — Title, description, and mode selection (interactive vs. autonomous)
Pipeline Topology — SVG graph showing agent flow with live status indicators
Confidence Gauge — Radial gauge with adjustable threshold slider
LLM Token Counter — Per-agent token usage breakdown
Agent Panels — Thinker (assumptions/constraints), Planner (dimensions/phases), Critic (scores/iterations), Solver (markdown-rendered answers), Judge (verdict/confidence)
Sub-Graph Explorer — Expandable SVG mini-graphs of each agent's internal LangGraph nodes
Agent Timeline — Chronological event feed with sub-node activity
Clarification Modal — Interactive Q&A when Thinker needs user input
Dark/Light Mode — Full theme support with CSS custom properties

Project Structure

ProblemSolver.ai/
├── backend/
│   ├── agents/
│   │   ├── base/             # BaseAgent, AgentRegistry, JSON parser
│   │   ├── orchestrator/     # Top-level StateGraph (9 nodes, 3 conditional edges)
│   │   ├── thinker/          # Problem understanding (4 nodes)
│   │   ├── planner/          # Recursive decomposition (4 nodes, loop)
│   │   ├── critic/           # Solution evaluation (3 nodes)
│   │   └── judge/            # Final validation (2 nodes)
│   ├── api/
│   │   ├── main.py           # FastAPI app factory with lifespan
│   │   ├── deps.py           # Dependency injection
│   │   └── routes/           # REST + WebSocket endpoints
│   ├── core/
│   │   ├── config.py         # Pydantic settings
│   │   ├── llm_provider.py   # Anthropic + OpenAI provider abstraction
│   │   ├── prompt_engine.py  # Agent system prompt compilation
│   │   └── subnode_events.py # Async event queue for sub-graph streaming
│   ├── db/
│   │   ├── database.py       # Async SQLAlchemy engine
│   │   ├── models.py         # Session, Problem, Solution tables
│   │   └── repositories/     # Async CRUD repositories
│   ├── models/               # Pydantic domain models
│   ├── services/
│   │   └── toolkit.py        # Tool execution service with timeouts
│   └── tools/
│       ├── base.py           # BaseTool, ToolResult, ToolMetrics
│       ├── registry.py       # ToolRegistry singleton
│       ├── errors.py         # Tool-specific exceptions
│       ├── math/             # Calculator, symbolic, units, matrix
│       ├── analysis/         # Profiler, statistics, patterns, aggregation
│       ├── research/         # Tavily search, fact checker, domain knowledge
│       ├── document/         # Markdown, schema, formatter, code gen
│       └── system/           # Estimator, quality scorer, risk analyzer
├── frontend/
│   ├── src/
│   │   ├── app/              # Next.js app router (layout, page, providers)
│   │   ├── components/
│   │   │   ├── agents/       # ThinkerPanel, PlannerPanel, CriticPanel, SolverPanel, JudgePanel
│   │   │   ├── layout/       # Navbar, ThemeToggle
│   │   │   ├── pipeline/     # PipelineGraph, SubGraphExplorer, AgentTimeline, etc.
│   │   │   └── shared/       # StatusBadge, ProgressBar, ClarificationModal
│   │   ├── hooks/            # usePipeline (state + reducer), useWebSocket
│   │   └── lib/              # Types, constants, sub-graph topology
│   └── package.json
├── tests/                    # 262 tests across agents, tools, models, core, API
├── alembic/                  # Database migrations
├── docker/                   # Docker Compose (PostgreSQL 16)
├── scripts/                  # start.sh, stop.sh, restart.sh, push.sh
├── images/                   # Assets
└── pyproject.toml            # Project config (uv / hatch)

Getting Started

Prerequisites

Python 3.12+ with uv package manager
Node.js 20+ with npm
PostgreSQL 16 (via Docker or local install)
Anthropic API key (or OpenAI — configurable)

1. Clone and configure

git clone https://github.com/iotlodge/problemsolver.ai.git
cd problemsolver.ai/ProblemSolver.ai
cp .env.example .env
# Edit .env — add your ANTHROPIC_API_KEY at minimum

2. Start PostgreSQL

docker compose -f docker/docker-compose.yml up -d

3. Install dependencies and migrate

uv sync
uv run alembic upgrade head

4. Start the backend

uv run uvicorn backend.api.main:app --reload --host 0.0.0.0 --port 8000

5. Start the frontend

cd frontend
npm install
npm run dev

Open http://localhost:3000 — the dashboard connects to the backend via WebSocket automatically.

API Documentation

Once running, interactive API docs are available at http://localhost:8000/api/docs (Swagger) and http://localhost:8000/api/redoc (ReDoc).

Configuration

All configuration is through environment variables (.env file):

Variable	Default	Description
`ANTHROPIC_API_KEY`	—	Anthropic API key (required for Claude)
`OPENAI_API_KEY`	—	OpenAI API key (optional, for GPT models)
`DEFAULT_LLM_PROVIDER`	`anthropic`	Which LLM provider to use
`DEFAULT_LLM_MODEL`	`claude-sonnet-4-5-20250929`	Model identifier
`DATABASE_URL`	`postgresql+asyncpg://...`	PostgreSQL connection string
`MAX_DECOMPOSITION_DEPTH`	`3`	Max recursive decomposition depth
`MAX_CRITIC_ITERATIONS`	`3`	Max plan revision cycles
`THINKER_DEFAULT_MODE`	`interactive`	`interactive` (asks questions) or `autonomous` (makes assumptions)
`NEXT_PUBLIC_API_URL`	`http://localhost:8000`	Backend URL for frontend proxy
`NEXT_PUBLIC_WS_URL`	`ws://localhost:8000`	WebSocket URL
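
As an illustration of how these variables and their defaults fit together, here is a small standard-library sketch. The project itself uses Pydantic settings in `backend/core/config.py`; the `Settings` dataclass and `load_settings` helper below are hypothetical and cover only a subset of the variables.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    default_llm_provider: str
    max_decomposition_depth: int
    max_critic_iterations: int
    thinker_default_mode: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read a subset of the documented variables, applying the table's defaults."""
    env = os.environ if env is None else env
    mode = env.get("THINKER_DEFAULT_MODE", "interactive")
    if mode not in ("interactive", "autonomous"):
        raise ValueError(f"unsupported THINKER_DEFAULT_MODE: {mode!r}")
    return Settings(
        default_llm_provider=env.get("DEFAULT_LLM_PROVIDER", "anthropic"),
        max_decomposition_depth=int(env.get("MAX_DECOMPOSITION_DEPTH", "3")),
        max_critic_iterations=int(env.get("MAX_CRITIC_ITERATIONS", "3")),
        thinker_default_mode=mode,
    )


# Passing a dict instead of os.environ keeps the example deterministic.
settings = load_settings({"MAX_CRITIC_ITERATIONS": "5", "THINKER_DEFAULT_MODE": "autonomous"})
print(settings)
```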

Running Tests

# Full suite (262 tests)
uv run pytest tests/ -v

# With coverage
uv run pytest tests/ --cov=backend --cov-report=term-missing

# Specific category
uv run pytest tests/test_agents/ -v      # Agent pipeline tests
uv run pytest tests/test_tools/ -v       # Tool implementation tests
uv run pytest tests/test_models/ -v      # Domain model tests
uv run pytest tests/test_core/ -v        # Prompt engine tests
uv run pytest tests/test_api/ -v         # WebSocket tests

How It Works — Example

Input: "Optimal Meeting Schedule — find availability windows during a single 8-hour workday for 5 people with overlapping constraints"

Pipeline Execution:

Thinker analyzes the problem, identifies ambiguities (time zones? priority ordering? lunch breaks?), and either asks clarifying questions or makes assumptions.
Planner decomposes into dimensions: Temporal Constraints, Participant Availability, Room/Resource Allocation, Priority Optimization. Each dimension gets phases and atomic tasks.
Critic evaluates the plan — scores completeness, feasibility, specificity. If the plan is too vague or missing edge cases, it sends revision feedback back to the Planner.
Solver executes the validated plan, computing concrete time windows, conflict resolutions, and a recommended schedule.
Judge validates the answer against original constraints — did it actually address all 5 people? Are the time windows valid? Does it respect the 8-hour boundary? Issues a verdict with a confidence score.

The entire flow streams to the frontend in real time, with sub-node activity visible in the Sub-Graph Explorer.
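
For a sense of the kind of computation the Solver step implies in this example, the sketch below finds windows when all five people are free. It is not the Solver's actual implementation: the function name, the participant names, and the busy intervals are invented for illustration, with times expressed as minutes from the start of an 8-hour (480-minute) workday.

```python
Interval = tuple[int, int]


def common_free_windows(
    busy_by_person: dict[str, list[Interval]],
    day_start: int = 0,
    day_end: int = 480,
) -> list[Interval]:
    """Return the gaps in the workday during which nobody is busy."""
    # Clamp every busy interval to the workday and sort by start time.
    busy = sorted(
        (max(start, day_start), min(end, day_end))
        for intervals in busy_by_person.values()
        for start, end in intervals
        if start < day_end and end > day_start
    )
    free: list[Interval] = []
    cursor = day_start  # earliest minute not yet covered by a busy interval
    for start, end in busy:
        if start > cursor:
            free.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < day_end:
        free.append((cursor, day_end))
    return free


# Hypothetical busy intervals, in minutes since the workday began.
busy = {
    "alice": [(0, 60), (180, 240)],
    "bob": [(30, 90)],
    "carol": [(240, 300)],
    "dave": [(120, 150), (420, 480)],
    "erin": [(200, 260)],
}
windows = common_free_windows(busy)
print(windows)  # [(90, 120), (150, 180), (300, 420)]
```

A Judge-style check on this output would confirm that every window lies inside the 0 to 480 range and overlaps none of the five participants' busy intervals.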

License

Apache License 2.0 — see LICENSE for details.

Built by @iotlodge
