- Overview & Strategic Context
- Innovation & Architecture
- System Architecture & User Flows
- Multi-Agent Reasoning Engine
- Backend Services & Core Infrastructure
- Frontend Environment & Workspace
- External AI & Cloud Services
- Technology Stack & Library Ecosystem
- Testing & Benchmarking
- Project Directory Structure
- Installation & Setup
FinAdvisor AI is a multi-agent financial advisory solution designed to help SMEs manage finances, build credible profiles, and access financial services. Built on LangGraph, it uses specialized AI agents to process informal business data and provide accurate, safe guidance.
Mission: To empower SMEs with intelligent financial guidance that transforms fragmented data into structured insight, enabling informed decisions and access to formal financial services with high accuracy and safety.
| Principle | Description |
|---|---|
| Accuracy | Exact numerical outputs, no LLM approximations. |
| Safety | Multi-layer validation before reaching users. |
| Transparency | Clear explanations of recommendation generation. |
| Inclusivity | Accommodates informal/cash-based records. |
| Personalization | Adapted to business size, sector, and risk. |
| Accessibility | Jargon-free explanations for all literacy levels. |
| Data Sovereignty | SMEs control their data. |
SMEs are critically underserved due to systemic challenges:
- Lack of Formal Records: Micro SMEs rely on handwritten or informal logs, lacking standard bookkeeping.
- "Dirty" Data: Existing records are often incomplete, inconsistent, and mixed with personal finances.
- Institutional Exclusion: Without formal statements, SMEs cannot access traditional financial services.
- Market Limitations: Existing solutions (traditional software, generic chatbots, human advisors, bank tools) require formal records, domain expertise, or are prohibitively expensive.
FinAdvisor AI bridges these gaps by:
- Structuring messy, informal records into usable summaries.
- Providing cash flow analysis adapted to irregular revenue patterns.
- Generating actionable financial health assessments.
- Enabling preparation for formal loans and grants.
- Maintaining multi-layer safety validation before user delivery.
Core Objectives:
| Objective | Description |
|---|---|
| Data Structuring | Transform informal records into structured summaries. |
| Cash Flow Intelligence | Real-time analysis tailored to SME patterns. |
| Health Assessment | Actionable, easy-to-understand business health reports. |
| Access Enablement | Help prepare profiles for formal lending. |
| Risk-Aware Guidance | Advice accounting for small business risks. |
| Financial Literacy | Educate owners on essential financial concepts. |
Primary Segment: SMEs including informal businesses, sole proprietors, and early-stage ventures (1-50 employees, with a primary focus on micro-enterprises 1-10) in retail, services, agriculture, trades, and gig economy sectors.
User Personas:
- Informal Micro-Enterprise: Relies on notebook or manual logs; needs basic financial separation and structuring for growth.
- Growing Small Business: Uses a mix of physical and digital records; needs better data consolidation and forecasting for expansion.
- Formalization-Ready Entrepreneur: Semi-formal operations; needs accounting organization, tax compliance visibility, and official profile building for institutional lending.
User Needs Matrix:
| Priority | Need |
|---|---|
| Critical | Data ingestion (informal logs) and structuring (categorization, quality scoring). |
| High | Cash flow tracking, financial health scoring, and loan/grant documentation access. |
| Medium | Scenario planning, tax estimates, and continuous financial education. |
The core innovation is an eight-agent architecture with strict separation of concerns. Unlike single-LLM chatbots, FinAdvisor breaks the reasoning process into specialized, composable nodes. This lets it handle messy SME data safely and avoids a single point of failure.
Key Capabilities:
- Dirty Data Ingestion: Parses handwritten ledgers (OCR), mobile money logs, and reconciles contradictory records with quality scoring.
- Smart Query Routing: Efficiently classifies and routes factual queries, complex analysis, and data uploads to the appropriate agent.
- Self-Correcting Feedback Loop: Draft responses are reviewed by a Reflection Agent; errors trigger revisions before user delivery.
- LLM-Computation Separation: Language models handle reasoning; exact numerical math executes in sandboxed, deterministic tools.
- Persistent Context: Maintains short-term conversation context and long-term business profiles.
- Domain RAG: Specialized knowledge bases covering SME lending, tax, and ratio benchmarks.
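As a rough illustration of the quality scoring mentioned above, the sketch below scores a batch of informal ledger rows by field completeness. The `LedgerEntry` fields and the `quality_score` formula are assumptions made for this example; the source does not specify how the score is computed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LedgerEntry:
    """One row from an informal record; any field may be missing (hypothetical schema)."""
    date: Optional[str]
    description: Optional[str]
    amount: Optional[float]


def quality_score(entries: list[LedgerEntry]) -> float:
    """Return the fraction of populated fields across all entries (0.0 to 1.0)."""
    if not entries:
        return 0.0
    filled = sum(
        value is not None and value != ""
        for entry in entries
        for value in (entry.date, entry.description, entry.amount)
    )
    return filled / (3 * len(entries))


entries = [
    LedgerEntry("2024-03-01", "rice sales", 120.0),
    LedgerEntry(None, "transport", 15.0),   # date missing
    LedgerEntry("2024-03-02", "", None),    # description and amount missing
]
score = quality_score(entries)  # 6 of 9 fields are populated
```

A production scorer would likely also weigh consistency between sources, which this completeness-only sketch ignores.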
SME users must be protected from AI risks. FinAdvisor mitigates each core risk category through verifiable operations:
| Risk Category | Key Mitigations |
|---|---|
| Hallucinations | Sandboxed computation; mandatory RAG citations verified by Reflection Agent. |
| Bias & Fairness | Specific inclusion of informal economy data; size and region-parameterized advice. |
| Privacy & Security | Strict data isolation; sensitive value masking; minimal context windows. |
| Technical Errors | Multi-format fallback parsing; confidence scoring on OCR data. |
| Legal/Compliance | Explicit disclaimers; professional referrals for high-risk queries. |
| Over-Reliance | Embedded financial education to build user independence. |
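The "sensitive value masking" mitigation in the table could look something like the following. This is only a sketch: the rule that any run of eight or more digits counts as sensitive, and the `mask_sensitive` name, are assumptions for illustration, not the project's actual policy.

```python
import re

# Assumption: any run of 8 or more digits (e.g. an account number) is sensitive.
_LONG_DIGITS = re.compile(r"\d{8,}")


def mask_sensitive(text: str, visible: int = 4) -> str:
    """Replace all but the last `visible` digits of long digit runs with '*'."""
    def _mask(match: re.Match) -> str:
        digits = match.group(0)
        hidden = max(len(digits) - visible, 0)
        return "*" * hidden + digits[hidden:]
    return _LONG_DIGITS.sub(_mask, text)


masked = mask_sensitive("Paid supplier from account 1234567890 on 2024-03-01")
```

Short digit groups such as the parts of a date are left untouched, so the text stays readable for the agent while the account number is hidden.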
Scope: This framework focuses exclusively on SME financial advisory, excluding personal finance, credit scoring, or licensed investment advice.
The architecture is divided into a robust, asynchronous backend and a highly interactive, state-driven frontend. The two communicate over WebSockets using strictly typed messages, enabling real-time bidirectional data flow without the overhead of HTTP polling.
The primary flow ensures that a user's natural language query is translated into precise, mathematically validated financial advice.
sequenceDiagram
participant User
participant Frontend
participant WebSocket
participant State as Agent State
participant Router as Agent Router
participant SubAgent as Domain Agents
participant Tools as Math/Data Tools
User->>Frontend: Query
Frontend->>WebSocket: Stream Input
WebSocket->>State: Update History
State->>Router: Execute Graph
Router->>SubAgent: Route Task
loop Tool Iteration
SubAgent->>Tools: Calculate
Tools-->>SubAgent: Data / Error
SubAgent->>SubAgent: Recalculate
end
SubAgent->>WebSocket: Stream Chain-of-Thought
SubAgent->>WebSocket: Stream Response & UI
WebSocket->>Frontend: Render Components
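To make "strictly typed" messages concrete, here is a minimal sketch of a validated event envelope for the stream in the diagram above. The `StreamEvent` class, the `parse_event` helper, and the `"final"` event type are assumptions for illustration; only the `thinking` and `tool_call` type names appear elsewhere in this document.

```python
import json
from dataclasses import asdict, dataclass

# "final" is an assumed name for the last event in a stream.
ALLOWED_TYPES = ("thinking", "tool_call", "final")


@dataclass(frozen=True)
class StreamEvent:
    """One message sent over the WebSocket (hypothetical envelope)."""
    type: str
    data: dict

    def __post_init__(self) -> None:
        # Reject unknown event types at construction time.
        if self.type not in ALLOWED_TYPES:
            raise ValueError(f"unknown event type: {self.type!r}")

    def to_json(self) -> str:
        return json.dumps(asdict(self))


def parse_event(raw: str) -> StreamEvent:
    """Decode a JSON message and validate it by constructing a StreamEvent."""
    payload = json.loads(raw)
    return StreamEvent(type=payload["type"], data=payload["data"])


event = StreamEvent(type="thinking", data={"node": "planner"})
round_tripped = parse_event(event.to_json())
```

The backend lists `pydantic` in its stack, which would be the more natural choice there; plain dataclasses are used here only to keep the sketch dependency-free.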
The multi-agent core relies on a directed graph of specialized nodes. Instead of a single monolithic prompt, inputs are routed dynamically, evaluated, and iteratively improved through reflection loops before reaching the user.
graph TD
Start((User Input)) --> Router{Router Node}
Router -->|Data Upload| Summarizer[Data Summarizer]
Router -->|Analysis| Refiner[Query Refiner]
Router -->|Follow Up| Memory[Memory Node]
Router -->|Quick Lookup| Expert[Finance Expert]
Router -->|Out of Scope| End((Final Response))
Summarizer --> Refiner
Refiner --> Memory
Memory --> Planner[Planner Node]
Planner -->|Needs Tool| Tools[Tool Executor]
Tools --> Planner
Planner -->|No Tool Needed| Reflection{Reflection Node}
Expert --> Reflection
Reflection -->|Retry Logic| Planner
Reflection -->|Approved| End
- Router Node: As the entry point, it semantically evaluates user input and assigns a distinct routing category (e.g., Data Upload, Quick Lookup, Analysis, or Follow Up).
- Data Summarizer Node: Handles deep context compression, parsing large tabular data or complex uploads to fit the context window efficiently before analysis.
- Query Refiner Node: Sanitizes and enhances ambiguous user questions, ensuring subsequent nodes have a rigid objective constraint to follow.
- Memory Node: Functions as the persistent storage layer. It extracts key entities (like age, risk tolerance) and dynamically references the user's historical state.
- Planner Node: Breaks down the refined objectives into step-by-step sub-tasks, identifying exactly when to calculate thresholds or fetch deterministic datasets.
- Tool Executor Node: Runs deterministic tools inside a sandboxed execution loop, returning each result to the Planner until the plan's tool calls are complete.
- Finance Expert Node: A specialized track for quick, direct domain queries that require immediate financial wisdom without deep planning overhead.
- Reflection Node: The final gatekeeper. It evaluates logic chains and tool outputs recursively to enforce quality control, triggering retries (or escalating logic) before final delivery to the user.
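The edges of the graph above can be traced with a small plain-Python state machine. This is a sketch for reading the diagram, not the LangGraph implementation: the category keys, the `trace_path` function, and its `tool_calls` and `rejections` knobs are hypothetical, and the node bodies are left out entirely.

```python
# Router edges copied from the diagram; keys are assumed category identifiers.
ROUTES = {
    "data_upload": "summarizer",
    "analysis": "refiner",
    "follow_up": "memory",
    "quick_lookup": "expert",
    "out_of_scope": "end",
}


def trace_path(category: str, tool_calls: int = 0, rejections: int = 0) -> list[str]:
    """List the nodes visited for one request.

    tool_calls: how many times the planner asks for a tool before drafting.
    rejections: how many times the reflection node sends the draft back.
    """
    path = ["router"]
    node = ROUTES[category]
    while node != "end":
        path.append(node)
        if node == "summarizer":
            node = "refiner"
        elif node == "refiner":
            node = "memory"
        elif node == "memory":
            node = "planner"
        elif node == "planner":
            if tool_calls > 0:
                tool_calls -= 1
                node = "tools"
            else:
                node = "reflection"
        elif node == "tools":
            node = "planner"
        elif node == "expert":
            node = "reflection"
        elif node == "reflection":
            if rejections > 0:
                rejections -= 1
                node = "planner"  # retry edge
            else:
                node = "end"      # approved
    path.append("end")
    return path


quick = trace_path("quick_lookup")
analysis = trace_path("analysis", tool_calls=1, rejections=1)
```

Note that a rejected Finance Expert draft also re-enters through the Planner, since the diagram has a single retry edge from Reflection.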
The core reasoning engine operates using a graph-based state machine, preventing monolithic prompt failures. The system is broken down into specialized execution nodes (diagrammed above), tailored prompting, and discrete tools.
Instead of hallucinating math, agents trigger exact programmatic tools:
- Calculator Tool: Executes strict arithmetic operations.
- Forecast Tool: Computes compound interest and projected asset trajectories deterministically.
- Data Quality Tool: Assesses confidence and completeness in user-provided datasets.
- User Data & File Content Tools: Extract specific rows and tables from securely uploaded CSVs and structured PDFs.
- Knowledge Base Tool: Triggers semantic searches against official financial datasets and compliance guidelines.
- Output Builder Tools: Enforces structured response formats, ensuring the frontend can render complex visualizations instead of plaintext.
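A Calculator Tool of the kind listed above might be sketched as a whitelist-based expression evaluator. This is an assumption-laden illustration: it uses Python's `ast` module and `Decimal` rather than RestrictedPython or a Lambda sandbox, and the `calculate` function name is hypothetical.

```python
import ast
import operator
from decimal import Decimal

# Only these binary operators are allowed; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculate(expression: str) -> Decimal:
    """Evaluate +, -, *, / and parentheses with decimal arithmetic."""
    def _eval(node: ast.AST) -> Decimal:
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            # Go through repr so 0.1 becomes Decimal("0.1"), not its binary expansion.
            return Decimal(repr(node.value))
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        # Names, calls, attribute access, etc. never reach evaluation.
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)


total = calculate("(1200 - 450) * 3")
```

Because the language model only supplies the expression string and never the result, figures shown to the user come from this deterministic path. Division is rounded to the active `decimal` context precision rather than being exact.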
A dedicated vector search pipeline anchors the agents' financial reasoning. This Retrieval-Augmented Generation (RAG) approach prevents outdated or hallucinated advice by prioritizing indexed tax codes and official documentation.
1. Embedding & Ingestion Pipeline
Complex financial documents and institutional guidelines are parsed, chunked semantically, and embedded into dense vector representations. This ensures that even deeply technical taxation rules and investment guidelines are mathematically searchable.
2. Intelligent Retrieval Algorithms
When an agent encounters a domain-specific query, the system queries the vector store instead of relying on the model's general parametric knowledge. It retrieves only the most contextually relevant document chunks, ranked by similarity in the embedding space.
3. Contextual Injection
The retrieved factual data is rigidly injected into the agent's active context window alongside stringent system prompts. This forces the model to synthesize its answers strictly from the ingested sources rather than generating unverified assumptions.
graph TD
User[User Uploads PDF/Image/CSV] --> Parser[Document Parser / AWS Textract]
Parser --> Chunker[Semantic Chunker]
Chunker --> Embedder[Embedding Model]
Embedder --> VectorDB[(Vector DB / Pinecone)]
Query[User AI Prompt] --> QueryRefiner[Query Refiner Node]
QueryRefiner --> SearchEmbedder[Embed Refined Query]
SearchEmbedder --> VectorDB
VectorDB -- Return Top-K Chunks --> Context[Agent Context Window]
Context --> Planner[Finance Expert / Planner]
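The "Return Top-K Chunks" step can be illustrated with a toy cosine-similarity search. In the described system Pinecone performs this lookup; the three-dimensional vectors, chunk texts, and function names below are invented for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], chunks: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the texts of the k chunks most similar to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine_similarity(query, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy index: (chunk text, embedding). Real embeddings have far more dimensions.
chunks = [
    ("VAT registration threshold", [0.9, 0.1, 0.0]),
    ("Loan eligibility checklist", [0.1, 0.9, 0.0]),
    ("Current ratio benchmark", [0.0, 0.2, 0.9]),
]
results = top_k([1.0, 0.2, 0.0], chunks, k=2)
```

Only the returned chunk texts would be injected into the agent's context window, which keeps the prompt small and the sources citable.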
The backend natively supports complex business operations necessary for a consumer-facing product layer.
- WebSocket Streaming Service: Maintains stateful, threaded connections handling real-time token pushing and operational checkpoints.

sequenceDiagram
participant UI as React Frontend
participant WS as WebSocket Layer
participant Engine as LangGraph Engine
participant Node as Active Agent Node
UI->>WS: User message ("Explain Roth IRA")
WS->>Engine: init_graph_stream()
loop Graph Execution (Router -> Planner -> Tools)
Engine->>Node: Execute Current Node
Node-->>Engine: Yield Intermediate State (Thinking)
Engine-->>WS: Stream Event (type: thinking, data: Node State)
WS-->>UI: Update "Agent is analyzing..." UI
alt Tool Invocation
Node->>Node: Execute Local Function
Node-->>WS: Stream Event (type: tool_call)
WS-->>UI: Display Tool Spinner details
end
end
Engine-->>WS: Stream Final Generation
WS-->>UI: Render Final Formatted Markdown/Canvas

- Process Cancellation Mechanism: Features a robust, signal-based interruption system. If a user halts a query or disconnects mid-stream, the backend instantly terminates the active AI process and background graph loops, freeing up compute and preventing runaway API costs.
- Rate Limit & Resource Protection: A Redis-backed sliding-window limiter, implemented as Lua scripts, tracks Tokens-Per-Minute (TPM) usage. It mitigates API spam, curbs runaway recursive agent loops, and protects overall cloud budgets.
sequenceDiagram
participant Client
participant API as Backend Service
participant Redis as Redis (Lua Scripts)
participant LLM as AI Provider
Client->>API: Submit Financial Query
API->>API: Estimate Token Cost
API->>Redis: eval(check_and_reserve, estimated_tokens)
alt Tokens Exhausted
Redis-->>API: Reject (Returns Retry-After)
API-->>Client: 429 Error / Threshold Reached UI
else Tokens Available
Redis-->>API: Approve & Reserve Allocation
API->>LLM: Execute Agent Graph
LLM-->>API: Return Final Generation
API->>Redis: record_actual(actual_tokens_used)
API-->>Client: Stream Final Response
end
- Document Processing: Securely ingests PDFs, Images (via OCR), and CSVs, converting unstructured data into structured schemas optimized for contextual analysis.
- Canvas Exports: Packages dynamically generated financial plans into distributable, formalized document formats.
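The check-and-reserve step of the rate limiting flow above can be sketched in memory as follows. This is an illustration of the sliding-window idea only: the real system is described as running Lua scripts inside Redis for atomicity, the class and method signatures here are assumptions, and the `record_actual` correction step is omitted.

```python
from collections import deque


class SlidingWindowTokenLimiter:
    """In-memory stand-in for a Tokens-Per-Minute sliding window."""

    def __init__(self, tokens_per_minute: int, window_seconds: float = 60.0):
        self.limit = tokens_per_minute
        self.window = window_seconds
        self._events: deque = deque()  # (timestamp, tokens) reservations
        self._used = 0

    def _evict(self, now: float) -> None:
        # Drop reservations that have aged out of the window.
        while self._events and self._events[0][0] <= now - self.window:
            _, tokens = self._events.popleft()
            self._used -= tokens

    def check_and_reserve(self, estimated_tokens: int, now: float) -> tuple[bool, float]:
        """Return (approved, retry_after_seconds)."""
        self._evict(now)
        if self._used + estimated_tokens <= self.limit:
            self._events.append((now, estimated_tokens))
            self._used += estimated_tokens
            return True, 0.0
        # Capacity first frees up when the oldest reservation leaves the window.
        retry_after = self._events[0][0] + self.window - now if self._events else 0.0
        return False, retry_after


limiter = SlidingWindowTokenLimiter(tokens_per_minute=1000)
first = limiter.check_and_reserve(600, now=0.0)
second = limiter.check_and_reserve(600, now=10.0)
third = limiter.check_and_reserve(600, now=60.0)
```

Passing `now` explicitly keeps the sketch deterministic; a service would read the clock itself and translate a rejection into the 429 response with `Retry-After` shown in the diagram.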
Built with a highly scalable relational database architecture optimized for graph states:
- Conversation Mapping: Intricately maps complex graph execution states to specific user sessions.
- Financial Indexing: Utilizes composite indexing for rapid multi-dimensional filtering across transaction histories.
- Telemetry Tracking: Records granular user metadata, API budgets, branching logic paths, and document upload summaries.
The frontend leverages a modern reactive component architecture and robust state management to deliver a high-performance interactive experience.
- Real-Time Thinking Steps: Renders the exact step-by-step intermediate tools the agent is executing in real-time, enforcing complete transparency of the advisory process.
- Integrated File Management: Provides an intuitive drag-and-drop zone and interactive preview layer for parsing complex financial documents.
- Structured Input Templates: Provides users with predefined, visually guided templates and dynamic forms to seamlessly structure complex financial queries. This ensures the routing agents receive normalized, high-quality parameters immediately rather than relying solely on raw, unstructured conversational inputs.
- Streaming Render Engine: Highly optimized rendering layers that parse incoming WebSocket data streams into fully visualized components instantly.
Instead of constraining answers to plaintext, the system forces the UI to conditionally render interactive blocks:
- Dynamic Charts: Renders interactive financial trajectories, cash flow graphs, and predictive asset allocations.
- Comparison Tables: Formats side-by-side asset, loan, or investment comparisons for immediate visual contrast.
- Canvas Workspace: A collaborative document environment where users apply layouts and utilize AI-driven inline editing tools to refine financial reports originally drafted by the agent.
Financial planning is rarely a straight line; users frequently want to explore alternative strategies based on hypothetical scenarios (e.g., "What if I invest more aggressively?" versus "What if I use that capital to pay down loans?").
The Branching Mechanism: When a user selects a specific historical message in the chat timeline to "branch" from, the frontend sends a state-fork payload to the backend. The backend executes a deep copy of the complex LangGraph execution state, memory entities, and financial variables up to that exact message ID. This generates a completely new, independent session ID. The frontend then instantiates a parallel chat workspace, allowing the user to continue the conversation in a new direction. All subsequent deterministic calculations and tool iterations in this new branch operate strictly against the forked state, leaving the original conversation completely untouched.
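The fork described above amounts to a deep copy of state truncated at the chosen message. The sketch below assumes a simplified model in which each message stores a snapshot of the financial variables; the `Checkpoint`, `Session`, and `fork_session` names are hypothetical and stand in for the real LangGraph execution state.

```python
import copy
import uuid
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    message_id: str
    text: str
    variables: dict  # financial variables as they stood after this message


@dataclass
class Session:
    session_id: str
    checkpoints: list = field(default_factory=list)


def fork_session(parent: Session, message_id: str) -> Session:
    """Create an independent session containing history up to message_id."""
    ids = [c.message_id for c in parent.checkpoints]
    cut = ids.index(message_id) + 1  # raises ValueError for an unknown message ID
    return Session(
        session_id=str(uuid.uuid4()),
        # Deep copy so later edits in the branch cannot reach the parent.
        checkpoints=copy.deepcopy(parent.checkpoints[:cut]),
    )


parent = Session("parent", [
    Checkpoint("m1", "Plan with current savings", {"monthly_investment": 200}),
    Checkpoint("m2", "Add equipment loan", {"monthly_investment": 200, "loan": 5000}),
])
branch = fork_session(parent, "m1")
branch.checkpoints[0].variables["monthly_investment"] = 500  # what-if change
```

After the what-if change, the parent's first checkpoint still holds the original value, which is the isolation property the comparison table relies on.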
Comparison with Standard AI Limitations:
| Feature | Traditional Linear Conversation | Non-Linear Contextual Branching |
|---|---|---|
| Scenario Exploration | Restricts users to a single timeline; "what-if" questions pollute the original context. | Empowers side-by-side strategy comparisons in safe, isolated workspaces. |
| State Management | Retains a single, monolithic conversation history which becomes easily bloated. | Functions intrinsically like a Git tree, managing multiple divergent state forks. |
| Context Pollution Risk | High. Modifying parameters mid-conversation often confuses the model's memory recall. | Zero. The parallel branch is an independent state; changes here cannot leak into the parent. |
| UI Experience | Forces endless scrolling and repetitive queries to reset previous parameters. | Provides distinct tabs/workspaces to toggle instantly between the parent plan and the alternate strategy. |
The platform integrates with top-tier AI providers and highly reliable cloud infrastructure to ensure maximal accuracy, privacy, and speed.
- Google (Gemini 2.5 Flash): The primary heavyweight reasoning model. Relied upon for extreme logical adherence, executing complex LangGraph nodes, and ensuring reliable tool-output interpretation.
- Google (Gemini 2.5 Flash Lite): Specialized lightweight model optimized for high-throughput classification, routing, and real-time data ingestion tasks.
- OpenAI: Powers the embedding architecture, converting financial semantic data into dense vector representations for the retrieval-augmented generation (RAG) loops.
- Ollama: Provides a local, offline environment to rapidly test agent logic variations without incurring cloud API costs.
- Pinecone: The high-throughput vector database, hosting the embedded compliance metrics and tax codes for low-latency similarity search.
- Amazon Textract: Specialized machine learning service utilized to automatically extract text, handwriting, and highly structured data from scanned financial PDFs and images.
- AWS Lambda (Optional): Can be configured as a secure, sandboxed execution environment for deterministic tools, ensuring that any code execution triggered by the agents is isolated and cannot affect the stability of the core backend.
- Amazon S3: Secure, highly available cloud storage bucket utilized for persisting encrypted user uploads before parsing and pipeline injection.
- Neon DB (PostgreSQL): Alternative persistence layer designated for relational synchronization and rapid cloud prototyping.
- Docker / AWS ECS: The underlying packaging and orchestration logic enabling the Python servers to deploy predictably across load-balanced AWS environments.
A comprehensive listing of the active libraries and systems powering each architectural layer.
| Framework/Library | Primary Function |
|---|---|
| `langgraph` & `langchain-core` | Orchestrates the multi-agent state machine and standardizes LLM interfaces. |
| `langchain-anthropic` & `langchain-google-genai` | Inference providers executing deterministic graph logic. |
| `langchain-openai` & `langchain-pinecone` | Drives the vector-embeddings pipeline and Retrieval-Augmented Generation. |
| `pandas` & `numpy` | Underpins the high-fidelity tabular data summarization nodes. |
| `RestrictedPython` | Secure, sandboxed runtime preventing malicious payload execution from tools. |
| Framework/Library | Primary Function |
|---|---|
| `fastapi` & `uvicorn` | High-performance ASGI framework and server managing HTTP/WebSockets. |
| `sqlalchemy` & `asyncpg` | Native async Object-Relational Mapper (ORM) mapped to PostgreSQL. |
| `alembic` | Safely manages and applies relational database schema migrations. |
| `redis` | Underpins the Token-Per-Minute rate limiter and caches WebSocket states. |
| `boto3` & `amazon-textract-textractor` | Leverages AWS cloud services for deep OCR capabilities. |
| `pydantic` | Enforces rigid data validation schemas across API requests and tool outputs. |
| `pdfplumber` | Structurally parses complex financial PDF documents for ingestion. |
| `pytest` & `pytest-asyncio` | Drives the rigorous automated testing and benchmarking suite. |
| Framework/Library | Primary Function |
|---|---|
| `react` & `react-dom` | Component-based visualization library operating the core client layer. |
| `vite` | Next-generation frontend build tooling enabling lightning-fast HMR. |
| `zustand` | Unopinionated state manager tracking auth and agent output buffers. |
| `recharts` & `chart.js` | Renders dynamic financial charts from deterministic backend data. |
| `tailwind-merge` & `clsx` | Merges and composes Tailwind CSS utility classes for modular, responsive styling. |
| `framer-motion` | Orchestrates the smooth visual transitions of the ThinkingSteps component. |
| `react-markdown` & `remark-gfm` | Parses the LLM markdown streams securely for the interactive Canvas. |
| `@radix-ui/react-*` | Provides accessible component primitives (Dialogs, Tooltips, Modals). |
Given the non-deterministic nature of foundational language models, a stringent, metrics-driven evaluation suite enforces operational integrity.
- Automated Evaluations: Aggressively tracks model latency, routing path accuracy, and continuously measures semantic similarity against a locked golden dataset to map regressions.
- Comprehensive Unit & Flow Tests: Deep code coverage spanning from deterministic calculator validation and isolated agent node logic to holistic multi-turn state evaluations and rigid guardrail safety verifications.
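Routing path accuracy against a golden dataset can be measured with very little code. In the sketch below, `routing_accuracy`, the keyword-based stand-in router, and the four golden examples are all invented for illustration; the real suite would call the actual Router Node.

```python
from typing import Callable


def routing_accuracy(router: Callable[[str], str], golden: list[tuple[str, str]]) -> float:
    """Fraction of golden queries for which the router returns the expected category."""
    if not golden:
        return 0.0
    hits = sum(router(query) == expected for query, expected in golden)
    return hits / len(golden)


def keyword_router(query: str) -> str:
    """Deliberately naive stand-in for the LLM-based Router Node."""
    text = query.lower()
    if "upload" in text:
        return "data_upload"
    if "what is" in text:
        return "quick_lookup"
    return "analysis"


# Hypothetical golden set: (query, expected routing category).
GOLDEN = [
    ("I want to upload my sales ledger", "data_upload"),
    ("What is working capital?", "quick_lookup"),
    ("Can I afford a second employee next quarter?", "analysis"),
    ("And what about the quarter after that?", "follow_up"),
]
accuracy = routing_accuracy(keyword_router, GOLDEN)
```

The stand-in router misses the follow-up case, so the score drops below 1.0. A test could assert a minimum threshold on this number so that a prompt or model change that degrades routing fails the build.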
The repository is organized into distinct environments, isolating the frontend client from the core backend logic, evaluation scripts, and internal testing frameworks.
finadvisor-ai/
├── backend/                # Core FastAPI server and LangGraph execution environment
│   ├── agents/             # Multi-agent architecture logic
│   │   ├── knowledge/      # Vector embeddings and RAG document retrieval logic
│   │   ├── nodes/          # LangGraph state nodes (Planner, Expert, Summarizer, etc.)
│   │   ├── prompts/        # Parameterized contextual instructions for LLM execution
│   │   └── tools/          # Deterministic mathematical and data-extraction tools
│   ├── app/                # General API models and robust service layer
│   │   ├── api/            # REST endpoints and dependency injections
│   │   ├── core/           # Logging, exceptions, Redis configuration, and security
│   │   ├── models/         # SQLAlchemy ORM definitions (users, sessions, transactions)
│   │   ├── services/       # Business logic (rate limiting, file uploads, canvas exports)
│   │   └── websocket/      # Real-time stateful streaming connections
│   └── migrations/         # Alembic database migration schemas and version history
├── frontend/               # React/Vite SPA client application
│   └── src/
│       ├── api/            # REST and WebSocket client connectors mapping to backend services
│       ├── components/     # React UI components
│       │   ├── chat/       # Message rendering, real-time ThinkingSteps, branching dialogs
│       │   ├── landing/    # Marketing pages and user onboarding flows
│       │   ├── templates/  # Dynamic structured forms for query normalization
│       │   ├── ui/         # Buttons, Modals, Loaders, and other reusable UI elements
│       │   └── tools/      # Canvas workspace, Chart renderers, and Data tables
│       ├── hooks/          # Custom React hooks for state management and WebSocket handling
│       ├── pages/          # Top-level application routing (Chat, Dashboards, Landing)
│       ├── lib/            # Utility functions for data formatting, token counting, and embedding visualizers
│       └── stores/         # Zustand state managers (Auth, Chat Session, UI Themes)
├── scripts/                # Benchmark scripts, seed data logic, and evaluation configurations
└── tests/                  # Pytest suite covering agent logic, multi-turn flows, and guardrails

- Environment Setup: Create a `.env` file at the project root with the content below, and fill in your valid API keys:
# LLMs
GOOGLE_API_KEY=your-api-key-here
# Pinecone — Docker local emulator
USE_PINECONE_LOCAL=true
# AWS Credentials
AWS_ACCESS_KEY_ID=aws-access-key-id-placeholder
AWS_SECRET_ACCESS_KEY=aws-secret-access-key-placeholder
AWS_REGION=us-east-1
OCR_BACKEND=textract
AWS_S3_BUCKET=aws-bucket-placeholder
STORAGE_BACKEND=s3
LAMBDA_CALCULATOR_ENABLED=true
LAMBDA_CALCULATOR_FUNCTION=finadvisor-calculator

- Build and Run Containers:
  - Build images: `docker compose build`
  - Run built containers: `docker compose up -d`
- Access the Application: Open your browser and go to http://localhost:8080/ to get started.
For faster iteration during development, you can run the services natively using the provided Makefile:
- Infrastructure: Start the supporting services (Postgres, Redis, Pinecone local) via Docker: `make dev-infra`
- Backend Setup:

      cd backend
      python3 -m venv venv
      source venv/bin/activate  # On Windows: .\venv\Scripts\activate
      pip install -r requirements.txt
      cd ..
      make db-migrate
      make db-seed
      make dev-backend

- Frontend Setup:

      cd frontend
      npm install
      cd ..
      make dev-frontend

- Database URL: When running natively, ensure your `.env` file uses `localhost` for service connections (e.g., `DATABASE_URL=postgresql+asyncpg://finadvisor:finadvisor@localhost:5432/finadvisor`).
Distributed under the MIT License. See LICENSE for more information.