FinAdvisor AI - Financial Advisory for SMEs

Overview & Strategic Context
Innovation & Architecture
System Architecture & User Flows
Multi-Agent Reasoning Engine
Backend Services & Core Infrastructure
Frontend Environment & Workspace
External AI & Cloud Services
Technology Stack & Library Ecosystem
Testing & Benchmarking
Project Directory Structure
Installation & Setup

1. Overview & Strategic Context

1.1 Vision

FinAdvisor AI is a multi-agent financial advisory solution designed to help SMEs manage finances, build credible profiles, and access financial services. Built on LangGraph, it uses specialized AI agents to process informal business data and provide accurate, safe guidance.

Mission: To empower SMEs with intelligent financial guidance that transforms fragmented data into structured insight, enabling informed decisions and access to formal financial services with high accuracy and safety.

1.2 Core Principles

Principle	Description
Accuracy	Exact numerical outputs, no LLM approximations.
Safety	Multi-layer validation before reaching users.
Transparency	Clear explanations of recommendation generation.
Inclusivity	Accommodates informal/cash-based records.
Personalization	Adapted to business size, sector, and risk.
Accessibility	Jargon-free explanations for all literacy levels.
Data Sovereignty	SMEs control their data.

1.3 The Problem & Market Gap

SMEs are critically underserved due to systemic challenges:

Lack of Formal Records: Micro SMEs rely on handwritten or informal logs, lacking standard bookkeeping.
"Dirty" Data: Existing records are often incomplete, inconsistent, and mixed with personal finances.
Institutional Exclusion: Without formal statements, SMEs cannot access traditional financial services.
Market Limitations: Existing solutions (traditional software, generic chatbots, human advisors, bank tools) require formal records, domain expertise, or are prohibitively expensive.

1.4 Value Proposition & Key Objectives

FinAdvisor AI bridges these gaps by:

Structuring messy, informal records into usable summaries.
Providing cash flow analysis formatted for irregular revenue patterns.
Generating actionable financial health assessments.
Enabling preparation for formal loans and grants.
Maintaining multi-layer safety validation before user delivery.

Core Objectives:

Objective	Description
Data Structuring	Transform informal records into structured summaries.
Cash Flow Intelligence	Real-time analysis tailored to SME patterns.
Health Assessment	Actionable, easy-to-understand business health reports.
Access Enablement	Help prepare profiles for formal lending.
Risk-Aware Guidance	Advice accounting for small business risks.
Financial Literacy	Educate owners on essential financial concepts.

1.5 Target Users & Personas

Primary Segment: SMEs including informal businesses, sole proprietors, and early-stage ventures (1-50 employees, with a primary focus on micro-enterprises 1-10) in retail, services, agriculture, trades, and gig economy sectors.

User Personas:

Informal Micro-Enterprise: Relies on notebook or manual logs; needs basic financial separation and structuring for growth.
Growing Small Business: Uses a mix of physical and digital records; needs better data consolidation and forecasting for expansion.
Formalization-Ready Entrepreneur: Semi-formal operations; needs accounting organization, tax compliance visibility, and official profile building for institutional lending.

User Needs Matrix:

Priority	Need
Critical	Data ingestion (informal logs) and structuring (categorization, quality scoring).
High	Cash flow tracking, financial health scoring, and loan/grant documentation access.
Medium	Scenario planning, tax estimates, and continuous financial education.

2. Innovation & Architecture

2.1 Layered Multi-Agent Architecture

The core innovation is an eight-agent architecture with strict separation of concerns. Unlike single-LLM chatbots, FinAdvisor breaks the reasoning process into specialized, composable nodes. This lets it handle messy SME data safely and avoids a single point of failure.

Key Capabilities:

Dirty Data Ingestion: Parses handwritten ledgers (OCR), mobile money logs, and reconciles contradictory records with quality scoring.
Smart Query Routing: Efficiently classifies and routes factual queries, complex analysis, and data uploads to the appropriate agent.
Self-Correcting Feedback Loop: Draft responses are reviewed by a Reflection Agent; errors trigger revisions before user delivery.
LLM-Computation Separation: Language models handle reasoning; exact numerical math executes in sandboxed, deterministic tools.
Persistent Context: Maintains short-term conversation context and long-term business profiles.
Domain RAG: Specialized knowledge bases covering SME lending, tax, and ratio benchmarks.
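The quality-scoring idea can be sketched in a few lines. This is a minimal illustration, assuming a ledger entry is a dict of strings with `date`, `amount`, and `category` fields and that quality is the share of required fields that parse cleanly. The field names, the `score_record` helper, and the scoring rule are hypothetical, not the project's actual implementation.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation

# Hypothetical required fields for one ledger entry.
REQUIRED_FIELDS = ("date", "amount", "category")


def _valid_date(value: str) -> bool:
    try:
        datetime.strptime(value.strip(), "%Y-%m-%d")
        return True
    except ValueError:
        return False


def _valid_amount(value: str) -> bool:
    try:
        # Accept thousands separators such as "1,250.00".
        return Decimal(value.replace(",", "").strip()).is_finite()
    except InvalidOperation:
        return False


CHECKS = {
    "date": _valid_date,
    "amount": _valid_amount,
    "category": lambda value: bool(value.strip()),
}


def score_record(record: dict) -> float:
    """Return the fraction of required fields that are present and parseable."""
    passed = 0
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if isinstance(value, str) and CHECKS[field](value):
            passed += 1
    return passed / len(REQUIRED_FIELDS)


clean = {"date": "2024-03-01", "amount": "1,250.00", "category": "sales"}
messy = {"date": "last tuesday", "amount": "approx 40", "category": ""}
print(score_record(clean), score_record(messy))
```

A score like this could be attached to each parsed record so that downstream agents know how much weight to give it.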

2.2 Risk Management & Safety Framework

SME users must be protected from AI risks. FinAdvisor mitigates the core vulnerabilities below with verifiable operations:

Risk Category	Key Mitigations
Hallucinations	Sandboxed computation; mandatory RAG citations verified by Reflection Agent.
Bias & Fairness	Specific inclusion of informal economy data; size and region-parameterized advice.
Privacy & Security	Strict data isolation; sensitive value masking; minimal context windows.
Technical Errors	Multi-format fallback parsing; confidence scoring on OCR data.
Legal/Compliance	Explicit disclaimers; professional referrals for high-risk queries.
Over-Reliance	Embedded financial education to build user independence.

Scope: This framework focuses exclusively on SME financial advisory, excluding personal finance, credit scoring, or licensed investment advice.

3. System Architecture & User Flows

The architecture is split into an asynchronous backend and an interactive, state-driven frontend. The two communicate over WebSockets using strictly typed messages, which enables real-time bidirectional data flow without the overhead of HTTP polling.

3.1. Main Chat Flow

The primary flow ensures that a user's natural language query is translated into precise, mathematically validated financial advice.

```mermaid
sequenceDiagram
    participant User
    participant Frontend
    participant WebSocket
    participant State as Agent State
    participant Router as Agent Router
    participant SubAgent as Domain Agents
    participant Tools as Math/Data Tools

    User->>Frontend: Query
    Frontend->>WebSocket: Stream Input
    WebSocket->>State: Update History
    State->>Router: Execute Graph
    Router->>SubAgent: Route Task
    loop Tool Iteration
        SubAgent->>Tools: Calculate
        Tools-->>SubAgent: Data / Error
        SubAgent->>SubAgent: Recalculate
    end
    SubAgent->>WebSocket: Stream Chain-of-Thought
    SubAgent->>WebSocket: Stream Response & UI
    WebSocket->>Frontend: Render Components
```

3.2. Agent Architecture & Node Interactions

The multi-agent core relies on a directed graph of specialized nodes. Instead of a single monolithic prompt, inputs are routed dynamically, evaluated, and iteratively improved through reflection loops before reaching the user.

```mermaid
graph TD
    Start((User Input)) --> Router{Router Node}

    Router -->|Data Upload| Summarizer[Data Summarizer]
    Router -->|Analysis| Refiner[Query Refiner]
    Router -->|Follow Up| Memory[Memory Node]
    Router -->|Quick Lookup| Expert[Finance Expert]
    Router -->|Out of Scope| End((Final Response))

    Summarizer --> Refiner
    Refiner --> Memory
    Memory --> Planner[Planner Node]

    Planner -->|Needs Tool| Tools[Tool Executor]
    Tools --> Planner

    Planner -->|No Tool Needed| Reflection{Reflection Node}
    Expert --> Reflection

    Reflection -->|Retry Logic| Planner
    Reflection -->|Approved| End
```

3.3. Execution Nodes

Router Node: As the entry point, it semantically evaluates user input and assigns a distinct routing category (e.g., Data Upload, Quick Lookup, Analysis, or Follow Up).
Data Summarizer Node: Handles deep context compression, parsing large tabular data or complex uploads to fit the context window efficiently before analysis.
Query Refiner Node: Sanitizes and enhances ambiguous user questions, ensuring subsequent nodes have a rigid objective constraint to follow.
Memory Node: Functions as the persistent storage layer. It extracts key entities (like age, risk tolerance) and dynamically references the user's historical state.
Planner Node: Breaks down the refined objectives into step-by-step sub-tasks, identifying exactly when to calculate thresholds or fetch deterministic datasets.
Tool Executor Node: Runs deterministic tools in a sandboxed execution loop and returns their results to the Planner until the plan is complete.
Finance Expert Node: A specialized track for quick, direct domain queries that require immediate financial wisdom without deep planning overhead.
Reflection Node: The final gatekeeper. It evaluates logic chains and tool outputs recursively to enforce quality control, triggering retries (or escalating logic) before final delivery to the user.
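The node interactions above can be traced with a small standard-library sketch. It deliberately does not use LangGraph's API: each node is a plain function that returns the name of the next node, with the edges taken from the diagram in 3.2. The state keys (`category`, `pending_tools`, `needs_revision`), the `MAX_RETRIES` cap, and the stubbed node bodies are assumptions made for illustration.

```python
MAX_RETRIES = 2  # hypothetical cap on reflection-triggered revisions

# Router categories mapped to their first node, following the diagram.
ROUTES = {
    "data_upload": "summarizer",
    "analysis": "refiner",
    "follow_up": "memory",
    "quick_lookup": "expert",
}


def router(state):
    # Anything outside the known categories is treated as out of scope.
    return ROUTES.get(state["category"], "end")


def planner(state):
    return "tools" if state["pending_tools"] > 0 else "reflection"


def tools(state):
    state["pending_tools"] -= 1  # stand-in for running one deterministic tool
    return "planner"


def reflection(state):
    if state["needs_revision"] and state["retries"] < MAX_RETRIES:
        state["retries"] += 1
        state["needs_revision"] = False  # pretend the revision fixed the draft
        return "planner"
    return "end"


NODES = {
    "router": router,
    "summarizer": lambda state: "refiner",
    "refiner": lambda state: "memory",
    "memory": lambda state: "planner",
    "planner": planner,
    "tools": tools,
    "expert": lambda state: "reflection",
    "reflection": reflection,
}


def run(category, pending_tools=0, needs_revision=False):
    """Walk the graph from the router and return the list of visited nodes."""
    state = {
        "category": category,
        "pending_tools": pending_tools,
        "needs_revision": needs_revision,
        "retries": 0,
    }
    path = []
    node = "router"
    while node != "end":
        path.append(node)
        node = NODES[node](state)
    return path


print(run("analysis", pending_tools=1, needs_revision=True))
print(run("quick_lookup"))
```

Bounding the reflection loop with a retry cap is one way to keep a rejected draft from cycling between Planner and Reflection indefinitely.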

4. Multi-Agent Reasoning Engine

The core reasoning engine operates using a graph-based state machine, preventing monolithic prompt failures. The system is broken down into specialized execution nodes (diagrammed above), tailored prompting, and discrete tools.

4.1. Deterministic Tool Selection

Instead of hallucinating math, agents trigger exact programmatic tools:

Calculator Tool: Executes strict arithmetic operations.
Forecast Tool: Generates compound interest and predictive asset trajectories mathematically.
Data Quality Tool: Assesses confidence and completeness in user-provided datasets.
User Data & File Content Tools: Extract specific rows and tables from securely uploaded CSVs and structured PDFs.
Knowledge Base Tool: Triggers semantic searches against official financial datasets and compliance guidelines.
Output Builder Tools: Enforces structured response formats, ensuring the frontend can render complex visualizations instead of plaintext.
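As an illustration of the Calculator and Forecast tools, the sketch below evaluates arithmetic with `decimal.Decimal` by walking a parsed syntax tree instead of calling `eval()`, and computes compound growth as A = P(1 + r/n)^(nt). The function names `calculate` and `compound_growth` and the supported operator set are assumptions, not the project's actual tool interfaces.

```python
import ast
import operator
from decimal import Decimal

_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculate(expression: str) -> Decimal:
    """Evaluate +, -, *, / and parentheses on numeric literals, without eval()."""

    def _eval(node: ast.AST) -> Decimal:
        if isinstance(node, ast.Constant):
            if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
                # str() keeps the short literal form, so 0.1 becomes Decimal("0.1").
                return Decimal(str(node.value))
        elif isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        # Names, calls, attribute access, and everything else are rejected.
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return _eval(ast.parse(expression, mode="eval").body)


def compound_growth(principal: Decimal, annual_rate: Decimal,
                    periods_per_year: int, years: int) -> Decimal:
    """A = P * (1 + r/n) ** (n * t), rounded to two decimal places."""
    growth = (1 + annual_rate / periods_per_year) ** (periods_per_year * years)
    return (principal * growth).quantize(Decimal("0.01"))


print(calculate("0.1 + 0.2"))
print(compound_growth(Decimal("1000"), Decimal("0.10"), 1, 2))
```

Using `Decimal` avoids binary floating-point artifacts such as `0.1 + 0.2 != 0.3`, which matters when the output is presented as an exact figure.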

4.2. RAG & Knowledge Base

A dedicated vector search pipeline anchors the agents' financial reasoning. This Retrieval-Augmented Generation (RAG) approach prevents outdated or hallucinated advice by prioritizing indexed tax codes and official documentation.

1. Embedding & Ingestion Pipeline

Complex financial documents and institutional guidelines are parsed, chunked semantically, and embedded into dense vector representations. This makes even deeply technical taxation rules and investment guidelines searchable by vector similarity.

2. Intelligent Retrieval Algorithms

When an agent encounters a domain-specific query, the system queries the vector store instead of relying solely on the model's built-in knowledge. It retrieves only the most contextually relevant document chunks, ranked by similarity score in the embedding space.

3. Contextual Injection

The retrieved factual data is rigidly injected into the agent's active context window alongside stringent system prompts. This forces the model to synthesize its answers strictly from the ingested sources rather than generating unverified assumptions.

```mermaid
graph TD
    User[User Uploads PDF/Image/CSV] --> Parser[Document Parser / AWS Textract]
    Parser --> Chunker[Semantic Chunker]
    Chunker --> Embedder[Embedding Model]
    Embedder --> VectorDB[(Vector DB / Pinecone)]

    Query[User AI Prompt] --> QueryRefiner[Query Refiner Node]
    QueryRefiner --> SearchEmbedder[Embed Refined Query]
    SearchEmbedder --> VectorDB
    VectorDB -- Return Top-K Chunks --> Context[Agent Context Window]
    Context --> Planner[Finance Expert / Planner]
```
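The retrieval and injection steps can be sketched without any external service. The example below ranks chunks by cosine similarity and builds a prompt that cites them. The three-dimensional vectors and chunk titles are toy placeholders standing in for real embeddings and documents, and `top_k` and `build_prompt` are hypothetical helper names; in the described system this ranking would be delegated to the vector database.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vector, index, k=2):
    """Return the k chunk texts whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query_vector, vector), text) for text, vector in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


def build_prompt(question, chunks):
    """Inject retrieved chunks as numbered sources ahead of the question."""
    sources = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer using only the numbered sources and cite them.\n"
        f"Sources:\n{sources}\n"
        f"Question: {question}"
    )


# Toy index: (chunk text, embedding) pairs with made-up vectors.
INDEX = [
    ("VAT registration threshold guidance", [0.9, 0.1, 0.0]),
    ("Working capital loan eligibility", [0.1, 0.9, 0.1]),
    ("Current ratio benchmarks for retail", [0.0, 0.2, 0.9]),
]

query_vector = [0.2, 0.8, 0.1]  # pretend embedding of a loan question
chunks = top_k(query_vector, INDEX, k=2)
print(build_prompt("Can my shop qualify for a working capital loan?", chunks))
```

Numbering the sources in the prompt gives a reviewing step something concrete to check citations against.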

5. Backend Services & Core Infrastructure

The backend provides the services a user-facing product needs: real-time streaming, process cancellation, rate limiting, document processing, and exports.

5.1. Core Services

WebSocket Streaming Service: Maintains stateful, threaded connections handling real-time token pushing and operational checkpoints.

```mermaid
sequenceDiagram
    participant UI as React Frontend
    participant WS as WebSocket Layer
    participant Engine as LangGraph Engine
    participant Node as Active Agent Node

    UI->>WS: User message ("Explain Roth IRA")
    WS->>Engine: init_graph_stream()
    loop Graph Execution (Router -> Planner -> Tools)
        Engine->>Node: Execute Current Node
        Node-->>Engine: Yield Intermediate State (Thinking)
        Engine-->>WS: Stream Event (type: thinking, data: Node State)
        WS-->>UI: Update "Agent is analyzing..." UI
        alt Tool Invocation
            Node->>Node: Execute Local Function
            Node-->>WS: Stream Event (type: tool_call)
            WS-->>UI: Display Tool Spinner details
        end
    end
    Engine-->>WS: Stream Final Generation
    WS-->>UI: Render Final Formatted Markdown/Canvas
```
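One possible shape for the streamed events in this diagram is a small typed envelope. The sketch below is an assumption: the `thinking` and `tool_call` type names come from the diagram, while the `final` type, the `StreamEvent` class, and the payload keys are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Iterator

# "thinking" and "tool_call" appear in the diagram; "final" is an assumed name.
ALLOWED_TYPES = ("thinking", "tool_call", "final")


@dataclass(frozen=True)
class StreamEvent:
    type: str
    data: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject unknown event types before they reach the socket.
        if self.type not in ALLOWED_TYPES:
            raise ValueError(f"unknown event type: {self.type!r}")

    def to_json(self) -> str:
        return json.dumps(asdict(self))


def stream_events(node_names: list[str], answer: str) -> Iterator[StreamEvent]:
    """Yield one thinking event per executed node, then the final answer."""
    for name in node_names:
        yield StreamEvent("thinking", {"node": name})
    yield StreamEvent("final", {"markdown": answer})


for event in stream_events(["router", "planner"], "Your cash flow summary..."):
    print(event.to_json())
```

Validating the event type at construction time means a malformed event fails on the server instead of producing an unrenderable message in the client.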

Process Cancellation Mechanism: Features a robust, signal-based interruption system. If a user halts a query or disconnects mid-stream, the backend instantly terminates the active AI process and background graph loops, freeing up compute and preventing runaway API costs.
Rate Limit & Resource Protection: A Redis-backed sliding-window limiter, implemented with Lua scripts, tracks Tokens-Per-Minute (TPM). It mitigates API spam, bounds recursive agent loops, and protects overall cloud budgets.

```mermaid
sequenceDiagram
    participant Client
    participant API as Backend Service
    participant Redis as Redis (Lua Scripts)
    participant LLM as AI Provider

    Client->>API: Submit Financial Query
    API->>API: Estimate Token Cost
    API->>Redis: eval(check_and_reserve, estimated_tokens)

    alt Tokens Exhausted
        Redis-->>API: Reject (Returns Retry-After)
        API-->>Client: 429 Error / Threshold Reached UI
    else Tokens Available
        Redis-->>API: Approve & Reserve Allocation
        API->>LLM: Execute Agent Graph
        LLM-->>API: Return Final Generation
        API->>Redis: record_actual(actual_tokens_used)
        API-->>Client: Stream Final Response
    end
```
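The reserve-then-reconcile flow in this diagram can be illustrated with an in-memory stand-in. This sketch is not the Redis Lua script: it keeps the window in a Python `deque`, is not atomic across processes, and the class name, method return shapes, and 60-second window are assumptions. The method names mirror `check_and_reserve` and `record_actual` from the diagram.

```python
import time
from collections import deque


class SlidingWindowTokenLimiter:
    """In-memory sketch of a tokens-per-minute sliding window."""

    def __init__(self, limit_tpm, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit_tpm
        self.window = window_seconds
        self.clock = clock
        self.entries = deque()  # each entry is [timestamp, tokens]

    def _evict(self, now):
        # Drop reservations that have slid out of the window.
        while self.entries and self.entries[0][0] <= now - self.window:
            self.entries.popleft()

    def check_and_reserve(self, estimated_tokens):
        """Return (allowed, retry_after_seconds, reservation)."""
        now = self.clock()
        self._evict(now)
        used = sum(tokens for _, tokens in self.entries)
        if used + estimated_tokens > self.limit:
            # Simplification: suggest retrying when the oldest entry expires.
            retry_after = self.entries[0][0] + self.window - now if self.entries else 0.0
            return False, retry_after, None
        reservation = [now, estimated_tokens]
        self.entries.append(reservation)
        return True, 0.0, reservation

    def record_actual(self, reservation, actual_tokens):
        # Replace the estimate with the real usage once the call finishes.
        reservation[1] = actual_tokens


fake_now = [0.0]
limiter = SlidingWindowTokenLimiter(limit_tpm=1000, clock=lambda: fake_now[0])
allowed, _, reservation = limiter.check_and_reserve(800)
limiter.record_actual(reservation, 600)
print(allowed, limiter.check_and_reserve(500))
```

Reserving the estimate up front prevents concurrent requests from all passing the check before any of them has recorded usage, which is the reason the diagram performs check and reserve in a single script call.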

Document Processing: Securely ingests PDFs, Images (via OCR), and CSVs, converting unstructured data into structured schemas optimized for contextual analysis.
Canvas Exports: Packages dynamically generated financial plans into distributable, formalized document formats.
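The process cancellation mechanism listed above can be illustrated with `asyncio` task cancellation. This is a sketch of the general pattern, assuming the graph run is an `asyncio` task; the document does not specify how its signal-based system is implemented, and `run_graph` and `handle_session` are hypothetical names.

```python
import asyncio


async def run_graph(steps, log):
    """Stand-in for a long-running agent graph execution."""
    try:
        for step in range(steps):
            log.append(f"step {step}")
            await asyncio.sleep(0.01)  # placeholder for an LLM or tool call
        log.append("done")
    except asyncio.CancelledError:
        log.append("cancelled")  # release resources here, then re-raise
        raise


async def handle_session():
    log = []
    task = asyncio.create_task(run_graph(100, log))
    await asyncio.sleep(0.03)  # the client disconnects mid-stream
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass  # expected: the task was cancelled on purpose
    return log


log = asyncio.run(handle_session())
print(log[-1])
```

Cancellation is delivered at the next `await`, so each model or tool call in the loop is a point where the run can stop without issuing further paid requests.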

5.2. Database & State Layer

The persistence layer is a relational database (PostgreSQL) with a schema designed around graph execution state:

Conversation Mapping: Intricately maps complex graph execution states to specific user sessions.
Financial Indexing: Utilizes composite indexing for rapid multi-dimensional filtering across transaction histories.
Telemetry Tracking: Records granular user metadata, API budgets, branching logic paths, and document upload summaries.
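To show what composite indexing means in practice, here is a minimal sketch using the standard-library `sqlite3` module as a stand-in for PostgreSQL. The `transactions` table, its columns, and the index name are hypothetical; the point is the column order, with the equality-filtered column first and the range-filtered column second.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE transactions (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        txn_date TEXT NOT NULL,        -- ISO 8601 date string
        category TEXT NOT NULL,
        amount_cents INTEGER NOT NULL  -- integer cents avoid float rounding
    )
    """
)
# Composite index: equality column first, then the range column.
conn.execute(
    "CREATE INDEX ix_txn_user_date_category "
    "ON transactions (user_id, txn_date, category)"
)
conn.executemany(
    "INSERT INTO transactions (user_id, txn_date, category, amount_cents) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, "2024-01-05", "sales", 120000),
        (1, "2024-02-10", "rent", -45000),
        (2, "2024-02-11", "sales", 80000),
    ],
)


def transactions_since(user_id, start_date):
    """Filter by user (equality) and date (range), matching the index prefix."""
    cursor = conn.execute(
        "SELECT txn_date, category, amount_cents FROM transactions "
        "WHERE user_id = ? AND txn_date >= ? ORDER BY txn_date",
        (user_id, start_date),
    )
    return cursor.fetchall()


print(transactions_since(1, "2024-02-01"))
```

The same column-ordering principle applies to a PostgreSQL B-tree index, although the exact index definitions used by the project are not shown in this document.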

6. Frontend Environment & Workspace

The frontend is a React single-page application that uses component-based rendering and centralized state management to keep the interface responsive while agent output streams in.

6.1. The Interactive Conversational Interface

Real-Time Thinking Steps: Shows each intermediate step and tool call as the agent executes it, making the advisory process transparent.
Integrated File Management: Provides a drag-and-drop zone and an interactive preview layer for uploading and inspecting financial documents.
Structured Input Templates: Provides predefined, visually guided templates and dynamic forms for structuring complex financial queries. This ensures the routing agents receive normalized, high-quality parameters immediately rather than relying solely on raw, unstructured conversational input.
Streaming Render Engine: Parses incoming WebSocket events and renders them as UI components as they arrive.

6.2. Rich UI Tooling & Canvas Environment

Instead of constraining answers to plaintext, the backend emits structured outputs that the UI renders as interactive blocks:

Dynamic Charts: Renders interactive financial trajectories, cash flow graphs, and predictive asset allocations.
Comparison Tables: Formats side-by-side asset, loan, or investment comparisons for immediate visual contrast.
Canvas Workspace: A collaborative document environment where users apply layouts and utilize AI-driven inline editing tools to refine financial reports originally drafted by the agent.

6.3. Non-Linear Contextual Branching

Financial planning is rarely a straight line; users frequently want to explore alternative strategies based on hypothetical scenarios (e.g., "What if I invest more aggressively?" versus "What if I use that capital to pay down loans?").

The Branching Mechanism: When a user selects a specific historical message in the chat timeline to "branch" from, the frontend sends a state-fork payload to the backend. The backend executes a deep copy of the complex LangGraph execution state, memory entities, and financial variables up to that exact message ID. This generates a completely new, independent session ID. The frontend then instantiates a parallel chat workspace, allowing the user to continue the conversation in a new direction. All subsequent deterministic calculations and tool iterations in this new branch operate strictly against the forked state, leaving the original conversation completely untouched.
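A minimal sketch of the fork operation follows, assuming sessions live in a dict and each message stores a snapshot of the financial variables as of that message. The `fork_session` name, the session layout, and the `variables` key are hypothetical; a real implementation would also copy the graph execution state.

```python
import copy
import uuid


def fork_session(sessions, parent_id, message_id):
    """Deep-copy history up to message_id into a new, independent session."""
    parent = sessions[parent_id]
    for position, message in enumerate(parent["messages"]):
        if message["id"] == message_id:
            break
    else:
        raise KeyError(f"message {message_id!r} not found in session {parent_id!r}")

    child_id = str(uuid.uuid4())
    sessions[child_id] = {
        "parent_id": parent_id,
        # deepcopy ensures later edits in the branch cannot reach the parent.
        "messages": copy.deepcopy(parent["messages"][: position + 1]),
    }
    return child_id


sessions = {
    "s1": {
        "parent_id": None,
        "messages": [
            {"id": "m1", "text": "Baseline plan", "variables": {"loan_repayment": 500}},
            {"id": "m2", "text": "Raise repayment", "variables": {"loan_repayment": 800}},
        ],
    }
}

branch_id = fork_session(sessions, "s1", "m1")
sessions[branch_id]["messages"][-1]["variables"]["loan_repayment"] = 300
print(sessions["s1"]["messages"][0]["variables"])
```

Because the branch holds its own copies, the parent still reports the original repayment value after the branch is modified.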

Comparison with Standard AI Limitations:

Feature	Traditional Linear Conversation	Non-Linear Contextual Branching
Scenario Exploration	Restricts users to a single timeline; "what-if" questions pollute the original context.	Empowers side-by-side strategy comparisons in safe, isolated workspaces.
State Management	Retains a single, monolithic conversation history which becomes easily bloated.	Functions intrinsically like a Git tree, managing multiple divergent state forks.
Context Pollution Risk	High. Modifying parameters mid-conversation often confuses the model's memory recall.	Zero. The parallel branch is an independent state; changes here cannot leak into the parent.
UI Experience	Forces endless scrolling and repetitive queries to reset previous parameters.	Provides distinct tabs/workspaces to toggle instantly between the parent plan and the alternate strategy.

7. External AI & Cloud Services

The platform integrates with top-tier AI providers and highly reliable cloud infrastructure to ensure maximal accuracy, privacy, and speed.

7.1. Foundation Models & AI Providers

Google (Gemini 2.5 Flash): The primary reasoning model, used to execute complex LangGraph nodes and to interpret tool outputs reliably.
Google (Gemini 2.5 Flash Lite): Specialized lightweight model optimized for high-throughput classification, routing, and real-time data ingestion tasks.
OpenAI: Powers the embedding architecture, converting financial semantic data into dense vector representations for the retrieval-augmented generation (RAG) loops.
Ollama: Provides a local, offline environment to rapidly test agent logic variations without incurring cloud API costs.
Pinecone: The vector database, hosting the embedded compliance guidance and tax codes for low-latency similarity search.

7.2. Cloud Infrastructure (AWS & Partners)

Amazon Textract: Specialized machine learning service utilized to automatically extract text, handwriting, and highly structured data from scanned financial PDFs and images.
AWS Lambda (Optional): Can be configured as a secure, sandboxed execution environment for deterministic tools, ensuring that any code execution triggered by the agents is isolated and cannot affect the core backend stability.
Amazon S3: Secure, highly available cloud storage bucket utilized for persisting encrypted user uploads before parsing and pipeline injection.
Neon DB (PostgreSQL): Alternative persistence layer designated for relational synchronization and rapid cloud prototyping.
Docker / AWS ECS: The underlying packaging and orchestration logic enabling the Python servers to deploy predictably across load-balanced AWS environments.

8. Technology Stack & Library Ecosystem

A comprehensive listing of the active libraries and systems powering each architectural layer.

8.1. Agent & LLM Ecosystem (Python)

Framework/Library	Primary Function
`langgraph` & `langchain-core`	Orchestrates the multi-agent state machine and standardizes LLM interfaces.
`langchain-anthropic` & `langchain-google-genai`	LLM provider integrations used for inference inside graph nodes.
`langchain-openai` & `langchain-pinecone`	Drives the vector-embeddings pipeline and Retrieval-Augmented Generation.
`pandas` & `numpy`	Underpins the high-fidelity tabular data summarization nodes.
`RestrictedPython`	Restricted execution environment that limits what tool-triggered Python code can do.

8.2. Core Backend (Python)

Framework/Library	Primary Function
`fastapi` & `uvicorn`	High-performance ASGI framework and server managing HTTP/WebSockets.
`sqlalchemy` & `asyncpg`	Async Object-Relational Mapper (ORM) and the async PostgreSQL driver it runs on.
`alembic`	Safely manages and applies relational database schema migrations.
`redis`	Underpins the Tokens-Per-Minute rate limiter and caches WebSocket states.
`boto3` & `amazon-textract-textractor`	Leverages AWS cloud services for deep OCR capabilities.
`pydantic`	Enforces rigid data validation schemas across API requests and tool outputs.
`pdfplumber`	Structurally parses complex financial PDF documents for ingestion.
`pytest` & `pytest-asyncio`	Drives the rigorous automated testing and benchmarking suite.

8.3. Frontend Application (Node/React)

Framework/Library	Primary Function
`react` & `react-dom`	Component-based visualization library operating the core client layer.
`vite`	Frontend build tool and dev server with fast hot module replacement (HMR).
`zustand`	Lightweight, unopinionated state manager tracking auth and agent output buffers.
`recharts` & `chart.js`	Render financial charts from data computed deterministically by the backend.
`tailwind-merge` & `clsx`	Compose conditional Tailwind CSS class names and resolve conflicting utilities.
`framer-motion`	Animates the visual transitions of the `ThinkingSteps` component.
`react-markdown` & `remark-gfm`	Parses the LLM markdown streams securely for the interactive Canvas.
`@radix-ui/react-*`	Provides accessible component primitives (Dialogs, Tooltips, Modals).

9. Testing & Benchmarking

Given the non-deterministic nature of foundational language models, a stringent, metrics-driven evaluation suite enforces operational integrity.

Automated Evaluations: Track model latency and routing-path accuracy, and continuously measure semantic similarity against a locked golden dataset to detect regressions.
Comprehensive Unit & Flow Tests: Deep code coverage spanning from deterministic calculator validation and isolated agent node logic to holistic multi-turn state evaluations and rigid guardrail safety verifications.
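A routing-accuracy check against a golden dataset might look like the sketch below. The golden cases, the `keyword_router` stub (standing in for the real Router node), and the `evaluate_routing` helper are all hypothetical; latency and semantic-similarity measurements are omitted.

```python
# Toy golden dataset: query text paired with the route it should receive.
GOLDEN = [
    {"query": "Here is my sales ledger for March", "expected_route": "data_upload"},
    {"query": "What is a current ratio?", "expected_route": "quick_lookup"},
    {"query": "Can I afford a second delivery van next quarter?", "expected_route": "analysis"},
    {"query": "And what about the month after?", "expected_route": "follow_up"},
]


def keyword_router(query):
    """Deliberately naive stand-in for the Router node."""
    text = query.lower()
    if "ledger" in text or "upload" in text:
        return "data_upload"
    if "what is" in text:
        return "quick_lookup"
    return "analysis"


def evaluate_routing(golden, classify):
    """Return (accuracy, failing cases) for a routing function."""
    failures = [case for case in golden if classify(case["query"]) != case["expected_route"]]
    accuracy = 1 - len(failures) / len(golden)
    return accuracy, failures


accuracy, failures = evaluate_routing(GOLDEN, keyword_router)
print(f"routing accuracy: {accuracy:.2f}")
for case in failures:
    print("misrouted:", case["query"])
```

Returning the failing cases alongside the score makes a regression actionable, since the report names the queries that changed route.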

10. Project Directory Structure

The repository is organized into distinct environments, isolating the frontend client from the core backend logic, evaluation scripts, and internal testing frameworks.

```
finadvisor-ai/
├── backend/            # Core FastAPI server and LangGraph execution environment
│   ├── agents/         # Multi-agent architecture logic
│   │   ├── knowledge/  # Vector embeddings and RAG document retrieval logic
│   │   ├── nodes/      # LangGraph state nodes (Planner, Expert, Summarizer, etc.)
│   │   ├── prompts/    # Parameterized contextual instructions for LLM execution
│   │   └── tools/      # Deterministic mathematical and data-extraction tools
│   ├── app/            # General API models and robust service layer
│   │   ├── api/        # REST endpoints and dependency injections
│   │   ├── core/       # Logging, exceptions, Redis configuration, and security
│   │   ├── models/     # SQLAlchemy ORM definitions (users, sessions, transactions)
│   │   ├── services/   # Business logic (rate limiting, file uploads, canvas exports)
│   │   └── websocket/  # Real-time stateful streaming connections
│   └── migrations/     # Alembic database migration schemas and version history
├── frontend/           # React/Vite SPA client application
│   └── src/
│       ├── api/        # REST and WebSocket client connectors mapping to backend services
│       ├── components/ # React UI components
│       │   ├── chat/       # Message rendering, real-time ThinkingSteps, branching dialogs
│       │   ├── landing/    # Marketing pages and user onboarding flows
│       │   ├── templates/  # Dynamic structured forms for query normalization
│       │   ├── ui/         # Buttons, Modals, Loaders, and other reusable UI elements
│       │   └── tools/      # Canvas workspace, Chart renderers, and Data tables
│       ├── hooks/      # Custom React hooks for state management and WebSocket handling
│       ├── pages/      # Top-level application routing (Chat, Dashboards, Landing)
│       ├── lib/        # Utility functions for data formatting, token counting, and embedding visualizers
│       └── stores/     # Zustand state managers (Auth, Chat Session, UI Themes)
├── scripts/            # Benchmark scripts, seed data logic, and evaluation configurations
└── tests/              # Pytest suite covering agent logic, multi-turn flows, and guardrails
```

11. Installation & Setup

11.1 Containerized Setup (Recommended)

Environment Setup: Create a .env file at the project root with the content below, and fill in your valid API keys:

```
# LLMs
GOOGLE_API_KEY=your-api-key-here

# Pinecone (local Docker emulator)
USE_PINECONE_LOCAL=true

# AWS Credentials
AWS_ACCESS_KEY_ID=aws-access-key-id-placeholder
AWS_SECRET_ACCESS_KEY=aws-secret-access-key-placeholder
AWS_REGION=us-east-1

OCR_BACKEND=textract
AWS_S3_BUCKET=aws-bucket-placeholder
STORAGE_BACKEND=s3
LAMBDA_CALCULATOR_ENABLED=true
LAMBDA_CALCULATOR_FUNCTION=finadvisor-calculator
```

Build and Run Containers:
- Build images: `docker compose build`
- Run built containers: `docker compose up -d`
Access the Application: Open your browser and go to http://localhost:8080/ to get started.

11.2 Native Development Setup

For faster iteration during development, you can run the services natively using the provided Makefile:

Infrastructure: Start the supporting services (Postgres, Redis, Pinecone local) via Docker:
```
make dev-infra
```

Backend Setup:

```
cd backend
python3 -m venv venv
source venv/bin/activate  # On Windows: .\venv\Scripts\activate
pip install -r requirements.txt
cd ..
make db-migrate
make db-seed
make dev-backend
```

Frontend Setup:

```
cd frontend
npm install
cd ..
make dev-frontend
```

Database URL: When running natively, ensure your .env file uses localhost for service connections (e.g., DATABASE_URL=postgresql+asyncpg://finadvisor:finadvisor@localhost:5432/finadvisor).

12. License

Distributed under the MIT License. See LICENSE for more information.
