
Answer42 — AI Agent Catalogue

Reference documentation for the 9 agents of the multi-agent processing pipeline and their 9 Ollama fallback counterparts. All agents extend AbstractConfigurableAgent. Fallback agents are wired through FallbackAgentFactory.


Pipeline Overview

Papers uploaded to Answer42 are automatically processed through a sequential 9-step Spring Batch pipeline. Each step runs a specialised AI agent and stores its output in the answer42 schema. If the primary cloud provider fails, the corresponding Ollama fallback agent takes over transparently.

Upload → [1]PaperProcessor → [2]MetadataEnhancement → [3]ContentSummarizer
       → [4]ConceptExplainer → [5]QualityChecker → [6]CitationFormatter
       → [7]RelatedPaperDiscovery → [8]CitationVerifier → [9]PerplexityResearch
       → Complete

Pipeline Agents

1. PaperProcessorAgent

Class service/agent/PaperProcessorAgent.java
Fallback service/agent/PaperProcessorFallbackAgent.java
Provider OpenAI GPT-4 (fallback: Ollama llama3.1:8b)
Input Raw PDF text extracted from uploaded file
Output papers.text_content, section map, figure/table metadata
Purpose PDF structure analysis: section identification, table/figure recognition, mathematical notation processing

2. MetadataEnhancementAgent

Class service/agent/MetadataEnhancementAgent.java
Fallback service/agent/MetadataEnhancementFallbackAgent.java
Provider OpenAI GPT-4 + Crossref API + Semantic Scholar API
Input Paper text content, DOI/title from step 1
Output papers.doi, papers.authors, papers.journal, citation metrics, keywords
Purpose Enriches bibliographic metadata via external APIs; extracts structured author/venue data

3. ContentSummarizerAgent

Class service/agent/ContentSummarizerAgent.java
Fallback service/agent/ContentSummarizerFallbackAgent.java
Provider Anthropic Claude (fallback: Ollama with 8K char truncation)
Input Full paper text content
Output papers.summary_brief, papers.summary_standard, papers.summary_detailed, papers.key_findings
Purpose Multi-level summarisation (brief / standard / detailed) with key findings extraction

4. ConceptExplainerAgent

Class service/agent/ConceptExplainerAgent.java
Fallback service/agent/ConceptExplainerFallbackAgent.java
Provider OpenAI GPT-4
Input Paper text content
Output papers.glossary, papers.main_concepts
Purpose Identifies and explains domain-specific terms; adapts explanations to multiple education levels

5. QualityCheckerAgent

Class service/agent/QualityCheckerAgent.java
Fallback service/agent/QualityCheckerFallbackAgent.java
Provider Anthropic Claude
Input Paper text content, metadata from prior steps
Output papers.quality_score (0-100), papers.quality_feedback, letter grade (A-F)
Purpose Comprehensive quality assessment: methodology, statistical rigour, bias detection, reproducibility

6. CitationFormatterAgent

Class service/agent/CitationFormatterAgent.java
Fallback service/agent/CitationFormatterFallbackAgent.java
Provider OpenAI GPT-4 (fallback uses regex-based formatting)
Input Raw reference list from paper
Output papers.citations (APA, MLA, Chicago, IEEE formats)
Purpose Parses and reformats bibliography entries into standardised citation styles

7. RelatedPaperDiscoveryAgent

Class service/agent/RelatedPaperDiscoveryAgent.java
Fallback service/agent/RelatedPaperDiscoveryFallbackAgent.java
Provider Anthropic Claude + Perplexity API (fallback: Ollama + rule-based synthesis)
Input Paper metadata, keywords, abstract
Output discovered_papers, paper_relationships, discovery_results tables
Purpose Multi-source discovery via Crossref, Semantic Scholar, and Perplexity; AI synthesis and relevance ranking

8. CitationVerifierAgent

Class service/agent/CitationVerifierAgent.java
Fallback service/agent/CitationVerifierFallbackAgent.java
Provider OpenAI GPT-4
Input Formatted citations from step 6
Output papers.citation_verification — accuracy scores, DOI resolution status
Purpose Validates citation accuracy; resolves DOIs; flags missing or incorrect bibliographic data

9. PerplexityResearchAgent

Class service/agent/PerplexityResearchAgent.java
Fallback service/agent/PerplexityResearchFallbackAgent.java
Provider Perplexity sonar-pro (fallback: Ollama local analysis)
Input Paper abstract, key findings, research questions
Output papers.research_questions, papers.methodology_details, external context
Purpose Real-time web research, fact verification, trend analysis, and research context enrichment

Fallback System

All 9 agents have dedicated Ollama fallback counterparts managed by FallbackAgentFactory.

| Trigger | Behaviour |
|---|---|
| Cloud provider HTTP error or timeout | FallbackAgentFactory selects the matching fallback agent |
| Retries exhausted (configurable via spring.ai.fallback.retry-after-failures) | Ollama local model invoked |
| Ollama health check interval | FALLBACK_HEALTH_CHECK_INTERVAL (default 30 s) |
| Content limit for local models | 8K-character truncation to fit the llama3.1:8b context window |

Fallback configuration (.env or application.properties):

spring.ai.fallback.enabled=true
spring.ai.fallback.retry-after-failures=3
spring.ai.fallback.timeout-seconds=60
spring.ai.ollama.base-url=http://localhost:11434
spring.ai.ollama.chat.options.model=llama3.1:8b
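The retry-then-fallback behaviour configured above can be sketched in plain Java. This is illustrative only; the method names below are hypothetical, not the real FallbackAgentFactory API.

```java
import java.util.function.Supplier;

// Illustrative sketch of the retry-then-fallback flow; names and
// signatures here are hypothetical, not the real Answer42 API.
class FallbackSketch {
    static final int RETRY_AFTER_FAILURES = 3; // spring.ai.fallback.retry-after-failures

    // Try the cloud provider up to the configured number of times,
    // then hand the request over to the local Ollama fallback agent.
    static String runWithFallback(Supplier<String> cloudAgent, Supplier<String> ollamaAgent) {
        for (int attempt = 1; attempt <= RETRY_AFTER_FAILURES; attempt++) {
            try {
                return cloudAgent.get();
            } catch (RuntimeException e) {
                // HTTP error or timeout: retry until the budget is exhausted
            }
        }
        return ollamaAgent.get();
    }

    public static void main(String[] args) {
        String result = runWithFallback(
            () -> { throw new RuntimeException("cloud provider timeout"); },
            () -> "ollama result");
        System.out.println(result); // prints "ollama result"
    }
}
```

In the real pipeline this hand-off happens inside the Spring Batch step, so downstream steps never need to know which provider produced the output.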

Base Classes & Shared Infrastructure

| Class | Purpose |
|---|---|
| AbstractConfigurableAgent | Base class: retry policies, circuit breakers, rate limiting, cost tracking |
| AnthropicBasedAgent | Optimised prompt/response handling for Anthropic Claude |
| OpenAIBasedAgent | Optimised handling for OpenAI GPT-4 |
| PerplexityBasedAgent | Optimised handling for Perplexity sonar-pro |
| FallbackAgentFactory | Selects and instantiates the correct Ollama fallback agent |
| AgentMemoryStore | Per-paper context store; shared across all 9 steps in a pipeline run |
| CreditService | Tracks token usage and deducts credits per agent invocation |

Agent Development Checklist

When adding a new agent:

  • Extend AbstractConfigurableAgent
  • Create a corresponding *FallbackAgent extending the same base
  • Register the fallback in FallbackAgentFactory
  • Add a Spring Batch Step and wire it in the job configuration
  • Persist all outputs to the answer42 schema via the appropriate repository
  • Use LoggingUtil for all logging — never System.out or raw Logger
  • Keep the class under 300 lines; extract helpers as needed
  • Write unit tests mocking the AI provider and verifying JSON output parsing
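The last checklist item, mocking the AI provider, does not require a mocking framework; a hand-written stub is often enough. A minimal sketch follows, in which ProviderClient, GlossaryAgentSketch, and the JSON shape are hypothetical illustrations rather than the real Answer42 interfaces.

```java
// Sketch of a provider-stubbing unit test; ProviderClient and the
// JSON shape are hypothetical illustrations, not the real interfaces.
interface ProviderClient {
    String chat(String prompt);
}

class GlossaryAgentSketch {
    private final ProviderClient client;
    GlossaryAgentSketch(ProviderClient client) { this.client = client; }

    // Extract the "glossary" field from the provider's JSON reply.
    String glossaryOf(String paperText) {
        String json = client.chat("Extract glossary: " + paperText);
        int start = json.indexOf("\"glossary\":\"") + "\"glossary\":\"".length();
        int end = json.indexOf('"', start);
        return json.substring(start, end);
    }
}

class GlossaryAgentTest {
    public static void main(String[] args) {
        // Stub the AI provider with a canned JSON response: no network,
        // no API key, and a deterministic assertion on the parsed output.
        ProviderClient stub = prompt -> "{\"glossary\":\"qubit: quantum bit\"}";
        String glossary = new GlossaryAgentSketch(stub).glossaryOf("...");
        if (!glossary.equals("qubit: quantum bit")) throw new AssertionError(glossary);
        System.out.println("ok");
    }
}
```

Stubbing at the provider boundary keeps tests fast and deterministic, and exercises exactly the part most likely to break: parsing the model's JSON output.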