JAF Server API Documentation

The Juspay Agent Framework (JAF) provides a production-ready HTTP server that exposes agents through RESTful endpoints. This guide covers the complete server API, configuration options, and best practices for deployment.

Getting Started
Server Configuration
HTTP Endpoints
Authentication and CORS
Memory Provider Integration
Error Handling
Development vs Production
Performance and Scaling
Complete Examples

Getting Started

Basic Server Setup

import { runServer, makeLiteLLMProvider, createInMemoryProvider } from '@juspay-jaf/jaf';

const myAgent = {
  name: 'MyAgent',
  instructions: () => 'You are a helpful assistant',
  tools: []
};

const modelProvider = makeLiteLLMProvider('http://localhost:4000');
const memoryProvider = createInMemoryProvider();

const server = await runServer(
  [myAgent], 
  { modelProvider },
  { port: 3000, defaultMemoryProvider: memoryProvider }
);

Using createJAFServer for Advanced Configuration

import { createJAFServer } from '@juspay-jaf/jaf/server';

const server = createJAFServer({
  port: 3000,
  host: '0.0.0.0',
  cors: true,
  runConfig: {
    agentRegistry: new Map([['MyAgent', myAgent]]),
    modelProvider,
    maxTurns: 10,
    memory: {
      provider: memoryProvider,
      autoStore: true,
      maxMessages: 100
    }
  },
  agentRegistry: new Map([['MyAgent', myAgent]]),
  defaultMemoryProvider: memoryProvider
});

await server.start();

Server Configuration

ServerConfig Interface

interface ServerConfig<Ctx> {
  port?: number;                    // Default: 3000
  host?: string;                    // Default: '127.0.0.1'
  cors?: boolean;                   // Default: false
  runConfig: RunConfig<Ctx>;        // Required: Framework run configuration
  agentRegistry: Map<string, Agent<Ctx, any>>; // Required: Available agents
  defaultMemoryProvider?: MemoryProvider;      // Optional: Memory persistence
}

runServer Parameters

The runServer function provides a simplified interface:

async function runServer<Ctx>(
  agents: Map<string, Agent<Ctx, any>> | Agent<Ctx, any>[],
  runConfig: Omit<RunConfig<Ctx>, 'agentRegistry'>,
  options: Partial<Omit<ServerConfig<Ctx>, 'runConfig' | 'agentRegistry'>>
): Promise<ServerInstance>

Parameters:

agents: Either an array of agents or a Map of agent name to agent
runConfig: Framework configuration (modelProvider, maxTurns, etc.)
options: Server-specific options (port, host, cors, memory provider)

HTTP Endpoints

Health Check

GET /health

Returns server health status and basic information.

curl http://localhost:3000/health

Response:

{
  "status": "healthy",
  "timestamp": "2024-01-15T10:30:00.000Z",
  "version": "2.0.0",
  "uptime": 120000
}

List Available Agents

GET /agents

Returns all registered agents with their descriptions and available tools.

curl http://localhost:3000/agents

Response:

{
  "success": true,
  "data": {
    "agents": [
      {
        "name": "MathTutor",
        "description": "You are a helpful math tutor...",
        "tools": ["calculate"]
      },
      {
        "name": "ChatBot",
        "description": "You are a friendly chatbot...",
        "tools": ["greet"]
      }
    ]
  }
}

Chat with Agent

POST /chat

Main endpoint for interacting with agents.

curl -X POST http://localhost:3000/chat \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "What is 15 * 7?"}
    ],
    "agentName": "MathTutor",
    "conversationId": "demo-conversation-1",
    "context": {"userId": "user123", "permissions": ["user"]},
    "maxTurns": 5,
    "memory": {
      "autoStore": true,
      "maxMessages": 100
    }
  }'

Request Schema:

interface ChatRequest {
  messages: HttpMessage[];          // Required: Conversation messages
  agentName: string;               // Required: Target agent name
  context?: any;                   // Optional: Agent context
  maxTurns?: number;              // Optional: Override max turns
  stream?: boolean;               // Optional: When true, responds with Server-Sent Events (SSE)
  conversationId?: string;        // Optional: For conversation persistence
  memory?: {
    autoStore?: boolean;          // Default: true
    maxMessages?: number;         // Optional: Memory limit
    compressionThreshold?: number; // Optional: Compression trigger
  };
}

interface HttpMessage {
  role: 'user' | 'assistant' | 'system';
  content: string;
}

Response (non-streaming):

{
  "success": true,
  "data": {
    "runId": "run_abc123",
    "traceId": "trace_def456",
    "conversationId": "demo-conversation-1",
    "messages": [
      {"role": "user", "content": "What is 15 * 7?"},
      {"role": "assistant", "content": "I'll calculate that for you.", "tool_calls": [...]},
      {"role": "tool", "content": "15 * 7 = 105", "tool_call_id": "call_123"},
      {"role": "assistant", "content": "15 × 7 equals 105."}
    ],
    "outcome": {
      "status": "completed",
      "output": "15 × 7 equals 105."
    },
    "turnCount": 2,
    "executionTimeMs": 1500
  }
}

Streaming Responses (SSE)

Set stream: true to receive a Server-Sent Events stream instead of a JSON response. The server keeps the HTTP connection open and pushes events as they happen. Each event name corresponds to a TraceEvent.type.

curl -N -H "Content-Type: application/json" \
  -X POST http://localhost:3000/chat \
  -d '{
    "messages": [{"role": "user", "content": "Hi, I am Alice. What time is it?"}],
    "agentName": "Assistant",
    "stream": true,
    "context": {"userId": "user123"}
  }'

SSE event format:

Content-Type: text/event-stream
Each event:
- event: <TraceEvent.type>
- data: <JSON payload>
- blank line terminator

Example events you may see:

run_start
llm_call_start / llm_call_end
assistant_message
tool_requests
tool_call_start / tool_call_end
tool_results_to_llm
final_output
run_end

The stream starts with a stream_start event containing run metadata and ends with stream_end after run_end.

SSE Event Types and Payloads

Below is the list of SSE events you may receive when stream: true is set on /chat and their payload shapes. Events correspond to engine TraceEvents unless noted.

Event	Description	Payload (JSON)
`stream_start`	Server-only preface with metadata	`{ runId: string, traceId: string, conversationId?: string, agent: string }`
`run_start`	A new run has begun	`{ runId: string, traceId: string }`
`turn_start`	Turn started	`{ turn: number, agentName: string }`
`llm_call_start`	Engine is calling the model	`{ agentName: string, model: string }`
`llm_call_end`	Model responded (raw choice)	`{ choice: any }`
`token_usage`	Token usage reported by provider	`{ prompt?: number, completion?: number, total?: number, model?: string }`
`assistant_message`	Assistant message received from model	`{ message: { role: 'assistant', content?: string, tool_calls?: Array<{ id: string, type: 'function', function: { name: string, arguments: string } }> } }`
`tool_requests`	LLM requested tool invocations	`{ toolCalls: Array<{ id: string, name: string, args: any }>}`
`tool_call_start`	Tool execution started	`{ toolName: string, args: any }`
`tool_call_end`	Tool execution finished	`{ toolName: string, result: string, toolResult?: any, status?: string }`
`tool_results_to_llm`	Tool results appended for next LLM turn	`{ results: Array<{ role: 'tool', content: string, tool_call_id?: string }> }`
`handoff`	Agent handoff occurred	`{ from: string, to: string }`
`handoff_denied`	Agent handoff was rejected	`{ from: string, to: string, reason: string }`
`guardrail_violation`	Input/output guardrail blocked output	`{ stage: 'input'
`decode_error`	Output codec validation failed	`{ errors: any }`
`turn_end`	Turn finished	`{ turn: number, agentName: string }`
`final_output`	Final output before completion	`{ output: any }`
`run_end`	Run completed or failed	`{ outcome: { status: 'completed'
`stream_end`	Server-only terminator	`{ ended: true }`

Agent-Specific Chat

POST /agents/:agentName/chat

Convenience endpoint for chatting with a specific agent.

curl -X POST http://localhost:3000/agents/MathTutor/chat \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "What is 10 + 5?"}],
    "conversationId": "math-session-1",
    "context": {"userId": "user123"}
  }'

This endpoint automatically sets the agentName based on the URL parameter.

Memory Management Endpoints

These endpoints are only available when a defaultMemoryProvider is configured.

Get Conversation

GET /conversations/:conversationId

Retrieve stored conversation history.

curl http://localhost:3000/conversations/demo-conversation-1

Response:

{
  "success": true,
  "data": {
    "conversationId": "demo-conversation-1",
    "userId": "user123",
    "messages": [...],
    "metadata": {
      "createdAt": "2024-01-15T10:00:00.000Z",
      "updatedAt": "2024-01-15T10:30:00.000Z",
      "totalMessages": 8,
      "lastActivity": "2024-01-15T10:30:00.000Z"
    }
  }
}

Delete Conversation

DELETE /conversations/:conversationId

Remove conversation from memory.

curl -X DELETE http://localhost:3000/conversations/demo-conversation-1

Response:

{
  "success": true,
  "data": {
    "deleted": true
  }
}

Memory Health Check

GET /memory/health

Check memory provider health and connectivity.

curl http://localhost:3000/memory/health

Response:

{
  "success": true,
  "data": {
    "healthy": true,
    "latencyMs": 12
  }
}

Authentication and CORS

CORS Configuration

CORS is disabled by default for security. Enable it for web applications:

const server = await runServer(
  agents,
  runConfig,
  { 
    cors: true,  // Enables CORS with permissive settings
    port: 3000 
  }
);

Default CORS Settings (when enabled):

origin: true (allows all origins)
methods: ['GET', 'POST', 'PUT', 'DELETE', 'OPTIONS']
allowedHeaders: ['Content-Type', 'Authorization']

Authentication

JAF server doesn't provide built-in authentication. Implement authentication using:

Reverse Proxy: Use nginx, Apache, or cloud load balancers
Custom Middleware: Add Fastify hooks for authentication
API Gateway: Use AWS API Gateway, Kong, or similar solutions

Example custom authentication:

const server = createJAFServer(config);

// Add authentication hook before starting
server.app.addHook('preHandler', async (request, reply) => {
  if (request.url.startsWith('/health')) return; // Skip health check
  
  const authHeader = request.headers.authorization;
  if (!authHeader || !isValidToken(authHeader)) {
    return reply.code(401).send({ error: 'Unauthorized' });
  }
});

await server.start();

Memory Provider Integration

Server-Level Memory Configuration

Configure memory at the server level for automatic conversation persistence:

const memoryProvider = createInMemoryProvider();

const server = await runServer(
  agents,
  { modelProvider },
  { 
    defaultMemoryProvider: memoryProvider,
    port: 3000 
  }
);

Request-Level Memory Configuration

Override memory settings per request:

{
  "messages": [...],
  "agentName": "Assistant",
  "conversationId": "session-123",
  "memory": {
    "autoStore": true,
    "maxMessages": 50,
    "compressionThreshold": 20
  }
}

Memory Provider Types

In-Memory Provider

import { createInMemoryProvider } from '@juspay-jaf/jaf';

const memoryProvider = createInMemoryProvider({
  maxConversations: 1000,
  maxMessagesPerConversation: 1000
});

Redis Provider

import { createRedisProvider } from '@juspay-jaf/jaf';
import { createClient } from 'redis';

const redisClient = createClient({ url: 'redis://localhost:6379' });
await redisClient.connect();

const memoryProvider = await createRedisProvider({
  type: 'redis',
  keyPrefix: 'jaf:memory:',
  ttl: 3600 // 1 hour
}, redisClient);

PostgreSQL Provider

import { createPostgresProvider } from '@juspay-jaf/jaf';
import { Client } from 'pg';

const postgresClient = new Client({
  connectionString: 'postgresql://user:pass@localhost:5432/jaf'
});
await postgresClient.connect();

const memoryProvider = await createPostgresProvider({
  type: 'postgres',
  tableName: 'conversations'
}, postgresClient);

Error Handling

Standard Error Response Format

All endpoints return errors in a consistent format:

{
  "success": false,
  "error": "Error message describing what went wrong"
}

Common HTTP Status Codes

200: Success
400: Bad Request (invalid JSON, missing required fields)
404: Not Found (agent not found)
500: Internal Server Error (agent execution failed, memory provider error)

Note: For streaming (stream: true), a successful connection returns 200 and remains open until the run finishes or the client disconnects.

503: Service Unavailable (memory provider not configured)

Agent Execution Errors

When agent execution fails, the response includes detailed error information:

{
  "success": true,
  "data": {
    "outcome": {
      "status": "error",
      "error": {
        "_tag": "ToolCallError",
        "tool": "calculate",
        "detail": "Invalid expression"
      }
    },
    "turnCount": 1,
    "executionTimeMs": 500
  }
}

Memory Provider Errors

Memory-related endpoints return specific error information:

{
  "success": false,
  "error": "Memory provider not configured"
}

{
  "success": false,
  "error": "Conversation not found",
  "details": {
    "_tag": "MemoryNotFoundError",
    "conversationId": "missing-conversation",
    "provider": "redis"
  }
}

Development vs Production

Development Configuration

const server = await runServer(
  agents,
  {
    modelProvider: makeLiteLLMProvider('http://localhost:4000'),
    maxTurns: 10,
    onEvent: new ConsoleTraceCollector().collect
  },
  {
    port: 3000,
    host: '127.0.0.1',  // Local only
    cors: true,          // Permissive for development
    defaultMemoryProvider: createInMemoryProvider()
  }
);

Production Configuration

const server = await runServer(
  agents,
  {
    modelProvider: makeLiteLLMProvider(process.env.LITELLM_URL, process.env.LITELLM_API_KEY),
    maxTurns: 5,
    onEvent: productionTraceCollector.collect
  },
  {
    port: parseInt(process.env.PORT || '3000'),
    host: '0.0.0.0',     // Accept external connections
    cors: false,         // Use reverse proxy for CORS
    defaultMemoryProvider: await createPostgresProvider(/* production config */)
  }
);

Environment Variables

# Server Configuration
PORT=3000
HOST=0.0.0.0

# Model Provider
LITELLM_URL=https://api.litellm.ai
LITELLM_API_KEY=your_api_key
LITELLM_MODEL=gpt-4

# Memory Provider
JAF_MEMORY_TYPE=postgres
JAF_POSTGRES_CONNECTION_STRING=postgresql://user:pass@localhost:5432/jaf
JAF_POSTGRES_SSL=true

# Or for Redis
JAF_MEMORY_TYPE=redis
JAF_REDIS_URL=redis://localhost:6379
JAF_REDIS_PASSWORD=your_password

Performance and Scaling

Server Performance

The JAF server is built on Fastify for high performance:

Async/Await: All handlers are fully asynchronous
Schema Validation: Request validation using JSON Schema/Zod
Connection Pooling: Automatic connection management for memory providers
Graceful Shutdown: Proper cleanup of resources

Scaling Considerations

Horizontal Scaling

Stateless Design: Agents are stateless; conversations persist in external memory
Load Balancing: Use any HTTP load balancer (nginx, HAProxy, AWS ALB)
Shared Memory: Use Redis or PostgreSQL for shared conversation state

Vertical Scaling

Memory Usage: In-memory provider scales with conversation count
CPU Usage: Depends on agent complexity and tool execution
I/O: External memory providers add latency but enable scaling

Production Architecture

┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Client    │    │   Client    │    │   Client    │
└─────────────┘    └─────────────┘    └─────────────┘
       │                   │                   │
       └───────────────────┼───────────────────┘
                           │
                  ┌─────────────┐
                  │Load Balancer│
                  └─────────────┘
                           │
       ┌───────────────────┼───────────────────┐
       │                   │                   │
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│ JAF Server  │    │ JAF Server  │    │ JAF Server  │
│   Instance  │    │   Instance  │    │   Instance  │
└─────────────┘    └─────────────┘    └─────────────┘
       │                   │                   │
       └───────────────────┼───────────────────┘
                           │
                  ┌─────────────┐
                  │   Redis/    │
                  │ PostgreSQL  │
                  └─────────────┘

Monitoring and Observability

Built-in Logging

JAF server includes structured logging via Fastify:

// Logs are automatically generated for:
// - Request/response cycles
// - Agent execution times
// - Memory provider operations
// - Error conditions

Custom Tracing

Implement custom event collection:

import { TraceCollector } from '@juspay-jaf/jaf';

class ProductionTraceCollector implements TraceCollector {
  collect(event: TraceEvent): void {
    // Send to your monitoring system
    // (DataDog, New Relic, custom metrics)
    console.log(`[${event.type}]`, event.data);
  }
}

const server = await runServer(
  agents,
  {
    modelProvider,
    onEvent: new ProductionTraceCollector().collect
  },
  options
);

Health Monitoring

Monitor these endpoints for service health:

GET /health - Server availability
GET /memory/health - Memory provider connectivity
GET /agents - Agent registry status

Complete Examples

Basic Chat Server

import { runServer, makeLiteLLMProvider, createInMemoryProvider } from '@juspay-jaf/jaf';

const chatAgent = {
  name: 'ChatBot',
  instructions: () => 'You are a helpful assistant',
  tools: []
};

async function startChatServer() {
  const server = await runServer(
    [chatAgent],
    {
      modelProvider: makeLiteLLMProvider('http://localhost:4000'),
      maxTurns: 10
    },
    {
      port: 3000,
      cors: true,
      defaultMemoryProvider: createInMemoryProvider()
    }
  );
  
  console.log('Chat server running on http://localhost:3000');
}

startChatServer().catch(console.error);

Multi-Agent Server with Persistence

import { runServer, makeLiteLLMProvider, createRedisProvider } from '@juspay-jaf/jaf';
import { createClient } from 'redis';

const mathAgent = {
  name: 'MathTutor',
  instructions: () => 'You are a math tutor',
  tools: [calculatorTool]
};

const chatAgent = {
  name: 'ChatBot', 
  instructions: () => 'You are a friendly chatbot',
  tools: [greetingTool]
};

async function startProductionServer() {
  // Set up Redis for persistence
  const redisClient = createClient({ url: process.env.REDIS_URL });
  await redisClient.connect();
  
  const memoryProvider = await createRedisProvider({
    type: 'redis',
    keyPrefix: 'jaf:prod:',
    ttl: 86400 // 24 hours
  }, redisClient);

  const server = await runServer(
    [mathAgent, chatAgent],
    {
      modelProvider: makeLiteLLMProvider(
        process.env.LITELLM_URL,
        process.env.LITELLM_API_KEY
      ),
      maxTurns: 5,
      memory: {
        provider: memoryProvider,
        autoStore: true,
        maxMessages: 100
      }
    },
    {
      port: parseInt(process.env.PORT || '3000'),
      host: '0.0.0.0',
      cors: false, // Use reverse proxy
      defaultMemoryProvider: memoryProvider
    }
  );

  // Graceful shutdown
  process.on('SIGTERM', async () => {
    await server.stop();
    await redisClient.quit();
    process.exit(0);
  });
}

startProductionServer().catch(console.error);

Advanced Server with Custom Middleware

import { createJAFServer } from '@juspay-jaf/jaf/server';

async function startAdvancedServer() {
  const server = createJAFServer({
    port: 3000,
    host: '0.0.0.0',
    cors: true,
    runConfig: {
      agentRegistry: new Map([
        ['MathTutor', mathAgent],
        ['ChatBot', chatAgent]
      ]),
      modelProvider: makeLiteLLMProvider('http://localhost:4000'),
      maxTurns: 10
    },
    agentRegistry: new Map([
      ['MathTutor', mathAgent],
      ['ChatBot', chatAgent]
    ]),
    defaultMemoryProvider: memoryProvider
  });

  // Add custom authentication
  server.app.addHook('preHandler', async (request, reply) => {
    if (request.url === '/health') return;
    
    const apiKey = request.headers['x-api-key'];
    if (!apiKey || !isValidApiKey(apiKey)) {
      return reply.code(401).send({ error: 'Invalid API key' });
    }
  });

  // Add request logging
  server.app.addHook('preHandler', async (request, reply) => {
    console.log(`${request.method} ${request.url}`, {
      headers: request.headers,
      query: request.query
    });
  });

  // Add custom routes
  server.app.get('/metrics', async (request, reply) => {
    return {
      requestCount: getRequestCount(),
      uptime: process.uptime(),
      memoryUsage: process.memoryUsage()
    };
  });

  await server.start();
}

function isValidApiKey(key: string): boolean {
  return key === process.env.API_KEY;
}

startAdvancedServer().catch(console.error);

Testing the Server

// Test script
async function testServer() {
  const baseUrl = 'http://localhost:3000';
  
  // Health check
  const health = await fetch(`${baseUrl}/health`);
  console.log('Health:', await health.json());
  
  // List agents
  const agents = await fetch(`${baseUrl}/agents`);
  console.log('Agents:', await agents.json());
  
  // Chat with agent
  const chatResponse = await fetch(`${baseUrl}/chat`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      messages: [{ role: 'user', content: 'Hello!' }],
      agentName: 'ChatBot',
      conversationId: 'test-conversation'
    })
  });
  
  console.log('Chat:', await chatResponse.json());
}

testServer().catch(console.error);

This comprehensive guide covers all aspects of the JAF server API. The server provides a robust, scalable foundation for deploying JAF agents in production environments with full conversation persistence, error handling, and monitoring capabilities.

FilesExpand file tree

server-api.md

Latest commit

History

server-api.md

File metadata and controls

JAF Server API Documentation

Table of Contents

Getting Started

Basic Server Setup

Using createJAFServer for Advanced Configuration

Server Configuration

ServerConfig Interface

runServer Parameters

HTTP Endpoints

Health Check

List Available Agents

Chat with Agent

Streaming Responses (SSE)

SSE Event Types and Payloads

Agent-Specific Chat

Memory Management Endpoints

Get Conversation

Delete Conversation

Memory Health Check

Authentication and CORS

CORS Configuration

Authentication

Memory Provider Integration

Server-Level Memory Configuration

Request-Level Memory Configuration

Memory Provider Types

In-Memory Provider

Redis Provider

PostgreSQL Provider

Error Handling

Standard Error Response Format

Common HTTP Status Codes

Agent Execution Errors

Memory Provider Errors

Development vs Production

Development Configuration

Production Configuration

Environment Variables

Performance and Scaling

Server Performance

Scaling Considerations

Horizontal Scaling

Vertical Scaling

Production Architecture

Monitoring and Observability

Built-in Logging

Custom Tracing

Health Monitoring

Complete Examples

Basic Chat Server

Multi-Agent Server with Persistence

Advanced Server with Custom Middleware

Testing the Server