Performance Optimization
Implement HTTP client connection pooling to reduce connection overhead and improve request performance for service-to-service communication.
Level of Effort: 🟢 Small (1-2 days)
- Implementation: 1 day for connection pooling setup and configuration
- Testing: 0.5 days for performance validation and testing
- Documentation: 0.5 days for configuration and usage documentation
Current Implementation
File: src/client/utils.py (lines 212-230)
```python
async def send_request(self, url: str, data: dict) -> dict:
    """Send HTTP request with authentication."""
    # Current: creates a new httpx.AsyncClient for each request
    async with httpx.AsyncClient() as client:
        response = await client.post(
            url,
            json=data,
            headers={"Authorization": f"Bearer {id_token}"},
            timeout=30.0,
        )
```
Performance issues:
- New HTTP connection established for each request
- TCP handshake overhead on every API call
- No connection reuse between requests
- Risk of socket and ephemeral-port exhaustion under sustained load
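To make the overhead concrete, here is a back-of-the-envelope cost model. The numbers (RTT, handshake round trips, request time) are illustrative assumptions, not measurements of this system:

```python
# Toy cost model for connection reuse (illustrative numbers, not measurements).
RTT_MS = 20.0       # assumed network round-trip time
HANDSHAKE_RTTS = 2  # ~1 RTT for TCP + ~1 RTT for TLS 1.3
REQUEST_MS = 15.0   # assumed server processing + response transfer


def total_ms(n_requests: int, pooled: bool) -> float:
    """Total wall-clock cost of n sequential requests, in milliseconds."""
    handshake = HANDSHAKE_RTTS * RTT_MS
    # Pooled: one handshake, then every request reuses the connection.
    # Unpooled: every request pays the full handshake again.
    setups = 1 if pooled else n_requests
    return setups * handshake + n_requests * (REQUEST_MS + RTT_MS)


print(total_ms(100, pooled=False))  # → 7500.0  (100 handshakes)
print(total_ms(100, pooled=True))   # → 3540.0  (1 handshake, amortized)
```

Under these assumptions, pooling roughly halves the total time for 100 sequential calls; the real saving depends on network RTT and TLS configuration.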
Proposed Implementation
1. Singleton HTTP Client with Connection Pooling
Enhanced client with connection pooling:
```python
# src/client/utils.py
import asyncio
from typing import Optional

import httpx


class UtilHandler:
    def __init__(self, log_level: str = "INFO"):
        # ... existing initialization
        self._http_client: Optional[httpx.AsyncClient] = None
        self._client_lock = asyncio.Lock()

    async def _get_http_client(self) -> httpx.AsyncClient:
        """Get or create HTTP client with connection pooling."""
        if self._http_client is None:
            async with self._client_lock:
                # Re-check after acquiring the lock to avoid creating two clients.
                if self._http_client is None:
                    limits = httpx.Limits(
                        max_keepalive_connections=20,
                        max_connections=100,
                        keepalive_expiry=30.0,
                    )
                    timeout = httpx.Timeout(30.0, connect=10.0)
                    self._http_client = httpx.AsyncClient(
                        limits=limits,
                        timeout=timeout,
                        http2=True,  # requires the httpx[http2] extra
                    )
        return self._http_client

    async def send_request(self, url: str, data: dict) -> dict:
        """Send HTTP request with connection pooling."""
        id_token = await self._get_id_token()
        client = await self._get_http_client()
        response = await client.post(
            url,
            json=data,
            headers={"Authorization": f"Bearer {id_token}"},
        )
        if response.status_code == 200:
            return response.json()
        # ... error handling
        response.raise_for_status()

    async def close(self):
        """Close the HTTP client and release pooled connections."""
        if self._http_client:
            await self._http_client.aclose()
            self._http_client = None
```
2. Backend HTTP Client Optimization
If backend makes external HTTP calls:
```python
# src/answer_app/utils.py (if needed for external API calls)
from typing import Optional

import httpx


class UtilHandler:
    def __init__(self, log_level: str = "INFO"):
        # ... existing initialization
        self._http_client: Optional[httpx.AsyncClient] = None

    async def _get_http_client(self) -> httpx.AsyncClient:
        """Get HTTP client for external API calls."""
        # No lock here for brevity; add one (as in the client-side handler)
        # if this can be called from concurrent request handlers.
        if self._http_client is None:
            limits = httpx.Limits(
                max_keepalive_connections=10,
                max_connections=50,
                keepalive_expiry=60.0,
            )
            self._http_client = httpx.AsyncClient(limits=limits)
        return self._http_client
```
3. Application Lifecycle Management
Proper client lifecycle in FastAPI (recent FastAPI versions prefer the `lifespan` context manager; `@app.on_event` is deprecated but works the same way here):
```python
# src/answer_app/main.py
@app.on_event("startup")
async def startup_event():
    """Initialize shared resources."""
    # The HTTP client is created lazily on first use.
    pass


@app.on_event("shutdown")
async def shutdown_event():
    """Clean up shared resources."""
    # Close the shared HTTP client, if one was created.
    if hasattr(utils, "_http_client") and utils._http_client:
        await utils._http_client.aclose()
```
Streamlit app lifecycle:
```python
# src/client/streamlit_app.py
import atexit

# Register cleanup; asyncio.run() starts a fresh event loop, which is safe
# at interpreter exit once Streamlit's own loop has stopped.
atexit.register(lambda: asyncio.run(utils.close()) if utils._http_client else None)
```
Configuration Options
Add to application config:
```yaml
# config.yaml
http_client:
  max_keepalive_connections: 20
  max_connections: 100
  keepalive_expiry: 30.0
  connection_timeout: 10.0
  request_timeout: 30.0
  enable_http2: true
  # Retry configuration
  max_retries: 3
  retry_backoff_factor: 0.5
```
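These settings would need to be parsed and handed to the client factory. A minimal sketch of that wiring; `HttpClientSettings` is a hypothetical helper (not part of the existing codebase), with field names matching the YAML keys above and defaults for absent keys:

```python
# Sketch: mapping the parsed `http_client` section of config.yaml onto settings.
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class HttpClientSettings:
    max_keepalive_connections: int = 20
    max_connections: int = 100
    keepalive_expiry: float = 30.0
    connection_timeout: float = 10.0
    request_timeout: float = 30.0
    enable_http2: bool = True
    max_retries: int = 3
    retry_backoff_factor: float = 0.5

    @classmethod
    def from_config(cls, cfg: dict) -> "HttpClientSettings":
        """Build settings from the config mapping, ignoring unknown keys."""
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in cfg.items() if k in known})


# Example: a partial config overrides only some values.
settings = HttpClientSettings.from_config({"max_connections": 50, "enable_http2": False})
print(settings.max_connections, settings.enable_http2)  # → 50 False
```

In `_get_http_client`, these would feed `httpx.Limits(max_connections=settings.max_connections, ...)` and `httpx.Timeout(settings.request_timeout, connect=settings.connection_timeout)`.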
Expected Performance Improvements
Connection Overhead Reduction:
- TCP handshake elimination: Reuse existing connections
- SSL/TLS handshake savings: Keep secure connections alive
- DNS lookup reduction: Connection pooling reduces DNS queries
Performance Metrics:
- Latency improvement: 10-50ms reduction per request (depending on network)
- Throughput increase: 20-40% improvement for concurrent requests
- Resource efficiency: Reduced system socket usage
Load Handling:
- Better concurrency: Efficient handling of multiple simultaneous requests
- Connection limits: Prevent connection pool exhaustion
- Graceful degradation: Proper timeout and retry handling
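The `max_retries` and `retry_backoff_factor` settings from the config could be applied with a small wrapper around the request call. A transport-agnostic sketch, where `send` stands in for the real client call and `ConnectionError` for whatever transient exception the client raises (httpx also offers connect-level retries via `httpx.AsyncHTTPTransport(retries=n)`):

```python
import asyncio


async def with_retries(send, max_retries: int = 3, backoff_factor: float = 0.5):
    """Call send() with exponential backoff: delays of backoff_factor * 2**attempt seconds."""
    for attempt in range(max_retries + 1):
        try:
            return await send()
        except ConnectionError:
            if attempt == max_retries:
                raise  # out of retries: surface the error
            await asyncio.sleep(backoff_factor * (2 ** attempt))


# Example: a flaky call that fails twice, then succeeds.
attempts = 0


async def flaky():
    global attempts
    attempts += 1
    if attempts < 3:
        raise ConnectionError("transient failure")
    return "ok"


result = asyncio.run(with_retries(flaky, max_retries=3, backoff_factor=0.01))
print(result, attempts)  # → ok 3
```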
Implementation Areas
Files to Modify:
- src/client/utils.py: Add connection pooling to UtilHandler
- src/client/streamlit_app.py: Add proper client lifecycle management
- src/answer_app/main.py: Add startup/shutdown hooks (if backend needs HTTP client)
- Tests: Update mocking to work with persistent client
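Since tests can no longer rely on a fresh client per request, they can inject a mock into the shared-client slot instead. A stdlib-only sketch with `unittest.mock.AsyncMock`; the `UtilHandler` below is a minimal stand-in for the real class, not its actual code:

```python
import asyncio
from unittest.mock import AsyncMock


class UtilHandler:
    """Minimal stand-in for the real class, for illustration only."""

    def __init__(self):
        self._http_client = None

    async def _get_http_client(self):
        return self._http_client

    async def send_request(self, url: str, data: dict) -> dict:
        client = await self._get_http_client()
        response = await client.post(url, json=data)
        return response.json()


async def run_test() -> dict:
    handler = UtilHandler()
    fake_client = AsyncMock()
    # httpx's Response.json() is synchronous, so attach a plain callable.
    fake_client.post.return_value.json = lambda: {"answer": 42}
    handler._http_client = fake_client  # inject the shared client

    out = await handler.send_request("https://example.test", {"q": "hi"})
    fake_client.post.assert_awaited_once_with("https://example.test", json={"q": "hi"})
    return out


result = asyncio.run(run_test())
print(result)  # → {'answer': 42}
```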
Configuration:
- src/answer_app/config.yaml: Add HTTP client configuration options
- Environment variables: HTTP client tuning parameters
Testing Strategy
Performance Testing:
- Measure request latency before/after implementation
- Test concurrent request handling
- Validate connection reuse metrics
- Monitor resource usage under load
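The before/after latency comparison can use a small timing harness. A sketch where `fake_send` stands in for the real request function (swap in the pooled and unpooled `send_request` to compare):

```python
import asyncio
import statistics
import time


async def measure_latency(send, n: int = 10) -> list:
    """Time n sequential calls to send() and return per-call latencies in seconds."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        await send()
        samples.append(time.perf_counter() - start)
    return samples


async def fake_send():
    # Stand-in for a real request; ~5 ms of simulated latency.
    await asyncio.sleep(0.005)


samples = asyncio.run(measure_latency(fake_send, n=10))
print(f"mean={statistics.mean(samples) * 1000:.1f} ms, max={max(samples) * 1000:.1f} ms")
```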
Functional Testing:
- Ensure all existing functionality works with pooled connections
- Test connection recovery after network issues
- Validate proper client cleanup on application shutdown
Edge Case Testing:
- Connection timeout scenarios
- Server connection limits
- Network interruption recovery
- Long-running connection behavior
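Timeout scenarios can be exercised without a network by racing a deliberately slow coroutine against a deadline. A sketch using `asyncio.wait_for`; with the pooled client, httpx raises `httpx.TimeoutException` in the analogous situation:

```python
import asyncio


async def slow_backend():
    await asyncio.sleep(1.0)  # simulates a hung upstream service
    return "response"


async def call_with_timeout(timeout_s: float) -> str:
    try:
        return await asyncio.wait_for(slow_backend(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return "timed out"


print(asyncio.run(call_with_timeout(0.01)))  # → timed out
```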
Monitoring and Metrics
Connection pool metrics to track:
- Active connections count
- Connection reuse rate
- Connection creation/destruction frequency
- Request latency improvements
- Failed connection attempts
Logging enhancements (note: httpx does not expose public connection-pool statistics; the attributes below are private httpx/httpcore internals and may change between versions):
```python
pool = client._transport._pool  # private API: httpcore.AsyncConnectionPool
logging.info(f"HTTP client pool: {len(pool.connections)} connections open")
```
Acceptance Criteria
Priority
Low - Performance optimization that provides value but isn't critical at current scale.
When to Implement
This becomes more valuable when:
- Application handles >50 requests/hour consistently
- Network latency between services becomes noticeable
- Multiple concurrent users create performance bottlenecks
- Service-to-service communication frequency increases
- Performance optimization becomes a focus area