Skip to content

Latest commit

 

History

History
209 lines (158 loc) · 6.68 KB

File metadata and controls

209 lines (158 loc) · 6.68 KB

COPY THIS ENTIRE POST TO REDDIT

Title:

[P] Built a Python SDK for commodity price data - handling retries, rate limits, and async properly

Flair: Select "[P]" (Project)

Body:

Hi r/Python!

I've built a Python SDK for commodity price data after dealing with API integration headaches in production. Sharing what I learned about building resilient API clients.

## Quick Start

```bash
pip install oilpriceapi
export OILPRICEAPI_KEY="your_key"
python -c "from oilpriceapi import OilPriceAPI; print(OilPriceAPI().prices.get('BRENT_CRUDE_USD').value)"

What My Project Does

Fetches real-time and historical oil & commodity prices with comprehensive retry handling, rate limiting, and connection pooling. Handles the boring-but-critical stuff like exponential backoff with jitter, connection limits, and error handling.

Sync example:

from oilpriceapi import OilPriceAPI

client = OilPriceAPI(
    api_key="your_key",  # or reads from OILPRICEAPI_KEY env var
    timeout=5.0,
    max_retries=3
)

try:
    price = client.prices.get("BRENT_CRUDE_USD")
    print(f"Current price: ${price.value}")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.reset_time}")
except TimeoutError:
    # Automatically retried 3x with exponential backoff + jitter
    print("API timeout after retries")

Async example:

import asyncio
from oilpriceapi import AsyncOilPriceAPI

async def get_prices():
    async with AsyncOilPriceAPI() as client:
        # Fetch multiple commodities concurrently
        prices = await asyncio.gather(
            client.prices.get("BRENT_CRUDE_USD"),
            client.prices.get("WTI_USD"),
            client.prices.get("NATURAL_GAS_USD")
        )
        for price in prices:
            print(f"{price.commodity}: ${price.value}")

asyncio.run(get_prices())

What happens when things break:

  • Rate limited? Returns RateLimitError with reset time
  • Network timeout? Retries with exponential backoff + jitter (prevents thundering herd)
  • Server error 500? Retries on [500, 502, 503, 504] automatically
  • Bad API key? Clear AuthenticationError with helpful message

Target Audience

Production systems that need reliable commodity price data:

  • Energy trading systems
  • Risk management platforms
  • Financial analysis pipelines
  • Market research tools

Comprehensive error handling, connection pooling, and retry logic that won't DDoS your API during an outage.

Early adoption: 250+ PyPI downloads since September launch, 4 active users making 100+ API requests. Most popular feature: historical price data. Looking for feedback on what to improve.

Comparison

vs. Manual requests library: You'll spend days building retry logic, connection pooling, and rate limit handling. Then debug why your retries cause request storms. I did that already, so you don't have to.

vs. pandas-datareader:

  • Async/await support (datareader is synchronous only)
  • Active maintenance (datareader's commodity data is flaky)
  • Built-in retry with jitter
  • Type hints throughout

vs. yfinance:

  • Direct spot prices (not ETF proxies)
  • Better for energy market analysis where you need actual delivery prices
  • Focused on commodities rather than stocks

Technical Details

Retry Strategy:

  • Exponential backoff: 1s, 2s, 4s, 8s... (capped at 60s)
  • With jitter: adds 0-30% randomization to prevent synchronized retries
  • Configurable: customize retry codes, max attempts, backoff

Connection Pooling (Async):

  • Max 100 concurrent connections (prevents resource exhaustion)
  • 20 keepalive connections (improves performance)
  • Context managers for proper cleanup

Error Handling: 8 specific exception types (not generic Exception):

  • AuthenticationError (401)
  • RateLimitError (429) - includes reset time
  • DataNotFoundError (404)
  • ValidationError (422)
  • ServerError (5xx)
  • TimeoutError - after all retries
  • ConfigurationError - setup issues
  • OilPriceAPIError - base class

Type Safety:

  • Full type hints (mypy --strict compatible)
  • Pydantic models for responses
  • No Any types in public API

What I Learned Building This

Mistake 1: Naive exponential backoff without jitter

  • Problem: All clients retry at same time after outage → thundering herd
  • Fix: Add 0-30% random jitter to backoff timing

Mistake 2: No connection pooling limits

  • Problem: Under load, spawns unlimited connections → OOM
  • Fix: Configure httpx.Limits (max 100 concurrent, 20 keepalive)

Mistake 3: Generic exceptions

  • Problem: Callers can't handle different failures appropriately
  • Fix: Specific exceptions with relevant context (reset_time, status_code, etc.)

Roadmap

Currently working on:

  • Response caching with fallback (use stale cache when API is down)
  • Circuit breaker pattern (fail fast during outages)
  • Improving test coverage (current: ~65%, target: 85%+)

Future:

  • Client-side data validation
  • OpenTelemetry integration

What should I prioritize? Caching or circuit breaker? Open to feedback on what would be most useful.

Contributions welcome! These are good first issues if you want to contribute.

Links

Free Tier Reality Check

100 requests (lifetime) = 33/day. Good for:

  • ✅ Daily end-of-day batch jobs
  • ✅ Polling 1 commodity every 30 minutes
  • ✅ Development and testing workflows
  • ❌ Real-time tick data (need paid plan for that)

Happy to answer questions about implementation, especially around retry strategies and error handling patterns!


---

## POSTING INSTRUCTIONS

1. **Go to:** https://www.reddit.com/r/Python/submit
2. **Click:** "Text" tab
3. **Paste title** from above (including `[P]` prefix)
4. **Paste body** from above (everything between the triple backticks)
5. **Select flair:** Click "Flair" button and choose "[P]" (Project)
6. **Post when:** Tuesday-Thursday, 9-11am EST
7. **Stay online** for first 2 hours to respond to comments

## POST-SUBMISSION CHECKLIST

**Within 15 minutes:**
- [ ] Reply to first 3 comments
- [ ] Thank anyone who asks questions
- [ ] Link to specific code when explaining

**Within 1 hour:**
- [ ] Check post is visible (not spam filtered)
- [ ] Reply to all comments
- [ ] Fix any typos people point out

**Within 24 hours:**
- [ ] Continue responding to comments
- [ ] Open GitHub issues for feature requests
- [ ] Thank contributors publicly

---

**Status:** ✅ READY TO POST
**Best time:** Tuesday-Thursday, 9-11am EST
**Expected response:** 100+ upvotes, 20+ comments in 24 hours