A production-quality distributed scatter-gather system for aggregating network information from multiple sources (GeoIP, Ping, RDAP, ReverseDNS) using asynchronous worker orchestration.
```mermaid
flowchart TD
    %% Top row: request pipeline
    subgraph Row1 [Request Pipeline]
        direction LR
        Client(Client)
        API(API)
        RMQ(RabbitMQ)
        Workers(Workers)
        Client --> API
        API --> RMQ
        RMQ --> Workers
    end

    %% Bottom row: state & orchestration
    subgraph Row2 [State & Orchestration]
        direction LR
        Redis[(Redis<br/>State)]
        Saga(Saga<br/>Machine)
        Redis ~~~ Saga
    end

    %% Cross-row connections
    API --> Redis
    Workers --> Saga
    Saga --> RMQ
    Saga --> Redis
    Redis -.-> Client

    classDef plain fill:#fff,stroke:#333,stroke-width:2px;
    classDef storage fill:#eee,stroke:#333,stroke-width:2px;
    classDef queue fill:#f9f,stroke:#333,stroke-width:2px;
    class Client,API,Workers,Saga plain;
    class Redis storage;
    class RMQ queue;

    %% Layout hint: force Row2 below Row1
    Row1 ~~~ Row2
```
- Non-Blocking API: Returns `202 Accepted` immediately with a job ID
- Polling Model: Clients poll `GET /api/lookup/{jobId}` for status
- Saga Pattern: Central state machine orchestrates the distributed workflow
- Worker Isolation: Each service type has dedicated worker processes
- Horizontal Scalability: Workers can be scaled independently
- .NET 10: Modern C# with minimal APIs and top-level statements
- MassTransit: Message bus abstraction over RabbitMQ
- RabbitMQ: Reliable message broker for command/event routing
- Redis: Fast, volatile state storage for jobs and saga state
- Docker Compose: Multi-container orchestration
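To make the non-blocking submit/poll model above concrete, here is a rough client-side sketch. The endpoint paths and JSON property names (`statusUrl`, `status`) follow the sample responses later in this README; the terminal status values are assumptions.

```csharp
using System.Net.Http.Json;
using System.Text.Json;

// Submit a lookup job, then poll its status URL until it finishes.
using var http = new HttpClient { BaseAddress = new Uri("http://localhost:8080") };

var submit = await http.PostAsJsonAsync("/api/lookup", new { target = "8.8.8.8" });
var accepted = await submit.Content.ReadFromJsonAsync<JsonElement>();
var statusUrl = accepted.GetProperty("statusUrl").GetString()!;

while (true)
{
    var job = await http.GetFromJsonAsync<JsonElement>(statusUrl);
    var status = job.GetProperty("status").GetString();
    if (status is "Completed" or "Failed") break;   // terminal states assumed
    await Task.Delay(TimeSpan.FromSeconds(1));      // fixed-interval poll; backoff would be kinder
}
```

A real client would respect the `Retry-After` header described in the rate-limiting section below.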
```mermaid
block-beta
    columns 2
    %% Column 1: the layer box; column 2: the annotation
    L1["API Layer"] Note1["← REST endpoints"]
    L2["Application Layer"] Note2["← Use cases, Saga"]
    L3["Infra Layer"] Note3["← Redis, MassTransit"]
    L4["Domain Layer"] Note4["← Core entities (no deps)"]
    L5["Contracts Layer"] Note5["← Shared messages"]
    %% Spacer row
    space:2
    %% Workers block spans both columns (:2)
    block:WorkersGroup:2
        WLabel["Workers (separate processes)"]
        block:WList
            columns 4
            Geo Ping RDAP ReverseDNS
        end
    end
    %% Make the notes render as plain text (transparent)
    style Note1 fill:none,stroke:none
    style Note2 fill:none,stroke:none
    style Note3 fill:none,stroke:none
    style Note4 fill:none,stroke:none
    style Note5 fill:none,stroke:none
    style WLabel fill:none,stroke:none
```
- Docker & Docker Compose
- .NET 10 SDK (for local development)
```bash
# Clone the repository
git clone <repository-url>
cd DistributedLookup

# Start all services
docker-compose up --build

# API will be available at http://localhost:8080
# RabbitMQ management UI at http://localhost:15672 (guest/guest)
```

```bash
# Basic request (uses default services: GeoIP, Ping, RDAP, ReverseDNS)
curl -X POST http://localhost:8080/api/lookup \
  -H "Content-Type: application/json" \
  -d '{"target": "8.8.8.8"}'
```

Response:

```json
{
  "jobId": "123e4567-e89b-12d3-a456-426614174000",
  "statusUrl": "/api/lookup/123e4567-e89b-12d3-a456-426614174000",
  "message": "Job submitted successfully. Poll the status URL to check progress."
}
```
```bash
# Custom services
curl -X POST http://localhost:8080/api/lookup \
  -H "Content-Type: application/json" \
  -d '{
    "target": "google.com",
    "services": [0, 1]
  }'
```

```bash
# Poll for results
curl http://localhost:8080/api/lookup/123e4567-e89b-12d3-a456-426614174000
```

Response (in progress):

```json
{
  "jobId": "123e4567-e89b-12d3-a456-426614174000",
  "target": "8.8.8.8",
  "targetType": "IPAddress",
  "status": "Processing",
  "completionPercentage": 75,
  "requestedServices": [0, 1, 2, 3],
  "results": [
    {
      "serviceType": "0",
      "success": true,
      "data": "{\"country\":\"US\",\"city\":\"Mountain View\"...}",
      "durationMs": 234
    },
    {
      "serviceType": "1",
      "success": true,
      "data": "{\"averageRoundtripMs\":15.2...}",
      "durationMs": 2045
    },
    {
      "serviceType": "2",
      "success": true,
      "data": "{\"handle\":\"NET-8-8-8-0-1\"...}",
      "durationMs": 567
    }
  ]
}
```

```bash
# List available services
curl http://localhost:8080/api/lookup/services
```

Response:

```json
[
  {
    "name": "GeoIP",
    "value": 0,
    "description": "Geographic location and ISP information"
  },
  {
    "name": "Ping",
    "value": 1,
    "description": "Network reachability and latency check"
  },
  {
    "name": "RDAP",
    "value": 2,
    "description": "Registration data via RDAP protocol"
  },
  {
    "name": "ReverseDNS",
    "value": 3,
    "description": "Reverse DNS lookup (PTR record)"
  }
]
```

```text
Client → API:       POST /api/lookup
API → Redis:        Save job (status: Pending)
API → RabbitMQ:     Publish JobSubmitted event
API → Client:       202 Accepted + JobId

Saga → RabbitMQ:    Consume JobSubmitted
Saga → RabbitMQ:    Publish CheckGeoIP command
Saga → RabbitMQ:    Publish CheckPing command
Saga → RabbitMQ:    Publish CheckRDAP command

Worker → RabbitMQ:            Consume command
Worker → External API:        Query service
Worker → IWorkerResultStore:  Save result (get ResultLocation)
Worker → RabbitMQ:            Publish TaskCompleted (with ResultLocation, not data)
```

Key Change: Workers save results directly to storage *before* publishing events. Events contain only metadata (the `ResultLocation`), not the actual result data. This reduces message size and improves saga performance.

```text
Saga → RabbitMQ:    Consume TaskCompleted
Saga → Redis:       Update job with result

[When all tasks complete]
Saga → Redis:       Mark job as Completed

Client → API:       GET /api/lookup/{jobId}
API → Redis:        Fetch job state
API → Client:       Current status + results
```
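The "metadata, not data" rule implies an event contract roughly like the following. The property names here are assumptions for illustration, not the project's actual `TaskCompleted` definition:

```csharp
// Hypothetical shape: the event references where the result lives,
// never the result payload itself, so messages stay small.
public record TaskCompleted(
    string JobId,
    ServiceType ServiceType,
    bool Success,
    long DurationMs,
    ResultLocation Location);   // pointer into IWorkerResultStore-managed storage
```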
```bash
dotnet test tests/Tests/Tests.csproj
```

- ✅ Domain entity logic (state transitions, validation)
- ✅ Use case orchestration (mocked infrastructure)
- 🚧 Integration tests (Testcontainers) - next step
```csharp
[Fact]
public void AddResult_WhenAllServicesComplete_ShouldMarkAsCompleted()
{
    // Arrange
    var services = new[] { ServiceType.GeoIP, ServiceType.Ping };
    var job = CreateTestJob(services);

    // Act
    job.AddResult(ServiceType.GeoIP, successResult);
    job.AddResult(ServiceType.Ping, successResult);

    // Assert
    job.Status.Should().Be(JobStatus.Completed);
    job.IsComplete().Should().BeTrue();
}
```

- URL: http://localhost:15672
- Credentials: guest/guest
- View queues, message rates, consumer connections
```bash
docker exec -it distributed-lookup-redis redis-cli

# View all jobs
KEYS lookup:job:*

# Get job details
GET lookup:job:123e4567-e89b-12d3-a456-426614174000

# View all saga states
KEYS saga:*

# Get saga instance
GET saga:123e4567-e89b-12d3-a456-426614174000
```

The saga instance shows:

```json
{
  "JobId": "123e4567-e89b-12d3-a456-426614174000",
  "CurrentState": "Processing",
  "RequestedServices": [0, 1, 2, 3],
  "CompletedTasks": [0, 1],
  "Results": [...]
}
```

The API implements three-tier rate limiting to prevent abuse:
1. API Limit (status checks and general endpoints)
   - 100 requests per minute per client
   - Fixed window strategy
   - Queue limit: 10 requests
2. Expensive Operations (job submissions)
   - 20 requests per minute per client
   - Sliding window strategy (6 segments)
   - Queue limit: 5 requests
3. Global Limit
   - 1000 requests per minute across all clients
   - Prevents total system overload
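As a rough sketch, these three tiers could be wired up with ASP.NET Core's built-in rate limiting. The policy names and exact options below are illustrative assumptions, not the project's actual configuration:

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    // Tier 1: general endpoints - fixed window, 100 req/min, queue of 10
    options.AddFixedWindowLimiter("api-limit", o =>
    {
        o.PermitLimit = 100;
        o.Window = TimeSpan.FromMinutes(1);
        o.QueueLimit = 10;
    });

    // Tier 2: expensive operations - sliding window, 20 req/min over 6 segments
    options.AddSlidingWindowLimiter("expensive", o =>
    {
        o.PermitLimit = 20;
        o.Window = TimeSpan.FromMinutes(1);
        o.SegmentsPerWindow = 6;
        o.QueueLimit = 5;
    });

    // Tier 3: global cap of 1000 req/min across all clients
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(_ =>
        RateLimitPartition.GetFixedWindowLimiter("global", _ =>
            new FixedWindowRateLimiterOptions
            {
                PermitLimit = 1000,
                Window = TimeSpan.FromMinutes(1)
            }));

    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
});

var app = builder.Build();
app.UseRateLimiter();
```

Endpoints then opt into a tier with `.RequireRateLimiting("api-limit")` or `.RequireRateLimiting("expensive")`.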
Rate Limit Response:

```http
HTTP/1.1 429 Too Many Requests
Retry-After: 60

{
  "error": "Rate limit exceeded",
  "message": "Too many requests. Please try again later.",
  "retryAfter": 60
}
```

Readiness Check (`/health/ready`):

- Checks if the API is ready to serve requests
- Validates RabbitMQ connection
- Validates MassTransit bus readiness
- Used by Docker health checks

```bash
curl http://localhost:8080/health/ready
```

Liveness Check (`/health/live`):

- Basic health check (process is running)
- Used for container orchestration

```bash
curl http://localhost:8080/health/live
```

Both health endpoints bypass rate limiting.
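One plausible wiring for these two endpoints uses ASP.NET Core health checks. The `RabbitMqHealthCheck` type and the `"ready"` tag are assumptions for illustration, not the project's actual code:

```csharp
using Microsoft.AspNetCore.Diagnostics.HealthChecks;

var builder = WebApplication.CreateBuilder(args);

// Dependency checks tagged "ready" gate the readiness endpoint only.
// RabbitMqHealthCheck is a hypothetical IHealthCheck implementation.
builder.Services.AddHealthChecks()
    .AddCheck<RabbitMqHealthCheck>("rabbitmq", tags: new[] { "ready" });

var app = builder.Build();

// Readiness: run only the checks tagged "ready" (broker/bus connectivity).
app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
    Predicate = check => check.Tags.Contains("ready")
}).DisableRateLimiting();

// Liveness: run no dependency checks - if the process responds, it is alive.
app.MapHealthChecks("/health/live", new HealthCheckOptions
{
    Predicate = _ => false
}).DisableRateLimiting();
```

`DisableRateLimiting()` is how both endpoints would bypass the rate limiter described above.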
- Aggregate Root: `LookupJob` encapsulates all business logic
- Value Objects: `ServiceResult` is immutable
- Rich Domain Model: State transitions enforced by entity methods

```csharp
// Invalid state transitions throw exceptions
job.MarkAsProcessing(); // OK if Pending
job.MarkAsProcessing(); // Throws: already Processing
```

- Commands: `SubmitLookupJob` (write operation)
- Queries: `GetJobStatus` (read operation)
- Separate read/write concerns for scalability
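A minimal sketch of how the two use cases might look. Only the class names come from the project; the method signatures, `LookupJob.Create` factory, and `JobSubmitted` shape are assumptions:

```csharp
// Write side: create the job, persist it, publish the event.
public sealed class SubmitLookupJob(IJobRepository repository, IPublishEndpoint bus)
{
    public async Task<LookupJob> ExecuteAsync(string target, IReadOnlyList<ServiceType> services)
    {
        var job = LookupJob.Create(target, services);   // domain factory (assumed)
        await repository.SaveAsync(job);
        await bus.Publish(new JobSubmitted(job.Id, target, services));
        return job;
    }
}

// Read side: nothing but a lookup - no events, no writes.
public sealed class GetJobStatus(IJobRepository repository)
{
    public Task<LookupJob?> ExecuteAsync(string jobId) => repository.GetByIdAsync(jobId);
}
```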
```csharp
Initially(
    When(JobSubmitted)
        .PublishAsync(context => DispatchCommands(context))
        .TransitionTo(Processing)
);

During(Processing,
    When(TaskCompleted)
        .ThenAsync(async context => UpdateJob(context))
        .If(AllTasksComplete,
            binder => binder.TransitionTo(Completed).Finalize())
);
```

Each worker is:
- ✅ Stateless (no shared memory)
- ✅ Independently scalable
- ✅ Fault-tolerant (retries via RabbitMQ)
- ✅ Technology-agnostic (could be rewritten in Go, Python, etc.)
```csharp
public interface IJobRepository
{
    Task<LookupJob?> GetByIdAsync(string jobId);
    Task SaveAsync(LookupJob job);
}
```

Infrastructure dependency injection:

```csharp
builder.Services.AddScoped<IJobRepository, RedisJobRepository>();
```

All workers inherit from `LookupWorkerBase<TCommand>`, providing:
- ✅ Consistent Workflow: Timing, validation, persistence, event publishing
- ✅ DRY Principle: 90% reduction in worker code
- ✅ Easy Extension: New workers only implement lookup logic
```csharp
// Adding a new worker is simple
public sealed class WhoisConsumer(ILogger logger, IWorkerResultStore store)
    : LookupWorkerBase<CheckWhois>(logger, store)
{
    protected override ServiceType ServiceType => ServiceType.Whois;

    protected override async Task<object> PerformLookupAsync(CheckWhois cmd, CancellationToken ct)
    {
        // Only implement the lookup - base class handles everything else
        return await PerformWhoisLookup(cmd.Target);
    }
}
```

Worker Implementation Comparison:
| Aspect | Before | After |
|---|---|---|
| Lines of code | ~150 lines | ~30 lines |
| Duplication | High | None |
| Consistency | Manual | Enforced |
| Extension | Complex | Trivial |
Workers use `IWorkerResultStore` for result persistence:

- ✅ Polyglot Persistence Ready: Support multiple storage backends
- ✅ Type-Safe Locations: Polymorphic `ResultLocation` hierarchy
- ✅ Decoupled: Workers don't know about saga state

Current: Redis implementation
Future Ready: S3, DynamoDB, Azure Blob, FileSystem
```csharp
// ResultLocation uses JSON polymorphism
[JsonPolymorphic]
[JsonDerivedType(typeof(RedisResultLocation), "redis")]
[JsonDerivedType(typeof(S3ResultLocation), "s3")]
[JsonDerivedType(typeof(DynamoDBResultLocation), "dynamodb")]
[JsonDerivedType(typeof(FileSystemResultLocation), "filesystem")]
[JsonDerivedType(typeof(AzureBlobResultLocation), "azureblob")]
public abstract record ResultLocation
{
    public abstract StorageType StorageType { get; }
}
```

Storage Strategy Examples:

```csharp
// Small results → Redis (fast, in-memory)
public record RedisResultLocation(string Key, int Database, TimeSpan? Ttl)
    : ResultLocation
{
    public override StorageType StorageType => StorageType.Redis;
}

// Large results → S3 (cheap, durable)
public record S3ResultLocation(string Bucket, string Key, string? PresignedUrl)
    : ResultLocation
{
    public override StorageType StorageType => StorageType.S3;
}
```
- Rate Limiting
  - Three-tier rate limiting (API, Expensive, Global)
  - Configurable limits per endpoint
  - Automatic retry-after headers
- Health Checks
  - Readiness check (bus + endpoints)
  - Liveness check (process health)
  - Docker health check integration
- Worker Direct Persistence
  - Workers save results directly to Redis
  - Reduces load on saga
  - Ensures result durability
- Worker Base Class Architecture
  - Template method pattern eliminates duplication
  - All workers follow consistent pattern
  - Easy to add new service types
  - 90% code reduction per worker
- Storage Abstraction Ready
  - Interface for pluggable storage backends
  - Polymorphic result locations
  - Architecture supports Redis, S3, DynamoDB, Azure Blob
  - No worker changes needed to add new backends
- Authentication & Authorization
  - Add API key validation
  - Per-user rate limiting
  - JWT token support
- Error Handling
  - Dead letter queues for failed messages
  - Retry policies with exponential backoff
  - Circuit breakers for external APIs
- Observability
  - Structured logging (Serilog)
  - Distributed tracing (OpenTelemetry)
  - Metrics (Prometheus + Grafana)
- Resilience
  - Timeout policies on HTTP calls
  - Bulkhead isolation
  - Saga compensation logic (rollback)
- WebSocket Notifications
  - Push updates instead of polling
  - SignalR integration
- Caching
  - Cache frequent queries (Google DNS, etc.)
  - TTL-based invalidation
- Persistence
  - Move from Redis to PostgreSQL for durable storage
  - Keep Redis for fast read cache
- Job Prioritization
  - Priority queues in RabbitMQ
  - SLA-based routing
- Multi-Tenancy
  - Tenant isolation
  - Per-tenant quotas
- Integration Tests
  - Testcontainers for Docker-based tests
  - End-to-end API tests
  - Chaos engineering (kill workers mid-process)
- Batch Processing
  - Bulk job submissions
  - Batch result updates
- Connection Pooling
  - HTTP client factory
  - Redis connection multiplexer
- Message Compression
  - Compress large payloads
  - Protobuf instead of JSON
- Multi-Backend Storage
  - Implement S3WorkerResultStore for large results
  - Route based on result size
  - Maintain fast retrieval times
WebSocket notifications (SignalR):

```csharp
// In API Program.cs
app.MapHub<JobStatusHub>("/hubs/job-status");
```

```javascript
// Client usage
connection.on("JobUpdated", (jobId, status) => {
    updateUI(jobId, status);
});
```

Retry policies with exponential backoff (MassTransit):

```csharp
x.AddConsumer<GeoIPConsumer>(cfg =>
{
    cfg.UseMessageRetry(r => r.Exponential(
        retryLimit: 3,
        minInterval: TimeSpan.FromSeconds(2),
        maxInterval: TimeSpan.FromSeconds(30),
        intervalDelta: TimeSpan.FromSeconds(2)
    ));
});
```

Adding a new storage backend:

```csharp
// 1. Implement the interface
public class S3WorkerResultStore : IWorkerResultStore
{
    public async Task<ResultLocation> SaveResultAsync(
        string jobId, ServiceType serviceType, object data, CancellationToken ct)
    {
        var key = $"results/{jobId}/{serviceType}";
        await _s3Client.PutObjectAsync(bucket, key, data, ct);
        return new S3ResultLocation(bucket, key, null);
    }
}

// 2. Register in DI
builder.Services.AddScoped<IWorkerResultStore, S3WorkerResultStore>();

// That's it! All workers automatically use S3
```

Available at http://localhost:8080/swagger when running in Development mode.
| Method | Endpoint | Description | Rate Limit |
|---|---|---|---|
| POST | `/api/lookup` | Submit new lookup job | 20/min (expensive) |
| GET | `/api/lookup/{jobId}` | Get job status and results | 100/min (api-limit) |
| GET | `/api/lookup/services` | List available services | None |
| GET | `/health/ready` | Readiness health check | None |
| GET | `/health/live` | Liveness health check | None |
```
DistributedLookup/
├── src/
│   ├── Domain/                 # Core business logic (no dependencies)
│   │   └── Entities/           # LookupJob, ServiceResult, Enums
│   ├── Application/            # Use cases and orchestration
│   │   ├── UseCases/           # SubmitLookupJob, GetJobStatus
│   │   ├── Saga/               # LookupJobStateMachine
│   │   ├── Workers/            # LookupWorkerBase, IWorkerResultStore
│   │   └── Interfaces/         # IJobRepository
│   ├── Infrastructure/         # External concerns
│   │   └── Persistence/        # RedisJobRepository, RedisWorkerResultStore
│   ├── Contracts/              # Shared message types
│   │   ├── Commands/           # CheckGeoIP, CheckPing, etc.
│   │   ├── Events/             # JobSubmitted, TaskCompleted
│   │   └── ResultLocation.cs   # Polymorphic storage locations
│   ├── Api/                    # REST API
│   │   ├── Controllers/        # LookupController
│   │   ├── Program.cs          # DI configuration
│   │   └── Dockerfile
│   └── Workers/
│       ├── GeoWorker/          # GeoIP lookup worker
│       ├── PingWorker/         # Ping check worker
│       ├── RdapWorker/         # RDAP lookup worker
│       └── ReverseDnsWorker/   # Reverse DNS lookup worker
├── tests/
│   └── Tests/                  # Unit & integration tests
├── docker-compose.yml          # Multi-container setup
└── DistributedLookup.sln       # Solution file
```
Saga (Orchestration):

- ✅ Central visibility of workflow
- ✅ Easier to add compensation logic
- ✅ Simpler debugging

Choreography:

- ❌ Distributed state management
- ❌ Harder to track job progress

For MVP:

- ✅ Faster reads/writes (in-memory)
- ✅ Built-in TTL (auto-cleanup)
- ✅ Simpler deployment

For Production:

- Consider PostgreSQL for:
  - Durable storage
  - Complex queries
  - Audit trails

Polling:

- ✅ Simpler client implementation
- ✅ No connection management
- ✅ Works through firewalls/proxies

Push (WebSocket):

- Better UX
- Next step for enhancement
Before (each worker ~150 lines):

- ❌ Duplicated timing code
- ❌ Duplicated validation
- ❌ Duplicated persistence logic
- ❌ Duplicated error handling
- ❌ Inconsistent patterns

After (each worker ~30 lines):

- ✅ Single source of truth
- ✅ Guaranteed consistency
- ✅ Trivial to add new services
- ✅ 90% code reduction

Flexibility:

- Small results (< 1 KB) → Redis (fast)
- Medium results (1 KB - 1 MB) → Redis or S3
- Large results (> 1 MB) → S3 (cheap)
- Structured data → DynamoDB
- File uploads → Azure Blob Storage
No Worker Changes:
- Workers call `IWorkerResultStore.SaveResultAsync()`
- Saga stores `ResultLocation` in state
- Backend can be swapped without touching workers
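The size-based routing strategy could be as small as a helper like this. The thresholds mirror the flexibility table above, but the `ChooseBackend` name and its placement are illustrative, not project code:

```csharp
// Pick a storage backend from the serialized result size, per the table above.
// Thresholds: < 1 KB -> Redis; up to 1 MB -> Redis (or S3); larger -> S3.
static string ChooseBackend(int sizeBytes) => sizeBytes switch
{
    < 1_024      => "redis", // small: fast, in-memory
    <= 1_048_576 => "redis", // medium: Redis still fine (S3 also acceptable)
    _            => "s3"     // large: cheap, durable object storage
};

Console.WriteLine(ChooseBackend(512));        // small result
Console.WriteLine(ChooseBackend(5_000_000));  // large result
```

A concrete `IWorkerResultStore` implementation would call something like this before persisting, then return the matching `ResultLocation` subtype.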
- MassTransit Documentation
- Saga Pattern
- Clean Architecture
- CQRS Pattern
- Template Method Pattern
- Strategy Pattern
This project is created as a practical assignment to demonstrate distributed systems architecture, clean code principles, and production-ready .NET development.
Author's Note: This implementation prioritizes architectural clarity and demonstrable distributed computing concepts. The worker base class pattern and storage abstraction demonstrate how thoughtful design can dramatically reduce code duplication while improving extensibility. In a production environment, additional layers (authentication, comprehensive error handling, observability) would be essential. The focus here is on showing a solid foundation that can be extended incrementally.