Venkat-Gorla/cloudchat-v2

💬 cloudchat-v2

Next-generation messaging backend: DynamoDB, Serverless, and Python

cloudchat-v2 is the production-ready successor to cloudchat-lite. It implements a modern messaging backend, inspired by WhatsApp-scale design principles.

This repository focuses exclusively on the backend, with a placeholder for a future React client.

✨ Why this exists

cloudchat-lite (v1) was a solid first iteration with a clean single-table DynamoDB design. As the project evolved and access patterns were examined more deeply, two schema issues emerged that would limit correctness and scalability in a real messaging workload.

This repository (cloudchat-v2) exists to correct those specific issues while preserving the original design philosophy.

🔮 v1 → v2 Schema Evolution

  • Message ordering: v1 relied on timestamp-based sort keys, which can collide when two messages are written at the same instant. v2 uses ULID sort keys, which combine a millisecond timestamp with random bits, so keys are unique in practice and sort in time order.

  • Inbox ordering: v1's GSI design prevented server-side sorting and pagination. v2 introduces a time-ordered inbox GSI.

v2 preserves the original single-table design and access philosophy. It is a targeted schema correction, not a rewrite.
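
To make the ULID ordering property concrete, here is a minimal, standard-library sketch of ULID encoding. It is not the repository's implementation (which may well use a ULID library); the helper names `encode_ulid`, `new_ulid`, and `ulid_timestamp_ms` are hypothetical. It shows why lexicographic order on the key matches timestamp order.

```python
import os
import time

# Crockford base32 alphabet used by the ULID spec (no I, L, O, U).
# The characters are in ascending ASCII order, so string order matches numeric order.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode_ulid(timestamp_ms: int, randomness: bytes) -> str:
    """Encode a 48-bit millisecond timestamp plus 80 random bits as 26 characters."""
    if not 0 <= timestamp_ms < 2**48:
        raise ValueError("timestamp must fit in 48 bits")
    if len(randomness) != 10:
        raise ValueError("randomness must be exactly 10 bytes")
    value = (timestamp_ms << 80) | int.from_bytes(randomness, "big")
    chars = []
    for _ in range(26):  # 26 characters x 5 bits covers the 128-bit value
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


def new_ulid() -> str:
    """Generate a ULID for the current wall-clock time (hypothetical helper)."""
    return encode_ulid(time.time_ns() // 1_000_000, os.urandom(10))


def ulid_timestamp_ms(ulid: str) -> int:
    """Recover the millisecond timestamp from the first 10 characters."""
    value = 0
    for ch in ulid[:10]:
        value = (value << 5) | CROCKFORD.index(ch)
    return value
```

Two IDs created in different milliseconds always sort in creation order. IDs created within the same millisecond are ordered by their random bits unless a monotonic generator is used, so uniqueness there rests on the 80 random bits rather than on a hard guarantee.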

📌 For the full breakdown, see docs/schema-evolution.md

๐Ÿ” Language Transition (v1 โ†’ v2)

While cloudchat-lite (v1) was implemented in Node.js, cloudchat-v2 was intentionally rebuilt in Python from the ground up.

Rather than reusing existing code, the v2 backend was rewritten entirely to:

  • Practice production-grade async Python
  • Explore Python's ecosystem for serverless and data modeling
  • Validate that the original design principles translate cleanly across languages

Only the architecture and access patterns were carried forward. All implementation details were rethought and reimplemented.

🚀 Core Capabilities

Message Write Path

  • ULID-ordered message writes (collision-resistant)
  • Atomic metadata updates for all participants
  • Correct inbox ordering guaranteed
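
One way to get atomic metadata updates is a single DynamoDB `TransactWriteItems` call that writes the message row and updates every participant's conversation metadata together. The sketch below only builds the request payload. The key layout (`PK`, `SK`, `CONV#`, `MSG#`, `USER#`) and attribute names are assumptions for illustration, not the repository's actual schema.

```python
MAX_TRANSACTION_ACTIONS = 100  # DynamoDB limit on actions per TransactWriteItems call


def build_send_message_transaction(
    table: str,
    conversation_id: str,
    message_id: str,
    sender_id: str,
    body: str,
    sent_at: str,
    participants: list[str],
) -> list[dict]:
    """Build TransactItems: one message Put plus one metadata Update per participant."""
    items = [
        {
            "Put": {
                "TableName": table,
                "Item": {
                    "PK": {"S": f"CONV#{conversation_id}"},
                    "SK": {"S": f"MSG#{message_id}"},
                    "sender_id": {"S": sender_id},
                    "body": {"S": body},
                },
                # Refuse to overwrite an existing message with the same key.
                "ConditionExpression": "attribute_not_exists(PK)",
            }
        }
    ]
    for user_id in participants:
        items.append(
            {
                "Update": {
                    "TableName": table,
                    "Key": {
                        "PK": {"S": f"USER#{user_id}"},
                        "SK": {"S": f"CONV#{conversation_id}"},
                    },
                    # inbox_sk is a hypothetical attribute backing the inbox GSI sort key.
                    "UpdateExpression": "SET last_message_id = :mid, inbox_sk = :isk",
                    "ExpressionAttributeValues": {
                        ":mid": {"S": message_id},
                        ":isk": {"S": f"{sent_at}#{conversation_id}"},
                    },
                }
            }
        )
    if len(items) > MAX_TRANSACTION_ACTIONS:
        raise ValueError("too many participants for a single transaction")
    return items
```

With an aioboto3 DynamoDB client, the result would be passed as `await client.transact_write_items(TransactItems=items)`. Either every row is written or none is, so the inbox never points at a message that was not persisted.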

Inbox (List Conversations)

  • GSI sorted by <LastTimestamp>#<conversation_id>
  • Server-side ordering and pagination
  • Stable under concurrent message writes
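
A composite sort key of the form `<LastTimestamp>#<conversation_id>` only sorts by recency if the timestamp part is fixed-width. The sketch below assumes a UTC ISO-8601 timestamp with microseconds; the actual timestamp format used by the repository is not stated here, and `inbox_sort_key` is a hypothetical helper.

```python
from datetime import datetime, timezone


def inbox_sort_key(last_timestamp: datetime, conversation_id: str) -> str:
    """Build '<LastTimestamp>#<conversation_id>' with a fixed-width UTC timestamp."""
    if last_timestamp.tzinfo is None:
        raise ValueError("last_timestamp must be timezone-aware")
    # Fixed-width formatting makes lexicographic order equal chronological order.
    ts = last_timestamp.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")
    return f"{ts}Z#{conversation_id}"
```

Querying the GSI with `ScanIndexForward=False` then returns the most recently active conversations first. The conversation ID suffix keeps keys distinct when two conversations share the same last timestamp.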

Conversation History

  • Query by conversation ID
  • ULID-ordered messages
  • Efficient pagination

Real-Time Delivery

  • Online fanout via WebSockets
  • Presence modeled as TTL-based soft state
  • Decoupled from message persistence

๐Ÿ— Tech Stack

Technology Usage
Python 3.13+ Core application language
AWS Lambda Serverless compute runtime
AWS API Gateway HTTP + WebSocket entry point
WebSockets (APIGW) Real-time message delivery
DynamoDB (single table) Messages, metadata, and presence (system of record)
DynamoDB Local Local validation of single-table design and read receipts POC
DynamoDB Streams Change data capture for real-time fanout
SNS Decoupled event distribution for message delivery
aioboto3 Async DynamoDB client
Pydantic v2 Data validation and serialization
pytest Testing framework
UV Dependency and environment management
AWS SAM Infrastructure-as-code, build, and deployment
IAM Least-privilege access control between services

🧱 Architecture Overview

Serverless Backend

Client (future React)
      |
API Gateway (HTTP)
      |
Lambda Handlers
      |
DynamoDB (Single Table)
   ├─ Message rows
   └─ Per-user conversation metadata for n-way chats
        ↳ Inbox GSI (sorted by recency)

Design Highlights

  • Fully async Python handlers
  • DynamoDB schema designed for high write throughput

⚡ Real-Time Messaging Architecture

The project implements two real-time delivery pipelines to demonstrate scalable, decoupled fanout patterns.

Option 1: Direct DDB Stream Fanout

  • DynamoDB Streams trigger a Lambda on new messages

  • Lambda:

    • Hydrates message from DynamoDB
    • Resolves online recipients via Presence table
    • Sends directly over WebSocket
  • Simple, low latency, tightly coupled

Option 2: SNS-Based Fanout (Current)

  • DDB Stream Lambda publishes MessageSent events to SNS

  • SNS consumer Lambda:

    • Hydrates message from DynamoDB
    • Resolves online recipients via Presence table
    • Fans out to active WebSocket connections

Benefits

  • Independent scaling
  • Failure isolation and retries
  • Extensible to analytics, notifications, moderation

Why SNS?

  • Clean separation of persistence vs delivery
  • Stream processing never blocks on fanout
  • Proven production chat pattern

┌────────┐   ┌──────────────┐   ┌────────────────┐   ┌───────────────────┐   ┌──────────────────┐
│ Client │ → │ API / Lambda │ → │ DDB (Messages) │ → │ DDB Stream Lambda │ → │      SNS         │
│        │   │(send message)│   │                │   │  (publish event)  │   │ MessageSentTopic │
└────────┘   └──────────────┘   └────────────────┘   └───────────────────┘   └──────────────────┘
┌────────────────────────────────┐   ┌────────────────────────┐   ┌──────────────────┐
│     SNS Consumer Lambda        │ → │     WebSocket API      │ → │  Online Clients  │
│  - hydrate message from DDB    │   │                        │   │                  │
│  - lookup presence             │   └────────────────────────┘   └──────────────────┘
│  - fanout to connections       │
└────────────────────────────────┘
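
The stream-to-SNS step in this pipeline could look roughly like the sketch below. It turns DynamoDB Stream records into small `MessageSent` events that carry only identifiers, leaving hydration to the SNS consumer. The key names and prefixes and the event shape are assumptions, and it presumes the stream view type includes the new image.

```python
import json


def message_sent_events(stream_event: dict) -> list[dict]:
    """Extract MessageSent events from a DynamoDB Streams Lambda payload."""
    events = []
    for record in stream_event.get("Records", []):
        if record.get("eventName") != "INSERT":
            continue  # only newly written rows represent sent messages
        image = record.get("dynamodb", {}).get("NewImage", {})
        sort_key = image.get("SK", {}).get("S", "")
        if not sort_key.startswith("MSG#"):
            continue  # skip metadata rows stored in the same table
        events.append(
            {
                "type": "MessageSent",
                "conversation_id": image["PK"]["S"].removeprefix("CONV#"),
                "message_id": sort_key.removeprefix("MSG#"),
            }
        )
    return events


def to_sns_publish_kwargs(topic_arn: str, event: dict) -> dict:
    """Build arguments for an SNS publish call for one event."""
    return {"TopicArn": topic_arn, "Message": json.dumps(event)}
```

Each result of `to_sns_publish_kwargs` can be passed to an SNS client's `publish(**kwargs)`. Because the stream Lambda only publishes, it returns quickly and slow WebSocket delivery is handled downstream.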

🔌 WebSocket Presence & Connection Lifecycle

Presence Refresh

        ┌─────────┐
        │  Client │
        └────┬────┘
             │  connect / heartbeat
             v
   ┌──────────────────┐
   │  WebSocket API   │
   └─────────┬────────┘
             │
             │ refresh ttl
             v
 ┌────────────────────────┐
 │ DDB (Presence / TTL)   │
 │  ttl = now + interval  │
 └────────────────────────┘

Presence Leverage

 ┌────────────────────────┐
 │   Fanout / Send Logic  │
 └──────────┬─────────────┘
            │
            │ check ttl > now
            v
  ┌───────────────────┐
  │   WebSocket API   │
  └─────────┬─────────┘
            │
            │ 410 Gone
            v
  ┌───────────────────────────┐
  │    Connection Cleanup     │
  │ (defensive / future hook) │
  └───────────────────────────┘

IAM permissions are tightly scoped so Lambdas can only be invoked by the specific WebSocket API routes.
WebSockets are used for delivery only; DynamoDB TTL-based presence is the source of truth.

📦 Quick Start (Optional)

cd backend
uv sync
sam build
sam deploy --guided

This provisions:

  • Lambda functions
  • DynamoDB message table (single-table design)
  • DynamoDB presence table (TTL-based)
  • Inbox GSI (sorted by recency)
  • API Gateway (HTTP + WebSocket)
  • IAM roles and permissions

📌 AWS credentials and SAM CLI required.

🧪 Testing

The backend includes a focused test suite covering core correctness and edge cases:

  • Unit tests

    • DynamoDB key construction and schema models
    • ULID ordering and timestamp extraction
    • Message fanout recipient parsing
  • Integration tests

    • TTL-based presence behavior

Tests are written using pytest with strict async validation.

cd backend
uv run pytest -q tests
.........................  [100%]
25 passed in 6.60s

✅ AWS Integration Validation

Manual AWS test scripts were used to validate live DynamoDB access patterns, Lambda execution, inbox ordering, and real-time fanout behavior against deployed infrastructure.

These scripts live under:

backend/aws_tests/

⚡ Real-Time Messaging (Live Demo)

The system supports live WebSocket message delivery to online users, driven by DynamoDB Streams and SNS-based fanout.

Messages are delivered asynchronously; ordering is preserved at the data layer and resolved client-side.

Real-Time Messaging Demo

๐Ÿท๏ธ License

MIT โ€” free for personal and professional use.

About

Production-grade distributed messaging backend demonstrating real-world systems design. Built with Python and AWS using DynamoDB single-table modeling, ULID-based ordering, WebSockets, and decoupled fanout via Streams and SNS.
