Add AGENTS.md - AI agent architecture reference#21

Draft
Copilot wants to merge 2 commits into main from copilot/add-agents-onboarding-guide

Copilot AI commented Feb 13, 2026

AI agents working on DStream lack a consolidated architecture reference, requiring them to piece together context from code, WARP.md status updates, and scattered examples across 6 repositories.

What's Added

AGENTS.md (1,250 lines) - Single source of truth for DStream architecture:

  • Three-process orchestration model - ASCII diagram showing Input Provider → CLI → Output Provider data flow
  • Provider communication protocol - stdin/stdout JSON specification with command envelopes
  • Language-specific patterns - Complete examples for Go native and .NET SDK implementations
  • Repository ecosystem - Table linking to all 6 provider repos with roles and languages
  • Development patterns - 3 working code examples (polling input, streaming output, .NET async)
  • FAQ - 13 Q&A covering protocol, lifecycle, distribution, testing

Key Documentation

Provider communication protocol:

// First line: command envelope
{"command": "run", "config": {...}}

// Subsequent lines: data envelopes (continuous stream)
{"data": {...}, "metadata": {...}}
{"data": {...}, "metadata": {...}}

Critical distinctions highlighted:

  • Providers are long-running services (not one-shot scripts)
  • stdout = data flow only, stderr = all logging
  • Input providers support the run command; output providers support the full lifecycle

File Relationships

  • readme.md - User quick start
  • WARP.md - Project status and priorities
  • AGENTS.md - Architecture reference (new)

AGENTS.md complements the existing docs without duplicating them. All 17 links were verified, and all 52 code blocks were checked for syntax correctness.

Original prompt

Add AGENTS.md - AI Agent Onboarding Guide

Overview

Create a comprehensive onboarding document (AGENTS.md) for AI agents working on the DStream ecosystem. This file will serve as the definitive architecture and integration guide, complementing the existing WARP.md (which tracks project status).

Goals

  1. Provide complete context about DStream's architecture in one place
  2. Document the provider communication protocol (stdin/stdout with JSON)
  3. Explain the three-process orchestration model
  4. Link to all related repositories with clear roles
  5. Include code examples for both Go and .NET providers
  6. Create a reference that agents can use to understand the system in future sessions

File Location

AGENTS.md (root of repository, alongside WARP.md and readme.md)

Content Structure

1. Quick Overview

  • What DStream is (Terraform for data streaming)
  • Key concepts: HCL config, provider model, stdin/stdout, OCI distribution

2. Architecture Overview

  • Three-process model diagram (Input Provider → DStream CLI → Output Provider)
  • Repository table with links:
    • katasec/dstream - CLI orchestrator (Go)
    • katasec/dstream-dotnet-sdk - .NET SDK
    • katasec/dstream-ingester-mssql - SQL Server CDC provider (Go)
    • katasec/dstream-log-output-provider - Log output provider (Go)

3. Provider Contract

Communication Protocol

  • Configuration: First line from stdin (JSON config)

    {
      "db_connection_string": "server=localhost;...",
      "poll_interval": "5s",
      "tables": ["Persons", "Cars"]
    }
  • Data Flow: Continuous JSON envelopes via stdout

    {
      "data": {
        "table_name": "Persons",
        "change_type": "insert",
        "id": 123
      },
      "metadata": {
        "timestamp": "2025-09-28T20:00:00Z"
      }
    }
  • Logging: All logs to stderr (never stdout)

Provider Interfaces

Lifecycle: Long-Running Services

  • Critical distinction: Providers are persistent processes, not one-shot scripts
  • Read config once at startup, then loop indefinitely
  • Graceful shutdown on SIGINT/SIGTERM

.NET SDK Pattern (katasec/dstream-dotnet-sdk):

// Base class
public abstract class ProviderBase<TConfig>
{
    protected TConfig Config { get; private set; }
    protected IPluginContext Ctx { get; private set; }
    public void Initialize(TConfig config, IPluginContext ctx);
}

// Input provider
public interface IInputProvider : IProvider
{
    IAsyncEnumerable<Envelope> ReadAsync(IPluginContext ctx, CancellationToken ct);
}

// Output provider
public interface IOutputProvider : IProvider
{
    Task WriteAsync(IEnumerable<Envelope> batch, IPluginContext ctx, CancellationToken ct);
}

// Bootstrap
await StdioProviderHost.RunInputProviderAsync<MyProvider, MyConfig>();

Go Native Pattern (katasec/dstream-ingester-mssql):

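func main() {
    // readConfigFromStdin and pollForChanges are placeholders for provider-specific code.

    // 1. Read config from stdin (first line only)
    config := readConfigFromStdin()

    // 2. Set up the long-running service; ctx is cancelled on SIGINT/SIGTERM
    ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
    defer stop()

    // 3. Poll for changes continuously, writing one JSON envelope per line to stdout
    enc := json.NewEncoder(os.Stdout)
    for ctx.Err() == nil {
        envelope := pollForChanges(ctx, config)
        if err := enc.Encode(envelope); err != nil {
            log.Fatalf("write envelope: %v", err) // the log package writes to stderr
        }
        log.Println("Processed batch") // stderr
    }
}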

4. Provider Distribution (OCI)

  • Build and push to GHCR
  • Usage in dstream.hcl with provider_ref

5. Testing Providers

  • Local testing without CLI (pipe JSON to binary)
  • Integration testing with DStream CLI

6. Key Files Reference Table

| File | Purpose | Link |
| --- | --- | --- |
| WARP.md | Project status | [Link] |
| readme.md | User quick start | [Link] |
| dstream.hcl | Example config | [Link] |

Provider examples table with links to:

  • MSSQL CDC (Go input)
  • Log Output (Go output)
  • Counter (C# input)
  • Console (C# output)

7. Common Development Patterns

  • Full code examples for creating new input/output providers in Go
  • Show complete working examples with config parsing, data generation/consumption

8. FAQ for AI Agents

  • How do providers communicate?
  • Are providers one-shot or long-running?
  • Where should logs go?
  • How to test locally?
  • What's the data format?
  • How are providers distributed?
  • Can I write in any language?

9. Next Steps for New Sessions

  • Checklist of what to review when starting a new session

Implementation Notes

  • Use clear markdown formatting with code blocks
  • Include actual working code examples
  • Add links to all referenced repositories and files
  • Use tables for structured information
  • Include diagrams using ASCII/Unicode art where helpful
  • Keep tone helpful and educational

Success Criteria

  • An AI agent can read this file and understand:
    • The complete DStream architecture
    • How to create a new provider
    • How providers communicate
    • Where to find examples in each language
    • How to test their work
  • File serves as single source of truth for architecture
  • Links are correct and functional
  • Code examples are accurate and test...

This pull request was created from Copilot chat.


Co-authored-by: writeameer <221149+writeameer@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add AGENTS.md - AI agent onboarding guide" to "Add AGENTS.md - AI agent architecture reference" on Feb 13, 2026
Copilot AI requested a review from writeameer on February 13, 2026, 05:51