Add AGENTS.md - AI agent architecture reference by Copilot · Pull Request #21 · katasec/dstream

Copilot · 2026-02-13T05:44:40Z

AI agents working on DStream lack a consolidated architecture reference, requiring them to piece together context from code, WARP.md status updates, and scattered examples across 6 repositories.

What's Added

AGENTS.md (1,250 lines) - Single source of truth for DStream architecture:

Three-process orchestration model - ASCII diagram showing Input Provider → CLI → Output Provider data flow
Provider communication protocol - stdin/stdout JSON specification with command envelopes
Language-specific patterns - Complete examples for Go native and .NET SDK implementations
Repository ecosystem - Table linking to all 6 provider repos with roles and languages
Development patterns - 3 working code examples (polling input, streaming output, .NET async)
FAQ - 13 Q&A covering protocol, lifecycle, distribution, testing

Key Documentation

Provider communication protocol:

// First line: command envelope
{"command": "run", "config": {...}}

// Subsequent lines: data envelopes (continuous stream)
{"data": {...}, "metadata": {...}}
{"data": {...}, "metadata": {...}}

Critical distinctions highlighted:

Providers are long-running services (not one-shot scripts)
stdout = data flow only, stderr = all logging
Input providers support run; output providers support the full lifecycle

File Relationships

readme.md - User quick start
WARP.md - Project status and priorities
AGENTS.md - Architecture reference (new)

Complements existing docs without duplication. All 17 links verified, 52 code blocks tested for syntax correctness.

Original prompt

Add AGENTS.md - AI Agent Onboarding Guide

Overview

Create a comprehensive onboarding document (AGENTS.md) for AI agents working on the DStream ecosystem. This file will serve as the definitive architecture and integration guide, complementing the existing WARP.md (which tracks project status).

Goals

Provide complete context about DStream's architecture in one place
Document the provider communication protocol (stdin/stdout with JSON)
Explain the three-process orchestration model
Link to all related repositories with clear roles
Include code examples for both Go and .NET providers
Create a reference that agents can use to understand the system in future sessions

File Location

AGENTS.md (root of repository, alongside WARP.md and readme.md)

Content Structure

1. Quick Overview

What DStream is (Terraform for data streaming)
Key concepts: HCL config, provider model, stdin/stdout, OCI distribution

2. Architecture Overview

Three-process model diagram (Input Provider → DStream CLI → Output Provider)
Repository table with links:
- katasec/dstream - CLI orchestrator (Go)
- katasec/dstream-dotnet-sdk - .NET SDK
- katasec/dstream-ingester-mssql - SQL Server CDC provider (Go)
- katasec/dstream-log-output-provider - Log output provider (Go)

3. Provider Contract

Communication Protocol

Configuration: First line from stdin (JSON config)

{
  "db_connection_string": "server=localhost;...",
  "poll_interval": "5s",
  "tables": ["Persons", "Cars"]
}

Data Flow: Continuous JSON envelopes via stdout

{
  "data": {
    "table_name": "Persons",
    "change_type": "insert",
    "id": 123
  },
  "metadata": {
    "timestamp": "2025-09-28T20:00:00Z"
  }
}

Logging: All logs to stderr (never stdout)

Provider Interfaces

Lifecycle: Long-Running Services

Critical distinction: Providers are persistent processes, not one-shot scripts
Read config once at startup, then loop indefinitely
Graceful shutdown on SIGINT/SIGTERM

.NET SDK Pattern (katasec/dstream-dotnet-sdk):

// Base class
public abstract class ProviderBase<TConfig>
{
    protected TConfig Config { get; private set; }
    protected IPluginContext Ctx { get; private set; }
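    // Body not shown in the source; assumed here to store its arguments
    public void Initialize(TConfig config, IPluginContext ctx)
    {
        Config = config;
        Ctx = ctx;
    }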
}

// Input provider
public interface IInputProvider : IProvider
{
    IAsyncEnumerable<Envelope> ReadAsync(IPluginContext ctx, CancellationToken ct);
}

// Output provider
public interface IOutputProvider : IProvider
{
    Task WriteAsync(IEnumerable<Envelope> batch, IPluginContext ctx, CancellationToken ct);
}

// Bootstrap
await StdioProviderHost.RunInputProviderAsync<MyProvider, MyConfig>();

Go Native Pattern (katasec/dstream-ingester-mssql):

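// Needs: context, encoding/json, log, os, os/signal, syscall.
// readConfigFromStdin and pollForChanges are provider-specific helpers (not shown).
func main() {
    // 1. Read config from stdin (first line only)
    config := readConfigFromStdin()

    // 2. Set up the long-running service; cancel on SIGINT/SIGTERM
    ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
    defer stop()

    // 3. Generate/poll data continuously
    enc := json.NewEncoder(os.Stdout) // stdout carries envelopes only
    for ctx.Err() == nil {
        envelope := pollForChanges(ctx, config)
        if err := enc.Encode(envelope); err != nil {
            log.Fatalf("write envelope: %v", err) // stderr
        }
        log.Println("Processed batch") // stderr
    }
}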

4. Provider Distribution (OCI)

Build and push to GHCR
Usage in dstream.hcl with provider_ref

5. Testing Providers

Local testing without CLI (pipe JSON to binary)
Integration testing with DStream CLI

6. Key Files Reference Table

| File | Purpose | Link |
| --- | --- | --- |
| `WARP.md` | Project status | [Link] |
| `readme.md` | User quick start | [Link] |
| `dstream.hcl` | Example config | [Link] |

Provider examples table with links to:

MSSQL CDC (Go input)
Log Output (Go output)
Counter (C# input)
Console (C# output)

7. Common Development Patterns

Full code examples for creating new input/output providers in Go
Show complete working examples with config parsing, data generation/consumption

8. FAQ for AI Agents

How do providers communicate?
Are providers one-shot or long-running?
Where should logs go?
How to test locally?
What's the data format?
How are providers distributed?
Can I write in any language?

9. Next Steps for New Sessions

Checklist of what to review when starting a new session

Implementation Notes

Use clear markdown formatting with code blocks
Include actual working code examples
Add links to all referenced repositories and files
Use tables for structured information
Include diagrams using ASCII/Unicode art where helpful
Keep tone helpful and educational

Success Criteria

An AI agent can read this file and understand:
- The complete DStream architecture
- How to create a new provider
- How providers communicate
- Where to find examples in each language
- How to test their work
File serves as single source of truth for architecture
Links are correct and functional
Code examples are accurate and test...

This pull request was created from Copilot chat.


Co-authored-by: writeameer <221149+writeameer@users.noreply.github.com>

Initial plan

d0a1b43

Copilot AI assigned Copilot and writeameer Feb 13, 2026

Copilot started work on behalf of writeameer February 13, 2026 05:45

Add comprehensive AGENTS.md onboarding guide

0072977

Co-authored-by: writeameer <221149+writeameer@users.noreply.github.com>

Copilot AI changed the title from "[WIP] Add AGENTS.md - AI agent onboarding guide" to "Add AGENTS.md - AI agent architecture reference" Feb 13, 2026

Copilot AI requested a review from writeameer February 13, 2026 05:51

Copilot finished work on behalf of writeameer February 13, 2026 05:51

writeameer approved these changes Feb 13, 2026

Copilot wants to merge 2 commits into main from copilot/add-agents-onboarding-guide
