Azure multimodal compliance ingestion engine

Brand Guardian AI is an LLM-powered compliance auditing system for marketing videos. It ingests a YouTube video, extracts speech/on-screen text, retrieves relevant policy guidance from a vector knowledge base, and returns a structured compliance report (PASS/FAIL, violations, and summary).

Current scope: This repo is an end-to-end backend prototype with CLI + FastAPI entrypoints and Azure-first integrations (Video Indexer, OpenAI, AI Search, App Insights).

1) Problem this project solves

Manual ad compliance review is expensive and inconsistent when teams ship content at high volume. This project automates the first-pass review by combining:

- Evidence extraction from videos (transcript + OCR),
- Rule-grounded retrieval from your policy PDFs,
- Deterministic orchestration (LangGraph),
- Structured machine-readable output for downstream workflows.

This design reduces “hallucinated policy advice” risk versus plain prompting because retrieval injects organization-specific policy context at runtime.

2) System architecture (high level)

flowchart TD
    Client[CLI / FastAPI client] --> Graph[LangGraph workflow]

    subgraph Runtime Audit Flow
        Graph --> IndexNode[index_video_node]
        IndexNode --> VIService[VideoIndexerService]
        VIService --> YTDLP[yt-dlp download]
        VIService --> AzureVI[Azure Video Indexer]
        AzureVI --> Extracted[Transcript + OCR + Metadata]

        Extracted --> AuditNode[audit_content_node]
        AuditNode --> Search[Azure AI Search vector retrieval]
        AuditNode --> ChatLLM[Azure OpenAI chat model]
        Search --> PromptRules[Retrieved policy chunks]
        PromptRules --> ChatLLM
        ChatLLM --> Output[JSON verdict + findings]
    end

    Output --> Response[API/CLI response]

Knowledge ingestion architecture (offline / scheduled)

flowchart LR
    PDFs[Policy PDFs: backend/data] --> IngestScript[index_documents.py]
    IngestScript --> Splitter[RecursiveCharacterTextSplitter\nchunk_size=1000, overlap=200]
    Splitter --> Embeddings[Azure OpenAI Embeddings]
    Embeddings --> VectorIndex[Azure AI Search Index]
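
The splitter settings in the diagram (chunk_size=1000, overlap=200) mean that consecutive chunks can share up to 200 characters, so a rule that straddles a chunk boundary still appears whole in at least one chunk. A minimal sketch of that windowing idea follows. It uses a plain fixed-size character window, not RecursiveCharacterTextSplitter itself (which also prefers to break on separators such as paragraph breaks), and the function name `chunk_text` is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Simplified stand-in for illustration only: the real splitter also
    tries to break on natural separators, which this sketch ignores.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # each new chunk starts this many chars later
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks
```

With the default settings, each chunk begins 800 characters after the previous one, so the last 200 characters of a full chunk are the first 200 of the next.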

Existing image artifact (from repo): Project2_Langgraph_Architecture.png

3) Execution flow in detail

Step A — Entry points

You can start audits via:

CLI simulation (main.py)
HTTP API (POST /audit in backend/src/api/server.py)

Both build initial graph state:

video_url
video_id
compliance_results=[]
errors=[]
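
As a rough illustration of the initial state listed above, the sketch below builds that dictionary for one run. The helper name `build_initial_state` is hypothetical, and the ID format is an assumption taken from the `vid_<8chars>` shape in the API response; the actual code in main.py and server.py may differ.

```python
import uuid


def build_initial_state(video_url: str) -> dict:
    """Return the starting graph state for one audit run (illustrative)."""
    return {
        "video_url": video_url,
        # Assumed ID format: "vid_" plus 8 hex characters.
        "video_id": f"vid_{uuid.uuid4().hex[:8]}",
        # Fresh lists per run so separate audits never share state.
        "compliance_results": [],
        "errors": [],
    }
```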

Step B — Node 1: Index video (`index_video_node`)

index_video_node orchestrates media ingestion and extraction:

- Verifies URL shape (YouTube domain expected).
- Downloads video locally using yt-dlp.
- Uploads local file to Azure Video Indexer.
- Polls indexing status until processed.
- Extracts:
  - transcript text,
  - OCR text lines,
  - basic metadata.

On error, the node sets final_status=FAIL and appends diagnostic text to errors.
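
The URL check and the error path can be sketched roughly as follows. This is not the repository's implementation: the host allow-list, the helper names `is_youtube_url` and `guard_video_url`, and the error message wording are all assumptions made for illustration.

```python
from urllib.parse import urlparse

# Assumed allow-list; the project's actual check may accept other hosts.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def is_youtube_url(url: str) -> bool:
    """Return True if the URL is http(s) and its host is a known YouTube domain."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    return parsed.scheme in {"http", "https"} and host in YOUTUBE_HOSTS


def guard_video_url(state: dict) -> dict:
    """Return a state update: empty if the URL looks valid, FAIL plus a diagnostic otherwise."""
    url = state.get("video_url", "")
    if is_youtube_url(url):
        return {}
    return {
        "final_status": "FAIL",
        # Build a new list instead of mutating the incoming state.
        "errors": list(state.get("errors", [])) + [f"Unsupported video URL: {url!r}"],
    }
```

Matching on the parsed hostname rather than on a substring avoids accepting URLs such as `https://example.com/youtube.com`.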

Step C — Node 2: Audit content (`audit_content_node`)

audit_content_node performs RAG + reasoning:

- If the transcript is missing, returns a fail-fast response.
- Initializes:
  - Azure chat LLM,
  - Azure embeddings,
  - Azure Search vector store.
- Builds a query from transcript + OCR.
- Retrieves the top-k policy chunks from the vector index.
- Injects the retrieved rules into a strict JSON prompt.
- Parses the model output into:
  - compliance_results
  - final_status
  - final_report

It also strips markdown code fences if the model wraps JSON in ```json blocks.
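
The fence-stripping and parsing step might look something like the sketch below. The function names are hypothetical, and the JSON keys read from the model output (`status`, `final_report`, `compliance_results`) are assumed to mirror the API response shape; the real prompt contract lives in nodes.py and may use different keys.

```python
import json

FENCE = "`" * 3  # three backticks, built this way to keep the example easy to embed


def strip_code_fence(raw: str) -> str:
    """Remove a surrounding markdown code fence (with optional language tag), if present."""
    text = raw.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence line, e.g. the json-tagged one
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line
        text = "\n".join(lines).strip()
    return text


def parse_audit_output(raw: str) -> dict:
    """Parse model output into the three state fields the audit node fills in."""
    data = json.loads(strip_code_fence(raw))
    return {
        "compliance_results": data.get("compliance_results", []),
        # Assumption: default to FAIL when the model omits a verdict.
        "final_status": data.get("status", "FAIL"),
        "final_report": data.get("final_report", ""),
    }
```

Input that is already bare JSON passes through `strip_code_fence` unchanged, so the same parser handles both cases.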

Step D — Response

The graph returns final state consumed by:

CLI pretty-print report,
FastAPI AuditResponse payload.

4) Core architecture decisions

LangGraph over ad-hoc script chaining: clearer stateful DAG + easier extensibility.
Azure Video Indexer for extraction: offloads speech/OCR complexity.
Azure AI Search as retriever: scalable vector retrieval over policy corpus.
Strict JSON target schema: supports downstream automation and deterministic UI rendering.
Telemetry hook: optional Azure Monitor instrumentation for production visibility.

5) Repository map

.
├── README.md
├── main.py
├── pyproject.toml
├── Project2_Langgraph_Architecture.png
├── docs/
│   ├── youtube-ad-specs.pdf
│   └── 1001a-influencer-guide-508_1.pdf
├── backend/
│   ├── Dockerfile
│   ├── data/
│   │   ├── youtube-ad-specs.pdf
│   │   └── 1001a-influencer-guide-508_1.pdf
│   ├── scripts/
│   │   ├── index_documents.py
│   │   └── explanation.txt
│   └── src/
│       ├── api/
│       │   ├── server.py
│       │   └── telemetry.py
│       ├── graph/
│       │   ├── state.py
│       │   ├── nodes.py
│       │   └── workflow.py
│       └── services/
│           └── video_indexer.py
└── azure_functions/
    ├── function_app.py        # currently empty scaffold
    ├── host.json              # currently empty scaffold
    ├── local.settings.json    # currently empty scaffold
    └── requirements.txt       # currently empty scaffold

6) Data model contracts

Graph state (`VideoAuditState`)

Inputs: video_url, video_id
Extraction: local_file_path?, video_metadata, transcript?, ocr_text[]
Outputs: compliance_results[], final_status, final_report
Diagnostics: errors[]
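
For orientation, the state contract above could be written as a `TypedDict` along these lines. The field names come from the list above; the field types and the use of `total=False` are assumptions, and the authoritative definition is the one in backend/src/graph/state.py.

```python
from typing import Any, Optional, TypedDict


class VideoAuditState(TypedDict, total=False):
    """Sketch of the graph state; total=False lets nodes fill in keys incrementally."""

    # Inputs
    video_url: str
    video_id: str
    # Extraction
    local_file_path: Optional[str]
    video_metadata: dict[str, Any]
    transcript: Optional[str]
    ocr_text: list[str]
    # Outputs
    compliance_results: list[dict[str, Any]]
    final_status: str
    final_report: str
    # Diagnostics
    errors: list[str]
```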

Violation object (`ComplianceIssue`)

category (e.g., disclosure, claim validation)
description
severity (e.g., CRITICAL/WARNING)
timestamp (optional)
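
One way to express this contract in code is a small dataclass that rejects unexpected severities at construction time. This is only a sketch: the source lists CRITICAL and WARNING as examples, so treating them as the complete set is an assumption, as is the string type chosen for `timestamp`.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed severity vocabulary; extend it if the policy set uses other levels.
ALLOWED_SEVERITIES = {"CRITICAL", "WARNING"}


@dataclass(frozen=True)
class ComplianceIssue:
    category: str
    description: str
    severity: str
    timestamp: Optional[str] = None  # optional, e.g. a position in the video

    def __post_init__(self) -> None:
        # Fail early on values outside the assumed vocabulary.
        if self.severity not in ALLOWED_SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")
```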

7) API specification

`POST /audit`

Request

{
  "video_url": "https://youtu.be/dT7S75eYhcQ"
}

Response (shape)

{
  "session_id": "<uuid>",
  "video_id": "vid_<8chars>",
  "status": "PASS|FAIL",
  "final_report": "Natural-language summary",
  "compliance_results": [
    {
      "category": "Claim Validation",
      "severity": "CRITICAL",
      "description": "Explanation of violation"
    }
  ]
}
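
A client call can be sketched with the standard library alone. The helper names below are hypothetical, and the base URL assumes the local server described in the setup section; the sketch builds the request and summarizes a response payload without sending anything.

```python
import json
import urllib.request


def build_audit_request(base_url: str, video_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST /audit request with a JSON body."""
    body = json.dumps({"video_url": video_url}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/audit",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize_audit_response(payload: dict) -> str:
    """Produce a one-line summary from a response with the shape shown above."""
    issues = payload.get("compliance_results", [])
    return f"{payload['video_id']}: {payload['status']} ({len(issues)} issue(s))"
```

With the API server running, passing the request to `urllib.request.urlopen` and decoding the body with `json.load` would give a payload that `summarize_audit_response` can consume.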

`GET /health`

Returns basic liveness payload:

{ "status": "healthy", "service": "Brand Guardian AI" }

8) Configuration

Create a .env file at the repository root:

# Azure OpenAI
AZURE_OPENAI_ENDPOINT=
AZURE_OPENAI_API_KEY=
AZURE_OPENAI_API_VERSION=2024-02-01
AZURE_OPENAI_CHAT_DEPLOYMENT=
AZURE_OPENAI_EMBEDDING_DEPLOYMENT=text-embedding-3-small

# Azure AI Search
AZURE_SEARCH_ENDPOINT=
AZURE_SEARCH_API_KEY=
AZURE_SEARCH_INDEX_NAME=

# Azure Video Indexer
AZURE_VI_ACCOUNT_ID=
AZURE_VI_LOCATION=
AZURE_SUBSCRIPTION_ID=
AZURE_RESOURCE_GROUP=
AZURE_VI_NAME=project-brand-guardian-001

# Optional telemetry
APPLICATIONINSIGHTS_CONNECTION_STRING=
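
Because several of these values are blank in the template, a startup check that lists unset variables can save a failed run. The sketch below is an illustration, not part of the repo: the helper name is hypothetical, it assumes every variable except the telemetry one is required, and it assumes the .env file has already been loaded into the process environment.

```python
import os
from typing import Mapping

# Assumption: everything except the optional telemetry variable is required.
REQUIRED_VARS = (
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_API_VERSION",
    "AZURE_OPENAI_CHAT_DEPLOYMENT",
    "AZURE_OPENAI_EMBEDDING_DEPLOYMENT",
    "AZURE_SEARCH_ENDPOINT",
    "AZURE_SEARCH_API_KEY",
    "AZURE_SEARCH_INDEX_NAME",
    "AZURE_VI_ACCOUNT_ID",
    "AZURE_VI_LOCATION",
    "AZURE_SUBSCRIPTION_ID",
    "AZURE_RESOURCE_GROUP",
    "AZURE_VI_NAME",
)


def missing_settings(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required variables that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_settings()` before building any Azure client and aborting on a non-empty result gives one clear error instead of a failure deep inside a node.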

9) Local setup and run

Prerequisites

Python 3.12+
Azure resources configured and reachable
Network access to YouTube + Azure endpoints

Install

uv sync

(Alternative)

pip install -e .

Build the policy vector index

uv run python backend/scripts/index_documents.py

Run CLI audit

uv run python main.py

Run API server

uv run uvicorn backend.src.api.server:app --reload

Then visit:

http://localhost:8000/docs
http://localhost:8000/health

10) Observability and ops notes

Logging is enabled across API, graph nodes, and indexing script.
If APPLICATIONINSIGHTS_CONNECTION_STRING is set, telemetry is auto-instrumented via Azure Monitor OpenTelemetry.
Current Video Indexer polling uses fixed 30s intervals and no max timeout; add bounded retries for production hardening.
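
A bounded polling loop with exponential backoff could look like the sketch below. It is not the current implementation: `get_state` is a hypothetical callable standing in for the Video Indexer status request, the `"Processed"` and `"Failed"` state strings are assumptions about what that call returns, and the default timings are placeholders.

```python
import time
from typing import Callable


def poll_until_processed(
    get_state: Callable[[], str],
    *,
    timeout_s: float = 1800.0,
    initial_delay_s: float = 5.0,
    max_delay_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_state() until it reports success, failure, or the deadline passes."""
    deadline = clock() + timeout_s
    delay = initial_delay_s
    while True:
        state = get_state()
        if state == "Processed":  # assumed success value
            return state
        if state == "Failed":  # assumed failure value
            raise RuntimeError("indexing job reported failure")
        remaining = deadline - clock()
        if remaining <= 0:
            raise TimeoutError(f"indexing not finished after {timeout_s:.0f}s")
        sleep(min(delay, remaining))  # never sleep past the deadline
        delay = min(delay * 2, max_delay_s)  # double the wait, up to a cap
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting, which also fits the suggestion to unit-test nodes against mocked Azure clients.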

11) Current limitations

The azure_functions/ folder is scaffold-only (empty implementation files).
No committed automated tests yet.
Output schema validation is prompt-enforced; adding strict post-parse validation would improve reliability.
The API path currently invokes the graph synchronously (invoke), which can block worker threads during long video jobs.
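
As a lightweight mitigation for the blocking call, a synchronous function can be pushed onto a worker thread from async code with `asyncio.to_thread`, which keeps the event loop free to serve other requests. The sketch below uses a stand-in function, `run_audit_blocking`, in place of the real graph invocation; all names are hypothetical. It still occupies a thread per audit, so it is a stopgap rather than a replacement for a job queue.

```python
import asyncio
import time


def run_audit_blocking(video_url: str) -> dict:
    """Stand-in for the synchronous graph call; sleeps briefly to simulate work."""
    time.sleep(0.05)
    return {"video_url": video_url, "final_status": "PASS"}


async def run_audit_async(video_url: str) -> dict:
    """Run the blocking audit in a worker thread so the event loop stays responsive."""
    return await asyncio.to_thread(run_audit_blocking, video_url)


async def audit_many(urls: list[str]) -> list[dict]:
    """Run several audits concurrently; results come back in input order."""
    return await asyncio.gather(*(run_audit_async(u) for u in urls))
```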

12) Recommended next improvements

Add unit tests for each node with mocked Azure clients.
Add integration tests for /audit using fixture transcripts.
Add JSON schema / Pydantic validation with retry-on-malformed-output.
Add timeout + exponential backoff for Video Indexer polling.
Introduce async workflow execution and job queue for long-running audits.
Implement human-review workflow branch for CRITICAL findings.
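
To make the validation-with-retry item concrete, here is a minimal standard-library sketch. The `call_model` parameter is a hypothetical callable that takes a prompt and returns raw model text, and the required keys are assumed to match the API response shape; a Pydantic model could replace the hand-written checks.

```python
import json
from typing import Callable, Optional

# Assumed output contract, mirroring the API response fields.
REQUIRED_KEYS = {"status", "final_report", "compliance_results"}


def call_with_validation(
    call_model: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> dict:
    """Call the model until its output parses and validates, up to max_attempts times."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        raw = call_model(prompt)
        try:
            data = json.loads(raw)
            if not isinstance(data, dict):
                raise ValueError("top-level JSON value must be an object")
            missing = REQUIRED_KEYS - data.keys()
            if missing:
                raise ValueError(f"missing keys: {sorted(missing)}")
            if data["status"] not in {"PASS", "FAIL"}:
                raise ValueError(f"unexpected status: {data['status']!r}")
            return data
        except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
            last_error = exc
    raise RuntimeError(f"model output invalid after {max_attempts} attempts") from last_error
```

A natural refinement is to append the validation error to the prompt on each retry so the model can correct itself; the sketch simply repeats the same prompt.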

13) Quick start checklist

Fill .env with Azure credentials and endpoints.
Run index_documents.py to create policy embeddings.
Start API and call /audit with a YouTube URL.
Inspect output status, compliance_results, and final_report.
