Guilherme Cavalcante GscDtAnalytic

AI-Augmented Engineer · LLMOps · Data Engineering · Cloud

Brasília, BR — open to remote

I build systems that put LLMs into production — from streaming data pipelines to eval infrastructure that catches prompt regressions before deploy.

I use Claude Code as an architecture amplifier: I define the system boundaries, own the decisions, and drive the execution.

LLM apps regress when prompts change without control. PromptBench treats prompts as versioned software assets.

PromptVersion is immutable — no UPDATE, every edit creates an auditable snapshot
Measures quality, latency, variance, and real cost (tokens from the provider's usage response, not estimates)
Regression verdict: dimension ∧ slice ∧ cost — a better global average doesn't earn promotion if any slice regresses
87% test coverage on the evaluation module; deterministic checks are pure functions
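The verdict rule above (dimension ∧ slice ∧ cost) can be sketched as a pure function. This is a minimal illustration, not PromptBench's actual code: the names `EvalResult`, `regression_verdict`, `cost_budget_usd`, and `tolerance` are hypothetical, and the sketch assumes every score is "higher is better".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    """Scores for one prompt version (hypothetical shape, higher is better)."""
    dimensions: dict[str, float]  # global score per dimension, e.g. {"quality": 0.8}
    slices: dict[str, float]      # score per data slice, e.g. {"long": 0.75}
    cost_usd: float               # assumed to be summed from provider usage responses


def _regressed(base: dict[str, float], cand: dict[str, float], tol: float) -> list[str]:
    # A key missing from the candidate counts as a regression.
    return sorted(k for k, v in base.items() if cand.get(k, float("-inf")) < v - tol)


def regression_verdict(baseline: EvalResult, candidate: EvalResult,
                       cost_budget_usd: float, tolerance: float = 0.0) -> dict:
    """Promote only when dimensions AND slices AND cost all pass."""
    regressed_dims = _regressed(baseline.dimensions, candidate.dimensions, tolerance)
    regressed_slices = _regressed(baseline.slices, candidate.slices, tolerance)
    cost_ok = candidate.cost_usd <= cost_budget_usd
    return {
        "promote": not regressed_dims and not regressed_slices and cost_ok,
        "regressed_dimensions": regressed_dims,
        "regressed_slices": regressed_slices,
        "cost_ok": cost_ok,
    }


# The candidate has a better global score and lower cost, but one slice got worse.
baseline = EvalResult({"quality": 0.80}, {"short": 0.85, "long": 0.75}, 1.20)
candidate = EvalResult({"quality": 0.84}, {"short": 0.93, "long": 0.70}, 1.10)
verdict = regression_verdict(baseline, candidate, cost_budget_usd=1.50)
```

Here `verdict["promote"]` is false because the `long` slice regressed, even though the global quality score improved and the cost stayed within budget.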

Stack: Next.js · FastAPI · PostgreSQL 16 · Redis · arq · Anthropic Claude

Political opinion monitoring across 167 municipalities in Rio Grande do Norte, Brazil, for under R$5/month.

RSS + social media ingestion → NLP sentiment pipeline → LLM-generated strategic recommendations
Real-time dashboard with RAG over regional political knowledge base
Cloud-native on GCP with full data governance

Stack: Python · GCP · BigQuery · dbt · Cloud Run

Exactly-once streaming pipeline with Claude as the anomaly explainer, running for about US$40/month on GCP.

Stack: Redpanda · ksqlDB · Apache Iceberg · Python · Anthropic Claude

Every project here has:

A problem statement — not just a stack demo
ADRs (Architecture Decision Records) explaining why each key decision was made
A build audit trail — docs/progress/blockN.md logging what was done, what ran, what was decided
Real numbers — cost, latency, test coverage — not vague claims

AWS Certified Cloud Practitioner · Microsoft Azure AI Essentials · Anthropic Claude 101 · Anthropic Claude Code in Action

Expanding PromptBench with a multi-model judge (majority vote), continuous eval on real traffic, and a GitHub Actions PR gate
Growing English fluency for international collaboration

"A better global average doesn't earn promotion when it breaks the cost budget."
— from PromptBench Studio's regression verdict