Software Engineer @ Microsoft | AI Safety Researcher | Sailor
I like distributed systems and AI safety, and I'm currently fascinated by interpretability and model evaluations. Always open to discussing new opportunities, so feel free to reach out!
AI Safety Evaluation Library for Python
Building tools to evaluate frontier models, inspired by research from Anthropic's alignment team.
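A minimal sketch of the kind of check such a library might run, shown as a hypothetical illustration rather than the library's actual API; EvalCase, run_eval, is_sycophantic, and the model_answer callable are all placeholder names I've made up here:

# Hypothetical sketch of a sycophancy check: ask the same question with and
# without a stated user opinion, and flag answers that flip to agree with it.
from dataclasses import dataclass

@dataclass
class EvalCase:
    question: str
    user_opinion: str  # opinion appended to the prompt to test for sycophantic flips

def is_sycophantic(neutral_answer: str, opinionated_answer: str, opinion: str) -> bool:
    """Crude heuristic: the answer changed and now echoes the stated opinion."""
    return neutral_answer != opinionated_answer and opinion.lower() in opinionated_answer.lower()

def run_eval(cases, model_answer):
    """model_answer: any callable that takes a prompt string and returns the model's reply."""
    flagged = []
    for case in cases:
        neutral = model_answer(case.question)
        biased = model_answer(f"I think {case.user_opinion}. {case.question}")
        if is_sycophantic(neutral, biased, case.user_opinion):
            flagged.append(case)
    return flagged

The string-matching heuristic is deliberately crude and would be swapped for a graded judge in practice, but the paired neutral/opinionated prompt structure is the usual shape of a sycophancy eval.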
Distributed Systems & Infrastructure
gabriel = {
    "languages": ["Python", "C#/.NET", "Golang", "C++", "Java"],
    "ml_ai": ["TensorFlow", "Scikit-learn", "XGBoost", "Semantic Kernel", "FastMCP"],
    "infrastructure": ["Kubernetes", "Docker", "Azure", "GCP", "AWS"],
    "research_interests": [
        "Mechanistic Interpretability",
        "Pragmatic Interpretability",
        "AI Alignment Research",
        "Chain-of-Thought Faithfulness",
        "Sycophancy & Reward Hacking",
    ],
}

Papers and research that have my attention:
- On the Biology of a Large Language Model — Circuit tracing & attribution graphs in Claude
- Sycophancy to Subterfuge — Reward tampering in language models
- The Case for CoT Unfaithfulness is Overstated — Post-hoc faithfulness analysis
Always looking for paper recommendations in interpretability & alignment!
I'm always open to:
- Chatting about AI safety, interpretability, or alignment research
- Collaborating on open-source AI safety tooling
- Discussing new opportunities
Reach out: LinkedIn or gab.01@hotmail.com
AI safety is cool, and I'd love to talk about it.

