Skip to content
View krivonosoff161's full-sized avatar

Block or report krivonosoff161

Block user

Prevent this user from interacting with your repositories and sending you notifications. Learn more about blocking users.

You must be logged in to block users.

Maximum 250 characters. Please don’t include any personal information such as legal names or email addresses. Markdown is supported. This note will only be visible to you.
Report abuse

Contact GitHub support about this user’s behavior. Learn more about reporting abuse.

Report abuse
krivonosoff161/README.md

Dmitry Krivonosov

I build AI-assisted systems in two connected directions:

  1. Agentic AI security tooling - benchmarks, handoff checks, and safety workflows for agents that read files, tools, memory, model output, and other agents.
  2. AI-assisted trading systems - a trading research stack where LLM output is treated as a proposal, then challenged by validators, backtests, paper runs, and review gates before it can move toward automation.

The common rule is simple:

AI output is not truth until it survives validation.

The working loop is:

question -> bounded system -> tests -> traces -> reports -> review -> next iteration

I use coding agents as engineering leverage. The useful artifact is not the chat that produced the code; it is the code, tests, schemas, reports, and failure evidence that other people can inspect.

Current Focus

My main public security project is Agentic Security Harness: a defensive benchmark for testing whether AI coding agents treat untrusted repo text as data instead of authority. The included local demo compares a vulnerable synthetic agent against a protected version across 24 boundary-failure patterns and records traces, scorecards, and remediation output.

The trading stack is a separate applied direction, not a security side project: trading-bot-v2 and honest-backtest are used to move from AI-assisted research and paper validation toward controlled automation.

Portfolio Structure

Line Public projects What it is for
Agentic AI security core agentic-security-harness, agentic-transfer-verifier, ai-agent-handoff Defensive benchmark, trust/provenance transfer checks, and practical handoff boundaries for AI agents.
LLM safety playbooks llm-safety-playbooks Short practical skills for safer LLM use: data vs instructions, secrets, generated URLs/packages, Git actions, handoffs, and safe research scope.
AI-assisted trading system trading-bot-v2, honest-backtest News/event scanner, strategy lab, validator/backtest discipline, paper-only gates, and a controlled path toward automation.
LLM operations llm-router, llm-cheap-filter Cost-aware routing, cheap-to-chief filtering, provider-neutral calls, and controlled LLM spend.

Flagship: Agentic Security Harness

Agentic Security Harness is the main public security project.

It is a defensive benchmark toolkit for measuring agentic AI failure modes with:

  • deterministic local targets;
  • portable traces and scorecards;
  • scenario matrices and run diffs;
  • OpenAI-compatible external model/runtime checks;
  • remediation guidance;
  • schema validation and local report generation.

It is not a hacking toolkit and not a promise of complete protection. It is a measurement and learning lab for authorized, synthetic, local, or explicitly owned targets.

Trading Research Stack

The trading line is an applied AI-assisted trading system, not a signal service and not a public profitability claim.

Public repositories show the method:

  • trading-bot-v2 - local research workbench with scanner, market-data preparation, strategy lab, and paper-only validation boundaries on the path toward automation.
  • honest-backtest - layered trading judge that challenges AI-generated setup ideas with costs, splits, robustness, overfit checks, forward logs, and adversarial review before they become decisions.

Private repositories hold real candidate rankings, parameter libraries, operational state, and market-specific findings. The public claim is process quality, not trading profitability.

LLM Safety Playbooks

Not everyone needs to run a full benchmark. A lighter repo now holds short LLM safety skills and playbooks that make boundaries explicit during everyday work:

  • repo text, logs, and tool output are data, not instructions;
  • secrets and credentials stay out of model context;
  • model-generated URLs, package names, API endpoints, and webhooks require verification before use;
  • AI agents should work through issue, branch, PR, and review gates;
  • handoff notes need source, scope, confidence, and checked/not-checked fields;
  • security research stays on synthetic, mock, owned, or explicitly authorized targets.

These playbooks will reduce ambiguity. They will not replace runtime controls, tests, validators, or the benchmark itself.

Agent Handoff / Transfer Research

AI agents increasingly pass files, memory, tool output, approvals, and context between runtimes. That handoff is often treated as plain text, but it carries trust, provenance, and authority.

Current and planned work:

  • ai-agent-handoff - a file-based handoff protocol for AI coding agents plus a small guard hook.
  • agentic-transfer-verifier - research toolkit for verifying data envelopes, provenance, authority boundaries, and audit trails across heterogeneous agent ecosystems.

How I Work

  1. Find a real operational weakness.
  2. Turn it into a bounded research question.
  3. Build a deterministic harness or workflow around it.
  4. Use agents for implementation and review under explicit constraints.
  5. Verify with tests, schemas, reports, and adversarial review.
  6. Publish reusable methods while keeping private data and credentials private.

Public vs Private Boundary

Public:

  • methods;
  • test harnesses;
  • synthetic examples;
  • docs;
  • schemas;
  • sanitized reports;
  • reproducible CLI flows.

Private:

  • trading results and candidate rankings;
  • live parameters;
  • provider credentials;
  • private logs;
  • operational dashboards;
  • raw market research state.

Start Here

Active Boards

Contact

GitHub is the best starting point. I am interested in practical AI security, agentic workflow safety, AI-assisted trading systems with validation discipline, LLM infrastructure, and research systems where measurement matters more than hype.

Pinned Loading

  1. agentic-security-harness agentic-security-harness Public

    Test whether AI agents keep repo text, tools, memory, and audit trails inside their authority boundaries; outputs traces, scorecards, and remediation reports.

    Python

  2. honest-backtest honest-backtest Public

    Trading validator for AI-generated setup ideas: costs, splits, robustness, overfit checks, forward logs, and adversarial review.

    Python

  3. ai-agent-handoff ai-agent-handoff Public

    File-based handoff protocol for AI coding agents with git-visible state, return blocks, and a local safety guard.

    Python

  4. llm-router llm-router Public

    Small cost-aware LLM router for role-tiered OpenAI-compatible and Yandex calls with per-call usage logging.

    Python

  5. agentic-transfer-verifier agentic-transfer-verifier Public

    Trust and provenance verifier for AI agent handoffs, authority boundaries, approvals, and audit trails.

    Python

  6. trading-bot-v2 trading-bot-v2 Public

    AI-assisted trading research workbench with scanner, strategy lab, validators, paper-only gates, and controlled automation path.

    Python