From 88e11d49b2e2b08065f36e28299ed8721bf88655 Mon Sep 17 00:00:00 2001
From: Cursor Agent
Date: Wed, 13 May 2026 17:28:39 +0000
Subject: [PATCH] Add AGENTS.md with Cursor Cloud development environment instructions

---
 AGENTS.md | 83 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 83 insertions(+)
 create mode 100644 AGENTS.md

diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000000..88d504a116
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,83 @@
+
+# AGENTS.md
+
+## Cursor Cloud specific instructions
+
+### Project overview
+
+NVIDIA Triton Inference Server is a C++ inference serving platform with Python frontends. The core server binary (`tritonserver`) is built via CMake inside Docker containers using `build.py`. See `README.md` and `docs/customization_guide/build.md` for build details.
+
+### Key components
+
+| Component | Path | Language | Notes |
+|---|---|---|---|
+| Core server | `src/` | C++ | Requires GPU + Docker to build/run |
+| Build orchestrator | `build.py` | Python | Generates Docker/CMake build steps; requires `distro`, `requests` |
+| Container composer | `compose.py` | Python | Builds custom Triton Docker images |
+| OpenAI-compatible frontend | `python/openai/` | Python (FastAPI) | Requires `tritonserver` Python bindings (C++ extension, only in Triton containers) |
+| tritonfrontend bindings | `src/python/` | Python/C++ (pybind11) | Built as a wheel from C++ |
+| QA tests | `qa/` | Shell scripts | ~140 `L0_*` integration tests; require GPU + pre-built Triton images |
+
+### Linting
+
+Pre-commit hooks are defined in `.pre-commit-config.yaml` and enforce `isort`, `black`, `flake8`, `clang-format`, `codespell`, and miscellaneous checks (trailing whitespace, YAML/JSON validation, etc.). Run them on the whole repository with:
+
+```bash
+pre-commit run --all-files
+```
+
+To run the hooks automatically on staged files at every `git commit`, install them once:
+
+```bash
+pre-commit install
+```
+
+### Running tests
+
+- **OpenAI frontend tests** (`python/openai/tests/`): require `tritonserver` Python bindings and a GPU. These only work inside an official Triton Docker container (e.g. `nvcr.io/nvidia/tritonserver:26.04-vllm-python-py3`).
+- **QA integration tests** (`qa/L0_*/`): shell-script-based, require GPU + Docker + pre-built Triton images. See `docs/customization_guide/test.md`.
+
+### Environment constraints (Cloud VM)
+
+- **No GPU** is available in the Cloud Agent VM. The core C++ server cannot be built or run natively.
+- **No `tritonserver` Python module** is available outside Docker containers. The OpenAI frontend `main.py` and its tests will fail to import without it.
+- Python linting (`pre-commit`, `flake8`, `black`, `isort`) and `build.py --help` / `compose.py --help` work without a GPU.
+- The Python OpenAI frontend schemas (`python/openai/openai_frontend/schemas/`) can be imported and validated without `tritonserver`.
+
+### Build script dependencies
+
+`build.py` and `compose.py` require the Python packages `distro` and `requests` to be installed.
+
+### OpenAI frontend dependencies
+
+Install with:
+```bash
+pip install -r python/openai/requirements.txt
+pip install -r python/openai/requirements-test.txt
+```
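The environment constraints and build script dependencies described above can be checked up front before an agent attempts anything that needs them. The following is a minimal sketch and is not part of the patch: the helper names (`is_importable`, `summarize_environment`) and the grouping of modules are hypothetical, chosen only for illustration. It uses the standard library's `importlib.util.find_spec` to test whether `distro`, `requests`, and `tritonserver` can be located, without actually importing them.

```python
import importlib.util

# Module names come from the notes above; the two groupings are an
# assumption made for this illustration, not something the repo defines.
REQUIRED_FOR_BUILD_SCRIPTS = ("distro", "requests")
CONTAINER_ONLY = ("tritonserver",)


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located (it is not imported)."""
    return importlib.util.find_spec(module_name) is not None


def summarize_environment() -> dict:
    """Map each module name of interest to whether it is importable here."""
    names = REQUIRED_FOR_BUILD_SCRIPTS + CONTAINER_ONLY
    return {name: is_importable(name) for name in names}


if __name__ == "__main__":
    for name, available in summarize_environment().items():
        print(f"{name}: {'available' if available else 'missing'}")
```

In a Cloud VM as described above, one would expect `tritonserver` to be reported as missing, while `distro` and `requests` depend on whether they have been installed with `pip`.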