diff --git a/.claude/skills/adding-a-new-skill/SKILL.md b/.claude/skills/adding-a-new-skill/SKILL.md
new file mode 100644
index 0000000000..c9e38c8fd9
--- /dev/null
+++ b/.claude/skills/adding-a-new-skill/SKILL.md
@@ -0,0 +1,64 @@
---
name: adding-a-skill
description: extending yourself with a new reusable skill by interviewing the user
---

If the user wants to add a new skill, you can help them with this:

1. Ask the user for a short name and description of the skill. The name should be
   something that can be rendered as a short (10-30 character) descriptive sequence
   of words separated by hyphens. The description should be a line of text that is
   focused on conveying to a future agent when the skill would be appropriate to
   use and roughly what it does.

2. Ask the user for details. You can build up an idea of what the skill should do
   over multiple rounds of questioning. You want to find out:

   - If the skill should involve any key thoughts or considerations.
   - If the skill has an order of steps, or is an unordered set of tasks.
   - If the skill involves running sub-agents and if so how many.
   - **If the skill involves a lot of work** that might benefit from splitting
     into subagent tasks (see below for considerations).
   - If the skill should include commands to run.
   - If the skill should include code to write.
   - If, for code or commands, there is specific text to include, or more of a
     sketch or template of text _like_ an example.
   - Any hard rules: things an agent doing the skill should ALWAYS or NEVER do.
   - A set of conditions for stopping, looping/extending, or resetting/restarting.
   - How the result of applying the skill should be conveyed to the user.

   **Subagent considerations**

   Skills that involve a lot of work may benefit from splitting into pieces and
   running each piece as a subagent. This keeps each subagent focused on a
   moderate amount of work so it doesn't get lost or wander off track.
If the
   skill might use subagents, identify:

   - How the work could be split into moderate-sized pieces
   - What information each subagent needs to do its piece
   - The skill file should have an "Inputs" section listing what's needed
   - The skill should suggest a format for the subagent prompt

3. Once you have learned this information from the user, assemble it into a
   file in the repository. Add a file named `.claude/skills/<skill-name>/SKILL.md`
   with the following structure:

   - YAML frontmatter with the fields `name` and `description` drawn from
     the name and description.
   - An introductory paragraph or two describing the goal of the skill and
     any thoughts or special considerations to bear in mind, as well as any
     description of the meta-parameters like how to split work among subagents
     and how to stop/loop/restart. If the skill involves a lot of work,
     suggest how it might be split into moderate-sized subagent tasks.
   - **If the skill might use subagents**: An "Inputs" section that lists
     what information is needed for each piece, and suggests a format for the
     subagent prompt. This section comes right after the overview.
   - Either a sequential list of steps or an unordered list of tasks.
   - Any code or commands in either specific or example form.
   - Any of the ALWAYS/NEVER conditions.
   - A "Completion" section describing how to summarize the work, noting that
     if invoked as a subagent, the summary should be passed back to the
     invoking agent.

When you're done, save that file and then present the user with a link to it
so they can open it and review it.
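A minimal skeleton of the file described above, with placeholder content (the section names other than the `name`/`description` frontmatter fields are suggestions, not requirements):

```markdown
---
name: my-skill-name
description: one line telling a future agent when this skill applies and what it does
---

A paragraph or two on the goal of the skill, key considerations, and (if the
work is large) how to split it into moderate-sized subagent tasks.

# Inputs  <!-- only if the skill might use subagents -->

1. **Some input**: what it is and who provides it.

# Steps

1. First step...
2. Second step...

# ALWAYS / NEVER

- ALWAYS ...
- NEVER ...

# Completion

How to summarize the result; if invoked as a subagent, pass the summary back
to the invoking agent.
```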
\ No newline at end of file
diff --git a/.claude/skills/adding-tests/SKILL.md b/.claude/skills/adding-tests/SKILL.md
new file mode 100644
index 0000000000..44d3fd2647
--- /dev/null
+++ b/.claude/skills/adding-tests/SKILL.md
@@ -0,0 +1,215 @@
---
name: adding-tests
description: analyzing a change to determine what tests are needed and adding them to the test suite
---

# Overview

This skill is for analyzing a code change and adding appropriate test coverage.
It covers both unit tests and randomized/fuzz tests, ensuring that new
functionality is tested and bug fixes include regression tests.

This skill involves a fair amount of work. Consider splitting it into pieces and
running each piece as a subagent (e.g., one for analyzing test needs, one for
writing unit tests, one for randomized tests). Keep each subagent focused on a
moderate amount of work so it doesn't get lost or wander off track.

The output is either confirmation that tests were added (with details), or
confirmation that no additional tests are needed.

# Inputs

Before starting the analysis, gather the following information (if running as a
subagent, the invoking agent should provide these; otherwise, determine them
yourself or ask the user):

1. **Git range**: The git command to get the diff (e.g., `git diff master...HEAD`).

2. **Type of change**: Is this a new feature, bug fix, refactor, or performance
   change?

3. **Bug/issue reference** (if applicable): For bug fixes, the issue number or
   description of what was broken.

4. **Specific test requirements** (optional): Any specific testing requirements
   the user has mentioned.

If invoking as a subagent, the prompt should include: "Analyze and add tests
for: <description of the change>. Get the diff using `<git command>`.
" + +# Analyzing Test Needs + +## Step 1: Understand the Change + +Get the diff using the command provided by the invoking agent: + +Categorize the change: +- **New feature**: Adding new functionality +- **Bug fix**: Correcting incorrect behavior +- **Refactor**: Changing implementation without changing behavior +- **Performance**: Optimizing existing code + +## Step 2: Find Existing Tests + +Locate tests related to the changed code: + +1. Look for test files with similar names (e.g., `Foo.cpp` → `FooTests.cpp`) +2. Search for tests that reference the modified functions/classes +3. Check if there are integration tests that exercise this code path + +```bash +# Find test files +find src -name "*Tests.cpp" | xargs grep -l "FunctionName" +``` + +## Step 3: Determine What Tests Are Needed + +### For New Features + +- Unit tests for each new public function/method +- Tests for expected behavior with valid inputs +- Tests for edge cases (empty input, max values, etc.) +- Tests for error handling with invalid inputs +- Integration tests if the feature involves multiple components + +### For Bug Fixes + +- A regression test that would have failed before the fix +- The test should exercise the exact condition that caused the bug +- Include a comment referencing the issue/bug number if available + +### For Refactors + +- Existing tests should still pass (no new tests typically needed) +- If existing test coverage is inadequate, add tests before refactoring + +### For Performance Changes + +- Ensure functional tests still pass +- Consider adding benchmark tests if appropriate +- Consider adding metrics or tracy ZoneScoped annotations to help + quantify performance +- Consider adding a `LogSlowExecution` wrapper that will flag any + long-running units of work on the IO service + +### Some Special Cases + +- Write fee bump tests to go along with any new regular transaction tests or any + logic changes to transaction processing and application. 

## Step 4: Check for Randomized Test Needs

For changes affecting:
- Transaction processing
- Consensus/SCP
- Ledger state management
- Serialization/deserialization
- Any protocol-critical code

Consider whether randomized testing is appropriate:
- Fuzz targets for parsing/deserialization
- Property-based tests for invariants
- Simulation tests for distributed behavior

# Writing Tests

## Unit Test Patterns

Find existing tests in the same area and follow their patterns. Common patterns
in this codebase:

1. **Test fixture setup**: Look for how test fixtures are created
2. **Assertion style**: Match the assertion macros used elsewhere
3. **Test naming**: Follow the naming convention of nearby tests
4. **Helper functions**: Reuse existing test helpers rather than creating new ones
5. **Test Strictness**: Ensure tests are strict and fail on any unexpected behavior,
   check:
   - [ ] Do my tests have timeouts? (no infinite hangs)
   - [ ] Do my tests assert on failure? (no silent failures or early returns)
   - [ ] Am I using logging where I should use assert/panic?

## Test File Organization

- Tests typically live in `src/` alongside the code they test, in a `test/`
  subdirectory.
- Test files are usually named `*Tests.cpp`
- There are often "test utility" helper files named `*TestUtils.cpp`
- Tests are organized into test suites by component
- There are also some general testing utility files in `src/test`
- The unit test framework is "Catch2", a common C++ framework.

## Adding a Unit Test

1. Find the appropriate test file (or create one following conventions)
2. Add the test case following existing patterns
3. Ensure the test is self-contained and doesn't depend on external state
4. Run the new test in isolation to verify it works

## Adding a Fuzz Target

If adding a fuzz target:

1. Check existing fuzz targets in the codebase for patterns
2. Create a target that exercises the specific code path
3.
Ensure the target can handle arbitrary input without crashing (except for
   intentional assertion failures)
4. Document what the fuzz target is testing

# Output Format

Report what was done:

```
## Tests Added

### Unit Tests

1. **src/ledger/LedgerManagerTests.cpp**: `processEmptyTransaction`
   - Tests that empty transactions are rejected with appropriate error
   - Regression test for issue #1234

2. **src/ledger/LedgerManagerTests.cpp**: `processTransactionWithMaxOps`
   - Tests boundary condition at maximum operation count

### Randomized Tests

1. **src/fuzz/FuzzTransactionFrame.cpp**: Extended to cover new transaction type
   - Added generation of the new transaction variant

## No Additional Tests Needed

[If applicable, explain why existing coverage is sufficient]
```

# ALWAYS

- ALWAYS find and follow existing test patterns in the same area
- ALWAYS include regression tests for bug fixes
- ALWAYS test both success and failure paths
- ALWAYS test edge cases and boundary conditions
- ALWAYS run new tests to verify they pass
- ALWAYS run new tests with the bug still present (if a regression test) to verify they would have caught it
- ALWAYS reuse existing test helpers and fixtures
- ALWAYS keep tests focused and independent

# NEVER

- NEVER add tests that depend on external state or ordering
- NEVER add tests that are flaky or timing-dependent
- NEVER duplicate existing test coverage
- NEVER write tests that test implementation details rather than behavior
- NEVER add tests without running them
- NEVER skip randomized test consideration for protocol-critical code
- NEVER create new test helpers when suitable ones exist

# Completion

Summarize your work as follows:

1. Summary of tests added (count and type)
2. Details of each test added
3. Confirmation that new tests pass
4. Any notes about test coverage that might still be lacking

If invoked as a subagent, pass this summary back to the invoking agent.
diff --git a/.claude/skills/configuring-the-build/SKILL.md b/.claude/skills/configuring-the-build/SKILL.md
new file mode 100644
index 0000000000..13dec62ae2
--- /dev/null
+++ b/.claude/skills/configuring-the-build/SKILL.md
@@ -0,0 +1,70 @@
---
name: configuring-the-build
description: modifying build configuration to enable/disable variants, switch compilers or flags, or otherwise prepare for a build
---

# Overview

The build works like this:
 - We start by running `./autogen.sh`
 - `autogen.sh` runs `autoconf` to turn `configure.ac` into `configure`
 - `autogen.sh` also runs `automake` to turn `Makefile.am` into `Makefile.in` and `src/Makefile.am` into `src/Makefile.in`
 - We then run `./configure`
 - `configure` turns `Makefile.in` into `Makefile` and `src/Makefile.in` into `src/Makefile`
 - `configure` also turns `config.h.in` into `config.h` that contains some variables
 - `configure` also writes `config.log`; if there are errors, they will be recorded there

- ALWAYS run `./autogen.sh` and `./configure` from top-level, never a subdirectory
- ALWAYS configure with `--enable-ccache` for caching
- ALWAYS configure with `--enable-sdfprefs` to inhibit noisy build output
- NEVER edit `configure` directly, only ever edit `configure.ac`
- NEVER edit `Makefile` or `Makefile.in` directly, only ever edit `Makefile.am`

To change configuration settings, re-run `./configure` with new flags.

You can see the existing configuration flags by looking at the head of `config.log`.

## Configuration variables

To change compiler from clang to gcc, switch the value you pass for CC and CXX.
For example run `CXX=g++ CC=gcc ./configure ...` to configure with gcc. We want
builds to always work with gcc _and_ clang.

To alter compile flags (say turn on or off optimization, or debuginfo) change
CXXFLAGS. For example run `CXXFLAGS='-O0 -g' ./configure ...` to build
non-optimized and with debuginfo. Normally you should not have to change these.
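Combining the overrides above into one invocation might look like this (the two `--enable` flags are the ones this document says to always pass; the compiler and flag values are illustrative):

```bash
# gcc instead of clang, unoptimized, with debuginfo, from the top level
CXX=g++ CC=gcc CXXFLAGS='-O0 -g' ./configure --enable-ccache --enable-sdfprefs
```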

Sometimes you will need to change to a different implementation of the C++
standard library. To do this, pass `-stdlib=libc++` or `-stdlib=libstdc++`
in `CXXFLAGS` explicitly. But again, normally you don't need to do this.

## Configuration flags

Here are some common configuration flags you might want to change:

 - `--disable-tests` turns off `BUILD_TESTS`, which excludes unit tests and all
   test-support infrastructure from core. We want this build variant to work
   since it is the one we ship, but it is uncommon when doing development.

 - `--disable-postgres` turns off postgresql backend support in core, leaving
   only sqlite. Tests will run faster, and this is also a configuration we want
   to keep working (we will remove postgres entirely someday).

There are also some flags that turn on compile-time instrumentation for
different sorts of testing. Turn these on if doing specific diagnostic tests,
and/or to check for "anything breaking by accident". If you turn any on, you
will need to do a clean build -- the object files will have the wrong content.

 - `--enable-asan` turns on address sanitizer.
 - `--enable-threadsanitizer` same, but for thread sanitizer.
 - `--enable-memcheck` same, but for memcheck.
 - `--enable-undefinedcheck` same, but for undefined-behaviour sanitizer.
 - `--enable-extrachecks` turns on C++ stdlib debugging, slows things down.
 - `--enable-fuzz` builds core with fuzz instrumentation, plus fuzz targets.

There is more you can learn by reading `configure.ac` directly but the
instructions above ought to suffice for 99% of tasks. Try not to do anything
too strange with the configuration.

When in doubt, or if things get stuck, you can always re-run `./autogen.sh`
and `./configure`.
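For example, a plausible sequence for switching on a sanitizer, including the clean build the instrumentation flags require:

```bash
./autogen.sh
./configure --enable-ccache --enable-sdfprefs --enable-asan
make clean   # object files from the previous configuration are stale
```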
\ No newline at end of file
diff --git a/.claude/skills/high-level-code-review/SKILL.md b/.claude/skills/high-level-code-review/SKILL.md
new file mode 100644
index 0000000000..83f9c5acec
--- /dev/null
+++ b/.claude/skills/high-level-code-review/SKILL.md
@@ -0,0 +1,217 @@
---
name: high-level-code-review
description: reviewing a change for semantic correctness, simplicity, design consistency, and completeness
---

# Overview

This skill is for performing a high-level code review that requires
understanding the purpose and context of a change. Unlike low-level review
(which catches mechanical mistakes), high-level review evaluates whether the
change is correct in its intent and approach.

This skill involves a fair amount of work. Consider splitting it into pieces and
running each piece as a subagent (e.g., one subagent per review criterion, or
one per file/component). Keep each subagent focused on a moderate amount of work
so it doesn't get lost or wander off track.

The output is a **worklist** of issues to address, or confirmation that no
issues were found.

# Inputs

Before starting the review, gather the following information (if running as a
subagent, the invoking agent should provide these; otherwise, determine them
yourself or ask the user):

1. **Goal of the change**: What is the change trying to accomplish? This could
   come from an issue description, PR description, commit message, or user
   explanation. This is **required** for high-level review.

2. **Git range**: The git command to get the diff (e.g., `git diff master...HEAD`).

3. **Any specific concerns** (optional): Areas the user wants extra attention on.

If invoking as a subagent, the prompt should include: "Review the change for:
<goal description>. Get the diff using `<git command>`."

# Obtaining Context

Before reviewing, gather sufficient context:

1. **Understand the goal**: Use the goal description provided by the invoking
   agent.

2.
**Get the diff**: Use the git command provided by the invoking agent.

3. **Build an understanding of context**. Use LSP tools to get lists of symbols
   and call graphs and type hierarchies as necessary. For each modified file,
   read enough of the surrounding code to understand the context (at minimum,
   the entire function or class being modified).

4. **Question your assumptions and seek absolute clarity**: Ask yourself
   - Am I assuming requirements, constraints, or behavior? → ASK FIRST
   - Am I guessing how something works instead of reading code/docs? → READ FIRST

5. **Find similar patterns**: Search the codebase for similar functionality to
   understand established patterns.

# Review Criteria

Evaluate the change against each of these criteria:

## Correctness

- Does the change accomplish its stated goal?
- Does it handle all the cases it claims to handle?
- Are the algorithms and logic correct?
- Are boundary conditions handled properly?
- Is arithmetic correct (especially with different integer types)?

## Completeness

- Are there edge cases not handled?
- Are error paths handled appropriately?
- If adding a feature, is it fully or only partially implemented?
- Are there missing blocks of code marked with "TODO" or "FIXME" or
  "later" or "for now" or "the real version will do..."?
- Are all code paths tested or testable?
- Is there sufficient debug logging?
- If you see TransactionFrame being touched, always check if
  FeeBumpTransactionFrame support was also added.

## Performance

- Are any algorithms of a higher complexity class than they should be? For
  example, are there quadratic algorithms where there should be linear, or
  linear where there should be logarithmic or constant?
- Are appropriate types of containers being used in all cases?
- Are large data structures copied unnecessarily?
- Are copy constructors used where move constructors might work (without
  making the logic hard to read)?
- Are any long-running blocking operations running on a thread that needs to
  remain responsive? For example, are there any unbounded IO operations on the
  main thread? Any slow cryptography?
- If it is reasonable to have metrics, are there metrics that track the behavior
  of the new code?
- If it is reasonable to have Tracy's ZoneScoped annotations for tracing, are
  they present?

## Consistency

- Is the approach consistent with how similar things are done elsewhere in the
  codebase?
- Does it follow established patterns for this type of change?
- Are naming conventions followed?
- Is the code organization consistent with surrounding code?
- Are new/modified interfaces used correctly everywhere?

## Safety

- Could the change have unintended side effects?
- Are invariants maintained?
- Is thread safety preserved (if applicable)?
- Is there any new risk of data inconsistency?
- Is there any new risk of non-determinism? Is the global deterministic
  pseudo-random number generator used consistently wherever
  pseudo-non-determinism is desired?
- Are any containers unordered when they should be ordered, or vice versa?
- If there are unanticipated errors, does the code fail safely?
- Are resources properly managed (no leaks, no use-after-free)?
- Is input validation sufficient?
- Will the program recover safely from a crash, or can data be corrupted or lost?
- What forms of partial failure are possible, and are they handled safely?
- Are state changes made atomically and consistently?

## Error Handling

- Are errors detected and reported appropriately?
- Are error messages clear and actionable?
- Is error recovery handled correctly?
- Are exceptions used appropriately (if at all)?

## Minimality

- Is the change minimal, or does it include unnecessary modifications?
- Are there changes that should be split into separate commits/PRs?
- Is there dead code or debugging code that should be removed?
- Does the change add files it doesn't need to add?
- Does it add classes or functions that it doesn't need to add?
- Is anything duplicated that shouldn't be duplicated?
- Does it add unnecessary wrappers or layers of indirection?
- Does it add unnecessary backwards compatibility code?

## Documentation

- Are complex algorithms or non-obvious code explained in comments?
- Are public APIs documented?
- Do comments accurately reflect what the code does?
- Are the comments too verbose? Do they detract from clarity?
- Are any documentation files (README, docs/ etc.) updated if needed?

# Output Format

Produce a structured report as output. For each issue found:

1. **File path** and **line number(s)**
2. **Category** (from the criteria above)
3. **Severity**: Critical / Major / Minor / Suggestion
4. **Description** of the issue
5. **Recommendation** for how to address it

Example format:

```
## Issues Found

### src/ledger/LedgerManager.cpp:142-150 — Completeness (Major)
**Issue:** The new `processTransaction` path does not handle the case where
the transaction has no operations.
**Recommendation:** Add a check for empty operations and return an appropriate
error code, similar to how `processPayment` handles this at line 89.

### src/ledger/LedgerManager.cpp:200 — Consistency (Minor)
**Issue:** Variable named `txResult` but similar variables elsewhere use
`transactionResult`.
**Recommendation:** Rename to `transactionResult` for consistency.
```

If no issues are found:

```
## No Issues Found

The change appears correct and complete.
Observations:
- [Any positive observations or notes for the record]
```

# ALWAYS

- ALWAYS understand the stated goal before reviewing
- ALWAYS read enough context to understand what the code is doing
- ALWAYS check for similar patterns in the codebase before flagging inconsistency
- ALWAYS distinguish between "definitely wrong" and "could be improved"
- ALWAYS provide specific, actionable recommendations
- ALWAYS cite file and line numbers
- ALWAYS consider whether apparent issues might be intentional
- ALWAYS prioritize correctness issues over style issues

# NEVER

- NEVER review without understanding what the change is trying to do
- NEVER flag style issues that are consistent with surrounding code
- NEVER suggest refactoring unrelated code
- NEVER recommend changes outside the scope of the current work
- NEVER assume something is wrong just because you would do it differently
- NEVER report issues without a concrete recommendation
- NEVER conflate personal preference with objective problems

# Completion

Summarize your work as follows:

1. Summary: number of issues by severity
2. The detailed issue list (or confirmation of no issues)
3. Overall assessment: Ready to proceed / Needs attention / Major concerns

If invoked as a subagent, pass this summary back to the invoking agent.
diff --git a/.claude/skills/low-level-code-review/SKILL.md b/.claude/skills/low-level-code-review/SKILL.md
new file mode 100644
index 0000000000..1cadaeab6c
--- /dev/null
+++ b/.claude/skills/low-level-code-review/SKILL.md
@@ -0,0 +1,179 @@
---
name: low-level-code-review
description: reviewing a git diff for small localized coding mistakes that can be fixed without high-level understanding
---

# Overview

This skill is for performing a low-level code review on a git diff, looking for
small, localized coding mistakes that can be identified and fixed without any
high-level understanding of the codebase.
These are the kinds of mistakes that
could occur in any codebase, in any language, and are purely mechanical in
nature.

For larger diffs, consider splitting the review into pieces and running each
piece as a subagent (e.g., one subagent per file or directory). Keep each
subagent focused on a moderate amount of work so it doesn't get lost or wander
off track.

The output is a **worklist** of issues to fix.

# Inputs

Before starting the review, gather the following information (if running as a
subagent, the invoking agent should provide these; otherwise, determine them
yourself or ask the user):

1. **Git range**: Determine which diff to review:
   - Check if there are uncommitted changes (`git diff` and `git diff --cached`)
   - If no uncommitted changes, check if current branch differs from `master`
   - If neither applies, ask the user for a specific git range

2. **Context about the change** (optional but helpful): A brief description of
   what the change is intended to do, if known.

If invoking as a subagent, the prompt should include: "Review the diff from
`<git range>`."

# Obtaining the Diff

Run the git diff command provided by the invoking agent to obtain the diff,
then analyze it.

# Issues to Look For

Focus only on issues that are clearly mistakes and can be fixed with confidence.
Do not flag anything that requires understanding the broader system design.

## Definite Bugs

- **Numeric overflow/underflow**: Operations on integer types that could exceed
  their bounds (e.g., adding two `uint32_t` values near max).
- **Off-by-one errors**: Loop bounds, array indices, range calculations.
- **Null/nullptr dereference risk**: Dereferencing a pointer without checking if
  it could be null, especially after operations that might return null.
- **Uninitialized variables**: Variables used before being assigned a value.
- **Resource leaks**: Memory, file handles, or other resources acquired but not
  released on all code paths.
- **Use-after-free/move**: Using a resource after it has been freed or moved.
- **Double-free**: Freeing or deleting a resource twice.
- **Boolean logic errors**: Wrong operator precedence, De Morgan's law mistakes,
  inverted conditions.
- **String formatting mismatches**: Format specifier doesn't match argument type
  (e.g., `%d` for a `size_t`).

## ACID-semantics violations (on disk)

- Look for write-to-temp + rename patterns
- Verify temp file is fsynced before rename
- Verify directory is fsynced after rename
- Check: what happens if crash occurs between steps?

**Questions to ask:**
- If power is lost mid-operation, what state is on disk?
- Can the operation be resumed/retried after crash?
- Are there ordering dependencies between file writes?

## ACID-semantics violations (in memory)

- Look for non-atomic state changes that could leave the system in an
  inconsistent state if interrupted
- Check for proper locking around shared state
- Check for proper error handling that rolls back or compensates for partial
  failures

**Questions to ask:**
- If an exception is thrown or an error occurs mid-operation, could the system
  be left in an inconsistent state?
- Are state changes made atomically, or could they be interrupted leaving
  partial updates?
- Are there any critical sections that lack proper locking?

## Likely Mistakes

- **Copy-paste errors**: Duplicated code blocks with subtle inconsistencies that
  suggest a copy-paste where something wasn't updated.
- **Typos in identifiers**: Variable or function names that are almost but not
  quite right (especially in new code that mirrors existing patterns).
- **Wrong variable used**: Using a similarly-named variable by mistake.
- **Missing `break` in switch**: Fall-through that appears unintentional.
- **Comparison instead of assignment** (or vice versa): `if (x = y)` vs `if (x == y)`.
- **Signed/unsigned comparison**: Comparing signed and unsigned integers in ways
  that could produce unexpected results.
- **Use of raw pointers**: Use smart pointers, optionals or references.
- **Manual cleanup paths**: Use RAII guards.
- **Common C++ stdlib misuse**: Using the wrong container, algorithm, or utility
  function for a task, calling container methods or constructors incorrectly,
  introducing known UB, performance antipatterns, etc.

## Style Issues (Only if Clearly Wrong)

- **Typos in comments**: Misspelled words in comments or documentation.
- **Inconsistent naming**: New code that doesn't follow the naming pattern of
  immediately surrounding code.
- **Missing `const`**: Parameters or variables that could clearly be const but
  aren't.
- **Unused variables**: Variables declared but never used.
- **Unused includes**: Headers included but nothing from them appears to be used
  in the changed code.
- **Dead code**: Code that can never execute (after unconditional return, etc.).
- **West const**: const should go to the right of the constant thing ("east const").
- **Old idioms**: We are on C++20, so you should lean into features that help with
  correctness and clarity when appropriate.

# Output Format

Produce a structured worklist as output. Each item should contain:

1. **File path** and **line number** (from the diff)
2. **Issue type** (from the categories above)
3. **Original code** (the exact problematic code)
4. **Suggested fix** (the specific edit to make)
5. **Brief explanation** (why this is a problem, one sentence)

Group issues by file. Example format:

```
## src/foo/Bar.cpp

### Line 142: Numeric overflow risk
**Original:** `uint32_t total = count1 + count2;`
**Fix:** `uint64_t total = static_cast<uint64_t>(count1) + count2;`
**Why:** Both operands are uint32_t and their sum could exceed UINT32_MAX.

### Line 287: Typo in comment
**Original:** `// Calcualte the checksum`
**Fix:** `// Calculate the checksum`
**Why:** Misspelled "Calculate".
```

# ALWAYS

- ALWAYS cite the exact line number from the diff
- ALWAYS quote the original code exactly as it appears
- ALWAYS provide a specific, concrete fix (not just "fix this")
- ALWAYS explain why it's a problem in one sentence
- ALWAYS focus only on changed lines in the diff (lines starting with `+`)
- ALWAYS group issues by file for easier processing
- ALWAYS consider whether an issue might be intentional before reporting
- ALWAYS prioritize potential runtime errors over style issues
- ALWAYS check if a correctness condition is actually checked earlier in the same function before reporting
- ALWAYS verify the issue exists in the new code, not just the removed code

# NEVER

- NEVER change logic or behavior beyond the minimal fix required
- NEVER suggest refactoring, redesign, or architectural changes
- NEVER flag issues that require understanding the broader codebase or system design
- NEVER report style preferences that aren't clearly inconsistent with surrounding code
- NEVER suggest changes to lines that weren't modified in the diff
- NEVER make assumptions about programmer intent for ambiguous cases
- NEVER report the same mechanical issue more than 3 times; instead note "and N similar occurrences in this file"
- NEVER flag intentional patterns (e.g., don't flag `if (auto* p = getPtr())` as "assignment in condition")
- NEVER report issues in test code that are clearly intentional (e.g., testing error paths)
- NEVER spend time on issues that a compiler warning would catch (assume the build has warnings enabled)

# Completion

Summarize your work as follows:

- Present the worklist of issues found
- If no issues were found, report that explicitly: "No low-level issues found in
  the diff."
+- If the diff is very large (more than ~500 lines of additions), suggest
+  splitting the review by file or directory to ensure thoroughness.
+
+If invoked as a subagent, pass this summary back to the invoking agent.
diff --git a/.claude/skills/running-make-to-build/SKILL.md b/.claude/skills/running-make-to-build/SKILL.md
new file mode 100644
index 0000000000..8c85dcb1e2
--- /dev/null
+++ b/.claude/skills/running-make-to-build/SKILL.md
@@ -0,0 +1,64 @@
+---
+name: running-make-to-build
+description: how to run make correctly to get a good build, and otherwise understand the build system
+---
+
+# Overview
+
+The build is a recursive make structure, with several projects vendored into
+`lib` that use their own Makefiles and also one primary `src/Makefile.am` that
+defines most of the build.
+
+- ALWAYS run `make -j $(nproc)` to get full parallelism
+- ALWAYS run from the top level directory
+- ALWAYS run with `2>&1 | tail -N` to limit output
+- ALWAYS run `git add <new-file> && ./make-mks` after adding `<new-file>` to
+  ensure it is included in the build.
+- NEVER run from a subdirectory
+- NEVER run with `make -C <subdir>` for any other directory
+- NEVER run `cargo` manually, let `make` run it
+- NEVER edit `Makefile` or `Makefile.in`, only ever edit `Makefile.am`
+
+## Targets
+
+The main targets are:
+
+ - `all` -- the implicit target, builds `src/stellar-core`
+ - `check` -- builds `all` then runs unit and integration tests
+ - `clean` -- removes build artifacts
+ - `format` -- auto-formats source code with standard rules
+
+If anything goes wrong or is confusing in the build, start by running `make
+clean` and trying again. You should have configured with `--enable-ccache`,
+which means that rebuilding will typically be very cheap, especially if you
+run with `make -j $(nproc)`.
+
+## Rust build
+
+The `src/Makefile.am` also delegates to `cargo` to build the rust components
+of stellar-core in `src/rust` as well as all the submodules in `src/rust/soroban`.
+The integration is quite subtle. 
You should always let `src/Makefile.am` handle +invoking `cargo`. + +## Generated files + +Several source files are generated. All .x files in `src/protocol-{curr,next}` +are turned into .cpp and .h files by the `xdrpp` code-generator in `lib/xdrpp`. + +Parts of the XDR query system in `src/util/xdrquery` are built by `flex` and +`bison`. + +Files like `src/main/StellarCoreVersion.cpp` bake the current version +information into a string constant in stellar-core. + +And finally the rust bridge `src/rust/RustBridge.{cpp,h}` is generated by the +`cxxbridge` tool from `src/rust/bridge.rs`. + +## Editing the makefiles + +Most of the time you won't need to edit `Makefile.am` or `src/Makefile.am` at all. + +Files included in the build are driven by the script `./make-mks` which +lists files tracked by git and defines makefile variables based on them. +As soon as you add a .cpp or .h file to git and re-run `./make-mks` +it will be added to the build. diff --git a/.claude/skills/running-tests/SKILL.md b/.claude/skills/running-tests/SKILL.md new file mode 100644 index 0000000000..c62bf6da5c --- /dev/null +++ b/.claude/skills/running-tests/SKILL.md @@ -0,0 +1,385 @@ +--- +name: running-tests +description: running tests at various levels from smoke tests to full suite to randomized tests +--- + +# Overview + +This skill is for running tests systematically, starting with fast/focused tests +and progressing to slower/broader tests. This ordering allows failures to be +caught early, minimizing wasted time. + +This skill is designed to be run as a **subagent** to avoid cluttering the +invoking agent's context. The output is either confirmation that all tests +passed, or a report of failures. + +# Required Inputs (Before Launching Subagent) + +Since subagents cannot ask for clarification, the **invoking agent must gather +this information before launching**: + +1. 
**Changed files/modules**: Which files or modules were changed, so the
+   subagent can identify appropriate smoke tests and focused tests.
+
+2. **Test levels to run**: Which levels to execute. Options:
+   - "smoke only" - just Level 1
+   - "through focused" - Levels 1-2
+   - "through full suite" - Levels 1-3 (usually sufficient for small changes)
+   - "through full suite with tx-meta" - Levels 1-3 plus tx-meta baseline check
+   - "through sanitizers" - Levels 1-4 (for memory/concurrency-sensitive code)
+
+The subagent prompt should include: "Run tests for changes in <changed files/modules>."
+
+# Test Output Control
+
+To reduce noise and keep agent context manageable, always use these flags:
+
+```bash
+# Recommended flags for quiet output
+--ll fatal # Only log fatal errors (not info/debug messages)
+-r simple # Use simple reporter (minimal output)
+--disable-dots # Don't print progress dots
+--abort # Stop on first failure (don't run remaining tests)
+```
+
+Example:
+```bash
+./stellar-core test --ll fatal -r simple --disable-dots --abort "test name"
+```
+
+Note that if you ever do need information about a test when trying to diagnose
+what went wrong with it, you might want to turn the log level up from fatal to
+info, debug or even trace, using `--ll debug` or `--ll trace` for example.
+
+# Protocol Versions
+
+Many tests are protocol-specific and can behave differently across protocol
+versions. Use these flags to control which protocol versions are tested:
+
+```bash
+--version <version> # Run tests for a specific protocol version
+--all-versions # Run tests for all supported protocol versions
+```
+
+For focused testing during development, test with the current protocol version,
+which is the default. The full test suite should eventually be run with
+`--all-versions`.
+
+# Deterministic Random Number Generator
+
+Tests use a deterministic PRNG. 
By default, the seed varies, but you can set
+a specific seed for reproducibility:
+
+```bash
+--rng-seed <seed> # Use a specific RNG seed for reproducibility
+```
+
+This is useful for reproducing failures or for baseline checks that require
+consistent output.
+
+# Test Levels
+
+Tests are run in order of increasing cost. Stop at the first failure.
+
+## Level 1: Smoke Tests
+
+Run 2-3 specific tests that are most likely to catch breakage in the changed
+code. These should complete in seconds.
+
+To identify smoke tests:
+1. Find tests in the same file/module as the changed code
+2. Pick tests that directly exercise the modified functions
+3. Prefer fast tests over slow ones
+
+```bash
+# Run a specific test by name (use quotes for exact match)
+./stellar-core test --ll fatal -r simple --abort "exact test name"
+```
+
+## Level 2: Focused Unit Tests
+
+Run all tests in the test file(s) related to the change. This typically takes
+a few minutes.
+
+```bash
+# Run tests matching a tag pattern
+./stellar-core test --ll fatal -r simple --abort "[ModuleName*]"
+
+# Run tests from a specific area
+./stellar-core test --ll fatal -r simple --abort "[ledgertxn]"
+
+# Combine tags (AND logic - must match all)
+./stellar-core test --ll fatal -r simple --abort "[tx][soroban]"
+```
+
+### Example Test Names by Area
+
+**Ledger/Transaction tests:**
+- `"[ledgertxn]"` - LedgerTxn operations
+- `"[tx][payment]"` - Payment transaction tests
+- `"[tx][createaccount]"` - CreateAccount tests
+- `"[tx][offers]"` - Offer/DEX tests
+- `"[tx][soroban]"` - Soroban (smart contract) transaction tests
+
+**Bucket/BucketList tests:**
+- `"[bucket]"` - General bucket tests
+- `"[bucketlist]"` - BucketList specific tests
+- `"[bucketmergemap]"` - Bucket merge map tests
+
+**Herder tests:**
+- `"[herder]"` - General herder tests
+- `"[txset]"` - Transaction set tests
+- `"[transactionqueue]"` - Transaction queue tests
+- `"[quorumintersection]"` - Quorum intersection tests
+- `"[upgrades]"` - Protocol 
upgrade tests + +**Overlay/Network tests:** +- `"[overlay]"` - Overlay network tests +- `"[flood]"` - Transaction flooding tests +- `"[PeerManager]"` - Peer management tests + +**Crypto/Utility tests:** +- `"[crypto]"` - Cryptography tests +- `"[decoder]"` - Base32/64 encoding tests +- `"[timer]"` - VirtualClock timer tests +- `"[cache]"` - Cache implementation tests + +**Soroban-specific tests:** +- `"[soroban]"` - All Soroban tests +- `"[soroban][archival]"` - State archival tests +- `"[soroban][upgrades]"` - Soroban upgrade tests + +## Level 3: Full Unit Test Suite + +Run the complete unit test suite. This may take 10-30 minutes. + +### Basic Execution + +```bash +make check +``` + +Or directly with quiet output: + +```bash +./stellar-core test --ll fatal -r simple --disable-dots --abort +``` + +### Parallel Execution (faster) + +For faster execution, use parallel partitions via `make check`: + +```bash +# Run with partitions equal to CPU cores +NUM_PARTITIONS=$(nproc) make check +``` + +### Full Protocol Coverage + +The full test suite should be run with all protocol versions: + +```bash +ALL_VERSIONS=1 NUM_PARTITIONS=$(nproc) make check +``` + +### SQLite-Only Testing (No Postgres) + +To test with SQLite only (faster, no Postgres dependency): + +```bash +./configure --disable-postgres --enable-ccache --enable-sdfprefs +make clean && make -j $(nproc) +NUM_PARTITIONS=$(nproc) make check +``` + +## Level 3b: Transaction Metadata Baseline Check + +This validates that transaction test execution produces the same metadata hashes +as fixed baselines stored in the repository. This catches unintended changes to +transaction semantics. + +**Important**: Always use `--rng-seed 12345` for baseline checks to ensure +deterministic results. 
+ +```bash +# Check transaction tests against current protocol baseline +./stellar-core test "[tx]" --all-versions --rng-seed 12345 --ll fatal \ + --abort -r simple --check-test-tx-meta test-tx-meta-baseline-current +``` + +For next-protocol testing (when preparing protocol upgrades): + +```bash +./stellar-core test "[tx]" --all-versions --rng-seed 12345 --ll fatal \ + --abort -r simple --check-test-tx-meta test-tx-meta-baseline-next +``` + +If baselines need updating after intentional changes, the test will fail and +indicate which baselines differ. + +## Level 4: Sanitizer Tests + +**When to run**: Only needed for changes touching memory management, pointers, +concurrency, or threading code. Skip for simple logic changes, config changes, +or test-only changes. + +Run tests with sanitizers enabled to catch memory errors and undefined behavior. +This requires reconfiguring and rebuilding. + +### Address Sanitizer (ASan) + +Catches memory errors: buffer overflows, use-after-free, memory leaks. + +```bash +./configure --enable-asan --enable-ccache --enable-sdfprefs +make clean && make -j $(nproc) +./stellar-core test --ll fatal -r simple --disable-dots --abort +``` + +### Thread Sanitizer (TSan) + +Catches data races and threading issues. + +```bash +./configure --enable-threadsanitizer --enable-ccache --enable-sdfprefs +make clean && make -j $(nproc) +./stellar-core test --ll fatal -r simple --disable-dots --abort +``` + +### Undefined Behavior Sanitizer (UBSan) + +Catches undefined behavior like integer overflow, null pointer dereference. + +```bash +./configure --enable-undefinedcheck --enable-ccache --enable-sdfprefs +make clean && make -j $(nproc) +./stellar-core test --ll fatal -r simple --disable-dots --abort +``` + +## Level 5: Extra Checks Build + +**When to run**: Only for changes to core data structures or when Level 4 +sanitizers found something suspicious. Usually overkill. + +Run with C++ standard library debugging enabled. 
Slower but catches more issues.
+
+```bash
+./configure --enable-extrachecks --enable-ccache --enable-sdfprefs
+make clean && make -j $(nproc)
+./stellar-core test --ll fatal -r simple --disable-dots --abort
+```
+
+# Build Verification
+
+Before running tests at Levels 4-5, also verify the build succeeds with
+`--disable-tests` (the production configuration):
+
+```bash
+./configure --disable-tests --enable-ccache --enable-sdfprefs
+make clean && make -j $(nproc)
+```
+
+This doesn't run tests but ensures the production build works.
+
+# Interpreting Failures
+
+When a test fails:
+
+1. **Identify the failing test**: Note the exact test name and file
+2. **Capture the failure output**: Save the error message and stack trace
+3. **Determine if it's a real failure**: Check if the test is flaky or if this
+   is a genuine regression
+4. **Locate the relevant code**: Find where in the changed code the failure
+   originates
+
+## Common Failure Patterns
+
+- **Assertion failure**: A test assertion didn't hold; check the condition
+- **Crash/segfault**: Memory error; run with ASan for more details
+- **Timeout**: Test took too long; may indicate infinite loop or deadlock
+- **Sanitizer error**: Memory or threading bug; the sanitizer output shows where
+
+# Output Format
+
+Report the results:
+
+```
+## Test Results: PASS
+
+All test levels completed successfully:
+- Level 1 (Smoke): 3 tests, 2.1s
+- Level 2 (Focused): 47 tests, 1m 12s
+- Level 3 (Full Suite): 1,234 tests, 18m 45s
+- Level 3b (TX Meta Baseline): OK
+
+Build verification:
+- --disable-tests: OK
+```
+
+Or on failure:
+
+```
+## Test Results: FAIL
+
+Failed at Level 2 (Focused Unit Tests)
+
+**Failing test:** `LedgerManagerTests.processTransactionRejectsEmpty`
+**File:** src/ledger/LedgerManagerTests.cpp:142
+**Error:**
+  REQUIRE( result == TRANSACTION_REJECTED )
+  with expansion:
+  TRANSACTION_SUCCESS == TRANSACTION_REJECTED
+
+**Analysis:** The test expects empty transactions to be rejected, but the
+new 
code path is allowing them through. See LedgerManager.cpp:98 where the +empty check appears to be missing. + +Levels completed before failure: +- Level 1 (Smoke): 3 tests, 2.1s ✓ +``` + +# Choosing the Right Test Level + +**For most changes** (logic fixes, new features, refactors): +- Run through Level 3 (full suite) with `--all-versions` +- Run Level 3b (tx-meta baseline) for transaction-related changes +- Skip Levels 4-5 unless the change touches memory/threading + +**For memory-sensitive changes** (pointers, allocations, C++ containers): +- Run through Level 4 (at least ASan) + +**For concurrency changes** (threading, async, locks): +- Run through Level 4 (especially TSan) + +**For test-only changes** or documentation: +- Level 1-2 is usually sufficient + +# ALWAYS + +- ALWAYS run tests in order of increasing cost +- ALWAYS stop at the first failure (use `--abort` flag) +- ALWAYS use `--ll fatal -r simple --disable-dots` for quiet output +- ALWAYS capture and report failure details +- ALWAYS run full suite with `--all-versions` before considering complete +- ALWAYS use `--rng-seed 12345` for tx-meta baseline checks +- ALWAYS report timing for each level +- ALWAYS identify the specific test and location of failures + +# NEVER + +- NEVER skip smoke tests and go straight to full suite +- NEVER continue to later levels after a failure +- NEVER report "tests failed" without specifics +- NEVER assume a test failure is flaky without evidence +- NEVER run verbose output that floods the context +- NEVER run tests without having built first +- NEVER run sanitizers (Level 4-5) for trivial changes (it's overkill) + +# Completion + +Report to the invoking agent: + +1. Overall result: PASS or FAIL +2. For PASS: Summary of all levels completed with timing +3. For FAIL: Detailed failure report with analysis +4. Any observations (slow tests, warnings, etc.) 
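+
+As a concrete sketch, the first three levels for a change under `src/ledger/`
+might chain as follows; the tags shown are illustrative, so pick ones that
+match the actual change:
+
+```bash
+# Level 1: smoke tests -- a couple of tests that directly hit the change
+./stellar-core test --ll fatal -r simple --disable-dots --abort "[ledgertxn]" &&
+# Level 2: focused tests for the surrounding module
+./stellar-core test --ll fatal -r simple --disable-dots --abort "[tx]" &&
+# Level 3: the full suite in parallel (&& stops the chain at the first failure)
+NUM_PARTITIONS=$(nproc) make check
+```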
diff --git a/.claude/skills/validating-a-change/SKILL.md b/.claude/skills/validating-a-change/SKILL.md
new file mode 100644
index 0000000000..af128430bd
--- /dev/null
+++ b/.claude/skills/validating-a-change/SKILL.md
@@ -0,0 +1,164 @@
+---
+name: validating-a-change
+description: comprehensive validation of a change to ensure it is correct and ready for a pull request
+---
+
+# Overview
+
+This skill is for validating that a change is correct, complete, and ready for
+a pull request. It orchestrates several other skills to systematically check
+formatting, review code, ensure test coverage, and run tests.
+
+This skill involves a lot of work. Consider splitting it into pieces and running
+each step as a subagent. Keep each subagent focused on a moderate amount of work
+so it doesn't get lost or wander off track.
+
+**Critical principle**: Each step must fully resolve its own issues before
+proceeding to the next step. There is no need to restart from the beginning
+when issues are found; fix them in place and continue forward.
+
+The ordering prioritizes:
+1. Code reviews first (catches design and semantic issues early)
+2. Adding any missing tests
+3. Running tests
+4. Multiple configurations (catches config-specific issues)
+5. Formatting last (mechanical cleanup)
+
+# Inputs
+
+Before starting validation, gather the following information (if running as a
+subagent, the invoking agent should provide these; otherwise, determine them
+yourself or ask the user):
+
+1. **Goal of the change**: What is the change trying to accomplish? This is
+   needed for high-level code review.
+
+2. **Type of change**: Is this a new feature, bug fix, refactor, or performance
+   change? This is needed for the adding-tests step.
+
+3. **Bug/issue reference** (if applicable): For bug fixes, the issue number.
+
+4. **Any specific concerns** (optional): Areas that warrant extra attention.
+
+If invoking as a subagent, the prompt should include: "Validate change for:
+<goal>. Change type: <change type>. 
" + +# Prerequisites + +This skill composes several other skills: +- `low-level-code-review` - for mechanical code review +- `high-level-code-review` - for semantic code review +- `adding-tests` - for analyzing and adding test coverage +- `running-tests` - for running tests at all levels +- `running-make-to-build` - for building correctly +- `configuring-the-build` - for setting up different configurations + +# Validation Steps + +Execute these steps in order. Each step should fully resolve any issues it +finds before proceeding to the next step. + +## Step 1: High-Level Code Review + +Consider running as a subagent using the `high-level-code-review` skill. + +Before launching, determine the git range: +- If uncommitted changes exist, use `git diff` and `git diff --cached` +- Otherwise, use `git diff master...HEAD` + +Launch with prompt: "Review the change for: . Get the diff +using ``. " + +This reviews the change for semantic correctness, design consistency, and +completeness. The subagent will return a report of issues by severity. + +If any Critical or Major issues are found, run a subagent to fix them before +proceeding. Minor issues and Suggestions can be addressed or deferred at your +discretion. + +## Step 2: Low-Level Code Review + +Consider running as a subagent using the `low-level-code-review` skill. + +Launch with prompt: "Review the diff from ``" + +This reviews the diff for small, mechanical coding mistakes that don't require +high-level understanding. The subagent will return a worklist of issues. + +If any issues are found, run a subagent to fix them before proceeding. + +## Step 3: Add Tests + +Consider running as a subagent using the `adding-tests` skill. + +Launch with prompt: "Analyze and add tests for: . +Get the diff using ``. " + +This analyzes the change to determine what tests are needed and adds them. +The subagent will return a report of tests added. 
+
+## Step 4: Run Tests
+
+Consider running as a subagent using the `running-tests` skill.
+
+Before launching, identify the changed files/modules from the diff.
+
+Launch with prompt: "Run tests through full suite for changes in <changed files/modules>."
+
+This runs tests at levels 1-3: smoke tests, focused tests, and the full unit
+test suite. The subagent will return a detailed report of results.
+
+If any tests fail, run a subagent to fix the issues before proceeding.
+
+## Step 5: Build and Test with Multiple Configurations
+
+Consider running as a subagent using the `running-tests` skill with extended levels.
+
+Launch with prompt: "Run tests through sanitizers for changes in <changed files/modules>.
+Include fuzz tests if protocol-critical code was changed."
+
+This runs tests at levels 4-5: sanitizer builds (ASan, TSan, UBSan) and the
+extra checks build, plus any randomized/fuzz tests. See the
+`configuring-the-build` skill for details on configuration options.
+
+Also verify the build succeeds with `--disable-tests` (the production config).
+
+If any configuration fails to build or any tests fail, run a subagent to fix
+the issues before proceeding.
+
+## Step 6: Format the Code
+
+Run `make format` to auto-format all source code and verify formatting is clean.
+
+# ALWAYS
+
+- ALWAYS consider running long-running steps as subagents to keep them focused
+- ALWAYS fully resolve issues within a step before proceeding to the next step
+- ALWAYS wait for each subagent to complete before proceeding
+- ALWAYS run high-level review before low-level review
+
+# NEVER
+
+- NEVER run multiple subagents in parallel (they may conflict)
+- NEVER skip any step, even if you think it's unnecessary
+- NEVER consider validation complete until all configurations pass
+- NEVER skip high-level review just because you want to get to tests faster
+
+# Completion
+
+When all steps pass without requiring any code edits, the change is validated
+and ready for a pull request. Summarize your work as follows:
+
+1. 
Summary of what was validated
+2. Configurations tested
+3. Test coverage added (if any)
+4. Any observations or minor concerns that didn't require changes
+
+If validation cannot be completed (e.g., stuck in a loop), report:
+
+1. Which step keeps failing
+2. What issues keep recurring
+3. Whether the fundamental approach may need reconsideration
+
+If invoked as a subagent, pass this summary back to the invoking agent.
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index 5a0adb3613..0105161259 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -1,8 +1,68 @@
-## Additional Code Review Guidelines
-Write fee bump tests to go along with any new regular transaction tests or any logic changes to transaction processing and application.
+# Overview
-If you see TransactionFrame being touched, always check if FeeBumpTransactionFrame support was also added, and report if changes are potentially missing.
+You are an expert software engineer working on a blockchain's primary
+transaction processor called Stellar Core.
-Report any numeric operations that may cause overflow or underflow (e.g., adding two `uint32_t`s near their maximum value).
+It is written in C++ and Rust, is open source, hosted on GitHub at
+github.com/stellar/stellar-core, and is fairly mature code (being worked on for
+over a decade). There is a lot of old code and a lot of tests. There is a team
+of skilled engineers working on it. The code is also live, handling real
+monetary transactions. Correctness is of paramount importance. Testing is
+extensive but not perfect. When modifying an area of code that does not have
+test coverage, write tests first capturing existing behaviour before proceeding
+to make changes. Do everything in your power to ensure that the
+code you write is correct and high quality. Write code sparingly and only when
+you are _sure_ of what to do. 
Spend much more effort planning before writing,
+and validating after writing, than you do writing. Stop and ask for help when in
+doubt. There are multiple skills available to help with this task.
-Report any typos in comments.
+## Basic configuration, building and testing:
+
+ALWAYS limit noisy build output by passing `--enable-sdfprefs` to configure.
+ALWAYS enable ccache by passing `--enable-ccache` to configure.
+ALWAYS build with parallelism by passing `-j $(nproc)` to make.
+ALWAYS limit noisy test output with `--ll fatal -r simple --disable-dots`.
+ALWAYS abort tests on first failure with `--abort`.
+ALWAYS run build and test commands from the top-level directory of the repo.
+ALWAYS run build and test commands with ` 2>&1 | tail -N` to limit output.
+NEVER build from subdirectories, nor pass paths to make or other commands.
+NEVER run cargo manually.
+
+```sh
+# to run autoconf (needed before configure)
+$ ./autogen.sh
+
+# to run configure (needed before build)
+$ ./configure --enable-ccache --enable-sdfprefs
+
+# to build (with parallelism)
+$ make -j $(nproc)
+
+# to run a single test
+$ ./src/stellar-core test --ll fatal -r simple --abort --disable-dots "TestName"
+
+# to run all tests with a given tag (e.g. [tx], [bucket], [overlay] or [soroban])
+$ ./src/stellar-core test --ll fatal -r simple --abort --disable-dots "[tag]"
+
+# to run the whole testsuite (with parallelism)
+$ NUM_PARTITIONS=$(nproc) STELLAR_CORE_TEST_PARAMS='--ll fatal -r simple --abort --disable-dots' make check
+```
+
+
+## Tools, subagents and skills
+
+You will know if you are running "as an agent" in vscode if you have access to
+the `run_in_terminal` or `create_and_run_task` tools.
+
+If you are an agent running in vscode then:
+
+ 1. You should also have access to a bunch of skill blocks in your
+    context. If you don't, ask the user to enable them. The setting in vscode
+    is `"chat.useAgentSkills": true`.
+
+ 2. 
You should have access to a bunch of tools that start with `lsp_` such as + `lsp_get_definition`. If you don't, ask the user to install "LSP language + model tools" extension from the marketplace. Use these `lsp_` tools + instead of `grep`, `rg` or other simple text search tools when seeking + information about the codebase. Especially use `lsp_document_symbols` + for an overview of any given file.