feat(write-contract): evidence-based rewrite with 28/28 tested patterns #5

Open
acastellana wants to merge 3 commits into genlayerlabs:main from acastellana:feat/write-contract-v2

Conversation

@acastellana
Collaborator

@acastellana acastellana commented Mar 12, 2026

Summary

Complete rewrite of the write-contract skill for the genlayer-dev plugin, based on hands-on development and testing of production intelligent contracts.

What changed

  • All patterns are test-verified (28/28) — nothing included on faith
  • _parse_llm_json helper included verbatim — handles dict/str/markdown fence edge cases, saves hours of debugging
  • 4 equivalence principle patterns with clear when-to-use guidance (partial field matching, numeric tolerance, LLM comparative, non-comparative)
  • Error prefix convention ([EXPECTED]/[EXTERNAL]/[TRANSIENT]) for classifying failures in validator logic
  • gl.vm.Return vs gl.vm.Result isinstance gotcha documented — gl.vm.Result is a type alias, not safe for isinstance
  • run_nondet_unsafe vs run_nondet table with clear guidance — use run_nondet_unsafe for all custom patterns
  • Testing section with validator agree/disagree examples and run_validator() usage
  • Stable JSON comparison with sort_keys=True note
  • Reference index — 13 URLs covering all major docs/SDK pages so Claude can fetch detail on demand
  • Leaner overall — removed unverified patterns, cut ~70 lines vs prior version
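
The `sort_keys=True` note matters because two dictionaries with equal contents can serialize to different strings when their keys were inserted in a different order. A minimal sketch in plain Python, with no SDK involved (the `stable_dumps` helper name is hypothetical and not part of this PR):

```python
import json


def stable_dumps(value) -> str:
    """Serialize with sorted keys and fixed separators so equal data yields equal strings."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"))


leader = {"winner": 1, "score": "2:1"}
validator = {"score": "2:1", "winner": 1}  # same data, different insertion order

# Plain dumps preserves insertion order, so the two strings differ.
assert json.dumps(leader) != json.dumps(validator)
# Sorted keys give an identical string for both sides.
assert stable_dumps(leader) == stable_dumps(validator)
```

Comparing the serialized strings this way is only meaningful when both sides hold the same value types; it does not normalize, for example, `1` versus `1.0`.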

Verified against

  • SDK source at sdk.genlayer.com
  • Production contracts deployed on GenLayer testnet
  • genlayer-testing-suite examples

Summary by CodeRabbit

  • Documentation
    • Major rewrite of GenLayer contract docs: added versioned header and new contract skeleton; clarified storage types (TreeMap, DynArray, Address, u256) and forbid Python built-ins for on‑chain storage.
    • New sections: storage types, method decorators, transaction context, error-prefix conventions, address normalization, nondeterministic patterns and safety, JSON/nested-collection handling.
    • Expanded examples: web/LLM patterns, testing/validator workflows, cross-contract examples, and MCP doc lookup.

- All patterns test-verified (u256 arithmetic, nested collections, address
  handling, validator result types, run_nondet_unsafe vs run_nondet)
- _parse_llm_json helper included verbatim (handles dict/str/markdown fence)
- Equivalence principle: 4 patterns with clear when-to-use guidance
- Error prefix convention ([EXPECTED]/[EXTERNAL]/[TRANSIENT])
- Testing section: validator agree/disagree examples, run_validator() usage
- Reference index: 13 URLs covering all major docs/SDK pages
- Removed unverified patterns; leaner overall (~230 lines vs prior)
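
As an illustration of how the error prefix convention could drive validator agreement, here is a plain-Python sketch. The function name `errors_agree` is hypothetical; the rules (exact match for `[EXPECTED]` and `[EXTERNAL]`, prefix match for `[TRANSIENT]`, disagree otherwise) follow the error handler in the prior version of SKILL.md shown in this diff, and nothing here calls the SDK.

```python
EXPECTED, EXTERNAL, TRANSIENT = "[EXPECTED]", "[EXTERNAL]", "[TRANSIENT]"


def errors_agree(leader_msg: str, validator_msg: str) -> bool:
    """Decide whether two error messages count as the same outcome."""
    # Deterministic failures: both nodes should produce the identical message.
    if validator_msg.startswith((EXPECTED, EXTERNAL)):
        return validator_msg == leader_msg
    # Transient failures: details may differ, so a shared prefix is enough.
    if validator_msg.startswith(TRANSIENT):
        return leader_msg.startswith(TRANSIENT)
    # Unprefixed or unknown errors: disagree so the round is retried.
    return False
```

In a real `validator_fn` the two messages would come from the leader's result and from the exception raised when the validator re-runs the leader function; how those are obtained depends on the SDK and is not shown here.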
@coderabbitai

coderabbitai bot commented Mar 12, 2026

Walkthrough

Comprehensive rewrite of the write-contract SKILL.md: replaces the prior contract skeleton with a versioned header and new contract scaffold, introduces GenLayer storage types, method decorators, transaction context and error conventions, nondeterministic-block guidance, equivalence/testing patterns, JSON/storage rules, and MCP doc lookup instructions.

Changes

| Cohort / File(s) | Summary |
|------------------|---------|
| **Write-Contract Skill Documentation** `plugins/genlayer-dev/skills/write-contract/SKILL.md` | Complete restructuring and expansion: added required file headers and versioned contract scaffold (class MyContract(gl.Contract)), changed constructor and public API (__init__(name: str), get_info() -> str, increment()), replaced Python built-ins with GenLayer storage types (TreeMap, DynArray, u256, Address), introduced storage & JSON patterns for nested/dynamic collections, method decorator and gas semantics, transaction context and standardized error prefixes, nondeterministic-block patterns and testing guidance (run nondet vs nondet_unsafe), LLM/web access equivalence patterns, and MCP doc lookup instructions. Areas needing attention: storage type replacements, renamed methods/constructor, JSON/stability/testing patterns, and nondeterministic-block caveats. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 I nibble on headers and count every byte,
Swapped lists for TreeMaps and made JSON polite.
Nondets now tamed, errors tagged with care,
Docs fetched from MCP — I hop and declare! 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|------------|--------|-------------|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main change: a comprehensive rewrite of the write-contract skill documentation with 28 tested patterns. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |



@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (2)
plugins/genlayer-dev/skills/write-contract/SKILL.md (2)

156-156: Version-specific bug callout may become outdated.

The reference to "SDK v0.25.0 bug" will age as new SDK versions are released. Consider linking to a GitHub issue or using more general guidance (e.g., "As of March 2026, direct test mocking...").

📝 Suggested alternative wording
-⚠️ `web.render(mode='screenshot')` cannot be mocked in direct tests (SDK v0.25.0 bug — returns empty bytes). To test visual contracts in direct mode, pass image bytes as a method argument.
+⚠️ `web.render(mode='screenshot')` cannot be mocked in direct tests (known issue as of March 2026 — returns empty bytes). To test visual contracts in direct mode, pass image bytes as a method argument.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/genlayer-dev/skills/write-contract/SKILL.md` at line 156, The wording
in SKILL.md calls out "SDK v0.25.0 bug" for web.render(mode='screenshot') which
will become outdated; update the note to use a stable reference instead—either
link to a canonical GitHub issue/PR for the bug or rephrase to a timebound or
general statement (e.g., "As of <date>, direct test mocking of
web.render(mode='screenshot') returns empty bytes; pass image bytes to the
method to test visual contracts in direct mode") so the guidance remains
accurate without hardcoding a version.

168-176: Document edge case limitations in _parse_llm_json.

The helper assumes JSON object output (not arrays) and extracts content between first { and last }. If an LLM returns an array like [1, 2, 3], find("{") returns -1, bypassing extraction and passing raw input to json.loads(). While the documented examples only use object patterns, explicitly note this limitation in the docstring or expand the function to handle both cases.
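
One possible way to cover the array case is sketched below. This is not the helper from the PR: the name `parse_llm_json_any` and the `FENCE` constant are hypothetical, and the approach (take whichever opening bracket appears first, then slice to the last matching closer) is just one reasonable extension of the brace logic described above.

```python
import json

FENCE = "`" * 3  # markdown code fence marker, built this way to keep the example readable


def parse_llm_json_any(raw):
    """Parse LLM output that may be a dict, a list, or JSON text wrapped in prose or fences."""
    if isinstance(raw, (dict, list)):
        return raw
    s = str(raw).strip().replace(FENCE + "json", "").replace(FENCE, "").strip()
    # Collect (position, opener, closer) for each bracket type present in the text.
    candidates = [(s.find(o), o, c) for o, c in (("{", "}"), ("[", "]")) if s.find(o) >= 0]
    if candidates:
        # The earliest opener decides whether the payload is an object or an array.
        start, _, closer = min(candidates)
        end = s.rfind(closer) + 1
        if end > start:
            s = s[start:end]
    return json.loads(s)
```

Input with no bracket at all still falls through to `json.loads`, so plain scalars such as `"42"` parse and free text raises `json.JSONDecodeError`, the same as in the brace-only version.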

Note: The claim of "28/28 tested patterns" is not detailed in SKILL.md. If test coverage exists, document specific patterns tested.

Also: Per coding guidelines, SKILL.md should sync with a skill.yaml file—currently none exists in this directory.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/genlayer-dev/skills/write-contract/SKILL.md` around lines 168 - 176,
Update the helper _parse_llm_json to handle and/or document array outputs:
either expand parsing to detect JSON arrays by searching for the first '[' and
last ']' when no '{' is found (use first/last bracket pair similar to current
brace logic) before calling json.loads, or add a clear docstring in SKILL.md and
above function _parse_llm_json that lists this limitation (LLM outputs must be
JSON object or array, describe current behavior with braces/array extraction and
mention tested patterns). Also add a short note in SKILL.md enumerating which
LLM output patterns were validated (or remove the precise "28/28" claim), and
ensure a corresponding skill.yaml exists or note in the docs that SKILL.md must
be synced with skill.yaml for this skill directory.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 955a6d94-b7df-41e1-a2ba-6d4d3f5cee91

📥 Commits

Reviewing files that changed from the base of the PR and between a682e2f and 09951c9.

📒 Files selected for processing (1)
  • plugins/genlayer-dev/skills/write-contract/SKILL.md

@dohernandez
Member

@acastellana you need to solve conflicts


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/genlayer-dev/skills/write-contract/SKILL.md`:
- Around line 251-256: The example for prompt_non_comparative contradicts the
"no web fetching" rule by calling gl.nondet.web.get(url); update the example so
the lambda returns a local/validated string (e.g., report_text) instead of
performing a web fetch, and ensure the snippet uses prompt_non_comparative(...)
with that local variable; reference the prompt_non_comparative call and the
gl.nondet.web.get usage when making the change so the example and rule are
consistent.
- Around line 335-338: The fenced code block containing the two search_docs
examples in SKILL.md is missing a language tag (triggering markdownlint MD040);
update that block by adding a language identifier (for example "text") after the
opening backticks so it becomes ```text and keep the same inner lines
(`search_docs(library="genlayer-docs", query="<topic>")` and
`search_docs(library="genlayer-sdk", query="<topic>")`), ensuring the fenced
block is properly tagged.
- Around line 1-363: Create and add a new skill.yaml next to SKILL.md that
provides the machine-readable procedure for the write-contract skill (name:
write-contract) mirroring SKILL.md's metadata and behavior: include fields for
name, description, allowed-tools (Bash, Read, Write, Edit,
mcp__genlayer-docs__search_docs, mcp__genlayer-docs__fetch_url), inputs/outputs
if any, and a concise stepwise procedure that maps to the sections in SKILL.md
(required file headers, contract skeleton, storage types, method decorators,
transaction context, error handling, non-deterministic blocks, equivalence
principle patterns, testing, and doc lookup). Ensure the YAML is valid (proper
keys, strings, lists) and references the other required files (validations.yaml,
sharp-edges.yaml, collaboration.yaml) as present artifacts so the skill
directory conforms to the required layout before merge.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f8fef337-d45d-4c0a-9c7c-2b8b3c2d72ed

📥 Commits

Reviewing files that changed from the base of the PR and between 09951c9 and e1dd509.

📒 Files selected for processing (1)
  • plugins/genlayer-dev/skills/write-contract/SKILL.md

Comment on lines 1 to +363
---
name: write-contract
description: Write production-quality GenLayer intelligent contracts. Covers equivalence principle selection, validator patterns, storage rules, LLM resilience, and cross-contract interaction.
description: Write GenLayer intelligent contracts — storage types, decorators, non-deterministic blocks, equivalence principle patterns, and error handling.
allowed-tools:
- Bash
- Read
- Write
- Edit
- Grep
- Glob
- mcp__genlayer-docs__search_docs
- mcp__genlayer-docs__fetch_url
---

# Write Intelligent Contract
# Writing Intelligent Contracts

Guidance for writing GenLayer intelligent contracts that pass consensus, handle errors correctly, and survive production.
Intelligent contracts are Python classes extending `gl.Contract`. State is persisted on-chain. Non-deterministic operations (LLM calls, web fetches) achieve consensus via the **Equivalence Principle**.

Always lint with `genvm-lint check` after writing or modifying a contract.
## Required File Headers

## Contract Skeleton
Every contract file **must** start with these two lines — deployment fails with `absent_runner_comment` without them:

```python
# { "Depends": "py-genlayer:test" }
# v0.1.0
# { "Depends": "py-genlayer:latest" }
```

## Contract Skeleton

```python
# v0.1.0
# { "Depends": "py-genlayer:latest" }
from genlayer import *
import json

@gl.contract
class MyContract:
# Storage fields — typed, persisted on-chain
class MyContract(gl.Contract):
owner: Address
items: TreeMap[str, Item]
item_order: DynArray[str]
name: str
count: u256
items: TreeMap[str, str]
tags: DynArray[str]

def __init__(self, param: str):
self.owner = gl.message.sender_account
def __init__(self, name: str):
self.owner = gl.message.sender_address
self.name = name
self.count = u256(0)

@gl.public.view
def get_item(self, item_id: str) -> dict:
return {"id": item_id, "value": self.items[item_id].value}
def get_info(self) -> str:
return json.dumps({"name": self.name, "count": int(self.count)})

@gl.public.write
def set_item(self, item_id: str, value: str) -> None:
if gl.message.sender_account != self.owner:
raise gl.UserError("Only owner")
self.items[item_id] = Item(value=value)
self.item_order.append(item_id)
def increment(self):
self.count = u256(int(self.count) + 1)
```

## Equivalence Principle — Which One to Use
## Storage Types

This is the most critical decision. Pick wrong and consensus will fail or be trivially exploitable.
Never use plain `int`, `list`, or `dict` — they cause deployment failures.

### Decision Tree
| Python type | GenLayer type | Notes |
|-------------|---------------|-------|
| `int` | `u256` / `u8`–`u256` / `i8`–`i256` / `bigint` | No plain `int` |
| `list[T]` | `DynArray[T]` | |
| `dict[K, V]` | `TreeMap[K, V]` | Keys must be `str` or `u256` |
| `float`, `bool`, `str`, `bytes` | same | Work directly |
| `datetime` | `datetime.datetime` | |
| address | `Address` | |

**`u256` arithmetic** — convert to/from `int` explicitly (✅ tested):
```python
self.count = u256(int(self.count) + 1)
total = int(self.amount_a) + int(self.amount_b)
```
Is the external call deterministic (same input → same output)?
├── YES → gl.eq_principle.strict_eq(fn)
│ Examples: blockchain RPC, stable REST APIs, DNS lookups
└── NO (LLM, dynamic web content, variable APIs)
Does the output have a clear "decision" field?
├── YES → gl.eq_principle.prompt_comparative(fn, principle="...")
│ Examples: oracle resolution, content classification
│ principle: "`outcome` must match exactly. Analysis can differ."
└── NO (numeric scores, complex multi-field output)
└── gl.vm.run_nondet_unsafe(leader_fn, validator_fn)
Write custom comparison with tolerances.
Examples: tweet scoring, price feeds with drift

**Nested collections** — `TreeMap` can't hold `DynArray`. Use JSON strings (✅ tested):
```python
id_list = json.loads(self.index.get(key) or "[]")
id_list.append(new_id)
self.index[key] = json.dumps(id_list)
```

### strict_eq — Deterministic calls only
## Method Decorators

| Decorator | Purpose |
|-----------|---------|
| `@gl.public.view` | Read-only, free |
| `@gl.public.write` | Modifies state |
| `@gl.public.write.payable` | Modifies state + accepts tokens |
| `@gl.public.write.min_gas(leader=N, validator=N)` | Minimum gas for non-det operations |

## Transaction Context

```python
def fetch_balance(self) -> int:
def call_rpc():
res = gl.nondet.web.post(rpc_url, body=payload, headers=headers)
return json.loads(res.body.decode("utf-8"))["result"]
return gl.eq_principle.strict_eq(call_rpc)
gl.message.sender_address # caller (Address)
gl.message.value # tokens sent (u256, payable only)
self.balance # this contract's token balance
```

Never use for LLM calls or web pages that change between requests.
## Error Handling

### prompt_comparative: LLM with clear outcome
Use `gl.vm.UserError`, never bare `ValueError` or `Exception` (linter blocks deployment):

```python
def resolve(self) -> str:
def analyze():
page = gl.get_webpage(url, mode="text")
return gl.exec_prompt(f"Analyze: {page}\nReturn JSON with outcome and reasoning.")

return gl.eq_principle.prompt_comparative(
analyze,
principle="`outcome` field must be exactly the same. All other fields must be similar.",
)
if gl.message.sender_address != self.owner:
raise gl.vm.UserError("Only owner can call this")
```

Good for: oracle resolution, content classification, yes/no decisions backed by reasoning.
**Error prefix convention** (classifies failures for validator logic):
```python
raise gl.vm.UserError("[EXPECTED] Resource not found") # business logic
raise gl.vm.UserError("[EXTERNAL] Fetch failed: 404") # external API error
raise gl.vm.UserError("[TRANSIENT] Timeout on request") # temporary failure
```

### run_nondet_unsafe — Custom validator logic
## Address Handling

Use when you need tolerances, gate checks, or complex comparison.
Constructors must handle both `bytes` (test framework) and `str` (JS SDK) (✅ tested):

```python
def score_content(self, content: str) -> dict:
# Pre-read storage BEFORE entering nondet block
cached_data = gl.storage.copy_to_memory(self.reference_data)
def __init__(self, party_b: Address):
if isinstance(party_b, (str, bytes)):
party_b = Address(party_b)
self.party_b = party_b
```

def leader_fn():
analysis = gl.nondet.exec_prompt(prompt, response_format="json")
score = _parse_llm_score(analysis)
return {"score": score, "analysis": str(analysis.get("analysis", ""))}
⚠️ In tests, `create_address()` returns raw `bytes`. Pass them directly or as `"0x" + addr.hex()`. Never pass `str(raw_bytes)` — that produces Python repr (`"b'\\xd8...'"`) and raises `binascii.Error`.
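
The `str(raw_bytes)` pitfall can be reproduced in plain Python without the SDK. The 20-byte value below is an arbitrary stand-in for what `create_address()` returns; only standard `bytes` behavior is shown.

```python
# Arbitrary 20-byte value standing in for a test address.
raw = bytes.fromhex("d8" * 20)

# str() on bytes yields the Python repr, not a hex string.
as_repr = str(raw)
assert as_repr.startswith("b'")

# The hex form is what a string-based address parser can handle.
as_hex = "0x" + raw.hex()
assert len(as_hex) == 42

# The hex form round-trips back to the original bytes; the repr does not.
assert bytes.fromhex(as_hex[2:]) == raw
```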

def validator_fn(leaders_res: gl.vm.Result) -> bool:
if not isinstance(leaders_res, gl.vm.Return):
return _handle_leader_error(leaders_res, leader_fn)
---

validator_result = leader_fn()
leader_score = leaders_res.calldata["score"]
validator_score = validator_result["score"]
## Non-Deterministic Blocks

# Gate check: if either is zero (reject), both must agree
if (leader_score == 0) != (validator_score == 0):
return False
Non-det code **must** be inside a zero-argument function passed to an equivalence principle function. Storage is **not accessible** inside — copy to locals first.

# Tolerance: within 5x/0.5x bounds (log10 comparison)
if leader_score > 0 and validator_score > 0:
ratio = leader_score / validator_score
if ratio > 5.0 or ratio < 0.2:
return False
```python
@gl.public.write
def evaluate(self, url: str):
target = url # capture for closure
name = self.name # copy storage to local

return True
def leader_fn():
resp = gl.nondet.web.get(target)
return gl.nondet.exec_prompt(f"Analyze {name}: {resp.body.decode()[:4000]}")

return gl.vm.run_nondet_unsafe(leader_fn, validator_fn)
# ... pass to equivalence principle function
```

## Error Classification

Classify errors so validators know how to compare them. This is critical for consensus on failure paths.
### Web Access

```python
ERROR_EXPECTED = "[EXPECTED]" # Business logic (deterministic) — exact match required
ERROR_EXTERNAL = "[EXTERNAL]" # External API 4xx (deterministic) — exact match required
ERROR_TRANSIENT = "[TRANSIENT]" # Network/5xx (non-deterministic) — agree if both transient
ERROR_LLM = "[LLM_ERROR]" # LLM misbehavior — always disagree, force retry
resp = gl.nondet.web.get(url) # resp.status, resp.body (bytes)
resp = gl.nondet.web.post(url, body="...", headers={})
text = gl.nondet.web.render(url, mode="text") # JS-rendered → string
img = gl.nondet.web.render(url, mode="screenshot") # JS-rendered → bytes (PNG)
img = gl.nondet.web.render(url, mode="screenshot", wait_after_loaded="1000ms")
```

### Canonical error handler for validators
`mode='text'`/`'html'` → string. `mode='screenshot'` → bytes — only mode compatible with `images=[...]`.

```python
def _handle_leader_error(leaders_res, leader_fn) -> bool:
leader_msg = leaders_res.message if hasattr(leaders_res, 'message') else ''
try:
leader_fn()
return False # Leader errored, validator succeeded — disagree
except gl.vm.UserError as e:
validator_msg = e.message if hasattr(e, 'message') else str(e)
# Deterministic errors: must match exactly
if validator_msg.startswith(ERROR_EXPECTED) or validator_msg.startswith(ERROR_EXTERNAL):
return validator_msg == leader_msg
# Transient: agree if both hit transient failure
if validator_msg.startswith(ERROR_TRANSIENT) and leader_msg.startswith(ERROR_TRANSIENT):
return True
# LLM or unknown: disagree — forces consensus retry
return False
except Exception:
return False
```
⚠️ `web.render(mode='screenshot')` cannot be mocked in direct tests (SDK v0.25.0 bug — returns empty bytes). To test visual contracts in direct mode, pass image bytes as a method argument.

### Applying error prefixes
### LLM Calls

```python
# Web requests
if response.status >= 400 and response.status < 500:
raise gl.vm.UserError(f"{ERROR_EXTERNAL} API returned {response.status}")
elif response.status >= 500:
raise gl.vm.UserError(f"{ERROR_TRANSIENT} API temporarily unavailable")

# LLM responses
if not isinstance(analysis, dict):
raise gl.vm.UserError(f"{ERROR_LLM} LLM returned non-dict: {type(analysis)}")

# Business logic
if user_balance < amount:
raise gl.vm.UserError(f"{ERROR_EXPECTED} Insufficient balance")
result = gl.nondet.exec_prompt("Your prompt here")
result = gl.nondet.exec_prompt("Describe this", images=[img_bytes]) # multimodal
```

## Storage Rules

### Types — use GenLayer types, not Python builtins

| Python | GenLayer | Notes |
|--------|----------|-------|
| `dict` | `TreeMap[K, V]` | O(log n) lookup, persisted |
| `list` | `DynArray[T]` | Dynamic array, persisted |
| `int` | `u256` / `i256` | Sized integers for on-chain math |
| `float` | **avoid** | Use atto-scale integers (value * 10^18) |
| `enum` | `str` | Store `.value`, not the enum itself |

### Dataclasses for complex state
`exec_prompt` return type is **not guaranteed to be `str`** across different GenVM backends. Always use this helper (✅ all edge cases tested):

```python
@allow_storage
@dataclass
class Item:
name: str
status: str # Use str, not Enum
atto_amount: u256 # Atto-scale (value * 10^18), not float
created_at: str # ISO format string
tags: DynArray[str]
def _parse_llm_json(raw):
if isinstance(raw, dict):
return raw
s = str(raw).strip().replace("```json", "").replace("```", "").strip()
start, end = s.find("{"), s.rfind("}") + 1
if start >= 0 and end > start:
s = s[start:end]
return json.loads(s)
```

### Layout rules
---

- **Append new fields at END only.** Storage layout is order-sensitive. Reordering or inserting fields breaks deployed contracts.
- **Default values for new fields** — existing storage reads zero/empty for fields added after deployment.
- **Initialize DynArray/TreeMap by appending** in `__init__`, not by assignment. `self.items = [x]` does not work.
- **O(1) stat indexes** — maintain a `TreeMap[str, u256]` counter alongside collections for fast counts.
## Equivalence Principle Patterns

### Storage in non-deterministic blocks
### Pattern 1 — Partial Field Matching (default choice)

Storage is **inaccessible** from inside `leader_fn` / `validator_fn`. Pre-read everything you need:
Leader and validator each run independently. Compare only objective decision fields — ignore subjective text (✅ tested):

```python
# BEFORE the nondet block
cached_store = gl.storage.copy_to_memory(self.data_store)
cached_config = self.config_value # Simple values can be read directly

def leader_fn():
# Use cached_store and cached_config here
similar = list(cached_store.knn(embedding, 10))
...
@gl.public.write
def resolve(self, match_id: str):
url = self.matches[match_id]

def leader_fn():
web_data = gl.nondet.web.get(url)
prompt = f"""
Find the match result. Page: {web_data.body.decode()[:4000]}
Return JSON: {{"score": "X:Y", "winner": 1 or 2 or 0 for draw, "analysis": "reasoning"}}
"""
return _parse_llm_json(gl.nondet.exec_prompt(prompt))

def validator_fn(leader_result) -> bool:
if not isinstance(leader_result, gl.vm.Return): # ✅ tested: correct type
return False
v = leader_fn()
ld = leader_result.calldata
# Only compare decision fields — analysis text will differ across LLMs
return ld["winner"] == v["winner"] and ld["score"] == v["score"]

result = gl.vm.run_nondet_unsafe(leader_fn, validator_fn)
self.matches[match_id].winner = result["winner"]
self.matches[match_id].analysis = result["analysis"]
```

## LLM Resilience
### Pattern 2 — Numeric Tolerance

LLMs return unpredictable formats. Always defensively parse.
For prices or LLM scores that drift between leader and validator execution:

```python
def _parse_llm_score(analysis: dict) -> int:
"""Extract numeric score from LLM response, handling common variations."""
if not isinstance(analysis, dict):
raise gl.vm.UserError(f"{ERROR_LLM} Non-dict response: {type(analysis)}")

# Key aliasing — LLMs use alternate names
raw = analysis.get("score")
if raw is None:
for alt in ("rating", "points", "value", "result"):
if alt in analysis:
raw = analysis[alt]
break

if raw is None:
raise gl.vm.UserError(f"{ERROR_LLM} Missing 'score'. Keys: {list(analysis.keys())}")

# Coerce aggressively — handles int, float, "3", "3.5", whitespace
try:
return max(0, int(round(float(str(raw).strip()))))
except (ValueError, TypeError):
raise gl.vm.UserError(f"{ERROR_LLM} Non-numeric score: {raw}")
```

### JSON cleanup from LLM output
def validator_fn(leader_result) -> bool:
if not isinstance(leader_result, gl.vm.Return):
return False
v_price = leader_fn()
l_price = leader_result.calldata
if l_price == 0:
return v_price == 0
return abs(l_price - v_price) / abs(l_price) <= 0.02 # 2% tolerance

```python
def _parse_json(text: str) -> dict:
"""Clean LLM JSON: strip wrapping text, fix trailing commas."""
import re
first = text.find("{")
last = text.rfind("}")
text = text[first:last + 1]
text = re.sub(r",(?!\s*?[\{\[\"\'\w])", "", text) # Remove trailing commas
return json.loads(text)
result = gl.vm.run_nondet_unsafe(leader_fn, validator_fn)
```

### Always use response_format="json"

For LLM scores (0–10) — handle the zero/rejection gate:
```python
result = gl.nondet.exec_prompt(task, response_format="json")
if l == 0 or v == 0:
return l == v # both must agree on rejection
return abs(l - v) <= 1 # ±1 otherwise
```
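
The two Pattern 2 comparisons can also be pulled out as plain functions and unit-tested without the SDK. This is a sketch: the names `prices_agree` and `scores_agree` and their keyword parameters are hypothetical, while the 2% and ±1 thresholds are the ones used in the snippets above.

```python
def prices_agree(leader: float, validator: float, rel_tol: float = 0.02) -> bool:
    """Relative tolerance check, measured against the leader's value."""
    if leader == 0:
        # Avoid dividing by zero: a zero price only matches another zero.
        return validator == 0
    return abs(leader - validator) / abs(leader) <= rel_tol


def scores_agree(leader: int, validator: int, max_diff: int = 1) -> bool:
    """Absolute tolerance check with a rejection gate at zero."""
    if leader == 0 or validator == 0:
        # Zero means rejection, so both sides must reject.
        return leader == validator
    return abs(leader - validator) <= max_diff
```

Keeping the comparison separate from the fetch means the tolerance rules can be covered by ordinary tests, leaving `run_validator()` runs for the end-to-end path.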

This tells the LLM to return JSON. Still validate and clean — LLMs don't always comply.
### Pattern 3 — LLM Comparison (Comparative)

## Cross-Contract Interaction

### Read from another contract (synchronous)
When results are too rich for programmatic comparison — an LLM judges equivalence:

```python
other = gl.get_contract_at(Address(other_address))
value = other.view().get_data()
result = gl.eq_principle.prompt_comparative(
evaluate_fn,
principle="`outcome` must match exactly. Other fields may differ.",
)
```

### Write to another contract (asynchronous)
### Pattern 4 — Non-Comparative

Validators judge the leader's output against criteria without re-running the task. Use **only** for open-ended tasks with no web fetching (validators can't verify fetched data):

```python
other = gl.get_contract_at(Address(other_address))
other.emit(on="accepted").process_data(payload) # Non-blocking
result = gl.eq_principle.prompt_non_comparative(
lambda: gl.nondet.web.get(url).body.decode(),
task="Summarize in 2-3 sentences",
criteria="Must capture the main point. Must be 2-3 sentences.",
)
```

`emit()` queues the call — it executes after current transaction. Use `on="accepted"` (fast) or `on="finalized"` (safe).
⚠️ Never use for oracle/price/data contracts — validators only check if output looks reasonable, not if the fetched data is correct.

**Warning:** If the current transaction is appealed after `emit()`, the emitted call still happens but the balance may already be decremented.
---

### Factory pattern — deploy child contracts
## `run_nondet_unsafe` vs `run_nondet`

```python
def __init__(self, num_workers: int):
with open("/contract/Worker.py", "rt") as f:
worker_code = f.read()

for i in range(num_workers):
addr = gl.deploy_contract(
code=worker_code.encode("utf-8"),
args=[i, gl.message.contract_address],
salt_nonce=i + 1,
on="accepted",
)
self.worker_addresses.append(addr)
```
| | `run_nondet_unsafe` | `run_nondet` |
|---|---|---|
| Validator exceptions | Unhandled = Disagree | Caught + compared automatically |
| Error handling | You implement in `validator_fn` | Built-in `compare_user_errors` callback |
| Use for | All custom patterns (recommended) | Convenience functions internally |

Workers are immutable after deployment. Code changes require redeploying the factory.
**Use `run_nondet_unsafe` for custom patterns.** Convenience functions (`strict_eq`, `prompt_comparative`, `prompt_non_comparative`) use `run_nondet` internally.

### Cross-chain RPC verification
## Validator Result Types

```python
def verify_deposit(self, rpc_url: str, contract_addr: str, call_data: bytes) -> bytes:
    """Verify state on another chain via eth_call."""
    payload = {
        "jsonrpc": "2.0", "id": 1,
        "method": "eth_call",
        "params": [{"to": contract_addr, "data": "0x" + call_data.hex()}, "latest"],
    }

    def fetch():
        res = gl.nondet.web.post(rpc_url, body=json.dumps(payload).encode(),
                                 headers={"Content-Type": "application/json"})
        if res.status != 200:
            raise gl.vm.UserError(f"{ERROR_EXTERNAL} RPC failed: {res.status}")
        data = json.loads(res.body.decode("utf-8"))
        if "error" in data:
            raise gl.vm.UserError(f"{ERROR_EXTERNAL} RPC error: {data['error']}")
        hex_result = data.get("result", "0x")[2:]
        return bytes.fromhex(hex_result) if hex_result else b""

    return gl.eq_principle.strict_eq(fetch)
def validator_fn(leader_result) -> bool:
    if not isinstance(leader_result, gl.vm.Return):  # covers UserError + VMError
        return False
    data = leader_result.calldata
    # ...
```

## Web Requests
`gl.vm.Return` and `gl.vm.UserError` are real classes — safe for `isinstance`. `gl.vm.Result` is a type alias — do not use in `isinstance`.

### Extracting stable fields for consensus
---

External APIs return variable data (timestamps, counts). Extract only stable fields:
## Testing Validator Logic

Use `direct_vm.run_validator()` to test whether your validator agrees or disagrees. **Only works with `run_nondet_unsafe`** — `strict_eq` uses `spawn_sandbox()` which is not supported in the test mock (✅ tested):

```python
def leader_fn():
    res = gl.nondet.web.get(api_url)
    data = json.loads(res.body.decode("utf-8"))
    # Only return fields that won't change between leader and validator calls
    return {"id": data["id"], "login": data["login"], "status": data["status"]}
    # NOT: follower_count, updated_at, online_status
def test_validator_disagrees(direct_vm, direct_deploy):
    contract = direct_deploy("contracts/MyContract.py")
    direct_vm.sender = b'\x01' * 20

    # Leader run
    direct_vm.mock_llm(r".*", '{"winner": 1, "score": "2:1", "analysis": "A won"}')
    contract.resolve("match_1")

    # Swap mocks: the validator now sees a different result
    direct_vm.clear_mocks()
    direct_vm.mock_llm(r".*", '{"winner": 2, "score": "0:1", "analysis": "B won"}')
    assert direct_vm.run_validator() is False

def test_validator_agrees(direct_vm, direct_deploy):
    contract = direct_deploy("contracts/MyContract.py")
    direct_vm.sender = b'\x01' * 20

    direct_vm.mock_llm(r".*", '{"winner": 1, "score": "2:1", "analysis": "A won"}')
    contract.resolve("match_1")
    # Same mock = same winner+score = validator agrees
    assert direct_vm.run_validator() is True
```

### Deriving status from variable data
`run_validator()` raises `RuntimeError("No validator captured")` if called before any nondet method.

When raw data may differ (e.g., CI check counts change), compare derived summaries:
---

```python
def validator_fn(leaders_res: gl.vm.Result) -> bool:
    validator_checks = leader_fn()
## Stable JSON Comparison

    def derive(checks):
        if not checks: return "pending"
        for c in checks:
            if c.get("conclusion") != "success": return "failing"
        return "success"
When comparing structured output between leader and validator, always serialize with `sort_keys=True` — key order is not guaranteed (✅ tested):

    return derive(leaders_res.calldata) == derive(validator_checks)
```python
json.dumps(result, sort_keys=True) # stable for exact comparison
```
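
As a rough illustration of why `sort_keys=True` matters, the sketch below compares two dictionaries that hold the same data but were built in a different key order. The `canonical` helper and the sample values are hypothetical and not part of the skill or the GenLayer SDK; only the standard-library `json` module is used.

```python
import json

# Same data, different insertion order (e.g. leader vs validator output).
leader = {"winner": 1, "score": "2:1"}
validator = {"score": "2:1", "winner": 1}

# Plain json.dumps follows insertion order, so the strings differ.
assert json.dumps(leader) != json.dumps(validator)


def canonical(obj) -> str:
    """Hypothetical helper: serialize with sorted keys for exact comparison."""
    return json.dumps(obj, sort_keys=True)


# With sorted keys both sides serialize to the same string.
assert canonical(leader) == canonical(validator)
print(canonical(leader))
```

Comparing the canonical strings rather than the raw serializations keeps an exact-match check from failing on key order alone.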

## Anti-Patterns

| Don't | Do Instead | Why |
|-------|-----------|-----|
| `strict_eq()` for LLM calls | `prompt_comparative()` or `run_nondet_unsafe()` | LLM outputs are non-deterministic — strict_eq always fails consensus |
| Read storage inside `leader_fn` | `gl.storage.copy_to_memory()` before the nondet block | Storage is inaccessible in non-deterministic context |
| Store `list` or `dict` | `DynArray[T]` or `TreeMap[K, V]` | Python builtins aren't persistable |
| Use `float` for money/scores | Atto-scale `u256` (value * 10^18) | Floating point has rounding errors |
| Insert fields in the middle of a dataclass | Append at END only | Storage layout is positional — insertion shifts all subsequent fields |
| Store `Enum` directly | Store `enum.value` as `str` | Enum type not supported in storage |
| Ignore LLM response format | Validate type, sanitize JSON, alias keys | LLMs return unpredictable formats |
| Let validator agree on LLM errors | Return `False` (disagree) to force retry | Agreeing on broken LLM output locks bad state |
| Use bare `Exception` in contracts | Use `gl.vm.UserError` with error prefix | Bare exceptions become unrecoverable VMError |
| Compare variable API fields in validators | Extract stable fields or derive status | Timestamps, counts change between calls |
| O(n) scans over large collections | Maintain TreeMap indexes for O(1) lookups | Transactions have compute limits |
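
The "atto-scale" row in the table above can be sketched in plain Python. This is only an illustration under stated assumptions: `to_atto` and `ATTO` are hypothetical names, a plain `int` stands in for the SDK's `u256`, and only non-negative decimal strings are handled.

```python
ATTO = 10**18  # scale factor: 1 unit = 10^18 atto-units


def to_atto(amount: str) -> int:
    """Hypothetical helper: parse a non-negative decimal string into atto-units."""
    if amount.startswith("-"):
        raise ValueError("amount must be non-negative")
    whole, _, frac = amount.partition(".")
    if len(frac) > 18:
        raise ValueError("more than 18 decimal places")
    # Pad the fractional part to exactly 18 digits, then combine as integers.
    return int(whole or "0") * ATTO + int(frac.ljust(18, "0"))


# Floats accumulate rounding error...
assert 0.1 + 0.2 != 0.3
# ...while scaled integers add exactly.
assert to_atto("0.1") + to_atto("0.2") == to_atto("0.3")
```

Parsing from a string rather than from a `float` avoids importing the rounding error before scaling.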

## Testing Strategy

1. **Lint first**: `genvm-lint check contracts/my_contract.py`
2. **Direct mode tests**: Fast (30ms), no server. Tests business logic, validation, state transitions. Validator logic NOT exercised.
3. **Integration tests**: Slow (seconds-minutes), full consensus. Tests validator agreement, real web/LLM calls. Run before deployment.
---

### DEV MODE for external dependencies
## Looking Up Docs

Skip cross-chain verification in tests by checking for zero address:
Use the `genlayer-docs` MCP server when you need detail beyond this skill:

```python
def __init__(self, bridge_sender: str):
    self.bridge_sender = Address(bridge_sender)

def verify_deposit(self, ...):
    if self.bridge_sender == Address("0x" + "0" * 40):
        print("DEV MODE: skipping verification")
        return True
    # Real verification logic...
```
search_docs(library="genlayer-docs", query="<topic>")
search_docs(library="genlayer-sdk", query="<topic>")
```

Deploy with `bridge_sender="0x0000...0000"` in tests, real address in production.
Examples:
- `search_docs(library="genlayer-docs", query="equivalence principle patterns")`
- `search_docs(library="genlayer-sdk", query="TreeMap DynArray storage")`
- `search_docs(library="genlayer-docs", query="security prompt injection")`

If the MCP server is unavailable, fetch docs directly:

| Topic | URL |
|-------|-----|
| **Intelligent Contracts overview** | https://docs.genlayer.com/developers/intelligent-contracts/introduction |
| **Storage types & features** | https://docs.genlayer.com/developers/intelligent-contracts/features/storage |
| **Equivalence Principle (full)** | https://docs.genlayer.com/developers/intelligent-contracts/equivalence-principle |
| **Development setup & workflow** | https://docs.genlayer.com/developers/intelligent-contracts/tooling-setup |
| **Debugging** | https://docs.genlayer.com/developers/intelligent-contracts/debugging |
| **Security & prompt injection** | https://docs.genlayer.com/developers/intelligent-contracts/security-and-best-practices |
| **Python SDK reference** | https://docs.genlayer.com/api-references/genlayer-py |
| **GenLayer Test reference** | https://docs.genlayer.com/api-references/genlayer-test |
| **GenLayer JS reference** | https://docs.genlayer.com/api-references/genlayer-js |
| **CLI reference** | https://docs.genlayer.com/api-references/genlayer-cli |
| **Python SDK source (API)** | https://sdk.genlayer.com/main/api/genlayer.html |
| **Contract examples** | https://github.com/genlayerlabs/genlayer-testing-suite/tree/main/tests/examples |
| **Project boilerplate** | https://github.com/genlayerlabs/genlayer-project-boilerplate |

See also: `genvm-lint`, `direct-tests`, `integration-tests`, and `genlayer-cli` skills for tooling.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate skill definition files for write-contract
fd -a 'SKILL.md|skill.yaml' plugins/genlayer-dev/skills/write-contract

echo
echo "=== Key fields comparison (SKILL.md vs skill.yaml) ==="
rg -n '^(name:|description:|allowed-tools:|  - mcp__genlayer-docs__search_docs|  - mcp__genlayer-docs__fetch_url)' \
  plugins/genlayer-dev/skills/write-contract/SKILL.md \
  plugins/genlayer-dev/skills/write-contract/skill.yaml || true

echo
echo "=== skill.yaml content (first 220 lines) ==="
sed -n '1,220p' plugins/genlayer-dev/skills/write-contract/skill.yaml

Repository: genlayerlabs/skills

Length of output: 977


🏁 Script executed:

ls -la plugins/genlayer-dev/skills/write-contract/

Repository: genlayerlabs/skills

Length of output: 235


🏁 Script executed:

find plugins/genlayer-dev/skills/write-contract -type f -name "*.yaml" -o -name "*.yml"

Repository: genlayerlabs/skills

Length of output: 45


skill.yaml is missing from the write-contract skill directory.

The skill structure is incomplete. Per the required skill directory layout, each skill must contain SKILL.md, skill.yaml, validations.yaml, sharp-edges.yaml, and collaboration.yaml. Currently only SKILL.md exists (12,694 bytes). Create skill.yaml with the machine-readable procedure definition that mirrors the content in SKILL.md. This file is mandatory before merge.

🧰 Tools
🪛 LanguageTool

[grammar] ~158-~158: Ensure spelling is correct
Context: ... — returns empty bytes). To test visual contracts in direct mode, pass image bytes as a m...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

🪛 markdownlint-cli2 (0.21.0)

[warning] 335-335: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/genlayer-dev/skills/write-contract/SKILL.md` around lines 1 - 363,
Create and add a new skill.yaml next to SKILL.md that provides the
machine-readable procedure for the write-contract skill (name: write-contract)
mirroring SKILL.md's metadata and behavior: include fields for name,
description, allowed-tools (Bash, Read, Write, Edit,
mcp__genlayer-docs__search_docs, mcp__genlayer-docs__fetch_url), inputs/outputs
if any, and a concise stepwise procedure that maps to the sections in SKILL.md
(required file headers, contract skeleton, storage types, method decorators,
transaction context, error handling, non-deterministic blocks, equivalence
principle patterns, testing, and doc lookup). Ensure the YAML is valid (proper
keys, strings, lists) and references the other required files (validations.yaml,
sharp-edges.yaml, collaboration.yaml) as present artifacts so the skill
directory conforms to the required layout before merge.

Comment on lines +251 to +256
Validators judge the leader's output against criteria without re-running the task. Use **only** for open-ended tasks with no web fetching (validators can't verify fetched data):

```python
other = gl.get_contract_at(Address(other_address))
other.emit(on="accepted").process_data(payload) # Non-blocking
result = gl.eq_principle.prompt_non_comparative(
    lambda: gl.nondet.web.get(url).body.decode(),
    task="Summarize in 2-3 sentences",
```

⚠️ Potential issue | 🟠 Major

Non-comparative example conflicts with the rule it introduces.

Lines 251–252 say this pattern should be used with no web fetching, but Line 255 fetches via gl.nondet.web.get(url). This contradiction can cause incorrect pattern usage.

Suggested doc fix
-Validators judge the leader's output against criteria without re-running the task. Use **only** for open-ended tasks with no web fetching (validators can't verify fetched data):
+Validators judge the leader's output against criteria without re-running the task. Use **only** for open-ended tasks where the input is already available on-chain or passed as an argument (no fresh web fetching in the nondet task):

 ```python
 result = gl.eq_principle.prompt_non_comparative(
-    lambda: gl.nondet.web.get(url).body.decode(),
+    lambda: report_text,
     task="Summarize in 2-3 sentences",
     criteria="Must capture the main point. Must be 2-3 sentences.",
 )
 ```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/genlayer-dev/skills/write-contract/SKILL.md` around lines 251 - 256,
The example for prompt_non_comparative contradicts the "no web fetching" rule by
calling gl.nondet.web.get(url); update the example so the lambda returns a
local/validated string (e.g., report_text) instead of performing a web fetch,
and ensure the snippet uses prompt_non_comparative(...) with that local
variable; reference the prompt_non_comparative call and the gl.nondet.web.get
usage when making the change so the example and rule are consistent.

Comment on lines +335 to 338
```
search_docs(library="genlayer-docs", query="<topic>")
search_docs(library="genlayer-sdk", query="<topic>")
```

⚠️ Potential issue | 🟡 Minor

Add a language to the fenced code block.

The block at Line 335 is missing a language tag (markdownlint MD040).

Suggested doc fix
-```
+```text
 search_docs(library="genlayer-docs", query="<topic>")
 search_docs(library="genlayer-sdk", query="<topic>")
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 335-335: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/genlayer-dev/skills/write-contract/SKILL.md` around lines 335 - 338,
The fenced code block containing the two search_docs examples in SKILL.md is
missing a language tag (triggering markdownlint MD040); update that block by
adding a language identifier (for example "text") after the opening backticks,
keeping the same inner lines
(`search_docs(library="genlayer-docs", query="<topic>")` and
`search_docs(library="genlayer-sdk", query="<topic>")`), so that the fenced
block is properly tagged.

@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (2)
plugins/genlayer-dev/skills/write-contract/SKILL.md (2)

251-256: ⚠️ Potential issue | 🟠 Major

Rule/example mismatch in non-comparative pattern.

The text says “no web fetching,” but the example fetches with gl.nondet.web.get(url). This can lead to incorrect pattern selection and unverifiable validator behavior.

Suggested doc fix
-Validators judge the leader's output against criteria without re-running the task. Use **only** for open-ended tasks with no web fetching (validators can't verify fetched data):
+Validators judge the leader's output against criteria without re-running the task. Use **only** for open-ended tasks where input is already available locally/on-chain (no fresh web fetching in the nondet task):

 ```python
 result = gl.eq_principle.prompt_non_comparative(
-    lambda: gl.nondet.web.get(url).body.decode(),
+    lambda: report_text,
     task="Summarize in 2-3 sentences",
     criteria="Must capture the main point. Must be 2-3 sentences.",
 )
 ```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/genlayer-dev/skills/write-contract/SKILL.md` around lines 251 - 256,
The example for prompt_non_comparative contradicts the rule forbidding web
fetching: replace the lambda that calls gl.nondet.web.get(url).body.decode()
with a local/immutable value (e.g., report_text) so validators can judge without
re-running or verifying external fetches; update the example call to
prompt_non_comparative (and its criteria string if needed) to use the local
variable instead of gl.nondet.web.get.

399-402: ⚠️ Potential issue | 🟡 Minor

Add a language tag to the fenced code block.

The block starting at Line 399 is missing a language identifier (MD040).

Suggested doc fix
-```
+```text
 search_docs(library="genlayer-docs", query="<topic>")
 search_docs(library="genlayer-sdk", query="<topic>")

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/genlayer-dev/skills/write-contract/SKILL.md` around lines 399 - 402,
The fenced code block that shows the two calls to search_docs
(`search_docs(library="genlayer-docs", query="<topic>")` and
`search_docs(library="genlayer-sdk", query="<topic>")`) is missing a language tag;
add an appropriate language identifier (e.g., "text" or "bash") to the opening
fence, leaving the two search_docs lines and the closing fence unchanged, so the
code block is properly tagged to satisfy MD040.


🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@plugins/genlayer-dev/skills/write-contract/SKILL.md`:

  • Around line 251-256: The example for prompt_non_comparative contradicts the
    rule forbidding web fetching: replace the lambda that calls
    gl.nondet.web.get(url).body.decode() with a local/immutable value (e.g.,
    report_text) so validators can judge without re-running or verifying external
    fetches; update the example call to prompt_non_comparative (and its criteria
    string if needed) to use the local variable instead of gl.nondet.web.get.
  • Around line 399-402: The fenced code block that shows the two calls to
    search_docs (`search_docs(library="genlayer-docs", query="<topic>")` and
    `search_docs(library="genlayer-sdk", query="<topic>")`) is missing a language tag;
    add an appropriate language identifier (e.g., "text" or "bash") to the opening
    fence, leaving the two search_docs lines and the closing fence unchanged, so
    the code block is properly tagged to satisfy MD040.

---

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: `ddc72012-a054-43f9-a5a5-9beacacd213a`

📥 Commits

Reviewing files that changed from the base of the PR and between e1dd509178dcb2313a1cf72e62730a3c84e2e022 and 1001dd1f4b3fab50b5b6ebfd59654fdc859fdc68.

📒 Files selected for processing (1)

* `plugins/genlayer-dev/skills/write-contract/SKILL.md`
