JustInternetAI · justinmadison · Nov 29, 2025 · Nov 24, 2025 · Nov 25, 2025 · Nov 29, 2025
diff --git a/.claude/project-context.md b/.claude/project-context.md
@@ -261,6 +261,16 @@ else:
 - Issue #16: Connect tool execution system in Godot (assigned to Justin) - ✅ **COMPLETE**
   - See: `TESTING_TOOL_EXECUTION.md`, `TOOL_TESTING_FIXED.md` for details
   - Test scenes: `scenes/tests/` (use `test_tool_execution_simple.tscn` for quick verification)
+- Issue #22: **[HIGH PRIORITY]** Implement LLM request batching for concurrent agent decisions
+  - https://github.com/JustInternetAI/AgentArena/issues/22
+  - Current bottleneck: Each agent makes individual LLM calls
+  - Expected impact: 50-70% faster LLM inference with 4+ agents
+  - Files: `python/agent_runtime/runtime.py`, `python/backends/vllm_backend.py`
+- Issue #23: **[MEDIUM PRIORITY]** Convert tool execution to async for concurrent tool handling
+  - https://github.com/JustInternetAI/AgentArena/issues/23
+  - Make `ToolDispatcher` async-compatible
+  - Allows FastAPI to handle multiple tool requests concurrently
+  - Files: `python/agent_runtime/tool_dispatcher.py`, `python/ipc/server.py`
 - LLM backend integration (assigned to Andrew) - In Progress
 
 ## References
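
The batching idea in Issue #22 above can be sketched as follows. This is an illustration only, not code from this patch: `RequestBatcher`, `generate_batch`, `max_batch_size`, and `max_wait_s` are hypothetical names, and the backend call is stubbed out rather than using any real vLLM API. Concurrent agent requests are held for a short window and then sent to the backend in one call.

```python
import asyncio


class RequestBatcher:
    """Collects concurrent prompts and forwards them to the backend in one call."""

    def __init__(self, generate_batch, max_batch_size=8, max_wait_s=0.01):
        # generate_batch is a hypothetical async callable: list[str] -> list[str]
        self._generate_batch = generate_batch
        self._max_batch_size = max_batch_size
        self._max_wait_s = max_wait_s
        self._pending = []       # (prompt, future) pairs waiting for a flush
        self._flush_task = None  # timer task for the current batching window

    async def submit(self, prompt):
        """Queue one prompt and wait for its result from the next batch."""
        future = asyncio.get_running_loop().create_future()
        self._pending.append((prompt, future))
        if len(self._pending) >= self._max_batch_size:
            await self._flush()  # batch is full: send immediately
        elif self._flush_task is None:
            self._flush_task = asyncio.create_task(self._flush_after_wait())
        return await future

    async def _flush_after_wait(self):
        await asyncio.sleep(self._max_wait_s)
        self._flush_task = None
        await self._flush()

    async def _flush(self):
        batch, self._pending = self._pending, []
        if not batch:
            return
        try:
            results = await self._generate_batch([p for p, _ in batch])
        except Exception as exc:
            # Propagate the failure to every caller waiting on this batch.
            for _, future in batch:
                future.set_exception(exc)
            return
        for (_, future), result in zip(batch, results):
            future.set_result(result)


async def fake_generate_batch(prompts):
    # Stand-in for a real backend: one call handles every prompt in the batch.
    return [f"decision for {p}" for p in prompts]


async def main():
    batcher = RequestBatcher(fake_generate_batch)
    # Four agents request decisions concurrently; the backend is called once.
    return await asyncio.gather(*(batcher.submit(f"agent-{i}") for i in range(4)))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The short wait window trades a small amount of latency per request for fewer backend calls. Whether that yields the speedup estimated in the issue depends on the backend and would need to be measured.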

diff --git a/.gitignore b/.gitignore
@@ -76,3 +76,4 @@ tmp/
 # GitHub automation scripts (used for setup, not needed in repo)
 scripts/create_github_*.bat
 scripts/create_github_*.sh
+nul
diff --git a/TESTING_OBSERVATION_LOOP.md b/TESTING_OBSERVATION_LOOP.md
@@ -0,0 +1,188 @@
+# Observation-Decision Loop Test
+
+## Overview
+
+This test validates the complete observation-decision pipeline without executing actual agent movement. It demonstrates that game observations can be sent to the Python backend, processed into decisions, and returned to Godot.
+
+**GitHub Issue:** [#28](https://github.com/JustInternetAI/AgentArena/issues/28)
+
+## What This Tests
+
+- ✅ **Observation Serialization** - Game data → JSON format
+- ✅ **HTTP Communication** - Godot → Python backend
+- ✅ **Mock Decision Logic** - Rule-based decision making
+- ✅ **Response Handling** - JSON → Game data
+- ✅ **Continuous Loop** - 10 ticks without errors
+
+## Files Implemented
+
+### Backend (Python)
+- **`python/ipc/server.py`** - Added `/observe` endpoint (line 276)
+- **`python/ipc/server.py`** - Added `_make_mock_decision()` method (line 71)
+- **`python/test_observe_endpoint.py`** - Python test script
+- **`python/OBSERVE_ENDPOINT.md`** - API documentation
+
+### Frontend (Godot)
+- **`scripts/tests/test_observation_loop.gd`** - Test script
+- **`scenes/tests/test_observation_loop.tscn`** - Test scene
+- **`scenes/tests/README.md`** - Updated documentation
+
+## How to Run
+
+### Step 1: Start Python Backend
+
+```bash
+cd python
+venv\Scripts\activate  # Windows; on Linux/macOS: source venv/bin/activate
+python run_ipc_server.py
+```
+
+**Expected output:**
+```
+Agent Arena IPC Server
+Host: 127.0.0.1
+Port: 5000
+Max Workers: 4
+Starting IPC server...
+Registered 12 tools
+```
+
+### Step 2: Run Godot Test
+
+1. Open Godot project
+2. Navigate to `scenes/tests/test_observation_loop.tscn`
+3. Press **F6** to run the scene
+4. Watch the console output
+
+### Step 3: Observe Results
+
+**Godot Console** will show:
+```
+=== Observation-Decision Loop Test ===
+✓ Connected to Python backend!
+
+=== STARTING OBSERVATION LOOP TEST ===
+Running 10 ticks...
+
+[Initial State]
+  Agent position: (0, 0, 0)
+  Resources: 4
+    - Berry1 (berry) at distance 5.83
+    - Berry2 (berry) at distance 4.47
+    - Wood1 (wood) at distance 7.62
+    - Stone1 (stone) at distance 8.25
+  Hazards: 2
+    - Fire1 (fire) at distance 2.83
+    - Pit1 (pit) at distance 5.10
+
+--- Tick 0 ---
+Sending observation:
+  Position: (0, 0, 0)
+  Nearby resources: 4
+  Nearby hazards: 2
+✓ Decision received:
+  Tool: move_away
+  Params: {from_position:[2, 0, 2]}
+  Reasoning: Avoiding nearby fire hazard at distance 2.8
+  → Simulated position update: (-0.707107, 0, -0.707107)
+
+--- Tick 1 ---
+...
+```
+
+**Python Backend** will log:
+```
+INFO:ipc.server:Agent test_forager_001 decision: move_away - Avoiding nearby fire hazard at distance 2.8
+INFO:ipc.server:Agent test_forager_001 decision: move_to - Moving to collect berry (Berry2) at distance 4.5
+...
+```
+
+## Mock Decision Logic
+
+The backend's `_make_mock_decision()` checks the following rules in priority order and returns the first one that matches:
+
+### Priority 1: Avoid Hazards (distance < 3.0)
+- Returns `move_away` tool
+- Agent moves away from dangerous hazards
+
+### Priority 2: Collect Resources (distance < 5.0)
+- Returns `move_to` tool
+- Agent moves toward nearest collectible resource
+
+### Priority 3: Idle (default)
+- Returns `idle` tool
+- No immediate actions needed
+
+## Success Criteria
+
+After running the test, verify:
+
+- [ ] Test runs for all 10 ticks without errors
+- [ ] Each tick sends observation to backend
+- [ ] Each tick receives decision from backend
+- [ ] Decisions make logical sense:
+  - Early ticks: Avoid fire (distance 2.83 < 3.0)
+  - Later ticks: Move to resources (after moving away from hazard)
+- [ ] Agent position updates (simulated, not real movement)
+- [ ] No crashes or connection failures
+- [ ] Python logs show all decisions
+
+## Controls
+
+- **T** - Run test again (resets position to origin)
+- **Q** - Quit the test
+
+## Expected Behavior
+
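+1. **Early ticks (from tick 0):** Agent avoids the fire hazard while it is closer than 3.0 (initial distance 2.83)
+2. **After each `move_away`:** Simulated position shifts away from the fire, so the hazard distance increases
+3. **Once the fire is at least 3.0 away:** Agent moves toward the nearest resource within 5.0 (Berry2 starts at 4.47)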
+
+## Troubleshooting
+
+### "Connection failed"
+- Ensure Python IPC server is running
+- Check that port 5000 is not blocked
+- Verify IPCService autoload is configured
+
+### "No decision received"
+- Check Python console for errors
+- Verify `/observe` endpoint exists (check `python/ipc/server.py`)
+- Test endpoint manually: `python test_observe_endpoint.py`
+
+### JSON parse errors
+- Check observation format in `build_observation()`
+- Verify all position arrays are `[x, y, z]` format
+- Check Python logs for validation errors
+
+## What's Next
+
+After this test passes:
+
+1. **Integrate with foraging scene** - Add observation loop to `scripts/foraging.gd`
+2. **Replace mock decisions** - Integrate real LLM backend
+3. **Implement movement execution** - Actually move agents based on decisions
+4. **Add multi-agent support** - Test with multiple agents simultaneously
+
+## Metrics
+
+View backend metrics:
+```bash
+curl http://127.0.0.1:5000/metrics
+```
+
+Expected response after a single 10-tick run:
+```json
+{
+  "total_observations_processed": 10,
+  "total_ticks": 0,
+  "total_tools_executed": 0,
+  ...
+}
+```
+
+## Related Documentation
+
+- **API Docs:** `python/OBSERVE_ENDPOINT.md`
+- **Test Suite:** `scenes/tests/README.md`
+- **GitHub Issue:** [#28](https://github.com/JustInternetAI/AgentArena/issues/28)
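
The priority rules in the "Mock Decision Logic" section of `TESTING_OBSERVATION_LOOP.md` can be sketched as below. This is an illustration under assumptions, not the `_make_mock_decision()` from `python/ipc/server.py`. The observation field names (`nearby_hazards`, `nearby_resources`, `name`, `type`, `position`, `distance`), the `target_position` key, and the unit-step `step_away` helper are all hypothetical. Only the two thresholds, the tool names, and the `from_position` parameter come from the document, and the unit step is inferred from the sample position update at tick 0.

```python
import math

HAZARD_RADIUS = 3.0    # Priority 1 threshold, from the document
RESOURCE_RADIUS = 5.0  # Priority 2 threshold, from the document


def make_mock_decision(observation):
    """Return the first matching rule: avoid hazard, collect resource, else idle."""
    # Field names below are assumed for illustration, not the real schema.
    hazards = [h for h in observation.get("nearby_hazards", [])
               if h["distance"] < HAZARD_RADIUS]
    if hazards:
        nearest = min(hazards, key=lambda h: h["distance"])
        return {
            "tool": "move_away",
            "params": {"from_position": nearest["position"]},
            "reasoning": f"Avoiding nearby {nearest['type']} hazard "
                         f"at distance {nearest['distance']:.1f}",
        }

    resources = [r for r in observation.get("nearby_resources", [])
                 if r["distance"] < RESOURCE_RADIUS]
    if resources:
        nearest = min(resources, key=lambda r: r["distance"])
        return {
            "tool": "move_to",
            "params": {"target_position": nearest["position"]},  # assumed key
            "reasoning": f"Moving to collect {nearest['type']} ({nearest['name']}) "
                         f"at distance {nearest['distance']:.1f}",
        }

    return {"tool": "idle", "params": {}, "reasoning": "No immediate actions needed"}


def step_away(position, from_position, step=1.0):
    """Move `step` units directly away from `from_position` (assumed update rule)."""
    delta = [p - f for p, f in zip(position, from_position)]
    length = math.sqrt(sum(d * d for d in delta))
    if length == 0.0:
        return list(position)  # no defined direction; stay put
    return [p + step * d / length for p, d in zip(position, delta)]


observation = {
    "position": [0, 0, 0],
    "nearby_hazards": [
        {"name": "Fire1", "type": "fire", "position": [2, 0, 2], "distance": 2.83},
    ],
    "nearby_resources": [
        # Position is hypothetical; only the distance appears in the sample output.
        {"name": "Berry2", "type": "berry", "position": [4, 0, 2], "distance": 4.47},
    ],
}
decision = make_mock_decision(observation)
print(decision["tool"], "-", decision["reasoning"])
# Roughly (-0.707, 0, -0.707), matching the simulated update shown for tick 0.
print(step_away(observation["position"], decision["params"]["from_position"]))
```

Under these assumptions, one unit step directly away from the fire raises its distance from 2.83 to about 3.83. The hazard rule then stops matching, and the next decision falls through to the resource rule, which agrees with the second line of the sample Python log.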
diff --git a/agent_arena.gdextension b/agent_arena.gdextension
@@ -3,7 +3,7 @@ entry_symbol = "agent_arena_library_init"
 compatibility_minimum = "4.5"
 
 [libraries]
-windows.debug.x86_64 = "res://bin/windows/libagent_arena.windows.template_debug.x86_64.dll"
+windows.debug.x86_64 = "res://bin/windows/libagent_arena.windows.template_release.x86_64.dll"
 windows.release.x86_64 = "res://bin/windows/libagent_arena.windows.template_release.x86_64.dll"
 linux.debug.x86_64 = "res://bin/linux/libagent_arena.linux.template_debug.x86_64.so"
 linux.release.x86_64 = "res://bin/linux/libagent_arena.linux.template_release.x86_64.so"

diff --git a/docs/architecture.md b/docs/architecture.md
@@ -14,8 +14,15 @@ The core simulation engine built as a Godot 4 GDExtension module.
 
 - `SimulationManager`: Manages deterministic tick loop and simulation state
 - `EventBus`: Handles event recording and replay for reproducibility
-- `Agent`: Godot-side agent representation with perception and action execution
+- `Agent`: Core C++ agent class with perception and memory (wrapped by SimpleAgent)
+- `SimpleAgent`: GDScript wrapper providing auto-discovery and signal-based tool responses
 - `ToolRegistry`: Manages available tools and their execution
+- `IPCClient`: Handles HTTP communication with Python backend
+
+**Autoload Services:**
+
+- `IPCService`: Global singleton managing connection to Python backend
+- `ToolRegistryService`: Global singleton managing tool registration and execution
 
 **Responsibilities:**
 
@@ -24,6 +31,7 @@ The core simulation engine built as a Godot 4 GDExtension module.
 - Sensor data collection (raycasts, vision, etc.)
 - Action execution in the game world
 - Navigation and pathfinding
+- Tool execution through Python IPC backend
 
 **Data Flow:**