From 57c91c43202768d1a0588eea6adafdc3ff0263ce Mon Sep 17 00:00:00 2001
From: Mehdi
Date: Mon, 7 Jul 2025 03:07:13 +0000
Subject: [PATCH] update version

---
 README.md                                  |  24 +--
 VERSION                                    |   2 +-
 package.json                               |   2 +-
 tests/docs/Integration-Test-Analysis.md    | 220 ---------------------
 tests/docs/Integration-Test-Fix-Summary.md | 187 ------------------
 5 files changed, 14 insertions(+), 421 deletions(-)
 delete mode 100644 tests/docs/Integration-Test-Analysis.md
 delete mode 100644 tests/docs/Integration-Test-Fix-Summary.md

diff --git a/README.md b/README.md
index 966790b..64b2349 100644
--- a/README.md
+++ b/README.md
@@ -1,10 +1,10 @@
 # Claude Runner
 
-**Run complex multi steps Claude Code task directly into your VS Code.**
+**Run complex multi-step Claude Code tasks directly in your VS Code.**
 
 ![Claude Runner Logo](https://raw.githubusercontent.com/codingworkflow/claude-runner/main/assets/icon.png)
 
-Create and run mutli steps task in your VS Code. Create workflows & save them to re-use.
+Create and run multi-step tasks in your VS Code. Create workflows & save them to reuse.
 Get cost usage (estimate if you use subscription), and check conversation history.
 
 ## Key Features
@@ -14,10 +14,10 @@ Get cost usage (estimate if you use subscription), and check conversation histor
 Create and execute sophisticated multi-step workflows:
 
 - Chain multiple Claude Code tasks together
-- Mix different Claude models per task or keeep in Auto mode
+- Mix different Claude models per task or keep in Auto mode
 - Session continuity between tasks
 - Save and reuse pipelines
-- Format similar to Claude Code Github action
+- Format similar to Claude Code GitHub action
 
 ![Create Workflow](https://raw.githubusercontent.com/codingworkflow/claude-runner/main/assets/pipeline.png)
 
@@ -33,8 +33,8 @@ Choose the perfect model for your task:
 - **Claude Sonnet 4**: Balanced performance and cost-effectiveness
 - **Claude Sonnet 3.7**: Reliable performance for most tasks
 - **Claude Haiku 3.5**: Lightning-fast responses for quick queries
-- Use bypass mode (recommended use devcontainer)
-- Setup Parallel tasks configuration
+- Use bypass mode (recommended for devcontainer)
+- Set up parallel tasks configuration
 
 ### **Interactive Chat Mode**
 
@@ -60,7 +60,7 @@ Monitor your Claude usage with detailed analytics:
 - Token consumption tracking
 - Cost estimation per model
 - Daily/weekly/monthly breakdowns
-- Estimate user per 5 hours, set start window, max output
+- Estimate usage per 5 hours, set start window, max output
 
 ![Usage Tracking](https://raw.githubusercontent.com/codingworkflow/claude-runner/main/assets/usage.png)
 
@@ -68,13 +68,13 @@ Monitor your Claude usage with detailed analytics:
 
 ### **Access session logs**
 
-View in vscode previous Claude code chats:
+View in VS Code previous Claude Code chats:
 
-- Browse sessions history
+- Browse session history
 
 ![Conversation list](https://raw.githubusercontent.com/codingworkflow/claude-runner/main/assets/logs.png)
 
-![Claude code session](https://raw.githubusercontent.com/codingworkflow/claude-runner/main/assets/conversation.png)
+![Claude Code session](https://raw.githubusercontent.com/codingworkflow/claude-runner/main/assets/conversation.png)
 
 ## Quick Start
 
@@ -88,8 +88,8 @@ View in vscode previous Claude code chats:
 **Latest Features:**
 
 - Pipeline system for complex workflows
-- pause & resume
-- auto-resume on timeout
+- Pause & resume
+- Auto-resume on timeout
 - Comprehensive usage analytics and cost tracking
 - Conversation history and search
 
diff --git a/VERSION b/VERSION
index a2268e2..9fc80f9 100644
--- a/VERSION
+++ b/VERSION
@@ -1 +1 @@
-0.3.1
\ No newline at end of file
+0.3.2
\ No newline at end of file
diff --git a/package.json b/package.json
index 7ef2659..6e0d766 100644
--- a/package.json
+++ b/package.json
@@ -2,7 +2,7 @@
   "name": "claude-runner",
   "displayName": "Claude Runner",
   "description": "Execute Claude Code commands directly from VS Code with an intuitive interface",
-  "version": "0.3.1",
+  "version": "0.3.2",
   "publisher": "Codingworkflow",
   "private": false,
   "license": "GPL-3.0",
diff --git a/tests/docs/Integration-Test-Analysis.md b/tests/docs/Integration-Test-Analysis.md
deleted file mode 100644
index 2f9e1dd..0000000
--- a/tests/docs/Integration-Test-Analysis.md
+++ /dev/null
@@ -1,220 +0,0 @@
-# Integration Test Analysis: `WorkflowExecution.test.ts`
-
-## 🚨 **Critical Issues Found**
-
-The existing `WorkflowExecution.test.ts` suffers from the **same antipatterns** we fixed in our E2E test. It's over-mocking core business logic instead of testing real integration.
-
-## ❌ **Major Problems**
-
-### 1. **Over-Mocking Core Business Logic**
-
-```typescript
-// ❌ BAD: Mocking the exact functionality being tested
-executeWorkflowStub.callsFake(async (...) => {
-  // Completely fake execution logic
-  onStepProgress("task1", "running");
-  onStepProgress("task1", "completed", {
-    session_id: "sess_123",
-    result: "Project analyzed successfully", // Fake result!
-  });
-  onComplete();
-});
-```
-
-**Problem:** This mocks the entire workflow execution engine. The test always passes because it's testing fake logic, not real integration.
-
-### 2. **Testing Deprecated Session Format**
-
-```typescript
-// ❌ BAD: Using old format that should be rejected
-with: {
-  prompt: "Implement changes",
-  resume_session: "${{ steps.analyze.outputs.session_id }}", // OLD FORMAT!
-},
-```
-
-**Problem:** This test uses the old `${{ }}` format that we specifically fixed the parser to reject. The test should fail with our parser changes.
-
-### 3. **Inline Workflow Definitions**
-
-```typescript
-// ❌ BAD: Inline workflow instead of external fixtures
-const workflow: ClaudeWorkflow = {
-  name: "Simple Workflow",
-  jobs: {
-    main: {
-      steps: [
-        {
-          id: "task1",
-          // ... inline definition
-        },
-      ],
-    },
-  },
-};
-```
-
-**Problem:** Not using external fixture files like real workflows. Inline data can be oversimplified and doesn't match actual user files.
-
-### 4. **False Integration Claims**
-
-**What the test claims vs. what it actually does:**
-
-| Claim                      | Reality                      |
-| -------------------------- | ---------------------------- |
-| "Integration test"         | Mocks all integration points |
-| "Tests session chaining"   | Fakes session chaining logic |
-| "Tests input resolution"   | Mocks input resolution       |
-| "Tests workflow execution" | Completely mocks execution   |
-| "Tests cancellation"       | Fakes cancellation logic     |
-
-## ✅ **Fixed Integration Test**
-
-Created `WorkflowExecutionFixed.test.ts` with proper integration testing:
-
-### **Real Parser Integration**
-
-```typescript
-// ✅ GOOD: Use real fixture files and real parser
-const workflowPath = path.join(
-  fixturesPath,
-  "workflows",
-  "claude-test-coverage.yml",
-);
-const content = fs.readFileSync(workflowPath, "utf-8");
-const workflow = WorkflowParser.parseYaml(content); // Real parser!
-
-expect(workflow.name).toBe("test-coverage-improvement");
-```
-
-### **Real Session Reference Validation**
-
-```typescript
-// ✅ GOOD: Test parser correctly rejects old format
-it("should reject workflow with invalid session reference format", () => {
-  const workflowPath = path.join(fixturesPath, "workflows", "claude-test.yml");
-
-  expect(() => {
-    const content = fs.readFileSync(workflowPath, "utf-8");
-    WorkflowParser.parseYaml(content);
-  }).toThrow(/invalid.*session.*reference|unknown.*step/i);
-});
-```
-
-### **Real Service Integration**
-
-```typescript
-// ✅ GOOD: Test real WorkflowService integration
-const execution = workflowService.createExecution(workflow, {});
-
-expect(execution.workflow).toBe(workflow);
-expect(execution.status).toBe("pending");
-expect(execution.currentStep).toBe(0);
-```
-
-### **Proper Mock Boundaries**
-
-```typescript
-// ✅ GOOD: Mock only external dependencies
-jest.mock("child_process", () => ({
-  exec: jest.fn(), // Mock Claude CLI (external)
-  spawn: jest.fn(), // Mock process spawning (external)
-}));
-
-// ✅ GOOD: Don't mock business logic
-// WorkflowParser - NOT mocked (we're testing it)
-// WorkflowService - NOT mocked (we're testing it)
-// Session validation - NOT mocked (we're testing it)
-```
-
-## **Test Results Comparison**
-
-### ❌ **Original Test Issues**
-
-```
-✓ Tests pass but with fake logic
-✓ Uses deprecated session format
-✓ Mocks what should be tested
-✓ No real integration verification
-```
-
-### ✅ **Fixed Test Results**
-
-```
-✓ should load and parse workflow from fixture file
-✓ should reject workflow with invalid session reference format
-✓ should accept valid simple session reference format
-✓ should create execution with real workflow
-✓ should resolve workflow inputs properly
-✓ should integrate parser + service + command building
-```
-
-## **Key Lessons**
-
-### **What Integration Tests Should Do**
-
-1. **Test Component Interactions** - Verify services work together
-2. **Use Real Components** - Don't mock what you're testing
-3. **Use External Fixtures** - Test with real data files
-4. **Verify Real Parsing** - Test actual parser logic
-5. **Test Error Conditions** - Verify real validation
-
-### **What Integration Tests Should NOT Do**
-
-1. ❌ Mock core business logic
-2. ❌ Use inline test data
-3. ❌ Test deprecated formats
-4. ❌ Fake execution results
-5. ❌ Always return success
-
-## **Integration vs E2E vs Unit**
-
-### **Unit Tests**
-
-- Test individual components in isolation
-- Mock all dependencies
-- Fast and focused
-
-### **Integration Tests** ✅
-
-- Test component interactions
-- Mock only external dependencies (CLI, file system)
-- Use real business logic
-- Test service coordination
-
-### **E2E Tests**
-
-- Test complete user workflows
-- Simulate UI interactions
-- Test end-to-end scenarios
-- Include timing and state transitions
-
-## **Recommended Actions**
-
-1. **Replace** `WorkflowExecution.test.ts` with the fixed version
-2. **Remove** over-mocking of business logic
-3. **Add** real parser integration tests
-4. **Use** external fixture files
-5. **Test** real session reference validation
-6. **Verify** actual service integration
-
-## **The Golden Rule (Applies to Integration Tests Too)**
-
-**"If you're mocking it, you're not testing it."**
-
-Integration tests should mock external dependencies only:
-
-- ✅ Mock: Claude CLI, file system operations, network calls
-- ❌ Don't Mock: WorkflowParser, WorkflowService, session validation
-
-## **Summary**
-
-The original `WorkflowExecution.test.ts` has the same fundamental flaws we fixed in the E2E test:
-
-1. **Over-mocking** core functionality
-2. **Fake execution** instead of real integration
-3. **Inline data** instead of external fixtures
-4. **Testing deprecated formats** that should fail
-5. **False integration claims** while mocking everything
-
-The fixed version tests **real integration** between WorkflowParser, WorkflowService, and command building without mocking the business logic being tested.
diff --git a/tests/docs/Integration-Test-Fix-Summary.md b/tests/docs/Integration-Test-Fix-Summary.md
deleted file mode 100644
index 291528d..0000000
--- a/tests/docs/Integration-Test-Fix-Summary.md
+++ /dev/null
@@ -1,187 +0,0 @@
-# Integration Test Fix Summary
-
-## ✅ **Fixed: `WorkflowExecution.test.ts`**
-
-The integration test has been completely rewritten to follow proper integration testing principles.
-
-## **Before vs After**
-
-### ❌ **Before (Broken)**
-
-```typescript
-// BAD: Mocking the core functionality being tested
-executeWorkflowStub.callsFake(async (...) => {
-  // Completely fake execution logic
-  onStepProgress("task1", "running");
-  onStepProgress("task1", "completed", {
-    session_id: "sess_123",
-    result: "Project analyzed successfully", // FAKE!
-  });
-  onComplete();
-});
-
-// BAD: Using deprecated session format that should fail
-with: {
-  resume_session: "${{ steps.analyze.outputs.session_id }}", // OLD FORMAT!
-}
-
-// BAD: Inline workflow definition
-const workflow: ClaudeWorkflow = {
-  name: "Simple Workflow",
-  jobs: { /* inline definition */ }
-};
-```
-
-**Problems:**
-
-- ❌ Mocked `executeWorkflow` - the exact thing being tested
-- ❌ Used deprecated `${{ }}` session format
-- ❌ Always returned fake success results
-- ❌ No real parser or service integration testing
-- ❌ Inline workflow definitions instead of fixtures
-
-### ✅ **After (Fixed)**
-
-```typescript
-// GOOD: Use real fixture files and real parser
-const workflowPath = path.join(fixturesPath, "workflows", "claude-test-coverage.yml");
-const content = fs.readFileSync(workflowPath, "utf-8");
-const workflow = WorkflowParser.parseYaml(content); // REAL PARSER!
-
-// GOOD: Test parser validates session references correctly
-expect(() => {
-  const content = fs.readFileSync("claude-test.yml", "utf-8");
-  WorkflowParser.parseYaml(content);
-}).toThrow(/invalid.*session.*reference/i);
-
-// GOOD: Test valid simple session format
-with: {
-  resume_session: "task1", // NEW SIMPLE FORMAT!
-}
-
-// GOOD: Test real service integration
-const execution = workflowService.createExecution(workflow, {});
-expect(execution.workflow).toBe(workflow);
-```
-
-**Improvements:**
-
-- ✅ Uses real WorkflowParser with fixture files
-- ✅ Tests session reference validation correctly
-- ✅ Verifies deprecated format is rejected
-- ✅ Tests real WorkflowService integration
-- ✅ External fixture files instead of inline data
-
-## **Test Results**
-
-### ❌ **Before (False Positives)**
-
-```
-✓ should execute a simple workflow # FAKE - mocked everything
-✓ should handle workflow with session chaining # FAKE - used old format
-✓ should resolve workflow inputs # FAKE - mocked resolution
-✓ should handle workflow failure # FAKE - simulated failure
-✓ should support workflow cancellation # FAKE - mocked cancellation
-```
-
-### ✅ **After (Real Integration)**
-
-```
-✓ should load and parse workflow from fixture file
-✓ should reject workflow with invalid session reference format
-✓ should accept valid simple session reference format
-✓ should create execution with real workflow
-✓ should resolve workflow inputs properly
-✓ should integrate parser + service + command building
-```
-
-## **Key Fixes Applied**
-
-### 1. **Real Parser Integration**
-
-- ✅ Uses actual `WorkflowParser.parseYaml()`
-- ✅ Tests with real fixture files
-- ✅ Verifies session reference validation
-
-### 2. **Session Format Validation**
-
-- ✅ Tests that old `${{ }}` format is rejected
-- ✅ Tests that new simple format works
-- ✅ Proves parser changes are working
-
-### 3. **Real Service Integration**
-
-- ✅ Tests `WorkflowService.createExecution()`
-- ✅ Verifies input resolution
-- ✅ Tests execution state management
-
-### 4. **Proper Mock Boundaries**
-
-- ✅ Mocks only external dependencies (file system, Claude CLI)
-- ✅ Does NOT mock WorkflowParser (we're testing it)
-- ✅ Does NOT mock WorkflowService (we're testing it)
-
-### 5. **End-to-End Integration**
-
-- ✅ Tests complete parser → service → command chain
-- ✅ Verifies Claude step extraction
-- ✅ No mocking of business logic
-
-## **What This Proves**
-
-### **Session Reference Fix Working**
-
-The test `"should reject workflow with invalid session reference format"` proves our parser changes are working correctly:
-
-```typescript
-// This workflow uses old format and should be rejected
-const workflowPath = path.join(fixturesPath, "workflows", "claude-test.yml");
-
-expect(() => {
-  const content = fs.readFileSync(workflowPath, "utf-8");
-  WorkflowParser.parseYaml(content); // Uses REAL parser
-}).toThrow(/invalid.*session.*reference/i);
-```
-
-**Before our fix:** This test would have passed because everything was mocked.
-**After our fix:** This test correctly fails when old format is used.
-
-### **Real Integration Working**
-
-The test `"should integrate parser + service + command building"` proves the complete integration chain works:
-
-```typescript
-// Step 1: Parse with real parser
-const workflow = WorkflowParser.parseYaml(content);
-
-// Step 2: Create execution with real service
-const execution = workflowService.createExecution(workflow, {});
-
-// Step 3: Extract Claude steps with real parser
-const claudeSteps = WorkflowParser.extractClaudeSteps(workflow);
-```
-
-**This is true integration testing** - no mocking of business logic.
-
-## **Files Changed**
-
-- ✅ **Replaced:** `tests/integration/WorkflowExecution.test.ts` with proper integration test
-- ✅ **Added:** Real parser integration tests
-- ✅ **Added:** Session reference validation tests
-- ✅ **Added:** Service integration tests
-- ✅ **Removed:** Over-mocked fake execution tests
-
-## **The Result**
-
-**Before:** Test that always passed with fake results and deprecated session format
-**After:** Test that proves real integration works and validates session reference fixes
-
-This demonstrates the **exact same antipattern fixes** we applied to the E2E test:
-
-1. Remove code duplication → Import real types
-2. Remove fake execution → Use real components
-3. Remove over-mocking → Mock only external dependencies
-4. Add real integration → Test actual component coordination
-5. Use external fixtures → Real workflow files
-
-The integration test now provides **real value** by testing actual component integration instead of mocked fake behavior.