11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,14 @@
## [3.0.2](https://github.com/jmlweb/hyntx/compare/v3.0.1...v3.0.2) (2026-03-22)

### Bug Fixes

- **security:** remove ajv override that broke ESLint (ajv v6/v8 incompatibility) ([88aac5c](https://github.com/jmlweb/hyntx/commit/88aac5cce1820ec62c1c1ec0f90c45e3b4004767))
- update dependency overrides to resolve security vulnerabilities ([685befa](https://github.com/jmlweb/hyntx/commit/685befaffa058c5e6b45fd57432c73b0664aad29))

### Documentation

- add quality assessment report ([722de83](https://github.com/jmlweb/hyntx/commit/722de83b18ee4f332d79c02fe69b7cf03960038d))

## [3.0.1](https://github.com/jmlweb/hyntx/compare/v3.0.0...v3.0.1) (2026-01-27)

### Bug Fixes
28 changes: 14 additions & 14 deletions README.md
@@ -176,7 +176,7 @@ hyntx -m individual # Short form
- Analyzing high-stakes or complex prompts
- Conducting quality audits or teaching sessions

-**Performance Note**: Numbers based on `gemma3:4b` on CPU. Actual speed varies by hardware, model size, and prompt complexity.
+**Performance Note**: Numbers based on `gemma4:e4b` on CPU. Actual speed varies by hardware, model size, and prompt complexity.

**Detailed Guide**: See [Analysis Modes Documentation](./docs/ANALYSIS_MODES.md) for comprehensive comparison, examples, and decision guidelines.

@@ -255,17 +255,17 @@ Configure one or more providers in priority order. Hyntx will try each provider
```bash
# Single provider (Ollama only)
export HYNTX_SERVICES=ollama
-export HYNTX_OLLAMA_MODEL=gemma3:4b
+export HYNTX_OLLAMA_MODEL=gemma4:e4b

# Multi-provider with fallback (tries Ollama first, then Anthropic)
export HYNTX_SERVICES=ollama,anthropic
-export HYNTX_OLLAMA_MODEL=gemma3:4b
+export HYNTX_OLLAMA_MODEL=gemma4:e4b
export HYNTX_ANTHROPIC_KEY=sk-ant-your-key-here

# Cloud-first with local fallback
export HYNTX_SERVICES=anthropic,ollama
export HYNTX_ANTHROPIC_KEY=sk-ant-your-key-here
-export HYNTX_OLLAMA_MODEL=gemma3:4b
+export HYNTX_OLLAMA_MODEL=gemma4:e4b
```

#### Provider-Specific Variables
@@ -274,7 +274,7 @@ export HYNTX_OLLAMA_MODEL=gemma3:4b

| Variable | Default | Description |
| -------------------- | ------------------------ | ----------------- |
-| `HYNTX_OLLAMA_MODEL` | `gemma3:4b`              | Model to use      |
+| `HYNTX_OLLAMA_MODEL` | `gemma4:e4b`             | Model to use      |
| `HYNTX_OLLAMA_HOST` | `http://localhost:11434` | Ollama server URL |

**Anthropic:**
@@ -303,7 +303,7 @@ export HYNTX_REMINDER=7d
```bash
# Add to ~/.zshrc or ~/.bashrc (or let Hyntx auto-save it)
export HYNTX_SERVICES=ollama,anthropic
-export HYNTX_OLLAMA_MODEL=gemma3:4b
+export HYNTX_OLLAMA_MODEL=gemma4:e4b
export HYNTX_ANTHROPIC_KEY=sk-ant-your-key-here
export HYNTX_REMINDER=14d

@@ -327,7 +327,7 @@ Ollama runs AI models locally for **privacy and cost savings**.
2. Pull a model:

```bash
-ollama pull gemma3:4b
+ollama pull gemma4:e4b
```

3. Verify it's running:
@@ -369,7 +369,7 @@ Configure multiple providers for automatic fallback:
```bash
# If Ollama is down, automatically try Anthropic
export HYNTX_SERVICES=ollama,anthropic
-export HYNTX_OLLAMA_MODEL=gemma3:4b
+export HYNTX_OLLAMA_MODEL=gemma4:e4b
export HYNTX_ANTHROPIC_KEY=sk-ant-your-key-here
```

@@ -481,11 +481,11 @@ If using Ollama (recommended for privacy):
ollama serve

# Pull a model if needed
-ollama pull gemma3:4b
+ollama pull gemma4:e4b

# Set environment variables (add to ~/.zshrc or ~/.bashrc)
export HYNTX_SERVICES=ollama
-export HYNTX_OLLAMA_MODEL=gemma3:4b
+export HYNTX_OLLAMA_MODEL=gemma4:e4b
```

### Available MCP Tools
@@ -662,7 +662,7 @@ Use check-context to verify: "Update the component to handle errors"
#### "Slow responses"

- Local Ollama models avoid network latency but need a GPU for best performance
-- Consider using a faster model: `export HYNTX_OLLAMA_MODEL=gemma3:4b:1b`
+- Consider using a faster model: `export HYNTX_OLLAMA_MODEL=gemma4:e2b`
- Cloud providers (Anthropic, Google) offer faster responses but require API keys

## Privacy & Security
@@ -708,15 +708,15 @@ For local analysis with Ollama, you need to have a compatible model installed. S

| Use Case | Model | Parameters | Disk Size | Speed (CPU) | Quality |
| ------------------- | ------------- | ---------- | --------- | -------------- | --------- |
-| **Daily use**       | `gemma3:4b`   | 2-3B       | ~2GB      | ~2-5s/prompt   | Good      |
+| **Daily use**       | `gemma4:e4b`  | ~5GB Q4    | ~5GB      | ~3-7s/prompt   | Good      |
| **Production** | `mistral:7b` | 7B | ~4GB | ~5-10s/prompt | Better |
| **Maximum quality** | `qwen2.5:14b` | 14B | ~9GB | ~15-30s/prompt | Excellent |

**Installation**:

```bash
-# Install recommended model (gemma3:4b)
-ollama pull gemma3:4b
+# Install recommended model (gemma4:e4b)
+ollama pull gemma4:e4b

# Or choose a different model
ollama pull mistral:7b
91 changes: 91 additions & 0 deletions docs/HEURISTICS_ANALYSIS.md
@@ -0,0 +1,91 @@
# Heuristics Analysis Report

**Date:** 2026-01-30
**Author:** HAL (assisted analysis)

## Overview

This document analyzes the `extractRealExamples()` function in `src/core/aggregator.ts` and the category mapping in `src/providers/base.ts`.

## Current Architecture

### Analysis Pipeline

```
Prompts → AI Provider → Minimal/Individual Result → Aggregator → Full AnalysisResult → Semantic Validator
```

### Key Components

1. **ISSUE_TAXONOMY** (`schemas.ts`): 8 predefined issue types
- `vague`, `no-context`, `too-broad`, `no-goal`, `imperative`
- `missing-technical-details`, `unclear-priorities`, `insufficient-constraints`

2. **extractRealExamples()** (`aggregator.ts`): Heuristic matcher for fallback examples
- Only used when AI doesn't provide examples (minimal mode)
- Uses boolean matching with specific patterns per issue type

3. **Individual Mode**: AI returns per-prompt results with real examples
- `parseBatchIndividualResponse()` in `base.ts`
- Examples come directly from AI categorization
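
The fallback relationship between these components can be sketched as follows (the types and function names here are illustrative simplifications, not the actual signatures in `aggregator.ts`):

```typescript
// Illustrative sketch: individual mode already carries AI-supplied examples,
// while minimal mode falls back to heuristic matching over the raw prompts.
interface PromptIssue {
  type: string;       // an ISSUE_TAXONOMY id, e.g. 'vague'
  examples: string[]; // offending prompts, if the AI supplied them
}

function resolveExamples(
  issue: PromptIssue,
  prompts: string[],
  heuristicMatch: (prompt: string, issueType: string) => boolean,
): string[] {
  // Individual mode: the AI already attached real examples.
  if (issue.examples.length > 0) return issue.examples;
  // Minimal mode: fall back to heuristics, as extractRealExamples() does.
  return prompts.filter((p) => heuristicMatch(p, issue.type));
}
```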

## Findings

### extractRealExamples() Heuristics

The current implementation uses strict boolean matching:

| Issue Type | Current Heuristic |
| ---------- | ------------------------------------------------------------------------- |
| vague      | < 50 chars, ≤ 5 words, generic verbs, no file extensions                  |
| no-context | Has pronouns (this/it/that), no files, no function/component/method/class |
| too-broad  | > 100 chars, ≥ 2 "and", has also/then/build/create                        |
| no-goal    | < 30 chars, ≤ 4 words, no action verbs, no question mark                  |
| imperative | < 20 chars, ≤ 3 words, starts with verb                                   |
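
As a concrete illustration, the `vague` row above could be implemented roughly like this (a sketch under the table's assumptions; the verb list and regexes are guesses, not the actual code in `aggregator.ts`):

```typescript
// Sketch of the boolean 'vague' heuristic described in the table above.
const GENERIC_VERBS = /\b(fix|update|change|improve|handle|make)\b/i; // assumed verb list
const FILE_EXTENSION = /\.\w{1,5}\b/; // matches e.g. ".ts", ".md"

function isVague(prompt: string): boolean {
  const words = prompt.trim().split(/\s+/);
  return (
    prompt.length < 50 &&
    words.length <= 5 &&
    GENERIC_VERBS.test(prompt) &&
    !FILE_EXTENSION.test(prompt)
  );
}
```

Because every condition must hold, a single counter-signal (one file extension, a sixth word) disqualifies the prompt entirely — which is the all-or-nothing behavior the next recommendation addresses.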

### Category Mapping Inconsistency

`base.ts` uses different category IDs than `schemas.ts`:

| base.ts (individual mode) | schemas.ts (taxonomy) |
| ------------------------- | --------------------- |
| `vague-request` | `vague` |
| `missing-context` | `no-context` |
| `unclear-goal` | `no-goal` |

## Recommendations

### 1. Unify Category IDs

Add mapping in `base.ts`:

```typescript
const CATEGORY_TO_TAXONOMY_ID: Record<string, string> = {
'vague-request': 'vague',
'missing-context': 'no-context',
'unclear-goal': 'no-goal',
// ... etc
};
```

### 2. Improve Heuristics (Future Work)

Consider scoring-based matching instead of boolean:

- Calculate match score (0-1) per prompt per issue
- Select highest-scoring examples
- More nuanced matching for edge cases
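
One way to sketch that scoring approach (weights and signal definitions are hypothetical, not an implemented API):

```typescript
// Scoring-based matching: each signal contributes a weight, and the
// score is the fraction of total weight that fired (0..1).
type Signal = { weight: number; test: (prompt: string) => boolean };

function matchScore(prompt: string, signals: Signal[]): number {
  const total = signals.reduce((sum, s) => sum + s.weight, 0);
  const fired = signals
    .filter((s) => s.test(prompt))
    .reduce((sum, s) => sum + s.weight, 0);
  return total === 0 ? 0 : fired / total;
}

// Example signals for 'vague', mirroring the boolean heuristic's conditions.
const vagueSignals: Signal[] = [
  { weight: 2, test: (p) => p.length < 50 },
  { weight: 1, test: (p) => p.trim().split(/\s+/).length <= 5 },
  { weight: 2, test: (p) => /\b(fix|update|change|improve)\b/i.test(p) },
  { weight: 1, test: (p) => !/\.\w{1,5}\b/.test(p) },
];
```

Ranking prompts by `matchScore` and selecting the top few per issue would replace the current all-or-nothing filter with graded matching.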

### 3. Individual Mode Already Works Well

The individual/batch-individual schema already extracts real examples from AI responses. The heuristics in `extractRealExamples()` are only a fallback for minimal mode.

## Test Coverage

- `aggregator.test.ts`: 50 tests, all passing
- Tests cover all issue types and edge cases
- Gold standard in `benchmark/gold-standard.ts`: 50 prompts across 4 tiers

## Conclusion

The current architecture is solid. The main improvement opportunity is unifying category mappings between individual mode and the taxonomy. The heuristics work correctly for their intended purpose as a fallback.