LLM stage crashes on OpenAI-compatible/local endpoints (Ollama): models return confidence 0-100 but output schema validates 0-1

## Summary

In LLM mode against an OpenAI-compatible endpoint (e.g. Ollama via `OPENAI_BASE_URL`), the semantic pass crashes with a pydantic `ValidationError`: local instruct models return `confidence` on a **0–100** scale, but the LLM-output schemas validate it as a **0.0–1.0** float (`Field(ge=0.0, le=1.0)`). Combined with the abort-on-first-error behavior (#10), one out-of-range value takes down the entire LLM stage; only `--no-llm` static analysis survives.

## Environment

- SkillSpector **2.2.3** (installed from `main`, `git+https://github.com/NVIDIA/skillspector.git`)
- Python 3.12 (isolated venv), macOS / Apple Silicon
- `SKILLSPECTOR_PROVIDER=openai`, `OPENAI_BASE_URL=http://localhost:11434/v1`, Ollama 0.30.8
- Reproduced with **two** different models: `qwen2.5:14b` and `gemma4:12b`

## What happens

With LLM mode on (no `--no-llm`):

```
pydantic_core.ValidationError: N validation errors for MetaAnalyzerResult
  findings.0.confidence  Input should be less than or equal to 1  [input=100]
  ...
```

Both models emit `confidence: 100`. The constraint isn't on a single schema: after relaxing the bound on `MetaAnalyzerFinding`, the **identical** crash reappeared on `LLMAnalysisResult` (the per-analyzer schema), so the problem is systemic across the `LLMAnalyzerBase` output models rather than a one-off. (Raw model speed is fine here; this is purely a value-range mismatch, unrelated to any timeout.)

## Root cause

The LLM-output models constrain confidence to 0–1:

- `src/skillspector/llm_analyzer_base.py:67` — `confidence: float = Field(ge=0.0, le=1.0, ...)`
- `src/skillspector/nodes/meta_analyzer.py:66` — `confidence: float = Field(ge=0.0, le=1.0, ...)`

Instruct models commonly express confidence as a percentage (0–100). Frontier models on strict function-calling providers tend to stay in range, but models served over Ollama's OpenAI-compatible endpoint don't honor the numeric bound (constrained decoding enforces type/structure, not magnitude), so the value comes back as `100` and **client-side** pydantic validation rejects it.

## How this differs from existing issues

- **#66 / #76 / #4** (Anthropic min/max): those are *server-side* — Anthropic's API 400s on the `minimum`/`maximum` JSON-Schema keywords, fixed by stripping them from the schema sent to the provider. This is the *client-side* counterpart: the schema is accepted, the model returns an out-of-range **value**, and SkillSpector's own pydantic validation crashes. Schema-keyword stripping does not address it.
- **#69 / #71** (fenced/prose JSON): there the output is unparseable. Here the JSON parses fine; only the numeric range is wrong. PR #71's `PydanticOutputParser` would still enforce `le=1` and crash.
- **#10** amplifies it: a single `ValidationError` aborts the whole semantic pass.

## Possible direction (untested)

Normalizing/clamping confidence **before** validation would resolve it; the existing `@field_validator("overall_assessment", mode="before")` in `meta_analyzer.py` is a natural precedent. It would need to cover every confidence-bearing LLM-output model, since relaxing the bound on one just surfaced the same crash on the next. Rough shape, but you'll know the right form: coerce with `float(v)`, divide by 100 if the result is greater than 1, then clamp to `[0, 1]`.
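To make the rough shape concrete, here is a minimal, untested-against-SkillSpector sketch of the normalization step as a plain function. The name `normalize_confidence` is hypothetical (it does not exist in the codebase), and the rule "anything above 1 is a percentage" is an assumption about how these models behave, not something the schema guarantees.

```python
import math
from typing import Any


def normalize_confidence(value: Any) -> float:
    """Coerce an LLM-reported confidence onto the 0.0-1.0 scale.

    Hypothetical helper, not part of SkillSpector.
    """
    # Accepts ints, floats, and numeric strings such as "90".
    v = float(value)
    if math.isnan(v):
        raise ValueError("confidence must be a number, got NaN")
    # Assumption: values above 1 were meant as percentages (0-100).
    if v > 1.0:
        v /= 100.0
    # Clamp whatever is left (negatives, values above 100) into [0, 1].
    return min(1.0, max(0.0, v))
```

With this, `100` becomes `1.0` and `85` becomes `0.85`, while in-range values such as `0.7` pass through unchanged. One unavoidable ambiguity: a bare `1` is read as full confidence, not 1%. In pydantic v2 this could presumably be wired in through a `@field_validator("confidence", mode="before")` classmethod on each confidence-bearing model (or a shared base/mixin), so the existing `Field(ge=0.0, le=1.0)` bound stays in place as a final check.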

## Repro

```sh
ollama pull qwen2.5:14b
export SKILLSPECTOR_PROVIDER=openai OPENAI_BASE_URL=http://localhost:11434/v1 \
       OPENAI_API_KEY=ollama SKILLSPECTOR_MODEL=qwen2.5:14b
skillspector scan ./tests/fixtures/malicious_skill     # no --no-llm
# -> ValidationError: findings.0.confidence Input should be less than or equal to 1 [input=100]
```

Happy to open a PR if that's useful.

