SFF-065: Context window-aware model policy

Reported by: repository user
Requested by: repository user
Priority: P1
Affected surfaces: model catalog, configuration, orchestration safeguards, portal guidance
Constraints: context-window data may vary by provider and model version; the policy must remain configurable rather than hard-coded

## Summary

SpecForge should discover or maintain the effective context window for each configured model, so that it can anticipate prompt-size risk before a request exceeds the model's limit and react according to an explicit policy.

## Problem / opportunity

Without reliable context-window knowledge, SpecForge discovers over-limit requests only after prompt assembly has already failed or degraded. If it knows the limit ahead of time, it can warn early, recommend a larger model, compact or summarize context, or deliberately stop before sending a bad request.

## Requested behavior

The system should determine the usable context window for the models it supports, expose that capability to the runtime, and apply a configurable policy whenever a request is predicted to exceed or dangerously approach the limit.
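
As a rough illustration of the pre-execution check described above, the sketch below classifies a request against a model's usable input budget. Every name here (`ModelLimits`, `classify_request`, `warn_ratio`, `reserved_output_tokens`) is hypothetical and not an existing SpecForge API, and the 0.9 warning threshold is an assumed default that would have to be configurable.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    OK = "ok"
    NEAR_LIMIT = "near_limit"
    OVER_LIMIT = "over_limit"


@dataclass(frozen=True)
class ModelLimits:
    # Hypothetical shape; real values would come from the model catalog.
    context_window: int          # total tokens the model accepts
    reserved_output_tokens: int  # tokens kept free for the response

    @property
    def usable_input_tokens(self) -> int:
        # Never report a negative budget if the reservation exceeds the window.
        return max(self.context_window - self.reserved_output_tokens, 0)


def classify_request(
    estimated_tokens: int,
    limits: ModelLimits,
    warn_ratio: float = 0.9,  # assumed default, meant to be operator-configurable
) -> Risk:
    """Classify a request before execution, based on its estimated size."""
    budget = limits.usable_input_tokens
    if estimated_tokens > budget:
        return Risk.OVER_LIMIT
    if estimated_tokens >= budget * warn_ratio:
        return Risk.NEAR_LIMIT
    return Risk.OK
```

The runtime could call `classify_request` after prompt assembly but before dispatch, and hand any result other than `Risk.OK` to the configured overflow policy.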

## Scope

- In scope: collecting or resolving model context-window limits, estimating request size before execution, warning when the current model is insufficient, and adding configuration for the overflow-handling policy.
- Out of scope: silently rewriting all prompts without operator visibility, or assuming a single strategy works for every provider and workflow.

## Acceptance criteria

1. SpecForge can resolve or store context-window limits for the models it uses in a way the runtime can query before execution.
2. When a request is predicted to exceed or dangerously approach the configured model limit, the product can either recommend a larger model or apply a configured mitigation strategy.
3. Configuration exposes an explicit policy choice for overflow handling, including alternatives such as compacting, summarizing, or stopping with a blocking error.
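
One possible shape for the explicit policy choice in criterion 3 is sketched below. The configuration key `context_overflow_policy`, the enum values, and the helper names are assumptions for illustration only; the actual configuration schema is still to be decided.

```python
from enum import Enum


class OverflowPolicy(Enum):
    # Hypothetical policy names mirroring the alternatives in the criteria.
    RECOMMEND_LARGER_MODEL = "recommend_larger_model"
    COMPACT = "compact"
    SUMMARIZE = "summarize"
    BLOCK = "block"


class ContextOverflowError(RuntimeError):
    """Raised when the configured policy is to stop with a blocking error."""


def load_overflow_policy(config: dict) -> OverflowPolicy:
    """Read the policy from configuration; there is deliberately no silent default."""
    raw = config.get("context_overflow_policy")  # assumed key name
    if raw is None:
        raise ValueError("context_overflow_policy must be set explicitly")
    try:
        return OverflowPolicy(raw)
    except ValueError:
        allowed = ", ".join(p.value for p in OverflowPolicy)
        raise ValueError(
            f"unknown context_overflow_policy {raw!r}; expected one of: {allowed}"
        ) from None


def handle_overflow(policy: OverflowPolicy, estimated_tokens: int, limit: int) -> str:
    """Return an operator-visible description of the action, or raise for BLOCK."""
    detail = f"estimated {estimated_tokens} tokens exceeds limit {limit}"
    if policy is OverflowPolicy.BLOCK:
        raise ContextOverflowError(detail)
    # The mitigation itself is out of scope here; only the decision is surfaced.
    return f"{policy.value}: {detail}"
```

Requiring the key to be present, rather than falling back to a default, keeps the choice visible to operators, which matches the out-of-scope note about not rewriting prompts silently.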

## Notes

- This likely needs both provider/model metadata support and runtime token-estimation support.
- The policy decision should be operator-configurable because some flows prefer fidelity, while others prefer continuity.
- Related blocking bug: model configuration validity should already prevent running with incomplete linked models.


SFF-065: Context window-aware model policy #77
