Reported by: repository user
Requested by: repository user
Priority: P1
Affected surfaces: model catalog, configuration, orchestration safeguards, portal guidance
Constraints: context-window data may vary by provider and model version; the policy must remain configurable rather than hard-coded
Summary
SpecForge should discover or maintain the effective context window for configured models so the product can anticipate prompt-size risk before requests exceed model limits and react according to an explicit policy.
Problem / opportunity
Without reliable context-window knowledge, the product discovers over-limit situations only after the fact, once prompt assembly has already failed or degraded. If SpecForge knows the limit ahead of time, it can warn early, recommend a larger model, compact or summarize context, or deliberately stop before producing a bad request.
Requested behavior
The system should determine the usable context window for the models it supports, expose that capability to the runtime, and apply a configurable policy whenever a request is predicted to exceed or dangerously approach the limit.
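As a rough illustration of the pre-flight check described above, the sketch below estimates request size and classifies it against a model limit. Everything named here is an assumption made for illustration: the `CONTEXT_WINDOWS` catalog, the model identifiers and their limits, the `estimate_tokens` and `check_request` helpers, the 4-characters-per-token heuristic, and the 0.9 warning ratio. None of these are existing SpecForge APIs.

```python
import math
from dataclasses import dataclass

# Hypothetical catalog: model id -> context window in tokens (illustrative values).
CONTEXT_WINDOWS = {"small-model": 8_000, "large-model": 128_000}


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough size estimate; a provider tokenizer would be more accurate."""
    return math.ceil(len(text) / chars_per_token)


@dataclass
class PreflightResult:
    status: str  # "ok", "near_limit", or "over_limit"
    estimated_tokens: int
    limit: int


def check_request(model: str, prompt: str, reserved_output: int = 0,
                  warn_ratio: float = 0.9) -> PreflightResult:
    """Classify a request before execution against the model's limit."""
    limit = CONTEXT_WINDOWS[model]
    # Reserve room for the model's response as well as the prompt.
    needed = estimate_tokens(prompt) + reserved_output
    if needed > limit:
        status = "over_limit"
    elif needed >= warn_ratio * limit:
        status = "near_limit"
    else:
        status = "ok"
    return PreflightResult(status, needed, limit)
```

The runtime could run a check like this once per request and hand any status other than "ok" to the configured overflow policy.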
Scope
- In scope: collecting or resolving model context-window limits, estimating request size before execution, warning when the current model is insufficient, and adding configuration for the overflow-handling policy.
- Out of scope: silently rewriting all prompts without operator visibility, or assuming a single strategy works for every provider and workflow.
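One way the "current model is insufficient" warning could be backed by a concrete suggestion is sketched below, assuming the runtime has a mapping of model names to context-window sizes. The `recommend_model` function, its parameters, and the choice of "smallest model that fits" as the selection rule are hypothetical.

```python
from typing import Optional


def recommend_model(needed_tokens: int, windows: dict[str, int],
                    current: str) -> Optional[str]:
    """Return the smallest model that fits, or None if none does.

    Returns `current` unchanged when it already has enough room.
    """
    if windows.get(current, 0) >= needed_tokens:
        return current
    # Keep only models large enough, then pick the one with the smallest window.
    candidates = [(limit, name) for name, limit in windows.items()
                  if limit >= needed_tokens]
    return min(candidates)[1] if candidates else None
```

Returning a recommendation rather than switching models automatically keeps the operator in the loop, in line with the out-of-scope item above.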
Acceptance criteria
- SpecForge can resolve or store context-window limits for the models it uses in a way the runtime can query before execution.
- When a request is predicted to exceed or dangerously approach the configured model limit, the product can either recommend a larger model or apply a configured mitigation strategy.
- Configuration exposes an explicit policy choice for overflow handling, including alternatives such as compacting, summarizing, or stopping with a blocking error.
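The explicit policy choice in the criteria above could be modeled as a closed set of values that is validated at load time. This is only a sketch: the `OverflowPolicy` enum, the `overflow_policy` configuration key, and the `load_overflow_policy` helper are assumed names, and the real configuration format may differ.

```python
from enum import Enum


class OverflowPolicy(Enum):
    RECOMMEND_LARGER_MODEL = "recommend_larger_model"
    COMPACT = "compact"
    SUMMARIZE = "summarize"
    BLOCK = "block"  # stop with a blocking error


def load_overflow_policy(config: dict) -> OverflowPolicy:
    """Read the policy from a config mapping; reject missing or unknown values."""
    raw = config.get("overflow_policy")
    if raw is None:
        # No silent default: the operator must choose a policy explicitly.
        raise ValueError("overflow_policy must be set explicitly")
    try:
        return OverflowPolicy(raw)
    except ValueError:
        allowed = ", ".join(p.value for p in OverflowPolicy)
        raise ValueError(
            f"unknown overflow_policy {raw!r}; expected one of: {allowed}"
        ) from None
```

Rejecting a missing value instead of falling back to a default is one possible design; it keeps the choice explicit, at the cost of requiring every deployment to set it.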
Notes
- This likely needs both provider/model metadata support and runtime token-estimation support.
- The policy decision should be operator-configurable because some flows prefer fidelity, while others prefer continuity.
- Related blocking bug: model configuration validity should already prevent running with incomplete linked models.
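Because limits may vary by provider and model version, the metadata lookup could be keyed on all three and allow operator overrides. The sketch below assumes a hypothetical `PROVIDER_METADATA` table with made-up providers, models, and limits, and a hypothetical `resolve_context_window` helper; the precedence order (override, then provider metadata, then unknown) is likewise an assumption.

```python
from typing import Optional

# Hypothetical metadata keyed by (provider, model, version); values are illustrative.
PROVIDER_METADATA = {
    ("example-provider", "model-x", "v1"): 16_000,
    ("example-provider", "model-x", "v2"): 64_000,
}


def resolve_context_window(provider: str, model: str, version: str,
                           overrides: Optional[dict] = None) -> Optional[int]:
    """Operator override first, then provider metadata, else None (unknown)."""
    key = (provider, model, version)
    if overrides and key in overrides:
        return overrides[key]
    return PROVIDER_METADATA.get(key)
```

Returning `None` for an unknown model leaves it to the caller to decide whether a missing limit should block execution or only produce a warning.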