
[codex] Preserve system prompts for Gemma templates#992

Open
mimeding wants to merge 5 commits into osaurus-ai:main from mimeding:codex/gemma-system-prompt-preservation

Conversation

@mimeding
Contributor

@mimeding mimeding commented May 1, 2026

Rebased onto current origin/main (fd2ffece, #1015). The rebase also aligns the Gemma prompt shim with the newer reasoning_content field introduced by the thinking-content provider contract.

Business rationale

Gemma-family local templates have historically handled system role messages unevenly, which can silently drop the user's configured instructions when they run local models. Preserving those instructions strengthens the prompt harness and keeps local model behavior predictable across template variants, especially for users who rely on system prompts for business context, safety boundaries, or coding conventions.

Coding rationale

The compatibility shim stays narrowly scoped to Gemma-family local models instead of changing the global chat-message path. System content is mirrored into the first user turn and standalone system messages are removed only for affected templates, preserving the normal OpenAI-style role layout everywhere else. The rebase preserves reasoning_content when rebuilding the first user message so the branch remains compatible with the current thinking-content contract. Touched-file style fixes stay inside ModelRuntime.swift so strict lint can run cleanly without broad service refactors.

What changed

  • Preserved Gemma system prompts by moving system instructions into the first user turn for local Gemma-family templates.
  • Added local template compatibility tests for Gemma and non-Gemma behavior.
  • Rebased the CI cache hardening comments onto current main's DerivedData strategy.
  • Aligned the Gemma prompt shim with the new reasoning_content field.
  • Cleaned touched-file Swift style in ModelRuntime.swift so strict formatter and SwiftLint gates pass.
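The system-prompt fold described above can be sketched in Swift. This is an illustrative sketch only, not the actual Osaurus implementation: `ChatMessage` and `foldSystemIntoFirstUserTurn` are simplified stand-ins for whatever `ModelRuntime.applyLocalTemplateCompatibility` actually uses, and `reasoningContent` merely mirrors the `reasoning_content` field the PR aligns with.

```swift
// Simplified stand-in for the runtime's chat message type.
struct ChatMessage {
    var role: String               // "system" | "user" | "assistant"
    var content: String
    var reasoningContent: String?  // mirrors the reasoning_content field
}

/// Folds system messages into the first user turn for templates
/// (e.g. Gemma-family) that have no native system role.
func foldSystemIntoFirstUserTurn(_ messages: [ChatMessage]) -> [ChatMessage] {
    let systemText = messages
        .filter { $0.role == "system" }
        .map(\.content)
        .joined(separator: "\n\n")
    guard !systemText.isEmpty else { return messages }

    var result = messages.filter { $0.role != "system" }
    if let i = result.firstIndex(where: { $0.role == "user" }) {
        // Prefix the instructions; leave reasoning_content untouched.
        result[i].content = systemText + "\n\n" + result[i].content
    } else {
        // No user turn yet: surface the instructions as a new user message.
        result.insert(
            ChatMessage(role: "user", content: systemText, reasoningContent: nil),
            at: 0)
    }
    return result
}
```

For non-Gemma templates this transform is simply never applied, which is why the normal OpenAI-style role layout is preserved everywhere else.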

Validation

  • git fetch origin && git rebase origin/main - passed after resolving the .github/workflows/ci.yml comment conflict against main's current DerivedData cache behavior.
  • swift build --package-path Packages/OsaurusCore - passed.
  • swift build --package-path Packages/OsaurusCore -c release - passed.
  • swift test --package-path Packages/OsaurusCore - passed: 1,439 tests in 193 suites, with sandbox integration tests skipped by their normal environment gate.
  • xcrun swift-format lint --strict on every touched Swift file - passed.
  • swiftlint lint --strict on every touched Swift file - passed file-by-file.
  • git diff --check origin/main...HEAD - passed.
  • CLI gate skipped because this slice does not touch Packages/OsaurusCLI.
  • Workspace/Xcode build skipped because this slice does not touch Xcode targets or project settings.

Non-scope

  • No change to remote-provider system prompt behavior.
  • No rewrite of the chat template registry or model-family detection beyond the targeted Gemma shim.
  • No CLI behavior changes.

Residual risks

The shim intentionally duplicates system instructions into user-visible prompt text for affected local templates. If future Gemma templates gain reliable native system role handling, this compatibility path should be revisited so the model does not receive redundant instruction framing.

@mimeding mimeding force-pushed the codex/gemma-system-prompt-preservation branch from 2d27be1 to ed5d6bc on May 1, 2026 03:57
@mimeding mimeding marked this pull request as ready for review May 1, 2026 04:20
@tpae
Contributor

tpae commented May 1, 2026

@mimeding this doesn't feel like a fix to me, as it introduces a one-off case tied to a specific model. The fix should belong in https://github.com/osaurus-ai/vmlx-swift-lm, currently maintained by @jjang-ai.

@mimeding
Contributor Author

mimeding commented May 1, 2026

You are right, thanks for calling this out. I checked the net diff here and this PR is adding an Osaurus-side Gemma-only compatibility shim in ModelRuntime.applyLocalTemplateCompatibility, which is the wrong layer for the actual fix.

I also checked osaurus-ai/vmlx-swift-lm: Gemma message generation / chat-template rendering is owned there (NoSystemMessageGenerator, Gemma model message generators, and the Gemma4 fallback templates/tests already cover system turns). I will treat this PR as a diagnostic/regression note rather than the merge target, move the implementation work to osaurus-ai/vmlx-swift-lm / coordinate with @jjang-ai, and then come back to Osaurus only for a dependency bump plus a small regression test if needed.

@mimeding
Contributor Author

mimeding commented May 1, 2026

Follow-up implemented in the dependency repo: osaurus-ai/vmlx-swift-lm#1

I moved the fix out of Osaurus and into vmlx-swift-lm, which is the right layer for this behavior. The upstream PR preserves system instructions in NoSystemMessageGenerator by folding them into the first user turn for templates that cannot accept a system role, and it keeps tool calls intact.

That PR also includes the CI/package fixes needed for clean validation in a fresh public checkout:

  • clean package resolution when the ignored local RunBench/ directory is absent
  • corrected macOS workflow scheme and xctest bundle path for vmlx-swift-lm
  • serialized the MLX sampling sections of EvalTests to avoid Metal command-encoder contention in the full Xcode test bundle
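The serialization in the last bullet can be approximated with a lock-based helper. This is a hedged sketch, not the actual EvalTests code; `withSerializedSampling` and `metalSamplingLock` are hypothetical names introduced here for illustration.

```swift
import Foundation

// Hypothetical helper: funnel GPU-heavy sampling sections through one lock
// so parallel test execution cannot interleave Metal command encoding.
let metalSamplingLock = NSLock()

func withSerializedSampling<T>(_ body: () throws -> T) rethrows -> T {
    metalSamplingLock.lock()
    defer { metalSamplingLock.unlock() }
    return try body()
}
```

Each MLX sampling test would then wrap only its sampling section in `withSerializedSampling { ... }`, leaving the rest of the bundle free to run in parallel.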

Local validation on the upstream branch is green via git diff --check, swift package describe --type json, focused chat tests, Xcode build-for-testing, and the full Xcode-built MLXLMTests.xctest bundle with 512 tests passing. GitHub has not attached checks to the new vmlx PR yet: the target repo has no prior workflow runs, and the base workflow does not appear to attach PR checks until the workflow correction lands.

I would treat this Osaurus PR as blocked/superseded by the upstream PR for now. Once the vmlx PR lands, the Osaurus-side follow-up should be just a dependency bump plus a small regression test if the integration surface still needs it.

Michael Meding and others added 5 commits May 3, 2026 22:02
Business rationale: Gemma-family local templates need system prompts preserved after the current thinking-content contract landed, otherwise the trust-building prompt harness regresses during rebase even when CI previously passed.

Coding rationale: Preserve the existing Gemma system-preamble shim, pass through the new reasoning_content field when rebuilding the first user message, and limit style fixes to ModelRuntime.swift so touched-file lint can run cleanly without broad refactors.

Co-authored-by: Codex <codex@openai.com>
@mimeding mimeding force-pushed the codex/gemma-system-prompt-preservation branch from 599e29e to 8c402ff on May 4, 2026 01:20
