Return generated paper-model responses instead of top-token listings #20

sharpninja merged 3 commits into main from
Conversation
Co-authored-by: sharpninja <16146732+sharpninja@users.noreply.github.com> Agent-Logs-Url: https://github.com/sharpninja/BitNet-b1.58-Sharp/sessions/bf5ee2e5-4b19-4106-8974-0149d7104efb
Pull request overview
Updates the paper-aligned BitNetPaperModel so its chat-facing output is natural generated text (using memorized exemplar responses when available) rather than a diagnostic-style “top token” listing, while preserving diagnostics and ensuring trained exemplar responses survive checkpoint save/load.
Changes:
- Switch `BitNetPaperModel.GenerateResponse(...)` to return detokenized generated text, preferring memorized exemplar responses and falling back to greedy next-token selection.
- Prime the default paper model with `BitNetTrainingCorpus.CreateDefaultExamples()` so canonical prompts produce stable, human-readable answers.
- Persist/restore memorized exemplar responses in `BitNetPaperCheckpoint` and update tests to assert on the new response behavior.
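The response path those bullets describe can be sketched end to end. This is a hedged simplification, not the repo's actual implementation: the normalization rule, the `predictNextTokenId`/`detokenize` callbacks, and all names here are stand-ins for the real model internals.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the new response flow: prefer a memorized exemplar
// for the normalized prompt, otherwise fall back to greedy next-token
// generation until <eos> or a length cap.
public static class ResponseSketch
{
    public static string GenerateResponse(
        string prompt,
        IReadOnlyDictionary<string, int[]> memorizedResponses,
        Func<int[], int> predictNextTokenId,   // stand-in for greedy argmax
        Func<int[], string> detokenize,
        int endTokenId,
        int maxTokens = 32)
    {
        var key = prompt.Trim().ToLowerInvariant();   // assumed normalization
        if (memorizedResponses.TryGetValue(key, out var exemplar))
        {
            return detokenize(exemplar);              // trained exemplar wins
        }

        var generated = new List<int>();
        while (generated.Count < maxTokens)
        {
            var next = predictNextTokenId(generated.ToArray());
            if (next == endTokenId) break;            // stop on <eos>
            generated.Add(next);
        }
        return detokenize(generated.ToArray());
    }
}
```

The key design point is that the memorized path and the greedy path share one detokenization step, so both return natural text rather than a diagnostics listing.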
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/BitNetSharp.Core/BitNetPaperModel.cs | Adds exemplar-memory response path + greedy generation fallback; primes defaults via training corpus exemplars. |
| src/BitNetSharp.Core/BitNetPaperCheckpoint.cs | Extends checkpoint schema to include memorized responses; loads them with backward-compatible null handling. |
| tests/BitNetSharp.Tests/BitNetModelTests.cs | Updates expectations to validate non-diagnostic response text and retained diagnostics/tokens. |
| tests/BitNetSharp.Tests/HostedAgentBenchmarksExecutionTests.cs | Aligns benchmark-path assertion with new natural response output. |
```csharp
internal IReadOnlyDictionary<string, int[]> ExportMemorizedResponses() =>
    _memorizedResponses.ToDictionary(
        static pair => pair.Key,
        static pair => pair.Value.ToArray(),
        StringComparer.Ordinal);

internal void ImportMemorizedResponses(IReadOnlyDictionary<string, int[]> memorizedResponses)
{
    ArgumentNullException.ThrowIfNull(memorizedResponses);

    foreach (var pair in memorizedResponses)
    {
        _memorizedResponses[pair.Key] = pair.Value.ToArray();
```
`ExportMemorizedResponses()` / `ImportMemorizedResponses()` access the mutable `_memorizedResponses` dictionary without taking the model's `_gate` lock. Because `Train()` mutates `_memorizedResponses` under the lock, callers like `BitNetPaperCheckpoint.Save()` can hit a concurrent-modification exception or capture a torn snapshot if `Save` is invoked while training (or if an import happens while generating). Consider taking `_gate` inside these methods (or providing a dedicated snapshot API that acquires the lock) so checkpoint save/load is thread-safe relative to training/generation.
Suggested change:

```diff
-internal IReadOnlyDictionary<string, int[]> ExportMemorizedResponses() =>
-    _memorizedResponses.ToDictionary(
-        static pair => pair.Key,
-        static pair => pair.Value.ToArray(),
-        StringComparer.Ordinal);
-internal void ImportMemorizedResponses(IReadOnlyDictionary<string, int[]> memorizedResponses)
-{
-    ArgumentNullException.ThrowIfNull(memorizedResponses);
-    foreach (var pair in memorizedResponses)
-    {
-        _memorizedResponses[pair.Key] = pair.Value.ToArray();
+internal IReadOnlyDictionary<string, int[]> ExportMemorizedResponses()
+{
+    lock (_gate)
+    {
+        return _memorizedResponses.ToDictionary(
+            static pair => pair.Key,
+            static pair => pair.Value.ToArray(),
+            StringComparer.Ordinal);
+    }
+}
+internal void ImportMemorizedResponses(IReadOnlyDictionary<string, int[]> memorizedResponses)
+{
+    ArgumentNullException.ThrowIfNull(memorizedResponses);
+    lock (_gate)
+    {
+        foreach (var pair in memorizedResponses)
+        {
+            _memorizedResponses[pair.Key] = pair.Value.ToArray();
+        }
```
```csharp
model.ExportMemorizedResponses().ToDictionary(
    static pair => pair.Key,
    static pair => pair.Value.ToArray(),
    StringComparer.Ordinal),
```
`Save()` re-clones the memorized response values even though `ExportMemorizedResponses()` already returns a dictionary with copied arrays. This adds extra allocations during checkpoint save; consider serializing the `ExportMemorizedResponses()` result directly (or adjusting `ExportMemorizedResponses` to return the serializable type you need) to avoid the redundant `ToDictionary()`/`ToArray()` pass.
Suggested change:

```diff
-model.ExportMemorizedResponses().ToDictionary(
-    static pair => pair.Key,
-    static pair => pair.Value.ToArray(),
-    StringComparer.Ordinal),
+model.ExportMemorizedResponses(),
```
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4fd99a21a1
```csharp
if (nextToken.TokenId is var tokenId && (tokenId == _endTokenId || tokenId == _tokenToId[BitNetTokenizer.UnknownToken]))
{
```
**Prevent generation from stopping on `<unk>` logits**

In the non-memorized path, generation aborts when the top token is `<unk>`, so prompts can return only the fallback `"BitNet paper model is ready."` even though normal tokens are available. This is a regression from the previous ranking behavior, which explicitly filtered special tokens. For unmemorized prompts, skip `<unk>` during selection (and only allow `<eos>` after at least one emitted token) so argmax over special tokens does not terminate output prematurely.
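As an illustration only, the selection rule Codex describes could look like the sketch below. `SelectNextToken` and its parameters are hypothetical helpers, not the repo's actual API; the point is the two skip conditions, not the exact shape.

```csharp
using System.Collections.Generic;

// Hypothetical sketch: argmax over logits that never picks <unk>, and only
// accepts <eos> once at least one ordinary token has been emitted, so the
// model cannot terminate with an empty response.
public static class TokenSelectionSketch
{
    public static int SelectNextToken(
        IReadOnlyList<double> logits, int unkId, int eosId, int emittedSoFar)
    {
        var best = -1;
        var bestScore = double.NegativeInfinity;
        for (var id = 0; id < logits.Count; id++)
        {
            if (id == unkId) continue;                       // never emit <unk>
            if (id == eosId && emittedSoFar == 0) continue;  // no empty output
            if (logits[id] > bestScore)
            {
                bestScore = logits[id];
                best = id;
            }
        }
        return best;
    }
}
```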
```csharp
internal IReadOnlyDictionary<string, int[]> ExportMemorizedResponses() =>
    _memorizedResponses.ToDictionary(
        static pair => pair.Key,
        static pair => pair.Value.ToArray(),
        StringComparer.Ordinal);
```
**Guard memorized-response export with `_gate`**

`ExportMemorizedResponses()` enumerates `_memorizedResponses` without locking, while `Train()` mutates that dictionary under `_gate`. If `BitNetPaperCheckpoint.Save()` runs concurrently with training, this can throw a collection-modified exception or write an inconsistent snapshot. Take the same lock when exporting/importing memorized responses to keep checkpoint operations thread-safe.
The paper-aligned BitNet path was surfacing ranked next-token predictions as response text, which made chat output look like internal diagnostics instead of a model answer. This change makes the paper model return generated text for canonical and trained prompts while preserving its existing diagnostic surface.
**Response generation**

- `BitNetPaperModel.GenerateResponse(...)` now returns natural response text instead of formatting the top logits into `"Top next-token predictions: ..."`

**Default prompt behavior**

- Prompts such as `"how are you hosted"` now respond like the traditional comparison model path instead of exposing ranked tokens

**Training + checkpoint parity**

**Targeted expectation updates**

- Tests no longer assert on the `"Top next-token predictions:"` string