macos: fall back to RTF/HTML when pasted plain text is encoding-lossy#45
macos: fall back to RTF/HTML when pasted plain text is encoding-lossy#45HelloBoLun wants to merge 1 commit into
Conversation
Some applications (certain IMs, Electron ports, legacy native apps) write a mangled plain-text pasteboard variant alongside a correctly-encoded RTF or HTML variant — non-ASCII characters arrive as U+FFFD or literal '?' runs because the app's plain-text writer fell back to an ASCII encoding. Pasting from such apps produced "?????" in the terminal. When getOpinionatedStringContents() detects an encoding-lossy plain-text variant (contains U+FFFD, or runs of 3+ '?'), recover the string from the rich-text variant provided it carries strictly more non-ASCII characters than plain text — this avoids regressing legitimate user text containing '?' runs. RTF parsing (local) is preferred; HTML is tried only when RTF is absent, minimizing WebKit involvement. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughThe Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile SummaryThis PR adds a best-effort recovery path in Confidence Score: 3/5Not safe to merge without addressing the U+FFFC / nonASCIIScalarCount issue, which can paste garbage characters in a broader set of cases than the bug being fixed. One P1 finding: embedded objects in RTF/HTML produce U+FFFC which is counted as non-ASCII, causing the swap heuristic to fire incorrectly and return garbage to the terminal. The two P2 findings (3-? threshold, main-thread assertion) are not blocking on their own. macos/Sources/Helpers/Extensions/NSPasteboard+Extension.swift — specifically the Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[getOpinionatedStringContents] --> B{File URL in pasteboard?}
B -- Yes --> C[Return escaped path]
B -- No --> D[Read plain text]
D --> E{plainText non-nil?}
E -- No --> J[richTextFallback]
E -- Yes --> F{looksEncodingLossy?}
F -- No --> G[Return plainText]
F -- Yes --> H[richTextFallback]
H --> I{recovered non-nil AND nonASCII recovered > nonASCII plain?}
I -- Yes --> K[Return recovered ⚠️ may contain U+FFFC]
I -- No --> G
J --> L{RTF data available?}
L -- Yes --> M[NSAttributedString RTF parse]
L -- No --> N{HTML data available?}
N -- Yes --> O[NSAttributedString HTML parse via WebKit]
N -- No --> P[Return nil]
M --> Q{string non-empty?}
Q -- Yes --> R[Return string]
Q -- No --> N
O --> S{string non-empty?}
S -- Yes --> T[Return string]
S -- No --> P
Reviews (1): Last reviewed commit: "macos: fall back to RTF/HTML when pasted..." | Re-trigger Greptile |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
macos/Sources/Helpers/Extensions/NSPasteboard+Extension.swift (2)
70-82: Optional: simplifylooksEncodingLossy.The manual run-length loop can be replaced with a single
containscheck, which is easier to read and equivalent for the "3+ consecutive?" rule.♻️ Proposed simplification
private static func looksEncodingLossy(_ s: String) -> Bool { - if s.contains("\u{FFFD}") { return true } - var run = 0 - for scalar in s.unicodeScalars { - if scalar == "?" { - run += 1 - if run >= 3 { return true } - } else { - run = 0 - } - } - return false + s.contains("\u{FFFD}") || s.contains("???") }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@macos/Sources/Helpers/Extensions/NSPasteboard`+Extension.swift around lines 70 - 82, The looksEncodingLossy function can be simplified by replacing the manual run-length loop with a single contains check for three consecutive question marks: change the body of private static func looksEncodingLossy(_ s: String) to return s.contains("\u{FFFD}") || s.contains("???"), keeping the same function name (looksEncodingLossy) so it still detects the replacement character or a run of 3+ '?' characters.
60-67: LGTM — fallback ordering and strict non-ASCII guard look correct.The "more non-ASCII scalars" gate effectively preserves legitimate
?runs while still recovering mangled CJK/emoji pastes, and theplainText == nilbranch falling through torichTextFallback()is a nice addition for HTML-only sources (Slack/Discord).One small consideration: the comparison uses strict
>. If a source app writes a plain variant where some non-ASCII scalars survived (say 2) but the RTF variant has the full 5, the replacement still happens correctly. But if both counts happen to be equal while the plain one is still partially mangled (e.g., mixed U+FFFD + recovered scalars), the recovered version will be dropped. GivenU+FFFDalone already flags lossy, you may want to tighten: whenplainTextcontainsU+FFFD, preferrecoveredwhenever it's non-empty (since U+FFFD is unambiguous evidence of loss), reserving the strict count comparison for the???heuristic.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@macos/Sources/Helpers/Extensions/NSPasteboard`+Extension.swift around lines 60 - 67, The logic should prefer the richTextFallback when the plainText contains the replacement character U+FFFD (an unambiguous sign of lossy encoding) even if nonASCIIScalarCount(recovered) is not strictly greater; update the branch that currently checks Self.looksEncodingLossy(s) and Self.nonASCIIScalarCount(recovered) > Self.nonASCIIScalarCount(s) to first test if s contains U+FFFD and if recovered is non-empty then return recovered, otherwise keep the existing nonASCIIScalarCount comparison (or change to >= if you prefer) so that recovered is chosen on clear lossy evidence while preserving the current heuristic for ambiguous cases; refer to plainText (s), Self.looksEncodingLossy, self.richTextFallback(), and Self.nonASCIIScalarCount when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@macos/Sources/Helpers/Extensions/NSPasteboard`+Extension.swift:
- Around line 104-115: The HTML-to-NSAttributedString conversion inside
getOpinionatedStringContents() must only run on the main thread; add a
main-thread guard at the start of that function (e.g.,
dispatchPrecondition(condition: .onQueue(.main))) to assert caller correctness
and prevent background-thread invocation, and also update the function's doc
comment to state the main-thread requirement so callers know the contract;
ensure the guard is added before the call to
NSAttributedString(data:options:documentAttributes:) and the doc mentions
getOpinionatedStringContents() explicitly.
---
Nitpick comments:
In `@macos/Sources/Helpers/Extensions/NSPasteboard`+Extension.swift:
- Around line 70-82: The looksEncodingLossy function can be simplified by
replacing the manual run-length loop with a single contains check for three
consecutive question marks: change the body of private static func
looksEncodingLossy(_ s: String) to return s.contains("\u{FFFD}") ||
s.contains("???"), keeping the same function name (looksEncodingLossy) so it
still detects the replacement character or a run of 3+ '?' characters.
- Around line 60-67: The logic should prefer the richTextFallback when the
plainText contains the replacement character U+FFFD (an unambiguous sign of
lossy encoding) even if nonASCIIScalarCount(recovered) is not strictly greater;
update the branch that currently checks Self.looksEncodingLossy(s) and
Self.nonASCIIScalarCount(recovered) > Self.nonASCIIScalarCount(s) to first test
if s contains U+FFFD and if recovered is non-empty then return recovered,
otherwise keep the existing nonASCIIScalarCount comparison (or change to >= if
you prefer) so that recovered is chosen on clear lossy evidence while preserving
the current heuristic for ambiguous cases; refer to plainText (s),
Self.looksEncodingLossy, self.richTextFallback(), and Self.nonASCIIScalarCount
when making the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fd3c093e-e646-4cd9-a842-e9489894c217
📒 Files selected for processing (1)
macos/Sources/Helpers/Extensions/NSPasteboard+Extension.swift
| if let data = self.data(forType: .html), | ||
| let attr = try? NSAttributedString( | ||
| data: data, | ||
| options: [ | ||
| .documentType: NSAttributedString.DocumentType.html, | ||
| .characterEncoding: String.Encoding.utf8.rawValue, | ||
| ], | ||
| documentAttributes: nil | ||
| ), | ||
| !attr.string.isEmpty { | ||
| return attr.string | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find all call sites of getOpinionatedStringContents to assess threading context.
rg -nP --type=swift -C3 '\bgetOpinionatedStringContents\s*\('Repository: manaflow-ai/ghostty
Length of output: 2940
🌐 Web query:
Does NSAttributedString init with NSAttributedString.DocumentType.html require the main thread on macOS?
💡 Result:
Yes, initializing NSAttributedString with NSAttributedString.DocumentType.html requires the main thread on macOS. Apple's official documentation explicitly states: "Don’t call this method from a background thread if the options dictionary includes the documentType attribute with a value of html. If you do, the method tries to synchronize with the main thread, fails, and times out." This applies to the synchronous initializers like init(data:options:documentAttributes:). An older Apple document confirms that since OS X v10.4, NSAttributedString uses WebKit for HTML import, which is not thread-safe on background threads; on macOS (OS X v10.5+), it transfers to the main thread via performSelectorOnMainThread:withObject:waitUntilDone:, requiring the main thread's run loop to be running. Recent forum discussions (up to 2025) affirm this behavior persists on macOS. New asynchronous APIs like loadFromHTML(data:options:completionHandler:) exist but are separate from the synchronous init methods.
Citations:
- 1: https://developer.apple.com/documentation/foundation/nsattributedstring/init(data:options:documentattributes:)
- 2: https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/AttributedStrings/Tasks/CreatingAttributedStrings.html
- 3: https://developer.apple.com/documentation/foundation/nsattributedstring/init(data:options:documentattributes:)?language=objc
- 4: https://developer.apple.com/forums/thread/115405
🏁 Script executed:
# Examine the call site contexts to understand threading guarantees
sed -n '2100,2130p' macos/Sources/Ghostty/Surface\ View/SurfaceView_AppKit.swiftRepository: manaflow-ai/ghostty
Length of output: 1001
🏁 Script executed:
# Check the clipboard request handler context
sed -n '330,345p' macos/Sources/Ghostty/Ghostty.App.swiftRepository: manaflow-ai/ghostty
Length of output: 738
🏁 Script executed:
# Check the C interop callback pattern to understand threading context
rg -nB5 "completeClipboardRequest" macos/Sources/Ghostty/Ghostty.App.swift | head -30Repository: manaflow-ai/ghostty
Length of output: 587
🏁 Script executed:
# Look for the definition of the clipboard callback to understand its threading context
rg -nB10 "performable paste bindings" macos/Sources/Ghostty/Ghostty.App.swiftRepository: manaflow-ai/ghostty
Length of output: 628
🏁 Script executed:
# Search for where this callback function is registered/called to understand threading
rg -n "readClipboard\|clipboard.*callback" macos/Sources/Ghostty/Ghostty.App.swift | head -20Repository: manaflow-ai/ghostty
Length of output: 45
🏁 Script executed:
# Check if there's a dispatch or threading guard anywhere in the clipboard handling chain
rg -nB2 "readClipboard" macos/Sources/Ghostty/Ghostty.App.swift | head -40Repository: manaflow-ai/ghostty
Length of output: 666
🏁 Script executed:
# Check the C library interface to understand threading guarantees for clipboard callbacks
rg -n "read_clipboard_cb|ghostty_app_callbacks_s" macos/Sources/ -A3 | head -40Repository: manaflow-ai/ghostty
Length of output: 870
🏁 Script executed:
# Look for documentation or comments about the C callback threading model
rg -n "ghostty_clipboard_e|clipboard.*thread|read.*clipboard" macos/Sources/Ghostty/ -B2 -A2 | head -50Repository: manaflow-ai/ghostty
Length of output: 3491
🏁 Script executed:
# Search for any comments or documentation about threading in the C library headers
find . -name "*.h" -type f | head -20Repository: manaflow-ai/ghostty
Length of output: 673
🏁 Script executed:
# Check if there are any assertions or main-thread checks elsewhere in the codebase for clipboard handling
rg -n "dispatchPrecondition\|MainActor\|Thread\|main thread" macos/Sources/Ghostty/Ghostty.App.swift | head -20Repository: manaflow-ai/ghostty
Length of output: 45
🏁 Script executed:
# Check the main app event loop and threading model to understand if clipboard callbacks are main-thread guaranteed
head -100 macos/Sources/Ghostty/Ghostty.App.swift | grep -A20 "class App\|init\|runModal"Repository: manaflow-ai/ghostty
Length of output: 3304
🏁 Script executed:
# Look for documentation or comments about the C library's threading model in the header
grep -n "read_clipboard_cb\|thread\|main" ./include/ghostty.h | head -30Repository: manaflow-ai/ghostty
Length of output: 398
Add main-thread guarantee to getOpinionatedStringContents() for HTML parsing.
Apple's official documentation confirms that NSAttributedString(data:options:documentAttributes:) with DocumentType.html requires the main thread (it uses WebKit internally, which is not thread-safe). Calling this on a background thread will cause the method to attempt synchronization with the main thread, which fails and times out.
While current call sites (readSelection, pasteSelection, clipboard request handlers) appear to execute on the main thread, the method lacks defensive guards, making it fragile against future misuse.
Add either:
- A main-thread assertion (
dispatchPrecondition(condition: .onQueue(.main))) at method entry, or - A documented caller contract with explicit main-thread requirement in the method documentation
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@macos/Sources/Helpers/Extensions/NSPasteboard`+Extension.swift around lines
104 - 115, The HTML-to-NSAttributedString conversion inside
getOpinionatedStringContents() must only run on the main thread; add a
main-thread guard at the start of that function (e.g.,
dispatchPrecondition(condition: .onQueue(.main))) to assert caller correctness
and prevent background-thread invocation, and also update the function's doc
comment to state the main-thread requirement so callers know the contract;
ensure the guard is added before the call to
NSAttributedString(data:options:documentAttributes:) and the doc mentions
getOpinionatedStringContents() explicitly.
| private static func nonASCIIScalarCount(_ s: String) -> Int { | ||
| s.unicodeScalars.reduce(0) { $1.value > 127 ? $0 + 1 : $0 } | ||
| } |
There was a problem hiding this comment.
\u{FFFC} skews non-ASCII count and leaks into terminal output
NSAttributedString.string replaces every embedded image/attachment in RTF or HTML with U+FFFC (OBJECT REPLACEMENT CHARACTER, value 0xFFFC). Because 0xFFFC > 127, nonASCIIScalarCount counts each \u{FFFC} as a non-ASCII scalar. This has two effects:
- A clipboard with mangled plain text (
"?????", 0 non-ASCII) and RTF containing only a logo image (\u{FFFC}, 1 non-ASCII) satisfies1 > 0, sorichTextFallback()is returned — a single garbage character pasted instead of the original?-run. - Even in the happy path (Chinese text recovered correctly), any embedded image in the RTF/HTML causes
\u{FFFC}to appear in the pasted string, which renders as visible garbage in the terminal.
Consider filtering replacement characters out of the richTextFallback result:
private func richTextFallback() -> String? {
// ... existing extraction ...
let cleaned = result.unicodeScalars
.filter { $0.value != 0xFFFC && $0.value != 0xFFFD }
.reduce(into: "") { $0.append(Character($1)) }
return cleaned.isEmpty ? nil : cleaned
}| private static func looksEncodingLossy(_ s: String) -> Bool { | ||
| if s.contains("\u{FFFD}") { return true } | ||
| var run = 0 | ||
| for scalar in s.unicodeScalars { | ||
| if scalar == "?" { | ||
| run += 1 | ||
| if run >= 3 { return true } | ||
| } else { | ||
| run = 0 | ||
| } | ||
| } | ||
| return false | ||
| } |
There was a problem hiding this comment.
3-
? threshold triggers on common English punctuation
With run >= 3, text like "What???" or "Seriously???" causes looksEncodingLossy to return true, unconditionally invoking the RTF/HTML parser (which may load WebKit for HTML). The non-ASCII guard prevents a wrong swap, but the parsing overhead is paid every time a user pastes text ending in three question marks.
U+FFFD is an unambiguous encoding failure signal and never appears in correctly-produced text. Consider requiring U+FFFD as the primary trigger and raising the ?-run threshold (e.g. 5 or more) to reduce false-positive parse invocations:
private static func looksEncodingLossy(_ s: String) -> Bool {
if s.contains("\u{FFFD}") { return true }
var run = 0
for scalar in s.unicodeScalars {
if scalar == "?" {
run += 1
if run >= 5 { return true }
} else {
run = 0
}
}
return false
}| if let data = self.data(forType: .html), | ||
| let attr = try? NSAttributedString( | ||
| data: data, | ||
| options: [ | ||
| .documentType: NSAttributedString.DocumentType.html, | ||
| .characterEncoding: String.Encoding.utf8.rawValue, | ||
| ], | ||
| documentAttributes: nil | ||
| ), | ||
| !attr.string.isEmpty { | ||
| return attr.string | ||
| } |
There was a problem hiding this comment.
HTML
NSAttributedString init requires the main thread
NSAttributedString(data:options:documentAttributes:) with .html internally drives a WebKit document parser, which must run on the main thread. The comment in the PR description acknowledges the WebKit cost, but the code doesn't assert the calling thread. The readClipboard C callback (registered as read_clipboard_cb in Ghostty.App.swift) is invoked from the libghostty core and its threading guarantee isn't obvious from this file.
Consider adding a runtime assertion to make the requirement explicit:
if let data = self.data(forType: .html) {
assert(Thread.isMainThread, "NSAttributedString HTML init must be called on the main thread")
// ...
}There was a problem hiding this comment.
3 issues found across 1 file
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="macos/Sources/Helpers/Extensions/NSPasteboard+Extension.swift">
<violation number="1" location="macos/Sources/Helpers/Extensions/NSPasteboard+Extension.swift:62">
P2: Lossy plain text containing U+FFFD can bypass rich-text recovery because U+FFFD is counted as non-ASCII, making the strict `>` gate fail.</violation>
<violation number="2" location="macos/Sources/Helpers/Extensions/NSPasteboard+Extension.swift:101">
P1: `NSAttributedString.string` emits `U+FFFC` (Object Replacement Character) for every embedded image/attachment in RTF or HTML. Since `0xFFFC > 127`, each one inflates `nonASCIIScalarCount`, which can cause an incorrect swap (e.g., mangled plain text with 0 non-ASCII vs. RTF with only an image → `1 > 0` triggers fallback). Worse, any `U+FFFC` that survives into the return value renders as garbage in the terminal.
Filter `\u{FFFC}` (and `\u{FFFD}`) out of the string returned by `richTextFallback()` before using it.</violation>
<violation number="3" location="macos/Sources/Helpers/Extensions/NSPasteboard+Extension.swift:104">
P2: `NSAttributedString(data:options:documentAttributes:)` with `.html` drives WebKit internally and **must** run on the main thread — calling it from a background thread causes a synchronization attempt that silently times out (Apple docs: "the HTML importer should not be called from a background thread"). Add a `dispatchPrecondition(condition: .onQueue(.main))` before the HTML branch to make this requirement explicit and catch future misuse.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| @@ -36,6 +36,8 @@ extension NSPasteboard { | |||
| /// Does these things in order: | |||
There was a problem hiding this comment.
P1: NSAttributedString.string emits U+FFFC (Object Replacement Character) for every embedded image/attachment in RTF or HTML. Since 0xFFFC > 127, each one inflates nonASCIIScalarCount, which can cause an incorrect swap (e.g., mangled plain text with 0 non-ASCII vs. RTF with only an image → 1 > 0 triggers fallback). Worse, any U+FFFC that survives into the return value renders as garbage in the terminal.
Filter \u{FFFC} (and \u{FFFD}) out of the string returned by richTextFallback() before using it.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At macos/Sources/Helpers/Extensions/NSPasteboard+Extension.swift, line 101:
<comment>`NSAttributedString.string` emits `U+FFFC` (Object Replacement Character) for every embedded image/attachment in RTF or HTML. Since `0xFFFC > 127`, each one inflates `nonASCIIScalarCount`, which can cause an incorrect swap (e.g., mangled plain text with 0 non-ASCII vs. RTF with only an image → `1 > 0` triggers fallback). Worse, any `U+FFFC` that survives into the return value renders as garbage in the terminal.
Filter `\u{FFFC}` (and `\u{FFFD}`) out of the string returned by `richTextFallback()` before using it.</comment>
<file context>
@@ -45,7 +47,74 @@ extension NSPasteboard {
+ documentAttributes: nil
+ ),
+ !attr.string.isEmpty {
+ return attr.string
+ }
+
</file context>
| // rendering. | ||
| if let s = plainText, Self.looksEncodingLossy(s), | ||
| let recovered = self.richTextFallback(), | ||
| Self.nonASCIIScalarCount(recovered) > Self.nonASCIIScalarCount(s) { |
There was a problem hiding this comment.
P2: Lossy plain text containing U+FFFD can bypass rich-text recovery because U+FFFD is counted as non-ASCII, making the strict > gate fail.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At macos/Sources/Helpers/Extensions/NSPasteboard+Extension.swift, line 62:
<comment>Lossy plain text containing U+FFFD can bypass rich-text recovery because U+FFFD is counted as non-ASCII, making the strict `>` gate fail.</comment>
<file context>
@@ -45,7 +47,74 @@ extension NSPasteboard {
+ // rendering.
+ if let s = plainText, Self.looksEncodingLossy(s),
+ let recovered = self.richTextFallback(),
+ Self.nonASCIIScalarCount(recovered) > Self.nonASCIIScalarCount(s) {
+ return recovered
+ }
</file context>
| Self.nonASCIIScalarCount(recovered) > Self.nonASCIIScalarCount(s) { | |
| (Self.nonASCIIScalarCount(recovered) > Self.nonASCIIScalarCount(s) || s.contains("\u{FFFD}")) { |
| return attr.string | ||
| } | ||
|
|
||
| if let data = self.data(forType: .html), |
There was a problem hiding this comment.
P2: NSAttributedString(data:options:documentAttributes:) with .html drives WebKit internally and must run on the main thread — calling it from a background thread causes a synchronization attempt that silently times out (Apple docs: "the HTML importer should not be called from a background thread"). Add a dispatchPrecondition(condition: .onQueue(.main)) before the HTML branch to make this requirement explicit and catch future misuse.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At macos/Sources/Helpers/Extensions/NSPasteboard+Extension.swift, line 104:
<comment>`NSAttributedString(data:options:documentAttributes:)` with `.html` drives WebKit internally and **must** run on the main thread — calling it from a background thread causes a synchronization attempt that silently times out (Apple docs: "the HTML importer should not be called from a background thread"). Add a `dispatchPrecondition(condition: .onQueue(.main))` before the HTML branch to make this requirement explicit and catch future misuse.</comment>
<file context>
@@ -45,7 +47,74 @@ extension NSPasteboard {
+ return attr.string
+ }
+
+ if let data = self.data(forType: .html),
+ let attr = try? NSAttributedString(
+ data: data,
</file context>
|
Hi maintainers — the All AI reviewers and security scans have passed. Thanks! |
Summary
U+FFFDor literal?runs because the app's plain-text writer fell back to an ASCII encoding. Pasting into a terminal surface produced?????.getOpinionatedStringContents()now recovers from the rich-text variant when the plain-text variant is encoding-lossy, preserving existing behavior for clean pastes.How
U+FFFDor a run of 3+?) and the rich-text variant carries strictly more non-ASCII scalars, returns the rich-text-derived plain string.nil).The "more non-ASCII scalars" swap condition prevents regressing legitimate user text like
"really???"— both variants have the same non-ASCII count, so the rich-text version is not preferred.Test plan
?????→ now shows the correct characters."really???"style runs is preserved untouched.No new unit tests — this path depends on live
NSPasteboard+NSAttributedStringruntime behavior.🤖 Generated with Claude Code
Summary by cubic
Fixes garbled pastes like “?????” from some macOS apps by falling back to RTF/HTML when the plain-text pasteboard data is encoding-lossy. Keeps the fast plain-text path for clean pastes and prefers RTF to avoid WebKit.
getOpinionatedStringContents(), detect lossy plain text (contains U+FFFD or runs of 3+ '?').Written for commit 50904e2. Summary will update on new commits.
Summary by CodeRabbit