Skip to content

Conversation

@GreyElaina
Copy link

@GreyElaina GreyElaina commented Jan 7, 2026

Summary

Add proper grapheme segmentation and Kitty emoji reassembly for correct handling of emoji, CJK characters, and other multi-codepoint sequences.

Changes

StdinBuffer (grapheme segmentation)

  • Add grapheme-segmenter.ts using Intl.Segmenter for Unicode-correct text segmentation
  • Update StdinBuffer to emit complete grapheme clusters for non-escape input
  • Preserve Kitty keyboard protocol sequences unchanged for downstream parsing

KeyHandler (Kitty emoji reassembly)

  • Add emoji codepoint detection helpers (isGraphemeExtender, canStartGraphemeCluster)
  • Buffer Kitty sequences that form multi-codepoint emoji (ZWJ families, flags, skin tones, keycaps)
  • Flush buffer on timeout or when non-emoji input arrives
  • Preserve all raw sequences in KeyEvent.raw field

Architecture

stdin → StdinBuffer → KeyHandler.processInput() → parseKeypress() → emit
              ↓                    ↓
     grapheme clusters      emoji reassembly
     (for raw input)        (for Kitty protocol)

Key insight: StdinBuffer handles grapheme segmentation for raw UTF-8 input, while KeyHandler handles emoji reassembly for Kitty protocol sequences. This separation preserves the raw field correctly.

Supported Emoji Types

  • Basic emoji: 😀
  • ZWJ sequences: 👨‍👩‍👧 (family)
  • Flag emoji: 🇺🇸 🇯🇵
  • Skin tone modifiers: 👋🏻
  • Keycap sequences: #️⃣
  • Subdivision flags: 🏴󠁧󠁢󠁥󠁮󠁧󠁿

Testing

  • All 113 stdin-buffer tests pass
  • All 60 renderer.input tests pass
  • All 40 KeyHandler tests pass (10 new emoji tests)

Copilot AI review requested due to automatic review settings January 7, 2026 02:40
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR enhances emoji and IME input handling in the stdin-buffer module, particularly for terminals using the Kitty keyboard protocol. The implementation adds sophisticated grapheme cluster handling to properly process complex emoji sequences like ZWJ families, flags, and skin-tone variants.

Key Changes:

  • Introduces grapheme-segmenter module with Intl.Segmenter polyfill support for proper Unicode grapheme cluster handling
  • Adds Kitty keyboard protocol support that buffers and reassembles emoji codepoints sent as separate escape sequences
  • Refactors sequence completion helpers for better code reuse

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
packages/core/src/lib/stdin-buffer.ts Core changes: adds Kitty protocol emoji buffering, grapheme cluster extraction, and helper functions for Unicode codepoint classification
packages/core/src/lib/grapheme-segmenter.ts New module providing grapheme segmentation utilities with Intl.Segmenter support and performance optimizations
packages/core/src/lib/parse.keypress.ts Updates keypress parsing to use grapheme-aware checks instead of simple length checks
packages/core/src/lib/stdin-buffer.test.ts Comprehensive test coverage for grapheme handling and Kitty protocol emoji reassembly
packages/core/package.json Adds @formatjs/intl-segmenter dependency for polyfill support
bun.lock Lock file updates for new dependencies

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

(codepoint >= 0x1f1e6 && codepoint <= 0x1f1ff) || // Regional Indicators
(codepoint >= 0x1f300 && codepoint <= 0x1faff) || // Emoji ranges (simplified)
codepoint === 0x1f3f4 || // Black Flag
(codepoint >= 0x23 && codepoint <= 0x39) || // #, *, 0-9 for keycaps
Copy link

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The range check for keycap characters (0x23 to 0x39) includes some ASCII characters that shouldn't start grapheme clusters. The range 0x23-0x39 includes:

  • (0x23) ✓ valid for keycaps

  • $ (0x24) ✗ not used for keycaps
  • % through ) (0x25-0x29) ✗ not used for keycaps
    • (0x2a) ✓ valid for keycaps
    • , - . / (0x2b-0x2f) ✗ not used for keycaps
  • 0-9 (0x30-0x39) ✓ valid for keycaps

This overly broad range could cause unintended buffering of characters like $, %, &, etc. Consider using a more precise check: (codepoint >= 0x30 && codepoint <= 0x39) || codepoint === 0x23 || codepoint === 0x2a

Suggested change
(codepoint >= 0x23 && codepoint <= 0x39) || // #, *, 0-9 for keycaps
((codepoint >= 0x30 && codepoint <= 0x39) || codepoint === 0x23 || codepoint === 0x2a) || // #, *, 0-9 for keycaps

Copilot uses AI. Check for mistakes.
let segmenter: Intl.Segmenter | null = null

if (typeof Intl === "undefined" || typeof (Intl as any).Segmenter !== "function") {
await import("@formatjs/intl-segmenter/polyfill-force.js").catch(() => {})
Copy link

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Top-level await is used here which can cause issues in certain module systems and bundlers. This is executed at module load time, which could:

  1. Block the entire module graph from loading if the polyfill import is slow
  2. Cause problems in environments that don't support top-level await (though increasingly rare)
  3. The error is silently caught, which means there's no feedback if the polyfill fails to load

Consider moving this to a lazy initialization pattern inside getGraphemeSegmenter() or make the polyfill loading explicit and required upfront during application initialization.

Suggested change
await import("@formatjs/intl-segmenter/polyfill-force.js").catch(() => {})
import("@formatjs/intl-segmenter/polyfill-force.js").catch(() => {})

Copilot uses AI. Check for mistakes.
codepoint === 0x200d || // ZWJ
(codepoint >= 0xfe00 && codepoint <= 0xfe0f) || // Variation Selectors
(codepoint >= 0x1f3fb && codepoint <= 0x1f3ff) || // Skin Tone Modifiers
(codepoint >= 0x1f1e6 && codepoint <= 0x1f1ff) || // Regional Indicators
Copy link

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regional Indicator codepoints (0x1f1e6-0x1f1ff) are listed in both isGraphemeExtender and canStartGraphemeCluster. While this works because the grapheme segmenter ultimately handles the splitting correctly, the classification is semantically confusing. Regional Indicators don't "extend" a grapheme - they combine in pairs to form flag emoji. Consider clarifying the logic or renaming the functions to better reflect their purpose in the Kitty buffering strategy (e.g., shouldBufferForCombining instead of isGraphemeExtender).

Suggested change
(codepoint >= 0x1f1e6 && codepoint <= 0x1f1ff) || // Regional Indicators

Copilot uses AI. Check for mistakes.
Comment on lines 35 to 84
export function firstGrapheme(str: string): string {
if (str.length === 0) return ""
const firstCode = str.charCodeAt(0)
if (firstCode < 128) {
if (str.length === 1 || str.charCodeAt(1) < 128) return str[0]!
}

const segments = getGraphemeSegmenter().segment(str)
const first = segments[Symbol.iterator]().next()
return first.done ? "" : first.value.segment
Copy link

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential issue with surrogate pairs handling. If firstCode < 128 is false but str.length === 1, the code will try to get the first grapheme from the segmenter. However, if the string contains a single unpaired surrogate (which is technically invalid UTF-16), this could cause issues. While unpaired surrogates are rare in practice, consider adding a check or documenting that the input is expected to be valid UTF-16.

Copilot uses AI. Check for mistakes.
Comment on lines 312 to 313
if (remaining.length > 0) this.process(remaining)
return true
Copy link

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tryCompletePaste() method can trigger recursive calls through this.process(remaining) at line 312. If the pasted content contains multiple nested bracketed paste sequences or a large amount of remaining data, this could potentially cause a deep call stack. While unlikely in practice, consider whether this recursion is bounded or if an iterative approach would be safer for handling remaining content after paste completion.

Copilot uses AI. Check for mistakes.
}
if (this.kittyBuffer.length === 0) return []

const text = String.fromCodePoint(...this.kittyBuffer)
Copy link

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the spread operator with String.fromCodePoint(...this.kittyBuffer) could cause issues if the kittyBuffer becomes very large. JavaScript has a maximum call stack size, and spreading a large array as function arguments can exceed this limit (typically around 65,536 arguments, but varies by engine).

While emoji sequences are typically short (< 20 codepoints), consider adding a safeguard or chunking mechanism if there's any possibility of the buffer growing unexpectedly large. Alternatively, document the expected maximum size of the kittyBuffer.

Suggested change
const text = String.fromCodePoint(...this.kittyBuffer)
let text = ""
for (let i = 0; i < this.kittyBuffer.length; i++) {
text += String.fromCodePoint(this.kittyBuffer[i]!)
}

Copilot uses AI. Check for mistakes.
Comment on lines 403 to 430
if (this.kittyTimeout) {
clearTimeout(this.kittyTimeout)
this.kittyTimeout = null
}
Copy link

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The clear() method now clears kittyTimeout in addition to the regular timeout, but there's no test coverage for this scenario. Consider adding a test that:

  1. Processes a Kitty sequence that would trigger buffering (e.g., a Regional Indicator)
  2. Calls clear() before the kittyTimeout fires
  3. Waits beyond the timeout duration
  4. Verifies that nothing is emitted

This would ensure that the kittyTimeout is properly cleaned up and doesn't cause spurious emissions after clear() is called.

Copilot uses AI. Check for mistakes.
}
this.emitSequences(extractCompleteSequences(this.buffer.slice(0, startIndex)).sequences)
}
this.emitKittyBuffer()
Copy link

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When entering bracketed paste mode, emitKittyBuffer() is called to flush any pending Kitty codepoints. However, there's no test coverage for the scenario where:

  1. A Kitty emoji sequence is partially buffered (e.g., first part of a skin-tone emoji)
  2. Bracketed paste mode is entered
  3. The buffered Kitty sequence should be emitted before processing the pasted content

Consider adding a test to verify this edge case is handled correctly.

Copilot uses AI. Check for mistakes.
function extractKittyCodepoint(sequence: string): number | null {
const match = KITTY_UNICODE_RE.exec(sequence)
if (!match) return null
const cp = parseInt(match[1]!, 10)
Copy link

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The non-null assertion match[1]! is safe because the regex requires at least one \d+ group, so match[1] will always be defined when the regex matches. However, for code clarity and to avoid potential confusion, consider documenting this assumption or restructuring the code to make it more explicit.

Suggested change
const cp = parseInt(match[1]!, 10)
const [, group] = match
if (group === undefined) return null
const cp = parseInt(group, 10)

Copilot uses AI. Check for mistakes.
Comment on lines 26 to 54
function isGraphemeExtender(codepoint: number): boolean {
return (
codepoint === 0x200d || // ZWJ
(codepoint >= 0xfe00 && codepoint <= 0xfe0f) || // Variation Selectors
(codepoint >= 0x1f3fb && codepoint <= 0x1f3ff) || // Skin Tone Modifiers
(codepoint >= 0x1f1e6 && codepoint <= 0x1f1ff) || // Regional Indicators
codepoint === 0x20e3 || // Combining Enclosing Keycap
(codepoint >= 0xe0020 && codepoint <= 0xe007f) // Tag characters
)
}

function canStartGraphemeCluster(codepoint: number): boolean {
return (
(codepoint >= 0x1f1e6 && codepoint <= 0x1f1ff) || // Regional Indicators
(codepoint >= 0x1f300 && codepoint <= 0x1faff) || // Emoji ranges (simplified)
codepoint === 0x1f3f4 || // Black Flag
(codepoint >= 0x23 && codepoint <= 0x39) || // #, *, 0-9 for keycaps
(codepoint >= 0x2600 && codepoint <= 0x27bf) // Misc Symbols & Dingbats
)
}

function extractKittyCodepoint(sequence: string): number | null {
const match = KITTY_UNICODE_RE.exec(sequence)
if (!match) return null
const cp = parseInt(match[1]!, 10)
return cp >= 0 && cp <= 0x10ffff ? cp : null
}
Copy link

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The helper functions isGraphemeExtender, canStartGraphemeCluster, and extractKittyCodepoint lack documentation comments. While the inline comments provide some context, these functions would benefit from JSDoc comments explaining:

  • Their purpose in the Kitty keyboard protocol handling
  • Parameter descriptions
  • Return value meanings
  • Example usage or edge cases

This would improve code maintainability, especially given the complexity of grapheme cluster handling.

Copilot uses AI. Check for mistakes.
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 6 changed files in this pull request and generated 6 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 257 to 260
const CHUNK_SIZE = 8192
for (let i = 0; i < buffer.length; i += CHUNK_SIZE) {
const slice = buffer.slice(i, i + CHUNK_SIZE)
chunks.push(String.fromCodePoint.apply(null, slice))
Copy link

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using String.fromCodePoint.apply(null, slice) can throw a "Maximum call stack size exceeded" error if the slice array is very large, even though CHUNK_SIZE is set to 8192 to mitigate this. Different JavaScript engines have different limits (typically 65536-100000 arguments).

Consider using the spread operator with a safer approach or a loop:

chunks.push(String.fromCodePoint(...slice))

Or reduce CHUNK_SIZE to a more conservative value like 1000-5000 to ensure compatibility across all engines.

Suggested change
const CHUNK_SIZE = 8192
for (let i = 0; i < buffer.length; i += CHUNK_SIZE) {
const slice = buffer.slice(i, i + CHUNK_SIZE)
chunks.push(String.fromCodePoint.apply(null, slice))
const CHUNK_SIZE = 4096
for (let i = 0; i < buffer.length; i += CHUNK_SIZE) {
const slice = buffer.slice(i, i + CHUNK_SIZE)
chunks.push(String.fromCodePoint(...slice))

Copilot uses AI. Check for mistakes.
} else if (s.length === 1 && s >= "A" && s <= "Z") {
// shift+letter
key.name = s.toLowerCase()
key.shift = true
Copy link

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The call to isSingleGrapheme(s) at line 302 invokes the Intl.Segmenter for every keypress that isn't matched by earlier conditions. This happens after checking for single ASCII characters (lines 291-301), but the isSingleGrapheme function itself also checks for ASCII at lines 54-58 in grapheme-segmenter.ts, resulting in duplicate checks.

For better performance, the ASCII check could be moved earlier in this function to avoid calling isSingleGrapheme for common ASCII characters. However, the current structure might be intentional for code clarity.

Suggested change
key.shift = true
key.shift = true
} else if (s.length === 1 && s >= " " && s <= "\x7f") {
// remaining single printable ASCII characters (punctuation, symbols, etc.)
key.name = s

Copilot uses AI. Check for mistakes.
Comment on lines 26 to 47
function isGraphemeExtender(codepoint: number): boolean {
return (
codepoint === 0x200d || // ZWJ
(codepoint >= 0xfe00 && codepoint <= 0xfe0f) || // Variation Selectors
(codepoint >= 0x1f3fb && codepoint <= 0x1f3ff) || // Skin Tone Modifiers
(codepoint >= 0x1f1e6 && codepoint <= 0x1f1ff) || // Regional Indicators
codepoint === 0x20e3 || // Combining Enclosing Keycap
(codepoint >= 0xe0020 && codepoint <= 0xe007f) // Tag characters
)
}

function canStartGraphemeCluster(codepoint: number): boolean {
return (
(codepoint >= 0x1f1e6 && codepoint <= 0x1f1ff) || // Regional Indicators
(codepoint >= 0x1f300 && codepoint <= 0x1faff) || // Emoji ranges (simplified)
codepoint === 0x1f3f4 || // Black Flag
codepoint === 0x23 || // # for keycap
codepoint === 0x2a || // * for keycap
(codepoint >= 0x30 && codepoint <= 0x39) || // 0-9 for keycaps
(codepoint >= 0x2600 && codepoint <= 0x27bf) // Misc Symbols & Dingbats
)
}
Copy link

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The helper functions isGraphemeExtender and canStartGraphemeCluster lack documentation explaining their purpose and the Unicode ranges they cover. Given the complexity of Unicode grapheme cluster rules, these functions should have comments explaining:

  • What each Unicode range represents (the inline comments are good but could be in function-level docs)
  • Why these specific ranges were chosen
  • That this is a heuristic for buffering, not a complete grapheme cluster algorithm (which is delegated to Intl.Segmenter)

Copilot uses AI. Check for mistakes.
if (str.length === 1) return str[0]!
const secondCode = str.charCodeAt(1)
if (secondCode < 128) return str[0]!
} else if (str.length === 1) {
Copy link

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The firstGrapheme function has an optimization path that returns str[0]! for single-character strings (line 74), but this doesn't handle the edge case where the string might be a single invalid surrogate. While unlikely in practice since the function is called on stdin input, it would be more robust to check firstCode >= 0xd800 && firstCode <= 0xdfff and fall through to the segmenter for safety.

Suggested change
} else if (str.length === 1) {
} else if (str.length === 1 && (firstCode < 0xd800 || firstCode > 0xdfff)) {

Copilot uses AI. Check for mistakes.
Comment on lines +25 to +40
export function getGraphemeSegmenter(): Intl.Segmenter {
if (segmenter) return segmenter

if (typeof Intl !== "undefined" && typeof (Intl as any).Segmenter === "function") {
segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" })
return segmenter
}

if (initError) {
throw initError
}

throw new Error(
"Intl.Segmenter is not available. Please ensure your runtime supports it or install @formatjs/intl-segmenter",
)
}
Copy link

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The polyfill initialization is asynchronous but called at module load time (line 23). However, getGraphemeSegmenter() can be called synchronously before the polyfill finishes loading. If the native Intl.Segmenter is not available and the polyfill hasn't loaded yet, the function will throw an error instead of waiting for initialization to complete.

This could cause runtime errors when the stdin-buffer is used immediately after import in environments without native Intl.Segmenter support. Consider either:

  1. Making getGraphemeSegmenter() async and awaiting the initialization
  2. Checking if initPromise is still pending and waiting for it
  3. Documenting that users must await initialization before using the module

Copilot uses AI. Check for mistakes.
await import("@formatjs/intl-segmenter/polyfill-force.js")
} catch (e) {
initError = new Error(
"Failed to load Intl.Segmenter polyfill. Please ensure @formatjs/intl-segmenter is installed or use a runtime that supports Intl.Segmenter natively.",
Copy link

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error from the polyfill import is caught but not stored correctly. The catch block at line 12 receives the error as parameter e, but then creates a new generic Error message without preserving the original error information. This makes debugging polyfill loading issues difficult.

The initError should either wrap the original error or include its message: initError = new Error("Failed to load Intl.Segmenter polyfill: " + (e instanceof Error ? e.message : String(e)))

Suggested change
"Failed to load Intl.Segmenter polyfill. Please ensure @formatjs/intl-segmenter is installed or use a runtime that supports Intl.Segmenter natively.",
"Failed to load Intl.Segmenter polyfill: " +
(e instanceof Error ? e.message : String(e)) +
" Please ensure @formatjs/intl-segmenter is installed or use a runtime that supports Intl.Segmenter natively.",

Copilot uses AI. Check for mistakes.
…n StdinBuffer

- Add grapheme-segmenter.ts using Intl.Segmenter for Unicode-correct text segmentation
- Update StdinBuffer to emit complete grapheme clusters instead of single characters
- Preserve Kitty keyboard protocol sequences unchanged for downstream parsing
- Add comprehensive tests for grapheme cluster handling
Buffer Kitty keyboard protocol emoji sequences and reassemble them into
complete grapheme clusters before emitting keypress events.

- Add emoji codepoint detection helpers (isGraphemeExtender, canStartGraphemeCluster)
- Buffer Kitty sequences that form multi-codepoint emoji (ZWJ, flags, skin tones, keycaps)
- Flush buffer on timeout or when non-emoji input arrives
- Preserve all raw sequences in the emitted KeyEvent.raw field
- Add comprehensive tests for emoji reassembly scenarios
- Reject lone surrogates in firstGrapheme fast path (fall through to segmenter)
- Preserve original error message when polyfill loading fails
@kommander
Copy link
Collaborator

Interesting, I hadn't thought about it from that perspective. That means full grapheme parsing for input though, which can get out of hand really fast. The stdin buffer was meant to handle incomplete sequences, split at byte level, which would break some behaviours. For input in the Textarea for example, even if the bytes arrive chunked, they would end up forming a joined grapheme at the end. Is this meant to have full grapheme character key events?

…JK detection

Update parse.keypress.ts to use isSingleGrapheme() instead of s.length === 1
for proper handling of multi-codepoint characters like emoji and CJK.
@GreyElaina
Copy link
Author

GreyElaina commented Jan 8, 2026

The grapheme-aware input is already on my PR branch with ASCII fast paths, so performance impact is minimal for typical input.
Scope clarification:

  1. StdinBuffer: Only changes how non-escape sequences are split (uses firstGrapheme() instead of single char). This handles raw UTF-8 correctly.
  2. KeyHandler: Already has emoji buffering for Kitty protocol (shouldBufferForEmoji() + 10ms timeout + Intl.Segmenter reassembly).

The issue I hit was specifically with Rime IME emoji input - it sends input through some keystroke-based mechanism that differs from:

  • Clipboard paste (works fine)
  • Raycast emoji picker (works fine)
  • macOS fn emoji menu (works fine)

These might be using the same underlying mechanism, but I haven't dug deeper.
So this is an edge case fix. But using grapheme-based checking (isSingleGrapheme(s) instead of s.length === 1) is more robust anyway - who knows what weird input methods exist out there (CJK IMEs, etc.). Better to handle grapheme clusters correctly from the start than chase down individual edge cases.

@kommander
Copy link
Collaborator

It feels like something is more fundamentally wrong with the stdin buffer if it causes such a complexity explosion. Grapheme detection is horribly fragile at the moment and already has a huge complexity in the native part to render them correctly. Doing something similar for input seems like beating a symptom of wrong underlying assumptions in the stdin buffer.

@GreyElaina
Copy link
Author

zig/native side grapheme is used for rendering, the stdin buffer is for input parsing, i must say they are two different things.

the stdin buffer itself is simple. these complexity is brought by the terminal protocol and Unicode itself (see libghostty, utf8 everywhere etc.), anyway not a design issue. it's not correct to break UTF-8 by bytes or to break emoji by codepoints if not according to grapheme.

@kommander
Copy link
Collaborator

Okay, I think I understand better now. This is specifically a Kitty Keyboard issue, which emits separate input events for a ZWJ emoji for example, right? And even though inserting them one by one it would be rendered correctly, but not emitted as single input event, but separate ones for each byte?

@GreyElaina
Copy link
Author

Yeah exactly. Kitty protocol emits separate escape sequences per codepoint, not per byte. So 👨‍👩‍👧 (family emoji, 7 codepoints joined by ZWJ) comes as 7 separate \x1b[<codepoint>u sequences.
The fix buffers these with a 10ms timeout and reassembles using Intl.Segmenter. StdinBuffer itself didn't change much for this: the Kitty reassembly logic is in KeyHandler.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants