Feature: accessing and slicing Ropes by byte offset and grapheme cluster offset. #57

@phronmophobic

I'm working on a code editor that uses tree-sitter for syntax highlighting and supporting text like "👻👩‍👩‍👦‍👦" is a design goal.

Tree-sitter's API references all text snippets by their byte offsets. For example, if tree-sitter parses the Clojure code (declare foo), it will tell you there's a list literal starting at byte offset 0, a symbol declare at byte offset 1, and a symbol foo at byte offset 9. To construct a view of the syntax-highlighted text, I then need to grab chunks of text by their byte offsets.
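To make the byte-offset requirement concrete, here is a minimal sketch in plain Java. It does not use Bifurcan's Rope; `ByteSlice` and `sliceByBytes` are hypothetical names for illustration, and the sketch assumes the offsets fall on code point boundaries.

```java
import java.nio.charset.StandardCharsets;

public class ByteSlice {
    // Returns the text between two UTF-8 byte offsets
    // (startByte inclusive, endByte exclusive).
    // Assumption: both offsets fall on code point boundaries.
    static String sliceByBytes(String text, int startByte, int endByte) {
        // Encodes the whole string on every call, so each slice is O(n).
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, startByte, endByte - startByte, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String code = "(declare foo)";
        System.out.println(sliceByBytes(code, 1, 8));   // declare
        System.out.println(sliceByBytes(code, 9, 12));  // foo
    }
}
```

Re-encoding the whole text for every slice is what I'd like to avoid, which is why having the Rope track byte offsets itself would help.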

On the UI side, the cursor needs to move forward and backward by grapheme cluster. As an example, 👩‍👩‍👦‍👦 is 1 grapheme cluster, 25 UTF-8 bytes, 7 code points, or 11 Java characters. It's really convenient that Bifurcan's Rope already deals with the discrepancy between code points and Java characters 😄.
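The counts above can be checked with a small sketch in plain Java (again not Bifurcan's API; `TextUnits` and its methods are hypothetical names). The byte, code point, and char counts are fixed by the encodings. The grapheme cluster count uses `java.text.BreakIterator`, whose handling of emoji ZWJ sequences depends on the JDK version, so I don't assume a particular result for it.

```java
import java.nio.charset.StandardCharsets;
import java.text.BreakIterator;

public class TextUnits {
    // The family emoji: woman, ZWJ, woman, ZWJ, boy, ZWJ, boy.
    static final String FAMILY =
        "\uD83D\uDC69\u200D\uD83D\uDC69\u200D\uD83D\uDC66\u200D\uD83D\uDC66";

    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    static int javaChars(String s) {
        return s.length(); // UTF-16 code units
    }

    // Counts boundaries reported by the JDK's character BreakIterator.
    // Whether a ZWJ sequence counts as one cluster depends on the JDK version.
    static int graphemeClusters(String s) {
        BreakIterator it = BreakIterator.getCharacterInstance();
        it.setText(s);
        int count = 0;
        while (it.next() != BreakIterator.DONE) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("UTF-8 bytes: " + utf8Bytes(FAMILY));    // 25
        System.out.println("code points: " + codePoints(FAMILY));   // 7
        System.out.println("Java chars:  " + javaChars(FAMILY));    // 11
        System.out.println("grapheme clusters (JDK-dependent): " + graphemeClusters(FAMILY));
    }
}
```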

I'm not sure if there's interest in extending Bifurcan's Rope bookkeeping to also support lookup by byte offset and/or grapheme cluster offset.

Thanks for the great library!
