Note (2.0+): The str/core module is now internal (str/internal/core). All functions documented here are available via import str. Use str.function_name() in your code.
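As a minimal sketch of what the note means in practice, the snippet below calls two of the functions documented on this page through the top-level module. It assumes the str package is a dependency of the project, that length returns an Int, and that take returns a String; the expected values are the ones given in the examples further down this page.

```gleam
import gleam/int
import gleam/io
import str

pub fn main() {
  // Functions are reached through the top-level str module,
  // not through str/core or str/internal/core.
  io.println(int.to_string(str.length("hello")))
  // documented result: 5
  io.println(str.take("hello", 3))
  // documented result: "hel"
}
```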
The core of str provides fundamental string operations that correctly handle Unicode grapheme clusters, including:
- Complex emoji sequences (ZWJ, skin tones, flags)
- Combining character sequences (diacritics, accents)
- Multi-codepoint grapheme clusters
- CRLF line endings (treated as single grapheme)
All functions in this module operate at the grapheme boundary level, ensuring Unicode correctness.
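To illustrate why the grapheme level matters, here is a small sketch that uses only the Gleam standard library (not str itself) to compare a code point count with a grapheme count. The helper names codepoint_count and grapheme_count are hypothetical and exist only for this illustration.

```gleam
import gleam/int
import gleam/io
import gleam/list
import gleam/string

/// Hypothetical helper: number of Unicode code points in text.
pub fn codepoint_count(text: String) -> Int {
  text |> string.to_utf_codepoints |> list.length
}

/// Hypothetical helper: number of grapheme clusters in text.
pub fn grapheme_count(text: String) -> Int {
  text |> string.to_graphemes |> list.length
}

pub fn main() {
  // "e" followed by U+0301 (combining acute accent):
  // two code points, but one user-perceived character.
  let text = "e\u{0301}"
  io.println(int.to_string(codepoint_count(text)))
  // prints 2
  io.println(int.to_string(grapheme_count(text)))
  // prints 1
}
```

A function that worked on code points would treat the accent as a separate unit and could split it from its base letter; working on graphemes avoids that.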
Truncates text to a maximum number of grapheme clusters, appending a suffix.
Example:
truncate("Hello 👨‍👩‍👧‍👦 World", 8, "...") // "Hello 👨‍👩‍👧‍👦..."
Variant that prioritizes preserving complete emoji sequences.
Strict truncation that may split complex sequences if necessary.
Convenience function using "..." as the default suffix.
Truncates text with ellipsis (…) suffix.
Example:
ellipsis("Hello World", 8) // "Hello W…"
Reverses text at grapheme cluster boundaries.
Example:
reverse("café") // "éfac"
reverse("👨‍👩‍👧‍👦") // "👨‍👩‍👧‍👦" (single cluster, unchanged)
Returns the number of grapheme clusters in text. This is a grapheme-aware length function that correctly counts complex emoji, combining sequences, flags, and other multi-codepoint graphemes.
Example:
length("hello") // 5
length("👨👩👧👦") // 1 (single family emoji cluster)
length("café") // 4 (with combining accent)
length("🇮🇹") // 1 (flag is a single grapheme)
length("") // 0
Returns the first N grapheme clusters from text.
Example:
take("hello", 3) // "hel"
take("👨‍👩‍👧‍👦abc", 2) // "👨‍👩‍👧‍👦a"
Drops the first N grapheme clusters from text.
Example:
drop("hello", 2) // "llo"
drop("👨‍👩‍👧‍👦abc", 1) // "abc"
Returns the grapheme cluster at the given index (0-based).
Example:
at("hello", 1) // Ok("e")
at("👨👩👧👦abc", 0) // Ok("👨👩👧👦")
at("hi", 10) // Error(Nil)
Splits text into chunks of N graphemes. Like Rust's chunks() or Lodash's chunk(). The last chunk may be smaller.
Example:
chunk("abcdef", 2) // ["ab", "cd", "ef"]
chunk("abcdefg", 3) // ["abc", "def", "g"]
chunk("hello", 10) // ["hello"]
chunk("👨‍👩‍👧‍👦ab", 2) // ["👨‍👩‍👧‍👦a", "b"]
Pads text on the left to reach the specified width.
Example:
pad_left("hi", 5, " ") // "   hi"
pad_left("x", 3, "->") // "->->x"
Pads text on the right.
Centers text within the specified width (right-biased when uneven: extra padding goes to the right).
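The right-biased rule can be sketched as follows. This is an illustration built on the Gleam standard library, not the library's own implementation; center_sketch is a hypothetical name, and the sketch assumes the pad string is a single grapheme.

```gleam
import gleam/int
import gleam/io
import gleam/string

/// Hypothetical sketch of right-biased centering.
pub fn center_sketch(text: String, width: Int, pad: String) -> String {
  // Total padding needed; never negative.
  let total = int.max(width - string.length(text), 0)
  // Integer division rounds down, so the left side gets the smaller half
  // and any odd leftover unit goes to the right.
  let left = total / 2
  let right = total - left
  string.repeat(pad, left) <> text <> string.repeat(pad, right)
}

pub fn main() {
  // Width 5 with a 2-grapheme text leaves 3 units: 1 left, 2 right.
  io.println("[" <> center_sketch("hi", 5, " ") <> "]")
  // prints "[ hi  ]"
}
```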
Example:
center("hi", 6, " ") // "  hi  "
Flexible padding function. Position is a type: Left, Right, or Both (center).
Example:
fill("x", 5, "-", Left) // "----x"
fill("x", 5, "-", Right) // "x----"
fill("x", 5, "-", Both) // "--x--"
fill("42", 5, "0", Left) // "00042"
Counts occurrences of a substring (grapheme-aware).
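The examples for this function pass a third Bool argument that selects overlapping or non-overlapping matching. The difference comes down to how far the scan advances after a match, which the following sketch tries to show. It is an illustration only: count_sketch and count_loop are hypothetical names, and treating an empty needle as zero matches is an assumption.

```gleam
import gleam/int
import gleam/io
import gleam/list
import gleam/string

/// Hypothetical sketch, not the library's implementation.
pub fn count_sketch(text: String, needle: String, overlapping: Bool) -> Int {
  let pattern = string.to_graphemes(needle)
  case pattern {
    // Assumption: an empty needle counts as zero matches.
    [] -> 0
    _ -> {
      let size = list.length(pattern)
      // After a match, advance one grapheme (overlapping)
      // or skip past the whole match (non-overlapping).
      let step = case overlapping {
        True -> 1
        False -> size
      }
      count_loop(string.to_graphemes(text), pattern, size, step, 0)
    }
  }
}

fn count_loop(
  haystack: List(String),
  pattern: List(String),
  size: Int,
  step: Int,
  found: Int,
) -> Int {
  case haystack {
    [] -> found
    [_, ..rest] ->
      case list.take(haystack, size) == pattern {
        True ->
          count_loop(list.drop(haystack, step), pattern, size, step, found + 1)
        False -> count_loop(rest, pattern, size, step, found)
      }
  }
}

pub fn main() {
  io.println(int.to_string(count_sketch("aaaa", "aa", True)))
  // prints 3
  io.println(int.to_string(count_sketch("aaaa", "aa", False)))
  // prints 2
}
```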
Example:
count("aaaa", "aa", True) // 3 (overlapping)
count("aaaa", "aa", False) // 2 (non-overlapping)
count("👩👩👩", "👩", True) // 3
Checks if a string contains only whitespace characters.
Example:
is_blank("") // True
is_blank(" ") // True
is_blank(" hello ") // False
Splits text into words by whitespace.
Example:
words("Hello world\n\ttest") // ["Hello", "world", "test"]
Splits text into lines. Handles \n, \r\n, and \r.
Example:
lines("a\nb\nc") // ["a", "b", "c"]
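lines("a\r\nb") // ["a", "b"]
Splits text on separator with a maximum number of parts. Like Rust's splitn(). Python's str.split(sep, n - 1) gives the same result, because Python's second argument counts splits rather than parts.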
Example:
splitn("a-b-c-d", "-", 2) // ["a", "b-c-d"]
splitn("a-b-c-d", "-", 3) // ["a", "b", "c-d"]
splitn("a-b", "-", 10) // ["a", "b"]
splitn("hello", "-", 2) // ["hello"]
Removes common leading whitespace from all lines.
Example:
dedent(" a\n b\n c") // "a\nb\nc"
Adds indentation to each line.
Example:
indent("hello\nworld", 2) // "  hello\n  world"
Wraps text at the specified width, breaking on word boundaries.
Example:
wrap_at("hello world foo bar", 11) // "hello world\nfoo bar"
Removes trailing newline if present (handles \n, \r\n, \r as graphemes).
Example:
chomp("hello\n") // "hello"
chomp("hello\r\n") // "hello"
Wraps text with prefix and suffix.
Example:
surround("world", "Hello ", "!") // "Hello world!"
Removes prefix and suffix if both are present.
Removes specified characters from both ends of text.
Example:
strip("..hello..", ".") // "hello"
Collapses consecutive occurrences of a character to a single instance.
Example:
squeeze("heeello", "e") // "hello"
squeeze("  hello   world  ", " ") // " hello world "
Splits text into three parts: before, separator, and after.
Example:
partition("a-b-c", "-") // #("a", "-", "b-c")
partition("hello", "-") // #("hello", "", "")
Splits text from the last occurrence of separator. Like Python's str.rpartition(). If separator not found, returns #("", "", text).
Example:
rpartition("a-b-c", "-") // #("a-b", "-", "c")
rpartition("hello", "-") // #("", "", "hello")
rpartition("a--b--c", "--") // #("a--b", "--", "c")
Finds the longest common prefix among a list of strings.
Example:
common_prefix(["abc", "abd", "abe"]) // "ab"
Finds the longest common suffix among a list of strings.
Example:
common_suffix(["abc", "xbc", "zbc"]) // "bc"
Checks if text contains only ASCII digits (0-9).
Example:
is_numeric("12345") // True
is_numeric("123.45") // False
Checks if text contains only ASCII letters (a-z, A-Z).
Example:
is_alpha("hello") // True
is_alpha("hello123") // False
Checks if text contains only ASCII letters and digits.
Example:
is_alphanumeric("hello123") // True
is_alphanumeric("hello-world") // False
Removes prefix from text if present.
Example:
remove_prefix("hello world", "hello ") // "world"
remove_prefix("hello", "bye") // "hello"
Removes suffix from text if present.
Adds prefix if not already present.
Example:
ensure_prefix("world", "hello ") // "hello world"
ensure_prefix("hello world", "hello ") // "hello world"
Adds suffix if not already present.
Checks if text starts with any of the given prefixes. Like Lodash's startsWith with multiple options.
Example:
starts_with_any("hello", ["hi", "he", "ha"]) // True
starts_with_any("hello", ["x", "y", "z"]) // False
starts_with_any("", ["a"]) // False
starts_with_any("hello", []) // False
Checks if text ends with any of the given suffixes.
Example:
ends_with_any("file.txt", [".txt", ".md"]) // True
ends_with_any("file.rs", [".txt", ".md"]) // False
ends_with_any("hello", ["lo", "llo", "o"]) // True
Swaps case of all ASCII letters.
Example:
swapcase("Hello World") // "hELLO wORLD"
Capitalizes first grapheme and lowercases the rest. Like Python's str.capitalize().
Example:
capitalize("hello world") // "Hello world"
capitalize("hELLO wORLD") // "Hello world"
capitalize("HELLO") // "Hello"
capitalize("123abc") // "123abc"
Calculates Levenshtein distance between two strings.
Example:
distance("kitten", "sitting") // 3
distance("hello", "hello") // 0
Finds the index of the first occurrence of needle in text (grapheme-aware).
Example:
index_of("hello world", "world") // Ok(6)
index_of("👨👩👧👦 family", "family") // Ok(2)
index_of("hello", "x") // Error(Nil)
Finds the index of the last occurrence of needle in text.
Example:
last_index_of("hello hello", "hello") // Ok(6)
last_index_of("a-b-c", "-") // Ok(3)
Returns True if needle is found in text. This is grapheme-aware and correctly handles complex Unicode sequences.
Example:
contains("hello world", "world") // True
contains("hello", "x") // False
contains("👨👩👧👦 family", "👨👩👧👦") // True
contains("", "") // False
Returns True if text starts with prefix on grapheme boundaries.
Example:
starts_with("hello", "he") // True
starts_with("hello", "") // True
starts_with("hi", "hello") // False
starts_with("👨‍👩‍👧‍👦abc", "👨‍👩‍👧‍👦") // True
Returns True if text ends with suffix on grapheme boundaries.
Example:
ends_with("hello.txt", ".txt") // True
ends_with("hello", "") // True
ends_with("hi", "hello") // False
ends_with("abc👨‍👩‍👧‍👦", "👨‍👩‍👧‍👦") // True
Checks if text contains any of the given needles.
Example:
contains_any("hello world", ["foo", "world"]) // True
contains_any("hello", ["x", "y", "z"]) // False
Checks if text contains all of the given needles.
Example:
contains_all("hello world", ["hello", "world"]) // True
contains_all("hello", ["hello", "x"]) // False
Replaces only the first occurrence of old with new.
Example:
replace_first("hello hello", "hello", "hi") // "hi hello"
replace_first("aaa", "a", "b") // "baa"
Replaces only the last occurrence of old with new.
Example:
replace_last("hello hello", "hello", "hi") // "hello hi"
replace_last("aaa", "a", "b") // "aab"
Checks if all cased characters are uppercase. Non-cased characters are ignored.
Example:
is_uppercase("HELLO") // True
is_uppercase("Hello") // False
is_uppercase("HELLO123") // True (numbers ignored)
is_uppercase("123") // False (no cased chars)
Checks if all cased characters are lowercase.
Example:
is_lowercase("hello") // True
is_lowercase("Hello") // False
is_lowercase("hello123") // True
Checks if text is in Title Case format: each word starts with uppercase and continues with lowercase. Words that don't start with a letter (numbers, emoji, punctuation) are ignored.
Example:
is_title_case("Hello World") // True
is_title_case("Hello world") // False
is_title_case("Hello 123 World") // True (numbers ignored)
is_title_case("Hello 🎉 World") // True (emoji ignored)
is_title_case("") // False
Returns True if text is an empty string.
Example:
is_empty("") // True
is_empty(" ") // False
is_empty("a") // False
Checks if text contains only ASCII characters (0x00-0x7F).
Example:
is_ascii("hello!@#") // True
is_ascii("café") // False
is_ascii("👋") // False
Checks if text contains only printable ASCII characters (0x20-0x7E).
Example:
is_printable("hello") // True
is_printable("hello\n") // False
is_printable("hello\t") // False
Checks if text contains only hexadecimal characters (0-9, a-f, A-F).
Example:
is_hex("abc123") // True
is_hex("DEADBEEF") // True
is_hex("xyz") // False
Escapes HTML special characters to their entity equivalents.
Example:
escape_html("<div>Hello</div>") // "&lt;div&gt;Hello&lt;/div&gt;"
escape_html("Tom & Jerry") // "Tom &amp; Jerry"
escape_html("Say \"hello\"") // "Say &quot;hello&quot;"
Unescapes HTML entities to their character equivalents.
Example:
unescape_html("&lt;div&gt;") // "<div>"
unescape_html("Tom &amp; Jerry") // "Tom & Jerry"
Escapes regex metacharacters for use as a literal pattern.
Example:
escape_regex("hello.world") // "hello\\.world"
escape_regex("[test]") // "\\[test\\]"
escape_regex("a+b*c?") // "a\\+b\\*c\\?"
Calculates similarity as a ratio between 0.0 and 1.0, based on Levenshtein distance.
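One common way to turn an edit distance into such a ratio is to normalize by the length of the longer string. The sketch below assumes that formula; it is consistent with the documented results for "hello" vs "hallo" (distance 1, length 5, giving 0.8) and "abc" vs "xyz" (distance 3, length 3, giving 0.0), but it is not confirmed to be the library's implementation. The name similarity_from_distance and the choice to return 1.0 for two empty strings are assumptions.

```gleam
import gleam/float
import gleam/int
import gleam/io

/// Hypothetical sketch: similarity = 1 - distance / longest length.
pub fn similarity_from_distance(
  distance: Int,
  len_a: Int,
  len_b: Int,
) -> Float {
  case int.max(len_a, len_b) {
    // Assumption: two empty strings are treated as identical.
    0 -> 1.0
    longest -> 1.0 -. { int.to_float(distance) /. int.to_float(longest) }
  }
}

pub fn main() {
  // "hello" vs "hallo": one substitution, both of length 5.
  io.println(float.to_string(similarity_from_distance(1, 5, 5)))
  // prints 0.8
  // "abc" vs "xyz": three substitutions, both of length 3.
  io.println(float.to_string(similarity_from_distance(3, 3, 3)))
  // prints 0.0
}
```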
Example:
similarity("hello", "hello") // 1.0
similarity("hello", "hallo") // 0.8
similarity("abc", "xyz") // 0.0
Calculates Hamming distance between two strings of equal length.
Example:
hamming_distance("karolin", "kathrin") // Ok(3)
hamming_distance("hello", "hallo") // Ok(1)
hamming_distance("abc", "ab") // Error(Nil)
Returns the last N grapheme clusters from text.
Example:
take_right("hello", 3) // "llo"
take_right("👨‍👩‍👧‍👦abc", 2) // "bc"
Drops the last N grapheme clusters from text.
Example:
drop_right("hello", 2) // "hel"
drop_right("👨‍👩‍👧‍👦abc", 2) // "👨‍👩‍👧‍👦a"
Reverses the order of words in text.
Example:
reverse_words("hello world") // "world hello"
reverse_words("one two three") // "three two one"
Extracts initials from text (first letter of each word, uppercase).
Example:
initials("John Doe") // "JD"
initials("visual studio code") // "VSC"
Collapses all consecutive whitespace (spaces, tabs, newlines) into single spaces and trims. Like JavaScript's equivalent.
Example:
normalize_whitespace(" hello world ") // "hello world"
normalize_whitespace("hello\n\tworld") // "hello world"
normalize_whitespace(" a b c ") // "a b c"
normalize_whitespace("") // ""
The module uses string.to_graphemes/1 from the Gleam standard library for grapheme segmentation, which provides Unicode-compliant grapheme cluster boundaries (UAX #29).
Key behaviors:
- \r\n is treated as a single grapheme (CRLF cluster)
- Emoji ZWJ sequences are single graphemes
- Combining marks stay attached to their base character
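To show how grapheme-level operations can be layered on string.to_graphemes, here is a minimal sketch of a take and a reverse written directly against the standard library. The names take_sketch and reverse_sketch are hypothetical, and this is not claimed to be how the module implements its own functions.

```gleam
import gleam/io
import gleam/list
import gleam/string

/// Hypothetical sketch: first n grapheme clusters of text.
pub fn take_sketch(text: String, n: Int) -> String {
  text
  |> string.to_graphemes
  |> list.take(n)
  |> string.concat
}

/// Hypothetical sketch: reverse text one grapheme cluster at a time.
pub fn reverse_sketch(text: String) -> String {
  text
  |> string.to_graphemes
  |> list.reverse
  |> string.concat
}

pub fn main() {
  io.println(take_sketch("hello", 3))
  // prints "hel"
  io.println(reverse_sketch("abc"))
  // prints "cba"
  // Inspect the segmentation of a string containing CRLF.
  io.println(string.inspect(string.to_graphemes("a\r\nb")))
}
```

Because each list element is a whole cluster, neither helper can cut through a combining sequence or a ZWJ emoji.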
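Most functions run in time linear in the number of grapheme clusters. The Levenshtein-based functions (distance, similarity) are an exception: the standard algorithm takes time proportional to the product of the two input lengths. For very large strings (>100KB), consider pre-processing or chunking.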
- str/extra — ASCII folding and slug generation
- str/tokenize — Pure-Gleam tokenizer reference
- Examples — Integration examples