This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
SentencePieceKit is a Swift wrapper around Google's SentencePiece C++ tokenizer library, distributed as a Swift Package. It supports iOS 13+, macOS 11+, and watchOS 4+.
Three-layer bridge pattern:
- sentencepiece.xcframework — Pre-compiled C++ static library (binary target)
- SentencePieceBridge — Objective-C++ wrapper (Sources/SentencePieceBridge/) that converts between C++ and Objective-C types
- SentencePieceKit — Swift public API (Sources/SentencePieceKit/SentencePieceKit.swift) exposing SentencepieceTokenizer
Dependency chain: SentencePieceKit → SentencePieceBridge → sentencepiece (xcframework)
The bridge layer (SentencePieceBridge.mm) is the only file that touches C++ directly. All Swift code interacts through the Objective-C interface defined in SentencePieceBridge.h.
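The layering can be illustrated with a minimal Swift sketch. It is not the package's actual code: `TokenizerBridging`, `StubBridge`, and `LayeredTokenizer` are hypothetical names, and the stub splits on spaces instead of calling SentencePiece. It only shows the shape of the pattern, where the Swift layer depends on an interface and never sees C++ types.

```swift
import Foundation

// Hypothetical stand-in for the Objective-C interface declared in
// SentencePieceBridge.h. The real header's method names may differ.
protocol TokenizerBridging {
    func encode(_ text: String) -> [String]
    func decode(_ pieces: [String]) -> String
}

// Stub playing the role of the bridge layer. The real bridge forwards to the
// C++ processor; this one splits on spaces so the sketch runs on its own.
final class StubBridge: TokenizerBridging {
    func encode(_ text: String) -> [String] {
        text.split(separator: " ").map(String.init)
    }

    func decode(_ pieces: [String]) -> String {
        pieces.joined(separator: " ")
    }
}

// Swift-facing layer: it depends only on the bridge interface.
struct LayeredTokenizer {
    private let bridge: TokenizerBridging

    init(bridge: TokenizerBridging) { self.bridge = bridge }

    func tokenize(_ text: String) -> [String] { bridge.encode(text) }
    func detokenize(_ pieces: [String]) -> String { bridge.decode(pieces) }
}

let tokenizer = LayeredTokenizer(bridge: StubBridge())
let pieces = tokenizer.tokenize("hello bridge layer")
print(pieces)
print(tokenizer.detokenize(pieces))
```

Keeping the Swift layer behind a narrow interface like this is what lets the C++ details stay confined to a single .mm file.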
swift build # Build the package (macOS)
swift test # Run tests (Swift Testing framework)
Requires: cmake, lipo, xcodebuild, wget
bash Scripts/build_xcframework.sh # Compiles C++ for all platforms
bash Scripts/verify_kit.sh # Validates architectures and builds
- Package.swift — SPM config: swift-tools-version 5.9, 4 targets
- Sources/SentencePieceKit/SentencePieceKit.swift — Entire public API (~100 lines)
- Sources/SentencePieceBridge/SentencePieceBridge.mm — C++ interop (~70 lines)
- Sources/SentencePieceBridge/include/SentencePieceBridge.h — Bridge interface
- Scripts/build_xcframework.sh — Builds xcframework from SentencePiece C++ source
- Example/ — iOS demo app (separate Xcode workspace, has its own CLAUDE.md)
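For orientation, a manifest with the dependency chain and platform minimums described above might look roughly like the sketch below. This is a guess at the shape, not a copy of the repository's Package.swift: the xcframework path, the test target name, and the linker setting are all assumptions.

```swift
// swift-tools-version:5.9
import PackageDescription

let package = Package(
    name: "SentencePieceKit",
    platforms: [.iOS(.v13), .macOS(.v11), .watchOS(.v4)],
    products: [
        .library(name: "SentencePieceKit", targets: ["SentencePieceKit"])
    ],
    targets: [
        // Pre-compiled C++ static library. The path is an assumption.
        .binaryTarget(
            name: "sentencepiece",
            path: "Frameworks/sentencepiece.xcframework"
        ),
        // Objective-C++ wrapper: the only target that touches C++.
        // Linking libc++ here is an assumption about what the bridge needs.
        .target(
            name: "SentencePieceBridge",
            dependencies: ["sentencepiece"],
            linkerSettings: [.linkedLibrary("c++")]
        ),
        // Swift public API.
        .target(
            name: "SentencePieceKit",
            dependencies: ["SentencePieceBridge"]
        ),
        // Assumed to be the fourth target; the name is hypothetical.
        .testTarget(
            name: "SentencePieceKitTests",
            dependencies: ["SentencePieceKit"]
        )
    ]
)
```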
- The class is named SentencepieceTokenizer (lowercase 'p') — preserve this casing
- The tokenizer is @unchecked Sendable because the underlying C++ processor is immutable and thread-safe after initialization
- Token offset support exists for compatibility with other tokenizer ecosystems (e.g., Hugging Face)
- Tests use the Swift Testing framework (@Test, #expect), not XCTest
- License: Apache 2.0
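The `@unchecked Sendable` convention can be sketched in isolation. The example below is illustrative only: `ImmutableProcessor` and `SharedTokenizer` are hypothetical stand-ins for the bridge object and the tokenizer, and the dictionary lookup replaces real tokenization. It assumes, as the note above states for the real processor, that all state is fixed at initialization, which is what makes concurrent read-only calls safe.

```swift
import Foundation

// Hypothetical stand-in for a reference type the compiler cannot prove
// Sendable, such as an Objective-C bridge object. State is set once in init.
final class ImmutableProcessor {
    private let vocabulary: [String: Int]

    init(vocabulary: [String: Int]) { self.vocabulary = vocabulary }

    // Returns 0 for unknown pieces (an arbitrary choice for this sketch).
    func id(for piece: String) -> Int { vocabulary[piece] ?? 0 }
}

// @unchecked tells the compiler to trust the thread-safety claim instead of
// verifying it, so the claim must hold by construction: no mutation after init.
final class SharedTokenizer: @unchecked Sendable {
    private let processor: ImmutableProcessor

    init(processor: ImmutableProcessor) { self.processor = processor }

    func ids(for pieces: [String]) -> [Int] { pieces.map(processor.id(for:)) }
}

let shared = SharedTokenizer(
    processor: ImmutableProcessor(vocabulary: ["a": 1, "b": 2])
)

// Calls the shared tokenizer from several threads at once. The lock guards
// only the results array, not the tokenizer itself.
func runConcurrentReads(iterations: Int) -> [[Int]] {
    let lock = NSLock()
    var results: [[Int]] = []
    DispatchQueue.concurrentPerform(iterations: iterations) { _ in
        let ids = shared.ids(for: ["a", "b", "c"])
        lock.lock()
        results.append(ids)
        lock.unlock()
    }
    return results
}

let concurrentResults = runConcurrentReads(iterations: 8)
print(concurrentResults.count)
```

If the wrapped object ever gained mutable state, the `@unchecked` annotation would silently hide data races, so the immutability assumption is worth re-checking whenever the bridge changes.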