| title | Concepts |
|---|---|
| description | Core abstractions in chisel |
| author | zoobzio |
| published | 2026-01-19 |
| updated | 2026-01-19 |
| tags | |

Chisel is built around a few core abstractions: chunks, kinds, providers, and context. Understanding these helps you reason about what chisel extracts and why.
A Chunk is a semantic unit of code. Unlike line-based splitting, chunks follow the natural boundaries of code: where functions start and end, where classes are defined, where documentation lives.

```go
type Chunk struct {
	Content   string   // The actual source code
	Symbol    string   // Name: "Add", "UserService", "Config"
	Kind      Kind     // Category: function, method, class, etc.
	StartLine int      // Where it begins (1-indexed)
	EndLine   int      // Where it ends (1-indexed)
	Context   []string // Parent chain: ["class UserService"]
}
```

Each chunk is self-contained. The `Content` field holds the complete source, including comments and documentation, so embeddings capture the full meaning.

See Types Reference for the complete definition.
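
To make the fields concrete, here is a minimal sketch that builds a chunk by hand and prints a one-line label for it. The `Chunk` struct is redeclared locally so the example compiles on its own; treating `Kind` as a string type is an assumption, and `describe` is a hypothetical helper, not part of chisel.

```go
package main

import (
	"fmt"
	"strings"
)

// Kind is assumed to be a string type here; the real definition may differ.
type Kind string

// Chunk mirrors the struct shown above, redeclared so this sketch is self-contained.
type Chunk struct {
	Content   string
	Symbol    string
	Kind      Kind
	StartLine int
	EndLine   int
	Context   []string
}

// describe is a hypothetical helper (not part of chisel) that builds a
// one-line label from a chunk's kind, symbol, line range, and parent chain.
func describe(c Chunk) string {
	loc := fmt.Sprintf("%s %s (lines %d-%d)", c.Kind, c.Symbol, c.StartLine, c.EndLine)
	if len(c.Context) == 0 {
		return loc
	}
	return strings.Join(c.Context, " > ") + " > " + loc
}

func main() {
	c := Chunk{
		// Content keeps the doc comment together with the function body.
		Content:   "// Add returns the sum of a and b.\nfunc Add(a, b int) int {\n\treturn a + b\n}",
		Symbol:    "Add",
		Kind:      "function",
		StartLine: 10,
		EndLine:   13,
	}
	fmt.Println(describe(c))
	fmt.Println(c.Content)
}
```

Because the line numbers are 1-indexed and inclusive, a four-line function starting at line 10 ends at line 13.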
Kind categorizes what a chunk represents. This lets you filter, group, or weight chunks differently in your pipeline.

| Kind | Description | Example |
|---|---|---|
| `function` | Standalone function | `func Add(a, b int) int` |
| `method` | Function with receiver/self | `func (c *Calc) Add(n int)` |
| `class` | Class or struct definition | `class UserService {}` |
| `interface` | Interface or trait | `interface Reader {}` |
| `type` | Type alias or other type | `type ID = string` |
| `enum` | Enumeration | `enum Status { Active }` |
| `constant` | Constant declaration | `const MaxSize = 100` |
| `variable` | Variable declaration | `var cache = map{}` |
| `section` | Markdown header | `## Installation` |
| `module` | Package/file level | Package documentation |

Not every language uses every kind. Go has no enums; Python has no interfaces. Chisel maps language constructs to the closest semantic equivalent.
See Types Reference for the complete list.
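
As a rough illustration of filtering and weighting by kind, the sketch below works on a trimmed-down local `Chunk`. `KindMethod` appears elsewhere in this document; `KindFunction`, `KindSection`, the string values, and the weights are assumptions made for the example, and `filterByKind` and `weightFor` are hypothetical helpers.

```go
package main

import "fmt"

// Kind is assumed to be a string type; the constant names other than
// KindMethod are guesses by analogy.
type Kind string

const (
	KindFunction Kind = "function"
	KindMethod   Kind = "method"
	KindSection  Kind = "section"
)

// Chunk is a trimmed-down local copy with only the fields this sketch needs.
type Chunk struct {
	Symbol string
	Kind   Kind
}

// filterByKind returns the chunks whose Kind is one of the wanted kinds,
// preserving their original order.
func filterByKind(chunks []Chunk, wanted ...Kind) []Chunk {
	set := make(map[Kind]bool, len(wanted))
	for _, k := range wanted {
		set[k] = true
	}
	var out []Chunk
	for _, c := range chunks {
		if set[c.Kind] {
			out = append(out, c)
		}
	}
	return out
}

// weightFor returns an illustrative ranking weight for a kind.
// The numbers are arbitrary; kinds without an entry default to 1.0.
func weightFor(k Kind) float64 {
	weights := map[Kind]float64{
		KindFunction: 1.5,
		KindMethod:   1.5,
		KindSection:  0.5,
	}
	if w, ok := weights[k]; ok {
		return w
	}
	return 1.0
}

func main() {
	chunks := []Chunk{
		{Symbol: "Add", Kind: KindFunction},
		{Symbol: "Installation", Kind: KindSection},
		{Symbol: "getUser", Kind: KindMethod},
	}
	// Keep only callable code, then print each chunk with its weight.
	for _, c := range filterByKind(chunks, KindFunction, KindMethod) {
		fmt.Println(c.Symbol, c.Kind, weightFor(c.Kind))
	}
}
```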
A Provider parses a specific language into chunks. Each provider understands its language's AST and extracts meaningful units.

```go
type Provider interface {
	Chunk(ctx context.Context, filename string, content []byte) ([]Chunk, error)
	Language() Language
}
```

Chisel ships with providers for:

- Go: uses the stdlib `go/parser`, zero external dependencies
- Markdown: header-based splitting, zero dependencies
- TypeScript/JavaScript: tree-sitter parser
- Python: tree-sitter parser
- Rust: tree-sitter parser

The provider isolation is intentional. If you only need Go support, you don't pay for tree-sitter. Import only what you use.
See Providers Guide for language-specific behavior.
Context captures the parent chain for nested definitions. When you chunk a method, the context tells you which class it belongs to.

```typescript
class UserService {
  private db: Database;

  async getUser(id: string): Promise<User> {
    return this.db.find(id);
  }
}
```

The `getUser` chunk will have:

```go
Chunk{
	Symbol:  "getUser",
	Kind:    KindMethod,
	Context: []string{"class UserService"},
}
```

Context flows downward. A method inside a class inside a module might have:

```go
Context: []string{"module api", "class UserService"}
```

This enables queries like "find all methods in UserService" or "show me everything in the api module."
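
Both of those queries reduce to a membership test on the `Context` slice. The sketch below assumes a trimmed-down local `Chunk` and a string-typed `Kind`; `hasParent`, `methodsIn`, and `within` are hypothetical helpers, and the `billing` entries are made-up sample data.

```go
package main

import "fmt"

// Kind is assumed to be a string type for this sketch.
type Kind string

const KindMethod Kind = "method"

// Chunk is a trimmed-down local copy with only the fields the queries need.
type Chunk struct {
	Symbol  string
	Kind    Kind
	Context []string
}

// hasParent reports whether parent appears anywhere in the chunk's parent chain.
func hasParent(c Chunk, parent string) bool {
	for _, p := range c.Context {
		if p == parent {
			return true
		}
	}
	return false
}

// methodsIn returns the symbols of all method chunks nested under parent.
func methodsIn(chunks []Chunk, parent string) []string {
	var out []string
	for _, c := range chunks {
		if c.Kind == KindMethod && hasParent(c, parent) {
			out = append(out, c.Symbol)
		}
	}
	return out
}

// within returns the symbols of every chunk nested under parent, regardless of kind.
func within(chunks []Chunk, parent string) []string {
	var out []string
	for _, c := range chunks {
		if hasParent(c, parent) {
			out = append(out, c.Symbol)
		}
	}
	return out
}

func main() {
	chunks := []Chunk{
		{Symbol: "UserService", Kind: "class", Context: []string{"module api"}},
		{Symbol: "getUser", Kind: KindMethod, Context: []string{"module api", "class UserService"}},
		{Symbol: "charge", Kind: KindMethod, Context: []string{"module billing", "class Invoice"}},
	}
	fmt.Println(methodsIn(chunks, "class UserService")) // find all methods in UserService
	fmt.Println(within(chunks, "module api"))           // everything in the api module
}
```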
Language identifies which provider handles a file. Use it with the Chunker to route files automatically.

```go
const (
	Go         Language = "go"
	TypeScript Language = "typescript"
	JavaScript Language = "javascript"
	Python     Language = "python"
	Rust       Language = "rust"
	Markdown   Language = "markdown"
)
```

The Chunker maps languages to providers. If you're processing a single language, you can use the provider directly without the chunker.

- Architecture — How parsing works internally
- Providers Guide — Language-specific details
- Types Reference — Complete type definitions