19 commits
7e09e51
feat: add DeepCopy, StructuralCopy, two pass parser
jensneuse Apr 13, 2026
cb04b8d
feat: implement StructuralCopyWithTransform for object field mapping
jensneuse Apr 17, 2026
88b35b9
feat!: simplify MergeValues with always-replace semantics
jensneuse Apr 17, 2026
9b5a2d0
perf: use sentinel errors in parser and validate hot paths
jensneuse Apr 17, 2026
06c72db
perf: replace multi-compare chains with charFlags lookup table
jensneuse Apr 17, 2026
70857aa
perf: branchless parseHex4 with hexDigit lookup table
jensneuse Apr 17, 2026
5d60956
fix: address PR review feedback
jensneuse Apr 17, 2026
4363801
fix: remove unused sinkErr benchmark sink
jensneuse Apr 17, 2026
2631781
feat: add noEscapeSubtree advisory hint and PR review fixes
jensneuse Apr 18, 2026
b5f96ea
fix: address CI lint and CodeRabbit review on noEscapeSubtree
jensneuse Apr 18, 2026
7df6b61
perf: SWAR-accelerate hasSpecialChars (8-byte chunks)
jensneuse Apr 18, 2026
92d5fa3
test: cover array stale-ancestor marshal path; document intToStr
jensneuse Apr 18, 2026
3451331
fix: use strings.ContainsAny in hasSpecialCharsIndexAny
jensneuse Apr 18, 2026
1760021
feat: expose package-level copy API + always-copy semantics
jensneuse Apr 19, 2026
0a3d90f
fix: passthrough counter must mirror fill's rename-collision skip
jensneuse Apr 19, 2026
f600d16
fix: planner must dedupe duplicate source keys in passthrough
jensneuse Apr 19, 2026
22fcc7b
refactor: move copy APIs out of util.go into parser_deep_copy.go
jensneuse Jun 12, 2026
c809d0f
docs: document two-pass arena parser architecture and invariants
jensneuse Jun 12, 2026
0d29594
test: assert needsEscape at unescape call sites; anchor hasSpecialCha…
jensneuse Jun 12, 2026
3 changes: 3 additions & 0 deletions .gitignore
@@ -1,3 +1,6 @@
tags
.idea
.DS_Store
+.serena
+IMPROVEMENTS_BENCHMARKS.md
+*.prof
29 changes: 17 additions & 12 deletions README.md
@@ -160,23 +160,24 @@ fmt.Println(v) // {"a":{"b":{"c":42}}}

`MergeValues` recursively merges two values. For objects, keys from `b` are
added to or replace keys in `a`. For arrays, elements are merged pairwise
-(arrays must have equal length). For scalars, `b` replaces `a` when they differ.
+(arrays must have equal length). For scalars, `b` replaces `a` unconditionally —
+no value comparison is performed.

```go
a := arena.NewMonotonicArena()
var p astjson.Parser
base, _ := p.ParseWithArena(a, `{"name": "alice", "age": 30}`)
overlay, _ := p.ParseWithArena(a, `{"age": 31, "email": "alice@example.com"}`)

-merged, changed, err := astjson.MergeValues(a, base, overlay)
+merged, err := astjson.MergeValues(a, base, overlay)
fmt.Println(merged) // {"name":"alice","age":31,"email":"alice@example.com"}
```

`MergeValuesWithPath` wraps `b` in a nested object at the given path before merging:

```go
extra, _ := p.ParseWithArena(a, `"1.0"`)
-merged, _, _ = astjson.MergeValuesWithPath(a, base, extra, "metadata", "version")
+merged, _ = astjson.MergeValuesWithPath(a, base, extra, "metadata", "version")
// equivalent to merging {"metadata":{"version":"1.0"}} into base
```
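
Editorial sketch (not part of the diff): the merge rules described in this README hunk can be modeled on plain Go values. This is a minimal, hypothetical illustration using `map[string]any` and `[]any` rather than `astjson` values; the names `merge`, `wrapAtPath`, and `errLenMismatch` are invented here, the type-mismatch behavior is an assumption, and the sketch does not model the empty-array or null handling exercised in `arena_gc_test.go`.

```go
package main

import (
	"errors"
	"fmt"
)

// errLenMismatch is a hypothetical sentinel error used only by this sketch.
var errLenMismatch = errors.New("merge: arrays must have equal length")

// merge applies the documented rules to plain Go values:
// objects merge key by key, arrays merge pairwise, and anything
// else is replaced by b without comparing the two values.
func merge(a, b any) (any, error) {
	switch bv := b.(type) {
	case map[string]any:
		av, ok := a.(map[string]any)
		if !ok {
			return b, nil // assumption: on a type mismatch, b wins
		}
		for k, v := range bv {
			old, exists := av[k]
			if !exists {
				av[k] = v // new key from b is added
				continue
			}
			m, err := merge(old, v)
			if err != nil {
				return nil, err
			}
			av[k] = m
		}
		return av, nil
	case []any:
		av, ok := a.([]any)
		if !ok {
			return b, nil // assumption: on a type mismatch, b wins
		}
		if len(av) != len(bv) {
			return nil, errLenMismatch
		}
		for i := range bv {
			m, err := merge(av[i], bv[i])
			if err != nil {
				return nil, err
			}
			av[i] = m
		}
		return av, nil
	default:
		return b, nil // scalar: always replace, no comparison
	}
}

// wrapAtPath nests v under the given keys, outermost key first.
func wrapAtPath(v any, path ...string) any {
	for i := len(path) - 1; i >= 0; i-- {
		v = map[string]any{path[i]: v}
	}
	return v
}

func main() {
	base := map[string]any{"name": "alice", "age": 30}
	overlay := map[string]any{"age": 31, "email": "alice@example.com"}

	merged, err := merge(base, overlay)
	if err != nil {
		panic(err)
	}
	fmt.Println(merged)

	// Path variant: wrap the value first, then merge as usual.
	merged, err = merge(merged, wrapAtPath("1.0", "metadata", "version"))
	if err != nil {
		panic(err)
	}
	fmt.Println(merged)
}
```

Because the scalar case never compares values, there is nothing left for a `changed` flag to report, which is consistent with the two-value return in the updated signature.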

@@ -224,18 +225,19 @@ s := v.String()

### DeepCopy

-`DeepCopy` creates a complete copy of a value tree on the given arena. This is
+`(*Parser).DeepCopy` creates a complete copy of a value tree on the given arena. This is
the safe way to insert heap-allocated values into arena-allocated containers:

```go
a := arena.NewMonotonicArena()
obj := astjson.ObjectValue(a)
+var parser astjson.Parser

heapVal, _ := astjson.Parse(`{"nested": "data"}`)
-obj.Set(a, "key", astjson.DeepCopy(a, heapVal)) // safe: copy lives in arena
+obj.Set(a, "key", parser.DeepCopy(a, heapVal)) // safe: copy lives in arena
```

-When `a` is nil, `DeepCopy` returns the value unchanged (no-op in heap mode).
+When `a` is nil, `parser.DeepCopy` returns an independent heap-allocated deep copy.


## GC & Arena Safety
@@ -273,7 +275,7 @@ caller's responsibility when inserting values across allocation boundaries.
> unless another GC-visible reference keeps it alive.**

If the only reference to a heap Value lives in arena memory, the GC cannot see
-it and may collect it, causing a use-after-free. Use `DeepCopy` to copy the
+it and may collect it, causing a use-after-free. Use `parser.DeepCopy` to copy the
value onto the arena first.

**Unsafe:**
@@ -284,11 +286,12 @@ arenaObj.Set(a, "key", heapVal) // UNSAFE: GC can't see this ref
heapVal = nil // GC may collect it
```

-**Safe — use DeepCopy:**
+**Safe — use parser.DeepCopy:**
```go
arenaObj := astjson.ObjectValue(a)
+var parser astjson.Parser
heapVal := astjson.StringValue(nil, "hello")
-arenaObj.Set(a, "key", astjson.DeepCopy(a, heapVal)) // safe: copy lives in arena
+arenaObj.Set(a, "key", parser.DeepCopy(a, heapVal)) // safe: copy lives in arena
```

**Also safe — all values from the same arena:**
@@ -339,9 +342,11 @@ resetting one arena while the other is still in use causes silent corruption.
* **One arena per unit of work.** Create an arena at the start of a request,
parse and build values on it, serialize the result, then let the arena be
collected. This gives you a clear, bounded lifetime.
-* **Use `DeepCopy` at boundaries.** When inserting a value from an unknown
+* **Use `astjson.DeepCopy` at boundaries.** When inserting a value from an unknown
source (different arena, heap, parsed separately) into an arena container,
-wrap it in `DeepCopy(a, val)`. This is a no-op when `a` is nil.
+wrap it in `astjson.DeepCopy(a, val)`. The copy is always independent —
+with a non-nil arena it is arena-allocated; with a nil arena it is
+heap-allocated.
* **Prefer arena mode for hot paths.** Arena mode avoids per-Value heap
allocations, reducing GC pause time in high-throughput services.
* **Use heap mode for simplicity.** If GC pressure is not a concern, pass
@@ -496,7 +501,7 @@ BenchmarkValidate/twitter/fastjson 2000 1036796 ns/op 609.10 MB/s
beyond the next `Parser.Parse` / `Scanner.Next` call.
* Make sure you don't access `astjson` objects from concurrently running goroutines.
* If using arena mode, read the [GC & Arena Safety](#gc--arena-safety) section carefully.
-Mixing heap and arena values without `DeepCopy` causes silent use-after-free.
+Mixing heap and arena values without `parser.DeepCopy` causes silent use-after-free.
* Build and run your program with [-race](https://golang.org/doc/articles/race_detector.html) flag.
Make sure the race detector detects zero races.
* If your program continues crashing after fixing the issues above, [file a bug](https://github.com/wundergraph/astjson/issues/new).
66 changes: 36 additions & 30 deletions arena_gc_test.go
@@ -429,7 +429,7 @@ func TestArenaGCSafety_MergeValues(t *testing.T) {
if err != nil {
t.Fatalf("iteration %d: parse right: %s", i, err)
}
-merged, _, err := MergeValues(a, left, right)
+merged, err := MergeValues(a, left, right)
if err != nil {
t.Fatalf("iteration %d: merge: %s", i, err)
}
@@ -475,7 +475,7 @@ func TestArenaGCSafety_MergeValuesWithPath(t *testing.T) {
t.Fatalf("iteration %d: parse left: %s", i, err)
}
right := StringValue(a, heapString("merged", i))
-merged, _, err := MergeValuesWithPath(a, left, right, "data", "nested")
+merged, err := MergeValuesWithPath(a, left, right, "data", "nested")
if err != nil {
t.Fatalf("iteration %d: merge: %s", i, err)
}
@@ -976,7 +976,7 @@ func TestArenaGCSafety_ComplexWorkflow(t *testing.T) {
if err != nil {
t.Fatalf("iteration %d: parse extra: %s", i, err)
}
-merged, _, err := MergeValues(a, base, extra)
+merged, err := MergeValues(a, base, extra)
if err != nil {
t.Fatalf("iteration %d: merge: %s", i, err)
}
@@ -1188,6 +1188,7 @@ func TestArenaGCSafety_StringValueBytes_HeapInput(t *testing.T) {
func TestArenaGCSafety_DeepCopy_ObjectSet(t *testing.T) {
old := debug.SetGCPercent(1)
defer debug.SetGCPercent(old)
+var parser Parser

for i := 0; i < gcTestIterations; i++ {
a := arena.NewMonotonicArena()
@@ -1196,7 +1197,7 @@ func TestArenaGCSafety_DeepCopy_ObjectSet(t *testing.T) {
heapVal := StringValue(nil, heapString("safe", i))

// DeepCopy copies heapVal into arena a before storing.
-obj.Set(a, "key", DeepCopy(a, heapVal))
+obj.Set(a, "key", parser.DeepCopy(a, heapVal))

// Drop the only external reference to heapVal.
heapVal = nil //nolint:ineffassign
@@ -1221,14 +1222,15 @@ func TestArenaGCSafety_DeepCopy_ObjectSet(t *testing.T) {
func TestArenaGCSafety_DeepCopy_SetArrayItem(t *testing.T) {
old := debug.SetGCPercent(1)
defer debug.SetGCPercent(old)
+var parser Parser

for i := 0; i < gcTestIterations; i++ {
a := arena.NewMonotonicArena()

arr := ArrayValue(a)
heapVal := IntValue(nil, i)

-arr.SetArrayItem(a, 0, DeepCopy(a, heapVal))
+arr.SetArrayItem(a, 0, parser.DeepCopy(a, heapVal))

heapVal = nil //nolint:ineffassign
forceGC()
@@ -1254,6 +1256,7 @@ func TestArenaGCSafety_DeepCopy_SetArrayItem(t *testing.T) {
func TestArenaGCSafety_DeepCopy_NestedObject(t *testing.T) {
old := debug.SetGCPercent(1)
defer debug.SetGCPercent(old)
+var parser Parser

for i := 0; i < gcTestIterations; i++ {
a := arena.NewMonotonicArena()
@@ -1268,7 +1271,7 @@ func TestArenaGCSafety_DeepCopy_NestedObject(t *testing.T) {
heapObj.Set(nil, "scores", heapArr)

arenaContainer := ObjectValue(a)
-arenaContainer.Set(a, "data", DeepCopy(a, heapObj))
+arenaContainer.Set(a, "data", parser.DeepCopy(a, heapObj))

// Drop all heap references.
heapObj = nil //nolint:ineffassign
@@ -1298,22 +1301,33 @@ func TestArenaGCSafety_DeepCopy_NestedObject(t *testing.T) {
}
}

-// TestArenaGCSafety_DeepCopy_NilArena verifies that DeepCopy(nil, v) is a
-// no-op and returns v unchanged.
+// TestArenaGCSafety_DeepCopy_NilArena verifies that parser.DeepCopy(nil, v)
+// produces an independent heap-allocated deep copy — mutating the copy must
+// not affect the source.
func TestArenaGCSafety_DeepCopy_NilArena(t *testing.T) {
-v := StringValue(nil, "hello")
-got := DeepCopy(nil, v)
-if got != v {
-t.Fatal("DeepCopy(nil, v) must return v unchanged")
+src := MustParse(`{"key":"value"}`)
+var parser Parser
+cp := parser.DeepCopy(nil, src)
+if cp == src {
+t.Fatal("DeepCopy(nil, v) must not return the same pointer")
}
+// Mutate the copy — source must be untouched.
+cp.Set(nil, "key", StringValue(nil, "mutated"))
+if string(src.MarshalTo(nil)) != `{"key":"value"}` {
+t.Fatalf("source mutated by copy; source=%s", src.MarshalTo(nil))
+}
+if string(cp.MarshalTo(nil)) != `{"key":"mutated"}` {
+t.Fatalf("copy mutation did not stick; copy=%s", cp.MarshalTo(nil))
+}
}

-// TestArenaGCSafety_DeepCopy_NilValue verifies that DeepCopy(a, nil) returns nil.
+// TestArenaGCSafety_DeepCopy_NilValue verifies that parser.DeepCopy(a, nil) returns nil.
func TestArenaGCSafety_DeepCopy_NilValue(t *testing.T) {
a := arena.NewMonotonicArena()
-got := DeepCopy(a, nil)
+var parser Parser
+got := parser.DeepCopy(a, nil)
if got != nil {
-t.Fatal("DeepCopy(a, nil) must return nil")
+t.Fatal("parser.DeepCopy(a, nil) must return nil")
}
runtime.KeepAlive(a)
}
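
Editorial sketch (not part of the diff): the independence property that `TestArenaGCSafety_DeepCopy_NilArena` asserts in the hunk above can be illustrated on plain Go values. This is a hypothetical model using `map[string]any` and `[]any`; the `deepCopy` helper is invented for illustration and says nothing about how `astjson` allocates or copies its `Value` trees.

```go
package main

import "fmt"

// deepCopy recursively duplicates maps and slices so the result shares
// no mutable structure with the input. Scalars are returned as-is.
func deepCopy(v any) any {
	switch t := v.(type) {
	case map[string]any:
		out := make(map[string]any, len(t))
		for k, child := range t {
			out[k] = deepCopy(child)
		}
		return out
	case []any:
		out := make([]any, len(t))
		for i, child := range t {
			out[i] = deepCopy(child)
		}
		return out
	default:
		return v
	}
}

func main() {
	src := map[string]any{"key": "value", "list": []any{1, 2}}
	cp := deepCopy(src).(map[string]any)

	// Mutate the copy at two depths; the source must stay untouched.
	cp["key"] = "mutated"
	cp["list"].([]any)[0] = 99

	fmt.Println(src["key"], src["list"]) // value [1 2]
	fmt.Println(cp["key"], cp["list"])   // mutated [99 2]
}
```

A copy that returned the input pointer unchanged would fail this check, which is the behavior the rewritten test now rejects.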
@@ -1323,7 +1337,8 @@ func TestArenaGCSafety_DeepCopy_NilValue(t *testing.T) {
func TestArenaGCSafety_DeepCopy_EmptyObject(t *testing.T) {
a := arena.NewMonotonicArena()
obj := ObjectValue(a)
-cp := DeepCopy(a, obj)
+var parser Parser
+cp := parser.DeepCopy(a, obj)
if cp.Type() != TypeObject {
t.Fatalf("expected TypeObject, got %v", cp.Type())
}
@@ -1423,7 +1438,7 @@ func TestArenaGCSafety_MergeValues_ScalarReplacement(t *testing.T) {
if err != nil {
t.Fatalf("iteration %d: parse right: %s", i, err)
}
-merged, _, err := MergeValues(a, left, right)
+merged, err := MergeValues(a, left, right)
if err != nil {
t.Fatalf("iteration %d: merge: %s", i, err)
}
@@ -1467,7 +1482,7 @@ func TestArenaGCSafety_MergeValues_RecursiveObjects(t *testing.T) {
if err != nil {
t.Fatalf("iteration %d: parse right: %s", i, err)
}
-merged, _, err := MergeValues(a, left, right)
+merged, err := MergeValues(a, left, right)
if err != nil {
t.Fatalf("iteration %d: merge: %s", i, err)
}
@@ -1525,13 +1540,10 @@ func TestArenaGCSafety_MergeValues_EmptyArrays(t *testing.T) {
if err != nil {
t.Fatalf("iteration %d: parse right: %s", i, err)
}
-merged, changed, err := MergeValues(a, left, right)
+merged, err := MergeValues(a, left, right)
if err != nil {
t.Fatalf("iteration %d: merge empty+full: %s", i, err)
}
-if !changed {
-t.Fatalf("iteration %d: expected changed=true for empty left", i)
-}
forceGC()
arr := merged.GetArray()
if len(arr) != 3 {
@@ -1550,13 +1562,10 @@ func TestArenaGCSafety_MergeValues_EmptyArrays(t *testing.T) {
if err != nil {
t.Fatalf("iteration %d: parse right2: %s", i, err)
}
-merged2, changed2, err := MergeValues(a, left2, right2)
+merged2, err := MergeValues(a, left2, right2)
if err != nil {
t.Fatalf("iteration %d: merge full+empty: %s", i, err)
}
-if changed2 {
-t.Fatalf("iteration %d: expected changed=false for empty right", i)
-}
forceGC()
arr2 := merged2.GetArray()
if len(arr2) != 2 {
@@ -1582,13 +1591,10 @@ func TestArenaGCSafety_MergeValues_NullHandling(t *testing.T) {
if err != nil {
t.Fatalf("iteration %d: parse right: %s", i, err)
}
-merged, changed, err := MergeValues(a, left, right)
+merged, err := MergeValues(a, left, right)
if err != nil {
t.Fatalf("iteration %d: merge: %s", i, err)
}
-if changed {
-t.Fatalf("iteration %d: expected changed=false for null right on object left", i)
-}
forceGC()

nested := merged.Get("nested")