protoc-gen-codec

A custom protoc generator that emits high-performance binary serialization methods on existing hand-written types. No generated types — the .proto file is the schema, your source code is the behaviour, the generator bridges them.

For the full method/error/helper reference, see the API documentation on pkg.go.dev.

Features

  • Zero generated types — emits marshal / unmarshal / size / reset methods on your existing types
  • Zero-alloc marshal — a pre-allocated buffer variant writes directly into caller memory
  • Low-allocation unmarshal — generation-time schema analysis picks a per-message strategy that minimises allocations without aliasing the input buffer
  • Packed encoding — repeated scalars use proto3 packed format automatically
  • Deterministic output — map fields marshal in sorted-key order (content-addressable storage, signing, caching)
  • Fixed-length guards — codec.fixed_len rejects truncated/padded byte arrays at unmarshal
  • Capacity-preserving reset — backing storage for slices/maps is preserved for pooled reuse
  • Typed errors — unmarshal failures wrap language-appropriate sentinels for programmatic matching; error messages include field name + number
  • DoS-resistant — bounds-checked length handling prevents OOM from inflated length varints
  • In-bench alloc + latency gate — StartContract(b).AllocsMax(n).LatencyMax(d) fails benchmarks at the first regression instead of waiting for a benchstat baseline diff
  • 100% test coverage + mutation testing — every *.codec.go line covered, every aggregate test package at 100%, and 100% effective mutation kill rate on the runtime and analyzer layers (gremlins)
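The typed-error and DoS-resistance bullets above can be illustrated with a minimal sketch. It is not the runtime's actual code: ErrTruncated and readBytes are hypothetical names chosen for this example, and only the standard library is used. The idea is to compare a declared length against the bytes actually remaining before allocating anything, and to wrap a sentinel so callers can match it with errors.Is.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// ErrTruncated is a hypothetical sentinel for this sketch; the real
// runtime's sentinels are listed in its API documentation.
var ErrTruncated = errors.New("codec: truncated input")

// readBytes decodes one length-prefixed byte field from buf. The declared
// length is checked against the remaining input before any allocation, so
// an inflated length varint cannot force a large allocation.
func readBytes(buf []byte, fieldName string, fieldNum int) (val, rest []byte, err error) {
	n, w := binary.Uvarint(buf)
	if w <= 0 { // 0: buffer too short, <0: varint overflows 64 bits
		return nil, nil, fmt.Errorf("field %s (%d): bad length varint: %w",
			fieldName, fieldNum, ErrTruncated)
	}
	remaining := buf[w:]
	if n > uint64(len(remaining)) {
		return nil, nil, fmt.Errorf("field %s (%d): length %d exceeds %d remaining bytes: %w",
			fieldName, fieldNum, n, len(remaining), ErrTruncated)
	}
	// Copy instead of aliasing, so the result does not pin the input buffer.
	val = append([]byte(nil), remaining[:n]...)
	return val, remaining[n:], nil
}

func main() {
	// A varint claiming a length of 2^32-1, followed by only two bytes.
	hostile := []byte{0xff, 0xff, 0xff, 0xff, 0x0f, 'h', 'i'}
	_, _, err := readBytes(hostile, "ref", 3)
	fmt.Println(errors.Is(err, ErrTruncated), err)

	// A well-formed field: length 2, payload "hi", one trailing byte.
	val, rest, err := readBytes([]byte{0x02, 'h', 'i', 0x08}, "ref", 3)
	fmt.Println(string(val), len(rest), err)
}
```

Because the error wraps the sentinel with %w, a caller can branch on the failure class while the message still carries the field name and number.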

Stability

The project follows semantic versioning. As of v1.0.0:

  • The Go runtime API (codec.Codec / Marshaler / Unmarshaler / Sizer / Resetter interfaces, error sentinels, and exported wire primitives) is stable.
  • The annotation surface (codec/options.proto) is stable.
  • Code generated by v1.0.0 will continue to compile and behave identically against any v1.x runtime.
  • The codectest.Spec[T] structure and its three runners (RunSuite, RunBenchSuite, RunFuzzSuite) are stable.

Minor releases (v1.x) preserve backwards compatibility. Deprecations carry a // Deprecated: notice for at least one minor version before removal. Breaking changes move to v2.

Annotations

Defined in codec/options.proto:

Annotation           Scope            Purpose
codec.type           Message          Maps proto message → target type
codec.oneof          Message          Declares Go-only discriminator + cast for a non-synthetic oneof
codec.field          Field            Explicit field name override
codec.cast           Field            Type cast (enums, fixed-point, byte arrays)
codec.fixed_len      Field            Strict byte-length guard on unmarshal
codec.use_pointer    Field (message)  Override pointer vs. value representation for nested messages
codec.keep_capacity  Field            Preserve slice capacity on reset
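To make "preserve slice capacity on reset" concrete, here is a small sketch of the difference between a capacity-preserving reset and one that drops storage. Batch and both method names are hypothetical and written by hand for this example; they are not what the generator emits.

```go
package main

import "fmt"

// Batch is a hypothetical hand-written type with a slice and a map field.
type Batch struct {
	IDs  []uint64
	Tags map[string]string
}

// resetKeepCapacity empties the value but keeps the slice's backing array
// and the map itself, so a pooled Batch can be refilled without
// reallocating.
func (b *Batch) resetKeepCapacity() {
	b.IDs = b.IDs[:0]
	for k := range b.Tags {
		delete(b.Tags, k)
	}
}

// resetDrop empties the value and releases its storage to the garbage
// collector; the next fill has to allocate again.
func (b *Batch) resetDrop() {
	b.IDs = nil
	b.Tags = nil
}

func main() {
	b := &Batch{
		IDs:  make([]uint64, 3, 16),
		Tags: map[string]string{"env": "prod"},
	}

	b.resetKeepCapacity()
	fmt.Println(len(b.IDs), cap(b.IDs), len(b.Tags), b.Tags != nil)

	b.resetDrop()
	fmt.Println(len(b.IDs), cap(b.IDs), b.Tags == nil)
}
```

The first form suits values recycled through a pool; the second suits values that may briefly hold very large payloads and should not retain that memory.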

Quick Start (Go)

1. Install the plugin:

go install go.thesmos.sh/protoc-gen-codec/cmd/protoc-gen-codec-go@latest

Or download a pre-built binary from GitHub Releases.

2. Write your Go type (the source of truth for behaviour):

// mytype.go
package myapp

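import "example.com/myapp/hash" // hypothetical package defining Digest as a [32]byte; the standard library's hash package has no Digest type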

type Status uint32

const (
    StatusUnknown Status = 0
    StatusActive  Status = 1
)

type MyType struct {
    ID     string
    Status Status
    Ref    hash.Digest // a [32]byte alias
}

3. Write the matching .proto file (the schema, with codec.* annotations linking to the Go type):

// mytype.proto
syntax = "proto3";
import "codec/options.proto";

message MyType {
  option (codec.type) = "MyType";

  string id     = 1 [(codec.field) = "ID"];
  uint32 status = 2 [(codec.field) = "Status", (codec.cast) = "Status"];
  bytes  ref    = 3 [(codec.field) = "Ref", (codec.cast) = "hash.Digest", (codec.fixed_len) = 32];
}
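As a rough illustration of what the (codec.fixed_len) = 32 guard on the ref field implies at unmarshal time, the sketch below checks the payload length before copying into a fixed-size array. It is hand-written under assumptions: Digest is declared locally, and ErrFixedLen and decodeDigest are hypothetical names, not the generator's output or the runtime's API.

```go
package main

import (
	"errors"
	"fmt"
)

// Digest stands in for the [32]byte type used in the example schema.
type Digest [32]byte

// ErrFixedLen is a hypothetical sentinel for this sketch.
var ErrFixedLen = errors.New("codec: wrong fixed length")

// decodeDigest accepts the payload only when it is exactly 32 bytes long.
// Shorter (truncated) and longer (padded) payloads are both rejected.
func decodeDigest(b []byte) (Digest, error) {
	var d Digest
	if len(b) != len(d) {
		return d, fmt.Errorf("field ref (3): got %d bytes, want %d: %w",
			len(b), len(d), ErrFixedLen)
	}
	copy(d[:], b)
	return d, nil
}

func main() {
	for _, n := range []int{31, 32, 33} {
		_, err := decodeDigest(make([]byte, n))
		fmt.Printf("%d bytes: %v\n", n, err)
	}
}
```

Without such a guard, a plain copy into a [32]byte would silently zero-fill a short payload or drop the tail of a long one.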

4. Generate the codec methods (writes mytype.codec.go next to mytype.go):

protoc \
  --plugin=protoc-gen-codec-go=$(which protoc-gen-codec-go) \
  --codec-go_out=. --codec-go_opt=paths=source_relative \
  -I . mytype.proto

5. Use the generated methods:

m := &MyType{ID: "abc", Status: StatusActive}
buf, err := m.MarshalCodec() // serialize
if err != nil {
    return err
}
var got MyType
if err := got.UnmarshalCodec(buf); err != nil { // deserialize
    return err
}

That's it. No generated types, no reflection, no type registry — your MyType keeps its existing fields and methods, and gains seven codec methods (MarshalCodec, MarshalToCodec, MarshalCodecInternal, UnmarshalCodec, UnmarshalCodecInternal, SizeCodec, ResetCodec).

For the full method semantics, supported field categories, and unsupported features, see docs/generators/go.md. For the project architecture, see docs/architecture.md.

Architecture

The schema analyzer (internal/core/) is language-neutral and operates only on .proto descriptor data — it has no knowledge of any target language. The Go-target emitter under internal/lang/golang/ consumes the analyzer's output and emits *.codec.go. This split is intentional: a future second target would slot in next to golang/ without changes to the analyzer or annotation surface (codec/options.proto). Today, Go is the only shipping target; the directory layout reflects the eventual shape, not current reality.

Runtime

The Go runtime (lang/go/codec/) is what generated code imports for wire primitives and error sentinels. It carries no dependencies beyond the Go standard library. Testing helpers live in a separate sub-package (lang/go/codec/codectest/) so the runtime stays dependency-free for production use.

Testing your consumer types

Each language ships a testing sub-package with a declarative Spec[T] you write once per annotated type, plus three role-specific runners that read it. For Go:

import (
    "testing"

    "go.thesmos.sh/protoc-gen-codec/lang/go/codec/codectest"
)

var specMyType = codectest.Spec[MyType]{
    Sample:             sampleMyType(),
    ScalarVarintFields: []int32{2, 3, 5},
    // PackedVarintFields / PackedFixed64Fields / PackedFixed32Fields,
    // MapFields, RepeatedMessageFields, WKTFields,
    // Fixed64Fields, Fixed32Fields, FixedLenBytesFields,
    // Grower, NilPointerSample, Generator,
    // MarshalToAllocsMax, MarshalToLatencyMax, SkipJSONComparisons
    //   — all optional
}

func TestMyType_Codec(t *testing.T)      { codectest.RunSuite[MyType](t, specMyType) }
func BenchmarkMyType_Codec(b *testing.B) { codectest.RunBenchSuite[MyType](b, specMyType) }
func FuzzMyType_Codec(f *testing.F)      { codectest.RunFuzzSuite[MyType](f, specMyType) }

The RunSuite call expands into 30+ sub-tests (roundtrip, reset, nil-safety, cross-format, corruption, per-field wire-type mismatch, short-buffer handling, unknown-field skip, map-entry unknown-sub-field skip, property-based via rapid when a generator is supplied). RunBenchSuite wraps the Codec/MarshalTo subtest in a StartContract scope so any allocation regression (and any order-of-magnitude latency regression, when MarshalToLatencyMax is set) fails the bench in-process. Configured correctly per the Spec, the suite drives generated code coverage to 100%. See docs/generators/go.md for the full field-category cheatsheet and a complete worked example.
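The in-process allocation gate can be approximated with the standard library alone. The sketch below does not use StartContract or reproduce its behaviour; checkAllocs is a hypothetical helper built on testing.AllocsPerRun that shows the general idea of failing as soon as an allocation budget is exceeded, without comparing against a stored baseline.

```go
package main

import (
	"fmt"
	"testing"
)

// checkAllocs runs fn repeatedly and returns an error when the average
// number of heap allocations per call exceeds max.
func checkAllocs(max float64, fn func()) error {
	got := testing.AllocsPerRun(100, fn)
	if got > max {
		return fmt.Errorf("allocation budget exceeded: %.0f allocs/op, max %.0f", got, max)
	}
	return nil
}

// sink forces the allocating case to escape to the heap.
var sink []byte

func main() {
	// Writing into a pre-allocated buffer stays within a zero-alloc budget.
	buf := make([]byte, 0, 64)
	err := checkAllocs(0, func() { buf = append(buf[:0], "payload"...) })
	fmt.Println("pre-allocated buffer:", err)

	// Allocating a fresh slice on every call breaks the same budget.
	err = checkAllocs(0, func() { sink = make([]byte, 64) })
	fmt.Println("fresh slice per call:", err)
}
```

Inside a real benchmark the returned error would be passed to b.Fatal, so the regression surfaces in the same run that introduced it.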

Documentation

  • docs/architecture.md — language-neutral architecture, design principles, and project layout
  • docs/generators/go.md — how protoc-gen-codec-go emits code (generated methods, field mapping, slab strategies, reset semantics, testing framework, StartContract bench gating)
  • docs/compliance/golang/codec.md — Go-target test plan: behavioural promises (REQ inventory), REQ→test mapping, and coverage / mutation / bench evidence
  • docs/performance/go.md — Go-target performance profile: per-fixture marshal/unmarshal/size benchmarks, Codec-vs-JSON ratios, cold-vs-warm-path speedup, wire-primitive timings (refreshed alongside make bench-baseline)

License

Apache License 2.0
