Write hardware in Lean 4. Prove it correct. Generate Verilog.
A type-safe hardware description language that brings dependent types and theorem proving to hardware design.
Live docs & benchmarks: the project publishes three hosted pages at verilean.github.io/sparkle:
- π Tutorial (JupyterLite) β
the multi-chapter beginner course, runnable in-browser via xeus-lean.
(Known issue: some environments fail to boot the Lean kernel or load Sparkle;
when that happens, read the rendered notebooks under
docs/tutorial/Notebooks/or use the Docker path in Quick Start below.) - π API reference (doc-gen4) β
fully cross-linked documentation for every public definition, generated
from the source with
lake build Sparkle:docs. - π Benchmarks β CI-driven history of the RV32 JIT vs Verilator numbers: RV32 SoC Β· LiteX PicoRV32 Β· Multi-core (8-thread)
Quick Start: the multi-chapter tutorial walks from "hello counter" through Verilog generation, proofs, and FPGA bring-up. Run it in Docker, in your browser via xeus-lean's JupyterLite, or read the rendered notebooks directly on GitHub. For the full Signal DSL syntax, see docs/reference/SignalDSL_Syntax.md.
Try it in the browser: Sparkle plugs into
xeus-lean's WASM kernel
via the EXTRA_WASM_DIRS
extension point. See tools/wasm/ for the
staging-builder script. #synthesizeVerilog, #showVerilog, and
pure Signal.atTime simulation all work under WASM; the native JIT
path (Sparkle.Core.JIT.compileAndLoad) is stubbed and only
available from a native lake exe build.
- Write a pure Lean spec β define behaviour as pure functions.
- Prove properties β safety, liveness, fairness via Lean's theorem prover.
- Implement via Signal DSL β express the same logic using
Signalcombinators. - Generate Verilog β
#synthesizeVerilog/#writeVerilogDesignemit SystemVerilog.
See docs/reference/Verification_Framework.md for patterns and a worked Round-Robin Arbiter example (10 formal proofs).
Sparkle ships with production-grade IP cores β each with pure Lean specs, formal proofs, and synthesizable Signal DSL implementations.
| IP | Description | Proofs | Synth | Details |
|---|---|---|---|---|
| BitNet b1.58 | Formally verified LLM inference accelerator. Ternary weights, Q16.16 datapath, dual architecture (1-cycle vs 12-cycle). Standalone FPGA fit + LTL investigation | 60+ theorems | Full | 202K / 99K cells |
| YOLOv8n-WorldV2 | Open-vocabulary object detection. INT4/INT8 quantized, 15 modules, CLIP text embeddings | Golden validation | Full | Backbone + Neck + Head |
| RV32IMA SoC | RISC-V CPU β boots Linux 6.6.0. 4-stage pipeline, Sv32 MMU, UART, CLINT. JIT at 14.2M cyc/s (1.63x Verilator). 102 formal proofs | 102 theorems | Full | 122 registers |
| SVβSparkle Transpiler | Parse Verilog β JIT simulation. LiteX SoC at 18.1M cyc/s (1.72x Verilator). Verified reverse synthesis (2.14x speedup, zero sorry). 8-core parallel 11.9x Verilator. Timer oracle 9,900x. OracleReduction type class, 44 tests |
20+ theorems | JIT | 44 tests |
Networking stack (new β PR #66)
Full UART β SLIP β IPv4 β TCP β HTTP round-trip, live on Tang Nano 50K.
lake exe usb-webserver-jit-test runs a GET request end-to-end in seconds.
See docs/ip-catalog/Networking.md for
the full layer-stack breakdown, bring-up notes, and sim entry points.
| IP | Description | Proofs | Synth | Details |
|---|---|---|---|---|
| UART / SLIP | 8-N-1 UART RX/TX (configurable bitDiv) + RFC 1055 SLIP framer/deframer. Bring-up doc for Tang Nano 50K |
β | Full | LUT 2% |
| IPv4 / ARP / ICMP | RFC 791 IPv4 parser + emitter, ARP requester + responder, ICMP echo. Byte-exact against reference | 5+ theorems | Full | iverilog round-trip |
| TCP | Header + connection state machine + loopback. Includes retransmit / dup-ACK path | 3 theorems | Full | Cycle-accurate sim |
| HTTP/1.0 | Emitter + parser + iverilog loopback (gotRequest at cycle 48 in sim) |
β | Full | GET/POST |
| USB Web server | End-to-end pipeline (UARTβSLIPβIPv4βTCPβHTTP and back). Emits HTTP/1.0 200 OK\r\n\r\nHello, Sparkle! on any GET |
β | Full | Tang Nano 50K, LUT 2%, BRAM 0% |
| memcached ASCII server | Tier-1 (get / set / add / delete, key β€ 8 B / value β€ 16 B), BRAM-backed KV store + byte-stream FSM. Byte-exact against Lean reference oracle |
2 theorems | Full | LUT 1% / BRAM 25% / Fmax β 57 MHz |
| Ethernet framing | MAC framer + RX / TX header extract + payload streaming. DMAC / SMAC / EtherType recovery cycle-accurate | β | Full | iverilog round-trip |
| CRC32 | Bit-serial IEEE 802.3 CRC-32 engine. Reference vs HW parity checked in crc32-jit-test |
β | Full | 1 byte / cycle |
| IP | Description | Proofs | Synth | Details |
|---|---|---|---|---|
| AXI4-Lite Bus | Verified AXI4-Lite slave/master. Protocol compliance (valid persistence, deadlock-free), synthesizable | 14 theorems | Full | 23 sim tests |
| AXI4 Full | Multi-beat burst read/write + interleaving | β | Full | tested against RV32 SoC |
| PCIe TLP | Header emit + parse (Memory Read/Write, config space) + HFT loopback structural check | β | Full | 12-byte TLP round-trip |
| CAN / CAN-FD / CANopen / DroneCAN | Automotive bus stack (bit-stuffing, CRC, arbitration, error frames). DroneCAN HW node included | β | Full | serial-bus / avionics-bus tests |
| LIN / IΒ²C / SPI | Master + slave HW for the common embedded serial protocols | β | Full | serial-bus-test |
| SBUS / CRSF | Radio-control receiver protocols (drone control links) | β | Full | drone bring-up |
| MIL-STD-1553B | Avionics dual-redundant bus (Manchester encode/decode, RT/BC/BM) | β | Full | avionics-bus-test |
| IP | Description | Proofs | Synth | Details |
|---|---|---|---|---|
| AES / AES-GCM / GHASH | AES-128/192/256 + GCM AEAD + hardware GHASH. Byte-exact against NIST test vectors | β | Full | ghash-hw-test, hardware GF(2ΒΉΒ²βΈ) |
| SHA-256 / SHA-512 / Keccak-256 | Byte-exact hash primitives + HW pipeline (SHA-256) | β | Sim + HW SHA-256 | NIST vectors |
| Ed25519 / X25519 | Ed25519 sign/verify + X25519 scalar mult (RFC 7748). Field theorems | 5+ theorems | Sim | RFC 8032 vectors |
| P-256 / secp256k1 ECDSA | NIST P-256 + secp256k1 ECDSA (Bitcoin/Ethereum curve) | β | Sim | wycheproof |
| RSA-PSS | RSA signature verify (PKCS #1 v2.2 PSS) | β | Sim | webPKI test set |
| HKDF | RFC 5869 HKDF extract + expand (SHA-256 backend) | β | Sim | TLS 1.3 dep |
| Ethereum wallet stack | BIP-32 / BIP-39 seed + HD wallet, RLP encoder, EIP-1559 tx, ERC-20 ABI | β | Sim | Byte-exact vs reference clients |
| IP | Description | Proofs | Synth | Details |
|---|---|---|---|---|
| TLS 1.3 | Full TLS 1.3 client + server (record layer, handshake, key schedule, X.509 verify). AES-128-GCM + Ed25519 cipher suite | 3 theorems | Sim | Interop vs OpenSSL fixtures |
| HTTPS demo | HFT-over-TLS transport (TCP + TLS + custom framing) | β | Sim | Loopback demo |
| IP | Description | Proofs | Synth | Details |
|---|---|---|---|---|
| Merkle tree / polynomial commitment | Merkle-tree opening + polynomial evaluation with 8 honest openings round-trip | β | Sim | polynomial-test, merkle-test |
| Mini-STARK verifier | STARK proof verify (Goldilocks field, FRI, low-degree extension) | β | Sim | 8-opening verifier |
| Goldilocks field | p = 2βΆβ΄ β 2Β³Β² + 1 field arithmetic | β | Sim | STARK dep |
| IP | Description | Proofs | Synth | Details |
|---|---|---|---|---|
| H.264 Codec | Baseline Profile encoder + decoder. Hardware MP4 muxer produces playable files. CAVLC now byte-exact vs Lean reference for all 4Γ4 blocks (fixed in PR #66) | 15+ theorems | Full | 709-byte MP4 output |
| IP | Description | Proofs | Synth | Details |
|---|---|---|---|---|
| CDC Infrastructure | Lock-free multi-clock simulation. SPSC queue (210M ops/sec), rollback, 8-core parallel runner (3.87x on 8 cores). Since PR #66, dispatches through the JIT vtable β no more per-symbol dlsym (Issue #70) |
12 theorems | C | N-thread parallel |
| Drone SoC (bring-up) | Multi-IP drone/humanoid SoC status pages (DroneCAN + SBUS + CRSF wired to RV32) | Status page | β | |
| Humanoid SoC (bring-up) | Sensor / actuator bus fabric for humanoid platform | Status page | β |
-- Write this in Lean...
def counter {dom : DomainConfig} : Signal dom (BitVec 8) :=
Signal.circuit do
let count β Signal.reg 0#8
count <~ count + 1#8
return count
#synthesizeVerilog counter// ...and get this Verilog
module counter (
input logic clk,
input logic rst,
output logic [7:0] out
);
logic [7:0] count;
always_ff @(posedge clk) begin
if (rst)
count <= 8'h00;
else
count <= count + 8'h01;
end
assign out = count;
endmoduleThree powerful ideas in one language:
- Simulate β cycle-accurate functional simulation with pure Lean functions.
- Synthesize β automatic compilation to clean, synthesizable SystemVerilog.
- Verify β formal correctness proofs using Lean's theorem prover.
Chisel + FIRRTL solve many logical hardware bugs (latches, comb loops) but leave you fighting timing-closure with external linters. Sparkle gives you both out of the box:
- Logical Safety β
Signalenforces a strict DAG for combinational logic; feedback is only possible through explicitSignal.register/Signal.loop. Pattern-match exhaustiveness catches unhandled cases at compile time. Unintended latches are impossible by construction. - Physical / Timing Safety β a built-in DRC pass (inspired by the STARC guidelines) enforces registered outputs so Static Timing Analysis is predictable and critical paths don't cross module boundaries.
- Readable Verilog β Sparkle's IR keeps a 1:1 structural correspondence with your Lean code. When the DRC flags a timing issue you can actually read the generated SV to fix it.
Prerequisites: a glibc β₯ 2.34 Linux (Ubuntu 22.04+,
Debian 12+, Fedora 35+), macOS, or WSL2. Older systems
(e.g. Ubuntu 20.04, glibc 2.31) fail during lake build with
.../bin/cadical: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found
β cadical is the SAT solver bundled with the Lean 4.28
toolchain (used by bv_decide / omega), and it is linked
against glibc 2.34. This is a Lean-toolchain requirement, not a
Sparkle one; upgrade the OS (or use the Docker path in the
tutorial) if you hit it.
git clone https://github.com/Verilean/sparkle.git
cd sparkle
lake build # ~5 min first time
lake env lean --run Examples/Counter.lean # smoke-testA minimal register chain:
import Sparkle
open Sparkle.Core.Domain
open Sparkle.Core.Signal
-- Three-cycle delay line, polymorphic over clock domains.
def registerChain {dom : DomainConfig}
(input : Signal dom (BitVec 8)) : Signal dom (BitVec 8) :=
let d1 := Signal.register 0#8 input
let d2 := Signal.register 0#8 d1
Signal.register 0#8 d2
#synthesizeVerilog registerChainFor the full tour β VCD waveforms, JIT simulation, formal equivalence
commands, clock-domain crossings, and the synthesizable subset of Lean β
work through docs/tutorial/.
- Cycle-accurate simulation β the same semantics as the emitted Verilog,
runnable from Lean with
#evalandsample. - Automatic Verilog generation β
#synthesizeVeriloghandles clocks, resets, register inference, bit-width checking, and feedback-loop resolution. - Formal verification ready β
bv_decide+simp+Temporal.lean(LTL) for safety/liveness/fairness proofs directly against Signal code. - One-line equivalence checks β
#verify_eq,#verify_eq_at,#verify_eq_gitauto-generate theorems and discharge them withbv_decide. Seedocs/tutorial/notebooks/ch07-equivalence.ipynb. - Signal DSL with imperative feel β
Signal.circuitmacro gives you<~register assignment without losing the functional semantics. - Vector / array types β
HWVector Ξ± nwith compile-time-checked indexing for register files. - Memory primitives β
Signal.memorygenerates synchronous-write / registered-read BRAM-style RAMs. - Technology library support β
primitiveModulewraps vendor cells (SRAMs, PLLs, transceivers) into the type system. - JIT simulation β
sim!/#simcompile to native C++ via dlopen for 10β100Γ faster simulation than the Lean interpreter. - CDC-aware multi-domain simulation β
runSimauto-selects the fastest backend (single-domain or lock-free SPSC queue between threads). - Temporal logic β LTL operators (
always,eventually,next,Until) with induction principles, enabling cycle-skipping optimisation.
Each feature is exercised in the tutorial or one of the IPs; see the links in the IP Catalog above.
# Core simulation + Verilog generation
lake env lean --run Examples/Counter.lean
lake env lean --run Examples/LoopSynthesis.lean
lake env lean --run Examples/SimpleMemory.lean
# The 16-bit Sparkle-16 CPU (ALU / RegisterFile / Core / ISA proofs)
lake env lean --run Examples/Sparkle16/Core.lean
lake env lean --run Examples/Sparkle16/ISAProofTests.lean
# Clock-domain crossing demo
lake env lean --run Examples/CDC/MultiClockSim.lean
# RV32IMA SoC, BitNet, YOLOv8, H.264 β run via the test suite
lake test
# Verilator: build the SoC and boot firmware
cd verilator && make build && ./obj_dir/Vrv32i_soc ../firmware/firmware.hex 500000Each IP has a dedicated getting-started recipe in its own doc (BitNet, RV32, H264, YOLOv8, CDC).
- Hosted (built by CI, always up-to-date with
main):- π Tutorial (JupyterLite) β in-browser, xeus-lean kernel. (Boot issues on some machines β see "Live docs & benchmarks" at the top of this README for the fallback.)
- π API reference β doc-gen4 site covering every public definition.
- π Benchmarks β RV32 SoC, LiteX PicoRV32, Multi-core 8-thread.
- Generate the API reference locally with doc-gen4:
lake -R -Kenv=dev build Sparkle:docs
open .lake/build/doc/index.htmlPointers to the hand-written docs:
- Getting started / writing synthesizable code
- docs/tutorial/ β multi-chapter beginner course
- docs/reference/SignalDSL_Syntax.md β full DSL reference
- docs/reference/Troubleshooting_Synthesis.md
- Verification
- docs/reference/Verification_Framework.md β VDD patterns
- Examples/TemporalLogicExample.md β LTL usage
- IP-specific docs
- Project meta
- docs/CHANGELOG.md β release history
- docs/architecture/STATUS.md β current capability matrix
- docs/known-issues/KnownIssues.md
- docs/known-issues/BENCHMARK.md
ββββββββββββββββββββ
β Lean Signal DSL β ===, &&&, |||, hw_cond, Coe
ββββββββ¬ββββββββββββ
β
ββββββββββββββββ¬βββββββββββββββββββ¬ββββββββββββββββββββ
βΌ βΌ βΌ βΌ
βββββββββββββββ ββββββββββββββ ββββββββββββββββ ββββββββββββββββββββ
β Simulation β β JIT (FFI) β β Verilator β β#synthesizeVerilogβ
β .atTime t β β C++ dlopen β β .sv β C++ β β Lean β IR β DRC β
β ~5K cyc/s β β ~13.0M c/s β β ~11.1M c/s β β β SystemVerilog β
β β β+oracle:1B+ β β β β β
βββββββββββββββ ββββββββββββββ ββββββββββββββββ ββββββββββββββββββββ
Core abstractions:
- Domain β clock domain configuration (period, edge, reset).
- Signal β stream-based hardware values,
Signal d Ξ± β Nat β Ξ±. - BitPack β type class for hardware serialisation.
- Module / Circuit β IR for netlists.
- Compiler β automatic Lean β IR translation via metaprogramming.
Type-safety example:
-- This won't compile β bit-width mismatch is a compile-time error.
def broken {dom : DomainConfig} : Signal dom (BitVec 8) :=
Signal.register (0#16) (Signal.pure 0#16) -- Error: expected BitVec 8
def fixed {dom : DomainConfig} : Signal dom (BitVec 8) :=
let wide : Signal dom (BitVec 16) := Signal.register 0#16 (Signal.pure 0#16)
wide.map (BitVec.extractLsb' 0 8 Β·) -- β explicit truncationSee docs/reference/Troubleshooting_Synthesis.md and docs/known-issues/KnownIssues.md for the current list of:
- Imperative syntax limitations (
<~inside conditionals). - Pattern matching on tuples in synthesizable contexts.
if-then-else vsSignal.muxin Signal contexts.Signal.loopfeedback rules.bv_decidehanging insidelake buildon Lean 4.28 (interactive only).
lake testRuns Signal simulation, Verilog generation, vector / memory ops, temporal logic, CPU ISA proofs, BitNet golden-value validation, RV32 firmware, H.264 pipelines, YOLOv8 primitives, CDC queue stress, and the Verilator co-simulation layer.
| Feature | Sparkle | Clash | Chisel | Verilog |
|---|---|---|---|---|
| Language | Lean 4 | Haskell | Scala | Verilog |
| Type System | Dependent Types | Strong | Strong | Weak |
| Simulation | Built-in | Built-in | Built-in | External tools |
| Formal Verification | Native (Lean) | External | External | None |
| Logical Safety (no latches / comb loops) | By construction | Partial | Via FIRRTL | None |
| Physical / Timing Safety (DRC) | Built-in | None | None | SpyGlass ($$$) |
| Generated Verilog Readability | 1:1 structural | Readable | Obfuscated (FIRRTL) | N/A |
| Learning curve | High | High | Medium | Low |
| Proof integration | Seamless | Separate | Separate | N/A |
sparkle/
βββ Sparkle/ # Core library (Signal DSL, IR, Compiler, Backend, Verification)
βββ IP/ # Verified IP cores (BitNet, YOLOv8, RV32, Drone, Humanoid, Video, Bus)
βββ Examples/ # Runnable demos (Counter, Sparkle16 CPU, CDC, LoopSynthesis, β¦)
βββ Tests/ # LSpec test suites for everything above
βββ Tools/ # SVParser, verilog! / sim! macros, Signal DSL helpers
βββ verilator/ # Verilator co-simulation backend for the RV32IMA SoC
βββ firmware/ # RV32 firmware + OpenSBI + Linux device tree
βββ c_src/ # C FFI libraries (loop memoization, JIT dlopen)
βββ scripts/ # Tutorial syntax check + golden-value generators
βββ docs/ # Hand-written docs (Tutorial, per-IP, KnownIssues, BENCHMARK)
βββ lakefile.lean # Build configuration
Sparkle is an educational project demonstrating functional hardware description, dependent types for hardware, theorem proving for verification, and compiler construction / metaprogramming.
Contributions welcome β good first areas:
- Verified standard IP (parameterised FIFO, N-way arbiter, TileLink / AXI4 interconnect) with formal proofs.
- FPGA tape-out flow examples.
- Additional IR optimisation passes.
- More tutorials and worked examples.
The IR elaborator's error surface is still rough. Two messages in particular bury the real cause:
Cannot synthesise <name>: not inlinable and not a hardware module
Sub-module synthesis failed for <name> (tagged @[hardware_module])
Both are emitted from Sparkle/Compiler/Elab.lean:handleDefinitionUnfold
where an inner MetaM exception is swallowed by a catch _. When
this hits you, follow this workflow:
-
Look at the error in context. If
<name>is one ofSparkle.Core.runCircuitH,Bind.bind,Pure.pure,Sparkle.Core.Signal.bundle*, the elaborator'sunfoldDefinition?peeled the surfacedefbut choked on something inside your DSL body (typeclass projection, an Applicative lift, a multi-arg lambda). Seedocs/reference/Troubleshooting_Synthesis.mdΒ§"Synthesis Compiler Patterns" for the patterns the elaborator does accept β common rewrites are listed in Β§"Fix patterns". -
Get the real inner error. Temporarily change the two
catch _ =>clauses nearSparkle/Compiler/Elab.lean:1620and:1631tocatch e => ... e.toMessageData.toStringso the innerthrowErrorpropagates into the outer message. Most "not inlinable" failures resolve to something specific like a missing pattern-match arm, an unhandled operator, or aBitVec.zeroExtend-style call the IR doesn't speak. Revert the change before committing β leaving rawMessageDatain the user-facing error breaks the existing test fixtures. -
Found a new pattern that fails? Add a one-liner to
docs/reference/Troubleshooting_Synthesis.mdunder the appropriate "NOT supported" / "Fix patterns" bullet so the next contributor sees it before they re-derive the problem. Two recent examples of the kind of entry to add are "multi-arg user-defined function viaf <$> a <*> b" (rewrite the body to use Signal-native operators directly) and "(fun v => 0#m ++ v ++ 0#n)" (split into a chain of++). -
If the elaborator itself should learn this case, file a followup under
docs/known-issues/TODO.mdΒ§"Compiler / IR" so the rough surface can be filed down rather than papered over. The shipped error today is the cap on how fast a new contributor can debug their first synthesisable circuit, so work that reduces it is high-leverage.
The Sparkle/IP/Net/CRC32.lean development is a worked example:
the byte-feed engine first failed with the generic "Sub-module
synthesis failed" message, the inner error revealed
Cannot synthesise runCircuitH, and the fix turned out to be
rewriting crc32Step <$> crc <*> byte (a user-defined 2-arg
function lifted through Applicative) as a Signal-native chain
of ^^^/&&&/>>>/++/-.
Completed phases live in docs/CHANGELOG.md.
Next up:
- Verified Standard IP β Parameterised FIFO β generic depth / width FIFO.
- Verified Standard IP β N-way Arbiter β generalise the 2-client round-robin arbiter to N clients.
- Verified Standard IP β TileLink / AXI4 Interconnect β full AXI4 (bursts, IDs) and TileLink.
- GPGPU / Vector Core β apply the VDD framework to highly concurrent, memory-bound accelerator architectures.
- FPGA Tape-out Flow β end-to-end examples deploying Sparkle-generated Linux SoCs to physical FPGAs.
Junji Hashimoto β Twitter / X: @junjihashimoto3
Apache License 2.0 β see LICENSE.
- Inspired by Clash HDL
- Built with Lean 4
- Golden-reference cycle-accurate simulation via Verilator β used both as the CI co-sim reference and as the "if the JIT disagrees, the JIT is wrong" arbiter throughout the test suite.
- In-browser Lean via xeus-lean and JupyterLite β powers the hosted tutorial notebooks.
- Verilog toolchain integration via iverilog (round-trip checks) and Yosys (used in Ch 8 of the tutorial for equivalence checking / FPGA fit).
- Discord: https://discord.gg/94Xueve8WD β design discussion, weekly progress threads, beginner Q&A.