Phase 3 Plan D: bufcache + block driver + FS read path #16

Merged: cyyeh merged 34 commits into main from phase3-plan-d on Apr 26, 2026
Conversation

cyyeh (Owner) commented Apr 26, 2026

Summary

  • Plan 3.D landed end-to-end: kernel-side PLIC + block drivers, NBUF=16 LRU buffer cache (sleep-on-busy), full FS read layer (fs/{layout,bufcache,balloc,inode,dir,path}.zig), file table (NFILE=64) with per-process ofile[16] + cwd, 7 new syscalls (getcwd, chdir, openat, close, lseek, read, fstat), and a host-side mkfs tool that builds a 4 MB fs.img.
  • proc.exec now resolves the path via namei + readi into a 64 KB kernel scratch buffer (FS-mode) instead of looking up an embedded blob — /bin/init is loaded from disk for the first time.
  • Trap dispatcher gained a separate s_kernel_trap_entry vector so the scheduler's SIE-window IRQs (the new block.zig waiter wakes through this path) don't clobber the sleeping proc's tf/kstack.
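The file-table shape described above (global NFILE=64 table, per-process ofile[16]) can be modeled with a minimal C sketch; all names here (`fd_alloc`, `ftable`, `struct proc`) are hypothetical stand-ins, not the kernel's actual Zig identifiers:

```c
#include <assert.h>
#include <stddef.h>

#define NFILE  64   /* global open-file table size (from this PR) */
#define NOFILE 16   /* per-process ofile[] slots (from this PR) */

struct file { int refs; };                  /* minimal global entry */
static struct file ftable[NFILE];

struct proc { struct file *ofile[NOFILE]; };

/* Hypothetical fd allocation: openat hands back the lowest free
 * per-process slot, mirroring the ofile[16] layout described above. */
int fd_alloc(struct proc *p, struct file *f) {
    for (int fd = 0; fd < NOFILE; fd++) {
        if (p->ofile[fd] == NULL) {
            p->ofile[fd] = f;
            return fd;
        }
    }
    return -1;                              /* all 16 slots in use */
}
```

The two-level split keeps fds per-process (so fork can copy ofile[] and bump refcounts) while the backing file objects stay global.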

Milestone: e2e-fs runs kernel-fs.elf against fs.img; the on-disk /bin/init opens /etc/motd, reads it, writes the contents to fd 1, exits 0:

$ zig build kernel-fs fs-img
$ zig-out/bin/ccc --disk zig-out/fs.img zig-out/bin/kernel-fs.elf
hello from phase 3
ticks observed: 4

34 commits: 1 plan + 25 task implementations + 4 debug fixes (sched_context order, kernel-vec separation, kbuf static, e2e harness) + 2 docs (README + deck).

Test plan

  • zig build test — all unit tests pass
  • zig build e2e — RV32I hello world
  • zig build e2e-mul — RV32IMA demo
  • zig build e2e-trap — privilege/trap demo
  • zig build e2e-hello-elf — Phase 1 §Definition of done
  • zig build e2e-kernel — Phase 2 §Definition of done
  • zig build e2e-multiproc-stub — Plan 3.B PID 1 + PID 2
  • zig build e2e-fork — Plan 3.C fork/exec/wait/exit
  • zig build e2e-plic-block — Plan 3.A IRQ + block round-trip
  • zig build e2e-snake — snake demo deterministic input
  • zig build e2e-fs — Plan 3.D milestone (this PR's headline gate)
  • zig build riscv-tests — rv32ui/um/ua/mi/si conformance
  • zig build wasm — wasm cross-build (deck demo)

🤖 Generated with Claude Code

cyyeh and others added 30 commits April 26, 2026 10:59
Implementation plan for Phase 3.D landing the entire FS read path on top
of Plan 3.C's process lifecycle: kernel-side PLIC + block drivers, the
buffer cache, the FS layer (layout/balloc/inode/dir/path), file table,
per-process cwd + ofile, FS-aware exec, 7 syscalls (openat/close/read/
lseek/fstat/chdir/getcwd), mkfs host tool, FS-mode kmain that loads init
from disk, and the e2e-fs acceptance gate.

Targets the post-restructure layout (src/kernel/, src/emulator/,
programs/, tests/e2e/).
- Add std.debug.assert(b.refs > 0) at start of brelse to catch premature releases
- Document why Pass 1 sleep needs no re-validation (refs bumped before sleep prevents eviction)
- Document why wakeup-then-busy=false ordering is safe (scheduler defers woken process)

These defensive improvements make subtle synchronization invariants more discoverable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
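The invariants above (refs pinned before sleep, assert on premature release) can be sketched as a single-threaded C model of the NBUF=16 LRU cache; the structure fields and the `bget`/`brelse` bodies here are an illustrative reconstruction, not the kernel's Zig code, and the real `bget` sleeps on a busy slot where this model only comments on it:

```c
#include <assert.h>
#include <stdint.h>

#define NBUF 16

/* Hypothetical bufcache slot mirroring the PR's description. */
struct buf {
    int refs;          /* pin count; >0 blocks eviction */
    int valid;         /* slot holds a real block */
    uint32_t blockno;
    uint64_t lru_tick; /* last-release time for LRU selection */
};

static struct buf cache[NBUF];
static uint64_t ticks;

/* bget: return a pinned slot for blockno, reusing a cached copy or
 * evicting the least-recently-released unpinned slot. In the real
 * kernel, a busy slot would sleep here; refs is bumped *before*
 * sleeping, so the slot cannot be evicted out from under the sleeper. */
struct buf *bget(uint32_t blockno) {
    struct buf *victim = 0;
    for (int i = 0; i < NBUF; i++) {
        struct buf *b = &cache[i];
        if (b->valid && b->blockno == blockno) { b->refs++; return b; }
        if (b->refs == 0 && (!victim || b->lru_tick < victim->lru_tick))
            victim = b;
    }
    assert(victim);            /* all 16 slots pinned: caller bug */
    victim->valid = 1;
    victim->blockno = blockno;
    victim->refs = 1;
    return victim;
}

void brelse(struct buf *b) {
    assert(b->refs > 0);       /* catch a premature/double release */
    b->refs--;
    b->lru_tick = ++ticks;     /* LRU candidate once fully unpinned */
}
```

The assert is exactly the defensive check the commit adds: a second `brelse` on the same pin fails loudly instead of silently corrupting the LRU bookkeeping.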
Seven bugs discovered while smoke-testing kernel-fs.elf:

1. kmain (FS_DEMO): cpu.sched_context.{ra,sp} was set AFTER proc.exec,
   so proc.sleep inside exec saw sched_context.ra==0 and returned
   without yielding — the SIE=0 spin then deadlocked because the block
   IRQ couldn't fire. Move the setup before exec.

2. kmain (FS_DEMO): after exec returns, init_p.context still pointed
   inside proc.sleep with a stale stack. Re-arm to forkret + a fresh
   kstack frame so the next scheduler swtch enters via the trap-frame
   sret path, not by re-running sleep's epilogue. forkret promoted to
   pub export.

3. sched.scheduler: replaced the busy-spin (when nothing is Runnable
   but something is alive) with a one-instruction SIE window so a
   pending PLIC IRQ or timer SSI can actually be delivered.

4. trap.s_trap_dispatch (S-from-S): clear sstatus.SPIE so sret restores
   SIE=0 — otherwise the csrc that closes the SIE window never executes
   (a perpetual timer SSI re-traps before each fetch).

5. trap.s_trap_dispatch (timer SSI in scheduler): when cpu.cur is null
   the trap fired inside the SIE window, not inside a process. Skip
   the yield, which would otherwise swtch into sched_context and re-
   enter scheduler() from its top, wiping its loop state.

6. vm.mapKernelAndMmio: add the block device (0x10001000) and PLIC
   (0x0C000000, 0x0C002000, 0x0C201000) MMIO pages — without these,
   the kernel page-faults on its own driver registers as soon as
   paging is on.

7. mkfs: inode array was indexed by `inum - 1`, putting root at slot 0
   in the on-disk image while the kernel reads inum 1 from byte offset
   64 (= slot 1). Switched to `inodes[inum]` with slot 0 reserved as
   Free, and pre-fill root inline (createDir would seed `..` with the
   bogus parent_inum=0 placeholder).

Smoke test now reaches the syscall-return path inside fs_init's first
open("/etc/motd"); a remaining StorePageFault on the syscall return
write to tf->a0 is still under investigation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
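Bug 7's off-by-one is easy to reproduce in a few lines of C; the 64-byte `dinode` layout and the `islot` helper below are hypothetical stand-ins for the mkfs structures, kept only to show the slot arithmetic:

```c
#include <assert.h>
#include <stdint.h>

#define NINODES 64

/* Hypothetical 64-byte on-disk inode; the kernel reads inum 1 at
 * byte offset 64, i.e. array slot 1. */
struct dinode {
    uint16_t type;          /* 0 = Free */
    uint8_t  pad[62];       /* remaining fields, padded to 64 bytes */
};

static struct dinode inodes[NINODES];

/* Fixed indexing: slot == inum, with slot 0 reserved as Free.
 * The buggy mkfs used inodes[inum - 1], which put root at slot 0
 * while the kernel looked for inum 1 in slot 1 and found Free. */
static struct dinode *islot(uint32_t inum) {
    return &inodes[inum];
}
```

Reserving slot 0 costs one inode but makes the on-disk byte offset of inum N exactly N * 64, matching what the kernel computes.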
…ndow

The user vector (s_trap_entry) assumes the trap came from U-mode: it
csrrw-swaps sp with sscratch (which holds the current proc), saves all
GPRs into the proc's tf at offset 0, and jumps to s_trap_dispatch on
the proc's kstack. That sequence is hostile in two ways when a trap
fires from S-mode inside the scheduler's SIE window:

1. The proc whose &p sits in sscratch is *sleeping* with frames on its
   kstack. Switching sp to its kstack_top and writing handler frames
   from there clobbers the in-flight syscall stack — causing a wild
   pointer (0xaaaaaab6 etc.) on the next post-swtch load.
2. Every GPR (including ra/sp/s0..s11) gets stamped into init_p.tf,
   wiping the saved user state. The next sret-to-U then jumps off into
   garbage.

Fix: install s_kernel_trap_entry as stvec only while the SIE window is
open. The kernel vector saves caller-saved regs to the *current* (i.e.
scheduler) stack, runs s_kernel_trap_dispatch, restores, srets back to
the csrc that closes the window. s_kernel_trap_dispatch only handles
the two interrupts the window can deliver — IRQ_BLOCK and timer SSI —
and forces SPIE=0 so the post-sret csrc executes without re-trapping.

Also panic on S-from-S in the user vector — that should never happen
once the kernel vector is installed, and catching it surfaces future
stvec bugs immediately rather than silently corrupting state.

The pre-existing SSI-in-scheduler check (cpu.cur==null skip yield) and
the user-vector SPIE clear remain as belt-and-suspenders for the
kernel vector path's narrow scope.

Smoke test of kernel-fs.elf now reaches the second user syscall (read
after openat) before hitting an unrelated stack-corruption symptom on
the post-swtch path; that's the next debug target.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
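The kernel vector's narrow dispatch contract can be modeled in C; this is a sketch of the *logic* only (real code reads scause/sstatus via CSR instructions), and the function name, flag globals, and cause numbers for the delegated timer SSI are assumptions based on the standard RISC-V interrupt encodings:

```c
#include <assert.h>
#include <stdint.h>

/* Standard RISC-V scause/sstatus encodings. */
#define SCAUSE_INTERRUPT (1u << 31)
#define IRQ_S_SOFT  1    /* supervisor software interrupt (timer SSI) */
#define IRQ_S_EXT   9    /* supervisor external: PLIC, i.e. IRQ_BLOCK */
#define SSTATUS_SPIE (1u << 5)

static int block_irq_seen, timer_ssi_seen;

/* The kernel vector's C half: only the two interrupts the SIE window
 * can deliver are legal; anything else is a stvec bug, so panic. */
uint32_t s_kernel_trap_dispatch(uint32_t scause, uint32_t sstatus) {
    assert(scause & SCAUSE_INTERRUPT);   /* window delivers IRQs only */
    switch (scause & ~SCAUSE_INTERRUPT) {
    case IRQ_S_EXT:  block_irq_seen = 1; break;
    case IRQ_S_SOFT: timer_ssi_seen = 1; break;
    default: assert(0 && "unexpected trap in SIE window");
    }
    /* Force SPIE=0 so sret lands with SIE=0 and the csrc that closes
     * the window executes before the next interrupt can re-trap. */
    return sstatus & ~SSTATUS_SPIE;
}
```

Returning the modified sstatus models the "forces SPIE=0" step: without it, sret would re-enable SIE and a pending interrupt would re-trap before the window-closing csrc ever runs.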
cyyeh and others added 4 commits April 26, 2026 19:26
file.read declared `var kbuf: [READ_CHUNK]u8 = undefined` (READ_CHUNK = 4096)
on its stack, ballooning its frame to ~4.2 KB. Combined with the upstream
trap-handler + syscall-dispatch + sysRead frames and the downstream
inode.readi + bread + bget + sleep frames, the syscall reached almost
6 KB of stack — well beyond the 4 KB per-process kernel stack.

Pages allocated by page_alloc are consecutive: cpu.sched_stack_top is
adjacent (in PA) to init_p.kstack. So when init_p's read syscall
overflowed its kstack, it wrote into the next page — the scheduler
stack — and the array's `undefined` initializer (Zig's debug 0xaa
fill) clobbered the scheduler's locals. The next post-swtch lw read
back 0xaaaaaaaa as &cpu, faulting on the cpu.cur=null write.

Fix: hoist kbuf to a static module var. Single-threaded kernel ⇒ safe.

Smoke test of kernel-fs.elf now passes end-to-end:
  hello from phase 3
  ticks observed: 4

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
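The shape of the fix translates directly to C; `read_chunk` below is a hypothetical stand-in for the file.read copy loop, shown only to illustrate moving the staging buffer out of the stack frame:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define READ_CHUNK 4096

/* The fix: the 4 KB chunk buffer lives in static storage instead of
 * the syscall's stack frame, so the per-process 4 KB kstack no longer
 * overflows into the adjacent scheduler stack. Safe only because the
 * kernel is single-threaded: one read is in flight at a time. */
static uint8_t kbuf[READ_CHUNK];

/* Hypothetical stand-in for file.read's inner loop: disk bytes are
 * staged through kbuf on their way to the caller's buffer. */
size_t read_chunk(const uint8_t *src, size_t len, uint8_t *dst) {
    size_t n = len < READ_CHUNK ? len : READ_CHUNK;
    for (size_t i = 0; i < n; i++) kbuf[i] = src[i];  /* disk -> kbuf */
    for (size_t i = 0; i < n; i++) dst[i] = kbuf[i];  /* kbuf -> user */
    return n;
}
```

The trade-off is explicit: the static costs 4 KB of kernel BSS permanently, but removes a stack-depth landmine that only fires when the full trap + syscall + FS call chain is live.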
Spawns ccc --disk fs.img kernel-fs.elf, asserts:
  - exit 0
  - stdout contains "hello from phase 3\n"
  - stdout has the canonical "ticks observed: N\n" PID 1 trailer

Mirrors tests/e2e/fork.zig structure. All prior e2e suites still pass
(test, e2e, e2e-mul, e2e-trap, e2e-hello-elf, e2e-kernel,
e2e-multiproc-stub, e2e-fork, e2e-plic-block, e2e-snake).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Final task in Plan 3.D. Headline status flips from "Plan C done" to
"Plan D done". Layout grows new entries for plic.zig, block.zig,
file.zig, fs/{layout,bufcache,balloc,inode,dir,path}.zig, mkfs.zig,
user/fs_init.zig, userland/fs/etc/motd, and tests/e2e/fs.zig. Building
table grows kernel-fs / kernel-fs-init / mkfs / fs-img / e2e-fs.

Final regression sweep (clean): test, e2e, e2e-mul, e2e-trap,
e2e-hello-elf, e2e-kernel, e2e-multiproc-stub, e2e-fork, e2e-plic-block,
e2e-snake, e2e-fs, riscv-tests, wasm — all PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…om disk)

Three new chapters slot between the 3.C lifecycle slide and the
epilogue:
  32. block + bufcache — first IRQ-driven kernel sleep, single-
      outstanding submit, NBUF=16 LRU buffer cache, the new
      s_kernel_trap_entry vector.
  33. FS read path — on-disk layout, the read stack
      (read → file.read → readi → bmap+bread → block.read), nameix.
  34. init from disk — exec via namei+readi, mkfs build pipeline,
      fs_init.zig, e2e-fs milestone, the kbuf-overflow war story.

TOC, intro caption, epilogue prose, and check-list grow a 3.D row;
"Next" panel flips from 3.D (filesystem + shell) to 3.E (write side
+ console + shell).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cyyeh merged commit 3cd10b6 into main on Apr 26, 2026 (1 check passed).
cyyeh deleted the phase3-plan-d branch April 26, 2026 11:41.