6 changes: 4 additions & 2 deletions AGENTS.md
@@ -58,7 +58,7 @@ src/bd_archive/
│ ├── mount.py # plain mount/umount (no sudo)
│ ├── udisks.py # is_available, mount/unmount, loop_setup/loop_delete
│ ├── eject.py # eject + close_tray + drive_status (CDROM ioctl, CDS_* constants)
│ ├── mediainfo.py # detect_disc_capacity (parses all non-00h format-type descriptors, falls back to Track Size for rewritable media)
│ ├── mediainfo.py # detect_disc_capacity (Free Blocks for write-once, Track Size for rewritables; assumes growisofs spare=none — no format → no OSA)
│ ├── optical.py # list_drives + resolve_device (sysfs sr* enum, interactive picker)
│ └── lsof.py # find_device_holders (optional — no-op if lsof absent)
├── archive/ # domain logic over tools/
@@ -82,6 +82,8 @@ Layering: `commands/` → `archive/` → `tools/` → `shell/`. Lower layers nev

Four subcommands form a pipeline. `create` previews disc count + last-disc fill before prompting for confirmation, so users can dry-run sizing without committing.

1. **`create`** (`commands/create.py`) reads disc capacity via `tools.mediainfo.detect_disc_capacity` (or `args.bytes`), scans the source, and computes slice sizing plus a disc-count estimate (optionally measuring the compression ratio via `--sample`). The user confirms via `prompt_yn` before any heavy work begins (skip with `-y`). Then runs `tools.dar.create_sliced` with `--hash sha512 --min-digits 4 -Q` (plus `-z<algo>[:level] -am` when compression is enabled) to slice the source into per-disc-sized `.dar` files in `<workdir>/tmp/`. par2 is generated **inline** via dar's `-E` hook (`bd_archive._par2_helper`) — the hook fires after each slice is fully written, so par2 reads the slice while it is still hot in the OS page cache, eliminating most SSD read traffic of the create phase. After dar completes, the catalog is isolated. For each slice in order: regenerate `README.txt` with the right disc number and call `tools.mkisofs.build` (mkisofs `-iso-level 3 -udf -V <label> -publisher "bd-archive v<ver>" -input-charset utf-8 -graft-points`) to assemble `<output>/images/disc_NNNN.iso` directly from in-place files (no staging copies). The ISO file size is checked against the format-aware writable capacity as a hard limit. Phase 3 also asserts par2 files are present on disk — a missing file means the `-E` helper silently failed during dar create. After each ISO is built, the slice + par2 are deleted from `tmp/`; once all slices are processed, `tmp/` is wiped entirely. If `-w` was not supplied, the default `<output>/.bd-archive-work/` is also removed, so `<output>` ends up containing only `images/disc_*.iso`.
2. **`burn`** (`commands/burn.py`) iterates `<input>/images/disc_*.iso` lexically and burns each via `growisofs -use-the-force-luke=notray -use-the-force-luke=spare=none -dvd-compat -Z dev=image.iso`, a byte-for-byte ISO write. `spare=none` skips the implicit BD-R format and disables drive-firmware defect management (read-after-write + Outer Spare Area reservation), which roughly doubles write speed on BD-R; par2 + sha512 + the post-burn verify pass cover the integrity story at the application layer. There is no on-the-fly mkisofs, so what's in the ISO file is exactly what ends up on disc. Volume label, publisher, and file layout are all already in the file. The pre-burn fit check is **two-sided**: it rejects too-small discs AND discs more than `DISC_OVERSIZE_TOLERANCE` (= 5%) larger than the ISO, guarding against wasting a 50 GB BD-DL on a 25 GB-sized archive. `--skip-fit-check` disables both directions. SIGINT is trapped during the burn itself: a first `Ctrl+C` warns and is ignored (cancelling mid-burn coasters the disc), a second within `BURN_ABORT_GRACE_S` (= 5 s) terminates growisofs and bubbles up as `KeyboardInterrupt`. growisofs runs in its own session (`start_new_session=True`) so the tty's SIGINT does not reach it directly. After burn, `DiscIO.close_tray_if_open` pulls the tray back in on auto-ejecting drives so the post-burn verify can mount. The post-burn verify runs `verify_disc` and loops on any mount/verify failure with a `Re-insert the disc … press Enter to retry` prompt, since some drives need a manual re-insert. Resumable via `--start N`; per-disc resume hints are logged on every cancel/error path. Catches `DeviceBusyError` from `tools/growisofs.py` (sg device locked) and offers an interactive retry; `tools.lsof.find_device_holders` is consulted to name the holding processes when available.
1. **`create`** (`commands/create.py`) reads disc capacity via `tools.mediainfo.detect_disc_capacity` (or `args.bytes`), scans the source, and computes slice sizing plus a disc-count estimate (optionally measuring the compression ratio via `--sample`). The internal dar archive name is `<-n value>-gen<N>` where N is 1 for a full archive and `base_gen + 1` for an incremental against `--base <catalog.dar>` (base gen parsed from the catalog filename via `archive.dar_archive.parse_dar_filename`, which also handles legacy pre-`-gen<N>` filenames as gen 1). Volume labels are `<truncated_name>_G<NN>_<NNNN>` — names longer than `ISO9660_LABEL_NAME_MAX` (23) are truncated in the label only; filenames inside the ISO keep the full name. When `--base` is set, `tools.dar.list_catalog_paths` parses `dar -l` output to get the set of paths already in the predecessor, and `archive.source_scan.scan_delta_bytes` re-scans the source counting only new/modified files for the preview's archive-size estimate. The user confirms via `prompt_yn` before any heavy work begins (skip with `-y`). Then runs `tools.dar.create_sliced` with `--hash sha512 --min-digits 4 -Q` (plus `-z<algo>[:level] -am` when compression is enabled, `-A <ref_catalog>` for incrementals, `-P <path>` per excluded file from auto-defer) to slice the source into per-disc-sized `.dar` files in `<workdir>/tmp/`. par2 is generated **inline** via dar's `-E` hook (`bd_archive._par2_helper`) — the hook fires after each slice is fully written, so par2 reads the slice while it is still hot in the OS page cache, eliminating most SSD read traffic of the create phase. After dar completes, the catalog is isolated. For each slice in order: regenerate `README.txt` with the right disc number + generation and call `tools.mkisofs.build` (mkisofs `-iso-level 3 -udf -V <label> -publisher "bd-archive v<ver>" -input-charset utf-8 -graft-points`) to assemble `<output>/images/disc_NNNN.iso` directly from in-place files (no staging copies). 
**Catalog files go onto Disc 1 only** — discs 2..N carry only their slice + par2 + README; the dar slice on the last disc embeds the master catalog at its end (dar default), so every set still has two spatially separated catalog copies. After all ISOs are built, the isolated catalog is also copied to `<output>/<name>-gen<N>-catalog.*.dar` so the user can keep it in their regular backup and use it as `--base` for future generations. The ISO file size is checked against the format-aware writable capacity as a hard limit; a missing par2 file (helper silently failed) hard-errors. After each ISO is built, the slice + par2 are deleted from `tmp/`; once all slices are processed, `tmp/` is wiped entirely. If `-w` was not supplied, the default `<output>/.bd-archive-work/` is also removed, so `<output>` ends up containing only `images/disc_*.iso` and the persisted catalog.
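
The two-sided pre-burn fit check described for `burn` can be sketched as follows. This is a minimal illustration, not the code in `commands/burn.py`: the function name `check_fit`, its return values, and the integer-percent constant `DISC_OVERSIZE_TOLERANCE_PCT` are assumptions made for this example; only the 5% tolerance and the "too small or too large" behaviour come from the description above.

```python
# Hypothetical stand-in for DISC_OVERSIZE_TOLERANCE (= 5%); stored here as
# an integer percent so the comparison stays in exact integer arithmetic.
DISC_OVERSIZE_TOLERANCE_PCT = 5


def check_fit(iso_bytes: int, disc_bytes: int,
              tolerance_pct: int = DISC_OVERSIZE_TOLERANCE_PCT) -> str:
    """Classify a disc/ISO pair as 'ok', 'too_small', or 'too_large'.

    'too_small': the ISO does not fit on the disc.
    'too_large': the disc is more than tolerance_pct larger than the ISO,
                 so burning would waste a bigger (more expensive) disc.
    """
    if disc_bytes < iso_bytes:
        return "too_small"
    if disc_bytes * 100 > iso_bytes * (100 + tolerance_pct):
        return "too_large"
    return "ok"
```

With these assumptions, an ISO sized for a 25 GB disc passes on a blank single-layer BD-R and is rejected as `too_large` on a dual-layer disc. A `--skip-fit-check` style flag would simply bypass the call.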

**Auto-defer** (`--min-last-disc-fill PERCENT`): when the projected last-disc fill is below PERCENT, the newest-by-mtime files are pushed to a future generation until either the threshold is met or the candidate pool is exhausted. For incrementals (`--base` given), the pool is "files whose relative path is not in the base catalog" — strictly conservative, so an already-archived file whose mtime has drifted on disk is never lost. For full archives (no `--base`), the pool is "all source files" with a loud warning that deferred files won't be archived anywhere until a later incremental picks them up. Deferred files become `-P <relpath>` flags on dar. The preview block shows count, byte total, oldest mtime, and a sample of deferred paths before the confirm prompt.
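
The auto-defer loop can be illustrated with a simplified sketch. It assumes uncompressed sizes and a fixed per-disc slice size, and it ignores par2 and catalog overhead; `SourceFile`, `last_disc_fill_pct`, and `pick_deferred` are hypothetical names for this example, not functions from the repository.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class SourceFile:
    relpath: str
    size: int      # bytes
    mtime: float   # seconds since epoch


def last_disc_fill_pct(total_bytes: int, slice_bytes: int) -> float:
    """Projected fill of the last disc, in percent (simplified model)."""
    if total_bytes <= 0:
        return 0.0
    remainder = total_bytes % slice_bytes
    return 100.0 if remainder == 0 else 100.0 * remainder / slice_bytes


def pick_deferred(files: Iterable[SourceFile], pool_relpaths: Set[str],
                  slice_bytes: int, min_fill_pct: float) -> List[SourceFile]:
    """Defer newest-by-mtime pool files until the last disc is full enough.

    pool_relpaths is the candidate pool: every path for a full archive,
    or only paths absent from the base catalog for an incremental.
    """
    files = list(files)
    total = sum(f.size for f in files)
    pool = sorted((f for f in files if f.relpath in pool_relpaths),
                  key=lambda f: f.mtime, reverse=True)
    deferred: List[SourceFile] = []
    for f in pool:
        if last_disc_fill_pct(total, slice_bytes) >= min_fill_pct:
            break  # threshold met
        total -= f.size
        deferred.append(f)
    return deferred  # pool may be exhausted before the threshold is met
```

Each returned `relpath` would then become one `-P <relpath>` flag on the dar command line.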
@@ -97,7 +99,7 @@ The build-then-burn separation makes mid-burn sizing failures **constructively i

### Slice sizing (`archive/sizing.py`)

Raw capacity from `tools.mediainfo.detect_disc_capacity` is the format-aware writable extent. The function parses `dvd+rw-mediainfo`'s `READ FORMAT CAPACITIES` block, collects **every** non-`00h` MMC-6 format-type descriptor (26h DVD+RW, 2Ah DVD-RW, 30h BD-RE, 32h BD-R, …), and returns the largest descriptor ≤ `upper_bound`. `upper_bound` is `Free Blocks` when non-zero (the per-burn capacity for write-once media: BD-R, DVD-R) and falls back to `Track Size` when `Free Blocks` is 0 (rewritable formatted media: DVD+RW, BD-RE, where `Free Blocks` doesn't apply). On a 25 GB BD-R the 32h descriptor is ~256 MiB less than the nominal Free Blocks figure due to the drive's default Outer Spare Area reservation; only the descriptor reflects what the drive actually accepts.
Raw capacity from `tools.mediainfo.detect_disc_capacity` is `Free Blocks` for write-once media (BD-R, DVD-R: the per-burn writable remainder) and falls back to `Track Size` when `Free Blocks` is 0 (rewritable formatted media: DVD+RW, BD-RE, where `Free Blocks` doesn't apply). This assumes the burn step keeps BD-R unformatted via growisofs's `-use-the-force-luke=spare=none` (see `tools/growisofs.py`): without the implicit format step, no Outer Spare Area is reserved, so the nominal `Free Blocks` figure (~25.03 GB on a 25 GB SL BD-R, 12219392 blocks × 2048) is fully writable. If `spare=none` is ever removed, the drive will format with a default OSA (~256 MiB on SL BD-R) and `Free Blocks` will over-report; in that case capacity must instead come from the MMC-6 32h format-type descriptor in `READ FORMAT CAPACITIES`. The two changes are coupled; see commit 43fce62 for the descriptor-walk variant.
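
The selection rule and the block arithmetic can be checked with a small sketch. `pick_capacity` is a hypothetical helper that takes already-parsed block counts; it mirrors the `Free Blocks` / `Track Size` fallback described above but is not the actual `detect_disc_capacity`, which also runs and parses `dvd+rw-mediainfo`.

```python
from typing import Optional

SECTOR_BYTES = 2048  # optical block size used throughout


def pick_capacity(free_blocks: int, track_blocks: int) -> Optional[int]:
    """Return writable bytes: Free Blocks if non-zero, else Track Size.

    Returns None when both fields are zero (no usable disc information).
    """
    free_bytes = free_blocks * SECTOR_BYTES
    track_bytes = track_blocks * SECTOR_BYTES
    capacity = free_bytes if free_bytes > 0 else track_bytes
    return capacity if capacity > 0 else None


# Blank, unformatted 25 GB SL BD-R: 12219392 blocks of 2048 bytes.
bd_r_bytes = pick_capacity(12219392, 12219392)

# Size of a ~256 MiB Outer Spare Area, expressed in 2048-byte blocks.
osa_blocks = (256 * 1024 * 1024) // SECTOR_BYTES
```

Here `bd_r_bytes` works out to 25,025,314,816 bytes (about 25.03 GB), and a 256 MiB spare area corresponds to 131,072 blocks, which is the amount `Free Blocks` would over-report if the drive were allowed to format the disc.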

`cmd_create` then subtracts `DISC_END_MARGIN = 1 MiB` from this raw capacity to derive `sizing_target` (absorbs ISO9660+UDF metadata growth that exceeds the slice estimate). `compute_slice_bytes(sizing_target, catalog_est, redundancy)` further subtracts `catalog_est + PAR2_AND_MISC_OVERHEAD` per disc (catalog scales with file count via `archive.source_scan.scan_source`; `PAR2_AND_MISC_OVERHEAD = 4 MiB` covers par2 index/packet/block-rounding overhead), divides by `(100 + redundancy)/100`, and floors to a MiB boundary. The post-build ISO size check against the full `raw_capacity` is the hard final gate.
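
The sizing chain above can be sketched as follows. The constants and the `compute_slice_bytes(sizing_target, catalog_est, redundancy)` signature come from the text; the function body, the error handling, and the 16 MiB catalog estimate are assumptions made for illustration, so the real `archive/sizing.py` may differ in detail.

```python
MIB = 1024 * 1024
DISC_END_MARGIN = 1 * MIB          # absorbs ISO9660+UDF metadata growth
PAR2_AND_MISC_OVERHEAD = 4 * MIB   # par2 index/packet/block-rounding overhead


def compute_slice_bytes(sizing_target: int, catalog_est: int,
                        redundancy: int) -> int:
    """Per-disc dar slice size, floored to a MiB boundary.

    redundancy is the par2 redundancy in percent (e.g. 10 for 10%).
    """
    usable = sizing_target - catalog_est - PAR2_AND_MISC_OVERHEAD
    if usable <= 0:
        raise ValueError("disc too small for catalog and par2 overhead")
    # slice + redundancy% of slice must fit: slice = usable / (1 + r/100)
    slice_bytes = usable * 100 // (100 + redundancy)
    return (slice_bytes // MIB) * MIB


# Worked example: blank 25 GB SL BD-R, assumed 16 MiB catalog, 10% par2.
raw_capacity = 12219392 * 2048
sizing_target = raw_capacity - DISC_END_MARGIN
slice_bytes = compute_slice_bytes(sizing_target, 16 * MIB, 10)
```

Under these assumed inputs the slice plus 10% par2 data, the catalog estimate, and the fixed overhead stay within `sizing_target`, leaving `DISC_END_MARGIN` as headroom before the post-build check against `raw_capacity`.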

2 changes: 2 additions & 0 deletions README.md
@@ -153,6 +153,8 @@ The ISO files are now generated. This can take a while (multiple hours depending
After it's done, proceed with burning (x4 speed).
`bd-archive burn -i /path/to/staging-dir -S 4`

> **Note:** use **blank, unformatted** BD-R discs. `bd-archive` burns with `growisofs -use-the-force-luke=spare=none` to skip the implicit format step. That disables drive-firmware defect management, which roughly **doubles** burn speed (a 4x disc actually burns at 4x instead of 2x) and reclaims the ~256 MiB the drive would otherwise reserve as Outer Spare Area. Application-layer integrity is already covered by par2 FEC + sha512 checksums + a post-burn verify pass. A previously formatted BD-R will still burn, but at the reduced capacity its current format dictates — the fit check on capacity won't account for this.

bd-archiver will now ask you to insert the first disc. After it's inserted, press enter to start the burn process.
After burning, it automatically verifies the data integrity. If the drive firmware opens the tray after the burn, you may need to close it manually before verification can start.
After a disc is verified, insert the next one and repeat until all discs are burned.
26 changes: 25 additions & 1 deletion src/bd_archive/tools/growisofs.py
@@ -32,6 +32,23 @@ def burn(device: str, iso_path: Path, speed: str | None = None):
-use-the-force-luke=notray: skip post-burn tray eject/reload
(some drives physically pop the tray, requiring re-insert
before verify can run).
-use-the-force-luke=spare=none: skip the BD-R format step that
growisofs otherwise does unconditionally. Without that format,
BD-R defect management is off: no read-after-write verify, no
Outer Spare Area reservation. The drive writes at full rated
speed (~2x → ~4x on a 4x disc) and the full nominal Free Blocks
capacity is usable. We have par2 FEC + sha512 + a post-burn
verify pass, so drive-firmware DM is redundant defence in depth
that costs half the write time and ~256 MiB per disc.

⚠️ This flag is COUPLED to `tools.mediainfo.detect_disc_capacity`,
which returns `Free Blocks` directly (nominal capacity). If you
remove `spare=none`, growisofs will format the BD-R and reserve
an Outer Spare Area (~256 MiB on 25 GB SL), and writes past that
LBA will fail with SK=5h/LBA OUT OF RANGE — Free Blocks then
over-reports the writable extent. To re-enable DM you MUST also
revert detect_disc_capacity to read the MMC-6 32h format-type
descriptor from `READ FORMAT CAPACITIES` (see commit 43fce62).

Ctrl+C during a burn would coaster a BD-R, so we trap SIGINT
here: first press warns; a second press within BURN_ABORT_GRACE_S
@@ -43,7 +60,14 @@ def burn(device: str, iso_path: Path, speed: str | None = None):
locked; CalledProcessError on any other non-zero exit;
KeyboardInterrupt if the user confirmed a mid-burn abort.
"""
cmd = ["growisofs", "-use-the-force-luke=notray", "-dvd-compat", "-Z", f"{device}={iso_path}"]
cmd = [
"growisofs",
"-use-the-force-luke=notray",
"-use-the-force-luke=spare=none",
"-dvd-compat",
"-Z",
f"{device}={iso_path}",
]
if speed:
cmd += [f"-speed={speed}"]

52 changes: 17 additions & 35 deletions src/bd_archive/tools/mediainfo.py
@@ -4,25 +4,24 @@


def detect_disc_capacity(device: str) -> int | None:
"""Read writable optical capacity from dvd+rw-mediainfo, accounting
for the drive's format / Outer Spare Area reservation.
"""Read writable optical capacity from dvd+rw-mediainfo.

Two relevant fields:
- `Free Blocks`: writable remainder. For write-once media (BD-R,
DVD-R) this is the per-burn capacity. For rewritable media
(DVD+RW, BD-RE) it reads 0 after format, since the concept
doesn't apply to overwriteable discs.
- `Track Size`: the full track extent. Used as fallback when
Free Blocks is 0 (rewritable formatted media).

On top of that, `READ FORMAT CAPACITIES` lists per-format-type
descriptors (MMC-6 codes — 26h DVD+RW, 2Ah DVD-RW, 30h BD-RE,
32h BD-R, etc.) giving the actual writable extent for each
spare-area configuration. The largest descriptor ≤ upper bound is
what the drive will accept in its default sequential/format mode.
For BD-R this is critical: Free Blocks reports the nominal
capacity but the drive enforces ~256 MiB less (default Outer
Spare Area) — only the 32h descriptor reflects the real limit.
DVD-R) this is the per-burn capacity in the disc's *current*
format state. On a blank, unformatted BD-R this is the full
nominal capacity (e.g. 12219392 blocks ≈ 25.03 GB on SL).
- `Track Size`: full track extent. Used as fallback when Free
Blocks reads 0 (rewritable formatted media — DVD+RW, BD-RE).

No format-descriptor walk: we burn with `-use-the-force-luke=spare=none`
(see `tools.growisofs.burn`), which keeps blank BD-R unformatted —
no Outer Spare Area reservation, no defect management. The drive
then accepts writes up to Free Blocks. If `spare=none` were
removed, the drive would format with default OSA (~256 MiB on a
25 GB SL BD-R) and Free Blocks would over-report by that amount;
in that case capacity would have to come from the MMC-6 32h
format-type descriptor in `READ FORMAT CAPACITIES` instead.

Returns None if no disc, command fails, or output unparseable.
"""
@@ -38,22 +37,5 @@ def detect_disc_capacity(device: str) -> int | None:
free_bytes = int(free_match.group(1)) * 2048 if free_match else 0
track_bytes = int(track_match.group(1)) * 2048 if track_match else 0

# Prefer Free Blocks when non-zero (write-once partial state).
# For rewritable formatted media (DVD+RW, BD-RE), Free Blocks ≡ 0
# and Track Size carries the writable extent.
upper_bound = free_bytes if free_bytes > 0 else track_bytes
if upper_bound == 0:
return None

# All format-type descriptors except 00h (= current capacity, which
# is already covered by Free Blocks/Track Size).
fmt_caps = []
for m in re.finditer(r"([0-9A-Fa-f]{2})h\(\d+\):\s+(\d+)\*2048", r.stdout):
if m.group(1).upper() == "00":
continue
fmt_caps.append(int(m.group(2)) * 2048)
candidates = [c for c in fmt_caps if 0 < c <= upper_bound]
if candidates:
return max(candidates)

return upper_bound
capacity = free_bytes if free_bytes > 0 else track_bytes
return capacity if capacity > 0 else None