Add structured file logging with per-run correlation #263
Every msgvault invocation now leaves a durable, structured trail on
disk without the user having to redirect stderr. The default
destination is <data dir>/logs/msgvault-YYYY-MM-DD.log; stderr keeps
receiving the same human-readable text it always has.
internal/logging/logging.go:
- BuildHandler fans records to two sinks: text on stderr + JSON
on a daily log file. Per-process 6-byte hex run_id attached to
every record for cross-invocation correlation.
- Daily files rotate on size (50 MiB cap, 5 siblings kept).
- Degrades to stderr-only if the log directory can't be prepared.
cmd/msgvault/cmd/root.go:
- Installs the handler as slog default after config load.
- Writes "msgvault startup" and "msgvault exit" structured lines
with command, sanitized argv, version, os/arch, outcome.
- Panic recovery logs stack trace before exit.
- New persistent flags: --log-file, --log-level, --no-log-file.
- Config gains a [log] section (dir / level / disabled).
cmd/msgvault/cmd/logs.go:
- New 'msgvault logs' command: tail/filter the on-disk JSON logs.
- Flags: -n, -f, --run-id, --level, --grep, --all, --path.
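As a rough sketch of what filtering the JSON lines might involve, the example below applies run-id, level, and grep filters and keeps only the last n matches. The `filter` type, `filterLines`, and the `sample` data are hypothetical, and treating `--level` as a minimum level is an assumption rather than something the commit message states.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"log/slog"
	"strings"
)

// filter mirrors the idea of --run-id, --level, and --grep (hypothetical type).
type filter struct {
	RunID string // exact match on run_id; empty means any
	Level string // minimum level (assumption); empty means any
	Grep  string // substring match on the raw line; empty means any
}

// levelAtLeast reports whether level name have is >= level name floor.
func levelAtLeast(have, floor string) bool {
	var h, f slog.Level
	if h.UnmarshalText([]byte(have)) != nil || f.UnmarshalText([]byte(floor)) != nil {
		return false
	}
	return h >= f
}

func (f filter) match(line string) bool {
	if f.Grep != "" && !strings.Contains(line, f.Grep) {
		return false
	}
	var rec map[string]any
	if json.Unmarshal([]byte(line), &rec) != nil {
		return false // skip lines that are not valid JSON objects
	}
	if id, _ := rec["run_id"].(string); f.RunID != "" && id != f.RunID {
		return false
	}
	if lv, _ := rec["level"].(string); f.Level != "" && !levelAtLeast(lv, f.Level) {
		return false
	}
	return true
}

// filterLines returns the matching lines; when n > 0 only the last n are kept.
func filterLines(input string, f filter, n int) []string {
	var out []string
	sc := bufio.NewScanner(strings.NewReader(input))
	for sc.Scan() {
		line := sc.Text()
		if !f.match(line) {
			continue
		}
		out = append(out, line)
		if n > 0 && len(out) > n {
			out = out[1:] // drop the oldest match
		}
	}
	return out
}

// sample is made-up log content for illustration only.
const sample = `{"level":"INFO","msg":"msgvault startup","run_id":"aaa"}
{"level":"WARN","msg":"slow query","run_id":"aaa"}
{"level":"INFO","msg":"msgvault startup","run_id":"bbb"}`

func main() {
	for _, l := range filterLines(sample, filter{RunID: "aaa", Level: "warn"}, 0) {
		fmt.Println(l)
	}
}
```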
internal/tui/model.go:
- loadAccounts logs a structured line on success/failure.
Addresses wesm#129.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add structured slog calls to search, stats, list-accounts, and the TUI (loadData, loadMessages, loadSearchWithOffset). Each emits start/done/fail lines with duration_ms so the daily log file gives a full audit trail of what ran.
Initialise the package-level logger to a stderr text handler at declaration time so code paths that bypass PersistentPreRunE (tests, library embeds) never hit a nil pointer.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wires the store's sql.DB through a thin logging adapter so every
query leaves a structured trace in the daily log file. Combined
with the run_id attribute, this lets you answer "what SQL did that
command run?" and "which transaction took 5 seconds?" without
reaching for an external profiler.
internal/store/db_logger.go:
- loggedDB embeds *sql.DB and overrides Query, QueryContext,
QueryRow, QueryRowContext, Exec, ExecContext so every call
routes through logStmt/logStmtWith. Store methods that do
s.db.Query(...) compile unchanged and automatically pick up
the logging.
- Exec records the rows_affected count so write sizes show up
in the log alongside duration and stmt text.
- Statement text is normalized (whitespace collapsed) and
truncated to MaxStmtChars (default 300) so log lines stay
readable and disk usage stays bounded.
- Severity routing:
* errors -> WARN (always)
* slow queries -> WARN when duration >= SlowMs (default 100)
* full trace -> INFO for every query (--log-sql)
* otherwise -> DEBUG (only visible with --verbose)
- Known-benign migration errors ("duplicate column name",
"no such module: fts5") are downgraded to DEBUG so every
startup doesn't spam expected ALTER TABLE failures.
- ConfigureSQLLogging(opts) publishes SlowMs and FullTrace via
process-wide atomics so the CLI entry point can update them
once at startup without threading options through 32 store.Open
call sites.
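The normalization and severity routing listed above could look something like this. It is a sketch under assumptions: `normalizeStmt` is named in the commit message, but its signature here is guessed, and `severity`, `isBenign`, and `levelFor` are hypothetical helpers.

```go
package main

import (
	"errors"
	"fmt"
	"log/slog"
	"strings"
	"time"
)

// normalizeStmt collapses whitespace runs and truncates to max characters.
// Whether the real helper appends a truncation marker is not stated; this
// sketch does not.
func normalizeStmt(stmt string, max int) string {
	s := strings.Join(strings.Fields(stmt), " ")
	if r := []rune(s); max > 0 && len(r) > max {
		return string(r[:max])
	}
	return s
}

// benign lists the migration errors the commit message calls expected.
var benign = []string{"duplicate column name", "no such module: fts5"}

func isBenign(err error) bool {
	for _, b := range benign {
		if strings.Contains(err.Error(), b) {
			return true
		}
	}
	return false
}

// severity applies the routing rules in priority order.
func severity(err error, d, slow time.Duration, fullTrace bool) slog.Level {
	switch {
	case err != nil && isBenign(err):
		return slog.LevelDebug // expected migration noise
	case err != nil:
		return slog.LevelWarn // errors always surface
	case d >= slow:
		return slog.LevelWarn // slow query
	case fullTrace:
		return slog.LevelInfo // --log-sql
	default:
		return slog.LevelDebug
	}
}

// levelFor is a convenience wrapper over severity using plain values.
func levelFor(errMsg string, durMs, slowMs int64, fullTrace bool) string {
	var err error
	if errMsg != "" {
		err = errors.New(errMsg)
	}
	d := time.Duration(durMs) * time.Millisecond
	slow := time.Duration(slowMs) * time.Millisecond
	return severity(err, d, slow, fullTrace).String()
}

func main() {
	fmt.Println(normalizeStmt("SELECT  *\n\tFROM messages", 300))
	fmt.Println(levelFor("", 250, 100, false))
}
```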
internal/store/store.go:
- Store.db changed from *sql.DB to *loggedDB. DB() continues
to return *sql.DB so external consumers (DuckDB sqlite_scan,
for example) get the raw handle without a logging trampoline.
- withTx() now logs transaction begin/commit/rollback with
total duration. Transactions slower than SlowMs emit WARN;
normal commits emit DEBUG. Rollbacks emit INFO with the
triggering error's message in a 'reason' attribute.
- queryInChunks / execInChunks generic helpers take a new
chunkQuerier interface instead of *sql.DB so they accept
either a raw sql.DB (tests) or the logging wrapper
(production path). The interface covers exactly the two
methods they call.
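To illustrate why a narrow interface lets the chunk helpers accept either handle, here is a sketch. Which two methods `chunkQuerier` actually covers is not stated above, so `QueryContext` and `ExecContext` are an assumption, and the `execInChunks` signature, `fakeQuerier`, and `demoChunks` are hypothetical.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
	"strings"
)

// chunkQuerier is assumed to cover these two methods.
type chunkQuerier interface {
	QueryContext(ctx context.Context, query string, args ...any) (*sql.Rows, error)
	ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
}

// loggedDB stands in for the wrapper; embedding promotes *sql.DB's methods.
type loggedDB struct{ *sql.DB }

// Compile-time checks: both the raw handle and the wrapper satisfy the interface.
var (
	_ chunkQuerier = (*sql.DB)(nil)
	_ chunkQuerier = loggedDB{}
)

// execInChunks runs prefix + " (?,?,...)" once per chunk of at most size IDs
// and returns how many statements were executed.
func execInChunks[T any](ctx context.Context, q chunkQuerier, ids []T, size int, prefix string) (int, error) {
	if size <= 0 {
		return 0, fmt.Errorf("chunk size must be positive, got %d", size)
	}
	calls := 0
	for start := 0; start < len(ids); start += size {
		end := start + size
		if end > len(ids) {
			end = len(ids)
		}
		chunk := ids[start:end]
		args := make([]any, len(chunk))
		for i, v := range chunk {
			args[i] = v
		}
		placeholders := strings.TrimSuffix(strings.Repeat("?,", len(chunk)), ",")
		if _, err := q.ExecContext(ctx, prefix+" ("+placeholders+")", args...); err != nil {
			return calls, err
		}
		calls++
	}
	return calls, nil
}

// fakeQuerier records statements instead of touching a database.
type fakeQuerier struct{ stmts []string }

func (f *fakeQuerier) QueryContext(context.Context, string, ...any) (*sql.Rows, error) {
	return nil, nil
}

func (f *fakeQuerier) ExecContext(_ context.Context, q string, _ ...any) (sql.Result, error) {
	f.stmts = append(f.stmts, q)
	return nil, nil
}

// demoChunks deletes n made-up IDs in chunks and returns the statements issued.
func demoChunks(n, size int) []string {
	ids := make([]int, n)
	for i := range ids {
		ids[i] = i + 1
	}
	f := &fakeQuerier{}
	if _, err := execInChunks(context.Background(), f, ids, size, "DELETE FROM t WHERE id IN"); err != nil {
		return nil
	}
	return f.stmts
}

func main() {
	for _, s := range demoChunks(5, 2) {
		fmt.Println(s)
	}
}
```

Because the helper depends only on the interface, tests can pass a fake like the one here without opening a database.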
CLI wiring:
- --log-sql persistent flag: enables FullTrace (every query
at INFO level). Default off.
- --log-sql-slow-ms persistent flag: override the slow-query
threshold. Zero keeps the 100ms default.
- [log].sql_trace and [log].sql_slow_ms in config.toml mirror
the flags for per-install defaults.
- PersistentPreRunE calls store.ConfigureSQLLogging(...) after
building the handler so the first store.Open of the run
already sees the configured values.
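A minimal sketch of the process-wide atomics approach, assuming an options struct with `SlowMs` and `FullTrace` fields. The `SQLLogOptions` type name, the variable names, and `defaultSlowMs` are hypothetical; only `ConfigureSQLLogging`, `SlowMs`, and `FullTrace` appear in the commit message.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// defaultSlowMs is the 100 ms default mentioned above.
const defaultSlowMs = 100

// Process-wide settings, safe to read from any goroutine.
var (
	slowMs    atomic.Int64
	fullTrace atomic.Bool
)

func init() { slowMs.Store(defaultSlowMs) }

// SQLLogOptions is a hypothetical shape for the opts argument.
type SQLLogOptions struct {
	SlowMs    int64 // 0 keeps the default threshold
	FullTrace bool  // log every query at INFO
}

// ConfigureSQLLogging publishes the options once at startup so code that
// opens the store later does not need them threaded through.
func ConfigureSQLLogging(o SQLLogOptions) {
	if o.SlowMs > 0 {
		slowMs.Store(o.SlowMs)
	} else {
		slowMs.Store(defaultSlowMs)
	}
	fullTrace.Store(o.FullTrace)
}

func main() {
	ConfigureSQLLogging(SQLLogOptions{SlowMs: 0, FullTrace: true})
	fmt.Println(slowMs.Load(), fullTrace.Load())
}
```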
Tests cover Exec logging shape with rows_affected, slow-query
promotion to WARN with a synthetic duration, error always
logged, QueryRow emission, and normalizeStmt whitespace/truncation
behaviour. End-to-end smoke confirmed that the default mode is
quiet (no migration noise), --log-sql surfaces every query, and
--log-sql-slow-ms 1 warns on everything.
Bubble Tea takes over the terminal in alternate-screen mode. Any
slog write to stderr (e.g. the tui loadAccounts / loadData /
loadMessages diagnostic lines from the previous commits) corrupts
the render because Bubble Tea and slog end up racing for the same
file descriptor.
Fix: swap slog.Default() to a file-only logger for the duration
of the TUI run, restore it on return. The daily log file keeps
receiving every tui log line so 'msgvault logs -f' in another
pane still works exactly as before.
internal/logging/logging.go:
- Expose the JSON file handler as Result.FileHandler alongside
the existing multi-handler.
- Add Result.FileOnlyLogger() which returns a logger bound to
just the file sink, pre-attributed with the run_id so the
file entries correlate with the rest of the run. Returns a
discardHandler-backed logger when file logging is disabled
so the caller's swap still suppresses stderr.
- Add a minimal discardHandler type so FileOnlyLogger never
hands back nil.
cmd/msgvault/cmd/tui.go:
- Before p.Run() save slog.Default, swap to the file-only
logger, defer the restore. Only reads logResult if it's
non-nil so tests that bypass PersistentPreRunE still work.
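The swap-and-restore pattern can be sketched as below. Names such as `fileOnlyLogger` (a free function here rather than a method on Result) and `runQuietly` are hypothetical, and buffers stand in for stderr and the log file. The sketch installs its own text handler as the default first, mirroring the setup where a custom handler is already in place before the TUI starts.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"io"
	"log/slog"
	"strings"
)

// discardHandler drops every record; it exists so callers never get nil.
type discardHandler struct{}

func (discardHandler) Enabled(context.Context, slog.Level) bool  { return false }
func (discardHandler) Handle(context.Context, slog.Record) error { return nil }
func (d discardHandler) WithAttrs([]slog.Attr) slog.Handler      { return d }
func (d discardHandler) WithGroup(string) slog.Handler           { return d }

// fileOnlyLogger returns a logger bound to just the file sink, or a
// discarding logger when file logging is disabled (file == nil).
func fileOnlyLogger(file io.Writer, runID string) *slog.Logger {
	if file == nil {
		return slog.New(discardHandler{})
	}
	return slog.New(slog.NewJSONHandler(file, nil)).With("run_id", runID)
}

// runQuietly swaps the default logger for the duration of run.
func runQuietly(l *slog.Logger, run func()) {
	prev := slog.Default()
	slog.SetDefault(l)
	defer slog.SetDefault(prev)
	run()
}

// demo returns what the stand-in stderr and file sinks received.
func demo() (stderrOut, fileOut string) {
	var stderr, file bytes.Buffer
	slog.SetDefault(slog.New(slog.NewTextHandler(&stderr, nil)))
	slog.Info("before")
	runQuietly(fileOnlyLogger(&file, "abc123"), func() {
		slog.Info("tui loadData") // would corrupt the alt-screen if sent to stderr
	})
	slog.Info("after")
	return stderr.String(), file.String()
}

func has(s, sub string) bool { return strings.Contains(s, sub) }

func main() {
	e, f := demo()
	fmt.Print(e, f)
}
```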
roborev: Combined Review (
- findLogFiles: sort log files chronologically instead of lexicographically so rotated files (.log.1, .log.2) appear before the active .log for the same date, fixing --all ordering and ring-buffer tail correctness.
- --log-file: thread the full path through logging.Options.FilePath so the flag actually overrides the log file path as documented, not just the directory.
- followLogFile: buffer partial lines when ReadBytes returns a fragment at EOF mid-write, preventing silent loss of live log lines that span two reads.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
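The partial-line fix for follow mode can be illustrated with a small sketch. `lineFollower`, `poll`, and `demo` are hypothetical names, and a bytes.Buffer stands in for a log file that is still being written.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
)

// lineFollower returns only complete lines and carries any trailing
// fragment over to the next poll.
type lineFollower struct {
	r       *bufio.Reader
	pending []byte // fragment read at EOF that has no newline yet
}

// poll reads everything currently available.
func (f *lineFollower) poll() ([]string, error) {
	var lines []string
	for {
		chunk, err := f.r.ReadBytes('\n')
		f.pending = append(f.pending, chunk...)
		if err == io.EOF {
			return lines, nil // keep the fragment for the next poll
		}
		if err != nil {
			return lines, err
		}
		lines = append(lines, string(bytes.TrimSuffix(f.pending, []byte("\n"))))
		f.pending = f.pending[:0]
	}
}

// demo simulates a writer that is interrupted in the middle of a line.
func demo() [][]string {
	var src bytes.Buffer
	f := &lineFollower{r: bufio.NewReader(&src)}

	src.WriteString("{\"msg\":\"a\"}\n{\"msg\":")
	first, _ := f.poll()

	src.WriteString("\"b\"}\n")
	second, _ := f.poll()

	return [][]string{first, second}
}

func main() {
	for i, batch := range demo() {
		fmt.Println("poll", i+1, batch)
	}
}
```

Without the `pending` buffer, the fragment returned alongside io.EOF on the first poll would be either dropped or emitted as a broken line.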
Remove unused currentRun variable, add explicit _ discards for fmt.Fprint* return values, and apply linter auto-fixes for De Morgan's law and Sprintf-in-WriteString patterns.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
roborev: Combined Review (
…mp dirs
The log file handle was only closed in ExecuteContext's defer, but
tests invoke commands directly via rootCmd.Execute() — leaving the
file open. On Windows this blocks TempDir cleanup ("file is being
used by another process"); in Docker the root-owned logs dir can't
be removed by the host runner.
- Add PersistentPostRunE that closes logResult after every command.
- Close any previous logResult at the top of PersistentPreRunE so
repeated test invocations don't leak handles.
- Change log directory permissions from 0700 to 0755 so Docker
host cleanup can traverse directories created by the container.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
roborev: Combined Review (
Given that a user accessing the logs has access to the actual databases... this feels overly cautious. However, it's not a bad standard to uphold so I'm taking a pass at it.
- Fix exit log dropped on success: remove PersistentPostRunE close that ran before ExecuteContext could emit the exit record; the deferred close in ExecuteContext already handles shutdown.
- Redact sensitive input from logs: positional args (emails, queries) logged as argc at info, full values at debug only. Search and TUI log query_len/has_search/has_account instead of raw text. Account scope uses "filtered" instead of the email address.
- Fix log.dir path handling: expand ~ and resolve relative paths for cfg.Log.Dir in config.Load(), consistent with data_dir and oauth paths.
Addresses: wesm#263 (comment)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
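A sketch of the redaction approach described above: counts and booleans at info, raw values only at debug. The helper names (`logInvocation`, `searchAttrs`, `demo`, `leaks`), the `scope` attribute key, and the sample address are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
	"strings"
)

// logInvocation records only the argument count at info; the raw values
// appear at debug only.
func logInvocation(l *slog.Logger, args []string) {
	l.Info("msgvault startup", "argc", len(args))
	l.Debug("msgvault startup args", "args", args)
}

// searchAttrs describes a search without including its text or the account.
func searchAttrs(query, account string) []any {
	scope := "all"
	if account != "" {
		scope = "filtered" // never the email address itself
	}
	return []any{
		"query_len", len(query),
		"has_search", query != "",
		"has_account", account != "",
		"scope", scope,
	}
}

// demo returns the JSON log output at info level, or at debug when debug is true.
func demo(debug bool) string {
	level := slog.LevelInfo
	if debug {
		level = slog.LevelDebug
	}
	var buf bytes.Buffer
	l := slog.New(slog.NewJSONHandler(&buf, &slog.HandlerOptions{Level: level}))
	logInvocation(l, []string{"search", "from:alice@example.com"})
	l.Info("search start", searchAttrs("from:alice@example.com", "alice@example.com")...)
	return buf.String()
}

// leaks reports whether the output contains the given sensitive string.
func leaks(out, secret string) bool { return strings.Contains(out, secret) }

func main() {
	fmt.Print(demo(false))
}
```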
roborev: Combined Review (
… sort
- File logging is now opt-in by default. Enable via [log] enabled=true, [log] dir="...", or --log-file. Without any of these, msgvault only writes to stderr (the pre-existing behavior users expect). This eliminates the surprise disk-persistence surface for sensitive data.
- Fix panic logging loss: swap defer order in ExecuteContext so recoverAndLogPanic runs while the file handle is still open. Previously logResult.Close() ran first (LIFO), silently dropping the panic record.
- Fix rotated log chronological order: .log.5 (oldest) now sorts before .log.1 (newest), with the active .log file last. Previously the suffix sort was ascending, putting newer rotations before older.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
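The defer-order problem is easy to reproduce in isolation. In the sketch below, `fileSink`, `recoverAndLog`, and `run` are hypothetical stand-ins for the log file handle, recoverAndLogPanic, and ExecuteContext. Deferred calls run last-in first-out, so the close must be registered before the recovery for the recovery to run while the sink is still open.

```go
package main

import "fmt"

// fileSink stands in for the log file handle.
type fileSink struct {
	open  bool
	lines []string
}

// write drops the line when the sink is already closed.
func (s *fileSink) write(line string) {
	if s.open {
		s.lines = append(s.lines, line)
	}
}

func (s *fileSink) close() { s.open = false }

// recoverAndLog must be the deferred function itself so recover() works.
func recoverAndLog(s *fileSink) {
	if r := recover(); r != nil {
		s.write(fmt.Sprint("panic: ", r))
	}
}

// run panics inside a function with two defers and returns what the sink
// recorded. fixed selects the corrected registration order.
func run(fixed bool) []string {
	sink := &fileSink{open: true}
	func() {
		if fixed {
			defer sink.close()        // registered first, runs last
			defer recoverAndLog(sink) // registered last, runs first
		} else {
			defer recoverAndLog(sink) // runs after close: record is lost
			defer sink.close()        // registered last, runs first
		}
		panic("boom")
	}()
	return sink.lines
}

func main() {
	fmt.Println("old order:", run(false))
	fmt.Println("new order:", run(true))
}
```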
I have not deeply examined the panic logging handling here. However, I don't think this scenario represents a "high severity issue" given that we are adding missing logging rather than somehow improving an existing function. Regardless, I'm open to the improvement in this scenario.
This is turning into a feedback loop. I am going to set logging as being opt-in to address this feedback. Redaction approaches are always tricky, and I think what most people are looking for here is to understand what is happening. This is not a robust external facing service. This is a personal tool.
Fine. Although if the next review comment is timezone related I am going to just make them UUIDs. :-)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
roborev: Combined Review (
I also manually re-verified logging. I'm going to mark as ready.
thanks!
I had already added logging for my own work on the deduplicate and fixup features, so I am proposing durable structured logging to address #129 and likely to help with other development work. @wesm does this do what you want?
What changed
- internal/logging: new package that fans log records to two sinks: human-readable text on stderr and a rotating JSON file under <data dir>/logs/. Every process gets a short run_id hex tag on every record for cross-invocation grep.
- msgvault logs: new command to tail, filter, and follow the on-disk log files (-f, --run-id, --level, --grep, --all, --path).
- root.go: installs the handler at startup; writes structured msgvault startup / msgvault exit lines; recovers panics to disk; new flags --log-file, --log-level, --no-log-file; new [log] config section.
- … duration_ms.
- loadData, loadMessages, and search operations emit structured lines; stderr logging is suppressed during alt-screen to prevent render corruption.
- --log-sql and --log-sql-slow-ms flags; slow queries (>= 100 ms by default) emit WARN; every query optionally traced at INFO.
Why
Warnings and errors during sync workloads were silently dropped. The log file gives operators a reliable post-hoc record of what msgvault did and why it failed, without requiring stderr redirection.