Add pipeline trace logging (enable_pipeline_trace setting)#22
Conversation
Adds two debugging capabilities to the pipeline execution system: 1. Static pipeline graph: after query initialization, prints all pipelines and their dependency relationships to stderr. 2. Runtime timing trace: records wall-clock start/end times for each pipeline and outputs a Chrome Trace JSON report to stderr on query completion. The JSON is loadable in https://ui.perfetto.dev/ or chrome://tracing for an interactive Gantt visualization. Usage: SET enable_pipeline_trace = true; SELECT ...; -- pipeline graph + Chrome trace JSON go to stderr New files: src/include/duckdb/parallel/pipeline_tracer.hpp src/parallel/pipeline_tracer.cpp Key changes: - Pipeline: add pipeline_id, start_time_ns, end_time_ns, MarkStart/MarkEnd - PipelineEvent: call MarkStart in Schedule(), MarkEnd in FinishEvent() - PipelineFinishEvent: call MarkEnd in FinishEvent() (last-call wins, so finalization time is captured for base pipelines) - Executor: assign IDs, save traced_pipelines ref, print graph at init and Chrome trace JSON before pipelines.clear() - Settings: add enable_pipeline_trace (local bool, default false) https://claude.ai/code/session_01UV8WQWq1RyBsyTAxsbu4yG
client_config.hpp is already included transitively via client_context.hpp. https://claude.ai/code/session_01UV8WQWq1RyBsyTAxsbu4yG
Allows directing each trace output to a dedicated file:
SET pipeline_graph_output = '/tmp/graph.txt'; -- pipeline dependency graph
SET pipeline_trace_output = '/tmp/trace.json'; -- Chrome Trace JSON (Perfetto)
When a path is empty (default), the respective output falls back to stderr.
If the file cannot be opened, a warning is prepended and stderr is used.
Changes:
- PipelineTracer: add WriteOutput() helper (file vs stderr), update
PrintGraph/PrintChromeTrace signatures to accept output_path
- ClientConfig: add pipeline_graph_output / pipeline_trace_output fields
- Settings: add PipelineGraphOutputSetting / PipelineTraceOutputSetting
- settings.json: register both new settings
- custom_settings.cpp: implement SetLocal/ResetLocal/GetSetting for both
- executor.cpp: thread file paths through to tracer calls
https://claude.ai/code/session_01UV8WQWq1RyBsyTAxsbu4yG
ybtsdst
left a comment
There was a problem hiding this comment.
整体看下来:结构干净、改动范围合理、关闭时零开销。以下按严重程度记录一些点。
可能有问题的
1. settings.json 缺 default_value 字段
新加的三个 setting(enable_pipeline_trace、pipeline_graph_output、pipeline_trace_output)都没写 default_value,而旁边 enable_object_cache、http_logging_output 等都写了。即使 custom_implementation: true 真正默认值来自 ClientConfig 成员默认值,settings codegen / 文档生成可能仍依赖该字段。建议补:
"default_value": "false" // enable_pipeline_trace
"default_value": "" // 两个路径2. MarkStart / MarkEnd 写竞争(理论上 data race)
start_time_ns、end_time_ns 是 plain int64_t,跨线程写入(Schedule 在调度线程,FinishEvent 在 worker 线程,后续 PrintChromeTrace 在持 executor_lock 的线程读取)。实践上 event 完成边界 + executor_lock 提供 happens-before,但 TSAN 会报警。改成 atomic<int64_t>,或加注释说明依赖。
3. 文件写入用 trunc 而非 append
pipeline_tracer.cpp 用 std::ios::trunc,同一 session 跑多个 query 会互相覆盖,只剩最后一条。要么改 append,要么在描述/help 里写明"每次 query 覆盖"。
4. PipelineEvent::Schedule() 抛异常时 MarkEnd 不会被调用
MarkStart 在 try 之外,如果 pipeline->Schedule(event) 抛异常,FinishEvent 不会跑,end_time_ns 留 -1。PrintChromeTrace 已过滤 <0 的 pipeline,所以不出错,但失败 pipeline 在 trace 里完全消失。可在 catch 里也补一次 MarkEnd 以保留存在性。
风格 / 小瑕疵
5. Chrome Trace 的 tid: pipeline_id 语义不准
每条 pipeline 实际跨多个 worker 线程并行执行。用 tid=pipeline_id 让每条 pipeline 在 Perfetto 里独占一行,可视化够用,但严格说不是线程。如果想要真实并行度,可按 worker thread tid 输出多个 events,但属于另一种设计取舍。
6. JSON 转义只处理了 \"
PhysicalOperator::GetName() 通常是字母,低风险,但严格 JSON 还需 escape \\ 和控制字符。
7. trace 文件缺 metadata
Chrome Trace 支持 displayTimeUnit、otherData 等顶层字段。加 \"displayTimeUnit\":\"ns\" 能让 Perfetto 显示精度更准。可选。
8. friend class PipelineTracer 略冗余
pipeline_tracer.cpp 访问的字段都已有 public getter (GetPipelineId() 等)。要么用 getter 去掉 friend,要么删 getter 仅靠 friend,两者都留有点重。
设计上 OK 的点
enable_pipeline_trace=false时仅一次 bool 检查,热路径开销可忽略。traced_pipelines持shared_ptr让 timing 数据活过pipelines.clear(),合理。MarkEndlast-call-wins 让PipelineFinishEvent覆盖PipelineEvent,把 finalize 时间也包进 base pipeline,设计有想法。- 文件打不开时 fallback 到 stderr 并 warning,体验 OK。
…end/trunc, Replace API
- src/main/config.cpp: register 3 new settings in internal_options[] (DUCKDB_LOCAL)
and adjust setting_aliases[] indexes for entries shifted by the inserts.
Equivalent to running scripts/generate_settings.py; done by hand to keep the
diff minimal.
- src/include/duckdb/parallel/pipeline.hpp: make start_time_ns/end_time_ns atomic;
MarkStart/MarkEnd run on different threads than PipelineTracer::PrintChromeTrace.
- src/parallel/pipeline_tracer.cpp:
- WriteOutput takes an `append` flag; PrintGraph appends, PrintChromeTrace
truncates (Chrome Trace JSON must remain a single valid object).
- StringUtil::ReplaceAll -> StringUtil::Replace (the in-place ReplaceAll
does not exist; Replace is the global, returning variant).
- src/include/duckdb/parallel/pipeline_tracer.hpp: document the append flag.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously the tracer recorded a single (start, end) pair per Pipeline, which collapsed all parallel PipelineTasks (created by LaunchScanTasks according to MaxThreads()) into one logical event. Real parallelism and per-worker timing were invisible. Now each PipelineTask records its own lifetime: - task_start_ns and task_thread_hash are captured on the first ExecuteTask() entry (handles PROCESS_PARTIAL yielding by checking task_start_ns < 0 sentinel, so re-entry doesn't overwrite the start); - on TASK_FINISHED return, the task appends a TaskTiming record to Pipeline::task_timings under task_timings_lock. PrintChromeTrace iterates task_timings and emits one Chrome Trace event per task, with tid = thread_hash so each worker thread occupies its own row in Perfetto, making real parallelism and load distribution directly visible. Pipeline-level MarkStart/MarkEnd and the start_time_ns/end_time_ns atomics are removed; PipelineEvent::Schedule/FinishEvent and PipelineFinishEvent::FinishEvent no longer touch timing. The graph output (PrintGraph) is structural-only and is unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
为 DuckDB pipeline 执行系统添加调试/可视化能力,通过 session 变量控制:
三个 setting 互相独立。文件无法打开时自动回退 stderr 并打印警告前缀。graph 是纯文本 append(多查询累积);trace JSON 必须是单一合法 JSON,所以 truncate(每条 query 覆盖)。
Output
Pipeline graph (
pipeline_graph_output):Chrome Trace JSON (
pipeline_trace_output,可拖入 https://ui.perfetto.dev/):每个并行 task 一条 event(不是每个 pipeline 一条),
tid= worker thread id 的 hash,所以每个工作线程占 Perfetto 的一行,真实并行度可见。{"traceEvents":[ {"name":"#0[0]: TableScan→HashJoin","ph":"X","pid":0,"tid":1832912,"ts":0,"dur":12480}, {"name":"#0[1]: TableScan→HashJoin","ph":"X","pid":0,"tid":1832913,"ts":40,"dur":12300}, {"name":"#1[0]: TableScan→HashJoinBuild","ph":"X","pid":0,"tid":1832914,"ts":0,"dur":45120} ]}Design
Per-task wall-clock capture (in
PipelineTask)PipelineTask { int64_t task_start_ns = -1; // 首次 ExecuteTask 进入时设; // 任务 yield + reschedule 不重置 → 包含 blocked 间隔 uint64_t task_thread_hash = 0; // std::hash<thread::id>,首次进入时锁定 }ExecuteTask入口:若task_start_ns < 0则记录 start_ns + thread hash。返回TASK_FINISHED前调用pipeline.RecordTaskTiming(start, end, hash)。Per-task storage (in
Pipeline)RecordTaskTiming持锁push_back。读路径只在 query end 单线程读一次,不需要 RCU/CAS。Trace emit lifecycle (in
Executor)InitializeInternal:enable_pipeline_trace=true时AssignIds、记录pipeline_trace_start_ns(query 起点)、打印 graph、把pipelines复制到traced_pipelines(shared_ptr,所有权延续)。ExecuteTask(query 完成路径):在pipelines.clear()之前 emit Chrome Trace,然后清traced_pipelines。Reset:清traced_pipelines+pipeline_trace_start_ns。traced_pipelines用vector<shared_ptr<Pipeline>>复制而非 raw 引用,是因为 timing 数据存活在 Pipeline 对象上,要等pipelines.clear()之后才 emit,shared_ptr 持有保证 Pipeline 不被析构。Overhead
Executor::InitializeInternal一个 bool 检查 +ExecuteTask一个traced_pipelines.empty()检查,基本零开销。steady_clock::now()× 2 + 一次 thread::id hash + 一次 mutexpush_back。task 数量级下完全忽略。Changes
New files
src/include/duckdb/parallel/pipeline_tracer.hppPipelineTracer类src/parallel/pipeline_tracer.cppAssignIds/PrintGraph/PrintChromeTrace/WriteOutputModified files
src/include/duckdb/parallel/pipeline.hppPipelineTask:task_start_ns+task_thread_hash;Pipeline:pipeline_id、TaskTiming结构、task_timings+task_timings_lock、RecordTaskTiming、GetPipelineId、friendPipelineTracersrc/parallel/pipeline.cppPipelineTask::ExecuteTask首次进入捕获 start/thread,TASK_FINISHED前RecordTaskTiming;Pipeline::RecordTaskTiming加锁 appendsrc/include/duckdb/execution/executor.hpptraced_pipelines、pipeline_trace_start_nssrc/parallel/executor.cppInitializeInternal末尾 AssignIds + 记 query start ns + 打印 graph + 备份 traced_pipelines;ExecuteTask在pipelines.clear()前 emit Chrome Trace;Reset清状态src/include/duckdb/main/client_config.hppenable_pipeline_trace、pipeline_graph_output、pipeline_trace_outputsrc/include/duckdb/main/settings.hppEnablePipelineTraceSetting、PipelineGraphOutputSetting、PipelineTraceOutputSettingsrc/main/settings/custom_settings.cppSetLocal/ResetLocal/GetSettingsrc/common/settings.jsonsrc/main/config.cppsrc/parallel/CMakeLists.txtpipeline_tracer.cpp进 unity build