Optimize `set_layer_metrics` batching by izzet · Pull Request #52 · llnl/dfanalyzer

izzet · 2026-03-06T02:08:12Z

This pull request refactors the `set_layer_metrics` method in `analyzer.py` to improve performance and correctness when generating derived metric columns, and adds a new test suite to verify its behavior. The main changes are precomputing column types and numeric representations, building derived columns in memory before appending them, and adding tests for correctness and performance.

Refactor and performance improvements in metric computation

Precompute column type information (`is_size_col`, `is_string_col`) and numeric representations (`numeric_cols`) once per source column to avoid repeated computation in `set_layer_metrics`.
Build all derived metric columns in memory using a dictionary and append them to the DataFrame in a single operation, reducing fragmentation and improving performance.
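The batching pattern described above can be sketched roughly as follows. This is a minimal illustration with plain pandas, not the code from `analyzer.py`: the function name `add_derived_columns`, the `masks` mapping, and the sample columns are all assumptions made for the example.

```python
import pandas as pd


def add_derived_columns(df, source_cols, masks):
    """Append one derived column per (metric, source column) pair.

    `masks` maps a hypothetical metric name to a boolean Series that was
    evaluated once for that metric.
    """
    # Coerce each source column to numeric a single time, not once per metric.
    numeric_cols = {
        col: pd.to_numeric(df[col], errors="coerce") for col in source_cols
    }

    # Collect the derived columns in a plain dict instead of assigning
    # them to the DataFrame one by one.
    derived = {}
    for metric, mask in masks.items():
        for col, values in numeric_cols.items():
            # Rows outside the mask become NaN.
            derived[f"{metric}_{col}"] = values.where(mask)

    # A single concat appends every derived column at once.
    return pd.concat([df, pd.DataFrame(derived, index=df.index)], axis=1)


events = pd.DataFrame(
    {"op": ["read", "write", "read"], "time": [1.0, 2.0, 3.0], "count": [1, 1, 2]}
)
masks = {"read": events["op"] == "read", "write": events["op"] == "write"}
result = add_derived_columns(events, ["time", "count"], masks)
print(list(result.columns))
```

Inserting many columns one at a time is what triggers pandas' fragmentation warning, so gathering them in a dict and concatenating once is the usual way to avoid it.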

Correctness and logic changes

Ensure that size-related derived columns are only created for metrics explicitly listed in `size_derived_metrics`, fixing previous logic that could produce unintended columns.
For string-derived columns, use `None` for non-matching rows so that downstream processing (e.g., `unique_set_flatten`) correctly skips missing values.
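A rough sketch of these two rules, under stated assumptions: the helper `derive_columns`, the rule that a size column is one whose name contains "size", and the stand-in `unique_non_missing` are hypothetical and only illustrate the behavior described above. They are not taken from the project.

```python
import pandas as pd


def derive_columns(df, source_cols, masks, size_derived_metrics):
    derived = {}
    for col in source_cols:
        # Classify the column once (the "size" substring rule is an assumption).
        is_size_col = "size" in col
        is_string_col = pd.api.types.is_string_dtype(df[col])
        for metric, mask in masks.items():
            # Size columns are derived only for explicitly listed metrics.
            if is_size_col and metric not in size_derived_metrics:
                continue
            if is_string_col:
                # Non-matching rows get None rather than an empty string.
                values = [v if keep else None for v, keep in zip(df[col], mask)]
                derived[f"{metric}_{col}"] = pd.Series(
                    values, index=df.index, dtype=object
                )
            else:
                # Numeric columns use NaN for non-matching rows.
                derived[f"{metric}_{col}"] = df[col].where(mask)
    return pd.concat([df, pd.DataFrame(derived, index=df.index)], axis=1)


def unique_non_missing(series):
    """Hypothetical stand-in for a downstream step that skips missing values."""
    return {v for v in series if pd.notna(v)}


events = pd.DataFrame(
    {
        "op": ["read", "write", "read"],
        "file_name": ["a.dat", "b.dat", "a.dat"],
        "size": [10, 20, 30],
    }
)
masks = {"read": events["op"] == "read", "write": events["op"] == "write"}
out = derive_columns(events, ["file_name", "size"], masks, ["read"])
```

Here `write_size` is never created because `write` is not in the list passed as `size_derived_metrics`, while `read_file_name` keeps the file name only on rows where the `read` mask is true.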

New tests for correctness and performance

Add `tests/test_set_layer_metrics.py` with correctness tests that verify column creation, value propagation, and missing-value handling, plus a performance smoke test that exercises repeated calls.
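The shape of such tests might look like the sketch below. It does not reproduce the contents of `tests/test_set_layer_metrics.py`: `set_layer_metrics_stub` is a hypothetical stand-in for the method under test, and the repeat count and time budget are arbitrary assumptions chosen to be generous.

```python
import time

import pandas as pd


def set_layer_metrics_stub(df, masks):
    """Hypothetical stand-in for the method under test."""
    derived = {f"{m}_time": df["time"].where(mask) for m, mask in masks.items()}
    return pd.concat([df, pd.DataFrame(derived, index=df.index)], axis=1)


def make_frame():
    return pd.DataFrame({"op": ["read", "write", "read"], "time": [1.0, 2.0, 3.0]})


def test_columns_and_values():
    df = make_frame()
    out = set_layer_metrics_stub(df, {"read": df["op"] == "read"})
    # Column creation
    assert "read_time" in out.columns
    # Value propagation on matching rows
    assert out["read_time"].iloc[0] == 1.0
    # Missing value on non-matching rows
    assert pd.isna(out["read_time"].iloc[1])


def test_repeated_calls_smoke(repeats=20, budget_seconds=5.0):
    df = make_frame()
    masks = {"read": df["op"] == "read"}
    start = time.perf_counter()
    for _ in range(repeats):
        set_layer_metrics_stub(df, masks)
    # A loose upper bound: this only catches gross slowdowns.
    assert time.perf_counter() - start < budget_seconds
```

A smoke test like this is deliberately coarse. It guards against a pathological regression on repeated calls rather than measuring a precise speedup.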

Copilot

Pull request overview

Refactors Analyzer.set_layer_metrics to reduce repeated per-column work and mitigate DataFrame fragmentation during derived-metric column generation, and adds a dedicated test module to validate expected behavior.

Changes:

Precomputes column classification and numeric coercions, and evaluates each derived-metric condition once per metric.
Builds all derived columns in-memory and appends them to hlm in a single concat.
Adds tests/test_set_layer_metrics.py covering correctness plus a basic performance smoke loop.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `python/dftracer/analyzer/analyzer.py` | Refactors `set_layer_metrics` to batch derived-column creation and reduce repeated evaluation/coercion. |
| `tests/test_set_layer_metrics.py` | Introduces tests for derived-column correctness and a repeat-call perf smoke test. |

Optimize set_layer_metrics batching

b141ff4

izzet self-assigned this Mar 6, 2026

izzet added the `enhancement` label (New feature or request) Mar 6, 2026

izzet requested review from Copilot, hariharan-devarajan and rayandrew March 6, 2026 02:08

Copilot started reviewing on behalf of izzet March 6, 2026 02:08

Copilot AI reviewed Mar 6, 2026

Review threads:

python/dftracer/analyzer/analyzer.py (outdated, resolved)
tests/test_set_layer_metrics.py (resolved)
python/dftracer/analyzer/analyzer.py (outdated, resolved)

izzet added 2 commits March 6, 2026 12:04

Handle set-like derived metrics safely

221f70e

Mark derived metric tests for smoke and full

8ccc85d

izzet wants to merge 3 commits into llnl:develop from izzet:feature/set-layer-metrics-batch
