Draft
Conversation
- Added InputConfig and FileInputConfig classes for handling file inputs. - Introduced ZMQInputConfig for ZMQ address configuration. - Refactored output configurations to inherit from FileOutput. - Added ZMQOutput class for handling streaming outputs to ZMQ addresses. - Updated the DarshanAnalyzer to use new input handling methods. - Enhanced DFTracerAnalyzer to support reading from ZMQ streams. - Added utility functions for handling pandas DataFrames and streaming. - Introduced new constants for file and host hashes. - Updated meson.build to include new utility files for pandas and streaming.
- Updated CI configuration to include 'streaming' in pip install dependencies. - Refactored DFAnalyzer to remove unused imports and improve code clarity. - Enhanced the epoch_window_via_dict class in streaming.py to better handle epoch events and added logging for error handling. - Added new tests for epoch window functionality, ensuring proper buffering and emission of events based on epoch.start and epoch.end signals. - Created new end-to-end tests for ZMQ analysis pipeline, validating the integration of streaming with the analyzer. - Removed outdated test_streaming.py and consolidated tests into more relevant files. - Added a tar.gz file containing real trace data for testing purposes.
…ailable for testing
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #29 +/- ##
===========================================
+ Coverage 57.48% 70.87% +13.39%
===========================================
Files 26 27 +1
Lines 2164 2414 +250
===========================================
+ Hits 1244 1711 +467
+ Misses 920 703 -217 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
… in set_unique_counts
- Updated `test_analyzer_dftracer_read_zmq` to use `dftracer_ai_logging_posix_events` instead of `epoch_posix_events`. - Modified `_test_e2e` in `test_e2e.py` to conditionally add `analyzer.assign_epochs=True` based on the analyzer and preset. - Changed `test_e2e_zmq` to utilize `dftracer_ai_logging_posix_events` for event data. - Introduced a new test file `test_metrics.py` with comprehensive tests for `set_main_metrics`, `set_view_metrics`, and `set_cross_layer_metrics` functions, ensuring proper handling of edge cases and metrics calculations.
…r time, count, and size
…ring "sum" for consistency
…lly for improved clarity and performance
…actions and ensure proper type casting
- Add _detect_fabric_protocol() to detect CXI fabric on Cray EX systems - Check for /dev/cxi* and /dev/hfi* devices for Slingshot interconnect - Fall back to TCP for local development - Support environment variable overrides for testing
Replace streamz-based streaming with a simpler callback architecture using direct ZMQ sockets. This removes the streamz dependency and provides more control over the streaming lifecycle. - Add zmq_io module with open_consumer/open_producer functions - Refactor analyze_zmq to use output_handler callback instead of Stream - Extract common streaming logic into _analyze_stream method - Update ZMQOutput to send results as multipart messages with parquet - Deprecate read_zmq and postread_zmq methods - Update tests to use new callback-based approach - Remove streamz-based streaming tests
…erformance insights Add a new analysis facts framework that evaluates trace data against configurable rules to produce structured performance findings. The implementation includes: - FactEngine and FactRule classes for rule-based fact evaluation - FactsConfig for enabling/disabling fact generation with configurable options - New dataclasses: AnalysisFact, FactWindow, FactScope, FactSeverity, FactProvenance, FactEnvelope for structured fact representation - DLIO-specific fact rules (dlio.yaml) for detecting common performance patterns (fetch_pressure, compute_dominance, epoch_straggler, rank_imbalance, etc.) - JSON schema for fact envelope validation - Integration with Analyzer to evaluate facts from flat views - Fact envelope output support in ZMQ and Mofka streaming outputs - Conditional debug output for metric_boundaries and flat_views The facts system allows users to define custom rules that evaluate trace metrics and emit structured findings with severity scores, confidence levels, and opportunity tags.
…mprovements - Add --num-ranks parameter to wait for all ranks' epoch.block events before triggering analysis (ensures 1 analysis per epoch, not per rank) - Add SIGTERM-based graceful shutdown with timestamp-bounded trace drain - Add drain summary logging (cat_counts, top event names per epoch) - Fix fact engine NA handling: use numpy fmax/fmin for nullable pd.NA values in derived metric max()/min() expressions - Add facts.evaluate.done info log and analysis_facts count to analysis_complete log for pipeline observability - Publish analysis_facts as JSON envelope to Mofka output topic - Update streaming tests for multi-rank and fact engine scenarios
…ming This change replaces the EpochBuffer implementation with a WindowBuffer that provides more flexible window boundary tracking across multiple ranks. The new implementation: - Uses WindowBoundaryTracker to handle overlapping windows and out-of-order boundary events from different ranks - Maintains backward compatibility by preserving the "epoch" field in events - Adds "window" field for new window-based analysis - Improves test coverage for window boundary tracking logic - Updates fact engine to support layer-scoped facts with fillna0 support The change enables more sophisticated streaming analysis patterns while maintaining compatibility with existing epoch-based workflows.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.