Skip to content

105 har missing modules#106

Merged
otavioon merged 10 commits into
discovery-unicamp:mainfrom
gustavo-luz:105-har-missing-modules
May 19, 2026
Merged

105 har missing modules#106
otavioon merged 10 commits into
discovery-unicamp:mainfrom
gustavo-luz:105-har-missing-modules

Conversation

@gustavo-luz
Copy link
Copy Markdown
Contributor

@gustavo-luz gustavo-luz commented May 15, 2026

Describe your changes

Adds the components required to replicate the paper
"Benchmarking Encoders and Self-Supervised Learning for Smartphone-Based Human Activity Recognition"
(IEEE Access, https://ieeexplore.ieee.org/document/11417778) on top of current Minerva main. The paper's experiment fork contained modules that were never upstreamed, so the public release cannot currently reproduce the paper's pre-training pipeline. This PR brings those modules in.

Data layer

  • SeriesDataset : Time-series base dataset (parent class for the HAR datasets).
  • HARDatasetCPC (Rodrigues 2024), TNCDataset and HarDataset (Xu 2023).
  • MultiModalHARSeriesDataModule and per-dataset HAR DataModules (har.py, har_rodrigues_24.py, har_xu_23.py).

Models

  • DIET : SSL technique, one of the four benchmarked in the paper. The other three (TNC, TF-C, LFR) are already
    in upstream.
  • DIETLinear : linear projection head used by DIET.
  • Extensions to LFRHARArchitecture: optional permute flag on HARSCnnEncoder (used when feeding (B, T, C) instead of (B, C, T))

Tests

New pytest files under tests/ covering SeriesDataset (folder-CSV + multimodal), both HAR datasets, both HAR DataModules, DIETLinear, DIET

Commit layout

  1. feat(data): add SeriesDataset base class for time series
  2. feat(data): add HAR datasets for Rodrigues 2024 and Xu 2023
  3. feat(data): add HAR DataModules (generic + Rodrigues 24, Xu 23)
  4. feat(ssl): add DIET and linear head
  5. feat(nets): extend LFRHARArchitecture to be compatible with TNC (permute)
  6. test: cover SeriesDataset, HAR datasets/DataModules, DIET, DIETLinear

No changes to existing public APIs. pyproject.toml, requirements.txt, and every existing __init__.py are byte-identical to upstream, no dependency or export changes required.

Issue ticket number and link (If apply)

Closes #105 [Feature Request] Missing modules required to replicate published HAR paper.

Checklist before requesting a review

  • I have performed a self-review of my code.
  • I have added tests to this functionality.
  • All tests are passing.
  • [N/A] I have updated the documentation as needed.
  • I have followed the project's coding guidelines.

No new dependencies are introduced. pyproject.toml and requirements.txt are unchanged from upstream main.

Summary by CodeRabbit

  • New Features

    • Added multiple HAR data modules and dataset formats (CSV files, folder-per-sample, preprocessed .npy, CPC/TNC) with configurable transforms, sampling (percentage / per-class), and train/val/test/predict handling
    • Introduced multi-modal time-series datasets and sliding-window utilities
    • Added DIET components: DIET LightningModule, DIETLinear head, and AdaptedHead wrapper
    • HAR CNN encoder gains optional input permutation
  • Tests

    • Comprehensive tests covering new datasets, data modules, sampling behaviors, windowing utilities, and DIET/DIETLinear/AdaptedHead smoke and integration tests

Review Change Stack

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 15, 2026

Warning

Rate limit exceeded

@otavioon has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 23 minutes and 29 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 9896ee4e-0b89-4372-9fab-823faac7be1c

📥 Commits

Reviewing files that changed from the base of the PR and between 67c4f2b and 86654ff.

📒 Files selected for processing (1)
  • minerva/data/data_modules/har_xu_23.py
📝 Walkthrough

Walkthrough

This PR adds CSV/NPY time-series dataset abstractions, two HAR-specific dataset implementations (Rodrigues/CPC windowing and Xu/TNC neighbor sampling), multiple Lightning DataModules (folder-based, multimodal, CPC, and Xu variants), a DIET LightningModule with automatic linear-head construction, a DIET linear/adapter module, a small LFR encoder permute option, and comprehensive tests covering datasets, datamodules, and model training paths.

Changes

HAR Dataset Infrastructure and DIET SSL Model

Layer / File(s) Summary
CSV-based time-series dataset abstractions
minerva/data/datasets/series_dataset.py
MultiModalSeriesCSVDataset loads wide CSVs with feature-prefix selection, optional reshape to (C, T) or flattened vectors, per-sample transforms, label handling and optional map_labels. SeriesFolderCSVDataset loads per-sample CSV files, supports eager/lazy caching, cyclic padding to max length, per-sample transforms, and label handling.
Rodrigues/CPC HAR dataset & utilities
minerva/data/datasets/har_rodrigues_24.py
Utilities (norm_shape, sliding_window, opp_sliding_window) for strided window extraction and alignment. HARDatasetCPC loads per-phase CSVs across provided paths, windowizes features/labels, supports return_index_as_label, optional transpose, and split reuse/concatenation (use_train_as_val, use_val_with_train).
Xu/TNC temporal-neighbor dataset
minerva/data/datasets/har_xu_23.py
TNCDataset samples center windows, close neighbors (ADF-driven or cosine-similarity ranking), and distant non-neighbors; HarDataset loads subsequence .npy features/labels, enforces matching lengths, and returns torch tensors (optional flatten).
UserActivity & MultiModal DataModules
minerva/data/data_modules/har.py
UserActivityFolderDataModule wraps SeriesFolderCSVDataset for folder-organized splits with per-split transforms. MultiModalHARSeriesDataModule wraps MultiModalSeriesCSVDataset across domain directories, supports data_percentage (random train subset) or samples_per_class (class-balanced train subset), map_labels, optional validation concatenation into training, reuse of train as validation, and optional RandomDomainSampler batching.
HARDataModuleCPC (Rodrigues variant)
minerva/data/data_modules/har_rodrigues_24.py
HARDataModuleCPC wires HARDatasetCPC for train/val/test with sliding-window params (input_size, window, overlap), batching, workers, drop_last, and optional transpose; provides dataloader accessors with training shuffle.
HarDataModule and Downstream (Xu variant)
minerva/data/data_modules/har_xu_23.py
HarDataModule loads .npy splits and builds TNCDataset dataloaders for contrastive training. HarDataModule_Downstream builds HarDataset-backed dataloaders with configurable feature-prefixes and target column, centralizing dataloader creation.
DIET LightningModule
minerva/models/ssl/diet.py
DIET wraps a backbone with optional adapter and flattening, forwards through linear_head to logits, lazily constructs linear_head from backbone output size and training datamodule when missing, uses CrossEntropyLoss(label_smoothing=0.8), Adam optimizer, and optional WarmupCosineAnnealingLR scheduler; asserts linear_head.out_features == num_data.
DIET linear head and adapter
minerva/models/nets/diet_linear.py
DIETLinear is a simple nn.Linear(in_features, out_features). AdaptedHead applies an adapter callable to inputs before delegating to a wrapped model.
LFR encoder permute option
minerva/models/nets/lfr_har_architectures.py
HARSCnnEncoder gains an optional permute flag; when true, input is permuted (0,2,1) before convolutional processing.

Comprehensive Test Coverage

Layer / File(s) Summary
Tests: SeriesFolderCSVDataset & MultiModalSeriesCSVDataset
tests/data/datasets/test_series_dataset_folder_csv.py, tests/data/datasets/test_series_dataset_multimodal.py
Fixture-based tests for initialization, feature/label extraction, single/multi-prefix selection, features=None/label=None modes, lazy caching, padding behavior, transform application (single/list), map_labels validation, return_index_as_label, and string representations and error cases.
Tests: HARDatasetCPC & sliding-window utilities
tests/data/datasets/test_har_rodrigues_24.py
Unit tests for norm_shape, sliding_window, opp_sliding_window and integration tests for HARDatasetCPC initialization, __getitem__ shapes/labels, __len__, return_index_as_label mode, transpose_data effect, and split management (use_train_as_val, use_val_with_train).
Tests: TNCDataset & HarDataset
tests/data/datasets/test_har_xu_23.py
TNCDataset tests for length and neighbor/non-neighbor outputs, degenerate correlation scenarios, short-series errors, identical-sample behavior; HarDataset tests for length, feature/label shapes/dtypes, flattening, invalid paths, and mismatch errors.
Tests: DataModules and dataloader behavior
tests/data/data_modules/test_data_module_har.py, tests/data/data_modules/test_data_module_tnc.py
Tests for MultiModalHARSeriesDataModule default setup, dataset sizes, train-only data_percentage downsampling, samples_per_class balancing with seed determinism and subset properties, mutual-exclusivity and bounds errors; HarDataModule test for use_val_with_train toggle.
Tests: DIET and DIETLinear / AdaptedHead
tests/models/nets/test_diet_linear.py, tests/models/ssl/test_diet.py
Forward-pass smoke tests for DIETLinear and AdaptedHead. DIET integration tests verify training with explicit linear_head, automatic head construction, and assertion errors when linear_head.out_features mismatches dataset num_data.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 44.03% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title '105 har missing modules' is vague and generic; it references an issue number but doesn't clearly convey the main change—adding missing HAR data/model components to support paper reproduction. Use a more descriptive title like 'Add HAR data modules and DIET SSL for paper reproducibility' to clarify the scope.
✅ Passed checks (3 passed)
Check name Status Explanation
Description check ✅ Passed The PR description is comprehensive, covering data layer, models, tests, commit layout, dependencies, and checklist items with detailed explanations.
Linked Issues check ✅ Passed The PR fully addresses issue #105 objectives: adds time-series base dataset, HAR datasets (Rodrigues 2024, Xu 2023), DataModules, DIET SSL, DIETLinear head, permute flag for LFR architecture, and comprehensive tests.
Out of Scope Changes check ✅ Passed All changes align with PR objectives: new data modules, HAR datasets, DataModules, DIET/DIETLinear models, LFR extension, and tests. Minor import reordering in existing test files is aligned with the new structure.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 17

🧹 Nitpick comments (9)
tests/models/nets/test_diet_linear.py (1)

17-17: ⚡ Quick win

Avoid hard-coding batch size in adapter reshape.

Line 17 hard-codes 32, which makes the test fragile if the input batch size changes. Use the runtime batch dimension instead.

♻️ Proposed fix
-    adapted_model = AdaptedHead(model=model, adapter=lambda x: x.reshape(32, -1))
+    adapted_model = AdaptedHead(model=model, adapter=lambda x: x.reshape(x.size(0), -1))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/models/nets/test_diet_linear.py` at line 17, The test hard-codes a
batch size in the adapter passed to AdaptedHead (adapter=lambda x: x.reshape(32,
-1)); replace the fixed 32 with the runtime batch dimension so the adapter uses
x.shape[0] (or x.size(0)) to compute the first dimension dynamically; update the
adapter to reshape using the input's batch size (e.g., lambda x:
x.reshape(x.shape[0], -1) or lambda x: x.view(x.size(0), -1)) so AdaptedHead
works for any batch size.
minerva/data/data_modules/har_xu_23.py (2)

170-177: 💤 Low value

Mutable default argument (Ruff B006).

Same issue as HarDataset: replace the list literal default with None and assign the fallback inside __init__ to avoid shared-state surprises if the module is ever instantiated multiple times with mutations to the list.

♻️ Proposed change
-        feature_column_prefixes: List[str] = [
-            "accel-x",
-            "accel-y",
-            "accel-z",
-            "gyro-x",
-            "gyro-y",
-            "gyro-z",
-        ],
+        feature_column_prefixes: Optional[List[str]] = None,
         target_column: str = "standard activity code",
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@minerva/data/data_modules/har_xu_23.py` around lines 170 - 177, The parameter
feature_column_prefixes is using a mutable default list; change its signature in
the class constructor (where feature_column_prefixes is defined) to
feature_column_prefixes: Optional[List[str]] = None (import Optional from typing
if not already), and inside __init__ of the same class (the constructor that
currently uses feature_column_prefixes) set self.feature_column_prefixes =
feature_column_prefixes or
["accel-x","accel-y","accel-z","gyro-x","gyro-y","gyro-z"] so each instance gets
its own list rather than sharing a single default mutable object.

70-97: ⚡ Quick win

Docstring default contradicts signature; consider deferring IO to setup().

  • Lines 70–72: docstring says “By default, this is True” for use_val_with_train, but the signature default is False. Fix the doc.
  • Lines 84–86: all three .npy arrays (including test) are loaded eagerly in __init__. With Lightning DataModules the idiomatic location is setup(stage) so the heavy IO runs once per process and only loads the splits required for the current stage; this also lets the module be constructed cheaply (and pickled across DDP workers without holding the arrays).
  • Lines 95–97: leftover commented print calls — drop them.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@minerva/data/data_modules/har_xu_23.py` around lines 70 - 97, The docstring
for use_val_with_train contradicts the function signature—update the docstring
so its default matches the __init__ signature (use_val_with_train default False)
or change the parameter default to True; then remove eager file IO from __init__
(np.load of "train_data.npy", "val_data.npy", "test_data.npy" that populate
har_train, har_val, har_test) and defer those loads into a new or existing
setup(self, stage=None) method so data is loaded per stage (and the DataModule
can be cheaply instantiated/pickled); also delete the commented print lines at
the end. Ensure logic that handles use_train_as_val and use_val_with_train
remains in setup and still references har_train/har_val/har_test.
minerva/data/datasets/har_xu_23.py (2)

334-410: ⚡ Quick win

Mutable default + feature_column_prefixes / target_column appear unused.

Two concerns on the HarDataset API:

  1. feature_column_prefixes: List[str] = [...] (Lines 334–341) is a mutable default (Ruff B006). Replace with None and initialize inside __init__.
  2. feature_column_prefixes and target_column are assigned to self but never consulted in __getitem__ — features and labels come straight from *_data_subseq.npy / *_labels_subseq.npy. The constructor implies column selection that doesn't actually happen, which will confuse downstream users and tests. Either implement the selection or drop the parameters from the public signature.
♻️ Suggested fix for the mutable default
-        feature_column_prefixes: List[str] = [
-            "accel-x",
-            "accel-y",
-            "accel-z",
-            "gyro-x",
-            "gyro-y",
-            "gyro-z",
-        ],
+        feature_column_prefixes: Optional[List[str]] = None,
         target_column: str = "standard activity code",
         flatten: bool = False,
     ):
         ...
-        self.feature_column_prefixes = feature_column_prefixes
+        self.feature_column_prefixes = feature_column_prefixes or [
+            "accel-x", "accel-y", "accel-z", "gyro-x", "gyro-y", "gyro-z",
+        ]

Also applies to: 441-446

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@minerva/data/datasets/har_xu_23.py` around lines 334 - 410, The HarDataset
__init__ uses a mutable default for feature_column_prefixes and exposes
feature_column_prefixes and target_column that are never used in __getitem__;
change the signature to use feature_column_prefixes: Optional[List[str]] = None
(and set self.feature_column_prefixes = ["accel-x",... ] if None inside
__init__) to fix the Ruff B006 mutable-default issue, and then either (A)
implement column selection in __getitem__ (use self.feature_column_prefixes and
self.target_column to index or map loaded arrays before returning
features/labels) or (B) remove feature_column_prefixes and target_column from
the public API and delete their assignments in __init__ so the constructor
matches the actual behavior; update any tests/usage accordingly and preserve
flatten handling already on self.flatten.

201-218: ⚡ Quick win

Replace bare except with a narrow exception type.

Bare except: (Line 217) also catches KeyboardInterrupt and SystemExit, which is dangerous inside long-running training loops. adfuller typically raises ValueError/numpy.linalg.LinAlgError on degenerate windows — catch those (or at minimum Exception) instead. Ruff E722.

♻️ Proposed change
-                except:
+                except (ValueError, np.linalg.LinAlgError, Exception):
                     corr.append(0.6)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@minerva/data/datasets/har_xu_23.py` around lines 201 - 218, The bare except
in the loop that computes p-values from adfuller (inside the for w_t loop
updating corr) should be tightened to catch specific exceptions (e.g.,
ValueError and numpy.linalg.LinAlgError, or at minimum Exception) instead of
catching all exceptions; update the try/except around the adfuller call and p
computation (references: adfuller call, p_val, corr.append) to catch the narrow
exception types and fall back to appending 0.6 on those known failures so
KeyboardInterrupt/SystemExit are not swallowed.
minerva/data/datasets/har_rodrigues_24.py (1)

2-10: 💤 Low value

Duplicate import numpy as np.

Lines 2 and 9 both import numpy as np. Remove the second one.

♻️ Proposed fix
 import numpy as np
 import os
 from minerva.utils.typing import PathLike
 import torch
 from torch.utils.data import Dataset
 from pathlib import Path
 import pandas as pd
-import numpy as np
 from numpy.lib.stride_tricks import as_strided as ast
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@minerva/data/datasets/har_rodrigues_24.py` around lines 2 - 10, Remove the
duplicate import of NumPy: there are two identical lines importing "numpy as np"
in minerva/data/datasets/har_rodrigues_24.py (the second occurrence in the shown
diff); delete the redundant "import numpy as np" so only a single import remains
and ensure no other references are altered.
minerva/data/data_modules/har_rodrigues_24.py (1)

100-125: 💤 Low value

Repeated DataLoader construction differs only by split and shuffle.

The three *_dataloader methods duplicate identical kwargs. A small helper (mirroring MultiModalHARSeriesDataModule._get_loader) would remove the duplication and make future changes (e.g., adding pin_memory, persistent_workers) a single edit.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@minerva/data/data_modules/har_rodrigues_24.py` around lines 100 - 125,
Extract the repeated DataLoader construction into a private helper (e.g., add a
method like _get_loader(self, dataset, shuffle: bool)) and have
train_dataloader, val_dataloader, and test_dataloader call this helper with
self.train_dataset/self.val_dataset/self.test_dataset and the appropriate
shuffle flag; the helper should centralize shared kwargs
(batch_size=self.batch_size, drop_last=self.drop_last,
num_workers=self.num_workers and easily add pin_memory or persistent_workers
later) and mirror the pattern used by MultiModalHARSeriesDataModule._get_loader
to keep behavior consistent.
minerva/data/data_modules/har.py (1)

660-686: 💤 Low value

Replace print with a logger in _get_loader.

Lines 662–664 and 678 use print to signal the loader configuration; this spams stdout on every dataloader creation and is not filterable. Use logging (or remove entirely once stable).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@minerva/data/data_modules/har.py` around lines 660 - 686, In _get_loader
replace the two print(...) calls that announce the DataLoader configuration with
logger calls: use the module/class logger (e.g., logging.getLogger(__name__) or
an existing self.logger) and emit an appropriate level (info or debug) message
that includes n_domains_per_sample and batch_size for the RandomDomainSampler
branch and the shuffle value for the standard DataLoader branch; update
references around RandomDomainSampler, DataLoader, n_domains_per_sample,
batch_size, and drop_last to use the logger so messages are filterable and not
printed to stdout.
minerva/data/datasets/series_dataset.py (1)

343-377: ⚖️ Poor tradeoff

pad=True defeats lazy=True by reading every CSV at init.

_get_longest_sample_size calls self[i] for every index inside _disable_fix_length. When lazy=True, each __getitem__ triggers _read_csv, reading and parsing every CSV at construction time just to determine the longest sequence. Users requesting lazy mode get neither the memory savings nor the deferred I/O they expect.

Consider scanning only file row-counts (e.g., counting newlines or using sum(1 for _ in open(f))) instead of fully parsing each CSV.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@minerva/data/datasets/series_dataset.py` around lines 343 - 377,
_get_longest_sample_size currently forces eager loading by calling self[i]
(which triggers __getitem__ -> _read_csv) inside _disable_fix_length; change it
to scan file sizes directly from disk instead of indexing into the dataset:
iterate over files from _scan_data(), open each Path and count data rows (e.g.,
count lines or use csv.reader to skip header) to compute the longest sequence
length, respect the early return when not self.pad, and remove the reliance on
_disable_fix_length for this operation so lazy=True no longer causes full
parsing at init.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@minerva/data/data_modules/har_rodrigues_24.py`:
- Around line 25-54: Update the class docstring to accurately describe the
constructor parameter semantics: change the description of use_val_with_train to
state that when True the validation partition is concatenated into the training
partition (not replaced), explicitly document use_train_as_val behavior, and add
missing parameter entries for columns, num_workers, and drop_last with brief
types/defaults; reference HARDatasetCPC.load_dataset and the constructor
parameter names (use_val_with_train, use_train_as_val, columns, num_workers,
drop_last) so readers can correlate behavior with the loader implementation.
- Around line 55-98: The datasets are being built in __init__ causing
HARDatasetCPC to call load_dataset() three times (re-reading CSVs and rerunning
opp_sliding_window) and breaking Lightning's stage-aware lifecycle; move all
expensive dataset construction into setup(stage) and instead load the raw arrays
once (call load_dataset() / the CSV reader a single time and perform
opp_sliding_window once) then slice those arrays per phase
('train'/'val'/'test') and instantiate HARDatasetCPC with phase-specific slices
(or add a constructor flag to HARDatasetCPC to accept preloaded arrays) while
preserving use_val_with_train logic so concatenation happens only once.

In `@minerva/data/data_modules/har_xu_23.py`:
- Around line 260-269: The train_dataloader function currently calls
self._get_dataset_dataloader("train", shuffle=False); change this so the
training DataLoader is created with shuffle=True by calling
self._get_dataset_dataloader("train", shuffle=True) in the train_dataloader
method (function name: train_dataloader) to ensure training batches are shuffled
unless a sampler is intentionally provided elsewhere.

In `@minerva/data/data_modules/har.py`:
- Around line 87-94: The default features tuple is getting wrapped into a
single-item list downstream causing KeyError; fix by normalizing to a proper
list (not a list-of-tuple). Either change the default declaration of features to
a list literal or, better, coerce when storing/forwarding: in
UserActivityFolderDataModule (where features is stored) set self.features =
list(features) and/or update SeriesFolderCSVDataset.__init__ to convert non-list
iterables correctly (replace the current if not isinstance(features, list):
features = [features] with features = list(features) for iterable inputs).
Ensure the unique symbols touched are features (the default),
UserActivityFolderDataModule (assignment of features), and
SeriesFolderCSVDataset.__init__ (features normalization).
- Around line 528-538: The code currently materializes the entire Subset by
iterating over subset_ to build x_list and y_list solely for a debug print,
causing unnecessary O(N) work; remove the loop that populates x_list/y_list and
the debug print (the lines referencing x_list, y_list and the print f'\n\ny list
is {y_list} \n\n\n'), leaving sampled_indices, Subset(dataset, sampled_indices)
assigned to subset_ and returning subset_ unchanged so the DataLoader can lazily
load samples.

In `@minerva/data/datasets/har_rodrigues_24.py`:
- Around line 254-255: The labels are being cast to np.uint8 which will wrap
values ≥256 and breaks the "return_index_as_label" path where y =
np.arange(len(x)) can exceed 255; in the block that sets
datasets[phase]["labels"] (the same area referencing label and
"return_index_as_label"), change the cast to a wider integer type (e.g.,
np.int64) or conditionally skip/cast differently when label ==
"return_index_as_label" so sample indices are preserved without overflow.
- Around line 237-255: The CSV file iteration using phase_path.glob("*.csv") is
nondeterministic and may yield empty lists causing np.concatenate to throw;
update the loop in the loader that uses self.paths / phase_path (where data_x
and data_y are built and assigned into datasets[phase]) to iterate over a sorted
list of files (e.g., sorted(phase_path.glob("*.csv"))) to ensure reproducible
order, and handle the case of no files by checking if data_x/data_y are
non-empty before calling np.concatenate (or initialize datasets[phase]["data"]
and ["labels"] to empty arrays of the correct dtype when no CSVs are found) so
concatenation never raises ValueError.
- Around line 200-203: The current non-deterministic use of np.random.shuffle in
the label-assignment branch makes index labels unreproducible; update the
dataset to accept a seed (e.g., add a seed parameter to the constructor of the
class containing self.data and self.labels) and replace np.random.shuffle with a
per-instance RNG: create rng = np.random.default_rng(self.seed) and call
rng.shuffle(datum_index) when computing datum_index in the block guarded by if
self.label and self.label == "return_index_as_label"; ensure the new seed
parameter is stored on the instance (self.seed) and used anywhere else
randomness is required so behavior is fully reproducible.

In `@minerva/data/datasets/har_xu_23.py`:
- Around line 122-130: The constructor for TNCDataset does not validate that the
series length T is greater than 4 * window_size, causing
np.random.randint(2*window_size, T - 2*window_size) in __getitem__ to raise
low>=high later; add a validation in TNCDataset.__init__ after
self.T/self.window_size are set that checks if self.T > 4 * self.window_size and
raise a ValueError with a clear message (mention T and window_size) if the check
fails so misconfiguration fails fast; keep existing epsilon/delta logic intact.
- Around line 352-360: Docstring filenames are inconsistent with the actual
loader which uses "{annotate}_data_subseq.npy" for all splits and there is a
stale commented-out load line; update the docstring to list
train_data_subseq.npy, train_labels_subseq.npy, val_data_subseq.npy,
val_labels_subseq.npy, test_data_subseq.npy, test_labels_subseq.npy (matching
the loader that uses the annotate variable) and remove the stale comment "#
self.labels = np.load(...)" in the class/function that constructs/loads the
dataset (refer to the annotate variable used when building file names and the
constructor/__init__ or load method where the commented line appears).
- Around line 320-326: The fallback branch that checks `if len(x_n) == 0` is
unreachable and would crash if hit because it calls `.unsqueeze` on a NumPy
array; remove or guard this dead branch in the method that builds negatives
(look for variables `x_n`, `x`, `self.time_series` in the same function) and
instead ensure inputs and `np.random.randint(..., size=self.mc_sample_size)` are
called with valid bounds (or assert `self.mc_sample_size > 0`) so an empty `x_n`
cannot occur; if you still need a safe fallback, replace the `.unsqueeze(0)`
call with `np.expand_dims(x, 0)` (or `x[None, ...]`) to avoid AttributeError.

In `@minerva/data/datasets/series_dataset.py`:
- Around line 142-163: When feature_prefixes is None the code overwrites
self.feature_prefixes with all non-label columns then blindly reshapes using
len(self.feature_prefixes], collapsing time to 1; change this by detecting the
auto-discovery case before the reshape: preserve whether prefixes were
explicitly provided (e.g. store a boolean like was_prefixes_provided before you
overwrite), and if was_prefixes_provided is False and self.features_as_channels
is True either skip the channel reshape (do not call data.reshape(...)) or raise
a clear error asking for explicit feature_prefixes when
features_as_channels=True; modify the branch around the feature_prefixes
assignment and the block that calls data.reshape(...) to implement this guard
using the symbols self.feature_prefixes, self.features_as_channels, and the
existing data.reshape call.

In `@minerva/models/nets/diet_linear.py`:
- Around line 11-13: In DietLinear.forward, don't call self.model.forward(x)
directly; instead pass the adapted tensor through the model via the module call
(use self.model(x)) so PyTorch's __call__ and hooks run; update the forward
method that currently does x = self.adapter(x); return self.model.forward(x) to
use self.model(x) after adapter, leaving self.adapter and self.model references
intact.

In `@minerva/models/ssl/diet.py`:
- Around line 92-106: The lazy linear-head construction uses a CPU tensor for
sizing which can mismatch the backbone's device; update the random_input
creation to move it to the same device as the model/backbone (e.g., use the
backbone's parameter device or a model attribute) before calling self.backbone,
so that the forwards for self.backbone (and optional self.adapter) run on the
correct device and out.size(1) is computed without device errors; keep the rest
of the logic (flattening, encoding_size and creating self.linear_head)
unchanged.
- Around line 1-8: There are two identical imports of the torch module in
minerva/models/ssl/diet.py; remove the duplicate import so only a single "import
torch" remains (keep whichever import line fits the file's import
ordering/style), then run the linter/formatter to ensure imports are clean; look
for the duplicate symbol "torch" in the import block to locate the lines to
change.
- Line 19: Update the type annotation for the parameter named loss to be
explicit Optional[Callable] instead of Callable = None; import Optional from
typing if not already imported and ensure the signature (e.g., in the
constructor or function where loss is defined) uses Optional[Callable] for
clarity and PEP 484 compliance (keep the default = None unchanged).

In `@tests/data/datasets/test_series_dataset_multimodal.py`:
- Around line 368-377: The test currently shuffles labels_from_dataset in-place
which mutates the original returned labels and can hide defects; instead, keep
labels_from_dataset immutable and operate on a copy: create a copy (e.g.,
shuffled_labels = labels_from_dataset.copy()), shuffle that copy until it
differs from ground_truth, run the length and non-identity assertions against
shuffled_labels, and keep the final sorted check using the original
labels_from_dataset (sorted_labels_from_dataset = np.sort(labels_from_dataset))
to verify return_index_as_label preserves the identity when sorted.

---

Nitpick comments:
In `@minerva/data/data_modules/har_rodrigues_24.py`:
- Around line 100-125: Extract the repeated DataLoader construction into a
private helper (e.g., add a method like _get_loader(self, dataset, shuffle:
bool)) and have train_dataloader, val_dataloader, and test_dataloader call this
helper with self.train_dataset/self.val_dataset/self.test_dataset and the
appropriate shuffle flag; the helper should centralize shared kwargs
(batch_size=self.batch_size, drop_last=self.drop_last,
num_workers=self.num_workers and easily add pin_memory or persistent_workers
later) and mirror the pattern used by MultiModalHARSeriesDataModule._get_loader
to keep behavior consistent.

In `@minerva/data/data_modules/har_xu_23.py`:
- Around line 170-177: The parameter feature_column_prefixes is using a mutable
default list; change its signature in the class constructor (where
feature_column_prefixes is defined) to feature_column_prefixes:
Optional[List[str]] = None (import Optional from typing if not already), and
inside __init__ of the same class (the constructor that currently uses
feature_column_prefixes) set self.feature_column_prefixes =
feature_column_prefixes or
["accel-x","accel-y","accel-z","gyro-x","gyro-y","gyro-z"] so each instance gets
its own list rather than sharing a single default mutable object.
- Around line 70-97: The docstring for use_val_with_train contradicts the
function signature—update the docstring so its default matches the __init__
signature (use_val_with_train default False) or change the parameter default to
True; then remove eager file IO from __init__ (np.load of "train_data.npy",
"val_data.npy", "test_data.npy" that populate har_train, har_val, har_test) and
defer those loads into a new or existing setup(self, stage=None) method so data
is loaded per stage (and the DataModule can be cheaply instantiated/pickled);
also delete the commented print lines at the end. Ensure logic that handles
use_train_as_val and use_val_with_train remains in setup and still references
har_train/har_val/har_test.

In `@minerva/data/data_modules/har.py`:
- Around line 660-686: In _get_loader replace the two print(...) calls that
announce the DataLoader configuration with logger calls: use the module/class
logger (e.g., logging.getLogger(__name__) or an existing self.logger) and emit
an appropriate level (info or debug) message that includes n_domains_per_sample
and batch_size for the RandomDomainSampler branch and the shuffle value for the
standard DataLoader branch; update references around RandomDomainSampler,
DataLoader, n_domains_per_sample, batch_size, and drop_last to use the logger so
messages are filterable and not printed to stdout.

In `@minerva/data/datasets/har_rodrigues_24.py`:
- Around line 2-10: Remove the duplicate import of NumPy: there are two
identical lines importing "numpy as np" in
minerva/data/datasets/har_rodrigues_24.py (the second occurrence in the shown
diff); delete the redundant "import numpy as np" so only a single import remains
and ensure no other references are altered.

In `@minerva/data/datasets/har_xu_23.py`:
- Around line 334-410: The HarDataset __init__ uses a mutable default for
feature_column_prefixes and exposes feature_column_prefixes and target_column
that are never used in __getitem__; change the signature to use
feature_column_prefixes: Optional[List[str]] = None (and set
self.feature_column_prefixes = ["accel-x",... ] if None inside __init__) to fix
the Ruff B006 mutable-default issue, and then either (A) implement column
selection in __getitem__ (use self.feature_column_prefixes and
self.target_column to index or map loaded arrays before returning
features/labels) or (B) remove feature_column_prefixes and target_column from
the public API and delete their assignments in __init__ so the constructor
matches the actual behavior; update any tests/usage accordingly and preserve
flatten handling already on self.flatten.
- Around line 201-218: The bare except in the loop that computes p-values from
adfuller (inside the for w_t loop updating corr) should be tightened to catch
specific exceptions (e.g., ValueError and numpy.linalg.LinAlgError, or at
minimum Exception) instead of catching all exceptions; update the try/except
around the adfuller call and p computation (references: adfuller call, p_val,
corr.append) to catch the narrow exception types and fall back to appending 0.6
on those known failures so KeyboardInterrupt/SystemExit are not swallowed.

In `@minerva/data/datasets/series_dataset.py`:
- Around line 343-377: _get_longest_sample_size currently forces eager loading
by calling self[i] (which triggers __getitem__ -> _read_csv) inside
_disable_fix_length; change it to scan file sizes directly from disk instead of
indexing into the dataset: iterate over files from _scan_data(), open each Path
and count data rows (e.g., count lines or use csv.reader to skip header) to
compute the longest sequence length, respect the early return when not self.pad,
and remove the reliance on _disable_fix_length for this operation so lazy=True
no longer causes full parsing at init.

In `@tests/models/nets/test_diet_linear.py`:
- Line 17: The test hard-codes a batch size in the adapter passed to AdaptedHead
(adapter=lambda x: x.reshape(32, -1)); replace the fixed 32 with the runtime
batch dimension so the adapter uses x.shape[0] (or x.size(0)) to compute the
first dimension dynamically; update the adapter to reshape using the input's
batch size (e.g., lambda x: x.reshape(x.shape[0], -1) or lambda x:
x.view(x.size(0), -1)) so AdaptedHead works for any batch size.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 5781ce65-6b2b-41a3-a422-80fdb566cd28

📥 Commits

Reviewing files that changed from the base of the PR and between 7df0485 and 72cb741.

📒 Files selected for processing (17)
  • minerva/data/data_modules/har.py
  • minerva/data/data_modules/har_rodrigues_24.py
  • minerva/data/data_modules/har_xu_23.py
  • minerva/data/datasets/har_rodrigues_24.py
  • minerva/data/datasets/har_xu_23.py
  • minerva/data/datasets/series_dataset.py
  • minerva/models/nets/diet_linear.py
  • minerva/models/nets/lfr_har_architectures.py
  • minerva/models/ssl/diet.py
  • tests/data/data_modules/test_data_module_har.py
  • tests/data/data_modules/test_data_module_tnc.py
  • tests/data/datasets/test_har_rodrigues_24.py
  • tests/data/datasets/test_har_xu_23.py
  • tests/data/datasets/test_series_dataset_folder_csv.py
  • tests/data/datasets/test_series_dataset_multimodal.py
  • tests/models/nets/test_diet_linear.py
  • tests/models/ssl/test_diet.py

Comment thread minerva/data/data_modules/har_rodrigues_24.py
Comment thread minerva/data/data_modules/har_rodrigues_24.py
Comment thread minerva/data/data_modules/har_xu_23.py Outdated
Comment thread minerva/data/data_modules/har.py
Comment thread minerva/data/data_modules/har.py
Comment thread minerva/models/nets/diet_linear.py
Comment thread minerva/models/ssl/diet.py
Comment thread minerva/models/ssl/diet.py
Comment thread minerva/models/ssl/diet.py
Comment thread tests/data/datasets/test_series_dataset_multimodal.py
@otavioon otavioon self-requested a review May 15, 2026 16:37
otavioon added 2 commits May 19, 2026 14:40
Removed the HarDataModule_Downstream class and its methods, which handled loading and batching of data for human activity recognition tasks.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

♻️ Duplicate comments (1)
tests/data/datasets/test_series_dataset_multimodal.py (1)

369-378: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not mutate dataset output in the assertion path.

At Line 373, shuffling labels_from_dataset in-place weakens the test and can let a broken return_index_as_label implementation pass.

✅ Suggested fix
-    labels_from_dataset = dataset[:][1]
     ground_truth = np.arange(len(dataset))
-
-    while sum(abs(labels_from_dataset - ground_truth)) == 0:
-        np.random.shuffle(labels_from_dataset)
-
-    assert len(labels_from_dataset) == len(ground_truth)
-    assert sum(abs(labels_from_dataset - ground_truth)) > 0
-    sorted_labels_from_dataset = np.sort(labels_from_dataset)
-    assert sum(abs(sorted_labels_from_dataset - ground_truth)) == 0
+    draws = [dataset[:][1] for _ in range(8)]
+    for labels_from_dataset in draws:
+        assert len(labels_from_dataset) == len(ground_truth)
+        np.testing.assert_array_equal(np.sort(labels_from_dataset), ground_truth)
+
+    assert any(
+        not np.array_equal(labels_from_dataset, ground_truth)
+        for labels_from_dataset in draws
+    ), "return_index_as_label should eventually return a non-identity permutation."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/data/datasets/test_series_dataset_multimodal.py` around lines 369 -
378, The test currently shuffles labels_from_dataset in-place which mutates the
dataset output; instead make and shuffle a copy, e.g. create a copy
(labels_shuffled = labels_from_dataset.copy() or np.copy(labels_from_dataset)),
run the while loop and np.random.shuffle on that copy, and use labels_shuffled
for the inequality assertions and sorting; keep labels_from_dataset unchanged
for the final sorted comparison with ground_truth so the original dataset output
is not mutated (referencing variables labels_from_dataset, labels_shuffled,
ground_truth, and dataset).
🧹 Nitpick comments (1)
minerva/data/data_modules/har.py (1)

653-654: ⚡ Quick win

Replace debug print() calls with proper logging.

These print statements output on every dataloader creation, polluting stdout during training runs. Consider using Python's logging module or removing them entirely.

♻️ Proposed fix
+import logging
+
+logger = logging.getLogger(__name__)
+
 # In _get_loader method:
         if self.n_domains_per_sample is not None:
-            print(
-                f"Using DataLoader with RandomDomainSampler with n_domains_per_sample={self.n_domains_per_sample}"
-            )
+            logger.debug(
+                "Using DataLoader with RandomDomainSampler with n_domains_per_sample=%d",
+                self.n_domains_per_sample,
+            )
             # ... sampler code ...
         else:
-            print(f"Using DataLoader with shuffle={shuffle}")
+            logger.debug("Using DataLoader with shuffle=%s", shuffle)

Also applies to: 669-669

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@minerva/data/data_modules/har.py` around lines 653 - 654, Replace the debug
print statements in the HAR data module with proper logging: locate the print
calls that output "Using DataLoader with RandomDomainSampler with
n_domains_per_sample={self.n_domains_per_sample}" (and the similar one around
line 669) inside the HAR data module (e.g., methods that build the DataLoader /
RandomDomainSampler like train_dataloader/val_dataloader setup). Add or reuse a
module-level logger (logger = logging.getLogger(__name__)) and change the prints
to logger.debug(...) or logger.info(...) as appropriate, passing the formatted
message and avoiding stdout prints so training logs remain clean.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@minerva/data/data_modules/har_xu_23.py`:
- Around line 72-75: The docstring for the parameter use_val_with_train
contradicts the actual default (the signature sets use_val_with_train=False);
update the docstring to state the correct default of False (or change the
function/class signature to True if you intended that behavior). Locate the
parameter use_val_with_train in the function or class signature in har_xu_23.py
and make the docstring parameter description match the actual default value so
they are consistent.
- Around line 172-179: The constructor currently uses a mutable list default for
feature_column_prefixes; change the parameter to accept None (e.g.,
feature_column_prefixes: Optional[List[str]] = None) and inside __init__ set
self.feature_column_prefixes = feature_column_prefixes or
["accel-x","accel-y","accel-z","gyro-x","gyro-y","gyro-z"] so the default list
is created per-instance; update the signature and assignment in the HarXu23 (or
the class containing feature_column_prefixes) and remove the mutable default
from the parameter list.

In `@minerva/data/data_modules/har.py`:
- Around line 415-416: The docstring for the HAR data module incorrectly
documents the parameter shuffle_train as type `str`; update the docstring to
reflect the actual type (`bool`) where shuffle_train is described (e.g., in the
class/function docstring that documents shuffle_train in
minerva.data.data_modules.har.py) so the doc comment matches the parameter
declaration and usage.

In `@minerva/data/datasets/har_xu_23.py`:
- Around line 219-220: The bare except block around the adfuller call should be
replaced with specific exception handling; update the try/except that currently
does "except: corr.append(0.6)" to catch the expected exceptions (e.g.,
ValueError, numpy.linalg.LinAlgError, or at minimum Exception) and append the
fallback 0.6 only for those cases; locate the adfuller invocation and the
corr.append call to make this change (ensure you import/qualify LinAlgError if
used) so we don't swallow system-exiting exceptions like
KeyboardInterrupt/SystemExit.
- Around line 336-343: The class currently defines feature_column_prefixes as a
module/class-level default list which is mutable; change the signature to accept
feature_column_prefixes: Optional[List[str]] = None and in the class __init__
assign self.feature_column_prefixes = feature_column_prefixes or
["accel-x","accel-y","accel-z","gyro-x","gyro-y","gyro-z"] so each instance gets
its own list and no shared mutable default is used; update any references to
feature_column_prefixes to use self.feature_column_prefixes.

In `@minerva/models/ssl/diet.py`:
- Around line 95-100: The lazy head probe creation assumes sliceable Dataset and
CPU tensors; fix by obtaining a sample input robustly (use try: sample =
training_dataset[0] except: sample = next(iter(training_dataset))) then extract
the tensor input (if sample is tuple/list use sample[0]) and infer device from
the backbone (device = next(self.backbone.parameters()).device or
next(self.backbone.buffers()).device). Create random_input with the same shape
on that device (torch.rand(sample_input.shape, device=device)) so the probe
supports all Dataset types and matches the backbone device before calling
self.backbone to compute embeddings (preserve handling of self.adapter and
subsequent code).

---

Duplicate comments:
In `@tests/data/datasets/test_series_dataset_multimodal.py`:
- Around line 369-378: The test currently shuffles labels_from_dataset in-place
which mutates the dataset output; instead make and shuffle a copy, e.g. create a
copy (labels_shuffled = labels_from_dataset.copy() or
np.copy(labels_from_dataset)), run the while loop and np.random.shuffle on that
copy, and use labels_shuffled for the inequality assertions and sorting; keep
labels_from_dataset unchanged for the final sorted comparison with ground_truth
so the original dataset output is not mutated (referencing variables
labels_from_dataset, labels_shuffled, ground_truth, and dataset).

---

Nitpick comments:
In `@minerva/data/data_modules/har.py`:
- Around line 653-654: Replace the debug print statements in the HAR data module
with proper logging: locate the print calls that output "Using DataLoader with
RandomDomainSampler with n_domains_per_sample={self.n_domains_per_sample}" (and
the similar one around line 669) inside the HAR data module (e.g., methods that
build the DataLoader / RandomDomainSampler like train_dataloader/val_dataloader
setup). Add or reuse a module-level logger (logger =
logging.getLogger(__name__)) and change the prints to logger.debug(...) or
logger.info(...) as appropriate, passing the formatted message and avoiding
stdout prints so training logs remain clean.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 148136c1-d979-461b-9f66-bc2ead1963bd

📥 Commits

Reviewing files that changed from the base of the PR and between 72cb741 and 67c4f2b.

📒 Files selected for processing (19)
  • minerva/data/data_modules/har.py
  • minerva/data/data_modules/har_rodrigues_24.py
  • minerva/data/data_modules/har_xu_23.py
  • minerva/data/datasets/har_rodrigues_24.py
  • minerva/data/datasets/har_xu_23.py
  • minerva/data/datasets/series_dataset.py
  • minerva/models/nets/diet_linear.py
  • minerva/models/nets/lfr_har_architectures.py
  • minerva/models/ssl/diet.py
  • tests/data/data_modules/test_data_module_har.py
  • tests/data/data_modules/test_data_module_tnc.py
  • tests/data/datasets/test_har_rodrigues_24.py
  • tests/data/datasets/test_har_xu_23.py
  • tests/data/datasets/test_series_dataset_folder_csv.py
  • tests/data/datasets/test_series_dataset_multimodal.py
  • tests/models/nets/test_diet_linear.py
  • tests/models/nets/test_lfr_har_architectures.py
  • tests/models/ssl/test_diet.py
  • tests/models/ssl/test_lfr_implementation_har.py
✅ Files skipped from review due to trivial changes (3)
  • tests/models/nets/test_lfr_har_architectures.py
  • minerva/models/nets/diet_linear.py
  • tests/models/ssl/test_lfr_implementation_har.py
🚧 Files skipped from review as they are similar to previous changes (9)
  • tests/data/data_modules/test_data_module_tnc.py
  • tests/models/nets/test_diet_linear.py
  • minerva/data/data_modules/har_rodrigues_24.py
  • minerva/data/datasets/har_rodrigues_24.py
  • tests/models/ssl/test_diet.py
  • minerva/models/nets/lfr_har_architectures.py
  • tests/data/datasets/test_har_xu_23.py
  • tests/data/datasets/test_har_rodrigues_24.py
  • minerva/data/datasets/series_dataset.py

Comment thread minerva/data/data_modules/har_xu_23.py
Comment thread minerva/data/data_modules/har_xu_23.py Outdated
Comment thread minerva/data/data_modules/har.py
Comment thread minerva/data/datasets/har_xu_23.py
Comment thread minerva/data/datasets/har_xu_23.py
Comment thread minerva/models/ssl/diet.py
@otavioon
Copy link
Copy Markdown
Collaborator

LGTM

@otavioon otavioon merged commit 03b0a01 into discovery-unicamp:main May 19, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature Request] Missing modules required to replicate published HAR paper

2 participants