
Add warning to prevent biased transfer when batch effects are missing #394

Open

bhumigaddam wants to merge 2 commits into amarquand:dev from bhumigaddam:improve-transfer-warning

Conversation

@bhumigaddam

While exploring the transfer functionality in PCNtoolkit, I noticed that it is possible to run a transfer when the new dataset contains fewer batch effects than the dataset used to train the original model. In such situations, the transfer step may still run, but the correction could potentially be biased because the model was trained with a larger set of batch effect levels.

To make this situation clearer to users, this change adds a warning when the transfer dataset contains fewer batch effects than the training dataset. The goal is simply to make users aware of the potential issue so they can interpret the results more carefully.

This change does not modify the underlying behavior of the model; it only adds a warning to improve transparency during transfer operations.

Happy to adjust this if a different warning message or placement would be preferred.

@contsili contsili changed the base branch from master to dev March 13, 2026 10:48
@contsili
Collaborator

Hi @bhumigaddam, why did you remove json and add FileLock in its place? We already import filelock in line 45.

Regarding the batch effect warning, I think it is a really good idea. It would help if you could provide an example script with which I can reproduce this warning and test it.

Regarding the warning: we use our own warning function, which lives in the output.py file. Please see how warnings are used in other locations, e.g. Output.warning(Warnings.BLABLA), where BLABLA is a constant inside output.py. That way we can reuse warnings and keep them all in one location.
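The pattern described above could be sketched as follows. This is a minimal, self-contained illustration of the idea, not PCNtoolkit's actual output.py; the constant name MISSING_BATCH_EFFECTS and the exact message are hypothetical placeholders for whatever the project settles on.

```python
import warnings
from enum import Enum


class Warnings(str, Enum):
    # Hypothetical constant; in PCNtoolkit the real constants live in output.py.
    MISSING_BATCH_EFFECTS = (
        "Transfer data contains fewer batch effect levels than the training "
        "data; the transferred correction may be biased."
    )


class Output:
    @staticmethod
    def warning(message: Warnings) -> None:
        # Route every warning through one place so messages stay consistent
        # and reusable across the codebase.
        warnings.warn(message.value, UserWarning, stacklevel=2)


# Call sites then refer to the shared constant rather than an inline string.
Output.warning(Warnings.MISSING_BATCH_EFFECTS)
```

Keeping the strings in one enum means a message can be changed in a single place and tested against by constant name instead of by literal text.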

@divye-joshi

divye-joshi commented Mar 13, 2026

@bhumigaddam Hi! I am also a new contributor.

You can check out the reproduction notebooks I created in issues #396, #378, and #383.
Maybe they will help you replicate your intended idea in the future.

You can also check out my PR #379, where I added a warning; it shows the correct way to create errors/warnings and edit output.py.

I hope this helps you!

@bhumigaddam
Author

> Hi @bhumigaddam, why did you remove json and add FileLock in its place? We already import filelock in line 45.
>
> Regarding the batch effect warning, I think it is a really good idea. It would help if you could provide an example script with which I can reproduce this warning and test it.
>
> Regarding the warning: we use our own warning function, which lives in the output.py file. Please see how warnings are used in other locations, e.g. Output.warning(Warnings.BLABLA), where BLABLA is a constant inside output.py. That way we can reuse warnings and keep them all in one location.

Hi @contsili, thanks for taking the time to review the PR and for the helpful suggestions!

Regarding the "json" → "FileLock" change, my intention was to address a file locking issue I encountered while testing on Windows. I thought explicitly using "FileLock" there might help prevent conflicts when accessing files. However, I see now that "filelock" is already imported earlier in the file, so I’ll revisit that part and adjust the implementation accordingly.

For the batch effect warning, the situation I had in mind is when a model is trained on data containing multiple batch effect levels (for example multiple sites), but the dataset used for transfer only contains a subset of those levels. In that case the transfer step still runs, but the correction could potentially be biased because some batch levels seen during training are missing.

A simplified example workflow would look like this:

```python
# Train a normative model on data containing multiple batch effects.
train_data = NormData(...)
model = NormativeModel(...)
model.fit(train_data)

# Transfer the model to a dataset with fewer batch effects.
transfer_data = NormData(...)
model.transfer(transfer_data)
```

In this scenario the transfer dataset contains fewer batch effect levels than the dataset used to train the model, which is where the warning would be triggered.
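The level comparison behind the proposed warning could be sketched like this. It is a standalone illustration under stated assumptions: the helper name check_batch_effects and the plain lists of site labels are hypothetical, standing in for wherever PCNtoolkit stores batch-effect levels on its data objects.

```python
import warnings


def check_batch_effects(train_levels, transfer_levels):
    """Warn when the transfer data is missing batch-effect levels
    (e.g. sites) that were present in the training data."""
    missing = set(train_levels) - set(transfer_levels)
    if missing:
        warnings.warn(
            "Transfer data is missing batch effect levels seen during "
            f"training: {sorted(missing)}. The transferred correction "
            "may be biased.",
            UserWarning,
            stacklevel=2,
        )
    return missing


# Training saw three sites; the transfer cohort covers only two of them,
# so the warning fires and reports the missing level.
check_batch_effects(
    train_levels=["site_A", "site_B", "site_C"],
    transfer_levels=["site_A", "site_B"],
)
```

A set difference in this direction only flags levels that training saw but the transfer data lacks; extra, previously unseen levels in the transfer data are a separate question and would need their own check.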

I’ll also update the implementation to use the project's internal warning mechanism ("Output.warning") as suggested so it aligns with how warnings are handled in "output.py".

Thanks again for the guidance!

@bhumigaddam
Author

Hi @divye-joshi, thanks for sharing these resources and pointing me to your notebooks and PR.

I’ll take a look at the issues (#396, #378, #383) and your PR #379 to better understand how warnings are implemented in the project.

Appreciate the help!

