Map for adding cross validation training and evaluation #86

@jmamath

Hello and thank you for this amazing package.

Instead of using replicates, I would like to add a cross-validation training and evaluation scheme based on the domain metadata.

Say a dataset has domains A, B, and C. I would like to:

  • train on 70% of the data sampled from A and B, then evaluate in-distribution on the remaining 30% from A and B and out-of-distribution on C.
  • train on 70% of the data sampled from B and C, then evaluate in-distribution on the remaining 30% from B and C and out-of-distribution on A.
  • train on 70% of the data sampled from C and A, then evaluate in-distribution on the remaining 30% from C and A and out-of-distribution on B.

Finally, average the in-distribution metric and the out-of-distribution metric separately across the three folds to obtain the final performance.

The 70/30 split is arbitrary and should be configurable.
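
To make the scheme concrete, here is a minimal sketch that is independent of the package. It assumes the domain of each example is available as a plain array (how to extract it from the dataset metadata, for example via the grouper, is left open). The names `leave_one_domain_out_splits`, `cross_validate`, and `train_and_eval` are hypothetical and not part of any existing API. The 70% is drawn from the pooled in-distribution domains, not stratified per domain.

```python
import numpy as np


def leave_one_domain_out_splits(domains, train_frac=0.7, seed=0):
    """Yield (held_out, train_idx, id_eval_idx, ood_eval_idx), one fold per domain.

    `domains` is assumed to be a 1-D array with one domain id per example.
    """
    domains = np.asarray(domains)
    rng = np.random.default_rng(seed)
    for held_out in np.unique(domains):
        # Every example of the held-out domain is used for OOD evaluation.
        ood_idx = np.flatnonzero(domains == held_out)
        # Shuffle the pooled remaining domains, then cut at train_frac.
        in_idx = rng.permutation(np.flatnonzero(domains != held_out))
        n_train = int(round(train_frac * len(in_idx)))
        yield held_out, in_idx[:n_train], in_idx[n_train:], ood_idx


def cross_validate(domains, train_and_eval, train_frac=0.7, seed=0):
    """Average the in-distribution and OOD metrics over all folds.

    `train_and_eval(train_idx, id_idx, ood_idx)` is a user-supplied
    (hypothetical) callback returning (id_metric, ood_metric) for one fold.
    """
    id_scores, ood_scores = [], []
    folds = leave_one_domain_out_splits(domains, train_frac, seed)
    for _, train_idx, id_idx, ood_idx in folds:
        id_score, ood_score = train_and_eval(train_idx, id_idx, ood_idx)
        id_scores.append(id_score)
        ood_scores.append(ood_score)
    return float(np.mean(id_scores)), float(np.mean(ood_scores))
```

With three domains this produces the three folds listed above; `train_frac` controls the split ratio, and the callback is where the actual training and evaluation would plug in.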

I have only just started exploring the package; so far I have replicated the ERM result on the camelyon17 dataset.

It seems that the grouper object might be a good starting point for implementing the procedure above, but I still lack a high-level overview of the code. How would you do this?
