I've now gained a good understanding of experimaestro, and this demo has been very helpful. However, there are a few aspects that I found unclear. I'm raising this issue to discuss potential improvements without interfering with any ongoing changes to the code.
Demo flow
1 - Configs
The demo should start by introducing Configs. Specifically, I think the CNN class should be placed in a separate model.py file and inherit from Config, as shown below.
One important clarification: the term Config might be misleading, as it does not mean that the class is just a "configuration object." Instead, the demo should make it clear that inheriting Config is simply a way for users to define which parameters can be modified in their experiments using Param (and which ones are unrelated to the results, using Meta). This distinction is why I added ckpt_path as a Meta field.
Additionally, the demo should emphasize that these parameters should not be declared in __init__. Instead, they should be class attributes, with __post_init__ handling the initialization logic.
from torch import nn
from experimaestro import Config, Meta, Param

class CNN(nn.Module, Config):
    n_layers: Param[int] = 3
    hidden_dim: Param[int] = 64
    kernel_size: Param[int] = 3
    ckpt_path: Meta[str] = "path/to/checkpoint"

    def __post_init__(self):
        """Simple CNN module with n_layers hidden layers and hidden_dim hidden units"""
        super(CNN, self).__init__()
        # create a list of hidden CNN layers with ReLU activation
        self.layers = nn.Sequential()
        for i in range(self.n_layers):
            self.layers.add_module(
                f'conv{i}',
                nn.Conv2d(
                    in_channels=1 if i == 0 else self.hidden_dim,
                    out_channels=self.hidden_dim,
                    kernel_size=self.kernel_size,
                    padding='same'))
            self.layers.add_module(f'relu{i}', nn.ReLU())
        # pooling layer to reduce the 28x28 MNIST feature maps to 14x14
        self.layers.add_module('pool', nn.MaxPool2d(kernel_size=2))
        # output layer
        self.output = nn.Linear(self.hidden_dim * 14 * 14, 10)

    def forward(self, x):
        ...
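To make the Param / Meta distinction concrete, here is a toy sketch in plain Python. It is not experimaestro's API or its actual algorithm: `ToyCNNConfig`, `toy_identifier` and the `"meta"` flag are hypothetical names I made up for illustration. It only assumes what is described above, namely that Param values define the experiment while Meta values (such as ckpt_path) are unrelated to the results and so should not change a run's identity.

```python
import hashlib
import json
from dataclasses import dataclass, field, fields


@dataclass
class ToyCNNConfig:
    """Hypothetical stand-in for the CNN config (not an experimaestro class)."""
    n_layers: int = 3
    hidden_dim: int = 64
    kernel_size: int = 3
    # Flagged as "meta": it should not influence the identifier
    ckpt_path: str = field(default="path/to/checkpoint", metadata={"meta": True})


def toy_identifier(cfg) -> str:
    """Hash only the fields that are not flagged as meta."""
    params = {
        f.name: getattr(cfg, f.name)
        for f in fields(cfg)
        if not f.metadata.get("meta", False)
    }
    payload = json.dumps(params, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


base = ToyCNNConfig()
moved = ToyCNNConfig(ckpt_path="/tmp/other/checkpoint")  # only a meta field differs
deeper = ToyCNNConfig(n_layers=5)                        # a real parameter differs

same_id = toy_identifier(base) == toy_identifier(moved)
different_id = toy_identifier(base) != toy_identifier(deeper)
print(same_id, different_id)
```

In this toy version, moving the checkpoint keeps the same identifier, while changing n_layers produces a new one. A short illustration along these lines in the demo would make the role of Meta much easier to grasp.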
2 - Tasks
Once Configs are introduced, the demo can present Task objects—essentially configs that implement an execute method. This method defines the logic the user wants to run, such as processing data or training a model.
A key point: Task parameters are declared the same way as Config parameters, and they can include other Configs as parameters. The current demo declares parameters like n_layers and hidden_dim inside the task, but I think it would be more modular to have a model: Param[CNN] instead, like this:
from experimaestro import Constant, Param, Task
from model import CNN

class TrainOnMNIST(Task):
    """Main Task that trains a CNN model on MNIST"""
    # experimaestro Task parameters
    ## Model
    model: Param[CNN]
    ## Training
    epochs: Param[int] = 1  # number of epochs to train the model
    n_val: Param[int] = 100  # number of steps between validation and logging
    lr: Param[float] = 1e-2  # learning rate
    batch_size: Param[int] = 64  # batch size
    ## Task version (not mandatory)
    version: Constant[str] = '1.0'

    def execute(self):
        ...
This highlights experimaestro's modularity and keeps the demo clean.
3 - Experiments
After explaining Configs and Tasks, the experiment.py file should be introduced as the orchestrator of the experiment. It handles:
- Launching tasks on a cluster
- Saving results
- (Potentially other things—I’m still not 100% sure)
Points that should be clearer in the demo:
- The experiment file must contain a run function.
- While Configurations are not mandatory, it's probably a good idea to include one.
- A Configuration object is not the same as a Config object: it is specific to experiments.
- The relationship between Configuration and params.yaml should be clarified: Configuration defines which parameters can be changed in the YAML file.
- The purpose of tag(...) should be explained: why is n_layers tagged but not batch_size?
- How do you launch experiments without a GPU cluster (e.g., for local debugging)?
Also, the following example should be updated to reflect the changes made earlier:
for n_layers in cfg.n_layers:
    for hidden_dim in cfg.hidden_dim:
        for kernel_size in cfg.kernel_size:
            # Create a task with the given parameters
            task = TrainOnMNIST(
                # Model params are 'tagged' for later monitoring
                model=CNN(n_layers=tag(n_layers),
                          hidden_dim=tag(hidden_dim),
                          kernel_size=tag(kernel_size)),
                # Training params are not tagged
                epochs=cfg.epochs,
                n_val=cfg.n_val,
                lr=cfg.lr,
                batch_size=cfg.batch_size,
            )
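As a side note, the three nested loops could be flattened with itertools.product, which might keep the demo easier to read as more parameters are added. The sketch below is plain Python and does not use experimaestro: `ToyConfiguration` and `model_grid` are hypothetical names, and the list values are made up for illustration rather than taken from the demo's params.yaml.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import Dict, List


@dataclass
class ToyConfiguration:
    """Hypothetical stand-in for the experiment's Configuration object."""
    n_layers: List[int] = field(default_factory=lambda: [1, 2])
    hidden_dim: List[int] = field(default_factory=lambda: [32, 64])
    kernel_size: List[int] = field(default_factory=lambda: [3])


def model_grid(cfg: ToyConfiguration) -> List[Dict[str, int]]:
    """Return one dict of model parameters per point of the grid."""
    return [
        {"n_layers": n_layers, "hidden_dim": hidden_dim, "kernel_size": kernel_size}
        for n_layers, hidden_dim, kernel_size in product(
            cfg.n_layers, cfg.hidden_dim, cfg.kernel_size
        )
    ]


grid = model_grid(ToyConfiguration())
for point in grid:
    # In the real experiment file, each point would be used to build
    # CNN(...) with tagged values and passed to TrainOnMNIST(...)
    print(point)
```

The iteration order is the same as with the nested loops (the last parameter varies fastest), so this would not change which tasks get created.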
4 - Launching
The demo suggests launching the experiment using:
experimaestro run-experiment debug.yaml
However, running this raises a workspace error. This suggests that workspaces should be introduced earlier in the demo.
The easiest way to do this might be to add a "Setting Up the Experiment Environment" section, covering:
- The launchers.py file
- The settings.yaml file
That said, launchers.py seems quite complex, so I’m not sure how best to present it in the demo.
Conclusion
These are just recommendations based on my experience with the demo. Many of these points can be discussed further, but I think they provide a solid starting point for improvements. I'm happy to help with any changes if needed, but I also don't want to step on the toes of anyone already working on this. Let me know how we can move forward!