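Problem
Creating and using PVCs in pipelines is needlessly confusing and complex. Can we fix this headache once and for all?
Component 1 (CreatePVC with pvc_name): What does this code do?
It depends. If the pvc exists, the component uses it. If the pvc doesn't exist, execution fails (at runtime, not at compile time). The one thing it does not do is create a PVC!
Component 2 (CreatePVC with pvc_name_suffix): How about this code?
This one creates a new pvc. But if you were hoping to attach to an existing pvc, it won't work.
In summary: to create a pvc, use the parameter "pvc_name_suffix"; to use an existing pvc, use the parameter "pvc_name"; don't use both. This is not at all intuitive.
See links: github source and docs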
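The behavior summarized above can be modeled with a small toy function. This is only a sketch of what the post describes, not the real CreatePVC implementation: the `cluster` set, the function name, the exception class, and the random prefix used for generated names are all assumptions made for illustration.

```python
"""Toy model of the CreatePVC behavior described above (not the real implementation)."""
import uuid


class PVCNotFound(RuntimeError):
    """Stands in for the runtime failure when pvc_name points at a missing PVC."""


def create_pvc_as_described(cluster, pvc_name=None, pvc_name_suffix=None):
    """Return the name of the PVC the component would end up using.

    `cluster` is a plain set of existing PVC names standing in for Kubernetes.
    """
    # Exactly one of the two parameters may be set, never both and never neither.
    if (pvc_name is None) == (pvc_name_suffix is None):
        raise ValueError("set exactly one of pvc_name or pvc_name_suffix")
    if pvc_name is not None:
        # Described behavior: reuse an existing PVC, never create one.
        if pvc_name not in cluster:
            raise PVCNotFound(pvc_name)
        return pvc_name
    # Described behavior: every call creates a brand-new PVC.
    # The random prefix is a hypothetical stand-in for the generated part of the name.
    new_name = f"{uuid.uuid4().hex[:8]}{pvc_name_suffix}"
    cluster.add(new_name)
    return new_name
```

In this model, calling the function twice with the same `pvc_name_suffix` leaves two new entries in `cluster`, which is the "bunch of different PVCs" problem on repeated runs.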
The answer to all of the following questions is ... you are out of luck.
What if you want to create a pvc with a specific name (not just a specific suffix)?
What if you want the component to use an existing pvc if it exists, otherwise create it?
What if you want the component not to fail whether or not the pvc exists?
What if you think it's weird being forced to supply size and access mode details, when you just want to use an existing pvc?
What if you are tired of workarounds that involve managing PVCs outside the pipeline, or conditional logic (with kfp.dsl If and Else) inside your pipeline just to make it work?
What if you want to run your pipeline more than once in a row, and not have it create a bunch of different PVCs?
Here's what actually works now on repeated pipeline runs (note that these are not always ideal):
Use Component 1 above with a previously created PVC (the irony of "CreatePVC" appearing in the definition and in the logs in this case is thick). This raises the question of why we are using this function at all, rather than just mount_pvc.
Use Component 2 above, and create a new PVC each time you run the pipeline.
Run the pipeline once with Component 2 (to create a pvc), then swap pvc_name_suffix for pvc_name and recompile so that the next run uses the existing pvc.
Possible Solutions
Encourage using the CreatePVC function only for creating PVCs (possibly phasing it out over time). Then, create a new function, pvc:
In the first example, I use a flag. If we want more functionality options, we can use a "mode". For example, "create" would mean "create only" (if it exists, then the run fails), "apply" would mean the same as in kubectl (could be used to update or create--for example expanding the pvc storage on the fly), and "use" would mean "don't create or update, just use existing". I created these quickly for conceptual understanding. We can discuss the final types in more detail during development.
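To make the three modes concrete, here is a minimal sketch of how they could behave. It is not an existing kfp API: `resolve_pvc`, `PVCSpec`, the returned action strings, and the dict-based `cluster` are hypothetical names chosen for illustration, and the real types would be settled during development.

```python
from dataclasses import dataclass


@dataclass
class PVCSpec:
    """Hypothetical container for the details needed to create or update a PVC."""
    size: str
    access_mode: str = "ReadWriteOnce"


def resolve_pvc(cluster, pvc_name, mode="use", spec=None):
    """Apply `mode` to `pvc_name` and return the action taken.

    `cluster` is a dict mapping PVC names to PVCSpec, standing in for Kubernetes.
    """
    exists = pvc_name in cluster
    if mode == "use":
        # "use": never create or update, so no size or access mode is required.
        if not exists:
            raise LookupError(f"PVC {pvc_name!r} does not exist")
        return "used"
    if mode not in ("create", "apply"):
        raise ValueError(f"unknown mode {mode!r}")
    if spec is None:
        raise ValueError(f"mode {mode!r} requires a spec")
    if mode == "create" and exists:
        # "create": create only; an existing PVC makes the run fail.
        raise FileExistsError(f"PVC {pvc_name!r} already exists")
    # "create" on a missing PVC, or "apply" in either case (create or update).
    cluster[pvc_name] = spec
    return "updated" if exists else "created"
```

With these semantics, a pipeline that uses mode "apply" can be run repeatedly without changing parameters between runs, and mode "use" no longer forces the caller to supply size and access mode details.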
What about the case of needing to create a separate pvc on each run (currently covered by the CreatePVC pvc_name_suffix parameter)? There are two options:
Set programmatically at runtime through a pipeline parameter (passed into the pvc component).
Note that instead of passing a variable, we could also use builtin functionality to set the new name based on the run id or something similar (kfp.dsl.RUN_ID_PLACEHOLDER).
A generate_prefix (or generate_suffix) boolean parameter on the pvc function. The behavior would be similar to what happens today when pvc_name_suffix is set in CreatePVC. However, this structure would make pvc_name a stable, required parameter in every run, with the prefix/suffix functionality being additive rather than either/or (as is currently the case, where you have to choose either pvc_name or pvc_name_suffix in CreatePVC).
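Both options come down to how the final PVC name is assembled. The sketch below shows one way the additive naming could work, assuming a hypothetical helper `build_pvc_name` with a hypothetical `suffix_source` argument (for example, a run id passed in at runtime); none of these names exist in kfp today.

```python
import uuid


def build_pvc_name(pvc_name, generate_suffix=False, suffix_source=None):
    """Build the final PVC name; pvc_name is always required, the suffix is additive."""
    if not pvc_name:
        raise ValueError("pvc_name is required")
    if not generate_suffix:
        # Stable name: every run resolves to the same PVC.
        return pvc_name
    # Use the caller-supplied value (e.g. a run id) or fall back to a random one.
    suffix = suffix_source if suffix_source is not None else uuid.uuid4().hex[:8]
    # Kubernetes object names must be lowercase, so normalize the suffix.
    return f"{pvc_name}-{suffix.lower()}"
```

For example, `build_pvc_name("training-data")` returns the name unchanged, while `build_pvc_name("training-data", generate_suffix=True, suffix_source="Run-42")` yields `training-data-run-42`.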
Benefits:
the function is more robust to repeated runs, without the need for changing parameters between runs, or specifying kfp conditionals
the function behavior is more explicit, and the syntax is simplified
the other questions above get solutions
users have fewer headaches trying to remember syntax and debug issues
Additional Thoughts
The logging implementation around the proposed pvc function would also need attention, so that it is clear what is being done, or what happened in the case of errors.
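As a rough illustration of the logging point, the message could name the action that was actually taken instead of always referring to creation. The helper below is purely hypothetical (`describe_pvc_outcome`, the logger name, and the message format are assumptions), using only Python's standard `logging` module.

```python
import logging

logger = logging.getLogger("pvc_component")  # hypothetical logger name


def describe_pvc_outcome(pvc_name, mode, action=None, error=None):
    """Log and return one line stating what happened to the PVC."""
    if error is not None:
        message = f"pvc {pvc_name!r} (mode={mode}) failed: {error}"
        logger.error(message)
    else:
        message = f"pvc {pvc_name!r} (mode={mode}): {action}"
        logger.info(message)
    return message
```

A run that reuses a PVC would then log something like `pvc 'data' (mode=use): used existing PVC` rather than a creation message.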
The new workspace concept can relieve some of the headaches with CreatePVC; however, it is not a replacement for the functionality. There are times when users need to persist changes across pipeline runs, or when they need finer-grained control over the PVC resource. I'd be happy to give examples if needed.