FLPATH-4176: k8s-storage-sp enhancement and cross-references#54
Conversation
|
The DCO job fails, because the commit is missing the |
I think we should have enhancement per platform.
Currently, we more or less ignore subject, it must just have
This is another subject we need to figure out. IMHO as part of another enhancement. |
d092ec7 to
0424bb2
Compare
|
I added a comment about volumes encryption in the enhancement |
| > **Question for reviewers:** Which NATS subject format should the storage SP | ||
| > use? [Service Provider Status Reporting](../state-management/service-provider-status-reporting.md) | ||
| > and [SP Resource Status Reader](../sp-resource-status-reader/sp-resource-status-reader.md) | ||
| > specify `dcm.{serviceType}` (e.g., `dcm.storage`), but | ||
| > [k8s-container-sp](k8s-container-sp.md) documents a hierarchical subject | ||
| > (`dcm.providers.{providerName}.container.instances.{instanceId}.status`). | ||
| > Which contract should this implementation follow? |
There was a problem hiding this comment.
I suggest moving this into the Open Questions section (as per template ) so we won't forget to answer all the questions.
I vote for dcm.storage because that matches what the existing SPs use. @gabriel-farache wdyt?
There was a problem hiding this comment.
+1 on the topic name, I think dcm.{serviceType} is the name used in the implementations of existing SPs
There was a problem hiding this comment.
should I keep the question open?
There was a problem hiding this comment.
I think you can update the doc and remove the question according to the info in this thread :)
| Users can override StorageClass and access mode per volume via | ||
| `providerHints.kubernetes` (see POST endpoint documentation). |
There was a problem hiding this comment.
this pr in service-type-definitions.md defines accessMode as a top-level Storage field (not under providerHints.kubernetes). This line says users override access mode via providerHints.kubernetes, which contradicts that schema. What is the correct way?
There was a problem hiding this comment.
Additionally, I think we currently have no way to leverage the providerHints data, we had the issue in the K8s Container SP witht the visibility and we ended up adding a dedicated field instead
| (`allowVolumeExpansion: false` or not set) | ||
| - **400 Bad Request**: New capacity is smaller than or equal to current capacity | ||
| (shrinking not supported) | ||
| - **409 Conflict**: Namespace `ResourceQuota` would be exceeded (when quota check |
There was a problem hiding this comment.
ResourceQuota handling is ambiguous: lines 384–386 say the SP may reject (optional), but here 409 Conflict “when quota check is implemented”. Can you clarify?
There was a problem hiding this comment.
“when quota check is implemented” is confusing indeed, either there is a ResourceQuota and the SP should check it before proceeding and if the patch would violate it, then send the 409 or there is no ResourceQuota and the SP does nothing
But the SP should definitely implement the ResourceQuota check
There was a problem hiding this comment.
Maybe like "when a namespace ResourceQuota on requests.storage exists, SP must pre-check on PATCH and return 409 if exceeded; if no quota object, skip"
There was a problem hiding this comment.
Updated: when a namespace-wide ResourceQuota on requests.storage exists, SP pre-checks on PATCH and returns 409 if exceeded; if no such quota object, skip the check.
gabriel-farache
left a comment
There was a problem hiding this comment.
some comments
Also, the changes on the other files are unrelated to the current PR, right? Maybe do not add them in the PR to keep it clean
We should have another PR in which we run the make targets to formats the file so we are all on the same page
|
I agree with @gabriel-farache, let's move the formatting and cross reference updates into another PR to keep the current PR easier to review. |
| | Field | Required | Type | Description | | ||
| | :--------- | :------- | :----- | :------------------------------------------------------------- | | ||
| | capacity | Yes | string | Volume size with unit (e.g., _100Gi_, _1TB_) | | ||
| | accessMode | No | string | Access mode (_ReadWriteOnce_, _ReadOnlyMany_, _ReadWriteMany_) | |
There was a problem hiding this comment.
I am unsure, if openstack, vmware etc would have accessMode same as kubernetes. Maybe this should go to providerHints as well?
There was a problem hiding this comment.
I think @machacekondra has a good point on this. But this is in contrast with @gabriel-farache opinion see #54 (comment)
There was a problem hiding this comment.
You are right that accessMode is a k8s term, but although it's rare, it exists in other platforms:
OpenStack “multiattach=True”
VMware “multi-writer”
Ceph RBD “exclusive-lock=false”
NFS/CephFS “inherently shared”
Kubernetes accessModes (Pod-level, not host-level)
Do you think attachmentMode is better?
There was a problem hiding this comment.
Ok, after rethinking I think you are right, I will change it
0424bb2 to
d05138c
Compare
| not hold multiple kubeconfigs or manage multiple clusters. | ||
| - Production: runs as a Kubernetes Deployment in the target cluster (using | ||
| in-cluster service account) or as an external service with kubeconfig access. | ||
| - Local development: uses the `k8s-storage` profile in |
There was a problem hiding this comment.
Note that those scripts will be part of control-plane for now:
dcm-project/control-plane#10
But I think it's not really imporant to mention anything like this in enchacement doc.
There was a problem hiding this comment.
removed test plan. I also explain better the usage when Multi-Backend + Multi-Tenant Storage
d05138c to
4220917
Compare
| storage that can be requested through DCM catalog items and attached to | ||
| workloads (for example, a database PVC for a three-tier application). This | ||
| enhancement defines the Kubernetes Storage Service Provider that implements the | ||
| portable `storage` service type on Kubernetes clusters. |
There was a problem hiding this comment.
the portable
storageservice type
Where is this service type defined? According to https://github.com/LinskId/enhancements/blob/main/enhancements/service-type-definitions/service-type-definitions.md the services types are
Virtual Machine
Virtual machines with CPU, memory, storage, and OS specificationsContainer
Fields common to Kubernetes, Docker, Podman, Openshift, CRI-O, containerdCluster
Fields common to Kubernetes, OpenShift, EKS, GKE, AKS, and other distributionsDatabase
Fields common across all database types (SQL, NoSQL, search, time-series, etc.)
service-type-definitions.md should be updated accordingly
There was a problem hiding this comment.
It used to be in service-type-definitions.md, I thought we wanted to move those changes into another PR.
There was a problem hiding this comment.
ah, I may not have followed that part.
I am fine with having it in another PR, just a matter of ordering: if this SP's enhancement is to implement the storage service types, it is odd to have the SP enhancement done before the new service type enhancement is there
There was a problem hiding this comment.
done, I added the type and flow back and other cross refs
There was a problem hiding this comment.
The definition of the storage service type that is missing, apart from that, all looks good to me
We can keep the providerHints in here but until we define how it works "in real life" the implementation will likely diverge from that and introduce a new static field
6d0556b to
b0de216
Compare
| SP->>SP: Map platform status → DCM status | ||
| SP->>SP: Build CloudEvent | ||
| SP->>MSG: Publish to:<br/>dcm.providers.{provider}.{serviceType}<br/>.instances.{instanceId}.status | ||
| SP->>MSG: Publish to:<br/>dcm.providers.{provider}.{serviceType}<br/>.instances.{instanceId}.status<br/>(or e.g. dcm.storage) |
There was a problem hiding this comment.
to align with the storage NATs subject suggested in k8s-storage-sp, which is dcm.storage
There was a problem hiding this comment.
I can remove it and only refer to this path in enhancement for now
There was a problem hiding this comment.
userflow.md (sections 5.3 and 6.5) still use the hierarchical NATS subject.
k8s-storage-sp.md and service-provider-status-reporting.md define storage status on dcm.storage. It's confusing, can you please double check?
| A --> B[Map platform status<br/>to DCM status enum] | ||
| B --> C[Build CloudEvent v1.0] | ||
| C --> D[Publish to NATS<br/>dcm.providers.provider.serviceType<br/>.instances.instanceId.status] | ||
| C --> D[Publish to NATS<br/>dcm.providers.{provider}.{serviceType}<br/>.instances.{instanceId}.status<br/>(or e.g. dcm.storage)] |
There was a problem hiding this comment.
it is related to the question (I can ignore it for now):
Which NATS subject format should the storage SP use? Service Provider Status Reporting and SP Resource Status Reader specify dcm.{serviceType} (e.g., dcm.storage), but k8s-container-sp documents a hierarchical subject (dcm.providers.{providerName}.container.instances.{instanceId}.status).
Which contract should this implementation follow?
Currently, we more or less ignore subject, it must just have dcm. prefix. The rest of the data are send via CloudEvent, see: https://github.com/dcm-project/service-provider-manager/blob/main/internal/consumer/consumer.go#L130
We need to fix those inconsistencies in enhancement documents, per what is implemented.
There was a problem hiding this comment.
We need to fix those inconsistencies in enhancement documents, per what is implemented.
Agree but I think we should do that in another PR once and for all
I am not against defining here the name of the topic it's just that it is odd to have in the flow diagram instead of in a table or something like that
There was a problem hiding this comment.
Is this in contrast with my #54 (comment) ?
@gabriel-farache it's better to clarify what we expect so @LinskId can understand the direction.
gabriel-farache
left a comment
There was a problem hiding this comment.
apart from the weird change in the user-flows, LGTM
b0de216 to
a0aeff7
Compare
a0aeff7 to
f0681b0
Compare
Signed-off-by: igavra <igavra@redhat.com> Assisted-By: Cursor AI
f0681b0 to
b2f2188
Compare
Summary
Adds the Kubernetes Storage Service Provider enhancement and related DCM docs for standalone persistent storage (FLPATH-4176 / FLPATH-4113).
Introduces storage as the fifth portable service type in service-type-definitions (capacity, accessMode, optional providerHints.kubernetes for StorageClass / volumeMode).
Defines the K8s reference SP: a DCM control-plane adapter that submits and watches PVCs via the Kubernetes API — not a CSI or storage backend. One SP instance per managed cluster; cluster must already have StorageClass + CSI.
v1 operations: CREATE, READ, UPDATE (capacity expansion only), DELETE; status via CloudEvents on NATS; deployment model aligned with k8s-container-service-provider.
Also updates user flows, storage status enum (PROVISIONING, RUNNING, FAILED, …), and dcm.storage in the status-reader doc.
Flows defined
Catalog / service type — User orders a portable storage catalog item; Placement → SP Resource Manager invokes the registered K8s Storage SP.
SP registration — On startup, SP self-registers with DCM Service Provider Manager (serviceType: storage, ops CREATE/READ/UPDATE/DELETE, endpoint /api/v1alpha1/volumes).
Volume create — POST /volumes → PVC in configured namespace (SP_K8S_NAMESPACE) with DCM labels → cluster/CSI provisions → SP watches PVC phase → publishes status to NATS.
Volume expand — PATCH /volumes/{id} with larger capacity after validating allowVolumeExpansion (optional ResourceQuota check); expansion completes asynchronously on the cluster.
Volume read / delete — GET (list/get by dcm-instance-id); DELETE removes PVC.
Status reporting — SharedIndexInformer on labeled PVCs maps Pending/Bound/resize conditions to DCM storage status.
SP health — DCM polls SP /health (same pattern as other SPs).
Questions for reviewers
Epic scope (FLPATH-4113): Is the epic satisfied by portable storage + K8s reference SP only, or are enhancement docs required for other platforms (VMware, OpenStack, …)?
Which NATS subject format should the storage SP use? Service Provider Status Reporting and SP Resource Status Reader specify
dcm.{serviceType}(e.g.,dcm.storage), but k8s-container-sp documents a hierarchical subject (dcm.providers.{providerName}.container.instances.{instanceId}.status).Which contract should this implementation follow?
Namespace: v1 single SP_K8S_NAMESPACE vs per-instance namespace (Alternative deferred to v2; multi-SP-per-tenant workaround documented).
Signed-off-by: Idit Gavra igavra@redhat.com
Assisted-By: Cursor AI