A Kubernetes operator that manages application deployments via three Custom Resource Definitions (CRDs): HelxApp (application template), HelxInst (instance request), and HelxUser (user record). When all three exist and reference each other, the controller synthesizes Deployments, Services, and PersistentVolumeClaims.
See docs/execution-model.md for the full artifact generation pipeline and sequence diagrams.
- Data Model
- Operator Behavior
- Volume DSL
- Security Context Resolution
- Deployment
- RBAC Model
- Development
- Testing
- Test Coverage
- Extending the Operator
- AI Prompt Reference
- License
The operator manages three namespaced CRDs in the helx.renci.org/v1 API group. The three resources arrive independently and in any order. Workloads are only created when a complete triple (app + user + instance) exists.
Defines what an application is: container images, ports, environment, volumes, and security context.

    apiVersion: helx.renci.org/v1
    kind: HelxApp
    metadata:
      name: jupyterlab
    spec:
      appClassName: JupyterLab
      services:
        - name: main
          image: jupyter/minimal-notebook:latest
          command: ["/bin/sh", "-c", "start-notebook.sh"]
          environment:
            NB_PREFIX: "/"
          ports:
            - containerPort: 8888
              port: 8888
          volumes:
            home: "{{ .system.UserName }}-home:/home/{{ .system.UserName }},rwx,retain"

| Field | Description |
|---|---|
| `appClassName` | Logical class name, stamped onto pod labels |
| `services[]` | Ordered list of container definitions |
| `services[].name` | Container name; also the key for per-instance resource overrides |
| `services[].image` | Image reference, optionally followed by `,key=value` options (e.g. `,Always` sets `imagePullPolicy`) |
| `services[].command[]` | Entrypoint override; may contain Go template expressions like `{{ .system.UserName }}` |
| `services[].environment` | Map of env vars; values may contain Go template expressions |
| `services[].init` | If true, runs as an init container |
| `services[].ports[]` | `containerPort`/`port` pairs; a non-zero `port` triggers Service creation |
| `services[].resourceBounds` | Advisory min/max per resource type |
| `services[].securityContext` | Per-container UID/GID/FSGroup/supplementalGroups |
| `services[].volumes` | Map of volumeId to volume DSL string (see Volume DSL) |
A per-user instantiation: "run this app for this user". This is the trigger for workload creation.

    apiVersion: helx.renci.org/v1
    kind: HelxInst
    metadata:
      name: jupyterlab-jeffw
    spec:
      appName: jupyterlab
      userName: jeffw
      resources:
        main:
          request: { cpu: "2", memory: "1G" }
          limit: { cpu: "2", memory: "1.1G" }
      securityContext:
        runAsUser: 1000
        fsGroup: 2000

| Field | Description |
|---|---|
| `appName` | Name (or `namespace/name`) of the HelxApp to instantiate |
| `userName` | Name (or `namespace/name`) of the HelxUser |
| `resources` | Map of service name to `{request, limit}` resource specifications |
| `securityContext` | Optional override; takes highest priority (see Security Context Resolution) |
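
The `name` or `namespace/name` reference form accepted by `appName` and `userName` can be illustrated with a small helper. This is only a sketch: `splitRef` is a hypothetical name, and the choice of fallback namespace (here passed in by the caller) is an assumption, not something this README specifies.

```go
package main

import (
	"fmt"
	"strings"
)

// splitRef splits a reference written as "name" or "namespace/name".
// defaultNamespace is used when the reference has no namespace part;
// which namespace the controller really falls back to is an assumption.
func splitRef(ref, defaultNamespace string) (namespace, name string) {
	if ns, n, ok := strings.Cut(ref, "/"); ok {
		return ns, n
	}
	return defaultNamespace, ref
}

func main() {
	ns, name := splitRef("jupyterlab", "helx-dev")
	fmt.Println(ns, name) // helx-dev jupyterlab

	ns, name = splitRef("shared/jupyterlab", "helx-dev")
	fmt.Println(ns, name) // shared jupyterlab
}
```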
Status fields (set by the controller):
| Field | Description |
|---|---|
| `uuid` | Assigned on first reconciliation; labels all derived Kubernetes objects |
| `observedGeneration` | Prevents redundant reconciliation |
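
The way these two status fields gate work can be sketched as a comparison between `metadata.generation` and `status.observedGeneration`. The `instance` struct and `needsReconcile` function below are hypothetical stand-ins, not the controller's actual types.

```go
package main

import "fmt"

// instance is a hypothetical, trimmed-down stand-in for a HelxInst.
type instance struct {
	Generation         int64  // metadata.generation, bumped by the API server on spec changes
	ObservedGeneration int64  // status.observedGeneration, written by the controller
	UUID               string // status.uuid, empty until the first reconciliation
}

// needsReconcile reports whether there is work to do: either no UUID has
// been assigned yet, or the spec changed since the last observed generation.
func needsReconcile(inst instance) bool {
	return inst.UUID == "" || inst.Generation != inst.ObservedGeneration
}

func main() {
	fresh := instance{Generation: 1}
	settled := instance{Generation: 3, ObservedGeneration: 3, UUID: "example-uuid"}
	edited := instance{Generation: 4, ObservedGeneration: 3, UUID: "example-uuid"}

	fmt.Println(needsReconcile(fresh), needsReconcile(settled), needsReconcile(edited))
	// true false true
}
```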
Represents a platform user. Optional userHandle URL provides security context via HTTP.

    apiVersion: helx.renci.org/v1
    kind: HelxUser
    metadata:
      name: jeffw
    spec:
      userHandle: "http://ldap-service/user/jeffw"

| Field | Description |
|---|---|
| `userHandle` | Optional URL; HTTP GET returns JSON with `runAsUser`, `runAsGroup`, `fsGroup`, `supplementalGroups` |

    HelxApp (template)           HelxUser (identity)
            \                         /
             \--- appName  userName --/
              \        \    /        /
               \        v  v        /
                +--- HelxInst (trigger) ---+
                             |
                             v
                Deployment + Services + PVCs

The controller maintains bidirectional associations between the three CRD types in memory. When any resource arrives, it is registered in the graph and checked for completable triples:
- HelxInst created — registered; if app + user already exist, workloads are created immediately
- HelxApp created — registered; any instances already waiting for this app get their workloads created
- HelxUser created — same as HelxApp, for instances waiting for this user
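
The order-independent registration described above can be sketched with a few maps. This is a simplified model under stated assumptions: the `graph` and `inst` types and their methods are hypothetical, names stand in for full object references, and a real controller would also need locking around the maps.

```go
package main

import (
	"fmt"
	"sort"
)

// inst is a hypothetical, minimal view of a HelxInst: its name and references.
type inst struct{ Name, App, User string }

// graph records which apps and users exist and which instances were requested.
type graph struct {
	apps, users map[string]bool
	insts       map[string]inst
}

func newGraph() *graph {
	return &graph{apps: map[string]bool{}, users: map[string]bool{}, insts: map[string]inst{}}
}

// AddInst registers an instance and reports whether its triple is already complete.
func (g *graph) AddInst(i inst) bool {
	g.insts[i.Name] = i
	return g.apps[i.App] && g.users[i.User]
}

// AddApp registers an app and returns the instances it completes.
func (g *graph) AddApp(name string) []string {
	g.apps[name] = true
	return g.complete(func(i inst) bool { return i.App == name })
}

// AddUser registers a user and returns the instances it completes.
func (g *graph) AddUser(name string) []string {
	g.users[name] = true
	return g.complete(func(i inst) bool { return i.User == name })
}

// complete lists matching instances whose app and user both exist, sorted by name.
func (g *graph) complete(match func(inst) bool) []string {
	var out []string
	for _, i := range g.insts {
		if match(i) && g.apps[i.App] && g.users[i.User] {
			out = append(out, i.Name)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	g := newGraph()
	fmt.Println(g.AddInst(inst{Name: "jupyterlab-jeffw", App: "jupyterlab", User: "jeffw"})) // false
	fmt.Println(g.AddUser("jeffw"))                                                          // []
	fmt.Println(g.AddApp("jupyterlab"))                                                      // [jupyterlab-jeffw]
}
```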
When a complete triple exists, the controller:
- Transforms `HelxApp.Spec.Services` into template data structures
- Builds a `System` context (app name, user name, UUID, environment, security context, volumes)
- Renders Go templates (`deployment.tmpl`, `pvc.tmpl`, `service.tmpl`) with double-pass rendering: the first pass produces YAML, and the second re-renders that YAML as a template to resolve expressions like `{{ .system.UserName }}` in field values
- Creates or patches Kubernetes objects via `CreateOrUpdateResource`
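
Double-pass rendering can be illustrated with the standard `text/template` package. The sketch below is not the controller's code: `renderOnce`, `renderTwice`, and the `claim` key are hypothetical, and the data is a plain map rather than the real `System` context. It shows why a second pass is needed when a field value itself contains a template expression.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// renderOnce parses text as a Go template and executes it against data.
func renderOnce(name, text string, data any) (string, error) {
	tmpl, err := template.New(name).Parse(text)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// renderTwice renders text, then renders the result again so that
// expressions carried inside field values are resolved too.
func renderTwice(text string, data any) (string, error) {
	first, err := renderOnce("pass1", text, data)
	if err != nil {
		return "", err
	}
	return renderOnce("pass2", first, data)
}

func main() {
	data := map[string]any{
		"system": map[string]any{"UserName": "jeffw"},
		// A field value that itself contains a template expression.
		"claim": "{{ .system.UserName }}-home",
	}
	out, err := renderTwice("claimName: {{ .claim }}\n", data)
	if err != nil {
		panic(err)
	}
	fmt.Print(out) // claimName: jeffw-home
}
```

After the first pass the output still reads `claimName: {{ .system.UserName }}-home`; only the second pass substitutes the user name.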
For a HelxInst whose HelxApp defines N services:
| Object | Count | Condition |
|---|---|---|
| Deployment | 1 | Always |
| PersistentVolumeClaim | 1 per unique `pvc://` volume | Only for PVC-scheme volumes |
| Service | 1 per service | Only when port != 0 |
All derived objects share the label helx.renci.org/id: <UUID>.
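
As a rough illustration of the table above, the following sketch counts the objects that a set of services would produce. It rests on assumptions this README does not confirm: that a container service yields one Service when at least one of its ports is non-zero, and that PVC uniqueness is keyed on the claim name. The `service` and `counts` types are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// service is a hypothetical, reduced view of one HelxApp service entry.
type service struct {
	Name    string
	Ports   []int32           // the "port" half of each containerPort/port pair
	Volumes map[string]string // volumeId -> volume DSL string
}

type counts struct{ Deployments, PVCs, Services int }

// countDerived estimates how many objects one instance produces.
func countDerived(services []service) counts {
	c := counts{Deployments: 1} // always exactly one Deployment
	claims := map[string]bool{}
	for _, s := range services {
		for _, p := range s.Ports {
			if p != 0 { // assumption: one Service per container service with a non-zero port
				c.Services++
				break
			}
		}
		for _, v := range s.Volumes {
			if strings.HasPrefix(v, "nfs://") {
				continue // NFS volumes do not create PVCs
			}
			src, _, _ := strings.Cut(strings.TrimPrefix(v, "pvc://"), ":")
			claims[src] = true // assumption: uniqueness is by claim name
		}
	}
	c.PVCs = len(claims)
	return c
}

func main() {
	app := []service{
		{Name: "main", Ports: []int32{8888}, Volumes: map[string]string{"home": "home-vol:/home,rwx"}},
		{Name: "sidecar", Ports: []int32{0}, Volumes: map[string]string{
			"data":  "home-vol:/data",
			"cache": "nfs:///srv/cache:/cache",
		}},
	}
	fmt.Printf("%+v\n", countDerived(app)) // {Deployments:1 PVCs:1 Services:1}
}
```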
| Trigger | Effect |
|---|---|
| HelxInst deleted | Controller removes from graph; Kubernetes owner-reference GC removes Deployment, Services, PVCs |
| HelxApp deleted | Controller actively deletes workloads for all connected instances |
| HelxUser deleted | Same as HelxApp — active deletion of connected instance workloads |
Objects with label helx.renci.org/retain: "true" survive deletion, allowing persistent data to outlive instances.
| Label | Value | Applied to |
|---|---|---|
| `executor` | `helxapp-controller` | All derived objects |
| `helx.renci.org/id` | Instance UUID | All derived objects |
| `helx.renci.org/app-name` | App name | Deployment, pod template |
| `helx.renci.org/username` | User name | Deployment, pod template |
| `helx.renci.org/app-class-name` | App class | Pod template |
| `helx.renci.org/instance-name` | Instance name | Pod template |
| `helx.renci.org/retain` | `"true"` | PVCs that survive deletion |
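
The interaction between the `helx.renci.org/id` and `helx.renci.org/retain` labels during cleanup can be sketched as a filter. The `object` type and `deletable` function are hypothetical; the real controller works against Kubernetes objects rather than this reduced struct.

```go
package main

import "fmt"

const (
	labelID     = "helx.renci.org/id"
	labelRetain = "helx.renci.org/retain"
)

// object is a hypothetical, reduced view of a derived Kubernetes object.
type object struct {
	Kind, Name string
	Labels     map[string]string
}

// deletable returns the objects that belong to the given instance UUID
// and are not marked with the retain label.
func deletable(objs []object, uuid string) []object {
	var out []object
	for _, o := range objs {
		if o.Labels[labelID] == uuid && o.Labels[labelRetain] != "true" {
			out = append(out, o)
		}
	}
	return out
}

func main() {
	objs := []object{
		{Kind: "Deployment", Name: "jupyterlab-jeffw", Labels: map[string]string{labelID: "u1"}},
		{Kind: "PersistentVolumeClaim", Name: "jeffw-home", Labels: map[string]string{labelID: "u1", labelRetain: "true"}},
		{Kind: "Service", Name: "other", Labels: map[string]string{labelID: "u2"}},
	}
	for _, o := range deletable(objs, "u1") {
		fmt.Println(o.Kind, o.Name) // Deployment jupyterlab-jeffw
	}
}
```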
Volume entries in HelxApp.Spec.Services[].Volumes use a mini-language:
[scheme://]src:mountPath[#subPath][,option[=value]...]
| Segment | Description |
|---|---|
| `scheme` | `pvc` (default) or `nfs` |
| `src` | PVC claim name, or NFS path (`//server/export`) |
| `mountPath` | Container mount point |
| `subPath` | Optional subdirectory within the volume |
Options:
| Option | Effect |
|---|---|
| `retain` | Adds `helx.renci.org/retain: "true"` label; PVC survives instance deletion |
| `rwx` | ReadWriteMany access mode |
| `rox` | ReadOnlyMany access mode |
| `rwop` | ReadWriteOncePod access mode |
| `size=X` | Storage request (default `1G`) |
| `storageClass=X` | Storage class name |
| `ro` | Mount read-only in the container |
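
The grammar and option tables above can be sketched as a small parser. This is not the project's `template_io` implementation: `parseVolume` and the `volume` struct are hypothetical, the access mode is left empty when no mode option is given (the default is not stated here), and rejecting unknown options is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// volume is a hypothetical parsed form of one volume DSL string.
type volume struct {
	Scheme, Src, MountPath, SubPath string
	AccessMode, Size, StorageClass  string
	Retain, ReadOnly                bool
}

// parseVolume parses [scheme://]src:mountPath[#subPath][,option[=value]...].
func parseVolume(spec string) (volume, error) {
	v := volume{Scheme: "pvc", Size: "1G"} // documented defaults
	parts := strings.Split(spec, ",")
	head := parts[0]
	if scheme, rest, ok := strings.Cut(head, "://"); ok {
		v.Scheme, head = scheme, rest
	}
	head, v.SubPath, _ = strings.Cut(head, "#")
	var ok bool
	if v.Src, v.MountPath, ok = strings.Cut(head, ":"); !ok || v.Src == "" || v.MountPath == "" {
		return volume{}, errors.New("volume spec needs src:mountPath: " + spec)
	}
	for _, opt := range parts[1:] {
		key, val, _ := strings.Cut(opt, "=")
		switch key {
		case "retain":
			v.Retain = true
		case "ro":
			v.ReadOnly = true
		case "rwx":
			v.AccessMode = "ReadWriteMany"
		case "rox":
			v.AccessMode = "ReadOnlyMany"
		case "rwop":
			v.AccessMode = "ReadWriteOncePod"
		case "size":
			v.Size = val
		case "storageClass":
			v.StorageClass = val
		default: // assumption: unknown options are an error
			return volume{}, fmt.Errorf("unknown volume option %q", key)
		}
	}
	return v, nil
}

func main() {
	for _, spec := range []string{
		"shared-data:/data,size=50Gi,rwx",
		"nfs:///nfs-server/cache:/mnt/cache",
		"scratch-vol:/tmp/scratch#mysubdir,rwop",
	} {
		v, err := parseVolume(spec)
		fmt.Printf("%+v %v\n", v, err)
	}
}
```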
Examples:

    volumes:
      home: "{{ .system.UserName }}-home:/home/{{ .system.UserName }},rwx,retain"
      data: "shared-data:/data,size=50Gi,rwx"
      cache: "nfs:///nfs-server/cache:/mnt/cache"
      scratch: "scratch-vol:/tmp/scratch#mysubdir,rwop"

The pod security context is resolved in priority order:
- `HelxInst.Spec.SecurityContext`: explicit per-instance override (highest priority)
- `HelxUser.Spec.UserHandle`: HTTP GET to the URL; the JSON response is parsed for `runAsUser`, `runAsGroup`, `fsGroup`, `supplementalGroups`
- Omitted: no security context on the pod spec
Per-service security contexts from HelxApp.Spec.Services[].SecurityContext are applied at the container level, independent of the pod-level context.
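
The pod-level priority order can be sketched as follows. The names are hypothetical: `securityContext` mirrors only the four documented JSON fields, and `fetch` stands in for the HTTP GET so the example runs without a network. Whether a failed fetch should abort or fall through to "omitted" is not stated in this README; returning the error here is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// securityContext mirrors the four fields the user handle response carries.
type securityContext struct {
	RunAsUser          *int64  `json:"runAsUser,omitempty"`
	RunAsGroup         *int64  `json:"runAsGroup,omitempty"`
	FSGroup            *int64  `json:"fsGroup,omitempty"`
	SupplementalGroups []int64 `json:"supplementalGroups,omitempty"`
}

// resolve applies the priority order: instance override, then user handle,
// then nil (no pod security context). fetch stands in for the HTTP GET.
func resolve(instCtx *securityContext, userHandle string,
	fetch func(url string) ([]byte, error)) (*securityContext, error) {
	if instCtx != nil {
		return instCtx, nil // 1. explicit per-instance override
	}
	if userHandle == "" {
		return nil, nil // 3. omitted
	}
	body, err := fetch(userHandle) // 2. user handle lookup
	if err != nil {
		return nil, err // assumption: a failed lookup is surfaced as an error
	}
	var sc securityContext
	if err := json.Unmarshal(body, &sc); err != nil {
		return nil, fmt.Errorf("decode user handle response: %w", err)
	}
	return &sc, nil
}

func main() {
	fake := func(url string) ([]byte, error) {
		return []byte(`{"runAsUser":1000,"fsGroup":2000,"supplementalGroups":[3000]}`), nil
	}
	sc, err := resolve(nil, "http://ldap-service/user/jeffw", fake)
	if err != nil {
		panic(err)
	}
	fmt.Println(*sc.RunAsUser, *sc.FSGroup, sc.SupplementalGroups) // 1000 2000 [3000]
}
```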
The Helm chart (chart/) supports two installation modes controlled by the cluster value (default false).
Install CRDs and deploy the controller cluster-wide:

    make install    # install CRDs
    helm install helxapp-controller chart/ --set cluster=true

For a namespace-scoped install, a cluster admin must first grant the developer's service account CRD permissions (Kubernetes RBAC escalation prevention requires this):

    # Cluster admin (one-time per namespace):
    make grant-access SA=<namespace>:<service-account>
    # Developer (installs the controller in their namespace):
    helm install helxapp-controller chart/

With cluster=false (default), the chart creates only namespace-scoped Roles and RoleBindings. The controller automatically watches only its own namespace via the WATCH_NAMESPACE environment variable (set from the pod's namespace via the downward API).

To uninstall:

    helm uninstall helxapp-controller   # remove controller
    make uninstall                      # remove CRDs (cluster admin)

The controller SA requires these permissions in its watch namespace:
| Resource | API Group | Verbs |
|---|---|---|
| helxapps, helxapps/status, helxapps/finalizers | helx.renci.org | get, list, watch, create, update, patch, delete |
| helxinsts, helxinsts/status, helxinsts/finalizers | helx.renci.org | get, list, watch, create, update, patch, delete |
| helxusers, helxusers/status, helxusers/finalizers | helx.renci.org | get, list, watch, create, update, patch, delete |
| deployments | apps | get, list, watch, create, update, patch, delete |
| services | core | get, list, watch, create, update, patch, delete |
| persistentvolumeclaims | core | get, list, watch, create, update, patch, delete |
| Mode | `--namespace` flag / `WATCH_NAMESPACE` | RBAC type | CRD access |
|---|---|---|---|
| Namespace-scoped | Set to deployment namespace | Roles + RoleBindings | Watch/manage CRDs in one namespace |
| Cluster-scoped | Empty (watches all namespaces) | ClusterRoles + ClusterRoleBindings | Watch/manage CRDs in all namespaces |
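
The mapping from `WATCH_NAMESPACE` to the two modes in the table can be sketched with the standard library alone. `scopeFor` is a hypothetical helper; how the controller actually passes the namespace to its manager is not shown in this README.

```go
package main

import (
	"fmt"
	"os"
)

// scopeFor maps the WATCH_NAMESPACE value to the mode from the table:
// an empty value means all namespaces, anything else a single namespace.
func scopeFor(watchNamespace string) string {
	if watchNamespace == "" {
		return "cluster-scoped: watch all namespaces"
	}
	return "namespace-scoped: watch " + watchNamespace
}

func main() {
	// In a namespace-scoped install the chart sets this from the pod's
	// namespace via the downward API.
	fmt.Println(scopeFor(os.Getenv("WATCH_NAMESPACE")))
}
```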
The chart and kustomize config provide graduated access roles:
| Role | Permissions |
|---|---|
| `helxapp-viewer-role` | get, list, watch on CRDs |
| `helxapp-editor-role` | create, delete, get, list, patch, update, watch on CRDs |
| `helxapp-manager-role` | Full CRD management including status and finalizers |
For namespace-scoped installs, make grant-access SA=<ns>:<sa> creates the ClusterRole + RoleBinding needed for the SA to manage CRDs in its namespace. This is a one-time operation by a cluster admin.

    make install   # install CRDs
    make run       # run controller against current kubeconfig (watches current namespace)

    make build                                         # compile to bin/manager
    make docker-build docker-push IMG=<registry>:tag   # build and push container image

After editing api/v1/*_types.go:

    make manifests generate   # regenerate CRD YAML and DeepCopy methods
    make test                 # verify everything still works

| Tier | Command | Scope | Requirements |
|---|---|---|---|
| Unit tests | `make test` | Pure logic: template rendering, volume DSL, object graph, security context extraction | None (envtest provides local API server) |
| Controller tests | `make test` | CRD CRUD via envtest (local API server, no real cluster) | setup-envtest (auto-downloaded) |
| E2E tests | `make e2e` | Full controller behavior against a live cluster | Deployed controller in current kubeconfig namespace |

    make test   # unit + controller tests
    make e2e    # e2e tests (requires live cluster)
    # Single test:
    go test ./template_io/ -run TestRenderNginx -v
    # E2E single test:
    cd e2e && go test -v -run TestE2E_CreateTriple_DeploymentCreated ./...

The 22 e2e tests cover:
| Area | Tests |
|---|---|
| Core lifecycle | Create triple, order independence, delete inst/app/user cascades |
| Workload correctness | Labels, container spec, service ports, PVC creation, NFS volumes, resource limits |
| Security context | Instance-level override applied to pod spec |
| Volume DSL | RWX access mode, retain label, storage size |
| Retain behavior | PVC survives instance deletion |
| Update handling | App image update, instance resource update propagated to deployment |
| Status fields | UUID assignment, ObservedGeneration tracking |
| Multi-instance | Separate deployments with distinct UUIDs |
Last updated: 2026-03-27
| Package | Coverage | Notes |
|---|---|---|
| `connect` | 91.7% | HTTP client for user handle; missing: response body read error |
| `template_io` | 75.2% | Template parsing, rendering, volume types, security context |
| `helxapp_operations` | 58.1% | Object graph, artifact generation, transforms; cluster CRUD functions (0%) require envtest/live cluster |
| `controllers` | 0.0% | Reconciler logic; covered by e2e tests against live cluster |
| `api/v1` | 0.0% | Generated DeepCopy code; excluded from coverage targets |
| Total (unit) | 36.2% | |
| E2E | 22 tests | Covers reconciler + cluster CRUD paths not reachable by unit tests |
| Gap | Reason | Path to cover |
|---|---|---|
| `helxapp_operations.CreateOrUpdateResource` | Requires a real API server with scheme registration | E2E tests cover this path |
| `helxapp_operations.Delete{Deployments,PVCs,Services}` | Same: cluster CRUD | E2E tests |
| `helxapp_operations.{Deployment,PVC,Service}FromYAML` | YAML decode + cluster create | E2E tests |
| `controllers/*.Reconcile` | Full reconciler loop | E2E tests |
| `template_io.LoadTemplatesFromDB` | Requires PostgreSQL | Integration test with test container |
| Scenario | Action | Affects |
|---|---|---|
| New field on existing resource (e.g., adding probes to HelxApp services) | Add field to `api/v1/*_types.go`, run `make manifests generate` | CRD schema, templates, unit tests |
| New resource kind (e.g., HelxPolicy for network policies) | New type file in `api/v1/`, new controller in `controllers/`, register in `main.go` | CRD schema, RBAC, Helm chart roles, all test tiers |
| New workload output (e.g., generating Ingress objects) | Add template in `templates/`, add rendering in `helxapp_operations.GenerateArtifacts` | Templates, unit tests, e2e tests, RBAC (Ingress permissions) |
| New volume scheme (e.g., `hostPath://`) | Extend `processVolume()` in `helxapp_operations`, add template branch in `pod.tmpl` | Volume DSL, unit tests |
- Edit `api/v1/*_types.go`: add/modify struct fields with JSON tags
- `make manifests generate`: regenerate CRD YAML and DeepCopy methods
- Update templates in `templates/` if the new field affects rendered output
- Update `helxapp_operations` if the field requires transformation logic
- Add unit tests for the new transform/rendering paths
- Add e2e tests verifying the end-to-end behavior
- Update RBAC if the controller needs permissions for new resource types
- Update Helm chart roles (`chart/templates/roles.yaml`) for both cluster and namespace modes
- `make test`: verify unit + controller tests pass
- `make e2e`: verify against a live cluster
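
The first step of the checklist, adding a struct field with JSON tags, might look like the sketch below. The `Service` and `Probe` types and the `readinessProbe` field are hypothetical illustrations, not the project's actual `api/v1` definitions. The `omitempty` tag keeps existing manifests that omit the new field serializing unchanged.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Probe is a hypothetical new type for the "adding probes" scenario.
type Probe struct {
	Path string `json:"path"`
	Port int32  `json:"port"`
}

// Service is a hypothetical, reduced stand-in for a HelxApp service entry.
type Service struct {
	Name  string `json:"name"`
	Image string `json:"image"`
	// New optional field: a pointer plus omitempty means it is dropped
	// from the JSON when unset.
	ReadinessProbe *Probe `json:"readinessProbe,omitempty"`
}

// encode marshals a Service to its JSON form.
func encode(s Service) string {
	b, err := json.Marshal(s)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(encode(Service{Name: "main", Image: "nginx"}))
	// {"name":"main","image":"nginx"}

	fmt.Println(encode(Service{
		Name:           "main",
		Image:          "nginx",
		ReadinessProbe: &Probe{Path: "/healthz", Port: 8080},
	}))
	// {"name":"main","image":"nginx","readinessProbe":{"path":"/healthz","port":8080}}
}
```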
Every CRD field should have test coverage at multiple tiers:
api/v1/*_types.go field
│
├─ Unit test: verify transformApp/transformVolumes/etc. handles the field
│ (helxapp_operations/helxapp_operations_test.go)
│
├─ Template test: verify the rendered YAML contains expected output
│ (template_io/template_io_test.go)
│
├─ Controller test: verify CRD round-trips through the API server
│ (controllers/helxapp_controller_test.go via envtest)
│
└─ E2E test: verify the field produces the correct Kubernetes object
(e2e/e2e_test.go against live cluster)
This section provides a structured context block for AI assistants working on this codebase. Include it in your prompt or system instructions.
## helxapp-controller — AI assistant context
### Project type
Kubernetes operator (controller-runtime / Kubebuilder) managing three CRDs
in API group helx.renci.org/v1.
### CRDs
- HelxApp: application template (images, ports, env, volumes, security context)
- HelxInst: per-user instance request referencing an app + user; triggers workload creation
- HelxUser: user record; optional userHandle URL for security context
### Core behavior
The three CRDs arrive independently and in any order. The controller maintains
an in-memory relational graph (helxapp_operations package). Workload objects
(Deployment, PVCs, Services) are only created when a complete triple exists.
Templates use double-pass rendering: Go templates produce YAML, then the YAML
is re-rendered as a template to resolve {{ .system.* }} expressions in field values.
### Key packages
| Package | Role |
|----------------------|---------------------------------------------------|
| api/v1/ | CRD type definitions (spec, status, DeepCopy) |
| controllers/ | Three reconcilers, one per CRD kind |
| helxapp_operations/ | In-memory graph, artifact generation, cluster CRUD |
| template_io/ | Template types, rendering, volume DSL parsing |
| templates/ | Go templates (deployment, pod, container, pvc, service) |
| connect/ | HTTP client for userHandle URLs |
| e2e/ | End-to-end tests (separate Go module) |
### Volume DSL
[scheme://]src:mountPath[#subPath][,option[=value]...]
Schemes: pvc (default), nfs
Options: retain, rwx/rox/rwop, size, storageClass, ro
### Security context priority
1. HelxInst.Spec.SecurityContext (explicit override)
2. HTTP GET to HelxUser.Spec.UserHandle URL
3. Omitted
### Build commands
make build # compile bin/manager
make test # unit + controller tests (envtest)
make e2e # e2e tests against live cluster
make manifests # regenerate CRD manifests (after changing api/v1/*_types.go)
make generate # regenerate DeepCopy methods
### After modifying api/v1/*_types.go
Always run: make manifests generate
### Test structure
- Unit tests: template_io/, helxapp_operations/, connect/
- Controller tests (envtest): controllers/ (Ginkgo v2 + Gomega)
- E2E tests (live cluster): e2e/ (separate Go module, 22 tests)
### RBAC
- Namespace-scoped: Roles + RoleBindings, WATCH_NAMESPACE env var
- Cluster-scoped: ClusterRoles + ClusterRoleBindings, no namespace restriction
- Controller needs: CRD verbs + deployments + services + PVCs
### Labels on derived objects
helx.renci.org/id: <UUID> — set-based lookup/deletion
helx.renci.org/retain: "true" — survives instance deletion
helx.renci.org/app-name, username, app-class-name, instance-name
### Key patterns
- Objects with helx.renci.org/retain: "true" survive instance deletion
- PVC patches filter out "remove" operations to protect bound claims
- Templates parsed at startup via Initialize() from /templates directory
- All derived objects share label helx.renci.org/id: <UUID>
Copyright 2023.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.