feat(sdk): conductor-sdk RECON phase 1 -- preflight, CapacityAudit, QuorumHeadroom, RBAC requirements #1

Merged
ontave merged 5 commits into main from feature/recon-phase1 on May 29, 2026

@ontave (Contributor) commented May 29, 2026

Summary

  • PreflightCheck interface and PreflightResult types for structured capability pre-execution validation
  • CapacityAudit with full machineconfig parsing: NodeLabels/NodeTaints are reported as a LabelOwnershipConflict blocker; KubeletArgDeprecated and ExtensionIncompatible are reported as warnings (RECON-J4/J5)
  • QuorumHeadroom: control-plane (CP) quorum safety calculation for 3-node and 5-node clusters, run before CP-targeted coordinated ops (RECON-H5)
  • RequiresCoordination flag per CapabilityEntry to gate concurrent quorum-sensitive operations
  • RBAC requirements completeness check for all declared capabilities
  • PackSourceRef annotation constants (ont.packdelivery.dev/source-chart, source-version, source-digest) for upstream version tracking (RECON-CMN1)
  • 4 test files: preflight interface, CapacityAudit scenarios, QuorumHeadroom calculation, RBAC requirements
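
A minimal sketch of how the PackSourceRef annotation constants might be declared and used. The `ont.packdelivery.dev/` prefix and the three short names come from the summary above; expanding `source-version` and `source-digest` with the same prefix is an assumption, and the `PackSourceRef` struct and its `Annotations` method are hypothetical names used only for illustration.

```go
package main

import "fmt"

// Annotation keys for upstream version tracking (RECON-CMN1).
// Assumption: all three keys share the ont.packdelivery.dev/ prefix.
const (
	AnnotationSourceChart   = "ont.packdelivery.dev/source-chart"
	AnnotationSourceVersion = "ont.packdelivery.dev/source-version"
	AnnotationSourceDigest  = "ont.packdelivery.dev/source-digest"
)

// PackSourceRef is a hypothetical value type describing where a pack came from.
type PackSourceRef struct {
	Chart   string
	Version string
	Digest  string
}

// Annotations renders the reference as an annotation map, omitting empty fields.
func (r PackSourceRef) Annotations() map[string]string {
	out := map[string]string{}
	if r.Chart != "" {
		out[AnnotationSourceChart] = r.Chart
	}
	if r.Version != "" {
		out[AnnotationSourceVersion] = r.Version
	}
	if r.Digest != "" {
		out[AnnotationSourceDigest] = r.Digest
	}
	return out
}

func main() {
	ref := PackSourceRef{Chart: "example-chart", Version: "1.2.3"}
	for key, value := range ref.Annotations() {
		fmt.Printf("%s=%s\n", key, value)
	}
}
```

Omitting empty fields keeps a partially known source reference from writing blank annotations onto an object; whether the SDK does the same is not stated in this PR.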

Test plan

  • go test ./... passes (all 4 test files, all scenarios including blocker paths)
  • CapacityAudit correctly classifies NodeLabels as Blocker=true
  • QuorumHeadroom correctly blocks concurrent CP ops on 3-node cluster
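
The quorum arithmetic behind the last item can be sketched as follows. This is an illustration of standard majority-quorum math, not the SDK's actual code: the function names and signatures are assumptions, and the real QuorumHeadroom may take different inputs.

```go
package main

import "fmt"

// quorumSize returns the majority required for a control plane of n members.
func quorumSize(n int) int {
	return n/2 + 1
}

// QuorumHeadroom returns how many healthy members can be disrupted at once
// while the remaining healthy members still form a quorum. It never goes
// below zero. (Hypothetical signature.)
func QuorumHeadroom(total, healthy int) int {
	headroom := healthy - quorumSize(total)
	if headroom < 0 {
		return 0
	}
	return headroom
}

// AllowsConcurrent reports whether `concurrent` simultaneous disruptive
// control-plane operations fit within the available headroom.
func AllowsConcurrent(total, healthy, concurrent int) bool {
	return concurrent <= QuorumHeadroom(total, healthy)
}

func main() {
	for _, n := range []int{3, 5} {
		fmt.Printf("nodes=%d quorum=%d headroom=%d two-at-once=%v\n",
			n, quorumSize(n), QuorumHeadroom(n, n), AllowsConcurrent(n, n, 2))
	}
}
```

With all members healthy, a 3-node control plane has a quorum of 2 and a headroom of 1, so two concurrent CP operations are rejected; a 5-node control plane has a quorum of 3 and a headroom of 2, so two are allowed.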

ontave added 5 commits May 18, 2026 16:14
…ments

Adds PodRestartCapability, ResourcePatchCapability, ForceVolumeDetachCapability,
CredentialRefreshCapability constants and CapabilityRBACRequirements struct with
per-capability RBAC declarations. Includes generated remediation-rbac.md and unit tests (T-CW-26 through T-CW-30).

Adds machineconfig-sync capability constant for the MachineConfigSync CR
conductor exec handler. Injects the ont.platform.dev/controlled node label
to mark nodes as ONT-governed after applying the source-of-truth machineconfig.

…acityAudit function

RECON-J1: new PreflightCheck interface in runnerlib/preflight.go
- PreflightParams, PreflightResult, PreflightBlocker, PreflightWarning types
- PreflightResult.Passed() helper; 7 unit tests covering all code paths
- CapabilityEntry gains optional Preflight PreflightCheck field (json:"-", Go-only)
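
A rough sketch of how the RECON-J1 types could fit together. Only the type names, the `Passed()` helper, and the `json:"-"` tag come from the commit message; every field, the `Check` method signature, the `PreflightFunc` adapter, and the rule that warnings do not fail a result are assumptions made for illustration.

```go
package main

import (
	"context"
	"fmt"
)

// PreflightParams carries inputs to a check. The Capability field is hypothetical.
type PreflightParams struct {
	Capability string
}

// PreflightBlocker and PreflightWarning share an assumed Code/Message shape.
type PreflightBlocker struct{ Code, Message string }
type PreflightWarning struct{ Code, Message string }

// PreflightResult collects everything a check found.
type PreflightResult struct {
	Blockers []PreflightBlocker
	Warnings []PreflightWarning
}

// Passed is assumed to mean "no blockers"; warnings alone do not fail a check.
func (r PreflightResult) Passed() bool { return len(r.Blockers) == 0 }

// PreflightCheck validates a capability before execution (assumed signature).
type PreflightCheck interface {
	Check(ctx context.Context, p PreflightParams) (PreflightResult, error)
}

// PreflightFunc is a hypothetical adapter that lets a plain function act as a check.
type PreflightFunc func(context.Context, PreflightParams) (PreflightResult, error)

func (f PreflightFunc) Check(ctx context.Context, p PreflightParams) (PreflightResult, error) {
	return f(ctx, p)
}

// CapabilityEntry carries an optional Go-only Preflight field, excluded from JSON.
type CapabilityEntry struct {
	Name                 string
	RequiresCoordination bool
	Preflight            PreflightCheck `json:"-"`
}

func main() {
	entry := CapabilityEntry{
		Name: "machineconfig-sync",
		Preflight: PreflightFunc(func(ctx context.Context, p PreflightParams) (PreflightResult, error) {
			return PreflightResult{
				Warnings: []PreflightWarning{{Code: "KubeletArgDeprecated", Message: "example warning"}},
			}, nil
		}),
	}
	res, err := entry.Preflight.Check(context.Background(), PreflightParams{Capability: entry.Name})
	if err != nil {
		fmt.Println("preflight error:", err)
		return
	}
	fmt.Println("passed:", res.Passed(), "warnings:", len(res.Warnings))
}
```

Keeping `Preflight` behind `json:"-"` means a serialized capability catalogue stays pure data while the check itself lives only in Go code, which matches the "Go-only" note above.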

RECON-J5: new CapacityAudit function in runnerlib/capacity_audit.go
- Statically evaluates raw Talos machineconfig YAML for K_eff reduction sources
- Warning codes: MachineConfigSyntax (blocker), LabelOwnershipConflict, TaintKeyCollision,
  KubeletArgDeprecated, ExtensionIncompatible (blocker when catalogue provided)
- LabelOwnershipConflict is non-blocking at static layer; RECON-J4 live check makes it a
  PreflightBlocker when confirmed via Kubernetes managedFields API
- 10 unit tests covering all warning codes, skip conditions, and edge cases
- Direct dependency on go.yaml.in/yaml/v2 (was indirect)
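
The two-layer handling of LabelOwnershipConflict described above (non-blocking in the static audit, promoted to a blocker once the live check confirms it) might look roughly like this. The `Finding` type, the `Escalate` function, and the `OwnershipConfirmer` callback are hypothetical; the callback stands in for the RECON-J4 managedFields lookup, which is not shown.

```go
package main

import "fmt"

// Warning code taken from the commit message.
const CodeLabelOwnershipConflict = "LabelOwnershipConflict"

// Finding is a hypothetical record produced by the static audit.
type Finding struct {
	Code    string
	Subject string // e.g. the label key the finding is about
	Blocker bool
}

// OwnershipConfirmer stands in for the live check: it reports whether
// another manager really owns the given label key on the node.
type OwnershipConfirmer func(labelKey string) bool

// Escalate returns a copy of the static findings in which every
// LabelOwnershipConflict confirmed by the live check becomes a blocker.
// All other findings are left unchanged.
func Escalate(static []Finding, confirmed OwnershipConfirmer) []Finding {
	out := make([]Finding, len(static))
	copy(out, static)
	for i := range out {
		if out[i].Code == CodeLabelOwnershipConflict && confirmed(out[i].Subject) {
			out[i].Blocker = true
		}
	}
	return out
}

func main() {
	static := []Finding{
		{Code: CodeLabelOwnershipConflict, Subject: "ont.platform.dev/controlled"},
		{Code: "KubeletArgDeprecated", Subject: "example-arg"},
	}
	// Pretend the live check confirms a conflict for every label it is asked about.
	live := func(labelKey string) bool { return true }
	for _, f := range Escalate(static, live) {
		fmt.Printf("%s (%s) blocker=%v\n", f.Code, f.Subject, f.Blocker)
	}
}
```

Returning a copy instead of mutating the input keeps the static audit result reusable, for example when the live check has to be retried.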
node-reenrollment re-applies stored machineconfig to a node that has been
reset to Talos maintenance mode. Triggered by ClusterNodeHealthLoop when
two-phase health classification detects MaintenanceMode. RECON-C10.
@ontave merged commit 156ccdc into main on May 29, 2026
2 checks passed