refactor(rig-v2):migrate k0sctl from rig v0.x to rig v2 #1092

Draft
kke wants to merge 30 commits into main from feat/rig-v2

Conversation

@kke kke commented Jun 9, 2026

Contributor

wip

this will be a massive change in code, but minimal or preferably zero change in use

@kke kke added the enhancement, chore, and rigv2 labels Jun 9, 2026
@kke kke force-pushed the feat/rig-v2 branch 2 times, most recently from b6dea63 to 6739723 Compare June 9, 2026 13:24
@kke kke requested a review from Copilot June 9, 2026 13:24

Copilot AI left a comment

Pull request overview

This PR migrates k0sctl’s remote-execution and host-management layer from rig v0.x to rig/v2, refactoring host connection, command execution, remote filesystem operations, and OS-configurer plumbing to the new APIs while aiming to keep user-facing behavior unchanged.

Changes:

  • Replaced rig/exec usage with rig/v2 (cmd, protocol, remotefs) across phases/actions and k0s/node helpers (including context-aware reconnects and stream execution via Proc().Start(ctx)).
  • Refactored cluster.Host to embed rig.CompositeConfig + *rig.Client, added context-aware Connect(ctx) and a slog→logrus logging bridge.
  • Reworked the configurer subsystem to use a new configurer.Host interface and an internal OS-module registry, updating Linux/Windows configurers and unit tests accordingly; updated Go toolchain and dependencies.

Reviewed changes

Copilot reviewed 62 out of 63 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| pkg/node/statusfunc.go | Switches node readiness/status checks to rig v2 exec + reconnect with context. |
| pkg/k0s/binprovider/host.go | Updates binprovider host contract to rig v2 client/FS model. |
| pkg/k0s/binprovider/helpers.go | Migrates staging helpers to remotefs upload + sudo FS fallback logic. |
| pkg/apis/k0sctl.k0sproject.io/v1beta1/cluster/spec_test.go | Updates tests for rig v2 connection config types. |
| pkg/apis/k0sctl.k0sproject.io/v1beta1/cluster/k0s.go | Moves token/cluster-id commands to rig v2 exec options. |
| pkg/apis/k0sctl.k0sproject.io/v1beta1/cluster/host.go | Major Host refactor: rig v2 client lifecycle, OS detection/configurer resolution, sudo FS usage. |
| pkg/apis/k0sctl.k0sproject.io/v1beta1/cluster/host_test.go | Updates host tests for context-aware Connect and v2 config. |
| phase/validate_hosts.go | Uses rig v2 sudo FS for cleanup walk. |
| phase/validate_hosts_test.go | Updates mocks to new configurer Host type. |
| phase/uploadfiles.go | Migrates uploads/downloads/metadata operations to remotefs + new configurer APIs. |
| phase/upload_k0s_test.go | Updates test hosts/configurer typing for rig v2 + OS release. |
| phase/upgrade_workers.go | Reworks reinstall path to use rig v2 exec (incl. Windows stderr behavior). |
| phase/upgrade_controllers.go | Switches reinstall/readiness checks to rig v2 sudo exec/output. |
| phase/restore.go | Moves restore execution to rig v2 process start/wait with context. |
| phase/reset_workers.go | Moves reset execution to rig v2 process start/wait with context. |
| phase/reset_leader.go | Uses rig v2 sudo exec/output for leader reset. |
| phase/reset_controllers.go | Moves reset execution to rig v2 process start/wait with context. |
| phase/reinstall.go | Migrates reinstall and status polling to rig v2 sudo exec. |
| phase/prepare_hosts.go | Updates environment update + reconnect logic to rig v2 error types and protocol detection. |
| phase/lock.go | Updates lockfile stat path to rig v2 configurer stat usage. |
| phase/install_workers.go | Migrates token invalidate/reset/install logic to rig v2 sudo exec + cmd options. |
| phase/install_controllers.go | Migrates controller install exec-streams to rig v2 proc start/wait + readiness checks. |
| phase/initialize_k0s.go | Migrates leader init install + readiness checks to rig v2 sudo exec/output. |
| phase/get_kubeconfig.go | Uses rig v2 cmd hide-output with sudo exec/output. |
| phase/get_kubeconfig_test.go | Updates test config for rig v2 connection config types. |
| phase/gather_k0s_facts.go | Migrates etcd member listing and k0s status/version reads to rig v2 proc/exec. |
| phase/detect_os.go | Updates OS detection fallback to rig v2 release model + preserves detected release. |
| phase/connect.go | Migrates connect retries to context-aware Connect(ctx) + v2 retryability errors. |
| phase/configure_k0s.go | Migrates config validation and directory creation to rig v2 proc + sudo FS. |
| phase/backup.go | Uses rig v2 sudo exec and sudo FS for backup file discovery/streaming. |
| phase/apply_manifests.go | Migrates kubectl apply streaming to rig v2 proc with stdin reader. |
| action/config_status.go | Migrates leader connect + event query to rig v2 sudo exec/output. |
| action/config_edit.go | Migrates leader connect + apply via stdin string to rig v2 cmd options. |
| configurer/windows/windows.go | Updates Windows configurer registration and exec/stderr redaction to rig v2 APIs. |
| configurer/windows/windows_test.go | Updates stubs to rig v2 runner interfaces and types. |
| configurer/windows.go | Refactors BaseWindows to new Host interface and adds rig v2 service/FS helpers. |
| configurer/windows_test.go | Updates configurer package Windows host mocks to rig v2 interfaces. |
| configurer/registry.go | Introduces internal OS-module registry replacing rig v0 registry usage. |
| configurer/linux/ubuntu.go | Switches OS module registration to configurer registry + rig v2 release. |
| configurer/linux/sles.go | Switches registration + implements zypper-based package install using rig v2 sudo exec. |
| configurer/linux/slackware.go | Switches registration + updates slackpkg install path to rig v2 sudo exec. |
| configurer/linux/opensuse.go | Switches OS module registration to new registry + rig v2 release. |
| configurer/linux/linux_test.go | Updates mocks to rig v2 + remotefs-based file existence control. |
| configurer/linux/flatcar.go | Switches registration + updates Host type for unsupported package install. |
| configurer/linux/enterpriselinux/rocky.go | Switches OS module registration to new registry + rig v2 release. |
| configurer/linux/enterpriselinux/rhel.go | Switches OS module registration and CoreOS exclusion to rig v2 release fields. |
| configurer/linux/enterpriselinux/oracle.go | Switches OS module registration + fixes doc string typo. |
| configurer/linux/enterpriselinux/fedora.go | Switches OS module registration and CoreOS exclusion to rig v2 release fields. |
| configurer/linux/enterpriselinux/centos.go | Switches OS module registration to new registry + rig v2 release. |
| configurer/linux/enterpriselinux/amazon.go | Switches registration + updates Hostname signature to new Host type. |
| configurer/linux/enterpriselinux/almalinux.go | Switches OS module registration to new registry + rig v2 release. |
| configurer/linux/enterpriselinux.go | Implements yum-based package install using rig v2 sudo exec. |
| configurer/linux/debian.go | Switches registration + implements apt-get based package install using rig v2 sudo exec. |
| configurer/linux/coreos.go | Switches registration + updates Host type for unsupported package install. |
| configurer/linux/archlinux.go | Switches registration + implements pacman-based package install using rig v2 sudo exec. |
| configurer/linux/alpine.go | Switches registration + implements apk-based package install using rig v2 sudo exec. |
| configurer/linux.go | Refactors Base Linux configurer to new Host interface, adds rig v2 service/FS-based helpers. |
| configurer/interface.go | Updates Configurer/HostValidator APIs to new Host interface and simplified method signatures. |
| configurer/host.go | Adds configurer.Host interface (rig v2 runner + sudo + FS). |
| cmd/init.go | Updates generated example config to rig v2 composite config types. |
| cmd/flags.go | Switches redaction toggle to rig v2 cmd package; removes rig v0 global logger usage. |
| go.mod | Bumps toolchain, replaces rig v0 with rig/v2, adds slog-logrus bridge, updates x/* deps. |
| go.sum | Updates dependency checksums for rig/v2 migration and related transitive changes. |

Comment thread phase/configure_k0s.go
Comment thread configurer/windows.go Outdated
Comment thread configurer/windows.go
@kke kke force-pushed the feat/rig-v2 branch 5 times, most recently from f666437 to ad19bc8 Compare June 10, 2026 12:32
@kke kke requested a review from Copilot June 10, 2026 12:45

Copilot AI left a comment

Pull request overview

Copilot reviewed 75 out of 76 changed files in this pull request and generated 4 comments.

Comment thread phase/daemon_reload.go Outdated
Comment thread phase/apply_manifests.go
Comment thread phase/validate_hosts.go Outdated
Comment thread pkg/apis/k0sctl.k0sproject.io/v1beta1/cluster/host.go

Copilot AI left a comment

Pull request overview

Copilot reviewed 76 out of 77 changed files in this pull request and generated 5 comments.

Comment thread pkg/apis/k0sctl.k0sproject.io/v1beta1/cluster/host.go
Comment thread phase/validate_hosts.go Outdated
Comment thread phase/lock.go
Comment thread phase/configure_k0s.go
Comment thread configurer/linux.go
kke added 9 commits June 11, 2026 12:09
Replace the rig v0.x integration with rig v2 across the configurer package,
cluster.Host, the binary providers and the node status helpers. This is the
mechanical + sudo-model portion of the migration (Phases 1-2 plus the
configurer simplification of Phase 3); the phase/*.go call sites remain on the
old API and are migrated separately.

Key changes:
- configurer: add own OS module registry (registry.go) matching on the full
  *os.Release, replacing rig/os/registry. Add configurer.Host (host.go) =
  cmd.SimpleRunner + Sudo() + FS(). Shrink the Configurer interface: drop
  exec.Option varargs, Stat now returns fs.FileInfo. Linux/BaseWindows no
  longer embed rig os modules; FS and service operations delegate to
  client.FS()/client.Sudo().Service(). Re-add per-distro InstallPackage
  commands previously inherited from rig v0.x os modules.
- cluster.Host: embed rig.ClientWithConfig instead of rig.Connection; store
  *os.Release; ResolveConfigurer uses configurer.ResolveOSModule; sudo via
  h.Sudo(); connection validation via ConnectionConfig.Validate().
- binprovider.Host: expose Sudo() *rig.Client; upload via remotefs.Upload.
- pkg/node: protocol.ErrNonRetryable replaces rig.ErrCantConnect; Connect(ctx).
- go.mod: add development replace directive to a local rig v2 checkout
  (documented inline; drop before release).

Build state: ./configurer/..., the cluster package, pkg/k0s/binprovider and
pkg/node build; `go mod tidy` passes. `go build ./...` still fails only in
package phase (exec/sudo/os.Host call sites, follow-up work). Test files in the
migrated packages are not yet ported and do not compile.

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
Fixes all test compilation and runtime failures after the v2 production
code migration. Changes fall into two categories:

Test migration (test files only):
- Update mock types in configurer/linux, configurer/windows,
  configurer/windows/windows_test to satisfy the v2 configurer.Host
  interface (cmd.SimpleRunner + Sudo()*rig.Client + FS() remotefs.FS)
- Replace exec.Option/Waiter with cmd.ExecOption/protocol.Waiter
- Remove Upload/Execf/ExecOutputf/ExecStreams (not in v2 interface)
- Replace os.Host with configurer.Host in validate_hosts_test
- Replace rig.Connection/SSH/OSVersion struct literals with v2 equivalents
- Replace rigos.Linux embed and rigos.Host param in upload_k0s_test
- Add newFileExistClient helper in linux_test for TestPaths: v2
  KubeconfigPath calls h.Sudo().FS().FileExist() instead of Execf

Production fixes required to make tests pass:
- cluster/host.go: replace rig.ClientWithConfig yaml:",inline" with
  rig.CompositeConfig yaml:",inline" + *rig.Client yaml:"-" — the
  documented correct k0sctl embedding pattern (see rig compositeconfig_test
  TestK0sctlHostPattern). ClientWithConfig has its own UnmarshalYAML which
  swallows sibling fields like role/files when used inline with yaml.v2.
- Connect() lazily creates *rig.Client with WithConnectionFactory so the
  logger can be injected per-connect rather than per-unmarshal
- Add String() override to avoid nil-client panic before Connect()
- Address() falls back to CompositeConfig fields before returning 127.0.0.1
- Validate() skips connection validation when no protocol is configured
- ExpandTokens: guard h.IsConnected() with h.Client != nil check
- cmd/init.go: update host construction to use CompositeConfig directly
- phase/detect_os.go: remove redundant .Client. selector (QF1008)

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
Host.String() before Connect() now delegates to CompositeConfig.String()
which includes the port (e.g. "ssh.Config{127.0.0.1:2222}"), so hosts that
share an IP but use different SSH ports are correctly treated as distinct
during cluster validation.

stageUpload now tries h.FS().MkdirAll() before falling back to
h.Sudo().FS().MkdirAll(), keeping the binary install directory user-owned
when the caller already has write access (e.g. test temp dirs). This
prevents permission-denied failures during TempDir cleanup in tests while
preserving the sudo fallback for system paths like /usr/local/bin.

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
…riptPath)

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
- Remove double colon in k0s config validation error message
- DownloadURL on Windows now uses h.Sudo() to match Linux behaviour
  and allow writes to system paths
- KubeconfigPath on Windows now uses h.Sudo().FS().FileExist() to
  match the Linux implementation and correctly detect admin.conf
  under privilege-restricted dataDirs

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
The Configurer interface had ~17 methods that were thin wrappers around
rig v2's remotefs.FS/OS interface (FileExist, DeleteFile, MoveFile,
TempFile, TempDir, MachineID, SystemTime, LookPath, FileContains,
Hostname, CommandExist, IsContainer, Stat, MkDir, Chown, Touch,
DeleteDir). Both PosixFS and WinFS implement all of these natively,
so the Configurer layer added no value.

Remove all 17 methods from the interface and from Linux/BaseWindows
implementations. Call sites now use h.FS() or h.Sudo().FS() directly.
Kept WriteFile, ReadFile, and Chmod which have perm-string parsing logic.

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
kke added 20 commits June 11, 2026 12:09
…er layer

Use rig v2's sh.Command and sh.CommandBuilder for all shell command
construction in Linux configurers and package manager implementations.
Eliminates manual shellescape.Quote calls and fmt.Sprintf string
building, ensuring correct argument quoting throughout.
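
To illustrate why a command builder is safer than `fmt.Sprintf`, here is a minimal sketch of POSIX single-quote escaping. The `quote` and `command` helpers are hypothetical and written for this example only; they are not rig v2's `sh.Command` or `sh.CommandBuilder` API, whose exact signatures are not shown in this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// quote wraps s in single quotes for a POSIX shell. An embedded single quote
// is written as '\'' (close quote, escaped quote, reopen quote).
func quote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

// command joins a program name with its arguments, quoting every argument so
// that spaces and shell metacharacters are passed through literally.
func command(prog string, args ...string) string {
	parts := make([]string, 0, len(args)+1)
	parts = append(parts, prog)
	for _, a := range args {
		parts = append(parts, quote(a))
	}
	return strings.Join(parts, " ")
}

func main() {
	// The path contains a space and a "$", both of which stay literal.
	fmt.Println(command("chmod", "0600", "/etc/k0s/my $config.yaml"))
}
```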

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
internal/shell was a vendored copy of rig v2's sh/shellescape package.
Replace its usage in flags.go with the upstream package directly, and
replace al.essio.dev/pkg/shellescape in configure_k0s.go with sh.Quote.

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
Remove the HTTPStatus method from the Configurer interface and its
Linux/Windows implementations; call remotefs.HTTPStatus via h.FS()
with a threaded context instead.

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
…irectly

Both methods were thin wrappers; call sites now use h.Sudo().FS().WriteFile
and h.FS().ReadFile directly with literal octal permissions.

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
…emoval

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
Replace the Configurer.CheckPrivilege(h) call in ValidateHosts with
rig v2's h.CheckSudo(ctx), and remove the interface method and its
Linux/Windows implementations. Bump rig v2 dependency to include
Client.CheckSudo.

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
Architecture detection is now delegated to rig v2's OSRelease.Arch()
which normalizes arch strings to GOARCH equivalents. The Host.Arch()
wrapper method is retained so all eight call sites continue to work
unchanged; only its implementation switches from Configurer.Arch(h)
to h.OSRelease.Arch().
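
The kind of normalization meant here can be sketched as a small lookup. `normalizeArch` is a hypothetical function and the set of aliases is an assumption; rig v2's `OSRelease.Arch()` may recognize a different list.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeArch maps common `uname -m` style strings to GOARCH names.
// The alias list is illustrative, not rig's actual table.
func normalizeArch(raw string) (string, error) {
	switch strings.ToLower(strings.TrimSpace(raw)) {
	case "x86_64", "amd64":
		return "amd64", nil
	case "aarch64", "arm64":
		return "arm64", nil
	case "armv7l", "armv8l", "arm":
		return "arm", nil
	case "riscv64":
		return "riscv64", nil
	default:
		return "", fmt.Errorf("unsupported architecture %q", raw)
	}
}

func main() {
	for _, raw := range []string{"x86_64\n", "aarch64", "sparc"} {
		arch, err := normalizeArch(raw)
		fmt.Printf("%q -> %q err=%v\n", raw, arch, err)
	}
}
```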

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
Delegates Context, Timeout, WithDefaultTimeout, and Times to
github.com/k0sproject/rig/v2/retry internally. All exported function
signatures and the ErrAbort sentinel are preserved unchanged; no call
sites require modification.

ErrAbort maps to rigretry.If(func(e error) bool { return !errors.Is(e, ErrAbort) }).

Behavioral note: delay between attempts now grows linearly (delay x
attempt) due to rig's default backoff model, rather than the previous
constant-interval ticker. The effective retry window is unchanged for
typical usage.

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
Replace Configurer.DownloadURL with rig v2's remotefs.FS.DownloadURL,
which is already implemented for both Posix (curl/wget) and Windows
(Invoke-WebRequest) targets. Host.DownloadURL now delegates directly to
h.Sudo().FS().DownloadURL; the call site in phase/uploadfiles.go uses
that wrapper. Remove the trailingNumber helper from linux.go that was
only used by the deleted DownloadURL implementation.

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
…lock file

Replace the platform-specific UpsertFile (temp file + mv -n on Linux, Test-Path/Move-Item
on Windows) with a direct OpenFile(O_CREATE|O_EXCL|O_WRONLY) call on the remotefs.FS
returned by h.Sudo().FS(). Both PosixFS and WinFS honour O_EXCL: PosixFS returns
fs.ErrExist when the file already exists; WinFS passes FileMode.CreateNew to the
rigrcp daemon which throws if the file is already present. The fallback read is
switched to h.Sudo().FS() so it works against the root-owned lock file.

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
…h/ShellQuote

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
Remove 8 service-related methods from the Configurer interface
(StartService, StopService, RestartService, ServiceIsRunning,
ServiceScriptPath, DaemonReload, UpdateServiceEnvironment,
CleanupServiceEnvironment) and replace all ~19 call sites with direct
calls to h.Sudo().Service(name).{Start,Stop,Restart,IsRunning,
ScriptPath,DaemonReload,SetEnvironment}(ctx).

Real ctx is now threaded at all call sites in Run() methods; CleanUp()
methods with no ctx parameter continue to use context.Background().

On Windows, gather_k0s_facts preserves service-existence detection via
an inline sc.exe query fallback since WinSCM does not expose a script
path.

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
…fix)

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
- Host.Arch() falls back to rig live detection when k0sctl OSRelease
  field lacks arch (fixes OS_OVERRIDE and ID_LIKE smoke-test failures)
- Move Linux-specific host tests to host_posix_test.go with //go:build
  !windows to fix Windows CI failures from POSIX path assumptions
- daemon_reload: use h.K0sServiceName() instead of hardcoded
  "k0scontroller" so worker hosts reload the correct service
- apply_manifests: return error when kubectl apply fails instead of
  silently logging and continuing
- validate_hosts: fix fs.WalkDir called with glob pattern (globs are not
  expanded); walk the directory and filter by name prefix instead
- host: normalize backslashes in k0sBinaryPathDir() before path.Dir()
  to handle user-configured Windows-style paths correctly

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
…th from Configurer

- Drop Slackware support: no systemd/OpenRC, no rig package manager
  support, not a viable k0s platform
- Remove InstallPackage from Configurer interface; call rig's
  PackageManager directly in phase/prepare_hosts.go with a single
  pm.Update()+pm.Install() instead of one call per package
- Remove ReplaceK0sTokenPath from Configurer interface and impls — it
  was never called anywhere (dead code)
- Simplify per-distro configurer files now that InstallPackage is gone
  (debian, archlinux, sles, coreos, flatcar, enterpriselinux)
- Alpine.Prepare() now calls the package manager directly

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
- Host.Arch: guard against nil Client (host not connected)
- validate_hosts: normalize backslashes before path.Dir in
  cleanUpOldK0sTmpFiles so Windows-style paths resolve correctly
- lock: distinguish os.ErrExist from other OpenFile errors in tryLock;
  permission denied / missing directory now returns immediately instead
  of falling through to Stat/Read/Remove
- configure_k0s: replace context.TODO() with context.Background()
- linux UpdateEnvironment: validate env keys and values, rejecting newlines
  in both and '=' in keys, to prevent /etc/environment corruption
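
The environment validation in the last bullet amounts to a check along these lines. `validateEnv` is a hypothetical name, and treating carriage returns like newlines is an assumption added for this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// validateEnv rejects keys containing '=' or line breaks and values
// containing line breaks, since either would corrupt a KEY=value file
// such as /etc/environment.
func validateEnv(env map[string]string) error {
	for k, v := range env {
		if strings.ContainsAny(k, "=\r\n") {
			return fmt.Errorf("invalid environment key %q", k)
		}
		if strings.ContainsAny(v, "\r\n") {
			return fmt.Errorf("invalid value for environment key %q: contains a line break", k)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateEnv(map[string]string{"HTTP_PROXY": "http://proxy:3128"}))
	fmt.Println(validateEnv(map[string]string{"A=B": "x"}))
	fmt.Println(validateEnv(map[string]string{"A": "x\nINJECTED=1"}))
}
```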

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
The rig v2 retry uses delay*(backoffFactor*attempt) — a linearly growing
backoff. With the default 5s interval and a 2-minute timeout this reduced
attempts from ~24 (old constant-interval) to ~7-8, causing timeouts on
slow operations that previously succeeded.

Restore the original constant-interval implementation using
time.NewTicker. Call sites already translate rig.ErrNonRetryable into
retry.ErrAbort via errors.Join, so the abort sentinel wiring is correct.

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
The migration to remotefs.HTTPStatus lost the -k flag that the old
configurer/linux.go HTTPStatus used (curl -kso /dev/null ...), causing
curl exit 60 (SSL certificate verification failure) against k0s's
self-signed kube-apiserver.

Switch CheckHTTPStatus to remotefs.HTTPStatusInsecure (new in rig
8bbc03d), which runs curl -k on POSIX and Invoke-WebRequest with
-SkipCertificateCheck on Windows PS6+.

Signed-off-by: Kimmo Lehto <klehto@mirantis.com>
Signed-off-by: Kimmo Lehto <klehto@mirantis.com>