Commit acda09d

hellt and Copilot committed
Update EDA SW upgrade procedure (#187)
* update upgrade procedure

* Update docs/software-install/upgrades/index.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* use git scale down instead of uninstall to keep git database pv

* orpho and gramma checks

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
(cherry picked from commit 18415a6)
1 parent 05a6041 commit acda09d

1 file changed

Lines changed: 105 additions & 67 deletions

docs/software-install/upgrades/index.md

@@ -2,12 +2,14 @@
22

33
Assuming you have a working EDA cluster the upgrade procedure will consist of the following steps:
44

5-
1. Backup your existing cluster.
6-
2. Update the [playground][playground] repository.
7-
3. Uninstall the existing version of EDA.
8-
4. Install the new EDA `kpt` package (on both active and standby members if running geo redundant).
9-
5. Restore your backup.
10-
6. Upgrade your applications.
5+
1. Pause NPP interactions.
6+
2. Backup your existing cluster.
7+
3. Update the [playground][playground] repository.
8+
4. Uninstall the existing version of EDA.
9+
5. Install the new EDA `kpt` package (on both active and standby members if running geo redundant).
10+
6. Restore your backup.
11+
7. Upgrade your applications.
12+
8. Resume NPP interactions.
1113

1214
/// admonition | Nuances for Air-gapped and Geo-redundant clusters
1315
type: info
@@ -18,9 +20,9 @@ In geo redundant clusters, cluster members cannot run different versions. Theref
1820

1921
/// admonition | EDA upgrade procedure scope
2022
type: subtle-note
21-
This is the Nokia EDA software upgrade procedure, it does not cover upgrading Talos Linux or Kubernetes.
23+
This is the Nokia EDA software upgrade procedure; it does not cover upgrading Talos Linux or Kubernetes. Nokia EDA does not require upgrading Talos or Kubernetes for every EDA version upgrade, unless explicitly stated in the release notes.
2224

23-
To upgrade Talos and Kubernetes perform **one of** the following:
25+
If a Talos and/or Kubernetes upgrade is desired, perform **one of** the following:
2426
/// tab | Install a new EDA cluster
2527
When running EDA cluster on virtual machines it might be easier to perform a new installation with the desired Talos and Kubernetes versions and restore your existing cluster backup into the new cluster:
2628

@@ -34,33 +36,64 @@ Follow the respective [Talos Linux upgrade documentation](https://docs.siderolab
3436
///
3537
///
3638

37-
## Backing up your cluster
39+
## Pausing NPP interactions
40+
41+
Prior to taking a backup of your cluster, place all `TopoNode` resources into `emulate` mode to avoid any ongoing interactions with the network devices during the upgrade process.
42+
43+
In this mode, EDA does not interact with target devices, effectively pausing the cluster's interaction with your infrastructure. You can still interact with EDA and the `TopoNode` resources; changes are pushed upon switching back to `normal` mode.
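To illustrate what the mode switch amounts to for a single node, here is a minimal sketch that builds a JSON merge patch for the `.spec.npp.mode` field. The `npp_mode_patch` helper is hypothetical and not part of the playground repository, the namespace and node name in the usage comment are placeholders, and whether the bulk `make` target works the same way internally is an assumption.

```shell
# Hypothetical helper (not part of the playground Makefile).
# Prints a JSON merge patch that sets .spec.npp.mode to the mode given as $1.
npp_mode_patch() {
  printf '{"spec":{"npp":{"mode":"%s"}}}\n' "$1"
}

# Example usage (namespace "eda" and node "leaf1" are placeholders):
#   kubectl -n eda patch toponode leaf1 --type merge -p "$(npp_mode_patch emulate)"
```

Keeping the patch body in one small function makes it easy to reuse the same call with `normal` when resuming interactions.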
3844

39-
Backing up your existing cluster is performed using the [`edactl` CLI tool](../../user-guide/using-the-clis.md#edactl):
45+
To set `emulate` mode in bulk, run the script from the [playground](https://github.com/nokia-eda/playground) repo directory on a machine where you have [`kubectl`](../../user-guide/using-the-clis.md#kubectl) configured with access to your cluster:
4046

4147
```{.shell .no-select}
42-
edactl platform backup
48+
make set-npp-mode-emulate
4349
```
4450

45-
<div class="embed-result highlight">
51+
After the script has been run, verify that the `TopoNode` resources are in `emulate` mode:
52+
53+
```{.shell .no-select}
54+
kubectl get toponode -A \
55+
-o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,MODE:.spec.npp.mode'
56+
```
57+
58+
<div class="embed-result">
4659
```{.shell .no-select .no-copy}
47-
Platform backup done at eda-backup-engine-config-2025-04-22_13-51-50.tar.gz
60+
NAMESPACE NAME MODE
61+
my-other-ns leaf1 emulate
62+
my-other-ns leaf2 emulate
63+
my-other-ns leaf3 emulate
64+
my-other-ns leaf4 emulate
65+
my-other-ns spine1 emulate
66+
my-other-ns spine2 emulate
67+
eda leaf1 emulate
68+
eda leaf2 emulate
69+
eda spine1 emulate
4870
```
4971
</div>
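
On clusters with many nodes, scanning the table by eye is error-prone. The following is a minimal sketch of a hypothetical `check_npp_mode` helper (not provided by the playground repository) that reads the custom-columns output shown above on standard input and reports every `TopoNode` that is not in the expected mode. The function name and its exit-status convention are assumptions made for illustration.

```shell
# Hypothetical helper (not part of the playground Makefile).
# Reads "NAMESPACE NAME MODE" rows on stdin and prints every row whose
# MODE column differs from the expected mode given as $1.
# Returns non-zero if at least one such row was found.
check_npp_mode() {
  awk -v want="$1" '
    NR == 1 { next }   # skip the header row
    NF >= 3 && $3 != want { print $1 "/" $2 " is in mode " $3; bad = 1 }
    END { exit (bad ? 1 : 0) }
  '
}

# Example usage:
#   kubectl get toponode -A \
#     -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,MODE:.spec.npp.mode' \
#     | check_npp_mode emulate
```

Passing the expected mode as an argument lets the same check be reused with `normal` after the upgrade.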
5072

51-
This will create a backup in a gzipped tarball format in the toolbox pod. The backup archive contains all the necessary information to restore your cluster.
73+
## Backing up your cluster
5274

53-
Copy this backup outside of your `eda-toolbox` pod - as this pod is destroyed and recreated during the upgrade. Replace the file name with the one from the `edactl platform backup` command output and run:
75+
Backing up your existing cluster is performed using the `collect-backup` target provided in the [`Makefile`](https://github.com/nokia-eda/playground/blob/main/Makefile) of the playground repository.
5476

5577
```{.shell .no-select}
56-
toolboxpod=$(kubectl -n eda-system get pods \
57-
-l eda.nokia.com/app=eda-toolbox -o jsonpath="{.items[0].metadata.name}")
78+
make collect-backup
79+
```
5880

59-
kubectl cp eda-system/$toolboxpod:/eda/eda-backup-engine-config-2025-04-22_13-51-50.tar.gz \
60-
/tmp/eda-backup.tar.gz
81+
<div class="embed-result">
82+
```{.shell .no-select .no-copy}
83+
[ INFO ] Starting backup
84+
Platform backup done at /tmp/eda-platform-backup-2025-12-18-21-37-42.tar.gz
85+
[ OK ] Collected backup
86+
[ INFO ] Transferring to host /tmp/eda-support/logs-2025-12-18/eda-platform-backup-2025-12-18-21-37-42.tar.gz
87+
tar: Removing leading `/' from member names
88+
[ OK ] Transferred to /tmp/eda-support/logs-2025-12-18/eda-platform-backup-2025-12-18-21-37-42.tar.gz
6189
```
90+
</div>
6291

63-
The backup file will be copied to the `/tmp/eda-backup.tar.gz` file on your system.
92+
This will create a timestamped backup archive in the toolbox pod and copy it to the `/tmp/eda-support/logs-<date>` directory on the system where the `make` target was run. The backup archive contains all the necessary information to restore your cluster.
93+
94+
/// warning | Testing the backup
95+
It is highly recommended to test the backup by restoring it in a test cluster before proceeding with the upgrade of your production cluster.
96+
///
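
Before moving on, it can help to confirm that the archive copied to your machine is readable. Below is a minimal sketch, assuming the `/tmp/eda-support/logs-<date>` layout shown in the output above; the `latest_backup` and `verify_backup` helper names are hypothetical and not provided by the playground repository. It only checks that the file is a listable gzipped tarball and does not replace a restore test.

```shell
# Hypothetical helpers (not part of the playground Makefile).

# Print the path of the most recently modified backup archive found
# under the support directory ($1, defaults to /tmp/eda-support).
latest_backup() {
  ls -t "${1:-/tmp/eda-support}"/logs-*/eda-platform-backup-*.tar.gz 2>/dev/null | head -n 1
}

# Succeed only if the archive ($1) is non-empty and can be listed by tar.
verify_backup() {
  [ -s "$1" ] && tar -tzf "$1" > /dev/null 2>&1
}

# Example usage:
#   backupfile=$(latest_backup) && verify_backup "$backupfile" && echo "OK: $backupfile"
```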
6497

6598
## Updating playground repository
6699

@@ -70,7 +103,7 @@ The workflow to upgrade EDA slightly differs depending on whether you have the o
70103

71104
/// tab | Playground repository present
72105

73-
If you have an existing [playground repository][playground] ensure it is up to date by running:
106+
If you have an existing [playground repository][playground], ensure it is up to date by running:
74107

75108
```bash
76109
git pull --rebase --autostash -v
@@ -139,7 +172,7 @@ Ensure the package inventory is in sync with your existing cluster:
139172
make cluster-restore-inventory
140173
```
141174

142-
## Uninstalling EDA core components
175+
## Uninstalling EDA components
143176

144177
The existing EDA core components must be uninstalled, before installing the new version.
145178

@@ -152,41 +185,6 @@ If you have a geo-redundant installation, on your active cluster member, update
152185
Do not update the EngineConfig resource on standby members. Although stopped, if the standby members were to start, they must continue to look for the active member (and fail to do so) throughout the upgrade.
153186
///
154187

155-
### Pausing NPP interactions
156-
157-
Place your `TopoNode` resources into `emulate` mode by setting the resource's `.spec.npp.mode` from `normal` to `emulate`.
158-
159-
* In this mode, EDA does not interact with targets, effectively pausing the cluster's interaction with your infrastructure.
160-
* You can still interact with EDA and the `TopoNode` resources; changes are pushed upon switching back to `normal` mode.
161-
162-
You can do this with running the following script in on your machine where you have `kubectl` configured to access your cluster:
163-
164-
```{.shell .no-select}
165-
make set-npp-mode-emulate
166-
```
167-
168-
After patching script is run, verify that the `TopoNode` resources are in `emulate` mode:
169-
170-
```{.shell .no-select}
171-
kubectl get toponode -A \
172-
-o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,MODE:.spec.npp.mode'
173-
```
174-
175-
<div class="embed-result">
176-
```{.shell .no-select .no-copy}
177-
NAMESPACE NAME MODE
178-
eda-telemetry leaf1 emulate
179-
eda-telemetry leaf2 emulate
180-
eda-telemetry leaf3 emulate
181-
eda-telemetry leaf4 emulate
182-
eda-telemetry spine1 emulate
183-
eda-telemetry spine2 emulate
184-
eda leaf1 emulate
185-
eda leaf2 emulate
186-
eda spine1 emulate
187-
```
188-
</div>
189-
190188
### Stopping EDA platform
191189

192190
To stop EDA components, enter the following command:
@@ -197,15 +195,20 @@ make eda-stop-core
197195

198196
This command returns no output, but will result in all Pods packaged as part of `eda-kpt-base` being stopped and removed from the cluster.
199197

200-
### Uninstalling EDA core
198+
/// details | Nuances for geo redundant clusters
199+
type: info
200+
For geo redundant clusters, execute the `edactl platform stop` command on both active and standby members, via their respective `eda-toolbox` Pods.
201+
///
202+
203+
### Uninstalling EDA core components
201204

202-
Proceed with EDA core components uninstallation:
205+
Proceed with uninstalling EDA core components:
203206

204207
```bash
205208
make eda-uninstall-core
206209
```
207210

208-
Now you should see no core components in your cluster. Check with the following command[^1]:
211+
You should now see no core components in your cluster. Check with the following command[^1]:
209212

210213
```{.shell .no-select}
211214
kubectl get pods -n eda-system
@@ -224,10 +227,13 @@ trust-manager-69955c46b8-bghj6 1/1 Running 0 95m
224227
```
225228
</div>
226229

227-
/// details | Nuances for geo redundant clusters
228-
type: info
229-
For geo redundant clusters, execute the `edactl platform stop` command on both active and standby members, via their respective `eda-toolbox` Pods.
230-
///
230+
### Stopping EDA Git servers
231+
232+
Continue with stopping EDA Git servers by scaling down the EDA Git deployments:
233+
234+
```{.shell .no-select}
235+
make scale-down-git-servers
236+
```
231237

232238
## Updating EDA kpt packages
233239

@@ -266,13 +272,15 @@ make install-external-packages eda-install-core eda-is-core-ready
266272

267273
## Restoring your backup
268274

269-
Copy the backup file you extracted at the beginning of this procedure back into the new `eda-toolbox` pod:
275+
Copy the backup file you [collected at the beginning](#backing-up-your-cluster) of this procedure from your machine back into the new `eda-toolbox` pod:
270276

271277
```{.shell .no-select}
278+
backupfile=/tmp/eda-support/logs-2025-12-18/eda-platform-backup-2025-12-18-21-37-42.tar.gz
279+
272280
toolboxpod=$(kubectl -n eda-system get pods \
273281
-l eda.nokia.com/app=eda-toolbox -o jsonpath="{.items[0].metadata.name}")
274282
275-
kubectl -n eda-system cp /tmp/eda-backup.tar.gz \
283+
kubectl -n eda-system cp $backupfile \
276284
$toolboxpod:/tmp/eda-backup.tar.gz
277285
```
278286

@@ -284,12 +292,42 @@ edactl platform restore /tmp/eda-backup.tar.gz
284292

285293
## Upgrading your applications
286294

287-
A default install of EDA will install current-version applications, but your restore will have restored previous versions. These versions may be incompatible with the new version of EDA core, and must be upgraded immediately following the upgrade. The existing `Makefile` can be used to do so:
295+
Installing the new version of EDA will deploy application versions according to the installed release; however, the backup restore operation will have restored application versions as they were set in the original cluster. These versions may be incompatible with the new version of EDA core, and must be upgraded immediately following the EDA backup restore. The existing `Makefile` can be used to do so:
288296

289297
```{.shell .no-select}
290298
make eda-install-apps
291299
```
292300

301+
## Resume NPP interactions
302+
303+
To resume NPP interactions, set all `TopoNode` resources back to `normal` mode:
304+
305+
```{.shell .no-select}
306+
make set-npp-mode-normal
307+
```
308+
309+
After the script has been run, verify that the `TopoNode` resources are in `normal` mode:
310+
311+
```{.shell .no-select}
312+
kubectl get toponode -A \
313+
-o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,MODE:.spec.npp.mode'
314+
```
315+
316+
<div class="embed-result">
317+
```{.shell .no-select .no-copy}
318+
NAMESPACE NAME MODE
319+
my-other-ns leaf1 normal
320+
my-other-ns leaf2 normal
321+
my-other-ns leaf3 normal
322+
my-other-ns leaf4 normal
323+
my-other-ns spine1 normal
324+
my-other-ns spine2 normal
325+
eda leaf1 normal
326+
eda leaf2 normal
327+
eda spine1 normal
328+
```
329+
</div>
330+
293331
## Verifying cluster health
294332

295333
Check the following to ensure your cluster is healthy:
