@GTRekter commented Jan 14, 2026

In some cases, the CNI conflist file might be a symlink to a different file located in another directory or with a different extension (e.g., in OCI it uses .conflist-current).

Problem

Right now, the sync function loops over all files with the .conflist or .conf extension. This will:
  • Exclude symlink files
  • Exclude files with an extension other than .conflist or .conf
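For illustration only (temporary directory, hypothetical file names; the real loop lives in the CNI install script), a `find -type f` style match reproduces both exclusions:

```shell
# Demo of the two exclusions above. `find -type f` matches only regular
# files, so the symlink is skipped even though it has a supported extension,
# and the real file is skipped because of its extension.
dir=$(mktemp -d)
echo '{}' > "$dir/05-cilium.conflist-current"                       # real file, wrong extension
ln -s "$dir/05-cilium.conflist-current" "$dir/05-cilium.conflist"   # symlink, right extension

# Prints nothing: the symlink fails -type f, the real file fails the name test.
find "$dir" -maxdepth 1 -type f \( -name '*.conf' -o -name '*.conflist' \)
rm -rf "$dir"
```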

Challenge

We can’t use a simple wildcard after .conflist, because some CNIs create temporary files. Matching those would create a lot of noise in the logs, since the Linkerd CNI binaries would try to modify temporary files that no longer exist by the time they are processed.
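To see why a trailing wildcard is risky (hypothetical temp-file name; real CNIs use their own naming), compare the two glob patterns:

```shell
# Hypothetical example: a trailing wildcard also catches a CNI's temp file,
# which may be renamed or deleted before we get a chance to patch it.
dir=$(mktemp -d)
touch "$dir/05-cilium.conflist" "$dir/05-cilium.conflist.tmp.1234"

ls "$dir"/*.conflist     # matches only the real config
ls "$dir"/*.conflist*    # also matches the short-lived temp file
rm -rf "$dir"
```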

Solution

In both the sync function and the main flow, if a .conflist or .conf file is a symlink, we resolve it to get the path to the original file, and then append the Linkerd configuration to that file.
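A minimal sketch of the resolution step. The PR's helper is named resolve_cni_config_path; the exact remapping logic below (prefixing absolute targets with /host, per the container-vs-host-mount note in the review comments) is an assumption, not the PR's verbatim code:

```shell
# Sketch only, assuming the behavior described above: follow a symlink and,
# when its target is an absolute path, remap it from the container's view
# (/etc/...) into the host mount (/host/etc/...).
resolve_cni_config_path() {
  file=$1
  if [ -L "$file" ]; then
    target=$(readlink "$file")
    case "$target" in
      /host/*) file=$target ;;                       # already under the host mount
      /*)      file="/host$target" ;;                # absolute: prefix the host mount
      *)       file="$(dirname "$file")/$target" ;;  # relative: resolve against the link's dir
    esac
  fi
  echo "$file"
}
```

A non-symlink path passes through unchanged, so a caller can apply this unconditionally before appending the Linkerd plugin entry.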

Tests

Local

  • Start a new cluster and deploy Cilium:
k3d cluster create 01 --agents 1
helm upgrade --install cilium cilium/cilium --namespace kube-system --create-namespace
  • Cilium uses the value stored in write-cni-conf-when-ready to identify both the location and name of the CNI configuration file. Update the ConfigMap to change the filename to 05-cilium.conflist-current and disable cni-exclusive so we can create the symlink. Finally, restart the Cilium DaemonSet.
kubectl patch cm -n kube-system cilium-config --type='json' -p='[{"op":"replace","path":"/data/write-cni-conf-when-ready","value":"/host/etc/cni/net.d/05-cilium.conflist-current"}]'
kubectl patch cm -n kube-system cilium-config --type='json' -p='[{"op":"replace","path":"/data/cni-exclusive","value":"false"}]'
kubectl rollout restart ds cilium -n kube-system
kubectl rollout restart ds -n  kube-system cilium-envoy
kubectl rollout restart deploy -n kube-system cilium-operator
  • On the k3d-01-server-0 node, create a symlink so that 05-cilium.conflist points to 05-cilium.conflist-current (the agent node is left as-is for now):
docker exec -it k3d-01-server-0 sh
~ # ls /etc/cni/net.d
05-cilium.conflist-current
~ # ln -sfn "/etc/cni/net.d/05-cilium.conflist-current" "/etc/cni/net.d/05-cilium.conflist"
~ # ls /etc/cni/net.d
05-cilium.conflist  05-cilium.conflist-current

docker exec -it k3d-01-agent-0 sh
~ # ls /etc/cni/net.d
05-cilium.conflist-current  05-cilium.conflist.cilium_bak
  • Build the Linkerd CNI image, import it into k3d, and install the Linkerd CNI (stable):
docker build . --file Dockerfile-cni-plugin --tag linkerd/cni-plugin:dev
k3d image import linkerd/cni-plugin:dev -c 01
helm upgrade --install linkerd2-cni linkerd-stable/linkerd2-cni --version 30.12.2 --set image.name='linkerd/cni-plugin' --set image.version=dev --create-namespace -n linkerd-cni
  • Check the Linkerd CNI logs. You will see that the Linkerd CNI pod running on k3d-01-agent-0 won’t detect any configuration, while the one running on k3d-01-server-0 will follow the symlink and patch 05-cilium.conflist-current.
kubectl get pods -n linkerd-cni -o wide     
NAME                READY   STATUS    RESTARTS   AGE   IP            NODE              NOMINATED NODE   READINESS GATES
linkerd-cni-9q27z   1/1     Running   0          15m   10.42.0.66    k3d-01-server-0   <none>           <none>
linkerd-cni-gsl4p   1/1     Running   0          15m   10.42.1.116   k3d-01-agent-0    <none>           <none>

kubectl logs -n linkerd-cni  linkerd-cni-9q27z
[2026-01-14 05:56:04] Wrote linkerd CNI binaries to /host/opt/cni/bin
Setting up watches.
Watches established.
[2026-01-14 05:56:04] Wait for CNI config monitor to become ready
[2026-01-14 05:56:04] Trigger CNI config detection for /host/etc/cni/net.d/05-cilium.conflist
Setting up watches.
Watches established.

kubectl logs -n linkerd-cni  linkerd-cni-gsl4p
[2026-01-14 05:56:04] Wrote linkerd CNI binaries to /host/opt/cni/bin
Setting up watches.
Watches established.
[2026-01-14 05:56:04] Wait for CNI config monitor to become ready
[2026-01-14 05:56:04] No active CNI configuration files found
Setting up watches.
Watches established.
  • Check the contents of 05-cilium.conflist-current:
docker exec -it k3d-01-server-0 sh
~ # cat /etc/cni/net.d/05-cilium.conflist-current
{
  "cniVersion": "0.3.1",
  "name": "cilium",
  "plugins": [
    {
      "type": "cilium-cni",
      "enable-debug": false,
      "log-file": "/var/run/cilium/cilium-cni.log"
    },
    {
      "name": "linkerd-cni",
      "type": "linkerd-cni",
      "log_level": "info",
      "policy": {
        "type": "k8s",
        "k8s_api_root": "https://__KUBERNETES_SERVICE_HOST__:__KUBERNETES_SERVICE_PORT__",
        "k8s_auth_token": "__SERVICEACCOUNT_TOKEN__"
      },
      "kubernetes": {
        "kubeconfig": "/etc/cni/net.d/ZZZ-linkerd-cni-kubeconfig"
      },
      "linkerd": {
        "incoming-proxy-port": 4143,
        "outgoing-proxy-port": 4140,
        "proxy-uid": 2102,
        "ports-to-redirect": [],
        "inbound-ports-to-ignore": [
          "4191",
          "4190"
        ],
        "simulate": false,
        "use-wait-flag": false
      }
    }
  ]
}

docker exec -it k3d-01-agent-0 sh 
~ # cat /etc/cni/net.d/05-cilium.conflist-current
{
  "cniVersion": "0.3.1",
  "name": "cilium",
  "plugins": [
    {
       "type": "cilium-cni",
       "enable-debug": false,
       "log-file": "/var/run/cilium/cilium-cni.log"
    }
  ]
}
  • Create a symlink on the k3d-01-agent-0 node so that the network-validator container in pods deployed on that node will work:
docker exec -it k3d-01-agent-0 sh
~ # ls /etc/cni/net.d
05-cilium.conflist-current
~ # ln -sfn "/etc/cni/net.d/05-cilium.conflist-current" "/etc/cni/net.d/05-cilium.conflist"
~ # ls /etc/cni/net.d
05-cilium.conflist  05-cilium.conflist-current

kubectl logs -n linkerd-cni   linkerd-cni-gsl4p  
[2026-01-14 05:56:04] Wrote linkerd CNI binaries to /host/opt/cni/bin
Setting up watches.
Watches established.
[2026-01-14 05:56:04] Wait for CNI config monitor to become ready
[2026-01-14 05:56:04] No active CNI configuration files found
Setting up watches.
Watches established.
[2026-01-14 06:31:06] Detected event: CREATE /host/etc/cni/net.d/05-cilium.conflist
[2026-01-14 06:31:06] File /host/etc/cni/net.d/05-cilium.conflist resolves to /host/etc/cni/net.d/05-cilium.conflist-current
[2026-01-14 06:31:06] New/changed file [/host/etc/cni/net.d/05-cilium.conflist-current] detected; re-installing
[2026-01-14 06:31:06] Using CNI config template from CNI_NETWORK_CONFIG environment variable.
[2026-01-14 06:31:06] CNI config: {
  "name": "linkerd-cni",
  "type": "linkerd-cni",
  "log_level": "info",
  "policy": {
      "type": "k8s",
      "k8s_api_root": "https://__KUBERNETES_SERVICE_HOST__:__KUBERNETES_SERVICE_PORT__",
      "k8s_auth_token": "__SERVICEACCOUNT_TOKEN__"
  },
  "kubernetes": {
      "kubeconfig": "/etc/cni/net.d/ZZZ-linkerd-cni-kubeconfig"
  },
  "linkerd": {
    "incoming-proxy-port": 4143,
    "outgoing-proxy-port": 4140,
    "proxy-uid": 2102,
    "ports-to-redirect": [],
    "inbound-ports-to-ignore": ["4191","4190"],
    "simulate": false,
    "use-wait-flag": false
  }
}
[2026-01-14 06:31:06] Created CNI config /host/etc/cni/net.d/05-cilium.conflist-current
  • Install Linkerd:
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.2.1/standard-install.yaml
linkerd install --crds | kubectl apply -f -
linkerd install --linkerd-cni-enabled | kubectl apply -f -
  • Deploy a meshed client/server:
kubectl apply -f - <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: simple-app
  annotations:
    linkerd.io/inject: enabled
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: simple-app-v1
  namespace: simple-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: simple-app-v1
      version: v1
  template:
    metadata:
      labels:
        app: simple-app-v1
        version: v1
    spec:
      containers:
        - name: http-app
          image: hashicorp/http-echo:latest
          args:
            - "-text=Simple App v1 - CLUSTER_NAME"
          ports:
            - containerPort: 5678
---
apiVersion: v1
kind: Service
metadata:
  name: simple-app-v1
  namespace: simple-app
spec:
  selector:
    app: simple-app-v1
    version: v1
  ports:
    - port: 80
      targetPort: 5678
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: traffic
  namespace: simple-app
  labels:
    app: traffic
spec:
  replicas: 1
  selector:
    matchLabels:
      app: traffic
  template:
    metadata:
      labels:
        app: traffic
    spec:
      containers:
      - name: traffic
        image: curlimages/curl:latest 
        command: ["/bin/sh", "-c"]
        args:
        - |
          while true; do
            TIMESTAMP_SEND=$(date '+%Y-%m-%d %H:%M:%S')
            PAYLOAD="{\"timestamp\":\"$TIMESTAMP_SEND\",\"test_id\":\"sniff_me\",\"message\":\"hello-world\"}"
            echo "$TIMESTAMP_SEND - Sending payload: $PAYLOAD"
            RESPONSE=$(curl -s -X POST \
              -H "Content-Type: application/json" \
              -d "$PAYLOAD" \
              http://simple-app-v1.simple-app.svc.cluster.local:80)
            TIMESTAMP_RESPONSE=$(date '+%Y-%m-%d %H:%M:%S')
            echo "$TIMESTAMP_RESPONSE - RESPONSE: $RESPONSE"
            sleep 0.1
          done
EOF
  • Double-check that the client/server communication works as expected.
kubectl logs -n simple-app deploy/traffic -c traffic 
  • Delete the deployments in the simple-app namespace and validate that DELETE is called as expected:
kubectl delete deploy --all -n simple-app

@alpeb (Member) left a comment:
Thanks @GTRekter , the implementation looks good to me, and thanks for the extended testing instructions 👍

I'd like to make sure I understand the problem, so let me rephrase things. In OCI (is that the same as OKE?) they have a CNI plugin that creates config files with the conflist-current extension. And your solution is then to have users create a symlink with an extension we support (conf or conflist), and thus this PR that adds support for symlinks. Is that correct?

Are you able to dig into OCI's docs and see if that extension is configurable, so that we get a full range of possible solutions?

Have you checked with the team that reported this issue whether they're able to create such symlinks? Also note that the symlinks would need to be created as soon as any new node is provisioned, so probably this solution would require for them to come up with additional automation.

Are there other platforms besides OCI where this is also an issue? If so, what extension do their files use? If we can come up with a list of extension names, we could just add them to the list we originally support, without having users to add the symlink.

(Also, please take care of the DCO 😉 )

# back into the host mount (e.g. /etc/... -> /host/etc/...).
# We need to /etc -> /host/etc when the symlink points to an absolute path as
# net.d is the container’s filesystem, not the host mount, so when cp -L follows
# that absolute target it lands on a path that doesn’t.
... doesn't exist?

file=$(resolve_cni_config_path "${file}")
tmp_file="$(mktemp -u /tmp/linkerd-cni.patch-candidate.XXXXXX)"
cp -fp "${file}" "${tmp_file}"
cp -fpL "${file}" "${tmp_file}"

${file} shouldn't be a symlink at this point, so why the -L?
