This guide walks you through deploying a K3s HA cluster with 3 master (server) nodes on AWS EC2 t3.large instances using embedded etcd. All examples are AWS-native — no external datastore or load balancer is required to bootstrap the cluster.
For the full K3s documentation see https://docs.k3s.io/.
| Role | Count | Instance type | OS |
|---|---|---|---|
| K3s server (master) | 3 | t3.large (2 vCPU / 8 GiB RAM) | Ubuntu 22.04 LTS |
t3.large provides enough CPU and RAM to run the K3s control plane (etcd, API server, scheduler, controller-manager) alongside application workloads. All 3 nodes form an HA control plane backed by embedded etcd. Worker nodes can be added later (see Step 9).
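The choice of 3 servers follows from how etcd stays available: it needs a majority (quorum) of members to be healthy. The sketch below works through that arithmetic; the helper name `etcd_quorum` is illustrative and not part of K3s.

```shell
# etcd_quorum N: print the majority size and the number of member failures
# an N-member etcd cluster can survive (majority rule: floor(N/2) + 1).
etcd_quorum() (
  n=$1
  quorum=$(( n / 2 + 1 ))
  echo "members=$n quorum=$quorum tolerated_failures=$(( n - quorum ))"
)

for n in 3 4 5; do
  etcd_quorum "$n"
done
# members=3 quorum=2 tolerated_failures=1
# members=4 quorum=3 tolerated_failures=1
# members=5 quorum=3 tolerated_failures=2
```

A fourth server adds no extra fault tolerance over three, which is why odd member counts are the usual choice.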
- AWS account with permissions to create EC2 instances, VPCs, and security groups
- AWS CLI v2 installed and configured (`aws configure`)
- An EC2 SSH key pair created in the target region
- `kubectl` installed on your local machine
export AWS_REGION="us-east-1"
export KEY_NAME="my-k3s-key" # existing EC2 key pair name
export VPC_ID=$(aws ec2 describe-vpcs \
--filters "Name=isDefault,Values=true" \
--query "Vpcs[0].VpcId" --output text \
--region $AWS_REGION)
export SUBNET_ID=$(aws ec2 describe-subnets \
--filters "Name=vpc-id,Values=$VPC_ID" \
--query "Subnets[0].SubnetId" --output text \
--region $AWS_REGION)

export SG_ID=$(aws ec2 create-security-group \
--group-name k3s-ha-sg \
--description "K3s HA cluster security group" \
--vpc-id $VPC_ID \
--region $AWS_REGION \
--query GroupId --output text)
# SSH
aws ec2 authorize-security-group-ingress --group-id $SG_ID \
--protocol tcp --port 22 --cidr 0.0.0.0/0 --region $AWS_REGION
# Kubernetes API server
aws ec2 authorize-security-group-ingress --group-id $SG_ID \
--protocol tcp --port 6443 --cidr 0.0.0.0/0 --region $AWS_REGION
# etcd (inter-node only — restrict to the SG itself)
aws ec2 authorize-security-group-ingress --group-id $SG_ID \
--protocol tcp --port 2379-2380 --source-group $SG_ID --region $AWS_REGION
# Kubelet
aws ec2 authorize-security-group-ingress --group-id $SG_ID \
--protocol tcp --port 10250 --source-group $SG_ID --region $AWS_REGION
# Flannel VXLAN
aws ec2 authorize-security-group-ingress --group-id $SG_ID \
--protocol udp --port 8472 --source-group $SG_ID --region $AWS_REGION
# NodePort range (for test applications)
aws ec2 authorize-security-group-ingress --group-id $SG_ID \
--protocol tcp --port 30000-32767 --cidr 0.0.0.0/0 --region $AWS_REGION
echo "Security group: $SG_ID"

# Ubuntu 22.04 LTS AMI (update the ami-* ID for your region)
export AMI_ID="ami-0c7217cdde317cfec" # us-east-1 Ubuntu 22.04 LTS
for i in 1 2 3; do
aws ec2 run-instances \
--image-id $AMI_ID \
--instance-type t3.large \
--key-name $KEY_NAME \
--security-group-ids $SG_ID \
--subnet-id $SUBNET_ID \
--associate-public-ip-address \
--tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=k3s-master-$i}]" \
--region $AWS_REGION \
--query "Instances[0].InstanceId" --output text
done

Note: Replace `ami-0c7217cdde317cfec` with the latest Ubuntu 22.04 LTS AMI for your region. Find it with:

aws ec2 describe-images --owners 099720109477 \
--filters "Name=name,Values=ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*" \
--query "sort_by(Images,&CreationDate)[-1].ImageId" \
--output text --region $AWS_REGION

aws ec2 describe-instances \
--filters "Name=tag:Name,Values=k3s-master-*" "Name=instance-state-name,Values=running" \
--query "Reservations[*].Instances[*].[Tags[?Key=='Name']|[0].Value,PrivateIpAddress,PublicIpAddress]" \
--output table --region $AWS_REGION

Record the values; you will need them throughout this guide:

| Hostname | Private IP | Public IP |
|---|---|---|
| k3s-master-1 | 10.0.x.x | 1.2.3.4 |
| k3s-master-2 | 10.0.x.x | 1.2.3.5 |
| k3s-master-3 | 10.0.x.x | 1.2.3.6 |
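The private IPs in this table are reused later in `/etc/hosts`. As a minimal sketch, the rows can be turned into hosts-file lines automatically. The function name `hosts_entries` is hypothetical, and it assumes the same `describe-instances` query run with `--output text`, which prints one whitespace-separated row per instance.

```shell
# hosts_entries: read "name private-ip public-ip" rows on stdin and print
# "private-ip name" lines suitable for /etc/hosts.
hosts_entries() {
  while read -r name private_ip _public_ip; do
    # Skip blank rows and rows without a private IP (the AWS CLI prints "None").
    [ -n "$name" ] && [ -n "$private_ip" ] && [ "$private_ip" != "None" ] || continue
    printf '%s %s\n' "$private_ip" "$name"
  done
}

# Example (run locally; requires AWS credentials):
# aws ec2 describe-instances \
#   --filters "Name=tag:Name,Values=k3s-master-*" "Name=instance-state-name,Values=running" \
#   --query "Reservations[].Instances[].[Tags[?Key=='Name']|[0].Value,PrivateIpAddress,PublicIpAddress]" \
#   --output text --region "$AWS_REGION" | hosts_entries
```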
Run the following on each of the 3 instances.
ssh -i ~/.ssh/$KEY_NAME.pem ubuntu@<public-ip>

# On k3s-master-1
sudo hostnamectl set-hostname k3s-master-1
# On k3s-master-2
sudo hostnamectl set-hostname k3s-master-2
# On k3s-master-3
sudo hostnamectl set-hostname k3s-master-3

sudo apt-get update && sudo apt-get upgrade -y
sudo timedatectl set-timezone UTC

Add an entry for each node so they can resolve each other by hostname. Replace the IPs with your private IPs.
sudo tee -a /etc/hosts <<EOF
10.0.1.10 k3s-master-1
10.0.1.11 k3s-master-2
10.0.1.12 k3s-master-3
EOF

K3s does not require swap to be disabled, but disabling it is recommended for predictable performance.

sudo swapoff -a
sudo sed -i '/ swap / s/^/#/' /etc/fstab
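Because `tee -a` appends unconditionally, re-running the `/etc/hosts` step would add duplicate lines. A minimal sketch of an idempotent alternative follows; `add_host_entry` is a hypothetical helper, not a K3s or Ubuntu tool, and it must run as root when pointed at `/etc/hosts`.

```shell
# add_host_entry FILE IP NAME: append "IP NAME" to FILE unless some line
# already maps NAME. Safe to run repeatedly.
add_host_entry() (
  file=$1 ip=$2 name=$3
  # Exact field match on the hostname, so k3s-master-1 does not match k3s-master-10.
  if awk -v n="$name" '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' "$file" 2>/dev/null; then
    echo "skip: $name already present in $file"
  else
    printf '%s %s\n' "$ip" "$name" >> "$file"
    echo "added: $ip $name"
  fi
)

# Example (as root on each node):
# add_host_entry /etc/hosts 10.0.1.10 k3s-master-1
```

The helper only checks for an existing hostname; it does not update an entry whose IP has changed.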
SSH into k3s-master-1.
sudo mkdir -p /etc/rancher/k3s
# Replace 10.0.1.10 with the private IP of k3s-master-1
# Replace 1.2.3.4 with the public IP / Elastic IP of k3s-master-1
sudo tee /etc/rancher/k3s/config.yaml <<EOF
cluster-init: true
node-ip: 10.0.1.10
advertise-address: 10.0.1.10
tls-san:
- 10.0.1.10
- 1.2.3.4
- k3s-master-1
disable: [servicelb, traefik]
EOF

Why `disable: [servicelb, traefik]`?

`servicelb` (Klipper) is replaced by the AWS cloud controller or an NLB. `traefik` is replaced by the NGINX Ingress Controller in Step 7. Using the list syntax avoids the YAML duplicate-key bug where only the last `disable:` entry would take effect.

curl -sfL https://get.k3s.io | sh -

sudo kubectl get nodes
sudo kubectl get pods -A

sudo cat /var/lib/rancher/k3s/server/token

Save this token; you will need it in the next step.

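The remaining servers share one join configuration apart from their own addresses and hostname, so the file can be generated rather than edited by hand. This is a sketch under that assumption; `render_join_config` is a hypothetical helper whose output mirrors the keys this guide uses for additional servers.

```shell
# render_join_config FIRST_SERVER_IP TOKEN NODE_IP PUBLIC_IP HOSTNAME
# Print a config.yaml for an additional server node to stdout.
render_join_config() {
  cat <<EOF
server: https://$1:6443
token: $2
node-ip: $3
advertise-address: $3
tls-san:
- $3
- $4
- $5
disable: [servicelb, traefik]
EOF
}

# Example (on k3s-master-3, with the token saved in K3S_TOKEN):
# render_join_config 10.0.1.10 "$K3S_TOKEN" 10.0.1.12 1.2.3.6 k3s-master-3 \
#   | sudo tee /etc/rancher/k3s/config.yaml
```

Passing the token as an argument keeps it out of the script itself, though it may still be visible in shell history.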
Run the following on k3s-master-2 and k3s-master-3 (adjust IPs accordingly).
sudo mkdir -p /etc/rancher/k3s
# Example for k3s-master-2. Replace IPs and token with your values.
sudo tee /etc/rancher/k3s/config.yaml <<EOF
server: https://10.0.1.10:6443
token: <token-from-master-1>
node-ip: 10.0.1.11
advertise-address: 10.0.1.11
tls-san:
- 10.0.1.11
- 1.2.3.5
- k3s-master-2
disable: [servicelb, traefik]
EOF

curl -sfL https://get.k3s.io | sh -s - server

sudo kubectl get nodes -o wide

All 3 nodes should appear with status Ready and roles control-plane,etcd,master.

NAME STATUS ROLES AGE VERSION
k3s-master-1 Ready control-plane,etcd,master 5m v1.30.x+k3s1
k3s-master-2 Ready control-plane,etcd,master 2m v1.30.x+k3s1
k3s-master-3 Ready control-plane,etcd,master 1m v1.30.x+k3s1
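When scripting the bootstrap, it may help to check the node count rather than read the table by eye. The sketch below counts nodes whose STATUS column is exactly `Ready`; `count_ready` is an illustrative name, and it assumes the column layout of `kubectl get nodes --no-headers`.

```shell
# count_ready: read `kubectl get nodes --no-headers` output on stdin and
# print how many nodes report STATUS exactly "Ready".
# Nodes shown as "Ready,SchedulingDisabled" or "NotReady" are not counted.
count_ready() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Example (on a server node):
# [ "$(sudo kubectl get nodes --no-headers | count_ready)" -eq 3 ] && echo "all servers Ready"
```

As an alternative, `kubectl wait --for=condition=Ready node --all --timeout=300s` blocks until every registered node is Ready, but it cannot tell whether all 3 nodes have registered yet.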
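# Copy the kubeconfig from k3s-master-1 to your local machine.
# k3s.yaml is readable only by root by default, so read it with sudo over SSH
# rather than copying it with scp as the ubuntu user.
mkdir -p ~/.kube
ssh -i ~/.ssh/$KEY_NAME.pem ubuntu@<master-1-public-ip> "sudo cat /etc/rancher/k3s/k3s.yaml" > ~/.kube/k3s.yaml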
# Update the server address to the public IP of a master node
sed -i 's|https://127.0.0.1:6443|https://<master-1-public-ip>:6443|' ~/.kube/k3s.yaml
# Use this kubeconfig
export KUBECONFIG=~/.kube/k3s.yaml
kubectl get nodes

kubectl apply -f web-app.yml
# Verify pods and service
kubectl get pods,svc
# Access the app — use the public IP of any master node and the NodePort
curl http://<master-public-ip>:30080

Expected output: welcome to my web app!

K3s can deploy Helm charts automatically by placing a manifest in /var/lib/rancher/k3s/server/manifests/.
# On k3s-master-1
sudo cp nginx-ingress.yml /var/lib/rancher/k3s/server/manifests/nginx-ingress.yaml
# Verify the controller starts
kubectl -n ingress-nginx get pods
kubectl -n ingress-nginx get svc

The LoadBalancer service provisions an AWS NLB, provided the AWS cloud controller that replaces servicelb is running in the cluster. The NLB DNS name is shown in the EXTERNAL-IP column of `kubectl -n ingress-nginx get svc`.

The nginx-ingress.yml manifest in this repository already includes the NLB annotations:
service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"

K3s ships with local-path-provisioner out of the box. It dynamically provisions hostPath volumes on the node where the pod is scheduled. Refer to the local-path-provisioner docs for more configuration options.

Add the following to /etc/rancher/k3s/config.yaml on all master nodes and restart K3s:
default-local-storage-path: /mnt/disk1

sudo systemctl restart k3s
# Restart the provisioner to pick up the new path
kubectl -n kube-system rollout restart deploy local-path-provisioner

kubectl create -f pvc.yaml
kubectl create -f pod.yaml
kubectl get pvc
kubectl get pv
kubectl get pod volume-test

For production workloads that need durable, network-attached block storage, install the AWS EBS CSI driver:

kubectl apply -k "github.com/kubernetes-sigs/aws-ebs-csi-driver/deploy/kubernetes/overlays/stable/?ref=release-1.35"

Create a StorageClass backed by EBS:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-sc
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
parameters:
  type: gp3

To add a dedicated worker (agent) node, launch an additional EC2 instance (any type) in the same security group and run:

# On the new worker node
sudo mkdir -p /etc/rancher/k3s
sudo tee /etc/rancher/k3s/config.yaml <<EOF
server: https://10.0.1.10:6443
token: <token-from-master-1>
node-ip: <worker-private-ip>
EOF
curl -sfL https://get.k3s.io | sh -s - agent

Verify on a master node:

kubectl get nodes -o wide

# On server (master) nodes
/usr/local/bin/k3s-uninstall.sh
# On agent (worker) nodes
/usr/local/bin/k3s-agent-uninstall.sh

journalctl -u k3s -f

# List etcd snapshots (list and ls are equivalent; neither shows cluster members)
sudo k3s etcd-snapshot list
sudo k3s etcd-snapshot ls
# etcd member list. K3s runs etcd inside the k3s process (there is no etcd pod)
# and does not bundle etcdctl, so install etcdctl on the server node first.
sudo etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
--cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt \
--key=/var/lib/rancher/k3s/server/tls/etcd/client.key \
member list

- Check security group rules: ensure ports 8472/UDP (Flannel VXLAN) and 10250/TCP (Kubelet) are open between nodes.
- Verify all nodes can resolve each other's hostnames: `ping k3s-master-2` from k3s-master-1.

- Ensure port 6443/TCP is open in the security group for your local IP.
- Verify the `server:` URL in your local kubeconfig points to the correct public IP.

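To check the second point without opening the file, the `server:` value can be read from the kubeconfig directly. This is a minimal sketch; `kubeconfig_server` is a hypothetical helper and assumes a single-cluster kubeconfig such as the one K3s generates.

```shell
# kubeconfig_server FILE: print the first cluster "server:" URL found in FILE.
kubeconfig_server() {
  awk '$1 == "server:" { print $2; exit }' "$1"
}

# Example (on your local machine):
# kubeconfig_server ~/.kube/k3s.yaml
# kubectl --kubeconfig ~/.kube/k3s.yaml get --raw /readyz
```

If the printed URL still shows `https://127.0.0.1:6443`, the `sed` substitution from the kubeconfig step was not applied.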
K3s uses IMDSv2 on EC2 for the node's ProviderID. If you see errors related to instance metadata, ensure IMDSv2 is enabled and the hop limit is at least 2:
aws ec2 modify-instance-metadata-options \
--instance-id <instance-id> \
--http-put-response-hop-limit 2 \
--http-endpoint enabled \
--region $AWS_REGION

See registries.yml for the ECR configuration. The recommended approach on EC2 is to attach the AmazonEC2ContainerRegistryReadOnly IAM policy to the instance profile, so no static credentials are needed.