The recommended place for user-generated systemd unit files is the /etc/systemd/system/ directory. All systemd unit files created in this tutorial will be placed in this directory.

Many arguments need to be passed to etcd during bootstrap. To keep the systemd file short and clean, it is better to store the environment configuration in a separate place. There are two ways to do this. The first is to use system environment variables: etcd follows an easy-to-remember naming convention in which each variable name starts with `ETCD_`, followed by the flag name in upper case with dashes replaced by underscores. For example, the variable for `--name` is `ETCD_NAME`, the variable for `--listen-client-urls` is `ETCD_LISTEN_CLIENT_URLS`, and so on. The second way is to store the parameters in a separate file and pass that file to systemd. In this tutorial, we are going to use files to store the bootstrap arguments for all Kubernetes cluster components.
Create the etcd environment file `etcd` in /usr/k8s/bin/env. Please refer to etcd as an example.
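As a rough sketch, the environment file might look like the following. All values here are illustrative, not part of the tutorial's actual configuration; note that each variable holds the complete flag (name and value), because the systemd unit below passes the variables directly as command-line arguments:

```
# /usr/k8s/bin/env/etcd -- illustrative values only
NODE_NAME="--name=etcd-node1"
INITIAL_ADVERTISE_PEER_URLS="--initial-advertise-peer-urls=https://192.168.1.101:2380"
ETCD_LISTEN_PEER_URLS="--listen-peer-urls=https://192.168.1.101:2380"
ETCD_LISTEN_CLIENT_URLS="--listen-client-urls=https://192.168.1.101:2379"
ETCD_ADVERTISE_CLIENT_URLS="--advertise-client-urls=https://192.168.1.101:2379"
ETCD_DATA_DIR="--data-dir=/var/lib/etcd"
```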
Create the etcd systemd file etcd.service in /etc/systemd/system/ with below content:
[Unit]
Description=etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos
[Service]
Type=notify
EnvironmentFile=/usr/k8s/bin/env/etcd
WorkingDirectory=/var/lib/etcd/
ExecStart=/usr/k8s/bin/etcd \
$NODE_NAME \
$INITIAL_ADVERTISE_PEER_URLS \
$ETCD_LISTEN_PEER_URLS \
$ETCD_LISTEN_CLIENT_URLS \
$ETCD_ADVERTISE_CLIENT_URLS \
$ETCD_INITIAL_CLUSTER_TOKEN \
$ETCD_INITIAL_CLUSTER \
$ETCD_INITIAL_CLUSTER_STATE \
$ETCD_CERT_FILE \
$ETCD_KEY_FILE \
$ETCD_CLIENT_CERT_AUTH \
$ETCD_TRUSTED_CA_FILE \
$ETCD_PEER_CERT_FILE \
$ETCD_PEER_KEY_FILE \
$ETCD_PEER_CLIENT_CERT_AUTH \
$ETCD_PEER_TRUSTED_CA_FILE \
$ETCD_DATA_DIR
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target

Similar to etcd, it is better to create a separate file to store the parameters for the API Server bootstrap. Several parameters are used by all the components, so we first create a file to store all shared parameters, then create the API Server environment file.
Please refer to shared-config and apiserver as examples. Then create the API Server systemd file:
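For illustration, the shared-config file might contain entries like the following. The variable names match those referenced in the ExecStart lines of the units in this tutorial, but the values are assumptions, not the tutorial's actual settings:

```
# /usr/k8s/bin/env/shared-config -- illustrative values only
KUBE_LOGTOSTDERR="--logtostderr=true"
KUBE_LOG_LEVEL="--v=2"
KUBE_MASTER="--master=http://192.168.1.101:8080"
```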
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target
After=etcd.service
[Service]
Environment=NODE_IP=192.168.1.101
EnvironmentFile=/usr/k8s/bin/env/shared-config
EnvironmentFile=/usr/k8s/bin/env/apiserver
ExecStart=/usr/k8s/bin/kube-apiserver \
$KUBE_ADMISSION_CONTROL \
$KUBE_API_ADVERTISE_ADDRESS \
$KUBE_API_BIND_ADDRESS \
$KUBE_API_INSECURE_ADDRESS \
$KUBELET_AUTH \
$KUBE_SERVICE_ADDRESSES \
$KUBE_SERVICE_NODE_PORT_RANGE \
$KUBE_SSL_CERTS \
$SERVICE_ACCOUNT_CONFIGS \
$ETCD_SSL_CERTS \
$KUBE_ETCD_SERVERS \
$AUDIT_LOG_CONFIGS \
$KUBE_LOG_LEVEL \
$KUBE_LOGTOSTDERR \
$OTHER_CONFIGS
Restart=on-failure
RestartSec=5
Type=notify
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target

Next, create the systemd file for the Controller Manager:

[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target
After=etcd.service
After=kube-api-server.service
[Service]
EnvironmentFile=/usr/k8s/bin/env/shared-config
EnvironmentFile=/usr/k8s/bin/env/controller-manager
ExecStart=/usr/k8s/bin/kube-controller-manager \
$KUBE_MASTER \
$BIND_ADDRESS \
$KUBE_SERVICE_ADDRESSES \
$KUBE_CLUSTER_CIDR \
$KUBE_CLUSTER_CERTS \
$KUBE_CONTROLLER_MANAGER_ARGS \
$KUBE_LOG_LEVEL
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target

Next, create the systemd file for the Scheduler:

[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target
After=etcd.service
After=kube-api-server.service
[Service]
EnvironmentFile=/usr/k8s/bin/env/shared-config
EnvironmentFile=/usr/k8s/bin/env/scheduler
ExecStart=/usr/k8s/bin/kube-scheduler \
$BIND_ADDRESS \
$KUBE_MASTER \
$KUBE_SCHEDULER_ARGS \
$KUBE_LOG_LEVEL
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target

Note:
When creating the systemd files, please make sure `Type=notify` is applied ONLY to the API Server and not to the Controller Manager or the Scheduler. Otherwise, systemd will keep restarting the Controller Manager and the Scheduler due to a startup timeout, because only the API Server sends the notify system call to indicate that it is ready to accept requests. Please refer to the systemd man page for more information.
Next, create the systemd file for Flannel:

[Unit]
Description=Flanneld overlay address etcd agent
After=network.target
After=network-online.target
Wants=network-online.target
Before=docker.service
[Service]
EnvironmentFile=/usr/k8s/bin/env/flannel
ExecStartPre=-/usr/bin/mkdir -p /run/flannel
ExecStart=/usr/k8s/bin/flanneld \
$FLANNEL_CERTS \
$ETCD_ENDPOINTS \
$ETCD_PREFIX
ExecStartPost=/usr/k8s/bin/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker
Restart=on-failure
Type=notify
[Install]
WantedBy=multi-user.target
RequiredBy=docker.service

Note: Flannel needs to start before Docker. After the flannel systemd service starts, it executes the `mk-docker-opts.sh` script, which writes the subnet data into `/run/flannel/docker`; this file is then used as an environment file by the docker systemd service. The script also writes the Docker start options into the `DOCKER_NETWORK_OPTIONS` variable, which is likewise used in the docker systemd file.
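For illustration, the generated /run/flannel/docker file typically contains a shell-style variable assignment along these lines (the subnet and MTU values are example output, and will differ per host):

```
# /run/flannel/docker -- example output of mk-docker-opts.sh
DOCKER_OPT_BIP="--bip=172.17.62.1/24"
DOCKER_OPT_MTU="--mtu=1450"
DOCKER_NETWORK_OPTIONS=" --bip=172.17.62.1/24 --ip-masq=true --mtu=1450"
```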
Add the environment file generated by Flannel and pass it to the Docker start command:
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network.target firewalld.service
[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
EnvironmentFile=-/run/flannel/docker
ExecStart=/usr/bin/dockerd $DOCKER_NETWORK_OPTIONS
ExecReload=/bin/kill -s HUP $MAINPID
ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
#TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
[Install]
WantedBy=multi-user.target

Also, please make sure Docker is using cgroupfs as the cgroup driver. kube-dns might not be able to run properly if the systemd cgroup driver is used.
cat << EOF > /etc/docker/daemon.json
{
"exec-opts": ["native.cgroupdriver=cgroupfs"]
}
EOF

Create the working directory first:
sudo mkdir -p /var/lib/kubelet

Create the /etc/systemd/system/kubelet.service file:
[Unit]
Description=kubelet: The Kubernetes Node Agent
Documentation=https://kubernetes.io/docs/concepts/overview/components/#kubelet https://kubernetes.io/docs/reference/generated/kubelet/
[Service]
WorkingDirectory=/var/lib/kubelet
EnvironmentFile=-/usr/k8s/bin/env/kubelet
ExecStartPre=-/usr/bin/mkdir -p /var/lib/kubelet
ExecStart=/usr/k8s/bin/kubelet \
$KUBELET_ADDRESS \
$KUBELET_HOSTNAME \
$CLUSTER_DNS \
$CLUSTER_DOMAIN \
$KUBE_LOGTOSTDERR \
$KUBE_LOG_LEVEL \
$KUBELET_ARGS
Restart=on-failure
RestartSec=5
KillMode=process
[Install]
WantedBy=multi-user.target

Important note from the official Docker website: for security reasons, on Linux hosts Docker configures iptables rules to prevent containers from forwarding traffic that originates outside the host machine. In Docker 1.12 and earlier, the default policy of the FORWARD chain was ACCEPT; in Docker 1.13 and later, Docker sets it to DROP.
To make Pods reachable from different hosts, two configurations need to be made:
- IP forwarding needs to be enabled on the host.
$ sysctl net.ipv4.conf.all.forwarding
net.ipv4.conf.all.forwarding = 1

If the value is 0, we can either add `net.ipv4.conf.all.forwarding=1` to a file under /etc/sysctl.d/ and reload it with `sysctl --system`, or run `sysctl -w net.ipv4.ip_forward=1` at startup.
- Set the iptables `FORWARD` policy to `ACCEPT`: `iptables -P FORWARD ACCEPT`. This can be done by having another systemd service run both commands.
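The two steps above can be sketched as a small one-shot systemd unit. The unit name and the binary paths are illustrative assumptions, not part of the tutorial's files:

```
# /etc/systemd/system/forward-rules.service (hypothetical name)
[Unit]
Description=Enable IP forwarding and set iptables FORWARD policy to ACCEPT
After=network.target docker.service

[Service]
Type=oneshot
# Type=oneshot allows multiple ExecStart lines, run in order
ExecStart=/usr/sbin/sysctl -w net.ipv4.conf.all.forwarding=1
ExecStart=/usr/sbin/iptables -P FORWARD ACCEPT
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
```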
Create the working directory first:

sudo mkdir -p /var/lib/kube-proxy

Create the /etc/systemd/system/kube-proxy.service file:
[Unit]
Description=kube-proxy: The Kubernetes Proxy Server
Documentation=https://kubernetes.io/docs/concepts/overview/components/#kube-proxy https://kubernetes.io/docs/reference/generated/kube-proxy/
After=network.target
[Service]
WorkingDirectory=/var/lib/kube-proxy
EnvironmentFile=-/usr/k8s/bin/env/kube-proxy
ExecStart=/usr/k8s/bin/kube-proxy \
$BIND_ADDRESS \
$HOST_NAME_OVERRIDE \
$CLUSTER_CIDR \
$KUBECONFIG \
$KUBE_LOGTOSTDERR \
$KUBE_LOG_LEVEL
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target

Create pod-nginx.yaml with the following content:
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent

Create service-nginx.yaml with the following content:
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  ports:
  - port: 8000 # the port that this service should serve on
    # the container port on each pod to connect to; can be a name
    # (e.g. 'www') or a number (e.g. 80)
    targetPort: 80
    protocol: TCP
  # just like the selector in the deployment,
  # but this time it identifies the set of pods to load balance
  # traffic to.
  selector:
    app: nginx

Create the Pod and the Service:
kubectl create -f pod-nginx.yaml
kubectl create -f service-nginx.yaml

Check the Pod and the Service:
kubectl get pod
NAME READY STATUS RESTARTS AGE
nginx 1/1 Running 0 1h
kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.254.0.1 <none> 443/TCP 39m
nginx-service ClusterIP 10.254.189.113 <none> 8000/TCP 21h

Check if the Service and the Pod are working correctly from a worker node:
curl 10.254.189.113:8000
# should return the nginx welcome page

Note: use `mkdir -p` when creating directories in the systemd services, so that the command does not fail when the directory already exists; otherwise the service could fail to start because its `ExecStartPre` command failed. (Prefixing the command with `-`, as in `ExecStartPre=-...`, additionally tells systemd to ignore its exit code.)