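Table of contents
- 1. Environment preparation (all nodes)
- 1.1 Disable unneeded services
- 1.2 Environment and network
- 1.3 apt sources
- 1.4 System tuning
- 1.5 Install the NFS client
- 2. Install containerd (all nodes)
- 3. High-availability setup for the masters (run on the masters)
- 3.1 Install and configure haproxy (same on all 3 masters)
- 3.2 keepalived
- 1) master-01
- 2) master-02
- 3) master-03
- 4. Core component installation (all nodes)
- 4.1 Install kubelet, kubeadm, and kubectl
- 4.2 Pre-pull images (all nodes) [optional]
- 4.2 A missing image
- 5. Cluster initialization
- 5.1 Hostnames
- 5.2 Initialization
- 1) master-01
- 2) Other masters
- 3) Worker nodes
- 4) Verification
- 5.3 Network installation
- 6. Installing the StorageClass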
1. Environment preparation (all nodes)
- Server plan
Role | IP address | Hostname |
---|---|---|
Master 1 | 10.10.239.155 | k8s-master-01 |
Master 2 | 10.10.239.156 | k8s-master-02 |
Master 3 | 10.10.239.161 | k8s-master-03 |
Virtual IP | 10.10.239.150 | apiserver-vip |
- System information
root@cto-gpu-pro-n01:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.5 LTS
Release: 22.04
Codename: jammy
1.1 Disable unneeded services
- Disable swap
# sed -i "/swap/{s/^/#/g}" /etc/fstab
# swapoff -a
- Disable the firewall
root@boe:~# ufw disable
root@boe:~# ufw status
Status: inactive
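The sed command in the swap step prefixes every line that contains "swap", including lines that are already commented out, so running it twice stacks `#` characters. Below is a minimal sketch of an idempotent variant. The function name `disable_swap_entries` and the scratch file are illustrative assumptions, not part of the original procedure.

```shell
# Comment out active swap entries in an fstab-style file. Lines that are
# already commented are skipped, so reruns do not stack '#' characters.
disable_swap_entries() {
    sed -i -E '/^[[:space:]]*#/!{/[[:space:]]swap[[:space:]]/s/^/#/;}' "$1"
}

# Demo on a scratch file; pass /etc/fstab (as root) on a real node.
demo_fstab="$(mktemp)"
printf '%s\n' \
    'UUID=aaaa / ext4 defaults 0 1' \
    '/swap.img none swap sw 0 0' \
    '#/old.img none swap sw 0 0' > "$demo_fstab"
disable_swap_entries "$demo_fstab"
disable_swap_entries "$demo_fstab"   # second run changes nothing
cat "$demo_fstab"
```

The filter only touches lines where `swap` appears as a whitespace-delimited field, so a mount point whose path merely contains the word is left alone unless it is an actual swap entry.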
1.2 Environment and network
- hostname
# hostname master-01
# vim /etc/hostname
- hosts
cat >> /etc/hosts << EOF
10.10.239.155 master-01
10.10.239.156 master-02
10.10.239.161 master-03
EOF
- Enable IP forwarding and bridge netfilter
# cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
# modprobe br_netfilter
# echo br_netfilter | tee /etc/modules-load.d/k8s.conf
# sysctl -p /etc/sysctl.d/k8s.conf
- Load the kernel modules
# cat > /etc/modules-load.d/ipvs.conf <<EOF
ip_vs
ip_vs_lc
ip_vs_wlc
ip_vs_rr
ip_vs_wrr
ip_vs_lblc
ip_vs_lblcr
ip_vs_dh
ip_vs_sh
ip_vs_fo
ip_vs_nq
ip_vs_sed
ip_vs_ftp
nf_conntrack
EOF
# systemctl restart systemd-modules-load.service
The verification output is as follows:
root@boe:~# lsmod | grep '^ip_vs'
ip_vs_ftp 16384 0
ip_vs_sed 16384 0
ip_vs_nq 16384 0
ip_vs_fo 16384 0
ip_vs_sh 16384 0
ip_vs_dh 16384 0
ip_vs_lblcr 16384 0
ip_vs_lblc 16384 0
ip_vs_wrr 16384 0
ip_vs_rr 16384 1338
ip_vs_wlc 16384 0
ip_vs_lc 16384 0
ip_vs 176128 1366 ip_vs_wlc,ip_vs_rr,ip_vs_dh,ip_vs_lblcr,ip_vs_sh,ip_vs_fo,ip_vs_nq,ip_vs_lblc,ip_vs_wrr,ip_vs_lc,ip_vs_sed,ip_vs_ftp
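The `lsmod | grep '^ip_vs'` check above does not show `nf_conntrack`, and it is easy to overlook a module that failed to load in a long listing. As a rough sketch, the helper below compares a modules-load.d file against lsmod output and prints only what is missing. The name `missing_modules` is hypothetical, and lsmod output is read from stdin so the check also works on saved text.

```shell
# List modules named in a modules-load.d file that do not appear in the
# lsmod output supplied on stdin.
missing_modules() (
    conf="$1"
    # First column of lsmod output, minus the "Module ..." header line.
    loaded="$(awk 'NR > 1 || $1 != "Module" { print $1 }')"
    sed -e 's/#.*//' -e '/^[[:space:]]*$/d' "$conf" |
    while read -r mod; do
        printf '%s\n' "$loaded" | grep -qxF "$mod" || echo "$mod"
    done
)

# Demo with sample lsmod-style text; on a node run:
#   lsmod | missing_modules /etc/modules-load.d/ipvs.conf
demo_conf="$(mktemp)"
printf '%s\n' ip_vs ip_vs_rr nf_conntrack > "$demo_conf"
printf '%s\n' 'Module Size Used by' 'ip_vs 176128 1 ip_vs_rr' 'ip_vs_rr 16384 0' |
    missing_modules "$demo_conf"
```

An empty result means every module listed in the file is loaded.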
1.3 apt sources
- apt source config file
vim /etc/apt/sources.list
- Contents
# See http://help.ubuntu.com/community/UpgradeNotes for how to upgrade to
# newer versions of the distribution.
deb http://archive.ubuntu.com/ubuntu/ jammy main restricted
# deb-src http://archive.ubuntu.com/ubuntu/ jammy main restricted
## Major bug fix updates produced after the final release of the
## distribution.
deb http://archive.ubuntu.com/ubuntu/ jammy-updates main restricted
# deb-src http://archive.ubuntu.com/ubuntu/ jammy-updates main restricted
## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu
## team. Also, please note that software in universe WILL NOT receive any
## review or updates from the Ubuntu security team.
deb http://archive.ubuntu.com/ubuntu/ jammy universe
# deb-src http://archive.ubuntu.com/ubuntu/ jammy universe
deb http://archive.ubuntu.com/ubuntu/ jammy-updates universe
# deb-src http://archive.ubuntu.com/ubuntu/ jammy-updates universe
## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu
## team, and may not be under a free licence. Please satisfy yourself as to
## your rights to use the software. Also, please note that software in
## multiverse WILL NOT receive any review or updates from the Ubuntu
## security team.
deb http://archive.ubuntu.com/ubuntu/ jammy multiverse
# deb-src http://archive.ubuntu.com/ubuntu/ jammy multiverse
deb http://archive.ubuntu.com/ubuntu/ jammy-updates multiverse
# deb-src http://archive.ubuntu.com/ubuntu/ jammy-updates multiverse
## N.B. software from this repository may not have been tested as
## extensively as that contained in the main release, although it includes
## newer versions of some applications which may provide useful features.
## Also, please note that software in backports WILL NOT receive any review
## or updates from the Ubuntu security team.
deb http://archive.ubuntu.com/ubuntu/ jammy-backports main restricted universe multiverse
# deb-src http://archive.ubuntu.com/ubuntu/ jammy-backports main restricted universe multiverse
deb http://archive.ubuntu.com/ubuntu/ jammy-security main restricted
# deb-src http://archive.ubuntu.com/ubuntu/ jammy-security main restricted
deb http://archive.ubuntu.com/ubuntu/ jammy-security universe
# deb-src http://archive.ubuntu.com/ubuntu/ jammy-security universe
deb http://archive.ubuntu.com/ubuntu/ jammy-security multiverse
# deb-src http://archive.ubuntu.com/ubuntu/ jammy-security multiverse
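- Install helper tools (optional)
apt install vim wget curl net-tools bind9-utils socat ipvsadm ipset
- vim: the Vim editor.
- wget: command-line tool for downloading files.
- curl: command-line tool for talking to servers, commonly used for testing and downloading data.
- net-tools: basic networking tools such as ifconfig and netstat.
- conntrack-tools: tools for viewing and managing the connection-tracking table (such as conntrack). It is not part of the install command above.
- bind9-utils: BIND 9 utilities. Note that on Ubuntu 22.04 dig and nslookup ship in the separate bind9-dnsutils package.
- socat: relays bidirectional data streams between network and file endpoints.
- ipvsadm: configures the IP Virtual Server (IPVS).
- ipset: manages IP sets (used by firewalls and similar).
- Cache offline packages (optional)
The script below downloads the packages above and their dependencies to a local directory.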
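#!/bin/bash
# Requires apt-rdepends (apt install apt-rdepends).
PACKAGES="vim wget curl net-tools bind9-utils socat ipvsadm ipset"
mkdir -p ./deb-downloads
cd ./deb-downloads || exit 1
for pkg in $PACKAGES; do
    echo "Processing $pkg and its dependencies..."
    # Keep only package-name lines (dependency lines are indented), then
    # drop virtual and base packages that cannot or need not be downloaded.
    apt-rdepends "$pkg" | grep -v "^ " \
        | grep -vE '^(debconf-2.0|perlapi-|libc6|libgcc-s1|linux-libc-dev|gcc-|base-files)$' \
        | sort -u | while read -r dep; do
        echo "Downloading $dep ..."
        apt download "$dep" || echo "Warning: could not download $dep"
    done
done
To install from the cache later: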
sudo dpkg -i /usr/local/src/pkg-down/tools/deb-downloads/lib*.deb
sudo dpkg -i /usr/local/src/pkg-down/tools/deb-downloads/*.deb
1.4 System tuning
- Kernel tuning
# cat >>/etc/sysctl.conf <<EOF
net.ipv4.ip_forward = 1
vm.swappiness = 0
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.tcp_max_syn_backlog = 65536
net.core.netdev_max_backlog = 32768
net.core.somaxconn = 32768
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_mem = 94500000 915000000 927000000
net.ipv4.tcp_max_orphans = 3276800
net.ipv4.ip_local_port_range = 1024 65535
EOF
# sysctl -p
- File descriptor limits
ulimit -n 655350
To make the change permanent, edit the following two files:
# cat >>/etc/security/limits.conf <<EOF
* soft memlock unlimited
* hard memlock unlimited
* soft nofile 655350
* hard nofile 655350
* soft nproc 655350
* hard nproc 655350
EOF
vim /etc/systemd/system.conf
DefaultLimitNOFILE=655350
Alternatively:
echo ulimit -n 655350 >>/etc/profile
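After `sysctl -p` it can be worth confirming that the kernel actually holds the values written to the conf files in this section and in 1.2. The following is a minimal sketch that compares each `key = value` line with the corresponding file under /proc/sys. The function name `verify_sysctl` and its optional second argument (an alternate root directory for dry runs) are assumptions made for illustration.

```shell
# Compare each "key = value" line of a sysctl conf file with the live value
# under /proc/sys. The optional second argument points the check at another
# directory tree, which allows a dry run without touching the system.
verify_sysctl() (
    conf="$1"
    root="${2:-/proc/sys}"
    report="$(
        sed -e 's/#.*//' -e '/^[[:space:]]*$/d' \
            -e 's/[[:space:]]*=[[:space:]]*/ /' "$conf" |
        while read -r key want; do
            # net.ipv4.ip_forward -> net/ipv4/ip_forward
            path="$root/$(printf '%s' "$key" | tr . /)"
            if [ ! -r "$path" ]; then
                echo "MISSING $key"
                continue
            fi
            have="$(tr -s '[:space:]' ' ' < "$path" | sed 's/ *$//')"
            if [ "$have" = "$want" ]; then
                echo "OK $key = $have"
            else
                echo "DIFF $key: want $want, have $have"
            fi
        done
    )"
    printf '%s\n' "$report"
    # Succeed only when no line reports a problem.
    ! printf '%s\n' "$report" | grep -Eq '^(DIFF|MISSING)'
)

# Dry run against a scratch tree; on a real node call for example:
#   verify_sysctl /etc/sysctl.d/k8s.conf
demo="$(mktemp -d)"
mkdir -p "$demo/net/ipv4"
echo 1 > "$demo/net/ipv4/ip_forward"
printf 'net.ipv4.ip_forward = 1\n' > "$demo/k8s.conf"
verify_sysctl "$demo/k8s.conf" "$demo"
```

A MISSING line for the `net.bridge.*` keys usually means the br_netfilter module is not loaded.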
1.5 Install the NFS client
The StorageClass set up later needs it.
apt install -y nfs-common
2. Install containerd (all nodes)
- Required tools
sudo apt install apt-transport-https ca-certificates software-properties-common
- Add the GPG key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
- Add the apt source
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
- Update the package index
apt update
- Install containerd
apt install containerd.io
- Generate the containerd config file
mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
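- Set SystemdCgroup = true
On recent containerd releases the default config generated above already contains SystemdCgroup = false under the runc options, so flip that value instead of appending a second key (a duplicate key makes the TOML invalid).
# sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
- Start (restart so that a service already started by the package picks up the new config)
systemctl restart containerd
systemctl enable containerd
- Test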
ctr images pull docker.io/library/nginx:alpine
- Cache the install packages (optional)
apt-rdepends containerd | grep -v "^ " | grep -vE '^(debconf-2.0|perlapi-|libc6|libgcc-s1|linux-libc-dev|gcc-|base-files)$' | sort -u | xargs -I{} sh -c 'echo 下载 {}; apt download {} || echo "警告:无法下载 {}"'
To install from the cache later:
sudo dpkg -i /usr/local/src/pkg-down/tools/deb-downloads/lib*.deb
sudo dpkg -i /usr/local/src/pkg-down/tools/deb-downloads/*.deb
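The SystemdCgroup step in this section edits config.toml with sed, and the right edit depends on whether the key is already present in the generated file. As a hedged sketch, the helper below covers both cases and can be rerun safely. The function name `enable_systemd_cgroup` is hypothetical, and the demo works on a scratch file rather than /etc/containerd/config.toml.

```shell
# Set SystemdCgroup = true in a containerd config.toml. If the key already
# exists it is rewritten in place; otherwise it is appended directly under
# the runc options table header.
enable_systemd_cgroup() (
    cfg="$1"
    if grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=' "$cfg"; then
        sed -i -E 's/^([[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*).*/\1true/' "$cfg"
    else
        sed -i '/containerd\.runtimes\.runc\.options\]/a\
            SystemdCgroup = true' "$cfg"
    fi
)

# Demo on a scratch file shaped like the relevant part of the default config.
demo_cfg="$(mktemp)"
cat > "$demo_cfg" <<'EOF'
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = false
EOF
enable_systemd_cgroup "$demo_cfg"
grep -n 'SystemdCgroup' "$demo_cfg"
```

On a real node, run it against /etc/containerd/config.toml and then restart containerd so the change is loaded.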
3. High-availability setup for the masters (run on the masters)
3.1 Install and configure haproxy (same on all 3 masters)
- Install
# apt -y install haproxy
- Edit the configuration
Edit the /etc/haproxy/haproxy.cfg file:
global
    # /etc/sysconfig/syslog
    #
    # local2.* /var/log/haproxy.log
    #
    log 127.0.0.1 local2
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 4000
    user haproxy
    group haproxy
    daemon
defaults
    mode tcp
    log global
    retries 3
    timeout connect 10s
    timeout client 1m
    timeout server 1m
frontend kubernetes
    bind *:7443
    mode tcp
    default_backend kubernetes_master
backend kubernetes_master
    balance roundrobin
    server master01 10.10.239.155:6443 check maxconn 2000
    server master02 10.10.239.156:6443 check maxconn 2000
    server master03 10.10.239.161:6443 check maxconn 2000
- Start
# systemctl start haproxy
# systemctl enable haproxy
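Before relying on the load balancer, it helps to check which backends the config really points at and whether each one accepts TCP connections. The sketch below is one way to do that. The function names are hypothetical, and `probe_backends` assumes the OpenBSD `nc` that Ubuntu ships. Until kubeadm init has run, nothing listens on port 6443, so every backend is expected to show as unreachable at this stage.

```shell
# Print "name address" for every server line in an haproxy config.
list_backend_servers() {
    awk '$1 == "server" { print $2, $3 }' "$1"
}

# Try a TCP connection to each backend (2 second timeout per server).
probe_backends() {
    list_backend_servers "$1" | while read -r name addr; do
        if nc -z -w 2 "${addr%:*}" "${addr##*:}" </dev/null; then
            echo "$name $addr reachable"
        else
            echo "$name $addr unreachable"
        fi
    done
}

# Demo: list the backends from a scratch copy of the backend section.
# On a node, run: probe_backends /etc/haproxy/haproxy.cfg
demo_cfg="$(mktemp)"
cat > "$demo_cfg" <<'EOF'
backend kubernetes_master
    balance roundrobin
    server master01 10.10.239.155:6443 check maxconn 2000
    server master02 10.10.239.156:6443 check maxconn 2000
    server master03 10.10.239.161:6443 check maxconn 2000
EOF
list_backend_servers "$demo_cfg"
```

The config syntax itself can be validated with `haproxy -c -f /etc/haproxy/haproxy.cfg` before the service is started.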
3.2 keepalived
1) master-01
- Install
# apt -y install keepalived
- Edit the config file
Edit /etc/keepalived/keepalived.conf; on the backup nodes, change the values flagged in the comments.
global_defs {
    router_id LVS_DEVEL
    vrrp_skip_check_adv_addr
    # vrrp_strict
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}
vrrp_instance VI_1 {
    state MASTER              # BACKUP on the other nodes
    interface ens4            # adjust to the actual interface
    virtual_router_id 150     # must not clash with other clusters
    priority 100              # highest on the MASTER; adjust as needed
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.10.239.150         # virtual IP
    }
}
- Start the service
# systemctl enable keepalived
# systemctl start keepalived
2) master-02
- Install
# apt -y install keepalived
- Edit the config file
Edit /etc/keepalived/keepalived.conf; only state and priority differ from master-01.
global_defs {
    router_id LVS_DEVEL
    vrrp_skip_check_adv_addr
    # vrrp_strict
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens4
    virtual_router_id 150
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.10.239.150
    }
}
- Start the service
# systemctl enable keepalived
# systemctl start keepalived
3) master-03
- Install
# apt -y install keepalived
- Edit the config file
Edit /etc/keepalived/keepalived.conf; only state and priority differ from master-01.
global_defs {
    router_id LVS_DEVEL
    vrrp_skip_check_adv_addr
    # vrrp_strict
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens4
    virtual_router_id 150
    priority 80
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.10.239.150
    }
}
- Start the service
# systemctl enable keepalived
# systemctl start keepalived
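The three keepalived.conf files in this section differ only in state and priority, so they can be generated from one template instead of being edited by hand on each node. A minimal sketch follows. The function name `render_keepalived` and its argument order are assumptions; the fixed values (router ID 150, password 1111) are copied from the configs in this section.

```shell
# Render a keepalived.conf to stdout.
#   $1 = state (MASTER or BACKUP)   $2 = priority
#   $3 = network interface          $4 = virtual IP
render_keepalived() {
    cat <<EOF
global_defs {
    router_id LVS_DEVEL
    vrrp_skip_check_adv_addr
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}
vrrp_instance VI_1 {
    state $1
    interface $3
    virtual_router_id 150
    priority $2
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        $4
    }
}
EOF
}

# Demo: print the master-02 variant. On the node itself, redirect the
# output to /etc/keepalived/keepalived.conf and restart keepalived.
render_keepalived BACKUP 90 ens4 10.10.239.150
```

Calling it with `MASTER 100`, `BACKUP 90`, and `BACKUP 80` reproduces the three variants used on master-01, master-02, and master-03.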
4. Core component installation (all nodes)
4.1 Install kubelet, kubeadm, and kubectl
- Enable HTTPS transport for apt
apt-get update && apt-get install -y apt-transport-https
- Add the signing key
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
- Add the Kubernetes package source
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
- Install with apt (refresh the index first so the new source is picked up)
apt-get update && apt-get install -y kubelet=1.23.17-00 kubeadm=1.23.17-00 kubectl=1.23.17-00
- Runtime endpoint
crictl config runtime-endpoint unix:///run/containerd/containerd.sock
- Start the service
systemctl daemon-reload
systemctl enable kubelet && systemctl start kubelet
4.2 Pre-pull images (all nodes) [optional]
- Pull the images in advance
kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers
- Result
W0528 03:24:37.161961 250879 version.go:105] falling back to the local client version: v1.23.17
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-apiserver:v1.23.17
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-controller-manager:v1.23.17
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-scheduler:v1.23.17
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-proxy:v1.23.17
[config/images] Pulled registry.aliyuncs.com/google_containers/pause:3.6
[config/images] Pulled registry.aliyuncs.com/google_containers/etcd:3.5.6-0
[config/images] Pulled registry.aliyuncs.com/google_containers/coredns:v1.8.6
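4.2 A missing image
Note: the cluster's images have already been pointed at the Alibaba Cloud mirror, but initialization still checks for the registry.k8s.io/pause:3.8 image, even though pods actually use the Alibaba Cloud pause:3.8 image at startup.
- Pull the image
I pushed the missing image to our internal Harbor in advance. The -n k8s.io flag puts it in the containerd namespace that the CRI (and therefore crictl and kubelet) reads; without it, ctr uses the default namespace.
ctr -n k8s.io images pull harbocto.boe.com.cn/kubernetes/pause:3.8
- Tag it
ctr -n k8s.io images tag harbocto.boe.com.cn/kubernetes/pause:3.8 registry.k8s.io/pause:3.8
- Check the result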
[root@cto-gpu-pro-m01 ~]# crictl images
IMAGE TAG IMAGE ID SIZE
......
harbocto.boe.com.cn/kubernetes/pause 3.8 4873874c08efc 309kB
registry.k8s.io/pause 3.8 4873874c08efc 309kB
......
5. Cluster initialization
5.1 Hostnames
- master-01
hostname k8s-master-01
echo "k8s-master-01" > /etc/hostname
cat >> /etc/hosts <<EOF
10.10.239.155 k8s-master-01
10.10.239.156 k8s-master-02
10.10.239.161 k8s-master-03
EOF
- master-02
hostname k8s-master-02
echo "k8s-master-02" > /etc/hostname
cat >> /etc/hosts <<EOF
10.10.239.155 k8s-master-01
10.10.239.156 k8s-master-02
10.10.239.161 k8s-master-03
EOF
- master-03
hostname k8s-master-03
echo "k8s-master-03" > /etc/hostname
cat >> /etc/hosts <<EOF
10.10.239.155 k8s-master-01
10.10.239.156 k8s-master-02
10.10.239.161 k8s-master-03
EOF
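Appending to /etc/hosts with `cat >>` adds duplicate lines every time the step is repeated. A small sketch of an idempotent alternative is shown here, using the addresses from the server plan table. The function name `add_host_entry` is hypothetical, and the demo writes to a scratch file instead of /etc/hosts.

```shell
# Append "IP hostname" to a hosts file unless the hostname is already mapped.
#   $1 = hosts file   $2 = IP address   $3 = hostname
add_host_entry() {
    grep -Eq "[[:space:]]$3([[:space:]]|\$)" "$1" ||
        printf '%s %s\n' "$2" "$3" >> "$1"
}

# Demo: run the same three additions twice; the file still has three lines.
demo_hosts="$(mktemp)"
for run in 1 2; do
    add_host_entry "$demo_hosts" 10.10.239.155 k8s-master-01
    add_host_entry "$demo_hosts" 10.10.239.156 k8s-master-02
    add_host_entry "$demo_hosts" 10.10.239.161 k8s-master-03
done
cat "$demo_hosts"
```

Note that the helper only checks whether the name exists; it does not correct an entry that maps the name to a wrong address.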
5.2 Initialization
1) master-01
- master 1
kubeadm init \
--control-plane-endpoint "10.10.239.150:7443" \
--upload-certs \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.23.17 \
--pod-network-cidr 10.244.0.0/16
- Output
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of the control-plane node running the following command on each as root:
kubeadm join 10.10.239.150:7443 --token 6877nh.cjqvl57r45g3pygo \
--discovery-token-ca-cert-hash sha256:86774076430c354eb6114d703c5e05afdfbd1a2270edcd7b51117feb961719c9 \
--control-plane --certificate-key 95555619210539f09ad585dd1a19ecbf8326180fd91a0827173b2339576c6adc
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.10.239.150:7443 --token 6877nh.cjqvl57r45g3pygo \
--discovery-token-ca-cert-hash sha256:86774076430c354eb6114d703c5e05afdfbd1a2270edcd7b51117feb961719c9
- Configure kubectl (just follow the output above)
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
2) Other masters
- Run the command from the output above
kubeadm join 10.10.239.150:7443 --token 6877nh.cjqvl57r45g3pygo \
--discovery-token-ca-cert-hash sha256:86774076430c354eb6114d703c5e05afdfbd1a2270edcd7b51117feb961719c9 \
--control-plane --certificate-key 95555619210539f09ad585dd1a19ecbf8326180fd91a0827173b2339576c6adc
- Configure kubectl
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
3) Worker nodes
kubeadm join 10.10.239.150:7443 --token 6877nh.cjqvl57r45g3pygo \
--discovery-token-ca-cert-hash sha256:86774076430c354eb6114d703c5e05afdfbd1a2270edcd7b51117feb961719c9
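The join commands above stop working once their credentials expire: the kubeadm output states that the uploaded certificates are deleted after two hours, and bootstrap tokens expire as well (24 hours by default). The sketch below shows how fresh commands could be assembled later. `build_control_plane_join` is a hypothetical helper, and the commented kubeadm calls assume the certificate key is the last line printed by the upload-certs phase.

```shell
# Turn a worker join command into a control-plane join command by appending
# the two extra flags shown in the kubeadm init output.
#   $1 = worker join command   $2 = certificate key
build_control_plane_join() {
    printf '%s --control-plane --certificate-key %s\n' "$1" "$2"
}

# On a live control-plane node (not executed here):
#   join_cmd="$(kubeadm token create --print-join-command)"
#   cert_key="$(kubeadm init phase upload-certs --upload-certs | tail -n 1)"
#   build_control_plane_join "$join_cmd" "$cert_key"

# Demo with placeholder values only.
build_control_plane_join \
    'kubeadm join 10.10.239.150:7443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>' \
    '<certificate-key>'
```

Worker nodes use the output of `kubeadm token create --print-join-command` unchanged.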
4) Verification
- List the nodes
root@k8s-master-01:~# kubectl get node -n kube-system
NAME STATUS ROLES AGE VERSION
k8s-master-01 NotReady control-plane,master 20m v1.23.17
k8s-master-02 NotReady control-plane,master 11m v1.23.17
k8s-master-03 NotReady control-plane,master 10m v1.23.17
The nodes are NotReady because no pod network (CNI) plugin has been installed yet; see 5.3.
- Check the pods in kube-system
root@k8s-master-01:~# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-6d8c4cb4d-flqtp 0/1 Pending 0 20m
coredns-6d8c4cb4d-w8tc8 0/1 Pending 0 20m
etcd-k8s-master-01 1/1 Running 0 20m
etcd-k8s-master-02 1/1 Running 0 11m
etcd-k8s-master-03 1/1 Running 0 10m
kube-apiserver-k8s-master-01 1/1 Running 0 20m
kube-apiserver-k8s-master-02 1/1 Running 0 12m
kube-apiserver-k8s-master-03 1/1 Running 1 (10m ago) 10m
kube-controller-manager-k8s-master-01 1/1 Running 1 (11m ago) 20m
kube-controller-manager-k8s-master-02 1/1 Running 0 12m
kube-controller-manager-k8s-master-03 1/1 Running 0 9m53s
kube-proxy-2xnjx 1/1 Running 0 11m
kube-proxy-jftx5 1/1 Running 0 20m
kube-proxy-zpxv8 1/1 Running 0 12m
kube-scheduler-k8s-master-01 1/1 Running 1 (11m ago) 20m
kube-scheduler-k8s-master-02 1/1 Running 0 12m
kube-scheduler-k8s-master-03 1/1 Running 0 9m45s
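Once a network plugin is installed, the same node listing can be checked mechanically instead of by eye. As a small sketch, the filter below prints every node whose STATUS column is not exactly `Ready`. The name `not_ready_nodes` is hypothetical, and the demo feeds it sample text rather than calling kubectl.

```shell
# Read `kubectl get nodes` output on stdin and print the names of nodes
# whose STATUS is not exactly "Ready" (so "Ready,SchedulingDisabled" is
# also reported).
not_ready_nodes() {
    awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Demo with sample output; on the cluster run:
#   kubectl get nodes | not_ready_nodes
printf '%s\n' \
    'NAME STATUS ROLES AGE VERSION' \
    'k8s-master-01 Ready control-plane,master 20m v1.23.17' \
    'k8s-master-02 NotReady control-plane,master 11m v1.23.17' |
    not_ready_nodes
```

To block until all nodes are ready, `kubectl wait --for=condition=Ready nodes --all --timeout=300s` is an alternative.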
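5.3 Network installation
There are many options.
For small clusters I usually choose flannel; see 《flannel网络的安装和删除》 (installing and removing the flannel network).
For large clusters I choose calico; see 《calico网络安装和删除》 (installing and removing the calico network).
6. Installing the StorageClass
- Installation
See the document 《K8S对象-StorageClass》 (Kubernetes objects: StorageClass).
- FAQ
《【FAQ】1.21创建PV报错》 (error when creating a PV on 1.21)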