Deploying a Kubernetes v1.12.1 Cluster and a Polyaxon Server on Ubuntu 16.04
Deploying Polyaxon on a Kubernetes cluster under Ubuntu
Main steps
- System environment setup
1.1 Prerequisites
1.2 Check system information
1.3 Check MAC address, product uuid, hostname
1.4 Disable the firewall
1.5 Disable SELinux
1.6 Disable swap
1.7 Configure /etc/hosts
1.8 Example setup
- Install docker-ce
2.1 Add the Docker apt source
2.2 List available Docker versions
2.3 Install Docker 18.06.1-ce
2.4 Verify the Docker installation
- Kubernetes (k8s)
3.1 Overview
3.2 Install kubelet, kubeadm and kubectl on every node
3.3 Create a single-master cluster with kubeadm on the master
3.4 Install a network plugin
3.5 Master isolation
3.6 Join worker nodes to the master
3.7 Testing
3.8 Uninstall and clean up k8s
- Install Helm
4.1 Install the Helm client on the master
4.2 Install the Helm Tiller server on the master
- Polyaxon
5.1 Deploy the Polyaxon server
5.2 Install the Polyaxon client
5.3 Uninstall Polyaxon
Deployment matrix
/ | System setup | Install Docker | Install kubeadm | kubeadm init | Install network plugin | kubeadm join | Install Helm | Install Polyaxon |
---|---|---|---|---|---|---|---|---|
Master | √ | √ | √ | √ | √ | × | √ | √ |
Slave/worker node | √ | √ | √ | × | × | √ | × | × |
System environment setup (all machines)
1.1 Prerequisites
- Several machines running Ubuntu 16.04 or CentOS 7
- Passwordless SSH login between all machines (see the sketch after this list)
- At least 2 GB of RAM per machine
- Full network connectivity between all machines in the cluster
- A unique MAC address, product uuid and hostname on every machine
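A minimal sketch of setting up passwordless SSH from the master to one worker, assuming the example hostname slave1 from section 1.8 and an account named user (repeat for the other workers):
# on the master: generate a key pair if you do not have one yet
ssh-keygen -t rsa -b 4096
# copy the public key to the worker
ssh-copy-id user@slave1
# verify that login no longer asks for a password
ssh user@slave1 hostname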
1.2 Check system information
Check the system information with:
lsb_release -a
1.3 Check MAC address, product uuid, hostname
Kubernetes requires every machine in the cluster to have a unique MAC address, product uuid and hostname. You can check them with:
# product uuid
cat /sys/class/dmi/id/product_uuid
# MAC address
ip link
# hostname
cat /etc/hostname
1.4 Disable the firewall
Stop and disable the firewall:
systemctl stop firewalld
systemctl disable firewalld
On CentOS, if you keep the firewall running you must open the required ports instead; see "Check required ports" in the Kubernetes documentation.
1.5 Disable SELinux
Disable SELinux (relevant mainly on CentOS):
setenforce 0
vim /etc/selinux/config
# set
SELINUX=disabled
# save and exit
1.6 Disable swap
Since Kubernetes 1.8, swap must be disabled; with the default configuration the kubelet will not start otherwise, and Polyaxon requires Kubernetes 1.8 or later. Edit /etc/fstab and comment out the line that references swap:
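A one-line sketch for commenting out the swap entry (check your own /etc/fstab afterwards, since the exact device or swapfile name varies per machine):
# comment out any fstab line whose type is "swap"
sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab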
Save the file, reboot, and then run:
sudo swapoff -a
1.7 Configure /etc/hosts
Run sudo vim /etc/hosts and add:
192.168.0.6 master
192.168.0.5 slave1
192.168.0.7 slave2
Save and exit.
1.8 Example setup
This guide uses three Ubuntu 16.04 hosts on the same internal network:
/ | IP | hostname |
---|---|---|
Master | 192.168.0.6 | seeta-03 |
Slave1 | 192.168.0.5 | seeta-02 |
Slave2 | 192.168.0.7 | seeta-0002 |
2. Install docker-ce
Kubernetes manages Docker containers, so the Docker version must be compatible with the Kubernetes version you plan to install; check the compatibility notes in the CHANGELOG of that Kubernetes release. This guide installs Kubernetes 1.12.1, so see CHANGELOG-1.12.md, which validates Docker versions up to 18.06. Within the supported range, prefer the newest Docker release.
2.1 Add the Docker apt source
sudo apt-get update
sudo apt-get -y install apt-transport-https ca-certificates curl software-properties-common
curl -fsSL | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] $(lsb_release -cs) stable"
sudo apt-get -y update
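A working version of the two URL-bearing commands above, assuming Docker's official apt repository for Ubuntu (substitute a local mirror if you prefer):
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"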
2.2 List available Docker versions
sudo apt-cache madison docker-ce
2.3 Install Docker 18.06.1-ce
sudo apt install docker-ce=18.06.1~ce~3-0~ubuntu
sudo systemctl enable docker
2.4 Verify the Docker installation
Check the Docker version with:
sudo docker version
Output:
Client:
 Version:           18.06.1-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        e68fc7a
 Built:             Tue Aug 21 17:24:56 2018
 OS/Arch:           linux/amd64
 Experimental:      false
Server:
 Engine:
  Version:          18.06.1-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       e68fc7a
  Built:            Tue Aug 21 17:23:21 2018
  OS/Arch:          linux/amd64
  Experimental:     false
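Optionally, you can also confirm that the daemon can actually run containers (this pulls a tiny image from Docker Hub):
sudo docker run --rm hello-world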
3. Kubernetes
3.1 Overview
Kubernetes is an open-source system for managing containerized applications across multiple hosts. Its goal is to make deploying containerized applications simple and efficient, and it provides mechanisms for application deployment, scheduling, updating and maintenance. For details, see the Kubernetes website.
3.2 Install kubelet, kubeadm and kubectl on every node
3.2.1 The tools
Tool | Description |
---|---|
kubeadm | Command-line tool that bootstraps a k8s cluster |
kubelet | Core component that runs on every node of the cluster and performs operations such as starting pods and containers |
kubectl | Command-line tool for operating the cluster |
3.2.2 Add the apt key
sudo apt-get update && sudo apt-get install -y apt-transport-https
curl -s .gpg | sudo apt-key add -
3.2.3 Add the Kubernetes apt source (on every node)
# enter the following two lines
sudo cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb / kubernetes-xenial main
EOF
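For reference, the standard form of the apt-key and source commands above, assuming the upstream Google-hosted repository (a domestic mirror can be substituted) and using tee so that the write happens with root privileges:
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF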
3.2.4 Check the package versions available from the source (on every node)
sudo apt-cache madison kubelet
3.2.5 Install kubelet, kubeadm and kubectl (on every node)
sudo apt-get update
sudo apt install kubelet=1.12.1-00 kubeadm=1.12.1-00 kubectl=1.12.1-00
sudo apt-mark hold kubelet kubeadm kubectl
3.2.6 Check which images kubeadm needs (on every node)
kubeadm config images list
Output (this sample was captured with a newer kubeadm, hence the v1.13.1 tags; with kubeadm 1.12.1 you will see v1.12.x images):
k8s.gcr.io/kube-apiserver:v1.13.1
k8s.gcr.io/kube-controller-manager:v1.13.1
k8s.gcr.io/kube-scheduler:v1.13.1
k8s.gcr.io/kube-proxy:v1.13.1
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.2.24
k8s.gcr.io/coredns:1.2.2
3.2.7 Pull the images (on every node)
gcr.io is not reachable from mainland China. One workaround is to pull the images from a mirror and re-tag them; here we take the simpler route of changing the image repository in the kubeadm configuration file. Since kubeadm v1.11 the kubeadm config print-default command prints kubeadm's default configuration, so we can dump it to a file. In a safe directory on each node, for example $HOME/k8s/, run:
kubeadm config print-default > kubeadm.conf
Change the image repository address in kubeadm.conf:
sed -i "s/imageRepository: .*/imageRepository: registry.aliyuncs.com\/google_containers/g" kubeadm.conf
Pin the Kubernetes version so that kubeadm does not try to look up the latest stable release (stable-1.12.txt) online during initialization:
# adjust the version number here if needed
sed -i "s/kubernetesVersion: .*/kubernetesVersion: v1.12.1/g" kubeadm.conf
Now run kubeadm's image pull with the --config flag pointing at kubeadm.conf. In the directory containing kubeadm.conf, run:
kubeadm config images pull --config kubeadm.conf
Wait patiently; the output looks like:
W0102 18:43:32.305695 28207 common.go:105] WARNING: Detected resource kinds that may not apply: [InitConfiguration MasterConfiguration JoinConfiguration NodeConfiguration]
[config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1alpha3, Kind=JoinConfiguration
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-apiserver:v1.12.1
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-controller-manager:v1.12.1
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-scheduler:v1.12.1
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-proxy:v1.12.1
[config/images] Pulled registry.aliyuncs.com/google_containers/pause:3.1
[config/images] Pulled registry.aliyuncs.com/google_containers/etcd:3.2.24
[config/images] Pulled registry.aliyuncs.com/google_containers/coredns:1.2.2
Note: the pause base image is a special case; the kubelet still pulls it from k8s.gcr.io unless configured otherwise, so give the mirrored image an extra tag:
sudo docker tag registry.aliyuncs.com/google_containers/pause:3.1 k8s.gcr.io/pause:3.1
3.3 Create a single-master cluster with kubeadm on the master
Initialize the master node
Normally you would pass flags such as the advertise address and --pod-network-cidr to kubeadm init, but since we initialize from the kubeadm.conf configuration file we set them there instead of on the command line:
# change "192.168.0.6" to the IP of your own master node
sed -i "s/advertiseAddress: .*/advertiseAddress: 192.168.0.6/g" kubeadm.conf
Set the pod network CIDR (--pod-network-cidr) to 10.244.0.0/16:
# no change needed
sed -i "s/podSubnet: .*/podSubnet: \"10.244.0.0\/16\"/g" kubeadm.conf
Run the initialization command:
sudo kubeadm init --config kubeadm.conf
Output:
W1109 17:01:47.071494 42929 common.go:105] WARNING: Detected resource kinds that may not apply: [InitConfiguration MasterConfiguration JoinConfiguration NodeConfiguration]
[config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1alpha3, Kind=JoinConfiguration
[init] using Kubernetes version: v1.12.2
[preflight] running pre-flight checks
[preflight/images] Pulling images required for setting up a Kubernetes cluster
[preflight/images] This might take a minute or two, depending on the speed of your internet connection
[preflight/images] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[preflight] Activating the kubelet service
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [ubuntu1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.0.8]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Generated etcd/ca certificate and key.
[certificates] Generated etcd/server certificate and key.
[certificates] etcd/server serving cert is signed for DNS names [ubuntu1 localhost] and IPs [127.0.0.1 ::1]
[certificates] Generated apiserver-etcd-client certificate and key.
[certificates] Generated etcd/peer certificate and key.
[certificates] etcd/peer serving cert is signed for DNS names [ubuntu1 localhost] and IPs [192.168.0.8 127.0.0.1 ::1]
[certificates] Generated etcd/healthcheck-client certificate and key.
[certificates] valid certificates and keys now exist in "/etc/kubernetes/pki"
[certificates] Generated sa key and public key.
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[controlplane] wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests"
[init] this might take a minute or longer if the control plane images have to be pulled
[apiclient] All control plane components are healthy after 57.002438 seconds
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.12" in namespace kube-system with the configuration for the kubelets in the cluster
[markmaster] Marking the node ubuntu1 as master by adding the label "node-role.kubernetes.io/master=''"
[markmaster] Marking the node ubuntu1 as master by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "ubuntu1" as an annotation
[bootstraptoken] using token: abcdef.0123456789abcdef
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join 192.168.0.6:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:5e44393289eb7e463479f93327b2593a45b32fa8afb8978d878e0f2c9bf8e29b
Note: if kubeadm init does not succeed, fix the configuration and run sudo kubeadm reset before running kubeadm init again.
Tips: write down the kubeadm join command printed in the last lines of the kubeadm init output; it is needed later when joining the worker nodes. In my example it is:
kubeadm join 192.168.0.6:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:5e44393289eb7e463479f93327b2593a45b32fa8afb8978d878e0f2c9bf8e29b
If you want to use kubectl as a non-root user, run the following (also part of the kubeadm init output):
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
3.4 Install a network plugin
3.4.1 Check the current status
Before installing the plugin, check the current state of the pods:
kubectl get pods --all-namespaces
Output:
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-5c545769d8-6cl9s 0/1 Pending 0 110s
kube-system coredns-5c545769d8-h8fjj 0/1 Pending 0 111s
kube-system etcd-ubuntu1 1/1 Running 0 75s
kube-system kube-apiserver-ubuntu1 1/1 Running 0 87s
kube-system kube-controller-manager-ubuntu1 1/1 Running 0 96s
kube-system kube-proxy-snhqr 1/1 Running 0 111s
kube-system kube-scheduler-ubuntu1 1/1 Running 0 98s
As shown above, the two CoreDNS pods are Pending because no network plugin has been installed yet.
My virtual machines sit on the 192.168.0.x subnet, which overlaps with Calico's default pool, so instead of Calico I use the Canal network plugin, a combination of Calico and Flannel. The --pod-network-cidr=10.244.0.0/16 set before kubeadm init is exactly what Canal requires.
3.4.2 Modify /etc/resolv.conf on every node
Two of the pods above are CoreDNS. After starting, CoreDNS reads the host's /etc/resolv.conf to find its upstream DNS servers; if the nameserver listed there is a local loopback address, a forwarding loop occurs and the two pods will stay in CrashLoopBackOff even after the network plugin is installed. The official fix is described in "Troubleshooting Loops In Kubernetes Clusters"; of the three options given there we use the third: change the DNS server in each host's /etc/resolv.conf.
vim /etc/resolv.conf
Before:
nameserver 127.0.0.1
After:
nameserver 8.8.8.8
#nameserver 127.0.0.1
Apply the change:
sudo ldconfig
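If the CoreDNS pods were already created before you edited /etc/resolv.conf, they will not pick up the change on their own; a minimal way to recreate them, assuming the default kubeadm label k8s-app=kube-dns, is:
kubectl -n kube-system delete pod -l k8s-app=kube-dns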
3.4.3 Install the Canal network plugin on the master
# source: .3/getting-started/kubernetes/installation/hosted/canal/rbac.yaml
kubectl apply -f .3/rbac.yaml
# source: .3/getting-started/kubernetes/installation/hosted/canal/canal.yaml
kubectl apply -f .3/canal.yaml
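Assuming the two manifests above are the hosted Canal manifests from the Calico v3.3 documentation, the full commands would be:
kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/canal/rbac.yaml
kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/canal/canal.yaml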
For more about Canal, see "Installing Calico for policy and flannel for networking".
Wait patiently, then check the state of the network plugin again with kubectl get pods --all-namespaces:
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-5c545769d8-6cl9s 1/1 Running 0 7h
kube-system coredns-5c545769d8-h8fjj 1/1 Running 0 7h
kube-system etcd-ubuntu1 1/1 Running 0 7h
kube-system kube-apiserver-ubuntu1 1/1 Running 0 7h
kube-system kube-controller-manager-ubuntu1 1/1 Running 0 7h
kube-system kube-proxy-snhqr 1/1 Running 0 7h
kube-system kube-scheduler-ubuntu1 1/1 Running 0 7h
Once every STATUS shows Running, the network plugin is installed successfully. If the two coredns pods still will not run after the steps above, reload the configuration with systemctl daemon-reload, restart the kubelet with service kubelet restart, and re-run the install commands. If that still does not help, try rebooting the master and then running the install commands again.
3.5 Master isolation
By default, for security reasons, the cluster does not schedule pods onto the master node. In a development environment you may have only a single master, in which case you can lift this restriction with:
kubectl taint nodes --all node-role.kubernetes.io/master-
Output:
node/ubuntu1 untainted
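If you later want to restore the default behaviour, the taint can be re-applied (shown for the example node name ubuntu1; substitute your own master's node name):
kubectl taint nodes ubuntu1 node-role.kubernetes.io/master=:NoSchedule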
3.6 Join worker nodes to the master
On each worker node, run the kubeadm join command that kubeadm init printed on the master earlier, of the form kubeadm join <master-ip>:<master-port> --token <token> --discovery-token-ca-cert-hash sha256:<hash> (use the join command from your own successful init):
kubeadm join 192.168.0.6:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:5e44393289eb7e463479f93327b2593a45b32fa8afb8978d878e0f2c9bf8e29b
If you have lost the master's --token, list the existing tokens with:
sudo kubeadm token list
Output:
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
pe9eow.4wywpjhj9txkvef9 23h 2019-01-08T11:28:44+08:00 authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
Tokens are valid for 24 hours by default; if the token has expired, create a new one:
kubeadm token create
Output:
pe9eow.4wywpjhj9txkvef9
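Alternatively, kubeadm can print a complete, ready-to-paste join command in one step; run this on the master:
kubeadm token create --print-join-command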
If you also do not have the --discovery-token-ca-cert-hash value, it can be computed from the cluster CA certificate:
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
Output:
5e44393289eb7e463479f93327b2593a45b32fa8afb8978d878e0f2c9bf8e29b
Running the kubeadm join command on a worker produces output like:
[preflight] running pre-flight checks
[WARNING RequiredIPVSKernelModulesAvailable]: the IPVS proxier will not be used, because the following required kernel modules are not loaded: [ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh] or no builtin kernel ipvs support: map[ip_vs_wrr:{} ip_vs_sh:{} nf_conntrack_ipv4:{} ip_vs:{} ip_vs_rr:{}]
you can solve this problem with following methods:
1. Run 'modprobe -- ' to load missing kernel modules;
2. Provide the missing builtin kernel ipvs support
[discovery] Trying to connect to API Server "192.168.0.6:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://192.168.0.6:6443"
[discovery] Requesting info from "https://192.168.0.6:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "192.168.0.8:6443"
[discovery] Successfully established connection with API Server "192.168.0.6:6443"
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.12" ConfigMap in the kube-system namespace
[kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[preflight] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "ubuntu2" as an annotation

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.
Now you can check the node status from the master with kubectl get nodes:
sudo kubectl get nodes
Output:
NAME STATUS ROLES AGE VERSION
seeta-0002 Ready <none> 1d v1.12.1
seeta-02 Ready <none> 1d v1.12.1
seeta-03 Ready master 1d v1.12.1
To use kubectl on the worker nodes as well, copy /etc/kubernetes/admin.conf from the master to every worker node, as follows (192.168.0.7 is one of my worker nodes; replace it with your own worker IP):
# on the master:
sudo scp /etc/kubernetes/admin.conf 192.168.0.7:/etc/kubernetes/admin.conf
# on the worker node (192.168.0.7):
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Now kubectl works on every node.
3.7 Testing
3.7.1 Verify kube-apiserver, kube-controller-manager, kube-scheduler and the pod network
Deploy an Nginx Deployment with 3 pods (one per machine in this example):
sudo kubectl create deployment nginx --image=nginx:alpine
sudo kubectl scale deployment nginx --replicas=3
Check that the Nginx pods run correctly and are assigned cluster IPs starting with 10.244.:
sudo kubectl get pods -l app=nginx -o wide
Output:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
nginx-65d5c4f7cc-652hd 1/1 Running 1 11d 10.244.0.26 seeta-03 <none>
nginx-65d5c4f7cc-csjdl 1/1 Running 0 11d 10.244.2.2 seeta-0002 <none>
nginx-65d5c4f7cc-s4n78 1/1 Running 0 11d 10.244.1.2 seeta-02 <none>
3.7.2 Verify that kube-proxy works
Expose the deployment as a NodePort service (see the official documentation):
sudo kubectl expose deployment nginx --port=80 --type=NodePort
# check the port that is reachable from outside the cluster:
sudo kubectl get services nginx
Output:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
nginx NodePort 10.97.226.246 <none> 80:31504/TCP 11d
The service is now reachable from outside the cluster at any NodeIP:Port; the three node IPs in this example are 192.168.0.6, 192.168.0.5 and 192.168.0.7:
curl http://192.168.0.5:31504
curl http://192.168.0.6:31504
curl http://192.168.0.7:31504
3.7.3 Verify that DNS and the pod network work
# run a curl-capable busybox pod and enter interactive mode; in k8s 1.12 kubectl run creates a Deployment named "curl"
sudo kubectl run curl --image=radial/busyboxplus:curl -i --tty
# inside the pod, run `nslookup nginx` to check that the service name resolves to its cluster IP, verifying DNS
nslookup nginx
Output like the following means DNS is working:
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      nginx
Address 1: 10.97.226.246 nginx.default.svc.cluster.local
Still inside the pod, access the service by name to confirm that kube-proxy works:
curl http://nginx/
Output:
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
body {
    width: 35em;
    margin: 0 auto;
    font-family: Tahoma, Verdana, Arial, sans-serif;
}
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="/">nginx</a>.<br/>
Commercial support is available at
<a href="/">nginx</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
Also curl each of the three pod cluster IPs to verify that cross-node pod networking works:
curl http://10.244.0.26/
curl http://10.244.2.2/
curl http://10.244.1.2/
When the tests are done, press ctrl+d to leave the interactive session, then delete the curl deployment:
sudo kubectl delete deploy curl
Output:
deployment.extensions "curl" deleted
3.8 Uninstall and clean up k8s
Run the following commands (I have not tested this myself):
kubeadm reset -f
modprobe -r ipip
lsmod
rm -rf ~/.kube/
rm -rf /etc/kubernetes/
rm -rf /etc/systemd/system/kubelet.service.d
rm -rf /etc/systemd/system/kubelet.service
rm -rf /usr/bin/kube*
rm -rf /etc/cni
rm -rf /opt/cni
rm -rf /var/lib/etcd
rm -rf /var/etcd
4. Install Helm
4.1 Install the Helm client on the master
Download a Helm release newer than 2.5 from the official site (prefer a recent version). The examples below use v2.12.1: download helm-v2.12.1-linux-amd64.tar.gz (this may require a proxy from mainland China; the original post offers a Baidu Cloud link for v2.12.1). On the master, run:
tar -zxvf helm-v2.12.1-linux-amd64.tar.gz
# the archive extracts into a directory named linux-amd64
sudo mv linux-amd64/helm /usr/local/bin/helm
# check that the Helm client is installed; only the client version is shown for now, since the server side is not installed yet
sudo helm version
Output:
Client: &version.Version{SemVer:"v2.12.1", GitCommit:"02a47c7249b1fc6d8fd3b94e6b4babf9d818144e", GitTreeState:"clean"}
Helm has many subcommands and flags; to work faster on the command line it is usually worth installing Helm's bash completion script:
helm completion bash > .helmrc
echo "source .helmrc" >> .bashrc
source .bashrc
4.2 Install the Helm Tiller server on the master
Pull the Tiller image from a domestic mirror:
sudo docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.12.1
sudo docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.12.1 gcr.io/kubernetes-helm/tiller:v2.12.1
RBAC setup and Tiller deployment:
sudo kubectl create serviceaccount --namespace kube-system tiller
sudo kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
helm init -i registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.12.1
# the tiller-deploy deployment exists only after helm init, so patch its service account afterwards
sudo kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'
Run sudo helm version again; it may take a moment, and output like the following means Helm is fully deployed:
Client: &version.Version{SemVer:"v2.12.1", GitCommit:"02a47c7249b1fc6d8fd3b94e6b4babf9d818144e", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.12.1", GitCommit:"02a47c7249b1fc6d8fd3b94e6b4babf9d818144e", GitTreeState:"clean"}
5. Polyaxon
See Polyaxon's official installation documentation.
5.1 Deploy the Polyaxon server
On the master, create the polyaxon namespace:
sudo kubectl create namespace polyaxon
Output:
namespace "polyaxon" created
Polyaxon can be configured through a config.yml or polyaxon_config.yml file, filled in from the documentation on GitHub / the official site and your own requirements; a YAML generator is also provided. Here I deploy with the default configuration (see the sketch after these commands for how a config file would be passed in):
helm repo add polyaxon
helm repo update
sudo helm install polyaxon/polyaxon --name=polyaxon --namespace=polyaxon
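If you did write a polyaxon_config.yml, pass it to helm install with -f. A sketch, assuming Polyaxon's public chart repository at https://charts.polyaxon.com and a config file named polyaxon_config.yml:
helm repo add polyaxon https://charts.polyaxon.com
helm repo update
sudo helm install polyaxon/polyaxon --name=polyaxon --namespace=polyaxon -f polyaxon_config.yml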
After roughly 3-5 minutes it prints something like:
NOTES:
Polyaxon is currently running:

1. Get the application URL by running these commands:
   NOTE: It may take a few minutes for the LoadBalancer IP to be available.
   You can watch the status by running:
   'kubectl get --namespace polyaxon svc -w polyaxon-polyaxon-api'

   export POLYAXON_IP=$(kubectl get svc --namespace polyaxon polyaxon-polyaxon-api -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
   export POLYAXON_HTTP_PORT=80
   export POLYAXON_WS_PORT=1337

   echo http://$POLYAXON_IP:$POLYAXON_HTTP_PORT

2. Setup your cli by running theses commands:
   polyaxon config set --host=$POLYAXON_IP --http_port=$POLYAXON_HTTP_PORT --ws_port=$POLYAXON_WS_PORT

3. Log in with superuser
   USER: root
   PASSWORD: Get login password with
   kubectl get secret --namespace polyaxon polyaxon-polyaxon-secret -o jsonpath="{.data.POLYAXON_ADMIN_PASSWORD}" | base64 --decode
The NOTES contain three parts: how to get the Polyaxon URL, how to configure the CLI, and how to log in as the superuser. For the first part the command needs a small change, because on this cluster the LoadBalancer IP stays pending: replace export POLYAXON_IP=$(kubectl get svc --namespace polyaxon polyaxon-polyaxon-api -o jsonpath='{.status.loadBalancer.ingress[0].ip}') with export POLYAXON_IP=$(kubectl get svc --namespace polyaxon polyaxon-polyaxon-api -o jsonpath='{.spec.clusterIP}'). Become root (su -) and run the commands in order:
export POLYAXON_IP=$(kubectl get svc --namespace polyaxon polyaxon-polyaxon-api -o jsonpath='{.spec.clusterIP}')
export POLYAXON_HTTP_PORT=80
export POLYAXON_WS_PORT=1337
echo http://$POLYAXON_IP:$POLYAXON_HTTP_PORT
# the last command prints something like: http://10.107.85.252:80
5.2 Install the Polyaxon client
On the master, install the client:
# with python2, use pip instead of pip3
pip3 install -U polyaxon-cli
polyaxon config set --host=$POLYAXON_IP --http_port=$POLYAXON_HTTP_PORT --ws_port=$POLYAXON_WS_PORT
Become root (su -) and log in with the client; the default username is root and the default password is rootpassword:
polyaxon login --username=root
Please enter your password:
# Login successful
If you are unsure about the password, run the last command from part 3 of the deployment NOTES (note that the decoded password is printed immediately before the next shell prompt, which is easy to miss):
kubectl get secret --namespace polyaxon polyaxon-polyaxon-secret -o jsonpath="{.data.POLYAXON_ADMIN_PASSWORD}" | base64 --decode
Check the port that the polyaxon-api service is mapped to:
kubectl get service -n polyaxon
Output:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
polyaxon-docker-registry NodePort 10.98.185.92 <none> 5000:31813/TCP 6m26s
polyaxon-polyaxon-api LoadBalancer 10.107.85.252 <pending> 80:30945/TCP,1337:31122/TCP 6m26s
polyaxon-postgresql ClusterIP 10.98.103.223 <none> 5432/TCP 6m26s
polyaxon-rabbitmq ClusterIP 10.103.114.255 <none> 4369/TCP,5672/TCP,25672/TCP,15672/TCP 6m26s
polyaxon-redis ClusterIP 10.106.3.31 <none> 6379/TCP 6m26s
Note: the PORT(S) column of polyaxon-polyaxon-api shows 80:30945; 30945 is the NodePort of the service, so you can open http://192.168.0.6:30945/ in a browser to use Polyaxon.
Tips: if you want to set up NFS and create a persistent volume and persistent volume claim for durable storage, you can refer to the article referenced here and the official docs (although I could not get that configuration working); a generic sketch follows below.
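A minimal sketch of such a PV/PVC pair, assuming a hypothetical NFS server at 192.168.0.6 exporting /data/nfs (adjust the server, path, size and names to your environment):
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolume
metadata:
  name: polyaxon-nfs-pv            # hypothetical name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 192.168.0.6            # hypothetical NFS server
    path: /data/nfs                # hypothetical export path
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: polyaxon-nfs-pvc           # hypothetical name
  namespace: polyaxon
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
EOF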
5.3 Uninstall Polyaxon
Delete the polyaxon release:
helm delete polyaxon --purge
If the Polyaxon installation did not complete successfully, delete it with:
helm delete polyaxon --purge --no-hooks
Delete the namespace:
kubectl delete namespace polyaxon