# mybookstore-microservice-user

**Repository Path**: wang5620079/mybookstore-microservice-user

## Basic Information

- **Project Name**: mybookstore-microservice-user
- **Description**: User microservice, used for authentication and user permission control
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 2
- **Forks**: 0
- **Created**: 2021-08-17
- **Last Updated**: 2025-10-16

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Kubernetes Bookstore Project

## Project Overview

This project is a simple front/back-end-separated online bookstore demo built on a currently popular technology stack. It combines the Ant Design Pro front-end framework, JWT authentication, python-flask, redis, Elasticsearch full-text search and ELK log collection; Jenkins is used as the CI/CD tool, ansible as the automation tool, and Kubernetes as the deployment and runtime environment. The final goal is to deploy the project into an istio service mesh and implement cloud-native capabilities such as canary releases and service governance. At that point the microservices will also move to multi-language development, fully demonstrating language independence for business development in a cloud-native environment.

The purpose of this project is to record the whole journey, from creating the virtual machines to completing the entire environment.

## Preface

Kubernetes has become the de facto standard for service orchestration in the cloud, and istio, which uses Kubernetes as its runtime foundation, has firmly taken center stage in the service mesh space and has great potential. Thanks to the development of these technologies, the cloud-native concept is becoming more and more widely accepted. This project attempts to build a bookstore demo that showcases cloud-native technologies and capabilities.

The application part of the project is deliberately simple and does not pay much attention to high-concurrency scenarios; for now it only serves as a vehicle for verifying features. As the project evolves, features such as message queues and redis/database double-write consistency may be added to enrich the project and improve its ability to handle high concurrency, and a complete replacement of the current Python implementation is not ruled out.

## Component Repositories

| Repository | URL | Description |
| :---: | :---: | :---: |
| mybookstore-front | https://gitee.com/wang5620079/mybookstore-front.git | Bookstore front-end UI |
| mybookstore-app | https://gitee.com/wang5620079/mybookstore-app.git | Bookstore server side; aggregates the back-end services, similar to a ZUUL gateway in Java |
| mybookstore-microservice-book | https://gitee.com/wang5620079/mybookstore-microservice-book.git | Book microservice |
| mybookstore-microservice-order | https://gitee.com/wang5620079/mybookstore-microservice-order.git | Order management microservice |
| mybookstore-microservice-other | https://gitee.com/wang5620079/mybookstore-microservice-other.git | Other microservices |
| mybookstore-microservice-user | https://gitee.com/wang5620079/mybookstore-microservice-user.git | User center microservice |
| mybookstore-microservice-search | | Full-text search service, to be added |
| mybookstore-microservice-comment | | Comment microservice, to be added |

## Version Plan

Deployment follows a "centralized first, then distributed" principle: build a centralized base environment first, then split it up and deploy the microservices on top of it. The version plan is as follows:

### 1. Basic microservice version (this release)

**Implements basic JWT authentication and dynamic menu management on top of Ant Design Pro, python-flask, redis and mysql; on the business side it implements basic bookstore page browsing, favorites management, shopping cart management, payment management and order display.** Inventory management, user management, comments and similar features are not implemented yet.

**Deploys a CI/CD pipeline with Jenkins for automated microservice releases: services are deployed as Kubernetes Deployments, exposed inside the cluster through Services and externally through an ingress with a self-signed HTTPS certificate, completing a basic microservice deployment.**

Builds a VM-based Kubernetes cluster from binaries and integrates Ceph storage, Jenkins, the Harbor image registry and Prometheus monitoring.

Sets up an ansible automation environment and uses ansible to build the base environment.

### 2. istio version

Adds Elasticsearch full-text search and additional microservices written in Java and Go.

Installs istio support.

### Other versions: to be determined

## System Architecture and Build Notes

### 1. Application architecture

![应用架构图 (2)](https://gitee.com/wang5620079/mypics/raw/master//202109012239300.png)

This release is the **basic version**. Its front end uses AntDesign Vue as the UI framework and its server side is written in Python. The basic version follows a common front-office / middle-office architecture. The database is mysql, deployed on the host machine and exposed to containers following the usual Kubernetes practice of an Endpoints + Service pair, so containers consume it as a regular Service.

From the **istio version** onward, the microservices already in production will be rewritten in Java, Go and other languages, making the system genuinely multi-language; the new-language implementations will be rolled out as new versions of the same services, enabling canary releases and smooth upgrades of live microservices. After the istio version the application architecture will change considerably: thanks to the flexibility of mesh deployment, the system will no longer be restricted to the current front-office / middle-office architecture.

### 2. CI/CD

![image-20210901224017176](https://gitee.com/wang5620079/mypics/raw/master//202109012240298.png)

CI/CD uses the widely adopted Jenkins. The whole Jenkins system is deployed inside Kubernetes and uses pods as build agents, so builds are containerized, take full advantage of cloud-native elasticity, and make better use of resources.

The project does not integrate a testing stage; unit tests and integration tests are skipped. Code is pulled directly from the gitee repository to build images; once built, images are pushed to the Harbor registry, and the pipeline pulls them for deployment.

## Screenshots

### 1. Home page
![image-20210822211831970](https://gitee.com/wang5620079/mypics/raw/master//202108222118222.png)

### 2. Book browsing
![image-20210822211900188](https://gitee.com/wang5620079/mypics/raw/master//202108222119457.png)

### 3. Favorites management
![image-20210822211946905](https://gitee.com/wang5620079/mypics/raw/master//202108222119169.png)

### 4. Shopping cart management
![image-20210822212016049](https://gitee.com/wang5620079/mypics/raw/master//202108222120298.png)

### 5. Payment management
![image-20210815195418641](https://gitee.com/wang5620079/mypics/raw/master//202108160024588.png)

### 6. Simulated payment
![image-20210822212100872](https://gitee.com/wang5620079/mypics/raw/master//202108222121108.png)

### 7. Order management
![image-20210815195702648](https://gitee.com/wang5620079/mypics/raw/master//202108160025720.png)

### 8. User center
![image-20210815195740125](https://gitee.com/wang5620079/mypics/raw/master//202108160025244.png)

## Technology Selection

### Application stack

| Technology | Version | Notes |
| :---: | :---: | :---: |
| Ant Design Pro | 3.02 | Front-end UI framework for rapid front-end development |
| npm | 6.14.12 | Front-end build environment |
| python | 3.7.11 | One of the server-side languages |
| python-flask | 1.1.4 | Server-side web framework |
| gunicorn | 20.1.0 | Server-side WSGI HTTP server, used to run the containerized services |
| redis | 6.2.5 | Used for pending-payment orders; more use cases will be added later |
| mysql | 8.0.26 | Database |

### Base environment stack

| Technology | Version | Notes |
| :---: | :---: | :---: |
| centos | 7 | Kernel 5.4.131 (kernel upgrade required) |
| docker | 19.03.9 | Container runtime |
| kubernetes | 1.21 | Container orchestration |
| Jenkins | 2.289.1 | CI/CD tool |
| rook-ceph | v1.6 | Open-source cloud storage |
| helm | v3.6.3 | Kubernetes package manager |
| Harbor | v2.3.0 | Image registry |
| kube-prometheus-stack | 16.0.0 | Monitoring |
| ansible | 2.9.23 | Automation tool |
| istio | 1.11 | Latest official release |
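The gunicorn entry in the application stack above implies that each Flask microservice is served by gunicorn inside its container rather than by the Flask development server. As a minimal sketch (module name, worker count and port are illustrative assumptions, not values taken from this repository), a container entrypoint would look roughly like this:

```sh
# Hedged sketch: serve the Flask application object "app" from module "app.py" with gunicorn.
# The module path, worker count and port are assumptions for illustration only.
gunicorn --workers 4 --bind 0.0.0.0:5000 app:app
```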
## Runtime Environment

#### Host machine

The experiments run on a single physical host with 48 cores and 128 GB of RAM; virtual machines are deployed on this host to form the cluster. A screenshot of the host configuration:

![image-20210822112154279](https://gitee.com/wang5620079/mypics/raw/master//202108221121328.png)

#### Virtual machine layout

| Hostname | IP address | CPU | Memory (GB) | Disk | Notes |
| :---: | :---: | :---: | :---: | :---: | :---: |
| k8s-master1 | 192.168.100.60 | 4 | 8 | 50G | Automation/ops host (ansible installed);<br>k8s master node;<br>50G system disk |
| k8s-master2 | 192.168.100.61 | 4 | 8 | 50G | k8s master node;<br>50G system disk |
| k8s-master3 | 192.168.100.62 | 4 | 8 | 50G | k8s master node;<br>50G system disk |
| k8s-node1 | 192.168.100.63 | 8 | 16 | 50G/100G | k8s worker node;<br>50G system disk, 100G data disk |
| k8s-node2 | 192.168.100.64 | 8 | 16 | 50G/100G | k8s worker node;<br>50G system disk, 100G data disk |
| k8s-node3 | 192.168.100.65 | 8 | 16 | 50G/100G | k8s worker node;<br>50G system disk, 100G data disk |
| nginx100 | 192.168.100.100 | 2 | 4 | 50G | HA load balancer;<br>in production this should be a keepalived cluster instead;<br>this IP is also the HA LB address for the apiserver |

## Build Log

### I. Building the base environment for a binary Kubernetes deployment

#### 1.1 Cluster architecture

The diagram below is borrowed from the official documentation; see https://kubernetes.io/zh/docs/tasks/administer-cluster/highly-available-master/ for a detailed explanation:

> Note: for convenience, the cluster described here load-balances the API servers with a single Nginx instance doing TCP load balancing. In production you should use a hardware load balancer, or a clustered setup such as Keepalived, to avoid a single point of failure.

![ha-master-gce](https://gitee.com/wang5620079/mypics/raw/master//202108172252314.png)

#### 1.2 Base environment setup

##### 1.2.1 Template machine setup

**The virtual machines are VMware VMs, so a single host is prepared as the source machine: upgrade its kernel and install docker, then clone it as a template for the other machines.**

**1) Kernel upgrade**

Note:

> The 3.10 kernel is unstable in large clusters; the kernel should be upgraded to 4.19+.

``` shell
# Check the kernel version
uname -sr

# 0. Update packages, excluding the kernel
yum update -y --exclude=kernel*

# 1. Import the public key and install the elrepo repository
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm

# Install the fastest-mirror plugin
yum install -y yum-plugin-fastestmirror

# 3. With the repository enabled, list the available kernel packages:
yum --disablerepo="*" --enablerepo="elrepo-kernel" list available
# kernel-lt: long term support
# kernel-ml: mainline stable

# 4. Install the chosen version (5.4.119-1.el7.elrepo at the time of writing)
yum --enablerepo=elrepo-kernel install -y kernel-lt

# 5. Check the kernel
uname -sr
# List the kernel menu entries
awk -F\' '$1=="menuentry " {print $2}' /etc/grub2.cfg
CentOS Linux 7 Rescue 0a87210b6f6337e79a6611c512e524ce (5.4.119-1.el7.elrepo.x86_64)  # entry 0
CentOS Linux (5.4.119-1.el7.elrepo.x86_64) 7 (Core)                                   ## ours is entry 1
CentOS Linux (3.10.0-1160.el7.x86_64) 7 (Core)
CentOS Linux (0-rescue-cc2c86fe566741e6a2ff6d399c5d5daa) 7 (Core)

# 6. Regenerate the grub configuration
grub2-mkconfig -o /boot/grub2/grub.cfg
# Confirm the position of the new kernel, then change the default kernel

# 7. Set the default kernel
vi /etc/default/grub
# Set GRUB_DEFAULT to 0, i.e. the first kernel in the GRUB menu becomes the default
# Regenerate the grub configuration again
grub2-mkconfig -o /boot/grub2/grub.cfg

# 8. Reboot
reboot

# 9. Verify
uname -r
####### sample output #########
[root@k8s-master1 ~]# uname -r
5.4.131-1.el7.elrepo.x86_64
[root@k8s-master1 ~]#
```

**2) Host configuration**

```sh
# Disable selinux
setenforce 0
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/sysconfig/selinux
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config

# Disable swap
swapoff -a && sysctl -w vm.swappiness=0
sed -ri 's/.*swap.*/#&/' /etc/fstab

# Raise resource limits
ulimit -SHn 65535
vi /etc/security/limits.conf
# Append the following at the end of the file
* soft nofile 655360
* hard nofile 131072
* soft nproc 655350
* hard nproc 655350
* soft memlock unlimited
* hard memlock unlimited

# Install tools used later
yum install wget git jq psmisc net-tools yum-utils device-mapper-persistent-data lvm2 -y

# Install ipvs tooling (ipvsadm, ipset, conntrack, ...)
yum install ipvsadm ipset sysstat conntrack libseccomp -y

# Load the ipvs modules. On kernels 4.19+ use nf_conntrack; on 4.18 and below use nf_conntrack_ipv4
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack

# Persist the module list: add the following content
vi /etc/modules-load.d/ipvs.conf
ip_vs
ip_vs_lc
ip_vs_wlc
ip_vs_rr
ip_vs_wrr
ip_vs_lblc
ip_vs_lblcr
ip_vs_dh
ip_vs_sh
ip_vs_fo
ip_vs_nq
ip_vs_sed
ip_vs_ftp
ip_vs_sh
nf_conntrack
ip_tables
ip_set
xt_set
ipt_set
ipt_rpfilter
ipt_REJECT
ipip

# Enable the module-load service
systemctl enable --now systemd-modules-load.service   # --now = enable + start

# Check that the modules are loaded
lsmod | grep -e ip_vs -e nf_conntrack

# Enable IPv4 forwarding and set kernel parameters
cat <<EOF > /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
fs.may_detach_mounts = 1
vm.overcommit_memory=1
net.ipv4.conf.all.route_localnet = 1
vm.panic_on_oom=0
fs.inotify.max_user_watches=89100
fs.file-max=52706963
fs.nr_open=52706963
net.netfilter.nf_conntrack_max=2310720
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl =15
net.ipv4.tcp_max_tw_buckets = 36000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_orphans = 327680
net.ipv4.tcp_orphan_retries = 3
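# The conntrack/TCP keys above and below raise connection-tracking, keepalive and backlog
# limits to values the author uses for Kubernetes nodes; adjust them to your own workload.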
net.ipv4.tcp_syncookies = 1 net.ipv4.tcp_max_syn_backlog = 16768 net.ipv4.ip_conntrack_max = 65536 net.ipv4.tcp_timestamps = 0 net.core.somaxconn = 16768 EOF sysctl --system # 配置完内核后,重启服务器,保证重启后内核依旧加载 reboot lsmod | grep -e ip_vs -e nf_conntrack #关闭防火墙 systemctl stop firewalld.service systemctl disable firewalld.service ``` **3) 安装docker** ```sh # 安装docker yum remove docker* yum install -y yum-utils yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo yum install -y docker-ce-19.03.9 docker-ce-cli-19.03.9 containerd.io-1.4.4 #修改docker配置,新版kubelet建议使用systemd,所以可以把docker的CgroupDriver改成systemd #修改镜像仓库地址用于加速,需要在阿里云镜像服务注册,具体可以查询阿里云镜像服务说明 mkdir /etc/docker cat > /etc/docker/daemon.json < etcd访问单独设置一个证书,该证书用于签发etcd组件健康检查、etcd集群间通信、server服务器间通信以及apiserver访问etcd。 ![image-20210817232718883](https://gitee.com/wang5620079/mypics/raw/master//202108172327934.png) ###### **3)根证书生成过程** - **证书工具准备** > 证书的生成使用cfssl工具 ```sh #使用ansible在三台master主机上创建pki目录 ansible k8s-master* -a "mkdir -p /etc/kubernetes/pki" ``` - **下载证书工具** ```sh #master1上执行 # 下载cfssl核心组件 wget https://github.com/cloudflare/cfssl/releases/download/v1.5.0/cfssl-certinfo_1.5.0_linux_amd64 wget https://github.com/cloudflare/cfssl/releases/download/v1.5.0/cfssl_1.5.0_linux_amd64 wget https://github.com/cloudflare/cfssl/releases/download/v1.5.0/cfssljson_1.5.0_linux_amd64 #授予执行权限 chmod +x cfssl* #批量重命名 for name in `ls cfssl*`; do mv $name ${name%_1.5.0_linux_amd64}; done #移动到文件 mv cfssl* /usr/bin ``` - **开始生成根证书** ```sh #master1上执行 cd /etc/kubernetes/pki vi ca-config.json { "signing": { "default": { "expiry": "87600h" }, "profiles": { "server": { "expiry": "87600h", "usages": [ "signing", "key encipherment", "server auth" ] }, "client": { "expiry": "87600h", "usages": [ "signing", "key encipherment", "client auth" ] }, "peer": { "expiry": "87600h", "usages": [ "signing", "key encipherment", "server auth", "client auth" ] }, "kubernetes": { "expiry": "87600h", "usages": [ "signing", "key encipherment", "server auth", "client auth" ] }, "etcd": { "expiry": "87600h", "usages": [ "signing", "key encipherment", "server auth", "client auth" ] } } } } ``` **CA签名,生成签名请求** ```sh #master1上执行 vi /etc/kubernetes/pki/ca-csr.json { "CN": "kubernetes", "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "Beijing", "L": "Beijing", "O": "Kubernetes", "OU": "Kubernetes" } ], "ca": { "expiry": "87600h" } } ``` > **各字段说明如下** > > > > **CN(Common Name)**: > > 公用名(Common Name)必须填写,一般可以是网站域 > > **O(Organization)**: > > Organization(组织名)是必须填写的,如果申请的是OV、EV型证书,组织名称必须严格和企业在政府登记名称一致,一般需要和营业执照上的名称完全一致。不可以使用缩写或者商标。如果需要使用英文名称,需要有DUNS编码或者律师信证明。 > > **OU(Organization Unit)** > > OU单位部门,这里一般没有太多限制,可以直接填写IT DEPT等皆可。 > > **C(City)** > > City是指申请单位所在的城市。 > > **ST(State/Province)** > > ST是指申请单位所在的省份。 > > **C(Country Name)** > > C是指国家名称,这里用的是两位大写的国家代码,中国是CN。 - **生成证书** ```sh #master1执行 cfssl gencert -initca ca-csr.json | cfssljson -bare ca - # ca.csr ca.pem(ca公钥) ca-key.pem(私钥) ``` ##### **1.2.4、etcd高可用集群搭建** ###### **1)下载etcd安装文件** ```sh #master1主机上执行 # 下载etcd安装文件 wget https://github.com/etcd-io/etcd/releases/download/v3.4.16/etcd-v3.4.16-linux-amd64.tar.gz ## 复制到其他节点 ansible k8s-master* -m copy -a "src=etcd-v3.4.16-linux-amd64.tar.gz dest=/root" ## 解压到 /usr/local/bin ansible k8s-master* -m shell -a "tar -zxvf etcd-v3.4.16-linux-amd64.tar.gz --strip-components=1 -C /usr/local/bin etcd-v3.4.16-linux-amd64/etcd{,ctl}" ##验证 ansible k8s-master* -m shell -a "etcdctl" #只要有打印就ok ``` ###### **2)生成etcd证书** > 生成签名请求 ```sh vi 
etcd-ca-csr.json { "CN": "etcd", "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "Beijing", "L": "Beijing", "O": "etcd", "OU": "etcd" } ], "ca": { "expiry": "87600h" } } ``` ```sh #生成etcd的ca根证书 cfssl gencert -initca etcd-ca-csr.json | cfssljson -bare /etc/kubernetes/pki/etcd/ca - ``` ```sh vi etcd-csr.json { "CN": "etcd-myha", "key": { "algo": "rsa", "size": 2048 }, "hosts": [ "127.0.0.1", "k8s-master1", "k8s-master2", "k8s-master3", "192.168.0.10", "192.168.0.11", "192.168.0.12" ], "names": [ { "C": "CN", "L": "beijing", "O": "etcd", "ST": "beijing", "OU": "System" } ] } // 注意:hosts用自己的主机名和ip // 也可以在签发的时候再加上 -hostname=127.0.0.1,k8s-master1,k8s-master2,k8s-master3, // 可以指定受信的主机列表 // "hosts": [ // "k8s-master1", // "www.example.net" // ], # 签发etcd证书 cfssl gencert \ -ca=/etc/kubernetes/pki/etcd/ca.pem \ -ca-key=/etc/kubernetes/pki/etcd/ca-key.pem \ -config=/etc/kubernetes/pki/ca-config.json \ -profile=etcd \ etcd-myha-csr.json | cfssljson -bare /etc/kubernetes/pki/etcd/etcd ``` > 分发证书文件 ```sh ansible k8s-master* -m copy -a "src=/etc/kubernetes/pki/etcd dest=/etc/kubernetes/pki" ``` ###### **3)etcd高可用安装** > 先创建etcd目录 ```sh ansible k8s-master* -m shell -a "mkdir -p /etc/etcd" ``` > 创建etcd的启动配置文件/etc/etcd/etcd.yaml,**每个master主机都要单独执行** ```sh # etcd yaml示例。 name: 'etcd-master1' #每个机器可以写自己的域名,不能重复 data-dir: /var/lib/etcd wal-dir: /var/lib/etcd/wal snapshot-count: 5000 heartbeat-interval: 100 election-timeout: 1000 quota-backend-bytes: 0 listen-peer-urls: 'https://192.168.100.60:2380' # 本机ip+2380端口,代表和集群通信 listen-client-urls: 'https://192.168.100.60:2379,http://127.0.0.1:2379' #改为自己的 max-snapshots: 3 max-wals: 5 cors: initial-advertise-peer-urls: 'https://192.168.100.60:2380' #自己的ip advertise-client-urls: 'https://192.168.100.60:2379' #自己的ip discovery: discovery-fallback: 'proxy' discovery-proxy: discovery-srv: initial-cluster: 'etcd-master1=https://192.168.100.60:2380,etcd-master2=https://192.168.100.61:2380,etcd-master3=https://192.168.100.62:2380' #这里不一样 initial-cluster-token: 'etcd-k8s-cluster' initial-cluster-state: 'new' strict-reconfig-check: false enable-v2: true enable-pprof: true proxy: 'off' proxy-failure-wait: 5000 proxy-refresh-interval: 30000 proxy-dial-timeout: 1000 proxy-write-timeout: 5000 proxy-read-timeout: 0 client-transport-security: cert-file: '/etc/kubernetes/pki/etcd/etcd.pem' key-file: '/etc/kubernetes/pki/etcd/etcd-key.pem' client-cert-auth: true trusted-ca-file: '/etc/kubernetes/pki/etcd/ca.pem' auto-tls: true peer-transport-security: cert-file: '/etc/kubernetes/pki/etcd/etcd.pem' key-file: '/etc/kubernetes/pki/etcd/etcd-key.pem' peer-client-cert-auth: true trusted-ca-file: '/etc/kubernetes/pki/etcd/ca.pem' auto-tls: true debug: false log-package-levels: log-outputs: [default] force-new-cluster: false ``` ```sh #把etcd做成服务,并设置开机启动 ansible k8s-master* -m shell -a "cat << EOF > /usr/lib/systemd/system/etcd.service [Unit] Description=Etcd Service Documentation=https://etcd.io/docs/v3.4/op-guide/clustering/ After=network.target [Service] Type=notify ExecStart=/usr/local/bin/etcd --config-file=/etc/etcd/etcd.yaml Restart=on-failure RestartSec=10 LimitNOFILE=65536 [Install] WantedBy=multi-user.target Alias=etcd3.service EOF" #设置开机启动 ansible k8s-master* -m shell -a "systemctl daemon-reload" ansible k8s-master* -m shell -a "systemctl enable --now etcd" ``` ###### **4)验证安装** ```sh # 查看etcd集群状态 etcdctl --endpoints="192.168.100.60:2379,192.168.100.61:2379,192.168.100.62:2379" --cacert=/etc/kubernetes/pki/etcd/ca.pem --cert=/etc/kubernetes/pki/etcd/etcd.pem 
--key=/etc/kubernetes/pki/etcd/etcd-key.pem endpoint status --write-out=table ``` ##### **1.2.5、k8s组件安装** 本次安装的k8s 版本为1.21.1 ###### **1)解压安装文件** ```sh #master1上执行 wget https://dl.k8s.io/v1.21.1/kubernetes-server-linux-amd64.tar.gz #复制到所有节点 ansible k8s-ha -m copy -a "src=kubernetes-server-linux-amd64.tar.gz dest=/root" #所有master节点上解压 ansible k8s-master* -m shell -a "tar -xvf /root/kubernetes-server-linux-amd64.tar.gz --strip-components=3 -C /usr/local/bin kubernetes/server/bin/kube{let,ctl,-apiserver,-controller-manager,-scheduler,-proxy}" #所有node节点上解压 ansible k8s-node* -m shell -a "tar -xvf /root/kubernetes-server-linux-amd64.tar.gz --strip-components=3 -C /usr/local/bin kubernetes/server/bin/kube{let,ctl,-proxy}" ``` ###### **2)证书生成** **apiserver根证书生成申请** ```sh #master1中执行 cd /etc/kubernetes/pki vi apiserver-csr.json #为了以后扩展主机方便,多写了几个hosts { "CN": "kube-apiserver", "hosts": [ "10.96.0.1", "127.0.0.1", "192.168.100.100", "192.168.100.60", "192.168.100.61", "192.168.100.62", "192.168.100.63", "192.168.100.64", "192.168.100.65", "192.168.100.66", "192.168.100.67", "192.168.100.68", "192.168.100.69", "kubernetes", "kubernetes.default", "kubernetes.default.svc", "kubernetes.default.svc.cluster", "kubernetes.default.svc.cluster.local" ], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "L": "BeiJing", "ST": "BeiJing", "O": "Kubernetes", "OU": "Kubernetes" } ] } ``` **生成apiserver证书** ```sh vi ca-csr.json { "CN": "kubernetes", "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "Beijing", "L": "Beijing", "O": "Kubernetes", "OU": "Kubernetes" } ], "ca": { "expiry": "87600h" } } #命令生成根证书和ca及ca-key cfssl gencert -initca ca-csr.json | cfssljson -bare ca - cfssl gencert -ca=/etc/kubernetes/pki/ca.pem -ca-key=/etc/kubernetes/pki/ca-key.pem -config=/etc/kubernetes/pki/ca-config.json -profile=kubernetes apiserver-csr.json | cfssljson -bare /etc/kubernetes/pki/apiserver ``` **front-proxy证书生成** ```sh vi front-proxy-ca-csr.json { "CN": "kubernetes", "key": { "algo": "rsa", "size": 2048 } } #front-proxy 根ca生成 cfssl gencert -initca front-proxy-ca-csr.json | cfssljson -bare /etc/kubernetes/pki/front-proxy-ca ``` **front-proxy-client证书** ```sh vi front-proxy-client-csr.json #准备申请client客户端 { "CN": "front-proxy-client", "key": { "algo": "rsa", "size": 2048 } } #生成front-proxy-client 证书 cfssl gencert -ca=/etc/kubernetes/pki/front-proxy-ca.pem -ca-key=/etc/kubernetes/pki/front-proxy-ca-key.pem -config=ca-config.json -profile=kubernetes front-proxy-client-csr.json | cfssljson -bare /etc/kubernetes/pki/front-proxy-client ``` **controller-manage证书生成与配置** ```sh #根证书生成申请 vi controller-manager-csr.json { "CN": "system:kube-controller-manager", "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "Beijing", "L": "Beijing", "O": "system:kube-controller-manager", "OU": "Kubernetes" } ] } #生成根证书 cfssl gencert \ -ca=/etc/kubernetes/pki/ca.pem \ -ca-key=/etc/kubernetes/pki/ca-key.pem \ -config=ca-config.json \ -profile=kubernetes \ controller-manager-csr.json | cfssljson -bare /etc/kubernetes/pki/controller-manager #生成配置文件 #注意这里的server地址应该是高可用地址 # set-cluster:设置一个集群项, kubectl config set-cluster kubernetes \ --certificate-authority=/etc/kubernetes/pki/ca.pem \ --embed-certs=true \ --server=https://192.168.100.100:6443 \ --kubeconfig=/etc/kubernetes/controller-manager.conf # 设置一个环境项,一个上下文 kubectl config set-context system:kube-controller-manager@kubernetes \ --cluster=kubernetes \ --user=system:kube-controller-manager \ --kubeconfig=/etc/kubernetes/controller-manager.conf # 
set-credentials 设置一个用户项 kubectl config set-credentials system:kube-controller-manager \ --client-certificate=/etc/kubernetes/pki/controller-manager.pem \ --client-key=/etc/kubernetes/pki/controller-manager-key.pem \ --embed-certs=true \ --kubeconfig=/etc/kubernetes/controller-manager.conf # 使用某个环境当做默认环境,生成context环境配置 kubectl config use-context system:kube-controller-manager@kubernetes \ --kubeconfig=/etc/kubernetes/controller-manager.conf ``` **admin证书生成和配置** ```sh #编写证书申请 vi admin-csr.json { "CN": "admin", "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "Beijing", "L": "Beijing", "O": "system:masters", "OU": "Kubernetes" } ] } #生成证书 cfssl gencert \ -ca=/etc/kubernetes/pki/ca.pem \ -ca-key=/etc/kubernetes/pki/ca-key.pem \ -config=/etc/kubernetes/pki/ca-config.json \ -profile=kubernetes \ admin-csr.json | cfssljson -bare /etc/kubernetes/pki/admin #生成配置文件,注意这里的server地址,还是高可用地址 kubectl config set-cluster kubernetes \ --certificate-authority=/etc/kubernetes/pki/ca.pem \ --embed-certs=true \ --server=https://192.168.100.100:6443 \ --kubeconfig=/etc/kubernetes/admin.conf kubectl config set-credentials kubernetes-admin \ --client-certificate=/etc/kubernetes/pki/admin.pem \ --client-key=/etc/kubernetes/pki/admin-key.pem \ --embed-certs=true \ --kubeconfig=/etc/kubernetes/admin.conf kubectl config set-context kubernetes-admin@kubernetes \ --cluster=kubernetes \ --user=kubernetes-admin \ --kubeconfig=/etc/kubernetes/admin.conf kubectl config use-context kubernetes-admin@kubernetes \ --kubeconfig=/etc/kubernetes/admin.conf ``` > 引申说明:kubelet的证书不用配置,因为集群机器很多,每个配置一遍不可能。k8s采用Bootstrap 机制进行认证,具体可查看官网 > > https://kubernetes.io/zh/docs/reference/access-authn-authz/bootstrap-tokens/ **ServiceAccount Key生成** > k8s底层,每创建一个ServiceAccount,都会分配一个Secret,而Secret里面有秘钥,秘钥就是由我们接下来的sa生成的。所以我们提前创建出sa信息 ```sh openssl genrsa -out /etc/kubernetes/pki/sa.key 2048 openssl rsa -in /etc/kubernetes/pki/sa.key -pubout -out /etc/kubernetes/pki/sa.pub ``` ###### **3)分发所有生成的证书文件** ```sh #这里其实也可以采用scp形式分发 #master1执行 for NODE in k8s-master2 k8s-master3 do for FILE in admin.conf controller-manager.conf scheduler.conf do scp /etc/kubernetes/${FILE} $NODE:/etc/kubernetes/${FILE} done done ``` ##### 1.2.6、k8s高可用配置 > 前述的配置,都配置了一个k8s的高可用地址192.168.100.100.这个地址必须用nginx做负载均衡,或者使用keepalived+nginx形式配置高可用。笔者为了配置方便,就简单搭建了一个地址为192.168.100.100的nginx主机做负载均衡。但是实际生产中应该用集群化的负载均衡,避免单点情况出现。 > > nginx使用TCP层的反向代理,并配置负载均衡 ###### **1)nginx安装** 具体安装编译环节可以见下面连接,这里不详述。 https://www.cnblogs.com/huningfei/p/12973323.html ###### **2)nginx配置文件** > 这里引申下k8s各个组件通信协议说明,可以见连接: > > https://zhuanlan.zhihu.com/p/82314737 > > 我截图如下: ![image-20210822104236021](https://gitee.com/wang5620079/mypics/raw/master//202108221042098.png) ```sh #如上说明,k8s各个组件之间,其实是通过http协议,笔者觉得用http代理也行,但是我当初配置的时候把握不准,就索性直接配置了TCP反向代理实现,未来这个地方可以验证一下 vi /usr/local/nginx/conf/nginx.conf worker_processes auto; events { worker_connections 1024; } error_log /var/log/nginx_error.log info; stream { upstream cloudsocket { #hash $remote_addr consistent; server 192.168.100.60:6443 weight=5 max_fails=1 fail_timeout=10s; server 192.168.100.61:6443 weight=5 max_fails=1 fail_timeout=10s; server 192.168.100.62:6443 weight=5 max_fails=1 fail_timeout=10s; } server { listen 6443; proxy_connect_timeout 1s; proxy_timeout 300s; proxy_pass cloudsocket; } server { listen 8011; server_name localhost; location /nginx_status { stub_status on; access_log off; allow 127.0.0.1; } } } ``` 配置nginx作为服务,并设置开机启动 ```sh [Unit] Description=The Nginx HTTP Server [Service] Type=forking 
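# Note: systemd only recognizes the exact option name "PIDFile" (capital F), and the path must
# match the "pid" directive in nginx.conf. Also, "-s reload" in the ExecStop line below only
# reloads the configuration; use "-s quit" (graceful) or "-s stop" to actually stop nginx.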
PIDfile=/usr/local/nginx/logs/nginx.pid ExecStart=/usr/local/nginx/sbin/nginx ExecReload=/usr/local/nginx/sbin/nginx -s reload ExecStop=/usr/local/nginx/sbin/nginx -s reload PrivateTmp=true [Install] WantedBy=multi-user.target ``` 启动nginx服务 ```sh #启动 systemctl daemon-reload && systemctl enable --now nginx #查看状态 systemctl status nginx ``` ##### **1.2.7 组件启动**&集群配置 ###### **1)启动apiserver** 所有Master节点创建`kube-apiserver.service`, > 注意,以下文档使用的k8s service网段为`10.96.0.0/16`,该网段不能和宿主机的网段、Pod网段的重复 > > 特别注意:docker的网桥默认为 `172.17.0.1/16`。不要使用这个网段(本次搭建的网络插件使用calico) ```sh # 每个master节点都需要执行以下内容 # --advertise-address: 需要改为本master节点的ip # --service-cluster-ip-range=10.96.0.0/16: 需要改为自己规划的service网段 # --etcd-servers: 改为自己etcd-server的所有地址 #这里我都列出来了 #192.168.100.60主机配置 vi /usr/lib/systemd/system/kube-apiserver.service [Unit] Description=Kubernetes API Server Documentation=https://github.com/kubernetes/kubernetes After=network.target [Service] ExecStart=/usr/local/bin/kube-apiserver \ --v=2 \ --logtostderr=true \ --allow-privileged=true \ --bind-address=0.0.0.0 \ --secure-port=6443 \ --insecure-port=0 \ --advertise-address=192.168.100.60 \ --service-cluster-ip-range=10.96.0.0/16 \ --service-node-port-range=30000-32767 \ --etcd-servers=https://192.168.100.60:2379,https://192.168.100.61:2379,https://192.168.100.62:2379 \ --etcd-cafile=/etc/kubernetes/pki/etcd/ca.pem \ --etcd-certfile=/etc/kubernetes/pki/etcd/etcd.pem \ --etcd-keyfile=/etc/kubernetes/pki/etcd/etcd-key.pem \ --client-ca-file=/etc/kubernetes/pki/ca.pem \ --tls-cert-file=/etc/kubernetes/pki/apiserver.pem \ --tls-private-key-file=/etc/kubernetes/pki/apiserver-key.pem \ --kubelet-client-certificate=/etc/kubernetes/pki/apiserver.pem \ --kubelet-client-key=/etc/kubernetes/pki/apiserver-key.pem \ --service-account-key-file=/etc/kubernetes/pki/sa.pub \ --service-account-signing-key-file=/etc/kubernetes/pki/sa.key \ --service-account-issuer=https://kubernetes.default.svc.cluster.local \ --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname \ --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,ResourceQuota \ --authorization-mode=Node,RBAC \ --enable-bootstrap-token-auth=true \ --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.pem \ --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.pem \ --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client-key.pem \ --requestheader-allowed-names=aggregator,front-proxy-client \ --requestheader-group-headers=X-Remote-Group \ --requestheader-extra-headers-prefix=X-Remote-Extra- \ --requestheader-username-headers=X-Remote-User # --token-auth-file=/etc/kubernetes/token.csv Restart=on-failure RestartSec=10s LimitNOFILE=65535 [Install] WantedBy=multi-user.target #61主机配置 [Unit] Description=Kubernetes API Server Documentation=https://github.com/kubernetes/kubernetes After=network.target [Service] ExecStart=/usr/local/bin/kube-apiserver \ --v=2 \ --logtostderr=true \ --allow-privileged=true \ --bind-address=0.0.0.0 \ --secure-port=6443 \ --insecure-port=0 \ --advertise-address=192.168.100.61 \ --service-cluster-ip-range=10.96.0.0/16 \ --service-node-port-range=30000-32767 \ --etcd-servers=https://192.168.100.60:2379,https://192.168.100.61:2379,https://192.168.100.62:2379 \ --etcd-cafile=/etc/kubernetes/pki/etcd/ca.pem \ --etcd-certfile=/etc/kubernetes/pki/etcd/etcd.pem \ --etcd-keyfile=/etc/kubernetes/pki/etcd/etcd-key.pem \ --client-ca-file=/etc/kubernetes/pki/ca.pem \ 
--tls-cert-file=/etc/kubernetes/pki/apiserver.pem \ --tls-private-key-file=/etc/kubernetes/pki/apiserver-key.pem \ --kubelet-client-certificate=/etc/kubernetes/pki/apiserver.pem \ --kubelet-client-key=/etc/kubernetes/pki/apiserver-key.pem \ --service-account-key-file=/etc/kubernetes/pki/sa.pub \ --service-account-signing-key-file=/etc/kubernetes/pki/sa.key \ --service-account-issuer=https://kubernetes.default.svc.cluster.local \ --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname \ --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,ResourceQuota \ --authorization-mode=Node,RBAC \ --enable-bootstrap-token-auth=true \ --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.pem \ --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.pem \ --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client-key.pem \ --requestheader-allowed-names=aggregator,front-proxy-client \ --requestheader-group-headers=X-Remote-Group \ --requestheader-extra-headers-prefix=X-Remote-Extra- \ --requestheader-username-headers=X-Remote-User # --token-auth-file=/etc/kubernetes/token.csv Restart=on-failure RestartSec=10s LimitNOFILE=65535 [Install] WantedBy=multi-user.target #62主机配置 [Unit] Description=Kubernetes API Server Documentation=https://github.com/kubernetes/kubernetes After=network.target [Service] ExecStart=/usr/local/bin/kube-apiserver \ --v=2 \ --logtostderr=true \ --allow-privileged=true \ --bind-address=0.0.0.0 \ --secure-port=6443 \ --insecure-port=0 \ --advertise-address=192.168.100.62 \ --service-cluster-ip-range=10.96.0.0/16 \ --service-node-port-range=30000-32767 \ --etcd-servers=https://192.168.100.60:2379,https://192.168.100.61:2379,https://192.168.100.62:2379 \ --etcd-cafile=/etc/kubernetes/pki/etcd/ca.pem \ --etcd-certfile=/etc/kubernetes/pki/etcd/etcd.pem \ --etcd-keyfile=/etc/kubernetes/pki/etcd/etcd-key.pem \ --client-ca-file=/etc/kubernetes/pki/ca.pem \ --tls-cert-file=/etc/kubernetes/pki/apiserver.pem \ --tls-private-key-file=/etc/kubernetes/pki/apiserver-key.pem \ --kubelet-client-certificate=/etc/kubernetes/pki/apiserver.pem \ --kubelet-client-key=/etc/kubernetes/pki/apiserver-key.pem \ --service-account-key-file=/etc/kubernetes/pki/sa.pub \ --service-account-signing-key-file=/etc/kubernetes/pki/sa.key \ --service-account-issuer=https://kubernetes.default.svc.cluster.local \ --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname \ --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,ResourceQuota \ --authorization-mode=Node,RBAC \ --enable-bootstrap-token-auth=true \ --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.pem \ --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.pem \ --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client-key.pem \ --requestheader-allowed-names=aggregator,front-proxy-client \ --requestheader-group-headers=X-Remote-Group \ --requestheader-extra-headers-prefix=X-Remote-Extra- \ --requestheader-username-headers=X-Remote-User # --token-auth-file=/etc/kubernetes/token.csv Restart=on-failure RestartSec=10s LimitNOFILE=65535 [Install] WantedBy=multi-user.target ``` ###### **2)启动apiserver服务** ```sh #每台master主机上都执行,这里建议手执行,便于查看每台主机的真实运行情况 #启动服务 systemctl daemon-reload && systemctl enable --now kube-apiserver #查看状态 systemctl status kube-apiserver ``` 看到下面就是启动成功了 
![image-20210822111539102](https://gitee.com/wang5620079/mypics/raw/master//202108221115187.png) 2)启动controller-manager服务 所有Master节点配置kube-controller-manager.service > 注意:笔者使用的k8s Pod网段为`196.16.0.0/16`,该网段不能和宿主机的网段、k8s Service网段的重复,请按需修改; > > 特别注意:docker的网桥默认为 `172.17.0.1/16`。不要使用这个网段 ```sh # 所有master节点执行 vi /usr/lib/systemd/system/kube-controller-manager.service #60主机 [Unit] Description=Kubernetes Controller Manager Documentation=https://github.com/kubernetes/kubernetes After=network.target [Service] ExecStart=/usr/local/bin/kube-controller-manager \ --v=2 \ --logtostderr=true \ --address=127.0.0.1 \ --root-ca-file=/etc/kubernetes/pki/ca.pem \ --cluster-signing-cert-file=/etc/kubernetes/pki/ca.pem \ --cluster-signing-key-file=/etc/kubernetes/pki/ca-key.pem \ --service-account-private-key-file=/etc/kubernetes/pki/sa.key \ --kubeconfig=/etc/kubernetes/controller-manager.conf \ --leader-elect=true \ --use-service-account-credentials=true \ --node-monitor-grace-period=40s \ --node-monitor-period=5s \ --pod-eviction-timeout=2m0s \ --controllers=*,bootstrapsigner,tokencleaner \ --allocate-node-cidrs=true \ --cluster-cidr=196.16.0.0/16 \ --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.pem \ --node-cidr-mask-size=24 Restart=always RestartSec=10s [Install] WantedBy=multi-user.target #61主机 [Unit] Description=Kubernetes Controller Manager Documentation=https://github.com/kubernetes/kubernetes After=network.target [Service] ExecStart=/usr/local/bin/kube-controller-manager \ --v=2 \ --logtostderr=true \ --address=127.0.0.1 \ --root-ca-file=/etc/kubernetes/pki/ca.pem \ --cluster-signing-cert-file=/etc/kubernetes/pki/ca.pem \ --cluster-signing-key-file=/etc/kubernetes/pki/ca-key.pem \ --service-account-private-key-file=/etc/kubernetes/pki/sa.key \ --kubeconfig=/etc/kubernetes/controller-manager.conf \ --leader-elect=true \ --use-service-account-credentials=true \ --node-monitor-grace-period=40s \ --node-monitor-period=5s \ --pod-eviction-timeout=2m0s \ --controllers=*,bootstrapsigner,tokencleaner \ --allocate-node-cidrs=true \ --cluster-cidr=196.16.0.0/16 \ --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.pem \ --node-cidr-mask-size=24 Restart=always RestartSec=10s [Install] WantedBy=multi-user.target #62主机 [Unit] Description=Kubernetes Controller Manager Documentation=https://github.com/kubernetes/kubernetes After=network.target [Service] ExecStart=/usr/local/bin/kube-controller-manager \ --v=2 \ --logtostderr=true \ --address=127.0.0.1 \ --root-ca-file=/etc/kubernetes/pki/ca.pem \ --cluster-signing-cert-file=/etc/kubernetes/pki/ca.pem \ --cluster-signing-key-file=/etc/kubernetes/pki/ca-key.pem \ --service-account-private-key-file=/etc/kubernetes/pki/sa.key \ --kubeconfig=/etc/kubernetes/controller-manager.conf \ --leader-elect=true \ --use-service-account-credentials=true \ --node-monitor-grace-period=40s \ --node-monitor-period=5s \ --pod-eviction-timeout=2m0s \ --controllers=*,bootstrapsigner,tokencleaner \ --allocate-node-cidrs=true \ --cluster-cidr=196.16.0.0/16 \ --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.pem \ --node-cidr-mask-size=24 Restart=always RestartSec=10s [Install] WantedBy=multi-user.target ``` **启动服务** ```sh # 所有master节点执行 systemctl daemon-reload systemctl daemon-reload && systemctl enable --now kube-controller-manager systemctl status kube-controller-manager ``` 成功后的截图如下(那几个error请忽略,是kube-controller-manager获取锁的时候的报错,但是服务是正常的) ![image-20210822112427178](https://gitee.com/wang5620079/mypics/raw/master//202108221124252.png) 
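As an optional sanity check at this point (not part of the original write-up), the component status can be queried from master1. On a 1.21 binary install the deprecated componentstatuses resource is still served; etcd and the controller-manager should report Healthy, while the scheduler will stay Unhealthy until it is started in the next step:

```sh
# componentstatuses is deprecated since v1.19 but still works on 1.21 and gives a quick view of
# etcd, controller-manager and scheduler health as seen from the API server.
kubectl get componentstatuses
```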
###### **3)启动scheduler服务** ```sh #每个master主机上执行 vi /usr/lib/systemd/system/kube-scheduler.service [Unit] Description=Kubernetes Scheduler Documentation=https://github.com/kubernetes/kubernetes After=network.target [Service] ExecStart=/usr/local/bin/kube-scheduler \ --v=2 \ --logtostderr=true \ --address=127.0.0.1 \ --leader-elect=true \ --kubeconfig=/etc/kubernetes/scheduler.conf Restart=always RestartSec=10s [Install] WantedBy=multi-user.target ``` 启动服务 ```sh systemctl daemon-reload systemctl daemon-reload && systemctl enable --now kube-scheduler systemctl status kube-scheduler ``` ![image-20210822113157695](https://gitee.com/wang5620079/mypics/raw/master//202108221131766.png) ###### **4)node节点启动和加入集群** > TLS与引导启动原理详见官网: > > https://kubernetes.io/zh/docs/reference/command-line-tools-reference/kubelet-tls-bootstrapping/ ```sh #master1执行 #准备随机的token #准备一个随机token。但是我们只需要16个字符 head -c 16 /dev/urandom | od -An -t x | tr -d ' ' # 记下这个token,这里假设是94d86a79f6a219a8c4f8f8df200faeed # 生成16个字符的 head -c 8 /dev/urandom | od -An -t x | tr -d ' ' # d683399b7a553977 #生成配置文件 #设置集群 kubectl config set-cluster kubernetes \ --certificate-authority=/etc/kubernetes/pki/ca.pem \ --embed-certs=true \ --server=https://192.168.100.100:6443 \ --kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf #设置秘钥 kubectl config set-credentials tls-bootstrap-token-user \ --token=wang5620079.94d86a79f6a219a8c4f8f8df200faeed \ --kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf #设置上下文 kubectl config set-context tls-bootstrap-token-user@kubernetes \ --cluster=kubernetes \ --user=tls-bootstrap-token-user \ --kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf #使用设置 kubectl config use-context tls-bootstrap-token-user@kubernetes \ --kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf #设置kubectl权限,就是让master1能用kubectl命令 mkdir -p /root/.kube ; cp /etc/kubernetes/admin.conf /root/.kube/config #验证 kubectl get nodes [root@k8s-master1 ~]# kubectl get nodes No resources found #出现上面的提示,说明已经可以连接kubectl已经可以与apiserver通信并获取资源 ``` **创建集群引导权限文件** ```sh #master节点上执行 vi /etc/kubernetes/bootstrap.secret.yaml apiVersion: v1 kind: Secret metadata: name: bootstrap-token-wang5620079 namespace: kube-system type: bootstrap.kubernetes.io/token stringData: description: "The default bootstrap token generated by 'kubelet '." 
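  # The token-id / token-secret pair below must match the bootstrap token embedded earlier in
  # bootstrap-kubelet.conf (--token=<token-id>.<token-secret>).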
token-id: wang5620079 token-secret: 94d86a79f6a219a8c4f8f8df200faeed usage-bootstrap-authentication: "true" usage-bootstrap-signing: "true" auth-extra-groups: system:bootstrappers:default-node-token,system:bootstrappers:worker,system:bootstrappers:ingress --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: kubelet-bootstrap roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:node-bootstrapper subjects: - apiGroup: rbac.authorization.k8s.io kind: Group name: system:bootstrappers:default-node-token --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: node-autoapprove-bootstrap roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:certificates.k8s.io:certificatesigningrequests:nodeclient subjects: - apiGroup: rbac.authorization.k8s.io kind: Group name: system:bootstrappers:default-node-token --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: node-autoapprove-certificate-rotation roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:certificates.k8s.io:certificatesigningrequests:selfnodeclient subjects: - apiGroup: rbac.authorization.k8s.io kind: Group name: system:nodes --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: annotations: rbac.authorization.kubernetes.io/autoupdate: "true" labels: kubernetes.io/bootstrapping: rbac-defaults name: system:kube-apiserver-to-kubelet rules: - apiGroups: - "" resources: - nodes/proxy - nodes/stats - nodes/log - nodes/spec - nodes/metrics verbs: - "*" --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: system:kube-apiserver namespace: "" roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:kube-apiserver-to-kubelet subjects: - apiGroup: rbac.authorization.k8s.io kind: User name: kube-apiserver #使用这个文件创建集群引导secret kubectl create -f /etc/kubernetes/bootstrap.secret.yaml ``` **node节点启动引导配置** ```sh #证书配置到其他master节点 #执行复制所有令牌操作 for NODE in k8s-master2 k8s-master3 k8s-node1 k8s-node2; do ssh $NODE mkdir -p /etc/kubernetes/pki/etcd for FILE in ca.pem etcd.pem etcd-key.pem; do scp /etc/kubernetes/pki/etcd/$FILE $NODE:/etc/kubernetes/pki/etcd/ done for FILE in pki/ca.pem pki/ca-key.pem pki/front-proxy-ca.pem bootstrap-kubelet.conf; do scp /etc/kubernetes/$FILE $NODE:/etc/kubernetes/${FILE} done done ``` - **所有节点配置kubelet,注意是所有节点哦,包括master和node** ```sh #所有节点 ansible k8s-ha -m shell -a "mkdir -p /var/lib/kubelet /var/log/kubernetes /etc/systemd/system/kubelet.service.d /etc/kubernetes/manifests/" ## 所有node节点必须有 kubelet kube-proxy for NODE in k8s-master2 k8s-master3 k8s-node3 k8s-node1 k8s-node2 k8s-node3; do scp -r /etc/kubernetes/* root@$NODE:/etc/kubernetes/ done ``` **所有节点,配置kubelet.service** ```sh vi /usr/lib/systemd/system/kubelet.service [Unit] Description=Kubernetes Kubelet Documentation=https://github.com/kubernetes/kubernetes After=docker.service Requires=docker.service [Service] ExecStart=/usr/local/bin/kubelet Restart=always StartLimitInterval=0 RestartSec=10 [Install] WantedBy=multi-user.target #可以配置好一个节点后,用ansible分发 ``` **所有节点,配置kubelet的启动配置文件** ```sh #所有节点,配置kubelet service配置文件 vi /etc/systemd/system/kubelet.service.d/10-kubelet.conf [Service] Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf" Environment="KUBELET_SYSTEM_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin" 
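# The pause image referenced below appears to be the author's Aliyun mirror of k8s.gcr.io/pause:3.4.1,
# so that nodes do not need direct access to k8s.gcr.io.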
Environment="KUBELET_CONFIG_ARGS=--config=/etc/kubernetes/kubelet-conf.yml --pod-infra-container-image=registry.cn-beijing.aliyuncs.com/wang5620079/k8s:pause-3.4.1" Environment="KUBELET_EXTRA_ARGS=--node-labels=node.kubernetes.io/node='' " ExecStart= ExecStart=/usr/local/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_SYSTEM_ARGS $KUBELET_EXTRA_ARGS #所有节点,配置kubelet-conf文件 vi /etc/kubernetes/kubelet-conf.yml apiVersion: kubelet.config.k8s.io/v1beta1 kind: KubeletConfiguration address: 0.0.0.0 port: 10250 readOnlyPort: 10255 authentication: anonymous: enabled: false webhook: cacheTTL: 2m0s enabled: true x509: clientCAFile: /etc/kubernetes/pki/ca.pem authorization: mode: Webhook webhook: cacheAuthorizedTTL: 5m0s cacheUnauthorizedTTL: 30s cgroupDriver: systemd cgroupsPerQOS: true clusterDNS: - 10.96.0.10 #注意这里的配置clusterDNS 为service网络的ip值,改成自己的。 clusterDomain: cluster.local containerLogMaxFiles: 5 containerLogMaxSize: 10Mi contentType: application/vnd.kubernetes.protobuf cpuCFSQuota: true cpuManagerPolicy: none cpuManagerReconcilePeriod: 10s enableControllerAttachDetach: true enableDebuggingHandlers: true enforceNodeAllocatable: - pods eventBurst: 10 eventRecordQPS: 5 evictionHard: imagefs.available: 15% memory.available: 100Mi nodefs.available: 10% nodefs.inodesFree: 5% evictionPressureTransitionPeriod: 5m0s failSwapOn: true fileCheckFrequency: 20s hairpinMode: promiscuous-bridge healthzBindAddress: 127.0.0.1 healthzPort: 10248 httpCheckFrequency: 20s imageGCHighThresholdPercent: 85 imageGCLowThresholdPercent: 80 imageMinimumGCAge: 2m0s iptablesDropBit: 15 iptablesMasqueradeBit: 14 kubeAPIBurst: 10 kubeAPIQPS: 5 makeIPTablesUtilChains: true maxOpenFiles: 1000000 maxPods: 110 nodeStatusUpdateFrequency: 10s oomScoreAdj: -999 podPidsLimit: -1 registryBurst: 10 registryPullQPS: 5 resolvConf: /etc/resolv.conf rotateCertificates: true runtimeRequestTimeout: 2m0s serializeImagePulls: true staticPodPath: /etc/kubernetes/manifests streamingConnectionIdleTimeout: 4h0m0s syncFrequency: 1m0s volumeStatsAggPeriod: 1m0s ``` **启动kubelet服务** ```sh #所有节点启动kubelet服务 systemctl daemon-reload && systemctl enable --now kubelet systemctl status kubelet #会提示Unable to update cni config,这是因为没有安装网络插件,后面会安装网络插件 ``` 效果如下 ![image-20210822115922772](https://gitee.com/wang5620079/mypics/raw/master//202108221159850.png) - **kube-proxy配置** ```sh #在master1上面创建配置文件,然后分发 #创建kube-proxy的sa kubectl -n kube-system create serviceaccount kube-proxy #创建角色绑定 kubectl create clusterrolebinding system:kube-proxy \ --clusterrole system:node-proxier \ --serviceaccount kube-system:kube-proxy #导出变量,方便后面使用 SECRET=$(kubectl -n kube-system get sa/kube-proxy --output=jsonpath='{.secrets[0].name}') JWT_TOKEN=$(kubectl -n kube-system get secret/$SECRET --output=jsonpath='{.data.token}' | base64 -d) PKI_DIR=/etc/kubernetes/pki K8S_DIR=/etc/kubernetes # 生成kube-proxy配置 # --server: 指定自己的apiserver地址或者lb地址 kubectl config set-cluster kubernetes \ --certificate-authority=/etc/kubernetes/pki/ca.pem \ --embed-certs=true \ --server=https://192.168.100.100:6443 \ --kubeconfig=${K8S_DIR}/kube-proxy.conf # kube-proxy秘钥设置 kubectl config set-credentials kubernetes \ --token=${JWT_TOKEN} \ --kubeconfig=/etc/kubernetes/kube-proxy.conf kubectl config set-context kubernetes \ --cluster=kubernetes \ --user=kubernetes \ --kubeconfig=/etc/kubernetes/kube-proxy.conf kubectl config use-context kubernetes \ --kubeconfig=/etc/kubernetes/kube-proxy.conf #分发文件 for NODE in k8s-master2 k8s-master3 k8s-node1 k8s-node2 k8s-node3; do scp 
/etc/kubernetes/kube-proxy.conf $NODE:/etc/kubernetes/ done ``` **配置kube-proxy.service** ```sh # 所有节点配置 kube-proxy.service 服务,并设置为开机启动 vi /usr/lib/systemd/system/kube-proxy.service [Unit] Description=Kubernetes Kube Proxy Documentation=https://github.com/kubernetes/kubernetes After=network.target [Service] ExecStart=/usr/local/bin/kube-proxy \ --config=/etc/kubernetes/kube-proxy.yaml \ --v=2 Restart=always RestartSec=10s [Install] WantedBy=multi-user.target ``` **配置kube-proxy.yaml** ```sh # 所有机器执行 vi /etc/kubernetes/kube-proxy.yaml apiVersion: kubeproxy.config.k8s.io/v1alpha1 bindAddress: 0.0.0.0 clientConnection: acceptContentTypes: "" burst: 10 contentType: application/vnd.kubernetes.protobuf kubeconfig: /etc/kubernetes/kube-proxy.conf #kube-proxy引导文件 qps: 5 clusterCIDR: 196.16.0.0/16 #修改为Pod-CIDR configSyncPeriod: 15m0s conntrack: max: null maxPerCore: 32768 min: 131072 tcpCloseWaitTimeout: 1h0m0s tcpEstablishedTimeout: 24h0m0s enableProfiling: false healthzBindAddress: 0.0.0.0:10256 hostnameOverride: "" iptables: masqueradeAll: false masqueradeBit: 14 minSyncPeriod: 0s syncPeriod: 30s ipvs: masqueradeAll: true minSyncPeriod: 5s scheduler: "rr" syncPeriod: 30s kind: KubeProxyConfiguration metricsBindAddress: 127.0.0.1:10249 mode: "ipvs" nodePortAddresses: null oomScoreAdj: -999 portRange: "" udpIdleTimeout: 250ms ``` **启动服务** ```sh systemctl daemon-reload && systemctl enable --now kube-proxy systemctl status kube-proxy ``` ![image-20210822121342664](https://gitee.com/wang5620079/mypics/raw/master//202108221213746.png) ###### **5)部署calico网络插件** ```sh #下载calico文件,并修改 curl https://docs.projectcalico.org/manifests/calico-etcd.yaml -o calico.yaml #修改如下的内容 #1、修改etcd集群地址 sed -i 's#etcd_endpoints: "http://:"#etcd_endpoints: "https://192.168.100.60:2379,https://192.168.100.61:2379,https://192.168.100.62:2379"#g' calico.yaml # etcd的证书内容,需要base64编码设置到yaml中 ETCD_CA=`cat /etc/kubernetes/pki/etcd/ca.pem | base64 -w 0 ` ETCD_CERT=`cat /etc/kubernetes/pki/etcd/etcd.pem | base64 -w 0 ` ETCD_KEY=`cat /etc/kubernetes/pki/etcd/etcd-key.pem | base64 -w 0 ` # 替换etcd中的证书base64编码后的内容 sed -i "s@# etcd-key: null@etcd-key: ${ETCD_KEY}@g; s@# etcd-cert: null@etcd-cert: ${ETCD_CERT}@g; s@# etcd-ca: null@etcd-ca: ${ETCD_CA}@g" calico.yaml #打开 etcd_ca 等默认设置(calico启动后自己生成)。 sed -i 's#etcd_ca: ""#etcd_ca: "/calico-secrets/etcd-ca"#g; s#etcd_cert: ""#etcd_cert: "/calico-secrets/etcd-cert"#g; s#etcd_key: "" #etcd_key: "/calico-secrets/etcd-key" #g' calico.yaml # 修改自己的Pod网段 196.16.0.0/16 POD_SUBNET="196.16.0.0/16" sed -i 's@# - name: CALICO_IPV4POOL_CIDR@- name: CALICO_IPV4POOL_CIDR@g; s@# value: "192.168.0.0/16"@ value: '"${POD_SUBNET}"'@g' calico.yaml # 一定确定自己是否修改好了 #确认calico是否修改好 grep "CALICO_IPV4POOL_CIDR" calico.yaml -A 1 #应用配置,安装插件 kubectl apply -f calico.yaml ``` ###### **6)部署coreDNS** ```sh git clone https://github.com/coredns/deployment.git cd deployment/kubernetes #10.96.0.10 改为 service 网段的 第 10 个ip ./deploy.sh -s -i 10.96.0.10 | kubectl apply -f - ``` ###### **7)给集群节点打标签** ```sh kubectl label node k8s-master1 node-role.kubernetes.io/master='' kubectl label node k8s-master2 node-role.kubernetes.io/master='' kubectl label node k8s-master3 node-role.kubernetes.io/master='' #打上污点 kubectl taint node k8s-master1 node-role.kubernetes.io/master:NoSchedule kubectl taint node k8s-master2 node-role.kubernetes.io/master:NoSchedule kubectl taint node k8s-master3 node-role.kubernetes.io/master:NoSchedule ``` ###### **8)集群验证** > - 验证Pod网络可访问性 > - 同名称空间,不同名称空间可以使用 ip 互相访问 > - 跨机器部署的Pod也可以互相访问 > - 验证Service网络可访问性 > - 
集群机器使用serviceIp可以负载均衡访问 > - pod内部可以访问service域名 serviceName.namespace > - pod可以访问跨名称空间的service ```yaml #用nginx镜像验证 apiVersion: apps/v1 kind: Deployment metadata: name: nginx-01 namespace: default labels: app: nginx-01 spec: selector: matchLabels: app: nginx-01 replicas: 1 template: metadata: labels: app: nginx-01 spec: containers: - name: nginx-01 image: nginx --- apiVersion: v1 kind: Service metadata: name: nginx-svc namespace: default spec: selector: app: nginx-01 type: ClusterIP ports: - name: nginx-svc port: 80 targetPort: 80 protocol: TCP --- apiVersion: v1 kind: Namespace metadata: name: hello spec: {} --- apiVersion: apps/v1 kind: Deployment metadata: name: nginx-hello namespace: hello labels: app: nginx-hello spec: selector: matchLabels: app: nginx-hello replicas: 1 template: metadata: labels: app: nginx-hello spec: containers: - name: nginx-hello image: nginx --- apiVersion: v1 kind: Service metadata: name: nginx-svc-hello namespace: hello spec: selector: app: nginx-hello type: ClusterIP ports: - name: nginx-svc-hello port: 80 targetPort: 80 protocol: TCP ``` 看到pod部署上去了,且deploy能保证副本数,service可以访问,那么集群就配置完成了。 至此,集群基本配置完成!yeah!太不容易了。 ### 二、Kubernetes基础预装组件构建 以上仅仅是完成了集群的基础配置,构建了一个能基本运行的k8s集群,下面还要在集群上安装其他基础预装组件,如metrics-server、ingress-nginx等 #### 2.1 metrics-server安装 编辑metrics-server.yaml,内容如下 ```yaml apiVersion: v1 kind: ServiceAccount metadata: labels: k8s-app: metrics-server name: metrics-server namespace: kube-system --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: labels: k8s-app: metrics-server rbac.authorization.k8s.io/aggregate-to-admin: "true" rbac.authorization.k8s.io/aggregate-to-edit: "true" rbac.authorization.k8s.io/aggregate-to-view: "true" name: system:aggregated-metrics-reader rules: - apiGroups: - metrics.k8s.io resources: - pods - nodes verbs: - get - list - watch --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: labels: k8s-app: metrics-server name: system:metrics-server rules: - apiGroups: - "" resources: - pods - nodes - nodes/stats - namespaces - configmaps verbs: - get - list - watch --- apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: labels: k8s-app: metrics-server name: metrics-server-auth-reader namespace: kube-system roleRef: apiGroup: rbac.authorization.k8s.io kind: Role name: extension-apiserver-authentication-reader subjects: - kind: ServiceAccount name: metrics-server namespace: kube-system --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: labels: k8s-app: metrics-server name: metrics-server:system:auth-delegator roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:auth-delegator subjects: - kind: ServiceAccount name: metrics-server namespace: kube-system --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: labels: k8s-app: metrics-server name: system:metrics-server roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:metrics-server subjects: - kind: ServiceAccount name: metrics-server namespace: kube-system --- apiVersion: v1 kind: Service metadata: labels: k8s-app: metrics-server name: metrics-server namespace: kube-system spec: ports: - name: https port: 443 protocol: TCP targetPort: https selector: k8s-app: metrics-server --- apiVersion: apps/v1 kind: Deployment metadata: labels: k8s-app: metrics-server name: metrics-server namespace: kube-system spec: selector: matchLabels: k8s-app: metrics-server strategy: rollingUpdate: maxUnavailable: 0 template: metadata: labels: k8s-app: 
metrics-server spec: containers: - args: - --v=6 - --cert-dir=/tmp - --kubelet-insecure-tls - --secure-port=4443 - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname - --kubelet-use-node-status-port image: registry.cn-beijing.aliyuncs.com/wang5620079/k8s:metrics-server-v0.4.3 imagePullPolicy: IfNotPresent livenessProbe: failureThreshold: 3 httpGet: path: /livez port: https scheme: HTTPS periodSeconds: 10 name: metrics-server ports: - containerPort: 4443 name: https protocol: TCP readinessProbe: failureThreshold: 3 httpGet: path: /readyz port: https scheme: HTTPS periodSeconds: 10 securityContext: readOnlyRootFilesystem: true runAsNonRoot: true runAsUser: 1000 volumeMounts: - mountPath: /tmp name: tmp-dir nodeSelector: kubernetes.io/os: linux priorityClassName: system-cluster-critical serviceAccountName: metrics-server volumes: - emptyDir: {} name: tmp-dir --- apiVersion: apiregistration.k8s.io/v1 kind: APIService metadata: labels: k8s-app: metrics-server name: v1beta1.metrics.k8s.io spec: group: metrics.k8s.io groupPriorityMinimum: 100 insecureSkipTLSVerify: true service: name: metrics-server namespace: kube-system version: v1beta1 versionPriority: 10 ``` ```sh #执行 kubectl apply -f metrics-server.yaml ``` #### 2.2 ingress-nginx安装 > 注意:笔者安装方式与官网部署方式不同,**ingress-controller使用DaemonSet形式安装,并使用hostNetwork形式,直接占用node节点的80和443端口,目的是是用ingress-nginx暴露node的80和443端口,分别对外提供http和https服务。因此在安装ingress-nginx服务之前,要保证node节点上的80和443端口没有被占用。** > > 暴露端口的节点是可控的,只要为节点打上“node-role=ingress”标签即可。 ```sh #首先,node节点上打上标签 kubectl label node k8s-node1 node-role=ingress kubectl label node k8s-node2 node-role=ingress kubectl label node k8s-node3 node-role=ingress ``` ingress-nginx安装yaml配置 ```sh apiVersion: v1 kind: Namespace metadata: name: ingress-nginx labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx --- # Source: ingress-nginx/templates/controller-serviceaccount.yaml apiVersion: v1 kind: ServiceAccount metadata: labels: helm.sh/chart: ingress-nginx-3.30.0 app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/version: 0.46.0 app.kubernetes.io/managed-by: Helm app.kubernetes.io/component: controller name: ingress-nginx namespace: ingress-nginx automountServiceAccountToken: true --- # Source: ingress-nginx/templates/controller-configmap.yaml apiVersion: v1 kind: ConfigMap metadata: labels: helm.sh/chart: ingress-nginx-3.30.0 app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/version: 0.46.0 app.kubernetes.io/managed-by: Helm app.kubernetes.io/component: controller name: ingress-nginx-controller namespace: ingress-nginx data: --- # Source: ingress-nginx/templates/clusterrole.yaml apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: labels: helm.sh/chart: ingress-nginx-3.30.0 app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/version: 0.46.0 app.kubernetes.io/managed-by: Helm name: ingress-nginx rules: - apiGroups: - '' resources: - configmaps - endpoints - nodes - pods - secrets verbs: - list - watch - apiGroups: - '' resources: - nodes verbs: - get - apiGroups: - '' resources: - services verbs: - get - list - watch - apiGroups: - extensions - networking.k8s.io # k8s 1.14+ resources: - ingresses verbs: - get - list - watch - apiGroups: - '' resources: - events verbs: - create - patch - apiGroups: - extensions - networking.k8s.io # k8s 1.14+ resources: - ingresses/status verbs: - update - apiGroups: - networking.k8s.io # 
k8s 1.14+ resources: - ingressclasses verbs: - get - list - watch --- # Source: ingress-nginx/templates/clusterrolebinding.yaml apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: labels: helm.sh/chart: ingress-nginx-3.30.0 app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/version: 0.46.0 app.kubernetes.io/managed-by: Helm name: ingress-nginx roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: ingress-nginx subjects: - kind: ServiceAccount name: ingress-nginx namespace: ingress-nginx --- # Source: ingress-nginx/templates/controller-role.yaml apiVersion: rbac.authorization.k8s.io/v1 kind: Role metadata: labels: helm.sh/chart: ingress-nginx-3.30.0 app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/version: 0.46.0 app.kubernetes.io/managed-by: Helm app.kubernetes.io/component: controller name: ingress-nginx namespace: ingress-nginx rules: - apiGroups: - '' resources: - namespaces verbs: - get - apiGroups: - '' resources: - configmaps - pods - secrets - endpoints verbs: - get - list - watch - apiGroups: - '' resources: - services verbs: - get - list - watch - apiGroups: - extensions - networking.k8s.io # k8s 1.14+ resources: - ingresses verbs: - get - list - watch - apiGroups: - extensions - networking.k8s.io # k8s 1.14+ resources: - ingresses/status verbs: - update - apiGroups: - networking.k8s.io # k8s 1.14+ resources: - ingressclasses verbs: - get - list - watch - apiGroups: - '' resources: - configmaps resourceNames: - ingress-controller-leader-nginx verbs: - get - update - apiGroups: - '' resources: - configmaps verbs: - create - apiGroups: - '' resources: - events verbs: - create - patch --- # Source: ingress-nginx/templates/controller-rolebinding.yaml apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: labels: helm.sh/chart: ingress-nginx-3.30.0 app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/version: 0.46.0 app.kubernetes.io/managed-by: Helm app.kubernetes.io/component: controller name: ingress-nginx namespace: ingress-nginx roleRef: apiGroup: rbac.authorization.k8s.io kind: Role name: ingress-nginx subjects: - kind: ServiceAccount name: ingress-nginx namespace: ingress-nginx --- # Source: ingress-nginx/templates/controller-service-webhook.yaml apiVersion: v1 kind: Service metadata: labels: helm.sh/chart: ingress-nginx-3.30.0 app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/version: 0.46.0 app.kubernetes.io/managed-by: Helm app.kubernetes.io/component: controller name: ingress-nginx-controller-admission namespace: ingress-nginx spec: type: ClusterIP ports: - name: https-webhook port: 443 targetPort: webhook selector: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/component: controller --- # Source: ingress-nginx/templates/controller-service.yaml apiVersion: v1 kind: Service metadata: annotations: labels: helm.sh/chart: ingress-nginx-3.30.0 app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/version: 0.46.0 app.kubernetes.io/managed-by: Helm app.kubernetes.io/component: controller name: ingress-nginx-controller namespace: ingress-nginx spec: type: ClusterIP ## 改为clusterIP ports: - name: http port: 80 protocol: TCP targetPort: http - name: https port: 443 protocol: TCP targetPort: https selector: app.kubernetes.io/name: ingress-nginx 
app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/component: controller --- # Source: ingress-nginx/templates/controller-deployment.yaml apiVersion: apps/v1 kind: DaemonSet metadata: labels: helm.sh/chart: ingress-nginx-3.30.0 app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/version: 0.46.0 app.kubernetes.io/managed-by: Helm app.kubernetes.io/component: controller name: ingress-nginx-controller namespace: ingress-nginx spec: selector: matchLabels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/component: controller revisionHistoryLimit: 10 minReadySeconds: 0 template: metadata: labels: app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/component: controller spec: dnsPolicy: ClusterFirstWithHostNet ## 这里调整dns为主机网络 hostNetwork: true ## 直接让nginx占用本机80端口和443端口,所以使用主机网络 containers: - name: controller image: registry.cn-beijing.aliyuncs.com/wang5620079/k8s:ingress-nginx-controller-v0.46.0 imagePullPolicy: IfNotPresent lifecycle: preStop: exec: command: - /wait-shutdown args: - /nginx-ingress-controller - --election-id=ingress-controller-leader - --ingress-class=nginx - --configmap=$(POD_NAMESPACE)/ingress-nginx-controller - --validating-webhook=:8443 - --validating-webhook-certificate=/usr/local/certificates/cert - --validating-webhook-key=/usr/local/certificates/key securityContext: capabilities: drop: - ALL add: - NET_BIND_SERVICE runAsUser: 101 allowPrivilegeEscalation: true env: - name: POD_NAME valueFrom: fieldRef: fieldPath: metadata.name - name: POD_NAMESPACE valueFrom: fieldRef: fieldPath: metadata.namespace - name: LD_PRELOAD value: /usr/local/lib/libmimalloc.so livenessProbe: httpGet: path: /healthz port: 10254 scheme: HTTP initialDelaySeconds: 10 periodSeconds: 10 timeoutSeconds: 1 successThreshold: 1 failureThreshold: 5 readinessProbe: httpGet: path: /healthz port: 10254 scheme: HTTP initialDelaySeconds: 10 periodSeconds: 10 timeoutSeconds: 1 successThreshold: 1 failureThreshold: 3 ports: - name: http containerPort: 80 protocol: TCP - name: https containerPort: 443 protocol: TCP - name: webhook containerPort: 8443 protocol: TCP volumeMounts: - name: webhook-cert mountPath: /usr/local/certificates/ readOnly: true resources: requests: cpu: 100m memory: 90Mi limits: cpu: 1000m memory: 800Mi nodeSelector: node-role: ingress serviceAccountName: ingress-nginx terminationGracePeriodSeconds: 300 volumes: - name: webhook-cert secret: secretName: ingress-nginx-admission --- # Source: ingress-nginx/templates/admission-webhooks/validating-webhook.yaml # before changing this value, check the required kubernetes version # https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#prerequisites apiVersion: admissionregistration.k8s.io/v1 kind: ValidatingWebhookConfiguration metadata: labels: helm.sh/chart: ingress-nginx-3.30.0 app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/version: 0.46.0 app.kubernetes.io/managed-by: Helm app.kubernetes.io/component: admission-webhook name: ingress-nginx-admission webhooks: - name: validate.nginx.ingress.kubernetes.io matchPolicy: Equivalent rules: - apiGroups: - networking.k8s.io apiVersions: - v1beta1 operations: - CREATE - UPDATE resources: - ingresses failurePolicy: Fail sideEffects: None admissionReviewVersions: - v1 - v1beta1 clientConfig: service: namespace: ingress-nginx name: ingress-nginx-controller-admission path: 
/networking/v1beta1/ingresses --- # Source: ingress-nginx/templates/admission-webhooks/job-patch/serviceaccount.yaml apiVersion: v1 kind: ServiceAccount metadata: name: ingress-nginx-admission annotations: helm.sh/hook: pre-install,pre-upgrade,post-install,post-upgrade helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded labels: helm.sh/chart: ingress-nginx-3.30.0 app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/version: 0.46.0 app.kubernetes.io/managed-by: Helm app.kubernetes.io/component: admission-webhook namespace: ingress-nginx --- # Source: ingress-nginx/templates/admission-webhooks/job-patch/clusterrole.yaml apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: name: ingress-nginx-admission annotations: helm.sh/hook: pre-install,pre-upgrade,post-install,post-upgrade helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded labels: helm.sh/chart: ingress-nginx-3.30.0 app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/version: 0.46.0 app.kubernetes.io/managed-by: Helm app.kubernetes.io/component: admission-webhook rules: - apiGroups: - admissionregistration.k8s.io resources: - validatingwebhookconfigurations verbs: - get - update --- # Source: ingress-nginx/templates/admission-webhooks/job-patch/clusterrolebinding.yaml apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: ingress-nginx-admission annotations: helm.sh/hook: pre-install,pre-upgrade,post-install,post-upgrade helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded labels: helm.sh/chart: ingress-nginx-3.30.0 app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/version: 0.46.0 app.kubernetes.io/managed-by: Helm app.kubernetes.io/component: admission-webhook roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: ingress-nginx-admission subjects: - kind: ServiceAccount name: ingress-nginx-admission namespace: ingress-nginx --- # Source: ingress-nginx/templates/admission-webhooks/job-patch/role.yaml apiVersion: rbac.authorization.k8s.io/v1 kind: Role metadata: name: ingress-nginx-admission annotations: helm.sh/hook: pre-install,pre-upgrade,post-install,post-upgrade helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded labels: helm.sh/chart: ingress-nginx-3.30.0 app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/version: 0.46.0 app.kubernetes.io/managed-by: Helm app.kubernetes.io/component: admission-webhook namespace: ingress-nginx rules: - apiGroups: - '' resources: - secrets verbs: - get - create --- # Source: ingress-nginx/templates/admission-webhooks/job-patch/rolebinding.yaml apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: name: ingress-nginx-admission annotations: helm.sh/hook: pre-install,pre-upgrade,post-install,post-upgrade helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded labels: helm.sh/chart: ingress-nginx-3.30.0 app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/version: 0.46.0 app.kubernetes.io/managed-by: Helm app.kubernetes.io/component: admission-webhook namespace: ingress-nginx roleRef: apiGroup: rbac.authorization.k8s.io kind: Role name: ingress-nginx-admission subjects: - kind: ServiceAccount name: ingress-nginx-admission namespace: ingress-nginx --- # Source: ingress-nginx/templates/admission-webhooks/job-patch/job-createSecret.yaml apiVersion: 
batch/v1 kind: Job metadata: name: ingress-nginx-admission-create annotations: helm.sh/hook: pre-install,pre-upgrade helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded labels: helm.sh/chart: ingress-nginx-3.30.0 app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/version: 0.46.0 app.kubernetes.io/managed-by: Helm app.kubernetes.io/component: admission-webhook namespace: ingress-nginx spec: template: metadata: name: ingress-nginx-admission-create labels: helm.sh/chart: ingress-nginx-3.30.0 app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/version: 0.46.0 app.kubernetes.io/managed-by: Helm app.kubernetes.io/component: admission-webhook spec: containers: - name: create image: docker.io/jettech/kube-webhook-certgen:v1.5.1 imagePullPolicy: IfNotPresent args: - create - --host=ingress-nginx-controller-admission,ingress-nginx-controller-admission.$(POD_NAMESPACE).svc - --namespace=$(POD_NAMESPACE) - --secret-name=ingress-nginx-admission env: - name: POD_NAMESPACE valueFrom: fieldRef: fieldPath: metadata.namespace restartPolicy: OnFailure serviceAccountName: ingress-nginx-admission securityContext: runAsNonRoot: true runAsUser: 2000 --- # Source: ingress-nginx/templates/admission-webhooks/job-patch/job-patchWebhook.yaml apiVersion: batch/v1 kind: Job metadata: name: ingress-nginx-admission-patch annotations: helm.sh/hook: post-install,post-upgrade helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded labels: helm.sh/chart: ingress-nginx-3.30.0 app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/version: 0.46.0 app.kubernetes.io/managed-by: Helm app.kubernetes.io/component: admission-webhook namespace: ingress-nginx spec: template: metadata: name: ingress-nginx-admission-patch labels: helm.sh/chart: ingress-nginx-3.30.0 app.kubernetes.io/name: ingress-nginx app.kubernetes.io/instance: ingress-nginx app.kubernetes.io/version: 0.46.0 app.kubernetes.io/managed-by: Helm app.kubernetes.io/component: admission-webhook spec: containers: - name: patch image: docker.io/jettech/kube-webhook-certgen:v1.5.1 imagePullPolicy: IfNotPresent args: - patch - --webhook-name=ingress-nginx-admission - --namespace=$(POD_NAMESPACE) - --patch-mutating=false - --secret-name=ingress-nginx-admission - --patch-failure-policy=Fail env: - name: POD_NAMESPACE valueFrom: fieldRef: fieldPath: metadata.namespace restartPolicy: OnFailure serviceAccountName: ingress-nginx-admission securityContext: runAsNonRoot: true runAsUser: 2000 ``` #### 2.3 helm应用商店安装 直接在master1上执行如下的命令即可 ```sh curl https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash ``` ### 三、平台组件安装 #### 3.1 redis安装 本次redis的安装,采用的是单节点安装的形式(本来尝试过用Operator安装,但是安装完后是一个集群,占用内存太大了,机器内存不够了……囧)。安装文件如下: ```yaml --- #redis配置的configmap kind: ConfigMap apiVersion: v1 metadata: name: redis-config namespace: platform labels: app: redis data: redis.conf: |- dir /data port 6379 bind 0.0.0.0 appendonly yes protected-mode no pidfile /data/redis-6379.pid --- #存储 apiVersion: v1 kind: PersistentVolumeClaim metadata: name: redis-pvc namespace: platform spec: accessModes: - ReadWriteOnce volumeMode: Filesystem resources: requests: storage: 5Gi storageClassName: "rook-ceph-block" --- apiVersion: apps/v1 kind: Deployment metadata: name: redis namespace: platform labels: app: redis spec: replicas: 1 selector: matchLabels: app: redis template: metadata: labels: app: redis spec: # 进行初始化操作,修改系统配置,解决 Redis 启动时提示的警告信息 
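      # (note) the two kernel tweaks in the init container below (net.core.somaxconn and
      # transparent hugepages) are written through the node's /proc and /sys, which is why
      # it runs privileged as root and mounts the host /sys via a hostPath volume; the
      # change therefore takes effect for the whole node, not only this Redis pod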
initContainers: - name: system-init image: busybox:1.32 imagePullPolicy: IfNotPresent command: - "sh" - "-c" - "echo 2048 > /proc/sys/net/core/somaxconn && echo never > /sys/kernel/mm/transparent_hugepage/enabled" securityContext: privileged: true runAsUser: 0 volumeMounts: - name: sys mountPath: /sys containers: - name: redis image: redis:5.0.8 command: - "sh" - "-c" - "redis-server /usr/local/etc/redis/redis.conf" ports: - containerPort: 6379 resources: limits: cpu: 1000m memory: 1024Mi requests: cpu: 1000m memory: 1024Mi livenessProbe: tcpSocket: port: 6379 initialDelaySeconds: 300 timeoutSeconds: 1 periodSeconds: 10 successThreshold: 1 failureThreshold: 3 readinessProbe: tcpSocket: port: 6379 initialDelaySeconds: 5 timeoutSeconds: 1 periodSeconds: 10 successThreshold: 1 failureThreshold: 3 volumeMounts: - name: redis-data mountPath: /data - name: config mountPath: /usr/local/etc/redis/redis.conf subPath: redis.conf volumes: - name: redis-data persistentVolumeClaim: claimName: redis-pvc - name: config configMap: name: redis-config - name: sys hostPath: path: /sys --- #service apiVersion: v1 kind: Service metadata: name: redis namespace: platform labels: app: redis spec: type: ClusterIP ports: - name: redis port: 6379 selector: app: redis ``` #### 3.2 mysql以ep+service形式暴露服务到集群环境 配置很简单,如下 ```yaml --- apiVersion: v1 kind: Service metadata: name: mysql namespace: platform spec: ports: - port: 3306 --- kind: Endpoints apiVersion: v1 metadata: name: mysql namespace: platform subsets: - addresses: - ip: 192.168.3.80 ports: - port: 3306 ``` ### 四、构建CICD基础环境 CICD基础环境,主要由ceph存储、Harbor镜像仓库和Jenkins组成。ceph存储提供底层存储,镜像仓库提供镜像存储,jenkins提供CICD集成环境。 #### 4.1 rook-ceph安装 > 参考官网:https://rook.io/docs/rook/v1.6/ceph-quickstart.html ##### **1)前提条件** - Raw devices (no partitions or formatted filesystems); 原始磁盘,无分区或者格式化 - Raw partitions (no formatted filesystem);原始分区,无格式化文件系统 因此需要在虚拟机上增加一块磁盘,并且不能分区或格式化。 **每台node节点增加一个100G的磁盘,然后重启系统(k8s集群配置好后,主机重启,集群也会自动重启)。** ![image-20210822143302051](https://gitee.com/wang5620079/mypics/raw/master//202108221433146.png) ![image-20210822143557127](https://gitee.com/wang5620079/mypics/raw/master//202108221435224.png) ##### **2)通过operator部署** > 参考ceph官网介绍:https://ceph.io/ > > 参考rook-ceph官网介绍:https://rook.io/docs/rook/master/ceph-quickstart.html > > 注意:笔者环境安装的时候,rook-ceph最新版本是1.6.7 ```sh #下载文件: git clone --single-branch --branch master https://github.com/rook/rook.git #修改配置文件: cd rook-1.6.7/cluster/examples/kubernetes/ceph ``` ```sh #修改operator.yaml,更换镜像源 ROOK_CSI_CEPH_IMAGE: "registry.cn-beijing.aliyuncs.com/wang5620079/rook-ceph:cephcsi-v3.3.1" ROOK_CSI_REGISTRAR_IMAGE: "registry.cn-beijing.aliyuncs.com/wang5620079/rook-ceph:csi-node-driver-registrar-v2.0.1" ROOK_CSI_RESIZER_IMAGE: "registry.cn-beijing.aliyuncs.com/wang5620079/rook-ceph:csi-resizer-v1.0.1" ROOK_CSI_PROVISIONER_IMAGE: "registry.cn-beijing.aliyuncs.com/wang5620079/rook-ceph:csi-provisioner-v2.0.4" ROOK_CSI_SNAPSHOTTER_IMAGE: "registry.cn-beijing.aliyuncs.com/wang5620079/rook-ceph:csi-snapshotter-v4.0.0" ROOK_CSI_ATTACHER_IMAGE: "registry.cn-beijing.aliyuncs.com/wang5620079/rook-ceph:csi-attacher-v3.0.2" ``` **修改`cluster.yaml`磁盘配置** ```yaml storage: # cluster level storage configuration and selection useAllNodes: false useAllDevices: false config: osdsPerDevice: "3" #每个设备osd数量 nodes: - name: "k8s-node3" devices: - name: "sdb" - name: "k8s-node1" devices: - name: "sdb" - name: "k8s-node2" devices: - name: "sdb" ``` **运行命令部署** ```sh cd cluster/examples/kubernetes/ceph kubectl create -f crds.yaml -f common.yaml -f 
operator.yaml #注意修改operator镜像 #验证部署完成 kubectl -n rook-ceph get pod ``` **`一定要保证有如下的服务部署完成了,否则会出问题!!`** ![image-20210822150618453](https://gitee.com/wang5620079/mypics/raw/master//202108221506546.png) 3)部署dashboard 前面的步骤其实已经部署完成了。 先获取访问密码: ```sh #获取访问密码 kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo ``` 将service修改成NodePort形式就行了 ```sh apiVersion: v1 kind: Service metadata: labels: app: rook-ceph-mgr ceph_daemon_id: a rook_cluster: rook-ceph name: rook-ceph-mgr-dashboard-active namespace: rook-ceph spec: ports: - name: dashboard port: 8443 protocol: TCP targetPort: 8443 selector: #service选择哪些Pod app: rook-ceph-mgr ceph_daemon_id: a rook_cluster: rook-ceph sessionAffinity: None type: NodePort ``` 截图如下 ![image-20210822151101326](https://gitee.com/wang5620079/mypics/raw/master//202108221511436.png) 然后访问dashboard ![image-20210822151330196](https://gitee.com/wang5620079/mypics/raw/master//202108221513434.png) ##### **3)部署ceph的存储池,创建storageclass** **创建块存储(RDB)的storageclass** > 参考https://www.rook.io/docs/rook/v1.6/ceph-block.html ```sh apiVersion: ceph.rook.io/v1 kind: CephBlockPool metadata: name: replicapool namespace: rook-ceph spec: failureDomain: host #容灾模式,host或者osd replicated: size: 2 #数据副本数量 --- apiVersion: storage.k8s.io/v1 kind: StorageClass #存储驱动 metadata: name: rook-ceph-block # Change "rook-ceph" provisioner prefix to match the operator namespace if needed provisioner: rook-ceph.rbd.csi.ceph.com parameters: # clusterID is the namespace where the rook cluster is running clusterID: rook-ceph # Ceph pool into which the RBD image shall be created pool: replicapool # (optional) mapOptions is a comma-separated list of map options. # For krbd options refer # https://docs.ceph.com/docs/master/man/8/rbd/#kernel-rbd-krbd-options # For nbd options refer # https://docs.ceph.com/docs/master/man/8/rbd-nbd/#options # mapOptions: lock_on_read,queue_depth=1024 # (optional) unmapOptions is a comma-separated list of unmap options. # For krbd options refer # https://docs.ceph.com/docs/master/man/8/rbd/#kernel-rbd-krbd-options # For nbd options refer # https://docs.ceph.com/docs/master/man/8/rbd-nbd/#options # unmapOptions: force # RBD image format. Defaults to "2". imageFormat: "2" # RBD image features. Available for imageFormat: "2". CSI RBD currently supports only `layering` feature. imageFeatures: layering # The secrets contain Ceph admin credentials. csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph # Specify the filesystem type of the volume. If not specified, csi-provisioner # will set default as `ext4`. Note that `xfs` is not recommended due to potential deadlock # in hyperconverged settings where the volume is mounted on the same node as the osds. csi.storage.k8s.io/fstype: ext4 # Delete the rbd volume when a PVC is deleted reclaimPolicy: Delete allowVolumeExpansion: true ``` **创建文件存储storageclass** > 参考https://rook.io/docs/rook/v1.6/ceph-filesystem.html ```sh apiVersion: ceph.rook.io/v1 kind: CephFilesystem metadata: name: myfs namespace: rook-ceph # namespace:cluster spec: # The metadata pool spec. Must use replication. 
metadataPool: replicated: size: 3 requireSafeReplicaSize: true parameters: # Inline compression mode for the data pool # Further reference: https://docs.ceph.com/docs/nautilus/rados/configuration/bluestore-config-ref/#inline-compression compression_mode: none # gives a hint (%) to Ceph in terms of expected consumption of the total cluster capacity of a given pool # for more info: https://docs.ceph.com/docs/master/rados/operations/placement-groups/#specifying-expected-pool-size #target_size_ratio: ".5" # The list of data pool specs. Can use replication or erasure coding. dataPools: - failureDomain: host replicated: size: 3 # Disallow setting pool with replica 1, this could lead to data loss without recovery. # Make sure you're *ABSOLUTELY CERTAIN* that is what you want requireSafeReplicaSize: true parameters: # Inline compression mode for the data pool # Further reference: https://docs.ceph.com/docs/nautilus/rados/configuration/bluestore-config-ref/#inline-compression compression_mode: none # gives a hint (%) to Ceph in terms of expected consumption of the total cluster capacity of a given pool # for more info: https://docs.ceph.com/docs/master/rados/operations/placement-groups/#specifying-expected-pool-size #target_size_ratio: ".5" # Whether to preserve filesystem after CephFilesystem CRD deletion preserveFilesystemOnDelete: true # The metadata service (mds) configuration metadataServer: # The number of active MDS instances activeCount: 1 # Whether each active MDS instance will have an active standby with a warm metadata cache for faster failover. # If false, standbys will be available, but will not have a warm cache. activeStandby: true # The affinity rules to apply to the mds deployment placement: # nodeAffinity: # requiredDuringSchedulingIgnoredDuringExecution: # nodeSelectorTerms: # - matchExpressions: # - key: role # operator: In # values: # - mds-node # topologySpreadConstraints: # tolerations: # - key: mds-node # operator: Exists # podAffinity: podAntiAffinity: requiredDuringSchedulingIgnoredDuringExecution: - labelSelector: matchExpressions: - key: app operator: In values: - rook-ceph-mds # topologyKey: kubernetes.io/hostname will place MDS across different hosts topologyKey: kubernetes.io/hostname preferredDuringSchedulingIgnoredDuringExecution: - weight: 100 podAffinityTerm: labelSelector: matchExpressions: - key: app operator: In values: - rook-ceph-mds # topologyKey: */zone can be used to spread MDS across different AZ # Use in k8s cluster if your cluster is v1.16 or lower # Use in k8s cluster is v1.17 or upper topologyKey: topology.kubernetes.io/zone # A key/value list of annotations annotations: # key: value # A key/value list of labels labels: # key: value resources: # The requests and limits set here, allow the filesystem MDS Pod(s) to use half of one CPU core and 1 gigabyte of memory # limits: # cpu: "500m" # memory: "1024Mi" # requests: # cpu: "500m" # memory: "1024Mi" # priorityClassName: my-priority-class mirroring: enabled: false ``` 至此,rook-ceph安装完成。我们创建pvc的时候,设置storageclass为对应的块存储或者文件存储,就可以自动创建对应pv,而且可以实现pvc的动态扩容。 #### 4.2 Harbor镜像仓库安装 ##### **1)Harbor说明** ![2104126-20201217173704535-710001277](https://gitee.com/wang5620079/mypics/raw/master//202108221540963.png) 各组件功能说明如下: - Nginx(Proxy):用于代理Harbor的registry,UI, token等服务 - db:负责储存用户权限、审计日志、Dockerimage分组信息等数据。 - UI:提供图形化界面,帮助用户管理registry上的镜像, 并对用户进行授权 - jobsevice:负责镜像复制工作的,他和registry通信,从一个registry pull镜像然后push到另一个registry,并记录job_log - Adminserver:是系统的配置管理中心附带检查存储用量,ui和jobserver启动时候回需要加载adminserver的配置。 - 
Registry:原生的docker镜像仓库,负责存储镜像文件。 - Log:为了帮助监控Harbor运行,负责收集其他组件的log,记录到syslog中 ##### **2)Harbor安装** Harbor采用helm安装 **首先添加charts仓库** ```sh helm repo add harbor https://helm.goharbor.io helm pull harbor/harbor ``` **然后,制作ingress访问使用的证书** ```sh master1节点上,创建个工作目录 mkdir install-Harbor cd install-Harbor #创建根证书,通用的,其实每个命名空间都可以用 openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls.key -out tls.cert -subj "/CN=*.my-site.com/O=*.my-site.com" #创建命名空间和secret kubectl create ns devops kubectl create secret tls my-site.com --cert=tls.cert --key=tls.key -n devops ``` **第三,创建override.yaml配置文件** Harbor用到了ceph的存储 ```sh expose: #web浏览器访问用的证书 type: ingress tls: certSource: "secret" secret: secretName: "my-site.com" notarySecretName: "harbor.my-site.com" ingress: hosts: core: harbor.my-site.com notary: notary-harbor.my-site.com externalURL: https://harbor.my-site.com internalTLS: #harbor内部组件用的证书 enabled: true certSource: "auto" persistence: enabled: true resourcePolicy: "keep" persistentVolumeClaim: registry: # 存镜像的 storageClass: "rook-ceph-block" accessMode: ReadWriteOnce size: 5Gi chartmuseum: #存helm的chart storageClass: "rook-ceph-block" accessMode: ReadWriteOnce size: 5Gi jobservice: # storageClass: "rook-ceph-block" accessMode: ReadWriteOnce size: 1Gi database: #数据库 pgsql storageClass: "rook-ceph-block" accessMode: ReadWriteOnce size: 1Gi redis: # storageClass: "rook-ceph-block" accessMode: ReadWriteOnce size: 1Gi trivy: # 漏洞扫描 storageClass: "rook-ceph-block" accessMode: ReadWriteOnce size: 5Gi metrics: enabled: true ``` **最后,应用所有配置** ```sh helm install itharbor ./ -f values.yaml -f override.yaml -n devops ``` ##### **3)访问测试** 在宿主的windows主机hostswen文件中添加一个映射: ```sh 192.168.100.63 harbor.my-site.com ``` 然后访问https://harbor.my-site.com/ 默认的用户名是admin,密码是Harbor12345 截图如下: ![image-20210822155418060](https://gitee.com/wang5620079/mypics/raw/master//202108221554295.png) ![image-20210822155458571](https://gitee.com/wang5620079/mypics/raw/master//202108221554811.png) (上图中的仓库是我后期创建的) 至此,Harbor安装完成 ##### **4)排坑——Harbor无法重启的坑** helm安装的Harbor有个大坑,Harbor安装完成后,如果集群重启,Harbor服务会重启不了,报错。 **Harbor的pg数据库的sts数据库总是重启不成功,kubectl logs 提示:“/var/lib/postgresql/data/pgdata/pg13权限不对.** 这个时候,就要改Harbor的配置文件。 **harbor 的charts中,templates下的database-sts.yaml中,这一句修改目录权限设置的不够,要chown -R 否则重启后,报权限不够。** ![QQ图片20210627170454](https://gitee.com/wang5620079/mypics/raw/master//202108221559102.png) #### 4.3 Jenkins安装 Jenkins采用手动安装方式,并且使用ceph作为存储 ##### **1) Jenkins安装配置文件** ```yaml apiVersion: apps/v1 kind: StatefulSet metadata: name: jenkins namespace: devops spec: selector: matchLabels: app: jenkins # has to match .spec.template.metadata.labels serviceName: "jenkins" replicas: 1 template: metadata: labels: app: jenkins # has to match .spec.selector.matchLabels spec: terminationGracePeriodSeconds: 10 containers: - name: jenkins image: jenkinsci/blueocean:1.24.7 securityContext: runAsUser: 0 #设置以ROOT用户运行容器 privileged: true #拥有特权 ports: - containerPort: 8080 name: web - name: jnlp #jenkins slave与集群的通信口 containerPort: 50000 resources: limits: memory: 2Gi cpu: "2000m" requests: memory: 700Mi cpu: "500m" env: - name: LIMITS_MEMORY valueFrom: resourceFieldRef: resource: limits.memory divisor: 1Mi - name: "JAVA_OPTS" #设置变量,指定时区和 jenkins slave 执行者设置 value: " -Xmx$(LIMITS_MEMORY)m -XshowSettings:vm -Dhudson.slaves.NodeProvisioner.initialDelay=0 -Dhudson.slaves.NodeProvisioner.MARGIN=50 -Dhudson.slaves.NodeProvisioner.MARGIN0=0.85 -Duser.timezone=Asia/Shanghai " volumeMounts: - name: home mountPath: /var/jenkins_home volumeClaimTemplates: - metadata: 
name: home spec: accessModes: [ "ReadWriteOnce" ] storageClassName: "rook-ceph-block" resources: requests: storage: 5Gi --- apiVersion: v1 kind: Service metadata: name: jenkins namespace: devops spec: selector: app: jenkins type: ClusterIP ports: - name: web port: 8080 targetPort: 8080 protocol: TCP - name: jnlp port: 50000 targetPort: 50000 protocol: TCP --- apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: jenkins namespace: devops spec: tls: - hosts: - jenkins.my-site.com secretName: my-site.com rules: - host: jenkins.my-site.com http: paths: - path: / pathType: Prefix backend: service: name: jenkins port: number: 8080 ``` **访问jenkins** ```sh #先在jenkins.my-site.com中添加一个映射关系: #192.168.100.63 jenkins.my-site.com #然后kubectl logs jenkins的pod -n devops #在日志中会看到初始登录密码 ``` ![image-20210822161402306](https://gitee.com/wang5620079/mypics/raw/master//202108221614544.png) 登录进去之后,按照提示,安装各种插件。然后就能进去主目录 ![image-20210822161520734](https://gitee.com/wang5620079/mypics/raw/master//202108221615968.png) #### 4.4 Jenkins与K8s整合 Jenkins可以与k8s通过插件进行集成。充分利用k8s的自动伸缩能力,来实现动态slave功能。示意如下: ![1622192953793](https://gitee.com/wang5620079/mypics/raw/master//202108221620813.png) ##### **1)安装核心插件** 安装如下几个核心插件 ```sh - kubernetes - docker - git ``` ![image-20210822162308706](https://gitee.com/wang5620079/mypics/raw/master//202108221623943.png) ![image-20210822162235132](https://gitee.com/wang5620079/mypics/raw/master//202108221622367.png) ![image-20210822162339449](https://gitee.com/wang5620079/mypics/raw/master//202108221623677.png) ##### **2)Jenkins的k8s配置** 点击《系统管理》—>《Configure System》—>《配置一个云》—>《kubernetes》,如下: ![image-20210822162507320](https://gitee.com/wang5620079/mypics/raw/master//202108221625544.png) ![1622196753966](https://gitee.com/wang5620079/mypics/raw/master//202108221625364.png) ##### **3)k8s的打包机配置** > 注意: > > 这里所说的打包机,就是以pod的形式运行的Jenkins的agent。 > > Jenkins的Agent大概分两种。 一是基于SSH的,需要把Master的SSH公钥配置到所有的Agent宿主机上去。 二是基于JNLP的,走HTTP协议,每个Agent需要配置一个独特的密码。 基于SSH的,可以由Master来启动;基于JNLP的,需要自己启动。 > > 本文所述的agent采用的是jnlp形式。这个agent是通过jnlp动态运行起来的pod。 > > jenkins的所有构建命令会在这个pod里面运行 > > - 注意配置以下内容 > - 名称 > - 命名空间 > - 标签列表 > - 容器名称、镜像 > - serviceAccount挂载项 > - `运行命令`: 改为 `jenkins-slave` 笔者配置的agent有以下几个 | slave-label | 镜像 | 集成工具 | | :---------: | ------------------------------------------------------------ | ------------------------------------------------------- | | maven | registry.cn-beijing.aliyuncs.com/wang5620079/jenkins:jnlp-maven-3.6.3 | jq、curl、maven | | nodejs | registry.cn-beijing.aliyuncs.com/wang5620079/jenkins:jnlp-nodejs-14.16.1 | jq、curl、nodejs、npm(已经设置全局目录在 /root/npm下) | | kubectl | registry.cn-beijing.aliyuncs.com/wang5620079/jenkins:jnlp-kubectl-1.21.1 | kubectl、helm、helm-push、jq、curl | | docker | registry.cn-beijing.aliyuncs.com/wang5620079/jenkins:jnlp-docker-20.10.2 | jq、curl、docker | - **maven打包机配置** **准备工作** > 创建一个名为maven-conf的configmap,用于保存maven的settings.xml ![image-20210822170529451](https://gitee.com/wang5620079/mypics/raw/master//202108221706848.png) 创建过程如下: settings.xml的内容如下,`注意localRepository的路径配置`。 ```xml /root/maven/.m2 alimaven aliyun-maven http://maven.aliyun.com/nexus/content/groups/public/; aliyun-repo office-repo office-central http://repo1.maven.org/maven2 true true office-central http://repo1.maven.org/maven2 true true aliyun-repo aliyun-central qcloud mirror central http://maven.aliyun.com/nexus/content/groups/public true true aliyun--plugin-central http://maven.aliyun.com/nexus/content/groups/public true true aliyun-repo ``` **然后创建configmap** ```sh kubectl create configmap 
maven-config.conf --from-file=config=/root/settings.xml -n devops ``` **另外,还需创建一个`存储maven下载的jar包的pvc`** ```sh vi maven-jar-pvc.yaml apiVersion: v1 kind: PersistentVolumeClaim metadata: name: maven-jar-pvc namespace: devops labels: app: maven-jar-pvc spec: storageClassName: rook-cephfs accessModes: - ReadWriteMany resources: requests: storage: 5Gi ``` **最后,配置maven打包机** ![image-20210822165634413](https://gitee.com/wang5620079/mypics/raw/master//202108221656643.png) ![image-20210822165807422](https://gitee.com/wang5620079/mypics/raw/master//202108221658656.png) ![image-20210822165846645](https://gitee.com/wang5620079/mypics/raw/master//202108221658878.png) ![image-20210822165925136](https://gitee.com/wang5620079/mypics/raw/master//202108221659364.png) 至此,maven打包机创建完成。 - **kubectl打包机配置** ![image-20210822171416286](https://gitee.com/wang5620079/mypics/raw/master//202108221714525.png) ![image-20210822171438744](https://gitee.com/wang5620079/mypics/raw/master//202108221714977.png) ![image-20210822171503071](https://gitee.com/wang5620079/mypics/raw/master//202108221715303.png) - **nodejs打包机** ![image-20210822171638410](https://gitee.com/wang5620079/mypics/raw/master//202108221716638.png) ![image-20210822171657669](https://gitee.com/wang5620079/mypics/raw/master//202108221716903.png) ![image-20210822171715598](https://gitee.com/wang5620079/mypics/raw/master//202108221717829.png) ![image-20210822171734774](https://gitee.com/wang5620079/mypics/raw/master//202108221717004.png) - **docker打包机** ![image-20210822171915599](https://gitee.com/wang5620079/mypics/raw/master//202108221719822.png) ![image-20210822171934727](https://gitee.com/wang5620079/mypics/raw/master//202108221719950.png) ![image-20210822172210823](https://gitee.com/wang5620079/mypics/raw/master//202108221722053.png) ###### **4)流水线测试** 新建一个流水线项目 ![image-20210822172940332](https://gitee.com/wang5620079/mypics/raw/master//202108221729486.png) 配置流水线定义如下 ![image-20210822173020749](https://gitee.com/wang5620079/mypics/raw/master//202108221730977.png) pipline的内容如下: ```sh pipeline { //无代理,各阶段声明自己的代理 agent none stages { stage('检查nodejs打包机') { //使用nodejs代理 agent { label 'nodejs' } steps { echo "nodejs版本:" sh 'node -v' echo "npm modules目录位置" sh 'npm config ls -l | grep prefix' echo "检查完成..." 
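                // (note) each stage declares its own agent label, so Jenkins provisions a
                // separate JNLP agent pod for that stage and tears it down when the stage ends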
} } stage('检查maven打包机') { //使用nodejs代理 agent { label 'maven' } steps { echo "maven版本:" sh 'mvn -v' echo "maven配置文件" sh 'cat /app/maven/settings.xml' echo "maven目录位置信息" sh 'ls -al /app/maven/' } } stage('检查docker打包机') { //使用nodejs代理 agent { label 'docker' } steps { echo "docker版本:" sh 'docker version' sh 'docker images' } } stage('检查kubectl打包机') { //使用nodejs代理 agent { label 'kubectl' } steps { echo "kubectl版本:" sh ' kubectl version' echo "kubectl操作集群: 所有Pod" sh 'kubectl get pods' echo "kubectl操作集群: 所有nodes" sh 'kubectl get nodes' } } } } ``` 打开blue ocean进行构建测试 ![image-20210822173145325](https://gitee.com/wang5620079/mypics/raw/master//202108221731575.png) 我们可以看到构建过程运行了 ![image-20210822173618168](https://gitee.com/wang5620079/mypics/raw/master//202108221736399.png) 构建过程中,Jenkins会动态创建pod用于构建,以nodejs打包机为例,截图如下: ![image-20210822173712376](https://gitee.com/wang5620079/mypics/raw/master//202108221737474.png) ### 五、Kubernetes书城项目的自动构建配置说明 本部分以研发测试环境说明Kubernetes书城项目的自动化构建过程,为啥以研发测试环境为例说明呢?因为生产环境还没鼓捣完![img](https://gitee.com/wang5620079/mypics/raw/master//202108221749438.png) #### 5.1 Jenkins增加时间戳插件 构建过程中,为了镜像的tag命名方便,笔者引用了一个Jenkins插件Zentimestamp。 ![image-20210822180400368](https://gitee.com/wang5620079/mypics/raw/master//202108221804616.png) 插件安装完成后,可以设置全局时间格式 ![image-20210822180523326](https://gitee.com/wang5620079/mypics/raw/master//202108221805571.png) 这样就可以在Jenkins中以全局变量的形式,使用$BUILD_TIMESTAMP这个环境变量了,就省了多次构建时为镜像设置tag的过程(我承认我懒了……)。 #### 5.2 前端UI部分自动化构建部署 > 前端项目命名为:mybookstore-front > > 对应代码地址:https://gitee.com/wang5620079/mybookstore-front.git > > 前端部分,vue编译的js、html等静态文件要部署于nginx容器中运行,nginx实现动静分类。Dockerfile配置用于生成nginx的镜像,Jenkinsfile-dev是流水线文件。下面分文件详述。 ##### **1)nginx.conf文件** nginx文件内容如下: ```shell server { listen 80; #测试环境主机ip地址作为host,未来还会改成ingress域名访问 server_name 192.168.100.63; #charset koi8-r; access_log /var/log/nginx/host.access.log main; error_log /var/log/nginx/error.log error; #动静分离 location / { root /usr/share/nginx/html; index index.html index.htm; try_files $uri $uri/ /index.html; } #使用集群内的mybookstore-app-svc服务,利用k8s的service域名解析和k8s的service的负载均衡能力,将压力分散到应用层的各个pod中 location /api/ { proxy_pass http://mybookstore-app-svc.dev.svc.cluster.local:5000; } #error_page 404 /404.html; # redirect server error pages to the static page /50x.html # error_page 500 502 503 504 /50x.html; location = /50x.html { root /usr/share/nginx/html; } } ``` 配置文件需要注意以下几点: - **server_name应该和访问域名匹配**,否则会导致服务无法访问。这里目前测试环境还是用的节点ip地址直接访问,未来要配置ingress访问,这里还要改。 - 动静分离目录要与Dockerfile中的动静分离目录一致。 - api的url解析,利用了k8s集群的service域名解析能力。在集群内,mybookstore-app的service对应的域名就是`mybookstore-app-svc.dev.svc.cluster.local` ##### **2)Dockerfile配置** ```sh #使用一个比较小的alpine版本基础镜像 FROM nginx:stable-alpine RUN rm /etc/nginx/conf.d/default.conf ADD nginx.conf /etc/nginx/conf.d/default.conf #这里要注意,目标目录与nginx.conf中保持一致 COPY dist/ /usr/share/nginx/html/ ``` 前端编译构建出来的静态文件在dist目录中,COPY dist/ /usr/share/nginx/html/这句话就是把编译构建的文件放到静态文件目录中。 ##### **3)jenkinsfile-dev** 具体文件内容见下面: ```shell pipeline { //无代理,各阶段声明自己的代理 agent none //用时间戳作为镜像版本 environment { tsversion = "v${BUILD_TIMESTAMP}" appname="mybookstore-front" svcname="${appname}-svc" namespace="dev" repo="mybookstore-dev" replicas=1 port=80 } stages { stage('开始打包镜像') { agent { label 'nodejs' } steps { echo "检查环境" sh 'ls -l' echo "安装依赖" sh 'npm install --registry=https://registry.npm.taobao.org' echo "开始编译" sh 'npm run build' echo "检查编译情况" sh 'cd dist && ls -l' echo "stash文件" //注意这里使用的是dist/**,要递归把所有文件都stash的 stash includes: 'dist/**', name: 'dist' } } stage('打包并上传镜像') { //使用nodejs代理 agent { label 
'docker' } steps { echo "unstash" unstash 'dist' sh "docker build -t ${appname}:${tsversion} -f Dockerfile ." echo "检查镜像是否打包完成" sh 'docker images|grep mybookstore' echo "上传镜像" sh 'docker login -u admin -p Harbor12345 harbor.my-site.com' sh "docker tag ${appname}:${tsversion} harbor.my-site.com/${repo}/${appname}:${tsversion}" sh "docker push harbor.my-site.com/${repo}/${appname}:${tsversion}" } } stage('kubectl 部署') { //使用nodejs代理 agent { label 'kubectl' } steps { echo "kubectl 部署deployment" sh """cat - < deployment.yaml apiVersion: apps/v1 kind: Deployment metadata: labels: app: ${appname} version: ${tsversion} name: ${appname} namespace: ${namespace} spec: replicas: ${replicas} selector: matchLabels: app: ${appname} template: metadata: labels: app: ${appname} spec: containers: - image: harbor.my-site.com/${repo}/${appname}:${tsversion} name: ${appname} ports: - containerPort: ${port} EOF""" sh 'ls -l' sh 'cat deployment.yaml' sh "kubectl apply -f deployment.yaml" echo "kubectl 部署nodeport的svc" sh """cat - < service.yaml apiVersion: v1 kind: Service metadata: labels: app: ${svcname} name: ${svcname} namespace: ${namespace} spec: ports: - port: ${port} protocol: TCP targetPort: ${port} selector: app: ${appname} type: NodePort EOF""" sh "kubectl apply -f service.yaml" } } } } ``` 这里有几个关键点: - tsversion = "v${BUILD_TIMESTAMP}"设置时间戳为镜像的版本,还有deployment的version 标签值。注意:deployment的标签值不能是纯数字,所以tsversion 在${BUILD_TIMESTAMP}前面加了个v字。 - 注意每个stage的agent配置,实际就是用的前述配置的各个不同的打包机。 - 打包并上传镜像阶段,npm构建使用了淘宝加速,将构建好的镜像上传到harbor.my-site.com镜像仓库中。这里写了明文密码。实际生产中,可以使用credentials来避免明文密码。本项目生产环境配置中也会使用credentials。 - **开始打包镜像和打包并上传镜像这两个阶段,使用stash和unstash用于在不同的阶段传递文件。因为这两个阶段是在不同的pod中运行的,很可能这两个pod都不在同一台机器上。故需要stash和unstash机制进行文件传递。** ##### **4)流水线项目创建** 创建流水线项目如下图: ![image-20210822182657336](https://gitee.com/wang5620079/mypics/raw/master//202108221826504.png) ![image-20210822182844683](https://gitee.com/wang5620079/mypics/raw/master//202108221828941.png) 注意配置流水线的url,分支为dev分支,Jenkinsfile文件名为Jenkinsfile-dev ##### **5)构建测试** 运行流水线项目,看到如下输出: ![image-20210822183833549](https://gitee.com/wang5620079/mypics/raw/master//202108221838794.png) ![image-20210822183855193](https://gitee.com/wang5620079/mypics/raw/master//202108221838439.png) ![image-20210822183937475](https://gitee.com/wang5620079/mypics/raw/master//202108221839723.png) ![image-20210822184024028](https://gitee.com/wang5620079/mypics/raw/master//202108221840273.png) 集群中dev命名空间下也有了对应的pod、deploy和svc ![img](https://gitee.com/wang5620079/mypics/raw/master//202108221844839.PNG) ##### **6)排坑——antdesign vue pro的坑** antdesign vue pro默认的vue版本依赖为2.6.14,如下package.json图 ![image-20210822204347566](https://gitee.com/wang5620079/mypics/raw/master//202108222043681.png) 但是在permission.js中引用了vuerouter,且为较高版本的api使用,需要进行修改。 ![image-20210822204441547](https://gitee.com/wang5620079/mypics/raw/master//202108222044682.png) 如果不这样修改,编译后的版本,如果放到容器中运行,会报错,如下图: ![Image](https://gitee.com/wang5620079/mypics/raw/master//202108222045525.png) ##### **7)排坑——jenkins发布ingress** **20210825更新:** 最新的front项目,增加了自动发布ingress的代码,如下: ```sh sh """cat - < ingress.yaml apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: ${ingname} namespace: ${namespace} annotations: nginx.ingress.kubernetes.io/rewrite-target: / spec: tls: - hosts: - ${inghost} #借用了一个secret,创建secret过程请见前述内容 secretName: ${secretname} rules: - host: ${inghost} http: paths: - path: / pathType: Prefix backend: service: name: ${svcname} port: number: ${port} EOF""" sh "kubectl apply -f ingress.yaml" ``` 
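Before relying on this stage, it is worth probing whether the ServiceAccount that the kubectl agent runs under is allowed to manage Ingress resources at all. A minimal check, assuming the `jenkins` ServiceAccount in the `devops` namespace used elsewhere in this document and the `dev` target namespace (substitute the account actually reported in the error message below if it differs):

```sh
# impersonate the agent's ServiceAccount and ask the API server about Ingress permissions
kubectl auth can-i list ingresses --as=system:serviceaccount:devops:jenkins -n dev
kubectl auth can-i create ingresses --as=system:serviceaccount:devops:jenkins -n dev
# both commands should print "yes" once the ClusterRole shown below has been applied
```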
jenkins发布ingress后,会报“User:system:serviceaccount.dev.jenkins”没有list ingress的权限,在jenkins的heml仓库中,找到install-jenkis.yaml,增加clusterrole对应权限即可: ```sh apiVersion: apps/v1 kind: StatefulSet metadata: name: jenkins namespace: devops spec: selector: matchLabels: app: jenkins # has to match .spec.template.metadata.labels serviceName: "jenkins" replicas: 1 template: metadata: labels: app: jenkins # has to match .spec.selector.matchLabels spec: serviceAccountName: "jenkins" terminationGracePeriodSeconds: 10 containers: - name: jenkins image: jenkinsci/blueocean:1.24.7 securityContext: runAsUser: 0 #设置以ROOT用户运行容器 privileged: true #拥有特权 ports: - containerPort: 8080 name: web - name: jnlp #jenkins slave与集群的通信口 containerPort: 50000 resources: limits: memory: 2Gi cpu: "2000m" requests: memory: 700Mi cpu: "500m" env: - name: LIMITS_MEMORY valueFrom: resourceFieldRef: resource: limits.memory divisor: 1Mi - name: "JAVA_OPTS" #设置变量,指定时区和 jenkins slave 执行者设置 value: " -Xmx$(LIMITS_MEMORY)m -XshowSettings:vm -Dhudson.slaves.NodeProvisioner.initialDelay=0 -Dhudson.slaves.NodeProvisioner.MARGIN=50 -Dhudson.slaves.NodeProvisioner.MARGIN0=0.75 -Duser.timezone=Asia/Shanghai " volumeMounts: - name: home mountPath: /var/jenkins_home volumeClaimTemplates: - metadata: name: home spec: accessModes: [ "ReadWriteOnce" ] storageClassName: "rook-ceph-block" resources: requests: storage: 5Gi --- apiVersion: v1 kind: Service metadata: name: jenkins namespace: devops spec: selector: app: jenkins type: ClusterIP ports: - name: web port: 8080 targetPort: 8080 protocol: TCP - name: jnlp port: 50000 targetPort: 50000 protocol: TCP --- apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: jenkins namespace: devops spec: tls: - hosts: - jenkins.my-site.com secretName: my-site.com rules: - host: jenkins.my-site.com http: paths: - path: / pathType: Prefix backend: service: name: jenkins port: number: 8080 --- apiVersion: v1 kind: ServiceAccount metadata: name: jenkins namespace: devops --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: name: jenkins rules: - apiGroups: ["extensions", "apps"] resources: ["deployments"] verbs: ["create", "delete", "get", "list", "watch", "patch", "update"] - apiGroups: ["extensions","networking.k8s.io"] #这里是关键 resources: ["ingresses"] verbs: ["create", "delete", "get", "list", "watch", "patch", "update"] - apiGroups: [""] resources: ["services"] verbs: ["create", "delete", "get", "list", "watch", "patch", "update"] - apiGroups: [""] resources: ["pods"] verbs: ["create","delete","get","list","patch","update","watch"] - apiGroups: [""] resources: ["pods/exec"] verbs: ["create","delete","get","list","patch","update","watch"] - apiGroups: [""] resources: ["pods/log"] verbs: ["get","list","watch"] - apiGroups: [""] resources: ["secrets"] verbs: ["get"] --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: jenkins roleRef: kind: ClusterRole name: jenkins apiGroup: rbac.authorization.k8s.io subjects: - kind: ServiceAccount name: jenkins namespace: devops ``` 效果图如下: ![image-20210825175017143](https://gitee.com/wang5620079/mypics/raw/master//202108251750314.png) #### 5.2 server端自动化构建部署 server端应用配置包含mybookstore-app微服务(应用层统一接入微服务)、mybookstore-book微服务、mybookstore-user微服务、mybookstore-order微服务、mybookstore-other微服务。下面以mybookstore-app微服务为例记录说明。 ##### **1)config-prod.yaml、config-dev.yaml配置文件** 这两个配置文件,为生产测试的配置文件。以config-dev.yaml为例 ```yaml #版本 version: "v1.0.0" #日志级别 logging: gloableLogLevel: DEBUG fileLogLevel: INFO consoleLogLevel: DEBUG #日志输出路径,只在linux环境下有效 
logDir: "/data/logs" #数据库配置,生产环境将会用endpoint+svc形式将mysql服务引入集群内部,从而实现域名访问mysql。 dbConfig: username: "root" password: "123456" host: "192.168.3.80" port: 3306 dbname: "mybookstore_dev" #redis配置,生产环境将实现redis容器化部署,到时候会以集群内域名直接访问形式部署。 redisConfig: username: "" password: "" host: "192.168.100.70" port: 6379 #各个中台微服务配置,这里记录的是微服务对应sercie的集群内地址。 microservices: book: urls: 'book' host: "mybookstore-book-svc.dev.svc.cluster.local" port: 6000 user: urls: 'user,auth' host: "mybookstore-user-svc.dev.svc.cluster.local" port: 7000 order: urls: 'order' host: "mybookstore-order-svc.dev.svc.cluster.local" port: 8000 other: urls: 'other' host: "mybookstore-other-svc.dev.svc.cluster.local" port: 9000 loginConfig: SECRET_KEY: "MY_SECRET_KEY" ``` ##### **2)Dockerfile文件** Dockerfile文件用于镜像生成,说明如下。 ```dockerfile #采用官方python3.7镜像 FROM python:3.7.11-stretch WORKDIR /mybookstore #复制pip依赖文件 COPY . /mybookstore/ #修正容器内的时区问题 COPY Shanghai /etc/localtime #安装python依赖 RUN ["pip","install", "-r","requirements.txt","-i","https://pypi.tuna.tsinghua.edu.cn/simple"] #暴露服务端口 EXPOSE 5000 #为run.sh文件增加执行权限 RUN chmod a+x run.sh #容器启动执行run.sh ENTRYPOINT ["./run.sh"] ``` ##### **3)run.sh容器启动运行文件** ```sh #!/bin/bash set -e #gunicorn日志目录 logdir=/data/logs/gunicorn if [ ! -d $logdir ] then mkdir -p $logdir fi # gunicorn启动命令,注意这里的端口与上述Dockerfile中的端口要一致 gunicorn --preload mybookstore-app:app \ --bind 0.0.0.0:5000 \ --workers 4 \ --log-level debug \ --access-logfile=$logdir/access_print.log \ --error-logfile=$logdir/error_print.log exec "$@" ``` python应用,使用gunicorn作为WSGI UNIX HTTP Server容器。run.sh中的端口必须与Dockerfile中相同。 ##### **4)Jenkins-dev文件** ```shell pipeline { //无代理,各阶段声明自己的代理 agent none //用时间戳作为镜像版本 environment { tsversion = "v${BUILD_TIMESTAMP}" appname="mybookstore-app" svcname="${appname}-svc" namespace="dev" repo="mybookstore-dev" replicas=1 port=5000 } stages { stage('开始打包镜像') { agent { label 'docker' } steps { echo "检查环境变量" sh 'printenv' echo "检查tags" sh 'git tag' echo "docker版本:" sh 'docker version' echo "打包镜像" sh "docker build -t ${appname}:${tsversion} -f Dockerfile ." 
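                // (note) tsversion is "v" + BUILD_TIMESTAMP from the Zentimestamp plugin,
                // so every run builds and pushes an image with a unique, time-based tag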
echo "检查镜像是否打包完成" sh 'docker images|grep mybookstore' echo "上传镜像" sh 'docker login -u admin -p Harbor12345 harbor.my-site.com' sh "docker tag ${appname}:${tsversion} harbor.my-site.com/${repo}/${appname}:${tsversion}" sh "docker push harbor.my-site.com/${repo}/${appname}:${tsversion}" } } stage('kubectl 部署') { //使用nodejs代理 agent { label 'kubectl' } steps { echo "kubectl 部署deployment" sh """cat - < deployment.yaml apiVersion: apps/v1 kind: Deployment metadata: labels: app: ${appname} version: ${tsversion} name: ${appname} namespace: ${namespace} spec: replicas: ${replicas} selector: matchLabels: app: ${appname} template: metadata: labels: app: ${appname} spec: volumes: - name: log-volume hostPath: path: /data/${appname}/logs containers: - image: harbor.my-site.com/${repo}/${appname}:${tsversion} name: ${appname} ports: - containerPort: ${port} volumeMounts: - name: log-volume mountPath: /data/logs EOF""" sh 'ls -l' sh 'cat deployment.yaml' sh "kubectl apply -f deployment.yaml" echo "kubectl 部署svc" sh """cat - < service.yaml apiVersion: v1 kind: Service metadata: labels: app: ${svcname} name: ${svcname} namespace: ${namespace} spec: ports: - port: ${port} protocol: TCP targetPort: ${port} selector: app: ${appname} EOF""" sh "kubectl apply -f service.yaml" } } } } ``` Jenkinsfile-dev与前述前端项目类似,在此不做赘述。 ##### **5)流水线运行效果** 效果截图如下 启动构建后,会创建docker镜像,用于镜像打包 ![image-20210822205548995](https://gitee.com/wang5620079/mypics/raw/master//202108222057993.png) 此时可以看到在进行镜像打包 ![image-20210822210932705](https://gitee.com/wang5620079/mypics/raw/master//202108222109960.png) 运行到kubectl部署服务阶段,会生成kubectl镜像 ![image-20210822210206428](C:\Users\wp\AppData\Roaming\Typora\typora-user-images\image-20210822210206428.png) 此时可以看到开始进行kubectl部署阶段 ![](https://gitee.com/wang5620079/mypics/raw/master//202108222109960.png) 可以看到app服务成功部署 ![image-20210822211050149](https://gitee.com/wang5620079/mypics/raw/master//202108222110242.png) #### 5.3 最终效果 将所有微服务配置好,如下图: ![image-20210822211136310](https://gitee.com/wang5620079/mypics/raw/master//202108222111506.png) 最终,可以实现整个服务部署在k8s集群中。各个服务多次升级后,会对应多个replicaset。 ![image-20210822211318491](https://gitee.com/wang5620079/mypics/raw/master//202108222113625.png) 访问front服务对应的nodeport,可以实现服务的访问 ![image-20210822211802253](https://gitee.com/wang5620079/mypics/raw/master//202108222118503.png) ### 六、监控与日志采集 本项目监控包含2方面的部署,一是prometheus监控,一是elk日志监控。 #### 6.1 prometheus监控部署 prometheus的安装部署,采用helm进行配置,步骤如下: ##### 1)增加prometheus仓库 ```sh helm repo add prometheus-community https://prometheus-community.github.io/helm-charts helm repo update helm pull prometheus-community/kube-prometheus-stack --version 16.0.0 ``` ##### 2)修改配置文件 ```sh vi override.yaml #直接Ingress部署的 alertmanager: ingress: enabled: true ingressClassName: nginx hosts: - alertmanager.my-site.com paths: - / pathType: Prefix tls: - secretName: my-site.com hosts: - alertmanager.my-site.com grafana: enabled: true defaultDashboardsEnabled: true adminPassword: Admin123456 ingress: enabled: true hosts: - grafana.my-site.com path: / pathType: Prefix tls: - secretName: my-site.com hosts: - grafana.my-site.com prometheus: ingress: enabled: true hosts: [prometheus.my-site.com] paths: - / pathType: Prefix tls: - secretName: my-site.com hosts: - prometheus.my-site.com additionalPodMonitors: - name: registry.cn-beijing.aliyuncs.com/wang5620079/k8s:kube-state-metrics-v2.0.0 ``` ##### 3)prometheus安装 ```sh kubectl create ns monitor helm install -f values.yaml -f override.yaml prometheus-stack ./ -n monitor ``` 安装完成后,monitor命名空间出现下面的pod 
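The same listing can be pulled from the command line as a quick health check (a sketch; the release name, namespace and Ingress hosts follow the override.yaml above):

```sh
# pods created by the prometheus-stack release: typically the operator, alertmanager,
# grafana, node-exporter and kube-state-metrics workloads
kubectl get pods -n monitor

# the three Ingress hosts (alertmanager / grafana / prometheus under *.my-site.com)
# configured in override.yaml
kubectl get ingress -n monitor
```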
![image-20210830162315951](https://gitee.com/wang5620079/mypics/raw/master//202108301623059.png) ##### 4)监控页面及效果 我们在https://grafana.com/grafana/dashboards中找到最火的kubernetes面板,编号是13105,导入到grafana中,可以看到下图的情况 ![image-20210830152412990](https://gitee.com/wang5620079/mypics/raw/master//202108301524192.png) ![image-20210830152449570](https://gitee.com/wang5620079/mypics/raw/master//202108301524776.png) #### 6.2 elk日志监控部署 elk环境,采用7.3.2版本,为了方便配置,没有启用xpack安全组件,但是启用了xpack的monitor组件,用于系统的监控。 ##### 1)Elasticsearch高可用环境部署 Elasticsearch高可用环境使用k8s的sts有状态服务进行配置,并配置pod的反亲和性,目的是使es可以使用ceph的存储,并用反亲和性使得es的pod分布在不同的主机上,防止某一节点宕机引起整个集群宕机。 es和kibana采用同一个yaml配置,配置如下: ```yaml --- apiVersion: v1 kind: Service metadata: name: es-cluster-node namespace: monitor spec: clusterIP: None selector: app: es-cluster ports: - name: transport port: 9300 protocol: TCP --- # cluster ip apiVersion: v1 kind: Service metadata: name: es-cluster-external namespace: monitor spec: selector: app: es-cluster ports: - name: http port: 9200 targetPort: 9200 type: ClusterIP --- apiVersion: v1 kind: ConfigMap metadata: name: es-config namespace: monitor data: elasticsearch.yml: | node.name: ${HOSTNAME} cluster.name: my-elastic-cluster cluster.initial_master_nodes: ["esnode-0"] network.host: "0.0.0.0" bootstrap.memory_lock: false discovery.zen.ping.unicast.hosts: ["esnode-0.es-cluster-node.monitor.svc.cluster.local","esnode-1.es-cluster-node.monitor.svc.cluster.local","esnode-2.es-cluster-node.monitor.svc.cluster.local"] discovery.zen.minimum_master_nodes: 1 xpack.security.enabled: false xpack.monitoring.enabled: true --- apiVersion: apps/v1 kind: StatefulSet metadata: name: esnode namespace: monitor labels: app: es-cluster spec: serviceName: es-cluster-node selector: matchLabels: app: es-cluster replicas: 3 updateStrategy: type: RollingUpdate template: metadata: labels: app: es-cluster spec: containers: - name: elasticsearch #resources: # requests: # memory: 300Mi # cpu: 0.01 # limits: # memory: 1.5Gi # cpu: 1 securityContext: privileged: true runAsUser: 1000 capabilities: add: - IPC_LOCK - SYS_RESOURCE image: harbor.my-site.com/elk/elasticsearch:7.3.2 imagePullPolicy: IfNotPresent env: - name: ES_JAVA_OPTS value: "-Xms2g -Xmx2g" readinessProbe: httpGet: scheme: HTTP path: /_cluster/health?local=true port: 9200 initialDelaySeconds: 5 ports: - containerPort: 9200 name: es-http - containerPort: 9300 name: es-transport volumeMounts: - name: es-data mountPath: /usr/share/elasticsearch/data - name: elasticsearch-config mountPath: /usr/share/elasticsearch/config/elasticsearch.yml subPath: elasticsearch.yml affinity: #利用pod亲和性,使得es的节点分布在不同主机上 podAntiAffinity: requiredDuringSchedulingIgnoredDuringExecution: - labelSelector: matchExpressions: - key: app operator: In values: - elasticsearch topologyKey: "kubernetes.io/hostname" volumes: - name: elasticsearch-config configMap: name: es-config items: - key: elasticsearch.yml path: elasticsearch.yml volumeClaimTemplates: - metadata: name: es-data spec: accessModes: [ "ReadWriteOnce" ] resources: requests: storage: 5Gi storageClassName: "rook-ceph-block" --- #安装kibana apiVersion: apps/v1 kind: Deployment metadata: name: kibana namespace: monitor labels: k8s-app: kibana spec: replicas: 1 selector: matchLabels: k8s-app: kibana template: metadata: labels: k8s-app: kibana spec: containers: - name: kibana image: harbor.my-site.com/elk/kibana:7.3.2 #resources: # limits: # cpu: 1 # memory: 500Mi # requests: # cpu: 0.5 # memory: 200Mi env: - name: ELASTICSEARCH_HOSTS value: 
http://es-cluster-external.monitor.svc.cluster.local:9200 - name: I18N_LOCALE value: zh-CN ports: - containerPort: 5601 name: ui protocol: TCP --- apiVersion: v1 kind: Service metadata: name: kibana namespace: monitor spec: type: NodePort ports: - port: 5601 protocol: TCP targetPort: ui nodePort: 30601 selector: k8s-app: kibana ``` 部署好后,效果如下: ![image-20210830155357155](https://gitee.com/wang5620079/mypics/raw/master//202108301553269.png) 访问节点的30601端口,可以访问到kibana的页面 ![image-20210830170312652](https://gitee.com/wang5620079/mypics/raw/master//202108301703817.png) ##### 2)logstash日志解析配置 logstash用于日志的解析,并将数据传入ES集群中。logstash日志解析配置包含2个部分,一个是解析配置conf配置,命名为mybookstore.conf,另一个是logstash本身的运行配置,命名为mybookstore.yml 。这两个配置文件写到一个k8s的cm中。 ```sh vi mybookstore.conf #制作镜像用的配置文件 input { beats{ port => 5044 } } filter { grok { match => { "message" => "%{TIMESTAMP_ISO8601:time} - (?([\w\.]+))\[(?([\w\:]+))\] - %{LOGLEVEL:loglevel}:%{GREEDYDATA:msg}" } } date { match => ["time", "ISO8601", "dd/MM/yyyy:HH:mm:ss","yyyy-MM-dd HH:mm:ss", "yyyy-MM-dd HH:mm:ss,SSS", "yyyy-MM-dd HH:mm:ss.SSS", "yyyy-MM-dd HH:mm:ss.SSSS"] target => ["@timestamp"] timezone => "Asia/Shanghai" } date { match => ["time", "ISO8601", "dd/MM/yyyy:HH:mm:ss","yyyy-MM-dd HH:mm:ss", "yyyy-MM-dd HH:mm:ss,SSS", "yyyy-MM-dd HH:mm:ss.SSS", "yyyy-MM-dd HH:mm:ss.SSSS"] target => ["time"] } mutate { add_field => { #设置目标index名称 "target_index" => "%{[fields][appname]}-%{+YYYY-MM-dd}" } add_field => { #增加podid "hostname" => "%{[host][name]}" } add_field => { #增加podip "IP" => "%{[host][ip][0]}" } add_field => { #增加path "path" => "%{[log][file][path]}" } add_field => { #增加path "appname" => "%{[fields][appname]}" } } mutate{ remove_field => ["log","@version","agent","host","tags","ecs","input","fields"] } } output { elasticsearch { action => "index" hosts => ["http://es-cluster-external.monitor.svc.cluster.local:9200"] index => "%{[target_index]}" timeout => 5000 codec => plain { charset => "UTF-8" } } #stdout { # codec => rubydebug #} } ``` ```yaml vi mybookstore.yaml # Settings file in YAML # # Settings can be specified either in hierarchical form, e.g.: # # pipeline: # batch: # size: 125 # delay: 5 # # Or as flat keys: # # pipeline.batch.size: 125 # pipeline.batch.delay: 5 # # ------------ Node identity ------------ # # Use a descriptive name for the node: # # node.name: test # # If omitted the node name will default to the machine's host name # # ------------ Data path ------------------ # # Which directory should be used by logstash and its plugins # for any persistent needs. Defaults to LOGSTASH_HOME/data # # path.data: # # ------------ Pipeline Settings -------------- # # The ID of the pipeline. # # pipeline.id: main # # Set the number of workers that will, in parallel, execute the filters+outputs # stage of the pipeline. # # This defaults to the number of the host's CPU cores. # # pipeline.workers: 2 # # How many events to retrieve from inputs before sending to filters+workers # # pipeline.batch.size: 125 # # How long to wait in milliseconds while polling for the next event # before dispatching an undersized batch to filters+outputs # # pipeline.batch.delay: 50 # # Force Logstash to exit during shutdown even if there are still inflight # events in memory. By default, logstash will refuse to quit until all # received events have been pushed to the outputs. 
# # WARNING: enabling this can lead to data loss during shutdown # # pipeline.unsafe_shutdown: false # # ------------ Pipeline Configuration Settings -------------- # # Where to fetch the pipeline configuration for the main pipeline # # path.config: # # Pipeline configuration string for the main pipeline # # config.string: # # At startup, test if the configuration is valid and exit (dry run) # # config.test_and_exit: false # # Periodically check if the configuration has changed and reload the pipeline # This can also be triggered manually through the SIGHUP signal # # config.reload.automatic: false # # How often to check if the pipeline configuration has changed (in seconds) # # config.reload.interval: 3s # # Show fully compiled configuration as debug log message # NOTE: --log.level must be 'debug' # # config.debug: false # # When enabled, process escaped characters such as \n and \" in strings in the # pipeline configuration files. # # config.support_escapes: false # # ------------ Module Settings --------------- # Define modules here. Modules definitions must be defined as an array. # The simple way to see this is to prepend each `name` with a `-`, and keep # all associated variables under the `name` they are associated with, and # above the next, like this: # # modules: # - name: MODULE_NAME # var.PLUGINTYPE1.PLUGINNAME1.KEY1: VALUE # var.PLUGINTYPE1.PLUGINNAME1.KEY2: VALUE # var.PLUGINTYPE2.PLUGINNAME1.KEY1: VALUE # var.PLUGINTYPE3.PLUGINNAME3.KEY1: VALUE # # Module variable names must be in the format of # # var.PLUGIN_TYPE.PLUGIN_NAME.KEY # # modules: # # ------------ Cloud Settings --------------- # Define Elastic Cloud settings here. # Format of cloud.id is a base64 value e.g. dXMtZWFzdC0xLmF3cy5mb3VuZC5pbyRub3RhcmVhbCRpZGVudGlmaWVy # and it may have an label prefix e.g. staging:dXMtZ... # This will overwrite 'var.elasticsearch.hosts' and 'var.kibana.host' # cloud.id: # # Format of cloud.auth is: : # This is optional # If supplied this will overwrite 'var.elasticsearch.username' and 'var.elasticsearch.password' # If supplied this will overwrite 'var.kibana.username' and 'var.kibana.password' # cloud.auth: elastic: # # ------------ Queuing Settings -------------- # # Internal queuing model, "memory" for legacy in-memory based queuing and # "persisted" for disk-based acked queueing. Defaults is memory # # queue.type: memory # # If using queue.type: persisted, the directory path where the data files will be stored. # Default is path.data/queue # # path.queue: # # If using queue.type: persisted, the page data files size. The queue data consists of # append-only data files separated into pages. Default is 64mb # # queue.page_capacity: 64mb # # If using queue.type: persisted, the maximum number of unread events in the queue. # Default is 0 (unlimited) # # queue.max_events: 0 # # If using queue.type: persisted, the total capacity of the queue in number of bytes. # If you would like more unacked events to be buffered in Logstash, you can increase the # capacity using this setting. Please make sure your disk drive has capacity greater than # the size specified here. 
If both max_bytes and max_events are specified, Logstash will pick # whichever criteria is reached first # Default is 1024mb or 1gb # # queue.max_bytes: 1024mb # # If using queue.type: persisted, the maximum number of acked events before forcing a checkpoint # Default is 1024, 0 for unlimited # # queue.checkpoint.acks: 1024 # # If using queue.type: persisted, the maximum number of written events before forcing a checkpoint # Default is 1024, 0 for unlimited # # queue.checkpoint.writes: 1024 # # If using queue.type: persisted, the interval in milliseconds when a checkpoint is forced on the head page # Default is 1000, 0 for no periodic checkpoint. # # queue.checkpoint.interval: 1000 # # ------------ Dead-Letter Queue Settings -------------- # Flag to turn on dead-letter queue. # # dead_letter_queue.enable: false # If using dead_letter_queue.enable: true, the maximum size of each dead letter queue. Entries # will be dropped if they would increase the size of the dead letter queue beyond this setting. # Default is 1024mb # dead_letter_queue.max_bytes: 1024mb # If using dead_letter_queue.enable: true, the directory path where the data files will be stored. # Default is path.data/dead_letter_queue # # path.dead_letter_queue: # # ------------ Metrics Settings -------------- # # Bind address for the metrics REST endpoint # # http.host: "127.0.0.1" # # Bind port for the metrics REST endpoint, this option also accept a range # (9600-9700) and logstash will pick up the first available ports. # # http.port: 9600-9700 # # ------------ Debugging Settings -------------- # # Options for log.level: # * fatal # * error # * warn # * info (default) # * debug # * trace # # log.level: info # path.logs: # # ------------ Other Settings -------------- # # Where to find custom plugins # path.plugins: [] # # ------------ X-Pack Settings (not applicable for OSS build)-------------- # # X-Pack Monitoring # https://www.elastic.co/guide/en/logstash/current/monitoring-logstash.html xpack.monitoring.enabled: false #xpack.monitoring.elasticsearch.username: logstash_system #xpack.monitoring.elasticsearch.password: password #xpack.monitoring.elasticsearch.hosts: ["https://es1:9200", "https://es2:9200"] #xpack.monitoring.elasticsearch.ssl.certificate_authority: [ "/path/to/ca.crt" ] #xpack.monitoring.elasticsearch.ssl.truststore.path: path/to/file #xpack.monitoring.elasticsearch.ssl.truststore.password: password #xpack.monitoring.elasticsearch.ssl.keystore.path: /path/to/file #xpack.monitoring.elasticsearch.ssl.keystore.password: password #xpack.monitoring.elasticsearch.ssl.verification_mode: certificate #xpack.monitoring.elasticsearch.sniffing: false #xpack.monitoring.collection.interval: 10s #xpack.monitoring.collection.pipeline.details.enabled: true # # X-Pack Management # https://www.elastic.co/guide/en/logstash/current/logstash-centralized-pipeline-management.html xpack.management.enabled: false #xpack.management.pipeline.id: ["main", "apache_logs"] #xpack.management.elasticsearch.username: logstash_admin_user #xpack.management.elasticsearch.password: password #xpack.management.elasticsearch.hosts: ["https://es1:9200", "https://es2:9200"] #xpack.management.elasticsearch.ssl.certificate_authority: [ "/path/to/ca.crt" ] #xpack.management.elasticsearch.ssl.truststore.path: /path/to/file #xpack.management.elasticsearch.ssl.truststore.password: password #xpack.management.elasticsearch.ssl.keystore.path: /path/to/file #xpack.management.elasticsearch.ssl.keystore.password: password 
#xpack.management.elasticsearch.ssl.verification_mode: certificate
#xpack.management.elasticsearch.sniffing: false
#xpack.management.logstash.poll_interval: 5s
```

Create a ConfigMap from these two files (mybookstore.conf and logstash.yml) in the monitor namespace:

```sh
kubectl create cm mybookstore-logstash --from-file=mybookstore.conf --from-file=logstash.yml -n monitor
```

Next, create the Logstash Deployment and a Service for it to parse the logs. In the Logstash pod, the ConfigMap is mounted at /etc/logstash as the configuration directory:

```yaml
---
# Deploy Logstash
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mybookstore-logstash
  namespace: monitor
  labels:
    app: mybookstore-logstash
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mybookstore-logstash
  template:
    metadata:
      labels:
        app: mybookstore-logstash
    spec:
      containers:
      - name: logstash
        image: harbor.my-site.com/elk/logstash:7.3.2
        command: ["bin/logstash"]
        args: ["-f", "/etc/logstash/mybookstore.conf","--path.settings","/etc/logstash"]
        ports:
        - containerPort: 5044
          name: logstash-port
          protocol: TCP
        volumeMounts:
        - name: logstash-config
          mountPath: /etc/logstash
      volumes:
      - name: logstash-config
        configMap:
          name: mybookstore-logstash
---
# Create the Service
apiVersion: v1
kind: Service
metadata:
  name: mybookstore-logstash
  namespace: monitor
spec:
  type: ClusterIP
  ports:
  - port: 5044
    protocol: TCP
    targetPort: logstash-port
  selector:
    app: mybookstore-logstash
```

The result:

![image-20210830161818389](https://gitee.com/wang5620079/mypics/raw/master//202108301618502.png)

##### 3) Filebeat sidecar configuration and deployment

Filebeat is used for log collection. Here it runs as a sidecar in the same pod as the service and picks up the logs through a shared volume.

**First, create the filebeat configuration file `filebeat.yml`:**

```yaml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /data/logs/*.log
  fields:
    appname: '${appname}'
  encoding: utf-8
  multiline.pattern: ^20
  multiline.negate: true
  multiline.match: after

# ======================= Elasticsearch template setting =======================

# ------------------------------ Logstash Output -------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["mybookstore-logstash.monitor.svc.cluster.local:5044"]

# ================================= Processors =================================
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - add_host_metadata:
      netinfo.enabled: true
```

**Create the ConfigMap:**

```sh
kubectl create cm mybookstore-filebeat --from-file=filebeat.yml -n dev
```
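For the multiline settings above to work, the application has to write its log files under /data/logs, with every record starting with a timestamp (hence `multiline.pattern: ^20`, which groups traceback lines with the record they belong to). The following is only a minimal sketch of how a Flask-based service could set up such a logger; the file name, logger name and `LOG_DIR` override are illustrative and not taken from the actual services.

```python
# Minimal logging sketch for the filebeat sidecar: timestamp-prefixed records written
# to /data/logs/*.log, so that continuation lines (tracebacks) are merged by filebeat's
# multiline.pattern: ^20. Names and paths here are illustrative assumptions.
import logging
import os
from logging.handlers import TimedRotatingFileHandler

LOG_DIR = os.environ.get("LOG_DIR", "/data/logs")  # the volume shared with the sidecar


def build_logger(app_name: str = "mybookstore-app") -> logging.Logger:
    os.makedirs(LOG_DIR, exist_ok=True)
    handler = TimedRotatingFileHandler(
        os.path.join(LOG_DIR, f"{app_name}.log"), when="midnight", backupCount=7
    )
    # Produces lines like "2021-08-30 16:18:18,389 INFO ...": every new record starts
    # with "20", continuation lines do not, which is what the multiline pattern expects.
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s"))
    logger = logging.getLogger(app_name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_logger()
    log.info("user login ok, uid=%s", 42)
    try:
        1 / 0
    except ZeroDivisionError:
        log.exception("demo traceback, kept as one event by filebeat")
```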
echo "检查镜像是否打包完成" sh 'docker images|grep mybookstore' echo "上传镜像" sh 'docker login -u admin -p Harbor12345 harbor.my-site.com' sh "docker tag ${appname}:${tsversion} harbor.my-site.com/${repo}/${appname}:${tsversion}" sh "docker push harbor.my-site.com/${repo}/${appname}:${tsversion}" } } stage('kubectl 部署') { //使用nodejs代理 agent { label 'kubectl' } steps { echo "kubectl 部署deployment" sh """cat - < deployment.yaml apiVersion: apps/v1 kind: Deployment metadata: labels: app: ${appname} version: ${tsversion} name: ${appname} namespace: ${namespace} spec: replicas: ${replicas} selector: matchLabels: app: ${appname} template: metadata: labels: app: ${appname} spec: containers: - image: harbor.my-site.com/${repo}/${appname}:${tsversion} name: ${appname} ports: - containerPort: ${port} volumeMounts: - name: log-volume mountPath: /data/logs - image: harbor.my-site.com/elk/filebeat:7.3.2 env: - name: appname value: ${appname} command: ["filebeat"] args: ["-e","-c", "/etc/filebeat/filebeat.yml"] name: filebeat volumeMounts: - name: filebeat-config mountPath: /etc/filebeat/ - name: log-volume mountPath: /data/logs volumes: - name: log-volume hostPath: path: /data/${appname}/logs - name: filebeat-config configMap: name: mybookstore-filebeat EOF""" sh 'ls -l' sh 'cat deployment.yaml' sh "kubectl apply -f deployment.yaml" echo "kubectl 部署svc" sh """cat - < service.yaml apiVersion: v1 kind: Service metadata: labels: app: ${svcname} name: ${svcname} namespace: ${namespace} spec: ports: - port: ${port} protocol: TCP targetPort: ${port} selector: app: ${appname} EOF""" sh "kubectl apply -f service.yaml" } } } } ``` 增加filebeat的sidecar日志采集容器后,重新运行流水线,流水线运行完成后,业务容器变成如下: ![image-20210830163819021](https://gitee.com/wang5620079/mypics/raw/master//202108301638138.png) 5)最终效果: 我们访问书城页面,elk就会有对应的日志采集输出,效果日下: ![image-20210830170154433](https://gitee.com/wang5620079/mypics/raw/master//202108301701602.png) 我们配置好对应的索引模式 ![image-20210830170436191](https://gitee.com/wang5620079/mypics/raw/master//202108301704349.png) 然后查看Discovory,可以看到对应的日志数据。 ![image-20210830165003478](https://gitee.com/wang5620079/mypics/raw/master//202108301650691.png) --- **我们可以配置仪表盘,来实现访问量的监控** ![image-20210830165528000](https://gitee.com/wang5620079/mypics/raw/master//202108301655178.png) 引申一下: **Kibana可以配置更多类型的图标,类似如下的图** ![Image](https://gitee.com/wang5620079/mypics/raw/master//202108301706752.png) ##### 5)排坑——ELK磁盘占用高导致ELK索引只读(PVC的动态扩容) 运行的时间久了,ES磁盘空间占用会越来越高,如下图: ![image-20210901225326616](https://gitee.com/wang5620079/mypics/raw/master//202109012253854.png) 索引大小也越来越大 ![image-20210901225355810](https://gitee.com/wang5620079/mypics/raw/master//202109012253040.png) **es的默认磁盘水位警戒线是85%,一旦磁盘使用率超过85%,es不会再为该节点分配分片,es还有一个磁盘水位警戒线是90%,超过后,将尝试将分片重定位到其他节点。** 此时的方法是,删除无用的索引,并设置需要使用的索引的read_only_allow_delete值,详见: https://stackoverflow.com/questions/48155774/elasticsearch-read-only-allow-delete-auto-setting ```sh PUT your_index_name/_settings { "index": { "blocks": { "read_only_allow_delete": "false" } } } ``` ### 七、jmeter压力测试 ##### 7.1 暴露nodeport前台页面服务 ```sh kubectl expose deployment mybookstore-front --name=mybookstore-front --type=NodePort --port=80 --target-port=80 -n dev ``` 效果如下: ![image-20210901231430799](https://gitee.com/wang5620079/mypics/raw/master//202109012314903.png) 界面如下: ![image-20210901231833098](https://gitee.com/wang5620079/mypics/raw/master//202109012318360.png) ##### 7.2 jmeter配置 本次采用jmeter进行参数化压测,模拟用户登录和压力测试,jmeter版本为5.4.1。 先放个整体效果图 ![image-20210902161830579](https://gitee.com/wang5620079/mypics/raw/master//202109021618714.png) ##### 
### 七、JMeter load testing

#### 7.1 Expose the front-end page service via NodePort

```sh
kubectl expose deployment mybookstore-front --name=mybookstore-front --type=NodePort --port=80 --target-port=80 -n dev
```

The result:

![image-20210901231430799](https://gitee.com/wang5620079/mypics/raw/master//202109012314903.png)

The page:

![image-20210901231833098](https://gitee.com/wang5620079/mypics/raw/master//202109012318360.png)

#### 7.2 JMeter configuration

We use JMeter 5.4.1 for a parameterized load test that simulates user logins under load.

Here is an overall view of the test plan first:

![image-20210902161830579](https://gitee.com/wang5620079/mypics/raw/master//202109021618714.png)

##### 1) Prepare the test CSV data

Export the base test data from the database. First, the **user data**:

```csv
username,password
"admin","21232f297a57a5a743894a0e4a801fc3"
"dhm000","e10adc3949ba59abbe56e057f20f883e"
"sl000","e10adc3949ba59abbe56e057f20f883e"
"zp001","e10adc3949ba59abbe56e057f20f883e"
"jfy002","e10adc3949ba59abbe56e057f20f883e"
"zj003","e10adc3949ba59abbe56e057f20f883e"
"wr004","e10adc3949ba59abbe56e057f20f883e"
"cjh005","e10adc3949ba59abbe56e057f20f883e"
"xp006","e10adc3949ba59abbe56e057f20f883e"
"ly007","e10adc3949ba59abbe56e057f20f883e"
"ml008","e10adc3949ba59abbe56e057f20f883e"
"zb009","e10adc3949ba59abbe56e057f20f883e"
```

Note that the first row is the unquoted column header.

Next, the book data. Only a subset needs to be exported; as with the user data, the first row is the column header. An excerpt:

```csv
bookid,name,sub_name,authors
"1","彼得大帝","二十世纪外国文学丛书","(苏)托尔斯泰"
"2","鲁滨孙飘流记",,"未知"
"3","逻辑哲学论",,"未知"
"4","皆大欢喜","Vol. 12","罗志野/李德荣 注释/裘克安 校"
"5","毛姆读书随笔",,"[英] 威廉·萨默塞特·毛姆"
"6","了不起的盖茨比",,"[美] 菲茨杰拉德"
"7","扎根",,"未知"
"8","水仙已乘鲤鱼去",,"未知"
"9","第十二夜",,"支荩忠"
"10","规训与惩罚","监狱的诞生","未知"
"11","如何阅读一本书",,"[美] 莫提默·J. 艾德勒/查尔斯·范多伦"
"12","威尼斯商人",,"未知"
"13","战争与和平(上下)",,"未知"
```

##### 2) Add the test thread group

First add the HTTP Request Defaults:

![image-20210902162712667](https://gitee.com/wang5620079/mypics/raw/master//202109021627799.png)

Then add the test thread group:

![image-20210902162755454](https://gitee.com/wang5620079/mypics/raw/master//202109021627570.png)

##### 3) Configure the CSV data sets for automatic login and add-to-cart

Configure the CSV data set used for user login:

![image-20210902163129685](https://gitee.com/wang5620079/mypics/raw/master//202109021631824.png)

Configure the book-data CSV data set in the same way:

![image-20210902171923593](https://gitee.com/wang5620079/mypics/raw/master//202109021719722.png)

##### 4) Configure the login request

Add an HTTP request named "login request"; note its request parameters and URL:

![image-20210902172509553](https://gitee.com/wang5620079/mypics/raw/master//202109021725684.png)

Then add an HTTP Header Manager for the login request:

![image-20210902172622661](https://gitee.com/wang5620079/mypics/raw/master//202109021726779.png)

Add a JSON extractor to pull the token field out of the response and store it in a variable named token:

![image-20210902172720146](https://gitee.com/wang5620079/mypics/raw/master//202109021727277.png)

Add a BeanShell PostProcessor, which does two things.

First, it promotes the token to a global property:

```sh
${__setProperty(Token,${token})}
```

Second, it sets the response character encoding:

```sh
prev.setDataEncoding("UTF-8");
```

![image-20210902172841480](https://gitee.com/wang5620079/mypics/raw/master//202109021728600.png)

Finally, add a View Results Tree listener.

![image-20210902180135552](https://gitee.com/wang5620079/mypics/raw/master//202109021801678.png)

##### 5) Test the user-info query HTTP request

Add an HTTP request that queries user info; note the URL is /api/user/info:

![image-20210902175548832](https://gitee.com/wang5620079/mypics/raw/master//202109021755979.png)

This request needs an HTTP Header Manager, **which must reference the Token property set above**:

![image-20210902175834946](https://gitee.com/wang5620079/mypics/raw/master//202109021758070.png)

The remaining settings are shown below:

![image-20210902175922391](https://gitee.com/wang5620079/mypics/raw/master//202109021759525.png)

**A JSON extractor pulls out the user id:**

![image-20210902180032145](https://gitee.com/wang5620079/mypics/raw/master//202109021800275.png)

Add a View Results Tree here as well (screenshot omitted).

##### 6) Test adding to the shopping cart

Add an HTTP request for adding to the cart. The URL is /api/other/addshoppingcart, **the method is PUT**, and the request body is:

```json
{"params":{"userid":${userid},"bookid":${bookid}}}
```

**The userid parameter comes from the user-info request above; bookid comes from the book CSV data set.** A requests-based sketch of this whole login/add-to-cart flow is shown after section 7 below.

![image-20210902180334463](https://gitee.com/wang5620079/mypics/raw/master//202109021803591.png)

The other settings:

![image-20210902181134857](https://gitee.com/wang5620079/mypics/raw/master//202109021811988.png)

![image-20210902181158077](https://gitee.com/wang5620079/mypics/raw/master//202109021811201.png)

##### 7) Test querying the shopping cart

![image-20210902181257896](https://gitee.com/wang5620079/mypics/raw/master//202109021812027.png)

![image-20210902181325942](https://gitee.com/wang5620079/mypics/raw/master//202109021813065.png)

![image-20210902181349888](https://gitee.com/wang5620079/mypics/raw/master//202109021813008.png)
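Before launching the full load test, the same login → user-info → add-to-cart flow can be smoke-tested with a short Python script. This is only a sketch: the base URL, the login path (/api/user/login), the token header name and the userid field name are assumptions that should be confirmed against the JMeter request screenshots above.

```python
# Rough requests-based equivalent of the JMeter flow, for a quick smoke test.
# BASE, the login path and the token header are assumptions, not confirmed API details.
import requests

BASE = "http://<node-ip>:<nodeport>"  # the NodePort exposed above, or the API gateway address


def smoke_test(username: str, password_md5: str, bookid: int) -> None:
    s = requests.Session()

    # Log in (path assumed); the JMeter JSON extractor reads a "token" field from the response.
    login = s.post(f"{BASE}/api/user/login",
                   json={"username": username, "password": password_md5}, timeout=10)
    login.raise_for_status()
    token = login.json()["token"]
    s.headers["Authorization"] = token  # assumption: header name used to carry the token

    # Same endpoint JMeter calls; "userid" field name is an assumption.
    userid = s.get(f"{BASE}/api/user/info", timeout=10).json()["userid"]

    # PUT /api/other/addshoppingcart with the body shown in section 6.
    r = s.put(f"{BASE}/api/other/addshoppingcart",
              json={"params": {"userid": userid, "bookid": bookid}}, timeout=10)
    print(r.status_code, r.text)


if __name__ == "__main__":
    # Credentials taken from the exported user CSV (password is the MD5 of "123456").
    smoke_test("dhm000", "e10adc3949ba59abbe56e057f20f883e", 1)
```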
##### 8) View the results

Run JMeter and start the test; the results are as follows:

![image-20210902181514938](https://gitee.com/wang5620079/mypics/raw/master//202109021815075.png)

![image-20210902181538223](https://gitee.com/wang5620079/mypics/raw/master//202109021815352.png)

![image-20210902181606742](https://gitee.com/wang5620079/mypics/raw/master//202109021816870.png)

![image-20210902181644392](https://gitee.com/wang5620079/mypics/raw/master//202109021816526.png)

##### 9) Plot the traffic curve in ELK

Set the JMeter thread group to loop:

![image-20210902182012032](https://gitee.com/wang5620079/mypics/raw/master//202109021820161.png)

Start the test and watch the traffic chart configured in ELK; the request volume clearly goes up. (The machine runs into problems after JMeter has been running for a while, so the test can only run briefly.)

![image-20210902182310238](https://gitee.com/wang5620079/mypics/raw/master//202109021823385.png)

### 八、Application robustness

#### 8.1 Application liveness probe

##### 1) Reserve a health-check endpoint in the application

Each application can expose a health-check endpoint. Taking mybookstore-app as an example, the URL is /health:

![image-20210902183147519](https://gitee.com/wang5620079/mypics/raw/master//202109021831671.png)

##### 2) Add a livenessProbe to the Deployment

See the configuration in Jenkinsfile-dev:

![image-20210902183536255](https://gitee.com/wang5620079/mypics/raw/master//202109021835417.png)

#### 8.2 Add HTTPS access via Ingress

We use ingress-nginx to provide external HTTPS access to the mybookstore-front front-end pages. The result looks like this:

![image-20210902184046110](https://gitee.com/wang5620079/mypics/raw/master//202109021840298.png)

The steps are as follows:

##### 1) Create a self-signed root certificate

```sh
mkdir my-ca
cd my-ca
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls.key -out tls.crt -subj "/CN=*.my-site.com/O=*.my-site.com"
```

Note that both CN and O are set to the wildcard domain *.my-site.com.

##### 2) Create the Ingress

In the mybookstore-front microservice, add the part that creates the Ingress:

![image-20210902185751305](https://gitee.com/wang5620079/mypics/raw/master//202109021857498.png)

# <Continuously updated...>