I. Preparation
Configure hosts resolution on every machine listed below, and set the hostname on each machine to match.
All machines run CentOS 7.9:

cat >>/etc/hosts <<EOF
172.24.8.90 k8s-master01
172.24.8.91 k8s-master02 # consul is installed here
172.24.8.92 k8s-master03
172.24.8.95 keepalived01 # prometheus is installed here
172.24.8.96 keepalived02 # grafana is installed here
172.24.8.101 node1
172.24.8.102 node2
EOF

II. Install Prometheus

On 172.24.8.95:

mkdir -p /usr/local/prometheus
wget https://github.com/prometheus/prometheus/releases/download/v2.37.0/prometheus-2.37.0.linux-amd64.tar.gz
tar -zxvf prometheus-2.37.0.linux-amd64.tar.gz
# Move the contents rather than the directory, so the binary ends up at /usr/local/prometheus/prometheus
mv prometheus-2.37.0.linux-amd64/* /usr/local/prometheus/

Create a systemd unit so that Prometheus can be started with systemctl start prometheusd.service:

cat > /usr/lib/systemd/system/prometheusd.service <<'EOF'
[Unit]
Description=Prometheus

[Service]
ExecStart=/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml --web.enable-lifecycle
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

Start Prometheus, enable it at boot, and check its status:

systemctl daemon-reload
systemctl start prometheusd.service && systemctl enable prometheusd.service && systemctl status prometheusd.service

Then open http://172.24.8.95:9090/ in a browser.

III. Install Grafana

On 172.24.8.96:

sudo wget https://dl.grafana.com/enterprise/release/grafana-enterprise-9.0.2-1.x86_64.rpm
sudo yum install grafana-enterprise-9.0.2-1.x86_64.rpm

Write the systemd unit so that Grafana can be started with systemctl start grafana-server.service. The heredoc delimiter is quoted so that the shell does not expand ${CONF_FILE} and the other variables while writing the file; systemd fills them in from the EnvironmentFile.

cat > /usr/lib/systemd/system/grafana-server.service <<'EOF'
[Unit]
Description=Grafana instance
Documentation=http://docs.grafana.org
Wants=network-online.target
After=network-online.target
After=postgresql.service mariadb.service mysqld.service

[Service]
EnvironmentFile=/etc/sysconfig/grafana-server
User=grafana
Group=grafana
Type=notify
Restart=on-failure
WorkingDirectory=/usr/share/grafana
RuntimeDirectory=grafana
RuntimeDirectoryMode=0750
ExecStart=/usr/sbin/grafana-server --config=${CONF_FILE} \
    --pidfile=${PID_FILE_DIR}/grafana-server.pid \
    --packaging=rpm \
    cfg:default.paths.logs=${LOG_DIR} \
    cfg:default.paths.data=${DATA_DIR} \
    cfg:default.paths.plugins=${PLUGINS_DIR} \
    cfg:default.paths.provisioning=${PROVISIONING_CFG_DIR}
LimitNOFILE=10000
TimeoutStopSec=20
CapabilityBoundingSet=
DeviceAllow=
LockPersonality=true
MemoryDenyWriteExecute=false
NoNewPrivileges=true
PrivateDevices=true
PrivateTmp=true
ProtectClock=true
ProtectControlGroups=true
ProtectHome=true
ProtectHostname=true
ProtectKernelLogs=true
ProtectKernelModules=true
ProtectKernelTunables=true
ProtectProc=invisible
ProtectSystem=full
RemoveIPC=true
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
RestrictNamespaces=true
RestrictRealtime=true
RestrictSUIDSGID=true
SystemCallArchitectures=native
UMask=0027

[Install]
WantedBy=multi-user.target
EOF

Start Grafana, enable it at boot, and check its status:

systemctl daemon-reload
systemctl start grafana-server.service && systemctl enable grafana-server.service && systemctl status grafana-server.service

Then open http://172.24.8.96:3000/ in a browser. The account is admin and the password is admin; you must set a new password after the first login.

IV. Install Consul (can be built as a cluster of 3 or 5 consul machines)

On 172.24.8.91 (the UI will be at http://172.24.8.91:8500/):

sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://rpm.releases.hashicorp.com/RHEL/hashicorp.repo
sudo yum -y install consul

The HashiCorp RPM places the binary at /usr/bin/consul; confirm with command -v consul and adjust ExecStart if your path differs.

cat > /etc/systemd/system/consul.service <<'EOF'
[Unit]
Description=Consul Service Discovery Agent
Documentation=http://www.consul.io/
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=root
Group=root
ExecStart=/usr/bin/consul agent -dev -ui \
    -bootstrap-expect=1 \
    -data-dir=/var/lib/consul \
    -node=consul \
    -client=0.0.0.0 \
    -config-dir=/etc/consul.d
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

Start Consul, enable it at boot, and check its status:

systemctl daemon-reload
systemctl start consul.service && systemctl enable consul.service && systemctl status consul.service

Then open http://172.24.8.91:8500/ in a browser.

V. Configure prometheus.yml

On 172.24.8.95:

vim /usr/local/prometheus/prometheus.yml

The relevant fields are:

job_name: the name of your service cluster.
consul_sd_configs: the service discovery settings.
server: the address and port of the service discovery server.
source_labels: the label to filter on, for example __meta_consul_tags, used to drop services that do not carry the node tag. __meta_consul_tags corresponds to the "tags": ["node"] value of the Consul service. The default consul service does not carry that tag, which is how it gets filtered out.
regex: the pattern for the tag that the node_exporter script in section VI registers (the node tag mentioned above). Use a wildcard pattern that covers the whole cluster. If the pattern does not match, Prometheus cannot recognize the services, and they cannot be registered in bulk.

VI. Configure node_exporter on every machine in the cluster

mkdir -p /usr/local/node_exporter/logs
cd /usr/local/node_exporter/
wget https://github.com/prometheus/node_exporter/releases/download/v1.3.1/node_exporter-1.3.1.linux-amd64.tar.gz
tar -zxvf node_exporter-1.3.1.linux-amd64.tar.gz
cd node_exporter-1.3.1.linux-amd64
mv node_exporter /usr/local/node_exporter/

Write the start/stop script. The outer heredoc delimiter is quoted so that nothing is expanded while the file is written, and the JSON payload uses its own delimiter (JSON) so that it does not end the outer heredoc early.

cat > /usr/local/node_exporter/node_exporter.sh <<'EOF'
#!/bin/bash
. /etc/profile
name=$(hostname)
# Use the hostname as the tag; if hostnames are complex, filter them as needed.
tag=$name
# The IP may sit on a different line on each machine; check before running the script.
ip=$(/usr/sbin/ifconfig | grep inet | awk 'NR==2{print $2}')
logpath=/usr/local/node_exporter/logs
logfile=${logpath}/node_exporter.log
url="http://172.24.8.91:8500"
exporter=node_exporter
pid=$(ps -ef | grep -v grep | grep "/usr/local/node_exporter/${exporter}$" | awk '{print $2}')
# The tag in this payload is what the regex in section V matches.
json_data=$(cat <<JSON
{
  "id": "$name",
  "name": "yunji",
  "address": "$ip",
  "port": 9100,
  "tags": ["$tag"],
  "checks": [
    { "tcp": "${ip}:9100", "interval": "10s" }
  ]
}
JSON
)

start() {
  [ -d $logpath ] || mkdir -p $logpath
  [ -f $logfile ] || touch $logfile
  /usr/local/node_exporter/${exporter} >$logfile 2>&1 &
  if [ $? -eq 0 ]; then
    # Register the service with consul
    /usr/bin/curl -XPUT -d "$json_data" ${url}/v1/agent/service/register
  fi
}

stop() {
  /usr/bin/curl -XPUT ${url}/v1/agent/service/deregister/$name
  if [ -n "$pid" ]; then
    kill -9 $pid
    echo "Process stopped"
    exit
  else
    echo "Process is not running"
    exit
  fi
}

case $1 in
  start) start;;
  stop) stop;;
  *) echo "Usage: sh node_exporter.sh start|stop"
esac
EOF

# Start node_exporter; it registers itself with consul automatically
sh /usr/local/node_exporter/node_exporter.sh start

Once registration succeeds, the service is listed in the Consul UI.

Next, check Prometheus at http://172.24.8.95:9090/ and click Status --> Targets. The machines should now be listed in Prometheus.

VII. Import the node_exporter dashboard into Grafana

1. Connect Prometheus: go to Settings --> Data Sources --> Add data source and fill in:

name: can be a short name for the service or the company.
URL: the Prometheus address, http://172.24.8.95:9090/. Remember to include http://, otherwise the test fails.
Access: leave the default, Server (default).

Scroll to the bottom and click Save & test. The data source is ready once the test succeeds.

2. Import node_exporter: first register an account on grafana.com, then visit the link below and download the node_exporter JSON file:

https://grafana.com/grafana/dashboards/12062

After the download finishes, open Grafana and import the node_exporter JSON file: choose Upload Json file and add the file you just downloaded.

Go to Dashboards and click Node Exporter Full to see the monitoring data. All the machines should now be listed and monitored normally.

That completes the monitoring setup. In interviews you may also be asked how custom monitoring is added.
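
For reference, the prometheus.yml section above describes job_name, consul_sd_configs, server, source_labels and regex without showing the file itself. Below is a minimal sketch of how those fields could fit together, written as a shell heredoc. Everything not stated in the article is an assumption for illustration: the output path /tmp/prometheus.yml.example, the 15s scrape interval, the job name yunji (borrowed from the service name in the registration script), and the regex, which assumes the tags are hostnames such as k8s-master01, keepalived01 or node1. Prometheus exposes Consul tags as one string wrapped in commas (for example ,node1,), which is why the pattern includes them.

```shell
#!/bin/bash
# Sketch only: writes an example config to a scratch path, not to the live file.
PROM_YML="${PROM_YML:-/tmp/prometheus.yml.example}"   # hypothetical output path
CONSUL_ADDR="172.24.8.91:8500"                        # consul address from the article

cat > "$PROM_YML" <<EOF
global:
  scrape_interval: 15s            # assumed value

scrape_configs:
  - job_name: "yunji"             # assumed: same as the registered service name
    consul_sd_configs:
      - server: "${CONSUL_ADDR}"
        services: ["yunji"]       # only discover the node_exporter service
    relabel_configs:
      # Keep only targets whose tag list contains a matching hostname tag;
      # anything else (such as the built-in consul service) is dropped.
      - source_labels: [__meta_consul_tags]
        regex: ".*,(k8s-master|keepalived|node)[^,]*,.*"
        action: keep
EOF

echo "wrote $PROM_YML"
```

To try it, review the generated file, copy it to /usr/local/prometheus/prometheus.yml, and validate it with /usr/local/prometheus/promtool check config /usr/local/prometheus/prometheus.yml. Because Prometheus was started with --web.enable-lifecycle, the new configuration can then be loaded with curl -X POST http://172.24.8.95:9090/-/reload instead of a restart.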