Ops GO - Cloud Native | Monitoring in Practice (1): First Steps with Prometheus + Grafana


Contents

  • Prometheus overview
    • Glossary
      • Prometheus
      • Exporters
      • Service discovery
      • AlertManager
      • Bridge
      • Client library
      • Instance
      • Job
      • Notification
      • Promdash
      • PromQL
      • Pushgateway
      • Remote Read
      • Remote Read Adapter
      • Remote Read Endpoint
      • Remote Write
      • Remote Write Adapter
      • Remote Write Endpoint
      • Sample
      • Silence
      • Target
  • Installing Prometheus
    • Container installation
    • Binary installation
  • Installing node_exporter
  • Prometheus configuration
  • Using the Prometheus web UI
  • Installing Grafana for visualization
    • Installation
      • Docker installation
    • Accessing the page
    • Adding the Prometheus data source
    • Adding a dashboard template

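This article gives you a first hands-on look at Prometheus + Grafana, and by the end you should be able to set up a working stack yourself.
Prometheus overview
The overall Prometheus architecture is shown in the figure below.
[Figure: Prometheus architecture]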

Glossary
Prometheus
Prometheus usually refers to the core binary of the Prometheus system. It may also refer to the Prometheus monitoring system as a whole.
In this article it usually means the Prometheus server binary.
Exporters
An exporter is a binary running alongside the application you want to obtain metrics from. The exporter exposes Prometheus metrics, commonly by converting metrics that are exposed in a non-Prometheus format into a format that Prometheus supports.
Prometheus collects host and application metrics through the HTTP endpoints that exporters expose, so we can install different exporters to extend what we collect.
The Prometheus server scrapes the endpoint an exporter provides to obtain the monitoring data. In short, an exporter's main job is to convert the metrics it gathers into the Prometheus format.
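To make "converting metrics into the Prometheus format" concrete, here is a minimal sketch that renders one metric family in the Prometheus text exposition format. The function name, the metric `app_queue_length`, and its labels are hypothetical and only for illustration; real exporters normally use an official client library rather than formatting strings by hand.

```python
def render_exposition(name, help_text, metric_type, samples):
    """Render one metric family in the Prometheus text exposition format.

    samples is a list of (labels_dict, value) pairs.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            # Label values are double-quoted; backslash, quote and newline are escaped.
            pairs = ",".join(
                '{}="{}"'.format(
                    key,
                    str(val).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n"),
                )
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric, used only to show the output shape.
text = render_exposition(
    "app_queue_length",
    "Items waiting in the queue.",
    "gauge",
    [({"queue": "mail"}, 3), ({"queue": "sms"}, 0)],
)
print(text)
```

An exporter would serve text like this over HTTP (conventionally at `/metrics`) for the Prometheus server to scrape.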
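Service discovery
For Kubernetes, the following five service discovery modes are available for integration with Prometheus:
  • Node
  • Pod
  • Endpoints
  • Service
  • Ingress
TODO: these modes are explained in later articles.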
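AlertManager
The Alertmanager receives alerts from Prometheus, applies silences, and sends notifications to receivers such as email, PagerDuty, or Slack.
Endpoint
A source of metrics that can be scraped, usually corresponding to a single process.
An endpoint exposes formatted metrics data to the Prometheus server.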
Bridge
A bridge is a component that takes samples from a client library and exposes them to a non-Prometheus monitoring system. For example, the Python, Go, and Java clients can export metrics to Graphite.
Client library
A client library is a library in some language (e.g. Go, Java, Python, Ruby) that makes it easy to directly instrument your code, write custom collectors to pull metrics from other systems and expose the metrics to Prometheus.
Instance
An instance is a label that uniquely identifies a target in a job.
Job
A collection of targets with the same purpose, for example monitoring a group of like processes replicated for scalability or reliability, is called a job.
Notification
A notification represents a group of one or more alerts, and is sent by the Alertmanager to email, PagerDuty, Slack etc.
Promdash
Promdash was a native dashboard builder for Prometheus. It has been deprecated and replaced by Grafana.
PromQL
PromQL is the Prometheus Query Language. It allows for a wide range of operations including aggregation, slicing and dicing, prediction and joins.
Pushgateway
The Pushgateway persists the most recent push of metrics from batch jobs. This allows Prometheus to scrape their metrics after they have terminated.
The Pushgateway can be seen as an intermediary. It exists so that ephemeral and batch jobs can expose their metrics to Prometheus. Because such jobs may not live long enough to be scraped, they push their metrics to the Pushgateway instead, and the Pushgateway then exposes those metrics to Prometheus.
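As a rough illustration of the push step, the sketch below builds (but does not send) an HTTP request that pushes one metric to a Pushgateway. It assumes a Pushgateway listening on `localhost:9091` and uses a hypothetical job name and metric; it also assumes the job name and grouping values contain no `/` characters, which would need special encoding.

```python
import urllib.request
from urllib.parse import quote


def build_push_request(gateway, job, metrics_text, grouping=None):
    """Build a PUT request for the Pushgateway path /metrics/job/<job>/<label>/<value>."""
    path = "/metrics/job/" + quote(job, safe="")
    for label, value in (grouping or {}).items():
        path += "/" + quote(label, safe="") + "/" + quote(value, safe="")
    return urllib.request.Request(
        gateway.rstrip("/") + path,
        data=metrics_text.encode("utf-8"),  # body is text exposition format
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


req = build_push_request(
    "http://localhost:9091",  # assumed Pushgateway address
    "nightly_backup",         # hypothetical batch job name
    "backup_last_success_timestamp_seconds 1617235200\n",  # hypothetical metric
    grouping={"instance": "db1"},
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

A batch job would send such a request just before exiting; Prometheus then scrapes the Pushgateway like any other target.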
Remote Read
Remote read is a Prometheus feature that allows transparent reading of time series from other systems (such as long term storage) as part of queries.
Remote Read Adapter
Not all systems directly support remote read. A remote read adapter sits between Prometheus and another system, converting time series requests and responses between them.
Remote Read Endpoint
A remote read endpoint is what Prometheus talks to when doing a remote read.
Remote Write
Remote write is a Prometheus feature that allows sending ingested samples on the fly to other systems, such as long term storage.
Remote Write Adapter
Not all systems directly support remote write. A remote write adapter sits between Prometheus and another system, converting the samples in the remote write into a format the other system can understand.
Remote Write Endpoint
A remote write endpoint is what Prometheus talks to when doing a remote write.
Sample
A sample is a single value at a point in time in a time series.
In Prometheus, each sample consists of a float64 value and a millisecond-precision timestamp.
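The shape of a sample can be sketched as a small data class. This is only an illustration of the definition above, not Prometheus's internal representation; the class name, field names, and values are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Sample:
    timestamp_ms: int  # milliseconds since the Unix epoch
    value: float       # Python floats are IEEE 754 float64

    def time(self):
        """Return the timestamp as a timezone-aware UTC datetime."""
        return datetime.fromtimestamp(self.timestamp_ms / 1000, tz=timezone.utc)


# Two consecutive samples of one time series, 15 seconds apart (illustrative values).
series = [Sample(1617235200000, 0.25), Sample(1617235215000, 0.5)]
interval_s = (series[1].timestamp_ms - series[0].timestamp_ms) / 1000
print(series[0].time().isoformat(), interval_s)
```

A time series is then simply an ordered list of such samples that share the same metric name and labels.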
Silence
A silence in the Alertmanager prevents alerts, with labels matching the silence, from being included in notifications.
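The label-matching idea behind a silence can be sketched as follows. This is a simplified model that assumes equality matchers only (the real Alertmanager also supports regular-expression and negative matchers); the function name and the alert data are hypothetical.

```python
def is_silenced(alert_labels, silences):
    """Return True if any silence matches the alert.

    Each silence is a dict of label name -> required value.
    A silence matches only if it is non-empty and every pair matches.
    """
    return any(
        bool(matchers)
        and all(alert_labels.get(name) == value for name, value in matchers.items())
        for matchers in silences
    )


silences = [{"job": "node", "instance": "192.168.1.2:9100"}]
alerts = [
    {"alertname": "HostDown", "job": "node", "instance": "192.168.1.2:9100"},
    {"alertname": "HostDown", "job": "node", "instance": "192.168.1.3:9100"},
]
# Only alerts that no silence matches end up in notifications.
to_notify = [alert for alert in alerts if not is_silenced(alert, silences)]
print(to_notify)
```

Here the alert for `192.168.1.2:9100` is suppressed, while the one for `192.168.1.3:9100` would still be included in a notification.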
Target
A target is the definition of an object to scrape. For example, what labels to apply, any authentication required to connect, or other information that defines how the scrape will occur.
Installing Prometheus
Container installation
docker run -p 9090:9090 -v /etc/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus

Binary installation
export VERSION=2.26.0
curl -LO https://github.com/prometheus/prometheus/releases/download/v$VERSION/prometheus-$VERSION.linux-amd64.tar.gz
# The archive contains a default configuration file and the server binary
tar zxvf prometheus-$VERSION.linux-amd64.tar.gz
cd prometheus-$VERSION.linux-amd64
./prometheus --config.file=prometheus.yml

Then open the web UI:
http://localhost:9090

Installing node_exporter
Project page: https://github.com/prometheus/node_exporter
export VERSION=1.1.2
# darwin-amd64 is the macOS build; on Linux use linux-amd64 instead
curl -OL https://github.com/prometheus/node_exporter/releases/download/v$VERSION/node_exporter-$VERSION.darwin-amd64.tar.gz
tar -xzf node_exporter-$VERSION.darwin-amd64.tar.gz
cd node_exporter-$VERSION.darwin-amd64
# Start the service
./node_exporter --web.listen-address=127.0.0.1:9100

Open the node_exporter metrics endpoint to see the metrics it collects:
http://localhost:9100/metrics
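The `/metrics` page is plain text, one sample per line. As a rough sketch of how that text can be read, the parser below handles simple lines only: it assumes no trailing timestamps and no commas or escaped quotes inside label values. The metric names are real node_exporter metrics, but the values are made up for illustration.

```python
def parse_metrics(text):
    """Parse simple exposition lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, _, value = line.rpartition(" ")
        name, _, rest = series.partition("{")
        labels = {}
        if rest:
            for pair in rest.rstrip("}").split(","):
                if pair:
                    key, _, val = pair.partition("=")
                    labels[key] = val.strip('"')
        samples.append((name, labels, float(value)))
    return samples


# Illustrative excerpt in the format served at http://localhost:9100/metrics
sample_text = """\
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 12345.67
node_load1 0.42
"""
for name, labels, value in parse_metrics(sample_text):
    print(name, labels, value)
```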

Prometheus configuration
# my global config
global:
  # Scrape interval
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  # Rule evaluation interval
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
# Scrape target configuration
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    # Static configuration
    static_configs:
      - targets: ['localhost:9090']

  # Scrape node exporter metrics
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100', '192.168.1.2:9100', '192.168.1.3:9100'] # three node_exporter targets
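To see how the scrape settings fit together, the sketch below derives the URLs that would be scraped from a job definition, using the defaults noted in the configuration comments (`scheme` defaults to `http`, `metrics_path` to `/metrics`). It works on plain Python dictionaries that mirror the two jobs above; the helper name is hypothetical, and this is a simplification that ignores relabeling and service discovery.

```python
def scrape_urls(job):
    """Return the scrape URL for every static target of one job."""
    scheme = job.get("scheme", "http")           # default scheme
    path = job.get("metrics_path", "/metrics")   # default metrics path
    return [
        f"{scheme}://{target}{path}"
        for static in job.get("static_configs", [])
        for target in static.get("targets", [])
    ]


# Mirrors the scrape_configs section of the configuration file.
scrape_configs = [
    {"job_name": "prometheus", "static_configs": [{"targets": ["localhost:9090"]}]},
    {
        "job_name": "node",
        "static_configs": [
            {"targets": ["localhost:9100", "192.168.1.2:9100", "192.168.1.3:9100"]}
        ],
    },
]
for job in scrape_configs:
    for url in scrape_urls(job):
        print(job["job_name"], url)
```

Each printed URL corresponds to one target, and each target's series carry the `job` label of the job it belongs to.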

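Checking the configuration file
# promtool also ships in the release tarball, next to the prometheus binary
go get github.com/prometheus/prometheus/cmd/promtool
promtool check config prometheus.yml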

promtool subcommands
# Check configuration files
check config ...
    Check if the config files are valid or not.
# Check rule files
check rules ...
    Check if the rule files are valid or not.
# Check metrics
check metrics
    Pass Prometheus metrics over stdin to lint them for consistency and correctness.
    examples:
    $ cat metrics.prom | promtool check metrics
    $ curl -s http://localhost:9090/metrics | promtool check metrics

Using the Prometheus web UI
In the web UI you can query the collected metrics with PromQL, the Prometheus query language.
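The same kind of PromQL expression can also be sent to the Prometheus HTTP API. As a minimal sketch, the snippet below builds the URL for an instant query against `/api/v1/query`, assuming the server from this article at `localhost:9090`; the helper name is hypothetical, and `up{job="node"}` is just an example expression.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def instant_query_url(base_url, promql):
    """Build the URL for an instant query against the Prometheus HTTP API."""
    query_string = urlencode({"query": promql})  # percent-encodes braces and quotes
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


url = instant_query_url("http://localhost:9090", 'up{job="node"}')
print(url)
# Decoding the query string gives back the original expression.
print(parse_qs(urlsplit(url).query)["query"][0])
```

Fetching that URL (for example with `curl`) returns a JSON document with the current value of `up` for each target of the `node` job.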

Installing Grafana for visualization
If you are new to Grafana, you can try it online first: https://play.grafana.org/

Installation
Docker installation
docker run -d --name=grafana -p 3000:3000 grafana/grafana

Accessing the page
Open http://localhost:3000
The default username and password are admin/admin. On first login Grafana asks you to change the password; once you have changed it, you are logged in.

After logging in, you see the following screen.
[Figure: Grafana home page after login]

Adding the Prometheus data source
Grafana supports data sources such as Graphite, InfluxDB, OpenTSDB, Prometheus, Elasticsearch, CloudWatch, and KairosDB.
We add Prometheus as a data source:
Configuration -> Data Sources -> Add data source -> Prometheus -> Select
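The same data source can also be created through Grafana's HTTP API instead of the UI. The sketch below only builds the request for `POST /api/datasources` and does not send it. The Prometheus URL is a placeholder (from inside a Grafana container, `localhost` refers to the container itself, so a reachable host name or IP is needed), and authentication is deliberately left out and would have to be added.

```python
import json
import urllib.request


def build_datasource_request(grafana_url, prometheus_url, name="Prometheus"):
    """Build (but do not send) a request that registers a Prometheus data source."""
    payload = {
        "name": name,
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",   # Grafana's backend queries Prometheus on the user's behalf
        "isDefault": True,
    }
    return urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


req = build_datasource_request(
    "http://localhost:3000",    # Grafana address used in this article
    "http://prometheus:9090",   # placeholder: an address Grafana can reach
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# Sending requires credentials (for example basic auth or an API key), omitted here.
```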

Adding a dashboard template
Grafana offers many ready-made dashboard templates that let you build monitoring pages quickly.
You can browse them at https://grafana.com/grafana/dashboards
and pick a template that fits what you are monitoring.

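Since we are monitoring host information, I chose the template shown below.
The red box marks the template's description.
[Figure: the selected dashboard template page, description highlighted in red]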

Click "Copy ID to Clipboard" on the right.
In Grafana: Create -> Import -> paste the ID you just copied (8919 here) and click Load.

Then set the name and the data source and click Import.

We then get the dashboard shown below. Pretty impressive, isn't it?
[Figure: imported host monitoring dashboard]

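For real workloads, we still need to customize the dashboards to match business requirements.
Detailed operations and more in-depth Prometheus topics will be covered in later articles.
Postscript
[Ops GO - Cloud Native | Monitoring in Practice (1): First Steps with Prometheus + Grafana] Article 7 of the 200-articles-in-2021 plan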
