运维GO - Cloud Native | Monitoring in Practice (1): A First Look at Monitoring with Prometheus + Grafana

Contents

Introduction to Prometheus
- Glossary
  - Prometheus
  - Exporters
  - Service discovery
  - AlertManager
  - Bridge
  - Client library
  - Instance
  - Job
  - Notification
  - Promdash
  - PromQL
  - Pushgateway
  - Remote Read
  - Remote Read Adapter
  - Remote Read Endpoint
  - Remote Write
  - Remote Write Adapter
  - Remote Write Endpoint
  - Sample
  - Silence
  - Target
Installing Prometheus
- Container installation
- Installing from a release tarball
Installing node_exporter
Prometheus configuration
Using the Prometheus UI
Installing Grafana for visualization
- Installation
  - Docker installation
- Accessing the page
- Adding the Prometheus data source
- Adding a template

This article gives you a first hands-on look at Prometheus + Grafana and shows you how to set up a working stack yourself.

Introduction to Prometheus

The overall architecture of Prometheus is shown in the figure below.

[Figure: Prometheus architecture diagram]

Glossary

Prometheus

Prometheus usually refers to the core binary of the Prometheus system. It may also refer to the Prometheus monitoring system as a whole.

Exporters

An exporter is a binary running alongside the application you want to obtain metrics from. The exporter exposes Prometheus metrics, commonly by converting metrics that are exposed in a non-Prometheus format into a format that Prometheus supports.

Prometheus collects host and application metrics over the HTTP endpoints that exporters expose, so installing additional exporters extends the set of metrics that can be collected.
The Prometheus server obtains the monitoring data by requesting the endpoint that each exporter provides.
An exporter's main job is to convert the metrics it gathers into the Prometheus format.
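
To make the conversion role concrete, here is a minimal sketch of an exporter that uses only the Python standard library. The metric name `demo_queue_length`, the `collect` function, and port 9200 are assumptions made for illustration. A real exporter would normally rely on an official client library rather than formatting the text by hand.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(samples):
    """Render {metric_name: (help_text, value)} in the Prometheus text format."""
    lines = []
    for name, (help_text, value) in sorted(samples.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {float(value)}")
    return "\n".join(lines) + "\n"


def collect():
    """Hypothetical data source: a real exporter would query the application here."""
    return {"demo_queue_length": ("Items waiting in the demo queue.", 3)}


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the /metrics path is served, matching Prometheus's default metrics_path.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(collect()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=9200):
    """Start the exporter; the port is an arbitrary choice for this sketch."""
    HTTPServer(("127.0.0.1", port), MetricsHandler).serve_forever()


print(render_metrics(collect()), end="")
```

Calling `serve()` and adding `localhost:9200` as a target in a scrape job would let Prometheus pull this metric like any other.
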
Service discovery

Prometheus integrates with Kubernetes mainly through the following five service discovery modes:

Node
Pod
Endpoints
Service
Ingress

TODO: these modes will be explained in a later article.

AlertManager

Endpoint

A source of metrics that can be scraped, usually corresponding to a single process.
An endpoint exposes formatted metrics data to the Prometheus server.

Bridge

A bridge is a component that takes samples from a client library and exposes them to a non-Prometheus monitoring system. For example, the Python, Go, and Java clients can export metrics to Graphite.

Client library

A client library is a library in some language (e.g. Go, Java, Python, Ruby) that makes it easy to directly instrument your code, write custom collectors to pull metrics from other systems and expose the metrics to Prometheus.

Instance

An instance is a label that uniquely identifies a target in a job.

Job

A collection of targets with the same purpose, for example monitoring a group of like processes replicated for scalability or reliability, is called a job.
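
The relationship between jobs, targets, and the `instance` label can be sketched as follows. This is an illustration only, not Prometheus's actual implementation: `expand_targets` is a hypothetical helper, and the dictionary simply mirrors the shape of a `scrape_configs` section. It assumes the default behaviour in which `instance` is set to the target's `host:port` address.

```python
def expand_targets(scrape_configs):
    """Return one label set per target, mimicking the default job/instance labels."""
    label_sets = []
    for job in scrape_configs:
        for static in job.get("static_configs", []):
            for target in static.get("targets", []):
                labels = dict(static.get("labels", {}))
                labels["job"] = job["job_name"]
                labels["instance"] = target  # by default, the target's host:port
                label_sets.append(labels)
    return label_sets


scrape_configs = [
    {"job_name": "prometheus", "static_configs": [{"targets": ["localhost:9090"]}]},
    {"job_name": "node", "static_configs": [{"targets": ["localhost:9100", "192.168.1.2:9100"]}]},
]

for labels in expand_targets(scrape_configs):
    print(labels)
```

Each printed label set corresponds to one target, and all targets that share a `job` value form one job.
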

Notification

A notification represents a group of one or more alerts, and is sent by the Alertmanager to email, PagerDuty, Slack etc.
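
The idea that one notification carries a group of alerts can be sketched as below. This is a simplified model, not Alertmanager's code: `group_alerts` is a hypothetical helper, the alert names are made up, and grouping by a list of label names is assumed to behave like Alertmanager's `group_by` setting.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Bucket alerts by the values of the given labels; each bucket is one notification."""
    groups = defaultdict(list)
    for alert in alerts:
        key = tuple(alert["labels"].get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)


alerts = [
    {"labels": {"alertname": "HighCPU", "instance": "192.168.1.2:9100"}},
    {"labels": {"alertname": "HighCPU", "instance": "192.168.1.3:9100"}},
    {"labels": {"alertname": "DiskFull", "instance": "192.168.1.2:9100"}},
]

for key, members in group_alerts(alerts, ["alertname"]).items():
    print(key, len(members))
```
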

Promdash

Promdash was a native dashboard builder for Prometheus. It has been deprecated and replaced by Grafana.

PromQL

PromQL is the Prometheus Query Language. It allows for a wide range of operations including aggregation, slicing and dicing, prediction and joins.
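
As a small illustration, the sketch below builds URLs for a few PromQL expressions against the instant-query endpoint `/api/v1/query` of the Prometheus HTTP API. The helper name `instant_query_url` is an assumption for this example, and the server address `http://localhost:9090` is the one used elsewhere in this article. Nothing is sent over the network; the function only shows how an expression is encoded.

```python
from urllib.parse import urlencode


def instant_query_url(base_url, promql):
    """Build the URL for an instant query against the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


examples = [
    "up",                                            # 1 if the last scrape succeeded, else 0
    "sum by (job) (up)",                             # aggregation: healthy targets per job
    'rate(node_cpu_seconds_total{mode="idle"}[5m])', # per-second rate over five minutes
]

for expr in examples:
    print(instant_query_url("http://localhost:9090", expr))
```

The same expressions can be typed directly into the expression box of the Prometheus web UI.
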

Pushgateway

The Pushgateway persists the most recent push of metrics from batch jobs. This allows Prometheus to scrape their metrics after they have terminated.

The Pushgateway can be thought of as a proxy. It exists so that ephemeral and batch jobs can expose their metrics to Prometheus. Because such jobs may not live long enough to be scraped, they push their metrics to the Pushgateway instead, and the Pushgateway then exposes those metrics to Prometheus.
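
A push from a batch job can be sketched as follows. The example only constructs the HTTP request and does not send it. The gateway address `http://localhost:9091`, the job name `nightly_backup`, and the metric name are assumptions for illustration; the URL layout `/metrics/job/<job>/<label>/<value>` follows the Pushgateway's documented push API. Label values that contain a slash need special encoding, which this sketch does not handle.

```python
from urllib.parse import quote
from urllib.request import Request


def build_push_request(gateway_url, job, metrics_text, grouping=None):
    """Build (but do not send) a POST that pushes metrics for one grouping key."""
    path = f"/metrics/job/{quote(job, safe='')}"
    for label, value in (grouping or {}).items():
        path += f"/{quote(label, safe='')}/{quote(value, safe='')}"
    return Request(
        gateway_url.rstrip("/") + path,
        data=metrics_text.encode("utf-8"),
        method="POST",
    )


request = build_push_request(
    "http://localhost:9091",
    "nightly_backup",
    "backup_last_success_timestamp_seconds 1620000000\n",  # body must end with a newline
    grouping={"instance": "db1"},
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the push. Prometheus then scrapes the Pushgateway like any other target.
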

Remote Read

Remote read is a Prometheus feature that allows transparent reading of time series from other systems (such as long term storage) as part of queries.

Remote Read Adapter

Not all systems directly support remote read. A remote read adapter sits between Prometheus and another system, converting time series requests and responses between them.

Remote Read Endpoint

A remote read endpoint is what Prometheus talks to when doing a remote read.

Remote Write

Remote write is a Prometheus feature that allows sending ingested samples on the fly to other systems, such as long term storage.

Remote Write Adapter

Not all systems directly support remote write. A remote write adapter sits between Prometheus and another system, converting the samples in the remote write into a format the other system can understand.

Remote Write Endpoint

A remote write endpoint is what Prometheus talks to when doing a remote write.

Sample

A sample is a single value at a point in time in a time series.
In Prometheus, each sample consists of a float64 value and a millisecond-precision timestamp.
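
The definition above can be sketched as a small data type plus a parser for one line of the text exposition format. `Sample` and `parse_line` are hypothetical names for this illustration. The parser is deliberately simplified: it does not unescape label values, and it assigns the scrape time when a line carries no timestamp of its own.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    value: float       # stored as float64 in Prometheus
    timestamp_ms: int  # milliseconds since the Unix epoch


LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)'
    r'(?:\s+(?P<ts>-?\d+))?$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_line(line, scrape_time_ms):
    """Parse one metric line into (metric name, labels, Sample)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a metric line: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    ts = int(match.group("ts")) if match.group("ts") else scrape_time_ms
    return match.group("name"), labels, Sample(float(match.group("value")), ts)


print(parse_line('node_cpu_seconds_total{cpu="0",mode="idle"} 12345.67', 1620000000000))
```
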

Silence

A silence in the Alertmanager prevents alerts, with labels matching the silence, from being included in notifications.
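
Label matching for a silence can be sketched as below. This is a simplified model rather than Alertmanager's implementation: `is_silenced` is a hypothetical helper, only equality and regular-expression matchers are covered (negative matchers are omitted), and a missing label is treated as an empty string.

```python
import re


def is_silenced(alert_labels, matchers):
    """Return True if every matcher (label, pattern, is_regex) matches the alert."""
    for name, pattern, is_regex in matchers:
        actual = alert_labels.get(name, "")
        if is_regex:
            # The whole label value must match the pattern.
            if re.fullmatch(pattern, actual) is None:
                return False
        elif actual != pattern:
            return False
    return True


silence = [("alertname", "HighCPU", False), ("instance", r"192\.168\.1\..*", True)]

print(is_silenced({"alertname": "HighCPU", "instance": "192.168.1.2:9100"}, silence))
print(is_silenced({"alertname": "HighCPU", "instance": "localhost:9100"}, silence))
```

The first alert matches both matchers and would be left out of notifications; the second does not match the `instance` pattern and would still be sent.
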

Target

A target is the definition of an object to scrape. For example, what labels to apply, any authentication required to connect, or other information that defines how the scrape will occur.

Installing Prometheus

Container installation

docker run -p 9090:9090 -v /etc/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus

Installing from a release tarball

export VERSION=2.26.0
curl -LO https://github.com/prometheus/prometheus/releases/download/v$VERSION/prometheus-$VERSION.linux-amd64.tar.gz
# The archive contains the default configuration file and the server binary
tar zxvf prometheus-$VERSION.linux-amd64.tar.gz
cd prometheus-$VERSION.linux-amd64
./prometheus --config.file=prometheus.yml

Then open the web UI at
http://localhost:9090

Installing node_exporter

Source repository: https://github.com/prometheus/node_exporter

export VERSION=1.1.2
curl -OL https://github.com/prometheus/node_exporter/releases/download/v$VERSION/node_exporter-$VERSION.darwin-amd64.tar.gz
tar -xzf node_exporter-$VERSION.darwin-amd64.tar.gz
cd node_exporter-$VERSION.darwin-amd64
# Start the service
./node_exporter --web.listen-address 127.0.0.1:9100

Open the node_exporter endpoint to see the metrics it collects:
http://localhost:9100/metrics
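
To get a quick overview of what the endpoint returns, the following sketch lists each metric and its type by reading the `# TYPE` comment lines of the text format. `metric_types` and `fetch_metrics` are hypothetical helper names, and the sample text is a hand-written excerpt rather than real node_exporter output.

```python
from urllib.request import urlopen


def metric_types(exposition_text):
    """Map metric name to type using the '# TYPE <name> <type>' lines."""
    types = {}
    for line in exposition_text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[0] == "#" and parts[1] == "TYPE":
            types[parts[2]] = parts[3]
    return types


def fetch_metrics(url="http://localhost:9100/metrics"):
    """Download the raw exposition text; requires a running node_exporter."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


sample_text = (
    "# HELP node_load1 1m load average.\n"
    "# TYPE node_load1 gauge\n"
    "node_load1 0.5\n"
    "# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.\n"
    "# TYPE node_cpu_seconds_total counter\n"
    'node_cpu_seconds_total{cpu="0",mode="idle"} 12345.67\n'
)
print(metric_types(sample_text))
```

With node_exporter running, `metric_types(fetch_metrics())` gives the same overview for the live endpoint.
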

Prometheus configuration

# my global config
global:
  # Scrape interval
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  # Rule evaluation interval
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
# Scrape target configuration
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    # Static configuration
    static_configs:
      - targets: ['localhost:9090']

  # Scrape node exporter monitoring data
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100', '192.168.1.2:9100', '192.168.1.3:9100'] # three targets

Checking the configuration file

# promtool also ships in the Prometheus release archive
go get github.com/prometheus/prometheus/cmd/promtool
promtool check config prometheus.yml

promtool subcommands

# Check configuration files
check config ...
    Check if the config files are valid or not.

# Check rules
check rules ...
    Check if the rule files are valid or not.

# Check metrics
check metrics
    Pass Prometheus metrics over stdin to lint them for consistency and correctness.

    examples:

    $ cat metrics.prom | promtool check metrics
    $ curl -s http://localhost:9090/metrics | promtool check metrics

Using the Prometheus UI

With PromQL, the Prometheus query language, you can