Monitoring Server Status with Prometheus + Grafana + Exporter

Author: Lucien
Published: May 25, 2019
Category: Linux Operations

Original post: https://www.lucien.ink/archives/449/

1. Abstract

This post shows how to collect Linux system information with node_exporter and, by way of Prometheus, display it as a dashboard in Grafana.

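2. Result

[Screenshot of the finished Grafana dashboard: https://www.lucien.ink/usr/uploads/2019/05/3521641643.jpg]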

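3. Introduction

I won't go over the background of Grafana, Prometheus, and Exporters; a quick search turns up plenty. This section focuses on how the three relate to each other.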

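3.1 Background

When writing an application, we usually record logs for later analysis. In most cases the logs are only read after a problem has occurred, which makes them a static, after-the-fact kind of analysis. Often, though, we need to know how the whole system is behaving right now, or at some particular moment: how many requests it has served, what the response times are, how those numbers change over time, and how frequently errors occur. This dynamic, near-real-time information is important for monitoring the health of the whole system.

That is what metrics are for. They look like this: https://monitor.lucien.ink/metrics .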
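For readers who cannot open that link, the sketch below shows the general shape of the text format and one way to pull a single value out of it. The sample is illustrative only: node_load1 and node_memory_MemAvailable_bytes are metric names exposed by recent node_exporter versions, but the values are made up, and metric_value is a hypothetical helper name.

```shell
# Illustrative sample of the Prometheus text format (values are made up).
sample_metrics() {
  cat <<'EOF'
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
# HELP node_memory_MemAvailable_bytes Memory information field MemAvailable_bytes.
# TYPE node_memory_MemAvailable_bytes gauge
node_memory_MemAvailable_bytes 1.2345e+09
EOF
}

# metric_value NAME: read metrics text on stdin and print the value of the
# first unlabelled sample called NAME. Lines starting with '#' are comments
# and never match, because their first field is '#'.
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

sample_metrics | metric_value node_load1
```

On a real host the same helper could be fed from curl, for example: curl -s localhost:9100/metrics | metric_value node_load1.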

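3.2 How they relate

An Exporter's main job is to expose metrics.
Raw metrics are hard for most people to read, so Prometheus provides the Prometheus Query Language (PromQL) for this format. PromQL supports database-like operations such as joins and filtering, which lets us distill exactly what we want, for example memory usage or load. The rough flow is: scrape metrics from remote endpoints (there can be many) into local storage → distill information with PromQL.
PromQL is very powerful, but it has a steep learning curve for most people, so Grafana wraps the PromQL queries and presents their results as charts.

In short, the flow is: produce → process → process again.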
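To make the "process" step concrete, the sketch below sends a PromQL expression to Prometheus's HTTP query endpoint (/api/v1/query) and prints the raw JSON answer. It assumes Prometheus is reachable at http://localhost:9090 and is already scraping node_exporter; promql_query and PROM_URL are hypothetical names used only for this illustration.

```shell
# Assumed address of the Prometheus server; override PROM_URL if yours differs.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# promql_query EXPR: evaluate a PromQL expression through the HTTP API
# (GET /api/v1/query) and print the raw JSON response.
promql_query() {
  curl -fsS --max-time 5 -G "${PROM_URL}/api/v1/query" \
    --data-urlencode "query=$1"
}

# Example expressions, assuming node_exporter metrics are available:
#   promql_query 'node_load1'
#   promql_query '1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes'
```

The second expression is the kind of thing a Grafana panel runs on our behalf: the fraction of memory in use, computed from two raw metrics.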

Of course, Prometheus and Grafana can do far more than this. Alerting is an even more powerful feature, but it is outside the scope of this post.

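3.3 Exporter

It is worth noting that "Exporter" refers to a whole class of components whose main purpose is to expose metrics for further processing.

Some components expose metrics themselves, such as Grafana, Prometheus, and etcd. The metrics linked in Section 3.1 are produced by Grafana itself.

Some components do not expose metrics, such as programs we write ourselves.

And some things are not even components, such as the Linux system itself. In these last two cases a separate Exporter, such as node_exporter for Linux, collects the data and exposes it as metrics.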
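One way for a program of our own to take part without writing a full Exporter is node_exporter's textfile collector: when node_exporter is started with --collector.textfile.directory pointing at a directory, it also exposes the contents of the *.prom files it finds there. The sketch below illustrates that idea only and is not part of the setup described in this post; write_metric, the metric name, and the scratch directory are hypothetical.

```shell
# write_metric DIR NAME VALUE: publish one gauge as DIR/NAME.prom.
# The file is written under a temporary name and then renamed, so that
# node_exporter never reads a half-written file.
write_metric() {
  dir=$1
  name=$2
  value=$3
  tmp="${dir}/${name}.prom.$$"
  {
    printf '# TYPE %s gauge\n' "$name"
    printf '%s %s\n' "$name" "$value"
  } > "$tmp" && mv "$tmp" "${dir}/${name}.prom"
}

# Demo against a scratch directory. In practice DIR would be the directory
# passed to node_exporter via --collector.textfile.directory.
demo_dir=$(mktemp -d)
write_metric "$demo_dir" myapp_queue_length 17
cat "$demo_dir/myapp_queue_length.prom"
```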

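4. Deployment

Everything in this post is installed from binaries and managed by systemd. VPS platforms such as OpenVZ cannot run Docker, so I chose the more broadly applicable method.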

4.1 Download

node_exporter: https://github.com/prometheus/node_exporter/releases
Prometheus: https://github.com/prometheus/prometheus/releases
Grafana (choose the Standalone Linux Binaries version): https://grafana.com/grafana/download

4.2 Extract and install

Create a new, empty folder and move the downloaded tar.gz files into it.

Make sure the directory looks like this:

dir
├── grafana-x.x.x.linux-amd64.tar.gz
├── node_exporter-x.x.x.linux-amd64.tar.gz
└── prometheus-x.x.x.linux-amd64.tar.gz

Then run the following inside that folder:

curl api.pasteme.cn/8413 | bash

The full list of commands can be viewed at https://pasteme.cn/8413 .

That completes the installation. The systemd service names of the three components are grafana-server, prometheus, and node_exporter.
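The install script itself is not reproduced here. As a rough idea of what "managed by systemd" means for one component, the sketch below prints a minimal unit file for node_exporter. It is an illustration under assumptions, not the contents of that script: the binary path /usr/local/node_exporter/node_exporter is inferred from the install directory removed in Section 6.1.5, and render_unit is a hypothetical helper.

```shell
# render_unit BINARY: print a minimal systemd unit that runs BINARY.
render_unit() {
  cat <<EOF
[Unit]
Description=node_exporter
After=network.target

[Service]
ExecStart=$1
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

# Print the unit. To install it, redirect the output (as root) to
# /lib/systemd/system/node_exporter.service and run: systemctl daemon-reload
render_unit /usr/local/node_exporter/node_exporter
```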

4.3 Verification

4.3.1 systemctl status xxx

Use systemctl status to check whether each component is running.

systemctl status node_exporter
systemctl status prometheus
systemctl status grafana-server

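4.3.2 View the metrics

The default ports of node_exporter, Prometheus, and Grafana are 9100, 9090, and 3000 respectively. The commands below fetch the metrics from each; if a command produces output, that component is running.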

curl localhost:9100/metrics
curl localhost:9090/metrics
curl localhost:3000/metrics
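The three checks can also be wrapped in a small loop that prints one status line per port instead of pages of metrics. This is only a sketch: check_metrics is a hypothetical helper, and the ports are the defaults listed above.

```shell
# check_metrics PORT: print "PORT UP" if localhost:PORT/metrics answers with
# a successful HTTP status within 3 seconds, otherwise "PORT DOWN".
check_metrics() {
  if curl -fs --max-time 3 -o /dev/null "http://localhost:$1/metrics" 2>/dev/null; then
    echo "$1 UP"
  else
    echo "$1 DOWN"
  fi
}

# node_exporter, Prometheus, Grafana
for port in 9100 9090 3000; do
  check_metrics "$port"
done
```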

4.4 Start on boot

This is routine with systemd: enable each service so that it starts at boot.

systemctl enable node_exporter
systemctl enable prometheus
systemctl enable grafana-server

4.5 Uninstall

curl api.pasteme.cn/8414 | bash

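The full list of commands can be viewed at https://pasteme.cn/8414 .

5. Configuration

The three components are now installed, but they are still independent of one another, so we need to configure them to work together.

5.1 Prometheus

Edit /usr/local/prometheus/prometheus.yml.

You will see the following: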

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090'] # this is the line we need to change

Change the targets line to the following. Watch the indentation: YAML is strict about formatting.

    - targets: ['localhost:9100']

With this change, Prometheus reads metrics from localhost:9100/metrics, which is node_exporter. The default target, localhost:9090, serves Prometheus's own metrics.

After saving the file, restart the prometheus service:

systemctl restart prometheus

Use the methods from Section 4.3 to verify that it started successfully. If it did not, check the formatting of the yml file.
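Formatting mistakes can also be caught before restarting. The Prometheus release tarball ships a promtool binary whose check config subcommand validates a configuration file. The sketch below assumes promtool was left next to the prometheus binary in /usr/local/prometheus, which may not hold for your installation; restart_if_valid and PROM_DIR are hypothetical names.

```shell
# Assumed install directory; override PROM_DIR if yours differs.
PROM_DIR="${PROM_DIR:-/usr/local/prometheus}"

# restart_if_valid: restart Prometheus only if promtool accepts the config.
restart_if_valid() {
  if "${PROM_DIR}/promtool" check config "${PROM_DIR}/prometheus.yml"; then
    systemctl restart prometheus
  else
    echo "prometheus.yml is invalid; not restarting" >&2
    return 1
  fi
}

# Usage (as root, on the monitoring host):
#   restart_if_valid
```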

5.2 Grafana

cd /usr/local/grafana/bin
chmod +x grafana-cli
./grafana-cli plugins install grafana-piechart-panel
systemctl restart grafana-server

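The commands above install a pie chart panel plugin.

Then open http://<YOUR_IP>:3000 . The default username and password are both admin.

[Screenshot: the default username and password are both admin]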

Click Add data source.

[Screenshot: click Add data source]

Select Prometheus.

[Screenshot: select Prometheus]

Under HTTP → URL, enter http://localhost:9090 , which is the endpoint Prometheus serves.

Then click Save & Test.

[Screenshot: enter http://localhost:9090]
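The same data source can also be created without the browser, through Grafana's HTTP API (POST /api/datasources). The sketch below is an alternative to the clicks above, not something the original setup requires. It assumes Grafana listens on localhost:3000 and still accepts the default admin/admin credentials; add_datasource, datasource_json, and GRAFANA_URL are hypothetical names.

```shell
# Assumed Grafana address; override GRAFANA_URL if yours differs.
GRAFANA_URL="${GRAFANA_URL:-http://localhost:3000}"

# datasource_json: print the JSON body describing the Prometheus data source.
datasource_json() {
  cat <<'EOF'
{"name": "Prometheus", "type": "prometheus", "url": "http://localhost:9090", "access": "proxy"}
EOF
}

# add_datasource: send the JSON body to Grafana using basic auth.
add_datasource() {
  datasource_json | curl -fsS --max-time 5 -u admin:admin \
    -H 'Content-Type: application/json' \
    -X POST "${GRAFANA_URL}/api/datasources" --data-binary @-
}

# Usage (on the monitoring host):
#   add_datasource
```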

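Next, hover over the + in the upper-left corner (hover, do not click), and choose Import from the menu that pops up.

[Screenshot: hover over the + in the upper-left corner, then click Import in the pop-up menu]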

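Here we can import Dashboards that others have already built for various Exporters; browse https://grafana.com/dashboards to find them. In this post we use a Dashboard for node_exporter written by a Chinese developer. Its page is https://grafana.com/dashboards/8919 .

Enter 8919 in the Grafana.com Dashboard field, then click on the blank area next to it.

[Screenshot: enter 8919 in the Grafana.com Dashboard field, then click the blank area next to it]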

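After you click the blank area, the Dashboard is loaded automatically and you are asked to choose a data source. Under Options → prometheus_111, select the Prometheus data source we just added, then click Import.

[Screenshot: under prometheus_111, select the Prometheus data source we just added, then click Import]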

5.3 Configuration complete

Grafana, Prometheus, and node_exporter are now connected.

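6. Monitoring multiple nodes

Sections 4 and 5 only set up monitoring of the local machine. To monitor other nodes, install the appropriate Exporter on each monitored node. The steps below use node_exporter, as in the rest of this post, to show how to add a node.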

6.1 Deployment

6.1.1 Download the Exporter

node_exporter: https://github.com/prometheus/node_exporter/releases

6.1.2 Extract and install

Create a new, empty folder and move the downloaded tar.gz file into it.

Make sure the directory looks like this:

dir
└── node_exporter-x.x.x.linux-amd64.tar.gz

Then run the following inside that folder:

curl api.pasteme.cn/8416 | bash

The full list of commands can be viewed at https://pasteme.cn/8416 .

node_exporter is now installed. Its systemd service name is node_exporter.

6.1.3 Verification

See Section 4.3; the steps are the same.

6.1.4 Start on boot

systemctl enable node_exporter

6.1.5 Uninstall

systemctl disable node_exporter
systemctl stop node_exporter
rm -f /lib/systemd/system/node_exporter.service
rm -rf /usr/local/node_exporter

6.2 Configure Prometheus

On the monitoring node, edit the Prometheus configuration file /usr/local/prometheus/prometheus.yml.

Change the targets line to the following. Watch the indentation: YAML is strict about formatting.

    - targets: ['localhost:9100', 'addr:9100']

Here addr is the IP address or domain name of the monitored node.

Then restart Prometheus, and the new node will appear in the Grafana Dashboard.

systemctl restart prometheus

6.2.1 A note on targets

As you can see, targets takes a list. Prometheus collects metrics from every element of the list, and Grafana then processes the data.
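When the list of nodes grows, writing that line by hand becomes error-prone. The sketch below builds it from a list of hosts. targets_line is a hypothetical helper, and it assumes every node runs node_exporter on the default port 9100.

```shell
# targets_line HOST...: print a "- targets:" line listing HOST:9100 for each
# argument, indented to match the static_configs entry in prometheus.yml.
targets_line() {
  list=""
  for host in "$@"; do
    # Add ", " before every entry except the first.
    list="${list}${list:+, }'${host}:9100'"
  done
  printf "    - targets: [%s]\n" "$list"
}

targets_line localhost addr
```

Called with localhost and addr, it prints the same line shown in Section 6.2; further hosts are simply appended as extra arguments.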

Last modified: April 24, 2022
