feat: add tutorial documentation for monitoring setup and update navigation
tiankaima committed Jan 15, 2025
1 parent f7ed07e commit 56ef4a9
Showing 6 changed files with 36 additions and 33 deletions.
File renamed without changes.
File renamed without changes.
File renamed without changes
32 changes: 1 addition & 31 deletions docs/lab/srv/8x4090.md
@@ -4,34 +4,4 @@

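I have managed this server since it was first racked, so the notes here are somewhat more complete.

Racking checklist -> [docs/lab/checklist.md](/lab/checklist)

## Monitoring

<https://grafana.lab.tiankaima.cn:8443/>

Monitoring is set up with Prometheus + Grafana:

- CPU, memory, disk, and network traffic: `node-exporter`
- GPU monitoring: `dcgm-exporter`
- Monitoring backend: `prometheus`
- Visualization: `grafana`

Configuration files for reference: <https://gist.github.com/tiankaima/9c31f36435af0c5093704b366d43eea2>

!!! note ""

+ Anonymous access is enabled in Grafana, so the monitoring data can be viewed directly without logging in; access is read-only.
+ `8xa6000`, another machine in the same server room, uses a similar deployment but relies on this machine's Grafana for visualization; the data source can be switched with the setting shown below.

![Switching the Prometheus data source](./img/1.png){width=400}

## Network notes

+ Set the proxy with the following commands: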

```bash
export http_proxy="http://192.168.50.1:7890";
export https_proxy=$http_proxy;
export no_proxy="localhost, 127.0.0.1, ::1"
```
Racking checklist -> [docs/lab/checklist.md](/lab/checklist)
31 changes: 31 additions & 0 deletions docs/lab/tutorial.md
@@ -0,0 +1,31 @@
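# Usage guide

## Monitoring

<https://grafana.lab.tiankaima.cn:8443/>

Monitoring is set up with Prometheus + Grafana:

- CPU, memory, disk, and network traffic: `node-exporter`
- GPU monitoring: `dcgm-exporter`
- Monitoring backend: `prometheus`
- Visualization: `grafana`

Configuration files for reference: <https://gist.github.com/tiankaima/9c31f36435af0c5093704b366d43eea2>

!!! note ""

+ Anonymous access is enabled in Grafana, so the monitoring data can be viewed directly without logging in; access is read-only.
+ `8xa6000`, another machine in the same server room, uses a similar deployment but relies on this machine's Grafana for visualization; the data source can be switched with the setting shown below.

![Switching the Prometheus data source](./img/1.png){width=400}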
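
To check from the server itself that the four components are responding, something like the following sketch may help. It assumes each service listens on its upstream default port (`node-exporter` 9100, `dcgm-exporter` 9400, `prometheus` 9090, `grafana` 3000); the actual deployment in the linked gist may use different ports or bind addresses. The function names `component_port`, `probe_url`, and `check_monitoring` are hypothetical helpers, not part of any of these tools.

```bash
# Map a component name to its port.
# ASSUMPTION: upstream default ports; adjust to match the real deployment.
component_port() {
  case "$1" in
    node-exporter) echo 9100 ;;
    dcgm-exporter) echo 9400 ;;
    prometheus)    echo 9090 ;;
    grafana)       echo 3000 ;;
    *) echo "unknown component: $1" >&2; return 1 ;;
  esac
}

# Build the URL to probe: $1 = host, $2 = component name.
# Exporters expose /metrics; Prometheus and Grafana have health endpoints.
probe_url() {
  local port
  local path
  port="$(component_port "$2")" || return 1
  case "$2" in
    prometheus) path="/-/healthy" ;;
    grafana)    path="/api/health" ;;
    *)          path="/metrics" ;;
  esac
  echo "http://$1:${port}${path}"
}

# Probe every component on a host (default: localhost) and report the result.
check_monitoring() {
  local host
  local c
  local url
  host="${1:-localhost}"
  for c in node-exporter dcgm-exporter prometheus grafana; do
    url="$(probe_url "$host" "$c")"
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      echo "ok    $c  $url"
    else
      echo "FAIL  $c  $url"
    fi
  done
}
```

Running `check_monitoring` with no argument probes `localhost`; passing a hostname probes that machine instead, provided the ports are reachable from where the command runs.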

## Network notes

+ Set the proxy with the following commands:

```bash
export http_proxy="http://192.168.50.1:7890";
export https_proxy=$http_proxy;
export no_proxy="localhost, 127.0.0.1, ::1"
```
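
Because these `export` lines only affect the current shell session, it can be handy to inspect or clear them. The two helpers below are a minimal sketch; `proxy_status` and `proxy_off` are hypothetical names introduced here for illustration, not existing commands.

```bash
# Print the proxy-related variables as seen by the current shell.
proxy_status() {
  printf 'http_proxy=%s\n'  "${http_proxy:-<unset>}"
  printf 'https_proxy=%s\n' "${https_proxy:-<unset>}"
  printf 'no_proxy=%s\n'    "${no_proxy:-<unset>}"
}

# Remove the proxy settings from the current shell only.
proxy_off() {
  unset http_proxy https_proxy no_proxy
}
```

Calling `proxy_status` after the `export` commands confirms the values took effect, and `proxy_off` restores direct connections for the rest of the session.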
6 changes: 4 additions & 2 deletions mkdocs.yml
@@ -71,8 +71,10 @@ nav:
- 3D Gaussian Splatting: ml/dl/gs/3d-gs.md
- Lab:
- lab/index.md
- lab/checklist.md
- lab/network.md
- lab/tutorial.md
- Admin:
- lab/admin/checklist.md
- lab/admin/network.md
- Servers:
- lab/srv/8x4090.md
- lab/srv/8xa6000.md