Viewing logs in a cephadm environment

Author: cloudFans | Published: 2022-11-24 11:20

There are two ways to view the logs. Both first require the cluster ID (the fsid).


# Locate the node where the service is running
ceph orch ps 

# On every node, the logs are viewed as follows

# ceph -s
  cluster:
    id:     8c2b898a-0324-11ed-8b84-089204a58dfa
    health: HEALTH_OK

ls -l  /var/log/ceph/<cluster-fsid>
# or
# journalctl -u ceph-5c5a50ae-272a-455d-99e9-32c6a013e694@mon.foo
journalctl -u ceph-8c2b898a-0324-11ed-8b84-089204a58dfa@mon.ceph-rbd-1
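
The unit name passed to journalctl above follows the pattern ceph-<fsid>@<daemon>, so it can be assembled from the `ceph -s` output rather than typed by hand. A minimal sketch, assuming the `id:` line looks like the one shown above; the helper names `fsid_from_status` and `unit_name` are hypothetical and not part of ceph:

```shell
# Hypothetical helpers (not part of ceph): build the journald unit name
# for a cephadm-managed daemon from the cluster fsid.

# Read `ceph -s` output on stdin and print the value of the "id:" line.
fsid_from_status() {
  awk '$1 == "id:" { print $2; exit }'
}

# Print the unit name for a daemon: ceph-<fsid>@<daemon>
unit_name() {
  printf 'ceph-%s@%s\n' "$1" "$2"
}

# Intended use on a cluster node (commented out so the sketch has no side effects):
#   fsid=$(ceph -s | fsid_from_status)      # `ceph fsid` prints the same value directly
#   ls -l "/var/log/ceph/$fsid"
#   journalctl -u "$(unit_name "$fsid" mon.ceph-rbd-1)"
```

The daemon name (for example mon.ceph-rbd-1 or osd.22) is the one listed by `ceph orch ps` for that node.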
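  1. Rule out network instability inside the 3-node ceph cluster
# 1. First rule out network problems inside the 3-node ceph cluster: if the network were unstable, other errors would be expected in the logs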
[root@ceph-rbd-1 ~]#  grep  -i ERROR /var/log/messages | grep -v "RBD image has snapshots" | grep -v "has overlapping roots" | grep -v mgr-ceph | grep -v ceph-mgr | grep -v ".log"
Nov 22 15:55:54 ceph-rbd-1 ceph-8c2b898a-0324-11ed-8b84-089204a58dfa-grafana-ceph-rbd-1[747899]: server.go:3160: http: TLS handshake error from 10.60.21.48:4660: remote error: tls: unknown certificate
Nov 22 15:56:57 ceph-rbd-1 ceph-8c2b898a-0324-11ed-8b84-089204a58dfa-grafana-ceph-rbd-1[747899]: server.go:3160: http: TLS handshake error from 10.60.23.23:50341: remote error: tls: unknown certificate
Nov 22 15:58:20 ceph-rbd-1 ceph-8c2b898a-0324-11ed-8b84-089204a58dfa-grafana-ceph-rbd-1[747899]: server.go:3160: http: TLS handshake error from 10.60.23.23:50473: remote error: tls: unknown certificate
Nov 23 14:53:22 ceph-rbd-1 ceph-8c2b898a-0324-11ed-8b84-089204a58dfa-grafana-ceph-rbd-1[747899]: server.go:3160: http: TLS handshake error from 10.60.23.23:61933: remote error: tls: unknown certificate
Nov 23 14:57:06 ceph-rbd-1 ceph-8c2b898a-0324-11ed-8b84-089204a58dfa-grafana-ceph-rbd-1[747899]: server.go:3160: http: TLS handshake error from 10.60.23.23:62164: remote error: tls: unknown certificate
Nov 23 15:03:03 ceph-rbd-1 ceph-8c2b898a-0324-11ed-8b84-089204a58dfa-grafana-ceph-rbd-1[747899]: server.go:3160: http: TLS handshake error from 10.60.23.23:62535: remote error: tls: unknown certificate
[root@ceph-rbd-1 ~]#
[root@ceph-rbd-1 ~]#
[root@ceph-rbd-1 ~]# ssh ceph-rbd-2
root@ceph-rbd-2's password:
Last login: Fri Nov 25 11:36:30 2022 from 10.122.16.11
[root@ceph-rbd-2 ~]#
[root@ceph-rbd-2 ~]# grep  -i ERROR /var/log/messages | grep -v "RBD image has snapshots" | grep -v "has overlapping roots" | grep -v mgr-ceph | grep -v ceph-mgr | grep -v ".log"
[root@ceph-rbd-2 ~]# logout
Connection to ceph-rbd-2 closed.
[root@ceph-rbd-1 ~]# ssh ceph-rbd-3
root@ceph-rbd-3's password:
Last login: Fri Nov 25 11:36:56 2022 from 10.122.16.11
[root@ceph-rbd-3 ~]# grep  -i ERROR /var/log/messages | grep -v "RBD image has snapshots" | grep -v "has overlapping roots" | grep -v mgr-ceph | grep -v ceph-mgr | grep -v ".log"
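
The same filter is typed once per node above. It can be wrapped in a function and the chain of `grep -v` calls collapsed into a single call with several `-e` patterns. This is a sketch, not the exact command used above: it assumes every exclusion is meant as a literal string, so it uses `-F`, which makes `.log` match only a literal dot (in the original, `.` matches any character). The name `filter_errors` is hypothetical.

```shell
# Hypothetical helper: read syslog lines on stdin and print the lines that
# mention "error" (case-insensitive), minus the known noise excluded above.
# -F treats every -e pattern as a fixed string rather than a regex.
filter_errors() {
  grep -i 'ERROR' | grep -vF \
    -e 'RBD image has snapshots' \
    -e 'has overlapping roots' \
    -e 'mgr-ceph' \
    -e 'ceph-mgr' \
    -e '.log'
}

# Intended use (commented out; host names are the ones from this cluster):
#   filter_errors < /var/log/messages
#   for h in ceph-rbd-2 ceph-rbd-3; do
#     ssh "$h" cat /var/log/messages | filter_errors
#   done
```

An empty result, as on ceph-rbd-2 above, means nothing survived the filter; the function then returns a non-zero status because the last grep printed no lines.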


# 2. pg_autoscaler may be misbehaving: it appears to keep flapping between scaling up and scaling down

#  grep  -i ERROR /var/log/messages | grep -v "RBD image has snapshots"
Nov 20 03:51:58 ceph-rbd-1 ceph-mgr[729584]: [pg_autoscaler ERROR root] pool 6 has overlapping roots: {-12, -1}
Nov 20 03:51:58 ceph-rbd-1 ceph-mgr[729584]: [pg_autoscaler ERROR root] pool 7 has overlapping roots: {-12, -1, -2}
Nov 20 03:52:58 ceph-rbd-1 ceph-mgr[729584]: [pg_autoscaler ERROR root] pool 6 has overlapping roots: {-12, -1}
Nov 20 03:52:58 ceph-rbd-1 ceph-mgr[729584]: [pg_autoscaler ERROR root] pool 7 has overlapping roots: {-12, -1, -2}
Nov 20 03:53:58 ceph-rbd-1 ceph-mgr[729584]: [pg_autoscaler ERROR root] pool 6 has overlapping roots: {-12, -1}
Nov 20 03:53:58 ceph-rbd-1 ceph-mgr[729584]: [pg_autoscaler ERROR root] pool 7 has overlapping roots: {-12, -1, -2}
Nov 20 03:58:58 ceph-rbd-1 ceph-mgr[729584]: [pg_autoscaler ERROR root] pool 6 has overlapping roots: {-12, -1}
Nov 20 03:58:58 ceph-rbd-1 ceph-mgr[729584]: [pg_autoscaler ERROR root] pool 7 has overlapping roots: {-12, -1, -2}
Nov 20 03:59:58 ceph-rbd-1 ceph-mgr[729584]: [pg_autoscaler ERROR root] pool 6 has overlapping roots: {-12, -1}
Nov 20 03:59:58 ceph-rbd-1 ceph-mgr[729584]: [pg_autoscaler ERROR root] pool 7 has overlapping roots: {-12, -1, -2}
Nov 20 04:00:58 ceph-rbd-1 ceph-mgr[729584]: [pg_autoscaler ERROR root] pool 6 has overlapping roots: {-12, -1}
Nov 20 04:00:58 ceph-rbd-1 ceph-mgr[729584]: [pg_autoscaler ERROR root] pool 7 has overlapping roots: {-12, -1, -2}
Nov 20 04:01:58 ceph-rbd-1 ceph-mgr[729584]: [pg_autoscaler ERROR root] pool 6 has overlapping roots: {-12, -1}
Nov 20 04:01:58 ceph-rbd-1 ceph-mgr[729584]: [pg_autoscaler ERROR root] pool 7 has overlapping roots: {-12, -1, -2}
Nov 20 04:02:58 ceph-rbd-1 ceph-mgr[729584]: [pg_autoscaler ERROR root] pool 6 has overlapping roots: {-12, -1}
Nov 20 04:02:58 ceph-rbd-1 ceph-mgr[729584]: [pg_autoscaler ERROR root] pool 7 has overlapping roots: {-12, -1, -2}
Nov 20 04:03:58 ceph-rbd-1 ceph-mgr[729584]: [pg_autoscaler ERROR root] pool 6 has overlapping roots: {-12, -1}
Nov 20 04:03:58 ceph-rbd-1 ceph-mgr[729584]: [pg_autoscaler ERROR root] pool 7 has overlapping roots: {-12, -1, -2}
Nov 20 04:04:58 ceph-rbd-1 ceph-mgr[729584]: [pg_autoscaler ERROR root] pool 6 has overlapping roots: {-12, -1}
Nov 20 04:04:58 ceph-rbd-1 ceph-mgr[729584]: [pg_autoscaler ERROR root] pool 7 has overlapping roots: {-12, -1, -2}
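
The output above repeats the same two messages about once a minute, so a per-pool count is easier to read than the raw lines. A minimal sketch, assuming the message format shown above; the function name `summarize_overlaps` is hypothetical:

```shell
# Hypothetical helper: read syslog lines on stdin and print one line per
# distinct "pool N has overlapping roots: {...}" message, prefixed with how
# many times it occurred, ordered by pool number.
summarize_overlaps() {
  sed -nE 's/.*pool ([0-9]+) has overlapping roots: (\{[^}]*\}).*/pool \1 \2/p' |
    awk '{ count[$0]++ } END { for (k in count) print count[k], k }' |
    sort -k3,3n
}

# Intended use (commented out):
#   summarize_overlaps < /var/log/messages
# Possible follow-up with standard ceph commands, to see which CRUSH rules
# and roots the affected pools use:
#   ceph osd pool autoscale-status
#   ceph osd crush rule dump
```

Each output line has the form "<count> pool <id> {<roots>}", which makes it clear whether only pools 6 and 7 are affected or whether other pools report the same error.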


journalctl -u ceph-8c2b898a-0324-11ed-8b84-089204a58dfa@osd.22.service

