The OSD on node ceph03 keeps going down
Status
[root@ceph03 ceph_volume]# ceph -s
cluster:
id: ef6723d3-e8e0-4d39-a276-aa169b4035b4
health: HEALTH_WARN
1 osds down
1 host (1 osds) down
Check the OSD log
[root@ceph03 ceph_volume]# tail -f /var/log/ceph/ceph-osd.2.log
2023-01-09 15:44:34.046 7f37c28df700 1 osd.2 66 _collect_metadata sdb: no unique device id for sdb: fallback method has model 'VMware Virtual S' but no serial'
2023-01-09 15:44:34.662 7f37b8ec6700 1 osd.2 67 state: booting -> active
2023-01-09 15:45:02.592 7f37b8ec6700 0 log_channel(cluster) log [WRN] : Monitor daemon marked osd.2 down, but it is still running
2023-01-09 15:45:02.592 7f37b8ec6700 0 log_channel(cluster) log [DBG] : map e68 wrongly marked me down at e68
2023-01-09 15:45:02.592 7f37b8ec6700 0 osd.2 68 _committed_osd_maps marked down 6 > osd_max_markdown_count 5 in last 600.000000 seconds, shutting down
2023-01-09 15:45:02.592 7f37b8ec6700 1 osd.2 68 start_waiting_for_healthy
2023-01-09 15:45:02.595 7f37b8ec6700 0 osd.2 68 _committed_osd_maps shutdown OSD via async signal
2023-01-09 15:45:02.595 7f37c6165700 -1 received signal: Interrupt from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0
2023-01-09 15:45:02.595 7f37c6165700 -1 osd.2 68 *** Got signal Interrupt ***
2023-01-09 15:45:02.595 7f37c6165700 -1 osd.2 68 *** Immediate shutdown (osd_fast_shutdown=true) ***
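The log above shows why the daemon exits: osd.2 was marked down 6 times within 600 seconds, which exceeds osd_max_markdown_count (5), so it shuts itself down even though the process is healthy. A minimal sketch for counting these events in a log file; the helper name count_wrong_markdowns is made up for illustration and is not part of Ceph:

```shell
# Count how often an OSD log reports that the OSD was wrongly marked down.
# $1 = path to the OSD log file.
count_wrong_markdowns() {
    # grep -c prints the number of matching lines (0 if there are none);
    # "|| true" hides grep's non-zero exit status when nothing matches.
    grep -c 'wrongly marked me down' "$1" || true
}

# Example:
#   count_wrong_markdowns /var/log/ceph/ceph-osd.2.log
```

A count that keeps growing while the process stays alive suggests the monitors and peer OSDs cannot reach the daemon, rather than a fault in the disk itself.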
I wiped sdb again with dd and re-added the OSD to the cluster. It showed as in and up, but after a short while it went down again.
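Check the logs on the healthy nodes, ceph01 and ceph02
They never received a heartbeat reply from osd.2, so the next thing to check is the network.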
/var/log/ceph/ceph-osd.0.log:2023-01-09 15:44:26.424 7f3f5497e700 -1 osd.0 65 heartbeat_check: no reply from 192.168.175.140:6808 osd.2 ever on either front or back, first ping sent 2023-01-09 15:44:02.472023 (oldest deadline 2023-01-09 15:44:22.472023)
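The heartbeat_check line names both the silent peer and the address it was pinged on, so it can be turned directly into a connectivity test. A rough sketch, to be run on a healthy node such as ceph01; both helper names are hypothetical, and can_connect assumes bash with /dev/tcp support plus the coreutils timeout command:

```shell
# Print "osd.N address:port" for every peer that heartbeat_check reports as silent.
# Reads OSD log lines on stdin.
silent_peers() {
    sed -n 's/.*heartbeat_check: no reply from \([^ ]*\) \(osd\.[0-9]*\).*/\2 \1/p' | sort -u
}

# Try a TCP connection to host $1, port $2, giving up after 3 seconds.
# Returns 0 if the connection succeeds, non-zero otherwise.
can_connect() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Example:
#   grep heartbeat_check /var/log/ceph/ceph-osd.0.log | silent_peers
#   can_connect 192.168.175.140 6808 && echo reachable || echo blocked
```

If the port is blocked while ordinary ping and ssh to the same host still work, a host firewall on the target node is a likely cause.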
[root@ceph03 ceph_volume]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
Active: active (running) since Mon 2023-01-09 15:23:10 CST; 47min ago
Docs: man:firewalld(1)
Main PID: 540 (firewalld)
CGroup: /system.slice/firewalld.service
└─540 /usr/bin/python2 -Es /usr/sbin/firewalld --nofork --nopid
Jan 09 15:23:09 ceph03 systemd[1]: Starting firewalld - dynamic firewall daemon...
Jan 09 15:23:10 ceph03 systemd[1]: Started firewalld - dynamic firewall daemon.
[root@ceph03 ceph_volume]# iptables -S
-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
-N FORWARD_IN_ZONES
-N FORWARD_OUT_ZONES
-N FORWARD_direct
-N FWDI_public
-N FWDI_public_allow
-N FWDI_public_deny
-N FWDI_public_log
-N FWDO_public
-N FWDO_public_allow
-N FWDO_public_deny
-N FWDO_public_log
-N INPUT_ZONES
-N INPUT_direct
-N IN_public
-N IN_public_allow
-N IN_public_deny
-N IN_public_log
-N OUTPUT_direct
-A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -j INPUT_direct
-A INPUT -j INPUT_ZONES
-A INPUT -m conntrack --ctstate INVALID -j DROP
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i lo -j ACCEPT
-A FORWARD -j FORWARD_direct
-A FORWARD -j FORWARD_IN_ZONES
-A FORWARD -j FORWARD_OUT_ZONES
-A FORWARD -m conntrack --ctstate INVALID -j DROP
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
-A OUTPUT -o lo -j ACCEPT
-A OUTPUT -j OUTPUT_direct
-A FORWARD_IN_ZONES -i eth1 -g FWDI_public
-A FORWARD_IN_ZONES -i eth0 -g FWDI_public
-A FORWARD_IN_ZONES -g FWDI_public
-A FORWARD_OUT_ZONES -o eth1 -g FWDO_public
-A FORWARD_OUT_ZONES -o eth0 -g FWDO_public
-A FORWARD_OUT_ZONES -g FWDO_public
-A FWDI_public -j FWDI_public_log
-A FWDI_public -j FWDI_public_deny
-A FWDI_public -j FWDI_public_allow
-A FWDI_public -p icmp -j ACCEPT
-A FWDO_public -j FWDO_public_log
-A FWDO_public -j FWDO_public_deny
-A FWDO_public -j FWDO_public_allow
-A INPUT_ZONES -i eth1 -g IN_public
-A INPUT_ZONES -i eth0 -g IN_public
-A INPUT_ZONES -g IN_public
-A IN_public -j IN_public_log
-A IN_public -j IN_public_deny
-A IN_public -j IN_public_allow
-A IN_public -p icmp -j ACCEPT
-A IN_public_allow -p tcp -m tcp --dport 22 -m conntrack --ctstate NEW,UNTRACKED -j ACCEPT
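In this rule dump the only ACCEPT rule for new inbound connections is TCP port 22, and the INPUT chain ends with a REJECT, so heartbeat traffic to OSD port 6808 never gets through. The sketch below lists the accepted destination ports from `iptables -S` output and checks whether a given port is covered. It is a rough heuristic with made-up helper names: it only looks at single `--dport` values or ranges on ACCEPT rules and ignores multiport rules, source filters, and chain ordering.

```shell
# Print destination ports (or ranges such as 6800:7300) of ACCEPT rules read from stdin.
allowed_dports() {
    sed -n 's/.*--dport \([0-9:]*\) .*-j ACCEPT.*/\1/p'
}

# Succeed if port $1 is covered by any accepted port or range in the rules on stdin.
port_allowed() {
    allowed_dports | awk -v p="$1" -F: '
        NF == 1 && $1 == p            { found = 1 }
        NF == 2 && p >= $1 && p <= $2 { found = 1 }
        END { exit (found ? 0 : 1) }'
}

# Example:
#   iptables -S | port_allowed 6808 && echo allowed || echo "not allowed"
```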
Stop and disable the firewall on ceph03
systemctl stop firewalld
systemctl disable firewalld
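Stopping firewalld is the quickest fix on a lab cluster. An alternative that keeps the firewall running is to open the ports Ceph uses by default: 3300 and 6789 for monitors, and 6800-7300 for OSD and other daemons. The sketch below only prints the firewall-cmd calls so they can be reviewed before running; the function name is hypothetical, and the zone "public" is an assumption taken from the IN_public chains in the rule dump.

```shell
# Print (do not run) firewall-cmd calls that open the default Ceph ports.
# $1 = firewalld zone, defaulting to "public".
ceph_firewall_commands() {
    zone="${1:-public}"
    # 3300 and 6789: monitor ports; 6800-7300: default range for OSD daemons.
    for port in 3300/tcp 6789/tcp 6800-7300/tcp; do
        echo "firewall-cmd --zone=$zone --permanent --add-port=$port"
    done
    # Reload so the permanent rules become active.
    echo "firewall-cmd --reload"
}

# Review the output first, then apply it as root, for example:
#   ceph_firewall_commands public | sh
```

On a node that runs only OSDs, the monitor ports are probably unnecessary and can be dropped from the list.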
[root@ceph03 ceph_volume]# systemctl start ceph-osd.target
[root@ceph03 ceph_volume]# ps -ef |grep osd
ceph 2293 1 35 16:13 ? 00:00:00 /usr/bin/ceph-osd -f --cluster ceph --id 2 --setuser ceph --setgroup ceph
root 2403 1265 0 16:13 pts/0 00:00:00 grep --color=auto osd
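The process listing only shows that the daemon has started. Earlier it was marked down again about 30 seconds after booting, so it is worth confirming a few minutes later that the cluster still sees it as up. A small sketch that picks the down OSDs out of `ceph osd tree` output; the helper name is made up, and it assumes the usual column layout in which the status word follows the osd.N name:

```shell
# Print the names of OSDs whose status column reads "down".
# Reads `ceph osd tree` output on stdin.
down_osds() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i ~ /^osd\.[0-9]+$/ && $(i + 1) == "down")
                print $i
    }'
}

# Example (no output means no OSD is reported down):
#   ceph osd tree | down_osds
```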







