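I. Background
Redis replicates data from the master to the slave. A slave always syncs to its master's state, so if the master holds less persisted data than the slave, the slave rolls back to match the master.
Suppose replication between the original master and slave had broken without you noticing, so that the master held far more data than the slave, and then the master, slave, and sentinel processes all went down at once.
If you start the original slave and the sentinels first, the original slave is elected as the new master. When the original master is started afterward, it joins as the new slave, and its data is rolled back to match.
That is clearly not what you want, so always start the original master first so that it stays the master and this problem cannot occur.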
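One way to notice such a replication gap before anything crashes is to compare the offsets that the master reports in `INFO replication`. The following is a minimal sketch, assuming the usual `master_repl_offset:` and `slaveN:ip=...,offset=...` lines in that output. The function name `repl_gap` is hypothetical; it reads the INFO text from stdin, so it can be fed from something like `redis-cli -p <port> info replication`.

```shell
# repl_gap: read `INFO replication` output (taken on the master) from stdin
# and print "<replica ip> <bytes behind master>" for each replica line.
repl_gap() {
  tr -d '\r' | awk -F: '
    /^master_repl_offset:/ { master = $2 }
    /^slave[0-9]+:/ {
      # $2 looks like ip=...,port=...,state=...,offset=...,lag=...
      n = split($2, kv, ",")
      for (i = 1; i <= n; i++) {
        split(kv[i], pair, "=")
        if (pair[1] == "ip")     ip[$1]  = pair[2]
        if (pair[1] == "offset") off[$1] = pair[2]
      }
      names[++count] = $1
    }
    END {
      # The replica lines come before master_repl_offset, so compute at the end
      for (i = 1; i <= count; i++)
        print ip[names[i]], master - off[names[i]]
    }'
}
```

A gap that stays large or keeps growing suggests the slave is not keeping up, which is exactly the situation in which starting the slave first would lose data.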
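II. Redis restart playbook
To restart a Redis instance in a 1-master, 1-slave, 3-sentinel setup:
1. Determine the previous master IP from the three sentinel logs.
2. Start the redis process on the master node, then the redis process on the slave node.
3. Start the sentinel processes on the sentinel nodes.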
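Step 1 can be sketched as a small shell function. This is only an illustration, assuming Sentinel's usual event layout (`+switch-master <name> <old ip> <old port> <new ip> <new port>` and `+monitor master <name> <ip> <port> quorum <n>`); the function name `last_master_ip` is hypothetical.

```shell
# last_master_ip LOGFILE: print the master IP named by the most recent
# "+switch-master" or "+monitor master" event in a sentinel log.
# Prints nothing if the log contains neither event.
last_master_ip() {
  grep -E '\+switch-master |\+monitor master ' "$1" | tail -n 1 | awk '
    /\+switch-master/  { print $(NF-1) }  # new master IP is second to last
    /\+monitor master/ { print $(NF-3) }  # IP comes before: port, "quorum", n
  '
}
```

Taking the last matching line of either kind avoids comparing line numbers, since whichever event was logged most recently names the current master.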
$ cat hosts
[master_slave]
xx.xx.xx.xx ansible_ssh_host=xx.xx.xx.xx ansible_ssh_pass=XXX
xx.xx.xx.xx ansible_ssh_host=xx.xx.xx.xx ansible_ssh_pass=XXX
[sentinel]
xx.xx.xx.xx ansible_ssh_host=xx.xx.xx.xx ansible_ssh_pass=XXX
xx.xx.xx.xx ansible_ssh_host=xx.xx.xx.xx ansible_ssh_pass=XXX
xx.xx.xx.xx ansible_ssh_host=xx.xx.xx.xx ansible_ssh_pass=XXX
[all:vars]
ansible_ssh_extra_args='-o StrictHostKeyChecking=no'
ansible_hosts_dir='/path/to/'
instance_name='XXX'
instance_port='XXX'
$ cat start_master-slave-sentinel.yml
# Find the master IP from before the outage in each sentinel's log
- hosts: sentinel
  gather_facts: false
  tasks:
    - name: "sentinel"
      block:
        - name: "print redis instance name"
          debug:
            msg: "#################### {{ instance_name }} ####################"
        - name: "select redis master ip from sentinel log"
          shell: |
            # Line numbers of the last "+switch-master" and "+monitor master" events
            switch_line=`grep -n "+switch-master" /path/to/sentinel.log | awk 'END {print}' | awk -F ":" '{print $1}'`
            monitor_line=`grep -n "+monitor master" /path/to/sentinel.log | awk 'END {print}' | awk -F ":" '{print $1}'`
            # "+switch-master <name> <old ip> <old port> <new ip> <new port>": new master IP is $(NF-1)
            # "+monitor master <name> <ip> <port> quorum <n>": master IP is $(NF-3)
            if [ "${switch_line}" = "" -a "${monitor_line}" = "" ]; then
              redis_master_ip=""
            fi
            if [ "${switch_line}" != "" -a "${monitor_line}" = "" ]; then
              redis_master_ip=`grep -n "+switch-master" /path/to/sentinel.log | awk 'END {print}' | awk '{print $(NF-1)}'`
            fi
            if [ "${switch_line}" = "" -a "${monitor_line}" != "" ]; then
              redis_master_ip=`grep -n "+monitor master" /path/to/sentinel.log | awk 'END {print}' | awk '{print $(NF-3)}'`
            fi
            if [ "${switch_line}" != "" -a "${monitor_line}" != "" ]; then
              # Both events exist: the one logged later wins
              if [ "${switch_line}" -gt "${monitor_line}" ]; then
                redis_master_ip=`grep -n "+switch-master" /path/to/sentinel.log | awk 'END {print}' | awk '{print $(NF-1)}'`
              else
                redis_master_ip=`grep -n "+monitor master" /path/to/sentinel.log | awk 'END {print}' | awk '{print $(NF-3)}'`
              fi
            fi
            echo "${redis_master_ip}"
          register: redis_master_ip
        - name: "print redis master ip"
          debug:
            msg: "{{ redis_master_ip.stdout_lines }}"
# Check that the master IPs found in the three sentinel logs agree
- hosts: localhost
  gather_facts: false
  tasks:
    - name: "localhost"
      block:
        - name: "check the consistency of master ip from sentinel"
          shell: |
            # Use .stdout rather than .stdout_lines[0] so an empty result does not fail on indexing
            ip0="{{ hostvars[groups['sentinel'][0]].redis_master_ip.stdout }}"
            ip1="{{ hostvars[groups['sentinel'][1]].redis_master_ip.stdout }}"
            ip2="{{ hostvars[groups['sentinel'][2]].redis_master_ip.stdout }}"
            if [ "${ip0}${ip1}${ip2}" = "" -o "${ip0}${ip1}" = "" -o "${ip1}${ip2}" = "" -o "${ip0}${ip2}" = "" ]; then
              echo "At least two sentinel logs yielded an empty master IP; cannot decide, please check!"
              exit 1
            fi
            if [ "${ip0}" = "${ip1}" ]; then
              echo "${ip0}"
            elif [ "${ip1}" = "${ip2}" ]; then
              echo "${ip1}"
            elif [ "${ip0}" = "${ip2}" ]; then
              echo "${ip2}"
            else
              echo "The sentinel logs disagree on the master IP (no two match); cannot decide, please check!"
              exit 1
            fi
          register: consistent_master_ip
        - name: "print consistent_master_ip"
          debug:
            msg: "{{ consistent_master_ip.stdout }}"
        # If at least two sentinel logs agree on the master IP, start the master; if it fails to start, abort
        - name: "start master"
          shell: ansible -i {{ ansible_hosts_dir }}/hosts "{{ consistent_master_ip.stdout }}" -m shell -a "sh /path/to/start_redis.sh"
        - name: "check if master is started"
          shell: |
            if ansible -i {{ ansible_hosts_dir }}/hosts "{{ consistent_master_ip.stdout }}" -m shell -a "ps aux | grep 'redis-server' | grep -w '{{ instance_port }}' | grep -v 'grep'"; then
              echo "Master started successfully!"
            else
              echo "Master failed to start; aborting the instance startup, please check!"
              exit 1
            fi
          register: master_start_result
        - name: "print master start result"
          debug:
            msg: "{{ master_start_result.stdout_lines }}"
# Start the slave (the master is already running, so it is skipped)
- hosts: master_slave
  gather_facts: false
  tasks:
    - name: "master_slave"
      block:
        - name: "start slave"
          shell: |
            if ! ps aux | grep "redis-server" | grep -w "{{ instance_port }}" | grep -v "grep" > /dev/null 2>&1; then
              sh /path/to/start_redis.sh
            fi
# Start the sentinels
- hosts: sentinel
  gather_facts: false
  tasks:
    - name: "sentinel"
      block:
        - name: "start sentinel"
          shell: sh /path/to/start_sentinel.sh
$ ansible-playbook -i hosts start_master-slave-sentinel.yml
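The consistency check in the playbook is a two-out-of-three vote. As a rough standalone sketch of that logic (the function name `majority_ip` is hypothetical and not part of the playbook), it could be tested outside Ansible like this:

```shell
# majority_ip IP0 IP1 IP2: print the value reported by at least two
# sentinels; return 1 if no non-empty value has a majority.
majority_ip() {
  if [ -n "$1" ] && { [ "$1" = "$2" ] || [ "$1" = "$3" ]; }; then
    echo "$1"
  elif [ -n "$2" ] && [ "$2" = "$3" ]; then
    echo "$2"
  else
    return 1
  fi
}
```

For example, `majority_ip 10.0.0.1 "" 10.0.0.1` prints `10.0.0.1`, while three different values, or two empty ones, make the function return a non-zero status so the caller can abort before starting anything.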
III. References
Explanation of the Redis sentinel log
https://www.cnblogs.com/rxysg/p/15688683.html











