Redis Sentinel Cluster Setup and Failover Testing


Author: OrangeLoveMilan | Published: 2020-05-25 10:27

Upgrading Redis from a single node to a cluster

Official documentation:

https://redis.io/topics/sentinel

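Architecture:

One master, two replicas, and three sentinels, all running on the same host (172.24.32.200)

  • Master   172.24.32.200:16379
  • Replica  172.24.32.200:16389
  • Replica  172.24.32.200:16399
  • Sentinel 172.24.32.200:26379
  • Sentinel 172.24.32.200:26389
  • Sentinel 172.24.32.200:26399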

Deploying the two replicas

1. Directories

mkdir -p /data/redis/redis_16389
mkdir -p /data/redis/redis_16399
cp /etc/redis/redis_16379.conf /etc/redis/redis_16389.conf
cp /etc/redis/redis_16379.conf /etc/redis/redis_16399.conf

2. Edit the configuration files

## Change the default port
sed -i 's|port 16379|port 16389|g' /etc/redis/redis_16389.conf
sed -i 's|port 16379|port 16399|g' /etc/redis/redis_16399.conf

## Add the master's address
sed -i '$aslaveof 172.24.32.200 16379' /etc/redis/redis_16389.conf
sed -i '$aslaveof 172.24.32.200 16379' /etc/redis/redis_16399.conf

## Change the log file path
sed -i 's|logfile "/var/log/redis.log"|logfile "/data/redis/redis_16389/redis_16389.log"|g' /etc/redis/redis_16389.conf
sed -i 's|logfile "/var/log/redis.log"|logfile "/data/redis/redis_16399/redis_16399.log"|g' /etc/redis/redis_16399.conf

## Change the working directory, which is where the RDB file is stored
sed -i 's|dir ./|dir /data/redis/redis_16389/|g' /etc/redis/redis_16389.conf
sed -i 's|dir ./|dir /data/redis/redis_16399/|g' /etc/redis/redis_16399.conf

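## Add the master password (masterauth) to all three instances, including the current master; otherwise, once the old master goes down and restarts, it cannot connect to the new master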
sed -i '$amasterauth aj3jaHSk3n4' /etc/redis/redis_16379.conf
sed -i '$amasterauth aj3jaHSk3n4' /etc/redis/redis_16389.conf
sed -i '$amasterauth aj3jaHSk3n4' /etc/redis/redis_16399.conf
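
The per-port sed edits above repeat the same changes for each replica. As a minimal sketch, they could be folded into one function. The function name make_replica_conf, the REDIS_PASS variable, and the use of replicaof (the Redis 5+ spelling of slaveof) are assumptions made for illustration and are not part of the original setup. The sketch also assumes the source config contains port, logfile, and dir lines, and it rewrites any logfile or dir line rather than only the exact defaults matched above.

```shell
#!/bin/sh
# Sketch: derive a replica config from the master's config.
# make_replica_conf is a hypothetical helper; REDIS_PASS must be set by the caller.
make_replica_conf() {
    src=$1          # config to copy from, e.g. /etc/redis/redis_16379.conf
    dst=$2          # replica config to write
    port=$3         # replica port, e.g. 16389
    master_ip=$4
    master_port=$5
    datadir=$6      # e.g. /data/redis/redis_16389

    # Rewrite port, log file and working directory in one pass.
    sed -e "s|^port ${master_port}\$|port ${port}|" \
        -e "s|^logfile .*|logfile \"${datadir}/redis_${port}.log\"|" \
        -e "s|^dir .*|dir ${datadir}/|" \
        "$src" > "$dst"

    # Append the replication settings at the end of the file.
    {
        printf 'replicaof %s %s\n' "$master_ip" "$master_port"
        printf 'masterauth %s\n' "$REDIS_PASS"
    } >> "$dst"
}
```

A call for the first replica might look like: make_replica_conf /etc/redis/redis_16379.conf /etc/redis/redis_16389.conf 16389 172.24.32.200 16379 /data/redis/redis_16389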

3. systemd unit files

cat >/usr/lib/systemd/system/redis16389.service <<'EOF'
[Unit]
Description=redis16389
After=network.target

[Service]
Type=forking
PIDFile=/var/run/redis_16389.pid
ExecStart=/usr/local/bin/redis-server /etc/redis/redis_16389.conf
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s QUIT $MAINPID
PrivateTmp=true

[Install]
WantedBy=multi-user.target
EOF
-----------------------------------------------------------

cat >/usr/lib/systemd/system/redis16399.service <<'EOF'
[Unit]
Description=redis16399
After=network.target

[Service]
Type=forking
PIDFile=/var/run/redis_16399.pid
ExecStart=/usr/local/bin/redis-server /etc/redis/redis_16399.conf
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s QUIT $MAINPID
PrivateTmp=true

[Install]
WantedBy=multi-user.target
EOF
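
One detail in the unit files above is easy to miss: with an unquoted heredoc delimiter, the shell expands $MAINPID while writing the file (to an empty string if the variable is unset), so systemd never sees it. The sketch below shows one way to generate the unit from a port number while keeping $MAINPID literal. The helper name render_redis_unit is hypothetical; the unit contents simply mirror the ones above.

```shell
#!/bin/sh
# Sketch: print a systemd unit for one Redis instance.
# render_redis_unit is a hypothetical helper, not part of Redis or systemd.
render_redis_unit() {
    port=$1
    # Unquoted heredoc: ${port} is expanded by the shell, while \$MAINPID is
    # escaped so that systemd, not the shell, substitutes it at runtime.
    cat <<EOF
[Unit]
Description=redis${port}
After=network.target

[Service]
Type=forking
PIDFile=/var/run/redis_${port}.pid
ExecStart=/usr/local/bin/redis-server /etc/redis/redis_${port}.conf
ExecReload=/bin/kill -s HUP \$MAINPID
ExecStop=/bin/kill -s QUIT \$MAINPID
PrivateTmp=true

[Install]
WantedBy=multi-user.target
EOF
}
```

For example: render_redis_unit 16389 > /usr/lib/systemd/system/redis16389.service. Note that Type=forking only works if the instance's config has daemonize yes and a pidfile that matches PIDFile; whether the copied config already satisfies this is not shown here.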

4. Start

systemctl daemon-reload
systemctl start redis16389
systemctl start redis16399

5. Enable at boot

systemctl enable redis16389
systemctl enable redis16399

6. Check the log for replication errors

tail -f /data/redis/redis_16399/redis_16399.log
21713:S 11 May 2020 02:33:31.504 * Master replied to PING, replication can continue...
21713:S 11 May 2020 02:33:31.505 * Partial resynchronization not possible (no cached master)
21713:S 11 May 2020 02:33:31.507 * Full resync from master: 263a60a0531c3c3564fd1e2bca6f138449f0bc7c:42
21713:S 11 May 2020 02:33:31.521 * MASTER <-> REPLICA sync: receiving 1270 bytes from master to disk
21713:S 11 May 2020 02:33:31.521 * MASTER <-> REPLICA sync: Flushing old data
21713:S 11 May 2020 02:33:31.521 * MASTER <-> REPLICA sync: Loading DB in memory
21713:S 11 May 2020 02:33:31.521 * Loading RDB produced by version 6.0.1
21713:S 11 May 2020 02:33:31.521 * RDB age 0 seconds
21713:S 11 May 2020 02:33:31.522 * RDB memory usage when created 1.91 Mb
21713:S 11 May 2020 02:33:31.522 * MASTER <-> REPLICA sync: Finished with success
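
Rather than reading the log by eye, the final line of a successful sync can be checked mechanically. This is a minimal sketch that assumes the message wording shown in the 6.0.1 log above; other Redis versions may phrase it differently. The function name sync_finished is made up for this example.

```shell
#!/bin/sh
# Sketch: succeed only if the log text on stdin shows a completed sync.
# The matched string is copied from the Redis 6.0.1 log excerpt above.
sync_finished() {
    grep -q 'MASTER <-> REPLICA sync: Finished with success'
}
```

Usage could be: sync_finished < /data/redis/redis_16399/redis_16399.log && echo "replica synced"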

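7. Log in to a replica, inspect the data, and check the total number of unexpired keys (the info replication output below was captured on the master, as its 16379 prompt shows)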

redis-cli -h 172.24.32.200 -p 16399
172.24.32.200:16399> auth aj3jaHSk3n4
OK
172.24.32.200:16379> info replication
# Replication
role:master
connected_slaves:2
slave0:ip=172.24.32.200,port=16389,state=online,offset=6272,lag=0
slave1:ip=172.24.32.200,port=16399,state=online,offset=6272,lag=0
master_replid:263a60a0531c3c3564fd1e2bca6f138449f0bc7c
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:6272
master_repl_meaningful_offset:0
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:6272
172.24.32.200:16399> dbsize
(integer) 100
172.24.32.200:16399> get test33
"460"

The data matches the master, so synchronization is complete
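
The same check can be condensed into one line of output. The sketch below reads info replication text from stdin and reports the role plus how many of the connected replicas are in state=online. The function name replication_summary and its output format are assumptions made for this example.

```shell
#!/bin/sh
# Sketch: summarize `info replication` text read from stdin.
# Prints "role=<role> online=<n>/<connected_slaves>".
replication_summary() {
    # INFO replies use CRLF line endings, so strip the carriage returns first.
    tr -d '\r' | awk -F: '
        $1 == "role"             { role = $2 }
        $1 == "connected_slaves" { total = $2 }
        $1 ~ /^slave[0-9]+$/ && $2 ~ /state=online/ { online++ }
        END { printf "role=%s online=%d/%d\n", role, online, total }
    '
}
```

It could be fed from the master, for instance: redis-cli -h 172.24.32.200 -p 16379 info replication | replication_summary (with the password supplied through the REDISCLI_AUTH environment variable).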

Sentinels

1. Directories

mkdir -p /data/redis/redis-sentinel/sentinel01_16379
mkdir -p /data/redis/redis-sentinel/sentinel02_16389
mkdir -p /data/redis/redis-sentinel/sentinel03_16399
mkdir -p /etc/redis-sentinel/
cp /data/redis/redis-stable/sentinel.conf /etc/redis-sentinel/sentinel01.conf
cp /data/redis/redis-stable/sentinel.conf /etc/redis-sentinel/sentinel02.conf
cp /data/redis/redis-stable/sentinel.conf /etc/redis-sentinel/sentinel03.conf

2. Edit the configuration files

## Change the port
sed -i 's|port 26379|port 26379|g' /etc/redis-sentinel/sentinel01.conf
sed -i 's|port 26379|port 26389|g' /etc/redis-sentinel/sentinel02.conf
sed -i 's|port 26379|port 26399|g' /etc/redis-sentinel/sentinel03.conf

## Run in the background
sed -i 's|daemonize no|daemonize yes|g' /etc/redis-sentinel/sentinel*

## pidfile location
sed -i 's|pidfile /var/run/redis-sentinel.pid|pidfile /data/redis/redis-sentinel/sentinel01_16379/redis-sentinel-01.pid|g'  /etc/redis-sentinel/sentinel01.conf
sed -i 's|pidfile /var/run/redis-sentinel.pid|pidfile /data/redis/redis-sentinel/sentinel02_16389/redis-sentinel-02.pid|g'  /etc/redis-sentinel/sentinel02.conf
sed -i 's|pidfile /var/run/redis-sentinel.pid|pidfile /data/redis/redis-sentinel/sentinel03_16399/redis-sentinel-03.pid|g' /etc/redis-sentinel/sentinel03.conf

## Log file location
sed -i 's|logfile ""|logfile "/data/redis/redis-sentinel/sentinel01_16379/redis-sentinel-01.log"|g' /etc/redis-sentinel/sentinel01.conf
sed -i 's|logfile ""|logfile "/data/redis/redis-sentinel/sentinel02_16389/redis-sentinel-02.log"|g' /etc/redis-sentinel/sentinel02.conf
sed -i 's|logfile ""|logfile "/data/redis/redis-sentinel/sentinel03_16399/redis-sentinel-03.log"|g' /etc/redis-sentinel/sentinel03.conf

##sentinel monitor
sed -i 's|sentinel monitor mymaster 127.0.0.1 6379 2|sentinel monitor mymaster 172.24.32.200 16379 2|g' /etc/redis-sentinel/sentinel*

## Working directory
sed -i 's|dir /tmp|dir /data/redis/redis-sentinel/sentinel01_16379/|g' /etc/redis-sentinel/sentinel01.conf
sed -i 's|dir /tmp|dir /data/redis/redis-sentinel/sentinel02_16389/|g' /etc/redis-sentinel/sentinel02.conf
sed -i 's|dir /tmp|dir /data/redis/redis-sentinel/sentinel03_16399/|g' /etc/redis-sentinel/sentinel03.conf

## Password
sed -i '$asentinel auth-pass mymaster aj3jaHSk3n4' /etc/redis-sentinel/sentinel*
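
The trailing 2 in the sentinel monitor line is the quorum: the number of sentinels that must agree the master is unreachable before a failover is attempted. Separately, the failover itself has to be authorized by a majority of all sentinels. The sketch below works out how many sentinels must stay reachable for a failover to go through under those two rules; the function name sentinels_needed is made up for this example.

```shell
#!/bin/sh
# Sketch: minimum number of reachable sentinels for a failover to succeed.
# $1 = total sentinels, $2 = configured quorum.
sentinels_needed() {
    total=$1
    quorum=$2
    majority=$(( total / 2 + 1 ))
    # Both conditions must hold, so the larger of the two numbers wins.
    if [ "$quorum" -gt "$majority" ]; then
        echo "$quorum"
    else
        echo "$majority"
    fi
}
```

With the three sentinels and quorum 2 used here, sentinels_needed 3 2 prints 2, so the deployment can lose one sentinel and still fail over.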

3. systemd unit files:

cat >/usr/lib/systemd/system/redissentinel01.service <<'EOF'
[Unit]
Description=redissentinel01
After=network.target

[Service]
Type=forking
PIDFile=/data/redis/redis-sentinel/sentinel01_16379/redis-sentinel-01.pid
ExecStart=/usr/local/bin/redis-sentinel /etc/redis-sentinel/sentinel01.conf
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s QUIT $MAINPID
PrivateTmp=true

[Install]
WantedBy=multi-user.target
EOF
------------------------------------------------------------------
cat >/usr/lib/systemd/system/redissentinel02.service <<'EOF'
[Unit]
Description=redissentinel02
After=network.target

[Service]
Type=forking
PIDFile=/data/redis/redis-sentinel/sentinel02_16389/redis-sentinel-02.pid
ExecStart=/usr/local/bin/redis-sentinel /etc/redis-sentinel/sentinel02.conf
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s QUIT $MAINPID
PrivateTmp=true

[Install]
WantedBy=multi-user.target
EOF
------------------------------------------------------------------
cat >/usr/lib/systemd/system/redissentinel03.service <<'EOF'
[Unit]
Description=redissentinel03
After=network.target

[Service]
Type=forking
PIDFile=/data/redis/redis-sentinel/sentinel03_16399/redis-sentinel-03.pid
ExecStart=/usr/local/bin/redis-sentinel /etc/redis-sentinel/sentinel03.conf
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s QUIT $MAINPID
PrivateTmp=true

[Install]
WantedBy=multi-user.target
EOF

4. Start:

systemctl daemon-reload
systemctl start redissentinel01
systemctl start redissentinel02
systemctl start redissentinel03

5. Enable at boot

systemctl enable redissentinel01
systemctl enable redissentinel02
systemctl enable redissentinel03

6. Check the log; there are no errors:

tail -f /data/redis/redis-sentinel/sentinel01_16379/redis-sentinel-01.log
28642:X 11 May 2020 03:59:40.262 # Redis version=6.0.1, bits=64, commit=00000000, modified=0, pid=28642, just started
28642:X 11 May 2020 03:59:40.262 # Configuration loaded
28643:X 11 May 2020 03:59:40.266 * Increased maximum number of open files to 10032 (it was originally set to 1024).
28643:X 11 May 2020 03:59:40.268 * Running mode=sentinel, port=26379.
28643:X 11 May 2020 03:59:40.269 # Sentinel ID is a60f5ae9ea4da4ee661fc57035faa7941beb34ee
28643:X 11 May 2020 03:59:40.269 # +monitor master mymaster 172.24.32.200 16379 quorum 2
28643:X 11 May 2020 03:59:40.271 * +slave slave 172.24.32.200:16389 172.24.32.200 16389 @ mymaster 172.24.32.200 16379
28643:X 11 May 2020 03:59:40.282 * +slave slave 172.24.32.200:16399 172.24.32.200 16399 @ mymaster 172.24.32.200 16379
28643:X 11 May 2020 03:59:42.299 * +sentinel sentinel 4c0ba39859c55e84875a1fc7fb8d4a9b2ba3c918 172.24.32.200 26389 @ mymaster 172.24.32.200 16379
28643:X 11 May 2020 03:59:42.945 * +sentinel sentinel c2d73943fd60afc898a4d1029fc27e338f43c0f1 172.24.32.200 26399 @ mymaster 172.24.32.200 16379

Verifying the sentinels

redis-cli -h 172.24.32.200 -p 26379
172.24.32.200:26379> sentinel master mymaster
 1) "name"
 2) "mymaster"
 3) "ip"
 4) "172.24.32.200"
 5) "port"
 6) "16379"
 7) "runid"
 8) "f040808ade2ece7838802fc9745d402ad84c2940"
 9) "flags"
10) "master"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "74"
19) "last-ping-reply"
20) "74"
21) "down-after-milliseconds"
22) "30000"
23) "info-refresh"
24) "6122"
25) "role-reported"
26) "master"
27) "role-reported-time"
28) "26213"
29) "config-epoch"
30) "0"
31) "num-slaves"
32) "2"
33) "num-other-sentinels"
34) "2"
35) "quorum"
36) "2"
37) "failover-timeout"
38) "180000"
39) "parallel-syncs"
40) "1"
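
The reply above is a flat list of alternating field names and values. When redis-cli output is piped rather than shown in a terminal, it prints one element per line without the 1), 2) numbering, which makes single fields easy to extract. A minimal sketch follows; the function name sentinel_field is an assumption made for this example.

```shell
#!/bin/sh
# Sketch: read a flat key/value reply (one element per line) from stdin
# and print the value of the field named in $1.
sentinel_field() {
    # paste joins each pair of lines into "key=value".
    tr -d '\r' | paste -d '=' - - | awk -F= -v key="$1" '$1 == key { print $2 }'
}
```

For instance, redis-cli -h 172.24.32.200 -p 26379 sentinel master mymaster | sentinel_field num-other-sentinels should print 2 when all three sentinels see each other, matching the output above.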

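Failover tests

Scenario 1: the master goes down while both replicas and all three sentinels stay healthy

Make the master unresponsive for 300 seconds (DEBUG sleep blocks the server without stopping the process)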

redis-cli -h 172.24.32.200 -a aj3jaHSk3n4 -p 16379 DEBUG sleep 300
redis-cli -h 172.24.32.200 -p 26379
172.24.32.200:26379> SENTINEL get-master-addr-by-name mymaster
1) "172.24.32.200"
2) "16399"
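
The new address does not appear immediately: down-after-milliseconds is 30000 in the sentinel output earlier, so the sentinels wait 30 seconds before treating the master as down. A small polling loop can stand in for re-running the query by hand. This is a sketch under assumptions: the function names, the SENTINEL_HOST and SENTINEL_PORT variables, and the one-second polling interval are all illustrative choices.

```shell
#!/bin/sh
SENTINEL_HOST=${SENTINEL_HOST:-172.24.32.200}
SENTINEL_PORT=${SENTINEL_PORT:-26379}

# Ask one sentinel for the current master; prints the ip and port on two lines.
query_master() {
    redis-cli -h "$SENTINEL_HOST" -p "$SENTINEL_PORT" \
        sentinel get-master-addr-by-name mymaster
}

# Poll until the reported master port differs from $1 or $2 seconds elapse.
# Prints the new port on success; returns 1 on timeout.
wait_for_new_master() {
    old_port=$1
    timeout=$2
    elapsed=0
    while [ "$elapsed" -le "$timeout" ]; do
        port=$(query_master | sed -n '2p')
        if [ -n "$port" ] && [ "$port" != "$old_port" ]; then
            echo "$port"
            return 0
        fi
        sleep 1
        elapsed=$(( elapsed + 1 ))
    done
    return 1
}
```

Running wait_for_new_master 16379 60 right after the DEBUG sleep command would block until a different master port is reported, or give up after about a minute.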

Check the configuration file of the other surviving node

vim /etc/redis/redis_16389.conf

It now contains:

replicaof 172.24.32.200 16399

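The failover has completed. Of the two remaining nodes:
16399 is the master
16389 is a replica

After 300 seconds,
node 16379 (the former master) recovers and rejoins as a replica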

Scenario 2: the master and its sentinel both go down; verify that Redis operations remain available and that the stored data is still usable

Stop the master and its sentinel

ps -ef | grep redis-server|grep 16399 | awk '{ print $2 }'|xargs kill -9
ps -ef | grep redis-sentinel|grep 26399 | awk '{ print $2 }'|xargs kill -9
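
Chained grep calls match the port number anywhere on the line, including inside a PID or another port. A slightly stricter filter is sketched below. It assumes the process title ends its address with :port (for example redis-server 172.24.32.200:16399), which is how Redis usually sets its title but is not shown in the original; the function name pids_for is hypothetical.

```shell
#!/bin/sh
# Sketch: read `ps -ef` output on stdin and print the PIDs of processes whose
# line contains the program name $1 and an address ending in ":<port $2>".
pids_for() {
    awk -v name="$1" -v port="$2" '
        index($0, name) && $0 ~ (":" port "([^0-9]|$)") { print $2 }
    '
}
```

It would slot into the pipeline as: ps -ef | pids_for redis-server 16399 | xargs kill -9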

Connect to one of the remaining sentinels

[root@localhost ~]# redis-cli -h 172.24.32.200 -p 26379
172.24.32.200:26379> sentinel master mymaster
 1) "name"
 2) "mymaster"
 3) "ip"
 4) "172.24.32.200"
 5) "port"
 6) "16389"

The master is now node 16389.
Connect to the master

redis-cli -h 172.24.32.200 -a aj3jaHSk3n4 -p 16389 

View the replication info

Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
172.24.32.200:16389> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=172.24.32.200,port=16379,state=online,offset=1715148,lag=0
master_replid:5286e323614633422849de29c63556ca493669fb
master_replid2:654c49dac8cd9b63b94b1e110567cdc29635f49d
master_repl_offset:1715148
master_repl_meaningful_offset:1715148
second_repl_offset:1674207
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:666573
repl_backlog_histlen:1048576

Query an existing key

172.24.32.200:16389> get test33
"460"

Insert a new value

172.24.32.200:16389> set test999 666
OK
172.24.32.200:16389> get test999
"666"

Query it from replica 16379
redis-cli -h 172.24.32.200 -a aj3jaHSk3n4 -p 16379

172.24.32.200:16379> get test999
"666"

Master-replica replication is working normally
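
The write-then-read check above can be scripted. The sketch below writes a key on the master and polls the replica until the same value appears, since replication is asynchronous. The names rcli and check_replication, the REDIS_HOST variable, and the five one-second retries are assumptions for illustration. The password is expected in the REDISCLI_AUTH environment variable, which redis-cli reads and which avoids the -a warning.

```shell
#!/bin/sh
REDIS_HOST=${REDIS_HOST:-172.24.32.200}

# Thin wrapper: first argument is the port, the rest is the Redis command.
# Authentication comes from REDISCLI_AUTH in the environment, not from -a.
rcli() {
    port=$1; shift
    redis-cli -h "$REDIS_HOST" -p "$port" "$@"
}

# Write key=value on the master, then poll the replica until it returns the
# same value or the retries run out.
check_replication() {
    master_port=$1
    replica_port=$2
    key=$3
    value=$4
    rcli "$master_port" set "$key" "$value" >/dev/null || return 1
    tries=0
    while [ "$tries" -lt 5 ]; do
        if [ "$(rcli "$replica_port" get "$key")" = "$value" ]; then
            return 0
        fi
        sleep 1
        tries=$(( tries + 1 ))
    done
    return 1
}
```

For the state shown above, check_replication 16389 16379 test999 666 && echo "replication OK" would repeat the manual test.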

Bring the downed nodes back up

redis-server /etc/redis/redis_16399.conf
redis-sentinel /etc/redis-sentinel/sentinel03.conf

Connect through the restarted sentinel

redis-cli -h 172.24.32.200 -p 26399

Query the master

172.24.32.200:26399> sentinel master mymaster
 1) "name"
 2) "mymaster"
 3) "ip"
 4) "172.24.32.200"
 5) "port"
 6) "16389"
 7) "runid"
 8) "8e58e4d10e36ec432ef79e138bbe3a094ef6d38f"
 9) "flags"
10) "master"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "887"
19) "last-ping-reply"
20) "887"
21) "down-after-milliseconds"
22) "30000"
23) "info-refresh"
24) "7619"
25) "role-reported"
26) "master"
27) "role-reported-time"
28) "67315"
29) "config-epoch"
30) "2"
31) "num-slaves"
32) "2"
33) "num-other-sentinels"
34) "2"
35) "quorum"
36) "2"
37) "failover-timeout"
38) "180000"
39) "parallel-syncs"
40) "1"

Connect to the newly restarted replica 16399

redis-cli -h 172.24.32.200 -a aj3jaHSk3n4 -p 16399 

Read test999, the key the other two nodes received while this node was down

172.24.32.200:16399> get test999
"666"

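The data has caught up. In this test, one Redis node and its corresponding sentinel went down;
the Redis cluster stayed usable, and once the node recovered, its data was synchronized as well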

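Topics for further study: Redis tuning (configuration file and memory optimization)
Simulating and solving cache penetration, cache breakdown, and cache avalanche in Redis
Redis cache eviction policies and the key expiration mechanism
CAS-based optimistic locking in Redis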
