Preface       Semi-synchronous replication is supported since MySQL 5.5 and then enhanced gradually in 5.6 & 5.7.It's also called enhanced semi-synchronous replicaiton in MySQL 5.7.The new method of ACK rule in 5.7(after_sync) make the consistency be strongly guaranteed.Maybe that is the reason why it gets the name.Today I'm gonna demonstrate the effect of sevaral variables of semi-sync.   Introduce       Semi-synchronized replicatin can be installed as a plugin.There're a lot of variables with prefix of "repl_semi_sync" after you install the "rpl_semi_sync_master" plugin on your MySQL server.Meanwhile,there're a series of global status about it as well.Let's see the details about it.   Framework
Hostname IP/Port Identity OS Version MySQL Version GTID Mode Binlog Format Semi-Sync status
zlm2 192.168.1.101/3306 master CentOS 7.0 5.7.21 on row on
zlm3 192.168.1.102/3306 slave1 CentOS 7.0 5.7.21 on row on
zlm4 192.168.1.103/3306 slave2 CentOS 7.0 5.7.21 on row on
  Example   Check the variables relevant to semi-sync replication.
 1 //Master
 2 (zlm@192.168.1.101 3306)[(none)]>show variables like 'rpl_semi%';
 3 +-------------------------------------------+------------+
 4 | Variable_name                             | Value      |
 5 +-------------------------------------------+------------+
 6 | rpl_semi_sync_master_enabled              | OFF        |
 7 | rpl_semi_sync_master_timeout              | 10000      |
 8 | rpl_semi_sync_master_trace_level          | 32         |
 9 | rpl_semi_sync_master_wait_for_slave_count | 2          |
10 | rpl_semi_sync_master_wait_no_slave        | ON         |
11 | rpl_semi_sync_master_wait_point           | AFTER_SYNC |
12 +-------------------------------------------+------------+
13 6 rows in set (0.00 sec)
14 
15 //rpl_semi_sync_master_enabled --Specify whether to use the feature of semi-sync(master side,default "off").
16 //rpl_semi_sync_master_timeout --Specify the threshold of timeout between master and slave(default value is "10000" means 10s).
17 //rpl_semi_sync_master_trace_level --Specify the debug trace level on master(default "32",others "1","16","64").
18 //rpl_semi_sync_master_wait_for_slave_count --Specify how many slaves the master should wait when timeout happens(default "1").
19 //rpl_semi_sync_master_wait_no_slave --Specify whether the master switches to normal replication(default "on").When setting "on",it permits the amount of slave can be less than the value of "rpl_semi_sync_master_wait_for_slave_count" you specified and prevents it becoming normal replication.
20 //rpl_semi_sync_master_wait_point --Specify the ACK mode(default "after-sync").It's the most important variable of semi-sync.
21 
22 //Slave1
23 (zlm@192.168.1.102 3306)[(none)]>show variables like 'rpl_semi%';
24 +---------------------------------+-------+
25 | Variable_name                   | Value |
26 +---------------------------------+-------+
27 | rpl_semi_sync_slave_enabled     | OFF   |
28 | rpl_semi_sync_slave_trace_level | 32    |
29 +---------------------------------+-------+
30 2 rows in set (0.00 sec)
31 
32 //There're only two variables after install the rpl_semi_sync_slave plugin.
33 //rpl_semi_sync_slave_enabled -- Specify whether use the feature of semi-sync(slave side,default "off").
34 //rpl_semi_sync_slave_trace_level -- Specify the debug trace level on slave(default "32").

 

Enable semi-sync replication on both master and slaves.
 1 //Master
 2 (zlm@192.168.1.101 3306)[(none)]>set rpl_semi_sync_master_enabled=on;
 3 ERROR 1229 (HY000): Variable 'rpl_semi_sync_master_enabled' is a GLOBAL variable and should be set with SET GLOBAL
 4 (zlm@192.168.1.101 3306)[(none)]>set global rpl_semi_sync_master_enabled=on;
 5 Query OK, 0 rows affected (0.00 sec)
 6 
 7 (zlm@192.168.1.101 3306)[(none)]>show variables like 'rpl_semi_sync_master_enabled';
 8 +------------------------------+-------+
 9 | Variable_name                | Value |
10 +------------------------------+-------+
11 | rpl_semi_sync_master_enabled | ON    |
12 +------------------------------+-------+
13 1 row in set (0.01 sec)
14 
15 //Slave1
16 (zlm@192.168.1.102 3306)[(none)]>set @@global.rpl_semi_sync_slave_enabled=on;
17 Query OK, 0 rows affected (0.00 sec)
18 
19 (zlm@192.168.1.102 3306)[(none)]>show variables like 'rpl_semi_sync%';
20 +---------------------------------+-------+
21 | Variable_name                   | Value |
22 +---------------------------------+-------+
23 | rpl_semi_sync_slave_enabled     | ON    |
24 | rpl_semi_sync_slave_trace_level | 32    |
25 +---------------------------------+-------+
26 2 rows in set (0.00 sec)
27 
28 //Slave2
29 (zlm@192.168.1.103 3306)[(none)]>set @@global.rpl_semi_sync_slave_enabled=on;
30 Query OK, 0 rows affected (0.00 sec)
31 
32 (zlm@192.168.1.103 3306)[(none)]>show variables like 'rpl_semi_sync%';
33 +---------------------------------+-------+
34 | Variable_name                   | Value |
35 +---------------------------------+-------+
36 | rpl_semi_sync_slave_enabled     | ON    |
37 | rpl_semi_sync_slave_trace_level | 32    |
38 +---------------------------------+-------+
39 2 rows in set (0.01 sec)
40 
41 //Master
42 (zlm@192.168.1.101 3306)[(none)]>show global status like 'Rpl_semi_sync_master_clients';
43 +------------------------------+-------+
44 | Variable_name                | Value |
45 +------------------------------+-------+
46 | Rpl_semi_sync_master_clients | 2     | //It means there're two slaves using semi-sync replicaiton follow master now.
47 +------------------------------+-------+
48 1 row in set (0.00 sec)

 

SRE实战 互联网时代守护先锋,助力企业售后服务体系运筹帷幄!一键直达领取阿里云限量特价优惠。 Execute sysbench to generate some trasactions.
 1 [root@zlm2 09:45:08 ~/sysbench-1.0/src/lua]
 2 #sysbench oltp_insert.lua --mysql-host=192.168.1.101 --mysql-port=3306 --mysql-user=zlm --mysql-password=zlmzlm --mysql-db=sysbench --tables=5 --table-size=10000 --mysql-storage-engine=innodb prepare
 3 sysbench 1.0.15 (using bundled LuaJIT 2.1.0-beta2)
 4 
 5 Creating table 'sbtest1'...
 6 Inserting 10000 records into 'sbtest1'
 7 Creating a secondary index on 'sbtest1'...
 8 Creating table 'sbtest2'...
 9 Inserting 10000 records into 'sbtest2'
10 Creating a secondary index on 'sbtest2'...
11 Creating table 'sbtest3'...
12 Inserting 10000 records into 'sbtest3'
13 Creating a secondary index on 'sbtest3'...
14 Creating table 'sbtest4'...
15 Inserting 10000 records into 'sbtest4'
16 Creating a secondary index on 'sbtest4'...
17 Creating table 'sbtest5'...
18 Inserting 10000 records into 'sbtest5'
19 Creating a secondary index on 'sbtest5'...
20 
21 [root@zlm2 09:45:22 ~/sysbench-1.0/src/lua]
22 #sysbench oltp_insert.lua --mysql-host=192.168.1.101 --mysql-port=3306 --mysql-user=zlm --mysql-password=zlmzlm --mysql-db=sysbench --threads=3 --time=1200 --report-interval=30 --rand-type=uniform run
23 sysbench 1.0.15 (using bundled LuaJIT 2.1.0-beta2)
24 
25 Running the test with following options:
26 Number of threads: 3
27 Report intermediate results every 30 second(s)
28 Initializing random number generator from current time
29 
30 
31 Initializing worker threads...
32 
33 Threads started!
34 
35 [ 30s ] thds: 3 tps: 931.95 qps: 931.95 (r/w/o: 0.00/931.95/0.00) lat (ms,95%): 6.09 err/s: 0.00 reconn/s: 0.00
36 [ 60s ] thds: 3 tps: 971.62 qps: 971.62 (r/w/o: 0.00/971.62/0.00) lat (ms,95%): 5.18 err/s: 0.00 reconn/s: 0.00
37 [ 90s ] thds: 3 tps: 949.33 qps: 949.33 (r/w/o: 0.00/949.33/0.00) lat (ms,95%): 4.41 err/s: 0.00 reconn/s: 0.00

 

Check the global status of semi-sync on master.
 1 (zlm@192.168.1.101 3306)[(none)]>show global status like 'Rpl_semi_sync%';
 2 +--------------------------------------------+------------+
 3 | Variable_name                              | Value      |
 4 +--------------------------------------------+------------+
 5 | Rpl_semi_sync_master_clients               | 2          |
 6 | Rpl_semi_sync_master_net_avg_wait_time     | 0          |
 7 | Rpl_semi_sync_master_net_wait_time         | 0          |
 8 | Rpl_semi_sync_master_net_waits             | 3949144    |
 9 | Rpl_semi_sync_master_no_times              | 9          |
10 | Rpl_semi_sync_master_no_tx                 | 1788275    |
11 | Rpl_semi_sync_master_status                | ON         |
12 | Rpl_semi_sync_master_timefunc_failures     | 0          |
13 | Rpl_semi_sync_master_tx_avg_wait_time      | 1859       |
14 | Rpl_semi_sync_master_tx_wait_time          | 1354150254 |
15 | Rpl_semi_sync_master_tx_waits              | 728148     |
16 | Rpl_semi_sync_master_wait_pos_backtraverse | 0          |
17 | Rpl_semi_sync_master_wait_sessions         | 0          |
18 | Rpl_semi_sync_master_yes_tx                | 795124     |
19 +--------------------------------------------+------------+
20 14 rows in set (0.00 sec)

 

Set the variable "rpl_semi_sync_master_timeout" to 60000.
 1 (zlm@192.168.1.101 3306)[(none)]>set @@global.rpl_semi_sync_master_timeout=60000;
 2 Query OK, 0 rows affected (0.00 sec)
 3 
 4 (zlm@192.168.1.101 3306)[(none)]>show global variables like 'rpl_semi_sync_master_timeout';
 5 +------------------------------+-------+
 6 | Variable_name                | Value |
 7 +------------------------------+-------+
 8 | rpl_semi_sync_master_timeout | 60000 |
 9 +------------------------------+-------+
10 1 row in set (0.00 sec)

 

Stop slave on slave1 and record the execution time.
1 (zlm@192.168.1.102 3306)[(none)]>system date
2 Mon Jul 16 11:20:37 CEST 2018
3 (zlm@192.168.1.102 3306)[(none)]>stop slave; //The stopping time was 11:20:40 around.
4 Query OK, 0 rows affected (0.84 sec)

 

Check the global status of semi-sync and error log on master.
 1 (zlm@192.168.1.101 3306)[(none)]>show global status like 'Rpl_semi_sync%';
 2 +--------------------------------------------+------------+
 3 | Variable_name                              | Value      |
 4 +--------------------------------------------+------------+
 5 | Rpl_semi_sync_master_clients               | 1          | //The amount of client turned into "1".
 6 | Rpl_semi_sync_master_net_avg_wait_time     | 0          |
 7 | Rpl_semi_sync_master_net_wait_time         | 0          |
 8 | Rpl_semi_sync_master_net_waits             | 3975124    |
 9 | Rpl_semi_sync_master_no_times              | 9          |
10 | Rpl_semi_sync_master_no_tx                 | 1788275    |
11 | Rpl_semi_sync_master_status                | ON         | //Although the amount of client has been less than "slave_count" but it still work in semi-sync mode.
12 | Rpl_semi_sync_master_timefunc_failures     | 0          |
13 | Rpl_semi_sync_master_tx_avg_wait_time      | 1860       |
14 | Rpl_semi_sync_master_tx_wait_time          | 1368916900 |
15 | Rpl_semi_sync_master_tx_waits              | 735759     |
16 | Rpl_semi_sync_master_wait_pos_backtraverse | 0          |
17 | Rpl_semi_sync_master_wait_sessions         | 1          |
18 | Rpl_semi_sync_master_yes_tx                | 803390     |
19 +--------------------------------------------+------------+
20 14 rows in set (0.00 sec)
21 
22 2018-07-16T11:20:39.628398+01:00 0 [ERROR] mysqld: Got an error reading communication packets //It's the time I stopped the slave on slave1.
23 2018-07-16T11:20:40.355498+01:00 54 [ERROR] Semi-sync master failed on net_flush() before waiting for slave reply
24 2018-07-16T11:20:40.355532+01:00 54 [Note] Stop semi-sync binlog_dump to slave (server_id: 1023306)
25 2018-07-16T11:20:41.358003+01:00 54 [Note] Aborted connection 54 to db: 'unconnected' user: 'repl' host: 'zlm3' (Found net error)
26 2018-07-16T11:20:43.367327+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 6128ms. The settings might not be optimal. (flushed=700 and evicted=0, during the time.)
27 2018-07-16T11:20:59.683904+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 15315ms. The settings might not be optimal. (flushed=2000 and evicted=0, during the time.)
28 2018-07-16T11:21:05.709621+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5025ms. The settings might not be optimal. (flushed=731 and evicted=0, during the time.)
29 2018-07-16T11:21:40.353597+01:00 57 [Warning] Timeout waiting for reply of binlog (file: mysql-bin.000008, pos: 61052572), semi-sync up to file mysql-bin.000008, position 61052134.
30 2018-07-16T11:21:40.353630+01:00 57 [Note] Semi-sync replication switched OFF. //Until the timeout was accumulated to 60s(11:21:40),the semi-sync mode reverts "OFF" what means the master work as asynchronous replication.
31 
32 (zlm@192.168.1.101 3306)[(none)]>show global status like 'Rpl_semi_sync%';
33 +--------------------------------------------+------------+
34 | Variable_name                              | Value      |
35 +--------------------------------------------+------------+
36 | Rpl_semi_sync_master_clients               | 1          |
37 | Rpl_semi_sync_master_net_avg_wait_time     | 0          |
38 | Rpl_semi_sync_master_net_wait_time         | 0          |
39 | Rpl_semi_sync_master_net_waits             | 4003361    |
40 | Rpl_semi_sync_master_no_times              | 10         |
41 | Rpl_semi_sync_master_no_tx                 | 1826027    |
42 | Rpl_semi_sync_master_status                | OFF        | //Now the status became "OFF".
43 | Rpl_semi_sync_master_timefunc_failures     | 0          |
44 | Rpl_semi_sync_master_tx_avg_wait_time      | 1860       |
45 | Rpl_semi_sync_master_tx_wait_time          | 1368916900 |
46 | Rpl_semi_sync_master_tx_waits              | 735759     |
47 | Rpl_semi_sync_master_wait_pos_backtraverse | 0          |
48 | Rpl_semi_sync_master_wait_sessions         | 0          |
49 | Rpl_semi_sync_master_yes_tx                | 803390     |
50 +--------------------------------------------+------------+
51 14 rows in set (0.00 sec)

 

Change the variable "rpl_semi_sync_master_wait_no_slave" ot off.
 1 (zlm@192.168.1.101 3306)[(none)]>set @@global.rpl_semi_sync_master_wait_no_slave=off;
 2 Query OK, 0 rows affected (0.00 sec)
 3 
 4 (zlm@192.168.1.101 3306)[(none)]>show global variables like 'rpl_semi%';
 5 +-------------------------------------------+------------+
 6 | Variable_name                             | Value      |
 7 +-------------------------------------------+------------+
 8 | rpl_semi_sync_master_enabled              | ON         |
 9 | rpl_semi_sync_master_timeout              | 60000      |
10 | rpl_semi_sync_master_trace_level          | 32         |
11 | rpl_semi_sync_master_wait_for_slave_count | 2          |
12 | rpl_semi_sync_master_wait_no_slave        | OFF        |
13 | rpl_semi_sync_master_wait_point           | AFTER_SYNC |
14 +-------------------------------------------+------------+
15 6 rows in set (0.00 sec)

 

Start slave on slave1 to restore the initial status of semi-sync replication.
 1 //Slave
 2 (zlm@192.168.1.102 3306)[(none)]>start slave;
 3 Query OK, 0 rows affected (0.01 sec)
 4 
 5 //Error log of master
 6 2018-07-16T11:33:22.727473+01:00 60 [Note] Start binlog_dump to master_thread_id(60) slave_server(1023306), pos(, 4)
 7 2018-07-16T11:33:31.005135+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 8431ms. The settings might not be optimal. (flushed=737 and evicted=0, during the time.)
 8 2018-07-16T11:33:31.838932+01:00 60 [Note] Start semi-sync binlog_dump to slave (server_id: 1023306), pos(, 4)
 9 2018-07-16T11:33:50.117239+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 6667ms. The settings might not be optimal. (flushed=741 and evicted=0, during the time.)
10 2018-07-16T11:33:59.977585+01:00 0 [Note] Semi-sync replication switched ON at (mysql-bin.000008, 206778238)

 

Stop slave on slave1 again and record the execution time.
1 (zlm@192.168.1.102 3306)[(none)]>system date
2 Mon Jul 16 11:35:55 CEST 2018
3 (zlm@192.168.1.102 3306)[(none)]>stop slave; //The stopping time was 11:35:58 around.
4 Query OK, 0 rows affected (1.60 sec)

 

Check the global status of semi-sync and error log on master one more time.  
 1 (zlm@192.168.1.101 3306)[(none)]>show global status like 'Rpl_semi_sync%';
 2 +--------------------------------------------+------------+
 3 | Variable_name                              | Value      |
 4 +--------------------------------------------+------------+
 5 | Rpl_semi_sync_master_clients               | 2          |
 6 | Rpl_semi_sync_master_net_avg_wait_time     | 0          |
 7 | Rpl_semi_sync_master_net_wait_time         | 0          |
 8 | Rpl_semi_sync_master_net_waits             | 4313674    |
 9 | Rpl_semi_sync_master_no_times              | 11         |
10 | Rpl_semi_sync_master_no_tx                 | 2088478    |
11 | Rpl_semi_sync_master_status                | ON         |
12 | Rpl_semi_sync_master_timefunc_failures     | 0          |
13 | Rpl_semi_sync_master_tx_avg_wait_time      | 1874       |
14 | Rpl_semi_sync_master_tx_wait_time          | 1444213447 |
15 | Rpl_semi_sync_master_tx_waits              | 770393     |
16 | Rpl_semi_sync_master_wait_pos_backtraverse | 0          |
17 | Rpl_semi_sync_master_wait_sessions         | 1          |
18 | Rpl_semi_sync_master_yes_tx                | 840887     |
19 +--------------------------------------------+------------+
20 14 rows in set (0.00 sec)
21 
22 2018-07-16T11:35:58.920988+01:00 0 [ERROR] mysqld: Got an error reading communication packets
23 2018-07-16T11:36:00.321218+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 14988ms. The settings might not be optimal. (flushed=2000 and evicted=0, during the time.)
24 2018-07-16T11:36:11.150976+01:00 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 8718ms. The settings might not be optimal. (flushed=1124 and evicted=0, during the time.)
25 2018-07-16T11:36:28.973672+01:00 60 [Note] Stop semi-sync binlog_dump to slave (server_id: 1023306)
26 2018-07-16T11:36:29.012288+01:00 60 [Note] Semi-sync replication switched OFF. //The master didn't wait the timeout occur and revert to aynchronous replication directly within about merely 30s.
27 
28 (zlm@192.168.1.101 3306)[(none)]>show global status like 'Rpl_semi_sync%';
29 +--------------------------------------------+------------+
30 | Variable_name                              | Value      |
31 +--------------------------------------------+------------+
32 | Rpl_semi_sync_master_clients               | 1          |
33 | Rpl_semi_sync_master_net_avg_wait_time     | 0          |
34 | Rpl_semi_sync_master_net_wait_time         | 0          |
35 | Rpl_semi_sync_master_net_waits             | 4336577    |
36 | Rpl_semi_sync_master_no_times              | 12         |
37 | Rpl_semi_sync_master_no_tx                 | 2121402    |
38 | Rpl_semi_sync_master_status                | OFF        |
39 | Rpl_semi_sync_master_timefunc_failures     | 0          |
40 | Rpl_semi_sync_master_tx_avg_wait_time      | 1913       |
41 | Rpl_semi_sync_master_tx_wait_time          | 1474253267 |
42 | Rpl_semi_sync_master_tx_waits              | 770394     |
43 | Rpl_semi_sync_master_wait_pos_backtraverse | 0          |
44 | Rpl_semi_sync_master_wait_sessions         | 0          |
45 | Rpl_semi_sync_master_yes_tx                | 840887     |
46 +--------------------------------------------+------------+
47 14 rows in set (0.00 sec)

 

Summay
  • The behavior of semi-sync replication depends on the variables that you specified.
  • If the value of "rpl_semi_sync_master_wait_no_slave" is "on",the status of semi-sync won't change until it reaches the value of timeout,even though the amount of slaves are less than it in "rpl_semi_sync_master_slave_count".
  • If the value of "rpl_semi_sync_master_wait_no_slave" is "off",master won't wait for timeout before it reverts to normal replication when slave decreases to less than the value above.
  • In order to observe the result conveniently,I increase the value of "rpl_semi_sync_master_timeout" to 60s.
  • In most high consistent scenarios,the value can be tremendous enough to avoid degenerating to asynchronous replication.
  • If the degeneration is due to network failure such as instant glitch.The semi-sync status will be ok after the network becomes normal.
  • I'm afraid the best practice is to set "rpl_semi_sync_master_wait_no_slave" to "on" and the "rpl_semi_sync_master_timeout" to a comparative value.

 

扫码关注我们
微信号:SRE实战
拒绝背锅 运筹帷幄