Skip to content

Commit 9bb253e

Browse files
author
Karolina Szczepankiewicz
committed
Bug #33482064 rpl.rpl_semi_sync_turn_on_off_optimize_for_static_plugin_config fails on asan
Problem ------- Ack Receiver thread is reading empty pointer which is pointed out by ASAN (heap-use-after-free). Analysis / Root-cause analysis ------------------------------ Problem happens when semi-synchronous replication is turned on and the replica is disconnecting. These conditions may cause Ack Receiver to encounter error when reading communication packets (ER_NET_READ_ERROR). It may happen that Binary log dump thread resources are released before Ack Receiver will remove connection information while AckReceiver wrongfully iterates over it. Let's consider the following flow: 1. Removing of thread X is requested from thread S. S changes `is_leaving` to true and waits for the confirmation. 2. Function `init_replica_sockets` is called from the Ack Receiver thread A, therefore X is allowed to leave. A changes `is_leaving` to false. 3. A is listening on sockets, possibly `listen_on_sockets` returns some error, but not critical, or just A processes the data from sockets quickly. 4. Other threads are requested to leave - `m_slaves_changed` is set to true 5. A is woken up again (S is still waiting for its turn), and calls `init_replica_sockets again`. Listener checks the status of X (still not removed from the Ack Receiver container), it is false, therefore X is copied into the Listener internal container. 6. S is woken up. It sees that status of the X thread is false, therefore it was allowed to leave. S thinks that X was removed from the listener container, therefore it is removing X from the Ack Receiver container. 7. A is woken up to process data from sockets. In the meantime, replica shuts down and S resources are released. A gots ER_NET_READ_ERROR and checks the while condition (net.vio->has_data(net.vio)). Vio pointer is NULL at this point. A SIGSEGV signal is issued. Implemented solution -------------------- 'Slave' 'is_leaving' internal variable type is changed to newly introduced enumeration class 'enum_status'. 'enum_status' class contains three states: up, leaving and down. Leaving thread is changing status to 'leaving'. Ack Receiver notices that 'status' value of the leaving thread changed to 'leaving' and changes the status to 'down'. If 'Slave' object state is 'down' or 'leaving', Ack Receiver thread is ommiting the 'Slave' object while creating the 'Listener' vector. Signed-off-by: Karolina Szczepankiewicz <karolina.szczepankiewicz@oracle.com> Change-Id: Ic59191479473c86d4b97da060457930a52e79f0a
1 parent 23053b1 commit 9bb253e

File tree

3 files changed

+11
-9
lines changed

3 files changed

+11
-9
lines changed

plugin/semisync/semisync_source_ack_receiver.cc

+5-4
Original file line numberDiff line numberDiff line change
@@ -151,7 +151,6 @@ bool Ack_receiver::add_slave(THD *thd) {
151151
&slave.compress_ctx, algorithm,
152152
thd->get_protocol_classic()->get_compression_level());
153153
}
154-
slave.is_leaving = false;
155154
slave.vio = thd->get_protocol_classic()->get_vio();
156155
slave.vio->mysql_socket.m_psi = nullptr;
157156

@@ -186,7 +185,7 @@ void Ack_receiver::remove_slave(THD *thd) {
186185
*/
187186
for (it = m_slaves.begin(); it != m_slaves.end(); ++it) {
188187
if (it->thread_id == thd->thread_id()) {
189-
it->is_leaving = true;
188+
it->m_status = Slave::EnumStatus::leaving;
190189
m_slaves_changed = true;
191190
break;
192191
}
@@ -196,7 +195,8 @@ void Ack_receiver::remove_slave(THD *thd) {
196195
Wait till Ack_receiver::run() is done reading from the
197196
slave's socket.
198197
*/
199-
while ((it != m_slaves.end()) && (it->is_leaving) && (m_status == ST_UP)) {
198+
while ((it != m_slaves.end()) &&
199+
(it->m_status == Slave::EnumStatus::leaving) && (m_status == ST_UP)) {
200200
mysql_cond_wait(&m_cond, &m_mutex);
201201
/*
202202
In above cond_wait, we release and reacquire m_mutex.
@@ -311,8 +311,9 @@ void Ack_receiver::run() {
311311
if (likely(len != packet_error))
312312
repl_semisync->reportReplyPacket(slave_obj.server_id, net.read_pos,
313313
len);
314-
else if (net.last_errno == ER_NET_READ_ERROR)
314+
else if (net.last_errno == ER_NET_READ_ERROR) {
315315
listener.clear_socket_info(i);
316+
}
316317
} while (net.vio->has_data(net.vio) && m_status == ST_UP);
317318
}
318319
i++;

plugin/semisync/semisync_source_ack_receiver.h

+2-1
Original file line numberDiff line numberDiff line change
@@ -34,11 +34,12 @@
3434
#include "sql/sql_class.h"
3535

3636
struct Slave {
37+
enum class EnumStatus { up, leaving, down };
3738
uint32_t thread_id;
3839
Vio *vio;
3940
uint server_id;
4041
mysql_compress_context compress_ctx;
41-
bool is_leaving;
42+
EnumStatus m_status = EnumStatus::up;
4243

4344
my_socket sock_fd() const { return vio->mysql_socket.fd; }
4445
};

plugin/semisync/semisync_source_socket_listener.h

+4-4
Original file line numberDiff line numberDiff line change
@@ -51,8 +51,8 @@ class Poll_socket_listener {
5151
Do not consider the slave's socket
5252
if the slave is in the process of leaving.
5353
*/
54-
if (slaves[i].is_leaving) {
55-
slaves[i].is_leaving = false;
54+
if (slaves[i].m_status != Slave::EnumStatus::up) {
55+
slaves[i].m_status = Slave::EnumStatus::down;
5656
continue;
5757
}
5858
pollfd poll_fd;
@@ -105,8 +105,8 @@ class Select_socket_listener {
105105
Do not consider the slave's socket
106106
if the slave is in the process of leaving.
107107
*/
108-
if (slaves[i].is_leaving) {
109-
slaves[i].is_leaving = false;
108+
if (slaves[i].m_status != Slave::EnumStatus::up) {
109+
slaves[i].m_status = Slave::EnumStatus::down;
110110
continue;
111111
}
112112
my_socket socket_id = slaves[i].sock_fd();

0 commit comments

Comments
 (0)