Skip to content

Commit 2393280

Browse files
author
Anibal Pinto
committed
BUG#34819861: Gtid_set::Interval_chunk mem incorrectly tracked on THD_applier_module_receiver
In a GR cluster setup with a primary and two secondaries, it was observed that the primary was not releasing all allocated memory related to the "memory/sql/Gtid_set::Interval_chunk" event name. The memory used by GR associated to the the key `memory/sql/Gtid_set::Interval_chunk` is allocated by one thread: `thread/group_rpl/THD_applier_module_receiver`, but it is released by a different thread: `thread/group_rpl/THD_certifier_broadcast`. This was causing a incorrect memory usage relative to `thread/group_rpl/THD_applier_module_receiver` since only memory allocations were accounted, making the reported memory always increasing. The global memory reported on performance_schema.memory_summary_global_by_event_name` was not affected. In order to avoid the incorrect memory usage, the used memory ownership is transferred from `thread/group_rpl/THD_applier_module_receiver` when it becomes out of scope into `thread/group_rpl/THD_certifier_broadcast` when it becomes part of the scope. Change-Id: Ieda9c40b52a42f75636004d590440674b2665af9
1 parent 78542c6 commit 2393280

File tree

6 files changed

+206
-3
lines changed

6 files changed

+206
-3
lines changed

include/prealloced_array.h

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -310,6 +310,15 @@ class Prealloced_array {
310310
return false;
311311
}
312312

313+
/**
314+
Claim memory ownership.
315+
316+
@param claim take ownership of memory
317+
*/
318+
void claim_memory_ownership(bool claim) {
319+
if (!using_inline_buffer()) my_claim(m_ext.m_array_ptr, claim);
320+
}
321+
313322
/**
314323
Copies an element into the back of the array.
315324
Complexity: Constant (amortized time, reallocation may happen).
Lines changed: 36 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,36 @@
1+
include/group_replication.inc
2+
Warnings:
3+
Note #### Sending passwords in plain text without SSL/TLS is extremely insecure.
4+
Note #### Storing MySQL user name or password information in the connection metadata repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START REPLICA; see the 'START REPLICA Syntax' in the MySQL Manual for more information.
5+
[connection server1]
6+
include/start_and_bootstrap_group_replication.inc
7+
8+
############################################################
9+
# 1. Populate server with data to alloc memory using
10+
# Gtid_set::Interval_chunk key.
11+
CREATE TABLE t1 (c1 INT NOT NULL AUTO_INCREMENT PRIMARY KEY, c2 INT) ENGINE=InnoDB;
12+
BEGIN;
13+
INSERT INTO t1 (c2) values (2);
14+
INSERT INTO t1 (c2) values (2);
15+
INSERT INTO t1 (c2) values (2);
16+
INSERT INTO t1 (c2) values (2);
17+
INSERT INTO t1 (c2) values (2);
18+
COMMIT;
19+
20+
############################################################
21+
# 2. Assert values on thread and global counters before
22+
# running garbage collect.
23+
include/assert.inc [Global counters shall been higher than sum of applier and certifier threads.]
24+
25+
############################################################
26+
# 3. Wait for execution of another garbage collection
27+
28+
############################################################
29+
# 4. Assert values on thread and global counters after
30+
# running garbage collect.
31+
include/assert.inc [Global counters shall been higher than sum of applier and certifier threads.]
32+
33+
############################################################
34+
# 5. Cleanup
35+
DROP TABLE t1;
36+
include/group_replication_end.inc
Lines changed: 89 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,89 @@
1+
################################################################################
2+
# Validate that primary releases all allocated memory related to
3+
# "memory/sql/Gtid_set::Interval_chunk" event name.
4+
#
5+
# Test:
6+
# 0. The test requires one server: M1.
7+
# 1. Populate server with data to alloc memory using
8+
# Gtid_set::Interval_chunk key.
9+
# 2. Assert values on thread and global counters before
10+
# running garbage collect.
11+
# 3. Wait for execution of another garbage collection
12+
# 4. Assert values on thread and global counters after
13+
# running garbage collect.
14+
# 5. Cleanup
15+
################################################################################
16+
17+
--source include/have_nodebug.inc
18+
--source include/not_valgrind.inc
19+
--source include/not_asan.inc
20+
--source include/big_test.inc
21+
--source include/have_group_replication_plugin.inc
22+
--let $rpl_skip_group_replication_start= 1
23+
--source include/group_replication.inc
24+
25+
--source include/start_and_bootstrap_group_replication.inc
26+
27+
28+
--echo
29+
--echo ############################################################
30+
--echo # 1. Populate server with data to alloc memory using
31+
--echo # Gtid_set::Interval_chunk key.
32+
33+
CREATE TABLE t1 (c1 INT NOT NULL AUTO_INCREMENT PRIMARY KEY, c2 INT) ENGINE=InnoDB;
34+
35+
BEGIN;
36+
INSERT INTO t1 (c2) values (2);
37+
INSERT INTO t1 (c2) values (2);
38+
INSERT INTO t1 (c2) values (2);
39+
INSERT INTO t1 (c2) values (2);
40+
INSERT INTO t1 (c2) values (2);
41+
COMMIT;
42+
43+
--exec $MYSQL_SLAP --create-schema=test --delimiter=";" --iterations=100 --query="INSERT INTO t1 (c2) SELECT c2 FROM t1 LIMIT 5" --concurrency=1 --silent
44+
45+
46+
--echo
47+
--echo ############################################################
48+
--echo # 2. Assert values on thread and global counters before
49+
--echo # running garbage collect.
50+
51+
--let $applier_bytes_used= `SELECT CURRENT_NUMBER_OF_BYTES_USED FROM performance_schema.memory_summary_by_thread_by_event_name JOIN performance_schema.threads USING(thread_id) WHERE event_name = "memory/sql/Gtid_set::Interval_chunk" AND name = "thread/group_rpl/THD_applier_module_receiver"`
52+
--let $certifier_bytes_used= `SELECT CURRENT_NUMBER_OF_BYTES_USED FROM performance_schema.memory_summary_by_thread_by_event_name JOIN performance_schema.threads USING(thread_id) WHERE event_name = "memory/sql/Gtid_set::Interval_chunk" AND name = "thread/group_rpl/THD_certifier_broadcast"`
53+
--let $global_bytes_used= `SELECT CURRENT_NUMBER_OF_BYTES_USED FROM performance_schema.memory_summary_global_by_event_name WHERE EVENT_NAME = "memory/sql/Gtid_set::Interval_chunk"`
54+
55+
--let $assert_text= Global counters shall been higher than sum of applier and certifier threads.
56+
--let $assert_cond= [SELECT $applier_bytes_used + $certifier_bytes_used] <= $global_bytes_used
57+
--source include/assert.inc
58+
59+
60+
--echo
61+
--echo ############################################################
62+
--echo # 3. Wait for execution of another garbage collection
63+
64+
--let $wait_timeout= 150
65+
--let $wait_condition= SELECT Count_transactions_rows_validating=5 FROM performance_schema.replication_group_member_stats WHERE member_id IN (SELECT @@server_uuid)
66+
--source include/wait_condition.inc
67+
68+
69+
--echo
70+
--echo ############################################################
71+
--echo # 4. Assert values on thread and global counters after
72+
--echo # running garbage collect.
73+
74+
--let $applier_bytes_used= `SELECT CURRENT_NUMBER_OF_BYTES_USED FROM performance_schema.memory_summary_by_thread_by_event_name JOIN performance_schema.threads USING(thread_id) WHERE event_name = "memory/sql/Gtid_set::Interval_chunk" AND name = "thread/group_rpl/THD_applier_module_receiver"`
75+
--let $certifier_bytes_used= `SELECT CURRENT_NUMBER_OF_BYTES_USED FROM performance_schema.memory_summary_by_thread_by_event_name JOIN performance_schema.threads USING(thread_id) WHERE event_name = "memory/sql/Gtid_set::Interval_chunk" AND name = "thread/group_rpl/THD_certifier_broadcast"`
76+
--let $global_bytes_used= `SELECT CURRENT_NUMBER_OF_BYTES_USED FROM performance_schema.memory_summary_global_by_event_name WHERE EVENT_NAME = "memory/sql/Gtid_set::Interval_chunk"`
77+
78+
--let $assert_text= Global counters shall been higher than sum of applier and certifier threads.
79+
--let $assert_cond= [SELECT $applier_bytes_used + $certifier_bytes_used] <= $global_bytes_used
80+
--source include/assert.inc
81+
82+
83+
--echo
84+
--echo ############################################################
85+
--echo # 5. Cleanup
86+
87+
DROP TABLE t1;
88+
89+
--source include/group_replication_end.inc

plugin/group_replication/src/certifier.cc

Lines changed: 53 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -494,7 +494,15 @@ void Certifier::clear_certification_info() {
494494
for (Certification_info::iterator it = certification_info.begin();
495495
it != certification_info.end(); ++it) {
496496
// We can only delete the last reference.
497-
if (it->second->unlink() == 0) delete it->second;
497+
if (it->second->unlink() == 0) {
498+
/*
499+
Claim Gtid_set_ref used memory to
500+
`thread/group_rpl/THD_certifier_broadcast` thread, since this is thread
501+
that does release the memory.
502+
*/
503+
it->second->claim_memory_ownership(true);
504+
delete it->second;
505+
}
498506
}
499507

500508
certification_info.clear();
@@ -714,6 +722,15 @@ Certification_result Certifier::add_writeset_to_certification_info(
714722
item_previous_sequence_number != parallel_applier_sequence_number)
715723
transaction_last_committed = item_previous_sequence_number;
716724
}
725+
/*
726+
The memory used by Gtid_set_ref is allocated by
727+
`thread/group_rpl/THD_applier_module_receiver`, though it will be released
728+
by `thread/group_rpl/THD_certifier_broadcast` thread. To avoid untracked
729+
memory release on `thread/group_rpl/THD_applier_module_receiver` we do
730+
dissociate this used memory from this thread.
731+
*/
732+
snapshot_version_value->claim_memory_ownership(false);
733+
717734
return Certification_result::positive;
718735
}
719736

@@ -1036,7 +1053,15 @@ bool Certifier::add_item(const char *item, Gtid_set_ref *snapshot_version,
10361053
*item_previous_sequence_number =
10371054
it->second->get_parallel_applier_sequence_number();
10381055

1039-
if (it->second->unlink() == 0) delete it->second;
1056+
if (it->second->unlink() == 0) {
1057+
/*
1058+
Claim Gtid_set_ref used memory to
1059+
`thread/group_rpl/THD_certifier_broadcast` thread, since this is thread
1060+
that does release the memory.
1061+
*/
1062+
it->second->claim_memory_ownership(true);
1063+
delete it->second;
1064+
}
10401065

10411066
it->second = snapshot_version;
10421067
error = false;
@@ -1225,7 +1250,15 @@ void Certifier::garbage_collect_internal(Gtid_set *executed_gtid_set,
12251250

12261251
while (it != certification_info.end()) {
12271252
if (it->second->is_subset_not_equals(stable_gtid_set)) {
1228-
if (it->second->unlink() == 0) delete it->second;
1253+
if (it->second->unlink() == 0) {
1254+
/*
1255+
Claim Gtid_set_ref used memory to
1256+
`thread/group_rpl/THD_certifier_broadcast` thread, since this is
1257+
thread that does release the memory.
1258+
*/
1259+
it->second->claim_memory_ownership(true);
1260+
delete it->second;
1261+
}
12291262
certification_info.erase(it++);
12301263
} else
12311264
++it;
@@ -1743,6 +1776,15 @@ bool Certifier::set_certification_info_part(
17431776
value->link();
17441777
certification_info.insert(
17451778
std::pair<std::string, Gtid_set_ref *>(key, value));
1779+
/*
1780+
The memory used by Gtid_set_ref is allocated by
1781+
`thread/group_rpl/THD_applier_module_receiver`, though it will be
1782+
released by `thread/group_rpl/THD_certifier_broadcast` thread. To avoid
1783+
untracked memory release on
1784+
`thread/group_rpl/THD_applier_module_receiver` we do dissociate this
1785+
used memory from this thread.
1786+
*/
1787+
value->claim_memory_ownership(false);
17461788
}
17471789

17481790
return false;
@@ -1992,6 +2034,14 @@ int Certifier::set_certification_info(
19922034
value->link();
19932035
certification_info.insert(
19942036
std::pair<std::string, Gtid_set_ref *>(key, value));
2037+
/*
2038+
The memory used by Gtid_set_ref is allocated by
2039+
`thread/group_rpl/THD_applier_module_receiver`, though it will be released
2040+
by `thread/group_rpl/THD_certifier_broadcast` thread. To avoid untracked
2041+
memory release on `thread/group_rpl/THD_applier_module_receiver` we do
2042+
dissociate this used memory from this thread.
2043+
*/
2044+
value->claim_memory_ownership(false);
19952045
}
19962046

19972047
if (initialize_server_gtid_set()) {

sql/rpl_gtid.h

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1592,6 +1592,14 @@ class Gtid_set {
15921592
public:
15931593
/// Destroy this Gtid_set.
15941594
~Gtid_set();
1595+
1596+
/**
1597+
Claim ownership of memory.
1598+
1599+
@param claim claim ownership of memory.
1600+
*/
1601+
void claim_memory_ownership(bool claim);
1602+
15951603
/**
15961604
Removes all gtids from this Gtid_set.
15971605

sql/rpl_gtid_set.cc

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -101,6 +101,17 @@ Gtid_set::Gtid_set(Tsid_map *_tsid_map, const char *text,
101101
*status = add_gtid_text(text);
102102
}
103103

104+
void Gtid_set::claim_memory_ownership(bool claim) {
105+
m_intervals.claim_memory_ownership(claim);
106+
107+
Interval_chunk *chunk = chunks;
108+
while (chunk != nullptr) {
109+
Interval_chunk *next_chunk = chunk->next;
110+
my_claim(chunk, claim);
111+
chunk = next_chunk;
112+
}
113+
}
114+
104115
void Gtid_set::init() {
105116
DBUG_TRACE;
106117
has_cached_string_length = false;

0 commit comments

Comments
 (0)