Fix the spelling of preceding, specified, received and additionally #2
Conversation
Hi Otto, thank you for the contribution! We are not currently accepting pull requests from GitHub. May I suggest filing a bug report on bugs.mysql.com? (I would also be happy to do so on your behalf.)
Ok, thanks. I filed the bug report now at http://bugs.mysql.com/bug.php?id=75084
Thank you Otto! I will close this pull request for now.
MyDanny pushed a commit that referenced this pull request on Mar 10, 2015
Patch #2: As a consequence of earlier patches, my_afree() is no longer needed. Remove all usage of it.
MyDanny pushed a commit that referenced this pull request on Mar 10, 2015
…rows. Background 1: When binlog_format = row, CREATE ... SELECT is logged in two pieces, like: Anonymous_Gtid query_log_event(CREATE TABLE without SELECT) Anonymous_Gtid query_log_event(BEGIN) ...row events... query_log_event(COMMIT) (or Xid_log_event) Internally, there is a call to MYSQL_BIN_LOG::commit after the table has been created and before the rows are selected. When gtid_next='ANONYMOUS', we must not release anonymous ownership for the commit occurring in the middle of the statement (since that would allow a concurrent client to set gtid_mode=on, making it impossible to commit the rest of the statement). Also, the commit in the middle of the statement should not decrease the counter of ongoing GTID-violating transactions, since that would allow a concurrent client to set ENFORCE_GTID_CONSISTENCY=ON even if there is an ongoing transaction that violates GTID-consistency. The logic to skip releasing anonymous ownership and skip decreasing the counter is as follows. Before calling mysql_bin_log.commit, it sets the flag thd->is_commit_in_middle_of_statement. Eventually, mysql_bin_log.commit calls gtid_state->update_on_commit, which calls gtid_state->update_gtids_impl, which reads the thd->is_commit_in_middle_of_statement and accordingly decides to skip releasing anonymous ownership and/or skips decreasing the counter. Problem 1: When thd->is_commit_in_middle_of_statement has been set, it is crucial that there is another call to update_gtids_impl when the transaction ends (otherwise the session will keep holding anonymous ownership and will not decrease the counters). Normally, this happens because mysql_bin_log.commit is always called, and mysql_bin_log.commit normally invokes ordered_commit, which calls update_gtids_impl. However, in case the SELECT part of the statement does not find any rows, mysql_bin_log.commit skips the call to ordered_commit, so update_gtids_impl does not get called. This is the first problem we fix in this commit. Fix 1: We fix this problem as follows. After calling mysql_bin_log.commit to log the CREATE part of CREATE...SELECT, the CREATE...SELECT code sets thd->pending_gtid_state_update=true (this is a new flag that we introduce in this patch). If the flag is set, update_gtids_impl clears it. At the end of mysql_bin_log.commit, we check the flag to see if update_gtids_impl has been called by any function invoked by mysql_bin_log.commit. If not, i.e., if the flag is still true at the end of mysql_bin_log.commit, it means we have reached the corner case where update_gtids_impl was skipped. Thus we call it explicitly from mysql_bin_log.commit. Background 2: GTID-violating DDL (CREATE...SELECT and CREATE TEMPORARY) is detected in is_ddl_gtid_compatible, called from gtid_pre_statement_checks, which is called from mysql_parse just before the implicit pre-commit. is_ddl_gtid_compatible determines whether an error or warning or nothing is to be generated, and whether to increment the counters of GTID- violating transactions. In case an error is generated, it is important that the error happens before the implicit commit, so that the statement fails before it commits the ongoing transaction. Problem 2: In case a warning is to be generated, and there is an ongoing transaction, the implicit commit will write to the binlog, and thus it will call gtid_state->update_gtids_impl, which will decrease the counters of GTID-violating transactions. Thus, the counters will be zero for the duration of the transaction. 
Fix 2: We call is_ddl_gtid_compatible *both* before the implicit commit and after the implicit commit. If an error is to be generated, the error is generated before the commit. If a warning is to be generated and/or the counter of GTID-violating transactions is to be increased, then this happens after the commit. Code changes #1: @sql/binlog.cc - Move MYSQL_BIN_LOG::commit to a new function MYSQL_BIN_LOG::write_binlog_and_commit_engine. Make MYSQL_BIN_LOG::commit call this function, and after the return check thd->pending_gtid_state_update to see if another call to gtid_state->update_on_[commit|rollback] is needed. - Simplify MYSQL_BIN_LOG::write_bin_log_and_commit_engine; remove useless local variable 'error' that would never change its value. @sql/binlog.h: - Declaration of new function. @sql/rpl_gtid_state.cc: - Set thd->pending_gtid_state_update to false at the end of update_gtids_impl. Code changes #2: @sql/binlog.cc: - Add two parameters to handle_gtid_consistency and is_ddl_compatible: handle_error is true in the call to is_ddl_gtid_compatible that happens *before* the implicit commit and false in the call to is_ddl_gtid_compatible that happens *after* the implicit commit. It tells the function to generate the error, if an error is to be generated. The other parameter, handle_nonerror, is true in the call to is_ddl_gtid_compatible that happens *after* the implicit commit and false in the call that happens *before* the implicit commit. It tells the function to generate the warnings and increment the counter, if that needs to be done. @sql/rpl_gtid_execution.cc: - Call is_ddl_gtid_compatible after the implicit commit. Pass the two new parameters to the function. @sql/sql_class.h: - Update prototype for is_ddl_gtid_compatible. @sql/sql_insert.cc: - Set thd->pending_gtid_state_update = true after committing the CREATE part of a CREATE...SELECT. Misc changes: @sql/binlog.cc - Add DEBUG_SYNC symbol end_decide_logging_format used in tests. @sql/rpl_gtid_state.cc: - For modularity, move out parts of update_gtids_impl to a new function, end_gtid_violating_transaction. - Move the lock/unlock of global_sid_lock into update_gtids_impl. - Make update_gtids_impl release global_sid_lock before the call to end_gtid_violating_transaction, so as to hold it for as short a time as possible. @sql/rpl_gtid.h - Because we release the locks earlier in update_gtids_impl in rpl_gtid_state.cc, we need to acquire the lock again in end_[anonymous|automatic]_gtid_violating_transaction, in order to do some debug assertions. - Add DBUG_PRINT for the counters. Test changes: - Split binlog_enforce_gtid_consistency into six tests, depending on the type of scenarios it tests: Three classes of GTID-violation: *_create_select_*: test CREATE ... SELECT. *_tmp_*: test CREATE TEMPORARY/DROP TEMPORARY inside a transaction. *_trx_nontrx_*: test combinations of transactional and non-transactional updates in the same statement or in the same transaction. For each class of GTID-violation, one positive and one negative test: *_consistent.test: Cases which are *not* GTID-violating *_violation.test: Cases which *are* GTID-violating. - The general logic of these tests is: - extra/binlog_tests/enforce_gtid_consistency.test iterates over all values of GTID_MODE, ENFORCE_GTID_CONSISTENCY, and GTID_NEXT.
For each case, it sources a file; the name of the sourced file is specified by the file that sources extra/binlog_tests/enforce_gtid_consistency.test - The top-level file in suite/binlog/t invokes extra/binlog_tests/enforce_gtid_consistency.test, specifying one of the filenames extra/binlog_tests/enforce_gtid_consistency_[create_select|tmp| trx_nontrx]_[consistent|violation].test. - Each of the files extra/binlog_tests/enforce_gtid_consistency_[create_select|tmp| trx_nontrx]_[consistent|violation].test sets up a number of test scenarios. Each test scenario is executed by sourcing extra/binlog_tests/enforce_gtid_consistency_statement.inc. - extra/binlog_tests/enforce_gtid_consistency_statement.inc executes the specified statement, checks that warnings are generated and counters incremented/decremented as specified by the caller. - Since the tests set GTID_MODE explicitly, it doesn't make sense to run the test in both combinations GTID_MODE=ON/OFF. However, for the *_trx_nontrx_* cases, it is important to test that it works both with binlog_direct_non_transactional_updates=on and off. The suite is never run with those combinations. To leverage the combinations GTID_MODE=ON/OFF, we run the test with binlog_direct_non_transactional_updates=on if GTID_MODE=ON, and we run the test with binlog_direct_non_transactional_updates=off if GTID_MODE=OFF.
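To make the fallback in Fix 1 concrete, here is a minimal C++ sketch of the pattern (illustrative types and names, not the actual server code): the caller marks a GTID state update as pending, any nested path that performs the update clears the flag, and the outer commit function performs the update itself if nothing else did.

```cpp
#include <cassert>

// Illustrative stand-ins for THD and Gtid_state; not the real server types.
struct Session {
  bool pending_gtid_state_update = false;
};

struct GtidState {
  int update_calls = 0;
  void update_on_commit(Session *thd) {
    // Every real update clears the pending flag.
    thd->pending_gtid_state_update = false;
    ++update_calls;
  }
};

GtidState gtid_state;

// Models MYSQL_BIN_LOG::commit(): the inner path may or may not reach the
// code that calls update_on_commit() (e.g. when the SELECT part of
// CREATE ... SELECT finds no rows).
void binlog_commit(Session *thd, bool inner_path_updates_gtids) {
  if (inner_path_updates_gtids) gtid_state.update_on_commit(thd);
  // Fallback: if the flag is still set, no nested call happened,
  // so perform the update explicitly before returning.
  if (thd->pending_gtid_state_update) gtid_state.update_on_commit(thd);
}

int main() {
  Session thd;
  thd.pending_gtid_state_update = true;   // set before logging the CREATE part
  binlog_commit(&thd, /*inner_path_updates_gtids=*/false);
  assert(gtid_state.update_calls == 1);   // fallback covered the corner case
  binlog_commit(&thd, /*inner_path_updates_gtids=*/true);
  assert(gtid_state.update_calls == 2);   // no double update: flag already cleared
  return 0;
}
```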
bkandasa pushed a commit that referenced this pull request on Aug 4, 2015
…TIME% After-push test result update #2.
bkandasa pushed a commit that referenced this pull request on Aug 4, 2015
Two places in replication code were causing Valgrind errors: 1. Memory leak in Relay_log_info::wait_for_gtid_set, since it passed the return value from Gtid_set::to_string() directly to DBUG_PRINT, without storing it anywhere so that it can be freed. 2. MYSQL_BIN_LOG::init_gtid_sets would pass a bad pointer to DBUG_PRINT in some cases. In problem #1, an underlying problem was that to_string returns newly allocated memory and this was easy to miss when reading the code that calls the function. It would be better to return the value through a parameter, since that forces the caller to store it in a variable, and then it is more obvious that the value must be freed. And in fact such a function existed already, so we fix the problem by removing the (redundant) no-args version of Gtid_set::to_string and using the one- or two-argument function instead. In problem #2, print an empty string if we detect that the pointer will be bad. These bugs were found when adding some debug printouts to read_gtids_from_binlog. These debug printouts never made it to the server code through any other bug report, but would be useful to have for future debugging, so they are included in this patch. Additionally, removed the call to global_sid_lock->rdlock() used before Previous_gtids_log_event::get_str(). This is not needed since Previous_gtids_log_event doesn't use shared resources.
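A small sketch of the ownership point behind fix #1, using standard C++ types rather than the server's Gtid_set and String classes:

```cpp
#include <cstdio>
#include <string>

// Returning newly allocated memory (as the removed no-args to_string() did)
// makes leaks easy, e.g. when the result is passed straight to a printout.
// Passing the destination as a parameter forces the caller to hold the value
// in a named variable, so ownership and lifetime are obvious.
void gtid_set_to_string(std::string *out) {
  *out = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-10";  // illustrative value
}

int main() {
  std::string buf;
  gtid_set_to_string(&buf);
  std::printf("waiting for GTID set: %s\n", buf.c_str());  // buf freed automatically
  return 0;
}
```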
bjornmu pushed a commit that referenced this pull request on Oct 21, 2015
Bug #21306392: REMOVE OLD-STYLE MAX_STATEMENT_TIME HINT (REPLACE WITH MAX_EXECUTION_TIME) After-push test update #2.
bjornmu pushed a commit that referenced this pull request on Oct 21, 2015
…TO SELF Problem: If a multi-column update statement fails when updating one of the columns in a row, it will go on and update the remaining columns in that row before it stops and reports an error. If the failure happens when updating a JSON column, and the JSON column is also referenced later in the update statement, new and more serious errors can happen when the update statement attempts to read the JSON column, as it may contain garbage at this point. The fix is twofold: 1) Field_json::val_str() currently returns NULL if an error happens. This is correct for val_str() functions in the Item class hierarchy, but not for val_str() functions in the Field class hierarchy. The val_str() functions in the Field classes instead return a pointer to an empty String object on error. Since callers don't expect it to return NULL, this caused a crash when a caller unconditionally dereferenced the returned pointer. The patch makes Field_json::val_str() return a pointer to an empty String on error to avoid such crashes. 2) Whereas #1 fixes the immediate crash, Field_json::val_str() may still read garbage when this situation occurs. This could lead to unreliable behaviour, and both valgrind and ASAN warn about it. The patch therefore also makes Field_json::store() start by clearing the field, so that it will hold an empty value rather than garbage after an error has happened. Fix #2 is sufficient to fix the reported problems. Fix #1 is included for consistency, so that Field_json::val_str() behaves the same way as the other Field::val_str() functions. The query in the bug report didn't always crash. Since the root cause was that it had read garbage, it could be lucky and read something that looked like a valid value. In that case, Field_json::val_str() didn't return NULL, and the crash was avoided. The patch also makes these changes: - It removes the Field_json::store_dom() function, since it is only called from one place. It is now inlined instead. - It corrects information about return values in the comment that describes the ensure_utf8mb4() function.
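The two fixes boil down to two defensive rules for a field type: never return NULL from val_str(), and never leave stale bytes behind after a failed store(). A minimal sketch with illustrative types (not the real Field_json interface):

```cpp
#include <cassert>
#include <string>

// Illustrative field class; not the actual Field_json code.
class JsonField {
 public:
  // On a store error, clear the field first so it never holds garbage (fix 2).
  bool store(const std::string &doc, bool parse_ok) {
    m_value.clear();                 // empty value instead of stale bytes
    if (!parse_ok) return true;      // signal the error to the caller
    m_value = doc;
    return false;
  }

  // On a read error, return a pointer to an empty string object instead of
  // nullptr, matching what callers of Field::val_str() expect (fix 1).
  const std::string *val_str(bool read_error) const {
    static const std::string empty;
    if (read_error) return &empty;
    return &m_value;
  }

 private:
  std::string m_value;
};

int main() {
  JsonField f;
  f.store("{\"a\": 1}", /*parse_ok=*/false);
  const std::string *s = f.val_str(/*read_error=*/false);
  assert(s != nullptr && s->empty());  // safe to dereference, no garbage
  return 0;
}
```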
bjornmu pushed a commit that referenced this pull request on Oct 21, 2015
Background: WAIT_FOR_EXECUTED_GTID_SET waits until a specified set of GTIDs is included in GTID_EXECUTED. SET GTID_PURGED adds GTIDs to GTID_EXECUTED. RESET MASTER clears GTID_EXECUTED. There were multiple issues: 1. Problem: The change in GTID_EXECUTED implied by SET GTID_PURGED did not cause WAIT_FOR_EXECUTED_GTID_SET to stop waiting. Analysis: WAIT_FOR_EXECUTED_GTID_SET waits for a signal to be sent. But SET GTID_PURGED never sent the signal. Fix: Make GTID_PURGED send the signal. Changes: - sql/rpl_gtid_state.cc:Gtid_state::add_lost_gtids - sql/rpl_gtid_state.cc: removal of #ifdef HAVE_GTID_NEXT_LIST - sql/rpl_gtid.h: removal of #ifdef HAVE_GTID_NEXT_LIST 2. Problem: There was a race condition where WAIT_FOR_EXECUTED_GTID_SET could miss the signal from a commit and go into an infinite wait even if GTID_EXECUTED contains all the waited-for GTIDs. Analysis: In the bug, WAIT_FOR_EXECUTED_GTID_SET took a lock while taking a copy of the global state. Then it released the lock, analyzed the copy of the global state, and decided whether it should wait. But if the GTID to wait for was committed after the lock was released, WAIT_FOR_EXECUTED_GTID_SET would miss the signal and go to an infinite wait even if GTID_EXECUTED contains all the waited-for GTIDs. Fix: Refactor the code so that it holds the lock all the way from before it reads the global state until it goes to the wait. Changes: - sql/rpl_gtid_state.cc:Gtid_state::wait_for_gtid_set: Most of the changes in this function are to fix this bug. Note: When the bug existed, it was possible to create a test case for this by placing a debug sync point in the section where it does not hold the lock. However, after the bug has been fixed this section does not exist, so there is no way to test it deterministically. The bug would also cause the test to fail rarely, so a way to test this is to run the test case 1000 times. 3. Problem: The function would take global_sid_lock.wrlock every time it has to wait, and while holding it takes a copy of the entire gtid_executed (which implies allocating memory). This is not very optimal: it may process the entire set each time it waits, and it may wait once for each member of the set, so in the worst case it is O(N^2) where N is the size of the set. Fix: This is fixed by the same refactoring that fixes problem #2. In particular, it does not re-process the entire Gtid_set for each committed transaction. It only removes all intervals of gtid_executed for the current sidno from the remainder of the wait-for-set. Changes: - sql/rpl_gtid_set.cc: Add function remove_intervals_for_sidno. - sql/rpl_gtid_state.cc: Use remove_intervals_for_sidno and remove only intervals for the current sidno. Remove intervals incrementally in the innermost while loop, rather than recomputing the entire set each iteration. 4. Problem: If the client that executes WAIT_FOR_EXECUTED_GTID_SET owns a GTID that is included in the set, then there is no chance for another thread to commit it, so it will wait forever. In effect, it deadlocks with itself. Fix: Detect the situation and generate an error. Changes: - sql/share/errmsg-utf8.txt: new error code ER_CANT_WAIT_FOR_EXECUTED_GTID_SET_WHILE_OWNING_A_GTID - sql/item_func.cc: check the condition and generate the new error 5. Various simplifications. - sql/item_func.cc:Item_wait_for_executed_gtid_set::val_int: - Pointless to set null_value when generating an error.
- add DBUG_ENTER - Improve the prototype for Gtid_state::wait_for_gtid_set so that it takes a Gtid_set instead of a string, and also so that it requires global_sid_lock. - sql/rpl_gtid.h:Mutex_cond_array - combine wait functions into one and make it return bool - improve some comments - sql/rpl_gtid_set.cc:Gtid_set::remove_gno_intervals: - Optimize so that it returns early if this set becomes empty @mysql-test/extra/rpl_tests/rpl_wait_for_executed_gtid_set.inc - Move all wait_for_executed_gtid_set tests into mysql-test/suite/rpl/t/rpl_wait_for_executed_gtid_set.test @mysql-test/include/kill_wait_for_executed_gtid_set.inc @mysql-test/include/wait_for_wait_for_executed_gtid_set.inc - New auxiliary scripts. @mysql-test/include/rpl_init.inc - Document undocumented side effect. @mysql-test/suite/rpl/r/rpl_wait_for_executed_gtid_set.result - Update result file. @mysql-test/suite/rpl/t/rpl_wait_for_executed_gtid_set.test - Rewrote the test to improve coverage and cover all parts of this bug. @sql/item_func.cc - Add DBUG_ENTER - No point in setting null_value when generating an error. - Do the decoding from text to Gtid_set here rather than in Gtid_state. - Check for the new error ER_CANT_WAIT_FOR_EXECUTED_GTID_SET_WHILE_OWNING_A_GTID @sql/rpl_gtid.h - Simplify the Mutex_cond_array::wait functions in the following ways: - Make them one function since they share most code. This also allows calling the three-argument function with NULL as the last parameter, which simplifies the caller. - Make it return bool rather than 0/ETIME/ETIMEOUT, to make it more easy to use. - Make is_thd_killed private. - Add prototype for new Gtid_set::remove_intervals_for_sidno. - Add prototype for Gtid_state::wait_for_sidno. - Un-ifdef-out lock_sidnos/unlock_sidnos/broadcast_sidnos since we now need them. - Make wait_for_gtid_set return bool. @sql/rpl_gtid_mutex_cond_array.cc - Remove the now unused check_thd_killed. @sql/rpl_gtid_set.cc - Optimize Gtid_set::remove_gno_intervals, so that it returns early if the Interval list becomes empty. - Add Gtid_set::remove_intervals_for_sidno. This is just a wrapper around the already existing private member function Gtid_set::remove_gno_intervals. @sql/rpl_gtid_state.cc - Rewrite wait_for_gtid_set to fix problems 2 and 3. See code comment for details. - Factor out wait_for_sidno from wait_for_gtid. - Enable broadcast_sidnos/lock_sidnos/unlock_sidnos, which were ifdef'ed out. - Call broadcast_sidnos after updating the state, to fix issue #1. @sql/share/errmsg-utf8.txt - Add error message used to fix issue #4.
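The missed-signal race in problem 2 is the classic condition-variable pitfall: the predicate must be checked and the wait entered under the same lock the signaller takes. A minimal sketch with std::condition_variable standing in for the server's Mutex_cond_array:

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Shared state standing in for gtid_executed; protected by m.
std::mutex m;
std::condition_variable cond;
int executed_upto = 0;

// Waiter: the predicate is evaluated and the wait is entered while holding
// the same lock the committer uses, so a commit cannot slip in between
// "check the state" and "go to sleep" (the race described in problem 2).
void wait_for(int gno) {
  std::unique_lock<std::mutex> lock(m);
  cond.wait(lock, [&] { return executed_upto >= gno; });
}

// Committer: updates the state and broadcasts under the same lock.
void commit(int gno) {
  std::lock_guard<std::mutex> lock(m);
  executed_upto = gno;
  cond.notify_all();
}

int main() {
  std::thread waiter(wait_for, 3);
  std::thread committer([] { for (int g = 1; g <= 3; ++g) commit(g); });
  waiter.join();
  committer.join();
  return 0;
}
```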
bjornmu pushed a commit that referenced this pull request on Oct 21, 2015
After-push test update #2 to make it query cache-friendly.
bjornmu pushed a commit that referenced this pull request on Oct 21, 2015
An assert failure is seen in some queries which have a semijoin and use the materialization strategy. The assertion fails if either the length of the key is zero or the number of key parts is zero. This could indicate two different problems. 1) If the length is zero, there may not be a problem, as it can legitimately be zero if, for example, the key is a zero-length string. 2) If the number of key parts is zero, there is a bug, as a key must have at least one part. The patch fixes issue #1 by removing the length check in the assertion. Issue #2 happens if JOIN::update_equalities_for_sjm() doesn't recognize the expression selected from a subquery, and fails to replace it with a reference to a column in a temporary table that holds the materialized result. This causes it to not recognize it as a part of the key later, and keyparts could end up as zero. The patch fixes it by calling real_item() on the expression in order to see through Item_refs that may wrap the expression if the subquery reads from a view.
For the record: fixed in commit 60aeb14
bjornmu pushed a commit that referenced this pull request on Feb 5, 2016
Problem: The binary log group commit sync is failing when committing a group of transactions into a non-transactional storage engine while another thread is rotating the binary log. Analysis: The binary log group commit procedure (ordered_commit) acquires LOCK_log during the #1 stage (flush). As it holds the LOCK_log, a binary log rotation will have to wait until this flush stage finishes before actually rotating the binary log. For the #2 stage (sync), the binary log group commit only holds the LOCK_log if sync_binlog=1. In this case, the rotation has to wait also for the sync stage to finish. When sync_binlog>1, the sync stage releases the LOCK_log (to let other groups enter the flush stage), holding only the LOCK_sync. In this case, the rotation can acquire the LOCK_log in parallel with the sync stage. For commits into a transactional storage engine, the binary log rotation checks a counter of "flushed but not yet committed" transactions, waiting until this counter reaches zero before closing the current binary log file. As the commit of the transactions happens in the #3 stage of the binary log group commit, the sync of the binary log in stage #2 always succeeds. For commits into a non-transactional storage engine, the binary log rotation checks the same "flushed but not yet committed" transactions counter, but it is zero because it only counts transactions that contain XIDs. So, the rotation is allowed to take place in parallel with the #2 stage of the binary log group commit. When the sync is called at the same time that the rotation has closed the old binary log file but didn't open the new file yet, the sync fails with the following error: 'Can't sync file 'UNOPENED' to disk (Errcode: 9 - Bad file descriptor)'. Fix: For a non-transactional-only workload, binary log group commit will keep the LOCK_log when entering the #2 stage (sync) if the current group is supposed to be synced to the binary log file.
bjornmu pushed a commit that referenced this pull request on Apr 10, 2017
TO DISSAPPEAR" Problem ------- The test case is failing to make the slave server "disappear". Analysis -------- The "crash_in_a_worker" debug sync point is relying on the fact that the workload will be parallelized and reach MTS worker #2, but on slow systems the parallelization will not happen and the server will fail to "disappear". Fix --- Ensure that the workload will be distributed to all the workers even on slow systems.
bjornmu pushed a commit that referenced this pull request on Apr 10, 2017
Author: Andrei Elkin <andrei.elkin@oracle.com> Date: Fri Nov 25 15:17:17 2016 +0200 WL#9175 Correct recovery of DDL statements/transactions by binary log The patch consists of two parts implementing the WL agenda, which is to provide crash-safety for DDL. That is, a server (a general one, master or slave) must be able to recover from a crash to commit or roll back every DDL command that was in progress at the time of the crash. The commit decision is made for commands that had reached Engine-prepared status and got successfully logged into the binary log. Otherwise they are rolled back. In order to achieve the goal some refinements are done to the binlogging mechanism, a minor addition is made to the server recovery module and some changes are applied to the slave side. The binary log part includes Query-log-event, which is made to contain the xid, a key item at server recovery. The recovery is now concerned with it along with its standard location in Xid_log_event. The first part deals with the ACL DDL sub-class and TRIGGER-related queries, which are made fully 2pc-capable. It constructs the WL's framework which is proved on these subclasses. It also specifies how to cover the rest of the DDLs by the WL's framework. For those not 2pc-ready DDL cases, sometimes "stub" tests are prepared to be refined by responsible worklogs. A few notes on the low-level details of the implementation: Note #1. Tagging by xid number is done for the exact 2pc-capable DDL subclass. For DDLs that will be ready for xiding in the future, there is a tech specification for how to do so. Note #2. By virtue of existing mechanisms, the slave applier augments the DDL transaction, incorporating the slave info table update and the Gtid-executed table (either one optionally) at the time the DDL is ready for the final commit. When, for filtering reasons, the DDL skips committing at its regular time, the augmented transaction would still not be empty, consisting of only the added statements, and it would have to be committed by top-level slave-specific functions through Log_event::do_update_pos(). To aid this process Query_log_event::has_committed is introduced. Note #3 (QA, please read this.) The replication System_table interface that is employed by the handler of TABLE-type slave info had to be refined in a few places. Note #4 (runtime code). While trying to lessen the footprint on the runtime server code, a few concessions had to be made. These include changes to ha_commit_trans() to invoke the new pre_commit(), post_commit(), and post_rollback() hooks due to the slave extra statement. ------------------------------------------------------------------- The 2nd part of the patch extends the basic framework and xidifies the rest of the DDL commands that are (at least) committable at recovery. At the moment those include all Data Definition Statements except ones related to VIEWs, STORED Functions and Procedures. A DDL query is recoverable for these subclasses when it has been recorded into the binary log and was discovered there at the server restart, quite compatible with the DML algorithm. However a clean automatic rollback can't be provided for some of the commands and the user would have to complete recovery manually.
pobrzut pushed a commit that referenced this pull request on May 8, 2017
… WITH DEVELOPER STUDIO When we do a release type build of the server (with both optimized and debug enabled server/plugins) with Developer Studio, some MTR tests when run with --debug-server will fail in one of two ways: 1. Tests which try to load a plugin into the mysql client fail with missing symbols. This is caused by the plugin having references to functions which do not exist in the non-debug client. 2. Some tests on sparc fail with Thread stack overrun. Fix for issue #1: mtr will have appended /debug to the plugin dir part when running with --debug-server and if there actually is such a directory. The fix is to remove any trailing /debug from the env. variable within the test. This will affect the client only, not the server. Developer builds will not have put the plugins in a subdirectory /debug so it makes no difference to those. Fix for issue #2: apparently this thread stack overrun is not feasible to avoid, so just skip the test if running with debug server on sparc; there is already an include file to do that. Also added not_sparc_debug.inc to the "white list" so the tests are skipped even when running mtr --no-skip. (cherry picked from commit 9c79e477261ab252e38def436bca3336ef597603)
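The core of fix #1 is just trimming a trailing "/debug" component from the plugin directory value. A small illustrative helper in C++ (the real change lives in the mtr test scripts, and the function name here is hypothetical):

```cpp
#include <cassert>
#include <string>

// Remove a trailing "/debug" appended by mtr when --debug-server is used,
// so the client loads plugins from the non-debug directory.
std::string strip_debug_suffix(std::string plugin_dir) {
  const std::string suffix = "/debug";
  if (plugin_dir.size() >= suffix.size() &&
      plugin_dir.compare(plugin_dir.size() - suffix.size(), suffix.size(),
                         suffix) == 0) {
    plugin_dir.erase(plugin_dir.size() - suffix.size());
  }
  return plugin_dir;
}

int main() {
  assert(strip_debug_suffix("/build/plugin_output_directory/debug") ==
         "/build/plugin_output_directory");
  assert(strip_debug_suffix("/build/plugin_output_directory") ==
         "/build/plugin_output_directory");
  return 0;
}
```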
pobrzut pushed a commit that referenced this pull request on May 8, 2017
(A) move ndb_basename into utility library (B) new portability function ndb_getsockname()
pobrzut pushed a commit that referenced this pull request on May 8, 2017
akopytov pushed a commit to akopytov/mysql-server that referenced this pull request on Aug 25, 2017
In WL-included builds an ASAN run witnessed a missed ~Query_log_event invocation. The destructor was not called due to the WL's changes in the error propagation that specifically affect LC MTS. The failure is exposed in particular by rpl_trigger as the following stack: #0 0x9ecd98 in __interceptor_malloc (/export/home/pb2/test/sb_2-22611026-1489061390.32/mysql-commercial-8.0.1-dmr-linux-x86_64-asan/bin/mysqld+0x9ecd98) mysql#1 0x2b1a245 in my_raw_malloc(unsigned long, int) obj/mysys/../../mysqlcom-pro-8.0.1-dmr/mysys/my_malloc.cc:209:12 mysql#2 0x2b1a245 in my_malloc obj/mysys/../../mysqlcom-pro-8.0.1-dmr/mysys/my_malloc.cc:72 mysql#3 0x2940590 in Query_log_event::Query_log_event(char const*, unsigned int, binary_log::Format_description_event const*, binary_log::Log_event_type) obj/sql/../../mysqlcom-pro-8.0.1-dmr/sql/log_event.cc:4343:46 mysql#4 0x293d235 in Log_event::read_log_event(char const*, unsigned int, char const**, Format_description_log_event const*, bool) obj/sql/../../mysqlcom-pro-8.0.1-dmr/sql/log_event.cc:1686:17 mysql#5 0x293b96f in Log_event::read_log_event() mysql#6 0x2a2a1c9 in next_event(Relay_log_info*) Previously, before the WL, Mts_submode_logical_clock::wait_for_workers_to_finish() did not return any error even when the Coordinator thread was killed. The WL patch needed to refine that behavior, but in doing so it also had to attend to log_event.cc::schedule_next_event() to register an error, following an existing pattern. Since my_error() did not take place, the killed Coordinator continued scheduling, ineffectively though - no Worker gets engaged (a legal case of deferred scheduling) - and without noticing its killed status up to the point when it resets the event pointer in apply_event_and_update_pos(): *ptr_ev= NULL; // announcing the event is passed to w-worker The reset was intended for an assigned Worker to perform the event destruction, or for the Coordinator itself to do so when the event is deferred. As neither is the case here, the event is never destroyed. In contrast, in the pre-WL sources the killed Coordinator does find a Worker. However such a Worker could already be down (errored out and exited), in which case apply_event_and_update_pos() reasonably returns an error and executes delete ev in the exec_relay_log_event() error branch. **Fixed** by deploying a my_error() call in the log_event.cc::schedule_next_event() error branch, which fits the existing pattern. THD::is_error() has always been checked by the Coordinator before any attempt to reset *ptr_ev= NULL. In the errored case the Coordinator does not reset and destroys the event itself in the exec_relay_log_event() error branch, pretty similarly to how the pre-WL sources do. Tested against the rpl_trigger and rpl suites to pass. Approved on rb#15667.
akopytov pushed a commit to akopytov/mysql-server that referenced this pull request on Aug 25, 2017
Some character sets are designated as MY_CS_STRNXFRM, meaning that sorting needs to go through my_strnxfrm() (implemented by the charset), and some are not, meaning that a client can do the strnxfrm itself based on cs->sort_order. However, most of the logic related to the latter has been removed already (e.g. filesort always uses my_strnxfrm() since 2003), and now it's mostly in the way. The three main uses left are: 1. A microoptimization for constructing sort keys in filesort. 2. A home-grown implementation of Boyer-Moore for accelerating certain LIKE patterns that should probably be handled through FTS. 3. Some optimizations to MyISAM prefix keys. Given that our default collation (utf8mb4_0900_ai_ci) now is a strnxfrm-based collation, the benefits of keeping these around for a narrow range of single-byte locales (like latin1_swedish_ci, cp850 and a bunch of more obscure locales) seem dubious. We can't remove the flag entirely, since mysql#3 seemingly affects the on-disk MyISAM structure, but we can remove the code for mysql#1 and mysql#2. Change-Id: If974e490d451b7278355e33ab1fca993f446b792
bjornmu pushed a commit that referenced this pull request on Sep 21, 2017
Patch #2: find_child_doms() checks for duplicates each time it adds a result to the result vector. As explained in the header comment for Json_wrapper::seek(), the duplicate elimination is needed for paths which contain multiple ellipses, so in most cases it is unnecessary work. This patch makes find_child_doms() only check for duplicates in the case where the path contains multiple ellipses, and only when inspecting an ellipsis path leg which is not the first one. Additionally: Remove checks for empty vector after a successful call to push_back(). That is, replace checks for is_seek_done(result, only_need_one) with a simple check for only_need_one when we know the result vector cannot be empty. Call is_seek_done() from the loop in Json_dom::seek() instead of at the top of find_child_doms(), so that we can break out of the loop earlier if we find a match and only need one. Microbenchmarks (64-bit, Intel Core i7-4770 3.4 GHz, GCC 6.3): BM_JsonDomSearchEllipsis 25693 ns/iter [+210.9%] BM_JsonDomSearchEllipsis_OnlyOne 17881 ns/iter [+324.3%] BM_JsonDomSearchKey 128 ns/iter [ +0.8%] BM_JsonBinarySearchEllipsis 231319 ns/iter [ +38.7%] BM_JsonBinarySearchEllipsis_OnlyOne 222726 ns/iter [ +41.6%] BM_JsonBinarySearchKey 86 ns/iter [ 0.0%] Change-Id: I0ee624830680247ec5aed302c0408db00240d441
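A sketch of the conditional duplicate check described above, with illustrative names; the real code works on Json_dom pointers and path legs:

```cpp
#include <algorithm>
#include <vector>

// Duplicate-avoiding append: the (linear) duplicate scan is only paid when
// the path can actually produce duplicates, i.e. when it contains more than
// one ellipsis and we are past the first ellipsis path leg.
template <typename T>
void add_if_not_seen(std::vector<T> &result, const T &elem,
                     bool path_has_multiple_ellipses,
                     bool past_first_ellipsis_leg) {
  const bool need_dup_check =
      path_has_multiple_ellipses && past_first_ellipsis_leg;
  if (need_dup_check &&
      std::find(result.begin(), result.end(), elem) != result.end())
    return;  // already collected via another ellipsis match
  result.push_back(elem);
}

int main() {
  std::vector<int> r;
  add_if_not_seen(r, 1, true, true);    // appended
  add_if_not_seen(r, 1, true, true);    // skipped: duplicate
  add_if_not_seen(r, 1, false, false);  // appended: single-ellipsis paths
                                        // cannot produce duplicates, no scan
  return r.size() == 2 ? 0 : 1;
}
```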
bjornmu pushed a commit that referenced this pull request on Sep 21, 2017
PLUGIN INSTALLS Split the UDF initialization/deinitialization into two: 1. Initialization/deinitialization of the global structures 2. Loading of the UDF definitions from the table and removing them from the global Then kept #2 at the place of the current initialization/deinitialization routines, added #1 initialization very early (before component/plugin initialization) and #1 deinitialization very late (after the plugin/component deinitialization). Added a test plugin and a regression test.
bjornmu pushed a commit that referenced this pull request on Jan 23, 2018
…TABLE_UPGRADE_GUARD To repeat: cmake -DWITH_ASAN=1 -DWITH_ASAN_SCOPE=1 ./mtr --mem --sanitize main.dd_upgrade_error A few dd tests fail with: ==26861==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7000063bf5e8 at pc 0x00010d4dbe8b bp 0x7000063bda40 sp 0x7000063bda38 READ of size 8 at 0x7000063bf5e8 thread T2 #0 0x10d4dbe8a in Prealloced_array<st_plugin_int**, 16ul>::empty() const prealloced_array.h:186 #1 0x10d406a8b in lex_end(LEX*) sql_lex.cc:560 #2 0x10dae4b6d in dd::upgrade::Table_upgrade_guard::~Table_upgrade_guard() (mysqld:x86_64+0x100f87b6d) #3 0x10dadc557 in dd::upgrade::migrate_table_to_dd(THD*, std::__1::basic_string<char, std::__1::char_traits<char>, Stateless_allocator<char, dd::String_type_alloc, My_free_functor> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, Stateless_allocator<char, dd::String_type_alloc, My_free_functor> > const&, bool) (mysqld:x86_64+0x100f7f557) #4 0x10dad7e85 in dd::upgrade::migrate_plugin_table_to_dd(THD*) (mysqld:x86_64+0x100f7ae85) #5 0x10daec6a1 in dd::upgrade::do_pre_checks_and_initialize_dd(THD*) upgrade.cc:1216 #6 0x10cd0a5c0 in bootstrap::handle_bootstrap(void*) bootstrap.cc:336 Change-Id: I265ec6dd97ee8076aaf03763840c0cdf9e20325b Fix: increase lifetime of 'LEX lex;' which is used by 'table_guard'
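The fix relies on C++ destruction order: an object used by a guard's destructor must be declared before the guard so it outlives it. A minimal illustration (Lex and TableUpgradeGuard are stand-ins, not the dd::upgrade types):

```cpp
#include <cstdio>

struct Lex {
  ~Lex() { std::puts("lex destroyed"); }
};

// Guard that uses the Lex in its destructor (like Table_upgrade_guard
// calling lex_end(LEX*)); illustrative only.
struct TableUpgradeGuard {
  explicit TableUpgradeGuard(Lex *lex) : m_lex(lex) {}
  ~TableUpgradeGuard() {
    std::printf("guard cleans up using lex %p\n", static_cast<void *>(m_lex));
  }
  Lex *m_lex;
};

void migrate_table() {
  // Correct order: the object a guard uses in its destructor is declared
  // *before* the guard, so it is destroyed *after* the guard.
  Lex lex;
  TableUpgradeGuard guard(&lex);
  // ... migration work ...
}  // destruction order: guard first (uses lex safely), then lex

int main() {
  migrate_table();
  return 0;
}
```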
bjornmu pushed a commit that referenced this pull request on Jan 23, 2018
as a component * The validate password plugin is converted to a component. * For mysql-8.0 we will have both the component and the plugin (but the plugin will be installed/uninstalled with the deprecation warning below) "validate password plugin' is deprecated and will be removed in a future release. Please use validate_password component instead" * In the next major release we remove the plugin .so file. * With the component enabled, we see the system/status variables prefixed with "validate_password." instead of "validate_password_" (for example, the 'length' system variable is seen as "validate_password.length" instead of "validate_password_length") * The packaging script should install the component by default for new installs The way it currently works with deprecations and removals is the following: 1. We add a deprecation warning for using the old way and we switch the server to use the new way by default for new installations. 2. We expect that people will upgrade their existing installations too to avoid the warning. 3. In the next major release we remove the old way and hope that people did #2.
bjornmu pushed a commit that referenced this pull request on Jan 23, 2018
…/BINDINGS/XCOM/XCOM/TASK.C Problem ---------------------------- A segmentation fault was encountered on a daily-trunk pushbuild run for the test 'group_replication.gr_ssl_mode_verify_identity_error' when starting group replication with the below symptoms - group_replication.gr_ssl_mode_verify_identity_error w5 [ fail ] Test ended at 2017-09-27 17:37:18 CURRENT_TEST: group_replication.gr_ssl_mode_verify_identity_error mysqltest: At line 58: query 'START GROUP_REPLICATION' failed with wrong errno 2013: 'Lost connection to MySQL server during query', instead of 3092 Analysis ----------------------------- The test that fails sporadically is Step #2 of gr_ssl_mode_verify_identity_error. That step causes a global SSL communication failure, in which not even the local connections can be established, thus causing the join operation to fail. When that happens, GCS will request the XCom thread termination. Since it does not have socket access, it will contact it directly by calling xcom_fsm(exit), but this is a well-known non-thread-safe operation, hence causing this crash. Fix ----------------------------- Create a new callback for GCS called should_exit, which is tested within task_loop, in order to check if we should exit the XCom thread.
dveeden pushed a commit to dveeden/mysql-server that referenced this pull request on Feb 4, 2018
Fix misc UBSAN warnings in unit tests. To repeat: export UBSAN_OPTIONS="print_stacktrace=1" ./runtime_output_directory/merge_large_tests-t --gtest_filter='-*DeathTest*' > /dev/null unittest/gunit/gis_algos-t.cc:78:70: runtime error: downcast of address 0x000012dc0be8 which does not point to an object of type 'Gis_polygon_ring' include/sql_string.h:683:35: runtime error: null pointer passed as argument 2, which is declared to never be null mysql#1 0x373e7af in histograms::Value_map<String>::add_values(String const&, unsigned long long) sql/histograms/value_map.cc:149 mysql#2 0x294fcf2 in dd_column_statistics_unittest::add_values(histograms::Value_map<String>&) unittest/gunit/dd_column_statistics-t.cc:62 runtime_output_directory/merge_keyring_file_tests-t --gtest_filter='-*DeathTest*' > /dev/null plugin/keyring/common/keyring_key.cc:82:57: runtime error: null pointer passed as argument 2, which is declared to never be null Change-Id: I2651362e3373244b72e6893f0e22e67402b49a52 (cherry picked from commit 1fe3f72561994da1d912a257689e1b18106f8828)
bjornmu pushed a commit that referenced this pull request on Oct 22, 2018
Post push fix sql/opt_range.cc:14196:22: runtime error: -256 is outside the range of representable values of type 'unsigned int' #0 0x4248a9d in cost_skip_scan(TABLE*, unsigned int, unsigned int, unsigned long long, Cost_estimate*, unsigned long long*, Item*, Opt_trace_object*) sql/opt_range.cc:14196:22 #1 0x41c524c in get_best_skip_scan(PARAM*, SEL_TREE*, bool) sql/opt_range.cc:14086:5 #2 0x41b7b65 in test_quick_select(THD*, Bitmap<64u>, unsigned long long, unsigned long long, bool, enum_order, QEP_shared_owner const*, Item*, Bitmap<64u>*, QUICK_SELECT_I**) sql/opt_range.cc:3352:23 #3 0x458fc08 in get_quick_record_count(THD*, JOIN_TAB*, unsigned long long) sql/sql_optimizer.cc:5542:17 #4 0x458a0cd in JOIN::estimate_rowcount() sql/sql_optimizer.cc:5290:25 The fix is to handle REC_PER_KEY_UNKNOWN explicitly, to avoid using -1.0 in computations later. Change-Id: Ie8a81bdf7323e4f66abcad0a9aca776de8acd945
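A sketch of the idea behind the fix: treat the "unknown" sentinel explicitly before the value ever reaches unsigned or floating-point arithmetic. The -1.0f value and the fallback estimate below are assumptions for illustration, not the server's exact constants:

```cpp
#include <cstdio>

// Stand-in for the sentinel the storage engine returns when the
// records-per-key estimate is unknown.
constexpr float REC_PER_KEY_UNKNOWN = -1.0f;

// Converting the sentinel straight to an unsigned count is undefined
// behaviour (the UBSAN report above); check for it explicitly and fall
// back to a default estimate instead.
unsigned keys_per_group(float rec_per_key) {
  if (rec_per_key == REC_PER_KEY_UNKNOWN) return 10;  // arbitrary fallback
  return static_cast<unsigned>(rec_per_key);
}

int main() {
  std::printf("%u\n", keys_per_group(REC_PER_KEY_UNKNOWN));  // 10, not garbage
  std::printf("%u\n", keys_per_group(4.0f));                 // 4
  return 0;
}
```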
venkatesh-prasad-v pushed a commit to venkatesh-prasad-v/mysql-server that referenced this pull request on Aug 3, 2023
…etwork https://bugs.mysql.com/bug.php?id=109668 Description ----------- GR suffered from problems caused by security probes and network scanner processes connecting to the group replication communication port. This usually is not a problem, but poses a serious threat when another member tries to join the cluster by initiating a connection to the member which is affected by external processes using the port dedicated for group communication for longer durations. During such activities by external processes, the SSL-enabled server stalled forever on the SSL_accept() call waiting for handshake data. Below is the stacktrace: Thread 55 (Thread 0x7f7bb77ff700 (LWP 2198598)): #0 in read () mysql#1 in sock_read () mysql#2 in BIO_read () mysql#3 in ssl23_read_bytes () mysql#4 in ssl23_get_client_hello () mysql#5 in ssl23_accept () mysql#6 in xcom_tcp_server_startup(Xcom_network_provider*) () When the server stalled in the above path forever, it prevented other members from joining the cluster, resulting in the following messages in the joiner server's logs. [ERROR] [MY-011640] [Repl] Plugin group_replication reported: 'Timeout on wait for view after joining group' [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member is already leaving or joining a group.' Solution -------- This patch adds two new variables 1. group_replication_xcom_ssl_socket_timeout It is a file-descriptor level timeout in seconds for both accept() and SSL_accept() calls when group replication is listening on the xcom port. When set to a valid value, say for example 5 seconds, both accept() and SSL_accept() return after 5 seconds. The default value has been set to 0 (waits infinitely) for backward compatibility. This variable is effective only when GR is configured with SSL. 2. group_replication_xcom_ssl_accept_retries It defines the number of retries to be performed before closing the socket. For each retry the server thread calls SSL_accept() with the timeout defined by group_replication_xcom_ssl_socket_timeout for the SSL handshake process once the connection has been accepted by the first accept() call. The default value has been set to 10. This variable is effective only when GR is configured with SSL. Note: - Both of the above variables are dynamically configurable, but will become effective only on START GROUP_REPLICATION. - This patch is only for Linux systems.
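A rough sketch of the mechanism the two variables describe, using plain POSIX sockets and OpenSSL: a receive timeout on the accepted file descriptor bounds each SSL_accept() attempt, and the handshake is retried a limited number of times. This is illustrative only (shown without a main(), since it needs a live listening socket) and is not the group replication code; the parameter names merely mirror the options above.

```cpp
#include <openssl/ssl.h>
#include <sys/socket.h>
#include <sys/time.h>

// Put a file-descriptor timeout on the accepted socket so SSL_accept()
// cannot block forever, then retry the handshake a bounded number of times.
bool ssl_accept_with_timeout(SSL *ssl, int fd, int timeout_sec, int retries) {
  timeval tv{timeout_sec, 0};
  setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
  setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));
  for (int attempt = 0; attempt < retries; ++attempt) {
    int ret = SSL_accept(ssl);
    if (ret == 1) return true;  // handshake completed
    int err = SSL_get_error(ssl, ret);
    if (err != SSL_ERROR_WANT_READ && err != SSL_ERROR_WANT_WRITE)
      return false;             // hard failure: caller closes the socket
    // timeout expired with no handshake data yet: retry up to 'retries'
  }
  return false;                 // give up after the configured retries
}
```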
nawazn pushed a commit that referenced this pull request on Oct 25, 2023
Post push fix. The NdbSocket::copy method duplicated the mutex pointer, leaving two objects referring to one mutex. Typically the source will destroy its mutex, making it unusable for the target object. Remove the copy method. Change-Id: I2cc36128c343c7bab08d96651b12946ecd87210c
bjornmu pushed a commit that referenced this pull request on Oct 25, 2023
Part of WL#15135 Certificate Architecture This patch introduces class TlsKeyManager, containing all TLS authentication and key management logic. A single instance of TlsKeyManager in each node owns the local NodeCertificate, an SSL_CTX, and a table holding the serial numbers and expiration dates of all peer certificates. A large set of TLS-related error codes is introduced in the file TlsKeyErrors.h. The unit test testTlsKeyManager-t tests TLS authentication over client/server connections on localhost. Change-Id: I2ee42efc268219639691f73a1d7638a336844d88
bjornmu pushed a commit that referenced this pull request on Oct 25, 2023
Implement ndb$certificates base table and certificates view. Update results for tests ndbinfo and ndbinfo plans. Change-Id: Iab1b89f5eb82ac1b3e0c049dd55eb7d07394070a
bjornmu pushed a commit that referenced this pull request on Oct 25, 2023
Move client_authenticate() out of SocketClient::connect() (which returns void) into a separate SocketClient::authenticate() method which can return a value. In SocketAuthenticator, change the signature of the authentication routines to return an int (which can represent a result code) rather than a bool. Results less than AuthOk represent failure, and results greater than or equal to AuthOk represent success. Remove the username and password variables from SocketAuthSimple; make them constant strings in the implementation. There are no functional changes. Change-Id: I4c25e99f1b9b692db39213dfa63352da8993a8fb
bjornmu pushed a commit that referenced this pull request on Oct 25, 2023
This changes TransporterRegistry::connect_ndb_mgmd() to return NdbSocket rather than ndb_socket_t. It extends the StartTls test in testMgmd to test upgrading the TLS MGM protocol socket to a transporter. Change-Id: Ic3b9ccf39ec78ed25705a4bbbdc5ac2953a35611
bjornmu pushed a commit that referenced this pull request on Oct 25, 2023
Post push fix. The NdbSocket::copy method duplicated the mutex pointer, leaving two objects referring to one mutex. Typically the source will destroy its mutex, making it unusable for the target object. Fix by using the transfer method instead. Change-Id: I199c04b870049498463903f6358f79a38649f543
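The underlying rule is that a handle owning a mutex must be move-only: a transfer leaves the source empty, so only one object ever destroys the mutex. A minimal stand-in sketch (not the NdbSocket API; a heap integer plays the role of the mutex):

```cpp
#include <cassert>
#include <utility>

// A wrapper owning a resource must not be copyable (two owners would both
// try to destroy it); transfer ownership instead, leaving the source empty.
struct MutexOwner {
  explicit MutexOwner(int *m = nullptr) : mutex(m) {}
  MutexOwner(const MutexOwner &) = delete;             // no copy: one owner only
  MutexOwner &operator=(const MutexOwner &) = delete;
  MutexOwner(MutexOwner &&other) noexcept : mutex(other.mutex) {
    other.mutex = nullptr;                              // source no longer owns it
  }
  ~MutexOwner() { delete mutex; }                       // safe: only one owner deletes
  int *mutex;
};

int main() {
  MutexOwner a(new int(42));
  MutexOwner b(std::move(a));        // "transfer" instead of "copy"
  assert(a.mutex == nullptr && *b.mutex == 42);
  return 0;
}
```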
bjornmu pushed a commit that referenced this pull request on Jan 16, 2024
If the argument to a window function contains a subquery, the access path of that subquery would be printed twice when doing 'EXPLAIN FORMAT=TREE'. When using the Hypergraph optimizer, the subquery path was not printed at all, whether using FORMAT=TREE or FORMAT=JSON. This commit fixes this by ensuring that we ignore duplicate paths, and (for Hypergraph) by traversing the structures needed to find the relevant Item_subselect objects. Change-Id: I2abedcf690294f98ce169b74e53f042f46c47a45
bjornmu pushed a commit that referenced this pull request on Jan 16, 2024
Post-push fix: Cherry-picking the fix onto mysql-trunk introduced an unintended duplication of a code block, causing a shadowing-warning when building with g++. This commit corrects that. Change-Id: I1b279818ca0d30e32fc8dabb76c647120b531e8f
bjornmu pushed a commit that referenced this pull request on Jan 16, 2024
Problem ================================ A Group Replication ASAN run was failing without any symptom of a leak, but with shutdown issues: worker[6] Shutdown report from /dev/shm/mtr-3771884/var-gr-debug/6/log/mysqld.1.err after tests: group_replication.gr_flush_logs group_replication.gr_delayed_initialization_thread_handler_error group_replication.gr_sbr_verifications group_replication.gr_server_uuid_matches_group_name_bootstrap group_replication.gr_stop_async_on_stop_gr group_replication.gr_certifier_message_same_member group_replication.gr_ssl_mode_verify_identity_error_xcom Analysis and Fix ================================ It ended up being a leak in the gr_ssl_mode_verify_identity_error_xcom test: Direct leak of 24 byte(s) in 1 object(s) allocated from: #0 0x7f1709fbe1c7 in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:99 #1 0x7f16ea0df799 in xcom_tcp_server_startup(Xcom_network_provider*) (/export/home/tmp/BUG35594709/mysql-trunk/BIN-ASAN/plugin_output_directory /group_replication.so+0x65d799) #2 0x7f170751e2b2 (/lib/x86_64-linux-gnu/libstdc++.so.6+0xdc2b2) This happens because we delegated incoming connection cleanup to the external consumer in incoming_connection_task. Since it calls incoming_connection() from Network_provider_manager, in case of a concurrent stop, a connection could be left orphaned in the shared atomic due to the lack of an Active Provider, thus creating a memory leak. The solution is to do this cleanup in Network_provider_manager, in both the stop_provider() and stop_all_providers() methods, thus ensuring that no incoming connection leaks. Change-Id: I2367c37608ad075dee63785e9f908af5e81374ca
bjornmu pushed a commit that referenced this pull request on Jan 16, 2024
Post push fix. In test program testTlsKeyManager-t a struct sockaddr pointer was passed to inet_ntop instead of struct in_addr for AF_INET and struct in6_addr for AF_INET6. That caused wrong addresses to be printed on error: not ok 26 - Client cert for test hostname is OK >>> Test of address 2.0.0.0 for msdn.microsoft.com returned error authorization failure: bad hostname not ok 27 - Client cert for test hostname is OK >>> Test of address a00::2620:1ec:46:0 for msdn.microsoft.com returned error authorization failure: bad hostname not ok 28 - Client cert for test hostname is OK >>> Test of address a00::2620:1ec:bdf:0 for msdn.microsoft.com returned error authorization failure: bad hostname Should be 13.107.x.53 or 2620:1ec:x::53. Changed to use ndb_sockaddr and Ndb_inet_ntop instead. Change-Id: Iae4bebca26462f9b65c3232e9768c574e767b380
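For reference, the distinction the fix restores, shown with the plain POSIX API (the actual patch uses ndb_sockaddr and Ndb_inet_ntop); inet_ntop() expects the address structure itself, not the enclosing sockaddr:

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <cstdio>

// Passing the sockaddr pointer to inet_ntop shifts the address bytes and
// prints garbage like "2.0.0.0"; pass the in_addr / in6_addr member instead.
void print_address(const sockaddr *sa) {
  char buf[INET6_ADDRSTRLEN];
  if (sa->sa_family == AF_INET) {
    const sockaddr_in *sin = reinterpret_cast<const sockaddr_in *>(sa);
    inet_ntop(AF_INET, &sin->sin_addr, buf, sizeof(buf));     // in_addr, not sa
  } else {
    const sockaddr_in6 *sin6 = reinterpret_cast<const sockaddr_in6 *>(sa);
    inet_ntop(AF_INET6, &sin6->sin6_addr, buf, sizeof(buf));  // in6_addr, not sa
  }
  std::printf("%s\n", buf);
}

int main() {
  sockaddr_in sin{};
  sin.sin_family = AF_INET;
  inet_pton(AF_INET, "192.0.2.1", &sin.sin_addr);             // documentation address
  print_address(reinterpret_cast<const sockaddr *>(&sin));    // prints 192.0.2.1
  return 0;
}
```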
bjornmu pushed a commit that referenced this pull request on Jan 16, 2024
Move client_authenticate() out of SocketClient::connect() (which returns void) into a separate SocketClient::authenticate() method which can return a value. In SocketAuthenticator, change the signature of the authentication routines to return an int (which can represent a result code) rather than a bool. Results less than AuthOk represent failure, and results greater than or equal to AuthOk represent success. Remove the username and password variables from SocketAuthSimple; make them constant strings in the implementation. There are no functional changes. Change-Id: I4c25e99f1b9b692db39213dfa63352da8993a8fb
bjornmu pushed a commit that referenced this pull request on Jan 16, 2024
This changes TransporterRegistry::connect_ndb_mgmd() to return NdbSocket rather than ndb_socket_t. Back-ported from mysql-trunk. Change-Id: Ic3b9ccf39ec78ed25705a4bbbdc5ac2953a35611
bjornmu pushed a commit that referenced this pull request on Jan 16, 2024
Post-push fix. ASan reported memory leaks from some EXPLAIN tests, such as main.explain_tree. The reason was that the Json_dom objects that were discarded to avoid describing a subquery twice, were not properly destroyed. The EXPLAIN code uses unique_ptr to make sure the Json_dom objects are destroyed, but there are windows in which the objects only exist as unmanaged raw pointers. This patch closes the window which caused this memory leak by changing ExplainChild::obj from a raw pointer to a unique_ptr, so that it gets destroyed even if it doesn't make it into the final tree that describes the full plan. Change-Id: I0f0885da867e8a34335ff11f3ae9da883a878ba4
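A minimal sketch of the ownership change: holding the child as a unique_ptr from the moment it is created means a discarded duplicate is still destroyed. Node is an illustrative stand-in for Json_dom, not the real type:

```cpp
#include <memory>
#include <string>
#include <vector>

// Stand-in for a Json_dom-like node.
struct Node {
  std::string description;
};

// Holding children as unique_ptr closes the window in which a child exists
// only as an unmanaged raw pointer: a child that never makes it into the
// final tree is still freed when its owner goes away.
struct ExplainChild {
  std::unique_ptr<Node> obj;  // was a raw Node* in the leaking version
};

int main() {
  std::vector<ExplainChild> children;
  children.push_back({std::make_unique<Node>(Node{"subquery plan"})});
  children.push_back({std::make_unique<Node>(Node{"duplicate, later dropped"})});
  children.pop_back();        // discarded child is destroyed automatically
  return 0;
}
```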
bjornmu pushed a commit that referenced this pull request on Jan 16, 2024
BUG#35949017 Schema dist setup lockup Bug#35948153 Problem setting up events due to stale NdbApi dictionary cache [#2] Bug#35948153 Problem setting up events due to stale NdbApi dictionary cache [#1] Bug#32550019 Missing check for ndb_schema_result leads to schema dist timeout Change-Id: I4a32197992bf8b6899892f21587580788f828f34
msprajap pushed a commit that referenced this pull request on Jan 16, 2024
cache [#2] This is the second patch, solving the problem of inefficient cache invalidation when invalidating a table which is known to be invalid but it is unknown whether it is in the cache or not. Problem: Currently the only way to invalidate a table in the NdbApi dictionary cache is to open the table and then mark it as invalid. In case the table does not exist in the cache, it will still have to be opened and thus fetched from NDB. This means that in order to get the latest table definition it has to be fetched two times, although the table definition does not already exist in the cache. This is inefficient. Analysis: In order to avoid the double roundtrip there needs to be a function which marks the table as invalid only if it exists in the cache. Fix: Implement an NdbApi function that invalidates a table by name if it exists in the cache. Replace the old pattern of opening a table in order to invalidate it with the new function. The old pattern is still a valid use case for invalidating a table after having worked with it. Change-Id: I20f275f1fed76d991330348bea4ae72548366467
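An illustrative sketch of the invalidate-by-name behaviour (a toy map, not the real NdbApi dictionary cache): invalidation only touches the local cache, so an uncached table costs nothing and no round trip to NDB is needed.

```cpp
#include <string>
#include <unordered_map>

// Toy dictionary cache standing in for the NdbApi one.
class DictCache {
 public:
  void put(const std::string &name, int version) { m_cache[name] = version; }

  // Returns true if the table was cached and has now been dropped from the
  // cache; false means there was nothing to do (and, crucially, no fetch).
  bool invalidate_table_by_name(const std::string &name) {
    return m_cache.erase(name) != 0;
  }

 private:
  std::unordered_map<std::string, int> m_cache;
};

int main() {
  DictCache cache;
  cache.put("test/def/t1", 1);
  bool a = cache.invalidate_table_by_name("test/def/t1");  // true: was cached
  bool b = cache.invalidate_table_by_name("test/def/t2");  // false: no round trip
  return (a && !b) ? 0 : 1;
}
```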
venkatesh-prasad-v pushed a commit to venkatesh-prasad-v/mysql-server that referenced this pull request on Jan 26, 2024
…and a local DDL executed https://bugs.mysql.com/bug.php?id=113727 Problem ------- In high concurrency scenarios, MySQL replica can enter into a deadlock due to a race condition between the replica applier thread and the client thread performing a binlog group commit. Analysis -------- It needs at least 3 threads for this deadlock to happen 1. One client thread 2. Two replica applier threads How this deadlock happens? -------------------------- 0. Binlog is enabled on replica, but log_replica_updates is disabled. 1. Initially, both "Commit Order" and "Binlog Flush" queues are empty. 2. Replica applier thread 1 enters the group commit pipeline to register in the "Commit Order" queue since `log-replica-updates` is disabled on the replica node. 3. Since both "Commit Order" and "Binlog Flush" queues are empty, the applier thread 1 3.1. Becomes leader (In Commit_stage_manager::enroll_for()). 3.2. Registers in the commit order queue. 3.3. Acquires the lock MYSQL_BIN_LOG::LOCK_log. 3.4. Commit Order queue is emptied, but the lock MYSQL_BIN_LOG::LOCK_log is not yet released. NOTE: SE commit for applier thread is already done by the time it reaches here. 4. Replica applier thread 2 enters the group commit pipeline to register in the "Commit Order" queue since `log-replica-updates` is disabled on the replica node. 5. Since the "Commit Order" queue is empty (emptied by applier thread 1 in 3.4), the applier thread 2 5.1. Becomes leader (In Commit_stage_manager::enroll_for()) 5.2. Registers in the commit order queue. 5.3. Tries to acquire the lock MYSQL_BIN_LOG::LOCK_log. Since it is held by applier thread 1 it will wait until the lock is released. 6. Client thread enters the group commit pipeline to register in the "Binlog Flush" queue. 7. Since "Commit Order" queue is not empty (there is applier thread 2 in the queue), it enters the conditional wait `m_stage_cond_leader` with an intention to become the leader for both the "Binlog Flush" and "Commit Order" queues. 8. Applier thread 1 releases the lock MYSQL_BIN_LOG::LOCK_log and proceeds to update the GTID by calling gtid_state->update_commit_group() from Commit_order_manager::flush_engine_and_signal_threads(). 9. Applier thread 2 acquires the lock MYSQL_BIN_LOG::LOCK_log. 9.1. It checks if there is any thread waiting in the "Binlog Flush" queue to become the leader. Here it finds the client thread waiting to be the leader. 9.2. It releases the lock MYSQL_BIN_LOG::LOCK_log and signals on the cond_var `m_stage_cond_leader` and enters a conditional wait until the thread's `tx_commit_pending` is set to false by the client thread (will be done in the Commit_stage_manager::process_final_stage_for_ordered_commit_group() called by client thread from fetch_and_process_flush_stage_queue()). 10. The client thread wakes up from the cond_var `m_stage_cond_leader`. The thread has now become a leader and it is its responsibility to update GTID of applier thread 2. 10.1. It acquires the lock MYSQL_BIN_LOG::LOCK_log. 10.2. Returns from `enroll_for()` and proceeds to process the "Commit Order" and "Binlog Flush" queues. 10.3. Fetches the "Commit Order" and "Binlog Flush" queues. 10.4. Performs the storage engine flush by calling ha_flush_logs() from fetch_and_process_flush_stage_queue(). 10.5. Proceeds to update the GTID of threads in "Commit Order" queue by calling gtid_state->update_commit_group() from Commit_stage_manager::process_final_stage_for_ordered_commit_group(). 11. 
At this point, we will have - Client thread performing GTID update on behalf if applier thread 2 (from step 10.5), and - Applier thread 1 performing GTID update for itself (from step 8). Due to the lack of proper synchronization between the above two threads, there exists a time window where both threads can call gtid_state->update_commit_group() concurrently. In subsequent steps, both threads simultaneously try to modify the contents of the array `commit_group_sidnos` which is used to track the lock status of sidnos. This concurrent access to `update_commit_group()` can cause a lock-leak resulting in one thread acquiring the sidno lock and not releasing at all. ----------------------------------------------------------------------------------------------------------- Client thread Applier Thread 1 ----------------------------------------------------------------------------------------------------------- update_commit_group() => global_sid_lock->rdlock(); update_commit_group() => global_sid_lock->rdlock(); calls update_gtids_impl_lock_sidnos() calls update_gtids_impl_lock_sidnos() set commit_group_sidno[2] = true set commit_group_sidno[2] = true lock_sidno(2) -> successful lock_sidno(2) -> waits update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()` if (commit_group_sidnos[2]) { unlock_sidno(2); commit_group_sidnos[2] = false; } Applier thread continues.. lock_sidno(2) -> successful update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()` if (commit_group_sidnos[2]) { <=== this check fails and lock is not released. unlock_sidno(2); commit_group_sidnos[2] = false; } Client thread continues without releasing the lock ----------------------------------------------------------------------------------------------------------- 12. As the above lock-leak can also happen the other way i.e, the applier thread fails to unlock, there can be different consequences hereafter. 13. If the client thread continues without releasing the lock, then at a later stage, it can enter into a deadlock with the applier thread performing a GTID update with stack trace. Client_thread ------------- mysql#1 __GI___lll_lock_wait mysql#2 ___pthread_mutex_lock mysql#3 native_mutex_lock <= waits for commit lock while holding sidno lock mysql#4 Commit_stage_manager::enroll_for mysql#5 MYSQL_BIN_LOG::change_stage mysql#6 MYSQL_BIN_LOG::ordered_commit mysql#7 MYSQL_BIN_LOG::commit mysql#8 ha_commit_trans mysql#9 trans_commit_implicit mysql#10 mysql_create_like_table mysql#11 Sql_cmd_create_table::execute mysql#12 mysql_execute_command mysql#13 dispatch_sql_command Applier thread -------------- mysql#1 ___pthread_mutex_lock mysql#2 native_mutex_lock mysql#3 safe_mutex_lock mysql#4 Gtid_state::update_gtids_impl_lock_sidnos <= waits for sidno lock mysql#5 Gtid_state::update_commit_group mysql#6 Commit_order_manager::flush_engine_and_signal_threads <= acquires commit lock here mysql#7 Commit_order_manager::finish mysql#8 Commit_order_manager::wait_and_finish mysql#9 ha_commit_low mysql#10 trx_coordinator::commit_in_engines mysql#11 MYSQL_BIN_LOG::commit mysql#12 ha_commit_trans mysql#13 trans_commit mysql#14 Xid_log_event::do_commit mysql#15 Xid_apply_log_event::do_apply_event_worker mysql#16 Slave_worker::slave_worker_exec_event mysql#17 slave_worker_exec_job_group mysql#18 handle_slave_worker 14. If the applier thread continues without releasing the lock, then at a later stage, it can perform recursive locking while setting the GTID for the next transaction (in set_gtid_next()). 
    In debug builds this case hits the assertion `safe_mutex_assert_not_owner()`, meaning the lock is already held by the replica applier thread when it tries to re-acquire it.

Solution
--------
In the problematic example above, when seen from each thread individually, there is no problem in the order of lock acquisition, so there is no need to change the lock order. The root cause is that multiple threads can concurrently access the array `Gtid_state::commit_group_sidnos`. In its initial implementation, threads were expected to hold `MYSQL_BIN_LOG::LOCK_commit` before modifying its contents, but this was not considered when upstream implemented WL#7846 (MTS: slave-preserve-commit-order when log-slave-updates/binlog is disabled). With this patch, we now ensure that `MYSQL_BIN_LOG::LOCK_commit` is acquired by the client thread (the binlog flush leader) when it performs the GTID update on behalf of the threads waiting in the "Commit Order" queue, thus guaranteeing that the `Gtid_state::commit_group_sidnos` array is never accessed without the protection of `MYSQL_BIN_LOG::LOCK_commit`.
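To make the race concrete, here is a minimal standalone C++ sketch (not the actual MySQL source; names such as `commit_mutex` and the array size are illustrative) of why serializing the two GTID-update paths under a single commit-time lock prevents the `commit_group_sidnos` lock leak described above:

```cpp
// Minimal standalone sketch (not the actual MySQL code): two "leader"
// threads updating a shared per-sidno flag array. Without serializing
// the whole sequence under one mutex (here commit_mutex, standing in
// for MYSQL_BIN_LOG::LOCK_commit), the second thread can find the flag
// already cleared and skip its unlock, leaking sidno_lock.
#include <array>
#include <mutex>
#include <thread>

std::mutex commit_mutex;                    // stand-in for LOCK_commit
std::mutex sidno_lock;                      // stand-in for the per-sidno lock
std::array<bool, 8> commit_group_sidnos{};  // simplified flag array

void update_commit_group(int sidno) {
  // The fix: hold the "commit" lock around the whole flag/lock sequence,
  // so only one leader at a time manipulates commit_group_sidnos.
  std::lock_guard<std::mutex> commit_guard(commit_mutex);

  commit_group_sidnos[sidno] = true;
  sidno_lock.lock();                        // lock_sidno(sidno)
  // ... update owned/executed GTIDs here ...
  if (commit_group_sidnos[sidno]) {         // without commit_guard this check
    sidno_lock.unlock();                    // can fail in the second thread,
    commit_group_sidnos[sidno] = false;     // leaking sidno_lock
  }
}

int main() {
  std::thread t1(update_commit_group, 2);   // "client thread" leader
  std::thread t2(update_commit_group, 2);   // "applier thread 1"
  t1.join();
  t2.join();
}
```

With the guard in place, the thread that sets the flag is also the one that clears it, so the unlock is never skipped.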
bjornmu
pushed a commit
that referenced
this pull request
Apr 30, 2024
…nt on Windows and posix [#2]

The posix version of NdbProcess::start_process assumed the arguments were quoted using " and \ in a way that resembles POSIX sh quoting, and unquoted spaces were treated as argument separators, splitting the argument into several. But the Windows version of NdbProcess::start_process did not treat options in the same way, and the Windows C runtime (CRT) parses arguments differently from POSIX sh. Note that if a program does not use the CRT, it may treat the command line in its own way, and the quoting done for the CRT will mess up the command line. On Windows, NdbProcess::start_process should therefore only be used for CRT-compatible programs with respect to argument quoting on the command line, or one should make sure the given arguments will not trigger unwanted quoting. This may be relevant for ndb_sign_keys and --CA-tool=<batch-file>.

Instead, this patch changes the intention of start_process to pass arguments without modification from the caller to the called C program's argument vector in its main entry function. In the posix path that is easy: just pass the incoming C strings to execvp. On Windows one needs to quote for the Windows CRT when composing the command line. Note that the command part of the command line has different quoting than the following arguments have.

Change-Id: I763530c634d3ea460b24e6e01061bbb5f3321ad4
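As a rough illustration of the Windows-side quoting the commit describes, here is a hedged, standalone sketch of the usual CRT command-line quoting rules (this is not the NdbProcess code; the function name `quote_for_crt` is made up for the example):

```cpp
// Standalone sketch of argument quoting for the Windows CRT command-line
// rules: an argument is wrapped in double quotes; backslashes immediately
// preceding a '"' (or the closing quote) must be doubled, and embedded '"'
// become '\"'.
#include <iostream>
#include <string>

std::string quote_for_crt(const std::string &arg) {
  // Fast path: nothing that needs quoting.
  if (!arg.empty() && arg.find_first_of(" \t\"") == std::string::npos)
    return arg;

  std::string out = "\"";
  size_t backslashes = 0;
  for (char c : arg) {
    if (c == '\\') {
      ++backslashes;
    } else if (c == '"') {
      // Double pending backslashes, then escape the quote itself.
      out.append(backslashes * 2 + 1, '\\');
      out.push_back('"');
      backslashes = 0;
    } else {
      out.append(backslashes, '\\');
      out.push_back(c);
      backslashes = 0;
    }
  }
  // Backslashes before the closing quote must also be doubled.
  out.append(backslashes * 2, '\\');
  out.push_back('"');
  return out;
}

int main() {
  std::cout << quote_for_crt(R"(C:\Program Files\my dir\)") << "\n";
  // Prints: "C:\Program Files\my dir\\"
}
```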
bjornmu
pushed a commit
that referenced
this pull request
Apr 30, 2024
Problem:
Starting `ndb_mgmd --bind-address` may potentially cause abnormal program termination in the MgmtSrvr destructor when ndb_mgmd restarts itself.

Core was generated by `ndb_mgmd --defa'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f8ce4066b8f in raise () from /lib64/libc.so.6
#1 0x00007f8ce4039ea5 in abort () from /lib64/libc.so.6
#2 0x00007f8ce40a7d97 in __libc_message () from /lib64/libc.so.6
#3 0x00007f8ce40af08c in malloc_printerr () from /lib64/libc.so.6
#4 0x00007f8ce40b132d in _int_free () from /lib64/libc.so.6
#5 0x00000000006e9ffe in MgmtSrvr::~MgmtSrvr (this=0x28de4b0) at mysql/8.0/storage/ndb/src/mgmsrv/MgmtSrvr.cpp:890
#6 0x00000000006ea09e in MgmtSrvr::~MgmtSrvr (this=0x2) at mysql/8.0/storage/ndb/src/mgmsrv/MgmtSrvr.cpp:849
#7 0x0000000000700d94 in mgmd_run () at mysql/8.0/storage/ndb/src/mgmsrv/main.cpp:260
#8 0x0000000000700775 in mgmd_main (argc=<optimized out>, argv=0x28041d0) at mysql/8.0/storage/ndb/src/mgmsrv/main.cpp:479

Analysis:
While starting up, ndb_mgmd allocates memory for bind_address in order to potentially rewrite the parameter. When ndb_mgmd restarts itself, that memory is released, and the dangling pointer causes a double free.

Fix:
Drop support for bind_address=[::]; it is not documented anywhere, is not useful and doesn't work. This means the need to rewrite bind_address is gone, and the bind_address argument needs neither alloc nor free.

Change-Id: I7797109b9d8391394587188d64d4b1f398887e94
bjornmu
pushed a commit
that referenced
this pull request
Jul 1, 2024
… for connection xxx'. The new iterator-based explains are not impacted.

The issue here is a race condition: more than one thread is using the query term iterator at the same time (which is neither thread safe nor reentrant), and part of its state is in the query terms being visited, which leads to interference/race conditions.

a) The explain thread uses an iterator here: Sql_cmd_explain_other_thread::execute is inspecting the Query_expression of the running query, calling master_query_expression()->find_blocks_query_term, which uses an iterator over the query terms in the query expression:

   for (auto qt : query_terms<>()) {
     if (qt->query_block() == qb) {
       return qt;
     }
   }

The above search fails to find qb due to the interference of thread b), see below, and then tries to access a null pointer:

 * thread #36, name = 'connection', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
   frame #0: 0x000000010bb3cf0d mysqld`Query_block::type(this=0x00007f8f82719088) const at sql_lex.cc:4441:11
   frame #1: 0x000000010b83763e mysqld`(anonymous namespace)::Explain::explain_select_type(this=0x00007000020611b8) at opt_explain.cc:792:50
   frame #2: 0x000000010b83cc4d mysqld`(anonymous namespace)::Explain_join::explain_select_type(this=0x00007000020611b8) at opt_explain.cc:1487:21
   frame #3: 0x000000010b837c34 mysqld`(anonymous namespace)::Explain::prepare_columns(this=0x00007000020611b8) at opt_explain.cc:744:26
   frame #4: 0x000000010b83ea0e mysqld`(anonymous namespace)::Explain_join::explain_qep_tab(this=0x00007000020611b8, tabnum=0) at opt_explain.cc:1415:32
   frame #5: 0x000000010b83ca0a mysqld`(anonymous namespace)::Explain_join::shallow_explain(this=0x00007000020611b8) at opt_explain.cc:1364:9
   frame #6: 0x000000010b83379b mysqld`(anonymous namespace)::Explain::send(this=0x00007000020611b8) at opt_explain.cc:770:14
   frame #7: 0x000000010b834147 mysqld`explain_query_specification(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, query_term=0x00007f8f82719088, ctx=CTX_JOIN) at opt_explain.cc:2088:20
   frame #8: 0x000000010bd36b91 mysqld`Query_expression::explain_query_term(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, qt=0x00007f8f82719088) at sql_union.cc:1519:11
   frame #9: 0x000000010bd36c68 mysqld`Query_expression::explain_query_term(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, qt=0x00007f8f8271d748) at sql_union.cc:1526:13
   frame #10: 0x000000010bd373f7 mysqld`Query_expression::explain(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00) at sql_union.cc:1591:7
   frame #11: 0x000000010b835820 mysqld`mysql_explain_query_expression(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, unit=0x00007f8f7a090360) at opt_explain.cc:2392:17
   frame #12: 0x000000010b835400 mysqld`explain_query(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, unit=0x00007f8f7a090360) at opt_explain.cc:2353:13
 * frame #13: 0x000000010b8363e4 mysqld`Sql_cmd_explain_other_thread::execute(this=0x00007f8fba585b68, thd=0x00007f8fbb111e00) at opt_explain.cc:2531:11
   frame #14: 0x000000010bba7d8b mysqld`mysql_execute_command(thd=0x00007f8fbb111e00, first_level=true) at sql_parse.cc:4648:29
   frame #15: 0x000000010bb9e230 mysqld`dispatch_sql_command(thd=0x00007f8fbb111e00, parser_state=0x0000700002065de8) at sql_parse.cc:5303:19
   frame #16: 0x000000010bb9a4cb mysqld`dispatch_command(thd=0x00007f8fbb111e00, com_data=0x0000700002066e38, command=COM_QUERY) at sql_parse.cc:2135:7
   frame #17: 0x000000010bb9c846 mysqld`do_command(thd=0x00007f8fbb111e00) at sql_parse.cc:1464:18
   frame #18: 0x000000010b2f2574 mysqld`handle_connection(arg=0x0000600000e34200) at connection_handler_per_thread.cc:304:13
   frame #19: 0x000000010e072fc4 mysqld`pfs_spawn_thread(arg=0x00007f8fba8160b0) at pfs.cc:3051:3
   frame #20: 0x00007ff806c2b202 libsystem_pthread.dylib`_pthread_start + 99
   frame #21: 0x00007ff806c26bab libsystem_pthread.dylib`thread_start + 15

b) The query thread being explained is itself performing LEX::cleanup and as part of that iterates over the query terms, but it still allows EXPLAIN of the query plan since thd->query_plan.set_query_plan(SQLCOM_END, ...) hasn't been called yet.

   20:frame: Query_terms<(Visit_order)1, (Visit_leaves)0>::Query_term_iterator::operator++() (in mysqld) (query_term.h:613)
   21:frame: Query_expression::cleanup(bool) (in mysqld) (sql_union.cc:1861)
   22:frame: LEX::cleanup(bool) (in mysqld) (sql_lex.h:4286)
   30:frame: Sql_cmd_dml::execute(THD*) (in mysqld) (sql_select.cc:799)
   31:frame: mysql_execute_command(THD*, bool) (in mysqld) (sql_parse.cc:4648)
   32:frame: dispatch_sql_command(THD*, Parser_state*) (in mysqld) (sql_parse.cc:5303)
   33:frame: dispatch_command(THD*, COM_DATA const*, enum_server_command) (in mysqld) (sql_parse.cc:2135)
   34:frame: do_command(THD*) (in mysqld) (sql_parse.cc:1464)
   57:frame: handle_connection(void*) (in mysqld) (connection_handler_per_thread.cc:304)
   58:frame: pfs_spawn_thread(void*) (in mysqld) (pfs.cc:3053)
   65:frame: _pthread_start (in libsystem_pthread.dylib) + 99
   66:frame: thread_start (in libsystem_pthread.dylib) + 15

Solution:
This patch solves the issue by removing iterator state from Query_term, making the query_term iterators thread safe. The solution labels every child query_term with its index in its parent's m_children vector. The iterator can therefore easily compute the next child to visit based on Query_term::m_sibling_idx.

A unit test case is added to check reentrancy.

One can also manually verify that there is no remaining race condition by running two client connections (each sourcing a file with \. <file>): one with a large number of copies of the repro query, and the other with a large number of EXPLAIN FORMAT=json FOR CONNECTION statements, e.g. EXPLAIN FORMAT=json FOR CONNECTION 8\G. The actual connection number would need to be verified in connection one, of course.

Change-Id: Ie7d56610914738ccbbecf399ccc4f465f7d26ea7
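A minimal sketch of the idea behind the fix (simplified, not the actual Query_term/Query_terms code; names like `sibling_idx` and `next_preorder` are illustrative): keep each child's index in its parent so a pre-order traversal needs no mutable state stored in the tree nodes themselves.

```cpp
// Standalone sketch: each child records its index in the parent's children
// vector, so the pre-order successor can be computed from the current node
// alone. All traversal state lives in the caller, which makes concurrent
// traversals of the same tree safe.
#include <cstddef>
#include <vector>

struct Term {
  Term *parent = nullptr;
  std::vector<Term *> children;
  std::size_t sibling_idx = 0;  // index of this node in parent->children

  void add_child(Term *c) {
    c->parent = this;
    c->sibling_idx = children.size();
    children.push_back(c);
  }
};

// Pre-order successor computed purely from the current node.
Term *next_preorder(Term *t, Term *root) {
  if (!t->children.empty()) return t->children.front();
  while (t != root) {
    Term *p = t->parent;
    if (t->sibling_idx + 1 < p->children.size())
      return p->children[t->sibling_idx + 1];
    t = p;  // no more siblings: pop up one level
  }
  return nullptr;  // traversal finished
}
```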
bjornmu
pushed a commit
that referenced
this pull request
Jul 1, 2024
This is a combination of 5 commits. This is the 1st commit message: WL#15746: TLS Enhancements for HeatWave-AutoML & Dask Comm. Upgrade Problem: -------- - HeatWave-AutoML communication was unauthenticated, unauthorized, and unencrypted. - Dask communication utilized TCP, not aligning with FedRamp guidelines. Solution: --------- - Introduced TLS and mTLS in HeatWave-AutoML's plugin and driver for authentication, authorization, and encryption. - Applied TLS to Dask to ensure authentication, encryption, and authorization. Dask Authorization (OCID-based): -------------------------------- 1. For each DBsystem: - MySQL node sends OCIDs of authorized nodes to the head driver via: a. rapid_net_nodes b. rapid_net_allowed_ocids (older API, mainly for MTR tests) - Scenarios: a. All OCIDs provided: Dask authorizes. b. Any OCID absent: ML call fails with message. 2. During Dask worker registration to the Dask scheduler, a script is dispatched to the Dask worker for execution, retrieving the worker's OCID for authorization purposes. - If the OCID isn't approved, the connection is denied, terminating the worker and causing the ML query to fail. 3. For every Dask worker (both as listener and connector), an OCID- based authorization is performed post SSL/TLS connection handshake. The process compares the OCID from the peer's certificate against the allowed_ocids received from the HeatWave-AutoML MySQL plugin. HWAML Plugin Changes: --------------------- - Sourced certificate data and SSL setup from disk, incorporating SSL/TLS for HWAML. - Reused "keystore" variable to specify disk location for certificate retrieval. - Certificates and keys expected in PKCS12 format. - Introduced "have_ml_encryption" variable (default=0). > Acts as a switch to explicitly deactivate HWAML network encryption, akin to "disable_net_encryption" affecting network encryption for HeatWave. Set to 1 to enable. - Introduced a customized verifier function for verify_callback to be set in SSL_CTX_set_verify and used in the handshake process of SSL/TLS. The customized verifier function will perform instance id (OCID) based authorization on the plugin side during standard SSL/TLS handshake process. - CRL (Certificate Revocation List) checks are also conducted if CRL Distribution Points are present and accessible in the provided certificate. HWAML Driver Changes & OCID-based Authorization: ------------------------------------------------ - Introduced "enable_encryption" (default=0). > Set to 1 to enable encryption. - When receiving a new connection request and encryption is on, the driver performs OCID-based self-checking, comparing OCID retrieved from its own instance principal with the OCID in the provided certificate on disk. - The driver compares OCID from "mysql_compute_id" and extracted OCID from mTLS certificate during connection. - Introduced "cert_dir" argument for certificate directory specification. - Expected files: cert_chain.pem, certificate.pem, private_key.pem. > OCID should be in the userID (UID) or CN field of the certificate.pem subject. - CRL (Certificate Revocation List) checks are also conducted post handshake, if CRL Distribution Points are present and accessible in the provided certificate, alongside OCID authorization. Encryption Behavior: -------------------- - If encryption is deactivated on both plugin and driver side, HWAML will work without encryption as it was before this commit. 
Enabling Encryption:
--------------------
- By default, "have_ml_encryption" and "enable_encryption" are set to 0
  > Encryption is disabled by default.
- For the HWAML plugin:
  > "have_ml_encryption" set to 1 (default is 0).
  > Specify the .pfx file's path using the "keystore".
- For the HWAML Driver:
  > "enable_encryption" set to 1 (default is 0)
  > Specify "mysql_instance_id" and "cert_dir".

Testing:
--------
- MTR has been modified for the encryption setup.
  > Runs with encryption if "OCI_INSTANCE_ID" is set to a valid value.
- On OCI (when "OLRAPID_KEYSTORE" is not set):
  > Certificates and keys are generated; PEMs for driver and PKCS12 for plugin.
- On AWS (when "OLRAPID_KEYSTORE" is set as the path to PKCS12 keystore files):
  > PEM files are extracted from the provided PKCS12 and used for the driver. The plugin uses the provided PKCS12 keystore file.

Change-Id: I553ca135241e03484db6debbe186e6d34d582bf4

This is the commit message #2:

WL#15746 - Adding ML encryption support to BM

Enabling ML encryption on Baumeister:
- Certificates are generated on MySQLd during initialization
- Needed certificates for workers are packaged and sent to worker nodes
- Workers use packaged files to generate their certificates
- Arguments are added to driver.py invoke
- Keystore path is added to mysql config

Change-Id: I11a5cc5926488ff4fbf91bb6c10a091358db7dc9

This is the commit message #3:

WL#15746: Enhanced CRL Daemon Checker

Issue
=====
The previous design assumed a plain HTTPS link for the CRL distribution point, accessible to all. This assumption no longer holds, as public accessibility for CRL distribution contradicts OCI guidelines. Now, the CRL distribution point in certificates provided by the control plane is expected to be protected by OCI Instance Principal Authentication. However, using this authentication method introduces a delay of several seconds, which is impractical for HeatWave-AutoML.

Solution
========
The CRL fetching code now uses OCI Instance Principal Authentication. To mitigate performance issues, the CRL checking process has been redesigned. Instead of enforcing CRL checks per connection in MySQL Plugin and HeatWave-AutoML Driver communications, a daemon thread in the HeatWave-AutoML Driver, Dask scheduler, and Dask worker now periodically fetches and verifies the CRL against all active connections. This separation minimizes performance impacts. Consequently, the MySQL Plugin's CRL checks have been removed, as checks in the Driver, Scheduler, and Worker sufficiently cover all cluster nodes.

Changes
=======
- Implemented CRL checker as a daemon thread in Driver, Scheduler, and Worker.
- Each connection/socket has an associated CRL checker.
- CRL checks occur periodically at set intervals.
- Skips CRL check if the CRL is temporarily unavailable.
- Failing a CRL check results in the associated connection/socket being closed. On the Driver, a stop event is triggered (akin to CTRL-C).

Change-Id: Id998cfe9e15d9236291b0ae420d65c2197837966

This is the commit message #4:

WL#15746: Fix Dask workers being shut down without releasing address

Issue
=====
Dask workers sometimes shut down without properly releasing the address they used.

Solution
========
Reverted some changes in heatwave_cluster.py in the dask worker shutdown function.
Hopefully this will fix the address issue Change-Id: I5a6749b5a25b0ccb73ba7369e545bc010da1b84f This is the commit message #5: WL#15746: Implement Dask Worker Join Timeout for Head Node Issue: ====== In the cluster_shutdown method, the join operation on the head node's worker process lacked a timeout. This led to potential indefinite waiting and eventual hanging of the head node. Solution: ========= A timeout has been introduced for the worker process join on the head node. Unlike non-head nodes, which rely on worker join to complete Dask tasks and cannot have a timeout, the head node can safely implement this. Now, if the worker process on the head node fails to shut down post-join, indicating a timeout, it will be manually terminated. This ensures proper release of associated resources and prevents hanging of the head node. Additional Change: ================== Added Cert Rotation Guard for DASK clusters. This feature initiates on the first plugin-driver connection when the DASK cluster is off, recording the certificate's expiry date. During driver idle times, it checks the current cert's expiry against this date. If it detects a change, indicating a certificate rotation, it shuts down the DASK cluster. The cluster restarts on the next connection request, ensuring the use of the latest certificate. Change-Id: Ie63a2e2b7664e05e1622d8bd6503663e13fa73cb
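For readers unfamiliar with OpenSSL verify callbacks, the following is a hedged sketch of the kind of OCID allow-list check the first commit message above describes for the plugin side. It uses only standard OpenSSL calls; the `allowed_ocids` set, the use of the CN field, and the function names are assumptions for illustration, not the plugin's actual API:

```cpp
// Standalone sketch (illustrative, not the HeatWave-AutoML plugin code) of
// an OCID-style allow-list check in an OpenSSL verify callback.
#include <openssl/ssl.h>
#include <openssl/x509.h>
#include <set>
#include <string>

static std::set<std::string> allowed_ocids;  // assumed to be filled elsewhere

static int ocid_verify_callback(int preverify_ok, X509_STORE_CTX *ctx) {
  if (!preverify_ok) return 0;  // keep normal chain failures fatal

  // Only apply the extra check to the peer (leaf) certificate.
  if (X509_STORE_CTX_get_error_depth(ctx) != 0) return 1;

  X509 *cert = X509_STORE_CTX_get_current_cert(ctx);
  char cn[256] = {0};
  if (X509_NAME_get_text_by_NID(X509_get_subject_name(cert),
                                NID_commonName, cn, sizeof(cn)) <= 0)
    return 0;  // no CN to authorize against

  return allowed_ocids.count(cn) ? 1 : 0;  // reject unknown identities
}

void configure(SSL_CTX *ssl_ctx) {
  SSL_CTX_set_verify(ssl_ctx,
                     SSL_VERIFY_PEER | SSL_VERIFY_FAIL_IF_NO_PEER_CERT,
                     ocid_verify_callback);
}
```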
laurynas-biveinis
pushed a commit
to laurynas-biveinis/mysql-server
that referenced
this pull request
Jul 9, 2024
…EXCEPT SELECT 4)

A work-around is to set the optimizer flag to not use hash map de-duplication for INTERSECT and EXCEPT, like so:

  SET optimizer_switch="hash_set_operations=off";

With hash_set_operations enabled, however, we get too many result rows. For the IN predicate, the set operation is computed repeatedly, with filters pushed down to the set operation operands:

  -> Filter: <in_optimizer>(c.pk,<exists>(select #2))  (cost=2.25 rows=20)
      -> Covering index scan on c using idx_c_col_datetime_key  (cost=2.25 rows=20)
  -> Select #2 (subquery in condition; dependent)
      -> Limit: 1 row(s)  (cost=2.61..2.61 rows=1)
          -> Table scan on <except temporary>  (cost=2.61..2.61 rows=1)
              -> Except materialize with deduplication  (cost=0.1..0.1 rows=1)
                  -> Filter: (<cache>(c.pk) = <ref_null_helper>(2))  (cost=0..0 rows=1)
                      -> Rows fetched before execution  (cost=0..0 rows=1)
                  -> Filter: (<cache>(c.pk) = <ref_null_helper>(4))  (cost=0..0 rows=1)
                      -> Rows fetched before execution  (cost=0..0 rows=1)

Only the row with pk==2 should pass the filters under the except node, and that's what happens. However, on repeated execution, the hash map used to implement the Except materialize is not re-initialized to be empty. The patch adds reinitialization of the hash map for such cases.

Change-Id: Idf2e36f9085e36748900017a0aad420e4e476f78
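The bug pattern is easiest to see in isolation. Below is a small standalone C++ sketch (not the server's iterator code; class and method names are invented) of a hash-based EXCEPT materialization that must clear its hash set on each re-execution of the dependent subquery:

```cpp
// Standalone sketch of the bug pattern: a materialized EXCEPT with
// deduplication keeps a hash set across re-executions of a dependent
// subquery. Without the Init()-time clear, rows seen in an earlier
// execution wrongly suppress rows in later executions.
#include <cassert>
#include <unordered_set>
#include <vector>

class ExceptMaterialize {
 public:
  // Called once per (re-)execution of the subquery.
  void Init() { seen_.clear(); }  // the fix: re-initialize the hash map

  // Left-operand rows are kept unless already emitted or excluded.
  std::vector<int> Execute(const std::vector<int> &left,
                           const std::vector<int> &right) {
    for (int r : right) seen_.insert(r);
    std::vector<int> out;
    for (int l : left)
      if (seen_.insert(l).second) out.push_back(l);  // de-duplicated EXCEPT
    return out;
  }

 private:
  std::unordered_set<int> seen_;
};

int main() {
  ExceptMaterialize op;
  for (int exec = 0; exec < 2; ++exec) {
    op.Init();  // without this call the second execution returns nothing
    std::vector<int> rows = op.Execute(/*left=*/{2}, /*right=*/{4});
    assert(rows.size() == 1 && rows[0] == 2);
  }
}
```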
laurynas-biveinis
pushed a commit
to laurynas-biveinis/mysql-server
that referenced
this pull request
Jul 9, 2024
…rong result

This error happens with hash_set_operations both on and off. If we look at the explain, we can see why it happens:

  -> Filter: <in_optimizer>(c.pk,<exists>(select #2))
      -> Covering index scan on c using idx_c_col_datetime_key
  -> Select #2 (subquery in condition; dependent)
      -> Limit: 1 row(s)                 <--------------------- OK optimization
          -> Table scan on <except temporary>
              -> Except all materialize
                  -> Limit: 1 row(s)     <--------------------------- problem
                      -> Table scan on <union temporary>
                          -> Union all materialize
                              -> Filter: (<cache>(c.pk) = <ref_null_helper>(2))
                                  -> Rows fetched before execution
                              -> Filter: (<cache>(c.pk) = <ref_null_helper>(2))
                                  -> Rows fetched before execution
                              -> Filter: (<cache>(c.pk) = <ref_null_helper>(2))
                                  -> Rows fetched before execution

There is a limit node on top of the left EXCEPT operand, right over the UNION ALL node, which shouldn't be there. It used to be a clever optimization when MySQL only had UNION set operations, but it is clearly wrong for EXCEPT ALL. For EXCEPT DISTINCT, UNION and INTERSECT it is fine, though.

The solution is to skip pushed-down limits in the presence of EXCEPT ALL inside subqueries, unless it is the top query block in the subquery's query_expression, i.e. we retain the top-most limit 1 above.

Change-Id: Idf784cbfbe8efbaca03ad17c8c42d73ab7acaa1a
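Since EXCEPT ALL is a multiset difference, the left operand's full multiplicities matter. The following standalone sketch (illustrative values, not the exact query from the bug report) shows how a LIMIT pushed below the EXCEPT ALL changes the result, while the outer LIMIT 1 on top of the whole EXCEPT does not:

```cpp
// Standalone sketch of EXCEPT ALL as a multiset difference, showing why a
// LIMIT pushed below it changes the answer.
#include <iostream>
#include <map>
#include <vector>

std::vector<int> except_all(const std::vector<int> &left,
                            const std::vector<int> &right) {
  std::map<int, int> counts;            // multiset of the left operand
  for (int v : left) ++counts[v];
  for (int v : right) --counts[v];      // subtract right multiplicities
  std::vector<int> out;
  for (auto &[v, n] : counts)
    for (int i = 0; i < n; ++i) out.push_back(v);
  return out;
}

int main() {
  std::vector<int> left = {2, 2, 2};    // e.g. a UNION ALL of three branches
  std::vector<int> right = {2};

  // Correct: {2,2,2} EXCEPT ALL {2} leaves two rows, so "pk IN (...)" is true.
  std::cout << except_all(left, right).size() << "\n";       // prints 2

  // Wrong: a LIMIT 1 pushed below EXCEPT ALL truncates the left operand
  // to {2}, and {2} EXCEPT ALL {2} is empty, so the IN predicate flips.
  std::vector<int> truncated(left.begin(), left.begin() + 1);
  std::cout << except_all(truncated, right).size() << "\n";  // prints 0
}
```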
gipulla
pushed a commit
that referenced
this pull request
Oct 15, 2024
… and .6node3rpl

Issue #1
Problem: Test fails in 4node4rpl (1 node group).
Solution: Skip the test when there is only one NG.

Issue #2
Problem: Test fails in 6node3rpl (2 node groups) with a timeout. The test idea is to restart, with the nostart option, *ALL* nodes in the same node group to check whether QMGR wrongly handles it as "node group is missing". In the test only two nodes in the same node group are restarted; this works for 2-replica setups, but for 4 replicas the test hangs waiting for the cluster to enter a noStart state.
Solution: Instead of restarting exactly 2 nodes, restart ALL nodes in a given node group.

Change-Id: Iafb0511992a553723013e73593ea10540cd03661
msprajap
pushed a commit
that referenced
this pull request
Oct 15, 2024
…N (SELECT 2 EXCEPT SELECT 4)

A work-around is to set the optimizer flag to not use hash map de-duplication for INTERSECT and EXCEPT, like so:

  SET optimizer_switch="hash_set_operations=off";

With hash_set_operations enabled, however, we get too many result rows. For the IN predicate, the set operation is computed repeatedly, with filters pushed down to the set operation operands:

  -> Filter: <in_optimizer>(c.pk,<exists>(select #2))  (cost=2.25 rows=20)
      -> Covering index scan on c using idx_c_col_datetime_key  (cost=2.25 rows=20)
  -> Select #2 (subquery in condition; dependent)
      -> Limit: 1 row(s)  (cost=2.61..2.61 rows=1)
          -> Table scan on <except temporary>  (cost=2.61..2.61 rows=1)
              -> Except materialize with deduplication  (cost=0.1..0.1 rows=1)
                  -> Filter: (<cache>(c.pk) = <ref_null_helper>(2))  (cost=0..0 rows=1)
                      -> Rows fetched before execution  (cost=0..0 rows=1)
                  -> Filter: (<cache>(c.pk) = <ref_null_helper>(4))  (cost=0..0 rows=1)
                      -> Rows fetched before execution  (cost=0..0 rows=1)

Only the row with pk==2 should pass the filters under the except node, and that's what happens. However, on repeated execution, the hash map used to implement the Except materialize is not re-initialized to be empty. The patch adds reinitialization of the hash map for such cases.

Change-Id: Idf2e36f9085e36748900017a0aad420e4e476f78
laurynas-biveinis
pushed a commit
to laurynas-biveinis/mysql-server
that referenced
this pull request
Oct 22, 2024
This is a combination of 5 commits. This is the 1st commit message: WL#15746: TLS Enhancements for HeatWave-AutoML & Dask Comm. Upgrade Problem: -------- - HeatWave-AutoML communication was unauthenticated, unauthorized, and unencrypted. - Dask communication utilized TCP, not aligning with FedRamp guidelines. Solution: --------- - Introduced TLS and mTLS in HeatWave-AutoML's plugin and driver for authentication, authorization, and encryption. - Applied TLS to Dask to ensure authentication, encryption, and authorization. Dask Authorization (OCID-based): -------------------------------- 1. For each DBsystem: - MySQL node sends OCIDs of authorized nodes to the head driver via: a. rapid_net_nodes b. rapid_net_allowed_ocids (older API, mainly for MTR tests) - Scenarios: a. All OCIDs provided: Dask authorizes. b. Any OCID absent: ML call fails with message. 2. During Dask worker registration to the Dask scheduler, a script is dispatched to the Dask worker for execution, retrieving the worker's OCID for authorization purposes. - If the OCID isn't approved, the connection is denied, terminating the worker and causing the ML query to fail. 3. For every Dask worker (both as listener and connector), an OCID- based authorization is performed post SSL/TLS connection handshake. The process compares the OCID from the peer's certificate against the allowed_ocids received from the HeatWave-AutoML MySQL plugin. HWAML Plugin Changes: --------------------- - Sourced certificate data and SSL setup from disk, incorporating SSL/TLS for HWAML. - Reused "keystore" variable to specify disk location for certificate retrieval. - Certificates and keys expected in PKCS12 format. - Introduced "have_ml_encryption" variable (default=0). > Acts as a switch to explicitly deactivate HWAML network encryption, akin to "disable_net_encryption" affecting network encryption for HeatWave. Set to 1 to enable. - Introduced a customized verifier function for verify_callback to be set in SSL_CTX_set_verify and used in the handshake process of SSL/TLS. The customized verifier function will perform instance id (OCID) based authorization on the plugin side during standard SSL/TLS handshake process. - CRL (Certificate Revocation List) checks are also conducted if CRL Distribution Points are present and accessible in the provided certificate. HWAML Driver Changes & OCID-based Authorization: ------------------------------------------------ - Introduced "enable_encryption" (default=0). > Set to 1 to enable encryption. - When receiving a new connection request and encryption is on, the driver performs OCID-based self-checking, comparing OCID retrieved from its own instance principal with the OCID in the provided certificate on disk. - The driver compares OCID from "mysql_compute_id" and extracted OCID from mTLS certificate during connection. - Introduced "cert_dir" argument for certificate directory specification. - Expected files: cert_chain.pem, certificate.pem, private_key.pem. > OCID should be in the userID (UID) or CN field of the certificate.pem subject. - CRL (Certificate Revocation List) checks are also conducted post handshake, if CRL Distribution Points are present and accessible in the provided certificate, alongside OCID authorization. Encryption Behavior: -------------------- - If encryption is deactivated on both plugin and driver side, HWAML will work without encryption as it was before this commit. 
Enabling Encryption:
--------------------
- By default, "have_ml_encryption" and "enable_encryption" are set to 0
  > Encryption is disabled by default.
- For the HWAML plugin:
  > "have_ml_encryption" set to 1 (default is 0).
  > Specify the .pfx file's path using the "keystore".
- For the HWAML Driver:
  > "enable_encryption" set to 1 (default is 0)
  > Specify "mysql_instance_id" and "cert_dir".

Testing:
--------
- MTR has been modified for the encryption setup.
  > Runs with encryption if "OCI_INSTANCE_ID" is set to a valid value.
- On OCI (when "OLRAPID_KEYSTORE" is not set):
  > Certificates and keys are generated; PEMs for driver and PKCS12 for plugin.
- On AWS (when "OLRAPID_KEYSTORE" is set as the path to PKCS12 keystore files):
  > PEM files are extracted from the provided PKCS12 and used for the driver. The plugin uses the provided PKCS12 keystore file.

Change-Id: I553ca135241e03484db6debbe186e6d34d582bf4

This is the commit message #2:

WL#15746 - Adding ML encryption support to BM

Enabling ML encryption on Baumeister:
- Certificates are generated on MySQLd during initialization
- Needed certificates for workers are packaged and sent to worker nodes
- Workers use packaged files to generate their certificates
- Arguments are added to driver.py invoke
- Keystore path is added to mysql config

Change-Id: I11a5cc5926488ff4fbf91bb6c10a091358db7dc9

This is the commit message #3:

WL#15746: Enhanced CRL Daemon Checker

Issue
=====
The previous design assumed a plain HTTPS link for the CRL distribution point, accessible to all. This assumption no longer holds, as public accessibility for CRL distribution contradicts OCI guidelines. Now, the CRL distribution point in certificates provided by the control plane is expected to be protected by OCI Instance Principal Authentication. However, using this authentication method introduces a delay of several seconds, which is impractical for HeatWave-AutoML.

Solution
========
The CRL fetching code now uses OCI Instance Principal Authentication. To mitigate performance issues, the CRL checking process has been redesigned. Instead of enforcing CRL checks per connection in MySQL Plugin and HeatWave-AutoML Driver communications, a daemon thread in the HeatWave-AutoML Driver, Dask scheduler, and Dask worker now periodically fetches and verifies the CRL against all active connections. This separation minimizes performance impacts. Consequently, the MySQL Plugin's CRL checks have been removed, as checks in the Driver, Scheduler, and Worker sufficiently cover all cluster nodes.

Changes
=======
- Implemented CRL checker as a daemon thread in Driver, Scheduler, and Worker.
- Each connection/socket has an associated CRL checker.
- CRL checks occur periodically at set intervals.
- Skips CRL check if the CRL is temporarily unavailable.
- Failing a CRL check results in the associated connection/socket being closed. On the Driver, a stop event is triggered (akin to CTRL-C).

Change-Id: Id998cfe9e15d9236291b0ae420d65c2197837966

This is the commit message #4:

WL#15746: Fix Dask workers being shut down without releasing address

Issue
=====
Dask workers sometimes shut down without properly releasing the address they used.

Solution
========
Reverted some changes in heatwave_cluster.py in the dask worker shutdown function.
Hopefully this will fix the address issue Change-Id: I5a6749b5a25b0ccb73ba7369e545bc010da1b84f This is the commit message mysql#5: WL#15746: Implement Dask Worker Join Timeout for Head Node Issue: ====== In the cluster_shutdown method, the join operation on the head node's worker process lacked a timeout. This led to potential indefinite waiting and eventual hanging of the head node. Solution: ========= A timeout has been introduced for the worker process join on the head node. Unlike non-head nodes, which rely on worker join to complete Dask tasks and cannot have a timeout, the head node can safely implement this. Now, if the worker process on the head node fails to shut down post-join, indicating a timeout, it will be manually terminated. This ensures proper release of associated resources and prevents hanging of the head node. Additional Change: ================== Added Cert Rotation Guard for DASK clusters. This feature initiates on the first plugin-driver connection when the DASK cluster is off, recording the certificate's expiry date. During driver idle times, it checks the current cert's expiry against this date. If it detects a change, indicating a certificate rotation, it shuts down the DASK cluster. The cluster restarts on the next connection request, ensuring the use of the latest certificate. Change-Id: Ie63a2e2b7664e05e1622d8bd6503663e13fa73cb
dbussink
added a commit
to planetscale/mysql-server
that referenced
this pull request
Nov 21, 2024
In case `with_ndb_home` is set, `buf` is allocated with `PATH_MAX` and the home is already written into the buffer. The additional path is written using `snprintf` and starts off at `len`. It can still write up to `PATH_MAX` bytes though, which is wrong: if we already have a home written into the buffer, we only have `PATH_MAX - len` bytes available.

On Ubuntu 24.04 with debug builds this is caught and it crashes:

```
*** buffer overflow detected ***: terminated
Signal 6 thrown, attempting backtrace.
stack_bottom = 0 thread_stack 0x0
#0 0x604895341cb6 <unknown>
#1 0x7ff22524531f <unknown> at sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
#2 0x7ff22529eb1c __pthread_kill_implementation at ./nptl/pthread_kill.c:44
#3 0x7ff22529eb1c __pthread_kill_internal at ./nptl/pthread_kill.c:78
#4 0x7ff22529eb1c __GI___pthread_kill at ./nptl/pthread_kill.c:89
#5 0x7ff22524526d __GI_raise at sysdeps/posix/raise.c:26
#6 0x7ff2252288fe __GI_abort at ./stdlib/abort.c:79
#7 0x7ff2252297b5 __libc_message_impl at sysdeps/posix/libc_fatal.c:132
#8 0x7ff225336c18 __GI___fortify_fail at ./debug/fortify_fail.c:24
#9 0x7ff2253365d3 __GI___chk_fail at ./debug/chk_fail.c:28
#10 0x7ff225337db4 ___snprintf_chk at ./debug/snprintf_chk.c:29
#11 0x6048953593ba <unknown>
#12 0x604895331a3d <unknown>
#13 0x6048953206e7 <unknown>
#14 0x60489531f4b1 <unknown>
#15 0x60489531e8e6 <unknown>
#16 0x7ff22522a1c9 __libc_start_call_main at sysdeps/nptl/libc_start_call_main.h:58
#17 0x7ff22522a28a __libc_start_main_impl at csu/libc-start.c:360
#18 0x60489531ed54 <unknown>
#19 0xffffffffffffffff <unknown>
```

In practice this buffer overflow would only happen with very long paths.

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
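The pattern being fixed can be shown in a few lines. This is a hedged, standalone sketch (not the actual NDB code; `build_path` is an invented name) of bounding the appending `snprintf` by the remaining space rather than by `PATH_MAX`:

```cpp
// Standalone sketch of the overflow pattern: when a prefix of length `len`
// is already in `buf`, the snprintf that appends the rest must be bounded
// by the remaining space, not by PATH_MAX.
#include <climits>
#include <cstdio>

#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

void build_path(char *buf, const char *ndb_home, const char *file) {
  size_t len = 0;
  if (ndb_home != nullptr) {
    len = static_cast<size_t>(std::snprintf(buf, PATH_MAX, "%s/", ndb_home));
    // snprintf reports the untruncated length; clamp so the math below
    // stays within the buffer even for an oversized home directory.
    if (len >= PATH_MAX) len = PATH_MAX - 1;
  }
  // Buggy version: std::snprintf(buf + len, PATH_MAX, "%s", file);
  // can write up to PATH_MAX bytes starting at buf + len, i.e. past the
  // end of the buffer. The bounded version uses the space that is left:
  std::snprintf(buf + len, PATH_MAX - len, "%s", file);
}

int main() {
  char buf[PATH_MAX];
  build_path(buf, "/var/lib/ndb", "config.ini");
  std::puts(buf);  // /var/lib/ndb/config.ini
}
```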
VarunNagaraju
added a commit
to VarunNagaraju/mysql-server
that referenced
this pull request
Jan 3, 2025
bjornmu
pushed a commit
that referenced
this pull request
Jan 21, 2025
Offload to HeatWave

Follow-up fix for test case failures in ndb_opt.join_pushdown_bka and ndb_opt.join_pushdown_default. The window function used no longer needs a filesort after the patch, and this is indicated in the explain output.

[ 72%] ndb_opt.join_pushdown_bka w14 [ fail ]
Test ended at 2024-09-10 20:39:13

--- join_pushdown_bka.result 2024-09-10 20:15:22.000000000 +0300
+++ /dev/shm/mtr-742949/var-ndb/14/log/join_pushdown_bka.reject 2024-09-10 23:39:03.080405944 +0300
@@ -7981,8 +7981,7 @@
     -> Select #2 (subquery in projection; dependent)
         -> Limit: 1 row(s)
             -> Window aggregate with buffering: sum(t2.a) OVER w
-                -> Sort: t3.a
-                    -> Table scan on t2
+                -> Table scan on t2

Change-Id: I5b753fa897fb22c7bd86908166b667794709f421
bjornmu
pushed a commit
that referenced
this pull request
Jan 21, 2025
Description:
============
Dropping a primary key and adding a new auto-increment column as a primary key in descending order using the "inplace" algorithm fails.

Analysis:
=========
Dropping an existing primary key and adding a new auto-increment key in descending order requires arranging the records in reverse order, which necessitates a file sort. However, this scenario was not detected in the method innobase_pk_order_preserved(), causing it to return false. As a result, the ALTER INPLACE operation, which calls this method, skips the file sort. Instead, it processes the primary key as usual in batches, a method known as bulk mode.

In bulk mode, records are inserted into a sort buffer (in descending order in this case). When the sort buffer becomes full, records are directly inserted into the B-tree. Consider a case where we have 2000 records, and the sort buffer can hold 1000 records in a batch:

  Batch #1 inserted: Records 1000 to 1 (in descending order)
  Batch #2 inserted: Records 2000 to 1001 (in descending order)

If the records from both batches happen to be in the same page, the record order is violated. It's important to note that this record order violation would still exist even if the sort buffer were skipped when the file sort was skipped. Therefore, enabling the file sort is essential to ensure correct record order across batches.

Fix:
====
Enable the file sort when adding an auto-increment column in descending order.

This patch is based on the contribution from Shaohua Wang at Alibaba Group. We thank you for contributing to MySQL.

Change-Id: I398173bbd27db7f5e29218d217bf11c30297c242
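A small standalone sketch (not InnoDB code; constants are illustrative) of why per-batch descending order is not enough once the sort buffer is flushed more than once:

```cpp
// Standalone sketch: rows are loaded into a fixed-size sort buffer, each
// batch is sorted descending and flushed, but nothing orders rows ACROSS
// batches, so the sequence reaching the B-tree violates global order.
#include <algorithm>
#include <cstdio>
#include <functional>
#include <vector>

int main() {
  const int kRows = 2000, kBatch = 1000;  // 2000 rows, buffer holds 1000
  std::vector<int> btree_order;           // order in which rows reach the B-tree
  std::vector<int> batch;
  for (int pk = 1; pk <= kRows; ++pk) {
    batch.push_back(pk);
    if (static_cast<int>(batch.size()) == kBatch || pk == kRows) {
      std::sort(batch.begin(), batch.end(), std::greater<int>());  // per-batch DESC
      btree_order.insert(btree_order.end(), batch.begin(), batch.end());
      batch.clear();
    }
  }
  // Batch #1 contributes 1000..1, batch #2 contributes 2000..1001: the
  // values 1 and 2000 meet at the batch boundary, breaking global DESC order.
  for (size_t i = 1; i < btree_order.size(); ++i)
    if (btree_order[i] > btree_order[i - 1])
      std::printf("order violated at position %zu: %d then %d\n",
                  i, btree_order[i - 1], btree_order[i]);
}
```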
I am tired of fixing the same bugs in different downstream packages I work with, so I decided to push my fixes directly to the source at Oracle MySQL. This simple spelling fix is the first in a series of fixes.
I hereby grant Oracle full rights regarding this contribution. Please accept my fix!