Skip to content

Write use statement to every slowlog entry #4

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Conversation

emate
Copy link

@emate emate commented Feb 4, 2015

Due to nature of slowlogs, it's hard to parse them in efficient way. Allowing to write 'use' statement to every slowlog entry greatly improve ease of parsing logs with fixed number of lines without causing any impact on system (i/o footprint is unnoticeable).

@emate emate closed this Feb 4, 2015
@emate emate reopened this Feb 4, 2015
@morgo
Copy link

morgo commented Feb 5, 2015

Hi Marcin, thank you for the contribution!

We are not currently accepting pull requests from GitHub. May I suggest filing a bug report on bugs.mysql.com?

(I would also be happy to do so on your behalf.)

@emate
Copy link
Author

emate commented Feb 5, 2015

Hi Morgan,

thanks for reply. I already did that, http://bugs.mysql.com/?id=75775.

@morgo
Copy link

morgo commented Apr 1, 2015

Thank you! I am going to go ahead and close this pull request.

@morgo morgo closed this Apr 1, 2015
bjornmu pushed a commit that referenced this pull request Oct 21, 2015
Background:

WAIT_FOR_EXECUTED_GTID_SET waits until a specified set of GTIDs is
included in GTID_EXECUTED. SET GTID_PURGED adds GTIDs to
GTID_EXECUTED. RESET MASTER clears GTID_EXECUTED.

There were multiple issues:

 1. Problem:

    The change in GTID_EXECUTED implied by SET GTID_PURGED did
    not cause WAIT_FOR_EXECUTED_GTID_SET to stop waiting.

    Analysis:

    WAIT_FOR_EXECUTED_GTID_SET waits for a signal to be sent.
    But SET GTID_PURGED never sent the signal.

    Fix:

    Make GTID_PURGED send the signal.

    Changes:
    - sql/rpl_gtid_state.cc:Gtid_state::add_lost_gtids
    - sql/rpl_gtid_state.cc: removal of #ifdef HAVE_GTID_NEXT_LIST
    - sql/rpl_gtid.h: removal of #ifdef HAVE_GTID_NEXT_LIST

 2. Problem:

    There was a race condition where WAIT_FOR_EXECUTED_GTID_SET
    could miss the signal from a commit and go into an infinite
    wait even if GTID_EXECUTED contains all the waited-for GTIDs.

    Analysis:

    In the bug, WAIT_FOR_EXECUTED_GTID_SET took a lock while
    taking a copy of the global state. Then it released the lock,
    analyzed the copy of the global state, and decided whether it
    should wait.  But if the GTID to wait for was committed after
    the lock was released, WAIT_FOR_EXECUTED_GTID_SET would miss
    the signal and go to an infinite wait even if GTID_EXECUTED
    contains all the waited-for GTIDs.

    Fix:

    Refactor the code so that it holds the lock all the way from
    before it reads the global state until it goes to the wait.

    Changes:

    - sql/rpl_gtid_state.cc:Gtid_state::wait_for_gtid_set:
      Most of the changes in this function are to fix this bug.

    Note:

    When the bug existed, it was possible to create a test case
    for this by placing a debug sync point in the section where
    it does not hold the lock.  However, after the bug has been
    fixed this section does not exist, so there is no way to test
    it deterministically.  The bug would also cause the test to
    fail rarely, so a way to test this is to run the test case
    1000 times.

 3. Problem:

    The function would take global_sid_lock.wrlock every time it has
    to wait, and while holding it takes a copy of the entire
    gtid_executed (which implies allocating memory).  This is not very
    optimal: it may process the entire set each time it waits, and it
    may wait once for each member of the set, so in the worst case it
    is O(N^2) where N is the size of the set.

    Fix:

    This is fixed by the same refactoring that fixes problem #2.  In
    particular, it does not re-process the entire Gtid_set for each
    committed transaction. It only removes all intervals of
    gtid_executed for the current sidno from the remainder of the
    wait-for-set.

    Changes:
    - sql/rpl_gtid_set.cc: Add function remove_intervals_for_sidno.
    - sql/rpl_gtid_state.cc: Use remove_intervals_for_sidno and remove
      only intervals for the current sidno. Remove intervals
      incrementally in the innermost while loop, rather than recompute
      the entire set each iteration.

 4. Problem:

    If the client that executes WAIT_FOR_EXECUTED_GTID_SET owns a
    GTID that is included in the set, then there is no chance for
    another thread to commit it, so it will wait forever.  In
    effect, it deadlocks with itself.

    Fix:

    Detect the situation and generate an error.

    Changes:
    - sql/share/errmsg-utf8.txt: new error code
      ER_CANT_WAIT_FOR_EXECUTED_GTID_SET_WHILE_OWNING_A_GTID
    - sql/item_func.cc: check the condition and generate the new error

 5. Various simplfications.

    - sql/item_func.cc:Item_wait_for_executed_gtid_set::val_int:
      - Pointless to set null_value when generating an error.
      - add DBUG_ENTER
      - Improve the prototype for Gtid_state::wait_for_gtid_set so
        that it takes a Gtid_set instead of a string, and also so that
        it requires global_sid_lock.
    - sql/rpl_gtid.h:Mutex_cond_array
      - combine wait functions into one and make it return bool
      - improve some comments
    - sql/rpl_gtid_set.cc:Gtid_set::remove_gno_intervals:
      - Optimize so that it returns early if this set becomes empty

@mysql-test/extra/rpl_tests/rpl_wait_for_executed_gtid_set.inc
- Move all wait_for_executed_gtid_set tests into
  mysql-test/suite/rpl/t/rpl_wait_for_executed_gtid_set.test

@mysql-test/include/kill_wait_for_executed_gtid_set.inc
@mysql-test/include/wait_for_wait_for_executed_gtid_set.inc
- New auxiliary scripts.

@mysql-test/include/rpl_init.inc
- Document undocumented side effect.

@mysql-test/suite/rpl/r/rpl_wait_for_executed_gtid_set.result
- Update result file.

@mysql-test/suite/rpl/t/rpl_wait_for_executed_gtid_set.test
- Rewrote the test to improve coverage and cover all parts of this bug.

@sql/item_func.cc
- Add DBUG_ENTER
- No point in setting null_value when generating an error.
- Do the decoding from text to Gtid_set here rather than in Gtid_state.
- Check for the new error
  ER_CANT_WAIT_FOR_EXECUTED_GTID_SET_WHILE_OWNING_A_GTID

@sql/rpl_gtid.h
- Simplify the Mutex_cond_array::wait functions in the following ways:
  - Make them one function since they share most code. This also allows
    calling the three-argument function with NULL as the last
    parameter, which simplifies the caller.
  - Make it return bool rather than 0/ETIME/ETIMEOUT, to make it more
    easy to use.
- Make is_thd_killed private.
- Add prototype for new Gtid_set::remove_intervals_for_sidno.
- Add prototype for Gtid_state::wait_for_sidno.
- Un-ifdef-out lock_sidnos/unlock_sidnos/broadcast_sidnos since we now
  need them.
- Make wait_for_gtid_set return bool.

@sql/rpl_gtid_mutex_cond_array.cc
- Remove the now unused check_thd_killed.

@sql/rpl_gtid_set.cc
- Optimize Gtid_set::remove_gno_intervals, so that it returns early
  if the Interval list becomes empty.
- Add Gtid_set::remove_intervals_for_sidno. This is just a wrapper
  around the already existing private member function
  Gtid_set::remove_gno_intervals.

@sql/rpl_gtid_state.cc
- Rewrite wait_for_gtid_set to fix problems 2 and 3. See code
  comment for details.
- Factor out wait_for_sidno from wait_for_gtid.
- Enable broadcast_sidnos/lock_sidnos/unlock_sidnos, which were ifdef'ed out.
- Call broadcast_sidnos after updating the state, to fix issue #1.

@sql/share/errmsg-utf8.txt
- Add error message used to fix issue #4.
bjornmu pushed a commit that referenced this pull request Apr 10, 2017
Author: Andrei Elkin <andrei.elkin@oracle.com>
Date:   Fri Nov 25 15:17:17 2016 +0200

    WL#9175 Correct recovery of DDL statements/transactions by binary log

    The patch consists of two parts implementing the WL agenda which is
    is to provide crash-safety for DDL.
    That is a server (a general one, master or slave) must be able to recover
    from crash to commit or rollback every DDL command that was in progress
    on the eve of crash.

    The Commit decision is done to commands that had reached
    Engine-prepared status and got successfully logged into binary log.
    Otherwise they are rolled back.

    In order to achieve the goal some refinements are done to the binlogging
    mechanism, minor addition is done to the server recovery module and some changes
    applied to the slave side.

    The binary log part includes Query-log-event which is made to contain xid
    that is a key item at server recovery. The recovery now is concern with it along
    with its standard location in Xid_log_event.

    The first part deals with the ACL DDL sub-class and
    TRIGGER related queries are fully 2pc-capable. It constructs
    the WL's framework which is proved on these subclasses.
    It also specifies how to cover the rest of DDLs by the WL's framework.
    For those not 2pc-ready DDL cases, sometimes "stub" tests are prepared
    to be refined by responsible worklogs.

    Take a few notes to the low-level details of implementation.

    Note #1.

    Tagging by xid number is done to the exact 2pc-capable DDL subclass.
    For DDL:s that will be ready for xiding in future, there is a tech specification
    how to do so.

    Note #2.

    By virtue of existing mechanisms, the slave applier augments the DDL
    transaction incorporating the slave info table update and the
    Gtid-executed table (either one optionally) at time of the DDL is
    ready for the final commit.
    When for filtering reason the DDL skips committing at its regular
    time, the augmented transaction would still be not empty consisting of
    only the added statements, and it would have to be committed by
    top-level slave specific functions through Log_event::do_update_pos().

    To aid this process Query_log_event::has_committed is introduced.

    Note #3 (QA, please read this.)

    Replication System_table interface that is employed by handler of TABLE type slave info
    had to be refined in few places.

    Note #4 (runtime code).

    While trying to lessen the footprint to the runtime server code few
    concessions had to be conceded. These include changes to
    ha_commit_trans()
    to invoke new pre_commit() and post_commit(), and post_rollback() hooks
    due to the slave extra statement.

    -------------------------------------------------------------------

    The 2nd part patch extends the basic framework,
    xidifies the rest of DDL commands that are
    (at least) committable at recovery. At the moment those include
    all Data Definition Statements except ones related to
    VIEWs, STORED Functions and Procedures.

    DDL Query is recoverable for these subclasses when it has been
    recorded into the binary log and was discovered there at the server
    restart, quite compatible with the DML algorithm.
    However a clean automatic rollback can't be provided for some
    of the commands and the user would have to complete recovery
    manually.
pobrzut pushed a commit that referenced this pull request May 8, 2017
pobrzut pushed a commit that referenced this pull request May 8, 2017
akopytov pushed a commit to akopytov/mysql-server that referenced this pull request Aug 25, 2017
In WL-included builds ASAN run witnessed missed ~Query_log_event invocation.
The destruct-or was not called due to the WL's changes in the error propagation
that specifically affect LC MTS.
The failure is exposed in particular by rpl_trigger as the following
stack:

  #0 0x9ecd98 in __interceptor_malloc (/export/home/pb2/test/sb_2-22611026-1489061390.32/mysql-commercial-8.0.1-dmr-linux-x86_64-asan/bin/mysqld+0x9ecd98)
  mysql#1 0x2b1a245 in my_raw_malloc(unsigned long, int) obj/mysys/../../mysqlcom-pro-8.0.1-dmr/mysys/my_malloc.cc:209:12
  mysql#2 0x2b1a245 in my_malloc obj/mysys/../../mysqlcom-pro-8.0.1-dmr/mysys/my_malloc.cc:72
  mysql#3 0x2940590 in Query_log_event::Query_log_event(char const*, unsigned int, binary_log::Format_description_event const*, binary_log::Log_event_type) obj/sql/../../mysqlcom-pro-8.0.1-dmr/sql/log_event.cc:4343:46
  mysql#4 0x293d235 in Log_event::read_log_event(char const*, unsigned int, char const**, Format_description_log_event const*, bool) obj/sql/../../mysqlcom-pro-8.0.1-dmr/sql/log_event.cc:1686:17
  mysql#5 0x293b96f in Log_event::read_log_event()
  mysql#6 0x2a2a1c9 in next_event(Relay_log_info*)

Previously before the WL
Mts_submode_logical_clock::wait_for_workers_to_finish() had not
returned any error even when Coordinator thread is killed.

The WL patch needed to refine such behavior, but at doing so
it also had to attend log_event.cc::schedule_next_event() to register
an error to follow an existing pattern.
While my_error() does not take place the killed Coordinator continued
scheduling, ineffectively though - no Worker gets engaged (legal case
of deferred scheduling), and without noticing its killed status up to
a point when it resets the event pointer in
apply_event_and_update_pos():

  *ptr_ev= NULL; // announcing the event is passed to w-worker

The reset was intended for an assigned Worker to perform the event
destruction or by Coordinator itself when the event is deferred.
As neither is the current case the event gets unattended for its termination.

In contrast in the pre-WL sources the killed Coordinator does find a Worker.
However such Worker could be already down (errored out and exited), in
which case apply_event_and_update_pos() reasonably returns an error and executes

  delete ev

in exec_relay_log_event() error branch.

**Fixed** with deploying my_error() call in log_event.cc::schedule_next_event()
error branch which fits to the existing pattern.
THD::is_error() has been always checked by Coordinator before any attempt to
reset *ptr_ev= NULL. In the errored case Coordinator does not reset and
destroys the event itself in the exec_relay_log_event() error branch pretty similarly to
how the pre-WL sources do.

Tested against rpl_trigger and rpl suites to pass.

Approved on rb#15667.
bjornmu pushed a commit that referenced this pull request Sep 21, 2017
Patch #4:

Move handling of auto-wrapping to the handling of array path legs, so
that we don't pay the cost of checking if auto-wrapping should be
performed for every path leg.

Microbenchmarks (64-bit, Intel Core i7-4770 3.4 GHz, GCC 6.3):

BM_JsonDomSearchEllipsis              22608 ns/iter [+12.5%]
BM_JsonDomSearchEllipsis_OnlyOne      16169 ns/iter [ +9.8%]
BM_JsonDomSearchKey                     129 ns/iter [ -0.8%]
BM_JsonBinarySearchEllipsis          230855 ns/iter [ +1.1%]
BM_JsonBinarySearchEllipsis_OnlyOne  223140 ns/iter [ +1.3%]
BM_JsonBinarySearchKey                   86 ns/iter [  0.0%]

Change-Id: I2a233ee7b5a709a1388a370cb96126b17d59595b
bjornmu pushed a commit that referenced this pull request Jan 23, 2018
…TABLE_UPGRADE_GUARD

To repeat: cmake -DWITH_ASAN=1 -DWITH_ASAN_SCOPE=1
./mtr --mem --sanitize main.dd_upgrade_error

A few dd tests fail with:
==26861==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7000063bf5e8 at pc 0x00010d4dbe8b bp 0x7000063bda40 sp 0x7000063bda38
READ of size 8 at 0x7000063bf5e8 thread T2
    #0 0x10d4dbe8a in Prealloced_array<st_plugin_int**, 16ul>::empty() const prealloced_array.h:186
    #1 0x10d406a8b in lex_end(LEX*) sql_lex.cc:560
    #2 0x10dae4b6d in dd::upgrade::Table_upgrade_guard::~Table_upgrade_guard() (mysqld:x86_64+0x100f87b6d)
    #3 0x10dadc557 in dd::upgrade::migrate_table_to_dd(THD*, std::__1::basic_string<char, std::__1::char_traits<char>, Stateless_allocator<char, dd::String_type_alloc, My_free_functor> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, Stateless_allocator<char, dd::String_type_alloc, My_free_functor> > const&, bool) (mysqld:x86_64+0x100f7f557)
    #4 0x10dad7e85 in dd::upgrade::migrate_plugin_table_to_dd(THD*) (mysqld:x86_64+0x100f7ae85)
    #5 0x10daec6a1 in dd::upgrade::do_pre_checks_and_initialize_dd(THD*) upgrade.cc:1216
    #6 0x10cd0a5c0 in bootstrap::handle_bootstrap(void*) bootstrap.cc:336

Change-Id: I265ec6dd97ee8076aaf03763840c0cdf9e20325b
Fix: increase lifetime of 'LEX lex;' which is used by 'table_guard'
bjornmu pushed a commit that referenced this pull request Oct 22, 2018
Post push fix

sql/opt_range.cc:14196:22: runtime error: -256 is outside the range of representable values of type 'unsigned int'
    #0 0x4248a9d in cost_skip_scan(TABLE*, unsigned int, unsigned int, unsigned long long, Cost_estimate*, unsigned long long*, Item*, Opt_trace_object*) sql/opt_range.cc:14196:22
    #1 0x41c524c in get_best_skip_scan(PARAM*, SEL_TREE*, bool) sql/opt_range.cc:14086:5
    #2 0x41b7b65 in test_quick_select(THD*, Bitmap<64u>, unsigned long long, unsigned long long, bool, enum_order, QEP_shared_owner const*, Item*, Bitmap<64u>*, QUICK_SELECT_I**) sql/opt_range.cc:3352:23
    #3 0x458fc08 in get_quick_record_count(THD*, JOIN_TAB*, unsigned long long) sql/sql_optimizer.cc:5542:17
    #4 0x458a0cd in JOIN::estimate_rowcount() sql/sql_optimizer.cc:5290:25

The fix is to handle REC_PER_KEY_UNKNOWN explicitly, to avoid using
-1.0 in computations later.

Change-Id: Ie8a81bdf7323e4f66abcad0a9aca776de8acd945
surbhat1595 pushed a commit that referenced this pull request Apr 25, 2019
…E TO A SERVER

Problem
========================================================================
Running the GCS tests with ASAN seldomly reports a user-after-free of
the server reference that the acceptor_learner_task uses.

Here is an excerpt of ASAN's output:

==43936==ERROR: AddressSanitizer: heap-use-after-free on address 0x63100021c840 at pc 0x000000530ff8 bp 0x7fc0427e8530 sp 0x7fc0427e8520
WRITE of size 8 at 0x63100021c840 thread T3
    #0 0x530ff7 in server_detected /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:962
    #1 0x533814 in buffered_read_bytes /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:1249
    #2 0x5481af in buffered_read_msg /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:1399
    #3 0x51e171 in acceptor_learner_task /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:4690
    #4 0x562357 in task_loop /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.c:1140
    #5 0x5003b2 in xcom_taskmain2 /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:1324
    #6 0x6a278a in Gcs_xcom_proxy_impl::xcom_init(unsigned short, node_address*) /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_proxy.cc:164
    #7 0x59b3c1 in xcom_taskmain_startup /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_control_interface.cc:107
    #8 0x7fc04a2e4dd4 in start_thread (/lib64/libpthread.so.0+0x7dd4)
    #9 0x7fc047ff2bfc in __clone (/lib64/libc.so.6+0xfebfc)

0x63100021c840 is located 64 bytes inside of 65688-byte region [0x63100021c800,0x63100022c898)
freed by thread T3 here:
    #0 0x7fc04a5d7508 in __interceptor_free (/lib64/libasan.so.4+0xde508)
    #1 0x52cf86 in freesrv /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:836
    #2 0x52ea78 in srv_unref /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:868
    #3 0x524c30 in reply_handler_task /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:4914
    #4 0x562357 in task_loop /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.c:1140
    #5 0x5003b2 in xcom_taskmain2 /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:1324
    #6 0x6a278a in Gcs_xcom_proxy_impl::xcom_init(unsigned short, node_address*) /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_proxy.cc:164
    #7 0x59b3c1 in xcom_taskmain_startup /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_control_interface.cc:107
    #8 0x7fc04a2e4dd4 in start_thread (/lib64/libpthread.so.0+0x7dd4)

previously allocated by thread T3 here:
    #0 0x7fc04a5d7a88 in __interceptor_calloc (/lib64/libasan.so.4+0xdea88)
    #1 0x543604 in mksrv /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:721
    #2 0x543b4c in addsrv /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:755
    #3 0x54af61 in update_servers /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:1747
    #4 0x501082 in site_install_action /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:1572
    #5 0x55447c in import_config /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/site_def.c:486
    #6 0x506dfc in handle_x_snapshot /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:5257
    #7 0x50c444 in xcom_fsm /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:5325
    #8 0x516c36 in dispatch_op /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:4510
    #9 0x521997 in acceptor_learner_task /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:4772
    #10 0x562357 in task_loop /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.c:1140
    #11 0x5003b2 in xcom_taskmain2 /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:1324
    #12 0x6a278a in Gcs_xcom_proxy_impl::xcom_init(unsigned short, node_address*) /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_proxy.cc:164
    #13 0x59b3c1 in xcom_taskmain_startup /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_control_interface.cc:107
    #14 0x7fc04a2e4dd4 in start_thread (/lib64/libpthread.so.0+0x7dd4)

Analysis
========================================================================
The server structure is reference counted by the associated sender_task
and reply_handler_task.
When they finish, they unreference the server, which leads to its memory
being freed.

However, the acceptor_learner_task keeps a "naked" reference to the
server structure.
Under the right ordering of operations, i.e. the sender_task and
reply_handler_task terminating after the acceptor_learner_task acquires,
but before it uses, the reference to the server structure, leads to the
acceptor_learner_task accessing the server structure after it has been
freed.

Solution
========================================================================
Let the acceptor_learner_task also reference count the server structure
so it is not freed while still in use.

Reviewed-by: André Negrão <andre.negrao@oracle.com>
Reviewed-by: Venkatesh Venugopal <venkatesh.venugopal@oracle.com>
RB: 21209
bjornmu pushed a commit that referenced this pull request Jan 18, 2021
…TH VS 2019 [#4] [noclose]

storage\ndb\src\ndbapi\ObjectMap.hpp(168,1): warning C4302: 'type cast': truncation from 'void *' to 'long'

Change-Id: I11ffba127bc19db15e9a50307b50532941f9fdb2
bjornmu pushed a commit that referenced this pull request Oct 19, 2021
Patch #4: Wrong results with AND of JSON_CONTAINS and multi-valued indexes

For a multi-valued index on json array field f=[1, 2], this statement
wrongly returned empty result set:

  SELECT * FROM f WHERE JSON_CONTAINS(f, 1) AND JSON_CONTAINS(f, 2);

The cause is that get_func_mm_tree() converts two JSON_CONTAINS() to
`f_idx = 1 AND f_idx = 2` which is always false for single value index
but possible for multi-valued index.

Fixed by anding the two key ranges, rather than marking the condition
as always false.

This is a contribution by Yubao Liu.

Change-Id: I535fc6ce8755f4f3b6e8cbd77b4c0ee4aa685cae
rahulmalik87 pushed a commit to rahulmalik87/mysql-server that referenced this pull request Jun 16, 2022
…rnings.

Change-Id: I579c2691165865101559915239b2cd027c10ab56
nawazn pushed a commit that referenced this pull request Jul 26, 2022
The NdbEventOperationImpl owns a NdbDictionary::Event by having a
pointer to the Event's implementation.

Make sure the pointer is always initialized in constructor and mark the
pointer const. Also mark NdbEventImpl's pointer to the Event as const.
Thus the m_eventImpl->m_facade pointer path should now be enforced by
compiler.

Change-Id: I469b39d66a4d83daf08307d980d555da1ab79827
nawazn pushed a commit that referenced this pull request Jul 26, 2022
-- Patch #1: Persist secondary load information --

Problem:
We need a way of knowing which tables were loaded to HeatWave after
MySQL restarts due to a crash or a planned shutdown.

Solution:
Add a new "secondary_load" flag to the `options` column of mysql.tables.
This flag is toggled after a successful secondary load or unload. The
information about this flag is also reflected in
INFORMATION_SCHEMA.TABLES.CREATE_OPTIONS.

-- Patch #2 --

The second patch in this worklog triggers the table reload from InnoDB
after MySQL restart.

The recovery framework recognizes that the system restarted by checking
whether tables are present in the Global State. If there are no tables
present, the framework will access the Data Dictionary and find which
tables were loaded before the restart.

This patch introduces the "Data Dictionary Worker" - a MySQL service
recovery worker whose task is to query the INFORMATION_SCHEMA.TABLES
table from a separate thread and find all tables whose secondary_load
flag is set to 1.

All tables that were found in the Data Dictionary will be appended to
the list of tables that have to be reloaded by the framework from
InnoDB.

If an error occurs during restart recovery we will not mark the recovery
as failed. This is done because the types of failures that can occur
when the tables are reloaded after a restart are less critical compared
to previously existing recovery situations. Additionally, this code will
soon have to be adapted for the next worklog in this area so we are
proceeding with the simplest solution that makes sense.

A Global Context variable m_globalStateEmpty is added which indicates
whether the Global State should be recovered from an external source.

-- Patch #3 --

This patch adds the "rapid_reload_on_restart" system variable. This
variable is used to control whether tables should be reloaded after a
restart of mysqld or the HeatWave plugin. This variable is persistable
(i.e., SET PERSIST RAPID_RELOAD_ON_RESTART = TRUE/FALSE).

The default value of this variable is set to false.

The variable can be modified in OFF, IDLE, and SUSPENDED states.

-- Patch #4 --

This patch refactors the recovery code by removing all recovery-related
code from ha_rpd.cc and moving it to separate files:

  - ha_rpd_session_factory.h/cc:
  These files contain the MySQLAdminSessionFactory class, which is used
to create admin sessions in separate threads that can be used to issue
SQL queries.

  - ha_rpd_recovery.h/cc:
  These files contain the MySQLServiceRecoveryWorker,
MySQLServiceRecoveryJob and ObjectStoreRecoveryJob classes which were
previously defined in ha_rpd.cc. This file also contains a function that
creates the RecoveryWorkerFactory object. This object is passed to the
constructor of the Recovery Framework and is used to communicate with
the other section of the code located in rpdrecoveryfwk.h/cc.

This patch also renames rpdrecvryfwk to rpdrecoveryfwk for better
readability.

The include relationship between the files is shown on the following
diagram:

        rpdrecoveryfwk.h◄──────────────rpdrecoveryfwk.cc
            ▲    ▲
            │    │
            │    │
            │    └──────────────────────────┐
            │                               │
        ha_rpd_recovery.h◄─────────────ha_rpd_recovery.cc──┐
            ▲                               │           │
            │                               │           │
            │                               │           │
            │                               ▼           │
        ha_rpd.cc───────────────────────►ha_rpd.h       │
                                            ▲           │
                                            │           │
            ┌───────────────────────────────┘           │
            │                                           ▼
    ha_rpd_session_factory.cc──────►ha_rpd_session_factory.h

Other changes:
  - In agreement with Control Plane, the external Global State is now
  invalidated during recovery framework startup if:
    1) Recovery framework recognizes that it should load the Global
    State from an external source AND,
    2) rapid_reload_on_restart is set to OFF.

  - Addressed review comments for Patch #3, rapid_reload_on_restart is
  now also settable while plugin is ON.

  - Provide a single entry point for processing external Global State
  before starting the recovery framework loop.

  - Change when the Data Dictionary is read. Now we will no longer wait
  for the HeatWave nodes to connect before querying the Data Dictionary.
  We will query it when the recovery framework starts, before accepting
  any actions in the recovery loop.

  - Change the reload flow by inserting fake global state entries for
  tables that need to be reloaded instead of manually adding them to a
  list of tables scheduled for reload. This method will be used for the
  next phase where we will recover from Object Storage so both recovery
  methods will now follow the same flow.

  - Update secondary_load_dd_flag added in Patch #1.

  - Increase timeout in wait_for_server_bootup to 300s to account for
  long MySQL version upgrades.

  - Add reload_on_restart and reload_on_restart_dbg tests to the rapid
  suite.

  - Add PLUGIN_VAR_PERSIST_AS_READ_ONLY flag to "rapid_net_orma_port"
  and "rapid_reload_on_restart" definitions, enabling their
  initialization from persisted values along with "rapid_bootstrap" when
  it is persisted as ON.

  - Fix numerous clang-tidy warnings in recovery code.

  - Prevent suspended_basic and secondary_load_dd_flag tests to run on
  ASAN builds due to an existing issue when reinstalling the RAPID
  plugin.

-- Bug#33752387 --

Problem:
A shutdown of MySQL causes a crash in queries fired by DD worker.

Solution:
Prevent MySQL from killing DD worker's queries by instantiating a
DD_kill_immunizer before the queries are fired.

-- Patch #5 --

Problem:
A table can be loaded before the DD Worker queries the Data Dictionary.
This means that table will be wrongly processed as part of the external
global state.

Solution:
If the table is present in the current in-memory global state we will
not consider it as part of the external global state and we will not
process it by the recovery framework.

-- Bug#34197659 --

Problem:
If a table reload after restart causes OOM the cluster will go into
RECOVERYFAILED state.

Solution:
Recognize when the tables are being reloaded after restart and do not
move the cluster into RECOVERYFAILED. In that case only the current
reload will fail and the reload for other tables will be attempted.

Change-Id: Ic0c2a763bc338ea1ae6a7121ff3d55b456271bf0
pull bot pushed a commit to zhuizhu-95/mysql-server that referenced this pull request Apr 28, 2023
Potential read of uninitialized dataNodes[0] when no alive nodes are
available. Fix by checking for zero data nodes and fail test.

trunk/storage/ndb/test/src/UtilTransactions.cpp:1710:9: warning: 3rd
function call argument is an uninitialized value [clang-analyzer-
core.CallAndMessage]

Change-Id: I7c1e362eb0d62bdb560967144ca39966aed8a3c1
pull bot pushed a commit to zhuizhu-95/mysql-server that referenced this pull request Apr 28, 2023
  # This is the 1st commit message:

  WL#15280: HEATWAVE SUPPORT FOR MDS HA

  Problem Statement
  -----------------
  Currently customers cannot enable heatwave analytics service to their
  HA DBSystem or enable HA if they are using Heatwave enabled DBSystem.
  In this change, we attempt to remove this limitation and provide
  failover support of heatwave in an HA enabled DBSystem.

  High Level Overview
  -------------------
  To support heatwave with HA, we extended the existing feature of auto-
  reloading of tables to heatwave on MySQL server restart (WL-14396). To
  provide seamless failover functionality to tables loaded to heatwave,
  each node in the HA cluster (group replication) must have the latest
  view of tables which are currently loaded to heatwave cluster attached
  to the primary, i.e., the secondary_load flag should be in-sync always.

  To achieve this, we made following changes -
    1. replicate secondary load/unload DDL statements to all the active
       secondary nodes by writing the DDL into the binlog, and
    2. Control how secondary load/unload is executed when heatwave cluster
       is not attached to node executing the command

  Implementation Details
  ----------------------
  Current implementation depends on two key assumptions -
   1. All MDS DBSystems will have RAPID plugin installed.
   2. No non-MDS system will have the RAPID plugin installed.

  Based on these assumptions, we made certain changes w.r.t. how server
  handles execution of secondary load/unload statements.
   1. If secondary load/unload command is executed from a mysql client
      session on a system without RAPID plugin installed (i.e., non-MDS),
      instead of an error, a warning message will be shown to the user,
      and the DDL is allowed to commit.
   2. If secondary load/unload command is executed from a replication
      connection on an MDS system without heatwave cluster attached,
      instead of throwing an error, the DDL is allowed to commit.
   3. If no error is thrown from secondary engine, then the DDL will
      update the secondary_load metadata and write a binlog entry.

  Writing to binlog implies that all the consumer of binlog now need to
  handle this DDL gracefully. This has an adverse effect on Point-in-time
  Recovery. If the PITR backup is taken from a DBSystem with heatwave, it
  may contain traces of secondary load/unload statements in its binlog.
  If such a backup is used to restore a new DBSystem, it will cause failure
  while trying to execute statements from its binlog because
   a) DBSystem will not heatwave cluster attached at this time, and
   b) Statements from binlog are executed from standard mysql client
      connection, thus making them indistinguishable from user executed
      command.
  Customers will be prevented (by control plane) from using PITR functionality
  on a heatwave enabled DBSystem until there is a solution for this.

  Testing
  -------
  This commit changes the behavior of secondary load/unload statements, so it
   - adjusts existing tests' expectations, and
   - adds a new test validating new DDL behavior under different scenarios

  Change-Id: Ief7e9b3d4878748b832c366da02892917dc47d83

  # This is the commit message #2:

  WL#15280: HEATWAVE SUPPORT FOR MDS HA (PITR SUPPORT)

  Problem
  -------
  A PITR backup taken from a heatwave enabled system could have traces
  of secondary load or unload statements in binlog. When such a backup
  is used to restore another system, it can cause failure because of
  following two reasons:

  1. Currently, even if the target system is heatwave enabled, heatwave
  cluster is attached only after PITR restore phase completes.
  2. When entries from binlogs are applied, a standard mysql client
  connection is used. This makes it indistinguishable from other user
  session.

  Since secondary load (or unload) statements are meant to throw error
  when they are executed by user in the absence of a healthy heatwave
  cluster, PITR restore workflow will fail if binlogs from the backup
  have any secondary load (or unload) statements in them.

  Solution
  --------
  To avoid PITR failure, we are introducing a new system variable
  rapid_enable_delayed_secondary_ops. It controls how load or unload
  commands are to be processed by rapid plugin.

    - When turned ON, the plugin silently skips the secondary engine
      operation (load/unload) and returns success to the caller. This
      allows secondary load (or unload) statements to be executed by the
      server in the absence of any heatwave cluster.
    - When turned OFF, it follows the existing behavior.
    - The default value is OFF.
    - The value can only be changed when rapid_bootstrap is IDLE or OFF.
    - This variable cannot be persisted.

  In PITR workflow, Control Plane would set the variable at the start of
  PITR restore and then reset it at the end of workflow. This allows the
  workflow to complete without failure even when heatwave cluster is not
  attached. Since metadata is always updated when secondary load/unload
  DDLs are executed, when heatwave cluster is attached at a later point
  in time, the respective tables get reloaded to heatwave automatically.

  Change-Id: I42e984910da23a0e416edb09d3949989159ef707

  # This is the commit message mysql#3:

  WL#15280: HEATWAVE SUPPORT FOR MDS HA (TEST CHANGES)

  This commit adds new functional tests for the MDS HA + HW integration.

  Change-Id: Ic818331a4ca04b16998155efd77ac95da08deaa1

  # This is the commit message mysql#4:

  WL#15280: HEATWAVE SUPPORT FOR MDS HA
  BUG#34776485: RESTRICT DEFAULT VALUE FOR rapid_enable_delayed_secondary_ops

  This commit does two things:
  1. Add a basic test for newly introduced system variable
  rapid_enable_delayed_secondary_ops, which controls the behavior of
  alter table secondary load/unload ddl statements when rapid cluster
  is not available.

  2. It also restricts the DEFAULT value setting for the system variable
  So, following is not allowed:
  SET GLOBAL rapid_enable_delayed_secondary_ops = default
  This variable is to be used in restricted scenarios and control plane
  only sets it to ON/OFF before and after PITR apply. Allowing set to
  default has no practical use.

  Change-Id: I85c84dfaa0f868dbfc7b1a88792a89ffd2e81da2

  # This is the commit message mysql#5:

  Bug#34726490: ADD DIAGNOSTICS FOR SECONDARY LOAD / UNLOAD DDL

  Problem:
  --------
  If secondary load or unload DDL gets rolled back due to some error after
  it had loaded / unloaded the table in heatwave cluster, there is no undo
  of the secondary engine action. Only secondary_load flag update is
  reverted and binlog is not written. From User's perspective, the table
  is loaded and can be seen on performance_schema. There are also no
  error messages printed to notify that the ddl didn't commit. This
  creates a problem to debug any issue in this area.

  Solution:
  ---------
  The partial undo of secondary load/unload ddl will be handled in
  bug#34592922. In this commit, we add diagnostics to reveal if the ddl
  failed to commit, and from what stage.

  Change-Id: I46c04dd5dbc07fc17beb8aa2a8d0b15ddfa171af

  # This is the commit message mysql#6:

  WL#15280: HEATWAVE SUPPORT FOR MDS HA (TEST FIX)

  Since ALTER TABLE SECONDARY LOAD / UNLOAD DDL statements now write
  to binlog, from Heatwave's perspective, SCN is bumped up.

  In this commit, we are adjusting expected SCN values in certain
  tests which does secondary load/unload and expects SCN to match.

  Change-Id: I9635b3cd588d01148d763d703c72cf50a0c0bb98

  # This is the commit message mysql#7:

  Adding MTR tests for ML in rapid group_replication suite

  Added MTR tests with Heatwave ML queries with in
  an HA setup.

  Change-Id: I386a3530b5bbe6aea551610b6e739ab1cf366439

  # This is the commit message mysql#8:

  WL#15280: HEATWAVE SUPPORT FOR MDS HA (MTR TEST ADJUSTMENT)

  In this commit we have adjusted the existing test to work with the
  new MTR test infrastructure which extends the functionalities to
  HA landscape. With this change, a lot of mannual settings have now
  become redundant and thus removed in this commit.

  Change-Id: Ie1f4fcfdf047bfe8638feaa9f54313d509cbad7e

  # This is the commit message mysql#9:

  WL#15280: HEATWAVE SUPPORT FOR MDS HA (CLANG-TIDY FIX)

  Fix clang-tidy warnings found in previous change#16530, patch#20

  Change-Id: I15d25df135694c2f6a3a9146feebe2b981637662

Change-Id: I3f3223a85bb52343a4619b0c2387856b09438265
dveeden pushed a commit to dveeden/mysql-server that referenced this pull request May 5, 2023
New classes SecureSocketInputStream and SecureSocketOutputStream
take a const NdbSocket reference. The existing classes
SocketInputStream and SocketOutuputStream are re-implemented
to use the NdbSocket versions; in this case, any SSL * will
be fetched from the SSL socket table.

Change-Id: Iaacb7be5e842bdf7da1b7674eaef90886bac6671
dveeden pushed a commit to dveeden/mysql-server that referenced this pull request May 5, 2023
…ting_sharing

ASAN reports a use-after-free in the exit-handlers of the
routing-sharing integration tests:

AddressSanitizer: heap-use-after-free on address ...
   ...
   mysql#4 0x708d05 in std::default_delete<SharedRestartableRouter>::
      operator()(SharedRestartableRouter*) const
      .../include/c++/9/bits/unique_ptr.h:81:2
   mysql#5 0x708c98 in std::unique_ptr<SharedRestartableRouter,
      std::default_delete<SharedRestartableRouter> >::~unique_ptr()
      .../include/c++/9/bits/unique_ptr.h:292:4
   mysql#6 0x6a7448 in std::array<std::unique_ptr<SharedRestartableRouter,
      std::default_delete<SharedRestartableRouter> >, 3ul>::~array()
      .../include/c++/9/array:94:12
   mysql#7 0x7f160a5e68d6 in __run_exit_handlers
      .../glibc-2.31/stdlib/exit.c:108:8

... as the RestartedRouters are destructed after their process-manager
is destructed.

- The process manager is owned by the test
- The RestartableRouters are 'static' and outlive the test.

Change
------

- explicitely free the intermediate routers at test-suite teardown to
  ensure the proper sequence.

Change-Id: Id405fb26b0519c49820a58b5f70d0ee59d2cb83f
venkatesh-prasad-v pushed a commit to venkatesh-prasad-v/mysql-server that referenced this pull request Aug 3, 2023
…etwork

https://bugs.mysql.com/bug.php?id=109668

Description
-----------
GR suffered from problems caused by the security probes and network scanner
processes connecting to the group replication communication port. This usually
is not a problem, but poses a serious threat when another member tries to join
the cluster by initialting a connection to the member which is affected by
external processes using the port dedicated for group communication for longer
durations.

On such activites by external processes, the SSL enabled server stalled forever
on the SSL_accept() call waiting for handshake data. Below is the stacktrace:

    Thread 55 (Thread 0x7f7bb77ff700 (LWP 2198598)):
    #0 in read ()
    mysql#1 in sock_read ()
    mysql#2 in BIO_read ()
    mysql#3 in ssl23_read_bytes ()
    mysql#4 in ssl23_get_client_hello ()
    mysql#5 in ssl23_accept ()
    mysql#6 in xcom_tcp_server_startup(Xcom_network_provider*) ()

When the server stalled in the above path forever, it prohibited other members
to join the cluster resulting in the following messages on the joiner server's
logs.

    [ERROR] [MY-011640] [Repl] Plugin group_replication reported: 'Timeout on wait for view after joining group'
    [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member is already leaving or joining a group.'

Solution
--------
This patch adds two new variables

1. group_replication_xcom_ssl_socket_timeout

   It is a file-descriptor level timeout in seconds for both accept() and
   SSL_accept() calls when group replication is listening on the xcom port.
   When set to a valid value, say for example 5 seconds, both accept() and
   SSL_accept() return after 5 seconds. The default value has been set to 0
   (waits infinitely) for backward compatibility. This variable is effective
   only when GR is configred with SSL.

2. group_replication_xcom_ssl_accept_retries

   It defines the number of retries to be performed before closing the socket.
   For each retry the server thread calls SSL_accept()  with timeout defined by
   the group_replication_xcom_ssl_socket_timeout for the SSL handshake process
   once the connection has been accepted by the first accept() call. The
   default value has been set to 10. This variable is effective only when GR is
   configred with SSL.

Note:
- Both of the above variables are dynamically configurable, but will become
  effective only on START GROUP_REPLICATION.
- This patch is only for the Linux systems.
bjornmu pushed a commit that referenced this pull request Oct 25, 2023
Part of WL#15135 Certificate Architecture

In Ndb_cluster_connection, this patch provides a new top-level
method configure_tls(). It also implements TLS initialization
in connect(), calling down through the TransporterFacade layer
to TransporterRegistry.

In the MySQL server this adds the new read-only configuration
option ndb-tls-search-path, with a compile-time default that is
configurable in CMake, WITH_NDB_TLS_SEARCH_PATH.

Unmodified API nodes that do not call into configure_tls() will
still be able to make TLS connections if keys are found
somewhere in the default search path.

Change-Id: Id1d046ff3c5a48a30131c3d15274f5ed625933a9
bjornmu pushed a commit that referenced this pull request Oct 25, 2023
On Win32, every binary needs one instance of openssl/applink.c.

MySQL has one in client_authentication.cc, and this one is present
in the server and in the client library.

This patch adds instances to ndb_sign_keys, ndb_mgmd, ndbd,
and testNodeCertificate-t, and includes fixes for other
assorted compiler errors and warnings on Win32.

Change-Id: I2d7f940b92ddac7d860d2c6fc2d98ead23e195b2
bjornmu pushed a commit that referenced this pull request Oct 25, 2023
In class Transporter, add two new member variables:
  m_require_tls is a boolean TLS requirement
  m_encrypted is true only when TLS is actually in use

A corresponding change in struct TransporterConfiguration also
adds authMode.

Some application logic is added in IPConfig.cpp to configure
the new variables.

On the server side, TransporterRegistry always uses a TLS
authenticator. On the client side, all Transporter clients
initialize a SocketAuthSimple authenticator, but then TCP
Transporter clients delete this in the TCP_Transporter
constructor and replace it with a TLS authenticator.

Change-Id: I6392eecfc712f8a8f500697f34324eea01d29a8c
bjornmu pushed a commit that referenced this pull request Oct 25, 2023
In the NDB configuration, add boolen options RequireTls and
RequireCertificate to the [MGM] section. Both options default to
false.

Add a new test testMgmd -n MgmdWithoutCertificate

In NdbStdOpt, add the --ndb-mgm-tls command-line option. The allowed
values are "relaxed" and "strict". The default is "relaxed".  This option
will be used for utility programs, allowing the user to specify the
TLS-related behavior of MGM clients.

Change-Id: Id32bb8805ca19a8cf8b52f45c54a7be4d912c5e4
venkatesh-prasad-v pushed a commit to venkatesh-prasad-v/mysql-server that referenced this pull request Jan 26, 2024
…and a local

             DDL executed

https://bugs.mysql.com/bug.php?id=113727

Problem
-------
In high concurrency scenarios, MySQL replica can enter into a deadlock due to a
race condition between the replica applier thread and the client thread
performing a binlog group commit.

Analysis
--------
It needs at least 3 threads for this deadlock to happen

1. One client thread
2. Two replica applier threads

How this deadlock happens?
--------------------------
0. Binlog is enabled on replica, but log_replica_updates is disabled.

1. Initially, both "Commit Order" and "Binlog Flush" queues are empty.

2. Replica applier thread 1 enters the group commit pipeline to register in the
   "Commit Order" queue since `log-replica-updates` is disabled on the replica
   node.

3. Since both "Commit Order" and "Binlog Flush" queues are empty, the applier
   thread 1

   3.1. Becomes leader (In Commit_stage_manager::enroll_for()).

   3.2. Registers in the commit order queue.

   3.3. Acquires the lock MYSQL_BIN_LOG::LOCK_log.

   3.4. Commit Order queue is emptied, but the lock MYSQL_BIN_LOG::LOCK_log is
        not yet released.

   NOTE: SE commit for applier thread is already done by the time it reaches
         here.

4. Replica applier thread 2 enters the group commit pipeline to register in the
   "Commit Order" queue since `log-replica-updates` is disabled on the replica
   node.

5. Since the "Commit Order" queue is empty (emptied by applier thread 1 in 3.4), the
   applier thread 2

   5.1. Becomes leader (In Commit_stage_manager::enroll_for())

   5.2. Registers in the commit order queue.

   5.3. Tries to acquire the lock MYSQL_BIN_LOG::LOCK_log. Since it is held by applier
        thread 1 it will wait until the lock is released.

6. Client thread enters the group commit pipeline to register in the
   "Binlog Flush" queue.

7. Since "Commit Order" queue is not empty (there is applier thread 2 in the
   queue), it enters the conditional wait `m_stage_cond_leader` with an
   intention to become the leader for both the "Binlog Flush" and
   "Commit Order" queues.

8. Applier thread 1 releases the lock MYSQL_BIN_LOG::LOCK_log and proceeds to update
   the GTID by calling gtid_state->update_commit_group() from
   Commit_order_manager::flush_engine_and_signal_threads().

9. Applier thread 2 acquires the lock MYSQL_BIN_LOG::LOCK_log.

   9.1. It checks if there is any thread waiting in the "Binlog Flush" queue
        to become the leader. Here it finds the client thread waiting to be
        the leader.

   9.2. It releases the lock MYSQL_BIN_LOG::LOCK_log and signals on the
        cond_var `m_stage_cond_leader` and enters a conditional wait until the
        thread's `tx_commit_pending` is set to false by the client thread
       (will be done in the
       Commit_stage_manager::process_final_stage_for_ordered_commit_group()
       called by client thread from fetch_and_process_flush_stage_queue()).

10. The client thread wakes up from the cond_var `m_stage_cond_leader`.  The
    thread has now become a leader and it is its responsibility to update GTID
    of applier thread 2.

    10.1. It acquires the lock MYSQL_BIN_LOG::LOCK_log.

    10.2. Returns from `enroll_for()` and proceeds to process the
          "Commit Order" and "Binlog Flush" queues.

    10.3. Fetches the "Commit Order" and "Binlog Flush" queues.

    10.4. Performs the storage engine flush by calling ha_flush_logs() from
          fetch_and_process_flush_stage_queue().

    10.5. Proceeds to update the GTID of threads in "Commit Order" queue by
          calling gtid_state->update_commit_group() from
          Commit_stage_manager::process_final_stage_for_ordered_commit_group().

11. At this point, we will have

    - Client thread performing GTID update on behalf if applier thread 2 (from step 10.5), and
    - Applier thread 1 performing GTID update for itself (from step 8).

    Due to the lack of proper synchronization between the above two threads,
    there exists a time window where both threads can call
    gtid_state->update_commit_group() concurrently.

    In subsequent steps, both threads simultaneously try to modify the contents
    of the array `commit_group_sidnos` which is used to track the lock status of
    sidnos. This concurrent access to `update_commit_group()` can cause a
    lock-leak resulting in one thread acquiring the sidno lock and not
    releasing at all.

-----------------------------------------------------------------------------------------------------------
Client thread                                           Applier Thread 1
-----------------------------------------------------------------------------------------------------------
update_commit_group() => global_sid_lock->rdlock();     update_commit_group() => global_sid_lock->rdlock();

calls update_gtids_impl_lock_sidnos()                   calls update_gtids_impl_lock_sidnos()

set commit_group_sidno[2] = true                        set commit_group_sidno[2] = true

                                                        lock_sidno(2) -> successful

lock_sidno(2) -> waits

                                                        update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()`

                                                        if (commit_group_sidnos[2]) {
                                                          unlock_sidno(2);
                                                          commit_group_sidnos[2] = false;
                                                        }

                                                        Applier thread continues..

lock_sidno(2) -> successful

update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()`

if (commit_group_sidnos[2]) { <=== this check fails and lock is not released.
  unlock_sidno(2);
  commit_group_sidnos[2] = false;
}

Client thread continues without releasing the lock
-----------------------------------------------------------------------------------------------------------

12. As the above lock-leak can also happen the other way i.e, the applier
    thread fails to unlock, there can be different consequences hereafter.

13. If the client thread continues without releasing the lock, then at a later
    stage, it can enter into a deadlock with the applier thread performing a
    GTID update with stack trace.

    Client_thread
    -------------
    mysql#1  __GI___lll_lock_wait
    mysql#2  ___pthread_mutex_lock
    mysql#3  native_mutex_lock                                       <= waits for commit lock while holding sidno lock
    mysql#4  Commit_stage_manager::enroll_for
    mysql#5  MYSQL_BIN_LOG::change_stage
    mysql#6  MYSQL_BIN_LOG::ordered_commit
    mysql#7  MYSQL_BIN_LOG::commit
    mysql#8  ha_commit_trans
    mysql#9  trans_commit_implicit
    mysql#10 mysql_create_like_table
    mysql#11 Sql_cmd_create_table::execute
    mysql#12 mysql_execute_command
    mysql#13 dispatch_sql_command

    Applier thread
    --------------
    mysql#1  ___pthread_mutex_lock
    mysql#2  native_mutex_lock
    mysql#3  safe_mutex_lock
    mysql#4  Gtid_state::update_gtids_impl_lock_sidnos               <= waits for sidno lock
    mysql#5  Gtid_state::update_commit_group
    mysql#6  Commit_order_manager::flush_engine_and_signal_threads   <= acquires commit lock here
    mysql#7  Commit_order_manager::finish
    mysql#8  Commit_order_manager::wait_and_finish
    mysql#9  ha_commit_low
    mysql#10 trx_coordinator::commit_in_engines
    mysql#11 MYSQL_BIN_LOG::commit
    mysql#12 ha_commit_trans
    mysql#13 trans_commit
    mysql#14 Xid_log_event::do_commit
    mysql#15 Xid_apply_log_event::do_apply_event_worker
    mysql#16 Slave_worker::slave_worker_exec_event
    mysql#17 slave_worker_exec_job_group
    mysql#18 handle_slave_worker

14. If the applier thread continues without releasing the lock, then at a later
    stage, it can perform recursive locking while setting the GTID for the next
    transaction (in set_gtid_next()).

    In debug builds the above case hits the assertion
    `safe_mutex_assert_not_owner()` meaning the lock is already acquired by the
    replica applier thread when it tries to re-acquire the lock.

Solution
--------
In the above problematic example, when seen from each thread
individually, we can conclude that there is no problem in the order of lock
acquisition, thus there is no need to change the lock order.

However, the root cause for this problem is that multiple threads can
concurrently access to the array `Gtid_state::commit_group_sidnos`.

In its initial implementation, it was expected that threads should
hold the `MYSQL_BIN_LOG::LOCK_commit` before modifying its contents. But it
was not considered when upstream implemented WL#7846 (MTS:
slave-preserve-commit-order when log-slave-updates/binlog is disabled).

With this patch, we now ensure that `MYSQL_BIN_LOG::LOCK_commit` is acquired
when the client thread (binlog flush leader) when it tries to perform GTID
update on behalf of threads waiting in "Commit Order" queue, thus providing a
guarantee that `Gtid_state::commit_group_sidnos` array is never accessed
without the protection of `MYSQL_BIN_LOG::LOCK_commit`.
bjornmu pushed a commit that referenced this pull request Apr 30, 2024
When built with ASAN, a use-after-free is reported for the TcpPortPool.

AddressSanitizer: heap-use-after-free on address 0x60200019f190 at pc
0x00000076a18d bp 0x7fff51e7d1d0 sp 0x7fff51e7d1c0

    #4 0x770b73 in UniqueId::ProcessUniqueIds::erase(unsigned int)
       ../router/tests/helpers/tcp_port_pool.h:112
    #5 0x770c48 in UniqueId::~UniqueId()
       ../router/tests/helpers/tcp_port_pool.cc:234
    ...
    #12 0x82faa3 in testing::UnitTest::~UnitTest()
	../extra/googletest/googletest-release-1.12.0/googletest/src/gtest.cc:5496
    #13 0x7f5fe085ace8 in __run_exit_handlers (/lib64/libc.so.6+0x39ce8)

0x60200019f190 is located 0 bytes inside of 16-byte region
[0x60200019f190,0x60200019f1a0)
freed by thread T0 here:
    #0 0x7f5fe3cbd10f in operator delete(void*, unsigned long)
       (/lib64/libasan.so.6+0xb710f)
    #1 0x7f5fe085ace8 in __run_exit_handlers (/lib64/libc.so.6+0x39ce8)

Background
==========

__run_exit_handlers destroys "static" and "global" variables in reverse
order of their creation.

googletest's unit-tests are a static, and the TcpPortPool also has
ProcessUniqueId's which contains the process-wide unique-ids.

At construct: unittest -> tcp-port-pool -> proces-unique-ids
At destruct : process-unique-ids -> tcp-port-pool -> 💥

The use-after-free happens as the process-unique-ids static is
destructed before the tcp-port-pool which tries to its Ids from the
process-unique-ids.

Change
======

- extend the lifetime of the process-unique-ids to after the last use of
  the tcp-port-pool via a std::shared_ptr<>

Change-Id: I75b8b781e1d240f18ca72f2c86182639a7699f06
bjornmu pushed a commit that referenced this pull request Apr 30, 2024
…nt on Windows and posix [#4]

Introduce quoting functions suitable for POSIX shell (sh) and running
C/C++ programs on Windows via CMD.EXE.

Use them when running a program via ssh. A simple heuristic to guess the
kind of quoting needed on remote host is. If a \ appears in any argument
use the quoting function for Windows. If / appears in any argument use
the quoting function for POSIX.

Change-Id: I851eb3da22d716d181319e825e888631cd16aeb7
bjornmu pushed a commit that referenced this pull request Apr 30, 2024
Problem:
Starting ´ndb_mgmd --bind-address´ may potentially cause abnormal
program termination in MgmtSrvr destructor when ndb_mgmd restart itself.

  Core was generated by `ndb_mgmd --defa'.
  Program terminated with signal SIGABRT,   Aborted.
  #0  0x00007f8ce4066b8f in raise () from /lib64/libc.so.6
  #1  0x00007f8ce4039ea5 in abort () from /lib64/libc.so.6
  #2  0x00007f8ce40a7d97 in __libc_message () from /lib64/libc.so.6
  #3  0x00007f8ce40af08c in malloc_printerr () from /lib64/libc.so.6
  #4  0x00007f8ce40b132d in _int_free () from /lib64/libc.so.6
  #5  0x00000000006e9ffe in MgmtSrvr::~MgmtSrvr (this=0x28de4b0) at
mysql/8.0/storage/ndb/src/mgmsrv/MgmtSrvr.cpp:
890
  #6  0x00000000006ea09e in MgmtSrvr::~MgmtSrvr (this=0x2) at mysql/8.0/
storage/ndb/src/mgmsrv/MgmtSrvr.cpp:849
  #7  0x0000000000700d94 in mgmd_run () at
mysql/8.0/storage/ndb/src/mgmsrv/main.cpp:260
  #8  0x0000000000700775 in mgmd_main (argc=<optimized out>,
argv=0x28041d0) at mysql/8.0/storage/ndb/src/
mgmsrv/main.cpp:479

Analysis:
While starting up, the ndb_mgmd will allocate memory for bind_address in
order to potentially rewrite the parameter. When ndb_mgmd restart itself
the memory will be released and dangling pointer causing double free.

Fix:
Drop support for bind_address=[::], it is not documented anywhere, is
not useful and doesn't work.
This means the need to rewrite bind_address is gone and bind_address
argument need neither alloc or free.

Change-Id: I7797109b9d8391394587188d64d4b1f398887e94
bjornmu pushed a commit that referenced this pull request Jul 1, 2024
… for connection xxx'.

The new iterator based explains are not impacted.

The issue here is a race condition. More than one thread is using the
query term iterator at the same time (whoch is neithe threas safe nor
reantrant), and part of its state is in the query terms being visited
which leads to interference/race conditions.

a) the explain thread

uses an iterator here:

   Sql_cmd_explain_other_thread::execute

is inspecting the Query_expression of the running query
calling master_query_expression()->find_blocks_query_term which uses
an iterator over the query terms in the query expression:

   for (auto qt : query_terms<>()) {
       if (qt->query_block() == qb) {
           return qt;
       }
   }

the above search fails to find qb due to the interference of the
thread b), see below, and then tries to access a nullpointer:

    * thread #36, name = ‘connection’, stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  frame #0: 0x000000010bb3cf0d mysqld`Query_block::type(this=0x00007f8f82719088) const at sql_lex.cc:4441:11
  frame #1: 0x000000010b83763e mysqld`(anonymous namespace)::Explain::explain_select_type(this=0x00007000020611b8) at opt_explain.cc:792:50
  frame #2: 0x000000010b83cc4d mysqld`(anonymous namespace)::Explain_join::explain_select_type(this=0x00007000020611b8) at opt_explain.cc:1487:21
  frame #3: 0x000000010b837c34 mysqld`(anonymous namespace)::Explain::prepare_columns(this=0x00007000020611b8) at opt_explain.cc:744:26
  frame #4: 0x000000010b83ea0e mysqld`(anonymous namespace)::Explain_join::explain_qep_tab(this=0x00007000020611b8, tabnum=0) at opt_explain.cc:1415:32
  frame #5: 0x000000010b83ca0a mysqld`(anonymous namespace)::Explain_join::shallow_explain(this=0x00007000020611b8) at opt_explain.cc:1364:9
  frame #6: 0x000000010b83379b mysqld`(anonymous namespace)::Explain::send(this=0x00007000020611b8) at opt_explain.cc:770:14
  frame #7: 0x000000010b834147 mysqld`explain_query_specification(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, query_term=0x00007f8f82719088, ctx=CTX_JOIN) at opt_explain.cc:2088:20
  frame #8: 0x000000010bd36b91 mysqld`Query_expression::explain_query_term(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, qt=0x00007f8f82719088) at sql_union.cc:1519:11
  frame #9: 0x000000010bd36c68 mysqld`Query_expression::explain_query_term(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, qt=0x00007f8f8271d748) at sql_union.cc:1526:13
  frame #10: 0x000000010bd373f7 mysqld`Query_expression::explain(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00) at sql_union.cc:1591:7
  frame #11: 0x000000010b835820 mysqld`mysql_explain_query_expression(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, unit=0x00007f8f7a090360) at opt_explain.cc:2392:17
  frame #12: 0x000000010b835400 mysqld`explain_query(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, unit=0x00007f8f7a090360) at opt_explain.cc:2353:13
 * frame #13: 0x000000010b8363e4 mysqld`Sql_cmd_explain_other_thread::execute(this=0x00007f8fba585b68, thd=0x00007f8fbb111e00) at opt_explain.cc:2531:11
  frame #14: 0x000000010bba7d8b mysqld`mysql_execute_command(thd=0x00007f8fbb111e00, first_level=true) at sql_parse.cc:4648:29
  frame #15: 0x000000010bb9e230 mysqld`dispatch_sql_command(thd=0x00007f8fbb111e00, parser_state=0x0000700002065de8) at sql_parse.cc:5303:19
  frame #16: 0x000000010bb9a4cb mysqld`dispatch_command(thd=0x00007f8fbb111e00, com_data=0x0000700002066e38, command=COM_QUERY) at sql_parse.cc:2135:7
  frame #17: 0x000000010bb9c846 mysqld`do_command(thd=0x00007f8fbb111e00) at sql_parse.cc:1464:18
  frame #18: 0x000000010b2f2574 mysqld`handle_connection(arg=0x0000600000e34200) at connection_handler_per_thread.cc:304:13
  frame #19: 0x000000010e072fc4 mysqld`pfs_spawn_thread(arg=0x00007f8fba8160b0) at pfs.cc:3051:3
  frame #20: 0x00007ff806c2b202 libsystem_pthread.dylib`_pthread_start + 99
  frame #21: 0x00007ff806c26bab libsystem_pthread.dylib`thread_start + 15

b) the query thread being explained is itself performing LEX::cleanup
and as part of the iterates over the query terms, but still allows
EXPLAIN of the query plan since

   thd->query_plan.set_query_plan(SQLCOM_END, ...)

hasn't been called yet.

     20:frame: Query_terms<(Visit_order)1, (Visit_leaves)0>::Query_term_iterator::operator++() (in mysqld) (query_term.h:613)
     21:frame: Query_expression::cleanup(bool) (in mysqld) (sql_union.cc:1861)
     22:frame: LEX::cleanup(bool) (in mysqld) (sql_lex.h:4286)
     30:frame: Sql_cmd_dml::execute(THD*) (in mysqld) (sql_select.cc:799)
     31:frame: mysql_execute_command(THD*, bool) (in mysqld) (sql_parse.cc:4648)
     32:frame: dispatch_sql_command(THD*, Parser_state*) (in mysqld) (sql_parse.cc:5303)
     33:frame: dispatch_command(THD*, COM_DATA const*, enum_server_command) (in mysqld) (sql_parse.cc:2135)
     34:frame: do_command(THD*) (in mysqld) (sql_parse.cc:1464)
     57:frame: handle_connection(void*) (in mysqld) (connection_handler_per_thread.cc:304)
     58:frame: pfs_spawn_thread(void*) (in mysqld) (pfs.cc:3053)
     65:frame: _pthread_start (in libsystem_pthread.dylib) + 99
     66:frame: thread_start (in libsystem_pthread.dylib) + 15

Solution:

This patch solves the issue by removing iterator state from
Query_term, making the query_term iterators thread safe. This solution
labels every child query_term with its index in its parent's
m_children vector.  The iterator can therefore easily compute the next
child to visit based on Query_term::m_sibling_idx.

A unit test case is added to check reentrancy.

One can also manually verify that we have no remaining race condition
by running two client connections files (with \. <file>) with a big
number of copies of the repro query in one connection and a big number
of EXPLAIN format=json FOR <connection>, e.g.

    EXPLAIN FORMAT=json FOR CONNECTION 8\G

in the other. The actual connection number would need to verified
in connection one, of course.

Change-Id: Ie7d56610914738ccbbecf399ccc4f465f7d26ea7
bjornmu pushed a commit that referenced this pull request Jul 1, 2024
This is a combination of 5 commits.

This is the 1st commit message:

WL#15746: TLS Enhancements for HeatWave-AutoML & Dask Comm. Upgrade

Problem:
--------
- HeatWave-AutoML communication was unauthenticated, unauthorized,
  and unencrypted.
- Dask communication utilized TCP, not aligning with FedRamp
  guidelines.

Solution:
---------
- Introduced TLS and mTLS in HeatWave-AutoML's plugin and driver for
  authentication, authorization, and encryption.
- Applied TLS to Dask to ensure authentication, encryption, and
  authorization.

Dask Authorization (OCID-based):
--------------------------------
1. For each DBsystem:
    - MySQL node sends OCIDs of authorized nodes to the head driver
      via:
        a. rapid_net_nodes
        b. rapid_net_allowed_ocids (older API, mainly for MTR tests)
    - Scenarios:
        a. All OCIDs provided: Dask authorizes.
        b. Any OCID absent: ML call fails with message.
2. During Dask worker registration to the Dask scheduler, a script is
    dispatched to the Dask worker for execution, retrieving the worker's
    OCID for authorization purposes.
    - If the OCID isn't approved, the connection is denied, terminating
      the worker and causing the ML query to fail.
3. For every Dask worker (both as listener and connector), an OCID-
    based authorization is performed post SSL/TLS connection handshake.
    The process compares the OCID from the peer's certificate against
    the allowed_ocids received from the HeatWave-AutoML MySQL plugin.

HWAML Plugin Changes:
---------------------
- Sourced certificate data and SSL setup from disk, incorporating
  SSL/TLS for HWAML.
- Reused "keystore" variable to specify disk location for
  certificate retrieval.
- Certificates and keys expected in PKCS12 format.
- Introduced "have_ml_encryption" variable (default=0).
    > Acts as a switch to explicitly deactivate HWAML network
      encryption, akin to "disable_net_encryption" affecting
      network encryption for HeatWave. Set to 1 to enable.
- Introduced a customized verifier function for verify_callback to
  be set in SSL_CTX_set_verify and used in the handshake process
  of SSL/TLS. The customized verifier function will perform
  instance id (OCID) based authorization on the plugin side during
  standard SSL/TLS handshake process.
- CRL (Certificate Revocation List) checks are also conducted if CRL
  Distribution Points are present and accessible in the provided
  certificate.

HWAML Driver Changes & OCID-based Authorization:
------------------------------------------------
- Introduced "enable_encryption" (default=0).
    > Set to 1 to enable encryption.
- When receiving a new connection request and encryption is on, the
  driver performs OCID-based self-checking, comparing OCID retrieved
  from its own instance principal with the OCID in the
  provided certificate on disk.
- The driver compares OCID from "mysql_compute_id" and extracted OCID
  from mTLS certificate during connection.
- Introduced "cert_dir" argument for certificate directory
  specification.
- Expected files: cert_chain.pem, certificate.pem, private_key.pem.
    > OCID should be in the userID (UID) or CN field of the
      certificate.pem subject.
- CRL (Certificate Revocation List) checks are also conducted post
  handshake, if CRL Distribution Points are present and accessible in
  the provided certificate, alongside OCID authorization.

Encryption Behavior:
--------------------
- If encryption is deactivated on both plugin and driver side, HWAML
  will work without encryption as it was before this commit.

Enabling Encryption:
--------------------
- By default, "have_ml_encryption" and "enable_encryption" are set to 0
    > Encryption is disabled by default.
- For the HWAML plugin:
    > "have_ml_encryption" set to 1 (default is 0).
    > Specify the .pfx file's path using the "keystore".
- For the HWAML Driver:
    > "enable_encryption" set to 1 (default is 0)
    > Specify "mysql_instance_id" and "cert_dir".

Testing:
--------
- MTR has been modified for the encryption setup.
    > Runs with encryption if "OCI_INSTANCE_ID" is set to a valid
      value.
- On OCI (when "OLRAPID_KEYSTORE" is not set):
    > Certificates and keys are generated; PEMs for driver and PKCS12
      for plugin.
- On AWS (when "OLRAPID_KEYSTORE" is set as the path to PKCS12
  keystore files):
    > PEM files are extracted from the provided PKCS12 and used for
      the driver. The plugin uses the provided PKCS12 keystore file.

Change-Id: I553ca135241e03484db6debbe186e6d34d582bf4

This is the commit message #2:

WL#15746 - Adding ML encryption support to BM

Enabling ML encryption on Baumeister:
- Certificates are generated on MySQLd during initialization
- Needed certicates for workers are packaged and sent to worker nodes
- Workers use packaged files to generate their certificates
- Arguments are added to driver.py invoke
- Keystore path is added to mysql config

Change-Id: I11a5cc5926488ff4fbf91bb6c10a091358db7dc9

This is the commit message #3:

WL#15746: Enhanced CRL Daemon Checker

Issue
=====
The previous design assumed a plain HTTPS link for the CRL distribution
point, accessible to all. This assumption no longer holds, as public
accessibility for CRL distribution contradicts OCI guidelines. Now, the
CRL distribution point in certificates provided by the control plane is
expected to be protected by OCI Instance Principal Authentication.
However, using this authentication method introduces a delay of several
seconds, which is impractical for HeatWave-AutoML.

Solution
========
The CRL fetching code now uses OCI Instance Principal Authentication.
To mitigate performance issues, the CRL checking process has been
redesigned. Instead of enforcing CRL checks per connection in MySQL
Plugin and HeatWave-AutoML Driver communications, a daemon thread in
HeatWave-AutoML Driver, Dask scheduler, and Dask Worker now periodically
fetches and verifies the CRL against all active connections. This
separation minimizes performance impacts. Consequently, MySQL Plugin's
CRL checks have been removed, as checks in the Driver, Scheduler, and
Worker sufficiently cover all cluster nodes.

Changes
=======
- Implemented CRL checker as a daemon thread in Driver, Scheduler, and
  Worker.
- Each connection/socket has an associated CRL checker.
- CRL checks occur periodically at set intervals.
- Skips CRL check if the CRL is temporarily unavailable.
- Failing a CRL check results in the associated connection/socket being
  closed. On the Driver, a stop event is triggered (akin to CTRL-C).

Change-Id: Id998cfe9e15d9236291b0ae420d65c2197837966

This is the commit message #4:

WL#15746: Fix Dask workers being shutdown without releasing address

Issue
=====
Dask workers getting shutting but not releasing the address used
properly sometimes.

Solution
========
Reverted some changes in heatwave_cluster.py in dask worker shutdown
function. Hopefully this will fix the address issue

Change-Id: I5a6749b5a25b0ccb73ba7369e545bc010da1b84f

This is the commit message #5:

WL#15746: Implement Dask Worker Join Timeout for Head Node

Issue:
======
In the cluster_shutdown method, the join operation on the head node's
worker process lacked a timeout. This led to potential indefinite
waiting and eventual hanging of the head node.

Solution:
=========
A timeout has been introduced for the worker process join on the head
node. Unlike non-head nodes, which rely on worker join to complete Dask
tasks and cannot have a timeout, the head node can safely implement
this. Now, if the worker process on the head node fails to shut down
post-join, indicating a timeout, it will be manually terminated. This
ensures proper release of associated resources and prevents hanging of
the head node.

Additional Change:
==================
Added Cert Rotation Guard for DASK clusters. This feature initiates on
the first plugin-driver connection when the DASK cluster is off,
recording the certificate's expiry date. During driver idle times,
it checks the current cert's expiry against this date. If it detects a
change, indicating a certificate rotation, it shuts down the DASK
cluster. The cluster restarts on the next connection request, ensuring
the use of the latest certificate.

Change-Id: Ie63a2e2b7664e05e1622d8bd6503663e13fa73cb
laurynas-biveinis pushed a commit to laurynas-biveinis/mysql-server that referenced this pull request Oct 22, 2024
This is a combination of 5 commits.

This is the 1st commit message:

WL#15746: TLS Enhancements for HeatWave-AutoML & Dask Comm. Upgrade

Problem:
--------
- HeatWave-AutoML communication was unauthenticated, unauthorized,
  and unencrypted.
- Dask communication utilized TCP, not aligning with FedRamp
  guidelines.

Solution:
---------
- Introduced TLS and mTLS in HeatWave-AutoML's plugin and driver for
  authentication, authorization, and encryption.
- Applied TLS to Dask to ensure authentication, encryption, and
  authorization.

Dask Authorization (OCID-based):
--------------------------------
1. For each DBsystem:
    - MySQL node sends OCIDs of authorized nodes to the head driver
      via:
        a. rapid_net_nodes
        b. rapid_net_allowed_ocids (older API, mainly for MTR tests)
    - Scenarios:
        a. All OCIDs provided: Dask authorizes.
        b. Any OCID absent: ML call fails with message.
2. During Dask worker registration to the Dask scheduler, a script is
    dispatched to the Dask worker for execution, retrieving the worker's
    OCID for authorization purposes.
    - If the OCID isn't approved, the connection is denied, terminating
      the worker and causing the ML query to fail.
3. For every Dask worker (both as listener and connector), an OCID-
    based authorization is performed post SSL/TLS connection handshake.
    The process compares the OCID from the peer's certificate against
    the allowed_ocids received from the HeatWave-AutoML MySQL plugin.

HWAML Plugin Changes:
---------------------
- Sourced certificate data and SSL setup from disk, incorporating
  SSL/TLS for HWAML.
- Reused "keystore" variable to specify disk location for
  certificate retrieval.
- Certificates and keys expected in PKCS12 format.
- Introduced "have_ml_encryption" variable (default=0).
    > Acts as a switch to explicitly deactivate HWAML network
      encryption, akin to "disable_net_encryption" affecting
      network encryption for HeatWave. Set to 1 to enable.
- Introduced a customized verifier function for verify_callback to
  be set in SSL_CTX_set_verify and used in the handshake process
  of SSL/TLS. The customized verifier function will perform
  instance id (OCID) based authorization on the plugin side during
  standard SSL/TLS handshake process.
- CRL (Certificate Revocation List) checks are also conducted if CRL
  Distribution Points are present and accessible in the provided
  certificate.

HWAML Driver Changes & OCID-based Authorization:
------------------------------------------------
- Introduced "enable_encryption" (default=0).
    > Set to 1 to enable encryption.
- When receiving a new connection request and encryption is on, the
  driver performs OCID-based self-checking, comparing OCID retrieved
  from its own instance principal with the OCID in the
  provided certificate on disk.
- The driver compares OCID from "mysql_compute_id" and extracted OCID
  from mTLS certificate during connection.
- Introduced "cert_dir" argument for certificate directory
  specification.
- Expected files: cert_chain.pem, certificate.pem, private_key.pem.
    > OCID should be in the userID (UID) or CN field of the
      certificate.pem subject.
- CRL (Certificate Revocation List) checks are also conducted post
  handshake, if CRL Distribution Points are present and accessible in
  the provided certificate, alongside OCID authorization.

Encryption Behavior:
--------------------
- If encryption is deactivated on both plugin and driver side, HWAML
  will work without encryption as it was before this commit.

Enabling Encryption:
--------------------
- By default, "have_ml_encryption" and "enable_encryption" are set to 0
    > Encryption is disabled by default.
- For the HWAML plugin:
    > "have_ml_encryption" set to 1 (default is 0).
    > Specify the .pfx file's path using the "keystore".
- For the HWAML Driver:
    > "enable_encryption" set to 1 (default is 0)
    > Specify "mysql_instance_id" and "cert_dir".

Testing:
--------
- MTR has been modified for the encryption setup.
    > Runs with encryption if "OCI_INSTANCE_ID" is set to a valid
      value.
- On OCI (when "OLRAPID_KEYSTORE" is not set):
    > Certificates and keys are generated; PEMs for driver and PKCS12
      for plugin.
- On AWS (when "OLRAPID_KEYSTORE" is set as the path to PKCS12
  keystore files):
    > PEM files are extracted from the provided PKCS12 and used for
      the driver. The plugin uses the provided PKCS12 keystore file.

Change-Id: I553ca135241e03484db6debbe186e6d34d582bf4

This is the commit message mysql#2:

WL#15746 - Adding ML encryption support to BM

Enabling ML encryption on Baumeister:
- Certificates are generated on MySQLd during initialization
- Needed certicates for workers are packaged and sent to worker nodes
- Workers use packaged files to generate their certificates
- Arguments are added to driver.py invoke
- Keystore path is added to mysql config

Change-Id: I11a5cc5926488ff4fbf91bb6c10a091358db7dc9

This is the commit message mysql#3:

WL#15746: Enhanced CRL Daemon Checker

Issue
=====
The previous design assumed a plain HTTPS link for the CRL distribution
point, accessible to all. This assumption no longer holds, as public
accessibility for CRL distribution contradicts OCI guidelines. Now, the
CRL distribution point in certificates provided by the control plane is
expected to be protected by OCI Instance Principal Authentication.
However, using this authentication method introduces a delay of several
seconds, which is impractical for HeatWave-AutoML.

Solution
========
The CRL fetching code now uses OCI Instance Principal Authentication.
To mitigate performance issues, the CRL checking process has been
redesigned. Instead of enforcing CRL checks per connection in MySQL
Plugin and HeatWave-AutoML Driver communications, a daemon thread in
HeatWave-AutoML Driver, Dask scheduler, and Dask Worker now periodically
fetches and verifies the CRL against all active connections. This
separation minimizes performance impacts. Consequently, MySQL Plugin's
CRL checks have been removed, as checks in the Driver, Scheduler, and
Worker sufficiently cover all cluster nodes.

Changes
=======
- Implemented CRL checker as a daemon thread in Driver, Scheduler, and
  Worker.
- Each connection/socket has an associated CRL checker.
- CRL checks occur periodically at set intervals.
- Skips CRL check if the CRL is temporarily unavailable.
- Failing a CRL check results in the associated connection/socket being
  closed. On the Driver, a stop event is triggered (akin to CTRL-C).

Change-Id: Id998cfe9e15d9236291b0ae420d65c2197837966

This is the commit message mysql#4:

WL#15746: Fix Dask workers being shutdown without releasing address

Issue
=====
Dask workers getting shutting but not releasing the address used
properly sometimes.

Solution
========
Reverted some changes in heatwave_cluster.py in dask worker shutdown
function. Hopefully this will fix the address issue

Change-Id: I5a6749b5a25b0ccb73ba7369e545bc010da1b84f

This is the commit message mysql#5:

WL#15746: Implement Dask Worker Join Timeout for Head Node

Issue:
======
In the cluster_shutdown method, the join operation on the head node's
worker process lacked a timeout. This led to potential indefinite
waiting and eventual hanging of the head node.

Solution:
=========
A timeout has been introduced for the worker process join on the head
node. Unlike non-head nodes, which rely on worker join to complete Dask
tasks and cannot have a timeout, the head node can safely implement
this. Now, if the worker process on the head node fails to shut down
post-join, indicating a timeout, it will be manually terminated. This
ensures proper release of associated resources and prevents hanging of
the head node.

Additional Change:
==================
Added Cert Rotation Guard for DASK clusters. This feature initiates on
the first plugin-driver connection when the DASK cluster is off,
recording the certificate's expiry date. During driver idle times,
it checks the current cert's expiry against this date. If it detects a
change, indicating a certificate rotation, it shuts down the DASK
cluster. The cluster restarts on the next connection request, ensuring
the use of the latest certificate.

Change-Id: Ie63a2e2b7664e05e1622d8bd6503663e13fa73cb
dbussink added a commit to planetscale/mysql-server that referenced this pull request Nov 21, 2024
In case `with_ndb_home` is set, `buf` is allocated with `PATH_MAX` and
the home is already written into the buffer.

The additional path is written using `snprintf` and it starts off at
`len`. It still can write up to `PATH_MAX` though which is wrong, since
if we already have a home written into it, we only have `PATH_MAX - len`
available in the buffer.

On Ubuntu 24.04 with debug builds this is caught and it crashes:

```
*** buffer overflow detected ***: terminated
Signal 6 thrown, attempting backtrace.
stack_bottom = 0 thread_stack 0x0
 #0 0x604895341cb6 <unknown>
 mysql#1 0x7ff22524531f <unknown> at sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
 mysql#2 0x7ff22529eb1c __pthread_kill_implementation at ./nptl/pthread_kill.c:44
 mysql#3 0x7ff22529eb1c __pthread_kill_internal at ./nptl/pthread_kill.c:78
 mysql#4 0x7ff22529eb1c __GI___pthread_kill at ./nptl/pthread_kill.c:89
 mysql#5 0x7ff22524526d __GI_raise at sysdeps/posix/raise.c:26
 mysql#6 0x7ff2252288fe __GI_abort at ./stdlib/abort.c:79
 mysql#7 0x7ff2252297b5 __libc_message_impl at sysdeps/posix/libc_fatal.c:132
 mysql#8 0x7ff225336c18 __GI___fortify_fail at ./debug/fortify_fail.c:24
 mysql#9 0x7ff2253365d3 __GI___chk_fail at ./debug/chk_fail.c:28
 mysql#10 0x7ff225337db4 ___snprintf_chk at ./debug/snprintf_chk.c:29
 mysql#11 0x6048953593ba <unknown>
 mysql#12 0x604895331a3d <unknown>
 mysql#13 0x6048953206e7 <unknown>
 mysql#14 0x60489531f4b1 <unknown>
 mysql#15 0x60489531e8e6 <unknown>
 mysql#16 0x7ff22522a1c9 __libc_start_call_main at sysdeps/nptl/libc_start_call_main.h:58
 mysql#17 0x7ff22522a28a __libc_start_main_impl at csu/libc-start.c:360
 mysql#18 0x60489531ed54 <unknown>
 mysql#19 0xffffffffffffffff <unknown>
```

In practice this buffer overflow only would happen with very long paths.

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants