Tuesday, July 09, 2024

A MySQL 9.0.0 branch with patches

Recently I made public a branch of MySQL 8.4.0 with patches. Now that newer upstream versions have been released, I have rebased the branch on 9.0.0: patched-mysql-9.0.0.

I did not add any new patches, and dropped the LLVM compilation fix patch. Thus the current patch set improves AddressSanitizer support and includes several fixes/improvements for the clone plugin to support a second transactional storage engine:

  1. Fix for Bug 115120: Provide memory debugging macro implementations for AddressSanitizer.
  2. Fix for Bug 109922: SEs do not coordinate clone rollbacks on instance startup.
  3. Fix for Bug 109920: SEs cannot add to clone data size estimates.
  4. Fix for Bug 109919: Clone plugin does not support arbitrary length O_DIRECT files.
  5. Fix for Bug 107715: Multiple SE clone not calling clone_end after mixed clone_begin failure/success.
  6. Fix for Bug 106750: Some clone tests needlessly marked as Linux-only.
  7. Fix for Bug 106737: Wrong SE called to ack clone application if >1 SE.

Thursday, July 04, 2024

Building and testing MySQL 8.0.38 / 8.4.1 / 9.0.0 on macOS

Oracle has just released MySQL 8.0.38/8.4.1/9.0.0, so let's see how the Valgrind testing of the previous set of releases is going:

[ 65%] innodb.bug33788578_rec_IV_set            w2  [ fail ]  Found warnings/errors in error log file!

It only managed two thirds of a run (a --big-test one) in a few weeks, which means that I either need a faster machine, or that Valgrind testing does not take even remotely reasonable amount of time. Despite the run being incomplete, it resulted in couple of bug reports because every single test gives an error: #115229: pwrite(buf) points to uninitialised byte(s) in os_fusionio_get_sector_size(). I also noticed that one test always times out: #114998: Test rpl_gtid.rpl_gtid_parallel times out under Valgrind. I need to rethink Valgrind testing going forward, maybe by trying dropping --big-test first.

Back to the new releases. A nice surprise is that the they build with XCode 15.3 even though the associated bug (#113123: Compilation fails with LLVM 17 and XCode 15.3) is still open. However, an LLVM 18 build fails in the bundled zlib, and this time I refrained from reporting a bug. GCC 11–14 also fail, but they are officially unsupported for macOS, so I won't be reporting that, unless I encounter issues on Linux.

Fixed and no longer reproducing bugs:

New bugs:

Little to no changes

Finally, I did not retest #113113: Build failure with Homebrew LLVM 14-17 on macOS, where I keep using the workaround of setting CMAKE_AR to Homebrew ar.

To sum up, 5 bugs fixed or no longer reproducing, 7 new, 11 with no changes, 1 not tested, and 2 new Valgrind ones. I don't like this trend and I miss the middle-8.0 releases that were fully clean under macOS.

Friday, June 14, 2024

Implementing a Table-Returning MySQL Function

Edit: links to Percona SEQUENCE_TABLE implementation added to the end

MySQL has many native functions and it's relatively straightforward to write new ones. While the official internals manual has a chapter on adding native functions, some aspects aren't immediately obvious, such as the possible return types for functions. The linked docs suggest that functions can return a double, an integer, or a string, but doesn't mention that they can also return JSON documents (i.e. JSON_ARRAY) and relational tables (i.e. JSON_TABLE). I have filed a bug for this omission, meanwhile this post will be my attempt at documenting the implementation of table-returning functions.

We will implement a table-returning function called LAST_INSERT_IDS, We will focus on the server infrastructure to support that, not on the function's purpose or the actual data it returns. Let's do the simplest possible thing: return a table with one column and one row.

Exploring the Codebase

Grepping the source tree for JSON_TABLE points to Table_function class declared in sql/table_function.h. Source code history then points to this commit, which in turn mentions Worklog #8867 (Add JSON table functions). Luckily for us it's a relatively old worklog, thus available publicly! From the worklog we can learn that any result tables will start as in-memory temporary tables which will spill to disk as needed. Good. The worklog also touches upon a lot of stuff elsewhere in the server so that the query optimizer will know how to deal with table-returning functions. We won't need to deal with that while implementing a new function, but it may provide debugging starters should anything go wrong.

Function class

To start writing code, we have to derive a new concrete class from Table_function and let the compiler tell us what methods must be implemented:

class Table_function_last_insert_ids final : public Table_function {
 public:
  bool init() override;
  bool fill_result_table() override;
  const char *func_name() const override;
  bool print(const THD *thd, String *str,
             enum_query_type query_type) const override;
  bool walk(Item_processor processor, enum_walk walk, uchar *arg) override;

 private:
  List<Create_field> *get_field_list() override;
  bool do_init_args() override;
};

But let's not implement them just yet (stub them out with assert(0) bodies to learn what gets called when), and let's look into instantiating the class object first. The regular functions get registered with MySQL in func_array variable in sql/item_create.cc, and the surrounding code knows how to create their objects. But JSON_TABLE is not there! Grepping shows that it is wired directly to the parser, probably because its arguments are non-trivial. Our function is much simpler in that regard, can we get away with putting it there instead of patching parser? It turns out that no, we cannot, because the simpler functions derive from Item_func class, which derives from Item, which is what this infrastructure expects. And our table-returning function derives from Table_function, which has no further ancestors.

Patching the parser

OK, off to the parser we go. To find a starting point, let's check what JSON_TABLE does. Grepping for Table_function_json in the parser (the worklog is not too specific in this area), we get a match for PT_table_factor_function class in sql/parse_tree_nodes.cc. Then we grep for that class in the parser grammar and we get a hint why the table-returning functions are harder to implement than other kinds of functions:

table_function:
        JSON_TABLE_SYM '(' expr ',' text_literal columns_clause ')'
        opt_table_alias
        {
          // Alias isn't optional, follow derived's behavior
          if ($8 == NULL_CSTR)
          {
            my_message(ER_TF_MUST_HAVE_ALIAS,
                       ER_THD(YYTHD, ER_TF_MUST_HAVE_ALIAS), MYF(0));
            MYSQL_YYABORT;
          }

          $$= NEW_PTN PT_table_factor_function($3, $5, $6, to_lex_string($8));
        }
      ;

We can see that the parser knows how to work with exactly one table-returning function that is named JSON_TABLE, thus we have to patch the parser. Let's keep everything under the non-terminal symbol table_function:

table_function:
          json_table_function
        | last_insert_ids_function
        ;

last_insert_ids_function:
          LAST_INSERT_IDS_SYM '(' ')'
          opt_table_alias
          {
            // Alias isn't optional, follow derived's behavior
            if ($4 == NULL_CSTR)
            {
                my_message(ER_TF_MUST_HAVE_ALIAS,
                           ER_THD(YYTHD, ER_TF_MUST_HAVE_ALIAS), MYF(0));
                MYSQL_YYABORT;
            }

            $$= NEW_PTN PT_last_insert_ids_function(to_lex_string($4));
          }
        ;

json_table_function:
        // As before

Unfortunately the opt_table_alias block must be duplicated. But, no increase in the parser shift/reduce conflicts! There is more not particularly interesting stuff to do in the lexer and the parser:

  • Add the function name to the lexer, {SYM("LAST_INSERT_IDS", LAST_INSERT_IDS_SYM)}, to sql/lex.h.
  • Declare this lexer symbol as a token for the parser. I did not make it a keyword because JSON_TABLE wasn't one neither: %token LAST_INSERT_IDS_SYM 10024.
  • Finally, declare last_insert_ids_function symbol to be of table_reference type, same as table_function was before.

Parse Tree Node class

That's enough for the lexer and the parser, next we need to implement PT_last_insert_ids_function class:

class PT_last_insert_ids_function : public PT_table_reference {
 public:
  PT_last_insert_ids_function(const LEX_STRING &table_alias)
      : m_table_alias{table_alias} {}

 private:
  const LEX_STRING m_table_alias;
};

Without a PT_last_insert_ids_function::contextualize method, which would actually create the Table_function_last_insert_ids object, the above implementation fails rather non-obviously:

mysqltest: At line 24: Query 'SELECT * FROM LAST_INSERT_IDS() as ids' failed.
ERROR 1096 (HY000): No tables used

Let's add that method by copying, pasting, and deleting the non-applicable bits of PT_table_factor_function::contextualize. The job of this method will be to create the function object and to assign a name and a table for the query optimizer for the result.

bool PT_last_insert_ids_function::contextualize(Parse_context *pc) {
  if (super::contextualize(pc)) return true;
  auto *const fn = new (pc->mem_root) Table_function_last_insert_ids{};
  if (unlikely(fn == nullptr)) return true;

  LEX_CSTRING alias;
  alias.length = strlen(fn->func_name());
  alias.str = sql_strmake(fn->func_name(), alias.length);
  if (unlikely(alias.str == nullptr)) return true;

  auto *const ti = new (pc->mem_root) Table_ident(alias, fn);
  if (ti == nullptr) return true;

  m_table_ref = pc->select->add_table_to_list(pc->thd, ti, m_table_alias.str, 0,
                                              TL_READ, MDL_SHARED_READ);
  if (m_table_ref == nullptr || pc->select->add_joined_table(m_table_ref))
    return true;

  return false;
}

Filling out the function class implementation

This implementation forces us to define the first method in Table_function_last_insert_ids, a very simple one:

const char *func_name() const override { return "last_insert_ids"; }

Let's run it!

SELECT * FROM LAST_INSERT_IDS() as ids;
...
Assertion failed: (0), function init, file table_function.h, line 440.

…which points to the stub Table_function_last_insert_ids::init(). Let's say we have no meaningful initialization at this point, so remove the assert from the stub:

Assertion failed: (false), function get_field_list, file table_function.h, line 464.

This is get_field_list method, and we have to implement it to describe the schema of the table we will be returning:

List<Create_field> *Table_function_last_insert_ids::get_field_list() {
  assert(fields.is_empty());

  auto *const field = new Create_field;
  field->init_for_tmp_table(MYSQL_TYPE_LONGLONG, MAX_BIGINT_WIDTH,
                            DECIMAL_NOT_SPECIFIED, /* is_nullable */ false,
                            /* is_unsigned */ true, 0, "insert_id");
  fields.push_back(field);

  return &fields;
}

This implementation assumes it is not a simple getter and thus is called exactly once. If the assumption will shown to be incorrect by the assert, I'll move the code to init and will reduce this one to a simple getter. Also, something will have to free the memory we are allocating for the field, that's for later.

Run again…

Assertion failed: (false), function do_init_args, file table_function.h, line 467.

For JSON_TABLE, this method is documented to "check whether given default values can be saved to fields." Let's assume that we have no meaningful actions here, remove the assert, return success, run again:

Assertion failed: (0), function fill_result_table, file table_function.h, line 442.

Finally we get to the second serious method that should provide the payload for the result table:

bool Table_function_last_insert_ids::fill_result_table() {
  assert(!fields.is_empty());
  assert(!table->materialized);
  assert(table->s->fields == 1);

  empty_table();

  auto *const field = get_field(0);
  assert(field->field_index() == 0);
  field->store(current_thd->first_successful_insert_id_in_prev_stmt);
  field->set_notnull();
  write_row();

  return false;
}

And…

SELECT * FROM LAST_INSERT_IDS() as ids;
insert_id
0

Success!

Implementing the remaining stubs

We still have two stub methods with asserts in them: print and walk. Let's try figuring out under what conditions they get called, again, by looking at JSON_TABLE implementation.

Let's start with print. It gets called by sql_lex.cc Table_ref::print. There are references to optimizer trace and something else around that. Let's see if that "something else" refers to the good old EXPLAIN:

EXPLAIN SELECT * FROM LAST_INSERT_IDS() as ids;
Assertion failed: (false), function print, file table_function.h, line 450.
7   mysqld                              0x000000010385ffb8 Table_function_last_insert_ids::walk(bool (Item::*)(unsigned char*), enum_walk, unsigned char*) + 0

Interesting! We tried to find what calls print, but we found what calls walk instead. Since our function is atomic with regard to the query optimizer and has no internal structure, there is nothing to walk, let's remove the assert:

EXPLAIN SELECT * FROM LAST_INSERT_IDS() as ids;
Assertion failed: (false), function print, file table_function.h, line 450.

Great, so the same EXPLAIN statement exercises both walk and print. Let's add a simple implementation for the latter:

bool print(const THD *, String *str, enum_query_type) const override {
  return !str->append(STRING_WITH_LEN("last_insert_ids()"));
}

And now the EXPLAIN!

EXPLAIN SELECT * FROM LAST_INSERT_IDS() as ids;
id	select_type	table	partitions	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	SIMPLE	ids	NULL	ALL	NULL	NULL	NULL	NULL	2	100.00	Table function: last_insert_ids; Using temporary
Warnings:
Note	1003	/* select#1 */ select `ids`.`insert_id` AS `insert_id` from last_insert_ids() `ids`

Memory management

One last thing is that Field object which was allocated in Table_function_last_insert_ids::get_field_list and never freed. Let's see if the standard tooling catches this error. AddressSanitizer under Linux includes LeakSanitizer, let's try it:

==31384==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 1440 byte(s) in 6 object(s) allocated from:
    #0 0xab9c0d1a99ac in operator new(unsigned long) (/home/laurynas/vilniusdb/last_insert_ids/_build-debug-llvm-14-san/runtime_output_directory/mysqld+0xb0599ac) (BuildId: 611e155cd8dc43455bd5d5e0ddbdfe53ae205fa4)
    #1 0xab9c0e6bb8bc in Table_function_last_insert_ids::get_field_list() /home/laurynas/vilniusdb/last_insert_ids/sql/table_function.cc:778:23
    #2 0xab9c0e6b5514 in Table_function::create_result_table(THD*, unsigned long long, char const*) /home/laurynas/vilniusdb/last_insert_ids/sql/table_function.cc:64:46
    #3 0xab9c0e0024fc in Table_ref::setup_table_function(THD*) /home/laurynas/vilniusdb/last_insert_ids/sql/sql_derived.cc:933:23
    #4 0xab9c0e388bfc in Query_block::resolve_placeholder_tables(THD*, bool) /home/laurynas/vilniusdb/last_insert_ids/sql/sql_resolver.cc:1310:15
    #5 0xab9c0e3854a4 in Query_block::prepare(THD*, mem_root_deque<Item*>*) /home/laurynas/vilniusdb/last_insert_ids/sql/sql_resolver.cc:247:7
    #6 0xab9c0e3e3da4 in Sql_cmd_select::prepare_inner(THD*) /home/laurynas/vilniusdb/last_insert_ids/sql/sql_select.cc:484:17
    #7 0xab9c0e3e2d74 in Sql_cmd_dml::prepare(THD*) /home/laurynas/vilniusdb/last_insert_ids/sql/sql_select.cc:399:11
    #8 0xab9c0e3e4360 in Sql_cmd_dml::execute(THD*) /home/laurynas/vilniusdb/last_insert_ids/sql/sql_select.cc:539:9
    #9 0xab9c0e263934 in mysql_execute_command(THD*, bool, unsigned long long*) /home/laurynas/vilniusdb/last_insert_ids/sql/sql_parse.cc:5489:29
    #10 0xab9c0e25c8c8 in dispatch_sql_command(THD*, Parser_state*, unsigned long long*) /home/laurynas/vilniusdb/last_insert_ids/sql/sql_parse.cc:6276:21
    #11 0xab9c0e253894 in dispatch_command(THD*, COM_DATA const*, enum_server_command) /home/laurynas/vilniusdb/last_insert_ids/sql/sql_parse.cc:2565:7
    #12 0xab9c0e259124 in do_command(THD*) /home/laurynas/vilniusdb/last_insert_ids/sql/sql_parse.cc:1745:18
    #13 0xab9c0e7f6f10 in handle_connection(void*) /home/laurynas/vilniusdb/last_insert_ids/sql/conn_handler/connection_handler_per_thread.cc:307:13
    #14 0xab9c12076ee8 in pfs_spawn_thread(void*) /home/laurynas/vilniusdb/last_insert_ids/storage/perfschema/pfs.cc:3022:3
    #15 0xfbb124cc5978 in start_thread nptl/./nptl/pthread_create.c:447:8
    #16 0xfbb124d2ba48 in thread_start misc/../sysdeps/unix/sysv/linux/aarch64/clone3.S:76

Yep, a leak. Let's try freeing that memory in the destructor:

Table_function_last_insert_ids::~Table_function_last_insert_ids() {
  assert(fields.elements == 1);
  delete fields.head();
}

Run again, no leak or any other Sanitizer error.

Summary checklist

So, to summarize, the checklist for adding a new table-returning function to MySQL:

  • Add the function name as a lexer symbol in sql/lex.h
  • Add the function name symbol as a token in the parser (sql/sql_yacc.yy)
  • In the parser, rename table_function to json_table_function and add new table_function with json_table_function and new function alternatives. All these symbols must be typed as table_reference
  • Create a new parse tree node class, for example:
class PT_last_insert_ids_function : public PT_table_reference {
  using super = PT_table_reference;

 public:
  PT_last_insert_ids_function(const LEX_STRING &table_alias)
      : m_table_alias{table_alias} {}

  bool contextualize(Parse_context *pc) override;

 private:
  const LEX_STRING m_table_alias;
};

bool PT_last_insert_ids_function::contextualize(Parse_context *pc) {
  if (super::contextualize(pc)) return true;
  auto *const fn = new (pc->mem_root) Table_function_last_insert_ids{};
  if (unlikely(fn == nullptr)) return true;

  LEX_CSTRING alias;
  alias.length = strlen(fn->func_name());
  alias.str = sql_strmake(fn->func_name(), alias.length);
  if (unlikely(alias.str == nullptr)) return true;

  auto *const ti = new (pc->mem_root) Table_ident(alias, fn);
  if (ti == nullptr) return true;

  m_table_ref = pc->select->add_table_to_list(pc->thd, ti, m_table_alias.str, 0,
                                              TL_READ, MDL_SHARED_READ);
  if (m_table_ref == nullptr || pc->select->add_joined_table(m_table_ref))
    return true;

  return false;
}
  • Create a new table function class:
class Table_function_last_insert_ids final : public Table_function {
 public:
  ~Table_function_last_insert_ids() override;

  bool init() override { return false; }

  bool fill_result_table() override;

  const char *func_name() const override { return "last_insert_ids"; }

  bool print(const THD *, String *str, enum_query_type) const override {
    return !str->append(STRING_WITH_LEN("last_insert_ids()"));
  }

  bool walk(Item_processor, enum_walk, uchar *) override {
    return false;
  }

 private:
  List<Create_field> *get_field_list() override;

  bool do_init_args() override { return false; }

  List<Create_field> fields;
};

Table_function_last_insert_ids::~Table_function_last_insert_ids() {
  assert(fields.elements == 1);
  delete fields.head();
}

List<Create_field> *Table_function_last_insert_ids::get_field_list() {
  assert(fields.is_empty());

  auto *const field = new Create_field;
  field->init_for_tmp_table(MYSQL_TYPE_LONGLONG, MAX_BIGINT_WIDTH,
                            DECIMAL_NOT_SPECIFIED, /* is_nullable */ false,
                            /* is_unsigned */ true, 0, "insert_id");
  fields.push_back(field);

  return &fields;
}

bool Table_function_last_insert_ids::fill_result_table() {
  assert(!fields.is_empty());
  assert(!table->materialized);
  assert(table->s->fields == 1);

  empty_table();

  auto *const field = get_field(0);
  assert(field->field_index() == 0);
  field->store(current_thd->first_successful_insert_id_in_prev_stmt);
  field->set_notnull();
  write_row();

  return false;
}

Percona SEQUENCE_TABLE

After I have posted this, Percona's Yura Soroking pointed out on LinkedIn their implementation of SEQUENCE_TABLE function: blog post, the initial implementation source code. This implementation follows similar framework as the toy implementation above.

Tuesday, May 28, 2024

A MySQL 8.4.0 branch with patches

While writing the previous post, I noticed that I didn't have a central location for the patches that I submitted to Oracle. Some were in local branches, some were .patch files lying around. So now I pushed a tree that has all those patches applied in a single place: patched-mysql-8.4.0, and I even added a README. This tree hopefully will make it easier to rebase on future Oracle releases. I had maintained similar branches before while pushing some of the Percona patches to Oracle around early 8.0 times.

As for the patches themselves, the majority of them add various missing features for the clone plugin to be able to support more than one transactional storage engine. There is also a compilation fix and a slightly improved AddressSanitizer support patch.

All MySQL 8.4.0 users are advised to migrate ASAP! /s

Wednesday, May 08, 2024

Building and testing MySQL 8.0.37 & 8.4.0 on macOS

The first MySQL LTS release, 8.4.0, is out, together with 8.0.37! Which means it's time for me to build and test them in my main work environment, continuing the series (8.3.0/8.0.36, 8.2.0/8.0.35).

The first surprise is that both these releases do not build with the current XCode (15.3 at the time of writing), because the LLVM 17 compilation failure I previously reported (#113123: Compilation fails with LLVM 17 and XCode 15.3) is not fixed yet and started affected Apple toolchain too. I am using Homebrew-packaged LLVM 16 for all the builds of these versions. I didn't do any LLVM 18 nor GCC testing neither.

For the good news, I no longer get a build-breaking warning in NDB, even though its bug (#113662) is still open. Finally, for the build-related no-news, nothing has changed with regard to system vs bundled libraries: both versions continue to build with all the system libraries, except for zlib.

On to the testsuite. It seems that 8.4.0 release build testsuite takes some 20% less time than the 8.0.37 one. It must be the payoff from the deprecated feature removal removing their tests too, although I did not try to confirm that.

Fixed bugs and no longer reproducing test failures on open bugs:

New test failures:

I wanted to call the following the list of bugs with no changes, but I can't. While some of the bugs indeed have no changes, there is also a small but definite creep there: a test that only failed in one configuration before now fails in two. A second sibling test started failing with same symptoms. That's definitely not "no changes", but not new independent bugs neither. It's worrying.

I did not re-test two bugs:

Thus, the numbers are: 4 new bugs reported, 5 fixed or no longer reproducing, 10 bugs with no changes or with a bit of creep, 2 non-tested.

But wait, there's more! I have also sent some patches to Oracle by OCA! I am only counting the recent patches I developed at VilniusDB and not the ones I submitted at Percona. So how many have they applied since 8.3.0?

So, zero. Of course two out of four patches are for clone with a 2nd transactional storage engine, thus the excuse of Oracle MySQL not having such engine sounds plausible.

That's about it. I am still running Valgrind tests, and they should complete any week now, unless I have to kill them to take their machine for something else. I will not be updating this post or writing a new one unless their results are very, very unexpected. Otherwise here's to the new MySQL releases and hopefully I will continue the series in three months.

Tuesday, January 30, 2024

Introducing patch2testlist for MySQL development

I wrote a small shell utility patch2testlist that might be useful for fellow MySQL developers. It reads a diff and outputs the list of tests touched in this diff to run in a format suitable for mysql-test-run.pl consumption. Furthermore, when provided with a path to the source tree of the diff, it handles included files.

There are two ways to invoke it.

  1. Quick-and-dirty mode that does not handle included files:

    $ ./mtr `git diff | patch2testlist` ...
    
  2. Thorough mode that considers included files, if the source tree path is given:

    $ ./mtr `git diff | patch2testlist ../..` ...
    

What does it do? Let's consider an example:

$ git diff | diffstat
 mysql-test/extra/rpl_tests/rpl_replica_start_after_clone.inc                                    |    2 
 mysql-test/include/keyring_tests/binlog/rpl_encryption_master_key_rotation_at_startup.inc       |    5 -
 mysql-test/include/keyring_tests/mats/rpl_encryption.inc                                        |    2 
 mysql-test/include/keyring_tests/mats/rpl_encryption_master_key_generation_recovery.inc         |    2 
 mysql-test/suite/auth_sec/include/acl_tables_row_locking_test.inc                               |    4 
 mysql-test/suite/binlog/t/binlog_restart_server_with_exhausted_index_value.test                 |    1 
 mysql-test/suite/component_keyring_file/inc/rpl_setup_component.inc                             |    1 
 mysql-test/suite/innodb/t/log_8_0_11_case1.test                                                 |    1 
 mysql-test/suite/rocksdb/r/sys_tables.result                                                    |    2 
 mysql-test/suite/rocksdb/r/sys_tables_acl_tables_row_locking.result                             |  384 +++++++++++++++++++---------------------------------------------------------------
 mysql-test/suite/rocksdb/r/sys_tables_is_statistics_mysql.result                                |    4 
 mysql-test/suite/rocksdb/r/sys_tables_mysqlcheck.result                                         |    8 -
 mysql-test/suite/rpl/t/rpl_cloned_slave_relay_log_info.test                                     |    4 
 mysql-test/suite/rpl/t/rpl_encryption.test                                                      |    3 
 mysql-test/suite/rpl/t/rpl_encryption_master_key_generation_recovery.test                       |    3 
 mysql-test/suite/rpl/t/rpl_encryption_master_key_rotation_at_startup.test                       |    5 -
 mysql-test/suite/rpl/t/rpl_gtid_innodb_sys_header.test                                          |    2 
 mysql-test/suite/rpl_gtid/t/rpl_gtid_xa_commit_failure_before_gtid_externalization.test         |    1 
 mysql-test/suite/rpl_gtid/t/rpl_gtid_xa_commit_one_phase_failure_before_prepare_in_engines.test |    1 
 mysql-test/suite/rpl_gtid/t/rpl_gtid_xa_prepare_failure_before_prepare_in_engines.test          |    1 
 mysql-test/suite/rpl_gtid/t/rpl_gtid_xa_rollback_failure_before_gtid_externalization.test       |    1 
 mysql-test/suite/rpl_nogtid/t/rpl_assign_gtids_to_anonymous_transactions_clone.test             |    4 
 mysql-test/suite/rpl_nogtid/t/rpl_gtid_mode.test                                                |    5 -
 mysql-test/suite/rpl_nogtid/t/rpl_nogtid_encryption_read.test                                   |    3 
 mysql-test/suite/test_services/t/test_host_application_signal_plugin.test                       |    3 
 mysql-test/t/basedir.test                                                                       |    5 -
 mysql-test/t/mysqld_daemon.test                                                                 |    3 
 mysql-test/t/mysqld_safe.test                                                                   |   27 ++---
 mysql-test/t/restart_server.test                                                                |    3 
 mysql-test/t/restart_server_no_acl.test                                                         |    3 
...
$ git diff | patch2testlist
binlog.binlog_restart_server_with_exhausted_index_value innodb.log_8_0_11_case1 main.basedir
main.mysqld_daemon main.mysqld_safe main.restart_server main.restart_server_no_acl
rocksdb.sys_tables rocksdb.sys_tables_acl_tables_row_locking
rocksdb.sys_tables_is_statistics_mysql rocksdb.sys_tables_mysqlcheck
rpl.rpl_cloned_slave_relay_log_info rpl.rpl_encryption
rpl.rpl_encryption_master_key_generation_recovery
rpl.rpl_encryption_master_key_rotation_at_startup rpl.rpl_gtid_innodb_sys_header
rpl_gtid.rpl_gtid_xa_commit_failure_before_gtid_externalization
rpl_gtid.rpl_gtid_xa_commit_one_phase_failure_before_prepare_in_engines
rpl_gtid.rpl_gtid_xa_prepare_failure_before_prepare_in_engines
rpl_gtid.rpl_gtid_xa_rollback_failure_before_gtid_externalization
rpl_nogtid.rpl_assign_gtids_to_anonymous_transactions_clone rpl_nogtid.rpl_gtid_mode
rpl_nogtid.rpl_nogtid_encryption_read test_services.test_host_application_signal_plugin

The quick-and-dirty mode above does not require a hundred line script, a ten-line one will do. But notice that several of the changed files in the diffstat output are test include files (i.e. rpl_replica_start_after_clone.inc). Ideally we'd want to run any tests that include (directly and indirectly) such files, and the ten-line script does not handle this case.

That's what the other ninety lines of the script do. If the optional source tree path argument is given, then it greps for any included files under mysql-test/, then greps for newly-found files and so on until it finds no more:

$ git diff | patch2testlist ../..
auth_sec.acl_tables_row_locking binlog.binlog_restart_server_with_exhausted_index_value
component_keyring_file.rpl_binlog_cache_encryption
component_keyring_file.rpl_binlog_cache_temp_file_encryption
component_keyring_file.rpl_default_table_encryption component_keyring_file.rpl_encryption
component_keyring_file.rpl_encryption_master_key_generation_recovery
component_keyring_file.rpl_encryption_master_key_rotation_at_startup innodb.log_8_0_11_case1
main.basedir main.mysqld_daemon main.mysqld_safe main.restart_server main.restart_server_no_acl
rocksdb.sys_tables rocksdb.sys_tables_acl_tables_row_locking rocksdb.sys_tables_is_statistics_mysql
rocksdb.sys_tables_mysqlcheck rpl.rpl_cloned_slave_relay_log_info rpl.rpl_encryption
rpl.rpl_encryption_master_key_generation_recovery rpl.rpl_encryption_master_key_rotation_at_startup
rpl.rpl_gtid_innodb_sys_header rpl.rpl_slave_start_after_clone
rpl_gtid.rpl_gtid_only_start_replica_after_clone
rpl_gtid.rpl_gtid_xa_commit_failure_before_gtid_externalization rpl_gtid.rpl_gtid_xa_commit_one_phase_failure_before_prepare_in_engines
rpl_gtid.rpl_gtid_xa_prepare_failure_before_prepare_in_engines
rpl_gtid.rpl_gtid_xa_rollback_failure_before_gtid_externalization
rpl_nogtid.rpl_assign_gtids_to_anonymous_transactions_clone rpl_nogtid.rpl_gtid_mode
rpl_nogtid.rpl_nogtid_encryption_read test_services.test_host_application_signal_plugin

As you can see the list is now significantly longer, indicating a more thorough test run coverage of the diff. All this extra grepping takes about 90 seconds on my machine, if some popular include files are touched. I have no idea whether that's with hot or cold FS cache. I also don't know whether replacing grep with rg would it make it faster.

To minimize the false positives in included file search, grep considers the lines that don't start with the MTR language comment character #, and are like ...source...basename-of-included-file. This allows false positives in indented comments and inside string literals (that one should be rare) and it cannot tell apart files with the same name in different directories. In theory it also allows false negatives if an include file is referenced using a string variable to store its name. Any suggestions for better regexps are welcome.

It goes without saying that it is best applied on test-only patches. If you touch the source code, then you should be looking at whole MTR runs, or, if possible, MTR runs of selected suites. But if you are indeed working on a test-only patch, this script reduces the required test time effectively.

Should be portable but currently tested on macOS only. Feedback is welcome!

Wednesday, January 24, 2024

Building and testing MySQL 8.0.36 and 8.3.0 on macOS

The previous releases (8.0.35 and 8.2.0) resulted in me reporting fifteen bugs. Let's find out whether 8.0.36 and 8.3.0 will fare better on an M1 Mac.

Let's start with the build. Boost goes away as an external dependency in 8.3.0, removing the need to specify Boost-related CMake options, good. The server continues to build successfully with -DWITH_SYSTEM_LIBS=ON but now started requiring -DWITH_ZLIB=bundled, because 8.3.0 made the system libraries option govern zlib too, and the one in XCode is one patch level version too old. The Homebrew-installed version is ignored.

8.0.36 Release configuration builds with a single potentially-fatal warning: bug #113662 (NDB compilation error on macOS Release build). Finding this made me look, why is NDB built at all, if I did not add -DWITH_NDB=ON? This resulted in bug #113661 (NDB storage engine built ignoring -DWITH_NDB=OFF (which is OFF by default too)).

The most serious build-related issue I saw previously was incorrect query results if compiled with LLVM 15 and newer, reported as bug #113049 (MTR test json.array_index fails with a result difference) and bug #113046 (MTR tests for SELECT fail with ICP, MRR, possibly other flags). This issue has been fixed, although the bugs are still open (thus no release notes entries neither). As Tor Didriksen explained, they are open due to still remaining issues with recent MSVC compilers. But, LLVM works fine for me now and that's great.

The previous releases also required -ffp-contract=off compilation flag workaround to take care of some failures: bug #113047 (MTR test main.derived_limit fails with small cost differences), bug #113048 (MTR test gis.gis_bugs_crashes fails with a result difference). This has been mostly addressed, except that #113047 is fixed in 8.0.36 and 8.4.0 but not 8.3.0, so that failure still remains if the workaround is dropped.

The previous releases could not be compiled with LLVM 17, and no changes occurred here, bug #113123 (Compilation fails with LLVM 17) still applies.

Moving on to tests in Release, Debug, and Debug+ASan+UBsan configurations. Looking better than the last time, this is what I had to report:

So, to sum up, 10 bugs reported, 4 bugs confirmed fixed, 5 bugs (#113123, #113260, #113189, #113190, #113258) have no changes, and 1 bug (#113023) I did not test.

All in all, this looks OK. While no perfect clean testsuite results I was used to in some older releases, no miscompilation-like bugs neither, and that's fine.