Monday, October 23, 2023

Strong typing: comparing Rust newtype to C++

Rust source code often makes heavy use of the newtype idiom, where a new type is created for an underlying primitive type to differentiate it from other uses of the same underlying type. The term "newtype" comes from Haskell where it's not an idiom, but a keyword, thus a built-in language feature.

I was wondering why I never heard of newtype as a C++ developer, because C++ is obviously a strongly-typed language, where the same issue exists. There are different ways to approach it and this is my attempt to get the thoughts on the topic in order, there will be no earth-shattering insights.

Rust: newtype

Suppose you are developing a database and have transaction IDs and log sequence numbers. Both are u64 but are not interoperable in any way. So in Rust, a natural implementation would be to apply newtype idiom twice:

pub struct TransactionId(u64);

impl TransactionId {
    fn new(id: u64) -> Self {

    fn get(&self) -> u64 {

impl fmt::Display for TransactionId { ... }

pub struct LogSequenceNumber(u64);

impl LogSequenceNumber { ... }

Let's enumerate the options in C++.

C++: do nothing

Do nothing, and use std::uint64_t for both types. No compiler protection, no documentation at the type name, thus the most bug-prone option. Obviously there is nothing to stop us from using this same option in Rust too.

C++: use type aliases

Introduce type aliases:

// Can also be done with typedef, but let's stick to modern C++:
using transaction_id = std::uint64_t;
using log_sequence_number = std::uint64_t;

This expresses the intent, documents things whenever the type name appears, and is not too verbose. The downside is that it does not introduce new types, only aliases for existing ones, meaning that transaction IDs assign to LSNs and back freely.

C++: introduce new types

Introduce new types. Like in Rust, differently-named structs with identical fields can be used.

struct transaction_id {
  std::uint64_t val;

struct log_sequence_id {
  std::uint64_t val;

Now type safety is increased and the type mix-up is prevented by the compiler. But so are most operations with type variables, requiring writing extra code to have the desired functionality, compared to the first two options. Writing this extra code will be more verbose than the same in Rust because the latter has support for traits, which can have default implementations.

struct log_sequence_id {
  // explicit is important, we don't want to make the incompatible types
  // implicitly-covertable again inadvertently
  explicit log_sequence_id(std::uint64_t v) : val{v} {}

  log_sequence_id& operator += (std::size_t log_delta) {
    val += log_delta;
    return *this;

Naturally, limiting available operations is advantageous too, in both languages. For example, it makes no sense to add two transaction IDs together.

Since this is C++, meaning that we have the template-hammer, making all the problems look like template-nails for better or worse, we could try avoiding spelling out structs every time:

// Written this way only to show a point. The actual implementation would be more
// complex to be able to handle move-only types and wrap large objects efficiently.
template<typename T, typename Tag>
class newtype {
  explicit newtype(T v) : val{v} {}
  void set(T v) { val = v; }
  T get() const { return val; }
  T val;

struct log_sequence_id_tag{};
using log_sequence_id = newtype<std::uint64_t, log_sequence_id_tag>;

struct transaction_id_tag{};
using transaction_id = newtype<std::uint64_t, transaction_id_tag>;

Now introducing a newtype is reduced to two lines of code. Again, C++ developers do not usually discuss newtype but they do discuss strongly-typed using and typedefs, which is the same thing, called differently.

In most cases we are wrapping a single value of a primitive or string type. Those wrapped values are then operated using free functions or methods of some other classes. Thus, in this setting, this is a great option and we are done. But suppose we want to add some methods to the newly-introduced type instead of using free functions. The newtype template will not allow this, not unless we introduce inheritance:

using log_sequence_id_base = newtype<std::uint64_t, log_sequence_id_tag>;

class log_sequence_id : public log_sequence_id_base {

At which point the use of the newtype template becomes questionable and the code simplifies by folding the value into the class:

class log_sequence_id {
  std::uint64_t value;

Here we are back to creating a new type manually, just like before, without templates. This seems to be different from Rust, where a single-field struct will clearly show its newtype origins in the declaration, regardless of how much functionality it acquired later on.

So, there you have it. Both languages are strongly typed and have means to introduce new distinct types built on the existing ones, with Rust calling this newtype, and developers having a choice in C++ between type aliases, which don't actually increase type safety, to succinct templates and verbose types with some trade-offs.

Monday, September 25, 2023

Implementing durability in a MySQL storage engine

update 2023-09-28: edited for non-durable SE commits under group commit, and fixed the trx->flush_log_later discussion.

update 2023-09-27: Binlog group commits asks the storage engines to commit non-durably, will edit the post even more.

update 2023-09-26: trx->flush_log_later is actually used. Will edit the post.

Let's review how a MySQL storage engine should implement transaction durability by flushing / syncing WAL writes to disk. For performance reasons (group 2PC), let's also review when it specifically should not sync writes to disk. The reference durability implementation is, of course, InnoDB.

The main storage engine entry point is handlerton::commit. Since in general the storage engines participate in two-phase commit protocol with the binary log, there is also handlerton::prepare, and handlerton::flush_logs participates too. Let's ignore rollbacks, savepoints, explicit XA transactions, read only transactions, transactions on temporary tables only, crash recovery, and transaction coordinators other than the binlog.

Background: Group Commit

It was implemented (WL#5223) in its current form in MySQL 5.6, and its internals are described in this Mats Kindahl's blog post. I will not repeat everything here (and I'm sure I'd miss a lot of details), but for durability discussion, from the storage engine side, the group commit looks as follows:

  • prepare(t1) with reduced durability;
  • prepare(t2) with reduced durability;
  • prepare(tn) with reduced durability;
  • flush_logs(), making all the prepares above durable;
  • commit(t1) with reduced durability;
  • commit(t2) with reduced durability;
  • commit(tn) with reduced durability.

A surprise here is that the commits are performed with reduced durability too. How do reduced-durability commits implement full durability for the committed transactions, then? Turns out, the design of binlog group commit is only the commit of binlog itself is durable, and for the storage engines, prepares are made durable in batches and that's it. If their commits are lost, binlog crash recovery will roll forward the prepared transactions.

This design is counterintuitive if one thinks that innodb-flush-log-at-trx-commit=1, as documented, makes InnoDB commits durable in this setup, which it does not, and it is possible to see binlog crash recovery in action. Davi Arnaut reported this as bug #75519 in 2015, and IMHO few users are aware of this behavior.

Anyway, back to the implementation. Apparently the server developers did not want to change the prepare/commit handlerton interface, so the server durability request (full or reduced) is not passed in as an argument, but must be queried by thd_get_durability_property returning an enum with two possible values HA_REGULAR_DURABILITY and HA_IGNORE_DURABILITY.

Later, in 8.0, this durability property was reused to implement correct & performant commit order on multithreaded replicas, when binlog is disabled (WL#7846).

InnoDB: handlerton::commit

Implemented by innobase_commit.

Comes last in the group commit, but let's review it first. In other setups it might be the only entry point.

Wherever I say "write [to the disk] and sync|flush", the mental model is that of a buffered write with a separate flush/sync afterwards. If O_SYNC or O_DSYNC is used to write the log instead, then the write and the sync are a single operation.

Let's ignore non-default innobase_commit_concurrency setups.

First the code sets trx->flush_log_later and then goes through the call stack innobase_commit -> innobase_commit_low -> trx_commit_for_mysql -> trx_commit -> trx_commit_low. The last one calls trx_write_serialisation_history, which makes the necessary commit writes to a mini-transaction, then trx_commit_low commits the mini-transaction by creating the redo log records. Nothing is done for durability yet at this point. Finally trx_commit_low calls trx_commit_in_memory, which sees that trx->flush_log_later is set and sets trx->must_flush_log_later. (if trx_commit is called from other API than SE commit, then flush_log_later will not be set and the durability will be ensured in this function).

At this point the callstack returns all the way back to innobase_commit, which calls trx_complete_for_mysql, which now checks trx->must_flush_log_later (set), durability request (reduced), and whether this is a DDL transaction. If it is not, then nothing is done, and InnoDB reports the commit as successful. If it is a DDL transaction, then log is flushed ignoring the reduced durability request and innodb_flush_log_at_trx_commit setting..

The above mentioned that DDL transactions are flushed more than regular ones, regardless of innodb_flush_log_at_trx_commit setting. This is a deliberate design decision, which has to do with the data dictionary, I believe. To understand why, consider the relevant parts of server startup sequence:

  1. InnoDB comes up, and performs its own recovery from its redo log.
  2. Server data dictionary is initialized.
  3. Binlog crash recovery runs.
If any DD transactions are trapped in prepared state by the time of the data dictionary initialization, they will be invisible, while their disk changes (e.g. a tablespace renamed on disk) will be present on disk. This inconsistency is likely to be fatal for the DD, and binlog crash recovery runs too late to recover from that.

InnoDB: handlerton::prepare

Implemented by innobase_xa_prepare.

It calls trx_prepare_for_mysql -> trx_prepare -> trx_prepare_low, which updates the undo log state for the transaction a mini-transaction, committing which makes the top-level transaction prepared. Then trx_prepare calls trx_flush_logs, which will either do nothing or write and flush the redo log up to the mini-transaction's commit LSN, depending on the server durability request.

InnoDB: handlerton::flush_logs

Implemented by innobase_flush_logs.

It has a bool argument telling whether it was invoked as a part of binlog group commit, which is the interesting case here, ignoring the other option of it being invoked by FLUSH LOGS SQL statement. It writes the redo log buffer to disk and flushes it according it to innodb_flush_log_at_trx_commit value.

Bugs reported while writing this:

Bugs found while writting this:

Bugs that made me write this:

Friday, August 25, 2023

MySQL Build Times: Use Ninja

I had noticed Ninja as one of the possible CMake generators long time ago, but never paid attention to it, as I could not imagine it being better than Make so much that it'd be worth switching. Then, when I posted my MySQL -ftime-trace results, I got a comment on LinkedIn that Ninja visualizes build time nicely. I tried that, and it did, and I went back to Make builds.

A few months later, I am looking at Vittorio Romeo's "Improving Compilation Times" presentation slides (download them, do not read inline on GitHub; there is also the talk video itself), and the very first low-hanging fruit advice is "use Ninja".

OK, so let's actually try, say, Debug build on Facebook MySQL 8.0.28:

  • make -j13: 4m43s
  • ninja: 4m17s

A 10% improvement with roughly zero effort is nice. There are other niceties too: you don't have to figure out the right make parallelism argument for -j, as Ninja handles that automatically, and the terminal is not spammed with the build log of all the source files that have been built uneventfully. Only compiler warnings and any irregular build output is there.

To use it, add -G Ninja to CMake invocation, and then use ninja instead of make to build. I have patched my scripts.

Tuesday, May 09, 2023

Tips for making MySQL builds & tests faster

Recently, Mark discovered that a part of FB MySQL sources were recompiled three times in a single build. That has been fixed, and I also played with clang -ftime-trace to see where the build time goes. I believe there is more to this topic, so I wanted to organize my thoughts on the subject.

In software development, the shorter the change-build-test cycle is, the higher the developer productivity. Yet MySQL is not making it easy to have short "build" and "test" steps in this cycle. A reasonably powerful Intel Core i9 laptop used to take 20-30 minutes for a clean debug build without the unit tests. The MTR testsuite takes hours, and it is possible to make it run for days if you wish.

What can be done about this? Here's what's working for me, focusing on 8.0 trees. There is no silver bullet here, and some of the suggestions might be even somewhat effort-intensive to set up. Luckily, most suggestions are independent and optional.

Source trees and build artifacts

TL;DR: build incrementally as much as possible. Disk space is cheap, while your time is expensive. A common "antipattern" (quotes because there is nothing wrong with such workflow otherwise) is to have a single local git clone with a single build directory. Changing a branch with git checkout forces a clean build. Changing the build type (i.e. you built Debug previously, now need RelWithDebInfo, or Debug + AddressSanitizer enabled) forces a clean build. Don't force clean builds; keep all the previously-built artifacts around as much as possible:

  • Have one build dir for every build type (e.g. build/debug, build/release, build/debug-asan). Never delete a build dir unless forced to.
  • Never use git checkout in the local clone directory, use git worktrees for everything. Only delete a worktree (and its build dirs) once that branch is merged.
  • You are free to store the build dirs wherever you want, but in order to keep track which build dir belongs to which source tree, for me, the simplest option was to keep them below the source tree. Oracle MySQL has build/ in its .gitignore, so that's a good prefix dir for them.

One objection to incremental builds is that they might somehow result in differences in build artifacts compared to the same clean builds, and that would be bad. In practice, sometimes something does break an incremental build with a build error, forcing a clean rebuild–an occasional checkpoint if you will. I have never encountered a silent divergence that was somehow detrimental, and your CI/CD farm will build your PRs cleanly anyway.

Now if you follow the worktree advice, you are likely to have quite a few of them. Some of them will be your personal feature branches, while others will be shared feature branches, and main/master/8.0/5.7 trunk branches where others commit and push. Set up a cron job to pull the later and build overnight. Pros: ready builds for your work in the morning. Cons: you left your work last night with a working build, and someone pushed a commit that broke the build for you, which is what you find in the morning. IMHO the pros outweigh the cons.

Some MySQL source trees, like the Meta one, are incompatible with git worktrees. Luckily there is a not too-complicated workaround of adding the following CMake options: -DMYSQL_GITHASH=0 -DMYSQL_GITDATE=2100-02-29 -DROCKSDB_GITHASH=0 -DROCKSDB_GITDATE=2100-02-29. Maybe one day it will be fixed properly.

A thing that did not work well for me is ccache. While easy to set up, at least twice I had to waste a lot of time on apparent source-binary mismatch only to figure out that ccache is substituting an incorrect binary object file. That was enough for me to drop it, and I could never measure its benefit anyway.

Build options

According to the MySQL docs, there are over 160 CMake options. Some of them can affect the build times for better.

Use system libraries as much possible

You are in the business of developing MySQL, not its bundled 3rd party libraries. If you are lucky, you are also not in the business of developing MySQL integration with any of them. So, ignore the bundled libraries as much as possible and use your system ones: -DWITH_SYSTEM_LIBS=ON, after installing all the dependencies (which I won't list here). Unfortunately, that's only a theory, and in practice there's a difference between theory and practice. Let's take macOS, for example, to see which of the bundled still have to be used:

  • 8.0.33: -DWITH_RAPIDJSON=bundled
  • 8.0.32-29: system libs only, yay!
  • 8.0.28-27: -DWITH_RAPIDJSON=bundled -DWITH_LZ4=bundled -DWITH_FIDO=bundled
  • 8.0.26: -DWITH_RAPIDJSON=bundled -DWITH_LZ4=bundled

That's not too bad, and, given that we are stuck with every release for at least three months (much longer than that if using, say, the Meta tree), it's worth figuring out.

Skip the unit tests but be careful

-DWITH_UNIT_TESTS=OFF is by far the single most time-saving CMake option. Usually it is also not as bad as it may sound for development because the MTR tests are still there, and depending on what you are working on, the MTR tests might cover your testing needs completely. The biggest risk there is updating some internal API in a not particularly interesting way and forgetting to update its users in the unit tests. I am still figuring out the best way here to have my cake and eat it too.

Older versions: skip the functionality you don't need

This is quickly becoming an obsolete tip, but including it for completeness. Group replication used to be an optional build part, and the X plugin still is (-DWITH_MYSQLX=OFF). Unfortunately, disabling the X plugin breaks quite a few unrelated-to-X MTR tests, so I don't do that anymore.


Use libeatmydata

Install Stewart Smith's libeatmydata and always use it, except if building with sanitizers on Linux. It is packaged for Ubuntu, macOS Homebrew, and likely elsewhere. It cuts about 25% of MTR testsuite runtime by silently substituting all the fsync and related calls with no-ops for the tested processes. It is transparent, invisible, not getting in your way etc.–a pure win. Its invocation in the context of MTR tests is a handful to type, so I am using a shell script helper:

UNAME_OUT="$(uname -s)"
if [ "$UNAME_OUT" = "Darwin" ]; then
    if [ "$(arch)" = "arm64" ]; then
    unset BREW
    export MTR_EMD=(
    unset EMD_LIBDIR
    export MTR_EMD=(

mtr_emd() {
    ./mtr "${MTR_EMD[@]}" "$@"

MTR also provides the --mem option which tries to use a non-persistent filesystem for running tests. In theory this should speed it up even more than libeatmydata, but I could never get it work reliably.

Use --parallel, find its best value for your machine & your tests.

For instance, for single-server tests on an Apple M1 Max (8 performance and 2 efficiency cores) the fastest I found was --parallel=15. Replication tests complicate this by spawning three servers per test instead of one.

Don't test what you don't need

Now we are deep in Captain Obvious territory, but still. Developing a plugin, say clone? Great, most of the time --suite=clone is enough. Sure, even with plugins the separation is not perfect, and there are dependencies, for example group replication uses clone too, but it's a start. You are less lucky if you work on InnoDB, or something in i.e. THD or Handler that is a dependency of everything.


Throwing hardware at the build time problem is a great option if you have the resources. IMHO, there are two main routes to consider: the offline-portable one & the connected-powerful one.

If you need a laptop and want the option of working offline, then Apple Silicon is second to none. Replacing an Intel Core i9 laptop with a M1 Max one made MySQL builds five times faster. Five actual times! The downside is that macOS is not exactly a datacenter server OS, and so if you are the only one on the team on macOS, guess who just became the macOS port maintainer? There is an option of running Linux on Apple Silicon, which I heard virtualizes at near-bare metal speed, but I haven't explored it yet. Even then you become the ARM port maintainer, which is not that bad considering Graviton.

If a desktop is OK, then there are options, although I haven't tried this myself. Apple should still be in the running, and you could get an AMD CPU with up to 64 cores, which should make a short work of even MySQL build.

If a network is OK, then Sunny recommends using icecream to distribute compilation. I haven't tried this either.


I wrote down everything I knew about making MySQL build and test faster. Have I missed anything? Please comment.

Thursday, April 27, 2023

MySQL, clang -ftime-trace, & ClangBuildAnalyzer

TL;DR: install ClangBuildAnalyzer, compile MySQL with CC='clang -ftime-trace' CXX='clang++ -ftime-trace', run the analyzer, enjoy nice reports telling you why it's so slow.

Inspired by Mark's question on Twitter, I decided to try out clang -ftime-trace on MySQL source tree. This clang flag was added by Aras Pranckevičius (sure I'll call out a Lithuanian author by name) to LLVM 9.0, and it outputs per-source-file JSONs in Chrome Tracing format explaining where did the compilation time go. Now, instead of stating that for this file parsing took X seconds, template instantiation Y minutes, and register allocation Z microseconds it provides actionable information with object granularity, i.e. it tells you which exactly templates were expensive to instantiate etc.

To get those reports, -ftime-trace needs to be added to compilation flags. Don't add it to MySQL CMake CMAKE_CXX_FLAGS etc, because you'll have to get the rest of -O2 -DNDEBUG etc. right. Instead, pretend that this flag is a part of compiler invocation itself, and set CMAKE_CXX_COMPILER='clang++ -ftime-trace', and likewise for CMAKE_C_COMPILER. Or, use CC and CXX environment variables as I did in the TL;DR section.

A clean MySQL 8.0.33 results in a few thousands of those JSONs. Each one could be loaded in Chrome by navigating to chrome://tracing and loading a file. Individually. This blog post suggests zipping them all up, enabling Chrome to load that zip instead. But the post also warns that it takes a long time to load, and I can confirm Chrome crashing after some 40 minutes of work.

Luckily for us Aras wrote a project-level analyzer tool: ClangBuildAnalyzer. It handles those thousands of JSONs just fine, and very quickly, and for a regular 8.0.33 Release configuration build outputs the following:

**** Time summary:
Compilation (6522 times):
  Parsing (frontend):         4669.4 s
  Codegen & opts (backend):   3139.1 s

**** Files that took longest to parse (compiler frontend):
 16499 ms: ./unittest/gunit/xplugin/xpl/CMakeFiles/xpl_test_src.dir/mock/
 12057 ms: ./unittest/gunit/xplugin/xcl/CMakeFiles/xclient_unit_tests.dir/
 12011 ms: ./unittest/gunit/xplugin/xcl/CMakeFiles/xclient_unit_tests.dir/
 11907 ms: ./unittest/gunit/xplugin/xcl/CMakeFiles/xclient_unit_tests.dir/
 11533 ms: ./unittest/gunit/xplugin/xcl/CMakeFiles/xclient_unit_tests.dir/
 11454 ms: ./unittest/gunit/xplugin/xcl/CMakeFiles/xclient_unit_tests.dir/
 11085 ms: ./unittest/gunit/xplugin/xcl/CMakeFiles/xclient_unit_tests.dir/
 10930 ms: ./unittest/gunit/xplugin/xcl/CMakeFiles/xclient_unit_tests.dir/
 10201 ms: ./unittest/gunit/xplugin/xpl/CMakeFiles/xpl_test_src.dir/
 10195 ms: ./unittest/gunit/xplugin/xcl/CMakeFiles/xclient_unit_tests.dir/

**** Files that took longest to codegen (compiler backend):
 48974 ms: ./unittest/gunit/innodb/CMakeFiles/merge_innodb_tests-t.dir/
 31598 ms: ./unittest/gunit/xplugin/xpl/CMakeFiles/xpl_test_src.dir/mock/
 30849 ms: ./sql/CMakeFiles/sql_gis.dir/gis/
 30007 ms: ./sql/CMakeFiles/sql_gis.dir/gis/
 29611 ms: ./unittest/gunit/xplugin/xcl/CMakeFiles/xclient_unit_tests.dir/
 27047 ms: ./unittest/gunit/xplugin/xcl/CMakeFiles/xclient_unit_tests.dir/
 26776 ms: ./unittest/gunit/xplugin/xcl/CMakeFiles/xclient_unit_tests.dir/
 26253 ms: ./unittest/gunit/CMakeFiles/merge_large_tests-t.dir/
 25877 ms: ./sql/CMakeFiles/sql_gis.dir/gis/
 25600 ms: ./unittest/gunit/xplugin/xcl/CMakeFiles/xclient_unit_tests.dir/

**** Templates that took longest to instantiate:
109225 ms: std::__function::__func<(lambda at /Users/laurynas/vilniusdb/mysql-8... (7110 times, avg 15 ms)
107390 ms: std::__function::__func<(lambda at /Users/laurynas/vilniusdb/mysql-8... (7110 times, avg 15 ms)
 93389 ms: std::function<void ()>::function<(lambda at /Users/laurynas/vilniusd... (2370 times, avg 39 ms)
 92573 ms: std::__function::__value_func<void ()>::__value_func<(lambda at /Use... (2370 times, avg 39 ms)
 92539 ms: std::function<void ()>::function<(lambda at /Users/laurynas/vilniusd... (2370 times, avg 39 ms)
 91689 ms: std::__function::__value_func<void ()>::__value_func<(lambda at /Use... (2370 times, avg 38 ms)
 91551 ms: std::__function::__value_func<void ()>::__value_func<(lambda at /Use... (2370 times, avg 38 ms)
 90546 ms: std::__function::__value_func<void ()>::__value_func<(lambda at /Use... (2370 times, avg 38 ms)
 64823 ms: std::__function::__alloc_func<(lambda at /Users/laurynas/vilniusdb/m... (7110 times, avg 9 ms)
 63866 ms: std::__function::__alloc_func<(lambda at /Users/laurynas/vilniusdb/m... (7110 times, avg 8 ms)
 39867 ms: std::__function::__func<(lambda at /Users/laurynas/vilniusdb/mysql-8... (4740 times, avg 8 ms)
 39246 ms: std::__function::__func<(lambda at /Users/laurynas/vilniusdb/mysql-8... (4740 times, avg 8 ms)
 22360 ms: std::unique_ptr<std::unordered_multimap<const MDL_key *, MDL_ticket_... (1016 times, avg 22 ms)
 22285 ms: std::unique_ptr<std::unordered_multimap<const MDL_key *, MDL_ticket_... (1018 times, avg 21 ms)
 22138 ms: std::default_delete<std::unordered_multimap<const MDL_key *, MDL_tic... (1018 times, avg 21 ms)
 17698 ms: std::copy_n<const char16_t *, unsigned long, char16_t *> (3325 times, avg 5 ms)
 17666 ms: testing::internal::ValueArray<bool, bool>::operator ParamGenerator<b... (525 times, avg 33 ms)
 17432 ms: std::copy<const char16_t *, char16_t *> (3326 times, avg 5 ms)
 17359 ms: std::copy_n<const wchar_t *, unsigned long, wchar_t *> (3326 times, avg 5 ms)
 17057 ms: std::copy<const wchar_t *, wchar_t *> (3326 times, avg 5 ms)
 16950 ms: net::basic_waitable_timer<std::chrono::steady_clock>::cancel (150 times, avg 113 ms)
 16946 ms: net::io_context::cancel<net::basic_waitable_timer<std::chrono::stead... (150 times, avg 112 ms)
 16813 ms: std::copy_n<const char *, unsigned long, char *> (3314 times, avg 5 ms)
 16622 ms: std::copy_n<const char32_t *, unsigned long, char32_t *> (3326 times, avg 4 ms)
 16448 ms: std::__copy<const char16_t *, const char16_t *, char16_t *, 0> (3326 times, avg 4 ms)
 16373 ms: std::copy<const char *, char *> (3326 times, avg 4 ms)
 16342 ms: std::copy<const char32_t *, char32_t *> (3326 times, avg 4 ms)
 16110 ms: net::basic_waitable_timer<std::chrono::steady_clock>::~basic_waitabl... (139 times, avg 115 ms)
 16047 ms: std::__copy<const wchar_t *, const wchar_t *, wchar_t *, 0> (3326 times, avg 4 ms)
 15781 ms: std::unordered_multimap<const MDL_key *, MDL_ticket_store::MDL_ticke... (1018 times, avg 15 ms)

**** Template sets that took longest to instantiate:
333347 ms: std::function<$>::function<$> (8367 times, avg 39 ms)
330713 ms: std::__function::__value_func<$>::__value_func<$> (8367 times, avg 39 ms)
270430 ms: std::__function::__func<$>::__func (8367 times, avg 32 ms)
236822 ms: testing::internal::FunctionMocker<$>::Invoke (2370 times, avg 99 ms)
229859 ms: std::__function::__alloc_func<$>::__alloc_func (25098 times, avg 9 ms)
227397 ms: testing::internal::FunctionMocker<$>::InvokeWith (2370 times, avg 95 ms)
174340 ms: std::forward_as_tuple<$> (34235 times, avg 5 ms)
163483 ms: std::unique_ptr<$> (75303 times, avg 2 ms)
145187 ms: std::tuple<$> (45300 times, avg 3 ms)
141977 ms: std::__function::__func<$>::__clone (16733 times, avg 8 ms)
 98501 ms: std::__hash_table<$> (11200 times, avg 8 ms)
 94028 ms: std::allocator_traits<$> (31254 times, avg 3 ms)
 93971 ms: std::__compressed_pair<$>::__compressed_pair<$> (35840 times, avg 2 ms)
 91566 ms: std::unordered_map<$> (8606 times, avg 10 ms)
 90396 ms: std::decay<$> (20916 times, avg 4 ms)
 82788 ms: std::copy<$> (16492 times, avg 5 ms)
 80875 ms: std::map<$> (14763 times, avg 5 ms)
 77163 ms: std::__copy<$> (16493 times, avg 4 ms)
 73875 ms: std::vector<$>::push_back (6296 times, avg 11 ms)
 71983 ms: std::copy_n<$> (13989 times, avg 5 ms)
 71340 ms: stdx::expected<$> (9078 times, avg 7 ms)
 70331 ms: std::__tree<$> (19917 times, avg 3 ms)
 68211 ms: std::vector<$> (32556 times, avg 2 ms)
 67929 ms: std::vector<$>::__push_back_slow_path<$> (6078 times, avg 11 ms)
 62076 ms: std::vector<$>::__swap_out_circular_buffer (8254 times, avg 7 ms)
 57906 ms: std::__decay<$> (12966 times, avg 4 ms)
 54976 ms: std::__compressed_pair<$> (17033 times, avg 3 ms)
 53465 ms: std::unordered_map<$>::unordered_map (4697 times, avg 11 ms)
 53099 ms: testing::internal::MatcherBase<$>::MatcherBase<$> (5591 times, avg 9 ms)
 52731 ms: testing::internal::MatcherBase<$>::Init<$> (5591 times, avg 9 ms)

**** Functions that took longest to compile:
  2560 ms: MYSQLparse(THD*, Parse_tree_root**) (/Users/laurynas/vilniusdb/mysql-8.0.33/_build-release-time-report/sql/
  1602 ms: (/Users/laurynas/vilniusdb/mysql-8.0.33/unittest/gunit/
  1577 ms: (/Users/laurynas/vilniusdb/mysql-8.0.33/unittest/gunit/
  1396 ms: (/Users/laurynas/vilniusdb/mysql-8.0.33/unittest/gunit/
  1280 ms: (/Users/laurynas/vilniusdb/mysql-8.0.33/unittest/gunit/
   877 ms: __cxx_global_var_init.334 (/Users/laurynas/vilniusdb/mysql-8.0.33/router/src/router/tests/
   839 ms: json_binary_unittest::JsonBinaryTest_BasicTest_Test::TestBody() (/Users/laurynas/vilniusdb/mysql-8.0.33/unittest/gunit/
   823 ms: (/Users/laurynas/vilniusdb/mysql-8.0.33/sql/
   821 ms: __cxx_global_var_init.143 (/Users/laurynas/vilniusdb/mysql-8.0.33/router/src/router/tests/
   809 ms: ShareConnectionTinyPoolOneServerTest_not_sharable_Test::TestBody() (/Users/laurynas/vilniusdb/mysql-8.0.33/router/tests/integration/
   732 ms: testing::internal::FlatTupleBase<testing::internal::FlatTuple<Target... (/Users/laurynas/vilniusdb/mysql-8.0.33/router/tests/component/
   719 ms: AccountReuseCreateComboTestP::gen_testcases() (/Users/laurynas/vilniusdb/mysql-8.0.33/router/tests/component/
   689 ms: KeyringFrontendTest_ensure_Test::TestBody() (/Users/laurynas/vilniusdb/mysql-8.0.33/router/src/router/tests/
   624 ms: duk__js_execute_bytecode_inner (/Users/laurynas/vilniusdb/mysql-8.0.33/extra/duktape/duktape-2.7.0/src/duktape.c)
   609 ms: (/Users/laurynas/vilniusdb/mysql-8.0.33/router/src/mysql_protocol/tests/
   591 ms: MySQLRouter::prepare_command_options() (/Users/laurynas/vilniusdb/mysql-8.0.33/router/src/router/src/
   579 ms: json_dom_unittest::JsonDomTest_BasicTest_Test::TestBody() (/Users/laurynas/vilniusdb/mysql-8.0.33/unittest/gunit/
   578 ms: SetObjectMembers(std::__1::unique_ptr<Json_object, std::__1::default... (/Users/laurynas/vilniusdb/mysql-8.0.33/sql/join_optimizer/
   538 ms: operations_unittest::KeyringCommonOperations_test_OperationsTestWith... (/Users/laurynas/vilniusdb/mysql-8.0.33/unittest/gunit/components/keyring_common/
   531 ms: strnxfrm_unittest::StrmxfrmHashTest_HashStability_Test::TestBody() (/Users/laurynas/vilniusdb/mysql-8.0.33/unittest/gunit/
   511 ms: spec_adder(rapidjson::GenericDocument<rapidjson::UTF8<char>, rapidjs... (/Users/laurynas/vilniusdb/mysql-8.0.33/router/src/rest_routing/src/
   499 ms: mysys_my_time::MysysMyTime_StrToDatetime_Test::TestBody() (/Users/laurynas/vilniusdb/mysql-8.0.33/unittest/gunit/
   476 ms: my_strnncoll_uca_900(CHARSET_INFO const*, unsigned char const*, unsi... (/Users/laurynas/vilniusdb/mysql-8.0.33/strings/
   467 ms: dd_properties_unittest::PropertiesTest_ValidSetGetIntBool_Test::Test... (/Users/laurynas/vilniusdb/mysql-8.0.33/unittest/gunit/
   450 ms: testing::internal::FlatTupleBase<testing::internal::FlatTuple<Target... (/Users/laurynas/vilniusdb/mysql-8.0.33/router/tests/component/
   444 ms: KeyringManager_init_with_key_file_Test::TestBody() (/Users/laurynas/vilniusdb/mysql-8.0.33/router/src/harness/tests/
   427 ms: (/Users/laurynas/vilniusdb/mysql-8.0.33/unittest/gunit/
   425 ms: (/Users/laurynas/vilniusdb/mysql-8.0.33/unittest/gunit/xplugin/xpl/
   416 ms: __cxx_global_var_init.49 (/Users/laurynas/vilniusdb/mysql-8.0.33/router/src/http/tests/
   378 ms: (/Users/laurynas/vilniusdb/mysql-8.0.33/unittest/gunit/xplugin/xpl/

**** Function sets that took longest to compile / optimize:
 39400 ms: testing::internal::FunctionMocker<$>::InvokeWith(std::__1::tuple<$>&&) (1961 times, avg 20 ms)
 21051 ms: testing::internal::TypedExpectation<$>::ExplainMatchResultTo(std::__... (1993 times, avg 10 ms)
 19210 ms: testing::internal::ParameterizedTestSuiteInfo<$>::RegisterTests() (406 times, avg 47 ms)
 18661 ms: void testing::internal::TuplePrefix<$>::ExplainMatchFailuresTo<$>(st... (2627 times, avg 7 ms)
 17837 ms: testing::internal::TypeParameterizedTest<$>::Register(char const*, t... (1286 times, avg 13 ms)
 14176 ms: testing::internal::TypedExpectation<$>::GetCurrentAction(testing::in... (1993 times, avg 7 ms)
 12635 ms: bool boost::geometry::detail::partition::partition_two_ranges<$>::ap... (562 times, avg 22 ms)
 11895 ms: testing::internal::FunctionMocker<$>::PerformDefaultAction(std::__1:... (1961 times, avg 6 ms)
  6589 ms: testing::internal::FunctionMocker<$>::PerformAction(void const*, std... (1961 times, avg 3 ms)
  5890 ms: testing::internal::FunctionMocker<$>::DescribeDefaultActionTo(std::_... (1993 times, avg 2 ms)
  5854 ms: testing::internal::FunctionMocker<$>::PrintTriedExpectationsLocked(s... (1993 times, avg 2 ms)
  5767 ms: testing::internal::FunctionMocker<$>::UntypedFindMatchingExpectation... (1993 times, avg 2 ms)
  5176 ms: std::__1::ostreambuf_iterator<$> std::__1::__pad_and_output<$>(std::... (1053 times, avg 4 ms)
  4801 ms: testing::internal::OnCallSpec<$>::GetAction() const (1961 times, avg 2 ms)
  4660 ms: void std::__1::__introsort<$>(boost::geometry::detail::overlay::turn... (46 times, avg 101 ms)
  4271 ms: testing::internal::TypedExpectation<$>::GetActionForArguments(testin... (1993 times, avg 2 ms)
  3850 ms: std::__1::basic_ostream<$>& std::__1::__put_character_sequence<$>(st... (1053 times, avg 3 ms)
  3807 ms: testing::internal::SuiteApiResolver<$>::GetSetUpCaseOrSuite(char con... (2157 times, avg 1 ms)
  3796 ms: testing::internal::MockSpec<$>::InternalExpectedAt(char const*, int,... (359 times, avg 10 ms)
  3777 ms: std::__1::__function::__func<$>::target(std::type_info const&) const (1456 times, avg 2 ms)
  3254 ms: testing::internal::ParameterizedTestSuiteInfo<$>* testing::internal:... (406 times, avg 8 ms)
  3079 ms: std::__1::__function::__func<$>::__clone() const (906 times, avg 3 ms)
  2966 ms: testing::internal::TestFactoryImpl<$>::CreateTest() (968 times, avg 3 ms)
  2809 ms: classic_protocol::Codec<$>::decode(net::const_buffer const&, std::__... (120 times, avg 23 ms)
  2744 ms: testing::internal::SuiteApiResolver<$>::GetTearDownCaseOrSuite(char ... (2157 times, avg 1 ms)
  2600 ms: testing::internal::MatcherBase<$>::DescribeTo(std::__1::basic_ostrea... (1491 times, avg 1 ms)
  2496 ms: testing::internal::FunctionMocker<$>::ClearDefaultActionsLocked() (846 times, avg 2 ms)
  2447 ms: bool boost::geometry::detail::partition::partition_two_ranges<$>::ap... (104 times, avg 23 ms)
  2442 ms: testing::internal::MatcherBase<$>::~MatcherBase() (588 times, avg 4 ms)
  2394 ms: testing::Matcher<$>::~Matcher() (889 times, avg 2 ms)

*** Expensive headers:
209141 ms: /Users/laurynas/vilniusdb/mysql-8.0.33/sql/sql_class.h (included 658 times, avg 317 ms), included via: replicated_columns_view.h column_filter_factory.h column_filter_inbound_func_indexes.h column_filter.h  (877 ms)  (868 ms)  (827 ms)  (809 ms)  (806 ms) hold_transactions.h  (800 ms)

181774 ms: /Applications/ (included 3328 times, avg 54 ms), included via:
  classic_stmt_reset_forwarder.h forwarding_processor.h processor.h basic_protocol_splicer.h functional  (310 ms)
  sql_authorization.h functional  (244 ms)
  functional  (239 ms)
  keycache.h string_view functional  (231 ms)
  geometry.hpp geometry.hpp radian_access.hpp cast.hpp converter.hpp converter_policies.hpp functional  (231 ms)
  json_dom.h functional  (229 ms)

165255 ms: /Users/laurynas/vilniusdb/mysql-8.0.33/extra/googletest/googletest-release-1.12.0/googletest/include/gtest/gtest.h (included 525 times, avg 314 ms), included via:  (757 ms)  (717 ms)  (717 ms)  (701 ms)  (692 ms)  (691 ms)

155001 ms: /Users/laurynas/vilniusdb/mysql-8.0.33/include/m_string.h (included 2102 times, avg 73 ms), included via:  (598 ms)  (558 ms)
  NdbTCP.cpp.o ndb_global.h  (557 ms)  (531 ms)  (476 ms) my_sys.h  (474 ms)

125881 ms: /Users/laurynas/vilniusdb/mysql-8.0.33/include/lex_string.h (included 1271 times, avg 99 ms), included via:  (390 ms) dd_trigger.h  (386 ms) log_builtins.h log.h  (382 ms) xpl_log.h log_builtins.h log.h  (364 ms) sql_authentication.h  (360 ms) events.h  (336 ms)

121427 ms: /Applications/ (included 3369 times, avg 36 ms), included via:
  vector __split_buffer memory  (134 ms)
  fstream __locale memory  (122 ms)
  fstream __locale memory  (120 ms)
  gtest.h memory  (115 ms)
  ndb_global.h m_string.h algorithm memory  (113 ms)
  geometry.hpp geometry.hpp radian_access.hpp cast.hpp converter.hpp converter_policies.hpp functional boyer_moore_searcher.h  (110 ms)

117425 ms: /Users/laurynas/vilniusdb/mysql-8.0.33/sql/field.h (included 693 times, avg 169 ms), included via:  (790 ms)  (733 ms)  (692 ms)  (692 ms)  (618 ms) sql_select.h  (585 ms)

107213 ms: /Users/laurynas/vilniusdb/mysql-8.0.33/extra/googletest/googletest-release-1.12.0/googlemock/include/gmock/gmock.h (included 271 times, avg 395 ms), included via:  (907 ms)  (829 ms)  (813 ms)  (787 ms)  (783 ms) gcs_base_test.h  (782 ms)

107076 ms: /Users/laurynas/vilniusdb/mysql-8.0.33/sql/table.h (included 845 times, avg 126 ms), included via: dd_routine.h  (718 ms) global.h  (700 ms)  (664 ms) rpl_sys_key_access.h  (619 ms)  (575 ms)  (396 ms)

102310 ms: /Users/laurynas/vilniusdb/mysql-8.0.33/sql/handler.h (included 923 times, avg 110 ms), included via:  (717 ms) ha_example.h  (668 ms) ha_archive.h  (631 ms) ha_mock.h  (625 ms) ha_perfschema.h  (624 ms) ha_blackhole.h  (598 ms)

  done in 1.4s.

Nice, isn'it? Now what can we find in there?

  • "Files that took longest to parse (compiler frontend)": all tests
  • "Files that took longest to codegen (compiler backend)": tests again, with a bit of GIS code (no surprise there because Boost is used there)
  • "Templates that took longest to instantiate": lots of truncated output of std::function (cannot tell where used because truncation and too lazy to rerun the analyzer with the longer output option)
  • "Template sets that took longest to instantiate": std::function again, Google Test, and regular C++ standard library things.
  • "Functions that took longest to compile": MYSQLparse (no surprise there), GIS code (again no surprise), tests (by now no surprise neither)
  • "Function sets that took longest to compile / optimize:": tests tests tests with a bit of GIS
  • "Expensive headers": sql_class.h (I knew it!), C++ library (why so much of shared_ptr BTW?), and tests again.

Let's not compile the tests, then

MySQL CMake allows setting -DWITH_UNIT_TESTS=OFF, which presumably should get rid of most things in the above. Of course, not compiling the unit tests has the inconvenience of not being able to run them after a build, and breaking the build for Mark. But let's experiment.

Rerunning the above with -DWITH_UNIT_TESTS=OFF cut the wall compilation time almost in half, which is unfortunately invalidated by Chrome slowly making its way to a crash in the first third of the full compilation, and the report now looks very different:

**** Time summary:
Compilation (4955 times):
  Parsing (frontend):         2912.1 s
  Codegen & opts (backend):   1280.0 s

**** Files that took longest to parse (compiler frontend):
  7134 ms: ./sql/CMakeFiles/sql_gis.dir/gis/
  6966 ms: ./router/src/router/src/CMakeFiles/router_frontend_lib.dir/
  6820 ms: ./sql/CMakeFiles/sql_gis.dir/gis/
  6527 ms: ./sql/CMakeFiles/sql_gis.dir/gis/
  6454 ms: ./sql/CMakeFiles/sql_gis.dir/gis/
  5857 ms: ./sql/CMakeFiles/sql_gis.dir/gis/srs/
  5748 ms: ./sql/CMakeFiles/sql_gis.dir/gis/
  5577 ms: ./sql/CMakeFiles/sql_main.dir/auth/
  5543 ms: ./sql/CMakeFiles/sql_gis.dir/gis/
  5132 ms: ./sql/CMakeFiles/sql_gis.dir/

**** Files that took longest to codegen (compiler backend):
 32200 ms: ./sql/CMakeFiles/sql_gis.dir/gis/
 30934 ms: ./sql/CMakeFiles/sql_gis.dir/gis/
 26649 ms: ./sql/CMakeFiles/sql_gis.dir/gis/
 26390 ms: ./sql/CMakeFiles/sql_gis.dir/gis/
 23451 ms: ./sql/CMakeFiles/sql_gis.dir/gis/
 21500 ms: ./sql/CMakeFiles/sql_gis.dir/gis/
 19349 ms: ./sql/CMakeFiles/sql_gis.dir/gis/
 19187 ms: ./sql/CMakeFiles/sql_gis.dir/gis/
 18280 ms: ./sql/CMakeFiles/sql_gis.dir/gis/
 16962 ms: ./sql/CMakeFiles/sql_gis.dir/gis/

**** Templates that took longest to instantiate:
 20773 ms: std::unique_ptr<std::unordered_multimap<const MDL_key *, MDL_ticket_... (914 times, avg 22 ms)
 20707 ms: std::unique_ptr<std::unordered_multimap<const MDL_key *, MDL_ticket_... (915 times, avg 22 ms)
 20550 ms: std::default_delete<std::unordered_multimap<const MDL_key *, MDL_tic... (915 times, avg 22 ms)
 14289 ms: std::unordered_multimap<const MDL_key *, MDL_ticket_store::MDL_ticke... (915 times, avg 15 ms)
 12743 ms: std::copy_n<const wchar_t *, unsigned long, wchar_t *> (2547 times, avg 5 ms)
 12637 ms: std::copy_n<const char *, unsigned long, char *> (2542 times, avg 4 ms)
 12551 ms: std::copy<const wchar_t *, wchar_t *> (2547 times, avg 4 ms)
 12346 ms: std::copy_n<const char16_t *, unsigned long, char16_t *> (2546 times, avg 4 ms)
 12307 ms: std::copy<const char *, char *> (2547 times, avg 4 ms)
 12145 ms: std::copy<const char16_t *, char16_t *> (2547 times, avg 4 ms)
 11944 ms: std::__copy<const wchar_t *, const wchar_t *, wchar_t *, 0> (2547 times, avg 4 ms)
 11894 ms: std::copy_n<const char32_t *, unsigned long, char32_t *> (2547 times, avg 4 ms)
 11674 ms: std::copy<const char32_t *, char32_t *> (2547 times, avg 4 ms)
 11667 ms: std::__copy<const char *, const char *, char *, 0> (2547 times, avg 4 ms)
 11526 ms: std::__copy<const char16_t *, const char16_t *, char16_t *, 0> (2547 times, avg 4 ms)
 10919 ms: std::__copy<const char32_t *, const char32_t *, char32_t *, 0> (2547 times, avg 4 ms)
 10219 ms: collation_unordered_map<std::string, std::unique_ptr<user_var_entry,... (600 times, avg 17 ms)
  9859 ms: std::basic_string<char>::basic_string (2726 times, avg 3 ms)
  9834 ms: std::vector<unsigned char *>::push_back (784 times, avg 12 ms)
  9728 ms: malloc_unordered_map<char **, std::unique_ptr<char, My_free_deleter>... (601 times, avg 16 ms)
  9355 ms: std::vector<unsigned char *>::__push_back_slow_path<unsigned char *c... (784 times, avg 11 ms)
  9318 ms: std::unordered_map<std::string, unsigned long> (846 times, avg 11 ms)
  8973 ms: std::unordered_map<std::string, std::unique_ptr<user_var_entry, void... (600 times, avg 14 ms)
  8932 ms: std::basic_string<wchar_t>::basic_string (2611 times, avg 3 ms)
  8793 ms: std::__scalar_hash<std::_PairT, 2>::operator() (2580 times, avg 3 ms)
  8642 ms: net::basic_waitable_timer<std::chrono::steady_clock>::cancel (81 times, avg 106 ms)
  8639 ms: net::io_context::cancel<net::basic_waitable_timer<std::chrono::stead... (81 times, avg 106 ms)
  8568 ms: collation_unordered_map<std::string, std::unique_ptr<Table_ref, My_f... (600 times, avg 14 ms)
  8505 ms: std::basic_string<char16_t>::basic_string (2609 times, avg 3 ms)
  8493 ms: std::basic_string<char32_t>::basic_string (2618 times, avg 3 ms)

**** Template sets that took longest to instantiate:
 91311 ms: std::unique_ptr<$> (44993 times, avg 2 ms)
 81831 ms: std::__hash_table<$> (9702 times, avg 8 ms)
 74391 ms: std::unordered_map<$> (7352 times, avg 10 ms)
 64459 ms: std::allocator_traits<$> (22807 times, avg 2 ms)
 62238 ms: std::map<$> (11577 times, avg 5 ms)
 59739 ms: std::function<$>::function<$> (1549 times, avg 38 ms)
 59342 ms: std::__function::__value_func<$>::__value_func<$> (1549 times, avg 38 ms)
 54833 ms: std::copy<$> (11586 times, avg 4 ms)
 54215 ms: std::__tree<$> (15582 times, avg 3 ms)
 51692 ms: std::__copy<$> (11585 times, avg 4 ms)
 50051 ms: std::copy_n<$> (10264 times, avg 4 ms)
 48536 ms: std::decay<$> (11965 times, avg 4 ms)
 48319 ms: std::__function::__func<$>::__func (1549 times, avg 31 ms)
 48146 ms: std::vector<$>::push_back (3923 times, avg 12 ms)
 45687 ms: std::unordered_map<$>::unordered_map (3885 times, avg 11 ms)
 45017 ms: std::vector<$>::__push_back_slow_path<$> (3822 times, avg 11 ms)
 44368 ms: std::vector<$> (21423 times, avg 2 ms)
 41048 ms: std::vector<$>::__swap_out_circular_buffer (5253 times, avg 7 ms)
 40485 ms: std::__function::__alloc_func<$>::__alloc_func (4647 times, avg 8 ms)
 38700 ms: std::basic_string<$>::basic_string (11728 times, avg 3 ms)
 36048 ms: std::unique_ptr<$>::reset (4008 times, avg 8 ms)
 35840 ms: std::pair<$> (16452 times, avg 2 ms)
 35779 ms: std::__hash_table<$>::__hash_table (4198 times, avg 8 ms)
 35340 ms: std::unique_ptr<$>::~unique_ptr (3870 times, avg 9 ms)
 34007 ms: std::__compressed_pair<$> (10756 times, avg 3 ms)
 33242 ms: std::__uninitialized_allocator_move_if_noexcept<$> (5252 times, avg 6 ms)
 33225 ms: std::basic_string<$> (11487 times, avg 2 ms)
 32465 ms: std::forward_as_tuple<$> (6724 times, avg 4 ms)
 31334 ms: malloc_unordered_map<$> (3118 times, avg 10 ms)
 30751 ms: std::__decay<$> (7475 times, avg 4 ms)

**** Functions that took longest to compile:
  2514 ms: MYSQLparse(THD*, Parse_tree_root**) (/Users/laurynas/vilniusdb/mysql-8.0.33/_build-release-time-report-no-unit-tests/sql/
   819 ms: (/Users/laurynas/vilniusdb/mysql-8.0.33/sql/
   592 ms: duk__js_execute_bytecode_inner (/Users/laurynas/vilniusdb/mysql-8.0.33/extra/duktape/duktape-2.7.0/src/duktape.c)
   582 ms: SetObjectMembers(std::__1::unique_ptr<Json_object, std::__1::default... (/Users/laurynas/vilniusdb/mysql-8.0.33/sql/join_optimizer/
   510 ms: MySQLRouter::prepare_command_options() (/Users/laurynas/vilniusdb/mysql-8.0.33/router/src/router/src/
   456 ms: my_strnncoll_uca_900(CHARSET_INFO const*, unsigned char const*, unsi... (/Users/laurynas/vilniusdb/mysql-8.0.33/strings/
   429 ms: spec_adder(rapidjson::GenericDocument<rapidjson::UTF8<char>, rapidjs... (/Users/laurynas/vilniusdb/mysql-8.0.33/router/src/rest_routing/src/
   384 ms: init_handlers(mysql_harness::PluginFuncEnv*, mysql_harness::LoaderCo... (/Users/laurynas/vilniusdb/mysql-8.0.33/router/src/harness/src/logging/
   351 ms: rapidjson::internal::Schema<rapidjson::GenericSchemaDocument<rapidjs... (/Users/laurynas/vilniusdb/mysql-8.0.33/router/src/harness/src/
   304 ms: my_strnxfrm_uca_900(CHARSET_INFO const*, unsigned char*, unsigned lo... (/Users/laurynas/vilniusdb/mysql-8.0.33/strings/
   276 ms: spec_adder(rapidjson::GenericDocument<rapidjson::UTF8<char>, rapidjs... (/Users/laurynas/vilniusdb/mysql-8.0.33/router/src/rest_metadata_cache/src/
   271 ms: my_hash_sort_uca_900(CHARSET_INFO const*, unsigned char const*, unsi... (/Users/laurynas/vilniusdb/mysql-8.0.33/strings/
   250 ms: test_sql(void*) (/Users/laurynas/vilniusdb/mysql-8.0.33/plugin/test_service_sql_api/
   247 ms: CmdArgHandler::process(std::__1::vector<std::__1::basic_string<char,... (/Users/laurynas/vilniusdb/mysql-8.0.33/router/src/harness/src/
   242 ms: open_table_def(THD*, TABLE_SHARE*, dd::Table const&) (/Users/laurynas/vilniusdb/mysql-8.0.33/sql/
   237 ms: trx_undo_report_row_operation(unsigned long, unsigned long, que_thr_... (/Users/laurynas/vilniusdb/mysql-8.0.33/storage/innobase/trx/
   230 ms: test_wl4435_3() (/Users/laurynas/vilniusdb/mysql-8.0.33/testclients/
   212 ms: void std::__1::__introsort<std::__1::_ClassicAlgPolicy, boost::geome... (/Users/laurynas/vilniusdb/mysql-8.0.33/sql/gis/
   211 ms: dd::tables::Tables::Tables() (/Users/laurynas/vilniusdb/mysql-8.0.33/sql/dd/impl/tables/
   210 ms: spec_adder(rapidjson::GenericDocument<rapidjson::UTF8<char>, rapidjs... (/Users/laurynas/vilniusdb/mysql-8.0.33/router/src/rest_connection_pool/src/
   209 ms: TlsServerContext::default_ciphers() (/Users/laurynas/vilniusdb/mysql-8.0.33/router/src/harness/src/
   209 ms: main (/Users/laurynas/vilniusdb/mysql-8.0.33/client/
   207 ms: build_gcs_parameters(Gcs_interface_parameters&) (/Users/laurynas/vilniusdb/mysql-8.0.33/plugin/group_replication/src/
   205 ms: init_server_components() (/Users/laurynas/vilniusdb/mysql-8.0.33/sql/
   202 ms: void std::__1::__introsort<std::__1::_ClassicAlgPolicy, boost::geome... (/Users/laurynas/vilniusdb/mysql-8.0.33/sql/gis/
   200 ms: ConnectJoins(int, int, int, QEP_TAB*, THD*, CallingContext, std::__1... (/Users/laurynas/vilniusdb/mysql-8.0.33/sql/
   194 ms: lock_wait_timeout_thread() (/Users/laurynas/vilniusdb/mysql-8.0.33/storage/innobase/lock/
   193 ms: void std::__1::__introsort<std::__1::_ClassicAlgPolicy, boost::geome... (/Users/laurynas/vilniusdb/mysql-8.0.33/sql/gis/
   187 ms: (anonymous namespace)::CostingReceiver::FoundSingleNode(int) (/Users/laurynas/vilniusdb/mysql-8.0.33/sql/join_optimizer/
   186 ms: handle_slave_io (/Users/laurynas/vilniusdb/mysql-8.0.33/sql/

**** Function sets that took longest to compile / optimize:
 13251 ms: bool boost::geometry::detail::partition::partition_two_ranges<$>::ap... (562 times, avg 23 ms)
  4962 ms: void std::__1::__introsort<$>(boost::geometry::detail::overlay::turn... (46 times, avg 107 ms)
  2395 ms: bool boost::geometry::detail::partition::partition_two_ranges<$>::ap... (104 times, avg 23 ms)
  2264 ms: std::__1::ostreambuf_iterator<$> std::__1::__pad_and_output<$>(std::... (536 times, avg 4 ms)
  1943 ms: void std::__1::__introsort<$>(boost::geometry::detail::relate::linea... (26 times, avg 74 ms)
  1717 ms: std::__1::basic_ostream<$>& std::__1::__put_character_sequence<$>(st... (536 times, avg 3 ms)
  1657 ms: std::__1::basic_stringbuf<$>::str() const (180 times, avg 9 ms)
  1298 ms: unsigned int std::__1::__sort3<$>(boost::geometry::detail::overlay::... (46 times, avg 28 ms)
  1248 ms: bool boost::geometry::partition<$>::apply<$>(boost::geometry::sectio... (281 times, avg 4 ms)
  1224 ms: unsigned int std::__1::__sort5<$>(boost::geometry::detail::overlay::... (46 times, avg 26 ms)
  1220 ms: void std::__1::__tree_balance_after_insert<$>(std::__1::__tree_node_... (317 times, avg 3 ms)
  1211 ms: std::__1::deque<$>::__add_back_capacity() (94 times, avg 12 ms)
  1049 ms: bool boost::geometry::detail::overlay::get_turn_info_for_endpoint<$>... (52 times, avg 20 ms)
   991 ms: bool boost::geometry::detail::partition::partition_one_range<$>::app... (104 times, avg 9 ms)
   968 ms: spec_adder(rapidjson::GenericDocument<$>&) (4 times, avg 242 ms)
   968 ms: std::__1::__tree_node_base<$>*& std::__1::__tree<$>::__find_equal<$>... (253 times, avg 3 ms)
   956 ms: bool boost::geometry::detail::partition::partition_two_ranges<$>::ap... (64 times, avg 14 ms)
   939 ms: (anonymous namespace)::Function_factory<$>::create_func(THD*, MYSQL_... (270 times, avg 3 ms)
   938 ms: std::__1::basic_stringbuf<$>::overflow(int) (180 times, avg 5 ms)
   922 ms: classic_protocol::Codec<$>::decode(net::const_buffer const&, std::__... (70 times, avg 13 ms)
   884 ms: bool boost::geometry::detail::partition::partition_two_ranges<$>::ap... (40 times, avg 22 ms)
   866 ms: boost::geometry::policies::relate::segments_intersection_policy<$>::... (30 times, avg 28 ms)
   864 ms: std::__1::basic_string<$>::basic_string[abi:v15006]<$>(char const*) (351 times, avg 2 ms)
   837 ms: rapidjson::internal::Schema<$>::Schema(rapidjson::GenericSchemaDocum... (5 times, avg 167 ms)
   814 ms: bool std::__1::__insertion_sort_incomplete<$>(boost::geometry::detai... (46 times, avg 17 ms)
   813 ms: void std::__1::__introsort<$>(boost::geometry::detail::relate::linea... (14 times, avg 58 ms)
   810 ms: std::__1::__tree<$>::destroy(std::__1::__tree_node<$>*) (443 times, avg 1 ms)
   808 ms: unsigned int std::__1::__sort4<$>(boost::geometry::detail::overlay::... (46 times, avg 17 ms)
   795 ms: void std::__1::__tree_remove<$>(std::__1::__tree_node_base<$>*, std:... (90 times, avg 8 ms)
   788 ms: std::__1::vector<$>::~vector[abi:v15006]() (418 times, avg 1 ms)

*** Expensive headers:
181934 ms: /Users/laurynas/vilniusdb/mysql-8.0.33/sql/sql_class.h (included 587 times, avg 309 ms), included via:  (936 ms) replicated_columns_view.h column_filter_factory.h column_filter_inbound_func_indexes.h column_filter.h  (910 ms)  (838 ms)  (812 ms)  (810 ms) replicated_columns_view_with_gipk_on_source.h replicated_columns_view.h column_filter_factory.h column_filter_inbound_func_indexes.h column_filter.h  (771 ms)

140903 ms: /Users/laurynas/vilniusdb/mysql-8.0.33/include/m_string.h (included 1793 times, avg 78 ms), included via:  (428 ms)  (412 ms) buf0checksum.h buf0types.h os0event.h univ.i  (406 ms)
  NdbThread.cpp.o ndb_global.h  (389 ms)
  AccLock.cpp.o AccLock.hpp SignalData.hpp ndb_global.h  (386 ms)
  NdbReceiver.cpp.o API.hpp ndb_global.h  (382 ms)

129393 ms: /Applications/ (included 2549 times, avg 50 ms), included via:
  bgc_ticket_manager.h atomic_bgc_ticket_guard.h functional  (231 ms)
  acl_table_user.h functional  (228 ms)
  config_parser.h functional  (217 ms)
  dynamic_privilege_table.h functional  (217 ms)
  loader_config.h config_parser.h functional  (210 ms)
  random_generator.h random discrete_distribution.h numeric functional  (208 ms)

114343 ms: /Users/laurynas/vilniusdb/mysql-8.0.33/include/lex_string.h (included 1195 times, avg 95 ms), included via: sql_authentication.h  (329 ms) sql_audit.h  (308 ms) dd_trigger.h  (305 ms)  (299 ms) sql_user_table.h sql_system_table_check.h log_builtins.h log.h  (297 ms) show_query_builder.h  (292 ms)

105206 ms: /Users/laurynas/vilniusdb/mysql-8.0.33/sql/field.h (included 634 times, avg 165 ms), included via:  (733 ms) rpl_sys_table_access.h  (611 ms) sql_select.h  (597 ms)  (597 ms)  (581 ms)  (577 ms)

95348 ms: /Users/laurynas/vilniusdb/mysql-8.0.33/sql/table.h (included 779 times, avg 122 ms), included via: dd_routine.h  (633 ms) global.h  (582 ms)  (551 ms)  (493 ms) rpl_sys_key_access.h  (488 ms) rpl_sys_key_access.h  (369 ms)

94011 ms: /Users/laurynas/vilniusdb/mysql-8.0.33/storage/innobase/include/univ.i (included 209 times, avg 449 ms), included via: mach0data.h mtr0types.h sync0rw.h  (710 ms) buf0block_hint.h buf0types.h os0event.h  (705 ms) btr0sea.h  (700 ms) srv0tmp.h srv0srv.h buf0checksum.h buf0types.h os0event.h  (684 ms) buf0checksum.h buf0types.h os0event.h  (681 ms) btr0btr.h btr0types.h  (674 ms)

93077 ms: /Users/laurynas/vilniusdb/mysql-8.0.33/sql/log_event.h (included 241 times, avg 386 ms), included via: global.h context.h  (889 ms)  (879 ms) context.h  (862 ms) recovery.h global.h context.h  (811 ms) iterators.h binlog_reader.h  (775 ms)  (775 ms)

91706 ms: /Users/laurynas/vilniusdb/mysql-8.0.33/sql/handler.h (included 842 times, avg 108 ms), included via: ha_example.h  (756 ms) ha_mock.h  (685 ms)  (649 ms) ha_tina.h  (614 ms) ha_archive.h  (580 ms) api0misc.h  (575 ms)

87795 ms: /Applications/ (included 2580 times, avg 34 ms), included via:
  AccLock.hpp SignalData.hpp ndb_global.h m_string.h algorithm memory  (102 ms)
  dynamic_privilege_table.h functional boyer_moore_searcher.h  (98 ms)
  signing_key.h string memory  (97 ms)
  ndb_global.h m_string.h algorithm memory  (96 ms)
  dict0dict.h set __node_handle memory  (93 ms)
  my_byteorder.h template_utils.h algorithm memory  (93 ms)

  done in 0.8s.

What changed?

  • "Files that took longest to parse (compiler frontend)": all GIS
  • "Files that took longest to codegen (compiler backend)": all GIS
  • "Templates that took longest to instantiate": all those std::function gone, so must have been tests. Something about MDL remaining.
  • "Template sets that took longest to instantiate": std::unique_ptr. Oh well.
  • "Functions that took longest to compile": MYSQLparse (no surprise there), system variables handling, 3rd party code, and variety of other things
  • "Function sets that took longest to compile / optimize:": Boost/GIS, the standard library.
  • "Expensive headers": sql_class.h, and the C++ standard library headers mostly gone together with shared_ptr (so it was tests using it).

Interestingly (and usefully) this provides a different kind of insight than what Mark discovered for Meta branch: it would not discover RocksDB being compiled three times. It would be great if the MySQL developers looked into this tool, which seems very easy to use, and reducing compilation times. That sql_class.h file is way overdue for splitting!

Tuesday, February 07, 2023

Changes for the MySQL clone plugin to support other storage engines

In case you ever wondered what is needed for the Oracle MySQL clone plugin to support other transactional engines than InnoDB, I got you covered:

Wednesday, January 18, 2023

On Writing to the MySQL Error Log

If you develop a 3rd party MySQL plugin, it might take a while for you to figure out how to make your plugin write to the server error log in an official supported way. Let's take a look.

First things first: you don't want to patch the core server and ideally you'd develop and distribute the plugin only. I don't think there exists a single plugin that pulls this off today, but that's the ideal.

Then, you want to be a good citizen and not write to stderr directly. By doing that, you'd bypass the whole error logging machinery and the plugin messages could not be filtered, redirected to i.e. JSON or system error log sinks, and would not have proper severity and timestamps set. As the plugin developer you also want to be productive and avoid avoidable bugs at least, so even if you wanted to write to stderr directly, both printf and std::cerr are terrible APIs. For the former, fixing -Wformat-security warnings for all the platforms results in unreadable format strings "Foo: %" PRIu64 ", bar: %" PRIu64, the latter is just universally terrible with the sole good property of being type-safe. You'd want std::format (or {fmt} in C++17), so you could be ergonomic and type safe at the same time: "Foo: {}, bar: {}".

OK, so you have to use official MySQL APIs for plugins to write to the error logs. Which ones? Well, the internal documentation has a nice chapter "MySQL Services for Plugins" with an example:

my_plugin_log_message(plugin_ptr, MY_ERROR_LEVEL, "Cannot initialize plugin");

That's not too bad, you think, except that this is a printf-like API and not std::format. Then you grep the source tree for this function, and whoa:

For use by legacy sql_print_*(), legacy my_plugin_log_message();
also available via the log_builtins service as message().
Wherever possible, use the fluent C++ wrapper LogErr()
(see log_builtins.h) instead.

Oh, so my_plugin_log_message is "legacy" and you should be using fluent C++ LogErr from log_builtins.h. Let's go there:

#define LogErr(severity, ecode, ...) \
  LogEvent()                         \
      .prio(severity)                \
      .errcode(ecode)                \
      .subsys(LOG_SUBSYSTEM_TAG)     \
      .component(LOG_COMPONENT_TAG)  \
      .source_line(__LINE__)         \
      .source_file(MY_BASENAME)      \
      .function(__FUNCTION__)        \
      .lookup(ecode, ##__VA_ARGS__)

The fluency here applies to the macro definition itself, and not to the macro calls. There is no usage comment for it and there are a lot of things going on that need figuring out, maybe let's try grepping for examples? Grep shows that this macro is indeed heavily used, and also that it is not specific for plugins, because the core server uses it too. So you focus the grep to the plugin/ directory and pick a (not really) random sample:

    mysql_error_service_emit_printf(svc_error, ER_GRP_RPL_UDF_ERROR, MYF(0),
                                    action_name, error_message);
    if (log_error)
    return false;
// Log the error in case we can't do much

Many, many things going on here. Why there are LogErr and LogPluginErr both at the same time? Is it the latter you should be using? What's that mysql_error_service_emit_printf thing? It looks error-log'ish too. Finally, why there is both ER_... something and error_message?

Let's start with LogErr vs LogPluginErr. Remember that my_plugin_log_message was documented in the internals manual? These two are not there. Googling gives skeleton Doxygen docs for log_builtins.h, so might as well go to the source code directly:

#define LogPluginErr(severity, ecode, ...)                           \
  LogEvent()                                                         \
      .prio(severity)                                                \
      .errcode(ecode)                                                \
      .subsys(LOG_SUBSYSTEM_TAG)                                     \
      .component("plugin:" LOG_COMPONENT_TAG)                        \
      .source_line(__LINE__)                                         \
      .source_file(MY_BASENAME)                                      \
      .function(__FUNCTION__)                                        \
      .lookup_quoted(ecode, "Plugin " LOG_COMPONENT_TAG " reported", \

That's not too different from the LogErr definition above and seems interchangeable, with the difference being all those extra "plugin:" strings. If you have to choose between LogErr and LogErrPlugin, the latter looks like the right choice. Grepping shows heavy usage in plugin/ directory (and none in the core server), which seems correct. Let's figure out LOG_SUBSYSTEM_TAG and LOG_COMPONENT_TAG (and maybe MY_BASENAME) later.

Next question, should you be concerned about mysql_error_service_emit_printf? The Doxygen docs say "This function is substitute api service for my_error function." And my_error Doxygen docs say "Fill in and print a previously registered error message.". Print where? To an error log? You grep grep grep and half an hour later find yourself in my_message_sql and convince yourself that this error is, in fact, printed to the client and not to the error log. OK, so that was a digression.

At this point you are very certain you should be using LogPluginErr, but it was taking those ER_... error codes and extra parameters. How do you allocate an error code for your 3rd party, out-of-tree plugin? Maybe let's see how the in-tree plugins do that as for the starting point. You grep to find this in share/messages_to_error_log.txt:

  eng "The function '%s' failed. %s"

You learn that these are printf-style format strings determining the tail of arguments for LogPluginErr. But remember, you don't want to patch the core server. So maybe the build process for this file could pick up some extra files too? You look at comp_err source and how it is invoked by CMake currently, and arrive at the conclusion, that no, you cannot do that.

And there is a fundamental reason: what error code numbers should be assigned to your plugin, so that there are no conflicts with the core server and all the other 3rd party plugins in the world? Ideally without a centralized number assignment authority? The closest I have to the answer is that the plugins could use the letter prefixes, i.e. the core server already prefixes all error codes with MY-. But the prefixes only reduce the conflict probability, not eliminate it.

So at this point you know you are supposed to use LogPluginErr but you can only use the existing error codes for it. How to find some suitable ones? Well a loophole happens to exist:

  eng "%s"

An error code that says "this is a printf-formatted message", perfect.

You are almost there, what's remaining are the LOG_SUBSYSTEM_TAG, LOG_COMPONENT_TAG, and MY_BASENAME macros. By grepping for the first two you find something like this, in the group replication plugin:


OK, so LOG_COMPONENT_TAG should be defined to your plugin name, and LOG_SUBSYSTEM_TAG to something broader if applicable. And MY_BASENAME takes care of itself: log_builtins.h includes log_shared.h, which includes my_basename.h, which defines it.

Looks like you got all the pieces! Let's try it!

#define LOG_SUBSYSTEM_TAG "ThePlugin"  // This is proof-of-concept only:
#define LOG_COMPONENT_TAG "ThePlugin"  // should be done in CMake
// ...
#include "mysql/components/services/log_builtins.h"
// ...
LogPluginErr(ERROR_LEVEL, ER_LOG_PRINTF_MSG, "Hello, world!");

And the error log:

2023-01-18T06:38:01Z UTC - mysqld got signal 11 ;
0   mysqld                              0x0000000102610b8c my_print_stacktrace(unsigned char const*, unsigned long) + 72
1   mysqld                              0x0000000100f47250 print_fatal_signal(int) + 764
2   mysqld                              0x0000000100f4751c handle_fatal_signal + 120
3   libsystem_platform.dylib            0x00000001962fb2a4 _sigtramp + 56
4                     0x0000000124724c8c LogEvent::LogEvent() + 28
5                     0x0000000124724c8c LogEvent::LogEvent() + 28

Well, this is the error log all right, just not what you asked it to log.

 * thread #41, name = 'connection', stop reason = EXC_BAD_ACCESS (code=1, address=0x80)
    frame #0: 0x0000000151405980`LogEvent::LogEvent(this=0x00000001717e0308) at log_builtins.h:968:15
   965 	  */
   966 	  LogEvent() {
   967 	    have_msg = false;
-> 968 	    if ((ll = log_line_init()) != nullptr) {
   969 	      if ((msg = (char *)log_malloc(LOG_BUFF_MAX)) == nullptr) {
   970 	        log_line_exit(ll);
   971 	        ll = nullptr;

log_line_init is a nullptr. It looks like a function/method call but there is a macro to expand first:

#define log_line_init log_bi->line_init

And then

(lldb) p log_bi
(const mysql_service_log_builtins_t *) $0 = nullptr

More grepping, this time for log_bi. It is also helpful if you read the internals manual and recall something vague that everything is dynamic now. So let's say you find the following in the rewriter plugin:

static SERVICE_TYPE(registry) *reg_srv = nullptr;
SERVICE_TYPE(log_builtins) *log_bi = nullptr;
SERVICE_TYPE(log_builtins_string) *log_bs = nullptr;
// ...
static int rewriter_plugin_init(MYSQL_PLUGIN plugin_ref) {
  // ...
  // Initialize error logging service.
  if (init_logging_service_for_plugin(&reg_srv, &log_bi, &log_bs)) return 1;
  // ...
static int rewriter_plugin_deinit(void *) {
  // ...
  deinit_logging_service_for_plugin(&reg_srv, &log_bi, &log_bs);
  // ...

You add the corresponding code to your plugin, and finally:

2023-01-18T06:56:40.735486Z 9 [Note] [MY-011071] [Server] Plugin ThePlugin reported: 'Hello, world!'


Can you do better? What if you need to log something else than a single static string? Yes, you can, and I am advocating for introducing std::format or {fmt} into the core server here too:

             fmt::format("An error has occurred while reading {}: errno = {}",
                         path, my_errno()).c_str());

So, here we are. Of course, if you already happen to know that you should be using LogPluginErr(ER_LOG_PRINTF_MSG) or even just my_plugin_log_message then the majority of this post might seem to be trivial. But I believe that documentation and lowering the entry barrier for new MySQL plugin writers is important too.