MySQL has had pluggable storage engine support for decades now. Their capabilities vary wildly. Some of those storage engines are not even storage engine-like, for example BLACKHOLE does not store data at all. For OLTP, this herd thins very quickly if we only consider ACID-supporting ones: InnoDB, MyRocks, NDB, (up until recently) TokuDB, and (up until ten years ago) PBXT. I am not too familiar with OLAP, but I know there's Warp.
The core server layer has accumulated dozens of engine feature flags over those decades to accommodate those varying capabilities, let's take a look at what they can be. To make this post look less like a source code annotation dump, I'll try to sort them by feature area. If I classify something as an "implementation detail", it still could be more than that, there could be e.g. performance implications or something user-visible like forced closing of a table, but I'll be not figuring that out at this time.
Handlerton Flags
To recap, a "handlerton" is a "handler singleton", and a "handler" applies to a single opened table. So a handlerton is a singleton for all the tables in a particular storage engine, the capabilities for it as the whole, although later we'll see that some of the table flags are engine-level too. Perhaps that happened because handlerton concept was introduced later, in 5.0 series AFAIK, and it was all handler earlier.
The list in the source code starts with #define
and ends with constexpr
, showing
its evolution as the coding standards evolve. Although it all should be somehow made
into an enum class
bitset, I think, with a healthy part of handler flags moved over
here too.
Features
HTON_ALTER_NOT_SUPPORTED
: cannot doALTER TABLE
, i.e. performance_schema.HTON_TEMPORARY_NOT_SUPPORTED
: cannot doCREATE TEMPORARY TABLE
, i.e. performance_schema again.HTON_SUPPORT_LOG_TABLES
: can hostmysql.general_log
andslow_log
tables. CSV and MyISAM!HTON_NO_PARTITION
: tables cannot be partitioned. performance_schema, Warp.HTON_SUPPORTS_EXTENDED_KEYS
: all the secondary keys include the primary key columns in the end. This is a feature for the query optimizer. InnoDB, MyRocks, TokuDB.HTON_SUPPORTS_FOREIGN_KEYS
: self-descriptive. InnoDB, NDB.HTON_SUPPORTS_ATOMIC_DDL
: DDL statements are atomic, nice. InnoDB, NDB.HTON_SUPPORTS_PACKED_KEYS
: MyISAM something something.HTON_SUPPORTS_SECONDARY_ENGINE
,HTON_SECONDARY_ENGINES_DDL
: too early to tell, the secondary engine feature is under development, and even then it might not be public, i.e. something for HeatWave.HTON_SUPPORTS_TABLE_ENCRYPTION
: self-descriptive. InnoDB.HTON_SUPPORTS_ENGINE_ATTRIBUTE
: whetherCREATE
andALTER
can takeENGINE_ATTRIBUTE
for engine-specific attributes provided by user.HTON_SUPPORTS_GENERATED_INVISIBLE_PK
: will create a hidden primary key if you don't provide one. InnoDB.
Hidden Storage Engines
HTON_HIDDEN
: do not advertise as a storage engine at all, do not allowCREATE TABLE
in it etc. Binary log is such storage engine!HTON_NOT_USER_SELECTABLE
: do not allow user access any tables in this engine. Binlog again. Not sure why it cannot be merged withHTON_HIDDEN
.
Implementation Details
HTON_CLOSE_CURSORS_AT_COMMIT
: last in-tree engine that used it was BDB in 5.0, which I am not familiar with. MariaDB docs (here goes Planet MySQL) say this was BerkeleyDB. But also TokuDB.HTON_CAN_RECREATE
: mapTRUNCATE TABLE
toDROP
&CREATE
internally. MyRocks, Warp.HTON_FLUSH_AFTER_RENAME
: introduced somewhere between 5.0 and 5.1, removed in 8.0. An implementation detail for the BDB storage engine.HTON_NO_BINLOG_ROW_OPT
: if using row-based binlog format, do not do anything about before/after row images for this engine, meaning all the columns will be always included. NDB.
Handlerton Flags for Foreign Keys
Foreign key capabilities are finer-grained. Archeologically we are above the
#define
layer, but below the constexpr
layer. These are static const
, so they
are emitted in every single compilation unit including handler.h
, unless the
compilers manage to clean it up.
HTON_FKS_WITH_PREFIX_PARENT_KEYS
: whether the foreign key can have arbitrary columns with resulting non-uniqueness etc. as long as the unique parent key columns are in its prefix. InnoDB.HTON_FKS_WITH_SUPPORTING_HASH_KEYS
: the foreign key can be a hash key. NDB.HTON_FKS_NEED_DIFFERENT_PARENT_AND_SUPPORTING_KEY
: can't have a self-referring foreign key. InnoDB.HTON_FKS_WITH_EXTENDED_PARENT_KEYS
: if the keys have primary key columns in its suffix, use that. InnoDB.
Optional Handlerton Function Pointers
We are done with handlerton flags, but we are not done with handlerton. Turns out, some of its function pointers are optional, and capabilities are inferred from them being non-null. A lot of them are transaction-related. These capabilities are:
Savepoints
savepoint_set
: transaction savepoints. InnoDB, MyRocks, TokuDB.savepoint_rollback_can_release_mdl
: ability to rollback MDL locks taken after a savepoint on rollback there. InnoDB, MyRocks
XA Transactions
commit_by_xid
,rollback_by_xid
,set_prepared_in_tc
,set_prepared_in_tc_by_xid
: XA. InnoDB, MyRocks, TokuDB.recover
: can recover XA transactions on startup. InnoDB, MyRocks.recover_prepared_in_tc
: can recover prepared XA transactions on startup. InnoDB.
Other Features
prepare
,set_prepared_in_tc
: 2PC support. InnoDB, MyRocks.start_consistent_snapshot
: supportsSTART TRANSACTION WITH CONSISTENT SNAPSHOT
. InnoDB, MyRocks.flush_logs
: self-descriptive. InnoDB, MyRocks, TokuDB.show_status
: Likewise.alter_tablespace
,get_tablespace_type_by_name
: supports tablespaces. InnoDB.is_supported_system_table
: can host at least some of themysql
schema tables. InnoDB.get_table_statistics
,get_index_column_cardinality
,get_tablespace_statistics
,prepare_secondary_engine
,optimize_secondary_engine
,compare_secondary_engine_cost
: self-descriptive. InnoDB, MyRocks.clone_interface
: can clone. InnoDB. Watch this space for more exciting announcements!lock_hton_log
,unlock_hton_log
,collect_hton_log_info
: can participate inperformance_schema.LOG_STATUS
table. InnoDB, MyRocks
Implementation Details
replace_native_transaction_in_thd
,ddse_dict_init
,dict_...
: native data dictionary support. InnoDB.page_track
: can be used with the page tracking service.drop_database
: wants to participate inDROP DATABASE
implementation. NDB.panic
: can be shutdown, even if it's statically compiled in.fill_is_schema
: wants to fill in some of INFORMATION_SCHEMA tables by itself.get_tablespace
: unused.post_ddl
,post_recover
,check_fk_column_compat
,se_before_commit
,se_after_commit
,se_before_rollback
: just let's say it's all implementation details.
Handler Table Flags
These flags are per-table, but before handlerton was a thing, a lot of per-engine flags were added too.
Features
HA_NO_TRANSACTIONS
: this is not a real OLTP database storage engine. MyISAM.HA_CAN_SQL_HANDLER
: supports HANDLER interface. InnoDB.HA_NO_AUTO_INCREMENT
: does not support auto increment fields. performance_schema, Warp.HA_HAS_CHECKSUM
:CREATE TABLE
supportsCHECKSUM
option. MyISAM.HA_CAN_REPAIR
: canREPAIR TABLE
. MyISAM, Warp.HA_CAN_EXPORT
: supportsFLUSH TABLE FOR EXPORT
. InnoDB.HA_SUPPORTS_DEFAULT_EXPRESSION
:DEFAULT
column clauses may be expressions. InnoDB, MyRocks, MyISAM.HA_UPDATE_NOT_SUPPORTED
: self-descriptive. ARCHIVE.HA_DELETE_NOT_SUPPORTED
: likewise.
Supported Column and Index Types
HA_NO_PREFIX_CHAR_KEYS
: does not support indexingCHAR
column prefixes. NDB.HA_NO_VARCHAR
: does not supportVARCHAR
type. No engine sets this.HA_CAN_FULLTEXT
: supports fulltext indexes. InnoDB.HA_NO_BLOBS
: does not support BLOBs.HA_CAN_INDEX_BLOBS
: can index BLOBs. InnoDB, MyRocks, TokuDB.HA_BLOB_PARTIAL_UPDATE
: BLOBs can be updated partially. InnoDB.HA_CAN_GEOMETRY
: can handle spatial data. InnoDB.HA_CAN_RTREEKEYS
: supports R-Tree indexes.HA_SUPPORTS_GEOGRAPHIC_GEOMETRY_COLUMN
: geometry supports can do not only coordinates on a flat surface, but geographic coordinates too. InnoDB.HA_CAN_BIT_FIELD
: supportsBIT
type. MyISAM.HA_GENERATED_COLUMNS
: supports generated columns. InnoDB, MyRocks, MyISAM.HA_CAN_INDEX_VIRTUAL_GENERATED_COLUMN
: can index them. InnoDB, MyRocks.HA_ANY_INDEX_MAY_BE_UNIQUE
: self-descriptive. MRG_MyISAM.HA_DESCENDING_INDEX
: supports descending indexes. InnoDB.HA_MULTI_VALUED_KEY_SUPPORT
: can index JSON arrays. InnoDB.
Query Optimizer Features
HA_PARTIAL_COLUMN_READ
: can return a subset of row columns. InnoDB, MyRocks.HA_TABLE_SCAN_ON_INDEX
: a clustered index always exists and should be used for table scans. InnoDB, TokuDB.HA_FAST_KEY_READ
: random key order is as fast as sequential. Set for in-memory engines, but probably should be set for everything on SSD?HA_NULL_IN_KEY
: can indexNULL
. InnoDB, MyRocks, TokuDB.HA_STATS_RECORDS_IS_EXACT
: table record count in statistics is exact. InnoDB.HA_PRIMARY_KEY_IN_READ_INDEX
: when doing an index-only read on a secondary key, primary key columns will be returned too. InnoDB, MyRocks, TokuDB.HA_PRIMARY_KEY_REQUIRED_FOR_POSITION
: toposition()
a handler, primary key is needed. Otherwise, position works as a row counter. InnoDB, MyRocks, TokuDB.HA_COUNT_ROWS_INSTANT
: not sure what the difference fromHA_STATS_RECORDS_IS_EXACT
is. The name suggests that counting records is instant, and the comment suggests that it's exact. MyISAM.HA_READ_BEFORE_WRITE_REMOVAL
: supports blind writes, i.e. to write a row you don't have to read it first. NDB. But, this has no relation to a similar optimization in MyRocks and TokuDB.HA_BLOCK_CONST_TABLE
: disables const-table optimization. NDB.HA_CAN_FULLTEXT_HINTS
: can provide extra FULLTEXT-related info inEXPLAIN SELECT
? InnoDB.
Implementation Details
HA_UNUSED3
: used to beHA_REC_NOT_IN_SEQ
before 8.0. Implementation detail.HA_REQUIRES_KEY_COLUMNS_FOR_DELETE
: need to read all the key columns before deleting a row by that key. Unused in MySQL, used by MariaDB MyISAM and InnoDB.HA_DUPLICATE_POS
: on duplicate key error, provide the position of the existing key.HA_AUTO_PART_KEY
: multi-part keys can include auto increment columns. TokuDB.HA_REQUIRE_PRIMARY_KEY
: a user-provided primary key must exist. No SE uses this.HA_UNUSED14
: until 5.7 it wasHA_CAN_INSERT_DELAYED
. MyISAM.HA_PRIMARY_KEY_REQUIRED_FOR_DELETE
: DELETE and UPDATE work by primary key. performance_schema.HA_FILE_BASED
: each table is stored in a separate file. MyISAM, TokuDB.HA_NO_COPY_ON_ALTER
: the name suggests that "ALTER TABLE
does not copy data", but this is different from onlineALTER
. MRG_MyISAM.HA_HAS_OWN_BINLOGGING
: NDB.HA_BINLOG_ROW_CAPABLE
,HA_BINLOG_STMT_CAPABLE
: what binary log formats are supported.HA_DUPLICATE_KEY_NOT_IN_ORDER
: multiple key conflicts in a multiple value-REPLACE
statements are reported not in the ascending key name order. Unused in MySQL, used by CONNECT storage engine in MariaDB.HA_CAN_FULLTEXT_EXT
: supports "extended fulltext API", i.e. it is InnoDB and not MyISAM.HA_READ_OUT_OF_SYNC
: the storage engine is BLACKHOLE. I cannot make more sense out of the description: "what you read will not be what is expected to be in the table".HA_NO_READ_LOCAL_LOCK
: does not supportLOCK TABLE READ LOCAL
, and does not want them to be upgraded toLOCK TABLE READ
by the server. InnoDB.HA_ATTACHABLE_TRX_COMPATIBLE
: the storage engine is "compatible" with attachable transactions. It is a superset of "supporting" them. InnoDB and MyISAM.HA_NO_INDEX_ACCESS
: the indexes are not really indexes, because they don't support access through them. MOCK (that's a storage engine name, not a suggestion).
Index Flags
These are capabilities for individual indexes.
HA_READ_NEXT
: the index can go from one record to the next. Not used, always assumed.HA_READ_PREV
: the index can go to the previous record.HA_READ_ORDER
: the index is ordered. Usually might mean not a hash index.HA_READ_RANGE
: the index supports ranges, again not a hash index, usually.HA_ONLY_WHOLE_INDEX
: the index does not support key prefix search. Hash indexes again.HA_TABLE_SCAN_ON_NULL
: the index does not store NULLs even if the underlying column does, forcing a switch to table scan. NDB.HA_KEYREAD_ONLY
: index-only scans are supported. InnoDB, MyRocks, TokuDB.HA_KEY_SCAN_NOT_ROR
: Not sure. Will update if I figure it out. InnoDB.HA_DO_INDEX_COND_PUSHDOWN
: supports Index Condition Pushdown. InnoDB, MyRocks, TokuDB.
Handler Partitioning Flags
Storage engines can implement partitioning with different levels of capabilities.
HA_CAN_UPDATE_PARTITION_KEY
: can doUPDATE
involving the partitioning key. NDB.HA_CAN_PARTITION_UNIQUE
: can handle unique indexes across partitions correctly. NDB.HA_USE_AUTO_PARTITION
: not sure what "automatic partitioning" is, but this is something NDB supports.HA_CAN_EXCHANGE_PARTITION
: canALTER TABLE EXCHANGE PARTITION
with a non-partitioned table. InnoDB.HA_CANNOT_PARTITION_FK
: a foreign key cannot be the partition key. MyRocks, TokuDB.HA_TRUNCATE_PARTITION_PRECLOSE
:TRUNCATE PARTITION
requires closing the involved tables. InnoDB.
Writing this post resulted in three bug reports asking to remove get_tablespace, HA_REQUIRES_KEY_COLUMNS_FOR_DELETE, and HA_DUPLICATE_KEY_NOT_IN_ORDER.
1 comment:
"this is not a real OLTP database storage engine. MyISAM."
If only they had gone the Mongo/mmap route -- crash safe, single writer, multiple reader. That would have given MyISAM a few more years of happy users.
Post a Comment