Skip to content
/ server Public

MDEV-38315 Optimize NULL only reads for wide tables in core engines#4627

Open
alexaustin007 wants to merge 2 commits intoMariaDB:mainfrom
alexaustin007:MDEV-38315
Open

MDEV-38315 Optimize NULL only reads for wide tables in core engines#4627
alexaustin007 wants to merge 2 commits intoMariaDB:mainfrom
alexaustin007:MDEV-38315

Conversation

@alexaustin007
Copy link

@alexaustin007 alexaustin007 commented Feb 6, 2026

Implemented MDEV-38315

  • InnoDB uses null-only path: sets nullness from record metadata and skips value/BLOB fetch when value is not needed

    • Aria supports null-only in both ROW_FORMAT=DYNAMIC and ROW_FORMAT=PAGE, with safety gating for internal/temp tables
    • MyISAM supports null-only reads with field mapping and safe gating for normal tables

    Testing
    MTR_BINDIR=../build ./mariadb-test-run.pl innodb.innodb_null_only maria.maria_null_only myisam.myisam_null_only

    Passes broader regression test suits
    MTR_BINDIR=../build ./mariadb-test-run.pl --suite=innodb
    MTR_BINDIR=../build ./mariadb-test-run.pl --suite=myisam
    MTR_BINDIR=../build ./mariadb-test-run.pl --suite=maria

Copy link
Contributor

@dr-m dr-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is an initial review, focusing on the InnoDB part, and not reviewing the changes in row0sel.cc in detail.

Comment on lines 3 to 26
--disable_query_log
--disable_warnings
SET GLOBAL innodb_monitor_disable='module_buffer_page';
SET GLOBAL innodb_monitor_reset_all='module_buffer_page';
SET GLOBAL innodb_monitor_enable='module_buffer_page';
--enable_warnings

CREATE TABLE t (
id INT PRIMARY KEY,
b MEDIUMBLOB NULL,
c INT
) ENGINE=InnoDB ROW_FORMAT=DYNAMIC;

INSERT INTO t VALUES
(1, REPEAT('a', 100000), 1),
(2, NULL, 2),
(3, REPEAT('b', 100000), 3),
(4, NULL, 4);

--let $restart_noprint=2
--let $restart_parameters=--innodb-buffer-pool-load-at-startup=0 --innodb-buffer-pool-dump-at-shutdown=0
--source include/restart_mysqld.inc
--let $restart_parameters=
--disable_query_log
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the server being restarted after the data load? Restarting the server can significantly increase the execution time.

The ROW_FORMAT=DYNAMIC is redundant; this test should work with any value of innodb_default_row_format.

Hiding the queries from the output will make the .result file harder to read.

It is worth noting that in InnoDB, all of VARCHAR, TEXT, BLOB, MEDIUMBLOB and so on are being treated in the same way. I see that we are writing to the column b values that will never fit in a single InnoDB page. Because the maximum length of an InnoDB index record is 16383 bytes, shorter BLOB columns would work as well. Why does the column c exist and why does it allow NULL even if we are only inserting NOT NULL values?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The server restart is intentional for test correctness so physical BLOB page reads from information_schema.innodb_metrics. Without a cold start after every data load, BLOB pages may still be in buffer pool and the assertion can pass falsely. I already removed unnecessary restarts and kept only those needed for I/O validation.

Why does the column c exist and why does it allow NULL even if we are only inserting NOT NULL values?
Good point, fixed in the latest update

Comment on lines 97 to 101
CREATE TABLE t_red (
id INT PRIMARY KEY,
b MEDIUMBLOB NULL,
c INT
) ENGINE=InnoDB ROW_FORMAT=REDUNDANT;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of repeating the test for each ROW_FORMAT, you should make use of the following:

--source include/innodb_default_row_format.inc

Comment on lines 7091 to 7102

templ->null_only = false;
if (!templ->is_virtual
&& bitmap_is_set(&table->null_set, field->field_index)) {
templ->null_only = true;
if (prebuilt->template_type == ROW_MYSQL_WHOLE_ROW
|| prebuilt->select_lock_type == LOCK_X
|| prebuilt->pk_filter
|| prebuilt->in_fts_query) {
templ->null_only = false;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is introducing a large number of frequently executed conditional branches. The conditions on prebuilt could be initialized to a new bool parameter of the function, which the caller would pass as a local variable that is initialized outside the loop, like this:

const bool maybe_null_only= whole_row
		    || prebuilt->select_lock_type == LOCK_X
		    || prebuilt->pk_filter
		    || prebuilt->in_fts_query;

Side note: It looks like the prebuilt->template_type could have been shrunk to a single bit in ab01901.

Then, in this function, inside the pre-existing if (!templ->is_virtual) block we can use the following simple assignment:

templ->null_only = maybe_null_only && bitmap_is_set(&table->null_set, field->field_index);

Comment on lines 4141 to 4145
if (templ->mysql_null_bit_mask) {
mysql_rec[templ->mysql_null_byte_offset]
&= static_cast<byte>
(~templ->mysql_null_bit_mask);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the condition? We could just unconditionally perform the assignment; it should be smaller and faster.

Which test case in the regression test suite is exercising this code path (index condition pushdown)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Comment on lines 446 to 450
@@ -447,6 +447,7 @@ struct mysql_row_templ_t {
type and this field is != 0, then
it is an unsigned integer type */
ulint is_virtual; /*!< if a column is a virtual column */
bool null_only; /*!< only NULL status is required */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check the data structure with GDB ptype/o and try to rearrange the fields and change some data types so that we get better locality of reference and a smaller footprint. I would assume that null_only and mysql_null_bit_mask will be accessed together. The ulint (an alias of size_t) is an overkill and some old legacy that we could fix. For example, would byte be a more suitable data type of mysql_null_bit_mask?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed, Changed mysql_null_bit_mask from ulint to byte

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants