Description:
When creating a VECTOR column whose dimension is large enough that multiplying it by 4 (sizeof(float)) overflows a 32-bit integer and wraps around modulo 2^32, MySQL silently accepts the DDL and stores a completely wrong, much smaller dimension instead of rejecting the statement with an error.
Examples on MySQL 9.6.0:
VECTOR(1073741825) → silently stored as vector(1)
VECTOR(1073741826) → silently stored as vector(2)
The documentation states the maximum allowed dimension is 16,383. Values far beyond this limit are accepted without any warning or error, and the resulting column has an incorrect dimension due to integer overflow in the byte-count calculation (dimension * 4) performed before the range validation.
This means a schema defined as VECTOR(1073741825) is created as VECTOR(1) without any error. Data inserted afterwards is stored with the wrong dimensionality, which amounts to silent data corruption.
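The wraparound described above can be reproduced outside the server. The following is a minimal sketch, not the actual MySQL code: the function names are hypothetical, and it assumes the byte count is held in a 32-bit unsigned integer and that sizeof(float) is 4.

```cpp
#include <cstdint>

static_assert(sizeof(float) == 4, "sketch assumes a 4-byte float");

// Hypothetical stand-in for the byte-count computation: the product is
// evaluated in 32-bit unsigned arithmetic, so it wraps modulo 2^32.
constexpr std::uint32_t wrapped_byte_count(std::uint32_t dimension) {
  return dimension * static_cast<std::uint32_t>(sizeof(float));
}

// Dimension recovered from the wrapped byte count, i.e. the value that
// would end up in the column definition.
constexpr std::uint32_t stored_dimension(std::uint32_t dimension) {
  return wrapped_byte_count(dimension) /
         static_cast<std::uint32_t>(sizeof(float));
}
```

Under these assumptions, stored_dimension(1073741825) evaluates to 1 and stored_dimension(1073741826) evaluates to 2, matching the observed vector(1) and vector(2) columns, while an in-range value such as 16383 is returned unchanged.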
Thank you,
Yakir Gibraltar
How to repeat:
-- Tested on MySQL 9.6.0 (Docker image)
DROP DATABASE IF EXISTS vectordb;
CREATE DATABASE vectordb;
USE vectordb;
-- Case 1: 1073741825 * 4 = 4294967300, overflows to 4 bytes → dimension 1
CREATE TABLE t_wrap1 (v VECTOR(1073741825));
SHOW CREATE TABLE t_wrap1;
-- Expected: ERROR - dimension out of range (max is 16383)
-- Observed: `v` vector(1) ← WRONG dimension, no error
-- Case 2: 1073741826 * 4 = 4294967304, overflows to 8 bytes → dimension 2
CREATE TABLE t_wrap2 (v VECTOR(1073741826));
SHOW CREATE TABLE t_wrap2;
-- Observed: `v` vector(2) ← WRONG dimension, no error
-- The column silently stores data as dimension-1:
INSERT INTO t_wrap1 VALUES (TO_VECTOR('[1]'));
SELECT VECTOR_DIM(v), FROM_VECTOR(v) FROM t_wrap1;
-- Returns: 1 | [1.00000e+00]
-- (user intended a 1073741825-dimensional vector, gets a 1-dimensional one)
Suggested fix:
Validate the dimension against MAX_VECTOR_DIMENSION (16383) before computing the byte count. The overflow occurs in sql/parse_tree_column_attrs.h (around lines 772-775), where the dimension is multiplied by sizeof(float) to produce a byte count. The fix is to range-check the dimension value itself, before the multiplication and using 64-bit arithmetic, or to add an explicit guard:
if (dimension < 1 || dimension > MAX_VECTOR_DIMENSION) {
  // raise ER_TOO_BIG_VECTOR_COLUMN
}
This check must occur before any multiplication to avoid the overflow path.
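As a standalone illustration of the guard, here is a minimal sketch under stated assumptions. The function name, the constant name, and the bool-plus-out-parameter interface are hypothetical and do not reflect the server's parser code. The limit of 16383 is taken from this report, and a false return stands in for raising an error such as ER_TOO_BIG_VECTOR_COLUMN.

```cpp
#include <cstdint>

// Assumed limit, taken from the documented maximum quoted in this report.
constexpr std::uint64_t kMaxVectorDimension = 16383;

// Hypothetical helper: validates the dimension in 64-bit arithmetic first,
// and only then computes the byte count. Returns false where the server
// would reject the statement; *bytes is left untouched in that case.
bool checked_vector_bytes(std::uint64_t dimension, std::uint32_t *bytes) {
  if (dimension < 1 || dimension > kMaxVectorDimension) {
    return false;  // out of range: no multiplication is ever performed
  }
  // Safe: at most 16383 * sizeof(float), far below the 32-bit limit.
  *bytes = static_cast<std::uint32_t>(dimension * sizeof(float));
  return true;
}
```

With this ordering, a dimension of 1073741825 is rejected before any multiplication happens, while the boundary value 16383 still yields a valid byte count.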