Bug #120223 VECTOR(0) silently accepted, produces unusable column with corrupted metadata (dimension 4294967295)
Submitted: 7 Apr 12:06
Modified: 9 Apr 19:32
Reporter: Yakir Gibraltar (OCA)
Status: Verified
Category: MySQL Server: Data Types
Severity: S2 (Serious)
Version: 9.6.0
OS: Linux
Assigned to:
CPU Architecture: Any

[7 Apr 12:06] Yakir Gibraltar
Description:
VECTOR(0) is silently accepted as a valid column definition. Instead of raising an error, MySQL creates a column with corrupted metadata:

  - SHOW CREATE TABLE reports the dimension as vector(4294967295), i.e. UINT32_MAX. The byte length for dimension 0 is 0 * 4 = 0, and the displayed value is consistent with an unsigned 32-bit underflow somewhere after that computation.
  - INFORMATION_SCHEMA.COLUMNS reports CHARACTER_MAXIMUM_LENGTH = 0.
  - Every INSERT fails with "ERROR 1406: Data too long for column", making the column permanently unusable.

The same broken path is also triggered by VECTOR(2147483648), whose byte count (2147483648 * 4 = 2^33) would wrap to 0 in 32-bit arithmetic; it produces vector(0) in SHOW CREATE TABLE and a similarly unusable column.

A minimum dimension of 1 is never enforced at parse time, so dimension 0 passes through to storage where it causes metadata corruption.
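
The arithmetic behind both observed values can be checked in isolation. The following is a minimal C++ sketch, not MySQL server code: the helper names byte_length_u32, dimension_from_bytes and decrement_u32 are hypothetical, and the assumption that the byte count is computed in 32-bit unsigned arithmetic is inferred from the reported output only. For dimension 0, the exact operation that yields UINT32_MAX is not identified in this report; the sketch shows only that decrementing an unsigned zero is one operation that gives that value.

```cpp
#include <cstdint>

// Hypothetical helper: byte count for VECTOR(dimension) computed entirely
// in 32-bit unsigned arithmetic, assuming 4 bytes per element.
constexpr std::uint32_t byte_length_u32(std::uint32_t dimension) {
  return dimension * 4u;  // wraps modulo 2^32 on overflow
}

// Hypothetical helper: recover a displayed dimension from a byte count.
constexpr std::uint32_t dimension_from_bytes(std::uint32_t bytes) {
  return bytes / 4u;
}

// Hypothetical helper: an unsigned decrement, which wraps at zero.
constexpr std::uint32_t decrement_u32(std::uint32_t value) {
  return value - 1u;
}

// Case 2: 2147483648 * 4 = 2^33, which wraps to 0 in 32 bits, so the
// recovered dimension is 0, matching the observed vector(0).
static_assert(byte_length_u32(2147483648u) == 0u, "2^33 wraps to 0");
static_assert(dimension_from_bytes(byte_length_u32(2147483648u)) == 0u,
              "recovered dimension is 0");

// Case 1: 0 * 4 is simply 0 and does not wrap on its own.
static_assert(byte_length_u32(0u) == 0u, "0 * 4 is 0");

// Assumption: a later "minus one" style step on an unsigned zero is one
// way to arrive at 4294967295 (UINT32_MAX).
static_assert(decrement_u32(0u) == UINT32_MAX, "0 - 1 wraps to UINT32_MAX");
```

All checks are static_assert declarations, so compiling the file is enough to confirm the arithmetic.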

Thank you,
Yakir Gibraltar

How to repeat:
-- Tested on MySQL 9.6.0 (Docker image)

DROP DATABASE IF EXISTS vectordb;
CREATE DATABASE vectordb;
USE vectordb;

-- Case 1: VECTOR(0)
CREATE TABLE t_zero (v VECTOR(0));
SHOW CREATE TABLE t_zero;
-- Expected: ERROR - dimension must be >= 1
-- Observed: `v` vector(4294967295)  <- corrupted UINT32_MAX

SELECT COLUMN_TYPE, CHARACTER_MAXIMUM_LENGTH
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA='vectordb' AND TABLE_NAME='t_zero';
-- Observed: vector(4294967295) | 0  <- contradictory metadata

INSERT INTO t_zero VALUES (TO_VECTOR('[1]'));
-- Observed: ERROR 1406 (22001): Data too long for column 'v'
-- Column is permanently unusable

-- Case 2: VECTOR(2147483648) - same broken path
CREATE TABLE t_negdim (v VECTOR(2147483648));
SHOW CREATE TABLE t_negdim;
-- Observed: `v` vector(0)  <- also corrupted, all inserts fail

Suggested fix:
Enforce a minimum dimension of 1 (and maximum of 16383) at parse time before any byte-count arithmetic, in sql/parse_tree_column_attrs.h:

  if (dimension < 1 || dimension > MAX_VECTOR_DIMENSION) {
      my_error(ER_TOO_BIG_VECTOR_COLUMN, ...);
      return true;
  }

Using 64-bit arithmetic for the byte-count (dimension * 4ULL) would also prevent the unsigned underflow at dimension=0 and the overflow seen in bug #120222.
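
As an illustration of the two suggestions together (range check first, 64-bit byte count second), here is a minimal standalone C++ sketch. It is not a patch against the server: VectorLength, checked_vector_bytes, kMaxVectorDimension and kBytesPerElement are hypothetical names, the limit 16383 is the maximum named above, and the 4-byte element size is assumed from the dimension * 4 arithmetic in this report.

```cpp
#include <cstdint>

// Assumed limits, taken from the figures quoted in this report.
constexpr std::uint64_t kMaxVectorDimension = 16383;
constexpr std::uint64_t kBytesPerElement = 4;

// Hypothetical result type: ok is false when the dimension is rejected.
struct VectorLength {
  bool ok;
  std::uint64_t bytes;
};

// Validate the dimension before any byte-count arithmetic, then compute
// the byte count in 64 bits so the multiplication cannot wrap for any
// accepted value.
constexpr VectorLength checked_vector_bytes(std::uint64_t dimension) {
  return (dimension < 1 || dimension > kMaxVectorDimension)
             ? VectorLength{false, 0}
             : VectorLength{true, dimension * kBytesPerElement};
}

// Both inputs from "How to repeat" are rejected up front.
static_assert(!checked_vector_bytes(0).ok, "VECTOR(0) rejected");
static_assert(!checked_vector_bytes(2147483648ULL).ok,
              "VECTOR(2147483648) rejected");

// The boundary values are accepted with the expected byte counts.
static_assert(checked_vector_bytes(1).bytes == 4, "1 element is 4 bytes");
static_assert(checked_vector_bytes(16383).bytes == 65532,
              "16383 elements are 65532 bytes");
```

With the range check in place, the largest byte count is 65532, so the 64-bit multiplication is a second line of defence rather than the primary fix.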

[9 Apr 19:32] Roy Lyseng
Thank you for the bug report.
Verified as described.