Description:
MyVector (https://github.com/askdba/myvector) is an open-source, GPL-2.0 native vector similarity search extension for MySQL. It implements HNSW approximate nearest-neighbor search and brute-force KNN search via UDFs and stored procedures on VARBINARY columns. As of v1.26.5 (2026-05-08) it ships a MySQL Server Component build (INSTALL COMPONENT) alongside the legacy plugin path, targeting MySQL 8.4 LTS and 9.7 LTS.
This report documents three bugs found in MyVector's own component code while building against MySQL's Component Services API (mysql_udf_registration, log_builtins, event_reader.h). It is submitted to share the findings with the MySQL community and to request review of the component API surface MyVector relies on, in case any of the observed behaviours (e.g. FORMAT_DESCRIPTION_EVENT carrying a non-zero next_log_pos at reconnect) are undocumented edge cases that other component authors may hit.
BUG 1 — FORMAT_DESCRIPTION_EVENT crash loop in MyVector's binlog listener
When the binlog listener reconnects, MySQL sends a FORMAT_DESCRIPTION_EVENT (type=15) with a non-zero next_log_pos pointing past EOF. MyVector's listener incorrectly treated this as a valid continuation position, entering an infinite reconnect crash loop. Fixed in MyVector v1.26.5 by detecting and skipping FORMAT_DESCRIPTION_EVENT before advancing the read position. It would help to know if this next_log_pos behaviour is documented or intentional.
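The skip described above can be sketched as follows. This is a minimal illustration, not MyVector's actual listener code: the EventHeader struct and the advance_position function are hypothetical names introduced here, and only the event type code 15 comes from the report.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified binlog event header (not MyVector's real type).
struct EventHeader {
    std::uint8_t type;           // binlog event type code
    std::uint32_t next_log_pos;  // next position as reported by the server
};

// FORMAT_DESCRIPTION_EVENT has type code 15.
constexpr std::uint8_t kFormatDescriptionEvent = 15;

// Returns the position to resume from after seeing `events`, starting at
// `start_pos`. A FORMAT_DESCRIPTION_EVENT never moves the position.
std::uint64_t advance_position(std::uint64_t start_pos,
                               const std::vector<EventHeader>& events) {
    std::uint64_t pos = start_pos;
    for (const EventHeader& ev : events) {
        if (ev.type == kFormatDescriptionEvent) {
            continue;  // its next_log_pos is not a valid continuation point
        }
        pos = ev.next_log_pos;
    }
    return pos;
}
```

With this shape, a reconnect that delivers only a FORMAT_DESCRIPTION_EVENT leaves the saved position unchanged, so the listener cannot be pushed past EOF by that event.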
BUG 2 — KNNIndex missing coordinate tracking causes double-insert on reconnect
MyVector's KNNIndex (brute-force fallback) was missing the setLastUpdateCoordinates / getLastUpdateCoordinates interface that HNSWIndex correctly implemented. As a result, isAfter() always returned true and already-indexed rows were re-inserted on every binlog reconnect. Fixed in MyVector v1.26.5 by implementing the interface on KNNIndex.
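A minimal sketch of the coordinate-tracking interface, under stated assumptions: the method names come from this report, but the signatures, the IndexBase and KnnIndexSketch classes, and the comparison rule are hypothetical. Binlog file names are compared lexicographically, which assumes zero-padded numeric suffixes of equal width.

```cpp
#include <cstddef>
#include <string>
#include <tuple>

// Hypothetical base interface; signatures are assumed, not MyVector's.
class IndexBase {
public:
    virtual ~IndexBase() = default;
    virtual void setLastUpdateCoordinates(const std::string& file,
                                          std::size_t pos) = 0;
    virtual void getLastUpdateCoordinates(std::string& file,
                                          std::size_t& pos) const = 0;

    // True only if (file, pos) lies strictly after the last indexed event.
    bool isAfter(const std::string& file, std::size_t pos) const {
        std::string last_file;
        std::size_t last_pos = 0;
        getLastUpdateCoordinates(last_file, last_pos);
        return std::tie(file, pos) > std::tie(last_file, last_pos);
    }
};

// Stand-in for a brute-force index that records its coordinates.
class KnnIndexSketch : public IndexBase {
public:
    void setLastUpdateCoordinates(const std::string& file,
                                  std::size_t pos) override {
        last_file_ = file;
        last_pos_ = pos;
    }
    void getLastUpdateCoordinates(std::string& file,
                                  std::size_t& pos) const override {
        file = last_file_;
        pos = last_pos_;
    }

private:
    std::string last_file_;     // empty until the first update is recorded
    std::size_t last_pos_ = 0;
};
```

If the index never stores its coordinates, the comparison always runs against the empty initial state, which matches the reported symptom of isAfter() returning true for every event.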
BUG 3 — Stale binlog checkpoint saved after index build
MyVector's BuildMyVectorIndexSQL was saving the binlog listener's stale reconnect position as the post-build checkpoint instead of querying SHOW BINARY LOG STATUS for the actual server position. This caused pre-build INSERT events to be replayed on restart. Fixed in MyVector v1.26.5 by querying SHOW BINARY LOG STATUS explicitly before saving the checkpoint.
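The effect of the stale checkpoint can be illustrated with a small model. This is a sketch under simplifying assumptions: all events live in a single binlog file, positions are plain integers, and replayed_events is a hypothetical helper, not a MyVector function.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Counts the events a listener resuming at `checkpoint` would replay,
// i.e. events whose end position lies beyond the checkpoint.
// Assumes a single binlog file, so positions are directly comparable.
std::size_t replayed_events(std::uint64_t checkpoint,
                            const std::vector<std::uint64_t>& event_end_positions) {
    return static_cast<std::size_t>(
        std::count_if(event_end_positions.begin(), event_end_positions.end(),
                      [checkpoint](std::uint64_t end_pos) {
                          return end_pos > checkpoint;
                      }));
}
```

For example, if three pre-build INSERT events end at positions 500, 800 and 1100, a stale checkpoint of 200 replays all three, while a checkpoint of 1100 (the Position value SHOW BINARY LOG STATUS would report at that moment) replays none.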
Additional fixes in v1.26.5:
- Thread safety: replaced gmtime/asctime with gmtime_r/asctime_r throughout (issue #87).
- Null guard in the myvector_ann_set row function: *length is set to 0 when the function returns because the index is null (issue #87).
- binary_log namespace conflict in the component build: the declaration needed for MySQL 8.4 is now guarded so that it is skipped on 9.7+, where event_reader.h already declares it.
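The gmtime_r/asctime_r change in the list above can be sketched as follows. This is an illustrative helper only: utc_timestamp is a hypothetical name, and the code assumes a POSIX platform that provides both functions.

```cpp
#include <ctime>
#include <string>
#include <time.h>  // gmtime_r, asctime_r (POSIX)

// Formats `t` as UTC text using caller-owned buffers, so concurrent
// callers do not share the static storage used by gmtime/asctime.
std::string utc_timestamp(std::time_t t) {
    std::tm tm_buf{};
    char text[32];  // asctime_r requires at least 26 bytes
    if (gmtime_r(&t, &tm_buf) == nullptr ||
        asctime_r(&tm_buf, text) == nullptr) {
        return std::string();
    }
    std::string out(text);
    if (!out.empty() && out.back() == '\n') {
        out.pop_back();  // drop the trailing newline asctime_r appends
    }
    return out;
}
```

Each call works on its own std::tm and character buffer, which is what makes it safe to use from the listener thread and UDF threads at the same time.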
Reference: https://github.com/askdba/myvector/blob/main/CHANGELOG.md (section [1.26.5])
How to repeat:
1. Start MySQL 8.4 LTS or 9.7 LTS with the MyVector component installed:
INSTALL COMPONENT 'file://component_myvector';
2. Create a table with a vector column in the vectordb schema (the schema named in the index build call) and build an HNSW index:
CREATE DATABASE IF NOT EXISTS vectordb;
USE vectordb;
CREATE TABLE words50d (
  wordid INT PRIMARY KEY,
  word VARCHAR(50),
  wordvec VARBINARY(200) COMMENT 'MYVECTOR(type=HNSW,dim=50,size=100000,dist=L2)'
);
CALL mysql.myvector_index_build('vectordb.words50d.wordvec', 'wordid');
3. Start the binlog listener for real-time index updates (see docs/ONLINE_INDEX_UPDATES.md).
4. Restart the MySQL server.
Observed (Bug 1): Binlog listener enters an infinite crash loop on reconnect due to FORMAT_DESCRIPTION_EVENT (type=15) returning a non-zero next_log_pos past EOF.
Observed (Bug 2): Insert new rows after the restart. With a KNN index (rather than the HNSW index built in step 2), already-indexed rows are inserted into the index a second time because isAfter() always returns true (setLastUpdateCoordinates/getLastUpdateCoordinates are missing on KNNIndex).
Observed (Bug 3): Run CALL mysql.myvector_index_build(), then restart. Pre-build INSERT events are replayed because the saved checkpoint reflects the stale listener reconnect position, not the position reported by SHOW BINARY LOG STATUS.
All three bugs are reproducible with the component build on MySQL 8.4 LTS and 9.7 LTS. The plugin build on 8.0/8.4/9.0 is not affected.
Docker-based smoke test: https://github.com/askdba/myvector/tree/main/scripts (smoke-component.sh)
Suggested fix:
All three bugs have been fixed in MyVector v1.26.5 (2026-05-08):
Bug 1 fix: Detect FORMAT_DESCRIPTION_EVENT (type=15) in the binlog event loop and skip position advancement; do not treat next_log_pos as a valid continuation point on reconnect.
Bug 2 fix: Implement setLastUpdateCoordinates() and getLastUpdateCoordinates() on KNNIndex to match the HNSWIndex interface, so isAfter() correctly determines whether a row has already been indexed.
Bug 3 fix: In BuildMyVectorIndexSQL, replace the stale listener position with the result of SHOW BINARY LOG STATUS queried immediately before saving the checkpoint.
Thread safety fix: Replace gmtime/asctime with gmtime_r/asctime_r throughout (issue #87).
binary_log namespace fix: Guard conflicting declaration for MySQL 8.4 component builds; 9.7+ already has it in event_reader.h.
Patches and build scripts: https://github.com/askdba/myvector/releases/tag/v1.26.5
Component build helpers: scripts/build-component-8.4-docker.sh, scripts/build-component-9.7-docker.sh
Smoke test: scripts/smoke-component.sh