Bug #92759 | Add TABLESAMPLE support | ||
---|---|---|---|
Submitted: | 12 Oct 2018 6:14 | Modified: | 19 Nov 2018 14:22 |
Reporter: | Daniël van Eeden (OCA) | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S4 (Feature request) |
Version: | 8.0 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | compatibility, tablesample |
[12 Oct 2018 6:14]
Daniël van Eeden
[12 Oct 2018 6:15]
Daniël van Eeden
Corrected the severity
[12 Oct 2018 8:00]
Daniël van Eeden
Microsoft SQL Server, Hive and IBM DB2 also seem to implement this.
[12 Oct 2018 11:08]
MySQL Verification Team
Hello Daniël, Thank you for the report and feature request! regards, Umesh
[13 Oct 2018 18:50]
Ruud H.G. van Tol
If you can accept the extra index-disk-space-usage, and there is <some_column> with plenty of variance, then this is a workaround: ALTER TABLE t1 ADD column randy smallint unsigned as (CONV(LEFT(MD5(some_column),4),16,10)) , ADD INDEX (randy) ; The next steps would obviously be: SELECT * FROM t1 WHERE randy >= FLOOR(65536 * RAND()) LIMIT 100 ; That can easily return less than 100 rows, so in that case you could retry with: SELECT * FROM t1 WHERE randy < FLOOR(65536 * RAND()) ORDER BY randy DESC LIMIT 100 ; or alternatively: SELECT * FROM t1 WHERE randy >= 0 LIMIT 42 ; (where 42 is the number of missing rows)
[19 Nov 2018 14:22]
Daniël van Eeden
Using mysqldump with a tablesample could be useful to setup a development/acceptance copy with a percentage of the data. This will probably not really work with foreign keys.
[11 Mar 2024 9:16]
bro Simon
May I ask how mysqldump uses tablesample? I couldn't find any relevant options in the man pages of mysqldump