Bug #26832 ENUM types not optimized
Submitted: 4 Mar 2007 18:17 Modified: 5 Apr 2011 5:46
Reporter: Roberto Spadim (Basic Quality Contributor) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:5.5 OS:Any
Assigned to: CPU Architecture:Any
Tags: ENUM OPTIMIZATION, qc

[4 Mar 2007 18:17] Roberto Spadim
Description:
enum types not optimized when i search a value that don't bellow to enum values, like
enum('a','b','c')
select * from table where enum_field='d'

this make an table scan or index scan, see the screnshots from my query

How to repeat:
create a table with enum and search with a non enum value

Suggested fix:
when using enum fields check it and make a 0 rows optimization before start table scan optimizations
[13 Apr 2007 13:08] Valeriy Kravchuk
Thank you for a problem report, and sorry for a delay with its processing. Please, check with a newer version, 5.1.17, and, in case of the same problem, send a complete test case that demonstrates the behaviour described.
[13 May 2007 23:02] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[14 May 2007 4:10] Valeriy Kravchuk
Roberto,

Please, check with a newer version, 5.1.17, and, in case of the same problem,
send a complete test case that demonstrates the behaviour described.
[14 Jun 2007 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[15 Jul 2007 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[16 Aug 2007 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[29 Mar 2011 18:50] Roberto Spadim
the problem continue
create table with a ENUM field and select with a value that´s not in enum values

ok it´s nice to read raw values of myisam tables when crashed, but maybe a optimization could help us here, i don´t know if we will have problems, since a enum not null field with a value diferent from enum values isn´t a NULL value, it´s just a NOVALUE?! i don´t know what to do here hehe, maybe a warning when INSERT or UPDATE? i don´t know what to do with this

select * from mov_contacorrente where lote_tipo='beto' (2 seconds, 700000rows in table, full table scan)

create table:
CREATE TABLE `mov_contacorrente` (
  `cc_hash_key` varchar(75) NOT NULL DEFAULT '',
  `origem_tipo` enum('cob','doc','bc') NOT NULL DEFAULT 'doc',
  `hash_origem` varchar(75) NOT NULL DEFAULT '',
  `linha_digitavel` varchar(75) NOT NULL,
  `plano_cobranca` int(10) unsigned NOT NULL DEFAULT '0',
  `tipo_operacao` enum('tr','mp','cv','ie','ma','') NOT NULL DEFAULT '',
  `tipo_cobranca` enum('dv','cd','cs') NOT NULL DEFAULT 'dv',
  `evento_id` int(10) unsigned NOT NULL DEFAULT '0',
  `sacador_tipo` enum('f','j') NOT NULL DEFAULT 'f',
  `sacador_id` bigint(20) NOT NULL DEFAULT '0',
  `vendedor_tipo` enum('f','j') NOT NULL DEFAULT 'f',
  `vendedor_id` int(10) unsigned NOT NULL DEFAULT '0',
  `credor_tipo` enum('f','j') NOT NULL DEFAULT 'f',
  `credor_id` bigint(20) NOT NULL DEFAULT '0',
  `credor_ca` bigint(20) NOT NULL DEFAULT '0',
  `credor_co` int(11) NOT NULL DEFAULT '0',
  `credor_centro` bigint(20) NOT NULL DEFAULT '0',
  `credor_conta` bigint(20) unsigned NOT NULL DEFAULT '0',
  `credor_carteira` bigint(20) unsigned NOT NULL DEFAULT '0',
  `credor_convenio` bigint(20) unsigned NOT NULL DEFAULT '0',
  `devedor_tipo` enum('f','j') NOT NULL DEFAULT 'f',
  `devedor_id` bigint(20) NOT NULL DEFAULT '0',
  `devedor_ca` bigint(20) NOT NULL DEFAULT '0',
  `devedor_co` int(11) NOT NULL DEFAULT '0',
  `devedor_centro` bigint(20) NOT NULL DEFAULT '0',
  `devedor_conta` bigint(20) unsigned NOT NULL DEFAULT '0',
  `devedor_carteira` bigint(20) unsigned NOT NULL DEFAULT '0',
  `praca_pagamento_tipo` enum('f','j') NOT NULL DEFAULT 'f',
  `praca_pagamento_id` bigint(20) NOT NULL DEFAULT '0',
  `liquidez` float(5,2) NOT NULL DEFAULT '0.00',
  `prioridade_pgto` mediumint(3) unsigned NOT NULL DEFAULT '0',
  `moeda` char(3) NOT NULL DEFAULT 'R$',
  `valor_face` decimal(17,5) NOT NULL DEFAULT '0.00000',
  `valor_residual` decimal(17,5) NOT NULL DEFAULT '0.00000',
  `valor_residual_compensado` decimal(17,5) NOT NULL DEFAULT '0.00000',
  `valor_emdigitacao` decimal(17,5) NOT NULL DEFAULT '0.00000',
  `valor_pago_moeda_default` decimal(17,5) NOT NULL DEFAULT '0.00000',
  `valor_pago_dinheiro_default` decimal(17,5) NOT NULL DEFAULT '0.00000',
  `sit_cobranca` enum('n','s','l','lb') NOT NULL DEFAULT 'n',
  `data_vencimento_face` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `data_emissao_face` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `data_vencimento_gmt` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `data_vencimento_credor` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `data_vencimento_devedor` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `data_vencimento_auto` enum('Y','N') NOT NULL DEFAULT 'N',
  `data_entrada_gmt` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `data_entrada_credor` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `data_entrada_devedor` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `data_liquidacao_gmt` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `data_compensacao_gmt` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `data_compensacao_credor` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `data_compensacao_devedor` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `data_ultima_cobranca` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `data_proxima_cobranca` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `negativado` enum('Y','N') NOT NULL DEFAULT 'N',
  `protestado` enum('Y','N') NOT NULL DEFAULT 'N',
  `protesto_data` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `tabeliao_tipo` enum('f','j') NOT NULL DEFAULT 'f',
  `tabeliao_id` bigint(20) NOT NULL DEFAULT '0',
  `multa_data_prevista` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `multa_data` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `multa_valor_tipo` enum('%','$','%rs') NOT NULL DEFAULT '%',
  `multa_valor` decimal(17,5) NOT NULL DEFAULT '0.00000',
  `multa_valor_cobrado` decimal(17,5) NOT NULL DEFAULT '0.00000',
  `juros_data_prevista` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `juros_tipo` enum('$','%fs','%fc','%rs') NOT NULL DEFAULT '$',
  `juros_valor` decimal(17,5) NOT NULL DEFAULT '0.00000',
  `juros_periodo` mediumint(3) unsigned NOT NULL DEFAULT '1',
  `juros_periodo_tipo` enum('dia','semana','mes','ano') NOT NULL DEFAULT 'dia',
  `repasse_por` double NOT NULL DEFAULT '0',
  `repasse_pessoa_tipo` enum('f','j') NOT NULL DEFAULT 'f',
  `repasse_pessoa_id` bigint(20) NOT NULL DEFAULT '0',
  `contador_relacionamentos` int(10) unsigned NOT NULL DEFAULT '0',
  `contador_repasse` int(10) unsigned NOT NULL DEFAULT '0',
  `contador_ocorrencia` int(10) unsigned NOT NULL DEFAULT '0',
  `contador_juros` int(10) unsigned NOT NULL DEFAULT '0',
  `bordero` decimal(10,1) NOT NULL DEFAULT '0.0',
  `lote_unidade` int(11) NOT NULL DEFAULT '0',
  `lote_data` date NOT NULL DEFAULT '0000-00-00',
  `lote_tipo` char(1) NOT NULL DEFAULT 'r',
  `lote_numero` int(10) NOT NULL DEFAULT '0',
  `digitador_tipo` enum('f','j') NOT NULL DEFAULT 'f',
  `digitador_id` bigint(20) NOT NULL DEFAULT '0',
  `sit_spa` enum('d','s','p','a','mov') NOT NULL DEFAULT 'd',
  `data_spa_d` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `data_spa_s` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `data_spa_p` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `data_spa_a` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `data_spa_mov` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `repasse_tipo` enum('total','cob') NOT NULL DEFAULT 'total',
  `importado` varchar(25) NOT NULL,
  `rodado` enum('Y','N') NOT NULL DEFAULT 'N',
  `desconto_data_prevista` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `desconto_tipo` enum('$','%fs','%fc','%rs') NOT NULL DEFAULT '%fs',
  `desconto_valor` decimal(17,5) NOT NULL DEFAULT '0.00000',
  `pess_mov_tipo` enum('f','j') NOT NULL DEFAULT 'f',
  `pess_mov_id` int(10) unsigned NOT NULL DEFAULT '0',
  `desconto_financeiro` decimal(17,5) NOT NULL DEFAULT '0.00000',
  `imposto_retido` decimal(17,5) NOT NULL DEFAULT '0.00000',
  `devedor_ca_descfin` int(11) NOT NULL DEFAULT '0',
  `credor_ca_descfin` int(11) NOT NULL DEFAULT '0',
  `devedor_ca_impret` int(11) NOT NULL DEFAULT '0',
  `credor_ca_impret` int(11) NOT NULL DEFAULT '0',
  `recompra_Valor` decimal(17,5) NOT NULL DEFAULT '0.00000',
  `sit_venda_titulo` enum('','d','r','l') NOT NULL DEFAULT '',
  `agente_financeiro_tipo` enum('f','j') NOT NULL DEFAULT 'f',
  `agente_financeiro_id` int(11) NOT NULL DEFAULT '0',
  `data_venda_bordero` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `data_recompra` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  PRIMARY KEY (`cc_hash_key`),
  KEY `origem` (`hash_origem`,`cc_hash_key`),
  KEY `bordero` (`sacador_tipo`,`sacador_id`,`bordero`),
  KEY `lote` (`lote_unidade`,`lote_data`,`lote_numero`),
  KEY `devedor` (`devedor_tipo`,`devedor_id`,`data_vencimento_gmt`),
  KEY `residual_devedor` (`valor_residual`,`devedor_tipo`,`devedor_id`),
  KEY `credor_emi` (`credor_tipo`,`credor_id`,`data_emissao_face`),
  KEY `devedor_emi` (`devedor_tipo`,`devedor_id`,`data_emissao_face`),
  KEY `credor` (`credor_tipo`,`credor_id`,`data_vencimento_gmt`,`credor_convenio`),
  KEY `vencimento` (`data_vencimento_gmt`),
  KEY `venc` (`data_vencimento_gmt`,`cc_hash_key`),
  KEY `emi` (`data_emissao_face`,`cc_hash_key`),
  KEY `t_bol` (`evento_id`,`lote_tipo`,`importado`,`cc_hash_key`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
PARTITIONS 4 */
[29 Mar 2011 18:50] Roberto Spadim
stay with this problem in mysql 5.5
changing to performace not serious
[3 Apr 2011 12:17] Valeriy Kravchuk
I had never seen complete, correct and repeatable test case in this bug report. In your last table the column you refer to is NOT ENUM:

 `lote_tipo` char(1) NOT NULL DEFAULT 'r',

and is NOT indexed. Why do you expect anything else bug full table scan is beyond my understanding.

In my test simple test case:

mysql> show create table te\G
*************************** 1. row ***************************
       Table: te
Create Table: CREATE TABLE `te` (
  `c1` enum('a','b','c') DEFAULT NULL,
  `c2` int(11) DEFAULT NULL,
  KEY `c1` (`c1`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

mysql> insert into te values('1',1),('2',2);
Query OK, 2 rows affected (0.01 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> insert into te select * from te;
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> insert into te select * from te;
Query OK, 4 rows affected (0.01 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> insert into te select * from te;
Query OK, 8 rows affected (0.00 sec)
Records: 8  Duplicates: 0  Warnings: 0

mysql> insert into te select * from te;
Query OK, 16 rows affected (0.00 sec)
Records: 16  Duplicates: 0  Warnings: 0

mysql> insert into te select * from te;
Query OK, 32 rows affected (0.01 sec)
Records: 32  Duplicates: 0  Warnings: 0

mysql> insert into te select * from te;
Query OK, 64 rows affected (0.01 sec)
Records: 64  Duplicates: 0  Warnings: 0

mysql> insert into te select * from te;
Query OK, 128 rows affected (0.02 sec)
Records: 128  Duplicates: 0  Warnings: 0

mysql> insert into te values('3',3);
Query OK, 1 row affected (0.00 sec)

mysql> explain select * from te where c1 = '3';
+----+-------------+-------+------+---------------+------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra       |
+----+-------------+-------+------+---------------+------+---------+-------+------+-------------+
|  1 | SIMPLE      | te    | ref  | c1            | c1   | 2       | const |    1 | Using where |
+----+-------------+-------+------+---------------+------+---------+-------+------+-------------+
1 row in set (0.02 sec)

mysql> explain select * from te where c1 = '1';
+----+-------------+-------+------+---------------+------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra       |
+----+-------------+-------+------+---------------+------+---------+-------+------+-------------+
|  1 | SIMPLE      | te    | ref  | c1            | c1   | 2       | const |  128 | Using where |
+----+-------------+-------+------+---------------+------+---------+-------+------+-------------+
1 row in set (0.00 sec)

mysql> explain select * from te where c1 = '4';
+----+-------------+-------+------+---------------+------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref   | rows | Extra       |
+----+-------------+-------+------+---------------+------+---------+-------+------+-------------+
|  1 | SIMPLE      | te    | ref  | c1            | c1   | 2       | const |    1 | Using where |
+----+-------------+-------+------+---------------+------+---------+-------+------+-------------+
1 row in set (0.00 sec)

I see expected results. Even though '4' is not a valid value for ENUM, index is used to search for it, and will aloow to find out fast enough zero rows with this value exist.

Is this just a feature request to check if value for ENUM column is valid one even before optimization? Please, clarify.
[3 Apr 2011 15:42] Roberto Spadim
sorry about no test case, i will try to explain with your example

------
... all insert´s / create table on your example was done ... continuing ...

insert into te values('g',1),('f',2);
/* 2 rows affected, 0 rows found. Duration for 2 queries: 0,093 sec. */
sql_mode: PIPES_AS_CONCAT,ALLOW_INVALID_DATES

show warnings;
"Level","Code","Message"
"Warning","1265","Dado truncado para coluna 'c1' na linha 1"
"Warning","1265","Dado truncado para coluna 'c1' na linha 2"
------

no problems... it´s what i want... but there´s a 'default' value when truncate occur (i don´t know if it´s a null value, or '' or 0 index of enum field, but we can select it with where c1=0 or c1='' sometimes, i didn´t checked this in mysql docs) check:

------
select * from te ;
"c1","c2"
... some values are repeated....
"a","1"
"b","2"
"c","3"
... last two rows: ...
"","1"
"","2"
------

"" -> the 'default' value for truncated field (ok no problems)

the problem is:
------
select * from te where c1 = 'g';
the result: 
/* 0 rows affected, 0 rows found. Duration for 1 query: 0,093 sec. */

but check that no optimization was done

explain select * from te where c1 = 'g';
"id","select_type","table","type","possible_keys","key","key_len","ref","rows","Extra"
"1","SIMPLE","te","ALL",NULL,NULL,NULL,NULL,"261","Using where"
------
full table scan!? 261 rows!

i was thinking about something like this result:

------
delete from te;
explain select * from te where c1 = 'g';
"id","select_type","table","type","possible_keys","key","key_len","ref","rows","Extra"
"1","SIMPLE",NULL,NULL,NULL,NULL,NULL,NULL,NULL,"Impossible WHERE noticed after reading const tables"
------
no row scan since c1='g' don´t exists and it´s not a default value for truncated values c1=0 or c1='' for example

but check that this works: (without deleting before...)

------
this: select * from te where c1=0
and this: select * from te where c1=''
result the same values for truncated values (i think it´s ok):

"c1","c2"
"","1"
"","2"
------
nice, it´s what i want too select with 'default' truncated value for enum

.............................
the optimization idea:
select * from te where c1=0 or select * from te where c1=''
use the default 'truncated' value like today implementation, no changes/feature request here

but if it´s a truncated value where
select * from te where c1='abc' (c1='abc' will return 0 rows)

check if enum('abc') exists (default truncated values are alowed like c1=0 or c1='')
if not c1='abc' should be rewrite as something like '0=1' 
like this:

------
explain select * from te  where 0=1
"id","select_type","table","type","possible_keys","key","key_len","ref","rows","Extra"
"1","SIMPLE",NULL,NULL,NULL,NULL,NULL,NULL,NULL,"Impossible WHERE"
------
no full table scan! :)

c1='abc' is a impossible WHERE
but c1='' or c1=0 isn´t a impossible WHERE for truncated values (nice!)

result: no full table scan for enum fields
this is good for some alter table, and not optimized app
i don´t know if it´s easy to implement, but it´s a good optimization for 'bad' coded apps
[3 Apr 2011 15:45] Roberto Spadim
don´t need index here
[3 Apr 2011 16:24] Valeriy Kravchuk
OK, we have a reasonable feature request here for new kind of "Impossible WHERE" check for ENUMs.
[3 Apr 2011 18:25] Roberto Spadim
nice! :)
any help contact me =)
[3 Apr 2011 18:28] Roberto Spadim
this apply to SET() type too?
i never used SET i preffer two tables (1:N tables with foreign key for this, or app serialization) could you test it? i don´t know how to use SET fields
[5 Apr 2011 5:46] Roberto Spadim
changed to any OS
need some test with SET field type but ENUM type optimizer 'work', is near to set type 'work'