Bug #3564 wrong optimizer plan
Submitted: 26 Apr 2004 6:26 Modified: 25 Jun 2004 13:32
Reporter: Thomas Mayer Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:4.0.18 OS:Linux (Suse Linux 8.2)
Assigned to: Heikki Tuuri CPU Architecture:Any

[26 Apr 2004 6:26] Thomas Mayer
Description:
The optimizer seems to use a wrong plan in this case there is a lot of joins on the same ID.

MySQL's plan prefers the table with few entrys of send_id-vaues but is not considering it could prefer another table first with more id-values but less rows because of another join with t1.

In My Case, mysql could use t1 as first table containing only 32 rows and join t5 using index idx_datum as second table.
t5 is containing millions of rows, but with the join there should only be 200000 rows of t5 left.

Then it should use t3 as there is a lot of data thrown out.

The rest doesn't play a role afterwards.

Using force index(idx_datum) in t5 does not solve the problem, although mysql thinks it CAN use it as possible key! But I can't force it to do so.

How to repeat:

drop table if exists import_tu_todo;
create temporary table import_tu_todo
select distinct t1.datum 
from timeaxis t1
inner join tu_monat t2 on t1.at_stat between t2.datum_von and t2.datum_bis;

alter table import_tu_todo
add primary key(datum);

explain
select 11, t5.paketnummer, t2.at_stat, t5.zeit,t3.tour, t4.zust_produkt,
concat(t2.at_stat,'_', t3.tour,'_', 11,'_', t4.zust_produkt),
t4.send_id
from import_tu_todo t1 
	inner join bericht t5 on t1.datum=t5.datum
	inner join timeaxis t2 on t1.datum=t2.datum
	inner join bericht t3 on t3.send_id=t5.send_id
	inner join sendung t4 on t4.send_id=t3.send_id
	left join bericht t6 on (t3.send_id=t6.send_id and t6.scannart = 4
		and concat(t6.datum,t6.zeit) between concat(t3.datum,t3.zeit) and concat(t5.datum,t5.zeit)
		)
	left join bericht t7 on (t3.send_id=t7.send_id and t7.scannart = 2 and t7.datum > t3.datum and t7.datum<=t5.datum)
	left join bericht t8 on (t3.send_id=t8.send_id and t8.scannart = 2 and t8.datum<=t3.datum and t7.send_id is not null)
where t3.scannart=3
and t3.eind_tag=1
and t5.scannart=3
and t5.eind_tag=1
and t3.ber_id<>t5.ber_id
and t3.datum<t5.datum
and t6.send_id is null
and t8.send_id is null
;

table,type,possible_keys,key,key_len,ref,rows,Extra
t4,ALL,PRIMARY,NULL,NULL,NULL,9141006,
t3,ref,idx_datum,idx_send_id,idx_send_id,4,t4.send_id,1,Using where
t5,ref,idx_datum,idx_send_id,idx_send_id,4,t3.send_id,1,Using where
t1,eq_ref,PRIMARY,PRIMARY,3,t5.datum,1,Using index
t2,eq_ref,PRIMARY,PRIMARY,3,t1.datum,1,
t6,ref,idx_send_id,idx_send_id,4,t3.send_id,1,Using where; Not exists
t7,ref,idx_datum,idx_send_id,idx_send_id,4,t3.send_id,1,
t8,ref,idx_datum,idx_send_id,idx_send_id,4,t3.send_id,1,Using where; Not exists

====

explain
select 11, t5.paketnummer, t2.at_stat, t5.zeit,t3.tour, t4.zust_produkt,
concat(t2.at_stat,'_', t3.tour,'_', 11,'_', t4.zust_produkt),
t4.send_id
from import_tu_todo t1 
	inner join bericht t5 on t1.datum=t5.datum
	inner join timeaxis t2 on t1.datum=t2.datum
	inner join bericht t3 on t3.send_id=t5.send_id
	inner join sendung t4 on t4.send_id=t3.send_id
where t3.scannart=3
and t3.eind_tag=1
and t5.scannart=3
and t5.eind_tag=1
and t3.ber_id<>t5.ber_id
and t3.datum<t5.datum
and t6.send_id is null
and t8.send_id is null
;

table,type,possible_keys,key,key_len,ref,rows,Extra
t4,ALL,PRIMARY,NULL,NULL,NULL,9141006,
t3,ref,idx_datum,idx_send_id,idx_send_id,4,t4.send_id,1,Using where
t5,ref,idx_datum,idx_send_id,idx_send_id,4,t3.send_id,1,Using where
t1,eq_ref,PRIMARY,PRIMARY,3,t5.datum,1,Using index
t2,eq_ref,PRIMARY,PRIMARY,3,t1.datum,1,

=====
explain
select 11, t5.paketnummer, t2.at_stat, t5.zeit,t3.tour, t4.zust_produkt,
concat(t2.at_stat,'_', t3.tour,'_', 11,'_', t4.zust_produkt),
t4.send_id
from import_tu_todo t1 
	inner join bericht t5 force index(idx_datum) on t1.datum=t5.datum
	inner join timeaxis t2 on t1.datum=t2.datum
	inner join bericht t3 on t3.send_id=t5.send_id
	inner join sendung t4 on t4.send_id=t3.send_id
where t3.scannart=3
and t3.eind_tag=1
and t5.scannart=3
and t5.eind_tag=1
and t3.ber_id<>t5.ber_id
and t3.datum<t5.datum;

table,type,possible_keys,key,key_len,ref,rows,Extra
t5,ALL,idx_datum,NULL,NULL,NULL,13641624,Using where
t1,eq_ref,PRIMARY,PRIMARY,3,t5.datum,1,Using index
t2,eq_ref,PRIMARY,PRIMARY,3,t1.datum,1,
t3,ref,idx_datum,idx_send_id,idx_send_id,4,t5.send_id,1,Using where
t4,eq_ref,PRIMARY,PRIMARY,4,t3.send_id,1,

Suggested fix:
Force Index should always be used if possible and in my case it IS possible as EXPLAIN tells! I thought this is the sense of force index in case the optimizer's plan does not work properly.

In my Opinion, MySQL should even take the right query plan by itsself without having need of a force index.
[26 Apr 2004 7:39] Thomas Mayer
I tried tracking it down to a testcase, but had no luck. Using only a few values, mysql seems to do everything the right way here:

drop table if exists all_ids;
create table all_ids(
id int(10) not null,
primary key(id)
);
insert into all_ids values(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(13),(14),(15),(16),(17),(18),(19),(20);

drop table if exists some_data;
create table some_data(
datum date not null,
value tinyint(3),
primary key(datum)
);
insert into some_data values('2004-03-01',1),('2004-03-02',2),('2004-03-04',3),('2004-03-08',4),('2004-03-09',5);

drop table if exists any_data;
create table any_data(
id int(10) not null,
datum date not null,
value tinyint(3),
any_id int(10) not null auto_increment,
primary key(any_id),
key idx_id(id),
key idx_datum(datum)
);
insert into any_data (id, datum, value)
select id, datum, value
from all_ids, some_data;

insert into any_data (id, datum, value)
select id, datum, value
from all_ids, some_data;

insert into any_data (id, datum, value)
select id, datum, value
from all_ids, some_data;

insert into any_data (id, datum, value)
select id, datum, value
from all_ids, some_data;

explain
select t4.any_id, t5.id, t4.id
from any_data t1
	inner join some_data t2 on t1.datum=t2.datum
	inner join any_data t3 on t1.id=t3.id
	inner join any_data t4 on t1.id=t4.id
	inner join all_ids t5 on t1.id=t5.id
where t1.any_id<>t3.any_id
and t1.value=3
and t3.value=3
and t4.value=2;
[26 Apr 2004 8:42] Thomas Mayer
Here is some additional Information concerning my first comment. This case happens in 'real world'. 

select count(*) from import_tu_todo;
-- returns 32

select count(*)
from import_tu_todo t1
	inner join bericht t2 on t1.datum=t2.datum
where 
t2.scannart=3;

-- returns 219914

====

explain
select t2.paketnummer
from import_tu_todo t1
	inner join bericht t2 on t1.datum=t2.datum
where 
t2.scannart=3;
-- returns:
table,type,possible_keys,key,key_len,ref,rows,Extra
t2,ALL,idx_datum,NULL,NULL,NULL,13641624,Using where
t1,eq_ref,PRIMARY,PRIMARY,3,t2.datum,1,Using index

===

explain
select t2.paketnummer
from import_tu_todo t1
	inner join bericht t2 force index(idx_datum) on t1.datum=t2.datum
where 
t2.scannart=3;
-- returns:
table,type,possible_keys,key,key_len,ref,rows,Extra
t2,ALL,idx_datum,NULL,NULL,NULL,13641624,Using where
t1,eq_ref,PRIMARY,PRIMARY,3,t2.datum,1,Using index
===

I think this tells us that there is something definetely wrong:
MySQL prefers using a full table scan with 13641624 rows instead of using idx_datum which would only return 219914 rows.

Maybe there are some hard-coded numbers integrated in the optimizer instead of using dynamic decisions which consider both ways? It does not make sense to me why the testcase above works and the real-world example does not.

Force index does not really force it.

The field datum is of type date in both tables.

With apologies,
Thomas
[26 Apr 2004 9:41] Thomas Mayer
I think I've got it:

drop table if exists import_tu_todo;
create  table import_tu_todo
select distinct t1.datum 
from timeaxis t1
inner join tu_monat t2 on t1.at_stat between t2.datum_von and t2.datum_bis
and t1.datum between "2004-04-01" and "2004-04-21";

alter table import_tu_todo
add primary key(datum);

explain
select t2.paketnummer
from import_tu_todo t1
	inner join bericht t2 on t1.datum=t2.datum
where 
t2.scannart=3;

-- returns
table,type,possible_keys,key,key_len,ref,rows,Extra
t1,index,PRIMARY,PRIMARY,3,NULL,21,Using index
t2,ref,idx_datum,idx_datum,3,t1.datum,757868,Using where

This query (without explain) takes 4.27 seconds and returns 178619 rows.

====
drop table if exists import_tu_todo;
create  table import_tu_todo
select distinct t1.datum 
from timeaxis t1
inner join tu_monat t2 on t1.at_stat between t2.datum_von and t2.datum_bis
and t1.datum between "2004-04-01" and "2004-04-22";

alter table import_tu_todo
add primary key(datum);

explain
select t2.paketnummer
from import_tu_todo t1
	inner join bericht t2 on t1.datum=t2.datum
where 
t2.scannart=3;
-- returns
table,type,possible_keys,key,key_len,ref,rows,Extra
t2,ALL,idx_datum,NULL,NULL,NULL,13641624,Using where
t1,eq_ref,PRIMARY,PRIMARY,3,t2.datum,1,Using index

This query (without explain) takes 141.42 seconds and returns 192919 rows!

===
Please notice that Example2 has only 1 row more in import_tu_todo!

This means that MySQL optimizer thinks that it would take more time to use idx_datum as soon as there's more than 21 rows in import_tu_todo. But that's not true as 141.42 seconds is much more than 4.27 seconds.

MySQL optimizer should take decisions to perform constantly with growing number of rows or better.
[26 Apr 2004 10:15] Thomas Mayer
-- This would be a workaround for my problem:
explain
select paketnummer
from import_tu_todo t1 force index(primary)
	inner join bericht t2 on t1.datum=t2.datum
where 
t2.scannart=3
and t1.datum>='0000-00-00';

-- returns:
table,type,possible_keys,key,key_len,ref,rows,Extra
t1,range,PRIMARY,PRIMARY,3,NULL,11,Using where; Using index
t2,ref,idx_datum,idx_datum,3,t1.datum,757868,Using where

Note that only "and t1.datum>='0000-00-00'" makes MySQL to take the (nearly)fastest plan.

With apologies,

Thomas
[27 Apr 2004 2:12] Thomas Mayer
The workaround above does not work in all cases. In case the query is getting more complex it does not solve my problem as idx_datum is not being used again.
[28 Apr 2004 16:09] Sergei Golubchik
I don't think it's a bug. MySQL cannot know how many rows in bericht table will match scannart=3 condition (it's 1/70th according to your results). Without this information, optimizer does the best it can: second plan that is expected to retrieve 13641624*1 rows is better than the first, that is expected to retrieve 21*757868=15915228 rows.
MySQL choses the first plan for the first query, because these 21 rows are read directly from the index - note "Using index" - so it slightly reduces the cost. Still when you add a row - 22*757868=16673096, MySQL still prefers the plan with 13641624 rows.

You may use straight join as a workaround.

Do you have an index on scannart ?
[29 Apr 2004 12:05] Thomas Mayer
What is faulty then is the number 757868!
You wrote it was expected 21*757868=15915228 rows.

In fact it should be something about 45000 (see statement below) instead of 757868. And this is independent of the field scannart (see the other statement below).

That is why mysql switches the plan between the two cases (21 days and 22 days), because the number of expected rows is wrong. In result the cost is calculated wrong afterwards. It should be:

22*45000=990000 rows
990000<13641624 (much smaller!)

You can imagine, this would improve performance of my server significantly. Right now I always had to set force index onto idx_datum.

I still consider this to be a bug, so state open remains...

I do optimize table every night, so that should not be the problem.

I also do not want to set an index on scannart because I want to avoid MySQL to use it:
select scannart, count(*) from bericht
group by 1;

returns:
scannart	count(*)
1	6712
2	4168693
3	4161223
4	59520
5	5217835
6	30904
7	90111
8	40341
9	92095
13	120493

select datum, count(*) from bericht
group by 1;

returns:
datum	count(*)
02.01.1980	1
26.02.1980	1
18.05.1980	2
13.02.2001	1
08.01.2002	1
30.11.2002	2
01.12.2002	95
02.12.2002	14
03.12.2002	35
04.12.2002	25
05.12.2002	15
06.12.2002	21
09.12.2002	4
10.12.2002	6
11.12.2002	14
12.12.2002	12
13.12.2002	7
16.12.2002	12
17.12.2002	494
18.12.2002	36
19.12.2002	11
20.12.2002	16
23.12.2002	15
26.12.2002	8
27.12.2002	41
30.12.2002	49
31.12.2002	13
01.01.2003	1
02.01.2003	13863
03.01.2003	18496
04.01.2003	999
06.01.2003	5097
07.01.2003	36619
08.01.2003	41327
09.01.2003	41196
10.01.2003	36157
13.01.2003	41634
14.01.2003	40648
15.01.2003	42413
16.01.2003	44062
17.01.2003	38836
18.01.2003	1
20.01.2003	44041
21.01.2003	43802
22.01.2003	43675
23.01.2003	42196
24.01.2003	41915
27.01.2003	43660
28.01.2003	42797
29.01.2003	44356
30.01.2003	45574
31.01.2003	35178
01.02.2003	2
03.02.2003	48325
04.02.2003	41700
05.02.2003	44501
06.02.2003	43831
07.02.2003	44290
09.02.2003	1
10.02.2003	42530
11.02.2003	41733
12.02.2003	43866
13.02.2003	42279
14.02.2003	39492
17.02.2003	40146
18.02.2003	40119
19.02.2003	40954
20.02.2003	42774
21.02.2003	41326
24.02.2003	39609
25.02.2003	40517
26.02.2003	43024
27.02.2003	43449
28.02.2003	43110
02.03.2003	2
03.03.2003	38961
04.03.2003	34501
05.03.2003	37296
06.03.2003	39912
07.03.2003	40260
09.03.2003	24
10.03.2003	39733
11.03.2003	40127
12.03.2003	44115
13.03.2003	42722
14.03.2003	46092
17.03.2003	47210
18.03.2003	44993
19.03.2003	46296
20.03.2003	49954
21.03.2003	43744
24.03.2003	46169
25.03.2003	42942
26.03.2003	47533
27.03.2003	44842
28.03.2003	44086
29.03.2003	3
31.03.2003	42360
01.04.2003	43328
02.04.2003	44681
03.04.2003	43243
04.04.2003	43358
07.04.2003	42970
08.04.2003	43154
09.04.2003	45202
10.04.2003	44823
11.04.2003	41068
14.04.2003	42287
15.04.2003	42084
16.04.2003	43387
17.04.2003	41813
22.04.2003	38549
23.04.2003	39084
24.04.2003	41876
25.04.2003	40681
28.04.2003	43153
29.04.2003	40627
30.04.2003	44365
2003-05-00	1
02.05.2003	33882
05.05.2003	42767
06.05.2003	42627
07.05.2003	46252
08.05.2003	48717
09.05.2003	39726
12.05.2003	39901
13.05.2003	42313
14.05.2003	42669
15.05.2003	43298
16.05.2003	40779
19.05.2003	39449
20.05.2003	40419
21.05.2003	41986
22.05.2003	42046
23.05.2003	39501
25.05.2003	1
26.05.2003	40619
27.05.2003	42389
28.05.2003	45805
30.05.2003	35919
02.06.2003	37637
03.06.2003	38004
04.06.2003	38233
05.06.2003	39578
06.06.2003	38044
09.06.2003	29
10.06.2003	35444
11.06.2003	36147
12.06.2003	37819
13.06.2003	36631
14.06.2003	1
16.06.2003	37463
17.06.2003	38481
18.06.2003	43012
19.06.2003	31
20.06.2003	37351
21.06.2003	5
22.06.2003	1
23.06.2003	37480
24.06.2003	37690
25.06.2003	40694
26.06.2003	40410
27.06.2003	40951
28.06.2003	3
30.06.2003	40404
01.07.2003	40852
02.07.2003	41375
03.07.2003	41204
04.07.2003	40932
05.07.2003	3
07.07.2003	39657
08.07.2003	37640
09.07.2003	41320
10.07.2003	40426
11.07.2003	37233
12.07.2003	6
14.07.2003	39060
15.07.2003	37205
16.07.2003	39227
17.07.2003	41330
18.07.2003	41273
21.07.2003	38357
22.07.2003	35613
23.07.2003	37674
24.07.2003	38495
25.07.2003	35037
28.07.2003	36532
29.07.2003	37756
30.07.2003	37658
31.07.2003	42035
01.08.2003	30524
02.08.2003	2
04.08.2003	33985
05.08.2003	32985
06.08.2003	34404
07.08.2003	33006
08.08.2003	30732
09.08.2003	1
11.08.2003	31661
12.08.2003	31345
13.08.2003	32878
14.08.2003	31562
15.08.2003	29996
16.08.2003	4
18.08.2003	33235
19.08.2003	33080
20.08.2003	35622
21.08.2003	34757
22.08.2003	33095
23.08.2003	1
25.08.2003	36672
26.08.2003	35219
27.08.2003	38092
28.08.2003	37204
29.08.2003	36432
01.09.2003	39476
02.09.2003	38995
03.09.2003	40635
04.09.2003	39662
05.09.2003	36505
08.09.2003	40679
09.09.2003	39560
10.09.2003	41672
11.09.2003	43023
12.09.2003	40596
15.09.2003	42295
16.09.2003	45569
17.09.2003	44716
18.09.2003	44101
19.09.2003	40466
22.09.2003	41934
23.09.2003	42009
24.09.2003	46738
25.09.2003	47129
26.09.2003	44032
28.09.2003	35
29.09.2003	46232
30.09.2003	46355
01.10.2003	48037
02.10.2003	47555
03.10.2003	2
04.10.2003	1
06.10.2003	45950
07.10.2003	44153
08.10.2003	45574
09.10.2003	48840
10.10.2003	46029
11.10.2003	2
12.10.2003	40
13.10.2003	47372
14.10.2003	47176
15.10.2003	48416
16.10.2003	48135
17.10.2003	44543
18.10.2003	1
20.10.2003	45811
21.10.2003	45182
22.10.2003	46834
23.10.2003	48392
24.10.2003	47876
25.10.2003	4
27.10.2003	48154
28.10.2003	47441
29.10.2003	47694
30.10.2003	43557
31.10.2003	51193
01.11.2003	1
02.11.2003	2
03.11.2003	44711
04.11.2003	45742
05.11.2003	46142
06.11.2003	45101
07.11.2003	43668
10.11.2003	44362
11.11.2003	43493
12.11.2003	47443
13.11.2003	47384
14.11.2003	45362
16.11.2003	2
17.11.2003	46108
18.11.2003	47262
19.11.2003	48588
20.11.2003	49486
21.11.2003	45709
22.11.2003	1
23.11.2003	3
24.11.2003	44911
25.11.2003	52282
26.11.2003	52988
27.11.2003	52677
28.11.2003	47213
01.12.2003	48680
02.12.2003	46779
03.12.2003	49371
04.12.2003	49625
05.12.2003	49768
08.12.2003	46968
09.12.2003	50364
10.12.2003	52817
11.12.2003	53496
12.12.2003	52083
14.12.2003	33
15.12.2003	49081
16.12.2003	52774
17.12.2003	52467
18.12.2003	53970
19.12.2003	47559
21.12.2003	35
22.12.2003	41375
23.12.2003	29161
24.12.2003	73
28.12.2003	8
29.12.2003	20390
30.12.2003	13342
31.12.2003	62
02.01.2004	16533
04.01.2004	3
05.01.2004	17476
06.01.2004	4
07.01.2004	46434
08.01.2004	46857
09.01.2004	41000
11.01.2004	3
12.01.2004	43672
13.01.2004	43875
14.01.2004	47014
15.01.2004	47517
16.01.2004	43588
18.01.2004	26
19.01.2004	44849
20.01.2004	44612
21.01.2004	46394
22.01.2004	46171
23.01.2004	43055
26.01.2004	45389
27.01.2004	43308
28.01.2004	47425
29.01.2004	46454
30.01.2004	46305
01.02.2004	3
02.02.2004	42504
03.02.2004	42679
04.02.2004	45595
05.02.2004	44674
06.02.2004	43488
09.02.2004	42161
10.02.2004	43912
11.02.2004	43569
12.02.2004	46481
13.02.2004	42826
15.02.2004	1
16.02.2004	44113
17.02.2004	43057
18.02.2004	44634
19.02.2004	45265
20.02.2004	42542
22.02.2004	13
23.02.2004	38654
24.02.2004	34649
25.02.2004	40231
26.02.2004	42320
27.02.2004	44378
29.02.2004	3
01.03.2004	44118
02.03.2004	42187
03.03.2004	45370
04.03.2004	46146
05.03.2004	47693
07.03.2004	1
08.03.2004	45296
09.03.2004	44802
10.03.2004	46816
11.03.2004	47869
12.03.2004	46042
14.03.2004	18
15.03.2004	46351
16.03.2004	45874
17.03.2004	46630
18.03.2004	48759
19.03.2004	44868
21.03.2004	19
22.03.2004	45620
23.03.2004	45389
24.03.2004	50008
25.03.2004	47021
26.03.2004	45391
27.03.2004	1
29.03.2004	44962
30.03.2004	45344
31.03.2004	49290
01.04.2004	47909
02.04.2004	43281
05.04.2004	45669
06.04.2004	46166
07.04.2004	47692
08.04.2004	49675
12.04.2004	1
13.04.2004	47510
14.04.2004	43543
15.04.2004	46813
16.04.2004	44209
18.04.2004	1
19.04.2004	45555
20.04.2004	42997
21.04.2004	48881
22.04.2004	46999
23.04.2004	43109
25.04.2004	1
26.04.2004	43543
27.04.2004	44367
28.04.2004	47607
29.04.2004	29813
[29 Apr 2004 15:04] Sergei Golubchik
What are the strustures of tables involved ?

show create table import_tu_todo;
show create table bericht;
show index import_tu_todo;
show index bericht;
[29 Apr 2004 16:05] Thomas Mayer
Les voilĂ :

mysql> show create table import_tu_todo;
+----------------+----------------------------------------------------------------------------------------------------------------------+
| Table          | Create Table                                                                                                         |
+----------------+----------------------------------------------------------------------------------------------------------------------+
| import_tu_todo | CREATE TABLE `import_tu_todo` (
  `datum` date NOT NULL default '0000-00-00',
  PRIMARY KEY  (`datum`)
) TYPE=InnoDB |
+----------------+----------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql>
mysql> show create table bericht;
+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table   | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| bericht |CREATE TABLE `bericht` (
  `scannart` tinyint(3) unsigned NOT NULL default '0',
  `terminal` smallint(3) unsigned NOT NULL default '0',
  `tour` smallint(3) unsigned NOT NULL default '0',
  `gebiet` smallint(3) unsigned default NULL,
  `stopp_nr` smallint(3) unsigned default NULL,
  `paketnummer` bigint(11) unsigned NOT NULL default '0',
  `fc1` tinyint(3) unsigned default NULL,
  `fc2` tinyint(3) unsigned default NULL,
  `fc3` tinyint(3) unsigned default NULL,
  `fc4` tinyint(3) unsigned default NULL,
  `fc5` tinyint(3) unsigned default NULL,
  `paketart` tinyint(1) unsigned default NULL,
  `land` tinyint(2) unsigned default NULL,
  `plz` mediumint(5) unsigned default NULL,
  `route` smallint(3) unsigned default NULL,
  `gk_nr` smallint(4) unsigned default NULL,
  `gk_filiale` smallint(4) unsigned default NULL,
  `unbenutzt` char(20) default NULL,
  `stopp` char(40) default NULL,
  `kaufhausnummer` int(8) unsigned default NULL,
  `auftragsnummer` char(13) default NULL,
  `importnummer` char(7) default NULL,
  `datum` date NOT NULL default '0000-00-00',
  `zeit` time default NULL,
  `gewicht` mediumint(8) unsigned default NULL,
  `ber_id` int(10) unsigned NOT NULL default '0',
  `send_id` int(10) unsigned NOT NULL default '0',
  `kunden_nr` mediumint(6) unsigned default NULL,
  `importdatum` date default '0000-00-00',
  `eind` tinyint(3) unsigned default '0',
  `eind_tag` tinyint(3) unsigned default '0',
  PRIMARY KEY  (`ber_id`),
  KEY `idx_pkt` (`paketnummer`),
  KEY `idx_datum` (`datum`),
  KEY `idx_kd` (`kunden_nr`),
  KEY `idx_importdatum` (`importdatum`),
  KEY `idx_send_id` (`send_id`),
  KEY `idx_scannart` (`scannart`),
  CONSTRAINT `0_19` FOREIGN KEY (`send_id`) REFERENCES `sendung` (`send_id`) ON DELETE CASCADE
) TYPE=InnoDB |
+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.47 sec)

mysql>
mysql> show keys from import_tu_todo;
+----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table          | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| import_tu_todo |          0 | PRIMARY  |            1 | datum       | A         |          32 |     NULL | NULL   |      | BTREE      |         |
+----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
1 row in set (0.00 sec)

mysql>
mysql> show keys from  bericht;
+---------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table   | Non_unique | Key_name        | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+---------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| bericht |          0 | PRIMARY         |            1 | ber_id      | A         |    13776044 |     NULL | NULL   |      | BTREE      |         |
| bericht |          1 | idx_pkt         |            1 | paketnummer | A         |    13776044 |     NULL | NULL   |      | BTREE      |         |
| bericht |          1 | idx_datum       |            1 | datum       | A         |          18 |     NULL | NULL   |      | BTREE      |         |
| bericht |          1 | idx_kd          |            1 | kunden_nr   | A         |          18 |     NULL | NULL   | YES  | BTREE      |         |
| bericht |          1 | idx_importdatum |            1 | importdatum | A         |          18 |     NULL | NULL   | YES  | BTREE      |         |
| bericht |          1 | idx_send_id     |            1 | send_id     | A         |    13776044 |     NULL | NULL   |      | BTREE      |         |
| bericht |          1 | idx_scannart    |            1 | scannart    | A         |          18 |     NULL | NULL   |      | BTREE      |         |
+---------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
7 rows in set (0.45 sec)
[29 Apr 2004 16:08] Thomas Mayer
Les voilĂ :

mysql> show create table import_tu_todo;
+----------------+----------------------------------------------------------------------------------------------------------------------+
| Table          | Create Table                                                                                                         |
+----------------+----------------------------------------------------------------------------------------------------------------------+
| import_tu_todo | CREATE TABLE `import_tu_todo` (
  `datum` date NOT NULL default '0000-00-00',
  PRIMARY KEY  (`datum`)
) TYPE=InnoDB |
+----------------+----------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql>
mysql> show create table bericht;
+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table   | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| bericht |CREATE TABLE `bericht` (
  `scannart` tinyint(3) unsigned NOT NULL default '0',
  `terminal` smallint(3) unsigned NOT NULL default '0',
  `tour` smallint(3) unsigned NOT NULL default '0',
  `gebiet` smallint(3) unsigned default NULL,
  `stopp_nr` smallint(3) unsigned default NULL,
  `paketnummer` bigint(11) unsigned NOT NULL default '0',
  `fc1` tinyint(3) unsigned default NULL,
  `fc2` tinyint(3) unsigned default NULL,
  `fc3` tinyint(3) unsigned default NULL,
  `fc4` tinyint(3) unsigned default NULL,
  `fc5` tinyint(3) unsigned default NULL,
  `paketart` tinyint(1) unsigned default NULL,
  `land` tinyint(2) unsigned default NULL,
  `plz` mediumint(5) unsigned default NULL,
  `route` smallint(3) unsigned default NULL,
  `gk_nr` smallint(4) unsigned default NULL,
  `gk_filiale` smallint(4) unsigned default NULL,
  `unbenutzt` char(20) default NULL,
  `stopp` char(40) default NULL,
  `kaufhausnummer` int(8) unsigned default NULL,
  `auftragsnummer` char(13) default NULL,
  `importnummer` char(7) default NULL,
  `datum` date NOT NULL default '0000-00-00',
  `zeit` time default NULL,
  `gewicht` mediumint(8) unsigned default NULL,
  `ber_id` int(10) unsigned NOT NULL default '0',
  `send_id` int(10) unsigned NOT NULL default '0',
  `kunden_nr` mediumint(6) unsigned default NULL,
  `importdatum` date default '0000-00-00',
  `eind` tinyint(3) unsigned default '0',
  `eind_tag` tinyint(3) unsigned default '0',
  PRIMARY KEY  (`ber_id`),
  KEY `idx_pkt` (`paketnummer`),
  KEY `idx_datum` (`datum`),
  KEY `idx_kd` (`kunden_nr`),
  KEY `idx_importdatum` (`importdatum`),
  KEY `idx_send_id` (`send_id`),
  KEY `idx_scannart` (`scannart`),
  CONSTRAINT `0_19` FOREIGN KEY (`send_id`) REFERENCES `sendung` (`send_id`) ON DELETE CASCADE
) TYPE=InnoDB |
+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.47 sec)

mysql>
mysql> show keys from import_tu_todo;
+----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table          | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| import_tu_todo |          0 | PRIMARY  |            1 | datum       | A         |          32 |     NULL | NULL   |      | BTREE      |         |
+----------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
1 row in set (0.00 sec)

mysql>
mysql> show keys from  bericht;
+---------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table   | Non_unique | Key_name        | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+---------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| bericht |          0 | PRIMARY         |            1 | ber_id      | A         |    13776044 |     NULL | NULL   |      | BTREE      |         |
| bericht |          1 | idx_pkt         |            1 | paketnummer | A         |    13776044 |     NULL | NULL   |      | BTREE      |         |
| bericht |          1 | idx_datum       |            1 | datum       | A         |          18 |     NULL | NULL   |      | BTREE      |         |
| bericht |          1 | idx_kd          |            1 | kunden_nr   | A         |          18 |     NULL | NULL   | YES  | BTREE      |         |
| bericht |          1 | idx_importdatum |            1 | importdatum | A         |          18 |     NULL | NULL   | YES  | BTREE      |         |
| bericht |          1 | idx_send_id     |            1 | send_id     | A         |    13776044 |     NULL | NULL   |      | BTREE      |         |
| bericht |          1 | idx_scannart    |            1 | scannart    | A         |          18 |     NULL | NULL   |      | BTREE      |         |
+---------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
7 rows in set (0.45 sec)
[29 Apr 2004 20:41] Thomas Mayer
Here is another example which shows:

- estimated row counts are usually too high in my case (about +50%), but that is not that important

- The calculation of number of rows in joins is wrong:

In the stamement
mysql> explain
    -> select * from bericht t1
    -> inner join import_tu_todo t2 on t1.datum=t2.datum;
it seems as if MySQL calculates
Number of days(t2)*Number of days(t1)*Number of rows(t1)
in Numbers: 21*21*36444=16071804

In fact, it should be
Number of days(t2)*Number of rows(t1)
in Numbers 21*38883=816543

Or even totally exact:
21*28566=599886
of course we do not want it to work that exact and prefer performace... And MySQL can't know at optimization how many rows per date remain AFTER the join.

I believe it depends exponentially on the number of days(t1) whether mysql chooses a fast or very slow plan to resolve the query.

In my view this is a serious bug!

Of course the estimated numbers do not fit exactly to my words. The only thing that is shure is that the result is close to 21 times as high as it should be.

Thanks for your convenience. I should have written this report first of all.

Thomas

mysql> explain
    -> select * from bericht where datum="2004-04-01";
+---------+------+---------------+-----------+---------+-------+-------+-------------+
| table   | type | possible_keys | key       | key_len | ref   | rows  | Extra       |
+---------+------+---------------+-----------+---------+-------+-------+-------------+
| bericht | ref  | idx_datum     | idx_datum |       3 | const | 70150 | Using where |
+---------+------+---------------+-----------+---------+-------+-------+-------------+
1 row in set (0.00 sec)

mysql> select count(*) from bericht where datum="2004-04-01";
+----------+
| count(*) |
+----------+
|    47909 |
+----------+
1 row in set (0.00 sec)

mysql>
mysql> explain
    -> select * from bericht where datum="2004-04-05";
+---------+------+---------------+-----------+---------+-------+-------+-------------+
| table   | type | possible_keys | key       | key_len | ref   | rows  | Extra       |
+---------+------+---------------+-----------+---------+-------+-------+-------------+
| bericht | ref  | idx_datum     | idx_datum |       3 | const | 82440 | Using where |
+---------+------+---------------+-----------+---------+-------+-------+-------------+
1 row in set (0.00 sec)

mysql> select count(*) from bericht where datum="2004-04-05";
+----------+
| count(*) |
+----------+
|    45669 |
+----------+
1 row in set (0.00 sec)

mysql>
mysql> explain
    -> select * from bericht where datum between "2004-04-01" and "2004-04-21";
+---------+-------+---------------+-----------+---------+------+--------+-------------+
| table   | type  | possible_keys | key       | key_len | ref  | rows   | Extra       |
+---------+-------+---------------+-----------+---------+------+--------+-------------+
| bericht | range | idx_datum     | idx_datum |       3 | NULL | 812472 | Using where |
+---------+-------+---------------+-----------+---------+------+--------+-------------+
1 row in set (0.00 sec)

mysql> select count(*) from bericht where datum between "2004-04-01" and "2004-04-21";
+----------+
| count(*) |
+----------+
|   599902 |
+----------+
1 row in set (0.00 sec)

mysql>
mysql> drop table if exists import_tu_todo;
Query OK, 0 rows affected (0.00 sec)

mysql> create table import_tu_todo
    -> select distinct t1.datum
    -> from timeaxis t1
    -> inner join tu_monat t2 on t1.at_stat between t2.datum_von and t2.datum_bis
    -> and t1.at_stat<="2004-04-21";

alter table import_tu_todo
add primary key(datum);

explain
Query OK, 21 rows affected (0.02 sec)
Records: 21  Duplicates: 0  Warnings: 0

mysql>
mysql> alter table import_tu_todo
    -> add primary key(datum);
select * from import_tu_todo;
select count(*) from import_tu_todo;

explain
select * from bericht t1
        inner join import_tu_todo t2 on t1.datum=t2.datum;
select count(*) from bericht t1
        inner join import_tu_todo t2 on t1.datum=t2.datum;Query OK, 21 rows affected (0.12 sec)
Records: 21  Duplicates: 0  Warnings: 0

mysql>
mysql> explain
    -> select * from import_tu_todo;
+----------------+-------+---------------+---------+---------+------+------+-------------+
| table          | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----------------+-------+---------------+---------+---------+------+------+-------------+
| import_tu_todo | index | NULL          | PRIMARY |       3 | NULL |   21 | Using index |
+----------------+-------+---------------+---------+---------+------+------+-------------+
1 row in set (0.00 sec)

mysql> select count(*) from import_tu_todo;
+----------+
| count(*) |
+----------+
|       21 |
+----------+
1 row in set (0.00 sec)

mysql> explain
    -> select * from bericht where datum in
    -> ('2004-04-01', '2004-04-02', '2004-04-03', '2004-04-04', '2004-04-05', '2004-04-06', '2004-04-07', '2004-04-08', '2004-04-09', '2004-04-10', '2004-04-11', '2004-04-12', '2004-04-13', '2004-04-14', '2004-04-15', '2004-04-16', '2004-04-17', '2004-04-18', '2004-04-19', '2004-04-20', '2004-04-21');
+---------+-------+---------------+-----------+---------+------+--------+-------------+
| table   | type  | possible_keys | key       | key_len | ref  | rows   | Extra       |
+---------+-------+---------------+-----------+---------+------+--------+-------------+
| bericht | range | idx_datum     | idx_datum |       3 | NULL | 816552 | Using where |
+---------+-------+---------------+-----------+---------+------+--------+-------------+
1 row in set (0.00 sec)

mysql>
mysql> select count(*) from bericht where datum in
    -> ('2004-04-01', '2004-04-02', '2004-04-03', '2004-04-04', '2004-04-05', '2004-04-06', '2004-04-07', '2004-04-08', '2004-04-09', '2004-04-10', '2004-04-11', '2004-04-12', '2004-04-13', '2004-04-14', '2004-04-15', '2004-04-16', '2004-04-17', '2004-04-18', '2004-04-19', '2004-04-20', '2004-04-21');
+----------+
| count(*) |
+----------+
|   599902 |
+----------+
1 row in set (0.68 sec)

mysql> explain
    -> select * from bericht t1
    -> inner join import_tu_todo t2 on t1.datum=t2.datum;
+-------+-------+---------------+-----------+---------+----------+--------+-------------+
| table | type  | possible_keys | key       | key_len | ref      | rows   | Extra       |
+-------+-------+---------------+-----------+---------+----------+--------+-------------+
| t2    | index | PRIMARY       | PRIMARY   |       3 | NULL     |     21 | Using index |
| t1    | ref   | idx_datum     | idx_datum |       3 | t2.datum | 765335 |             |
+-------+-------+---------------+-----------+---------+----------+--------+-------------+
2 rows in set (0.00 sec)

mysql> select count(*) from bericht t1
    -> inner join import_tu_todo t2 on t1.datum=t2.datum;
+----------+
| count(*) |
+----------+
|   599902 |
+----------+
1 row in set (0.45 sec)
[29 Apr 2004 22:06] Sergei Golubchik
Danke.

Check the cardinality for idx_datum column - it's 18 only, not 45000.
It cannot be 45000 of course, as you have many dates with very small count(*)'s - so they shift the statistics down. Still MyISAM estimates a cardinality in your case as 450, while InnoDB - as 18. As "rows" estimation in explain is inverse proportional to key cardinality, switching to MyISAM should help. I don't know why InnoDB reports such a low cardinality, I'll let Heikki comment on this. Of course when count(*) values are either very small or very big with nothing in between, no "cardinality" value will be good enough, so this InnoDB behaviour is not a bug, I assume.
[30 Apr 2004 10:10] Thomas Mayer
Hello Sergei, hello Heikki,

I'm quite convinced of the concept of InnoDB. In my view optimization should also do the right thing with InnoDB as well (At least this is a wish...).

As the amount of data is quite huge I do not have a possibility any more to test it with MyISAM.

Do I consider it right then, that InnoDB table handler reports wrong numbers of records or the table handler interface of MySQL server only works correct with MyISAM table handler?

Could it be that InnoDB table handler does another calculation than MyISAM does? I do not mean exact row counts but the way of doing this calculation.

Regards,
Thomas
[30 Apr 2004 12:07] Thomas Mayer
I did the following:
delete from bericht where datum<"2003-01-01";

optimize table bericht;
analyze table bericht;

show keys from bericht;

Table,Non_unique,Key_name,Seq_in_index,Column_name,Collation,Cardinality,Sub_part,Packed,Null,Index_type,Comment
bericht,0,PRIMARY,1,ber_id,A,13822714,NULL,NULL,,BTREE,
bericht,1,idx_pkt,1,paketnummer,A,13822714,NULL,NULL,,BTREE,
bericht,1,idx_datum,1,datum,A,18,NULL,NULL,,BTREE,
bericht,1,idx_kd,1,kunden_nr,A,7092,NULL,NULL,YES,BTREE,
bericht,1,idx_importdatum,1,importdatum,A,18,NULL,NULL,YES,BTREE,
bericht,1,idx_send_id,1,send_id,A,13822714,NULL,NULL,,BTREE,

As you can see, cardinality 18 remains at index idx_datum, after I deleted most of the small COUNTs per datum.

=========

What I don't understand is why these small counts per idx_datum shall lead to small cardinalities:

Documentation says: Cardinality: The number of unique values in the index. This is updated by running ANALYZE TABLE or myisamchk -a. Cardinality is counted based on statistics stored as integers, so it's not necessarily accurate for small tables.  

This would mean: the higher this cardinality is, the more is the index considered as candidate.

But this would mean that the more different values idx_datum contains, the more should it be used. And this would mean, the more "waste" (that I deleted above) remains inside the table, the higher is the cardinality and the more is the index being used.

So it should not be the small counts per idx_datum that lead to such small cardinalities, but the contrary.
[25 Jun 2004 13:32] Michael Widenius
Sorry, but the bug system is not the appropriate forum for asking
support questions. Your problem is not the result of a bug.
For a list of more appropriate places to ask for help using MySQL
products, please visit http://www.mysql.com/support/

Thank you for your interest in MySQL.

Additional info:

I just added this to the InnoDB section of the manual:

----
ANALYZE TABLE counts cardinality by doing a 10 random
dives to each of the index trees and updates index cardinality
estimates according.  Note that since it is only an estimate,
repeated runs of ANALYZE TABLE may produce different numbers.
This makes ANALYZE TABLE fast on InnoDB tables but not 100 %
accurate as it doesn't take all rows into account.

MySQL uses index cardinality estimates only in join optimization. If
some join is not optimized in the right way, you may try using
ANALYZE TABLE. In the few cases the ANALYZE TABLE doesn't
produce values good enough for your particular tables you can use
FORCE INDEX with your queries to force the usage of a particular
index or set max_seeks_for_key to ensure that MySQL prefer index
lookups over table scans.
-----

Some comments about your queries:

When you do:

SELECT ... where key_column IN (constant list) 

or

SELECT ... where key_column=constant

MySQL will not use cardinality when deciding which index to
use. Instead it will do a quick estimate in the index tree to found
out approximately how many rows the used range will cover.

For most table handlers the approximation is usually within 10-30 %
of the correct number of rows, which is good enough for choosing which
index to use (or if we should use table scan).

If the estimate is far off, then you can try to do an OPTIMIZE TABLE.
If this doesn't help then you can either switch to another table type
or use FORCE INDEX / USE INDEX to tell MySQL which index it should use
for the query.

About cardinality:

The higher the cardinality for one index, the better chance that index
will be used when joining tables. (Exactly as you stated in your comment).

If your index consists of a small set of key values that covers most
of the index tree, InnoDB may not take these into account when
counting the cardinality and you get a small cardinality even if there
is a lot of index values.  In this it's however good that the
cardinality is low as the index is probably not very usable for joins.

The above explains why you in your case don't see a big change in cardinality
even if you deleted some of the small counts(*).

Hope the above explains what is happening and gives you some ideas of
how to avoid your problem.

Last: Note that this issue is not a relatable critical bug, which is
what this public bug tracking is for.

If this issue is critical for you, you should consider paying for a
MySQL support contract and let our support team help you solve this
(In the worst case by adding more rules to the MySQL optimizer).

Regards,
Monty