Description:
I'm hesitant to report this bug, because I don't have a good way to reproduce it.
My slave servers running 5.5.5 cannot keep up with my master, also running 5.5.5. The master runs on RAID 10 with the binlog enabled and innodb_flush_log_at_trx_commit=1. The slaves run on RAID 0 (4 spindles) with innodb_flush_log_at_trx_commit=2.
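To be explicit about the settings involved, these are what the two sides report (the binlog/RBR settings apply to the master):

-- on the master
SHOW VARIABLES LIKE 'log_bin';                         -- ON
SHOW VARIABLES LIKE 'binlog_format';                   -- ROW
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';  -- 1

-- on the slave
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';  -- 2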
With plenty of spare IO capacity, the slave falls further and further behind. Since I have RBR, it took a while to find the offending queries - using information_schema.innodb_trx, I can see that these _very_ simple update and insert queries are stuck in a "fetching rows" state for 100+ seconds apiece - there are at least 3-6 of these per second coming through from the master, where they all complete in < 100ms.
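This is roughly how I've been spotting them in information_schema.innodb_trx (the 60-second cutoff is arbitrary, just to filter out normal traffic):

SELECT trx_id,
       trx_state,
       trx_started,
       TIMESTAMPDIFF(SECOND, trx_started, NOW()) AS age_seconds,
       trx_mysql_thread_id,
       trx_query
FROM   information_schema.innodb_trx
WHERE  TIMESTAMPDIFF(SECOND, trx_started, NOW()) > 60
ORDER  BY trx_started;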
Once I verified there were plenty of IOPS left on my spindles, and that it wasn't a hardware problem, I confirmed the problem is in replication itself by setting up and running Tungsten Replicator instead of native MySQL replication - I was 130,000 seconds behind, and within an hour it caught up.
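The lag figure above comes from the usual place, Seconds_Behind_Master on the slave:

-- run on the slave; Seconds_Behind_Master is the value that had climbed to ~130,000
SHOW SLAVE STATUS\G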
There is definitely something wrong with replication in at least this version. It may have something to do with my schema or the order of my updates, and I'd be glad to offer any information; I just can't tell you how to reproduce it yet.
How to repeat:
Unknown at this time. The current queries that cause this problem are an insert into a table with ~20M rows, which triggers an update of a table with ~13M rows, which in turn triggers an insert into a table with ~17M rows.
These queries, when executed on the master, take less than 100ms combined. On the slave, they take 120 seconds.
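To give a concrete picture of the shape of the chain (table and column names below are made up; only the structure mirrors the description above):

-- Hypothetical sketch only; the real tables have more columns and indexes.
CREATE TABLE big_events  (id BIGINT AUTO_INCREMENT PRIMARY KEY, item_id BIGINT, val INT) ENGINE=InnoDB;   -- ~20M rows
CREATE TABLE summaries   (item_id BIGINT PRIMARY KEY, total INT) ENGINE=InnoDB;                           -- ~13M rows
CREATE TABLE history_log (id BIGINT AUTO_INCREMENT PRIMARY KEY, item_id BIGINT, total INT) ENGINE=InnoDB; -- ~17M rows

DELIMITER //
CREATE TRIGGER big_events_ai AFTER INSERT ON big_events FOR EACH ROW
BEGIN
  -- the insert into the ~20M-row table triggers an update of the ~13M-row table
  UPDATE summaries SET total = total + NEW.val WHERE item_id = NEW.item_id;
END//
CREATE TRIGGER summaries_au AFTER UPDATE ON summaries FOR EACH ROW
BEGIN
  -- ...which in turn triggers an insert into the ~17M-row table
  INSERT INTO history_log (item_id, total) VALUES (NEW.item_id, NEW.total);
END//
DELIMITER ;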