Bug #82157 | error 20008 'Query aborted due to out of query memory' from NDBCLUSTER): HY000 | ||
---|---|---|---|
Submitted: | 8 Jul 2016 2:20 | Modified: | 26 Aug 2016 10:36 |
Reporter: | alain cocconi | Email Updates: | |
Status: | No Feedback | Impact on me: | |
Category: | MySQL Cluster: Cluster (NDB) storage engine | Severity: | S1 (Critical) |
Version: | mysql-5.6.29 ndb-7.4.11 | OS: | Ubuntu (14.04.2 LTS) |
Assigned to: | MySQL Verification Team | CPU Architecture: | Any |
[8 Jul 2016 2:20]
alain cocconi
[22 Jul 2016 13:28]
MySQL Verification Team
Hi,

While we are looking at the provided log files, can you let us know whether you have any monitoring on your nodes? Do you have MEM installed, or do you at least monitor CPU, RAM, and I/O usage on your data and management nodes?

The state you end up in should be cleared by a cluster restart, but I know that is not an acceptable solution for a production cluster. What I'd like to know is whether you tried it, and whether you did a rolling restart or a full shutdown/start. If you did a rolling restart, how did the clients behave? It is possible that only one node is in trouble, so if you did a rolling restart a few times, was everything up and running OK after only one node had been restarted?

I would also need you to provide me with some results from the ndbinfo database:

select * from ndbinfo.counters;
select * from ndbinfo.diskpagebuffer;
select * from ndbinfo.memory_per_fragment;
select * from ndbinfo.memoryusage;
select * from ndbinfo.operations_per_fragment;
select * from ndbinfo.resources;

I need this info right after a restart and then again after ~20 hours (so just before you expect it to start throwing those errors).

Also, if you don't have at least some SAR data, can you extract some CPU and RAM usage data from all nodes after the restart and again when you start experiencing the issue?

Kind regards,
Bogdan Kecman
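One way to capture the requested ndbinfo output at both points in time is a small script that runs the six queries through the `mysql` command-line client and saves each result to a file. This is only a sketch, not part of the original request: the function names, the output directory, and the labels are hypothetical, and it assumes the `mysql` client is on the PATH and can reach an SQL node without further prompting.

```python
import subprocess
from pathlib import Path

# The six ndbinfo tables named in the request above.
NDBINFO_TABLES = [
    "counters",
    "diskpagebuffer",
    "memory_per_fragment",
    "memoryusage",
    "operations_per_fragment",
    "resources",
]


def build_queries(tables=NDBINFO_TABLES):
    """Return one SELECT statement per ndbinfo table."""
    return [f"SELECT * FROM ndbinfo.{name};" for name in tables]


def snapshot(label, out_dir="ndbinfo_snapshots", mysql_args=()):
    """Run every query via the mysql client and save tab-separated output.

    label      -- hypothetical tag, e.g. "after_restart" or "before_errors"
    out_dir    -- hypothetical directory name; one sub-directory per label
    mysql_args -- extra client options (host, user, ...), supplied by the caller
    """
    target = Path(out_dir) / label
    target.mkdir(parents=True, exist_ok=True)
    for name, query in zip(NDBINFO_TABLES, build_queries()):
        # --batch prints rows tab-separated; -e executes a single statement.
        result = subprocess.run(
            ["mysql", *mysql_args, "--batch", "-e", query],
            capture_output=True,
            text=True,
            check=True,
        )
        (target / f"{name}.tsv").write_text(result.stdout)
    return target
```

Calling `snapshot("after_restart")` right after the cluster comes back, and `snapshot("before_errors")` roughly 20 hours later, would leave two directories that can be compared or attached to the report.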
[26 Jul 2016 0:50]
alain cocconi
Hi, Yes, I'm monitoring CPU, memory, network, I/O, etc. on both servers, but nothing unusual is happening when I get those errors. Before rebooting the servers, I tried stop, start, and restart with the manager, but without success: the error still came back after 1 or 2 hours of running. To return to a stable state I stopped using one database in the cluster, and now all is OK. So I'm investigating what was going wrong with that database and I will get back to you. Thanks
[26 Jul 2016 10:36]
MySQL Verification Team
Hi, let us know when you finish your investigation, and please get us the data I requested from the ndbinfo database. Take care, Bogdan
[27 Aug 2016 1:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".