Bug #94307 DD tables and index se_private_data is no longer private
Submitted: 13 Feb 2019 12:22 Modified: 21 Feb 2019 15:50
Reporter: Satya B Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Data Dictionary Severity:S3 (Non-critical)
Version:8.0.14 OS:Any
Assigned to: CPU Architecture:Any

[13 Feb 2019 12:22] Satya B
Description:
One of the good things about the new Data Dictionary is that engines can add their own metadata in the DD tables. This is se_privata_data.

With this commit

commit f2adb0437e6c38a9b35ddf869ab2f067d1af57aa
Author: Sivert Sorumgard <sivert.sorumgaard@oracle.com>
Date:   Mon Mar 12 13:56:12 2018 +0100

    Bug#27309072: Need a stable api for handling dd::properties.
    

the se_private_data is no longer "private". Until now server doesn't parse or has any knowledge about se_private_data. It knows that they are of key-value pairs but doesn't try to parse them.

For example I_S views don't show se_private_data.

With the above commit, all valid se_private_data keys have to "hardcoded" in server.

See sql/dd/impl/types/index_impl.cc
static const std::set<String_type> default_valid_se_private_data_keys = {
   // InnoDB keys:
   "id", "root", "space_id", "table_id", "trx_id"};

This breaks the design contract followed until now.

SE keys shouldn't be hardcoded in server.

How to repeat:
See above commit, code reading

Suggested fix:
Let engines specify what are valid DD attributes in se_private_data and they should do the validation.

In fact, the validation should be version based. But I don't think we are storing versions in se_private_data.
[13 Feb 2019 12:24] Satya B
correction
[13 Feb 2019 20:50] Sivert Sørumgård
1. The SQL layer does indeed already parse the se_private_data in the sense that it breaks it up into key/value pairs that are stored in appropriate std::map structures. This is done e.g. when populating a data dictionary object upon handling a miss in the DD cache, and when de-serializing an SDI object upon import of a MyISAM table.

2. The se_private_data keys do not necessarily need to be hard coded at the SQL layer. A dd::Properties object may very well be instantiated without submitting a std::set listing the valid keys, in which case you will not get any validation of the key names that are used (i.e., the same behavior as we had prior to the commit you pointed to).

3. Sets of valid key names could be collected dynamically from the storage engines when needed, but this might easily introduce performance issues, as well as complications related to which engines happen to be loaded at a given time.
[14 Feb 2019 7:56] Satya Bodapati
Sivert,

Thanks for the quick reply.

On 1), Right they are parsed but no real sense/meaning is inferred by server. Essentially server didn't care.

On 2), do you mean it is achievable even after your commit? validation can be avoided for SE keys by server? Or you plan to fix that now?

On 3) Agree but I didn't mean that. Server shouldnt do validation of SE private data.I meant that each SE should do its own validation of se_private_data.
[20 Feb 2019 15:06] MySQL Verification Team
Hi,

Thank you for your report.

I truly do not see where is the bug.  Yes, all the se_private_data are hard-coded into the server, but that is alright. We know exactly what storage engines we support and do not see a need to do it any other way.

Regarding your last comment, these are data for the storage engines, so why would SQL layer interfere ????

Regarding your second item, it should be answered by someone from Development, but I personally do not see a need for any changes there. I could be wrong, though .....

We agree that SE should do its own validation of those data.
[21 Feb 2019 4:37] Satya Bodapati
Sinisa,

the core issue is InnoDB se_private_data keys are hardcoded in server to do validation.

See sql/dd/impl/types/index_impl.cc ---------------------> Server file

static const std::set<String_type> default_valid_se_private_data_keys = {
   // InnoDB keys:
   "id", "root", "space_id", "table_id", "trx_id"}; ----> InnoDB keys

Writing InnoDB keys in non-InnoDB code (server code) should be avoided.
[21 Feb 2019 14:02] MySQL Verification Team
Hi Satya,

I completely understand what you are writing about, but I do not see how is this a bug ???

Does this lead to server crashing, malfunctioning in any way, decreased performance or else ???

Also, I was in discussion with your ex-colleagues and, so far, they do not see a problem there too.

BTW, I hope that you are well !!!!
[21 Feb 2019 14:14] Satya Bodapati
Sinisa,

You need not categorize this as a bug. I don't know what category a 'design issue/concern' falls under.

The concern is engines are forced to 'hardcode' the se_private_data keys in server code. While this doesn't cause harm functionally, it is bad from design perspective.

<snip>
Also, I was in discussion with your ex-colleagues and, so far, they do not see a problem there too.
</snip>

I see :)

<snip>
BTW, I hope that you are well !!!!
</snip>

Yes am good and hope you are well too :)

My proposal would be to:
1. keep the InnoDB se_private_data keys inside InnoDB codebase (ie. within storage/innobase)

2. Let InnoDB call the validation function when it retrieves se_private_data
[21 Feb 2019 15:50] MySQL Verification Team
Hi Satya,

I have sent a request to Development that your observations find their way into a new WorkLog.

When it is done, I will inform you of its number on this page.

Have a nice time and nice life !!!!
[25 Feb 2019 9:27] Laurynas Biveinis
Isn't closing this bug premature then?
[25 Feb 2019 13:44] MySQL Verification Team
Hi,

Bug database is not a forum where WorkLog creation and implementation are to be followed.

Also, decision is made that this idea, as well as WorkLog entry itself, will not be so soon. At this moment, nobody knows when, since entire discussion is not yet started, but will involve teams working on data dictionary and all storage engines. 

But, the most important datum is this forum is not intended for WorkLog discussions and reporting.
[15 Mar 2019 8:33] Sivert Sørumgård
A source code change is needed to disable the SQL layer validation for the SE private data keys. We have concluded we can do this in a bugfix. Since this bug report is closed, we have created Bug#94667 to track this.