Bug #111088 src tarball made from github repo and provided in src.rpm files is not the same
Submitted: 19 May 2023 13:08
Modified: 2 Jun 2023 16:25
Reporter: Simon Mudd (OCA)
Status: Closed
Category: MySQL Server: Packaging
Severity: S3 (Non-critical)
Version: 8.0.32
OS: Any
Assigned to:
CPU Architecture: Any
Tags: 8.0.32, build, rpm

[19 May 2023 13:08] Simon Mudd
Description:
I notice that if I generate a tarball from the github.com/mysql/mysql-server tree for, say, 8.0.32, I get one set of contents.

I downloaded the RHEL8 mysql-community-8.0.32-1.el8.src.rpm file and "installed" it, which provides the files required to rebuild the binaries. I then extracted the provided mysql-8.0.32.tar.gz file.

When I compared the contents of the two trees, they were not the same.
I've seen this in older versions of MySQL too.

Given the source is different this is somewhat confusing. It would be better to clean this up.

How to repeat:
Generate a tarball for 8.0.32 from a checkout of the GitHub repo:

$ git archive --format=tar.gz -o ../mysql-8.0.32-github.tar.gz --prefix=mysql-8.0.32-github/ mysql-8.0.32

Download the src.rpm file for 8.0.32 (I did this for RHEL8), mysql-community-8.0.32-1.el8.src.rpm

Install it so that the tarball lands in ~/rpmbuild/SOURCES:

$ rpm -ivh mysql-community-8.0.32-1.el8.src.rpm

Extract the two tarballs into directories with slightly different names (mysql-8.0.32-github and mysql-8.0.32-srcrpm in the diff output further down):

$ cd ~/rpmbuild/SOURCES
$ tar xzf mysql-8.0.32.tar.gz
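
Only one extraction command is shown above, so the following is a minimal sketch of how both tarballs could be unpacked into the two directory names used in the diff. It assumes GNU tar (or bsdtar), that each archive has a single top-level directory, and that both tarballs have been copied into the current directory. The helper name `extract_as` is hypothetical and is not part of the original report.

```shell
#!/bin/sh
# Hypothetical helper: unpack a gzipped tarball into a directory of our
# choosing, dropping the archive's own top-level directory.
extract_as() {
    tarball=$1
    dest=$2
    mkdir -p "$dest" &&
    tar -xzf "$tarball" -C "$dest" --strip-components=1
}

# Usage with the file names from this report (assumed to be in the
# current directory):
#   extract_as mysql-8.0.32-github.tar.gz mysql-8.0.32-github
#   extract_as mysql-8.0.32.tar.gz        mysql-8.0.32-srcrpm
#   diff -uNr mysql-8.0.32-github mysql-8.0.32-srcrpm
```

Stripping the top-level directory means the destination name does not depend on whatever prefix each archive happens to use.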

The diff -uNr output shows that the tarball from the src.rpm contains an additional generated file, mysql-8.0.32-srcrpm/sql/sql_hints.yy.cc, and a Docs/INFO_SRC file (which doesn't really matter so much):

diff -uNr mysql-8.0.32-github/Docs/INFO_SRC mysql-8.0.32-srcrpm/Docs/INFO_SRC
--- mysql-8.0.32-github/Docs/INFO_SRC   1970-01-01 01:00:00.000000000 +0100
+++ mysql-8.0.32-srcrpm/Docs/INFO_SRC   2022-12-16 16:56:12.000000000 +0100
@@ -0,0 +1,7 @@
+commit: 683372e870fea4430bafe8bf06ef6f06a234e539
+date: 2022-12-16 13:56:25 +0100
+build-date: 2022-12-16 15:36:03 +0000
+short: 683372e870f
+branch: mysql-8.0.32-release
+
+MySQL source 8.0.32
diff -uNr mysql-8.0.32-github/sql/sql_hints.yy.cc mysql-8.0.32-srcrpm/sql/sql_hints.yy.cc
--- mysql-8.0.32-github/sql/sql_hints.yy.cc     1970-01-01 01:00:00.000000000 +0100
+++ mysql-8.0.32-srcrpm/sql/sql_hints.yy.cc     2022-12-16 16:57:58.000000000 +0100
@@ -0,0 +1,2503 @@
+/* A Bison parser, made by GNU Bison 3.0.4.  */
+
+/* Bison implementation for Yacc-like parsers in C
+
+   Copyright (C) 1984, 1989-1990, 2000-2015 Free Software Foundation, Inc.
...

[sjmudd@mad17 SOURCES]$ wc -l mysql-8.0.32-srcrpm/sql/sql_hints.yy.cc
2503 mysql-8.0.32-srcrpm/sql/sql_hints.yy.cc
[sjmudd@mad17 SOURCES]$

So this is a 2503-line file that appears to be unneeded: I presume the standard build procedure within rpm regenerates it.
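
For a quick overview without reading the full diff, a sketch like the one below lists the paths that exist only in the second tree. The function name `only_in_second` is hypothetical, and the approach (comparing sorted file lists with `comm`) is just one possible way to do this, not the method used in the report.

```shell
#!/bin/sh
# Hypothetical helper: print relative paths of non-directory entries
# that exist in the second tree but not in the first.
only_in_second() {
    list_a=$(mktemp)
    list_b=$(mktemp)
    # Build sorted lists of relative paths for each tree.
    (cd "$1" && find . ! -type d | LC_ALL=C sort) > "$list_a"
    (cd "$2" && find . ! -type d | LC_ALL=C sort) > "$list_b"
    # comm -13 suppresses lines unique to list 1 and lines common to both,
    # leaving only lines unique to list 2.
    LC_ALL=C comm -13 "$list_a" "$list_b"
    rm -f "$list_a" "$list_b"
}

# Usage with the directory names from this report:
#   only_in_second mysql-8.0.32-github mysql-8.0.32-srcrpm
```

Going by the diff above, this would be expected to report Docs/INFO_SRC and sql/sql_hints.yy.cc for 8.0.32. It only detects added files, not files whose contents differ.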

Suggested fix:
Remove the unneeded file from the tarball shipped in the .src.rpm. This tarball is what rpm uses to build the binaries. The sql_hints.yy.cc file is not needed, as the standard rpm build process should recreate it.
[22 May 2023 8:30] MySQL Verification Team
Hello Simon,

Thank you for the report and feedback.

regards,
Umesh
[1 Jun 2023 17:12] Balasubramanian Kandasamy
Thanks for the bug report.

We ship and use one source tarball across platforms to be consistent. Due to limitations on some platforms, the source tarball includes some generated files which are not strictly required on all platforms.
[2 Jun 2023 16:21] Simon Mudd
OK, can you share the procedure for generating these from the git source so that I can reproduce it, or point me at the documentation or code for doing this if it already exists?

Thanks.
[2 Jun 2023 16:25] Simon Mudd
rpm works on the basis of repeatable builds from the source.

The source in git is different, so it would be convenient to know how you build the tarball. That way I can reproduce the build process Oracle uses for building the binaries on the platforms I'm interested in.

rpm assumes source comes in tarballs, which was true in the 90s but is no longer typical, and the idea of the src.rpm is that you have a clean, well-defined source from which to build your code. Given that the tarball I saw was not the same as the source from the git tree, this is somewhat confusing. It also makes it harder to compare different versions based on sources in "source tarballs" or "git trees": I'd expect them to be the same, but as you point out this is not actually the case.