Bug #31822 Redefinition of C++ bool prevents use of external C++ libraries in plug-ins
Submitted: 24 Oct 2007 18:44 Modified: 27 Oct 2007 0:01
Reporter: Bill Mitchell Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Compiling Severity:S2 (Serious)
Version:5.1 OS:Windows
Assigned to: CPU Architecture:Any

[24 Oct 2007 18:44] Bill Mitchell
Description:
Config-win.h redefines bool to the Microsoft C header type BOOL.  This removes access to the C++ bool native type.  The impact is that plug-in authors writing in C++ code using the bool type do not get what they expect.  In particular, when the plug-in code needs to access code in an external C++ library, e.g., the Xerces library for manipulating XML documents, the code in the plug-in thinks the type is the redefined BOOL whereas the pre-compiled object libary thinks the type is the native bool.  At best, this causes link errors when the type conflict arises in a method interface and the method name decorated by its parameters does not match.  At worst, this causes difficult to analyze execution failures when the plug-in code treats the object members differently than does the code in the library.  

I have not yet determined if the same problem arises on the non-Windows platforms from the typedef of bool as char in my_global.h.  

In terms of the impact, I regard this problem more as critical than severe.  Although some might suggest this sounds like a feature request, the plug-in feature already exists, and MySQL 5.1 includes changes to make it more accessible.  But plug-ins that cannot access publicly available C++ libraries have only limited utility.

The workarounds appear to be (1) don't use publicly available C++ libraries but rather re-implement everything from scratch to avoid the use of bool, (2) overhaul the C++ libraries to replace all use of the native C++ type bool with something else, or (3) find some way to wrapper everything in the C++ library with new classes that hide the use of bool from their interface.  Workarounds (2) and (3) may be just about as extensive as fixing the MySQL source to leave the native bool type available.    

How to repeat:
Using the 5.1.22-rc build and Visual Studio 2005, try invoking the Xerces initialize() and terminate() methods in the example storage engine.  In particular, I added the following code fragment to the ha_example::create() method:

#include <xercesc/util/PlatformUtils.hpp>

XERCES_CPP_NAMESPACE_USE 

...

int ha_example::create(const char *name, TABLE *table_arg,
                       HA_CREATE_INFO *create_info)
{
...
  try {
    XMLPlatformUtils::Initialize();
  }
  catch (const XMLException& toCatch) {
	  DBUG_PRINT("error:",("XMLException, type='%s', message='%s', source file='%s', line='%d'",
		toCatch.getType(), toCatch.getMessage(), toCatch.getSrcFile(), toCatch.getSrcLine()));
    // Do your failure processing here
    DBUG_RETURN(-1);
  }

  // Do your actual work with Xerces-C++ here.

  XMLPlatformUtils::Terminate();
...
}
Building this required adding the xercesc include directory under the MySQL include directory, copying the xerces -c_2.dll and xerces-c_2.lib files to the storage\example directory, and modifying the storage\example\CMakeLists.txt file as follows:
...
IF(NOT SOURCE_SUBLIBS)
  ADD_LIBRARY(example ${EXAMPLE_SOURCES})
  TARGET_LINK_LIBRARIES(example xerces-c_2)
ENDIF(NOT SOURCE_SUBLIBS)

Although the example storage engine builds just fine, the mysqld link fails with the following message:
Linking...
example.lib(ha_example.obj) : error LNK2019: unresolved external symbol "__declspec(dllimport) public: static void __cdecl xercesc_2_8::XMLPlatformUtils::Initialize(char const * const,char const * const,class xercesc_2_8::PanicHandler * const,class xercesc_2_8::MemoryManager * const,int)" (__imp_?Initialize@XMLPlatformUtils@xercesc_2_8@@SAXQBD0QAVPanicHandler@2@QAVMemoryManager@2@H@Z) referenced in function "public: virtual int __thiscall ha_example::create(char const *,struct st_table *,struct st_ha_create_information *)" (?create@ha_example@@UAEHPBDPAUst_table@@PAUst_ha_create_information@@@Z)
Debug\mysqld.exe : fatal error LNK1120: 1 unresolved externals

With a little bit of research, one determines that the problem is that ha_example.cc has passed as it's fifth parameter an int, whereas the xerces-c_2.dll defines the XMLPlatformUtils::Initialize method as expecting the fifth parameter to be a bool.  Thus, the external symbol appears unresolved because the decorated names don't match.  

Suggested fix:
At first glance, it would appear straightforward albeit time-consuming to remove the #define bool BOOl in config.win.h and replace it with:
#if !defined(HAVE_BOOL)
#undef HAVE_BOOL
#define HAVE_BOOL
#endif /* !defined(HAVE_BOOL) */
This will cause my_global.h to leave the native bool intact when compiling C++ code.  
Leaving #define bool_defined in config.win will cause my_global.h to leave bool undefined when compiling C code.  

Of course, the tricky part comes in resolving all the related source issues.  Looking at my_global.h, it appears it should be possible and reasonable to use my_bool instead of bool throughout the MySQL code.  The MS defined BOOL might be needed in routines specific to the Windows interface.  In starting down this path, I can make the following observations:
(1) There are existing places in the MySQL code that mix my_bool and bool type declarations with interesting and perhaps incorrect results at execution time but with no compilation warnings as they were both interpreted as integer values.  
(2) bool is a perfectly fine type in the C++ code, where it means what it says.  But mixing bool and my_bool leads to numerous type conversion situations.  It should be straightforward to choose one or the other for each situation and make it consistent.  
(3) my_bool is defined as a char, where BOOL is defined in WinDef.h as int.  There are places in the MySQL code where an arithmetic AND operator is applied against a mask, yielding a result that is then stored.  Placing this in a char instead of an int may cause truncation and loss of the flags from the AND.  
(4) bool avoids problem (3) by causing C++ to always generate a logical result.  The MS C++ compiler diagnoses these with a performance warning as it is generating a test to calculate an exact 1 or 0 value.  Some of these might truly be in performance sensitive code sections that deserve further thought to develop a better resolution, perhaps using int instead of bool as appropriate.  
(5) The performance warning in (4) could be eliminated by using the existing MySQL test() macro to make the calculation of a 1 or 0 value explicit in the code.
[24 Oct 2007 19:59] MySQL Verification Team
Thank you for the bug report. Duplicate of bug: http://bugs.mysql.com/bug.php?id=26461.
[26 Oct 2007 23:57] Bill Mitchell
Yes, Miguel, I did recognize that bug #26461 is one of several that discusses issues surrounding the definition of bool in config-win.h.  So I see why you might regard this bug #31822 as a duplicate.  But I think I can draw a distinction.  Bug #26461 discusses problems that arise when a source file includes one but not both of the header files config-win.h and my_global.h.  In principal, one could resolve that bug report by ensuring that the two header files give equivalent redefinitions, either 1 byte or 4 byte for consistency.  Bug #31822 discusses a different issue, the complete inability to use publicly available C++ libraries from a C++ plugin.  And the only resolution on the mysql side would be the elimination of redefinitions of bool in both header files, leaving the native C++ bool intact.  

Looking at the two reports from a practical level, a source resolution to #31822 must necessarily fix #26461 as well.  But there exist source changes that resolve #26461, but leave #31822 as an outstanding issue.
[27 Oct 2007 0:01] Bill Mitchell
To determine the size of the issue on the mysql side, and to verify that the problem can indeed be resolved, I have developed a set of source changes that attempt to make consistent use for each set of related variables/methods of either bool or my_bool.  Where consistency is too hard to achieve, most of the compilation warnings are remedied through appropriate use of the test() method.  Beyond the constraints I sketched in the initial bug report, the most significant source constraint appears where some functions expect the address of an item, e.g., a char* that points to a my_bool.  Some of these functions are in C modules, and not C++ modules, where the native C++ bool should not be passed.  

I have uploaded a .zip file, bug-fix-31822.zip, with the set of changes I have made so far.  These yield an error-free compilation with a manageable set of warning messages.  A few of these warnings relate to real source errors that should be looked at.  The largest clump of remaining warnings are in the yacc parser, the has numerous conversions from int/my_bool to bool.  Not yet being really familiar with yacc, I thought I would leave these for someone else to consider, although they could all be made to go away very quickly with a few more test() calls.  

For each of the modified files, I have included a .rev file that describes the changes in each file.  For header and source files with the same name, I shared all the revision comments in one file with the same base name.  

Of course, not yet being as expert as I might like on mysql, the implications for not-Windows platforms, and the implications for the data across the client interfaces, there is a chance that these source changes affect a data type that should be kept fixed across multiple versions of the product.  I hope that is not the case.  Also, I have not yet built a Linux version where I could run the provided test suite.  So all I know is that it compiles relatively cleanly and seems to work on very preliminary simple tests.
[27 Oct 2007 1:16] MySQL Verification Team
Thank you for the feedback and additional comments. Besides the link as
duplicate bug I added your 2 last comments in the http://bugs.mysql.com/bug.php?id=26461 also you can add comments to that
bug. Thank you for the contribution.