Re: llvm & RTTI over shared libraries

From: <jbo_at_insane.engineer>
Date: Mon, 25 Apr 2022 13:01:48 UTC

Hello guys,

Thank you for your replies.

I've created a minimal test case that reproduces the problem (attached).
The key points here are:
  - CMake-based project consisting of:
    - The header-only interface for the plugin and the types (test-interface).
    - The main executable that loads the plugin (test-core).
    - A plugin implementation (plugin-one).
  - Compiles out-of-the-box on FreeBSD 13/stable with both lang/gcc11 and devel/llvm14.
  - It uses the exact mechanism I use to load the plugins in my actual application.

stdout output when compiling with lang/gcc11:

  t is type_int
  t is type_string
  done.


stdout output when compiling with devel/llvm14:

  could not cast t
  could not cast t
  done.


Unfortunately, I have not yet figured out which compiler/linker flags llvm requires to get the same behavior as GCC. I understand that eventually I'd be better off rewriting the necessary parts to eliminate the problem, but that is not a quick job.

Could somebody lend me a hand in figuring out which compiler/linker flags are necessary to get this to work with llvm?


Best regards,
~ Joel



------- Original Message -------
On Saturday, April 23rd, 2022 at 22:42, Mark Millard <marklmi@yahoo.com> wrote:


>
>
>
> On 2022-Apr-23, at 15:33, Mark Millard marklmi@yahoo.com wrote:
>
> > Joerg Sonnenberger <joerg_at_bec.de> wrote on
> > Sat, 23 Apr 2022 21:33:04 UTC :
> >
> > > Am Tue, Apr 19, 2022 at 11:03:33PM -0700 schrieb Mark Millard:
> > >
> > > > Joerg Sonnenberger <joerg_at_bec.de> wrote on
> > > > Tue, 19 Apr 2022 21:49:44 UTC :
> > > >
> > > > > Am Thu, Apr 14, 2022 at 04:36:24PM +0000 schrieb jbo@insane.engineer:
> > > > >
> > > > > > > After some research I seem to understand that the way that RTTI is handled over shared library boundaries is different between GCC and LLVM.
> > > > >
> > > > > > I think you are running into the old problem that GCC thinks comparing
> > > > > > types by name makes sense, whereas everyone else compares types by type
> > > > > > pointer identity.
> > > >
> > > > Seems out of date for the GCC information . . .
> > > >
> > > > https://gcc.gnu.org/faq.html#dso reports:
> > > >
> > > > QUOTE
> > > > The new C++ ABI in the GCC 3.0 series uses address comparisons, rather than string compares, to determine type equality.
> > > > END QUOTE
> > >
> > > Compare that with the implementation in <typeinfo>.
> >
> > Looking at /usr/local/lib/gcc11/include/c++/typeinfo I see that the
> > comparison is configurable, in part to allow for handling
> > RTLD_LOCAL (when weak symbols are available). I'll
> > quote the comments for reference . . .
> >
> > // Determine whether typeinfo names for the same type are merged (in which
> > // case comparison can just compare pointers) or not (in which case strings
> > // must be compared), and whether comparison is to be implemented inline or
> > // not. We used to do inline pointer comparison by default if weak symbols
> > // are available, but even with weak symbols sometimes names are not merged
> > // when objects are loaded with RTLD_LOCAL, so now we always use strcmp by
> > // default. For ABI compatibility, we do the strcmp inline if weak symbols
> > // are available, and out-of-line if not. Out-of-line pointer comparison
> > // is used where the object files are to be portable to multiple systems,
> > // some of which may not be able to use pointer comparison, but the
> > // particular system for which libstdc++ is being built can use pointer
> > // comparison; in particular for most ARM EABI systems, where the ABI
> > // specifies out-of-line comparison. The compiler's target configuration
> > // can override the defaults by defining __GXX_TYPEINFO_EQUALITY_INLINE to
> > // 1 or 0 to indicate whether or not comparison is inline, and
> > // __GXX_MERGED_TYPEINFO_NAMES to 1 or 0 to indicate whether or not pointer
> > // comparison can be used.
> >
> > So, to some extent, the details are choices made in the likes of
> > lang/gcc11 instead of an always-the-same rule for handling. The excerpt
> > below gives some more idea of what __GXX_TYPEINFO_EQUALITY_INLINE and
> > __GXX_MERGED_TYPEINFO_NAMES do for configuration. Is there a combination
> > that matches FreeBSD's system clang++ related behavior? If yes, should
> > the likes of lang/gcc11 be using that combination?
>
>
> I should have quoted a little bit more that describes the
> defaults used:
>
> #ifndef __GXX_MERGED_TYPEINFO_NAMES
> // By default, typeinfo names are not merged.
> #define __GXX_MERGED_TYPEINFO_NAMES 0
> #endif
>
> // By default follow the old inline rules to avoid ABI changes.
> #ifndef __GXX_TYPEINFO_EQUALITY_INLINE
> #if !__GXX_WEAK__
> #define __GXX_TYPEINFO_EQUALITY_INLINE 0
> #else
> #define __GXX_TYPEINFO_EQUALITY_INLINE 1
> #endif
> #endif
>
> . . .
>
> > #if !__GXX_TYPEINFO_EQUALITY_INLINE
> > // In old abi, or when weak symbols are not supported, there can
> > // be multiple instances of a type_info object for one
> > // type. Uniqueness must use the _name value, not object address.
> > . . .
> > #else
> > #if !__GXX_MERGED_TYPEINFO_NAMES
> > . . .
> > // Even with the new abi, on systems that support dlopen
> > // we can run into cases where type_info names aren't merged,
> > // so we still need to do string comparison.
> > . . .
> > #else
> > // On some targets we can rely on type_info's NTBS being unique,
> > // and therefore address comparisons are sufficient.
> > . . .
> > #endif
> > #endif
>
>
>
> ===
> Mark Millard
> marklmi at yahoo.com