Lumiera
The new emerging NLE for GNU/Linux

some arbitrary hints and notes regarding the use of Boost Libraries in Lumiera

Notable Features

Some of the Boost Libraries are especially worth mentioning. You should familiarise yourself with those features, as we’re using them heavily throughout our code base. As it stands, the C/C++ language(s) are quite lacking and outdated, compared with today’s programming standards. To fill some of these gaps and to compensate for compiler deficiencies, some members of the C++ committee and generally very knowledgeable programmers created a set of C++ libraries generally known as Boost. Some of these especially worthy additions were proposed and subsequently included into the new C++11 standard.

C++11 Features

In Lumiera, we heavily rely on some features proposed and matured in the Boost libraries, and meanwhile included in the current language standard. These features are thus now provided through the standard library accompanying the compiler. For sake of completeness, we’ll mention them here. In the past we used the Boost-Implementation of these facilities, but we don’t need Boost for this purpose anymore.

memory

The <memory> libraries (formerly <boost/memory.hpp> rsp. <tr1/memory>) define a family of smart-pointers to serve several needs of basic memory management. In almost all cases, they’re superior to using std::auto_ptr.
When carefully combining these nifty templates with the RAII pattern, most concerns for memory management, clean-up and error handling simply go away. (but please understand how to avoid circular references and care for the implications of parallelism though)

functional

The function template adds generic functor objects to C++. In combination with the bind function (which binds or ties an existing function invocation into a functor object), this allows to “erase” (hide) the difference between functions, function pointers and member functions at your interfaces and thus enables building all sorts of closures, signals (generic callbacks) and notification services. Picking up on these concepts might be mind bending at start, but it’s certainly worth the effort (in terms of programmer productivity)

hashtables and hash functions

The unordered_* collection types amend a painful omission in the STL. To work properly, these collection implementations need a way to calculate a hash value for every key (rsp. entry in case of the Set-container). The hash function to use can be defined as additional parameter; there are also some conventions to pick a hash function automatically. Currently (2014), we have two options for the hash function implementation: The std::hash or the boost::hash implementation. (→ read more here…)

STATIC_ASSERT

a helper to check and enforce some conditions regarding types at compile time. In case of assertion failure a compilation error is provoked, which should at least give a clue towards the real problem guarded by the static assertion. It is good practice to place an extended source code comment near the static assertion statement to help solving the actual issue.

Relevant Bosst extensions

operators

The boost::operators library allows to build families of types/objects with consistent algebraic properties. Especially, it eases building equality comparable, totally ordered, additive, mulitplicative entities: You’re just required to provide some basic operators and the library will define all sorts of additional operations to care for the logical consequences, removing the need for writing lots of boilerplate code.

lexical_cast

Converting numerics to string and back without much ado, as you’d expect it from any decent language.

boost::format

String formatting with printf style directives. Interpolating values into a template string for formatted output — but typesafe, using defined conversion operators and without the dangers of the plain-C printf famility of functions. But beware: boost::format is implemented on top of the C++ output stream operations (<< and manipulators), which in turn are implemented based on printf — you can expect it to be 5 to 10 times slower than the latter, and it has quite some compilation overhead and size impact (→ see our own formatting front-end to reduce this overhead)

metaprogramming library

A very elaborate, and sometimes mind-bending library and framework. While heavily used within Boost to build the more advanced features, it seems too complicated and esoteric for general purpose and everyday use. Code written using the MPL tends to be very abstract and almost unreadable for people without math background. In Lumiera, we try to avoid using MPL directly. Instead, we supplement some metaprogramming helpers (type lists and tuples), written in a simple LISP style, which — hopefully — should be decipherable without having to learn an abstract academic terminology and framework.

variant and any

These library provide a nice option for building data structures able to hold a mixture of multiple types, especially types not directly related to each other. boost::variant is a typeseafe union record, while boost::any is able to hold just any other type you provide at runtime, still with some degree of type safety when retrieving the stored values. Both libraries are compellingly simple to use, yet add some overhead in terms of size, runtime, and compile time.

regular expressions

Boost provides a full blown regular expression library, supporting roughly the feature set of perl regular expressions. The usage and handling is somewhat brittle though, when compared with perl, python, java, etc.

program-options and filesystem

Same as the aforementioned, these two libraries just supply a familiar programming model for these tasks (parsing the command line and navigating the filesystem) which can be considered quasi standard today, and is available pretty much in the same style in Java, Python, Ruby, Perl and others.

Negative Impact

Most Boost libraries are header only and all of them make heavy use of template related features of C++. Thus, every inclusion of a Boost library might lead to increased compilation times. We pay that penalty per compilation unit (not per header). Yet still, using a boost library within a header frequently included throughout the code base might dangerously leverage that effect.

debug mode

Usually, when developing, we translate our code without optimisation and with full debugging informations switched on. Unfortunately, C++ templates were never designed to serve as a functional metaprogramming language to start with — but that’s exactly what we’re (ab)using them for. The Boost libraries drive that to quite some extreme. This leads to lots and lots of debugging information to be added to the object files, mentioning each and every intermediary type created in the course of expanding the metaprogramming facilities. Even seemingly simple things may result in object files becoming several megabytes large.

Fortunately, all of this overhead gets removed when stripping your executable and libraries (or when compiling without debug information). So this is solely an issue relevant for the developers, as it increases compilation, linking and startup times.

runtime overhead and template bloat

The core Boost libraries (the not-so experimental ones) have a reputation for being of very high quality. The’re written by experts with a deep level of understanding of the language, the usual implementation and the performance implications. Mostly, those quite elaborate metaprogramming techniques where chosen exactly to minimise runtime overhead or code size.

Since each instantiation of a template constitutes a completely new class, carelessly written template code can lead to heavily bloated executables. Every instantiated function and every class with virtual methods (i.e. with a VTable) adds to the weight. But this negative effect can be balanced by the ability of reducing inline code. According to my own experience, I’d be much more concerned about my own code adding template bloat, then being concerned about the Boost libraries (those people know very well what they’re doing…)

some practical guidelines

Facilities like boost::format, boost::variant, boost::any, boost::lambda and the more elaborate metaprogramming stuff add considerable weight. A good advice is to confine those features to the implementation level: use them within individual translation units (*.cpp) where this makes sense, but don’t express general interfaces in terms of those library constructs.

Actually, for the most relevant of these, namely boost::format and boost::variant, we have either created a lightweight front-end or our own simplified implementation in the support library, leading to a significant reduction in overall code size.

Beyond those somewhat problematic entities, there used to be several incredibly useful tools from the Boost library, which create only moderate overhead — nothing to be really concerned about. Fortunately, mostly these have meanwhile been adapted into the official standard library, and can thus be used without creating a dependency on Boost.

  • the functional tools (function, bind and friends), the hash functions, lexical_cast and the regular expressions create a moderate overhead. Probably fine in general purpose code, but you should be aware that there is a price tag. About the same as with many STL other features.

  • the shared_ptr weak_ptr, intrusive_ptr and unique_ptr are really indispensable and can be used liberally everywhere. Same for enable_if and the type_traits. The impact of those features on compilation times and code size is negligible and the runtime overhead is zero, compared to performing the same stuff manually (obviously a ref-counting pointer like smart_ptr has the size of two raw pointers and incurs an overhead on each creation, copy and disposal of a variable — but that’s besides the point I’m trying to make here, and generally unavoidable)