Bannalia: trivial notes on themes diverse: Some experiments with Boost.Unordered on Fil-C

Fil-C is a C and C++ compiler built on top of LLVM that adds run-time memory-safety mechanisms preventing out-of-bounds and use-after-free accesses. This naturally comes at a price in execution time, so I was curious about how much of a penalty that is for a performance-oriented, relatively low-level library like Boost.Unordered.

Compiling and testing

From the user's perspective, Fil-C is basically a Clang clone, so it is fairly easy to integrate into existing toolchains. This repo shows how to plug Fil-C into Boost.Unordered's CI, which runs on GitHub Actions and is powered by Boost's own B2 build system. The most straightforward way to make B2 use Fil-C is by having a user-config.jam file like this:

using clang : : fil++ ;

which instructs B2 to use the clang toolset with the only change that the compiler name is not the default clang++ but fil++.

We've encountered only minor difficulties during the process:

In the environment used (Linux x64), B2 automatically includes --target=x86_64-pc-linux as part of the command line, which confuses the adapted version of libc++ shipping with Fil-C. This option had to be overridden with --target=x86_64-unknown-linux-gnu (which is the default for Clang).
As of this writing, Fil-C does not accept inline assembly code (asm or __asm__ blocks), which Boost.Unordered uses to provide embedded GDB pretty-printers. The feature was disabled with the macro BOOST_ALL_NO_EMBEDDED_GDB_SCRIPTS.

Other than this, the extensive Boost.Unordered test suite compiled and ran successfully, except for some tests involving Boost.Interprocess, which uses inline assembly in some places. CI completed in around 2.5x the time it takes with a regular compiler. It is worth noting that Fil-C happily accepted SSE2 SIMD intrinsics crucially used by Boost.Unordered.
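To give an idea of the kind of SSE2 code involved, here is a minimal sketch of a 16-byte metadata match, the sort of operation open-addressing containers use to compare one reduced hash value against a whole group of slots at once. This is not Boost.Unordered's actual code: match_group and its parameters are hypothetical names chosen for illustration, and only the intrinsics themselves are real.

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h> // SSE2 intrinsics

// Hypothetical helper (not taken from Boost.Unordered): returns a 16-bit
// mask whose bit i is set when metadata[i] == tag.
inline unsigned match_group(const std::uint8_t* metadata, std::uint8_t tag)
{
    assert(metadata != nullptr);
    // Unaligned load of 16 metadata bytes into one SSE2 register.
    __m128i group = _mm_loadu_si128(reinterpret_cast<const __m128i*>(metadata));
    // Broadcast the tag to all 16 byte lanes.
    __m128i needle = _mm_set1_epi8(static_cast<char>(tag));
    // Byte-wise comparison: a lane becomes 0xFF where the bytes are equal.
    __m128i eq = _mm_cmpeq_epi8(group, needle);
    // Gather the top bit of each lane into the low 16 bits of an int.
    return static_cast<unsigned>(_mm_movemask_epi8(eq));
}
```

A single call inspects 16 slots without any branching, which is why this style of code matters for lookup speed. Under Fil-C the load through the metadata pointer is still subject to the usual bounds checks.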

Run-time performance

We ran some performance tests compiled with Fil-C v0.674 on a Linux machine with release settings (benchmark code and setup here). The figures show execution times in ns per element for Clang 15 (solid lines) and Fil-C (dashed lines) and three containers: boost::unordered_map (closed-addressing hashmap), and boost::unordered_flat_map and boost::unordered_node_map (both open addressing).
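For readers unfamiliar with the "ns per element" metric, the following is a rough sketch of how such a figure can be obtained. It is not the benchmark code linked above: insertion_ns_per_element is a hypothetical name, the key sequence is an arbitrary choice, and std::unordered_map is suggested below only so that the example has no dependencies beyond the standard library (any container with a compatible emplace and size could be substituted).

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical harness: inserts n distinct keys into a fresh Map and
// returns the average wall-clock time per inserted element, in ns.
template<class Map>
double insertion_ns_per_element(std::size_t n)
{
    assert(n > 0);
    Map m;
    std::uint64_t key = 0;
    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n; ++i) {
        // Odd 64-bit increment: keys wrap modulo 2^64 and never repeat
        // within a run, while being scattered across the key space.
        key += 0x9E3779B97F4A7C15ull;
        m.emplace(key, i);
    }
    auto stop = std::chrono::steady_clock::now();
    assert(m.size() == n); // every insertion added a new element
    std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(n);
}
```

Calling insertion_ns_per_element<std::unordered_map<std::uint64_t, std::size_t>>(n) for increasing values of n and building the same program with both compilers would give one curve per compiler. A serious benchmark would also repeat runs and discard outliers, which this sketch does not attempt.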


[Figures: running insertion, running erasure, successful lookup, unsuccessful lookup]

Execution with Fil-C is around 2x-4x slower, with wide variations depending on the benchmarked scenario and container of choice. Closed-addressing boost::unordered_map is the container experiencing the largest degradation, presumably because it does the most pointer chasing.
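The pointer-chasing hypothesis can be illustrated with two deliberately simplified lookups. Neither reflects the real layout of the Boost containers: Node, Slot, chained_find and flat_find are hypothetical names, the hash is just the key modulo the table size, and the flat variant uses plain linear probing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Closed addressing (simplified): each bucket heads a linked list of nodes.
struct Node { int key; int value; Node* next; };

const Node* chained_find(const std::vector<Node*>& buckets, int key)
{
    assert(!buckets.empty());
    std::size_t pos = static_cast<std::size_t>(key) % buckets.size();
    // Each step follows a pointer loaded from the previous node.
    for (const Node* p = buckets[pos]; p != nullptr; p = p->next)
        if (p->key == key) return p;
    return nullptr;
}

// Open addressing (simplified): elements live in one contiguous array.
struct Slot { bool used; int key; int value; };

const Slot* flat_find(const std::vector<Slot>& slots, int key)
{
    assert(!slots.empty());
    std::size_t n = slots.size();
    std::size_t pos = static_cast<std::size_t>(key) % n;
    // Linear probing: advance by index arithmetic, no pointers to follow.
    for (std::size_t probes = 0; probes < n; ++probes) {
        const Slot& s = slots[pos];
        if (!s.used) return nullptr; // an empty slot ends the probe sequence
        if (s.key == key) return &s;
        pos = (pos + 1) % n;
    }
    return nullptr;
}
```

In chained_find every visited node costs one dependent pointer load, whereas flat_find stays inside a single array. If the guess above is right, a compiler that adds checks to pointer accesses would penalize the first pattern more than the second.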

Monday, November 10, 2025
