Wednesday, February 09, 2022

UnoDB ported to MSVC

I have ported UnoDB to Microsoft Visual Studio (MSVC), the third major C++ compiler. The other two, GCC and clang, are reasonably close to each other, while MSVC likes to do its own thing, not only due to a different runtime platform, but also because of a different interpretation of the C++ standard in the compiler and library. MSVC 6.0 (the Internet Explorer 6.0 of C++ compilers, don't try to use the C++98 standard library there) was the first compiler I used in my professional C++ work two decades ago. Luckily 6.0 is not the current version, and Microsoft has tried to clean up its act with respect to standard C++ since (not fully – why is std::exception::what() not noexcept?). So, the port forced me to write more, uhm, portable code. Most GCC- and clang-specific compiler extensions and builtins mapped straightforwardly to MSVC ones, and the few that didn't could be #ifdef'ed away trivially. There was only one incompatible API call – posix_memalign – which mapped to _aligned_malloc trivially too. Then the port pointed out actual bugs in my assumptions, mostly in my use of the standard library.
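To illustrate the posix_memalign to _aligned_malloc mapping mentioned above, here is a minimal sketch. The helper names sketch_aligned_alloc and sketch_aligned_free are made up for this post and are not the actual UnoDB functions; the sketch also assumes that throwing std::bad_alloc on failure is acceptable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#ifdef _MSC_VER
#include <malloc.h>  // _aligned_malloc, _aligned_free
#else
#include <stdlib.h>  // posix_memalign
#endif

// Hypothetical helper, not taken from the UnoDB sources.
[[nodiscard]] inline void* sketch_aligned_alloc(std::size_t alignment,
                                                std::size_t size) {
  // Both APIs want a power-of-two alignment; posix_memalign additionally
  // requires it to be a multiple of sizeof(void*).
  assert(alignment >= sizeof(void*) && (alignment & (alignment - 1)) == 0);
#ifdef _MSC_VER
  // Note the argument order: size first, alignment second.
  void* result = _aligned_malloc(size, alignment);
#else
  void* result = nullptr;
  if (posix_memalign(&result, alignment, size) != 0) result = nullptr;
#endif
  if (result == nullptr) throw std::bad_alloc{};
  return result;
}

// Hypothetical counterpart: the release call differs per platform too.
inline void sketch_aligned_free(void* ptr) noexcept {
#ifdef _MSC_VER
  _aligned_free(ptr);  // memory from _aligned_malloc must not go to free()
#else
  std::free(ptr);
#endif
}
```

The allocation call itself maps one-to-one, but the memory has to be released with _aligned_free on MSVC, so the free side needs the same wrapper treatment.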

Besides the actual source code, I was most worried about CMake script compatibility. Turns out, MSVC and CMake work quite nicely together if one uses CMake presets, and porting the build scripts was relatively easy. 
At this point the basic port was completed with tests and benchmarks passing. Even though my main development platform is macOS and test/benchmark one is Linux, I wanted to do a first-class port that matches the existing platforms as much as possible and makes the most of MSVC features. For that, I did more work:
  • CI. GitHub Actions has good Windows runner support. It was a bit alien to learn and write PowerShell script bits and not to revert to UNIX'isms, but otherwise it works and runs great.
  • Clean compiler diagnostics. I have enabled the /W4 warning level and fixed a few unremarkable issues. I did have to add a CMake kludge to remove the default /W3: I am using CMake 3.12, and this issue was only fixed in 3.15.
  • AddressSanitizer. It's very nice that MSVC has it, and it mostly works as expected. There are two open MSVC bugs though, so I am unable to fully test it in CI yet.
  • clang/LLVM in MSVC. Now that's interesting – a clang compiler integrated with the rest of the MSVC toolchain. For code, that required adding #ifdefs in the style of defined(_MSC_VER) && !defined(__clang__), which takes some time to get used to. For the build system, things got even more interesting. There is a clang-cl.exe compiler driver that translates MSVC cl.exe command line options to clang ones internally. Which is great, except that a few cl.exe flags do not have translations and cause errors, and a few clang-style flags have to be passed together with the cl.exe-style flags too. And then this whole one-toolchain-two-compilers business took CMake by surprise – I had to use incorrect-if-ever-ported-to-native-LLVM-on-Windows CMake workarounds. CMake 3.14 introduced CMAKE_CXX_COMPILER_FRONTEND_VARIANT to address this. Anyway, this is also tested in CI.
  • clang/LLVM in MSVC with AddressSanitizer. Yep, that's also a thing, but here I hit the wall of incompatible runtime libs and stopped.
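The defined(_MSC_VER) && !defined(__clang__) style of check from the list above can be sketched as follows. The SKETCH_* macro names and the sketch_compiler_name function are invented for illustration; the only facts relied upon are that clang targeting the MSVC toolchain defines both _MSC_VER and __clang__, and that clang elsewhere also defines __GNUC__.

```cpp
#include <cassert>
#include <string_view>

// Order matters: clang in MSVC mode defines _MSC_VER too, and clang on other
// platforms defines __GNUC__, so test for __clang__ before assuming cl.exe
// or GCC.
#if defined(_MSC_VER) && !defined(__clang__)
#define SKETCH_MSVC 1  // cl.exe proper
#elif defined(_MSC_VER) && defined(__clang__)
#define SKETCH_CLANG_MSVC 1  // clang targeting the MSVC toolchain, e.g. clang-cl
#elif defined(__clang__)
#define SKETCH_CLANG 1
#elif defined(__GNUC__)
#define SKETCH_GCC 1
#endif

// Hypothetical helper that reports which branch was taken.
constexpr std::string_view sketch_compiler_name() noexcept {
#if defined(SKETCH_MSVC)
  return "MSVC";
#elif defined(SKETCH_CLANG_MSVC)
  return "clang (MSVC toolchain)";
#elif defined(SKETCH_CLANG)
  return "clang";
#elif defined(SKETCH_GCC)
  return "GCC";
#else
  return "unknown";
#endif
}

// Small self-check: some branch always produces a non-empty name.
inline void sketch_check_compiler_name() {
  assert(!sketch_compiler_name().empty());
}
```

Folding the checks into one place like this keeps the double-negative conditions out of the rest of the code, which then only tests the derived macros.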
But by far the biggest payoff of the port was the set of fixes prompted by the MSVC static analyzer. The analyzer was an honorable mention in my static analyzer post, and now that I can run it on my code, the results exceeded my expectations. It started innocently enough: a few trivial changes (and some more), and a few new asserts added to help out data flow analysis. Similar to asserts, I added some assumptions – which are facts about variable values provided to the compiler in release builds too. (One of my previous experimental branches did the same – lock-free 128-bit atomic stats used assumptions to limit double values to 2^63 for integer conversion, to avoid branching to the highest-bit-set case handling.) Some code was de-duplicated, and dead code was removed. Then it led me to discover a typo, which invalidated many months of runs of the Node48 and Node256 random get benchmarks. Luckily there were no code changes dependent on the affected benchmarks.
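An assumption of the kind described above could look roughly like this. It is only a sketch: the SKETCH_ASSUME macro and the sketch_to_u64 function are hypothetical names, not UnoDB code, and it relies on the MSVC __assume intrinsic and the GCC/clang __builtin_unreachable builtin.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical portable "assume" macro. Unlike assert, it stays active in
// release builds, and violating it is undefined behavior.
#if defined(_MSC_VER)
#define SKETCH_ASSUME(x) __assume(x)
#else
#define SKETCH_ASSUME(x)                \
  do {                                  \
    if (!(x)) __builtin_unreachable();  \
  } while (0)
#endif

// Converts a double known to lie in [0, 2^63) to an unsigned 64-bit integer.
// The assumption tells the optimizer that the highest-bit-set case cannot
// happen; whether a branch is actually removed depends on the compiler.
inline std::uint64_t sketch_to_u64(double value) noexcept {
  constexpr double two_pow_63 = 9223372036854775808.0;
  assert(value >= 0.0 && value < two_pow_63);  // checked in debug builds
  SKETCH_ASSUME(value >= 0.0 && value < two_pow_63);
  return static_cast<std::uint64_t>(value);
}
```

Pairing each assumption with an assert of the same condition means debug builds catch a wrong assumption instead of silently miscompiling.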
The static analyzer also checks for C++ Core Guidelines violations, and there is a slightly sad story there, of the "why we can't have nice things in C++" kind. To suppress violation diagnostics, MSVC provides a well-meaning gsl::suppress(x.1) attribute. To make suppressions portable, LLVM well-meaningly ported it but made the argument a string literal – gsl::suppress("x.1") – because that's what portability means! To make both compilers work, the Guidelines Support Library introduces a well-meaning GSL_SUPPRESS(x.1) macro. If it is present in the code, clang-format will insert a space after the dot – GSL_SUPPRESS(x. 1) – breaking the compilation. There is no .clang-format option to stop this behavior, and so all the actual uses of this macro in the actual portable codebases look like this:
// clang-format off
GSL_SUPPRESS(f.4) // NO-FORMAT: attribute
// clang-format on
Where NO-FORMAT: attribute seems to be a thing to handle MSVC's own formatter (?). Luckily, I could replace the GSL suppressions with direct MSVC compiler suppressions, so that's a minor inconvenience in the big picture.
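One way to write such a direct MSVC suppression is with warning pragmas wrapped in macros, which clang-format leaves alone. This is a sketch under assumptions: the SKETCH_* macro names and the sketch_twice function are invented here, and 26497 is, to my knowledge, the analyzer warning number corresponding to f.4, which should be double-checked against the MSVC documentation.

```cpp
#include <cassert>

// Hypothetical macros: expand to warning pragmas on cl.exe only, and to
// nothing on every other compiler, including clang in MSVC mode.
#if defined(_MSC_VER) && !defined(__clang__)
#define SKETCH_DISABLE_MSVC_WARNING(n) \
  __pragma(warning(push)) __pragma(warning(disable : n))
#define SKETCH_RESTORE_MSVC_WARNINGS() __pragma(warning(pop))
#else
#define SKETCH_DISABLE_MSVC_WARNING(n)
#define SKETCH_RESTORE_MSVC_WARNINGS()
#endif

// Assumption: 26497 is the "could be marked constexpr" (f.4) diagnostic.
SKETCH_DISABLE_MSVC_WARNING(26497)
inline int sketch_twice(int value) noexcept {
  assert(value > -1000000 && value < 1000000);  // keep the result in range
  return value * 2;
}
SKETCH_RESTORE_MSVC_WARNINGS()
```

The macro argument is a plain integer, so there is no dot for clang-format to mangle and no formatter guards are needed around it.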
All in all, I learned a lot by doing the MSVC port, working on Windows was a bit of a change from my regular programming environment, and the code improved more than I expected – not too bad.