WavePacket (C++/Python) Blog

Time-dependent simulation of open and closed quantum systems

Brought to you by: bsch63, ulflor

Perils of compiling software under Windows

I have been planning for years to write something in praise of vcpkg for making Windows compilation less unbearable. While building a binary package of Wavepacket recently, however, I found that there are other issues when porting software from Unix, hence the shift in focus.

Introduction

Only after programming C++ under Windows for some time did I fully grasp how much of a hacker's system Unix is. You only need to master a few concepts, such as how the linker works or which package to install for the library headers, and adding external dependencies to your code becomes almost trivial, for you and for others. And things just work; you might spend ages blissfully unaware of problems such as ABI compatibility, symbol versioning, linker maps or the details of the ELF symbol resolution.

This merry life comes to an end as soon as you decide to compile anything under Windows. Fundamentally, this is caused by different requirements and ecosystems.

Unix distributions thrive on Open Source Software (OSS). At their heart, they are a set of build scripts that download and build thousands of software packages. These packages are easily available through some package manager. Because of that, you can usually assume that every competent user has reasonably up-to-date system package versions and can install all dependencies.

Windows, on the other hand, was meant as an operating system for proprietary software. You have some version of Windows, possibly outdated, on which you install possibly outdated proprietary software that may have been built for an earlier version of Windows. Even worse, the proprietary software may contain proprietary components that are themselves outdated and have been built for yet another version of Windows.

The net result of many of the subsequent design decisions is that building software on Windows is a moderate pain. This is, by the way, not restricted to OSS; commercial software has to fight with the same limitations. Let us have a look at a few consequences.

Problems (and solutions) when compiling under Windows

1. No package management

Under Windows, the user is assumed to buy software and install it through the software's installer. Package management is not offered by the system itself.

If you want to efficiently manage dependencies, your first step should be the use of a package manager. Windows-native managers such as chocolatey do not cover most development needs. For C/C++ development, you want to use a specific package manager that just builds your direct and indirect dependencies with the latest sources and makes them available to your build. From personal experience, I can recommend vcpkg, but I would assume that Conan is an equally good choice.

The general approach of vcpkg is to offer a CMake (and Meson etc.) integration that mitigates many problems. For example, it sets prefix paths such that the vcpkg packages are easily found by a package search, and it copies dlls into your output folder (see the next item).

2. The dynamic loader is comparatively dumb

If your code depends on other libraries, you need a dynamic loader to make these libraries available to your code at run time. As a rule of thumb, the Windows dynamic loader loads dlls only from the directory of the executable. It does not understand runpaths, nor does Windows have common directories like /usr/lib where the libraries that you need may have been installed by default (slight simplification, but not relevant for your needs). A dedicated override for the dynamic loader like LD_LIBRARY_PATH does not exist either; the closest substitute is the PATH variable, which the loader consults only as a last resort.

The only general solution for this limitation is to copy all dependencies into the directory of the executable. Even if you build a testrunner, you need to make sure that your library under test as well as all its dependencies are copied into the same directory as the testrunner. Welcome to the state of the art for professional software development under Windows!

A common workaround is to build everything into one common output folder. Then you only need to copy the dependencies once and have them available for all programs. However, this requires of course additional build script infrastructure.

3. The dynamic loader may get in the way

There is another problem with the dynamic loader and plugin architectures. You will usually encounter those if you build a (Python / Java / ...) extension module. Again skipping some complexities, the problem goes like this:

The dynamic loader identifies a dll by its filename.
Dll filenames usually have no version number suffix; symlinks and the like also do not exist.
A program can load every dll only once.

Now imagine Python (which depends on sqlite.dll) loading an extension module that brings its own sqlite.dll. In general, this setup will crash because either Python or the extension module gets the wrong sqlite.dll. The same can even happen if two different extension modules need the same dll.

The solution for this problem is called namespacing. If the dynamic loader is too dumb to distinguish different libraries with the same name, you make the library names unique. Hence the extension module would rename its sqlite.dll into something like sqlite_js9x81nsj.dll.
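The renaming step itself can be sketched in a few lines of C++. The helper name namespaced_dll_name and the way the tag is passed in are assumptions for illustration only; real packaging tools typically derive the tag from a hash. Note also that renaming the file is only half of the job: every binary that imports the dll records the old filename in its import table, so that entry has to be patched to the new name as well, which this sketch does not do.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: insert a unique tag between the stem and the extension
// of a dll filename, e.g. "sqlite.dll" + "js9x81nsj" -> "sqlite_js9x81nsj.dll".
std::string namespaced_dll_name(const std::string& filename, const std::string& tag)
{
    assert(!tag.empty());  // an empty tag would not make the name unique

    const std::string::size_type dot = filename.rfind('.');
    if (dot == std::string::npos) {
        // No extension found: simply append the tag.
        return filename + "_" + tag;
    }
    // Keep the extension (including the dot) at the end of the new name.
    return filename.substr(0, dot) + "_" + tag + filename.substr(dot);
}
```

With the example from the text, namespaced_dll_name("sqlite.dll", "js9x81nsj") yields "sqlite_js9x81nsj.dll".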

Note that this situation is usually not a problem under Unix. Ignoring some details, each ELF shared object has its own list of dependencies to search for symbols, so a Python extension module would first look into its own sqlite.so before going up the hierarchy and looking into the dependencies of Python.

4. Licenses

It should be clear by now that you cannot reasonably expect even advanced users to build your code under Windows. Instead, you have to build the code yourself, and distribute the resulting binary including all external dependencies.

This opens a new can of worms: Licenses. You might not have noticed under Unix because you only ever distribute your own source code. As soon as you distribute your binary, however, you need to fulfill all license obligations of the external dependencies.

Make sure you supply all relevant licenses with the binary package, and make sure that you fulfill them. While doing so, you may come across enjoyable special cases. As a harmless example, the license of the MSVC C++ standard libraries requires you to adhere to the export restrictions of the United States. There may be worse conditions.

5. Miscellaneous other issues

Last, but not least, Windows is simply a different platform, so you need to change the programming style.

When compiling a library, the Unix default is to expose all symbols (functions and globals) in the resulting shared object, while under Windows the default is to not expose any symbols. This difference in behavior makes programs difficult to port. You can change the defaults with flags (for example the CMake target property WINDOWS_EXPORT_ALL_SYMBOLS). However, if the code exports data, such as static variables, this workaround does not cut it, and you will need to explicitly declare the export, as I had to do with my underlying tensor library.
By default, what is a shared object file "mylib.so" under Unix is split into three different files under Windows: the runtime dependency "bin/mylib.dll", the debugging information "bin/mylib.pdb" and the import library with information for the linker "lib/mylib.lib". For additional confusion, the latter has the same extension ".lib" as a static library.
If you use the Windows API directly, always convert your strings into UTF-16 wide strings and use the "W" version of the Windows API functions. This guarantees consistent behavior, whereas the narrow-string "Ansi" functions may depend on the user localization. This is a leftover artefact of Windows' early support for Unicode.
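For the first item, the explicit export declaration is usually wrapped in a macro so that the same header works on both platforms. The following is a minimal sketch, not the code of any particular library: the names MYLIB_API, MYLIB_BUILDING, mylib_default_rank and mylib_scaled_rank are made up for illustration. Exported data needs this treatment because clients of the dll have to see __declspec(dllimport) on the variable, which a blanket export flag cannot provide.

```cpp
#include <cassert>

// Hypothetical switch: normally defined by the build system only while
// compiling the library itself. Defined here so that the sketch compiles
// as a single translation unit.
#define MYLIB_BUILDING

#if defined(_WIN32)
#  if defined(MYLIB_BUILDING)
#    define MYLIB_API __declspec(dllexport)   // building the dll: export
#  else
#    define MYLIB_API __declspec(dllimport)   // using the dll: import
#  endif
#else
   // GCC/Clang: keep the symbol visible even with -fvisibility=hidden.
#  define MYLIB_API __attribute__((visibility("default")))
#endif

// --- what would go into the public header ---
extern MYLIB_API int mylib_default_rank;       // exported data
MYLIB_API int mylib_scaled_rank(int factor);   // exported function

// --- what would go into the library source file ---
int mylib_default_rank = 3;

int mylib_scaled_rank(int factor)
{
    assert(factor > 0);
    return factor * mylib_default_rank;
}
```

Compiling the Unix build with hidden default visibility as well makes both platforms behave alike, so missing annotations already show up as link errors under Unix.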

Some other issues that I have never come across personally, but that you should be aware of:

Never pass memory ownership over dll boundaries; allocation and deallocation must always be done in the same module. If you build a library that hands out allocated memory, offer an API to free this memory again. The only exception is if you can guarantee that all affected modules are built against the same runtime. Failure to do so can lead to mystery crashes.
The background is that different modules (possibly clients of your library) can be built against different runtimes that have different ideas about how to allocate memory. This problem is rather esoteric under Unix; by default all shared objects use the same global libc. These differences can be a major pain when porting a library to Windows.
You are more likely to encounter ABI compatibility problems under Windows because, unlike in a Unix distribution, the system and its libraries have not all been compiled with essentially the same compiler. Plus, there are different calling conventions in use. Most likely, you will be fine if you consume or expose only C interfaces, but be aware of this issue.
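Both recommendations, a C interface and a free function that lives in the same module as the allocation, can be combined. The sketch below uses made-up names (mylib_buffer, mylib_buffer_create, mylib_buffer_destroy, make_buffer); it only illustrates the pattern and is not taken from an existing library. Library and client code are shown in one file for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// --- public C interface of the hypothetical library ---
extern "C" {

struct mylib_buffer {
    double* data;
    std::size_t size;
};

// Allocates inside the library; returns nullptr on failure.
mylib_buffer* mylib_buffer_create(std::size_t size);
// Frees inside the library, i.e. with the runtime that did the allocation.
void mylib_buffer_destroy(mylib_buffer* buffer);

}

// --- library implementation (would be compiled into the dll) ---
mylib_buffer* mylib_buffer_create(std::size_t size)
{
    auto* buffer = new (std::nothrow) mylib_buffer{nullptr, size};
    if (!buffer) return nullptr;

    buffer->data = new (std::nothrow) double[size]();  // zero-initialized
    if (!buffer->data) {
        delete buffer;
        return nullptr;
    }
    return buffer;
}

void mylib_buffer_destroy(mylib_buffer* buffer)
{
    if (!buffer) return;
    delete[] buffer->data;
    delete buffer;
}

// --- client side: never call delete/free on library memory directly ---
struct mylib_buffer_deleter {
    void operator()(mylib_buffer* buffer) const { mylib_buffer_destroy(buffer); }
};
using buffer_ptr = std::unique_ptr<mylib_buffer, mylib_buffer_deleter>;

buffer_ptr make_buffer(std::size_t size)
{
    assert(size > 0);
    return buffer_ptr(mylib_buffer_create(size));
}
```

The custom deleter ensures that the client's smart pointer hands the memory back to the library instead of releasing it with the client's own runtime.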

Conclusions

This list of problems is by no means exhaustive, but should cover the main pain points. I hope it gives a reasonable overview of the main pitfalls and enough pointers and keywords to solve the issues.

Personally, all the problems that I have also seen at work have made me into a very practical advocate of OSS. Once they are packaged, OSS components are easy to use and upgrade, usually make sense as a package, and do not require additional build infrastructure. This is notably different from issues that we have at work with commercial software.

Posted by 2024-05-12