Today something completely different. I wanted to write about this topic for a long time, and it also has nice overlap with vcpkg (which I also wanted to write up for some time now), so I finally decided to just do it.
This blog post was inspired by me learning more DotNet internals than I would normally care about. One of the rather frustrating experiences during this process was the lack of readily available information on how DotNet deals with dependencies. Hence this blog post.
On second thought, maybe the information was there, but, perfectionist that I am, it took me some time to accept the state of the art.
One word of warning: I have set up our DotNet code at work to use Paket for dependency management and never looked back at Nuget. It may be that the situation discussed here has improved by now.
The whole article is licensed under the CC0 license. Copy parts, whatever. It would be nice to receive attribution, but in the end I do not care.
The three takeaway messages (and sections) of this post are:

1. Dependency management is hard: diamond dependencies have no universally good solution.
2. DotNet and Nuget handle the problem poorly.
3. Paket does dependency management the way it should be done.
When you have a project, you typically decompose it into coherent modules. These can be executables, test runners, libraries etc. The modules expose a public API that allows other, consuming modules to use the encapsulated functionality. This public API changes over time, either in syntax or in behavior. Hence, modules always reference other modules by a specific version, either implicitly (e.g. "the version from this commit") or explicitly through some version number, e.g., from semantic versioning.
In almost all non-trivial cases, not all code will be written by you / your team, and you will rely on external modules for certain functionality. In this case, you need to make sure that these provide the correct API version for error-free consumption by your code. That is, you need to manage external dependencies of your code.
Note that in the following, I will switch to VisualStudio / DotNet speak. VisualStudio calls the overall project a "solution" and the individual modules "projects", with "assemblies" as their build output. Finally, assemblies can be wrapped in a (usually Nuget) "package" that contains additional metadata, in particular dependencies on other packages.
Consider the trivial case of a program P depending on an external assembly E, as depicted in figure 1.
Figure 1: simple dependency of a program P on an external assembly E in a specific version.
All you need to do here is to choose the correct version of E and you are done.
The dependency is evaluated in two contexts: during the build and at run time. During the build, the compiler looks up the assembly E and essentially verifies that all referenced symbols exist with the correct signatures. When the program runs, the DotNet runtime loads the assembly E in the correct version on demand and dispatches calls to the corresponding symbols in E. In between, the symbols are stored essentially as strings.
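You can inspect these stored references yourself with standard reflection. A minimal sketch (GetReferencedAssemblies is a stock API; the output lists exactly the name/version pairs the compiler recorded in the manifest):

```csharp
using System;
using System.Reflection;

class ShowReferences
{
    static void Main()
    {
        // The compiler baked these names and versions into the manifest;
        // the runtime uses them to locate dependencies when loading lazily.
        foreach (AssemblyName reference in
                 Assembly.GetExecutingAssembly().GetReferencedAssemblies())
        {
            Console.WriteLine($"{reference.Name}, Version={reference.Version}");
        }
    }
}
```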
Let us choose a more complex example, where P depends on two assemblies A, B, and these in turn depend on an external assembly E. See figure 2 for the scheme.
Figure 2: Fundamental diamond dependency conflict, where the program P depends on two assemblies A, B that reference an external assembly E in different versions x.y.z and a.b.c, respectively.
In practical situations, the dependency graph can be more complex, with transitive assemblies in between or more than two assemblies requiring E, but the diamond structure as central feature is commonly found.
To understand the principal problem, consider the following scenario. Let A be the output of an internal project that is under your control, while B is an external assembly that you only get in compiled form. You compile A against E in version x.y.z, and at runtime you also only provide E in version x.y.z to fulfill both dependencies from A and B. Let the versions x.y.z and a.b.c be different.
Now assembly B calls a particular API (E in version a.b.c) but gets a different API (E in version x.y.z). In particular, symbols or the underlying behavior may have changed between the two versions. Whenever B calls one of the changed symbols in E, bad things may happen. In the best case, function signatures have changed; then the runtime cannot find the requested symbol and throws an exception. This can cripple your application because some functionality is not reachable, or it can even crash the application, but at least you can localize the error. In the worse case, E behaves differently from what B expects. Then you have a very obscure, hard-to-track-down bug.
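Since reproducing the exception genuinely needs two separately compiled assemblies, here is a runnable simulation of the lookup instead (Transmogrifier and Frobnicate are made up). The runtime resolves methods essentially by name and signature, and the two-argument overload that B expects simply is not there:

```csharp
using System;
using System.Reflection;

// Stand-in for E in version x.y.z: only the one-argument overload exists.
public static class Transmogrifier
{
    public static int Frobnicate(int x) => x;
}

class Demo
{
    static void Main()
    {
        // B's call site was compiled against Frobnicate(int, int);
        // simulate the runtime's name-and-signature lookup via reflection.
        MethodInfo twoArg = typeof(Transmogrifier).GetMethod(
            "Frobnicate", new[] { typeof(int), typeof(int) });

        if (twoArg == null)
            Console.WriteLine("MissingMethodException territory: " +
                              "Frobnicate(int, int) no longer exists.");
    }
}
```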
What is particularly insidious in this case is that all the code is perfectly sound. No amount of static code analysis and almost no review will highlight an error. However, your total system, composed of P, A, B, E(x.y.z), is broken, and this shows only at runtime. In the literature you will find such situations labelled with the term "hell", as in "dll hell" or "jar hell" or, as I will expand on below, "assembly hell".
There are basically three approaches:
Pick & Pray
The simplest solution uses a common version to satisfy both dependencies. However, as discussed above, this is largely programming by coincidence and not guaranteed to work.
There is one exception, though. If and only if the assembly E can be trusted not to change its public API without notice, and if no such notice has been given between x.y.z and a.b.c, the two versions should be interchangeable. For example, if E uses semantic versioning and the two versions differ only in the minor or micro version number, then you can pick the higher version to fulfill both dependencies.
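A minimal sketch of that rule (assuming the package genuinely follows semantic versioning, which you should verify rather than trust):

```csharp
using System;

static class SemVer
{
    // Returns the version to use for both dependencies, or null when
    // the major versions differ and no safe common pick exists.
    public static Version PickCommonVersion(Version a, Version b)
    {
        if (a.Major != b.Major) return null;  // potential breaking change
        return a >= b ? a : b;                // higher minor/patch wins
    }
}

// Example: (12.1.0, 12.3.2) -> 12.3.2; (6.0.8, 12.0.1) -> null.
```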
Duplicate the assembly
Another way to solve the dependency problem is to break up the diamond as in figure 3.
Figure 3: Possible solution to the dependency problem of figure 2, by supplying assembly E twice with two distinct versions.
If you depend on the assembly E in two different versions, well, you just supply the assembly in two different versions. This generally works; however, there are a few caveats.
If the assembly E has internal state, this state will be duplicated, which can be a problem. For example, if E is a logging framework, duplicating E gives you two logging frameworks that work independently. These can then interfere in arbitrary ways, for example by overwriting each other's files.
More importantly, this solution is only available if the runtime supports it. Two runtimes that do not support this setup are the ELF dynamic loader (Linux shared libraries) and Sun Java before version 9 (though you can work around both restrictions with modest extra effort). In both cases, the application holds a flat list of modules, either jar files or shared libraries, that is searched for missing symbols (functions or variables). When a symbol is requested, the runtime or dynamic loader takes the symbol name essentially as a string and goes through the list of modules until it finds a matching entry. If multiple entries exist, only the first is ever returned; hence duplication does not work.
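A toy model of that first-match lookup makes the shadowing obvious (the dictionaries stand in for jar files or shared libraries mapping symbol names to code; this is an illustration, not how any real loader is implemented):

```csharp
using System;
using System.Collections.Generic;

static class FlatLoader
{
    // First-match wins: once a module earlier in the list exports the
    // symbol, any same-named symbol in a later module is unreachable.
    public static string Resolve(List<Dictionary<string, string>> modules,
                                 string symbol)
    {
        foreach (var module in modules)
            if (module.TryGetValue(symbol, out var code))
                return code;
        throw new MissingMethodException(symbol);
    }
}
```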
Enforce a common assembly version
You can try to ensure a.b.c == x.y.z, that is, that all code only ever requests a single version of the external assembly. This should be your preferred solution whenever possible.
Of course, as usual, this approach is not always feasible. It only works if you have complete control over the affected assemblies A, B in figure 2. But often, at least one of the assemblies is itself an external assembly; then you would need to modify and compile foreign code, which is considerable extra work and complexity. Even if the assemblies A, B are in principle under your control, they might be maintained by different teams. Or they may be used by several internal clients with different requirements. Requiring a consistent version of all dependencies can then add a lot of inertia to the development process. Instead of a "hey, let's quickly update this package" approach, you need to involve and synchronize multiple teams, which translates into additional costs and effort.
Now how does the DotNet environment fare with regard to the dependency problem? Unfortunately not too well.
The runtime itself apparently supports the layout of figure 3: you can have the same symbol name referring to different code in different assemblies. However, the assembly loader effectively does not support this layout. The details are listed here, but in essence every assembly has a name (and version, but this can be overridden), and assemblies are referred to by their name. The DotNet assembly loader can load an assembly with a given name only once, unless you use rather arcane features like loader contexts.
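For completeness, this is roughly what the arcane route looks like on newer runtimes, sketched with System.Runtime.Loader.AssemblyLoadContext (available from .NET Core 3 on; the paths are made up):

```csharp
using System.Reflection;
using System.Runtime.Loader;

// Each context maintains its own assembly map, so the same assembly
// name can be loaded twice. Beware: the two copies of E expose
// unrelated types, so instances cannot be exchanged across contexts.
var oldContext = new AssemblyLoadContext("E-old");
var newContext = new AssemblyLoadContext("E-new");

Assembly eOld = oldContext.LoadFromAssemblyPath(@"C:\deps\E-1.2.3\E.dll");
Assembly eNew = newContext.LoadFromAssemblyPath(@"C:\deps\E-4.5.6\E.dll");
```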
In essence, unless you go quite some extra miles with renaming assemblies, rebuilding other assemblies to refer to the new names, etc., duplicating assemblies does not work. You may spend the extra effort on one or two crucial assemblies that have proven problematic, but not on the dozens of dependencies that a serious DotNet application accumulates. So in general you can choose: try to get common dependency versions, or pray.
While DotNet is not alone in having this problem, its impact is exacerbated by the fact that dependency trees can quickly become deep and wide. In particular, Nuget packages like the various Azure components have a tendency to drag in the whole world in arbitrary versions.
To give some numbers from work: we have a simulation, probably 50-100k lines for the kernel, that allows running simulations (and the corresponding tracking for billing, authentication, etc.) in Azure, which is why it depends on various Azure packages. The program has more than 200 external dependencies. In contrast, a multi-million-line, highly complex desktop simulation has fewer than 50 dependencies, many of them highly specialized and conflict-free.
For the versioning problem, a notable example is Newtonsoft.Json, probably the most widely used DotNet Json parser.
Various Azure packages depend on Newtonsoft.Json in version 6.x, while other packages require it in a version like 12.x. That is a whopping difference of six in the major version! It seems safe to assume a priori that these dependencies are incompatible.
A third problem comes from the way Nuget packages tend to deal with version constraints. The default version constraint, and what feels like the only one I have ever seen, is a minimum version. So if you have a Nuget package/assembly X that has been built against a package/assembly Y in version, say, 2.1.0, then the Nuget package for X declares a dependency on the package Y with any version >= 2.1.0. This dependency can be fulfilled by 2.1.1 (ok), 2.5.0 (probably ok), but also 10.0 (not ok). While this approach has its virtues, it makes dependency management harder because you lack warnings that would indicate potentially incompatible versions.
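For illustration, this is what such a dependency looks like inside a package's .nuspec (package ids and versions made up):

```xml
<!-- A bare version in Nuget means "this version or any higher one". -->
<dependencies>
  <dependency id="Y" version="2.1.0" />
  <!-- An explicit upper bound is possible, but rarely declared: -->
  <!-- <dependency id="Y" version="[2.1.0,3.0.0)" /> -->
</dependencies>
```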
Given the state of the art of DotNet dependency management, there are a few features that a DotNet package manager should have:
When you have a solution, consisting of libraries, test runners, programs etc., you generally want a single version of each external assembly for everything.
This is pretty important. Consider a test runner using a different version of some external assembly than your production program. You may have, for example, a dependency error in your program that is never caught by a test, or different behavior in test and production. Good luck debugging that.
You should be able to get a dependency graph for your solution.
The graph shows all dependencies from your projects to Nuget packages and transitive dependencies between Nuget packages. It allows you to review the versions of all dependencies, so that you can identify potentially problematic dependencies. Once you have "signed off" the dependency graph, it should be the immutable reference for the packages/assemblies to use, and should only change if you add/remove or update dependencies.
Unfortunately, Nuget rests on a very fundamental design decision: it manages dependencies per project, not per solution. In theory, this offers additional flexibility and is probably the only way to integrate with the build system. However, it adds a lot of overhead to dependency management, because you need to manually ensure that all projects use the same version of a given package. I have fond memories of a coworker spending hours and an endless stream of curses on this task for a comparably trivial program (~50k lines), and of people fearing an upgrade because of the days of trouble it brings. Also, while Nuget nowadays offers something like a dependency graph, you get one per project, which limits its use.
Nuget has a few further downsides:
No C++/CLI support.
If you combine DotNet and native code, you might sometimes like to consume Nuget packages in the C++/CLI glue code. Unfortunately, Nuget has no C++/CLI support whatsoever. C++/CLI projects cannot have Nuget-managed dependencies, and when looking for potential incompatibilities between packages, Nuget never looks past a C++/CLI project. That is, to spell it out, you do not even get a warning if your dependencies are entirely broken.
Integration into the build system.
By default, when you use Nuget, it integrates into MSBuild. Superficially, that sounds pretty cool and allows you, for example, to specify the Nuget packages directly in your project files. In practice, it is not.
MSBuild is simply a pretty complex monster, and bolting on yet another layer of functionality does not make it more robust. I recommend turning on detailed logging for a project with some dependencies and reading up on where and how MSBuild looks for potential assemblies to link your code against. The result should convince you that you want to override MSBuild, not integrate with it. There are also some exotic cases where MSBuild can "lose" assemblies in interesting ways, which suggests you do not want to integrate.
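If you want to see this for yourself, something along these lines produces the relevant log (solution name made up; search the output for the ResolveAssemblyReferences target):

```
msbuild MyApp.sln /t:Rebuild /verbosity:diagnostic > msbuild.log
```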
Enter Paket, an alternative package manager. It is not perfect: the documentation took some time to get used to, and the way it manipulates project files feels a bit dirty. But Paket supports the workflow as it should be, and it has this down-to-earth feeling of Unix tools: doing one thing properly.
To use Paket, you logically proceed in two steps. First, you define what external dependencies with which version constraints you need globally for your solution, and run Paket on this input. The result is a dependency graph in a file called paket.lock that tells you every single package and version that is needed. This in itself is already pretty cool: Just by parsing this file, you can see which package requires what other package. If you are not happy with the dependency graph, you can fine-tune by adding more requirements or changing required versions.
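A minimal sketch of the two files (package names and versions made up; real solutions will have much longer lists):

```
// paket.dependencies -- your input: sources and top-level constraints
source https://api.nuget.org/v3/index.json

nuget Newtonsoft.Json ~> 12.0
nuget SomeAzureClient
```

```
// paket.lock -- Paket's output: the fully resolved, pinned graph
NUGET
  remote: https://api.nuget.org/v3/index.json
    Newtonsoft.Json (12.0.3)
    SomeAzureClient (3.1.0)
      Newtonsoft.Json (>= 6.0)
```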
Once Paket has created the dependency graph, it avoids touching it. When you add or remove dependencies, it builds a new dependency graph using the existing one as a starting point, and you can also update packages, which of course changes the graph. But as a general rule, once you have created a dependency graph and are happy with it, you never need to worry about breaking something by accident. Afterwards, you need to add a pre-build step that downloads all the packages in the dependency graph and puts them in a specific directory.
The second logical step is defining for each project what dependencies it needs. Paket takes a look at these definitions and modifies the VisualStudio project files: for every dependency, it adds a <Reference> tag to the project file with the location of the downloaded package as HintPath. This is about the strongest coupling you can have to an external assembly in MSBuild: MSBuild will always look in the HintPath first, copy the assembly to the output directory in any case, etc.
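Concretely: a paket.references file next to the project lists bare package names, one per line, and Paket then injects something like the following into the .csproj (paths depend on your repository layout):

```
Newtonsoft.Json
```

```xml
<Reference Include="Newtonsoft.Json">
  <HintPath>..\..\packages\Newtonsoft.Json\lib\net45\Newtonsoft.Json.dll</HintPath>
  <Private>True</Private>
  <Paket>True</Paket>
</Reference>
```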
That is all that Paket does. No complicated project manipulations, no complex interaction with the build system, just downloading packages and pinning dependencies to the downloads. And assembly references also work perfectly well with C++/CLI, so with Paket you can add Nuget packages to C++/CLI projects.