
Dependency management with DotNet or Why you should use Paket

Today something completely different.

This blog post was inspired by me learning more DotNet internals than I would normally care about. One of the rather frustrating experiences during this process was the lack of readily available information on how DotNet deals with dependencies. Hence this blog post.

On second thought, maybe the information was there, but, perfectionist that I am, it took me some time to accept the state of the art.

One word of warning: I have set up our DotNet code at work to use Paket for dependency management and have never looked back at Nuget. The situation discussed here may have improved since then.

Shortest possible summary

The three takeaway messages (and sections) of this post are:

  1. Dependency management is hard.
    Be careful and make conscious decisions, or you will be hurt in places you did not even know.
  2. Dependency management in DotNet is harder.
    It does not seem practical to do this "cleanly", so it is rather easy to break things.
  3. Ditch Nuget (the tool), choose Paket instead.
    Nuget is complicated, lacks functionality, and is terribly brittle in edge cases.

1. Dependency management is hard

When you have a project, you typically decompose it into coherent modules. These can be executables, test runners, libraries etc. The modules expose a public API that allows other, consuming modules to use the encapsulated functionality. This public API changes over time, either in syntax or in behavior. Hence, modules always reference other modules by a specific version, either implicitly (e.g. "the version from this commit") or explicitly through some version number, e.g., from semantic versioning.

In almost all non-trivial cases, not all code will be written by you / your team, and you will rely on external modules for certain functionality. In this case, you need to make sure that these provide the correct API version for error-free consumption by your code. That is, you need to manage external dependencies of your code.

Note that in the following, I will switch to VisualStudio / DotNet speak. VisualStudio calls projects "solutions", and the individual modules "projects" with "assemblies" as build output. Finally, assemblies can be wrapped in a (usually Nuget) "package" that contains additional metadata, in particular dependencies on other packages.

Example 1

Consider the trivial case of a program P depending on an external assembly E, as depicted in figure 1.

[[ Insert figure 1 here ]]

All you need to do here is to choose the correct version of E and you are done.

The dependency is evaluated in two contexts: during the build and at runtime. During the build, the compiler looks for the assembly E and essentially verifies that all referenced symbols exist with the correct signature. When running the program, the DotNet runtime loads the assembly E in the correct version when needed and directs calls to the corresponding symbols in E.

Example 2

Let us choose a more complex example, where P depends on two assemblies A, B, and these in turn depend on an external assembly E. See figure 2 for the scheme.

[[ Insert figure 2 here ]]

In practice, the dependency graph can be more complex, with intermediate assemblies in between or more than two assemblies requiring E, but the diamond structure as the central feature is common.

To understand the fundamental problem, consider the following scenario. Let A be the output of an internal project that is under your control, while B is an external assembly that you only get in compiled form. You compile A against E in version x.y.z, whereas B was compiled against E in version a.b.c. At runtime, you provide E only in version x.y.z to fulfill the dependencies of both A and B. Let the versions x.y.z and a.b.c be different.

Now assembly B calls a particular API (E in version a.b.c) but gets a different one (E in version x.y.z). In particular, symbols or the underlying behavior may have changed between the two versions. Whenever B calls one of the changed symbols in E, bad things may happen. In the best case, function signatures have changed; the runtime then cannot find the requested symbol and throws an exception. This can cripple your application because some functionality becomes unreachable, or it can even crash the application, but at least you can localize the error. In the worst case, E behaves differently from what B expects. Then you have a very obscure bug that is hard to track down.

What is particularly insidious in this case is that all the code is perfectly sound. No amount of static code analysis and almost no review will highlight an error. However, the system as a whole, composed of P, A, B, and E(x.y.z), is broken, and the breakage only shows at runtime. In the literature you will find such situations labelled with the term "hell", as in "DLL hell" or "JAR hell" or, as I will expand below, "assembly hell".
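The two failure modes can be illustrated with a toy model. The following is only a sketch in Python rather than DotNet code, and every name in it (`E_abc`, `E_xyz`, `parse`, `scale`, `b_entry_point`, `b_scale_only`, `call_and_report`) is made up for illustration. It assumes that E lost one function and silently changed another between a.b.c and x.y.z.

```python
# Toy model of the diamond from figure 2. All names are hypothetical.

class E_abc:
    """E in version a.b.c, the version B was written against."""

    @staticmethod
    def parse(text):
        return int(text)

    @staticmethod
    def scale(value):
        # In a.b.c, scale() doubles its argument.
        return value * 2


class E_xyz:
    """E in version x.y.z, the only version provided at runtime."""

    # parse() no longer exists in this version.

    @staticmethod
    def scale(value):
        # Same name and signature as in a.b.c, but different behavior.
        return value * 10


def b_entry_point(e):
    """Code in B that uses a symbol which was removed from E."""
    return e.scale(e.parse("21"))


def b_scale_only(e):
    """Code in B that uses only the symbol whose behavior changed."""
    return e.scale(21)


def call_and_report(func, e):
    """Call func(e) and turn a missing symbol into a readable message."""
    try:
        return "ok: " + str(func(e))
    except AttributeError as err:
        return "missing symbol: " + str(err)


if __name__ == "__main__":
    print(call_and_report(b_entry_point, E_abc))  # what B expects
    print(call_and_report(b_entry_point, E_xyz))  # best case: loud failure
    print(call_and_report(b_scale_only, E_xyz))   # worst case: wrong result
```

The second call fails loudly, so the error is easy to localize. The third call succeeds and returns a value that B never expected, which is the kind of bug that no static analysis of A, B, or P will reveal.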

What to do with a diamond dependency

There are basically three approaches:

  1. Pick & Pray
    The simplest solution uses a common version to satisfy both dependencies. However, as discussed above, this is largely programming by coincidence and not guaranteed to work.

    There is one exception, though, where picking a common assembly version is ok. If and only if the assembly E is trustworthy not to change the public API without notice, and if no such notice has been given between x.y.z and a.b.c, the two versions should be interchangeable. For example, if E uses semantic versioning and the two versions differ only in the minor or patch version number, then you can pick the higher version to fulfill both dependencies.

  2. Duplicate the assembly

    Another way to deal with the diamond dependency is to break up the diamond as in figure 3.

    [[ Insert figure 3 here ]]

    If you depend on the assembly E in two different versions, well, you just supply the assembly in two different versions. This generally works; however, there are a few caveats.

    If the assembly E has internal state, this state will be duplicated, which can be a problem. For example, if E is a logging framework, duplicating E gives you two logging frameworks that work independently. These can then interfere in arbitrary ways, for example by overwriting each other's files.

    More importantly, this solution is only available if the runtime supports it. Examples that do not support this setup are ELF shared libraries on Linux and Java before version 9 (though you can work around both restrictions with extra effort). In both cases, the application holds a list of modules, either jar files or shared libraries, that can be searched for missing symbols (functions or variables). When a symbol is requested, the runtime or dynamic loader takes the symbol name essentially as a string and goes through the list of modules until it finds a matching entry. If multiple entries exist, only the first is ever returned, hence duplication does not work.

  3. Enforce a common assembly version

    You can try to ensure a.b.c == x.y.z, that is, that all code only ever requests a single version of the external assembly. This should be your preferred solution whenever possible.

    Of course, as usual, this approach is not always feasible. It only works if you have complete control over the affected assemblies A, B in figure 2. But often, at least one of the assemblies is itself an external assembly; then you suddenly need to modify and compile foreign code, which is at least a lot of extra work and complexity. Even if the assemblies A, B are in principle under your control, they might be maintained by different teams, or they may be used by several internal clients with different requirements. Requiring a consistent version of all dependencies can then add a lot of inertia to the development process. Instead of a "hey, let's quickly update this package" approach, you need to involve and synchronize multiple teams etc., which can also translate into considerable costs and effort.
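The choice between the three approaches can be sketched as a small decision function. This is an illustration in Python, not part of any real tool: the names `parse_version` and `resolve_diamond` and the flag `can_rebuild_both` are made up, and the sketch assumes that E follows semantic versioning strictly, with plain "major.minor.patch" version strings and a major version of at least 1.

```python
def parse_version(text):
    """Split 'major.minor.patch' into a tuple of three integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def resolve_diamond(version_for_a, version_for_b, can_rebuild_both=False):
    """Suggest how to satisfy two requests for E (hypothetical helper).

    Returns a pair (strategy, version); version is None when no single
    version of E can be named up front.
    """
    a = parse_version(version_for_a)
    b = parse_version(version_for_b)

    if a == b:
        # No conflict: both sides already agree on one version.
        return ("common version", version_for_a)

    if a[0] == b[0] and a[0] >= 1:
        # Same major version: under semantic versioning the higher
        # version should be backward compatible with the lower one.
        higher = version_for_a if a > b else version_for_b
        return ("pick higher", higher)

    if can_rebuild_both:
        # Both A and B are under our control: align them on one version.
        return ("enforce common version", None)

    # Last resort, and only if the runtime can load E twice.
    return ("duplicate assembly", None)


if __name__ == "__main__":
    print(resolve_diamond("1.2.3", "1.4.0"))
    print(resolve_diamond("1.2.3", "2.0.0"))
    print(resolve_diamond("1.2.3", "2.0.0", can_rebuild_both=True))
```

Versions are compared as integer tuples rather than strings, so that 1.10.0 correctly counts as higher than 1.9.0. Whether "pick higher" is actually safe still depends on how much you trust E to honor its versioning scheme.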

Posted by Ulf Lorenz 2021-09-09 | Draft
