05 Module System

Johannes Kirschbauer

nixpkgs#403839 motivated me to write down how the module system works. It seems more and more people are helping out there to improve it.

After reviewing this PR, I realized even people who work on this full-time find it tricky to explain how it all fits together. So I decided to write it down—partly to remind myself, and partly to de-mystify this magnificent monster.

With terms like “definitions”, “declarations”, “imports”, “modules”, “config” and “options”, the whole system can get really confusing.

De-mystify this magnificent monster

I will assume that you are familiar with writing your own modules using mkOption and lib.types.

If your brain just short-circuited reading that code in nixpkgs, don’t worry: I think I have a comprehensive overview ready. Let’s break it down, starting from a high level:

Overview

The NixOS module system is a powerful but complex mechanism that allows multiple modules to declare options and provide configuration definitions, which are then merged into a final system configuration. At a high level, the process looks like the following:

I need to introduce the terms declaration and definition (both singular and plural), which are often used to refer to specific internal parts of the module system. A declaration is an option declared under options, usually with mkOption; a definition is a value assigned to an option under config. I cannot stress enough how important these terms are, and I will use them a lot when looking at the individual parts.
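To make the two terms concrete, here is a minimal sketch with one declaration and one definition of the same option. It assumes that `<nixpkgs>` is available on your NIX_PATH; the option name `greeting` is made up for illustration.

```nix
# Evaluate with: nix-instantiate --eval --strict example.nix
# Assumption: <nixpkgs> resolves to a nixpkgs checkout.
let
  lib = import <nixpkgs/lib>;

  result = lib.evalModules {
    modules = [
      # Declaration: says that the option exists and which type it has.
      {
        options.greeting = lib.mkOption {
          type = lib.types.str;
          default = "hello";
        };
      }
      # Definition: assigns a value to the declared option.
      { config.greeting = "hi"; }
    ];
  };
in
# The definition takes precedence over the declared default.
assert result.config.greeting == "hi";
result.config.greeting
```

Everything that follows is about what happens between those two modules going in and `result.config` coming out.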

Also note that the implementation of the module system is constantly changing. The overall architecture, however, should be relatively stable, so the concepts shown in this document should remain valid for a long time. While people are adding metadata, improving error handling and so on, the overall concept is unlikely to change significantly anytime soon.

The following picture should give you an overview of the important parts and how the data flows:

                          ┌─────────────────────────────────────┐
                          │     ALL MODULE SOURCES (Polymorph)  │
                          └─────────────────────────────────────┘

                      Recursively collect & filter disabled modules

                          ┌───────────────────────────────────────┐
                          │     ENABLED MODULE SET (Uniform)      │
                          └───────────────────────────────────────┘

                      ┌─────────────────┴────────────────┐
                      ▼                                  ▼
         ┌────────────────────────┐       ┌────────────────────────────┐     ┌────────────────────────────┐
         │  OPTION DECLARATIONS   │◄─────►│ CONFIGURATION DEFINITIONS  │ ──► │   UNMATCHED DEFINITIONS    │
         └────────────────────────┘       └────────────────────────────┘     └────────────────────────────┘
                      │                                 │                               │
         Declarations from all modules      Definitions from all modules                │
                      │                                 │                               │
         are **merged**, yields final  type are **merged**, based on final type   are **merged**, with freeform.type.merge
                      ▼                                 ▼                               ▼
                ┌──────────────────────────────────────────────────────────────────────────────┐
                │                               Final MERGING                                  │
                │ - Merge declarations → final type       (typeMerge = a: b: ...)              │
                │ - Use final type to merge config values (merge = lof: defs: ...)             │
                └──────────────────────────────────────────────────────────────────────────────┘


                                        ┌────────────────────────────────┐
                                        │  FINAL CONFIGURATION RESULT    │
                                        └────────────────────────────────┘

Module Collection

Modules in NixOS can import other modules, forming a recursive tree of modules. Additionally, modules can be disabled via disabledModules, and disabled modules are filtered out of the final set.

Here are the relevant data structures that we are going to use, simplified into Haskell-like types:

StructuredModule :: {
    disabled :: ListOf {
        disabled :: ListOf (Path | String);
        file :: String;
    };
    key :: String;
    module :: Module;
    modules :: ListOf StructuredModule;
}

Module :: {
    disabledModules :: ListOf (Path | String);
    imports :: ListOf Module;
    _class :: String;
    _file :: String;
    key :: String;
    config :: AttrSet;
    options :: AttrSetOf Option;
}

Recursive Module Collection

The module system recursively collects modules, skipping those marked as disabled:

let
    filterStructuredModuleList = structuredModules: filter (structuredModule: !isDisabledModule structuredModule) structuredModules;
in
(genericClosure {
    startSet = filterStructuredModuleList structuredModules; #  [ { key = ...; module :: Module, ... } ]
    operator = attrs: filterStructuredModuleList attrs.modules; # structuredModule -> [ StructuredModules ]
})
# returns a flat ListOf [ StructuredModules ]

Let’s take this nested module structure as an example. This StructuredModules list falls out of a preprocessing step, which I won’t discuss in detail because it’s not really needed in order to understand how the module system works.

[
    { key = "A";
      module = {};
      modules = [
        { key = "A1"; module = {}; modules = []; }
        { key = "A2"; module = {}; modules = []; }
      ];
    }
    { key = "B";
      module = {};
      modules = [
        { key = "B1"; module = {}; modules = []; }
        { key = "B2"; module = {}; modules = [
          { key = "B2.1"; module = {}; modules = []; }
          { key = "B2.2"; module = {}; modules = []; }
        ]; }
      ];
    }
]

…gets collected into…

[
    {
        key = "A";
        module = «thunk»;
        modules = «thunk»;
    }
    {
        key = "B";
        module = «thunk»;
        modules = «thunk»;
    }
    {
        key = "A1";
        module = «thunk»;
        modules = «thunk»;
    }
    {
        key = "A2";
        module = «thunk»;
        modules = «thunk»;
    }
    {
        key = "B1";
        module = «thunk»;
        modules = «thunk»;
    }
    {
        key = "B2";
        module = «thunk»;
        modules = «thunk»;
    }
    {
        key = "B2.1";
        module = «thunk»;
        modules = «thunk»;
    }
    {
        key = "B2.2";
        module = «thunk»;
        modules = «thunk»;
    }
]

Finally, this map function returns a list of all the modules:

# :: ListOf [ StructuredModules ] -> [ Module ]
map (attrs: attrs.module)

So far, this is roughly how to get the ENABLED MODULE SET from the picture above.
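If you want to play with the collection step, the following sketch runs it on the example tree using only Nix builtins. The `disabledKeys` list is a hypothetical stand-in for the real isDisabledModule check, and the `module` values are dummies.

```nix
# Evaluate with: nix-instantiate --eval --strict collect.nix
let
  inherit (builtins) genericClosure filter elem;

  # Hypothetical stand-in for isDisabledModule: a plain list of disabled keys.
  disabledKeys = [ "B1" ];
  filterStructuredModuleList = filter (m: !(elem m.key disabledKeys));

  # Helper for a structured module without imports.
  leaf = key: { inherit key; module = { _file = key; }; modules = [ ]; };

  structuredModules = [
    { key = "A"; module = { _file = "A"; }; modules = [ (leaf "A1") (leaf "A2") ]; }
    {
      key = "B";
      module = { _file = "B"; };
      modules = [
        (leaf "B1")
        { key = "B2"; module = { _file = "B2"; }; modules = [ (leaf "B2.1") (leaf "B2.2") ]; }
      ];
    }
  ];

  # genericClosure walks the tree breadth-first and deduplicates by `key`.
  collected = genericClosure {
    startSet = filterStructuredModuleList structuredModules;
    operator = m: filterStructuredModuleList m.modules;
  };

  # Unwrap the actual modules.
  modules = map (m: m.module) collected;
in
# "B1" is filtered out; everything else shows up exactly once.
assert map (m: m.key) collected == [ "A" "B" "A1" "A2" "B2" "B2.1" "B2.2" ];
map (m: m._file) modules
```

Because genericClosure deduplicates by key, a module imported from two different places is only collected once.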

Option merging

Now that we have a flat list of modules, we need to:

  • Collect option declarations

  • Merge them into a single declaration per name

  • Use each final option’s type.merge to merge the corresponding config values

If we ignore the error handling code for now, we have a list of modules from the Module Collection step.

The next step is to collect the option declarations and merge duplicates, or throw an error if an option cannot be merged.

1. Collecting the option declarations

/*
    ListOf Module -> {
        ${declName} :: ListOf Declaration
    }
    where

    Declaration :: {
        _file :: String;  # e.g. "<unknown-file>"
        options :: Option;
        pos :: {
            column :: Int;
            file :: String;
            line :: Int;
        };
    }
*/
declarationsByName = zipAttrsWith (n: v: v) (
    map (
        module:
            let
                subtree = module.options;
            in
                mapAttrs (n: option: {
                    inherit (module) _file;
                    pos = unsafeGetAttrPos n subtree;
                    options = option;
                }) subtree
    ) modules
)

For example, the option declarations

options.foo = lib.mkOption {
    default = "foo";
};
options.a.b = lib.mkOption {
    default = "foo";
};

would produce the following declarationsByName:

{
  foo = [
    ## Declaration 1 of options.foo
    {
      _file = "<unknown-file>";
      options = {
        _type = "option";
        default = "foo";
      };
      pos = {
        column = 9;
        file = "test.nix";
        line = 13;
      };
    }
    ## Further declarations on options.foo from other modules
    # ...
  ];
  a = [
    # Declaration 1 of options.a
    {
      _file = "<unknown-file>";
      options = {
        b = {
          _type = "option";
          default = "foo";
        };
      };
      pos = {
        column = 9;
        file = "test.nix";
        line = 17;
      };
    }
    ## Further declarations on options.a from other modules
    # ...
  ];
}
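The grouping step can be reproduced with builtins alone. This is only a sketch: the two modules and their file names are made up, the options are bare attribute sets instead of real mkOption results, and pos is left out.

```nix
# Evaluate with: nix-instantiate --eval --strict declarations.nix
let
  inherit (builtins) zipAttrsWith mapAttrs length;

  # Two hypothetical modules that both declare `foo`.
  modules = [
    {
      _file = "a.nix";
      options = { foo = { _type = "option"; default = "foo"; }; };
    }
    {
      _file = "b.nix";
      options = {
        foo = { _type = "option"; };
        a = { b = { _type = "option"; }; };
      };
    }
  ];

  # Wrap every top-level option with its origin, then group by name.
  declarationsByName = zipAttrsWith (_name: decls: decls) (
    map (
      module:
      mapAttrs (_name: option: {
        inherit (module) _file;
        options = option;
      }) module.options
    ) modules
  );
in
# `foo` has one declaration per module, `a` has only one.
assert map (d: d._file) declarationsByName.foo == [ "a.nix" "b.nix" ];
assert length declarationsByName.a == 1;
declarationsByName
```

Note that only the first level of names is grouped here; nested names like a.b are handled when the system recurses into a.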

Every attribute name maps to a list of one or more option declarations, possibly from multiple modules, which need to be merged. If they are not mergeable, an error is thrown.

Also note that the name is known at this point, so we can correlate it with definitionsByName, the equivalent for config, which is expected to contain a value of this type. We will look into that later.

Let’s focus on merging the potentially duplicate options such that there is only one option for each name.

Options are merged using the mergeOptionDecls function:

/*
    ListOf String -> ListOf Decls -> {
        _type = "option";
        declarationPositions = ListOf Pos
        declarations = ListOf String
        default = <value>;
        loc = ListOf String
        options = Options | ListOf Module
        type = OptionType
    }
*/
mergeOptionDecls loc decls

mergeOptionDecls is implemented roughly like this. To keep it simple, I removed some of the complex backwards-compatibility and submodule-merging logic and created this simplified version to explain the concept:

mergeOptionDecls = loc: declarations:
    foldl' (res: decl:
        decl.options
        # Keep what was accumulated from earlier declarations
        // res
        # Merge the type attribute if both sides have a type
        # In reality this is a bit more convoluted
        // optionalAttrs (res ? type && decl.options ? type) {
            type = res.type.typeMerge decl.options.type.functor;
        }
    )
    {
        inherit loc;
        options = { };
    }
    declarations

As we can observe, the merging uses foldl' to produce one result from a list of multiple declarations. The accumulator is initialized with a neutral value. Then type.typeMerge is called over and over, once for every further declaration.

An expansion of the type merge would look like this:

[
   A
   B
   C
   ...
]
->
((A.type.typeMerge B.type.functor).typeMerge C.type.functor) ...

We don’t need to understand right now how typeMerge works internally. Just imagine that every typeMerge returns a new type that can be type-merged again with the next item in the list.

typeMerge :: Self -> Functor -> Type

Some examples:

types.str.typeMerge types.str.functor -> types.str

(types.attrsOf a).typeMerge (types.attrsOf b).functor -> types.attrsOf (merged elemType)

(types.submodule a).typeMerge (types.submodule b).functor -> types.submodule (merged options)

Most types are mergeable, and those that are not cause an error.

This gives us one final option with one final type that we can use to merge all the defined config values.
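A visible effect of type merging is that an enum can be extended from a second module. The sketch below assumes `<nixpkgs>` is on your NIX_PATH; the option name `color` is made up.

```nix
# Evaluate with: nix-instantiate --eval --strict type-merge.nix
# Assumption: <nixpkgs> resolves to a nixpkgs checkout.
let
  lib = import <nixpkgs/lib>;

  result = lib.evalModules {
    modules = [
      # First declaration: only "red" is allowed.
      { options.color = lib.mkOption { type = lib.types.enum [ "red" ]; }; }
      # Second declaration of the same option: the enum types get merged.
      { options.color = lib.mkOption { type = lib.types.enum [ "green" ]; }; }
      # This definition is only valid against the merged type.
      { config.color = "green"; }
    ];
  };
in
assert result.config.color == "green";
result.config.color
```

With only the first declaration present, the definition "green" would be rejected by the type check, so the second declaration really did change the final type.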

Other fields of an option, like description and default, are not merged. If the declarations are not mergeable, the module system throws The option ... is already declared ... Confusingly, it currently throws the same message if the types are not mergeable (although we could easily change that, because they are different throws).

How config is produced

From the collected modules, as described in Module Collection, we can concatenate all the config values into a single list. This will ultimately produce an intermediate internal structure called definitions or definitionsByName.

Hence the term defined in the error messages. I am not a big fan of these error messages, because they leak internal information. definitions and declarations are internal attributes of the system and, in my opinion, users shouldn’t be required to understand them. Otherwise they would have to read this blog post first to understand the error… 🧐

Back on track …

configs = map (
    module: module.config
) modules

This collects all configs into a single list:

[
    {
        a = 1;
    }
    {
        a = 1;
        b = {
            c = 1;
        }
    }
]
# ...

We then group all values from every item in the list by name (a, b). The inner mapAttrs wraps each value (1, { c = 1; }) in an attribute set of the form { value = ...; }, an envelope that carries the value. Along with the value, we can propagate metadata used for error handling, such as the file.

/*
    definitionsByName :: {
        ${name} :: ListOf Definition
    }
*/
definitionsByName = zipAttrsWith (n: v: v) (
    map (
        module:
            mapAttrs (n: value: {
                file = module._file;
                inherit value;
            }) module.config
    ) modules
);

For the configs above, this yields:

{
    # All definitions of 'a'
    a = [
        {
            value = 1;
            file = ... # <- file position you see in error messages
        }
        {
            value = 1;
            file = ... # <- file position you see in error messages
        }

    ];
    # All definitions of 'b'
    b = [
        {
            value = {
                c = 1;
            }
            file = ... # <- file position you see in error messages
        }
    ]
}
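Here is the same grouping as a runnable sketch with builtins only. The modules and file names are made up for illustration.

```nix
# Evaluate with: nix-instantiate --eval --strict definitions.nix
let
  inherit (builtins) mapAttrs zipAttrsWith;

  # Two hypothetical modules that only carry config.
  modules = [
    { _file = "a.nix"; config = { a = 1; }; }
    { _file = "b.nix"; config = { a = 1; b = { c = 1; }; }; }
  ];

  # Wrap every value in a { file; value; } envelope, then group by name.
  definitionsByName = zipAttrsWith (_name: defs: defs) (
    map (
      module:
      mapAttrs (_name: value: {
        file = module._file;
        inherit value;
      }) module.config
    ) modules
  );
in
assert definitionsByName.a == [
  { file = "a.nix"; value = 1; }
  { file = "b.nix"; value = 1; }
];
assert definitionsByName.b == [ { file = "b.nix"; value = { c = 1; }; } ];
definitionsByName
```

The shape is deliberately the same as declarationsByName, which is what allows both to be matched up by name.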

Finally, we can produce a value by merging all the collected definitions. This is done via the merge attribute of the final option.type.

Every type has a merge = loc: definitions: ... attribute.

The merge behavior is one of the most important characteristics of a type. Given a couple of definitions, we can use type.merge to produce a value of that type:

evalOptionValue loc opt definitions

evalOptionValue is roughly defined as follows, excluding some error handling and other features that were added over time:

# ListOf String -> Option -> ListOf Definition -> { value :: Value; ... }
evalOptionValue = loc: option: definitions:
    option
    // {
        value = option.type.merge loc definitions;
    }

Let’s take a look at how merge works:

Merging! (One extensive example)

Starting Simple

As mentioned, every type has a merge function; it is absolutely required. It describes how the list of definitions is merged into one value.

For example the merge of types.int:

lib.types.int = mkOptionType {
    # ...
    merge = lib.mergeEqualOption;
}

lib.mergeEqualOption returns the value if all definition values are equal; otherwise it throws an error.
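You can call lib.mergeEqualOption directly to see both outcomes. This sketch assumes `<nixpkgs>` is on your NIX_PATH; the location and file names are made up.

```nix
# Evaluate with: nix-instantiate --eval --strict merge-equal.nix
# Assumption: <nixpkgs> resolves to a nixpkgs checkout.
let
  lib = import <nixpkgs/lib>;

  # Hypothetical option location, only used in error messages.
  loc = [ "services" "example" "port" ];

  # All definitions agree, so the common value is returned.
  same = lib.mergeEqualOption loc [
    { file = "a.nix"; value = 80; }
    { file = "b.nix"; value = 80; }
  ];

  # The definitions disagree, so the merge throws; tryEval catches that.
  conflicting = builtins.tryEval (
    lib.mergeEqualOption loc [
      { file = "a.nix"; value = 80; }
      { file = "b.nix"; value = 8080; }
    ]
  );
in
{
  inherit same;
  conflictCaught = !conflicting.success;
}
```

The file attribute of each definition is what ends up in the error message of the conflicting case.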

Nested types

Nested types are absolutely mandatory - without them we can only describe primitives.

Meaning: we absolutely want list and attrset types with a nested type. So let’s understand how that works.

Note: Often the nested type is referred to as elemType.

To fully understand how more complex types work we should take a look at the more advanced merge function of attrsOf.

Other composed types like listOf or submodule are based on the same principles.

lib.types.attrsOf = elemType: mkOptionType {
    # ...
    merge = loc: defs:
        mapAttrs (n: v: v.value) (
            filterAttrs (n: v: v ? value) (
                zipAttrsWith (name: defs: (mergeDefinitions (loc ++ [ name ]) elemType defs).optionalValue)
                    (pushPositions defs)
            )
        );
}

Don’t worry about the complexity of this beast. I am going to explain in detail how it works, starting from the inside out.

Let’s do a walkthrough based on some example definitions:

Reminder what the type would be:

# AttrsOf
# with a nested
# elemType = int
type = attrsOf int
# definitionsByName for attrsOf int
# Almost looks like a list of modules ;)
[
  {
    file = "<some-file>";
    value = { a = 1; };
  }
  {
    file = "<some-file>";
    value = {
      a = 1;
      b = 2;
    };
  }
]

We would then call the merge of attrsOf with these definitions.

The first thing that happens is pushPositions; it just propagates the file attribute down.

# defs = [ { value = { a = ...; }; } { value = { b = ...; }; } ]
zipAttrsWith (../*function*/..) (pushPositions defs)
pushPositions = map (
    def:
    mapAttrs (n: v: {
        inherit (def) file;
        value = v;
    }) def.value
)

So pushPositions yields:

pushPositions [ { value = { a = 1; }; file = "<some-file>"; } ... ]
=>
[ { a = { value = 1; file = "<some-file>"; } } ... ]

So it just swaps the nesting: a moves to the outside, while value and file move inside. This is a preparation for zipping by a.

At this point, a short reminder of how zipAttrsWith works might help:

zipAttrsWith (name: values: values) [ {a = "x";} {a = "y"; b = "z";} ]
=> { a = ["x" "y"]; b = ["z"]; }

zipAttrsWith groups values by attribute name. For each name, the function receives the list of values (here ["x" "y"]). Whatever the function returns becomes the value of that attribute, and it can be any value.
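A few variations make the role of the function argument clearer. This sketch uses builtins.zipAttrsWith, which I am assuming your Nix version provides (it has been a builtin for a while).

```nix
# Evaluate with: nix-instantiate --eval --strict zip.nix
let
  inherit (builtins) zipAttrsWith length concatStringsSep;

  sets = [
    { a = "x"; }
    { a = "y"; b = "z"; }
  ];
in
# Returning the grouped values unchanged gives one list per name.
assert zipAttrsWith (_name: values: values) sets == { a = [ "x" "y" ]; b = [ "z" ]; };
# The function may return anything, for example a count ...
assert zipAttrsWith (_name: values: length values) sets == { a = 2; b = 1; };
# ... or a string that also uses the attribute name.
assert
  zipAttrsWith (name: values: "${name}=${concatStringsSep "," values}") sets
  == { a = "a=x,y"; b = "b=z"; };
"ok"
```

In attrsOf, the function handed to zipAttrsWith is where the merge of the element type happens.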

Let’s take a look at the next step:

#                            ↓ All '{ value ... }' entries are grouped by key "a" and passed to the function
# (pushPositions defs) = [ { a = { value = 1; file = "<some-file>"; } } ... ]
zipAttrsWith (name: innerDefs: /*function*/..) (pushPositions defs)
# name = "a"
# innerDefs = [ { value = 1; file = "<some-file>"; } ... ]

Let’s reveal the inner function:

# (pushPositions defs) = [ { a = { value = 1; file = "<some-file>"; } } ... ]
zipAttrsWith (name: innerDefs: mergeDefinitions loc elemType innerDefs) (pushPositions defs)

mergeDefinitions is a recursive call that uses elemType.merge to merge the inner definitions.

In this case, elemType is types.int, which uses mergeEqualOption.

mergeEqualOption, just like any type.merge, returns a value of that type. So in the case of types.int, mergeEqualOption returns the value 1, which is of type int.

# lib.modules.mergeDefinitions returns:
{
    optionalValue = if isDefined then { value = mergedValue; } else { };
    # ...
}
# { value = 1; } or { } <= (mergeDefinitions loc elemType innerDefs).optionalValue
zipAttrsWith (name: innerDefs: (mergeDefinitions loc elemType innerDefs).optionalValue) (pushPositions defs)
=>
{
    a = { value = 1; };
    ...
}

So we can just substitute the real value already to make the outer function more readable:

filterAttrs (n: v: v ? value)
{
    a = { value = 1; };
    ...
}
=>
{
    a = { value = 1; };
}

And finally:

mapAttrs (n: v: v.value)
{
    a = { value = 1; };
}
=>
{
    a = 1;
}

In summary:

# type = attrsOf int
type.merge loc [
  {
    file = "<some-file>";
    value = { a = 1; };
  }
  {
    file = "<some-file>";
    value = {
      a = 1;
      b = 2;
    };
  }
]
=>
{
    a = 1;
    b = 2;
}
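The whole walkthrough fits into a small builtins-only sketch. It is a simplification, not the nixpkgs implementation: `mergeEqual` is a hand-written stand-in for lib.mergeEqualOption, `mergeAttrsOf` is a made-up name, and the optionalValue / filterAttrs step is left out.

```nix
# Evaluate with: nix-instantiate --eval --strict attrs-of.nix
let
  inherit (builtins) mapAttrs zipAttrsWith head tail foldl' concatStringsSep;

  # Stand-in for lib.mergeEqualOption: all values must be equal.
  mergeEqual = loc: defs:
    (foldl' (
      first: def:
      if def.value != first.value
      then throw "conflicting definitions for ${concatStringsSep "." loc}"
      else first
    ) (head defs) (tail defs)).value;

  # Move `file` down one level: { file; value = { a = 1; }; } -> { a = { file; value = 1; }; }
  pushPositions = map (
    def: mapAttrs (_name: v: { inherit (def) file; value = v; }) def.value
  );

  # attrsOf-style merge, parameterised by the merge function of the element type.
  mergeAttrsOf = elemMerge: loc: defs:
    zipAttrsWith (name: innerDefs: elemMerge (loc ++ [ name ]) innerDefs) (pushPositions defs);

  # "attrsOf int"
  flat = mergeAttrsOf mergeEqual [ "foo" ] [
    { file = "a.nix"; value = { a = 1; }; }
    { file = "b.nix"; value = { a = 1; b = 2; }; }
  ];

  # "attrsOf (attrsOf int)": the element merge is itself an attrsOf merge.
  nested = mergeAttrsOf (mergeAttrsOf mergeEqual) [ "bar" ] [
    { file = "a.nix"; value = { x = { y = 1; }; }; }
    { file = "b.nix"; value = { x = { z = 2; }; }; }
  ];
in
assert flat == { a = 1; b = 2; };
assert nested == { x = { y = 1; z = 2; }; };
{ inherit flat nested; }
```

The nested case shows why composed types scale: each level only knows how to split definitions by name and then hands them to the merge of its element type.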

Producing the output involves some checking spaghetti, but ultimately it does:

options.foo = option // {
    value = option.type.merge loc definitionsByName.foo;
}

We can then map over all the value attributes of the options to produce a config:

# This effectively collects all the .value attributes from the options recursively
declaredConfig = mapAttrsRecursiveCond (v: !isOption v) (_: v: v.value) options;
# mapAttrsRecursiveCond:
# - If the predicate returns false, mapAttrsRecursiveCond does not recurse, but instead applies the mapping function.
# - If the predicate returns true, it does recurse, and does not apply the mapping function.
#
config = declaredConfig;
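To see the value extraction in isolation, here is a sketch on a hand-written options tree. It assumes `<nixpkgs>` is on your NIX_PATH; the options are bare attribute sets with a value already filled in, not real evaluated options.

```nix
# Evaluate with: nix-instantiate --eval --strict declared-config.nix
# Assumption: <nixpkgs> resolves to a nixpkgs checkout.
let
  lib = import <nixpkgs/lib>;

  # Hand-written stand-in for the merged options tree.
  options = {
    foo = { _type = "option"; value = 1; };
    a.b = { _type = "option"; value = "x"; };
  };

  # Recurse until an option is found, then take its value.
  declaredConfig = lib.mapAttrsRecursiveCond (v: !lib.isOption v) (_path: v: v.value) options;
in
assert declaredConfig == { foo = 1; a = { b = "x"; }; };
declaredConfig
```

The `_type = "option"` marker is what lib.isOption looks for, which is how the recursion knows where the option tree ends and the option itself begins.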

🤗🤗🤗

Freeform (special case)

This is a special case, introduced to better support scenarios where settings often change or are not always known ahead of time.

The module system supports freeform attributes, meaning names in config that have no declared option. They still need to be merged, using freeformType.merge.

First of all, we need to collect all of those definitions. This is done by subtracting all seen options from the definitions, leaving us with those definitions that don’t have a corresponding option declaration:

unmatchedDefnsByName = removeAttrs definitionsByName (attrNames matchedOptions)

The config in this case is produced by recursively updating the freeformConfig with the declaredConfig:

freeformConfig = freeformType.merge prefix unmatchedDefinitions;
config = recursiveUpdate freeformConfig declaredConfig;
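From the outside, freeform merging looks like this. The sketch assumes `<nixpkgs>` is on your NIX_PATH; the names `declared` and `undeclared` are made up.

```nix
# Evaluate with: nix-instantiate --eval --strict freeform.nix
# Assumption: <nixpkgs> resolves to a nixpkgs checkout.
let
  lib = import <nixpkgs/lib>;

  result = lib.evalModules {
    modules = [
      {
        # Definitions without a matching declaration are merged with this type.
        freeformType = lib.types.attrsOf lib.types.int;

        options.declared = lib.mkOption {
          type = lib.types.str;
          default = "hello";
        };
      }
      # No option named `undeclared` exists, so this is an unmatched definition.
      { config.undeclared = 42; }
    ];
  };
in
assert result.config.declared == "hello";
assert result.config.undeclared == 42;
{ inherit (result.config) declared undeclared; }
```

Without the freeformType line, the same definition would fail with an error about an option that does not exist.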

Cheers, keep hacking !
