nixpkgs#403839 motivated me to write down how the module system works. It seems more and more people are helping out there to improve it.
After reviewing this PR, I realized even people who work on this full-time find it tricky to explain how it all fits together. So I decided to write it down—partly to remind myself, and partly to de-mystify this magnificent monster.
Terms like “definitions”, “declarations”, “imports”, “modules”, “config” and “options”, and the system as a whole, can get really confusing.
I will assume that you are familiar with writing your own modules using mkOption and lib.types.
If your brain just short-circuited reading that code in nixpkgs: I think I have a comprehensive overview ready. Let’s break it down, starting from a high level:
Overview
The NixOS module system is a powerful but complex mechanism that allows multiple modules to declare options and provide configuration definitions, which are then merged into a final system configuration. At a high level, the process looks like the following:
First I need to introduce the terms declaration and definition (singular and plural). They are often used to refer to specific internal parts of the module system: a declaration declares an option (under options), while a definition assigns a value to an option (under config).
I cannot stress enough how important these terms are, and I will use them a lot myself when looking at the individual parts.
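To make the two terms concrete, here is a minimal sketch of a single module that contains one declaration and one definition. The option name services.demo.greeting is made up for illustration and does not exist in nixpkgs.

```nix
# Hypothetical module; the option name is invented for this example.
{ lib, ... }:
{
  # Declaration: states that the option exists and which type it has.
  options.services.demo.greeting = lib.mkOption {
    type = lib.types.str;
    default = "hello";
  };

  # Definition: assigns a value to the declared option.
  config.services.demo.greeting = "hi there";
}
```

Other modules could add further definitions (or even further declarations) for the same name, which is exactly why merging is needed later on.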
Also note that the implementation of the module system is constantly changing. The overall architecture, however, should be relatively stable, so the concepts shown in this document should remain valid for a long time. People are adding metadata, improving error handling and so on, but the overall concept is unlikely to change significantly anytime soon.
The following picture should give you an overview of the important parts and how the data flows:
              ┌─────────────────────────────────────┐
              │  ALL MODULE SOURCES (Polymorphic)   │
              └─────────────────────────────────────┘
                                 │
           Recursively collect & filter disabled modules
                                 ▼
             ┌───────────────────────────────────────┐
             │     ENABLED MODULE SET (Uniform)      │
             └───────────────────────────────────────┘
                                 │
               ┌─────────────────┴────────────────┐
               ▼                                  ▼
   ┌────────────────────────┐       ┌────────────────────────────┐     ┌────────────────────────────┐
   │  OPTION DECLARATIONS   │◄─────►│ CONFIGURATION DEFINITIONS  │ ──► │   UNMATCHED DEFINITIONS    │
   └────────────────────────┘       └────────────────────────────┘     └────────────────────────────┘
               │                                  │                                  │
 Declarations from all modules      Definitions from all modules                     │
               │                                  │                                  │
        are **merged**,                    are **merged**,                    are **merged**,
       yields final type                 based on final type             with freeform.type.merge
               ▼                                  ▼                                  ▼
            ┌──────────────────────────────────────────────────────────────────────────────┐
            │                                Final MERGING                                 │
            │  - Merge declarations → final type (typeMerge = a: b: ...)                   │
            │  - Use final type to merge config values (merge = loc: defs: ...)            │
            └──────────────────────────────────────────────────────────────────────────────┘
                                                  │
                                                  ▼
                                  ┌────────────────────────────────┐
                                  │   FINAL CONFIGURATION RESULT   │
                                  └────────────────────────────────┘
Module Collection
Modules in NixOS can import other modules, forming a recursive tree of modules. Additionally, modules can be disabled via disabledModules, and disabled modules are filtered out of the final set.
Here are the relevant data structures that we are going to use, simplified into Haskell-like types:
StructuredModule :: {
  disabled :: ListOf {
    disabled :: ListOf (Path | String);
    file :: String;
  };
  key :: String;
  module :: Module;
  modules :: ListOf StructuredModule;
}
Module :: {
  disabledModules :: ListOf (Path | String);
  imports :: ListOf Module;
  _class :: String;
  _file :: String;
  key :: String;
  config :: Attribute set;
  options :: Attribute set of Options;
}
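As a reminder of where imports and disabledModules come from on the user side, here is a small sketch of a module that uses both. The file names are hypothetical.

```nix
# Hypothetical module; ./a.nix, ./b.nix and ./b2.nix are made-up paths.
{ ... }:
{
  # Pulls further modules into the tree.
  imports = [
    ./a.nix
    ./b.nix
  ];

  # Removes a module from the final set, even if something else imports it.
  disabledModules = [ ./b2.nix ];

  config = { };
}
```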
Recursive Module Collection
The module system recursively collects modules, skipping those marked as disabled:
let
  filterStructuredModuleList =
    structuredModules: filter (structuredModule: !isDisabledModule structuredModule) structuredModules;
in
(genericClosure {
  startSet = filterStructuredModuleList structuredModules; # [ { key = ...; module :: Module, ... } ]
  operator = attrs: filterStructuredModuleList attrs.modules; # StructuredModule -> [ StructuredModule ]
})
# returns a flat ListOf StructuredModule
Let’s take this nested module structure as an example. This list of StructuredModules falls out of a preprocessing step, which I won’t discuss in detail because it’s not really needed to understand how the module system works.
[
  {
    key = "A";
    module = { };
    modules = [
      { key = "A1"; module = { }; modules = [ ]; }
      { key = "A2"; module = { }; modules = [ ]; }
    ];
  }
  {
    key = "B";
    module = { };
    modules = [
      { key = "B1"; module = { }; modules = [ ]; }
      {
        key = "B2";
        module = { };
        modules = [
          { key = "B2.1"; module = { }; modules = [ ]; }
          { key = "B2.2"; module = { }; modules = [ ]; }
        ];
      }
    ];
  }
]
…gets collected into…
[
  { key = "A"; module = «thunk»; modules = «thunk»; }
  { key = "B"; module = «thunk»; modules = «thunk»; }
  { key = "A1"; module = «thunk»; modules = «thunk»; }
  { key = "A2"; module = «thunk»; modules = «thunk»; }
  { key = "B1"; module = «thunk»; modules = «thunk»; }
  { key = "B2"; module = «thunk»; modules = «thunk»; }
  { key = "B2.1"; module = «thunk»; modules = «thunk»; }
  { key = "B2.2"; module = «thunk»; modules = «thunk»; }
]
Finally, this map function returns a list of all the modules:
# :: ListOf StructuredModule -> ListOf Module
map (attrs: attrs.module)
So far, this is roughly how to get the ENABLED MODULE SET from the picture above.
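The collection step can be tried out in isolation with nothing but builtins. The following is a sketch under assumptions: isDisabledModule here is a made-up stand-in that simply looks at a hard-coded list of keys, whereas the real module system derives the disabled set from disabledModules.

```nix
let
  # Hypothetical stand-in: an entry is disabled if its key is listed here.
  disabledKeys = [ "B2.1" ];
  isDisabledModule = m: builtins.elem m.key disabledKeys;
  filterStructuredModuleList = builtins.filter (m: !isDisabledModule m);

  structuredModules = [
    {
      key = "A";
      module = { };
      modules = [
        { key = "A1"; module = { }; modules = [ ]; }
      ];
    }
    {
      key = "B";
      module = { };
      modules = [
        {
          key = "B2";
          module = { };
          modules = [
            { key = "B2.1"; module = { }; modules = [ ]; }
            { key = "B2.2"; module = { }; modules = [ ]; }
          ];
        }
      ];
    }
  ];

  # genericClosure walks the tree breadth-first and deduplicates by `key`.
  collected = builtins.genericClosure {
    startSet = filterStructuredModuleList structuredModules;
    operator = attrs: filterStructuredModuleList attrs.modules;
  };
in
map (attrs: attrs.key) collected
# => [ "A" "B" "A1" "B2" "B2.2" ]
```

Evaluating it with nix-instantiate --eval --strict shows the flat, breadth-first order and that the disabled B2.1 is gone.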
Option merging
Now that we have a flat list of modules, we need to:
- Collect option declarations
- Merge them into a single declaration per name
- Use each final option’s type.merge to merge the corresponding config values
If we ignore the error handling code for now, we have a list of modules (listOf module) from the Module Collection step.
The next step is to collect the option declarations and merge duplicates, or throw an error if an option cannot be merged.
1. Collecting the option declarations
/*
  ListOf Module -> {
    ${declName} :: ListOf Declaration
  }
  where
  Declaration :: {
    _file :: String;
    options :: Option;
    pos :: {
      column :: Int;
      file :: String;
      line :: Int;
    };
  }
*/
declarationsByName = zipAttrsWith (n: v: v) (
  map (
    module:
    let
      subtree = module.options;
    in
    mapAttrs (n: option: {
      inherit (module) _file;
      pos = unsafeGetAttrPos n subtree;
      options = option;
    }) subtree
  ) modules
)
For example, the option declarations
options.foo = lib.mkOption {
  default = "foo";
};
options.a.b = lib.mkOption {
  default = "foo";
};
would produce the following declarationsByName:
{
  foo = [
    ## Declaration 1 of options.foo
    {
      _file = "<unknown-file>";
      options = {
        _type = "option";
        default = "foo";
      };
      pos = {
        column = 9;
        file = "test.nix";
        line = 13;
      };
    }
    ## Further declarations on options.foo from other modules
    # ...
  ];
  a = [
    ## Declaration 1 of options.a
    {
      _file = "<unknown-file>";
      options = {
        b = {
          _type = "option";
          default = "foo";
        };
      };
      pos = {
        column = 9;
        file = "test.nix";
        line = 17;
      };
    }
    ## Further declarations on options.a from other modules
    # ...
  ];
}
Every attribute name maps to a list of one or more option declarations, possibly from multiple modules, which need to be merged. If they are not mergeable, an error is thrown.
Also note that the name is known at this point, so we can correlate it with definitionsByName, the equivalent on the config side, which is expected to contain values of this type.
We will look into that later.
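The grouping step can be reproduced with plain builtins. In this sketch the two modules and their file names are hypothetical, and the options are written out by hand instead of being created with lib.mkOption.

```nix
let
  # Two hypothetical modules, reduced to the fields this step needs.
  modules = [
    {
      _file = "a.nix";
      options = { foo = { _type = "option"; default = "foo"; }; };
    }
    {
      _file = "b.nix";
      options = {
        foo = { _type = "option"; };
        bar = { _type = "option"; };
      };
    }
  ];

  declarationsByName = builtins.zipAttrsWith (name: decls: decls) (
    map (
      module:
      builtins.mapAttrs (name: option: {
        inherit (module) _file;
        pos = builtins.unsafeGetAttrPos name module.options;
        options = option;
      }) module.options
    ) modules
  );
in
# Only show which files declare which name.
builtins.mapAttrs (name: decls: map (decl: decl._file) decls) declarationsByName
# => { bar = [ "b.nix" ]; foo = [ "a.nix" "b.nix" ]; }
```

foo ends up with two declarations that still have to be merged into one, while bar has a single one.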
Let’s focus on merging the potentially duplicate options so that there is only one option for each name.
Options are merged using the mergeOptionDecls function:
/*
  ListOf String -> ListOf Decls -> {
    _type = "option";
    declarationPositions = ListOf Pos;
    declarations = ListOf String;
    default = <value>;
    loc = ListOf String;
    options = Options | ListOf Module;
    type = OptionType;
  }
*/
mergeOptionDecls loc decls
mergeOptionDecls is implemented roughly like this. To keep it simple, I removed the backwards compatibility and submodule merging logic (and the loc argument) and created this simplified version to explain the concept:
mergeOptionDecls = declarations:
  foldl' (res: decl:
    decl.options
    # Merge the type attribute if both sides have a type
    # In reality this is a bit more convoluted
    // optionalAttrs (res ? type) {
      type = res.type.typeMerge decl.options.type.functor;
    }
  )
  {
    options = { };
  }
  declarations
As we can observe, the merging uses foldl' to produce one result from a list of multiple declarations. The accumulator is initialized with a neutral value.
Then type.typeMerge is called over and over, once for every further declaration.
An expansion of the type merge would look like this:
[
  A
  B
  C
  ...
]
->
((A.type.typeMerge B.type.functor).typeMerge C.type.functor) ...
We don’t need to understand right now how typeMerge works internally. Just imagine that every typeMerge returns a new type that can be type-merged again with the next item in the list.
typeMerge :: Self -> Type -> Type
Some examples:
types.str.typeMerge types.str -> types.str
types.attrsOf.typeMerge types.attrsOf -> types.attrsOf (merged elemType)
types.submodule.typeMerge types.submodule -> types.submodule (merged options)
Most types are mergeable, and those that are not will throw an error.
This gives us one final option with one final type that we can use to merge all the defined config values.
Other fields of an option, like description and default, are not merged. If options are not mergeable, they throw The option ... is already declared ...
Confusingly, the module system currently throws the same message if types are not mergeable (although we could easily change that, because they are different throws).
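To see the fold in action without pulling in lib.types, here is a toy sketch. mkToyType is a made-up stand-in for an option type: two toy types merge only if their names are equal. The real typeMerge implementations are considerably more involved.

```nix
let
  # Hypothetical toy type, not lib.types: carries a name, a functor and a typeMerge.
  mkToyType = name: rec {
    inherit name;
    functor = { inherit name; };
    typeMerge =
      f:
      if f.name == name then
        mkToyType name
      else
        throw "cannot merge ${name} with ${f.name}";
  };

  # Same shape as the simplified mergeOptionDecls above, using builtins only.
  mergeOptionDecls =
    declarations:
    builtins.foldl' (
      res: decl:
      decl.options
      // (
        if res ? type then
          { type = res.type.typeMerge decl.options.type.functor; }
        else
          { }
      )
    ) { } declarations;

  merged = mergeOptionDecls [
    { options = { type = mkToyType "str"; default = "foo"; }; }
    { options = { type = mkToyType "str"; }; }
  ];
in
merged.type.name
# => "str"
```

Swapping the second declaration's type for mkToyType "int" makes the evaluation fail with the throw from typeMerge, which corresponds to the non-mergeable case described above.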
How config is produced
From the modules collected as described in Module Collection, we can gather all the config values into a single list.
This ultimately produces an intermediate internal list called definitions or definitionsByName.
Hence the term defined in the error messages. I am not a big fan of these error messages, because they leak internal information. definitions and declarations are internal attributes of the system, and in my opinion a user shouldn’t be required to understand them. Otherwise they would have to read this blog post first to understand the error… 🧐
Back on track…
# In this simplified model module.config is an attribute set, so map (not concatMap) is used
configs = map (
  module: module.config
) modules
This collects all configs into a single list
[
  {
    a = 1;
  }
  {
    a = 1;
    b = {
      c = 1;
    };
  }
]
# ...
We then group all values from every item in the list by name (a, b).
The inner mapAttrs transforms each value (1, { c = 1; }) into an attribute set where value = value; meaning we create an envelope in which the attribute called value contains the value.
Along with the value, we might propagate some meta information for error handling.
/*
  definitionsByName :: {
    ${name} :: ListOf Definition
  }
*/
definitionsByName = zipAttrsWith (n: v: v) (
  map (
    module:
    mapAttrs (n: value: {
      file = module._file;
      inherit value;
    }) module.config
  ) modules
);
{
  # All definitions of 'a'
  a = [
    {
      value = 1;
      file = ...; # <- file position you see in error messages
    }
    {
      value = 1;
      file = ...; # <- file position you see in error messages
    }
  ];
  # All definitions of 'b'
  b = [
    {
      value = {
        c = 1;
      };
      file = ...; # <- file position you see in error messages
    }
  ];
}
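The same grouping can be run standalone with builtins. The two modules and their file names below are hypothetical and reduced to the fields this step needs.

```nix
let
  # Two hypothetical modules.
  modules = [
    { _file = "a.nix"; config = { a = 1; }; }
    { _file = "b.nix"; config = { a = 1; b = { c = 1; }; }; }
  ];

  definitionsByName = builtins.zipAttrsWith (name: defs: defs) (
    map (
      module:
      builtins.mapAttrs (name: value: {
        file = module._file;
        inherit value;
      }) module.config
    ) modules
  );
in
definitionsByName
# => {
#   a = [ { file = "a.nix"; value = 1; } { file = "b.nix"; value = 1; } ];
#   b = [ { file = "b.nix"; value = { c = 1; }; } ];
# }
```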
Finally, we can produce a value by merging all the collected definitions. This is done via the merge attribute of the final option.type. Every type has a merge = loc: definitions: ... attribute.
The merge behavior is one of the most important characteristics of a type. Given a couple of definitions, we can use type.merge to produce a value of that type:
evalOptionValue loc opt definitions
evalOptionValue is roughly defined as follows (excluding some error handling and other features that were added over time):
# [ String ] -> Option -> ListOf Definition -> { value :: Value; ... }
evalOptionValue = loc: option: definitions:
  option
  // {
    value = option.type.merge loc definitions;
  }
Let’s take a look at how merge works:
Merging! (One extensive example)
Starting Simple
As said, every type has a merge function; it is absolutely required. It describes how the list of definitions is merged into one value.
For example, the merge of types.int:
lib.types.int = mkOptionType {
  # ...
  merge = lib.mergeEqualOption;
}
lib.mergeEqualOption returns a value if all definition values are equal; otherwise it throws an error.
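The idea behind such a merge function fits in a few lines. The following mergeEqual is a simplified, hypothetical stand-in for lib.mergeEqualOption: it assumes at least one definition and produces a much plainer error message than the real function.

```nix
let
  # Hypothetical stand-in for lib.mergeEqualOption (assumes a non-empty list).
  mergeEqual =
    loc: defs:
    let
      first = (builtins.head defs).value;
    in
    if builtins.all (def: def.value == first) defs then
      first
    else
      throw "conflicting definitions for ${builtins.concatStringsSep "." loc}";
in
mergeEqual [ "a" ] [
  { file = "a.nix"; value = 1; }
  { file = "b.nix"; value = 1; }
]
# => 1
```

Changing one of the values to 2 makes the expression throw instead of returning a value.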
Nested types
Nested types are absolutely mandatory. Without them we can only describe primitives.
Meaning: we absolutely want list and attrset types with a nested type. So let’s understand how that works.
Note: the nested type is often referred to as elemType.
To fully understand how more complex types work, we should take a look at the more advanced merge function of attrsOf.
Other composed types like listOf or submodule are based on the same principles.
lib.types.attrsOf = mkOptionType {
  # ...
  merge = loc: defs:
    mapAttrs (n: v: v.value) (
      filterAttrs (n: v: v ? value) (
        zipAttrsWith (name: defs: (mergeDefinitions (loc ++ [ name ]) elemType defs).optionalValue)
          (pushPositions defs)
      )
    );
}
Don’t worry about the complexity of this beast. I am going to explain in detail how it works, starting from the inside out.
Let’s do a walkthrough based on some example definitions.
As a reminder, this is what the type would be:
# AttrsOf
# with a nested
# elemType = int
type = attrsOf int
# definitions for an option of type attrsOf int
# Almost looks like a list of modules ;)
[
  {
    file = "<some-file>";
    value = { a = 1; };
  }
  {
    file = "<some-file>";
    value = {
      a = 1;
      b = 2;
    };
  }
]
We would then call the merge of attrsOf with these definitions.
The first thing that happens is pushPositions. It just propagates the file attribute down.
# defs = [ { value = { a = ...; }; } { value = { b = ...; }; } ]
zipAttrsWith (../*function*/..) (pushPositions defs)
pushPositions = map (
  def:
  mapAttrs (n: v: {
    inherit (def) file;
    value = v;
  }) def.value
)
So pushPositions yields:
pushPositions [ { value = { a = 1; }; file = "<some-file>"; } ... ]
=>
[ { a = { value = 1; file = "<some-file>"; }; } ... ]
So it just swaps the nesting of a and value/file. This is a preparation for zipping by a.
At this point a short reminder of how zipAttrsWith works might help:
zipAttrsWith (name: values: values) [ {a = "x";} {a = "y"; b = "z";} ]
=> { a = ["x" "y"]; b = ["z"]; }
zipAttrsWith groups nested values by name again. The list ["x" "y"] is produced by the function argument of zipAttrsWith. In fact, the result can be any value that the function returns.
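To illustrate that the function decides what ends up under each name, here is the same input with a function that counts the grouped values instead of returning them:

```nix
# Same input as above, but the function returns the number of grouped values.
builtins.zipAttrsWith (name: values: builtins.length values) [
  { a = "x"; }
  { a = "y"; b = "z"; }
]
# => { a = 2; b = 1; }
```

attrsOf uses exactly this hook: instead of counting, its function merges the grouped definitions with the element type.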
Let’s take a look at the next step:
# ↓ All '{ value ... }' entries are grouped by key "a" and passed to the function
# (pushPositions defs) = [ { a = { value = 1; file = "<some-file>"; } } ... ]
zipAttrsWith (name: innerDefs: /*function*/..) (pushPositions defs)
# name = "a"
# innerDefs = [ { value = 1; file = "<some-file>"; } ... ]
Let’s reveal the inner function:
# (pushPositions defs) = [ { a = { value = 1; file = "<some-file>"; }; } ... ]
zipAttrsWith (name: innerDefs: mergeDefinitions loc elemType innerDefs) (pushPositions defs)
mergeDefinitions is a recursive call that uses elemType.merge to merge the inner definitions.
In this case elemType is types.int, which uses mergeEqualOption.
mergeEqualOption, just like any type.merge, returns a value of that type.
So in the case of types.int, mergeEqualOption returns the value 1, which is of type int.
# lib.modules.mergeDefinitions returns:
{
  optionalValue = if isDefined then { value = mergedValue; } else { };
  # ...
}
# { value = 1; } or { } <= (mergeDefinitions loc elemType innerDefs).optionalValue
zipAttrsWith (name: innerDefs: (mergeDefinitions loc elemType innerDefs).optionalValue) (pushPositions defs)
=>
{
  a = { value = 1; };
  ...
}
So we can already substitute the real value to make the outer function more readable:
filterAttrs (n: v: v ? value)
{
  a = { value = 1; };
  ...
}
=>
{
  a = { value = 1; };
}
And finally:
mapAttrs (n: v: v.value)
{
  a = { value = 1; };
}
=>
{
  a = 1;
}
In summary:
# type = attrsOf int
type.merge loc [
  {
    file = "<some-file>";
    value = { a = 1; };
  }
  {
    file = "<some-file>";
    value = {
      a = 1;
      b = 2;
    };
  }
]
=>
{
  a = 1;
  b = 2;
}
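The whole walkthrough can be condensed into one runnable sketch that uses only builtins. Everything in it is a simplified stand-in and therefore an assumption: the toy intType, the toy mergeDefinitions (the real one also handles priorities, mkIf, type checks and error messages) and the hand-written filterAttrs.

```nix
let
  # Stand-in for lib.filterAttrs, written with builtins only.
  filterAttrs =
    pred: set:
    builtins.listToAttrs (
      builtins.concatMap (
        name: if pred name set.${name} then [ { inherit name; value = set.${name}; } ] else [ ]
      ) (builtins.attrNames set)
    );

  # Toy int type: all definitions must carry the same value.
  intType = {
    merge =
      loc: defs:
      let
        first = (builtins.head defs).value;
      in
      if builtins.all (def: def.value == first) defs then first else throw "conflicting definitions";
  };

  # Toy mergeDefinitions: only provides optionalValue.
  mergeDefinitions = loc: type: defs: {
    optionalValue = if defs != [ ] then { value = type.merge loc defs; } else { };
  };

  pushPositions = map (
    def: builtins.mapAttrs (n: v: { inherit (def) file; value = v; }) def.value
  );

  attrsOf = elemType: {
    merge =
      loc: defs:
      builtins.mapAttrs (n: v: v.value) (
        filterAttrs (n: v: v ? value) (
          builtins.zipAttrsWith (
            name: innerDefs: (mergeDefinitions (loc ++ [ name ]) elemType innerDefs).optionalValue
          ) (pushPositions defs)
        )
      );
  };
in
(attrsOf intType).merge [ "foo" ] [
  { file = "a.nix"; value = { a = 1; }; }
  { file = "b.nix"; value = { a = 1; b = 2; }; }
]
# => { a = 1; b = 2; }
```

Because attrsOf takes the element type as an argument, the same sketch also merges nested structures, for example with attrsOf (attrsOf intType).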
Producing the output involves some spaghetti checking code, but ultimately it does:
options.foo = option // {
  value = option.type.merge loc definitionsByName.foo;
};
We can then map over all the value attributes of the options to produce a config:
# This effectively collects all the .value attributes from the options recursively
declaredConfig = mapAttrsRecursiveCond (v: !isOption v) (_: v: v.value) options;
# mapAttrsRecursiveCond:
# - If the predicate returns false, mapAttrsRecursiveCond does not recurse, but instead applies the mapping function.
# - If the predicate returns true, it does recurse, and does not apply the mapping function.
#
config = declaredConfig;
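To try this step in isolation, the following sketch runs mapAttrsRecursiveCond over a hand-written options tree. It assumes that nixpkgs is available on NIX_PATH; the options tree itself is made up and only contains the two fields this step looks at.

```nix
let
  lib = import <nixpkgs/lib>; # assumption: nixpkgs is on NIX_PATH

  # Hypothetical, already evaluated options: each one carries its merged value.
  options = {
    foo = { _type = "option"; value = "foo"; };
    a.b = { _type = "option"; value = 1; };
  };
in
# Recurse into plain attribute sets, stop at options and take their value.
lib.mapAttrsRecursiveCond (v: !lib.isOption v) (path: opt: opt.value) options
# => { a = { b = 1; }; foo = "foo"; }
```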
🤗🤗🤗
Freeform (special case)
This is a special case, introduced to better support scenarios where settings change often or are not always known ahead of time.
The module system supports freeform attributes, meaning names in config that have no declared option.
They still need to be merged, using freeformType.merge.
First of all, we need to collect all of those definitions.
This is done by subtracting all seen options from the definitions, leaving us with those definitions that don’t have a corresponding option declaration:
unmatchedDefnsByName = removeAttrs definitionsByName (attrNames matchedOptions)
The config in this case is produced by recursively merging freeformConfig and declaredConfig, with declaredConfig taking precedence:
freeformConfig = freeformType.merge prefix unmatchedDefinitions;
config = recursiveUpdate freeformConfig declaredConfig;
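Here is a small sketch of the subtraction step with builtins only. All names and values are made up, the freeform merge is replaced by a toy that just takes the first definition, and the shallow // operator stands in for recursiveUpdate.

```nix
let
  # Hypothetical inputs.
  definitionsByName = {
    foo = [ { file = "a.nix"; value = "declared"; } ];
    extra = [
      { file = "a.nix"; value = 1; }
      { file = "b.nix"; value = 1; }
    ];
  };
  matchedOptions = { foo = { _type = "option"; }; };
  declaredConfig = { foo = "declared"; };

  # Everything that has no matching option declaration.
  unmatchedDefnsByName = builtins.removeAttrs definitionsByName (builtins.attrNames matchedOptions);

  # Toy stand-in for freeformType.merge: take the first definition of each name.
  freeformConfig = builtins.mapAttrs (name: defs: (builtins.head defs).value) unmatchedDefnsByName;
in
# `//` is a shallow stand-in for recursiveUpdate; the right-hand side wins.
freeformConfig // declaredConfig
# => { extra = 1; foo = "declared"; }
```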
Cheers, keep hacking !