Improve config file upgrades – Ep 01
The GSoC project Krzysztof and I are working on aims to facilitate configuration upgrades during package upgrades. In most cases, this upgrade should not require any input from the user, even if the user had previously customized his configuration file.
Currently, a configuration upgrade may ask the user questions that are hard to answer for most non-technical users. For more details see PackageConfigUpgrade on Debian wiki. (If you are not familiar with upgrade issues, you might want to read this page.)
At the 2 extremes of upgrade tools, we have:
- tools based on diff3 (like ucf)
- Config::Model based upgrades
Each tool has its own advantages and drawbacks.
On the plus side of diff3 tools:
- minimal work required on upstream or packager
On the other hand, here are the issues with diff3 based tools:
- may produce syntactically wrong files
- may leave conflicts unresolved
- may leave duplicated (and conflicting) statements
- may lose user data
On the other end of the spectrum, Config::Model pros are:
- Very robust, should work in most cases
- Can be set up to migrate user data from old config format to new format
- Leaves upstream more room to change their project
But Config::Model's main problem is:
- A lot of work required from upstream or packager to create models
diff3 tools are lackluster for users, while Config::Model puts too much strain on packagers because it requires configuration models where all parameters are defined and declared. For fast-evolving projects, this can be a huge recurring task (been there, done that for Xorg). (Well, this recurring task is a problem until doc, implementation and config model can be extracted from a single meta-description… But I digress, this may be a project for GSoC 2011 … :-p )
So, this GSoC project (for 2010) aims to find a compromise between these 2 extreme cases.
On the Config::Model side, I’ve always had a drawer statement ready whenever I was asked why Config::Model does not handle unknown parameters:
a configuration file cannot be validated if an unknown parameter can’t be distinguished from a user typo.
This stance is fine for casual user interaction on a stable system: an unknown parameter is most likely an error. And the user must be warned of his error as soon as possible. For a stable distro release, packagers should have the time to provide complete models, so basing upgrade on complete models should be possible (famous last words ?).
On the other hand, a lot of configuration upgrades will be done on unstable systems where software evolves quite fast and where models will likely fall behind the software. Another shortcoming of my drawer statement is that it assumes user interaction, where erroneous input is more likely. In the case of configuration upgrades, there should be no user interaction. And we can assume that the config file to be upgraded was used (and thus validated) by the old version of the software. So unknown parameters should be tolerated and managed as well as possible.
Now, the big question is: how ?
In all the attempts I’ve seen at managing upgrades that go beyond diff3 tools, the configuration is stored as structured data.
Sometimes this structure can be inferred from the configuration file syntax (shell vars, INI files, YAML, XML), sometimes not (sshd config file). In the latter case, some structural knowledge must be built into the upgrade tool.
Currently, Config::Model is designed to bring a complete description of the structure and semantics of a configuration file. This is perceived as bringing too much structural and semantic knowledge (“too semantic”). So the maintenance cost is perceived as too high. That may be right or wrong, but that does not matter. Perception is more important here.
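To make the difference concrete, here is a small sketch in Python (used only for illustration; it is not part of Config::Model, and the sample file contents are made up). A generic INI parser recovers sections and keys from the syntax alone. A generic keyword/value parser applied to an sshd-style file only sees a flat list, and nothing in the syntax says that Match opens a new block.

```python
import configparser

# Made-up sample contents, for illustration only.
INI_TEXT = """
[server]
port = 22
[logging]
level = INFO
"""

SSHD_TEXT = """
Port 22
Match User alice
X11Forwarding no
"""

def parse_ini(text):
    """Structure (sections -> keys) comes straight from the INI syntax."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}

def parse_flat(text):
    """Generic keyword/value parsing: every line looks like a plain leaf."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            keyword, _, value = line.partition(" ")
            pairs.append((keyword, value.strip()))
    return pairs

ini_tree = parse_ini(INI_TEXT)      # nested: one dict per section
flat_pairs = parse_flat(SSHD_TEXT)  # flat: Match is just another keyword
print(ini_tree)
print(flat_pairs)
```

In the flat result, Match is just one more pair in the list, which is why the knowledge that it starts a block has to come from the tool rather than from the file.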
So, let’s figure out a way to reduce the amount of work required to perform upgrades… (you may want to read a little bit more on configuration model creation before going on …)
One possibility would be to support configuration models that allow unknown leaves in nodes. Such a model would provide a kind of fallback leaf description. Something like: “this parameter is unknown, let’s assume it’s a simple string value”.
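As a rough sketch of that idea (in Python for illustration only; the `Node` class and its `accept_unknown` flag are hypothetical and do not reflect how Config::Model is implemented), a node could validate the leaves it knows about and store everything else as a plain string:

```python
class Node:
    """A configuration node with declared leaves and an optional fallback."""

    def __init__(self, known, accept_unknown=False):
        self.known = known                  # parameter name -> value type
        self.accept_unknown = accept_unknown
        self.values = {}
        self.unknown = []                   # names stored via the fallback

    def store(self, name, raw):
        if name in self.known:
            # Declared leaf: convert (and thereby validate) the value.
            self.values[name] = self.known[name](raw)
        elif self.accept_unknown:
            # Fallback leaf: "unknown parameter, assume a simple string".
            self.values[name] = str(raw)
            self.unknown.append(name)
        else:
            # Strict behavior: an unknown name may be a typo, so reject it.
            raise KeyError(f"unknown parameter: {name}")


strict = Node({"Port": int})
tolerant = Node({"Port": int}, accept_unknown=True)
tolerant.store("Port", "22")
tolerant.store("ClientAliveInterval", "30")  # not declared in the model
print(tolerant.values)
print(tolerant.unknown)
```

Keeping the list of names that went through the fallback lets a tool report them later, so tolerance during an upgrade does not have to mean silence during interactive editing.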
Let’s consider the sshd_config example to see how such a behavior would pan out for some upgrade scenarios. The sshd model is made of 2 main classes:
The second one contains all parameters allowed in a Match class (in fact, Match parameter is the only one in sshd that defines a new hierarchical level. All other parameters are simple leaves).
Now let’s consider typical upgrade scenarios from old version (with old models) to a more modern version. Let’s assume that the model is lagging behind and the model does not follow sshd evolutions.
So Config::Model will face 3 upgrade cases (extracted from sshd config history):
- Unexpected parameters that are actually simple leaf parameters. E.g. introduction of ClientAliveInterval.
- Obsolete parameters that should be migrated. E.g. KeepAlive is renamed to TCPKeepAlive
- New structural parameters (introduction of Match element)
So what would happen with a model allowing unknown parameters but lacking specific instructions for these upgrades:
- ClientAliveInterval value would be kept. 🙂
- KeepAlive would be kept as is. TCPKeepAlive would also be kept. No migration is performed, but the configuration would be good. As long as sshd supports the old parameter, we’re good.
- Without knowledge of the structure brought by Match parameters, Config::Model would consider that parameters in a Match block duplicate parameters in the main block and would remove the duplicates. User data may be lost. 😦
So, there’s no silver bullet. Let’s sum up what we’ve learnt:
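These three outcomes can be reproduced with a toy loader (a Python sketch for illustration, not Config::Model code; the sample file is made up, and dropping the later occurrence of a repeated keyword is an assumption about how duplicates would be removed):

```python
# Made-up user configuration mixing the three cases discussed above.
OLD_USER_CONFIG = """\
Port 22
KeepAlive yes
TCPKeepAlive yes
ClientAliveInterval 30
X11Forwarding yes
Match User alice
X11Forwarding no
"""

def load_flat(text):
    """Load keyword/value lines with no notion of Match blocks.

    Every keyword is treated as a string leaf of the main block. When a
    keyword shows up twice, the later line is taken to be a duplicate
    and dropped.
    """
    values, dropped = {}, []
    for line in text.splitlines():
        keyword, _, value = line.strip().partition(" ")
        if not keyword:
            continue
        if keyword in values:
            dropped.append((keyword, value))
        else:
            values[keyword] = value
    return values, dropped

values, dropped = load_flat(OLD_USER_CONFIG)
print(values)   # unknown and obsolete leaves survive as plain strings
print(dropped)  # the setting meant for the Match block is thrown away
```

The unknown leaf and the obsolete leaf both survive, but the X11Forwarding line that belonged to the Match block ends up in the dropped list.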
- to perform upgrade, the structure of the configuration must be specified
- A minimal model that would perform upgrade for Sshd without any validation would require only knowledge of Match parameter
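Such a minimal model could look like the following sketch (Python, for illustration only; the function name and the returned data layout are hypothetical). Its only built-in knowledge is that Match opens a new block that runs until the next Match or the end of the file. Every other keyword is stored as an unvalidated string. Keyword matching is case-sensitive here, which is a simplification.

```python
def load_with_match(text):
    """Parse sshd-style lines knowing only that Match opens a new block."""
    main, blocks = {}, []
    current = main                      # where plain leaves are stored
    for line in text.splitlines():
        keyword, _, value = line.strip().partition(" ")
        if not keyword or keyword.startswith("#"):
            continue                    # skip blank lines and comments
        if keyword == "Match":
            # The only structural rule: start a new block; later leaves
            # go into it until the next Match or the end of the file.
            block = {"condition": value.strip(), "params": {}}
            blocks.append(block)
            current = block["params"]
        else:
            # Any other keyword is an unvalidated string leaf.
            current[keyword] = value.strip()
    return main, blocks


SAMPLE = "Port 22\nX11Forwarding yes\nMatch User alice\nX11Forwarding no\n"
main, blocks = load_with_match(SAMPLE)
print(main)
print(blocks)
```

With that single rule, the X11Forwarding setting inside the Match block is kept separate from the one in the main block, so neither is mistaken for a duplicate.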
The next step is to figure out what would happen if the same strategy is applied to other types of configuration files. But that will be for another blog post.
All the best