
Improving error messages of Dpkg dependency parser

Hello

The Config::Model::Dpkg project (a Debian source package model based on Config::Model) is partly based on a Parse::RecDescent grammar. This grammar is used to parse the dependencies of a Debian source package.

This article shows how such a grammar is written, where it falls short regarding error handling, and how to improve the situation.

The main data of a Debian package is described in the debian/control file. This file can feature a list of dependencies, i.e. a list of packages that must be installed for the package to work. These dependencies are declared in fields like “Build-Depends” or “Depends” as a list of packages. For the purpose of the Dpkg model, I only needed to parse one item of a dependency list at a time.

This dependency item can be a simple package name:

foo

or a package name with a version requirement:

foo ( >= 1.24 )

or a package name with architectures restrictions:

foo [alpha amd64 hurd-arm linux-armeb]

or both:

foo ( >= 1.24 ) [alpha amd64 hurd-arm linux-armeb]

or a list of alternate choices combining the possibilities above:

foo ( >= 1.24 ) | bar [ linux-any] | baz ( << 3.14 ) [ ! hurd-armel !hurd-armeb ]

or a variable that is replaced during package build:

${perl-depends}

Writing a Parse::RecDescent grammar to parse this is relatively straightforward.

The first production handles alternate dependencies separated by ‘|’ and raises an error if some text was not “consumed” by the dependencies:

dependency_item: depend(s /\|/) eofile | <error>

A dependency as explained above is expressed as:

depend: pkg_dep | variable

A variable like ${foo} or ${bar}-1.24~ is parsed with:

variable: /\${[\w:\-]+}[\w\.\-~+]*/

This rule handles a package name with optional version or arch restriction:

pkg_dep: pkg_name dep_version(?) arch_restriction(?) 
pkg_name: /[a-z0-9][a-z0-9\+\-\.]+/

The remaining rules are quite simple:

dep_version: '(' oper version ')' 
oper: '<<' | '<=' | '=' | '>=' | '>>'
version: variable | /[\w\.\-~:+]+/

arch_restriction: '[' arch(s) ']'
arch:  /!?[\w-]+/

eofile: /^\Z/

The grammar above works well to parse the dependency. You can test it with this small Perl script:

#!/usr/bin/perl
use strict;
use warnings;
use 5.010 ;
use Parse::RecDescent ;

my $parser = Parse::RecDescent->new(join('', <DATA>)); # the grammar is read from the __DATA__ section
my $dep = shift ;
say "parsing '$dep'";
my $ret = $parser->dependency_item($dep) ;

say "result is ", $ret if $ret ;

__DATA__
# insert grammar here !!!

Unfortunately, any error in the optional parts (i.e. version requirements and arch restrictions) leads to an error message which is not very helpful. The error message only mentions that some text could not be parsed:

parsing 'foo ( != 1.24 ) | bar'

       ERROR (line 1): Invalid dependency item: Was expecting /\|/ but found
                       "( != 1.24 ) | bar" instead

or

parsing 'foo [ arm & armel] | bar'

       ERROR (line 1): Invalid dependency item: Was expecting /\|/ but found
                       "[ arm & armel] | bar" instead

The problem comes from the fact that version requirements and arch restrictions are optional. For instance, if a version requirement has a syntax error, the optional version rule simply matches nothing and Parse::RecDescent then tries to parse the same text as an arch restriction. This arch restriction rule also fails, and so does the last terminal (“eofile”). So the error message does not hint at the actual syntax problem.

To generate better error messages, I improved the suggestion made in Parse::RecDescent FAQ.

Instead of calling a plain subroutine, I use a sub reference that will store the error messages in a closure. This sub ref is declared in a start-up action. Note that the sub ref explicitly returns undef. I’ll explain why later.

{
    my @dep_errors ;
    my $add_error = sub {
        my ($err, $txt) = @_ ;
        push @dep_errors, "$err: '$txt'" ;
        return ;
    } ;
}
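The behaviour of this collector can be checked outside of any grammar. Here is a minimal sketch in plain Perl, where the sample error text is made up for the demonstration. It only shows that the bare `return` yields undef in scalar context while the message is still recorded:

```perl
use strict;
use warnings;

# Stand-alone copy of the error collector from the start-up action.
my @dep_errors;
my $add_error = sub {
    my ($err, $txt) = @_;
    push @dep_errors, "$err: '$txt'";
    return;    # bare return: undef in scalar context
};

# Hypothetical call, as a grammar action would do it.
my $result = $add_error->("bad package name", "Foo_Bar");

print "return value is ", ( defined $result ? "defined" : "undef" ), "\n";
print "recorded: $_\n" for @dep_errors;
```
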

The following production always fails while ensuring that the error list is reset. This production is always run at the beginning of the dependency parsing:

dependency: { @dep_errors = (); } <reject>

Here’s the actual “dependency” production that is run when the “dependency” method is called on the parser. It returns an array ref containing (1, data) if the dependency is valid or (0, errors) otherwise:

dependency: depend(s /\|/) eofile
  {
    $return = [ 1 , @{$item[1]} ] ;
  }
  |
  {
    push( @dep_errors, "Cannot parse: '$text'" ) unless @dep_errors ;
    $return =  [ 0, @dep_errors ];
  }
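To see this calling convention at work without the whole dependency grammar, here is a reduced sketch. It assumes Parse::RecDescent is installed. The rule name `op_check` and the “bad operator” message are hypothetical, and the reset production is folded into the same rule as its first alternative:

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Reduced grammar: one operator followed by the end of the text.
my $grammar = <<'EOG';
{
    my @errors;
    my $add_error = sub {
        my ($err, $txt) = @_;
        push @errors, "$err: '$txt'";
        return;
    };
}

# 1st alternative: reset the error list, then always fail
# 2nd alternative: success, returns [ 1, data ]
# 3rd alternative: fallback, returns [ 0, errors ]
op_check: { @errors = (); } <reject>
    | oper eofile { $return = [ 1, $item[1] ]; }
    | { $return = [ 0, @errors ]; }

oper: '<<' | '<=' | '=' | '>=' | '>>'
    | /\S+/ { $add_error->("bad operator", $item[1]); }

eofile: /^\Z/
EOG

my $parser = Parse::RecDescent->new($grammar)
    or die "cannot compile grammar";

for my $text ( '>=', '!=' ) {
    my ( $ok, @rest ) = @{ $parser->op_check($text) };
    print "'$text': ", ( $ok ? "valid" : "invalid" ), " (@rest)\n";
}
```

The caller only has to look at the first element of the returned array ref to know whether the rest is data or a list of error messages.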

The following productions don’t change much:

depend: pkg_dep | variable
variable: /\${[\w:\-]+}[\w\.\-~+]*/
pkg_dep: pkg_name dep_version(?) arch_restriction(?) 
dep_version: '(' oper version ')'

The first alternative of this production parses the package name, which must be followed by a space, the end of the string, ‘(’ or ‘[’. A positive look-ahead assertion is used so that only the package name is consumed. If the first alternative fails, the second one provides a meaningful error message: it matches anything which is not a space and creates an error message. Since $add_error returns undef, the action of the second alternative returns undef and the production fails. So the text stored in the error message is not consumed:

pkg_name: /[a-z0-9][a-z0-9\+\-\.]+(?=\s|\Z|\(|\[)/
    | /\S+/ { $add_error->("bad package name", $item[1]) ;}
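The effect of the look-ahead can be tried with a plain Perl regex. This is only a sketch: `split_pkg_name` is a hypothetical helper, and the leading `^` is an assumption made for the demonstration (the grammar matches at the current parse position instead):

```perl
use strict;
use warnings;

# Same character classes and look-ahead as the first alternative of pkg_name.
my $pkg_re = qr/^([a-z0-9][a-z0-9\+\-\.]+)(?=\s|\Z|\(|\[)/;

# Hypothetical helper: returns (name, remaining text), or an empty list
# when the text does not start with a valid package name.
sub split_pkg_name {
    my ($text) = @_;
    return unless $text =~ $pkg_re;
    my $name = $1;
    return ( $name, substr( $text, length $name ) );
}

for my $dep ( 'foo ( >= 1.24 )', 'foo[amd64]', 'foo_bar ( >= 1.24 )' ) {
    my ( $name, $rest ) = split_pkg_name($dep);
    if ( defined $name ) {
        print "name '$name', left to parse '$rest'\n";
    }
    else {
        print "no valid package name at the start of '$dep'\n";
    }
}
```

The character that ends the name (space, ‘(’ or ‘[’) stays in the remaining text, which is what lets the following rules see it.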

The same trick is used with these productions:

oper: '<<' | '<=' | '=' | '>=' | '>>'
    | /\S+/ { $add_error->("bad dependency version operator", $item[1]) ;}

version: variable | /[\w\.\-~:+]+(?=\s|\)|\Z)/
    | /\S+/ { $add_error->("bad dependency version", $item[1]) ;}

The action of this production is a little bit more tricky. It ensures that ‘!’ is either prepended to all arch names or to none of them. Otherwise an error message is generated and added to the list of errors:

arch_restriction: '[' osarch(s) ']'
    {
        my $mismatch = 0;
        # $ref contains ['!',os,arch] or ['',os,arch]
        my $ref = $item[2] ;
        # compare each entry with its neighbour, up to the last pair
        for (my $i = 0; $i < $#$ref ; $i++ ) {
            $mismatch ||= ($ref->[$i][0] xor $ref->[$i+1][0]) ;
        }
        my @a = map { ($_->[0] || '') . ($_->[1] || '') . $_->[2] } @$ref ;
        if ($mismatch) {
            $add_error->("some names are prepended with '!' while others aren't.", "@a") ;
        }
        else {
            $return = 1 ;
        }
    }
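The neighbour-by-neighbour comparison can be exercised on its own. A minimal sketch, where `has_negation_mismatch` is a hypothetical helper and the sample data mimics what “osarch” is expected to return; the loop visits every adjacent pair, including the last one:

```perl
use strict;
use warnings;

# Hypothetical helper: true when some entries carry '!' and others do not.
# Each entry is [ $not, $os, $arch ].
sub has_negation_mismatch {
    my ($ref) = @_;
    for my $i ( 0 .. $#$ref - 1 ) {
        return 1 if ( $ref->[$i][0] xor $ref->[ $i + 1 ][0] );
    }
    return 0;
}

my @all_negated = ( [ '!', 'hurd', 'armel' ], [ '!', 'hurd', 'armeb' ] );
my @mixed       = ( [ '!', 'hurd', 'armel' ], [ '',  '',     'amd64' ] );

print "all negated: ", has_negation_mismatch( \@all_negated ), "\n";
print "mixed:       ", has_negation_mismatch( \@mixed ),       "\n";
```
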

The check above is possible only if the “osarch” production returns an array ref containing something like ('!','linux','any') for “!linux-any” or ('','linux','any') for “linux-any”:

osarch: not(?) os(?) arch
    {
        $return =  [ $item[1][0], $item[2][0], $item[3] ];
    }
    | /.?(?=\s|\]|\Z)/ { $add_error->("bad arch specification: ", $item[1]) ; }

not: '!'
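To make the shape of this return value concrete, here is a rough stand-in written with a single regex instead of the grammar. It is only an illustration: `split_osarch` is a hypothetical helper, and it does not check os or arch names against a known list as the grammar does:

```perl
use strict;
use warnings;

# Hypothetical helper: splits one arch item into [ not, os, arch ].
# Returns nothing when the item does not look like an arch specification.
sub split_osarch {
    my ($spec) = @_;
    my ( $not, $os, $arch ) = $spec =~ /^(!?)(?:([\w-]+)-)?(\w+)$/
        or return;
    return [ $not, $os // '', $arch ];
}

for my $spec ( '!linux-any', 'linux-any', 'amd64', '&' ) {
    my $r = split_osarch($spec);
    print "$spec => ",
        ( $r ? join( ',', map {"'$_'"} @$r ) : 'not an arch specification' ),
        "\n";
}
```
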

Here’s the rest of the grammar:

os: /(any|uclibc-linux|linux|kfreebsd|knetbsd|etc...)-/
   | /\w+/ '-' { $add_error->("bad os in architecture specification", $item[1]) ;}

arch: / (any |alpha|amd64 |arm\b |arm64 |etc... )
        (?=(\]| ))
      /x
      | /\w+/ { $add_error->("bad arch in architecture specification", $item[1]) ;}

eofile: /^\Z/

That’s all for grammar 2.0

Before someone yells “Show me the message!”, here are some examples of bad dependencies and the error messages generated by the parser:

parse 'foo ( != 1.24 ) | bar'
result is: 0 bad dependency version operator: '!='

parsing 'foo [ arm & armel] | bar'
result is: 0 bad arch specification: : '&'

parsing 'foo [ arm armel ] | bar [!moo]'
result is: 0 bad arch specification: : ']' bad arch in architecture specification: 'moo'

The first two error messages are spot on the actual error. The third one has a false positive (‘]’ is correct) but correctly highlights the wrong arch name (‘moo’).

Mission accomplished.

In order to keep this post (relatively) simple, I’ve removed the parts that actually store the parsed data. They don’t really matter for error handling. Nevertheless, you may see the whole grammar in the Config::Model::Dpkg::Dependency module.

All the best

Next version of Config::Model will use asynchronous check

Hello

To check the validity of the dependencies of a Debian package, Config::Model queries a remote web server to get the list of package versions known to Debian.

The first version of this check did sequential requests: when the cache was not very fresh, I had to wait more than 60s before getting results for a complex package like padre. That was very frustrating (but less frustrating than checking package names manually).

For the following version, I hacked AnyEvent into the Dpkg model to run parallel queries. This went much faster but gave weird results: until a response came back from the Debian server, the packages were flagged as unknown. To get consistent results, running cme check dpkg twice was required.

So, at New Year, I decided to bite the bullet and implement value checks correctly, with asynchronous queries to the remote server. This new feature is now ready and will be delivered in Config::Model 2.030. cme will now return consistent results.

This new release is mostly backward compatible. You may notice some quirks with some other modules based on Config::Model:

  • Current Config::Model::Dpkg will return inconsistent results as usual
  • Tk UI will not show correct package status (i.e. packages may be wrongly shown as unknown)
  • config-model-edit process must be killed if it refuses to exit.
  • Tk UI non-regression tests will block

These quirks will disappear once these modules are updated. This should not take long since all the updates are ready on GitHub.

All the best

Be wary of “optimised for …” devices

Hello

This story began with linphone having a weird behavior: every now and then, the mouse pointer would become stuck: the pointer moves, but clicking always activates the same widget. Since this always happened in the middle of a phone conference held for work, this was infuriating.

Long story short, I finally found that my plantronics USB headset was responsible for this weird behavior. This headset is connected to the USB bus and is seen by the computer as a USB sound card. The bug was triggered by pressing the “mute” button provided by the headset.

Suspicious of this USB device, I used lsusb to find the features provided by the headset:

$ lsusb -v -s 1:3 2>&1 |grep InterfaceClass
bInterfaceClass 1 Audio
bInterfaceClass 1 Audio
bInterfaceClass 1 Audio
bInterfaceClass 1 Audio
bInterfaceClass 1 Audio
bInterfaceClass 3 Human Interface Device

Audio devices were expected, but why a HID device? This device is a headset, not a keyboard or a mouse… Testing further, I also saw some numbers popping up on my screen whenever I pressed the mute button. The only way to get the mouse back was to unplug the headset.

A quick Google search gave a solution: set up X11 to ignore the headset. Problem solved.

But this did not answer the question regarding the HID device. Then I got a hint in the form of a sticker glued on the headset USB plug: “Optimized for Microsoft Lync”. This page gave the answer: an “optimised for Lync” device provides “mute/unmute across PC and device”.

I can only guess that when the mute button is pressed, some data is sent from the HID interface. Unfortunately, the window manager does not like to be on the receiving end of this data.

The moral of this story is: “optimized for something” actually means “not standard“.

OK. There’s a bug somewhere. The comments on this blog have convinced me that I went too far with the moral of this story. It is now struck through rather than simply removed, so the comments still make sense.

Thanks everybody for the constructive comments.

All the best

How to fix a configuration model

Hello

I’ve written a tutorial to show how a configuration model can be updated to cope with new features provided by a new software release.

This tutorial shows how to add a new parameter provided by OpenSsh 5.9 into the Ssh model. This example can be applied to other models (like the Dpkg model).

I hope that people will now be able to perform simple enhancements to current models.

Feedback is welcome

Dpkg source editor/checker is going native

Hello

As Config::Model was becoming too big with too many dependencies, I’ve moved all Debian stuff from Config::Model repository into a separate Debian repository.

If you install only libconfig-model-perl 2.026-1 (from experimental), the command ‘cme check dpkg’ will not work. You must also install libconfig-model-dpkg-perl to get it back. libconfig-model-dpkg-perl is currently in experimental.

The libconfig-model-dpkg-perl repository is hosted on Alioth in the Debian perl-packaging team and is a Debian native package. Nevertheless, it will be uploaded to CPAN (just like dh-make-perl).
All people involved in packaging are welcome to hack it or suggest improvements.

I’ll upload both packages into unstable once the dust settles.

All the best

Kerberos configuration editor looking for a maintainer

Hello

A while ago, Peter Knowles wrote a configuration model for Kerberos. Unfortunately, his priorities shifted before he completed this work. He was kind enough to send me the prototype in hope someone could take over.

Although I do not know Kerberos, I believe the model is rather complete. Just click on the image below to judge for yourself:

Class diagram of Kerberos configuration model

This work was released under a LGPL-2+ license.

There are some tasks left to get a Kerberos configuration editor. These are mentioned in config-model-kerberos readme file.

To look at the code, just clone config-model-kerberos with

git://github.com/dod38fr/config-model-kerberos.git

If you need more information on Config::Model, see:

If you are interested in adopting Kerberos model, please keep us informed on config-model-users at lists.sourceforge.net. This will avoid duplicated work. On my side, I’ll gladly answer questions related to Config::Model to help you get started.

All the best

cme fix dpkg now tracks and displays changes

Hello

I’ve released Config::Model 2.018 which is now able to track the changes done with cme to your configuration data.

The GUI now features a new menu entry File -> show unsaved change that will give you a list of the changes done since running the command or since the last save (whichever occurred last). For instance:

This feature is also available in non-graphic mode.

As an example, let’s refresh the lcdproc package:

$ cme fix dpkg
Fixing...
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Warning in 'control source Standards-Version' value '3.9.2': Current standards version is 3.9.3
Warning in 'copyright Format' value 'http://dep.debian.net/deps/dep5/': Format does not match the recommended URL for DEP-5

Changes:
- control source Standards-Version: '3.9.2' -> '3.9.3' # applied fix
- copyright Format: 'http://dep.debian.net/deps/dep5/' -> 'http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/' # applied fix

The Changes list can be reused to update the changelog, but the author does not dare injecting this directly in debian/changelog…

Let’s check the resulting changes:

$ git diff
diff --git a/debian/control b/debian/control
index 11a5b8a..2b81192 100644
--- a/debian/control
+++ b/debian/control
@@ -20,7 +20,7 @@ Build-Depends: autoconf,
                libxosd-dev,
                pkg-config,
                texinfo
-Standards-Version: 3.9.2
+Standards-Version: 3.9.3
 Vcs-Browser: http://git.debian.org/?p=collab-maint/lcdproc.git;a=summary
 Vcs-Git: git://git.debian.org/collab-maint/lcdproc.git
 Homepage: http://www.lcdproc.org/
diff --git a/debian/copyright b/debian/copyright
index 6cea444..4b48619 100644
--- a/debian/copyright
+++ b/debian/copyright
@@ -1,4 +1,4 @@
-Format: http://dep.debian.net/deps/dep5/
+Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
 Upstream-Name: lcdproc
 Upstream-Contact: William W. Ferrell 
 Source: http://www.lcdproc.org/

Last but not least, this track-and-display-changes feature is also available for other models, like OpenSsh, lcdproc and multistrap.

All the best
