[Pkg-dkms-maint] Bug#842596: dkms fails on package upgrades

Johannes Kneer johannes.kneer at gmail.com
Sun Oct 30 17:01:44 UTC 2016


Package: dkms
Version: 2.3-1
Severity: important
Tags: upstream

Dear Maintainer,

for years (literally!) I've had problems with package updates using dkms. They
would fail with an error message that did not seem quite right, but I could
always solve the problem by manually removing the offending module from
/var/lib/dkms, as "dkms remove" would also fail...

I've had a look at it, but I'm still not quite sure what exactly goes wrong or
where. Below is what I found.
I'm using virtualbox-dkms as an example, but I have the same problem with
nvidia-kernel-dkms. The problems seem to be related to either the dkms script
itself or the common.postinst script, both of which are from the dkms package.

Summary:
- 'dkms remove -m module -v version --all' does not seem to work, whereas
giving the explicit kernel does. I'm not sure if this is another symptom or the
root cause.
- Also, the error message on trying to install sometimes refers to an already
existing _build_, which seems to be another symptom.


Example (after manually removing all of dkms, virtualbox, cleaning /usr/src,
/var/lib/dkms)
------------------
>> aptitude install virtualbox
> The following NEW packages will be installed:
>   libgsoap10{a} virtualbox virtualbox-dkms{a} virtualbox-qt{a}

[everything works the 1st time]

Let's reconfigure to simulate an update:

>> dpkg-reconfigure virtualbox-dkms
> Removing old virtualbox-5.1.8 DKMS files...
> Loading new virtualbox-5.1.8 DKMS files...
> Error! DKMS tree already contains: virtualbox-5.1.8
> You cannot add the same module/version combo more than once.

Weird, certainly not what I'd expect. Let's purge the packages:

>> aptitude purge virtualbox virtualbox-dkms
>The following packages will be REMOVED:
>  libgsoap10{u} virtualbox{p} virtualbox-dkms{pu} virtualbox-qt{u}
> ...
> Removing virtualbox-dkms (5.1.8-dfsg-6) ...
> ...

Let's reinstall:

>> aptitude install virtualbox
> Setting up virtualbox-dkms (5.1.8-dfsg-6) ...
> Removing old virtualbox-5.1.8 DKMS files...
> Loading new virtualbox-5.1.8 DKMS files...
> Error! DKMS tree already contains: virtualbox-5.1.8
> You cannot add the same module/version combo more than once.

Let's try dkms manually to learn what fails:


(*)

>> dkms install virtualbox/5.1.8 -k 4.7.0-1-amd64
> Error! This module/version has already been built on: 4.7.0-1-amd64
> Directory: /var/lib/dkms/virtualbox/5.1.8/4.7.0-1-amd64/
> already exists.  Use the dkms remove function before trying to build again.

That is weird; I tried to _install_ and the error message tells me it has
already been _built_? I'll look at that later; let's first follow along, try to
remove it and then try to finish the installation:

>> apt-get -f install
> ...
> Setting up virtualbox-dkms (5.1.8-dfsg-6) ...
> Loading new virtualbox-5.1.8 DKMS files...
> Building for 4.7.0-1-amd64
> Building initial module for 4.7.0-1-amd64
> Done.
> ...

Fine? Can I reconfigure now?

>> dpkg-reconfigure virtualbox-dkms
> Removing old virtualbox-5.1.8 DKMS files...
> Loading new virtualbox-5.1.8 DKMS files...
> Error! DKMS tree already contains: virtualbox-5.1.8
> You cannot add the same module/version combo more than once.

No. So updates will still fail. Why oh why? What is the package doing on
installation/configuration?

>> cd /tmp; dpkg -e /var/cache/apt/archives/virtualbox-dkms_5.1.8-dfsg-6_all.deb; cat DEBIAN/postinst
> ...
> DKMS_NAME=virtualbox
> DKMS_PACKAGE_NAME=$DKMS_NAME-dkms
> DKMS_VERSION=5.1.8
> ...
>               for DKMS_POSTINST in /usr/lib/dkms/common.postinst /usr/share/$DKMS_PACKAGE_NAME/postinst; do
>                       if [ -f $DKMS_POSTINST ]; then
>                               $DKMS_POSTINST $DKMS_NAME $DKMS_VERSION /usr/share/$DKMS_PACKAGE_NAME "" $2
>                               postinst_found=1
>                               break
>                       fi
>               done
> ...

So '/usr/lib/dkms/common.postinst virtualbox 5.1.8 /usr/share/virtualbox-dkms
"" $2' is called.
The empty string is the architecture. common.postinst ultimately calls 'dkms
add -m virtualbox -v 5.1.8 > /dev/null' which fails with the error above:
> Removing old virtualbox-5.1.8 DKMS files...
> Loading new virtualbox-5.1.8 DKMS files...
> Error! DKMS tree already contains: virtualbox-5.1.8
> You cannot add the same module/version combo more than once.

Why should 'adding' fail after the exact module was removed? Let's have a look
at /usr/sbin/dkms. The error indicates that in 'add_module()' we run into:
>    # Check that this module-version hasn't already been added
>    if is_module_added "$module" "$module_version"; then
>        die 3 $"DKMS tree already contains: $module-$module_version" \
>            $"You cannot add the same module/version combo more than once."
>    fi
> ...
> is_module_added() {
>     [[ $1 && $2 ]] || return 1
>     [[ -d $dkms_tree/$1/$2 ]] || return 2
>     [[ -L $dkms_tree/$1/$2/source || -d $dkms_tree/$1/$2/source ]];
> }
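
The check can be tried in isolation. The following is a minimal sketch, not the
dkms code itself: the function is reworded with POSIX 'test' so it runs under
any sh, and dkms_tree points at a throwaway directory (an assumption for
illustration) instead of /var/lib/dkms. It suggests that a leftover 'source'
link alone is enough for the module to count as "added":

```shell
# POSIX rewording of the is_module_added check quoted above.
is_module_added() {
    [ -n "$1" ] && [ -n "$2" ] || return 1
    [ -d "$dkms_tree/$1/$2" ] || return 2
    [ -L "$dkms_tree/$1/$2/source" ] || [ -d "$dkms_tree/$1/$2/source" ]
}

# Hypothetical scratch tree, so nothing under /var/lib/dkms is touched.
dkms_tree=$(mktemp -d)
mkdir -p "$dkms_tree/virtualbox/5.1.8/4.7.0-1-amd64"
ln -s /usr/src/virtualbox-5.1.8 "$dkms_tree/virtualbox/5.1.8/source"

# State as left behind after the failed removal: 'source' link still present.
added_with_link=0
is_module_added virtualbox 5.1.8 || added_with_link=$?

# Same tree with the 'source' link removed.
rm "$dkms_tree/virtualbox/5.1.8/source"
added_without_link=0
is_module_added virtualbox 5.1.8 || added_without_link=$?

echo "with source link: $added_with_link, without: $added_without_link"
rm -rf "$dkms_tree"
```

A return value of 0 means "already added", so as long as the 'source' link
survives the removal step, 'dkms add' will refuse to run.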

Looks ok. So dkms remove did not work?:
>> dkms remove -m virtualbox -v 5.1.8 --all
>> dkms status
> virtualbox, 5.1.8: added

Should it be this way? Maybe. Still, the module should be removed from all
kernels, right?:
>> ls -la /var/lib/dkms/virtualbox/5.1.8
> total 12
> drwxr-xr-x 3 root root 4096 Oct 30 17:05 .
> drwxr-xr-x 3 root root 4096 Oct 30 17:05 ..
> drwxr-xr-x 4 root root 4096 Oct 30 17:05 4.7.0-1-amd64
> lrwxrwxrwx 1 root root   25 Oct 30 17:05 source -> /usr/src/virtualbox-5.1.8

Shouldn't this dir have been removed? Let's be more explicit than before (and
than the script):
>> dkms remove -m virtualbox -v 5.1.8 -k 4.7.0-1-amd64
>
> -------- Uninstall Beginning --------
> Module:  virtualbox
> Version: 5.1.8
> Kernel:  4.7.0-1-amd64 ()
> -------------------------------------
>
> Status: This module version was INACTIVE for this kernel.
> depmod...
>
> DKMS: uninstall completed.
>
> ------------------------------
> Deleting module version: 5.1.8
> completely from the DKMS tree.
> ------------------------------
> Done.

Ok, now it is gone. Does --all not work? First test reconfiguring:
>> dpkg-reconfigure virtualbox-dkms
> Loading new virtualbox-5.1.8 DKMS files...
> Building for 4.7.0-1-amd64
> Building initial module for 4.7.0-1-amd64
> Done.

Works. What is wrong with --all?
I'm no expert in bash and I'm actually not quite sure how 'all' is handled, but
it doesn't seem to work.
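
To make the leftover state easier to spot, a small helper can list the
per-kernel directories that are still present for a module/version. This is
only a sketch: the function name leftover_kernel_dirs is made up for this
report, and the demo runs against a scratch directory rather than the real
tree.

```shell
# Print per-kernel build directories still present for a module/version.
# $1 = DKMS tree (normally /var/lib/dkms), $2 = module, $3 = version
leftover_kernel_dirs() {
    for d in "$1/$2/$3"/*/; do
        [ -d "$d" ] || continue        # glob matched nothing
        [ -L "${d%/}" ] && continue    # skip the 'source' symlink
        basename "${d%/}"
    done
}

# Demo on a hypothetical scratch tree that mimics the listing above.
tree=$(mktemp -d)
mkdir -p "$tree/virtualbox/5.1.8/4.7.0-1-amd64"
ln -s /usr/src/virtualbox-5.1.8 "$tree/virtualbox/5.1.8/source"

leftover=$(leftover_kernel_dirs "$tree" virtualbox 5.1.8)
echo "leftover kernel dirs: $leftover"
rm -rf "$tree"
```

On an affected system one would call it as
'leftover_kernel_dirs /var/lib/dkms virtualbox 5.1.8' right after
'dkms remove -m virtualbox -v 5.1.8 --all'; any output means --all left
something behind.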

So this is one suspect; I'm still wondering how I can be the only one having
this problem with dkms.



During the example above (*) I got an error message concerning the build
stage, when the script was actually called to install.
In /usr/sbin/dkms, install_module() first checks whether the module has been
built, a prerequisite for the installation. The error message indicates that
the build check 'is_module_built' returns non-zero ("not built") when it
shouldn't, so 'build_module' is called where it shouldn't be.

Excerpts from /usr/sbin/dkms:
> is_module_built "$module" "$module_version" "$kernelver" "$arch" || build_module

The 'prepare_build' function called by 'build_module' checks whether the module
had already been built. If so, it dies with the error mentioned above:
>    # Check that the module has not already been built for this kernel
>    [[ -d $base_dir ]] && die 3 \
>        $"This module/version has already been built on: $kernelver" \
>        $"Directory: $base_dir" \
>        $"already exists.  Use the dkms remove function before trying to build again."
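
The control flow of these two excerpts can be reproduced with stubs. This is a
sketch under stated assumptions: is_module_built is replaced by a stub that
always reports "not built", 'die 3' is simplified to 'return 3', and base_dir
is a scratch directory standing in for the per-kernel directory under
/var/lib/dkms.

```shell
# Scratch directory standing in for the existing per-kernel build dir.
base_dir=$(mktemp -d)

# Stub (assumption): pretend the check reports "not built".
is_module_built() { return 1; }

# Same idea as the quoted check: refuse if the build dir already exists.
prepare_build() {
    if [ -d "$base_dir" ]; then
        echo "already built: $base_dir"
        return 3
    fi
}

build_module() { prepare_build; }

# The install path: build only if the check says the module is not built.
rc=0
msg=$(is_module_built || build_module) || rc=$?
echo "rc=$rc msg=$msg"
rm -rf "$base_dir"
```

With the directory present and the check failing, the "install" path ends in
the "already built" error with status 3, which matches what I saw at (*).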

So is_module_built fails?

> is_module_built() {
>    [[ $1 && $2 && $3 && $4 ]] || return 1
>    local d="$dkms_tree/$1/$2/$3/$4" m=''
>    [[ -d $d/module ]] || return 1
>    read_conf_or_die "$3" "$4" "$dkms_tree/$1/$2/source/dkms.conf"
>    for m in "${dest_module_name[@]}"; do
>        [[ -f $d/module/$m.ko || -f $d/module/$m.o ]] || return 1
>    done
> }

Mhh, it requires 4 arguments. Afaik $arch ($4) is empty: the architecture is
not set when calling dkms, and I didn't find a line that would set it...
Does is_module_built() return 1 ("not built") because $arch is not set?
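
What the first line of is_module_built does with an empty fourth argument can
be checked on its own. A minimal sketch, with the guard reworded in POSIX
'test' syntax and wrapped in a made-up function name (has_four_args); the
"x86_64" value is just an example of a non-empty arch:

```shell
# POSIX rewording of the guard on the first line of is_module_built.
has_four_args() {
    [ -n "$1" ] && [ -n "$2" ] && [ -n "$3" ] && [ -n "$4" ] || return 1
}

# Assumption: arch is empty, as suspected above.
arch=""
rc_empty=0
has_four_args virtualbox 5.1.8 4.7.0-1-amd64 "$arch" || rc_empty=$?

# Same call with an example non-empty arch.
arch="x86_64"
rc_set=0
has_four_args virtualbox 5.1.8 4.7.0-1-amd64 "$arch" || rc_set=$?

echo "empty arch: $rc_empty, arch set: $rc_set"
```

This prints "empty arch: 1, arch set: 0": with an empty $arch the guard alone
makes the function return 1, whatever is on disk.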

I'm not sure if what I see is the symptom of one or multiple bugs or if my
system configuration may be the culprit (even though I did not find any
indication of that). Any help or fix of the issue is appreciated, thank you in
advance.

Johannes




-- System Information:
Debian Release: stretch/sid
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.7.0-1-amd64 (SMP w/4 CPU cores)
Locale: LANG=nds_DE.UTF-8, LC_CTYPE=nds_DE.UTF-8 (charmap=locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
ANSI_X3.4-1968)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages dkms depends on:
ii  build-essential  12.2
ii  coreutils        8.25-2
ii  dpkg-dev         1.18.10
ii  gcc              4:6.1.1-1
ii  kmod             22-1.1
ii  make             4.1-9
ii  patch            2.7.5-1

Versions of packages dkms recommends:
ii  fakeroot             1.21-2
ii  linux-headers-amd64  4.7+75
ii  sudo                 1.8.17p1-2

Versions of packages dkms suggests:
ii  menu            2.1.47
ii  python3-apport  2.17.2-1

-- debconf information:
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
	LANGUAGE = "en_US:de",
	LC_ALL = (unset),
	LANG = "nds_DE.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory


