[Pkg-dkms-maint] Bug#842596: dkms fails on package upgrades
Johannes Kneer
johannes.kneer at gmail.com
Sun Oct 30 17:01:44 UTC 2016
Package: dkms
Version: 2.3-1
Severity: important
Tags: upstream
Dear Maintainer,
for years (literally!) I've had problems with package updates using dkms. They
would fail with an error message that did not seem quite right, but I could
always solve the problem by _manually_ removing the offending module from
/var/lib/dkms, as "dkms remove" would also fail...
I've had a look at it, but I'm still not quite sure what exactly goes wrong or
where. Below is what I found.
I'm using virtualbox-dkms as an example, but I have the same problem with
nvidia-kernel-dkms. The problems seem to be related to either the dkms script
itself or the common.postinst script, both of which are from the dkms package.
Summary:
- 'dkms remove -m module -v version --all' does not seem to work, whereas
giving the explicit kernel does? I'm not sure if this is another symptom or the
root cause.
- Also, the error message on trying to install sometimes refers to a _build_
already existing, which seems to be another symptom.
Example (after manually removing all of dkms, virtualbox, cleaning /usr/src,
/var/lib/dkms)
------------------
>> aptitude install virtualbox
> The following NEW packages will be installed:
> libgsoap10{a} virtualbox virtualbox-dkms{a} virtualbox-qt{a}
[everything works the 1st time]
Let's reconfigure to simulate an update:
>> dpkg-reconfigure virtualbox-dkms
> Removing old virtualbox-5.1.8 DKMS files...
> Loading new virtualbox-5.1.8 DKMS files...
> Error! DKMS tree already contains: virtualbox-5.1.8
> You cannot add the same module/version combo more than once.
Weird, and certainly not what I'd expect. Let's purge the packages:
>> aptitude purge virtualbox virtualbox-dkms
> The following packages will be REMOVED:
> libgsoap10{u} virtualbox{p} virtualbox-dkms{pu} virtualbox-qt{u}
> ...
> Removing virtualbox-dkms (5.1.8-dfsg-6) ...
> ...
Let's reinstall:
>> aptitude install virtualbox
> Setting up virtualbox-dkms (5.1.8-dfsg-6) ...
> Removing old virtualbox-5.1.8 DKMS files...
> Loading new virtualbox-5.1.8 DKMS files...
> Error! DKMS tree already contains: virtualbox-5.1.8
> You cannot add the same module/version combo more than once.
Let's try dkms manually to learn what fails:
(*)
>> dkms install virtualbox/5.1.8 -k 4.7.0-1-amd64
> Error! This module/version has already been built on: 4.7.0-1-amd64
> Directory: /var/lib/dkms/virtualbox/5.1.8/4.7.0-1-amd64/
> already exists. Use the dkms remove function before trying to build again.
That is weird; I tried to _install_ and the error message tells me it has
already been _built_? I'll look at that later; let's first follow along and try
to remove it and then try to finish the installation:
>> apt-get -f install
> ...
> Setting up virtualbox-dkms (5.1.8-dfsg-6) ...
> Loading new virtualbox-5.1.8 DKMS files...
> Building for 4.7.0-1-amd64
> Building initial module for 4.7.0-1-amd64
> Done.
> ...
Fine? Can I reconfigure now?
>> dpkg-reconfigure virtualbox-dkms
> Removing old virtualbox-5.1.8 DKMS files...
> Loading new virtualbox-5.1.8 DKMS files...
> Error! DKMS tree already contains: virtualbox-5.1.8
> You cannot add the same module/version combo more than once.
No. So updates will still fail. Why oh why? What is the package doing on
installation/configuration?
>> cd /tmp; dpkg -e /var/cache/apt/archives/virtualbox-dkms_5.1.8-dfsg-6_all.deb; cat DEBIAN/postinst
> ...
> DKMS_NAME=virtualbox
> DKMS_PACKAGE_NAME=$DKMS_NAME-dkms
> DKMS_VERSION=5.1.8
> ...
> for DKMS_POSTINST in /usr/lib/dkms/common.postinst /usr/share/$DKMS_PACKAGE_NAME/postinst; do
>     if [ -f $DKMS_POSTINST ]; then
>         $DKMS_POSTINST $DKMS_NAME $DKMS_VERSION /usr/share/$DKMS_PACKAGE_NAME "" $2
>         postinst_found=1
>         break
>     fi
> done
> ...
So '/usr/lib/dkms/common.postinst virtualbox 5.1.8 /usr/share/virtualbox-dkms
"" $2' is called.
The empty string is the architecture. common.postinst ultimately calls 'dkms
add -m virtualbox -v 5.1.8 > /dev/null' which fails with the error above:
> Removing old virtualbox-5.1.8 DKMS files...
> Loading new virtualbox-5.1.8 DKMS files...
> Error! DKMS tree already contains: virtualbox-5.1.8
> You cannot add the same module/version combo more than once.
Why should 'adding' fail after the exact module was removed? Let's have a look
at /usr/sbin/dkms. The error indicates that in 'add_module()' we run into:
> # Check that this module-version hasn't already been added
> if is_module_added "$module" "$module_version"; then
>     die 3 $"DKMS tree already contains: $module-$module_version" \
>     $"You cannot add the same module/version combo more than once."
> fi
> ...
> is_module_added() {
>     [[ $1 && $2 ]] || return 1
>     [[ -d $dkms_tree/$1/$2 ]] || return 2
>     [[ -L $dkms_tree/$1/$2/source || -d $dkms_tree/$1/$2/source ]];
> }
Looks ok. So dkms remove did not work?:
>> dkms remove -m virtualbox -v 5.1.8 --all
>> dkms status
> virtualbox, 5.1.8: added
Should it be this way? Maybe, but the module should still have been removed
from all kernels, right?:
>> ls -la /var/lib/dkms/virtualbox/5.1.8
> total 12
> drwxr-xr-x 3 root root 4096 Oct 30 17:05 .
> drwxr-xr-x 3 root root 4096 Oct 30 17:05 ..
> drwxr-xr-x 4 root root 4096 Oct 30 17:05 4.7.0-1-amd64
> lrwxrwxrwx 1 root root 25 Oct 30 17:05 source -> /usr/src/virtualbox-5.1.8
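The listing above is consistent with the "already contains" error. The following is a minimal sketch, assuming a scratch directory in place of /var/lib/dkms and a POSIX-sh paraphrase of the quoted is_module_added (the original uses bash's [[ ]]). As long as the version directory and its 'source' link survive, the check succeeds, and 'dkms add' would presumably refuse.

```shell
#!/bin/sh
# POSIX-sh paraphrase of the is_module_added excerpt quoted earlier.
# dkms_tree must point at the tree to inspect (a scratch dir here).
is_module_added() {
    [ -n "$1" ] && [ -n "$2" ] || return 1
    [ -d "$dkms_tree/$1/$2" ] || return 2
    [ -L "$dkms_tree/$1/$2/source" ] || [ -d "$dkms_tree/$1/$2/source" ]
}

# Rebuild the layout from the ls output in a throwaway directory.
dkms_tree=$(mktemp -d)
mkdir -p "$dkms_tree/virtualbox/5.1.8/4.7.0-1-amd64"
ln -s /usr/src/virtualbox-5.1.8 "$dkms_tree/virtualbox/5.1.8/source"

if is_module_added virtualbox 5.1.8; then
    echo "still counts as added"
fi

# Only once the version directory is gone does the check fail.
rm -rf "$dkms_tree/virtualbox/5.1.8"
is_module_added virtualbox 5.1.8 || echo "no longer added (status $?)"

rm -rf "$dkms_tree"
```

Note that the link target does not need to exist for the check to pass; the -L test is satisfied by a dangling link as well.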
Shouldn't this dir have been removed? Let's be more explicit than before (and
than the script):
>> dkms remove -m virtualbox -v 5.1.8 -k 4.7.0-1-amd64
>
> -------- Uninstall Beginning --------
> Module: virtualbox
> Version: 5.1.8
> Kernel: 4.7.0-1-amd64 ()
> -------------------------------------
>
> Status: This module version was INACTIVE for this kernel.
> depmod...
>
> DKMS: uninstall completed.
>
> ------------------------------
> Deleting module version: 5.1.8
> completely from the DKMS tree.
> ------------------------------
> Done.
Ok, now it is gone. Does --all not work? First test reconfiguring:
>> dpkg-reconfigure virtualbox-dkms
> Loading new virtualbox-5.1.8 DKMS files...
> Building for 4.7.0-1-amd64
> Building initial module for 4.7.0-1-amd64
> Done.
Works. What is wrong with --all?
I'm no expert in bash and I'm actually not quite sure how 'all' is handled, but
it doesn't seem to work.
So this is one suspect; I'm still wondering how I can be the only one having
this problem with dkms.
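Since removal with an explicit kernel worked above while --all did not, one possible stopgap is to remove per kernel. This is a minimal sketch, assuming the tree layout seen in the ls output (one subdirectory per kernel under module/version, plus the 'source' link). The function name list_remove_commands is made up here, treating 'source' and 'build' as the only non-kernel entries is an assumption, and the function only prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helper, not part of dkms: print one 'dkms remove' command
# per kernel directory found under <tree>/<module>/<version>.
# Nothing is executed; review the output before running any of it.
list_remove_commands() {
    tree=$1 module=$2 version=$3
    for dir in "$tree/$module/$version"/*/; do
        [ -d "$dir" ] || continue       # glob matched nothing
        kernel=$(basename "$dir")
        # Assumption: these two entries are not kernel versions.
        case $kernel in source|build) continue ;; esac
        echo "dkms remove -m $module -v $version -k $kernel"
    done
}

list_remove_commands /var/lib/dkms virtualbox 5.1.8
```

Printing instead of executing keeps the sketch harmless on a system whose tree looks different from the one shown in this report.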
During the example above (*) I had the error message concerning the building
stage, when the script actually was called to install.
In /usr/sbin/dkms, install_module() first checks whether the module has been
built, a prerequisite for the installation. The error message indicates that
the build check 'is_module_built' returns non-zero (i.e. "not built") when it
shouldn't, so 'build_module' is called where it shouldn't be.
Excerpts from /usr/sbin/dkms:
> is_module_built "$module" "$module_version" "$kernelver" "$arch" || build_module
The 'prepare_build' function called by 'build_module' checks whether the module
has already been built. If so, it dies with the error mentioned above:
> # Check that the module has not already been built for this kernel
> [[ -d $base_dir ]] && die 3 \
>     $"This module/version has already been built on: $kernelver" \
>     $"Directory: $base_dir" \
>     $"already exists. Use the dkms remove function before trying to build again."
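Put together, the two excerpts above suggest the following control flow. This is a minimal sketch with stub functions only: check_built, build_stub and try_install are made-up names, the status of the built check is injected by hand, and a temporary directory stands in for $base_dir.

```shell
#!/bin/sh
# Stubs only: none of these functions are dkms code.
check_built() { return "$built_status"; }   # stands in for is_module_built

build_stub() {                               # stands in for build_module/prepare_build
    if [ -d "$base_dir" ]; then
        echo "Error! already built, directory exists: $base_dir"
        return 3
    fi
    echo "building into $base_dir"
}

try_install() {
    # Same shape as the quoted line: build only if the check is non-zero.
    check_built || build_stub
}

base_dir=$(mktemp -d)      # an already existing build directory

built_status=0
try_install                # check succeeds: the build step is skipped

built_status=1
try_install || echo "try_install failed with status $?"

rmdir "$base_dir"
```

In this model the "already been built" error can only appear when the check reports "not built" while the directory is nevertheless present.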
So is_module_built fails?
> is_module_built() {
>     [[ $1 && $2 && $3 && $4 ]] || return 1
>     local d="$dkms_tree/$1/$2/$3/$4" m=''
>     [[ -d $d/module ]] || return 1
>     read_conf_or_die "$3" "$4" "$dkms_tree/$1/$2/source/dkms.conf"
>     for m in "${dest_module_name[@]}"; do
>         [[ -f $d/module/$m.ko || -f $d/module/$m.o ]] || return 1
>     done
> }
Hmm, it requires 4 arguments. Afaik $arch ($4) is empty: the architecture is
not set when calling dkms, and I didn't find a line that would set it...
Does is_module_built() return 1 (i.e. "not built") because the architecture is
not set?
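The guard on the first line of is_module_built can be tried in isolation. This is a minimal sketch using a POSIX-sh paraphrase of that one line (the original uses bash's [[ ]]); the function name has_four_args is hypothetical. Whether $arch really is empty at that point in dkms is not confirmed here; the sketch only shows what the guard would do if it were.

```shell
#!/bin/sh
# Hypothetical stand-in for the first line of is_module_built:
# succeed only if all four arguments are non-empty.
has_four_args() {
    [ -n "$1" ] && [ -n "$2" ] && [ -n "$3" ] && [ -n "$4" ] || return 1
}

arch=""    # assumption: the architecture was never set
if has_four_args virtualbox 5.1.8 4.7.0-1-amd64 "$arch"; then
    echo "guard passed"
else
    echo "guard failed with status $?"
fi
```

A status of 1 means "not built" to the caller, so with an empty architecture the '|| build_module' branch would be taken even if the module files exist.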
I'm not sure if what I see is the symptom of one or multiple bugs or if my
system configuration may be the culprit (even though I did not find any
indication of that). Any help or fix of the issue is appreciated, thank you in
advance.
Johannes
-- System Information:
Debian Release: stretch/sid
Architecture: amd64 (x86_64)
Foreign Architectures: i386
Kernel: Linux 4.7.0-1-amd64 (SMP w/4 CPU cores)
Locale: LANG=nds_DE.UTF-8, LC_CTYPE=nds_DE.UTF-8 (charmap=ANSI_X3.4-1968)
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
Versions of packages dkms depends on:
ii build-essential 12.2
ii coreutils 8.25-2
ii dpkg-dev 1.18.10
ii gcc 4:6.1.1-1
ii kmod 22-1.1
ii make 4.1-9
ii patch 2.7.5-1
Versions of packages dkms recommends:
ii fakeroot 1.21-2
ii linux-headers-amd64 4.7+75
ii sudo 1.8.17p1-2
Versions of packages dkms suggests:
ii menu 2.1.47
ii python3-apport 2.17.2-1
-- debconf information:
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = "en_US:de",
LC_ALL = (unset),
LANG = "nds_DE.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory