[Pkg-r-builders] [Semi-or-not-quite-off topic] Debian rebuilds

Dirk Eddelbuettel edd at debian.org
Sun Jun 18 20:46:11 UTC 2017


This is a bit of a brain-dump. I am just abusing the three of you as a
sounding board, and Don in particular with a request for help.

Bear with me, or tune out. You can run the code along with me.


# Overall Issue: R forced new registration for .C() and .Fortran() in R 3.4.0

Source: NEWS in R, quoting

    * Packages which register native routines for .C or .Fortran need
      to be re-installed for this version (unless installed with
      R-devel SVN revision r72375 or later).


# Proposed Solution

Selective rebuilds, via fine-grained analysis below


# Alternate Solution

The consensus on early debian-bugs reports was to declare a new binary tag
for the r-api, enforcing rebuilds of everything that depends on package
r-base-core. I think it is overkill. Read on.


# Tooling

## Overview

The RcppAPT package (at https://github.com/eddelbuettel/rcppapt; CRAN is
behind) allows me to query `apt` to get fine-grained depends.  What follows
is an annotated log of the files in
https://github.com/eddelbuettel/rcppapt/tree/master/inst/scripts/debian-packages 

## Docker

As I don't have Debian unstable on a box, I just refreshed the Docker
container and ran (with my ~/git/ mounted as /mnt in Docker so that we can
install RcppAPT inside Docker)

```bash
## setup:
##   cd ~/git && docker run --rm -ti -v $(pwd):/mnt debian:unstable

apt-get update
apt-get -y dist-upgrade

apt-get -y install r-cran-rcpp r-cran-data.table libapt-pkg-dev less  # takes a moment

cd /mnt 
R CMD INSTALL rcppapt/       # assuming we're above rcppapt
```

Now we have RcppAPT in Docker, along with R and some helper packages.

## R Analysis -- Step One

```r
library(RcppAPT)

rd <- reverseDepends("r-base-core")     # 514 x 2
rd <- rd[grepl("^r-", rd[,1]), ]        # 487 x 2
```

This gets us 487 plausible candidate packages. The net is too wide in that it
includes r-doc-*, and too narrow in that it excludes other binaries built
against R.


## R Analysis -- Step Two

This contains some code I wrote weeks ago; the package may now be smarter (i.e.
return results with `stringsAsFactors=FALSE` and query faster).  That is for
another time. This works:


```r
rd[,1] <- as.character(rd[,1])
rd[,2] <- as.character(rd[,2])

rd <- rd[order(rd[,2]), ]

library(data.table)
setDT(rd)

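## special treatment: rewrite two Debian versions that package_version() would reject
rd[ version=="3.0.0~20130330-1", version := "3.0.0.20130330-1"]
rd[ version=="3.2.4-revised-1", version := "3.2.4.1-1"]

## flag versioned dependencies at or below 3.3.3-1, i.e. predating R 3.4.0
rd[version!="", oldVersion := version  <=  package_version("3.3.3-1")]
rd[ is.na(oldVersion), oldVersion := FALSE]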
```
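
The two hand-edited versions above are needed because R's `package_version()` does not accept Debian-specific syntax such as `~`. As a hedged aside, and not what the scripts here do, the same cut-off could be checked with `dpkg --compare-versions`, which understands Debian version strings directly. This sketch assumes `dpkg` is on the path (as it is in the Docker container); the helper name `classify` and the `3.4.0-1` sample version are made up for illustration.

```shell
# Sketch: classify r-base-core dependency versions against the 3.3.3-1 cut-off
# using dpkg rather than R's package_version(). Assumes dpkg is installed.
cutoff="3.3.3-1"

# Prints "old" if the version sorts at or before the cut-off, else "new"
classify() {
    if dpkg --compare-versions "$1" le "$cutoff"; then
        echo "old"
    else
        echo "new"
    fi
}

# the two special cases from the R code, plus a made-up post-3.4.0 version
for v in "3.0.0~20130330-1" "3.2.4-revised-1" "3.4.0-1"; do
    printf '%s %s\n' "$v" "$(classify "$v")"
done
```

Debian sorts `~` before everything else, so to dpkg `3.0.0~20130330-1` is older than `3.0.0-1`, whereas the rewritten `3.0.0.20130330-1` is newer than it; for the 3.3.3-1 cut-off both land on the same side.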

The main crux is this loop. I think I rewrote `getDepends()` to return a full
data.table from one regexp, but that usage may only be on my laptop.  The loop
as it stands is inefficient and takes a few minutes, but it works:

```r
rd[ version=="", skip:=TRUE ]
rd[ is.na(skip), skip:=FALSE]
n <- nrow(rd)
for (i in 1:n) {                        # takes a few minutes too ...
    #print(rd[i,])
    txt <- paste0(rd[i, package], "$")
    if (!rd[i,skip]) {
        dep <- getDepends(txt)
        if (nrow(dep) > 0) rd[i, isCompiled:="libc6" %in% as.character(dep[,"deppkg"])]
        #print(rd[i,])
    }
}
```
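
The loop rests on one heuristic: a binary package that depends on `libc6` ships compiled code. A minimal shell sketch of that test follows. It reads a dependency listing from standard input; the two sample listings are hand-written to resemble `apt-cache depends` output and are not copied from a real run, and the names `is_compiled` and `report` are made up for this illustration.

```shell
# Sketch: succeed (exit 0) if the listing on stdin names libc6 as a Depends entry
is_compiled() {
    grep -E -q '^[ |]*Depends: libc6$'
}

# hand-written sample listings, not real apt output
compiled_pkg='r-cran-foo
  Depends: r-base-core
  Depends: libc6
  Suggests: r-cran-bar'

pure_r_pkg='r-cran-baz
  Depends: r-base-core
  Recommends: r-cran-foo'

# Print one line per package saying which side of the heuristic it falls on
report() {
    if printf '%s\n' "$2" | is_compiled; then
        echo "$1: compiled"
    else
        echo "$1: R only"
    fi
}

report r-cran-foo "$compiled_pkg"
report r-cran-baz "$pure_r_pkg"
```

On a Debian system the listing could instead be piped in from `apt-cache depends <package>`.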

We are now down to 205 possible packages which a) are based on compiled code
(else there is no chance of them having `.C()` or `.Fortran()`) and b) have not
yet been rebuilt by current R:

```r
print(rd[ oldVersion==TRUE & skip!=TRUE & isCompiled , ], topn=205)   # topn is the full argument name
```

We then mix this set of 205 with actual CRAN data:

```r
rd[, cran:=grepl("^r-cran", package) ]
rd[ oldVersion==TRUE & skip!=TRUE & isCompiled & cran, ]

cand <- rd[ oldVersion==TRUE & skip!=TRUE & isCompiled & cran, ]   # 151
setkey(cand, package)

db <- tools::CRAN_package_db()
setDT(db)
db[, package:=paste0("r-cran-", tolower(Package))]
setkey(db, package)


foo <- db[ cand ]   # join keeping every row of cand (NA where CRAN has no match)
foo[, .(package, Package, Version, NeedsCompilation, oldVersion, skip)]

saveRDS(foo[, .(package, Package, Version, NeedsCompilation, oldVersion, skip)], file="debpackages.rds")
```

## R Analysis -- Step Three

The rds file saved here I copied via scp to a box with a full CRAN archive
unpacked ("open") and ran this loop:

```r
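## on another machine, run from the directory holding the unpacked package sources
deb <- readRDS("~/debpackages.rds")
for (i in 1:nrow(deb)) {
   ## NA if there is no CRAN match, else TRUE when egrep finds .C( or .Fortran( under <Package>/R/
   deb[i, "dotCorFortran"] <- if (is.na(deb[i, "Package"])) NA else system(paste0("egrep -r -q \"\\.(C|Fortran)\\(\" ", deb[i, "Package"], "/R/*"))==0
}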
saveRDS(deb, "~/debpackagesout.rds")
```
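
To make the egrep test concrete, here is a small self-contained sketch that runs the same pattern over a scratch tree. The three package directories and their contents are invented for illustration, and `uses_dotc` is a made-up helper name; only the regular expression comes from the loop above.

```shell
# Sketch: build three toy "packages" in a temp dir and apply the pattern to each
scratch=$(mktemp -d)
mkdir -p "$scratch/usesC/R" "$scratch/pureR/R" "$scratch/commentOnly/R"
printf 'f <- function(x) .C("c_f", as.double(x))\n' > "$scratch/usesC/R/f.R"
printf 'g <- function(x) .Call("c_g", x)\n' > "$scratch/pureR/R/g.R"
printf '# formerly called .Fortran("old")\nh <- function(x) x\n' > "$scratch/commentOnly/R/h.R"

# Succeeds (exit 0) if any file under <pkg>/R contains ".C(" or ".Fortran("
uses_dotc() {
    grep -E -r -q '\.(C|Fortran)\(' "$1/R"
}

result=""
for pkg in usesC pureR commentOnly; do
    if uses_dotc "$scratch/$pkg"; then
        result="$result $pkg=yes"
    else
        result="$result $pkg=no"
    fi
done
echo "$result"
rm -rf "$scratch"
```

The pattern does not match `.Call(`, but it does match a mention inside a comment, so a plain text search like this can produce false positives.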

## R Analysis -- Step Four

Back on the initial machine "here":

```r
deb <- readRDS("debpackagesout.rds")
setDT(deb)
deb[ is.na(deb[, dotCorFortran]) | deb[, dotCorFortran]==TRUE, 1:3]   ## 75
```

We are now down to a set of 75 packages, down from 487.  One is a known false
positive (r-cran-int64, which I have here: .C and .Fortran appear only in
comments) but I can still rebuild it. The list is below (and also in GitHub).

Another I just rebuilt (r-cran-rsprng).

That leaves around 74 or so to submit as requests for binary NMUs.

Don:  Good idea or not?

Dirk


                     package           Package  Version
 1:              r-cran-ade4              ade4    1.7-6
 2:          r-cran-adegenet          adegenet    2.0.1
 3:          r-cran-adephylo          adephylo   1.1-10
 4:            r-cran-amelia            Amelia    1.7.4
 5:               r-cran-ape               ape      4.1
 6:            r-cran-bayesm            bayesm    3.0-2
 7:            r-cran-bitops            bitops    1.0-6
 8:     r-cran-blockmodeling     blockmodeling    0.1.8
 9:           r-cran-boolnet           BoolNet    2.1.3
10:             r-cran-brglm             brglm    0.5-9
11:             r-cran-caret             caret   6.0-76
12:           r-cran-catools           caTools   1.17.1
13:            r-cran-cmprsk            cmprsk    2.2-7
14:              r-cran-coin              coin    1.1-3
15:          r-cran-contfrac          contfrac   1.1-10
16:        r-cran-data.table        data.table   1.10.4
17:              r-cran-deal              deal   1.2-37
18:            r-cran-deldir            deldir   0.1-14
19:           r-cran-desolve           deSolve     1.14
20:       r-cran-dosefinding       DoseFinding   0.9-15
21:               r-cran-eco               eco    3.1-7
22:               r-cran-erm               eRm   0.15-7
23:               r-cran-etm               etm    0.6-2
24:               r-cran-evd               evd    2.3-2
25:              r-cran-expm              expm  0.999-2
26:            r-cran-fields            fields      9.0
27:               r-cran-gam               gam   1.14-4
28:           r-cran-genabel           GenABEL    1.8-0
29:            r-cran-glmnet            glmnet   2.0-10
30:           r-cran-goftest           goftest    1.1-1
31:               r-cran-gsl               gsl 1.9-10.3
32:       r-cran-haplo.stats       haplo.stats    1.7.7
33:              r-cran-hdf5                NA       NA
34:            r-cran-hexbin            hexbin   1.27.1
35:            r-cran-igraph            igraph    1.0.1
36:             r-cran-int64                NA       NA
37:               r-cran-lhs               lhs     0.14
38:         r-cran-logspline         logspline    2.1.9
39:           r-cran-mapproj           mapproj    1.2-5
40:              r-cran-maps              maps    3.2.0
41:          r-cran-maptools          maptools    0.9-2
42:              r-cran-mcmc              mcmc    0.9-5
43:          r-cran-mcmcpack          MCMCpack    1.4-0
44:      r-cran-medadherence                NA       NA
45:          r-cran-mixtools          mixtools    1.1.0
46:           r-cran-mlbench           mlbench    2.1-1
47:               r-cran-mnp               MNP    3.0-1
48:               r-cran-msm               msm    1.6.4
49:             r-cran-ncdf4             ncdf4     1.16
50:              r-cran-nnls              nnls      1.4
51:          r-cran-pbivnorm          pbivnorm    0.6.0
52:          r-cran-phangorn          phangorn    2.2.0
53:         r-cran-phylobase         phylobase    0.8.4
54:           r-cran-polycub           polyCub    0.6.0
55:         r-cran-princurve         princurve   1.1-12
56:              r-cran-pscl              pscl    1.4.9
57:               r-cran-qtl               qtl   1.41-6
58:      r-cran-randomfields      RandomFields   3.1.50
59: r-cran-randomfieldsutils RandomFieldsUtils   0.3.25
60:      r-cran-randomforest      randomForest   4.6-12
61:      r-cran-raschsampler      RaschSampler    0.8-8
62:             r-cran-rcurl             RCurl 1.95-4.8
63:         r-cran-rniftilib                NA       NA
64:            r-cran-rsprng                NA       NA
65:                r-cran-sp                sp    1.2-4
66:              r-cran-spam              spam    1.4-0
67:          r-cran-spatstat          spatstat   1.51-0
68:               r-cran-spc               spc    0.5.3
69:             r-cran-spdep             spdep   0.6-13
70:      r-cran-surveillance      surveillance   1.13.1
71:               r-cran-tgp               tgp   2.4-14
72:        r-cran-tikzdevice        tikzDevice   0.10-1
73:         r-cran-treescape                NA       NA
74:             r-cran-vegan             vegan    2.4-3
75:              r-cran-vgam              VGAM    1.0-3
                     package           Package  Version

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org


