[Pkg-r-builders] [Semi-or-not-quite-off topic] Debian rebuilds
Dirk Eddelbuettel
edd at debian.org
Sun Jun 18 20:46:11 UTC 2017
This is a bit of a brain-dump. I am just abusing the three of you as a
sounding board, and in particular Don with a request for help.
Bear with me, or tune out. You can run the code along with me.
# Overall Issue: R forced new registration for .C() and .Fortran() in R 3.4.0
Source: NEWS in R, quoting
* Packages which register native routines for .C or .Fortran need
to be re-installed for this version (unless installed with
R-devel SVN revision r72375 or later).
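To see what "registered native routines" means in practice, here is a minimal sketch using base R's `getDLLRegisteredRoutines()`. It uses the `stats` package only because it ships with every R installation; that choice is mine, not part of the analysis below.

```r
## Sketch: inspect which native routines a loaded package has registered.
## "stats" is an arbitrary example package, chosen because it is always available.
loadNamespace("stats")
routines <- getDLLRegisteredRoutines("stats")

## The result groups routines by calling interface; .C and .Fortran are
## the two interfaces affected by the R 3.4.0 change quoted above.
n_c       <- length(routines[[".C"]])
n_fortran <- length(routines[[".Fortran"]])
cat("registered .C routines:      ", n_c, "\n")
cat("registered .Fortran routines:", n_fortran, "\n")
```

A package with zero entries under both interfaces is not affected by the NEWS item, which is the idea behind the selective rebuild.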
# Proposed Solution
Selective rebuilds, via fine-grained analysis below
# Alternate Solution
The consensus on early debian-bugs reports was to declare a new binary tag
for the r-api, enforcing rebuilds of everything that depends on package
r-base-core. I think it is overkill. Read on.
# Tooling
## Overview
The RcppAPT package (at https://github.com/eddelbuettel/rcppapt; CRAN is
behind) allows me to query `apt` to get fine-grained depends. What follows
is an annotated log of the files in
https://github.com/eddelbuettel/rcppapt/tree/master/inst/scripts/debian-packages
## Docker
As I don't have Debian unstable on a box, I just refreshed the Docker
container and ran (with my ~/git/ mounted as /mnt in Docker so that we can
install RcppAPT inside Docker)
```bash
## setup:
## cd ~/git && docker run --rm -ti -v $(pwd):/mnt debian:unstable
apt-get update
apt-get -y dist-upgrade
apt-get -y install r-cran-rcpp r-cran-data.table libapt-pkg-dev less # takes a moment
cd /mnt
R CMD INSTALL rcppapt/ # assuming we're above rcppapt
```
Now we have RcppAPT in Docker, along with R and some helper packages.
## R Analysis -- Step One
```r
library(RcppAPT)
rd <- reverseDepends("r-base-core") # 514 x 2
rd <- rd[grepl("^r-", rd[,1]), ] # 487 x 2
```
This gets us 487 plausible candidate packages. The net is too wide in that it
includes r-doc-*, yet it also misses other binaries built against R.
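The effect of the `^r-` filter can be sketched on a toy data frame. The rows below are a hand-made stand-in for the `reverseDepends()` result, with made-up version strings; only the column names `package` and `version` follow the code in this mail.

```r
## Toy stand-in for reverseDepends("r-base-core"); versions are illustrative only
rd_toy <- data.frame(
    package = c("r-cran-ape", "r-doc-html", "littler", "r-bioc-biobase"),
    version = c("3.3.3-1", "", "3.0.0", "3.4.0-1"),
    stringsAsFactors = FALSE)

## Same filter as above: keep names starting with "r-"
keep <- grepl("^r-", rd_toy$package)
rd_toy[keep, ]

## r-doc-html is kept although it has no compiled code, while littler
## (a binary built against R) is dropped
dropped <- rd_toy$package[!keep]
dropped
```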
## R Analysis -- Step Two
This contains some code I wrote weeks ago; the package may now be smarter (i.e.
it returns `stringsAsFactors=FALSE` and queries faster). For another time. This
works:
```r
rd[,1] <- as.character(rd[,1])
rd[,2] <- as.character(rd[,2])
rd <- rd[order(rd[,2]), ]
library(data.table)
setDT(rd)
## special treatment
rd[ version=="3.0.0~20130330-1", version := "3.0.0.20130330-1"]
rd[ version=="3.2.4-revised-1", version := "3.2.4.1-1"]
rd[version!="", oldVersion := version <= package_version("3.3.3-1")]
rd[ is.na(oldVersion), oldVersion := FALSE]
```
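Why the two "special treatment" lines are needed can be shown with a small sketch: base R's `package_version()` only accepts digits separated by `.` or `-`, so Debian version strings containing `~` or letters do not parse. The vector below is a hand-picked illustration, and `strict = FALSE` is used here only to get `NA` instead of an error.

```r
## Hand-picked version strings: two raw Debian forms, their rewritten
## counterpart for one of them, and two ordinary versions
v <- c("3.2.4-revised-1", "3.0.0~20130330-1", "3.2.4.1-1", "3.3.3-1", "3.4.0-1")

## strict = FALSE turns unparseable strings into NA rather than stopping
parsed <- package_version(v, strict = FALSE)
ok <- !is.na(parsed)

## Compare only the strings that parsed; the rest stay NA
old <- rep(NA, length(v))
old[ok] <- parsed[ok] <= package_version("3.3.3-1")
data.frame(version = v, parsed = ok, oldVersion = old)
```

The first two strings fail to parse, which is why they are rewritten by hand before the comparison.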
The main crux is this loop. I think I rewrote `getDepends()` to return a full
data.table from one regexp, but that version may be on my laptop. The loop as it
stands is inefficient and takes a few minutes, but it works:
```r
rd[ version=="", skip:=TRUE ]
rd[ is.na(skip), skip:=FALSE]
n <- nrow(rd)
for (i in 1:n) {                     # takes a few minutes too ...
    #print(rd[i,])
    txt <- paste0(rd[i, package], "$")
    if (!rd[i,skip]) {
        dep <- getDepends(txt)
        if (nrow(dep) > 0) rd[i, isCompiled:="libc6" %in% as.character(dep[,"deppkg"])]
        #print(rd[i,])
    }
}
```
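The "depends on libc6" test can be sketched without apt. In the block below, `fake_get_depends()` is a hypothetical stand-in for `getDepends()` with invented dependency lists; it only mimics the `deppkg` column used above and is not how RcppAPT works internally.

```r
## Hypothetical stand-in for RcppAPT::getDepends(); the dependency lists are invented
fake_get_depends <- function(regexp) {
    deps <- list("r-cran-ape$"  = c("r-base-core", "libc6"),
                 "r-cran-brew$" = c("r-base-core"))
    found <- deps[[regexp]]
    data.frame(deppkg = if (is.null(found)) character(0) else found,
               stringsAsFactors = FALSE)
}

pkgs <- c("r-cran-ape", "r-cran-brew", "r-cran-nosuchpkg")

## One logical per package: TRUE if libc6 is among the depends,
## NA if the query returned nothing (as in the loop above)
is_compiled <- vapply(pkgs, function(p) {
    dep <- fake_get_depends(paste0(p, "$"))
    if (nrow(dep) > 0) "libc6" %in% dep$deppkg else NA
}, logical(1))
is_compiled
```

A dependency on libc6 is used as a proxy for "contains compiled code"; a package without it cannot call `.C()` or `.Fortran()` into its own shared library.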
We are now down to 205 possible packages which a) are based on compiled code
(else no chance of having `.C()` or `.Fortran()`) and b) have not yet been
rebuilt by current R:
```r
print(rd[ oldVersion==TRUE & skip!=TRUE & isCompiled , ], top=205)
```
We then mix this set of 205 with actual CRAN data:
```r
rd[, cran:=grepl("^r-cran", package) ]
rd[ oldVersion==TRUE & skip!=TRUE & isCompiled & cran, ]
cand <- rd[ oldVersion==TRUE & skip!=TRUE & isCompiled & cran, ] # 151
setkey(cand, package)
db <- tools::CRAN_package_db()
setDT(db)
db[, package:=paste0("r-cran-", tolower(Package))]
setkey(db, package)
foo <- db[ cand ] # join keeping every row of cand (NA where not on CRAN)
foo[, .(package, Package, Version, NeedsCompilation, oldVersion, skip)]
saveRDS(foo[, .(package, Package, Version, NeedsCompilation, oldVersion, skip)], file="debpackages.rds")
```
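The join semantics matter for the final list, so here is a small base-R sketch of the same idea with `merge(..., all.x = TRUE)` in place of the data.table join. The two toy tables are hand-made; the `Rcpp` row and its version are purely illustrative.

```r
## Toy candidate set (Debian names) and toy CRAN db (CRAN names)
cand_toy <- data.frame(package = c("r-cran-ape", "r-cran-hdf5"),
                       stringsAsFactors = FALSE)
db_toy <- data.frame(Package = c("ape", "Rcpp"),
                     Version = c("4.1", "0.12.11"),   # illustrative values
                     stringsAsFactors = FALSE)

## Map CRAN names to Debian names as in the code above
db_toy$package <- paste0("r-cran-", tolower(db_toy$Package))

## Keep every candidate; packages no longer on CRAN get NA columns
joined <- merge(cand_toy, db_toy, by = "package", all.x = TRUE)
joined
```

Candidates that are absent from the CRAN db survive the join with `NA` in `Package` and `Version`, which is why such rows show up in the final list.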
## R Analysis -- Step Three
The rds file saved here I scp'ed to a box with a full CRAN archive "open" and
ran this loop:
```r
## on another machine
deb <- readRDS("~/debpackages.rds")
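for (i in 1:nrow(deb)) {
    pkg <- deb[i, "Package"]
    ## grep the sources under <Package>/R/ (relative to the current directory)
    ## for .C( or .Fortran( calls; exit status 0 means a match was found
    cmd <- paste0("egrep -r -q \"\\.(C|Fortran)\\(\" ", pkg, "/R/*")
    deb[i, "dotCorFortran"] <- if (is.na(pkg)) NA else system(cmd) == 0
}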
saveRDS(deb, "~/debpackagesout.rds")
```
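The same pattern can be tried in pure R, which also shows how a comment-only mention produces a false positive. This is a rough sketch on a temporary file I made up; the `strip_comments` option is my own addition, and its naive `#` stripping would mishandle a `#` inside a string literal.

```r
## Sketch: does any line in the given R files contain .C( or .Fortran( ?
uses_dot_c <- function(files, strip_comments = FALSE) {
    lines <- unlist(lapply(files, readLines, warn = FALSE))
    ## naive comment removal: drops everything from the first '#' on each line
    if (strip_comments) lines <- sub("#.*$", "", lines)
    any(grepl("\\.(C|Fortran)\\(", lines))
}

## Made-up source file that mentions .C() only inside a comment
tmp <- tempfile(fileext = ".R")
writeLines(c("## formerly called .C(\"old_routine\")",
             "f <- function(x) x + 1"), tmp)

raw_hit   <- uses_dot_c(tmp)                         # matches the comment
clean_hit <- uses_dot_c(tmp, strip_comments = TRUE)  # comment ignored
c(raw = raw_hit, stripped = clean_hit)
```

The plain grep errs on the side of rebuilding too much, which is harmless for this purpose.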
## R Analysis -- Step Four
Back on the initial machine "here":
```r
deb <- readRDS("debpackagesout.rds")
setDT(deb)
deb[ is.na(deb[, dotCorFortran]) | deb[, dotCorFortran]==TRUE, 1:3] ## 75
```
We are now down to a set of 75 packages, down from 487. One is a known false
positive (r-cran-int64, which I have here; .C and .Fortran appear only in
comments), but I can still rebuild it. Another I just rebuilt (r-cran-rsprng).
The list is below (and also on GitHub).
That leaves around 74 or so to submit as requests for binary NMUs.
Don: Good idea or not?
Dirk
package Package Version
1: r-cran-ade4 ade4 1.7-6
2: r-cran-adegenet adegenet 2.0.1
3: r-cran-adephylo adephylo 1.1-10
4: r-cran-amelia Amelia 1.7.4
5: r-cran-ape ape 4.1
6: r-cran-bayesm bayesm 3.0-2
7: r-cran-bitops bitops 1.0-6
8: r-cran-blockmodeling blockmodeling 0.1.8
9: r-cran-boolnet BoolNet 2.1.3
10: r-cran-brglm brglm 0.5-9
11: r-cran-caret caret 6.0-76
12: r-cran-catools caTools 1.17.1
13: r-cran-cmprsk cmprsk 2.2-7
14: r-cran-coin coin 1.1-3
15: r-cran-contfrac contfrac 1.1-10
16: r-cran-data.table data.table 1.10.4
17: r-cran-deal deal 1.2-37
18: r-cran-deldir deldir 0.1-14
19: r-cran-desolve deSolve 1.14
20: r-cran-dosefinding DoseFinding 0.9-15
21: r-cran-eco eco 3.1-7
22: r-cran-erm eRm 0.15-7
23: r-cran-etm etm 0.6-2
24: r-cran-evd evd 2.3-2
25: r-cran-expm expm 0.999-2
26: r-cran-fields fields 9.0
27: r-cran-gam gam 1.14-4
28: r-cran-genabel GenABEL 1.8-0
29: r-cran-glmnet glmnet 2.0-10
30: r-cran-goftest goftest 1.1-1
31: r-cran-gsl gsl 1.9-10.3
32: r-cran-haplo.stats haplo.stats 1.7.7
33: r-cran-hdf5 NA NA
34: r-cran-hexbin hexbin 1.27.1
35: r-cran-igraph igraph 1.0.1
36: r-cran-int64 NA NA
37: r-cran-lhs lhs 0.14
38: r-cran-logspline logspline 2.1.9
39: r-cran-mapproj mapproj 1.2-5
40: r-cran-maps maps 3.2.0
41: r-cran-maptools maptools 0.9-2
42: r-cran-mcmc mcmc 0.9-5
43: r-cran-mcmcpack MCMCpack 1.4-0
44: r-cran-medadherence NA NA
45: r-cran-mixtools mixtools 1.1.0
46: r-cran-mlbench mlbench 2.1-1
47: r-cran-mnp MNP 3.0-1
48: r-cran-msm msm 1.6.4
49: r-cran-ncdf4 ncdf4 1.16
50: r-cran-nnls nnls 1.4
51: r-cran-pbivnorm pbivnorm 0.6.0
52: r-cran-phangorn phangorn 2.2.0
53: r-cran-phylobase phylobase 0.8.4
54: r-cran-polycub polyCub 0.6.0
55: r-cran-princurve princurve 1.1-12
56: r-cran-pscl pscl 1.4.9
57: r-cran-qtl qtl 1.41-6
58: r-cran-randomfields RandomFields 3.1.50
59: r-cran-randomfieldsutils RandomFieldsUtils 0.3.25
60: r-cran-randomforest randomForest 4.6-12
61: r-cran-raschsampler RaschSampler 0.8-8
62: r-cran-rcurl RCurl 1.95-4.8
63: r-cran-rniftilib NA NA
64: r-cran-rsprng NA NA
65: r-cran-sp sp 1.2-4
66: r-cran-spam spam 1.4-0
67: r-cran-spatstat spatstat 1.51-0
68: r-cran-spc spc 0.5.3
69: r-cran-spdep spdep 0.6-13
70: r-cran-surveillance surveillance 1.13.1
71: r-cran-tgp tgp 2.4-14
72: r-cran-tikzdevice tikzDevice 0.10-1
73: r-cran-treescape NA NA
74: r-cran-vegan vegan 2.4-3
75: r-cran-vgam VGAM 1.0-3
package Package Version
--
http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org