[blog] 02/02: reproducing-r-packages: update and make the script less debian-specific

Ximin Luo <infinity0 at debian.org>
Wed May 3 12:32:44 UTC 2017


This is an automated email from the git hooks/post-receive script.

infinity0 pushed a commit to branch master
in repository blog.

commit a6d0a39c2495c0d052222ee0290a6428ec8e6d0c
Author: Ximin Luo <infinity0 at debian.org>
Date:   Wed May 3 14:32:37 2017 +0200

    reproducing-r-packages: update and make the script less debian-specific
---
 data/r-mini-repro-test.sh          | 22 ++++++++--------
 drafts/reproducing-r-packages.mdwn | 54 +++++++++++++++++++-------------------
 2 files changed, 38 insertions(+), 38 deletions(-)

diff --git a/data/r-mini-repro-test.sh b/data/r-mini-repro-test.sh
index bab4007..4b2bf9e 100755
--- a/data/r-mini-repro-test.sh
+++ b/data/r-mini-repro-test.sh
@@ -53,17 +53,17 @@ cd "$pkgdir"
 
 pkgname=$(sed -ne 's/Package: //p' ./DESCRIPTION)
 
-mkdir -p $PWD/debian/$p/usr/lib/R/site-library
-rm -f ./debian/$p/usr/lib/R/site-library/$pkgname/R/$pkgname.rdx
-rm -f ./debian/$p/usr/lib/R/site-library/$pkgname/R/$pkgname.rdb
+mkdir -p $PWD/repro-test/$p/usr/lib/R/site-library
+rm -f ./repro-test/$p/usr/lib/R/site-library/$pkgname/R/$pkgname.rdx
+rm -f ./repro-test/$p/usr/lib/R/site-library/$pkgname/R/$pkgname.rdb
 
-install -l debian/$p/usr/lib/R/site-library -d . \
+install -l repro-test/$p/usr/lib/R/site-library -d . \
 	"--built-timestamp=\"Thu, 01 Jan 1970 00:00:00 +0000\""
 
-stat ./debian/$p/usr/lib/R/site-library/$pkgname/R/$pkgname.rdx
-sha256sum ./debian/$p/usr/lib/R/site-library/$pkgname/R/$pkgname.rdx
-sha256sum ./debian/$p/usr/lib/R/site-library/$pkgname/R/$pkgname.rdb
-sha256sum ./debian/$p/usr/lib/R/site-library/$pkgname/help/$pkgname.rdx
-sha256sum ./debian/$p/usr/lib/R/site-library/$pkgname/help/$pkgname.rdb
-sha256sum ./debian/$p/usr/lib/R/site-library/$pkgname/help/paths.rds
-echo "don't forget to rm -rf ${pkgdir}/debian/${p} if you want a clean directory"
+stat ./repro-test/$p/usr/lib/R/site-library/$pkgname/R/$pkgname.rdx
+sha256sum ./repro-test/$p/usr/lib/R/site-library/$pkgname/R/$pkgname.rdx
+sha256sum ./repro-test/$p/usr/lib/R/site-library/$pkgname/R/$pkgname.rdb
+sha256sum ./repro-test/$p/usr/lib/R/site-library/$pkgname/help/$pkgname.rdx
+sha256sum ./repro-test/$p/usr/lib/R/site-library/$pkgname/help/$pkgname.rdb
+sha256sum ./repro-test/$p/usr/lib/R/site-library/$pkgname/help/paths.rds
+echo "don't forget to rm -rf ${pkgdir}/repro-test/${p} if you want a clean directory"
diff --git a/drafts/reproducing-r-packages.mdwn b/drafts/reproducing-r-packages.mdwn
index 7ef1b9b..d305a2d 100644
--- a/drafts/reproducing-r-packages.mdwn
+++ b/drafts/reproducing-r-packages.mdwn
@@ -1,6 +1,6 @@
 [[!meta title="Reproducing R packages"]]
 
-In the past week or so, Ximin Luo worked on making R generate reproducible
+In the past couple of weeks, Ximin Luo worked on making R generate reproducible
 output. This is now mostly complete, and we're waiting on feedback from
 upstream about our patch. In the meantime, there are a few packages that remain
 unreproducible, but the issue probably lies in those specific packages rather
@@ -23,9 +23,8 @@ identical output for the `.rdb` files, even though they are bitwise different.
 To get to the bottom of this, we'll have to use the R debugger.
 
 Attached to this post is [a script](/blog/data/r-mini-repro-test.sh) that
-smooths this process. The script is meant to be run against Debian's R
-packages, but it could probably be made to work with other distros' R packages
-with minimal changes, if it doesn't already work.
+smooths this process. I ran this against Debian's R packages, but it probably
+also works with other distros' R packages - try it and see.
 
 You run it like `./r-mini-repro-test.sh $pkgdir $builddir` and it will output
 some hashes for you; make sure to install the build dependencies first. You
@@ -34,9 +33,9 @@ variations; `$builddir` can be an arbitrary string but `$pkgdir` should point
 to the actual R package's source directory, so I just copy that to two
 locations and point the script at each of them in turn.
 
-Now, we can begin debugging. Before I did this, we had 477 unreproducible R
+Now, we can begin debugging. Before I did this, we had 478 unreproducible R
 packages so the biggest problem was likely with R itself. I downloaded the
-source code of both R and a small example package (r-cran-tensor), then figured
+source code of both R and a small example package ([[!pkg r-cran-tensor]]), then figured
 out how the R packages were actually being built. This resulted in me writing
 the script above, to speed up debugging. You can read about R's debugger [here
 (no HTTPS)](http://www.stats.uwo.ca/faculty/murdoch/software/debuggingR/), it's
@@ -88,9 +87,9 @@ function, and reading it further we see that it calls `makeLazyLoading` then
 `code2LazyLoadDB` then `makeLazyLoadDB`. Seems promising, let's confirm it
 before chasing potential wild geese.
 
-In the rest of these command-line outputs I'll prepend what I input with `>>>`
-but this doesn't actually get printed by `Rscript`. So don't be surprised that
-you don't see these, when you're trying to recreate my steps.
+**Important**: In the rest of these command-line outputs I prefix what I
+input with `>>>`, but `Rscript` doesn't actually print this. So don't be
+surprised when you don't see it while trying to recreate my steps.
 
 Also I am a total R noob so possibly there are more elegant ways to do what I'm
 about to show you; this is what I came up with after 2-3 hours of research.
@@ -403,28 +402,29 @@ Testing the patch
 -----------------
 
 Of course r-cran-tensor is just 1 package, so then we tested this patch on all
-477 [tagged R
+478 [tagged R
 packages](https://tests.reproducible-builds.org/debian/issues/unstable/randomness_in_r_rdb_rds_databases_issue.html).
-Before the patch, all of these were unreproducible. Now 460/477 are
+Before the patch, all of these were unreproducible. Now 463/478 are
 reproducible, hurray!
 
-The first version of the patch made 2 packages FTBFS, r-bioc-biobase and
-r-cran-shinybs; this was fixed in a subsequent version of the patch. 1 package,
-r-cran-randomfields, FTBFS even on an unpatched r-base, probably due to
-differences between 3.3.3 and 3.4.0
+The first version of the patch made 2 packages FTBFS ("fail to build from
+source"), [[!pkg r-bioc-biobase]] and [[!pkg r-cran-shinybs]]; this was fixed
+in a subsequent version of the patch. (Some other packages, [[!pkg
+r-cran-randomfields]] and [[!pkg r-cran-randomfieldsutils]], FTBFS even with an
+unpatched r-base, due to differences between 3.3.3 and 3.4.0 ([[!bug 861333]]),
+and not because of our patch.)
 
 Given the overwhelming proportion of packages that *did* reproduce, the other
 14 packages that are still unreproducible, are quite probably due to issues in
-those specific packages.
-
-Only 2 of them are certainly due to build-path differences: r-cran-runit
-r-cran-rinside. This is because, we currently see they are reproducible in
-Debian testing but not Debian unstable. This is slightly misleading, the real
-reason is not testing vs unstable directly; rather the fact that we use the
-same build path when trying to reproduce Debian testing packages, and different
-paths for unstable. The other 12 are unreproducible in both testing and
-unstable, so it either suffers from a non-build-path issue as well as a
-build-path issue, or only a non-build-path issue.
+those specific packages. Only 2 of these are certainly due to build-path
+differences: [[!pkg r-cran-runit]] and [[!pkg r-cran-rinside]]. We know this
+because they are currently reproducible in Debian testing but not in unstable.
+That is slightly misleading: the real cause is not testing vs unstable as
+such, but the fact that we (at the time of writing) use the same build path
+when trying to reproduce Debian testing packages, and different paths for
+unstable. The other 12 are unreproducible in both testing and unstable, so
+each of them suffers either from a non-build-path issue as well as a
+build-path issue, or only from a non-build-path issue.
 
 So, further investigation *of each package* is needed. This is where you can
 help. :)
@@ -433,8 +433,8 @@ help. :)
 Debugging R packages
 --------------------
 
-To investigate why (for example) r-cran-runit doesn't reproduce despite our
-patches, we just do the above process again:
+To investigate why (for example) [[!pkg r-cran-runit]] doesn't reproduce
+despite our patches, we just do the above process again:
 
     $ DEBUG=1 ../r-mini-repro-test.sh r-cran-runit
     [..]

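As a rough illustration of the workflow the post describes (copy the package
source to two locations, run the script against each, then compare the
hashes), the comparison step could be automated along these lines. This is
only a sketch: `hash_tree` and `compare_trees` are hypothetical helper names
that are not part of r-mini-repro-test.sh, and the two directory arguments are
placeholders for wherever the two output trees end up.

```shell
#!/bin/sh
# Hypothetical helpers, NOT part of r-mini-repro-test.sh.

# Print "<sha256>  <relative path>" for every regular file under $1,
# sorted by path so two trees can be compared line by line.
hash_tree() {
    (cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2)
}

# Compare two output trees by content hash.
# Returns 0 if every file hashes identically, 1 if anything differs.
compare_trees() {
    a=$(mktemp) && b=$(mktemp) || return 2
    hash_tree "$1" > "$a"
    hash_tree "$2" > "$b"
    # diff prints nothing and exits 0 when the hash lists are identical.
    diff -u "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return $rc
}
```

Calling `compare_trees copy1/repro-test copy2/repro-test` (placeholder paths)
prints nothing when the two builds match; otherwise the diff lists the files
whose hashes differ, which narrows down where to start debugging.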
-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/reproducible/blog.git