[Reproducible-commits] [buildinfo-spec] 01/01: Minor wording tweaks and footnote numbering correction.

Fri Dec 11 16:18:09 UTC 2015

This is an automated email from the git hooks/post-receive script.

infinity0 pushed a commit to branch master
in repository buildinfo-spec.

commit 1c044b882742211c2f5356696dcd10228e9ed0e9
Author: Ximin Luo <infinity0 at debian.org>
Date:   Fri Dec 11 17:16:54 2015 +0100

    Minor wording tweaks and footnote numbering correction.
---
 notes/buildinfo.rst | 43 ++++++++++++++++++++++---------------------
 1 file changed, 22 insertions(+), 21 deletions(-)

diff --git a/notes/buildinfo.rst b/notes/buildinfo.rst
index c65fc7d..f7289a7 100644
--- a/notes/buildinfo.rst
+++ b/notes/buildinfo.rst
@@ -44,10 +44,10 @@ ignores timestamps, hostnames, etc. Then,
   a program could read its own timestamp and do different things according to
   this value. So the result of our diff program would not actually mean
   "behaves the same/different" but instead mean "behaves the same/different if
-  the source code doesn't do certain things". Granted, reproducibility is about
-  verifying source code behaviour, but tying this to the output of our diff
-  program tangles up separate concerns and greatly increases the complexity and
-  cost-of-reasoning of our entire system.
+  the source code doesn't do certain things". Granted, reproducibility is meant
+  to eventually help us verify source code behaviour, but tying this to the
+  output of our diff program tangles up separate concerns and greatly increases
+  the complexity and cost-of-reasoning of our entire system.
 
 - For data formats containing natural language such as documentation, a similar
   argument to the above applies. For example, text could refer to the timestamp
@@ -93,12 +93,12 @@ in how useful this information is. More generally, when we verify a build, we:
 3. Verify bit-for-bit reproducibility against original product.
 
 When we run this process across many verifiers, they will all reproduce U', and
-may have different values for { U - U' }. The more processes we run, the more
-confidence we gain, that U' is a superset of the minimal information T that we
-actually need to reproduce the build. But even after running this process, a
-human reviewer still has to review U' to check that it contains no backdoors:
-since it was the same across all builds, there is the possibility that U' = T
-and all of it was needed to affect the final build result.
+may have different values for { their own U - U' }. The more processes we run,
+the more confidence we gain, that U' is a superset of the minimal information T
+that we actually need to reproduce the build. But even after running this
+process, a human reviewer still has to review U' to check that it contains no
+backdoors: since it was the same across all builds, there is the possibility
+that U' = T and all of it was needed to affect the final build result.
 
 So, it is in our interest (to make verification easier) to reduce U'. If we
 reduce U' such that it is no longer a superset of T, then we will fail to
@@ -107,18 +107,18 @@ smaller U' (across many verifiers), we can gain confidence in what T is. Beyond
 that, developers can try to tweak their source code, or the source code of
 their build tools, to reduce T itself down.
 
-As a baseline for *all* packages to aim for, T should exactly be the source
-code of the build input and the build tools - i.e. the **preferred form for
-verification** (against backdoors etc) - call this S. To verify a build for
+As an ideal standard for *all* packages to aim for, T should exactly be the
+source code of the build input and the build tools - i.e. the **preferred form
+for verification** (against backdoors etc) - call this S. To verify a build for
 S-reproducibility, we recursively build the source code of the build tools,
 *not even care about their exact binary result*, use these to build the build
-input, then finally attempt to reproduce the original build product. [2]_
+input, then finally compare our result against the original build product. [1]_
 
 In practice, we do not expect most existing packages to meet this standard, and
 our current (2015-12-11) reproducibility tests instead reproduce the entire
 *binary* build tools (i.e. an approximation of the state of the filesystem from
-the original build) when verifying. One has to start somewhere, and proceed one
-step at a time.
+the original build) when verifying. One has to start somewhere, and proceeding
+one concrete step at a time is better than trying to do too much at once.
 
 As an interesting side note, sometimes though we can do *even better* than
 S-reproducibility:
@@ -132,7 +132,7 @@ not expect different C compilers to generate the same binaries. But if any
 parts of build process are precisely defined like `cp`, then this reduces T
 even further, replacing concrete source code with this smaller definition.
 
-.. [2] Yes, this ignores cyclic-build-dependency and bootstrapping issues.
+.. [1] Yes, this ignores cyclic-build-dependency and bootstrapping issues.
     We'll have to figure this out later, when we actually start to try it. One
     plausible approach is to double-diverse-compile the initial compilers (that
     self-build-depend) using existing binaries. One may think of it like this:
@@ -167,7 +167,7 @@ To finally state the definition:
 A buildinfo file is a commitment from a builder that they executed the build
 with certain parameters, and got a particular binary output with that input.
 The information should contain as much information as possible, taking into
-account storage and distribution costs, but MUST *attempt to include* **all
+account storage and distribution costs, and MUST *attempt to include* **all
 information needed to reproduce that build** (i.e. an over-estimation of T).
 External artefacts MUST be referenced by hash, SHA256 or stronger.
 
@@ -187,7 +187,8 @@ This definition is meant to allow readers of the file to:
     information from the original buildinfo file
 
 - to tweak the build input to attempt to reduce the aforementioned minimal set,
-  which may be calculated by running the aforementioned strategies again.
+  which may be re-calculated for this new input by running the aforementioned
+  strategies again.
 
 Buildinfo files SHOULD be signed, but there may be rare applications where this
 is not suitable. You should have a very good reason for this, though.
@@ -258,8 +259,8 @@ In practice there is another issue as well: certain package managers refuse to
 upgrade packages signed by a different key, for "security" reasons. This is
 tivoization, but enforced by software rather than hardware. This goes against
 the spirit of FOSS, where users are supposed to be able to tinker with their
-own devices; see also [1]_. Note that if the package manager allows the user to
+own devices; see also [2]_. Note that if the package manager allows the user to
 override the authorization key to one that they *do* control, this freedom
 issue is resolved, but the technical issues above still remain.
 
-.. [1] https://www.fsf.org/campaigns/secure-boot-vs-restricted-boot/whitepaper-web
+.. [2] https://www.fsf.org/campaigns/secure-boot-vs-restricted-boot/whitepaper-web
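
The reasoning about U, U' and T in the patched text can be illustrated with a
minimal Python sketch. It assumes that each verifier's environment U is a flat
mapping of factor names to values and that all listed verifiers reproduced the
build successfully. The function name, the factor names and the values are
hypothetical and are not part of the buildinfo spec.

```python
def varied_outside_recorded(recorded, verifier_envs):
    """Return factors outside U' that differed between successful verifiers.

    recorded      -- U', the factors recorded in the buildinfo file (assumed dict)
    verifier_envs -- one dict per verifier describing its own full environment U

    Any factor returned here took several values while the build still
    reproduced, which is evidence that it is not part of T.
    """
    seen = {}
    for env in verifier_envs:
        for key, value in env.items():
            if key not in recorded:  # only look at { their own U - U' }
                seen.setdefault(key, set()).add(value)
    return {key for key, values in seen.items() if len(values) > 1}


# Hypothetical example data: two verifiers that both reproduced the build.
recorded = {"compiler": "gcc-5.3"}
verifier_envs = [
    {"compiler": "gcc-5.3", "hostname": "alpha", "timezone": "UTC"},
    {"compiler": "gcc-5.3", "hostname": "beta", "timezone": "UTC"},
]
print(sorted(varied_outside_recorded(recorded, verifier_envs)))
```

In this sketch `timezone` is not reported, because it had the same value for
every verifier and so these runs give no evidence about it either way. That is
the same reason the text asks for many diverse verifiers before trusting that
U' is a superset of T.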

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/reproducible/buildinfo-spec.git