Bug#851558: autopkgtest: define Restrictions for tests that aren't suitable for gating CI

Mon Jan 16 11:34:50 UTC 2017

control: tag -1 confirmed

Hello Simon,

Simon McVittie [2017-01-16  9:06 +0000]:
> Specifically, if a maintainer is unable to fix a particular unreliable
> or broken test, or the non-RC bug that it exposes, I think it's still
> correct for that test to continue to run on CI infrastructure, with
> its result logged but the usual side-effects of that result ignored.

FTR, in Ubuntu (and in the ongoing discussion about adopting this in
Debian) we use britney overrides for this, as in

  http://bazaar.launchpad.net/~ubuntu-release/britney/hints-ubuntu/view/head:/pitti

i.e. the tests still run, but britney won't block the package on their failure.
The advantage is that this is much easier for the release team to control (it
doesn't require package uploads), and that you can keep more state: the hints
usually apply up to a (maximum) version, so you can say "1.2-3 is known-broken
because of unrelated changes in $foo, but I expect the next upload to fix this
again". That also means the developer can't cheat.
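
The version-capped override described above could be sketched roughly as
follows. This is only an illustration, not britney's actual hint format or
code: the `hints` mapping and the functions `version_key` and
`failure_is_ignored` are hypothetical names, and the version ordering is a
simplification that compares numeric components only (real Debian version
comparison has more rules).

```python
import re


def version_key(version):
    """Simplified ordering key: the numeric components of a version string.

    Assumption: good enough for versions like "1.2-3"; it is NOT the full
    Debian version comparison algorithm.
    """
    return tuple(int(part) for part in re.findall(r"\d+", version))


def failure_is_ignored(hints, package, version):
    """Return True if a test failure of package/version is covered by a hint.

    `hints` maps a package name to the highest version whose failure is
    known and accepted (hypothetical structure).
    """
    max_version = hints.get(package)
    if max_version is None:
        return False
    return version_key(version) <= version_key(max_version)


# "1.2-3 is known-broken, but the next upload is expected to fix it"
hints = {"foo": "1.2-3"}
print(failure_is_ignored(hints, "foo", "1.2-3"))
print(failure_is_ignored(hints, "foo", "1.2-4"))
```

Because the hint is capped at a version, a later upload that still fails is
no longer covered and blocks again until someone renews the hint.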

As long as Debian doesn't actually use these tests for gating, there is no
problem in making them appear as failed. Particularly in the tracker you actually
do want to point out failing tests. Or what other side effects do you mean?

> With the ability to ignore selected restrictions (#850494), this could
> be addressed by a restriction, perhaps something like one of these:
> 
>     Restrictions: experimental
>     Restrictions: intermittently-fails
>     Restrictions: may-fail
>     Restrictions: unimportant

I'm fine with adding this: from the maintainer's perspective she could just
as well do an upload that comments out that test, and both would show up in a
diff either way. This is a different use case from chasing down regressing
tests (which I mentioned above), and so seems useful for introducing new tests
which still need some work to run fully on the distro infrastructure.

Maybe "expected-failure" which is the term used by other test runners? (The
bikeshedding season is on! ☺ )

> possibly accompanied by:
> 
>     Features: known-bug=http://bugs.debian.org/1234

This won't modify the behaviour of the test, so a comment above Restrictions:
would do just as well -- but as unknown features are ignored, this is fine of
course.

> Tests with that restriction could perhaps be run normally, but
> skipped when determining whether a batch of changes would break testing.

I think autopkgtest would show their result as "EXPECTED-FAIL" or so and exit
with 0 (at least for that particular test case)?
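
A rough sketch of that idea, under stated assumptions: the restriction name
"expected-failure" is only the proposal from this thread, `classify` and
`exit_status` are hypothetical helpers rather than autopkgtest code, and the
non-zero status 4 is chosen here to mirror the "at least one test failed"
status.

```python
def classify(passed, restrictions):
    """Map one test outcome to a result label."""
    if passed:
        return "PASS"
    if "expected-failure" in restrictions:
        # Failure is logged, but does not count against the run.
        return "EXPECTED-FAIL"
    return "FAIL"


def exit_status(results):
    """Return 0 unless at least one test failed unexpectedly.

    `results` is a list of (name, passed, restrictions) tuples.
    """
    labels = [classify(passed, restrictions) for _, passed, restrictions in results]
    return 4 if "FAIL" in labels else 0


results = [
    ("smoke", True, set()),
    ("flaky-network", False, {"expected-failure"}),
]
for name, passed, restrictions in results:
    print(name, classify(passed, restrictions))
print("exit status:", exit_status(results))
```

Here the failing test is still reported, but only an unexpected failure
changes the exit status.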

Thanks for the suggestion!

Martin

-- 
Martin Pitt                        | http://www.piware.de
Ubuntu Developer (www.ubuntu.com)  | Debian Developer  (www.debian.org)