[Aptitude-svn-commit] r3799 - in branches/aptitude-0.3/aptitude: . src/generic/problemresolver

Wed Aug 10 23:37:46 UTC 2005

Author: dburrows
Date: Wed Aug 10 23:37:42 2005
New Revision: 3799

Modified:
   branches/aptitude-0.3/aptitude/ChangeLog
   branches/aptitude-0.3/aptitude/src/generic/problemresolver/model.tex
Log:
Start writing up how to eliminate 'stupid pairs' of actions.

Modified: branches/aptitude-0.3/aptitude/ChangeLog
==============================================================================

--- branches/aptitude-0.3/aptitude/ChangeLog	(original)
+++ branches/aptitude-0.3/aptitude/ChangeLog	Wed Aug 10 23:37:42 2005
@@ -1,3 +1,10 @@
+2005-08-10  Daniel Burrows  <dburrows at debian.org>
+
+	* src/generic/problemresolver/model.tex:
+
+	  Start describing how to eliminate "stupid pairs" from solutions,
+	  including a preliminary writeup of the algorithm to use.
+
 2005-08-10  Clytie Siddall <clytie at riverland.net.au>
 
 	* Update to Vietnamese translation. (Closes: #322276)

Modified: branches/aptitude-0.3/aptitude/src/generic/problemresolver/model.tex
==============================================================================
--- branches/aptitude-0.3/aptitude/src/generic/problemresolver/model.tex	(original)
+++ branches/aptitude-0.3/aptitude/src/generic/problemresolver/model.tex	Wed Aug 10 23:37:42 2005
@@ -10,6 +10,10 @@
 \usepackage{mathpartir}
 \usepackage{varioref}
 
+\usepackage{algorithm}
+% From algorithmicx
+\usepackage{algpseudocode}
+
 \pagestyle{fancy}
 
 \newtheorem{theorem}{Theorem}
@@ -45,10 +49,14 @@
 \newcommand{\nsol}[2]{\overset{#1}{\underset{#2}{\Rightarrow}}}
 \newcommand{\nsolmany}[1]{\overset{#1}{\underset{*}{\Rightarrow}}}
 
+\newcommand{\act}[2]{\begin{array}{c} #1 \\ #2 \end{array}}
+
 \newcommand{\installs}{\vartriangleright}
 \newcommand{\satisfies}{\vdash}
 \newcommand{\col}{\mathpunct{:}}
 
+\algrenewcommand{\algorithmicprocedure}{\textbf{function}}
+
 \mathlig{->}{\rightarrow}
 \mathlig{=>*}{\overset{*}{=>}}
 
@@ -762,8 +770,10 @@
 One simple way to trim the search tree is to drop any search node $I$
 that is a ``superset'' of a full solution $I_c$ -- meaning that $I_c$
 is a hybrid of $I$ and $I_0$.  This has the additional beneficial
-effect of preventing solutions from being offered to the user which
-are just a previously-displayed solution with some extra, redundant
+effect of preventing obviously redundant solutions from being offered
+to the user.
+
+a previously-displayed solution with some extra, redundant
 actions added to it.
 
 \subsection{Forbidden versions}
@@ -784,7 +794,7 @@
 \begin{figure}
   \begin{gather*}
     \inferrule{I \not \satisfies d \\ d = v -> S \\
-      I(v)=I_0(v) \\ \pkgof{v'}=\pkgof{v} \\ v' \notin F}
+      I(v')=I_0(v') \\ \pkgof{v'}=\pkgof{v} \\ v' \notin F}
     {(I,F) \nsol{I_0}{d} (I;v',F \cup S)} \\
     \inferrule{I \not \satisfies d \\ d = v -> S \\ v' \in S \\
       I(v')=I_0(v') \\ v' \notin F}
@@ -795,23 +805,11 @@
   \label{fig:excludever}
 \end{figure}
 
-This has the effect of forcing the algorithm to ``stick
-with'' a decision to forgo installing the targets of a dependency in
-favor of shifting the source.
-
-\begin{note}
-  This technique could just as well be applied by expanding the
-  forbidden set when generating successors for the \emph{targets} of a
-  dependency: that is, forbidding a different version of the source of
-  a dependency to be installed.  The decision regarding which
-  exclusion principle to use was made on the basis of a conjecture
-  that we are more likely to encounter a ``hard'' dependency problem
-  when moving ``up'' a dependency chain than when moving ``down'' it.
-\end{note}
-
-Of course, it is important to verify that cutting off wide swathes of
-the search space in this manner does not impede our ability to
-generate solutions:
+This has the effect of forcing the algorithm to ``stick with'' a
+decision to forgo installing the targets of a dependency in favor of
+shifting the source.  Of course, it is important to verify that
+cutting off wide swathes of the search space in this manner does not
+impede our ability to generate solutions:
 
 \begin{theorem}
   Let $I_c$ be any consistent installation (if one exists) and $I_0$
@@ -859,6 +857,234 @@
   and $I_c$.  Thus, the theorem is established with $I_c'=I_n$.
 \end{proof}
 
+\subsection{Eliminating ``Stupidity''}
+
+The technique described in the previous section will eliminate some
+obviously redundant solutions.  But can we do better?  In general, an
+obvious redundancy exists whenever we install $v_1$ to resolve $d_1$
+and $v_2$ to resolve $d_2$, where $v_1$ solves $d_2$ or $v_2$ solves
+$d_1$.  Aside from anything else, it is frustrating from a user's
+perspective to see a dependency resolver that produces ``obviously
+stupid'' answers.
+
+It turns out that there is, in fact, a polynomial-time algorithm which
+will eliminate all such pairs from a solution.  This algorithm
+represents a solution (or partial solution) as a \emph{resolution
+  sequence}:
+
+\begin{definition}
+  A resolution sequence for $I_0 \nsolmany{I_0} I_n$ is a sequence
+  $\act{v_1}{d_1},\dots,\act{v_n}{d_n}$ where for all $1 \leq k \leq
+  n$, $I_k=I_{k-1};v_k$ and $I_{k-1} \nsol{I_0}{d_k} I_k$.
+
+  A version-dependency pair $\act{v}{d}$ is called an \emph{action};
+  it indicates that the version $v$ was installed to satisfy the
+  dependency $d$.
+\end{definition}
+
+This is, of course, just a different way of looking at the problem: it
+focuses on the series of versions that will be installed, and
+eliminates the intermediate installations.  However, viewing the
+problem from this perspective immediately suggests a new way of
+approaching it.  The basic goal, as stated above, is to eliminate
+what I will henceforth refer to as \emph{stupid pairs}:
+
+\begin{definition}
+  A stupid pair is a pair of actions
+  $\left(\act{v_1}{d_1},\act{v_2}{d_2}\right)$ where $v_1$
+  solves\footnote{$v$ solves $d$ if $v$ appears on the right-hand side
+    of $d$, or if the left-hand side of $d$ is $v' \neq v$ where
+    $\pkgof{v} = \pkgof{v'}$.}  $d_2$ or $v_2$ solves $d_1$.
+\end{definition}
+
+Is there a way to manipulate the resolution sequence itself by
+interchanging and/or dropping elements from the sequence so as to
+eliminate all stupid pairs?  The answer is ``yes''!
+Algorithm~\vref{alg:eliminate-stupid} takes a starting installation
+$I_0$ and a resolution sequence $R$, and produces a new resolution
+sequence $R'$ such that $R' \subseteq R$.  To express the algorithms
+concisely, the following (ab)uses of notation are introduced:
+
+\begin{enumerate}
+\item If $I$ is an installation and $R$ is a resolution sequence,
+  $I;R$ indicates the application of $R$ to $I$: i.e., $I;R=I;\{v \st
+  \exists d . \act{v}{d} \in R\}$.
+
+\item If $I$ is an installation, then $Broken(I)$ is the set of broken
+  dependencies in $I$.
+
+\item If $R$ is a resolution sequence, I use $R$ in several places as
+  a set to indicate either the set of versions contained in $R$ or the
+  set of actions contained in $R$ (the usage will be clear from
+  context).
+
+\item ``Take any'' means that the program should arbitrarily choose a
+  single element of the specified set.
+
+\item On Line \ref{line:unpack-stupid-pair}, read the ``let'' as an
+  unpacking command: the sequence $R$ is divided into three
+  subsequences based on the first-occuring stupid pair.
+
+\item A set $S$ of versions is \emph{package-unique} if each package
+  is represented at most once in $S$: i.e., for all $v,v' \in S$, if
+  $\pkgof{v}=\pkgof{v'}$ then $v=v'$.
+\end{enumerate}
+
+\begin{algorithm}
+\begin{algorithmic}[1]
+\Procedure{EliminateStupid}{$I_0$, $R$}
+  \Require{$R$ is a resolution sequence for $I_0$ and $Broken(I_0;R) = \emptyset$}
+  \If{$R$ contains no stupid pairs}
+    \State \textbf{return} $R$
+  \Else
+    \State Let $R_1',\act{v_i}{d_i},R_2',\act{v_j}{d_j},R_3' = R$
+           where $\left(\act{v_i}{d_i},\act{v_j}{d_j}\right)$ is
+           a stupid pair and where $(a_1,a_2)$ is not a stupid pair for
+           all $a_1 \in R_1'$ and $a_2 \in R$. \label{line:unpack-stupid-pair}
+
+    \State Let $I' = I_0;R_1';v_j;R_2';R_3'$
+    \If{$Broken(I') = \emptyset$}
+      \State Let $R' = R_1',\act{v_j}{d_i},\Call{ResolveFrom}{I_0;R_1';v_j,R_2' \cup R_3'}$
+      \State \textbf{return} \Call{EliminateStupid}{$I_0$, $R'$}
+    \Else
+      \State Take any dependency $d_k$ such that $I' \not \satisfies d_k$ and $v_i \satisfies d_k$.
+      \State Let $R' = R_1',\call{ResolveFrom}{I_0;R_1',\{v_i,v_k\} \cup R_2' \cup R_3'}$
+      \State \textbf{return} \Call{EliminateStupid}{$I_0$, $R'$}
+    \EndIf
+  \EndIf
+\EndProcedure
+\end{algorithmic}
+\label{alg:eliminate-stupid}
+\caption{Algorithm to eliminate stupid pairs}
+\end{algorithm}
+
+\begin{algorithm}
+\begin{algorithmic}[1]
+\Procedure{ResolveFrom}{$I$, $S$}
+  \Require{$Broken(I;S)=\emptyset$, $S$ is package-unique, and $v \notin I$ for all $v \in S$}
+  \If{$Broken(I)=\emptyset$}
+    \State \textbf{return} $()$
+  \Else
+    \State Take any $d \in Broken(I)$ \label{line:resolve-from-take-broken}
+    \State Take any $v \in S$ such that $v$ solves $d$ \label{line:resolve-from-take-solver}
+    \State \textbf{return} $\left(\act{v}{d}, \Call{ResolveFrom}{I;v, S \backslash v}\right)$ \label{line:resolve-from-recur}
+  \EndIf
+\EndProcedure
+\end{algorithmic}
+\caption{Algorithm to resolve all dependencies from $S$}
+\label{alg:resolve-from}
+\end{algorithm}
+
+Informally, \textsc{ResolveFrom} is a routine which takes an
+installation and a set from which to draw versions, and generates an
+arbitrary resolution sequence by resolving all broken dependencies
+using the supplied versions.  The \textsc{EliminateStupid} routine
+works by eliminating the first stupid pair in the resolution sequence.
+This pair must consist of $\left(\act{v_1}{d_1},\act{v_2}{d_2}\right)$
+where $v_2$ solves $d_1$; intuitively, the idea is that we could just
+as well have installed $v_2$ instead of $v_1$.  Unfortunately, it is
+possible that simply cancelling the installation of $v_1$ will break
+some packages that were never explicitly fixed.  In this case, we know
+that there are packages that are not fixed by anything in the sequence
+except $v_1$, so we can justify the installation of $v_1$ by reference
+to them.
+
+After deciding upon the content of an eventual solution,
+\textsc{EliminateStupid} invokes \textsc{ResolveFrom} to generate a
+``sensible'' ordering of the solution and to drop unneeded versions
+from the solution.
+
+The formal proof of correctness follows:
+
+\begin{lemma}[Correctness of \textsc{ResolveFrom}]
+  If $Broken(I;S)=\emptyset$, $S$ is package-unique, and $v \notin I$
+  for all $v \in S$, then \textsc{ResolveFrom}
+  (Algorithm~\vref{alg:resolve-from}) terminates and returns a
+  resolution sequence $R$ for $I \nsolmany I;R$ such that
+  $Broken(I;R)=\emptyset$ and each element of $R$ is $\act{v}{d}$ for
+  some $v \in S$.
+  \label{lem:resolve-from-correct}
+\end{lemma}
+
+\begin{proof}
+  Proof is by induction on $S$.
+
+  If $S=\emptyset$, then $I;S=S$ and hence $Broken(I)=\emptyset$.
+  Thus, the algorithm terminates immediately.
+
+  Suppose instead that $S \neq \emptyset$ and that the lemma holds for
+  all $S'$ such that $S' \subset S$.  If $Broken(I)=\emptyset$ then
+  the algorithm again terminates immediately; otherwise, at least one
+  broken dependency exists for
+  Line~\ref{line:resolve-from-take-broken} to assign to $d$.  Since
+  $Broken(I;S)=\emptyset$, there must exist a $v \in S$ such that $v
+  \satisfies d$; Line~\ref{line:resolve-from-take-solver} will take an
+  arbitrary such $v$.
+
+  Finally, note that on Line~\ref{line:resolve-from-recur},
+  \textsc{ResolveFrom} is applied to a set $S \backslash v$; since $v
+  \in S$, we have $S \backslash v \subset S$ (i.e., $S \backslash v$
+  is strictly smaller than $S$) and hence the inductive hypothesis
+  holds.  Therefore, the recursive invocation $\Call{ResolveFrom}{I;v,
+    S \backslash v}$ terminates and returns a resolution sequence $R$
+  for $I;v$ such that $Broken(I;v;R)=\emptyset$ and each element of
+  $R$ is $\act{v'}{d'}$ for some $v' \in S$.
+
+  Clearly, the sequence $R'=\act{v}{d},R$ is returned by
+  \textsc{ResolveFrom}.  I claim that this sequence satisfies the
+  lemma requirements.
+
+  \begin{itemize}
+  \item Clearly $I \nsol{I}{d} I;v$.  Let $v_0 = v$, $I_0 = I;v_0$,
+    and $\act{v_1}{d_1}, \dots, \act{v_n}{d_n} = R$.  Let
+    $I_1,\dots,I_n$ be the sequence of installations associated with
+    $R$: the sequence such that $I_{k-1}=I_k;v_k$ and $I_{k-1}
+    \nsol{I_0}{d_k} I_k$.
+
+    For $1 \leq k \leq n$, note that by the lemma statement, we have
+    $\pkgof{v_k} \neq \pkgof{v_0}$, so $I(\pkgof{v_k}) =
+    I_0(\pkgof{v_k})$.  By the inductive hypothesis, $I_0(\pkgof{v_k})
+    \neq v_k$ and hence $I(\pkgof{v_k}) \neq v_k$.  Finally, since
+    $I_{k-1} \nsol{I_0}{d_k} I_k$, we have $I_{k-1} \not \satisfies
+    d_k$.
+
+    Therefore, $I_{k-1} \nsol{I}{d_k} I_k$ for $1 \leq k \leq n$, and
+    hence $R'$ is a resolution sequence.
+
+  \item Each element of $R$ is $\act{v}{d}$ for some $v \in S$, and
+    each
+  \end{itemize}
+\end{proof}
+
+\begin{lemma}[Correctness of \textsc{EliminateStupid}]
+  If $R$ is a resolution sequence for $I_0$ and
+  $Broken(I_0;R)=\emptyset$, then $\Call{EliminateStupid}{I_0, R}$ is
+  a resolution sequence $R'$ for $I_0$ such that $Broken(I_0;R') =
+  \emptyset$ and $R'$ contains no stupid pairs.
+  \label{lem:eliminate-stupid-correct}
+\end{lemma}
+
+\begin{proof}
+  Proof is by induction on the pair $(l,n)$ where $l$ is the length of
+  $R$ and $n$ is the number of stupid pairs in $R$; these pairs are
+  ordered lexicographically, so $(1,1) < (1,100) < (2,10) < (5,20)$.
+
+  If $n = 0$, then the algorithm terminates immediately.  In this
+  case, $R$ is a resolution sequence from the lemma prerequisite, and
+  $R$ contains no stupid pairs by definition.  Thus, the lemma holds.
+  If $l < 2$, then clearly $n = 0$ and the above reasoning holds.
+
+  Suppose instead that $n > 0$ and $l \geq 2$; i.e., there is at least
+  one stupid pair in $R$.  As in the algorithm, suppose that
+  $R_1',\act{v_i}{d_i},R_2',\act{v_j}{d_j},R_3'=R$, where
+  $\left(\act{v_i}{d_i},\act{v_j}{d_j}\right)$ is a stupid pair and no
+  element of $R_1'$ is implicated in a stupid pair; i.e.,
+  $\left(\act{v_i}{d_i},\act{v_j}{d_j}\right)$ is the ``first'' stupid
+  pair in $R$.
+
+  Let $I' = I_0;R_1';v_j;R_2';R_3'$.  If 
+\end{proof}
+
 \subsection{Use Logical Necessity}
 
 In combination with the tracking of forbidden versions, it is also
@@ -934,7 +1160,7 @@
   \end{gather*}
 
   \caption{Successor generation with soft dependencies}
-  \label{fig:excludever}
+  \label{fig:softdep}
 \end{figure}
 
 \begin{itemize}