r773 - in python-mechanize/trunk: . debian examples mechanize mechanize.egg-info test

Jérémy Bobbio lunar at alioth.debian.org
Mon Apr 9 23:35:18 UTC 2007


Author: lunar
Date: 2007-04-09 23:35:16 +0000 (Mon, 09 Apr 2007)
New Revision: 773

Added:
   python-mechanize/trunk/debian/examples
   python-mechanize/trunk/ez_setup.py
   python-mechanize/trunk/mechanize/_beautifulsoup.py
   python-mechanize/trunk/mechanize/_debug.py
   python-mechanize/trunk/mechanize/_http.py
   python-mechanize/trunk/mechanize/_response.py
   python-mechanize/trunk/mechanize/_rfc3986.py
   python-mechanize/trunk/mechanize/_seek.py
   python-mechanize/trunk/mechanize/_upgrade.py
   python-mechanize/trunk/setup.cfg
   python-mechanize/trunk/test-tools/
   python-mechanize/trunk/test/test_browser.doctest
   python-mechanize/trunk/test/test_browser.py
   python-mechanize/trunk/test/test_forms.doctest
   python-mechanize/trunk/test/test_history.doctest
   python-mechanize/trunk/test/test_html.doctest
   python-mechanize/trunk/test/test_html.py
   python-mechanize/trunk/test/test_opener.py
   python-mechanize/trunk/test/test_password_manager.doctest
   python-mechanize/trunk/test/test_request.doctest
   python-mechanize/trunk/test/test_response.doctest
   python-mechanize/trunk/test/test_response.py
   python-mechanize/trunk/test/test_rfc3986.doctest
   python-mechanize/trunk/test/test_useragent.py
Removed:
   python-mechanize/trunk/ez_setup/
   python-mechanize/trunk/mechanize/_urllib2_support.py
   python-mechanize/trunk/test/test_conncache.py
   python-mechanize/trunk/test/test_mechanize.py
   python-mechanize/trunk/test/test_misc.py
Modified:
   python-mechanize/trunk/0.1-changes.txt
   python-mechanize/trunk/ChangeLog.txt
   python-mechanize/trunk/MANIFEST.in
   python-mechanize/trunk/PKG-INFO
   python-mechanize/trunk/README.html
   python-mechanize/trunk/README.html.in
   python-mechanize/trunk/README.txt
   python-mechanize/trunk/debian/changelog
   python-mechanize/trunk/debian/control
   python-mechanize/trunk/debian/docs
   python-mechanize/trunk/debian/rules
   python-mechanize/trunk/doc.html
   python-mechanize/trunk/doc.html.in
   python-mechanize/trunk/examples/pypi.py
   python-mechanize/trunk/functional_tests.py
   python-mechanize/trunk/mechanize.egg-info/PKG-INFO
   python-mechanize/trunk/mechanize.egg-info/SOURCES.txt
   python-mechanize/trunk/mechanize.egg-info/requires.txt
   python-mechanize/trunk/mechanize.egg-info/zip-safe
   python-mechanize/trunk/mechanize/__init__.py
   python-mechanize/trunk/mechanize/_auth.py
   python-mechanize/trunk/mechanize/_clientcookie.py
   python-mechanize/trunk/mechanize/_gzip.py
   python-mechanize/trunk/mechanize/_headersutil.py
   python-mechanize/trunk/mechanize/_html.py
   python-mechanize/trunk/mechanize/_lwpcookiejar.py
   python-mechanize/trunk/mechanize/_mechanize.py
   python-mechanize/trunk/mechanize/_mozillacookiejar.py
   python-mechanize/trunk/mechanize/_msiecookiejar.py
   python-mechanize/trunk/mechanize/_opener.py
   python-mechanize/trunk/mechanize/_request.py
   python-mechanize/trunk/mechanize/_urllib2.py
   python-mechanize/trunk/mechanize/_useragent.py
   python-mechanize/trunk/mechanize/_util.py
   python-mechanize/trunk/setup.py
   python-mechanize/trunk/test.py
   python-mechanize/trunk/test/test_cookies.py
   python-mechanize/trunk/test/test_date.py
   python-mechanize/trunk/test/test_urllib2.py
Log:
Ready 0.1.6b-1.


Modified: python-mechanize/trunk/0.1-changes.txt
===================================================================
--- python-mechanize/trunk/0.1-changes.txt	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/0.1-changes.txt	2007-04-09 23:35:16 UTC (rev 773)
@@ -1,5 +1,9 @@
 Recent public API changes:
 
+- Since 0.1.2b beta release: Factory now takes EncodingFinder and
+  ResponseTypeFinder class instances instead of functions (since
+  closures don't play well with module pickle).
+
 - ClientCookie has been moved into the mechanize package and is no
   longer a separate package.  The ClientCookie interface is still
   supported, but all names must be imported from module mechanize
@@ -27,7 +31,7 @@
 - .forms() and .links() now both return iterators (in fact, generators),
   not sequences (not really an interface change: these were always
   documented to return iterables, but it will no doubt break some client
-  code).
+  code).  Use e.g. list(browser.forms()) if you want a list.
 
 - .links no longer raises LinkNotFoundError (was accidental -- only
   .click_link() / .find_link() should raise this).
@@ -48,7 +52,9 @@
 - mechanize.Browser.default_encoding is gone.
 
 - mechanize.Browser.set_seekable_responses() is gone (they're always
-  .seek()able).
+  .seek()able).  Browser and UserAgent now both inherit from
+  mechanize.UserAgentBase, and UserAgent is now there only to add the
+  single method .set_seekable_responses().
 
 - Added Browser.encoding().
 

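As a quick illustration of the iterator change noted above, a minimal sketch (the URL is hypothetical, and a mechanize 0.1.x install with network access is assumed):

import mechanize

br = mechanize.Browser()
br.open("http://example.com/")  # hypothetical URL, for illustration only

# .forms() and .links() now return generators rather than sequences, so
# materialise them with list() when len(), indexing or reuse is needed.
forms = list(br.forms())
links = list(br.links())
print len(forms), len(links)
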
Modified: python-mechanize/trunk/ChangeLog.txt
===================================================================
--- python-mechanize/trunk/ChangeLog.txt	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/ChangeLog.txt	2007-04-09 23:35:16 UTC (rev 773)
@@ -1,7 +1,110 @@
 This isn't really in proper GNU ChangeLog format, it just happens to
 look that way.
 
-2006-05-06 John J Lee <jjl at pobox.com>
+2007-01-07 John J Lee <jjl at pobox.com>
+
+	* 0.1.6b release
+	* Add mechanize.ParseError class, document it as part of the
+	  mechanize.Factory interface, and raise it from all Factory
+	  implementations.  This is backwards-compatible, since the new
+	  exception derives from the old exceptions.
+	* Bug fix: Truncation when there is no full .read() before
+	  navigating to the next page, and an old response is read after
+	  navigation.  This happened e.g. with r = br.open();
+	  r.readline(); br.open(url); r.read(); br.back() .
+	* Bug fix: when .back() caused a reload, it was returning the old
+	  response, not the .reload()ed one.
+	* Bug fix: .back() was not returning a copy of the response, which
+	  presumably would cause seek position problems.
+	* Bug fix: base tag without href attribute would override document
+	  URL with a None value, causing a crash (thanks Nathan Eror).
+	* Fix .set_response() to close current response first.
+	* Fix non-idempotent behaviour of Factory.forms() / .links() .
+	  Previously, if for example you got a ParseError during execution
+	  of .forms(), you could call it again and have it not raise an
+	  exception, because it started out where it left off!
+	* Add a missing copy.copy() to RobustFactory .
+	* Fix redirection to 'URIs' that contain characters that are not
+	  allowed in URIs (thanks Riko Wichmann).  Also, Request
+	  constructor now logs a module logging warning about any such bad
+	  URIs.
+	* Add .global_form() method to Browser to support form controls
+	  whose HTML elements are not descendants of any FORM element.
+	* Add a new method .visit_response() .  This creates a new history
+	  entry from a response object, rather than just changing the
+	  current visited response.  This is useful e.g. when you want to
+	  use Browser features in a handler.
+	* Misc minor bug fixes.
+
+2006-10-25 John J Lee <jjl at pobox.com>
+
+	* 0.1.5b release: Update setuptools dependencies to depend on
+	  ClientForm>=0.2.5 (for an important bug fix affecting fragments
+	  in URLs).  There are no other changes in this release -- this
+	  release was done purely so that people upgrading to the latest
+	  version of mechanize will get the latest ClientForm too.
+
+2006-10-14 John J Lee <jjl at pobox.com>
+	* 0.1.4b release: (skipped a version deliberately for obscure
+	  reasons)
+	* Improved auth & proxies support.
+	* Follow RFC 3986.
+	* Add a .set_cookie() method to Browser .
+	* Add Browser.open_novisit() and Request.visit to allow fetching
+	  files without affecting Browser state.
+	* UserAgent and Browser are now subclasses of UserAgentBase.
+	  UserAgent's only role in life above what UserAgentBase does is
+	  to provide the .set_seekable_responses() method (it lives there
+	  because Browser depends on seekable responses, because that's
+	  how browser history is implemented).
+	* Bundle BeautifulSoup 2.1.1.  No more dependency pain!  Note that
+	  BeautifulSoup is, and always was, optional, and that mechanize
+	  will eventually switch to BeautifulSoup version 3, at which
+	  point it may well stop bundling BeautifulSoup.  Note also that
+	  the module is only used internally, and is not available as a
+	  public attribute of the package.  If you dare, you can import it
+	  ("from mechanize import _beautifulsoup"), but beware that it
+	  will go away later, and that the API of BeautifulSoup will
+	  change when the upgrade to 3 happens.  Also, BeautifulSoup
+	  support (mainly RobustFactory) is still a little experimental
+	  and buggy.
+	* Fix HTTP-EQUIV with no content attribute case (thanks Pratik
+	  Dam).
+	* Fix bug with quoted META Refresh URL (thanks Nilton Volpato).
+	* Fix crash with </base> tag (yajdbgr02 at sneakemail.com).
+	* Somebody found a server that (incorrectly) depends on HTTP
+	  header case, so follow the Title-Case convention.  Note that the
+	  Request headers interface(s), which were (somewhat oddly -- this
+	  is an inheritance from urllib2 that should really be fixed in a
+	  better way than it is currently) always case-sensitive still
+	  are; the only thing that changed is what actually eventually
+	  gets sent over the wire.
+	* Use mechanize (not urllib) to open robots.txt.  Don't consult
+	  RobotFileParser instance about non-HTTP URLs.
+	* Fix OpenerDirector.retrieve(), which was very broken (thanks
+	  Duncan Booth).
+	* Crash in a much more obvious way if trying to use OpenerDirector
+	  after .close() .
+	* .reload() on .back() if necessary (necessary iff response was
+	  not fully .read() on first .open()ing).
+	* Strip fragments before retrieving URLs (fixed
+	  Request.get_selector() to strip fragment).
+	* Fix catching HTTPError subclasses while still preserving all
+	  their response behaviour
+	* Correct over-enthusiastic documented guarantees of
+	  closeable_response .
+	* Fix assumption that httplib.HTTPMessage treats dict-style
+	  __setitem__ as append rather than set (where on earth did I get
+	  that from?).
+	* Expose History in mechanize/__init__.py (though interface is
+	  still experimental).
+	* Lots of other "internals" bugs fixed (thanks to reports /
+	  patches from Benji York especially, also Titus Brown, Duncan
+	  Booth, and me ;-), where I'm not 100% sure exactly when they
+	  were introduced, so not listing them here in detail.
+	* Numerous other minor fixes.
+	* Some code cleanup.
+
+2006-05-21 John J Lee <jjl at pobox.com>
 	* 0.1.2b release:
 	* mechanize now exports the whole urllib2 interface.
 	* Pull in bugfixed auth/proxy support code from Python 2.5.

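The truncation and .back() fixes in the 0.1.6b entry above concern navigation sequences like the following sketch (hypothetical URLs, following the changelog's own example):

import mechanize

br = mechanize.Browser()
r = br.open("http://example.com/a")  # hypothetical URL
r.readline()                         # only a partial read...
br.open("http://example.com/b")      # ...before navigating away
r.read()                             # previously this data could be truncated
r2 = br.back()                       # now reloads if needed, returning the fresh response
print r2.geturl()
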
Modified: python-mechanize/trunk/MANIFEST.in
===================================================================
--- python-mechanize/trunk/MANIFEST.in	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/MANIFEST.in	2007-04-09 23:35:16 UTC (rev 773)
@@ -10,5 +10,7 @@
 include ChangeLog.txt
 include 0.1.0-changes.txt
 include *.py
+prune docs-in-progress
 recursive-include examples *.py
 recursive-include attic *.py
+recursive-include test-tools *.py

Modified: python-mechanize/trunk/PKG-INFO
===================================================================
--- python-mechanize/trunk/PKG-INFO	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/PKG-INFO	2007-04-09 23:35:16 UTC (rev 773)
@@ -1,12 +1,12 @@
 Metadata-Version: 1.0
 Name: mechanize
-Version: 0.1.2b
+Version: 0.1.6b
 Summary: Stateful programmatic web browsing.
 Home-page: http://wwwsearch.sourceforge.net/mechanize/
 Author: John J. Lee
 Author-email: jjl at pobox.com
 License: BSD
-Download-URL: http://wwwsearch.sourceforge.net/mechanize/src/mechanize-0.1.2b.tar.gz
+Download-URL: http://wwwsearch.sourceforge.net/mechanize/src/mechanize-0.1.6b.tar.gz
 Description: Stateful programmatic web browsing, after Andy Lester's Perl module
         WWW::Mechanize.
         
@@ -25,7 +25,7 @@
         
         
 Platform: any
-Classifier: Development Status :: 3 - Alpha
+Classifier: Development Status :: 4 - Beta
 Classifier: Intended Audience :: Developers
 Classifier: Intended Audience :: System Administrators
 Classifier: License :: OSI Approved :: BSD License

Modified: python-mechanize/trunk/README.html
===================================================================
--- python-mechanize/trunk/README.html	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/README.html	2007-04-09 23:35:16 UTC (rev 773)
@@ -5,7 +5,7 @@
 <head>
   <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
   <meta name="author" content="John J. Lee &lt;jjl at pobox.com&gt;">
-  <meta name="date" content="2006-05-21">
+  <meta name="date" content="2006-12-30">
   <meta name="keywords" content="Python,HTML,HTTP,browser,stateful,web,client,client-side,mechanize,cookie,form,META,HTTP-EQUIV,Refresh,ClientForm,ClientCookie,pullparser,WWW::Mechanize">
   <meta name="keywords" content="cookie,HTTP,Python,web,client,client-side,HTML,META,HTTP-EQUIV,Refresh">
   <title>mechanize</title>
@@ -31,16 +31,18 @@
 <ul>
 
   <li><code>mechanize.Browser</code> is a subclass of
-    <code>mechanize.UserAgent</code>, which is, in turn, a subclass of
+    <code>mechanize.UserAgentBase</code>, which is, in turn, a subclass of
     <code>urllib2.OpenerDirector</code> (in fact, of
     <code>mechanize.OpenerDirector</code>), so:
     <ul>
       <li>any URL can be opened, not just <code>http:</code>
-      <li><code>mechanize.UserAgent</code> offers easy dynamic configuration of
-      user-agent features like protocol, cookie, redirection and
-      <code>robots.txt</code> handling, without having to make a new
-      <code>OpenerDirector</code> each time, e.g.  by calling
-      <code>build_opener()</code>.
+
+      <li><code>mechanize.UserAgentBase</code> offers easy dynamic
+      configuration of user-agent features like protocol, cookie,
+      redirection and <code>robots.txt</code> handling, without having
+      to make a new <code>OpenerDirector</code> each time, e.g.  by
+      calling <code>build_opener()</code>.
+
     </ul>
   <li>Easy HTML form filling, using <a href="../ClientForm/">ClientForm</a>
     interface.
@@ -145,7 +147,6 @@
 <span class="pycmt"># Sometimes it's useful to process bad headers or bad HTML:
 </span>response = br.response()  <span class="pycmt"># this is a copy of response</span>
 headers = response.info()  <span class="pycmt"># currently, this is a mimetools.Message</span>
-<span class="pykw">del</span> headers[<span class="pystr">"Content-type"</span>]  <span class="pycmt"># get rid of (possibly multiple) existing headers</span>
 headers[<span class="pystr">"Content-type"</span>] = <span class="pystr">"text/html; charset=utf-8"</span>
 response.set_data(response.get_data().replace(<span class="pystr">"&lt;!---"</span>, <span class="pystr">"&lt;!--"</span>))
 br.set_response(response)</pre>
@@ -160,7 +161,7 @@
 
 
 
-so anything you would normally import from <code>urllib2</code> can
+<p>so anything you would normally import from <code>urllib2</code> can
 (and should, by preference, to insulate you from future changes) be
 imported from mechanize instead.  In many cases if you import an
 object from mechanize it will be the very same object you would get if
@@ -170,6 +171,28 @@
 way.
 
 
+<a name="useragentbase"></a>
+<h2>UserAgent vs UserAgentBase</h2>
+
+<p><code>mechanize.UserAgent</code> is a trivial subclass of
+<code>mechanize.UserAgentBase</code>, adding just one method,
+<code>.set_seekable_responses()</code>, which allows switching off the
+addition of the <code>.seek()</code> method to response objects:
+
+<pre>
+ua = mechanize.UserAgent()
+ua.set_seekable_responses(False)
+ua.set_handle_equiv(False)  <span class="pycmt"># handling HTTP-EQUIV would add the .seek() method too</span>
+response = ua.open(<span class="pystr">'http://wwwsearch.sourceforge.net/'</span>)
+<span class="pykw">assert</span> <span class="pykw">not</span> hasattr(response, <span class="pystr">"seek"</span>)
+<span class="pykw">print</span> response.read()</pre>
+
+
+<p>The reason for the extra class is that
+<code>mechanize.Browser</code> depends on seekable response objects
+(because response objects are used to implement the browser history).
+
+
 <a name="compatnotes"></a>
 <h2>Compatibility</h2>
 
@@ -263,35 +286,38 @@
 
 
 <a name="todo"></a>
-<h2>Todo</h2>
+<h2>To do</h2>
 
 <p>Contributions welcome!
 
-<h3>Specific to mechanize</h3>
+<p>The documentation to-do list has moved to the new "docs-in-progress"
+directory in SVN.
 
-<em>This is <strong>very</strong> roughly in order of priority</em>
+<p><em>This is <strong>very</strong> roughly in order of priority</em>
 
 <ul>
-  <li>Add .get_method() to Request.
   <li>Test <code>.any_response()</code> two handlers case: ordering.
   <li>Test referer bugs (frags and don't add in redirect unless orig
     req had Referer)
-  <li>Implement RFC 3986 URL absolutization.
+  <li>Remove use of urlparse from _auth.py.
   <li>Proper XHTML support!
-  <li>Make encoding_finder public, I guess (but probably improve it first).
-    (For example: support Mark Pilgrim's universal encoding detector?)
-  <li>Continue with the de-crufting enabled by requirement for Python 2.3.
   <li>Fix BeautifulSoup support to use a single BeautifulSoup instance
     per page.
   <li>Test BeautifulSoup support better / fix encoding issue.
+  <li>Support BeautifulSoup 3.
   <li>Add another History implementation or two and finalise interface.
   <li>History cache expiration.
-  <li>Investigate possible leak (see Balazs Ree's list posting).
-  <li>Add two-way links between BeautifulSoup & ClientForm object models.
-  <li>In 0.2: fork urllib2 &#8212; easier maintenance.
+  <li>Investigate possible leak further (see Balazs Ree's list posting).
+  <li>Make <code>EncodingFinder</code> public, I guess (but probably
+    improve it first).  (For example: support Mark Pilgrim's universal
+    encoding detector?)
+  <li>Add two-way links between BeautifulSoup &amp; ClientForm object
+    models.
   <li>In 0.2: switch to Python unicode strings everywhere appropriate
     (HTTP level should still use byte strings, of course).
-  <li>clean_url(): test browser behaviour.  I <em>think</em> this is correct...
+  <li><code>clean_url()</code>: test browser behaviour.  I <em>think</em>
+    this is correct...
+  <li>Use a nicer RFC 3986 join / split / unsplit implementation.
   <li>Figure out the Right Thing (if such a thing exists) for %-encoding.
   <li>How do IRIs fit into the world?
   <li>IDNA -- must read about security stuff first.
@@ -303,23 +329,16 @@
   <li>gzip transfer encoding (there's already a handler for this in
     mechanize, but it's poorly implemented ATM).
   <li>proxy.pac parsing (I don't think this needs JS interpretation)
-</ul>
+  <li>Topological sort for handlers, instead of .handler_order
+    attribute.  Ordering and other dependencies (where unavoidable)
+    should be defined separate from handlers themselves.  Add new
+    build_opener and deprecate the old one?  Actually, _useragent is
+    probably not far off what I'd have in mind (would just need a
+    method or two and a base class adding I think), and it's not a high
+    priority since I guess most people will just use the UserAgent and
+    Browser classes.
 
-<h3>Documentation</h3>
-<ul>
-  <li>Document means of processing response on ad-hoc basis with
-    .set_response() - e.g. to fix bad encoding in Content-type header or
-    clean up bad HTML.
-  <li>Add example to documentation showing can pass None as handle arg
-    to <code>mechanize.UserAgent</code> methods and then .add_handler()
-    if need to give it a specific handler instance to use for one of the
-    things it UserAgent already handles.  Hmm, think this contradicts docs
-    ATM!  And is it better to do this a different way...??
-  <li>Rearrange so have decent class-by-class docs,
-    a tutorial/background-info doc, and a howto/examples doc.
-  <li>Add more functional tests.
-  <li>Auth / proxies.
-</ul>
+ </ul>
 
 
 <a name="download"></a>
@@ -347,13 +366,11 @@
 EasyInstall is a one-liner for the common case, to be compared with the usual
 download-unpack-install cycle with <code>setup.py</code>.
 
-<p><strong>You need EasyInstall version 0.6a8 or newer.</strong>
-
 <h3>Using EasyInstall to download and install mechanize</h3>
 
 <ol>
   <li><a href="http://peak.telecommunity.com/DevCenter/EasyInstall#installing-easy-install">
-Install easy_install</a> (you need version 0.6a8 or newer)
+Install easy_install</a>
   <li><code>easy_install mechanize</code>
 </ol>
 
@@ -388,9 +405,7 @@
 <code>easy_install "projectname==dev"</code> for that project.
 
 <p>Note also that you can still carry on using a plain old SVN checkout as
-usual if you like (optionally in conjunction with <a
-href="./#develop"><code>setup.py develop</code></a> &#8211; this is
-particularly useful on Windows, since it functions rather like symlinks).
+usual if you like.
 
 <h3>Using setup.py from a .tar.gz, .zip or an SVN checkout to download and install mechanize</h3>
 
@@ -404,50 +419,19 @@
 
 <pre>python setup.py easy_install mechanize</pre>
 
-<a name="develop"></a>
-<h3>Using setup.py to install mechanize for development work on mechanize</h3>
 
-<p><strong>Note: this section is only useful for people who want to change
-mechanize</strong>: It is not useful to do this if all you want is to <a
-href="./#svnhead">keep up with SVN</a>.
-
-<p>For development of mechanize using EasyInstall (see the <a
-href="http://peak.telecommunity.com/DevCenter/setuptools">setuptools</a> docs
-for details), you have the option of using the <code>develop</code> distutils
-command.  This is particularly useful on Windows, since it functions rather
-like symlinks.  Get the mechanize source, then:
-
-<pre>python setup.py develop</pre>
-
-<p>Note that after every <code>svn update</code> on a
-<code>develop</code>-installed project, you should run <code>setup.py
-develop</code> to ensure that project's dependencies are updated if required.
-
-<p>Also note that, currently, if you also use the <code>develop</code>
-distutils command on the <em>dependencies</em> of mechanize (<em>viz</em>,
-ClientForm, and optionally BeautifulSoup) to keep up with SVN, you must run
-<code>setup.py develop</code> for each dependency of mechanize before running
-it for mechanize itself.  As a result, in this case it's probably simplest to
-just set up your <code>sys.path</code> manually rather than using
-<code>setup.py develop</code>.
-
-<p>One convenient way to get the latest source is:
-
-<pre>easy_install --editable --build-directory mybuilddir "mechanize==dev"</pre>
-
-
 <a name="source"></a>
 <h2>Download</h2>
 <p>All documentation (including this web page) is included in the distribution.
 
-<p>This is an alpha release: interfaces may change, and there will be bugs.
+<p>This is a beta release: there will be bugs.
 
 <p><em>Development release.</em>
 
 <ul>
 
-<li><a href="./src/mechanize-0.1.2b.tar.gz">mechanize-0.1.2b.tar.gz</a>
-<li><a href="./src/mechanize-0.1.2b.zip">mechanize-0.1.2b.zip</a>
+<li><a href="./src/mechanize-0.1.6b.tar.gz">mechanize-0.1.6b.tar.gz</a>
+<li><a href="./src/mechanize-0.1.6b.zip">mechanize-0.1.6b.zip</a>
 <li><a href="./src/ChangeLog.txt">Change Log</a> (included in distribution)
 <li><a href="./src/">Older versions.</a>
 </ul>
@@ -511,9 +495,9 @@
 
 <ul>
 
-  <li><a href="http://cheeseshop.python.org/pypi?:action=display&name=zope.testbrowser">
+  <li><a href="http://cheeseshop.python.org/pypi?:action=display&amp;name=zope.testbrowser">
     <code>zope.testbrowser</code></a> (or
-    <a href="http://cheeseshop.python.org/pypi?%3Aaction=display&name=ZopeTestbrowser">
+    <a href="http://cheeseshop.python.org/pypi?%3Aaction=display&amp;name=ZopeTestbrowser">
     <code>ZopeTestBrowser</code></a>, the standalone version).
   <li><a href="http://www.idyll.org/~t/www-tools/twill.html">twill</a>.
 </ul>
@@ -541,6 +525,13 @@
   <p>2.3 or above.
   <li>What else do I need?
   <p>mechanize depends on <a href="../ClientForm/">ClientForm</a>.
+  <li>Does mechanize depend on BeautifulSoup?
+     No.  mechanize offers a few (still rather experimental) classes that make
+     use of BeautifulSoup, but these classes are not required to use mechanize.
+     mechanize bundles BeautifulSoup version 2, so that module is no longer
+     required.  A future version of mechanize will support BeautifulSoup
+     version 3, at which point mechanize will likely no longer bundle the
+     module.
   <p>The versions of those required modules are listed in the
      <code>setup.py</code> for mechanize (included with the download).  The
      dependencies are automatically fetched by <a
@@ -559,6 +550,11 @@
 <a name="usagefaq"></a>
 <h2>FAQs - usage</h2>
 <ul>
+  <li>I'm not getting the HTML page I expected to see.
+    <ul>
+      <li><a href="http://wwwsearch.sourceforge.net/mechanize/doc.html#debugging">Debugging tips</a>
+      <li><a href="http://wwwsearch.sourceforge.net/bits/GeneralFAQ.html">More tips</a>
+     </ul>
   <li>I'm <strong><em>sure</em></strong> this page is HTML, why does
      <code>mechanize.Browser</code> think otherwise?
 <pre>
@@ -576,7 +572,7 @@
 mailing list</a> rather than direct to me.
 
 <p><a href="mailto:jjl at pobox.com">John J. Lee</a>,
-May 2006.
+December 2006.
 
 <hr>
 

Modified: python-mechanize/trunk/README.html.in
===================================================================
--- python-mechanize/trunk/README.html.in	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/README.html.in	2007-04-09 23:35:16 UTC (rev 773)
@@ -6,7 +6,7 @@
 from colorize import colorize
 import time
 import release
-last_modified = release.svn_id_to_time("$Id: README.html.in 27559 2006-05-21 22:39:21Z jjlee $")
+last_modified = release.svn_id_to_time("$Id: README.html.in 36066 2006-12-30 21:00:39Z jjlee $")
 try:
     base
 except NameError:
@@ -42,16 +42,18 @@
 <ul>
 
   <li><code>mechanize.Browser</code> is a subclass of
-    <code>mechanize.UserAgent</code>, which is, in turn, a subclass of
+    <code>mechanize.UserAgentBase</code>, which is, in turn, a subclass of
     <code>urllib2.OpenerDirector</code> (in fact, of
     <code>mechanize.OpenerDirector</code>), so:
     <ul>
       <li>any URL can be opened, not just <code>http:</code>
-      <li><code>mechanize.UserAgent</code> offers easy dynamic configuration of
-      user-agent features like protocol, cookie, redirection and
-      <code>robots.txt</code> handling, without having to make a new
-      <code>OpenerDirector</code> each time, e.g.  by calling
-      <code>build_opener()</code>.
+
+      <li><code>mechanize.UserAgentBase</code> offers easy dynamic
+      configuration of user-agent features like protocol, cookie,
+      redirection and <code>robots.txt</code> handling, without having
+      to make a new <code>OpenerDirector</code> each time, e.g.  by
+      calling <code>build_opener()</code>.
+
     </ul>
   <li>Easy HTML form filling, using <a href="../ClientForm/">ClientForm</a>
     interface.
@@ -156,7 +158,6 @@
 # Sometimes it's useful to process bad headers or bad HTML:
 response = br.response()  # this is a copy of response
 headers = response.info()  # currently, this is a mimetools.Message
-del headers["Content-type"]  # get rid of (possibly multiple) existing headers
 headers["Content-type"] = "text/html; charset=utf-8"
 response.set_data(response.get_data().replace("<!---", "<!--"))
 br.set_response(response)
@@ -171,7 +172,7 @@
 """)}
 
 
-so anything you would normally import from <code>urllib2</code> can
+<p>so anything you would normally import from <code>urllib2</code> can
 (and should, by preference, to insulate you from future changes) be
 imported from mechanize instead.  In many cases if you import an
 object from mechanize it will be the very same object you would get if
@@ -181,6 +182,28 @@
 way.
 
 
+<a name="useragentbase"></a>
+<h2>UserAgent vs UserAgentBase</h2>
+
+<p><code>mechanize.UserAgent</code> is a trivial subclass of
+<code>mechanize.UserAgentBase</code>, adding just one method,
+<code>.set_seekable_responses()</code>, which allows switching off the
+addition of the <code>.seek()</code> method to response objects:
+
+@{colorize("""
+ua = mechanize.UserAgent()
+ua.set_seekable_responses(False)
+ua.set_handle_equiv(False)  # handling HTTP-EQUIV would add the .seek() method too
+response = ua.open('http://wwwsearch.sourceforge.net/')
+assert not hasattr(response, "seek")
+print response.read()
+""")}
+
+<p>The reason for the extra class is that
+<code>mechanize.Browser</code> depends on seekable response objects
+(because response objects are used to implement the browser history).
+
+
 <a name="compatnotes"></a>
 <h2>Compatibility</h2>
 
@@ -274,35 +297,38 @@
 
 
 <a name="todo"></a>
-<h2>Todo</h2>
+<h2>To do</h2>
 
 <p>Contributions welcome!
 
-<h3>Specific to mechanize</h3>
+<p>The documentation to-do list has moved to the new "docs-in-progress"
+directory in SVN.
 
-<em>This is <strong>very</strong> roughly in order of priority</em>
+<p><em>This is <strong>very</strong> roughly in order of priority</em>
 
 <ul>
-  <li>Add .get_method() to Request.
   <li>Test <code>.any_response()</code> two handlers case: ordering.
   <li>Test referer bugs (frags and don't add in redirect unless orig
     req had Referer)
-  <li>Implement RFC 3986 URL absolutization.
+  <li>Remove use of urlparse from _auth.py.
   <li>Proper XHTML support!
-  <li>Make encoding_finder public, I guess (but probably improve it first).
-    (For example: support Mark Pilgrim's universal encoding detector?)
-  <li>Continue with the de-crufting enabled by requirement for Python 2.3.
   <li>Fix BeautifulSoup support to use a single BeautifulSoup instance
     per page.
   <li>Test BeautifulSoup support better / fix encoding issue.
+  <li>Support BeautifulSoup 3.
   <li>Add another History implementation or two and finalise interface.
   <li>History cache expiration.
-  <li>Investigate possible leak (see Balazs Ree's list posting).
-  <li>Add two-way links between BeautifulSoup & ClientForm object models.
-  <li>In 0.2: fork urllib2 &#8212; easier maintenance.
+  <li>Investigate possible leak further (see Balazs Ree's list posting).
+  <li>Make <code>EncodingFinder</code> public, I guess (but probably
+    improve it first).  (For example: support Mark Pilgrim's universal
+    encoding detector?)
+  <li>Add two-way links between BeautifulSoup &amp; ClientForm object
+    models.
   <li>In 0.2: switch to Python unicode strings everywhere appropriate
     (HTTP level should still use byte strings, of course).
-  <li>clean_url(): test browser behaviour.  I <em>think</em> this is correct...
+  <li><code>clean_url()</code>: test browser behaviour.  I <em>think</em>
+    this is correct...
+  <li>Use a nicer RFC 3986 join / split / unsplit implementation.
   <li>Figure out the Right Thing (if such a thing exists) for %-encoding.
   <li>How do IRIs fit into the world?
   <li>IDNA -- must read about security stuff first.
@@ -314,23 +340,16 @@
   <li>gzip transfer encoding (there's already a handler for this in
     mechanize, but it's poorly implemented ATM).
   <li>proxy.pac parsing (I don't think this needs JS interpretation)
-</ul>
+  <li>Topological sort for handlers, instead of .handler_order
+    attribute.  Ordering and other dependencies (where unavoidable)
+    should be defined separate from handlers themselves.  Add new
+    build_opener and deprecate the old one?  Actually, _useragent is
+    probably not far off what I'd have in mind (would just need a
+    method or two and a base class adding I think), and it's not a high
+    priority since I guess most people will just use the UserAgent and
+    Browser classes.
 
-<h3>Documentation</h3>
-<ul>
-  <li>Document means of processing response on ad-hoc basis with
-    .set_response() - e.g. to fix bad encoding in Content-type header or
-    clean up bad HTML.
-  <li>Add example to documentation showing can pass None as handle arg
-    to <code>mechanize.UserAgent</code> methods and then .add_handler()
-    if need to give it a specific handler instance to use for one of the
-    things it UserAgent already handles.  Hmm, think this contradicts docs
-    ATM!  And is it better to do this a different way...??
-  <li>Rearrange so have decent class-by-class docs,
-    a tutorial/background-info doc, and a howto/examples doc.
-  <li>Add more functional tests.
-  <li>Auth / proxies.
-</ul>
+ </ul>
 
 
 <a name="download"></a>
@@ -358,13 +377,11 @@
 EasyInstall is a one-liner for the common case, to be compared with the usual
 download-unpack-install cycle with <code>setup.py</code>.
 
-<p><strong>You need EasyInstall version 0.6a8 or newer.</strong>
-
 <h3>Using EasyInstall to download and install mechanize</h3>
 
 <ol>
   <li><a href="http://peak.telecommunity.com/DevCenter/EasyInstall#installing-easy-install">
-Install easy_install</a> (you need version 0.6a8 or newer)
+Install easy_install</a>
   <li><code>easy_install mechanize</code>
 </ol>
 
@@ -399,9 +416,7 @@
 <code>easy_install "projectname==dev"</code> for that project.
 
 <p>Note also that you can still carry on using a plain old SVN checkout as
-usual if you like (optionally in conjunction with <a
-href="./#develop"><code>setup.py develop</code></a> &#8211; this is
-particularly useful on Windows, since it functions rather like symlinks).
+usual if you like.
 
 <h3>Using setup.py from a .tar.gz, .zip or an SVN checkout to download and install mechanize</h3>
 
@@ -415,48 +430,17 @@
 
 <pre>python setup.py easy_install mechanize</pre>
 
-<a name="develop"></a>
-<h3>Using setup.py to install mechanize for development work on mechanize</h3>
 
-<p><strong>Note: this section is only useful for people who want to change
-mechanize</strong>: It is not useful to do this if all you want is to <a
-href="./#svnhead">keep up with SVN</a>.
-
-<p>For development of mechanize using EasyInstall (see the <a
-href="http://peak.telecommunity.com/DevCenter/setuptools">setuptools</a> docs
-for details), you have the option of using the <code>develop</code> distutils
-command.  This is particularly useful on Windows, since it functions rather
-like symlinks.  Get the mechanize source, then:
-
-<pre>python setup.py develop</pre>
-
-<p>Note that after every <code>svn update</code> on a
-<code>develop</code>-installed project, you should run <code>setup.py
-develop</code> to ensure that project's dependencies are updated if required.
-
-<p>Also note that, currently, if you also use the <code>develop</code>
-distutils command on the <em>dependencies</em> of mechanize (<em>viz</em>,
-ClientForm, and optionally BeautifulSoup) to keep up with SVN, you must run
-<code>setup.py develop</code> for each dependency of mechanize before running
-it for mechanize itself.  As a result, in this case it's probably simplest to
-just set up your <code>sys.path</code> manually rather than using
-<code>setup.py develop</code>.
-
-<p>One convenient way to get the latest source is:
-
-<pre>easy_install --editable --build-directory mybuilddir "mechanize==dev"</pre>
-
-
 <a name="source"></a>
 <h2>Download</h2>
 <p>All documentation (including this web page) is included in the distribution.
 
-<p>This is an alpha release: interfaces may change, and there will be bugs.
+<p>This is a beta release: there will be bugs.
 
 <p><em>Development release.</em>
 
 <ul>
-@{version = "0.1.2b"}
+@{version = "0.1.6b"}
 <li><a href="./src/mechanize-@(version).tar.gz">mechanize-@(version).tar.gz</a>
 <li><a href="./src/mechanize-@(version).zip">mechanize-@(version).zip</a>
 <li><a href="./src/ChangeLog.txt">Change Log</a> (included in distribution)
@@ -522,9 +506,9 @@
 
 <ul>
 
-  <li><a href="http://cheeseshop.python.org/pypi?:action=display&name=zope.testbrowser">
+  <li><a href="http://cheeseshop.python.org/pypi?:action=display&amp;name=zope.testbrowser">
     <code>zope.testbrowser</code></a> (or
-    <a href="http://cheeseshop.python.org/pypi?%3Aaction=display&name=ZopeTestbrowser">
+    <a href="http://cheeseshop.python.org/pypi?%3Aaction=display&amp;name=ZopeTestbrowser">
     <code>ZopeTestBrowser</code></a>, the standalone version).
   <li><a href="http://www.idyll.org/~t/www-tools/twill.html">twill</a>.
 </ul>
@@ -552,6 +536,13 @@
   <p>2.3 or above.
   <li>What else do I need?
   <p>mechanize depends on <a href="../ClientForm/">ClientForm</a>.
+  <li>Does mechanize depend on BeautifulSoup?
+     No.  mechanize offers a few (still rather experimental) classes that make
+     use of BeautifulSoup, but these classes are not required to use mechanize.
+     mechanize bundles BeautifulSoup version 2, so that module is no longer
+     required.  A future version of mechanize will support BeautifulSoup
+     version 3, at which point mechanize will likely no longer bundle the
+     module.
   <p>The versions of those required modules are listed in the
      <code>setup.py</code> for mechanize (included with the download).  The
      dependencies are automatically fetched by <a
@@ -570,6 +561,11 @@
 <a name="usagefaq"></a>
 <h2>FAQs - usage</h2>
 <ul>
+  <li>I'm not getting the HTML page I expected to see.
+    <ul>
+      <li><a href="http://wwwsearch.sourceforge.net/mechanize/doc.html#debugging">Debugging tips</a>
+      <li><a href="http://wwwsearch.sourceforge.net/bits/GeneralFAQ.html">More tips</a>
+     </ul>
   <li>I'm <strong><em>sure</em></strong> this page is HTML, why does
      <code>mechanize.Browser</code> think otherwise?
 @{colorize("""

Modified: python-mechanize/trunk/README.txt
===================================================================
--- python-mechanize/trunk/README.txt	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/README.txt	2007-04-09 23:35:16 UTC (rev 773)
@@ -1,18 +1,19 @@
-   [1]SourceForge.net Logo
 
+   [1]SourceForge.net Logo 
+
                                    mechanize
 
    Stateful programmatic web browsing in Python, after Andy Lester's Perl
    module [2]WWW::Mechanize .
-     * mechanize.Browser is a subclass of mechanize.UserAgent, which is,
-       in turn, a subclass of urllib2.OpenerDirector (in fact, of
+     * mechanize.Browser is a subclass of mechanize.UserAgentBase, which
+       is, in turn, a subclass of urllib2.OpenerDirector (in fact, of
        mechanize.OpenerDirector), so:
           + any URL can be opened, not just http:
-          + mechanize.UserAgent offers easy dynamic configuration of
+          + mechanize.UserAgentBase offers easy dynamic configuration of
             user-agent features like protocol, cookie, redirection and
             robots.txt handling, without having to make a new
             OpenerDirector each time, e.g. by calling build_opener().
-     * Easy HTML form filling, using [3]ClientForm interface.
+     * Easy HTML form filling, using ClientForm interface.
      * Convenient link parsing and following.
      * Browser history (.back() and .reload() methods).
      * The Referer HTTP header is added properly (optional).
@@ -23,7 +24,7 @@
 
    This documentation is in need of reorganisation and extension!
 
-   The two below are just to give the gist. There are also some [5]actual
+   The two below are just to give the gist. There are also some actual
    working examples.
 import re
 from mechanize import Browser
@@ -102,7 +103,6 @@
 # Sometimes it's useful to process bad headers or bad HTML:
 response = br.response()  # this is a copy of response
 headers = response.info()  # currently, this is a mimetools.Message
-del headers["Content-type"]  # get rid of (possibly multiple) existing headers
 headers["Content-type"] = "text/html; charset=utf-8"
 response.set_data(response.get_data().replace("<!---", "<!--"))
 br.set_response(response)
@@ -120,6 +120,23 @@
    comes from mechanize, either because bug fixes have been applied or
    the functionality of urllib2 has been extended in some way.
 
+UserAgent vs UserAgentBase
+
+   mechanize.UserAgent is a trivial subclass of mechanize.UserAgentBase,
+   adding just one method, .set_seekable_responses(), which allows
+   switching off the addition of the .seek() method to response objects:
+ua = mechanize.UserAgent()
+ua.set_seekable_responses(False)
+ua.set_handle_equiv(False)  # handling HTTP-EQUIV would add the .seek() method too
+response = ua.open('http://wwwsearch.sourceforge.net/')
+assert not hasattr(response, "seek")
+print response.read()
+
+   The reason for the extra class is that mechanize.Browser depends on
+   seekable response objects (because response objects are used to
+   implement the browser history).
+
 Compatibility
 
    These notes explain the relationship between mechanize, ClientCookie,
@@ -183,36 +200,35 @@
    (coincidentally-named) Johnny Lee for the MSIE CookieJar Perl code
    from which mechanize's support for that is derived.
 
-Todo
+To do
 
    Contributions welcome!
 
-Specific to mechanize
+   The documentation to-do list has moved to the new "docs-in-progress"
+   directory in SVN.
 
    This is very roughly in order of priority
-     * Add .get_method() to Request.
      * Test .any_response() two handlers case: ordering.
      * Test referer bugs (frags and don't add in redirect unless orig req
        had Referer)
-     * Implement RFC 3986 URL absolutization.
+     * Remove use of urlparse from _auth.py.
      * Proper XHTML support!
-     * Make encoding_finder public, I guess (but probably improve it
-       first). (For example: support Mark Pilgrim's universal encoding
-       detector?)
-     * Continue with the de-crufting enabled by requirement for Python
-       2.3.
      * Fix BeautifulSoup support to use a single BeautifulSoup instance
        per page.
      * Test BeautifulSoup support better / fix encoding issue.
+     * Support BeautifulSoup 3.
      * Add another History implementation or two and finalise interface.
      * History cache expiration.
-     * Investigate possible leak (see Balazs Ree's list posting).
+     * Investigate possible leak further (see Balazs Ree's list posting).
+     * Make EncodingFinder public, I guess (but probably improve it
+       first). (For example: support Mark Pilgrim's universal encoding
+       detector?)
      * Add two-way links between BeautifulSoup & ClientForm object
        models.
-     * In 0.2: fork urllib2 -- easier maintenance.
      * In 0.2: switch to Python unicode strings everywhere appropriate
        (HTTP level should still use byte strings, of course).
      * clean_url(): test browser behaviour. I think this is correct...
+     * Use a nicer RFC 3986 join / split / unsplit implementation.
      * Figure out the Right Thing (if such a thing exists) for
        %-encoding.
      * How do IRIs fit into the world?
@@ -225,30 +241,23 @@
      * gzip transfer encoding (there's already a handler for this in
        mechanize, but it's poorly implemented ATM).
      * proxy.pac parsing (I don't think this needs JS interpretation)
+     * Topological sort for handlers, instead of .handler_order
+       attribute. Ordering and other dependencies (where unavoidable)
+       should be defined separate from handlers themselves. Add new
+       build_opener and deprecate the old one? Actually, _useragent is
+       probably not far off what I'd have in mind (would just need a
+       method or two and a base class adding I think), and it's not a
+       high priority since I guess most people will just use the
+       UserAgent and Browser classes.
 
-Documentation
-
-     * Document means of processing response on ad-hoc basis with
-       .set_response() - e.g. to fix bad encoding in Content-type header
-       or clean up bad HTML.
-     * Add example to documentation showing can pass None as handle arg
-       to mechanize.UserAgent methods and then .add_handler() if need to
-       give it a specific handler instance to use for one of the things
-       it UserAgent already handles. Hmm, think this contradicts docs
-       ATM! And is it better to do this a different way...??
-     * Rearrange so have decent class-by-class docs, a
-       tutorial/background-info doc, and a howto/examples doc.
-     * Add more functional tests.
-     * Auth / proxies.
-
 Getting mechanize
 
-   You can install the [7]old-fashioned way, or using [8]EasyInstall. I
+   You can install the old-fashioned way, or using [8]EasyInstall. I
    recommend the latter even though EasyInstall is still in alpha,
    because it will automatically ensure you have the necessary
    dependencies, downloading if necessary.
 
-   [9]Subversion (SVN) access is also available.
+   Subversion (SVN) access is also available.
 
    Since EasyInstall is new, I include some instructions below, but
    mechanize follows standard EasyInstall / setuptools conventions, so
@@ -262,11 +271,9 @@
    a one-liner for the common case, to be compared with the usual
    download-unpack-install cycle with setup.py.
 
-   You need EasyInstall version 0.6a8 or newer.
-
 Using EasyInstall to download and install mechanize
 
-    1. [12]Install easy_install (you need version 0.6a8 or newer)
+    1. [12]Install easy_install
     2. easy_install mechanize
 
    If you're on a Unix-like OS, you may need root permissions for that
@@ -297,9 +304,7 @@
    "projectname=dev" for that project.
 
    Note also that you can still carry on using a plain old SVN checkout
-   as usual if you like (optionally in conjunction with [15]setup.py
-   develop - this is particularly useful on Windows, since it functions
-   rather like symlinks).
+   as usual if you like.
 
 Using setup.py from a .tar.gz, .zip or an SVN checkout to download and
 install mechanize
@@ -312,53 +317,26 @@
    setup.py --help easy_install)
 python setup.py easy_install mechanize
 
-Using setup.py to install mechanize for development work on mechanize
-
-   Note: this section is only useful for people who want to change
-   mechanize: It is not useful to do this if all you want is to [16]keep
-   up with SVN.
-
-   For development of mechanize using EasyInstall (see the [17]setuptools
-   docs for details), you have the option of using the develop distutils
-   command. This is particularly useful on Windows, since it functions
-   rather like symlinks. Get the mechanize source, then:
-python setup.py develop
-
-   Note that after every svn update on a develop-installed project, you
-   should run setup.py develop to ensure that project's dependencies are
-   updated if required.
-
-   Also note that, currently, if you also use the develop distutils
-   command on the dependencies of mechanize (viz, ClientForm, and
-   optionally BeautifulSoup) to keep up with SVN, you must run setup.py
-   develop for each dependency of mechanize before running it for
-   mechanize itself. As a result, in this case it's probably simplest to
-   just set up your sys.path manually rather than using setup.py develop.
-
-   One convenient way to get the latest source is:
-easy_install --editable --build-directory mybuilddir "mechanize==dev"
-
 Download
 
    All documentation (including this web page) is included in the
    distribution.
 
-   This is an alpha release: interfaces may change, and there will be
-   bugs.
+   This is a beta release: there will be bugs.
 
    Development release.
-     * [18]mechanize-0.1.2b.tar.gz
-     * [19]mechanize-0.1.2b.zip
-     * [20]Change Log (included in distribution)
-     * [21]Older versions.
+     * mechanize-0.1.6b.tar.gz
+     * mechanize-0.1.6b.zip
+     * Change Log (included in distribution)
+     * Older versions.
 
    For old-style installation instructions, see the INSTALL file included
-   in the distribution. Better, [22]use EasyInstall.
+   in the distribution. Better, use EasyInstall.
 
 Subversion
 
-   The [23]Subversion (SVN) trunk is
-   [24]http://codespeak.net/svn/wwwsearch/mechanize/trunk, so to check
+   The [20]Subversion (SVN) trunk is
+   [21]http://codespeak.net/svn/wwwsearch/mechanize/trunk, so to check
    out the source:
 svn co http://codespeak.net/svn/wwwsearch/mechanize/trunk mechanize
 
@@ -366,13 +344,13 @@
 
 Examples
 
-   The examples directory in the [25]source packages contains a couple of
+   The examples directory in the source packages contains a couple of
    silly, but working, scripts to demonstrate basic use of the module.
    Note that it's in the nature of web scraping for such scripts to
    break, so don't be too surprised if that happens - do let me know,
    though!
 
-   It's worth knowing also that the examples on the [26]ClientForm web
+   It's worth knowing also that the examples on the ClientForm web
    page are useful for mechanize users, and are now real run-able scripts
    rather than just documentation.
 
@@ -399,12 +377,12 @@
 
    There are several wrappers around mechanize designed for functional
    testing of web applications:
-     * [27]zope.testbrowser (or [28]ZopeTestBrowser, the standalone
+     * [24]zope.testbrowser (or [25]ZopeTestBrowser, the standalone
        version).
-     * [29]twill.
+     * [26]twill.
 
-   Richard Jones' [30]webunit (this is not the same as Steven Purcell's
-   [31]code of the same name). webunit and mechanize are quite similar.
+   Richard Jones' [27]webunit (this is not the same as Steven Purcell's
+   [28]code of the same name). webunit and mechanize are quite similar.
    On the minus side, webunit is missing things like browser history,
    high-level forms and links handling, thorough cookie handling, refresh
    redirection, adding of the Referer header, observance of robots.txt
@@ -415,27 +393,37 @@
    with aims limited to writing tests, where mechanize and the modules it
    depends on try hard to be general-purpose libraries.
 
-   There are many related links in the [32]General FAQ page, too.
+   There are many related links in the General FAQ page, too.
 
 FAQs - pre install
 
      * Which version of Python do I need?
        2.3 or above.
      * What else do I need?
-       mechanize depends on [33]ClientForm.
+       mechanize depends on ClientForm.
+     * Does mechanize depend on BeautifulSoup? No. mechanize offers a few
+       (still rather experimental) classes that make use of
+       BeautifulSoup, but these classes are not required to use
+       mechanize. mechanize bundles BeautifulSoup version 2, so that
+       module is no longer required. A future version of mechanize will
+       support BeautifulSoup version 3, at which point mechanize will
+       likely no longer bundle the module.
        The versions of those required modules are listed in the setup.py
        for mechanize (included with the download). The dependencies are
-       automatically fetched by [34]EasyInstall (or by [35]downloading a
+       automatically fetched by [31]EasyInstall (or by downloading a
        mechanize source package and running python setup.py install). If
        you like you can fetch and install them manually, instead - see
        the INSTALL.txt file (included with the distribution).
      * Which license?
-       mechanize is dual-licensed: you may pick either the [36]BSD
-       license, or the [37]ZPL 2.1 (both are included in the
+       mechanize is dual-licensed: you may pick either the [33]BSD
+       license, or the [34]ZPL 2.1 (both are included in the
        distribution).
 
 FAQs - usage
 
+     * I'm not getting the HTML page I expected to see.
+          + [35]Debugging tips
+          + [36]More tips
      * I'm sure this page is HTML, why does mechanize.Browser think
        otherwise?
 b = mechanize.Browser(
@@ -445,92 +433,57 @@
     factory=mechanize.DefaultFactory(i_want_broken_xhtml_support=True)
     )
 
-   I prefer questions and comments to be sent to the [38]mailing list
+   I prefer questions and comments to be sent to the [37]mailing list
    rather than direct to me.
 
-   [39]John J. Lee, May 2006.
+   [38]John J. Lee, December 2006.
      _________________________________________________________________
 
-   [40]Home
-   [41]General FAQs
+   Home
+   General FAQs
    mechanize
-   [42]mechanize docs
-   [43]ClientForm
-   [44]ClientCookie
-   [45]ClientCookie docs
-   [46]pullparser
-   [47]DOMForm
-   [48]python-spidermonkey
-   [49]ClientTable
-   [50]1.5.2 urllib2.py
-   [51]1.5.2 urllib.py
-   [52]Examples
-   [53]Compatibility
-   [54]Documentation
-   [55]To-do
-   [56]Download
-   [57]Subversion
-   [58]More examples
-   [59]FAQs
+   mechanize docs
+   ClientForm
+   ClientCookie
+   ClientCookie docs
+   pullparser
+   DOMForm
+   python-spidermonkey
+   ClientTable
+   1.5.2 urllib2.py
+   1.5.2 urllib.py
+   Examples
+   Compatibility
+   Documentation
+   To-do
+   Download
+   Subversion
+   More examples
+   FAQs
 
 References
 
    1. http://sourceforge.net/
    2. http://search.cpan.org/dist/WWW-Mechanize/
-   3. file://localhost/tmp/ClientForm/
    4. http://www.robotstxt.org/wc/norobots.html
-   5. file://localhost/tmp/tmpexjjQ7/#tests
    6. http://search.cpan.org/dist/WWW-Mechanize/
-   7. file://localhost/tmp/tmpexjjQ7/#source
    8. http://peak.telecommunity.com/DevCenter/EasyInstall
-   9. file://localhost/tmp/tmpexjjQ7/#svn
   10. http://peak.telecommunity.com/DevCenter/EasyInstall
   11. http://peak.telecommunity.com/DevCenter/setuptools
   12. http://peak.telecommunity.com/DevCenter/EasyInstall#installing-easy-install
   13. http://peak.telecommunity.com/DevCenter/EasyInstall
   14. http://peak.telecommunity.com/DevCenter/PythonEggs
-  15. file://localhost/tmp/tmpexjjQ7/#develop
-  16. file://localhost/tmp/tmpexjjQ7/#svnhead
-  17. http://peak.telecommunity.com/DevCenter/setuptools
-  18. file://localhost/tmp/tmpexjjQ7/src/mechanize-0.1.2b.tar.gz
-  19. file://localhost/tmp/tmpexjjQ7/src/mechanize-0.1.2b.zip
-  20. file://localhost/tmp/tmpexjjQ7/src/ChangeLog.txt
-  21. file://localhost/tmp/tmpexjjQ7/src/
-  22. file://localhost/tmp/tmpexjjQ7/#download
-  23. http://subversion.tigris.org/
-  24. http://codespeak.net/svn/wwwsearch/mechanize/trunk#egg=mechanize-dev
-  25. file://localhost/tmp/tmpexjjQ7/#source
-  26. file://localhost/tmp/ClientForm/
-  27. http://cheeseshop.python.org/pypi?:action=display&name=zope.testbrowser
-  28. http://cheeseshop.python.org/pypi?%3Aaction=display&name=ZopeTestbrowser
-  29. http://www.idyll.org/~t/www-tools/twill.html
-  30. http://mechanicalcat.net/tech/webunit/
-  31. http://webunit.sourceforge.net/
-  32. file://localhost/tmp/bits/GeneralFAQ.html
-  33. file://localhost/tmp/ClientForm/
-  34. http://peak.telecommunity.com/DevCenter/EasyInstall
-  35. file://localhost/tmp/tmpexjjQ7/#source
-  36. http://www.opensource.org/licenses/bsd-license.php
-  37. http://www.zope.org/Resources/ZPL
-  38. http://lists.sourceforge.net/lists/listinfo/wwwsearch-general
-  39. mailto:jjl at pobox.com
-  40. file://localhost/tmp
-  41. file://localhost/tmp/bits/GeneralFAQ.html
-  42. file://localhost/tmp/mechanize/doc.html
-  43. file://localhost/tmp/ClientForm/
-  44. file://localhost/tmp/ClientCookie/
-  45. file://localhost/tmp/ClientCookie/doc.html
-  46. file://localhost/tmp/pullparser/
-  47. file://localhost/tmp/DOMForm/
-  48. file://localhost/tmp/python-spidermonkey/
-  49. file://localhost/tmp/ClientTable/
-  50. file://localhost/tmp/bits/urllib2_152.py
-  51. file://localhost/tmp/bits/urllib_152.py
-  52. file://localhost/tmp/tmpexjjQ7/#examples
-  53. file://localhost/tmp/tmpexjjQ7/#compatnotes
-  54. file://localhost/tmp/tmpexjjQ7/#docs
-  55. file://localhost/tmp/tmpexjjQ7/#todo
-  56. file://localhost/tmp/tmpexjjQ7/#download
-  57. file://localhost/tmp/tmpexjjQ7/#svn
-  58. file://localhost/tmp/tmpexjjQ7/#tests
-  59. file://localhost/tmp/tmpexjjQ7/#faq
+  20. http://subversion.tigris.org/
+  21. http://codespeak.net/svn/wwwsearch/mechanize/trunk#egg=mechanize-dev
+  24. http://cheeseshop.python.org/pypi?:action=display&name=zope.testbrowser
+  25. http://cheeseshop.python.org/pypi?%3Aaction=display&name=ZopeTestbrowser
+  26. http://www.idyll.org/~t/www-tools/twill.html
+  27. http://mechanicalcat.net/tech/webunit/
+  28. http://webunit.sourceforge.net/
+  31. http://peak.telecommunity.com/DevCenter/EasyInstall
+  33. http://www.opensource.org/licenses/bsd-license.php
+  34. http://www.zope.org/Resources/ZPL
+  35. http://wwwsearch.sourceforge.net/mechanize/doc.html#debugging
+  36. http://wwwsearch.sourceforge.net/bits/GeneralFAQ.html
+  37. http://lists.sourceforge.net/lists/listinfo/wwwsearch-general
+  38. mailto:jjl at pobox.com

Modified: python-mechanize/trunk/debian/changelog
===================================================================
--- python-mechanize/trunk/debian/changelog	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/debian/changelog	2007-04-09 23:35:16 UTC (rev 773)
@@ -1,3 +1,16 @@
+python-mechanize (0.1.6b-1) unstable; urgency=low
+
+  * New upstream release. (Closes: #418457)
+  * Drop obsolete patch to mechanize/_html.py, mechanize/_util.py and
+    mechanize/_mechanize.py.
+  * Re-generate README.txt using "lynx -dump" as upstream forgot to do it.
+  * Use dh_installexamples to install examples instead of dh_installdocs.
+  * Remove dh_python from debian/rules.
+  * Update python-clientform dependency.
+  * Update email address in Uploaders.
+
+ -- Jérémy Bobbio <lunar at debian.org>  Tue, 10 Apr 2007 01:34:43 +0200
+
 python-mechanize (0.1.2b-2) unstable; urgency=low
 
   [ Brian Sutherland ]

Modified: python-mechanize/trunk/debian/control
===================================================================
--- python-mechanize/trunk/debian/control	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/debian/control	2007-04-09 23:35:16 UTC (rev 773)
@@ -2,7 +2,7 @@
 Section: python
 Priority: extra
 Maintainer: Debian/Ubuntu Zope Team <pkg-zope-developers at lists.alioth.debian.org>
-Uploaders: Brian Sutherland <jinty at web.de>, Fabio Tranchitella <kobold at debian.org>, Jérémy Bobbio <jeremy.bobbio at etu.upmc.fr>
+Uploaders: Brian Sutherland <jinty at web.de>, Fabio Tranchitella <kobold at debian.org>, Jérémy Bobbio <lunar at debian.org>
 Build-Depends-Indep: python-all-dev (>= 2.3.5-9), python-central (>= 0.5)
 Build-Depends: debhelper (>= 5.0.37.2), python-setuptools (>= 0.6b3)
 Standards-Version: 3.7.2
@@ -11,7 +11,7 @@
 
 Package: python-mechanize
 Architecture: all
-Depends: ${python:Depends}, python-clientform (>= 0.2.2)
+Depends: ${python:Depends}, python-clientform (>= 0.2.6)
 Provides: ${python:Provides}
 XB-Python-Version: ${python:Versions}
 Conflicts: python2.3-mechanize (<< 0.0.11a-3), python2.4-mechanize (<< 0.0.11a-3)

Modified: python-mechanize/trunk/debian/docs
===================================================================
--- python-mechanize/trunk/debian/docs	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/debian/docs	2007-04-09 23:35:16 UTC (rev 773)
@@ -4,4 +4,3 @@
 README.txt
 doc.html
 0.1-changes.txt
-examples

Added: python-mechanize/trunk/debian/examples
===================================================================
--- python-mechanize/trunk/debian/examples	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/debian/examples	2007-04-09 23:35:16 UTC (rev 773)
@@ -0,0 +1 @@
+examples/*

Modified: python-mechanize/trunk/debian/rules
===================================================================
--- python-mechanize/trunk/debian/rules	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/debian/rules	2007-04-09 23:35:16 UTC (rev 773)
@@ -1,7 +1,7 @@
 #!/usr/bin/make -f
 
 # Uncomment this to turn on verbose mode.
-#export DH_VERBOSE=1
+export DH_VERBOSE=1
 
 PYVERS=$(shell pyversions -vr debian/control)
 PYMOD=mechanize
@@ -57,7 +57,6 @@
 	dh_compress
 	dh_fixperms
 	dh_pycentral
-	dh_python
 	dh_makeshlibs
 	dh_installdeb
 	dh_shlibdeps

Modified: python-mechanize/trunk/doc.html
===================================================================
--- python-mechanize/trunk/doc.html	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/doc.html	2007-04-09 23:35:16 UTC (rev 773)
@@ -5,7 +5,7 @@
 <head>
   <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
   <meta name="author" content="John J. Lee &lt;jjl at pobox.com&gt;">
-  <meta name="date" content="2006-05-21">
+  <meta name="date" content="2006-09-08">
   <title>mechanize documentation</title>
   <style type="text/css" media="screen">@import "../styles/style.css";</style>
   
@@ -468,7 +468,7 @@
 
 
   <li>If you're using a <code>urllib2.Request</code> from Python 2.4 or later,
-  or you're using a <code>mechanize.Request<code>, use the
+  or you're using a <code>mechanize.Request</code>, use the
   <code>unverifiable</code> and <code>origin_req_host</code> arguments to the
   constructor:
 
@@ -706,7 +706,7 @@
 keep compatibility with the Netscape protocol as implemented by Netscape.
 Microsoft Internet Explorer (MSIE) was very new when the standard was designed,
 but was starting to be very popular when the standard was finalised.  XXX P3P,
-and MSIE & Mozilla options
+and MSIE &amp; Mozilla options
 
 <p>XXX Apparently MSIE implements bits of RFC 2109 - but not very compliant
 (surprise).  Presumably other browsers do too, as a result.  mechanize
@@ -838,7 +838,7 @@
 mailing list</a> rather than direct to me.
 
 <p><a href="mailto:jjl at pobox.com">John J. Lee</a>,
-May 2006.
+September 2006.
 
 <hr>
 
@@ -866,7 +866,7 @@
 <br>
 
 <a href="./doc.html#examples">Examples</a><br>
-<a href="./doc.html#browsers">Mozilla & MSIE</a><br>
+<a href="./doc.html#browsers">Mozilla &amp; MSIE</a><br>
 <a href="./doc.html#file">Cookies in a file</a><br>
 <a href="./doc.html#cookiejar">Using a <code>CookieJar</code></a><br>
 <a href="./doc.html#extras">Processors</a><br>

Modified: python-mechanize/trunk/doc.html.in
===================================================================
--- python-mechanize/trunk/doc.html.in	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/doc.html.in	2007-04-09 23:35:16 UTC (rev 773)
@@ -6,7 +6,7 @@
 from colorize import colorize
 import time
 import release
-last_modified = release.svn_id_to_time("$Id: doc.html.in 27546 2006-05-21 18:52:39Z jjlee $")
+last_modified = release.svn_id_to_time("$Id: doc.html.in 32090 2006-09-08 21:19:26Z jjlee $")
 try:
     base
 except NameError:
@@ -479,7 +479,7 @@
 """)}
 
   <li>If you're using a <code>urllib2.Request</code> from Python 2.4 or later,
-  or you're using a <code>mechanize.Request<code>, use the
+  or you're using a <code>mechanize.Request</code>, use the
   <code>unverifiable</code> and <code>origin_req_host</code> arguments to the
   constructor:
 
@@ -718,7 +718,7 @@
 keep compatibility with the Netscape protocol as implemented by Netscape.
 Microsoft Internet Explorer (MSIE) was very new when the standard was designed,
 but was starting to be very popular when the standard was finalised.  XXX P3P,
-and MSIE & Mozilla options
+and MSIE &amp; Mozilla options
 
 <p>XXX Apparently MSIE implements bits of RFC 2109 - but not very compliant
 (surprise).  Presumably other browsers do too, as a result.  mechanize
@@ -863,7 +863,7 @@
 <br>
 
 <a href="./doc.html#examples">Examples</a><br>
-<a href="./doc.html#browsers">Mozilla & MSIE</a><br>
+<a href="./doc.html#browsers">Mozilla &amp; MSIE</a><br>
 <a href="./doc.html#file">Cookies in a file</a><br>
 <a href="./doc.html#cookiejar">Using a <code>CookieJar</code></a><br>
 <a href="./doc.html#extras">Processors</a><br>

Modified: python-mechanize/trunk/examples/pypi.py
===================================================================
--- python-mechanize/trunk/examples/pypi.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/examples/pypi.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -1,5 +1,10 @@
 #!/usr/bin/env python
 
+# ------------------------------------------------------------------------
+# THIS SCRIPT IS CURRENTLY NOT WORKING, SINCE PYPI's SEARCH FEATURE HAS
+# BEEN REMOVED!
+# ------------------------------------------------------------------------
+
 # Search PyPI, the Python Package Index, and retrieve latest mechanize
 # tarball.
 
@@ -16,6 +21,10 @@
     # mechanize.Factory (with XHTML support turned on):
     factory=mechanize.DefaultFactory(i_want_broken_xhtml_support=True)
     )
+# Addition 2005-06-13: Be naughty, since robots.txt asks not to
+# access /pypi now.  We're not madly searching for everything, so
+# I don't feel too guilty.
+b.set_handle_robots(False)
 
 # search PyPI
 b.open("http://www.python.org/pypi")

Copied: python-mechanize/trunk/ez_setup.py (from rev 765, python-mechanize/branches/upstream/current/ez_setup.py)

Modified: python-mechanize/trunk/functional_tests.py
===================================================================
--- python-mechanize/trunk/functional_tests.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/functional_tests.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -18,6 +18,7 @@
 
 #from mechanize import CreateBSDDBCookieJar
 
+## import logging
 ## logger = logging.getLogger("mechanize")
 ## logger.addHandler(logging.StreamHandler())
 ## logger.setLevel(logging.DEBUG)
@@ -59,15 +60,40 @@
         self.assertEqual(self.browser.title(), 'Python bits')
 
     def test_redirect(self):
-        # 302 redirect due to missing final '/'
-        self.browser.open('http://wwwsearch.sourceforge.net')
+        # 301 redirect due to missing final '/'
+        r = self.browser.open('http://wwwsearch.sourceforge.net/bits')
+        self.assertEqual(r.code, 200)
+        self.assert_("GeneralFAQ.html" in r.read(2048))
 
     def test_file_url(self):
         url = "file://%s" % sanepathname2url(
             os.path.abspath('functional_tests.py'))
-        self.browser.open(url)
+        r = self.browser.open(url)
+        self.assert_("this string appears in this file ;-)" in r.read())
 
+    def test_open_novisit(self):
+        def test_state(br):
+            self.assert_(br.request is None)
+            self.assert_(br.response() is None)
+            self.assertRaises(mechanize.BrowserStateError, br.back)
+        test_state(self.browser)
+        # note this involves a redirect, which should itself be non-visiting
+        r = self.browser.open_novisit("http://wwwsearch.sourceforge.net/bits")
+        test_state(self.browser)
+        self.assert_("GeneralFAQ.html" in r.read(2048))
 
+    def test_non_seekable(self):
+        # check everything still works without response_seek_wrapper and
+        # the .seek() method on response objects
+        ua = mechanize.UserAgent()
+        ua.set_seekable_responses(False)
+        ua.set_handle_equiv(False)
+        response = ua.open('http://wwwsearch.sourceforge.net/')
+        self.failIf(hasattr(response, "seek"))
+        data = response.read()
+        self.assert_("Python bits" in data)
+
+
 class ResponseTests(TestCase):
 
     def test_seek(self):
@@ -147,7 +173,6 @@
 class FunctionalTests(TestCase):
     def test_cookies(self):
         import urllib2
-        from mechanize import _urllib2_support
         # this test page depends on cookies, and an http-equiv refresh
         #cj = CreateBSDDBCookieJar("/home/john/db.db")
         cj = CookieJar()
@@ -183,23 +208,67 @@
             self.assert_(samedata == data)
         finally:
             o.close()
-            # uninstall opener (don't try this at home)
-            _urllib2_support._opener = None
+            install_opener(None)
 
+    def test_robots(self):
+        plain_opener = mechanize.build_opener(mechanize.HTTPRobotRulesProcessor)
+        browser = mechanize.Browser()
+        for opener in plain_opener, browser:
+            r = opener.open("http://wwwsearch.sourceforge.net/robots")
+            self.assertEqual(r.code, 200)
+            self.assertRaises(
+                mechanize.RobotExclusionError,
+                opener.open, "http://wwwsearch.sourceforge.net/norobots")
+
     def test_urlretrieve(self):
         url = "http://www.python.org/"
+        test_filename = "python.html"
+        def check_retrieve(opener, filename, headers):
+            self.assertEqual(headers.get('Content-Type'), 'text/html')
+            f = open(filename)
+            data = f.read()
+            f.close()
+            opener.close()
+            from urllib import urlopen
+            r = urlopen(url)
+            self.assertEqual(data, r.read())
+            r.close()
+
+        opener = mechanize.build_opener()
         verif = CallbackVerifier(self)
-        fn, hdrs = urlretrieve(url, "python.html", verif.callback)
+        filename, headers = opener.retrieve(url, test_filename, verif.callback)
         try:
-            f = open(fn)
-            data = f.read()
-            f.close()
+            self.assertEqual(filename, test_filename)
+            check_retrieve(opener, filename, headers)
+            self.assert_(os.path.isfile(filename))
         finally:
-            os.remove(fn)
-        r = urlopen(url)
-        self.assert_(data == r.read())
-        r.close()
+            os.remove(filename)
 
+        opener = mechanize.build_opener()
+        verif = CallbackVerifier(self)
+        filename, headers = opener.retrieve(url, reporthook=verif.callback)
+        check_retrieve(opener, filename, headers)
+        # closing the opener removed the temporary file
+        self.failIf(os.path.isfile(filename))
+
+    def test_reload_read_incomplete(self):
+        from mechanize import Browser
+        browser = Browser()
+        r1 = browser.open(
+            "http://wwwsearch.sf.net/bits/mechanize_reload_test.html")
+        # if we don't do anything and go straight to another page, most of the
+        # last page's response won't be .read()...
+        r2 = browser.open("http://wwwsearch.sf.net/mechanize")
+        self.assert_(len(r1.get_data()) < 4097)  # we only .read() a little bit
+        # ...so if we then go back, .follow_link() for a link near the end (a
+        # few kb in, past the point that always gets read in HTML files because
+        # of HEAD parsing) will only work if it causes a .reload()...
+        r3 = browser.back()
+        browser.follow_link(text="near the end")
+        # ... good, no LinkNotFoundError, so we did reload.
+        # we have .read() the whole file
+        self.assertEqual(len(r3._seek_wrapper__cache.getvalue()), 4202)
+
 ##     def test_cacheftp(self):
 ##         from urllib2 import CacheFTPHandler, build_opener
 ##         o = build_opener(CacheFTPHandler())
@@ -217,8 +286,7 @@
         self._count = 0
         self._testcase = testcase
     def callback(self, block_nr, block_size, total_size):
-        if block_nr != self._count:
-            self._testcase.fail()
+        self._testcase.assertEqual(block_nr, self._count)
         self._count = self._count + 1
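
The new test_open_novisit case exercises behaviour worth a small sketch:
open_novisit() fetches a URL without updating the browser's history or
current response (URLs as in the tests; network access assumed):

    import mechanize

    browser = mechanize.Browser()
    browser.open("http://wwwsearch.sourceforge.net/")
    # fetch without "visiting": history and .response() are untouched,
    # so navigation state still refers to the page opened above
    r = browser.open_novisit("http://wwwsearch.sourceforge.net/bits")
    data = r.read(2048)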
 
 

Modified: python-mechanize/trunk/mechanize/__init__.py
===================================================================
--- python-mechanize/trunk/mechanize/__init__.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/mechanize/__init__.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -1,3 +1,85 @@
+__all__ = [
+    'AbstractBasicAuthHandler',
+    'AbstractDigestAuthHandler',
+    'BaseHandler',
+    'Browser',
+    'BrowserStateError',
+    'CacheFTPHandler',
+    'ContentTooShortError',
+    'Cookie',
+    'CookieJar',
+    'CookiePolicy',
+    'DefaultCookiePolicy',
+    'DefaultFactory',
+    'FTPHandler',
+    'Factory',
+    'FileCookieJar',
+    'FileHandler',
+    'FormNotFoundError',
+    'FormsFactory',
+    'GopherError',
+    'GopherHandler',
+    'HTTPBasicAuthHandler',
+    'HTTPCookieProcessor',
+    'HTTPDefaultErrorHandler',
+    'HTTPDigestAuthHandler',
+    'HTTPEquivProcessor',
+    'HTTPError',
+    'HTTPErrorProcessor',
+    'HTTPHandler',
+    'HTTPPasswordMgr',
+    'HTTPPasswordMgrWithDefaultRealm',
+    'HTTPProxyPasswordMgr',
+    'HTTPRedirectDebugProcessor',
+    'HTTPRedirectHandler',
+    'HTTPRefererProcessor',
+    'HTTPRefreshProcessor',
+    'HTTPRequestUpgradeProcessor',
+    'HTTPResponseDebugProcessor',
+    'HTTPRobotRulesProcessor',
+    'HTTPSClientCertMgr',
+    'HTTPSHandler',
+    'HeadParser',
+    'History',
+    'LWPCookieJar',
+    'Link',
+    'LinkNotFoundError',
+    'LinksFactory',
+    'LoadError',
+    'MSIECookieJar',
+    'MozillaCookieJar',
+    'OpenerDirector',
+    'OpenerFactory',
+    'ParseError',
+    'ProxyBasicAuthHandler',
+    'ProxyDigestAuthHandler',
+    'ProxyHandler',
+    'Request',
+    'ResponseUpgradeProcessor',
+    'RobotExclusionError',
+    'RobustFactory',
+    'RobustFormsFactory',
+    'RobustLinksFactory',
+    'RobustTitleFactory',
+    'SeekableProcessor',
+    'TitleFactory',
+    'URLError',
+    'USE_BARE_EXCEPT',
+    'UnknownHandler',
+    'UserAgent',
+    'UserAgentBase',
+    'XHTMLCompatibleHeadParser',
+    '__version__',
+    'build_opener',
+    'install_opener',
+    'lwp_cookie_str',
+    'make_response',
+    'request_host',
+    'response_seek_wrapper',
+    'str2time',
+    'urlopen',
+    'urlretrieve']
+
 from _mechanize import __version__
 
 # high-level stateful browser-style interface
@@ -2,8 +84,9 @@
 from _mechanize import \
-     Browser, \
+     Browser, History, \
      BrowserStateError, LinkNotFoundError, FormNotFoundError
 
 # configurable URL-opener interface
-from _useragent import UserAgent
+from _useragent import UserAgentBase, UserAgent
 from _html import \
+     ParseError, \
      Link, \
@@ -14,19 +97,19 @@
      RobustFormsFactory, RobustLinksFactory, RobustTitleFactory
 
 # urllib2 work-alike interface (part from mechanize, part from urllib2)
+# This is a superset of the urllib2 interface.
 from _urllib2 import *
 
 # misc
+from _opener import ContentTooShortError, OpenerFactory, urlretrieve
 from _util import http2time as str2time
-from _util import response_seek_wrapper, make_response
-from _urllib2_support import HeadParser
+from _response import response_seek_wrapper, make_response
+from _http import HeadParser
 try:
-    from _urllib2_support import XHTMLCompatibleHeadParser
+    from _http import XHTMLCompatibleHeadParser
 except ImportError:
     pass
-#from _gzip import HTTPGzipProcessor  # crap ATM
 
-
 # cookies
 from _clientcookie import Cookie, CookiePolicy, DefaultCookiePolicy, \
      CookieJar, FileCookieJar, LoadError, request_host
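
The enlarged __init__ makes the urllib2 work-alike interface explicit; a
quick sketch of how the superset is meant to be used (URL is a
placeholder):

    import mechanize

    # drop-in urllib2 style...
    response = mechanize.urlopen("http://www.example.com/")
    print response.geturl()

    # ...or a customised opener, as with urllib2.build_opener()
    opener = mechanize.build_opener(mechanize.HTTPEquivProcessor)
    response = opener.open("http://www.example.com/")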

Modified: python-mechanize/trunk/mechanize/_auth.py
===================================================================
--- python-mechanize/trunk/mechanize/_auth.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/mechanize/_auth.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -11,10 +11,11 @@
 
 """
 
-import re, base64, urlparse, posixpath, md5, sha
+import re, base64, urlparse, posixpath, md5, sha, sys
 
 from urllib2 import BaseHandler
-from urllib import getproxies, unquote, splittype, splituser, splitpasswd
+from urllib import getproxies, unquote, splittype, splituser, splitpasswd, \
+     splitport
 
 
 def _parse_proxy(proxy):
@@ -135,32 +136,45 @@
         # uri could be a single URI or a sequence
         if isinstance(uri, basestring):
             uri = [uri]
-        uri = tuple(map(self.reduce_uri, uri))
         if not realm in self.passwd:
             self.passwd[realm] = {}
-        self.passwd[realm][uri] = (user, passwd)
+        for default_port in True, False:
+            reduced_uri = tuple(
+                [self.reduce_uri(u, default_port) for u in uri])
+            self.passwd[realm][reduced_uri] = (user, passwd)
 
     def find_user_password(self, realm, authuri):
         domains = self.passwd.get(realm, {})
-        authuri = self.reduce_uri(authuri)
-        for uris, authinfo in domains.iteritems():
-            for uri in uris:
-                if self.is_suburi(uri, authuri):
-                    return authinfo
+        for default_port in True, False:
+            reduced_authuri = self.reduce_uri(authuri, default_port)
+            for uris, authinfo in domains.iteritems():
+                for uri in uris:
+                    if self.is_suburi(uri, reduced_authuri):
+                        return authinfo
         return None, None
 
-    def reduce_uri(self, uri):
-        """Accept netloc or URI and extract only the netloc and path"""
+    def reduce_uri(self, uri, default_port=True):
+        """Accept authority or URI and extract only the authority and path."""
+        # note HTTP URLs do not have a userinfo component
         parts = urlparse.urlsplit(uri)
         if parts[1]:
             # URI
-            return parts[1], parts[2] or '/'
-        elif parts[0]:
-            # host:port
-            return uri, '/'
+            scheme = parts[0]
+            authority = parts[1]
+            path = parts[2] or '/'
         else:
-            # host
-            return parts[2], '/'
+            # host or host:port
+            scheme = None
+            authority = uri
+            path = '/'
+        host, port = splitport(authority)
+        if default_port and port is None and scheme is not None:
+            dport = {"http": 80,
+                     "https": 443,
+                     }.get(scheme)
+            if dport is not None:
+                authority = "%s:%d" % (host, dport)
+        return authority, path
 
     def is_suburi(self, base, test):
         """Check if test is below base in a URI tree
@@ -404,6 +418,7 @@
     """
 
     auth_header = 'Authorization'
+    handler_order = 490
 
     def http_error_401(self, req, fp, code, msg, headers):
         host = urlparse.urlparse(req.get_full_url())[1]
@@ -416,6 +431,7 @@
 class ProxyDigestAuthHandler(BaseHandler, AbstractDigestAuthHandler):
 
     auth_header = 'Proxy-Authorization'
+    handler_order = 490
 
     def http_error_407(self, req, fp, code, msg, headers):
         host = req.get_host()
@@ -425,7 +441,7 @@
         return retry
 
 
-
+# XXX ugly implementation, should probably not bother deriving
 class HTTPProxyPasswordMgr(HTTPPasswordMgr):
     # has default realm and host/port
     def add_password(self, realm, uri, user, passwd):
@@ -436,32 +452,34 @@
             uris = uri
         passwd_by_domain = self.passwd.setdefault(realm, {})
         for uri in uris:
-            uri = self.reduce_uri(uri)
-            passwd_by_domain[uri] = (user, passwd)
+            for default_port in True, False:
+                reduced_uri = self.reduce_uri(uri, default_port)
+                passwd_by_domain[reduced_uri] = (user, passwd)
 
     def find_user_password(self, realm, authuri):
-        perms = [(realm, authuri), (None, authuri)]
+        attempts = [(realm, authuri), (None, authuri)]
         # bleh, want default realm to take precedence over default
         # URI/authority, hence this outer loop
         for default_uri in False, True:
-            for realm, authuri in perms:
+            for realm, authuri in attempts:
                 authinfo_by_domain = self.passwd.get(realm, {})
-                reduced_authuri = self.reduce_uri(authuri)
-                for uri, authinfo in authinfo_by_domain.iteritems():
-                    if uri is None and not default_uri:
-                        continue
-                    if self.is_suburi(uri, reduced_authuri):
-                        return authinfo
-                user, password = None, None
+                for default_port in True, False:
+                    reduced_authuri = self.reduce_uri(authuri, default_port)
+                    for uri, authinfo in authinfo_by_domain.iteritems():
+                        if uri is None and not default_uri:
+                            continue
+                        if self.is_suburi(uri, reduced_authuri):
+                            return authinfo
+                    user, password = None, None
 
-                if user is not None:
-                    break
+                    if user is not None:
+                        break
         return user, password
 
-    def reduce_uri(self, uri):
+    def reduce_uri(self, uri, default_port=True):
         if uri is None:
             return None
-        return HTTPPasswordMgr.reduce_uri(self, uri)
+        return HTTPPasswordMgr.reduce_uri(self, uri, default_port)
 
     def is_suburi(self, base, test):
         if base is None:
@@ -469,3 +487,11 @@
             hostport, path = test
             base = (hostport, "/")
         return HTTPPasswordMgr.is_suburi(self, base, test)
+
+
+class HTTPSClientCertMgr(HTTPPasswordMgr):
+    # implementation inheritance: this is not a proper subclass
+    def add_key_cert(self, uri, key_file, cert_file):
+        self.add_password(None, uri, key_file, cert_file)
+    def find_key_cert(self, authuri):
+        return HTTPPasswordMgr.find_user_password(self, None, authuri)
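
The reduce_uri() change above means credentials registered for
"example.com" also match "example.com:80" for http.  A standalone sketch
of the normalisation (hypothetical helper mirroring the patched logic):

    from urllib import splitport

    def reduce_authority(authority, scheme, default_port=True):
        # append the scheme's default port when none was given
        host, port = splitport(authority)
        if default_port and port is None:
            dport = {"http": 80, "https": 443}.get(scheme)
            if dport is not None:
                return "%s:%d" % (host, dport)
        return authority

    print reduce_authority("example.com", "http")       # example.com:80
    print reduce_authority("example.com:8080", "http")  # example.com:8080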

Copied: python-mechanize/trunk/mechanize/_beautifulsoup.py (from rev 765, python-mechanize/branches/upstream/current/mechanize/_beautifulsoup.py)

Modified: python-mechanize/trunk/mechanize/_clientcookie.py
===================================================================
--- python-mechanize/trunk/mechanize/_clientcookie.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/mechanize/_clientcookie.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -1,4 +1,4 @@
-"""HTTP cookie handling for web clients, plus some other stuff.
+"""HTTP cookie handling for web clients.
 
 This module originally developed from my port of Gisle Aas' Perl module
 HTTP::Cookies, from the libwww-perl library.
@@ -32,7 +32,7 @@
 
 """
 
-import sys, re, urlparse, string, copy, time, struct, urllib, types, logging
+import sys, re, copy, time, struct, urllib, types, logging
 try:
     import threading
     _threading = threading; del threading
@@ -46,7 +46,8 @@
 DEFAULT_HTTP_PORT = str(httplib.HTTP_PORT)
 
 from _headersutil import split_header_words, parse_ns_headers
-from _util import startswith, endswith, isstringlike, getheaders
+from _util import isstringlike
+import _rfc3986
 
 debug = logging.getLogger("mechanize.cookies").debug
 
@@ -105,17 +106,17 @@
     """
     # Note that, if A or B are IP addresses, the only relevant part of the
     # definition of the domain-match algorithm is the direct string-compare.
-    A = string.lower(A)
-    B = string.lower(B)
+    A = A.lower()
+    B = B.lower()
     if A == B:
         return True
     if not is_HDN(A):
         return False
-    i = string.rfind(A, B)
+    i = A.rfind(B)
     has_form_nb = not (i == -1 or i == 0)
     return (
         has_form_nb and
-        startswith(B, ".") and
+        B.startswith(".") and
         is_HDN(B[1:])
         )
 
@@ -133,15 +134,15 @@
     A and B may be host domain names or IP addresses.
 
     """
-    A = string.lower(A)
-    B = string.lower(B)
+    A = A.lower()
+    B = B.lower()
     if not (liberal_is_HDN(A) and liberal_is_HDN(B)):
         if A == B:
             # equal IP addresses
             return True
         return False
-    initial_dot = startswith(B, ".")
-    if initial_dot and endswith(A, B):
+    initial_dot = B.startswith(".")
+    if initial_dot and A.endswith(B):
         return True
     if not initial_dot and A == B:
         return True
@@ -156,13 +157,13 @@
 
     """
     url = request.get_full_url()
-    host = urlparse.urlparse(url)[1]
-    if host == "":
+    host = _rfc3986.urlsplit(url)[1]
+    if host is None:
         host = request.get_header("Host", "")
 
     # remove port, if present
     host = cut_port_re.sub("", host, 1)
-    return string.lower(host)
+    return host.lower()
 
 def eff_request_host(request):
     """Return a tuple (request-host, effective request-host name).
@@ -171,28 +172,23 @@
 
     """
     erhn = req_host = request_host(request)
-    if string.find(req_host, ".") == -1 and not IPV4_RE.search(req_host):
+    if req_host.find(".") == -1 and not IPV4_RE.search(req_host):
         erhn = req_host + ".local"
     return req_host, erhn
 
 def request_path(request):
     """request-URI, as defined by RFC 2965."""
     url = request.get_full_url()
-    #scheme, netloc, path, parameters, query, frag = urlparse.urlparse(url)
-    #req_path = escape_path(string.join(urlparse.urlparse(url)[2:], ""))
-    path, parameters, query, frag = urlparse.urlparse(url)[2:]
-    if parameters:
-        path = "%s;%s" % (path, parameters)
+    path, query, frag = _rfc3986.urlsplit(url)[2:]
     path = escape_path(path)
-    req_path = urlparse.urlunparse(("", "", path, "", query, frag))
-    if not startswith(req_path, "/"):
-        # fix bad RFC 2396 absoluteURI
+    req_path = _rfc3986.urlunsplit((None, None, path, query, frag))
+    if not req_path.startswith("/"):
         req_path = "/"+req_path
     return req_path
 
 def request_port(request):
     host = request.get_host()
-    i = string.find(host, ':')
+    i = host.find(':')
     if i >= 0:
         port = host[i+1:]
         try:
@@ -209,7 +205,7 @@
 HTTP_PATH_SAFE = "%/;:@&=+$,!~*'()"
 ESCAPED_CHAR_RE = re.compile(r"%([0-9a-fA-F][0-9a-fA-F])")
 def uppercase_escaped_char(match):
-    return "%%%s" % string.upper(match.group(1))
+    return "%%%s" % match.group(1).upper()
 def escape_path(path):
     """Escape any invalid characters in HTTP URL, and uppercase all escapes."""
     # There's no knowing what character encoding was used to create URLs
@@ -252,11 +248,11 @@
     '.local'
 
     """
-    i = string.find(h, ".")
+    i = h.find(".")
     if i >= 0:
         #a = h[:i]  # this line is only here to show what a is
         b = h[i+1:]
-        i = string.find(b, ".")
+        i = b.find(".")
         if is_HDN(h) and (i >= 0 or b == "local"):
             return "."+b
     return h
@@ -344,7 +340,7 @@
         self.port = port
         self.port_specified = port_specified
         # normalise case, as per RFC 2965 section 3.3.3
-        self.domain = string.lower(domain)
+        self.domain = domain.lower()
         self.domain_specified = domain_specified
         # Sigh.  We need to know whether the domain given in the
         # cookie-attribute had an initial dot, in order to follow RFC 2965
@@ -397,7 +393,7 @@
             args.append("%s=%s" % (name, repr(attr)))
         args.append("rest=%s" % repr(self._rest))
         args.append("rfc2109=%s" % repr(self.rfc2109))
-        return "Cookie(%s)" % string.join(args, ", ")
+        return "Cookie(%s)" % ", ".join(args)
 
 
 class CookiePolicy:
@@ -701,7 +697,7 @@
         # Try and stop servers setting V0 cookies designed to hack other
         # servers that know both V0 and V1 protocols.
         if (cookie.version == 0 and self.strict_ns_set_initial_dollar and
-            startswith(cookie.name, "$")):
+            cookie.name.startswith("$")):
             debug("   illegal name (starts with '$'): '%s'", cookie.name)
             return False
         return True
@@ -711,7 +707,7 @@
             req_path = request_path(request)
             if ((cookie.version > 0 or
                  (cookie.version == 0 and self.strict_ns_set_path)) and
-                not startswith(req_path, cookie.path)):
+                not req_path.startswith(cookie.path)):
                 debug("   path attribute %s is not a prefix of request "
                       "path %s", cookie.path, req_path)
                 return False
@@ -728,12 +724,12 @@
             domain = cookie.domain
             # since domain was specified, we know that:
             assert domain.startswith(".")
-            if string.count(domain, ".") == 2:
+            if domain.count(".") == 2:
                 # domain like .foo.bar
-                i = string.rfind(domain, ".")
+                i = domain.rfind(".")
                 tld = domain[i+1:]
                 sld = domain[1:i]
-                if (string.lower(sld) in [
+                if (sld.lower() in [
                     "co", "ac",
                     "com", "edu", "org", "net", "gov", "mil", "int",
                     "aero", "biz", "cat", "coop", "info", "jobs", "mobi",
@@ -757,19 +753,19 @@
         if cookie.domain_specified:
             req_host, erhn = eff_request_host(request)
             domain = cookie.domain
-            if startswith(domain, "."):
+            if domain.startswith("."):
                 undotted_domain = domain[1:]
             else:
                 undotted_domain = domain
-            embedded_dots = (string.find(undotted_domain, ".") >= 0)
+            embedded_dots = (undotted_domain.find(".") >= 0)
             if not embedded_dots and domain != ".local":
                 debug("   non-local domain %s contains no embedded dot",
                       domain)
                 return False
             if cookie.version == 0:
-                if (not endswith(erhn, domain) and
-                    (not startswith(erhn, ".") and
-                     not endswith("."+erhn, domain))):
+                if (not erhn.endswith(domain) and
+                    (not erhn.startswith(".") and
+                     not ("."+erhn).endswith(domain))):
                     debug("   effective request-host %s (even with added "
                           "initial dot) does not end end with %s",
                           erhn, domain)
@@ -783,7 +779,7 @@
             if (cookie.version > 0 or
                 (self.strict_ns_domain & self.DomainStrictNoDots)):
                 host_prefix = req_host[:-len(domain)]
-                if (string.find(host_prefix, ".") >= 0 and
+                if (host_prefix.find(".") >= 0 and
                     not IPV4_RE.search(req_host)):
                     debug("   host prefix %s for domain %s contains a dot",
                           host_prefix, domain)
@@ -797,7 +793,7 @@
                 req_port = "80"
             else:
                 req_port = str(req_port)
-            for p in string.split(cookie.port, ","):
+            for p in cookie.port.split(","):
                 try:
                     int(p)
                 except ValueError:
@@ -867,7 +863,7 @@
             req_port = request_port(request)
             if req_port is None:
                 req_port = "80"
-            for p in string.split(cookie.port, ","):
+            for p in cookie.port.split(","):
                 if p == req_port:
                     break
             else:
@@ -892,7 +888,7 @@
             debug("   effective request-host name %s does not domain-match "
                   "RFC 2965 cookie domain %s", erhn, domain)
             return False
-        if cookie.version == 0 and not endswith("."+erhn, domain):
+        if cookie.version == 0 and not ("."+erhn).endswith(domain):
             debug("   request-host %s does not match Netscape cookie domain "
                   "%s", req_host, domain)
             return False
@@ -905,12 +901,12 @@
         # Munge req_host and erhn to always start with a dot, so as to err on
         # the side of letting cookies through.
         dotted_req_host, dotted_erhn = eff_request_host(request)
-        if not startswith(dotted_req_host, "."):
+        if not dotted_req_host.startswith("."):
             dotted_req_host = "."+dotted_req_host
-        if not startswith(dotted_erhn, "."):
+        if not dotted_erhn.startswith("."):
             dotted_erhn = "."+dotted_erhn
-        if not (endswith(dotted_req_host, domain) or
-                endswith(dotted_erhn, domain)):
+        if not (dotted_req_host.endswith(domain) or
+                dotted_erhn.endswith(domain)):
             #debug("   request domain %s does not match cookie domain %s",
             #      req_host, domain)
             return False
@@ -927,7 +923,7 @@
     def path_return_ok(self, path, request):
         debug("- checking cookie path=%s", path)
         req_path = request_path(request)
-        if not startswith(req_path, path):
+        if not req_path.startswith(path):
             debug("  %s does not path-match %s", req_path, path)
             return False
         return True
@@ -1096,10 +1092,10 @@
             if version > 0:
                 if cookie.path_specified:
                     attrs.append('$Path="%s"' % cookie.path)
-                if startswith(cookie.domain, "."):
+                if cookie.domain.startswith("."):
                     domain = cookie.domain
                     if (not cookie.domain_initial_dot and
-                        startswith(domain, ".")):
+                        domain.startswith(".")):
                         domain = domain[1:]
                     attrs.append('$Domain="%s"' % domain)
                 if cookie.port is not None:
@@ -1137,8 +1133,7 @@
         attrs = self._cookie_attrs(cookies)
         if attrs:
             if not request.has_header("Cookie"):
-                request.add_unredirected_header(
-                    "Cookie", string.join(attrs, "; "))
+                request.add_unredirected_header("Cookie", "; ".join(attrs))
 
         # if necessary, advertise that we know RFC 2965
         if self._policy.rfc2965 and not self._policy.hide_cookie2:
@@ -1188,7 +1183,7 @@
             standard = {}
             rest = {}
             for k, v in cookie_attrs[1:]:
-                lc = string.lower(k)
+                lc = k.lower()
                 # don't lose case distinction for unknown fields
                 if lc in value_attrs or lc in boolean_attrs:
                     k = lc
@@ -1205,7 +1200,7 @@
                         bad_cookie = True
                         break
                     # RFC 2965 section 3.3.3
-                    v = string.lower(v)
+                    v = v.lower()
                 if k == "expires":
                     if max_age_set:
                         # Prefer max-age to expires (like Mozilla)
@@ -1272,7 +1267,7 @@
         else:
             path_specified = False
             path = request_path(request)
-            i = string.rfind(path, "/")
+            i = path.rfind("/")
             if i != -1:
                 if version == 0:
                     # Netscape spec parts company from reality here
@@ -1286,11 +1281,11 @@
         # but first we have to remember whether it starts with a dot
         domain_initial_dot = False
         if domain_specified:
-            domain_initial_dot = bool(startswith(domain, "."))
+            domain_initial_dot = bool(domain.startswith("."))
         if domain is Absent:
             req_host, erhn = eff_request_host(request)
             domain = erhn
-        elif not startswith(domain, "."):
+        elif not domain.startswith("."):
             domain = "."+domain
 
         # set default port
@@ -1365,8 +1360,8 @@
         """
         # get cookie-attributes for RFC 2965 and Netscape protocols
         headers = response.info()
-        rfc2965_hdrs = getheaders(headers, "Set-Cookie2")
-        ns_hdrs = getheaders(headers, "Set-Cookie")
+        rfc2965_hdrs = headers.getheaders("Set-Cookie2")
+        ns_hdrs = headers.getheaders("Set-Cookie")
 
         rfc2965 = self._policy.rfc2965
         netscape = self._policy.netscape
@@ -1550,12 +1545,12 @@
     def __repr__(self):
         r = []
         for cookie in self: r.append(repr(cookie))
-        return "<%s[%s]>" % (self.__class__, string.join(r, ", "))
+        return "<%s[%s]>" % (self.__class__, ", ".join(r))
 
     def __str__(self):
         r = []
         for cookie in self: r.append(str(cookie))
-        return "<%s[%s]>" % (self.__class__, string.join(r, ", "))
+        return "<%s[%s]>" % (self.__class__, ", ".join(r))
 
 
 class LoadError(Exception): pass
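
Since domain matching is the subtle part of this module, a couple of
illustrative calls against the helpers touched above (internal API, shown
for illustration only):

    from mechanize._clientcookie import domain_match, user_domain_match

    print domain_match("www.example.com", ".example.com")  # True
    print domain_match("example.com", ".example.com")      # False (RFC 2965 rule)
    print user_domain_match("www.example.com", ".example.com")  # True (liberal Netscape match)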

Copied: python-mechanize/trunk/mechanize/_debug.py (from rev 765, python-mechanize/branches/upstream/current/mechanize/_debug.py)

Modified: python-mechanize/trunk/mechanize/_gzip.py
===================================================================
--- python-mechanize/trunk/mechanize/_gzip.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/mechanize/_gzip.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -1,6 +1,6 @@
 import urllib2
 from cStringIO import StringIO
-import _util
+import _response
 
 # GzipConsumer was taken from Fredrik Lundh's effbot.org-0.1-20041009 library
 class GzipConsumer:
@@ -65,7 +65,7 @@
     def __init__(self): self.data = []
     def feed(self, data): self.data.append(data)
 
-class stupid_gzip_wrapper(_util.closeable_response):
+class stupid_gzip_wrapper(_response.closeable_response):
     def __init__(self, response):
         self._response = response
 

Modified: python-mechanize/trunk/mechanize/_headersutil.py
===================================================================
--- python-mechanize/trunk/mechanize/_headersutil.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/mechanize/_headersutil.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -9,12 +9,13 @@
 
 """
 
-import os, re, string, urlparse
+import os, re
 from types import StringType
 from types import UnicodeType
 STRING_TYPES = StringType, UnicodeType
 
-from _util import startswith, endswith, http2time
+from _util import http2time
+import _rfc3986
 
 def is_html(ct_headers, url, allow_xhtml=False):
     """
@@ -24,7 +25,7 @@
     """
     if not ct_headers:
         # guess
-        ext = os.path.splitext(urlparse.urlparse(url)[2])[1]
+        ext = os.path.splitext(_rfc3986.urlsplit(url)[2])[1]
         html_exts = [".htm", ".html"]
         if allow_xhtml:
             html_exts += [".xhtml"]
@@ -113,14 +114,14 @@
                     if m:  # unquoted value
                         text = unmatched(m)
                         value = m.group(1)
-                        value = string.rstrip(value)
+                        value = value.rstrip()
                     else:
                         # no value, a lone token
                         value = None
                 pairs.append((name, value))
-            elif startswith(string.lstrip(text), ","):
+            elif text.lstrip().startswith(","):
                 # concatenated headers, as per RFC 2616 section 4.2
-                text = string.lstrip(text)[1:]
+                text = text.lstrip()[1:]
                 if pairs: result.append(pairs)
                 pairs = []
             else:
@@ -159,8 +160,8 @@
                 else:
                     k = "%s=%s" % (k, v)
             attr.append(k)
-        if attr: headers.append(string.join(attr, "; "))
-    return string.join(headers, ", ")
+        if attr: headers.append("; ".join(attr))
+    return ", ".join(headers)
 
 def parse_ns_headers(ns_headers):
     """Ad-hoc parser for Netscape protocol cookie-attributes.
@@ -188,15 +189,15 @@
         params = re.split(r";\s*", ns_header)
         for ii in range(len(params)):
             param = params[ii]
-            param = string.rstrip(param)
+            param = param.rstrip()
             if param == "": continue
             if "=" not in param:
                 k, v = param, None
             else:
                 k, v = re.split(r"\s*=\s*", param, 1)
-                k = string.lstrip(k)
+                k = k.lstrip()
             if ii != 0:
-                lc = string.lower(k)
+                lc = k.lower()
                 if lc in known_attrs:
                     k = lc
                 if k == "version":
@@ -204,8 +205,8 @@
                     version_set = True
                 if k == "expires":
                     # convert expires date to seconds since epoch
-                    if startswith(v, '"'): v = v[1:]
-                    if endswith(v, '"'): v = v[:-1]
+                    if v.startswith('"'): v = v[1:]
+                    if v.endswith('"'): v = v[:-1]
                     v = http2time(v)  # None if invalid
             pairs.append((k, v))
 

Modified: python-mechanize/trunk/mechanize/_html.py
===================================================================
--- python-mechanize/trunk/mechanize/_html.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/mechanize/_html.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -8,97 +8,42 @@
 
 """
 
-import re, copy, urllib, htmlentitydefs
-from urlparse import urljoin
+import re, copy, htmlentitydefs
+import sgmllib, HTMLParser, ClientForm
 
 import _request
 from _headersutil import split_header_words, is_html as _is_html
+import _rfc3986
 
-## # XXXX miserable hack
-## def urljoin(base, url):
-##     if url.startswith("?"):
-##         return base+url
-##     else:
-##         return urlparse.urljoin(base, url)
+DEFAULT_ENCODING = "latin-1"
 
-## def chr_range(a, b):
-##     return "".join(map(chr, range(ord(a), ord(b)+1)))
 
-## RESERVED_URL_CHARS = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
-##                       "abcdefghijklmnopqrstuvwxyz"
-##                       "-_.~")
-## UNRESERVED_URL_CHARS = "!*'();:@&=+$,/?%#[]"
-# we want (RESERVED_URL_CHARS+UNRESERVED_URL_CHARS), minus those
-# 'safe'-by-default characters that urllib.urlquote never quotes
-URLQUOTE_SAFE_URL_CHARS = "!*'();:@&=+$,/?%#[]~"
+# the base class is purely for backwards compatibility
+class ParseError(ClientForm.ParseError): pass
 
-DEFAULT_ENCODING = "latin-1"
 
 class CachingGeneratorFunction(object):
-    """Caching wrapper around a no-arguments iterable.
+    """Caching wrapper around a no-arguments iterable."""
 
-    >>> i = [1]
-    >>> func = CachingGeneratorFunction(i)
-    >>> list(func())
-    [1]
-    >>> list(func())
-    [1]
-
-    >>> i = [1, 2, 3]
-    >>> func = CachingGeneratorFunction(i)
-    >>> list(func())
-    [1, 2, 3]
-
-    >>> i = func()
-    >>> i.next()
-    1
-    >>> i.next()
-    2
-    >>> i.next()
-    3
-
-    >>> i = func()
-    >>> j = func()
-    >>> i.next()
-    1
-    >>> j.next()
-    1
-    >>> i.next()
-    2
-    >>> j.next()
-    2
-    >>> j.next()
-    3
-    >>> i.next()
-    3
-    >>> i.next()
-    Traceback (most recent call last):
-    ...
-    StopIteration
-    >>> j.next()
-    Traceback (most recent call last):
-    ...
-    StopIteration
-    """
     def __init__(self, iterable):
-        def make_gen():
-            for item in iterable:
-                yield item
-
         self._cache = []
-        self._generator = make_gen()
+        # wrap iterable to make it non-restartable (otherwise, repeated
+        # __call__ would give incorrect results)
+        self._iterator = iter(iterable)
 
     def __call__(self):
         cache = self._cache
-
         for item in cache:
             yield item
-        for item in self._generator:
+        for item in self._iterator:
             cache.append(item)
             yield item
 
-def encoding_finder(default_encoding):
-    def encoding(response):
+
+class EncodingFinder:
+    def __init__(self, default_encoding):
+        self._default_encoding = default_encoding
+    def encoding(self, response):
         # HTTPEquivProcessor may be in use, so both HTTP and HTTP-EQUIV
         # headers may be in the response.  HTTP-EQUIV headers come last,
         # so try in order from first to last.
@@ -106,17 +51,18 @@
             for k, v in split_header_words([ct])[0]:
                 if k == "charset":
                     return v
-        return default_encoding
-    return encoding
+        return self._default_encoding
 
-def make_is_html(allow_xhtml):
-    def is_html(response, encoding):
+class ResponseTypeFinder:
+    def __init__(self, allow_xhtml):
+        self._allow_xhtml = allow_xhtml
+    def is_html(self, response, encoding):
         ct_hdrs = response.info().getheaders("content-type")
         url = response.geturl()
         # XXX encoding
-        return _is_html(ct_hdrs, url, allow_xhtml)
-    return is_html
+        return _is_html(ct_hdrs, url, self._allow_xhtml)
 
+
 # idea for this argument-processing trick is from Peter Otten
 class Args:
     def __init__(self, args_map):
@@ -140,7 +86,7 @@
     def __init__(self, base_url, url, text, tag, attrs):
         assert None not in [url, tag, attrs]
         self.base_url = base_url
-        self.absolute_url = urljoin(base_url, url)
+        self.absolute_url = _rfc3986.urljoin(base_url, url)
         self.url, self.text, self.tag, self.attrs = url, text, tag, attrs
     def __cmp__(self, other):
         try:
@@ -155,19 +101,6 @@
             self.base_url, self.url, self.text, self.tag, self.attrs)
 
 
-def clean_url(url, encoding):
-    # percent-encode illegal URL characters
-    # Trying to come up with test cases for this gave me a headache, revisit
-    # when do switch to unicode.
-    # Somebody else's comments (lost the attribution):
-##     - IE will return you the url in the encoding you send it
-##     - Mozilla/Firefox will send you latin-1 if there's no non latin-1
-##     characters in your link. It will send you utf-8 however if there are...
-    if type(url) == type(""):
-        url = url.decode(encoding, "replace")
-    url = url.strip()
-    return urllib.quote(url.encode(encoding), URLQUOTE_SAFE_URL_CHARS)
-
 class LinksFactory:
 
     def __init__(self,
@@ -203,40 +136,49 @@
         base_url = self._base_url
         p = self.link_parser_class(response, encoding=encoding)
 
-        for token in p.tags(*(self.urltags.keys()+["base"])):
-            if token.data == "base":
-                base_url = dict(token.attrs).get("href")
-                continue
-            if token.type == "endtag":
-                continue
-            attrs = dict(token.attrs)
-            tag = token.data
-            name = attrs.get("name")
-            text = None
-            # XXX use attr_encoding for ref'd doc if that doc does not provide
-            #  one by other means
-            #attr_encoding = attrs.get("charset")
-            url = attrs.get(self.urltags[tag])  # XXX is "" a valid URL?
-            if not url:
-                # Probably an <A NAME="blah"> link or <AREA NOHREF...>.
-                # For our purposes a link is something with a URL, so ignore
-                # this.
-                continue
+        try:
+            for token in p.tags(*(self.urltags.keys()+["base"])):
+                if token.type == "endtag":
+                    continue
+                if token.data == "base":
+                    base_href = dict(token.attrs).get("href")
+                    if base_href is not None:
+                        base_url = base_href
+                    continue
+                attrs = dict(token.attrs)
+                tag = token.data
+                name = attrs.get("name")
+                text = None
+                # XXX use attr_encoding for ref'd doc if that doc does not
+                #  provide one by other means
+                #attr_encoding = attrs.get("charset")
+                url = attrs.get(self.urltags[tag])  # XXX is "" a valid URL?
+                if not url:
+                    # Probably an <A NAME="blah"> link or <AREA NOHREF...>.
+                    # For our purposes a link is something with a URL, so
+                    # ignore this.
+                    continue
 
-            url = clean_url(url, encoding)
-            if tag == "a":
-                if token.type != "startendtag":
-                    # hmm, this'd break if end tag is missing
-                    text = p.get_compressed_text(("endtag", tag))
-                # but this doesn't work for eg. <a href="blah"><b>Andy</b></a>
-                #text = p.get_compressed_text()
+                url = _rfc3986.clean_url(url, encoding)
+                if tag == "a":
+                    if token.type != "startendtag":
+                        # hmm, this'd break if end tag is missing
+                        text = p.get_compressed_text(("endtag", tag))
+                    # but this doesn't work for eg.
+                    # <a href="blah"><b>Andy</b></a>
+                    #text = p.get_compressed_text()
 
-            yield Link(base_url, url, text, tag, token.attrs)
+                yield Link(base_url, url, text, tag, token.attrs)
+        except sgmllib.SGMLParseError, exc:
+            raise ParseError(exc)
 
 class FormsFactory:
 
     """Makes a sequence of objects satisfying ClientForm.HTMLForm interface.
 
+    After calling .forms(), the .global_form attribute is a form object
+    containing all controls that are not descendants of any FORM element.
+
     For constructor argument docs, see ClientForm.ParseResponse
     argument docs.
 
@@ -259,22 +201,31 @@
         self.backwards_compat = backwards_compat
         self._response = None
         self.encoding = None
+        self.global_form = None
 
     def set_response(self, response, encoding):
         self._response = response
         self.encoding = encoding
+        self.global_form = None
 
     def forms(self):
         import ClientForm
         encoding = self.encoding
-        return ClientForm.ParseResponse(
-            self._response,
-            select_default=self.select_default,
-            form_parser_class=self.form_parser_class,
-            request_class=self.request_class,
-            backwards_compat=self.backwards_compat,
-            encoding=encoding,
-            )
+        try:
+            forms = ClientForm.ParseResponseEx(
+                self._response,
+                select_default=self.select_default,
+                form_parser_class=self.form_parser_class,
+                request_class=self.request_class,
+                encoding=encoding,
+                _urljoin=_rfc3986.urljoin,
+                _urlparse=_rfc3986.urlsplit,
+                _urlunparse=_rfc3986.urlunsplit,
+                )
+        except ClientForm.ParseError, exc:
+            raise ParseError(exc)
+        self.global_form = forms[0]
+        return forms[1:]
 
 class TitleFactory:
     def __init__(self):
@@ -289,11 +240,14 @@
         p = _pullparser.TolerantPullParser(
             self._response, encoding=self._encoding)
         try:
-            p.get_tag("title")
-        except _pullparser.NoMoreTokensError:
-            return None
-        else:
-            return p.get_text()
+            try:
+                p.get_tag("title")
+            except _pullparser.NoMoreTokensError:
+                return None
+            else:
+                return p.get_text()
+        except sgmllib.SGMLParseError, exc:
+            raise ParseError(exc)
 
 
 def unescape(data, entities, encoding):
@@ -334,42 +288,44 @@
         return repl
 
 
-try:
-    import BeautifulSoup
-except ImportError:
-    pass
-else:
-    import sgmllib
-    # monkeypatch to fix http://www.python.org/sf/803422 :-(
-    sgmllib.charref = re.compile("&#(x?[0-9a-fA-F]+)[^0-9a-fA-F]")
-    class MechanizeBs(BeautifulSoup.BeautifulSoup):
-        _entitydefs = htmlentitydefs.name2codepoint
-        # don't want the magic Microsoft-char workaround
-        PARSER_MASSAGE = [(re.compile('(<[^<>]*)/>'),
-                           lambda(x):x.group(1) + ' />'),
-                          (re.compile('<!\s+([^<>]*)>'),
-                           lambda(x):'<!' + x.group(1) + '>')
-                          ]
+# bizarre import gymnastics for bundled BeautifulSoup
+import _beautifulsoup
+import ClientForm
+RobustFormParser, NestingRobustFormParser = ClientForm._create_bs_classes(
+    _beautifulsoup.BeautifulSoup, _beautifulsoup.ICantBelieveItsBeautifulSoup
+    )
+# monkeypatch sgmllib to fix http://www.python.org/sf/803422 :-(
+import sgmllib
+sgmllib.charref = re.compile("&#(x?[0-9a-fA-F]+)[^0-9a-fA-F]")
 
-        def __init__(self, encoding, text=None, avoidParserProblems=True,
-                     initialTextIsEverything=True):
-            self._encoding = encoding
-            BeautifulSoup.BeautifulSoup.__init__(
-                self, text, avoidParserProblems, initialTextIsEverything)
+class MechanizeBs(_beautifulsoup.BeautifulSoup):
+    _entitydefs = htmlentitydefs.name2codepoint
+    # don't want the magic Microsoft-char workaround
+    PARSER_MASSAGE = [(re.compile('(<[^<>]*)/>'),
+                       lambda(x):x.group(1) + ' />'),
+                      (re.compile('<!\s+([^<>]*)>'),
+                       lambda(x):'<!' + x.group(1) + '>')
+                      ]
 
-        def handle_charref(self, ref):
-            t = unescape("&#%s;"%ref, self._entitydefs, self._encoding)
-            self.handle_data(t)
-        def handle_entityref(self, ref):
-            t = unescape("&%s;"%ref, self._entitydefs, self._encoding)
-            self.handle_data(t)
-        def unescape_attrs(self, attrs):
-            escaped_attrs = []
-            for key, val in attrs:
-                val = unescape(val, self._entitydefs, self._encoding)
-                escaped_attrs.append((key, val))
-            return escaped_attrs
+    def __init__(self, encoding, text=None, avoidParserProblems=True,
+                 initialTextIsEverything=True):
+        self._encoding = encoding
+        _beautifulsoup.BeautifulSoup.__init__(
+            self, text, avoidParserProblems, initialTextIsEverything)
 
+    def handle_charref(self, ref):
+        t = unescape("&#%s;"%ref, self._entitydefs, self._encoding)
+        self.handle_data(t)
+    def handle_entityref(self, ref):
+        t = unescape("&%s;"%ref, self._entitydefs, self._encoding)
+        self.handle_data(t)
+    def unescape_attrs(self, attrs):
+        escaped_attrs = []
+        for key, val in attrs:
+            val = unescape(val, self._entitydefs, self._encoding)
+            escaped_attrs.append((key, val))
+        return escaped_attrs
+
 class RobustLinksFactory:
 
     compress_re = re.compile(r"\s+")
@@ -379,7 +335,7 @@
                  link_class=Link,
                  urltags=None,
                  ):
-        import BeautifulSoup
+        import _beautifulsoup
         if link_parser_class is None:
             link_parser_class = MechanizeBs
         self.link_parser_class = link_parser_class
@@ -402,27 +358,29 @@
         self._encoding = encoding
 
     def links(self):
-        import BeautifulSoup
+        import _beautifulsoup
         bs = self._bs
         base_url = self._base_url
         encoding = self._encoding
         gen = bs.recursiveChildGenerator()
         for ch in bs.recursiveChildGenerator():
-            if (isinstance(ch, BeautifulSoup.Tag) and
+            if (isinstance(ch, _beautifulsoup.Tag) and
                 ch.name in self.urltags.keys()+["base"]):
                 link = ch
                 attrs = bs.unescape_attrs(link.attrs)
                 attrs_dict = dict(attrs)
                 if link.name == "base":
-                    base_url = attrs_dict.get("href")
+                    base_href = attrs_dict.get("href")
+                    if base_href is not None:
+                        base_url = base_href
                     continue
                 url_attr = self.urltags[link.name]
                 url = attrs_dict.get(url_attr)
                 if not url:
                     continue
-                url = clean_url(url, encoding)
+                url = _rfc3986.clean_url(url, encoding)
                 text = link.firstText(lambda t: True)
-                if text is BeautifulSoup.Null:
+                if text is _beautifulsoup.Null:
                     # follow _pullparser's weird behaviour rigidly
                     if link.name == "a":
                         text = ""
@@ -438,7 +396,7 @@
         import ClientForm
         args = form_parser_args(*args, **kwds)
         if args.form_parser_class is None:
-            args.form_parser_class = ClientForm.RobustFormParser
+            args.form_parser_class = RobustFormParser
         FormsFactory.__init__(self, **args.dictionary)
 
     def set_response(self, response, encoding):
@@ -454,10 +412,10 @@
         self._bs = soup
         self._encoding = encoding
 
-    def title(soup):
-        import BeautifulSoup
+    def title(self):
+        import _beautifulsoup
         title = self._bs.first("title")
-        if title == BeautifulSoup.Null:
+        if title == _beautifulsoup.Null:
             return None
         else:
             return title.firstText(lambda t: True)
@@ -477,18 +435,25 @@
 
     Public attributes:
 
+    Note that accessing these attributes may raise ParseError.
+
     encoding: string specifying the encoding of response if it contains a text
      document (this value is left unspecified for documents that do not have
      an encoding, e.g. an image file)
     is_html: true if response contains an HTML document (XHTML may be
      regarded as HTML too)
     title: page title, or None if no title or not HTML
+    global_form: form object containing all controls that are not descendants
+     of any FORM element, or None if the forms_factory does not support
+     supplying a global form
 
     """
 
+    LAZY_ATTRS = ["encoding", "is_html", "title", "global_form"]
+
     def __init__(self, forms_factory, links_factory, title_factory,
-                 get_encoding=encoding_finder(DEFAULT_ENCODING),
-                 is_html_p=make_is_html(allow_xhtml=False),
+                 encoding_finder=EncodingFinder(DEFAULT_ENCODING),
+                 response_type_finder=ResponseTypeFinder(allow_xhtml=False),
                  ):
         """
 
@@ -504,8 +469,8 @@
         self._forms_factory = forms_factory
         self._links_factory = links_factory
         self._title_factory = title_factory
-        self._get_encoding = get_encoding
-        self._is_html_p = is_html_p
+        self._encoding_finder = encoding_finder
+        self._response_type_finder = response_type_finder
 
         self.set_response(None)
 
@@ -521,51 +486,71 @@
     def set_response(self, response):
         """Set response.
 
-        The response must implement the same interface as objects returned by
-        urllib2.urlopen().
+        The response must either be None or implement the same interface as
+        objects returned by urllib2.urlopen().
 
         """
         self._response = response
         self._forms_genf = self._links_genf = None
         self._get_title = None
-        for name in ["encoding", "is_html", "title"]:
+        for name in self.LAZY_ATTRS:
             try:
                 delattr(self, name)
             except AttributeError:
                 pass
 
     def __getattr__(self, name):
-        if name not in ["encoding", "is_html", "title"]:
+        if name not in self.LAZY_ATTRS:
             return getattr(self.__class__, name)
 
-        try:
-            if name == "encoding":
-                self.encoding = self._get_encoding(self._response)
-                return self.encoding
-            elif name == "is_html":
-                self.is_html = self._is_html_p(self._response, self.encoding)
-                return self.is_html
-            elif name == "title":
-                if self.is_html:
-                    self.title = self._title_factory.title()
-                else:
-                    self.title = None
-                return self.title
-        finally:
-            self._response.seek(0)
+        if name == "encoding":
+            self.encoding = self._encoding_finder.encoding(
+                copy.copy(self._response))
+            return self.encoding
+        elif name == "is_html":
+            self.is_html = self._response_type_finder.is_html(
+                copy.copy(self._response), self.encoding)
+            return self.is_html
+        elif name == "title":
+            if self.is_html:
+                self.title = self._title_factory.title()
+            else:
+                self.title = None
+            return self.title
+        elif name == "global_form":
+            self.forms()
+            return self.global_form
 
     def forms(self):
-        """Return iterable over ClientForm.HTMLForm-like objects."""
+        """Return iterable over ClientForm.HTMLForm-like objects.
+
+        Raises mechanize.ParseError on failure.
+        """
+        # this implementation sets .global_form as a side-effect, for benefit
+        # of __getattr__ impl
         if self._forms_genf is None:
-            self._forms_genf = CachingGeneratorFunction(
-                self._forms_factory.forms())
+            try:
+                self._forms_genf = CachingGeneratorFunction(
+                    self._forms_factory.forms())
+            except:  # XXXX define exception!
+                self.set_response(self._response)
+                raise
+            self.global_form = getattr(
+                self._forms_factory, "global_form", None)
         return self._forms_genf()
 
     def links(self):
-        """Return iterable over mechanize.Link-like objects."""
+        """Return iterable over mechanize.Link-like objects.
+
+        Raises mechanize.ParseError on failure.
+        """
         if self._links_genf is None:
-            self._links_genf = CachingGeneratorFunction(
-                self._links_factory.links())
+            try:
+                self._links_genf = CachingGeneratorFunction(
+                    self._links_factory.links())
+            except:  # XXXX define exception!
+                self.set_response(self._response)
+                raise
         return self._links_genf()
 
 class DefaultFactory(Factory):
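
The LAZY_ATTRS / __getattr__ dance above is the usual lazy-attribute caching
trick: __getattr__ only runs when normal attribute lookup fails, so assigning
the computed value onto the instance means later reads never re-enter it, and
set_response() just delattr()s the cached names to force recomputation.  A
minimal sketch of the pattern (illustration only, not the Factory code):

    class LazyEncoding:
        LAZY_ATTRS = ["encoding"]

        def __getattr__(self, name):
            # reached only when the attribute is not yet set
            if name not in self.LAZY_ATTRS:
                raise AttributeError(name)
            # the assignment caches the value on the instance, so the
            # next access bypasses __getattr__ entirely
            self.encoding = self._guess_encoding()
            return self.encoding

        def _guess_encoding(self):
            return "latin-1"  # stand-in for the real sniffing logic

    print LazyEncoding().encoding  # computed once, then cached
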
@@ -576,7 +561,8 @@
             forms_factory=FormsFactory(),
             links_factory=LinksFactory(),
             title_factory=TitleFactory(),
-            is_html_p=make_is_html(allow_xhtml=i_want_broken_xhtml_support),
+            response_type_finder=ResponseTypeFinder(
+                allow_xhtml=i_want_broken_xhtml_support),
             )
 
     def set_response(self, response):
@@ -585,7 +571,7 @@
             self._forms_factory.set_response(
                 copy.copy(response), self.encoding)
             self._links_factory.set_response(
-                copy.copy(response), self._response.geturl(), self.encoding)
+                copy.copy(response), response.geturl(), self.encoding)
             self._title_factory.set_response(
                 copy.copy(response), self.encoding)
 
@@ -601,19 +587,21 @@
             forms_factory=RobustFormsFactory(),
             links_factory=RobustLinksFactory(),
             title_factory=RobustTitleFactory(),
-            is_html_p=make_is_html(allow_xhtml=i_want_broken_xhtml_support),
+            response_type_finder=ResponseTypeFinder(
+                allow_xhtml=i_want_broken_xhtml_support),
             )
         if soup_class is None:
             soup_class = MechanizeBs
         self._soup_class = soup_class
 
     def set_response(self, response):
-        import BeautifulSoup
+        import _beautifulsoup
         Factory.set_response(self, response)
         if response is not None:
             data = response.read()
             soup = self._soup_class(self.encoding, data)
-            self._forms_factory.set_response(response, self.encoding)
+            self._forms_factory.set_response(
+                copy.copy(response), self.encoding)
             self._links_factory.set_soup(
                 soup, response.geturl(), self.encoding)
             self._title_factory.set_soup(soup, self.encoding)

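RobustFactory wires the BeautifulSoup-based pieces above together and is
selected per-Browser via the factory argument.  A usage sketch (hypothetical
URL; assumes RobustFactory is exported at package level, as in upstream
mechanize):

    import mechanize

    # parse tag soup with the bundled BeautifulSoup instead of the
    # stricter default parser
    br = mechanize.Browser(factory=mechanize.RobustFactory())
    br.open("http://example.com/tag-soup.html")
    print br.title()
    for link in br.links():
        print link.url, link.text
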
Copied: python-mechanize/trunk/mechanize/_http.py (from rev 765, python-mechanize/branches/upstream/current/mechanize/_http.py)

Modified: python-mechanize/trunk/mechanize/_lwpcookiejar.py
===================================================================
--- python-mechanize/trunk/mechanize/_lwpcookiejar.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/mechanize/_lwpcookiejar.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -18,12 +18,12 @@
 
 """
 
-import time, re, string, logging
+import time, re, logging
 
 from _clientcookie import reraise_unmasked_exceptions, FileCookieJar, Cookie, \
      MISSING_FILENAME_TEXT, LoadError
 from _headersutil import join_header_words, split_header_words
-from _util import startswith, iso2time, time2isoz
+from _util import iso2time, time2isoz
 
 debug = logging.getLogger("mechanize").debug
 
@@ -89,7 +89,7 @@
                 debug("   Not saving %s: expired", cookie.name)
                 continue
             r.append("Set-Cookie3: %s" % lwp_cookie_str(cookie))
-        return string.join(r+[""], "\n")
+        return "\n".join(r+[""])
 
     def save(self, filename=None, ignore_discard=False, ignore_expires=False):
         if filename is None:
@@ -127,9 +127,9 @@
             while 1:
                 line = f.readline()
                 if line == "": break
-                if not startswith(line, header):
+                if not line.startswith(header):
                     continue
-                line = string.strip(line[len(header):])
+                line = line[len(header):].strip()
 
                 for data in split_header_words([line]):
                     name, value = data[0]
@@ -139,7 +139,7 @@
                         standard[k] = False
                     for k, v in data[1:]:
                         if k is not None:
-                            lc = string.lower(k)
+                            lc = k.lower()
                         else:
                             lc = None
                         # don't lose case distinction for unknown fields
@@ -161,7 +161,7 @@
                     if expires is None:
                         discard = True
                     domain = h("domain")
-                    domain_specified = startswith(domain, ".")
+                    domain_specified = domain.startswith(".")
                     c = Cookie(h("version"), name, value,
                                h("port"), h("port_spec"),
                                domain, domain_specified, h("domain_dot"),

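This hunk (and the _mozillacookiejar.py and _msiecookiejar.py ones below) is a
mechanical port from the deprecated string module functions -- startswith,
strip, lower, join -- to the equivalent string methods.  The two spellings
behave identically:

    import string

    line = "  Set-Cookie3: foo=bar  \n"

    # old, deprecated string-module spelling (as removed)
    old = string.join([string.strip(line), "done"], "\n")

    # string-method spelling (as added)
    new = "\n".join([line.strip(), "done"])

    assert old == new
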
Modified: python-mechanize/trunk/mechanize/_mechanize.py
===================================================================
--- python-mechanize/trunk/mechanize/_mechanize.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/mechanize/_mechanize.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -9,14 +9,16 @@
 
 """
 
-import urllib2, urlparse, sys, copy, re
+import urllib2, sys, copy, re
 
-from _useragent import UserAgent
+from _useragent import UserAgentBase
 from _html import DefaultFactory
-from _util import response_seek_wrapper, closeable_response
+from _response import response_seek_wrapper, closeable_response
+import _upgrade
 import _request
+import _rfc3986
 
-__version__ = (0, 1, 2, "b", None)  # 0.1.2b
+__version__ = (0, 1, 6, "b", None)  # 0.1.6b
 
 class BrowserStateError(Exception): pass
 class LinkNotFoundError(Exception): pass
@@ -45,60 +47,13 @@
     def clear(self):
         del self._history[:]
     def close(self):
-        """
-            If nothing has been added, .close should work.
-
-                >>> history = History()
-                >>> history.close()
-
-            Under some circumstances response can be None, in that case
-            this method should not raise an exception.
-
-                >>> history.add(None, None)
-                >>> history.close()
-        """
         for request, response in self._history:
             if response is not None:
                 response.close()
         del self._history[:]
 
-# Horrible, but needed, at least until fork urllib2.  Even then, may want
-# to preseve urllib2 compatibility.
-def upgrade_response(response):
-    # a urllib2 handler constructed the response, i.e. the response is an
-    # urllib.addinfourl, instead of a _Util.closeable_response as returned
-    # by e.g. mechanize.HTTPHandler
-    try:
-        code = response.code
-    except AttributeError:
-        code = None
-    try:
-        msg = response.msg
-    except AttributeError:
-        msg = None
 
-    # may have already-.read() data from .seek() cache
-    data = None
-    get_data = getattr(response, "get_data", None)
-    if get_data:
-        data = get_data()
-
-    response = closeable_response(
-        response.fp, response.info(), response.geturl(), code, msg)
-    response = response_seek_wrapper(response)
-    if data:
-        response.set_data(data)
-    return response
-class ResponseUpgradeProcessor(urllib2.BaseHandler):
-    # upgrade responses to be .close()able without becoming unusable
-    handler_order = 0  # before anything else
-    def any_response(self, request, response):
-        if not hasattr(response, 'closeable_response'):
-            response = upgrade_response(response)
-        return response
-
-
-class Browser(UserAgent):
+class Browser(UserAgentBase):
     """Browser-like class with support for history, forms and links.
 
     BrowserStateError is raised whenever the browser is in the wrong state to
@@ -113,9 +68,9 @@
 
     """
 
-    handler_classes = UserAgent.handler_classes.copy()
-    handler_classes["_response_upgrade"] = ResponseUpgradeProcessor
-    default_others = copy.copy(UserAgent.default_others)
+    handler_classes = UserAgentBase.handler_classes.copy()
+    handler_classes["_response_upgrade"] = _upgrade.ResponseUpgradeProcessor
+    default_others = copy.copy(UserAgentBase.default_others)
     default_others.append("_response_upgrade")
 
     def __init__(self,
@@ -128,8 +83,8 @@
         Only named arguments should be passed to this constructor.
 
         factory: object implementing the mechanize.Factory interface.
-        history: object implementing the mechanize.History interface.  Note this
-         interface is still experimental and may change in future.
+        history: object implementing the mechanize.History interface.  Note
+         this interface is still experimental and may change in future.
         request_class: Request class to use.  Defaults to mechanize.Request
          by default for Pythons older than 2.4, urllib2.Request otherwise.
 
@@ -145,8 +100,6 @@
         if history is None:
             history = History()
         self._history = history
-        self.request = self._response = None
-        self.form = None
 
         if request_class is None:
             if not hasattr(urllib2.Request, "add_unredirected_header"):
@@ -160,48 +113,77 @@
         self._factory = factory
         self.request_class = request_class
 
-        UserAgent.__init__(self)  # do this last to avoid __getattr__ problems
+        self.request = None
+        self._set_response(None, False)
 
+        # do this last to avoid __getattr__ problems
+        UserAgentBase.__init__(self)
+
     def close(self):
+        UserAgentBase.close(self)
         if self._response is not None:
             self._response.close()    
-        UserAgent.close(self)
         if self._history is not None:
             self._history.close()
             self._history = None
+
+        # make use after .close easy to spot
+        self.form = None
         self.request = self._response = None
+        self.request = self.response = self.set_response = None
+        self.geturl = self.reload = self.back = None
+        self.clear_history = self.set_cookie = self.links = self.forms = None
+        self.viewing_html = self.encoding = self.title = None
+        self.select_form = self.click = self.submit = self.click_link = None
+        self.follow_link = self.find_link = None
 
+    def open_novisit(self, url, data=None):
+        """Open a URL without visiting it.
+
+        The browser state (including .request, .response(), history, forms and
+        links) is left unchanged by calling this function.
+
+        The interface is the same as for .open().
+
+        This is useful for things like fetching images.
+
+        See also .retrieve().
+
+        """
+        return self._mech_open(url, data, visit=False)
+
     def open(self, url, data=None):
-        if self._response is not None:
-            self._response.close()
         return self._mech_open(url, data)
 
-    def _mech_open(self, url, data=None, update_history=True):
+    def _mech_open(self, url, data=None, update_history=True, visit=None):
         try:
             url.get_full_url
         except AttributeError:
             # string URL -- convert to absolute URL if required
-            scheme, netloc = urlparse.urlparse(url)[:2]
-            if not scheme:
+            scheme, authority = _rfc3986.urlsplit(url)[:2]
+            if scheme is None:
                 # relative URL
-                assert not netloc, "malformed URL"
                 if self._response is None:
                     raise BrowserStateError(
-                        "can't fetch relative URL: not viewing any document")
-                url = urlparse.urljoin(self._response.geturl(), url)
+                        "can't fetch relative reference: "
+                        "not viewing any document")
+                url = _rfc3986.urljoin(self._response.geturl(), url)
 
-        if self.request is not None and update_history:
-            self._history.add(self.request, self._response)
-        self._response = None
-        # we want self.request to be assigned even if UserAgent.open fails
-        self.request = self._request(url, data)
-        self._previous_scheme = self.request.get_type()
+        request = self._request(url, data, visit)
+        visit = request.visit
+        if visit is None:
+            visit = True
 
+        if visit:
+            self._visit_request(request, update_history)
+
         success = True
         try:
-            response = UserAgent.open(self, self.request, data)
+            response = UserAgentBase.open(self, request, data)
         except urllib2.HTTPError, error:
             success = False
+            if error.fp is None:  # not a response
+                raise
             response = error
 ##         except (IOError, socket.error, OSError), error:
 ##             # Yes, urllib2 really does raise all these :-((
@@ -214,10 +196,16 @@
 ##             # Python core, a fix would need some backwards-compat. hack to be
 ##             # acceptable.
 ##             raise
-        self.set_response(response)
+
+        if visit:
+            self._set_response(response, False)
+            response = copy.copy(self._response)
+        elif response is not None:
+            response = _upgrade.upgrade_response(response)
+
         if not success:
-            raise error
-        return copy.copy(self._response)
+            raise response
+        return response
 
     def __str__(self):
         text = []
@@ -241,24 +229,52 @@
         return copy.copy(self._response)
 
     def set_response(self, response):
-        """Replace current response with (a copy of) response."""
+        """Replace current response with (a copy of) response.
+
+        response may be None.
+
+        This is intended mostly for HTML-preprocessing.
+        """
+        self._set_response(response, True)
+
+    def _set_response(self, response, close_current):
         # sanity check, necessary but far from sufficient
-        if not (hasattr(response, "info") and hasattr(response, "geturl") and
-                hasattr(response, "read")):
+        if not (response is None or
+                (hasattr(response, "info") and hasattr(response, "geturl") and
+                 hasattr(response, "read")
+                 )
+                ):
             raise ValueError("not a response object")
 
         self.form = None
+        if response is not None:
+            response = _upgrade.upgrade_response(response)
+        if close_current and self._response is not None:
+            self._response.close()
+        self._response = response
+        self._factory.set_response(response)
 
-        if not hasattr(response, "seek"):
-            response = response_seek_wrapper(response)
-        if not hasattr(response, "closeable_response"):
-            response = upgrade_response(response)
-        else:
-            response = copy.copy(response)
+    def visit_response(self, response, request=None):
+        """Visit the response, as if it had been .open()ed.
 
-        self._response = response
-        self._factory.set_response(self._response)
+        Unlike .set_response(), this updates history rather than replacing the
+        current response.
+        """
+        if request is None:
+            request = _request.Request(response.geturl())
+        self._visit_request(request, True)
+        self._set_response(response, False)
 
+    def _visit_request(self, request, update_history):
+        if self._response is not None:
+            self._response.close()
+        if self.request is not None and update_history:
+            self._history.add(self.request, self._response)
+        self._response = None
+        # we want self.request to be assigned even if UserAgentBase.open
+        # fails
+        self.request = request
+
     def geturl(self):
         """Get URL of current document."""
         if self._response is None:
@@ -283,11 +299,53 @@
             self._response.close()
         self.request, response = self._history.back(n, self._response)
         self.set_response(response)
-        return response
+        if not response.read_complete:
+            return self.reload()
+        return copy.copy(response)
 
     def clear_history(self):
         self._history.clear()
 
+    def set_cookie(self, cookie_string):
+        """Request to set a cookie.
+
+        Note that it is NOT necessary to call this method under ordinary
+        circumstances: cookie handling is normally entirely automatic.  The
+        intended use case is rather to simulate the setting of a cookie by
+        client script in a web page (e.g. JavaScript).  In that case, use of
+        this method is necessary because mechanize currently does not support
+        JavaScript, VBScript, etc.
+
+        The cookie is added in the same way as if it had arrived with the
+        current response, as a result of the current request.  This means that,
+        for example, if it is not appropriate to set the cookie based on
+        the current request, no cookie will be set.
+
+        The cookie will be returned automatically with subsequent responses
+        made by the Browser instance whenever that's appropriate.
+
+        cookie_string should be a valid value of the Set-Cookie header.
+
+        For example:
+
+        browser.set_cookie(
+            "sid=abcdef; expires=Wednesday, 09-Nov-06 23:12:40 GMT")
+
+        Currently, this method does not allow for adding RFC 2965 cookies.
+        This limitation will be lifted if anybody requests it.
+
+        """
+        if self._response is None:
+            raise BrowserStateError("not viewing any document")
+        if self.request.get_type() not in ["http", "https"]:
+            raise BrowserStateError("can't set cookie for non-HTTP/HTTPS "
+                                    "transactions")
+        cookiejar = self._ua_handlers["_cookies"].cookiejar
+        response = self.response()  # copy
+        headers = response.info()
+        headers["Set-cookie"] = cookie_string
+        cookiejar.extract_cookies(response, self.request)
+
     def links(self, **kwds):
         """Return iterable over links (mechanize.Link objects)."""
         if not self.viewing_html():
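
For instance, to inject a cookie that a page's client-side script would
otherwise have set (hypothetical URLs; a sketch of the API described in the
docstring above):

    import mechanize

    br = mechanize.Browser()
    br.open("http://example.com/login")    # must be viewing an HTTP(S) document
    br.set_cookie("sid=abcdef")            # added as if sent with that response
    br.open("http://example.com/private")  # Cookie: sid=abcdef goes out here
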
@@ -308,6 +366,24 @@
             raise BrowserStateError("not viewing HTML")
         return self._factory.forms()
 
+    def global_form(self):
+        """Return the global form object, or None if the factory implementation
+        did not supply one.
+
+        The "global" form object contains all controls that are not descendants of
+        any FORM element.
+
+        The returned form object implements the ClientForm.HTMLForm interface.
+
+        This is a separate method since the global form is not regarded
+        as part of the sequence of forms in the document -- mostly for
+        backwards-compatibility.
+
+        """
+        if not self.viewing_html():
+            raise BrowserStateError("not viewing HTML")
+        return self._factory.global_form
+
     def viewing_html(self):
         """Return whether the current response contains HTML data."""
         if self._response is None:
@@ -340,6 +416,10 @@
         interface, so you can call methods like .set_value(), .set(), and
         .click().
 
+        Another way to select a form is to assign to the .form attribute.  The
+        form assigned should be one of the objects returned by the .forms()
+        method.
+
         At least one of the name, predicate and nr arguments must be supplied.
         If no matching form is found, mechanize.FormNotFoundError is raised.
 
@@ -396,9 +476,9 @@
             original_scheme in ["http", "https"] and
             not (original_scheme == "https" and scheme != "https")):
             # strip URL fragment (RFC 2616 14.36)
-            parts = urlparse.urlparse(self.request.get_full_url())
-            parts = parts[:-1]+("",)
-            referer = urlparse.urlunparse(parts)
+            parts = _rfc3986.urlsplit(self.request.get_full_url())
+            parts = parts[:-1]+(None,)
+            referer = _rfc3986.urlunsplit(parts)
             request.add_unredirected_header("Referer", referer)
         return request
 
@@ -507,9 +587,6 @@
                 ".select_form()?)" % (self.__class__, name))
         return getattr(form, name)
 
-#---------------------------------------------------
-# Private methods.
-
     def _filter_links(self, links,
                     text=None, text_regex=None,
                     name=None, name_regex=None,

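Taken together, the Browser changes in this file look roughly like this in
use (hypothetical URLs; a sketch, not a test):

    import mechanize

    br = mechanize.Browser()
    br.open("http://example.com/")
    # fetch a subresource without touching .request, .response() or history
    img = br.open_novisit("http://example.com/logo.png")
    data = img.read()

    br.open("http://example.com/next")
    br.back()   # re-reads the cached response, reloading if it was not
                # fully read (see read_complete above)
    assert br.geturl() == "http://example.com/"
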
Modified: python-mechanize/trunk/mechanize/_mozillacookiejar.py
===================================================================
--- python-mechanize/trunk/mechanize/_mozillacookiejar.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/mechanize/_mozillacookiejar.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -9,11 +9,10 @@
 
 """
 
-import re, string, time, logging
+import re, time, logging
 
 from _clientcookie import reraise_unmasked_exceptions, FileCookieJar, Cookie, \
      MISSING_FILENAME_TEXT, LoadError
-from _util import startswith, endswith
 debug = logging.getLogger("ClientCookie").debug
 
 
@@ -72,23 +71,23 @@
                 if line == "": break
 
                 # last field may be absent, so keep any trailing tab
-                if endswith(line, "\n"): line = line[:-1]
+                if line.endswith("\n"): line = line[:-1]
 
                 # skip comments and blank lines XXX what is $ for?
-                if (startswith(string.strip(line), "#") or
-                    startswith(string.strip(line), "$") or
-                    string.strip(line) == ""):
+                if (line.strip().startswith("#") or
+                    line.strip().startswith("$") or
+                    line.strip() == ""):
                     continue
 
                 domain, domain_specified, path, secure, expires, name, value = \
-                        string.split(line, "\t")
+                        line.split("\t")
                 secure = (secure == "TRUE")
                 domain_specified = (domain_specified == "TRUE")
                 if name == "":
                     name = value
                     value = None
 
-                initial_dot = startswith(domain, ".")
+                initial_dot = domain.startswith(".")
                 assert domain_specified == initial_dot
 
                 discard = False
@@ -137,7 +136,7 @@
                     continue
                 if cookie.secure: secure = "TRUE"
                 else: secure = "FALSE"
-                if startswith(cookie.domain, "."): initial_dot = "TRUE"
+                if cookie.domain.startswith("."): initial_dot = "TRUE"
                 else: initial_dot = "FALSE"
                 if cookie.expires is not None:
                     expires = str(cookie.expires)
@@ -153,8 +152,8 @@
                     name = cookie.name
                     value = cookie.value
                 f.write(
-                    string.join([cookie.domain, initial_dot, cookie.path,
-                                 secure, expires, name, value], "\t")+
+                    "\t".join([cookie.domain, initial_dot, cookie.path,
+                               secure, expires, name, value])+
                     "\n")
         finally:
             f.close()

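For reference, the Mozilla cookies.txt lines being parsed here carry seven
tab-separated fields -- the same tuple the load loop unpacks:

    # domain  domain_specified  path  secure  expires  name  value
    line = ".example.com\tTRUE\t/\tFALSE\t1199145600\tsid\tabcdef\n"

    if line.endswith("\n"): line = line[:-1]
    domain, domain_specified, path, secure, expires, name, value = \
            line.split("\t")
    secure = (secure == "TRUE")            # "TRUE"/"FALSE" -> bool
    initial_dot = domain.startswith(".")   # leading dot == domain cookie
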
Modified: python-mechanize/trunk/mechanize/_msiecookiejar.py
===================================================================
--- python-mechanize/trunk/mechanize/_msiecookiejar.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/mechanize/_msiecookiejar.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -11,13 +11,12 @@
 
 # XXX names and comments are not great here
 
-import os, re, string, time, struct, logging
+import os, re, time, struct, logging
 if os.name == "nt":
     import _winreg
 
 from _clientcookie import FileCookieJar, CookieJar, Cookie, \
      MISSING_FILENAME_TEXT, LoadError
-from _util import startswith
 
 debug = logging.getLogger("mechanize").debug
 
@@ -50,7 +49,7 @@
     return divmod((filetime - WIN32_EPOCH), 10000000L)[0]
 
 def binary_to_char(c): return "%02X" % ord(c)
-def binary_to_str(d): return string.join(map(binary_to_char, list(d)), "")
+def binary_to_str(d): return "".join(map(binary_to_char, list(d)))
 
 class MSIEBase:
     magic_re = re.compile(r"Client UrlCache MMF Ver \d\.\d.*")
@@ -153,7 +152,7 @@
             else:
                 discard = False
             domain = cookie["DOMAIN"]
-            initial_dot = startswith(domain, ".")
+            initial_dot = domain.startswith(".")
             if initial_dot:
                 domain_specified = True
             else:
@@ -201,7 +200,7 @@
         now = int(time.time())
 
         if username is None:
-            username = string.lower(os.environ['USERNAME'])
+            username = os.environ['USERNAME'].lower()
 
         cookie_dir = os.path.dirname(filename)
 

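The divmod expression above converts a Win32 FILETIME (a count of
100-nanosecond ticks since 1601-01-01) to seconds since the Unix epoch: shift
the origin to 1970, then scale 100ns ticks down to seconds.  Worked through
(a sketch; the helper name is made up, and the decimal constant is the Unix
epoch expressed as a FILETIME):

    WIN32_EPOCH = 116444736000000000L  # 1970-01-01 00:00:00 as a FILETIME

    def filetime_to_epoch(filetime):
        # shift the origin from 1601 to 1970, then 100ns ticks -> seconds
        return divmod(filetime - WIN32_EPOCH, 10000000L)[0]

    # ten seconds past the epoch
    print filetime_to_epoch(WIN32_EPOCH + 10 * 10000000L)  # -> 10
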
Modified: python-mechanize/trunk/mechanize/_opener.py
===================================================================
--- python-mechanize/trunk/mechanize/_opener.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/mechanize/_opener.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -9,92 +9,30 @@
 
 """
 
-import urllib2, string, bisect, urlparse
-
-from _util import startswith, isstringlike
-from _request import Request
-
+import os, urllib2, bisect, urllib, httplib, types, tempfile
 try:
+    import threading as _threading
+except ImportError:
+    import dummy_threading as _threading
+try:
     set
 except NameError:
     import sets
     set = sets.Set
 
-def methnames(obj):
-    """Return method names of class instance.
+import _http
+import _upgrade
+import _rfc3986
+from _util import isstringlike
+from _request import Request
 
-    dir(obj) doesn't work across Python versions, this does.
 
-    """
-    return methnames_of_instance_as_dict(obj).keys()
+class ContentTooShortError(urllib2.URLError):
+    def __init__(self, reason, result):
+        urllib2.URLError.__init__(self, reason)
+        self.result = result
 
-def methnames_of_instance_as_dict(inst):
-    """
-    It is possible for an attribute to be present in the results of dir(inst),
-    but for getattr(inst, attr_name) to raise an Attribute error, that should
-    be handled gracefully.
 
-        >>> class BadAttr(object):
-        ...     def error(self):
-        ...         raise AttributeError
-        ...     error = property(error)
-
-        >>> inst = BadAttr()
-        >>> 'error' in dir(inst)
-        True
-        >>> inst.error
-        Traceback (most recent call last):
-        ...
-        AttributeError
-
-        >>> result = methnames_of_instance_as_dict(inst) # no exception
-    """
-    names = {}
-    names.update(methnames_of_class_as_dict(inst.__class__))
-    for methname in dir(inst):
-        try:
-            candidate = getattr(inst, methname)
-        except AttributeError:
-            continue
-        if callable(candidate):
-            names[methname] = None
-    return names
-
-def methnames_of_class_as_dict(klass):
-    """
-    It is possible for an attribute to be present in the results of dir(inst),
-    but for getattr(inst, attr_name) to raise an Attribute error, that should
-    be handled gracefully.
-
-        >>> class BadClass(object):
-        ...     def error(self):
-        ...         raise AttributeError
-        ...     error = property(error)
-        ...     __bases__ = []
-
-        >>> klass = BadClass()
-        >>> 'error' in dir(klass)
-        True
-        >>> klass.error
-        Traceback (most recent call last):
-        ...
-        AttributeError
-
-        >>> result = methnames_of_class_as_dict(klass) # no exception
-    """
-    names = {}
-    for methname in dir(klass):
-        try:
-            candidate = getattr(klass, methname)
-        except AttributeError:
-            continue
-        if callable(candidate):
-            names[methname] = None
-    for baseclass in klass.__bases__:
-        names.update(methnames_of_class_as_dict(baseclass))
-    return names
-
-
 class OpenerDirector(urllib2.OpenerDirector):
     def __init__(self):
         urllib2.OpenerDirector.__init__(self)
@@ -105,6 +43,7 @@
         self._any_request = {}
         self._any_response = {}
         self._handler_index_valid = True
+        self._tempfiles = []
 
     def add_handler(self, handler):
         if handler in self.handlers:
@@ -128,7 +67,7 @@
 
         for handler in self.handlers:
             added = False
-            for meth in methnames(handler):
+            for meth in dir(handler):
                 if meth in ["redirect_request", "do_open", "proxy_open"]:
                     # oops, coincidental match
                     continue
@@ -146,8 +85,8 @@
                 scheme = meth[:ii]
                 condition = meth[ii+1:]
 
-                if startswith(condition, "error"):
-                    jj = string.find(meth[ii+1:], "_") + ii + 1
+                if condition.startswith("error"):
+                    jj = meth[ii+1:].find("_") + ii + 1
                     kind = meth[jj+1:]
                     try:
                         kind = int(kind)
@@ -198,18 +137,25 @@
         self._any_request = any_request
         self._any_response = any_response
 
-    def _request(self, url_or_req, data):
+    def _request(self, url_or_req, data, visit):
         if isstringlike(url_or_req):
-            req = Request(url_or_req, data)
+            req = Request(url_or_req, data, visit=visit)
         else:
             # already a urllib2.Request or mechanize.Request instance
             req = url_or_req
             if data is not None:
                 req.add_data(data)
+            # XXX yuck, give request a .visit attribute if it doesn't have one
+            try:
+                req.visit
+            except AttributeError:
+                req.visit = None
+            if visit is not None:
+                req.visit = visit
         return req
 
     def open(self, fullurl, data=None):
-        req = self._request(fullurl, data)
+        req = self._request(fullurl, data, None)
         req_scheme = req.get_type()
 
         self._maybe_reindex_handlers()
@@ -267,48 +213,174 @@
             args = (dict, 'default', 'http_error_default') + orig_args
             return apply(self._call_chain, args)
 
+    BLOCK_SIZE = 1024*8
     def retrieve(self, fullurl, filename=None, reporthook=None, data=None):
         """Returns (filename, headers).
 
         For remote objects, the default filename will refer to a temporary
-        file.
+        file.  Temporary files are removed when the OpenerDirector.close()
+        method is called.
 
+        For file: URLs, at present the returned filename is None.  This may
+        change in future.
+
+        If the actual number of bytes read is less than indicated by the
+        Content-Length header, raises ContentTooShortError (a URLError
+        subclass).  The exception's .result attribute contains the (filename,
+        headers) that would have been returned.
+
         """
-        req = self._request(fullurl, data)
-        type_ = req.get_type()
+        req = self._request(fullurl, data, False)
+        scheme = req.get_type()
         fp = self.open(req)
         headers = fp.info()
-        if filename is None and type == 'file':
-            return url2pathname(req.get_selector()), headers
+        if filename is None and scheme == 'file':
+            # XXX req.get_selector() seems broken here, return None,
+            #   pending sanity :-/
+            return None, headers
+            #return urllib.url2pathname(req.get_selector()), headers
         if filename:
             tfp = open(filename, 'wb')
         else:
-            path = urlparse(fullurl)[2]
+            path = _rfc3986.urlsplit(fullurl)[2]
             suffix = os.path.splitext(path)[1]
-            tfp = tempfile.TemporaryFile("wb", suffix=suffix)
+            fd, filename = tempfile.mkstemp(suffix)
+            self._tempfiles.append(filename)
+            tfp = os.fdopen(fd, 'wb')
+
         result = filename, headers
-        bs = 1024*8
+        bs = self.BLOCK_SIZE
         size = -1
         read = 0
-        blocknum = 1
+        blocknum = 0
         if reporthook:
-            if headers.has_key("content-length"):
+            if "content-length" in headers:
                 size = int(headers["Content-Length"])
-            reporthook(0, bs, size)
+            reporthook(blocknum, bs, size)
         while 1:
             block = fp.read(bs)
+            if block == "":
+                break
             read += len(block)
+            tfp.write(block)
+            blocknum += 1
             if reporthook:
                 reporthook(blocknum, bs, size)
-            blocknum = blocknum + 1
-            if not block:
-                break
-            tfp.write(block)
         fp.close()
         tfp.close()
         del fp
         del tfp
-        if size>=0 and read<size:
-            raise IOError("incomplete retrieval error",
-                          "got only %d bytes out of %d" % (read,size))
+
+        # raise exception if actual size does not match content-length header
+        if size >= 0 and read < size:
+            raise ContentTooShortError(
+                "retrieval incomplete: "
+                "got only %i out of %i bytes" % (read, size),
+                result
+                )
+
         return result
+
+    def close(self):
+        urllib2.OpenerDirector.close(self)
+
+        # make it very obvious this object is no longer supposed to be used
+        self.open = self.error = self.retrieve = self.add_handler = None
+
+        if self._tempfiles:
+            for filename in self._tempfiles:
+                try:
+                    os.unlink(filename)
+                except OSError:
+                    pass
+            del self._tempfiles[:]
+
+
+class OpenerFactory:
+    """This class's interface is quite likely to change."""
+
+    default_classes = [
+        # handlers
+        urllib2.ProxyHandler,
+        urllib2.UnknownHandler,
+        _http.HTTPHandler,  # derived from new AbstractHTTPHandler
+        urllib2.HTTPDefaultErrorHandler,
+        _http.HTTPRedirectHandler,  # bugfixed
+        urllib2.FTPHandler,
+        urllib2.FileHandler,
+        # processors
+        _upgrade.HTTPRequestUpgradeProcessor,
+        _http.HTTPCookieProcessor,
+        _http.HTTPErrorProcessor,
+        ]
+    if hasattr(httplib, 'HTTPS'):
+        default_classes.append(_http.HTTPSHandler)
+    handlers = []
+    replacement_handlers = []
+
+    def __init__(self, klass=OpenerDirector):
+        self.klass = klass
+
+    def build_opener(self, *handlers):
+        """Create an opener object from a list of handlers and processors.
+
+        The opener will use several default handlers and processors, including
+        support for HTTP and FTP.
+
+        If any of the handlers passed as arguments are subclasses of the
+        default handlers, the default handlers will not be used.
+
+        """
+        opener = self.klass()
+        default_classes = list(self.default_classes)
+        skip = []
+        for klass in default_classes:
+            for check in handlers:
+                if type(check) == types.ClassType:
+                    if issubclass(check, klass):
+                        skip.append(klass)
+                elif type(check) == types.InstanceType:
+                    if isinstance(check, klass):
+                        skip.append(klass)
+        for klass in skip:
+            default_classes.remove(klass)
+
+        for klass in default_classes:
+            opener.add_handler(klass())
+        for h in handlers:
+            if type(h) == types.ClassType:
+                h = h()
+            opener.add_handler(h)
+
+        return opener
+
+
+build_opener = OpenerFactory().build_opener
+
+_opener = None
+urlopen_lock = _threading.Lock()
+def urlopen(url, data=None):
+    global _opener
+    if _opener is None:
+        urlopen_lock.acquire()
+        try:
+            if _opener is None:
+                _opener = build_opener()
+        finally:
+            urlopen_lock.release()
+    return _opener.open(url, data)
+
+def urlretrieve(url, filename=None, reporthook=None, data=None):
+    global _opener
+    if _opener is None:
+        urlopen_lock.acquire()
+        try:
+            if _opener is None:
+                _opener = build_opener()
+        finally:
+            urlopen_lock.release()
+    return _opener.retrieve(url, filename, reporthook, data)
+
+def install_opener(opener):
+    global _opener
+    _opener = opener

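The rewritten retrieve() is reachable through the module-level urlretrieve()
defined at the end of the hunk.  A usage sketch (hypothetical URL; both names
are defined in this module, so import them from there if the package does not
re-export them):

    from mechanize._opener import urlretrieve, ContentTooShortError

    def progress(block, block_size, total_size):
        print "block %d (%d bytes each, %d bytes total)" % (
            block, block_size, total_size)

    try:
        filename, headers = urlretrieve("http://example.com/big.tar.gz",
                                        reporthook=progress)
    except ContentTooShortError, exc:
        # fewer bytes arrived than Content-Length promised; the partial
        # (filename, headers) result is still available on the exception
        filename, headers = exc.result
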
Modified: python-mechanize/trunk/mechanize/_request.py
===================================================================
--- python-mechanize/trunk/mechanize/_request.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/mechanize/_request.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -8,16 +8,33 @@
 
 """
 
-import urllib2, string
+import urllib2, urllib, logging
 
 from _clientcookie import request_host
+import _rfc3986
 
+warn = logging.getLogger("mechanize").warning
+# don't complain about missing logging handler
+logging.getLogger("mechanize").setLevel(logging.ERROR)
 
+
 class Request(urllib2.Request):
     def __init__(self, url, data=None, headers={},
-             origin_req_host=None, unverifiable=False):
+                 origin_req_host=None, unverifiable=False, visit=None):
+        # In mechanize 0.2, the interpretation of a unicode url argument will
+        # change: A unicode url argument will be interpreted as an IRI, and a
+        # bytestring as a URI. For now, we accept unicode or bytestring.  We
+        # don't insist that the value is always a URI (specifically, must only
+        # contain characters which are legal), because that might break working
+        # code (who knows what bytes some servers want to see, especially with
+        # browser plugins for internationalised URIs).
+        if not _rfc3986.is_clean_uri(url):
+            warn("url argument is not a URI "
+                 "(contains illegal characters) %r" % url)
         urllib2.Request.__init__(self, url, data, headers)
+        self.selector = None
         self.unredirected_hdrs = {}
+        self.visit = visit
 
         # All the terminology below comes from RFC 2965.
         self.unverifiable = unverifiable
@@ -31,6 +48,11 @@
             origin_req_host = request_host(self)
         self.origin_req_host = origin_req_host
 
+    def get_selector(self):
+        if self.selector is None:
+            self.selector, self.__r_selector = urllib.splittag(self.__r_host)
+        return self.selector
+
     def get_origin_req_host(self):
         return self.origin_req_host
 
@@ -39,14 +61,12 @@
 
     def add_unredirected_header(self, key, val):
         """Add a header that will not be added to a redirected request."""
-        self.unredirected_hdrs[string.capitalize(key)] = val
+        self.unredirected_hdrs[key.capitalize()] = val
 
     def has_header(self, header_name):
         """True iff request has named header (regular or unredirected)."""
-        if (self.headers.has_key(header_name) or
-            self.unredirected_hdrs.has_key(header_name)):
-            return True
-        return False
+        return (header_name in self.headers or
+                header_name in self.unredirected_hdrs)
 
     def get_header(self, header_name, default=None):
         return self.headers.get(

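The new visit argument is how open_novisit() is implemented end to end: the
Request carries the flag, and Browser._mech_open() honours it.  Sketch
(hypothetical URLs):

    from mechanize import Request

    # visit defaults to None ("caller decides"); Browser.open() treats
    # None as True, Browser.open_novisit() forces False
    req = Request("http://example.com/logo.png", visit=False)

    # a URL containing characters that are illegal in a URI is still
    # accepted, but triggers a warning on the "mechanize" logger
    req2 = Request("http://example.com/some page")
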
Copied: python-mechanize/trunk/mechanize/_response.py (from rev 765, python-mechanize/branches/upstream/current/mechanize/_response.py)

Copied: python-mechanize/trunk/mechanize/_rfc3986.py (from rev 765, python-mechanize/branches/upstream/current/mechanize/_rfc3986.py)

Copied: python-mechanize/trunk/mechanize/_seek.py (from rev 765, python-mechanize/branches/upstream/current/mechanize/_seek.py)

Copied: python-mechanize/trunk/mechanize/_upgrade.py (from rev 765, python-mechanize/branches/upstream/current/mechanize/_upgrade.py)

Modified: python-mechanize/trunk/mechanize/_urllib2.py
===================================================================
--- python-mechanize/trunk/mechanize/_urllib2.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/mechanize/_urllib2.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -3,24 +3,25 @@
 from urllib2 import \
      URLError, \
      HTTPError, \
-     GopherError, \
+     GopherError
+# ...and from mechanize
+from _opener import OpenerDirector, \
+     build_opener, install_opener, urlopen
+from _auth import \
      HTTPPasswordMgr, \
      HTTPPasswordMgrWithDefaultRealm, \
      AbstractBasicAuthHandler, \
-     AbstractDigestAuthHandler
-# ...and from mechanize
-from _opener import OpenerDirector
-from _auth import \
+     AbstractDigestAuthHandler, \
      HTTPProxyPasswordMgr, \
      ProxyHandler, \
      ProxyBasicAuthHandler, \
      ProxyDigestAuthHandler, \
      HTTPBasicAuthHandler, \
-     HTTPDigestAuthHandler
-from _urllib2_support import \
-     Request, \
-     build_opener, install_opener, urlopen, \
-     OpenerFactory, urlretrieve, \
+     HTTPDigestAuthHandler, \
+     HTTPSClientCertMgr
+from _request import \
+     Request
+from _http import \
      RobotExclusionError
 
 # handlers...
@@ -34,20 +35,27 @@
      FileHandler, \
      GopherHandler
 # ...and from mechanize
-from _urllib2_support import \
+from _http import \
      HTTPHandler, \
      HTTPRedirectHandler, \
-     HTTPRequestUpgradeProcessor, \
      HTTPEquivProcessor, \
-     SeekableProcessor, \
      HTTPCookieProcessor, \
      HTTPRefererProcessor, \
      HTTPRefreshProcessor, \
      HTTPErrorProcessor, \
+     HTTPRobotRulesProcessor
+from _upgrade import \
+     HTTPRequestUpgradeProcessor, \
+     ResponseUpgradeProcessor
+from _debug import \
      HTTPResponseDebugProcessor, \
-     HTTPRedirectDebugProcessor, \
-     HTTPRobotRulesProcessor
+     HTTPRedirectDebugProcessor
+from _seek import \
+     SeekableProcessor
+# crap ATM
+## from _gzip import \
+##      HTTPGzipProcessor
 import httplib
 if hasattr(httplib, 'HTTPS'):
-    from _urllib2_support import HTTPSHandler
+    from _http import HTTPSHandler
 del httplib

Deleted: python-mechanize/trunk/mechanize/_urllib2_support.py
===================================================================
--- python-mechanize/trunk/mechanize/_urllib2_support.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/mechanize/_urllib2_support.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -1,718 +0,0 @@
-"""Integration with Python standard library module urllib2.
-
-Also includes a redirection bugfix, support for parsing HTML HEAD blocks for
-the META HTTP-EQUIV tag contents, and following Refresh header redirects.
-
-Copyright 2002-2006 John J Lee <jjl at pobox.com>
-
-This code is free software; you can redistribute it and/or modify it
-under the terms of the BSD or ZPL 2.1 licenses (see the file
-COPYING.txt included with the distribution).
-
-"""
-
-import copy, time, tempfile, htmlentitydefs, re, logging, types, \
-       string, socket, urlparse, urllib2, urllib, httplib, sgmllib
-from urllib2 import URLError, HTTPError, BaseHandler
-from cStringIO import StringIO
-try:
-    import threading as _threading
-except ImportError:
-    import dummy_threading as _threading
-
-import _opener
-from _request import Request
-from _util import isstringlike, startswith, \
-     getheaders, closeable_response, response_seek_wrapper
-from _html import unescape, unescape_charref
-from _headersutil import is_html
-from _clientcookie import CookieJar, request_host
-
-debug = logging.getLogger("mechanize.cookies").debug
-
-
-CHUNK = 1024  # size of chunks fed to HTML HEAD parser, in bytes
-DEFAULT_ENCODING = 'latin-1'
-
-
-# This fixes a bug in urllib2 as of Python 2.1.3 and 2.2.2
-#  (http://www.python.org/sf/549151)
-# 2.2.3 is broken here (my fault!), 2.3 is fixed.
-class HTTPRedirectHandler(BaseHandler):
-    # maximum number of redirections to any single URL
-    # this is needed because of the state that cookies introduce
-    max_repeats = 4
-    # maximum total number of redirections (regardless of URL) before
-    # assuming we're in a loop
-    max_redirections = 10
-
-    # Implementation notes:
-
-    # To avoid the server sending us into an infinite loop, the request
-    # object needs to track what URLs we have already seen.  Do this by
-    # adding a handler-specific attribute to the Request object.  The value
-    # of the dict is used to count the number of times the same URL has
-    # been visited.  This is needed because visiting the same URL twice
-    # does not necessarily imply a loop, thanks to state introduced by
-    # cookies.
-
-    # Always unhandled redirection codes:
-    # 300 Multiple Choices: should not handle this here.
-    # 304 Not Modified: no need to handle here: only of interest to caches
-    #     that do conditional GETs
-    # 305 Use Proxy: probably not worth dealing with here
-    # 306 Unused: what was this for in the previous versions of protocol??
-
-    def redirect_request(self, newurl, req, fp, code, msg, headers):
-        """Return a Request or None in response to a redirect.
-
-        This is called by the http_error_30x methods when a redirection
-        response is received.  If a redirection should take place, return a
-        new Request to allow http_error_30x to perform the redirect;
-        otherwise, return None to indicate that an HTTPError should be
-        raised.
-
-        """
-        if code in (301, 302, 303, "refresh") or \
-               (code == 307 and not req.has_data()):
-            # Strictly (according to RFC 2616), 301 or 302 in response to
-            # a POST MUST NOT cause a redirection without confirmation
-            # from the user (of urllib2, in this case).  In practice,
-            # essentially all clients do redirect in this case, so we do
-            # the same.
-            return Request(newurl,
-                           headers=req.headers,
-                           origin_req_host=req.get_origin_req_host(),
-                           unverifiable=True)
-        else:
-            raise HTTPError(req.get_full_url(), code, msg, headers, fp)
-
-    def http_error_302(self, req, fp, code, msg, headers):
-        # Some servers (incorrectly) return multiple Location headers
-        # (so probably same goes for URI).  Use first header.
-        if headers.has_key('location'):
-            newurl = getheaders(headers, 'location')[0]
-        elif headers.has_key('uri'):
-            newurl = getheaders(headers, 'uri')[0]
-        else:
-            return
-        newurl = urlparse.urljoin(req.get_full_url(), newurl)
-
-        # XXX Probably want to forget about the state of the current
-        # request, although that might interact poorly with other
-        # handlers that also use handler-specific request attributes
-        new = self.redirect_request(newurl, req, fp, code, msg, headers)
-        if new is None:
-            return
-
-        # loop detection
-        # .redirect_dict has a key url if url was previously visited.
-        if hasattr(req, 'redirect_dict'):
-            visited = new.redirect_dict = req.redirect_dict
-            if (visited.get(newurl, 0) >= self.max_repeats or
-                len(visited) >= self.max_redirections):
-                raise HTTPError(req.get_full_url(), code,
-                                self.inf_msg + msg, headers, fp)
-        else:
-            visited = new.redirect_dict = req.redirect_dict = {}
-        visited[newurl] = visited.get(newurl, 0) + 1
-
-        # Don't close the fp until we are sure that we won't use it
-        # with HTTPError.  
-        fp.read()
-        fp.close()
-
-        return self.parent.open(new)
-
-    http_error_301 = http_error_303 = http_error_307 = http_error_302
-    http_error_refresh = http_error_302
-
-    inf_msg = "The HTTP server returned a redirect error that would " \
-              "lead to an infinite loop.\n" \
-              "The last 30x error message was:\n"
-
-
-class HTTPRequestUpgradeProcessor(BaseHandler):
-    # upgrade urllib2.Request to this module's Request
-    # yuck!
-    handler_order = 0  # before anything else
-
-    def http_request(self, request):
-        if not hasattr(request, "add_unredirected_header"):
-            newrequest = Request(request._Request__original, request.data,
-                                 request.headers)
-            try: newrequest.origin_req_host = request.origin_req_host
-            except AttributeError: pass
-            try: newrequest.unverifiable = request.unverifiable
-            except AttributeError: pass
-            request = newrequest
-        return request
-
-    https_request = http_request
-
-# XXX would self.reset() work, instead of raising this exception?
-class EndOfHeadError(Exception): pass
-class AbstractHeadParser:
-    # only these elements are allowed in or before HEAD of document
-    head_elems = ("html", "head",
-                  "title", "base",
-                  "script", "style", "meta", "link", "object")
-    _entitydefs = htmlentitydefs.name2codepoint
-    _encoding = DEFAULT_ENCODING
-
-    def __init__(self):
-        self.http_equiv = []
-
-    def start_meta(self, attrs):
-        http_equiv = content = None
-        for key, value in attrs:
-            if key == "http-equiv":
-                http_equiv = self.unescape_attr_if_required(value)
-            elif key == "content":
-                content = self.unescape_attr_if_required(value)
-        if http_equiv is not None:
-            self.http_equiv.append((http_equiv, content))
-
-    def end_head(self):
-        raise EndOfHeadError()
-
-    def handle_entityref(self, name):
-        #debug("%s", name)
-        self.handle_data(unescape(
-            '&%s;' % name, self._entitydefs, self._encoding))
-
-    def handle_charref(self, name):
-        #debug("%s", name)
-        self.handle_data(unescape_charref(name, self._encoding))
-
-    def unescape_attr(self, name):
-        #debug("%s", name)
-        return unescape(name, self._entitydefs, self._encoding)
-
-    def unescape_attrs(self, attrs):
-        #debug("%s", attrs)
-        escaped_attrs = {}
-        for key, val in attrs.items():
-            escaped_attrs[key] = self.unescape_attr(val)
-        return escaped_attrs
-
-    def unknown_entityref(self, ref):
-        self.handle_data("&%s;" % ref)
-
-    def unknown_charref(self, ref):
-        self.handle_data("&#%s;" % ref)
-
-
-try:
-    import HTMLParser
-except ImportError:
-    pass
-else:
-    class XHTMLCompatibleHeadParser(AbstractHeadParser,
-                                    HTMLParser.HTMLParser):
-        def __init__(self):
-            HTMLParser.HTMLParser.__init__(self)
-            AbstractHeadParser.__init__(self)
-
-        def handle_starttag(self, tag, attrs):
-            if tag not in self.head_elems:
-                raise EndOfHeadError()
-            try:
-                method = getattr(self, 'start_' + tag)
-            except AttributeError:
-                try:
-                    method = getattr(self, 'do_' + tag)
-                except AttributeError:
-                    pass # unknown tag
-                else:
-                    method(attrs)
-            else:
-                method(attrs)
-
-        def handle_endtag(self, tag):
-            if tag not in self.head_elems:
-                raise EndOfHeadError()
-            try:
-                method = getattr(self, 'end_' + tag)
-            except AttributeError:
-                pass # unknown tag
-            else:
-                method()
-
-        def unescape(self, name):
-            # Use the entitydefs passed into constructor, not
-            # HTMLParser.HTMLParser's entitydefs.
-            return self.unescape_attr(name)
-
-        def unescape_attr_if_required(self, name):
-            return name  # HTMLParser.HTMLParser already did it
-
-class HeadParser(AbstractHeadParser, sgmllib.SGMLParser):
-
-    def _not_called(self):
-        assert False
-
-    def __init__(self):
-        sgmllib.SGMLParser.__init__(self)
-        AbstractHeadParser.__init__(self)
-
-    def handle_starttag(self, tag, method, attrs):
-        if tag not in self.head_elems:
-            raise EndOfHeadError()
-        if tag == "meta":
-            method(attrs)
-
-    def unknown_starttag(self, tag, attrs):
-        self.handle_starttag(tag, self._not_called, attrs)
-
-    def handle_endtag(self, tag, method):
-        if tag in self.head_elems:
-            method()
-        else:
-            raise EndOfHeadError()
-
-    def unescape_attr_if_required(self, name):
-        return self.unescape_attr(name)
-
-def parse_head(fileobj, parser):
-    """Return a list of key, value pairs."""
-    while 1:
-        data = fileobj.read(CHUNK)
-        try:
-            parser.feed(data)
-        except EndOfHeadError:
-            break
-        if len(data) != CHUNK:
-            # this should only happen if there is no HTML body, or if
-            # CHUNK is big
-            break
-    return parser.http_equiv
-
-class HTTPEquivProcessor(BaseHandler):
-    """Append META HTTP-EQUIV headers to regular HTTP headers."""
-
-    handler_order = 300  # before handlers that look at HTTP headers
-
-    def __init__(self, head_parser_class=HeadParser,
-                 i_want_broken_xhtml_support=False,
-                 ):
-        self.head_parser_class = head_parser_class
-        self._allow_xhtml = i_want_broken_xhtml_support
-
-    def http_response(self, request, response):
-        if not hasattr(response, "seek"):
-            response = response_seek_wrapper(response)
-        headers = response.info()
-        url = response.geturl()
-        ct_hdrs = getheaders(response.info(), "content-type")
-        if is_html(ct_hdrs, url, self._allow_xhtml):
-            try:
-                try:
-                    html_headers = parse_head(response, self.head_parser_class())
-                finally:
-                    response.seek(0)
-            except (HTMLParser.HTMLParseError,
-                    sgmllib.SGMLParseError):
-                pass
-            else:
-                for hdr, val in html_headers:
-                    # rfc822.Message interprets this as appending, not clobbering
-                    headers[hdr] = val
-        return response
-
-    https_response = http_response
-
-class SeekableProcessor(BaseHandler):
-    """Make responses seekable."""
-
-    def any_response(self, request, response):
-        if not hasattr(response, "seek"):
-            return response_seek_wrapper(response)
-        return response
-
-class HTTPCookieProcessor(BaseHandler):
-    """Handle HTTP cookies.
-
-    Public attributes:
-
-    cookiejar: CookieJar instance
-
-    """
-    def __init__(self, cookiejar=None):
-        if cookiejar is None:
-            cookiejar = CookieJar()
-        self.cookiejar = cookiejar
-
-    def http_request(self, request):
-        self.cookiejar.add_cookie_header(request)
-        return request
-
-    def http_response(self, request, response):
-        self.cookiejar.extract_cookies(response, request)
-        return response
-
-    https_request = http_request
-    https_response = http_response
-
-try:
-    import robotparser
-except ImportError:
-    pass
-else:
-    class RobotExclusionError(urllib2.HTTPError):
-        def __init__(self, request, *args):
-            apply(urllib2.HTTPError.__init__, (self,)+args)
-            self.request = request
-
-    class HTTPRobotRulesProcessor(BaseHandler):
-        # before redirections, after everything else
-        handler_order = 800
-
-        try:
-            from httplib import HTTPMessage
-        except:
-            from mimetools import Message
-            http_response_class = Message
-        else:
-            http_response_class = HTTPMessage
-
-        def __init__(self, rfp_class=robotparser.RobotFileParser):
-            self.rfp_class = rfp_class
-            self.rfp = None
-            self._host = None
-
-        def http_request(self, request):
-            host = request.get_host()
-            scheme = request.get_type()
-            if host != self._host:
-                self.rfp = self.rfp_class()
-                self.rfp.set_url(scheme+"://"+host+"/robots.txt")
-                self.rfp.read()
-                self._host = host
-
-            ua = request.get_header("User-agent", "")
-            if self.rfp.can_fetch(ua, request.get_full_url()):
-                return request
-            else:
-                msg = "request disallowed by robots.txt"
-                raise RobotExclusionError(
-                    request,
-                    request.get_full_url(),
-                    403, msg,
-                    self.http_response_class(StringIO()), StringIO(msg))
-
-        https_request = http_request
-
-class HTTPRefererProcessor(BaseHandler):
-    """Add Referer header to requests.
-
-    Use each HTTPRefererProcessor for a single chain of requests only.  For
-    example, if you use a single HTTPRefererProcessor to fetch a series of
-    URLs extracted from a single page, the Referer header will be wrong for
-    all but the first request.
-
-    There's a proper implementation of this in module mechanize.
-
-    """
-    def __init__(self):
-        self.referer = None
-
-    def http_request(self, request):
-        if ((self.referer is not None) and
-            not request.has_header("Referer")):
-            request.add_unredirected_header("Referer", self.referer)
-        return request
-
-    def http_response(self, request, response):
-        self.referer = response.geturl()
-        return response
-
-    https_request = http_request
-    https_response = http_response
-
-class HTTPResponseDebugProcessor(BaseHandler):
-    handler_order = 900  # before redirections, after everything else
-
-    def http_response(self, request, response):
-        if not hasattr(response, "seek"):
-            response = response_seek_wrapper(response)
-        info = getLogger("mechanize.http_responses").info
-        try:
-            info(response.read())
-        finally:
-            response.seek(0)
-        info("*****************************************************")
-        return response
-
-    https_response = http_response
-
-class HTTPRedirectDebugProcessor(BaseHandler):
-    def http_request(self, request):
-        if hasattr(request, "redirect_dict"):
-            info = getLogger("mechanize.http_redirects").info
-            info("redirecting to %s", request.get_full_url())
-        return request
-
-class HTTPRefreshProcessor(BaseHandler):
-    """Perform HTTP Refresh redirections.
-
-    Note that if a non-200 HTTP code has occurred (for example, a 30x
-    redirect), this processor will do nothing.
-
-    By default, only zero-time Refresh headers are redirected.  Use the
-    max_time attribute / constructor argument to allow Refresh with longer
-    pauses.  Use the honor_time attribute / constructor argument to control
-    whether the requested pause is honoured (with a time.sleep()) or
-    skipped in favour of immediate redirection.
-
-    Public attributes:
-
-    max_time: see above
-    honor_time: see above
-
-    """
-    handler_order = 1000
-
-    def __init__(self, max_time=0, honor_time=True):
-        self.max_time = max_time
-        self.honor_time = honor_time
-
-    def http_response(self, request, response):
-        code, msg, hdrs = response.code, response.msg, response.info()
-
-        if code == 200 and hdrs.has_key("refresh"):
-            refresh = getheaders(hdrs, "refresh")[0]
-            ii = string.find(refresh, ";")
-            if ii != -1:
-                pause, newurl_spec = float(refresh[:ii]), refresh[ii+1:]
-                jj = string.find(newurl_spec, "=")
-                if jj != -1:
-                    key, newurl = newurl_spec[:jj], newurl_spec[jj+1:]
-                    if key.strip().lower() != "url":
-                        debug("bad Refresh header: %r" % refresh)
-                        return response
-                else:
-                    # no "url=" key: also a malformed Refresh header
-                    debug("bad Refresh header: %r" % refresh)
-                    return response
-            else:
-                pause, newurl = float(refresh), response.geturl()
-            if (self.max_time is None) or (pause <= self.max_time):
-                if pause > 1E-3 and self.honor_time:
-                    time.sleep(pause)
-                hdrs["location"] = newurl
-                # hardcoded http is NOT a bug
-                response = self.parent.error(
-                    "http", request, response,
-                    "refresh", msg, hdrs)
-
-        return response
-
-    https_response = http_response
-
-class HTTPErrorProcessor(BaseHandler):
-    """Process HTTP error responses.
-
-    The purpose of this handler is to allow other response processors a
-    look-in by removing the call to parent.error() from
-    AbstractHTTPHandler.
-
-    For non-200 error codes, this just passes the job on to the
-    Handler.<proto>_error_<code> methods, via the OpenerDirector.error
-    method.  Eventually, urllib2.HTTPDefaultErrorHandler will raise an
-    HTTPError if no other handler handles the error.
-
-    """
-    handler_order = 1000  # after all other processors
-
-    def http_response(self, request, response):
-        code, msg, hdrs = response.code, response.msg, response.info()
-
-        if code != 200:
-            # hardcoded http is NOT a bug
-            response = self.parent.error(
-                "http", request, response, code, msg, hdrs)
-
-        return response
-
-    https_response = http_response
-
-
-class AbstractHTTPHandler(BaseHandler):
-
-    def __init__(self, debuglevel=0):
-        self._debuglevel = debuglevel
-
-    def set_http_debuglevel(self, level):
-        self._debuglevel = level
-
-    def do_request_(self, request):
-        host = request.get_host()
-        if not host:
-            raise URLError('no host given')
-
-        if request.has_data():  # POST
-            data = request.get_data()
-            if not request.has_header('Content-type'):
-                request.add_unredirected_header(
-                    'Content-type',
-                    'application/x-www-form-urlencoded')
-
-        scheme, sel = urllib.splittype(request.get_selector())
-        sel_host, sel_path = urllib.splithost(sel)
-        if not request.has_header('Host'):
-            request.add_unredirected_header('Host', sel_host or host)
-        for name, value in self.parent.addheaders:
-            name = string.capitalize(name)
-            if not request.has_header(name):
-                request.add_unredirected_header(name, value)
-
-        return request
-
-    def do_open(self, http_class, req):
-        """Return an addinfourl object for the request, using http_class.
-
-        http_class must implement the HTTPConnection API from httplib.
-        The addinfourl return value is a file-like object.  It also
-        has methods and attributes including:
-            - info(): return a mimetools.Message object for the headers
-            - geturl(): return the original request URL
-            - code: HTTP status code
-        """
-        host = req.get_host()
-        if not host:
-            raise URLError('no host given')
-
-        h = http_class(host) # will parse host:port
-        h.set_debuglevel(self._debuglevel)
-
-        headers = req.headers.copy()
-        headers.update(req.unredirected_hdrs)
-        # We want to make an HTTP/1.1 request, but the addinfourl
-        # class isn't prepared to deal with a persistent connection.
-        # It will try to read all remaining data from the socket,
-        # which will block while the server waits for the next request.
-        # So make sure the connection gets closed after the (only)
-        # request.
-        headers["Connection"] = "close"
-        try:
-            h.request(req.get_method(), req.get_selector(), req.data, headers)
-            r = h.getresponse()
-        except socket.error, err: # XXX what error?
-            raise URLError(err)
-
-        # Pick apart the HTTPResponse object to get the addinfourl
-        # object initialized properly.
-
-        # Wrap the HTTPResponse object in socket's file object adapter
-        # for Windows.  That adapter calls recv(), so delegate recv()
-        # to read().  This weird wrapping allows the returned object to
-        # have readline() and readlines() methods.
-
-        # XXX It might be better to extract the read buffering code
-        # out of socket._fileobject() and into a base class.
-
-        r.recv = r.read
-        fp = socket._fileobject(r, 'rb', -1)
-
-        resp = closeable_response(fp, r.msg, req.get_full_url(),
-                                  r.status, r.reason)
-        return resp
-
-
-class HTTPHandler(AbstractHTTPHandler):
-    def http_open(self, req):
-        return self.do_open(httplib.HTTPConnection, req)
-
-    http_request = AbstractHTTPHandler.do_request_
-
-if hasattr(httplib, 'HTTPS'):
-    class HTTPSHandler(AbstractHTTPHandler):
-        def https_open(self, req):
-            return self.do_open(httplib.HTTPSConnection, req)
-
-        https_request = AbstractHTTPHandler.do_request_
-
-class OpenerFactory:
-    """This class's interface is quite likely to change."""
-
-    default_classes = [
-        # handlers
-        urllib2.ProxyHandler,
-        urllib2.UnknownHandler,
-        HTTPHandler,  # from this module (derived from new AbstractHTTPHandler)
-        urllib2.HTTPDefaultErrorHandler,
-        HTTPRedirectHandler,  # from this module (bugfixed)
-        urllib2.FTPHandler,
-        urllib2.FileHandler,
-        # processors
-        HTTPRequestUpgradeProcessor,
-        HTTPCookieProcessor,
-        HTTPErrorProcessor
-        ]
-    handlers = []
-    replacement_handlers = []
-
-    def __init__(self, klass=_opener.OpenerDirector):
-        self.klass = klass
-
-    def build_opener(self, *handlers):
-        """Create an opener object from a list of handlers and processors.
-
-        The opener will use several default handlers and processors, including
-        support for HTTP and FTP.
-
-        If any of the handlers passed as arguments are subclasses of the
-        default handlers, the default handlers will not be used.
-
-        """
-        opener = self.klass()
-        default_classes = list(self.default_classes)
-        if hasattr(httplib, 'HTTPS'):
-            default_classes.append(HTTPSHandler)
-        skip = []
-        for klass in default_classes:
-            for check in handlers:
-                if type(check) == types.ClassType:
-                    if issubclass(check, klass):
-                        skip.append(klass)
-                elif type(check) == types.InstanceType:
-                    if isinstance(check, klass):
-                        skip.append(klass)
-        for klass in skip:
-            default_classes.remove(klass)
-
-        for klass in default_classes:
-            opener.add_handler(klass())
-        for h in handlers:
-            if type(h) == types.ClassType:
-                h = h()
-            opener.add_handler(h)
-
-        return opener
-
-build_opener = OpenerFactory().build_opener
-
-_opener = None
-urlopen_lock = _threading.Lock()
-def urlopen(url, data=None):
-    global _opener
-    if _opener is None:
-        urlopen_lock.acquire()
-        try:
-            if _opener is None:
-                _opener = build_opener()
-        finally:
-            urlopen_lock.release()
-    return _opener.open(url, data)
-
-def urlretrieve(url, filename=None, reporthook=None, data=None):
-    global _opener
-    if _opener is None:
-        urlopen_lock.acquire()
-        try:
-            if _opener is None:
-                _opener = build_opener()
-        finally:
-            urlopen_lock.release()
-    return _opener.retrieve(url, filename, reporthook, data)
-
-def install_opener(opener):
-    global _opener
-    _opener = opener

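For reference, a minimal usage sketch of the opener machinery deleted above
(it survives in the new mechanize modules).  This is a hedged example, not
part of the patch: it assumes mechanize keeps exporting build_opener,
HTTPEquivProcessor and HTTPRefreshProcessor with the constructor signatures
shown in the removed code, and example.com is a placeholder.

    import mechanize

    # build_opener() drops a default handler class whenever a subclass or
    # instance of it is passed explicitly, so these instances take over.
    opener = mechanize.build_opener(
        mechanize.HTTPEquivProcessor(),              # fold META HTTP-EQUIV
        mechanize.HTTPRefreshProcessor(max_time=0),  # zero-delay refreshes only
        )
    response = opener.open("http://example.com/")
    print response.geturl()    # final URL, after any redirects
    print response.info()      # rfc822.Message-style headers
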
Modified: python-mechanize/trunk/mechanize/_useragent.py
===================================================================
--- python-mechanize/trunk/mechanize/_useragent.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/mechanize/_useragent.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -35,18 +35,24 @@
     https_request = http_request
 
 
-class UserAgent(OpenerDirector):
+class UserAgentBase(OpenerDirector):
     """Convenient user-agent class.
 
     Do not use .add_handler() to add a handler for something already dealt with
     by this code.
 
+    The only reason at present for the distinction between UserAgent and
+    UserAgentBase is so that classes that depend on .seek()able responses
+    (e.g. mechanize.Browser) can inherit from UserAgentBase.  The subclass
+    UserAgent exposes a .set_seekable_responses() method that allows switching
+    off the adding of a .seek() method to responses.
+
     Public attributes:
 
     addheaders: list of (name, value) pairs specifying headers to send with
      every request, unless they are overridden in the Request instance.
 
-     >>> ua = UserAgent()
+     >>> ua = UserAgentBase()
      >>> ua.addheaders = [
      ...  ("User-agent", "Mozilla/5.0 (compatible)"),
      ...  ("From", "responsible.person at example.com")]
@@ -130,6 +136,10 @@
             ppm = _auth.HTTPProxyPasswordMgr()
         self.set_password_manager(pm)
         self.set_proxy_password_manager(ppm)
+        # set default certificate manager
+        if "https" in ua_handlers:
+            cm = _urllib2.HTTPSClientCertMgr()
+            self.set_client_cert_manager(cm)
 
         # special case, requires extra support from mechanize.Browser
         self._handle_referer = True
@@ -200,6 +210,25 @@
         self._proxy_password_manager.add_password(
             realm, hostport, user, password)
 
+    def add_client_certificate(self, url, key_file, cert_file):
+        """Add an SSL client certificate, for HTTPS client auth.
+
+        key_file and cert_file must be filenames of the key and certificate
+        files, in PEM format.  You can use e.g. OpenSSL to convert a p12 (PKCS
+        12) file to PEM format:
+
+        openssl pkcs12 -clcerts -nokeys -in cert.p12 -out cert.pem
+        openssl pkcs12 -nocerts -in cert.p12 -out key.pem
+
+
+        Note that client certificate password input is currently very
+        inflexible: it appears to be console-only, which is presumably the
+        default behaviour of libopenssl.  In future, mechanize may support
+        third-party libraries that (I assume) allow more options here.
+
+        """
+        self._client_cert_manager.add_key_cert(url, key_file, cert_file)
+
     # the following are rarely useful -- use add_password / add_proxy_password
     # instead
     def set_password_manager(self, password_manager):
@@ -212,6 +241,11 @@
         self._proxy_password_manager = password_manager
         self._set_handler("_proxy_basicauth", obj=password_manager)
         self._set_handler("_proxy_digestauth", obj=password_manager)
+    def set_client_cert_manager(self, cert_manager):
+        """Set a mechanize.HTTPClientCertMgr, or None."""
+        self._client_cert_manager = cert_manager
+        handler = self._ua_handlers["https"]
+        handler.client_cert_manager = cert_manager
 
     # these methods all take a boolean parameter
     def set_handle_robots(self, handle):
@@ -321,3 +355,10 @@
         if newhandler is not None:
             self.add_handler(newhandler)
             self._ua_handlers[name] = newhandler
+
+
+class UserAgent(UserAgentBase):
+
+    def set_seekable_responses(self, handle):
+        """Make response objects .seek()able."""
+        self._set_handler("_seek", handle)

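A short sketch of how the client-certificate support added above is meant to
be used.  This is an assumption-laden example rather than part of the patch:
key.pem and cert.pem are the placeholder filenames produced by the openssl
commands quoted in the add_client_certificate() docstring, and the URL is
hypothetical.

    import mechanize

    ua = mechanize.UserAgent()
    # key and certificate must already be in PEM format; argument order is
    # (url, key_file, cert_file), matching add_client_certificate() above
    ua.add_client_certificate("https://example.com/", "key.pem", "cert.pem")
    response = ua.open("https://example.com/protected")
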
Modified: python-mechanize/trunk/mechanize/_util.py
===================================================================
--- python-mechanize/trunk/mechanize/_util.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/mechanize/_util.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -1,4 +1,4 @@
-"""Python backwards-compat., date/time routines, seekable file object wrapper.
+"""Utility functions and date/time routines.
 
  Copyright 2002-2006 John J Lee <jjl at pobox.com>
 
@@ -8,32 +8,13 @@
 
 """
 
-import re, string, time, copy, urllib, mimetools
-from types import TupleType
-from cStringIO import StringIO
+import re, string, time
 
-def startswith(string, initial):
-    if len(initial) > len(string): return False
-    return string[:len(initial)] == initial
-
-def endswith(string, final):
-    if len(final) > len(string): return False
-    return string[-len(final):] == final
-
 def isstringlike(x):
     try: x+""
     except: return False
     else: return True
 
-SPACE_DICT = {}
-for c in string.whitespace:
-    SPACE_DICT[c] = None
-del c
-def isspace(string):
-    for c in string:
-        if not SPACE_DICT.has_key(c): return False
-    return True
-
 ## def caller():
 ##     try:
 ##         raise SyntaxError
@@ -42,33 +23,6 @@
 ##     return sys.exc_traceback.tb_frame.f_back.f_back.f_code.co_name
 
 
-# this is here rather than in _HeadersUtil as it's just for
-# compatibility with old Python versions, rather than entirely new code
-def getheaders(msg, name):
-    """Get all values for a header.
-
-    This returns a list of values for headers given more than once; each
-    value in the result list is stripped in the same way as the result of
-    getheader().  If the header is not given, return an empty list.
-    """
-    result = []
-    current = ''
-    have_header = 0
-    for s in msg.getallmatchingheaders(name):
-        if isspace(s[0]):
-            if current:
-                current = "%s\n %s" % (current, string.strip(s))
-            else:
-                current = string.strip(s)
-        else:
-            if have_header:
-                result.append(current)
-            current = string.strip(s[string.find(s, ":") + 1:])
-            have_header = 1
-    if have_header:
-        result.append(current)
-    return result
-
 from calendar import timegm
 
 # Date/time conversion routines for formats used by the HTTP protocol.
@@ -86,7 +40,7 @@
 months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
 months_lower = []
-for month in months: months_lower.append(string.lower(month))
+for month in months: months_lower.append(month.lower())
 
 
 def time2isoz(t=None):
@@ -144,7 +98,7 @@
     # translate month name to number
     # month numbers start with 1 (January)
     try:
-        mon = months_lower.index(string.lower(mon))+1
+        mon = months_lower.index(mon.lower())+1
     except ValueError:
         # maybe it's already a number
         try:
@@ -185,7 +139,7 @@
         # adjust time using timezone string, to get absolute time since epoch
         if tz is None:
             tz = "UTC"
-        tz = string.upper(tz)
+        tz = tz.upper()
         offset = offset_from_tz_string(tz)
         if offset is None:
             return None
@@ -247,7 +201,7 @@
     m = strict_re.search(text)
     if m:
         g = m.groups()
-        mon = months_lower.index(string.lower(g[1])) + 1
+        mon = months_lower.index(g[1].lower()) + 1
         tt = (int(g[2]), mon, int(g[0]),
               int(g[3]), int(g[4]), float(g[5]))
         return my_timegm(tt)
@@ -255,7 +209,7 @@
     # No, we need some messy parsing...
 
     # clean up
-    text = string.lstrip(text)
+    text = text.lstrip()
     text = wkday_re.sub("", text, 1)  # Useless weekday
 
     # tz is time zone specifier string
@@ -300,7 +254,7 @@
 
     """
     # clean up
-    text = string.lstrip(text)
+    text = text.lstrip()
 
     # tz is time zone specifier string
     day, mon, yr, hr, min, sec, tz = [None]*7
@@ -315,340 +269,3 @@
         return None  # bad format
 
     return _str2time(day, mon, yr, hr, min, sec, tz)
-
-
-# XXX Andrew Dalke kindly sent me a similar class in response to my request on
-# comp.lang.python, which I then proceeded to lose.  I wrote this class
-# instead, but I think he's released his code publicly since, could pinch the
-# tests from it, at least...
-
-# For testing seek_wrapper invariant (note that
-# test_urllib2.HandlerTest.test_seekable is expected to fail when this
-# invariant checking is turned on).  The invariant checking is done by module
-# ipdbc, which is available here:
-# http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/436834
-## from ipdbc import ContractBase
-## class seek_wrapper(ContractBase):
-class seek_wrapper:
-    """Adds a seek method to a file object.
-
-    This is only designed for seeking on readonly file-like objects.
-
-    Wrapped file-like object must have a read method.  The readline method is
-    only supported if that method is present on the wrapped object.  The
-    readlines method is always supported.  xreadlines and iteration are
-    supported only for Python 2.2 and above.
-
-    Public attribute: wrapped (the wrapped file object).
-
-    WARNING: All other attributes of the wrapped object (i.e. those that are not
-    one of wrapped, read, readline, readlines, xreadlines, __iter__ and next)
-    are passed through unaltered, which may or may not make sense for your
-    particular file object.
-
-    """
-    # General strategy is to check that cache is full enough, then delegate to
-    # the cache (self.__cache, which is a cStringIO.StringIO instance).  A seek
-    # position (self.__pos) is maintained independently of the cache, in order
-    # that a single cache may be shared between multiple seek_wrapper objects.
-    # Copying using module copy shares the cache in this way.
-
-    def __init__(self, wrapped):
-        self.wrapped = wrapped
-        self.__have_readline = hasattr(self.wrapped, "readline")
-        self.__cache = StringIO()
-        self.__pos = 0  # seek position
-
-    def invariant(self):
-        # The end of the cache is always at the same place as the end of the
-        # wrapped file.
-        return self.wrapped.tell() == len(self.__cache.getvalue())
-
-    def __getattr__(self, name):
-        wrapped = self.__dict__.get("wrapped")
-        if wrapped:
-            return getattr(wrapped, name)
-        return getattr(self.__class__, name)
-
-    def seek(self, offset, whence=0):
-        assert whence in [0,1,2]
-
-        # how much data, if any, do we need to read?
-        if whence == 2:  # 2: relative to end of *wrapped* file
-            if offset < 0: raise ValueError("negative seek offset")
-            # since we don't know yet where the end of that file is, we must
-            # read everything
-            to_read = None
-        else:
-            if whence == 0:  # 0: absolute
-                if offset < 0: raise ValueError("negative seek offset")
-                dest = offset
-            else:  # 1: relative to current position
-                pos = self.__pos
-                if pos + offset < 0:
-                    raise ValueError("seek to before start of file")
-                dest = pos + offset
-            end = len(self.__cache.getvalue())
-            to_read = dest - end
-            if to_read < 0:
-                to_read = 0
-
-        if to_read != 0:
-            self.__cache.seek(0, 2)
-            if to_read is None:
-                assert whence == 2
-                self.__cache.write(self.wrapped.read())
-                self.__pos = self.__cache.tell() - offset
-            else:
-                self.__cache.write(self.wrapped.read(to_read))
-                # Don't raise an exception even if we've seek()ed past the end
-                # of .wrapped, since fseek() doesn't complain in that case.
-                # Also like fseek(), pretend we have seek()ed past the end,
-                # i.e. not:
-                #self.__pos = self.__cache.tell()
-                # but rather:
-                self.__pos = dest
-        else:
-            self.__pos = dest
-
-    def tell(self):
-        return self.__pos
-
-    def __copy__(self):
-        cpy = self.__class__(self.wrapped)
-        cpy.__cache = self.__cache
-        return cpy
-
-    def get_data(self):
-        pos = self.__pos
-        try:
-            self.seek(0)
-            return self.read(-1)
-        finally:
-            self.__pos = pos
-
-    def read(self, size=-1):
-        pos = self.__pos
-        end = len(self.__cache.getvalue())
-        available = end - pos
-
-        # enough data already cached?
-        if size <= available and size != -1:
-            self.__cache.seek(pos)
-            self.__pos = pos+size
-            return self.__cache.read(size)
-
-        # no, so read sufficient data from wrapped file and cache it
-        if self.wrapped.read is None:
-            # XXX oops, wrapped file-like-object isn't valid, ignore it
-            return ''
-
-        self.__cache.seek(0, 2)
-        if size == -1:
-            self.__cache.write(self.wrapped.read())
-        else:
-            to_read = size - available
-            assert to_read > 0
-            self.__cache.write(self.wrapped.read(to_read))
-        self.__cache.seek(pos)
-
-        data = self.__cache.read(size)
-        self.__pos = self.__cache.tell()
-        assert self.__pos == pos + len(data)
-        return data
-
-    def readline(self, size=-1):
-        if not self.__have_readline:
-            raise NotImplementedError("no readline method on wrapped object")
-
-        # line we're about to read might not be complete in the cache, so
-        # read another line first
-        pos = self.__pos
-        self.__cache.seek(0, 2)
-        self.__cache.write(self.wrapped.readline())
-        self.__cache.seek(pos)
-
-        data = self.__cache.readline()
-        if size != -1:
-            r = data[:size]
-            self.__pos = pos+len(r)
-        else:
-            r = data
-            self.__pos = pos+len(data)
-        return r
-
-    def readlines(self, sizehint=-1):
-        pos = self.__pos
-        self.__cache.seek(0, 2)
-        self.__cache.write(self.wrapped.read())
-        self.__cache.seek(pos)
-        data = self.__cache.readlines(sizehint)
-        self.__pos = self.__cache.tell()
-        return data
-
-    def __iter__(self): return self
-    def next(self):
-        line = self.readline()
-        if line == "": raise StopIteration
-        return line
-
-    xreadlines = __iter__
-
-    def __repr__(self):
-        return ("<%s at %s whose wrapped object = %r>" %
-                (self.__class__.__name__, hex(id(self)), self.wrapped))
-
-
-class response_seek_wrapper(seek_wrapper):
-
-    """
-    Supports copying response objects and setting response body data.
-
-    """
-
-    def __init__(self, wrapped):
-        seek_wrapper.__init__(self, wrapped)
-        self._headers = self.wrapped.info()
-
-    def __copy__(self):
-        cpy = seek_wrapper.__copy__(self)
-        # copy headers from delegate
-        cpy._headers = copy.copy(self.info())
-        return cpy
-
-    def info(self):
-        return self._headers
-
-    def set_data(self, data):
-        self.seek(0)
-        self.read()
-        self.close()
-        cache = self._seek_wrapper__cache = StringIO()
-        cache.write(data)
-        self.seek(0)
-
-
-class eoffile:
-    # file-like object that always claims to be at end-of-file...
-    def read(self, size=-1): return ""
-    def readline(self, size=-1): return ""
-    def __iter__(self): return self
-    def next(self): return ""
-    def close(self): pass
-
-class eofresponse(eoffile):
-    def __init__(self, url, headers, code, msg):
-        self._url = url
-        self._headers = headers
-        self.code = code
-        self.msg = msg
-    def geturl(self): return self._url
-    def info(self): return self._headers
-
-
-class closeable_response:
-    """Avoids unnecessarily clobbering urllib.addinfourl methods on .close().
-
-    Only supports responses returned by mechanize.HTTPHandler.
-
-    After .close(), the following methods are supported:
-
-    .read()
-    .readline()
-    .readlines()
-    .seek()
-    .tell()
-    .info()
-    .geturl()
-    .__iter__()
-    .next()
-    .close()
-
-    and the following attributes are supported:
-
-    .code
-    .msg
-
-    Also supports pickling (but the stdlib currently does something to prevent
-    it: http://python.org/sf/1144636).
-
-    """
-    # presence of this attr indicates the instance is usable after .close()
-    closeable_response = None
-
-    def __init__(self, fp, headers, url, code, msg):
-        self._set_fp(fp)
-        self._headers = headers
-        self._url = url
-        self.code = code
-        self.msg = msg
-
-    def _set_fp(self, fp):
-        self.fp = fp
-        self.read = self.fp.read
-        self.readline = self.fp.readline
-        if hasattr(self.fp, "readlines"): self.readlines = self.fp.readlines
-        if hasattr(self.fp, "fileno"):
-            self.fileno = self.fp.fileno
-        else:
-            self.fileno = lambda: None
-        if hasattr(self.fp, "__iter__"):
-            self.__iter__ = self.fp.__iter__
-            if hasattr(self.fp, "next"):
-                self.next = self.fp.next
-
-    def __repr__(self):
-        return '<%s at %s whose fp = %r>' % (
-            self.__class__.__name__, hex(id(self)), self.fp)
-
-    def info(self):
-        return self._headers
-
-    def geturl(self):
-        return self._url
-
-    def close(self):
-        wrapped = self.fp
-        wrapped.close()
-        new_wrapped = eofresponse(
-            self._url, self._headers, self.code, self.msg)
-        self._set_fp(new_wrapped)
-
-    def __getstate__(self):
-        # There are three obvious options here:
-        # 1. truncate
-        # 2. read to end
-        # 3. close socket, pickle state including read position, then open
-        #    again on unpickle and use Range header
-        # XXXX um, 4. refuse to pickle unless .close()d.  This is better,
-        #  actually ("errors should never pass silently").  Pickling doesn't
-        #  work anyway ATM, because of http://python.org/sf/1144636 so fix
-        #  this later
-
-        # 2 breaks pickle protocol, because one expects the original object
-        # to be left unscathed by pickling.  3 is too complicated and
-        # surprising (and too much work ;-) to happen in a sane __getstate__.
-        # So we do 1.
-
-        state = self.__dict__.copy()
-        new_wrapped = eofresponse(
-            self._url, self._headers, self.code, self.msg)
-        state["wrapped"] = new_wrapped
-        return state
-
-def make_response(data, headers, url, code, msg):
-    """Convenient factory for objects implementing response interface.
-
-    data: string containing response body data
-    headers: sequence of (name, value) pairs
-    url: URL of response
-    code: integer response code (e.g. 200)
-    msg: string response code message (e.g. "OK")
-
-    """
-    hdr_text = []
-    for name_value in headers:
-        hdr_text.append("%s: %s" % name_value)
-    mime_headers = mimetools.Message(StringIO("\n".join(hdr_text)))
-    r = closeable_response(StringIO(data), mime_headers, url, code, msg)
-    return response_seek_wrapper(r)

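The seek_wrapper / response_seek_wrapper / make_response code removed above
moves into the new mechanize/_seek.py and mechanize/_response.py modules (see
the SOURCES.txt diff below), so the response interface is unchanged.  A
hedged sketch of that interface, assuming make_response is still exported
from the mechanize package:

    from mechanize import make_response

    r = make_response("<html><body>hello</body></html>",
                      [("Content-type", "text/html")],
                      "http://example.com/", 200, "OK")
    data = r.read()     # drains the wrapped file object into the cache
    r.seek(0)           # seek() replays from the cache, not the wire
    assert r.read() == data
    assert r.geturl() == "http://example.com/"
    assert r.code == 200
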
Modified: python-mechanize/trunk/mechanize.egg-info/PKG-INFO
===================================================================
--- python-mechanize/trunk/mechanize.egg-info/PKG-INFO	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/mechanize.egg-info/PKG-INFO	2007-04-09 23:35:16 UTC (rev 773)
@@ -1,12 +1,12 @@
 Metadata-Version: 1.0
 Name: mechanize
-Version: 0.1.2b
+Version: 0.1.6b
 Summary: Stateful programmatic web browsing.
 Home-page: http://wwwsearch.sourceforge.net/mechanize/
 Author: John J. Lee
 Author-email: jjl at pobox.com
 License: BSD
-Download-URL: http://wwwsearch.sourceforge.net/mechanize/src/mechanize-0.1.2b.tar.gz
+Download-URL: http://wwwsearch.sourceforge.net/mechanize/src/mechanize-0.1.6b.tar.gz
 Description: Stateful programmatic web browsing, after Andy Lester's Perl module
         WWW::Mechanize.
         
@@ -25,7 +25,7 @@
         
         
 Platform: any
-Classifier: Development Status :: 3 - Alpha
+Classifier: Development Status :: 4 - Beta
 Classifier: Intended Audience :: Developers
 Classifier: Intended Audience :: System Administrators
 Classifier: License :: OSI Approved :: BSD License

Modified: python-mechanize/trunk/mechanize.egg-info/SOURCES.txt
===================================================================
--- python-mechanize/trunk/mechanize.egg-info/SOURCES.txt	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/mechanize.egg-info/SOURCES.txt	2007-04-09 23:35:16 UTC (rev 773)
@@ -9,6 +9,7 @@
 README.txt
 doc.html
 doc.html.in
+ez_setup.py
 functional_tests.py
 setup.py
 test.py
@@ -17,14 +18,15 @@
 examples/cookietest.cgi
 examples/hack21.py
 examples/pypi.py
-ez_setup/README.txt
-ez_setup/__init__.py
 mechanize/__init__.py
 mechanize/_auth.py
+mechanize/_beautifulsoup.py
 mechanize/_clientcookie.py
+mechanize/_debug.py
 mechanize/_gzip.py
 mechanize/_headersutil.py
 mechanize/_html.py
+mechanize/_http.py
 mechanize/_lwpcookiejar.py
 mechanize/_mechanize.py
 mechanize/_mozillacookiejar.py
@@ -32,8 +34,11 @@
 mechanize/_opener.py
 mechanize/_pullparser.py
 mechanize/_request.py
+mechanize/_response.py
+mechanize/_rfc3986.py
+mechanize/_seek.py
+mechanize/_upgrade.py
 mechanize/_urllib2.py
-mechanize/_urllib2_support.py
 mechanize/_useragent.py
 mechanize/_util.py
 mechanize.egg-info/PKG-INFO
@@ -42,11 +47,23 @@
 mechanize.egg-info/requires.txt
 mechanize.egg-info/top_level.txt
 mechanize.egg-info/zip-safe
-test/test_conncache.py
+test/test_browser.doctest
+test/test_browser.py
 test/test_cookies.py
 test/test_date.py
+test/test_forms.doctest
 test/test_headers.py
-test/test_mechanize.py
-test/test_misc.py
+test/test_history.doctest
+test/test_html.doctest
+test/test_html.py
+test/test_opener.py
+test/test_password_manager.doctest
 test/test_pullparser.py
+test/test_request.doctest
+test/test_response.doctest
+test/test_response.py
+test/test_rfc3986.doctest
 test/test_urllib2.py
+test/test_useragent.py
+test-tools/doctest.py
+test-tools/linecache_copy.py

Modified: python-mechanize/trunk/mechanize.egg-info/requires.txt
===================================================================
--- python-mechanize/trunk/mechanize.egg-info/requires.txt	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/mechanize.egg-info/requires.txt	2007-04-09 23:35:16 UTC (rev 773)
@@ -1 +1 @@
-ClientForm>=0.2.2, ==dev
\ No newline at end of file
+ClientForm>=0.2.6, ==dev
\ No newline at end of file

Modified: python-mechanize/trunk/mechanize.egg-info/zip-safe
===================================================================
--- python-mechanize/trunk/mechanize.egg-info/zip-safe	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/mechanize.egg-info/zip-safe	2007-04-09 23:35:16 UTC (rev 773)
@@ -0,0 +1 @@
+

Copied: python-mechanize/trunk/setup.cfg (from rev 765, python-mechanize/branches/upstream/current/setup.cfg)

Modified: python-mechanize/trunk/setup.py
===================================================================
--- python-mechanize/trunk/setup.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/setup.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -52,15 +52,15 @@
 ## VERSION_MATCH = re.search(r'__version__ = \((.*)\)',
 ##                           open("mechanize/_mechanize.py").read())
 ## VERSION = unparse_version(str_to_tuple(VERSION_MATCH.group(1)))
-VERSION = "0.1.2b"
-INSTALL_REQUIRES = ["ClientForm>=0.2.2, ==dev"]
+VERSION = "0.1.6b"
+INSTALL_REQUIRES = ["ClientForm>=0.2.6, ==dev"]
 NAME = "mechanize"
 PACKAGE = True
 LICENSE = "BSD"  # or ZPL 2.1
 PLATFORMS = ["any"]
 ZIP_SAFE = True
 CLASSIFIERS = """\
-Development Status :: 3 - Alpha
+Development Status :: 4 - Beta
 Intended Audience :: Developers
 Intended Audience :: System Administrators
 License :: OSI Approved :: BSD License

Copied: python-mechanize/trunk/test/test_browser.doctest (from rev 765, python-mechanize/branches/upstream/current/test/test_browser.doctest)

Copied: python-mechanize/trunk/test/test_browser.py (from rev 765, python-mechanize/branches/upstream/current/test/test_browser.py)

Deleted: python-mechanize/trunk/test/test_conncache.py
===================================================================
--- python-mechanize/trunk/test/test_conncache.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/test/test_conncache.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -1,14 +0,0 @@
-"""Tests for mechanize._ConnCache module."""
-
-import unittest, sys
-
-class ConnCacheTests(unittest.TestCase):
-
-    def test_ConnectionCache(self):
-        from mechanize import ConnectionCache
-        ConnectionCache()
-
-
-if __name__ == "__main__":
-    #unittest.main()
-    pass

Modified: python-mechanize/trunk/test/test_cookies.py
===================================================================
--- python-mechanize/trunk/test/test_cookies.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/test/test_cookies.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -1,17 +1,15 @@
 """Tests for _ClientCookie."""
 
-import urllib2, re, os, string, StringIO, mimetools, time
+import urllib2, re, os, StringIO, mimetools, time
 from time import localtime
 from unittest import TestCase
 
-from mechanize._util import startswith
-
 class FakeResponse:
     def __init__(self, headers=[], url=None):
         """
         headers: list of RFC822-style 'Key: value' strings
         """
-        f = StringIO.StringIO(string.join(headers, "\n"))
+        f = StringIO.StringIO("\n".join(headers))
         self._headers = mimetools.Message(f)
         self._url = url
     def info(self): return self._headers
@@ -231,7 +229,7 @@
                           now)
         h = interact_netscape(c, "http://www.acme.com/")
         assert len(c) == 1
-        assert string.find(h, 'spam="bar"') != -1 and string.find(h, "foo") == -1
+        assert h.find('spam="bar"') != -1 and h.find("foo") == -1
 
         # max-age takes precedence over expires, and zero max-age is request to
         # delete both new cookie and any old matching cookie
@@ -252,7 +250,7 @@
         assert len(c) == 2
         c.clear_session_cookies()
         assert len(c) == 1
-        assert string.find(h, 'spam="bar"') != -1
+        assert h.find('spam="bar"') != -1
 
         # XXX RFC 2965 expiry rules (some apply to V0 too)
 
@@ -679,14 +677,14 @@
         url = "http://foo.bar.com/"
         interact_2965(c, url, "spam=eggs; Version=1")
         h = interact_2965(c, url)
-        assert string.find(h, "Domain") == -1, \
+        assert h.find("Domain") == -1, \
                "absent domain returned with domain present"
 
         c = CookieJar(pol)
         url = "http://foo.bar.com/"
         interact_2965(c, url, 'spam=eggs; Version=1; Domain=.bar.com')
         h = interact_2965(c, url)
-        assert string.find(h, '$Domain=".bar.com"') != -1, \
+        assert h.find('$Domain=".bar.com"') != -1, \
                "domain not returned"
 
         c = CookieJar(pol)
@@ -694,7 +692,7 @@
         # note missing initial dot in Domain
         interact_2965(c, url, 'spam=eggs; Version=1; Domain=bar.com')
         h = interact_2965(c, url)
-        assert string.find(h, '$Domain="bar.com"') != -1, \
+        assert h.find('$Domain="bar.com"') != -1, \
                "domain not returned"
 
     def test_path_mirror(self):
@@ -706,14 +704,14 @@
         url = "http://foo.bar.com/"
         interact_2965(c, url, "spam=eggs; Version=1")
         h = interact_2965(c, url)
-        assert string.find(h, "Path") == -1, \
+        assert h.find("Path") == -1, \
                "absent path returned with path present"
 
         c = CookieJar(pol)
         url = "http://foo.bar.com/"
         interact_2965(c, url, 'spam=eggs; Version=1; Path=/')
         h = interact_2965(c, url)
-        assert string.find(h, '$Path="/"') != -1, "path not returned"
+        assert h.find('$Path="/"') != -1, "path not returned"
 
     def test_port_mirror(self):
         from mechanize import CookieJar, DefaultCookiePolicy
@@ -724,7 +722,7 @@
         url = "http://foo.bar.com/"
         interact_2965(c, url, "spam=eggs; Version=1")
         h = interact_2965(c, url)
-        assert string.find(h, "Port") == -1, \
+        assert h.find("Port") == -1, \
                "absent port returned with port present"
 
         c = CookieJar(pol)
@@ -738,14 +736,14 @@
         url = "http://foo.bar.com/"
         interact_2965(c, url, 'spam=eggs; Version=1; Port="80"')
         h = interact_2965(c, url)
-        assert string.find(h, '$Port="80"') != -1, \
+        assert h.find('$Port="80"') != -1, \
                "port with single value not returned with single value"
 
         c = CookieJar(pol)
         url = "http://foo.bar.com/"
         interact_2965(c, url, 'spam=eggs; Version=1; Port="80,8080"')
         h = interact_2965(c, url)
-        assert string.find(h, '$Port="80,8080"') != -1, \
+        assert h.find('$Port="80,8080"') != -1, \
                "port with multiple values not returned with multiple values"
 
     def test_no_return_comment(self):
@@ -757,7 +755,7 @@
                       'Comment="does anybody read these?"; '
                       'CommentURL="http://foo.bar.net/comment.html"')
         h = interact_2965(c, url)
-        assert string.find(h, "Comment") == -1, \
+        assert h.find("Comment") == -1, \
                "Comment or CommentURL cookie-attributes returned to server"
 
 # just pondering security here -- this isn't really a test (yet)
@@ -939,8 +937,8 @@
         c.add_cookie_header(req)
 
         h = req.get_header("Cookie")
-        assert (string.find(h, "PART_NUMBER=ROCKET_LAUNCHER_0001") != -1 and
-                string.find(h, "CUSTOMER=WILE_E_COYOTE") != -1)
+        assert (h.find("PART_NUMBER=ROCKET_LAUNCHER_0001") != -1 and
+                h.find("CUSTOMER=WILE_E_COYOTE") != -1)
 
 
         headers.append('Set-Cookie: SHIPPING=FEDEX; path=/foo')
@@ -951,18 +949,18 @@
         c.add_cookie_header(req)
 
         h = req.get_header("Cookie")
-        assert (string.find(h, "PART_NUMBER=ROCKET_LAUNCHER_0001") != -1 and
-                string.find(h, "CUSTOMER=WILE_E_COYOTE") != -1 and
-                not string.find(h, "SHIPPING=FEDEX") != -1)
+        assert (h.find("PART_NUMBER=ROCKET_LAUNCHER_0001") != -1 and
+                h.find("CUSTOMER=WILE_E_COYOTE") != -1 and
+                not h.find("SHIPPING=FEDEX") != -1)
 
 
         req = Request("http://www.acme.com/foo/")
         c.add_cookie_header(req)
 
         h = req.get_header("Cookie")
-        assert (string.find(h, "PART_NUMBER=ROCKET_LAUNCHER_0001") != -1 and
-                string.find(h, "CUSTOMER=WILE_E_COYOTE") != -1 and
-                startswith(h, "SHIPPING=FEDEX;"))
+        assert (h.find("PART_NUMBER=ROCKET_LAUNCHER_0001") != -1 and
+                h.find("CUSTOMER=WILE_E_COYOTE") != -1 and
+                h.startswith("SHIPPING=FEDEX;"))
 
     def test_netscape_example_2(self):
         from mechanize import CookieJar, Request
@@ -1121,7 +1119,7 @@
 
         cookie = interact_2965(c, "http://www.acme.com/acme/process")
         assert (re.search(r'Shipping="?FedEx"?;\s*\$Path="\/acme"', cookie) and
-                string.find(cookie, "WILE_E_COYOTE") != -1)
+                cookie.find("WILE_E_COYOTE") != -1)
 
         # 
         # The user agent makes a series of requests on the origin server, after
@@ -1182,8 +1180,8 @@
         # the server.
 
         cookie = interact_2965(c, "http://www.acme.com/acme/parts/")
-        assert (string.find(cookie, "Rocket_Launcher_0001") != -1 and
-                not string.find(cookie, "Riding_Rocket_0023") != -1)
+        assert (cookie.find("Rocket_Launcher_0001") != -1 and
+                not cookie.find("Riding_Rocket_0023") != -1)
 
     def test_rejection(self):
         # Test rejection of Set-Cookie2 responses based on domain, path, port.
@@ -1292,7 +1290,7 @@
             c, "http://www.acme.com/foo%2f%25/<<%0anew\345/\346\370\345",
             'bar=baz; path="/foo/"; version=1');
         version_re = re.compile(r'^\$version=\"?1\"?', re.I)
-        assert (string.find(cookie, "foo=bar") != -1 and
+        assert (cookie.find("foo=bar") != -1 and
                 version_re.search(cookie))
 
         cookie = interact_2965(
@@ -1340,11 +1338,11 @@
 
         new_c = save_and_restore(c, True)
         assert len(new_c) == 6  # none discarded
-        assert string.find(repr(new_c), "name='foo1', value='bar'") != -1
+        assert repr(new_c).find("name='foo1', value='bar'") != -1
 
         new_c = save_and_restore(c, False)
         assert len(new_c) == 4  # 2 of them discarded on save
-        assert string.find(repr(new_c), "name='foo1', value='bar'") != -1
+        assert repr(new_c).find("name='foo1', value='bar'") != -1
 
     def test_netscape_misc(self):
         # Some additional Netscape cookies tests.
@@ -1369,9 +1367,8 @@
         req = Request("http://foo.bar.acme.com/foo")
         c.add_cookie_header(req)
         assert (
-            string.find(req.get_header("Cookie"), "PART_NUMBER=3,4") != -1 and
-            string.find(
-            req.get_header("Cookie"), "Customer=WILE_E_COYOTE") != -1)
+            req.get_header("Cookie").find("PART_NUMBER=3,4") != -1 and
+            req.get_header("Cookie").find("Customer=WILE_E_COYOTE") != -1)
 
     def test_intranet_domains_2965(self):
         # Test handling of local intranet hostnames without a dot.
@@ -1382,11 +1379,11 @@
                       "foo1=bar; PORT; Discard; Version=1;")
         cookie = interact_2965(c, "http://example/",
                                'foo2=bar; domain=".local"; Version=1')
-        assert string.find(cookie, "foo1=bar") >= 0
+        assert cookie.find("foo1=bar") >= 0
 
         interact_2965(c, "http://example/", 'foo3=bar; Version=1')
         cookie = interact_2965(c, "http://example/")
-        assert string.find(cookie, "foo2=bar") >= 0 and len(c) == 3
+        assert cookie.find("foo2=bar") >= 0 and len(c) == 3
 
     def test_intranet_domains_ns(self):
         from mechanize import CookieJar, DefaultCookiePolicy
@@ -1396,10 +1393,10 @@
         cookie = interact_netscape(c, "http://example/",
                                    'foo2=bar; domain=.local')
         assert len(c) == 2
-        assert string.find(cookie, "foo1=bar") >= 0
+        assert cookie.find("foo1=bar") >= 0
 
         cookie = interact_netscape(c, "http://example/")
-        assert string.find(cookie, "foo2=bar") >= 0 and len(c) == 2
+        assert cookie.find("foo2=bar") >= 0 and len(c) == 2
 
     def test_empty_path(self):
         from mechanize import CookieJar, Request, DefaultCookiePolicy

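The test changes above are a mechanical migration from the deprecated string
module functions (and the removed mechanize._util.startswith helper) to the
equivalent string methods.  A small reference sketch of the equivalences,
using a made-up header value:

    import string

    h = "SHIPPING=FEDEX; PART_NUMBER=ROCKET_LAUNCHER_0001"

    # old spellings, which required `import string` (or the local helper)
    assert string.find(h, "FEDEX") != -1
    assert string.lower(h) == h.lower()

    # new method spellings used throughout the updated tests
    assert h.find("FEDEX") != -1
    assert h.startswith("SHIPPING=FEDEX;")
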
Modified: python-mechanize/trunk/test/test_date.py
===================================================================
--- python-mechanize/trunk/test/test_date.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/test/test_date.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -1,6 +1,6 @@
 """Tests for ClientCookie._HTTPDate."""
 
-import re, string, time
+import re, time
 from unittest import TestCase
 
 class DateTimeTests(TestCase):
@@ -68,8 +68,8 @@
 
         for s in tests:
             t = http2time(s)
-            t2 = http2time(string.lower(s))
-            t3 = http2time(string.upper(s))
+            t2 = http2time(s.lower())
+            t3 = http2time(s.upper())
 
             assert t == t2 == t3 == test_t, \
                    "'%s'  =>  %s, %s, %s (%s)" % (s, t, t2, t3, test_t)

Copied: python-mechanize/trunk/test/test_forms.doctest (from rev 765, python-mechanize/branches/upstream/current/test/test_forms.doctest)

Copied: python-mechanize/trunk/test/test_history.doctest (from rev 765, python-mechanize/branches/upstream/current/test/test_history.doctest)

Copied: python-mechanize/trunk/test/test_html.doctest (from rev 765, python-mechanize/branches/upstream/current/test/test_html.doctest)

Copied: python-mechanize/trunk/test/test_html.py (from rev 765, python-mechanize/branches/upstream/current/test/test_html.py)

Deleted: python-mechanize/trunk/test/test_mechanize.py
===================================================================
--- python-mechanize/trunk/test/test_mechanize.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/test/test_mechanize.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -1,850 +0,0 @@
-#!/usr/bin/env python
-
-import sys, random
-from unittest import TestCase
-import StringIO, re, UserDict, urllib2
-
-import mechanize
-FACTORY_CLASSES = [mechanize.DefaultFactory]
-try:
-    import BeautifulSoup
-except ImportError:
-    import warnings
-    warnings.warn("skipping tests involving BeautifulSoup")
-else:
-    FACTORY_CLASSES.append(mechanize.RobustFactory)
-
-
-def test_password_manager(self):
-    """
-    >>> mgr = mechanize.HTTPProxyPasswordMgr()
-    >>> add = mgr.add_password
-
-    >>> add("Some Realm", "http://example.com/", "joe", "password")
-    >>> add("Some Realm", "http://example.com/ni", "ni", "ni")
-    >>> add("c", "http://example.com/foo", "foo", "ni")
-    >>> add("c", "http://example.com/bar", "bar", "nini")
-    >>> add("b", "http://example.com/", "first", "blah")
-    >>> add("b", "http://example.com/", "second", "spam")
-    >>> add("a", "http://example.com", "1", "a")
-    >>> add("Some Realm", "http://c.example.com:3128", "3", "c")
-    >>> add("Some Realm", "d.example.com", "4", "d")
-    >>> add("Some Realm", "e.example.com:3128", "5", "e")
-
-    >>> mgr.find_user_password("Some Realm", "example.com")
-    ('joe', 'password')
-    >>> mgr.find_user_password("Some Realm", "http://example.com")
-    ('joe', 'password')
-    >>> mgr.find_user_password("Some Realm", "http://example.com/")
-    ('joe', 'password')
-    >>> mgr.find_user_password("Some Realm", "http://example.com/spam")
-    ('joe', 'password')
-    >>> mgr.find_user_password("Some Realm", "http://example.com/spam/spam")
-    ('joe', 'password')
-    >>> mgr.find_user_password("c", "http://example.com/foo")
-    ('foo', 'ni')
-    >>> mgr.find_user_password("c", "http://example.com/bar")
-    ('bar', 'nini')
-
-    Currently, we use the highest-level path where there is more than one match:
-
-    >>> mgr.find_user_password("Some Realm", "http://example.com/ni")
-    ('joe', 'password')
-
-    Use latest add_password() in case of conflict:
-
-    >>> mgr.find_user_password("b", "http://example.com/")
-    ('second', 'spam')
-
-    No special relationship between a.example.com and example.com:
-
-    >>> mgr.find_user_password("a", "http://example.com/")
-    ('1', 'a')
-    >>> mgr.find_user_password("a", "http://a.example.com/")
-    (None, None)
-
-    Ports:
-
-    >>> mgr.find_user_password("Some Realm", "c.example.com")
-    (None, None)
-    >>> mgr.find_user_password("Some Realm", "c.example.com:3128")
-    ('3', 'c')
-    >>> mgr.find_user_password("Some Realm", "http://c.example.com:3128")
-    ('3', 'c')
-    >>> mgr.find_user_password("Some Realm", "d.example.com")
-    ('4', 'd')
-    >>> mgr.find_user_password("Some Realm", "e.example.com:3128")
-    ('5', 'e')
-
-
-    Now features specific to HTTPProxyPasswordMgr.
-
-    Default realm:
-
-    >>> mgr.find_user_password("d", "f.example.com")
-    (None, None)
-    >>> add(None, "f.example.com", "6", "f")
-    >>> mgr.find_user_password("d", "f.example.com")
-    ('6', 'f')
-
-    Default host/port:
-
-    >>> mgr.find_user_password("e", "g.example.com")
-    (None, None)
-    >>> add("e", None, "7", "g")
-    >>> mgr.find_user_password("e", "g.example.com")
-    ('7', 'g')
-
-    Default realm and host/port:
-
-    >>> mgr.find_user_password("f", "h.example.com")
-    (None, None)
-    >>> add(None, None, "8", "h")
-    >>> mgr.find_user_password("f", "h.example.com")
-    ('8', 'h')
-
-    Default realm beats default host/port:
-
-    >>> add("d", None, "9", "i")
-    >>> mgr.find_user_password("d", "f.example.com")
-    ('6', 'f')
-
-    """
-    pass
-
-
-class CachingGeneratorFunctionTests(TestCase):
-
-    def _get_simple_cgenf(self, log):
-        from mechanize._html import CachingGeneratorFunction
-        todo = []
-        for ii in range(2):
-            def work(ii=ii):
-                log.append(ii)
-                return ii
-            todo.append(work)
-        def genf():
-            for a in todo:
-                yield a()
-        return CachingGeneratorFunction(genf())
-
-    def test_cache(self):
-        log = []
-        cgenf = self._get_simple_cgenf(log)
-        for repeat in range(2):
-            for ii, jj in zip(cgenf(), range(2)):
-                self.assertEqual(ii, jj)
-            self.assertEqual(log, range(2))  # work only done once
-
-    def test_interleaved(self):
-        log = []
-        cgenf = self._get_simple_cgenf(log)
-        cgen = cgenf()
-        self.assertEqual(cgen.next(), 0)
-        self.assertEqual(log, [0])
-        cgen2 = cgenf()
-        self.assertEqual(cgen2.next(), 0)
-        self.assertEqual(log, [0])
-        self.assertEqual(cgen.next(), 1)
-        self.assertEqual(log, [0, 1])
-        self.assertEqual(cgen2.next(), 1)
-        self.assertEqual(log, [0, 1])
-        self.assertRaises(StopIteration, cgen.next)
-        self.assertRaises(StopIteration, cgen2.next)
-
-
-class UnescapeTests(TestCase):
-
-    def test_unescape_charref(self):
-        from mechanize._html import unescape_charref
-        mdash_utf8 = u"\u2014".encode("utf-8")
-        for ref, codepoint, utf8, latin1 in [
-            ("38", 38, u"&".encode("utf-8"), "&"),
-            ("x2014", 0x2014, mdash_utf8, "&#x2014;"),
-            ("8212", 8212, mdash_utf8, "&#8212;"),
-            ]:
-            self.assertEqual(unescape_charref(ref, None), unichr(codepoint))
-            self.assertEqual(unescape_charref(ref, 'latin-1'), latin1)
-            self.assertEqual(unescape_charref(ref, 'utf-8'), utf8)
-
-    def test_unescape(self):
-        import htmlentitydefs
-        from mechanize._html import unescape
-        data = "&amp; &lt; &mdash; &#8212; &#x2014;"
-        mdash_utf8 = u"\u2014".encode("utf-8")
-        ue = unescape(data, htmlentitydefs.name2codepoint, "utf-8")
-        self.assertEqual("& < %s %s %s" % ((mdash_utf8,)*3), ue)
-
-        for text, expect in [
-            ("&a&amp;", "&a&"),
-            ("a&amp;", "a&"),
-            ]:
-            got = unescape(text, htmlentitydefs.name2codepoint, "latin-1")
-            self.assertEqual(got, expect)
-
-
-# XXX these 'mock' classes are badly in need of simplification
-class MockMethod:
-    def __init__(self, meth_name, action, handle):
-        self.meth_name = meth_name
-        self.handle = handle
-        self.action = action
-    def __call__(self, *args):
-        return apply(self.handle, (self.meth_name, self.action)+args)
-
-class MockHeaders(UserDict.UserDict):
-    def getallmatchingheaders(self, name):
-        return ["%s: %s" % (k, v) for k, v in self.data.iteritems()]
-    def getheaders(self, name):
-        return self.data.values()
-
-class MockResponse:
-    closeable_response = None
-    def __init__(self, url="http://example.com/", data=None, info=None):
-        self.url = url
-        self.fp = StringIO.StringIO(data)
-        if info is None: info = {}
-        self._info = MockHeaders(info)
-        self.source = "%d%d" % (id(self), random.randint(0, sys.maxint-1))
-    def info(self): return self._info
-    def geturl(self): return self.url
-    def read(self, size=-1): return self.fp.read(size)
-    def seek(self, whence):
-        assert whence == 0
-        self.fp.seek(0)
-    def close(self): pass
-    def __getstate__(self):
-        state = self.__dict__
-        state['source'] = self.source
-        return state
-    def __setstate__(self, state):
-        self.__dict__ = state
-
-def make_mock_handler():
-    class MockHandler:
-        processor_order = 500
-        handler_order = -1
-        def __init__(self, methods):
-            self._define_methods(methods)
-        def _define_methods(self, methods):
-            for name, action in methods:
-                if name.endswith("_open"):
-                    meth = MockMethod(name, action, self.handle)
-                else:
-                    meth = MockMethod(name, action, self.process)
-                setattr(self.__class__, name, meth)
-        def handle(self, fn_name, response, *args, **kwds):
-            self.parent.calls.append((self, fn_name, args, kwds))
-            if response:
-                if isinstance(response, urllib2.HTTPError):
-                    raise response
-                r = response
-                r.seek(0)
-            else:
-                r = MockResponse()
-            req = args[0]
-            r.url = req.get_full_url()
-            return r
-        def process(self, fn_name, action, *args, **kwds):
-            self.parent.calls.append((self, fn_name, args, kwds))
-            if fn_name.endswith("_request"):
-                return args[0]
-            else:
-                return args[1]
-        def close(self): pass
-        def add_parent(self, parent):
-            self.parent = parent
-            self.parent.calls = []
-        def __lt__(self, other):
-            if not hasattr(other, "handler_order"):
-                # Try to preserve the old behavior of having custom classes
-                # inserted after default ones (works only for custom user
-                # classes which are not aware of handler_order).
-                return True
-            return self.handler_order < other.handler_order
-    return MockHandler
-
-class TestBrowser(mechanize.Browser):
-    default_features = ["_seek"]
-    default_others = []
-    default_schemes = []
-
-
-class BrowserTests(TestCase):
-    def test_referer(self):
-        b = TestBrowser()
-        url = "http://www.example.com/"
-        r = MockResponse(url,
-"""<html>
-<head><title>Title</title></head>
-<body>
-<form name="form1">
- <input type="hidden" name="foo" value="bar"></input>
- <input type="submit"></input>
- </form>
-<a href="http://example.com/foo/bar.html" name="apples"></a>
-<a href="https://example.com/spam/eggs.html" name="secure"></a>
-<a href="blah://example.com/" name="pears"></a>
-</body>
-</html>
-""", {"content-type": "text/html"})
-        b.add_handler(make_mock_handler()([("http_open", r)]))
-
-        # Referer not added by .open()...
-        req = mechanize.Request(url)
-        b.open(req)
-        self.assert_(req.get_header("Referer") is None)
-        # ...even if we're visiting a document
-        b.open(req)
-        self.assert_(req.get_header("Referer") is None)
-        # Referer added by .click_link() and .click()
-        b.select_form("form1")
-        req2 = b.click()
-        self.assertEqual(req2.get_header("Referer"), url)
-        r2 = b.open(req2)
-        req3 = b.click_link(name="apples")
-        self.assertEqual(req3.get_header("Referer"), url+"?foo=bar")
-        # Referer not added when going from https to http URL
-        b.add_handler(make_mock_handler()([("https_open", r)]))
-        r3 = b.open(req3)
-        req4 = b.click_link(name="secure")
-        self.assertEqual(req4.get_header("Referer"),
-                         "http://example.com/foo/bar.html")
-        r4 = b.open(req4)
-        req5 = b.click_link(name="apples")
-        self.assert_(not req5.has_header("Referer"))
-        # Referer not added for non-http, non-https requests
-        b.add_handler(make_mock_handler()([("blah_open", r)]))
-        req6 = b.click_link(name="pears")
-        self.assert_(not req6.has_header("Referer"))
-        # Referer not added when going from non-http, non-https URL
-        r4 = b.open(req6)
-        req7 = b.click_link(name="apples")
-        self.assert_(not req7.has_header("Referer"))
-
-        # XXX Referer added for redirect
-
-    def test_encoding(self):
-        import mechanize
-        from StringIO import StringIO
-        import urllib, mimetools
-        # always take first encoding, since that's the one from the real HTTP
-        # headers, rather than from HTTP-EQUIV
-        b = mechanize.Browser()
-        for s, ct in [("", mechanize._html.DEFAULT_ENCODING),
-
-                      ("Foo: Bar\r\n\r\n", mechanize._html.DEFAULT_ENCODING),
-
-                      ("Content-Type: text/html; charset=UTF-8\r\n\r\n",
-                       "UTF-8"),
-
-                      ("Content-Type: text/html; charset=UTF-8\r\n"
-                       "Content-Type: text/html; charset=KOI8-R\r\n\r\n",
-                       "UTF-8"),
-                      ]:
-            msg = mimetools.Message(StringIO(s))
-            r = urllib.addinfourl(StringIO(""), msg, "http://www.example.com/")
-            b.set_response(r)
-            self.assertEqual(b.encoding(), ct)
-
-    def test_history(self):
-        import mechanize
-
-        def same_response(ra, rb):
-            return ra.source == rb.source
-
-        b = TestBrowser()
-        b.add_handler(make_mock_handler()([("http_open", None)]))
-        self.assertRaises(mechanize.BrowserStateError, b.back)
-        r1 = b.open("http://example.com/")
-        self.assertRaises(mechanize.BrowserStateError, b.back)
-        r2 = b.open("http://example.com/foo")
-        self.assert_(same_response(b.back(), r1))
-        r3 = b.open("http://example.com/bar")
-        r4 = b.open("http://example.com/spam")
-        self.assert_(same_response(b.back(), r3))
-        self.assert_(same_response(b.back(), r1))
-        self.assertRaises(mechanize.BrowserStateError, b.back)
-        # reloading does a real HTTP fetch rather than using history cache
-        r5 = b.reload()
-        self.assert_(not same_response(r5, r1))
-        # .geturl() gets fed through to b.response
-        self.assertEquals(b.geturl(), "http://example.com/")
-        # can go back n times
-        r6 = b.open("spam")
-        self.assertEquals(b.geturl(), "http://example.com/spam")
-        r7 = b.open("/spam")
-        self.assert_(same_response(b.response(), r7))
-        self.assertEquals(b.geturl(), "http://example.com/spam")
-        self.assert_(same_response(b.back(2), r5))
-        self.assertEquals(b.geturl(), "http://example.com/")
-        self.assertRaises(mechanize.BrowserStateError, b.back, 2)
-        r8 = b.open("/spam")
-
-        # even if we get a HTTPError, history and .response() should still get updated
-        error = urllib2.HTTPError("http://example.com/bad", 503, "Oops",
-                                  MockHeaders(), StringIO.StringIO())
-        b.add_handler(make_mock_handler()([("https_open", error)]))
-        self.assertRaises(urllib2.HTTPError, b.open, "https://example.com/")
-        self.assertEqual(b.response().geturl(), error.geturl())
-        self.assert_(same_response(b.back(), r8))
-
-        b.close()
-        # XXX assert BrowserStateError
-
-    def test_viewing_html(self):
-        # XXX not testing multiple Content-Type headers
-        import mechanize
-        url = "http://example.com/"
-
-        for allow_xhtml in False, True:
-            for ct, expect in [
-                (None, False),
-                ("text/plain", False),
-                ("text/html", True),
-
-                # don't try to handle XML until we can do it right!
-                ("text/xhtml", allow_xhtml),
-                ("text/xml", allow_xhtml),
-                ("application/xml", allow_xhtml),
-                ("application/xhtml+xml", allow_xhtml),
-
-                ("text/html; charset=blah", True),
-                (" text/html ; charset=ook ", True),
-                ]:
-                b = TestBrowser(mechanize.DefaultFactory(
-                    i_want_broken_xhtml_support=allow_xhtml))
-                hdrs = {}
-                if ct is not None:
-                    hdrs["Content-Type"] = ct
-                b.add_handler(make_mock_handler()([("http_open",
-                                            MockResponse(url, "", hdrs))]))
-                r = b.open(url)
-                self.assertEqual(b.viewing_html(), expect)
-
-        for allow_xhtml in False, True:
-            for ext, expect in [
-                (".htm", True),
-                (".html", True),
-
-                # don't try to handle XML until we can do it right!
-                (".xhtml", allow_xhtml),
-
-                (".html?foo=bar&a=b;whelk#kool", True),
-                (".txt", False),
-                (".xml", False),
-                ("", False),
-                ]:
-                b = TestBrowser(mechanize.DefaultFactory(
-                    i_want_broken_xhtml_support=allow_xhtml))
-                url = "http://example.com/foo"+ext
-                b.add_handler(make_mock_handler()(
-                    [("http_open", MockResponse(url, "", {}))]))
-                r = b.open(url)
-                self.assertEqual(b.viewing_html(), expect)
-
-    def test_empty(self):
-        import mechanize
-        url = "http://example.com/"
-
-        b = TestBrowser()
-        b.add_handler(make_mock_handler()([("http_open", MockResponse(url, "", {}))]))
-        r = b.open(url)
-        self.assert_(not b.viewing_html())
-        self.assertRaises(mechanize.BrowserStateError, b.links)
-        self.assertRaises(mechanize.BrowserStateError, b.forms)
-        self.assertRaises(mechanize.BrowserStateError, b.title)
-        self.assertRaises(mechanize.BrowserStateError, b.select_form)
-        self.assertRaises(mechanize.BrowserStateError, b.select_form,
-                          name="blah")
-        self.assertRaises(mechanize.BrowserStateError, b.find_link,
-                          name="blah")
-
-        b = TestBrowser()
-        r = MockResponse(url,
-"""<html>
-<head><title>Title</title></head>
-<body>
-</body>
-</html>
-""", {"content-type": "text/html"})
-        b.add_handler(make_mock_handler()([("http_open", r)]))
-        r = b.open(url)
-        self.assertEqual(b.title(), "Title")
-        self.assertEqual(len(list(b.links())), 0)
-        self.assertEqual(len(list(b.forms())), 0)
-        self.assertRaises(ValueError, b.select_form)
-        self.assertRaises(mechanize.FormNotFoundError, b.select_form,
-                          name="blah")
-        self.assertRaises(mechanize.FormNotFoundError, b.select_form,
-                          predicate=lambda x: True)
-        self.assertRaises(mechanize.LinkNotFoundError, b.find_link,
-                          name="blah")
-        self.assertRaises(mechanize.LinkNotFoundError, b.find_link,
-                          predicate=lambda x: True)
-
-    def test_forms(self):
-        for factory_class in FACTORY_CLASSES:
-            self._test_forms(factory_class())
-    def _test_forms(self, factory):
-        import mechanize
-        url = "http://example.com"
-
-        b = TestBrowser(factory=factory)
-        r = MockResponse(url,
-"""<html>
-<head><title>Title</title></head>
-<body>
-<form name="form1">
- <input type="text"></input>
- <input type="checkbox" name="cheeses" value="cheddar"></input>
- <input type="checkbox" name="cheeses" value="edam"></input>
- <input type="submit" name="one"></input>
-</form>
-<a href="http://example.com/foo/bar.html" name="apples">
-<form name="form2">
- <input type="submit" name="two">
-</form>
-</body>
-</html>
-""", {"content-type": "text/html"})
-        b.add_handler(make_mock_handler()([("http_open", r)]))
-        r = b.open(url)
-
-        forms = list(b.forms())
-        self.assertEqual(len(forms), 2)
-        for got, expect in zip([f.name for f in forms], [
-            "form1", "form2"]):
-            self.assertEqual(got, expect)
-
-        self.assertRaises(mechanize.FormNotFoundError, b.select_form, "foo")
-
-        # no form is set yet
-        self.assertRaises(AttributeError, getattr, b, "possible_items")
-        b.select_form("form1")
-        # now unknown methods are fed through to selected ClientForm.HTMLForm
-        self.assertEqual(
-            [i.name for i in b.find_control("cheeses").items],
-            ["cheddar", "edam"])
-        b["cheeses"] = ["cheddar", "edam"]
-        self.assertEqual(b.click_pairs(), [
-            ("cheeses", "cheddar"), ("cheeses", "edam"), ("one", "")])
-
-        b.select_form(nr=1)
-        self.assertEqual(b.name, "form2")
-        self.assertEqual(b.click_pairs(), [("two", "")])
-
-    def test_link_encoding(self):
-        for factory_class in FACTORY_CLASSES:
-            self._test_link_encoding(factory_class())
-    def _test_link_encoding(self, factory):
-        import urllib
-        import mechanize
-        from mechanize._html import clean_url
-        url = "http://example.com/"
-        for encoding in ["UTF-8", "latin-1"]:
-            encoding_decl = "; charset=%s" % encoding
-            b = TestBrowser(factory=factory)
-            r = MockResponse(url, """\
-<a href="http://example.com/foo/bar&mdash;&#x2014;.html"
-   name="name0&mdash;&#x2014;">blah&mdash;&#x2014;</a>
-""", #"
-{"content-type": "text/html%s" % encoding_decl})
-            b.add_handler(make_mock_handler()([("http_open", r)]))
-            r = b.open(url)
-
-            Link = mechanize.Link
-            try:
-                mdashx2 = u"\u2014".encode(encoding)*2
-            except UnicodeError:
-                mdashx2 = '&mdash;&#x2014;'
-            qmdashx2 = clean_url(mdashx2, encoding)
-            # base_url, url, text, tag, attrs
-            exp = Link(url, "http://example.com/foo/bar%s.html" % qmdashx2,
-                       "blah"+mdashx2, "a",
-                       [("href", "http://example.com/foo/bar%s.html" % mdashx2),
-                        ("name", "name0%s" % mdashx2)])
-            # nr
-            link = b.find_link()
-##             print
-##             print exp
-##             print link
-            self.assertEqual(link, exp)
-
-    def test_link_whitespace(self):
-        from mechanize import Link
-        for factory_class in FACTORY_CLASSES:
-            base_url = "http://example.com/"
-            url = "  http://example.com/foo.html%20+ "
-            stripped_url = url.strip()
-            html = '<a href="%s"></a>' % url
-            b = TestBrowser(factory=factory_class())
-            r = MockResponse(base_url, html, {"content-type": "text/html"})
-            b.add_handler(make_mock_handler()([("http_open", r)]))
-            r = b.open(base_url)
-            link = b.find_link(nr=0)
-            self.assertEqual(
-                link,
-                Link(base_url, stripped_url, "", "a", [("href", url)])
-                )
-
-    def test_links(self):
-        for factory_class in FACTORY_CLASSES:
-            self._test_links(factory_class())
-    def _test_links(self, factory):
-        import mechanize
-        from mechanize import Link
-        url = "http://example.com/"
-
-        b = TestBrowser(factory=factory)
-        r = MockResponse(url,
-"""<html>
-<head><title>Title</title></head>
-<body>
-<a href="http://example.com/foo/bar.html" name="apples"></a>
-<a name="pears"></a>
-<a href="spam" name="pears"></a>
-<area href="blah" name="foo"></area>
-<form name="form2">
- <input type="submit" name="two">
-</form>
-<frame name="name" href="href" src="src"></frame>
-<iframe name="name2" href="href" src="src"></iframe>
-<a name="name3" href="one">yada yada</a>
-<a name="pears" href="two" weird="stuff">rhubarb</a>
-<a></a>
-<iframe src="foo"></iframe>
-</body>
-</html>
-""", {"content-type": "text/html"})
-        b.add_handler(make_mock_handler()([("http_open", r)]))
-        r = b.open(url)
-
-        exp_links = [
-            # base_url, url, text, tag, attrs
-            Link(url, "http://example.com/foo/bar.html", "", "a",
-                 [("href", "http://example.com/foo/bar.html"),
-                  ("name", "apples")]),
-            Link(url, "spam", "", "a", [("href", "spam"), ("name", "pears")]),
-            Link(url, "blah", None, "area",
-                 [("href", "blah"), ("name", "foo")]),
-            Link(url, "src", None, "frame",
-                 [("name", "name"), ("href", "href"), ("src", "src")]),
-            Link(url, "src", None, "iframe",
-                 [("name", "name2"), ("href", "href"), ("src", "src")]),
-            Link(url, "one", "yada yada", "a",
-                 [("name", "name3"), ("href", "one")]),
-            Link(url, "two", "rhubarb", "a",
-                 [("name", "pears"), ("href", "two"), ("weird", "stuff")]),
-            Link(url, "foo", None, "iframe",
-                 [("src", "foo")]),
-            ]
-        links = list(b.links())
-        self.assertEqual(len(links), len(exp_links))
-        for got, expect in zip(links, exp_links):
-            self.assertEqual(got, expect)
-        # nr
-        l = b.find_link()
-        self.assertEqual(l.url, "http://example.com/foo/bar.html")
-        l = b.find_link(nr=1)
-        self.assertEqual(l.url, "spam")
-        # text
-        l = b.find_link(text="yada yada")
-        self.assertEqual(l.url, "one")
-        self.assertRaises(mechanize.LinkNotFoundError,
-                          b.find_link, text="da ya")
-        l = b.find_link(text_regex=re.compile("da ya"))
-        self.assertEqual(l.url, "one")
-        l = b.find_link(text_regex="da ya")
-        self.assertEqual(l.url, "one")
-        # name
-        l = b.find_link(name="name3")
-        self.assertEqual(l.url, "one")
-        l = b.find_link(name_regex=re.compile("oo"))
-        self.assertEqual(l.url, "blah")
-        l = b.find_link(name_regex="oo")
-        self.assertEqual(l.url, "blah")
-        # url
-        l = b.find_link(url="spam")
-        self.assertEqual(l.url, "spam")
-        l = b.find_link(url_regex=re.compile("pam"))
-        self.assertEqual(l.url, "spam")
-        l = b.find_link(url_regex="pam")
-        self.assertEqual(l.url, "spam")
-        # tag
-        l = b.find_link(tag="area")
-        self.assertEqual(l.url, "blah")
-        # predicate
-        l = b.find_link(predicate=
-                        lambda l: dict(l.attrs).get("weird") == "stuff")
-        self.assertEqual(l.url, "two")
-        # combinations
-        l = b.find_link(name="pears", nr=1)
-        self.assertEqual(l.text, "rhubarb")
-        l = b.find_link(url="src", nr=0, name="name2")
-        self.assertEqual(l.tag, "iframe")
-        self.assertEqual(l.url, "src")
-        self.assertRaises(mechanize.LinkNotFoundError, b.find_link,
-                          url="src", nr=1, name="name2")
-        l = b.find_link(tag="a", predicate=
-                        lambda l: dict(l.attrs).get("weird") == "stuff")
-        self.assertEqual(l.url, "two")
-
-        # .links()
-        self.assertEqual(list(b.links(url="src")), [
-            Link(url, url="src", text=None, tag="frame",
-                 attrs=[("name", "name"), ("href", "href"), ("src", "src")]),
-            Link(url, url="src", text=None, tag="iframe",
-                 attrs=[("name", "name2"), ("href", "href"), ("src", "src")]),
-            ])
-
-    def test_base_uri(self):
-        import mechanize
-        url = "http://example.com/"
-
-        for html, urls in [
-            (
-"""<base href="http://www.python.org/foo/">
-<a href="bar/baz.html"></a>
-<a href="/bar/baz.html"></a>
-<a href="http://example.com/bar %2f%2Fblah;/baz@~._-.html"></a>
-""",
-            [
-            "http://www.python.org/foo/bar/baz.html",
-            "http://www.python.org/bar/baz.html",
-            "http://example.com/bar%20%2f%2Fblah;/baz@~._-.html",
-            ]),
-            (
-"""<a href="bar/baz.html"></a>
-<a href="/bar/baz.html"></a>
-<a href="http://example.com/bar/baz.html"></a>
-""",
-            [
-            "http://example.com/bar/baz.html",
-            "http://example.com/bar/baz.html",
-            "http://example.com/bar/baz.html",
-            ]
-            ),
-            ]:
-            b = TestBrowser()
-            r = MockResponse(url, html, {"content-type": "text/html"})
-            b.add_handler(make_mock_handler()([("http_open", r)]))
-            r = b.open(url)
-            self.assertEqual([link.absolute_url for link in b.links()], urls)
-
-
-class ResponseTests(TestCase):
-    def test_set_response(self):
-        import copy
-        from mechanize import response_seek_wrapper
-
-        br = TestBrowser()
-        url = "http://example.com/"
-        html = """<html><body><a href="spam">click me</a></body></html>"""
-        headers = {"content-type": "text/html"}
-        r = response_seek_wrapper(MockResponse(url, html, headers))
-        br.add_handler(make_mock_handler()([("http_open", r)]))
-
-        r = br.open(url)
-        self.assertEqual(r.read(), html)
-        r.seek(0)
-        self.assertEqual(copy.copy(r).read(), html)
-        self.assertEqual(list(br.links())[0].url, "spam")
-
-        newhtml = """<html><body><a href="eggs">click me</a></body></html>"""
-
-        r.set_data(newhtml)
-        self.assertEqual(r.read(), newhtml)
-        self.assertEqual(br.response().read(), html)
-        br.response().set_data(newhtml)
-        self.assertEqual(br.response().read(), html)
-        self.assertEqual(list(br.links())[0].url, "spam")
-        r.seek(0)
-
-        br.set_response(r)
-        self.assertEqual(br.response().read(), newhtml)
-        self.assertEqual(list(br.links())[0].url, "eggs")
-
-    def test_str(self):
-        import mimetools
-        from mechanize import _util
-
-        br = TestBrowser()
-        self.assertEqual(
-            str(br),
-            "<TestBrowser (not visiting a URL)>"
-            )
-
-        fp = StringIO.StringIO('<html><form name="f"><input /></form></html>')
-        headers = mimetools.Message(
-            StringIO.StringIO("Content-type: text/html"))
-        response = _util.response_seek_wrapper(_util.closeable_response(
-            fp, headers, "http://example.com/", 200, "OK"))
-        br.set_response(response)
-        self.assertEqual(
-            str(br),
-            "<TestBrowser visiting http://example.com/>"
-            )
-
-        br.select_form(nr=0)
-        self.assertEqual(
-            str(br),
-            """\
-<TestBrowser visiting http://example.com/
- selected form:
- <f GET http://example.com/ application/x-www-form-urlencoded
-  <TextControl(<None>=)>>
->""")
-
-
-class UserAgentTests(TestCase):
-    def test_set_handled_schemes(self):
-        import mechanize
-        class MockHandlerClass(make_mock_handler()):
-            def __call__(self): return self
-        class BlahHandlerClass(MockHandlerClass): pass
-        class BlahProcessorClass(MockHandlerClass): pass
-        BlahHandler = BlahHandlerClass([("blah_open", None)])
-        BlahProcessor = BlahProcessorClass([("blah_request", None)])
-        class TestUserAgent(mechanize.UserAgent):
-            default_others = []
-            default_features = []
-            handler_classes = mechanize.UserAgent.handler_classes.copy()
-            handler_classes.update(
-                {"blah": BlahHandler, "_blah": BlahProcessor})
-        ua = TestUserAgent()
-
-        self.assertEqual(len(ua.handlers), 5)
-        ua.set_handled_schemes(["http", "https"])
-        self.assertEqual(len(ua.handlers), 2)
-        self.assertRaises(ValueError,
-            ua.set_handled_schemes, ["blah", "non-existent"])
-        self.assertRaises(ValueError,
-            ua.set_handled_schemes, ["blah", "_blah"])
-        ua.set_handled_schemes(["blah"])
-
-        req = mechanize.Request("blah://example.com/")
-        r = ua.open(req)
-        exp_calls = [("blah_open", (req,), {})]
-        assert len(ua.calls) == len(exp_calls)
-        for got, expect in zip(ua.calls, exp_calls):
-            self.assertEqual(expect, got[1:])
-
-        ua.calls = []
-        req = mechanize.Request("blah://example.com/")
-        ua._set_handler("_blah", True)
-        r = ua.open(req)
-        exp_calls = [
-            ("blah_request", (req,), {}),
-            ("blah_open", (req,), {})]
-        assert len(ua.calls) == len(exp_calls)
-        for got, expect in zip(ua.calls, exp_calls):
-            self.assertEqual(expect, got[1:])
-        ua._set_handler("_blah", True)
-
-if __name__ == "__main__":
-    import test_mechanize
-    import doctest
-    doctest.testmod(test_mechanize)
-    import unittest
-    unittest.main()

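The deleted test_mechanize.py above is not lost coverage: its pieces move
to the files copied in this revision (test_browser.py, test_html.py,
test_opener.py and the *.doctest files). One piece worth a sketch is
CachingGeneratorFunction, whose contract test_interleaved pinned down:
several iterators share one underlying generator, and each value is
computed exactly once. An illustrative implementation of that contract
(not mechanize's actual code, which lives in mechanize/_html.py):

    class CachingGeneratorFunctionSketch:
        def __init__(self, iterable):
            self._cache = []
            self._iterator = iter(iterable)
        def __call__(self):
            i = 0
            while True:
                if i >= len(self._cache):
                    try:
                        # Python 2 iterator protocol; each underlying
                        # value is computed once, then replayed from cache
                        self._cache.append(self._iterator.next())
                    except StopIteration:
                        return
                yield self._cache[i]
                i += 1
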
Deleted: python-mechanize/trunk/test/test_misc.py
===================================================================
--- python-mechanize/trunk/test/test_misc.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/test/test_misc.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -1,218 +0,0 @@
-"""Miscellaneous pyunit tests."""
-
-import copy
-import cStringIO, string
-from unittest import TestCase
-
-class TestUnSeekable:
-    def __init__(self, text):
-        self._file = cStringIO.StringIO(text)
-        self.log = []
-
-    def tell(self): return self._file.tell()
-
-    def seek(self, offset, whence=0): assert False
-
-    def read(self, size=-1):
-        self.log.append(("read", size))
-        return self._file.read(size)
-
-    def readline(self, size=-1):
-        self.log.append(("readline", size))
-        return self._file.readline(size)
-
-    def readlines(self, sizehint=-1):
-        self.log.append(("readlines", sizehint))
-        return self._file.readlines(sizehint)
-
-class TestUnSeekableResponse(TestUnSeekable):
-    def __init__(self, text, headers):
-        TestUnSeekable.__init__(self, text)
-        self.code = 200
-        self.msg = "OK"
-        self.headers = headers
-        self.url = "http://example.com/"
-
-    def geturl(self):
-        return self.url
-
-    def info(self):
-        return self.headers
-
-    def close(self):
-        pass
-
-
-class SeekableTests(TestCase):
-
-    text = """\
-The quick brown fox
-jumps over the lazy
-
-dog.
-
-"""
-    text_lines = map(lambda l: l+"\n", string.split(text, "\n")[:-1])
-
-    def testSeekable(self):
-        from mechanize._util import seek_wrapper
-        text = self.text
-        text_lines = self.text_lines
-
-        for ii in range(1, 6):
-            fh = TestUnSeekable(text)
-            sfh = seek_wrapper(fh)
-            test = getattr(self, "_test%d" % ii)
-            test(sfh)
-
-        # copies have independent seek positions
-        fh = TestUnSeekable(text)
-        sfh = seek_wrapper(fh)
-        self._testCopy(sfh)
-
-    def _testCopy(self, sfh):
-        sfh2 = copy.copy(sfh)
-        sfh.read(10)
-        text = self.text
-        self.assertEqual(sfh2.read(10), text[:10])
-        sfh2.seek(5)
-        self.assertEqual(sfh.read(10), text[10:20])
-        self.assertEqual(sfh2.read(10), text[5:15])
-        sfh.seek(0)
-        sfh2.seek(0)
-        return sfh2
-
-    def _test1(self, sfh):
-        text = self.text
-        text_lines = self.text_lines
-        assert sfh.read(10) == text[:10]  # calls fh.read
-        assert sfh.log[-1] == ("read", 10)  # .log delegated to fh
-        sfh.seek(0)  # doesn't call fh.seek
-        assert sfh.read(10) == text[:10]  # doesn't call fh.read
-        assert len(sfh.log) == 1
-        sfh.seek(0)
-        assert sfh.read(5) == text[:5]  # read only part of cached data
-        assert len(sfh.log) == 1
-        sfh.seek(0)
-        assert sfh.read(25) == text[:25]  # calls fh.read
-        assert sfh.log[1] == ("read", 15)
-        lines = []
-        sfh.seek(-1, 1)
-        while 1:
-            l = sfh.readline()
-            if l == "": break
-            lines.append(l)
-        assert lines == ["s over the lazy\n"]+text_lines[2:]
-        assert sfh.log[2:] == [("readline", -1)]*5
-        sfh.seek(0)
-        lines = []
-        while 1:
-            l = sfh.readline()
-            if l == "": break
-            lines.append(l)
-        assert lines == text_lines
-
-    def _test2(self, sfh):
-        text = self.text
-        sfh.read(5)
-        sfh.seek(0)
-        assert sfh.read() == text
-        assert sfh.read() == ""
-        sfh.seek(0)
-        assert sfh.read() == text
-        sfh.seek(0)
-        assert sfh.readline(5) == "The q"
-        assert sfh.read() == text[5:]
-        sfh.seek(0)
-        assert sfh.readline(5) == "The q"
-        assert sfh.readline() == "uick brown fox\n"
-
-    def _test3(self, sfh):
-        text = self.text
-        text_lines = self.text_lines
-        sfh.read(25)
-        sfh.seek(-1, 1)
-        self.assertEqual(sfh.readlines(), ["s over the lazy\n"]+text_lines[2:])
-        nr_logs = len(sfh.log)
-        sfh.seek(0)
-        assert sfh.readlines() == text_lines
-
-    def _test4(self, sfh):
-        text = self.text
-        text_lines = self.text_lines
-        count = 0
-        limit = 10
-        while count < limit:
-            if count == 5:
-                self.assertRaises(StopIteration, sfh.next)
-                break
-            else:
-                sfh.next() == text_lines[count]
-            count = count + 1
-        else:
-            assert False, "StopIteration not raised"
-
-    def _test5(self, sfh):
-        text = self.text
-        sfh.read(10)
-        sfh.seek(5)
-        self.assert_(sfh.invariant())
-        sfh.seek(0, 2)
-        self.assert_(sfh.invariant())
-        sfh.seek(0)
-        self.assertEqual(sfh.read(), text)
-
-    def testResponseSeekWrapper(self):
-        from mechanize import response_seek_wrapper
-        hdrs = {"Content-type": "text/html"}
-        r = TestUnSeekableResponse(self.text, hdrs)
-        rsw = response_seek_wrapper(r)
-        rsw2 = self._testCopy(rsw)
-        self.assert_(rsw is not rsw2)
-        self.assertEqual(rsw.info(), rsw2.info())
-        self.assert_(rsw.info() is not rsw2.info())
-
-        # should be able to close already-closed object
-        rsw2.close()
-        rsw2.close()
-
-    def testSetResponseData(self):
-        from mechanize import response_seek_wrapper
-        r = TestUnSeekableResponse(self.text, {'blah': 'yawn'})
-        rsw = response_seek_wrapper(r)
-        rsw.set_data("""\
-A Seeming somwhat more than View;
-  That doth instruct the Mind
-  In Things that ly behind,
-""")
-        self.assertEqual(rsw.read(9), "A Seeming")
-        self.assertEqual(rsw.read(13), " somwhat more")
-        rsw.seek(0)
-        self.assertEqual(rsw.read(9), "A Seeming")
-        self.assertEqual(rsw.readline(), " somwhat more than View;\n")
-        rsw.seek(0)
-        self.assertEqual(rsw.readline(), "A Seeming somwhat more than View;\n")
-        rsw.seek(-1, 1)
-        self.assertEqual(rsw.read(7), "\n  That")
-
-        r = TestUnSeekableResponse(self.text, {'blah': 'yawn'})
-        rsw = response_seek_wrapper(r)
-        rsw.set_data(self.text)
-        self._test2(rsw)
-        rsw.seek(0)
-        self._test4(rsw)
-
-    def testGetResponseData(self):
-        from mechanize import response_seek_wrapper
-        r = TestUnSeekableResponse(self.text, {'blah': 'yawn'})
-        rsw = response_seek_wrapper(r)
-
-        self.assertEqual(rsw.get_data(), self.text)
-        self._test2(rsw)
-        rsw.seek(0)
-        self._test4(rsw)
-
-
-if __name__ == "__main__":
-    import unittest
-    unittest.main()

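test_misc.py's SeekableTests likewise move on (to test_response.py and
test_response.doctest, copied below). What they exercise is
mechanize._util.seek_wrapper: making an unseekable response seekable by
caching everything read so far. A rough sketch of the core idea (purely
illustrative; the real wrapper also proxies readline(), readlines(),
iteration and copying):

    class SeekWrapperSketch:
        def __init__(self, fh):
            self._fh = fh      # underlying unseekable file-like object
            self._cache = ""   # everything read from fh so far
            self._pos = 0      # current virtual position
        def seek(self, offset, whence=0):
            assert whence == 0, "sketch supports absolute seeks only"
            self._pos = offset
        def read(self, size=-1):
            if size < 0:
                self._cache += self._fh.read()
                end = len(self._cache)
            else:
                end = self._pos + size
                shortfall = end - len(self._cache)
                if shortfall > 0:
                    # only the uncached tail hits the wrapped object
                    self._cache += self._fh.read(shortfall)
                end = min(end, len(self._cache))
            data = self._cache[self._pos:end]
            self._pos = end
            return data
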
Copied: python-mechanize/trunk/test/test_opener.py (from rev 765, python-mechanize/branches/upstream/current/test/test_opener.py)

Copied: python-mechanize/trunk/test/test_password_manager.doctest (from rev 765, python-mechanize/branches/upstream/current/test/test_password_manager.doctest)

Copied: python-mechanize/trunk/test/test_request.doctest (from rev 765, python-mechanize/branches/upstream/current/test/test_request.doctest)

Copied: python-mechanize/trunk/test/test_response.doctest (from rev 765, python-mechanize/branches/upstream/current/test/test_response.doctest)

Copied: python-mechanize/trunk/test/test_response.py (from rev 765, python-mechanize/branches/upstream/current/test/test_response.py)

Copied: python-mechanize/trunk/test/test_rfc3986.doctest (from rev 765, python-mechanize/branches/upstream/current/test/test_rfc3986.doctest)

Modified: python-mechanize/trunk/test/test_urllib2.py
===================================================================
--- python-mechanize/trunk/test/test_urllib2.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/test/test_urllib2.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -1,26 +1,33 @@
-"""Tests for ClientCookie._urllib2_support (and for urllib2)."""
+"""Tests for urllib2-level functionality.
 
+This is made up of:
+
+ - tests that I've contributed back to stdlib test_urllib2.py
+
+ - tests for features that aren't in urllib2, but work at the level of the
+   interfaces exported by urllib2, especially the urllib2 "handler"
+   interface, but *excluding* the extended interfaces provided by
+   mechanize.UserAgent and mechanize.Browser.
+
+"""
+
 # XXX
 # Request (I'm too lazy)
 # CacheFTPHandler (hard to write)
-# parse_keqv_list, parse_http_list (I'm leaving this for Anthony Baxter
-#  and Greg Stein, since they're doing Digest Authentication)
-# Authentication stuff (ditto)
-# ProxyHandler, CustomProxy, CustomProxyHandler (I don't use a proxy)
+# parse_keqv_list, parse_http_list
 # GopherHandler (haven't used gopher for a decade or so...)
 
-import unittest, StringIO, os, sys, UserDict
+import unittest, StringIO, os, sys, UserDict, httplib
 
 import mechanize
 
-from mechanize._urllib2_support import Request, AbstractHTTPHandler, \
-     build_opener, parse_head, urlopen
-from mechanize._util import startswith
+from mechanize._http import AbstractHTTPHandler, parse_head
+from mechanize._response import test_response
 from mechanize import HTTPRedirectHandler, HTTPRequestUpgradeProcessor, \
      HTTPEquivProcessor, HTTPRefreshProcessor, SeekableProcessor, \
      HTTPCookieProcessor, HTTPRefererProcessor, \
      HTTPErrorProcessor, HTTPHandler
-from mechanize import OpenerDirector
+from mechanize import OpenerDirector, build_opener, urlopen, Request
 
 ## from logging import getLogger, DEBUG
 ## l = getLogger("mechanize")
@@ -38,14 +45,19 @@
     def readline(self, count=None): pass
     def close(self): pass
 
-class MockHeaders(UserDict.UserDict):
-    def getallmatchingheaders(self, name):
-        r = []
-        for k, v in self.data.items():
-            if k.lower() == name:
-                r.append("%s: %s" % (k, v))
-        return r
+def http_message(mapping):
+    """
+    >>> http_message({"Content-Type": "text/html"}).items()
+    [('content-type', 'text/html')]
 
+    """
+    f = []
+    for kv in mapping.items():
+        f.append("%s: %s" % kv)
+    f.append("")
+    msg = httplib.HTTPMessage(StringIO.StringIO("\r\n".join(f)))
+    return msg
+
 class MockResponse(StringIO.StringIO):
     def __init__(self, code, msg, headers, data, url=None):
         StringIO.StringIO.__init__(self, data)
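
The http_message() helper above replaces the ad-hoc MockHeaders class
with a real httplib.HTTPMessage built from a dict, so the tests exercise
the same header object mechanize handles at runtime (case-insensitive
lookup, multiple values per header). For instance (Python 2 stdlib):

    import httplib, StringIO

    msg = httplib.HTTPMessage(
        StringIO.StringIO("Content-Type: text/html\r\nFoo: Bar\r\n\r\n"))
    assert msg["content-type"] == "text/html"  # case-insensitive lookup
    assert msg.getheaders("foo") == ["Bar"]
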
@@ -90,7 +102,7 @@
             return res
         elif action == "return request":
             return Request("http://blah/")
-        elif startswith(action, "error"):
+        elif action.startswith("error"):
             code = int(action[-3:])
             res = MockResponse(200, "OK", {}, "")
             return self.parent.error("http", args[0], res, code, "", {})
@@ -391,6 +403,8 @@
         return self
     def set_url(self, url):
         self.calls.append(("set_url", url))
+    def set_opener(self, opener):
+        self.calls.append(("set_opener", opener))
     def read(self):
         self.calls.append("read")
     def can_fetch(self, ua, url):
@@ -666,8 +680,10 @@
             return  # skip test
         else:
             from mechanize import HTTPRobotRulesProcessor
+        opener = OpenerDirector()
         rfpc = MockRobotFileParserClass()
         h = HTTPRobotRulesProcessor(rfpc)
+        opener.add_handler(h)
 
         url = "http://example.com:80/foo/bar.html"
         req = Request(url)
@@ -676,6 +692,7 @@
         h.http_request(req)
         self.assert_(rfpc.calls == [
             "__call__",
+            ("set_opener", opener),
             ("set_url", "http://example.com:80/robots.txt"),
             "read",
             ("can_fetch", "", url),
@@ -715,6 +732,7 @@
         h.http_request(req)
         self.assert_(rfpc.calls == [
             "__call__",
+            ("set_opener", opener),
             ("set_url", "http://example.com/robots.txt"),
             "read",
             ("can_fetch", "", url),
@@ -726,10 +744,17 @@
         h.http_request(req)
         self.assert_(rfpc.calls == [
             "__call__",
+            ("set_opener", opener),
             ("set_url", "https://example.org/robots.txt"),
             "read",
             ("can_fetch", "", url),
             ])
+        # non-HTTP URL -> ignore robots.txt
+        rfpc.clear()
+        url = "ftp://example.com/"
+        req = Request(url)
+        h.http_request(req)
+        self.assert_(rfpc.calls == [])
 
     def test_cookies(self):
         cj = MockCookieJar()
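
The robots.txt hunks above track an interface change: the processor now
passes its opener to the robot parser via set_opener(), so robots.txt is
fetched through the same handler chain as every other request, and
non-HTTP schemes skip robots.txt entirely. A sketch of the flow the test
asserts (attribute names such as rfp_class are assumptions here, not
necessarily mechanize's; the real processor lives in mechanize/_http.py):

    def http_request_sketch(handler, request):
        scheme = request.get_type()
        if scheme not in ("http", "https"):
            return request  # e.g. ftp: URLs bypass robots.txt
        rfp = handler.rfp_class()        # a RobotFileParser-like object
        rfp.set_opener(handler.parent)   # fetch robots.txt via the opener
        rfp.set_url(scheme + "://" + request.get_host() + "/robots.txt")
        rfp.read()
        if not rfp.can_fetch("", request.get_full_url()):
            # mechanize raises an HTTPError subclass at this point
            raise Exception("request disallowed by robots.txt")
        return request
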
@@ -769,28 +794,38 @@
         req = Request("http://example.com/")
         r = MockResponse(
             200, "OK",
-            MockHeaders({"Foo": "Bar", "Content-type": "text/html"}),
+            http_message({"Foo": "Bar",
+                          "Content-type": "text/html",
+                          "Refresh": "blah"}),
             '<html><head>'
             '<meta http-equiv="Refresh" content="spam&amp;eggs">'
-            '</head></html>'
+            '</head></html>',
+            "http://example.com/"
             )
         newr = h.http_response(req, r)
         headers = newr.info()
+        self.assert_(headers["Foo"] == "Bar")
         self.assert_(headers["Refresh"] == "spam&eggs")
-        self.assert_(headers["Foo"] == "Bar")
+        self.assert_(headers.getheaders("Refresh") == ["blah", "spam&eggs"])
 
     def test_refresh(self):
         # XXX test processor constructor optional args
         h = HTTPRefreshProcessor(max_time=None, honor_time=False)
 
-        for val in ['0; url="http://example.com/foo/"', "2"]:
+        for val, valid in [
+            ('0; url="http://example.com/foo/"', True),
+            ("2", True),
+            # in the past, this failed with UnboundLocalError
+            ('0; "http://example.com/foo/"', False),
+            ]:
             o = h.parent = MockOpener()
             req = Request("http://example.com/")
-            headers = MockHeaders({"refresh": val})
-            r = MockResponse(200, "OK", headers, "")
+            headers = http_message({"refresh": val})
+            r = MockResponse(200, "OK", headers, "", "http://example.com/")
             newr = h.http_response(req, r)
-            self.assertEqual(o.proto, "http")
-            self.assertEqual(o.args, (req, r, "refresh", "OK", headers))
+            if valid:
+                self.assertEqual(o.proto, "http")
+                self.assertEqual(o.args, (req, r, "refresh", "OK", headers))
 
     def test_redirect(self):
         from_url = "http://example.com/a.html"
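
test_refresh now also feeds the processor a malformed value
('0; "http://example.com/foo/"', missing the url= key) and expects it to
be ignored instead of dying with UnboundLocalError. The parsing rule the
test implies, sketched (an assumed shape, not mechanize's exact code):

    def parse_refresh(value):
        # valid forms: "2" or '0; url="http://..."'
        parts = value.split(";", 1)
        try:
            pause = int(parts[0].strip())
        except ValueError:
            return None                  # no leading pause -> invalid
        if len(parts) == 1:
            return pause, None           # reload the same URL after pause
        rest = parts[1].strip()
        if not rest.lower().startswith("url="):
            return None                  # '0; "..."' -> invalid, ignore
        return pause, rest[4:].strip().strip('"')
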
@@ -808,7 +843,7 @@
                 req.origin_req_host = "example.com"  # XXX
                 try:
                     method(req, MockFile(), code, "Blah",
-                           MockHeaders({"location": to_url}))
+                           http_message({"location": to_url}))
                 except mechanize.HTTPError:
                     # 307 in response to POST requires user OK
                     self.assert_(code == 307 and data is not None)
@@ -824,7 +859,7 @@
         # loop detection
         def redirect(h, req, url=to_url):
             h.http_error_302(req, MockFile(), 302, "Blah",
-                             MockHeaders({"location": url}))
+                             http_message({"location": url}))
         # Note that the *original* request shares the same record of
         # redirections with the sub-requests caused by the redirections.
 
@@ -851,6 +886,39 @@
         except mechanize.HTTPError:
             self.assert_(count == HTTPRedirectHandler.max_redirections)
 
+    def test_redirect_bad_uri(self):
+        # bad URIs should be cleaned up before redirection
+        from mechanize._response import test_html_response
+        from_url = "http://example.com/a.html"
+        bad_to_url = "http://example.com/b. |html"
+        good_to_url = "http://example.com/b.%20%7Chtml"
+
+        h = HTTPRedirectHandler()
+        o = h.parent = MockOpener()
+
+        req = Request(from_url)
+        h.http_error_302(req, test_html_response(), 302, "Blah",
+                         http_message({"location": bad_to_url}),
+                         )
+        self.assertEqual(o.req.get_full_url(), good_to_url)
+
+    def test_refresh_bad_uri(self):
+        # bad URIs should be cleaned up before redirection
+        from mechanize._response import test_html_response
+        from_url = "http://example.com/a.html"
+        bad_to_url = "http://example.com/b. |html"
+        good_to_url = "http://example.com/b.%20%7Chtml"
+
+        h = HTTPRefreshProcessor(max_time=None, honor_time=False)
+        o = h.parent = MockOpener()
+
+        req = Request("http://example.com/")
+        r = test_html_response(
+            headers=[("refresh", '0; url="%s"' % bad_to_url)])
+        newr = h.http_response(req, r)
+        headers = o.args[-1]
+        self.assertEqual(headers["Location"], good_to_url)
+
     def test_cookie_redirect(self):
         # cookies shouldn't leak into redirected requests
         import mechanize
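
The two *_bad_uri tests above pin down new behaviour: an illegal redirect
target such as "http://example.com/b. |html" is percent-encoded to
"http://example.com/b.%20%7Chtml" before the follow-up request is built.
A naive approximation of that cleanup (the real, component-aware version
that also preserves existing %xx escapes is in mechanize/_rfc3986.py):

    import urllib

    bad_to_url = "http://example.com/b. |html"
    # quote characters that may not appear raw in a URI (space, "|", ...)
    assert urllib.quote(bad_to_url, safe="/:") == \
           "http://example.com/b.%20%7Chtml"
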
@@ -896,6 +964,8 @@
         realm = "ACME Widget Store"
         http_handler = MockHTTPHandler(
             401, 'WWW-Authenticate: Basic realm="%s"\r\n\r\n' % realm)
+        opener.add_handler(auth_handler)
+        opener.add_handler(http_handler)
         self._test_basic_auth(opener, auth_handler, "Authorization",
                               realm, http_handler, password_manager,
                               "http://acme.example.com/protected",
@@ -911,6 +981,8 @@
         realm = "ACME Networks"
         http_handler = MockHTTPHandler(
             407, 'Proxy-Authenticate: Basic realm="%s"\r\n\r\n' % realm)
+        opener.add_handler(auth_handler)
+        opener.add_handler(http_handler)
         self._test_basic_auth(opener, auth_handler, "Proxy-authorization",
                               realm, http_handler, password_manager,
                               "http://acme.example.com:3128/protected",
@@ -922,29 +994,54 @@
         # response (http://python.org/sf/1479302), where it should instead
         # return None to allow another handler (especially
         # HTTPBasicAuthHandler) to handle the response.
+
+        # Also (http://python.org/sf/1479302, RFC 2617 section 1.2), we must
+        # try digest first (since it's the strongest auth scheme), so we record
+        # order of calls here to check digest comes first:
+        class RecordingOpenerDirector(OpenerDirector):
+            def __init__(self):
+                OpenerDirector.__init__(self)
+                self.recorded = []
+            def record(self, info):
+                self.recorded.append(info)
         class TestDigestAuthHandler(mechanize.HTTPDigestAuthHandler):
-            handler_order = 400  # strictly before HTTPBasicAuthHandler
-        opener = OpenerDirector()
+            def http_error_401(self, *args, **kwds):
+                self.parent.record("digest")
+                mechanize.HTTPDigestAuthHandler.http_error_401(self,
+                                                             *args, **kwds)
+        class TestBasicAuthHandler(mechanize.HTTPBasicAuthHandler):
+            def http_error_401(self, *args, **kwds):
+                self.parent.record("basic")
+                mechanize.HTTPBasicAuthHandler.http_error_401(self,
+                                                            *args, **kwds)
+
+        opener = RecordingOpenerDirector()
         password_manager = MockPasswordManager()
         digest_handler = TestDigestAuthHandler(password_manager)
-        basic_handler = mechanize.HTTPBasicAuthHandler(password_manager)
-        opener.add_handler(digest_handler)
+        basic_handler = TestBasicAuthHandler(password_manager)
         realm = "ACME Networks"
         http_handler = MockHTTPHandler(
             401, 'WWW-Authenticate: Basic realm="%s"\r\n\r\n' % realm)
+        opener.add_handler(digest_handler)
+        opener.add_handler(basic_handler)
+        opener.add_handler(http_handler)
+        opener._maybe_reindex_handlers()
+
+        # check basic auth isn't blocked by digest handler failing
         self._test_basic_auth(opener, basic_handler, "Authorization",
                               realm, http_handler, password_manager,
                               "http://acme.example.com/protected",
                               "http://acme.example.com/protected",
                               )
+        # check digest was tried before basic (twice, because
+        # _test_basic_auth called .open() twice)
+        self.assertEqual(opener.recorded, ["digest", "basic"]*2)
 
     def _test_basic_auth(self, opener, auth_handler, auth_header,
                          realm, http_handler, password_manager,
                          request_url, protected_url):
         import base64, httplib
         user, password = "wile", "coyote"
-        opener.add_handler(auth_handler)
-        opener.add_handler(http_handler)
 
         # .add_password() fed through to password manager
         auth_handler.add_password(realm, request_url, user, password)
@@ -995,7 +1092,10 @@
             <meta http-equiv="moo" content="cow">
             </html>
             """,
-             [("refresh", "1; http://example.com/"), ("foo", "bar")])
+             [("refresh", "1; http://example.com/"), ("foo", "bar")]),
+            ("""<meta http-equiv="refresh">
+            """,
+             [])
             ]
         for html, result in htmls:
             self.assertEqual(parse_head(StringIO.StringIO(html), HeadParser()), result)
@@ -1026,11 +1126,10 @@
             self._count = self._count + 1
             msg = mimetools.Message(StringIO(self.headers))
             return self.parent.error(
-                "http", req, MockFile(), self.code, "Blah", msg)
+                "http", req, test_response(), self.code, "Blah", msg)
         else:
             self.req = req
-            msg = mimetools.Message(StringIO("\r\n\r\n"))
-            return MockResponse(200, "OK", msg, "", req.get_full_url())
+            return test_response("", [], req.get_full_url())
 
 
 class MyHTTPHandler(HTTPHandler): pass
@@ -1082,36 +1181,8 @@
         else:
             self.assert_(False)
 
-    def _methnames(self, *objs):
-        from mechanize._opener import methnames
-        r = []
-        for i in range(len(objs)):
-            obj = objs[i]
-            names = methnames(obj)
-            names.sort()
-            # special methods vary over Python versions
-            names = filter(lambda mn: mn[0:2] != "__" , names)
-            r.append(names)
-        return r
 
-    def test_methnames(self):
-        a, b, c, d = A(), B(), C(), D()
-        a, b, c, d = self._methnames(a, b, c, d)
-        self.assert_(a == ["a"])
-        self.assert_(b == ["a", "b"])
-        self.assert_(c == ["a", "c"])
-        self.assert_(d == ["a", "b", "c", "d"])
-
-        a, b, c, d = A(), B(), C(), D()
-        a.x = lambda self: None
-        b.y = lambda self: None
-        d.z = lambda self: None
-        a, b, c, d = self._methnames(a, b, c, d)
-        self.assert_(a == ["a", "x"])
-        self.assert_(b == ["a", "b", "y"])
-        self.assert_(c == ["a", "c"])
-        self.assert_(d == ["a", "b", "c", "d", "z"])
-
-
 if __name__ == "__main__":
+    import doctest
+    doctest.testmod()
     unittest.main()

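One detail of the new __main__ block at the end of test_urllib2.py:
doctest.testmod() only reports failures (it returns a (failed, attempted)
pair), so unittest.main() still runs afterwards and provides the exit
status. In other words:

    if __name__ == "__main__":
        import doctest, unittest
        failed, attempted = doctest.testmod()  # reports, does not exit
        unittest.main()                        # raises SystemExit at end
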
Copied: python-mechanize/trunk/test/test_useragent.py (from rev 765, python-mechanize/branches/upstream/current/test/test_useragent.py)

Copied: python-mechanize/trunk/test-tools (from rev 765, python-mechanize/branches/upstream/current/test-tools)

Modified: python-mechanize/trunk/test.py
===================================================================
--- python-mechanize/trunk/test.py	2007-04-09 22:08:07 UTC (rev 772)
+++ python-mechanize/trunk/test.py	2007-04-09 23:35:16 UTC (rev 773)
@@ -8,20 +8,46 @@
 
 """
 
+import cgitb
+#cgitb.enable(format="text")
+
 # Modules containing tests to run -- a test is anything named *Tests, which
 # should be classes deriving from unittest.TestCase.
-MODULE_NAMES = ["test_date", "test_mechanize", "test_misc", "test_cookies",
+MODULE_NAMES = ["test_date", "test_browser", "test_response", "test_cookies",
                 "test_headers", "test_urllib2", "test_pullparser",
+                "test_useragent", "test_html", "test_opener",
                 ]
 
-import sys, os, traceback, logging
-from unittest import defaultTestLoader, TextTestRunner, TestSuite, TestCase
+import sys, os, traceback, logging, glob
+from unittest import defaultTestLoader, TextTestRunner, TestSuite, TestCase, \
+     _TextTestResult
 
-level = logging.DEBUG
+#level = logging.DEBUG
 #level = logging.INFO
+#level = logging.WARNING
 #level = logging.NOTSET
 #logging.getLogger("mechanize").setLevel(level)
+#logging.getLogger("mechanize").addHandler(logging.StreamHandler(sys.stdout))
 
+
+class CgitbTextResult(_TextTestResult):
+    def _exc_info_to_string(self, err, test):
+        """Converts a sys.exc_info()-style tuple of values into a string."""
+        exctype, value, tb = err
+        # Skip test runner traceback levels
+        while tb and self._is_relevant_tb_level(tb):
+            tb = tb.tb_next
+        if exctype is test.failureException:
+            # Skip assert*() traceback levels
+            length = self._count_relevant_tb_levels(tb)
+            return cgitb.text((exctype, value, tb))
+        return cgitb.text((exctype, value, tb))
+
+class CgitbTextTestRunner(TextTestRunner):
+    def _makeResult(self):
+        return CgitbTextResult(self.stream, self.descriptions, self.verbosity)
+
+
 class TestProgram:
     """A command-line program that runs a set of tests; this is primarily
        for making test modules conveniently executable.
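
The CgitbTextResult/CgitbTextTestRunner pair above (enabled by the -t
flag parsed further down) swaps unittest's plain tracebacks for
cgitb-formatted ones, which include the local variables of every frame.
The underlying call is just cgitb.text() applied to an exc_info triple:

    import cgitb, sys

    try:
        1 / 0
    except ZeroDivisionError:
        # verbose plain-text traceback, with per-frame locals
        report = cgitb.text(sys.exc_info())
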
@@ -57,7 +83,6 @@
         self.testLoader = testLoader
         self.progName = os.path.basename(argv[0])
         self.parseArgs(argv)
-        self.runTests()
 
     def usageExit(self, msg=None):
         if msg: print msg
@@ -98,23 +123,114 @@
         if self.testRunner is None:
             self.testRunner = TextTestRunner(verbosity=self.verbosity)
         result = self.testRunner.run(self.test)
-        sys.exit(not result.wasSuccessful())
+        return result
 
 
 if __name__ == "__main__":
+##     sys.path.insert(0, '/home/john/comp/dev/rl/jjlee/lib/python')
+##     import jjl
+##     import __builtin__
+##     __builtin__.jjl = jjl
+
     # XXX temporary stop-gap to run doctests
-    assert os.path.isdir('test')
-    sys.path.insert(0, 'test')
+
+    # XXXX coverage output seems incorrect ATM
+    run_coverage = "-c" in sys.argv
+    if run_coverage:
+        sys.argv.remove("-c")
+    use_cgitb = "-t" in sys.argv
+    if use_cgitb:
+        sys.argv.remove("-t")
+    run_doctests = "-d" not in sys.argv
+    if not run_doctests:
+        sys.argv.remove("-d")
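+    # Summary of the ad hoc flags above (each is removed from sys.argv
+    # before unittest does its own argument parsing):
+    #   -c  record coverage and write an HTML report to coverage/
+    #   -t  format failure tracebacks with cgitb
+    #   -d  skip the doctest pass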
+
+    # import local copy of Python 2.5 doctest
+    assert os.path.isdir("test")
+    sys.path.insert(0, "test")
+    # test-tools holds local copies of Python 2.5's doctest.py (revision
+    # 45701) and linecache.py (revision 45940); they are needed only for
+    # testing and are not installed.  Since linecache is used by Python
+    # itself, the copy is renamed linecache_copy.py, and the copied
+    # doctest.py is modified (only) to use that renamed module.
+    sys.path.insert(0, "test-tools")
     import doctest
-    import test_mechanize
-    doctest.testmod(test_mechanize)
-    from mechanize import _headersutil, _auth, _clientcookie, _pullparser
-    doctest.testmod(_headersutil)
-    doctest.testmod(_auth)
-    doctest.testmod(_clientcookie)
-    doctest.testmod(_pullparser)
 
+    if run_coverage:
+        # imported here so that ordinary runs do not require the coverage
+        # module to be installed
+        import coverage
+        print "running coverage"
+        coverage.erase()
+        coverage.start()
+
+    import mechanize
+
+    if run_doctests:
+        # run .doctest files needing special support
+        common_globs = {"mechanize": mechanize}
+        pm_doctest_filename = os.path.join("test", "test_password_manager.doctest")
+        for globs in [
+            {"mgr_class": mechanize.HTTPPasswordMgr},
+            {"mgr_class": mechanize.HTTPProxyPasswordMgr},
+            ]:
+            globs.update(common_globs)
+            doctest.testfile(
+                pm_doctest_filename,
+                globs=globs,
+                )
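+        # The password-manager doctest file runs once per mgr_class, so one
+        # file exercises both manager classes; its examples are written
+        # against the injected name, e.g. (illustrative):
+        #     >>> mgr = mgr_class()
+        #     >>> mgr.add_password("realm", "http://example.com/", "joe", "pw")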
+
+        # run .doctest files
+        special_doctests = [pm_doctest_filename,
+                            os.path.join("test", "test_scratch.doctest"),
+                            ]
+        doctest_files = glob.glob(os.path.join("test", "*.doctest"))
+
+        for dt in special_doctests:
+            if dt in doctest_files:
+                doctest_files.remove(dt)
+        for df in doctest_files:
+            doctest.testfile(df)
+
+        # run doctests in docstrings
+        from mechanize import _headersutil, _auth, _clientcookie, _pullparser, \
+             _http, _rfc3986
+        doctest.testmod(_headersutil)
+        doctest.testmod(_rfc3986)
+        doctest.testmod(_auth)
+        doctest.testmod(_clientcookie)
+        doctest.testmod(_pullparser)
+        doctest.testmod(_http)
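+        # NB: the (failure_count, test_count) pairs returned by
+        # doctest.testfile() and doctest.testmod() are discarded, which is
+        # why the exit status at the bottom of this file cannot reflect
+        # doctest failures (see the XXX note there).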
+
+    # run vanilla unittest tests
     import unittest
     test_path = os.path.join(os.path.dirname(sys.argv[0]), "test")
     sys.path.insert(0, test_path)
-    TestProgram(MODULE_NAMES)
+    test_runner = None
+    if use_cgitb:
+        test_runner = CgitbTextTestRunner()
+    prog = TestProgram(MODULE_NAMES, testRunner=test_runner)
+    result = prog.runTests()
+
+    if run_coverage:
+        # HTML coverage report
+        import colorize
+        try:
+            os.mkdir("coverage")
+        except OSError:
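+            # the directory probably exists already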
+            pass
+        private_modules = glob.glob(os.path.join("mechanize", "_*.py"))
+        private_modules.remove(os.path.join("mechanize", "__init__.py"))
+        for module_filename in private_modules:
+            # e.g. mechanize/_html.py -> mechanize._html
+            module_name = module_filename.replace(os.sep, ".")[:-3]
+            print module_name
+            module = sys.modules[module_name]
+            f, s, m, mf = coverage.analysis(module)
+            fo = open(os.path.join("coverage", os.path.basename(f) + ".html"),
+                      "wb")
+            colorize.colorize_file(f, outstream=fo, not_covered=mf)
+            fo.close()
+            coverage.report(module)
+
+    # XXX exit status is wrong -- does not take account of doctests
+    sys.exit(not result.wasSuccessful())

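The updated harness can then be exercised like this (a sketch assuming the
layout added in this commit; the flags combine freely):

    $ python test.py        # doctests, then unit tests
    $ python test.py -d     # unit tests only
    $ python test.py -t     # cgitb-formatted failure tracebacks
    $ python test.py -c     # also write an HTML coverage report to coverage/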