r72213 - in /trunk/w3c-linkchecker: MANIFEST META.yml NEWS SIGNATURE bin/checklink bin/checklink.pod debian/changelog docs/linkchecker.js etc/checklink.conf lib/W3C/LinkChecker.pm
periapt-guest at users.alioth.debian.org
Sun Apr 3 21:54:42 UTC 2011
Author: periapt-guest
Date: Sun Apr 3 21:54:29 2011
New Revision: 72213
URL: http://svn.debian.org/wsvn/pkg-perl/?sc=1&rev=72213
Log:
TODO: check javascript file carefully
* New upstream release
Added:
trunk/w3c-linkchecker/docs/linkchecker.js
- copied unchanged from r72212, branches/upstream/w3c-linkchecker/current/docs/linkchecker.js
Modified:
trunk/w3c-linkchecker/MANIFEST
trunk/w3c-linkchecker/META.yml
trunk/w3c-linkchecker/NEWS
trunk/w3c-linkchecker/SIGNATURE
trunk/w3c-linkchecker/bin/checklink
trunk/w3c-linkchecker/bin/checklink.pod
trunk/w3c-linkchecker/debian/changelog
trunk/w3c-linkchecker/etc/checklink.conf
trunk/w3c-linkchecker/lib/W3C/LinkChecker.pm
Modified: trunk/w3c-linkchecker/MANIFEST
URL: http://svn.debian.org/wsvn/pkg-perl/trunk/w3c-linkchecker/MANIFEST?rev=72213&op=diff
==============================================================================
--- trunk/w3c-linkchecker/MANIFEST (original)
+++ trunk/w3c-linkchecker/MANIFEST Sun Apr 3 21:54:29 2011
@@ -9,7 +9,8 @@
etc/checklink.conf Optional configuration file
etc/perltidyrc perltidy(1) profile
docs/checklink.html Additional documentation
-docs/linkchecker.css Cascading style sheet for the documentation
+docs/linkchecker.css Cascading style sheet used in docs and generated HTML
+docs/linkchecker.js JavaScript used in the generated HTML
images/double.png
images/grad.png
images/head-bl.png
Modified: trunk/w3c-linkchecker/META.yml
URL: http://svn.debian.org/wsvn/pkg-perl/trunk/w3c-linkchecker/META.yml?rev=72213&op=diff
==============================================================================
--- trunk/w3c-linkchecker/META.yml (original)
+++ trunk/w3c-linkchecker/META.yml Sun Apr 3 21:54:29 2011
@@ -1,6 +1,6 @@
--- #YAML:1.0
name: W3C-LinkChecker
-version: 4.7
+version: 4.8
abstract: W3C Link Checker
author:
- W3C QA-dev Team <public-qa-dev at w3.org>
Modified: trunk/w3c-linkchecker/NEWS
URL: http://svn.debian.org/wsvn/pkg-perl/trunk/w3c-linkchecker/NEWS?rev=72213&op=diff
==============================================================================
--- trunk/w3c-linkchecker/NEWS (original)
+++ trunk/w3c-linkchecker/NEWS Sun Apr 3 21:54:29 2011
@@ -1,7 +1,15 @@
This document contains information about high level changes between
Link Checker releases.
-Version 4.7
+Version 4.8 - 2011-04-02
+- Avoid some robot delays by improving the order in which links are checked.
+- Avoid some unnecessary HEAD requests in recursive mode.
+- Clarify output wrt. links that have already been checked.
+- Make connection cache size configurable, and increase the default to 2.
+- Move JavaScript to an external file.
+- Check applet and object archive links.
+
+Version 4.7 - 2011-03-17
- Support for IRI.
- Support for more HTML5 links.
- Decode query string parameters as UTF-8.
@@ -9,7 +17,7 @@
- New dependencies: Encode-Locale (command line mode only).
- Updated dependencies: libwww-perl >= 5.833, URI >= 1.53.
-Version 4.6
+Version 4.6 - 2010-05-01
- Support for checking links in CSS.
- Results UI improvements, added "progress bar".
- Support for larger variety of character and content encodings.
@@ -22,13 +30,13 @@
- New dependencies: CSS-DOM >= 0.09.
- Updated dependencies: Perl >= 5.8.
-Version 4.5
+Version 4.5 - 2009-03-30
- Removed W3C trademarked icons from distribution tarball.
- Avoid "false positive" failures from "make test" in certain setups.
- Make quiet command line mode quieter.
- Lowered default timeout to 30 seconds.
-Version 4.4
+Version 4.4 - 2009-02-12
- checking more elements and attributes, such as BLOCKQUOTE cite="", BODY
background="", EMBED, etc
- Changes in the UI to make it match other validators more closely
@@ -38,21 +46,21 @@
- Add non-robot developer mode
- many bug fixes and code cleanup
-Version 4.3
+Version 4.3 - 2006-10-22
- Various minor improvements to result output, both in text and HTML modes.
- Fix --quiet and checking multiple documents to match documentation.
- Eliminate various warnings (emitted by code, not from results).
- Documentation improvements.
-Version 4.2.1
+Version 4.2.1 - 2005-05-15
- Include documentation of the reorganized access keys.
-Version 4.2
+Version 4.2 - 2005-04-27
- Access key reorganization, making them less likely to conflict with
browsers' "native" key bindings.
- Redirects are now checked for private IP addresses too.
-Version 4.1
+Version 4.1 - 2004-11-24
- Added workarounds against browser timeouts in "summary only" mode.
- Improved caching and reuse of fetched /robots.txt information.
- Fixed a bug where a complete protocol response (including headers)
Modified: trunk/w3c-linkchecker/SIGNATURE
URL: http://svn.debian.org/wsvn/pkg-perl/trunk/w3c-linkchecker/SIGNATURE?rev=72213&op=diff
==============================================================================
--- trunk/w3c-linkchecker/SIGNATURE (original)
+++ trunk/w3c-linkchecker/SIGNATURE Sun Apr 3 21:54:29 2011
@@ -14,16 +14,17 @@
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
-SHA1 be94f7305b57b86945ffac2e855cd9c687d829ec MANIFEST
-SHA1 838d23b4a1e435126c9ee820fdd51f7f0baa43ca META.yml
+SHA1 b075772a968f5694bfbb4ce33eadf26566a25f47 MANIFEST
+SHA1 2c2e46c15a894e6fdb6360a96d3e46fef368ea13 META.yml
SHA1 ab9150095a45776c2020e5781d19054c7018da8b Makefile.PL
-SHA1 66a454333dfdb1ab49d1846d583aba735ed35ab7 NEWS
+SHA1 0e45d552ca655a7aa616b5580fe26360194c7b25 NEWS
SHA1 f1f868ea73db7d39ab491ebb50c84de76cce4b44 README
-SHA1 4e5c0c858e53971eb11d4621cc36720617242f6c bin/checklink
-SHA1 07cc637f007a0d57868a9f1105d4e9b7c6c8da5d bin/checklink.pod
+SHA1 75b87d400f5656fa36865cbb8638ae761ef8a045 bin/checklink
+SHA1 4406433ae670dd4f7be3f2c76d55aefb239e9bc9 bin/checklink.pod
SHA1 b188063249c820f0aa5a34b5f735e8f334a536e1 docs/checklink.html
SHA1 fa101fed018fc8e41beca63a0a667fb94c10a557 docs/linkchecker.css
-SHA1 94659a6cba9d947859df23d202aa4c411e2c488b etc/checklink.conf
+SHA1 8fa71b54357c9ed6ac8e01ab600120032d35b080 docs/linkchecker.js
+SHA1 92d01a8a6e7edcd200d70492f4e551984b97b7a0 etc/checklink.conf
SHA1 87c74944dbc80b5d6ab8aac1d09419607b15efff etc/perltidyrc
SHA1 bcb7896bee3764f85a03ab14495efc233f70e215 images/double.png
SHA1 ff9a7be7fee245dd81a7dc4124544d692a140119 images/grad.png
@@ -37,16 +38,16 @@
SHA1 401b5fba02d0d8484775a4a77503fa0d136b96ce images/round-br.png
SHA1 9eb1ee6188391715284a3db080e6e92d163864d9 images/round-tr.png
SHA1 cc01bd358bc1d6d42ca350ad0a4a42778ca4440e images/textbg.png
-SHA1 307f3bef10b817772b619a97b23a711fd06fd3e8 lib/W3C/LinkChecker.pm
+SHA1 993d4a54cd4a6672afeaa938d15cd9154f94aa44 lib/W3C/LinkChecker.pm
SHA1 962ba9fff082c4087239b55618ada2a8f1564464 t/00compile.t
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
-iQEVAwUBTYJfUod580Rxl2NsAQKW6Af/XVf0TroCS1AuV7y9gEqWLGFxEfqY19WF
-f44SU3dwuawr2NMSOPBCcdpThfPrteVMBXjZYTXS7kqZnWcQ2chwTPknYX6g6zrX
-L5mmXNxx7CuYG/1CO1h0+deZ231Z+/R/0uKYDOsL9FsdhTrAJ7qyP3tJyfNECZie
-0g+t+xtYRhXbMUw1LFSB+81szZv1XXZVRnKhLWP54kwBVbebt4/XmsMCYtdCDzPK
-2IU4NSssB/rNxykbNTT8EqPYT8ecXeNG7YqNZkdcGimKzfzsYxdcZnIo6WLGP6Yg
-sTon4mKVsIpeGwYYf4uAprc1Jqf+g+EOhUOP7XWOf6sWQkFrmYR6zQ==
-=APVv
+iQEVAwUBTZdffId580Rxl2NsAQIQgwf/VFvg4vg7KvODiSA5vkfmGJU56Pr9Oxbq
+MCkmCpWfVHo3i4Dzxz7QTubELk6nksKHaoUfVdDCmgRaG9XNVZBb59WCPzedFYsS
+7BoUpzB0u580fOfBO0FhxbEIfEVoGplFN/9BTMBHzJxO/fSRNwHnqsZ1nn1yeCN4
+j23yqibBQapFnd8NFyNHSEzTDEsqtV7cLLYljJlljYP5au2IChaV3hAJ3gsRs0OL
+KVLGGoPQSHR/MhxzIWfituh8MwB4ttjZ5Z0AQibiUfcCfxBA+rgrsT61rquLJOmk
+wUQQHXZVBj9xXdB7fbbezi44+kqOf4U2GTgmNr1quexHm1W24YPd9w==
+=yFBd
-----END PGP SIGNATURE-----
Modified: trunk/w3c-linkchecker/bin/checklink
URL: http://svn.debian.org/wsvn/pkg-perl/trunk/w3c-linkchecker/bin/checklink?rev=72213&op=diff
==============================================================================
--- trunk/w3c-linkchecker/bin/checklink (original)
+++ trunk/w3c-linkchecker/bin/checklink Sun Apr 3 21:54:29 2011
@@ -268,18 +268,19 @@
video => ['src', 'poster'],
};
-# Tag=>attribute mapping of things we treat as space separated lists of links.
+# Tag=>[separator, attributes] mapping of things we treat as lists of links.
use constant LINK_LIST_ATTRS => {
- a => ['ping'],
- area => ['ping'],
- head => ['profile'],
+ a => [qr/\s+/, ['ping']],
+ applet => [qr/[\s,]+/, ['archive']],
+ area => [qr/\s+/, ['ping']],
+ head => [qr/\s+/, ['profile']],
+ object => [qr/\s+/, ['archive']],
};
# TBD/TODO:
-# - applet/@archive, @code?
+# - applet/@code?
# - bgsound/@src?
# - object/@classid?
-# - object/@archive?
# - isindex/@action?
# - layer/@background, @src?
# - ilayer/@background?
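The reworked `LINK_LIST_ATTRS` map in the hunk above pairs each tag with the separator regex used to split its list-valued attributes, which is what lets the new applet/@archive support handle comma-separated archive lists while ping/profile keep splitting on whitespace. A minimal Python sketch of the same lookup-and-split idea (the `list_links` helper and its behavior on leading/trailing separators are illustrative assumptions, not checklink's actual code):

```python
import re

# Per-tag (separator, attribute-names) pairs mirroring the new map:
# applet/@archive is a comma-separated list; the rest split on whitespace.
LINK_LIST_ATTRS = {
    "a": (re.compile(r"\s+"), ["ping"]),
    "applet": (re.compile(r"[\s,]+"), ["archive"]),
    "area": (re.compile(r"\s+"), ["ping"]),
    "head": (re.compile(r"\s+"), ["profile"]),
    "object": (re.compile(r"\s+"), ["archive"]),
}

def list_links(tag, attrs):
    """Extract individual link targets from a tag's list-valued attributes."""
    entry = LINK_LIST_ATTRS.get(tag)
    if entry is None:
        return []
    sep, names = entry
    links = []
    for name in names:
        value = attrs.get(name)
        if value is not None:
            links.extend(s for s in sep.split(value.strip()) if s)
    return links
```

For example, `list_links("applet", {"archive": "a.jar, b.jar"})` yields the two archive URLs separately, each of which can then be fed to `add_link` as in the hunk further below.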
@@ -293,7 +294,7 @@
# Version info
$PACKAGE = 'W3C Link Checker';
$PROGRAM = 'W3C-checklink';
- $VERSION = '4.7';
+ $VERSION = '4.8';
$REVISION = sprintf('version %s (c) 1999-2011 W3C', $VERSION);
$AGENT = sprintf(
'%s/%s %s',
@@ -362,33 +363,17 @@
$DocType =
'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">';
my $css_url = URI->new_abs('linkchecker.css', $Cfg{Doc_URI});
- $Head = sprintf(<<'EOF', HTML::Entities::encode($AGENT), $css_url);
+ my $js_url = URI->new_abs('linkchecker.js', $Cfg{Doc_URI});
+ $Head =
+ sprintf(<<'EOF', HTML::Entities::encode($AGENT), $css_url, $js_url);
<meta http-equiv="Content-Script-Type" content="text/javascript" />
<meta name="generator" content="%s" />
<link rel="stylesheet" type="text/css" href="%s" />
-<script type="text/javascript">
-function show_progress(progress_id, progress_text, progress_percentage)
-{
- var div = document.getElementById("progress" + progress_id);
-
- var head = div.getElementsByTagName("h3")[0];
- var text = document.createTextNode(progress_text);
- var span = document.createElement("span");
- span.appendChild(text);
- head.replaceChild(span, head.getElementsByTagName("span")[0]);
-
- var bar = div.getElementsByTagName("div")[0];
- bar.firstChild.style.width = progress_percentage;
- bar.title = progress_percentage;
-
- var pre = div.getElementsByTagName("pre")[0];
- pre.scrollTop = pre.scrollHeight;
-}
-</script>
+<script type="text/javascript" src="%s"></script>
EOF
# Trusted environment variables that need laundering in taint mode.
- foreach (qw(NNTPSERVER NEWSHOST)) {
+ for (qw(NNTPSERVER NEWSHOST)) {
($ENV{$_}) = ($ENV{$_} =~ /^(.*)$/) if $ENV{$_};
}
@@ -417,6 +402,7 @@
Hide_Same_Realm => 0,
Depth => 0, # < 0 means unlimited recursion.
Sleep_Time => 1,
+ Connection_Cache_Size => 2,
Max_Documents => 150, # For the online version.
User => undef,
Password => undef,
@@ -462,8 +448,7 @@
my $ua = W3C::UserAgent->new($AGENT); # @@@ TODO: admin address
-# @@@ make number of keep-alive connections customizable
-$ua->conn_cache({total_capacity => 1}); # 1 keep-alive connection
+$ua->conn_cache({total_capacity => $Opts{Connection_Cache_Size}});
if ($ua->can('delay')) {
$ua->delay($Opts{Sleep_Time} / 60);
}
@@ -533,7 +518,7 @@
my $check_num = 1;
my @bases = @{$Opts{Base_Locations}};
- foreach my $uri (@ARGV) {
+ for my $uri (@ARGV) {
# Reset base locations so that previous URI's given on the command line
# won't affect the recursion scope for this URI (see check_uri())
@@ -550,7 +535,7 @@
if ($Opts{HTML}) {
&html_footer();
}
- elsif (($doc_count > 0) && !$Opts{Summary_Only}) {
+ elsif ($doc_count > 0 && !$Opts{Summary_Only}) {
printf("\n%s\n", &global_stats());
}
@@ -589,7 +574,7 @@
$uri = $query->param('uri');
if (!$uri) {
- &html_header('', 1); # Set cookie only from results page.
+ &html_header('', undef); # Set cookie only from results page.
my %cookies = CGI::Cookie->fetch();
&print_form(scalar($query->Vars()), $cookies{$PROGRAM}, 1);
&html_footer();
@@ -737,6 +722,7 @@
'u|user=s' => \$Opts{User},
'p|password=s' => \$Opts{Password},
't|timeout=i' => \$Opts{Timeout},
+ 'C|connection-cache=i' => \$Opts{Connection_Cache_Size},
'S|sleep=i' => \$Opts{Sleep_Time},
'L|languages=s' => \$Opts{Accept_Language},
'c|cookies=s' => \$Opts{Cookies},
@@ -1055,7 +1041,7 @@
return if defined($response->{Stop});
if ($Opts{HTML}) {
- &html_header($uri, 0, $cookie) if ($check_num == 1);
+ &html_header($uri, $cookie) if ($check_num == 1);
&print_form($params, $cookie, $check_num) if $is_start;
}
@@ -1227,6 +1213,7 @@
scalar(keys %{$p->{Links}}))
if ($Opts{Verbose});
my %links;
+ my %hostlinks;
# Record all the links found
while (my ($link, $lines) = each(%{$p->{Links}})) {
@@ -1247,7 +1234,13 @@
my $canon_uri = URI->new($abs_link_uri->canonical());
my $fragment = $canon_uri->fragment(undef);
if (!defined($Opts{Exclude}) || $canon_uri !~ $Opts{Exclude}) {
- foreach my $line_num (keys(%$lines)) {
+ if (!exists($links{$canon_uri})) {
+ my $hostport =
+ $canon_uri->can('host_port') ? $canon_uri->host_port() :
+ '';
+ push(@{$hostlinks{$hostport}}, $canon_uri);
+ }
+ for my $line_num (keys(%$lines)) {
if (!defined($fragment) || !length($fragment)) {
# Document without fragment
@@ -1262,17 +1255,20 @@
}
}
+ my @order = &distribute_links(\%hostlinks);
+ undef %hostlinks;
+
# Build the list of broken URI's
- my $nlinks = scalar(keys(%links));
+ my $nlinks = scalar(@order);
&hprintf("Checking %d links to build list of broken URI's\n", $nlinks)
if ($Opts{Verbose});
my %broken;
my $link_num = 0;
- while (my ($u, $ulinks) = each(%links)) {
- $u = URI->new($u);
+ for my $u (@order) {
+ my $ulinks = $links{$u};
if ($Opts{Summary_Only}) {
@@ -1330,7 +1326,7 @@
$broken{$u}{location} = 1;
# All the fragments associated are hence broken
- foreach my $fragment (keys %{$ulinks->{fragments}}) {
+ for my $fragment (keys %{$ulinks->{fragments}}) {
$broken{$u}{fragments}{$fragment}++;
}
}
@@ -1357,7 +1353,7 @@
# Do we want to process other documents?
if ($depth != 0) {
- foreach my $u (map { URI->new($_) } keys %links) {
+ for my $u (map { URI->new($_) } keys %links) {
next unless $results{$u}{location}{success}; # Broken link?
@@ -1402,6 +1398,42 @@
return;
}
+###############################################################
+# Distribute links based on host:port to avoid RobotUA delays #
+###############################################################
+
+sub distribute_links(\%)
+{
+ my $hostlinks = shift;
+
+ # Hosts ordered by weight (number of links), descending
+ my @order =
+ sort { scalar(@{$hostlinks->{$b}}) <=> scalar(@{$hostlinks->{$a}}) }
+ keys %$hostlinks;
+
+ # All link list flattened into one, in host weight order
+ my @all;
+ push(@all, @{$hostlinks->{$_}}) for @order;
+
+ return @all if (scalar(@order) < 2);
+
+ # Indexes and chunk size for "zipping" the end result list
+ my $num = scalar(@{$hostlinks->{$order[0]}});
+ my @indexes = map { $_ * $num } (0 .. $num - 1);
+
+ # Distribute them
+ my @result;
+ while (my @chunk = splice(@all, 0, $num)) {
+ @result[@indexes] = @chunk;
+ @indexes = map { $_ + 1 } @indexes;
+ }
+
+ # Weed out undefs
+ @result = grep(defined, @result);
+
+ return @result;
+}
+
##########################################
# Decode Content-Encodings in a response #
##########################################
@@ -1457,13 +1489,13 @@
# Get the resource
my $response;
if (defined($results{$uri}{response}) &&
- !(($method eq 'GET') && ($results{$uri}{method} eq 'HEAD')))
+ !($method eq 'GET' && $results{$uri}{method} eq 'HEAD'))
{
$response = $results{$uri}{response};
}
else {
$response = &get_uri($method, $uri, $referer);
- &record_results($uri, $method, $response);
+ &record_results($uri, $method, $response, $referer);
&record_redirects($redirects, $response);
}
if (!$response->is_success()) {
@@ -1476,7 +1508,7 @@
}
else {
if ($Opts{HTML}) {
- &html_header($uri, 0, $cookie) if ($check_num == 1);
+ &html_header($uri, $cookie) if ($check_num == 1);
&print_form($params, $cookie, $check_num) if $is_start;
print "<p>", &status_icon($response->code());
}
@@ -1510,7 +1542,7 @@
# No, there is a problem...
if (!$in_recursion) {
if ($Opts{HTML}) {
- &html_header($uri, 0, $cookie) if ($check_num == 1);
+ &html_header($uri, $cookie) if ($check_num == 1);
&print_form($params, $cookie, $check_num) if $is_start;
print "<p>", &status_icon(406);
@@ -1543,7 +1575,7 @@
return 0 if ($candidate =~ $excluded_doc);
}
- foreach my $base (@{$Opts{Base_Locations}}) {
+ for my $base (@{$Opts{Base_Locations}}) {
my $rel = $candidate->rel($base);
next if ($candidate eq $rel); # Relative path not possible?
next if ($rel =~ m|^(\.\.)?/|); # Relative path upwards?
@@ -1704,9 +1736,10 @@
# Record the results of an HTTP request #
#########################################
-sub record_results (\$$$)
-{
- my ($uri, $method, $response) = @_;
+sub record_results (\$$$$)
+{
+ my ($uri, $method, $response, $referer) = @_;
+ $results{$uri}{referer} = $referer;
$results{$uri}{response} = $response;
$results{$uri}{method} = $method;
$results{$uri}{location}{code} = $response->code();
@@ -1753,8 +1786,8 @@
# What type of broken link is it? (stored in {record} - the {display}
# information is just for visual use only)
- if (($results{$uri}{location}{display} == 401) &&
- ($results{$uri}{location}{code} == 404))
+ if ($results{$uri}{location}{display} == 401 &&
+ $results{$uri}{location}{code} == 404)
{
$results{$uri}{location}{record} = 404;
}
@@ -2015,6 +2048,10 @@
elsif ($tag eq 'applet' || $tag eq 'object') {
if (my $codebase = $attr->{codebase}) {
+ # Applet codebases are directories, append trailing slash
+ # if it's not there so that new_abs does the right thing.
+ $codebase .= "/" if ($tag eq 'applet' && $codebase !~ m|/$|);
+
# TODO: HTML 4 spec says applet/@codebase may only point to
# subdirs of the directory containing the current document.
# Should we do something about that?
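The trailing-slash fix in this hunk matters because relative-reference resolution (RFC 3986, what `URI->new_abs` implements) treats the last path segment of a slash-less base as a file and replaces it. The same behavior can be seen with Python's `urljoin` (the URL below is a hypothetical example, not from the commit):

```python
from urllib.parse import urljoin

def applet_codebase(codebase):
    # An applet codebase names a directory; ensure a trailing slash so
    # relative resolution happens *inside* it rather than replacing it.
    return codebase if codebase.endswith("/") else codebase + "/"

base = "http://example.org/docs/classes"  # hypothetical applet codebase
# Without the fix, the last segment is treated as a file and dropped:
print(urljoin(base, "A.class"))                    # http://example.org/docs/A.class
# With the trailing slash, the class resolves inside the directory:
print(urljoin(applet_codebase(base), "A.class"))   # http://example.org/docs/classes/A.class
```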
@@ -2031,9 +2068,10 @@
# List of links attributes:
if (my $link_attrs = LINK_LIST_ATTRS()->{$tag}) {
- for my $la (@$link_attrs) {
+ my ($sep, $attrs) = @$link_attrs;
+ for my $la (@$attrs) {
if (defined(my $value = $attr->{$la})) {
- for my $link (split(/\s+/, $value)) {
+ for my $link (split($sep, $value)) {
$self->add_link($link, $tag_local_base, $line);
}
}
@@ -2109,9 +2147,9 @@
# Extract the doctype
my @declaration = split(/\s+/, $text, 4);
- if (($#declaration >= 3) &&
- ($declaration[0] eq 'DOCTYPE') &&
- (lc($declaration[1]) eq 'html'))
+ if ($#declaration >= 3 &&
+ $declaration[0] eq 'DOCTYPE' &&
+ lc($declaration[1]) eq 'html')
{
# Parse the doctype declaration
@@ -2164,25 +2202,28 @@
# $links is a hash of the links in the documents checked
# $redirects is a map of the redirects encountered
- # Get the document with the appropriate method
- # Only use GET if there are fragments. HEAD is enough if it's not the
- # case.
- my @fragments = keys %{$links->{$uri}{fragments}};
- my $method = scalar(@fragments) ? 'GET' : 'HEAD';
+ # Get the document with the appropriate method: GET if there are
+ # fragments to check or links are wanted, HEAD is enough otherwise.
+ my $fragments = $links->{$uri}{fragments} || {};
+ my $method = ($want_links || %$fragments) ? 'GET' : 'HEAD';
my $response;
my $being_processed = 0;
- if ((!defined($results{$uri})) ||
- (($method eq 'GET') && ($results{$uri}{method} eq 'HEAD')))
+ if (!defined($results{$uri}) ||
+ ($method eq 'GET' && $results{$uri}{method} eq 'HEAD'))
{
$being_processed = 1;
$response = &get_uri($method, $uri, $referer);
# Get the information back from get_uri()
- &record_results($uri, $method, $response);
+ &record_results($uri, $method, $response, $referer);
# Record the redirects
&record_redirects($redirects, $response);
+ }
+ elsif (!($Opts{Summary_Only} || (!$doc_count && $Opts{HTML}))) {
+ my $ref = $results{$uri}{referer};
+ &hprintf("Already checked%s\n", $ref ? ", referrer $ref" : ".");
}
# We got the response of the HTTP request. Stop here if it was a HEAD.
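The hunk above widens the HEAD-vs-GET decision (and is where the NEWS entry "Avoid some unnecessary HEAD requests in recursive mode" comes from): a body is needed not only when there are fragments to verify but also when the document's links are wanted for recursion, and re-fetching is skipped entirely when only a HEAD-upgrade would be redundant. The core condition, as a tiny Python sketch (function name is illustrative):

```python
def pick_method(fragments, want_links):
    # GET only when the response body is actually needed: fragments to
    # verify against the document's anchors, or links wanted for recursion.
    # Otherwise a HEAD request is enough.
    return "GET" if (want_links or fragments) else "HEAD"
```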
@@ -2220,7 +2261,7 @@
}
# Check that the fragments exist
- foreach my $fragment (keys %{$links->{$uri}{fragments}}) {
+ for my $fragment (keys %$fragments) {
if (defined($p->{Anchors}{$fragment}) ||
&escape_match($fragment, $p->{Anchors}) ||
grep { $_ eq "$uri#$fragment" } @{$Opts{Suppress_Fragment}})
@@ -2237,7 +2278,7 @@
sub escape_match ($\%)
{
my ($a, $hash) = (URI::Escape::uri_unescape($_[0]), $_[1]);
- foreach my $b (keys %$hash) {
+ for my $b (keys %$hash) {
return 1 if ($a eq URI::Escape::uri_unescape($b));
}
return 0;
@@ -2467,7 +2508,7 @@
EOF
print("\n");
- foreach my $anchor (@errors) {
+ for my $anchor (@errors) {
my $format;
my @unique = &sort_unique(
map { line_number($_) }
@@ -2499,7 +2540,7 @@
# Process each URL
my ($c, $previous_c);
- foreach my $u (@$urls) {
+ for my $u (@$urls) {
my @fragments = keys %{$broken->{$u}{fragments}};
# Did we get a redirect?
@@ -2508,7 +2549,7 @@
# List of lines
my @total_lines;
push(@total_lines, keys(%{$links->{$u}{location}}));
- foreach my $f (@fragments) {
+ for my $f (@fragments) {
push(@total_lines, keys(%{$links->{$u}{fragments}{$f}}))
unless ($f eq $u && defined($links->{$u}{$u}{LINE_UNKNOWN()}));
}
@@ -2687,7 +2728,7 @@
}
# Fragments
- foreach my $f (@fragments) {
+ for my $f (@fragments) {
my @unique_lines =
&sort_unique(keys %{$links->{$u}{fragments}{$f}});
my $plural = (scalar(@unique_lines) > 1) ? 's' : '';
@@ -2767,10 +2808,8 @@
RC_ROBOTS_TXT() => sprintf(
'The link was not checked due to %srobots exclusion rules%s. Check the link manually, and see also the link checker %sdocumentation on robots exclusion%s.',
$Opts{HTML} ? (
- '<a href="http://www.robotstxt.org/robotstxt.html">',
- '</a>',
- "<a href=\"$Cfg{Doc_URI}#bot\">",
- '</a>'
+ '<a href="http://www.robotstxt.org/robotstxt.html">', '</a>',
+ "<a href=\"$Cfg{Doc_URI}#bot\">", '</a>'
) : ('') x 4
),
RC_DNS_ERROR() =>
@@ -2840,7 +2879,7 @@
# Sort the URI's by HTTP Code
my %code_summary;
my @idx;
- foreach my $u (@urls) {
+ for my $u (@urls) {
if (defined($results->{$u}{location}{record})) {
my $c = &code_shown($u, $results);
$code_summary{$c}++;
@@ -2878,7 +2917,7 @@
</thead>
<tbody>
EOF
- foreach my $code (sort(keys(%code_summary))) {
+ for my $code (sort(keys(%code_summary))) {
printf('<tr%s>', &bgcolor($code));
printf('<td><a href="#d%scode_%s">%s</a></td>',
$doc_count, $code, http_rc($code));
@@ -2934,16 +2973,16 @@
# HTML interface #
##################
-sub html_header ($;$$)
-{
- my ($uri, $doform, $cookie) = @_;
+sub html_header ($$)
+{
+ my ($uri, $cookie) = @_;
my $title = defined($uri) ? $uri : '';
$title = ': ' . $title if ($title =~ /\S/);
my $headers = '';
if (!$Opts{Command_Line}) {
- $headers .= "Cache-Control: no-cache\nPragma: no-cache\n" if $doform;
+ $headers .= "Cache-Control: no-cache\nPragma: no-cache\n" if $uri;
$headers .= "Content-Type: text/html; charset=utf-8\n";
$headers .= "Set-Cookie: $cookie\n" if $cookie;
@@ -2952,40 +2991,14 @@
$headers .= "Content-Language: en\n\n";
}
- my $script = my $onload = '';
- if ($doform) {
- $script = <<'EOF';
-<script type="text/javascript">
-function uriOk(num)
-{
- if (document.getElementById) {
- var u = document.getElementById('uri_' + num);
- var ok = false;
- if (u.value.length > 0) {
- if (u.value.search) {
- ok = (u.value.search(/\S/) !== -1);
- } else {
- ok = true;
- }
- }
- if (! ok) {
- u.focus();
- }
- return ok;
- }
- return true;
-}
-</script>
-EOF
- $onload =
- ' onload="if(document.getElementById){document.getElementById(\'uri_1\').focus()}"';
- }
+ my $onload = $uri ? '' :
+ ' onload="if(document.getElementById){document.getElementById(\'uri_1\').focus()}"';
print $headers, $DocType, "
<html lang=\"en\" xmlns=\"http://www.w3.org/1999/xhtml\" xml:lang=\"en\">
<head>
<title>W3C Link Checker", &encode($title), "</title>
-", $Head, $script, "</head>
+", $Head, "</head>
<body", $onload, '>';
&banner($title);
return;
Modified: trunk/w3c-linkchecker/bin/checklink.pod
URL: http://svn.debian.org/wsvn/pkg-perl/trunk/w3c-linkchecker/bin/checklink.pod?rev=72213&op=diff
==============================================================================
--- trunk/w3c-linkchecker/bin/checklink.pod (original)
+++ trunk/w3c-linkchecker/bin/checklink.pod Sun Apr 3 21:54:29 2011
@@ -160,6 +160,12 @@
Timeout for requests, in seconds. The default is 30.
+=item B<-C, --connection-cache> I<number>
+
+Maximum number of cached connections. Using this option overrides the
+C<Connection_Cache_Size> configuration file parameter, see its
+documentation below for the default value and more information.
+
=item B<-d, --domain> I<domain>
Perl regular expression describing the domain to which the authentication
@@ -234,12 +240,17 @@
CSS_Validator_URI =
http://jigsaw.w3.org/css-validator/validator?uri=%s
-C<Doc_URI> and C<Style_URI> are URIs used for linking to the documentation
-and style sheet from the dynamically generated content of the link checker.
-The defaults are:
+C<Doc_URI> is a URI used for linking to the documentation, and CSS and
+JavaScript files in the dynamically generated content of the link checker.
+The default is:
Doc_URI = http://validator.w3.org/docs/checklink.html
- Style_URI = http://validator.w3.org/docs/linkchecker.css
+
+C<Connection_Cache_Size> is an integer denoting the maximum number of
+connections the link checker will keep open at any given time. The
+default is:
+
+ Connection_Cache_Size = 2
=back
Modified: trunk/w3c-linkchecker/debian/changelog
URL: http://svn.debian.org/wsvn/pkg-perl/trunk/w3c-linkchecker/debian/changelog?rev=72213&op=diff
==============================================================================
--- trunk/w3c-linkchecker/debian/changelog (original)
+++ trunk/w3c-linkchecker/debian/changelog Sun Apr 3 21:54:29 2011
@@ -1,3 +1,11 @@
+w3c-linkchecker (4.8-1) UNRELEASED; urgency=low
+
+ TODO: check javascript file carefully
+
+ * New upstream release
+
+ -- Nicholas Bamber <nicholas at periapt.co.uk> Sun, 03 Apr 2011 22:54:51 +0100
+
w3c-linkchecker (4.7-1) unstable; urgency=low
[ gregor herrmann ]
Modified: trunk/w3c-linkchecker/etc/checklink.conf
URL: http://svn.debian.org/wsvn/pkg-perl/trunk/w3c-linkchecker/etc/checklink.conf?rev=72213&op=diff
==============================================================================
--- trunk/w3c-linkchecker/etc/checklink.conf (original)
+++ trunk/w3c-linkchecker/etc/checklink.conf Sun Apr 3 21:54:29 2011
@@ -44,8 +44,10 @@
#
# Doc_URI is the URI to the Link Checker documentation, shown in the
# results report in CGI mode, and the usage message in command line mode.
-# If you have installed the documentation locally somewhere, you may wish to
-# change this to point to that version. This must be an absolute URI.
+# The URIs to the CSS and JavaScript files in the generated HTML are also
+# formed using this as their base URI. If you have installed the documentation
+# locally somewhere, you may wish to change this to point to that location.
+# This must be an absolute URI.
#
# Default:
# Doc_URI = http://validator.w3.org/docs/checklink.html
@@ -59,3 +61,11 @@
#
# Default:
# Forbidden_Protocols = javascript,mailto
+
+
+#
+# Connection_Cache_Size is an integer denoting the maximum number of
+# connections the link checker will keep open at any given time.
+#
+# Default:
+# Connection_Cache_Size = 2
Modified: trunk/w3c-linkchecker/lib/W3C/LinkChecker.pm
URL: http://svn.debian.org/wsvn/pkg-perl/trunk/w3c-linkchecker/lib/W3C/LinkChecker.pm?rev=72213&op=diff
==============================================================================
--- trunk/w3c-linkchecker/lib/W3C/LinkChecker.pm (original)
+++ trunk/w3c-linkchecker/lib/W3C/LinkChecker.pm Sun Apr 3 21:54:29 2011
@@ -2,5 +2,5 @@
package W3C::LinkChecker;
use strict;
use vars qw($VERSION);
-$VERSION = "4.7";
+$VERSION = "4.8";
1;