r47594 - in /branches/upstream/liburi-perl/current: Changes MANIFEST META.yml README URI.pm URI/IRI.pm URI/_idna.pm URI/_punycode.pm URI/_server.pm t/idna.t t/iri.t t/num_eq.t t/old-base.t t/punycode.t t/rfc2732.t

jawnsy-guest at users.alioth.debian.org jawnsy-guest at users.alioth.debian.org
Sun Nov 22 01:35:00 UTC 2009


Author: jawnsy-guest
Date: Sun Nov 22 01:34:51 2009
New Revision: 47594

URL: http://svn.debian.org/wsvn/pkg-perl/?sc=1&rev=47594
Log:
[svn-upgrade] Integrating new upstream version, liburi-perl (1.50+dfsg)

Added:
    branches/upstream/liburi-perl/current/URI/IRI.pm
    branches/upstream/liburi-perl/current/URI/_idna.pm
    branches/upstream/liburi-perl/current/URI/_punycode.pm
    branches/upstream/liburi-perl/current/t/idna.t
    branches/upstream/liburi-perl/current/t/iri.t
    branches/upstream/liburi-perl/current/t/num_eq.t
    branches/upstream/liburi-perl/current/t/punycode.t
Modified:
    branches/upstream/liburi-perl/current/Changes
    branches/upstream/liburi-perl/current/MANIFEST
    branches/upstream/liburi-perl/current/META.yml
    branches/upstream/liburi-perl/current/README
    branches/upstream/liburi-perl/current/URI.pm
    branches/upstream/liburi-perl/current/URI/_server.pm
    branches/upstream/liburi-perl/current/t/old-base.t
    branches/upstream/liburi-perl/current/t/rfc2732.t

Modified: branches/upstream/liburi-perl/current/Changes
URL: http://svn.debian.org/wsvn/pkg-perl/branches/upstream/liburi-perl/current/Changes?rev=47594&op=diff
==============================================================================
--- branches/upstream/liburi-perl/current/Changes (original)
+++ branches/upstream/liburi-perl/current/Changes Sun Nov 22 01:34:51 2009
@@ -1,3 +1,47 @@
+2009-11-21   Gisle Aas <gisle at ActiveState.com>
+
+   Release 1.50
+
+   The main news in this release is the initial attempt at providing
+   support to IRIs.  URI objects now support the 'as_iri' and 'ihost'
+   methods.
+
+   Gisle Aas (28):
+      Added more tests for setting IPv6 addresses using the host method
+      Document how the host methods deal with IPv6 addresses
+      A "test case" to start IDNA prototype from
+      Escape IDNA hostnames
+      Introduce the as_unicode method
+      Make as_unicode undo punycode for server URLs
+      An IRI class might be helpful (RFC 3987)
+      Must punycode each part of the domain name separately
+      Include initial private Punycode module
+      Get URI::_punycode working
+      punycode of plain ascii should not edit with "-"
+      Some more tests from RFC 3492
+      Add private URI::_idna module based on encodings/idna.py
+      Start using URI::_idna for encoding of URIs
+      Avoid various use of undef warnings
+      Fix test affected by IDNA
+      Keep reference to IDNA::Punycode in the URI::_punycode docs
+      Ensure upgraded strings as input
+      Update manifest with the new idna/punycode files
+      Rename as_unicde to as_iri
+      draft-duerst-iri-bis-07: The proposed RFC 3987 update
+      Load Encode when its used
+      Rename host_unicode as ihost
+      Add basic iri test
+      Hack to make as_iri turn A-labels into U-labels
+      Make as_iri leave escapes not forming valid UTF-8 sequences
+      Merge branch 'iri'
+      Don't include RFCs in the cpan tarball
+
+   Michael G. Schwern (3):
+      Fix != overloading to match ==
+      Note that mailto does not contain a host() and this is not a bug.
+      Strip brackets off IPv6 hosts [RT#34309]
+
+
 2009-08-14   Gisle Aas <gisle at ActiveState.com>
 
    Release 1.40

Modified: branches/upstream/liburi-perl/current/MANIFEST
URL: http://svn.debian.org/wsvn/pkg-perl/branches/upstream/liburi-perl/current/MANIFEST?rev=47594&op=diff
==============================================================================
--- branches/upstream/liburi-perl/current/MANIFEST (original)
+++ branches/upstream/liburi-perl/current/MANIFEST Sun Nov 22 01:34:51 2009
@@ -5,14 +5,17 @@
 URI.pm
 URI/Escape.pm
 URI/Heuristic.pm
+URI/IRI.pm
 URI/QueryParam.pm
 URI/Split.pm
 URI/URL.pm
 URI/WithBase.pm
 URI/_foreign.pm
 URI/_generic.pm
+URI/_idna.pm
 URI/_ldap.pm
 URI/_login.pm
+URI/_punycode.pm
 URI/_query.pm
 URI/_segment.pm
 URI/_server.pm
@@ -62,16 +65,20 @@
 t/gopher.t
 t/heuristic.t
 t/http.t
+t/idna.t
+t/iri.t
 t/ldap.t
 t/mailto.t
 t/mix.t
 t/mms.t
 t/news.t
+t/num_eq.t
 t/old-absconf.t
 t/old-base.t
 t/old-file.t
 t/old-relbase.t
 t/pop.t
+t/punycode.t
 t/query.t
 t/query-param.t
 t/rel.t

Modified: branches/upstream/liburi-perl/current/META.yml
URL: http://svn.debian.org/wsvn/pkg-perl/branches/upstream/liburi-perl/current/META.yml?rev=47594&op=diff
==============================================================================
--- branches/upstream/liburi-perl/current/META.yml (original)
+++ branches/upstream/liburi-perl/current/META.yml Sun Nov 22 01:34:51 2009
@@ -1,6 +1,6 @@
 --- #YAML:1.0
 name:               URI
-version:            1.40
+version:            1.50
 abstract:           Uniform Resource Identifiers (absolute and relative)
 author:
     - Gisle Aas <gisle at activestate.com>

Modified: branches/upstream/liburi-perl/current/README
URL: http://svn.debian.org/wsvn/pkg-perl/branches/upstream/liburi-perl/current/README?rev=47594&op=diff
==============================================================================
--- branches/upstream/liburi-perl/current/README (original)
+++ branches/upstream/liburi-perl/current/README Sun Nov 22 01:34:51 2009
@@ -23,7 +23,7 @@
 comp.lang.perl.modules USENET Newsgroup.  Bug reports and suggestions
 for improvements can be sent to the <libwww at perl.org> mailing list.
 
-Copyright 1998-2004,2008 Gisle Aas.
+Copyright 1998-2009 Gisle Aas.
 Copyright 1998 Graham Barr.
 
 This library is free software; you can redistribute it and/or modify

Modified: branches/upstream/liburi-perl/current/URI.pm
URL: http://svn.debian.org/wsvn/pkg-perl/branches/upstream/liburi-perl/current/URI.pm?rev=47594&op=diff
==============================================================================
--- branches/upstream/liburi-perl/current/URI.pm (original)
+++ branches/upstream/liburi-perl/current/URI.pm Sun Nov 22 01:34:51 2009
@@ -2,7 +2,7 @@
 
 use strict;
 use vars qw($VERSION);
-$VERSION = "1.40";
+$VERSION = "1.50";
 
 use vars qw($ABS_REMOTE_LEADING_DOTS $ABS_ALLOW_RELATIVE_SCHEME $DEFAULT_QUERY_FORM_DELIMITER);
 
@@ -22,11 +22,15 @@
 use URI::Escape ();
 
 use overload ('""'     => sub { ${$_[0]} },
-	      '=='     => sub { overload::StrVal($_[0]) eq
-                                overload::StrVal($_[1])
-                              },
+              '=='     => sub { _obj_eq(@_) },
+              '!='     => sub { !_obj_eq(@_) },
               fallback => 1,
              );
+
+# Check if two objects are the same object
+sub _obj_eq {
+    return overload::StrVal($_[0]) eq overload::StrVal($_[1]);
+}
 
 sub new
 {
@@ -74,11 +78,19 @@
     my $class = shift;
     my($str, $scheme) = @_;
     # find all funny characters and encode the bytes.
-    $str =~ s*([^$uric\#])* URI::Escape::escape_char($1) *ego;
+    $str = $class->_uric_escape($str);
     $str = "$scheme:$str" unless $str =~ /^$scheme_re:/o ||
                                  $class->_no_scheme_ok;
     my $self = bless \$str, $class;
     $self;
+}
+
+
+sub _uric_escape
+{
+    my($class, $str) = @_;
+    $str =~ s*([^$uric\#])* URI::Escape::escape_char($1) *ego;
+    return $str;
 }
 
 
@@ -241,6 +253,43 @@
 {
     my $self = shift;
     $$self;
+}
+
+
+sub as_iri
+{
+    my $self = shift;
+    my $str = $$self;
+    if ($str =~ /\bxn--/ && $self->can("ihost")) {
+	my $ihost = $self->ihost;
+	if ($ihost) {
+	    my $u = $self->clone;
+	    $u->host("%%host%%");
+	    $str = $u->as_string;
+	    $str =~ s/%%host%%/$ihost/;
+	}
+    }
+    if ($str =~ s/%([89A-F][0-9A-F])/chr(hex($1))/eg) {
+	# All this crap because the more obvious:
+	#
+	#   Encode::decode("UTF-8", $str, sub { sprintf "%%%02X", shift })
+	#
+	# doesn't work.  Apparently passing a sub as CHECK only works
+	# for 'ascii' and similar direct encodings.
+
+	require Encode;
+	my $enc = Encode::find_encoding("UTF-8");
+	my $u = "";
+	while (length $str) {
+	    $u .= $enc->decode($str, Encode::FB_QUIET());
+	    if (length $str) {
+		# escape next char
+		$u .= URI::Escape::escape_char(substr($str, 0, 1, ""));
+	    }
+	}
+	$str = $u;
+    }
+    return $str;
 }
 
 
@@ -481,10 +530,16 @@
 
 =item $uri->as_string
 
-Returns a URI object to a plain string.  URI objects are
+Returns a URI object to a plain ASCII string.  URI objects are
 also converted to plain strings automatically by overloading.  This
 means that $uri objects can be used as plain strings in most Perl
 constructs.
+
+=item $uri->as_iri
+
+Returns a Unicode string representing the URI.  Escaped UTF-8 sequences
+representing non-ASCII characters are turned into their corresponding Unicode
+code point.
 
 =item $uri->canonical
 
@@ -676,6 +731,15 @@
 If the $new_host string ends with a colon and a number, then this
 number also sets the port.
 
+For IPv6 addresses the brackets around the raw address is removed in the return
+value from $uri->host.  When setting the host attribute to an IPv6 address you
+can use a raw address or one enclosed in brackets.  The address needs to be
+enclosed in brackets if you want to pass in a new port value as well.
+
+=item $uri->ihost
+
+Returns the host in Unicode form.  Any IDNA A-labels are turned into U-labels.
+
 =item $uri->port
 
 =item $uri->port( $new_port )
@@ -695,6 +759,10 @@
 unit.  The returned value includes a port, even if it matches the
 default port.  The host part and the port part are separated by a
 colon: ":".
+
+For IPv6 addresses the bracketing is preserved; thus
+URI->new("http://[::1]/")->host_port returns "[::1]:80".  Contrast this with
+$uri->host which will remove the brackets.
 
 =item $uri->default_port
 
@@ -806,6 +874,10 @@
 C<URI> objects belonging to the mailto scheme support the common
 methods and the generic query methods.  In addition, they support the
 following mailto-specific methods: $uri->to, $uri->headers.
+
+Note that the "foo at example.com" part of a mailto is I<not> the
+C<userinfo> and C<host> but instead the C<path>.  This allowed a
+mailto to contain multiple comma-seperated email addresses.
 
 =item B<mms>:
 
@@ -1015,7 +1087,7 @@
 
 =head1 COPYRIGHT
 
-Copyright 1995-2004,2008 Gisle Aas.
+Copyright 1995-2009 Gisle Aas.
 
 Copyright 1995 Martijn Koster.
 

Added: branches/upstream/liburi-perl/current/URI/IRI.pm
URL: http://svn.debian.org/wsvn/pkg-perl/branches/upstream/liburi-perl/current/URI/IRI.pm?rev=47594&op=file
==============================================================================
--- branches/upstream/liburi-perl/current/URI/IRI.pm (added)
+++ branches/upstream/liburi-perl/current/URI/IRI.pm Sun Nov 22 01:34:51 2009
@@ -1,0 +1,44 @@
+package URI::IRI;
+
+# Experimental
+
+use strict;
+use URI ();
+
+use overload '""' => sub { shift->as_string };
+
+sub new {
+    my($class, $uri, $scheme) = @_;
+    utf8::upgrade($uri);
+    return bless {
+	uri => URI->new($uri, $scheme),
+    }, $class;
+}
+
+sub clone {
+    my $self = shift;
+    return bless {
+	uri => $self->{uri}->clone,
+    }, ref($self);
+}
+
+sub as_string {
+    my $self = shift;
+    return $self->{uri}->as_iri;
+}
+
+sub AUTOLOAD
+{
+    use vars qw($AUTOLOAD);
+    my $method = substr($AUTOLOAD, rindex($AUTOLOAD, '::')+2);
+
+    # We create the function here so that it will not need to be
+    # autoloaded the next time.
+    no strict 'refs';
+    *$method = sub { shift->{uri}->$method(@_) };
+    goto &$method;
+}
+
+sub DESTROY {}   # avoid AUTOLOADing it
+
+1;

Added: branches/upstream/liburi-perl/current/URI/_idna.pm
URL: http://svn.debian.org/wsvn/pkg-perl/branches/upstream/liburi-perl/current/URI/_idna.pm?rev=47594&op=file
==============================================================================
--- branches/upstream/liburi-perl/current/URI/_idna.pm (added)
+++ branches/upstream/liburi-perl/current/URI/_idna.pm Sun Nov 22 01:34:51 2009
@@ -1,0 +1,78 @@
+package URI::_idna;
+
+# This module implements the RFCs 3490 (IDNA) and 3491 (Nameprep)
+# based on Python-2.6.4/Lib/encodings/idna.py
+
+use strict;
+use URI::_punycode qw(encode_punycode decode_punycode);
+use Carp qw(croak);
+
+my $ASCII = qr/^[\x00-\x7F]*\z/;
+
+sub encode {
+    my $idomain = shift;
+    my @labels = split(/\./, $idomain, -1);
+    my @last_empty;
+    push(@last_empty, pop @labels) if @labels > 1 && $labels[-1] eq "";
+    for (@labels) {
+	$_ = ToASCII($_);
+    }
+    return join(".", @labels, @last_empty);
+}
+
+sub decode {
+    my $domain = shift;
+    return join(".", map ToUnicode($_), split(/\./, $domain, -1))
+}
+
+sub nameprep { # XXX real implementation missing
+    my $label = shift;
+    $label = lc($label);
+    return $label;
+}
+
+sub check_size {
+    my $label = shift;
+    croak "Label empty" if $label eq "";
+    croak "Label too long" if length($label) > 63;
+    return $label;
+}
+
+sub ToASCII {
+    my $label = shift;
+    return check_size($label) if $label =~ $ASCII;
+
+    # Step 2: nameprep
+    $label = nameprep($label);
+    # Step 3: UseSTD3ASCIIRules is false
+    # Step 4: try ASCII again
+    return check_size($label) if $label =~ $ASCII;
+
+    # Step 5: Check ACE prefix
+    if ($label =~ /^xn--/) {
+        croak "Label starts with ACE prefix";
+    }
+
+    # Step 6: Encode with PUNYCODE
+    $label = encode_punycode($label);
+
+    # Step 7: Prepend ACE prefix
+    $label = "xn--$label";
+
+    # Step 8: Check size
+    return check_size($label);
+}
+
+sub ToUnicode {
+    my $label = shift;
+    $label = nameprep($label) unless $label =~ $ASCII;
+    return $label unless $label =~ /^xn--/;
+    my $label1 = decode_punycode(substr($label, 4));
+    my $label2 = ToASCII($label);
+    if (lc($label) ne $label2) {
+	croak "IDNA does not round-trip: '$label' vs '$label2'";
+    }
+    return $label1;
+}
+
+1;

Added: branches/upstream/liburi-perl/current/URI/_punycode.pm
URL: http://svn.debian.org/wsvn/pkg-perl/branches/upstream/liburi-perl/current/URI/_punycode.pm?rev=47594&op=file
==============================================================================
--- branches/upstream/liburi-perl/current/URI/_punycode.pm (added)
+++ branches/upstream/liburi-perl/current/URI/_punycode.pm Sun Nov 22 01:34:51 2009
@@ -1,0 +1,203 @@
+package URI::_punycode;
+
+use strict;
+our $VERSION = 0.02;
+
+require Exporter;
+our @ISA    = qw(Exporter);
+our @EXPORT = qw(encode_punycode decode_punycode);
+
+use integer;
+
+our $DEBUG = 0;
+
+use constant BASE => 36;
+use constant TMIN => 1;
+use constant TMAX => 26;
+use constant SKEW => 38;
+use constant DAMP => 700;
+use constant INITIAL_BIAS => 72;
+use constant INITIAL_N => 128;
+
+my $Delimiter = chr 0x2D;
+my $BasicRE   = qr/[\x00-\x7f]/;
+
+sub _croak { require Carp; Carp::croak(@_); }
+
+sub digit_value {
+    my $code = shift;
+    return ord($code) - ord("A") if $code =~ /[A-Z]/;
+    return ord($code) - ord("a") if $code =~ /[a-z]/;
+    return ord($code) - ord("0") + 26 if $code =~ /[0-9]/;
+    return;
+}
+
+sub code_point {
+    my $digit = shift;
+    return $digit + ord('a') if 0 <= $digit && $digit <= 25;
+    return $digit + ord('0') - 26 if 26 <= $digit && $digit <= 36;
+    die 'NOT COME HERE';
+}
+
+sub adapt {
+    my($delta, $numpoints, $firsttime) = @_;
+    $delta = $firsttime ? $delta / DAMP : $delta / 2;
+    $delta += $delta / $numpoints;
+    my $k = 0;
+    while ($delta > ((BASE - TMIN) * TMAX) / 2) {
+	$delta /= BASE - TMIN;
+	$k += BASE;
+    }
+    return $k + (((BASE - TMIN + 1) * $delta) / ($delta + SKEW));
+}
+
+sub decode_punycode {
+    my $code = shift;
+
+    my $n      = INITIAL_N;
+    my $i      = 0;
+    my $bias   = INITIAL_BIAS;
+    my @output;
+
+    if ($code =~ s/(.*)$Delimiter//o) {
+	push @output, map ord, split //, $1;
+	return _croak('non-basic code point') unless $1 =~ /^$BasicRE*$/o;
+    }
+
+    while ($code) {
+	my $oldi = $i;
+	my $w    = 1;
+    LOOP:
+	for (my $k = BASE; 1; $k += BASE) {
+	    my $cp = substr($code, 0, 1, '');
+	    my $digit = digit_value($cp);
+	    defined $digit or return _croak("invalid punycode input");
+	    $i += $digit * $w;
+	    my $t = ($k <= $bias) ? TMIN
+		: ($k >= $bias + TMAX) ? TMAX : $k - $bias;
+	    last LOOP if $digit < $t;
+	    $w *= (BASE - $t);
+	}
+	$bias = adapt($i - $oldi, @output + 1, $oldi == 0);
+	warn "bias becomes $bias" if $DEBUG;
+	$n += $i / (@output + 1);
+	$i = $i % (@output + 1);
+	splice(@output, $i, 0, $n);
+	warn join " ", map sprintf('%04x', $_), @output if $DEBUG;
+	$i++;
+    }
+    return join '', map chr, @output;
+}
+
+sub encode_punycode {
+    my $input = shift;
+    # my @input = split //, $input; # doesn't work in 5.6.x!
+    my @input = map substr($input, $_, 1), 0..length($input)-1;
+
+    my $n     = INITIAL_N;
+    my $delta = 0;
+    my $bias  = INITIAL_BIAS;
+
+    my @output;
+    my @basic = grep /$BasicRE/, @input;
+    my $h = my $b = @basic;
+    push @output, @basic;
+    push @output, $Delimiter if $b && $h < @input;
+    warn "basic codepoints: (@output)" if $DEBUG;
+
+    while ($h < @input) {
+	my $m = min(grep { $_ >= $n } map ord, @input);
+	warn sprintf "next code point to insert is %04x", $m if $DEBUG;
+	$delta += ($m - $n) * ($h + 1);
+	$n = $m;
+	for my $i (@input) {
+	    my $c = ord($i);
+	    $delta++ if $c < $n;
+	    if ($c == $n) {
+		my $q = $delta;
+	    LOOP:
+		for (my $k = BASE; 1; $k += BASE) {
+		    my $t = ($k <= $bias) ? TMIN :
+			($k >= $bias + TMAX) ? TMAX : $k - $bias;
+		    last LOOP if $q < $t;
+		    my $cp = code_point($t + (($q - $t) % (BASE - $t)));
+		    push @output, chr($cp);
+		    $q = ($q - $t) / (BASE - $t);
+		}
+		push @output, chr(code_point($q));
+		$bias = adapt($delta, $h + 1, $h == $b);
+		warn "bias becomes $bias" if $DEBUG;
+		$delta = 0;
+		$h++;
+	    }
+	}
+	$delta++;
+	$n++;
+    }
+    return join '', @output;
+}
+
+sub min {
+    my $min = shift;
+    for (@_) { $min = $_ if $_ <= $min }
+    return $min;
+}
+
+1;
+__END__
+
+=head1 NAME
+
+URI::_punycode - encodes Unicode string in Punycode
+
+=head1 SYNOPSIS
+
+  use URI::_punycode;
+  $punycode = encode_punycode($unicode);
+  $unicode  = decode_punycode($punycode);
+
+=head1 DESCRIPTION
+
+URI::_punycode is a module to encode / decode Unicode strings into
+Punycode, an efficient encoding of Unicode for use with IDNA.
+
+This module requires Perl 5.6.0 or over to handle UTF8 flagged Unicode
+strings.
+
+=head1 FUNCTIONS
+
+This module exports following functions by default.
+
+=over 4
+
+=item encode_punycode
+
+  $punycode = encode_punycode($unicode);
+
+takes Unicode string (UTF8-flagged variable) and returns Punycode
+encoding for it.
+
+=item decode_punycode
+
+  $unicode = decode_punycode($punycode)
+
+takes Punycode encoding and returns original Unicode string.
+
+=back
+
+These functions throws exceptionsn on failure. You can catch 'em via
+C<eval>.
+
+=head1 AUTHOR
+
+Tatsuhiko Miyagawa E<lt>miyagawa at bulknews.netE<gt> is the author of
+IDNA::Punycode v0.02 which was the basis for this module.
+
+This library is free software; you can redistribute it and/or modify
+it under the same terms as Perl itself.
+
+=head1 SEE ALSO
+
+L<IDNA::Punycode>, RFC 3492
+
+=cut

Modified: branches/upstream/liburi-perl/current/URI/_server.pm
URL: http://svn.debian.org/wsvn/pkg-perl/branches/upstream/liburi-perl/current/URI/_server.pm?rev=47594&op=diff
==============================================================================
--- branches/upstream/liburi-perl/current/URI/_server.pm (original)
+++ branches/upstream/liburi-perl/current/URI/_server.pm Sun Nov 22 01:34:51 2009
@@ -4,6 +4,42 @@
 
 use strict;
 use URI::Escape qw(uri_unescape);
+
+sub _uric_escape {
+    my($class, $str) = @_;
+    if ($str =~ m,^((?:$URI::scheme_re:)?)//([^/?\#]*)(.*)$,os) {
+	my($scheme, $host, $rest) = ($1, $2, $3);
+	my $ui = $host =~ s/(.*@)// ? $1 : "";
+	my $port = $host =~ s/(:\d+)\z// ? $1 : "";
+	if (_host_escape($host)) {
+	    $str = "$scheme//$ui$host$port$rest";
+	}
+    }
+    return $class->SUPER::_uric_escape($str);
+}
+
+sub _host_escape {
+    return unless $_[0] =~ /[^URI::uric]/;
+    require URI::_idna;
+    $_[0] = URI::_idna::encode($_[0]);
+    return 1;
+}
+
+sub as_iri {
+    my $self = shift;
+    my $str = $self->SUPER::as_iri;
+    if ($str =~ /\bxn--/) {
+	if ($str =~ m,^((?:$URI::scheme_re:)?)//([^/?\#]*)(.*)$,os) {
+	    my($scheme, $host, $rest) = ($1, $2, $3);
+	    my $ui = $host =~ s/(.*@)// ? $1 : "";
+	    my $port = $host =~ s/(:\d+)\z// ? $1 : "";
+	    require URI::_idna;
+	    $host = URI::_idna::encode($host);
+	    $str = "$scheme//$ui$host$port$rest";
+	}
+    }
+    return $str;
+}
 
 sub userinfo
 {
@@ -38,14 +74,31 @@
 	$new = "" unless defined $new;
 	if (length $new) {
 	    $new =~ s/[@]/%40/g;   # protect @
-	    $port = $1 if $new =~ s/(:\d+)$//;
+	    if ($new =~ /^[^:]*:\d*\z/ || $new =~ /]:\d*\z/) {
+		$new =~ s/(:\d*)\z// || die "Assert";
+		$port = $1;
+	    }
+	    $new = "[$new]" if $new =~ /:/ && $new !~ /^\[/; # IPv6 address
+	    _host_escape($new);
 	}
 	$self->authority("$ui$new$port");
     }
     return undef unless defined $old;
     $old =~ s/.*@//;
-    $old =~ s/:\d+$//;
+    $old =~ s/:\d+$//;          # remove the port
+    $old =~ s{^\[(.*)\]$}{$1};  # remove brackets around IPv6 (RFC 3986 3.2.2)
     return uri_unescape($old);
+}
+
+sub ihost
+{
+    my $self = shift;
+    my $old = $self->host(@_);
+    if ($old =~ /(^|\.)xn--/) {
+	require URI::_idna;
+	$old = URI::_idna::decode($old);
+    }
+    return $old;
 }
 
 sub _port
@@ -78,8 +131,8 @@
     $self->host(shift) if @_;
     return undef unless defined $old;
     $old =~ s/.*@//;        # zap userinfo
-    $old =~ s/:$//;         # empty port does not could
-    $old .= ":" . $self->port unless $old =~ /:/;
+    $old =~ s/:$//;         # empty port should be treated the same a no port
+    $old .= ":" . $self->port unless $old =~ /:\d+$/;
     $old;
 }
 

Added: branches/upstream/liburi-perl/current/t/idna.t
URL: http://svn.debian.org/wsvn/pkg-perl/branches/upstream/liburi-perl/current/t/idna.t?rev=47594&op=file
==============================================================================
--- branches/upstream/liburi-perl/current/t/idna.t (added)
+++ branches/upstream/liburi-perl/current/t/idna.t Sun Nov 22 01:34:51 2009
@@ -1,0 +1,13 @@
+#!perl -w
+
+use strict;
+use utf8;
+use Test::More tests => 6;
+use URI::_idna;
+
+is URI::_idna::encode("www.example.com"), "www.example.com";
+is URI::_idna::decode("www.example.com"), "www.example.com";
+is URI::_idna::encode("www.example.com."), "www.example.com.";
+is URI::_idna::decode("www.example.com."), "www.example.com.";
+is URI::_idna::encode("Bücher.ch"), "xn--bcher-kva.ch";
+is URI::_idna::decode("xn--bcher-kva.ch"), "bücher.ch";

Added: branches/upstream/liburi-perl/current/t/iri.t
URL: http://svn.debian.org/wsvn/pkg-perl/branches/upstream/liburi-perl/current/t/iri.t?rev=47594&op=file
==============================================================================
--- branches/upstream/liburi-perl/current/t/iri.t (added)
+++ branches/upstream/liburi-perl/current/t/iri.t Sun Nov 22 01:34:51 2009
@@ -1,0 +1,28 @@
+#!perl -w
+
+use utf8;
+use strict;
+use Test::More tests => 11;
+
+use URI;
+
+my $u;
+
+$u = URI->new("http://Bücher.ch");
+is $u, "http://xn--bcher-kva.ch";
+is $u->host, "xn--bcher-kva.ch";
+is $u->ihost, "bücher.ch";
+is $u->as_iri, "http://bücher.ch";
+
+$u = URI->new("http://example.com/Bücher");
+is $u, "http://example.com/B%C3%BCcher";
+is $u->as_iri, "http://example.com/Bücher";
+
+$u = URI->new("http://example.com/B%FCcher");  # latin1 encoded stuff
+is $u->as_iri, "http://example.com/B%FCcher";  # ...should not be decoded
+
+$u = URI->new("http://âž¡.ws/");
+is $u, "http://xn--hgi.ws/";
+is $u->host, "xn--hgi.ws";
+is $u->ihost, "âž¡.ws";
+is $u->as_iri, "http://âž¡.ws/";

Added: branches/upstream/liburi-perl/current/t/num_eq.t
URL: http://svn.debian.org/wsvn/pkg-perl/branches/upstream/liburi-perl/current/t/num_eq.t?rev=47594&op=file
==============================================================================
--- branches/upstream/liburi-perl/current/t/num_eq.t (added)
+++ branches/upstream/liburi-perl/current/t/num_eq.t Sun Nov 22 01:34:51 2009
@@ -1,0 +1,17 @@
+#!/usr/bin/perl -w
+
+# Test URI's overloading of numeric comparision for checking object
+# equality
+
+use strict;
+use Test::More 'no_plan';
+
+use URI;
+
+my $uri1 = URI->new("http://foo.com");
+my $uri2 = URI->new("http://foo.com");
+
+# cmp_ok() has a bug/misfeature where it strips overloading
+# before doing the comparison.  So use a regular ok().
+ok $uri1 == $uri1, "==";
+ok $uri1 != $uri2, "!=";

Modified: branches/upstream/liburi-perl/current/t/old-base.t
URL: http://svn.debian.org/wsvn/pkg-perl/branches/upstream/liburi-perl/current/t/old-base.t?rev=47594&op=diff
==============================================================================
--- branches/upstream/liburi-perl/current/t/old-base.t (original)
+++ branches/upstream/liburi-perl/current/t/old-base.t Sun Nov 22 01:34:51 2009
@@ -477,11 +477,11 @@
     my $url = new URI::URL 'ftp://anonymous:p%61ss@håst:12345';
     $url->_expect('user', 'anonymous');
     $url->_expect('password', 'pass');
-    $url->_expect('host', 'håst');
+    $url->_expect('host', 'xn--hst-ula');
     $url->_expect('port', 12345);
     # Can't really know how netloc is represented since it is partially escaped
     #$url->_expect('netloc', 'anonymous:pass at hst:12345');
-    $url->_expect('as_string' => 'ftp://anonymous:pass@h%E5st:12345');
+    $url->_expect('as_string' => 'ftp://anonymous:pass@xn--hst-ula:12345');
 
     # The '0' is sometimes tricky to get right
     $url->user(0);

Added: branches/upstream/liburi-perl/current/t/punycode.t
URL: http://svn.debian.org/wsvn/pkg-perl/branches/upstream/liburi-perl/current/t/punycode.t?rev=47594&op=file
==============================================================================
--- branches/upstream/liburi-perl/current/t/punycode.t (added)
+++ branches/upstream/liburi-perl/current/t/punycode.t Sun Nov 22 01:34:51 2009
@@ -1,0 +1,56 @@
+#!perl -w
+
+use strict;
+use utf8;
+use Test::More tests => 15;
+use URI::_punycode qw(encode_punycode decode_punycode);
+
+my %RFC_3492 = (
+    A => {
+	unicode => udecode("u+0644 u+064A u+0647 u+0645 u+0627 u+0628 u+062A u+0643 u+0644 u+0645 u+0648 u+0634 u+0639 u+0631 u+0628 u+064A u+061F"),
+	ascii => "egbpdaj6bu4bxfgehfvwxn",
+    },
+    B => {
+	unicode => udecode("u+4ED6 u+4EEC u+4E3A u+4EC0 u+4E48 u+4E0D u+8BF4 u+4E2D u+6587"),
+	ascii => "ihqwcrb4cv8a8dqg056pqjye",
+    },
+    E => {
+	unicode => udecode("u+05DC u+05DE u+05D4 u+05D4 u+05DD u+05E4 u+05E9 u+05D5 u+05D8 u+05DC u+05D0 u+05DE u+05D3 u+05D1 u+05E8 u+05D9 u+05DD u+05E2 u+05D1 u+05E8 u+05D9 u+05EA"),
+	ascii => "4dbcagdahymbxekheh6e0a7fei0b",
+    },
+    J => {
+	unicode => udecode("U+0050 u+006F u+0072 u+0071 u+0075 u+00E9 u+006E u+006F u+0070 u+0075 u+0065 u+0064 u+0065 u+006E u+0073 u+0069 u+006D u+0070 u+006C u+0065 u+006D u+0065 u+006E u+0074 u+0065 u+0068 u+0061 u+0062 u+006C u+0061 u+0072 u+0065 u+006E U+0045 u+0073 u+0070 u+0061 u+00F1 u+006F u+006C"),
+	ascii => "PorqunopuedensimplementehablarenEspaol-fmd56a",
+    },
+    K => {
+	unicode => udecode("U+0054 u+1EA1 u+0069 u+0073 u+0061 u+006F u+0068 u+1ECD u+006B u+0068 u+00F4 u+006E u+0067 u+0074 u+0068 u+1EC3 u+0063 u+0068 u+1EC9 u+006E u+00F3 u+0069 u+0074 u+0069 u+1EBF u+006E u+0067 U+0056 u+0069 u+1EC7 u+0074"),
+	ascii => "TisaohkhngthchnitingVit-kjcr8268qyxafd2f1b9g",
+    },
+    O => {
+	unicode => udecode("u+3072 u+3068 u+3064 u+5C4B u+6839 u+306E u+4E0B u+0032"),
+	ascii => "2-u9tlzr9756bt3uc0v",
+    },
+    S => {
+	unicode => "\$1.00",
+	ascii =>   "\$1.00",
+    },
+);
+
+is encode_punycode("bücher"), "bcher-kva", "http://en.wikipedia.org/wiki/Punycode example encode";
+is decode_punycode("bcher-kva"), "bücher", "http://en.wikipedia.org/wiki/Punycode example decode";
+
+for my $test_key (sort keys %RFC_3492) {
+    my $test = $RFC_3492{$test_key};
+    is encode_punycode($test->{unicode}), $test->{ascii}, "$test_key encode";
+    is decode_punycode($test->{ascii}), $test->{unicode}, "$test_key decode" unless $test_key eq "S";
+}
+
+sub udecode {
+    my $str = shift;
+    my @u;
+    for (split(" ", $str)) {
+	/^[uU]\+[\dA-F]{2,4}$/ || die "Unexpected ucode: $_";
+	push(@u, chr(hex(substr($_, 2))));
+    }
+    return join("", @u);
+}

Modified: branches/upstream/liburi-perl/current/t/rfc2732.t
URL: http://svn.debian.org/wsvn/pkg-perl/branches/upstream/liburi-perl/current/t/rfc2732.t?rev=47594&op=diff
==============================================================================
--- branches/upstream/liburi-perl/current/t/rfc2732.t (original)
+++ branches/upstream/liburi-perl/current/t/rfc2732.t Sun Nov 22 01:34:51 2009
@@ -1,39 +1,52 @@
 #!perl -w
 
-print "1..9\n";
+# Test URIs containing IPv6 addresses
 
 use strict;
+use Test::More tests => 19;
+
 use URI;
 my $uri = URI->new("http://[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:80/index.html");
 
-print "not " unless $uri->as_string eq "http://[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:80/index.html";
-print "ok 1\n";
+is $uri->as_string, "http://[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:80/index.html";
+is $uri->host, "FEDC:BA98:7654:3210:FEDC:BA98:7654:3210";
+is $uri->host_port, "[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:80";
+is $uri->port, "80";
 
-print "not " unless $uri->host eq "[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]";
-print "ok 2\n";
-
-print "not " unless $uri->host_port eq "[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:80";
-print "ok 3\n";
-
-print "not " unless $uri->port eq "80";
-print "ok 4\n";
+$uri->port(undef);
+is $uri->as_string, "http://[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]/index.html";
+is $uri->host_port, "[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:80";
+$uri->port(80);
 
 $uri->host("host");
-print "not " unless $uri->as_string eq "http://host:80/index.html";
-print "ok 5\n";
+is $uri->as_string, "http://host:80/index.html";
+
+$uri->host("FEDC:BA98:7654:3210:FEDC:BA98:7654:3210");
+is $uri->as_string, "http://[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:80/index.html";
+$uri->host_port("[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:88");
+is $uri->as_string, "http://[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:88/index.html";
+$uri->host_port("[::1]:80");
+is $uri->as_string, "http://[::1]:80/index.html";
+$uri->host("::1:80");
+is $uri->as_string, "http://[::1:80]:80/index.html";
+$uri->host("[::1:80]");
+is $uri->as_string, "http://[::1:80]:80/index.html";
+$uri->host("[::1]:88");
+is $uri->as_string, "http://[::1]:88/index.html";
+
 
 $uri = URI->new("ftp://ftp:@[3ffe:2a00:100:7031::1]");
-print "not " unless $uri->as_string eq "ftp://ftp:@[3ffe:2a00:100:7031::1]";
-print "ok 6\n";
+is $uri->as_string, "ftp://ftp:@[3ffe:2a00:100:7031::1]";
 
-print "not " unless $uri->port eq "21" && !$uri->_port;
-print "ok 7\n";
+is $uri->port, "21";
+ok !$uri->_port;
 
-print "not " unless $uri->host("ftp") eq "[3ffe:2a00:100:7031::1]";
-print "ok 8\n";
+is $uri->host("ftp"), "3ffe:2a00:100:7031::1";
 
-print "not " unless $uri eq "ftp://ftp:\@ftp";
-print "ok 9\n";
+is $uri, "ftp://ftp:\@ftp";
+
+$uri = URI->new("http://[::1]");
+is $uri->host, "::1";
 
 __END__
 




More information about the Pkg-perl-cvs-commits mailing list