[libhtml-scrubber-perl] 01/01: Imported from HTML-Scrubber-0.04.tar.gz.

Florian Schlichting fsfs at moszumanska.debian.org
Sat Nov 11 13:45:58 UTC 2017


This is an automated email from the git hooks/post-receive script.

fsfs pushed a commit to annotated tag HTML-Scrubber-0.04
in repository libhtml-scrubber-perl.

commit 2f8e2eeb61fe5e5a12d81f597243c1111d36ff4f
Author: D. H. <podmaster at cpan.org>
Date:   Thu Oct 30 02:35:15 2003 +0000

    Imported from HTML-Scrubber-0.04.tar.gz.
---
 Changes                 |   9 ++
 LICENSE                 | 383 ++++++++++++++++++++++++++++++++++++++++++++++++
 MANIFEST                |   7 +-
 META.yml                |   7 +-
 Makefile.PL             |   3 +
 README                  |   6 +-
 Scrubber.pm             | 233 +++++++++++++++++++++++------
 t/01_use.t              |  12 ++
 test.pl => t/02_basic.t |   3 +
 t/03_more.t             |  44 ++++++
 t/04_style_script.t     |  27 ++++
 t/05_pi_comment.t       |  27 ++++
 12 files changed, 709 insertions(+), 52 deletions(-)
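
Note on the Scrubber.pm change: allow, deny, rules and default now only
lower-case the tag names they store and set the _optimize flag, and the
parser setup is recomputed in _optimize() just before scrub/scrub_file
parses. The sketch below illustrates that deferred ("dirty flag") pattern in
plain Perl. It is not HTML::Scrubber code: the RuleSet package, the reported
method and the _rebuilds counter are hypothetical names introduced only for
this illustration, and no HTML::Parser calls are made.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the scrubber's rule bookkeeping (assumption:
# this is NOT HTML::Scrubber, only an illustration of the same pattern).
package RuleSet;

sub new {
    my ($class) = @_;
    return bless {
        _rules    => {},    # tag name (lower-cased) => allowed flag
        _optimize => 1,     # dirty flag: derived data must be rebuilt
        _report   => [],    # derived list of allowed tags
        _rebuilds => 0,     # counts how often the derived list was rebuilt
    }, $class;
}

sub allow {
    my $self = shift;
    for my $tag (@_) {
        $self->{_rules}{ lc $tag } = 1;    # normalise case on the way in
    }
    $self->{_optimize} = 1;                # rules changed, rebuild later
    return;
}

sub deny {
    my $self = shift;
    for my $tag (@_) {
        $self->{_rules}{ lc $tag } = 0;
    }
    $self->{_optimize} = 1;
    return;
}

# Rebuild the derived list only if a rule changed since the last rebuild.
sub _optimize {
    my ($self) = @_;
    return unless $self->{_optimize};
    $self->{_report} =
        [ sort grep { $self->{_rules}{$_} } keys %{ $self->{_rules} } ];
    $self->{_rebuilds}++;
    $self->{_optimize} = 0;
    return;
}

# Hypothetical accessor standing in for "just before parsing".
sub reported {
    my ($self) = @_;
    $self->_optimize;
    return @{ $self->{_report} };
}

package main;

my $rules = RuleSet->new;
$rules->allow(qw[ B I ]);    # stored as 'b' and 'i'
$rules->deny('i');           # same key as 'I' above, now denied

print join( ',', $rules->reported ), "\n";    # first call rebuilds the list
print join( ',', $rules->reported ), "\n";    # second call reuses it
print "rebuilds: $rules->{_rebuilds}\n";
```

Three rule changes are made here, but the derived list is rebuilt only once,
on the first read. Lower-casing the keys is what lets deny('i') override the
earlier allow of 'I'.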

diff --git a/Changes b/Changes
index c985e48..55c9f9b 100755
--- a/Changes
+++ b/Changes
@@ -1,5 +1,14 @@
 Revision history for Perl extension HTML::Scrubber.
 
+0.04  Wed Oct 29 18:35:08 2003
+    - added missing lc in a few places (and got rid of for @_)
+    - fixed (and improved) optimizations (stupid typo)
+    - added DESTROY to break circular reference (I lost my TODO, so i forgot)
+    - added more pod (allow deny ...)
+    - improved test suite
+    - added LICENSE file
+    - added script/style functions (nice)
+
 0.03  Mon Jul 21 07:32:10 2003
     - perltidy ;)
     - closed http://rt.cpan.org/NoAuth/Bug.html?id=2969
diff --git a/LICENSE b/LICENSE
new file mode 100755
index 0000000..9bb6486
--- /dev/null
+++ b/LICENSE
@@ -0,0 +1,383 @@
+Terms of Perl itself
+
+a) the GNU General Public License as published by the Free
+   Software Foundation; either version 1, or (at your option) any
+   later version, or
+b) the "Artistic License"
+
+---------------------------------------------------------------------------
+
+The General Public License (GPL)
+Version 2, June 1991
+
+Copyright (C) 1989, 1991 Free Software Foundation, Inc. 675 Mass Ave,
+Cambridge, MA 02139, USA. Everyone is permitted to copy and distribute
+verbatim copies of this license document, but changing it is not allowed.
+
+Preamble
+
+The licenses for most software are designed to take away your freedom to share
+and change it. By contrast, the GNU General Public License is intended to
+guarantee your freedom to share and change free software--to make sure the
+software is free for all its users. This General Public License applies to most of
+the Free Software Foundation's software and to any other program whose
+authors commit to using it. (Some other Free Software Foundation software is
+covered by the GNU Library General Public License instead.) You can apply it to
+your programs, too.
+
+When we speak of free software, we are referring to freedom, not price. Our
+General Public Licenses are designed to make sure that you have the freedom
+to distribute copies of free software (and charge for this service if you wish), that
+you receive source code or can get it if you want it, that you can change the
+software or use pieces of it in new free programs; and that you know you can do
+these things.
+
+To protect your rights, we need to make restrictions that forbid anyone to deny
+you these rights or to ask you to surrender the rights. These restrictions
+translate to certain responsibilities for you if you distribute copies of the
+software, or if you modify it.
+
+For example, if you distribute copies of such a program, whether gratis or for a
+fee, you must give the recipients all the rights that you have. You must make
+sure that they, too, receive or can get the source code. And you must show
+them these terms so they know their rights.
+
+We protect your rights with two steps: (1) copyright the software, and (2) offer
+you this license which gives you legal permission to copy, distribute and/or
+modify the software.
+
+Also, for each author's protection and ours, we want to make certain that
+everyone understands that there is no warranty for this free software. If the
+software is modified by someone else and passed on, we want its recipients to
+know that what they have is not the original, so that any problems introduced by
+others will not reflect on the original authors' reputations.
+
+Finally, any free program is threatened constantly by software patents. We wish
+to avoid the danger that redistributors of a free program will individually obtain
+patent licenses, in effect making the program proprietary. To prevent this, we
+have made it clear that any patent must be licensed for everyone's free use or
+not licensed at all.
+
+The precise terms and conditions for copying, distribution and modification
+follow.
+
+GNU GENERAL PUBLIC LICENSE
+TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND
+MODIFICATION
+
+0. This License applies to any program or other work which contains a notice
+placed by the copyright holder saying it may be distributed under the terms of
+this General Public License. The "Program", below, refers to any such program
+or work, and a "work based on the Program" means either the Program or any
+derivative work under copyright law: that is to say, a work containing the
+Program or a portion of it, either verbatim or with modifications and/or translated
+into another language. (Hereinafter, translation is included without limitation in
+the term "modification".) Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not covered by
+this License; they are outside its scope. The act of running the Program is not
+restricted, and the output from the Program is covered only if its contents
+constitute a work based on the Program (independent of having been made by
+running the Program). Whether that is true depends on what the Program does.
+
+1. You may copy and distribute verbatim copies of the Program's source code as
+you receive it, in any medium, provided that you conspicuously and appropriately
+publish on each copy an appropriate copyright notice and disclaimer of warranty;
+keep intact all the notices that refer to this License and to the absence of any
+warranty; and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and you may at
+your option offer warranty protection in exchange for a fee.
+
+2. You may modify your copy or copies of the Program or any portion of it, thus
+forming a work based on the Program, and copy and distribute such
+modifications or work under the terms of Section 1 above, provided that you also
+meet all of these conditions:
+
+a) You must cause the modified files to carry prominent notices stating that you
+changed the files and the date of any change.
+
+b) You must cause any work that you distribute or publish, that in whole or in
+part contains or is derived from the Program or any part thereof, to be licensed
+as a whole at no charge to all third parties under the terms of this License.
+
+c) If the modified program normally reads commands interactively when run, you
+must cause it, when started running for such interactive use in the most ordinary
+way, to print or display an announcement including an appropriate copyright
+notice and a notice that there is no warranty (or else, saying that you provide a
+warranty) and that users may redistribute the program under these conditions,
+and telling the user how to view a copy of this License. (Exception: if the
+Program itself is interactive but does not normally print such an announcement,
+your work based on the Program is not required to print an announcement.)
+
+These requirements apply to the modified work as a whole. If identifiable
+sections of that work are not derived from the Program, and can be reasonably
+considered independent and separate works in themselves, then this License,
+and its terms, do not apply to those sections when you distribute them as
+separate works. But when you distribute the same sections as part of a whole
+which is a work based on the Program, the distribution of the whole must be on
+the terms of this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest your rights to
+work written entirely by you; rather, the intent is to exercise the right to control
+the distribution of derivative or collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program with the
+Program (or with a work based on the Program) on a volume of a storage or
+distribution medium does not bring the other work under the scope of this
+License.
+
+3. You may copy and distribute the Program (or a work based on it, under
+Section 2) in object code or executable form under the terms of Sections 1 and 2
+above provided that you also do one of the following:
+
+a) Accompany it with the complete corresponding machine-readable source
+code, which must be distributed under the terms of Sections 1 and 2 above on a
+medium customarily used for software interchange; or,
+
+b) Accompany it with a written offer, valid for at least three years, to give any
+third party, for a charge no more than your cost of physically performing source
+distribution, a complete machine-readable copy of the corresponding source
+code, to be distributed under the terms of Sections 1 and 2 above on a medium
+customarily used for software interchange; or,
+
+c) Accompany it with the information you received as to the offer to distribute
+corresponding source code. (This alternative is allowed only for noncommercial
+distribution and only if you received the program in object code or executable
+form with such an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for making
+modifications to it. For an executable work, complete source code means all the
+source code for all modules it contains, plus any associated interface definition
+files, plus the scripts used to control compilation and installation of the
+executable. However, as a special exception, the source code distributed need
+not include anything that is normally distributed (in either source or binary form)
+with the major components (compiler, kernel, and so on) of the operating system
+on which the executable runs, unless that component itself accompanies the
+executable.
+
+If distribution of executable or object code is made by offering access to copy
+from a designated place, then offering equivalent access to copy the source
+code from the same place counts as distribution of the source code, even though
+third parties are not compelled to copy the source along with the object code.
+
+4. You may not copy, modify, sublicense, or distribute the Program except as
+expressly provided under this License. Any attempt otherwise to copy, modify,
+sublicense or distribute the Program is void, and will automatically terminate
+your rights under this License. However, parties who have received copies, or
+rights, from you under this License will not have their licenses terminated so long
+as such parties remain in full compliance.
+
+5. You are not required to accept this License, since you have not signed it.
+However, nothing else grants you permission to modify or distribute the Program
+or its derivative works. These actions are prohibited by law if you do not accept
+this License. Therefore, by modifying or distributing the Program (or any work
+based on the Program), you indicate your acceptance of this License to do so,
+and all its terms and conditions for copying, distributing or modifying the
+Program or works based on it.
+
+6. Each time you redistribute the Program (or any work based on the Program),
+the recipient automatically receives a license from the original licensor to copy,
+distribute or modify the Program subject to these terms and conditions. You
+may not impose any further restrictions on the recipients' exercise of the rights
+granted herein. You are not responsible for enforcing compliance by third parties
+to this License.
+
+7. If, as a consequence of a court judgment or allegation of patent infringement
+or for any other reason (not limited to patent issues), conditions are imposed on
+you (whether by court order, agreement or otherwise) that contradict the
+conditions of this License, they do not excuse you from the conditions of this
+License. If you cannot distribute so as to satisfy simultaneously your obligations
+under this License and any other pertinent obligations, then as a consequence
+you may not distribute the Program at all. For example, if a patent license would
+not permit royalty-free redistribution of the Program by all those who receive
+copies directly or indirectly through you, then the only way you could satisfy
+both it and this License would be to refrain entirely from distribution of the
+Program.
+
+If any portion of this section is held invalid or unenforceable under any particular
+circumstance, the balance of the section is intended to apply and the section as
+a whole is intended to apply in other circumstances.
+
+It is not the purpose of this section to induce you to infringe any patents or other
+property right claims or to contest validity of any such claims; this section has
+the sole purpose of protecting the integrity of the free software distribution
+system, which is implemented by public license practices. Many people have
+made generous contributions to the wide range of software distributed through
+that system in reliance on consistent application of that system; it is up to the
+author/donor to decide if he or she is willing to distribute software through any
+other system and a licensee cannot impose that choice.
+
+This section is intended to make thoroughly clear what is believed to be a
+consequence of the rest of this License.
+
+8. If the distribution and/or use of the Program is restricted in certain countries
+either by patents or by copyrighted interfaces, the original copyright holder who
+places the Program under this License may add an explicit geographical
+distribution limitation excluding those countries, so that distribution is permitted
+only in or among countries not thus excluded. In such case, this License
+incorporates the limitation as if written in the body of this License.
+
+9. The Free Software Foundation may publish revised and/or new versions of the
+General Public License from time to time. Such new versions will be similar in
+spirit to the present version, but may differ in detail to address new problems or
+concerns.
+
+Each version is given a distinguishing version number. If the Program specifies a
+version number of this License which applies to it and "any later version", you
+have the option of following the terms and conditions either of that version or of
+any later version published by the Free Software Foundation. If the Program does
+not specify a version number of this License, you may choose any version ever
+published by the Free Software Foundation.
+
+10. If you wish to incorporate parts of the Program into other free programs
+whose distribution conditions are different, write to the author to ask for
+permission. For software which is copyrighted by the Free Software Foundation,
+write to the Free Software Foundation; we sometimes make exceptions for this.
+Our decision will be guided by the two goals of preserving the free status of all
+derivatives of our free software and of promoting the sharing and reuse of
+software generally.
+
+NO WARRANTY
+
+11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS
+NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
+APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE
+COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM
+"AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR
+IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE
+ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE
+PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE,
+YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR
+CORRECTION.
+
+12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED
+TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY
+WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS
+PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
+GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES
+ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM
+(INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
+RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
+PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY
+OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS
+BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
+
+END OF TERMS AND CONDITIONS
+
+
+---------------------------------------------------------------------------
+
+The Artistic License
+
+Preamble
+
+The intent of this document is to state the conditions under which a Package
+may be copied, such that the Copyright Holder maintains some semblance of
+artistic control over the development of the package, while giving the users of the
+package the right to use and distribute the Package in a more-or-less customary
+fashion, plus the right to make reasonable modifications.
+
+Definitions:
+
+-    "Package" refers to the collection of files distributed by the Copyright
+     Holder, and derivatives of that collection of files created through textual
+     modification. 
+-    "Standard Version" refers to such a Package if it has not been modified,
+     or has been modified in accordance with the wishes of the Copyright
+     Holder. 
+-    "Copyright Holder" is whoever is named in the copyright or copyrights for
+     the package. 
+-    "You" is you, if you're thinking about copying or distributing this Package.
+-    "Reasonable copying fee" is whatever you can justify on the basis of
+     media cost, duplication charges, time of people involved, and so on. (You
+     will not be required to justify it to the Copyright Holder, but only to the
+     computing community at large as a market that must bear the fee.) 
+-    "Freely Available" means that no fee is charged for the item itself, though
+     there may be fees involved in handling the item. It also means that
+     recipients of the item may redistribute it under the same conditions they
+     received it. 
+
+1. You may make and give away verbatim copies of the source form of the
+Standard Version of this Package without restriction, provided that you duplicate
+all of the original copyright notices and associated disclaimers.
+
+2. You may apply bug fixes, portability fixes and other modifications derived from
+the Public Domain or from the Copyright Holder. A Package modified in such a
+way shall still be considered the Standard Version.
+
+3. You may otherwise modify your copy of this Package in any way, provided
+that you insert a prominent notice in each changed file stating how and when
+you changed that file, and provided that you do at least ONE of the following:
+
+     a) place your modifications in the Public Domain or otherwise
+     make them Freely Available, such as by posting said modifications
+     to Usenet or an equivalent medium, or placing the modifications on
+     a major archive site such as ftp.uu.net, or by allowing the
+     Copyright Holder to include your modifications in the Standard
+     Version of the Package.
+
+     b) use the modified Package only within your corporation or
+     organization.
+
+     c) rename any non-standard executables so the names do not
+     conflict with standard executables, which must also be provided,
+     and provide a separate manual page for each non-standard
+     executable that clearly documents how it differs from the Standard
+     Version.
+
+     d) make other distribution arrangements with the Copyright Holder.
+
+4. You may distribute the programs of this Package in object code or executable
+form, provided that you do at least ONE of the following:
+
+     a) distribute a Standard Version of the executables and library
+     files, together with instructions (in the manual page or equivalent)
+     on where to get the Standard Version.
+
+     b) accompany the distribution with the machine-readable source of
+     the Package with your modifications.
+
+     c) accompany any non-standard executables with their
+     corresponding Standard Version executables, giving the
+     non-standard executables non-standard names, and clearly
+     documenting the differences in manual pages (or equivalent),
+     together with instructions on where to get the Standard Version.
+
+     d) make other distribution arrangements with the Copyright Holder.
+
+5. You may charge a reasonable copying fee for any distribution of this Package.
+You may charge any fee you choose for support of this Package. You may not
+charge a fee for this Package itself. However, you may distribute this Package in
+aggregate with other (possibly commercial) programs as part of a larger
+(possibly commercial) software distribution provided that you do not advertise
+this Package as a product of your own.
+
+6. The scripts and library files supplied as input to or produced as output from
+the programs of this Package do not automatically fall under the copyright of this
+Package, but belong to whomever generated them, and may be sold
+commercially, and may be aggregated with this Package.
+
+7. C or perl subroutines supplied by you and linked into this Package shall not
+be considered part of this Package.
+
+8. Aggregation of this Package with a commercial distribution is always permitted
+provided that the use of this Package is embedded; that is, when no overt attempt
+is made to make this Package's interfaces visible to the end user of the
+commercial distribution. Such use shall not be construed as a distribution of
+this Package.
+
+9. The name of the Copyright Holder may not be used to endorse or promote
+products derived from this software without specific prior written permission.
+
+10. THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR
+IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
+WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR
+PURPOSE.
+
+The End
+
+
diff --git a/MANIFEST b/MANIFEST
index e26cca7..a57c3ae 100755
--- a/MANIFEST
+++ b/MANIFEST
@@ -1,8 +1,13 @@
 Changes
+LICENSE
 Makefile.PL
 MANIFEST
 MANIFEST.SKIP
 README
 Scrubber.pm
-test.pl
+t/01_use.t
+t/02_basic.t
+t/03_more.t
+t/04_style_script.t
+t/05_pi_comment.t
 META.yml                                Module meta-data (added by MakeMaker)
diff --git a/META.yml b/META.yml
index b9cfcbc..199b21b 100755
--- a/META.yml
+++ b/META.yml
@@ -1,10 +1,13 @@
+# http://module-build.sourceforge.net/META-spec.html
 #XXXXXXX This is a prototype!!!  It will change in the future!!! XXXXX#
 name:         HTML-Scrubber
-version:      0.03
+version:      0.04
 version_from: Scrubber.pm
 installdirs:  site
 requires:
     HTML::Parser:                  3
+    Test:                          0
+    Test::More:                    0
 
 distribution_type: module
-generated_by: ExtUtils::MakeMaker version 6.10_06
+generated_by: ExtUtils::MakeMaker version 6.17
diff --git a/Makefile.PL b/Makefile.PL
index 85c75c5..155f2be 100755
--- a/Makefile.PL
+++ b/Makefile.PL
@@ -4,8 +4,11 @@ use ExtUtils::MakeMaker;
 WriteMakefile(
     'NAME'		=> 'HTML::Scrubber',
     'VERSION_FROM'	=> 'Scrubber.pm', # finds $VERSION
+    'PREREQ_FATAL'  => 1,
     'PREREQ_PM'		=> {
         'HTML::Parser' => 3,
+        'Test'         => 0,
+        'Test::More'   => 0,
     }, # e.g., Module::Name => 1.1
     ($] >= 5.005 ?    ## Add these new keywords supported since 5.005
       (ABSTRACT_FROM => 'Scrubber.pm', # retrieve abstract from module
diff --git a/README b/README
index 0a2de5a..c43f182 100755
--- a/README
+++ b/README
@@ -21,6 +21,6 @@ COPYRIGHT AND LICENCE
 Copyright (C) 2003 D.H. aka PodMaster
 
 This library is free software; you can redistribute it and/or modify
-it under the same terms as Perl itself. 
-If you don't know what this means,
-visit http://perl.com/ or http://cpan.org/.
+it under the same terms as Perl itself.
+The LICENSE file contains the full text of the license.
+
diff --git a/Scrubber.pm b/Scrubber.pm
index d58f28f..5e22fd9 100755
--- a/Scrubber.pm
+++ b/Scrubber.pm
@@ -12,26 +12,25 @@ HTML::Scrubber - Perl extension for scrubbing/sanitizing html
     use strict;
                                                                             #
     my $html = q[
-        <HR>                                                                #
-        <B> bold                                                            #
-            <U> underlined                                                  #
-                <I>                                                         #
-                    <A href=#>  LINK    </A>                                #
-                </I>                                                        #
-            </U>                                                            #
-        </B>                                                                #
-        </HR>                                                               #
+    <style type="text/css"> BAD { background: #666; color: #666;} </style>
+    <script language="javascript"> alert("Hello, I am EVIL!");    </script>
+    <HR>
+        a   => <a href=1>link </a>
+        br  => <br>
+        b   => <B> bold </B>
+        u   => <U> UNDERLINE </U>
     ];
                                                                             #
-    my $scrubber = HTML::Scrubber->new( allow => [ qw[ p b i u hr br ] ] );
+    my $scrubber = HTML::Scrubber->new( allow => [ qw[ p b i u hr br ] ] ); #
                                                                             #
-    print $scrubber->scrub($html);
+    print $scrubber->scrub($html);                                          #
                                                                             #
-    $scrubber->deny( qw[ p b i u hr br ] );
+    $scrubber->deny( qw[ p b i u hr br ] );                                 #
                                                                             #
-    print $scrubber->scrub($html);
+    print $scrubber->scrub($html);                                          #
                                                                             #
 
+
 =for example end
 
 =head1 DESCRIPTION
@@ -63,7 +62,7 @@ use HTML::Entities;
 use vars qw[ $VERSION $_scrub $_scrub_fh ];
 use strict;
 
-$VERSION = '0.03';
+$VERSION = '0.04';
 
 # my my my my, these here to prevent foolishness like 
 # http://perlmonks.org/index.pl?node_id=251127#Stealing+Lexicals
@@ -78,6 +77,7 @@ sub new {
         marked_sections => 0,
         strict_comment  => 0,
         unbroken_text   => 1,
+        case_sensitive  => 0,
     );
 
     my $self = {
@@ -88,13 +88,16 @@ sub new {
         _comment => 0,
         _process => 0,
         _r => "",
+        _optimize => 1,
+        _script => 0,
+        _style  => 0,
     };
 
     $p->{"\0_s"} = bless $self, $package;
 
     return $self unless @_;
 
-    my %args = @_;
+    my(%args)= @_;
 
     for my $f( qw[ default allow deny rules process comment ] ) {
         next unless exists $args{$f};
@@ -108,6 +111,12 @@ sub new {
     return $self;
 }
 
+=head2 comment
+
+    warn "comments are  ", $p->comment ? 'allowed' : 'not allowed';
+    $p->comment(0);  # off by default
+
+=cut
 
 sub comment {
     return
@@ -117,6 +126,14 @@ sub comment {
     return;
 }
 
+=head2 process
+
+    warn "process instructions are  ", $p->process ? 'allowed' : 'not allowed';
+    $p->process(0);  # off by default
+
+=cut
+
+
 sub process {
     return
         $_[0]->{_process}
@@ -125,59 +142,132 @@ sub process {
     return;
 }
 
-sub allow {
-    my $self = shift;
 
-    $self->{_rules}{$_}=1 for @_;
+=head2 script
 
-    return unless $self->{_optimize}; # till I figure it out (huh)
+    warn "script tags (and everything in between) are suppressed"
+        if $p->script;      # off by default
+    $p->script( 0 || 1 );
 
-    if( $self->{_p}{'*'} ){       # default allow
-        $self->{_p}->report_tags();   # so clear it
-    } else {
-        $self->{_p}->report_tags( # default deny, so optimize
-            grep {                # report only tags we want
-                $self->{_rules}{$_}
-            } keys %{
-                $self->{_rules}
-            }
-        );
+B<**> Please note that this is implemented 
+using HTML::Parser's ignore_elements function,
+so if C<script> is set to true,
+all script tags encountered will be validated like all other tags.
+
+=cut
+
+sub script {
+    return
+        $_[0]->{_script}
+            if @_ == 1;
+    $_[0]->{_script} = $_[1];
+    return;
+}
+
+=head2 style 
+
+    warn "style tags (and everything in between) are suppressed"
+        if $p->style;       # off by default
+    $p->style( 0 || 1 );
+
+B<**> Please note that this is implemented 
+using HTML::Parser's ignore_elements function,
+so if C<style> is set to true,
+all style tags encountered will be validated like all other tags.
+
+=cut
+
+sub style {
+    return
+        $_[0]->{_style}
+            if @_ == 1;
+    $_[0]->{_style} = $_[1];
+    return;
+}
+
+=head2 allow
+
+    $p->allow(qw[ t a g s ]);
+
+=cut
+
+sub allow {
+    my $self = shift;
+    for my $k(@_){
+        $self->{_rules}{lc $k}=1;
     }
+    $self->{_optimize} = 1; # a rule changed, so re-optimize on the next parse
+
     return;
 }
 
+
+=head2 deny
+
+    $p->deny(qw[ t a g s ]);
+
+=cut
+
 sub deny {
     my $self = shift;
 
-    $self->{_rules}{$_} = 0 for @_;
+    for my $k(@_){
+        $self->{_rules}{lc $k} = 0;
+    }
 
-    return unless $self->{_optimize}; # till I figure it out (huh)
+    $self->{_optimize} = 1; # a rule changed, so re-optimize on the next parse
 
-    $self->{_p}->ignore_tags( # always ignore stuff we don't want
-        grep {
-            not $self->{_rules}{$_}
-        } keys %{
-            $self->{_rules}
-        }
-    );
     return;
 }
 
+=head2 rules
+
+    $p->rules(
+        img => {
+            src => qr{^(?!http://)}i, # only relative image links allowed
+            alt => 1,                 # alt attribute allowed
+            '*' => 0,                 # deny all other attributes
+        },
+        b => 1,
+        ...
+    );
+
+=cut
+
 sub rules{
     my $self = shift;
-    my %rules = @_;
+    my(%rules)= @_;
     for my $k(keys %rules) {
-        $self->{_rules}{$k} = $rules{$k};
+        $self->{_rules}{lc $k} = $rules{$k};
     }
+
+    $self->{_optimize} = 1; # a rule changed, so re-optimize on the next parse
+
     return;
 }
 
+=head2 default
+
+    print "default is ", $p->default();
+    $p->default(1);      # allow tags by default
+    $p->default(
+        undef,           # don't change
+        {                # default attribute rules
+            '*' => 1,    # allow attributes by default
+        }
+    );
+
+=cut
+
 sub default {
     return
         $_[0]->{_rules}{'*'}
             if @_ == 1;
+
     $_[0]->{_rules}{'*'} = $_[1] if defined $_[1];
-    $_[0]->{_rules}{'_'} = $_[2] if defined $_[2];
+    $_[0]->{_rules}{'_'} = $_[2] if defined $_[2] and ref $_[2];
+    $_[0]->{_optimize} = 1; # a rule changed, so re-optimize on the next parse
+
     return;
 }
 
@@ -197,6 +287,8 @@ sub scrub_file {
         return unless defined $_[0]->_out($_[2]);
     }
 
+    $_[0]->_optimize() ;#if $_[0]->{_optimize};
+
     $_[0]->{_p}->parse_file($_[1]);
 
     return delete $_[0]->{_r} unless exists $_[0]->{_out};
@@ -219,6 +311,8 @@ sub scrub {
         return unless defined $_[0]->_out($_[2]);
     }
 
+    $_[0]->_optimize();# if $_[0]->{_optimize};
+
     $_[0]->{_p}->parse($_[1]);
     $_[0]->{_p}->eof();
     
@@ -275,7 +369,7 @@ sub _validate {
             $f{$k} = $a->{$k};
         }
     }
-    
+
     return "<$t $r>"
         if $r = join ' ',
                 map {
@@ -420,6 +514,56 @@ sub _scrub {
     }
 }
 
+sub _optimize {
+    my($self) = @_;
+
+    my( @ignore_elements ) = grep { not $self->{"_$_"} } qw(script style);
+    $self->{_p}->ignore_elements(@ignore_elements); # if @ is empty, we reset ;)
+
+    return unless $self->{_optimize};
+#sub allow
+#    return unless $self->{_optimize}; # till I figure it out (huh)
+
+    if( $self->{_rules}{'*'} ){       # default allow
+        $self->{_p}->report_tags();   # so clear it
+#        warn "\nreporting all\n";
+    } else {
+
+        my(@reports) =
+            grep {                # report only tags we want
+                $self->{_rules}{$_}
+            } keys %{
+                $self->{_rules}
+            };
+
+        $self->{_p}->report_tags( # default deny, so optimize
+            @reports
+        ) if @reports;
+#        warn "\nreporting only @reports\n";
+    }
+
+# sub deny
+#    return unless $self->{_optimize}; # till I figure it out (huh)
+    my(@ignores)= 
+        grep {
+            not $self->{_rules}{$_}
+        } keys %{
+            $self->{_rules}
+        };
+
+    $self->{_p}->ignore_tags( # always ignore stuff we don't want
+        @ignores
+    ) if @ignores;
+#    warn "\nignoring @ignores\n" if @ignores;
+
+    $self->{_optimize}=0;
+    return;
+}
+
+
+sub DESTROY {
+    delete $_[0]->{_p}->{"\0_s"}; # break circular reference
+}
 1;
 
 #print sprintf q[ '%-12s => %s,], "$_'", $h{$_} for sort keys %h;# perl!
@@ -427,8 +571,6 @@ sub _scrub {
 #perl -ne"chomp;print $_;if( /ok\(/ ){s/\#test \d+$//;print qq'\t\t# test ', ++$a }print $/" test.pl >test2.pl
 #perl -ne"chomp;if(/ok\(/){s/# test .*$//;print$_,qq'\t\t# test ',++$a}else{print$_}print$/" test.pl >test2.pl
 
-=cut
-
 =head1 How does it work?
 
 When a tag is encountered, HTML::Scrubber
@@ -603,7 +745,6 @@ All rights reserved.
 This module is free software;
 you can redistribute it and/or modify it under
 the same terms as Perl itself.
-If you don't know what this means,
-visit http://perl.com/ or http://cpan.org/.
+The LICENSE file contains the full text of the license.
 
 =cut
diff --git a/t/01_use.t b/t/01_use.t
new file mode 100755
index 0000000..76d61fd
--- /dev/null
+++ b/t/01_use.t
@@ -0,0 +1,12 @@
+# Before `make install' is performed this script should be runnable with
+# `make test'. After `make install' it should work as `perl test.pl'
+
+#########################
+
+# change 'tests => 1' to 'tests => ';
+
+use Test;
+BEGIN { plan tests => 1 };
+use HTML::Scrubber;
+ok(1);
+
diff --git a/test.pl b/t/02_basic.t
similarity index 95%
rename from test.pl
rename to t/02_basic.t
index bacf386..8662b25 100755
--- a/test.pl
+++ b/t/02_basic.t
@@ -84,8 +84,11 @@ ok( $scrubber->default() );				# test 33
 ok( ! $scrubber->comment() );				# test 34
 ok( ! $scrubber->process() );				# test 35
 
+#use Data::Dumper;die Dumper( [ $scrubber, $scrubber->scrub($html) ]);
+
 $scrubber = $scrubber->scrub($html);
 
+
 ok( $scrubber );				# test 36
 ok( $scrubber =~ /[><]/ );				# test 37
 ok( $scrubber !~ /href/i );				# test 38
diff --git a/t/03_more.t b/t/03_more.t
new file mode 100755
index 0000000..0e123a4
--- /dev/null
+++ b/t/03_more.t
@@ -0,0 +1,44 @@
+# perl Makefile.PL && nmake realclean && cls && perl Makefile.PL && nmake test
+
+use strict;
+use Test::More tests => 7; 
+BEGIN { $^W = 1 }
+
+use_ok( 'HTML::Scrubber' );
+
+my $s = HTML::Scrubber->new;
+my $html = q[<a href=1>link </a><br><B> bold </B><U> UNDERLINE </U>];
+
+isa_ok($s, 'HTML::Scrubber');
+
+$s->rules( 'font' => { face => 1 } );
+
+is( $s->scrub('<font face="gothic">'), '<font face="gothic">', 'font face gothic' );
+
+$s->allow(qw[ U ]);
+
+#use Data::Dumper;warn $/,Dumper($s);
+
+is( $s->scrub($html), q[link  bold <u> UNDERLINE </u>],'only U');
+
+$s->allow(qw[ B U ]);
+
+#use Data::Dumper;warn $/,Dumper($s);
+
+is( $s->scrub($html),  q[link <b> bold </b><u> UNDERLINE </u>],'B and U');
+
+$s->allow(qw[ A B ]);
+$s->deny('U');
+$s->default(0,{ '*'=> 1});
+
+#use Data::Dumper;warn $/,Dumper($s);
+
+is( $s->scrub($html),  q[<a href="1">link </a><b> bold </b> UNDERLINE ],'A and B');
+
+$s = HTML::Scrubber->new(
+    default => [ 1, { '*' => 1 } ]
+);
+
+is( $s->scrub($html), q[<a href="1">link </a><br><b> bold </b><u> UNDERLINE </u>], 'A B U and BR');
+
+#use Data::Dumper;warn $/,Dumper($s);
diff --git a/t/04_style_script.t b/t/04_style_script.t
new file mode 100755
index 0000000..3bc6ab7
--- /dev/null
+++ b/t/04_style_script.t
@@ -0,0 +1,27 @@
+# perl Makefile.PL && nmake realclean && cls && perl Makefile.PL && nmake test
+
+use strict;
+use Test::More tests => 9; 
+BEGIN { $^W = 1 }
+
+use_ok( 'HTML::Scrubber' );
+
+my $s = HTML::Scrubber->new;
+my $html = q[start <style>in the style</style> middle <script>in the script</script> end];
+
+isa_ok($s, 'HTML::Scrubber');
+
+is( $s->script, 0, 'script off by default');
+is( $s->style, 0, 'style off by default');
+is( $s->scrub($html), 'start  middle  end', 'default (no style no script)');
+
+
+$s->script(1);
+is( $s->script, 1, 'script on');
+is( $s->scrub($html), 'start  middle in the script end', 'script on');
+
+
+
+$s->style(1);
+is( $s->style, 1, 'style on');
+is( $s->scrub($html), 'start in the style middle in the script end', 'style on and script on');
\ No newline at end of file
diff --git a/t/05_pi_comment.t b/t/05_pi_comment.t
new file mode 100755
index 0000000..625605f
--- /dev/null
+++ b/t/05_pi_comment.t
@@ -0,0 +1,27 @@
+# perl Makefile.PL && nmake realclean && cls && perl Makefile.PL && nmake test
+
+use strict;
+use Test::More tests => 9; 
+BEGIN { $^W = 1 }
+
+use_ok( 'HTML::Scrubber' );
+
+my $s = HTML::Scrubber->new;
+my $html = q[start <!--comment--> mid1 <?html pi> mid2 <?xml pi?> end];
+
+isa_ok($s, 'HTML::Scrubber');
+
+is( $s->comment, 0, 'comment off by default');
+is( $s->process, 0, 'process off by default');
+is( $s->scrub($html), 'start  mid1  mid2  end');
+
+
+$s->comment(1);
+is( $s->comment, 1, 'comment on');
+is( $s->scrub($html), 'start <!--comment--> mid1  mid2  end', 'comment on');
+
+
+
+$s->process(1);
+is( $s->process, 1, 'process on');
+is( $s->scrub($html), 'start <!--comment--> mid1 <?html pi> mid2 <?xml pi?> end', 'process on');
\ No newline at end of file
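
The Scrubber.pm hunks above move the ignore_tags/report_tags bookkeeping out of allow() and deny() and into a single _optimize() step that runs before parsing, guarded by the _optimize flag. A minimal sketch of that dirty-flag pattern follows. The RuleSet class, its lists() method, and the _runs counter are hypothetical stand-ins invented for illustration; they are not part of HTML::Scrubber, and no HTML::Parser calls are made.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the scrubber: it only tracks tag rules and
# counts how often the (pretend) expensive optimization step runs.
package RuleSet;

sub new {
    my ($class) = @_;
    return bless { _rules => {}, _optimize => 1, _runs => 0 }, $class;
}

# Rule setters lowercase the tag names and only mark the state dirty.
sub allow {
    my ( $self, @tags ) = @_;
    $self->{_rules}{ lc($_) } = 1 for @tags;
    $self->{_optimize} = 1;
    return;
}

sub deny {
    my ( $self, @tags ) = @_;
    $self->{_rules}{ lc($_) } = 0 for @tags;
    $self->{_optimize} = 1;
    return;
}

# Recompute the derived lists only if a rule changed since the last run.
sub _optimize {
    my ($self) = @_;
    return unless $self->{_optimize};

    my $rules = $self->{_rules};
    $self->{_report} = [ sort grep {     $rules->{$_} } keys %$rules ];
    $self->{_ignore} = [ sort grep { not $rules->{$_} } keys %$rules ];

    $self->{_runs}++;
    $self->{_optimize} = 0;
    return;
}

# Stand-in for scrub(): optimize lazily, then hand back the derived lists.
sub lists {
    my ($self) = @_;
    $self->_optimize;
    return ( $self->{_report}, $self->{_ignore} );
}

package main;

my $r = RuleSet->new;
$r->allow(qw[ B U ]);
$r->deny('A');

my ( $report, $ignore ) = $r->lists;
$r->lists;    # no rule changed in between, so nothing is recomputed

print "report: @$report\n";
print "ignore: @$ignore\n";
print "optimize runs: $r->{_runs}\n";
```

Because the setters only flip a flag, several rule changes in a row cost one recomputation at the next lists() call, and mixed-case tag names such as B and U end up under lowercase keys, which is the same effect the added lc calls have in the patch.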

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/pkg-perl/packages/libhtml-scrubber-perl.git


