[DRE-commits] [SCM] coderay.git branch, upstream, updated. upstream/0.9.8-1-gb788473

Tue Feb 28 12:00:10 UTC 2012

The following commit has been merged in the upstream branch:
commit b7884736f7ec92fa4a0c593cfb8e1551687c8bc9
Author: Youhei SASAKI <uwabami at gfd-dennou.org>
Date:   Mon Feb 27 10:51:46 2012 +0900

    Imported Upstream version 1.0.5

diff --git a/FOLDERS b/FOLDERS
deleted file mode 100644
index e393ed7..0000000
--- a/FOLDERS
+++ /dev/null
@@ -1,53 +0,0 @@
-= CodeRay - Trunk folder structure
-
-== bench - Benchmarking system
-
-All benchmarking stuff goes here.
-
-Test inputs are stored in files named <code>example.<lang></code>.
-Test outputs go to <code>bench/test.<encoder-default-file-extension></code>.
-
-Run <code>bench/bench.rb</code> to get a usage description.
-
-Run <code>rake bench</code> to perform an example benchmark.
-
-
-== bin - Scripts
-
-Executional files for CodeRay.
-
-
-== demo - Demos and functional tests
-
-Demonstrational scripts to show of CodeRay's features.
-
-Run them as functional tests with <code>rake test:demos</code>.
-
-
-== etc - Lots of stuff
-
-Some addidtional files for CodeRay, mainly graphics and Vim scripts.
-
-
-== gem_server - Gem output folder
-
-For <code>rake gem</code>.
-
-
-== lib - CodeRay library code
-
-This is the base directory for the CodeRay library.
-
-
-== rake_helpers - Rake helper libraries
-
-Some files to enhance Rake, including the Autumnal Rdoc template and some scripts.
-
-
-== test - Tests
-
-Tests for the scanners.
-
-Each language has its own subfolder and sub-suite.
-
-Run with <code>rake test</code>.
diff --git a/lib/README b/README_INDEX.rdoc
similarity index 79%
rename from lib/README
rename to README_INDEX.rdoc
index e3fd869..7332653 100644
--- a/lib/README
+++ b/README_INDEX.rdoc
@@ -1,57 +1,45 @@
 = CodeRay
 
-[- Tired of blue'n'gray? Try the original version of this documentation on
-coderay.rubychan.de[http://coderay.rubychan.de/doc/] (use Ctrl+Click to open it in its own frame.) -]
+Tired of blue'n'gray? Try the original version of this documentation on
+coderay.rubychan.de[http://coderay.rubychan.de/doc/] :-)
 
 == About
+
 CodeRay is a Ruby library for syntax highlighting.
 
-Syntax highlighting means: You put your code in, and you get it back colored;
-Keywords, strings, floats, comments - all in different colors.
-And with line numbers.
+You put your code in, and you get it back colored; Keywords, strings,
+floats, comments - all in different colors. And with line numbers.
 
 *Syntax* *Highlighting*...
 * makes code easier to read and maintain
 * lets you detect syntax errors faster
 * helps you to understand the syntax of a language
 * looks nice
-* is what everybody should have on their website
+* is what everybody wants to have on their website
 * solves all your problems and makes the girls run after you
 
-Version: 0.9.8
-Author:: murphy (Kornelius Kalnbach)
-Contact:: murphy rubychan de
-Website:: coderay.rubychan.de[http://coderay.rubychan.de]
-License:: GNU LGPL; see LICENSE file in the main directory.
 
 == Installation
 
-You need RubyGems[http://rubyforge.org/frs/?group_id=126].
-
  % gem install coderay
 
 
 === Dependencies
 
-CodeRay needs Ruby 1.8.6 or later. It also runs with Ruby 1.9.1+ and JRuby 1.1+.
+CodeRay needs Ruby 1.8.7+ or 1.9.2+. It also runs on Rubinius and JRuby.
 
 
 == Example Usage
-(Forgive me, but this is not highlighted.)
 
  require 'coderay'
  
- tokens = CodeRay.scan "puts 'Hello, world!'", :ruby
- page = tokens.html :line_numbers => :inline, :wrap => :page
- puts page
+ html = CodeRay.scan("puts 'Hello, world!'", :ruby).div(:line_numbers => :table)
 
 
 == Documentation
 
 See CodeRay.
 
-Please report errors in this documentation to <murphy rubychan de>.
-
 
 == Credits
 
@@ -94,7 +82,6 @@ Please report errors in this documentation to <murphy rubychan de>.
 * Rob Aldred for the terminal encoder
 * Trans for pointing out $DEBUG dependencies
 * Flameeyes for finding that Term::ANSIColor was obsolete
-* Etienne Massip for reporting a serious bug in JavaScript scanner
 * matz and all Ruby gods and gurus
 * The inventors of: the computer, the internet, the true color display, HTML &
   CSS, VIM, Ruby, pizza, microwaves, guitars, scouting, programming, anime, 
@@ -124,6 +111,8 @@ Where would we be without all those people?
   less useless
 * Term::ANSIColor[http://term-ansicolor.rubyforge.org/]
 * PLEAC[http://pleac.sourceforge.net/] code examples
+* Github
+* Travis CI (http://travis-ci.org/rubychan/github)
 
 === Free
 
diff --git a/Rakefile b/Rakefile
index 05d0144..ba6c34e 100644
--- a/Rakefile
+++ b/Rakefile
@@ -1,8 +1,7 @@
-require 'rake/rdoctask'
+$:.unshift File.dirname(__FILE__) unless $:.include? '.'
 
 ROOT = '.'
 LIB_ROOT = File.join ROOT, 'lib'
-EXTRA_RDOC_FILES = %w(lib/README FOLDERS)
 
 task :default => :test
 
@@ -15,20 +14,21 @@ if File.directory? 'rake_tasks'
   
 else
   
-  # fallback tasks when rake_tasks folder is not present
+  # fallback tasks when rake_tasks folder is not present (eg. in the distribution package)
   desc 'Run CodeRay tests (basic)'
   task :test do
     ruby './test/functional/suite.rb'
     ruby './test/functional/for_redcloth.rb'
   end
   
+  gem 'rdoc' if defined? gem
+  require 'rdoc/task'
   desc 'Generate documentation for CodeRay'
   Rake::RDocTask.new :doc do |rd|
     rd.title = 'CodeRay Documentation'
-    rd.main = 'lib/README'
+    rd.main = 'README_INDEX.rdoc'
     rd.rdoc_files.add Dir['lib']
-    rd.rdoc_files.add 'lib/README'
-    rd.rdoc_files.add 'FOLDERS'
+    rd.rdoc_files.add rd.main
     rd.rdoc_dir = 'doc'
   end
   
diff --git a/bin/coderay b/bin/coderay
index 62101a8..d78cd57 100755
--- a/bin/coderay
+++ b/bin/coderay
@@ -1,86 +1,215 @@
 #!/usr/bin/env ruby
-# CodeRay Executable
-#
-# Version: 0.2
-# Author: murphy
-
 require 'coderay'
 
-if ARGV.empty?
-  $stderr.puts <<-USAGE
-CodeRay #{CodeRay::VERSION} (http://coderay.rubychan.de)
+$options, args = ARGV.partition { |arg| arg[/^-[hv]$|--\w+/] }
+subcommand = args.first if /^\w/ === args.first
+subcommand = nil if subcommand && File.exist?(subcommand)
+args.delete subcommand
 
-Usage:
-  coderay file [-<format>]
-  coderay -<lang> [-<format>] [< file] [> output]
+def option? *options
+  !($options & options).empty?
+end
 
-Defaults:
-  lang:   based on file extension
-  format: ANSI colorized output for terminal, HTML page for files
+def tty?
+  $stdout.tty? || option?('--tty')
+end
 
-Examples:
-  coderay foo.rb                         # colorized output to terminal, based on file extension
-  coderay foo.rb -loc                    # print LOC count, based on file extension and format
-  coderay foo.rb > foo.html              # HTML page output to file, based on extension
-  coderay -ruby < foo.rb                 # colorized output to terminal, based on lang
-  coderay -ruby -loc < foo.rb            # print LOC count, based on lang
-  coderay -ruby -page foo.rb             # HTML page output to terminal, based on lang and format
-  coderay -ruby -page foo.rb > foo.html  # HTML page output to file, based on lang and format
+def version
+  puts <<-USAGE
+CodeRay #{CodeRay::VERSION}
   USAGE
 end
 
-first, second = ARGV
+def help
+  puts <<-HELP
+This is CodeRay #{CodeRay::VERSION}, a syntax highlighting tool for selected languages.
 
-def read
-  file = ARGV.grep(/^(?!-)/).last
-  if file
-    if File.exist?(file)
-      File.read file
-    else
-      $stderr.puts "No such file: #{file}"
-    end
-  else
-    $stdin.read
-  end
+usage:
+  coderay [-language] [input] [-format] [output]
+  
+defaults:
+  language   detect from input file name or shebang; fall back to plain text
+  input      STDIN
+  format     detect from output file name or use terminal; fall back to HTML
+  output     STDOUT
+
+common:
+  coderay file.rb                      # highlight file to terminal
+  coderay file.rb > file.html          # highlight file to HTML page
+  coderay file.rb -div > file.html     # highlight file to HTML snippet
+
+configure output:
+  coderay file.py output.json          # output tokens as JSON
+  coderay file.py -loc                 # count lines of code in Python file
+
+configure input:
+  coderay -python file                 # specify the input language
+  coderay -ruby                        # take input from STDIN
+
+more:
+  coderay stylesheet [style]           # print CSS stylesheet
+  HELP
 end
 
-if first
-  if first[/-(\w+)/] == first
-    lang = $1
-    input = read
-    tokens = :scan
-  else
-    file = first
-    unless File.exist? file
-      $stderr.puts "No such file: #{file}"
-      exit 2
+def commands
+  puts <<-COMMANDS
+  general:
+    highlight   code highlighting (default command, optional)
+    stylesheet  print the CSS stylesheet with the given name (aliases: style, css)
+  
+  about:
+    list [of]   list all available plugins (or just the scanners|encoders|styles|filetypes)
+    commands    print this list
+    help        show some help
+    version     print CodeRay version
+  COMMANDS
+end
+
+def print_list_of plugin_host
+  plugins = plugin_host.all_plugins.map do |plugin|
+    info = "  #{plugin.plugin_id}: #{plugin.title}"
+    
+    aliases = (plugin.aliases - [:default]).map { |key| "-#{key}" }.sort_by { |key| key.size }
+    if plugin.respond_to?(:file_extension) || !aliases.empty?
+      additional_info = []
+      additional_info << aliases.join(', ') unless aliases.empty?
+      info << " (#{additional_info.join('; ')})"
     end
-    tokens = CodeRay.scan_file file
+    
+    info << '  <-- default' if plugin.aliases.include? :default
+    
+    info
   end
-else
-  $stderr.puts 'No lang/file given.'
-  exit 1
+  puts plugins.sort
+end
+
+if option? '-v', '--version'
+  version
+end
+
+if option? '-h', '--help'
+  help
 end
 
-if second
-  if second[/-(\w+)/] == second
-    format = $1.to_sym
+case subcommand
+when 'highlight', nil
+  if ARGV.empty?
+    version
+    help
   else
-    raise 'invalid format (must be -xxx)'
+    signature = args.map { |arg| arg[/^-/] ? '-' : 'f' }.join
+    names     = args.map { |arg| arg.sub(/^-/, '') }
+    case signature
+    when /^$/
+      exit
+    when /^ff?$/
+      input_file, output_file, = *names
+    when /^f-f?$/
+      input_file, output_format, output_file, = *names
+    when /^-ff?$/
+      input_lang, input_file, output_file, = *names
+    when /^-f-f?$/
+      input_lang, input_file, output_format, output_file, = *names
+    when /^--?f?$/
+      input_lang, output_format, output_file, = *names
+    else
+      $stdout = $stderr
+      help
+      puts
+      puts "Unknown parameter order: #{args.join ' '}, expected: [-language] [input] [-format] [output]"
+      exit 1
+    end
+    
+    if input_file
+      input_lang ||= CodeRay::FileType.fetch input_file, :text, true
+    end
+    
+    if output_file
+      output_format ||= CodeRay::FileType[output_file]
+    else
+      output_format ||= :terminal
+    end
+    
+    output_format = :page if output_format.to_s == 'html'
+    
+    if input_file
+      input = File.read input_file
+    else
+      input = $stdin.read
+    end
+    
+    begin
+      file =
+        if output_file
+          File.open output_file, 'w'
+        else
+          $stdout.sync = true
+          $stdout
+        end
+      CodeRay.encode(input, input_lang, output_format, :out => file)
+      file.puts
+    rescue CodeRay::PluginHost::PluginNotFound => boom
+      $stdout = $stderr
+      if boom.message[/CodeRay::(\w+)s could not load plugin :?(.*?): /]
+        puts "I don't know the #$1 \"#$2\"."
+      else
+        puts boom.message
+      end
+      # puts "I don't know this plugin: #{boom.message[/Could not load plugin (.*?): /, 1]}."
+    rescue CodeRay::Scanners::Scanner::ScanError  # FIXME: rescue Errno::EPIPE
+      # this is sometimes raised by pagers; ignore [TODO: wtf?]
+    ensure
+      file.close if output_file
+    end
+  end
+when 'li', 'list'
+  arg = args.first && args.first.downcase
+  if [nil, 's', 'sc', 'scanner', 'scanners'].include? arg
+    puts 'input languages (Scanners):'
+    print_list_of CodeRay::Scanners
+  end
+  
+  if [nil, 'e', 'en', 'enc', 'encoder', 'encoders'].include? arg
+    puts 'output formats (Encoders):'
+    print_list_of CodeRay::Encoders
   end
+  
+  if [nil, 'st', 'style', 'styles'].include? arg
+    puts 'CSS themes for HTML output (Styles):'
+    print_list_of CodeRay::Styles
+  end
+  
+  if [nil, 'f', 'ft', 'file', 'filetype', 'filetypes'].include? arg
+    puts 'recognized file types:'
+    
+    filetypes = Hash.new { |h, k| h[k] = [] }
+    CodeRay::FileType::TypeFromExt.inject filetypes do |types, (ext, type)|
+      types[type.to_s] << ".#{ext}"
+      types
+    end
+    CodeRay::FileType::TypeFromName.inject filetypes do |types, (name, type)|
+      types[type.to_s] << name
+      types
+    end
+    
+    filetypes.sort.each do |type, exts|
+      puts "  #{type}: #{exts.sort_by { |ext| ext.size }.join(', ')}"
+    end
+  end
+when 'stylesheet', 'style', 'css'
+  puts CodeRay::Encoders[:html]::CSS.new(args.first || :default).stylesheet
+when 'commands'
+  commands
+when 'help'
+  help
 else
-  if $stdout.tty?
-    format = :term
+  $stdout = $stderr
+  help
+  puts
+  if subcommand[/\A\w+\z/]
+    puts "Unknown command: #{subcommand}"
   else
-    $stderr.puts 'No format given; setting to default (HTML Page).'
-    format = :page
+    puts "File not found: #{subcommand}"
   end
+  exit 1
 end
-
-if tokens == :scan
-  output = CodeRay::Duo[lang => format].highlight input
-else
-  output = tokens.encode format
-end
-out = $stdout
-out.puts output
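Note on the bin/coderay rewrite above: the new script reduces the positional arguments to a "signature" string (one `-` per dashed argument, one `f` per plain one) and matches that string against a fixed list of patterns to decide which argument is the language, the input, the format and the output. The sketch below reproduces only that step in isolation. `classify_args` is a hypothetical helper name (the script does this inline with local variables), and it assumes `-h`, `-v`, `--long` options and any subcommand have already been removed, as the script does beforehand.

```ruby
# Minimal sketch of the signature matching in the new bin/coderay.
# classify_args is a hypothetical name; it is not part of CodeRay.
def classify_args(args)
  # '-' for every dashed argument, 'f' for everything else
  signature = args.map { |arg| arg[/^-/] ? '-' : 'f' }.join
  # the same arguments with one leading dash removed
  names = args.map { |arg| arg.sub(/^-/, '') }

  keys =
    case signature
    when /^$/      then []
    when /^ff?$/   then [:input_file, :output_file]
    when /^f-f?$/  then [:input_file, :output_format, :output_file]
    when /^-ff?$/  then [:input_lang, :input_file, :output_file]
    when /^-f-f?$/ then [:input_lang, :input_file, :output_format, :output_file]
    when /^--?f?$/ then [:input_lang, :output_format, :output_file]
    else return nil  # unknown parameter order
    end

  # pair each role with its argument; optional trailing roles are dropped
  Hash[keys.zip(names).reject { |_, name| name.nil? }]
end

p classify_args(%w[file.rb -div out.html])
p classify_args(%w[-ruby])
p classify_args(%w[a b c])
```

The pattern order matters: a lone `-ruby` falls through to the last pattern and is read as the input language with STDIN as input, which matches the `coderay -ruby` line in the help text.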
diff --git a/bin/coderay_stylesheet b/bin/coderay_stylesheet
deleted file mode 100755
index baa7c26..0000000
--- a/bin/coderay_stylesheet
+++ /dev/null
@@ -1,4 +0,0 @@
-#!/usr/bin/env ruby
-require 'coderay'
-
-puts CodeRay::Encoders[:html]::CSS.new.stylesheet
diff --git a/lib/coderay.rb b/lib/coderay.rb
index bd8b4e9..876d770 100644
--- a/lib/coderay.rb
+++ b/lib/coderay.rb
@@ -1,16 +1,21 @@
+# encoding: utf-8
+# Encoding.default_internal = 'UTF-8'
+
 # = CodeRay Library
 #
 # CodeRay is a Ruby library for syntax highlighting.
 #
-# I try to make CodeRay easy to use and intuitive, but at the same time fully featured, complete,
-# fast and efficient.
+# I try to make CodeRay easy to use and intuitive, but at the same time fully
+# featured, complete, fast and efficient.
 # 
 # See README.
 # 
 # It consists mainly of
-# * the main engine: CodeRay (Scanners::Scanner, Tokens/TokenStream, Encoders::Encoder), PluginHost
+# * the main engine: CodeRay (Scanners::Scanner, Tokens, Encoders::Encoder)
+# * the plugin system: PluginHost, Plugin
 # * the scanners in CodeRay::Scanners
 # * the encoders in CodeRay::Encoders
+# * the styles in CodeRay::Styles
 # 
 # Here's a fancy graphic to light up this gray docu:
 # 
@@ -22,8 +27,8 @@
 #
 # == Usage
 #
-# Remember you need RubyGems to use CodeRay, unless you have it in your load path. Run Ruby with
-# -rubygems option if required.
+# Remember you need RubyGems to use CodeRay, unless you have it in your load
+# path. Run Ruby with -rubygems option if required.
 #
 # === Highlight Ruby code in a string as html
 # 
@@ -98,13 +103,6 @@
 # CodeRay.encode_tokens:: Encode the given tokens.
 # CodeRay.encode_file:: Scan a file, guess the language using FileType and encode it.
 #
-# == Streaming
-#
-# Streaming saves RAM by running Scanner and Encoder in some sort of
-# pipe mode; see TokenStream.
-#
-# CodeRay.scan_stream:: Scan in stream mode.
-#
 # == All-in-One Encoding
 #
 # CodeRay.encode:: Highlight a string with a given input and output format.
@@ -129,23 +127,37 @@ module CodeRay
   
   $CODERAY_DEBUG ||= false
   
-  # Version: Major.Minor.Teeny[.Revision]
-  # Major: 0 for pre-stable, 1 for stable
-  # Minor: feature milestone
-  # Teeny: development state, 0 for pre-release
-  # Revision: Subversion Revision number (generated on rake gem:make)
-  VERSION = '0.9.8'
-
-  require 'coderay/tokens'
-  require 'coderay/token_classes'
-  require 'coderay/scanner'
-  require 'coderay/encoder'
-  require 'coderay/duo'
-  require 'coderay/style'
-
-
+  CODERAY_PATH = File.join File.dirname(__FILE__), 'coderay'
+  
+  # Assuming the path is a subpath of lib/coderay/
+  def self.coderay_path *path
+    File.join CODERAY_PATH, *path
+  end
+  
+  require coderay_path('version')
+  
+  # helpers
+  autoload :FileType,    coderay_path('helpers', 'file_type')
+  
+  # Tokens
+  autoload :Tokens,      coderay_path('tokens')
+  autoload :TokensProxy, coderay_path('tokens_proxy')
+  autoload :TokenKinds,  coderay_path('token_kinds')
+  
+  # Plugin system
+  autoload :PluginHost,  coderay_path('helpers', 'plugin')
+  autoload :Plugin,      coderay_path('helpers', 'plugin')
+  
+  # Plugins
+  autoload :Scanners,    coderay_path('scanner')
+  autoload :Encoders,    coderay_path('encoder')
+  autoload :Styles,      coderay_path('style')
+  
+  # convenience access and reusable Encoder/Scanner pair
+  autoload :Duo,         coderay_path('duo')
+  
   class << self
-
+    
     # Scans the given +code+ (a String) with the Scanner for +lang+.
     #
     # This is a simple way to use CodeRay. Example:
@@ -154,15 +166,15 @@ module CodeRay
     #
     # See also demo/demo_simple.
     def scan code, lang, options = {}, &block
-      scanner = Scanners[lang].new code, options, &block
-      scanner.tokenize
+      # FIXME: return a proxy for direct-stream encoding
+      TokensProxy.new code, lang, options, block
     end
-
+    
     # Scans +filename+ (a path to a code file) with the Scanner for +lang+.
     #
     # If +lang+ is :auto or omitted, the CodeRay::FileType module is used to
     # determine it. If it cannot find out what type it is, it uses
-    # CodeRay::Scanners::Plaintext.
+    # CodeRay::Scanners::Text.
     #
     # Calls CodeRay.scan.
     #
@@ -170,56 +182,22 @@ module CodeRay
     #  require 'coderay'
     #  page = CodeRay.scan_file('some_c_code.c').html
     def scan_file filename, lang = :auto, options = {}, &block
-      file = IO.read filename
-      if lang == :auto
-        require 'coderay/helpers/file_type'
-        lang = FileType.fetch filename, :plaintext, true
-      end
-      scan file, lang, options = {}, &block
-    end
-
-    # Scan the +code+ (a string) with the scanner for +lang+.
-    #
-    # Calls scan.
-    #
-    # See CodeRay.scan.
-    def scan_stream code, lang, options = {}, &block
-      options[:stream] = true
+      lang = FileType.fetch filename, :text, true if lang == :auto
+      code = File.read filename
       scan code, lang, options, &block
     end
-
-    # Encode a string in Streaming mode.
-    #
-    # This starts scanning +code+ with the the Scanner for +lang+
-    # while encodes the output with the Encoder for +format+.
-    # +options+ will be passed to the Encoder.
-    #
-    # See CodeRay::Encoder.encode_stream
-    def encode_stream code, lang, format, options = {}
-      encoder(format, options).encode_stream code, lang, options
-    end
-
+    
     # Encode a string.
     #
     # This scans +code+ with the the Scanner for +lang+ and then
     # encodes it with the Encoder for +format+.
     # +options+ will be passed to the Encoder.
     #
-    # See CodeRay::Encoder.encode
+    # See CodeRay::Encoder.encode.
     def encode code, lang, format, options = {}
       encoder(format, options).encode code, lang, options
     end
-
-    # Highlight a string into a HTML <div>.
-    #
-    # CSS styles use classes, so you have to include a stylesheet
-    # in your output.
-    #
-    # See encode.
-    def highlight code, lang, options = { :css => :class }, format = :div
-      encode code, lang, format, options
-    end
-
+    
     # Encode pre-scanned Tokens.
     # Use this together with CodeRay.scan:
     #
@@ -232,7 +210,7 @@ module CodeRay
     def encode_tokens tokens, format, options = {}
       encoder(format, options).encode_tokens tokens, options
     end
-
+    
     # Encodes +filename+ (a path to a code file) with the Scanner for +lang+.
     #
     # See CodeRay.scan_file.
@@ -245,7 +223,17 @@ module CodeRay
       tokens = scan_file filename, :auto, get_scanner_options(options)
       encode_tokens tokens, format, options
     end
-
+    
+    # Highlight a string into a HTML <div>.
+    #
+    # CSS styles use classes, so you have to include a stylesheet
+    # in your output.
+    #
+    # See encode.
+    def highlight code, lang, options = { :css => :class }, format = :div
+      encode code, lang, format, options
+    end
+    
     # Highlight a file into a HTML <div>.
     #
     # CSS styles use classes, so you have to include a stylesheet
@@ -255,7 +243,7 @@ module CodeRay
     def highlight_file filename, options = { :css => :class }, format = :div
       encode_file filename, format, options
     end
-
+    
     # Finds the Encoder class for +format+ and creates an instance, passing
     # +options+ to it.
     #
@@ -273,15 +261,15 @@ module CodeRay
     def encoder format, options = {}
       Encoders[format].new options
     end
-
+    
     # Finds the Scanner class for +lang+ and creates an instance, passing
     # +options+ to it.
     #
     # See Scanner.new.
-    def scanner lang, options = {}
-      Scanners[lang].new '', options
+    def scanner lang, options = {}, &block
+      Scanners[lang].new '', options, &block
     end
-
+    
     # Extract the options for the scanner from the +options+ hash.
     #
     # Returns an empty Hash if <tt>:scanner_options</tt> is not set.
@@ -291,32 +279,7 @@ module CodeRay
     def get_scanner_options options
       options.fetch :scanner_options, {}
     end
-
-  end
-
-  # This Exception is raised when you try to stream with something that is not
-  # capable of streaming.
-  class NotStreamableError < Exception
-    def initialize obj
-      @obj = obj
-    end
-
-    def to_s
-      '%s is not Streamable!' % @obj.class
-    end
-  end
-
-  # A dummy module that is included by subclasses of CodeRay::Scanner an CodeRay::Encoder
-  # to show that they are able to handle streams.
-  module Streamable
+    
   end
-
-end
-
-# Run a test script.
-if $0 == __FILE__
-  $stderr.print 'Press key to print demo.'; gets
-  # Just use this file as an example of Ruby code.
-  code = File.read(__FILE__)[/module CodeRay.*/m]
-  print CodeRay.scan(code, :ruby).html
+  
 end
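Note on the lib/coderay.rb change above: the eager `require` calls are replaced by `autoload` entries built with a small path helper, so each part of the library is loaded on first reference to its constant. The sketch below shows the same pattern with the standard library only. `Demo`, `demo_path`, `LazyPart` and the temporary directory are hypothetical stand-ins for `CodeRay`, `coderay_path`, the real constants and `lib/coderay/`.

```ruby
require 'tmpdir'
require 'fileutils'

# A temporary directory stands in for lib/coderay/ (assumption for the demo).
DEMO_DIR = Dir.mktmpdir
at_exit { FileUtils.remove_entry DEMO_DIR }

# The file that should only be loaded when its constant is first used.
File.open(File.join(DEMO_DIR, 'lazy_part.rb'), 'w') do |f|
  f.puts "module Demo; module LazyPart; NAME = 'loaded on first use'; end; end"
end

module Demo
  DEMO_PATH = DEMO_DIR

  # Same shape as CodeRay.coderay_path in the diff: join onto a base path.
  def self.demo_path *path
    File.join DEMO_PATH, *path
  end

  # Registers the file; nothing is required yet.
  autoload :LazyPart, demo_path('lazy_part')
end

before = Demo.autoload?(:LazyPart)  # registered path, constant not loaded
name   = Demo::LazyPart::NAME       # first reference triggers the require
after  = Demo.autoload?(:LazyPart)  # nil once the constant exists

puts "before: #{before.inspect}"
puts "name:   #{name}"
puts "after:  #{after.inspect}"
```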
diff --git a/lib/coderay/duo.rb b/lib/coderay/duo.rb
index 5468dda..cb3f8ee 100644
--- a/lib/coderay/duo.rb
+++ b/lib/coderay/duo.rb
@@ -21,10 +21,7 @@ module CodeRay
     # Create a new Duo, holding a lang and a format to highlight code.
     # 
     # simple:
-    #   CodeRay::Duo[:ruby, :page].highlight 'bla 42'
-    # 
-    # streaming:
-    #   CodeRay::Duo[:ruby, :page].highlight 'bar 23', :stream => true
+    #   CodeRay::Duo[:ruby, :html].highlight 'bla 42'
     # 
     # with options:
     #   CodeRay::Duo[:ruby, :html, :hint => :debug].highlight '????::??'
@@ -38,7 +35,7 @@ module CodeRay
     # The options are forwarded to scanner and encoder
     # (see CodeRay.get_scanner_options).
     def initialize lang = nil, format = nil, options = {}
-      if format == nil and lang.is_a? Hash and lang.size == 1
+      if format.nil? && lang.is_a?(Hash) && lang.size == 1
         @lang = lang.keys.first
         @format = lang[@lang]
       else
@@ -47,12 +44,12 @@ module CodeRay
       end
       @options = options
     end
-
+    
     class << self
       # To allow calls like Duo[:ruby, :html].highlight.
       alias [] new
     end
-
+    
     # The scanner of the duo. Only created once.
     def scanner
       @scanner ||= CodeRay.scanner @lang, CodeRay.get_scanner_options(@options)
@@ -64,22 +61,21 @@ module CodeRay
     end
     
     # Tokenize and highlight the code using +scanner+ and +encoder+.
-    #
-    # If the :stream option is set, the Duo will go into streaming mode,
-    # saving memory for the cost of time.
-    def encode code, options = { :stream => false }
-      stream = options.delete :stream
+    def encode code, options = {}
       options = @options.merge options
-      if stream
-        encoder.encode_stream(code, @lang, options)
-      else
-        scanner.code = code
-        encoder.encode_tokens(scanner.tokenize, options)
-      end
+      encoder.encode(code, @lang, options)
     end
     alias highlight encode
-
+    
+    # Allows to use Duo like a proc object:
+    # 
+    #  CodeRay::Duo[:python => :yaml].call(code)
+    # 
+    # or, in Ruby 1.9 and later:
+    # 
+    #  CodeRay::Duo[:python => :yaml].(code)
+    alias call encode
+    
   end
-
+  
 end
-
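Note on the lib/coderay/duo.rb change above: `alias call encode` lets a Duo be used wherever something callable is expected. The sketch below illustrates the effect with a hypothetical stand-in class, `MiniDuo`, which only formats a string; it is not CodeRay::Duo and does no highlighting.

```ruby
# MiniDuo is a hypothetical stand-in for illustration only.
class MiniDuo
  def initialize lang, format
    @lang, @format = lang, format
  end

  # Stand-in for the real encode: just labels the input.
  def encode code
    "[#{@lang} -> #{@format}] #{code}"
  end
  alias highlight encode
  alias call encode  # same alias the diff adds to Duo
end

duo = MiniDuo.new :python, :yaml

explicit  = duo.call('x = 1')  # plain method call
shorthand = duo.('x = 1')      # Ruby 1.9+ sugar for .call
mapped    = ['x = 1', 'y = 2'].map(&duo.method(:call))

puts explicit
puts mapped
```

Because `call`, `highlight` and `encode` are aliases of one method, all three return the same result for the same input.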
diff --git a/lib/coderay/encoder.rb b/lib/coderay/encoder.rb
index 3ae2924..d2d6c7e 100644
--- a/lib/coderay/encoder.rb
+++ b/lib/coderay/encoder.rb
@@ -1,5 +1,5 @@
 module CodeRay
-
+  
   # This module holds the Encoder class and its subclasses.
   # For example, the HTML encoder is named CodeRay::Encoders::HTML
   # can be found in coderay/encoders/html.
@@ -8,9 +8,10 @@ module CodeRay
   # mechanism and the [] method that returns the Encoder class
   # belonging to the given format.
   module Encoders
+    
     extend PluginHost
     plugin_path File.dirname(__FILE__), 'encoders'
-
+    
     # = Encoder
     #
     # The Encoder base class. Together with Scanner and
@@ -26,34 +27,32 @@ module CodeRay
     class Encoder
       extend Plugin
       plugin_host Encoders
-
-      attr_reader :token_stream
-
+      
       class << self
-
-        # Returns if the Encoder can be used in streaming mode.
-        def streamable?
-          is_a? Streamable
-        end
-
+        
         # If FILE_EXTENSION isn't defined, this method returns the
         # downcase class name instead.
         def const_missing sym
           if sym == :FILE_EXTENSION
-            plugin_id
+            (defined?(@plugin_id) && @plugin_id || name[/\w+$/].downcase).to_s
           else
             super
           end
         end
-
+        
+        # The default file extension for output file of this encoder class.
+        def file_extension
+          self::FILE_EXTENSION
+        end
+        
       end
-
+      
       # Subclasses are to store their default options in this constant.
-      DEFAULT_OPTIONS = { :stream => false }
-
+      DEFAULT_OPTIONS = { }
+      
       # The options you gave the Encoder at creating.
-      attr_accessor :options
-
+      attr_accessor :options, :scanner
+      
       # Creates a new Encoder.
       # +options+ is saved and used for all encode operations, as long
       # as you don't overwrite it there by passing additional options.
@@ -61,153 +60,142 @@ module CodeRay
       # Encoder objects provide three encode methods:
       # - encode simply takes a +code+ string and a +lang+
       # - encode_tokens expects a +tokens+ object instead
-      # - encode_stream is like encode, but uses streaming mode.
       #
       # Each method has an optional +options+ parameter. These are
       # added to the options you passed at creation.
       def initialize options = {}
         @options = self.class::DEFAULT_OPTIONS.merge options
-        raise "I am only the basic Encoder class. I can't encode "\
-          "anything. :( Use my subclasses." if self.class == Encoder
+        @@CODERAY_TOKEN_INTERFACE_DEPRECATION_WARNING_GIVEN = false
       end
-
+      
       # Encode a Tokens object.
       def encode_tokens tokens, options = {}
         options = @options.merge options
+        @scanner = tokens.scanner if tokens.respond_to? :scanner
         setup options
         compile tokens, options
         finish options
       end
-
-      # Encode the given +code+ after tokenizing it using the Scanner
-      # for +lang+.
+      
+      # Encode the given +code+ using the Scanner for +lang+.
       def encode code, lang, options = {}
         options = @options.merge options
-        scanner_options = CodeRay.get_scanner_options(options)
-        tokens = CodeRay.scan code, lang, scanner_options
-        encode_tokens tokens, options
+        @scanner = Scanners[lang].new code, CodeRay.get_scanner_options(options).update(:tokens => self)
+        setup options
+        @scanner.tokenize
+        finish options
       end
-
+      
       # You can use highlight instead of encode, if that seems
       # more clear to you.
       alias highlight encode
-
-      # Encode the given +code+ using the Scanner for +lang+ in
-      # streaming mode.
-      def encode_stream code, lang, options = {}
-        raise NotStreamableError, self unless kind_of? Streamable
-        options = @options.merge options
-        setup options
-        scanner_options = CodeRay.get_scanner_options options
-        @token_stream =
-          CodeRay.scan_stream code, lang, scanner_options, &self
-        finish options
-      end
-
-      # Behave like a proc. The token method is converted to a proc.
-      def to_proc
-        method(:token).to_proc
-      end
-
-      # Return the default file extension for outputs of this encoder.
+      
+      # The default file extension for this encoder.
       def file_extension
-        self.class::FILE_EXTENSION
+        self.class.file_extension
       end
-
-    protected
-
-      # Called with merged options before encoding starts.
-      # Sets @out to an empty string.
-      #
-      # See the HTML Encoder for an example of option caching.
-      def setup options
-        @out = ''
+      
+      def << token
+        unless @@CODERAY_TOKEN_INTERFACE_DEPRECATION_WARNING_GIVEN
+          warn 'Using old Tokens#<< interface.'
+          @@CODERAY_TOKEN_INTERFACE_DEPRECATION_WARNING_GIVEN = true
+        end
+        self.token(*token)
       end
-
+      
       # Called with +content+ and +kind+ of the currently scanned token.
       # For simple scanners, it's enougth to implement this method.
       #
-      # By default, it calls text_token or block_token, depending on
-      # whether +content+ is a String.
+      # By default, it calls text_token, begin_group, end_group, begin_line,
+      # or end_line, depending on the +content+.
       def token content, kind
-        encoded_token =
-          if content.is_a? ::String
-            text_token content, kind
-          elsif content.is_a? ::Symbol
-            block_token content, kind
-          else
-            raise 'Unknown token content type: %p' % [content]
-          end
-        append_encoded_token_to_output encoded_token
-      end
-      
-      def append_encoded_token_to_output encoded_token
-        @out << encoded_token if encoded_token && defined?(@out) && @out
-      end
-      
-      # Called for each text token ([text, kind]), where text is a String.
-      def text_token text, kind
-      end
-      
-      # Called for each block (non-text) token ([action, kind]),
-      # where +action+ is a Symbol.
-      # 
-      # Calls open_token, close_token, begin_line, and end_line according to
-      # the value of +action+.
-      def block_token action, kind
-        case action
-        when :open
-          open_token kind
-        when :close
-          close_token kind
+        case content
+        when String
+          text_token content, kind
+        when :begin_group
+          begin_group kind
+        when :end_group
+          end_group kind
         when :begin_line
           begin_line kind
         when :end_line
           end_line kind
         else
-          raise 'unknown block action: %p' % action
+          raise ArgumentError, 'Unknown token content type: %p, kind = %p' % [content, kind]
         end
       end
       
-      # Called for each block token at the start of the block ([:open, kind]).
-      def open_token kind
+      # Called for each text token ([text, kind]), where text is a String.
+      def text_token text, kind
+        @out << text
       end
       
-      # Called for each block token end of the block ([:close, kind]).
-      def close_token kind
+      # Starts a token group with the given +kind+.
+      def begin_group kind
       end
       
-      # Called for each line token block at the start of the line ([:begin_line, kind]).
+      # Ends a token group with the given +kind+.
+      def end_group kind
+      end
+      
+      # Starts a new line token group with the given +kind+.
       def begin_line kind
       end
       
-      # Called for each line token block at the end of the line ([:end_line, kind]).
+      # Ends a line token group with the given +kind+.
       def end_line kind
       end
-
+      
+    protected
+      
+      # Called with merged options before encoding starts.
+      # Sets @out to an empty string.
+      #
+      # See the HTML Encoder for an example of option caching.
+      def setup options
+        @out = get_output(options)
+      end
+      
+      def get_output options
+        options[:out] || ''
+      end
+      
+      # Append data.to_s to the output. Returns the argument.
+      def output data
+        @out << data.to_s
+        data
+      end
+      
       # Called with merged options after encoding starts.
       # The return value is the result of encoding, typically @out.
       def finish options
         @out
       end
-
+      
       # Do the encoding.
       #
-      # The already created +tokens+ object must be used; it can be a
-      # TokenStream or a Tokens object.
-      if RUBY_VERSION >= '1.9'
-        def compile tokens, options
-          for text, kind in tokens
-            token text, kind
+      # The already created +tokens+ object must be used; it must be a
+      # Tokens object.
+      def compile tokens, options = {}
+        content = nil
+        for item in tokens
+          if item.is_a? Array
+            raise ArgumentError, 'Two-element array tokens are no longer supported.'
+          end
+          if content
+            token content, item
+            content = nil
+          else
+            content = item
           end
         end
-      else
-        def compile tokens, options
-          tokens.each(&self)
-        end
+        raise 'odd number list for Tokens' if content
       end
-
+      
+      alias tokens compile
+      public :tokens
+      
     end
-
+    
   end
 end
diff --git a/lib/coderay/encoders/_map.rb b/lib/coderay/encoders/_map.rb
index 526c3a0..4cca196 100644
--- a/lib/coderay/encoders/_map.rb
+++ b/lib/coderay/encoders/_map.rb
@@ -1,12 +1,17 @@
 module CodeRay
 module Encoders
-
+  
   map \
-    :loc => :lines_of_code,
-    :plain => :text,
-    :stats => :statistic,
-    :terminal => :term,
-    :tex => :latex
-
+    :loc             => :lines_of_code,
+    :plain           => :text,
+    :plaintext       => :text,
+    :remove_comments => :comment_filter,
+    :stats           => :statistic,
+    :term            => :terminal,
+    :tty             => :terminal,
+    :yml             => :yaml
+  
+  # No default because Tokens#nonsense should raise NoMethodError.
+  
 end
 end
diff --git a/lib/coderay/encoders/comment_filter.rb b/lib/coderay/encoders/comment_filter.rb
index 4d3fb54..28336b3 100644
--- a/lib/coderay/encoders/comment_filter.rb
+++ b/lib/coderay/encoders/comment_filter.rb
@@ -1,43 +1,25 @@
-($:.unshift '../..'; require 'coderay') unless defined? CodeRay
 module CodeRay
 module Encoders
   
-  load :token_class_filter
+  load :token_kind_filter
   
-  class CommentFilter < TokenClassFilter
+  # A simple Filter that removes all tokens of the :comment kind.
+  # 
+  # Alias: +remove_comments+
+  # 
+  # Usage:
+  #  CodeRay.scan('print # foo', :ruby).comment_filter.text
+  #  #-> "print "
+  # 
+  # See also: TokenKindFilter, LinesOfCode
+  class CommentFilter < TokenKindFilter
     
     register_for :comment_filter
     
     DEFAULT_OPTIONS = superclass::DEFAULT_OPTIONS.merge \
-      :exclude => [:comment]
+      :exclude => [:comment, :docstring]
     
   end
   
 end
 end
-
-if $0 == __FILE__
-  $VERBOSE = true
-  $: << File.join(File.dirname(__FILE__), '..')
-  eval DATA.read, nil, $0, __LINE__ + 4
-end
-
-__END__
-require 'test/unit'
-
-class CommentFilterTest < Test::Unit::TestCase
-  
-  def test_filtering_comments
-    tokens = CodeRay.scan <<-RUBY, :ruby
-#!/usr/bin/env ruby
-# a minimal Ruby program
-puts "Hello world!"
-    RUBY
-    assert_equal <<-RUBY_FILTERED, tokens.comment_filter.text
-#!/usr/bin/env ruby
-
-puts "Hello world!"
-    RUBY_FILTERED
-  end
-  
-end
\ No newline at end of file
diff --git a/lib/coderay/encoders/count.rb b/lib/coderay/encoders/count.rb
index c9a6dfd..98a427e 100644
--- a/lib/coderay/encoders/count.rb
+++ b/lib/coderay/encoders/count.rb
@@ -1,21 +1,39 @@
 module CodeRay
 module Encoders
-
+  
+  # Returns the number of tokens.
+  # 
+  # Text and block tokens are counted.
   class Count < Encoder
-
-    include Streamable
+    
     register_for :count
-
-    protected
-
+    
+  protected
+    
     def setup options
-      @out = 0
+      super
+      
+      @count = 0
     end
-
-    def token text, kind
-      @out += 1
+    
+    def finish options
+      output @count
     end
+    
+  public
+    
+    def text_token text, kind
+      @count += 1
+    end
+    
+    def begin_group kind
+      @count += 1
+    end
+    alias end_group begin_group
+    alias begin_line begin_group
+    alias end_line begin_group
+    
   end
-
+  
 end
 end
diff --git a/lib/coderay/encoders/debug.rb b/lib/coderay/encoders/debug.rb
index a4b0648..95d6138 100644
--- a/lib/coderay/encoders/debug.rb
+++ b/lib/coderay/encoders/debug.rb
@@ -1,6 +1,6 @@
 module CodeRay
 module Encoders
-
+  
   # = Debug Encoder
   #
   # Fast encoder producing simple debug output.
@@ -10,40 +10,52 @@ module Encoders
   # You cannot fully restore the tokens information from the
   # output, because consecutive :space tokens are merged.
   # Use Tokens#dump for caching purposes.
+  # 
+  # See also: Scanners::Debug
   class Debug < Encoder
-
-    include Streamable
+    
     register_for :debug
-
+    
     FILE_EXTENSION = 'raydebug'
-
-  protected
+    
+    def initialize options = {}
+      super
+      @opened = []
+    end
+    
     def text_token text, kind
       if kind == :space
-        text
+        @out << text
       else
+        # TODO: Escape (
         text = text.gsub(/[)\\]/, '\\\\\0')  # escape ) and \
-        "#{kind}(#{text})"
+        @out << kind.to_s << '(' << text << ')'
       end
     end
-
-    def open_token kind
-      "#{kind}<"
+    
+    def begin_group kind
+      @opened << kind
+      @out << kind.to_s << '<'
     end
-
-    def close_token kind
-      ">"
+    
+    def end_group kind
+      if @opened.last != kind
+        puts @out
+        raise "we are inside #{@opened.inspect}, not #{kind}"
+      end
+      @opened.pop
+      @out << '>'
     end
-
+    
     def begin_line kind
-      "#{kind}["
+      @out << kind.to_s << '['
     end
-
+    
     def end_line kind
-      "]"
+      @out << ']'
     end
-
+    
   end
-
+  
 end
 end
diff --git a/lib/coderay/encoders/div.rb b/lib/coderay/encoders/div.rb
index 4120172..efd9435 100644
--- a/lib/coderay/encoders/div.rb
+++ b/lib/coderay/encoders/div.rb
@@ -1,19 +1,23 @@
 module CodeRay
 module Encoders
-
+  
   load :html
-
+  
+  # Wraps HTML output into a DIV element, using inline styles by default.
+  # 
+  # See Encoders::HTML for available options.
   class Div < HTML
-
+    
     FILE_EXTENSION = 'div.html'
-
+    
     register_for :div
-
+    
     DEFAULT_OPTIONS = HTML::DEFAULT_OPTIONS.merge \
-      :css => :style,
-      :wrap => :div
-
+      :css          => :style,
+      :wrap         => :div,
+      :line_numbers => false
+    
   end
-
+  
 end
 end
diff --git a/lib/coderay/encoders/filter.rb b/lib/coderay/encoders/filter.rb
index 5e4b34d..e7f34d6 100644
--- a/lib/coderay/encoders/filter.rb
+++ b/lib/coderay/encoders/filter.rb
@@ -1,75 +1,58 @@
-($:.unshift '../..'; require 'coderay') unless defined? CodeRay
 module CodeRay
 module Encoders
   
+  # A Filter encoder has another Tokens instance as output.
+  # It can be subclassed to select, remove, or modify tokens in the stream.
+  # 
+  # Subclasses of Filter are called "Filters" and can be chained.
+  # 
+  # == Options
+  # 
+  # === :tokens
+  # 
+  # The Tokens object which will receive the output.
+  # 
+  # Default: Tokens.new
+  # 
+  # See also: TokenKindFilter
   class Filter < Encoder
     
     register_for :filter
     
   protected
     def setup options
-      @out = Tokens.new
+      super
+      
+      @tokens = options[:tokens] || Tokens.new
     end
     
-    def text_token text, kind
-      [text, kind] if include_text_token? text, kind
+    def finish options
+      output @tokens
     end
     
-    def include_text_token? text, kind
-      true
-    end
+  public
     
-    def block_token action, kind
-      [action, kind] if include_block_token? action, kind
+    def text_token text, kind  # :nodoc:
+      @tokens.text_token text, kind
     end
     
-    def include_block_token? action, kind
-      true
+    def begin_group kind  # :nodoc:
+      @tokens.begin_group kind
     end
     
-  end
-  
-end
-end
-
-if $0 == __FILE__
-  $VERBOSE = true
-  $: << File.join(File.dirname(__FILE__), '..')
-  eval DATA.read, nil, $0, __LINE__ + 4
-end
-
-__END__
-require 'test/unit'
-
-class FilterTest < Test::Unit::TestCase
-  
-  def test_creation
-    assert CodeRay::Encoders::Filter < CodeRay::Encoders::Encoder
-    filter = nil
-    assert_nothing_raised do
-      filter = CodeRay.encoder :filter
+    def begin_line kind  # :nodoc:
+      @tokens.begin_line kind
     end
-    assert_kind_of CodeRay::Encoders::Encoder, filter
-  end
-  
-  def test_filtering_text_tokens
-    tokens = CodeRay::Tokens.new
-    10.times do |i|
-      tokens << [i.to_s, :index]
+    
+    def end_group kind  # :nodoc:
+      @tokens.end_group kind
     end
-    assert_equal tokens, CodeRay::Encoders::Filter.new.encode_tokens(tokens)
-    assert_equal tokens, tokens.filter
-  end
-  
-  def test_filtering_block_tokens
-    tokens = CodeRay::Tokens.new
-    10.times do |i|
-      tokens << [:open, :index]
-      tokens << [i.to_s, :content]
-      tokens << [:close, :index]
+    
+    def end_line kind  # :nodoc:
+      @tokens.end_line kind
     end
-    assert_equal tokens, CodeRay::Encoders::Filter.new.encode_tokens(tokens)
-    assert_equal tokens, tokens.filter
+    
   end
   
 end
+end
diff --git a/lib/coderay/encoders/html.rb b/lib/coderay/encoders/html.rb
index 585ddbb..c32dbd1 100644
--- a/lib/coderay/encoders/html.rb
+++ b/lib/coderay/encoders/html.rb
@@ -2,7 +2,7 @@ require 'set'
 
 module CodeRay
 module Encoders
-
+  
   # = HTML Encoder
   #
   # This is CodeRay's most important highlighter:
@@ -21,12 +21,12 @@ module Encoders
   #    :line_numbers => :inline,
   #    :css => :style
   #  )
-  #  #-> <span class="no">1</span>  <span style="color:#036; font-weight:bold;">Some</span> code
   #
   # == Options
   #
   # === :tab_width
   # Convert \t characters to +n+ spaces (a number.)
+  # 
   # Default: 8
   #
   # === :css
@@ -48,10 +48,18 @@ module Encoders
   # Default: 'CodeRay output'
   #
   # === :line_numbers
-  # Include line numbers in :table, :inline, :list or nil (no line numbers)
+  # Include line numbers in :table, :inline, or nil (no line numbers)
   #
   # Default: nil
   #
+  # === :line_number_anchors
+  # Adds anchors and links to the line numbers. Can be false (off), true (on),
+  # or a prefix string that will be prepended to the anchor name.
+  #
+  # The prefix must consist only of letters, digits, and underscores.
+  #
+  # Default: true, default prefix name: "line"
+  #
   # === :line_number_start
   # Where to start with line number counting.
   #
@@ -74,47 +82,48 @@ module Encoders
   #
   # === :hint
   # Include some information into the output using the title attribute.
-  # Can be :info (show token type on mouse-over), :info_long (with full path)
+  # Can be :info (show token kind on mouse-over), :info_long (with full path)
   # or :debug (via inspect).
   #
   # Default: false
   class HTML < Encoder
-
-    include Streamable
+    
     register_for :html
-
-    FILE_EXTENSION = 'html'
-
+    
+    FILE_EXTENSION = 'snippet.html'
+    
     DEFAULT_OPTIONS = {
       :tab_width => 8,
-
-      :css => :class,
-
-      :style => :cycnus,
-      :wrap => nil,
+      
+      :css   => :class,
+      :style => :alpha,
+      :wrap  => nil,
       :title => 'CodeRay output',
-
-      :line_numbers => nil,
-      :line_number_start => 1,
-      :bold_every => 10,
-      :highlight_lines => nil,
-
+      
+      :line_numbers        => nil,
+      :line_number_anchors => 'n',
+      :line_number_start   => 1,
+      :bold_every          => 10,
+      :highlight_lines     => nil,
+      
       :hint => false,
     }
-
-    helper :output, :css
-
+    
+    autoload :Output,    CodeRay.coderay_path('encoders', 'html', 'output')
+    autoload :CSS,       CodeRay.coderay_path('encoders', 'html', 'css')
+    autoload :Numbering, CodeRay.coderay_path('encoders', 'html', 'numbering')
+    
     attr_reader :css
-
+    
   protected
-
+    
     HTML_ESCAPE = {  #:nodoc:
       '&' => '&amp;',
       '"' => '&quot;',
       '>' => '&gt;',
       '<' => '&lt;',
     }
-
+    
     # This was to prevent illegal HTML.
     # Strange chars should still be avoided in codes.
     evil_chars = Array(0x00...0x20) - [?\n, ?\t, ?\s]
@@ -124,185 +133,170 @@ module Encoders
     # \x9 (\t) and \xA (\n) not included
     #HTML_ESCAPE_PATTERN = /[\t&"><\0-\x8\xB-\x1f\x7f-\xff]/
     HTML_ESCAPE_PATTERN = /[\t"&><\0-\x8\xB-\x1f]/
-
-    TOKEN_KIND_TO_INFO = Hash.new { |h, kind|
-      h[kind] =
-        case kind
-        when :pre_constant
-          'Predefined constant'
-        else
-          kind.to_s.gsub(/_/, ' ').gsub(/\b\w/) { $&.capitalize }
-        end
-    }
-
-    TRANSPARENT_TOKEN_KINDS = [
+    
+    TOKEN_KIND_TO_INFO = Hash.new do |h, kind|
+      h[kind] = kind.to_s.gsub(/_/, ' ').gsub(/\b\w/) { $&.capitalize }
+    end
+    
+    TRANSPARENT_TOKEN_KINDS = Set[
       :delimiter, :modifier, :content, :escape, :inline_delimiter,
-    ].to_set
-
-    # Generate a hint about the given +classes+ in a +hint+ style.
+    ]
+    
+    # Generate a hint about the given +kinds+ in a +hint+ style.
     #
     # +hint+ may be :info, :info_long or :debug.
-    def self.token_path_to_hint hint, classes
+    def self.token_path_to_hint hint, kinds
+      kinds = Array kinds
       title =
         case hint
         when :info
-          TOKEN_KIND_TO_INFO[classes.first]
+          kinds = kinds[1..-1] if TRANSPARENT_TOKEN_KINDS.include? kinds.first
+          TOKEN_KIND_TO_INFO[kinds.first]
         when :info_long
-          classes.reverse.map { |kind| TOKEN_KIND_TO_INFO[kind] }.join('/')
+          kinds.reverse.map { |kind| TOKEN_KIND_TO_INFO[kind] }.join('/')
         when :debug
-          classes.inspect
+          kinds.inspect
         end
       title ? " title=\"#{title}\"" : ''
     end
-
+    
     def setup options
       super
-
+      
+      if options[:wrap] || options[:line_numbers]
+        @real_out = @out
+        @out = ''
+      end
+      
       @HTML_ESCAPE = HTML_ESCAPE.dup
       @HTML_ESCAPE["\t"] = ' ' * options[:tab_width]
-
-      @opened = [nil]
+      
+      @opened = []
+      @last_opened = nil
       @css = CSS.new options[:style]
-
+      
       hint = options[:hint]
-      if hint and not [:debug, :info, :info_long].include? hint
+      if hint && ![:debug, :info, :info_long].include?(hint)
         raise ArgumentError, "Unknown value %p for :hint; \
-          expected :info, :debug, false, or nil." % hint
+          expected :info, :info_long, :debug, false, or nil." % hint
       end
-
+      
+      css_classes = TokenKinds
       case options[:css]
-
       when :class
-        @css_style = Hash.new do |h, k|
-          c = CodeRay::Tokens::ClassOfKind[k.first]
-          if c == :NO_HIGHLIGHT and not hint
-            h[k.dup] = false
-          else
-            title = if hint
-              HTML.token_path_to_hint(hint, k[1..-1] << k.first)
-            else
-              ''
-            end
-            if c == :NO_HIGHLIGHT
-              h[k.dup] = '<span%s>' % [title]
-            else
-              h[k.dup] = '<span%s class="%s">' % [title, c]
-            end
-          end
-        end
-
-      when :style
-        @css_style = Hash.new do |h, k|
-          if k.is_a? ::Array
-            styles = k.dup
+        @span_for_kind = Hash.new do |h, k|
+          if k.is_a? ::Symbol
+            kind = k_dup = k
           else
-            styles = [k]
+            kind = k.first
+            k_dup = k.dup
           end
-          classes = styles.map { |c| Tokens::ClassOfKind[c] }
-          if classes.first == :NO_HIGHLIGHT and not hint
-            h[k] = false
+          if kind != :space && (hint || css_class = css_classes[kind])
+            title = HTML.token_path_to_hint hint, k if hint
+            css_class ||= css_classes[kind]
+            h[k_dup] = "<span#{title}#{" class=\"#{css_class}\"" if css_class}>"
           else
-            styles.shift if TRANSPARENT_TOKEN_KINDS.include? styles.first
-            title = HTML.token_path_to_hint hint, styles
-            style = @css[*classes]
-            h[k] =
-              if style
-                '<span%s style="%s">' % [title, style]
-              else
-                false
-              end
+            h[k_dup] = nil
           end
         end
-
+      when :style
+        @span_for_kind = Hash.new do |h, k|
+          kind = k.is_a?(Symbol) ? k : k.first
+          h[k.is_a?(Symbol) ? k : k.dup] =
+            if kind != :space && (hint || css_classes[kind])
+              title = HTML.token_path_to_hint hint, k if hint
+              style = @css.get_style Array(k).map { |c| css_classes[c] }
+              "<span#{title}#{" style=\"#{style}\"" if style}>"
+            end
+        end
       else
         raise ArgumentError, "Unknown value %p for :css." % options[:css]
-
       end
+      
+      @set_last_opened = options[:hint] || options[:css] == :style
     end
-
+    
     def finish options
-      @opened.shift
-      @out << '</span>' * @opened.size
       unless @opened.empty?
-        warn '%d tokens still open: %p' % [@opened.size, @opened]
+        warn '%d tokens still open: %p' % [@opened.size, @opened] if $CODERAY_DEBUG
+        @out << '</span>' while @opened.pop
+        @last_opened = nil
       end
-
+      
       @out.extend Output
       @out.css = @css
-      @out.numerize! options[:line_numbers], options
+      if options[:line_numbers]
+        Numbering.number! @out, options[:line_numbers], options
+      end
       @out.wrap! options[:wrap]
       @out.apply_title! options[:title]
-
-      super
-    end
-
-    def token text, type = :plain
-      case text
       
-      when nil
-        # raise 'Token with nil as text was given: %p' % [[text, type]] 
-      
-      when String
-        if text =~ /#{HTML_ESCAPE_PATTERN}/o
-          text = text.gsub(/#{HTML_ESCAPE_PATTERN}/o) { |m| @HTML_ESCAPE[m] }
-        end
-        @opened[0] = type
-        if text != "\n" && style = @css_style[@opened]
-          @out << style << text << '</span>'
-        else
-          @out << text
-        end
-        
-      
-      # token groups, eg. strings
-      when :open
-        @opened[0] = type
-        @out << (@css_style[@opened] || '<span>')
-        @opened << type
-      when :close
-        if @opened.empty?
-          # nothing to close
-        else
-          if $CODERAY_DEBUG and (@opened.size == 1 or @opened.last != type)
-            raise 'Malformed token stream: Trying to close a token (%p) \
-              that is not open. Open are: %p.' % [type, @opened[1..-1]]
-          end
-          @out << '</span>'
-          @opened.pop
-        end
+      if defined?(@real_out) && @real_out
+        @real_out << @out
+        @out = @real_out
+      end
       
-      # whole lines to be highlighted, eg. a deleted line in a diff
-      when :begin_line
-        @opened[0] = type
-        if style = @css_style[@opened]
-          if style['class="']
-            @out << style.sub('class="', 'class="line ')
-          else
-            @out << style.sub('>', ' class="line">')
-          end
-        else
-          @out << '<span class="line">'
-        end
-        @opened << type
-      when :end_line
-        if @opened.empty?
-          # nothing to close
+      super
+    end
+    
+  public
+    
+    def text_token text, kind
+      if text =~ /#{HTML_ESCAPE_PATTERN}/o
+        text = text.gsub(/#{HTML_ESCAPE_PATTERN}/o) { |m| @HTML_ESCAPE[m] }
+      end
+      if style = @span_for_kind[@last_opened ? [kind, *@opened] : kind]
+        @out << style << text << '</span>'
+      else
+        @out << text
+      end
+    end
+    
+    # token groups, eg. strings
+    def begin_group kind
+      @out << (@span_for_kind[@last_opened ? [kind, *@opened] : kind] || '<span>')
+      @opened << kind
+      @last_opened = kind if @set_last_opened
+    end
+    
+    def end_group kind
+      if $CODERAY_DEBUG && (@opened.empty? || @opened.last != kind)
+        warn 'Malformed token stream: Trying to close a token (%p) ' \
+          'that is not open. Open are: %p.' % [kind, @opened[1..-1]]
+      end
+      if @opened.pop
+        @out << '</span>'
+        @last_opened = @opened.last if @last_opened
+      end
+    end
+    
+    # whole lines to be highlighted, eg. a deleted line in a diff
+    def begin_line kind
+      if style = @span_for_kind[@last_opened ? [kind, *@opened] : kind]
+        if style['class="']
+          @out << style.sub('class="', 'class="line ')
         else
-          if $CODERAY_DEBUG and (@opened.size == 1 or @opened.last != type)
-            raise 'Malformed token stream: Trying to close a line (%p) \
-              that is not open. Open are: %p.' % [type, @opened[1..-1]]
-          end
-          @out << '</span>'
-          @opened.pop
+          @out << style.sub('>', ' class="line">')
         end
-      
       else
-        raise 'unknown token kind: %p' % [text]
-        
+        @out << '<span class="line">'
       end
+      @opened << kind
+      @last_opened = kind if @options[:css] == :style
     end
-
+    
+    def end_line kind
+      if $CODERAY_DEBUG && (@opened.empty? || @opened.last != kind)
+        warn 'Malformed token stream: Trying to close a line (%p) ' \
+          'that is not open. Open are: %p.' % [kind, @opened[1..-1]]
+      end
+      if @opened.pop
+        @out << '</span>'
+        @last_opened = @opened.last if @last_opened
+      end
+    end
+    
   end
-
+  
 end
 end
diff --git a/lib/coderay/encoders/html/css.rb b/lib/coderay/encoders/html/css.rb
index 09ac8bc..6de4b46 100644
--- a/lib/coderay/encoders/html/css.rb
+++ b/lib/coderay/encoders/html/css.rb
@@ -2,7 +2,7 @@ module CodeRay
 module Encoders
 
   class HTML
-    class CSS
+    class CSS  # :nodoc:
 
       attr :stylesheet
 
@@ -20,11 +20,11 @@ module Encoders
         parse style::TOKEN_COLORS
       end
 
-      def [] *styles
+      def get_style styles
         cl = @classes[styles.first]
         return '' unless cl
         style = ''
-        1.upto(styles.size) do |offset|
+        1.upto styles.size do |offset|
           break if style = cl[styles[offset .. -1]]
         end
         # warn 'Style not found: %p' % [styles] if style.empty?
@@ -44,7 +44,7 @@ module Encoders
         ( [^\}]+ )?          # $2 = style
         \s* \} \s*
       |
-        ( . )                # $3 = error
+        ( [^\n]+ )           # $3 = error
       /mx
       def parse stylesheet
         stylesheet.scan CSS_CLASS_PATTERN do |selectors, style, error|
@@ -63,8 +63,3 @@ module Encoders
 
 end
 end
-
-if $0 == __FILE__
-  require 'pp'
-  pp CodeRay::Encoders::HTML::CSS.new
-end
diff --git a/lib/coderay/encoders/html/numbering.rb b/lib/coderay/encoders/html/numbering.rb
new file mode 100644
index 0000000..15ce11b
--- /dev/null
+++ b/lib/coderay/encoders/html/numbering.rb
@@ -0,0 +1,115 @@
+module CodeRay
+module Encoders
+
+  class HTML
+
+    module Numbering  # :nodoc:
+
+      def self.number! output, mode = :table, options = {}
+        return self unless mode
+
+        options = DEFAULT_OPTIONS.merge options
+
+        start = options[:line_number_start]
+        unless start.is_a? Integer
+          raise ArgumentError, "Invalid value %p for :line_number_start; Integer expected." % start
+        end
+        
+        anchor_prefix = options[:line_number_anchors]
+        anchor_prefix = 'line' if anchor_prefix == true
+        anchor_prefix = anchor_prefix.to_s[/\w+/] if anchor_prefix
+        anchoring =
+          if anchor_prefix
+            proc do |line|
+              line = line.to_s
+              anchor = anchor_prefix + line
+              "<a href=\"##{anchor}\" name=\"#{anchor}\">#{line}</a>"
+            end
+          else
+            proc { |line| line.to_s }  # :to_s.to_proc in Ruby 1.8.7+
+          end
+        
+        bold_every = options[:bold_every]
+        highlight_lines = options[:highlight_lines]
+        bolding =
+          if bold_every == false && highlight_lines == nil
+            anchoring
+          elsif highlight_lines.is_a? Enumerable
+            highlight_lines = highlight_lines.to_set
+            proc do |line|
+              if highlight_lines.include? line
+                "<strong class=\"highlighted\">#{anchoring[line]}</strong>"  # highlighted line numbers in bold
+              else
+                anchoring[line]
+              end
+            end
+          elsif bold_every.is_a? Integer
+            raise ArgumentError, ":bolding can't be 0." if bold_every == 0
+            proc do |line|
+              if line % bold_every == 0
+                "<strong>#{anchoring[line]}</strong>"  # every bold_every-th number in bold
+              else
+                anchoring[line]
+              end
+            end
+          else
+            raise ArgumentError, 'Invalid value %p for :bolding; false or Integer expected.' % bold_every
+          end
+        
+        line_count = output.count("\n")
+        position_of_last_newline = output.rindex(RUBY_VERSION >= '1.9' ? /\n/ : ?\n)
+        if position_of_last_newline
+          after_last_newline = output[position_of_last_newline + 1 .. -1]
+          ends_with_newline = after_last_newline[/\A(?:<\/span>)*\z/]
+          line_count += 1 if not ends_with_newline
+        end
+        
+        case mode
+        when :inline
+          max_width = (start + line_count).to_s.size
+          line_number = start
+          nesting = []
+          output.gsub!(/^.*$\n?/) do |line|
+            line.chomp!
+            open = nesting.join
+            line.scan(%r!<(/)?span[^>]*>?!) do |close,|
+              if close
+                nesting.pop
+              else
+                nesting << $&
+              end
+            end
+            close = '</span>' * nesting.size
+            
+            line_number_text = bolding.call line_number
+            indent = ' ' * (max_width - line_number.to_s.size)  # TODO: Optimize (10^x)
+            line_number += 1
+            "<span class=\"line-numbers\">#{indent}#{line_number_text}</span>#{open}#{line}#{close}\n"
+          end
+
+        when :table
+          line_numbers = (start ... start + line_count).map(&bolding).join("\n")
+          line_numbers << "\n"
+          line_numbers_table_template = Output::TABLE.apply('LINE_NUMBERS', line_numbers)
+
+          output.gsub!(/<\/div>\n/, '</div>')
+          output.wrap_in! line_numbers_table_template
+          output.wrapped_in = :div
+
+        when :list
+          raise NotImplementedError, 'The :list option is no longer available. Use :table.'
+
+        else
+          raise ArgumentError, 'Unknown value %p for mode: expected one of %p' %
+            [mode, [:table, :inline]]
+        end
+
+        output
+      end
+
+    end
+
+  end
+
+end
+end
diff --git a/lib/coderay/encoders/html/numerization.rb b/lib/coderay/encoders/html/numerization.rb
deleted file mode 100644
index 17e8ddb..0000000
--- a/lib/coderay/encoders/html/numerization.rb
+++ /dev/null
@@ -1,133 +0,0 @@
-module CodeRay
-module Encoders
-
-  class HTML
-
-    module Output
-
-      def numerize *args
-        clone.numerize!(*args)
-      end
-
-=begin      NUMERIZABLE_WRAPPINGS = {
-        :table => [:div, :page, nil],
-        :inline => :all,
-        :list => [:div, :page, nil]
-      }
-      NUMERIZABLE_WRAPPINGS.default = :all
-=end
-      def numerize! mode = :table, options = {}
-        return self unless mode
-
-        options = DEFAULT_OPTIONS.merge options
-
-        start = options[:line_number_start]
-        unless start.is_a? Integer
-          raise ArgumentError, "Invalid value %p for :line_number_start; Integer expected." % start
-        end
-
-        #allowed_wrappings = NUMERIZABLE_WRAPPINGS[mode]
-        #unless allowed_wrappings == :all or allowed_wrappings.include? options[:wrap]
-        #  raise ArgumentError, "Can't numerize, :wrap must be in %p, but is %p" % [NUMERIZABLE_WRAPPINGS, options[:wrap]]
-        #end
-
-        bold_every = options[:bold_every]
-        highlight_lines = options[:highlight_lines]
-        bolding =
-          if bold_every == false && highlight_lines == nil
-            proc { |line| line.to_s }
-          elsif highlight_lines.is_a? Enumerable
-            highlight_lines = highlight_lines.to_set
-            proc do |line|
-              if highlight_lines.include? line
-                "<strong class=\"highlighted\">#{line}</strong>"  # highlighted line numbers in bold
-              else
-                line.to_s
-              end
-            end
-          elsif bold_every.is_a? Integer
-            raise ArgumentError, ":bolding can't be 0." if bold_every == 0
-            proc do |line|
-              if line % bold_every == 0
-                "<strong>#{line}</strong>"  # every bold_every-th number in bold
-              else
-                line.to_s
-              end
-            end
-          else
-            raise ArgumentError, 'Invalid value %p for :bolding; false or Integer expected.' % bold_every
-          end
-
-        case mode
-        when :inline
-          max_width = (start + line_count).to_s.size
-          line_number = start
-          gsub!(/^/) do
-            line_number_text = bolding.call line_number
-            indent = ' ' * (max_width - line_number.to_s.size)  # TODO: Optimize (10^x)
-            res = "<span class=\"no\">#{indent}#{line_number_text}</span> "
-            line_number += 1
-            res
-          end
-
-        when :table
-          # This is really ugly.
-          # Because even monospace fonts seem to have different heights when bold,
-          # I make the newline bold, both in the code and the line numbers.
-          # FIXME Still not working perfect for Mr. Internet Exploder
-          line_numbers = (start ... start + line_count).to_a.map(&bolding).join("\n")
-          line_numbers << "\n"  # also for Mr. MS Internet Exploder :-/
-          line_numbers.gsub!(/\n/) { "<tt>\n</tt>" }
-
-          line_numbers_table_tpl = TABLE.apply('LINE_NUMBERS', line_numbers)
-          gsub!("</div>\n", '</div>')
-          gsub!("\n", "<tt>\n</tt>")
-          wrap_in! line_numbers_table_tpl
-          @wrapped_in = :div
-
-        when :list
-          opened_tags = []
-          gsub!(/^.*$\n?/) do |line|
-            line.chomp!
-
-            open = opened_tags.join
-            line.scan(%r!<(/)?span[^>]*>?!) do |close,|
-              if close
-                opened_tags.pop
-              else
-                opened_tags << $&
-              end
-            end
-            close = '</span>' * opened_tags.size
-
-            "<li>#{open}#{line}#{close}</li>\n"
-          end
-          chomp!("\n")
-          wrap_in! LIST
-          @wrapped_in = :div
-
-        else
-          raise ArgumentError, 'Unknown value %p for mode: expected one of %p' %
-            [mode, [:table, :list, :inline]]
-        end
-
-        self
-      end
-
-      def line_count
-        line_count = count("\n")
-        position_of_last_newline = rindex(?\n)
-        if position_of_last_newline
-          after_last_newline = self[position_of_last_newline + 1 .. -1]
-          ends_with_newline = after_last_newline[/\A(?:<\/span>)*\z/]
-          line_count += 1 if not ends_with_newline
-        end
-        line_count
-      end
-
-    end
-
-  end
-
-end
-end
diff --git a/lib/coderay/encoders/html/output.rb b/lib/coderay/encoders/html/output.rb
index 28574a5..9132d94 100644
--- a/lib/coderay/encoders/html/output.rb
+++ b/lib/coderay/encoders/html/output.rb
@@ -3,44 +3,29 @@ module Encoders
 
   class HTML
 
-    # This module is included in the output String from thew HTML Encoder.
+    # This module is included in the output String of the HTML Encoder.
     #
     # It provides methods like wrap, div, page etc.
     #
     # Remember to use #clone instead of #dup to keep the modules the object was
     # extended with.
     #
-    # TODO: more doc.
+    # TODO: Rewrite this without monkey patching.
     module Output
 
-      require 'coderay/encoders/html/numerization.rb'
-
       attr_accessor :css
 
       class << self
 
-        # This makes Output look like a class.
-        #
-        # Example:
-        #
-        #  a = Output.new '<span class="co">Code</span>'
-        #  a.wrap! :page
-        def new string, css = CSS.new, element = nil
-          output = string.clone.extend self
-          output.wrapped_in = element
-          output.css = css
-          output
-        end
-
         # Raises an exception if an object that doesn't respond to to_str is extended by Output,
         # to prevent users from misuse. Use Module#remove_method to disable.
-        def extended o
+        def extended o  # :nodoc:
           warn "The Output module is intended to extend instances of String, not #{o.class}." unless o.respond_to? :to_str
         end
 
-        def make_stylesheet css, in_tag = false
+        def make_stylesheet css, in_tag = false  # :nodoc:
           sheet = css.stylesheet
-          sheet = <<-CSS if in_tag
+          sheet = <<-'CSS' if in_tag
 <style type="text/css">
 #{sheet}
 </style>
@@ -48,27 +33,13 @@ module Encoders
           sheet
         end
 
-        def page_template_for_css css
+        def page_template_for_css css  # :nodoc:
           sheet = make_stylesheet css
           PAGE.apply 'CSS', sheet
         end
 
-        # Define a new wrapper. This is meta programming.
-        def wrapper *wrappers
-          wrappers.each do |wrapper|
-            define_method wrapper do |*args|
-              wrap wrapper, *args
-            end
-            define_method "#{wrapper}!".to_sym do |*args|
-              wrap! wrapper, *args
-            end
-          end
-        end
-
       end
 
-      wrapper :div, :span, :page
-
       def wrapped_in? element
         wrapped_in == element
       end
@@ -78,10 +49,6 @@ module Encoders
       end
       attr_writer :wrapped_in
 
-      def wrap_in template
-        clone.wrap_in! template
-      end
-
       def wrap_in! template
         Template.wrap! self, template, 'CONTENT'
         self
@@ -118,15 +85,13 @@ module Encoders
         self
       end
 
-      def wrap *args
-        clone.wrap!(*args)
-      end
-
       def stylesheet in_tag = false
         Output.make_stylesheet @css, in_tag
       end
 
-      class Template < String
+#-- don't include the templates in docu
+
+      class Template < String  # :nodoc:
 
         def self.wrap! str, template, target
           target = Regexp.new(Regexp.escape("<%#{target}%>"))
@@ -147,51 +112,46 @@ module Encoders
           end
         end
 
-        module Simple
-          def ` str  #` <-- for stupid editors
-            Template.new str
-          end
-        end
       end
 
-      extend Template::Simple
+      SPAN = Template.new '<span class="CodeRay"><%CONTENT%></span>'
 
-#-- don't include the templates in docu
-
-      SPAN = `<span class="CodeRay"><%CONTENT%></span>`
-
-      DIV = <<-`DIV`
+      DIV = Template.new <<-DIV
 <div class="CodeRay">
   <div class="code"><pre><%CONTENT%></pre></div>
 </div>
       DIV
 
-      TABLE = <<-`TABLE`
+      TABLE = Template.new <<-TABLE
 <table class="CodeRay"><tr>
-  <td class="line_numbers" title="click to toggle" onclick="with (this.firstChild.style) { display = (display == '') ? 'none' : '' }"><pre><%LINE_NUMBERS%></pre></td>
-  <td class="code"><pre ondblclick="with (this.style) { overflow = (overflow == 'auto' || overflow == '') ? 'visible' : 'auto' }"><%CONTENT%></pre></td>
+  <td class="line-numbers" title="double click to toggle" ondblclick="with (this.firstChild.style) { display = (display == '') ? 'none' : '' }"><pre><%LINE_NUMBERS%></pre></td>
+  <td class="code"><pre><%CONTENT%></pre></td>
 </tr></table>
       TABLE
-      # title="double click to expand"
-
-      LIST = <<-`LIST`
-<ol class="CodeRay">
-<%CONTENT%>
-</ol>
-      LIST
 
-      PAGE = <<-`PAGE`
-<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
-  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
-<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="de">
+      PAGE = Template.new <<-PAGE
+<!DOCTYPE html>
+<html>
 <head>
-  <meta http-equiv="content-type" content="text/html; charset=utf-8" />
+  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
   <title></title>
   <style type="text/css">
+.CodeRay .line-numbers a {
+  text-decoration: inherit;
+  color: inherit;
+}
+body {
+  background-color: white;
+  padding: 0;
+  margin: 0;
+}
 <%CSS%>
+.CodeRay {
+  border: none;
+}
   </style>
 </head>
-<body style="background-color: white;">
+<body>
 
 <%CONTENT%>
 </body>
diff --git a/lib/coderay/encoders/json.rb b/lib/coderay/encoders/json.rb
index 7aa077c..a9e40dc 100644
--- a/lib/coderay/encoders/json.rb
+++ b/lib/coderay/encoders/json.rb
@@ -1,69 +1,83 @@
-($:.unshift '../..'; require 'coderay') unless defined? CodeRay
 module CodeRay
 module Encoders
   
-  # = JSON Encoder
+  # A simple JSON Encoder.
+  # 
+  # Example:
+  #  CodeRay.scan('puts "Hello world!"', :ruby).json
+  # yields
+  #  [
+  #    {"type"=>"text", "text"=>"puts", "kind"=>"ident"},
+  #    {"type"=>"text", "text"=>" ", "kind"=>"space"},
+  #    {"type"=>"block", "action"=>"open", "kind"=>"string"},
+  #    {"type"=>"text", "text"=>"\"", "kind"=>"delimiter"},
+  #    {"type"=>"text", "text"=>"Hello world!", "kind"=>"content"},
+  #    {"type"=>"text", "text"=>"\"", "kind"=>"delimiter"},
+  #    {"type"=>"block", "action"=>"close", "kind"=>"string"},
+  #  ]
   class JSON < Encoder
     
+    begin
+      require 'json'
+    rescue LoadError
+      begin
+        require 'rubygems' unless defined? Gem
+        gem 'json'
+        require 'json'
+      rescue LoadError
+        $stderr.puts "The JSON encoder needs the JSON library.\n" \
+          "Please gem install json."
+        raise
+      end
+    end
+    
     register_for :json
     FILE_EXTENSION = 'json'
     
   protected
     def setup options
-      begin
-        require 'json'
-      rescue LoadError
-        require 'rubygems'
-        require 'json'
+      super
+      
+      @first = true
+      @out << '['
+    end
+    
+    def finish options
+      @out << ']'
+    end
+    
+    def append data
+      if @first
+        @first = false
+      else
+        @out << ','
       end
-      @out = []
+      
+      @out << data.to_json
     end
     
+  public
     def text_token text, kind
-      { :type => 'text', :text => text, :kind => kind }
+      append :type => 'text', :text => text, :kind => kind
     end
     
-    def block_token action, kind
-      { :type => 'block', :action => action, :kind => kind }
+    def begin_group kind
+      append :type => 'block', :action => 'open', :kind => kind
     end
     
-    def finish options
-      @out.to_json
+    def end_group kind
+      append :type => 'block', :action => 'close', :kind => kind
+    end
+    
+    def begin_line kind
+      append :type => 'block', :action => 'begin_line', :kind => kind
+    end
+    
+    def end_line kind
+      append :type => 'block', :action => 'end_line', :kind => kind
     end
     
   end
   
 end
 end
-
-if $0 == __FILE__
-  $VERBOSE = true
-  $: << File.join(File.dirname(__FILE__), '..')
-  eval DATA.read, nil, $0, __LINE__ + 4
-end
-
-__END__
-require 'test/unit'
-$:.delete '.'
-require 'rubygems' if RUBY_VERSION < '1.9'
-
-class JSONEncoderTest < Test::Unit::TestCase
-  
-  def test_json_output
-    tokens = CodeRay.scan <<-RUBY, :ruby
-puts "Hello world!"
-    RUBY
-    require 'json'
-    assert_equal [
-      {"type"=>"text", "text"=>"puts", "kind"=>"ident"},
-      {"type"=>"text", "text"=>" ", "kind"=>"space"},
-      {"type"=>"block", "action"=>"open", "kind"=>"string"},
-      {"type"=>"text", "text"=>"\"", "kind"=>"delimiter"},
-      {"type"=>"text", "text"=>"Hello world!", "kind"=>"content"},
-      {"type"=>"text", "text"=>"\"", "kind"=>"delimiter"},
-      {"type"=>"block", "action"=>"close", "kind"=>"string"},
-      {"type"=>"text", "text"=>"\n", "kind"=>"space"}
-    ], JSON.load(tokens.json)
-  end
-  
-end
\ No newline at end of file
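
Note on the json.rb change above: the new encoder no longer collects tokens in an Array and serializes them in finish. It writes '[' in setup, a comma before every element except the first, and ']' in finish. Below is a minimal standalone sketch of that technique. The class name StreamingJSONArray is a hypothetical illustration and is not part of CodeRay; only the json standard library is assumed.

```ruby
require 'json'

# Hypothetical helper (not part of CodeRay): builds a JSON array
# incrementally, mirroring the encoder's setup / append / finish steps.
class StreamingJSONArray
  def initialize
    @out = '['.dup   # unfrozen buffer, like the encoder's @out
    @first = true
  end

  # Append one element, writing a comma before all but the first.
  def append data
    if @first
      @first = false
    else
      @out << ','
    end
    @out << data.to_json
    self
  end

  # Close the array and return the JSON text.
  def finish
    @out << ']'
  end
end

stream = StreamingJSONArray.new
stream.append :type => 'text', :text => 'puts', :kind => :ident
stream.append :type => 'block', :action => 'open', :kind => :string
json_text = stream.finish
parsed = JSON.parse(json_text)
```

Writing the separators by hand keeps each token's serialization independent, so output can be produced as tokens arrive instead of after the whole token list has been built.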
diff --git a/lib/coderay/encoders/lines_of_code.rb b/lib/coderay/encoders/lines_of_code.rb
index c1ad66e..5f8422f 100644
--- a/lib/coderay/encoders/lines_of_code.rb
+++ b/lib/coderay/encoders/lines_of_code.rb
@@ -1,10 +1,9 @@
-($:.unshift '../..'; require 'coderay') unless defined? CodeRay
 module CodeRay
 module Encoders
   
   # Counts the LoC (Lines of Code). Returns an Integer >= 0.
   # 
-  # Alias: :loc
+  # Alias: +loc+
   # 
   # Everything that is not comment, markup, doctype/shebang, or an empty line,
   # is considered to be code.
@@ -15,76 +14,32 @@ module Encoders
   # 
   # A Scanner class should define the token kinds that are not code in the
   # KINDS_NOT_LOC constant, which defaults to [:comment, :doctype].
-  class LinesOfCode < Encoder
+  class LinesOfCode < TokenKindFilter
     
     register_for :lines_of_code
     
     NON_EMPTY_LINE = /^\s*\S.*$/
     
-    def compile tokens, options
-      if scanner = tokens.scanner
+  protected
+    
+    def setup options
+      if scanner
         kinds_not_loc = scanner.class::KINDS_NOT_LOC
       else
-        warn ArgumentError, 'Tokens have no scanner.' if $VERBOSE
+        warn "Tokens have no associated scanner, counting all nonempty lines." if $VERBOSE
         kinds_not_loc = CodeRay::Scanners::Scanner::KINDS_NOT_LOC
       end
-      code = tokens.token_class_filter :exclude => kinds_not_loc
-      @loc = code.text.scan(NON_EMPTY_LINE).size
+      
+      options[:exclude] = kinds_not_loc
+      
+      super options
     end
     
     def finish options
-      @loc
+      output @tokens.text.scan(NON_EMPTY_LINE).size
     end
     
   end
   
 end
 end
-
-if $0 == __FILE__
-  $VERBOSE = true
-  $: << File.join(File.dirname(__FILE__), '..')
-  eval DATA.read, nil, $0, __LINE__ + 4
-end
-
-__END__
-require 'test/unit'
-
-class LinesOfCodeTest < Test::Unit::TestCase
-  
-  def test_creation
-    assert CodeRay::Encoders::LinesOfCode < CodeRay::Encoders::Encoder
-    filter = nil
-    assert_nothing_raised do
-      filter = CodeRay.encoder :loc
-    end
-    assert_kind_of CodeRay::Encoders::LinesOfCode, filter
-    assert_nothing_raised do
-      filter = CodeRay.encoder :lines_of_code
-    end
-    assert_kind_of CodeRay::Encoders::LinesOfCode, filter
-  end
-  
-  def test_lines_of_code
-    tokens = CodeRay.scan <<-RUBY, :ruby
-#!/usr/bin/env ruby
-
-# a minimal Ruby program
-puts "Hello world!"
-    RUBY
-    assert_equal 1, CodeRay::Encoders::LinesOfCode.new.encode_tokens(tokens)
-    assert_equal 1, tokens.lines_of_code
-    assert_equal 1, tokens.loc
-  end
-  
-  def test_filtering_block_tokens
-    tokens = CodeRay::Tokens.new
-    tokens << ["Hello\n", :world]
-    tokens << ["Hello\n", :space]
-    tokens << ["Hello\n", :comment]
-    assert_equal 2, CodeRay::Encoders::LinesOfCode.new.encode_tokens(tokens)
-    assert_equal 2, tokens.lines_of_code
-    assert_equal 2, tokens.loc
-  end
-  
-end
\ No newline at end of file
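
Note on the lines_of_code.rb change above: the counting rule itself is unchanged. Tokens whose kind is listed in KINDS_NOT_LOC are dropped, the remaining text is joined, and every match of NON_EMPTY_LINE counts as one line of code. The sketch below reproduces that rule without CodeRay. The lines_of_code method and the [text, kind] pair representation are assumptions made for illustration; the regexp and the default kinds are taken from the diff.

```ruby
NON_EMPTY_LINE = /^\s*\S.*$/
KINDS_NOT_LOC  = [:comment, :doctype]  # default named in the class comment

# Hypothetical free-standing counter: +tokens+ is an Array of
# [text, kind] pairs; kinds in +kinds_not_loc+ do not count as code.
def lines_of_code tokens, kinds_not_loc = KINDS_NOT_LOC
  code = tokens.reject { |text, kind| kinds_not_loc.include? kind }
  code.map { |text, kind| text }.join.scan(NON_EMPTY_LINE).size
end

tokens = [
  ["#!/usr/bin/env ruby", :doctype],
  ["\n\n", :space],
  ["# a minimal Ruby program", :comment],
  ["\n", :space],
  ["puts", :ident],
  [" ", :space],
  ['"Hello world!"', :string],
  ["\n", :space],
]
loc = lines_of_code(tokens)

mixed = [["Hello\n", :world], ["Hello\n", :space], ["Hello\n", :comment]]
mixed_loc = lines_of_code(mixed)
```

The inputs correspond to the two cases in the removed inline tests: a shebang plus a comment plus one statement, and three tokens of which one is a comment.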
diff --git a/lib/coderay/encoders/null.rb b/lib/coderay/encoders/null.rb
index add3862..73ba47d 100644
--- a/lib/coderay/encoders/null.rb
+++ b/lib/coderay/encoders/null.rb
@@ -1,26 +1,18 @@
 module CodeRay
 module Encoders
-
+  
   # = Null Encoder
   #
   # Does nothing and returns an empty string.
   class Null < Encoder
-
-    include Streamable
+    
     register_for :null
-
-    # Defined for faster processing
-    def to_proc
-      proc {}
-    end
-
-  protected
-
-    def token(*)
+    
+    def text_token text, kind
       # do nothing
     end
-
+    
   end
-
+  
 end
 end
diff --git a/lib/coderay/encoders/page.rb b/lib/coderay/encoders/page.rb
index 1b69cce..800e73f 100644
--- a/lib/coderay/encoders/page.rb
+++ b/lib/coderay/encoders/page.rb
@@ -1,20 +1,24 @@
 module CodeRay
 module Encoders
-
+  
   load :html
-
+  
+  # Wraps the output into a HTML page, using CSS classes and
+  # line numbers in the table format by default.
+  # 
+  # See Encoders::HTML for available options.
   class Page < HTML
-
+    
     FILE_EXTENSION = 'html'
-
+    
     register_for :page
-
+    
     DEFAULT_OPTIONS = HTML::DEFAULT_OPTIONS.merge \
-      :css => :class,
-      :wrap => :page,
+      :css          => :class,
+      :wrap         => :page,
       :line_numbers => :table
-
+    
   end
-
+  
 end
 end
diff --git a/lib/coderay/encoders/span.rb b/lib/coderay/encoders/span.rb
index 319f6fd..da705bd 100644
--- a/lib/coderay/encoders/span.rb
+++ b/lib/coderay/encoders/span.rb
@@ -1,19 +1,23 @@
 module CodeRay
 module Encoders
-
+  
   load :html
-
+  
+  # Wraps HTML output into a SPAN element, using inline styles by default.
+  # 
+  # See Encoders::HTML for available options.
   class Span < HTML
-
+    
     FILE_EXTENSION = 'span.html'
-
+    
     register_for :span
-
+    
     DEFAULT_OPTIONS = HTML::DEFAULT_OPTIONS.merge \
-      :css => :style,
-      :wrap => :span
-
+      :css          => :style,
+      :wrap         => :span,
+      :line_numbers => false
+    
   end
-
+  
 end
 end
diff --git a/lib/coderay/encoders/statistic.rb b/lib/coderay/encoders/statistic.rb
index 6d0c646..2315d9e 100644
--- a/lib/coderay/encoders/statistic.rb
+++ b/lib/coderay/encoders/statistic.rb
@@ -1,43 +1,27 @@
 module CodeRay
 module Encoders
-
+  
   # Makes a statistic for the given tokens.
+  # 
+  # Alias: +stats+
   class Statistic < Encoder
-
-    include Streamable
-    register_for :stats, :statistic
-
-    attr_reader :type_stats, :real_token_count
-
+    
+    register_for :statistic
+    
+    attr_reader :type_stats, :real_token_count  # :nodoc:
+    
+    TypeStats = Struct.new :count, :size  # :nodoc:
+    
   protected
-
-    TypeStats = Struct.new :count, :size
-
+    
     def setup options
+      super
+      
       @type_stats = Hash.new { |h, k| h[k] = TypeStats.new 0, 0 }
       @real_token_count = 0
     end
-
-    def generate tokens, options
-      @tokens = tokens
-      super
-    end
-
-    def text_token text, kind
-      @real_token_count += 1 unless kind == :space
-      @type_stats[kind].count += 1
-      @type_stats[kind].size += text.size
-      @type_stats['TOTAL'].size += text.size
-      @type_stats['TOTAL'].count += 1
-    end
-
-    # TODO Hierarchy handling
-    def block_token action, kind
-      @type_stats['TOTAL'].count += 1
-      @type_stats['open/close'].count += 1
-    end
-
-    STATS = <<-STATS
+    
+    STATS = <<-STATS  # :nodoc:
 
 Code Statistics
 
@@ -49,12 +33,12 @@ Token Types (%d):
   type                     count     ratio    size (average)
 -------------------------------------------------------------
 %s
-      STATS
-# space                    12007   33.81 %     1.7
-    TOKEN_TYPES_ROW = <<-TKR
+    STATS
+    
+    TOKEN_TYPES_ROW = <<-TKR  # :nodoc:
   %-20s  %8d  %6.2f %%   %5.1f
-      TKR
-
+    TKR
+    
     def finish options
       all = @type_stats['TOTAL']
       all_count, all_size = all.count, all.size
@@ -64,14 +48,49 @@ Token Types (%d):
       types_stats = @type_stats.sort_by { |k, v| [-v.count, k.to_s] }.map do |k, v|
         TOKEN_TYPES_ROW % [k, v.count, 100.0 * v.count / all_count, v.size]
       end.join
-      STATS % [
+      @out << STATS % [
         all_count, @real_token_count, all_size,
         @type_stats.delete_if { |k, v| k.is_a? String }.size,
         types_stats
       ]
+      
+      super
     end
-
+    
+  public
+    
+    def text_token text, kind
+      @real_token_count += 1 unless kind == :space
+      @type_stats[kind].count += 1
+      @type_stats[kind].size += text.size
+      @type_stats['TOTAL'].size += text.size
+      @type_stats['TOTAL'].count += 1
+    end
+    
+    # TODO Hierarchy handling
+    def begin_group kind
+      block_token ':begin_group', kind
+    end
+    
+    def end_group kind
+      block_token ':end_group', kind
+    end
+    
+    def begin_line kind
+      block_token ':begin_line', kind
+    end
+    
+    def end_line kind
+      block_token ':end_line', kind
+    end
+    
+    def block_token action, kind
+      @type_stats['TOTAL'].count += 1
+      @type_stats[action].count += 1
+      @type_stats[kind].count += 1
+    end
+    
   end
-
+  
 end
 end
diff --git a/lib/coderay/encoders/term.rb b/lib/coderay/encoders/term.rb
deleted file mode 100644
index 1f284ed..0000000
--- a/lib/coderay/encoders/term.rb
+++ /dev/null
@@ -1,158 +0,0 @@
-# encoders/term.rb
-# By Rob Aldred (http://robaldred.co.uk)
-# Based on idea by Nathan Weizenbaum (http://nex-3.com)
-# MIT License (http://www.opensource.org/licenses/mit-license.php)
-#
-# A CodeRay encoder that outputs code highlighted for a color terminal.
-# Check out http://robaldred.co.uk
-
-module CodeRay
-  module Encoders
-    class Term < Encoder
-      register_for :term
-
-      TOKEN_COLORS = {
-        :annotation => '35',
-        :attribute_name => '33',
-        :attribute_name_fat => '33',
-        :attribute_value => '31',
-        :attribute_value_fat => '31',
-        :bin => '1;35',
-        :char => {:self => '36', :delimiter => '34'},
-        :class => '1;35',
-        :class_variable => '36',
-        :color => '32',
-        :comment => '37',
-        :complex => '34',
-        :constant => ['34', '4'],
-        :decoration => '35',
-        :definition => '1;32',
-        :directive => ['32', '4'],
-        :doc => '46',
-        :doctype => '1;30',
-        :doc_string => ['31', '4'],
-        :entity => '33',
-        :error => ['1;33', '41'],
-        :exception => '1;31',
-        :float => '1;35',
-        :function => '1;34',
-        :global_variable => '42',
-        :hex => '1;36',
-        :important => '1;31',
-        :include => '33',
-        :integer => '1;34',
-        :interpreted => '1;35',
-        :key => '35',
-        :label => '1;4',
-        :local_variable => '33',
-        :oct => '1;35',
-        :operator_name => '1;29',
-        :pre_constant => '1;36',
-        :pre_type => '1;30',
-        :predefined => ['4', '1;34'],
-        :preprocessor => '36',
-        :pseudo_class => '34',
-        :regexp => {
-          :content => '31',
-          :delimiter => '1;29',
-          :modifier => '35',
-          :function => '1;29'
-        },
-        :reserved => '1;31',
-        :shell => {
-          :self => '42',
-          :content => '1;29',
-          :delimiter => '37',
-        },
-        :string => {
-          :self => '32',
-          :modifier => '1;32',
-          :escape => '1;36',
-          :delimiter => '1;32',
-        },
-        :symbol => '1;32',
-        :tag => '34',
-        :tag_fat => '1;34',
-        :tag_special => ['34', '4'],
-        :type => '1;34',
-        :value => '36',
-        :variable => '34',
-        :insert => '42',
-        :delete => '41',
-        :change => '44',
-        :head => '45',
-      }
-      TOKEN_COLORS[:keyword] = TOKEN_COLORS[:reserved]
-      TOKEN_COLORS[:method] = TOKEN_COLORS[:function]
-      TOKEN_COLORS[:imaginary] = TOKEN_COLORS[:complex]
-      TOKEN_COLORS[:open] = TOKEN_COLORS[:close] = TOKEN_COLORS[:nesting_delimiter] = TOKEN_COLORS[:escape] = TOKEN_COLORS[:delimiter]
-
-      protected
-
-      def setup(options)
-        @out = ''
-        @opened = [nil]
-        @subcolors = nil
-      end
-
-      def finish(options)
-        super
-      end
-    
-      def token text, type = :plain
-        case text
-      
-        when nil
-          # raise 'Token with nil as text was given: %p' % [[text, type]] 
-      
-        when String
-        
-          if color = (@subcolors || TOKEN_COLORS)[type]
-            color = color[:self] || return if Hash === color
-
-            @out << col(color) + text.gsub("\n", col(0) + "\n" + col(color)) + col(0)
-            @out << col(@subcolors[:self]) if @subcolors && @subcolors[:self]
-          else
-            @out << text
-          end
-      
-        # token groups, eg. strings
-        when :open
-          @opened[0] = type
-          if color = TOKEN_COLORS[type]
-            if Hash === color
-              @subcolors = color
-              @out << col(color[:self]) if color[:self]
-            else
-              @subcolors = {}
-              @out << col(color)
-            end
-          end
-          @opened << type
-        when :close
-          if @opened.empty?
-            # nothing to close
-          else
-            @out << col(0) if (@subcolors || {})[:self]
-            @subcolors = nil
-            @opened.pop
-          end
-      
-        # whole lines to be highlighted, eg. a added/modified/deleted lines in a diff
-        when :begin_line
-        
-        when :end_line        
-      
-        else
-          raise 'unknown token kind: %p' % [text]
-        end
-      end
-
-      private
-
-      def col(color)
-        Array(color).map { |c| "\e[#{c}m" }.join
-      end
-    end
-  end
-end
\ No newline at end of file
diff --git a/lib/coderay/encoders/terminal.rb b/lib/coderay/encoders/terminal.rb
new file mode 100644
index 0000000..005032d
--- /dev/null
+++ b/lib/coderay/encoders/terminal.rb
@@ -0,0 +1,179 @@
+module CodeRay
+  module Encoders
+    
+    # Outputs code highlighted for a color terminal.
+    # 
+    # Note: This encoder is in beta. It currently doesn't use the Styles.
+    # 
+    # Alias: +term+
+    # 
+    # == Authors & License
+    # 
+    # By Rob Aldred (http://robaldred.co.uk)
+    # 
+    # Based on idea by Nathan Weizenbaum (http://nex-3.com)
+    # 
+    # MIT License (http://www.opensource.org/licenses/mit-license.php)
+    class Terminal < Encoder
+      
+      register_for :terminal
+      
+      TOKEN_COLORS = {
+        :annotation => '35',
+        :attribute_name => '33',
+        :attribute_value => '31',
+        :binary => '1;35',
+        :char => {
+          :self => '36', :delimiter => '1;34'
+        },
+        :class => '1;35',
+        :class_variable => '36',
+        :color => '32',
+        :comment => '37',
+        :complex => '1;34',
+        :constant => ['1;34', '4'],
+        :decoration => '35',
+        :definition => '1;32',
+        :directive => ['32', '4'],
+        :doc => '46',
+        :doctype => '1;30',
+        :doc_string => ['31', '4'],
+        :entity => '33',
+        :error => ['1;33', '41'],
+        :exception => '1;31',
+        :float => '1;35',
+        :function => '1;34',
+        :global_variable => '42',
+        :hex => '1;36',
+        :include => '33',
+        :integer => '1;34',
+        :key => '35',
+        :label => '1;15',
+        :local_variable => '33',
+        :octal => '1;35',
+        :operator_name => '1;29',
+        :predefined_constant => '1;36',
+        :predefined_type => '1;30',
+        :predefined => ['4', '1;34'],
+        :preprocessor => '36',
+        :pseudo_class => '1;34',
+        :regexp => {
+          :self => '31',
+          :content => '31',
+          :delimiter => '1;29',
+          :modifier => '35',
+          :function => '1;29'
+        },
+        :reserved => '1;31',
+        :shell => {
+          :self => '42',
+          :content => '1;29',
+          :delimiter => '37',
+        },
+        :string => {
+          :self => '32',
+          :modifier => '1;32',
+          :escape => '1;36',
+          :delimiter => '1;32',
+        },
+        :symbol => '1;32',
+        :tag => '1;34',
+        :type => '1;34',
+        :value => '36',
+        :variable => '1;34',
+        
+        :insert => '42',
+        :delete => '41',
+        :change => '44',
+        :head => '45'
+      }
+      TOKEN_COLORS[:keyword] = TOKEN_COLORS[:reserved]
+      TOKEN_COLORS[:method] = TOKEN_COLORS[:function]
+      TOKEN_COLORS[:imaginary] = TOKEN_COLORS[:complex]
+      TOKEN_COLORS[:begin_group] = TOKEN_COLORS[:end_group] =
+        TOKEN_COLORS[:escape] = TOKEN_COLORS[:delimiter]
+      
+    protected
+      
+      def setup(options)
+        super
+        @opened = []
+        @subcolors = nil
+      end
+      
+    public
+      
+      def text_token text, kind
+        if color = (@subcolors || TOKEN_COLORS)[kind]
+          if Hash === color
+            if color[:self]
+              color = color[:self]
+            else
+              @out << text
+              return
+            end
+          end
+          
+          @out << ansi_colorize(color)
+          @out << text.gsub("\n", ansi_clear + "\n" + ansi_colorize(color))
+          @out << ansi_clear
+          @out << ansi_colorize(@subcolors[:self]) if @subcolors && @subcolors[:self]
+        else
+          @out << text
+        end
+      end
+      
+      def begin_group kind
+        @opened << kind
+        @out << open_token(kind)
+      end
+      alias begin_line begin_group
+      
+      def end_group kind
+        if @opened.empty?
+          # nothing to close
+        else
+          @opened.pop
+          @out << ansi_clear
+          @out << open_token(@opened.last)
+        end
+      end
+      
+      def end_line kind
+        if @opened.empty?
+          # nothing to close
+        else
+          @opened.pop
+          # whole lines to be highlighted,
+          # eg. added/modified/deleted lines in a diff
+          @out << "\t" * 100 + ansi_clear
+          @out << open_token(@opened.last)
+        end
+      end
+      
+    private
+      
+      def open_token kind
+        if color = TOKEN_COLORS[kind]
+          if Hash === color
+            @subcolors = color
+            ansi_colorize(color[:self]) if color[:self]
+          else
+            @subcolors = {}
+            ansi_colorize(color)
+          end
+        else
+          @subcolors = nil
+          ''
+        end
+      end
+      
+      def ansi_colorize(color)
+        Array(color).map { |c| "\e[#{c}m" }.join
+      end
+      def ansi_clear
+        ansi_colorize(0)
+      end
+    end
+  end
+end
\ No newline at end of file
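
Note on terminal.rb above: color values in TOKEN_COLORS are either a single SGR parameter string or an Array of them, and ansi_colorize turns each into its own escape sequence. text_token also resets the color before every newline and reopens it afterwards. The sketch below copies the two private helpers as top-level methods and adds a hypothetical colorize_text wrapper (not in the encoder) that isolates the newline handling.

```ruby
# Standalone copies of the encoder's private helpers.
def ansi_colorize color
  Array(color).map { |c| "\e[#{c}m" }.join
end

def ansi_clear
  ansi_colorize 0
end

# Hypothetical wrapper: colorize +text+, clearing the color before each
# newline and restoring it after, as text_token does.
def colorize_text text, color
  ansi_colorize(color) +
    text.gsub("\n", ansi_clear + "\n" + ansi_colorize(color)) +
    ansi_clear
end

single   = ansi_colorize '1;31'           # one sequence
combined = ansi_colorize ['1;34', '4']    # one sequence per Array entry
line     = colorize_text "a\nb", '32'
```

Resetting around newlines presumably keeps background colors from bleeding to the end of the terminal line; the diff does not state the reason, so treat that as an interpretation.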
diff --git a/lib/coderay/encoders/text.rb b/lib/coderay/encoders/text.rb
index 161ee67..15c66f9 100644
--- a/lib/coderay/encoders/text.rb
+++ b/lib/coderay/encoders/text.rb
@@ -1,32 +1,46 @@
 module CodeRay
 module Encoders
-
+  
+  # Concats the tokens into a single string, resulting in the original
+  # code string if no tokens were removed.
+  # 
+  # Alias: +plain+, +plaintext+
+  # 
+  # == Options
+  # 
+  # === :separator
+  # A separator string to join the tokens.
+  # 
+  # Default: empty String
   class Text < Encoder
-
-    include Streamable
+    
     register_for :text
-
+    
     FILE_EXTENSION = 'txt'
-
+    
     DEFAULT_OPTIONS = {
-      :separator => ''
+      :separator => nil
     }
-
+    
+    def text_token text, kind
+      super
+      
+      if @first
+        @first = false
+      else
+        @out << @sep
+      end if @sep
+    end
+    
   protected
     def setup options
       super
+      
+      @first = true
       @sep = options[:separator]
     end
-
-    def text_token text, kind
-      text + @sep
-    end
-
-    def finish options
-      super.chomp @sep
-    end
-
+    
   end
-
+  
 end
 end
diff --git a/lib/coderay/encoders/token_class_filter.rb b/lib/coderay/encoders/token_class_filter.rb
deleted file mode 100644
index a9e8673..0000000
--- a/lib/coderay/encoders/token_class_filter.rb
+++ /dev/null
@@ -1,84 +0,0 @@
-($:.unshift '../..'; require 'coderay') unless defined? CodeRay
-module CodeRay
-module Encoders
-  
-  load :filter
-  
-  class TokenClassFilter < Filter
-
-    include Streamable
-    register_for :token_class_filter
-
-    DEFAULT_OPTIONS = {
-      :exclude => [],
-      :include => :all
-    }
-
-  protected
-    def setup options
-      super
-      @exclude = options[:exclude]
-      @exclude = Array(@exclude) unless @exclude == :all
-      @include = options[:include]
-      @include = Array(@include) unless @include == :all
-    end
-    
-    def include_text_token? text, kind
-       (@include == :all || @include.include?(kind)) &&
-      !(@exclude == :all || @exclude.include?(kind))
-    end
-    
-  end
-
-end
-end
-
-if $0 == __FILE__
-  $VERBOSE = true
-  $: << File.join(File.dirname(__FILE__), '..')
-  eval DATA.read, nil, $0, __LINE__ + 4
-end
-
-__END__
-require 'test/unit'
-
-class TokenClassFilterTest < Test::Unit::TestCase
-  
-  def test_creation
-    assert CodeRay::Encoders::TokenClassFilter < CodeRay::Encoders::Encoder
-    assert CodeRay::Encoders::TokenClassFilter < CodeRay::Encoders::Filter
-    filter = nil
-    assert_nothing_raised do
-      filter = CodeRay.encoder :token_class_filter
-    end
-    assert_instance_of CodeRay::Encoders::TokenClassFilter, filter
-  end
-  
-  def test_filtering_text_tokens
-    tokens = CodeRay::Tokens.new
-    for i in 1..10
-      tokens << [i.to_s, :index]
-      tokens << [' ', :space] if i < 10
-    end
-    assert_equal 10, CodeRay::Encoders::TokenClassFilter.new.encode_tokens(tokens, :exclude => :space).size
-    assert_equal 10, tokens.token_class_filter(:exclude => :space).size
-    assert_equal 9, CodeRay::Encoders::TokenClassFilter.new.encode_tokens(tokens, :include => :space).size
-    assert_equal 9, tokens.token_class_filter(:include => :space).size
-    assert_equal 0, CodeRay::Encoders::TokenClassFilter.new.encode_tokens(tokens, :exclude => :all).size
-    assert_equal 0, tokens.token_class_filter(:exclude => :all).size
-  end
-  
-  def test_filtering_block_tokens
-    tokens = CodeRay::Tokens.new
-    10.times do |i|
-      tokens << [:open, :index]
-      tokens << [i.to_s, :content]
-      tokens << [:close, :index]
-    end
-    assert_equal 20, CodeRay::Encoders::TokenClassFilter.new.encode_tokens(tokens, :include => :blubb).size
-    assert_equal 20, tokens.token_class_filter(:include => :blubb).size
-    assert_equal 30, CodeRay::Encoders::TokenClassFilter.new.encode_tokens(tokens, :exclude => :index).size
-    assert_equal 30, tokens.token_class_filter(:exclude => :index).size
-  end
-  
-end
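
Note on the removed token_class_filter.rb above: its selection rule is that a kind passes when it is included and not excluded, with :all acting as a wildcard on either side, so exclusion wins over inclusion. The sketch below restates that predicate as a free-standing method. The name include_kind? and its parameter names are hypothetical; the boolean expression follows include_text_token? from the diff.

```ruby
# Hypothetical free-standing version of the filter predicate.
# +included+ and +excluded+ are a Symbol, an Array of Symbols, or :all.
def include_kind? kind, included = :all, excluded = []
  included = Array(included) unless included == :all
  excluded = Array(excluded) unless excluded == :all
  (included == :all || included.include?(kind)) &&
    !(excluded == :all || excluded.include?(kind))
end

# Ten :index tokens separated by nine :space tokens, as in the removed test.
kinds = ([:index] * 10).zip([:space] * 9).flatten.compact

without_space = kinds.select { |k| include_kind? k, :all, :space }
only_space    = kinds.select { |k| include_kind? k, :space }
nothing       = kinds.select { |k| include_kind? k, :all, :all }
conflict      = include_kind? :space, :space, :space
```

The last call shows the precedence: a kind that is both included and excluded is rejected.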
diff --git a/lib/coderay/encoders/token_kind_filter.rb b/lib/coderay/encoders/token_kind_filter.rb
new file mode 100644
index 0000000..4773ea3
--- /dev/null
+++ b/lib/coderay/encoders/token_kind_filter.rb
@@ -0,0 +1,111 @@
+module CodeRay
+module Encoders
+  
+  load :filter
+  
+  # A Filter that selects tokens based on their token kind.
+  # 
+  # == Options
+  # 
+  # === :exclude
+  # 
+  # One or many symbols (in an Array) which shall be excluded.
+  # 
+  # Default: []
+  # 
+  # === :include
+  # 
+  # One or many symbols (in an array) which shall be included.
+  # 
+  # Default: :all, which means all tokens are included.
+  # 
+  # Exclusion wins over inclusion.
+  # 
+  # See also: CommentFilter
+  class TokenKindFilter < Filter
+    
+    register_for :token_kind_filter
+    
+    DEFAULT_OPTIONS = {
+      :exclude => [],
+      :include => :all
+    }
+    
+  protected
+    def setup options
+      super
+      
+      @group_excluded = false
+      @exclude = options[:exclude]
+      @exclude = Array(@exclude) unless @exclude == :all
+      @include = options[:include]
+      @include = Array(@include) unless @include == :all
+    end
+    
+    def include_text_token? text, kind
+      include_group? kind
+    end
+    
+    def include_group? kind
+       (@include == :all || @include.include?(kind)) &&
+      !(@exclude == :all || @exclude.include?(kind))
+    end
+    
+  public
+    
+    # Add the token to the output stream if +kind+ matches the conditions.
+    def text_token text, kind
+      super if !@group_excluded && include_text_token?(text, kind)
+    end
+    
+    # Add the token group to the output stream if +kind+ matches the
+    # conditions.
+    # 
+    # If it does not, all tokens inside the group are excluded from the
+    # stream, even if their kinds match.
+    def begin_group kind
+      if @group_excluded
+        @group_excluded += 1
+      elsif include_group? kind
+        super
+      else
+        @group_excluded = 1
+      end
+    end
+    
+    # See +begin_group+.
+    def begin_line kind
+      if @group_excluded
+        @group_excluded += 1
+      elsif include_group? kind
+        super
+      else
+        @group_excluded = 1
+      end
+    end
+    
+    # Take care of re-enabling the delegation of tokens to the output stream
+    # if an excluded group has ended.
+    def end_group kind
+      if @group_excluded
+        @group_excluded -= 1
+        @group_excluded = false if @group_excluded.zero?
+      else
+        super
+      end
+    end
+    
+    # See +end_group+.
+    def end_line kind
+      if @group_excluded
+        @group_excluded -= 1
+        @group_excluded = false if @group_excluded.zero?
+      else
+        super
+      end
+    end
+    
+  end
+  
+end
+end
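
The filter added above combines two rules: a kind passes only if it is included and not excluded, and an excluded group swallows everything nested inside it until its matching end. The following is a minimal standalone sketch of that logic, not the CodeRay class itself. `KindFilterSketch` and its event-array format are hypothetical and only illustrate the nesting counter.

```ruby
# Standalone sketch of the include/exclude logic; not part of CodeRay.
class KindFilterSketch
  def initialize(options = {})
    @include = options.fetch(:include, :all)
    @exclude = options.fetch(:exclude, [])
    @include = Array(@include) unless @include == :all
    @exclude = Array(@exclude) unless @exclude == :all
  end

  # events look like [:text, "x", :kind], [:begin_group, :kind]
  # or [:end_group, :kind] (a made-up format for this sketch).
  def filter(events)
    depth = 0  # nesting depth inside an excluded group; 0 means none
    events.select do |type, *args|
      kind = args.last
      case type
      when :text
        depth.zero? && keep?(kind)
      when :begin_group
        if depth > 0
          depth += 1
          false
        elsif keep?(kind)
          true
        else
          depth = 1
          false
        end
      when :end_group
        if depth > 0
          depth -= 1
          false
        else
          true
        end
      end
    end
  end

  private

  # Exclusion wins over inclusion.
  def keep?(kind)
    (@include == :all || @include.include?(kind)) &&
      !(@exclude == :all || @exclude.include?(kind))
  end
end

events = [
  [:text, '1', :index],
  [:begin_group, :string],
  [:text, 'a', :content],
  [:end_group, :string],
  [:text, '2', :index],
]
p KindFilterSketch.new(:exclude => :string).filter(events)
```

Counting the depth, rather than storing a boolean, is what lets nested groups inside an excluded group close without re-enabling output too early.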
diff --git a/lib/coderay/encoders/xml.rb b/lib/coderay/encoders/xml.rb
index f32c967..3d306a6 100644
--- a/lib/coderay/encoders/xml.rb
+++ b/lib/coderay/encoders/xml.rb
@@ -1,39 +1,40 @@
 module CodeRay
 module Encoders
-
+  
   # = XML Encoder
   #
   # Uses REXML. Very slow.
   class XML < Encoder
-
-    include Streamable
+    
     register_for :xml
-
+    
     FILE_EXTENSION = 'xml'
-
-    require 'rexml/document'
-
+    
+    autoload :REXML, 'rexml/document'
+    
     DEFAULT_OPTIONS = {
       :tab_width => 8,
       :pretty => -1,
       :transitive => false,
     }
-
+    
   protected
-
     def setup options
+      super
+      
       @doc = REXML::Document.new
       @doc << REXML::XMLDecl.new
       @tab_width = options[:tab_width]
       @root = @node = @doc.add_element('coderay-tokens')
     end
-
+    
     def finish options
-      @out = ''
       @doc.write @out, options[:pretty], options[:transitive], true
-      @out
+      
+      super
     end
     
+  public
     def text_token text, kind
       if kind == :space
         token = @node
@@ -53,19 +54,19 @@ module Encoders
         end
       end
     end
-
-    def open_token kind
+    
+    def begin_group kind
       @node = @node.add_element kind.to_s
     end
-
-    def close_token kind
+    
+    def end_group kind
       if @node == @root
         raise 'no token to close!'
       end
       @node = @node.parent
     end
-
+    
   end
-
+  
 end
 end
diff --git a/lib/coderay/encoders/yaml.rb b/lib/coderay/encoders/yaml.rb
index 5564e58..ba6e715 100644
--- a/lib/coderay/encoders/yaml.rb
+++ b/lib/coderay/encoders/yaml.rb
@@ -1,22 +1,50 @@
+autoload :YAML, 'yaml'
+
 module CodeRay
 module Encoders
-
+  
   # = YAML Encoder
   #
   # Slow.
   class YAML < Encoder
-
+    
     register_for :yaml
-
+    
     FILE_EXTENSION = 'yaml'
-
+    
   protected
-    def compile tokens, options
-      require 'yaml'
-      @out = tokens.to_a.to_yaml
+    def setup options
+      super
+      
+      @data = []
     end
-
+    
+    def finish options
+      output ::YAML.dump(@data)
+    end
+    
+  public
+    def text_token text, kind
+      @data << [text, kind]
+    end
+    
+    def begin_group kind
+      @data << [:begin_group, kind]
+    end
+    
+    def end_group kind
+      @data << [:end_group, kind]
+    end
+    
+    def begin_line kind
+      @data << [:begin_line, kind]
+    end
+    
+    def end_line kind
+      @data << [:end_line, kind]
+    end
+    
   end
-
+  
 end
 end
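
The rewritten YAML encoder no longer dumps a token array in one step; it collects one entry per event and serializes the list at the end. A rough sketch of the resulting data shape, using only the standard `yaml` library; the sample tokens are made up for illustration:

```ruby
require 'yaml'

# Entries as the encoder would collect them in @data:
# [text, kind] for text tokens, [action, kind] for group events.
data = []
data << [:begin_group, :string]
data << ['"', :delimiter]
data << ['hello', :content]
data << ['"', :delimiter]
data << [:end_group, :string]

yaml = YAML.dump(data)
puts yaml
```

Because group and line events are stored as ordinary two-element entries, a consumer can tell them from text tokens by checking whether the first element is a Symbol.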
diff --git a/lib/coderay/for_redcloth.rb b/lib/coderay/for_redcloth.rb
index 69985bc..f9df32b 100644
--- a/lib/coderay/for_redcloth.rb
+++ b/lib/coderay/for_redcloth.rb
@@ -30,7 +30,7 @@ module CodeRay
       end
       RedCloth::TextileDoc.send :include, ForRedCloth::TextileDoc
       RedCloth::Formatters::HTML.module_eval do
-        def unescape(html)
+        def unescape(html)  # :nodoc:
           replacements = {
             '&amp;' => '&',
             '&quot;' => '"',
@@ -45,7 +45,7 @@ module CodeRay
           if !opts[:lang] && RedCloth::VERSION.to_s >= '4.2.0'
             # simulating pre-4.2 behavior
             if opts[:text].sub!(/\A\[(\w+)\]/, '')
-              if CodeRay::Scanners[$1].plugin_id == 'plaintext'
+              if CodeRay::Scanners[$1].lang == :text
                 opts[:text] = $& + opts[:text]
               else
                 opts[:lang] = $1
@@ -57,7 +57,7 @@ module CodeRay
             @in_bc ||= nil
             format = @in_bc ? :div : :span
             opts[:text] = unescape(opts[:text]) unless @in_bc
-            highlighted_code = CodeRay.encode opts[:text], opts[:lang], format, :stream => true
+            highlighted_code = CodeRay.encode opts[:text], opts[:lang], format
             highlighted_code.sub!(/\A<(span|div)/) { |m| m + pba(@in_bc || opts) }
             highlighted_code
           else
@@ -74,7 +74,7 @@ module CodeRay
           @in_bc = nil
           opts[:lang] ? '' : "</pre>\n"
         end
-        def escape_pre(text)
+        def escape_pre(text)  # :nodoc:
           if @in_bc ||= nil
             text
           else
diff --git a/lib/coderay/helpers/file_type.rb b/lib/coderay/helpers/file_type.rb
index d2ca86b..7b90918 100644
--- a/lib/coderay/helpers/file_type.rb
+++ b/lib/coderay/helpers/file_type.rb
@@ -1,56 +1,70 @@
-#!/usr/bin/env ruby
 module CodeRay
-
-# = FileType
-#
-# A simple filetype recognizer.
-#
-# Copyright (c) 2006 by murphy (Kornelius Kalnbach) <murphy rubychan de>
-#
-# License:: LGPL / ask the author
-# Version:: 0.1 (2005-09-01)
-#
-# == Documentation
-#
-#  # determine the type of the given
-#   lang = FileType[ARGV.first]
-#  
-#   # return :plaintext if the file type is unknown
-#   lang = FileType.fetch ARGV.first, :plaintext
-#  
-#   # try the shebang line, too
-#   lang = FileType.fetch ARGV.first, :plaintext, true
-module FileType
-
-  UnknownFileType = Class.new Exception
-
-  class << self
-
-    # Try to determine the file type of the file.
-    #
-    # +filename+ is a relative or absolute path to a file.
-    #
-    # The file itself is only accessed when +read_shebang+ is set to true.
-    # That means you can get filetypes from files that don't exist.
-    def [] filename, read_shebang = false
-      name = File.basename filename
-      ext = File.extname(name).sub(/^\./, '')  # from last dot, delete the leading dot
-      ext2 = filename.to_s[/\.(.*)/, 1]  # from first dot
-
-      type =
-        TypeFromExt[ext] ||
-        TypeFromExt[ext.downcase] ||
-        (TypeFromExt[ext2] if ext2) ||
-        (TypeFromExt[ext2.downcase] if ext2) ||
-        TypeFromName[name] ||
-        TypeFromName[name.downcase]
-      type ||= shebang(filename) if read_shebang
-
-      type
-    end
-
-    def shebang filename
-      begin
+  
+  # = FileType
+  #
+  # A simple filetype recognizer.
+  #
+  # == Usage
+  #
+  #  # determine the type of the given file
+  #  lang = FileType[file_name]
+  #  
+  #  # return :text if the file type is unknown
+  #  lang = FileType.fetch file_name, :text
+  #  
+  #  # try the shebang line, too
+  #  lang = FileType.fetch file_name, :text, true
+  module FileType
+    
+    UnknownFileType = Class.new Exception
+    
+    class << self
+      
+      # Try to determine the file type of the file.
+      #
+      # +filename+ is a relative or absolute path to a file.
+      #
+      # The file itself is only accessed when +read_shebang+ is set to true.
+      # That means you can get filetypes from files that don't exist.
+      def [] filename, read_shebang = false
+        name = File.basename filename
+        ext = File.extname(name).sub(/^\./, '')  # from last dot, delete the leading dot
+        ext2 = filename.to_s[/\.(.*)/, 1]  # from first dot
+        
+        type =
+          TypeFromExt[ext] ||
+          TypeFromExt[ext.downcase] ||
+          (TypeFromExt[ext2] if ext2) ||
+          (TypeFromExt[ext2.downcase] if ext2) ||
+          TypeFromName[name] ||
+          TypeFromName[name.downcase]
+        type ||= shebang(filename) if read_shebang
+        
+        type
+      end
+      
+      # This works like Hash#fetch.
+      #
+      # If the filetype cannot be found, the +default+ value
+      # is returned.
+      def fetch filename, default = nil, read_shebang = false
+        if default && block_given?
+          warn 'Block supersedes default value argument; use either.'
+        end
+        
+        if type = self[filename, read_shebang]
+          type
+        else
+          return yield if block_given?
+          return default if default
+          raise UnknownFileType, 'Could not determine type of %p.' % filename
+        end
+      end
+      
+    protected
+      
+      def shebang filename
+        return unless File.exist? filename
         File.open filename, 'r' do |f|
           if first_line = f.gets
             if type = first_line[TypeFromShebang]
@@ -58,203 +72,72 @@ module FileType
             end
           end
         end
-      rescue IOError
-        nil
       end
+      
     end
-
-    # This works like Hash#fetch.
-    #
-    # If the filetype cannot be found, the +default+ value
-    # is returned.
-    def fetch filename, default = nil, read_shebang = false
-      if default and block_given?
-        warn 'block supersedes default value argument'
-      end
-
-      unless type = self[filename, read_shebang]
-        return yield if block_given?
-        return default if default
-        raise UnknownFileType, 'Could not determine type of %p.' % filename
-      end
-      type
+    
+    TypeFromExt = {
+      'c'        => :c,
+      'cfc'      => :xml,
+      'cfm'      => :xml,
+      'clj'      => :clojure,
+      'css'      => :css,
+      'diff'     => :diff,
+      'dpr'      => :delphi,
+      'erb'      => :erb,
+      'gemspec'  => :ruby,
+      'groovy'   => :groovy,
+      'gvy'      => :groovy,
+      'h'        => :c,
+      'haml'     => :haml,
+      'htm'      => :page,
+      'html'     => :page,
+      'html.erb' => :erb,
+      'java'     => :java,
+      'js'       => :java_script,
+      'json'     => :json,
+      'mab'      => :ruby,
+      'pas'      => :delphi,
+      'patch'    => :diff,
+      'php'      => :php,
+      'php3'     => :php,
+      'php4'     => :php,
+      'php5'     => :php,
+      'prawn'    => :ruby,
+      'py'       => :python,
+      'py3'      => :python,
+      'pyw'      => :python,
+      'rake'     => :ruby,
+      'raydebug' => :raydebug,
+      'rb'       => :ruby,
+      'rbw'      => :ruby,
+      'rhtml'    => :erb,
+      'rjs'      => :ruby,
+      'rpdf'     => :ruby,
+      'ru'       => :ruby,
+      'rxml'     => :ruby,
+      # 'sch'      => :scheme,
+      'sql'      => :sql,
+      # 'ss'       => :scheme,
+      'tmproj'   => :xml,
+      'xhtml'    => :page,
+      'xml'      => :xml,
+      'yaml'     => :yaml,
+      'yml'      => :yaml,
+    }
+    for cpp_alias in %w[cc cpp cp cxx c++ C hh hpp h++ cu]
+      TypeFromExt[cpp_alias] = :cpp
     end
-
+    
+    TypeFromShebang = /\b(?:ruby|perl|python|sh)\b/
+    
+    TypeFromName = {
+      'Capfile'  => :ruby,
+      'Rakefile' => :ruby,
+      'Rantfile' => :ruby,
+      'Gemfile'  => :ruby,
+    }
+    
   end
-
-  TypeFromExt = {
-    'c'        => :c,
-    'cfc'      => :xml,
-    'cfm'      => :xml,
-    'css'      => :css,
-    'diff'     => :diff,
-    'dpr'      => :delphi,
-    'gemspec'  => :ruby,
-    'groovy'   => :groovy,
-    'gvy'      => :groovy,
-    'h'        => :c,
-    'htm'      => :html,
-    'html'     => :html,
-    'html.erb' => :rhtml,
-    'java'     => :java,
-    'js'       => :java_script,
-    'json'     => :json,
-    'mab'      => :ruby,
-    'pas'      => :delphi,
-    'patch'    => :diff,
-    'php'      => :php,
-    'php3'     => :php,
-    'php4'     => :php,
-    'php5'     => :php,
-    'py'       => :python,
-    'py3'      => :python,
-    'pyw'      => :python,
-    'rake'     => :ruby,
-    'raydebug' => :debug,
-    'rb'       => :ruby,
-    'rbw'      => :ruby,
-    'rhtml'    => :rhtml,
-    'rjs'      => :ruby,
-    'rpdf'     => :ruby,
-    'rxml'     => :ruby,
-    'sch'      => :scheme,
-    'sql'      => :sql,
-    'ss'       => :scheme,
-    'xhtml'    => :xhtml,
-    'xml'      => :xml,
-    'yaml'     => :yaml,
-    'yml'      => :yaml,
-  }
-  for cpp_alias in %w[cc cpp cp cxx c++ C hh hpp h++ cu]
-    TypeFromExt[cpp_alias] = :cpp
-  end
-
-  TypeFromShebang = /\b(?:ruby|perl|python|sh)\b/
-
-  TypeFromName = {
-    'Rakefile' => :ruby,
-    'Rantfile' => :ruby,
-  }
-
-end
-
-end
-
-if $0 == __FILE__
-  $VERBOSE = true
-  eval DATA.read, nil, $0, __LINE__ + 4
-end
-
-__END__
-require 'test/unit'
-
-class FileTypeTests < Test::Unit::TestCase
   
-  include CodeRay
-  
-  def test_fetch
-    assert_raise FileType::UnknownFileType do
-      FileType.fetch ''
-    end
-
-    assert_throws :not_found do
-      FileType.fetch '.' do
-        throw :not_found
-      end
-    end
-
-    assert_equal :default, FileType.fetch('c', :default)
-
-    stderr, fake_stderr = $stderr, Object.new
-    $err = ''
-    def fake_stderr.write x
-      $err << x
-    end
-    $stderr = fake_stderr
-    FileType.fetch('c', :default) { }
-    assert_equal "block supersedes default value argument\n", $err
-    $stderr = stderr
-  end
-
-  def test_ruby
-    assert_equal :ruby, FileType['test.rb']
-    assert_equal :ruby, FileType['test.java.rb']
-    assert_equal :java, FileType['test.rb.java']
-    assert_equal :ruby, FileType['C:\\Program Files\\x\\y\\c\\test.rbw']
-    assert_equal :ruby, FileType['/usr/bin/something/Rakefile']
-    assert_equal :ruby, FileType['~/myapp/gem/Rantfile']
-    assert_equal :ruby, FileType['./lib/tasks\repository.rake']
-    assert_not_equal :ruby, FileType['test_rb']
-    assert_not_equal :ruby, FileType['Makefile']
-    assert_not_equal :ruby, FileType['set.rb/set']
-    assert_not_equal :ruby, FileType['~/projects/blabla/rb']
-  end
-
-  def test_c
-    assert_equal :c, FileType['test.c']
-    assert_equal :c, FileType['C:\\Program Files\\x\\y\\c\\test.h']
-    assert_not_equal :c, FileType['test_c']
-    assert_not_equal :c, FileType['Makefile']
-    assert_not_equal :c, FileType['set.h/set']
-    assert_not_equal :c, FileType['~/projects/blabla/c']
-  end
-
-  def test_cpp
-    assert_equal :cpp, FileType['test.c++']
-    assert_equal :cpp, FileType['test.cxx']
-    assert_equal :cpp, FileType['test.hh']
-    assert_equal :cpp, FileType['test.hpp']
-    assert_equal :cpp, FileType['test.cu']
-    assert_equal :cpp, FileType['test.C']
-    assert_not_equal :cpp, FileType['test.c']
-    assert_not_equal :cpp, FileType['test.h']
-  end
-
-  def test_html
-    assert_equal :html, FileType['test.htm']
-    assert_equal :xhtml, FileType['test.xhtml']
-    assert_equal :xhtml, FileType['test.html.xhtml']
-    assert_equal :rhtml, FileType['_form.rhtml']
-    assert_equal :rhtml, FileType['_form.html.erb']
-  end
-
-  def test_yaml
-    assert_equal :yaml, FileType['test.yml']
-    assert_equal :yaml, FileType['test.yaml']
-    assert_equal :yaml, FileType['my.html.yaml']
-    assert_not_equal :yaml, FileType['YAML']
-  end
-
-  def test_pathname
-    require 'pathname'
-    pn = Pathname.new 'test.rb'
-    assert_equal :ruby, FileType[pn]
-    dir = Pathname.new '/etc/var/blubb'
-    assert_equal :ruby, FileType[dir + pn]
-    assert_equal :cpp, FileType[dir + 'test.cpp']
-  end
-
-  def test_no_shebang
-    dir = './test'
-    if File.directory? dir
-      Dir.chdir dir do
-        assert_equal :c, FileType['test.c']
-      end
-    end
-  end
-  
-  def test_shebang_empty_file
-    require 'tmpdir'
-    tmpfile = File.join(Dir.tmpdir, 'bla')
-    File.open(tmpfile, 'w') { }  # touch
-    assert_equal nil, FileType[tmpfile]
-  end
-  
-  def test_shebang
-    require 'tmpdir'
-    tmpfile = File.join(Dir.tmpdir, 'bla')
-    File.open(tmpfile, 'w') { |f| f.puts '#!/usr/bin/env ruby' }
-    assert_equal :ruby, FileType[tmpfile, true]
-  end
-
 end
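
How `FileType[]` and `FileType.fetch` interact may be easier to see in a reduced form. This is a sketch under stated assumptions: `MiniFileType` is a hypothetical stand-in with a three-entry table, it skips the shebang check and the first-dot extension lookup, and it derives its error class from `StandardError` rather than `Exception` as the original does.

```ruby
# Hypothetical reduced version of the lookup; not the CodeRay module.
module MiniFileType
  UnknownFileType = Class.new StandardError

  TYPE_FROM_EXT  = { 'rb' => :ruby, 'c' => :c, 'yml' => :yaml }
  TYPE_FROM_NAME = { 'Rakefile' => :ruby }

  # Returns the type for +filename+, or nil if it is unknown.
  def self.[](filename)
    name = File.basename filename.to_s
    ext  = File.extname(name).sub(/^\./, '')
    TYPE_FROM_EXT[ext] || TYPE_FROM_EXT[ext.downcase] || TYPE_FROM_NAME[name]
  end

  # Works like Hash#fetch: the block wins, then the default,
  # and without either an exception is raised.
  def self.fetch(filename, default = nil)
    type = self[filename]
    return type if type
    return yield if block_given?
    return default if default
    raise UnknownFileType, 'Could not determine type of %p.' % filename
  end
end

p MiniFileType['lib/test.rb']
p MiniFileType.fetch('Makefile', :text)
```

Keeping `[]` free of exceptions and putting the fallback policy into `fetch` mirrors the split between `Hash#[]` and `Hash#fetch`.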
diff --git a/lib/coderay/helpers/gzip.rb b/lib/coderay/helpers/gzip.rb
new file mode 100644
index 0000000..245014a
--- /dev/null
+++ b/lib/coderay/helpers/gzip.rb
@@ -0,0 +1,41 @@
+module CodeRay
+  
+  # A simplified interface to the gzip library +zlib+ (from the Ruby Standard Library.)
+  module GZip
+    
+    require 'zlib'
+    
+    # The default zipping level. 7 zips good and fast.
+    DEFAULT_GZIP_LEVEL = 7
+    
+    # Unzips the given string +s+.
+    #
+    # Example:
+    #   require 'gzip_simple'
+    #   print GZip.gunzip(File.read('adresses.gz'))
+    def GZip.gunzip s
+      Zlib::Inflate.inflate s
+    end
+    
+    # Zips the given string +s+.
+    #
+    # Example:
+    #   require 'gzip_simple'
+    #   File.open('adresses.gz', 'w') do |file|
+    #     file.write GZip.gzip('Mum: 0123 456 789', 9)
+    #   end
+    #
+    # If you provide a +level+, you can control how strongly
+    # the string is compressed:
+    # - 0: no compression, only convert to gzip format
+    # - 1: compress fast
+    # - 7: compress more, but still fast (default)
+    # - 8: compress more, slower
+    # - 9: compress best, very slow
+    def GZip.gzip s, level = DEFAULT_GZIP_LEVEL
+      Zlib::Deflate.new(level).deflate s, Zlib::FINISH
+    end
+    
+  end
+  
+end
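
The two helper methods are thin wrappers, so a round trip can be tried with the standard `zlib` library alone. The sketch below makes the same calls directly instead of loading CodeRay. One caveat: `Zlib::Deflate` emits a zlib stream rather than a gzip file container, so the result is meant to be read back with `Zlib::Inflate`, not with the `gunzip` command line tool.

```ruby
require 'zlib'

text = 'a' * 1000

# Same calls the helper wraps: deflate with an explicit level, then inflate.
zipped   = Zlib::Deflate.new(9).deflate(text, Zlib::FINISH)
unzipped = Zlib::Inflate.inflate(zipped)

puts 'Zipped %d bytes to %d bytes.' % [text.size, zipped.size]
puts unzipped == text
```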
diff --git a/lib/coderay/helpers/gzip_simple.rb b/lib/coderay/helpers/gzip_simple.rb
deleted file mode 100644
index b979f66..0000000
--- a/lib/coderay/helpers/gzip_simple.rb
+++ /dev/null
@@ -1,123 +0,0 @@
-# =GZip Simple
-#
-# A simplified interface to the gzip library +zlib+ (from the Ruby Standard Library.)
-#
-# Author: murphy (mail to murphy rubychan de)
-#
-# Version: 0.2 (2005.may.28)
-#
-# ==Documentation
-#
-# See +GZip+ module and the +String+ extensions.
-#
-module GZip
-
-  require 'zlib'
-
-  # The default zipping level. 7 zips good and fast.
-  DEFAULT_GZIP_LEVEL = 7
-
-  # Unzips the given string +s+.
-  #
-  # Example:
-  #   require 'gzip_simple'
-  #   print GZip.gunzip(File.read('adresses.gz'))
-  def GZip.gunzip s
-    Zlib::Inflate.inflate s
-  end
-
-  # Zips the given string +s+.
-  #
-  # Example:
-  #   require 'gzip_simple'
-  #   File.open('adresses.gz', 'w') do |file
-  #     file.write GZip.gzip('Mum: 0123 456 789', 9)
-  #   end
-  #
-  # If you provide a +level+, you can control how strong
-  # the string is compressed:
-  # - 0: no compression, only convert to gzip format
-  # - 1: compress fast
-  # - 7: compress more, but still fast (default)
-  # - 8: compress more, slower
-  # - 9: compress best, very slow
-  def GZip.gzip s, level = DEFAULT_GZIP_LEVEL
-    Zlib::Deflate.new(level).deflate s, Zlib::FINISH
-  end
-end
-
-
-# String extensions to use the GZip module.
-#
-# The methods gzip and gunzip provide an even more simple
-# interface to the ZLib:
-#
-#   # create a big string
-#   x = 'a' * 1000
-#   
-#   # zip it
-#   x_gz = x.gzip
-#   
-#   # test the result
-#   puts 'Zipped %d bytes to %d bytes.' % [x.size, x_gz.size]
-#   #-> Zipped 1000 bytes to 19 bytes.
-#   
-#   # unzipping works
-#   p x_gz.gunzip == x  #-> true
-class String
-  # Returns the string, unzipped.
-  # See GZip.gunzip
-  def gunzip
-    GZip.gunzip self
-  end
-  # Replaces the string with its unzipped value.
-  # See GZip.gunzip
-  def gunzip!
-    replace gunzip
-  end
-
-  # Returns the string, zipped.
-  # +level+ is the gzip compression level, see GZip.gzip.
-  def gzip level = GZip::DEFAULT_GZIP_LEVEL
-    GZip.gzip self, level
-  end
-  # Replaces the string with its zipped value.
-  # See GZip.gzip.
-  def gzip!(*args)
-    replace gzip(*args)
-  end
-end
-
-if $0 == __FILE__
-  eval DATA.read, nil, $0, __LINE__+4
-end
-
-__END__
-#CODE
-
-# Testing / Benchmark
-x = 'a' * 1000
-x_gz = x.gzip
-puts 'Zipped %d bytes to %d bytes.' % [x.size, x_gz.size]  #-> Zipped 1000 bytes to 19 bytes.
-p x_gz.gunzip == x  #-> true
-
-require 'benchmark'
-
-INFO = 'packed to %0.3f%%'  # :nodoc:
-
-x = Array.new(100000) { rand(255).chr + 'aaaaaaaaa' + rand(255).chr }.join
-Benchmark.bm(10) do |bm|
-  for level in 0..9
-    bm.report "zip #{level}" do
-      $x = x.gzip level
-    end
-    puts INFO % [100.0 * $x.size / x.size]
-  end
-  bm.report 'zip' do
-    $x = x.gzip
-  end
-  puts INFO % [100.0 * $x.size / x.size]
-  bm.report 'unzip' do
-    $x.gunzip
-  end
-end
diff --git a/lib/coderay/helpers/plugin.rb b/lib/coderay/helpers/plugin.rb
index 2dffbdc..06c1233 100644
--- a/lib/coderay/helpers/plugin.rb
+++ b/lib/coderay/helpers/plugin.rb
@@ -1,349 +1,284 @@
 module CodeRay
   
-# = PluginHost
-#
-# A simple subclass plugin system.
-#
-#  Example:
-#    class Generators < PluginHost
-#      plugin_path 'app/generators'
-#    end
-#    
-#    class Generator
-#      extend Plugin
-#      PLUGIN_HOST = Generators
-#    end
-#    
-#    class FancyGenerator < Generator
-#      register_for :fancy
-#    end
-#
-#    Generators[:fancy]  #-> FancyGenerator
-#    # or
-#    CodeRay.require_plugin 'Generators/fancy'
-module PluginHost
-
-  # Raised if Encoders::[] fails because:
-  # * a file could not be found
-  # * the requested Encoder is not registered
-  PluginNotFound = Class.new Exception
-  HostNotFound = Class.new Exception
-
-  PLUGIN_HOSTS = []
-  PLUGIN_HOSTS_BY_ID = {}  # dummy hash
-
-  # Loads all plugins using list and load.
-  def load_all
-    for plugin in list
-      load plugin
-    end
-  end
-
-  # Returns the Plugin for +id+.
+  # = PluginHost
+  #
+  # A simple subclass/subfolder plugin system.
   #
   # Example:
-  #  yaml_plugin = MyPluginHost[:yaml]
-  def [] id, *args, &blk
-    plugin = validate_id(id)
-    begin
-      plugin = plugin_hash.[] plugin, *args, &blk
-    end while plugin.is_a? Symbol
-    plugin
-  end
-
-  # Alias for +[]+.
-  alias load []
-
-  def require_helper plugin_id, helper_name
-    path = path_to File.join(plugin_id, helper_name)
-    require path
-  end
-
-  class << self
-
-    # Adds the module/class to the PLUGIN_HOSTS list.
-    def extended mod
-      PLUGIN_HOSTS << mod
+  #  class Generators
+  #    extend PluginHost
+  #    plugin_path 'app/generators'
+  #  end
+  #  
+  #  class Generator
+  #    extend Plugin
+  #    PLUGIN_HOST = Generators
+  #  end
+  #  
+  #  class FancyGenerator < Generator
+  #    register_for :fancy
+  #  end
+  #  
+  #  Generators[:fancy]  #-> FancyGenerator
+  #  # or
+  #  CodeRay.require_plugin 'Generators/fancy'
+  #  # or
+  #  Generators::Fancy
+  module PluginHost
+    
+    # Raised if Encoders::[] fails because:
+    # * a file could not be found
+    # * the requested Plugin is not registered
+    PluginNotFound = Class.new LoadError
+    HostNotFound = Class.new LoadError
+    
+    PLUGIN_HOSTS = []
+    PLUGIN_HOSTS_BY_ID = {}  # dummy hash
+    
+    # Loads all plugins using list and load.
+    def load_all
+      for plugin in list
+        load plugin
+      end
     end
-
-    # Warns you that you should not #include this module.
-    def included mod
-      warn "#{name} should not be included. Use extend."
+    
+    # Returns the Plugin for +id+.
+    #
+    # Example:
+    #  yaml_plugin = MyPluginHost[:yaml]
+    def [] id, *args, &blk
+      plugin = validate_id(id)
+      begin
+        plugin = plugin_hash.[] plugin, *args, &blk
+      end while plugin.is_a? Symbol
+      plugin
     end
-
-    # Find the PluginHost for host_id.
-    def host_by_id host_id
-      unless PLUGIN_HOSTS_BY_ID.default_proc
-        ph = Hash.new do |h, a_host_id|
-          for host in PLUGIN_HOSTS
-            h[host.host_id] = host
-          end
-          h.fetch a_host_id, nil
-        end
-        PLUGIN_HOSTS_BY_ID.replace ph
+    
+    alias load []
+    
+    # Tries to +load+ the missing plugin by translating +const+ to the
+    # underscore form (e.g. LinesOfCode becomes lines_of_code).
+    def const_missing const
+      id = const.to_s.
+        gsub(/([A-Z]+)([A-Z][a-z])/,'\1_\2').
+        gsub(/([a-z\d])([A-Z])/,'\1_\2').
+        downcase
+      load id
+    end
+    
+    class << self
+      
+      # Adds the module/class to the PLUGIN_HOSTS list.
+      def extended mod
+        PLUGIN_HOSTS << mod
       end
-      PLUGIN_HOSTS_BY_ID[host_id]
+      
     end
-
-  end
-
-  # The path where the plugins can be found.
-  def plugin_path *args
-    unless args.empty?
-      @plugin_path = File.expand_path File.join(*args)
-      load_map
+    
+    # The path where the plugins can be found.
+    def plugin_path *args
+      unless args.empty?
+        @plugin_path = File.expand_path File.join(*args)
+      end
+      @plugin_path ||= ''
     end
-    @plugin_path
-  end
-
-  # The host's ID.
-  #
-  # If PLUGIN_HOST_ID is not set, it is simply the class name.
-  def host_id
-    if self.const_defined? :PLUGIN_HOST_ID
-      self::PLUGIN_HOST_ID
-    else
-      name
+    
+    # Map a plugin_id to another.
+    #
+    # Usage: Put this in a file plugin_path/_map.rb.
+    #
+    #  class MyColorHost < PluginHost
+    #    map :navy => :dark_blue,
+    #      :maroon => :brown,
+    #      :luna => :moon
+    #  end
+    def map hash
+      for from, to in hash
+        from = validate_id from
+        to = validate_id to
+        plugin_hash[from] = to unless plugin_hash.has_key? from
+      end
     end
-  end
-
-  # Map a plugin_id to another.
-  #
-  # Usage: Put this in a file plugin_path/_map.rb.
-  #
-  #  class MyColorHost < PluginHost
-  #    map :navy => :dark_blue,
-  #      :maroon => :brown,
-  #      :luna => :moon
-  #  end
-  def map hash
-    for from, to in hash
-      from = validate_id from
-      to = validate_id to
-      plugin_hash[from] = to unless plugin_hash.has_key? from
+    
+    # Define the default plugin to use when no plugin is found
+    # for a given id, or return the default plugin.
+    #
+    # See also map.
+    #
+    #  class MyColorHost < PluginHost
+    #    map :navy => :dark_blue
+    #    default :gray
+    #  end
+    #  
+    #  MyColorHost.default  # loads and returns the Gray plugin
+    def default id = nil
+      if id
+        id = validate_id id
+        raise "The default plugin can't be named \"default\"." if id == :default
+        plugin_hash[:default] = id
+      else
+        load :default
+      end
     end
-  end
-
-  # Define the default plugin to use when no plugin is found
-  # for a given id.
-  #
-  # See also map.
-  #
-  #  class MyColorHost < PluginHost
-  #    map :navy => :dark_blue
-  #    default :gray
-  #  end
-  def default id = nil
-    if id
-      id = validate_id id
-      plugin_hash[nil] = id
-    else
-      plugin_hash[nil]
+    
+    # Every plugin must register itself for +id+ by calling register_for,
+    # which calls this method.
+    #
+    # See Plugin#register_for.
+    def register plugin, id
+      plugin_hash[validate_id(id)] = plugin
     end
-  end
-
-  # Every plugin must register itself for one or more
-  # +ids+ by calling register_for, which calls this method.
-  #
-  # See Plugin#register_for.
-  def register plugin, *ids
-    for id in ids
-      unless id.is_a? Symbol
-        raise ArgumentError,
-          "id must be a Symbol, but it was a #{id.class}"
+    
+    # A Hash of plugin_id => Plugin pairs.
+    def plugin_hash
+      @plugin_hash ||= make_plugin_hash
+    end
+    
+    # Returns an array of all .rb files in the plugin path.
+    #
+    # The extension .rb is not included.
+    def list
+      Dir[path_to('*')].select do |file|
+        File.basename(file)[/^(?!_)\w+\.rb$/]
+      end.map do |file|
+        File.basename(file, '.rb').to_sym
       end
-      plugin_hash[validate_id(id)] = plugin
     end
-  end
-
-  # A Hash of plugion_id => Plugin pairs.
-  def plugin_hash
-    @plugin_hash ||= create_plugin_hash
-  end
-
-  # Returns an array of all .rb files in the plugin path.
-  #
-  # The extension .rb is not included.
-  def list
-    Dir[path_to('*')].select do |file|
-      File.basename(file)[/^(?!_)\w+\.rb$/]
-    end.map do |file|
-      File.basename file, '.rb'
+    
+    # Returns an array of all Plugins.
+    # 
+    # Note: This loads all plugins using load_all.
+    def all_plugins
+      load_all
+      plugin_hash.values.grep(Class)
     end
-  end
-
-  # Makes a map of all loaded plugins.
-  def inspect
-    map = plugin_hash.dup
-    map.each do |id, plugin|
-      map[id] = plugin.to_s[/(?>\w+)$/]
+    
+    # Loads the map file (see map).
+    #
+    # This is done automatically when plugin_path is called.
+    def load_plugin_map
+      mapfile = path_to '_map'
+      @plugin_map_loaded = true
+      if File.exist? mapfile
+        require mapfile
+        true
+      else
+        false
+      end
     end
-    "#{name}[#{host_id}]#{map.inspect}"
-  end
-
-protected
-  # Created a new plugin list and stores it to @plugin_hash.
-  def create_plugin_hash
-    @plugin_hash =
+    
+  protected
+    
+    # Return a plugin hash that automatically loads plugins.
+    def make_plugin_hash
+      @plugin_map_loaded ||= false
       Hash.new do |h, plugin_id|
         id = validate_id(plugin_id)
         path = path_to id
         begin
+          raise LoadError, "#{path} not found" unless File.exist? path
           require path
         rescue LoadError => boom
-          if h.has_key? nil  # default plugin
-            h[id] = h[nil]
+          if @plugin_map_loaded
+            if h.has_key?(:default)
+              warn '%p could not load plugin %p; falling back to %p' % [self, id, h[:default]]
+              h[:default]
+            else
+              raise PluginNotFound, '%p could not load plugin %p: %s' % [self, id, boom]
+            end
           else
-            raise PluginNotFound, 'Could not load plugin %p: %s' % [id, boom]
+            load_plugin_map
+            h[plugin_id]
           end
         else
           # Plugin should have registered by now
-          unless h.has_key? id
-            raise PluginNotFound,
-              "No #{self.name} plugin for #{id.inspect} found in #{path}."
+          if h.has_key? id
+            h[id]
+          else
+            raise PluginNotFound, "No #{self.name} plugin for #{id.inspect} found in #{path}."
           end
         end
-        h[id]
       end
-  end
-
-  # Loads the map file (see map).
-  #
-  # This is done automatically when plugin_path is called.
-  def load_map
-    mapfile = path_to '_map'
-    if File.exist? mapfile
-      require mapfile
-    elsif $VERBOSE
-      warn 'no _map.rb found for %s' % name
     end
-  end
-
-  # Returns the Plugin for +id+.
-  # Use it like Hash#fetch.
-  #
-  # Example:
-  #  yaml_plugin = MyPluginHost[:yaml, :default]
-  def fetch id, *args, &blk
-    plugin_hash.fetch validate_id(id), *args, &blk
-  end
-
-  # Returns the expected path to the plugin file for the given id.
-  def path_to plugin_id
-    File.join plugin_path, "#{plugin_id}.rb"
-  end
-
-  # Converts +id+ to a Symbol if it is a String,
-  # or returns +id+ if it already is a Symbol.
-  #
-  # Raises +ArgumentError+ for all other objects, or if the
-  # given String includes non-alphanumeric characters (\W).
-  def validate_id id
-    if id.is_a? Symbol or id.nil?
-      id
-    elsif id.is_a? String
-      if id[/\w+/] == id
-        id.downcase.to_sym
+    
+    # Returns the expected path to the plugin file for the given id.
+    def path_to plugin_id
+      File.join plugin_path, "#{plugin_id}.rb"
+    end
+    
+    # Converts +id+ to a Symbol if it is a String,
+    # or returns +id+ if it already is a Symbol.
+    #
+    # Raises +ArgumentError+ for all other objects, or if the
+    # given String includes non-alphanumeric characters (\W).
+    def validate_id id
+      if id.is_a? Symbol or id.nil?
+        id
+      elsif id.is_a? String
+        if id[/\w+/] == id
+          id.downcase.to_sym
+        else
+          raise ArgumentError, "Invalid id given: #{id}"
+        end
       else
-        raise ArgumentError, "Invalid id: '#{id}' given."
+        raise ArgumentError, "String or Symbol expected, but #{id.class} given."
       end
-    else
-      raise ArgumentError,
-        "String or Symbol expected, but #{id.class} given."
     end
-  end
-
-end
-
-
-# = Plugin
-#
-#  Plugins have to include this module.
-#
-#  IMPORTANT: use extend for this module.
-#
-#  Example: see PluginHost.
-module Plugin
-
-  def included mod
-    warn "#{name} should not be included. Use extend."
-  end
-
-  # Register this class for the given langs.
-  # Example:
-  #   class MyPlugin < PluginHost::BaseClass
-  #     register_for :my_id
-  #     ...
-  #   end
-  #
-  # See PluginHost.register.
-  def register_for *ids
-    plugin_host.register self, *ids
+    
   end
   
-  # Returns the title of the plugin, or sets it to the
-  # optional argument +title+.
-  def title title = nil
-    if title
-      @title = title.to_s
-    else
-      @title ||= name[/([^:]+)$/, 1]
-    end
-  end
-
-  # The host for this Plugin class.
-  def plugin_host host = nil
-    if host and not host.is_a? PluginHost
-      raise ArgumentError,
-        "PluginHost expected, but #{host.class} given."
-    end
-    self.const_set :PLUGIN_HOST, host if host
-    self::PLUGIN_HOST
-  end
-
-  # Require some helper files.
+  
+  # = Plugin
   #
-  # Example:
+  #  Plugins have to extend this module.
   #
-  #  class MyPlugin < PluginHost::BaseClass
-  #     register_for :my_id
-  #     helper :my_helper
+  #  IMPORTANT: Use extend for this module.
   #
-  # The above example loads the file myplugin/my_helper.rb relative to the
-  # file in which MyPlugin was defined.
-  # 
-  # You can also load a helper from a different plugin:
-  # 
-  #  helper 'other_plugin/helper_name'
-  def helper *helpers
-    for helper in helpers
-      if helper.is_a?(String) && helper[/\//]
-        self::PLUGIN_HOST.require_helper $`, $'
+  #  See CodeRay::PluginHost for examples.
+  module Plugin
+    
+    attr_reader :plugin_id
+    
+    # Register this class for the given +id+.
+    # 
+    # Example:
+    #   class MyPlugin < PluginHost::BaseClass
+    #     register_for :my_id
+    #     ...
+    #   end
+    #
+    # See PluginHost.register.
+    def register_for id
+      @plugin_id = id
+      plugin_host.register self, id
+    end
+    
+    # Returns the title of the plugin, or sets it to the
+    # optional argument +title+.
+    def title title = nil
+      if title
+        @title = title.to_s
       else
-        self::PLUGIN_HOST.require_helper plugin_id, helper.to_s
+        @title ||= name[/([^:]+)$/, 1]
       end
     end
+    
+    # The PluginHost for this Plugin class.
+    def plugin_host host = nil
+      if host.is_a? PluginHost
+        const_set :PLUGIN_HOST, host
+      end
+      self::PLUGIN_HOST
+    end
+    
+    def aliases
+      plugin_host.load_plugin_map
+      plugin_host.plugin_hash.inject [] do |aliases, (key, _)|
+        aliases << key if plugin_host[key] == self
+        aliases
+      end
+    end
+    
   end
-
-  # Returns the pulgin id used by the engine.
-  def plugin_id
-    name[/\w+$/].downcase
-  end
-
-end
-
-# Convenience method for plugin loading.
-# The syntax used is:
-#
-#  CodeRay.require_plugin '<Host ID>/<Plugin ID>'
-#
-# Returns the loaded plugin.
-def self.require_plugin path
-  host_id, plugin_id = path.split '/', 2
-  host = PluginHost.host_by_id(host_id)
-  raise PluginHost::HostNotFound,
-    "No host for #{host_id.inspect} found." unless host
-  host.load plugin_id
+  
 end
-
-end
\ No newline at end of file
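
The new make_plugin_hash relies on a Hash default block to load plugins on first access and to fall back to a :default entry when loading fails. The following standalone sketch shows that pattern in isolation. It is not CodeRay code: the PluginSketch module, the loaders argument (callables standing in for `require path`), and the choice to cache the loaded value inside the block are all assumptions made for the demo.

```ruby
# Standalone sketch of a lazily loading plugin hash (hypothetical names).
module PluginSketch
  class PluginNotFound < LoadError; end

  # +loaders+ maps plugin ids to callables; each call stands in for
  # requiring the plugin file and obtaining the registered plugin.
  def self.make_plugin_hash loaders
    Hash.new do |h, id|
      if loaders.key? id
        h[id] = loaders[id].call   # load on first access, then cache
      elsif h.key? :default
        h[:default]                # fall back without caching the miss
      else
        raise PluginNotFound, 'could not load plugin %p' % [id]
      end
    end
  end
end

plugins = PluginSketch.make_plugin_hash(:ruby => lambda { :RubyScanner })
plugins[:default] = :TextScanner
p plugins[:ruby]     # loaded through the default block
p plugins[:unknown]  # falls back to the :default entry
```

Because the fallback value is returned but not stored, a plugin that becomes available later can still be picked up on a subsequent lookup, which matches the shape of the block in the diff.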
diff --git a/lib/coderay/helpers/word_list.rb b/lib/coderay/helpers/word_list.rb
index 9b4f456..ea969c3 100644
--- a/lib/coderay/helpers/word_list.rb
+++ b/lib/coderay/helpers/word_list.rb
@@ -1,138 +1,77 @@
 module CodeRay
-
-# = WordList
-# 
-# <b>A Hash subclass designed for mapping word lists to token types.</b>
-# 
-# Copyright (c) 2006 by murphy (Kornelius Kalnbach) <murphy rubychan de>
-#
-# License:: LGPL / ask the author
-# Version:: 1.1 (2006-Oct-19)
-#
-# A WordList is a Hash with some additional features.
-# It is intended to be used for keyword recognition.
-#
-# WordList is highly optimized to be used in Scanners,
-# typically to decide whether a given ident is a special token.
-#
-# For case insensitive words use CaseIgnoringWordList.
-#
-# Example:
-#
-#  # define word arrays
-#  RESERVED_WORDS = %w[
-#    asm break case continue default do else
-#    ...
-#  ]
-#  
-#  PREDEFINED_TYPES = %w[
-#    int long short char void
-#    ...
-#  ]
-#  
-#  PREDEFINED_CONSTANTS = %w[
-#    EOF NULL ...
-#  ]
-#  
-#  # make a WordList
-#  IDENT_KIND = WordList.new(:ident).
-#    add(RESERVED_WORDS, :reserved).
-#    add(PREDEFINED_TYPES, :pre_type).
-#    add(PREDEFINED_CONSTANTS, :pre_constant)
-#
-#  ...
-#
-#  def scan_tokens tokens, options
-#    ...
-#    
-#    elsif scan(/[A-Za-z_][A-Za-z_0-9]*/)
-#      # use it
-#      kind = IDENT_KIND[match]
-#      ...
-class WordList < Hash
-
-  # Creates a new WordList with +default+ as default value.
-  # 
-  # You can activate +caching+ to store the results for every [] request.
+  
+  # = WordList
   # 
-  # With caching, methods like +include?+ or +delete+ may no longer behave
-  # as you expect. Therefore, it is recommended to use the [] method only.
-  def initialize default = false, caching = false, &block
-    if block
-      raise ArgumentError, 'Can\'t combine block with caching.' if caching
-      super(&block)
-    else
-      if caching
-        super() do |h, k|
-          h[k] = h.fetch k, default
-        end
-      else
-        super default
-      end
-    end
-  end
-
-  # Add words to the list and associate them with +kind+.
+  # <b>A Hash subclass designed for mapping word lists to token types.</b>
   # 
-  # Returns +self+, so you can concat add calls.
-  def add words, kind = true
-    words.each do |word|
-      self[word] = kind
+  # Copyright (c) 2006-2011 by murphy (Kornelius Kalnbach) <murphy rubychan de>
+  #
+  # License:: LGPL / ask the author
+  # Version:: 2.0 (2011-05-08)
+  #
+  # A WordList is a Hash with some additional features.
+  # It is intended to be used for keyword recognition.
+  #
+  # WordList is optimized to be used in Scanners,
+  # typically to decide whether a given ident is a special token.
+  #
+  # For case insensitive words use WordList::CaseIgnoring.
+  #
+  # Example:
+  #
+  #  # define word arrays
+  #  RESERVED_WORDS = %w[
+  #    asm break case continue default do else
+  #  ]
+  #  
+  #  PREDEFINED_TYPES = %w[
+  #    int long short char void
+  #  ]
+  #  
+  #  # make a WordList
+  #  IDENT_KIND = WordList.new(:ident).
+  #    add(RESERVED_WORDS, :reserved).
+  #    add(PREDEFINED_TYPES, :predefined_type)
+  #  
+  #  ...
+  #  
+  #  def scan_tokens tokens, options
+  #    ...
+  #    
+  #    elsif scan(/[A-Za-z_][A-Za-z_0-9]*/)
+  #      # use it
+  #      kind = IDENT_KIND[match]
+  #      ...
+  class WordList < Hash
+    
+    # Create a new WordList with +default+ as default value.
+    def initialize default = false
+      super default
     end
-    self
-  end
-
-end
-
-
-# A CaseIgnoringWordList is like a WordList, only that
-# keys are compared case-insensitively.
-# 
-# Ignoring the text case is realized by sending the +downcase+ message to
-# all keys.
-# 
-# Caching usually makes a CaseIgnoringWordList faster, but it has to be
-# activated explicitely.
-class CaseIgnoringWordList < WordList
-
-  # Creates a new case-insensitive WordList with +default+ as default value.
-  # 
-  # You can activate caching to store the results for every [] request.
-  # This speeds up subsequent lookups for the same word, but also
-  # uses memory.
-  def initialize default = false, caching = false
-    if caching
-      super(default, false) do |h, k|
-        h[k] = h.fetch k.downcase, default
-      end
-    else
-      super(default, false)
-      extend Uncached
+    
+    # Add words to the list and associate them with +value+.
+    # 
+    # Returns +self+, so you can concat add calls.
+    def add words, value = true
+      words.each { |word| self[word] = value }
+      self
     end
+    
   end
   
-  module Uncached  # :nodoc:
+  
+  # A CaseIgnoring WordList is like a WordList, only that
+  # keys are compared case-insensitively (normalizing keys using +downcase+).
+  class WordList::CaseIgnoring < WordList
+    
     def [] key
-      super(key.downcase)
+      super key.downcase
     end
-  end
-
-  # Add +words+ to the list and associate them with +kind+.
-  def add words, kind = true
-    words.each do |word|
-      self[word.downcase] = kind
+    
+    def []= key, value
+      super key.downcase, value
     end
-    self
+    
   end
-
-end
-
+  
 end
-
-__END__
-# check memory consumption
-END {
-  ObjectSpace.each_object(CodeRay::CaseIgnoringWordList) do |wl|
-    p wl.inject(0) { |memo, key, value| memo + key.size + 24 }
-  end
-}
\ No newline at end of file
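
To illustrate the simplified WordList API after this change (no caching argument, and case-insensitivity handled by overriding [] and []=), here is a small standalone sketch. It re-declares minimal copies of the two classes inside a hypothetical WordListSketch module so that it runs without CodeRay installed; the word arrays and constant names are made up for the demo.

```ruby
# Standalone sketch: minimal copies of the classes from the hunk above,
# wrapped in a hypothetical module so CodeRay itself is not required.
module WordListSketch
  class WordList < Hash
    # Create a new WordList with +default+ as default value.
    def initialize default = false
      super default
    end

    # Add words and associate them with +value+; returns self for chaining.
    def add words, value = true
      words.each { |word| self[word] = value }
      self
    end
  end

  # Keys are normalized with downcase on both lookup and assignment.
  class WordList::CaseIgnoring < WordList
    def [] key
      super key.downcase
    end

    def []= key, value
      super key.downcase, value
    end
  end

  IDENT_KIND = WordList.new(:ident).
    add(%w[if else while], :keyword).
    add(%w[int char], :predefined_type)

  SQL_KIND = WordList::CaseIgnoring.new(:ident).
    add(%w[SELECT From], :keyword)
end

p WordListSketch::IDENT_KIND['while']   # a known keyword
p WordListSketch::IDENT_KIND['foo']     # unknown words yield the default
p WordListSketch::SQL_KIND['select']    # lookup ignores case
```

Since add goes through self[word] = value, the CaseIgnoring subclass no longer needs its own add method: the overridden []= downcases every key as it is stored.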
diff --git a/lib/coderay/scanner.rb b/lib/coderay/scanner.rb
index c4fcb8a..907cf00 100644
--- a/lib/coderay/scanner.rb
+++ b/lib/coderay/scanner.rb
@@ -1,7 +1,10 @@
-module CodeRay
-
-  require 'coderay/helpers/plugin'
+# encoding: utf-8
+require 'strscan'
 
+module CodeRay
+  
+  autoload :WordList, coderay_path('helpers', 'word_list')
+  
   # = Scanners
   #
   # This module holds the Scanner class and its subclasses.
@@ -16,9 +19,8 @@ module CodeRay
   module Scanners
     extend PluginHost
     plugin_path File.dirname(__FILE__), 'scanners'
-
-    require 'strscan'
-
+    
+    
     # = Scanner
     #
     # The base class for all Scanners.
@@ -46,64 +48,89 @@ module CodeRay
       
       extend Plugin
       plugin_host Scanners
-
+      
       # Raised if a Scanner fails while scanning
-      ScanError = Class.new(Exception)
-
-      require 'coderay/helpers/word_list'
-
+      ScanError = Class.new StandardError
+      
       # The default options for all scanner classes.
       #
       # Define @default_options for subclasses.
-      DEFAULT_OPTIONS = { :stream => false }
+      DEFAULT_OPTIONS = { }
+      
+      KINDS_NOT_LOC = [:comment, :doctype, :docstring]
+      
+      attr_accessor :state
       
-      KINDS_NOT_LOC = [:comment, :doctype]
-
       class << self
-
-        # Returns if the Scanner can be used in streaming mode.
-        def streamable?
-          is_a? Streamable
+        
+        # Normalizes the given code into a string with UNIX newlines, in the
+        # scanner's internal encoding, with invalid and undefined characters
+        # replaced by placeholders. Always returns a new object.
+        def normalize code
+          # original = code
+          code = code.to_s unless code.is_a? ::String
+          return code if code.empty?
+          
+          if code.respond_to? :encoding
+            code = encode_with_encoding code, self.encoding
+          else
+            code = to_unix code
+          end
+          # code = code.dup if code.eql? original
+          code
         end
-
-        def normify code
-          code = code.to_s
-          if code.respond_to?(:encoding) && (code.encoding.name != 'UTF-8' || !code.valid_encoding?)
-            code = code.dup
-            original_encoding = code.encoding
-            code.force_encoding 'Windows-1252'
-            unless code.valid_encoding?
-              code.force_encoding original_encoding
-              if code.encoding.name == 'UTF-8'
-                code.encode! 'UTF-16BE', :invalid => :replace, :undef => :replace, :replace => '?'
-              end
-              code.encode! 'UTF-8', :invalid => :replace, :undef => :replace, :replace => '?'
+        
+        # The typical filename suffix for this scanner's language.
+        def file_extension extension = lang
+          @file_extension ||= extension.to_s
+        end
+        
+        # The encoding used internally by this scanner.
+        def encoding name = 'UTF-8'
+          @encoding ||= defined?(Encoding.find) && Encoding.find(name)
+        end
+        
+        # The lang of this Scanner class, which is equal to its Plugin ID.
+        def lang
+          @plugin_id
+        end
+        
+      protected
+        
+        def encode_with_encoding code, target_encoding
+          if code.encoding == target_encoding
+            if code.valid_encoding?
+              return to_unix(code)
+            else
+              source_encoding = guess_encoding code
             end
+          else
+            source_encoding = code.encoding
           end
-          code.to_unix
+          # print "encode_with_encoding from #{source_encoding} to #{target_encoding}"
+          code.encode target_encoding, source_encoding, :universal_newline => true, :undef => :replace, :invalid => :replace
         end
         
-        def file_extension extension = nil
-          if extension
-            @file_extension = extension.to_s
-          else
-            @file_extension ||= plugin_id.to_s
+        def to_unix code
+          code.index(?\r) ? code.gsub(/\r\n?/, "\n") : code
+        end
+        
+        def guess_encoding s
+          #:nocov:
+          IO.popen("file -b --mime -", "w+") do |file|
+            file.write s[0, 1024]
+            file.close_write
+            begin
+              Encoding.find file.gets[/charset=([-\w]+)/, 1]
+            rescue ArgumentError
+              Encoding::BINARY
+            end
           end
+          #:nocov:
         end
-
+        
       end
-
-=begin
-## Excluded for speed reasons; protected seems to make methods slow.
-
-  # Save the StringScanner methods from being called.
-  # This would not be useful for highlighting.
-  strscan_public_methods =
-    StringScanner.instance_methods -
-    StringScanner.ancestors[1].instance_methods
-  protected(*strscan_public_methods)
-=end
-
+      
       # Create a new Scanner.
       #
       # * +code+ is the input String and is handled by the superclass
@@ -111,146 +138,147 @@ module CodeRay
       # * +options+ is a Hash with Symbols as keys.
       #   It is merged with the default options of the class (you can
       #   overwrite default options here.)
-      # * +block+ is the callback for streamed highlighting.
-      #
-      # If you set :stream to +true+ in the options, the Scanner uses a
-      # TokenStream with the +block+ as callback to handle the tokens.
       #
       # Else, a Tokens object is used.
-      def initialize code='', options = {}, &block
-        raise "I am only the basic Scanner class. I can't scan "\
-          "anything. :( Use my subclasses." if self.class == Scanner
+      def initialize code = '', options = {}
+        if self.class == Scanner
+          raise NotImplementedError, "I am only the basic Scanner class. I can't scan anything. :( Use my subclasses."
+        end
         
         @options = self.class::DEFAULT_OPTIONS.merge options
-
-        super Scanner.normify(code)
-
-        @tokens = options[:tokens]
-        if @options[:stream]
-          warn "warning in CodeRay::Scanner.new: :stream is set, "\
-            "but no block was given" unless block_given?
-          raise NotStreamableError, self unless kind_of? Streamable
-          @tokens ||= TokenStream.new(&block)
-        else
-          warn "warning in CodeRay::Scanner.new: Block given, "\
-            "but :stream is #{@options[:stream]}" if block_given?
-          @tokens ||= Tokens.new
-        end
-        @tokens.scanner = self
-
+        
+        super self.class.normalize(code)
+        
+        @tokens = options[:tokens] || Tokens.new
+        @tokens.scanner = self if @tokens.respond_to? :scanner=
+        
         setup
       end
-
+      
+      # Resets the scanner. Subclasses should redefine the reset_instance
+      # method instead of this one.
       def reset
         super
         reset_instance
       end
-
+      
+      # Set a new string to be scanned.
       def string= code
-        code = Scanner.normify(code)
-        if defined?(RUBY_DESCRIPTION) && RUBY_DESCRIPTION['rubinius 1.0.1']
-          reset_state
-          @string = code
-        else
-          super code
-        end
+        code = self.class.normalize(code)
+        super code
         reset_instance
       end
-
-      # More mnemonic accessor name for the input string.
-      alias code string
-      alias code= string=
-
-      # Returns the Plugin ID for this scanner.
+      
+      # the Plugin ID for this scanner
       def lang
-        self.class.plugin_id
+        self.class.lang
       end
-
-      # Scans the code and returns all tokens in a Tokens object.
-      def tokenize new_string=nil, options = {}
+      
+      # the default file extension for this scanner
+      def file_extension
+        self.class.file_extension
+      end
+      
+      # Scan the code and return all tokens in a Tokens object.
+      def tokenize source = nil, options = {}
         options = @options.merge(options)
-        self.string = new_string if new_string
-        @cached_tokens =
-          if @options[:stream]  # :stream must have been set already
-            reset unless new_string
-            scan_tokens @tokens, options
-            @tokens
-          else
-            scan_tokens @tokens, options
-          end
+        @tokens = options[:tokens] || @tokens || Tokens.new
+        @tokens.scanner = self if @tokens.respond_to? :scanner=
+        case source
+        when Array
+          self.string = self.class.normalize(source.join)
+        when nil
+          reset
+        else
+          self.string = self.class.normalize(source)
+        end
+        
+        begin
+          scan_tokens @tokens, options
+        rescue => e
+          message = "Error in %s#scan_tokens, initial state was: %p" % [self.class, defined?(state) && state]
+          raise_inspect e.message, @tokens, message, 30, e.backtrace
+        end
+        
+        @cached_tokens = @tokens
+        if source.is_a? Array
+          @tokens.split_into_parts(*source.map { |part| part.size })
+        else
+          @tokens
+        end
       end
-
+      
+      # Cache the result of tokenize.
       def tokens
         @cached_tokens ||= tokenize
       end
       
-      # Whether the scanner is in streaming mode.
-      def streaming?
-        !!@options[:stream]
-      end
-
-      # Traverses the tokens.
+      # Traverse the tokens.
       def each &block
-        raise ArgumentError,
-          'Cannot traverse TokenStream.' if @options[:stream]
         tokens.each(&block)
       end
       include Enumerable
-
-      # The current line position of the scanner.
+      
+      # The current line position of the scanner, starting with 1.
+      # See also: #column.
       #
       # Beware, this is implemented inefficiently. It should be used
       # for debugging only.
-      def line
-        string[0..pos].count("\n") + 1
+      def line pos = self.pos
+        return 1 if pos <= 0
+        binary_string[0...pos].count("\n") + 1
       end
       
+      # The current column position of the scanner, starting with 1.
+      # See also: #line.
       def column pos = self.pos
-        return 0 if pos <= 0
-        string = string()
-        if string.respond_to?(:bytesize) && (defined?(@bin_string) || string.bytesize != string.size)
-          @bin_string ||= string.dup.force_encoding('binary')
-          string = @bin_string
-        end
-        pos - (string.rindex(?\n, pos) || 0)
+        return 1 if pos <= 0
+        pos - (binary_string.rindex(?\n, pos - 1) || -1)
       end
       
-      def marshal_dump
-        @options
+      # The string in binary encoding.
+      # 
+      # To be used with #pos, which is the index of the byte the scanner
+      # will scan next.
+      def binary_string
+        @binary_string ||=
+          if string.respond_to?(:bytesize) && string.bytesize != string.size
+            #:nocov:
+            string.dup.force_encoding('binary')
+            #:nocov:
+          else
+            string
+          end
       end
       
-      def marshal_load options
-        @options = options
-      end
-
     protected
-
+      
       # Can be implemented by subclasses to do some initialization
       # that has to be done once per instance.
       #
       # Use reset for initialization that has to be done once per
       # scan.
-      def setup
+      def setup  # :doc:
       end
-
+      
       # This is the central method, and commonly the only one a
       # subclass implements.
       #
       # Subclasses must implement this method; it must return +tokens+
       # and must only use Tokens#<< for storing scanned tokens!
-      def scan_tokens tokens, options
-        raise NotImplementedError,
-          "#{self.class}#scan_tokens not implemented."
+      def scan_tokens tokens, options  # :doc:
+        raise NotImplementedError, "#{self.class}#scan_tokens not implemented."
       end
-
+      
+      # Resets the scanner.
       def reset_instance
-        @tokens.clear unless @options[:keep_tokens]
+        @tokens.clear if @tokens.respond_to?(:clear) && !@options[:keep_tokens]
         @cached_tokens = nil
-        @bin_string = nil if defined? @bin_string
+        @binary_string = nil if defined? @binary_string
       end
-
+      
       # Scanner error with additional status information
-      def raise_inspect msg, tokens, state = 'No state given!', ambit = 30
+      def raise_inspect msg, tokens, state = self.state || 'No state given!', ambit = 30, backtrace = caller
         raise ScanError, <<-EOE % [
 
 
@@ -272,13 +300,13 @@ surrounding code:
         EOE
           File.basename(caller[0]),
           msg,
-          tokens.size,
-          tokens.last(10).map { |t| t.inspect }.join("\n"),
+          tokens.respond_to?(:size) ? tokens.size : 0,
+          tokens.respond_to?(:last) ? tokens.last(10).map { |t| t.inspect }.join("\n") : '',
           line, column, pos,
           matched, state, bol?, eos?,
-          string[pos - ambit, ambit],
-          string[pos, ambit],
-        ]
+          binary_string[pos - ambit, ambit],
+          binary_string[pos, ambit],
+        ], backtrace
       end
       
       # Shorthand for scan_until(/\z/).
@@ -288,19 +316,8 @@ surrounding code:
         terminate
         rest
       end
-
-    end
-
-  end
-end
-
-class String
-  # I love this hack. It seems to silence all dos/unix/mac newline problems.
-  def to_unix
-    if index ?\r
-      gsub(/\r\n?/, "\n")
-    else
-      self
+      
     end
+    
   end
 end
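
The scanner.rb changes above move newline normalization into a to_unix class method and make both line and column 1-based. The sketch below restates those three calculations as free-standing functions so they can be tried outside CodeRay. The ScannerSketch module and its two-argument signatures are assumptions for the demo; the real methods read the scanner's own binary_string and pos, and the sample input here is ASCII-only so that byte offsets and character offsets coincide.

```ruby
# Standalone sketch of the newline and position helpers (hypothetical module).
module ScannerSketch
  module_function

  # Convert DOS (\r\n) and old Mac (\r) newlines to UNIX (\n).
  # Returns the input unchanged when it contains no carriage return.
  def to_unix code
    code.index(?\r) ? code.gsub(/\r\n?/, "\n") : code
  end

  # 1-based line number of the offset +pos+ in +string+.
  def line string, pos
    return 1 if pos <= 0
    string[0...pos].count("\n") + 1
  end

  # 1-based column of the offset +pos+ in +string+.
  def column string, pos
    return 1 if pos <= 0
    pos - (string.rindex(?\n, pos - 1) || -1)
  end
end

code = ScannerSketch.to_unix("ab\r\ncd\ref")
p code
p [ScannerSketch.line(code, 4), ScannerSketch.column(code, 4)]
```

Searching backwards from pos - 1 rather than pos keeps a newline sitting exactly at pos from being counted as the start of the current line, which is the off-by-one the new column implementation addresses.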
diff --git a/lib/coderay/scanners/_map.rb b/lib/coderay/scanners/_map.rb
index 01078c1..a240298 100644
--- a/lib/coderay/scanners/_map.rb
+++ b/lib/coderay/scanners/_map.rb
@@ -1,23 +1,24 @@
 module CodeRay
 module Scanners
-
+  
   map \
-    :h => :c,
-    :cplusplus => :cpp,
-    :'c++' => :cpp,
-    :ecma => :java_script,
-    :ecmascript => :java_script,
+    :'c++'       => :cpp,
+    :cplusplus   => :cpp,
+    :ecmascript  => :java_script,
     :ecma_script => :java_script,
-    :irb => :ruby,
-    :javascript => :java_script,
-    :js => :java_script,
-    :nitro => :nitro_xhtml,
-    :pascal => :delphi,
-    :plain => :plaintext,
-    :xhtml => :html,
-    :yml => :yaml
-
-  default :plain
-
+    :rhtml       => :erb,
+    :eruby       => :erb,
+    :irb         => :ruby,
+    :javascript  => :java_script,
+    :js          => :java_script,
+    :pascal      => :delphi,
+    :patch       => :diff,
+    :plain       => :text,
+    :plaintext   => :text,
+    :xhtml       => :html,
+    :yml         => :yaml
+  
+  default :text
+  
 end
 end
diff --git a/lib/coderay/scanners/c.rb b/lib/coderay/scanners/c.rb
index d7f2be7..8d24b99 100644
--- a/lib/coderay/scanners/c.rb
+++ b/lib/coderay/scanners/c.rb
@@ -1,46 +1,47 @@
 module CodeRay
 module Scanners
-
+  
+  # Scanner for C.
   class C < Scanner
 
-    include Streamable
-    
     register_for :c
     file_extension 'c'
-
-    RESERVED_WORDS = [
+    
+    KEYWORDS = [
       'asm', 'break', 'case', 'continue', 'default', 'do',
       'else', 'enum', 'for', 'goto', 'if', 'return',
       'sizeof', 'struct', 'switch', 'typedef', 'union', 'while',
       'restrict',  # added in C99
-    ]
+    ]  # :nodoc:
 
     PREDEFINED_TYPES = [
       'int', 'long', 'short', 'char',
       'signed', 'unsigned', 'float', 'double',
       'bool', 'complex',  # added in C99
-    ]
+    ]  # :nodoc:
 
     PREDEFINED_CONSTANTS = [
       'EOF', 'NULL',
       'true', 'false',  # added in C99
-    ]
+    ]  # :nodoc:
     DIRECTIVES = [
       'auto', 'extern', 'register', 'static', 'void',
       'const', 'volatile',  # added in C89
       'inline',  # added in C99
-    ]
+    ]  # :nodoc:
 
     IDENT_KIND = WordList.new(:ident).
-      add(RESERVED_WORDS, :reserved).
-      add(PREDEFINED_TYPES, :pre_type).
+      add(KEYWORDS, :keyword).
+      add(PREDEFINED_TYPES, :predefined_type).
       add(DIRECTIVES, :directive).
-      add(PREDEFINED_CONSTANTS, :pre_constant)
-
-    ESCAPE = / [rbfntv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x
-    UNICODE_ESCAPE =  / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} /x
+      add(PREDEFINED_CONSTANTS, :predefined_constant)  # :nodoc:
 
-    def scan_tokens tokens, options
+    ESCAPE = / [rbfntv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x  # :nodoc:
+    UNICODE_ESCAPE =  / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} /x  # :nodoc:
+    
+  protected
+    
+    def scan_tokens encoder, options
 
       state = :initial
       label_expected = true
@@ -50,9 +51,6 @@ module Scanners
 
       until eos?
 
-        kind = nil
-        match = nil
-        
         case state
 
         when :initial
@@ -62,15 +60,10 @@ module Scanners
               in_preproc_line = false
               label_expected = label_expected_before_preproc_line
             end
-            tokens << [match, :space]
-            next
+            encoder.text_token match, :space
 
-          elsif scan(%r! // [^\n\\]* (?: \\. [^\n\\]* )* | /\* (?: .*? \*/ | .* ) !mx)
-            kind = :comment
-
-          elsif match = scan(/ \# \s* if \s* 0 /x)
-            match << scan_until(/ ^\# (?:elif|else|endif) .*? $ | \z /xm) unless eos?
-            kind = :comment
+          elsif match = scan(%r! // [^\n\\]* (?: \\. [^\n\\]* )* | /\* (?: .*? \*/ | .* ) !mx)
+            encoder.text_token match, :comment
 
           elsif match = scan(/ [-+*=<>?:;,!&^|()\[\]{}~%]+ | \/=? | \.(?!\d) /x)
             label_expected = match =~ /[;\{\}]/
@@ -78,7 +71,7 @@ module Scanners
               label_expected = true if match == ':'
               case_expected = false
             end
-            kind = :operator
+            encoder.text_token match, :operator
 
           elsif match = scan(/ [A-Za-z_][A-Za-z_0-9]* /x)
             kind = IDENT_KIND[match]
@@ -87,114 +80,107 @@ module Scanners
               match << matched
             else
               label_expected = false
-              if kind == :reserved
+              if kind == :keyword
                 case match
                 when 'case', 'default'
                   case_expected = true
                 end
               end
             end
+            encoder.text_token match, kind
 
-          elsif scan(/\$/)
-            kind = :ident
-          
           elsif match = scan(/L?"/)
-            tokens << [:open, :string]
+            encoder.begin_group :string
             if match[0] == ?L
-              tokens << ['L', :modifier]
+              encoder.text_token 'L', :modifier
               match = '"'
             end
+            encoder.text_token match, :delimiter
             state = :string
-            kind = :delimiter
 
-          elsif scan(/#[ \t]*(\w*)/)
-            kind = :preprocessor
+          elsif match = scan(/ \# \s* if \s* 0 /x)
+            match << scan_until(/ ^\# (?:elif|else|endif) .*? $ | \z /xm) unless eos?
+            encoder.text_token match, :comment
+
+          elsif match = scan(/#[ \t]*(\w*)/)
+            encoder.text_token match, :preprocessor
             in_preproc_line = true
             label_expected_before_preproc_line = label_expected
             state = :include_expected if self[1] == 'include'
 
-          elsif scan(/ L?' (?: [^\'\n\\] | \\ #{ESCAPE} )? '? /ox)
+          elsif match = scan(/ L?' (?: [^\'\n\\] | \\ #{ESCAPE} )? '? /ox)
             label_expected = false
-            kind = :char
+            encoder.text_token match, :char
 
-          elsif scan(/0[xX][0-9A-Fa-f]+/)
+          elsif match = scan(/\$/)
+            encoder.text_token match, :ident
+          
+          elsif match = scan(/0[xX][0-9A-Fa-f]+/)
             label_expected = false
-            kind = :hex
+            encoder.text_token match, :hex
 
-          elsif scan(/(?:0[0-7]+)(?![89.eEfF])/)
+          elsif match = scan(/(?:0[0-7]+)(?![89.eEfF])/)
             label_expected = false
-            kind = :oct
+            encoder.text_token match, :octal
 
-          elsif scan(/(?:\d+)(?![.eEfF])L?L?/)
+          elsif match = scan(/(?:\d+)(?![.eEfF])L?L?/)
             label_expected = false
-            kind = :integer
+            encoder.text_token match, :integer
 
-          elsif scan(/\d[fF]?|\d*\.\d+(?:[eE][+-]?\d+)?[fF]?|\d+[eE][+-]?\d+[fF]?/)
+          elsif match = scan(/\d[fF]?|\d*\.\d+(?:[eE][+-]?\d+)?[fF]?|\d+[eE][+-]?\d+[fF]?/)
             label_expected = false
-            kind = :float
+            encoder.text_token match, :float
 
           else
-            getch
-            kind = :error
+            encoder.text_token getch, :error
 
           end
 
         when :string
-          if scan(/[^\\\n"]+/)
-            kind = :content
-          elsif scan(/"/)
-            tokens << ['"', :delimiter]
-            tokens << [:close, :string]
+          if match = scan(/[^\\\n"]+/)
+            encoder.text_token match, :content
+          elsif match = scan(/"/)
+            encoder.text_token match, :delimiter
+            encoder.end_group :string
             state = :initial
             label_expected = false
-            next
-          elsif scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
-            kind = :char
-          elsif scan(/ \\ | $ /x)
-            tokens << [:close, :string]
-            kind = :error
+          elsif match = scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
+            encoder.text_token match, :char
+          elsif match = scan(/ \\ | $ /x)
+            encoder.end_group :string
+            encoder.text_token match, :error
             state = :initial
             label_expected = false
           else
-            raise_inspect "else case \" reached; %p not handled." % peek(1), tokens
+            raise_inspect "else case \" reached; %p not handled." % peek(1), encoder
           end
 
         when :include_expected
-          if scan(/<[^>\n]+>?|"[^"\n\\]*(?:\\.[^"\n\\]*)*"?/)
-            kind = :include
+          if match = scan(/<[^>\n]+>?|"[^"\n\\]*(?:\\.[^"\n\\]*)*"?/)
+            encoder.text_token match, :include
             state = :initial
 
           elsif match = scan(/\s+/)
-            kind = :space
+            encoder.text_token match, :space
             state = :initial if match.index ?\n
 
           else
             state = :initial
-            next
 
           end
 
         else
-          raise_inspect 'Unknown state', tokens
-
-        end
+          raise_inspect 'Unknown state', encoder
 
-        match ||= matched
-        if $CODERAY_DEBUG and not kind
-          raise_inspect 'Error token %p in line %d' %
-            [[match, kind], line], tokens
         end
-        raise_inspect 'Empty token', tokens unless match
-
-        tokens << [match, kind]
 
       end
 
       if state == :string
-        tokens << [:close, :string]
+        encoder.end_group :string
       end
 
-      tokens
+      encoder
     end
 
   end
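
Note on the change above: the scanner no longer collects [text, kind] pairs in a tokens array; it hands each token to an encoder object through text_token, begin_group and end_group. The following is a minimal sketch of that calling pattern, not CodeRay's own code. RecordingEncoder and scan_string_literal are hypothetical names introduced only for illustration, and the escape handling is simplified to "backslash plus one character".

```ruby
require 'strscan'

# Hypothetical stand-in for an encoder: it only records the calls it receives.
class RecordingEncoder
  attr_reader :events

  def initialize
    @events = []
  end

  def text_token(text, kind)
    @events << [:text, text, kind]
  end

  def begin_group(kind)
    @events << [:begin_group, kind]
  end

  def end_group(kind)
    @events << [:end_group, kind]
  end
end

# Scans one double-quoted literal in the style of the :string state above,
# pushing every token to the encoder as soon as it is matched.
def scan_string_literal(source, encoder)
  scanner = StringScanner.new(source)
  return encoder unless (match = scanner.scan(/"/))

  encoder.begin_group :string
  encoder.text_token match, :delimiter
  group_open = true

  until scanner.eos?
    if (match = scanner.scan(/[^\\\n"]+/))
      encoder.text_token match, :content
    elsif (match = scanner.scan(/\\./))       # simplified escape: backslash + one char
      encoder.text_token match, :char
    elsif (match = scanner.scan(/"/))
      encoder.text_token match, :delimiter
      encoder.end_group :string
      group_open = false
      break
    else
      encoder.text_token scanner.getch, :error  # always consume, so the loop terminates
    end
  end

  encoder.end_group :string if group_open  # unterminated literal: still close the group
  encoder
end

events = scan_string_literal('"a\n"', RecordingEncoder.new).events
```

Because every branch emits its own token, the shared "match ||= matched ... tokens << [match, kind]" tail that the diff removes is no longer needed.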
diff --git a/lib/coderay/scanners/clojure.rb b/lib/coderay/scanners/clojure.rb
new file mode 100644
index 0000000..f8fbf65
--- /dev/null
+++ b/lib/coderay/scanners/clojure.rb
@@ -0,0 +1,217 @@
+# encoding: utf-8
+module CodeRay
+  module Scanners
+    
+    # Clojure scanner by Licenser.
+    class Clojure < Scanner
+      
+      register_for :clojure
+      file_extension 'clj'
+      
+      SPECIAL_FORMS = %w[
+        def if do let quote var fn loop recur throw try catch monitor-enter monitor-exit .
+        new 
+      ]  # :nodoc:
+      
+      CORE_FORMS = %w[
+        + - -> ->> .. / * <= < = == >= > accessor aclone add-classpath add-watch
+        agent agent-error agent-errors aget alength alias all-ns alter alter-meta!
+        alter-var-root amap ancestors and apply areduce array-map aset aset-boolean
+        aset-byte aset-char aset-double aset-float aset-int aset-long aset-short
+        assert assoc assoc! assoc-in associative? atom await await-for bases bean
+        bigdec bigint binding bit-and bit-and-not bit-clear bit-flip bit-not bit-or
+        bit-set bit-shift-left bit-shift-right bit-test bit-xor boolean boolean-array
+        booleans bound-fn bound-fn* bound? butlast byte byte-array bytes case cast char
+        char-array char-escape-string char-name-string char? chars class class?
+        clear-agent-errors clojure-version coll? comment commute comp comparator
+        compare compare-and-set! compile complement concat cond condp conj conj!
+        cons constantly construct-proxy contains? count counted? create-ns
+        create-struct cycle dec decimal? declare definline defmacro defmethod defmulti
+        defn defn- defonce defprotocol defrecord defstruct deftype delay delay?
+        deliver denominator deref derive descendants disj disj! dissoc dissoc!
+        distinct distinct? doall doc dorun doseq dosync dotimes doto double
+        double-array doubles drop drop-last drop-while empty empty? ensure
+        enumeration-seq error-handler error-mode eval even? every? extend
+        extend-protocol extend-type extenders extends? false? ffirst file-seq
+        filter find find-doc find-ns find-var first float float-array float?
+        floats flush fn fn? fnext for force format future future-call future-cancel
+        future-cancelled? future-done? future? gen-class gen-interface gensym get
+        get-in get-method get-proxy-class get-thread-bindings get-validator hash
+        hash-map hash-set identical? identity if-let if-not ifn? import in-ns
+        inc init-proxy instance? int int-array integer? interleave intern
+        interpose into into-array ints io! isa? iterate iterator-seq juxt key
+        keys keyword keyword? last lazy-cat lazy-seq let letfn line-seq list list*
+        list? load load-file load-reader load-string loaded-libs locking long
+        long-array longs loop macroexpand macroexpand-1 make-array make-hierarchy
+        map map? mapcat max max-key memfn memoize merge merge-with meta methods
+        min min-key mod name namespace neg? newline next nfirst nil? nnext not
+        not-any? not-empty not-every? not= ns ns-aliases ns-imports ns-interns
+        ns-map ns-name ns-publics ns-refers ns-resolve ns-unalias ns-unmap nth
+        nthnext num number? numerator object-array odd? or parents partial
+        partition pcalls peek persistent! pmap pop pop! pop-thread-bindings
+        pos? pr pr-str prefer-method prefers print print-namespace-doc
+        print-str printf println println-str prn prn-str promise proxy
+        proxy-mappings proxy-super push-thread-bindings pvalues quot rand
+        rand-int range ratio? rationalize re-find re-groups re-matcher
+        re-matches re-pattern re-seq read read-line read-string reduce ref
+        ref-history-count ref-max-history ref-min-history ref-set refer
+        refer-clojure reify release-pending-sends rem remove remove-all-methods
+        remove-method remove-ns remove-watch repeat repeatedly replace replicate
+        require reset! reset-meta! resolve rest restart-agent resultset-seq
+        reverse reversible? rseq rsubseq satisfies? second select-keys send
+        send-off seq seq? seque sequence sequential? set set-error-handler!
+        set-error-mode! set-validator! set? short short-array shorts
+        shutdown-agents slurp some sort sort-by sorted-map sorted-map-by
+        sorted-set sorted-set-by sorted? special-form-anchor special-symbol?
+        split-at split-with str string? struct struct-map subs subseq subvec
+        supers swap! symbol symbol? sync syntax-symbol-anchor take take-last
+        take-nth take-while test the-ns thread-bound? time to-array to-array-2d
+        trampoline transient tree-seq true? type unchecked-add unchecked-dec
+        unchecked-divide unchecked-inc unchecked-multiply unchecked-negate
+        unchecked-remainder unchecked-subtract underive update-in update-proxy
+        use val vals var-get var-set var? vary-meta vec vector vector-of vector?
+        when when-first when-let when-not while with-bindings with-bindings*
+        with-in-str with-local-vars with-meta with-open with-out-str
+        with-precision xml-seq zero? zipmap 
+      ]  # :nodoc:
+      
+      PREDEFINED_CONSTANTS = %w[
+        true false nil *1 *2 *3 *agent* *clojure-version* *command-line-args*
+        *compile-files* *compile-path* *e *err* *file* *flush-on-newline*
+        *in* *ns* *out* *print-dup* *print-length* *print-level* *print-meta*
+        *print-readably* *read-eval* *warn-on-reflection*
+      ]  # :nodoc:
+      
+      IDENT_KIND = WordList.new(:ident).
+        add(SPECIAL_FORMS, :keyword).
+        add(CORE_FORMS, :keyword).
+        add(PREDEFINED_CONSTANTS, :predefined_constant)
+      
+      KEYWORD_NEXT_TOKEN_KIND = WordList.new(nil).
+        add(%w[ def defn defn- definline defmacro defmulti defmethod defstruct defonce declare ], :function).
+        add(%w[ ns ], :namespace).
+        add(%w[ defprotocol defrecord ], :class)
+      
+      BASIC_IDENTIFIER = /[a-zA-Z$%*\/_+!?&<>\-=]=?[a-zA-Z0-9$&*+!\/_?<>\-\#]*/
+      IDENTIFIER = /(?!-\d)(?:(?:#{BASIC_IDENTIFIER}\.)*#{BASIC_IDENTIFIER}(?:\/#{BASIC_IDENTIFIER})?\.?)|\.\.?/
+      SYMBOL = /::?#{IDENTIFIER}/o
+      DIGIT = /\d/
+      DIGIT10 = DIGIT
+      DIGIT16 = /[0-9a-f]/i
+      DIGIT8 = /[0-7]/
+      DIGIT2 = /[01]/
+      RADIX16 = /\#x/i
+      RADIX8 = /\#o/i
+      RADIX2 = /\#b/i
+      RADIX10 = /\#d/i
+      EXACTNESS = /#i|#e/i
+      SIGN = /[\+-]?/
+      EXP_MARK = /[esfdl]/i
+      EXP = /#{EXP_MARK}#{SIGN}#{DIGIT}+/
+      SUFFIX = /#{EXP}?/
+      PREFIX10 = /#{RADIX10}?#{EXACTNESS}?|#{EXACTNESS}?#{RADIX10}?/
+      PREFIX16 = /#{RADIX16}#{EXACTNESS}?|#{EXACTNESS}?#{RADIX16}/
+      PREFIX8 = /#{RADIX8}#{EXACTNESS}?|#{EXACTNESS}?#{RADIX8}/
+      PREFIX2 = /#{RADIX2}#{EXACTNESS}?|#{EXACTNESS}?#{RADIX2}/
+      UINT10 = /#{DIGIT10}+#*/
+      UINT16 = /#{DIGIT16}+#*/
+      UINT8 = /#{DIGIT8}+#*/
+      UINT2 = /#{DIGIT2}+#*/
+      DECIMAL = /#{DIGIT10}+#+\.#*#{SUFFIX}|#{DIGIT10}+\.#{DIGIT10}*#*#{SUFFIX}|\.#{DIGIT10}+#*#{SUFFIX}|#{UINT10}#{EXP}/
+      UREAL10 = /#{UINT10}\/#{UINT10}|#{DECIMAL}|#{UINT10}/
+      UREAL16 = /#{UINT16}\/#{UINT16}|#{UINT16}/
+      UREAL8 = /#{UINT8}\/#{UINT8}|#{UINT8}/
+      UREAL2 = /#{UINT2}\/#{UINT2}|#{UINT2}/
+      REAL10 = /#{SIGN}#{UREAL10}/
+      REAL16 = /#{SIGN}#{UREAL16}/
+      REAL8 = /#{SIGN}#{UREAL8}/
+      REAL2 = /#{SIGN}#{UREAL2}/
+      IMAG10 = /i|#{UREAL10}i/
+      IMAG16 = /i|#{UREAL16}i/
+      IMAG8 = /i|#{UREAL8}i/
+      IMAG2 = /i|#{UREAL2}i/
+      COMPLEX10 = /#{REAL10}@#{REAL10}|#{REAL10}\+#{IMAG10}|#{REAL10}-#{IMAG10}|\+#{IMAG10}|-#{IMAG10}|#{REAL10}/
+      COMPLEX16 = /#{REAL16}@#{REAL16}|#{REAL16}\+#{IMAG16}|#{REAL16}-#{IMAG16}|\+#{IMAG16}|-#{IMAG16}|#{REAL16}/
+      COMPLEX8 = /#{REAL8}@#{REAL8}|#{REAL8}\+#{IMAG8}|#{REAL8}-#{IMAG8}|\+#{IMAG8}|-#{IMAG8}|#{REAL8}/
+      COMPLEX2 = /#{REAL2}@#{REAL2}|#{REAL2}\+#{IMAG2}|#{REAL2}-#{IMAG2}|\+#{IMAG2}|-#{IMAG2}|#{REAL2}/
+      NUM10 = /#{PREFIX10}?#{COMPLEX10}/
+      NUM16 = /#{PREFIX16}#{COMPLEX16}/
+      NUM8 = /#{PREFIX8}#{COMPLEX8}/
+      NUM2 = /#{PREFIX2}#{COMPLEX2}/
+      NUM = /#{NUM10}|#{NUM16}|#{NUM8}|#{NUM2}/
+      
+    protected
+      
+      def scan_tokens encoder, options
+        
+        state = :initial
+        kind = nil
+        
+        until eos?
+          
+          case state
+          when :initial
+            if match = scan(/ \s+ | \\\n | , /x)
+              encoder.text_token match, :space
+            elsif match = scan(/['`\(\[\)\]\{\}]|\#[({]|~@?|[@\^]/)
+              encoder.text_token match, :operator
+            elsif match = scan(/;.*/)
+              encoder.text_token match, :comment  # TODO: recognize (comment ...) too
+            elsif match = scan(/\#?\\(?:newline|space|.?)/)
+              encoder.text_token match, :char
+            elsif match = scan(/\#[ft]/)
+              encoder.text_token match, :predefined_constant
+            elsif match = scan(/#{IDENTIFIER}/o)
+              kind = IDENT_KIND[match]
+              encoder.text_token match, kind
+              if rest? && kind == :keyword
+                if kind = KEYWORD_NEXT_TOKEN_KIND[match]
+                  encoder.text_token match, :space if match = scan(/\s+/o)
+                  encoder.text_token match, kind if match = scan(/#{IDENTIFIER}/o)
+                end
+              end
+            elsif match = scan(/#{SYMBOL}/o)
+              encoder.text_token match, :symbol
+            elsif match = scan(/\./)
+              encoder.text_token match, :operator
+            elsif match = scan(/ \# \^ #{IDENTIFIER} /ox)
+              encoder.text_token match, :type
+            elsif match = scan(/ (\#)? " /x)
+              state = self[1] ? :regexp : :string
+              encoder.begin_group state
+              encoder.text_token match, :delimiter
+            elsif match = scan(/#{NUM}/o) and not matched.empty?
+              encoder.text_token match, match[/[.e\/]/i] ? :float : :integer
+            else
+              encoder.text_token getch, :error
+            end
+            
+          when :string, :regexp
+            if match = scan(/[^"\\]+|\\.?/)
+              encoder.text_token match, :content
+            elsif match = scan(/"/)
+              encoder.text_token match, :delimiter
+              encoder.end_group state
+              state = :initial
+            else
+              raise_inspect "else case \" reached; %p not handled." % peek(1),
+                encoder, state
+            end
+            
+          else
+            raise 'else case reached'
+            
+          end
+          
+        end
+        
+        if [:string, :regexp].include? state
+          encoder.end_group state
+        end
+        
+        encoder
+        
+      end
+    end
+  end
+end
\ No newline at end of file
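
Note on the new Clojure scanner: identifiers are classified through one word list (IDENT_KIND), and a second list (KEYWORD_NEXT_TOKEN_KIND) decides how the identifier following a defining keyword is tagged. The sketch below imitates that two-step lookup with plain Hash objects instead of CodeRay's WordList, on a much-reduced vocabulary. SKETCH_IDENT_KIND, SKETCH_NEXT_TOKEN_KIND and classify are hypothetical names, and the input is assumed to be already split into identifiers (the real scanner works on the raw text and skips whitespace itself).

```ruby
# Hypothetical, reduced word lists; a Hash default plays the role of WordList's default kind.
SKETCH_IDENT_KIND = Hash.new(:ident).merge!(
  'def' => :keyword, 'defn' => :keyword, 'ns' => :keyword, 'map' => :keyword,
  'true' => :predefined_constant, 'nil' => :predefined_constant
)
SKETCH_NEXT_TOKEN_KIND = Hash.new(nil).merge!(
  'def' => :function, 'defn' => :function, 'ns' => :namespace
)

# Returns [word, kind] pairs. After a keyword listed in SKETCH_NEXT_TOKEN_KIND,
# the next word takes the kind named there instead of its own lookup result.
def classify(words)
  pairs = []
  pending = nil
  words.each do |word|
    if pending
      pairs << [word, pending]
      pending = nil
    else
      kind = SKETCH_IDENT_KIND[word]
      pairs << [word, kind]
      pending = SKETCH_NEXT_TOKEN_KIND[word] if kind == :keyword
    end
  end
  pairs
end

pairs = classify(%w[defn square map ns my.app count])
```

Here "square" is tagged :function because it follows "defn", while "map" is a keyword that has no entry in the second list and therefore does not affect the word after it.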
diff --git a/lib/coderay/scanners/cpp.rb b/lib/coderay/scanners/cpp.rb
index c29083a..9da62f4 100644
--- a/lib/coderay/scanners/cpp.rb
+++ b/lib/coderay/scanners/cpp.rb
@@ -1,16 +1,17 @@
 module CodeRay
 module Scanners
 
+  # Scanner for C++.
+  # 
+  # Aliases: +cplusplus+, c++
   class CPlusPlus < Scanner
 
-    include Streamable
-    
     register_for :cpp
     file_extension 'cpp'
     title 'C++'
     
-    # http://www.cppreference.com/wiki/keywords/start
-    RESERVED_WORDS = [
+    #-- http://www.cppreference.com/wiki/keywords/start
+    KEYWORDS = [
       'and', 'and_eq', 'asm', 'bitand', 'bitor', 'break',
       'case', 'catch', 'class', 'compl', 'const_cast',
       'continue', 'default', 'delete', 'do', 'dynamic_cast', 'else',
@@ -18,37 +19,39 @@ module Scanners
       'not', 'not_eq', 'or', 'or_eq', 'reinterpret_cast', 'return',
       'sizeof', 'static_cast', 'struct', 'switch', 'template',
       'throw', 'try', 'typedef', 'typeid', 'typename', 'union',
-      'while', 'xor', 'xor_eq'
-    ]
-
+      'while', 'xor', 'xor_eq',
+    ]  # :nodoc:
+    
     PREDEFINED_TYPES = [
       'bool', 'char', 'double', 'float', 'int', 'long',
-      'short', 'signed', 'unsigned', 'wchar_t', 'string'
-    ]
+      'short', 'signed', 'unsigned', 'wchar_t', 'string',
+    ]  # :nodoc:
     PREDEFINED_CONSTANTS = [
       'false', 'true',
       'EOF', 'NULL',
-    ]
+    ]  # :nodoc:
     PREDEFINED_VARIABLES = [
-      'this'
-    ]
+      'this',
+    ]  # :nodoc:
     DIRECTIVES = [
       'auto', 'const', 'explicit', 'extern', 'friend', 'inline', 'mutable', 'operator',
       'private', 'protected', 'public', 'register', 'static', 'using', 'virtual', 'void',
-      'volatile'
-    ]
-
+      'volatile',
+    ]  # :nodoc:
+    
     IDENT_KIND = WordList.new(:ident).
-      add(RESERVED_WORDS, :reserved).
-      add(PREDEFINED_TYPES, :pre_type).
+      add(KEYWORDS, :keyword).
+      add(PREDEFINED_TYPES, :predefined_type).
       add(PREDEFINED_VARIABLES, :local_variable).
       add(DIRECTIVES, :directive).
-      add(PREDEFINED_CONSTANTS, :pre_constant)
-
-    ESCAPE = / [rbfntv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x
-    UNICODE_ESCAPE =  / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} /x
+      add(PREDEFINED_CONSTANTS, :predefined_constant)  # :nodoc:
 
-    def scan_tokens tokens, options
+    ESCAPE = / [rbfntv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x  # :nodoc:
+    UNICODE_ESCAPE =  / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} /x  # :nodoc:
+    
+  protected
+    
+    def scan_tokens encoder, options
 
       state = :initial
       label_expected = true
@@ -58,9 +61,6 @@ module Scanners
 
       until eos?
 
-        kind = nil
-        match = nil
-        
         case state
 
         when :initial
@@ -70,15 +70,14 @@ module Scanners
               in_preproc_line = false
               label_expected = label_expected_before_preproc_line
             end
-            tokens << [match, :space]
-            next
+            encoder.text_token match, :space
 
-          elsif scan(%r! // [^\n\\]* (?: \\. [^\n\\]* )* | /\* (?: .*? \*/ | .* ) !mx)
-            kind = :comment
+          elsif match = scan(%r! // [^\n\\]* (?: \\. [^\n\\]* )* | /\* (?: .*? \*/ | .* ) !mx)
+            encoder.text_token match, :comment
 
           elsif match = scan(/ \# \s* if \s* 0 /x)
             match << scan_until(/ ^\# (?:elif|else|endif) .*? $ | \z /xm) unless eos?
-            kind = :comment
+            encoder.text_token match, :comment
 
           elsif match = scan(/ [-+*=<>?:;,!&^|()\[\]{}~%]+ | \/=? | \.(?!\d) /x)
             label_expected = match =~ /[;\{\}]/
@@ -86,7 +85,7 @@ module Scanners
               label_expected = true if match == ':'
               case_expected = false
             end
-            kind = :operator
+            encoder.text_token match, :operator
 
           elsif match = scan(/ [A-Za-z_][A-Za-z_0-9]* /x)
             kind = IDENT_KIND[match]
@@ -95,7 +94,7 @@ module Scanners
               match << matched
             else
               label_expected = false
-              if kind == :reserved
+              if kind == :keyword
                 case match
                 when 'class'
                   state = :class_name_expected
@@ -104,122 +103,110 @@ module Scanners
                 end
               end
             end
+            encoder.text_token match, kind
 
-          elsif scan(/\$/)
-            kind = :ident
+          elsif match = scan(/\$/)
+            encoder.text_token match, :ident
           
           elsif match = scan(/L?"/)
-            tokens << [:open, :string]
+            encoder.begin_group :string
             if match[0] == ?L
-              tokens << ['L', :modifier]
+              encoder.text_token 'L', :modifier
               match = '"'
             end
             state = :string
-            kind = :delimiter
+            encoder.text_token match, :delimiter
 
-          elsif scan(/#[ \t]*(\w*)/)
-            kind = :preprocessor
+          elsif match = scan(/#[ \t]*(\w*)/)
+            encoder.text_token match, :preprocessor
             in_preproc_line = true
             label_expected_before_preproc_line = label_expected
             state = :include_expected if self[1] == 'include'
 
-          elsif scan(/ L?' (?: [^\'\n\\] | \\ #{ESCAPE} )? '? /ox)
+          elsif match = scan(/ L?' (?: [^\'\n\\] | \\ #{ESCAPE} )? '? /ox)
             label_expected = false
-            kind = :char
+            encoder.text_token match, :char
 
-          elsif scan(/0[xX][0-9A-Fa-f]+/)
+          elsif match = scan(/0[xX][0-9A-Fa-f]+/)
             label_expected = false
-            kind = :hex
+            encoder.text_token match, :hex
 
-          elsif scan(/(?:0[0-7]+)(?![89.eEfF])/)
+          elsif match = scan(/(?:0[0-7]+)(?![89.eEfF])/)
             label_expected = false
-            kind = :oct
+            encoder.text_token match, :octal
 
-          elsif scan(/(?:\d+)(?![.eEfF])L?L?/)
+          elsif match = scan(/(?:\d+)(?![.eEfF])L?L?/)
             label_expected = false
-            kind = :integer
+            encoder.text_token match, :integer
 
-          elsif scan(/\d[fF]?|\d*\.\d+(?:[eE][+-]?\d+)?[fF]?|\d+[eE][+-]?\d+[fF]?/)
+          elsif match = scan(/\d[fF]?|\d*\.\d+(?:[eE][+-]?\d+)?[fF]?|\d+[eE][+-]?\d+[fF]?/)
             label_expected = false
-            kind = :float
+            encoder.text_token match, :float
 
           else
-            getch
-            kind = :error
+            encoder.text_token getch, :error
 
           end
 
         when :string
-          if scan(/[^\\"]+/)
-            kind = :content
-          elsif scan(/"/)
-            tokens << ['"', :delimiter]
-            tokens << [:close, :string]
+          if match = scan(/[^\\"]+/)
+            encoder.text_token match, :content
+          elsif match = scan(/"/)
+            encoder.text_token match, :delimiter
+            encoder.end_group :string
             state = :initial
             label_expected = false
-            next
-          elsif scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
-            kind = :char
-          elsif scan(/ \\ | $ /x)
-            tokens << [:close, :string]
-            kind = :error
+          elsif match = scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
+            encoder.text_token match, :char
+          elsif match = scan(/ \\ | $ /x)
+            encoder.end_group :string
+            encoder.text_token match, :error
             state = :initial
             label_expected = false
           else
-            raise_inspect "else case \" reached; %p not handled." % peek(1), tokens
+            raise_inspect "else case \" reached; %p not handled." % peek(1), encoder
           end
 
         when :include_expected
-          if scan(/<[^>\n]+>?|"[^"\n\\]*(?:\\.[^"\n\\]*)*"?/)
-            kind = :include
+          if match = scan(/<[^>\n]+>?|"[^"\n\\]*(?:\\.[^"\n\\]*)*"?/)
+            encoder.text_token match, :include
             state = :initial
 
           elsif match = scan(/\s+/)
-            kind = :space
+            encoder.text_token match, :space
             state = :initial if match.index ?\n
 
           else
             state = :initial
-            next
 
           end
         
         when :class_name_expected
-          if scan(/ [A-Za-z_][A-Za-z_0-9]* /x)
-            kind = :class
+          if match = scan(/ [A-Za-z_][A-Za-z_0-9]* /x)
+            encoder.text_token match, :class
             state = :initial
 
           elsif match = scan(/\s+/)
-            kind = :space
+            encoder.text_token match, :space
 
           else
-            getch
-            kind = :error
+            encoder.text_token getch, :error
             state = :initial
 
           end
           
         else
-          raise_inspect 'Unknown state', tokens
+          raise_inspect 'Unknown state', encoder
 
         end
 
-        match ||= matched
-        if $CODERAY_DEBUG and not kind
-          raise_inspect 'Error token %p in line %d' %
-            [[match, kind], line], tokens
-        end
-        raise_inspect 'Empty token', tokens unless match
-
-        tokens << [match, kind]
-
       end
 
       if state == :string
-        tokens << [:close, :string]
+        encoder.end_group :string
       end
 
-      tokens
+      encoder
     end
 
   end
diff --git a/lib/coderay/scanners/css.rb b/lib/coderay/scanners/css.rb
index 099238f..34eaecb 100644
--- a/lib/coderay/scanners/css.rb
+++ b/lib/coderay/scanners/css.rb
@@ -2,208 +2,198 @@ module CodeRay
 module Scanners
 
   class CSS < Scanner
-
+    
     register_for :css
-
+    
     KINDS_NOT_LOC = [
       :comment,
       :class, :pseudo_class, :type,
       :constant, :directive,
-      :key, :value, :operator, :color, :float, :content, :delimiter,
+      :key, :value, :operator, :color, :float, :string,
       :error, :important,
-    ]
+    ]  # :nodoc:
     
-    module RE
+    module RE  # :nodoc:
       Hex = /[0-9a-fA-F]/
       Unicode = /\\#{Hex}{1,6}(?:\r\n|\s)?/ # differs from standard because it allows uppercase hex too
       Escape = /#{Unicode}|\\[^\r\n\f0-9a-fA-F]/
       NMChar = /[-_a-zA-Z0-9]|#{Escape}/
       NMStart = /[_a-zA-Z]|#{Escape}/
       NL = /\r\n|\r|\n|\f/
-      String1 = /"(?:[^\n\r\f\\"]|\\#{NL}|#{Escape})*"?/  # FIXME: buggy regexp
-      String2 = /'(?:[^\n\r\f\\']|\\#{NL}|#{Escape})*'?/  # FIXME: buggy regexp
+      String1 = /"(?:[^\n\r\f\\"]|\\#{NL}|#{Escape})*"?/  # TODO: buggy regexp
+      String2 = /'(?:[^\n\r\f\\']|\\#{NL}|#{Escape})*'?/  # TODO: buggy regexp
       String = /#{String1}|#{String2}/
-
+      
       HexColor = /#(?:#{Hex}{6}|#{Hex}{3})/
       Color = /#{HexColor}/
-
+      
       Num = /-?(?:[0-9]+|[0-9]*\.[0-9]+)/
       Name = /#{NMChar}+/
       Ident = /-?#{NMStart}#{NMChar}*/
       AtKeyword = /@#{Ident}/
       Percentage = /#{Num}%/
-
+      
       reldimensions = %w[em ex px]
       absdimensions = %w[in cm mm pt pc]
-      Unit = Regexp.union(*(reldimensions + absdimensions))
-
+      Unit = Regexp.union(*(reldimensions + absdimensions + %w[s]))
+      
       Dimension = /#{Num}#{Unit}/
-
+      
       Comment = %r! /\* (?: .*? \*/ | .* ) !mx
-      Function = /(?:url|alpha)\((?:[^)\n\r\f]|\\\))*\)?/
-
+      Function = /(?:url|alpha|attr|counters?)\((?:[^)\n\r\f]|\\\))*\)?/
+      
       Id = /##{Name}/
       Class = /\.#{Name}/
       PseudoClass = /:#{Name}/
       AttributeSelector = /\[[^\]]*\]?/
-
     end
-
-    def scan_tokens tokens, options
+    
+  protected
+    
+    def setup
+      @state = :initial
+      @value_expected = nil
+    end
+    
+    def scan_tokens encoder, options
+      states = Array(options[:state] || @state)
+      value_expected = @value_expected
       
-      value_expected = nil
-      states = [:initial]
-
       until eos?
-
-        kind = nil
-        match = nil
-
-        if scan(/\s+/)
-          kind = :space
-
+        
+        if match = scan(/\s+/)
+          encoder.text_token match, :space
+          
         elsif case states.last
           when :initial, :media
-            if scan(/(?>#{RE::Ident})(?!\()|\*/ox)
-              kind = :type
-            elsif scan RE::Class
-              kind = :class
-            elsif scan RE::Id
-              kind = :constant
-            elsif scan RE::PseudoClass
-              kind = :pseudo_class
+            if match = scan(/(?>#{RE::Ident})(?!\()|\*/ox)
+              encoder.text_token match, :type
+              next
+            elsif match = scan(RE::Class)
+              encoder.text_token match, :class
+              next
+            elsif match = scan(RE::Id)
+              encoder.text_token match, :constant
+              next
+            elsif match = scan(RE::PseudoClass)
+              encoder.text_token match, :pseudo_class
+              next
             elsif match = scan(RE::AttributeSelector)
               # TODO: Improve highlighting inside of attribute selectors.
-              tokens << [:open, :string]
-              tokens << [match[0,1], :delimiter]
-              tokens << [match[1..-2], :content] if match.size > 2
-              tokens << [match[-1,1], :delimiter] if match[-1] == ?]
-              tokens << [:close, :string]
+              encoder.text_token match[0,1], :operator
+              encoder.text_token match[1..-2], :attribute_name if match.size > 2
+              encoder.text_token match[-1,1], :operator if match[-1] == ?]
               next
             elsif match = scan(/@media/)
-              kind = :directive
+              encoder.text_token match, :directive
               states.push :media_before_name
+              next
             end
           
           when :block
-            if scan(/(?>#{RE::Ident})(?!\()/ox)
+            if match = scan(/(?>#{RE::Ident})(?!\()/ox)
               if value_expected
-                kind = :value
+                encoder.text_token match, :value
               else
-                kind = :key
+                encoder.text_token match, :key
               end
+              next
             end
-
+            
           when :media_before_name
-            if scan RE::Ident
-              kind = :type
+            if match = scan(RE::Ident)
+              encoder.text_token match, :type
               states[-1] = :media_after_name
+              next
             end
           
           when :media_after_name
-            if scan(/\{/)
-              kind = :operator
+            if match = scan(/\{/)
+              encoder.text_token match, :operator
               states[-1] = :media
+              next
             end
           
-          when :comment
-            if scan(/(?:[^*\s]|\*(?!\/))+/)
-              kind = :comment
-            elsif scan(/\*\//)
-              kind = :comment
-              states.pop
-            elsif scan(/\s+/)
-              kind = :space
-            end
-
           else
-            raise_inspect 'Unknown state', tokens
-
+            #:nocov:
+            raise_inspect 'Unknown state', encoder
+            #:nocov:
+            
           end
-
-        elsif scan(/\/\*/)
-          kind = :comment
-          states.push :comment
-
-        elsif scan(/\{/)
+          
+        elsif match = scan(/\/\*(?:.*?\*\/|\z)/m)
+          encoder.text_token match, :comment
+          
+        elsif match = scan(/\{/)
           value_expected = false
-          kind = :operator
+          encoder.text_token match, :operator
           states.push :block
-
-        elsif scan(/\}/)
+          
+        elsif match = scan(/\}/)
           value_expected = false
+          encoder.text_token match, :operator
           if states.last == :block || states.last == :media
-            kind = :operator
             states.pop
-          else
-            kind = :error
           end
-
+          
         elsif match = scan(/#{RE::String}/o)
-          tokens << [:open, :string]
-          tokens << [match[0, 1], :delimiter]
-          tokens << [match[1..-2], :content] if match.size > 2
-          tokens << [match[-1, 1], :delimiter] if match.size >= 2
-          tokens << [:close, :string]
-          next
-
+          encoder.begin_group :string
+          encoder.text_token match[0, 1], :delimiter
+          encoder.text_token match[1..-2], :content if match.size > 2
+          encoder.text_token match[-1, 1], :delimiter if match.size >= 2
+          encoder.end_group :string
+          
         elsif match = scan(/#{RE::Function}/o)
-          tokens << [:open, :string]
+          encoder.begin_group :string
           start = match[/^\w+\(/]
-          tokens << [start, :delimiter]
+          encoder.text_token start, :delimiter
           if match[-1] == ?)
-            tokens << [match[start.size..-2], :content]
-            tokens << [')', :delimiter]
+            encoder.text_token match[start.size..-2], :content
+            encoder.text_token ')', :delimiter
           else
-            tokens << [match[start.size..-1], :content]
+            encoder.text_token match[start.size..-1], :content
           end
-          tokens << [:close, :string]
-          next
-
-        elsif scan(/(?: #{RE::Dimension} | #{RE::Percentage} | #{RE::Num} )/ox)
-          kind = :float
-
-        elsif scan(/#{RE::Color}/o)
-          kind = :color
-
-        elsif scan(/! *important/)
-          kind = :important
-
-        elsif scan(/rgb\([^()\n]*\)?/)
-          kind = :color
-
-        elsif scan(/#{RE::AtKeyword}/o)
-          kind = :directive
-
+          encoder.end_group :string
+          
+        elsif match = scan(/(?: #{RE::Dimension} | #{RE::Percentage} | #{RE::Num} )/ox)
+          encoder.text_token match, :float
+          
+        elsif match = scan(/#{RE::Color}/o)
+          encoder.text_token match, :color
+          
+        elsif match = scan(/! *important/)
+          encoder.text_token match, :important
+          
+        elsif match = scan(/(?:rgb|hsl)a?\([^()\n]*\)?/)
+          encoder.text_token match, :color
+          
+        elsif match = scan(RE::AtKeyword)
+          encoder.text_token match, :directive
+          
         elsif match = scan(/ [+>:;,.=()\/] /x)
           if match == ':'
             value_expected = true
           elsif match == ';'
             value_expected = false
           end
-          kind = :operator
-
+          encoder.text_token match, :operator
+          
         else
-          getch
-          kind = :error
-
-        end
-
-        match ||= matched
-        if $CODERAY_DEBUG and not kind
-          raise_inspect 'Error token %p in line %d' %
-            [[match, kind], line], tokens
+          encoder.text_token getch, :error
+          
         end
-        raise_inspect 'Empty token', tokens unless match
-
-        tokens << [match, kind]
-
+        
       end
-
-      tokens
+      
+      if options[:keep_state]
+        @state = states
+        @value_expected = value_expected
+      end
+      
+      encoder
     end
-
+    
   end
-
+  
 end
 end
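
Note on the CSS scanner change: the state stack now starts from options[:state] or the instance variable @state, and is written back only when options[:keep_state] is set, so input can be scanned in several chunks. The sketch below shows that pattern in isolation. BlockTracker is a hypothetical class that tracks nothing but brace nesting; it also copies the stack before use, which is an addition of this sketch and not something the diff does.

```ruby
# Hypothetical reduction of the keep_state idea: only "{" and "}" change the state.
class BlockTracker
  def initialize
    @state = :initial
  end

  # Scans one chunk and returns the innermost state at its end.
  # The stack is saved for the next call only if :keep_state is given.
  def scan(chunk, options = {})
    # Copy so that a call without :keep_state cannot alter the saved stack.
    states = Array(options[:state] || @state).dup

    chunk.each_char do |char|
      case char
      when '{' then states.push :block
      when '}' then states.pop if states.last == :block
      end
    end

    @state = states if options[:keep_state]
    states.last
  end
end

tracker = BlockTracker.new
after_first  = tracker.scan('a { color', keep_state: true)  # leaves a block open
after_second = tracker.scan(': red }')                      # closes it, but is not saved
after_third  = tracker.scan('; margin')                     # resumes from the saved stack
```

The third call still reports :block because the second call ran without :keep_state and so did not overwrite the saved stack.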
diff --git a/lib/coderay/scanners/debug.rb b/lib/coderay/scanners/debug.rb
index 0e78b23..566bfa7 100644
--- a/lib/coderay/scanners/debug.rb
+++ b/lib/coderay/scanners/debug.rb
@@ -1,62 +1,65 @@
 module CodeRay
 module Scanners
-
+  
   # = Debug Scanner
+  # 
+  # Interprets the output of the Encoders::Debug encoder.
   class Debug < Scanner
-
-    include Streamable
+    
     register_for :debug
-    file_extension 'raydebug'
-    title 'CodeRay Token Dump'
-
+    title 'CodeRay Token Dump Import'
+    
   protected
-    def scan_tokens tokens, options
-
+    
+    def scan_tokens encoder, options
+      
       opened_tokens = []
-
+      
       until eos?
-
-        kind = nil
-        match = nil
-
-          if scan(/\s+/)
-            tokens << [matched, :space]
-            next
-            
-          elsif scan(/ (\w+) \( ( [^\)\\]* ( \\. [^\)\\]* )* ) \) /x)
-            kind = self[1].to_sym
-            match = self[2].gsub(/\\(.)/, '\1')
-            
-          elsif scan(/ (\w+) < /x)
-            kind = self[1].to_sym
-            opened_tokens << kind
-            match = :open
-            
-          elsif !opened_tokens.empty? && scan(/ > /x)
-            kind = opened_tokens.pop || :error
-            match = :close
-            
-          else
+        
+        if match = scan(/\s+/)
+          encoder.text_token match, :space
+          
+        elsif match = scan(/ (\w+) \( ( [^\)\\]* ( \\. [^\)\\]* )* ) \)? /x)
+          kind = self[1].to_sym
+          match = self[2].gsub(/\\(.)/m, '\1')
+          unless TokenKinds.has_key? kind
             kind = :error
-            getch
-
+            match = matched
           end
-                  
-        match ||= matched
-        if $CODERAY_DEBUG and not kind
-          raise_inspect 'Error token %p in line %d' %
-            [[match, kind], line], tokens
+          encoder.text_token match, kind
+          
+        elsif match = scan(/ (\w+) ([<\[]) /x)
+          kind = self[1].to_sym
+          opened_tokens << kind
+          case self[2]
+          when '<'
+            encoder.begin_group kind
+          when '['
+            encoder.begin_line kind
+          else
+            raise 'CodeRay bug: This case should not be reached.'
+          end
+          
+        elsif !opened_tokens.empty? && match = scan(/ > /x)
+          encoder.end_group opened_tokens.pop
+          
+        elsif !opened_tokens.empty? && match = scan(/ \] /x)
+          encoder.end_line opened_tokens.pop
+          
+        else
+          encoder.text_token getch, :space
+          
         end
-        raise_inspect 'Empty token', tokens unless match
-
-        tokens << [match, kind]
         
       end
       
-      tokens
+      encoder.end_group opened_tokens.pop until opened_tokens.empty?
+      
+      encoder
     end
-
+    
   end
-
+  
 end
 end
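Note on the debug.rb change above: the rewritten Debug scanner reads a token dump made of `kind(text)` entries, `kind<` ... `>` groups and `kind[` ... `]` lines, and closes anything still open at end of input. The following is a minimal stand-alone sketch of that parsing loop, assuming only Ruby's stdlib StringScanner. The name `parse_debug_dump` and the event arrays it returns are hypothetical and are not part of CodeRay; the `TokenKinds` check from the real scanner is left out.

```ruby
require 'strscan'

# Hypothetical helper (not CodeRay API): parses a token dump into a flat
# list of events that mirror the encoder calls made by the scanner above.
def parse_debug_dump(source)
  scanner = StringScanner.new(source)
  events  = []
  opened  = []  # kinds of groups/lines that are currently open

  until scanner.eos?
    if (space = scanner.scan(/\s+/))
      events << [:text, space, :space]
    elsif scanner.scan(/(\w+)\(([^\)\\]*(?:\\.[^\)\\]*)*)\)?/)
      # kind(text); backslash escapes inside the text are removed.
      events << [:text, scanner[2].gsub(/\\(.)/m, '\1'), scanner[1].to_sym]
    elsif scanner.scan(/(\w+)([<\[])/)
      kind = scanner[1].to_sym
      opened << kind
      events << [scanner[2] == '<' ? :begin_group : :begin_line, kind]
    elsif !opened.empty? && scanner.scan(/>/)
      events << [:end_group, opened.pop]
    elsif !opened.empty? && scanner.scan(/\]/)
      events << [:end_line, opened.pop]
    else
      # Anything unrecognised is passed through one character at a time.
      events << [:text, scanner.getch, :space]
    end
  end

  # As in the scanner above, whatever is still open is closed as a group.
  events << [:end_group, opened.pop] until opened.empty?
  events
end
```

Calling `parse_debug_dump('string<content(abc)>')` yields a begin_group event, one text event and an end_group event, in that order.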
diff --git a/lib/coderay/scanners/delphi.rb b/lib/coderay/scanners/delphi.rb
index de0ee71..b328155 100644
--- a/lib/coderay/scanners/delphi.rb
+++ b/lib/coderay/scanners/delphi.rb
@@ -1,12 +1,15 @@
 module CodeRay
 module Scanners
   
+  # Scanner for the Delphi language (Object Pascal).
+  # 
+  # Alias: +pascal+
   class Delphi < Scanner
-
+    
     register_for :delphi
     file_extension 'pas'
     
-    RESERVED_WORDS = [
+    KEYWORDS = [
       'and', 'array', 'as', 'at', 'asm', 'at', 'begin', 'case', 'class',
       'const', 'constructor', 'destructor', 'dispinterface', 'div', 'do',
       'downto', 'else', 'end', 'except', 'exports', 'file', 'finalization',
@@ -16,9 +19,9 @@ module Scanners
       'procedure', 'program', 'property', 'raise', 'record', 'repeat',
       'resourcestring', 'set', 'shl', 'shr', 'string', 'then', 'threadvar',
       'to', 'try', 'type', 'unit', 'until', 'uses', 'var', 'while', 'with',
-      'xor', 'on'
-    ]
-
+      'xor', 'on',
+    ]  # :nodoc:
+    
     DIRECTIVES = [
       'absolute', 'abstract', 'assembler', 'at', 'automated', 'cdecl',
       'contains', 'deprecated', 'dispid', 'dynamic', 'export',
@@ -27,121 +30,112 @@ module Scanners
       'package', 'pascal', 'platform', 'private', 'protected', 'public',
       'published', 'read', 'readonly', 'register', 'reintroduce',
       'requires', 'resident', 'safecall', 'stdcall', 'stored', 'varargs',
-      'virtual', 'write', 'writeonly'
-    ]
-
-    IDENT_KIND = CaseIgnoringWordList.new(:ident).
-      add(RESERVED_WORDS, :reserved).
-      add(DIRECTIVES, :directive)
+      'virtual', 'write', 'writeonly',
+    ]  # :nodoc:
     
-    NAME_FOLLOWS = CaseIgnoringWordList.new(false).
-      add(%w(procedure function .))
-
-  private
-    def scan_tokens tokens, options
-
+    IDENT_KIND = WordList::CaseIgnoring.new(:ident).
+      add(KEYWORDS, :keyword).
+      add(DIRECTIVES, :directive)  # :nodoc:
+    
+    NAME_FOLLOWS = WordList::CaseIgnoring.new(false).
+      add(%w(procedure function .))  # :nodoc:
+    
+  protected
+    
+    def scan_tokens encoder, options
+      
       state = :initial
       last_token = ''
-
+      
       until eos?
-
-        kind = nil
-        match = nil
-
+        
         if state == :initial
           
-          if scan(/ \s+ /x)
-            tokens << [matched, :space]
+          if match = scan(/ \s+ /x)
+            encoder.text_token match, :space
             next
             
-          elsif scan(%r! \{ \$ [^}]* \}? | \(\* \$ (?: .*? \*\) | .* ) !mx)
-            tokens << [matched, :preprocessor]
+          elsif match = scan(%r! \{ \$ [^}]* \}? | \(\* \$ (?: .*? \*\) | .* ) !mx)
+            encoder.text_token match, :preprocessor
             next
             
-          elsif scan(%r! // [^\n]* | \{ [^}]* \}? | \(\* (?: .*? \*\) | .* ) !mx)
-            tokens << [matched, :comment]
+          elsif match = scan(%r! // [^\n]* | \{ [^}]* \}? | \(\* (?: .*? \*\) | .* ) !mx)
+            encoder.text_token match, :comment
             next
             
           elsif match = scan(/ <[>=]? | >=? | :=? | [-+=*\/;,@\^|\(\)\[\]] | \.\. /x)
-            kind = :operator
+            encoder.text_token match, :operator
           
           elsif match = scan(/\./)
-            kind = :operator
-            if last_token == 'end'
-              tokens << [match, kind]
-              next
-            end
+            encoder.text_token match, :operator
+            next if last_token == 'end'
             
           elsif match = scan(/ [A-Za-z_][A-Za-z_0-9]* /x)
-            kind = NAME_FOLLOWS[last_token] ? :ident : IDENT_KIND[match]
+            encoder.text_token match, NAME_FOLLOWS[last_token] ? :ident : IDENT_KIND[match]
             
-          elsif match = scan(/ ' ( [^\n']|'' ) (?:'|$) /x)
-            tokens << [:open, :char]
-            tokens << ["'", :delimiter]
-            tokens << [self[1], :content]
-            tokens << ["'", :delimiter]
-            tokens << [:close, :char]
+          elsif match = skip(/ ' ( [^\n']|'' ) (?:'|$) /x)
+            encoder.begin_group :char
+            encoder.text_token "'", :delimiter
+            encoder.text_token self[1], :content
+            encoder.text_token "'", :delimiter
+            encoder.end_group :char
             next
             
           elsif match = scan(/ ' /x)
-            tokens << [:open, :string]
+            encoder.begin_group :string
+            encoder.text_token match, :delimiter
             state = :string
-            kind = :delimiter
             
-          elsif scan(/ \# (?: \d+ | \$[0-9A-Fa-f]+ ) /x)
-            kind = :char
+          elsif match = scan(/ \# (?: \d+ | \$[0-9A-Fa-f]+ ) /x)
+            encoder.text_token match, :char
             
-          elsif scan(/ \$ [0-9A-Fa-f]+ /x)
-            kind = :hex
+          elsif match = scan(/ \$ [0-9A-Fa-f]+ /x)
+            encoder.text_token match, :hex
             
-          elsif scan(/ (?: \d+ ) (?![eE]|\.[^.]) /x)
-            kind = :integer
+          elsif match = scan(/ (?: \d+ ) (?![eE]|\.[^.]) /x)
+            encoder.text_token match, :integer
+            
+          elsif match = scan(/ \d+ (?: \.\d+ (?: [eE][+-]? \d+ )? | [eE][+-]? \d+ ) /x)
+            encoder.text_token match, :float
             
-          elsif scan(/ \d+ (?: \.\d+ (?: [eE][+-]? \d+ )? | [eE][+-]? \d+ ) /x)
-            kind = :float
-
           else
-            kind = :error
-            getch
-
+            encoder.text_token getch, :error
+            next
+            
           end
           
         elsif state == :string
-          if scan(/[^\n']+/)
-            kind = :content
-          elsif scan(/''/)
-            kind = :char
-          elsif scan(/'/)
-            tokens << ["'", :delimiter]
-            tokens << [:close, :string]
+          if match = scan(/[^\n']+/)
+            encoder.text_token match, :content
+          elsif match = scan(/''/)
+            encoder.text_token match, :char
+          elsif match = scan(/'/)
+            encoder.text_token match, :delimiter
+            encoder.end_group :string
             state = :initial
             next
-          elsif scan(/\n/)
-            tokens << [:close, :string]
-            kind = :error
+          elsif match = scan(/\n/)
+            encoder.end_group :string
+            encoder.text_token match, :space
             state = :initial
           else
-            raise "else case \' reached; %p not handled." % peek(1), tokens
+            raise "else case \' reached; %p not handled." % peek(1), encoder
           end
           
         else
-          raise 'else-case reached', tokens
+          raise 'else-case reached', encoder
           
         end
         
-        match ||= matched
-        if $CODERAY_DEBUG and not kind
-          raise_inspect 'Error token %p in line %d' %
-            [[match, kind], line], tokens, state
-        end
-        raise_inspect 'Empty token', tokens unless match
-
         last_token = match
-        tokens << [match, kind]
         
       end
       
-      tokens
+      if state == :string
+        encoder.end_group state
+      end
+      
+      encoder
     end
 
   end
diff --git a/lib/coderay/scanners/diff.rb b/lib/coderay/scanners/diff.rb
index 353b966..9e899c3 100644
--- a/lib/coderay/scanners/diff.rb
+++ b/lib/coderay/scanners/diff.rb
@@ -1,25 +1,41 @@
 module CodeRay
 module Scanners
   
+  # Scanner for output of the diff command.
+  # 
+  # Alias: +patch+
   class Diff < Scanner
     
     register_for :diff
     title 'diff output'
     
-    def scan_tokens tokens, options
+    DEFAULT_OPTIONS = {
+      :highlight_code => true,
+      :inline_diff    => true,
+    }
+    
+  protected
+    
+    def scan_tokens encoder, options
       
       line_kind = nil
       state = :initial
+      deleted_lines = 0
+      scanners = Hash.new do |h, lang|
+        h[lang] = Scanners[lang].new '', :keep_tokens => true, :keep_state => true
+      end
+      content_scanner = scanners[:plain]
+      content_scanner_entry_state = nil
       
       until eos?
-        kind = match = nil
         
         if match = scan(/\n/)
+          deleted_lines = 0 unless line_kind == :delete
           if line_kind
-            tokens << [:end_line, line_kind]
+            encoder.end_line line_kind
             line_kind = nil
           end
-          tokens << [match, :space]
+          encoder.text_token match, :space
           next
         end
         
@@ -27,81 +43,154 @@ module Scanners
         
         when :initial
           if match = scan(/--- |\+\+\+ |=+|_+/)
-            tokens << [:begin_line, line_kind = :head]
-            tokens << [match, :head]
+            encoder.begin_line line_kind = :head
+            encoder.text_token match, :head
+            if match = scan(/.*?(?=$|[\t\n\x00]|  \(revision)/)
+              encoder.text_token match, :filename
+              if options[:highlight_code] && match != '/dev/null'
+                file_type = CodeRay::FileType.fetch(match, :text)
+                file_type = :text if file_type == :diff
+                content_scanner = scanners[file_type]
+                content_scanner_entry_state = nil
+              end
+            end
             next unless match = scan(/.+/)
-            kind = :plain
+            encoder.text_token match, :plain
           elsif match = scan(/Index: |Property changes on: /)
-            tokens << [:begin_line, line_kind = :head]
-            tokens << [match, :head]
+            encoder.begin_line line_kind = :head
+            encoder.text_token match, :head
             next unless match = scan(/.+/)
-            kind = :plain
+            encoder.text_token match, :plain
           elsif match = scan(/Added: /)
-            tokens << [:begin_line, line_kind = :head]
-            tokens << [match, :head]
+            encoder.begin_line line_kind = :head
+            encoder.text_token match, :head
             next unless match = scan(/.+/)
-            kind = :plain
+            encoder.text_token match, :plain
             state = :added
-          elsif match = scan(/\\ /)
-            tokens << [:begin_line, line_kind = :change]
-            tokens << [match, :change]
-            next unless match = scan(/.+/)
-            kind = :plain
+          elsif match = scan(/\\ .*/)
+            encoder.text_token match, :comment
           elsif match = scan(/@@(?>[^@\n]*)@@/)
+            content_scanner.state = :initial unless match?(/\n\+/)
+            content_scanner_entry_state = nil
             if check(/\n|$/)
-              tokens << [:begin_line, line_kind = :change]
+              encoder.begin_line line_kind = :change
             else
-              tokens << [:open, :change]
+              encoder.begin_group :change
             end
-            tokens << [match[0,2], :change]
-            tokens << [match[2...-2], :plain]
-            tokens << [match[-2,2], :change]
-            tokens << [:close, :change] unless line_kind
+            encoder.text_token match[0,2], :change
+            encoder.text_token match[2...-2], :plain
+            encoder.text_token match[-2,2], :change
+            encoder.end_group :change unless line_kind
             next unless match = scan(/.+/)
-            kind = :plain
+            if options[:highlight_code]
+              content_scanner.tokenize match, :tokens => encoder
+            else
+              encoder.text_token match, :plain
+            end
+            next
           elsif match = scan(/\+/)
-            tokens << [:begin_line, line_kind = :insert]
-            tokens << [match, :insert]
+            encoder.begin_line line_kind = :insert
+            encoder.text_token match, :insert
             next unless match = scan(/.+/)
-            kind = :plain
+            if options[:highlight_code]
+              content_scanner.tokenize match, :tokens => encoder
+            else
+              encoder.text_token match, :plain
+            end
+            next
           elsif match = scan(/-/)
-            tokens << [:begin_line, line_kind = :delete]
-            tokens << [match, :delete]
-            next unless match = scan(/.+/)
-            kind = :plain
-          elsif scan(/ .*/)
-            kind = :comment
-          elsif scan(/.+/)
-            tokens << [:begin_line, line_kind = :comment]
-            kind = :plain
+            deleted_lines += 1
+            encoder.begin_line line_kind = :delete
+            encoder.text_token match, :delete
+            if options[:inline_diff] && deleted_lines == 1 && check(/(?>.*)\n\+(?>.*)$(?!\n\+)/)
+              content_scanner_entry_state = content_scanner.state
+              skip(/(.*)\n\+(.*)$/)
+              head, deletion, insertion, tail = diff self[1], self[2]
+              pre, deleted, post = content_scanner.tokenize [head, deletion, tail], :tokens => Tokens.new
+              encoder.tokens pre
+              unless deleted.empty?
+                encoder.begin_group :eyecatcher
+                encoder.tokens deleted
+                encoder.end_group :eyecatcher
+              end
+              encoder.tokens post
+              encoder.end_line line_kind
+              encoder.text_token "\n", :space
+              encoder.begin_line line_kind = :insert
+              encoder.text_token '+', :insert
+              content_scanner.state = content_scanner_entry_state || :initial
+              pre, inserted, post = content_scanner.tokenize [head, insertion, tail], :tokens => Tokens.new
+              encoder.tokens pre
+              unless inserted.empty?
+                encoder.begin_group :eyecatcher
+                encoder.tokens inserted
+                encoder.end_group :eyecatcher
+              end
+              encoder.tokens post
+            elsif match = scan(/.*/)
+              if options[:highlight_code]
+                if deleted_lines == 1
+                  content_scanner_entry_state = content_scanner.state
+                end
+                content_scanner.tokenize match, :tokens => encoder unless match.empty?
+                if !match?(/\n-/)
+                  if match?(/\n\+/)
+                    content_scanner.state = content_scanner_entry_state || :initial
+                  end
+                  content_scanner_entry_state = nil
+                end
+              else
+                encoder.text_token match, :plain
+              end
+            end
+            next
+          elsif match = scan(/ .*/)
+            if options[:highlight_code]
+              content_scanner.tokenize match, :tokens => encoder
+            else
+              encoder.text_token match, :plain
+            end
+            next
+          elsif match = scan(/.+/)
+            encoder.begin_line line_kind = :comment
+            encoder.text_token match, :plain
           else
             raise_inspect 'else case rached'
           end
         
         when :added
           if match = scan(/   \+/)
-            tokens << [:begin_line, line_kind = :insert]
-            tokens << [match, :insert]
+            encoder.begin_line line_kind = :insert
+            encoder.text_token match, :insert
             next unless match = scan(/.+/)
-            kind = :plain
+            encoder.text_token match, :plain
           else
             state = :initial
             next
           end
         end
         
-        match ||= matched
-        if $CODERAY_DEBUG and not kind
-          raise_inspect 'Error token %p in line %d' %
-            [[match, kind], line], tokens
-        end
-        raise_inspect 'Empty token', tokens unless match
-        
-        tokens << [match, kind]
       end
       
-      tokens << [:end_line, line_kind] if line_kind
-      tokens
+      encoder.end_line line_kind if line_kind
+      
+      encoder
+    end
+    
+  private
+    
+    def diff a, b
+      # i will be the index of the leftmost difference from the left.
+      i_max = [a.size, b.size].min
+      i = 0
+      i += 1 while i < i_max && a[i] == b[i]
+      # j_min will be the index of the leftmost difference from the right.
+      j_min = i - i_max
+      # j will be the index of the rightmost difference from the right which
+      # does not precede the leftmost one from the left.
+      j = -1
+      j -= 1 while j >= j_min && a[j] == b[j]
+      return a[0...i], a[i..j], b[i..j], (j < -1) ? a[j+1..-1] : ''
     end
     
   end
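Note on the diff.rb change above: the new private `diff` helper drives the `:inline_diff` option by splitting a deleted line and its replacement into a common head, the differing middle of each, and a common tail. The sketch below copies the helper's body into a top-level method so it can be run on its own. The name `inline_diff_parts` is hypothetical (the scanner calls it `diff`), and the sample strings are made up for illustration.

```ruby
# Stand-alone copy of the helper's logic under a hypothetical name.
# Returns [head, deletion, insertion, tail].
def inline_diff_parts(a, b)
  # i: index of the first difference, scanning from the left.
  i_max = [a.size, b.size].min
  i = 0
  i += 1 while i < i_max && a[i] == b[i]
  # j_min: the furthest the right-hand scan may go without crossing i.
  j_min = i - i_max
  # j: negative index of the first difference, scanning from the right.
  j = -1
  j -= 1 while j >= j_min && a[j] == b[j]
  return a[0...i], a[i..j], b[i..j], (j < -1) ? a[j + 1..-1] : ''
end

head, deletion, insertion, tail = inline_diff_parts("puts 'hello'", "puts 'world'")
# head      => "puts '"
# deletion  => "hello"
# insertion => "world"
# tail      => "'"
```

In the scanner, `head` and `tail` are tokenized normally, while the non-empty `deletion` and `insertion` parts are wrapped in an `:eyecatcher` group.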
diff --git a/lib/coderay/scanners/erb.rb b/lib/coderay/scanners/erb.rb
new file mode 100644
index 0000000..727a993
--- /dev/null
+++ b/lib/coderay/scanners/erb.rb
@@ -0,0 +1,81 @@
+module CodeRay
+module Scanners
+  
+  load :html
+  load :ruby
+  
+  # Scanner for HTML ERB templates.
+  class ERB < Scanner
+    
+    register_for :erb
+    title 'HTML ERB Template'
+    
+    KINDS_NOT_LOC = HTML::KINDS_NOT_LOC
+    
+    ERB_RUBY_BLOCK = /
+      (<%(?!%)[-=\#]?)
+      ((?>
+        [^\-%]*    # normal*
+        (?>        # special
+          (?: %(?!>) | -(?!%>) )
+          [^\-%]*  # normal*
+        )*
+      ))
+      ((?: -?%> )?)
+    /x  # :nodoc:
+    
+    START_OF_ERB = /
+      <%(?!%)
+    /x  # :nodoc:
+    
+  protected
+    
+    def setup
+      @ruby_scanner = CodeRay.scanner :ruby, :tokens => @tokens, :keep_tokens => true
+      @html_scanner = CodeRay.scanner :html, :tokens => @tokens, :keep_tokens => true, :keep_state => true
+    end
+    
+    def reset_instance
+      super
+      @html_scanner.reset
+    end
+    
+    def scan_tokens encoder, options
+      
+      until eos?
+        
+        if (match = scan_until(/(?=#{START_OF_ERB})/o) || scan_rest) and not match.empty?
+          @html_scanner.tokenize match, :tokens => encoder
+          
+        elsif match = scan(/#{ERB_RUBY_BLOCK}/o)
+          start_tag = self[1]
+          code = self[2]
+          end_tag = self[3]
+          
+          encoder.begin_group :inline
+          encoder.text_token start_tag, :inline_delimiter
+          
+          if start_tag == '<%#'
+            encoder.text_token code, :comment
+          else
+            @ruby_scanner.tokenize code, :tokens => encoder
+          end unless code.empty?
+          
+          encoder.text_token end_tag, :inline_delimiter unless end_tag.empty?
+          encoder.end_group :inline
+          
+        else
+          raise_inspect 'else-case reached!', encoder
+          
+        end
+        
+      end
+      
+      encoder
+      
+    end
+    
+  end
+  
+end
+end
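Note on the new erb.rb above: the ERB scanner alternates between passing HTML to an HTML scanner and matching one ERB tag with `ERB_RUBY_BLOCK`, whose three capture groups are the start tag, the embedded code and the optional end tag. The sketch below reuses the two regular expressions from the file with stdlib StringScanner to show that split. `split_erb` and its `[:html, ...]` / `[:erb, ...]` result arrays are hypothetical and not part of CodeRay; `scan(/.*/m)` stands in for the scanner's `scan_rest`.

```ruby
require 'strscan'

# Regular expressions as defined in erb.rb above.
ERB_RUBY_BLOCK = /
  (<%(?!%)[-=\#]?)
  ((?>
    [^\-%]*
    (?>
      (?: %(?!>) | -(?!%>) )
      [^\-%]*
    )*
  ))
  ((?: -?%> )?)
/x
START_OF_ERB = /<%(?!%)/

# Hypothetical helper: splits a template into HTML runs and ERB tags.
def split_erb(template)
  scanner = StringScanner.new(template)
  parts = []
  until scanner.eos?
    # Everything up to the next ERB start tag (or the rest) is HTML.
    html = scanner.scan_until(/(?=#{START_OF_ERB})/) || scanner.scan(/.*/m)
    if html && !html.empty?
      parts << [:html, html]
    elsif scanner.scan(ERB_RUBY_BLOCK)
      parts << [:erb, scanner[1], scanner[2], scanner[3]]
    else
      raise "unexpected input at position #{scanner.pos}"
    end
  end
  parts
end
```

An unterminated tag still matches, with an empty end tag, which is why the scanner only emits the closing delimiter when `end_tag` is non-empty.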
diff --git a/lib/coderay/scanners/groovy.rb b/lib/coderay/scanners/groovy.rb
index 17330e6..cf55daf 100644
--- a/lib/coderay/scanners/groovy.rb
+++ b/lib/coderay/scanners/groovy.rb
@@ -1,29 +1,29 @@
 module CodeRay
 module Scanners
-
+  
   load :java
-
+  
+  # Scanner for Groovy.
   class Groovy < Java
-
-    include Streamable
+    
     register_for :groovy
     
-    # TODO: Check this!
+    # TODO: check list of keywords
     GROOVY_KEYWORDS = %w[
       as assert def in
-    ]
+    ]  # :nodoc:
     KEYWORDS_EXPECTING_VALUE = WordList.new.add %w[
       case instanceof new return throw typeof while as assert in
-    ]
-    GROOVY_MAGIC_VARIABLES = %w[ it ]
+    ]  # :nodoc:
+    GROOVY_MAGIC_VARIABLES = %w[ it ]  # :nodoc:
     
     IDENT_KIND = Java::IDENT_KIND.dup.
       add(GROOVY_KEYWORDS, :keyword).
-      add(GROOVY_MAGIC_VARIABLES, :local_variable)
+      add(GROOVY_MAGIC_VARIABLES, :local_variable)  # :nodoc:
     
-    ESCAPE = / [bfnrtv$\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x
-    UNICODE_ESCAPE =  / u[a-fA-F0-9]{4} /x  # no 4-byte unicode chars? U[a-fA-F0-9]{8}
-    REGEXP_ESCAPE =  / [bfnrtv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} | \d | [bBdDsSwW\/] /x
+    ESCAPE = / [bfnrtv$\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x  # :nodoc:
+    UNICODE_ESCAPE =  / u[a-fA-F0-9]{4} /x  # :nodoc: no 4-byte unicode chars? U[a-fA-F0-9]{8}
+    REGEXP_ESCAPE =  / [bfnrtv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} | \d | [bBdDsSwW\/] /x  # :nodoc:
     
     # TODO: interpretation inside ', ", /
     STRING_CONTENT_PATTERN = {
@@ -32,45 +32,44 @@ module Scanners
       "'''" => /(?>[^\\']+|'(?!''))+/,
       '"""' => /(?>[^\\$"]+|"(?!""))+/,
       '/' => /[^\\$\/\n]+/,
-    }
+    }  # :nodoc:
+    
+  protected
     
-    def scan_tokens tokens, options
-
+    def scan_tokens encoder, options
+      
       state = :initial
       inline_block_stack = []
       inline_block_paren_depth = nil
       string_delimiter = nil
       import_clause = class_name_follows = last_token = after_def = false
       value_expected = true
-
+      
       until eos?
-
-        kind = nil
-        match = nil
         
         case state
-
+        
         when :initial
-
+          
           if match = scan(/ \s+ | \\\n /x)
-            tokens << [match, :space]
+            encoder.text_token match, :space
             if match.index ?\n
               import_clause = after_def = false
               value_expected = true unless value_expected
             end
             next
           
-          elsif scan(%r! // [^\n\\]* (?: \\. [^\n\\]* )* | /\* (?: .*? \*/ | .* ) !mx)
+          elsif match = scan(%r! // [^\n\\]* (?: \\. [^\n\\]* )* | /\* (?: .*? \*/ | .* ) !mx)
             value_expected = true
             after_def = false
-            kind = :comment
+            encoder.text_token match, :comment
           
-          elsif bol? && scan(/ \#!.* /x)
-            kind = :doctype
+          elsif bol? && match = scan(/ \#!.* /x)
+            encoder.text_token match, :doctype
           
-          elsif import_clause && scan(/ (?!as) #{IDENT} (?: \. #{IDENT} )* (?: \.\* )? /ox)
+          elsif import_clause && match = scan(/ (?!as) #{IDENT} (?: \. #{IDENT} )* (?: \.\* )? /ox)
             after_def = value_expected = false
-            kind = :include
+            encoder.text_token match, :include
           
           elsif match = scan(/ #{IDENT} | \[\] /ox)
             kind = IDENT_KIND[match]
@@ -90,16 +89,17 @@ module Scanners
               import_clause = match == 'import'
               after_def = true if match == 'def'
             end
+            encoder.text_token match, kind
           
-          elsif scan(/;/)
+          elsif match = scan(/;/)
             import_clause = after_def = false
             value_expected = true
-            kind = :operator
+            encoder.text_token match, :operator
           
-          elsif scan(/\{/)
+          elsif match = scan(/\{/)
             class_name_follows = after_def = false
             value_expected = true
-            kind = :operator
+            encoder.text_token match, :operator
             if !inline_block_stack.empty?
               inline_block_paren_depth += 1
             end
@@ -110,155 +110,146 @@ module Scanners
             value_expected = true
             value_expected = :regexp if match == '~'
             after_def = false
-            kind = :operator
+            encoder.text_token match, :operator
           
           elsif match = scan(/ [)\]}] /x)
             value_expected = after_def = false
             if !inline_block_stack.empty? && match == '}'
               inline_block_paren_depth -= 1
               if inline_block_paren_depth == 0  # closing brace of inline block reached
-                tokens << [match, :inline_delimiter]
-                tokens << [:close, :inline]
+                encoder.text_token match, :inline_delimiter
+                encoder.end_group :inline
                 state, string_delimiter, inline_block_paren_depth = inline_block_stack.pop
                 next
               end
             end
-            kind = :operator
+            encoder.text_token match, :operator
           
           elsif check(/[\d.]/)
             after_def = value_expected = false
-            if scan(/0[xX][0-9A-Fa-f]+/)
-              kind = :hex
-            elsif scan(/(?>0[0-7]+)(?![89.eEfF])/)
-              kind = :oct
-            elsif scan(/\d+[fFdD]|\d*\.\d+(?:[eE][+-]?\d+)?[fFdD]?|\d+[eE][+-]?\d+[fFdD]?/)
-              kind = :float
-            elsif scan(/\d+[lLgG]?/)
-              kind = :integer
+            if match = scan(/0[xX][0-9A-Fa-f]+/)
+              encoder.text_token match, :hex
+            elsif match = scan(/(?>0[0-7]+)(?![89.eEfF])/)
+              encoder.text_token match, :octal
+            elsif match = scan(/\d+[fFdD]|\d*\.\d+(?:[eE][+-]?\d+)?[fFdD]?|\d+[eE][+-]?\d+[fFdD]?/)
+              encoder.text_token match, :float
+            elsif match = scan(/\d+[lLgG]?/)
+              encoder.text_token match, :integer
             end
-
+            
           elsif match = scan(/'''|"""/)
             after_def = value_expected = false
             state = :multiline_string
-            tokens << [:open, :string]
+            encoder.begin_group :string
             string_delimiter = match
-            kind = :delimiter
-          
-          # TODO: record.'name'
+            encoder.text_token match, :delimiter
+            
+          # TODO: record.'name' syntax
           elsif match = scan(/["']/)
             after_def = value_expected = false
             state = match == '/' ? :regexp : :string
-            tokens << [:open, state]
+            encoder.begin_group state
             string_delimiter = match
-            kind = :delimiter
-
-          elsif value_expected && (match = scan(/\//))
+            encoder.text_token match, :delimiter
+            
+          elsif value_expected && match = scan(/\//)
             after_def = value_expected = false
-            tokens << [:open, :regexp]
+            encoder.begin_group :regexp
             state = :regexp
             string_delimiter = '/'
-            kind = :delimiter
-
-          elsif scan(/ @ #{IDENT} /ox)
+            encoder.text_token match, :delimiter
+            
+          elsif match = scan(/ @ #{IDENT} /ox)
             after_def = value_expected = false
-            kind = :annotation
-
-          elsif scan(/\//)
+            encoder.text_token match, :annotation
+            
+          elsif match = scan(/\//)
             after_def = false
             value_expected = true
-            kind = :operator
-          
+            encoder.text_token match, :operator
+            
           else
-            getch
-            kind = :error
-
+            encoder.text_token getch, :error
+            
           end
-
+          
         when :string, :regexp, :multiline_string
-          if scan(STRING_CONTENT_PATTERN[string_delimiter])
-            kind = :content
+          if match = scan(STRING_CONTENT_PATTERN[string_delimiter])
+            encoder.text_token match, :content
             
           elsif match = scan(state == :multiline_string ? /'''|"""/ : /["'\/]/)
-            tokens << [match, :delimiter]
+            encoder.text_token match, :delimiter
             if state == :regexp
               # TODO: regexp modifiers? s, m, x, i?
               modifiers = scan(/[ix]+/)
-              tokens << [modifiers, :modifier] if modifiers && !modifiers.empty?
+              encoder.text_token modifiers, :modifier if modifiers && !modifiers.empty?
             end
             state = :string if state == :multiline_string
-            tokens << [:close, state]
+            encoder.end_group state
             string_delimiter = nil
             after_def = value_expected = false
             state = :initial
             next
-          
+            
           elsif (state == :string || state == :multiline_string) &&
               (match = scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox))
             if string_delimiter[0] == ?' && !(match == "\\\\" || match == "\\'")
-              kind = :content
+              encoder.text_token match, :content
             else
-              kind = :char
+              encoder.text_token match, :char
             end
-          elsif state == :regexp && scan(/ \\ (?: #{REGEXP_ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
-            kind = :char
-          
+          elsif state == :regexp && match = scan(/ \\ (?: #{REGEXP_ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
+            encoder.text_token match, :char
+            
           elsif match = scan(/ \$ #{IDENT} /mox)
-            tokens << [:open, :inline]
-            tokens << ['$', :inline_delimiter]
+            encoder.begin_group :inline
+            encoder.text_token '$', :inline_delimiter
             match = match[1..-1]
-            tokens << [match, IDENT_KIND[match]]
-            tokens << [:close, :inline]
+            encoder.text_token match, IDENT_KIND[match]
+            encoder.end_group :inline
             next
           elsif match = scan(/ \$ \{ /x)
-            tokens << [:open, :inline]
-            tokens << ['${', :inline_delimiter]
+            encoder.begin_group :inline
+            encoder.text_token match, :inline_delimiter
             inline_block_stack << [state, string_delimiter, inline_block_paren_depth]
             inline_block_paren_depth = 1
             state = :initial
             next
-          
-          elsif scan(/ \$ /mx)
-            kind = :content
-          
-          elsif scan(/ \\. /mx)
-            kind = :content
-          
-          elsif scan(/ \\ | \n /x)
-            tokens << [:close, state]
-            kind = :error
+            
+          elsif match = scan(/ \$ /mx)
+            encoder.text_token match, :content
+            
+          elsif match = scan(/ \\. /mx)
+            encoder.text_token match, :content  # TODO: Shouldn't this be :error?
+            
+          elsif match = scan(/ \\ | \n /x)
+            encoder.end_group state
+            encoder.text_token match, :error
             after_def = value_expected = false
             state = :initial
-          
+            
           else
-            raise_inspect "else case \" reached; %p not handled." % peek(1), tokens
+            raise_inspect "else case \" reached; %p not handled." % peek(1), encoder
+            
           end
-
+          
         else
-          raise_inspect 'Unknown state', tokens
-
-        end
-
-        match ||= matched
-        if $CODERAY_DEBUG and not kind
-          raise_inspect 'Error token %p in line %d' %
-            [[match, kind], line], tokens
+          raise_inspect 'Unknown state', encoder
+          
         end
-        raise_inspect 'Empty token', tokens unless match
         
         last_token = match unless [:space, :comment, :doctype].include? kind
         
-        tokens << [match, kind]
-
       end
-
+      
       if [:multiline_string, :string, :regexp].include? state
-        tokens << [:close, state]
+        encoder.end_group state
       end
-
-      tokens
+      
+      encoder
     end
-
+    
   end
-
+  
 end
 end
diff --git a/lib/coderay/scanners/haml.rb b/lib/coderay/scanners/haml.rb
new file mode 100644
index 0000000..5433790
--- /dev/null
+++ b/lib/coderay/scanners/haml.rb
@@ -0,0 +1,168 @@
+module CodeRay
+module Scanners
+  
+  load :ruby
+  load :html
+  load :java_script
+  
+  class HAML < Scanner
+    
+    register_for :haml
+    title 'HAML Template'
+    
+    KINDS_NOT_LOC = HTML::KINDS_NOT_LOC
+    
+  protected
+    
+    def setup
+      super
+      @ruby_scanner          = CodeRay.scanner :ruby, :tokens => @tokens, :keep_tokens => true
+      @embedded_ruby_scanner = CodeRay.scanner :ruby, :tokens => @tokens, :keep_tokens => true, :state => @ruby_scanner.interpreted_string_state
+      @html_scanner          = CodeRay.scanner :html, :tokens => @tokens, :keep_tokens => true
+    end
+    
+    def scan_tokens encoder, options
+      
+      match = nil
+      code = ''
+      
+      until eos?
+        
+        if bol?
+          if match = scan(/!!!.*/)
+            encoder.text_token match, :doctype
+            next
+          end
+          
+          if match = scan(/(?>( *)(\/(?!\[if)|-\#|:javascript|:ruby|:\w+) *)(?=\n)/)
+            encoder.text_token match, :comment
+            
+            code = self[2]
+            if match = scan(/(?:\n+#{self[1]} .*)+/)
+              case code
+              when '/', '-#'
+                encoder.text_token match, :comment
+              when ':javascript'
+                # TODO: recognize #{...} snippets inside JavaScript
+                @java_script_scanner ||= CodeRay.scanner :java_script, :tokens => @tokens, :keep_tokens => true
+                @java_script_scanner.tokenize match, :tokens => encoder
+              when ':ruby'
+                @ruby_scanner.tokenize match, :tokens => encoder
+              when /:\w+/
+                encoder.text_token match, :comment
+              else
+                raise 'else-case reached: %p' % [code]
+              end
+            end
+          end
+          
+          if match = scan(/ +/)
+            encoder.text_token match, :space
+          end
+          
+          if match = scan(/\/.*/)
+            encoder.text_token match, :comment
+            next
+          end
+          
+          if match = scan(/\\/)
+            encoder.text_token match, :plain
+            if match = scan(/.+/)
+              @html_scanner.tokenize match, :tokens => encoder
+            end
+            next
+          end
+          
+          tag = false
+          
+          if match = scan(/%[\w:]+\/?/)
+            encoder.text_token match, :tag
+            # if match = scan(/( +)(.+)/)
+            #   encoder.text_token self[1], :space
+            #   @embedded_ruby_scanner.tokenize self[2], :tokens => encoder
+            # end
+            tag = true
+          end
+          
+          while match = scan(/([.#])[-\w]*\w/)
+            encoder.text_token match, self[1] == '#' ? :constant : :class
+            tag = true
+          end
+          
+          if tag && match = scan(/(\()([^)]+)?(\))?/)
+            # TODO: recognize title=@title, class="widget_#{@widget.number}"
+            encoder.text_token self[1], :plain
+            @html_scanner.tokenize self[2], :tokens => encoder, :state => :attribute if self[2]
+            encoder.text_token self[3], :plain if self[3]
+          end
+          
+          if tag && match = scan(/\{/)
+            encoder.text_token match, :plain
+            
+            code = ''
+            level = 1
+            while true
+              code << scan(/([^\{\},\n]|, *\n?)*/)
+              case match = getch
+              when '{'
+                level += 1
+                code << match
+              when '}'
+                level -= 1
+                if level > 0
+                  code << match
+                else
+                  break
+                end
+              when "\n", ",", nil
+                break
+              end
+            end
+            @ruby_scanner.tokenize code, :tokens => encoder unless code.empty?
+            
+            encoder.text_token match, :plain if match
+          end
+          
+          if tag && match = scan(/(\[)([^\]\n]+)?(\])?/)
+            encoder.text_token self[1], :plain
+            @ruby_scanner.tokenize self[2], :tokens => encoder if self[2]
+            encoder.text_token self[3], :plain if self[3]
+          end
+          
+          if tag && match = scan(/\//)
+            encoder.text_token match, :tag
+          end
+          
+          if scan(/(>?<?[-=]|[&!]=|(& |!)|~)( *)([^,\n\|]+(?:(, *|\|(?=.|\n.*\|$))\n?[^,\n\|]*)*)?/)
+            encoder.text_token self[1] + self[3], :plain
+            if self[4]
+              if self[2]
+                @embedded_ruby_scanner.tokenize self[4], :tokens => encoder
+              else
+                @ruby_scanner.tokenize self[4], :tokens => encoder
+              end
+            end
+          elsif match = scan(/((?:<|><?)(?![!?\/\w]))?(.+)?/)
+            encoder.text_token self[1], :plain if self[1]
+            # TODO: recognize #{...} snippets
+            @html_scanner.tokenize self[2], :tokens => encoder if self[2]
+          end
+          
+        elsif match = scan(/.+/)
+          @html_scanner.tokenize match, :tokens => encoder
+          
+        end
+        
+        if match = scan(/\n/)
+          encoder.text_token match, :space
+        end
+      end
+      
+      encoder
+      
+    end
+    
+  end
+  
+end
+end
diff --git a/lib/coderay/scanners/html.rb b/lib/coderay/scanners/html.rb
index 009a461..98d06fc 100644
--- a/lib/coderay/scanners/html.rb
+++ b/lib/coderay/scanners/html.rb
@@ -2,22 +2,42 @@ module CodeRay
 module Scanners
 
   # HTML Scanner
+  # 
+  # Alias: +xhtml+
+  # 
+  # See also: Scanners::XML
   class HTML < Scanner
 
-    include Streamable
     register_for :html
     
     KINDS_NOT_LOC = [
       :comment, :doctype, :preprocessor,
       :tag, :attribute_name, :operator,
-      :attribute_value, :delimiter, :content,
-      :plain, :entity, :error
-    ]
-
-    ATTR_NAME = /[\w.:-]+/
-    ATTR_VALUE_UNQUOTED = ATTR_NAME
-    TAG_END = /\/?>/
-    HEX = /[0-9a-fA-F]/
+      :attribute_value, :string,
+      :plain, :entity, :error,
+    ]  # :nodoc:
+    
+    EVENT_ATTRIBUTES = %w(
+      onabort onafterprint onbeforeprint onbeforeunload onblur oncanplay
+      oncanplaythrough onchange onclick oncontextmenu oncuechange ondblclick
+      ondrag ondragdrop ondragend ondragenter ondragleave ondragover
+      ondragstart ondrop ondurationchange onemptied onended onerror onfocus
+      onformchange onforminput onhashchange oninput oninvalid onkeydown
+      onkeypress onkeyup onload onloadeddata onloadedmetadata onloadstart
+      onmessage onmousedown onmousemove onmouseout onmouseover onmouseup
+      onmousewheel onmove onoffline ononline onpagehide onpageshow onpause
+      onplay onplaying onpopstate onprogress onratechange onreadystatechange
+      onredo onreset onresize onscroll onseeked onseeking onselect onshow
+      onstalled onstorage onsubmit onsuspend ontimeupdate onundo onunload
+      onvolumechange onwaiting
+    )
+    
+    IN_ATTRIBUTE = WordList::CaseIgnoring.new(nil).
+      add(EVENT_ATTRIBUTES, :script)
+    
+    ATTR_NAME = /[\w.:-]+/  # :nodoc:
+    TAG_END = /\/?>/  # :nodoc:
+    HEX = /[0-9a-fA-F]/  # :nodoc:
     ENTITY = /
       &
       (?:
@@ -31,152 +51,203 @@ module Scanners
         )
       )
       ;
-    /ox
-
+    /ox  # :nodoc:
+    
     PLAIN_STRING_CONTENT = {
       "'" => /[^&'>\n]+/,
       '"' => /[^&">\n]+/,
-    }
-
+    }  # :nodoc:
+    
     def reset
       super
       @state = :initial
+      @plain_string_content = nil
     end
-
-  private
+    
+  protected
+    
     def setup
       @state = :initial
       @plain_string_content = nil
     end
-
-    def scan_tokens tokens, options
-
-      state = @state
+    
+    def scan_java_script encoder, code
+      if code && !code.empty?
+        @java_script_scanner ||= Scanners::JavaScript.new '', :keep_tokens => true
+        # encoder.begin_group :inline
+        @java_script_scanner.tokenize code, :tokens => encoder
+        # encoder.end_group :inline
+      end
+    end
+    
+    def scan_tokens encoder, options
+      state = options[:state] || @state
       plain_string_content = @plain_string_content
-
+      in_tag = in_attribute = nil
+      
+      encoder.begin_group :string if state == :attribute_value_string
+      
       until eos?
-
-        kind = nil
-        match = nil
-
-        if scan(/\s+/m)
-          kind = :space
-
+        
+        if state != :in_special_tag && match = scan(/\s+/m)
+          encoder.text_token match, :space
+          
         else
-
+          
           case state
-
+          
           when :initial
-            if scan(/<!--.*?-->/m)
-              kind = :comment
-            elsif scan(/<!DOCTYPE.*?>/m)
-              kind = :doctype
-            elsif scan(/<\?xml.*?\?>/m)
-              kind = :preprocessor
-            elsif scan(/<\?.*?\?>|<%.*?%>/m)
-              kind = :comment
-            elsif scan(/<\/[-\w.:]*>/m)
-              kind = :tag
-            elsif match = scan(/<[-\w.:]+>?/m)
-              kind = :tag
-              state = :attribute unless match[-1] == ?>
-            elsif scan(/[^<>&]+/)
-              kind = :plain
-            elsif scan(/#{ENTITY}/ox)
-              kind = :entity
-            elsif scan(/[<>&]/)
-              kind = :error
+            if match = scan(/<!--(?:.*?-->|.*)/m)
+              encoder.text_token match, :comment
+            elsif match = scan(/<!DOCTYPE(?:.*?>|.*)/m)
+              encoder.text_token match, :doctype
+            elsif match = scan(/<\?xml(?:.*?\?>|.*)/m)
+              encoder.text_token match, :preprocessor
+            elsif match = scan(/<\?(?:.*?\?>|.*)/m)
+              encoder.text_token match, :comment
+            elsif match = scan(/<\/[-\w.:]*>?/m)
+              in_tag = nil
+              encoder.text_token match, :tag
+            elsif match = scan(/<(?:(script)|[-\w.:]+)(>)?/m)
+              encoder.text_token match, :tag
+              in_tag = self[1]
+              if self[2]
+                state = :in_special_tag if in_tag
+              else
+                state = :attribute
+              end
+            elsif match = scan(/[^<>&]+/)
+              encoder.text_token match, :plain
+            elsif match = scan(/#{ENTITY}/ox)
+              encoder.text_token match, :entity
+            elsif match = scan(/[<>&]/)
+              in_tag = nil
+              encoder.text_token match, :error
             else
-              raise_inspect '[BUG] else-case reached with state %p' % [state], tokens
+              raise_inspect '[BUG] else-case reached with state %p' % [state], encoder
             end
-
+            
           when :attribute
-            if scan(/#{TAG_END}/o)
-              kind = :tag
-              state = :initial
-            elsif scan(/#{ATTR_NAME}/o)
-              kind = :attribute_name
+            if match = scan(/#{TAG_END}/o)
+              encoder.text_token match, :tag
+              in_attribute = nil
+              if in_tag
+                state = :in_special_tag
+              else
+                state = :initial
+              end
+            elsif match = scan(/#{ATTR_NAME}/o)
+              in_attribute = IN_ATTRIBUTE[match]
+              encoder.text_token match, :attribute_name
               state = :attribute_equal
             else
-              kind = :error
-              getch
+              in_tag = nil
+              encoder.text_token getch, :error
             end
-
+            
           when :attribute_equal
-            if scan(/=/)
-              kind = :operator
+            if match = scan(/=/)  #/
+              encoder.text_token match, :operator
               state = :attribute_value
-            elsif scan(/#{ATTR_NAME}/o)
-              kind = :attribute_name
-            elsif scan(/#{TAG_END}/o)
-              kind = :tag
-              state = :initial
-            elsif scan(/./)
-              kind = :error
+            elsif scan(/#{ATTR_NAME}/o) || scan(/#{TAG_END}/o)
+              state = :attribute
+              next
+            else
+              encoder.text_token getch, :error
               state = :attribute
             end
-
+            
           when :attribute_value
-            if scan(/#{ATTR_VALUE_UNQUOTED}/o)
-              kind = :attribute_value
+            if match = scan(/#{ATTR_NAME}/o)
+              encoder.text_token match, :attribute_value
               state = :attribute
             elsif match = scan(/["']/)
-              tokens << [:open, :string]
-              state = :attribute_value_string
-              plain_string_content = PLAIN_STRING_CONTENT[match]
-              kind = :delimiter
-            elsif scan(/#{TAG_END}/o)
-              kind = :tag
+              if in_attribute == :script
+                encoder.begin_group :inline
+                encoder.text_token match, :inline_delimiter
+                if scan(/javascript:[ \t]*/)
+                  encoder.text_token matched, :comment
+                end
+                code = scan_until(match == '"' ? /(?="|\z)/ : /(?='|\z)/)
+                scan_java_script encoder, code
+                match = scan(/["']/)
+                encoder.text_token match, :inline_delimiter if match
+                encoder.end_group :inline
+                state = :attribute
+                in_attribute = nil
+              else
+                encoder.begin_group :string
+                state = :attribute_value_string
+                plain_string_content = PLAIN_STRING_CONTENT[match]
+                encoder.text_token match, :delimiter
+              end
+            elsif match = scan(/#{TAG_END}/o)
+              encoder.text_token match, :tag
               state = :initial
             else
-              kind = :error
-              getch
+              encoder.text_token getch, :error
             end
-
+            
           when :attribute_value_string
-            if scan(plain_string_content)
-              kind = :content
-            elsif scan(/['"]/)
-              tokens << [matched, :delimiter]
-              tokens << [:close, :string]
+            if match = scan(plain_string_content)
+              encoder.text_token match, :content
+            elsif match = scan(/['"]/)
+              encoder.text_token match, :delimiter
+              encoder.end_group :string
               state = :attribute
-              next
-            elsif scan(/#{ENTITY}/ox)
-              kind = :entity
-            elsif scan(/&/)
-              kind = :content
-            elsif scan(/[\n>]/)
-              tokens << [:close, :string]
-              kind = :error
+            elsif match = scan(/#{ENTITY}/ox)
+              encoder.text_token match, :entity
+            elsif match = scan(/&/)
+              encoder.text_token match, :content
+            elsif match = scan(/[\n>]/)
+              encoder.end_group :string
               state = :initial
+              encoder.text_token match, :error
             end
-
+            
+          when :in_special_tag
+            case in_tag
+            when 'script'
+              encoder.text_token match, :space if match = scan(/[ \t]*\n/)
+              if scan(/(\s*<!--)(?:(.*?)(-->)|(.*))/m)
+                code = self[2] || self[4]
+                closing = self[3]
+                encoder.text_token self[1], :comment
+              else
+                code = scan_until(/(?=(?:\n\s*)?<\/script>)|\z/)
+                closing = false
+              end
+              unless code.empty?
+                encoder.begin_group :inline
+                scan_java_script encoder, code
+                encoder.end_group :inline
+              end
+              encoder.text_token closing, :comment if closing
+              state = :initial
+            else
+              raise 'unknown special tag: %p' % [in_tag]
+            end
+            
           else
-            raise_inspect 'Unknown state: %p' % [state], tokens
-
+            raise_inspect 'Unknown state: %p' % [state], encoder
+            
           end
-
-        end
-
-        match ||= matched
-        if $CODERAY_DEBUG and not kind
-          raise_inspect 'Error token %p in line %d' %
-            [[match, kind], line], tokens, state
+          
         end
-        raise_inspect 'Empty token', tokens unless match
-
-        tokens << [match, kind]
+        
       end
-
+      
       if options[:keep_state]
         @state = state
         @plain_string_content = plain_string_content
       end
-
-      tokens
+      
+      encoder.end_group :string if state == :attribute_value_string
+      
+      encoder
     end
-
+    
   end
-
+  
 end
 end
diff --git a/lib/coderay/scanners/java.rb b/lib/coderay/scanners/java.rb
index caf3619..c1490ac 100644
--- a/lib/coderay/scanners/java.rb
+++ b/lib/coderay/scanners/java.rb
@@ -1,11 +1,12 @@
 module CodeRay
 module Scanners
-
+  
+  # Scanner for Java.
   class Java < Scanner
-
-    include Streamable
+    
     register_for :java
-    helper :builtin_types
+    
+    autoload :BuiltinTypes, CodeRay.coderay_path('scanners', 'java', 'builtin_types')
     
     # http://java.sun.com/docs/books/tutorial/java/nutsandbolts/_keywords.html
     KEYWORDS = %w[
@@ -13,63 +14,64 @@ module Scanners
       finally for if instanceof import new package
       return switch throw try typeof while
       debugger export
-    ]
-    RESERVED = %w[ const goto ]
-    CONSTANTS = %w[ false null true ]
-    MAGIC_VARIABLES = %w[ this super ]
+    ]  # :nodoc:
+    RESERVED = %w[ const goto ]  # :nodoc:
+    CONSTANTS = %w[ false null true ]  # :nodoc:
+    MAGIC_VARIABLES = %w[ this super ]  # :nodoc:
     TYPES = %w[
       boolean byte char class double enum float int interface long
       short void
-    ] << '[]'  # because int[] should be highlighted as a type
+    ] << '[]'  # :nodoc: because int[] should be highlighted as a type
     DIRECTIVES = %w[
       abstract extends final implements native private protected public
       static strictfp synchronized throws transient volatile
-    ]
+    ]  # :nodoc:
     
     IDENT_KIND = WordList.new(:ident).
       add(KEYWORDS, :keyword).
       add(RESERVED, :reserved).
-      add(CONSTANTS, :pre_constant).
+      add(CONSTANTS, :predefined_constant).
       add(MAGIC_VARIABLES, :local_variable).
       add(TYPES, :type).
-      add(BuiltinTypes::List, :pre_type).
+      add(BuiltinTypes::List, :predefined_type).
       add(BuiltinTypes::List.select { |builtin| builtin[/(Error|Exception)$/] }, :exception).
-      add(DIRECTIVES, :directive)
+      add(DIRECTIVES, :directive)  # :nodoc:
 
-    ESCAPE = / [bfnrtv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x
-    UNICODE_ESCAPE =  / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} /x
+    ESCAPE = / [bfnrtv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x  # :nodoc:
+    UNICODE_ESCAPE =  / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} /x  # :nodoc:
     STRING_CONTENT_PATTERN = {
       "'" => /[^\\']+/,
       '"' => /[^\\"]+/,
       '/' => /[^\\\/]+/,
-    }
-    IDENT = /[a-zA-Z_][A-Za-z_0-9]*/
+    }  # :nodoc:
+    IDENT = /[a-zA-Z_][A-Za-z_0-9]*/  # :nodoc:
+    
+  protected
     
-    def scan_tokens tokens, options
+    def scan_tokens encoder, options
 
       state = :initial
       string_delimiter = nil
-      import_clause = class_name_follows = last_token_dot = false
+      package_name_expected = false
+      class_name_follows = false
+      last_token_dot = false
 
       until eos?
 
-        kind = nil
-        match = nil
-        
         case state
 
         when :initial
 
           if match = scan(/ \s+ | \\\n /x)
-            tokens << [match, :space]
+            encoder.text_token match, :space
             next
           
           elsif match = scan(%r! // [^\n\\]* (?: \\. [^\n\\]* )* | /\* (?: .*? \*/ | .* ) !mx)
-            tokens << [match, :comment]
+            encoder.text_token match, :comment
             next
           
-          elsif import_clause && scan(/ #{IDENT} (?: \. #{IDENT} )* /ox)
-            kind = :include
+          elsif package_name_expected && match = scan(/ #{IDENT} (?: \. #{IDENT} )* /ox)
+            encoder.text_token match, package_name_expected
           
           elsif match = scan(/ #{IDENT} | \[\] /ox)
             kind = IDENT_KIND[match]
@@ -79,95 +81,91 @@ module Scanners
               kind = :class
               class_name_follows = false
             else
-              import_clause = true if match == 'import'
-              class_name_follows = true if match == 'class' || match == 'interface'
+              case match
+              when 'import'
+                package_name_expected = :include
+              when 'package'
+                package_name_expected = :namespace
+              when 'class', 'interface'
+                class_name_follows = true
+              end
             end
+            encoder.text_token match, kind
           
-          elsif scan(/ \.(?!\d) | [,?:()\[\]}] | -- | \+\+ | && | \|\| | \*\*=? | [-+*\/%^~&|<>=!]=? | <<<?=? | >>>?=? /x)
-            kind = :operator
+          elsif match = scan(/ \.(?!\d) | [,?:()\[\]}] | -- | \+\+ | && | \|\| | \*\*=? | [-+*\/%^~&|<>=!]=? | <<<?=? | >>>?=? /x)
+            encoder.text_token match, :operator
           
-          elsif scan(/;/)
-            import_clause = false
-            kind = :operator
+          elsif match = scan(/;/)
+            package_name_expected = false
+            encoder.text_token match, :operator
           
-          elsif scan(/\{/)
+          elsif match = scan(/\{/)
             class_name_follows = false
-            kind = :operator
+            encoder.text_token match, :operator
           
           elsif check(/[\d.]/)
-            if scan(/0[xX][0-9A-Fa-f]+/)
-              kind = :hex
-            elsif scan(/(?>0[0-7]+)(?![89.eEfF])/)
-              kind = :oct
-            elsif scan(/\d+[fFdD]|\d*\.\d+(?:[eE][+-]?\d+)?[fFdD]?|\d+[eE][+-]?\d+[fFdD]?/)
-              kind = :float
-            elsif scan(/\d+[lL]?/)
-              kind = :integer
+            if match = scan(/0[xX][0-9A-Fa-f]+/)
+              encoder.text_token match, :hex
+            elsif match = scan(/(?>0[0-7]+)(?![89.eEfF])/)
+              encoder.text_token match, :octal
+            elsif match = scan(/\d+[fFdD]|\d*\.\d+(?:[eE][+-]?\d+)?[fFdD]?|\d+[eE][+-]?\d+[fFdD]?/)
+              encoder.text_token match, :float
+            elsif match = scan(/\d+[lL]?/)
+              encoder.text_token match, :integer
             end
 
           elsif match = scan(/["']/)
-            tokens << [:open, :string]
             state = :string
+            encoder.begin_group state
             string_delimiter = match
-            kind = :delimiter
+            encoder.text_token match, :delimiter
 
-          elsif scan(/ @ #{IDENT} /ox)
-            kind = :annotation
+          elsif match = scan(/ @ #{IDENT} /ox)
+            encoder.text_token match, :annotation
 
           else
-            getch
-            kind = :error
+            encoder.text_token getch, :error
 
           end
 
         when :string
-          if scan(STRING_CONTENT_PATTERN[string_delimiter])
-            kind = :content
+          if match = scan(STRING_CONTENT_PATTERN[string_delimiter])
+            encoder.text_token match, :content
           elsif match = scan(/["'\/]/)
-            tokens << [match, :delimiter]
-            tokens << [:close, state]
-            string_delimiter = nil
+            encoder.text_token match, :delimiter
+            encoder.end_group state
             state = :initial
-            next
+            string_delimiter = nil
           elsif state == :string && (match = scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox))
             if string_delimiter == "'" && !(match == "\\\\" || match == "\\'")
-              kind = :content
+              encoder.text_token match, :content
             else
-              kind = :char
+              encoder.text_token match, :char
             end
-          elsif scan(/\\./m)
-            kind = :content
-          elsif scan(/ \\ | $ /x)
-            tokens << [:close, state]
-            kind = :error
+          elsif match = scan(/\\./m)
+            encoder.text_token match, :content
+          elsif match = scan(/ \\ | $ /x)
+            encoder.end_group state
             state = :initial
+            encoder.text_token match, :error
           else
-            raise_inspect "else case \" reached; %p not handled." % peek(1), tokens
+            raise_inspect "else case \" reached; %p not handled." % peek(1), encoder
           end
 
         else
-          raise_inspect 'Unknown state', tokens
+          raise_inspect 'Unknown state', encoder
 
         end
-
-        match ||= matched
-        if $CODERAY_DEBUG and not kind
-          raise_inspect 'Error token %p in line %d' %
-            [[match, kind], line], tokens
-        end
-        raise_inspect 'Empty token', tokens unless match
         
         last_token_dot = match == '.'
         
-        tokens << [match, kind]
-
       end
 
       if state == :string
-        tokens << [:close, state]
+        encoder.end_group state
       end
 
-      tokens
+      encoder
     end
 
   end
diff --git a/lib/coderay/scanners/java/builtin_types.rb b/lib/coderay/scanners/java/builtin_types.rb
index 8087edd..d1b8b73 100644
--- a/lib/coderay/scanners/java/builtin_types.rb
+++ b/lib/coderay/scanners/java/builtin_types.rb
@@ -3,6 +3,7 @@ module Scanners
   
   module Java::BuiltinTypes  # :nodoc:
     
+    #:nocov:
     List = %w[
       AbstractAction AbstractBorder AbstractButton AbstractCellEditor AbstractCollection
       AbstractColorChooserPanel AbstractDocument AbstractExecutorService AbstractInterruptibleChannel
@@ -412,6 +413,7 @@ module Scanners
       XPathFactoryConfigurationException XPathFunction XPathFunctionException XPathFunctionResolver
       XPathVariableResolver ZipEntry ZipException ZipFile ZipInputStream ZipOutputStream ZoneView
     ]
+    #:nocov:
     
   end
   
diff --git a/lib/coderay/scanners/java_script.rb b/lib/coderay/scanners/java_script.rb
index 1f26348..43ecb18 100644
--- a/lib/coderay/scanners/java_script.rb
+++ b/lib/coderay/scanners/java_script.rb
@@ -1,28 +1,29 @@
 module CodeRay
 module Scanners
-
+  
+  # Scanner for JavaScript.
+  # 
+  # Aliases: +ecmascript+, +ecma_script+, +javascript+
   class JavaScript < Scanner
-
-    include Streamable
-
+    
     register_for :java_script
     file_extension 'js'
-
+    
     # The actual JavaScript keywords.
     KEYWORDS = %w[
       break case catch continue default delete do else
       finally for function if in instanceof new
       return switch throw try typeof var void while with
-    ]
+    ]  # :nodoc:
     PREDEFINED_CONSTANTS = %w[
-      false null true undefined
-    ]
+      false null true undefined NaN Infinity
+    ]  # :nodoc:
     
-    MAGIC_VARIABLES = %w[ this arguments ]  # arguments was introduced in JavaScript 1.4
+    MAGIC_VARIABLES = %w[ this arguments ]  # :nodoc: arguments was introduced in JavaScript 1.4
     
     KEYWORDS_EXPECTING_VALUE = WordList.new.add %w[
       case delete in instanceof new return throw typeof with
-    ]
+    ]  # :nodoc:
     
     # Reserved for future use.
     RESERVED_WORDS = %w[
@@ -30,68 +31,66 @@ module Scanners
       final float goto implements import int interface long native package
       private protected public short static super synchronized throws transient
       volatile
-    ]
+    ]  # :nodoc:
     
     IDENT_KIND = WordList.new(:ident).
       add(RESERVED_WORDS, :reserved).
-      add(PREDEFINED_CONSTANTS, :pre_constant).
+      add(PREDEFINED_CONSTANTS, :predefined_constant).
       add(MAGIC_VARIABLES, :local_variable).
-      add(KEYWORDS, :keyword)
-
-    ESCAPE = / [bfnrtv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x
-    UNICODE_ESCAPE =  / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} /x
-    REGEXP_ESCAPE =  / [bBdDsSwW] /x
+      add(KEYWORDS, :keyword)  # :nodoc:
+    
+    ESCAPE = / [bfnrtv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x  # :nodoc:
+    UNICODE_ESCAPE =  / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} /x  # :nodoc:
+    REGEXP_ESCAPE =  / [bBdDsSwW] /x  # :nodoc:
     STRING_CONTENT_PATTERN = {
       "'" => /[^\\']+/,
       '"' => /[^\\"]+/,
       '/' => /[^\\\/]+/,
-    }
+    }  # :nodoc:
     KEY_CHECK_PATTERN = {
       "'" => / (?> [^\\']* (?: \\. [^\\']* )* ) ' \s* : /mx,
       '"' => / (?> [^\\"]* (?: \\. [^\\"]* )* ) " \s* : /mx,
-    }
-
-    def scan_tokens tokens, options
-
+    }  # :nodoc:
+    
+  protected
+    
+    def scan_tokens encoder, options
+      
       state = :initial
       string_delimiter = nil
       value_expected = true
       key_expected = false
       function_expected = false
-
+      
       until eos?
-
-        kind = nil
-        match = nil
         
         case state
-
+          
         when :initial
-
+          
           if match = scan(/ \s+ | \\\n /x)
             value_expected = true if !value_expected && match.index(?\n)
-            tokens << [match, :space]
-            next
-
-          elsif scan(%r! // [^\n\\]* (?: \\. [^\n\\]* )* | /\* (?: .*? \*/ | .* ) !mx)
+            encoder.text_token match, :space
+            
+          elsif match = scan(%r! // [^\n\\]* (?: \\. [^\n\\]* )* | /\* (?: .*? \*/ | .* ) !mx)
             value_expected = true
-            kind = :comment
-
+            encoder.text_token match, :comment
+            
           elsif check(/\.?\d/)
             key_expected = value_expected = false
-            if scan(/0[xX][0-9A-Fa-f]+/)
-              kind = :hex
-            elsif scan(/(?>0[0-7]+)(?![89.eEfF])/)
-              kind = :oct
-            elsif scan(/\d+[fF]|\d*\.\d+(?:[eE][+-]?\d+)?[fF]?|\d+[eE][+-]?\d+[fF]?/)
-              kind = :float
-            elsif scan(/\d+/)
-              kind = :integer
+            if match = scan(/0[xX][0-9A-Fa-f]+/)
+              encoder.text_token match, :hex
+            elsif match = scan(/(?>0[0-7]+)(?![89.eEfF])/)
+              encoder.text_token match, :octal
+            elsif match = scan(/\d+[fF]|\d*\.\d+(?:[eE][+-]?\d+)?[fF]?|\d+[eE][+-]?\d+[fF]?/)
+              encoder.text_token match, :float
+            elsif match = scan(/\d+/)
+              encoder.text_token match, :integer
             end
-          
+            
           elsif value_expected && match = scan(/<([[:alpha:]]\w*) (?: [^\/>]*\/> | .*?<\/\1>)/xim)
-            # FIXME: scan over nested tags
-            xml_scanner.tokenize match
+            # TODO: scan over nested tags
+            xml_scanner.tokenize match, :tokens => encoder
             value_expected = false
             next
             
@@ -100,12 +99,12 @@ module Scanners
             last_operator = match[-1]
             key_expected = (last_operator == ?{) || (last_operator == ?,)
             function_expected = false
-            kind = :operator
-
-          elsif scan(/ [)\]}]+ /x)
+            encoder.text_token match, :operator
+            
+          elsif match = scan(/ [)\]}]+ /x)
             function_expected = key_expected = value_expected = false
-            kind = :operator
-
+            encoder.text_token match, :operator
+            
           elsif match = scan(/ [$a-zA-Z_][A-Za-z_0-9$]* /x)
             kind = IDENT_KIND[match]
             value_expected = (kind == :keyword) && KEYWORDS_EXPECTING_VALUE[match]
@@ -123,101 +122,91 @@ module Scanners
             end
             function_expected = (kind == :keyword) && (match == 'function')
             key_expected = false
-          
+            encoder.text_token match, kind
+            
           elsif match = scan(/["']/)
             if key_expected && check(KEY_CHECK_PATTERN[match])
               state = :key
             else
               state = :string
             end
-            tokens << [:open, state]
+            encoder.begin_group state
             string_delimiter = match
-            kind = :delimiter
-
-          elsif value_expected && (match = scan(/\/(?=\S)/))
-            tokens << [:open, :regexp]
+            encoder.text_token match, :delimiter
+            
+          elsif value_expected && (match = scan(/\//))
+            encoder.begin_group :regexp
             state = :regexp
             string_delimiter = '/'
-            kind = :delimiter
-
-          elsif scan(/ \/ /x)
+            encoder.text_token match, :delimiter
+            
+          elsif match = scan(/ \/ /x)
             value_expected = true
             key_expected = false
-            kind = :operator
-
+            encoder.text_token match, :operator
+            
           else
-            getch
-            kind = :error
-
+            encoder.text_token getch, :error
+            
           end
-
+          
         when :string, :regexp, :key
-          if scan(STRING_CONTENT_PATTERN[string_delimiter])
-            kind = :content
+          if match = scan(STRING_CONTENT_PATTERN[string_delimiter])
+            encoder.text_token match, :content
           elsif match = scan(/["'\/]/)
-            tokens << [match, :delimiter]
+            encoder.text_token match, :delimiter
             if state == :regexp
               modifiers = scan(/[gim]+/)
-              tokens << [modifiers, :modifier] if modifiers && !modifiers.empty?
+              encoder.text_token modifiers, :modifier if modifiers && !modifiers.empty?
             end
-            tokens << [:close, state]
+            encoder.end_group state
             string_delimiter = nil
             key_expected = value_expected = false
             state = :initial
-            next
           elsif state != :regexp && (match = scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox))
             if string_delimiter == "'" && !(match == "\\\\" || match == "\\'")
-              kind = :content
+              encoder.text_token match, :content
             else
-              kind = :char
+              encoder.text_token match, :char
             end
-          elsif state == :regexp && scan(/ \\ (?: #{ESCAPE} | #{REGEXP_ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
-            kind = :char
-          elsif scan(/\\./m)
-            kind = :content
-          elsif scan(/ \\ | $ /x)
-            tokens << [:close, state]
-            kind = :error
+          elsif state == :regexp && match = scan(/ \\ (?: #{ESCAPE} | #{REGEXP_ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
+            encoder.text_token match, :char
+          elsif match = scan(/\\./m)
+            encoder.text_token match, :content
+          elsif match = scan(/ \\ | $ /x)
+            encoder.end_group state
+            encoder.text_token match, :error
             key_expected = value_expected = false
             state = :initial
           else
-            raise_inspect "else case \" reached; %p not handled." % peek(1), tokens
+            raise_inspect "else case \" reached; %p not handled." % peek(1), encoder
           end
-
+          
         else
-          raise_inspect 'Unknown state', tokens
-
-        end
-
-        match ||= matched
-        if $CODERAY_DEBUG and not kind
-          raise_inspect 'Error token %p in line %d' %
-            [[match, kind], line], tokens
+          raise_inspect 'Unknown state', encoder
+          
         end
-        raise_inspect 'Empty token', tokens unless match
         
-        tokens << [match, kind]
-
       end
-
+      
       if [:string, :regexp].include? state
-        tokens << [:close, state]
+        encoder.end_group state
       end
-
-      tokens
+      
+      encoder
     end
-
+    
   protected
-
+    
     def reset_instance
       super
       @xml_scanner.reset if defined? @xml_scanner
     end
-
+    
     def xml_scanner
       @xml_scanner ||= CodeRay.scanner :xml, :tokens => @tokens, :keep_tokens => true, :keep_state => false
     end
-
+    
   end
   
 end
diff --git a/lib/coderay/scanners/json.rb b/lib/coderay/scanners/json.rb
index abe24fb..0c90c34 100644
--- a/lib/coderay/scanners/json.rb
+++ b/lib/coderay/scanners/json.rb
@@ -1,22 +1,24 @@
 module CodeRay
 module Scanners
   
+  # Scanner for JSON (JavaScript Object Notation).
   class JSON < Scanner
     
-    include Streamable
-    
     register_for :json
     file_extension 'json'
     
     KINDS_NOT_LOC = [
       :float, :char, :content, :delimiter,
       :error, :integer, :operator, :value,
-    ]
+    ]  # :nodoc:
+    
+    ESCAPE = / [bfnrt\\"\/] /x  # :nodoc:
+    UNICODE_ESCAPE = / u[a-fA-F0-9]{4} /x  # :nodoc:
     
-    ESCAPE = / [bfnrt\\"\/] /x
-    UNICODE_ESCAPE =  / u[a-fA-F0-9]{4} /x
+  protected
     
-    def scan_tokens tokens, options
+    # See http://json.org/ for a definition of the JSON lexic/grammar.
+    def scan_tokens encoder, options
       
       state = :initial
       stack = []
@@ -24,82 +26,67 @@ module Scanners
       
       until eos?
         
-        kind = nil
-        match = nil
-        
         case state
         
         when :initial
-          if match = scan(/ \s+ | \\\n /x)
-            tokens << [match, :space]
-            next
+          if match = scan(/ \s+ /x)
+            encoder.text_token match, :space
+          elsif match = scan(/"/)
+            state = key_expected ? :key : :string
+            encoder.begin_group state
+            encoder.text_token match, :delimiter
           elsif match = scan(/ [:,\[{\]}] /x)
-            kind = :operator
+            encoder.text_token match, :operator
             case match
-            when '{' then stack << :object; key_expected = true
-            when '[' then stack << :array
             when ':' then key_expected = false
             when ',' then key_expected = true if stack.last == :object
+            when '{' then stack << :object; key_expected = true
+            when '[' then stack << :array
             when '}', ']' then stack.pop  # no error recovery, but works for valid JSON
             end
           elsif match = scan(/ true | false | null /x)
-            kind = :value
-          elsif match = scan(/-?(?:0|[1-9]\d*)/)
-            kind = :integer
-            if scan(/\.\d+(?:[eE][-+]?\d+)?|[eE][-+]?\d+/)
+            encoder.text_token match, :value
+          elsif match = scan(/ -? (?: 0 | [1-9]\d* ) /x)
+            if scan(/ \.\d+ (?:[eE][-+]?\d+)? | [eE][-+]? \d+ /x)
               match << matched
-              kind = :float
+              encoder.text_token match, :float
+            else
+              encoder.text_token match, :integer
             end
-          elsif match = scan(/"/)
-            state = key_expected ? :key : :string
-            tokens << [:open, state]
-            kind = :delimiter
           else
-            getch
-            kind = :error
+            encoder.text_token getch, :error
           end
           
         when :string, :key
-          if scan(/[^\\"]+/)
-            kind = :content
-          elsif scan(/"/)
-            tokens << ['"', :delimiter]
-            tokens << [:close, state]
+          if match = scan(/[^\\"]+/)
+            encoder.text_token match, :content
+          elsif match = scan(/"/)
+            encoder.text_token match, :delimiter
+            encoder.end_group state
             state = :initial
-            next
-          elsif scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
-            kind = :char
-          elsif scan(/\\./m)
-            kind = :content
-          elsif scan(/ \\ | $ /x)
-            tokens << [:close, state]
-            kind = :error
+          elsif match = scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
+            encoder.text_token match, :char
+          elsif match = scan(/\\./m)
+            encoder.text_token match, :content
+          elsif match = scan(/ \\ | $ /x)
+            encoder.end_group state
+            encoder.text_token match, :error
             state = :initial
           else
-            raise_inspect "else case \" reached; %p not handled." % peek(1), tokens
+            raise_inspect "else case \" reached; %p not handled." % peek(1), encoder
           end
           
         else
-          raise_inspect 'Unknown state', tokens
+          raise_inspect 'Unknown state: %p' % [state], encoder
           
         end
-        
-        match ||= matched
-        if $CODERAY_DEBUG and not kind
-          raise_inspect 'Error token %p in line %d' %
-            [[match, kind], line], tokens
-        end
-        raise_inspect 'Empty token', tokens unless match
-        
-        tokens << [match, kind]
-        
       end
       
       if [:string, :key].include? state
-        tokens << [:close, state]
+        encoder.end_group state
       end
       
-      tokens
+      encoder
     end
     
   end
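
Note on the scanner hunks above: the recurring change is that each branch now hands its token to the encoder at once (text_token, begin_group, end_group), rather than setting kind and relying on a shared "tokens << [match, kind]" at the bottom of the loop. The following is a minimal sketch of that pattern, not CodeRay code. RecordingEncoder and scan_strings are hypothetical names invented for illustration; only the three encoder method names are taken from the diff, and the scanner handles nothing but whitespace and double-quoted strings.

```ruby
require 'strscan'

# Hypothetical stand-in for an encoder: it only records the calls it receives.
class RecordingEncoder
  attr_reader :events

  def initialize
    @events = []
  end

  def text_token(text, kind)
    @events << [text, kind]
  end

  def begin_group(kind)
    @events << [:begin_group, kind]
  end

  def end_group(kind)
    @events << [:end_group, kind]
  end
end

# Hypothetical, heavily simplified scanner. Every branch emits its token
# immediately, so no shared "kind" variable or trailing emit step is needed.
def scan_strings(source, encoder)
  scanner = StringScanner.new(source)
  state = :initial

  until scanner.eos?
    case state
    when :initial
      if match = scanner.scan(/\s+/)
        encoder.text_token match, :space
      elsif match = scanner.scan(/"/)
        state = :string
        encoder.begin_group state
        encoder.text_token match, :delimiter
      else
        encoder.text_token scanner.getch, :error
      end
    when :string
      if match = scanner.scan(/[^\\"]+/)
        encoder.text_token match, :content
      elsif match = scanner.scan(/"/)
        encoder.text_token match, :delimiter
        encoder.end_group state
        state = :initial
      elsif match = scanner.scan(/\\./m)
        encoder.text_token match, :char
      else
        # A lone backslash at end of input: consume it so the loop advances.
        encoder.text_token scanner.getch, :error
      end
    end
  end

  # Close a group left open by unterminated input.
  encoder.end_group state if state == :string
  encoder
end
```

Calling scan_strings('"a" ', RecordingEncoder.new).events yields the begin_group event, the opening delimiter, the content, the closing delimiter, the end_group event and the trailing space, in that order. Because every branch consumes input and emits on its own, the "next" statements and the "Empty token" check removed in the diff have no counterpart here.
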
diff --git a/lib/coderay/scanners/nitro_xhtml.rb b/lib/coderay/scanners/nitro_xhtml.rb
deleted file mode 100644
index 3db42d9..0000000
--- a/lib/coderay/scanners/nitro_xhtml.rb
+++ /dev/null
@@ -1,136 +0,0 @@
-module CodeRay
-module Scanners
-
-  load :html
-  load :ruby
-
-  # Nitro XHTML Scanner
-  class NitroXHTML < Scanner
-
-    include Streamable
-    register_for :nitro_xhtml
-    file_extension :xhtml
-    title 'Nitro XHTML'
-
-    KINDS_NOT_LOC = HTML::KINDS_NOT_LOC
-    
-    NITRO_RUBY_BLOCK = /
-      <\?r
-      (?>
-        [^\?]*
-        (?> \?(?!>) [^\?]* )*
-      )
-      (?: \?> )?
-    |
-      <ruby>
-      (?>
-        [^<]*
-        (?> <(?!\/ruby>) [^<]* )*
-      )
-      (?: <\/ruby> )?
-    |
-      <%
-      (?>
-        [^%]*
-        (?> %(?!>) [^%]* )*
-      )
-      (?: %> )?
-    /mx
-
-    NITRO_VALUE_BLOCK = /
-      \#
-      (?:
-        \{
-        [^{}]*
-        (?>
-          \{ [^}]* \}
-          (?> [^{}]* )
-        )*
-        \}?
-      | \| [^|]* \|?
-      | \( [^)]* \)?
-      | \[ [^\]]* \]?
-      | \\ [^\\]* \\?
-      )
-    /x
-
-    NITRO_ENTITY = /
-      % (?: \#\d+ | \w+ ) ;
-    /
-
-    START_OF_RUBY = /
-      (?=[<\#%])
-      < (?: \?r | % | ruby> )
-    | \# [{(|]
-    | % (?: \#\d+ | \w+ ) ;
-    /x
-
-    CLOSING_PAREN = Hash.new do |h, p|
-      h[p] = p
-    end.update( {
-      '(' => ')',
-      '[' => ']',
-      '{' => '}',
-    } )
-
-  private
-
-    def setup
-      @ruby_scanner = CodeRay.scanner :ruby, :tokens => @tokens, :keep_tokens => true
-      @html_scanner = CodeRay.scanner :html, :tokens => @tokens, :keep_tokens => true, :keep_state => true
-    end
-
-    def reset_instance
-      super
-      @html_scanner.reset
-    end
-
-    def scan_tokens tokens, options
-
-      until eos?
-
-        if (match = scan_until(/(?=#{START_OF_RUBY})/o) || scan_rest) && !match.empty?
-          @html_scanner.tokenize match
-
-        elsif match = scan(/#{NITRO_VALUE_BLOCK}/o)
-          start_tag = match[0,2]
-          delimiter = CLOSING_PAREN[start_tag[1,1]]
-          end_tag = match[-1,1] == delimiter ? delimiter : ''
-          tokens << [:open, :inline]
-          tokens << [start_tag, :inline_delimiter]
-          code = match[start_tag.size .. -1 - end_tag.size]
-          @ruby_scanner.tokenize code
-          tokens << [end_tag, :inline_delimiter] unless end_tag.empty?
-          tokens << [:close, :inline]
-
-        elsif match = scan(/#{NITRO_RUBY_BLOCK}/o)
-          start_tag = '<?r'
-          end_tag = match[-2,2] == '?>' ? '?>' : ''
-          tokens << [:open, :inline]
-          tokens << [start_tag, :inline_delimiter]
-          code = match[start_tag.size .. -(end_tag.size)-1]
-          @ruby_scanner.tokenize code
-          tokens << [end_tag, :inline_delimiter] unless end_tag.empty?
-          tokens << [:close, :inline]
-
-        elsif entity = scan(/#{NITRO_ENTITY}/o)
-          tokens << [entity, :entity]
-        
-        elsif scan(/%/)
-          tokens << [matched, :error]
-
-        else
-          raise_inspect 'else-case reached!', tokens
-          
-        end
-
-      end
-
-      tokens
-
-    end
-
-  end
-
-end
-end
diff --git a/lib/coderay/scanners/php.rb b/lib/coderay/scanners/php.rb
index 51d6af0..dadab00 100644
--- a/lib/coderay/scanners/php.rb
+++ b/lib/coderay/scanners/php.rb
@@ -3,14 +3,19 @@ module Scanners
   
   load :html
   
+  # Scanner for PHP.
+  # 
   # Original by Stefan Walk.
   class PHP < Scanner
     
     register_for :php
     file_extension 'php'
+    encoding 'BINARY'
     
     KINDS_NOT_LOC = HTML::KINDS_NOT_LOC
     
+  protected
+    
     def setup
       @html_scanner = CodeRay.scanner :html, :tokens => @tokens, :keep_tokens => true, :keep_state => true
     end
@@ -20,7 +25,7 @@ module Scanners
       @html_scanner.reset
     end
     
-    module Words
+    module Words  # :nodoc:
       
       # according to http://www.php.net/manual/en/reserved.keywords.php
       KEYWORDS = %w[
@@ -176,20 +181,20 @@ module Scanners
         $argc $argv
       ]
       
-      IDENT_KIND = CaseIgnoringWordList.new(:ident).
-        add(KEYWORDS, :reserved).
-        add(TYPES, :pre_type).
-        add(LANGUAGE_CONSTRUCTS, :reserved).
+      IDENT_KIND = WordList::CaseIgnoring.new(:ident).
+        add(KEYWORDS, :keyword).
+        add(TYPES, :predefined_type).
+        add(LANGUAGE_CONSTRUCTS, :keyword).
         add(BUILTIN_FUNCTIONS, :predefined).
-        add(CLASSES, :pre_constant).
+        add(CLASSES, :predefined_constant).
         add(EXCEPTIONS, :exception).
-        add(CONSTANTS, :pre_constant)
+        add(CONSTANTS, :predefined_constant)
       
       VARIABLE_KIND = WordList.new(:local_variable).
         add(PREDEFINED, :predefined)
     end
     
-    module RE
+    module RE  # :nodoc:
       
       PHP_START = /
         <script\s+[^>]*?language\s*=\s*"php"[^>]*?> |
@@ -224,17 +229,13 @@ module Scanners
       
     end
     
-    def scan_tokens tokens, options
-      if string.respond_to?(:encoding)
-        unless string.encoding == Encoding::ASCII_8BIT
-          self.string = string.encode Encoding::ASCII_8BIT,
-            :invalid => :replace, :undef => :replace, :replace => '?'
-        end
-      end
+  protected
+    
+    def scan_tokens encoder, options
       
       if check(RE::PHP_START) ||  # starts with <?
-       (match?(/\s*<\S/) && exist?(RE::PHP_START)) || # starts with tag and contains <?
-       exist?(RE::HTML_INDICATOR) ||
+       (match?(/\s*<\S/) && check(/.{1,1000}#{RE::PHP_START}/om)) || # starts with tag and contains <?
+       check(/.{0,1000}#{RE::HTML_INDICATOR}/om) ||
        check(/.{1,100}#{RE::PHP_START}/om)  # PHP start after max 100 chars
         # is HTML with embedded PHP, so start with HTML
         states = [:initial]
@@ -252,29 +253,24 @@ module Scanners
       
       until eos?
         
-        match = nil
-        kind = nil
-        
         case states.last
         
         when :initial  # HTML
-          if scan RE::PHP_START
-            kind = :inline_delimiter
+          if match = scan(RE::PHP_START)
+            encoder.text_token match, :inline_delimiter
             label_expected = true
             states << :php
           else
             match = scan_until(/(?=#{RE::PHP_START})/o) || scan_rest
             @html_scanner.tokenize match unless match.empty?
-            next
           end
         
         when :php
           if match = scan(/\s+/)
-            tokens << [match, :space]
-            next
+            encoder.text_token match, :space
           
-          elsif scan(%r! (?m: \/\* (?: .*? \*\/ | .* ) ) | (?://|\#) .*? (?=#{RE::PHP_END}|$) !xo)
-            kind = :comment
+          elsif match = scan(%r! (?m: \/\* (?: .*? \*\/ | .* ) ) | (?://|\#) .*? (?=#{RE::PHP_END}|$) !xo)
+            encoder.text_token match, :comment
           
           elsif match = scan(RE::IDENTIFIER)
             kind = Words::IDENT_KIND[match]
@@ -285,7 +281,7 @@ module Scanners
               label_expected = false
               if kind == :ident && match =~ /^[A-Z]/
                 kind = :constant
-              elsif kind == :reserved
+              elsif kind == :keyword
                 case match
                 when 'class'
                   states << :class_expected
@@ -299,77 +295,68 @@ module Scanners
                 next
               end
             end
+            encoder.text_token match, kind
           
-          elsif scan(/(?:\d+\.\d*|\d*\.\d+)(?:e[-+]?\d+)?|\d+e[-+]?\d+/i)
+          elsif match = scan(/(?:\d+\.\d*|\d*\.\d+)(?:e[-+]?\d+)?|\d+e[-+]?\d+/i)
             label_expected = false
-            kind = :float
+            encoder.text_token match, :float
           
-          elsif scan(/0x[0-9a-fA-F]+/)
+          elsif match = scan(/0x[0-9a-fA-F]+/)
             label_expected = false
-            kind = :hex
+            encoder.text_token match, :hex
           
-          elsif scan(/\d+/)
+          elsif match = scan(/\d+/)
             label_expected = false
-            kind = :integer
-          
-          elsif scan(/'/)
-            tokens << [:open, :string]
-            if modifier
-              tokens << [modifier, :modifier]
-              modifier = nil
-            end
-            kind = :delimiter
-            states.push :sqstring
+            encoder.text_token match, :integer
           
-          elsif match = scan(/["`]/)
-            tokens << [:open, :string]
+          elsif match = scan(/['"`]/)
+            encoder.begin_group :string
             if modifier
-              tokens << [modifier, :modifier]
+              encoder.text_token modifier, :modifier
               modifier = nil
             end
             delimiter = match
-            kind = :delimiter
-            states.push :dqstring
+            encoder.text_token match, :delimiter
+            states.push match == "'" ? :sqstring : :dqstring
           
           elsif match = scan(RE::VARIABLE)
             label_expected = false
-            kind = Words::VARIABLE_KIND[match]
+            encoder.text_token match, Words::VARIABLE_KIND[match]
           
-          elsif scan(/\{/)
-            kind = :operator
+          elsif match = scan(/\{/)
+            encoder.text_token match, :operator
             label_expected = true
             states.push :php
           
-          elsif scan(/\}/)
+          elsif match = scan(/\}/)
             if states.size == 1
-              kind = :error
+              encoder.text_token match, :error
             else
               states.pop
               if states.last.is_a?(::Array)
                 delimiter = states.last[1]
                 states[-1] = states.last[0]
-                tokens << [matched, :delimiter]
-                tokens << [:close, :inline]
-                next
+                encoder.text_token match, :delimiter
+                encoder.end_group :inline
               else
-                kind = :operator
+                encoder.text_token match, :operator
                 label_expected = true
               end
             end
           
-          elsif scan(/@/)
+          elsif match = scan(/@/)
             label_expected = false
-            kind = :exception
+            encoder.text_token match, :exception
           
-          elsif scan RE::PHP_END
-            kind = :inline_delimiter
+          elsif match = scan(RE::PHP_END)
+            encoder.text_token match, :inline_delimiter
             states = [:initial]
           
           elsif match = scan(/<<<(?:(#{RE::IDENTIFIER})|"(#{RE::IDENTIFIER})"|'(#{RE::IDENTIFIER})')/o)
-            tokens << [:open, :string]
-            warn 'heredoc in heredoc?' if heredoc_delimiter
+            encoder.begin_group :string
+            # warn 'heredoc in heredoc?' if heredoc_delimiter
             heredoc_delimiter = Regexp.escape(self[1] || self[2] || self[3])
-            kind = :delimiter
+            encoder.text_token match, :delimiter
             states.push self[3] ? :sqstring : :dqstring
             heredoc_delimiter = /#{heredoc_delimiter}(?=;?$)/
           
@@ -379,152 +366,141 @@ module Scanners
               label_expected = true if match == ':'
               case_expected = false
             end
-            kind = :operator
+            encoder.text_token match, :operator
           
           else
-            getch
-            kind = :error
+            encoder.text_token getch, :error
           
           end
         
         when :sqstring
-          if scan(heredoc_delimiter ? /[^\\\n]+/ : /[^'\\]+/)
-            kind = :content
-          elsif !heredoc_delimiter && scan(/'/)
-            tokens << [matched, :delimiter]
-            tokens << [:close, :string]
+          if match = scan(heredoc_delimiter ? /[^\\\n]+/ : /[^'\\]+/)
+            encoder.text_token match, :content
+          elsif !heredoc_delimiter && match = scan(/'/)
+            encoder.text_token match, :delimiter
+            encoder.end_group :string
             delimiter = nil
             label_expected = false
             states.pop
-            next
           elsif heredoc_delimiter && match = scan(/\n/)
-            kind = :content
             if scan heredoc_delimiter
-              tokens << ["\n", :content]
-              tokens << [matched, :delimiter]
-              tokens << [:close, :string]
+              encoder.text_token "\n", :content
+              encoder.text_token matched, :delimiter
+              encoder.end_group :string
               heredoc_delimiter = nil
               label_expected = false
               states.pop
-              next
+            else
+              encoder.text_token match, :content
             end
-          elsif scan(heredoc_delimiter ? /\\\\/ : /\\[\\'\n]/)
-            kind = :char
-          elsif scan(/\\./m)
-            kind = :content
-          elsif scan(/\\/)
-            kind = :error
+          elsif match = scan(heredoc_delimiter ? /\\\\/ : /\\[\\'\n]/)
+            encoder.text_token match, :char
+          elsif match = scan(/\\./m)
+            encoder.text_token match, :content
+          elsif match = scan(/\\/)
+            encoder.text_token match, :error
+          else
+            states.pop
           end
         
         when :dqstring
-          if scan(heredoc_delimiter ? /[^${\\\n]+/ : (delimiter == '"' ? /[^"${\\]+/ : /[^`${\\]+/))
-            kind = :content
-          elsif !heredoc_delimiter && scan(delimiter == '"' ? /"/ : /`/)
-            tokens << [matched, :delimiter]
-            tokens << [:close, :string]
+          if match = scan(heredoc_delimiter ? /[^${\\\n]+/ : (delimiter == '"' ? /[^"${\\]+/ : /[^`${\\]+/))
+            encoder.text_token match, :content
+          elsif !heredoc_delimiter && match = scan(delimiter == '"' ? /"/ : /`/)
+            encoder.text_token match, :delimiter
+            encoder.end_group :string
             delimiter = nil
             label_expected = false
             states.pop
-            next
           elsif heredoc_delimiter && match = scan(/\n/)
-            kind = :content
             if scan heredoc_delimiter
-              tokens << ["\n", :content]
-              tokens << [matched, :delimiter]
-              tokens << [:close, :string]
+              encoder.text_token "\n", :content
+              encoder.text_token matched, :delimiter
+              encoder.end_group :string
               heredoc_delimiter = nil
               label_expected = false
               states.pop
-              next
+            else
+              encoder.text_token match, :content
             end
-          elsif scan(/\\(?:x[0-9A-Fa-f]{1,2}|[0-7]{1,3})/)
-            kind = :char
-          elsif scan(heredoc_delimiter ? /\\[nrtvf\\$]/ : (delimiter == '"' ? /\\[nrtvf\\$"]/ : /\\[nrtvf\\$`]/))
-            kind = :char
-          elsif scan(/\\./m)
-            kind = :content
-          elsif scan(/\\/)
-            kind = :error
+          elsif match = scan(/\\(?:x[0-9A-Fa-f]{1,2}|[0-7]{1,3})/)
+            encoder.text_token match, :char
+          elsif match = scan(heredoc_delimiter ? /\\[nrtvf\\$]/ : (delimiter == '"' ? /\\[nrtvf\\$"]/ : /\\[nrtvf\\$`]/))
+            encoder.text_token match, :char
+          elsif match = scan(/\\./m)
+            encoder.text_token match, :content
+          elsif match = scan(/\\/)
+            encoder.text_token match, :error
           elsif match = scan(/#{RE::VARIABLE}/o)
-            kind = :local_variable
             if check(/\[#{RE::IDENTIFIER}\]/o)
-              tokens << [:open, :inline]
-              tokens << [match, :local_variable]
-              tokens << [scan(/\[/), :operator]
-              tokens << [scan(/#{RE::IDENTIFIER}/o), :ident]
-              tokens << [scan(/\]/), :operator]
-              tokens << [:close, :inline]
-              next
+              encoder.begin_group :inline
+              encoder.text_token match, :local_variable
+              encoder.text_token scan(/\[/), :operator
+              encoder.text_token scan(/#{RE::IDENTIFIER}/o), :ident
+              encoder.text_token scan(/\]/), :operator
+              encoder.end_group :inline
             elsif check(/\[/)
               match << scan(/\[['"]?#{RE::IDENTIFIER}?['"]?\]?/o)
-              kind = :error
+              encoder.text_token match, :error
             elsif check(/->#{RE::IDENTIFIER}/o)
-              tokens << [:open, :inline]
-              tokens << [match, :local_variable]
-              tokens << [scan(/->/), :operator]
-              tokens << [scan(/#{RE::IDENTIFIER}/o), :ident]
-              tokens << [:close, :inline]
-              next
+              encoder.begin_group :inline
+              encoder.text_token match, :local_variable
+              encoder.text_token scan(/->/), :operator
+              encoder.text_token scan(/#{RE::IDENTIFIER}/o), :ident
+              encoder.end_group :inline
             elsif check(/->/)
               match << scan(/->/)
-              kind = :error
+              encoder.text_token match, :error
+            else
+              encoder.text_token match, :local_variable
             end
           elsif match = scan(/\{/)
             if check(/\$/)
-              kind = :delimiter
+              encoder.begin_group :inline
               states[-1] = [states.last, delimiter]
               delimiter = nil
               states.push :php
-              tokens << [:open, :inline]
+              encoder.text_token match, :delimiter
             else
-              kind = :string
+              encoder.text_token match, :content
             end
-          elsif scan(/\$\{#{RE::IDENTIFIER}\}/o)
-            kind = :local_variable
-          elsif scan(/\$/)
-            kind = :content
+          elsif match = scan(/\$\{#{RE::IDENTIFIER}\}/o)
+            encoder.text_token match, :local_variable
+          elsif match = scan(/\$/)
+            encoder.text_token match, :content
+          else
+            states.pop
           end
         
         when :class_expected
-          if scan(/\s+/)
-            kind = :space
+          if match = scan(/\s+/)
+            encoder.text_token match, :space
           elsif match = scan(/#{RE::IDENTIFIER}/o)
-            kind = :class
+            encoder.text_token match, :class
             states.pop
           else
             states.pop
-            next
           end
         
         when :function_expected
-          if scan(/\s+/)
-            kind = :space
-          elsif scan(/&/)
-            kind = :operator
+          if match = scan(/\s+/)
+            encoder.text_token match, :space
+          elsif match = scan(/&/)
+            encoder.text_token match, :operator
           elsif match = scan(/#{RE::IDENTIFIER}/o)
-            kind = :function
+            encoder.text_token match, :function
             states.pop
           else
             states.pop
-            next
           end
         
         else
-          raise_inspect 'Unknown state!', tokens, states
+          raise_inspect 'Unknown state!', encoder, states
         end
         
-        match ||= matched
-        if $CODERAY_DEBUG and not kind
-          raise_inspect 'Error token %p in line %d' %
-            [[match, kind], line], tokens, states
-        end
-        raise_inspect 'Empty token', tokens, states unless match
-        
-        tokens << [match, kind]
-        
       end
       
-      tokens
+      encoder
     end
     
   end
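
Note on the PHP start heuristic above: the hunk replaces exist?(RE::PHP_START), which searches the entire remaining input, with check over a pattern that allows at most 1000 leading characters. The sketch below illustrates that bounded lookahead using Ruby's StringScanner. PHP_START here is a simplified placeholder and php_start_within? is a hypothetical helper; neither is the scanner's actual definition.

```ruby
require 'strscan'

# Simplified placeholder for the real RE::PHP_START pattern.
PHP_START = /<\?(?:php)?/

# Hypothetical helper: true if a PHP start tag begins within the first
# `limit` characters after the scan position. `check` is anchored at the
# current position, so the `.{0,limit}` prefix bounds how far it may look.
def php_start_within?(source, limit)
  scanner = StringScanner.new(source)
  !scanner.check(/.{0,#{limit}}#{PHP_START}/m).nil?
end

# Unbounded variant for comparison: `exist?` searches to the end of input.
def php_start_anywhere?(source)
  !StringScanner.new(source).exist?(PHP_START).nil?
end
```

For input with a start tag only after 200 filler characters, php_start_within?(input, 100) is false while php_start_anywhere?(input) is true. The bounded form trades a possible misclassification of such input for a fixed upper limit on how much text the heuristic inspects.
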
diff --git a/lib/coderay/scanners/plaintext.rb b/lib/coderay/scanners/plaintext.rb
deleted file mode 100644
index 521f3ac..0000000
--- a/lib/coderay/scanners/plaintext.rb
+++ /dev/null
@@ -1,20 +0,0 @@
-module CodeRay
-module Scanners
-
-  class Plaintext < Scanner
-
-    register_for :plaintext, :plain
-    title 'Plain text'
-    
-    include Streamable
-    
-    KINDS_NOT_LOC = [:plain]
-    
-    def scan_tokens tokens, options
-      tokens << [scan_rest, :plain]
-    end
-
-  end
-
-end
-end
diff --git a/lib/coderay/scanners/python.rb b/lib/coderay/scanners/python.rb
index 6d0fc34..5e38a2c 100644
--- a/lib/coderay/scanners/python.rb
+++ b/lib/coderay/scanners/python.rb
@@ -1,12 +1,12 @@
 module CodeRay
 module Scanners
   
-  # Bases on pygments' PythonLexer, see
+  # Scanner for Python. Supports Python 3.
+  # 
+  # Based on pygments' PythonLexer, see
   # http://dev.pocoo.org/projects/pygments/browser/pygments/lexers/agile.py.
   class Python < Scanner
     
-    include Streamable
-    
     register_for :python
     file_extension 'py'
     
@@ -16,11 +16,11 @@ module Scanners
       'from', 'global', 'if', 'import', 'in', 'is', 'lambda', 'not',
       'or', 'pass', 'raise', 'return', 'try', 'while', 'with', 'yield',
       'nonlocal',  # new in Python 3
-    ]
+    ]  # :nodoc:
     
     OLD_KEYWORDS = [
       'exec', 'print',  # gone in Python 3
-    ]
+    ]  # :nodoc:
     
     PREDEFINED_METHODS_AND_TYPES = %w[
       __import__ abs all any apply basestring bin bool buffer
@@ -32,7 +32,7 @@ module Scanners
       raw_input reduce reload repr reversed round set setattr slice
       sorted staticmethod str sum super tuple type unichr unicode
       vars xrange zip
-    ]
+    ]  # :nodoc:
     
     PREDEFINED_EXCEPTIONS = %w[
       ArithmeticError AssertionError AttributeError
@@ -47,23 +47,23 @@ module Scanners
       TypeError UnboundLocalError UnicodeDecodeError
       UnicodeEncodeError UnicodeError UnicodeTranslateError
       UnicodeWarning UserWarning ValueError Warning ZeroDivisionError
-    ]
+    ]  # :nodoc:
     
     PREDEFINED_VARIABLES_AND_CONSTANTS = [
-      'False', 'True', 'None', # "keywords" since Python 3
+      'False', 'True', 'None',  # "keywords" since Python 3
       'self', 'Ellipsis', 'NotImplemented',
-    ]
+    ]  # :nodoc:
     
     IDENT_KIND = WordList.new(:ident).
       add(KEYWORDS, :keyword).
       add(OLD_KEYWORDS, :old_keyword).
       add(PREDEFINED_METHODS_AND_TYPES, :predefined).
-      add(PREDEFINED_VARIABLES_AND_CONSTANTS, :pre_constant).
-      add(PREDEFINED_EXCEPTIONS, :exception)
+      add(PREDEFINED_VARIABLES_AND_CONSTANTS, :predefined_constant).
+      add(PREDEFINED_EXCEPTIONS, :exception)  # :nodoc:
     
-    NAME = / [^\W\d] \w* /x
-    ESCAPE = / [abfnrtv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x
-    UNICODE_ESCAPE =  / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} | N\{[-\w ]+\} /x
+    NAME = / [^\W\d] \w* /x  # :nodoc:
+    ESCAPE = / [abfnrtv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x  # :nodoc:
+    UNICODE_ESCAPE =  / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} | N\{[-\w ]+\} /x  # :nodoc:
     
     OPERATOR = /
       \.\.\. |          # ellipsis
@@ -73,95 +73,103 @@ module Scanners
       [-+*\/%&|^]=? |   # ordinary math and binary logic
       [~`] |            # binary complement and inspection
       <<=? | >>=? | [<>=]=? | !=  # comparison and assignment
-    /x
+    /x  # :nodoc:
     
-    STRING_DELIMITER_REGEXP = Hash.new do |h, delimiter|
-      h[delimiter] = Regexp.union delimiter
-    end
+    STRING_DELIMITER_REGEXP = Hash.new { |h, delimiter|
+      h[delimiter] = Regexp.union delimiter  # :nodoc:
+    }
     
-    STRING_CONTENT_REGEXP = Hash.new do |h, delimiter|
-      h[delimiter] = / [^\\\n]+? (?= \\ | $ | #{Regexp.escape(delimiter)} ) /x
-    end
+    STRING_CONTENT_REGEXP = Hash.new { |h, delimiter|
+      h[delimiter] = / [^\\\n]+? (?= \\ | $ | #{Regexp.escape(delimiter)} ) /x  # :nodoc:
+    }
     
     DEF_NEW_STATE = WordList.new(:initial).
       add(%w(def), :def_expected).
       add(%w(import from), :include_expected).
-      add(%w(class), :class_expected)
+      add(%w(class), :class_expected)  # :nodoc:
     
     DESCRIPTOR = /
       #{NAME}
       (?: \. #{NAME} )*
       | \*
-    /x
+    /x  # :nodoc:
+    
+    DOCSTRING_COMING = /
+      [ \t]* u?r? ("""|''')
+    /x  # :nodoc:
     
-    def scan_tokens tokens, options
+  protected
+    
+    def scan_tokens encoder, options
       
       state = :initial
       string_delimiter = nil
       string_raw = false
+      string_type = nil
+      docstring_coming = match?(/#{DOCSTRING_COMING}/o)
       last_token_dot = false
       unicode = string.respond_to?(:encoding) && string.encoding.name == 'UTF-8'
       from_import_state = []
       
       until eos?
         
-        kind = nil
-        match = nil
-        
         if state == :string
-          if scan(STRING_DELIMITER_REGEXP[string_delimiter])
-            tokens << [matched, :delimiter]
-            tokens << [:close, :string]
+          if match = scan(STRING_DELIMITER_REGEXP[string_delimiter])
+            encoder.text_token match, :delimiter
+            encoder.end_group string_type
+            string_type = nil
             state = :initial
             next
-          elsif string_delimiter.size == 3 && scan(/\n/)
-            kind = :content
-          elsif scan(STRING_CONTENT_REGEXP[string_delimiter])
-            kind = :content
-          elsif !string_raw && scan(/ \\ #{ESCAPE} /ox)
-            kind = :char
-          elsif scan(/ \\ #{UNICODE_ESCAPE} /ox)
-            kind = :char
-          elsif scan(/ \\ . /x)
-            kind = :content
-          elsif scan(/ \\ | $ /x)
-            tokens << [:close, :string]
-            kind = :error
+          elsif string_delimiter.size == 3 && match = scan(/\n/)
+            encoder.text_token match, :content
+          elsif match = scan(STRING_CONTENT_REGEXP[string_delimiter])
+            encoder.text_token match, :content
+          elsif !string_raw && match = scan(/ \\ #{ESCAPE} /ox)
+            encoder.text_token match, :char
+          elsif match = scan(/ \\ #{UNICODE_ESCAPE} /ox)
+            encoder.text_token match, :char
+          elsif match = scan(/ \\ . /x)
+            encoder.text_token match, :content
+          elsif match = scan(/ \\ | $ /x)
+            encoder.end_group string_type
+            string_type = nil
+            encoder.text_token match, :error
             state = :initial
           else
-            raise_inspect "else case \" reached; %p not handled." % peek(1), tokens, state
+            raise_inspect "else case \" reached; %p not handled." % peek(1), encoder, state
           end
         
-        elsif match = scan(/ [ \t]+ | \\\n /x)
-          tokens << [match, :space]
-          next
-        
-        elsif match = scan(/\n/)
-          tokens << [match, :space]
-          state = :initial if state == :include_expected
+        elsif match = scan(/ [ \t]+ | \\?\n /x)
+          encoder.text_token match, :space
+          if match == "\n"
+            state = :initial if state == :include_expected
+            docstring_coming = true if match?(/#{DOCSTRING_COMING}/o)
+          end
           next
         
         elsif match = scan(/ \# [^\n]* /mx)
-          tokens << [match, :comment]
+          encoder.text_token match, :comment
           next
         
         elsif state == :initial
           
-          if scan(/#{OPERATOR}/o)
-            kind = :operator
+          if match = scan(/#{OPERATOR}/o)
+            encoder.text_token match, :operator
           
           elsif match = scan(/(u?r?|b)?("""|"|'''|')/i)
-            tokens << [:open, :string]
             string_delimiter = self[2]
+            string_type = docstring_coming ? :docstring : :string
+            docstring_coming = false if docstring_coming
+            encoder.begin_group string_type
             string_raw = false
             modifiers = self[1]
             unless modifiers.empty?
               string_raw = !!modifiers.index(?r)
-              tokens << [modifiers, :modifier]
+              encoder.text_token modifiers, :modifier
               match = string_delimiter
             end
             state = :string
-            kind = :delimiter
+            encoder.text_token match, :delimiter
           
           # TODO: backticks
           
@@ -177,43 +185,45 @@ module Scanners
               state = DEF_NEW_STATE[match]
               from_import_state << match.to_sym if state == :include_expected
             end
+            encoder.text_token match, kind
           
-          elsif scan(/@[a-zA-Z0-9_.]+[lL]?/)
-            kind = :decorator
+          elsif match = scan(/@[a-zA-Z0-9_.]+[lL]?/)
+            encoder.text_token match, :decorator
           
-          elsif scan(/0[xX][0-9A-Fa-f]+[lL]?/)
-            kind = :hex
+          elsif match = scan(/0[xX][0-9A-Fa-f]+[lL]?/)
+            encoder.text_token match, :hex
           
-          elsif scan(/0[bB][01]+[lL]?/)
-            kind = :bin
+          elsif match = scan(/0[bB][01]+[lL]?/)
+            encoder.text_token match, :binary
           
           elsif match = scan(/(?:\d*\.\d+|\d+\.\d*)(?:[eE][+-]?\d+)?|\d+[eE][+-]?\d+/)
-            kind = :float
             if scan(/[jJ]/)
               match << matched
-              kind = :imaginary
+              encoder.text_token match, :imaginary
+            else
+              encoder.text_token match, :float
             end
           
-          elsif scan(/0[oO][0-7]+|0[0-7]+(?![89.eE])[lL]?/)
-            kind = :oct
+          elsif match = scan(/0[oO][0-7]+|0[0-7]+(?![89.eE])[lL]?/)
+            encoder.text_token match, :octal
           
           elsif match = scan(/\d+([lL])?/)
-            kind = :integer
             if self[1] == nil && scan(/[jJ]/)
               match << matched
-              kind = :imaginary
+              encoder.text_token match, :imaginary
+            else
+              encoder.text_token match, :integer
             end
           
           else
-            getch
-            kind = :error
+            encoder.text_token getch, :error
           
           end
             
         elsif state == :def_expected
           state = :initial
           if match = scan(unicode ? /#{NAME}/uo : /#{NAME}/o)
-            kind = :method
+            encoder.text_token match, :method
           else
             next
           end
@@ -221,33 +231,34 @@ module Scanners
         elsif state == :class_expected
           state = :initial
           if match = scan(unicode ? /#{NAME}/uo : /#{NAME}/o)
-            kind = :class
+            encoder.text_token match, :class
           else
             next
           end
           
         elsif state == :include_expected
           if match = scan(unicode ? /#{DESCRIPTOR}/uo : /#{DESCRIPTOR}/o)
-            kind = :include
             if match == 'as'
-              kind = :keyword
+              encoder.text_token match, :keyword
               from_import_state << :as
             elsif from_import_state.first == :from && match == 'import'
-              kind = :keyword
+              encoder.text_token match, :keyword
               from_import_state << :import
             elsif from_import_state.last == :as
-              # kind = match[0,1][unicode ? /[[:upper:]]/u : /[[:upper:]]/] ? :class : :method
-              kind = :ident
+              # encoder.text_token match, match[0,1][unicode ? /[[:upper:]]/u : /[[:upper:]]/] ? :class : :method
+              encoder.text_token match, :ident
               from_import_state.pop
             elsif IDENT_KIND[match] == :keyword
               unscan
               match = nil
               state = :initial
               next
+            else
+              encoder.text_token match, :include
             end
           elsif match = scan(/,/)
             from_import_state.pop if from_import_state.last == :as
-            kind = :operator
+            encoder.text_token match, :operator
           else
             from_import_state = []
             state = :initial
@@ -255,28 +266,19 @@ module Scanners
           end
           
         else
-          raise_inspect 'Unknown state', tokens, state
+          raise_inspect 'Unknown state', encoder, state
           
         end
         
-        match ||= matched
-        if $CODERAY_DEBUG and not kind
-          raise_inspect 'Error token %p in line %d' %
-            [[match, kind], line], tokens, state
-        end
-        raise_inspect 'Empty token', tokens, state unless match
-        
         last_token_dot = match == '.'
         
-        tokens << [match, kind]
-        
       end
       
       if state == :string
-        tokens << [:close, :string]
+        encoder.end_group string_type
       end
       
-      tokens
+      encoder
     end
     
   end
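
The python.rb hunks above replace appends to a token array (`tokens << [match, kind]`, `tokens << [:open, :string]`) with calls on an encoder (`text_token`, `begin_group`, `end_group`), and add a `:docstring` group type. The following is a minimal sketch of that calling convention, not CodeRay's own code. `RecordingEncoder` and `scan_docstring` are hypothetical names introduced for illustration, and the docstring check is simplified (no `u`/`r` modifiers).

```ruby
require 'strscan'

# Hypothetical stand-in for an encoder (not part of CodeRay):
# it only records the calls it receives, in order.
class RecordingEncoder
  attr_reader :events

  def initialize
    @events = []
  end

  def text_token(text, kind)
    @events << [:text, text, kind]
  end

  def begin_group(kind)
    @events << [:begin, kind]
  end

  def end_group(kind)
    @events << [:end, kind]
  end
end

# Toy scanner: handles a single triple-quoted literal and treats it as a
# docstring when it is the first thing in the input (simplified assumption).
def scan_docstring(source, encoder)
  s = StringScanner.new(source)
  docstring_coming = !s.match?(/[ \t]*("""|''')/).nil?
  if (space = s.scan(/[ \t]+/))
    encoder.text_token space, :space
  end
  if (delimiter = s.scan(/"""|'''/))
    type = docstring_coming ? :docstring : :string
    encoder.begin_group type
    encoder.text_token delimiter, :delimiter
    closing = Regexp.union(delimiter)
    # Take everything up to the closing delimiter, or the rest of the input.
    content = s.scan_until(/(?=#{closing})/) || s.scan(/.*/m)
    encoder.text_token content, :content unless content.empty?
    if (close = s.scan(closing))
      encoder.text_token close, :delimiter
    end
    encoder.end_group type
  end
  encoder
end

encoder = scan_docstring(%q("""Hello."""), RecordingEncoder.new)
encoder.events.each { |event| p event }
```

In this sketch the group is opened before the delimiter token and closed after the closing delimiter, in the same order as the `begin_group` / `text_token` / `end_group` calls in the diff.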
diff --git a/lib/coderay/scanners/raydebug.rb b/lib/coderay/scanners/raydebug.rb
new file mode 100644
index 0000000..7a21354
--- /dev/null
+++ b/lib/coderay/scanners/raydebug.rb
@@ -0,0 +1,66 @@
+module CodeRay
+module Scanners
+
+  # = Debug Scanner
+  # 
+  # Parses the output of the Encoders::Debug encoder.
+  class Raydebug < Scanner
+
+    register_for :raydebug
+    file_extension 'raydebug'
+    title 'CodeRay Token Dump'
+    
+  protected
+    
+    def scan_tokens encoder, options
+
+      opened_tokens = []
+
+      until eos?
+
+        if match = scan(/\s+/)
+          encoder.text_token match, :space
+          
+        elsif match = scan(/ (\w+) \( ( [^\)\\]* ( \\. [^\)\\]* )* ) /x)
+          kind = self[1]
+          encoder.text_token kind, :class
+          encoder.text_token '(', :operator
+          match = self[2]
+          encoder.text_token match, kind.to_sym
+          encoder.text_token match, :operator if match = scan(/\)/)
+          
+        elsif match = scan(/ (\w+) ([<\[]) /x)
+          kind = self[1]
+          case self[2]
+          when '<'
+            encoder.text_token kind, :class
+          when '['
+            encoder.text_token kind, :class
+          else
+            raise 'CodeRay bug: This case should not be reached.'
+          end
+          kind = kind.to_sym
+          opened_tokens << kind
+          encoder.begin_group kind
+          encoder.text_token self[2], :operator
+          
+        elsif !opened_tokens.empty? && match = scan(/ [>\]] /x)
+          encoder.text_token match, :operator
+          encoder.end_group opened_tokens.pop
+          
+        else
+          encoder.text_token getch, :space
+          
+        end
+        
+      end
+      
+      encoder.end_group opened_tokens.pop until opened_tokens.empty?
+      
+      encoder
+    end
+
+  end
+
+end
+end
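
The first token pattern in the new Raydebug scanner matches a kind name followed by parenthesised text, and allows backslash-escaped characters inside the parentheses. The sketch below shows how that regular expression splits its input, using only Ruby's standard `StringScanner`. The helper `split_debug_token` and the sample inputs are made up for illustration and are not part of CodeRay.

```ruby
require 'strscan'

# Same pattern as in the scanner above: group 1 is the kind, group 2 the text.
TOKEN = / (\w+) \( ( [^\)\\]* ( \\. [^\)\\]* )* ) /x

# Hypothetical helper: returns [kind, text, closed] or nil if nothing matches.
def split_debug_token(input)
  s = StringScanner.new(input)
  return nil unless s.scan(TOKEN)
  kind = s[1]
  text = s[2]
  # The closing parenthesis is scanned separately, as in the scanner,
  # so unterminated input still yields a kind and its text.
  closed = !s.scan(/\)/).nil?
  [kind, text, closed]
end

p split_debug_token('keyword(def)')
p split_debug_token('content(a\)b)')
p split_debug_token('space( ')
```

Because the closing `)` is optional, a truncated token dump is still split into a kind and its text rather than rejected.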
diff --git a/lib/coderay/scanners/rhtml.rb b/lib/coderay/scanners/rhtml.rb
deleted file mode 100644
index ce51ec6..0000000
--- a/lib/coderay/scanners/rhtml.rb
+++ /dev/null
@@ -1,78 +0,0 @@
-module CodeRay
-module Scanners
-
-  load :html
-  load :ruby
-
-  # RHTML Scanner
-  class RHTML < Scanner
-
-    include Streamable
-    register_for :rhtml
-    title 'HTML ERB Template'
-    
-    KINDS_NOT_LOC = HTML::KINDS_NOT_LOC
-
-    ERB_RUBY_BLOCK = /
-      <%(?!%)[=-]?
-      (?>
-        [^\-%]*    # normal*
-        (?>        # special
-          (?: %(?!>) | -(?!%>) )
-          [^\-%]*  # normal*
-        )*
-      )
-      (?: -?%> )?
-    /x
-
-    START_OF_ERB = /
-      <%(?!%)
-    /x
-
-  private
-
-    def setup
-      @ruby_scanner = CodeRay.scanner :ruby, :tokens => @tokens, :keep_tokens => true
-      @html_scanner = CodeRay.scanner :html, :tokens => @tokens, :keep_tokens => true, :keep_state => true
-    end
-
-    def reset_instance
-      super
-      @html_scanner.reset
-    end
-
-    def scan_tokens tokens, options
-
-      until eos?
-
-        if (match = scan_until(/(?=#{START_OF_ERB})/o) || scan_rest) and not match.empty?
-          @html_scanner.tokenize match
-
-        elsif match = scan(/#{ERB_RUBY_BLOCK}/o)
-          start_tag = match[/\A<%[-=#]?/]
-          end_tag = match[/-?%?>?\z/]
-          tokens << [:open, :inline]
-          tokens << [start_tag, :inline_delimiter]
-          code = match[start_tag.size .. -1 - end_tag.size]
-          if start_tag == '<%#'
-            tokens << [code, :comment]
-          else
-            @ruby_scanner.tokenize code
-          end
-          tokens << [end_tag, :inline_delimiter] unless end_tag.empty?
-          tokens << [:close, :inline]
-
-        else
-          raise_inspect 'else-case reached!', tokens
-        end
-
-      end
-
-      tokens
-
-    end
-
-  end
-
-end
-end
diff --git a/lib/coderay/scanners/ruby.rb b/lib/coderay/scanners/ruby.rb
index 5feaf9d..2be98a6 100644
--- a/lib/coderay/scanners/ruby.rb
+++ b/lib/coderay/scanners/ruby.rb
@@ -1,7 +1,6 @@
-# encoding: utf-8
 module CodeRay
 module Scanners
-
+  
   # This scanner is really complex, since Ruby _is_ a complex language!
   #
   # It tries to highlight 100% of all common code,
@@ -9,310 +8,240 @@ module Scanners
   #
   # It is optimized for HTML highlighting, and is not very useful for
   # parsing or pretty printing.
-  #
-  # For now, I think it's better than the scanners in VIM or Syntax, or
-  # any highlighter I was able to find, except Caleb's RubyLexer.
-  #
-  # I hope it's also better than the rdoc/irb lexer.
   class Ruby < Scanner
-
-    include Streamable
-
+    
     register_for :ruby
     file_extension 'rb'
-
-    helper :patterns
     
-    if not defined? EncodingError
-      EncodingError = Class.new Exception
+    autoload :Patterns,    CodeRay.coderay_path('scanners', 'ruby', 'patterns')
+    autoload :StringState, CodeRay.coderay_path('scanners', 'ruby', 'string_state')
+    
+    def interpreted_string_state
+      StringState.new :string, true, '"'
     end
-
-  private
-    def scan_tokens tokens, options
-      if string.respond_to?(:encoding)
-        unless string.encoding == Encoding::UTF_8
-          self.string = string.encode Encoding::UTF_8,
-            :invalid => :replace, :undef => :replace, :replace => '?'
-        end
-        unicode = false
-      else
-        unicode = exist?(/[^\x00-\x7f]/)
+    
+  protected
+    
+    def setup
+      @state = :initial
+    end
+    
+    def scan_tokens encoder, options
+      state, heredocs = options[:state] || @state
+      heredocs = heredocs.dup if heredocs.is_a?(Array)
+      
+      if state && state.instance_of?(StringState)
+        encoder.begin_group state.type
       end
       
-      last_token_dot = false
-      value_expected = true
-      heredocs = nil
       last_state = nil
-      state = :initial
-      depth = nil
-      inline_block_stack = []
       
+      method_call_expected = false
+      value_expected = true
+      
+      inline_block_stack = nil
+      inline_block_curly_depth = 0
+      
+      if heredocs
+        state = heredocs.shift
+        encoder.begin_group state.type
+        heredocs = nil if heredocs.empty?
+      end
+      
+      # def_object_stack = nil
+      # def_object_paren_depth = 0
       
       patterns = Patterns  # avoid constant lookup
       
+      unicode = string.respond_to?(:encoding) && string.encoding.name == 'UTF-8'
+      
       until eos?
-        match = nil
-        kind = nil
-
-        if state.instance_of? patterns::StringState
-# {{{
-          match = scan_until(state.pattern) || scan_rest
-          tokens << [match, :content] unless match.empty?
-          break if eos?
-
-          if state.heredoc and self[1]  # end of heredoc
-            match = getch.to_s
-            match << scan_until(/$/) unless eos?
-            tokens << [match, :delimiter]
-            tokens << [:close, state.type]
-            state = state.next_state
-            next
-          end
-
-          case match = getch
-
-          when state.delim
-            if state.paren
-              state.paren_depth -= 1
-              if state.paren_depth > 0
-                tokens << [match, :nesting_delimiter]
-                next
-              end
-            end
-            tokens << [match, :delimiter]
-            if state.type == :regexp and not eos?
-              modifiers = scan(/#{patterns::REGEXP_MODIFIERS}/ox)
-              tokens << [modifiers, :modifier] unless modifiers.empty?
-            end
-            tokens << [:close, state.type]
-            value_expected = false
-            state = state.next_state
-
-          when '\\'
-            if state.interpreted
-              if esc = scan(/ #{patterns::ESCAPE} /ox)
-                tokens << [match + esc, :char]
-              else
-                tokens << [match, :error]
-              end
+        
+        if state.instance_of? ::Symbol
+          
+          if match = scan(/[ \t\f\v]+/)
+            encoder.text_token match, :space
+            
+          elsif match = scan(/\n/)
+            if heredocs
+              unscan  # heredoc scanning needs \n at start
+              state = heredocs.shift
+              encoder.begin_group state.type
+              heredocs = nil if heredocs.empty?
             else
-              case m = getch
-              when state.delim, '\\'
-                tokens << [match + m, :char]
-              when nil
-                tokens << [match, :error]
-              else
-                tokens << [match + m, :content]
-              end
-            end
-
-          when '#'
-            case peek(1)
-            when '{'
-              inline_block_stack << [state, depth, heredocs]
+              state = :initial if state == :undef_comma_expected
+              encoder.text_token match, :space
               value_expected = true
-              state = :initial
-              depth = 1
-              tokens << [:open, :inline]
-              tokens << [match + getch, :inline_delimiter]
-            when '$', '@'
-              tokens << [match, :escape]
-              last_state = state  # scan one token as normal code, then return here
-              state = :initial
-            else
-              raise_inspect 'else-case # reached; #%p not handled' % peek(1), tokens
             end
-
-          when state.paren
-            state.paren_depth += 1
-            tokens << [match, :nesting_delimiter]
-
-          when /#{patterns::REGEXP_SYMBOLS}/ox
-            tokens << [match, :function]
-
-          else
-            raise_inspect 'else-case " reached; %p not handled, state = %p' % [match, state], tokens
-
-          end
-          next
-# }}}
-        else
-# {{{
-          if match = scan(/[ \t\f]+/)
-            kind = :space
-            match << scan(/\s*/) unless eos? || heredocs
-            value_expected = true if match.index(?\n)
-            tokens << [match, kind]
-            next
             
-          elsif match = scan(/\\?\n/)
-            kind = :space
-            if match == "\n"
-              value_expected = true
-              state = :initial if state == :undef_comma_expected
-            end
+          elsif match = scan(bol? ? / \#(!)?.* | #{patterns::RUBYDOC_OR_DATA} /ox : /\#.*/)
+            encoder.text_token match, self[1] ? :doctype : :comment
+            
+          elsif match = scan(/\\\n/)
             if heredocs
               unscan  # heredoc scanning needs \n at start
+              encoder.text_token scan(/\\/), :space
               state = heredocs.shift
-              tokens << [:open, state.type]
+              encoder.begin_group state.type
               heredocs = nil if heredocs.empty?
-              next
             else
-              match << scan(/\s*/) unless eos?
+              encoder.text_token match, :space
             end
-            tokens << [match, kind]
-            next
-          
-          elsif bol? && match = scan(/\#!.*/)
-            tokens << [match, :doctype]
-            next
             
-          elsif match = scan(/\#.*/) or
-            ( bol? and match = scan(/#{patterns::RUBYDOC_OR_DATA}/o) )
-              kind = :comment
-              tokens << [match, kind]
-              next
-
           elsif state == :initial
-
+            
             # IDENTS #
-            if match = scan(unicode ? /#{patterns::METHOD_NAME}/uo :
+            if !method_call_expected &&
+               match = scan(unicode ? /#{patterns::METHOD_NAME}/uo :
                                       /#{patterns::METHOD_NAME}/o)
-              if last_token_dot
-                kind = if match[/^[A-Z]/] and not match?(/\(/) then :constant else :ident end
-              else
-                if value_expected != :expect_colon && scan(/:(?= )/)
-                  tokens << [match, :key]
-                  match = ':'
-                  kind = :operator
-                else
-                  kind = patterns::IDENT_KIND[match]
-                  if kind == :ident
-                    if match[/\A[A-Z]/] and not match[/[!?]$/] and not match?(/\(/)
-                      kind = :constant
-                    end
-                  elsif kind == :reserved
-                    state = patterns::DEF_NEW_STATE[match]
-                    value_expected = :set if patterns::KEYWORDS_EXPECTING_VALUE[match]
-                  end
+              value_expected = false
+              kind = patterns::IDENT_KIND[match]
+              if kind == :ident
+                if match[/\A[A-Z]/] && !(match[/[!?]$/] || match?(/\(/))
+                  kind = :constant
                 end
+              elsif kind == :keyword
+                state = patterns::KEYWORD_NEW_STATE[match]
+                value_expected = true if patterns::KEYWORDS_EXPECTING_VALUE[match]
               end
-              value_expected = :set if check(/#{patterns::VALUE_FOLLOWS}/o)
-            
-            elsif last_token_dot and match = scan(/#{patterns::METHOD_NAME_OPERATOR}|\(/o)
-              kind = :ident
-              value_expected = :set if check(unicode ? /#{patterns::VALUE_FOLLOWS}/uo :
-                                                       /#{patterns::VALUE_FOLLOWS}/o)
-
-            # OPERATORS #
-            elsif not last_token_dot and match = scan(/ \.\.\.? | (?:\.|::)() | [,\(\)\[\]\{\}] | ==?=? /x)
-              if match !~ / [.\)\]\}] /x or match =~ /\.\.\.?/
-                value_expected = :set
+              value_expected = true if !value_expected && check(/#{patterns::VALUE_FOLLOWS}/o)
+              encoder.text_token match, kind
+              
+            elsif method_call_expected &&
+               match = scan(unicode ? /#{patterns::METHOD_AFTER_DOT}/uo :
+                                      /#{patterns::METHOD_AFTER_DOT}/o)
+              if method_call_expected == '::' && match[/\A[A-Z]/] && !match?(/\(/)
+                encoder.text_token match, :constant
+              else
+                encoder.text_token match, :ident
               end
-              last_token_dot = :set if self[1]
-              kind = :operator
-              unless inline_block_stack.empty?
+              method_call_expected = false
+              value_expected = check(/#{patterns::VALUE_FOLLOWS}/o)
+              
+            # OPERATORS #
+            elsif !method_call_expected && match = scan(/ (\.(?!\.)|::) | (?: \.\.\.? | ==?=? | [,\(\[\{] )() | [\)\]\}] /x)
+              method_call_expected = self[1]
+              value_expected = !method_call_expected && self[2]
+              if inline_block_stack
                 case match
                 when '{'
-                  depth += 1
+                  inline_block_curly_depth += 1
                 when '}'
-                  depth -= 1
-                  if depth == 0  # closing brace of inline block reached
-                    state, depth, heredocs = inline_block_stack.pop
+                  inline_block_curly_depth -= 1
+                  if inline_block_curly_depth == 0  # closing brace of inline block reached
+                    state, inline_block_curly_depth, heredocs = inline_block_stack.pop
+                    inline_block_stack = nil if inline_block_stack.empty?
                     heredocs = nil if heredocs && heredocs.empty?
-                    tokens << [match, :inline_delimiter]
-                    kind = :inline
-                    match = :close
+                    encoder.text_token match, :inline_delimiter
+                    encoder.end_group :inline
+                    next
                   end
                 end
               end
-
-            elsif match = scan(/ ['"] /mx)
-              tokens << [:open, :string]
-              kind = :delimiter
-              state = patterns::StringState.new :string, match == '"', match  # important for streaming
-
-            elsif match = scan(unicode ? /#{patterns::INSTANCE_VARIABLE}/uo :
-                                         /#{patterns::INSTANCE_VARIABLE}/o)
-              kind = :instance_variable
-
-            elsif value_expected and match = scan(/\//)
-              tokens << [:open, :regexp]
-              kind = :delimiter
-              interpreted = true
-              state = patterns::StringState.new :regexp, interpreted, match
-
-            # elsif match = scan(/[-+]?#{patterns::NUMERIC}/o)
-            elsif match = value_expected ? scan(/[-+]?#{patterns::NUMERIC}/o) : scan(/#{patterns::NUMERIC}/o)
-              kind = self[1] ? :float : :integer
-
+              encoder.text_token match, :operator
+              
             elsif match = scan(unicode ? /#{patterns::SYMBOL}/uo :
                                          /#{patterns::SYMBOL}/o)
               case delim = match[1]
               when ?', ?"
-                tokens << [:open, :symbol]
-                tokens << [':', :symbol]
+                encoder.begin_group :symbol
+                encoder.text_token ':', :symbol
                 match = delim.chr
-                kind = :delimiter
-                state = patterns::StringState.new :symbol, delim == ?", match
+                encoder.text_token match, :delimiter
+                state = self.class::StringState.new :symbol, delim == ?", match
+              else
+                encoder.text_token match, :symbol
+                value_expected = false
+              end
+              
+            elsif match = scan(/ ' (?:(?>[^'\\]*) ')? | " (?:(?>[^"\\\#]*) ")? /mx)
+              encoder.begin_group :string
+              if match.size == 1
+                encoder.text_token match, :delimiter
+                state = self.class::StringState.new :string, match == '"', match  # important for streaming
+              else
+                encoder.text_token match[0,1], :delimiter
+                encoder.text_token match[1..-2], :content if match.size > 2
+                encoder.text_token match[-1,1], :delimiter
+                encoder.end_group :string
+                value_expected = false
+              end
+              
+            elsif match = scan(unicode ? /#{patterns::INSTANCE_VARIABLE}/uo :
+                                         /#{patterns::INSTANCE_VARIABLE}/o)
+              value_expected = false
+              encoder.text_token match, :instance_variable
+              
+            elsif value_expected && match = scan(/\//)
+              encoder.begin_group :regexp
+              encoder.text_token match, :delimiter
+              state = self.class::StringState.new :regexp, true, '/'
+              
+            elsif match = scan(value_expected ? /[-+]?#{patterns::NUMERIC}/o : /#{patterns::NUMERIC}/o)
+              if method_call_expected
+                encoder.text_token match, :error
+                method_call_expected = false
               else
-                kind = :symbol
+                encoder.text_token match, self[1] ? :float : :integer  # TODO: send :hex/:octal/:binary
               end
-
-            elsif match = scan(/ -[>=]? | [+!~^]=? | [*|&]{1,2}=? | >>? /x)
-              value_expected = :set
-              kind = :operator
-
-            elsif value_expected and match = scan(unicode ? /#{patterns::HEREDOC_OPEN}/uo :
-                                                            /#{patterns::HEREDOC_OPEN}/o)
-              indented = self[1] == '-'
+              value_expected = false
+              
+            elsif match = scan(/ [-+!~^\/]=? | [:;] | [*|&]{1,2}=? | >>? /x)
+              value_expected = true
+              encoder.text_token match, :operator
+              
+            elsif value_expected && match = scan(/#{patterns::HEREDOC_OPEN}/o)
               quote = self[3]
               delim = self[quote ? 4 : 2]
               kind = patterns::QUOTE_TO_TYPE[quote]
-              tokens << [:open, kind]
-              tokens << [match, :delimiter]
-              match = :close
-              heredoc = patterns::StringState.new kind, quote != '\'', delim, (indented ? :indented : :linestart )
+              encoder.begin_group kind
+              encoder.text_token match, :delimiter
+              encoder.end_group kind
               heredocs ||= []  # create heredocs if empty
-              heredocs << heredoc
-
-            elsif value_expected and match = scan(/#{patterns::FANCY_START_CORRECT}/o)
-              kind, interpreted = *patterns::FancyStringType.fetch(self[1]) do
-                raise_inspect 'Unknown fancy string: %%%p' % k, tokens
-              end
-              tokens << [:open, kind]
-              state = patterns::StringState.new kind, interpreted, self[2]
-              kind = :delimiter
-
-            elsif value_expected and match = scan(unicode ? /#{patterns::CHARACTER}/uo :
-                                                            /#{patterns::CHARACTER}/o)
-              kind = :integer
-
-            elsif match = scan(/ [\/%]=? | <(?:<|=>?)? | [?:;] /x)
-              value_expected = :set
-              kind = :operator
-
+              heredocs << self.class::StringState.new(kind, quote != "'", delim,
+                self[1] == '-' ? :indented : :linestart)
+              value_expected = false
+              
+            elsif value_expected && match = scan(/#{patterns::FANCY_STRING_START}/o)
+              kind = patterns::FANCY_STRING_KIND[self[1]]
+              encoder.begin_group kind
+              state = self.class::StringState.new kind, patterns::FANCY_STRING_INTERPRETED[self[1]], self[2]
+              encoder.text_token match, :delimiter
+              
+            elsif value_expected && match = scan(/#{patterns::CHARACTER}/o)
+              value_expected = false
+              encoder.text_token match, :integer
+              
+            elsif match = scan(/ %=? | <(?:<|=>?)? | \? /x)
+              value_expected = true
+              encoder.text_token match, :operator
+              
             elsif match = scan(/`/)
-              if last_token_dot
-                kind = :operator
-              else
-                tokens << [:open, :shell]
-                kind = :delimiter
-                state = patterns::StringState.new :shell, true, match
-              end
-
+              encoder.begin_group :shell
+              encoder.text_token match, :delimiter
+              state = self.class::StringState.new :shell, true, match
+              
             elsif match = scan(unicode ? /#{patterns::GLOBAL_VARIABLE}/uo :
                                          /#{patterns::GLOBAL_VARIABLE}/o)
-              kind = :global_variable
-
+              encoder.text_token match, :global_variable
+              value_expected = false
+              
             elsif match = scan(unicode ? /#{patterns::CLASS_VARIABLE}/uo :
                                          /#{patterns::CLASS_VARIABLE}/o)
-              kind = :class_variable
-
+              encoder.text_token match, :class_variable
+              value_expected = false
+              
+            elsif match = scan(/\\\z/)
+              encoder.text_token match, :space
+              
             else
-              if !unicode && !string.respond_to?(:encoding)
+              if method_call_expected
+                method_call_expected = false
+                next
+              end
+              unless unicode
                 # check for unicode
-                debug, $DEBUG = $DEBUG, false
+                $DEBUG_BEFORE, $DEBUG = $DEBUG, false
                 begin
                   if check(/./mu).size > 1
                     # seems like we should try again with unicode
@@ -321,124 +250,212 @@ module Scanners
                 rescue
                   # bad unicode char; use getch
                 ensure
-                  $DEBUG = debug
+                  $DEBUG = $DEBUG_BEFORE
                 end
                 next if unicode
               end
-              kind = :error
-              match = scan(unicode ? /./mu : /./m)
-
+              
+              encoder.text_token getch, :error
+              
             end
-
-          elsif state == :def_expected
-            state = :initial
-            if scan(/self\./)
-              tokens << ['self', :pre_constant]
-              tokens << ['.', :operator]
+            
+            if last_state
+              state = last_state
+              last_state = nil
             end
+            
+          elsif state == :def_expected
             if match = scan(unicode ? /(?>#{patterns::METHOD_NAME_EX})(?!\.|::)/uo :
                                       /(?>#{patterns::METHOD_NAME_EX})(?!\.|::)/o)
-              kind = :method
+              encoder.text_token match, :method
+              state = :initial
+            else
+              last_state = :dot_expected
+              state = :initial
+            end
+            
+          elsif state == :dot_expected
+            if match = scan(/\.|::/)
+              # invalid definition
+              state = :def_expected
+              encoder.text_token match, :operator
             else
-              next
+              state = :initial
             end
-
+            
           elsif state == :module_expected
             if match = scan(/<</)
-              kind = :operator
+              encoder.text_token match, :operator
             else
               state = :initial
-              if match = scan(unicode ? /(?:#{patterns::IDENT}::)*#{patterns::IDENT}/uo :
-                                        /(?:#{patterns::IDENT}::)*#{patterns::IDENT}/o)
-                kind = :class
-              else
-                next
+              if match = scan(unicode ? / (?:#{patterns::IDENT}::)* #{patterns::IDENT} /oux :
+                                        / (?:#{patterns::IDENT}::)* #{patterns::IDENT} /ox)
+                encoder.text_token match, :class
               end
             end
-
+            
           elsif state == :undef_expected
             state = :undef_comma_expected
-            if match = scan(unicode ? /#{patterns::METHOD_NAME_EX}/uo :
-                                      /#{patterns::METHOD_NAME_EX}/o)
-              kind = :method
-            elsif match = scan(unicode ? /#{patterns::SYMBOL}/uo :
-                                         /#{patterns::SYMBOL}/o)
+            if match = scan(unicode ? /(?>#{patterns::METHOD_NAME_EX})(?!\.|::)/uo :
+                                      /(?>#{patterns::METHOD_NAME_EX})(?!\.|::)/o)
+              encoder.text_token match, :method
+            elsif match = scan(/#{patterns::SYMBOL}/o)
               case delim = match[1]
               when ?', ?"
-                tokens << [:open, :symbol]
-                tokens << [':', :symbol]
+                encoder.begin_group :symbol
+                encoder.text_token ':', :symbol
                 match = delim.chr
-                kind = :delimiter
-                state = patterns::StringState.new :symbol, delim == ?", match
+                encoder.text_token match, :delimiter
+                state = self.class::StringState.new :symbol, delim == ?", match
                 state.next_state = :undef_comma_expected
               else
-                kind = :symbol
+                encoder.text_token match, :symbol
               end
             else
               state = :initial
-              next
             end
-
-          elsif state == :alias_expected
-            match = scan(unicode ? /(#{patterns::METHOD_NAME_OR_SYMBOL})([ \t]+)(#{patterns::METHOD_NAME_OR_SYMBOL})/uo :
-                                   /(#{patterns::METHOD_NAME_OR_SYMBOL})([ \t]+)(#{patterns::METHOD_NAME_OR_SYMBOL})/o)
             
-            if match
-              tokens << [self[1], (self[1][0] == ?: ? :symbol : :method)]
-              tokens << [self[2], :space]
-              tokens << [self[3], (self[3][0] == ?: ? :symbol : :method)]
-            end
-            state = :initial
-            next
-
           elsif state == :undef_comma_expected
             if match = scan(/,/)
-              kind = :operator
+              encoder.text_token match, :operator
               state = :undef_expected
             else
               state = :initial
-              next
             end
-
+            
+          elsif state == :alias_expected
+            match = scan(unicode ? /(#{patterns::METHOD_NAME_OR_SYMBOL})([ \t]+)(#{patterns::METHOD_NAME_OR_SYMBOL})/uo :
+                                   /(#{patterns::METHOD_NAME_OR_SYMBOL})([ \t]+)(#{patterns::METHOD_NAME_OR_SYMBOL})/o)
+            
+            if match
+              encoder.text_token self[1], (self[1][0] == ?: ? :symbol : :method)
+              encoder.text_token self[2], :space
+              encoder.text_token self[3], (self[3][0] == ?: ? :symbol : :method)
+            end
+            state = :initial
+            
+          else
+            #:nocov:
+            raise_inspect 'Unknown state: %p' % [state], encoder
+            #:nocov:
           end
-# }}}
           
-          unless kind == :error
-            if value_expected = value_expected == :set
-              value_expected = :expect_colon if match == '?' || match == 'when'
-            end
-            last_token_dot = last_token_dot == :set
+        else  # StringState
+          
+          match = scan_until(state.pattern) || scan_rest
+          unless match.empty?
+            encoder.text_token match, :content
+            break if eos?
           end
           
-          if $CODERAY_DEBUG and not kind
-            raise_inspect 'Error token %p in line %d' %
-              [[match, kind], line], tokens, state
+          if state.heredoc && self[1]  # end of heredoc
+            match = getch
+            match << scan_until(/$/) unless eos?
+            encoder.text_token match, :delimiter unless match.empty?
+            encoder.end_group state.type
+            state = state.next_state
+            next
           end
-          raise_inspect 'Empty token', tokens unless match
-
-          tokens << [match, kind]
-
-          if last_state
-            state = last_state
-            last_state = nil
+          
+          case match = getch
+          
+          when state.delim
+            if state.paren_depth
+              state.paren_depth -= 1
+              if state.paren_depth > 0
+                encoder.text_token match, :content
+                next
+              end
+            end
+            encoder.text_token match, :delimiter
+            if state.type == :regexp && !eos?
+              match = scan(/#{patterns::REGEXP_MODIFIERS}/o)
+              encoder.text_token match, :modifier unless match.empty?
+            end
+            encoder.end_group state.type
+            value_expected = false
+            state = state.next_state
+            
+          when '\\'
+            if state.interpreted
+              if esc = scan(/#{patterns::ESCAPE}/o)
+                encoder.text_token match + esc, :char
+              else
+                encoder.text_token match, :error
+              end
+            else
+              case esc = getch
+              when nil
+                encoder.text_token match, :content
+              when state.delim, '\\'
+                encoder.text_token match + esc, :char
+              else
+                encoder.text_token match + esc, :content
+              end
+            end
+            
+          when '#'
+            case peek(1)
+            when '{'
+              inline_block_stack ||= []
+              inline_block_stack << [state, inline_block_curly_depth, heredocs]
+              value_expected = true
+              state = :initial
+              inline_block_curly_depth = 1
+              encoder.begin_group :inline
+              encoder.text_token match + getch, :inline_delimiter
+            when '$', '@'
+              encoder.text_token match, :escape
+              last_state = state
+              state = :initial
+            else
+              #:nocov:
+              raise_inspect 'else-case # reached; #%p not handled' % [peek(1)], encoder
+              #:nocov:
+            end
+            
+          when state.opening_paren
+            state.paren_depth += 1
+            encoder.text_token match, :content
+            
+          else
+            #:nocov:
+            raise_inspect 'else-case " reached; %p not handled, state = %p' % [match, state], encoder
+            #:nocov:
+            
           end
+          
         end
+        
+      end
+      
+      # cleaning up
+      if state.is_a? StringState
+        encoder.end_group state.type
       end
-
-      inline_block_stack << [state] if state.is_a? patterns::StringState
-      until inline_block_stack.empty?
-        this_block = inline_block_stack.pop
-        tokens << [:close, :inline] if this_block.size > 1
-        state = this_block.first
-        tokens << [:close, state.type]
+      
+      if options[:keep_state]
+        if state.is_a?(StringState) && state.heredoc
+          (heredocs ||= []).unshift state
+          state = :initial
+        elsif heredocs && heredocs.empty?
+          heredocs = nil
+        end
+        @state = state, heredocs
       end
-
-      tokens
+      
+      if inline_block_stack
+        until inline_block_stack.empty?
+          state, = *inline_block_stack.pop
+          encoder.end_group :inline
+          encoder.end_group state.type
+        end
+      end
+      
+      encoder
     end
-
+    
   end
-
+  
 end
 end
-
-# vim:fdm=marker
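Note on the hunk above: the scanner no longer appends `[match, kind]` pairs to a `tokens` array but calls `begin_group`, `text_token` and `end_group` on the encoder it is handed. The following is a minimal sketch of that call sequence for a heredoc opener. `RecordingEncoder` is a hypothetical stand-in written for this note, not a CodeRay class; only the three method names are taken from the diff.

```ruby
# Hypothetical stand-in for an encoder: it only records the calls it receives.
# The method names mirror those used in the hunk above; nothing else is CodeRay API.
class RecordingEncoder
  attr_reader :events

  def initialize
    @events = []
  end

  def text_token(text, kind)
    @events << [text, kind]
  end

  def begin_group(kind)
    @events << [:begin_group, kind]
  end

  def end_group(kind)
    @events << [:end_group, kind]
  end
end

# Same three calls the new heredoc branch makes for the opening delimiter.
encoder = RecordingEncoder.new
encoder.begin_group :string
encoder.text_token '<<-EOS', :delimiter
encoder.end_group :string
```

In the old code the same information was spread over `tokens << [:open, kind]`, `tokens << [match, :delimiter]` and a trailing `:close` entry.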
diff --git a/lib/coderay/scanners/ruby/patterns.rb b/lib/coderay/scanners/ruby/patterns.rb
index 9bd709c..a52198e 100644
--- a/lib/coderay/scanners/ruby/patterns.rb
+++ b/lib/coderay/scanners/ruby/patterns.rb
@@ -2,9 +2,9 @@
 module CodeRay
 module Scanners
 
-  module Ruby::Patterns  # :nodoc:
+  module Ruby::Patterns  # :nodoc: all
 
-    RESERVED_WORDS = %w[
+    KEYWORDS = %w[
       and def end in or unless begin
       defined? ensure module redo super until
       BEGIN break do next rescue then
@@ -13,25 +13,27 @@ module Scanners
       undef yield
     ]
 
-    DEF_KEYWORDS = %w[ def ]
-    UNDEF_KEYWORDS = %w[ undef ]
-    ALIAS_KEYWORDS = %w[ alias ]
-    MODULE_KEYWORDS = %w[ class module ]
-    DEF_NEW_STATE = WordList.new(:initial).
-      add(DEF_KEYWORDS, :def_expected).
-      add(UNDEF_KEYWORDS, :undef_expected).
-      add(ALIAS_KEYWORDS, :alias_expected).
-      add(MODULE_KEYWORDS, :module_expected)
-
+    # See http://murfy.de/ruby-constants.
     PREDEFINED_CONSTANTS = %w[
       nil true false self
-      DATA ARGV ARGF
+      DATA ARGV ARGF ENV
+      FALSE TRUE NIL
+      STDERR STDIN STDOUT
+      TOPLEVEL_BINDING
+      RUBY_COPYRIGHT RUBY_DESCRIPTION RUBY_ENGINE RUBY_PATCHLEVEL
+      RUBY_PLATFORM RUBY_RELEASE_DATE RUBY_REVISION RUBY_VERSION
       __FILE__ __LINE__ __ENCODING__
     ]
 
     IDENT_KIND = WordList.new(:ident).
-      add(RESERVED_WORDS, :reserved).
-      add(PREDEFINED_CONSTANTS, :pre_constant)
+      add(KEYWORDS, :keyword).
+      add(PREDEFINED_CONSTANTS, :predefined_constant)
+
+    KEYWORD_NEW_STATE = WordList.new(:initial).
+      add(%w[ def ], :def_expected).
+      add(%w[ undef ], :undef_expected).
+      add(%w[ alias ], :alias_expected).
+      add(%w[ class module ], :module_expected)
 
     IDENT = 'ä'[/[[:alpha:]]/] == 'ä' ? /[[:alpha:]_][[:alnum:]_]*/ : /[^\W\d]\w*/
 
@@ -46,7 +48,9 @@ module Scanners
       | ===? | =~     # simple equality, case equality, match
       | ![~=@]?       # negation with and without at sign, not-equal and not-match
     /ox
-    METHOD_NAME_EX = / #{IDENT} (?:[?!]|=(?!>))? | #{METHOD_NAME_OPERATOR} /ox
+    METHOD_SUFFIX = / (?: [?!] | = (?![~>]|=(?!>)) ) /x
+    METHOD_NAME_EX = / #{IDENT} #{METHOD_SUFFIX}? | #{METHOD_NAME_OPERATOR} /ox
+    METHOD_AFTER_DOT = / #{IDENT} [?!]? | #{METHOD_NAME_OPERATOR} /ox
     INSTANCE_VARIABLE = / @ #{IDENT} /ox
     CLASS_VARIABLE = / @@ #{IDENT} /ox
     OBJECT_VARIABLE = / @@? #{IDENT} /ox
@@ -60,8 +64,7 @@ module Scanners
     }
     QUOTE_TO_TYPE.default = :string
 
-    REGEXP_MODIFIERS = /[mixounse]*/
-    REGEXP_SYMBOLS = /[|?*+(){}\[\].^$]/
+    REGEXP_MODIFIERS = /[mousenix]*/
 
     DECIMAL = /\d+(?:_\d+)*/
     OCTAL = /0_?[0-7]+(?:_[0-7]+)*/
@@ -87,7 +90,7 @@ module Scanners
         [abefnrstv]
       |  [0-7]{1,3}
       | x[0-9A-Fa-f]{1,2}
-      | .?
+      | .
     /mx
     
     CONTROL_META_ESCAPE = /
@@ -110,12 +113,10 @@ module Scanners
 
     # NOTE: This is not completely correct, but
     # nobody needs heredoc delimiters ending with \n.
-    # Also, delimiters starting with numbers are allowed.
-    # but they are more often than not a false positive.
     HEREDOC_OPEN = /
       << (-)?              # $1 = float
       (?:
-        ( #{IDENT} )       # $2 = delim
+        ( [A-Za-z_0-9]+ )  # $2 = delim
       |
         ( ["'`\/] )        # $3 = quote, type
         ( [^\n]*? ) \3     # $4 = delim
@@ -134,6 +135,8 @@ module Scanners
       (?: \Z | (?=^\#CODE) )
     /mx
     
+    RUBYDOC_OR_DATA = / #{RUBYDOC} | #{DATA} /xo
+
     # Checks for a valid value to follow. This enables
     # value_expected in method calls without parentheses.
     VALUE_FOLLOWS = /
@@ -144,7 +147,7 @@ module Scanners
       | [-+] \d
       | #{CHARACTER}
       )
-    /x
+    /ox
     KEYWORDS_EXPECTING_VALUE = WordList.new.add(%w[
       and end in or unless begin
       defined? ensure redo super until
@@ -153,89 +156,20 @@ module Scanners
       while elsif if not return
       yield
     ])
-
-    RUBYDOC_OR_DATA = / #{RUBYDOC} | #{DATA} /xo
-
-    RDOC_DATA_START = / ^=begin (?!\S) | ^__END__$ /x
-
-    FANCY_START_CORRECT = / % ( [qQwWxsr] | (?![a-zA-Z0-9]) ) ([^a-zA-Z0-9]) /mx
-
-    FancyStringType = {
-      'q' => [:string, false],
-      'Q' => [:string, true],
-      'r' => [:regexp, true],
-      's' => [:symbol, false],
-      'x' => [:shell, true]
-    }
-    FancyStringType['w'] = FancyStringType['q']
-    FancyStringType['W'] = FancyStringType[''] = FancyStringType['Q']
-
-    class StringState < Struct.new :type, :interpreted, :delim, :heredoc,
-      :paren, :paren_depth, :pattern, :next_state
-
-      CLOSING_PAREN = Hash[ *%w[
-        ( )
-        [ ]
-        < >
-        { }
-      ] ]
-
-      CLOSING_PAREN.each { |k,v| k.freeze; v.freeze }  # debug, if I try to change it with <<
-      OPENING_PAREN = CLOSING_PAREN.invert
-
-      STRING_PATTERN = Hash.new do |h, k|
-        delim, interpreted = *k
-        delim_pattern = Regexp.escape(delim.dup)  # dup: workaround for old Ruby
-        if closing_paren = CLOSING_PAREN[delim]
-          delim_pattern = delim_pattern[0..-1] if defined? JRUBY_VERSION  # JRuby fix
-          delim_pattern << Regexp.escape(closing_paren)
-        end
-        delim_pattern << '\\\\' unless delim == '\\'
-        
-        special_escapes =
-          case interpreted
-          when :regexp_symbols
-            '| ' + REGEXP_SYMBOLS.source
-          when :words
-            '| \s'
-          end
-        
-        h[k] =
-          if interpreted and not delim == '#'
-            / (?= [#{delim_pattern}] | \# [{$@] #{special_escapes} ) /mx
-          else
-            / (?= [#{delim_pattern}] #{special_escapes} ) /mx
-          end
-      end
-
-      HEREDOC_PATTERN = Hash.new do |h, k|
-        delim, interpreted, indented = *k
-        delim_pattern = Regexp.escape(delim.dup)  # dup: workaround for old Ruby
-        delim_pattern = / \n #{ '(?>[\ \t]*)' if indented } #{ Regexp.new delim_pattern } $ /x
-        h[k] =
-          if interpreted
-            / (?= #{delim_pattern}() | \\ | \# [{$@] ) /mx  # $1 set == end of heredoc
-          else
-            / (?= #{delim_pattern}() | \\ ) /mx
-          end
-      end
-
-      def initialize kind, interpreted, delim, heredoc = false
-        if heredoc
-          pattern = HEREDOC_PATTERN[ [delim, interpreted, heredoc == :indented] ]
-          delim = nil
-        else
-          pattern = STRING_PATTERN[ [delim, interpreted] ]
-          if paren = CLOSING_PAREN[delim]
-            delim, paren = paren, delim
-            paren_depth = 1
-          end
-        end
-        super kind, interpreted, delim, heredoc, paren, paren_depth, pattern, :initial
-      end
-    end unless defined? StringState
-
+    
+    FANCY_STRING_START = / % ( [QqrsWwx] | (?![a-zA-Z0-9]) ) ([^a-zA-Z0-9]) /x
+    FANCY_STRING_KIND = Hash.new(:string).merge({
+      'r' => :regexp,
+      's' => :symbol,
+      'x' => :shell,
+    })
+    FANCY_STRING_INTERPRETED = Hash.new(true).merge({
+      'q' => false,
+      's' => false,
+      'w' => false,
+    })
+    
   end
-
+  
 end
 end
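Note on the hunk above: the old `FancyStringType` table of `[kind, interpreted]` pairs is replaced by two hashes with default values, so only the exceptions have to be listed. A small sketch of how the lookups behave follows. The three constants are copied from the diff; `classify_fancy_string` is a hypothetical helper added for illustration and is not part of CodeRay.

```ruby
# Constants copied from the patterns.rb hunk above.
FANCY_STRING_START = / % ( [QqrsWwx] | (?![a-zA-Z0-9]) ) ([^a-zA-Z0-9]) /x
FANCY_STRING_KIND = Hash.new(:string).merge({
  'r' => :regexp,
  's' => :symbol,
  'x' => :shell,
})
FANCY_STRING_INTERPRETED = Hash.new(true).merge({
  'q' => false,
  's' => false,
  'w' => false,
})

# Hypothetical helper: classifies a %-literal that starts at the beginning
# of +source+. Returns nil if +source+ does not start with one.
def classify_fancy_string(source)
  m = FANCY_STRING_START.match(source)
  return nil unless m && m.pre_match.empty?
  {
    kind:        FANCY_STRING_KIND[m[1]],         # falls back to :string
    interpreted: FANCY_STRING_INTERPRETED[m[1]],  # falls back to true
    delim:       m[2],
  }
end
```

For `%w[a b]` the type letter `w` is not in `FANCY_STRING_KIND`, so the default `:string` applies, while `FANCY_STRING_INTERPRETED` marks it as not interpreted. A bare `%(foo)` captures an empty type letter and gets both defaults.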
diff --git a/lib/coderay/scanners/ruby/string_state.rb b/lib/coderay/scanners/ruby/string_state.rb
new file mode 100644
index 0000000..2f398d1
--- /dev/null
+++ b/lib/coderay/scanners/ruby/string_state.rb
@@ -0,0 +1,71 @@
+# encoding: utf-8
+module CodeRay
+module Scanners
+  
+  class Ruby
+    
+    class StringState < Struct.new :type, :interpreted, :delim, :heredoc,
+      :opening_paren, :paren_depth, :pattern, :next_state  # :nodoc: all
+      
+      CLOSING_PAREN = Hash[ *%w[
+        ( )
+        [ ]
+        < >
+        { }
+      ] ].each { |k,v| k.freeze; v.freeze }  # debug, if I try to change it with <<
+      
+      STRING_PATTERN = Hash.new do |h, k|
+        delim, interpreted = *k
+        # delim = delim.dup  # workaround for old Ruby
+        delim_pattern = Regexp.escape(delim)
+        if closing_paren = CLOSING_PAREN[delim]
+          delim_pattern << Regexp.escape(closing_paren)
+        end
+        delim_pattern << '\\\\' unless delim == '\\'
+        
+        # special_escapes =
+        #   case interpreted
+        #   when :regexp_symbols
+        #     '| [|?*+(){}\[\].^$]'
+        #   end
+        
+        h[k] =
+          if interpreted && delim != '#'
+            / (?= [#{delim_pattern}] | \# [{$@] ) /mx
+          else
+            / (?= [#{delim_pattern}] ) /mx
+          end
+      end
+      
+      def initialize kind, interpreted, delim, heredoc = false
+        if heredoc
+          pattern = heredoc_pattern delim, interpreted, heredoc == :indented
+          delim = nil
+        else
+          pattern = STRING_PATTERN[ [delim, interpreted] ]
+          if closing_paren = CLOSING_PAREN[delim]
+            opening_paren = delim
+            delim = closing_paren
+            paren_depth = 1
+          end
+        end
+        super kind, interpreted, delim, heredoc, opening_paren, paren_depth, pattern, :initial
+      end
+      
+      def heredoc_pattern delim, interpreted, indented
+        # delim = delim.dup  # workaround for old Ruby
+        delim_pattern = Regexp.escape(delim)
+        delim_pattern = / (?:\A|\n) #{ '(?>[ \t]*)' if indented } #{ Regexp.new delim_pattern } $ /x
+        if interpreted
+          / (?= #{delim_pattern}() | \\ | \# [{$@] ) /mx  # $1 set == end of heredoc
+        else
+          / (?= #{delim_pattern}() | \\ ) /mx
+        end
+      end
+      
+    end
+    
+  end
+  
+end
+end
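Note on the new file above: `STRING_PATTERN` builds a zero-width lookahead, so `scan_until` returns the plain string content and leaves the scanner positioned on the interesting character (delimiter, backslash, or `#` before `{`, `$`, `@`). The sketch below restates that logic as a plain method to show the effect with `StringScanner`. `string_pattern` is a hypothetical name, and the memoizing `Hash.new` block of the original is left out.

```ruby
require 'strscan'

CLOSING_PAREN = { '(' => ')', '[' => ']', '<' => '>', '{' => '}' }

# Hypothetical, non-memoized restatement of the STRING_PATTERN block above.
def string_pattern(delim, interpreted)
  chars = Regexp.escape(delim)
  if (closing = CLOSING_PAREN[delim])
    chars += Regexp.escape(closing)
  end
  chars += '\\\\' unless delim == '\\'  # a backslash always ends plain content

  if interpreted && delim != '#'
    / (?= [#{chars}] | \# [{$@] ) /mx   # also stop before an interpolation
  else
    / (?= [#{chars}] ) /mx
  end
end

# Interpreted string: content stops right before the interpolation start.
scanner = StringScanner.new('hello #{name}" rest')
content = scanner.scan_until(string_pattern('"', true))
next_char = scanner.getch
```

Here `content` holds the text before `#{`, and `getch` then yields the `#` that the `when '#'` branch of the scanner dispatches on. With `interpreted` set to false, the same input would be consumed up to the closing quote instead.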
diff --git a/lib/coderay/scanners/scheme.rb b/lib/coderay/scanners/scheme.rb
deleted file mode 100644
index ba22b80..0000000
--- a/lib/coderay/scanners/scheme.rb
+++ /dev/null
@@ -1,145 +0,0 @@
-module CodeRay
-  module Scanners
-
-    # Scheme scanner for CodeRay (by closure).
-    # Thanks to murphy for putting CodeRay into public.
-    class Scheme < Scanner
-      
-      # TODO: function defs
-      # TODO: built-in functions
-      
-      register_for :scheme
-      file_extension 'scm'
-
-      CORE_FORMS = %w[
-        lambda let let* letrec syntax-case define-syntax let-syntax
-        letrec-syntax begin define quote if or and cond case do delay
-        quasiquote set! cons force call-with-current-continuation call/cc
-      ]
-
-      IDENT_KIND = CaseIgnoringWordList.new(:ident).
-        add(CORE_FORMS, :reserved)
-      
-      #IDENTIFIER_INITIAL = /[a-z!@\$%&\*\/\:<=>\?~_\^]/i
-      #IDENTIFIER_SUBSEQUENT = /#{IDENTIFIER_INITIAL}|\d|\.|\+|-/
-      #IDENTIFIER = /#{IDENTIFIER_INITIAL}#{IDENTIFIER_SUBSEQUENT}*|\+|-|\.{3}/
-      IDENTIFIER = /[a-zA-Z!@$%&*\/:<=>?~_^][\w!@$%&*\/:<=>?~^.+\-]*|[+-]|\.\.\./
-      DIGIT = /\d/
-      DIGIT10 = DIGIT
-      DIGIT16 = /[0-9a-f]/i
-      DIGIT8 = /[0-7]/
-      DIGIT2 = /[01]/
-      RADIX16 = /\#x/i
-      RADIX8 = /\#o/i
-      RADIX2 = /\#b/i
-      RADIX10 = /\#d/i
-      EXACTNESS = /#i|#e/i
-      SIGN = /[\+-]?/
-      EXP_MARK = /[esfdl]/i
-      EXP = /#{EXP_MARK}#{SIGN}#{DIGIT}+/
-      SUFFIX = /#{EXP}?/
-      PREFIX10 = /#{RADIX10}?#{EXACTNESS}?|#{EXACTNESS}?#{RADIX10}?/
-      PREFIX16 = /#{RADIX16}#{EXACTNESS}?|#{EXACTNESS}?#{RADIX16}/
-      PREFIX8 = /#{RADIX8}#{EXACTNESS}?|#{EXACTNESS}?#{RADIX8}/
-      PREFIX2 = /#{RADIX2}#{EXACTNESS}?|#{EXACTNESS}?#{RADIX2}/
-      UINT10 = /#{DIGIT10}+#*/
-      UINT16 = /#{DIGIT16}+#*/
-      UINT8 = /#{DIGIT8}+#*/
-      UINT2 = /#{DIGIT2}+#*/
-      DECIMAL = /#{DIGIT10}+#+\.#*#{SUFFIX}|#{DIGIT10}+\.#{DIGIT10}*#*#{SUFFIX}|\.#{DIGIT10}+#*#{SUFFIX}|#{UINT10}#{EXP}/
-      UREAL10 = /#{UINT10}\/#{UINT10}|#{DECIMAL}|#{UINT10}/
-      UREAL16 = /#{UINT16}\/#{UINT16}|#{UINT16}/
-      UREAL8 = /#{UINT8}\/#{UINT8}|#{UINT8}/
-      UREAL2 = /#{UINT2}\/#{UINT2}|#{UINT2}/
-      REAL10 = /#{SIGN}#{UREAL10}/
-      REAL16 = /#{SIGN}#{UREAL16}/
-      REAL8 = /#{SIGN}#{UREAL8}/
-      REAL2 = /#{SIGN}#{UREAL2}/
-      IMAG10 = /i|#{UREAL10}i/
-      IMAG16 = /i|#{UREAL16}i/
-      IMAG8 = /i|#{UREAL8}i/
-      IMAG2 = /i|#{UREAL2}i/
-      COMPLEX10 = /#{REAL10}@#{REAL10}|#{REAL10}\+#{IMAG10}|#{REAL10}-#{IMAG10}|\+#{IMAG10}|-#{IMAG10}|#{REAL10}/
-      COMPLEX16 = /#{REAL16}@#{REAL16}|#{REAL16}\+#{IMAG16}|#{REAL16}-#{IMAG16}|\+#{IMAG16}|-#{IMAG16}|#{REAL16}/
-      COMPLEX8 = /#{REAL8}@#{REAL8}|#{REAL8}\+#{IMAG8}|#{REAL8}-#{IMAG8}|\+#{IMAG8}|-#{IMAG8}|#{REAL8}/
-      COMPLEX2 = /#{REAL2}@#{REAL2}|#{REAL2}\+#{IMAG2}|#{REAL2}-#{IMAG2}|\+#{IMAG2}|-#{IMAG2}|#{REAL2}/
-      NUM10 = /#{PREFIX10}?#{COMPLEX10}/
-      NUM16 = /#{PREFIX16}#{COMPLEX16}/
-      NUM8 = /#{PREFIX8}#{COMPLEX8}/
-      NUM2 = /#{PREFIX2}#{COMPLEX2}/
-      NUM = /#{NUM10}|#{NUM16}|#{NUM8}|#{NUM2}/
-    
-    private
-      def scan_tokens tokens,options
-        
-        state = :initial
-        ident_kind = IDENT_KIND
-        
-        until eos?
-          kind = match = nil
-          
-          case state
-          when :initial
-            if scan(/ \s+ | \\\n /x)
-              kind = :space
-            elsif scan(/['\(\[\)\]]|#\(/)
-              kind = :operator_fat
-            elsif scan(/;.*/)
-              kind = :comment
-            elsif scan(/#\\(?:newline|space|.?)/)
-              kind = :char
-            elsif scan(/#[ft]/)
-              kind = :pre_constant
-            elsif scan(/#{IDENTIFIER}/o)
-              kind = ident_kind[matched]
-            elsif scan(/\./)
-              kind = :operator
-            elsif scan(/"/)
-              tokens << [:open, :string]
-              state = :string
-              tokens << ['"', :delimiter]
-              next
-            elsif scan(/#{NUM}/o) and not matched.empty?
-              kind = :integer
-            elsif getch
-              kind = :error
-            end
-            
-          when :string
-            if scan(/[^"\\]+/) or scan(/\\.?/)
-              kind = :content
-            elsif scan(/"/)
-              tokens << ['"', :delimiter]
-              tokens << [:close, :string]
-              state = :initial
-              next
-            else
-              raise_inspect "else case \" reached; %p not handled." % peek(1),
-                tokens, state
-            end
-            
-          else
-            raise "else case reached"
-          end
-          
-          match ||= matched
-          if $CODERAY_DEBUG and not kind
-            raise_inspect 'Error token %p in line %d' %
-            [[match, kind], line], tokens
-          end
-          raise_inspect 'Empty token', tokens, state unless match
-          
-          tokens << [match, kind]
-          
-        end  # until eos
-        
-        if state == :string
-          tokens << [:close, :string]
-        end
-        
-        tokens
-        
-      end #scan_tokens
-    end #class
-  end #module scanners
-end #module coderay
\ No newline at end of file
diff --git a/lib/coderay/scanners/sql.rb b/lib/coderay/scanners/sql.rb
index 2d56c03..bcbffd5 100644
--- a/lib/coderay/scanners/sql.rb
+++ b/lib/coderay/scanners/sql.rb
@@ -5,31 +5,49 @@ module CodeRay module Scanners
 
     register_for :sql
     
-    RESERVED_WORDS = %w(
-      and as avg before begin between by case collate columns create database
-      databases delete distinct drop else end engine exists fields from full
-      group having if index inner insert into is join key like not on or order
-      outer primary prompt replace select set show table tables then trigger
-      union update using values when where
+    KEYWORDS = %w(
+      all and any as before begin between by case check collate
+      each else end exists
+      for foreign from full group having if in inner is join
+      like not of on or order outer over references
+      then to union using values when where
+      left right distinct
+    )
+    
+    OBJECTS = %w(
+      database databases table tables column columns fields index constraint
+      constraints transaction function procedure row key view trigger
+    )
+    
+    COMMANDS = %w(
+      add alter comment create delete drop grant insert into select update set
+      show prompt begin commit rollback replace truncate
     )
     
     PREDEFINED_TYPES = %w(
-      bigint bin binary bit blob bool boolean char date datetime decimal
-      double enum float hex int integer longblob longtext mediumblob mediumint
-      mediumtext oct smallint text time timestamp tinyblob tinyint tinytext
-      unsigned varchar year
+      char varchar varchar2 enum binary text tinytext mediumtext
+      longtext blob tinyblob mediumblob longblob timestamp
+      date time datetime year double decimal float int
+      integer tinyint mediumint bigint smallint unsigned bit
+      bool boolean hex bin oct
     )
     
-    PREDEFINED_FUNCTIONS = %w( sum cast abs pi count min max avg )
+    PREDEFINED_FUNCTIONS = %w( sum cast substring abs pi count min max avg now )
     
-    DIRECTIVES = %w( auto_increment unique default charset )
+    DIRECTIVES = %w( 
+      auto_increment unique default charset initially deferred
+      deferrable cascade immediate read write asc desc after
+      primary foreign return engine
+    )
     
     PREDEFINED_CONSTANTS = %w( null true false )
     
-    IDENT_KIND = CaseIgnoringWordList.new(:ident).
-      add(RESERVED_WORDS, :reserved).
-      add(PREDEFINED_TYPES, :pre_type).
-      add(PREDEFINED_CONSTANTS, :pre_constant).
+    IDENT_KIND = WordList::CaseIgnoring.new(:ident).
+      add(KEYWORDS, :keyword).
+      add(OBJECTS, :type).
+      add(COMMANDS, :class).
+      add(PREDEFINED_TYPES, :predefined_type).
+      add(PREDEFINED_CONSTANTS, :predefined_constant).
       add(PREDEFINED_FUNCTIONS, :predefined).
       add(DIRECTIVES, :directive)
     
@@ -38,58 +56,60 @@ module CodeRay module Scanners
     
     STRING_PREFIXES = /[xnb]|_\w+/i
     
-    def scan_tokens tokens, options
+    def scan_tokens encoder, options
       
       state = :initial
       string_type = nil
       string_content = ''
+      name_expected = false
       
       until eos?
         
-        kind = nil
-        match = nil
-        
         if state == :initial
           
-          if scan(/ \s+ | \\\n /x)
-            kind = :space
+          if match = scan(/ \s+ | \\\n /x)
+            encoder.text_token match, :space
           
-          elsif scan(/(?:--\s?|#).*/)
-            kind = :comment
+          elsif match = scan(/(?:--\s?|#).*/)
+            encoder.text_token match, :comment
             
-          elsif scan(%r! /\* (?: .*? \*/ | .* ) !mx)
-            kind = :comment
+          elsif match = scan(%r( /\* (!)? (?: .*? \*/ | .* ) )mx)
+            encoder.text_token match, self[1] ? :directive : :comment
             
-          elsif scan(/ [-+*\/=<>;,!&^|()\[\]{}~%] | \.(?!\d) /x)
-            kind = :operator
+          elsif match = scan(/ [*\/=<>:;,!&^|()\[\]{}~%] | [-+\.](?!\d) /x)
+            name_expected = true if match == '.' && check(/[A-Za-z_]/)
+            encoder.text_token match, :operator
             
-          elsif scan(/(#{STRING_PREFIXES})?([`"'])/o)
+          elsif match = scan(/(#{STRING_PREFIXES})?([`"'])/o)
             prefix = self[1]
             string_type = self[2]
-            tokens << [:open, :string]
-            tokens << [prefix, :modifier] if prefix
+            encoder.begin_group :string
+            encoder.text_token prefix, :modifier if prefix
             match = string_type
             state = :string
-            kind = :delimiter
+            encoder.text_token match, :delimiter
             
           elsif match = scan(/ @? [A-Za-z_][A-Za-z_0-9]* /x)
-            kind = match[0] == ?@ ? :variable : IDENT_KIND[match.downcase]
+            encoder.text_token match, name_expected ? :ident : (match[0] == ?@ ? :variable : IDENT_KIND[match])
+            name_expected = false
             
-          elsif scan(/0[xX][0-9A-Fa-f]+/)
-            kind = :hex
+          elsif match = scan(/0[xX][0-9A-Fa-f]+/)
+            encoder.text_token match, :hex
             
-          elsif scan(/0[0-7]+(?![89.eEfF])/)
-            kind = :oct
+          elsif match = scan(/0[0-7]+(?![89.eEfF])/)
+            encoder.text_token match, :octal
             
-          elsif scan(/(?>\d+)(?![.eEfF])/)
-            kind = :integer
+          elsif match = scan(/[-+]?(?>\d+)(?![.eEfF])/)
+            encoder.text_token match, :integer
             
-          elsif scan(/\d[fF]|\d*\.\d+(?:[eE][+-]?\d+)?|\d+[eE][+-]?\d+/)
-            kind = :float
+          elsif match = scan(/[-+]?(?:\d[fF]|\d*\.\d+(?:[eE][+-]?\d+)?|\d+[eE][+-]?\d+)/)
+            encoder.text_token match, :float
+          
+          elsif match = scan(/\\N/)
+            encoder.text_token match, :predefined_constant
             
           else
-            getch
-            kind = :error
+            encoder.text_token getch, :error
             
           end
           
@@ -104,54 +124,48 @@ module CodeRay module Scanners
                 next
               end
               unless string_content.empty?
-                tokens << [string_content, :content]
+                encoder.text_token string_content, :content
                 string_content = ''
               end
-              tokens << [matched, :delimiter]
-              tokens << [:close, :string]
+              encoder.text_token match, :delimiter
+              encoder.end_group :string
               state = :initial
               string_type = nil
-              next
             else
               string_content << match
             end
-            next
-          elsif scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
+          elsif match = scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
             unless string_content.empty?
-              tokens << [string_content, :content]
+              encoder.text_token string_content, :content
               string_content = ''
             end
-            kind = :char
+            encoder.text_token match, :char
           elsif match = scan(/ \\ . /mox)
             string_content << match
             next
-          elsif scan(/ \\ | $ /x)
+          elsif match = scan(/ \\ | $ /x)
             unless string_content.empty?
-              tokens << [string_content, :content]
+              encoder.text_token string_content, :content
               string_content = ''
             end
-            kind = :error
+            encoder.text_token match, :error
             state = :initial
           else
-            raise "else case \" reached; %p not handled." % peek(1), tokens
+            raise "else case \" reached; %p not handled." % peek(1), encoder
           end
           
         else
-          raise 'else-case reached', tokens
+          raise 'else-case reached', encoder
           
         end
         
-        match ||= matched
-        unless kind
-          raise_inspect 'Error token %p in line %d' %
-            [[match, kind], line], tokens, state
-        end
-        raise_inspect 'Empty token', tokens unless match
-        
-        tokens << [match, kind]
-        
       end
-      tokens
+      
+      if state == :string
+        encoder.end_group state
+      end
+      
+      encoder
       
     end
     
diff --git a/lib/coderay/scanners/text.rb b/lib/coderay/scanners/text.rb
new file mode 100644
index 0000000..bde9029
--- /dev/null
+++ b/lib/coderay/scanners/text.rb
@@ -0,0 +1,26 @@
+module CodeRay
+  module Scanners
+    
+    # Scanner for plain text.
+    # 
+    # Yields just one token of the kind :plain.
+    # 
+    # Alias: +plaintext+, +plain+
+    class Text < Scanner
+      
+      register_for :text
+      title 'Plain text'
+      
+      KINDS_NOT_LOC = [:plain]  # :nodoc:
+      
+    protected
+      
+      def scan_tokens encoder, options
+        encoder.text_token string, :plain
+        encoder
+      end
+      
+    end
+    
+  end
+end
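
Note on the scanner changes above: scanners no longer append [text, kind] pairs to a tokens array. They call text_token, begin_group and end_group on the encoder they are handed. The following is a minimal sketch of that calling convention. RecordingEncoder and the toy scan_tokens method are hypothetical stand-ins written for illustration and are not part of CodeRay; only the three method names are taken from the diff.

```ruby
require 'strscan'

# Hypothetical stand-in for an encoder: it only records the calls that
# the scanners in this diff make (text_token, begin_group, end_group).
class RecordingEncoder
  attr_reader :events

  def initialize
    @events = []
  end

  def text_token(text, kind)
    @events << [text, kind]
  end

  def begin_group(kind)
    @events << [:begin_group, kind]
  end

  def end_group(kind)
    @events << [:end_group, kind]
  end
end

# Toy scanner (not CodeRay's): recognises whitespace and double-quoted
# strings, and reports every other character as an :error token.
def scan_tokens(source, encoder)
  scanner = StringScanner.new(source)
  until scanner.eos?
    if (match = scanner.scan(/\s+/))
      encoder.text_token match, :space
    elsif (match = scanner.scan(/"/))
      encoder.begin_group :string
      encoder.text_token match, :delimiter
      if (content = scanner.scan(/[^"]+/))
        encoder.text_token content, :content
      end
      if (close = scanner.scan(/"/))
        encoder.text_token close, :delimiter
      end
      # The group is closed even if the closing quote is missing.
      encoder.end_group :string
    else
      encoder.text_token scanner.getch, :error
    end
  end
  encoder
end

encoder = scan_tokens('"a string"', RecordingEncoder.new)
encoder.events.each { |event| p event }
```

Because each token is pushed to the encoder as soon as it is matched, the scanner needs no per-iteration kind/match bookkeeping, which is what the removed lines at the end of each scan loop used to do.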
diff --git a/lib/coderay/scanners/xml.rb b/lib/coderay/scanners/xml.rb
index aeabeca..947f16e 100644
--- a/lib/coderay/scanners/xml.rb
+++ b/lib/coderay/scanners/xml.rb
@@ -3,7 +3,7 @@ module Scanners
 
   load :html
 
-  # XML Scanner
+  # Scanner for XML.
   #
   # Currently this is the same scanner as Scanners::HTML.
   class XML < HTML
diff --git a/lib/coderay/scanners/yaml.rb b/lib/coderay/scanners/yaml.rb
index 095d609..96f4e93 100644
--- a/lib/coderay/scanners/yaml.rb
+++ b/lib/coderay/scanners/yaml.rb
@@ -1,7 +1,7 @@
 module CodeRay
 module Scanners
   
-  # YAML Scanner
+  # Scanner for YAML.
   #
   # Based on the YAML scanner from Syntax by Jamis Buck.
   class YAML < Scanner
@@ -11,57 +11,59 @@ module Scanners
     
     KINDS_NOT_LOC = :all
     
-    def scan_tokens tokens, options
+  protected
+    
+    def scan_tokens encoder, options
       
       state = :initial
-      key_indent = 0
+      key_indent = string_indent = 0
       
       until eos?
         
-        kind = nil
-        match = nil
         key_indent = nil if bol?
         
         if match = scan(/ +[\t ]*/)
-          kind = :space
+          encoder.text_token match, :space
           
         elsif match = scan(/\n+/)
-          kind = :space
+          encoder.text_token match, :space
           state = :initial if match.index(?\n)
           
         elsif match = scan(/#.*/)
-          kind = :comment
+          encoder.text_token match, :comment
           
         elsif bol? and case
           when match = scan(/---|\.\.\./)
-            tokens << [:open, :head]
-            tokens << [match, :head]
-            tokens << [:close, :head]
+            encoder.begin_group :head
+            encoder.text_token match, :head
+            encoder.end_group :head
             next
           when match = scan(/%.*/)
-            tokens << [match, :doctype]
+            encoder.text_token match, :doctype
             next
           end
         
         elsif state == :value and case
-          when !check(/(?:"[^"]*")(?=: |:$)/) && scan(/"/)
-            tokens << [:open, :string]
-            tokens << [matched, :delimiter]
-            tokens << [matched, :content] if scan(/ [^"\\]* (?: \\. [^"\\]* )* /mx)
-            tokens << [matched, :delimiter] if scan(/"/)
-            tokens << [:close, :string]
+          when !check(/(?:"[^"]*")(?=: |:$)/) && match = scan(/"/)
+            encoder.begin_group :string
+            encoder.text_token match, :delimiter
+            encoder.text_token match, :content if match = scan(/ [^"\\]* (?: \\. [^"\\]* )* /mx)
+            encoder.text_token match, :delimiter if match = scan(/"/)
+            encoder.end_group :string
             next
           when match = scan(/[|>][-+]?/)
-            tokens << [:open, :string]
-            tokens << [match, :delimiter]
-            string_indent = key_indent || column(pos - match.size - 1)
-            tokens << [matched, :content] if scan(/(?:\n+ {#{string_indent + 1}}.*)+/)
-            tokens << [:close, :string]
+            encoder.begin_group :string
+            encoder.text_token match, :delimiter
+            string_indent = key_indent || column(pos - match.size) - 1
+            encoder.text_token matched, :content if scan(/(?:\n+ {#{string_indent + 1}}.*)+/)
+            encoder.end_group :string
             next
           when match = scan(/(?![!"*&]).+?(?=$|\s+#)/)
-            tokens << [match, :string]
-            string_indent = key_indent || column(pos - match.size - 1)
-            tokens << [matched, :string] if scan(/(?:\n+ {#{string_indent + 1}}.*)+/)
+            encoder.begin_group :string
+            encoder.text_token match, :content
+            string_indent = key_indent || column(pos - match.size) - 1
+            encoder.text_token matched, :content if scan(/(?:\n+ {#{string_indent + 1}}.*)+/)
+            encoder.end_group :string
             next
           end
           
@@ -69,68 +71,67 @@ module Scanners
           when match = scan(/[-:](?= |$)/)
             state = :value if state == :colon && (match == ':' || match == '-')
             state = :value if state == :initial && match == '-'
-            kind = :operator
+            encoder.text_token match, :operator
+            next
           when match = scan(/[,{}\[\]]/)
-            kind = :operator
-          when state == :initial && match = scan(/[\w.() ]*\S(?=: |:$)/)
-            kind = :key
-            key_indent = column(pos - match.size - 1)
-            # tokens << [key_indent.inspect, :debug]
+            encoder.text_token match, :operator
+            next
+          when state == :initial && match = scan(/[-\w.()\/ ]*\S(?= *:(?: |$))/)
+            encoder.text_token match, :key
+            key_indent = column(pos - match.size) - 1
             state = :colon
-          when match = scan(/(?:"[^"\n]*"|'[^'\n]*')(?=: |:$)/)
-            tokens << [:open, :key]
-            tokens << [match[0,1], :delimiter]
-            tokens << [match[1..-2], :content]
-            tokens << [match[-1,1], :delimiter]
-            tokens << [:close, :key]
-            key_indent = column(pos - match.size - 1)
-            # tokens << [key_indent.inspect, :debug]
+            next
+          when match = scan(/(?:"[^"\n]*"|'[^'\n]*')(?= *:(?: |$))/)
+            encoder.begin_group :key
+            encoder.text_token match[0,1], :delimiter
+            encoder.text_token match[1..-2], :content
+            encoder.text_token match[-1,1], :delimiter
+            encoder.end_group :key
+            key_indent = column(pos - match.size) - 1
             state = :colon
             next
-          when scan(/(![\w\/]+)(:([\w:]+))?/)
-            tokens << [self[1], :type]
+          when match = scan(/(![\w\/]+)(:([\w:]+))?/)
+            encoder.text_token self[1], :type
             if self[2]
-              tokens << [':', :operator]
-              tokens << [self[3], :class]
+              encoder.text_token ':', :operator
+              encoder.text_token self[3], :class
             end
             next
-          when scan(/&\S+/)
-            kind = :variable
-          when scan(/\*\w+/)
-            kind = :global_variable
-          when scan(/<</)
-            kind = :class_variable
-          when scan(/\d\d:\d\d:\d\d/)
-            kind = :oct
-          when scan(/\d\d\d\d-\d\d-\d\d\s\d\d:\d\d:\d\d(\.\d+)? [-+]\d\d:\d\d/)
-            kind = :oct
-          when scan(/:\w+/)
-            kind = :symbol
-          when scan(/[^:\s]+(:(?! |$)[^:\s]*)* .*/)
-            kind = :error
-          when scan(/[^:\s]+(:(?! |$)[^:\s]*)*/)
-            kind = :error
+          when match = scan(/&\S+/)
+            encoder.text_token match, :variable
+            next
+          when match = scan(/\*\w+/)
+            encoder.text_token match, :global_variable
+            next
+          when match = scan(/<</)
+            encoder.text_token match, :class_variable
+            next
+          when match = scan(/\d\d:\d\d:\d\d/)
+            encoder.text_token match, :octal
+            next
+          when match = scan(/\d\d\d\d-\d\d-\d\d\s\d\d:\d\d:\d\d(\.\d+)? [-+]\d\d:\d\d/)
+            encoder.text_token match, :octal
+            next
+          when match = scan(/:\w+/)
+            encoder.text_token match, :symbol
+            next
+          when match = scan(/[^:\s]+(:(?! |$)[^:\s]*)* .*/)
+            encoder.text_token match, :error
+            next
+          when match = scan(/[^:\s]+(:(?! |$)[^:\s]*)*/)
+            encoder.text_token match, :error
+            next
           end
           
         else
-          getch
-          kind = :error
+          raise if eos?
+          encoder.text_token getch, :error
           
         end
         
-        match ||= matched
-        
-        if $CODERAY_DEBUG and not kind
-          raise_inspect 'Error token %p in line %d' %
-            [[match, kind], line], tokens, state
-        end
-        raise_inspect 'Empty token', tokens, state unless match
-        
-        tokens << [match, kind]
-        
       end
       
-      tokens
+      encoder
     end
     
   end
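
Note on the YAML key pattern changed above: the new character class also accepts "-" and "/", and the new lookahead allows spaces between the key and the colon. The sketch below copies the old and new patterns from the diff and runs them through StringScanner on a few sample lines. The sample lines and the scan_key helper are made up for illustration.

```ruby
require 'strscan'

# Patterns copied from the removed (old) and added (new) lines of the diff.
OLD_KEY = /[\w.() ]*\S(?=: |:$)/
NEW_KEY = /[-\w.()\/ ]*\S(?= *:(?: |$))/

# Hypothetical helper: try to scan a key at the start of a line.
# Returns the matched key text, or nil if the pattern does not match.
def scan_key(pattern, line)
  StringScanner.new(line).scan(pattern)
end

['name: x', 'foo-bar: 1', 'key : v', 'a/b:'].each do |line|
  p [line, scan_key(OLD_KEY, line), scan_key(NEW_KEY, line)]
end
```

With these inputs the old pattern only recognises the first line as a key, while the new pattern recognises all four.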
diff --git a/lib/coderay/style.rb b/lib/coderay/style.rb
index c2977c5..df4704f 100644
--- a/lib/coderay/style.rb
+++ b/lib/coderay/style.rb
@@ -1,20 +1,23 @@
 module CodeRay
 
   # This module holds the Style class and its subclasses.
-  #
+  # 
   # See Plugin.
   module Styles
     extend PluginHost
     plugin_path File.dirname(__FILE__), 'styles'
-
+    
+    # Base class for styles.
+    # 
+    # Styles are used by Encoders::HTML to colorize tokens.
     class Style
       extend Plugin
       plugin_host Styles
-
-      DEFAULT_OPTIONS = { }
-
+      
+      DEFAULT_OPTIONS = { }  # :nodoc:
+      
     end
-
+    
   end
-
+  
 end
diff --git a/lib/coderay/styles/_map.rb b/lib/coderay/styles/_map.rb
index 52035fe..92d4354 100644
--- a/lib/coderay/styles/_map.rb
+++ b/lib/coderay/styles/_map.rb
@@ -1,7 +1,7 @@
 module CodeRay
 module Styles
-
-  default :cycnus
-
+  
+  default :alpha
+  
 end
 end
diff --git a/lib/coderay/styles/alpha.rb b/lib/coderay/styles/alpha.rb
new file mode 100644
index 0000000..8506d10
--- /dev/null
+++ b/lib/coderay/styles/alpha.rb
@@ -0,0 +1,142 @@
+module CodeRay
+module Styles
+  
+  # A colorful theme using CSS 3 colors (with alpha channel).
+  class Alpha < Style
+
+    register_for :alpha
+
+    code_background = 'hsl(0,0%,95%)'
+    numbers_background = 'hsl(180,65%,90%)'
+    border_color = 'silver'
+    normal_color = 'black'
+
+    CSS_MAIN_STYLES = <<-MAIN  # :nodoc:
+.CodeRay {
+  background-color: #{code_background};
+  border: 1px solid #{border_color};
+  color: #{normal_color};
+}
+.CodeRay pre {
+  margin: 0px;
+}
+
+span.CodeRay { white-space: pre; border: 0px; padding: 2px; }
+
+table.CodeRay { border-collapse: collapse; width: 100%; padding: 2px; }
+table.CodeRay td { padding: 2px 4px; vertical-align: top; }
+
+.CodeRay .line-numbers {
+  background-color: #{numbers_background};
+  color: gray;
+  text-align: right;
+  -webkit-user-select: none;
+  -moz-user-select: none;
+  user-select: none;
+}
+.CodeRay .line-numbers a {
+  background-color: #{numbers_background} !important;
+  color: gray !important;
+  text-decoration: none !important;
+}
+.CodeRay .line-numbers a:target { color: blue !important; }
+.CodeRay .line-numbers .highlighted { color: red !important; }
+.CodeRay .line-numbers .highlighted a { color: red !important; }
+.CodeRay span.line-numbers { padding: 0px 4px; }
+.CodeRay .line { display: block; float: left; width: 100%; }
+.CodeRay .code { width: 100%; }
+    MAIN
+
+    TOKEN_COLORS = <<-'TOKENS'
+.debug { color: white !important; background: blue !important; }
+
+.annotation { color:#007 }
+.attribute-name { color:#b48 }
+.attribute-value { color:#700 }
+.binary { color:#509 }
+.char .content { color:#D20 }
+.char .delimiter { color:#710 }
+.char { color:#D20 }
+.class { color:#B06; font-weight:bold }
+.class-variable { color:#369 }
+.color { color:#0A0 }
+.comment { color:#777 }
+.comment .char { color:#444 }
+.comment .delimiter { color:#444 }
+.complex { color:#A08 }
+.constant { color:#036; font-weight:bold }
+.decorator { color:#B0B }
+.definition { color:#099; font-weight:bold }
+.delimiter { color:black }
+.directive { color:#088; font-weight:bold }
+.doc { color:#970 }
+.doc-string { color:#D42; font-weight:bold }
+.doctype { color:#34b }
+.entity { color:#800; font-weight:bold }
+.error { color:#F00; background-color:#FAA }
+.escape  { color:#666 }
+.exception { color:#C00; font-weight:bold }
+.float { color:#60E }
+.function { color:#06B; font-weight:bold }
+.global-variable { color:#d70 }
+.hex { color:#02b }
+.imaginary { color:#f00 }
+.include { color:#B44; font-weight:bold }
+.inline { background-color: hsla(0,0%,0%,0.07); color: black }
+.inline-delimiter { font-weight: bold; color: #666 }
+.instance-variable { color:#33B }
+.integer  { color:#00D }
+.key .char { color: #60f }
+.key .delimiter { color: #404 }
+.key { color: #606 }
+.keyword { color:#080; font-weight:bold }
+.label { color:#970; font-weight:bold }
+.local-variable { color:#963 }
+.namespace { color:#707; font-weight:bold }
+.octal { color:#40E }
+.operator { }
+.predefined { color:#369; font-weight:bold }
+.predefined-constant { color:#069 }
+.predefined-type { color:#0a5; font-weight:bold }
+.preprocessor { color:#579 }
+.pseudo-class { color:#00C; font-weight:bold }
+.regexp .content { color:#808 }
+.regexp .delimiter { color:#404 }
+.regexp .modifier { color:#C2C }
+.regexp { background-color:hsla(300,100%,50%,0.06); }
+.reserved { color:#080; font-weight:bold }
+.shell .content { color:#2B2 }
+.shell .delimiter { color:#161 }
+.shell { background-color:hsla(120,100%,50%,0.06); }
+.string .char { color: #b0b }
+.string .content { color: #D20 }
+.string .delimiter { color: #710 }
+.string .modifier { color: #E40 }
+.string { background-color:hsla(0,100%,50%,0.05); }
+.symbol .content { color:#A60 }
+.symbol .delimiter { color:#630 }
+.symbol { color:#A60 }
+.tag { color:#070 }
+.type { color:#339; font-weight:bold }
+.value { color: #088; }
+.variable  { color:#037 }
+
+.insert { background: hsla(120,100%,50%,0.12) }
+.delete { background: hsla(0,100%,50%,0.12) }
+.change { color: #bbf; background: #007; }
+.head { color: #f8f; background: #505 }
+.head .filename { color: white; }
+
+.delete .eyecatcher { background-color: hsla(0,100%,50%,0.2); border: 1px solid hsla(0,100%,45%,0.5); margin: -1px; border-bottom: none; border-top-left-radius: 5px; border-top-right-radius: 5px; }
+.insert .eyecatcher { background-color: hsla(120,100%,50%,0.2); border: 1px solid hsla(120,100%,25%,0.5); margin: -1px; border-top: none; border-bottom-left-radius: 5px; border-bottom-right-radius: 5px; }
+
+.insert .insert { color: #0c0; background:transparent; font-weight:bold }
+.delete .delete { color: #c00; background:transparent; font-weight:bold }
+.change .change { color: #88f }
+.head .head { color: #f4f }
+    TOKENS
+
+  end
+
+end
+end
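
Note on the alpha style added above: backgrounds such as .string use hsla() colours with a small alpha value, so the colour a reader sees depends on what lies underneath. The sketch below applies the standard source-over formula (result = alpha * source + (1 - alpha) * backdrop, per channel) to the 5% red used for .string over the style's code background. The composite helper and the 8-bit rounding are assumptions made for illustration; browsers may round differently.

```ruby
# Source-over compositing of a translucent colour on an opaque backdrop,
# computed per RGB channel and rounded to integers.
def composite(source_rgb, alpha, backdrop_rgb)
  source_rgb.zip(backdrop_rgb).map do |src, dst|
    (alpha * src + (1 - alpha) * dst).round
  end
end

code_background = [242, 242, 242]  # hsl(0,0%,95%) rounded to 8-bit RGB
red             = [255, 0, 0]      # hsl(0,100%,50%)

# One .string background over the code background ...
once = composite(red, 0.05, code_background)
# ... and a second one on top, as would happen if two such elements nest.
twice = composite(red, 0.05, once)

p once
p twice
```

Each extra layer tints the result a little more. The removed cycnus style spelled out separate ".s .s" and ".s .s .s" backgrounds for nested strings; a translucent background may make such rules unnecessary, though the diff itself does not state that motivation.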
diff --git a/lib/coderay/styles/cycnus.rb b/lib/coderay/styles/cycnus.rb
deleted file mode 100644
index da4f626..0000000
--- a/lib/coderay/styles/cycnus.rb
+++ /dev/null
@@ -1,152 +0,0 @@
-module CodeRay
-module Styles
-
-  class Cycnus < Style
-
-    register_for :cycnus
-
-    code_background = '#f8f8f8'
-    numbers_background = '#def'
-    border_color = 'silver'
-    normal_color = '#000'
-
-    CSS_MAIN_STYLES = <<-MAIN
-.CodeRay {
-  background-color: #{code_background};
-  border: 1px solid #{border_color};
-  font-family: 'Courier New', 'Terminal', monospace;
-  color: #{normal_color};
-}
-.CodeRay pre { margin: 0px }
-
-div.CodeRay { }
-
-span.CodeRay { white-space: pre; border: 0px; padding: 2px }
-
-table.CodeRay { border-collapse: collapse; width: 100%; padding: 2px }
-table.CodeRay td { padding: 2px 4px; vertical-align: top }
-
-.CodeRay .line_numbers, .CodeRay .no {
-  background-color: #{numbers_background};
-  color: gray;
-  text-align: right;
-}
-.CodeRay .line_numbers tt { font-weight: bold }
-.CodeRay .line_numbers .highlighted { color: red }
-.CodeRay .line { display: block; float: left; width: 100%; }
-.CodeRay .no { padding: 0px 4px }
-.CodeRay .code { width: 100% }
-
-ol.CodeRay { font-size: 10pt }
-ol.CodeRay li { white-space: pre }
-
-.CodeRay .code pre { overflow: auto }
-    MAIN
-
-    TOKEN_COLORS = <<-'TOKENS'
-.debug { color:white ! important; background:blue ! important; }
-
-.af { color:#00C }
-.an { color:#007 }
-.at { color:#f08 }
-.av { color:#700 }
-.aw { color:#C00 }
-.bi { color:#509; font-weight:bold }
-.c  { color:#888; }
-
-.ch { color:#04D }
-.ch .k { color:#04D }
-.ch .dl { color:#039 }
-
-.cl { color:#B06; font-weight:bold }
-.cm { color:#A08; font-weight:bold }
-.co { color:#036; font-weight:bold }
-.cr { color:#0A0 }
-.cv { color:#369 }
-.de { color:#B0B; }
-.df { color:#099; font-weight:bold }
-.di { color:#088; font-weight:bold }
-.dl { color:black }
-.do { color:#970 }
-.dt { color:#34b }
-.ds { color:#D42; font-weight:bold }
-.e  { color:#666; font-weight:bold }
-.en { color:#800; font-weight:bold }
-.er { color:#F00; background-color:#FAA }
-.ex { color:#C00; font-weight:bold }
-.fl { color:#60E; font-weight:bold }
-.fu { color:#06B; font-weight:bold }
-.gv { color:#d70; font-weight:bold }
-.hx { color:#058; font-weight:bold }
-.i  { color:#00D; font-weight:bold }
-.ic { color:#B44; font-weight:bold }
-
-.il { background: #ddd; color: black }
-.il .il { background: #ccc }
-.il .il .il { background: #bbb }
-.il .idl { background: #ddd; font-weight: bold; color: #666 }
-.idl { background-color: #bbb; font-weight: bold; color: #666; }
-
-.im { color:#f00; }
-.in { color:#B2B; font-weight:bold }
-.iv { color:#33B }
-.la { color:#970; font-weight:bold }
-.lv { color:#963 }
-.oc { color:#40E; font-weight:bold }
-.of { color:#000; font-weight:bold }
-.op { }
-.pc { color:#038; font-weight:bold }
-.pd { color:#369; font-weight:bold }
-.pp { color:#579; }
-.ps { color:#00C; font-weight:bold }
-.pt { color:#074; font-weight:bold }
-.r, .kw  { color:#080; font-weight:bold }
-
-.ke { color: #808; }
-.ke .dl { color: #606; }
-.ke .ch { color: #80f; }
-.vl { color: #088; }
-
-.rx { background-color:#fff0ff }
-.rx .k { color:#808 }
-.rx .dl { color:#404 }
-.rx .mod { color:#C2C }
-.rx .fu  { color:#404; font-weight: bold }
-
-.s { background-color:#fff0f0; color: #D20; }
-.s .s { background-color:#ffe0e0 }
-.s .s  .s { background-color:#ffd0d0 }
-.s .k { }
-.s .ch { color: #b0b; }
-.s .dl { color: #710; }
-
-.sh { background-color:#f0fff0; color:#2B2 }
-.sh .k { }
-.sh .dl { color:#161 }
-
-.sy { color:#A60 }
-.sy .k { color:#A60 }
-.sy .dl { color:#630 }
-
-.ta { color:#070 }
-.tf { color:#070; font-weight:bold }
-.ts { color:#D70; font-weight:bold }
-.ty { color:#339; font-weight:bold }
-.v  { color:#036 }
-.xt { color:#444 }
-
-.ins { background: #afa; }
-.del { background: #faa; }
-.chg { color: #aaf; background: #007; }
-.head { color: #f8f; background: #505 }
-
-.ins .ins { color: #080; font-weight:bold }
-.del .del { color: #800; font-weight:bold }
-.chg .chg { color: #66f; }
-.head .head { color: #f4f; }
-    TOKENS
-
-  end
-
-end
-end
diff --git a/lib/coderay/styles/murphy.rb b/lib/coderay/styles/murphy.rb
deleted file mode 100644
index 8345942..0000000
--- a/lib/coderay/styles/murphy.rb
+++ /dev/null
@@ -1,134 +0,0 @@
-module CodeRay
-module Styles
-
-  class Murphy < Style
-
-    register_for :murphy
-
-    code_background = '#001129'
-    numbers_background = code_background
-    border_color = 'silver'
-    normal_color = '#C0C0C0'
-
-    CSS_MAIN_STYLES = <<-MAIN
-.CodeRay {
-  background-color: #{code_background};
-  border: 1px solid #{border_color};
-  font-family: 'Courier New', 'Terminal', monospace;
-  color: #{normal_color};
-}
-.CodeRay pre { margin: 0px; }
-
-div.CodeRay { }
-
-span.CodeRay { white-space: pre; border: 0px; padding: 2px; }
-
-table.CodeRay { border-collapse: collapse; width: 100%; padding: 2px; }
-table.CodeRay td { padding: 2px 4px; vertical-align: top; }
-
-.CodeRay .line_numbers, .CodeRay .no {
-  background-color: #{numbers_background};
-  color: gray;
-  text-align: right;
-}
-.CodeRay .line_numbers tt { font-weight: bold; }
-.CodeRay .line_numbers .highlighted { color: red }
-.CodeRay .line { display: block; float: left; width: 100%; }
-.CodeRay .no { padding: 0px 4px; }
-.CodeRay .code { width: 100%; }
-
-ol.CodeRay { font-size: 10pt; }
-ol.CodeRay li { white-space: pre; }
-
-.CodeRay .code pre { overflow: auto; }
-    MAIN
-
-    TOKEN_COLORS = <<-'TOKENS'
-.af { color:#00C; }
-.an { color:#007; }
-.av { color:#700; }
-.aw { color:#C00; }
-.bi { color:#509; font-weight:bold; }
-.c  { color:#555; background-color: black; }
-
-.ch { color:#88F; }
-.ch .k { color:#04D; }
-.ch .dl { color:#039; }
-
-.cl { color:#e9e; font-weight:bold; }
-.co { color:#5ED; font-weight:bold; }
-.cr { color:#0A0; }
-.cv { color:#ccf; }
-.df { color:#099; font-weight:bold; }
-.di { color:#088; font-weight:bold; }
-.dl { color:black; }
-.do { color:#970; }
-.ds { color:#D42; font-weight:bold; }
-.e  { color:#666; font-weight:bold; }
-.er { color:#F00; background-color:#FAA; }
-.ex { color:#F00; font-weight:bold; }
-.fl { color:#60E; font-weight:bold; }
-.fu { color:#5ed; font-weight:bold; }
-.gv { color:#f84; }
-.hx { color:#058; font-weight:bold; }
-.i  { color:#66f; font-weight:bold; }
-.ic { color:#B44; font-weight:bold; }
-.il { }
-.in { color:#B2B; font-weight:bold; }
-.iv { color:#aaf; }
-.la { color:#970; font-weight:bold; }
-.lv { color:#963; }
-.oc { color:#40E; font-weight:bold; }
-.of { color:#000; font-weight:bold; }
-.op { }
-.pc { color:#08f; font-weight:bold; }
-.pd { color:#369; font-weight:bold; }
-.pp { color:#579; }
-.pt { color:#66f; font-weight:bold; }
-.r  { color:#5de; font-weight:bold; }
-.r, .kw  { color:#5de; font-weight:bold }
-
-.ke { color: #808; }
-
-.rx { background-color:#221133; }
-.rx .k { color:#f8f; }
-.rx .dl { color:#f0f; }
-.rx .mod { color:#f0b; }
-.rx .fu  { color:#404; font-weight: bold; }
-
-.s  { background-color:#331122; }
-.s  .s { background-color:#ffe0e0; }
-.s  .s  .s { background-color:#ffd0d0; }
-.s  .k { color:#F88; }
-.s  .dl { color:#f55; }
-
-.sh { background-color:#f0fff0; }
-.sh .k { color:#2B2; }
-.sh .dl { color:#161; }
-
-.sy { color:#Fc8; }
-.sy .k { color:#Fc8; }
-.sy .dl { color:#F84; }
-
-.ta { color:#070; }
-.tf { color:#070; font-weight:bold; }
-.ts { color:#D70; font-weight:bold; }
-.ty { color:#339; font-weight:bold; }
-.v  { color:#036; }
-.xt { color:#444; }
-
-.ins { background: #afa; }
-.del { background: #faa; }
-.chg { color: #aaf; background: #007; }
-.head { color: #f8f; background: #505 }
-
-.ins .ins { color: #080; font-weight:bold }
-.del .del { color: #800; font-weight:bold }
-.chg .chg { color: #66f; }
-.head .head { color: #f4f; }
-    TOKENS
-
-  end
-
-end
-end
diff --git a/lib/coderay/token_classes.rb b/lib/coderay/token_classes.rb
deleted file mode 100755
index ae35c0f..0000000
--- a/lib/coderay/token_classes.rb
+++ /dev/null
@@ -1,86 +0,0 @@
-module CodeRay
-  class Tokens
-    ClassOfKind = Hash.new do |h, k|
-      h[k] = k.to_s
-    end
-    ClassOfKind.update with = {
-      :annotation => 'at',
-      :attribute_name => 'an',
-      :attribute_name_fat => 'af',
-      :attribute_value => 'av',
-      :attribute_value_fat => 'aw',
-      :bin => 'bi',
-      :char => 'ch',
-      :class => 'cl',
-      :class_variable => 'cv',
-      :color => 'cr',
-      :comment => 'c',
-      :complex => 'cm',
-      :constant => 'co',
-      :content => 'k',
-      :decorator => 'de',
-      :definition => 'df',
-      :delimiter => 'dl',
-      :directive => 'di',
-      :doc => 'do',
-      :doctype => 'dt',
-      :doc_string => 'ds',
-      :entity => 'en',
-      :error => 'er',
-      :escape => 'e',
-      :exception => 'ex',
-      :float => 'fl',
-      :function => 'fu',
-      :global_variable => 'gv',
-      :hex => 'hx',
-      :imaginary => 'cm',
-      :important => 'im',
-      :include => 'ic',
-      :inline => 'il',
-      :inline_delimiter => 'idl',
-      :instance_variable => 'iv',
-      :integer => 'i',
-      :interpreted => 'in',
-      :keyword => 'kw',
-      :key => 'ke',
-      :label => 'la',
-      :local_variable => 'lv',
-      :modifier => 'mod',
-      :oct => 'oc',
-      :operator_fat => 'of',
-      :pre_constant => 'pc',
-      :pre_type => 'pt',
-      :predefined => 'pd',
-      :preprocessor => 'pp',
-      :pseudo_class => 'ps',
-      :regexp => 'rx',
-      :reserved => 'r',
-      :shell => 'sh',
-      :string => 's',
-      :symbol => 'sy',
-      :tag => 'ta',
-      :tag_fat => 'tf',
-      :tag_special => 'ts',
-      :type => 'ty',
-      :variable => 'v',
-      :value => 'vl',
-      :xml_text => 'xt',
-      
-      :insert => 'ins',
-      :delete => 'del',
-      :change => 'chg',
-      :head => 'head',
-
-      :ident => :NO_HIGHLIGHT, # 'id'
-      #:operator => 'op',
-      :operator => :NO_HIGHLIGHT,  # 'op'
-      :space => :NO_HIGHLIGHT,  # 'sp'
-      :plain => :NO_HIGHLIGHT,
-    }
-    ClassOfKind[:method] = ClassOfKind[:function]
-    ClassOfKind[:open] = ClassOfKind[:close] = ClassOfKind[:delimiter]
-    ClassOfKind[:nesting_delimiter] = ClassOfKind[:delimiter]
-    ClassOfKind[:escape] = ClassOfKind[:delimiter]
-    #ClassOfKind.default = ClassOfKind[:error] or raise 'no class found for :error!'
-  end
-end
\ No newline at end of file
diff --git a/lib/coderay/token_kinds.rb b/lib/coderay/token_kinds.rb
new file mode 100755
index 0000000..3b8d07e
--- /dev/null
+++ b/lib/coderay/token_kinds.rb
@@ -0,0 +1,90 @@
+module CodeRay
+  
+  # A Hash of all known token kinds and their associated CSS classes.
+  TokenKinds = Hash.new do |h, k|
+    warn 'Undefined Token kind: %p' % [k] if $CODERAY_DEBUG
+    false
+  end
+  
+  # speedup
+  TokenKinds.compare_by_identity if TokenKinds.respond_to? :compare_by_identity
+  
+  TokenKinds.update(  # :nodoc:
+    :annotation          => 'annotation',
+    :attribute_name      => 'attribute-name',
+    :attribute_value     => 'attribute-value',
+    :binary              => 'bin',
+    :char                => 'char',
+    :class               => 'class',
+    :class_variable      => 'class-variable',
+    :color               => 'color',
+    :comment             => 'comment',
+    :complex             => 'complex',
+    :constant            => 'constant',
+    :content             => 'content',
+    :debug               => 'debug',
+    :decorator           => 'decorator',
+    :definition          => 'definition',
+    :delimiter           => 'delimiter',
+    :directive           => 'directive',
+    :doc                 => 'doc',
+    :doctype             => 'doctype',
+    :doc_string          => 'doc-string',
+    :entity              => 'entity',
+    :error               => 'error',
+    :escape              => 'escape',
+    :exception           => 'exception',
+    :filename            => 'filename',
+    :float               => 'float',
+    :function            => 'function',
+    :global_variable     => 'global-variable',
+    :hex                 => 'hex',
+    :imaginary           => 'imaginary',
+    :important           => 'important',
+    :include             => 'include',
+    :inline              => 'inline',
+    :inline_delimiter    => 'inline-delimiter',
+    :instance_variable   => 'instance-variable',
+    :integer             => 'integer',
+    :key                 => 'key',
+    :keyword             => 'keyword',
+    :label               => 'label',
+    :local_variable      => 'local-variable',
+    :modifier            => 'modifier',
+    :namespace           => 'namespace',
+    :octal               => 'octal',
+    :predefined          => 'predefined',
+    :predefined_constant => 'predefined-constant',
+    :predefined_type     => 'predefined-type',
+    :preprocessor        => 'preprocessor',
+    :pseudo_class        => 'pseudo-class',
+    :regexp              => 'regexp',
+    :reserved            => 'reserved',
+    :shell               => 'shell',
+    :string              => 'string',
+    :symbol              => 'symbol',
+    :tag                 => 'tag',
+    :type                => 'type',
+    :value               => 'value',
+    :variable            => 'variable',
+    
+    :change              => 'change',
+    :delete              => 'delete',
+    :head                => 'head',
+    :insert              => 'insert',
+    
+    :eyecatcher          => 'eyecatcher',
+    
+    :ident               => false,
+    :operator            => false,
+    
+    :space               => false,
+    :plain               => false
+  )
+  
+  TokenKinds[:method]    = TokenKinds[:function]
+  TokenKinds[:escape]    = TokenKinds[:delimiter]
+  TokenKinds[:docstring] = TokenKinds[:comment]
+  
+  TokenKinds.freeze
+end
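
Note on the TokenKinds hunk above: the table maps each token kind to a CSS class name, or to false for kinds that get no class. The functional test later in this diff expects an unknown kind to warn with 'Undefined Token kind: ...' and return false. A minimal sketch of that behaviour, assuming a Hash with a default block; SketchTokenKinds is a hypothetical name, and this is not the library's own definition:

```ruby
# Hypothetical stand-in for CodeRay::TokenKinds; not the library's code.
# Assumption: unknown kinds emit a warning and map to false.
SketchTokenKinds = Hash.new do |_hash, kind|
  warn 'Undefined Token kind: %p' % [kind]
  false
end

SketchTokenKinds.update(
  :comment   => 'comment',
  :delimiter => 'delimiter',
  :function  => 'function',
  :space     => false
)

# Aliases reuse the CSS class of an existing kind.
SketchTokenKinds[:method]    = SketchTokenKinds[:function]
SketchTokenKinds[:docstring] = SketchTokenKinds[:comment]

# Freezing is safe here because the default block never stores anything.
SketchTokenKinds.freeze
```

Because the default block returns false without assigning, lookups of unknown kinds still work on the frozen table.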
diff --git a/lib/coderay/tokens.rb b/lib/coderay/tokens.rb
index 6ac5f44..c747017 100644
--- a/lib/coderay/tokens.rb
+++ b/lib/coderay/tokens.rb
@@ -1,6 +1,9 @@
 module CodeRay
-
-  # = Tokens
+  
+  # GZip library for writing and reading token dumps.
+  autoload :GZip, coderay_path('helpers', 'gzip')
+  
+  # = Tokens  TODO: Rewrite!
   #
   # The Tokens class represents a list of tokens returnd from
   # a Scanner.
@@ -8,7 +11,7 @@ module CodeRay
   # A token is not a special object, just a two-element Array
   # consisting of
   # * the _token_ _text_ (the original source of the token in a String) or
-  #   a _token_ _action_ (:open, :close, :begin_line, :end_line)
+  #   a _token_ _action_ (begin_group, end_group, begin_line, end_line)
   # * the _token_ _kind_ (a Symbol representing the type of the token)
   #
   # A token looks like this:
@@ -18,16 +21,16 @@ module CodeRay
   #   ['$^', :error]
   #
   # Some scanners also yield sub-tokens, represented by special
-  # token actions, namely :open and :close.
+  # token actions, namely begin_group and end_group.
   #
   # The Ruby scanner, for example, splits "a string" into:
   #
   #  [
-  #   [:open, :string],
+  #   [:begin_group, :string],
   #   ['"', :delimiter],
   #   ['a string', :content],
   #   ['"', :delimiter],
-  #   [:close, :string]
+  #   [:end_group, :string]
   #  ]
   #
   # Tokens is the interface between Scanners and Encoders:
@@ -47,46 +50,11 @@ module CodeRay
   # 
   # It also allows you to generate tokens directly (without using a scanner),
   # to load them from a file, and still use any Encoder that CodeRay provides.
-  #
-  # Tokens' subclass TokenStream allows streaming to save memory.
   class Tokens < Array
     
     # The Scanner instance that created the tokens.
     attr_accessor :scanner
     
-    # Whether the object is a TokenStream.
-    #
-    # Returns false.
-    def stream?
-      false
-    end
-
-    # Iterates over all tokens.
-    #
-    # If a filter is given, only tokens of that kind are yielded.
-    def each kind_filter = nil, &block
-      unless kind_filter
-        super(&block)
-      else
-        super() do |text, kind|
-          next unless kind == kind_filter
-          yield text, kind
-        end
-      end
-    end
-
-    # Iterates over all text tokens.
-    # Range tokens like [:open, :string] are left out.
-    #
-    # Example:
-    #   tokens.each_text_token { |text, kind| text.replace html_escape(text) }
-    def each_text_token
-      each do |text, kind|
-        next unless text.is_a? ::String
-        yield text, kind
-      end
-    end
-
     # Encode the tokens using encoder.
     #
     # encoder can be
@@ -96,120 +64,98 @@ module CodeRay
     #
     # options are passed to the encoder.
     def encode encoder, options = {}
-      unless encoder.is_a? Encoders::Encoder
-        unless encoder.is_a? Class
-          encoder_class = Encoders[encoder]
-        end
-        encoder = encoder_class.new options
-      end
+      encoder = Encoders[encoder].new options if encoder.respond_to? :to_sym
       encoder.encode_tokens self, options
     end
-
-
-    # Turn into a string using Encoders::Text.
-    #
-    # +options+ are passed to the encoder if given.
-    def to_s options = {}
-      encode :text, options
+    
+    # Turn tokens into a string by concatenating them.
+    def to_s
+      encode CodeRay::Encoders::Encoder.new
     end
-
+    
     # Redirects unknown methods to encoder calls.
     #
     # For example, if you call +tokens.html+, the HTML encoder
     # is used to highlight the tokens.
     def method_missing meth, options = {}
-      Encoders[meth].new(options).encode_tokens self
-    end
-
-    # Returns the tokens compressed by joining consecutive
-    # tokens of the same kind.
-    #
-    # This can not be undone, but should yield the same output
-    # in most Encoders.  It basically makes the output smaller.
-    #
-    # Combined with dump, it saves space for the cost of time.
-    #
-    # If the scanner is written carefully, this is not required -
-    # for example, consecutive //-comment lines could already be
-    # joined in one comment token by the Scanner.
-    def optimize
-      last_kind = last_text = nil
-      new = self.class.new
-      for text, kind in self
-        if text.is_a? String
-          if kind == last_kind
-            last_text << text
-          else
-            new << [last_text, last_kind] if last_kind
-            last_text = text
-            last_kind = kind
-          end
-        else
-          new << [last_text, last_kind] if last_kind
-          last_kind = last_text = nil
-          new << [text, kind]
-        end
-      end
-      new << [last_text, last_kind] if last_kind
-      new
-    end
-
-    # Compact the object itself; see optimize.
-    def optimize!
-      replace optimize
+      encode meth, options
+    rescue PluginHost::PluginNotFound
+      super
     end
     
-    # Ensure that all :open tokens have a correspondent :close one.
-    #
-    # TODO: Test this!
-    def fix
-      tokens = self.class.new
-      # Check token nesting using a stack of kinds.
+    # Split the tokens into parts of the given +sizes+.
+    # 
+    # The result will be an Array of Tokens objects. The parts have
+    # the text size specified by the parameter. In addition, each
+    # part closes all opened tokens. This is useful to insert tokens
+    # between them.
+    # 
+    # This method is used by @Scanner#tokenize@ when called with an Array
+    # of source strings. The Diff encoder uses it for inline highlighting.
+    def split_into_parts *sizes
+      parts = []
       opened = []
-      for type, kind in self
-        case type
-        when :open
-          opened.push [:close, kind]
-        when :begin_line
-          opened.push [:end_line, kind]
-        when :close, :end_line
-          expected = opened.pop
-          if [type, kind] != expected
-            # Unexpected :close; decide what to do based on the kind:
-            # - token was never opened: delete the :close (just skip it)
-            next unless opened.rindex expected
-            # - token was opened earlier: also close tokens in between
-            tokens << token until (token = opened.pop) == expected
+      content = nil
+      part = Tokens.new
+      part_size = 0
+      size = sizes.first
+      i = 0
+      for item in self
+        case content
+        when nil
+          content = item
+        when String
+          if size && part_size + content.size > size  # token must be cut
+            if part_size < size  # some part of the token goes into this part
+              content = content.dup  # content may not be safe to change
+              part << content.slice!(0, size - part_size) << item
+            end
+            # close all open groups and lines...
+            closing = opened.reverse.flatten.map do |content_or_kind|
+              case content_or_kind
+              when :begin_group
+                :end_group
+              when :begin_line
+                :end_line
+              else
+                content_or_kind
+              end
+            end
+            part.concat closing
+            begin
+              parts << part
+              part = Tokens.new
+              size = sizes[i += 1]
+            end until size.nil? || size > 0
+            # ...and open them again.
+            part.concat opened.flatten
+            part_size = 0
+            redo unless content.empty?
+          else
+            part << content << item
+            part_size += content.size
           end
+          content = nil
+        when Symbol
+          case content
+          when :begin_group, :begin_line
+            opened << [content, item]
+          when :end_group, :end_line
+            opened.pop
+          else
+            raise ArgumentError, 'Unknown token action: %p, kind = %p' % [content, item]
+          end
+          part << content << item
+          content = nil
+        else
+          raise ArgumentError, 'Token input junk: %p, kind = %p' % [content, item]
         end
-        tokens << [type, kind]
       end
-      # Close remaining opened tokens
-      tokens << token while token = opened.pop
-      tokens
+      parts << part
+      parts << Tokens.new while parts.size < sizes.size
+      parts
     end
     
-    def fix!
-      replace fix
-    end
-    
-    # TODO: Scanner#split_into_lines
-    # 
-    # Makes sure that:
-    # - newlines are single tokens
-    #   (which means all other token are single-line)
-    # - there are no open tokens at the end the line
-    #
-    # This makes it simple for encoders that work line-oriented,
-    # like HTML with list-style numeration.
-    def split_into_lines
-      raise NotImplementedError
-    end
-
-    def split_into_lines!
-      replace split_into_lines
-    end
-
     # Dumps the object into a String that can be saved
     # in files or databases.
     #
@@ -226,28 +172,16 @@ module CodeRay
     #
     # See GZip module.
     def dump gzip_level = 7
-      require 'coderay/helpers/gzip_simple'
       dump = Marshal.dump self
-      dump = dump.gzip gzip_level
+      dump = GZip.gzip dump, gzip_level
       dump.extend Undumping
     end
-
-    # The total size of the tokens.
-    # Should be equal to the input size before
-    # scanning.
-    def text_size
-      size = 0
-      each_text_token do |t, k|
-        size + t.size
-      end
-      size
-    end
-
-    # Return all text tokens joined into a single string.
-    def text
-      map { |t, k| t if t.is_a? ::String }.join
+    
+    # Return the actual number of tokens.
+    def count
+      size / 2
     end
-
+    
     # Include this module to give an object an #undump
     # method.
     #
@@ -258,133 +192,24 @@ module CodeRay
         Tokens.load self
       end
     end
-
+    
     # Undump the object using Marshal.load, then
     # unzip it using GZip.gunzip.
     #
     # The result is commonly a Tokens object, but
     # this is not guaranteed.
     def Tokens.load dump
-      require 'coderay/helpers/gzip_simple'
-      dump = dump.gunzip
+      dump = GZip.gunzip dump
       @dump = Marshal.load dump
     end
-
-  end
-
-
-  # = TokenStream
-  #
-  # The TokenStream class is a fake Array without elements.
-  #
-  # It redirects the method << to a block given at creation.
-  #
-  # This allows scanners and Encoders to use streaming (no
-  # tokens are saved, the input is highlighted the same time it
-  # is scanned) with the same code.
-  #
-  # See CodeRay.encode_stream and CodeRay.scan_stream
-  class TokenStream < Tokens
-
-    # Whether the object is a TokenStream.
-    #
-    # Returns true.
-    def stream?
-      true
-    end
-
-    # The Array is empty, but size counts the tokens given by <<.
-    attr_reader :size
-
-    # Creates a new TokenStream that calls +block+ whenever
-    # its << method is called.
-    #
-    # Example:
-    #
-    #   require 'coderay'
-    #   
-    #   token_stream = CodeRay::TokenStream.new do |text, kind|
-    #     puts 'kind: %s, text size: %d.' % [kind, text.size]
-    #   end
-    #   
-    #   token_stream << ['/\d+/', :regexp]
-    #   #-> kind: rexpexp, text size: 5.
-    #
-    def initialize &block
-      raise ArgumentError, 'Block expected for streaming.' unless block
-      @callback = block
-      @size = 0
-    end
-
-    # Calls +block+ with +token+ and increments size.
-    #
-    # Returns self.
-    def << token
-      @callback.call(*token)
-      @size += 1
-      self
-    end
-
-    # This method is not implemented due to speed reasons. Use Tokens.
-    def text_size
-      raise NotImplementedError,
-        'This method is not implemented due to speed reasons.'
-    end
-
-    # A TokenStream cannot be dumped. Use Tokens.
-    def dump
-      raise NotImplementedError, 'A TokenStream cannot be dumped.'
-    end
-
-    # A TokenStream cannot be optimized. Use Tokens.
-    def optimize
-      raise NotImplementedError, 'A TokenStream cannot be optimized.'
-    end
-
-  end
-
-end
-
-if $0 == __FILE__
-  $VERBOSE = true
-  $: << File.join(File.dirname(__FILE__), '..')
-  eval DATA.read, nil, $0, __LINE__ + 4
-end
-
-__END__
-require 'test/unit'
-
-class TokensTest < Test::Unit::TestCase
-  
-  def test_creation
-    assert CodeRay::Tokens < Array
-    tokens = nil
-    assert_nothing_raised do
-      tokens = CodeRay::Tokens.new
-    end
-    assert_kind_of Array, tokens
-  end
-  
-  def test_adding_tokens
-    tokens = CodeRay::Tokens.new
-    assert_nothing_raised do
-      tokens << ['string', :type]
-      tokens << ['()', :operator]
-    end
-    assert_equal tokens.size, 2
-  end
-  
-  def test_dump_undump
-    tokens = CodeRay::Tokens.new
-    assert_nothing_raised do
-      tokens << ['string', :type]
-      tokens << ['()', :operator]
-    end
-    tokens2 = nil
-    assert_nothing_raised do
-      tokens2 = tokens.dump.undump
-    end
-    assert_equal tokens, tokens2
+    
+    alias text_token push
+    def begin_group kind; push :begin_group, kind end
+    def end_group kind; push :end_group, kind end
+    def begin_line kind; push :begin_line, kind end
+    def end_line kind; push :end_line, kind end
+    alias tokens concat
+    
   end
   
-end
\ No newline at end of file
+end
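
Note on the tokens.rb hunks above: the new helpers push two entries per token (content or action, then kind), so the Array is flat and count is size / 2. A minimal sketch of that layout, assuming nothing beyond what the diff shows; SketchTokens is a hypothetical class and does not require CodeRay:

```ruby
# Hypothetical stand-in for CodeRay::Tokens: a flat Array of alternating
# content (String or action Symbol) and kind entries.
class SketchTokens < Array
  alias text_token push
  def begin_group(kind); push :begin_group, kind end
  def end_group(kind);   push :end_group, kind end

  # Two Array entries make up one token.
  def count
    size / 2
  end
end

tokens = SketchTokens.new
tokens.begin_group :string
tokens.text_token '"', :delimiter
tokens.text_token 'a string', :content
tokens.text_token '"', :delimiter
tokens.end_group :string

# Walk the flat list pairwise to recover [content, kind] tokens.
pairs = tokens.each_slice(2).to_a

# Only String contents carry source text; action Symbols are skipped.
text = pairs.map { |content, _kind| content if content.is_a?(String) }.join
```

Code that iterated over two-element Arrays under the old layout would need a pairwise walk such as each_slice(2) under the flat one.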
diff --git a/lib/coderay/tokens_proxy.rb b/lib/coderay/tokens_proxy.rb
new file mode 100644
index 0000000..31ff39b
--- /dev/null
+++ b/lib/coderay/tokens_proxy.rb
@@ -0,0 +1,55 @@
+module CodeRay
+  
+  # The result of a scan operation is a TokensProxy, but should act like Tokens.
+  # 
+  # This proxy makes it possible to use the classic CodeRay.scan.encode API
+  # while still providing the benefits of direct streaming.
+  class TokensProxy
+    
+    attr_accessor :input, :lang, :options, :block
+    
+    # Create a new TokensProxy with the arguments of CodeRay.scan.
+    def initialize input, lang, options = {}, block = nil
+      @input   = input
+      @lang    = lang
+      @options = options
+      @block   = block
+    end
+    
+    # Call CodeRay.encode if +encoder+ is a Symbol;
+    # otherwise, convert the receiver to tokens and call encoder.encode_tokens.
+    def encode encoder, options = {}
+      if encoder.respond_to? :to_sym
+        CodeRay.encode(input, lang, encoder, options)
+      else
+        encoder.encode_tokens tokens, options
+      end
+    end
+    
+    # Tries to call encode;
+    # delegates to tokens otherwise.
+    def method_missing method, *args, &blk
+      encode method.to_sym, *args
+    rescue PluginHost::PluginNotFound
+      tokens.send(method, *args, &blk)
+    end
+    
+    # The (cached) result of the tokenized input; a Tokens instance.
+    def tokens
+      @tokens ||= scanner.tokenize(input)
+    end
+    
+    # A (cached) scanner instance to use for the scan task.
+    def scanner
+      @scanner ||= CodeRay.scanner(lang, options, &block)
+    end
+    
+    # Delegate each to the tokens; returns the proxy itself.
+    def each *args, &blk
+      tokens.each(*args, &blk)
+      self
+    end
+    
+  end
+  
+end
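
Note on tokens_proxy.rb above: the proxy tokenizes only on first use, tries an encoder first in method_missing, and otherwise delegates to the cached token list. A minimal sketch of that pattern without CodeRay; LazyResult, KNOWN_ENCODERS and the :upcase_text encoder are hypothetical names invented for this illustration:

```ruby
# Hypothetical stand-in for the proxy idea; all names are illustrative.
class LazyResult
  KNOWN_ENCODERS = [:upcase_text]  # assumed registry for this sketch

  attr_reader :scans

  def initialize(input)
    @input = input
    @scans = 0
  end

  # The expensive step runs once, on first use, and is cached.
  def tokens
    @tokens ||= begin
      @scans += 1
      @input.split(/(\s+)/)
    end
  end

  # Symbol-named encoders work directly on the input, skipping the token list.
  def encode(encoder)
    unless KNOWN_ENCODERS.include?(encoder)
      raise ArgumentError, "unknown encoder #{encoder}"
    end
    @input.upcase
  end

  # Try an encoder first; otherwise delegate to the token list.
  def method_missing(name, *args, &blk)
    return encode(name) if KNOWN_ENCODERS.include?(name)
    tokens.send(name, *args, &blk)
  end

  def respond_to_missing?(name, include_private = false)
    KNOWN_ENCODERS.include?(name) || Array.method_defined?(name) || super
  end
end

result = LazyResult.new('puts "hi"')
```

Calling result.upcase_text never builds the token list, while result.size or result.first triggers one scan and reuses it afterwards.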
diff --git a/lib/coderay/version.rb b/lib/coderay/version.rb
new file mode 100644
index 0000000..e2797b5
--- /dev/null
+++ b/lib/coderay/version.rb
@@ -0,0 +1,3 @@
+module CodeRay
+  VERSION = '1.0.5'
+end
diff --git a/metadata.yml b/metadata.yml
index a7b531a..50a9dd1 100644
--- a/metadata.yml
+++ b/metadata.yml
@@ -1,117 +1,113 @@
 --- !ruby/object:Gem::Specification 
 name: coderay
 version: !ruby/object:Gem::Version 
-  hash: 43
+  hash: 29
   prerelease: 
   segments: 
+  - 1
   - 0
-  - 9
-  - 8
-  version: 0.9.8
+  - 5
+  version: 1.0.5
 platform: ruby
 authors: 
-- murphy
+- Kornelius Kalnbach
 autorequire: 
 bindir: bin
 cert_chain: []
 
-date: 2011-05-01 00:00:00 Z
+date: 2011-12-28 00:00:00 +01:00
+default_executable: 
 dependencies: []
 
-description: |
-  Fast and easy syntax highlighting for selected languages, written in Ruby.
-  Comes with RedCloth integration and LOC counter.
-
-email: murphy at rubychan.de
+description: Fast and easy syntax highlighting for selected languages, written in Ruby. Comes with RedCloth integration and LOC counter.
+email: 
+- murphy at rubychan.de
 executables: 
 - coderay
-- coderay_stylesheet
 extensions: []
 
 extra_rdoc_files: 
-- lib/README
-- FOLDERS
+- README_INDEX.rdoc
 files: 
-- ./lib/coderay/duo.rb
-- ./lib/coderay/encoder.rb
-- ./lib/coderay/encoders/_map.rb
-- ./lib/coderay/encoders/comment_filter.rb
-- ./lib/coderay/encoders/count.rb
-- ./lib/coderay/encoders/debug.rb
-- ./lib/coderay/encoders/div.rb
-- ./lib/coderay/encoders/filter.rb
-- ./lib/coderay/encoders/html/css.rb
-- ./lib/coderay/encoders/html/numerization.rb
-- ./lib/coderay/encoders/html/output.rb
-- ./lib/coderay/encoders/html.rb
-- ./lib/coderay/encoders/json.rb
-- ./lib/coderay/encoders/lines_of_code.rb
-- ./lib/coderay/encoders/null.rb
-- ./lib/coderay/encoders/page.rb
-- ./lib/coderay/encoders/span.rb
-- ./lib/coderay/encoders/statistic.rb
-- ./lib/coderay/encoders/term.rb
-- ./lib/coderay/encoders/text.rb
-- ./lib/coderay/encoders/token_class_filter.rb
-- ./lib/coderay/encoders/xml.rb
-- ./lib/coderay/encoders/yaml.rb
-- ./lib/coderay/for_redcloth.rb
-- ./lib/coderay/helpers/file_type.rb
-- ./lib/coderay/helpers/gzip_simple.rb
-- ./lib/coderay/helpers/plugin.rb
-- ./lib/coderay/helpers/word_list.rb
-- ./lib/coderay/scanner.rb
-- ./lib/coderay/scanners/_map.rb
-- ./lib/coderay/scanners/c.rb
-- ./lib/coderay/scanners/cpp.rb
-- ./lib/coderay/scanners/css.rb
-- ./lib/coderay/scanners/debug.rb
-- ./lib/coderay/scanners/delphi.rb
-- ./lib/coderay/scanners/diff.rb
-- ./lib/coderay/scanners/groovy.rb
-- ./lib/coderay/scanners/html.rb
-- ./lib/coderay/scanners/java/builtin_types.rb
-- ./lib/coderay/scanners/java.rb
-- ./lib/coderay/scanners/java_script.rb
-- ./lib/coderay/scanners/json.rb
-- ./lib/coderay/scanners/nitro_xhtml.rb
-- ./lib/coderay/scanners/php.rb
-- ./lib/coderay/scanners/plaintext.rb
-- ./lib/coderay/scanners/python.rb
-- ./lib/coderay/scanners/rhtml.rb
-- ./lib/coderay/scanners/ruby/patterns.rb
-- ./lib/coderay/scanners/ruby.rb
-- ./lib/coderay/scanners/scheme.rb
-- ./lib/coderay/scanners/sql.rb
-- ./lib/coderay/scanners/xml.rb
-- ./lib/coderay/scanners/yaml.rb
-- ./lib/coderay/style.rb
-- ./lib/coderay/styles/_map.rb
-- ./lib/coderay/styles/cycnus.rb
-- ./lib/coderay/styles/murphy.rb
-- ./lib/coderay/token_classes.rb
-- ./lib/coderay/tokens.rb
-- ./lib/coderay.rb
-- ./Rakefile
-- ./test/functional/basic.rb
-- ./test/functional/for_redcloth.rb
-- ./test/functional/load_plugin_scanner.rb
-- ./test/functional/suite.rb
-- ./test/functional/vhdl.rb
-- ./test/functional/word_list.rb
-- ./lib/README
-- ./LICENSE
-- lib/README
-- FOLDERS
+- LICENSE
+- README_INDEX.rdoc
+- Rakefile
+- lib/coderay.rb
+- lib/coderay/duo.rb
+- lib/coderay/encoder.rb
+- lib/coderay/encoders/_map.rb
+- lib/coderay/encoders/comment_filter.rb
+- lib/coderay/encoders/count.rb
+- lib/coderay/encoders/debug.rb
+- lib/coderay/encoders/div.rb
+- lib/coderay/encoders/filter.rb
+- lib/coderay/encoders/html.rb
+- lib/coderay/encoders/html/css.rb
+- lib/coderay/encoders/html/numbering.rb
+- lib/coderay/encoders/html/output.rb
+- lib/coderay/encoders/json.rb
+- lib/coderay/encoders/lines_of_code.rb
+- lib/coderay/encoders/null.rb
+- lib/coderay/encoders/page.rb
+- lib/coderay/encoders/span.rb
+- lib/coderay/encoders/statistic.rb
+- lib/coderay/encoders/terminal.rb
+- lib/coderay/encoders/text.rb
+- lib/coderay/encoders/token_kind_filter.rb
+- lib/coderay/encoders/xml.rb
+- lib/coderay/encoders/yaml.rb
+- lib/coderay/for_redcloth.rb
+- lib/coderay/helpers/file_type.rb
+- lib/coderay/helpers/gzip.rb
+- lib/coderay/helpers/plugin.rb
+- lib/coderay/helpers/word_list.rb
+- lib/coderay/scanner.rb
+- lib/coderay/scanners/_map.rb
+- lib/coderay/scanners/c.rb
+- lib/coderay/scanners/clojure.rb
+- lib/coderay/scanners/cpp.rb
+- lib/coderay/scanners/css.rb
+- lib/coderay/scanners/debug.rb
+- lib/coderay/scanners/delphi.rb
+- lib/coderay/scanners/diff.rb
+- lib/coderay/scanners/erb.rb
+- lib/coderay/scanners/groovy.rb
+- lib/coderay/scanners/haml.rb
+- lib/coderay/scanners/html.rb
+- lib/coderay/scanners/java.rb
+- lib/coderay/scanners/java/builtin_types.rb
+- lib/coderay/scanners/java_script.rb
+- lib/coderay/scanners/json.rb
+- lib/coderay/scanners/php.rb
+- lib/coderay/scanners/python.rb
+- lib/coderay/scanners/raydebug.rb
+- lib/coderay/scanners/ruby.rb
+- lib/coderay/scanners/ruby/patterns.rb
+- lib/coderay/scanners/ruby/string_state.rb
+- lib/coderay/scanners/sql.rb
+- lib/coderay/scanners/text.rb
+- lib/coderay/scanners/xml.rb
+- lib/coderay/scanners/yaml.rb
+- lib/coderay/style.rb
+- lib/coderay/styles/_map.rb
+- lib/coderay/styles/alpha.rb
+- lib/coderay/token_kinds.rb
+- lib/coderay/tokens.rb
+- lib/coderay/tokens_proxy.rb
+- lib/coderay/version.rb
+- test/functional/basic.rb
+- test/functional/examples.rb
+- test/functional/for_redcloth.rb
+- test/functional/suite.rb
 - bin/coderay
-- bin/coderay_stylesheet
+has_rdoc: true
 homepage: http://coderay.rubychan.de
 licenses: []
 
 post_install_message: 
 rdoc_options: 
 - -SNw2
-- -mlib/README
+- -mREADME_INDEX.rdoc
 - -t CodeRay Documentation
 require_paths: 
 - lib
@@ -120,12 +116,12 @@ required_ruby_version: !ruby/object:Gem::Requirement
   requirements: 
   - - ">="
     - !ruby/object:Gem::Version 
-      hash: 51
+      hash: 59
       segments: 
       - 1
       - 8
-      - 2
-      version: 1.8.2
+      - 6
+      version: 1.8.6
 required_rubygems_version: !ruby/object:Gem::Requirement 
   none: false
   requirements: 
@@ -138,9 +134,12 @@ required_rubygems_version: !ruby/object:Gem::Requirement
 requirements: []
 
 rubyforge_project: coderay
-rubygems_version: 1.7.2
+rubygems_version: 1.6.2
 signing_key: 
 specification_version: 3
 summary: Fast syntax highlighting for selected languages.
 test_files: 
-- ./test/functional/suite.rb
+- test/functional/basic.rb
+- test/functional/examples.rb
+- test/functional/for_redcloth.rb
+- test/functional/suite.rb
diff --git a/test/functional/basic.rb b/test/functional/basic.rb
index 8ecd3d3..3053b54 100755
--- a/test/functional/basic.rb
+++ b/test/functional/basic.rb
@@ -1,11 +1,31 @@
+# encoding: utf-8
 require 'test/unit'
+require File.expand_path('../../lib/assert_warning', __FILE__)
+
+$:.unshift File.expand_path('../../../lib', __FILE__)
 require 'coderay'
 
 class BasicTest < Test::Unit::TestCase
   
   def test_version
     assert_nothing_raised do
-      assert_match(/\A\d\.\d\.\d\z/, CodeRay::VERSION)
+      assert_match(/\A\d\.\d\.\d?\z/, CodeRay::VERSION)
+    end
+  end
+  
+  def with_empty_load_path
+    old_load_path = $:.dup
+    $:.clear
+    yield
+  ensure
+    $:.replace old_load_path
+  end
+  
+  def test_autoload
+    with_empty_load_path do
+      assert_nothing_raised do
+        CodeRay::Scanners::Java::BuiltinTypes
+      end
     end
   end
   
@@ -14,36 +34,60 @@ class BasicTest < Test::Unit::TestCase
   RUBY_TEST_TOKENS = [
     ['puts', :ident],
     [' ', :space],
-    [:open, :string],
+    [:begin_group, :string],
       ['"', :delimiter],
       ['Hello, World!', :content],
       ['"', :delimiter],
-    [:close, :string]
-  ]
+    [:end_group, :string]
+  ].flatten
   def test_simple_scan
     assert_nothing_raised do
-      assert_equal RUBY_TEST_TOKENS, CodeRay.scan(RUBY_TEST_CODE, :ruby).to_ary
+      assert_equal RUBY_TEST_TOKENS, CodeRay.scan(RUBY_TEST_CODE, :ruby).tokens
     end
   end
   
-  RUBY_TEST_HTML = 'puts <span class="s"><span class="dl">"</span>' + 
-    '<span class="k">Hello, World!</span><span class="dl">"</span></span>'
+  RUBY_TEST_HTML = 'puts <span class="string"><span class="delimiter">"</span>' + 
+    '<span class="content">Hello, World!</span><span class="delimiter">"</span></span>'
   def test_simple_highlight
     assert_nothing_raised do
       assert_equal RUBY_TEST_HTML, CodeRay.scan(RUBY_TEST_CODE, :ruby).html
     end
   end
   
+  def test_scan_file
+    CodeRay.scan_file __FILE__
+  end
+  
+  def test_encode
+    assert_equal 1, CodeRay.encode('test', :python, :count)
+  end
+  
+  def test_encode_tokens
+    assert_equal 1, CodeRay.encode_tokens(CodeRay::Tokens['test', :string], :count)
+  end
+  
+  def test_encode_file
+    assert_equal File.read(__FILE__), CodeRay.encode_file(__FILE__, :text)
+  end
+  
+  def test_highlight
+    assert_match '<pre>test</pre>', CodeRay.highlight('test', :python)
+  end
+  
+  def test_highlight_file
+    assert_match "require <span class=\"string\"><span class=\"delimiter\">'</span><span class=\"content\">test/unit</span><span class=\"delimiter\">'</span></span>\n", CodeRay.highlight_file(__FILE__)
+  end
+  
   def test_duo
     assert_equal(RUBY_TEST_CODE,
-      CodeRay::Duo[:plain, :plain].highlight(RUBY_TEST_CODE))
+      CodeRay::Duo[:plain, :text].highlight(RUBY_TEST_CODE))
     assert_equal(RUBY_TEST_CODE,
-      CodeRay::Duo[:plain => :plain].highlight(RUBY_TEST_CODE))
+      CodeRay::Duo[:plain => :text].highlight(RUBY_TEST_CODE))
   end
   
   def test_duo_stream
     assert_equal(RUBY_TEST_CODE,
-      CodeRay::Duo[:plain, :plain].highlight(RUBY_TEST_CODE, :stream => true))
+      CodeRay::Duo[:plain, :text].highlight(RUBY_TEST_CODE, :stream => true))
   end
   
   def test_comment_filter
@@ -98,25 +142,179 @@ more code  # and another comment, in-line.
     assert_equal 0, CodeRay.scan(rHTML, :html).lines_of_code
     assert_equal 0, CodeRay.scan(rHTML, :php).lines_of_code
     assert_equal 0, CodeRay.scan(rHTML, :yaml).lines_of_code
-    assert_equal 4, CodeRay.scan(rHTML, :rhtml).lines_of_code
+    assert_equal 4, CodeRay.scan(rHTML, :erb).lines_of_code
   end
   
-  def test_rubygems_not_loaded
-    assert_equal nil, defined? Gem
-  end if ENV['check_rubygems'] && RUBY_VERSION < '1.9'
-  
   def test_list_of_encoders
     assert_kind_of(Array, CodeRay::Encoders.list)
-    assert CodeRay::Encoders.list.include?('count')
+    assert CodeRay::Encoders.list.include?(:count)
   end
   
   def test_list_of_scanners
     assert_kind_of(Array, CodeRay::Scanners.list)
-    assert CodeRay::Scanners.list.include?('plaintext')
+    assert CodeRay::Scanners.list.include?(:text)
+  end
+  
+  def test_token_kinds
+    assert_kind_of Hash, CodeRay::TokenKinds
+    for kind, css_class in CodeRay::TokenKinds
+      assert_kind_of Symbol, kind
+      if css_class != false
+        assert_kind_of String, css_class, "TokenKinds[%p] == %p" % [kind, css_class]
+      end
+    end
+    assert_equal 'reserved', CodeRay::TokenKinds[:reserved]
+    assert_warning 'Undefined Token kind: :shibboleet' do
+      assert_equal false, CodeRay::TokenKinds[:shibboleet]
+    end
+  end
+  
+  class Milk < CodeRay::Encoders::Encoder
+    FILE_EXTENSION = 'cocoa'
+  end
+  
+  class HoneyBee < CodeRay::Encoders::Encoder
+  end
+  
+  def test_encoder_file_extension
+    assert_nothing_raised do
+      assert_equal 'html', CodeRay::Encoders::Page::FILE_EXTENSION
+      assert_equal 'cocoa', Milk::FILE_EXTENSION
+      assert_equal 'cocoa', Milk.new.file_extension
+      assert_equal 'honeybee', HoneyBee::FILE_EXTENSION
+      assert_equal 'honeybee', HoneyBee.new.file_extension
+    end
+    assert_raise NameError do
+      HoneyBee::MISSING_CONSTANT
+    end
+  end
+  
+  def test_encoder_tokens
+    encoder = CodeRay::Encoders::Encoder.new
+    encoder.send :setup, {}
+    assert_raise(ArgumentError) { encoder.token :strange, '' }
+    encoder.token 'test', :debug
+  end
+  
+  def test_encoder_deprecated_interface
+    encoder = CodeRay::Encoders::Encoder.new
+    encoder.send :setup, {}
+    assert_warning 'Using old Tokens#<< interface.' do
+      encoder << ['test', :content]
+    end
+    assert_raise ArgumentError do
+      encoder << [:strange, :input]
+    end
+    assert_raise ArgumentError do
+      encoder.encode_tokens [['test', :token]]
+    end
+  end
+  
+  def encoder_token_interface_deprecation_warning_given
+    CodeRay::Encoders::Encoder.send :class_variable_get, :@@CODERAY_TOKEN_INTERFACE_DEPRECATION_WARNING_GIVEN
+  end
+  
+  def test_scanner_file_extension
+    assert_equal 'rb', CodeRay::Scanners::Ruby.file_extension
+    assert_equal 'rb', CodeRay::Scanners::Ruby.new.file_extension
+    assert_equal 'java', CodeRay::Scanners::Java.file_extension
+    assert_equal 'java', CodeRay::Scanners::Java.new.file_extension
+  end
+  
+  def test_scanner_lang
+    assert_equal :ruby, CodeRay::Scanners::Ruby.lang
+    assert_equal :ruby, CodeRay::Scanners::Ruby.new.lang
+    assert_equal :java, CodeRay::Scanners::Java.lang
+    assert_equal :java, CodeRay::Scanners::Java.new.lang
+  end
+  
+  def test_scanner_tokenize
+    assert_equal ['foo', :plain], CodeRay::Scanners::Plain.new.tokenize('foo')
+    assert_equal [['foo', :plain], ['bar', :plain]], CodeRay::Scanners::Plain.new.tokenize(['foo', 'bar'])
+    CodeRay::Scanners::Plain.new.tokenize 42
+  end
+  
+  def test_scanner_tokens
+    scanner = CodeRay::Scanners::Plain.new
+    scanner.tokenize('foo')
+    assert_equal ['foo', :plain], scanner.tokens
+    scanner.string = ''
+    assert_equal ['', :plain], scanner.tokens
+  end
+  
+  def test_scanner_line_and_column
+    scanner = CodeRay::Scanners::Plain.new "foo\nbär+quux"
+    assert_equal 0, scanner.pos
+    assert_equal 1, scanner.line
+    assert_equal 1, scanner.column
+    scanner.scan(/foo/)
+    assert_equal 3, scanner.pos
+    assert_equal 1, scanner.line
+    assert_equal 4, scanner.column
+    scanner.scan(/\n/)
+    assert_equal 4, scanner.pos
+    assert_equal 2, scanner.line
+    assert_equal 1, scanner.column
+    scanner.scan(/b/)
+    assert_equal 5, scanner.pos
+    assert_equal 2, scanner.line
+    assert_equal 2, scanner.column
+    scanner.scan(/a/)
+    assert_equal 5, scanner.pos
+    assert_equal 2, scanner.line
+    assert_equal 2, scanner.column
+    scanner.scan(/ä/)
+    assert_equal 7, scanner.pos
+    assert_equal 2, scanner.line
+    assert_equal 4, scanner.column
+    scanner.scan(/r/)
+    assert_equal 8, scanner.pos
+    assert_equal 2, scanner.line
+    assert_equal 5, scanner.column
+  end
+  
+  def test_scanner_use_subclasses
+    assert_raise NotImplementedError do
+      CodeRay::Scanners::Scanner.new
+    end
+  end
+  
+  class InvalidScanner < CodeRay::Scanners::Scanner
+  end
+  
+  def test_scanner_scan_tokens
+    assert_raise NotImplementedError do
+      InvalidScanner.new.tokenize ''
+    end
+  end
+  
+  class RaisingScanner < CodeRay::Scanners::Scanner
+    def scan_tokens encoder, options
+      raise_inspect 'message', [], :initial
+    end
+  end
+  
+  def test_scanner_raise_inspect
+    assert_raise CodeRay::Scanners::Scanner::ScanError do
+      RaisingScanner.new.tokenize ''
+    end
   end
   
   def test_scan_a_frozen_string
-    CodeRay.scan RUBY_VERSION, :ruby
+    assert_nothing_raised do
+      CodeRay.scan RUBY_VERSION, :ruby
+      CodeRay.scan RUBY_VERSION, :plain
+    end
+  end
+  
+  def test_scan_a_non_string
+    assert_nothing_raised do
+      CodeRay.scan 42, :ruby
+      CodeRay.scan nil, :ruby
+      CodeRay.scan self, :ruby
+      CodeRay.encode ENV.to_hash, :ruby, :page
+      CodeRay.highlight CodeRay, :plain
+    end
   end
   
 end
diff --git a/test/functional/examples.rb b/test/functional/examples.rb
new file mode 100755
index 0000000..ff64af3
--- /dev/null
+++ b/test/functional/examples.rb
@@ -0,0 +1,129 @@
+require 'test/unit'
+
+$:.unshift File.expand_path('../../../lib', __FILE__)
+require 'coderay'
+
+class ExamplesTest < Test::Unit::TestCase
+  
+  def test_examples
+    # output as HTML div (using inline CSS styles)
+    div = CodeRay.scan('puts "Hello, world!"', :ruby).div
+    assert_equal <<-DIV, div
+<div class="CodeRay">
+  <div class="code"><pre>puts <span style="background-color:hsla(0,100%,50%,0.05)"><span style="color:#710">"</span><span style="color:#D20">Hello, world!</span><span style="color:#710">"</span></span></pre></div>
+</div>
+    DIV
+    
+    # ...with line numbers
+    div = CodeRay.scan(<<-CODE.chomp, :ruby).div(:line_numbers => :table)
+5.times do
+  puts 'Hello, world!'
+end
+    CODE
+    assert_equal <<-DIV, div
+<table class="CodeRay"><tr>
+  <td class="line-numbers" title="double click to toggle" ondblclick="with (this.firstChild.style) { display = (display == '') ? 'none' : '' }"><pre><a href="#n1" name="n1">1</a>
+<a href="#n2" name="n2">2</a>
+<a href="#n3" name="n3">3</a>
+</pre></td>
+  <td class="code"><pre><span style="color:#00D">5</span>.times <span style="color:#080;font-weight:bold">do</span>
+  puts <span style="background-color:hsla(0,100%,50%,0.05)"><span style="color:#710">'</span><span style="color:#D20">Hello, world!</span><span style="color:#710">'</span></span>
+<span style="color:#080;font-weight:bold">end</span></pre></td>
+</tr></table>
+    DIV
+    
+    # output as standalone HTML page (using CSS classes)
+    page = CodeRay.scan('puts "Hello, world!"', :ruby).page
+    assert_match <<-PAGE, page
+<body>
+
+<table class="CodeRay"><tr>
+  <td class="line-numbers" title="double click to toggle" ondblclick="with (this.firstChild.style) { display = (display == '') ? 'none' : '' }"><pre>
+</pre></td>
+  <td class="code"><pre>puts <span class="string"><span class="delimiter">&quot;</span><span class="content">Hello, world!</span><span class="delimiter">&quot;</span></span></pre></td>
+</tr></table>
+
+</body>
+    PAGE
+    
+    # keep scanned tokens for later use
+    tokens = CodeRay.scan('{ "just": "an", "example": 42 }', :json)
+    assert_kind_of CodeRay::TokensProxy, tokens
+    
+    assert_equal ["{", :operator, " ", :space, :begin_group, :key,
+      "\"", :delimiter, "just", :content, "\"", :delimiter,
+      :end_group, :key, ":", :operator, " ", :space,
+      :begin_group, :string, "\"", :delimiter, "an", :content,
+      "\"", :delimiter, :end_group, :string, ",", :operator,
+      " ", :space, :begin_group, :key, "\"", :delimiter,
+      "example", :content, "\"", :delimiter, :end_group, :key,
+      ":", :operator, " ", :space, "42", :integer,
+      " ", :space, "}", :operator], tokens.tokens
+    
+    # produce a token statistic
+    assert_equal <<-STATISTIC, tokens.statistic
+
+Code Statistics
+
+Tokens                  26
+  Non-Whitespace        15
+Bytes Total             31
+
+Token Types (7):
+  type                     count     ratio    size (average)
+-------------------------------------------------------------
+  TOTAL                       26  100.00 %     1.2
+  delimiter                    6   23.08 %     1.0
+  operator                     5   19.23 %     1.0
+  space                        5   19.23 %     1.0
+  key                          4   15.38 %     0.0
+  :begin_group                 3   11.54 %     0.0
+  :end_group                   3   11.54 %     0.0
+  content                      3   11.54 %     4.3
+  string                       2    7.69 %     0.0
+  integer                      1    3.85 %     2.0
+
+    STATISTIC
+    
+    # count the tokens
+    assert_equal 26, tokens.count
+    
+    # produce an HTML div, but with CSS classes
+    div = tokens.div(:css => :class)
+    assert_equal <<-DIV, div
+<div class="CodeRay">
+  <div class="code"><pre>{ <span class="key"><span class="delimiter">&quot;</span><span class="content">just</span><span class="delimiter">&quot;</span></span>: <span class="string"><span class="delimiter">&quot;</span><span class="content">an</span><span class="delimiter">&quot;</span></span>, <span class="key"><span class="delimiter">&quot;</span><span class="content">example</span><span class="delimiter">&quot;</span></span>: <span class="integer">42</span> }</pre></div>
+</div>
+    DIV
+    
+    # highlight a file (HTML div); guess the file type based on the extension
+    assert_equal :ruby, CodeRay::FileType[__FILE__]
+    
+    # get a new scanner for Python
+    python_scanner = CodeRay.scanner :python
+    assert_kind_of CodeRay::Scanners::Python, python_scanner
+    
+    # get a new encoder for terminal
+    terminal_encoder = CodeRay.encoder :term
+    assert_kind_of CodeRay::Encoders::Terminal, terminal_encoder
+    
+    # scanning into tokens
+    tokens = python_scanner.tokenize 'import this;  # The Zen of Python'
+    assert_equal ["import", :keyword, " ", :space, "this", :include,
+      ";", :operator, "  ", :space, "# The Zen of Python", :comment], tokens
+    
+    # format the tokens
+    term = terminal_encoder.encode_tokens(tokens)
+    assert_equal "\e[1;31mimport\e[0m \e[33mthis\e[0m;  \e[37m# The Zen of Python\e[0m", term
+    
+    # re-using scanner and encoder
+    ruby_highlighter = CodeRay::Duo[:ruby, :div]
+    div = ruby_highlighter.encode('puts "Hello, world!"')
+    assert_equal <<-DIV, div
+<div class="CodeRay">
+  <div class="code"><pre>puts <span style="background-color:hsla(0,100%,50%,0.05)"><span style="color:#710">&quot;</span><span style="color:#D20">Hello, world!</span><span style="color:#710">&quot;</span></span></pre></div>
+</div>
+    DIV
+  end
+  
+end
diff --git a/test/functional/for_redcloth.rb b/test/functional/for_redcloth.rb
index efd0578..e980667 100644
--- a/test/functional/for_redcloth.rb
+++ b/test/functional/for_redcloth.rb
@@ -1,5 +1,7 @@
 require 'test/unit'
-$:.unshift 'lib'
+require File.expand_path('../../lib/assert_warning', __FILE__)
+
+$:.unshift File.expand_path('../../../lib', __FILE__)
 require 'coderay'
 
 begin
@@ -8,17 +10,18 @@ begin
   require 'redcloth'
 rescue LoadError
   warn 'RedCloth not found - skipping for_redcloth tests.'
+  undef RedCloth if defined? RedCloth
 end
 
 class BasicTest < Test::Unit::TestCase
   
   def test_for_redcloth
     require 'coderay/for_redcloth'
-    assert_equal "<p><span lang=\"ruby\" class=\"CodeRay\">puts <span style=\"background-color:#fff0f0;color:#D20\"><span style=\"color:#710\">&quot;</span><span style=\"\">Hello, World!</span><span style=\"color:#710\">&quot;</span></span></span></p>",
+    assert_equal "<p><span lang=\"ruby\" class=\"CodeRay\">puts <span style=\"background-color:hsla(0,100%,50%,0.05)\"><span style=\"color:#710\">&quot;</span><span style=\"color:#D20\">Hello, World!</span><span style=\"color:#710\">&quot;</span></span></span></p>",
       RedCloth.new('@[ruby]puts "Hello, World!"@').to_html
     assert_equal <<-BLOCKCODE.chomp,
 <div lang="ruby" class="CodeRay">
-  <div class="code"><pre>puts <span style="background-color:#fff0f0;color:#D20"><span style="color:#710">&quot;</span><span style="">Hello, World!</span><span style="color:#710">&quot;</span></span></pre></div>
+  <div class="code"><pre>puts <span style="background-color:hsla(0,100%,50%,0.05)"><span style="color:#710">&quot;</span><span style="color:#D20">Hello, World!</span><span style="color:#710">&quot;</span></span></pre></div>
 </div>
       BLOCKCODE
       RedCloth.new('bc[ruby]. puts "Hello, World!"').to_html
@@ -63,15 +66,19 @@ class BasicTest < Test::Unit::TestCase
   # See http://jgarber.lighthouseapp.com/projects/13054/tickets/124-code-markup-does-not-allow-brackets.
   def test_for_redcloth_false_positive
     require 'coderay/for_redcloth'
-    assert_equal '<p><code>[project]_dff.skjd</code></p>',
-      RedCloth.new('@[project]_dff.skjd@').to_html
+    assert_warning 'CodeRay::Scanners could not load plugin :project; falling back to :text' do
+      assert_equal '<p><code>[project]_dff.skjd</code></p>',
+        RedCloth.new('@[project]_dff.skjd@').to_html
+    end
     # false positive, but expected behavior / known issue
     assert_equal "<p><span lang=\"ruby\" class=\"CodeRay\">_dff.skjd</span></p>",
       RedCloth.new('@[ruby]_dff.skjd@').to_html
-    assert_equal <<-BLOCKCODE.chomp,
+    assert_warning 'CodeRay::Scanners could not load plugin :project; falling back to :text' do
+      assert_equal <<-BLOCKCODE.chomp,
 <pre><code>[project]_dff.skjd</code></pre>
-      BLOCKCODE
-      RedCloth.new('bc. [project]_dff.skjd').to_html
+        BLOCKCODE
+        RedCloth.new('bc. [project]_dff.skjd').to_html
+    end
   end
   
 end if defined? RedCloth
\ No newline at end of file
diff --git a/test/functional/load_plugin_scanner.rb b/test/functional/load_plugin_scanner.rb
deleted file mode 100755
index 25bbc93..0000000
--- a/test/functional/load_plugin_scanner.rb
+++ /dev/null
@@ -1,11 +0,0 @@
-require 'test/unit'
-require 'coderay'
-
-class PluginScannerTest < Test::Unit::TestCase
-  
-  def test_load
-    require File.join(File.dirname(__FILE__), 'vhdl')
-    assert_equal 'VHDL', CodeRay.scanner(:vhdl).class.name
-  end
-  
-end
diff --git a/test/functional/suite.rb b/test/functional/suite.rb
index 97dd330..ec23eec 100755
--- a/test/functional/suite.rb
+++ b/test/functional/suite.rb
@@ -1,12 +1,15 @@
 require 'test/unit'
 
-MYDIR = File.dirname(__FILE__)
-
-$:.unshift 'lib'
+$VERBOSE = $CODERAY_DEBUG = true
+$:.unshift File.expand_path('../../../lib', __FILE__)
 require 'coderay'
-puts "Running basic CodeRay #{CodeRay::VERSION} tests..."
 
-suite = %w(basic load_plugin_scanner word_list)
+mydir = File.dirname(__FILE__)
+suite = Dir[File.join(mydir, '*.rb')].
+  map { |tc| File.basename(tc).sub(/\.rb$/, '') } - %w'suite for_redcloth'
+
+puts "Running basic CodeRay #{CodeRay::VERSION} tests: #{suite.join(', ')}"
+
 for test_case in suite
-  load File.join(MYDIR, test_case + '.rb')
+  load File.join(mydir, test_case + '.rb')
 end
diff --git a/test/functional/vhdl.rb b/test/functional/vhdl.rb
deleted file mode 100644
index c7e3824..0000000
--- a/test/functional/vhdl.rb
+++ /dev/null
@@ -1,126 +0,0 @@
-class VHDL < CodeRay::Scanners::Scanner
-
-  register_for :vhdl
-
-  RESERVED_WORDS = [
-    'access','after','alias','all','assert','architecture','begin',
-    'block','body','buffer','bus','case','component','configuration','constant',
-    'disconnect','downto','else','elsif','end','entity','exit','file','for',
-    'function','generate','generic','group','guarded','if','impure','in',
-    'inertial','inout','is','label','library','linkage','literal','loop',
-    'map','new','next','null','of','on','open','others','out','package',
-    'port','postponed','procedure','process','pure','range','record','register',
-    'reject','report','return','select','severity','signal','shared','subtype',
-    'then','to','transport','type','unaffected','units','until','use','variable',
-    'wait','when','while','with','note','warning','error','failure','and',
-    'or','xor','not','nor',
-    'array'
-  ]
-
-  PREDEFINED_TYPES = [
-    'bit','bit_vector','character','boolean','integer','real','time','string',
-    'severity_level','positive','natural','signed','unsigned','line','text',
-    'std_logic','std_logic_vector','std_ulogic','std_ulogic_vector','qsim_state',
-    'qsim_state_vector','qsim_12state','qsim_12state_vector','qsim_strength',
-    'mux_bit','mux_vector','reg_bit','reg_vector','wor_bit','wor_vector'
-  ]
-
-  PREDEFINED_CONSTANTS = [
-
-  ]
-
-  IDENT_KIND = CodeRay::CaseIgnoringWordList.new(:ident).
-    add(RESERVED_WORDS, :reserved).
-    add(PREDEFINED_TYPES, :pre_type).
-    add(PREDEFINED_CONSTANTS, :pre_constant)
-
-  ESCAPE = / [rbfntv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x
-  UNICODE_ESCAPE =  / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} /x
-
-  def scan_tokens tokens, options
-
-    state = :initial
-
-    until eos?
-
-      kind = nil
-      match = nil
-
-      case state
-
-      when :initial
-
-        if scan(/ \s+ | \\\n /x)
-          kind = :space
-
-        elsif scan(/-- .*/x)
-          kind = :comment
-
-        elsif scan(/ [-+*\/=<>?:;,!&^|()\[\]{}~%]+ | \.(?!\d) /x)
-          kind = :operator
-
-        elsif match = scan(/ [A-Za-z_][A-Za-z_0-9]* /x)
-          kind = IDENT_KIND[match.downcase]
-
-        elsif match = scan(/[a-z]?"/i)
-          tokens << [:open, :string]
-          state = :string
-          kind = :delimiter
-
-        elsif scan(/ L?' (?: [^\'\n\\] | \\ #{ESCAPE} )? '? /ox)
-          kind = :char
-
-        elsif scan(/(?:\d+)(?![.eEfF])/)
-          kind = :integer
-
-        elsif scan(/\d[fF]?|\d*\.\d+(?:[eE][+-]?\d+)?[fF]?|\d+[eE][+-]?\d+[fF]?/)
-          kind = :float
-
-        else
-          getch
-          kind = :error
-
-        end
-
-      when :string
-        if scan(/[^\\\n"]+/)
-          kind = :content
-        elsif scan(/"/)
-          tokens << ['"', :delimiter]
-          tokens << [:close, :string]
-          state = :initial
-          next
-        elsif scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
-          kind = :char
-        elsif scan(/ \\ | $ /x)
-          tokens << [:close, :string]
-          kind = :error
-          state = :initial
-        else
-          raise_inspect "else case \" reached; %p not handled." % peek(1), tokens
-        end
-
-      else
-        raise_inspect 'Unknown state', tokens
-
-      end
-
-      match ||= matched
-      if $DEBUG and not kind
-        raise_inspect 'Error token %p in line %d' %
-          [[match, kind], line], tokens
-      end
-      raise_inspect 'Empty token', tokens unless match
-
-      tokens << [match, kind]
-
-    end
-
-    if state == :string
-      tokens << [:close, :string]
-    end
-
-    tokens
-  end
-
-end
diff --git a/test/functional/word_list.rb b/test/functional/word_list.rb
deleted file mode 100644
index 84d6e9e..0000000
--- a/test/functional/word_list.rb
+++ /dev/null
@@ -1,79 +0,0 @@
-require 'test/unit'
-require 'coderay'
-
-class WordListTest < Test::Unit::TestCase
-  
-  include CodeRay
-  
-  # define word arrays
-  RESERVED_WORDS = %w[
-    asm break case continue default do else
-    ...
-  ]
-  
-  PREDEFINED_TYPES = %w[
-    int long short char void
-    ...
-  ]
-  
-  PREDEFINED_CONSTANTS = %w[
-    EOF NULL ...
-  ]
-  
-  # make a WordList
-  IDENT_KIND = WordList.new(:ident).
-    add(RESERVED_WORDS, :reserved).
-    add(PREDEFINED_TYPES, :pre_type).
-    add(PREDEFINED_CONSTANTS, :pre_constant)
-
-  def test_word_list_example
-    assert_equal :pre_type, IDENT_KIND['void']
-    # assert_equal :pre_constant, IDENT_KIND['...']  # not specified
-  end
-  
-  def test_word_list
-    list = WordList.new(:ident).add(['foobar'], :reserved)
-    assert_equal :reserved, list['foobar']
-    assert_equal :ident, list['FooBar']
-  end
-
-  def test_word_list_cached
-    list = WordList.new(:ident, true).add(['foobar'], :reserved)
-    assert_equal :reserved, list['foobar']
-    assert_equal :ident, list['FooBar']
-  end
-
-  def test_case_ignoring_word_list
-    list = CaseIgnoringWordList.new(:ident).add(['foobar'], :reserved)
-    assert_equal :ident, list['foo']
-    assert_equal :reserved, list['foobar']
-    assert_equal :reserved, list['FooBar']
-
-    list = CaseIgnoringWordList.new(:ident).add(['FooBar'], :reserved)
-    assert_equal :ident, list['foo']
-    assert_equal :reserved, list['foobar']
-    assert_equal :reserved, list['FooBar']
-  end
-
-  def test_case_ignoring_word_list_cached
-    list = CaseIgnoringWordList.new(:ident, true).add(['foobar'], :reserved)
-    assert_equal :ident, list['foo']
-    assert_equal :reserved, list['foobar']
-    assert_equal :reserved, list['FooBar']
-
-    list = CaseIgnoringWordList.new(:ident, true).add(['FooBar'], :reserved)
-    assert_equal :ident, list['foo']
-    assert_equal :reserved, list['foobar']
-    assert_equal :reserved, list['FooBar']
-  end
-
-  def test_dup
-    list = WordList.new(:ident).add(['foobar'], :reserved)
-    assert_equal :reserved, list['foobar']
-    list2 = list.dup
-    list2.add(%w[foobar], :keyword)
-    assert_equal :keyword, list2['foobar']
-    assert_equal :reserved, list['foobar']
-  end
-
-end
\ No newline at end of file

-- 
coderay.git