[DRE-commits] [SCM] coderay.git branch, upstream, updated. upstream/0.9.8-1-gb788473
Youhei SASAKI
uwabami at gfd-dennou.org
Tue Feb 28 12:00:10 UTC 2012
The following commit has been merged in the upstream branch:
commit b7884736f7ec92fa4a0c593cfb8e1551687c8bc9
Author: Youhei SASAKI <uwabami at gfd-dennou.org>
Date: Mon Feb 27 10:51:46 2012 +0900
Imported Upstream version 1.0.5
diff --git a/FOLDERS b/FOLDERS
deleted file mode 100644
index e393ed7..0000000
--- a/FOLDERS
+++ /dev/null
@@ -1,53 +0,0 @@
-= CodeRay - Trunk folder structure
-
-== bench - Benchmarking system
-
-All benchmarking stuff goes here.
-
-Test inputs are stored in files named <code>example.<lang></code>.
-Test outputs go to <code>bench/test.<encoder-default-file-extension></code>.
-
-Run <code>bench/bench.rb</code> to get a usage description.
-
-Run <code>rake bench</code> to perform an example benchmark.
-
-
-== bin - Scripts
-
-Executional files for CodeRay.
-
-
-== demo - Demos and functional tests
-
-Demonstrational scripts to show of CodeRay's features.
-
-Run them as functional tests with <code>rake test:demos</code>.
-
-
-== etc - Lots of stuff
-
-Some addidtional files for CodeRay, mainly graphics and Vim scripts.
-
-
-== gem_server - Gem output folder
-
-For <code>rake gem</code>.
-
-
-== lib - CodeRay library code
-
-This is the base directory for the CodeRay library.
-
-
-== rake_helpers - Rake helper libraries
-
-Some files to enhance Rake, including the Autumnal Rdoc template and some scripts.
-
-
-== test - Tests
-
-Tests for the scanners.
-
-Each language has its own subfolder and sub-suite.
-
-Run with <code>rake test</code>.
diff --git a/lib/README b/README_INDEX.rdoc
similarity index 79%
rename from lib/README
rename to README_INDEX.rdoc
index e3fd869..7332653 100644
--- a/lib/README
+++ b/README_INDEX.rdoc
@@ -1,57 +1,45 @@
= CodeRay
-[- Tired of blue'n'gray? Try the original version of this documentation on
-coderay.rubychan.de[http://coderay.rubychan.de/doc/] (use Ctrl+Click to open it in its own frame.) -]
+Tired of blue'n'gray? Try the original version of this documentation on
+coderay.rubychan.de[http://coderay.rubychan.de/doc/] :-)
== About
+
CodeRay is a Ruby library for syntax highlighting.
-Syntax highlighting means: You put your code in, and you get it back colored;
-Keywords, strings, floats, comments - all in different colors.
-And with line numbers.
+You put your code in, and you get it back colored; Keywords, strings,
+floats, comments - all in different colors. And with line numbers.
*Syntax* *Highlighting*...
* makes code easier to read and maintain
* lets you detect syntax errors faster
* helps you to understand the syntax of a language
* looks nice
-* is what everybody should have on their website
+* is what everybody wants to have on their website
* solves all your problems and makes the girls run after you
-Version: 0.9.8
-Author:: murphy (Kornelius Kalnbach)
-Contact:: murphy rubychan de
-Website:: coderay.rubychan.de[http://coderay.rubychan.de]
-License:: GNU LGPL; see LICENSE file in the main directory.
== Installation
-You need RubyGems[http://rubyforge.org/frs/?group_id=126].
-
% gem install coderay
=== Dependencies
-CodeRay needs Ruby 1.8.6 or later. It also runs with Ruby 1.9.1+ and JRuby 1.1+.
+CodeRay needs Ruby 1.8.7+ or 1.9.2+. It also runs on Rubinius and JRuby.
== Example Usage
-(Forgive me, but this is not highlighted.)
require 'coderay'
- tokens = CodeRay.scan "puts 'Hello, world!'", :ruby
- page = tokens.html :line_numbers => :inline, :wrap => :page
- puts page
+ html = CodeRay.scan("puts 'Hello, world!'", :ruby).div(:line_numbers => :table)
== Documentation
See CodeRay.
-Please report errors in this documentation to <murphy rubychan de>.
-
== Credits
@@ -94,7 +82,6 @@ Please report errors in this documentation to <murphy rubychan de>.
* Rob Aldred for the terminal encoder
* Trans for pointing out $DEBUG dependencies
* Flameeyes for finding that Term::ANSIColor was obsolete
-* Etienne Massip for reporting a serious bug in JavaScript scanner
* matz and all Ruby gods and gurus
* The inventors of: the computer, the internet, the true color display, HTML &
CSS, VIM, Ruby, pizza, microwaves, guitars, scouting, programming, anime,
@@ -124,6 +111,8 @@ Where would we be without all those people?
less useless
* Term::ANSIColor[http://term-ansicolor.rubyforge.org/]
* PLEAC[http://pleac.sourceforge.net/] code examples
+* Github
+* Travis CI (http://travis-ci.org/rubychan/github)
=== Free
diff --git a/Rakefile b/Rakefile
index 05d0144..ba6c34e 100644
--- a/Rakefile
+++ b/Rakefile
@@ -1,8 +1,7 @@
-require 'rake/rdoctask'
+$:.unshift File.dirname(__FILE__) unless $:.include? '.'
ROOT = '.'
LIB_ROOT = File.join ROOT, 'lib'
-EXTRA_RDOC_FILES = %w(lib/README FOLDERS)
task :default => :test
@@ -15,20 +14,21 @@ if File.directory? 'rake_tasks'
else
- # fallback tasks when rake_tasks folder is not present
+ # fallback tasks when rake_tasks folder is not present (eg. in the distribution package)
desc 'Run CodeRay tests (basic)'
task :test do
ruby './test/functional/suite.rb'
ruby './test/functional/for_redcloth.rb'
end
+ gem 'rdoc' if defined? gem
+ require 'rdoc/task'
desc 'Generate documentation for CodeRay'
Rake::RDocTask.new :doc do |rd|
rd.title = 'CodeRay Documentation'
- rd.main = 'lib/README'
+ rd.main = 'README_INDEX.rdoc'
rd.rdoc_files.add Dir['lib']
- rd.rdoc_files.add 'lib/README'
- rd.rdoc_files.add 'FOLDERS'
+ rd.rdoc_files.add rd.main
rd.rdoc_dir = 'doc'
end
diff --git a/bin/coderay b/bin/coderay
index 62101a8..d78cd57 100755
--- a/bin/coderay
+++ b/bin/coderay
@@ -1,86 +1,215 @@
#!/usr/bin/env ruby
-# CodeRay Executable
-#
-# Version: 0.2
-# Author: murphy
-
require 'coderay'
-if ARGV.empty?
- $stderr.puts <<-USAGE
-CodeRay #{CodeRay::VERSION} (http://coderay.rubychan.de)
+$options, args = ARGV.partition { |arg| arg[/^-[hv]$|--\w+/] }
+subcommand = args.first if /^\w/ === args.first
+subcommand = nil if subcommand && File.exist?(subcommand)
+args.delete subcommand
-Usage:
- coderay file [-<format>]
- coderay -<lang> [-<format>] [< file] [> output]
+def option? *options
+ !($options & options).empty?
+end
-Defaults:
- lang: based on file extension
- format: ANSI colorized output for terminal, HTML page for files
+def tty?
+ $stdout.tty? || option?('--tty')
+end
-Examples:
- coderay foo.rb # colorized output to terminal, based on file extension
- coderay foo.rb -loc # print LOC count, based on file extension and format
- coderay foo.rb > foo.html # HTML page output to file, based on extension
- coderay -ruby < foo.rb # colorized output to terminal, based on lang
- coderay -ruby -loc < foo.rb # print LOC count, based on lang
- coderay -ruby -page foo.rb # HTML page output to terminal, based on lang and format
- coderay -ruby -page foo.rb > foo.html # HTML page output to file, based on lang and format
+def version
+ puts <<-USAGE
+CodeRay #{CodeRay::VERSION}
USAGE
end
-first, second = ARGV
+def help
+ puts <<-HELP
+This is CodeRay #{CodeRay::VERSION}, a syntax highlighting tool for selected languages.
-def read
- file = ARGV.grep(/^(?!-)/).last
- if file
- if File.exist?(file)
- File.read file
- else
- $stderr.puts "No such file: #{file}"
- end
- else
- $stdin.read
- end
+usage:
+ coderay [-language] [input] [-format] [output]
+
+defaults:
+ language detect from input file name or shebang; fall back to plain text
+ input STDIN
+ format detect from output file name or use terminal; fall back to HTML
+ output STDOUT
+
+common:
+ coderay file.rb # highlight file to terminal
+ coderay file.rb > file.html # highlight file to HTML page
+ coderay file.rb -div > file.html # highlight file to HTML snippet
+
+configure output:
+ coderay file.py output.json # output tokens as JSON
+ coderay file.py -loc # count lines of code in Python file
+
+configure input:
+ coderay -python file # specify the input language
+ coderay -ruby # take input from STDIN
+
+more:
+ coderay stylesheet [style] # print CSS stylesheet
+ HELP
end
-if first
- if first[/-(\w+)/] == first
- lang = $1
- input = read
- tokens = :scan
- else
- file = first
- unless File.exist? file
- $stderr.puts "No such file: #{file}"
- exit 2
+def commands
+ puts <<-COMMANDS
+ general:
+ highlight code highlighting (default command, optional)
+ stylesheet print the CSS stylesheet with the given name (aliases: style, css)
+
+ about:
+ list [of] list all available plugins (or just the scanners|encoders|styles|filetypes)
+ commands print this list
+ help show some help
+ version print CodeRay version
+ COMMANDS
+end
+
+def print_list_of plugin_host
+ plugins = plugin_host.all_plugins.map do |plugin|
+ info = " #{plugin.plugin_id}: #{plugin.title}"
+
+ aliases = (plugin.aliases - [:default]).map { |key| "-#{key}" }.sort_by { |key| key.size }
+ if plugin.respond_to?(:file_extension) || !aliases.empty?
+ additional_info = []
+ additional_info << aliases.join(', ') unless aliases.empty?
+ info << " (#{additional_info.join('; ')})"
end
- tokens = CodeRay.scan_file file
+
+ info << ' <-- default' if plugin.aliases.include? :default
+
+ info
end
-else
- $stderr.puts 'No lang/file given.'
- exit 1
+ puts plugins.sort
+end
+
+if option? '-v', '--version'
+ version
+end
+
+if option? '-h', '--help'
+ help
end
-if second
- if second[/-(\w+)/] == second
- format = $1.to_sym
+case subcommand
+when 'highlight', nil
+ if ARGV.empty?
+ version
+ help
else
- raise 'invalid format (must be -xxx)'
+ signature = args.map { |arg| arg[/^-/] ? '-' : 'f' }.join
+ names = args.map { |arg| arg.sub(/^-/, '') }
+ case signature
+ when /^$/
+ exit
+ when /^ff?$/
+ input_file, output_file, = *names
+ when /^f-f?$/
+ input_file, output_format, output_file, = *names
+ when /^-ff?$/
+ input_lang, input_file, output_file, = *names
+ when /^-f-f?$/
+ input_lang, input_file, output_format, output_file, = *names
+ when /^--?f?$/
+ input_lang, output_format, output_file, = *names
+ else
+ $stdout = $stderr
+ help
+ puts
+ puts "Unknown parameter order: #{args.join ' '}, expected: [-language] [input] [-format] [output]"
+ exit 1
+ end
+
+ if input_file
+ input_lang ||= CodeRay::FileType.fetch input_file, :text, true
+ end
+
+ if output_file
+ output_format ||= CodeRay::FileType[output_file]
+ else
+ output_format ||= :terminal
+ end
+
+ output_format = :page if output_format.to_s == 'html'
+
+ if input_file
+ input = File.read input_file
+ else
+ input = $stdin.read
+ end
+
+ begin
+ file =
+ if output_file
+ File.open output_file, 'w'
+ else
+ $stdout.sync = true
+ $stdout
+ end
+ CodeRay.encode(input, input_lang, output_format, :out => file)
+ file.puts
+ rescue CodeRay::PluginHost::PluginNotFound => boom
+ $stdout = $stderr
+ if boom.message[/CodeRay::(\w+)s could not load plugin :?(.*?): /]
+ puts "I don't know the #$1 \"#$2\"."
+ else
+ puts boom.message
+ end
+ # puts "I don't know this plugin: #{boom.message[/Could not load plugin (.*?): /, 1]}."
+ rescue CodeRay::Scanners::Scanner::ScanError # FIXME: rescue Errno::EPIPE
+ # this is sometimes raised by pagers; ignore [TODO: wtf?]
+ ensure
+ file.close if output_file
+ end
+ end
+when 'li', 'list'
+ arg = args.first && args.first.downcase
+ if [nil, 's', 'sc', 'scanner', 'scanners'].include? arg
+ puts 'input languages (Scanners):'
+ print_list_of CodeRay::Scanners
+ end
+
+ if [nil, 'e', 'en', 'enc', 'encoder', 'encoders'].include? arg
+ puts 'output formats (Encoders):'
+ print_list_of CodeRay::Encoders
end
+
+ if [nil, 'st', 'style', 'styles'].include? arg
+ puts 'CSS themes for HTML output (Styles):'
+ print_list_of CodeRay::Styles
+ end
+
+ if [nil, 'f', 'ft', 'file', 'filetype', 'filetypes'].include? arg
+ puts 'recognized file types:'
+
+ filetypes = Hash.new { |h, k| h[k] = [] }
+ CodeRay::FileType::TypeFromExt.inject filetypes do |types, (ext, type)|
+ types[type.to_s] << ".#{ext}"
+ types
+ end
+ CodeRay::FileType::TypeFromName.inject filetypes do |types, (name, type)|
+ types[type.to_s] << name
+ types
+ end
+
+ filetypes.sort.each do |type, exts|
+ puts " #{type}: #{exts.sort_by { |ext| ext.size }.join(', ')}"
+ end
+ end
+when 'stylesheet', 'style', 'css'
+ puts CodeRay::Encoders[:html]::CSS.new(args.first || :default).stylesheet
+when 'commands'
+ commands
+when 'help'
+ help
else
- if $stdout.tty?
- format = :term
+ $stdout = $stderr
+ help
+ puts
+ if subcommand[/\A\w+\z/]
+ puts "Unknown command: #{subcommand}"
else
- $stderr.puts 'No format given; setting to default (HTML Page).'
- format = :page
+ puts "File not found: #{subcommand}"
end
+ exit 1
end
-
-if tokens == :scan
- output = CodeRay::Duo[lang => format].highlight input
-else
- output = tokens.encode format
-end
-out = $stdout
-out.puts output
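Note on the bin/coderay rewrite above: the new script reduces its positional arguments to a signature string ('-' for a dash-prefixed argument, 'f' for anything else) and matches that string against a few regular expressions to decide which argument is the language, input, format, or output. A minimal standalone sketch of that idea follows. The helper name parse_cli_args and the returned Hash are assumptions for illustration; the commit assigns local variables inline instead.

```ruby
# Standalone sketch of the signature matching used in the new bin/coderay.
# parse_cli_args is a hypothetical helper name; the commit does this inline.
def parse_cli_args(args)
  # '-' marks a dash-prefixed argument, 'f' marks a file-like argument.
  signature = args.map { |arg| arg[/^-/] ? '-' : 'f' }.join
  names     = args.map { |arg| arg.sub(/^-/, '') }

  keys =
    case signature
    when /^$/      then []
    when /^ff?$/   then [:input_file, :output_file]
    when /^f-f?$/  then [:input_file, :output_format, :output_file]
    when /^-ff?$/  then [:input_lang, :input_file, :output_file]
    when /^-f-f?$/ then [:input_lang, :input_file, :output_format, :output_file]
    when /^--?f?$/ then [:input_lang, :output_format, :output_file]
    else
      raise ArgumentError, "Unknown parameter order: #{args.join ' '}"
    end

  # Pair each recognized slot with the argument in the same position;
  # trailing optional slots that received no argument are left out.
  result = {}
  keys.zip(names) { |key, name| result[key] = name if name }
  result
end
```

For example, parse_cli_args(%w[file.rb -div]) yields the input file and the output format, while three plain file names raise an error, mirroring the "Unknown parameter order" branch in the script.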
diff --git a/bin/coderay_stylesheet b/bin/coderay_stylesheet
deleted file mode 100755
index baa7c26..0000000
--- a/bin/coderay_stylesheet
+++ /dev/null
@@ -1,4 +0,0 @@
-#!/usr/bin/env ruby
-require 'coderay'
-
-puts CodeRay::Encoders[:html]::CSS.new.stylesheet
diff --git a/lib/coderay.rb b/lib/coderay.rb
index bd8b4e9..876d770 100644
--- a/lib/coderay.rb
+++ b/lib/coderay.rb
@@ -1,16 +1,21 @@
+# encoding: utf-8
+# Encoding.default_internal = 'UTF-8'
+
# = CodeRay Library
#
# CodeRay is a Ruby library for syntax highlighting.
#
-# I try to make CodeRay easy to use and intuitive, but at the same time fully featured, complete,
-# fast and efficient.
+# I try to make CodeRay easy to use and intuitive, but at the same time fully
+# featured, complete, fast and efficient.
#
# See README.
#
# It consists mainly of
-# * the main engine: CodeRay (Scanners::Scanner, Tokens/TokenStream, Encoders::Encoder), PluginHost
+# * the main engine: CodeRay (Scanners::Scanner, Tokens, Encoders::Encoder)
+# * the plugin system: PluginHost, Plugin
# * the scanners in CodeRay::Scanners
# * the encoders in CodeRay::Encoders
+# * the styles in CodeRay::Styles
#
# Here's a fancy graphic to light up this gray docu:
#
@@ -22,8 +27,8 @@
#
# == Usage
#
-# Remember you need RubyGems to use CodeRay, unless you have it in your load path. Run Ruby with
-# -rubygems option if required.
+# Remember you need RubyGems to use CodeRay, unless you have it in your load
+# path. Run Ruby with -rubygems option if required.
#
# === Highlight Ruby code in a string as html
#
@@ -98,13 +103,6 @@
# CodeRay.encode_tokens:: Encode the given tokens.
# CodeRay.encode_file:: Scan a file, guess the language using FileType and encode it.
#
-# == Streaming
-#
-# Streaming saves RAM by running Scanner and Encoder in some sort of
-# pipe mode; see TokenStream.
-#
-# CodeRay.scan_stream:: Scan in stream mode.
-#
# == All-in-One Encoding
#
# CodeRay.encode:: Highlight a string with a given input and output format.
@@ -129,23 +127,37 @@ module CodeRay
$CODERAY_DEBUG ||= false
- # Version: Major.Minor.Teeny[.Revision]
- # Major: 0 for pre-stable, 1 for stable
- # Minor: feature milestone
- # Teeny: development state, 0 for pre-release
- # Revision: Subversion Revision number (generated on rake gem:make)
- VERSION = '0.9.8'
-
- require 'coderay/tokens'
- require 'coderay/token_classes'
- require 'coderay/scanner'
- require 'coderay/encoder'
- require 'coderay/duo'
- require 'coderay/style'
-
-
+ CODERAY_PATH = File.join File.dirname(__FILE__), 'coderay'
+
+ # Assuming the path is a subpath of lib/coderay/
+ def self.coderay_path *path
+ File.join CODERAY_PATH, *path
+ end
+
+ require coderay_path('version')
+
+ # helpers
+ autoload :FileType, coderay_path('helpers', 'file_type')
+
+ # Tokens
+ autoload :Tokens, coderay_path('tokens')
+ autoload :TokensProxy, coderay_path('tokens_proxy')
+ autoload :TokenKinds, coderay_path('token_kinds')
+
+ # Plugin system
+ autoload :PluginHost, coderay_path('helpers', 'plugin')
+ autoload :Plugin, coderay_path('helpers', 'plugin')
+
+ # Plugins
+ autoload :Scanners, coderay_path('scanner')
+ autoload :Encoders, coderay_path('encoder')
+ autoload :Styles, coderay_path('style')
+
+ # convenience access and reusable Encoder/Scanner pair
+ autoload :Duo, coderay_path('duo')
+
class << self
-
+
# Scans the given +code+ (a String) with the Scanner for +lang+.
#
# This is a simple way to use CodeRay. Example:
@@ -154,15 +166,15 @@ module CodeRay
#
# See also demo/demo_simple.
def scan code, lang, options = {}, &block
- scanner = Scanners[lang].new code, options, &block
- scanner.tokenize
+ # FIXME: return a proxy for direct-stream encoding
+ TokensProxy.new code, lang, options, block
end
-
+
# Scans +filename+ (a path to a code file) with the Scanner for +lang+.
#
# If +lang+ is :auto or omitted, the CodeRay::FileType module is used to
# determine it. If it cannot find out what type it is, it uses
- # CodeRay::Scanners::Plaintext.
+ # CodeRay::Scanners::Text.
#
# Calls CodeRay.scan.
#
@@ -170,56 +182,22 @@ module CodeRay
# require 'coderay'
# page = CodeRay.scan_file('some_c_code.c').html
def scan_file filename, lang = :auto, options = {}, &block
- file = IO.read filename
- if lang == :auto
- require 'coderay/helpers/file_type'
- lang = FileType.fetch filename, :plaintext, true
- end
- scan file, lang, options = {}, &block
- end
-
- # Scan the +code+ (a string) with the scanner for +lang+.
- #
- # Calls scan.
- #
- # See CodeRay.scan.
- def scan_stream code, lang, options = {}, &block
- options[:stream] = true
+ lang = FileType.fetch filename, :text, true if lang == :auto
+ code = File.read filename
scan code, lang, options, &block
end
-
- # Encode a string in Streaming mode.
- #
- # This starts scanning +code+ with the the Scanner for +lang+
- # while encodes the output with the Encoder for +format+.
- # +options+ will be passed to the Encoder.
- #
- # See CodeRay::Encoder.encode_stream
- def encode_stream code, lang, format, options = {}
- encoder(format, options).encode_stream code, lang, options
- end
-
+
# Encode a string.
#
# This scans +code+ with the the Scanner for +lang+ and then
# encodes it with the Encoder for +format+.
# +options+ will be passed to the Encoder.
#
- # See CodeRay::Encoder.encode
+ # See CodeRay::Encoder.encode.
def encode code, lang, format, options = {}
encoder(format, options).encode code, lang, options
end
-
- # Highlight a string into a HTML <div>.
- #
- # CSS styles use classes, so you have to include a stylesheet
- # in your output.
- #
- # See encode.
- def highlight code, lang, options = { :css => :class }, format = :div
- encode code, lang, format, options
- end
-
+
# Encode pre-scanned Tokens.
# Use this together with CodeRay.scan:
#
@@ -232,7 +210,7 @@ module CodeRay
def encode_tokens tokens, format, options = {}
encoder(format, options).encode_tokens tokens, options
end
-
+
# Encodes +filename+ (a path to a code file) with the Scanner for +lang+.
#
# See CodeRay.scan_file.
@@ -245,7 +223,17 @@ module CodeRay
tokens = scan_file filename, :auto, get_scanner_options(options)
encode_tokens tokens, format, options
end
-
+
+ # Highlight a string into a HTML <div>.
+ #
+ # CSS styles use classes, so you have to include a stylesheet
+ # in your output.
+ #
+ # See encode.
+ def highlight code, lang, options = { :css => :class }, format = :div
+ encode code, lang, format, options
+ end
+
# Highlight a file into a HTML <div>.
#
# CSS styles use classes, so you have to include a stylesheet
@@ -255,7 +243,7 @@ module CodeRay
def highlight_file filename, options = { :css => :class }, format = :div
encode_file filename, format, options
end
-
+
# Finds the Encoder class for +format+ and creates an instance, passing
# +options+ to it.
#
@@ -273,15 +261,15 @@ module CodeRay
def encoder format, options = {}
Encoders[format].new options
end
-
+
# Finds the Scanner class for +lang+ and creates an instance, passing
# +options+ to it.
#
# See Scanner.new.
- def scanner lang, options = {}
- Scanners[lang].new '', options
+ def scanner lang, options = {}, &block
+ Scanners[lang].new '', options, &block
end
-
+
# Extract the options for the scanner from the +options+ hash.
#
# Returns an empty Hash if <tt>:scanner_options</tt> is not set.
@@ -291,32 +279,7 @@ module CodeRay
def get_scanner_options options
options.fetch :scanner_options, {}
end
-
- end
-
- # This Exception is raised when you try to stream with something that is not
- # capable of streaming.
- class NotStreamableError < Exception
- def initialize obj
- @obj = obj
- end
-
- def to_s
- '%s is not Streamable!' % @obj.class
- end
- end
-
- # A dummy module that is included by subclasses of CodeRay::Scanner an CodeRay::Encoder
- # to show that they are able to handle streams.
- module Streamable
+
end
-
-end
-
-# Run a test script.
-if $0 == __FILE__
- $stderr.print 'Press key to print demo.'; gets
- # Just use this file as an example of Ruby code.
- code = File.read(__FILE__)[/module CodeRay.*/m]
- print CodeRay.scan(code, :ruby).html
+
end
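Note on the lib/coderay.rb change above: the eager require calls are replaced by autoload registrations whose paths are built by a small path helper. The sketch below shows the same pattern in isolation. The module name LazyDemo, the constant Helper, and the directory names are assumptions for illustration and do not exist in CodeRay.

```ruby
# Sketch of the autoload-with-path-helper pattern; LazyDemo and Helper are
# hypothetical names, and no file is actually loaded here.
module LazyDemo
  BASE_PATH = File.join File.dirname(__FILE__), 'lazy_demo'

  # Builds a path below BASE_PATH from the given components.
  def self.base_path(*path)
    File.join BASE_PATH, *path
  end

  # Registers the file to load on first reference to LazyDemo::Helper.
  autoload :Helper, base_path('helpers', 'helper')
end
```

Nothing is read from disk until the constant is first referenced, which is presumably why the commit can register both PluginHost and Plugin against the same file.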
diff --git a/lib/coderay/duo.rb b/lib/coderay/duo.rb
index 5468dda..cb3f8ee 100644
--- a/lib/coderay/duo.rb
+++ b/lib/coderay/duo.rb
@@ -21,10 +21,7 @@ module CodeRay
# Create a new Duo, holding a lang and a format to highlight code.
#
# simple:
- # CodeRay::Duo[:ruby, :page].highlight 'bla 42'
- #
- # streaming:
- # CodeRay::Duo[:ruby, :page].highlight 'bar 23', :stream => true
+ # CodeRay::Duo[:ruby, :html].highlight 'bla 42'
#
# with options:
# CodeRay::Duo[:ruby, :html, :hint => :debug].highlight '????::??'
@@ -38,7 +35,7 @@ module CodeRay
# The options are forwarded to scanner and encoder
# (see CodeRay.get_scanner_options).
def initialize lang = nil, format = nil, options = {}
- if format == nil and lang.is_a? Hash and lang.size == 1
+ if format.nil? && lang.is_a?(Hash) && lang.size == 1
@lang = lang.keys.first
@format = lang[@lang]
else
@@ -47,12 +44,12 @@ module CodeRay
end
@options = options
end
-
+
class << self
# To allow calls like Duo[:ruby, :html].highlight.
alias [] new
end
-
+
# The scanner of the duo. Only created once.
def scanner
@scanner ||= CodeRay.scanner @lang, CodeRay.get_scanner_options(@options)
@@ -64,22 +61,21 @@ module CodeRay
end
# Tokenize and highlight the code using +scanner+ and +encoder+.
- #
- # If the :stream option is set, the Duo will go into streaming mode,
- # saving memory for the cost of time.
- def encode code, options = { :stream => false }
- stream = options.delete :stream
+ def encode code, options = {}
options = @options.merge options
- if stream
- encoder.encode_stream(code, @lang, options)
- else
- scanner.code = code
- encoder.encode_tokens(scanner.tokenize, options)
- end
+ encoder.encode(code, @lang, options)
end
alias highlight encode
-
+
+ # Allows to use Duo like a proc object:
+ #
+ # CodeRay::Duo[:python => :yaml].call(code)
+ #
+ # or, in Ruby 1.9 and later:
+ #
+ # CodeRay::Duo[:python => :yaml].(code)
+ alias call encode
+
end
-
+
end
-
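Note on the duo.rb change above: the constructor accepts either two arguments or a single-entry Hash, and encode is now also reachable as call. The sketch below reproduces only that argument handling and aliasing. MiniDuo is a hypothetical stand-in, and its encode merely reports what it was asked to do rather than highlighting anything.

```ruby
# Minimal stand-in for the Duo argument handling; MiniDuo is a hypothetical
# class, and its encode only returns a description string.
class MiniDuo
  attr_reader :lang, :format

  def initialize(lang = nil, format = nil, options = {})
    if format.nil? && lang.is_a?(Hash) && lang.size == 1
      # Hash form: MiniDuo[:python => :yaml]
      @lang = lang.keys.first
      @format = lang[@lang]
    else
      @lang = lang
      @format = format
    end
    @options = options
  end

  class << self
    # Allows MiniDuo[:ruby, :html] as a shorthand for MiniDuo.new.
    alias [] new
  end

  def encode(code, options = {})
    "#{@lang} -> #{@format}: #{code}"
  end
  alias highlight encode
  alias call encode
end
```

With the call alias in place, an instance can be used where a proc-like object is expected, for example MiniDuo[:ruby, :html].('x').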
diff --git a/lib/coderay/encoder.rb b/lib/coderay/encoder.rb
index 3ae2924..d2d6c7e 100644
--- a/lib/coderay/encoder.rb
+++ b/lib/coderay/encoder.rb
@@ -1,5 +1,5 @@
module CodeRay
-
+
# This module holds the Encoder class and its subclasses.
# For example, the HTML encoder is named CodeRay::Encoders::HTML
# can be found in coderay/encoders/html.
@@ -8,9 +8,10 @@ module CodeRay
# mechanism and the [] method that returns the Encoder class
# belonging to the given format.
module Encoders
+
extend PluginHost
plugin_path File.dirname(__FILE__), 'encoders'
-
+
# = Encoder
#
# The Encoder base class. Together with Scanner and
@@ -26,34 +27,32 @@ module CodeRay
class Encoder
extend Plugin
plugin_host Encoders
-
- attr_reader :token_stream
-
+
class << self
-
- # Returns if the Encoder can be used in streaming mode.
- def streamable?
- is_a? Streamable
- end
-
+
# If FILE_EXTENSION isn't defined, this method returns the
# downcase class name instead.
def const_missing sym
if sym == :FILE_EXTENSION
- plugin_id
+ (defined?(@plugin_id) && @plugin_id || name[/\w+$/].downcase).to_s
else
super
end
end
-
+
+ # The default file extension for output file of this encoder class.
+ def file_extension
+ self::FILE_EXTENSION
+ end
+
end
-
+
# Subclasses are to store their default options in this constant.
- DEFAULT_OPTIONS = { :stream => false }
-
+ DEFAULT_OPTIONS = { }
+
# The options you gave the Encoder at creating.
- attr_accessor :options
-
+ attr_accessor :options, :scanner
+
# Creates a new Encoder.
# +options+ is saved and used for all encode operations, as long
# as you don't overwrite it there by passing additional options.
@@ -61,153 +60,142 @@ module CodeRay
# Encoder objects provide three encode methods:
# - encode simply takes a +code+ string and a +lang+
# - encode_tokens expects a +tokens+ object instead
- # - encode_stream is like encode, but uses streaming mode.
#
# Each method has an optional +options+ parameter. These are
# added to the options you passed at creation.
def initialize options = {}
@options = self.class::DEFAULT_OPTIONS.merge options
- raise "I am only the basic Encoder class. I can't encode "\
- "anything. :( Use my subclasses." if self.class == Encoder
+ @@CODERAY_TOKEN_INTERFACE_DEPRECATION_WARNING_GIVEN = false
end
-
+
# Encode a Tokens object.
def encode_tokens tokens, options = {}
options = @options.merge options
+ @scanner = tokens.scanner if tokens.respond_to? :scanner
setup options
compile tokens, options
finish options
end
-
- # Encode the given +code+ after tokenizing it using the Scanner
- # for +lang+.
+
+ # Encode the given +code+ using the Scanner for +lang+.
def encode code, lang, options = {}
options = @options.merge options
- scanner_options = CodeRay.get_scanner_options(options)
- tokens = CodeRay.scan code, lang, scanner_options
- encode_tokens tokens, options
+ @scanner = Scanners[lang].new code, CodeRay.get_scanner_options(options).update(:tokens => self)
+ setup options
+ @scanner.tokenize
+ finish options
end
-
+
# You can use highlight instead of encode, if that seems
# more clear to you.
alias highlight encode
-
- # Encode the given +code+ using the Scanner for +lang+ in
- # streaming mode.
- def encode_stream code, lang, options = {}
- raise NotStreamableError, self unless kind_of? Streamable
- options = @options.merge options
- setup options
- scanner_options = CodeRay.get_scanner_options options
- @token_stream =
- CodeRay.scan_stream code, lang, scanner_options, &self
- finish options
- end
-
- # Behave like a proc. The token method is converted to a proc.
- def to_proc
- method(:token).to_proc
- end
-
- # Return the default file extension for outputs of this encoder.
+
+ # The default file extension for this encoder.
def file_extension
- self.class::FILE_EXTENSION
+ self.class.file_extension
end
-
- protected
-
- # Called with merged options before encoding starts.
- # Sets @out to an empty string.
- #
- # See the HTML Encoder for an example of option caching.
- def setup options
- @out = ''
+
+ def << token
+ unless @@CODERAY_TOKEN_INTERFACE_DEPRECATION_WARNING_GIVEN
+ warn 'Using old Tokens#<< interface.'
+ @@CODERAY_TOKEN_INTERFACE_DEPRECATION_WARNING_GIVEN = true
+ end
+ self.token(*token)
end
-
+
# Called with +content+ and +kind+ of the currently scanned token.
# For simple scanners, it's enougth to implement this method.
#
- # By default, it calls text_token or block_token, depending on
- # whether +content+ is a String.
+ # By default, it calls text_token, begin_group, end_group, begin_line,
+ # or end_line, depending on the +content+.
def token content, kind
- encoded_token =
- if content.is_a? ::String
- text_token content, kind
- elsif content.is_a? ::Symbol
- block_token content, kind
- else
- raise 'Unknown token content type: %p' % [content]
- end
- append_encoded_token_to_output encoded_token
- end
-
- def append_encoded_token_to_output encoded_token
- @out << encoded_token if encoded_token && defined?(@out) && @out
- end
-
- # Called for each text token ([text, kind]), where text is a String.
- def text_token text, kind
- end
-
- # Called for each block (non-text) token ([action, kind]),
- # where +action+ is a Symbol.
- #
- # Calls open_token, close_token, begin_line, and end_line according to
- # the value of +action+.
- def block_token action, kind
- case action
- when :open
- open_token kind
- when :close
- close_token kind
+ case content
+ when String
+ text_token content, kind
+ when :begin_group
+ begin_group kind
+ when :end_group
+ end_group kind
when :begin_line
begin_line kind
when :end_line
end_line kind
else
- raise 'unknown block action: %p' % action
+ raise ArgumentError, 'Unknown token content type: %p, kind = %p' % [content, kind]
end
end
- # Called for each block token at the start of the block ([:open, kind]).
- def open_token kind
+ # Called for each text token ([text, kind]), where text is a String.
+ def text_token text, kind
+ @out << text
end
- # Called for each block token end of the block ([:close, kind]).
- def close_token kind
+ # Starts a token group with the given +kind+.
+ def begin_group kind
end
- # Called for each line token block at the start of the line ([:begin_line, kind]).
+ # Ends a token group with the given +kind+.
+ def end_group kind
+ end
+
+ # Starts a new line token group with the given +kind+.
def begin_line kind
end
- # Called for each line token block at the end of the line ([:end_line, kind]).
+ # Ends a new line token group with the given +kind+.
def end_line kind
end
-
+
+ protected
+
+ # Called with merged options before encoding starts.
+ # Sets @out to an empty string.
+ #
+ # See the HTML Encoder for an example of option caching.
+ def setup options
+ @out = get_output(options)
+ end
+
+ def get_output options
+ options[:out] || ''
+ end
+
+ # Append data.to_s to the output. Returns the argument.
+ def output data
+ @out << data.to_s
+ data
+ end
+
# Called with merged options after encoding starts.
# The return value is the result of encoding, typically @out.
def finish options
@out
end
-
+
# Do the encoding.
#
- # The already created +tokens+ object must be used; it can be a
- # TokenStream or a Tokens object.
- if RUBY_VERSION >= '1.9'
- def compile tokens, options
- for text, kind in tokens
- token text, kind
+ # The already created +tokens+ object must be used; it must be a
+ # Tokens object.
+ def compile tokens, options = {}
+ content = nil
+ for item in tokens
+ if item.is_a? Array
+ raise ArgumentError, 'Two-element array tokens are no longer supported.'
+ end
+ if content
+ token content, item
+ content = nil
+ else
+ content = item
end
end
- else
- def compile tokens, options
- tokens.each(&self)
- end
+ raise 'odd number list for Tokens' if content
end
-
+
+ alias tokens compile
+ public :tokens
+
end
-
+
end
end
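Note on the new Encoder#compile above: tokens are no longer iterated as [text, kind] pairs but as one flat list in which content and kind alternate. The sketch below isolates that loop. The helper name token_pairs is an assumption for illustration; the real method dispatches each pair to token instead of collecting it.

```ruby
# Sketch of walking a flat token list the way the new Encoder#compile does.
# token_pairs is a hypothetical helper; it collects pairs instead of
# dispatching them to an encoder.
def token_pairs(tokens)
  content = nil
  pairs = []
  tokens.each do |item|
    if item.is_a? Array
      raise ArgumentError, 'Two-element array tokens are no longer supported.'
    end
    if content
      # item is the kind that belongs to the remembered content
      pairs << [content, item]
      content = nil
    else
      content = item
    end
  end
  raise 'odd number list for Tokens' if content
  pairs
end
```

The leftover check after the loop catches a list with an odd number of entries, which would otherwise silently drop the last content item.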
diff --git a/lib/coderay/encoders/_map.rb b/lib/coderay/encoders/_map.rb
index 526c3a0..4cca196 100644
--- a/lib/coderay/encoders/_map.rb
+++ b/lib/coderay/encoders/_map.rb
@@ -1,12 +1,17 @@
module CodeRay
module Encoders
-
+
map \
- :loc => :lines_of_code,
- :plain => :text,
- :stats => :statistic,
- :terminal => :term,
- :tex => :latex
-
+ :loc => :lines_of_code,
+ :plain => :text,
+ :plaintext => :text,
+ :remove_comments => :comment_filter,
+ :stats => :statistic,
+ :term => :terminal,
+ :tty => :terminal,
+ :yml => :yaml
+
+ # No default because Tokens#nonsense should raise NoMethodError.
+
end
end
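Note on the encoder alias map above: the direction of the terminal alias is reversed (:term now points to :terminal), :tex is dropped, and several aliases are added. The sketch below copies the new table into a plain Hash to show how a lookup with fallback behaves. The constant and the helper resolve_encoder_id are assumptions for illustration; the actual resolution happens in PluginHost, which is not part of this section.

```ruby
# Alias table copied from the diff; resolve_encoder_id is a hypothetical
# helper and not the PluginHost lookup.
ENCODER_ALIASES = {
  :loc             => :lines_of_code,
  :plain           => :text,
  :plaintext       => :text,
  :remove_comments => :comment_filter,
  :stats           => :statistic,
  :term            => :terminal,
  :tty             => :terminal,
  :yml             => :yaml
}

# Returns the canonical id for an alias, or the id itself if no alias exists.
def resolve_encoder_id(id)
  ENCODER_ALIASES.fetch(id.to_sym, id.to_sym)
end
```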
diff --git a/lib/coderay/encoders/comment_filter.rb b/lib/coderay/encoders/comment_filter.rb
index 4d3fb54..28336b3 100644
--- a/lib/coderay/encoders/comment_filter.rb
+++ b/lib/coderay/encoders/comment_filter.rb
@@ -1,43 +1,25 @@
-($:.unshift '../..'; require 'coderay') unless defined? CodeRay
module CodeRay
module Encoders
- load :token_class_filter
+ load :token_kind_filter
- class CommentFilter < TokenClassFilter
+ # A simple Filter that removes all tokens of the :comment kind.
+ #
+ # Alias: +remove_comments+
+ #
+ # Usage:
+ # CodeRay.scan('print # foo', :ruby).comment_filter.text
+ # #-> "print "
+ #
+ # See also: TokenKindFilter, LinesOfCode
+ class CommentFilter < TokenKindFilter
register_for :comment_filter
DEFAULT_OPTIONS = superclass::DEFAULT_OPTIONS.merge \
- :exclude => [:comment]
+ :exclude => [:comment, :docstring]
end
end
end
-
-if $0 == __FILE__
- $VERBOSE = true
- $: << File.join(File.dirname(__FILE__), '..')
- eval DATA.read, nil, $0, __LINE__ + 4
-end
-
-__END__
-require 'test/unit'
-
-class CommentFilterTest < Test::Unit::TestCase
-
- def test_filtering_comments
- tokens = CodeRay.scan <<-RUBY, :ruby
-#!/usr/bin/env ruby
-# a minimal Ruby program
-puts "Hello world!"
- RUBY
- assert_equal <<-RUBY_FILTERED, tokens.comment_filter.text
-#!/usr/bin/env ruby
-
-puts "Hello world!"
- RUBY_FILTERED
- end
-
-end
\ No newline at end of file
diff --git a/lib/coderay/encoders/count.rb b/lib/coderay/encoders/count.rb
index c9a6dfd..98a427e 100644
--- a/lib/coderay/encoders/count.rb
+++ b/lib/coderay/encoders/count.rb
@@ -1,21 +1,39 @@
module CodeRay
module Encoders
-
+
+ # Returns the number of tokens.
+ #
+ # Text and block tokens are counted.
class Count < Encoder
-
- include Streamable
+
register_for :count
-
- protected
-
+
+ protected
+
def setup options
- @out = 0
+ super
+
+ @count = 0
end
-
- def token text, kind
- @out += 1
+
+ def finish options
+ output @count
end
+
+ public
+
+ def text_token text, kind
+ @count += 1
+ end
+
+ def begin_group kind
+ @count += 1
+ end
+ alias end_group begin_group
+ alias begin_line begin_group
+ alias end_line begin_group
+
end
-
+
end
end
diff --git a/lib/coderay/encoders/debug.rb b/lib/coderay/encoders/debug.rb
index a4b0648..95d6138 100644
--- a/lib/coderay/encoders/debug.rb
+++ b/lib/coderay/encoders/debug.rb
@@ -1,6 +1,6 @@
module CodeRay
module Encoders
-
+
# = Debug Encoder
#
# Fast encoder producing simple debug output.
@@ -10,40 +10,52 @@ module Encoders
# You cannot fully restore the tokens information from the
# output, because consecutive :space tokens are merged.
# Use Tokens#dump for caching purposes.
+ #
+ # See also: Scanners::Debug
class Debug < Encoder
-
- include Streamable
+
register_for :debug
-
+
FILE_EXTENSION = 'raydebug'
-
- protected
+
+ def initialize options = {}
+ super
+ @opened = []
+ end
+
def text_token text, kind
if kind == :space
- text
+ @out << text
else
+ # TODO: Escape (
text = text.gsub(/[)\\]/, '\\\\\0') # escape ) and \
- "#{kind}(#{text})"
+ @out << kind.to_s << '(' << text << ')'
end
end
-
- def open_token kind
- "#{kind}<"
+
+ def begin_group kind
+ @opened << kind
+ @out << kind.to_s << '<'
end
-
- def close_token kind
- ">"
+
+ def end_group kind
+ if @opened.last != kind
+ puts @out
+ raise "we are inside #{@opened.inspect}, not #{kind}"
+ end
+ @opened.pop
+ @out << '>'
end
-
+
def begin_line kind
- "#{kind}["
+ @out << kind.to_s << '['
end
-
+
def end_line kind
- "]"
+ @out << ']'
end
-
+
end
-
+
end
end
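The Debug encoder above writes each non-space token as `kind(text)` and escapes `)` and `\` inside the text so the closing parenthesis stays unambiguous. A small sketch of that escaping, assuming nothing beyond core Ruby; `debug_text_token` here is a hypothetical free function that returns the string instead of appending to `@out`.

```ruby
# Hypothetical free-function version of the Debug encoder's text_token;
# it returns the encoded string instead of appending to @out.
def debug_text_token(text, kind)
  # Space tokens are emitted verbatim, without a kind wrapper.
  return text if kind == :space
  # Prefix every ")" and "\" with a backslash ("\0" is the whole match).
  escaped = text.gsub(/[)\\]/, '\\\\\0')
  "#{kind}(#{escaped})"
end

puts debug_text_token('f(x)', :ident)  # prints ident(f(x\))
puts debug_text_token('  ', :space).inspect
```

As the TODO in the patch notes, an opening parenthesis is not escaped.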
diff --git a/lib/coderay/encoders/div.rb b/lib/coderay/encoders/div.rb
index 4120172..efd9435 100644
--- a/lib/coderay/encoders/div.rb
+++ b/lib/coderay/encoders/div.rb
@@ -1,19 +1,23 @@
module CodeRay
module Encoders
-
+
load :html
-
+
+ # Wraps HTML output into a DIV element, using inline styles by default.
+ #
+ # See Encoders::HTML for available options.
class Div < HTML
-
+
FILE_EXTENSION = 'div.html'
-
+
register_for :div
-
+
DEFAULT_OPTIONS = HTML::DEFAULT_OPTIONS.merge \
- :css => :style,
- :wrap => :div
-
+ :css => :style,
+ :wrap => :div,
+ :line_numbers => false
+
end
-
+
end
end
diff --git a/lib/coderay/encoders/filter.rb b/lib/coderay/encoders/filter.rb
index 5e4b34d..e7f34d6 100644
--- a/lib/coderay/encoders/filter.rb
+++ b/lib/coderay/encoders/filter.rb
@@ -1,75 +1,58 @@
-($:.unshift '../..'; require 'coderay') unless defined? CodeRay
module CodeRay
module Encoders
+ # A Filter encoder has another Tokens instance as output.
+ # It can be subclassed to select, remove, or modify tokens in the stream.
+ #
+ # Subclasses of Filter are called "Filters" and can be chained.
+ #
+ # == Options
+ #
+ # === :tokens
+ #
+ # The Tokens object which will receive the output.
+ #
+ # Default: Tokens.new
+ #
+ # See also: TokenKindFilter
class Filter < Encoder
register_for :filter
protected
def setup options
- @out = Tokens.new
+ super
+
+ @tokens = options[:tokens] || Tokens.new
end
- def text_token text, kind
- [text, kind] if include_text_token? text, kind
+ def finish options
+ output @tokens
end
- def include_text_token? text, kind
- true
- end
+ public
- def block_token action, kind
- [action, kind] if include_block_token? action, kind
+ def text_token text, kind # :nodoc:
+ @tokens.text_token text, kind
end
- def include_block_token? action, kind
- true
+ def begin_group kind # :nodoc:
+ @tokens.begin_group kind
end
- end
-
-end
-end
-
-if $0 == __FILE__
- $VERBOSE = true
- $: << File.join(File.dirname(__FILE__), '..')
- eval DATA.read, nil, $0, __LINE__ + 4
-end
-
-__END__
-require 'test/unit'
-
-class FilterTest < Test::Unit::TestCase
-
- def test_creation
- assert CodeRay::Encoders::Filter < CodeRay::Encoders::Encoder
- filter = nil
- assert_nothing_raised do
- filter = CodeRay.encoder :filter
+ def begin_line kind # :nodoc:
+ @tokens.begin_line kind
end
- assert_kind_of CodeRay::Encoders::Encoder, filter
- end
-
- def test_filtering_text_tokens
- tokens = CodeRay::Tokens.new
- 10.times do |i|
- tokens << [i.to_s, :index]
+
+ def end_group kind # :nodoc:
+ @tokens.end_group kind
end
- assert_equal tokens, CodeRay::Encoders::Filter.new.encode_tokens(tokens)
- assert_equal tokens, tokens.filter
- end
-
- def test_filtering_block_tokens
- tokens = CodeRay::Tokens.new
- 10.times do |i|
- tokens << [:open, :index]
- tokens << [i.to_s, :content]
- tokens << [:close, :index]
+
+ def end_line kind # :nodoc:
+ @tokens.end_line kind
end
- assert_equal tokens, CodeRay::Encoders::Filter.new.encode_tokens(tokens)
- assert_equal tokens, tokens.filter
+
end
end
+end
diff --git a/lib/coderay/encoders/html.rb b/lib/coderay/encoders/html.rb
index 585ddbb..c32dbd1 100644
--- a/lib/coderay/encoders/html.rb
+++ b/lib/coderay/encoders/html.rb
@@ -2,7 +2,7 @@ require 'set'
module CodeRay
module Encoders
-
+
# = HTML Encoder
#
# This is CodeRay's most important highlighter:
@@ -21,12 +21,12 @@ module Encoders
# :line_numbers => :inline,
# :css => :style
# )
- # #-> <span class="no">1</span> <span style="color:#036; font-weight:bold;">Some</span> code
#
# == Options
#
# === :tab_width
# Convert \t characters to +n+ spaces (a number.)
+ #
# Default: 8
#
# === :css
@@ -48,10 +48,18 @@ module Encoders
# Default: 'CodeRay output'
#
# === :line_numbers
- # Include line numbers in :table, :inline, :list or nil (no line numbers)
+ # Include line numbers in :table, :inline, or nil (no line numbers)
#
# Default: nil
#
+ # === :line_number_anchors
+ # Adds anchors and links to the line numbers. Can be false (off), true (on),
+ # or a prefix string that will be prepended to the anchor name.
+ #
+ # The prefix must consist only of letters, digits, and underscores.
+ #
+ # Default: true, default prefix name: "line"
+ #
# === :line_number_start
# Where to start with line number counting.
#
@@ -74,47 +82,48 @@ module Encoders
#
# === :hint
# Include some information into the output using the title attribute.
- # Can be :info (show token type on mouse-over), :info_long (with full path)
+ # Can be :info (show token kind on mouse-over), :info_long (with full path)
# or :debug (via inspect).
#
# Default: false
class HTML < Encoder
-
- include Streamable
+
register_for :html
-
- FILE_EXTENSION = 'html'
-
+
+ FILE_EXTENSION = 'snippet.html'
+
DEFAULT_OPTIONS = {
:tab_width => 8,
-
- :css => :class,
-
- :style => :cycnus,
- :wrap => nil,
+
+ :css => :class,
+ :style => :alpha,
+ :wrap => nil,
:title => 'CodeRay output',
-
- :line_numbers => nil,
- :line_number_start => 1,
- :bold_every => 10,
- :highlight_lines => nil,
-
+
+ :line_numbers => nil,
+ :line_number_anchors => 'n',
+ :line_number_start => 1,
+ :bold_every => 10,
+ :highlight_lines => nil,
+
:hint => false,
}
-
- helper :output, :css
-
+
+ autoload :Output, CodeRay.coderay_path('encoders', 'html', 'output')
+ autoload :CSS, CodeRay.coderay_path('encoders', 'html', 'css')
+ autoload :Numbering, CodeRay.coderay_path('encoders', 'html', 'numbering')
+
attr_reader :css
-
+
protected
-
+
HTML_ESCAPE = { #:nodoc:
 '&' => '&amp;',
 '"' => '&quot;',
 '>' => '&gt;',
 '<' => '&lt;',
}
-
+
# This was to prevent illegal HTML.
# Strange chars should still be avoided in codes.
evil_chars = Array(0x00...0x20) - [?\n, ?\t, ?\s]
@@ -124,185 +133,170 @@ module Encoders
# \x9 (\t) and \xA (\n) not included
#HTML_ESCAPE_PATTERN = /[\t&"><\0-\x8\xB-\x1f\x7f-\xff]/
HTML_ESCAPE_PATTERN = /[\t"&><\0-\x8\xB-\x1f]/
-
- TOKEN_KIND_TO_INFO = Hash.new { |h, kind|
- h[kind] =
- case kind
- when :pre_constant
- 'Predefined constant'
- else
- kind.to_s.gsub(/_/, ' ').gsub(/\b\w/) { $&.capitalize }
- end
- }
-
- TRANSPARENT_TOKEN_KINDS = [
+
+ TOKEN_KIND_TO_INFO = Hash.new do |h, kind|
+ h[kind] = kind.to_s.gsub(/_/, ' ').gsub(/\b\w/) { $&.capitalize }
+ end
+
+ TRANSPARENT_TOKEN_KINDS = Set[
:delimiter, :modifier, :content, :escape, :inline_delimiter,
- ].to_set
-
- # Generate a hint about the given +classes+ in a +hint+ style.
+ ]
+
+ # Generate a hint about the given +kinds+ in a +hint+ style.
#
# +hint+ may be :info, :info_long or :debug.
- def self.token_path_to_hint hint, classes
+ def self.token_path_to_hint hint, kinds
+ kinds = Array kinds
title =
case hint
when :info
- TOKEN_KIND_TO_INFO[classes.first]
+ kinds = kinds[1..-1] if TRANSPARENT_TOKEN_KINDS.include? kinds.first
+ TOKEN_KIND_TO_INFO[kinds.first]
when :info_long
- classes.reverse.map { |kind| TOKEN_KIND_TO_INFO[kind] }.join('/')
+ kinds.reverse.map { |kind| TOKEN_KIND_TO_INFO[kind] }.join('/')
when :debug
- classes.inspect
+ kinds.inspect
end
title ? " title=\"#{title}\"" : ''
end
-
+
def setup options
super
-
+
+ if options[:wrap] || options[:line_numbers]
+ @real_out = @out
+ @out = ''
+ end
+
@HTML_ESCAPE = HTML_ESCAPE.dup
@HTML_ESCAPE["\t"] = ' ' * options[:tab_width]
-
- @opened = [nil]
+
+ @opened = []
+ @last_opened = nil
@css = CSS.new options[:style]
-
+
hint = options[:hint]
- if hint and not [:debug, :info, :info_long].include? hint
+ if hint && ![:debug, :info, :info_long].include?(hint)
raise ArgumentError, "Unknown value %p for :hint; \
- expected :info, :debug, false, or nil." % hint
+ expected :info, :info_long, :debug, false, or nil." % hint
end
-
+
+ css_classes = TokenKinds
case options[:css]
-
when :class
- @css_style = Hash.new do |h, k|
- c = CodeRay::Tokens::ClassOfKind[k.first]
- if c == :NO_HIGHLIGHT and not hint
- h[k.dup] = false
- else
- title = if hint
- HTML.token_path_to_hint(hint, k[1..-1] << k.first)
- else
- ''
- end
- if c == :NO_HIGHLIGHT
- h[k.dup] = '<span%s>' % [title]
- else
- h[k.dup] = '<span%s class="%s">' % [title, c]
- end
- end
- end
-
- when :style
- @css_style = Hash.new do |h, k|
- if k.is_a? ::Array
- styles = k.dup
+ @span_for_kind = Hash.new do |h, k|
+ if k.is_a? ::Symbol
+ kind = k_dup = k
else
- styles = [k]
+ kind = k.first
+ k_dup = k.dup
end
- classes = styles.map { |c| Tokens::ClassOfKind[c] }
- if classes.first == :NO_HIGHLIGHT and not hint
- h[k] = false
+ if kind != :space && (hint || css_class = css_classes[kind])
+ title = HTML.token_path_to_hint hint, k if hint
+ css_class ||= css_classes[kind]
+ h[k_dup] = "<span#{title}#{" class=\"#{css_class}\"" if css_class}>"
else
- styles.shift if TRANSPARENT_TOKEN_KINDS.include? styles.first
- title = HTML.token_path_to_hint hint, styles
- style = @css[*classes]
- h[k] =
- if style
- '<span%s style="%s">' % [title, style]
- else
- false
- end
+ h[k_dup] = nil
end
end
-
+ when :style
+ @span_for_kind = Hash.new do |h, k|
+ kind = k.is_a?(Symbol) ? k : k.first
+ h[k.is_a?(Symbol) ? k : k.dup] =
+ if kind != :space && (hint || css_classes[kind])
+ title = HTML.token_path_to_hint hint, k if hint
+ style = @css.get_style Array(k).map { |c| css_classes[c] }
+ "<span#{title}#{" style=\"#{style}\"" if style}>"
+ end
+ end
else
raise ArgumentError, "Unknown value %p for :css." % options[:css]
-
end
+
+ @set_last_opened = options[:hint] || options[:css] == :style
end
-
+
def finish options
- @opened.shift
- @out << '</span>' * @opened.size
unless @opened.empty?
- warn '%d tokens still open: %p' % [@opened.size, @opened]
+ warn '%d tokens still open: %p' % [@opened.size, @opened] if $CODERAY_DEBUG
+ @out << '</span>' while @opened.pop
+ @last_opened = nil
end
-
+
@out.extend Output
@out.css = @css
- @out.numerize! options[:line_numbers], options
+ if options[:line_numbers]
+ Numbering.number! @out, options[:line_numbers], options
+ end
@out.wrap! options[:wrap]
@out.apply_title! options[:title]
-
- super
- end
-
- def token text, type = :plain
- case text
- when nil
- # raise 'Token with nil as text was given: %p' % [[text, type]]
-
- when String
- if text =~ /#{HTML_ESCAPE_PATTERN}/o
- text = text.gsub(/#{HTML_ESCAPE_PATTERN}/o) { |m| @HTML_ESCAPE[m] }
- end
- @opened[0] = type
- if text != "\n" && style = @css_style[@opened]
- @out << style << text << '</span>'
- else
- @out << text
- end
-
-
- # token groups, eg. strings
- when :open
- @opened[0] = type
- @out << (@css_style[@opened] || '<span>')
- @opened << type
- when :close
- if @opened.empty?
- # nothing to close
- else
- if $CODERAY_DEBUG and (@opened.size == 1 or @opened.last != type)
- raise 'Malformed token stream: Trying to close a token (%p) \
- that is not open. Open are: %p.' % [type, @opened[1..-1]]
- end
- @out << '</span>'
- @opened.pop
- end
+ if defined?(@real_out) && @real_out
+ @real_out << @out
+ @out = @real_out
+ end
- # whole lines to be highlighted, eg. a deleted line in a diff
- when :begin_line
- @opened[0] = type
- if style = @css_style[@opened]
- if style['class="']
- @out << style.sub('class="', 'class="line ')
- else
- @out << style.sub('>', ' class="line">')
- end
- else
- @out << '<span class="line">'
- end
- @opened << type
- when :end_line
- if @opened.empty?
- # nothing to close
+ super
+ end
+
+ public
+
+ def text_token text, kind
+ if text =~ /#{HTML_ESCAPE_PATTERN}/o
+ text = text.gsub(/#{HTML_ESCAPE_PATTERN}/o) { |m| @HTML_ESCAPE[m] }
+ end
+ if style = @span_for_kind[@last_opened ? [kind, *@opened] : kind]
+ @out << style << text << '</span>'
+ else
+ @out << text
+ end
+ end
+
+ # token groups, eg. strings
+ def begin_group kind
+ @out << (@span_for_kind[@last_opened ? [kind, *@opened] : kind] || '<span>')
+ @opened << kind
+ @last_opened = kind if @set_last_opened
+ end
+
+ def end_group kind
+ if $CODERAY_DEBUG && (@opened.empty? || @opened.last != kind)
+ warn 'Malformed token stream: Trying to close a token (%p) ' \
+ 'that is not open. Open are: %p.' % [kind, @opened[1..-1]]
+ end
+ if @opened.pop
+ @out << '</span>'
+ @last_opened = @opened.last if @last_opened
+ end
+ end
+
+ # whole lines to be highlighted, eg. a deleted line in a diff
+ def begin_line kind
+ if style = @span_for_kind[@last_opened ? [kind, *@opened] : kind]
+ if style['class="']
+ @out << style.sub('class="', 'class="line ')
else
- if $CODERAY_DEBUG and (@opened.size == 1 or @opened.last != type)
- raise 'Malformed token stream: Trying to close a line (%p) \
- that is not open. Open are: %p.' % [type, @opened[1..-1]]
- end
- @out << '</span>'
- @opened.pop
+ @out << style.sub('>', ' class="line">')
end
-
else
- raise 'unknown token kind: %p' % [text]
-
+ @out << '<span class="line">'
end
+ @opened << kind
+ @last_opened = kind if @options[:css] == :style
end
-
+
+ def end_line kind
+ if $CODERAY_DEBUG && (@opened.empty? || @opened.last != kind)
+ warn 'Malformed token stream: Trying to close a line (%p) ' \
+ 'that is not open. Open are: %p.' % [kind, @opened[1..-1]]
+ end
+ if @opened.pop
+ @out << '</span>'
+ @last_opened = @opened.last if @last_opened
+ end
+ end
+
end
-
+
end
end
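The HTML encoder's `setup` copies the escape table and adds an entry that expands a tab to `:tab_width` spaces; `text_token` then applies the table through `gsub`. A reduced sketch of that mechanism, under the assumption that only tabs and the four HTML-special characters need handling (the real pattern also covers control characters). `escape_html` and `BASIC_ESCAPE_PATTERN` are hypothetical names.

```ruby
# Reduced, hypothetical version of the encoder's escaping step.
HTML_ESCAPE = {
  '&' => '&amp;',
  '"' => '&quot;',
  '>' => '&gt;',
  '<' => '&lt;',
}.freeze

# Assumption: control characters are ignored here for brevity.
BASIC_ESCAPE_PATTERN = /[\t"&><]/

def escape_html(text, tab_width = 8)
  # Per-call copy of the table, extended with the tab replacement.
  table = HTML_ESCAPE.dup
  table["\t"] = ' ' * tab_width
  text.gsub(BASIC_ESCAPE_PATTERN) { |m| table[m] }
end

puts escape_html("if a < b\t&& c", 2)
```

Building the tab entry once per encoder run, as the patch does in `setup`, avoids recomputing the space string for every token.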
diff --git a/lib/coderay/encoders/html/css.rb b/lib/coderay/encoders/html/css.rb
index 09ac8bc..6de4b46 100644
--- a/lib/coderay/encoders/html/css.rb
+++ b/lib/coderay/encoders/html/css.rb
@@ -2,7 +2,7 @@ module CodeRay
module Encoders
class HTML
- class CSS
+ class CSS # :nodoc:
attr :stylesheet
@@ -20,11 +20,11 @@ module Encoders
parse style::TOKEN_COLORS
end
- def [] *styles
+ def get_style styles
cl = @classes[styles.first]
return '' unless cl
style = ''
- 1.upto(styles.size) do |offset|
+ 1.upto styles.size do |offset|
break if style = cl[styles[offset .. -1]]
end
# warn 'Style not found: %p' % [styles] if style.empty?
@@ -44,7 +44,7 @@ module Encoders
( [^\}]+ )? # $2 = style
\s* \} \s*
|
- ( . ) # $3 = error
+ ( [^\n]+ ) # $3 = error
/mx
def parse stylesheet
stylesheet.scan CSS_CLASS_PATTERN do |selectors, style, error|
@@ -63,8 +63,3 @@ module Encoders
end
end
-
-if $0 == __FILE__
- require 'pp'
- pp CodeRay::Encoders::HTML::CSS.new
-end
diff --git a/lib/coderay/encoders/html/numbering.rb b/lib/coderay/encoders/html/numbering.rb
new file mode 100644
index 0000000..15ce11b
--- /dev/null
+++ b/lib/coderay/encoders/html/numbering.rb
@@ -0,0 +1,115 @@
+module CodeRay
+module Encoders
+
+ class HTML
+
+ module Numbering # :nodoc:
+
+ def self.number! output, mode = :table, options = {}
+ return self unless mode
+
+ options = DEFAULT_OPTIONS.merge options
+
+ start = options[:line_number_start]
+ unless start.is_a? Integer
+ raise ArgumentError, "Invalid value %p for :line_number_start; Integer expected." % start
+ end
+
+ anchor_prefix = options[:line_number_anchors]
+ anchor_prefix = 'line' if anchor_prefix == true
+ anchor_prefix = anchor_prefix.to_s[/\w+/] if anchor_prefix
+ anchoring =
+ if anchor_prefix
+ proc do |line|
+ line = line.to_s
+ anchor = anchor_prefix + line
+ "<a href=\"##{anchor}\" name=\"#{anchor}\">#{line}</a>"
+ end
+ else
+ proc { |line| line.to_s } # :to_s.to_proc in Ruby 1.8.7+
+ end
+
+ bold_every = options[:bold_every]
+ highlight_lines = options[:highlight_lines]
+ bolding =
+ if bold_every == false && highlight_lines == nil
+ anchoring
+ elsif highlight_lines.is_a? Enumerable
+ highlight_lines = highlight_lines.to_set
+ proc do |line|
+ if highlight_lines.include? line
+ "<strong class=\"highlighted\">#{anchoring[line]}</strong>" # highlighted line numbers in bold
+ else
+ anchoring[line]
+ end
+ end
+ elsif bold_every.is_a? Integer
+ raise ArgumentError, ":bolding can't be 0." if bold_every == 0
+ proc do |line|
+ if line % bold_every == 0
+ "<strong>#{anchoring[line]}</strong>" # every bold_every-th number in bold
+ else
+ anchoring[line]
+ end
+ end
+ else
+ raise ArgumentError, 'Invalid value %p for :bolding; false or Integer expected.' % bold_every
+ end
+
+ line_count = output.count("\n")
+ position_of_last_newline = output.rindex(RUBY_VERSION >= '1.9' ? /\n/ : ?\n)
+ if position_of_last_newline
+ after_last_newline = output[position_of_last_newline + 1 .. -1]
+ ends_with_newline = after_last_newline[/\A(?:<\/span>)*\z/]
+ line_count += 1 if not ends_with_newline
+ end
+
+ case mode
+ when :inline
+ max_width = (start + line_count).to_s.size
+ line_number = start
+ nesting = []
+ output.gsub!(/^.*$\n?/) do |line|
+ line.chomp!
+ open = nesting.join
+ line.scan(%r!<(/)?span[^>]*>?!) do |close,|
+ if close
+ nesting.pop
+ else
+ nesting << $&
+ end
+ end
+ close = '</span>' * nesting.size
+
+ line_number_text = bolding.call line_number
+ indent = ' ' * (max_width - line_number.to_s.size) # TODO: Optimize (10^x)
+ line_number += 1
+ "<span class=\"line-numbers\">#{indent}#{line_number_text}</span>#{open}#{line}#{close}\n"
+ end
+
+ when :table
+ line_numbers = (start ... start + line_count).map(&bolding).join("\n")
+ line_numbers << "\n"
+ line_numbers_table_template = Output::TABLE.apply('LINE_NUMBERS', line_numbers)
+
+ output.gsub!(/<\/div>\n/, '</div>')
+ output.wrap_in! line_numbers_table_template
+ output.wrapped_in = :div
+
+ when :list
+ raise NotImplementedError, 'The :list option is no longer available. Use :table.'
+
+ else
+ raise ArgumentError, 'Unknown value %p for mode: expected one of %p' %
+ [mode, [:table, :inline]]
+ end
+
+ output
+ end
+
+ end
+
+ end
+
+end
+end
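The new `:line_number_anchors` handling in `Numbering.number!` reduces the option to a safe prefix (`true` becomes `"line"`, anything else is cut down to its first run of word characters) and then builds a proc that renders each line number. A standalone sketch of just that step; `line_number_formatter` is a hypothetical name, and the markup is copied from the patch.

```ruby
# Hypothetical standalone version of the anchoring logic in Numbering.number!.
# Returns a proc that turns a line number into its HTML representation.
def line_number_formatter(anchors)
  prefix = anchors
  prefix = 'line' if prefix == true
  # Keep only the first run of letters, digits, and underscores.
  prefix = prefix.to_s[/\w+/] if prefix
  if prefix
    proc do |line|
      anchor = "#{prefix}#{line}"
      "<a href=\"##{anchor}\" name=\"#{anchor}\">#{line}</a>"
    end
  else
    # Anchors disabled (false, nil, or no usable prefix): plain numbers.
    proc { |line| line.to_s }
  end
end

puts line_number_formatter('n').call(3)
puts line_number_formatter(false).call(3)
```

Because the bolding procs in the patch call `anchoring[line]`, the anchor markup ends up inside the `<strong>` element for highlighted or every n-th line numbers.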
diff --git a/lib/coderay/encoders/html/numerization.rb b/lib/coderay/encoders/html/numerization.rb
deleted file mode 100644
index 17e8ddb..0000000
--- a/lib/coderay/encoders/html/numerization.rb
+++ /dev/null
@@ -1,133 +0,0 @@
-module CodeRay
-module Encoders
-
- class HTML
-
- module Output
-
- def numerize *args
- clone.numerize!(*args)
- end
-
-=begin NUMERIZABLE_WRAPPINGS = {
- :table => [:div, :page, nil],
- :inline => :all,
- :list => [:div, :page, nil]
- }
- NUMERIZABLE_WRAPPINGS.default = :all
-=end
- def numerize! mode = :table, options = {}
- return self unless mode
-
- options = DEFAULT_OPTIONS.merge options
-
- start = options[:line_number_start]
- unless start.is_a? Integer
- raise ArgumentError, "Invalid value %p for :line_number_start; Integer expected." % start
- end
-
- #allowed_wrappings = NUMERIZABLE_WRAPPINGS[mode]
- #unless allowed_wrappings == :all or allowed_wrappings.include? options[:wrap]
- # raise ArgumentError, "Can't numerize, :wrap must be in %p, but is %p" % [NUMERIZABLE_WRAPPINGS, options[:wrap]]
- #end
-
- bold_every = options[:bold_every]
- highlight_lines = options[:highlight_lines]
- bolding =
- if bold_every == false && highlight_lines == nil
- proc { |line| line.to_s }
- elsif highlight_lines.is_a? Enumerable
- highlight_lines = highlight_lines.to_set
- proc do |line|
- if highlight_lines.include? line
- "<strong class=\"highlighted\">#{line}</strong>" # highlighted line numbers in bold
- else
- line.to_s
- end
- end
- elsif bold_every.is_a? Integer
- raise ArgumentError, ":bolding can't be 0." if bold_every == 0
- proc do |line|
- if line % bold_every == 0
- "<strong>#{line}</strong>" # every bold_every-th number in bold
- else
- line.to_s
- end
- end
- else
- raise ArgumentError, 'Invalid value %p for :bolding; false or Integer expected.' % bold_every
- end
-
- case mode
- when :inline
- max_width = (start + line_count).to_s.size
- line_number = start
- gsub!(/^/) do
- line_number_text = bolding.call line_number
- indent = ' ' * (max_width - line_number.to_s.size) # TODO: Optimize (10^x)
- res = "<span class=\"no\">#{indent}#{line_number_text}</span> "
- line_number += 1
- res
- end
-
- when :table
- # This is really ugly.
- # Because even monospace fonts seem to have different heights when bold,
- # I make the newline bold, both in the code and the line numbers.
- # FIXME Still not working perfect for Mr. Internet Exploder
- line_numbers = (start ... start + line_count).to_a.map(&bolding).join("\n")
- line_numbers << "\n" # also for Mr. MS Internet Exploder :-/
- line_numbers.gsub!(/\n/) { "<tt>\n</tt>" }
-
- line_numbers_table_tpl = TABLE.apply('LINE_NUMBERS', line_numbers)
- gsub!("</div>\n", '</div>')
- gsub!("\n", "<tt>\n</tt>")
- wrap_in! line_numbers_table_tpl
- @wrapped_in = :div
-
- when :list
- opened_tags = []
- gsub!(/^.*$\n?/) do |line|
- line.chomp!
-
- open = opened_tags.join
- line.scan(%r!<(/)?span[^>]*>?!) do |close,|
- if close
- opened_tags.pop
- else
- opened_tags << $&
- end
- end
- close = '</span>' * opened_tags.size
-
- "<li>#{open}#{line}#{close}</li>\n"
- end
- chomp!("\n")
- wrap_in! LIST
- @wrapped_in = :div
-
- else
- raise ArgumentError, 'Unknown value %p for mode: expected one of %p' %
- [mode, [:table, :list, :inline]]
- end
-
- self
- end
-
- def line_count
- line_count = count("\n")
- position_of_last_newline = rindex(?\n)
- if position_of_last_newline
- after_last_newline = self[position_of_last_newline + 1 .. -1]
- ends_with_newline = after_last_newline[/\A(?:<\/span>)*\z/]
- line_count += 1 if not ends_with_newline
- end
- line_count
- end
-
- end
-
- end
-
-end
-end
diff --git a/lib/coderay/encoders/html/output.rb b/lib/coderay/encoders/html/output.rb
index 28574a5..9132d94 100644
--- a/lib/coderay/encoders/html/output.rb
+++ b/lib/coderay/encoders/html/output.rb
@@ -3,44 +3,29 @@ module Encoders
class HTML
- # This module is included in the output String from thew HTML Encoder.
+ # This module is included in the output String of the HTML Encoder.
#
# It provides methods like wrap, div, page etc.
#
# Remember to use #clone instead of #dup to keep the modules the object was
# extended with.
#
- # TODO: more doc.
+ # TODO: Rewrite this without monkey patching.
module Output
- require 'coderay/encoders/html/numerization.rb'
-
attr_accessor :css
class << self
- # This makes Output look like a class.
- #
- # Example:
- #
- # a = Output.new '<span class="co">Code</span>'
- # a.wrap! :page
- def new string, css = CSS.new, element = nil
- output = string.clone.extend self
- output.wrapped_in = element
- output.css = css
- output
- end
-
# Raises an exception if an object that doesn't respond to to_str is extended by Output,
# to prevent users from misuse. Use Module#remove_method to disable.
- def extended o
+ def extended o # :nodoc:
warn "The Output module is intended to extend instances of String, not #{o.class}." unless o.respond_to? :to_str
end
- def make_stylesheet css, in_tag = false
+ def make_stylesheet css, in_tag = false # :nodoc:
sheet = css.stylesheet
- sheet = <<-CSS if in_tag
+ sheet = <<-'CSS' if in_tag
<style type="text/css">
#{sheet}
</style>
@@ -48,27 +33,13 @@ module Encoders
sheet
end
- def page_template_for_css css
+ def page_template_for_css css # :nodoc:
sheet = make_stylesheet css
PAGE.apply 'CSS', sheet
end
- # Define a new wrapper. This is meta programming.
- def wrapper *wrappers
- wrappers.each do |wrapper|
- define_method wrapper do |*args|
- wrap wrapper, *args
- end
- define_method "#{wrapper}!".to_sym do |*args|
- wrap! wrapper, *args
- end
- end
- end
-
end
- wrapper :div, :span, :page
-
def wrapped_in? element
wrapped_in == element
end
@@ -78,10 +49,6 @@ module Encoders
end
attr_writer :wrapped_in
- def wrap_in template
- clone.wrap_in! template
- end
-
def wrap_in! template
Template.wrap! self, template, 'CONTENT'
self
@@ -118,15 +85,13 @@ module Encoders
self
end
- def wrap *args
- clone.wrap!(*args)
- end
-
def stylesheet in_tag = false
Output.make_stylesheet @css, in_tag
end
- class Template < String
+#-- don't include the templates in docu
+
+ class Template < String # :nodoc:
def self.wrap! str, template, target
target = Regexp.new(Regexp.escape("<%#{target}%>"))
@@ -147,51 +112,46 @@ module Encoders
end
end
- module Simple
- def ` str #` <-- for stupid editors
- Template.new str
- end
- end
end
- extend Template::Simple
+ SPAN = Template.new '<span class="CodeRay"><%CONTENT%></span>'
-#-- don't include the templates in docu
-
- SPAN = `<span class="CodeRay"><%CONTENT%></span>`
-
- DIV = <<-`DIV`
+ DIV = Template.new <<-DIV
<div class="CodeRay">
<div class="code"><pre><%CONTENT%></pre></div>
</div>
DIV
- TABLE = <<-`TABLE`
+ TABLE = Template.new <<-TABLE
<table class="CodeRay"><tr>
- <td class="line_numbers" title="click to toggle" onclick="with (this.firstChild.style) { display = (display == '') ? 'none' : '' }"><pre><%LINE_NUMBERS%></pre></td>
- <td class="code"><pre ondblclick="with (this.style) { overflow = (overflow == 'auto' || overflow == '') ? 'visible' : 'auto' }"><%CONTENT%></pre></td>
+ <td class="line-numbers" title="double click to toggle" ondblclick="with (this.firstChild.style) { display = (display == '') ? 'none' : '' }"><pre><%LINE_NUMBERS%></pre></td>
+ <td class="code"><pre><%CONTENT%></pre></td>
</tr></table>
TABLE
- # title="double click to expand"
-
- LIST = <<-`LIST`
-<ol class="CodeRay">
-<%CONTENT%>
-</ol>
- LIST
- PAGE = <<-`PAGE`
-<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
- "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
-<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="de">
+ PAGE = Template.new <<-PAGE
+<!DOCTYPE html>
+<html>
<head>
- <meta http-equiv="content-type" content="text/html; charset=utf-8" />
+ <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title></title>
<style type="text/css">
+.CodeRay .line-numbers a {
+ text-decoration: inherit;
+ color: inherit;
+}
+body {
+ background-color: white;
+ padding: 0;
+ margin: 0;
+}
<%CSS%>
+.CodeRay {
+ border: none;
+}
</style>
</head>
-<body style="background-color: white;">
+<body>
<%CONTENT%>
</body>
diff --git a/lib/coderay/encoders/json.rb b/lib/coderay/encoders/json.rb
index 7aa077c..a9e40dc 100644
--- a/lib/coderay/encoders/json.rb
+++ b/lib/coderay/encoders/json.rb
@@ -1,69 +1,83 @@
-($:.unshift '../..'; require 'coderay') unless defined? CodeRay
module CodeRay
module Encoders
- # = JSON Encoder
+ # A simple JSON Encoder.
+ #
+ # Example:
+ # CodeRay.scan('puts "Hello world!"', :ruby).json
+ # yields
+ # [
+ # {"type"=>"text", "text"=>"puts", "kind"=>"ident"},
+ # {"type"=>"text", "text"=>" ", "kind"=>"space"},
+ # {"type"=>"block", "action"=>"open", "kind"=>"string"},
+ # {"type"=>"text", "text"=>"\"", "kind"=>"delimiter"},
+ # {"type"=>"text", "text"=>"Hello world!", "kind"=>"content"},
+ # {"type"=>"text", "text"=>"\"", "kind"=>"delimiter"},
+ # {"type"=>"block", "action"=>"close", "kind"=>"string"},
+ # ]
class JSON < Encoder
+ begin
+ require 'json'
+ rescue LoadError
+ begin
+ require 'rubygems' unless defined? Gem
+ gem 'json'
+ require 'json'
+ rescue LoadError
+ $stderr.puts "The JSON encoder needs the JSON library.\n" \
+ "Please gem install json."
+ raise
+ end
+ end
+
register_for :json
FILE_EXTENSION = 'json'
protected
def setup options
- begin
- require 'json'
- rescue LoadError
- require 'rubygems'
- require 'json'
+ super
+
+ @first = true
+ @out << '['
+ end
+
+ def finish options
+ @out << ']'
+ end
+
+ def append data
+ if @first
+ @first = false
+ else
+ @out << ','
end
- @out = []
+
+ @out << data.to_json
end
+ public
def text_token text, kind
- { :type => 'text', :text => text, :kind => kind }
+ append :type => 'text', :text => text, :kind => kind
end
- def block_token action, kind
- { :type => 'block', :action => action, :kind => kind }
+ def begin_group kind
+ append :type => 'block', :action => 'open', :kind => kind
end
- def finish options
- @out.to_json
+ def end_group kind
+ append :type => 'block', :action => 'close', :kind => kind
+ end
+
+ def begin_line kind
+ append :type => 'block', :action => 'begin_line', :kind => kind
+ end
+
+ def end_line kind
+ append :type => 'block', :action => 'end_line', :kind => kind
end
end
end
end
-
-if $0 == __FILE__
- $VERBOSE = true
- $: << File.join(File.dirname(__FILE__), '..')
- eval DATA.read, nil, $0, __LINE__ + 4
-end
-
-__END__
-require 'test/unit'
-$:.delete '.'
-require 'rubygems' if RUBY_VERSION < '1.9'
-
-class JSONEncoderTest < Test::Unit::TestCase
-
- def test_json_output
- tokens = CodeRay.scan <<-RUBY, :ruby
-puts "Hello world!"
- RUBY
- require 'json'
- assert_equal [
- {"type"=>"text", "text"=>"puts", "kind"=>"ident"},
- {"type"=>"text", "text"=>" ", "kind"=>"space"},
- {"type"=>"block", "action"=>"open", "kind"=>"string"},
- {"type"=>"text", "text"=>"\"", "kind"=>"delimiter"},
- {"type"=>"text", "text"=>"Hello world!", "kind"=>"content"},
- {"type"=>"text", "text"=>"\"", "kind"=>"delimiter"},
- {"type"=>"block", "action"=>"close", "kind"=>"string"},
- {"type"=>"text", "text"=>"\n", "kind"=>"space"}
- ], JSON.load(tokens.json)
- end
-
-end
\ No newline at end of file
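The rewritten JSON encoder no longer collects hashes in an array and serializes them at the end; it streams each token into the output string and uses a `@first` flag to place the commas. A self-contained sketch of that pattern using only the standard `json` library; `JSONStreamSketch` is a hypothetical stand-in, not the CodeRay class.

```ruby
require 'json'

# Hypothetical stand-in for the streaming logic of the JSON encoder.
class JSONStreamSketch
  def initialize
    @out = '['.dup   # unfrozen copy so it can be appended to
    @first = true
  end

  # Append one token hash, writing a comma before all but the first.
  def append(data)
    if @first
      @first = false
    else
      @out << ','
    end
    @out << data.to_json
  end

  def text_token(text, kind)
    append :type => 'text', :text => text, :kind => kind
  end

  def begin_group(kind)
    append :type => 'block', :action => 'open', :kind => kind
  end

  def end_group(kind)
    append :type => 'block', :action => 'close', :kind => kind
  end

  # Close the array and return the finished JSON text.
  def finish
    @out << ']'
  end
end

encoder = JSONStreamSketch.new
encoder.text_token 'puts', :ident
encoder.begin_group :string
encoder.end_group :string
puts encoder.finish
```

Symbols such as `:ident` are serialized as JSON strings, which is why the documented example output shows `"kind"=>"ident"`.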
diff --git a/lib/coderay/encoders/lines_of_code.rb b/lib/coderay/encoders/lines_of_code.rb
index c1ad66e..5f8422f 100644
--- a/lib/coderay/encoders/lines_of_code.rb
+++ b/lib/coderay/encoders/lines_of_code.rb
@@ -1,10 +1,9 @@
-($:.unshift '../..'; require 'coderay') unless defined? CodeRay
module CodeRay
module Encoders
# Counts the LoC (Lines of Code). Returns an Integer >= 0.
#
- # Alias: :loc
+ # Alias: +loc+
#
# Everything that is not comment, markup, doctype/shebang, or an empty line,
# is considered to be code.
@@ -15,76 +14,32 @@ module Encoders
#
# A Scanner class should define the token kinds that are not code in the
# KINDS_NOT_LOC constant, which defaults to [:comment, :doctype].
- class LinesOfCode < Encoder
+ class LinesOfCode < TokenKindFilter
register_for :lines_of_code
NON_EMPTY_LINE = /^\s*\S.*$/
- def compile tokens, options
- if scanner = tokens.scanner
+ protected
+
+ def setup options
+ if scanner
kinds_not_loc = scanner.class::KINDS_NOT_LOC
else
- warn ArgumentError, 'Tokens have no scanner.' if $VERBOSE
+ warn "Tokens have no associated scanner, counting all nonempty lines." if $VERBOSE
kinds_not_loc = CodeRay::Scanners::Scanner::KINDS_NOT_LOC
end
- code = tokens.token_class_filter :exclude => kinds_not_loc
- @loc = code.text.scan(NON_EMPTY_LINE).size
+
+ options[:exclude] = kinds_not_loc
+
+ super options
end
def finish options
- @loc
+ output @tokens.text.scan(NON_EMPTY_LINE).size
end
end
end
end
-
-if $0 == __FILE__
- $VERBOSE = true
- $: << File.join(File.dirname(__FILE__), '..')
- eval DATA.read, nil, $0, __LINE__ + 4
-end
-
-__END__
-require 'test/unit'
-
-class LinesOfCodeTest < Test::Unit::TestCase
-
- def test_creation
- assert CodeRay::Encoders::LinesOfCode < CodeRay::Encoders::Encoder
- filter = nil
- assert_nothing_raised do
- filter = CodeRay.encoder :loc
- end
- assert_kind_of CodeRay::Encoders::LinesOfCode, filter
- assert_nothing_raised do
- filter = CodeRay.encoder :lines_of_code
- end
- assert_kind_of CodeRay::Encoders::LinesOfCode, filter
- end
-
- def test_lines_of_code
- tokens = CodeRay.scan <<-RUBY, :ruby
-#!/usr/bin/env ruby
-
-# a minimal Ruby program
-puts "Hello world!"
- RUBY
- assert_equal 1, CodeRay::Encoders::LinesOfCode.new.encode_tokens(tokens)
- assert_equal 1, tokens.lines_of_code
- assert_equal 1, tokens.loc
- end
-
- def test_filtering_block_tokens
- tokens = CodeRay::Tokens.new
- tokens << ["Hello\n", :world]
- tokens << ["Hello\n", :space]
- tokens << ["Hello\n", :comment]
- assert_equal 2, CodeRay::Encoders::LinesOfCode.new.encode_tokens(tokens)
- assert_equal 2, tokens.lines_of_code
- assert_equal 2, tokens.loc
- end
-
-end
\ No newline at end of file
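Note on the lines_of_code.rb change above: the counting rule is "drop tokens whose kind is not code, then count the non-empty lines of the remaining text". A minimal sketch of that rule without CodeRay follows. The method name count_loc and the plain [text, kind] pairs are assumptions made for illustration; the regular expression and the default kinds are taken from the diff.

```ruby
# Same pattern as NON_EMPTY_LINE in the diff.
NON_EMPTY_LINE_SKETCH = /^\s*\S.*$/

# Hypothetical helper (not part of CodeRay): tokens is an Array of
# [text, kind] pairs; kinds_not_loc lists the kinds that are not code.
def count_loc tokens, kinds_not_loc = [:comment, :doctype]
  code = tokens.reject { |_text, kind| kinds_not_loc.include? kind }
  code.map { |text, _kind| text }.join.scan(NON_EMPTY_LINE_SKETCH).size
end

tokens = [
  ['#!/usr/bin/env ruby', :doctype],
  ["\n\n", :space],
  ['# a minimal Ruby program', :comment],
  ["\n", :space],
  ['puts', :ident],
  [' ', :space],
  ['"Hello world!"', :string],
  ["\n", :space],
]
puts count_loc(tokens)
```

Only the puts line survives the filter, which matches the expectation in the removed test_lines_of_code.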
diff --git a/lib/coderay/encoders/null.rb b/lib/coderay/encoders/null.rb
index add3862..73ba47d 100644
--- a/lib/coderay/encoders/null.rb
+++ b/lib/coderay/encoders/null.rb
@@ -1,26 +1,18 @@
module CodeRay
module Encoders
-
+
# = Null Encoder
#
# Does nothing and returns an empty string.
class Null < Encoder
-
- include Streamable
+
register_for :null
-
- # Defined for faster processing
- def to_proc
- proc {}
- end
-
- protected
-
- def token(*)
+
+ def text_token text, kind
# do nothing
end
-
+
end
-
+
end
end
diff --git a/lib/coderay/encoders/page.rb b/lib/coderay/encoders/page.rb
index 1b69cce..800e73f 100644
--- a/lib/coderay/encoders/page.rb
+++ b/lib/coderay/encoders/page.rb
@@ -1,20 +1,24 @@
module CodeRay
module Encoders
-
+
load :html
-
+
+ # Wraps the output into an HTML page, using CSS classes and
+ # line numbers in the table format by default.
+ #
+ # See Encoders::HTML for available options.
class Page < HTML
-
+
FILE_EXTENSION = 'html'
-
+
register_for :page
-
+
DEFAULT_OPTIONS = HTML::DEFAULT_OPTIONS.merge \
- :css => :class,
- :wrap => :page,
+ :css => :class,
+ :wrap => :page,
:line_numbers => :table
-
+
end
-
+
end
end
diff --git a/lib/coderay/encoders/span.rb b/lib/coderay/encoders/span.rb
index 319f6fd..da705bd 100644
--- a/lib/coderay/encoders/span.rb
+++ b/lib/coderay/encoders/span.rb
@@ -1,19 +1,23 @@
module CodeRay
module Encoders
-
+
load :html
-
+
+ # Wraps HTML output into a SPAN element, using inline styles by default.
+ #
+ # See Encoders::HTML for available options.
class Span < HTML
-
+
FILE_EXTENSION = 'span.html'
-
+
register_for :span
-
+
DEFAULT_OPTIONS = HTML::DEFAULT_OPTIONS.merge \
- :css => :style,
- :wrap => :span
-
+ :css => :style,
+ :wrap => :span,
+ :line_numbers => false
+
end
-
+
end
end
diff --git a/lib/coderay/encoders/statistic.rb b/lib/coderay/encoders/statistic.rb
index 6d0c646..2315d9e 100644
--- a/lib/coderay/encoders/statistic.rb
+++ b/lib/coderay/encoders/statistic.rb
@@ -1,43 +1,27 @@
module CodeRay
module Encoders
-
+
# Makes a statistic for the given tokens.
+ #
+ # Alias: +stats+
class Statistic < Encoder
-
- include Streamable
- register_for :stats, :statistic
-
- attr_reader :type_stats, :real_token_count
-
+
+ register_for :statistic
+
+ attr_reader :type_stats, :real_token_count # :nodoc:
+
+ TypeStats = Struct.new :count, :size # :nodoc:
+
protected
-
- TypeStats = Struct.new :count, :size
-
+
def setup options
+ super
+
@type_stats = Hash.new { |h, k| h[k] = TypeStats.new 0, 0 }
@real_token_count = 0
end
-
- def generate tokens, options
- @tokens = tokens
- super
- end
-
- def text_token text, kind
- @real_token_count += 1 unless kind == :space
- @type_stats[kind].count += 1
- @type_stats[kind].size += text.size
- @type_stats['TOTAL'].size += text.size
- @type_stats['TOTAL'].count += 1
- end
-
- # TODO Hierarchy handling
- def block_token action, kind
- @type_stats['TOTAL'].count += 1
- @type_stats['open/close'].count += 1
- end
-
- STATS = <<-STATS
+
+ STATS = <<-STATS # :nodoc:
Code Statistics
@@ -49,12 +33,12 @@ Token Types (%d):
type count ratio size (average)
-------------------------------------------------------------
%s
- STATS
-# space 12007 33.81 % 1.7
- TOKEN_TYPES_ROW = <<-TKR
+ STATS
+
+ TOKEN_TYPES_ROW = <<-TKR # :nodoc:
%-20s %8d %6.2f %% %5.1f
- TKR
-
+ TKR
+
def finish options
all = @type_stats['TOTAL']
all_count, all_size = all.count, all.size
@@ -64,14 +48,49 @@ Token Types (%d):
types_stats = @type_stats.sort_by { |k, v| [-v.count, k.to_s] }.map do |k, v|
TOKEN_TYPES_ROW % [k, v.count, 100.0 * v.count / all_count, v.size]
end.join
- STATS % [
+ @out << STATS % [
all_count, @real_token_count, all_size,
@type_stats.delete_if { |k, v| k.is_a? String }.size,
types_stats
]
+
+ super
end
-
+
+ public
+
+ def text_token text, kind
+ @real_token_count += 1 unless kind == :space
+ @type_stats[kind].count += 1
+ @type_stats[kind].size += text.size
+ @type_stats['TOTAL'].size += text.size
+ @type_stats['TOTAL'].count += 1
+ end
+
+ # TODO Hierarchy handling
+ def begin_group kind
+ block_token ':begin_group', kind
+ end
+
+ def end_group kind
+ block_token ':end_group', kind
+ end
+
+ def begin_line kind
+ block_token ':begin_line', kind
+ end
+
+ def end_line kind
+ block_token ':end_line', kind
+ end
+
+ def block_token action, kind
+ @type_stats['TOTAL'].count += 1
+ @type_stats[action].count += 1
+ @type_stats[kind].count += 1
+ end
+
end
-
+
end
end
diff --git a/lib/coderay/encoders/term.rb b/lib/coderay/encoders/term.rb
deleted file mode 100644
index 1f284ed..0000000
--- a/lib/coderay/encoders/term.rb
+++ /dev/null
@@ -1,158 +0,0 @@
-# encoders/term.rb
-# By Rob Aldred (http://robaldred.co.uk)
-# Based on idea by Nathan Weizenbaum (http://nex-3.com)
-# MIT License (http://www.opensource.org/licenses/mit-license.php)
-#
-# A CodeRay encoder that outputs code highlighted for a color terminal.
-# Check out http://robaldred.co.uk
-
-module CodeRay
- module Encoders
- class Term < Encoder
- register_for :term
-
- TOKEN_COLORS = {
- :annotation => '35',
- :attribute_name => '33',
- :attribute_name_fat => '33',
- :attribute_value => '31',
- :attribute_value_fat => '31',
- :bin => '1;35',
- :char => {:self => '36', :delimiter => '34'},
- :class => '1;35',
- :class_variable => '36',
- :color => '32',
- :comment => '37',
- :complex => '34',
- :constant => ['34', '4'],
- :decoration => '35',
- :definition => '1;32',
- :directive => ['32', '4'],
- :doc => '46',
- :doctype => '1;30',
- :doc_string => ['31', '4'],
- :entity => '33',
- :error => ['1;33', '41'],
- :exception => '1;31',
- :float => '1;35',
- :function => '1;34',
- :global_variable => '42',
- :hex => '1;36',
- :important => '1;31',
- :include => '33',
- :integer => '1;34',
- :interpreted => '1;35',
- :key => '35',
- :label => '1;4',
- :local_variable => '33',
- :oct => '1;35',
- :operator_name => '1;29',
- :pre_constant => '1;36',
- :pre_type => '1;30',
- :predefined => ['4', '1;34'],
- :preprocessor => '36',
- :pseudo_class => '34',
- :regexp => {
- :content => '31',
- :delimiter => '1;29',
- :modifier => '35',
- :function => '1;29'
- },
- :reserved => '1;31',
- :shell => {
- :self => '42',
- :content => '1;29',
- :delimiter => '37',
- },
- :string => {
- :self => '32',
- :modifier => '1;32',
- :escape => '1;36',
- :delimiter => '1;32',
- },
- :symbol => '1;32',
- :tag => '34',
- :tag_fat => '1;34',
- :tag_special => ['34', '4'],
- :type => '1;34',
- :value => '36',
- :variable => '34',
- :insert => '42',
- :delete => '41',
- :change => '44',
- :head => '45',
- }
- TOKEN_COLORS[:keyword] = TOKEN_COLORS[:reserved]
- TOKEN_COLORS[:method] = TOKEN_COLORS[:function]
- TOKEN_COLORS[:imaginary] = TOKEN_COLORS[:complex]
- TOKEN_COLORS[:open] = TOKEN_COLORS[:close] = TOKEN_COLORS[:nesting_delimiter] = TOKEN_COLORS[:escape] = TOKEN_COLORS[:delimiter]
-
- protected
-
- def setup(options)
- @out = ''
- @opened = [nil]
- @subcolors = nil
- end
-
- def finish(options)
- super
- end
-
- def token text, type = :plain
- case text
-
- when nil
- # raise 'Token with nil as text was given: %p' % [[text, type]]
-
- when String
-
- if color = (@subcolors || TOKEN_COLORS)[type]
- color = color[:self] || return if Hash === color
-
- @out << col(color) + text.gsub("\n", col(0) + "\n" + col(color)) + col(0)
- @out << col(@subcolors[:self]) if @subcolors && @subcolors[:self]
- else
- @out << text
- end
-
- # token groups, eg. strings
- when :open
- @opened[0] = type
- if color = TOKEN_COLORS[type]
- if Hash === color
- @subcolors = color
- @out << col(color[:self]) if color[:self]
- else
- @subcolors = {}
- @out << col(color)
- end
- end
- @opened << type
- when :close
- if @opened.empty?
- # nothing to close
- else
- @out << col(0) if (@subcolors || {})[:self]
- @subcolors = nil
- @opened.pop
- end
-
- # whole lines to be highlighted, eg. a added/modified/deleted lines in a diff
- when :begin_line
-
- when :end_line
-
- else
- raise 'unknown token kind: %p' % [text]
- end
- end
-
- private
-
- def col(color)
- Array(color).map { |c| "\e[#{c}m" }.join
- end
- end
- end
-end
\ No newline at end of file
diff --git a/lib/coderay/encoders/terminal.rb b/lib/coderay/encoders/terminal.rb
new file mode 100644
index 0000000..005032d
--- /dev/null
+++ b/lib/coderay/encoders/terminal.rb
@@ -0,0 +1,179 @@
+module CodeRay
+ module Encoders
+
+ # Outputs code highlighted for a color terminal.
+ #
+ # Note: This encoder is in beta. It currently doesn't use the Styles.
+ #
+ # Alias: +term+
+ #
+ # == Authors & License
+ #
+ # By Rob Aldred (http://robaldred.co.uk)
+ #
+ # Based on idea by Nathan Weizenbaum (http://nex-3.com)
+ #
+ # MIT License (http://www.opensource.org/licenses/mit-license.php)
+ class Terminal < Encoder
+
+ register_for :terminal
+
+ TOKEN_COLORS = {
+ :annotation => '35',
+ :attribute_name => '33',
+ :attribute_value => '31',
+ :binary => '1;35',
+ :char => {
+ :self => '36', :delimiter => '1;34'
+ },
+ :class => '1;35',
+ :class_variable => '36',
+ :color => '32',
+ :comment => '37',
+ :complex => '1;34',
+ :constant => ['1;34', '4'],
+ :decoration => '35',
+ :definition => '1;32',
+ :directive => ['32', '4'],
+ :doc => '46',
+ :doctype => '1;30',
+ :doc_string => ['31', '4'],
+ :entity => '33',
+ :error => ['1;33', '41'],
+ :exception => '1;31',
+ :float => '1;35',
+ :function => '1;34',
+ :global_variable => '42',
+ :hex => '1;36',
+ :include => '33',
+ :integer => '1;34',
+ :key => '35',
+ :label => '1;15',
+ :local_variable => '33',
+ :octal => '1;35',
+ :operator_name => '1;29',
+ :predefined_constant => '1;36',
+ :predefined_type => '1;30',
+ :predefined => ['4', '1;34'],
+ :preprocessor => '36',
+ :pseudo_class => '1;34',
+ :regexp => {
+ :self => '31',
+ :content => '31',
+ :delimiter => '1;29',
+ :modifier => '35',
+ :function => '1;29'
+ },
+ :reserved => '1;31',
+ :shell => {
+ :self => '42',
+ :content => '1;29',
+ :delimiter => '37',
+ },
+ :string => {
+ :self => '32',
+ :modifier => '1;32',
+ :escape => '1;36',
+ :delimiter => '1;32',
+ },
+ :symbol => '1;32',
+ :tag => '1;34',
+ :type => '1;34',
+ :value => '36',
+ :variable => '1;34',
+
+ :insert => '42',
+ :delete => '41',
+ :change => '44',
+ :head => '45'
+ }
+ TOKEN_COLORS[:keyword] = TOKEN_COLORS[:reserved]
+ TOKEN_COLORS[:method] = TOKEN_COLORS[:function]
+ TOKEN_COLORS[:imaginary] = TOKEN_COLORS[:complex]
+ TOKEN_COLORS[:begin_group] = TOKEN_COLORS[:end_group] =
+ TOKEN_COLORS[:escape] = TOKEN_COLORS[:delimiter]
+
+ protected
+
+ def setup(options)
+ super
+ @opened = []
+ @subcolors = nil
+ end
+
+ public
+
+ def text_token text, kind
+ if color = (@subcolors || TOKEN_COLORS)[kind]
+ if Hash === color
+ if color[:self]
+ color = color[:self]
+ else
+ @out << text
+ return
+ end
+ end
+
+ @out << ansi_colorize(color)
+ @out << text.gsub("\n", ansi_clear + "\n" + ansi_colorize(color))
+ @out << ansi_clear
+ @out << ansi_colorize(@subcolors[:self]) if @subcolors && @subcolors[:self]
+ else
+ @out << text
+ end
+ end
+
+ def begin_group kind
+ @opened << kind
+ @out << open_token(kind)
+ end
+ alias begin_line begin_group
+
+ def end_group kind
+ if @opened.empty?
+ # nothing to close
+ else
+ @opened.pop
+ @out << ansi_clear
+ @out << open_token(@opened.last)
+ end
+ end
+
+ def end_line kind
+ if @opened.empty?
+ # nothing to close
+ else
+ @opened.pop
+ # whole lines to be highlighted,
+ # eg. added/modified/deleted lines in a diff
+ @out << "\t" * 100 + ansi_clear
+ @out << open_token(@opened.last)
+ end
+ end
+
+ private
+
+ def open_token kind
+ if color = TOKEN_COLORS[kind]
+ if Hash === color
+ @subcolors = color
+ ansi_colorize(color[:self]) if color[:self]
+ else
+ @subcolors = {}
+ ansi_colorize(color)
+ end
+ else
+ @subcolors = nil
+ ''
+ end
+ end
+
+ def ansi_colorize(color)
+ Array(color).map { |c| "\e[#{c}m" }.join
+ end
+ def ansi_clear
+ ansi_colorize(0)
+ end
+ end
+ end
+end
\ No newline at end of file
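Note on terminal.rb above: colors are emitted as ANSI SGR escape sequences, one sequence per entry when a color is given as an Array, and the color is cleared before each newline and re-opened after it. A minimal standalone sketch is below. ansi_colorize and ansi_clear mirror the private helpers in the diff; colorize_text is a hypothetical wrapper added only for this illustration.

```ruby
# One escape sequence per SGR code; accepts a String, Integer, or Array.
def ansi_colorize color
  Array(color).map { |c| "\e[#{c}m" }.join
end

# SGR code 0 resets all attributes.
def ansi_clear
  ansi_colorize 0
end

# Hypothetical wrapper: clear the color at each line end and re-open it
# on the next line, as text_token does in the diff.
def colorize_text text, color
  ansi_colorize(color) +
    text.gsub("\n", ansi_clear + "\n" + ansi_colorize(color)) +
    ansi_clear
end

puts colorize_text("Hello\nworld", '1;32')
```

Resetting around newlines presumably keeps a background color from running to the end of the terminal line; that reading is an inference from the code, not something the commit states.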
diff --git a/lib/coderay/encoders/text.rb b/lib/coderay/encoders/text.rb
index 161ee67..15c66f9 100644
--- a/lib/coderay/encoders/text.rb
+++ b/lib/coderay/encoders/text.rb
@@ -1,32 +1,46 @@
module CodeRay
module Encoders
-
+
+ # Concatenates the tokens into a single string, resulting in the original
+ # code string if no tokens were removed.
+ #
+ # Alias: +plain+, +plaintext+
+ #
+ # == Options
+ #
+ # === :separator
+ # A separator string to join the tokens.
+ #
+ # Default: empty String
class Text < Encoder
-
- include Streamable
+
register_for :text
-
+
FILE_EXTENSION = 'txt'
-
+
DEFAULT_OPTIONS = {
- :separator => ''
+ :separator => nil
}
-
+
+ def text_token text, kind
+ super
+
+ if @first
+ @first = false
+ else
+ @out << @sep
+ end if @sep
+ end
+
protected
def setup options
super
+
+ @first = true
@sep = options[:separator]
end
-
- def text_token text, kind
- text + @sep
- end
-
- def finish options
- super.chomp @sep
- end
-
+
end
-
+
end
end
diff --git a/lib/coderay/encoders/token_class_filter.rb b/lib/coderay/encoders/token_class_filter.rb
deleted file mode 100644
index a9e8673..0000000
--- a/lib/coderay/encoders/token_class_filter.rb
+++ /dev/null
@@ -1,84 +0,0 @@
-($:.unshift '../..'; require 'coderay') unless defined? CodeRay
-module CodeRay
-module Encoders
-
- load :filter
-
- class TokenClassFilter < Filter
-
- include Streamable
- register_for :token_class_filter
-
- DEFAULT_OPTIONS = {
- :exclude => [],
- :include => :all
- }
-
- protected
- def setup options
- super
- @exclude = options[:exclude]
- @exclude = Array(@exclude) unless @exclude == :all
- @include = options[:include]
- @include = Array(@include) unless @include == :all
- end
-
- def include_text_token? text, kind
- (@include == :all || @include.include?(kind)) &&
- !(@exclude == :all || @exclude.include?(kind))
- end
-
- end
-
-end
-end
-
-if $0 == __FILE__
- $VERBOSE = true
- $: << File.join(File.dirname(__FILE__), '..')
- eval DATA.read, nil, $0, __LINE__ + 4
-end
-
-__END__
-require 'test/unit'
-
-class TokenClassFilterTest < Test::Unit::TestCase
-
- def test_creation
- assert CodeRay::Encoders::TokenClassFilter < CodeRay::Encoders::Encoder
- assert CodeRay::Encoders::TokenClassFilter < CodeRay::Encoders::Filter
- filter = nil
- assert_nothing_raised do
- filter = CodeRay.encoder :token_class_filter
- end
- assert_instance_of CodeRay::Encoders::TokenClassFilter, filter
- end
-
- def test_filtering_text_tokens
- tokens = CodeRay::Tokens.new
- for i in 1..10
- tokens << [i.to_s, :index]
- tokens << [' ', :space] if i < 10
- end
- assert_equal 10, CodeRay::Encoders::TokenClassFilter.new.encode_tokens(tokens, :exclude => :space).size
- assert_equal 10, tokens.token_class_filter(:exclude => :space).size
- assert_equal 9, CodeRay::Encoders::TokenClassFilter.new.encode_tokens(tokens, :include => :space).size
- assert_equal 9, tokens.token_class_filter(:include => :space).size
- assert_equal 0, CodeRay::Encoders::TokenClassFilter.new.encode_tokens(tokens, :exclude => :all).size
- assert_equal 0, tokens.token_class_filter(:exclude => :all).size
- end
-
- def test_filtering_block_tokens
- tokens = CodeRay::Tokens.new
- 10.times do |i|
- tokens << [:open, :index]
- tokens << [i.to_s, :content]
- tokens << [:close, :index]
- end
- assert_equal 20, CodeRay::Encoders::TokenClassFilter.new.encode_tokens(tokens, :include => :blubb).size
- assert_equal 20, tokens.token_class_filter(:include => :blubb).size
- assert_equal 30, CodeRay::Encoders::TokenClassFilter.new.encode_tokens(tokens, :exclude => :index).size
- assert_equal 30, tokens.token_class_filter(:exclude => :index).size
- end
-
-end
diff --git a/lib/coderay/encoders/token_kind_filter.rb b/lib/coderay/encoders/token_kind_filter.rb
new file mode 100644
index 0000000..4773ea3
--- /dev/null
+++ b/lib/coderay/encoders/token_kind_filter.rb
@@ -0,0 +1,111 @@
+module CodeRay
+module Encoders
+
+ load :filter
+
+ # A Filter that selects tokens based on their token kind.
+ #
+ # == Options
+ #
+ # === :exclude
+ #
+ # One or many symbols (in an Array) which shall be excluded.
+ #
+ # Default: []
+ #
+ # === :include
+ #
+ # One or many symbols (in an Array) which shall be included.
+ #
+ # Default: :all, which means all tokens are included.
+ #
+ # Exclusion wins over inclusion.
+ #
+ # See also: CommentFilter
+ class TokenKindFilter < Filter
+
+ register_for :token_kind_filter
+
+ DEFAULT_OPTIONS = {
+ :exclude => [],
+ :include => :all
+ }
+
+ protected
+ def setup options
+ super
+
+ @group_excluded = false
+ @exclude = options[:exclude]
+ @exclude = Array(@exclude) unless @exclude == :all
+ @include = options[:include]
+ @include = Array(@include) unless @include == :all
+ end
+
+ def include_text_token? text, kind
+ include_group? kind
+ end
+
+ def include_group? kind
+ (@include == :all || @include.include?(kind)) &&
+ !(@exclude == :all || @exclude.include?(kind))
+ end
+
+ public
+
+ # Add the token to the output stream if +kind+ matches the conditions.
+ def text_token text, kind
+ super if !@group_excluded && include_text_token?(text, kind)
+ end
+
+ # Add the token group to the output stream if +kind+ matches the
+ # conditions.
+ #
+ # If it does not, all tokens inside the group are excluded from the
+ # stream, even if their kinds match.
+ def begin_group kind
+ if @group_excluded
+ @group_excluded += 1
+ elsif include_group? kind
+ super
+ else
+ @group_excluded = 1
+ end
+ end
+
+ # See +begin_group+.
+ def begin_line kind
+ if @group_excluded
+ @group_excluded += 1
+ elsif include_group? kind
+ super
+ else
+ @group_excluded = 1
+ end
+ end
+
+ # Take care of re-enabling the delegation of tokens to the output stream
+ # if an excluded group has ended.
+ def end_group kind
+ if @group_excluded
+ @group_excluded -= 1
+ @group_excluded = false if @group_excluded.zero?
+ else
+ super
+ end
+ end
+
+ # See +end_group+.
+ def end_line kind
+ if @group_excluded
+ @group_excluded -= 1
+ @group_excluded = false if @group_excluded.zero?
+ else
+ super
+ end
+ end
+
+ end
+
+end
+end
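Note on token_kind_filter.rb above: once a group is excluded, everything nested inside it is dropped as well, and a depth counter tracks when the excluded group has fully closed. A minimal sketch of that bookkeeping follows. GroupFilterSketch is a hypothetical stand-in that collects plain arrays instead of delegating to an encoder, handles only the :exclude option, and uses an Integer depth where the diff uses false or a count.

```ruby
# Hypothetical stand-in for the filter; not part of CodeRay.
class GroupFilterSketch
  attr_reader :out

  def initialize exclude
    @exclude = Array(exclude)
    @excluded_depth = 0   # 0 means "not inside an excluded group"
    @out = []
  end

  def text_token text, kind
    @out << [text, kind] if @excluded_depth.zero? && !@exclude.include?(kind)
  end

  def begin_group kind
    if @excluded_depth > 0
      @excluded_depth += 1          # nested inside an excluded group
    elsif @exclude.include?(kind)
      @excluded_depth = 1           # start skipping
    else
      @out << [:begin_group, kind]
    end
  end

  def end_group kind
    if @excluded_depth > 0
      @excluded_depth -= 1          # resumes output once it reaches 0
    else
      @out << [:end_group, kind]
    end
  end
end

# Builds a small token stream with a nested group inside a string.
def filter_demo
  filter = GroupFilterSketch.new :string
  filter.text_token 'puts', :ident
  filter.begin_group :string
  filter.text_token '"', :delimiter
  filter.begin_group :inline
  filter.text_token 'x', :ident
  filter.end_group :inline
  filter.end_group :string
  filter.text_token "\n", :space
  filter.out
end

p filter_demo
```

The inner :ident token is dropped even though :ident itself is not excluded, which is the behavior the begin_group comment in the diff describes.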
diff --git a/lib/coderay/encoders/xml.rb b/lib/coderay/encoders/xml.rb
index f32c967..3d306a6 100644
--- a/lib/coderay/encoders/xml.rb
+++ b/lib/coderay/encoders/xml.rb
@@ -1,39 +1,40 @@
module CodeRay
module Encoders
-
+
# = XML Encoder
#
# Uses REXML. Very slow.
class XML < Encoder
-
- include Streamable
+
register_for :xml
-
+
FILE_EXTENSION = 'xml'
-
- require 'rexml/document'
-
+
+ autoload :REXML, 'rexml/document'
+
DEFAULT_OPTIONS = {
:tab_width => 8,
:pretty => -1,
:transitive => false,
}
-
+
protected
-
def setup options
+ super
+
@doc = REXML::Document.new
@doc << REXML::XMLDecl.new
@tab_width = options[:tab_width]
@root = @node = @doc.add_element('coderay-tokens')
end
-
+
def finish options
- @out = ''
@doc.write @out, options[:pretty], options[:transitive], true
- @out
+
+ super
end
+ public
def text_token text, kind
if kind == :space
token = @node
@@ -53,19 +54,19 @@ module Encoders
end
end
end
-
- def open_token kind
+
+ def begin_group kind
@node = @node.add_element kind.to_s
end
-
- def close_token kind
+
+ def end_group kind
if @node == @root
raise 'no token to close!'
end
@node = @node.parent
end
-
+
end
-
+
end
end
diff --git a/lib/coderay/encoders/yaml.rb b/lib/coderay/encoders/yaml.rb
index 5564e58..ba6e715 100644
--- a/lib/coderay/encoders/yaml.rb
+++ b/lib/coderay/encoders/yaml.rb
@@ -1,22 +1,50 @@
+autoload :YAML, 'yaml'
+
module CodeRay
module Encoders
-
+
# = YAML Encoder
#
# Slow.
class YAML < Encoder
-
+
register_for :yaml
-
+
FILE_EXTENSION = 'yaml'
-
+
protected
- def compile tokens, options
- require 'yaml'
- @out = tokens.to_a.to_yaml
+ def setup options
+ super
+
+ @data = []
end
-
+
+ def finish options
+ output ::YAML.dump(@data)
+ end
+
+ public
+ def text_token text, kind
+ @data << [text, kind]
+ end
+
+ def begin_group kind
+ @data << [:begin_group, kind]
+ end
+
+ def end_group kind
+ @data << [:end_group, kind]
+ end
+
+ def begin_line kind
+ @data << [:begin_line, kind]
+ end
+
+ def end_line kind
+ @data << [:end_line, kind]
+ end
+
end
-
+
end
end
diff --git a/lib/coderay/for_redcloth.rb b/lib/coderay/for_redcloth.rb
index 69985bc..f9df32b 100644
--- a/lib/coderay/for_redcloth.rb
+++ b/lib/coderay/for_redcloth.rb
@@ -30,7 +30,7 @@ module CodeRay
end
RedCloth::TextileDoc.send :include, ForRedCloth::TextileDoc
RedCloth::Formatters::HTML.module_eval do
- def unescape(html)
+ def unescape(html) # :nodoc:
replacements = {
'&amp;' => '&',
'&quot;' => '"',
@@ -45,7 +45,7 @@ module CodeRay
if !opts[:lang] && RedCloth::VERSION.to_s >= '4.2.0'
# simulating pre-4.2 behavior
if opts[:text].sub!(/\A\[(\w+)\]/, '')
- if CodeRay::Scanners[$1].plugin_id == 'plaintext'
+ if CodeRay::Scanners[$1].lang == :text
opts[:text] = $& + opts[:text]
else
opts[:lang] = $1
@@ -57,7 +57,7 @@ module CodeRay
@in_bc ||= nil
format = @in_bc ? :div : :span
opts[:text] = unescape(opts[:text]) unless @in_bc
- highlighted_code = CodeRay.encode opts[:text], opts[:lang], format, :stream => true
+ highlighted_code = CodeRay.encode opts[:text], opts[:lang], format
highlighted_code.sub!(/\A<(span|div)/) { |m| m + pba(@in_bc || opts) }
highlighted_code
else
@@ -74,7 +74,7 @@ module CodeRay
@in_bc = nil
opts[:lang] ? '' : "</pre>\n"
end
- def escape_pre(text)
+ def escape_pre(text) # :nodoc:
if @in_bc ||= nil
text
else
diff --git a/lib/coderay/helpers/file_type.rb b/lib/coderay/helpers/file_type.rb
index d2ca86b..7b90918 100644
--- a/lib/coderay/helpers/file_type.rb
+++ b/lib/coderay/helpers/file_type.rb
@@ -1,56 +1,70 @@
-#!/usr/bin/env ruby
module CodeRay
-
-# = FileType
-#
-# A simple filetype recognizer.
-#
-# Copyright (c) 2006 by murphy (Kornelius Kalnbach) <murphy rubychan de>
-#
-# License:: LGPL / ask the author
-# Version:: 0.1 (2005-09-01)
-#
-# == Documentation
-#
-# # determine the type of the given
-# lang = FileType[ARGV.first]
-#
-# # return :plaintext if the file type is unknown
-# lang = FileType.fetch ARGV.first, :plaintext
-#
-# # try the shebang line, too
-# lang = FileType.fetch ARGV.first, :plaintext, true
-module FileType
-
- UnknownFileType = Class.new Exception
-
- class << self
-
- # Try to determine the file type of the file.
- #
- # +filename+ is a relative or absolute path to a file.
- #
- # The file itself is only accessed when +read_shebang+ is set to true.
- # That means you can get filetypes from files that don't exist.
- def [] filename, read_shebang = false
- name = File.basename filename
- ext = File.extname(name).sub(/^\./, '') # from last dot, delete the leading dot
- ext2 = filename.to_s[/\.(.*)/, 1] # from first dot
-
- type =
- TypeFromExt[ext] ||
- TypeFromExt[ext.downcase] ||
- (TypeFromExt[ext2] if ext2) ||
- (TypeFromExt[ext2.downcase] if ext2) ||
- TypeFromName[name] ||
- TypeFromName[name.downcase]
- type ||= shebang(filename) if read_shebang
-
- type
- end
-
- def shebang filename
- begin
+
+ # = FileType
+ #
+ # A simple filetype recognizer.
+ #
+ # == Usage
+ #
+ # # determine the type of the given file
+ # lang = FileType[file_name]
+ #
+ # # return :text if the file type is unknown
+ # lang = FileType.fetch file_name, :text
+ #
+ # # try the shebang line, too
+ # lang = FileType.fetch file_name, :text, true
+ module FileType
+
+ UnknownFileType = Class.new Exception
+
+ class << self
+
+ # Try to determine the file type of the file.
+ #
+ # +filename+ is a relative or absolute path to a file.
+ #
+ # The file itself is only accessed when +read_shebang+ is set to true.
+ # That means you can get filetypes from files that don't exist.
+ def [] filename, read_shebang = false
+ name = File.basename filename
+ ext = File.extname(name).sub(/^\./, '') # from last dot, delete the leading dot
+ ext2 = filename.to_s[/\.(.*)/, 1] # from first dot
+
+ type =
+ TypeFromExt[ext] ||
+ TypeFromExt[ext.downcase] ||
+ (TypeFromExt[ext2] if ext2) ||
+ (TypeFromExt[ext2.downcase] if ext2) ||
+ TypeFromName[name] ||
+ TypeFromName[name.downcase]
+ type ||= shebang(filename) if read_shebang
+
+ type
+ end
+
+ # This works like Hash#fetch.
+ #
+ # If the filetype cannot be found, the +default+ value
+ # is returned.
+ def fetch filename, default = nil, read_shebang = false
+ if default && block_given?
+ warn 'Block supersedes default value argument; use either.'
+ end
+
+ if type = self[filename, read_shebang]
+ type
+ else
+ return yield if block_given?
+ return default if default
+ raise UnknownFileType, 'Could not determine type of %p.' % filename
+ end
+ end
+
+ protected
+
+ def shebang filename
+ return unless File.exist? filename
File.open filename, 'r' do |f|
if first_line = f.gets
if type = first_line[TypeFromShebang]
@@ -58,203 +72,72 @@ module FileType
end
end
end
- rescue IOError
- nil
end
+
end
-
- # This works like Hash#fetch.
- #
- # If the filetype cannot be found, the +default+ value
- # is returned.
- def fetch filename, default = nil, read_shebang = false
- if default and block_given?
- warn 'block supersedes default value argument'
- end
-
- unless type = self[filename, read_shebang]
- return yield if block_given?
- return default if default
- raise UnknownFileType, 'Could not determine type of %p.' % filename
- end
- type
+
+ TypeFromExt = {
+ 'c' => :c,
+ 'cfc' => :xml,
+ 'cfm' => :xml,
+ 'clj' => :clojure,
+ 'css' => :css,
+ 'diff' => :diff,
+ 'dpr' => :delphi,
+ 'erb' => :erb,
+ 'gemspec' => :ruby,
+ 'groovy' => :groovy,
+ 'gvy' => :groovy,
+ 'h' => :c,
+ 'haml' => :haml,
+ 'htm' => :page,
+ 'html' => :page,
+ 'html.erb' => :erb,
+ 'java' => :java,
+ 'js' => :java_script,
+ 'json' => :json,
+ 'mab' => :ruby,
+ 'pas' => :delphi,
+ 'patch' => :diff,
+ 'php' => :php,
+ 'php3' => :php,
+ 'php4' => :php,
+ 'php5' => :php,
+ 'prawn' => :ruby,
+ 'py' => :python,
+ 'py3' => :python,
+ 'pyw' => :python,
+ 'rake' => :ruby,
+ 'raydebug' => :raydebug,
+ 'rb' => :ruby,
+ 'rbw' => :ruby,
+ 'rhtml' => :erb,
+ 'rjs' => :ruby,
+ 'rpdf' => :ruby,
+ 'ru' => :ruby,
+ 'rxml' => :ruby,
+ # 'sch' => :scheme,
+ 'sql' => :sql,
+ # 'ss' => :scheme,
+ 'tmproj' => :xml,
+ 'xhtml' => :page,
+ 'xml' => :xml,
+ 'yaml' => :yaml,
+ 'yml' => :yaml,
+ }
+ for cpp_alias in %w[cc cpp cp cxx c++ C hh hpp h++ cu]
+ TypeFromExt[cpp_alias] = :cpp
end
-
+
+ TypeFromShebang = /\b(?:ruby|perl|python|sh)\b/
+
+ TypeFromName = {
+ 'Capfile' => :ruby,
+ 'Rakefile' => :ruby,
+ 'Rantfile' => :ruby,
+ 'Gemfile' => :ruby,
+ }
+
end
-
- TypeFromExt = {
- 'c' => :c,
- 'cfc' => :xml,
- 'cfm' => :xml,
- 'css' => :css,
- 'diff' => :diff,
- 'dpr' => :delphi,
- 'gemspec' => :ruby,
- 'groovy' => :groovy,
- 'gvy' => :groovy,
- 'h' => :c,
- 'htm' => :html,
- 'html' => :html,
- 'html.erb' => :rhtml,
- 'java' => :java,
- 'js' => :java_script,
- 'json' => :json,
- 'mab' => :ruby,
- 'pas' => :delphi,
- 'patch' => :diff,
- 'php' => :php,
- 'php3' => :php,
- 'php4' => :php,
- 'php5' => :php,
- 'py' => :python,
- 'py3' => :python,
- 'pyw' => :python,
- 'rake' => :ruby,
- 'raydebug' => :debug,
- 'rb' => :ruby,
- 'rbw' => :ruby,
- 'rhtml' => :rhtml,
- 'rjs' => :ruby,
- 'rpdf' => :ruby,
- 'rxml' => :ruby,
- 'sch' => :scheme,
- 'sql' => :sql,
- 'ss' => :scheme,
- 'xhtml' => :xhtml,
- 'xml' => :xml,
- 'yaml' => :yaml,
- 'yml' => :yaml,
- }
- for cpp_alias in %w[cc cpp cp cxx c++ C hh hpp h++ cu]
- TypeFromExt[cpp_alias] = :cpp
- end
-
- TypeFromShebang = /\b(?:ruby|perl|python|sh)\b/
-
- TypeFromName = {
- 'Rakefile' => :ruby,
- 'Rantfile' => :ruby,
- }
-
-end
-
-end
-
-if $0 == __FILE__
- $VERBOSE = true
- eval DATA.read, nil, $0, __LINE__ + 4
-end
-
-__END__
-require 'test/unit'
-
-class FileTypeTests < Test::Unit::TestCase
- include CodeRay
-
- def test_fetch
- assert_raise FileType::UnknownFileType do
- FileType.fetch ''
- end
-
- assert_throws :not_found do
- FileType.fetch '.' do
- throw :not_found
- end
- end
-
- assert_equal :default, FileType.fetch('c', :default)
-
- stderr, fake_stderr = $stderr, Object.new
- $err = ''
- def fake_stderr.write x
- $err << x
- end
- $stderr = fake_stderr
- FileType.fetch('c', :default) { }
- assert_equal "block supersedes default value argument\n", $err
- $stderr = stderr
- end
-
- def test_ruby
- assert_equal :ruby, FileType['test.rb']
- assert_equal :ruby, FileType['test.java.rb']
- assert_equal :java, FileType['test.rb.java']
- assert_equal :ruby, FileType['C:\\Program Files\\x\\y\\c\\test.rbw']
- assert_equal :ruby, FileType['/usr/bin/something/Rakefile']
- assert_equal :ruby, FileType['~/myapp/gem/Rantfile']
- assert_equal :ruby, FileType['./lib/tasks\repository.rake']
- assert_not_equal :ruby, FileType['test_rb']
- assert_not_equal :ruby, FileType['Makefile']
- assert_not_equal :ruby, FileType['set.rb/set']
- assert_not_equal :ruby, FileType['~/projects/blabla/rb']
- end
-
- def test_c
- assert_equal :c, FileType['test.c']
- assert_equal :c, FileType['C:\\Program Files\\x\\y\\c\\test.h']
- assert_not_equal :c, FileType['test_c']
- assert_not_equal :c, FileType['Makefile']
- assert_not_equal :c, FileType['set.h/set']
- assert_not_equal :c, FileType['~/projects/blabla/c']
- end
-
- def test_cpp
- assert_equal :cpp, FileType['test.c++']
- assert_equal :cpp, FileType['test.cxx']
- assert_equal :cpp, FileType['test.hh']
- assert_equal :cpp, FileType['test.hpp']
- assert_equal :cpp, FileType['test.cu']
- assert_equal :cpp, FileType['test.C']
- assert_not_equal :cpp, FileType['test.c']
- assert_not_equal :cpp, FileType['test.h']
- end
-
- def test_html
- assert_equal :html, FileType['test.htm']
- assert_equal :xhtml, FileType['test.xhtml']
- assert_equal :xhtml, FileType['test.html.xhtml']
- assert_equal :rhtml, FileType['_form.rhtml']
- assert_equal :rhtml, FileType['_form.html.erb']
- end
-
- def test_yaml
- assert_equal :yaml, FileType['test.yml']
- assert_equal :yaml, FileType['test.yaml']
- assert_equal :yaml, FileType['my.html.yaml']
- assert_not_equal :yaml, FileType['YAML']
- end
-
- def test_pathname
- require 'pathname'
- pn = Pathname.new 'test.rb'
- assert_equal :ruby, FileType[pn]
- dir = Pathname.new '/etc/var/blubb'
- assert_equal :ruby, FileType[dir + pn]
- assert_equal :cpp, FileType[dir + 'test.cpp']
- end
-
- def test_no_shebang
- dir = './test'
- if File.directory? dir
- Dir.chdir dir do
- assert_equal :c, FileType['test.c']
- end
- end
- end
-
- def test_shebang_empty_file
- require 'tmpdir'
- tmpfile = File.join(Dir.tmpdir, 'bla')
- File.open(tmpfile, 'w') { } # touch
- assert_equal nil, FileType[tmpfile]
- end
-
- def test_shebang
- require 'tmpdir'
- tmpfile = File.join(Dir.tmpdir, 'bla')
- File.open(tmpfile, 'w') { |f| f.puts '#!/usr/bin/env ruby' }
- assert_equal :ruby, FileType[tmpfile, true]
- end
-
end
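Note on file_type.rb above: FileType.fetch follows the Hash#fetch convention. A block wins over a default argument, a default wins over raising, and UnknownFileType is raised only when neither is given. The sketch below reproduces that order in miniature. FileTypeSketch, its two small tables, and the use of StandardError (the diff subclasses Exception) are assumptions made for illustration; the second-extension and shebang lookups are left out.

```ruby
# Hypothetical miniature of the lookup; the tables are short excerpts.
module FileTypeSketch
  UnknownFileType = Class.new StandardError

  TYPE_FROM_EXT  = { 'rb' => :ruby, 'c' => :c, 'yml' => :yaml }
  TYPE_FROM_NAME = { 'Rakefile' => :ruby, 'Gemfile' => :ruby }

  # Look up by extension first, then by the bare file name.
  def self.[] filename
    name = File.basename filename.to_s
    ext  = File.extname(name).sub(/\A\./, '')
    TYPE_FROM_EXT[ext] || TYPE_FROM_EXT[ext.downcase] || TYPE_FROM_NAME[name]
  end

  # Works like Hash#fetch: block, then default, then an exception.
  def self.fetch filename, default = nil
    type = self[filename]
    return type if type
    return yield if block_given?
    return default if default
    raise UnknownFileType, 'Could not determine type of %p.' % filename
  end
end

p FileTypeSketch['lib/coderay.rb']
p FileTypeSketch.fetch('Makefile', :text)
```

Because the file is only touched when the shebang is read, a lookup like this works for paths that do not exist on disk, as the comment in the diff points out.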
diff --git a/lib/coderay/helpers/gzip.rb b/lib/coderay/helpers/gzip.rb
new file mode 100644
index 0000000..245014a
--- /dev/null
+++ b/lib/coderay/helpers/gzip.rb
@@ -0,0 +1,41 @@
+module CodeRay
+
+ # A simplified interface to the gzip library +zlib+ (from the Ruby Standard Library.)
+ module GZip
+
+ require 'zlib'
+
+ # The default compression level. Level 7 compresses well and is still fast.
+ DEFAULT_GZIP_LEVEL = 7
+
+ # Unzips the given string +s+.
+ #
+ # Example:
+ # require 'gzip_simple'
+ # print GZip.gunzip(File.read('adresses.gz'))
+ def GZip.gunzip s
+ Zlib::Inflate.inflate s
+ end
+
+ # Zips the given string +s+.
+ #
+ # Example:
+ # require 'gzip_simple'
+ # File.open('adresses.gz', 'w') do |file|
+ # file.write GZip.gzip('Mum: 0123 456 789', 9)
+ # end
+ #
+ # If you provide a +level+, you can control how strongly
+ # the string is compressed:
+ # - 0: no compression, only convert to gzip format
+ # - 1: compress fast
+ # - 7: compress more, but still fast (default)
+ # - 8: compress more, slower
+ # - 9: compress best, very slow
+ def GZip.gzip s, level = DEFAULT_GZIP_LEVEL
+ Zlib::Deflate.new(level).deflate s, Zlib::FINISH
+ end
+
+ end
+
+end
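For readers skimming the new gzip.rb, the round trip it wraps can be sketched as below. This is a minimal sketch, not the CodeRay code itself: GZipSketch is a hypothetical stand-in module so the example runs without CodeRay installed. Only the two Zlib calls are taken from the diff. Note that Zlib::Deflate emits a zlib-format stream, so the result is meant to be read back with Zlib::Inflate rather than with the gunzip command line tool.

```ruby
require 'zlib'

# Hypothetical stand-in that mirrors the two helpers added in gzip.rb.
module GZipSketch
  DEFAULT_LEVEL = 7

  # Compress +s+ with the given level (0 = store only, 9 = best).
  def self.gzip(s, level = DEFAULT_LEVEL)
    Zlib::Deflate.new(level).deflate(s, Zlib::FINISH)
  end

  # Restore the original bytes from a string produced by gzip.
  def self.gunzip(s)
    Zlib::Inflate.inflate(s)
  end
end

text     = 'a' * 1000
packed   = GZipSketch.gzip(text, 9)
restored = GZipSketch.gunzip(packed)

puts 'Zipped %d bytes to %d bytes.' % [text.bytesize, packed.bytesize]
p restored == text
```

A highly repetitive input like this one shrinks to a few dozen bytes; the exact size depends on the zlib build.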
diff --git a/lib/coderay/helpers/gzip_simple.rb b/lib/coderay/helpers/gzip_simple.rb
deleted file mode 100644
index b979f66..0000000
--- a/lib/coderay/helpers/gzip_simple.rb
+++ /dev/null
@@ -1,123 +0,0 @@
-# =GZip Simple
-#
-# A simplified interface to the gzip library +zlib+ (from the Ruby Standard Library.)
-#
-# Author: murphy (mail to murphy rubychan de)
-#
-# Version: 0.2 (2005.may.28)
-#
-# ==Documentation
-#
-# See +GZip+ module and the +String+ extensions.
-#
-module GZip
-
- require 'zlib'
-
- # The default zipping level. 7 zips good and fast.
- DEFAULT_GZIP_LEVEL = 7
-
- # Unzips the given string +s+.
- #
- # Example:
- # require 'gzip_simple'
- # print GZip.gunzip(File.read('adresses.gz'))
- def GZip.gunzip s
- Zlib::Inflate.inflate s
- end
-
- # Zips the given string +s+.
- #
- # Example:
- # require 'gzip_simple'
- # File.open('adresses.gz', 'w') do |file
- # file.write GZip.gzip('Mum: 0123 456 789', 9)
- # end
- #
- # If you provide a +level+, you can control how strong
- # the string is compressed:
- # - 0: no compression, only convert to gzip format
- # - 1: compress fast
- # - 7: compress more, but still fast (default)
- # - 8: compress more, slower
- # - 9: compress best, very slow
- def GZip.gzip s, level = DEFAULT_GZIP_LEVEL
- Zlib::Deflate.new(level).deflate s, Zlib::FINISH
- end
-end
-
-
-# String extensions to use the GZip module.
-#
-# The methods gzip and gunzip provide an even more simple
-# interface to the ZLib:
-#
-# # create a big string
-# x = 'a' * 1000
-#
-# # zip it
-# x_gz = x.gzip
-#
-# # test the result
-# puts 'Zipped %d bytes to %d bytes.' % [x.size, x_gz.size]
-# #-> Zipped 1000 bytes to 19 bytes.
-#
-# # unzipping works
-# p x_gz.gunzip == x #-> true
-class String
- # Returns the string, unzipped.
- # See GZip.gunzip
- def gunzip
- GZip.gunzip self
- end
- # Replaces the string with its unzipped value.
- # See GZip.gunzip
- def gunzip!
- replace gunzip
- end
-
- # Returns the string, zipped.
- # +level+ is the gzip compression level, see GZip.gzip.
- def gzip level = GZip::DEFAULT_GZIP_LEVEL
- GZip.gzip self, level
- end
- # Replaces the string with its zipped value.
- # See GZip.gzip.
- def gzip!(*args)
- replace gzip(*args)
- end
-end
-
-if $0 == __FILE__
- eval DATA.read, nil, $0, __LINE__+4
-end
-
-__END__
-#CODE
-
-# Testing / Benchmark
-x = 'a' * 1000
-x_gz = x.gzip
-puts 'Zipped %d bytes to %d bytes.' % [x.size, x_gz.size] #-> Zipped 1000 bytes to 19 bytes.
-p x_gz.gunzip == x #-> true
-
-require 'benchmark'
-
-INFO = 'packed to %0.3f%%' # :nodoc:
-
-x = Array.new(100000) { rand(255).chr + 'aaaaaaaaa' + rand(255).chr }.join
-Benchmark.bm(10) do |bm|
- for level in 0..9
- bm.report "zip #{level}" do
- $x = x.gzip level
- end
- puts INFO % [100.0 * $x.size / x.size]
- end
- bm.report 'zip' do
- $x = x.gzip
- end
- puts INFO % [100.0 * $x.size / x.size]
- bm.report 'unzip' do
- $x.gunzip
- end
-end
diff --git a/lib/coderay/helpers/plugin.rb b/lib/coderay/helpers/plugin.rb
index 2dffbdc..06c1233 100644
--- a/lib/coderay/helpers/plugin.rb
+++ b/lib/coderay/helpers/plugin.rb
@@ -1,349 +1,284 @@
module CodeRay
-# = PluginHost
-#
-# A simple subclass plugin system.
-#
-# Example:
-# class Generators < PluginHost
-# plugin_path 'app/generators'
-# end
-#
-# class Generator
-# extend Plugin
-# PLUGIN_HOST = Generators
-# end
-#
-# class FancyGenerator < Generator
-# register_for :fancy
-# end
-#
-# Generators[:fancy] #-> FancyGenerator
-# # or
-# CodeRay.require_plugin 'Generators/fancy'
-module PluginHost
-
- # Raised if Encoders::[] fails because:
- # * a file could not be found
- # * the requested Encoder is not registered
- PluginNotFound = Class.new Exception
- HostNotFound = Class.new Exception
-
- PLUGIN_HOSTS = []
- PLUGIN_HOSTS_BY_ID = {} # dummy hash
-
- # Loads all plugins using list and load.
- def load_all
- for plugin in list
- load plugin
- end
- end
-
- # Returns the Plugin for +id+.
+ # = PluginHost
+ #
+ # A simple subclass/subfolder plugin system.
#
# Example:
- # yaml_plugin = MyPluginHost[:yaml]
- def [] id, *args, &blk
- plugin = validate_id(id)
- begin
- plugin = plugin_hash.[] plugin, *args, &blk
- end while plugin.is_a? Symbol
- plugin
- end
-
- # Alias for +[]+.
- alias load []
-
- def require_helper plugin_id, helper_name
- path = path_to File.join(plugin_id, helper_name)
- require path
- end
-
- class << self
-
- # Adds the module/class to the PLUGIN_HOSTS list.
- def extended mod
- PLUGIN_HOSTS << mod
+ # class Generators
+ # extend PluginHost
+ # plugin_path 'app/generators'
+ # end
+ #
+ # class Generator
+ # extend Plugin
+ # PLUGIN_HOST = Generators
+ # end
+ #
+ # class FancyGenerator < Generator
+ # register_for :fancy
+ # end
+ #
+ # Generators[:fancy] #-> FancyGenerator
+ # # or
+ # CodeRay.require_plugin 'Generators/fancy'
+ # # or
+ # Generators::Fancy
+ module PluginHost
+
+ # Raised if Encoders::[] fails because:
+ # * a file could not be found
+ # * the requested Plugin is not registered
+ PluginNotFound = Class.new LoadError
+ HostNotFound = Class.new LoadError
+
+ PLUGIN_HOSTS = []
+ PLUGIN_HOSTS_BY_ID = {} # dummy hash
+
+ # Loads all plugins using list and load.
+ def load_all
+ for plugin in list
+ load plugin
+ end
end
-
- # Warns you that you should not #include this module.
- def included mod
- warn "#{name} should not be included. Use extend."
+
+ # Returns the Plugin for +id+.
+ #
+ # Example:
+ # yaml_plugin = MyPluginHost[:yaml]
+ def [] id, *args, &blk
+ plugin = validate_id(id)
+ begin
+ plugin = plugin_hash.[] plugin, *args, &blk
+ end while plugin.is_a? Symbol
+ plugin
end
-
- # Find the PluginHost for host_id.
- def host_by_id host_id
- unless PLUGIN_HOSTS_BY_ID.default_proc
- ph = Hash.new do |h, a_host_id|
- for host in PLUGIN_HOSTS
- h[host.host_id] = host
- end
- h.fetch a_host_id, nil
- end
- PLUGIN_HOSTS_BY_ID.replace ph
+
+ alias load []
+
+ # Tries to +load+ the missing plugin by translating +const+ to the
+ # underscore form (e.g. LinesOfCode becomes lines_of_code).
+ def const_missing const
+ id = const.to_s.
+ gsub(/([A-Z]+)([A-Z][a-z])/,'\1_\2').
+ gsub(/([a-z\d])([A-Z])/,'\1_\2').
+ downcase
+ load id
+ end
+
+ class << self
+
+ # Adds the module/class to the PLUGIN_HOSTS list.
+ def extended mod
+ PLUGIN_HOSTS << mod
end
- PLUGIN_HOSTS_BY_ID[host_id]
+
end
-
- end
-
- # The path where the plugins can be found.
- def plugin_path *args
- unless args.empty?
- @plugin_path = File.expand_path File.join(*args)
- load_map
+
+ # The path where the plugins can be found.
+ def plugin_path *args
+ unless args.empty?
+ @plugin_path = File.expand_path File.join(*args)
+ end
+ @plugin_path ||= ''
end
- @plugin_path
- end
-
- # The host's ID.
- #
- # If PLUGIN_HOST_ID is not set, it is simply the class name.
- def host_id
- if self.const_defined? :PLUGIN_HOST_ID
- self::PLUGIN_HOST_ID
- else
- name
+
+ # Map a plugin_id to another.
+ #
+ # Usage: Put this in a file plugin_path/_map.rb.
+ #
+ # class MyColorHost < PluginHost
+ # map :navy => :dark_blue,
+ # :maroon => :brown,
+ # :luna => :moon
+ # end
+ def map hash
+ for from, to in hash
+ from = validate_id from
+ to = validate_id to
+ plugin_hash[from] = to unless plugin_hash.has_key? from
+ end
end
- end
-
- # Map a plugin_id to another.
- #
- # Usage: Put this in a file plugin_path/_map.rb.
- #
- # class MyColorHost < PluginHost
- # map :navy => :dark_blue,
- # :maroon => :brown,
- # :luna => :moon
- # end
- def map hash
- for from, to in hash
- from = validate_id from
- to = validate_id to
- plugin_hash[from] = to unless plugin_hash.has_key? from
+
+ # Define the default plugin to use when no plugin is found
+ # for a given id, or return the default plugin.
+ #
+ # See also map.
+ #
+ # class MyColorHost < PluginHost
+ # map :navy => :dark_blue
+ # default :gray
+ # end
+ #
+ # MyColorHost.default # loads and returns the Gray plugin
+ def default id = nil
+ if id
+ id = validate_id id
+ raise "The default plugin can't be named \"default\"." if id == :default
+ plugin_hash[:default] = id
+ else
+ load :default
+ end
end
- end
-
- # Define the default plugin to use when no plugin is found
- # for a given id.
- #
- # See also map.
- #
- # class MyColorHost < PluginHost
- # map :navy => :dark_blue
- # default :gray
- # end
- def default id = nil
- if id
- id = validate_id id
- plugin_hash[nil] = id
- else
- plugin_hash[nil]
+
+ # Every plugin must register itself for +id+ by calling register_for,
+ # which calls this method.
+ #
+ # See Plugin#register_for.
+ def register plugin, id
+ plugin_hash[validate_id(id)] = plugin
end
- end
-
- # Every plugin must register itself for one or more
- # +ids+ by calling register_for, which calls this method.
- #
- # See Plugin#register_for.
- def register plugin, *ids
- for id in ids
- unless id.is_a? Symbol
- raise ArgumentError,
- "id must be a Symbol, but it was a #{id.class}"
+
+ # A Hash of plugin_id => Plugin pairs.
+ def plugin_hash
+ @plugin_hash ||= make_plugin_hash
+ end
+
+ # Returns an array of all .rb files in the plugin path.
+ #
+ # The extension .rb is not included.
+ def list
+ Dir[path_to('*')].select do |file|
+ File.basename(file)[/^(?!_)\w+\.rb$/]
+ end.map do |file|
+ File.basename(file, '.rb').to_sym
end
- plugin_hash[validate_id(id)] = plugin
end
- end
-
- # A Hash of plugion_id => Plugin pairs.
- def plugin_hash
- @plugin_hash ||= create_plugin_hash
- end
-
- # Returns an array of all .rb files in the plugin path.
- #
- # The extension .rb is not included.
- def list
- Dir[path_to('*')].select do |file|
- File.basename(file)[/^(?!_)\w+\.rb$/]
- end.map do |file|
- File.basename file, '.rb'
+
+ # Returns an array of all Plugins.
+ #
+ # Note: This loads all plugins using load_all.
+ def all_plugins
+ load_all
+ plugin_hash.values.grep(Class)
end
- end
-
- # Makes a map of all loaded plugins.
- def inspect
- map = plugin_hash.dup
- map.each do |id, plugin|
- map[id] = plugin.to_s[/(?>\w+)$/]
+
+ # Loads the map file (see map).
+ #
+ # This is done automatically the first time a plugin file cannot be loaded.
+ def load_plugin_map
+ mapfile = path_to '_map'
+ @plugin_map_loaded = true
+ if File.exist? mapfile
+ require mapfile
+ true
+ else
+ false
+ end
end
- "#{name}[#{host_id}]#{map.inspect}"
- end
-
-protected
- # Created a new plugin list and stores it to @plugin_hash.
- def create_plugin_hash
- @plugin_hash =
+
+ protected
+
+ # Return a plugin hash that automatically loads plugins.
+ def make_plugin_hash
+ @plugin_map_loaded ||= false
Hash.new do |h, plugin_id|
id = validate_id(plugin_id)
path = path_to id
begin
+ raise LoadError, "#{path} not found" unless File.exist? path
require path
rescue LoadError => boom
- if h.has_key? nil # default plugin
- h[id] = h[nil]
+ if @plugin_map_loaded
+ if h.has_key?(:default)
+ warn '%p could not load plugin %p; falling back to %p' % [self, id, h[:default]]
+ h[:default]
+ else
+ raise PluginNotFound, '%p could not load plugin %p: %s' % [self, id, boom]
+ end
else
- raise PluginNotFound, 'Could not load plugin %p: %s' % [id, boom]
+ load_plugin_map
+ h[plugin_id]
end
else
# Plugin should have registered by now
- unless h.has_key? id
- raise PluginNotFound,
- "No #{self.name} plugin for #{id.inspect} found in #{path}."
+ if h.has_key? id
+ h[id]
+ else
+ raise PluginNotFound, "No #{self.name} plugin for #{id.inspect} found in #{path}."
end
end
- h[id]
end
- end
-
- # Loads the map file (see map).
- #
- # This is done automatically when plugin_path is called.
- def load_map
- mapfile = path_to '_map'
- if File.exist? mapfile
- require mapfile
- elsif $VERBOSE
- warn 'no _map.rb found for %s' % name
end
- end
-
- # Returns the Plugin for +id+.
- # Use it like Hash#fetch.
- #
- # Example:
- # yaml_plugin = MyPluginHost[:yaml, :default]
- def fetch id, *args, &blk
- plugin_hash.fetch validate_id(id), *args, &blk
- end
-
- # Returns the expected path to the plugin file for the given id.
- def path_to plugin_id
- File.join plugin_path, "#{plugin_id}.rb"
- end
-
- # Converts +id+ to a Symbol if it is a String,
- # or returns +id+ if it already is a Symbol.
- #
- # Raises +ArgumentError+ for all other objects, or if the
- # given String includes non-alphanumeric characters (\W).
- def validate_id id
- if id.is_a? Symbol or id.nil?
- id
- elsif id.is_a? String
- if id[/\w+/] == id
- id.downcase.to_sym
+
+ # Returns the expected path to the plugin file for the given id.
+ def path_to plugin_id
+ File.join plugin_path, "#{plugin_id}.rb"
+ end
+
+ # Converts +id+ to a Symbol if it is a String,
+ # or returns +id+ if it already is a Symbol.
+ #
+ # Raises +ArgumentError+ for all other objects, or if the
+ # given String includes non-alphanumeric characters (\W).
+ def validate_id id
+ if id.is_a? Symbol or id.nil?
+ id
+ elsif id.is_a? String
+ if id[/\w+/] == id
+ id.downcase.to_sym
+ else
+ raise ArgumentError, "Invalid id given: #{id}"
+ end
else
- raise ArgumentError, "Invalid id: '#{id}' given."
+ raise ArgumentError, "String or Symbol expected, but #{id.class} given."
end
- else
- raise ArgumentError,
- "String or Symbol expected, but #{id.class} given."
end
- end
-
-end
-
-
-# = Plugin
-#
-# Plugins have to include this module.
-#
-# IMPORTANT: use extend for this module.
-#
-# Example: see PluginHost.
-module Plugin
-
- def included mod
- warn "#{name} should not be included. Use extend."
- end
-
- # Register this class for the given langs.
- # Example:
- # class MyPlugin < PluginHost::BaseClass
- # register_for :my_id
- # ...
- # end
- #
- # See PluginHost.register.
- def register_for *ids
- plugin_host.register self, *ids
+
end
- # Returns the title of the plugin, or sets it to the
- # optional argument +title+.
- def title title = nil
- if title
- @title = title.to_s
- else
- @title ||= name[/([^:]+)$/, 1]
- end
- end
-
- # The host for this Plugin class.
- def plugin_host host = nil
- if host and not host.is_a? PluginHost
- raise ArgumentError,
- "PluginHost expected, but #{host.class} given."
- end
- self.const_set :PLUGIN_HOST, host if host
- self::PLUGIN_HOST
- end
-
- # Require some helper files.
+
+ # = Plugin
#
- # Example:
+ # Plugins have to extend this module.
#
- # class MyPlugin < PluginHost::BaseClass
- # register_for :my_id
- # helper :my_helper
+ # IMPORTANT: Use extend for this module.
#
- # The above example loads the file myplugin/my_helper.rb relative to the
- # file in which MyPlugin was defined.
- #
- # You can also load a helper from a different plugin:
- #
- # helper 'other_plugin/helper_name'
- def helper *helpers
- for helper in helpers
- if helper.is_a?(String) && helper[/\//]
- self::PLUGIN_HOST.require_helper $`, $'
+ # See CodeRay::PluginHost for examples.
+ module Plugin
+
+ attr_reader :plugin_id
+
+ # Register this class for the given +id+.
+ #
+ # Example:
+ # class MyPlugin < PluginHost::BaseClass
+ # register_for :my_id
+ # ...
+ # end
+ #
+ # See PluginHost.register.
+ def register_for id
+ @plugin_id = id
+ plugin_host.register self, id
+ end
+
+ # Returns the title of the plugin, or sets it to the
+ # optional argument +title+.
+ def title title = nil
+ if title
+ @title = title.to_s
else
- self::PLUGIN_HOST.require_helper plugin_id, helper.to_s
+ @title ||= name[/([^:]+)$/, 1]
end
end
+
+ # The PluginHost for this Plugin class.
+ def plugin_host host = nil
+ if host.is_a? PluginHost
+ const_set :PLUGIN_HOST, host
+ end
+ self::PLUGIN_HOST
+ end
+
+ def aliases
+ plugin_host.load_plugin_map
+ plugin_host.plugin_hash.inject [] do |aliases, (key, _)|
+ aliases << key if plugin_host[key] == self
+ aliases
+ end
+ end
+
end
-
- # Returns the pulgin id used by the engine.
- def plugin_id
- name[/\w+$/].downcase
- end
-
-end
-
-# Convenience method for plugin loading.
-# The syntax used is:
-#
-# CodeRay.require_plugin '<Host ID>/<Plugin ID>'
-#
-# Returns the loaded plugin.
-def self.require_plugin path
- host_id, plugin_id = path.split '/', 2
- host = PluginHost.host_by_id(host_id)
- raise PluginHost::HostNotFound,
- "No host for #{host_id.inspect} found." unless host
- host.load plugin_id
+
end
-
-end
\ No newline at end of file
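The new PluginHost#const_missing turns a constant name into a plugin id before calling load. The translation step can be tried in isolation as sketched below. The two gsub calls are copied from the diff; the method name underscore is hypothetical and only used for this illustration.

```ruby
# Hypothetical helper wrapping the translation used by const_missing.
def underscore(const_name)
  const_name.to_s.
    gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2'). # split an acronym from a following word
    gsub(/([a-z\d])([A-Z])/, '\1_\2').     # split at lower-to-upper boundaries
    downcase
end

p underscore(:LinesOfCode)  # => "lines_of_code"
p underscore(:JavaScript)   # => "java_script"
p underscore(:XMLParser)    # => "xml_parser"
p underscore(:HTML)         # => "html"
```

With this in place, a reference such as Generators::Fancy ends up as load 'fancy', which goes through the same lookup as Generators[:fancy].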
diff --git a/lib/coderay/helpers/word_list.rb b/lib/coderay/helpers/word_list.rb
index 9b4f456..ea969c3 100644
--- a/lib/coderay/helpers/word_list.rb
+++ b/lib/coderay/helpers/word_list.rb
@@ -1,138 +1,77 @@
module CodeRay
-
-# = WordList
-#
-# <b>A Hash subclass designed for mapping word lists to token types.</b>
-#
-# Copyright (c) 2006 by murphy (Kornelius Kalnbach) <murphy rubychan de>
-#
-# License:: LGPL / ask the author
-# Version:: 1.1 (2006-Oct-19)
-#
-# A WordList is a Hash with some additional features.
-# It is intended to be used for keyword recognition.
-#
-# WordList is highly optimized to be used in Scanners,
-# typically to decide whether a given ident is a special token.
-#
-# For case insensitive words use CaseIgnoringWordList.
-#
-# Example:
-#
-# # define word arrays
-# RESERVED_WORDS = %w[
-# asm break case continue default do else
-# ...
-# ]
-#
-# PREDEFINED_TYPES = %w[
-# int long short char void
-# ...
-# ]
-#
-# PREDEFINED_CONSTANTS = %w[
-# EOF NULL ...
-# ]
-#
-# # make a WordList
-# IDENT_KIND = WordList.new(:ident).
-# add(RESERVED_WORDS, :reserved).
-# add(PREDEFINED_TYPES, :pre_type).
-# add(PREDEFINED_CONSTANTS, :pre_constant)
-#
-# ...
-#
-# def scan_tokens tokens, options
-# ...
-#
-# elsif scan(/[A-Za-z_][A-Za-z_0-9]*/)
-# # use it
-# kind = IDENT_KIND[match]
-# ...
-class WordList < Hash
-
- # Creates a new WordList with +default+ as default value.
- #
- # You can activate +caching+ to store the results for every [] request.
+
+ # = WordList
#
- # With caching, methods like +include?+ or +delete+ may no longer behave
- # as you expect. Therefore, it is recommended to use the [] method only.
- def initialize default = false, caching = false, &block
- if block
- raise ArgumentError, 'Can\'t combine block with caching.' if caching
- super(&block)
- else
- if caching
- super() do |h, k|
- h[k] = h.fetch k, default
- end
- else
- super default
- end
- end
- end
-
- # Add words to the list and associate them with +kind+.
+ # <b>A Hash subclass designed for mapping word lists to token types.</b>
#
- # Returns +self+, so you can concat add calls.
- def add words, kind = true
- words.each do |word|
- self[word] = kind
+ # Copyright (c) 2006-2011 by murphy (Kornelius Kalnbach) <murphy rubychan de>
+ #
+ # License:: LGPL / ask the author
+ # Version:: 2.0 (2011-05-08)
+ #
+ # A WordList is a Hash with some additional features.
+ # It is intended to be used for keyword recognition.
+ #
+ # WordList is optimized to be used in Scanners,
+ # typically to decide whether a given ident is a special token.
+ #
+ # For case insensitive words use WordList::CaseIgnoring.
+ #
+ # Example:
+ #
+ # # define word arrays
+ # RESERVED_WORDS = %w[
+ # asm break case continue default do else
+ # ]
+ #
+ # PREDEFINED_TYPES = %w[
+ # int long short char void
+ # ]
+ #
+ # # make a WordList
+ # IDENT_KIND = WordList.new(:ident).
+ # add(RESERVED_WORDS, :reserved).
+ # add(PREDEFINED_TYPES, :predefined_type)
+ #
+ # ...
+ #
+ # def scan_tokens tokens, options
+ # ...
+ #
+ # elsif scan(/[A-Za-z_][A-Za-z_0-9]*/)
+ # # use it
+ # kind = IDENT_KIND[match]
+ # ...
+ class WordList < Hash
+
+ # Create a new WordList with +default+ as default value.
+ def initialize default = false
+ super default
end
- self
- end
-
-end
-
-
-# A CaseIgnoringWordList is like a WordList, only that
-# keys are compared case-insensitively.
-#
-# Ignoring the text case is realized by sending the +downcase+ message to
-# all keys.
-#
-# Caching usually makes a CaseIgnoringWordList faster, but it has to be
-# activated explicitely.
-class CaseIgnoringWordList < WordList
-
- # Creates a new case-insensitive WordList with +default+ as default value.
- #
- # You can activate caching to store the results for every [] request.
- # This speeds up subsequent lookups for the same word, but also
- # uses memory.
- def initialize default = false, caching = false
- if caching
- super(default, false) do |h, k|
- h[k] = h.fetch k.downcase, default
- end
- else
- super(default, false)
- extend Uncached
+
+ # Add words to the list and associate them with +value+.
+ #
+ # Returns +self+, so you can chain add calls.
+ def add words, value = true
+ words.each { |word| self[word] = value }
+ self
end
+
end
- module Uncached # :nodoc:
+
+ # A CaseIgnoring WordList is like a WordList, only that
+ # keys are compared case-insensitively (normalizing keys using +downcase+).
+ class WordList::CaseIgnoring < WordList
+
def [] key
- super(key.downcase)
+ super key.downcase
end
- end
-
- # Add +words+ to the list and associate them with +kind+.
- def add words, kind = true
- words.each do |word|
- self[word.downcase] = kind
+
+ def []= key, value
+ super key.downcase, value
end
- self
+
end
-
-end
-
+
end
-
-__END__
-# check memory consumption
-END {
- ObjectSpace.each_object(CodeRay::CaseIgnoringWordList) do |wl|
- p wl.inject(0) { |memo, key, value| memo + key.size + 24 }
- end
-}
\ No newline at end of file
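The rewritten WordList drops the caching option, and WordList::CaseIgnoring now normalizes keys on both read and write. A minimal sketch of that behaviour follows. The class names WordListSketch and CaseIgnoringSketch are hypothetical copies so the example runs without CodeRay; the method bodies follow the diff.

```ruby
# Hypothetical copy of the new WordList: a Hash with a default and a chainable add.
class WordListSketch < Hash
  def initialize(default = false)
    super(default)
  end

  # Associate every word with +value+; returns self so calls can be chained.
  def add(words, value = true)
    words.each { |word| self[word] = value }
    self
  end
end

# Hypothetical copy of WordList::CaseIgnoring: keys are downcased on the way in and out.
class CaseIgnoringSketch < WordListSketch
  def [](key)
    super(key.downcase)
  end

  def []=(key, value)
    super(key.downcase, value)
  end
end

ident_kind = CaseIgnoringSketch.new(:ident).
  add(%w[SELECT From where], :keyword).
  add(%w[Integer VARCHAR], :predefined_type)

p ident_kind['select']   # => :keyword
p ident_kind['varchar']  # => :predefined_type
p ident_kind['users']    # => :ident
```

Because add goes through []=, the subclass no longer needs its own add method, which is why the old override is removed in the diff.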
diff --git a/lib/coderay/scanner.rb b/lib/coderay/scanner.rb
index c4fcb8a..907cf00 100644
--- a/lib/coderay/scanner.rb
+++ b/lib/coderay/scanner.rb
@@ -1,7 +1,10 @@
-module CodeRay
-
- require 'coderay/helpers/plugin'
+# encoding: utf-8
+require 'strscan'
+module CodeRay
+
+ autoload :WordList, coderay_path('helpers', 'word_list')
+
# = Scanners
#
# This module holds the Scanner class and its subclasses.
@@ -16,9 +19,8 @@ module CodeRay
module Scanners
extend PluginHost
plugin_path File.dirname(__FILE__), 'scanners'
-
- require 'strscan'
-
+
+
# = Scanner
#
# The base class for all Scanners.
@@ -46,64 +48,89 @@ module CodeRay
extend Plugin
plugin_host Scanners
-
+
# Raised if a Scanner fails while scanning
- ScanError = Class.new(Exception)
-
- require 'coderay/helpers/word_list'
-
+ ScanError = Class.new StandardError
+
# The default options for all scanner classes.
#
# Define @default_options for subclasses.
- DEFAULT_OPTIONS = { :stream => false }
+ DEFAULT_OPTIONS = { }
+
+ KINDS_NOT_LOC = [:comment, :doctype, :docstring]
+
+ attr_accessor :state
- KINDS_NOT_LOC = [:comment, :doctype]
-
class << self
-
- # Returns if the Scanner can be used in streaming mode.
- def streamable?
- is_a? Streamable
+
+ # Normalizes the given code into a string with UNIX newlines, in the
+ # scanner's internal encoding, with invalid and undefined characters
+ # replaced by placeholders. Always returns a new object.
+ def normalize code
+ # original = code
+ code = code.to_s unless code.is_a? ::String
+ return code if code.empty?
+
+ if code.respond_to? :encoding
+ code = encode_with_encoding code, self.encoding
+ else
+ code = to_unix code
+ end
+ # code = code.dup if code.eql? original
+ code
end
-
- def normify code
- code = code.to_s
- if code.respond_to?(:encoding) && (code.encoding.name != 'UTF-8' || !code.valid_encoding?)
- code = code.dup
- original_encoding = code.encoding
- code.force_encoding 'Windows-1252'
- unless code.valid_encoding?
- code.force_encoding original_encoding
- if code.encoding.name == 'UTF-8'
- code.encode! 'UTF-16BE', :invalid => :replace, :undef => :replace, :replace => '?'
- end
- code.encode! 'UTF-8', :invalid => :replace, :undef => :replace, :replace => '?'
+
+ # The typical filename suffix for this scanner's language.
+ def file_extension extension = lang
+ @file_extension ||= extension.to_s
+ end
+
+ # The encoding used internally by this scanner.
+ def encoding name = 'UTF-8'
+ @encoding ||= defined?(Encoding.find) && Encoding.find(name)
+ end
+
+ # The lang of this Scanner class, which is equal to its Plugin ID.
+ def lang
+ @plugin_id
+ end
+
+ protected
+
+ def encode_with_encoding code, target_encoding
+ if code.encoding == target_encoding
+ if code.valid_encoding?
+ return to_unix(code)
+ else
+ source_encoding = guess_encoding code
end
+ else
+ source_encoding = code.encoding
end
- code.to_unix
+ # print "encode_with_encoding from #{source_encoding} to #{target_encoding}"
+ code.encode target_encoding, source_encoding, :universal_newline => true, :undef => :replace, :invalid => :replace
end
- def file_extension extension = nil
- if extension
- @file_extension = extension.to_s
- else
- @file_extension ||= plugin_id.to_s
+ def to_unix code
+ code.index(?\r) ? code.gsub(/\r\n?/, "\n") : code
+ end
+
+ def guess_encoding s
+ #:nocov:
+ IO.popen("file -b --mime -", "w+") do |file|
+ file.write s[0, 1024]
+ file.close_write
+ begin
+ Encoding.find file.gets[/charset=([-\w]+)/, 1]
+ rescue ArgumentError
+ Encoding::BINARY
+ end
end
+ #:nocov:
end
-
+
end
-
-=begin
-## Excluded for speed reasons; protected seems to make methods slow.
-
- # Save the StringScanner methods from being called.
- # This would not be useful for highlighting.
- strscan_public_methods =
- StringScanner.instance_methods -
- StringScanner.ancestors[1].instance_methods
- protected(*strscan_public_methods)
-=end
-
+
# Create a new Scanner.
#
# * +code+ is the input String and is handled by the superclass
@@ -111,146 +138,147 @@ module CodeRay
# * +options+ is a Hash with Symbols as keys.
# It is merged with the default options of the class (you can
# overwrite default options here.)
- # * +block+ is the callback for streamed highlighting.
- #
- # If you set :stream to +true+ in the options, the Scanner uses a
- # TokenStream with the +block+ as callback to handle the tokens.
#
# Else, a Tokens object is used.
- def initialize code='', options = {}, &block
- raise "I am only the basic Scanner class. I can't scan "\
- "anything. :( Use my subclasses." if self.class == Scanner
+ def initialize code = '', options = {}
+ if self.class == Scanner
+ raise NotImplementedError, "I am only the basic Scanner class. I can't scan anything. :( Use my subclasses."
+ end
@options = self.class::DEFAULT_OPTIONS.merge options
-
- super Scanner.normify(code)
-
- @tokens = options[:tokens]
- if @options[:stream]
- warn "warning in CodeRay::Scanner.new: :stream is set, "\
- "but no block was given" unless block_given?
- raise NotStreamableError, self unless kind_of? Streamable
- @tokens ||= TokenStream.new(&block)
- else
- warn "warning in CodeRay::Scanner.new: Block given, "\
- "but :stream is #{@options[:stream]}" if block_given?
- @tokens ||= Tokens.new
- end
- @tokens.scanner = self
-
+
+ super self.class.normalize(code)
+
+ @tokens = options[:tokens] || Tokens.new
+ @tokens.scanner = self if @tokens.respond_to? :scanner=
+
setup
end
-
+
+ # Resets the scanner. Subclasses should redefine the reset_instance
+ # method instead of this one.
def reset
super
reset_instance
end
-
+
+ # Set a new string to be scanned.
def string= code
- code = Scanner.normify(code)
- if defined?(RUBY_DESCRIPTION) && RUBY_DESCRIPTION['rubinius 1.0.1']
- reset_state
- @string = code
- else
- super code
- end
+ code = self.class.normalize(code)
+ super code
reset_instance
end
-
- # More mnemonic accessor name for the input string.
- alias code string
- alias code= string=
-
- # Returns the Plugin ID for this scanner.
+
+ # the Plugin ID for this scanner
def lang
- self.class.plugin_id
+ self.class.lang
end
-
- # Scans the code and returns all tokens in a Tokens object.
- def tokenize new_string=nil, options = {}
+
+ # the default file extension for this scanner
+ def file_extension
+ self.class.file_extension
+ end
+
+ # Scans the code and returns all tokens in a Tokens object.
+ def tokenize source = nil, options = {}
options = @options.merge(options)
- self.string = new_string if new_string
- @cached_tokens =
- if @options[:stream] # :stream must have been set already
- reset unless new_string
- scan_tokens @tokens, options
- @tokens
- else
- scan_tokens @tokens, options
- end
+ @tokens = options[:tokens] || @tokens || Tokens.new
+ @tokens.scanner = self if @tokens.respond_to? :scanner=
+ case source
+ when Array
+ self.string = self.class.normalize(source.join)
+ when nil
+ reset
+ else
+ self.string = self.class.normalize(source)
+ end
+
+ begin
+ scan_tokens @tokens, options
+ rescue => e
+ message = "Error in %s#scan_tokens, initial state was: %p" % [self.class, defined?(state) && state]
+ raise_inspect e.message, @tokens, message, 30, e.backtrace
+ end
+
+ @cached_tokens = @tokens
+ if source.is_a? Array
+ @tokens.split_into_parts(*source.map { |part| part.size })
+ else
+ @tokens
+ end
end
-
+
+ # Cache the result of tokenize.
def tokens
@cached_tokens ||= tokenize
end
- # Whether the scanner is in streaming mode.
- def streaming?
- !!@options[:stream]
- end
-
- # Traverses the tokens.
+ # Traverse the tokens.
def each &block
- raise ArgumentError,
- 'Cannot traverse TokenStream.' if @options[:stream]
tokens.each(&block)
end
include Enumerable
-
- # The current line position of the scanner.
+
+ # The current line position of the scanner, starting with 1.
+ # See also: #column.
#
# Beware, this is implemented inefficiently. It should be used
# for debugging only.
- def line
- string[0..pos].count("\n") + 1
+ def line pos = self.pos
+ return 1 if pos <= 0
+ binary_string[0...pos].count("\n") + 1
end
+ # The current column position of the scanner, starting with 1.
+ # See also: #line.
def column pos = self.pos
- return 0 if pos <= 0
- string = string()
- if string.respond_to?(:bytesize) && (defined?(@bin_string) || string.bytesize != string.size)
- @bin_string ||= string.dup.force_encoding('binary')
- string = @bin_string
- end
- pos - (string.rindex(?\n, pos) || 0)
+ return 1 if pos <= 0
+ pos - (binary_string.rindex(?\n, pos - 1) || -1)
end
- def marshal_dump
- @options
+ # The string in binary encoding.
+ #
+ # To be used with #pos, which is the index of the byte the scanner
+ # will scan next.
+ def binary_string
+ @binary_string ||=
+ if string.respond_to?(:bytesize) && string.bytesize != string.size
+ #:nocov:
+ string.dup.force_encoding('binary')
+ #:nocov:
+ else
+ string
+ end
end
- def marshal_load options
- @options = options
- end
-
protected
-
+
# Can be implemented by subclasses to do some initialization
# that has to be done once per instance.
#
# Use reset for initialization that has to be done once per
# scan.
- def setup
+ def setup # :doc:
end
-
+
# This is the central method, and commonly the only one a
# subclass implements.
#
# Subclasses must implement this method; it must return +tokens+
# and must only use Tokens#<< for storing scanned tokens!
- def scan_tokens tokens, options
- raise NotImplementedError,
- "#{self.class}#scan_tokens not implemented."
+ def scan_tokens tokens, options # :doc:
+ raise NotImplementedError, "#{self.class}#scan_tokens not implemented."
end
-
+
+ # Resets the scanner.
def reset_instance
- @tokens.clear unless @options[:keep_tokens]
+ @tokens.clear if @tokens.respond_to?(:clear) && !@options[:keep_tokens]
@cached_tokens = nil
- @bin_string = nil if defined? @bin_string
+ @binary_string = nil if defined? @binary_string
end
-
+
# Scanner error with additional status information
- def raise_inspect msg, tokens, state = 'No state given!', ambit = 30
+ def raise_inspect msg, tokens, state = self.state || 'No state given!', ambit = 30, backtrace = caller
raise ScanError, <<-EOE % [
@@ -272,13 +300,13 @@ surrounding code:
EOE
File.basename(caller[0]),
msg,
- tokens.size,
- tokens.last(10).map { |t| t.inspect }.join("\n"),
+ tokens.respond_to?(:size) ? tokens.size : 0,
+ tokens.respond_to?(:last) ? tokens.last(10).map { |t| t.inspect }.join("\n") : '',
line, column, pos,
matched, state, bol?, eos?,
- string[pos - ambit, ambit],
- string[pos, ambit],
- ]
+ binary_string[pos - ambit, ambit],
+ binary_string[pos, ambit],
+ ], backtrace
end
# Shorthand for scan_until(/\z/).
@@ -288,19 +316,8 @@ surrounding code:
terminate
rest
end
-
- end
-
- end
-end
-
-class String
- # I love this hack. It seems to silence all dos/unix/mac newline problems.
- def to_unix
- if index ?\r
- gsub(/\r\n?/, "\n")
- else
- self
+
end
+
end
end
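Reviewer note on the scanner.rb hunk above: the new column computation works on byte offsets, which is why it reads from binary_string rather than string. The following is a minimal standalone sketch of that arithmetic, not CodeRay code; the helper name column_at is hypothetical, and it assumes a Ruby with String#force_encoding (1.9 or later).

```ruby
# Hypothetical helper, not part of CodeRay: computes a 1-based, byte-based
# column for the byte offset +pos+, mirroring the arithmetic in the patched
# Scanner#column shown in the diff.
def column_at(string, pos)
  return 1 if pos <= 0
  # Work on raw bytes so that +pos+ (a byte index) lines up with rindex.
  binary = string.dup.force_encoding('binary')
  # Search backwards from the byte before +pos+ for the previous newline;
  # if there is none, the line starts at offset 0, hence the -1.
  pos - (binary.rindex("\n", pos - 1) || -1)
end

column_at("ab\ncd", 3)  # first byte after the newline
column_at("ab\ncd", 4)  # second byte on the second line
```

Starting the backwards search at pos - 1 means a newline sitting exactly at pos still counts as part of the current line, which the old expression (search from pos, fall back to 0) did not handle.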
diff --git a/lib/coderay/scanners/_map.rb b/lib/coderay/scanners/_map.rb
index 01078c1..a240298 100644
--- a/lib/coderay/scanners/_map.rb
+++ b/lib/coderay/scanners/_map.rb
@@ -1,23 +1,24 @@
module CodeRay
module Scanners
-
+
map \
- :h => :c,
- :cplusplus => :cpp,
- :'c++' => :cpp,
- :ecma => :java_script,
- :ecmascript => :java_script,
+ :'c++' => :cpp,
+ :cplusplus => :cpp,
+ :ecmascript => :java_script,
:ecma_script => :java_script,
- :irb => :ruby,
- :javascript => :java_script,
- :js => :java_script,
- :nitro => :nitro_xhtml,
- :pascal => :delphi,
- :plain => :plaintext,
- :xhtml => :html,
- :yml => :yaml
-
- default :plain
-
+ :rhtml => :erb,
+ :eruby => :erb,
+ :irb => :ruby,
+ :javascript => :java_script,
+ :js => :java_script,
+ :pascal => :delphi,
+ :patch => :diff,
+ :plain => :text,
+ :plaintext => :text,
+ :xhtml => :html,
+ :yml => :yaml
+
+ default :text
+
end
end
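Reviewer note on the _map.rb hunk above: the alias table now routes :plain and :plaintext to :text and makes :text the default. The sketch below illustrates how such a table could be resolved; it is not CodeRay's plugin host. The names ALIASES and resolve_scanner, the registered list, and the assumption that an unknown id falls back to the default are all illustrative.

```ruby
# Illustrative subset of the alias table from the diff (not the full list).
ALIASES = {
  :'c++'     => :cpp,
  :rhtml     => :erb,
  :patch     => :diff,
  :plain     => :text,
  :plaintext => :text,
  :yml       => :yaml,
}.freeze

# Hypothetical resolver: follow one alias step, then fall back to the
# default when the resulting id is not among the registered scanners.
def resolve_scanner(id, registered, default = :text)
  id = ALIASES.fetch(id, id)
  registered.include?(id) ? id : default
end

registered = [:cpp, :erb, :diff, :text, :yaml]
resolve_scanner(:plain, registered)
resolve_scanner(:'c++', registered)
resolve_scanner(:no_such_language, registered)
```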
diff --git a/lib/coderay/scanners/c.rb b/lib/coderay/scanners/c.rb
index d7f2be7..8d24b99 100644
--- a/lib/coderay/scanners/c.rb
+++ b/lib/coderay/scanners/c.rb
@@ -1,46 +1,47 @@
module CodeRay
module Scanners
-
+
+ # Scanner for C.
class C < Scanner
- include Streamable
-
register_for :c
file_extension 'c'
-
- RESERVED_WORDS = [
+
+ KEYWORDS = [
'asm', 'break', 'case', 'continue', 'default', 'do',
'else', 'enum', 'for', 'goto', 'if', 'return',
'sizeof', 'struct', 'switch', 'typedef', 'union', 'while',
'restrict', # added in C99
- ]
+ ] # :nodoc:
PREDEFINED_TYPES = [
'int', 'long', 'short', 'char',
'signed', 'unsigned', 'float', 'double',
'bool', 'complex', # added in C99
- ]
+ ] # :nodoc:
PREDEFINED_CONSTANTS = [
'EOF', 'NULL',
'true', 'false', # added in C99
- ]
+ ] # :nodoc:
DIRECTIVES = [
'auto', 'extern', 'register', 'static', 'void',
'const', 'volatile', # added in C89
'inline', # added in C99
- ]
+ ] # :nodoc:
IDENT_KIND = WordList.new(:ident).
- add(RESERVED_WORDS, :reserved).
- add(PREDEFINED_TYPES, :pre_type).
+ add(KEYWORDS, :keyword).
+ add(PREDEFINED_TYPES, :predefined_type).
add(DIRECTIVES, :directive).
- add(PREDEFINED_CONSTANTS, :pre_constant)
-
- ESCAPE = / [rbfntv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x
- UNICODE_ESCAPE = / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} /x
+ add(PREDEFINED_CONSTANTS, :predefined_constant) # :nodoc:
- def scan_tokens tokens, options
+ ESCAPE = / [rbfntv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x # :nodoc:
+ UNICODE_ESCAPE = / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} /x # :nodoc:
+
+ protected
+
+ def scan_tokens encoder, options
state = :initial
label_expected = true
@@ -50,9 +51,6 @@ module Scanners
until eos?
- kind = nil
- match = nil
-
case state
when :initial
@@ -62,15 +60,10 @@ module Scanners
in_preproc_line = false
label_expected = label_expected_before_preproc_line
end
- tokens << [match, :space]
- next
+ encoder.text_token match, :space
- elsif scan(%r! // [^\n\\]* (?: \\. [^\n\\]* )* | /\* (?: .*? \*/ | .* ) !mx)
- kind = :comment
-
- elsif match = scan(/ \# \s* if \s* 0 /x)
- match << scan_until(/ ^\# (?:elif|else|endif) .*? $ | \z /xm) unless eos?
- kind = :comment
+ elsif match = scan(%r! // [^\n\\]* (?: \\. [^\n\\]* )* | /\* (?: .*? \*/ | .* ) !mx)
+ encoder.text_token match, :comment
elsif match = scan(/ [-+*=<>?:;,!&^|()\[\]{}~%]+ | \/=? | \.(?!\d) /x)
label_expected = match =~ /[;\{\}]/
@@ -78,7 +71,7 @@ module Scanners
label_expected = true if match == ':'
case_expected = false
end
- kind = :operator
+ encoder.text_token match, :operator
elsif match = scan(/ [A-Za-z_][A-Za-z_0-9]* /x)
kind = IDENT_KIND[match]
@@ -87,114 +80,107 @@ module Scanners
match << matched
else
label_expected = false
- if kind == :reserved
+ if kind == :keyword
case match
when 'case', 'default'
case_expected = true
end
end
end
+ encoder.text_token match, kind
- elsif scan(/\$/)
- kind = :ident
-
elsif match = scan(/L?"/)
- tokens << [:open, :string]
+ encoder.begin_group :string
if match[0] == ?L
- tokens << ['L', :modifier]
+ encoder.text_token 'L', :modifier
match = '"'
end
+ encoder.text_token match, :delimiter
state = :string
- kind = :delimiter
- elsif scan(/#[ \t]*(\w*)/)
- kind = :preprocessor
+ elsif match = scan(/ \# \s* if \s* 0 /x)
+ match << scan_until(/ ^\# (?:elif|else|endif) .*? $ | \z /xm) unless eos?
+ encoder.text_token match, :comment
+
+ elsif match = scan(/#[ \t]*(\w*)/)
+ encoder.text_token match, :preprocessor
in_preproc_line = true
label_expected_before_preproc_line = label_expected
state = :include_expected if self[1] == 'include'
- elsif scan(/ L?' (?: [^\'\n\\] | \\ #{ESCAPE} )? '? /ox)
+ elsif match = scan(/ L?' (?: [^\'\n\\] | \\ #{ESCAPE} )? '? /ox)
label_expected = false
- kind = :char
+ encoder.text_token match, :char
- elsif scan(/0[xX][0-9A-Fa-f]+/)
+ elsif match = scan(/\$/)
+ encoder.text_token match, :ident
+
+ elsif match = scan(/0[xX][0-9A-Fa-f]+/)
label_expected = false
- kind = :hex
+ encoder.text_token match, :hex
- elsif scan(/(?:0[0-7]+)(?![89.eEfF])/)
+ elsif match = scan(/(?:0[0-7]+)(?![89.eEfF])/)
label_expected = false
- kind = :oct
+ encoder.text_token match, :octal
- elsif scan(/(?:\d+)(?![.eEfF])L?L?/)
+ elsif match = scan(/(?:\d+)(?![.eEfF])L?L?/)
label_expected = false
- kind = :integer
+ encoder.text_token match, :integer
- elsif scan(/\d[fF]?|\d*\.\d+(?:[eE][+-]?\d+)?[fF]?|\d+[eE][+-]?\d+[fF]?/)
+ elsif match = scan(/\d[fF]?|\d*\.\d+(?:[eE][+-]?\d+)?[fF]?|\d+[eE][+-]?\d+[fF]?/)
label_expected = false
- kind = :float
+ encoder.text_token match, :float
else
- getch
- kind = :error
+ encoder.text_token getch, :error
end
when :string
- if scan(/[^\\\n"]+/)
- kind = :content
- elsif scan(/"/)
- tokens << ['"', :delimiter]
- tokens << [:close, :string]
+ if match = scan(/[^\\\n"]+/)
+ encoder.text_token match, :content
+ elsif match = scan(/"/)
+ encoder.text_token match, :delimiter
+ encoder.end_group :string
state = :initial
label_expected = false
- next
- elsif scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
- kind = :char
- elsif scan(/ \\ | $ /x)
- tokens << [:close, :string]
- kind = :error
+ elsif match = scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
+ encoder.text_token match, :char
+ elsif match = scan(/ \\ | $ /x)
+ encoder.end_group :string
+ encoder.text_token match, :error
state = :initial
label_expected = false
else
- raise_inspect "else case \" reached; %p not handled." % peek(1), tokens
+ raise_inspect "else case \" reached; %p not handled." % peek(1), encoder
end
when :include_expected
- if scan(/<[^>\n]+>?|"[^"\n\\]*(?:\\.[^"\n\\]*)*"?/)
- kind = :include
+ if match = scan(/<[^>\n]+>?|"[^"\n\\]*(?:\\.[^"\n\\]*)*"?/)
+ encoder.text_token match, :include
state = :initial
elsif match = scan(/\s+/)
- kind = :space
+ encoder.text_token match, :space
state = :initial if match.index ?\n
else
state = :initial
- next
end
else
- raise_inspect 'Unknown state', tokens
-
- end
+ raise_inspect 'Unknown state', encoder
- match ||= matched
- if $CODERAY_DEBUG and not kind
- raise_inspect 'Error token %p in line %d' %
- [[match, kind], line], tokens
end
- raise_inspect 'Empty token', tokens unless match
-
- tokens << [match, kind]
end
if state == :string
- tokens << [:close, :string]
+ encoder.end_group :string
end
- tokens
+ encoder
end
end
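Reviewer note on the c.rb hunk above: the scanner no longer collects [match, kind] pairs and appends them at the bottom of the loop; each branch now reports directly to an encoder through text_token, begin_group and end_group. The sketch below shows that calling pattern in isolation. RecordingEncoder and scan_c_strings are hypothetical stand-ins built on the standard strscan library, and escape sequences are deliberately not handled.

```ruby
require 'strscan'

# Hypothetical stand-in for an encoder: it only records the calls it gets.
class RecordingEncoder
  attr_reader :events

  def initialize
    @events = []
  end

  def text_token(text, kind)
    @events << [text, kind]
  end

  def begin_group(kind)
    @events << [:begin_group, kind]
  end

  def end_group(kind)
    @events << [:end_group, kind]
  end
end

# Toy scanner for identifiers, whitespace and double-quoted strings.
def scan_c_strings(source, encoder)
  scanner = StringScanner.new(source)
  state = :initial
  until scanner.eos?
    case state
    when :initial
      if match = scanner.scan(/\s+/)
        encoder.text_token match, :space
      elsif match = scanner.scan(/"/)
        encoder.begin_group :string
        encoder.text_token match, :delimiter
        state = :string
      elsif match = scanner.scan(/[A-Za-z_]\w*/)
        encoder.text_token match, :ident
      else
        encoder.text_token scanner.getch, :error
      end
    when :string
      if match = scanner.scan(/[^\\\n"]+/)
        encoder.text_token match, :content
      elsif match = scanner.scan(/"/)
        encoder.text_token match, :delimiter
        encoder.end_group :string
        state = :initial
      else
        # Backslash or newline: flagged as an error in this simplified sketch.
        encoder.text_token scanner.getch, :error
      end
    end
  end
  # Close a string group that was still open at end of input.
  encoder.end_group :string if state == :string
  encoder
end

scan_c_strings('x "hi"', RecordingEncoder.new).events
```

Because every branch emits its own token, the trailing "match ||= matched" bookkeeping and the "next" statements removed in the hunk are no longer needed.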
diff --git a/lib/coderay/scanners/clojure.rb b/lib/coderay/scanners/clojure.rb
new file mode 100644
index 0000000..f8fbf65
--- /dev/null
+++ b/lib/coderay/scanners/clojure.rb
@@ -0,0 +1,217 @@
+# encoding: utf-8
+module CodeRay
+ module Scanners
+
+ # Clojure scanner by Licenser.
+ class Clojure < Scanner
+
+ register_for :clojure
+ file_extension 'clj'
+
+ SPECIAL_FORMS = %w[
+ def if do let quote var fn loop recur throw try catch monitor-enter monitor-exit .
+ new
+ ] # :nodoc:
+
+ CORE_FORMS = %w[
+ + - -> ->> .. / * <= < = == >= > accessor aclone add-classpath add-watch
+ agent agent-error agent-errors aget alength alias all-ns alter alter-meta!
+ alter-var-root amap ancestors and apply areduce array-map aset aset-boolean
+ aset-byte aset-char aset-double aset-float aset-int aset-long aset-short
+ assert assoc assoc! assoc-in associative? atom await await-for bases bean
+ bigdec bigint binding bit-and bit-and-not bit-clear bit-flip bit-not bit-or
+ bit-set bit-shift-left bit-shift-right bit-test bit-xor boolean boolean-array
+ booleans bound-fn bound-fn* bound? butlast byte byte-array bytes case cast char
+ char-array char-escape-string char-name-string char? chars class class?
+ clear-agent-errors clojure-version coll? comment commute comp comparator
+ compare compare-and-set! compile complement concat cond condp conj conj!
+ cons constantly construct-proxy contains? count counted? create-ns
+ create-struct cycle dec decimal? declare definline defmacro defmethod defmulti
+ defn defn- defonce defprotocol defrecord defstruct deftype delay delay?
+ deliver denominator deref derive descendants disj disj! dissoc dissoc!
+ distinct distinct? doall doc dorun doseq dosync dotimes doto double
+ double-array doubles drop drop-last drop-while empty empty? ensure
+ enumeration-seq error-handler error-mode eval even? every? extend
+ extend-protocol extend-type extenders extends? false? ffirst file-seq
+ filter find find-doc find-ns find-var first float float-array float?
+ floats flush fn fn? fnext for force format future future-call future-cancel
+ future-cancelled? future-done? future? gen-class gen-interface gensym get
+ get-in get-method get-proxy-class get-thread-bindings get-validator hash
+ hash-map hash-set identical? identity if-let if-not ifn? import in-ns
+ inc init-proxy instance? int int-array integer? interleave intern
+ interpose into into-array ints io! isa? iterate iterator-seq juxt key
+ keys keyword keyword? last lazy-cat lazy-seq let letfn line-seq list list*
+ list? load load-file load-reader load-string loaded-libs locking long
+ long-array longs loop macroexpand macroexpand-1 make-array make-hierarchy
+ map map? mapcat max max-key memfn memoize merge merge-with meta methods
+ min min-key mod name namespace neg? newline next nfirst nil? nnext not
+ not-any? not-empty not-every? not= ns ns-aliases ns-imports ns-interns
+ ns-map ns-name ns-publics ns-refers ns-resolve ns-unalias ns-unmap nth
+ nthnext num number? numerator object-array odd? or parents partial
+ partition pcalls peek persistent! pmap pop pop! pop-thread-bindings
+ pos? pr pr-str prefer-method prefers print print-namespace-doc
+ print-str printf println println-str prn prn-str promise proxy
+ proxy-mappings proxy-super push-thread-bindings pvalues quot rand
+ rand-int range ratio? rationalize re-find re-groups re-matcher
+ re-matches re-pattern re-seq read read-line read-string reduce ref
+ ref-history-count ref-max-history ref-min-history ref-set refer
+ refer-clojure reify release-pending-sends rem remove remove-all-methods
+ remove-method remove-ns remove-watch repeat repeatedly replace replicate
+ require reset! reset-meta! resolve rest restart-agent resultset-seq
+ reverse reversible? rseq rsubseq satisfies? second select-keys send
+ send-off seq seq? seque sequence sequential? set set-error-handler!
+ set-error-mode! set-validator! set? short short-array shorts
+ shutdown-agents slurp some sort sort-by sorted-map sorted-map-by
+ sorted-set sorted-set-by sorted? special-form-anchor special-symbol?
+ split-at split-with str string? struct struct-map subs subseq subvec
+ supers swap! symbol symbol? sync syntax-symbol-anchor take take-last
+ take-nth take-while test the-ns thread-bound? time to-array to-array-2d
+ trampoline transient tree-seq true? type unchecked-add unchecked-dec
+ unchecked-divide unchecked-inc unchecked-multiply unchecked-negate
+ unchecked-remainder unchecked-subtract underive update-in update-proxy
+ use val vals var-get var-set var? vary-meta vec vector vector-of vector?
+ when when-first when-let when-not while with-bindings with-bindings*
+ with-in-str with-local-vars with-meta with-open with-out-str
+ with-precision xml-seq zero? zipmap
+ ] # :nodoc:
+
+ PREDEFINED_CONSTANTS = %w[
+ true false nil *1 *2 *3 *agent* *clojure-version* *command-line-args*
+ *compile-files* *compile-path* *e *err* *file* *flush-on-newline*
+ *in* *ns* *out* *print-dup* *print-length* *print-level* *print-meta*
+ *print-readably* *read-eval* *warn-on-reflection*
+ ] # :nodoc:
+
+ IDENT_KIND = WordList.new(:ident).
+ add(SPECIAL_FORMS, :keyword).
+ add(CORE_FORMS, :keyword).
+ add(PREDEFINED_CONSTANTS, :predefined_constant)
+
+ KEYWORD_NEXT_TOKEN_KIND = WordList.new(nil).
+ add(%w[ def defn defn- definline defmacro defmulti defmethod defstruct defonce declare ], :function).
+ add(%w[ ns ], :namespace).
+ add(%w[ defprotocol defrecord ], :class)
+
+ BASIC_IDENTIFIER = /[a-zA-Z$%*\/_+!?&<>\-=]=?[a-zA-Z0-9$&*+!\/_?<>\-\#]*/
+ IDENTIFIER = /(?!-\d)(?:(?:#{BASIC_IDENTIFIER}\.)*#{BASIC_IDENTIFIER}(?:\/#{BASIC_IDENTIFIER})?\.?)|\.\.?/
+ SYMBOL = /::?#{IDENTIFIER}/o
+ DIGIT = /\d/
+ DIGIT10 = DIGIT
+ DIGIT16 = /[0-9a-f]/i
+ DIGIT8 = /[0-7]/
+ DIGIT2 = /[01]/
+ RADIX16 = /\#x/i
+ RADIX8 = /\#o/i
+ RADIX2 = /\#b/i
+ RADIX10 = /\#d/i
+ EXACTNESS = /#i|#e/i
+ SIGN = /[\+-]?/
+ EXP_MARK = /[esfdl]/i
+ EXP = /#{EXP_MARK}#{SIGN}#{DIGIT}+/
+ SUFFIX = /#{EXP}?/
+ PREFIX10 = /#{RADIX10}?#{EXACTNESS}?|#{EXACTNESS}?#{RADIX10}?/
+ PREFIX16 = /#{RADIX16}#{EXACTNESS}?|#{EXACTNESS}?#{RADIX16}/
+ PREFIX8 = /#{RADIX8}#{EXACTNESS}?|#{EXACTNESS}?#{RADIX8}/
+ PREFIX2 = /#{RADIX2}#{EXACTNESS}?|#{EXACTNESS}?#{RADIX2}/
+ UINT10 = /#{DIGIT10}+#*/
+ UINT16 = /#{DIGIT16}+#*/
+ UINT8 = /#{DIGIT8}+#*/
+ UINT2 = /#{DIGIT2}+#*/
+ DECIMAL = /#{DIGIT10}+#+\.#*#{SUFFIX}|#{DIGIT10}+\.#{DIGIT10}*#*#{SUFFIX}|\.#{DIGIT10}+#*#{SUFFIX}|#{UINT10}#{EXP}/
+ UREAL10 = /#{UINT10}\/#{UINT10}|#{DECIMAL}|#{UINT10}/
+ UREAL16 = /#{UINT16}\/#{UINT16}|#{UINT16}/
+ UREAL8 = /#{UINT8}\/#{UINT8}|#{UINT8}/
+ UREAL2 = /#{UINT2}\/#{UINT2}|#{UINT2}/
+ REAL10 = /#{SIGN}#{UREAL10}/
+ REAL16 = /#{SIGN}#{UREAL16}/
+ REAL8 = /#{SIGN}#{UREAL8}/
+ REAL2 = /#{SIGN}#{UREAL2}/
+ IMAG10 = /i|#{UREAL10}i/
+ IMAG16 = /i|#{UREAL16}i/
+ IMAG8 = /i|#{UREAL8}i/
+ IMAG2 = /i|#{UREAL2}i/
+ COMPLEX10 = /#{REAL10}@#{REAL10}|#{REAL10}\+#{IMAG10}|#{REAL10}-#{IMAG10}|\+#{IMAG10}|-#{IMAG10}|#{REAL10}/
+ COMPLEX16 = /#{REAL16}@#{REAL16}|#{REAL16}\+#{IMAG16}|#{REAL16}-#{IMAG16}|\+#{IMAG16}|-#{IMAG16}|#{REAL16}/
+ COMPLEX8 = /#{REAL8}@#{REAL8}|#{REAL8}\+#{IMAG8}|#{REAL8}-#{IMAG8}|\+#{IMAG8}|-#{IMAG8}|#{REAL8}/
+ COMPLEX2 = /#{REAL2}@#{REAL2}|#{REAL2}\+#{IMAG2}|#{REAL2}-#{IMAG2}|\+#{IMAG2}|-#{IMAG2}|#{REAL2}/
+ NUM10 = /#{PREFIX10}?#{COMPLEX10}/
+ NUM16 = /#{PREFIX16}#{COMPLEX16}/
+ NUM8 = /#{PREFIX8}#{COMPLEX8}/
+ NUM2 = /#{PREFIX2}#{COMPLEX2}/
+ NUM = /#{NUM10}|#{NUM16}|#{NUM8}|#{NUM2}/
+
+ protected
+
+ def scan_tokens encoder, options
+
+ state = :initial
+ kind = nil
+
+ until eos?
+
+ case state
+ when :initial
+ if match = scan(/ \s+ | \\\n | , /x)
+ encoder.text_token match, :space
+ elsif match = scan(/['`\(\[\)\]\{\}]|\#[({]|~@?|[@\^]/)
+ encoder.text_token match, :operator
+ elsif match = scan(/;.*/)
+ encoder.text_token match, :comment # TODO: recognize (comment ...) too
+ elsif match = scan(/\#?\\(?:newline|space|.?)/)
+ encoder.text_token match, :char
+ elsif match = scan(/\#[ft]/)
+ encoder.text_token match, :predefined_constant
+ elsif match = scan(/#{IDENTIFIER}/o)
+ kind = IDENT_KIND[match]
+ encoder.text_token match, kind
+ if rest? && kind == :keyword
+ if kind = KEYWORD_NEXT_TOKEN_KIND[match]
+ encoder.text_token match, :space if match = scan(/\s+/o)
+ encoder.text_token match, kind if match = scan(/#{IDENTIFIER}/o)
+ end
+ end
+ elsif match = scan(/#{SYMBOL}/o)
+ encoder.text_token match, :symbol
+ elsif match = scan(/\./)
+ encoder.text_token match, :operator
+ elsif match = scan(/ \# \^ #{IDENTIFIER} /ox)
+ encoder.text_token match, :type
+ elsif match = scan(/ (\#)? " /x)
+ state = self[1] ? :regexp : :string
+ encoder.begin_group state
+ encoder.text_token match, :delimiter
+ elsif match = scan(/#{NUM}/o) and not matched.empty?
+ encoder.text_token match, match[/[.e\/]/i] ? :float : :integer
+ else
+ encoder.text_token getch, :error
+ end
+
+ when :string, :regexp
+ if match = scan(/[^"\\]+|\\.?/)
+ encoder.text_token match, :content
+ elsif match = scan(/"/)
+ encoder.text_token match, :delimiter
+ encoder.end_group state
+ state = :initial
+ else
+ raise_inspect "else case \" reached; %p not handled." % peek(1),
+ encoder, state
+ end
+
+ else
+ raise 'else case reached'
+
+ end
+
+ end
+
+ if [:string, :regexp].include? state
+ encoder.end_group state
+ end
+
+ encoder
+
+ end
+ end
+ end
+end
\ No newline at end of file
diff --git a/lib/coderay/scanners/cpp.rb b/lib/coderay/scanners/cpp.rb
index c29083a..9da62f4 100644
--- a/lib/coderay/scanners/cpp.rb
+++ b/lib/coderay/scanners/cpp.rb
@@ -1,16 +1,17 @@
module CodeRay
module Scanners
+ # Scanner for C++.
+ #
+ # Aliases: +cplusplus+, c++
class CPlusPlus < Scanner
- include Streamable
-
register_for :cpp
file_extension 'cpp'
title 'C++'
- # http://www.cppreference.com/wiki/keywords/start
- RESERVED_WORDS = [
+ #-- http://www.cppreference.com/wiki/keywords/start
+ KEYWORDS = [
'and', 'and_eq', 'asm', 'bitand', 'bitor', 'break',
'case', 'catch', 'class', 'compl', 'const_cast',
'continue', 'default', 'delete', 'do', 'dynamic_cast', 'else',
@@ -18,37 +19,39 @@ module Scanners
'not', 'not_eq', 'or', 'or_eq', 'reinterpret_cast', 'return',
'sizeof', 'static_cast', 'struct', 'switch', 'template',
'throw', 'try', 'typedef', 'typeid', 'typename', 'union',
- 'while', 'xor', 'xor_eq'
- ]
-
+ 'while', 'xor', 'xor_eq',
+ ] # :nodoc:
+
PREDEFINED_TYPES = [
'bool', 'char', 'double', 'float', 'int', 'long',
- 'short', 'signed', 'unsigned', 'wchar_t', 'string'
- ]
+ 'short', 'signed', 'unsigned', 'wchar_t', 'string',
+ ] # :nodoc:
PREDEFINED_CONSTANTS = [
'false', 'true',
'EOF', 'NULL',
- ]
+ ] # :nodoc:
PREDEFINED_VARIABLES = [
- 'this'
- ]
+ 'this',
+ ] # :nodoc:
DIRECTIVES = [
'auto', 'const', 'explicit', 'extern', 'friend', 'inline', 'mutable', 'operator',
'private', 'protected', 'public', 'register', 'static', 'using', 'virtual', 'void',
- 'volatile'
- ]
-
+ 'volatile',
+ ] # :nodoc:
+
IDENT_KIND = WordList.new(:ident).
- add(RESERVED_WORDS, :reserved).
- add(PREDEFINED_TYPES, :pre_type).
+ add(KEYWORDS, :keyword).
+ add(PREDEFINED_TYPES, :predefined_type).
add(PREDEFINED_VARIABLES, :local_variable).
add(DIRECTIVES, :directive).
- add(PREDEFINED_CONSTANTS, :pre_constant)
-
- ESCAPE = / [rbfntv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x
- UNICODE_ESCAPE = / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} /x
+ add(PREDEFINED_CONSTANTS, :predefined_constant) # :nodoc:
- def scan_tokens tokens, options
+ ESCAPE = / [rbfntv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x # :nodoc:
+ UNICODE_ESCAPE = / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} /x # :nodoc:
+
+ protected
+
+ def scan_tokens encoder, options
state = :initial
label_expected = true
@@ -58,9 +61,6 @@ module Scanners
until eos?
- kind = nil
- match = nil
-
case state
when :initial
@@ -70,15 +70,14 @@ module Scanners
in_preproc_line = false
label_expected = label_expected_before_preproc_line
end
- tokens << [match, :space]
- next
+ encoder.text_token match, :space
- elsif scan(%r! // [^\n\\]* (?: \\. [^\n\\]* )* | /\* (?: .*? \*/ | .* ) !mx)
- kind = :comment
+ elsif match = scan(%r! // [^\n\\]* (?: \\. [^\n\\]* )* | /\* (?: .*? \*/ | .* ) !mx)
+ encoder.text_token match, :comment
elsif match = scan(/ \# \s* if \s* 0 /x)
match << scan_until(/ ^\# (?:elif|else|endif) .*? $ | \z /xm) unless eos?
- kind = :comment
+ encoder.text_token match, :comment
elsif match = scan(/ [-+*=<>?:;,!&^|()\[\]{}~%]+ | \/=? | \.(?!\d) /x)
label_expected = match =~ /[;\{\}]/
@@ -86,7 +85,7 @@ module Scanners
label_expected = true if match == ':'
case_expected = false
end
- kind = :operator
+ encoder.text_token match, :operator
elsif match = scan(/ [A-Za-z_][A-Za-z_0-9]* /x)
kind = IDENT_KIND[match]
@@ -95,7 +94,7 @@ module Scanners
match << matched
else
label_expected = false
- if kind == :reserved
+ if kind == :keyword
case match
when 'class'
state = :class_name_expected
@@ -104,122 +103,110 @@ module Scanners
end
end
end
+ encoder.text_token match, kind
- elsif scan(/\$/)
- kind = :ident
+ elsif match = scan(/\$/)
+ encoder.text_token match, :ident
elsif match = scan(/L?"/)
- tokens << [:open, :string]
+ encoder.begin_group :string
if match[0] == ?L
- tokens << ['L', :modifier]
+ encoder.text_token 'L', :modifier
match = '"'
end
state = :string
- kind = :delimiter
+ encoder.text_token match, :delimiter
- elsif scan(/#[ \t]*(\w*)/)
- kind = :preprocessor
+ elsif match = scan(/#[ \t]*(\w*)/)
+ encoder.text_token match, :preprocessor
in_preproc_line = true
label_expected_before_preproc_line = label_expected
state = :include_expected if self[1] == 'include'
- elsif scan(/ L?' (?: [^\'\n\\] | \\ #{ESCAPE} )? '? /ox)
+ elsif match = scan(/ L?' (?: [^\'\n\\] | \\ #{ESCAPE} )? '? /ox)
label_expected = false
- kind = :char
+ encoder.text_token match, :char
- elsif scan(/0[xX][0-9A-Fa-f]+/)
+ elsif match = scan(/0[xX][0-9A-Fa-f]+/)
label_expected = false
- kind = :hex
+ encoder.text_token match, :hex
- elsif scan(/(?:0[0-7]+)(?![89.eEfF])/)
+ elsif match = scan(/(?:0[0-7]+)(?![89.eEfF])/)
label_expected = false
- kind = :oct
+ encoder.text_token match, :octal
- elsif scan(/(?:\d+)(?![.eEfF])L?L?/)
+ elsif match = scan(/(?:\d+)(?![.eEfF])L?L?/)
label_expected = false
- kind = :integer
+ encoder.text_token match, :integer
- elsif scan(/\d[fF]?|\d*\.\d+(?:[eE][+-]?\d+)?[fF]?|\d+[eE][+-]?\d+[fF]?/)
+ elsif match = scan(/\d[fF]?|\d*\.\d+(?:[eE][+-]?\d+)?[fF]?|\d+[eE][+-]?\d+[fF]?/)
label_expected = false
- kind = :float
+ encoder.text_token match, :float
else
- getch
- kind = :error
+ encoder.text_token getch, :error
end
when :string
- if scan(/[^\\"]+/)
- kind = :content
- elsif scan(/"/)
- tokens << ['"', :delimiter]
- tokens << [:close, :string]
+ if match = scan(/[^\\"]+/)
+ encoder.text_token match, :content
+ elsif match = scan(/"/)
+ encoder.text_token match, :delimiter
+ encoder.end_group :string
state = :initial
label_expected = false
- next
- elsif scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
- kind = :char
- elsif scan(/ \\ | $ /x)
- tokens << [:close, :string]
- kind = :error
+ elsif match = scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
+ encoder.text_token match, :char
+ elsif match = scan(/ \\ | $ /x)
+ encoder.end_group :string
+ encoder.text_token match, :error
state = :initial
label_expected = false
else
- raise_inspect "else case \" reached; %p not handled." % peek(1), tokens
+ raise_inspect "else case \" reached; %p not handled." % peek(1), encoder
end
when :include_expected
- if scan(/<[^>\n]+>?|"[^"\n\\]*(?:\\.[^"\n\\]*)*"?/)
- kind = :include
+ if match = scan(/<[^>\n]+>?|"[^"\n\\]*(?:\\.[^"\n\\]*)*"?/)
+ encoder.text_token match, :include
state = :initial
elsif match = scan(/\s+/)
- kind = :space
+ encoder.text_token match, :space
state = :initial if match.index ?\n
else
state = :initial
- next
end
when :class_name_expected
- if scan(/ [A-Za-z_][A-Za-z_0-9]* /x)
- kind = :class
+ if match = scan(/ [A-Za-z_][A-Za-z_0-9]* /x)
+ encoder.text_token match, :class
state = :initial
elsif match = scan(/\s+/)
- kind = :space
+ encoder.text_token match, :space
else
- getch
- kind = :error
+ encoder.text_token getch, :error
state = :initial
end
else
- raise_inspect 'Unknown state', tokens
+ raise_inspect 'Unknown state', encoder
end
- match ||= matched
- if $CODERAY_DEBUG and not kind
- raise_inspect 'Error token %p in line %d' %
- [[match, kind], line], tokens
- end
- raise_inspect 'Empty token', tokens unless match
-
- tokens << [match, kind]
-
end
if state == :string
- tokens << [:close, :string]
+ encoder.end_group :string
end
- tokens
+ encoder
end
end
diff --git a/lib/coderay/scanners/css.rb b/lib/coderay/scanners/css.rb
index 099238f..34eaecb 100644
--- a/lib/coderay/scanners/css.rb
+++ b/lib/coderay/scanners/css.rb
@@ -2,208 +2,198 @@ module CodeRay
module Scanners
class CSS < Scanner
-
+
register_for :css
-
+
KINDS_NOT_LOC = [
:comment,
:class, :pseudo_class, :type,
:constant, :directive,
- :key, :value, :operator, :color, :float, :content, :delimiter,
+ :key, :value, :operator, :color, :float, :string,
:error, :important,
- ]
+ ] # :nodoc:
- module RE
+ module RE # :nodoc:
Hex = /[0-9a-fA-F]/
Unicode = /\\#{Hex}{1,6}(?:\r\n|\s)?/ # differs from standard because it allows uppercase hex too
Escape = /#{Unicode}|\\[^\r\n\f0-9a-fA-F]/
NMChar = /[-_a-zA-Z0-9]|#{Escape}/
NMStart = /[_a-zA-Z]|#{Escape}/
NL = /\r\n|\r|\n|\f/
- String1 = /"(?:[^\n\r\f\\"]|\\#{NL}|#{Escape})*"?/ # FIXME: buggy regexp
- String2 = /'(?:[^\n\r\f\\']|\\#{NL}|#{Escape})*'?/ # FIXME: buggy regexp
+ String1 = /"(?:[^\n\r\f\\"]|\\#{NL}|#{Escape})*"?/ # TODO: buggy regexp
+ String2 = /'(?:[^\n\r\f\\']|\\#{NL}|#{Escape})*'?/ # TODO: buggy regexp
String = /#{String1}|#{String2}/
-
+
HexColor = /#(?:#{Hex}{6}|#{Hex}{3})/
Color = /#{HexColor}/
-
+
Num = /-?(?:[0-9]+|[0-9]*\.[0-9]+)/
Name = /#{NMChar}+/
Ident = /-?#{NMStart}#{NMChar}*/
AtKeyword = /@#{Ident}/
Percentage = /#{Num}%/
-
+
reldimensions = %w[em ex px]
absdimensions = %w[in cm mm pt pc]
- Unit = Regexp.union(*(reldimensions + absdimensions))
-
+ Unit = Regexp.union(*(reldimensions + absdimensions + %w[s]))
+
Dimension = /#{Num}#{Unit}/
-
+
Comment = %r! /\* (?: .*? \*/ | .* ) !mx
- Function = /(?:url|alpha)\((?:[^)\n\r\f]|\\\))*\)?/
-
+ Function = /(?:url|alpha|attr|counters?)\((?:[^)\n\r\f]|\\\))*\)?/
+
Id = /##{Name}/
Class = /\.#{Name}/
PseudoClass = /:#{Name}/
AttributeSelector = /\[[^\]]*\]?/
-
end
-
- def scan_tokens tokens, options
+
+ protected
+
+ def setup
+ @state = :initial
+ @value_expected = nil
+ end
+
+ def scan_tokens encoder, options
+ states = Array(options[:state] || @state)
+ value_expected = @value_expected
- value_expected = nil
- states = [:initial]
-
until eos?
-
- kind = nil
- match = nil
-
- if scan(/\s+/)
- kind = :space
-
+
+ if match = scan(/\s+/)
+ encoder.text_token match, :space
+
elsif case states.last
when :initial, :media
- if scan(/(?>#{RE::Ident})(?!\()|\*/ox)
- kind = :type
- elsif scan RE::Class
- kind = :class
- elsif scan RE::Id
- kind = :constant
- elsif scan RE::PseudoClass
- kind = :pseudo_class
+ if match = scan(/(?>#{RE::Ident})(?!\()|\*/ox)
+ encoder.text_token match, :type
+ next
+ elsif match = scan(RE::Class)
+ encoder.text_token match, :class
+ next
+ elsif match = scan(RE::Id)
+ encoder.text_token match, :constant
+ next
+ elsif match = scan(RE::PseudoClass)
+ encoder.text_token match, :pseudo_class
+ next
elsif match = scan(RE::AttributeSelector)
# TODO: Improve highlighting inside of attribute selectors.
- tokens << [:open, :string]
- tokens << [match[0,1], :delimiter]
- tokens << [match[1..-2], :content] if match.size > 2
- tokens << [match[-1,1], :delimiter] if match[-1] == ?]
- tokens << [:close, :string]
+ encoder.text_token match[0,1], :operator
+ encoder.text_token match[1..-2], :attribute_name if match.size > 2
+ encoder.text_token match[-1,1], :operator if match[-1] == ?]
next
elsif match = scan(/@media/)
- kind = :directive
+ encoder.text_token match, :directive
states.push :media_before_name
+ next
end
when :block
- if scan(/(?>#{RE::Ident})(?!\()/ox)
+ if match = scan(/(?>#{RE::Ident})(?!\()/ox)
if value_expected
- kind = :value
+ encoder.text_token match, :value
else
- kind = :key
+ encoder.text_token match, :key
end
+ next
end
-
+
when :media_before_name
- if scan RE::Ident
- kind = :type
+ if match = scan(RE::Ident)
+ encoder.text_token match, :type
states[-1] = :media_after_name
+ next
end
when :media_after_name
- if scan(/\{/)
- kind = :operator
+ if match = scan(/\{/)
+ encoder.text_token match, :operator
states[-1] = :media
+ next
end
- when :comment
- if scan(/(?:[^*\s]|\*(?!\/))+/)
- kind = :comment
- elsif scan(/\*\//)
- kind = :comment
- states.pop
- elsif scan(/\s+/)
- kind = :space
- end
-
else
- raise_inspect 'Unknown state', tokens
-
+ #:nocov:
+ raise_inspect 'Unknown state', encoder
+ #:nocov:
+
end
-
- elsif scan(/\/\*/)
- kind = :comment
- states.push :comment
-
- elsif scan(/\{/)
+
+ elsif match = scan(/\/\*(?:.*?\*\/|\z)/m)
+ encoder.text_token match, :comment
+
+ elsif match = scan(/\{/)
value_expected = false
- kind = :operator
+ encoder.text_token match, :operator
states.push :block
-
- elsif scan(/\}/)
+
+ elsif match = scan(/\}/)
value_expected = false
+ encoder.text_token match, :operator
if states.last == :block || states.last == :media
- kind = :operator
states.pop
- else
- kind = :error
end
-
+
elsif match = scan(/#{RE::String}/o)
- tokens << [:open, :string]
- tokens << [match[0, 1], :delimiter]
- tokens << [match[1..-2], :content] if match.size > 2
- tokens << [match[-1, 1], :delimiter] if match.size >= 2
- tokens << [:close, :string]
- next
-
+ encoder.begin_group :string
+ encoder.text_token match[0, 1], :delimiter
+ encoder.text_token match[1..-2], :content if match.size > 2
+ encoder.text_token match[-1, 1], :delimiter if match.size >= 2
+ encoder.end_group :string
+
elsif match = scan(/#{RE::Function}/o)
- tokens << [:open, :string]
+ encoder.begin_group :string
start = match[/^\w+\(/]
- tokens << [start, :delimiter]
+ encoder.text_token start, :delimiter
if match[-1] == ?)
- tokens << [match[start.size..-2], :content]
- tokens << [')', :delimiter]
+ encoder.text_token match[start.size..-2], :content
+ encoder.text_token ')', :delimiter
else
- tokens << [match[start.size..-1], :content]
+ encoder.text_token match[start.size..-1], :content
end
- tokens << [:close, :string]
- next
-
- elsif scan(/(?: #{RE::Dimension} | #{RE::Percentage} | #{RE::Num} )/ox)
- kind = :float
-
- elsif scan(/#{RE::Color}/o)
- kind = :color
-
- elsif scan(/! *important/)
- kind = :important
-
- elsif scan(/rgb\([^()\n]*\)?/)
- kind = :color
-
- elsif scan(/#{RE::AtKeyword}/o)
- kind = :directive
-
+ encoder.end_group :string
+
+ elsif match = scan(/(?: #{RE::Dimension} | #{RE::Percentage} | #{RE::Num} )/ox)
+ encoder.text_token match, :float
+
+ elsif match = scan(/#{RE::Color}/o)
+ encoder.text_token match, :color
+
+ elsif match = scan(/! *important/)
+ encoder.text_token match, :important
+
+ elsif match = scan(/(?:rgb|hsl)a?\([^()\n]*\)?/)
+ encoder.text_token match, :color
+
+ elsif match = scan(RE::AtKeyword)
+ encoder.text_token match, :directive
+
elsif match = scan(/ [+>:;,.=()\/] /x)
if match == ':'
value_expected = true
elsif match == ';'
value_expected = false
end
- kind = :operator
-
+ encoder.text_token match, :operator
+
else
- getch
- kind = :error
-
- end
-
- match ||= matched
- if $CODERAY_DEBUG and not kind
- raise_inspect 'Error token %p in line %d' %
- [[match, kind], line], tokens
+ encoder.text_token getch, :error
+
end
- raise_inspect 'Empty token', tokens unless match
-
- tokens << [match, kind]
-
+
end
-
- tokens
+
+ if options[:keep_state]
+ @state = states
+ @value_expected = value_expected
+ end
+
+ encoder
end
-
+
end
-
+
end
end
diff --git a/lib/coderay/scanners/debug.rb b/lib/coderay/scanners/debug.rb
index 0e78b23..566bfa7 100644
--- a/lib/coderay/scanners/debug.rb
+++ b/lib/coderay/scanners/debug.rb
@@ -1,62 +1,65 @@
module CodeRay
module Scanners
-
+
# = Debug Scanner
+ #
+ # Interprets the output of the Encoders::Debug encoder.
class Debug < Scanner
-
- include Streamable
+
register_for :debug
- file_extension 'raydebug'
- title 'CodeRay Token Dump'
-
+ title 'CodeRay Token Dump Import'
+
protected
- def scan_tokens tokens, options
-
+
+ def scan_tokens encoder, options
+
opened_tokens = []
-
+
until eos?
-
- kind = nil
- match = nil
-
- if scan(/\s+/)
- tokens << [matched, :space]
- next
-
- elsif scan(/ (\w+) \( ( [^\)\\]* ( \\. [^\)\\]* )* ) \) /x)
- kind = self[1].to_sym
- match = self[2].gsub(/\\(.)/, '\1')
-
- elsif scan(/ (\w+) < /x)
- kind = self[1].to_sym
- opened_tokens << kind
- match = :open
-
- elsif !opened_tokens.empty? && scan(/ > /x)
- kind = opened_tokens.pop || :error
- match = :close
-
- else
+
+ if match = scan(/\s+/)
+ encoder.text_token match, :space
+
+ elsif match = scan(/ (\w+) \( ( [^\)\\]* ( \\. [^\)\\]* )* ) \)? /x)
+ kind = self[1].to_sym
+ match = self[2].gsub(/\\(.)/m, '\1')
+ unless TokenKinds.has_key? kind
kind = :error
- getch
-
+ match = matched
end
-
- match ||= matched
- if $CODERAY_DEBUG and not kind
- raise_inspect 'Error token %p in line %d' %
- [[match, kind], line], tokens
+ encoder.text_token match, kind
+
+ elsif match = scan(/ (\w+) ([<\[]) /x)
+ kind = self[1].to_sym
+ opened_tokens << kind
+ case self[2]
+ when '<'
+ encoder.begin_group kind
+ when '['
+ encoder.begin_line kind
+ else
+ raise 'CodeRay bug: This case should not be reached.'
+ end
+
+ elsif !opened_tokens.empty? && match = scan(/ > /x)
+ encoder.end_group opened_tokens.pop
+
+ elsif !opened_tokens.empty? && match = scan(/ \] /x)
+ encoder.end_line opened_tokens.pop
+
+ else
+ encoder.text_token getch, :space
+
end
- raise_inspect 'Empty token', tokens unless match
-
- tokens << [match, kind]
end
- tokens
+ encoder.end_group opened_tokens.pop until opened_tokens.empty?
+
+ encoder
end
-
+
end
-
+
end
end
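Note on the debug.rb change above: the rewritten Debug scanner reads a token dump in which `kind(text)` is a text token, `kind<` ... `>` delimits a group, and `kind[` ... `]` delimits a line. The following is a minimal standalone sketch of that parsing loop, using only Ruby's StringScanner. The helper name `parse_token_dump` and the `[event, text, kind]` triples are hypothetical and are not CodeRay's API. The sketch also leaves out the `TokenKinds` lookup that the real scanner uses to flag unknown kinds as errors.

```ruby
require 'strscan'

# Hypothetical helper (not part of CodeRay): turn a debug-style token dump
# into a list of [event, text, kind] triples.
def parse_token_dump(source)
  scanner = StringScanner.new(source)
  opened  = []   # kinds of the groups/lines that are currently open
  events  = []

  until scanner.eos?
    if (text = scanner.scan(/\s+/))
      events << [:text, text, :space]
    elsif scanner.scan(/(\w+)\(([^\)\\]*(?:\\.[^\)\\]*)*)\)?/)
      # kind(text): a backslash escapes the following character
      kind = scanner[1].to_sym
      events << [:text, scanner[2].gsub(/\\(.)/m, '\1'), kind]
    elsif scanner.scan(/(\w+)([<\[])/)
      kind = scanner[1].to_sym
      opened << kind
      events << [scanner[2] == '<' ? :begin_group : :begin_line, nil, kind]
    elsif !opened.empty? && scanner.scan(/>/)
      events << [:end_group, nil, opened.pop]
    elsif !opened.empty? && scanner.scan(/\]/)
      events << [:end_line, nil, opened.pop]
    else
      # Anything else is passed through one character at a time.
      events << [:text, scanner.getch, :space]
    end
  end

  # Close whatever is still open at the end of input, as the diff does.
  events << [:end_group, nil, opened.pop] until opened.empty?
  events
end

DUMP_EVENTS = parse_token_dump('string<delimiter(")content(a\)b)delimiter(")>')
```

As in the diff, anything still open at end of input is closed with an end-group event, so the output stays balanced even for a truncated dump.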
diff --git a/lib/coderay/scanners/delphi.rb b/lib/coderay/scanners/delphi.rb
index de0ee71..b328155 100644
--- a/lib/coderay/scanners/delphi.rb
+++ b/lib/coderay/scanners/delphi.rb
@@ -1,12 +1,15 @@
module CodeRay
module Scanners
+ # Scanner for the Delphi language (Object Pascal).
+ #
+ # Alias: +pascal+
class Delphi < Scanner
-
+
register_for :delphi
file_extension 'pas'
- RESERVED_WORDS = [
+ KEYWORDS = [
'and', 'array', 'as', 'at', 'asm', 'at', 'begin', 'case', 'class',
'const', 'constructor', 'destructor', 'dispinterface', 'div', 'do',
'downto', 'else', 'end', 'except', 'exports', 'file', 'finalization',
@@ -16,9 +19,9 @@ module Scanners
'procedure', 'program', 'property', 'raise', 'record', 'repeat',
'resourcestring', 'set', 'shl', 'shr', 'string', 'then', 'threadvar',
'to', 'try', 'type', 'unit', 'until', 'uses', 'var', 'while', 'with',
- 'xor', 'on'
- ]
-
+ 'xor', 'on',
+ ] # :nodoc:
+
DIRECTIVES = [
'absolute', 'abstract', 'assembler', 'at', 'automated', 'cdecl',
'contains', 'deprecated', 'dispid', 'dynamic', 'export',
@@ -27,121 +30,112 @@ module Scanners
'package', 'pascal', 'platform', 'private', 'protected', 'public',
'published', 'read', 'readonly', 'register', 'reintroduce',
'requires', 'resident', 'safecall', 'stdcall', 'stored', 'varargs',
- 'virtual', 'write', 'writeonly'
- ]
-
- IDENT_KIND = CaseIgnoringWordList.new(:ident).
- add(RESERVED_WORDS, :reserved).
- add(DIRECTIVES, :directive)
+ 'virtual', 'write', 'writeonly',
+ ] # :nodoc:
- NAME_FOLLOWS = CaseIgnoringWordList.new(false).
- add(%w(procedure function .))
-
- private
- def scan_tokens tokens, options
-
+ IDENT_KIND = WordList::CaseIgnoring.new(:ident).
+ add(KEYWORDS, :keyword).
+ add(DIRECTIVES, :directive) # :nodoc:
+
+ NAME_FOLLOWS = WordList::CaseIgnoring.new(false).
+ add(%w(procedure function .)) # :nodoc:
+
+ protected
+
+ def scan_tokens encoder, options
+
state = :initial
last_token = ''
-
+
until eos?
-
- kind = nil
- match = nil
-
+
if state == :initial
- if scan(/ \s+ /x)
- tokens << [matched, :space]
+ if match = scan(/ \s+ /x)
+ encoder.text_token match, :space
next
- elsif scan(%r! \{ \$ [^}]* \}? | \(\* \$ (?: .*? \*\) | .* ) !mx)
- tokens << [matched, :preprocessor]
+ elsif match = scan(%r! \{ \$ [^}]* \}? | \(\* \$ (?: .*? \*\) | .* ) !mx)
+ encoder.text_token match, :preprocessor
next
- elsif scan(%r! // [^\n]* | \{ [^}]* \}? | \(\* (?: .*? \*\) | .* ) !mx)
- tokens << [matched, :comment]
+ elsif match = scan(%r! // [^\n]* | \{ [^}]* \}? | \(\* (?: .*? \*\) | .* ) !mx)
+ encoder.text_token match, :comment
next
elsif match = scan(/ <[>=]? | >=? | :=? | [-+=*\/;,@\^|\(\)\[\]] | \.\. /x)
- kind = :operator
+ encoder.text_token match, :operator
elsif match = scan(/\./)
- kind = :operator
- if last_token == 'end'
- tokens << [match, kind]
- next
- end
+ encoder.text_token match, :operator
+ next if last_token == 'end'
elsif match = scan(/ [A-Za-z_][A-Za-z_0-9]* /x)
- kind = NAME_FOLLOWS[last_token] ? :ident : IDENT_KIND[match]
+ encoder.text_token match, NAME_FOLLOWS[last_token] ? :ident : IDENT_KIND[match]
- elsif match = scan(/ ' ( [^\n']|'' ) (?:'|$) /x)
- tokens << [:open, :char]
- tokens << ["'", :delimiter]
- tokens << [self[1], :content]
- tokens << ["'", :delimiter]
- tokens << [:close, :char]
+ elsif match = skip(/ ' ( [^\n']|'' ) (?:'|$) /x)
+ encoder.begin_group :char
+ encoder.text_token "'", :delimiter
+ encoder.text_token self[1], :content
+ encoder.text_token "'", :delimiter
+ encoder.end_group :char
next
elsif match = scan(/ ' /x)
- tokens << [:open, :string]
+ encoder.begin_group :string
+ encoder.text_token match, :delimiter
state = :string
- kind = :delimiter
- elsif scan(/ \# (?: \d+ | \$[0-9A-Fa-f]+ ) /x)
- kind = :char
+ elsif match = scan(/ \# (?: \d+ | \$[0-9A-Fa-f]+ ) /x)
+ encoder.text_token match, :char
- elsif scan(/ \$ [0-9A-Fa-f]+ /x)
- kind = :hex
+ elsif match = scan(/ \$ [0-9A-Fa-f]+ /x)
+ encoder.text_token match, :hex
- elsif scan(/ (?: \d+ ) (?![eE]|\.[^.]) /x)
- kind = :integer
+ elsif match = scan(/ (?: \d+ ) (?![eE]|\.[^.]) /x)
+ encoder.text_token match, :integer
+
+ elsif match = scan(/ \d+ (?: \.\d+ (?: [eE][+-]? \d+ )? | [eE][+-]? \d+ ) /x)
+ encoder.text_token match, :float
- elsif scan(/ \d+ (?: \.\d+ (?: [eE][+-]? \d+ )? | [eE][+-]? \d+ ) /x)
- kind = :float
-
else
- kind = :error
- getch
-
+ encoder.text_token getch, :error
+ next
+
end
elsif state == :string
- if scan(/[^\n']+/)
- kind = :content
- elsif scan(/''/)
- kind = :char
- elsif scan(/'/)
- tokens << ["'", :delimiter]
- tokens << [:close, :string]
+ if match = scan(/[^\n']+/)
+ encoder.text_token match, :content
+ elsif match = scan(/''/)
+ encoder.text_token match, :char
+ elsif match = scan(/'/)
+ encoder.text_token match, :delimiter
+ encoder.end_group :string
state = :initial
next
- elsif scan(/\n/)
- tokens << [:close, :string]
- kind = :error
+ elsif match = scan(/\n/)
+ encoder.end_group :string
+ encoder.text_token match, :space
state = :initial
else
- raise "else case \' reached; %p not handled." % peek(1), tokens
+ raise "else case \' reached; %p not handled." % peek(1), encoder
end
else
- raise 'else-case reached', tokens
+ raise 'else-case reached', encoder
end
- match ||= matched
- if $CODERAY_DEBUG and not kind
- raise_inspect 'Error token %p in line %d' %
- [[match, kind], line], tokens, state
- end
- raise_inspect 'Empty token', tokens unless match
-
last_token = match
- tokens << [match, kind]
end
- tokens
+ if state == :string
+ encoder.end_group state
+ end
+
+ encoder
end
end
diff --git a/lib/coderay/scanners/diff.rb b/lib/coderay/scanners/diff.rb
index 353b966..9e899c3 100644
--- a/lib/coderay/scanners/diff.rb
+++ b/lib/coderay/scanners/diff.rb
@@ -1,25 +1,41 @@
module CodeRay
module Scanners
+ # Scanner for output of the diff command.
+ #
+ # Alias: +patch+
class Diff < Scanner
register_for :diff
title 'diff output'
- def scan_tokens tokens, options
+ DEFAULT_OPTIONS = {
+ :highlight_code => true,
+ :inline_diff => true,
+ }
+
+ protected
+
+ def scan_tokens encoder, options
line_kind = nil
state = :initial
+ deleted_lines = 0
+ scanners = Hash.new do |h, lang|
+ h[lang] = Scanners[lang].new '', :keep_tokens => true, :keep_state => true
+ end
+ content_scanner = scanners[:plain]
+ content_scanner_entry_state = nil
until eos?
- kind = match = nil
if match = scan(/\n/)
+ deleted_lines = 0 unless line_kind == :delete
if line_kind
- tokens << [:end_line, line_kind]
+ encoder.end_line line_kind
line_kind = nil
end
- tokens << [match, :space]
+ encoder.text_token match, :space
next
end
@@ -27,81 +43,154 @@ module Scanners
when :initial
if match = scan(/--- |\+\+\+ |=+|_+/)
- tokens << [:begin_line, line_kind = :head]
- tokens << [match, :head]
+ encoder.begin_line line_kind = :head
+ encoder.text_token match, :head
+ if match = scan(/.*?(?=$|[\t\n\x00]| \(revision)/)
+ encoder.text_token match, :filename
+ if options[:highlight_code] && match != '/dev/null'
+ file_type = CodeRay::FileType.fetch(match, :text)
+ file_type = :text if file_type == :diff
+ content_scanner = scanners[file_type]
+ content_scanner_entry_state = nil
+ end
+ end
next unless match = scan(/.+/)
- kind = :plain
+ encoder.text_token match, :plain
elsif match = scan(/Index: |Property changes on: /)
- tokens << [:begin_line, line_kind = :head]
- tokens << [match, :head]
+ encoder.begin_line line_kind = :head
+ encoder.text_token match, :head
next unless match = scan(/.+/)
- kind = :plain
+ encoder.text_token match, :plain
elsif match = scan(/Added: /)
- tokens << [:begin_line, line_kind = :head]
- tokens << [match, :head]
+ encoder.begin_line line_kind = :head
+ encoder.text_token match, :head
next unless match = scan(/.+/)
- kind = :plain
+ encoder.text_token match, :plain
state = :added
- elsif match = scan(/\\ /)
- tokens << [:begin_line, line_kind = :change]
- tokens << [match, :change]
- next unless match = scan(/.+/)
- kind = :plain
+ elsif match = scan(/\\ .*/)
+ encoder.text_token match, :comment
elsif match = scan(/@@(?>[^@\n]*)@@/)
+ content_scanner.state = :initial unless match?(/\n\+/)
+ content_scanner_entry_state = nil
if check(/\n|$/)
- tokens << [:begin_line, line_kind = :change]
+ encoder.begin_line line_kind = :change
else
- tokens << [:open, :change]
+ encoder.begin_group :change
end
- tokens << [match[0,2], :change]
- tokens << [match[2...-2], :plain]
- tokens << [match[-2,2], :change]
- tokens << [:close, :change] unless line_kind
+ encoder.text_token match[0,2], :change
+ encoder.text_token match[2...-2], :plain
+ encoder.text_token match[-2,2], :change
+ encoder.end_group :change unless line_kind
next unless match = scan(/.+/)
- kind = :plain
+ if options[:highlight_code]
+ content_scanner.tokenize match, :tokens => encoder
+ else
+ encoder.text_token match, :plain
+ end
+ next
elsif match = scan(/\+/)
- tokens << [:begin_line, line_kind = :insert]
- tokens << [match, :insert]
+ encoder.begin_line line_kind = :insert
+ encoder.text_token match, :insert
next unless match = scan(/.+/)
- kind = :plain
+ if options[:highlight_code]
+ content_scanner.tokenize match, :tokens => encoder
+ else
+ encoder.text_token match, :plain
+ end
+ next
elsif match = scan(/-/)
- tokens << [:begin_line, line_kind = :delete]
- tokens << [match, :delete]
- next unless match = scan(/.+/)
- kind = :plain
- elsif scan(/ .*/)
- kind = :comment
- elsif scan(/.+/)
- tokens << [:begin_line, line_kind = :comment]
- kind = :plain
+ deleted_lines += 1
+ encoder.begin_line line_kind = :delete
+ encoder.text_token match, :delete
+ if options[:inline_diff] && deleted_lines == 1 && check(/(?>.*)\n\+(?>.*)$(?!\n\+)/)
+ content_scanner_entry_state = content_scanner.state
+ skip(/(.*)\n\+(.*)$/)
+ head, deletion, insertion, tail = diff self[1], self[2]
+ pre, deleted, post = content_scanner.tokenize [head, deletion, tail], :tokens => Tokens.new
+ encoder.tokens pre
+ unless deleted.empty?
+ encoder.begin_group :eyecatcher
+ encoder.tokens deleted
+ encoder.end_group :eyecatcher
+ end
+ encoder.tokens post
+ encoder.end_line line_kind
+ encoder.text_token "\n", :space
+ encoder.begin_line line_kind = :insert
+ encoder.text_token '+', :insert
+ content_scanner.state = content_scanner_entry_state || :initial
+ pre, inserted, post = content_scanner.tokenize [head, insertion, tail], :tokens => Tokens.new
+ encoder.tokens pre
+ unless inserted.empty?
+ encoder.begin_group :eyecatcher
+ encoder.tokens inserted
+ encoder.end_group :eyecatcher
+ end
+ encoder.tokens post
+ elsif match = scan(/.*/)
+ if options[:highlight_code]
+ if deleted_lines == 1
+ content_scanner_entry_state = content_scanner.state
+ end
+ content_scanner.tokenize match, :tokens => encoder unless match.empty?
+ if !match?(/\n-/)
+ if match?(/\n\+/)
+ content_scanner.state = content_scanner_entry_state || :initial
+ end
+ content_scanner_entry_state = nil
+ end
+ else
+ encoder.text_token match, :plain
+ end
+ end
+ next
+ elsif match = scan(/ .*/)
+ if options[:highlight_code]
+ content_scanner.tokenize match, :tokens => encoder
+ else
+ encoder.text_token match, :plain
+ end
+ next
+ elsif match = scan(/.+/)
+ encoder.begin_line line_kind = :comment
+ encoder.text_token match, :plain
else
raise_inspect 'else case rached'
end
when :added
if match = scan(/ \+/)
- tokens << [:begin_line, line_kind = :insert]
- tokens << [match, :insert]
+ encoder.begin_line line_kind = :insert
+ encoder.text_token match, :insert
next unless match = scan(/.+/)
- kind = :plain
+ encoder.text_token match, :plain
else
state = :initial
next
end
end
- match ||= matched
- if $CODERAY_DEBUG and not kind
- raise_inspect 'Error token %p in line %d' %
- [[match, kind], line], tokens
- end
- raise_inspect 'Empty token', tokens unless match
-
- tokens << [match, kind]
end
- tokens << [:end_line, line_kind] if line_kind
- tokens
+ encoder.end_line line_kind if line_kind
+
+ encoder
+ end
+
+ private
+
+ def diff a, b
+ # i will be the index of the leftmost difference from the left.
+ i_max = [a.size, b.size].min
+ i = 0
+ i += 1 while i < i_max && a[i] == b[i]
+ # j_min will be the index of the leftmost difference from the right.
+ j_min = i - i_max
+ # j will be the index of the rightmost difference from the right which
+ # does not precede the leftmost one from the left.
+ j = -1
+ j -= 1 while j >= j_min && a[j] == b[j]
+ return a[0...i], a[i..j], b[i..j], (j < -1) ? a[j+1..-1] : ''
end
end
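Note on the private `diff a, b` helper added to diff.rb above: it splits a deleted/inserted line pair into a common head, the differing middle of each line, and a common tail, which is what the inline-diff branch wraps in `:eyecatcher` groups. The sketch below copies the method body from the diff into a standalone function so its behaviour can be checked in isolation. The name `inline_diff` and the sample strings are illustrative only.

```ruby
# Standalone copy of the helper from the diff, under a hypothetical name.
# Returns [head, changed_part_of_a, changed_part_of_b, tail].
def inline_diff(a, b)
  # i: index of the first character that differs, scanning from the left.
  i_max = [a.size, b.size].min
  i = 0
  i += 1 while i < i_max && a[i] == b[i]

  # j: negative index of the first character that differs, scanning from
  # the right; j_min keeps the tail from overlapping the head.
  j_min = i - i_max
  j = -1
  j -= 1 while j >= j_min && a[j] == b[j]

  [a[0...i], a[i..j], b[i..j], (j < -1) ? a[j + 1..-1] : '']
end

SAME_LENGTH = inline_diff('a = foo(1)', 'a = bar(1)')
# head "a = ", changed "foo" / "bar", tail "(1)"

INSERTION = inline_diff('abc', 'abXc')
# head "ab", changed "" / "X", tail "c"
```

When one line only gains characters, the changed part of the other line comes back as an empty string. This is why the scanner guards the eyecatcher group with `unless deleted.empty?` and `unless inserted.empty?`.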
diff --git a/lib/coderay/scanners/erb.rb b/lib/coderay/scanners/erb.rb
new file mode 100644
index 0000000..727a993
--- /dev/null
+++ b/lib/coderay/scanners/erb.rb
@@ -0,0 +1,81 @@
+module CodeRay
+module Scanners
+
+ load :html
+ load :ruby
+
+ # Scanner for HTML ERB templates.
+ class ERB < Scanner
+
+ register_for :erb
+ title 'HTML ERB Template'
+
+ KINDS_NOT_LOC = HTML::KINDS_NOT_LOC
+
+ ERB_RUBY_BLOCK = /
+ (<%(?!%)[-=\#]?)
+ ((?>
+ [^\-%]* # normal*
+ (?> # special
+ (?: %(?!>) | -(?!%>) )
+ [^\-%]* # normal*
+ )*
+ ))
+ ((?: -?%> )?)
+ /x # :nodoc:
+
+ START_OF_ERB = /
+ <%(?!%)
+ /x # :nodoc:
+
+ protected
+
+ def setup
+ @ruby_scanner = CodeRay.scanner :ruby, :tokens => @tokens, :keep_tokens => true
+ @html_scanner = CodeRay.scanner :html, :tokens => @tokens, :keep_tokens => true, :keep_state => true
+ end
+
+ def reset_instance
+ super
+ @html_scanner.reset
+ end
+
+ def scan_tokens encoder, options
+
+ until eos?
+
+ if (match = scan_until(/(?=#{START_OF_ERB})/o) || scan_rest) and not match.empty?
+ @html_scanner.tokenize match, :tokens => encoder
+
+ elsif match = scan(/#{ERB_RUBY_BLOCK}/o)
+ start_tag = self[1]
+ code = self[2]
+ end_tag = self[3]
+
+ encoder.begin_group :inline
+ encoder.text_token start_tag, :inline_delimiter
+
+ if start_tag == '<%#'
+ encoder.text_token code, :comment
+ else
+ @ruby_scanner.tokenize code, :tokens => encoder
+ end unless code.empty?
+
+ encoder.text_token end_tag, :inline_delimiter unless end_tag.empty?
+ encoder.end_group :inline
+
+ else
+ raise_inspect 'else-case reached!', encoder
+
+ end
+
+ end
+
+ encoder
+
+ end
+
+ end
+
+end
+end
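Note on the new erb.rb above: `ERB_RUBY_BLOCK` captures the opening tag, the embedded code, and the optional closing tag in three groups, which `scan_tokens` then reads as `self[1]`, `self[2]` and `self[3]`. The sketch below reuses the pattern from the diff outside of CodeRay to show what each group holds. The helper `split_erb_tag` is hypothetical and exists only for this illustration.

```ruby
# Pattern copied from the diff above.
ERB_RUBY_BLOCK = /
  (<%(?!%)[-=\#]?)
  ((?>
    [^\-%]*                # normal*
    (?>                    # special
      (?: %(?!>) | -(?!%>) )
      [^\-%]*              # normal*
    )*
  ))
  ((?: -?%> )?)
/x

# Hypothetical helper: returns [start_tag, code, end_tag], or nil when the
# string contains no ERB tag.
def split_erb_tag(source)
  match = ERB_RUBY_BLOCK.match(source)
  match && [match[1], match[2], match[3]]
end

OUTPUT_TAG   = split_erb_tag('<%= link_to "x", y -%>rest')
MODULO_TAG   = split_erb_tag('<% x = 5 % 2 %>')
COMMENT_TAG  = split_erb_tag('<%# note %>')
UNTERMINATED = split_erb_tag('<% foo')
ESCAPED      = split_erb_tag('<%% literal')
```

Two details of the pattern stand out. A `%` or `-` inside the code is accepted as long as it does not start a closing tag, and a missing closing tag yields an empty third group instead of a failed match. The scanner relies on the second point when it emits the end delimiter only `unless end_tag.empty?`.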
diff --git a/lib/coderay/scanners/groovy.rb b/lib/coderay/scanners/groovy.rb
index 17330e6..cf55daf 100644
--- a/lib/coderay/scanners/groovy.rb
+++ b/lib/coderay/scanners/groovy.rb
@@ -1,29 +1,29 @@
module CodeRay
module Scanners
-
+
load :java
-
+
+ # Scanner for Groovy.
class Groovy < Java
-
- include Streamable
+
register_for :groovy
- # TODO: Check this!
+ # TODO: check list of keywords
GROOVY_KEYWORDS = %w[
as assert def in
- ]
+ ] # :nodoc:
KEYWORDS_EXPECTING_VALUE = WordList.new.add %w[
case instanceof new return throw typeof while as assert in
- ]
- GROOVY_MAGIC_VARIABLES = %w[ it ]
+ ] # :nodoc:
+ GROOVY_MAGIC_VARIABLES = %w[ it ] # :nodoc:
IDENT_KIND = Java::IDENT_KIND.dup.
add(GROOVY_KEYWORDS, :keyword).
- add(GROOVY_MAGIC_VARIABLES, :local_variable)
+ add(GROOVY_MAGIC_VARIABLES, :local_variable) # :nodoc:
- ESCAPE = / [bfnrtv$\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x
- UNICODE_ESCAPE = / u[a-fA-F0-9]{4} /x # no 4-byte unicode chars? U[a-fA-F0-9]{8}
- REGEXP_ESCAPE = / [bfnrtv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} | \d | [bBdDsSwW\/] /x
+ ESCAPE = / [bfnrtv$\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x # :nodoc:
+ UNICODE_ESCAPE = / u[a-fA-F0-9]{4} /x # :nodoc: no 4-byte unicode chars? U[a-fA-F0-9]{8}
+ REGEXP_ESCAPE = / [bfnrtv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} | \d | [bBdDsSwW\/] /x # :nodoc:
# TODO: interpretation inside ', ", /
STRING_CONTENT_PATTERN = {
@@ -32,45 +32,44 @@ module Scanners
"'''" => /(?>[^\\']+|'(?!''))+/,
'"""' => /(?>[^\\$"]+|"(?!""))+/,
'/' => /[^\\$\/\n]+/,
- }
+ } # :nodoc:
+
+ protected
- def scan_tokens tokens, options
-
+ def scan_tokens encoder, options
+
state = :initial
inline_block_stack = []
inline_block_paren_depth = nil
string_delimiter = nil
import_clause = class_name_follows = last_token = after_def = false
value_expected = true
-
+
until eos?
-
- kind = nil
- match = nil
case state
-
+
when :initial
-
+
if match = scan(/ \s+ | \\\n /x)
- tokens << [match, :space]
+ encoder.text_token match, :space
if match.index ?\n
import_clause = after_def = false
value_expected = true unless value_expected
end
next
- elsif scan(%r! // [^\n\\]* (?: \\. [^\n\\]* )* | /\* (?: .*? \*/ | .* ) !mx)
+ elsif match = scan(%r! // [^\n\\]* (?: \\. [^\n\\]* )* | /\* (?: .*? \*/ | .* ) !mx)
value_expected = true
after_def = false
- kind = :comment
+ encoder.text_token match, :comment
- elsif bol? && scan(/ \#!.* /x)
- kind = :doctype
+ elsif bol? && match = scan(/ \#!.* /x)
+ encoder.text_token match, :doctype
- elsif import_clause && scan(/ (?!as) #{IDENT} (?: \. #{IDENT} )* (?: \.\* )? /ox)
+ elsif import_clause && match = scan(/ (?!as) #{IDENT} (?: \. #{IDENT} )* (?: \.\* )? /ox)
after_def = value_expected = false
- kind = :include
+ encoder.text_token match, :include
elsif match = scan(/ #{IDENT} | \[\] /ox)
kind = IDENT_KIND[match]
@@ -90,16 +89,17 @@ module Scanners
import_clause = match == 'import'
after_def = true if match == 'def'
end
+ encoder.text_token match, kind
- elsif scan(/;/)
+ elsif match = scan(/;/)
import_clause = after_def = false
value_expected = true
- kind = :operator
+ encoder.text_token match, :operator
- elsif scan(/\{/)
+ elsif match = scan(/\{/)
class_name_follows = after_def = false
value_expected = true
- kind = :operator
+ encoder.text_token match, :operator
if !inline_block_stack.empty?
inline_block_paren_depth += 1
end
@@ -110,155 +110,146 @@ module Scanners
value_expected = true
value_expected = :regexp if match == '~'
after_def = false
- kind = :operator
+ encoder.text_token match, :operator
elsif match = scan(/ [)\]}] /x)
value_expected = after_def = false
if !inline_block_stack.empty? && match == '}'
inline_block_paren_depth -= 1
if inline_block_paren_depth == 0 # closing brace of inline block reached
- tokens << [match, :inline_delimiter]
- tokens << [:close, :inline]
+ encoder.text_token match, :inline_delimiter
+ encoder.end_group :inline
state, string_delimiter, inline_block_paren_depth = inline_block_stack.pop
next
end
end
- kind = :operator
+ encoder.text_token match, :operator
elsif check(/[\d.]/)
after_def = value_expected = false
- if scan(/0[xX][0-9A-Fa-f]+/)
- kind = :hex
- elsif scan(/(?>0[0-7]+)(?![89.eEfF])/)
- kind = :oct
- elsif scan(/\d+[fFdD]|\d*\.\d+(?:[eE][+-]?\d+)?[fFdD]?|\d+[eE][+-]?\d+[fFdD]?/)
- kind = :float
- elsif scan(/\d+[lLgG]?/)
- kind = :integer
+ if match = scan(/0[xX][0-9A-Fa-f]+/)
+ encoder.text_token match, :hex
+ elsif match = scan(/(?>0[0-7]+)(?![89.eEfF])/)
+ encoder.text_token match, :octal
+ elsif match = scan(/\d+[fFdD]|\d*\.\d+(?:[eE][+-]?\d+)?[fFdD]?|\d+[eE][+-]?\d+[fFdD]?/)
+ encoder.text_token match, :float
+ elsif match = scan(/\d+[lLgG]?/)
+ encoder.text_token match, :integer
end
-
+
elsif match = scan(/'''|"""/)
after_def = value_expected = false
state = :multiline_string
- tokens << [:open, :string]
+ encoder.begin_group :string
string_delimiter = match
- kind = :delimiter
-
- # TODO: record.'name'
+ encoder.text_token match, :delimiter
+
+ # TODO: record.'name' syntax
elsif match = scan(/["']/)
after_def = value_expected = false
state = match == '/' ? :regexp : :string
- tokens << [:open, state]
+ encoder.begin_group state
string_delimiter = match
- kind = :delimiter
-
- elsif value_expected && (match = scan(/\//))
+ encoder.text_token match, :delimiter
+
+ elsif value_expected && match = scan(/\//)
after_def = value_expected = false
- tokens << [:open, :regexp]
+ encoder.begin_group :regexp
state = :regexp
string_delimiter = '/'
- kind = :delimiter
-
- elsif scan(/ @ #{IDENT} /ox)
+ encoder.text_token match, :delimiter
+
+ elsif match = scan(/ @ #{IDENT} /ox)
after_def = value_expected = false
- kind = :annotation
-
- elsif scan(/\//)
+ encoder.text_token match, :annotation
+
+ elsif match = scan(/\//)
after_def = false
value_expected = true
- kind = :operator
-
+ encoder.text_token match, :operator
+
else
- getch
- kind = :error
-
+ encoder.text_token getch, :error
+
end
-
+
when :string, :regexp, :multiline_string
- if scan(STRING_CONTENT_PATTERN[string_delimiter])
- kind = :content
+ if match = scan(STRING_CONTENT_PATTERN[string_delimiter])
+ encoder.text_token match, :content
elsif match = scan(state == :multiline_string ? /'''|"""/ : /["'\/]/)
- tokens << [match, :delimiter]
+ encoder.text_token match, :delimiter
if state == :regexp
# TODO: regexp modifiers? s, m, x, i?
modifiers = scan(/[ix]+/)
- tokens << [modifiers, :modifier] if modifiers && !modifiers.empty?
+ encoder.text_token modifiers, :modifier if modifiers && !modifiers.empty?
end
state = :string if state == :multiline_string
- tokens << [:close, state]
+ encoder.end_group state
string_delimiter = nil
after_def = value_expected = false
state = :initial
next
-
+
elsif (state == :string || state == :multiline_string) &&
(match = scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox))
if string_delimiter[0] == ?' && !(match == "\\\\" || match == "\\'")
- kind = :content
+ encoder.text_token match, :content
else
- kind = :char
+ encoder.text_token match, :char
end
- elsif state == :regexp && scan(/ \\ (?: #{REGEXP_ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
- kind = :char
-
+ elsif state == :regexp && match = scan(/ \\ (?: #{REGEXP_ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
+ encoder.text_token match, :char
+
elsif match = scan(/ \$ #{IDENT} /mox)
- tokens << [:open, :inline]
- tokens << ['$', :inline_delimiter]
+ encoder.begin_group :inline
+ encoder.text_token '$', :inline_delimiter
match = match[1..-1]
- tokens << [match, IDENT_KIND[match]]
- tokens << [:close, :inline]
+ encoder.text_token match, IDENT_KIND[match]
+ encoder.end_group :inline
next
elsif match = scan(/ \$ \{ /x)
- tokens << [:open, :inline]
- tokens << ['${', :inline_delimiter]
+ encoder.begin_group :inline
+ encoder.text_token match, :inline_delimiter
inline_block_stack << [state, string_delimiter, inline_block_paren_depth]
inline_block_paren_depth = 1
state = :initial
next
-
- elsif scan(/ \$ /mx)
- kind = :content
-
- elsif scan(/ \\. /mx)
- kind = :content
-
- elsif scan(/ \\ | \n /x)
- tokens << [:close, state]
- kind = :error
+
+ elsif match = scan(/ \$ /mx)
+ encoder.text_token match, :content
+
+ elsif match = scan(/ \\. /mx)
+ encoder.text_token match, :content # TODO: Shouldn't this be :error?
+
+ elsif match = scan(/ \\ | \n /x)
+ encoder.end_group state
+ encoder.text_token match, :error
after_def = value_expected = false
state = :initial
-
+
else
- raise_inspect "else case \" reached; %p not handled." % peek(1), tokens
+ raise_inspect "else case \" reached; %p not handled." % peek(1), encoder
+
end
-
+
else
- raise_inspect 'Unknown state', tokens
-
- end
-
- match ||= matched
- if $CODERAY_DEBUG and not kind
- raise_inspect 'Error token %p in line %d' %
- [[match, kind], line], tokens
+ raise_inspect 'Unknown state', encoder
+
end
- raise_inspect 'Empty token', tokens unless match
last_token = match unless [:space, :comment, :doctype].include? kind
- tokens << [match, kind]
-
end
-
+
if [:multiline_string, :string, :regexp].include? state
- tokens << [:close, state]
+ encoder.end_group state
end
-
- tokens
+
+ encoder
end
-
+
end
-
+
end
end
diff --git a/lib/coderay/scanners/haml.rb b/lib/coderay/scanners/haml.rb
new file mode 100644
index 0000000..5433790
--- /dev/null
+++ b/lib/coderay/scanners/haml.rb
@@ -0,0 +1,168 @@
+module CodeRay
+module Scanners
+
+ load :ruby
+ load :html
+ load :java_script
+
+ class HAML < Scanner
+
+ register_for :haml
+ title 'HAML Template'
+
+ KINDS_NOT_LOC = HTML::KINDS_NOT_LOC
+
+ protected
+
+ def setup
+ super
+ @ruby_scanner = CodeRay.scanner :ruby, :tokens => @tokens, :keep_tokens => true
+ @embedded_ruby_scanner = CodeRay.scanner :ruby, :tokens => @tokens, :keep_tokens => true, :state => @ruby_scanner.interpreted_string_state
+ @html_scanner = CodeRay.scanner :html, :tokens => @tokens, :keep_tokens => true
+ end
+
+ def scan_tokens encoder, options
+
+ match = nil
+ code = ''
+
+ until eos?
+
+ if bol?
+ if match = scan(/!!!.*/)
+ encoder.text_token match, :doctype
+ next
+ end
+
+ if match = scan(/(?>( *)(\/(?!\[if)|-\#|:javascript|:ruby|:\w+) *)(?=\n)/)
+ encoder.text_token match, :comment
+
+ code = self[2]
+ if match = scan(/(?:\n+#{self[1]} .*)+/)
+ case code
+ when '/', '-#'
+ encoder.text_token match, :comment
+ when ':javascript'
+ # TODO: recognize #{...} snippets inside JavaScript
+ @java_script_scanner ||= CodeRay.scanner :java_script, :tokens => @tokens, :keep_tokens => true
+ @java_script_scanner.tokenize match, :tokens => encoder
+ when ':ruby'
+ @ruby_scanner.tokenize match, :tokens => encoder
+ when /:\w+/
+ encoder.text_token match, :comment
+ else
+ raise 'else-case reached: %p' % [code]
+ end
+ end
+ end
+
+ if match = scan(/ +/)
+ encoder.text_token match, :space
+ end
+
+ if match = scan(/\/.*/)
+ encoder.text_token match, :comment
+ next
+ end
+
+ if match = scan(/\\/)
+ encoder.text_token match, :plain
+ if match = scan(/.+/)
+ @html_scanner.tokenize match, :tokens => encoder
+ end
+ next
+ end
+
+ tag = false
+
+ if match = scan(/%[\w:]+\/?/)
+ encoder.text_token match, :tag
+ # if match = scan(/( +)(.+)/)
+ # encoder.text_token self[1], :space
+ # @embedded_ruby_scanner.tokenize self[2], :tokens => encoder
+ # end
+ tag = true
+ end
+
+ while match = scan(/([.#])[-\w]*\w/)
+ encoder.text_token match, self[1] == '#' ? :constant : :class
+ tag = true
+ end
+
+ if tag && match = scan(/(\()([^)]+)?(\))?/)
+ # TODO: recognize title=@title, class="widget_#{@widget.number}"
+ encoder.text_token self[1], :plain
+ @html_scanner.tokenize self[2], :tokens => encoder, :state => :attribute if self[2]
+ encoder.text_token self[3], :plain if self[3]
+ end
+
+ if tag && match = scan(/\{/)
+ encoder.text_token match, :plain
+
+ code = ''
+ level = 1
+ while true
+ code << scan(/([^\{\},\n]|, *\n?)*/)
+ case match = getch
+ when '{'
+ level += 1
+ code << match
+ when '}'
+ level -= 1
+ if level > 0
+ code << match
+ else
+ break
+ end
+ when "\n", ",", nil
+ break
+ end
+ end
+ @ruby_scanner.tokenize code, :tokens => encoder unless code.empty?
+
+ encoder.text_token match, :plain if match
+ end
+
+ if tag && match = scan(/(\[)([^\]\n]+)?(\])?/)
+ encoder.text_token self[1], :plain
+ @ruby_scanner.tokenize self[2], :tokens => encoder if self[2]
+ encoder.text_token self[3], :plain if self[3]
+ end
+
+ if tag && match = scan(/\//)
+ encoder.text_token match, :tag
+ end
+
+ if scan(/(>?<?[-=]|[&!]=|(& |!)|~)( *)([^,\n\|]+(?:(, *|\|(?=.|\n.*\|$))\n?[^,\n\|]*)*)?/)
+ encoder.text_token self[1] + self[3], :plain
+ if self[4]
+ if self[2]
+ @embedded_ruby_scanner.tokenize self[4], :tokens => encoder
+ else
+ @ruby_scanner.tokenize self[4], :tokens => encoder
+ end
+ end
+ elsif match = scan(/((?:<|><?)(?![!?\/\w]))?(.+)?/)
+ encoder.text_token self[1], :plain if self[1]
+ # TODO: recognize #{...} snippets
+ @html_scanner.tokenize self[2], :tokens => encoder if self[2]
+ end
+
+ elsif match = scan(/.+/)
+ @html_scanner.tokenize match, :tokens => encoder
+
+ end
+
+ if match = scan(/\n/)
+ encoder.text_token match, :space
+ end
+ end
+
+ encoder
+
+ end
+
+ end
+
+end
+end
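Note on the new haml.rb above: after a tag, the scanner collects the contents of a `{...}` attribute hash by counting brace depth, and it continues across a line break only when the line ends in a comma. The sketch below isolates that loop using StringScanner. The function name `read_attribute_hash` is hypothetical, and the sketch assumes the opening `{` has already been consumed, as it has been in the scanner at that point.

```ruby
require 'strscan'

# Hypothetical helper mirroring the brace-counting loop from the diff.
# Expects the scanner to be positioned just after the opening "{" and
# returns the Ruby code found before the matching "}".
def read_attribute_hash(scanner)
  code  = String.new
  level = 1
  loop do
    # Consume everything up to the next brace or bare newline; a comma may
    # be followed by spaces and one newline, which continues the hash.
    code << scanner.scan(/(?:[^\{\},\n]|, *\n?)*/)
    case char = scanner.getch
    when '{'
      level += 1
      code << char
    when '}'
      level -= 1
      break if level.zero?
      code << char
    when "\n", ',', nil
      break
    end
  end
  code
end

HASH_SCANNER = StringScanner.new(':a => {:b => 1}, :c => 2} text')
HASH_CODE    = read_attribute_hash(HASH_SCANNER)
```

In the scanner, the collected code is then handed to the Ruby scanner, and the final closing character, when there is one, is emitted as a `:plain` token.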
diff --git a/lib/coderay/scanners/html.rb b/lib/coderay/scanners/html.rb
index 009a461..98d06fc 100644
--- a/lib/coderay/scanners/html.rb
+++ b/lib/coderay/scanners/html.rb
@@ -2,22 +2,42 @@ module CodeRay
module Scanners
# HTML Scanner
+ #
+ # Alias: +xhtml+
+ #
+ # See also: Scanners::XML
class HTML < Scanner
- include Streamable
register_for :html
KINDS_NOT_LOC = [
:comment, :doctype, :preprocessor,
:tag, :attribute_name, :operator,
- :attribute_value, :delimiter, :content,
- :plain, :entity, :error
- ]
-
- ATTR_NAME = /[\w.:-]+/
- ATTR_VALUE_UNQUOTED = ATTR_NAME
- TAG_END = /\/?>/
- HEX = /[0-9a-fA-F]/
+ :attribute_value, :string,
+ :plain, :entity, :error,
+ ] # :nodoc:
+
+ EVENT_ATTRIBUTES = %w(
+ onabort onafterprint onbeforeprint onbeforeunload onblur oncanplay
+ oncanplaythrough onchange onclick oncontextmenu oncuechange ondblclick
+ ondrag ondragdrop ondragend ondragenter ondragleave ondragover
+ ondragstart ondrop ondurationchange onemptied onended onerror onfocus
+ onformchange onforminput onhashchange oninput oninvalid onkeydown
+ onkeypress onkeyup onload onloadeddata onloadedmetadata onloadstart
+ onmessage onmousedown onmousemove onmouseout onmouseover onmouseup
+ onmousewheel onmove onoffline ononline onpagehide onpageshow onpause
+ onplay onplaying onpopstate onprogress onratechange onreadystatechange
+ onredo onreset onresize onscroll onseeked onseeking onselect onshow
+ onstalled onstorage onsubmit onsuspend ontimeupdate onundo onunload
+ onvolumechange onwaiting
+ )
+
+ IN_ATTRIBUTE = WordList::CaseIgnoring.new(nil).
+ add(EVENT_ATTRIBUTES, :script)
+
+ ATTR_NAME = /[\w.:-]+/ # :nodoc:
+ TAG_END = /\/?>/ # :nodoc:
+ HEX = /[0-9a-fA-F]/ # :nodoc:
ENTITY = /
&
(?:
@@ -31,152 +51,203 @@ module Scanners
)
)
;
- /ox
-
+ /ox # :nodoc:
+
PLAIN_STRING_CONTENT = {
"'" => /[^&'>\n]+/,
'"' => /[^&">\n]+/,
- }
-
+ } # :nodoc:
+
def reset
super
@state = :initial
+ @plain_string_content = nil
end
-
- private
+
+ protected
+
def setup
@state = :initial
@plain_string_content = nil
end
-
- def scan_tokens tokens, options
-
- state = @state
+
+ def scan_java_script encoder, code
+ if code && !code.empty?
+ @java_script_scanner ||= Scanners::JavaScript.new '', :keep_tokens => true
+ # encoder.begin_group :inline
+ @java_script_scanner.tokenize code, :tokens => encoder
+ # encoder.end_group :inline
+ end
+ end
+
+ def scan_tokens encoder, options
+ state = options[:state] || @state
plain_string_content = @plain_string_content
-
+ in_tag = in_attribute = nil
+
+ encoder.begin_group :string if state == :attribute_value_string
+
until eos?
-
- kind = nil
- match = nil
-
- if scan(/\s+/m)
- kind = :space
-
+
+ if state != :in_special_tag && match = scan(/\s+/m)
+ encoder.text_token match, :space
+
else
-
+
case state
-
+
when :initial
- if scan(/<!--.*?-->/m)
- kind = :comment
- elsif scan(/<!DOCTYPE.*?>/m)
- kind = :doctype
- elsif scan(/<\?xml.*?\?>/m)
- kind = :preprocessor
- elsif scan(/<\?.*?\?>|<%.*?%>/m)
- kind = :comment
- elsif scan(/<\/[-\w.:]*>/m)
- kind = :tag
- elsif match = scan(/<[-\w.:]+>?/m)
- kind = :tag
- state = :attribute unless match[-1] == ?>
- elsif scan(/[^<>&]+/)
- kind = :plain
- elsif scan(/#{ENTITY}/ox)
- kind = :entity
- elsif scan(/[<>&]/)
- kind = :error
+ if match = scan(/<!--(?:.*?-->|.*)/m)
+ encoder.text_token match, :comment
+ elsif match = scan(/<!DOCTYPE(?:.*?>|.*)/m)
+ encoder.text_token match, :doctype
+ elsif match = scan(/<\?xml(?:.*?\?>|.*)/m)
+ encoder.text_token match, :preprocessor
+ elsif match = scan(/<\?(?:.*?\?>|.*)/m)
+ encoder.text_token match, :comment
+ elsif match = scan(/<\/[-\w.:]*>?/m)
+ in_tag = nil
+ encoder.text_token match, :tag
+ elsif match = scan(/<(?:(script)|[-\w.:]+)(>)?/m)
+ encoder.text_token match, :tag
+ in_tag = self[1]
+ if self[2]
+ state = :in_special_tag if in_tag
+ else
+ state = :attribute
+ end
+ elsif match = scan(/[^<>&]+/)
+ encoder.text_token match, :plain
+ elsif match = scan(/#{ENTITY}/ox)
+ encoder.text_token match, :entity
+ elsif match = scan(/[<>&]/)
+ in_tag = nil
+ encoder.text_token match, :error
else
- raise_inspect '[BUG] else-case reached with state %p' % [state], tokens
+ raise_inspect '[BUG] else-case reached with state %p' % [state], encoder
end
-
+
when :attribute
- if scan(/#{TAG_END}/o)
- kind = :tag
- state = :initial
- elsif scan(/#{ATTR_NAME}/o)
- kind = :attribute_name
+ if match = scan(/#{TAG_END}/o)
+ encoder.text_token match, :tag
+ in_attribute = nil
+ if in_tag
+ state = :in_special_tag
+ else
+ state = :initial
+ end
+ elsif match = scan(/#{ATTR_NAME}/o)
+ in_attribute = IN_ATTRIBUTE[match]
+ encoder.text_token match, :attribute_name
state = :attribute_equal
else
- kind = :error
- getch
+ in_tag = nil
+ encoder.text_token getch, :error
end
-
+
when :attribute_equal
- if scan(/=/)
- kind = :operator
+ if match = scan(/=/) #/
+ encoder.text_token match, :operator
state = :attribute_value
- elsif scan(/#{ATTR_NAME}/o)
- kind = :attribute_name
- elsif scan(/#{TAG_END}/o)
- kind = :tag
- state = :initial
- elsif scan(/./)
- kind = :error
+ elsif scan(/#{ATTR_NAME}/o) || scan(/#{TAG_END}/o)
+ state = :attribute
+ next
+ else
+ encoder.text_token getch, :error
state = :attribute
end
-
+
when :attribute_value
- if scan(/#{ATTR_VALUE_UNQUOTED}/o)
- kind = :attribute_value
+ if match = scan(/#{ATTR_NAME}/o)
+ encoder.text_token match, :attribute_value
state = :attribute
elsif match = scan(/["']/)
- tokens << [:open, :string]
- state = :attribute_value_string
- plain_string_content = PLAIN_STRING_CONTENT[match]
- kind = :delimiter
- elsif scan(/#{TAG_END}/o)
- kind = :tag
+ if in_attribute == :script
+ encoder.begin_group :inline
+ encoder.text_token match, :inline_delimiter
+ if scan(/javascript:[ \t]*/)
+ encoder.text_token matched, :comment
+ end
+ code = scan_until(match == '"' ? /(?="|\z)/ : /(?='|\z)/)
+ scan_java_script encoder, code
+ match = scan(/["']/)
+ encoder.text_token match, :inline_delimiter if match
+ encoder.end_group :inline
+ state = :attribute
+ in_attribute = nil
+ else
+ encoder.begin_group :string
+ state = :attribute_value_string
+ plain_string_content = PLAIN_STRING_CONTENT[match]
+ encoder.text_token match, :delimiter
+ end
+ elsif match = scan(/#{TAG_END}/o)
+ encoder.text_token match, :tag
state = :initial
else
- kind = :error
- getch
+ encoder.text_token getch, :error
end
-
+
when :attribute_value_string
- if scan(plain_string_content)
- kind = :content
- elsif scan(/['"]/)
- tokens << [matched, :delimiter]
- tokens << [:close, :string]
+ if match = scan(plain_string_content)
+ encoder.text_token match, :content
+ elsif match = scan(/['"]/)
+ encoder.text_token match, :delimiter
+ encoder.end_group :string
state = :attribute
- next
- elsif scan(/#{ENTITY}/ox)
- kind = :entity
- elsif scan(/&/)
- kind = :content
- elsif scan(/[\n>]/)
- tokens << [:close, :string]
- kind = :error
+ elsif match = scan(/#{ENTITY}/ox)
+ encoder.text_token match, :entity
+ elsif match = scan(/&/)
+ encoder.text_token match, :content
+ elsif match = scan(/[\n>]/)
+ encoder.end_group :string
state = :initial
+ encoder.text_token match, :error
end
-
+
+ when :in_special_tag
+ case in_tag
+ when 'script'
+ encoder.text_token match, :space if match = scan(/[ \t]*\n/)
+ if scan(/(\s*<!--)(?:(.*?)(-->)|(.*))/m)
+ code = self[2] || self[4]
+ closing = self[3]
+ encoder.text_token self[1], :comment
+ else
+ code = scan_until(/(?=(?:\n\s*)?<\/script>)|\z/)
+ closing = false
+ end
+ unless code.empty?
+ encoder.begin_group :inline
+ scan_java_script encoder, code
+ encoder.end_group :inline
+ end
+ encoder.text_token closing, :comment if closing
+ state = :initial
+ else
+ raise 'unknown special tag: %p' % [in_tag]
+ end
+
else
- raise_inspect 'Unknown state: %p' % [state], tokens
-
+ raise_inspect 'Unknown state: %p' % [state], encoder
+
end
-
- end
-
- match ||= matched
- if $CODERAY_DEBUG and not kind
- raise_inspect 'Error token %p in line %d' %
- [[match, kind], line], tokens, state
+
end
- raise_inspect 'Empty token', tokens unless match
-
- tokens << [match, kind]
+
end
-
+
if options[:keep_state]
@state = state
@plain_string_content = plain_string_content
end
-
- tokens
+
+ encoder.end_group :string if state == :attribute_value_string
+
+ encoder
end
-
+
end
-
+
end
end
diff --git a/lib/coderay/scanners/java.rb b/lib/coderay/scanners/java.rb
index caf3619..c1490ac 100644
--- a/lib/coderay/scanners/java.rb
+++ b/lib/coderay/scanners/java.rb
@@ -1,11 +1,12 @@
module CodeRay
module Scanners
-
+
+ # Scanner for Java.
class Java < Scanner
-
- include Streamable
+
register_for :java
- helper :builtin_types
+
+ autoload :BuiltinTypes, CodeRay.coderay_path('scanners', 'java', 'builtin_types')
# http://java.sun.com/docs/books/tutorial/java/nutsandbolts/_keywords.html
KEYWORDS = %w[
@@ -13,63 +14,64 @@ module Scanners
finally for if instanceof import new package
return switch throw try typeof while
debugger export
- ]
- RESERVED = %w[ const goto ]
- CONSTANTS = %w[ false null true ]
- MAGIC_VARIABLES = %w[ this super ]
+ ] # :nodoc:
+ RESERVED = %w[ const goto ] # :nodoc:
+ CONSTANTS = %w[ false null true ] # :nodoc:
+ MAGIC_VARIABLES = %w[ this super ] # :nodoc:
TYPES = %w[
boolean byte char class double enum float int interface long
short void
- ] << '[]' # because int[] should be highlighted as a type
+ ] << '[]' # :nodoc: because int[] should be highlighted as a type
DIRECTIVES = %w[
abstract extends final implements native private protected public
static strictfp synchronized throws transient volatile
- ]
+ ] # :nodoc:
IDENT_KIND = WordList.new(:ident).
add(KEYWORDS, :keyword).
add(RESERVED, :reserved).
- add(CONSTANTS, :pre_constant).
+ add(CONSTANTS, :predefined_constant).
add(MAGIC_VARIABLES, :local_variable).
add(TYPES, :type).
- add(BuiltinTypes::List, :pre_type).
+ add(BuiltinTypes::List, :predefined_type).
add(BuiltinTypes::List.select { |builtin| builtin[/(Error|Exception)$/] }, :exception).
- add(DIRECTIVES, :directive)
+ add(DIRECTIVES, :directive) # :nodoc:
- ESCAPE = / [bfnrtv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x
- UNICODE_ESCAPE = / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} /x
+ ESCAPE = / [bfnrtv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x # :nodoc:
+ UNICODE_ESCAPE = / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} /x # :nodoc:
STRING_CONTENT_PATTERN = {
"'" => /[^\\']+/,
'"' => /[^\\"]+/,
'/' => /[^\\\/]+/,
- }
- IDENT = /[a-zA-Z_][A-Za-z_0-9]*/
+ } # :nodoc:
+ IDENT = /[a-zA-Z_][A-Za-z_0-9]*/ # :nodoc:
+
+ protected
- def scan_tokens tokens, options
+ def scan_tokens encoder, options
state = :initial
string_delimiter = nil
- import_clause = class_name_follows = last_token_dot = false
+ package_name_expected = false
+ class_name_follows = false
+ last_token_dot = false
until eos?
- kind = nil
- match = nil
-
case state
when :initial
if match = scan(/ \s+ | \\\n /x)
- tokens << [match, :space]
+ encoder.text_token match, :space
next
elsif match = scan(%r! // [^\n\\]* (?: \\. [^\n\\]* )* | /\* (?: .*? \*/ | .* ) !mx)
- tokens << [match, :comment]
+ encoder.text_token match, :comment
next
- elsif import_clause && scan(/ #{IDENT} (?: \. #{IDENT} )* /ox)
- kind = :include
+ elsif package_name_expected && match = scan(/ #{IDENT} (?: \. #{IDENT} )* /ox)
+ encoder.text_token match, package_name_expected
elsif match = scan(/ #{IDENT} | \[\] /ox)
kind = IDENT_KIND[match]
@@ -79,95 +81,91 @@ module Scanners
kind = :class
class_name_follows = false
else
- import_clause = true if match == 'import'
- class_name_follows = true if match == 'class' || match == 'interface'
+ case match
+ when 'import'
+ package_name_expected = :include
+ when 'package'
+ package_name_expected = :namespace
+ when 'class', 'interface'
+ class_name_follows = true
+ end
end
+ encoder.text_token match, kind
- elsif scan(/ \.(?!\d) | [,?:()\[\]}] | -- | \+\+ | && | \|\| | \*\*=? | [-+*\/%^~&|<>=!]=? | <<<?=? | >>>?=? /x)
- kind = :operator
+ elsif match = scan(/ \.(?!\d) | [,?:()\[\]}] | -- | \+\+ | && | \|\| | \*\*=? | [-+*\/%^~&|<>=!]=? | <<<?=? | >>>?=? /x)
+ encoder.text_token match, :operator
- elsif scan(/;/)
- import_clause = false
- kind = :operator
+ elsif match = scan(/;/)
+ package_name_expected = false
+ encoder.text_token match, :operator
- elsif scan(/\{/)
+ elsif match = scan(/\{/)
class_name_follows = false
- kind = :operator
+ encoder.text_token match, :operator
elsif check(/[\d.]/)
- if scan(/0[xX][0-9A-Fa-f]+/)
- kind = :hex
- elsif scan(/(?>0[0-7]+)(?![89.eEfF])/)
- kind = :oct
- elsif scan(/\d+[fFdD]|\d*\.\d+(?:[eE][+-]?\d+)?[fFdD]?|\d+[eE][+-]?\d+[fFdD]?/)
- kind = :float
- elsif scan(/\d+[lL]?/)
- kind = :integer
+ if match = scan(/0[xX][0-9A-Fa-f]+/)
+ encoder.text_token match, :hex
+ elsif match = scan(/(?>0[0-7]+)(?![89.eEfF])/)
+ encoder.text_token match, :octal
+ elsif match = scan(/\d+[fFdD]|\d*\.\d+(?:[eE][+-]?\d+)?[fFdD]?|\d+[eE][+-]?\d+[fFdD]?/)
+ encoder.text_token match, :float
+ elsif match = scan(/\d+[lL]?/)
+ encoder.text_token match, :integer
end
elsif match = scan(/["']/)
- tokens << [:open, :string]
state = :string
+ encoder.begin_group state
string_delimiter = match
- kind = :delimiter
+ encoder.text_token match, :delimiter
- elsif scan(/ @ #{IDENT} /ox)
- kind = :annotation
+ elsif match = scan(/ @ #{IDENT} /ox)
+ encoder.text_token match, :annotation
else
- getch
- kind = :error
+ encoder.text_token getch, :error
end
when :string
- if scan(STRING_CONTENT_PATTERN[string_delimiter])
- kind = :content
+ if match = scan(STRING_CONTENT_PATTERN[string_delimiter])
+ encoder.text_token match, :content
elsif match = scan(/["'\/]/)
- tokens << [match, :delimiter]
- tokens << [:close, state]
- string_delimiter = nil
+ encoder.text_token match, :delimiter
+ encoder.end_group state
state = :initial
- next
+ string_delimiter = nil
elsif state == :string && (match = scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox))
if string_delimiter == "'" && !(match == "\\\\" || match == "\\'")
- kind = :content
+ encoder.text_token match, :content
else
- kind = :char
+ encoder.text_token match, :char
end
- elsif scan(/\\./m)
- kind = :content
- elsif scan(/ \\ | $ /x)
- tokens << [:close, state]
- kind = :error
+ elsif match = scan(/\\./m)
+ encoder.text_token match, :content
+ elsif match = scan(/ \\ | $ /x)
+ encoder.end_group state
state = :initial
+ encoder.text_token match, :error
else
- raise_inspect "else case \" reached; %p not handled." % peek(1), tokens
+ raise_inspect "else case \" reached; %p not handled." % peek(1), encoder
end
else
- raise_inspect 'Unknown state', tokens
+ raise_inspect 'Unknown state', encoder
end
-
- match ||= matched
- if $CODERAY_DEBUG and not kind
- raise_inspect 'Error token %p in line %d' %
- [[match, kind], line], tokens
- end
- raise_inspect 'Empty token', tokens unless match
last_token_dot = match == '.'
- tokens << [match, kind]
-
end
if state == :string
- tokens << [:close, state]
+ encoder.end_group state
end
- tokens
+ encoder
end
end
diff --git a/lib/coderay/scanners/java/builtin_types.rb b/lib/coderay/scanners/java/builtin_types.rb
index 8087edd..d1b8b73 100644
--- a/lib/coderay/scanners/java/builtin_types.rb
+++ b/lib/coderay/scanners/java/builtin_types.rb
@@ -3,6 +3,7 @@ module Scanners
module Java::BuiltinTypes # :nodoc:
+ #:nocov:
List = %w[
AbstractAction AbstractBorder AbstractButton AbstractCellEditor AbstractCollection
AbstractColorChooserPanel AbstractDocument AbstractExecutorService AbstractInterruptibleChannel
@@ -412,6 +413,7 @@ module Scanners
XPathFactoryConfigurationException XPathFunction XPathFunctionException XPathFunctionResolver
XPathVariableResolver ZipEntry ZipException ZipFile ZipInputStream ZipOutputStream ZoneView
]
+ #:nocov:
end
diff --git a/lib/coderay/scanners/java_script.rb b/lib/coderay/scanners/java_script.rb
index 1f26348..43ecb18 100644
--- a/lib/coderay/scanners/java_script.rb
+++ b/lib/coderay/scanners/java_script.rb
@@ -1,28 +1,29 @@
module CodeRay
module Scanners
-
+
+ # Scanner for JavaScript.
+ #
+ # Aliases: +ecmascript+, +ecma_script+, +javascript+
class JavaScript < Scanner
-
- include Streamable
-
+
register_for :java_script
file_extension 'js'
-
+
# The actual JavaScript keywords.
KEYWORDS = %w[
break case catch continue default delete do else
finally for function if in instanceof new
return switch throw try typeof var void while with
- ]
+ ] # :nodoc:
PREDEFINED_CONSTANTS = %w[
- false null true undefined
- ]
+ false null true undefined NaN Infinity
+ ] # :nodoc:
- MAGIC_VARIABLES = %w[ this arguments ] # arguments was introduced in JavaScript 1.4
+ MAGIC_VARIABLES = %w[ this arguments ] # :nodoc: arguments was introduced in JavaScript 1.4
KEYWORDS_EXPECTING_VALUE = WordList.new.add %w[
case delete in instanceof new return throw typeof with
- ]
+ ] # :nodoc:
# Reserved for future use.
RESERVED_WORDS = %w[
@@ -30,68 +31,66 @@ module Scanners
final float goto implements import int interface long native package
private protected public short static super synchronized throws transient
volatile
- ]
+ ] # :nodoc:
IDENT_KIND = WordList.new(:ident).
add(RESERVED_WORDS, :reserved).
- add(PREDEFINED_CONSTANTS, :pre_constant).
+ add(PREDEFINED_CONSTANTS, :predefined_constant).
add(MAGIC_VARIABLES, :local_variable).
- add(KEYWORDS, :keyword)
-
- ESCAPE = / [bfnrtv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x
- UNICODE_ESCAPE = / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} /x
- REGEXP_ESCAPE = / [bBdDsSwW] /x
+ add(KEYWORDS, :keyword) # :nodoc:
+
+ ESCAPE = / [bfnrtv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x # :nodoc:
+ UNICODE_ESCAPE = / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} /x # :nodoc:
+ REGEXP_ESCAPE = / [bBdDsSwW] /x # :nodoc:
STRING_CONTENT_PATTERN = {
"'" => /[^\\']+/,
'"' => /[^\\"]+/,
'/' => /[^\\\/]+/,
- }
+ } # :nodoc:
KEY_CHECK_PATTERN = {
"'" => / (?> [^\\']* (?: \\. [^\\']* )* ) ' \s* : /mx,
'"' => / (?> [^\\"]* (?: \\. [^\\"]* )* ) " \s* : /mx,
- }
-
- def scan_tokens tokens, options
-
+ } # :nodoc:
+
+ protected
+
+ def scan_tokens encoder, options
+
state = :initial
string_delimiter = nil
value_expected = true
key_expected = false
function_expected = false
-
+
until eos?
-
- kind = nil
- match = nil
case state
-
+
when :initial
-
+
if match = scan(/ \s+ | \\\n /x)
value_expected = true if !value_expected && match.index(?\n)
- tokens << [match, :space]
- next
-
- elsif scan(%r! // [^\n\\]* (?: \\. [^\n\\]* )* | /\* (?: .*? \*/ | .* ) !mx)
+ encoder.text_token match, :space
+
+ elsif match = scan(%r! // [^\n\\]* (?: \\. [^\n\\]* )* | /\* (?: .*? \*/ | .* ) !mx)
value_expected = true
- kind = :comment
-
+ encoder.text_token match, :comment
+
elsif check(/\.?\d/)
key_expected = value_expected = false
- if scan(/0[xX][0-9A-Fa-f]+/)
- kind = :hex
- elsif scan(/(?>0[0-7]+)(?![89.eEfF])/)
- kind = :oct
- elsif scan(/\d+[fF]|\d*\.\d+(?:[eE][+-]?\d+)?[fF]?|\d+[eE][+-]?\d+[fF]?/)
- kind = :float
- elsif scan(/\d+/)
- kind = :integer
+ if match = scan(/0[xX][0-9A-Fa-f]+/)
+ encoder.text_token match, :hex
+ elsif match = scan(/(?>0[0-7]+)(?![89.eEfF])/)
+ encoder.text_token match, :octal
+ elsif match = scan(/\d+[fF]|\d*\.\d+(?:[eE][+-]?\d+)?[fF]?|\d+[eE][+-]?\d+[fF]?/)
+ encoder.text_token match, :float
+ elsif match = scan(/\d+/)
+ encoder.text_token match, :integer
end
-
+
elsif value_expected && match = scan(/<([[:alpha:]]\w*) (?: [^\/>]*\/> | .*?<\/\1>)/xim)
- # FIXME: scan over nested tags
- xml_scanner.tokenize match
+ # TODO: scan over nested tags
+ xml_scanner.tokenize match, :tokens => encoder
value_expected = false
next
@@ -100,12 +99,12 @@ module Scanners
last_operator = match[-1]
key_expected = (last_operator == ?{) || (last_operator == ?,)
function_expected = false
- kind = :operator
-
- elsif scan(/ [)\]}]+ /x)
+ encoder.text_token match, :operator
+
+ elsif match = scan(/ [)\]}]+ /x)
function_expected = key_expected = value_expected = false
- kind = :operator
-
+ encoder.text_token match, :operator
+
elsif match = scan(/ [$a-zA-Z_][A-Za-z_0-9$]* /x)
kind = IDENT_KIND[match]
value_expected = (kind == :keyword) && KEYWORDS_EXPECTING_VALUE[match]
@@ -123,101 +122,91 @@ module Scanners
end
function_expected = (kind == :keyword) && (match == 'function')
key_expected = false
-
+ encoder.text_token match, kind
+
elsif match = scan(/["']/)
if key_expected && check(KEY_CHECK_PATTERN[match])
state = :key
else
state = :string
end
- tokens << [:open, state]
+ encoder.begin_group state
string_delimiter = match
- kind = :delimiter
-
- elsif value_expected && (match = scan(/\/(?=\S)/))
- tokens << [:open, :regexp]
+ encoder.text_token match, :delimiter
+
+ elsif value_expected && (match = scan(/\//))
+ encoder.begin_group :regexp
state = :regexp
string_delimiter = '/'
- kind = :delimiter
-
- elsif scan(/ \/ /x)
+ encoder.text_token match, :delimiter
+
+ elsif match = scan(/ \/ /x)
value_expected = true
key_expected = false
- kind = :operator
-
+ encoder.text_token match, :operator
+
else
- getch
- kind = :error
-
+ encoder.text_token getch, :error
+
end
-
+
when :string, :regexp, :key
- if scan(STRING_CONTENT_PATTERN[string_delimiter])
- kind = :content
+ if match = scan(STRING_CONTENT_PATTERN[string_delimiter])
+ encoder.text_token match, :content
elsif match = scan(/["'\/]/)
- tokens << [match, :delimiter]
+ encoder.text_token match, :delimiter
if state == :regexp
modifiers = scan(/[gim]+/)
- tokens << [modifiers, :modifier] if modifiers && !modifiers.empty?
+ encoder.text_token modifiers, :modifier if modifiers && !modifiers.empty?
end
- tokens << [:close, state]
+ encoder.end_group state
string_delimiter = nil
key_expected = value_expected = false
state = :initial
- next
elsif state != :regexp && (match = scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox))
if string_delimiter == "'" && !(match == "\\\\" || match == "\\'")
- kind = :content
+ encoder.text_token match, :content
else
- kind = :char
+ encoder.text_token match, :char
end
- elsif state == :regexp && scan(/ \\ (?: #{ESCAPE} | #{REGEXP_ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
- kind = :char
- elsif scan(/\\./m)
- kind = :content
- elsif scan(/ \\ | $ /x)
- tokens << [:close, state]
- kind = :error
+ elsif state == :regexp && match = scan(/ \\ (?: #{ESCAPE} | #{REGEXP_ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
+ encoder.text_token match, :char
+ elsif match = scan(/\\./m)
+ encoder.text_token match, :content
+ elsif match = scan(/ \\ | $ /x)
+ encoder.end_group state
+ encoder.text_token match, :error
key_expected = value_expected = false
state = :initial
else
- raise_inspect "else case \" reached; %p not handled." % peek(1), tokens
+ raise_inspect "else case \" reached; %p not handled." % peek(1), encoder
end
-
+
else
- raise_inspect 'Unknown state', tokens
-
- end
-
- match ||= matched
- if $CODERAY_DEBUG and not kind
- raise_inspect 'Error token %p in line %d' %
- [[match, kind], line], tokens
+ raise_inspect 'Unknown state', encoder
+
end
- raise_inspect 'Empty token', tokens unless match
- tokens << [match, kind]
-
end
-
+
if [:string, :regexp].include? state
- tokens << [:close, state]
+ encoder.end_group state
end
-
- tokens
+
+ encoder
end
-
+
protected
-
+
def reset_instance
super
@xml_scanner.reset if defined? @xml_scanner
end
-
+
def xml_scanner
@xml_scanner ||= CodeRay.scanner :xml, :tokens => @tokens, :keep_tokens => true, :keep_state => false
end
-
+
end
end
diff --git a/lib/coderay/scanners/json.rb b/lib/coderay/scanners/json.rb
index abe24fb..0c90c34 100644
--- a/lib/coderay/scanners/json.rb
+++ b/lib/coderay/scanners/json.rb
@@ -1,22 +1,24 @@
module CodeRay
module Scanners
+ # Scanner for JSON (JavaScript Object Notation).
class JSON < Scanner
- include Streamable
-
register_for :json
file_extension 'json'
KINDS_NOT_LOC = [
:float, :char, :content, :delimiter,
:error, :integer, :operator, :value,
- ]
+ ] # :nodoc:
+
+ ESCAPE = / [bfnrt\\"\/] /x # :nodoc:
+ UNICODE_ESCAPE = / u[a-fA-F0-9]{4} /x # :nodoc:
- ESCAPE = / [bfnrt\\"\/] /x
- UNICODE_ESCAPE = / u[a-fA-F0-9]{4} /x
+ protected
- def scan_tokens tokens, options
+ # See http://json.org/ for a definition of the JSON lexic/grammar.
+ def scan_tokens encoder, options
state = :initial
stack = []
@@ -24,82 +26,67 @@ module Scanners
until eos?
- kind = nil
- match = nil
-
case state
when :initial
- if match = scan(/ \s+ | \\\n /x)
- tokens << [match, :space]
- next
+ if match = scan(/ \s+ /x)
+ encoder.text_token match, :space
+ elsif match = scan(/"/)
+ state = key_expected ? :key : :string
+ encoder.begin_group state
+ encoder.text_token match, :delimiter
elsif match = scan(/ [:,\[{\]}] /x)
- kind = :operator
+ encoder.text_token match, :operator
case match
- when '{' then stack << :object; key_expected = true
- when '[' then stack << :array
when ':' then key_expected = false
when ',' then key_expected = true if stack.last == :object
+ when '{' then stack << :object; key_expected = true
+ when '[' then stack << :array
when '}', ']' then stack.pop # no error recovery, but works for valid JSON
end
elsif match = scan(/ true | false | null /x)
- kind = :value
- elsif match = scan(/-?(?:0|[1-9]\d*)/)
- kind = :integer
- if scan(/\.\d+(?:[eE][-+]?\d+)?|[eE][-+]?\d+/)
+ encoder.text_token match, :value
+ elsif match = scan(/ -? (?: 0 | [1-9]\d* ) /x)
+ if scan(/ \.\d+ (?:[eE][-+]?\d+)? | [eE][-+]? \d+ /x)
match << matched
- kind = :float
+ encoder.text_token match, :float
+ else
+ encoder.text_token match, :integer
end
- elsif match = scan(/"/)
- state = key_expected ? :key : :string
- tokens << [:open, state]
- kind = :delimiter
else
- getch
- kind = :error
+ encoder.text_token getch, :error
end
when :string, :key
- if scan(/[^\\"]+/)
- kind = :content
- elsif scan(/"/)
- tokens << ['"', :delimiter]
- tokens << [:close, state]
+ if match = scan(/[^\\"]+/)
+ encoder.text_token match, :content
+ elsif match = scan(/"/)
+ encoder.text_token match, :delimiter
+ encoder.end_group state
state = :initial
- next
- elsif scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
- kind = :char
- elsif scan(/\\./m)
- kind = :content
- elsif scan(/ \\ | $ /x)
- tokens << [:close, state]
- kind = :error
+ elsif match = scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
+ encoder.text_token match, :char
+ elsif match = scan(/\\./m)
+ encoder.text_token match, :content
+ elsif match = scan(/ \\ | $ /x)
+ encoder.end_group state
+ encoder.text_token match, :error
state = :initial
else
- raise_inspect "else case \" reached; %p not handled." % peek(1), tokens
+ raise_inspect "else case \" reached; %p not handled." % peek(1), encoder
end
else
- raise_inspect 'Unknown state', tokens
+ raise_inspect 'Unknown state: %p' % [state], encoder
end
-
- match ||= matched
- if $CODERAY_DEBUG and not kind
- raise_inspect 'Error token %p in line %d' %
- [[match, kind], line], tokens
- end
- raise_inspect 'Empty token', tokens unless match
-
- tokens << [match, kind]
-
end
if [:string, :key].include? state
- tokens << [:close, state]
+ encoder.end_group state
end
- tokens
+ encoder
end
end
diff --git a/lib/coderay/scanners/nitro_xhtml.rb b/lib/coderay/scanners/nitro_xhtml.rb
deleted file mode 100644
index 3db42d9..0000000
--- a/lib/coderay/scanners/nitro_xhtml.rb
+++ /dev/null
@@ -1,136 +0,0 @@
-module CodeRay
-module Scanners
-
- load :html
- load :ruby
-
- # Nitro XHTML Scanner
- class NitroXHTML < Scanner
-
- include Streamable
- register_for :nitro_xhtml
- file_extension :xhtml
- title 'Nitro XHTML'
-
- KINDS_NOT_LOC = HTML::KINDS_NOT_LOC
-
- NITRO_RUBY_BLOCK = /
- <\?r
- (?>
- [^\?]*
- (?> \?(?!>) [^\?]* )*
- )
- (?: \?> )?
- |
- <ruby>
- (?>
- [^<]*
- (?> <(?!\/ruby>) [^<]* )*
- )
- (?: <\/ruby> )?
- |
- <%
- (?>
- [^%]*
- (?> %(?!>) [^%]* )*
- )
- (?: %> )?
- /mx
-
- NITRO_VALUE_BLOCK = /
- \#
- (?:
- \{
- [^{}]*
- (?>
- \{ [^}]* \}
- (?> [^{}]* )
- )*
- \}?
- | \| [^|]* \|?
- | \( [^)]* \)?
- | \[ [^\]]* \]?
- | \\ [^\\]* \\?
- )
- /x
-
- NITRO_ENTITY = /
- % (?: \#\d+ | \w+ ) ;
- /
-
- START_OF_RUBY = /
- (?=[<\#%])
- < (?: \?r | % | ruby> )
- | \# [{(|]
- | % (?: \#\d+ | \w+ ) ;
- /x
-
- CLOSING_PAREN = Hash.new do |h, p|
- h[p] = p
- end.update( {
- '(' => ')',
- '[' => ']',
- '{' => '}',
- } )
-
- private
-
- def setup
- @ruby_scanner = CodeRay.scanner :ruby, :tokens => @tokens, :keep_tokens => true
- @html_scanner = CodeRay.scanner :html, :tokens => @tokens, :keep_tokens => true, :keep_state => true
- end
-
- def reset_instance
- super
- @html_scanner.reset
- end
-
- def scan_tokens tokens, options
-
- until eos?
-
- if (match = scan_until(/(?=#{START_OF_RUBY})/o) || scan_rest) && !match.empty?
- @html_scanner.tokenize match
-
- elsif match = scan(/#{NITRO_VALUE_BLOCK}/o)
- start_tag = match[0,2]
- delimiter = CLOSING_PAREN[start_tag[1,1]]
- end_tag = match[-1,1] == delimiter ? delimiter : ''
- tokens << [:open, :inline]
- tokens << [start_tag, :inline_delimiter]
- code = match[start_tag.size .. -1 - end_tag.size]
- @ruby_scanner.tokenize code
- tokens << [end_tag, :inline_delimiter] unless end_tag.empty?
- tokens << [:close, :inline]
-
- elsif match = scan(/#{NITRO_RUBY_BLOCK}/o)
- start_tag = '<?r'
- end_tag = match[-2,2] == '?>' ? '?>' : ''
- tokens << [:open, :inline]
- tokens << [start_tag, :inline_delimiter]
- code = match[start_tag.size .. -(end_tag.size)-1]
- @ruby_scanner.tokenize code
- tokens << [end_tag, :inline_delimiter] unless end_tag.empty?
- tokens << [:close, :inline]
-
- elsif entity = scan(/#{NITRO_ENTITY}/o)
- tokens << [entity, :entity]
-
- elsif scan(/%/)
- tokens << [matched, :error]
-
- else
- raise_inspect 'else-case reached!', tokens
-
- end
-
- end
-
- tokens
-
- end
-
- end
-
-end
-end
diff --git a/lib/coderay/scanners/php.rb b/lib/coderay/scanners/php.rb
index 51d6af0..dadab00 100644
--- a/lib/coderay/scanners/php.rb
+++ b/lib/coderay/scanners/php.rb
@@ -3,14 +3,19 @@ module Scanners
load :html
+ # Scanner for PHP.
+ #
# Original by Stefan Walk.
class PHP < Scanner
register_for :php
file_extension 'php'
+ encoding 'BINARY'
KINDS_NOT_LOC = HTML::KINDS_NOT_LOC
+ protected
+
def setup
@html_scanner = CodeRay.scanner :html, :tokens => @tokens, :keep_tokens => true, :keep_state => true
end
@@ -20,7 +25,7 @@ module Scanners
@html_scanner.reset
end
- module Words
+ module Words # :nodoc:
# according to http://www.php.net/manual/en/reserved.keywords.php
KEYWORDS = %w[
@@ -176,20 +181,20 @@ module Scanners
$argc $argv
]
- IDENT_KIND = CaseIgnoringWordList.new(:ident).
- add(KEYWORDS, :reserved).
- add(TYPES, :pre_type).
- add(LANGUAGE_CONSTRUCTS, :reserved).
+ IDENT_KIND = WordList::CaseIgnoring.new(:ident).
+ add(KEYWORDS, :keyword).
+ add(TYPES, :predefined_type).
+ add(LANGUAGE_CONSTRUCTS, :keyword).
add(BUILTIN_FUNCTIONS, :predefined).
- add(CLASSES, :pre_constant).
+ add(CLASSES, :predefined_constant).
add(EXCEPTIONS, :exception).
- add(CONSTANTS, :pre_constant)
+ add(CONSTANTS, :predefined_constant)
VARIABLE_KIND = WordList.new(:local_variable).
add(PREDEFINED, :predefined)
end
- module RE
+ module RE # :nodoc:
PHP_START = /
<script\s+[^>]*?language\s*=\s*"php"[^>]*?> |
@@ -224,17 +229,13 @@ module Scanners
end
- def scan_tokens tokens, options
- if string.respond_to?(:encoding)
- unless string.encoding == Encoding::ASCII_8BIT
- self.string = string.encode Encoding::ASCII_8BIT,
- :invalid => :replace, :undef => :replace, :replace => '?'
- end
- end
+ protected
+
+ def scan_tokens encoder, options
if check(RE::PHP_START) || # starts with <?
- (match?(/\s*<\S/) && exist?(RE::PHP_START)) || # starts with tag and contains <?
- exist?(RE::HTML_INDICATOR) ||
+ (match?(/\s*<\S/) && check(/.{1,1000}#{RE::PHP_START}/om)) || # starts with tag and contains <?
+ check(/.{0,1000}#{RE::HTML_INDICATOR}/om) ||
check(/.{1,100}#{RE::PHP_START}/om) # PHP start after max 100 chars
# is HTML with embedded PHP, so start with HTML
states = [:initial]
@@ -252,29 +253,24 @@ module Scanners
until eos?
- match = nil
- kind = nil
-
case states.last
when :initial # HTML
- if scan RE::PHP_START
- kind = :inline_delimiter
+ if match = scan(RE::PHP_START)
+ encoder.text_token match, :inline_delimiter
label_expected = true
states << :php
else
match = scan_until(/(?=#{RE::PHP_START})/o) || scan_rest
@html_scanner.tokenize match unless match.empty?
- next
end
when :php
if match = scan(/\s+/)
- tokens << [match, :space]
- next
+ encoder.text_token match, :space
- elsif scan(%r! (?m: \/\* (?: .*? \*\/ | .* ) ) | (?://|\#) .*? (?=#{RE::PHP_END}|$) !xo)
- kind = :comment
+ elsif match = scan(%r! (?m: \/\* (?: .*? \*\/ | .* ) ) | (?://|\#) .*? (?=#{RE::PHP_END}|$) !xo)
+ encoder.text_token match, :comment
elsif match = scan(RE::IDENTIFIER)
kind = Words::IDENT_KIND[match]
@@ -285,7 +281,7 @@ module Scanners
label_expected = false
if kind == :ident && match =~ /^[A-Z]/
kind = :constant
- elsif kind == :reserved
+ elsif kind == :keyword
case match
when 'class'
states << :class_expected
@@ -299,77 +295,68 @@ module Scanners
next
end
end
+ encoder.text_token match, kind
- elsif scan(/(?:\d+\.\d*|\d*\.\d+)(?:e[-+]?\d+)?|\d+e[-+]?\d+/i)
+ elsif match = scan(/(?:\d+\.\d*|\d*\.\d+)(?:e[-+]?\d+)?|\d+e[-+]?\d+/i)
label_expected = false
- kind = :float
+ encoder.text_token match, :float
- elsif scan(/0x[0-9a-fA-F]+/)
+ elsif match = scan(/0x[0-9a-fA-F]+/)
label_expected = false
- kind = :hex
+ encoder.text_token match, :hex
- elsif scan(/\d+/)
+ elsif match = scan(/\d+/)
label_expected = false
- kind = :integer
-
- elsif scan(/'/)
- tokens << [:open, :string]
- if modifier
- tokens << [modifier, :modifier]
- modifier = nil
- end
- kind = :delimiter
- states.push :sqstring
+ encoder.text_token match, :integer
- elsif match = scan(/["`]/)
- tokens << [:open, :string]
+ elsif match = scan(/['"`]/)
+ encoder.begin_group :string
if modifier
- tokens << [modifier, :modifier]
+ encoder.text_token modifier, :modifier
modifier = nil
end
delimiter = match
- kind = :delimiter
- states.push :dqstring
+ encoder.text_token match, :delimiter
+ states.push match == "'" ? :sqstring : :dqstring
elsif match = scan(RE::VARIABLE)
label_expected = false
- kind = Words::VARIABLE_KIND[match]
+ encoder.text_token match, Words::VARIABLE_KIND[match]
- elsif scan(/\{/)
- kind = :operator
+ elsif match = scan(/\{/)
+ encoder.text_token match, :operator
label_expected = true
states.push :php
- elsif scan(/\}/)
+ elsif match = scan(/\}/)
if states.size == 1
- kind = :error
+ encoder.text_token match, :error
else
states.pop
if states.last.is_a?(::Array)
delimiter = states.last[1]
states[-1] = states.last[0]
- tokens << [matched, :delimiter]
- tokens << [:close, :inline]
- next
+ encoder.text_token match, :delimiter
+ encoder.end_group :inline
else
- kind = :operator
+ encoder.text_token match, :operator
label_expected = true
end
end
- elsif scan(/@/)
+ elsif match = scan(/@/)
label_expected = false
- kind = :exception
+ encoder.text_token match, :exception
- elsif scan RE::PHP_END
- kind = :inline_delimiter
+ elsif match = scan(RE::PHP_END)
+ encoder.text_token match, :inline_delimiter
states = [:initial]
elsif match = scan(/<<<(?:(#{RE::IDENTIFIER})|"(#{RE::IDENTIFIER})"|'(#{RE::IDENTIFIER})')/o)
- tokens << [:open, :string]
- warn 'heredoc in heredoc?' if heredoc_delimiter
+ encoder.begin_group :string
+ # warn 'heredoc in heredoc?' if heredoc_delimiter
heredoc_delimiter = Regexp.escape(self[1] || self[2] || self[3])
- kind = :delimiter
+ encoder.text_token match, :delimiter
states.push self[3] ? :sqstring : :dqstring
heredoc_delimiter = /#{heredoc_delimiter}(?=;?$)/
@@ -379,152 +366,141 @@ module Scanners
label_expected = true if match == ':'
case_expected = false
end
- kind = :operator
+ encoder.text_token match, :operator
else
- getch
- kind = :error
+ encoder.text_token getch, :error
end
when :sqstring
- if scan(heredoc_delimiter ? /[^\\\n]+/ : /[^'\\]+/)
- kind = :content
- elsif !heredoc_delimiter && scan(/'/)
- tokens << [matched, :delimiter]
- tokens << [:close, :string]
+ if match = scan(heredoc_delimiter ? /[^\\\n]+/ : /[^'\\]+/)
+ encoder.text_token match, :content
+ elsif !heredoc_delimiter && match = scan(/'/)
+ encoder.text_token match, :delimiter
+ encoder.end_group :string
delimiter = nil
label_expected = false
states.pop
- next
elsif heredoc_delimiter && match = scan(/\n/)
- kind = :content
if scan heredoc_delimiter
- tokens << ["\n", :content]
- tokens << [matched, :delimiter]
- tokens << [:close, :string]
+ encoder.text_token "\n", :content
+ encoder.text_token matched, :delimiter
+ encoder.end_group :string
heredoc_delimiter = nil
label_expected = false
states.pop
- next
+ else
+ encoder.text_token match, :content
end
- elsif scan(heredoc_delimiter ? /\\\\/ : /\\[\\'\n]/)
- kind = :char
- elsif scan(/\\./m)
- kind = :content
- elsif scan(/\\/)
- kind = :error
+ elsif match = scan(heredoc_delimiter ? /\\\\/ : /\\[\\'\n]/)
+ encoder.text_token match, :char
+ elsif match = scan(/\\./m)
+ encoder.text_token match, :content
+ elsif match = scan(/\\/)
+ encoder.text_token match, :error
+ else
+ states.pop
end
when :dqstring
- if scan(heredoc_delimiter ? /[^${\\\n]+/ : (delimiter == '"' ? /[^"${\\]+/ : /[^`${\\]+/))
- kind = :content
- elsif !heredoc_delimiter && scan(delimiter == '"' ? /"/ : /`/)
- tokens << [matched, :delimiter]
- tokens << [:close, :string]
+ if match = scan(heredoc_delimiter ? /[^${\\\n]+/ : (delimiter == '"' ? /[^"${\\]+/ : /[^`${\\]+/))
+ encoder.text_token match, :content
+ elsif !heredoc_delimiter && match = scan(delimiter == '"' ? /"/ : /`/)
+ encoder.text_token match, :delimiter
+ encoder.end_group :string
delimiter = nil
label_expected = false
states.pop
- next
elsif heredoc_delimiter && match = scan(/\n/)
- kind = :content
if scan heredoc_delimiter
- tokens << ["\n", :content]
- tokens << [matched, :delimiter]
- tokens << [:close, :string]
+ encoder.text_token "\n", :content
+ encoder.text_token matched, :delimiter
+ encoder.end_group :string
heredoc_delimiter = nil
label_expected = false
states.pop
- next
+ else
+ encoder.text_token match, :content
end
- elsif scan(/\\(?:x[0-9A-Fa-f]{1,2}|[0-7]{1,3})/)
- kind = :char
- elsif scan(heredoc_delimiter ? /\\[nrtvf\\$]/ : (delimiter == '"' ? /\\[nrtvf\\$"]/ : /\\[nrtvf\\$`]/))
- kind = :char
- elsif scan(/\\./m)
- kind = :content
- elsif scan(/\\/)
- kind = :error
+ elsif match = scan(/\\(?:x[0-9A-Fa-f]{1,2}|[0-7]{1,3})/)
+ encoder.text_token match, :char
+ elsif match = scan(heredoc_delimiter ? /\\[nrtvf\\$]/ : (delimiter == '"' ? /\\[nrtvf\\$"]/ : /\\[nrtvf\\$`]/))
+ encoder.text_token match, :char
+ elsif match = scan(/\\./m)
+ encoder.text_token match, :content
+ elsif match = scan(/\\/)
+ encoder.text_token match, :error
elsif match = scan(/#{RE::VARIABLE}/o)
- kind = :local_variable
if check(/\[#{RE::IDENTIFIER}\]/o)
- tokens << [:open, :inline]
- tokens << [match, :local_variable]
- tokens << [scan(/\[/), :operator]
- tokens << [scan(/#{RE::IDENTIFIER}/o), :ident]
- tokens << [scan(/\]/), :operator]
- tokens << [:close, :inline]
- next
+ encoder.begin_group :inline
+ encoder.text_token match, :local_variable
+ encoder.text_token scan(/\[/), :operator
+ encoder.text_token scan(/#{RE::IDENTIFIER}/o), :ident
+ encoder.text_token scan(/\]/), :operator
+ encoder.end_group :inline
elsif check(/\[/)
match << scan(/\[['"]?#{RE::IDENTIFIER}?['"]?\]?/o)
- kind = :error
+ encoder.text_token match, :error
elsif check(/->#{RE::IDENTIFIER}/o)
- tokens << [:open, :inline]
- tokens << [match, :local_variable]
- tokens << [scan(/->/), :operator]
- tokens << [scan(/#{RE::IDENTIFIER}/o), :ident]
- tokens << [:close, :inline]
- next
+ encoder.begin_group :inline
+ encoder.text_token match, :local_variable
+ encoder.text_token scan(/->/), :operator
+ encoder.text_token scan(/#{RE::IDENTIFIER}/o), :ident
+ encoder.end_group :inline
elsif check(/->/)
match << scan(/->/)
- kind = :error
+ encoder.text_token match, :error
+ else
+ encoder.text_token match, :local_variable
end
elsif match = scan(/\{/)
if check(/\$/)
- kind = :delimiter
+ encoder.begin_group :inline
states[-1] = [states.last, delimiter]
delimiter = nil
states.push :php
- tokens << [:open, :inline]
+ encoder.text_token match, :delimiter
else
- kind = :string
+ encoder.text_token match, :content
end
- elsif scan(/\$\{#{RE::IDENTIFIER}\}/o)
- kind = :local_variable
- elsif scan(/\$/)
- kind = :content
+ elsif match = scan(/\$\{#{RE::IDENTIFIER}\}/o)
+ encoder.text_token match, :local_variable
+ elsif match = scan(/\$/)
+ encoder.text_token match, :content
+ else
+ states.pop
end
when :class_expected
- if scan(/\s+/)
- kind = :space
+ if match = scan(/\s+/)
+ encoder.text_token match, :space
elsif match = scan(/#{RE::IDENTIFIER}/o)
- kind = :class
+ encoder.text_token match, :class
states.pop
else
states.pop
- next
end
when :function_expected
- if scan(/\s+/)
- kind = :space
- elsif scan(/&/)
- kind = :operator
+ if match = scan(/\s+/)
+ encoder.text_token match, :space
+ elsif match = scan(/&/)
+ encoder.text_token match, :operator
elsif match = scan(/#{RE::IDENTIFIER}/o)
- kind = :function
+ encoder.text_token match, :function
states.pop
else
states.pop
- next
end
else
- raise_inspect 'Unknown state!', tokens, states
+ raise_inspect 'Unknown state!', encoder, states
end
- match ||= matched
- if $CODERAY_DEBUG and not kind
- raise_inspect 'Error token %p in line %d' %
- [[match, kind], line], tokens, states
- end
- raise_inspect 'Empty token', tokens, states unless match
-
- tokens << [match, kind]
-
end
- tokens
+ encoder
end
end
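Reader's note on the hunks above: the recurring change is that the scanner no longer pushes pairs onto a tokens array (`tokens << [match, kind]`, `tokens << [:open, :string]`, `tokens << [:close, :string]`) but calls methods on an encoder object (`encoder.text_token`, `encoder.begin_group`, `encoder.end_group`). The following is only a minimal sketch of that calling pattern, not CodeRay code: `RecordingEncoder` and `scan_quoted` are hypothetical names made up for illustration, and the only assumption taken from the diff is the three method names and their argument order.

```ruby
require 'strscan'

# Hypothetical stand-in for an encoder: it only records the calls it receives.
class RecordingEncoder
  attr_reader :events

  def initialize
    @events = []
  end

  def text_token(text, kind)
    @events << [:text, text, kind]
  end

  def begin_group(kind)
    @events << [:begin_group, kind]
  end

  def end_group(kind)
    @events << [:end_group, kind]
  end
end

# Hypothetical mini-scanner for whitespace and single-quoted strings,
# written in the style of the "+" lines above.
def scan_quoted(source, encoder)
  s = StringScanner.new(source)
  in_string = false
  until s.eos?
    if in_string
      if match = s.scan(/[^'\\]+/)
        encoder.text_token match, :content
      elsif match = s.scan(/'/)
        encoder.text_token match, :delimiter
        encoder.end_group :string
        in_string = false
      elsif match = s.scan(/\\./m)
        encoder.text_token match, :char
      else
        encoder.text_token s.getch, :error
      end
    elsif match = s.scan(/\s+/)
      encoder.text_token match, :space
    elsif match = s.scan(/'/)
      encoder.begin_group :string
      encoder.text_token match, :delimiter
      in_string = true
    else
      encoder.text_token s.getch, :error
    end
  end
  # Close a string left open at end of input, as the scanners above do.
  encoder.end_group :string if in_string
  encoder
end
```

Because every branch emits its own token, a sketch like this needs no trailing `tokens << [match, kind]` step and no `next` to skip it, which appears to be why those lines are removed throughout the diff.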
diff --git a/lib/coderay/scanners/plaintext.rb b/lib/coderay/scanners/plaintext.rb
deleted file mode 100644
index 521f3ac..0000000
--- a/lib/coderay/scanners/plaintext.rb
+++ /dev/null
@@ -1,20 +0,0 @@
-module CodeRay
-module Scanners
-
- class Plaintext < Scanner
-
- register_for :plaintext, :plain
- title 'Plain text'
-
- include Streamable
-
- KINDS_NOT_LOC = [:plain]
-
- def scan_tokens tokens, options
- tokens << [scan_rest, :plain]
- end
-
- end
-
-end
-end
diff --git a/lib/coderay/scanners/python.rb b/lib/coderay/scanners/python.rb
index 6d0fc34..5e38a2c 100644
--- a/lib/coderay/scanners/python.rb
+++ b/lib/coderay/scanners/python.rb
@@ -1,12 +1,12 @@
module CodeRay
module Scanners
- # Bases on pygments' PythonLexer, see
+ # Scanner for Python. Supports Python 3.
+ #
+ # Based on pygments' PythonLexer, see
# http://dev.pocoo.org/projects/pygments/browser/pygments/lexers/agile.py.
class Python < Scanner
- include Streamable
-
register_for :python
file_extension 'py'
@@ -16,11 +16,11 @@ module Scanners
'from', 'global', 'if', 'import', 'in', 'is', 'lambda', 'not',
'or', 'pass', 'raise', 'return', 'try', 'while', 'with', 'yield',
'nonlocal', # new in Python 3
- ]
+ ] # :nodoc:
OLD_KEYWORDS = [
'exec', 'print', # gone in Python 3
- ]
+ ] # :nodoc:
PREDEFINED_METHODS_AND_TYPES = %w[
__import__ abs all any apply basestring bin bool buffer
@@ -32,7 +32,7 @@ module Scanners
raw_input reduce reload repr reversed round set setattr slice
sorted staticmethod str sum super tuple type unichr unicode
vars xrange zip
- ]
+ ] # :nodoc:
PREDEFINED_EXCEPTIONS = %w[
ArithmeticError AssertionError AttributeError
@@ -47,23 +47,23 @@ module Scanners
TypeError UnboundLocalError UnicodeDecodeError
UnicodeEncodeError UnicodeError UnicodeTranslateError
UnicodeWarning UserWarning ValueError Warning ZeroDivisionError
- ]
+ ] # :nodoc:
PREDEFINED_VARIABLES_AND_CONSTANTS = [
- 'False', 'True', 'None', # "keywords" since Python 3
+ 'False', 'True', 'None', # "keywords" since Python 3
'self', 'Ellipsis', 'NotImplemented',
- ]
+ ] # :nodoc:
IDENT_KIND = WordList.new(:ident).
add(KEYWORDS, :keyword).
add(OLD_KEYWORDS, :old_keyword).
add(PREDEFINED_METHODS_AND_TYPES, :predefined).
- add(PREDEFINED_VARIABLES_AND_CONSTANTS, :pre_constant).
- add(PREDEFINED_EXCEPTIONS, :exception)
+ add(PREDEFINED_VARIABLES_AND_CONSTANTS, :predefined_constant).
+ add(PREDEFINED_EXCEPTIONS, :exception) # :nodoc:
- NAME = / [^\W\d] \w* /x
- ESCAPE = / [abfnrtv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x
- UNICODE_ESCAPE = / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} | N\{[-\w ]+\} /x
+ NAME = / [^\W\d] \w* /x # :nodoc:
+ ESCAPE = / [abfnrtv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x # :nodoc:
+ UNICODE_ESCAPE = / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} | N\{[-\w ]+\} /x # :nodoc:
OPERATOR = /
\.\.\. | # ellipsis
@@ -73,95 +73,103 @@ module Scanners
[-+*\/%&|^]=? | # ordinary math and binary logic
[~`] | # binary complement and inspection
<<=? | >>=? | [<>=]=? | != # comparison and assignment
- /x
+ /x # :nodoc:
- STRING_DELIMITER_REGEXP = Hash.new do |h, delimiter|
- h[delimiter] = Regexp.union delimiter
- end
+ STRING_DELIMITER_REGEXP = Hash.new { |h, delimiter|
+ h[delimiter] = Regexp.union delimiter # :nodoc:
+ }
- STRING_CONTENT_REGEXP = Hash.new do |h, delimiter|
- h[delimiter] = / [^\\\n]+? (?= \\ | $ | #{Regexp.escape(delimiter)} ) /x
- end
+ STRING_CONTENT_REGEXP = Hash.new { |h, delimiter|
+ h[delimiter] = / [^\\\n]+? (?= \\ | $ | #{Regexp.escape(delimiter)} ) /x # :nodoc:
+ }
DEF_NEW_STATE = WordList.new(:initial).
add(%w(def), :def_expected).
add(%w(import from), :include_expected).
- add(%w(class), :class_expected)
+ add(%w(class), :class_expected) # :nodoc:
DESCRIPTOR = /
#{NAME}
(?: \. #{NAME} )*
| \*
- /x
+ /x # :nodoc:
+
+ DOCSTRING_COMING = /
+ [ \t]* u?r? ("""|''')
+ /x # :nodoc:
- def scan_tokens tokens, options
+ protected
+
+ def scan_tokens encoder, options
state = :initial
string_delimiter = nil
string_raw = false
+ string_type = nil
+ docstring_coming = match?(/#{DOCSTRING_COMING}/o)
last_token_dot = false
unicode = string.respond_to?(:encoding) && string.encoding.name == 'UTF-8'
from_import_state = []
until eos?
- kind = nil
- match = nil
-
if state == :string
- if scan(STRING_DELIMITER_REGEXP[string_delimiter])
- tokens << [matched, :delimiter]
- tokens << [:close, :string]
+ if match = scan(STRING_DELIMITER_REGEXP[string_delimiter])
+ encoder.text_token match, :delimiter
+ encoder.end_group string_type
+ string_type = nil
state = :initial
next
- elsif string_delimiter.size == 3 && scan(/\n/)
- kind = :content
- elsif scan(STRING_CONTENT_REGEXP[string_delimiter])
- kind = :content
- elsif !string_raw && scan(/ \\ #{ESCAPE} /ox)
- kind = :char
- elsif scan(/ \\ #{UNICODE_ESCAPE} /ox)
- kind = :char
- elsif scan(/ \\ . /x)
- kind = :content
- elsif scan(/ \\ | $ /x)
- tokens << [:close, :string]
- kind = :error
+ elsif string_delimiter.size == 3 && match = scan(/\n/)
+ encoder.text_token match, :content
+ elsif match = scan(STRING_CONTENT_REGEXP[string_delimiter])
+ encoder.text_token match, :content
+ elsif !string_raw && match = scan(/ \\ #{ESCAPE} /ox)
+ encoder.text_token match, :char
+ elsif match = scan(/ \\ #{UNICODE_ESCAPE} /ox)
+ encoder.text_token match, :char
+ elsif match = scan(/ \\ . /x)
+ encoder.text_token match, :content
+ elsif match = scan(/ \\ | $ /x)
+ encoder.end_group string_type
+ string_type = nil
+ encoder.text_token match, :error
state = :initial
else
- raise_inspect "else case \" reached; %p not handled." % peek(1), tokens, state
+ raise_inspect "else case \" reached; %p not handled." % peek(1), encoder, state
end
- elsif match = scan(/ [ \t]+ | \\\n /x)
- tokens << [match, :space]
- next
-
- elsif match = scan(/\n/)
- tokens << [match, :space]
- state = :initial if state == :include_expected
+ elsif match = scan(/ [ \t]+ | \\?\n /x)
+ encoder.text_token match, :space
+ if match == "\n"
+ state = :initial if state == :include_expected
+ docstring_coming = true if match?(/#{DOCSTRING_COMING}/o)
+ end
next
elsif match = scan(/ \# [^\n]* /mx)
- tokens << [match, :comment]
+ encoder.text_token match, :comment
next
elsif state == :initial
- if scan(/#{OPERATOR}/o)
- kind = :operator
+ if match = scan(/#{OPERATOR}/o)
+ encoder.text_token match, :operator
elsif match = scan(/(u?r?|b)?("""|"|'''|')/i)
- tokens << [:open, :string]
string_delimiter = self[2]
+ string_type = docstring_coming ? :docstring : :string
+ docstring_coming = false if docstring_coming
+ encoder.begin_group string_type
string_raw = false
modifiers = self[1]
unless modifiers.empty?
string_raw = !!modifiers.index(?r)
- tokens << [modifiers, :modifier]
+ encoder.text_token modifiers, :modifier
match = string_delimiter
end
state = :string
- kind = :delimiter
+ encoder.text_token match, :delimiter
# TODO: backticks
@@ -177,43 +185,45 @@ module Scanners
state = DEF_NEW_STATE[match]
from_import_state << match.to_sym if state == :include_expected
end
+ encoder.text_token match, kind
- elsif scan(/@[a-zA-Z0-9_.]+[lL]?/)
- kind = :decorator
+ elsif match = scan(/@[a-zA-Z0-9_.]+[lL]?/)
+ encoder.text_token match, :decorator
- elsif scan(/0[xX][0-9A-Fa-f]+[lL]?/)
- kind = :hex
+ elsif match = scan(/0[xX][0-9A-Fa-f]+[lL]?/)
+ encoder.text_token match, :hex
- elsif scan(/0[bB][01]+[lL]?/)
- kind = :bin
+ elsif match = scan(/0[bB][01]+[lL]?/)
+ encoder.text_token match, :binary
elsif match = scan(/(?:\d*\.\d+|\d+\.\d*)(?:[eE][+-]?\d+)?|\d+[eE][+-]?\d+/)
- kind = :float
if scan(/[jJ]/)
match << matched
- kind = :imaginary
+ encoder.text_token match, :imaginary
+ else
+ encoder.text_token match, :float
end
- elsif scan(/0[oO][0-7]+|0[0-7]+(?![89.eE])[lL]?/)
- kind = :oct
+ elsif match = scan(/0[oO][0-7]+|0[0-7]+(?![89.eE])[lL]?/)
+ encoder.text_token match, :octal
elsif match = scan(/\d+([lL])?/)
- kind = :integer
if self[1] == nil && scan(/[jJ]/)
match << matched
- kind = :imaginary
+ encoder.text_token match, :imaginary
+ else
+ encoder.text_token match, :integer
end
else
- getch
- kind = :error
+ encoder.text_token getch, :error
end
elsif state == :def_expected
state = :initial
if match = scan(unicode ? /#{NAME}/uo : /#{NAME}/o)
- kind = :method
+ encoder.text_token match, :method
else
next
end
@@ -221,33 +231,34 @@ module Scanners
elsif state == :class_expected
state = :initial
if match = scan(unicode ? /#{NAME}/uo : /#{NAME}/o)
- kind = :class
+ encoder.text_token match, :class
else
next
end
elsif state == :include_expected
if match = scan(unicode ? /#{DESCRIPTOR}/uo : /#{DESCRIPTOR}/o)
- kind = :include
if match == 'as'
- kind = :keyword
+ encoder.text_token match, :keyword
from_import_state << :as
elsif from_import_state.first == :from && match == 'import'
- kind = :keyword
+ encoder.text_token match, :keyword
from_import_state << :import
elsif from_import_state.last == :as
- # kind = match[0,1][unicode ? /[[:upper:]]/u : /[[:upper:]]/] ? :class : :method
- kind = :ident
+ # encoder.text_token match, match[0,1][unicode ? /[[:upper:]]/u : /[[:upper:]]/] ? :class : :method
+ encoder.text_token match, :ident
from_import_state.pop
elsif IDENT_KIND[match] == :keyword
unscan
match = nil
state = :initial
next
+ else
+ encoder.text_token match, :include
end
elsif match = scan(/,/)
from_import_state.pop if from_import_state.last == :as
- kind = :operator
+ encoder.text_token match, :operator
else
from_import_state = []
state = :initial
@@ -255,28 +266,19 @@ module Scanners
end
else
- raise_inspect 'Unknown state', tokens, state
+ raise_inspect 'Unknown state', encoder, state
end
- match ||= matched
- if $CODERAY_DEBUG and not kind
- raise_inspect 'Error token %p in line %d' %
- [[match, kind], line], tokens, state
- end
- raise_inspect 'Empty token', tokens, state unless match
-
last_token_dot = match == '.'
- tokens << [match, kind]
-
end
if state == :string
- tokens << [:close, :string]
+ encoder.end_group string_type
end
- tokens
+ encoder
end
end
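Reader's note on the Python hunks above: the new `DOCSTRING_COMING` pattern is checked at the start of the input and again after each newline, and when it matches, the next string opened is grouped as `:docstring` instead of `:string`. The sketch below only illustrates what the pattern itself accepts. `docstring_lines` is a hypothetical helper, not part of CodeRay, and it tests each line in isolation rather than tracking the `docstring_coming` flag the way the scanner does.

```ruby
require 'strscan'

# Same pattern as the DOCSTRING_COMING constant added in the hunk above.
DOCSTRING_COMING = /
  [ \t]* u?r? ("""|''')
/x

# Hypothetical helper: for each line of +source+, report whether a
# triple-quoted string starts the line (after optional indentation and an
# optional u and/or r prefix).
def docstring_lines(source)
  source.each_line.map do |line|
    !StringScanner.new(line).match?(DOCSTRING_COMING).nil?
  end
end
```

A triple-quoted string that follows other code on the same line, such as an assignment, does not match, because the pattern allows only spaces, tabs, and the prefix letters before the quotes.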
diff --git a/lib/coderay/scanners/raydebug.rb b/lib/coderay/scanners/raydebug.rb
new file mode 100644
index 0000000..7a21354
--- /dev/null
+++ b/lib/coderay/scanners/raydebug.rb
@@ -0,0 +1,66 @@
+module CodeRay
+module Scanners
+
+ # = Debug Scanner
+ #
+ # Parses the output of the Encoders::Debug encoder.
+ class Raydebug < Scanner
+
+ register_for :raydebug
+ file_extension 'raydebug'
+ title 'CodeRay Token Dump'
+
+ protected
+
+ def scan_tokens encoder, options
+
+ opened_tokens = []
+
+ until eos?
+
+ if match = scan(/\s+/)
+ encoder.text_token match, :space
+
+ elsif match = scan(/ (\w+) \( ( [^\)\\]* ( \\. [^\)\\]* )* ) /x)
+ kind = self[1]
+ encoder.text_token kind, :class
+ encoder.text_token '(', :operator
+ match = self[2]
+ encoder.text_token match, kind.to_sym
+ encoder.text_token match, :operator if match = scan(/\)/)
+
+ elsif match = scan(/ (\w+) ([<\[]) /x)
+ kind = self[1]
+ case self[2]
+ when '<'
+ encoder.text_token kind, :class
+ when '['
+ encoder.text_token kind, :class
+ else
+ raise 'CodeRay bug: This case should not be reached.'
+ end
+ kind = kind.to_sym
+ opened_tokens << kind
+ encoder.begin_group kind
+ encoder.text_token self[2], :operator
+
+ elsif !opened_tokens.empty? && match = scan(/ [>\]] /x)
+ encoder.text_token match, :operator
+ encoder.end_group opened_tokens.pop
+
+ else
+ encoder.text_token getch, :space
+
+ end
+
+ end
+
+ encoder.end_group opened_tokens.pop until opened_tokens.empty?
+
+ encoder
+ end
+
+ end
+
+end
+end
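Reader's note on the new Raydebug scanner above: judging from its regular expressions, the input format it reads consists of `kind(text)` for a single token and `kind<` ... `>` or `kind[` ... `]` for a group. The sketch below reuses those regular expressions to show how a sample line would be decoded. `parse_raydebug` is a hypothetical function, not part of CodeRay. It returns plain arrays instead of calling an encoder, and it reports only the decoded tokens and groups, leaving out the `:class` and `:operator` tokens that the real scanner emits for the markup itself.

```ruby
require 'strscan'

# Hypothetical re-implementation of the loop above that collects events in
# an array so the result can be inspected directly.
def parse_raydebug(source)
  s = StringScanner.new(source)
  events = []
  opened = []
  until s.eos?
    if match = s.scan(/\s+/)
      events << [:text, match, :space]
    elsif s.scan(/ (\w+) \( ( [^\)\\]* ( \\. [^\)\\]* )* ) /x)
      # kind(text): group 1 is the kind, group 2 the token text
      events << [:text, s[2], s[1].to_sym]
      s.scan(/\)/) # the closing parenthesis is optional, as in the original
    elsif s.scan(/ (\w+) ([<\[]) /x)
      # kind< or kind[ opens a group
      kind = s[1].to_sym
      opened << kind
      events << [:begin_group, kind]
    elsif !opened.empty? && s.scan(/ [>\]] /x)
      events << [:end_group, opened.pop]
    else
      events << [:text, s.getch, :space]
    end
  end
  # Close any groups still open at end of input.
  events << [:end_group, opened.pop] until opened.empty?
  events
end
```

For example, `parse_raydebug('string<content(hi)> integer(1)')` yields a `:string` group containing one `:content` token, then a space, then an `:integer` token.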
diff --git a/lib/coderay/scanners/rhtml.rb b/lib/coderay/scanners/rhtml.rb
deleted file mode 100644
index ce51ec6..0000000
--- a/lib/coderay/scanners/rhtml.rb
+++ /dev/null
@@ -1,78 +0,0 @@
-module CodeRay
-module Scanners
-
- load :html
- load :ruby
-
- # RHTML Scanner
- class RHTML < Scanner
-
- include Streamable
- register_for :rhtml
- title 'HTML ERB Template'
-
- KINDS_NOT_LOC = HTML::KINDS_NOT_LOC
-
- ERB_RUBY_BLOCK = /
- <%(?!%)[=-]?
- (?>
- [^\-%]* # normal*
- (?> # special
- (?: %(?!>) | -(?!%>) )
- [^\-%]* # normal*
- )*
- )
- (?: -?%> )?
- /x
-
- START_OF_ERB = /
- <%(?!%)
- /x
-
- private
-
- def setup
- @ruby_scanner = CodeRay.scanner :ruby, :tokens => @tokens, :keep_tokens => true
- @html_scanner = CodeRay.scanner :html, :tokens => @tokens, :keep_tokens => true, :keep_state => true
- end
-
- def reset_instance
- super
- @html_scanner.reset
- end
-
- def scan_tokens tokens, options
-
- until eos?
-
- if (match = scan_until(/(?=#{START_OF_ERB})/o) || scan_rest) and not match.empty?
- @html_scanner.tokenize match
-
- elsif match = scan(/#{ERB_RUBY_BLOCK}/o)
- start_tag = match[/\A<%[-=#]?/]
- end_tag = match[/-?%?>?\z/]
- tokens << [:open, :inline]
- tokens << [start_tag, :inline_delimiter]
- code = match[start_tag.size .. -1 - end_tag.size]
- if start_tag == '<%#'
- tokens << [code, :comment]
- else
- @ruby_scanner.tokenize code
- end
- tokens << [end_tag, :inline_delimiter] unless end_tag.empty?
- tokens << [:close, :inline]
-
- else
- raise_inspect 'else-case reached!', tokens
- end
-
- end
-
- tokens
-
- end
-
- end
-
-end
-end
diff --git a/lib/coderay/scanners/ruby.rb b/lib/coderay/scanners/ruby.rb
index 5feaf9d..2be98a6 100644
--- a/lib/coderay/scanners/ruby.rb
+++ b/lib/coderay/scanners/ruby.rb
@@ -1,7 +1,6 @@
-# encoding: utf-8
module CodeRay
module Scanners
-
+
# This scanner is really complex, since Ruby _is_ a complex language!
#
# It tries to highlight 100% of all common code,
@@ -9,310 +8,240 @@ module Scanners
#
# It is optimized for HTML highlighting, and is not very useful for
# parsing or pretty printing.
- #
- # For now, I think it's better than the scanners in VIM or Syntax, or
- # any highlighter I was able to find, except Caleb's RubyLexer.
- #
- # I hope it's also better than the rdoc/irb lexer.
class Ruby < Scanner
-
- include Streamable
-
+
register_for :ruby
file_extension 'rb'
-
- helper :patterns
- if not defined? EncodingError
- EncodingError = Class.new Exception
+ autoload :Patterns, CodeRay.coderay_path('scanners', 'ruby', 'patterns')
+ autoload :StringState, CodeRay.coderay_path('scanners', 'ruby', 'string_state')
+
+ def interpreted_string_state
+ StringState.new :string, true, '"'
end
-
- private
- def scan_tokens tokens, options
- if string.respond_to?(:encoding)
- unless string.encoding == Encoding::UTF_8
- self.string = string.encode Encoding::UTF_8,
- :invalid => :replace, :undef => :replace, :replace => '?'
- end
- unicode = false
- else
- unicode = exist?(/[^\x00-\x7f]/)
+
+ protected
+
+ def setup
+ @state = :initial
+ end
+
+ def scan_tokens encoder, options
+ state, heredocs = options[:state] || @state
+ heredocs = heredocs.dup if heredocs.is_a?(Array)
+
+ if state && state.instance_of?(StringState)
+ encoder.begin_group state.type
end
- last_token_dot = false
- value_expected = true
- heredocs = nil
last_state = nil
- state = :initial
- depth = nil
- inline_block_stack = []
+ method_call_expected = false
+ value_expected = true
+
+ inline_block_stack = nil
+ inline_block_curly_depth = 0
+
+ if heredocs
+ state = heredocs.shift
+ encoder.begin_group state.type
+ heredocs = nil if heredocs.empty?
+ end
+
+ # def_object_stack = nil
+ # def_object_paren_depth = 0
patterns = Patterns # avoid constant lookup
+ unicode = string.respond_to?(:encoding) && string.encoding.name == 'UTF-8'
+
until eos?
- match = nil
- kind = nil
-
- if state.instance_of? patterns::StringState
-# {{{
- match = scan_until(state.pattern) || scan_rest
- tokens << [match, :content] unless match.empty?
- break if eos?
-
- if state.heredoc and self[1] # end of heredoc
- match = getch.to_s
- match << scan_until(/$/) unless eos?
- tokens << [match, :delimiter]
- tokens << [:close, state.type]
- state = state.next_state
- next
- end
-
- case match = getch
-
- when state.delim
- if state.paren
- state.paren_depth -= 1
- if state.paren_depth > 0
- tokens << [match, :nesting_delimiter]
- next
- end
- end
- tokens << [match, :delimiter]
- if state.type == :regexp and not eos?
- modifiers = scan(/#{patterns::REGEXP_MODIFIERS}/ox)
- tokens << [modifiers, :modifier] unless modifiers.empty?
- end
- tokens << [:close, state.type]
- value_expected = false
- state = state.next_state
-
- when '\\'
- if state.interpreted
- if esc = scan(/ #{patterns::ESCAPE} /ox)
- tokens << [match + esc, :char]
- else
- tokens << [match, :error]
- end
+
+ if state.instance_of? ::Symbol
+
+ if match = scan(/[ \t\f\v]+/)
+ encoder.text_token match, :space
+
+ elsif match = scan(/\n/)
+ if heredocs
+ unscan # heredoc scanning needs \n at start
+ state = heredocs.shift
+ encoder.begin_group state.type
+ heredocs = nil if heredocs.empty?
else
- case m = getch
- when state.delim, '\\'
- tokens << [match + m, :char]
- when nil
- tokens << [match, :error]
- else
- tokens << [match + m, :content]
- end
- end
-
- when '#'
- case peek(1)
- when '{'
- inline_block_stack << [state, depth, heredocs]
+ state = :initial if state == :undef_comma_expected
+ encoder.text_token match, :space
value_expected = true
- state = :initial
- depth = 1
- tokens << [:open, :inline]
- tokens << [match + getch, :inline_delimiter]
- when '$', '@'
- tokens << [match, :escape]
- last_state = state # scan one token as normal code, then return here
- state = :initial
- else
- raise_inspect 'else-case # reached; #%p not handled' % peek(1), tokens
end
-
- when state.paren
- state.paren_depth += 1
- tokens << [match, :nesting_delimiter]
-
- when /#{patterns::REGEXP_SYMBOLS}/ox
- tokens << [match, :function]
-
- else
- raise_inspect 'else-case " reached; %p not handled, state = %p' % [match, state], tokens
-
- end
- next
-# }}}
- else
-# {{{
- if match = scan(/[ \t\f]+/)
- kind = :space
- match << scan(/\s*/) unless eos? || heredocs
- value_expected = true if match.index(?\n)
- tokens << [match, kind]
- next
- elsif match = scan(/\\?\n/)
- kind = :space
- if match == "\n"
- value_expected = true
- state = :initial if state == :undef_comma_expected
- end
+ elsif match = scan(bol? ? / \#(!)?.* | #{patterns::RUBYDOC_OR_DATA} /ox : /\#.*/)
+ encoder.text_token match, self[1] ? :doctype : :comment
+
+ elsif match = scan(/\\\n/)
if heredocs
unscan # heredoc scanning needs \n at start
+ encoder.text_token scan(/\\/), :space
state = heredocs.shift
- tokens << [:open, state.type]
+ encoder.begin_group state.type
heredocs = nil if heredocs.empty?
- next
else
- match << scan(/\s*/) unless eos?
+ encoder.text_token match, :space
end
- tokens << [match, kind]
- next
-
- elsif bol? && match = scan(/\#!.*/)
- tokens << [match, :doctype]
- next
- elsif match = scan(/\#.*/) or
- ( bol? and match = scan(/#{patterns::RUBYDOC_OR_DATA}/o) )
- kind = :comment
- tokens << [match, kind]
- next
-
elsif state == :initial
-
+
# IDENTS #
- if match = scan(unicode ? /#{patterns::METHOD_NAME}/uo :
+ if !method_call_expected &&
+ match = scan(unicode ? /#{patterns::METHOD_NAME}/uo :
/#{patterns::METHOD_NAME}/o)
- if last_token_dot
- kind = if match[/^[A-Z]/] and not match?(/\(/) then :constant else :ident end
- else
- if value_expected != :expect_colon && scan(/:(?= )/)
- tokens << [match, :key]
- match = ':'
- kind = :operator
- else
- kind = patterns::IDENT_KIND[match]
- if kind == :ident
- if match[/\A[A-Z]/] and not match[/[!?]$/] and not match?(/\(/)
- kind = :constant
- end
- elsif kind == :reserved
- state = patterns::DEF_NEW_STATE[match]
- value_expected = :set if patterns::KEYWORDS_EXPECTING_VALUE[match]
- end
+ value_expected = false
+ kind = patterns::IDENT_KIND[match]
+ if kind == :ident
+ if match[/\A[A-Z]/] && !(match[/[!?]$/] || match?(/\(/))
+ kind = :constant
end
+ elsif kind == :keyword
+ state = patterns::KEYWORD_NEW_STATE[match]
+ value_expected = true if patterns::KEYWORDS_EXPECTING_VALUE[match]
end
- value_expected = :set if check(/#{patterns::VALUE_FOLLOWS}/o)
-
- elsif last_token_dot and match = scan(/#{patterns::METHOD_NAME_OPERATOR}|\(/o)
- kind = :ident
- value_expected = :set if check(unicode ? /#{patterns::VALUE_FOLLOWS}/uo :
- /#{patterns::VALUE_FOLLOWS}/o)
-
- # OPERATORS #
- elsif not last_token_dot and match = scan(/ \.\.\.? | (?:\.|::)() | [,\(\)\[\]\{\}] | ==?=? /x)
- if match !~ / [.\)\]\}] /x or match =~ /\.\.\.?/
- value_expected = :set
+ value_expected = true if !value_expected && check(/#{patterns::VALUE_FOLLOWS}/o)
+ encoder.text_token match, kind
+
+ elsif method_call_expected &&
+ match = scan(unicode ? /#{patterns::METHOD_AFTER_DOT}/uo :
+ /#{patterns::METHOD_AFTER_DOT}/o)
+ if method_call_expected == '::' && match[/\A[A-Z]/] && !match?(/\(/)
+ encoder.text_token match, :constant
+ else
+ encoder.text_token match, :ident
end
- last_token_dot = :set if self[1]
- kind = :operator
- unless inline_block_stack.empty?
+ method_call_expected = false
+ value_expected = check(/#{patterns::VALUE_FOLLOWS}/o)
+
+ # OPERATORS #
+ elsif !method_call_expected && match = scan(/ (\.(?!\.)|::) | (?: \.\.\.? | ==?=? | [,\(\[\{] )() | [\)\]\}] /x)
+ method_call_expected = self[1]
+ value_expected = !method_call_expected && self[2]
+ if inline_block_stack
case match
when '{'
- depth += 1
+ inline_block_curly_depth += 1
when '}'
- depth -= 1
- if depth == 0 # closing brace of inline block reached
- state, depth, heredocs = inline_block_stack.pop
+ inline_block_curly_depth -= 1
+ if inline_block_curly_depth == 0 # closing brace of inline block reached
+ state, inline_block_curly_depth, heredocs = inline_block_stack.pop
+ inline_block_stack = nil if inline_block_stack.empty?
heredocs = nil if heredocs && heredocs.empty?
- tokens << [match, :inline_delimiter]
- kind = :inline
- match = :close
+ encoder.text_token match, :inline_delimiter
+ encoder.end_group :inline
+ next
end
end
end
-
- elsif match = scan(/ ['"] /mx)
- tokens << [:open, :string]
- kind = :delimiter
- state = patterns::StringState.new :string, match == '"', match # important for streaming
-
- elsif match = scan(unicode ? /#{patterns::INSTANCE_VARIABLE}/uo :
- /#{patterns::INSTANCE_VARIABLE}/o)
- kind = :instance_variable
-
- elsif value_expected and match = scan(/\//)
- tokens << [:open, :regexp]
- kind = :delimiter
- interpreted = true
- state = patterns::StringState.new :regexp, interpreted, match
-
- # elsif match = scan(/[-+]?#{patterns::NUMERIC}/o)
- elsif match = value_expected ? scan(/[-+]?#{patterns::NUMERIC}/o) : scan(/#{patterns::NUMERIC}/o)
- kind = self[1] ? :float : :integer
-
+ encoder.text_token match, :operator
+
elsif match = scan(unicode ? /#{patterns::SYMBOL}/uo :
/#{patterns::SYMBOL}/o)
case delim = match[1]
when ?', ?"
- tokens << [:open, :symbol]
- tokens << [':', :symbol]
+ encoder.begin_group :symbol
+ encoder.text_token ':', :symbol
match = delim.chr
- kind = :delimiter
- state = patterns::StringState.new :symbol, delim == ?", match
+ encoder.text_token match, :delimiter
+ state = self.class::StringState.new :symbol, delim == ?", match
+ else
+ encoder.text_token match, :symbol
+ value_expected = false
+ end
+
+ elsif match = scan(/ ' (?:(?>[^'\\]*) ')? | " (?:(?>[^"\\\#]*) ")? /mx)
+ encoder.begin_group :string
+ if match.size == 1
+ encoder.text_token match, :delimiter
+ state = self.class::StringState.new :string, match == '"', match # important for streaming
+ else
+ encoder.text_token match[0,1], :delimiter
+ encoder.text_token match[1..-2], :content if match.size > 2
+ encoder.text_token match[-1,1], :delimiter
+ encoder.end_group :string
+ value_expected = false
+ end
+
+ elsif match = scan(unicode ? /#{patterns::INSTANCE_VARIABLE}/uo :
+ /#{patterns::INSTANCE_VARIABLE}/o)
+ value_expected = false
+ encoder.text_token match, :instance_variable
+
+ elsif value_expected && match = scan(/\//)
+ encoder.begin_group :regexp
+ encoder.text_token match, :delimiter
+ state = self.class::StringState.new :regexp, true, '/'
+
+ elsif match = scan(value_expected ? /[-+]?#{patterns::NUMERIC}/o : /#{patterns::NUMERIC}/o)
+ if method_call_expected
+ encoder.text_token match, :error
+ method_call_expected = false
else
- kind = :symbol
+ encoder.text_token match, self[1] ? :float : :integer # TODO: send :hex/:octal/:binary
end
-
- elsif match = scan(/ -[>=]? | [+!~^]=? | [*|&]{1,2}=? | >>? /x)
- value_expected = :set
- kind = :operator
-
- elsif value_expected and match = scan(unicode ? /#{patterns::HEREDOC_OPEN}/uo :
- /#{patterns::HEREDOC_OPEN}/o)
- indented = self[1] == '-'
+ value_expected = false
+
+ elsif match = scan(/ [-+!~^\/]=? | [:;] | [*|&]{1,2}=? | >>? /x)
+ value_expected = true
+ encoder.text_token match, :operator
+
+ elsif value_expected && match = scan(/#{patterns::HEREDOC_OPEN}/o)
quote = self[3]
delim = self[quote ? 4 : 2]
kind = patterns::QUOTE_TO_TYPE[quote]
- tokens << [:open, kind]
- tokens << [match, :delimiter]
- match = :close
- heredoc = patterns::StringState.new kind, quote != '\'', delim, (indented ? :indented : :linestart )
+ encoder.begin_group kind
+ encoder.text_token match, :delimiter
+ encoder.end_group kind
heredocs ||= [] # create heredocs if empty
- heredocs << heredoc
-
- elsif value_expected and match = scan(/#{patterns::FANCY_START_CORRECT}/o)
- kind, interpreted = *patterns::FancyStringType.fetch(self[1]) do
- raise_inspect 'Unknown fancy string: %%%p' % k, tokens
- end
- tokens << [:open, kind]
- state = patterns::StringState.new kind, interpreted, self[2]
- kind = :delimiter
-
- elsif value_expected and match = scan(unicode ? /#{patterns::CHARACTER}/uo :
- /#{patterns::CHARACTER}/o)
- kind = :integer
-
- elsif match = scan(/ [\/%]=? | <(?:<|=>?)? | [?:;] /x)
- value_expected = :set
- kind = :operator
-
+ heredocs << self.class::StringState.new(kind, quote != "'", delim,
+ self[1] == '-' ? :indented : :linestart)
+ value_expected = false
+
+ elsif value_expected && match = scan(/#{patterns::FANCY_STRING_START}/o)
+ kind = patterns::FANCY_STRING_KIND[self[1]]
+ encoder.begin_group kind
+ state = self.class::StringState.new kind, patterns::FANCY_STRING_INTERPRETED[self[1]], self[2]
+ encoder.text_token match, :delimiter
+
+ elsif value_expected && match = scan(/#{patterns::CHARACTER}/o)
+ value_expected = false
+ encoder.text_token match, :integer
+
+ elsif match = scan(/ %=? | <(?:<|=>?)? | \? /x)
+ value_expected = true
+ encoder.text_token match, :operator
+
elsif match = scan(/`/)
- if last_token_dot
- kind = :operator
- else
- tokens << [:open, :shell]
- kind = :delimiter
- state = patterns::StringState.new :shell, true, match
- end
-
+ encoder.begin_group :shell
+ encoder.text_token match, :delimiter
+ state = self.class::StringState.new :shell, true, match
+
elsif match = scan(unicode ? /#{patterns::GLOBAL_VARIABLE}/uo :
/#{patterns::GLOBAL_VARIABLE}/o)
- kind = :global_variable
-
+ encoder.text_token match, :global_variable
+ value_expected = false
+
elsif match = scan(unicode ? /#{patterns::CLASS_VARIABLE}/uo :
/#{patterns::CLASS_VARIABLE}/o)
- kind = :class_variable
-
+ encoder.text_token match, :class_variable
+ value_expected = false
+
+ elsif match = scan(/\\\z/)
+ encoder.text_token match, :space
+
else
- if !unicode && !string.respond_to?(:encoding)
+ if method_call_expected
+ method_call_expected = false
+ next
+ end
+ unless unicode
# check for unicode
- debug, $DEBUG = $DEBUG, false
+ $DEBUG_BEFORE, $DEBUG = $DEBUG, false
begin
if check(/./mu).size > 1
# seems like we should try again with unicode
@@ -321,124 +250,212 @@ module Scanners
rescue
# bad unicode char; use getch
ensure
- $DEBUG = debug
+ $DEBUG = $DEBUG_BEFORE
end
next if unicode
end
- kind = :error
- match = scan(unicode ? /./mu : /./m)
-
+
+ encoder.text_token getch, :error
+
end
-
- elsif state == :def_expected
- state = :initial
- if scan(/self\./)
- tokens << ['self', :pre_constant]
- tokens << ['.', :operator]
+
+ if last_state
+ state = last_state
+ last_state = nil
end
+
+ elsif state == :def_expected
if match = scan(unicode ? /(?>#{patterns::METHOD_NAME_EX})(?!\.|::)/uo :
/(?>#{patterns::METHOD_NAME_EX})(?!\.|::)/o)
- kind = :method
+ encoder.text_token match, :method
+ state = :initial
+ else
+ last_state = :dot_expected
+ state = :initial
+ end
+
+ elsif state == :dot_expected
+ if match = scan(/\.|::/)
+ # invalid definition
+ state = :def_expected
+ encoder.text_token match, :operator
else
- next
+ state = :initial
end
-
+
elsif state == :module_expected
if match = scan(/<</)
- kind = :operator
+ encoder.text_token match, :operator
else
state = :initial
- if match = scan(unicode ? /(?:#{patterns::IDENT}::)*#{patterns::IDENT}/uo :
- /(?:#{patterns::IDENT}::)*#{patterns::IDENT}/o)
- kind = :class
- else
- next
+ if match = scan(unicode ? / (?:#{patterns::IDENT}::)* #{patterns::IDENT} /oux :
+ / (?:#{patterns::IDENT}::)* #{patterns::IDENT} /ox)
+ encoder.text_token match, :class
end
end
-
+
elsif state == :undef_expected
state = :undef_comma_expected
- if match = scan(unicode ? /#{patterns::METHOD_NAME_EX}/uo :
- /#{patterns::METHOD_NAME_EX}/o)
- kind = :method
- elsif match = scan(unicode ? /#{patterns::SYMBOL}/uo :
- /#{patterns::SYMBOL}/o)
+ if match = scan(unicode ? /(?>#{patterns::METHOD_NAME_EX})(?!\.|::)/uo :
+ /(?>#{patterns::METHOD_NAME_EX})(?!\.|::)/o)
+ encoder.text_token match, :method
+ elsif match = scan(/#{patterns::SYMBOL}/o)
case delim = match[1]
when ?', ?"
- tokens << [:open, :symbol]
- tokens << [':', :symbol]
+ encoder.begin_group :symbol
+ encoder.text_token ':', :symbol
match = delim.chr
- kind = :delimiter
- state = patterns::StringState.new :symbol, delim == ?", match
+ encoder.text_token match, :delimiter
+ state = self.class::StringState.new :symbol, delim == ?", match
state.next_state = :undef_comma_expected
else
- kind = :symbol
+ encoder.text_token match, :symbol
end
else
state = :initial
- next
end
-
- elsif state == :alias_expected
- match = scan(unicode ? /(#{patterns::METHOD_NAME_OR_SYMBOL})([ \t]+)(#{patterns::METHOD_NAME_OR_SYMBOL})/uo :
- /(#{patterns::METHOD_NAME_OR_SYMBOL})([ \t]+)(#{patterns::METHOD_NAME_OR_SYMBOL})/o)
- if match
- tokens << [self[1], (self[1][0] == ?: ? :symbol : :method)]
- tokens << [self[2], :space]
- tokens << [self[3], (self[3][0] == ?: ? :symbol : :method)]
- end
- state = :initial
- next
-
elsif state == :undef_comma_expected
if match = scan(/,/)
- kind = :operator
+ encoder.text_token match, :operator
state = :undef_expected
else
state = :initial
- next
end
-
+
+ elsif state == :alias_expected
+ match = scan(unicode ? /(#{patterns::METHOD_NAME_OR_SYMBOL})([ \t]+)(#{patterns::METHOD_NAME_OR_SYMBOL})/uo :
+ /(#{patterns::METHOD_NAME_OR_SYMBOL})([ \t]+)(#{patterns::METHOD_NAME_OR_SYMBOL})/o)
+
+ if match
+ encoder.text_token self[1], (self[1][0] == ?: ? :symbol : :method)
+ encoder.text_token self[2], :space
+ encoder.text_token self[3], (self[3][0] == ?: ? :symbol : :method)
+ end
+ state = :initial
+
+ else
+ #:nocov:
+ raise_inspect 'Unknown state: %p' % [state], encoder
+ #:nocov:
end
-# }}}
- unless kind == :error
- if value_expected = value_expected == :set
- value_expected = :expect_colon if match == '?' || match == 'when'
- end
- last_token_dot = last_token_dot == :set
+ else # StringState
+
+ match = scan_until(state.pattern) || scan_rest
+ unless match.empty?
+ encoder.text_token match, :content
+ break if eos?
end
- if $CODERAY_DEBUG and not kind
- raise_inspect 'Error token %p in line %d' %
- [[match, kind], line], tokens, state
+ if state.heredoc && self[1] # end of heredoc
+ match = getch
+ match << scan_until(/$/) unless eos?
+ encoder.text_token match, :delimiter unless match.empty?
+ encoder.end_group state.type
+ state = state.next_state
+ next
end
- raise_inspect 'Empty token', tokens unless match
-
- tokens << [match, kind]
-
- if last_state
- state = last_state
- last_state = nil
+
+ case match = getch
+
+ when state.delim
+ if state.paren_depth
+ state.paren_depth -= 1
+ if state.paren_depth > 0
+ encoder.text_token match, :content
+ next
+ end
+ end
+ encoder.text_token match, :delimiter
+ if state.type == :regexp && !eos?
+ match = scan(/#{patterns::REGEXP_MODIFIERS}/o)
+ encoder.text_token match, :modifier unless match.empty?
+ end
+ encoder.end_group state.type
+ value_expected = false
+ state = state.next_state
+
+ when '\\'
+ if state.interpreted
+ if esc = scan(/#{patterns::ESCAPE}/o)
+ encoder.text_token match + esc, :char
+ else
+ encoder.text_token match, :error
+ end
+ else
+ case esc = getch
+ when nil
+ encoder.text_token match, :content
+ when state.delim, '\\'
+ encoder.text_token match + esc, :char
+ else
+ encoder.text_token match + esc, :content
+ end
+ end
+
+ when '#'
+ case peek(1)
+ when '{'
+ inline_block_stack ||= []
+ inline_block_stack << [state, inline_block_curly_depth, heredocs]
+ value_expected = true
+ state = :initial
+ inline_block_curly_depth = 1
+ encoder.begin_group :inline
+ encoder.text_token match + getch, :inline_delimiter
+ when '$', '@'
+ encoder.text_token match, :escape
+ last_state = state
+ state = :initial
+ else
+ #:nocov:
+ raise_inspect 'else-case # reached; #%p not handled' % [peek(1)], encoder
+ #:nocov:
+ end
+
+ when state.opening_paren
+ state.paren_depth += 1
+ encoder.text_token match, :content
+
+ else
+ #:nocov
+ raise_inspect 'else-case " reached; %p not handled, state = %p' % [match, state], encoder
+ #:nocov:
+
end
+
end
+
+ end
+
+ # cleaning up
+ if state.is_a? StringState
+ encoder.end_group state.type
end
-
- inline_block_stack << [state] if state.is_a? patterns::StringState
- until inline_block_stack.empty?
- this_block = inline_block_stack.pop
- tokens << [:close, :inline] if this_block.size > 1
- state = this_block.first
- tokens << [:close, state.type]
+
+ if options[:keep_state]
+ if state.is_a?(StringState) && state.heredoc
+ (heredocs ||= []).unshift state
+ state = :initial
+ elsif heredocs && heredocs.empty?
+ heredocs = nil
+ end
+ @state = state, heredocs
end
-
- tokens
+
+ if inline_block_stack
+ until inline_block_stack.empty?
+ state, = *inline_block_stack.pop
+ encoder.end_group :inline
+ encoder.end_group state.type
+ end
+ end
+
+ encoder
end
-
+
end
-
+
end
end
-
-# vim:fdm=marker
diff --git a/lib/coderay/scanners/ruby/patterns.rb b/lib/coderay/scanners/ruby/patterns.rb
index 9bd709c..a52198e 100644
--- a/lib/coderay/scanners/ruby/patterns.rb
+++ b/lib/coderay/scanners/ruby/patterns.rb
@@ -2,9 +2,9 @@
module CodeRay
module Scanners
- module Ruby::Patterns # :nodoc:
+ module Ruby::Patterns # :nodoc: all
- RESERVED_WORDS = %w[
+ KEYWORDS = %w[
and def end in or unless begin
defined? ensure module redo super until
BEGIN break do next rescue then
@@ -13,25 +13,27 @@ module Scanners
undef yield
]
- DEF_KEYWORDS = %w[ def ]
- UNDEF_KEYWORDS = %w[ undef ]
- ALIAS_KEYWORDS = %w[ alias ]
- MODULE_KEYWORDS = %w[ class module ]
- DEF_NEW_STATE = WordList.new(:initial).
- add(DEF_KEYWORDS, :def_expected).
- add(UNDEF_KEYWORDS, :undef_expected).
- add(ALIAS_KEYWORDS, :alias_expected).
- add(MODULE_KEYWORDS, :module_expected)
-
+ # See http://murfy.de/ruby-constants.
PREDEFINED_CONSTANTS = %w[
nil true false self
- DATA ARGV ARGF
+ DATA ARGV ARGF ENV
+ FALSE TRUE NIL
+ STDERR STDIN STDOUT
+ TOPLEVEL_BINDING
+ RUBY_COPYRIGHT RUBY_DESCRIPTION RUBY_ENGINE RUBY_PATCHLEVEL
+ RUBY_PLATFORM RUBY_RELEASE_DATE RUBY_REVISION RUBY_VERSION
__FILE__ __LINE__ __ENCODING__
]
IDENT_KIND = WordList.new(:ident).
- add(RESERVED_WORDS, :reserved).
- add(PREDEFINED_CONSTANTS, :pre_constant)
+ add(KEYWORDS, :keyword).
+ add(PREDEFINED_CONSTANTS, :predefined_constant)
+
+ KEYWORD_NEW_STATE = WordList.new(:initial).
+ add(%w[ def ], :def_expected).
+ add(%w[ undef ], :undef_expected).
+ add(%w[ alias ], :alias_expected).
+ add(%w[ class module ], :module_expected)
IDENT = 'ä'[/[[:alpha:]]/] == 'ä' ? /[[:alpha:]_][[:alnum:]_]*/ : /[^\W\d]\w*/
@@ -46,7 +48,9 @@ module Scanners
| ===? | =~ # simple equality, case equality, match
| ![~=@]? # negation with and without at sign, not-equal and not-match
/ox
- METHOD_NAME_EX = / #{IDENT} (?:[?!]|=(?!>))? | #{METHOD_NAME_OPERATOR} /ox
+ METHOD_SUFFIX = / (?: [?!] | = (?![~>]|=(?!>)) ) /x
+ METHOD_NAME_EX = / #{IDENT} #{METHOD_SUFFIX}? | #{METHOD_NAME_OPERATOR} /ox
+ METHOD_AFTER_DOT = / #{IDENT} [?!]? | #{METHOD_NAME_OPERATOR} /ox
INSTANCE_VARIABLE = / @ #{IDENT} /ox
CLASS_VARIABLE = / @@ #{IDENT} /ox
OBJECT_VARIABLE = / @@? #{IDENT} /ox
@@ -60,8 +64,7 @@ module Scanners
}
QUOTE_TO_TYPE.default = :string
- REGEXP_MODIFIERS = /[mixounse]*/
- REGEXP_SYMBOLS = /[|?*+(){}\[\].^$]/
+ REGEXP_MODIFIERS = /[mousenix]*/
DECIMAL = /\d+(?:_\d+)*/
OCTAL = /0_?[0-7]+(?:_[0-7]+)*/
@@ -87,7 +90,7 @@ module Scanners
[abefnrstv]
| [0-7]{1,3}
| x[0-9A-Fa-f]{1,2}
- | .?
+ | .
/mx
CONTROL_META_ESCAPE = /
@@ -110,12 +113,10 @@ module Scanners
# NOTE: This is not completely correct, but
# nobody needs heredoc delimiters ending with \n.
- # Also, delimiters starting with numbers are allowed.
- # but they are more often than not a false positive.
HEREDOC_OPEN = /
<< (-)? # $1 = float
(?:
- ( #{IDENT} ) # $2 = delim
+ ( [A-Za-z_0-9]+ ) # $2 = delim
|
( ["'`\/] ) # $3 = quote, type
( [^\n]*? ) \3 # $4 = delim
@@ -134,6 +135,8 @@ module Scanners
(?: \Z | (?=^\#CODE) )
/mx
+ RUBYDOC_OR_DATA = / #{RUBYDOC} | #{DATA} /xo
+
# Checks for a valid value to follow. This enables
# value_expected in method calls without parentheses.
VALUE_FOLLOWS = /
@@ -144,7 +147,7 @@ module Scanners
| [-+] \d
| #{CHARACTER}
)
- /x
+ /ox
KEYWORDS_EXPECTING_VALUE = WordList.new.add(%w[
and end in or unless begin
defined? ensure redo super until
@@ -153,89 +156,20 @@ module Scanners
while elsif if not return
yield
])
-
- RUBYDOC_OR_DATA = / #{RUBYDOC} | #{DATA} /xo
-
- RDOC_DATA_START = / ^=begin (?!\S) | ^__END__$ /x
-
- FANCY_START_CORRECT = / % ( [qQwWxsr] | (?![a-zA-Z0-9]) ) ([^a-zA-Z0-9]) /mx
-
- FancyStringType = {
- 'q' => [:string, false],
- 'Q' => [:string, true],
- 'r' => [:regexp, true],
- 's' => [:symbol, false],
- 'x' => [:shell, true]
- }
- FancyStringType['w'] = FancyStringType['q']
- FancyStringType['W'] = FancyStringType[''] = FancyStringType['Q']
-
- class StringState < Struct.new :type, :interpreted, :delim, :heredoc,
- :paren, :paren_depth, :pattern, :next_state
-
- CLOSING_PAREN = Hash[ *%w[
- ( )
- [ ]
- < >
- { }
- ] ]
-
- CLOSING_PAREN.each { |k,v| k.freeze; v.freeze } # debug, if I try to change it with <<
- OPENING_PAREN = CLOSING_PAREN.invert
-
- STRING_PATTERN = Hash.new do |h, k|
- delim, interpreted = *k
- delim_pattern = Regexp.escape(delim.dup) # dup: workaround for old Ruby
- if closing_paren = CLOSING_PAREN[delim]
- delim_pattern = delim_pattern[0..-1] if defined? JRUBY_VERSION # JRuby fix
- delim_pattern << Regexp.escape(closing_paren)
- end
- delim_pattern << '\\\\' unless delim == '\\'
-
- special_escapes =
- case interpreted
- when :regexp_symbols
- '| ' + REGEXP_SYMBOLS.source
- when :words
- '| \s'
- end
-
- h[k] =
- if interpreted and not delim == '#'
- / (?= [#{delim_pattern}] | \# [{$@] #{special_escapes} ) /mx
- else
- / (?= [#{delim_pattern}] #{special_escapes} ) /mx
- end
- end
-
- HEREDOC_PATTERN = Hash.new do |h, k|
- delim, interpreted, indented = *k
- delim_pattern = Regexp.escape(delim.dup) # dup: workaround for old Ruby
- delim_pattern = / \n #{ '(?>[\ \t]*)' if indented } #{ Regexp.new delim_pattern } $ /x
- h[k] =
- if interpreted
- / (?= #{delim_pattern}() | \\ | \# [{$@] ) /mx # $1 set == end of heredoc
- else
- / (?= #{delim_pattern}() | \\ ) /mx
- end
- end
-
- def initialize kind, interpreted, delim, heredoc = false
- if heredoc
- pattern = HEREDOC_PATTERN[ [delim, interpreted, heredoc == :indented] ]
- delim = nil
- else
- pattern = STRING_PATTERN[ [delim, interpreted] ]
- if paren = CLOSING_PAREN[delim]
- delim, paren = paren, delim
- paren_depth = 1
- end
- end
- super kind, interpreted, delim, heredoc, paren, paren_depth, pattern, :initial
- end
- end unless defined? StringState
-
+
+ FANCY_STRING_START = / % ( [QqrsWwx] | (?![a-zA-Z0-9]) ) ([^a-zA-Z0-9]) /x
+ FANCY_STRING_KIND = Hash.new(:string).merge({
+ 'r' => :regexp,
+ 's' => :symbol,
+ 'x' => :shell,
+ })
+ FANCY_STRING_INTERPRETED = Hash.new(true).merge({
+ 'q' => false,
+ 's' => false,
+ 'w' => false,
+ })
+
end
-
+
end
end
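
Note on the patterns.rb hunk above: the old FancyStringType table of [kind, interpreted] pairs is split into two hashes with defaults, FANCY_STRING_KIND and FANCY_STRING_INTERPRETED. The following is a minimal stand-alone sketch of that lookup, not CodeRay code. The constant names SKETCH_* and the helper classify_percent_literal are hypothetical and exist only for illustration; the regexp and the hash contents are copied from the added lines.

```ruby
# Hypothetical stand-alone sketch; names prefixed SKETCH_ are not part of CodeRay.
SKETCH_FANCY_START = / % ( [QqrsWwx] | (?![a-zA-Z0-9]) ) ([^a-zA-Z0-9]) /x

# Any prefix not listed falls back to :string (Hash default).
SKETCH_KIND = Hash.new(:string).merge(
  'r' => :regexp,
  's' => :symbol,
  'x' => :shell
)

# Any prefix not listed is treated as interpreted (Hash default true).
SKETCH_INTERPRETED = Hash.new(true).merge(
  'q' => false,
  's' => false,
  'w' => false
)

# Returns [kind, interpreted, opening delimiter] or nil if src contains
# no %-literal start.
def classify_percent_literal(src)
  m = SKETCH_FANCY_START.match(src)
  return nil unless m
  [SKETCH_KIND[m[1]], SKETCH_INTERPRETED[m[1]], m[2]]
end

p classify_percent_literal('%w[a b]')
p classify_percent_literal('%(x)')
p classify_percent_literal('%r{a}')
```

A bare %( has an empty prefix, so both hashes answer with their defaults. This presumably replaces the old explicit FancyStringType[''] entry.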
diff --git a/lib/coderay/scanners/ruby/string_state.rb b/lib/coderay/scanners/ruby/string_state.rb
new file mode 100644
index 0000000..2f398d1
--- /dev/null
+++ b/lib/coderay/scanners/ruby/string_state.rb
@@ -0,0 +1,71 @@
+# encoding: utf-8
+module CodeRay
+module Scanners
+
+ class Ruby
+
+ class StringState < Struct.new :type, :interpreted, :delim, :heredoc,
+ :opening_paren, :paren_depth, :pattern, :next_state # :nodoc: all
+
+ CLOSING_PAREN = Hash[ *%w[
+ ( )
+ [ ]
+ < >
+ { }
+ ] ].each { |k,v| k.freeze; v.freeze } # debug, if I try to change it with <<
+
+ STRING_PATTERN = Hash.new do |h, k|
+ delim, interpreted = *k
+ # delim = delim.dup # workaround for old Ruby
+ delim_pattern = Regexp.escape(delim)
+ if closing_paren = CLOSING_PAREN[delim]
+ delim_pattern << Regexp.escape(closing_paren)
+ end
+ delim_pattern << '\\\\' unless delim == '\\'
+
+ # special_escapes =
+ # case interpreted
+ # when :regexp_symbols
+ # '| [|?*+(){}\[\].^$]'
+ # end
+
+ h[k] =
+ if interpreted && delim != '#'
+ / (?= [#{delim_pattern}] | \# [{$@] ) /mx
+ else
+ / (?= [#{delim_pattern}] ) /mx
+ end
+ end
+
+ def initialize kind, interpreted, delim, heredoc = false
+ if heredoc
+ pattern = heredoc_pattern delim, interpreted, heredoc == :indented
+ delim = nil
+ else
+ pattern = STRING_PATTERN[ [delim, interpreted] ]
+ if closing_paren = CLOSING_PAREN[delim]
+ opening_paren = delim
+ delim = closing_paren
+ paren_depth = 1
+ end
+ end
+ super kind, interpreted, delim, heredoc, opening_paren, paren_depth, pattern, :initial
+ end
+
+ def heredoc_pattern delim, interpreted, indented
+ # delim = delim.dup # workaround for old Ruby
+ delim_pattern = Regexp.escape(delim)
+ delim_pattern = / (?:\A|\n) #{ '(?>[ \t]*)' if indented } #{ Regexp.new delim_pattern } $ /x
+ if interpreted
+ / (?= #{delim_pattern}() | \\ | \# [{$@] ) /mx # $1 set == end of heredoc
+ else
+ / (?= #{delim_pattern}() | \\ ) /mx
+ end
+ end
+
+ end
+
+ end
+
+end
+end
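
Note on string_state.rb above: STRING_PATTERN builds a lookahead regexp that stops the scanner just before the next interesting character inside a string: the delimiter, its matching closing paren if there is one, a backslash, and (for interpreted strings) the start of an interpolation. The following is a rough stand-alone sketch of that construction, assuming plain Ruby without CodeRay loaded. The method name sketch_string_pattern and the constant SKETCH_CLOSING_PAREN are hypothetical.

```ruby
# Hypothetical stand-alone sketch of the pattern construction; not CodeRay code.
SKETCH_CLOSING_PAREN = { '(' => ')', '[' => ']', '<' => '>', '{' => '}' }.freeze

def sketch_string_pattern(delim, interpreted)
  # Characters that end a run of plain string content.
  chars = Regexp.escape(delim)
  closing = SKETCH_CLOSING_PAREN[delim]
  chars += Regexp.escape(closing) if closing
  chars += '\\\\' unless delim == '\\'   # two backslash characters: an escaped backslash inside the class

  if interpreted && delim != '#'
    # Also stop before "#{", "#$" and "#@" (interpolation).
    / (?= [#{chars}] | \# [{$@] ) /mx
  else
    / (?= [#{chars}] ) /mx
  end
end

source = 'abc #{x} def" rest'
p source =~ sketch_string_pattern('"', true)    # index of the interpolation start
p source =~ sketch_string_pattern('"', false)   # index of the closing quote
```

Because the pattern is a pure lookahead, scan_until(pattern) consumes only the content before the stop character; the character itself is then read with getch, which matches the "case match = getch" dispatch in the StringState branch of the Ruby scanner diff.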
diff --git a/lib/coderay/scanners/scheme.rb b/lib/coderay/scanners/scheme.rb
deleted file mode 100644
index ba22b80..0000000
--- a/lib/coderay/scanners/scheme.rb
+++ /dev/null
@@ -1,145 +0,0 @@
-module CodeRay
- module Scanners
-
- # Scheme scanner for CodeRay (by closure).
- # Thanks to murphy for putting CodeRay into public.
- class Scheme < Scanner
-
- # TODO: function defs
- # TODO: built-in functions
-
- register_for :scheme
- file_extension 'scm'
-
- CORE_FORMS = %w[
- lambda let let* letrec syntax-case define-syntax let-syntax
- letrec-syntax begin define quote if or and cond case do delay
- quasiquote set! cons force call-with-current-continuation call/cc
- ]
-
- IDENT_KIND = CaseIgnoringWordList.new(:ident).
- add(CORE_FORMS, :reserved)
-
- #IDENTIFIER_INITIAL = /[a-z!@\$%&\*\/\:<=>\?~_\^]/i
- #IDENTIFIER_SUBSEQUENT = /#{IDENTIFIER_INITIAL}|\d|\.|\+|-/
- #IDENTIFIER = /#{IDENTIFIER_INITIAL}#{IDENTIFIER_SUBSEQUENT}*|\+|-|\.{3}/
- IDENTIFIER = /[a-zA-Z!@$%&*\/:<=>?~_^][\w!@$%&*\/:<=>?~^.+\-]*|[+-]|\.\.\./
- DIGIT = /\d/
- DIGIT10 = DIGIT
- DIGIT16 = /[0-9a-f]/i
- DIGIT8 = /[0-7]/
- DIGIT2 = /[01]/
- RADIX16 = /\#x/i
- RADIX8 = /\#o/i
- RADIX2 = /\#b/i
- RADIX10 = /\#d/i
- EXACTNESS = /#i|#e/i
- SIGN = /[\+-]?/
- EXP_MARK = /[esfdl]/i
- EXP = /#{EXP_MARK}#{SIGN}#{DIGIT}+/
- SUFFIX = /#{EXP}?/
- PREFIX10 = /#{RADIX10}?#{EXACTNESS}?|#{EXACTNESS}?#{RADIX10}?/
- PREFIX16 = /#{RADIX16}#{EXACTNESS}?|#{EXACTNESS}?#{RADIX16}/
- PREFIX8 = /#{RADIX8}#{EXACTNESS}?|#{EXACTNESS}?#{RADIX8}/
- PREFIX2 = /#{RADIX2}#{EXACTNESS}?|#{EXACTNESS}?#{RADIX2}/
- UINT10 = /#{DIGIT10}+#*/
- UINT16 = /#{DIGIT16}+#*/
- UINT8 = /#{DIGIT8}+#*/
- UINT2 = /#{DIGIT2}+#*/
- DECIMAL = /#{DIGIT10}+#+\.#*#{SUFFIX}|#{DIGIT10}+\.#{DIGIT10}*#*#{SUFFIX}|\.#{DIGIT10}+#*#{SUFFIX}|#{UINT10}#{EXP}/
- UREAL10 = /#{UINT10}\/#{UINT10}|#{DECIMAL}|#{UINT10}/
- UREAL16 = /#{UINT16}\/#{UINT16}|#{UINT16}/
- UREAL8 = /#{UINT8}\/#{UINT8}|#{UINT8}/
- UREAL2 = /#{UINT2}\/#{UINT2}|#{UINT2}/
- REAL10 = /#{SIGN}#{UREAL10}/
- REAL16 = /#{SIGN}#{UREAL16}/
- REAL8 = /#{SIGN}#{UREAL8}/
- REAL2 = /#{SIGN}#{UREAL2}/
- IMAG10 = /i|#{UREAL10}i/
- IMAG16 = /i|#{UREAL16}i/
- IMAG8 = /i|#{UREAL8}i/
- IMAG2 = /i|#{UREAL2}i/
- COMPLEX10 = /#{REAL10}@#{REAL10}|#{REAL10}\+#{IMAG10}|#{REAL10}-#{IMAG10}|\+#{IMAG10}|-#{IMAG10}|#{REAL10}/
- COMPLEX16 = /#{REAL16}@#{REAL16}|#{REAL16}\+#{IMAG16}|#{REAL16}-#{IMAG16}|\+#{IMAG16}|-#{IMAG16}|#{REAL16}/
- COMPLEX8 = /#{REAL8}@#{REAL8}|#{REAL8}\+#{IMAG8}|#{REAL8}-#{IMAG8}|\+#{IMAG8}|-#{IMAG8}|#{REAL8}/
- COMPLEX2 = /#{REAL2}@#{REAL2}|#{REAL2}\+#{IMAG2}|#{REAL2}-#{IMAG2}|\+#{IMAG2}|-#{IMAG2}|#{REAL2}/
- NUM10 = /#{PREFIX10}?#{COMPLEX10}/
- NUM16 = /#{PREFIX16}#{COMPLEX16}/
- NUM8 = /#{PREFIX8}#{COMPLEX8}/
- NUM2 = /#{PREFIX2}#{COMPLEX2}/
- NUM = /#{NUM10}|#{NUM16}|#{NUM8}|#{NUM2}/
-
- private
- def scan_tokens tokens,options
-
- state = :initial
- ident_kind = IDENT_KIND
-
- until eos?
- kind = match = nil
-
- case state
- when :initial
- if scan(/ \s+ | \\\n /x)
- kind = :space
- elsif scan(/['\(\[\)\]]|#\(/)
- kind = :operator_fat
- elsif scan(/;.*/)
- kind = :comment
- elsif scan(/#\\(?:newline|space|.?)/)
- kind = :char
- elsif scan(/#[ft]/)
- kind = :pre_constant
- elsif scan(/#{IDENTIFIER}/o)
- kind = ident_kind[matched]
- elsif scan(/\./)
- kind = :operator
- elsif scan(/"/)
- tokens << [:open, :string]
- state = :string
- tokens << ['"', :delimiter]
- next
- elsif scan(/#{NUM}/o) and not matched.empty?
- kind = :integer
- elsif getch
- kind = :error
- end
-
- when :string
- if scan(/[^"\\]+/) or scan(/\\.?/)
- kind = :content
- elsif scan(/"/)
- tokens << ['"', :delimiter]
- tokens << [:close, :string]
- state = :initial
- next
- else
- raise_inspect "else case \" reached; %p not handled." % peek(1),
- tokens, state
- end
-
- else
- raise "else case reached"
- end
-
- match ||= matched
- if $CODERAY_DEBUG and not kind
- raise_inspect 'Error token %p in line %d' %
- [[match, kind], line], tokens
- end
- raise_inspect 'Empty token', tokens, state unless match
-
- tokens << [match, kind]
-
- end # until eos
-
- if state == :string
- tokens << [:close, :string]
- end
-
- tokens
-
- end #scan_tokens
- end #class
- end #module scanners
-end #module coderay
\ No newline at end of file
diff --git a/lib/coderay/scanners/sql.rb b/lib/coderay/scanners/sql.rb
index 2d56c03..bcbffd5 100644
--- a/lib/coderay/scanners/sql.rb
+++ b/lib/coderay/scanners/sql.rb
@@ -5,31 +5,49 @@ module CodeRay module Scanners
register_for :sql
- RESERVED_WORDS = %w(
- and as avg before begin between by case collate columns create database
- databases delete distinct drop else end engine exists fields from full
- group having if index inner insert into is join key like not on or order
- outer primary prompt replace select set show table tables then trigger
- union update using values when where
+ KEYWORDS = %w(
+ all and any as before begin between by case check collate
+ each else end exists
+ for foreign from full group having if in inner is join
+ like not of on or order outer over references
+ then to union using values when where
+ left right distinct
+ )
+
+ OBJECTS = %w(
+ database databases table tables column columns fields index constraint
+ constraints transaction function procedure row key view trigger
+ )
+
+ COMMANDS = %w(
+ add alter comment create delete drop grant insert into select update set
+ show prompt begin commit rollback replace truncate
)
PREDEFINED_TYPES = %w(
- bigint bin binary bit blob bool boolean char date datetime decimal
- double enum float hex int integer longblob longtext mediumblob mediumint
- mediumtext oct smallint text time timestamp tinyblob tinyint tinytext
- unsigned varchar year
+ char varchar varchar2 enum binary text tinytext mediumtext
+ longtext blob tinyblob mediumblob longblob timestamp
+ date time datetime year double decimal float int
+ integer tinyint mediumint bigint smallint unsigned bit
+ bool boolean hex bin oct
)
- PREDEFINED_FUNCTIONS = %w( sum cast abs pi count min max avg )
+ PREDEFINED_FUNCTIONS = %w( sum cast substring abs pi count min max avg now )
- DIRECTIVES = %w( auto_increment unique default charset )
+ DIRECTIVES = %w(
+ auto_increment unique default charset initially deferred
+ deferrable cascade immediate read write asc desc after
+ primary foreign return engine
+ )
PREDEFINED_CONSTANTS = %w( null true false )
- IDENT_KIND = CaseIgnoringWordList.new(:ident).
- add(RESERVED_WORDS, :reserved).
- add(PREDEFINED_TYPES, :pre_type).
- add(PREDEFINED_CONSTANTS, :pre_constant).
+ IDENT_KIND = WordList::CaseIgnoring.new(:ident).
+ add(KEYWORDS, :keyword).
+ add(OBJECTS, :type).
+ add(COMMANDS, :class).
+ add(PREDEFINED_TYPES, :predefined_type).
+ add(PREDEFINED_CONSTANTS, :predefined_constant).
add(PREDEFINED_FUNCTIONS, :predefined).
add(DIRECTIVES, :directive)
@@ -38,58 +56,60 @@ module CodeRay module Scanners
STRING_PREFIXES = /[xnb]|_\w+/i
- def scan_tokens tokens, options
+ def scan_tokens encoder, options
state = :initial
string_type = nil
string_content = ''
+ name_expected = false
until eos?
- kind = nil
- match = nil
-
if state == :initial
- if scan(/ \s+ | \\\n /x)
- kind = :space
+ if match = scan(/ \s+ | \\\n /x)
+ encoder.text_token match, :space
- elsif scan(/(?:--\s?|#).*/)
- kind = :comment
+ elsif match = scan(/(?:--\s?|#).*/)
+ encoder.text_token match, :comment
- elsif scan(%r! /\* (?: .*? \*/ | .* ) !mx)
- kind = :comment
+ elsif match = scan(%r( /\* (!)? (?: .*? \*/ | .* ) )mx)
+ encoder.text_token match, self[1] ? :directive : :comment
- elsif scan(/ [-+*\/=<>;,!&^|()\[\]{}~%] | \.(?!\d) /x)
- kind = :operator
+ elsif match = scan(/ [*\/=<>:;,!&^|()\[\]{}~%] | [-+\.](?!\d) /x)
+ name_expected = true if match == '.' && check(/[A-Za-z_]/)
+ encoder.text_token match, :operator
- elsif scan(/(#{STRING_PREFIXES})?([`"'])/o)
+ elsif match = scan(/(#{STRING_PREFIXES})?([`"'])/o)
prefix = self[1]
string_type = self[2]
- tokens << [:open, :string]
- tokens << [prefix, :modifier] if prefix
+ encoder.begin_group :string
+ encoder.text_token prefix, :modifier if prefix
match = string_type
state = :string
- kind = :delimiter
+ encoder.text_token match, :delimiter
elsif match = scan(/ @? [A-Za-z_][A-Za-z_0-9]* /x)
- kind = match[0] == ?@ ? :variable : IDENT_KIND[match.downcase]
+ encoder.text_token match, name_expected ? :ident : (match[0] == ?@ ? :variable : IDENT_KIND[match])
+ name_expected = false
- elsif scan(/0[xX][0-9A-Fa-f]+/)
- kind = :hex
+ elsif match = scan(/0[xX][0-9A-Fa-f]+/)
+ encoder.text_token match, :hex
- elsif scan(/0[0-7]+(?![89.eEfF])/)
- kind = :oct
+ elsif match = scan(/0[0-7]+(?![89.eEfF])/)
+ encoder.text_token match, :octal
- elsif scan(/(?>\d+)(?![.eEfF])/)
- kind = :integer
+ elsif match = scan(/[-+]?(?>\d+)(?![.eEfF])/)
+ encoder.text_token match, :integer
- elsif scan(/\d[fF]|\d*\.\d+(?:[eE][+-]?\d+)?|\d+[eE][+-]?\d+/)
- kind = :float
+ elsif match = scan(/[-+]?(?:\d[fF]|\d*\.\d+(?:[eE][+-]?\d+)?|\d+[eE][+-]?\d+)/)
+ encoder.text_token match, :float
+
+ elsif match = scan(/\\N/)
+ encoder.text_token match, :predefined_constant
else
- getch
- kind = :error
+ encoder.text_token getch, :error
end
@@ -104,54 +124,48 @@ module CodeRay module Scanners
next
end
unless string_content.empty?
- tokens << [string_content, :content]
+ encoder.text_token string_content, :content
string_content = ''
end
- tokens << [matched, :delimiter]
- tokens << [:close, :string]
+ encoder.text_token match, :delimiter
+ encoder.end_group :string
state = :initial
string_type = nil
- next
else
string_content << match
end
- next
- elsif scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
+ elsif match = scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
unless string_content.empty?
- tokens << [string_content, :content]
+ encoder.text_token string_content, :content
string_content = ''
end
- kind = :char
+ encoder.text_token match, :char
elsif match = scan(/ \\ . /mox)
string_content << match
next
- elsif scan(/ \\ | $ /x)
+ elsif match = scan(/ \\ | $ /x)
unless string_content.empty?
- tokens << [string_content, :content]
+ encoder.text_token string_content, :content
string_content = ''
end
- kind = :error
+ encoder.text_token match, :error
state = :initial
else
- raise "else case \" reached; %p not handled." % peek(1), tokens
+ raise "else case \" reached; %p not handled." % peek(1), encoder
end
else
- raise 'else-case reached', tokens
+ raise 'else-case reached', encoder
end
- match ||= matched
- unless kind
- raise_inspect 'Error token %p in line %d' %
- [[match, kind], line], tokens, state
- end
- raise_inspect 'Empty token', tokens unless match
-
- tokens << [match, kind]
-
end
- tokens
+
+ if state == :string
+ encoder.end_group state
+ end
+
+ encoder
end
diff --git a/lib/coderay/scanners/text.rb b/lib/coderay/scanners/text.rb
new file mode 100644
index 0000000..bde9029
--- /dev/null
+++ b/lib/coderay/scanners/text.rb
@@ -0,0 +1,26 @@
+module CodeRay
+ module Scanners
+
+ # Scanner for plain text.
+ #
+ # Yields just one token of the kind :plain.
+ #
+ # Alias: +plaintext+, +plain+
+ class Text < Scanner
+
+ register_for :text
+ title 'Plain text'
+
+ KINDS_NOT_LOC = [:plain] # :nodoc:
+
+ protected
+
+ def scan_tokens encoder, options
+ encoder.text_token string, :plain
+ encoder
+ end
+
+ end
+
+ end
+end
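
Note on the scanner diffs above: they all follow the same conversion. Instead of appending [match, kind] pairs and [:open, kind] / [:close, kind] markers to a tokens array, scan_tokens now calls text_token, begin_group and end_group on the encoder it is given and returns that encoder. The following is a minimal sketch of that calling convention, assuming only Ruby's standard strscan library. RecordingEncoder and scan_toy are hypothetical names; the three method names are the ones used in the diff, and the real encoder classes are not reproduced here.

```ruby
require 'strscan'

# Hypothetical stand-in for an encoder: it only records the calls it receives.
class RecordingEncoder
  attr_reader :events

  def initialize
    @events = []
  end

  def text_token(text, kind)
    @events << [text, kind]
  end

  def begin_group(kind)
    @events << [:begin_group, kind]
  end

  def end_group(kind)
    @events << [:end_group, kind]
  end
end

# Toy scanner: double-quoted strings become a :string group,
# everything else is emitted as :plain text.
def scan_toy(source, encoder)
  s = StringScanner.new(source)
  until s.eos?
    if (match = s.scan(/"/))
      encoder.begin_group :string
      encoder.text_token match, :delimiter
      content = s.scan(/[^"]+/)
      encoder.text_token content, :content if content
      closing = s.scan(/"/)
      encoder.text_token closing, :delimiter if closing
      encoder.end_group :string     # closed even if the quote is unterminated
    else
      encoder.text_token s.scan(/[^"]+/), :plain
    end
  end
  encoder
end

scan_toy('a "b"', RecordingEncoder.new).events.each { |event| p event }
```

One visible consequence in the diffs is that the shared tail of each loop (match ||= matched, the 'Empty token' check, tokens << [match, kind]) disappears, since every branch now emits its own token directly.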
diff --git a/lib/coderay/scanners/xml.rb b/lib/coderay/scanners/xml.rb
index aeabeca..947f16e 100644
--- a/lib/coderay/scanners/xml.rb
+++ b/lib/coderay/scanners/xml.rb
@@ -3,7 +3,7 @@ module Scanners
load :html
- # XML Scanner
+ # Scanner for XML.
#
# Currently this is the same scanner as Scanners::HTML.
class XML < HTML
diff --git a/lib/coderay/scanners/yaml.rb b/lib/coderay/scanners/yaml.rb
index 095d609..96f4e93 100644
--- a/lib/coderay/scanners/yaml.rb
+++ b/lib/coderay/scanners/yaml.rb
@@ -1,7 +1,7 @@
module CodeRay
module Scanners
- # YAML Scanner
+ # Scanner for YAML.
#
# Based on the YAML scanner from Syntax by Jamis Buck.
class YAML < Scanner
@@ -11,57 +11,59 @@ module Scanners
KINDS_NOT_LOC = :all
- def scan_tokens tokens, options
+ protected
+
+ def scan_tokens encoder, options
state = :initial
- key_indent = 0
+ key_indent = string_indent = 0
until eos?
- kind = nil
- match = nil
key_indent = nil if bol?
if match = scan(/ +[\t ]*/)
- kind = :space
+ encoder.text_token match, :space
elsif match = scan(/\n+/)
- kind = :space
+ encoder.text_token match, :space
state = :initial if match.index(?\n)
elsif match = scan(/#.*/)
- kind = :comment
+ encoder.text_token match, :comment
elsif bol? and case
when match = scan(/---|\.\.\./)
- tokens << [:open, :head]
- tokens << [match, :head]
- tokens << [:close, :head]
+ encoder.begin_group :head
+ encoder.text_token match, :head
+ encoder.end_group :head
next
when match = scan(/%.*/)
- tokens << [match, :doctype]
+ encoder.text_token match, :doctype
next
end
elsif state == :value and case
- when !check(/(?:"[^"]*")(?=: |:$)/) && scan(/"/)
- tokens << [:open, :string]
- tokens << [matched, :delimiter]
- tokens << [matched, :content] if scan(/ [^"\\]* (?: \\. [^"\\]* )* /mx)
- tokens << [matched, :delimiter] if scan(/"/)
- tokens << [:close, :string]
+ when !check(/(?:"[^"]*")(?=: |:$)/) && match = scan(/"/)
+ encoder.begin_group :string
+ encoder.text_token match, :delimiter
+ encoder.text_token match, :content if match = scan(/ [^"\\]* (?: \\. [^"\\]* )* /mx)
+ encoder.text_token match, :delimiter if match = scan(/"/)
+ encoder.end_group :string
next
when match = scan(/[|>][-+]?/)
- tokens << [:open, :string]
- tokens << [match, :delimiter]
- string_indent = key_indent || column(pos - match.size - 1)
- tokens << [matched, :content] if scan(/(?:\n+ {#{string_indent + 1}}.*)+/)
- tokens << [:close, :string]
+ encoder.begin_group :string
+ encoder.text_token match, :delimiter
+ string_indent = key_indent || column(pos - match.size) - 1
+ encoder.text_token matched, :content if scan(/(?:\n+ {#{string_indent + 1}}.*)+/)
+ encoder.end_group :string
next
when match = scan(/(?![!"*&]).+?(?=$|\s+#)/)
- tokens << [match, :string]
- string_indent = key_indent || column(pos - match.size - 1)
- tokens << [matched, :string] if scan(/(?:\n+ {#{string_indent + 1}}.*)+/)
+ encoder.begin_group :string
+ encoder.text_token match, :content
+ string_indent = key_indent || column(pos - match.size) - 1
+ encoder.text_token matched, :content if scan(/(?:\n+ {#{string_indent + 1}}.*)+/)
+ encoder.end_group :string
next
end
@@ -69,68 +71,67 @@ module Scanners
when match = scan(/[-:](?= |$)/)
state = :value if state == :colon && (match == ':' || match == '-')
state = :value if state == :initial && match == '-'
- kind = :operator
+ encoder.text_token match, :operator
+ next
when match = scan(/[,{}\[\]]/)
- kind = :operator
- when state == :initial && match = scan(/[\w.() ]*\S(?=: |:$)/)
- kind = :key
- key_indent = column(pos - match.size - 1)
- # tokens << [key_indent.inspect, :debug]
+ encoder.text_token match, :operator
+ next
+ when state == :initial && match = scan(/[-\w.()\/ ]*\S(?= *:(?: |$))/)
+ encoder.text_token match, :key
+ key_indent = column(pos - match.size) - 1
state = :colon
- when match = scan(/(?:"[^"\n]*"|'[^'\n]*')(?=: |:$)/)
- tokens << [:open, :key]
- tokens << [match[0,1], :delimiter]
- tokens << [match[1..-2], :content]
- tokens << [match[-1,1], :delimiter]
- tokens << [:close, :key]
- key_indent = column(pos - match.size - 1)
- # tokens << [key_indent.inspect, :debug]
+ next
+ when match = scan(/(?:"[^"\n]*"|'[^'\n]*')(?= *:(?: |$))/)
+ encoder.begin_group :key
+ encoder.text_token match[0,1], :delimiter
+ encoder.text_token match[1..-2], :content
+ encoder.text_token match[-1,1], :delimiter
+ encoder.end_group :key
+ key_indent = column(pos - match.size) - 1
state = :colon
next
- when scan(/(![\w\/]+)(:([\w:]+))?/)
- tokens << [self[1], :type]
+ when match = scan(/(![\w\/]+)(:([\w:]+))?/)
+ encoder.text_token self[1], :type
if self[2]
- tokens << [':', :operator]
- tokens << [self[3], :class]
+ encoder.text_token ':', :operator
+ encoder.text_token self[3], :class
end
next
- when scan(/&\S+/)
- kind = :variable
- when scan(/\*\w+/)
- kind = :global_variable
- when scan(/<</)
- kind = :class_variable
- when scan(/\d\d:\d\d:\d\d/)
- kind = :oct
- when scan(/\d\d\d\d-\d\d-\d\d\s\d\d:\d\d:\d\d(\.\d+)? [-+]\d\d:\d\d/)
- kind = :oct
- when scan(/:\w+/)
- kind = :symbol
- when scan(/[^:\s]+(:(?! |$)[^:\s]*)* .*/)
- kind = :error
- when scan(/[^:\s]+(:(?! |$)[^:\s]*)*/)
- kind = :error
+ when match = scan(/&\S+/)
+ encoder.text_token match, :variable
+ next
+ when match = scan(/\*\w+/)
+ encoder.text_token match, :global_variable
+ next
+ when match = scan(/<</)
+ encoder.text_token match, :class_variable
+ next
+ when match = scan(/\d\d:\d\d:\d\d/)
+ encoder.text_token match, :octal
+ next
+ when match = scan(/\d\d\d\d-\d\d-\d\d\s\d\d:\d\d:\d\d(\.\d+)? [-+]\d\d:\d\d/)
+ encoder.text_token match, :octal
+ next
+ when match = scan(/:\w+/)
+ encoder.text_token match, :symbol
+ next
+ when match = scan(/[^:\s]+(:(?! |$)[^:\s]*)* .*/)
+ encoder.text_token match, :error
+ next
+ when match = scan(/[^:\s]+(:(?! |$)[^:\s]*)*/)
+ encoder.text_token match, :error
+ next
end
else
- getch
- kind = :error
+ raise if eos?
+ encoder.text_token getch, :error
end
- match ||= matched
-
- if $CODERAY_DEBUG and not kind
- raise_inspect 'Error token %p in line %d' %
- [[match, kind], line], tokens, state
- end
- raise_inspect 'Empty token', tokens, state unless match
-
- tokens << [match, kind]
-
end
- tokens
+ encoder
end
end
diff --git a/lib/coderay/style.rb b/lib/coderay/style.rb
index c2977c5..df4704f 100644
--- a/lib/coderay/style.rb
+++ b/lib/coderay/style.rb
@@ -1,20 +1,23 @@
module CodeRay
# This module holds the Style class and its subclasses.
- #
+ #
# See Plugin.
module Styles
extend PluginHost
plugin_path File.dirname(__FILE__), 'styles'
-
+
+ # Base class for styles.
+ #
+ # Styles are used by Encoders::HTML to colorize tokens.
class Style
extend Plugin
plugin_host Styles
-
- DEFAULT_OPTIONS = { }
-
+
+ DEFAULT_OPTIONS = { } # :nodoc:
+
end
-
+
end
-
+
end
diff --git a/lib/coderay/styles/_map.rb b/lib/coderay/styles/_map.rb
index 52035fe..92d4354 100644
--- a/lib/coderay/styles/_map.rb
+++ b/lib/coderay/styles/_map.rb
@@ -1,7 +1,7 @@
module CodeRay
module Styles
-
- default :cycnus
-
+
+ default :alpha
+
end
end
diff --git a/lib/coderay/styles/alpha.rb b/lib/coderay/styles/alpha.rb
new file mode 100644
index 0000000..8506d10
--- /dev/null
+++ b/lib/coderay/styles/alpha.rb
@@ -0,0 +1,142 @@
+module CodeRay
+module Styles
+
+ # A colorful theme using CSS 3 colors (with alpha channel).
+ class Alpha < Style
+
+ register_for :alpha
+
+ code_background = 'hsl(0,0%,95%)'
+ numbers_background = 'hsl(180,65%,90%)'
+ border_color = 'silver'
+ normal_color = 'black'
+
+ CSS_MAIN_STYLES = <<-MAIN # :nodoc:
+.CodeRay {
+ background-color: #{code_background};
+ border: 1px solid #{border_color};
+ color: #{normal_color};
+}
+.CodeRay pre {
+ margin: 0px;
+}
+
+span.CodeRay { white-space: pre; border: 0px; padding: 2px; }
+
+table.CodeRay { border-collapse: collapse; width: 100%; padding: 2px; }
+table.CodeRay td { padding: 2px 4px; vertical-align: top; }
+
+.CodeRay .line-numbers {
+ background-color: #{numbers_background};
+ color: gray;
+ text-align: right;
+ -webkit-user-select: none;
+ -moz-user-select: none;
+ user-select: none;
+}
+.CodeRay .line-numbers a {
+ background-color: #{numbers_background} !important;
+ color: gray !important;
+ text-decoration: none !important;
+}
+.CodeRay .line-numbers a:target { color: blue !important; }
+.CodeRay .line-numbers .highlighted { color: red !important; }
+.CodeRay .line-numbers .highlighted a { color: red !important; }
+.CodeRay span.line-numbers { padding: 0px 4px; }
+.CodeRay .line { display: block; float: left; width: 100%; }
+.CodeRay .code { width: 100%; }
+ MAIN
+
+ TOKEN_COLORS = <<-'TOKENS'
+.debug { color: white !important; background: blue !important; }
+
+.annotation { color:#007 }
+.attribute-name { color:#b48 }
+.attribute-value { color:#700 }
+.binary { color:#509 }
+.char .content { color:#D20 }
+.char .delimiter { color:#710 }
+.char { color:#D20 }
+.class { color:#B06; font-weight:bold }
+.class-variable { color:#369 }
+.color { color:#0A0 }
+.comment { color:#777 }
+.comment .char { color:#444 }
+.comment .delimiter { color:#444 }
+.complex { color:#A08 }
+.constant { color:#036; font-weight:bold }
+.decorator { color:#B0B }
+.definition { color:#099; font-weight:bold }
+.delimiter { color:black }
+.directive { color:#088; font-weight:bold }
+.doc { color:#970 }
+.doc-string { color:#D42; font-weight:bold }
+.doctype { color:#34b }
+.entity { color:#800; font-weight:bold }
+.error { color:#F00; background-color:#FAA }
+.escape { color:#666 }
+.exception { color:#C00; font-weight:bold }
+.float { color:#60E }
+.function { color:#06B; font-weight:bold }
+.global-variable { color:#d70 }
+.hex { color:#02b }
+.imaginary { color:#f00 }
+.include { color:#B44; font-weight:bold }
+.inline { background-color: hsla(0,0%,0%,0.07); color: black }
+.inline-delimiter { font-weight: bold; color: #666 }
+.instance-variable { color:#33B }
+.integer { color:#00D }
+.key .char { color: #60f }
+.key .delimiter { color: #404 }
+.key { color: #606 }
+.keyword { color:#080; font-weight:bold }
+.label { color:#970; font-weight:bold }
+.local-variable { color:#963 }
+.namespace { color:#707; font-weight:bold }
+.octal { color:#40E }
+.operator { }
+.predefined { color:#369; font-weight:bold }
+.predefined-constant { color:#069 }
+.predefined-type { color:#0a5; font-weight:bold }
+.preprocessor { color:#579 }
+.pseudo-class { color:#00C; font-weight:bold }
+.regexp .content { color:#808 }
+.regexp .delimiter { color:#404 }
+.regexp .modifier { color:#C2C }
+.regexp { background-color:hsla(300,100%,50%,0.06); }
+.reserved { color:#080; font-weight:bold }
+.shell .content { color:#2B2 }
+.shell .delimiter { color:#161 }
+.shell { background-color:hsla(120,100%,50%,0.06); }
+.string .char { color: #b0b }
+.string .content { color: #D20 }
+.string .delimiter { color: #710 }
+.string .modifier { color: #E40 }
+.string { background-color:hsla(0,100%,50%,0.05); }
+.symbol .content { color:#A60 }
+.symbol .delimiter { color:#630 }
+.symbol { color:#A60 }
+.tag { color:#070 }
+.type { color:#339; font-weight:bold }
+.value { color: #088; }
+.variable { color:#037 }
+
+.insert { background: hsla(120,100%,50%,0.12) }
+.delete { background: hsla(0,100%,50%,0.12) }
+.change { color: #bbf; background: #007; }
+.head { color: #f8f; background: #505 }
+.head .filename { color: white; }
+
+.delete .eyecatcher { background-color: hsla(0,100%,50%,0.2); border: 1px solid hsla(0,100%,45%,0.5); margin: -1px; border-bottom: none; border-top-left-radius: 5px; border-top-right-radius: 5px; }
+.insert .eyecatcher { background-color: hsla(120,100%,50%,0.2); border: 1px solid hsla(120,100%,25%,0.5); margin: -1px; border-top: none; border-bottom-left-radius: 5px; border-bottom-right-radius: 5px; }
+
+.insert .insert { color: #0c0; background:transparent; font-weight:bold }
+.delete .delete { color: #c00; background:transparent; font-weight:bold }
+.change .change { color: #88f }
+.head .head { color: #f4f }
+ TOKENS
+
+ end
+
+end
+end
diff --git a/lib/coderay/styles/cycnus.rb b/lib/coderay/styles/cycnus.rb
deleted file mode 100644
index da4f626..0000000
--- a/lib/coderay/styles/cycnus.rb
+++ /dev/null
@@ -1,152 +0,0 @@
-module CodeRay
-module Styles
-
- class Cycnus < Style
-
- register_for :cycnus
-
- code_background = '#f8f8f8'
- numbers_background = '#def'
- border_color = 'silver'
- normal_color = '#000'
-
- CSS_MAIN_STYLES = <<-MAIN
-.CodeRay {
- background-color: #{code_background};
- border: 1px solid #{border_color};
- font-family: 'Courier New', 'Terminal', monospace;
- color: #{normal_color};
-}
-.CodeRay pre { margin: 0px }
-
-div.CodeRay { }
-
-span.CodeRay { white-space: pre; border: 0px; padding: 2px }
-
-table.CodeRay { border-collapse: collapse; width: 100%; padding: 2px }
-table.CodeRay td { padding: 2px 4px; vertical-align: top }
-
-.CodeRay .line_numbers, .CodeRay .no {
- background-color: #{numbers_background};
- color: gray;
- text-align: right;
-}
-.CodeRay .line_numbers tt { font-weight: bold }
-.CodeRay .line_numbers .highlighted { color: red }
-.CodeRay .line { display: block; float: left; width: 100%; }
-.CodeRay .no { padding: 0px 4px }
-.CodeRay .code { width: 100% }
-
-ol.CodeRay { font-size: 10pt }
-ol.CodeRay li { white-space: pre }
-
-.CodeRay .code pre { overflow: auto }
- MAIN
-
- TOKEN_COLORS = <<-'TOKENS'
-.debug { color:white ! important; background:blue ! important; }
-
-.af { color:#00C }
-.an { color:#007 }
-.at { color:#f08 }
-.av { color:#700 }
-.aw { color:#C00 }
-.bi { color:#509; font-weight:bold }
-.c { color:#888; }
-
-.ch { color:#04D }
-.ch .k { color:#04D }
-.ch .dl { color:#039 }
-
-.cl { color:#B06; font-weight:bold }
-.cm { color:#A08; font-weight:bold }
-.co { color:#036; font-weight:bold }
-.cr { color:#0A0 }
-.cv { color:#369 }
-.de { color:#B0B; }
-.df { color:#099; font-weight:bold }
-.di { color:#088; font-weight:bold }
-.dl { color:black }
-.do { color:#970 }
-.dt { color:#34b }
-.ds { color:#D42; font-weight:bold }
-.e { color:#666; font-weight:bold }
-.en { color:#800; font-weight:bold }
-.er { color:#F00; background-color:#FAA }
-.ex { color:#C00; font-weight:bold }
-.fl { color:#60E; font-weight:bold }
-.fu { color:#06B; font-weight:bold }
-.gv { color:#d70; font-weight:bold }
-.hx { color:#058; font-weight:bold }
-.i { color:#00D; font-weight:bold }
-.ic { color:#B44; font-weight:bold }
-
-.il { background: #ddd; color: black }
-.il .il { background: #ccc }
-.il .il .il { background: #bbb }
-.il .idl { background: #ddd; font-weight: bold; color: #666 }
-.idl { background-color: #bbb; font-weight: bold; color: #666; }
-
-.im { color:#f00; }
-.in { color:#B2B; font-weight:bold }
-.iv { color:#33B }
-.la { color:#970; font-weight:bold }
-.lv { color:#963 }
-.oc { color:#40E; font-weight:bold }
-.of { color:#000; font-weight:bold }
-.op { }
-.pc { color:#038; font-weight:bold }
-.pd { color:#369; font-weight:bold }
-.pp { color:#579; }
-.ps { color:#00C; font-weight:bold }
-.pt { color:#074; font-weight:bold }
-.r, .kw { color:#080; font-weight:bold }
-
-.ke { color: #808; }
-.ke .dl { color: #606; }
-.ke .ch { color: #80f; }
-.vl { color: #088; }
-
-.rx { background-color:#fff0ff }
-.rx .k { color:#808 }
-.rx .dl { color:#404 }
-.rx .mod { color:#C2C }
-.rx .fu { color:#404; font-weight: bold }
-
-.s { background-color:#fff0f0; color: #D20; }
-.s .s { background-color:#ffe0e0 }
-.s .s .s { background-color:#ffd0d0 }
-.s .k { }
-.s .ch { color: #b0b; }
-.s .dl { color: #710; }
-
-.sh { background-color:#f0fff0; color:#2B2 }
-.sh .k { }
-.sh .dl { color:#161 }
-
-.sy { color:#A60 }
-.sy .k { color:#A60 }
-.sy .dl { color:#630 }
-
-.ta { color:#070 }
-.tf { color:#070; font-weight:bold }
-.ts { color:#D70; font-weight:bold }
-.ty { color:#339; font-weight:bold }
-.v { color:#036 }
-.xt { color:#444 }
-
-.ins { background: #afa; }
-.del { background: #faa; }
-.chg { color: #aaf; background: #007; }
-.head { color: #f8f; background: #505 }
-
-.ins .ins { color: #080; font-weight:bold }
-.del .del { color: #800; font-weight:bold }
-.chg .chg { color: #66f; }
-.head .head { color: #f4f; }
- TOKENS
-
- end
-
-end
-end
diff --git a/lib/coderay/styles/murphy.rb b/lib/coderay/styles/murphy.rb
deleted file mode 100644
index 8345942..0000000
--- a/lib/coderay/styles/murphy.rb
+++ /dev/null
@@ -1,134 +0,0 @@
-module CodeRay
-module Styles
-
- class Murphy < Style
-
- register_for :murphy
-
- code_background = '#001129'
- numbers_background = code_background
- border_color = 'silver'
- normal_color = '#C0C0C0'
-
- CSS_MAIN_STYLES = <<-MAIN
-.CodeRay {
- background-color: #{code_background};
- border: 1px solid #{border_color};
- font-family: 'Courier New', 'Terminal', monospace;
- color: #{normal_color};
-}
-.CodeRay pre { margin: 0px; }
-
-div.CodeRay { }
-
-span.CodeRay { white-space: pre; border: 0px; padding: 2px; }
-
-table.CodeRay { border-collapse: collapse; width: 100%; padding: 2px; }
-table.CodeRay td { padding: 2px 4px; vertical-align: top; }
-
-.CodeRay .line_numbers, .CodeRay .no {
- background-color: #{numbers_background};
- color: gray;
- text-align: right;
-}
-.CodeRay .line_numbers tt { font-weight: bold; }
-.CodeRay .line_numbers .highlighted { color: red }
-.CodeRay .line { display: block; float: left; width: 100%; }
-.CodeRay .no { padding: 0px 4px; }
-.CodeRay .code { width: 100%; }
-
-ol.CodeRay { font-size: 10pt; }
-ol.CodeRay li { white-space: pre; }
-
-.CodeRay .code pre { overflow: auto; }
- MAIN
-
- TOKEN_COLORS = <<-'TOKENS'
-.af { color:#00C; }
-.an { color:#007; }
-.av { color:#700; }
-.aw { color:#C00; }
-.bi { color:#509; font-weight:bold; }
-.c { color:#555; background-color: black; }
-
-.ch { color:#88F; }
-.ch .k { color:#04D; }
-.ch .dl { color:#039; }
-
-.cl { color:#e9e; font-weight:bold; }
-.co { color:#5ED; font-weight:bold; }
-.cr { color:#0A0; }
-.cv { color:#ccf; }
-.df { color:#099; font-weight:bold; }
-.di { color:#088; font-weight:bold; }
-.dl { color:black; }
-.do { color:#970; }
-.ds { color:#D42; font-weight:bold; }
-.e { color:#666; font-weight:bold; }
-.er { color:#F00; background-color:#FAA; }
-.ex { color:#F00; font-weight:bold; }
-.fl { color:#60E; font-weight:bold; }
-.fu { color:#5ed; font-weight:bold; }
-.gv { color:#f84; }
-.hx { color:#058; font-weight:bold; }
-.i { color:#66f; font-weight:bold; }
-.ic { color:#B44; font-weight:bold; }
-.il { }
-.in { color:#B2B; font-weight:bold; }
-.iv { color:#aaf; }
-.la { color:#970; font-weight:bold; }
-.lv { color:#963; }
-.oc { color:#40E; font-weight:bold; }
-.of { color:#000; font-weight:bold; }
-.op { }
-.pc { color:#08f; font-weight:bold; }
-.pd { color:#369; font-weight:bold; }
-.pp { color:#579; }
-.pt { color:#66f; font-weight:bold; }
-.r { color:#5de; font-weight:bold; }
-.r, .kw { color:#5de; font-weight:bold }
-
-.ke { color: #808; }
-
-.rx { background-color:#221133; }
-.rx .k { color:#f8f; }
-.rx .dl { color:#f0f; }
-.rx .mod { color:#f0b; }
-.rx .fu { color:#404; font-weight: bold; }
-
-.s { background-color:#331122; }
-.s .s { background-color:#ffe0e0; }
-.s .s .s { background-color:#ffd0d0; }
-.s .k { color:#F88; }
-.s .dl { color:#f55; }
-
-.sh { background-color:#f0fff0; }
-.sh .k { color:#2B2; }
-.sh .dl { color:#161; }
-
-.sy { color:#Fc8; }
-.sy .k { color:#Fc8; }
-.sy .dl { color:#F84; }
-
-.ta { color:#070; }
-.tf { color:#070; font-weight:bold; }
-.ts { color:#D70; font-weight:bold; }
-.ty { color:#339; font-weight:bold; }
-.v { color:#036; }
-.xt { color:#444; }
-
-.ins { background: #afa; }
-.del { background: #faa; }
-.chg { color: #aaf; background: #007; }
-.head { color: #f8f; background: #505 }
-
-.ins .ins { color: #080; font-weight:bold }
-.del .del { color: #800; font-weight:bold }
-.chg .chg { color: #66f; }
-.head .head { color: #f4f; }
- TOKENS
-
- end
-
-end
-end
diff --git a/lib/coderay/token_classes.rb b/lib/coderay/token_classes.rb
deleted file mode 100755
index ae35c0f..0000000
--- a/lib/coderay/token_classes.rb
+++ /dev/null
@@ -1,86 +0,0 @@
-module CodeRay
- class Tokens
- ClassOfKind = Hash.new do |h, k|
- h[k] = k.to_s
- end
- ClassOfKind.update with = {
- :annotation => 'at',
- :attribute_name => 'an',
- :attribute_name_fat => 'af',
- :attribute_value => 'av',
- :attribute_value_fat => 'aw',
- :bin => 'bi',
- :char => 'ch',
- :class => 'cl',
- :class_variable => 'cv',
- :color => 'cr',
- :comment => 'c',
- :complex => 'cm',
- :constant => 'co',
- :content => 'k',
- :decorator => 'de',
- :definition => 'df',
- :delimiter => 'dl',
- :directive => 'di',
- :doc => 'do',
- :doctype => 'dt',
- :doc_string => 'ds',
- :entity => 'en',
- :error => 'er',
- :escape => 'e',
- :exception => 'ex',
- :float => 'fl',
- :function => 'fu',
- :global_variable => 'gv',
- :hex => 'hx',
- :imaginary => 'cm',
- :important => 'im',
- :include => 'ic',
- :inline => 'il',
- :inline_delimiter => 'idl',
- :instance_variable => 'iv',
- :integer => 'i',
- :interpreted => 'in',
- :keyword => 'kw',
- :key => 'ke',
- :label => 'la',
- :local_variable => 'lv',
- :modifier => 'mod',
- :oct => 'oc',
- :operator_fat => 'of',
- :pre_constant => 'pc',
- :pre_type => 'pt',
- :predefined => 'pd',
- :preprocessor => 'pp',
- :pseudo_class => 'ps',
- :regexp => 'rx',
- :reserved => 'r',
- :shell => 'sh',
- :string => 's',
- :symbol => 'sy',
- :tag => 'ta',
- :tag_fat => 'tf',
- :tag_special => 'ts',
- :type => 'ty',
- :variable => 'v',
- :value => 'vl',
- :xml_text => 'xt',
-
- :insert => 'ins',
- :delete => 'del',
- :change => 'chg',
- :head => 'head',
-
- :ident => :NO_HIGHLIGHT, # 'id'
- #:operator => 'op',
- :operator => :NO_HIGHLIGHT, # 'op'
- :space => :NO_HIGHLIGHT, # 'sp'
- :plain => :NO_HIGHLIGHT,
- }
- ClassOfKind[:method] = ClassOfKind[:function]
- ClassOfKind[:open] = ClassOfKind[:close] = ClassOfKind[:delimiter]
- ClassOfKind[:nesting_delimiter] = ClassOfKind[:delimiter]
- ClassOfKind[:escape] = ClassOfKind[:delimiter]
- #ClassOfKind.default = ClassOfKind[:error] or raise 'no class found for :error!'
- end
-end
\ No newline at end of file
diff --git a/lib/coderay/token_kinds.rb b/lib/coderay/token_kinds.rb
new file mode 100755
index 0000000..3b8d07e
--- /dev/null
+++ b/lib/coderay/token_kinds.rb
@@ -0,0 +1,90 @@
+module CodeRay
+
+ # A Hash of all known token kinds and their associated CSS classes.
+ TokenKinds = Hash.new do |h, k|
+ warn 'Undefined Token kind: %p' % [k] if $CODERAY_DEBUG
+ false
+ end
+
+ # speedup
+ TokenKinds.compare_by_identity if TokenKinds.respond_to? :compare_by_identity
+
+ TokenKinds.update( # :nodoc:
+ :annotation => 'annotation',
+ :attribute_name => 'attribute-name',
+ :attribute_value => 'attribute-value',
+ :binary => 'bin',
+ :char => 'char',
+ :class => 'class',
+ :class_variable => 'class-variable',
+ :color => 'color',
+ :comment => 'comment',
+ :complex => 'complex',
+ :constant => 'constant',
+ :content => 'content',
+ :debug => 'debug',
+ :decorator => 'decorator',
+ :definition => 'definition',
+ :delimiter => 'delimiter',
+ :directive => 'directive',
+ :doc => 'doc',
+ :doctype => 'doctype',
+ :doc_string => 'doc-string',
+ :entity => 'entity',
+ :error => 'error',
+ :escape => 'escape',
+ :exception => 'exception',
+ :filename => 'filename',
+ :float => 'float',
+ :function => 'function',
+ :global_variable => 'global-variable',
+ :hex => 'hex',
+ :imaginary => 'imaginary',
+ :important => 'important',
+ :include => 'include',
+ :inline => 'inline',
+ :inline_delimiter => 'inline-delimiter',
+ :instance_variable => 'instance-variable',
+ :integer => 'integer',
+ :key => 'key',
+ :keyword => 'keyword',
+ :label => 'label',
+ :local_variable => 'local-variable',
+ :modifier => 'modifier',
+ :namespace => 'namespace',
+ :octal => 'octal',
+ :predefined => 'predefined',
+ :predefined_constant => 'predefined-constant',
+ :predefined_type => 'predefined-type',
+ :preprocessor => 'preprocessor',
+ :pseudo_class => 'pseudo-class',
+ :regexp => 'regexp',
+ :reserved => 'reserved',
+ :shell => 'shell',
+ :string => 'string',
+ :symbol => 'symbol',
+ :tag => 'tag',
+ :type => 'type',
+ :value => 'value',
+ :variable => 'variable',
+
+ :change => 'change',
+ :delete => 'delete',
+ :head => 'head',
+ :insert => 'insert',
+
+ :eyecatcher => 'eyecatcher',
+
+ :ident => false,
+ :operator => false,
+
+ :space => false,
+ :plain => false
+ )
+
+ TokenKinds[:method] = TokenKinds[:function]
+ TokenKinds[:escape] = TokenKinds[:delimiter]
+ TokenKinds[:docstring] = TokenKinds[:comment]
+
+ TokenKinds.freeze
+end
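Note on the new token_kinds.rb: the lookup table now maps each token kind to a readable CSS class name, returns false for kinds that should not be highlighted, and falls back to a default block for unknown kinds. The following is only a minimal sketch of that pattern. DemoTokenKinds, $DEMO_DEBUG and demo_span are hypothetical names made up for illustration and are not part of CodeRay.

```ruby
# Illustrative sketch only: DemoTokenKinds is a hypothetical stand-in,
# not the TokenKinds constant from the diff.
DemoTokenKinds = Hash.new do |_hash, kind|
  # Unknown kinds are reported only in debug mode and yield false (no CSS class).
  warn 'Undefined Token kind: %p' % [kind] if $DEMO_DEBUG
  false
end

DemoTokenKinds.update(
  :class_variable => 'class-variable',
  :function       => 'function',
  :ident          => false # known kind, deliberately not highlighted
)

# Aliases share the CSS class of an existing kind.
DemoTokenKinds[:method] = DemoTokenKinds[:function]

# The default block never writes to the hash, so lookups still work when frozen.
DemoTokenKinds.freeze

# Hypothetical helper: wrap text in a span only when the kind has a CSS class.
def demo_span(text, kind)
  css_class = DemoTokenKinds[kind]
  css_class ? %(<span class="#{css_class}">#{text}</span>) : text
end
```

Because the default block returns false without storing anything, an unknown kind and an explicitly unhighlighted kind are treated the same way by a caller such as demo_span.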
diff --git a/lib/coderay/tokens.rb b/lib/coderay/tokens.rb
index 6ac5f44..c747017 100644
--- a/lib/coderay/tokens.rb
+++ b/lib/coderay/tokens.rb
@@ -1,6 +1,9 @@
module CodeRay
-
- # = Tokens
+
+ # GZip library for writing and reading token dumps.
+ autoload :GZip, coderay_path('helpers', 'gzip')
+
+ # = Tokens TODO: Rewrite!
#
 # The Tokens class represents a list of tokens returned from
# a Scanner.
@@ -8,7 +11,7 @@ module CodeRay
# A token is not a special object, just a two-element Array
# consisting of
# * the _token_ _text_ (the original source of the token in a String) or
- # a _token_ _action_ (:open, :close, :begin_line, :end_line)
+ # a _token_ _action_ (begin_group, end_group, begin_line, end_line)
# * the _token_ _kind_ (a Symbol representing the type of the token)
#
# A token looks like this:
@@ -18,16 +21,16 @@ module CodeRay
# ['$^', :error]
#
# Some scanners also yield sub-tokens, represented by special
- # token actions, namely :open and :close.
+ # token actions, namely begin_group and end_group.
#
# The Ruby scanner, for example, splits "a string" into:
#
# [
- # [:open, :string],
+ # [:begin_group, :string],
# ['"', :delimiter],
# ['a string', :content],
# ['"', :delimiter],
- # [:close, :string]
+ # [:end_group, :string]
# ]
#
# Tokens is the interface between Scanners and Encoders:
@@ -47,46 +50,11 @@ module CodeRay
#
# It also allows you to generate tokens directly (without using a scanner),
# to load them from a file, and still use any Encoder that CodeRay provides.
- #
- # Tokens' subclass TokenStream allows streaming to save memory.
class Tokens < Array
# The Scanner instance that created the tokens.
attr_accessor :scanner
- # Whether the object is a TokenStream.
- #
- # Returns false.
- def stream?
- false
- end
-
- # Iterates over all tokens.
- #
- # If a filter is given, only tokens of that kind are yielded.
- def each kind_filter = nil, &block
- unless kind_filter
- super(&block)
- else
- super() do |text, kind|
- next unless kind == kind_filter
- yield text, kind
- end
- end
- end
-
- # Iterates over all text tokens.
- # Range tokens like [:open, :string] are left out.
- #
- # Example:
- # tokens.each_text_token { |text, kind| text.replace html_escape(text) }
- def each_text_token
- each do |text, kind|
- next unless text.is_a? ::String
- yield text, kind
- end
- end
-
# Encode the tokens using encoder.
#
# encoder can be
@@ -96,120 +64,98 @@ module CodeRay
#
# options are passed to the encoder.
def encode encoder, options = {}
- unless encoder.is_a? Encoders::Encoder
- unless encoder.is_a? Class
- encoder_class = Encoders[encoder]
- end
- encoder = encoder_class.new options
- end
+ encoder = Encoders[encoder].new options if encoder.respond_to? :to_sym
encoder.encode_tokens self, options
end
-
-
- # Turn into a string using Encoders::Text.
- #
- # +options+ are passed to the encoder if given.
- def to_s options = {}
- encode :text, options
+
+ # Turn tokens into a string by concatenating them.
+ def to_s
+ encode CodeRay::Encoders::Encoder.new
end
-
+
# Redirects unknown methods to encoder calls.
#
# For example, if you call +tokens.html+, the HTML encoder
# is used to highlight the tokens.
def method_missing meth, options = {}
- Encoders[meth].new(options).encode_tokens self
- end
-
- # Returns the tokens compressed by joining consecutive
- # tokens of the same kind.
- #
- # This can not be undone, but should yield the same output
- # in most Encoders. It basically makes the output smaller.
- #
- # Combined with dump, it saves space for the cost of time.
- #
- # If the scanner is written carefully, this is not required -
- # for example, consecutive //-comment lines could already be
- # joined in one comment token by the Scanner.
- def optimize
- last_kind = last_text = nil
- new = self.class.new
- for text, kind in self
- if text.is_a? String
- if kind == last_kind
- last_text << text
- else
- new << [last_text, last_kind] if last_kind
- last_text = text
- last_kind = kind
- end
- else
- new << [last_text, last_kind] if last_kind
- last_kind = last_text = nil
- new << [text, kind]
- end
- end
- new << [last_text, last_kind] if last_kind
- new
- end
-
- # Compact the object itself; see optimize.
- def optimize!
- replace optimize
+ encode meth, options
+ rescue PluginHost::PluginNotFound
+ super
end
- # Ensure that all :open tokens have a correspondent :close one.
- #
- # TODO: Test this!
- def fix
- tokens = self.class.new
- # Check token nesting using a stack of kinds.
+ # Split the tokens into parts of the given +sizes+.
+ #
+ # The result will be an Array of Tokens objects. The parts have
+ # the text size specified by the parameter. In addition, each
+ # part closes all opened tokens. This is useful to insert tokens
+ # between them.
+ #
+ # This method is used by @Scanner#tokenize@ when called with an Array
+ # of source strings. The Diff encoder uses it for inline highlighting.
+ def split_into_parts *sizes
+ parts = []
opened = []
- for type, kind in self
- case type
- when :open
- opened.push [:close, kind]
- when :begin_line
- opened.push [:end_line, kind]
- when :close, :end_line
- expected = opened.pop
- if [type, kind] != expected
- # Unexpected :close; decide what to do based on the kind:
- # - token was never opened: delete the :close (just skip it)
- next unless opened.rindex expected
- # - token was opened earlier: also close tokens in between
- tokens << token until (token = opened.pop) == expected
+ content = nil
+ part = Tokens.new
+ part_size = 0
+ size = sizes.first
+ i = 0
+ for item in self
+ case content
+ when nil
+ content = item
+ when String
+ if size && part_size + content.size > size # token must be cut
+ if part_size < size # some part of the token goes into this part
+ content = content.dup # content may not be safe to change
+ part << content.slice!(0, size - part_size) << item
+ end
+ # close all open groups and lines...
+ closing = opened.reverse.flatten.map do |content_or_kind|
+ case content_or_kind
+ when :begin_group
+ :end_group
+ when :begin_line
+ :end_line
+ else
+ content_or_kind
+ end
+ end
+ part.concat closing
+ begin
+ parts << part
+ part = Tokens.new
+ size = sizes[i += 1]
+ end until size.nil? || size > 0
+ # ...and open them again.
+ part.concat opened.flatten
+ part_size = 0
+ redo unless content.empty?
+ else
+ part << content << item
+ part_size += content.size
end
+ content = nil
+ when Symbol
+ case content
+ when :begin_group, :begin_line
+ opened << [content, item]
+ when :end_group, :end_line
+ opened.pop
+ else
+ raise ArgumentError, 'Unknown token action: %p, kind = %p' % [content, item]
+ end
+ part << content << item
+ content = nil
+ else
+ raise ArgumentError, 'Token input junk: %p, kind = %p' % [content, item]
end
- tokens << [type, kind]
end
- # Close remaining opened tokens
- tokens << token while token = opened.pop
- tokens
+ parts << part
+ parts << Tokens.new while parts.size < sizes.size
+ parts
end
- def fix!
- replace fix
- end
-
- # TODO: Scanner#split_into_lines
- #
- # Makes sure that:
- # - newlines are single tokens
- # (which means all other token are single-line)
- # - there are no open tokens at the end the line
- #
- # This makes it simple for encoders that work line-oriented,
- # like HTML with list-style numeration.
- def split_into_lines
- raise NotImplementedError
- end
-
- def split_into_lines!
- replace split_into_lines
- end
-
# Dumps the object into a String that can be saved
# in files or databases.
#
@@ -226,28 +172,16 @@ module CodeRay
#
# See GZip module.
def dump gzip_level = 7
- require 'coderay/helpers/gzip_simple'
dump = Marshal.dump self
- dump = dump.gzip gzip_level
+ dump = GZip.gzip dump, gzip_level
dump.extend Undumping
end
-
- # The total size of the tokens.
- # Should be equal to the input size before
- # scanning.
- def text_size
- size = 0
- each_text_token do |t, k|
- size + t.size
- end
- size
- end
-
- # Return all text tokens joined into a single string.
- def text
- map { |t, k| t if t.is_a? ::String }.join
+
+ # Return the actual number of tokens.
+ def count
+ size / 2
end
-
+
# Include this module to give an object an #undump
# method.
#
@@ -258,133 +192,24 @@ module CodeRay
Tokens.load self
end
end
-
+
# Undump the object using Marshal.load, then
# unzip it using GZip.gunzip.
#
# The result is commonly a Tokens object, but
# this is not guaranteed.
def Tokens.load dump
- require 'coderay/helpers/gzip_simple'
- dump = dump.gunzip
+ dump = GZip.gunzip dump
@dump = Marshal.load dump
end
-
- end
-
-
- # = TokenStream
- #
- # The TokenStream class is a fake Array without elements.
- #
- # It redirects the method << to a block given at creation.
- #
- # This allows scanners and Encoders to use streaming (no
- # tokens are saved, the input is highlighted the same time it
- # is scanned) with the same code.
- #
- # See CodeRay.encode_stream and CodeRay.scan_stream
- class TokenStream < Tokens
-
- # Whether the object is a TokenStream.
- #
- # Returns true.
- def stream?
- true
- end
-
- # The Array is empty, but size counts the tokens given by <<.
- attr_reader :size
-
- # Creates a new TokenStream that calls +block+ whenever
- # its << method is called.
- #
- # Example:
- #
- # require 'coderay'
- #
- # token_stream = CodeRay::TokenStream.new do |text, kind|
- # puts 'kind: %s, text size: %d.' % [kind, text.size]
- # end
- #
- # token_stream << ['/\d+/', :regexp]
- # #-> kind: rexpexp, text size: 5.
- #
- def initialize &block
- raise ArgumentError, 'Block expected for streaming.' unless block
- @callback = block
- @size = 0
- end
-
- # Calls +block+ with +token+ and increments size.
- #
- # Returns self.
- def << token
- @callback.call(*token)
- @size += 1
- self
- end
-
- # This method is not implemented due to speed reasons. Use Tokens.
- def text_size
- raise NotImplementedError,
- 'This method is not implemented due to speed reasons.'
- end
-
- # A TokenStream cannot be dumped. Use Tokens.
- def dump
- raise NotImplementedError, 'A TokenStream cannot be dumped.'
- end
-
- # A TokenStream cannot be optimized. Use Tokens.
- def optimize
- raise NotImplementedError, 'A TokenStream cannot be optimized.'
- end
-
- end
-
-end
-
-if $0 == __FILE__
- $VERBOSE = true
- $: << File.join(File.dirname(__FILE__), '..')
- eval DATA.read, nil, $0, __LINE__ + 4
-end
-
-__END__
-require 'test/unit'
-
-class TokensTest < Test::Unit::TestCase
-
- def test_creation
- assert CodeRay::Tokens < Array
- tokens = nil
- assert_nothing_raised do
- tokens = CodeRay::Tokens.new
- end
- assert_kind_of Array, tokens
- end
-
- def test_adding_tokens
- tokens = CodeRay::Tokens.new
- assert_nothing_raised do
- tokens << ['string', :type]
- tokens << ['()', :operator]
- end
- assert_equal tokens.size, 2
- end
-
- def test_dump_undump
- tokens = CodeRay::Tokens.new
- assert_nothing_raised do
- tokens << ['string', :type]
- tokens << ['()', :operator]
- end
- tokens2 = nil
- assert_nothing_raised do
- tokens2 = tokens.dump.undump
- end
- assert_equal tokens, tokens2
+
+ alias text_token push
+ def begin_group kind; push :begin_group, kind end
+ def end_group kind; push :end_group, kind end
+ def begin_line kind; push :begin_line, kind end
+ def end_line kind; push :end_line, kind end
+ alias tokens concat
+
end
-end
\ No newline at end of file
+end
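Note on the tokens.rb changes: the new count method (size / 2) and the push-based text_token alias suggest that tokens are now stored flat, as alternating text-or-action and kind entries, rather than as two-element arrays. The sketch below illustrates that layout under this assumption. FlatTokens is a hypothetical class written for this note, not CodeRay::Tokens.

```ruby
# Illustrative sketch only: FlatTokens is a hypothetical stand-in for the
# flat [text, kind, text, kind, ...] layout, not CodeRay::Tokens itself.
class FlatTokens < Array
  # push accepts several arguments, so text and kind are appended side by side.
  alias text_token push

  def begin_group(kind); push :begin_group, kind end
  def end_group(kind);   push :end_group, kind end

  # Two array slots per token, hence half the array size.
  def count
    size / 2
  end
end

tokens = FlatTokens.new
tokens.begin_group :string
tokens.text_token '"', :delimiter
tokens.text_token 'a string', :content
tokens.text_token '"', :delimiter
tokens.end_group :string

# Rebuild the source text: keep String entries, skip action Symbols.
source = tokens.each_slice(2).map { |text, _kind| text.is_a?(String) ? text : '' }.join
```

In this sketch the array holds ten entries for five tokens, which is why a separate count method is needed alongside size.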
diff --git a/lib/coderay/tokens_proxy.rb b/lib/coderay/tokens_proxy.rb
new file mode 100644
index 0000000..31ff39b
--- /dev/null
+++ b/lib/coderay/tokens_proxy.rb
@@ -0,0 +1,55 @@
+module CodeRay
+
+ # The result of a scan operation is a TokensProxy, but should act like Tokens.
+ #
+ # This proxy makes it possible to use the classic CodeRay.scan.encode API
+ # while still providing the benefits of direct streaming.
+ class TokensProxy
+
+ attr_accessor :input, :lang, :options, :block
+
+ # Create a new TokensProxy with the arguments of CodeRay.scan.
+ def initialize input, lang, options = {}, block = nil
+ @input = input
+ @lang = lang
+ @options = options
+ @block = block
+ end
+
+ # Call CodeRay.encode if +encoder+ is a Symbol;
+ # otherwise, convert the receiver to tokens and call encoder.encode_tokens.
+ def encode encoder, options = {}
+ if encoder.respond_to? :to_sym
+ CodeRay.encode(input, lang, encoder, options)
+ else
+ encoder.encode_tokens tokens, options
+ end
+ end
+
+ # Tries to call encode;
+ # delegates to tokens otherwise.
+ def method_missing method, *args, &blk
+ encode method.to_sym, *args
+ rescue PluginHost::PluginNotFound
+ tokens.send(method, *args, &blk)
+ end
+
+ # The (cached) result of the tokenized input; a Tokens instance.
+ def tokens
+ @tokens ||= scanner.tokenize(input)
+ end
+
+ # A (cached) scanner instance to use for the scan task.
+ def scanner
+ @scanner ||= CodeRay.scanner(lang, options, &block)
+ end
+
+ # Overwrite Struct#each.
+ def each *args, &blk
+ tokens.each(*args, &blk)
+ self
+ end
+
+ end
+
+end
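Note on the new tokens_proxy.rb: the proxy defers tokenizing until the tokens are actually needed, caches the result, and forwards unknown methods to it. A minimal sketch of that lazy, memoized delegation follows. LazyProxy and the tokenizer block are hypothetical and only mimic the idea; they do not reproduce the encoder lookup that TokensProxy#method_missing tries first.

```ruby
# Illustrative sketch only: LazyProxy and its block are hypothetical,
# not the TokensProxy class from the diff.
class LazyProxy
  def initialize(input, &tokenizer)
    @input = input
    @tokenizer = tokenizer
  end

  # Tokenize on first use and cache the result.
  def tokens
    @tokens ||= @tokenizer.call(@input)
  end

  # Forward everything else to the cached token list.
  def method_missing(name, *args, &blk)
    tokens.respond_to?(name) ? tokens.send(name, *args, &blk) : super
  end

  def respond_to_missing?(name, include_private = false)
    tokens.respond_to?(name, include_private) || super
  end
end

calls = 0
proxy = LazyProxy.new('a b c') { |text| calls += 1; text.split }
```

Nothing is tokenized when the proxy is created. The block runs once, on the first forwarded call, and later calls reuse the cached list.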
diff --git a/lib/coderay/version.rb b/lib/coderay/version.rb
new file mode 100644
index 0000000..e2797b5
--- /dev/null
+++ b/lib/coderay/version.rb
@@ -0,0 +1,3 @@
+module CodeRay
+ VERSION = '1.0.5'
+end
diff --git a/metadata.yml b/metadata.yml
index a7b531a..50a9dd1 100644
--- a/metadata.yml
+++ b/metadata.yml
@@ -1,117 +1,113 @@
--- !ruby/object:Gem::Specification
name: coderay
version: !ruby/object:Gem::Version
- hash: 43
+ hash: 29
prerelease:
segments:
+ - 1
- 0
- - 9
- - 8
- version: 0.9.8
+ - 5
+ version: 1.0.5
platform: ruby
authors:
-- murphy
+- Kornelius Kalnbach
autorequire:
bindir: bin
cert_chain: []
-date: 2011-05-01 00:00:00 Z
+date: 2011-12-28 00:00:00 +01:00
+default_executable:
dependencies: []
-description: |
- Fast and easy syntax highlighting for selected languages, written in Ruby.
- Comes with RedCloth integration and LOC counter.
-
-email: murphy at rubychan.de
+description: Fast and easy syntax highlighting for selected languages, written in Ruby. Comes with RedCloth integration and LOC counter.
+email:
+- murphy at rubychan.de
executables:
- coderay
-- coderay_stylesheet
extensions: []
extra_rdoc_files:
-- lib/README
-- FOLDERS
+- README_INDEX.rdoc
files:
-- ./lib/coderay/duo.rb
-- ./lib/coderay/encoder.rb
-- ./lib/coderay/encoders/_map.rb
-- ./lib/coderay/encoders/comment_filter.rb
-- ./lib/coderay/encoders/count.rb
-- ./lib/coderay/encoders/debug.rb
-- ./lib/coderay/encoders/div.rb
-- ./lib/coderay/encoders/filter.rb
-- ./lib/coderay/encoders/html/css.rb
-- ./lib/coderay/encoders/html/numerization.rb
-- ./lib/coderay/encoders/html/output.rb
-- ./lib/coderay/encoders/html.rb
-- ./lib/coderay/encoders/json.rb
-- ./lib/coderay/encoders/lines_of_code.rb
-- ./lib/coderay/encoders/null.rb
-- ./lib/coderay/encoders/page.rb
-- ./lib/coderay/encoders/span.rb
-- ./lib/coderay/encoders/statistic.rb
-- ./lib/coderay/encoders/term.rb
-- ./lib/coderay/encoders/text.rb
-- ./lib/coderay/encoders/token_class_filter.rb
-- ./lib/coderay/encoders/xml.rb
-- ./lib/coderay/encoders/yaml.rb
-- ./lib/coderay/for_redcloth.rb
-- ./lib/coderay/helpers/file_type.rb
-- ./lib/coderay/helpers/gzip_simple.rb
-- ./lib/coderay/helpers/plugin.rb
-- ./lib/coderay/helpers/word_list.rb
-- ./lib/coderay/scanner.rb
-- ./lib/coderay/scanners/_map.rb
-- ./lib/coderay/scanners/c.rb
-- ./lib/coderay/scanners/cpp.rb
-- ./lib/coderay/scanners/css.rb
-- ./lib/coderay/scanners/debug.rb
-- ./lib/coderay/scanners/delphi.rb
-- ./lib/coderay/scanners/diff.rb
-- ./lib/coderay/scanners/groovy.rb
-- ./lib/coderay/scanners/html.rb
-- ./lib/coderay/scanners/java/builtin_types.rb
-- ./lib/coderay/scanners/java.rb
-- ./lib/coderay/scanners/java_script.rb
-- ./lib/coderay/scanners/json.rb
-- ./lib/coderay/scanners/nitro_xhtml.rb
-- ./lib/coderay/scanners/php.rb
-- ./lib/coderay/scanners/plaintext.rb
-- ./lib/coderay/scanners/python.rb
-- ./lib/coderay/scanners/rhtml.rb
-- ./lib/coderay/scanners/ruby/patterns.rb
-- ./lib/coderay/scanners/ruby.rb
-- ./lib/coderay/scanners/scheme.rb
-- ./lib/coderay/scanners/sql.rb
-- ./lib/coderay/scanners/xml.rb
-- ./lib/coderay/scanners/yaml.rb
-- ./lib/coderay/style.rb
-- ./lib/coderay/styles/_map.rb
-- ./lib/coderay/styles/cycnus.rb
-- ./lib/coderay/styles/murphy.rb
-- ./lib/coderay/token_classes.rb
-- ./lib/coderay/tokens.rb
-- ./lib/coderay.rb
-- ./Rakefile
-- ./test/functional/basic.rb
-- ./test/functional/for_redcloth.rb
-- ./test/functional/load_plugin_scanner.rb
-- ./test/functional/suite.rb
-- ./test/functional/vhdl.rb
-- ./test/functional/word_list.rb
-- ./lib/README
-- ./LICENSE
-- lib/README
-- FOLDERS
+- LICENSE
+- README_INDEX.rdoc
+- Rakefile
+- lib/coderay.rb
+- lib/coderay/duo.rb
+- lib/coderay/encoder.rb
+- lib/coderay/encoders/_map.rb
+- lib/coderay/encoders/comment_filter.rb
+- lib/coderay/encoders/count.rb
+- lib/coderay/encoders/debug.rb
+- lib/coderay/encoders/div.rb
+- lib/coderay/encoders/filter.rb
+- lib/coderay/encoders/html.rb
+- lib/coderay/encoders/html/css.rb
+- lib/coderay/encoders/html/numbering.rb
+- lib/coderay/encoders/html/output.rb
+- lib/coderay/encoders/json.rb
+- lib/coderay/encoders/lines_of_code.rb
+- lib/coderay/encoders/null.rb
+- lib/coderay/encoders/page.rb
+- lib/coderay/encoders/span.rb
+- lib/coderay/encoders/statistic.rb
+- lib/coderay/encoders/terminal.rb
+- lib/coderay/encoders/text.rb
+- lib/coderay/encoders/token_kind_filter.rb
+- lib/coderay/encoders/xml.rb
+- lib/coderay/encoders/yaml.rb
+- lib/coderay/for_redcloth.rb
+- lib/coderay/helpers/file_type.rb
+- lib/coderay/helpers/gzip.rb
+- lib/coderay/helpers/plugin.rb
+- lib/coderay/helpers/word_list.rb
+- lib/coderay/scanner.rb
+- lib/coderay/scanners/_map.rb
+- lib/coderay/scanners/c.rb
+- lib/coderay/scanners/clojure.rb
+- lib/coderay/scanners/cpp.rb
+- lib/coderay/scanners/css.rb
+- lib/coderay/scanners/debug.rb
+- lib/coderay/scanners/delphi.rb
+- lib/coderay/scanners/diff.rb
+- lib/coderay/scanners/erb.rb
+- lib/coderay/scanners/groovy.rb
+- lib/coderay/scanners/haml.rb
+- lib/coderay/scanners/html.rb
+- lib/coderay/scanners/java.rb
+- lib/coderay/scanners/java/builtin_types.rb
+- lib/coderay/scanners/java_script.rb
+- lib/coderay/scanners/json.rb
+- lib/coderay/scanners/php.rb
+- lib/coderay/scanners/python.rb
+- lib/coderay/scanners/raydebug.rb
+- lib/coderay/scanners/ruby.rb
+- lib/coderay/scanners/ruby/patterns.rb
+- lib/coderay/scanners/ruby/string_state.rb
+- lib/coderay/scanners/sql.rb
+- lib/coderay/scanners/text.rb
+- lib/coderay/scanners/xml.rb
+- lib/coderay/scanners/yaml.rb
+- lib/coderay/style.rb
+- lib/coderay/styles/_map.rb
+- lib/coderay/styles/alpha.rb
+- lib/coderay/token_kinds.rb
+- lib/coderay/tokens.rb
+- lib/coderay/tokens_proxy.rb
+- lib/coderay/version.rb
+- test/functional/basic.rb
+- test/functional/examples.rb
+- test/functional/for_redcloth.rb
+- test/functional/suite.rb
- bin/coderay
-- bin/coderay_stylesheet
+has_rdoc: true
homepage: http://coderay.rubychan.de
licenses: []
post_install_message:
rdoc_options:
- -SNw2
-- -mlib/README
+- -mREADME_INDEX.rdoc
- -t CodeRay Documentation
require_paths:
- lib
@@ -120,12 +116,12 @@ required_ruby_version: !ruby/object:Gem::Requirement
requirements:
- - ">="
- !ruby/object:Gem::Version
- hash: 51
+ hash: 59
segments:
- 1
- 8
- - 2
- version: 1.8.2
+ - 6
+ version: 1.8.6
required_rubygems_version: !ruby/object:Gem::Requirement
none: false
requirements:
@@ -138,9 +134,12 @@ required_rubygems_version: !ruby/object:Gem::Requirement
requirements: []
rubyforge_project: coderay
-rubygems_version: 1.7.2
+rubygems_version: 1.6.2
signing_key:
specification_version: 3
summary: Fast syntax highlighting for selected languages.
test_files:
-- ./test/functional/suite.rb
+- test/functional/basic.rb
+- test/functional/examples.rb
+- test/functional/for_redcloth.rb
+- test/functional/suite.rb
diff --git a/test/functional/basic.rb b/test/functional/basic.rb
index 8ecd3d3..3053b54 100755
--- a/test/functional/basic.rb
+++ b/test/functional/basic.rb
@@ -1,11 +1,31 @@
+# encoding: utf-8
require 'test/unit'
+require File.expand_path('../../lib/assert_warning', __FILE__)
+
+$:.unshift File.expand_path('../../../lib', __FILE__)
require 'coderay'
class BasicTest < Test::Unit::TestCase
def test_version
assert_nothing_raised do
- assert_match(/\A\d\.\d\.\d\z/, CodeRay::VERSION)
+ assert_match(/\A\d\.\d\.\d?\z/, CodeRay::VERSION)
+ end
+ end
+
+ def with_empty_load_path
+ old_load_path = $:.dup
+ $:.clear
+ yield
+ ensure
+ $:.replace old_load_path
+ end
+
+ def test_autoload
+ with_empty_load_path do
+ assert_nothing_raised do
+ CodeRay::Scanners::Java::BuiltinTypes
+ end
end
end
@@ -14,36 +34,60 @@ class BasicTest < Test::Unit::TestCase
RUBY_TEST_TOKENS = [
['puts', :ident],
[' ', :space],
- [:open, :string],
+ [:begin_group, :string],
['"', :delimiter],
['Hello, World!', :content],
['"', :delimiter],
- [:close, :string]
- ]
+ [:end_group, :string]
+ ].flatten
def test_simple_scan
assert_nothing_raised do
- assert_equal RUBY_TEST_TOKENS, CodeRay.scan(RUBY_TEST_CODE, :ruby).to_ary
+ assert_equal RUBY_TEST_TOKENS, CodeRay.scan(RUBY_TEST_CODE, :ruby).tokens
end
end
- RUBY_TEST_HTML = 'puts <span class="s"><span class="dl">&quot;</span>' +
- '<span class="k">Hello, World!</span><span class="dl">&quot;</span></span>'
+ RUBY_TEST_HTML = 'puts <span class="string"><span class="delimiter">&quot;</span>' +
+ '<span class="content">Hello, World!</span><span class="delimiter">&quot;</span></span>'
def test_simple_highlight
assert_nothing_raised do
assert_equal RUBY_TEST_HTML, CodeRay.scan(RUBY_TEST_CODE, :ruby).html
end
end
+ def test_scan_file
+ CodeRay.scan_file __FILE__
+ end
+
+ def test_encode
+ assert_equal 1, CodeRay.encode('test', :python, :count)
+ end
+
+ def test_encode_tokens
+ assert_equal 1, CodeRay.encode_tokens(CodeRay::Tokens['test', :string], :count)
+ end
+
+ def test_encode_file
+ assert_equal File.read(__FILE__), CodeRay.encode_file(__FILE__, :text)
+ end
+
+ def test_highlight
+ assert_match '<pre>test</pre>', CodeRay.highlight('test', :python)
+ end
+
+ def test_highlight_file
+ assert_match "require <span class=\"string\"><span class=\"delimiter\">'</span><span class=\"content\">test/unit</span><span class=\"delimiter\">'</span></span>\n", CodeRay.highlight_file(__FILE__)
+ end
+
def test_duo
assert_equal(RUBY_TEST_CODE,
- CodeRay::Duo[:plain, :plain].highlight(RUBY_TEST_CODE))
+ CodeRay::Duo[:plain, :text].highlight(RUBY_TEST_CODE))
assert_equal(RUBY_TEST_CODE,
- CodeRay::Duo[:plain => :plain].highlight(RUBY_TEST_CODE))
+ CodeRay::Duo[:plain => :text].highlight(RUBY_TEST_CODE))
end
def test_duo_stream
assert_equal(RUBY_TEST_CODE,
- CodeRay::Duo[:plain, :plain].highlight(RUBY_TEST_CODE, :stream => true))
+ CodeRay::Duo[:plain, :text].highlight(RUBY_TEST_CODE, :stream => true))
end
def test_comment_filter
@@ -98,25 +142,179 @@ more code # and another comment, in-line.
assert_equal 0, CodeRay.scan(rHTML, :html).lines_of_code
assert_equal 0, CodeRay.scan(rHTML, :php).lines_of_code
assert_equal 0, CodeRay.scan(rHTML, :yaml).lines_of_code
- assert_equal 4, CodeRay.scan(rHTML, :rhtml).lines_of_code
+ assert_equal 4, CodeRay.scan(rHTML, :erb).lines_of_code
end
- def test_rubygems_not_loaded
- assert_equal nil, defined? Gem
- end if ENV['check_rubygems'] && RUBY_VERSION < '1.9'
-
def test_list_of_encoders
assert_kind_of(Array, CodeRay::Encoders.list)
- assert CodeRay::Encoders.list.include?('count')
+ assert CodeRay::Encoders.list.include?(:count)
end
def test_list_of_scanners
assert_kind_of(Array, CodeRay::Scanners.list)
- assert CodeRay::Scanners.list.include?('plaintext')
+ assert CodeRay::Scanners.list.include?(:text)
+ end
+
+ def test_token_kinds
+ assert_kind_of Hash, CodeRay::TokenKinds
+ for kind, css_class in CodeRay::TokenKinds
+ assert_kind_of Symbol, kind
+ if css_class != false
+ assert_kind_of String, css_class, "TokenKinds[%p] == %p" % [kind, css_class]
+ end
+ end
+ assert_equal 'reserved', CodeRay::TokenKinds[:reserved]
+ assert_warning 'Undefined Token kind: :shibboleet' do
+ assert_equal false, CodeRay::TokenKinds[:shibboleet]
+ end
+ end
+
+ class Milk < CodeRay::Encoders::Encoder
+ FILE_EXTENSION = 'cocoa'
+ end
+
+ class HoneyBee < CodeRay::Encoders::Encoder
+ end
+
+ def test_encoder_file_extension
+ assert_nothing_raised do
+ assert_equal 'html', CodeRay::Encoders::Page::FILE_EXTENSION
+ assert_equal 'cocoa', Milk::FILE_EXTENSION
+ assert_equal 'cocoa', Milk.new.file_extension
+ assert_equal 'honeybee', HoneyBee::FILE_EXTENSION
+ assert_equal 'honeybee', HoneyBee.new.file_extension
+ end
+ assert_raise NameError do
+ HoneyBee::MISSING_CONSTANT
+ end
+ end
+
+ def test_encoder_tokens
+ encoder = CodeRay::Encoders::Encoder.new
+ encoder.send :setup, {}
+ assert_raise(ArgumentError) { encoder.token :strange, '' }
+ encoder.token 'test', :debug
+ end
+
+ def test_encoder_deprecated_interface
+ encoder = CodeRay::Encoders::Encoder.new
+ encoder.send :setup, {}
+ assert_warning 'Using old Tokens#<< interface.' do
+ encoder << ['test', :content]
+ end
+ assert_raise ArgumentError do
+ encoder << [:strange, :input]
+ end
+ assert_raise ArgumentError do
+ encoder.encode_tokens [['test', :token]]
+ end
+ end
+
+ def encoder_token_interface_deprecation_warning_given
+ CodeRay::Encoders::Encoder.send :class_variable_get, :@@CODERAY_TOKEN_INTERFACE_DEPRECATION_WARNING_GIVEN
+ end
+
+ def test_scanner_file_extension
+ assert_equal 'rb', CodeRay::Scanners::Ruby.file_extension
+ assert_equal 'rb', CodeRay::Scanners::Ruby.new.file_extension
+ assert_equal 'java', CodeRay::Scanners::Java.file_extension
+ assert_equal 'java', CodeRay::Scanners::Java.new.file_extension
+ end
+
+ def test_scanner_lang
+ assert_equal :ruby, CodeRay::Scanners::Ruby.lang
+ assert_equal :ruby, CodeRay::Scanners::Ruby.new.lang
+ assert_equal :java, CodeRay::Scanners::Java.lang
+ assert_equal :java, CodeRay::Scanners::Java.new.lang
+ end
+
+ def test_scanner_tokenize
+ assert_equal ['foo', :plain], CodeRay::Scanners::Plain.new.tokenize('foo')
+ assert_equal [['foo', :plain], ['bar', :plain]], CodeRay::Scanners::Plain.new.tokenize(['foo', 'bar'])
+ CodeRay::Scanners::Plain.new.tokenize 42
+ end
+
+ def test_scanner_tokens
+ scanner = CodeRay::Scanners::Plain.new
+ scanner.tokenize('foo')
+ assert_equal ['foo', :plain], scanner.tokens
+ scanner.string = ''
+ assert_equal ['', :plain], scanner.tokens
+ end
+
+ def test_scanner_line_and_column
+ scanner = CodeRay::Scanners::Plain.new "foo\nbär+quux"
+ assert_equal 0, scanner.pos
+ assert_equal 1, scanner.line
+ assert_equal 1, scanner.column
+ scanner.scan(/foo/)
+ assert_equal 3, scanner.pos
+ assert_equal 1, scanner.line
+ assert_equal 4, scanner.column
+ scanner.scan(/\n/)
+ assert_equal 4, scanner.pos
+ assert_equal 2, scanner.line
+ assert_equal 1, scanner.column
+ scanner.scan(/b/)
+ assert_equal 5, scanner.pos
+ assert_equal 2, scanner.line
+ assert_equal 2, scanner.column
+ scanner.scan(/a/)
+ assert_equal 5, scanner.pos
+ assert_equal 2, scanner.line
+ assert_equal 2, scanner.column
+ scanner.scan(/ä/)
+ assert_equal 7, scanner.pos
+ assert_equal 2, scanner.line
+ assert_equal 4, scanner.column
+ scanner.scan(/r/)
+ assert_equal 8, scanner.pos
+ assert_equal 2, scanner.line
+ assert_equal 5, scanner.column
+ end
+
+ def test_scanner_use_subclasses
+ assert_raise NotImplementedError do
+ CodeRay::Scanners::Scanner.new
+ end
+ end
+
+ class InvalidScanner < CodeRay::Scanners::Scanner
+ end
+
+ def test_scanner_scan_tokens
+ assert_raise NotImplementedError do
+ InvalidScanner.new.tokenize ''
+ end
+ end
+
+ class RaisingScanner < CodeRay::Scanners::Scanner
+ def scan_tokens encoder, options
+ raise_inspect 'message', [], :initial
+ end
+ end
+
+ def test_scanner_raise_inspect
+ assert_raise CodeRay::Scanners::Scanner::ScanError do
+ RaisingScanner.new.tokenize ''
+ end
end
def test_scan_a_frozen_string
- CodeRay.scan RUBY_VERSION, :ruby
+ assert_nothing_raised do
+ CodeRay.scan RUBY_VERSION, :ruby
+ CodeRay.scan RUBY_VERSION, :plain
+ end
+ end
+
+ def test_scan_a_non_string
+ assert_nothing_raised do
+ CodeRay.scan 42, :ruby
+ CodeRay.scan nil, :ruby
+ CodeRay.scan self, :ruby
+ CodeRay.encode ENV.to_hash, :ruby, :page
+ CodeRay.highlight CodeRay, :plain
+ end
end
end
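The updated tests above compare against a flat token array (`text, kind, text, kind, ...`) with `:begin_group` and `:end_group` markers, rather than the nested pairs used in 0.9.8. The following is a short sketch of walking such a stream in plain Ruby, without CodeRay loaded. The `describe_tokens` helper is hypothetical and exists only to illustrate the layout.

```ruby
# Flat token stream in the shape asserted by test_simple_scan above.
FLAT_TOKENS = [
  'puts', :ident,
  ' ', :space,
  :begin_group, :string,
  '"', :delimiter,
  'Hello, World!', :content,
  '"', :delimiter,
  :end_group, :string
]

# Hypothetical helper: rebuilds the source text and records group nesting.
def describe_tokens(tokens)
  text = String.new
  depth = 0
  max_depth = 0
  # Each token occupies two consecutive slots: content (or marker), then kind.
  tokens.each_slice(2) do |content, _kind|
    case content
    when :begin_group
      depth += 1
      max_depth = depth if depth > max_depth
    when :end_group
      depth -= 1
    else
      text << content
    end
  end
  { :text => text, :max_depth => max_depth, :balanced => depth.zero? }
end
```

Because group markers are Symbols in the content slot while real text is always a String, a consumer can tell them apart without a separate token class.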
diff --git a/test/functional/examples.rb b/test/functional/examples.rb
new file mode 100755
index 0000000..ff64af3
--- /dev/null
+++ b/test/functional/examples.rb
@@ -0,0 +1,129 @@
+require 'test/unit'
+
+$:.unshift File.expand_path('../../../lib', __FILE__)
+require 'coderay'
+
+class ExamplesTest < Test::Unit::TestCase
+
+ def test_examples
+ # output as HTML div (using inline CSS styles)
+ div = CodeRay.scan('puts "Hello, world!"', :ruby).div
+ assert_equal <<-DIV, div
+<div class="CodeRay">
+ <div class="code"><pre>puts <span style="background-color:hsla(0,100%,50%,0.05)"><span style="color:#710">&quot;</span><span style="color:#D20">Hello, world!</span><span style="color:#710">&quot;</span></span></pre></div>
+</div>
+ DIV
+
+ # ...with line numbers
+ div = CodeRay.scan(<<-CODE.chomp, :ruby).div(:line_numbers => :table)
+5.times do
+ puts 'Hello, world!'
+end
+ CODE
+ assert_equal <<-DIV, div
+<table class="CodeRay"><tr>
+ <td class="line-numbers" title="double click to toggle" ondblclick="with (this.firstChild.style) { display = (display == '') ? 'none' : '' }"><pre><a href="#n1" name="n1">1</a>
+<a href="#n2" name="n2">2</a>
+<a href="#n3" name="n3">3</a>
+</pre></td>
+ <td class="code"><pre><span style="color:#00D">5</span>.times <span style="color:#080;font-weight:bold">do</span>
+ puts <span style="background-color:hsla(0,100%,50%,0.05)"><span style="color:#710">'</span><span style="color:#D20">Hello, world!</span><span style="color:#710">'</span></span>
+<span style="color:#080;font-weight:bold">end</span></pre></td>
+</tr></table>
+ DIV
+
+ # output as standalone HTML page (using CSS classes)
+ page = CodeRay.scan('puts "Hello, world!"', :ruby).page
+ assert_match <<-PAGE, page
+<body>
+
+<table class="CodeRay"><tr>
+ <td class="line-numbers" title="double click to toggle" ondblclick="with (this.firstChild.style) { display = (display == '') ? 'none' : '' }"><pre>
+</pre></td>
+ <td class="code"><pre>puts <span class="string"><span class="delimiter">&quot;</span><span class="content">Hello, world!</span><span class="delimiter">&quot;</span></span></pre></td>
+</tr></table>
+
+</body>
+ PAGE
+
+ # keep scanned tokens for later use
+ tokens = CodeRay.scan('{ "just": "an", "example": 42 }', :json)
+ assert_kind_of CodeRay::TokensProxy, tokens
+
+ assert_equal ["{", :operator, " ", :space, :begin_group, :key,
+ "\"", :delimiter, "just", :content, "\"", :delimiter,
+ :end_group, :key, ":", :operator, " ", :space,
+ :begin_group, :string, "\"", :delimiter, "an", :content,
+ "\"", :delimiter, :end_group, :string, ",", :operator,
+ " ", :space, :begin_group, :key, "\"", :delimiter,
+ "example", :content, "\"", :delimiter, :end_group, :key,
+ ":", :operator, " ", :space, "42", :integer,
+ " ", :space, "}", :operator], tokens.tokens
+
+ # produce a token statistic
+ assert_equal <<-STATISTIC, tokens.statistic
+
+Code Statistics
+
+Tokens 26
+ Non-Whitespace 15
+Bytes Total 31
+
+Token Types (7):
+ type count ratio size (average)
+-------------------------------------------------------------
+ TOTAL 26 100.00 % 1.2
+ delimiter 6 23.08 % 1.0
+ operator 5 19.23 % 1.0
+ space 5 19.23 % 1.0
+ key 4 15.38 % 0.0
+ :begin_group 3 11.54 % 0.0
+ :end_group 3 11.54 % 0.0
+ content 3 11.54 % 4.3
+ string 2 7.69 % 0.0
+ integer 1 3.85 % 2.0
+
+ STATISTIC
+
+ # count the tokens
+ assert_equal 26, tokens.count
+
+ # produce a HTML div, but with CSS classes
+ div = tokens.div(:css => :class)
+ assert_equal <<-DIV, div
+<div class="CodeRay">
+ <div class="code"><pre>{ <span class="key"><span class="delimiter">&quot;</span><span class="content">just</span><span class="delimiter">&quot;</span></span>: <span class="string"><span class="delimiter">&quot;</span><span class="content">an</span><span class="delimiter">&quot;</span></span>, <span class="key"><span class="delimiter">&quot;</span><span class="content">example</span><span class="delimiter">&quot;</span></span>: <span class="integer">42</span> }</pre></div>
+</div>
+ DIV
+
+ # highlight a file (HTML div); guess the file type based on the extension
+ assert_equal :ruby, CodeRay::FileType[__FILE__]
+
+ # get a new scanner for Python
+ python_scanner = CodeRay.scanner :python
+ assert_kind_of CodeRay::Scanners::Python, python_scanner
+
+ # get a new encoder for terminal
+ terminal_encoder = CodeRay.encoder :term
+ assert_kind_of CodeRay::Encoders::Terminal, terminal_encoder
+
+ # scanning into tokens
+ tokens = python_scanner.tokenize 'import this; # The Zen of Python'
+ assert_equal ["import", :keyword, " ", :space, "this", :include,
+ ";", :operator, " ", :space, "# The Zen of Python", :comment], tokens
+
+ # format the tokens
+ term = terminal_encoder.encode_tokens(tokens)
+ assert_equal "\e[1;31mimport\e[0m \e[33mthis\e[0m; \e[37m# The Zen of Python\e[0m", term
+
+ # re-using scanner and encoder
+ ruby_highlighter = CodeRay::Duo[:ruby, :div]
+ div = ruby_highlighter.encode('puts "Hello, world!"')
+ assert_equal <<-DIV, div
+<div class="CodeRay">
+ <div class="code"><pre>puts <span style="background-color:hsla(0,100%,50%,0.05)"><span style="color:#710">&quot;</span><span style="color:#D20">Hello, world!</span><span style="color:#710">&quot;</span></span></pre></div>
+</div>
+ DIV
+ end
+
+end
diff --git a/test/functional/for_redcloth.rb b/test/functional/for_redcloth.rb
index efd0578..e980667 100644
--- a/test/functional/for_redcloth.rb
+++ b/test/functional/for_redcloth.rb
@@ -1,5 +1,7 @@
require 'test/unit'
-$:.unshift 'lib'
+require File.expand_path('../../lib/assert_warning', __FILE__)
+
+$:.unshift File.expand_path('../../../lib', __FILE__)
require 'coderay'
begin
@@ -8,17 +10,18 @@ begin
require 'redcloth'
rescue LoadError
warn 'RedCloth not found - skipping for_redcloth tests.'
+ undef RedCloth if defined? RedCloth
end
class BasicTest < Test::Unit::TestCase
def test_for_redcloth
require 'coderay/for_redcloth'
- assert_equal "<p><span lang=\"ruby\" class=\"CodeRay\">puts <span style=\"background-color:#fff0f0;color:#D20\"><span style=\"color:#710\">&quot;</span><span style=\"\">Hello, World!</span><span style=\"color:#710\">&quot;</span></span></span></p>",
+ assert_equal "<p><span lang=\"ruby\" class=\"CodeRay\">puts <span style=\"background-color:hsla(0,100%,50%,0.05)\"><span style=\"color:#710\">&quot;</span><span style=\"color:#D20\">Hello, World!</span><span style=\"color:#710\">&quot;</span></span></span></p>",
RedCloth.new('@[ruby]puts "Hello, World!"@').to_html
assert_equal <<-BLOCKCODE.chomp,
<div lang="ruby" class="CodeRay">
- <div class="code"><pre>puts <span style="background-color:#fff0f0;color:#D20"><span style="color:#710">&quot;</span><span style="">Hello, World!</span><span style="color:#710">&quot;</span></span></pre></div>
+ <div class="code"><pre>puts <span style="background-color:hsla(0,100%,50%,0.05)"><span style="color:#710">&quot;</span><span style="color:#D20">Hello, World!</span><span style="color:#710">&quot;</span></span></pre></div>
</div>
BLOCKCODE
RedCloth.new('bc[ruby]. puts "Hello, World!"').to_html
@@ -63,15 +66,19 @@ class BasicTest < Test::Unit::TestCase
# See http://jgarber.lighthouseapp.com/projects/13054/tickets/124-code-markup-does-not-allow-brackets.
def test_for_redcloth_false_positive
require 'coderay/for_redcloth'
- assert_equal '<p><code>[project]_dff.skjd</code></p>',
- RedCloth.new('@[project]_dff.skjd@').to_html
+ assert_warning 'CodeRay::Scanners could not load plugin :project; falling back to :text' do
+ assert_equal '<p><code>[project]_dff.skjd</code></p>',
+ RedCloth.new('@[project]_dff.skjd@').to_html
+ end
# false positive, but expected behavior / known issue
assert_equal "<p><span lang=\"ruby\" class=\"CodeRay\">_dff.skjd</span></p>",
RedCloth.new('@[ruby]_dff.skjd@').to_html
- assert_equal <<-BLOCKCODE.chomp,
+ assert_warning 'CodeRay::Scanners could not load plugin :project; falling back to :text' do
+ assert_equal <<-BLOCKCODE.chomp,
<pre><code>[project]_dff.skjd</code></pre>
- BLOCKCODE
- RedCloth.new('bc. [project]_dff.skjd').to_html
+ BLOCKCODE
+ RedCloth.new('bc. [project]_dff.skjd').to_html
+ end
end
end if defined? RedCloth
\ No newline at end of file
diff --git a/test/functional/load_plugin_scanner.rb b/test/functional/load_plugin_scanner.rb
deleted file mode 100755
index 25bbc93..0000000
--- a/test/functional/load_plugin_scanner.rb
+++ /dev/null
@@ -1,11 +0,0 @@
-require 'test/unit'
-require 'coderay'
-
-class PluginScannerTest < Test::Unit::TestCase
-
- def test_load
- require File.join(File.dirname(__FILE__), 'vhdl')
- assert_equal 'VHDL', CodeRay.scanner(:vhdl).class.name
- end
-
-end
diff --git a/test/functional/suite.rb b/test/functional/suite.rb
index 97dd330..ec23eec 100755
--- a/test/functional/suite.rb
+++ b/test/functional/suite.rb
@@ -1,12 +1,15 @@
require 'test/unit'
-MYDIR = File.dirname(__FILE__)
-
-$:.unshift 'lib'
+$VERBOSE = $CODERAY_DEBUG = true
+$:.unshift File.expand_path('../../../lib', __FILE__)
require 'coderay'
-puts "Running basic CodeRay #{CodeRay::VERSION} tests..."
-suite = %w(basic load_plugin_scanner word_list)
+mydir = File.dirname(__FILE__)
+suite = Dir[File.join(mydir, '*.rb')].
+ map { |tc| File.basename(tc).sub(/\.rb$/, '') } - %w'suite for_redcloth'
+
+puts "Running basic CodeRay #{CodeRay::VERSION} tests: #{suite.join(', ')}"
+
for test_case in suite
- load File.join(MYDIR, test_case + '.rb')
+ load File.join(mydir, test_case + '.rb')
end
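The suite.rb change above replaces a hard-coded list of test files with discovery by glob, minus an exclusion list. The following is a small sketch of that discovery step in isolation. The `discover_suite` helper and the temporary directory contents are hypothetical. The `sort` call and the use of `File.basename` with a suffix argument are additions for deterministic, compact output and are not part of the original expression.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper modelled on the expression in suite.rb: list the
# basenames of all *.rb files in a directory, minus an exclusion list.
def discover_suite(dir, excluded = %w[suite for_redcloth])
  Dir[File.join(dir, '*.rb')].
    map { |path| File.basename(path, '.rb') }.
    sort - excluded
end

# Usage: build a throwaway directory and list what would be loaded.
DISCOVERED = Dir.mktmpdir do |dir|
  %w[basic examples suite for_redcloth].each do |name|
    FileUtils.touch(File.join(dir, name + '.rb'))
  end
  FileUtils.touch(File.join(dir, 'notes.txt')) # ignored: not a .rb file
  discover_suite(dir)
end
```

Excluding `suite` keeps the runner from loading itself; `for_redcloth` is excluded in the original as well.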
diff --git a/test/functional/vhdl.rb b/test/functional/vhdl.rb
deleted file mode 100644
index c7e3824..0000000
--- a/test/functional/vhdl.rb
+++ /dev/null
@@ -1,126 +0,0 @@
-class VHDL < CodeRay::Scanners::Scanner
-
- register_for :vhdl
-
- RESERVED_WORDS = [
- 'access','after','alias','all','assert','architecture','begin',
- 'block','body','buffer','bus','case','component','configuration','constant',
- 'disconnect','downto','else','elsif','end','entity','exit','file','for',
- 'function','generate','generic','group','guarded','if','impure','in',
- 'inertial','inout','is','label','library','linkage','literal','loop',
- 'map','new','next','null','of','on','open','others','out','package',
- 'port','postponed','procedure','process','pure','range','record','register',
- 'reject','report','return','select','severity','signal','shared','subtype',
- 'then','to','transport','type','unaffected','units','until','use','variable',
- 'wait','when','while','with','note','warning','error','failure','and',
- 'or','xor','not','nor',
- 'array'
- ]
-
- PREDEFINED_TYPES = [
- 'bit','bit_vector','character','boolean','integer','real','time','string',
- 'severity_level','positive','natural','signed','unsigned','line','text',
- 'std_logic','std_logic_vector','std_ulogic','std_ulogic_vector','qsim_state',
- 'qsim_state_vector','qsim_12state','qsim_12state_vector','qsim_strength',
- 'mux_bit','mux_vector','reg_bit','reg_vector','wor_bit','wor_vector'
- ]
-
- PREDEFINED_CONSTANTS = [
-
- ]
-
- IDENT_KIND = CodeRay::CaseIgnoringWordList.new(:ident).
- add(RESERVED_WORDS, :reserved).
- add(PREDEFINED_TYPES, :pre_type).
- add(PREDEFINED_CONSTANTS, :pre_constant)
-
- ESCAPE = / [rbfntv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x
- UNICODE_ESCAPE = / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} /x
-
- def scan_tokens tokens, options
-
- state = :initial
-
- until eos?
-
- kind = nil
- match = nil
-
- case state
-
- when :initial
-
- if scan(/ \s+ | \\\n /x)
- kind = :space
-
- elsif scan(/-- .*/x)
- kind = :comment
-
- elsif scan(/ [-+*\/=<>?:;,!&^|()\[\]{}~%]+ | \.(?!\d) /x)
- kind = :operator
-
- elsif match = scan(/ [A-Za-z_][A-Za-z_0-9]* /x)
- kind = IDENT_KIND[match.downcase]
-
- elsif match = scan(/[a-z]?"/i)
- tokens << [:open, :string]
- state = :string
- kind = :delimiter
-
- elsif scan(/ L?' (?: [^\'\n\\] | \\ #{ESCAPE} )? '? /ox)
- kind = :char
-
- elsif scan(/(?:\d+)(?![.eEfF])/)
- kind = :integer
-
- elsif scan(/\d[fF]?|\d*\.\d+(?:[eE][+-]?\d+)?[fF]?|\d+[eE][+-]?\d+[fF]?/)
- kind = :float
-
- else
- getch
- kind = :error
-
- end
-
- when :string
- if scan(/[^\\\n"]+/)
- kind = :content
- elsif scan(/"/)
- tokens << ['"', :delimiter]
- tokens << [:close, :string]
- state = :initial
- next
- elsif scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
- kind = :char
- elsif scan(/ \\ | $ /x)
- tokens << [:close, :string]
- kind = :error
- state = :initial
- else
- raise_inspect "else case \" reached; %p not handled." % peek(1), tokens
- end
-
- else
- raise_inspect 'Unknown state', tokens
-
- end
-
- match ||= matched
- if $DEBUG and not kind
- raise_inspect 'Error token %p in line %d' %
- [[match, kind], line], tokens
- end
- raise_inspect 'Empty token', tokens unless match
-
- tokens << [match, kind]
-
- end
-
- if state == :string
- tokens << [:close, :string]
- end
-
- tokens
- end
-
-end
diff --git a/test/functional/word_list.rb b/test/functional/word_list.rb
deleted file mode 100644
index 84d6e9e..0000000
--- a/test/functional/word_list.rb
+++ /dev/null
@@ -1,79 +0,0 @@
-require 'test/unit'
-require 'coderay'
-
-class WordListTest < Test::Unit::TestCase
-
- include CodeRay
-
- # define word arrays
- RESERVED_WORDS = %w[
- asm break case continue default do else
- ...
- ]
-
- PREDEFINED_TYPES = %w[
- int long short char void
- ...
- ]
-
- PREDEFINED_CONSTANTS = %w[
- EOF NULL ...
- ]
-
- # make a WordList
- IDENT_KIND = WordList.new(:ident).
- add(RESERVED_WORDS, :reserved).
- add(PREDEFINED_TYPES, :pre_type).
- add(PREDEFINED_CONSTANTS, :pre_constant)
-
- def test_word_list_example
- assert_equal :pre_type, IDENT_KIND['void']
- # assert_equal :pre_constant, IDENT_KIND['...'] # not specified
- end
-
- def test_word_list
- list = WordList.new(:ident).add(['foobar'], :reserved)
- assert_equal :reserved, list['foobar']
- assert_equal :ident, list['FooBar']
- end
-
- def test_word_list_cached
- list = WordList.new(:ident, true).add(['foobar'], :reserved)
- assert_equal :reserved, list['foobar']
- assert_equal :ident, list['FooBar']
- end
-
- def test_case_ignoring_word_list
- list = CaseIgnoringWordList.new(:ident).add(['foobar'], :reserved)
- assert_equal :ident, list['foo']
- assert_equal :reserved, list['foobar']
- assert_equal :reserved, list['FooBar']
-
- list = CaseIgnoringWordList.new(:ident).add(['FooBar'], :reserved)
- assert_equal :ident, list['foo']
- assert_equal :reserved, list['foobar']
- assert_equal :reserved, list['FooBar']
- end
-
- def test_case_ignoring_word_list_cached
- list = CaseIgnoringWordList.new(:ident, true).add(['foobar'], :reserved)
- assert_equal :ident, list['foo']
- assert_equal :reserved, list['foobar']
- assert_equal :reserved, list['FooBar']
-
- list = CaseIgnoringWordList.new(:ident, true).add(['FooBar'], :reserved)
- assert_equal :ident, list['foo']
- assert_equal :reserved, list['foobar']
- assert_equal :reserved, list['FooBar']
- end
-
- def test_dup
- list = WordList.new(:ident).add(['foobar'], :reserved)
- assert_equal :reserved, list['foobar']
- list2 = list.dup
- list2.add(%w[foobar], :keyword)
- assert_equal :keyword, list2['foobar']
- assert_equal :reserved, list['foobar']
- end
-
-end
\ No newline at end of file
--
coderay.git