class Peggy::Parser

Packrat parser class. Note that all methods either end in an exclamation mark (!) or question mark (?), or have long names with underscores (_). This is because productions are methods and name collisions must be avoided. To use this class, subclass Parser and provide your productions as methods. Your productions must call match? or one of the protected convenience routines to perform parsing. Productions must never call another production directly; otherwise results will not be memoized, the parse will slow down considerably, and you risk infinite recursion (until the stack overflows). As a convenience when writing productions, you can call any match? function multiple times in a row, passing each returned index to the next call, as in a sequence, without checking the result of each production.
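
The sequence convention described above can be sketched with a small stand-in class. SketchParser below is hypothetical and is not the Peggy implementation; it only mimics the documented behaviour of match?, string?, regexp? and eof, and it assumes NO_MATCH is false. The AssignmentParser grammar and its production names are invented for illustration.

```ruby
# Hypothetical stand-in for Peggy::Parser, reduced to what the example needs.
# NO_MATCH = false is an assumption, not taken from the library source.
class SketchParser
  NO_MATCH = false

  attr_accessor :source_text

  def parse?(goal, source, index = 0)
    self.source_text = source
    @memo = Hash.new { |h, k| h[k] = {} } # index => { goal => end index }
    match?(goal, index)
  end

  # Memoized dispatch to a production method.
  def match?(goal, index)
    return NO_MATCH if index == NO_MATCH # lets a sequence skip result checks
    position = @memo[index]
    position.fetch(goal) { position[goal] = send(goal, index) }
  end

  # Anchored regular expression match; returns the end index or NO_MATCH.
  def regexp?(re, index)
    return NO_MATCH if index == NO_MATCH
    found = /\A(?:#{re.source})/.match(source_text[index..-1])
    found ? index + found.end(0) : NO_MATCH
  end

  # Literal string match; returns the end index or NO_MATCH.
  def string?(value, index)
    return NO_MATCH if index == NO_MATCH
    stop = index + value.length
    source_text[index...stop] == value ? stop : NO_MATCH
  end

  def eof(index)
    return NO_MATCH if index == NO_MATCH
    index >= source_text.length ? index : NO_MATCH
  end
end

class AssignmentParser < SketchParser
  # assignment := name '=' number eof
  # Each call receives the previous result; a NO_MATCH simply propagates.
  def assignment(index)
    index = match?(:name, index)
    index = string?('=', index)
    index = match?(:number, index)
    match?(:eof, index)
  end

  def name(index)
    regexp?(/[a-z]+/, index)
  end

  def number(index)
    regexp?(/\d+/, index)
  end
end

parser = AssignmentParser.new
puts parser.parse?(:assignment, 'x=42').inspect # prints 4
puts parser.parse?(:assignment, 'x=y').inspect  # prints false
```

The assignment production never inspects intermediate results. Once number fails, every later call in the sequence returns NO_MATCH immediately.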

Attributes

debug_flag[RW]

Tells parser to print intermediate results if set.

ignore_productions[RW]

The productions to ignore.

parse_results[R]

The results of the parse: a hash (keyed by index) of hashes (keyed by production symbol, with end indexes as values).
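
As a rough illustration of that shape, the snippet below builds an auto-vivifying hash of hashes with the same Hash.new block that parse?() uses. The entries are made up for a source such as "x=42"; a real parse also stores bookkeeping keys like :found_order, which are left out here.

```ruby
# Auto-vivifying hash of hashes, built the way parse?() builds its memo.
parse_results = Hash.new { |h1, k1| h1[k1] = {} }

# Hypothetical entries for the source "x=42" (values are end indexes).
parse_results[0][:name] = 1
parse_results[0][:assignment] = 4
parse_results[2][:number] = 4

# Look up where a production that started at a given index ended.
puts parse_results[0][:assignment]   # prints 4
puts parse_results.keys.sort.inspect # prints [0, 2]
```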

source_text[RW]

The source to parse. It can be set prior to calling parse?().

Public Instance Methods

[](range)

Return a range (or character) of the source_text.

# File lib/parse/parser.rb, line 91
def [] range
  raise "source_text not set" if source_text.nil?
  source_text[range]
end

_memoize(goal, index, result, position = parse_results[index])

Record the results of the parse in the parse_results memo.

# File lib/parse/parser.rb, line 171
def _memoize goal, index, result, position = parse_results[index]
  if result
    position[:found_order] = [] unless position.has_key?(:found_order)
    position[:found_order] << goal
    position[goal.to_s] = source_text[index...result] if result - index < 40 && goal.is_a?(Symbol)
  end
  position[goal] = result if result || goal.is_a?(Symbol)
  result
end

allow?(goal, index)

Try to match a production from the given index. Returns the end index if found or start index if not found.

# File lib/parse/parser.rb, line 124
def allow? goal, index
  return NO_MATCH if index == NO_MATCH # allow users to not check results of a sequence
  found = match? goal, index
  found == NO_MATCH ? index : found
end

ast?(options={})

Create an Abstract Syntax Tree from the parse results. You must call parse?() prior to this. Valid options:

  • :ignore=>[symbol of element to ignore]

# File lib/parse/ast.rb, line 218
def ast? options={}
  ast = AST.new source_text, parse_results, options
  # puts ast
  ast
end

check?(goal, index)

Try to match a production from the given index then backtrack. Returns index if found or NO_MATCH if not.

# File lib/parse/parser.rb, line 132
def check? goal, index
  return NO_MATCH if index == NO_MATCH # allow users to not check results of a sequence
  found = match? goal, index
  found == NO_MATCH ? NO_MATCH : index
end

correct_regexp!(re)

Make sure regular expressions are anchored to the beginning of the string, which in practice is the substring of the source starting at the given index.

# File lib/parse/parser.rb, line 233
def correct_regexp! re
  source = re.source
  source[0..1] == '\\A' ? re : Regexp.new("\\A(#{source})", re.options)
end
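
To see what the anchoring does, the helper can be copied out as a free-standing function and tried on a short string. The variable names and the sample text below are invented for illustration.

```ruby
# Copy of the documented helper, as a top-level method for illustration.
def correct_regexp!(re)
  source = re.source
  source[0..1] == '\\A' ? re : Regexp.new("\\A(#{source})", re.options)
end

digits = correct_regexp!(/\d+/)
text = 'ab12cd'

puts digits.source                     # prints \A(\d+)
puts digits.match(text[2..-1])[0]      # prints 12
puts digits.match(text[0..-1]).inspect # prints nil
```

Without the \A anchor, the last match would succeed by skipping over "ab", which would make the parser accept text that does not start at the current index.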

dissallow?(goal, index)

Check that a production does not match at the given index, then backtrack. Returns index if not found or NO_MATCH if found.

# File lib/parse/parser.rb, line 140
def dissallow? goal, index
  return NO_MATCH if index == NO_MATCH # allow users to not check results of a sequence
  found = match? goal, index
  found == NO_MATCH ? index : NO_MATCH
end
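
The difference between allow?, check? and dissallow? is easiest to see side by side. The sketch below copies the three method bodies from this page into a hypothetical LookaheadSketch class. Its match? is a simplified stand-in driven by a table of regular expressions, and NO_MATCH = false is an assumption.

```ruby
# Hypothetical miniature parser; only allow?, check? and dissallow? mirror the
# documented source. NO_MATCH = false and the table-driven match? are assumptions.
class LookaheadSketch
  NO_MATCH = false
  PRODUCTIONS = { digits: /\A\d+/, letters: /\A[a-z]+/ }.freeze

  def initialize(source_text)
    @source_text = source_text
  end

  # Simplified stand-in for match?: end index on success, NO_MATCH otherwise.
  def match?(goal, index)
    return NO_MATCH if index == NO_MATCH
    found = PRODUCTIONS.fetch(goal).match(@source_text[index..-1])
    found ? index + found.end(0) : NO_MATCH
  end

  # Optional match: consume the production if present, otherwise stay put.
  def allow?(goal, index)
    return NO_MATCH if index == NO_MATCH
    found = match?(goal, index)
    found == NO_MATCH ? index : found
  end

  # Positive lookahead: succeed without consuming anything.
  def check?(goal, index)
    return NO_MATCH if index == NO_MATCH
    found = match?(goal, index)
    found == NO_MATCH ? NO_MATCH : index
  end

  # Negative lookahead: succeed only if the production is absent.
  def dissallow?(goal, index)
    return NO_MATCH if index == NO_MATCH
    found = match?(goal, index)
    found == NO_MATCH ? index : NO_MATCH
  end
end

sketch = LookaheadSketch.new('12ab')
puts sketch.allow?(:digits, 0).inspect      # prints 2
puts sketch.allow?(:letters, 0).inspect     # prints 0
puts sketch.check?(:digits, 0).inspect      # prints 0
puts sketch.check?(:letters, 0).inspect     # prints false
puts sketch.dissallow?(:letters, 0).inspect # prints 0
puts sketch.dissallow?(:digits, 0).inspect  # prints false
```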

eof(index)

Special production that only matches the end of source_text. Note, this function does not end in (?) or (!) because it is meant to be used as a normal production.

# File lib/parse/parser.rb, line 148
def eof index
  return NO_MATCH if index == NO_MATCH # allow users to not check results of a sequence
  index >= source_text.length ? index : NO_MATCH
end

ignore?(index)

Match tokens that should be ignored. Used by match?(). Returns end index if found or start index if not found. Subclasses should override this method if they wish to ignore other text, such as comments.

# File lib/parse/parser.rb, line 184
def ignore? index
  return NO_MATCH if index == NO_MATCH # allow users to not check results of a sequence
  return index if @ignoring || ignore_productions.nil?
  @ignoring = true
  ignore_productions.each do |prod|
    index = allow? prod, index
  end
  @ignoring = nil
  index
end

literal?(value, index)

Match a literal string or regular expression from the given index. Returns the end index if found or NO_MATCH if not found.

# File lib/parse/parser.rb, line 197
def literal? value, index
  return NO_MATCH if index == NO_MATCH # allow users to not check results of a sequence
  case value
  when String
    string? value, index
  when Regexp
    regexp? value, index
  else
    raise "Unknown literal: #{value.inspect}"
  end
end

match?(goal, index)

Match a production from the given index. Returns the end index if found or NO_MATCH if not found.

# File lib/parse/parser.rb, line 155
def match? goal, index
  return NO_MATCH if index == NO_MATCH # allow users to not check results of a sequence
  index = ignore? index unless @ignoring
  goal = goal.to_sym
  position = parse_results[index]
  found = position.fetch(goal) do
    position[goal] = IN_USE # used to prevent infinite recursion in case the user attempts
                            # a left recursion
    _memoize goal, index, send(goal, index), position
  end
  puts "found #{goal} at #{index}...#{found} #{source_text[index...found].inspect}" if found && debug_flag
  raise "Parser cannot handle infinite (left) recursions. Please rewrite usage of '#{goal}'." if found == IN_USE
  found
end
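
The memo lookup and the IN_USE guard can be tried in isolation with a reduced version of match?. MemoSketch below is a hypothetical class, not the library code. It assumes NO_MATCH = false and IN_USE = true, leaves out the ignore? step, and adds a calls counter purely to make the memoization visible.

```ruby
# Hypothetical reduction of match?; the constants' values are assumptions.
class MemoSketch
  NO_MATCH = false
  IN_USE = true

  attr_reader :calls

  def initialize(source_text)
    @source_text = source_text
    @memo = Hash.new { |h, k| h[k] = {} } # index => { goal => result }
    @calls = Hash.new(0)                  # how often each production body ran
  end

  def match?(goal, index)
    return NO_MATCH if index == NO_MATCH
    position = @memo[index]
    found = position.fetch(goal) do
      position[goal] = IN_USE            # mark the goal as in progress
      position[goal] = send(goal, index) # run the production, store the result
    end
    raise "left recursion in '#{goal}'" if found == IN_USE
    found
  end

  def digit(index)
    @calls[:digit] += 1
    @source_text[index].to_s.match?(/\d/) ? index + 1 : NO_MATCH
  end

  # Left-recursive on purpose: expr := expr digit
  def expr(index)
    match?(:digit, match?(:expr, index))
  end
end

sketch = MemoSketch.new('7')
sketch.match?(:digit, 0)
sketch.match?(:digit, 0)
puts sketch.calls[:digit] # prints 1

begin
  sketch.match?(:expr, 0)
rescue RuntimeError => e
  puts e.message # prints left recursion in 'expr'
end
```

The second match?(:digit, 0) is answered from the memo, so the production body runs only once. The nested request for :expr at the same index finds the IN_USE marker and raises instead of recursing forever.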

parse?(goal, source = nil, index = 0)

Invokes the parser on the given production goal, starting from the beginning of the source. You may provide the source here or set source_text before calling. If index is provided, the parser ignores all characters before it.

# File lib/parse/parser.rb, line 99
def parse? goal, source = nil, index = 0
  self.source_text = source unless source.nil?
  # Hash of automatic hashes
  @parse_results = Hash.new {|h1, k1| h1[k1] = {}} # OrderedHash.new {|h1, k1| h1[k1] = {}}
  @keys = nil
  index = match? goal, index
  pp(parse_results) if debug_flag
  index
end

query?(*args)

Queries the parse results for a hierarchy of production matches. An array of index ranges is returned, or an empty array if none are found. This can only be called after parse_results have been set by a parse.

# File lib/parse/parser.rb, line 112
def query? *args
  raise "You must first call parse!" unless parse_results
  @keys = @parse_results.keys.sort unless @keys
  found_list = []
  index = 0
  args.each do |arg|
    index = find? arg, index
  end
end

regexp?(value, index)

Match a regular expression from the given index. Returns the end index if found or NO_MATCH if not found.

# File lib/parse/parser.rb, line 222
def regexp? value, index
  return NO_MATCH if index == NO_MATCH # allow users to not check results of a sequence
  value = correct_regexp! value
  index = ignore? index unless @ignoring
  found = value.match source_text[index..-1]
  # puts "#{value.inspect} ~= #{found[0].inspect}" if found
  _memoize(value, index, found ? found.end(0) + index : NO_MATCH)
end

string?(value, index)

Match a string from the given index. Returns the end index if found or NO_MATCH if not found.

# File lib/parse/parser.rb, line 211
def string? value, index
  return NO_MATCH if index == NO_MATCH # allow users to not check results of a sequence
  value = value.to_s
  index = ignore? index unless @ignoring
  i2 = index + value.length
  # puts source_text[index...i2].inspect + ' ' + value.inspect
  _memoize(value, index, source_text[index...i2] == value ? i2 : NO_MATCH)
end

Protected Instance Methods

index_results!()

Create an index of the parse results. Todo: unfinished.

# File lib/parse/parser.rb, line 241
def index_results!
  raise "You must first call parse!" unless parse_results
  @index = Hash.new {|h, k| h[k] = []}
  parse_results.each_pair do |index, prod_map|
    prod_map[:found_order].reverse_each
    prod_map.each_value
    @index[prod]
  end
end