class Nokogiri::XML::SAX::Parser

This is a SAX-style parser that reads its input as it deems necessary. It takes a Nokogiri::XML::SAX::Document and an optional encoding; when given XML input, it sends events to that Nokogiri::XML::SAX::Document.

Here is an example of using this parser:

# Create a subclass of Nokogiri::XML::SAX::Document and implement
# the events we care about:
class MyHandler < Nokogiri::XML::SAX::Document
  def start_element name, attrs = []
    puts "starting: #{name}"
  end

  def end_element name
    puts "ending: #{name}"
  end
end

parser = Nokogiri::XML::SAX::Parser.new(MyHandler.new)

# Hand an IO object to the parser, which will read the XML from the IO.
File.open(path_to_xml) do |f|
  parser.parse(f)
end

For more information about SAX parsers, see Nokogiri::XML::SAX.

Also see Nokogiri::XML::SAX::Document for the available events.

For HTML documents, use the subclass Nokogiri::HTML4::SAX::Parser.

Attributes

document[RW]

The Nokogiri::XML::SAX::Document where events will be sent.

encoding[RW]

The encoding being used for this document.

Public Class Methods

new ⇒ SAX::Parser
new(handler) ⇒ SAX::Parser
new(handler, encoding) ⇒ SAX::Parser

Create a new Parser.

Parameters
  • handler (optional Nokogiri::XML::SAX::Document) The document that will receive events. If not given, a new Nokogiri::XML::SAX::Document is created, which is accessible through the #document attribute.

  • encoding (optional Encoding, String, nil) An Encoding or encoding name to use when parsing the input. (default nil for auto-detection)

# File lib/nokogiri/xml/sax/parser.rb, line 95
def initialize(doc = Nokogiri::XML::SAX::Document.new, encoding = nil)
  @encoding = encoding
  @document = doc
  @warned   = false

  initialize_native unless Nokogiri.jruby?
end

Public Instance Methods

parse(input) { |parser_context| ... }

Parse the input, sending events to the SAX::Document at #document.

Parameters
  • input (String, IO) The input to parse.

If input quacks like a readable IO object (it responds to both read and close), this method forwards to #parse_io; otherwise it forwards to #parse_memory.

Yields

If a block is given, the underlying ParserContext object will be yielded. This can be used to set options on the parser context before parsing begins.

# File lib/nokogiri/xml/sax/parser.rb, line 119
def parse(input, &block)
  if input.respond_to?(:read) && input.respond_to?(:close)
    parse_io(input, &block)
  else
    parse_memory(input, &block)
  end
end

parse_file(filename) { |parser_context| ... }
parse_file(filename, encoding) { |parser_context| ... }

Parse a file.

Parameters
  • filename (String) The path to the file to be parsed.

  • encoding (optional Encoding, String, nil) An Encoding or encoding name to use when parsing the input, or nil for auto-detection. (default #encoding)

Yields

If a block is given, the underlying ParserContext object will be yielded. This can be used to set options on the parser context before parsing begins.

# File lib/nokogiri/xml/sax/parser.rb, line 187
def parse_file(filename, encoding = @encoding)
  raise ArgumentError, "no filename provided" unless filename
  raise Errno::ENOENT unless File.exist?(filename)
  raise Errno::EISDIR if File.directory?(filename)

  ctx = related_class("ParserContext").file(filename, encoding)
  yield ctx if block_given?
  ctx.parse_with(self)
end

parse_io(io) { |parser_context| ... }
parse_io(io, encoding) { |parser_context| ... }

Parse an input stream.

Parameters
  • io (IO) The readable IO object from which to read input.

  • encoding (optional Encoding, String, nil) An Encoding or encoding name to use when parsing the input, or nil for auto-detection. (default #encoding)

Yields

If a block is given, the underlying ParserContext object will be yielded. This can be used to set options on the parser context before parsing begins.

# File lib/nokogiri/xml/sax/parser.rb, line 143
def parse_io(io, encoding = @encoding)
  ctx = related_class("ParserContext").io(io, encoding)
  yield ctx if block_given?
  ctx.parse_with(self)
end

parse_memory(input) { |parser_context| ... }
parse_memory(input, encoding) { |parser_context| ... }

Parse an input string.

Parameters
  • input (String) The input string to be parsed.

  • encoding (optional Encoding, String, nil) An Encoding or encoding name to use when parsing the input, or nil for auto-detection. (default #encoding)

Yields

If a block is given, the underlying ParserContext object will be yielded. This can be used to set options on the parser context before parsing begins.

# File lib/nokogiri/xml/sax/parser.rb, line 165
def parse_memory(input, encoding = @encoding)
  ctx = related_class("ParserContext").memory(input, encoding)
  yield ctx if block_given?
  ctx.parse_with(self)
end