class Nokogiri::HTML4::SAX::Parser
💡 Nokogiri::HTML::SAX::Parser is an alias for this class (Nokogiri::HTML4::SAX::Parser) as of v1.12.0.
This class lets you perform SAX-style parsing on HTML with HTML error correction.
Here is a basic usage example:
    class MyDoc < Nokogiri::XML::SAX::Document
      def start_element name, attributes = []
        puts "found a #{name}"
      end
    end

    parser = Nokogiri::HTML4::SAX::Parser.new(MyDoc.new)
    parser.parse(File.read(ARGV[0], mode: 'rb'))
For more information on SAX parsers, see Nokogiri::XML::SAX.
Public Instance Methods
parse_file(filename, encoding = "UTF-8") { |ctx| ... }
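Parse the file at filename using encoding. Raises ArgumentError if filename is nil, Errno::ENOENT if the file does not exist, and Errno::EISDIR if it is a directory. If a block is given, the parser context is yielded to it before parsing starts.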
    # File lib/nokogiri/html4/sax/parser.rb, line 51
    def parse_file(filename, encoding = "UTF-8")
      raise ArgumentError unless filename
      raise Errno::ENOENT unless File.exist?(filename)
      raise Errno::EISDIR if File.directory?(filename)

      ctx = ParserContext.file(filename, encoding)
      yield ctx if block_given?
      ctx.parse_with(self)
    end
parse_io(io, encoding = "UTF-8") { |ctx| ... }
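Parse the given io using encoding. If a block is given, the parser context is yielded to it before parsing starts.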
    # File lib/nokogiri/html4/sax/parser.rb, line 41
    def parse_io(io, encoding = "UTF-8")
      check_encoding(encoding)
      @encoding = encoding
      ctx = ParserContext.io(io, ENCODINGS[encoding])
      yield ctx if block_given?
      ctx.parse_with(self)
    end
parse_memory(data, encoding = "UTF-8") { |ctx| ... }
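Parse HTML stored in data using encoding. Raises TypeError unless data is a String, and returns nil without parsing if data is empty. If a block is given, the parser context is yielded to it before parsing starts.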
    # File lib/nokogiri/html4/sax/parser.rb, line 30
    def parse_memory(data, encoding = "UTF-8")
      raise TypeError unless String === data
      return if data.empty?

      ctx = ParserContext.memory(data, encoding)
      yield ctx if block_given?
      ctx.parse_with(self)
    end