Install
sudo gem install nokogiri
Contribute
github.com/tenderlove/nokogiri

An HTML, XML, SAX, & Reader parser with the ability to search documents via XPath or CSS3 selectors… and much more

Nokogiri

Module Nokogiri

Nokogiri parses and searches XML and HTML very quickly, and provides correctly implemented CSS3 selector support as well as XPath support.

Parsing a document returns either a Nokogiri::XML::Document or a Nokogiri::HTML::Document, depending on the kind of document you parse.

Here is an example:

require 'nokogiri'
require 'open-uri'

# Get a Nokogiri::HTML::Document for the page we’re interested in...

doc = Nokogiri::HTML(URI.open('http://www.google.com/search?q=tenderlove'))

# Do funky things with it using Nokogiri::XML::Node methods...

####
# Search for nodes by css
doc.css('h3.r a.l').each do |link|
  puts link.content
end

See Nokogiri::XML::Node#css for more information about CSS searching. See Nokogiri::XML::Node#xpath for more information about XPath searching.


Constants

VERSION

The version of Nokogiri you are using

VERSION_INFO

More complete version information about libxml

Public Class Methods

HTML(thing, url = nil, encoding = nil, options = XML::ParseOptions::DEFAULT_HTML, &block)
 

Parse HTML. Convenience method for Nokogiri::HTML::Document.parse

# File lib/nokogiri/html.rb, line 13
def HTML thing, url = nil, encoding = nil, options = XML::ParseOptions::DEFAULT_HTML, &block
  Nokogiri::HTML::Document.parse(thing, url, encoding, options, &block)
end
Slop(*args, &block)
 

Parse a document and add the Slop decorator. The Slop decorator implements method_missing such that methods may be used instead of CSS or XPath. For example:

  doc = Nokogiri::Slop(<<-eohtml)
    <html>
      <body>
        <p>first</p>
        <p>second</p>
      </body>
    </html>
  eohtml
  assert_equal('second', doc.html.body.p[1].text)
# File lib/nokogiri.rb, line 117
def Slop(*args, &block)
  Nokogiri(*args, &block).slop!
end
XML(thing, url = nil, encoding = nil, options = XML::ParseOptions::DEFAULT_XML, &block)
 

Parse XML. Convenience method for Nokogiri::XML::Document.parse

# File lib/nokogiri/xml.rb, line 32
def XML thing, url = nil, encoding = nil, options = XML::ParseOptions::DEFAULT_XML, &block
  Nokogiri::XML::Document.parse(thing, url, encoding, options, &block)
end
XSLT(stylesheet)
 

Create a Nokogiri::XSLT::Stylesheet with stylesheet.

Example:

  xslt = Nokogiri::XSLT(File.read(ARGV[0]))
# File lib/nokogiri/xslt.rb, line 12
def XSLT stylesheet
  XSLT.parse(stylesheet)
end
jruby?()

Returns true when Nokogiri is running under JRuby, as recorded in VERSION_INFO.

# File lib/nokogiri/version.rb, line 32
def self.jruby?
  !Nokogiri::VERSION_INFO['ruby']['jruby'].nil?
end
make(input = nil, opts = {})
 

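Create a new node. When input is given, it is parsed as an HTML fragment and the first child of that fragment is returned; otherwise the block is passed on to Nokogiri().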

# File lib/nokogiri.rb, line 94
def make input = nil, opts = {}, &blk
  if input
    Nokogiri::HTML.fragment(input).children.first
  else
    Nokogiri(&blk)
  end
end
parse(string, url = nil, encoding = nil, options = nil)
 

Parse an HTML or XML document. string contains the document.

# File lib/nokogiri.rb, line 75
def parse string, url = nil, encoding = nil, options = nil
  doc =
    if string.respond_to?(:read) ||
      string =~ /^\s*<[^Hh>]*html/ # Probably html
      Nokogiri::HTML(
        string,
        url,
        encoding, options || XML::ParseOptions::DEFAULT_HTML
      )
    else
      Nokogiri::XML(string, url, encoding,
                    options || XML::ParseOptions::DEFAULT_XML)
    end
  yield doc if block_given?
  doc
end