Using Nokogiri to select text and HTML between unique sets of tags

I am trying to use Nokogiri to extract the text between two unique sets of tags.

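What is the best way to get the text inside the p tag between the "The problem" heading and the "The solution" heading, and then all of the HTML between the "The solution" heading and the tag that closes that section?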

Example of the full HTML:

 The problem
 TEXT I WANT 
 The solution
 HTML I WANT with it's own set of tags (but never an  or )

Thanks!

 require 'nokogiri'

 doc = Nokogiri.HTML(DATA)
 doc.search('//h2/following-sibling::node()[name() != "h2" and name() != "div" and text() != "\n"]').each do |block|
   p block.text
 end

 __END__
 The problem
 TEXT I WANT
 The solution
 dont capture this
 HTML I WANT with it's own set of tags

Output:

 "TEXT I WANT"
 "HTML I WANT with it's own set of tags"

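This XPath selects every following sibling node of an h2, except nodes that are themselves an h2 or a div and nodes that contain only the string "\n".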

Here is how you get the text of the p tags between two h2 elements that have the class point:

 //h2[@class="point"][1]/following-sibling::p[./following-sibling::h2[@class="point"]]/text()

For the second part, look through the XPath material on w3schools and work it out using the first expression as a model.
