Tag: nokogiri

Displaying Nokogiri data in a view: Trying to figure out how to display the text and images in my app's HTML view. This is my app/scrape2.rb file: require 'nokogiri' require 'open-uri' url = "https://marketplace.asos.com/boutiques/independent-label" doc = Nokogiri::HTML(open(url)) label = doc.css('#boutiqueList') @label = label.css('#boutiqueList img').map { |l| p l.attr('src') } @title = label.css("#boutiqueList .notranslate").map { |o| p o.text } This is the controller: class PagesController < ApplicationController def about #used to change the routing to /about end def index @label = […]

Unable to install Nokogiri: When I run sudo gem install nokogiri on Ubuntu 12.1, I get the following back. What is wrong, and how do I fix it? jason@jason:~/ror/clss$ sudo gem install nokogiri Building native extensions. This could take a while... ERROR: Error installing nokogiri: ERROR: Failed to build gem native extension. /usr/bin/ruby1.9.1 extconf.rb /usr/lib/ruby/1.9.1/rubygems/custom_require.rb:36:in `require': cannot load such file -- mkmf (LoadError) from /usr/lib/ruby/1.9.1/rubygems/custom_require.rb:36:in `require' from extconf.rb:5:in `' Gem files will remain installed in /var/lib/gems/1.9.1/gems/nokogiri-1.5.5 for […]

Should Nokogiri::XML.parse create separate Text nodes for newlines?: I have an XML document created by an external tool: S1 First Suite section 1 C1 Test 1.1 Other 4 – Must Test C2 Test 1.2 Other 4 – Must Test From irb, I do the following (output suppressed until the final command): > require('nokogiri') > doc = Nokogiri::XML.parse(open('./test.xml')) > test_case = doc.search('case').first => #<Nokogiri::XML::Element:0x3ff75851bc44 name="case" children=[#, #<Nokogiri::XML::Element:0x3ff75851b7bc name="id" children=[#]>, #, #<Nokogiri::XML::Element:0x3ff75851b078 name="title" children=[#]>, #, #<Nokogiri::XML::Element:0x3ff75851a970 name="type" children=[#]>, #, #<Nokogiri::XML::Element:0x3ff7585190d4 name="priority" children=[#]>, #, #, #, […]

Unable to install mechanize for Ruby on a Mac: I am trying to install mechanize on Mac OS X 10.7.3 with Ruby 1.8.7. The problem is with one of its dependencies, nokogiri. I have seen other posts about installing Xcode, which I have done (version 4.3.2). This is the error I receive. Thank you in advance. sudo gem install mechanize Building native extensions. This could take a while... ERROR: Error installing mechanize: ERROR: Failed to build gem native extension. /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/bin/ruby extconf.rb mkmf.rb can't find header files for ruby at /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/ruby.h Gem files will remain installed in /Library/Ruby/Gems/1.8/gems/nokogiri-1.5.2 for inspection. Results logged to /Library/Ruby/Gems/1.8/gems/nokogiri-1.5.2/ext/nokogiri/gem_make.out

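Why am I getting a Nokogiri crash and MemoryError: negative re-allocation size?: I have a crawler that runs fine locally, but when I run it on an XL EC2 instance I get a MemoryError: negative re-allocation size error. I have searched online but could not find anything useful. What could be wrong?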

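Using Nokogiri to select text and HTML between unique sets of tags: I am trying to use Nokogiri to extract text from between two unique sets of tags. What is the best way to get the text in the p tag between The problem and The solution, and then all of the HTML between The solution and ? Example of the full HTML: The problem TEXT I WANT The solution HTML I WANT with it's own set of tags (but never an or ) Thanks!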

Data scraping: creating and sorting multiple arrays: We are trying to scrape course names, qualifications, and course durations, and store each in a separate array. Below we pull all of them out, but they seem to come back in random order, and some parts may be sorted by page, etc. Wondering if anyone can help. require 'mechanize' mechanize = Mechanize.new @duration_array = [] @qual_array = [] @courses_array = [] page = mechanize.get('http://search.ucas.com/search/results?Vac=2&AvailableIn=2016&IsFeatherProcessed=True&page=1&providerids=41') page.search('div.courseinfoduration').each do |x| puts x.text.strip page.search('div.courseinfooutcome').each do |y| puts y.text.strip end while next_page_link = page.at('.pager a[text()=">"]') page = mechanize.get(next_page_link['href']) page.search('div.courseinfoduration').each do |x| name = x @duration_array.push(name) puts x.text.strip end end while next_page_link = page.at('.pager a[text()=">"]') page […]

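Using Slim/HAML etc. in a Ruby script?: I am currently writing a script that analyses some genetic data and then produces output in a coloured Word document. The script works fine, but one method in it is badly written: the one that creates the Word document. That method creates a standalone HTML file and then saves it with a 'docx' extension, which lets me give different parts of the document different styles. Below is the minimum needed to get this working. It includes some sample input data, which would be created in a different method and stored in a hash before the final step, along with the necessary methods. require 'bio' def make_hash(input_file) input_read = Hash.new biofastafile = Bio::FlatFile.open(Bio::FastaFormat, input_file) biofastafile.each_entry do |entry| input_read[entry.definition] = entry.aaseq end return input_read end def to_doc(hash, output, motif) output_file = File.new(output, "w") output_file.puts " .id{font-weight: bold;} .signalp{color:#000099; font-weight: bold;} .motif{color:#FF3300; font-weight: bold;} h3 {word-wrap: break-word;} p {word-wrap: break-word; font-family:Courier New, Courier, Mono;}" hash.each […]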

Rails strip_tags for HTML with a carriage return: Is it correct that the following code converts \r to ? strip_tags "aaa\r\n" # => "aaa \n" Is this right? I expected to receive "aaa\r\n"

How do I correctly validate an XML file against a local DTD file with Nokogiri?: I have a simple, valid DTD and a valid XML file that appears to conform to it, but Nokogiri produces a large amount of validation output, which means the XML file fails validation. The DTD file is: The XML file is: FOO SOFTWARE. The core global object. This is a special singleton object. It is used for internal Wayland protocol features. The sync request asks the server to emit the 'done' event on the returned wl_callback object. Since requests are handled in-order and events are delivered in-order, this can be used as a […]