Log in with a browser, then have Ruby/Mechanize take over?

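Is this possible? What would I need to pass to Mechanize? Which URL should I start from?

So far I have not managed to log in to a website using Mechanize, so I wondered whether this small workaround would do. I believe I could capture all the cookies and everything else from the browser session, and then pass them to Ruby/Mechanize to do the rest of the work...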

The screenshots below were taken with Firebug (Firebug logs POST and GET requests along with the response headers).

Login that works (only one line): http://sofzh.miximages.com/ruby/ofivo.png

And the HTML of the login that works:

 

Login that does not work for me: http://sofzh.miximages.com/ruby/13zcqj6.png

Here is the HTML:

  

My script is almost the same in both cases.

    require 'rubygems'
    require 'mechanize'

    #agent = WWW::Mechanize.new
    agent = WWW::Mechanize.new
    page = agent.get("http://www.vbulletin.org/forum/index.php")
    login_form = page.form_with(:action => 'login.php?do=login')
    puts login_form.fields.each { |f| puts "#{f.name} : #{f.value}" }
    login_form['vb_login_username'] = 'user name'
    login_form['vb_login_password'] = ''
    page = agent.submit login_form
    output = File.open("login.html", "w") {|f| f.write(page.parser.to_html) }

Mechanize log for the login that does not work:

    INFO -- : Net::HTTP::Get: /login?auth_successurl=http://www.somedomain.com/forum/yota?baz_r=1
    DEBUG -- : request-header: accept-language => en-us,en;q=0.5
    DEBUG -- : request-header: connection => keep-alive
    DEBUG -- : request-header: accept => */*
    DEBUG -- : request-header: accept-encoding => gzip,identity
    DEBUG -- : request-header: user-agent => WWW-Mechanize/0.9.3 (http://rubyforge.org/projects/mechanize/)
    DEBUG -- : request-header: accept-charset => ISO-8859-1,utf-8;q=0.7,*;q=0.7
    DEBUG -- : request-header: host => www.somedomain.com
    DEBUG -- : request-header: keep-alive => 300
    DEBUG -- : Read 400 bytes
    DEBUG -- : Read 1424 bytes
    DEBUG -- : Read 2448 bytes
    DEBUG -- : Read 3211 bytes
    DEBUG -- : response-header: vary => Accept-Encoding
    DEBUG -- : response-header: cache-control => no-store, no-cache, must-revalidate, post-check=0, pre-check=0
    DEBUG -- : response-header: connection => close
    DEBUG -- : response-header: expires => Thu, 19 Nov 1981 08:52:00 GMT
    DEBUG -- : response-header: content-type => text/html; charset=utf-8
    DEBUG -- : response-header: date => Fri, 29 Jan 2010 23:43:12 GMT
    DEBUG -- : response-header: content-encoding => gzip
    DEBUG -- : response-header: server => Apache/2.2.3 (CentOS)
    DEBUG -- : response-header: content-length => 3211
    DEBUG -- : response-header: set-cookie => PHPSESSID=7cfilg86ju2ldcgso22246hpu4; path=/, WebStats:visitorId=lSMkcwuSWEE%3D; expires=Mon, 27-Jan-2020 23:43:12 GMT; path=/, WebStats:sessionId=%2B2HHK296t%2BQ%3D; expires=Mon, 27-Jan-2020 23:43:12 GMT; path=/
    DEBUG -- : response-header: accept-ranges => bytes
    DEBUG -- : response-header: pragma => no-cache
    DEBUG -- : gunzip body
    DEBUG -- : saved cookie: PHPSESSID=7cfilg86ju2ldcgso22246hpu4
    DEBUG -- : saved cookie: WebStats:visitorId=lSMkcwuSWEE%3D
    DEBUG -- : saved cookie: WebStats:sessionId=%2B2HHK296t%2BQ%3D
    INFO -- : status: 200
    DEBUG -- : query: "auth_username=radek&auth_password=mypassword"
    INFO -- : Net::HTTP::Post: /login?auth_successurl=http://www.somedomain.com/forum/yota?baz_r=1
    DEBUG -- : request-header: accept-language => en-us,en;q=0.5
    DEBUG -- : request-header: connection => keep-alive
    DEBUG -- : request-header: accept => */*
    DEBUG -- : request-header: accept-encoding => gzip,identity
    DEBUG -- : request-header: content-type => application/x-www-form-urlencoded
    DEBUG -- : request-header: user-agent => WWW-Mechanize/0.9.3 (http://rubyforge.org/projects/mechanize/)
    DEBUG -- : request-header: cookie => WebStats:sessionId=%2B2HHK296t%2BQ%3D; WebStats:visitorId=lSMkcwuSWEE%3D; PHPSESSID=7cfilg86ju2ldcgso22246hpu4
    DEBUG -- : request-header: referer => http://www.somedomain.com/login?auth_successurl=http://www.somedomain.com/forum/yota?baz_r=1
    DEBUG -- : request-header: accept-charset => ISO-8859-1,utf-8;q=0.7,*;q=0.7
    DEBUG -- : request-header: content-length => 43
    DEBUG -- : request-header: host => www.somedomain.com
    DEBUG -- : request-header: keep-alive => 300
    DEBUG -- : Read 650 bytes
    DEBUG -- : Read 1674 bytes
    DEBUG -- : Read 2698 bytes
    DEBUG -- : Read 3211 bytes
    DEBUG -- : response-header: vary => Accept-Encoding
    DEBUG -- : response-header: cache-control => no-store, no-cache, must-revalidate, post-check=0, pre-check=0
    DEBUG -- : response-header: connection => close
    DEBUG -- : response-header: expires => Thu, 19 Nov 1981 08:52:00 GMT
    DEBUG -- : response-header: content-type => text/html; charset=utf-8
    DEBUG -- : response-header: date => Fri, 29 Jan 2010 23:43:13 GMT
    DEBUG -- : response-header: content-encoding => gzip
    DEBUG -- : response-header: server => Apache/2.2.3 (CentOS)
    DEBUG -- : response-header: content-length => 3211
    DEBUG -- : response-header: accept-ranges => bytes
    DEBUG -- : response-header: pragma => no-cache
    DEBUG -- : gunzip body
    INFO -- : status: 200

Yes. Capturing the cookies (for example, with the FireCookies add-on for Firefox) and passing them to Mechanize by hand will probably work in most cases.
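As a rough illustration of what "passing the cookies by hand" amounts to at the HTTP level, here is a minimal sketch using only Ruby's standard library rather than Mechanize itself. The cookie names come from the log in the question; the values are placeholders, and the helper names `cookie_header` and `request_with_cookies` are made up for this example. It only builds the request and does not send anything.

```ruby
require 'net/http'
require 'uri'

# Cookies copied by hand from the browser after logging in there.
# Names are taken from the log in the question; values are placeholders.
BROWSER_COOKIES = {
  'PHPSESSID'          => 'value-copied-from-browser',
  'WebStats:visitorId' => 'value-copied-from-browser'
}

# Join name=value pairs the way a browser does in the Cookie request header.
def cookie_header(cookies)
  cookies.map { |name, value| "#{name}=#{value}" }.join('; ')
end

# Build (but do not send) a GET request that carries the given cookies.
def request_with_cookies(url, cookies)
  uri = URI.parse(url)
  request = Net::HTTP::Get.new(uri.request_uri)
  request['Cookie'] = cookie_header(cookies)
  request
end

request = request_with_cookies('http://www.somedomain.com/forum/', BROWSER_COOKIES)
puts request['Cookie']
```

With Mechanize the same cookies would go into the agent's cookie jar (`agent.cookie_jar`) before the first `agent.get`. The exact way to add a cookie differs between Mechanize versions, so check the documentation for the version you have installed.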

Your problem most likely stems from the fact that Mechanize only tracks cookies created through the Set-Cookie HTTP header. It cannot handle cookies created by JavaScript.
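One way to check whether this is what is happening: compare the cookies Mechanize reports as saved with the cookies the browser holds after a successful login. The sketch below pulls the cookie names out of the "saved cookie" lines of the log in the question. The browser-side list is an assumption for illustration, and `js_token` in particular is a made-up name, not something known to exist on the real site.

```ruby
# "saved cookie" lines copied from the Mechanize log in the question.
MECHANIZE_LOG = <<~LOG
  DEBUG -- : saved cookie: PHPSESSID=7cfilg86ju2ldcgso22246hpu4
  DEBUG -- : saved cookie: WebStats:visitorId=lSMkcwuSWEE%3D
  DEBUG -- : saved cookie: WebStats:sessionId=%2B2HHK296t%2BQ%3D
LOG

# Cookie names the browser shows after logging in (hypothetical list;
# 'js_token' stands in for a cookie that a script on the page might create).
BROWSER_COOKIE_NAMES = ['PHPSESSID', 'WebStats:visitorId', 'WebStats:sessionId', 'js_token']

# Extract the cookie names from "saved cookie: NAME=VALUE" log lines.
def saved_cookie_names(log)
  log.scan(/saved cookie: ([^=\s]+)=/).flatten
end

# Cookies present in the browser but never seen by the script.
def cookies_missing_from_script(browser_names, script_names)
  browser_names - script_names
end

missing = cookies_missing_from_script(BROWSER_COOKIE_NAMES, saved_cookie_names(MECHANIZE_LOG))
puts "Cookies the browser has but Mechanize does not: #{missing.inspect}"
```

Any name that ends up in `missing` was not delivered through a Set-Cookie header in the logged responses, which makes it a candidate for a cookie set by JavaScript that would have to be supplied manually.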