Rails: dynamic robots.txt with erb

I'm trying to render a dynamic text file (robots.txt) in my Rails (3.0.10) app, but it keeps being rendered as HTML (according to the console).

    match 'robots.txt' => 'sites#robots'

Controller:

    class SitesController < ApplicationController
      respond_to :html, :js, :xml, :css, :txt

      def robots
        @site = Site.find_by_subdomain # blah blah
      end
    end

app/views/sites/robots.txt.erb:

    Sitemap: /sitemap.xml

But when I visit http://www.example.com/robots.txt I get a blank page/source, and the log shows:

    Started GET "/robots.txt" for 127.0.0.1 at 2011-11-21 11:22:13 -0500
      Processing by SitesController#robots as HTML
      Site Load (0.4ms)  SELECT `sites`.* FROM `sites` WHERE (`sites`.`subdomain` = 'blah') ORDER BY created_at DESC LIMIT 1
    Completed 406 Not Acceptable in 828ms

Any idea what I'm doing wrong?

Note: I added this to config/initializers/mime_types because Rails was complaining about not knowing what the .txt MIME type was:

    Mime::Type.register_alias "text/plain", :txt

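Note 2: I did remove the stock robots.txt from the public directory.

I think the problem is that if you define respond_to in your controller, you have to use respond_with in the action:

    def robots
      @site = Site.find_by_subdomain # blah blah
      respond_with @site
    end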

Also, try explicitly specifying the .erb file to be rendered:

    def robots
      @site = Site.find_by_subdomain # blah blah
      render 'sites/robots.txt.erb'
      respond_with @site
    end

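Note: this is a repost from coderwall.

Following some advice from similar answers on Stack Overflow, I currently use the following solution to render a dynamic robots.txt based on the request's host parameter.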

Routing

    # config/routes.rb
    #
    # Dynamic robots.txt
    get 'robots.:format' => 'robots#index'

Controller

    # app/controllers/robots_controller.rb
    class RobotsController < ApplicationController
      # No layout
      layout false

      # Render a robots.txt file based on whether the request
      # is performed against a canonical url or not
      # Prevent robots from indexing content served via a CDN twice
      def index
        if canonical_host?
          render 'allow'
        else
          render 'disallow'
        end
      end

      private

      def canonical_host?
        request.host =~ /plugingeek\.com/
      end
    end

Views

Based on request.host we render one of two different .text.erb view files.

Allowing robots

    # app/views/robots/allow.text.erb # Note the .text extension

    # Allow robots to index the entire site except some specified routes
    # rendered when site is visited with the default hostname
    # http://www.robotstxt.org/

    # ALLOW ROBOTS
    User-agent: *
    Disallow:

Banning spiders

    # app/views/robots/disallow.text.erb # Note the .text extension

    # Disallow robots to index any page on the site
    # rendered when robot is visiting the site
    # via the Cloudfront CDN URL
    # to prevent duplicate indexing
    # and search results referencing the Cloudfront URL

    # DISALLOW ROBOTS
    User-agent: *
    Disallow: /
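Outside of Rails, the same host-based choice can be tried with nothing but the standard library's ERB. This is only a rough sketch under a few assumptions: the two templates are inlined as strings instead of living in view files, `robots_txt_for` and `TEMPLATES` are names made up for the illustration, and `d1234.cloudfront.net` is a placeholder CDN hostname.

```ruby
require "erb"

# Assumption: same canonical-host pattern as the controller shown earlier.
CANONICAL_HOST = /plugingeek\.com/

# Hypothetical in-memory stand-ins for allow.text.erb and disallow.text.erb.
TEMPLATES = {
  allow: ERB.new("# ALLOW ROBOTS (<%= host %>)\nUser-agent: *\nDisallow:\n"),
  disallow: ERB.new("# DISALLOW ROBOTS (<%= host %>)\nUser-agent: *\nDisallow: /\n")
}.freeze

# Pick a template by host, then render it with the host passed in as a local.
def robots_txt_for(host)
  key = host.match?(CANONICAL_HOST) ? :allow : :disallow
  TEMPLATES.fetch(key).result_with_hash(host: host)
end

puts robots_txt_for("www.plugingeek.com")
puts robots_txt_for("d1234.cloudfront.net") # placeholder CDN hostname
```

Note that the pattern is not anchored, so any host that merely contains plugingeek.com counts as canonical; anchoring it would be a reasonable tightening if that matters for your setup.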

Specs

Testing this setup with RSpec and Capybara is also quite easy.

    # spec/features/robots_spec.rb
    require 'spec_helper'

    feature "Robots" do
      context "canonical host" do
        scenario "allow robots to index the site" do
          Capybara.app_host = 'http://www.plugingeek.com'
          visit '/robots.txt'
          Capybara.app_host = nil

          expect(page).to have_content('# ALLOW ROBOTS')
          expect(page).to have_content('User-agent: *')
          expect(page).to have_content('Disallow:')
          expect(page).to have_no_content('Disallow: /')
        end
      end

      context "non-canonical host" do
        scenario "deny robots to index the site" do
          visit '/robots.txt'

          expect(page).to have_content('# DISALLOW ROBOTS')
          expect(page).to have_content('User-agent: *')
          expect(page).to have_content('Disallow: /')
        end
      end
    end

    # This would be the resulting docs
    # Robots
    #   canonical host
    #     allow robots to index the site
    #   non-canonical host
    #     deny robots to index the site
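One detail in the spec is easy to miss: the string `Disallow:` is also a substring of `Disallow: /`, which is why the canonical-host scenario needs the extra `have_no_content` check. If you would rather compare the rules themselves, something along these lines might do; it is a plain-Ruby sketch, `disallow_rules` is a hypothetical helper, and the two bodies are hard-coded copies of the views above rather than real responses.

```ruby
# Hypothetical response bodies, mirroring the two views above.
ALLOW_BODY = "# ALLOW ROBOTS\nUser-agent: *\nDisallow:\n"
DISALLOW_BODY = "# DISALLOW ROBOTS\nUser-agent: *\nDisallow: /\n"

# Collect the value of every Disallow rule, skipping comments and blank lines.
def disallow_rules(body)
  body.each_line
      .map(&:strip)
      .reject { |line| line.empty? || line.start_with?("#") }
      .select { |line| line.start_with?("Disallow:") }
      .map { |line| line.delete_prefix("Disallow:").strip }
end

p disallow_rules(ALLOW_BODY)
p disallow_rules(DISALLOW_BODY)
```

An empty rule value means nothing is blocked, while "/" blocks the whole site, so the two cases can no longer be confused by a substring match.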

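As a last step, you might need to remove the static public/robots.txt in the public folder if it's still present.

I hope you find this useful. Feel free to comment and help improve this technique further.

A solution that works in Rails 3.2.3 (not sure about 3.0.10) is as follows: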

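1) Name your template file robots.text.erb (emphasis on text vs. txt)

2) Set up your route like this: match '/robots.:format' => 'sites#robots'

3) Leave your action as is (you can remove the respond_with in the controller)

    def robots
      @site = Site.find_by_subdomain # blah blah
    end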

This solution also eliminates the need to explicitly specify txt.erb in the render call mentioned in the accepted answer.
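As for why the template has to be named with .text rather than .txt: as far as I can tell, Rails registers text/plain under the format symbol :text and treats txt only as an extension synonym, so a request for /robots.txt ends up looking for a text template. The snippet below is a deliberately simplified model of that lookup, not Rails's actual code; `EXTENSION_TO_FORMAT`, `format_for` and `template_for` are names invented for the illustration.

```ruby
# Simplified model, not Rails's actual implementation: a URL extension
# maps to a format symbol, and the format symbol names the template.
EXTENSION_TO_FORMAT = {
  "txt" => :text,
  "text" => :text,
  "html" => :html,
  "xml" => :xml
}.freeze

# Hypothetical helper: read the extension the way a "/robots.:format"
# route would capture it, falling back to :html when it is unknown.
def format_for(path)
  extension = File.extname(path).delete_prefix(".")
  EXTENSION_TO_FORMAT.fetch(extension, :html)
end

# Hypothetical helper: the view file name that format would look up.
def template_for(action, path)
  "#{action}.#{format_for(path)}.erb"
end

puts template_for("robots", "/robots.txt")
```

Under this model, a route without a :format segment never captures the extension at all, which would match the "Processing ... as HTML" line in the question's log.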

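I don't like the idea of robots.txt requests reaching my web server.

If you are using Nginx/Apache as your reverse proxy, static files are served much faster by it than a request that has to reach Rails itself.

This is much cleaner, and I think it is faster too.

Try using the following settings.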

nginx.conf - for production

    location /robots.txt {
      alias /path-to-your-rails-public-directory/production-robots.txt;
    }

nginx.conf - for stage

    location /robots.txt {
      alias /path-to-your-rails-public-directory/stage-robots.txt;
    }

For my Rails projects I usually have a separate controller for the robots.txt response

    class RobotsController < ApplicationController
      layout nil

      def index
        host = request.host
        if host == 'lawc.at' then # liveserver
          render 'allow.txt', :content_type => "text/plain"
        else # testserver
          render 'disallow.txt', :content_type => "text/plain"
        end
      end
    end

Then I have two views named disallow.txt.erb and allow.txt.erb

In my routes.rb I have

    get "robots.txt" => 'robots#index'
