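delayed_job stops running after some time in production

In production, our delayed_job process is dying for some reason. I'm not sure whether it is crashing or being killed by the operating system or something else. I don't see any errors in the delayed_job.log file.

What can I do to troubleshoot this? I was thinking of installing monit to monitor it, but that will only tell me when it dies. It won't really tell me why it died.

Is there a way to make it more chatty in the log file, so I can tell why it might be dying?

Any other suggestions?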

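I've come across two causes of delayed_job failing silently. The first is actual segfaults when people use libxml in a forked process (this came up on the mailing list a while back).

The second is a problem with version 1.1.0 of the daemons gem that delayed_job depends on ( https://github.com/collectiveidea/delayed_job/issues#issue/81 ). This can be easily worked around by using 1.0.10, which is what my own Gemfile has in it.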
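For reference, pinning the dependency in a Gemfile would look roughly like the fragment below. The version number is the one named above; whether your Gemfile already lists daemons, and where, is an assumption.

```ruby
# Gemfile (fragment)
# Pin daemons to the release that predates the 1.1.0 problem described above.
gem 'daemons', '1.0.10'
```

After changing the Gemfile, run `bundle install` and restart the workers so they pick up the pinned version.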

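Logging

delayed_job does have logging built in, so if a worker dies without printing an error, it is usually because no exception was thrown (e.g. a segfault) or something external is killing the process.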
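To squeeze a bit more information out of a dying worker, one option is an `at_exit` hook that records what Ruby saw as the process went down. This is only a sketch using the standard library: `exit_reason` and `install_exit_logger` are hypothetical helper names, not part of delayed_job. It cannot catch SIGKILL, and a segfault typically aborts the interpreter before `at_exit` hooks run, so a missing exit line is itself a hint that one of those two happened.

```ruby
require "logger"

# Hypothetical helper: turn the exception that is propagating at exit time
# (the value of $! inside an at_exit hook) into a readable reason.
def exit_reason(error)
  case error
  when nil             then "normal exit (no exception)"
  when SystemExit      then "exit called with status #{error.status}"
  when SignalException then "terminated by signal #{error.signm}"
  else                      "unhandled #{error.class}: #{error.message}"
  end
end

# Hypothetical helper: register a hook that writes the exit reason to the
# given logger when this process (or a child forked from it) exits.
def install_exit_logger(logger)
  at_exit do
    logger.warn("process #{Process.pid} exiting: #{exit_reason($!)}")
  end
end
```

In a Rails app this might be called from an initializer, for example `install_exit_logger(Logger.new("log/delayed_job_exit.log"))`, where the log path is again an assumption. If the worker installs its own signal handlers and shuts down cleanly, the hook will report a normal exit rather than the signal.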

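Monitoring

I use bluepill to monitor my delayed_job instances, and so far it has been very successful at making sure the jobs stay running. The steps to get bluepill running for an application are quite simple.

Add the bluepill gem to the Gemfile: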

    # Monitoring
    gem 'i18n' # Not sure why but it complained I didn't have it
    gem 'bluepill'

I created a bluepill config file:

    app_home = "/home/mi/production"
    workers = 5
    Bluepill.application("mi_delayed_job", :log_file => "#{app_home}/shared/log/bluepill.log") do |app|
      (0...workers).each do |i|
        app.process("delayed_job.#{i}") do |process|
          process.working_dir = "#{app_home}/current"

          process.start_grace_time   = 10.seconds
          process.stop_grace_time    = 10.seconds
          process.restart_grace_time = 10.seconds

          process.start_command = "cd #{app_home}/current && RAILS_ENV=production ruby script/delayed_job start -i #{i}"
          process.stop_command  = "cd #{app_home}/current && RAILS_ENV=production ruby script/delayed_job stop -i #{i}"

          process.pid_file = "#{app_home}/shared/pids/delayed_job.#{i}.pid"
          process.uid = "mi"
          process.gid = "mi"
        end
      end
    end

Then in my capistrano deploy file I just added:

    # Bluepill related tasks
    after "deploy:update", "bluepill:quit", "bluepill:start"
    namespace :bluepill do
      desc "Stop processes that bluepill is monitoring and quit bluepill"
      task :quit, :roles => [:app] do
        run "cd #{current_path} && bundle exec bluepill --no-privileged stop"
        run "cd #{current_path} && bundle exec bluepill --no-privileged quit"
      end

      desc "Load bluepill configuration and start it"
      task :start, :roles => [:app] do
        run "cd #{current_path} && bundle exec bluepill --no-privileged load /home/mi/production/current/config/delayed_job.bluepill"
      end

      desc "Prints bluepills monitored processes statuses"
      task :status, :roles => [:app] do
        run "cd #{current_path} && bundle exec bluepill --no-privileged status"
      end
    end
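If you want to spot-check a worker by hand, the pid files from the config above are enough. The snippet below is a minimal sketch in plain Ruby, not anything bluepill itself exposes: `worker_alive?` is a hypothetical helper that reads a pid file and probes the process with signal 0.

```ruby
# Hypothetical helper: returns true if the pid recorded in pid_file
# belongs to a process that currently exists.
def worker_alive?(pid_file)
  pid = Integer(File.read(pid_file).strip)
  return false unless pid.positive?

  Process.kill(0, pid) # signal 0 only checks for existence; nothing is delivered
  true
rescue Errno::ENOENT, Errno::ESRCH, ArgumentError
  false # missing pid file, no such process, or unparseable pid
rescue Errno::EPERM
  true  # the process exists but is owned by another user
end
```

For example, `worker_alive?("/home/mi/production/shared/pids/delayed_job.0.pid")` checks the first worker from the config. A stale pid file that points at a recycled pid will still report true, so treat the result as a quick check rather than proof.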

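Hope this helps.

The most common cause I have seen for this problem is a database issue (mysql connection errors and so on). By default nothing gets logged for it.

So I suggest you use god to control your delayed_job (then you can see its log file!).

Assuming you are using delayed_job with Rails 4, you should:

1. Install the god gem: $ gem install god

2. Have this script file: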

    # filename: cache_cleaner.god
    RAILS_ROOT = '/sg552/workspace/m-api-cache-cleaner'

    God.watch do |w|
      w.name = 'cache_cleaner'
      w.dir = RAILS_ROOT
      w.start = "cd #{RAILS_ROOT} && RAILS_ENV=production bundle exec bin/delayed_job -n 5 start"
      w.stop = "cd #{RAILS_ROOT} && RAILS_ENV=production bundle exec bin/delayed_job stop"
      w.restart = "cd #{RAILS_ROOT} && RAILS_ENV=production bundle exec bin/delayed_job -n 5 restart"
      w.log = "#{RAILS_ROOT}/log/cache_cleaner_stdout.log"
      w.pid_file = File.join(RAILS_ROOT, "log/delayed_job.total.pid")
      # you should NEVER use this config settings:
      # w.keepalive   (always comment it out! )
    end

3. To start/stop/restart delayed_job, change your command from:

    $ bundle exec bin/delayed_job -n 3 start

to:

    $ god -c cache_cleaner.god -D
    $ god start/stop/restart cache_cleaner

For details, see my personal blog: http://siwei.me/blog/posts/using-delayed-job-with-god
