Is Timeout in Ruby 2.1.2 still not thread-safe?

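I have 50 Sidekiq threads crawling the web, and a few weeks ago the threads started hanging after running for about 20 minutes. When I dump the backtraces, most of the threads are stuck in net/http's initialize: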

/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:879:in `initialize'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:879:in `open'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:879:in `block in connect'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/timeout.rb:76:in `timeout'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:878:in `connect'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:863:in `do_start'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/net/http.rb:858:in `start'
/app/vendor/bundle/ruby/2.1.0/gems/net-http-persistent-2.9.4/lib/net/http/persistent.rb:700:in `start'
/app/vendor/bundle/ruby/2.1.0/gems/net-http-persistent-2.9.4/lib/net/http/persistent.rb:631:in `connection_for'
/app/vendor/bundle/ruby/2.1.0/gems/net-http-persistent-2.9.4/lib/net/http/persistent.rb:994:in `request'
/app/vendor/bundle/ruby/2.1.0/gems/mechanize-2.7.2/lib/mechanize/http/agent.rb:257:in `fetch'
/app/vendor/bundle/ruby/2.1.0/gems/mechanize-2.7.2/lib/mechanize/http/agent.rb:974:in `response_redirect'
/app/vendor/bundle/ruby/2.1.0/gems/mechanize-2.7.2/lib/mechanize/http/agent.rb:298:in `fetch'
/app/vendor/bundle/ruby/2.1.0/gems/mechanize-2.7.2/lib/mechanize.rb:432:in `get'
/app/app/workers/crawl_page.rb:24:in `block in perform'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/timeout.rb:91:in `block in timeout'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/timeout.rb:35:in `block in catch'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/timeout.rb:35:in `catch'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/timeout.rb:35:in `catch'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/timeout.rb:106:in `timeout'

I didn't think Sidekiq could get stuck in net/http, because I wrap the whole call in a Timeout: Timeout::timeout(APP_CONFIG['crawl_page_timeout']) { @page = agent.get(url) }

...but then I started reading some older posts saying that Ruby's Timeout isn't thread-safe: http://blog.headius.com/2008/02/rubys-threadraise-threadkill-timeoutrb.html

Is Ruby's Timeout still not thread-safe?

I see a lot of people writing crawlers in Ruby. If Timeout isn't thread-safe, how do the people writing those crawlers deal with net/http getting stuck?

Update:

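I've switched from Mechanize to HTTPClient (which specifically says it is thread-safe). We still seem to have threads getting stuck in initialize. Again, this may be because Ruby's Timeout isn't working properly, or it may be a Sidekiq problem. Here's the stack trace of a recently hung Sidekiq thread: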

/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient/session.rb:805:in `initialize'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient/session.rb:805:in `new'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient/session.rb:805:in `create_socket'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient/session.rb:752:in `block in connect'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/timeout.rb:91:in `block in timeout'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/timeout.rb:101:in `call'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/timeout.rb:101:in `timeout'
/app/vendor/ruby-2.1.2/lib/ruby/2.1.0/timeout.rb:127:in `timeout'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient/session.rb:751:in `connect'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient/session.rb:609:in `query'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient/session.rb:164:in `query'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient.rb:1087:in `do_get_block'
/app/vendor/bundle/ruby/2.1.0/gems/newrelic_rpm-3.9.2.239/lib/new_relic/agent/instrumentation/httpclient.rb:34:in `block in do_get_block_with_newrelic'
/app/vendor/bundle/ruby/2.1.0/gems/newrelic_rpm-3.9.2.239/lib/new_relic/agent/cross_app_tracing.rb:43:in `tl_trace_http_request'
/app/vendor/bundle/ruby/2.1.0/gems/newrelic_rpm-3.9.2.239/lib/new_relic/agent/instrumentation/httpclient.rb:33:in `do_get_block_with_newrelic'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient.rb:891:in `block in do_request'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient.rb:985:in `protect_keep_alive_disconnected'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient.rb:890:in `do_request'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient.rb:963:in `follow_redirect'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient.rb:776:in `request'
/app/vendor/bundle/ruby/2.1.0/gems/httpclient-2.4.0/lib/httpclient.rb:677:in `get'
/app/app/ohm_models/queued_page.rb:20:in `run_crawl'

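Correct, Timeout is still not safe to use in Ruby code unless you know exactly what is happening inside the block (including what any C code might be doing). I've personally seen catastrophic things happen in connection pools because of it.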
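To make the connection pool problem concrete, here is a minimal sketch of how a timeout can land in the middle of cleanup code. FakeConnection and use_with_timeout are hypothetical names made up for this illustration, not part of any real pool library, and the sleep only stands in for slow cleanup work:

```ruby
require 'timeout'

# Hypothetical stand-in for a pooled connection (not a real library class).
class FakeConnection
  attr_reader :state

  def initialize
    @state = :idle
  end

  def checkout
    @state = :in_use
  end

  def checkin
    sleep 0.5        # pretend the cleanup is slow (e.g. flushing a socket)
    @state = :idle   # never reached if the timeout fires during the sleep
  end
end

# Hypothetical helper: use the connection under a Timeout.
def use_with_timeout(conn, seconds)
  Timeout.timeout(seconds) do
    begin
      conn.checkout
    ensure
      conn.checkin   # Timeout can raise in the middle of this cleanup too
    end
  end
rescue Timeout::Error
  :timed_out
end

conn = FakeConnection.new
result = use_with_timeout(conn, 0.1)
p result      # :timed_out
p conn.state  # :in_use, because the ensure block itself was interrupted
```

The ensure block does run, but the exception raised by Timeout interrupts it halfway, so the connection is left marked as in use. A real pool that hands that connection back out would now be in an inconsistent state.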

You might be able to get away with rescuing the error and retrying, but if you're unlucky, your process can become wedged and need a restart.
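A rough sketch of the rescue-and-retry approach, assuming a bounded number of attempts is acceptable. with_retries is a hypothetical helper written for this example, and the block below only simulates a request that times out twice before succeeding:

```ruby
require 'timeout'

# Hypothetical helper: retry the block a bounded number of times on timeout.
def with_retries(max_attempts: 3)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue Timeout::Error
    retry if attempts < max_attempts
    raise   # give up and let the caller (e.g. the job runner) see the error
  end
end

calls = 0
value = with_retries(max_attempts: 3) do |attempt|
  calls += 1
  # Simulated failure; a real crawler would perform the HTTP request here.
  raise Timeout::Error, 'simulated timeout' if attempt < 3
  :page_body
end

p value  # :page_body
p calls  # 3
```

This only helps when the timeout leaves the process in a usable state. It does nothing for the wedged case described above.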

If you fork to create new processes, you can safely kill them if they run too long (or use timeout(1)), since they have no way to corrupt your parent process.
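The fork-and-kill idea could look roughly like the sketch below. run_in_child is a hypothetical helper, the polling interval is arbitrary, and it assumes a platform where fork is available (so not Windows or JRuby):

```ruby
# Hypothetical helper: run the block in a forked child and kill the child
# if it is still running after limit_seconds.
def run_in_child(limit_seconds)
  pid = fork do
    begin
      yield
      exit!(0)   # exit! skips at_exit handlers inherited from the parent
    rescue StandardError
      exit!(1)
    end
  end

  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + limit_seconds
  loop do
    # With WNOHANG, waitpid2 returns nil while the child is still running.
    _, status = Process.waitpid2(pid, Process::WNOHANG)
    return(status.success? ? :ok : :failed) if status

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      Process.kill('KILL', pid)  # the child cannot corrupt the parent's state
      Process.wait(pid)          # reap the child so it does not stay a zombie
      return :killed
    end
    sleep 0.01
  end
end

slow = run_in_child(0.2) { sleep 5 }  # stands in for a hung HTTP request
fast = run_in_child(2.0) { 1 + 1 }
p slow  # :killed
p fast  # :ok
```

The child's result is not passed back in this sketch. A real crawler would need a pipe, a file, or a datastore write to return the fetched page to the parent.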

Do you have a working example?
