Ruby: creating a tarball in chunks to avoid out-of-memory errors

I'm trying to reuse the following code to create a tarball:

 tarfile = File.open("#{Pathname.new(path).realpath.to_s}.tar","w")
 Gem::Package::TarWriter.new(tarfile) do |tar|
   Dir[File.join(path, "**/*")].each do |file|
     mode = File.stat(file).mode
     relative_file = file.sub /^#{Regexp::escape path}\/?/, ''
     if File.directory?(file)
       tar.mkdir relative_file, mode
     else
       tar.add_file relative_file, mode do |tf|
         File.open(file, "rb") { |f| tf.write f.read }
       end
     end
   end
 end
 tarfile.rewind
 tarfile

It works fine as long as only small folders are involved, but anything large fails with the following error:

 Error: Your application used more memory than the safety cap

How can I do this in chunks to avoid the memory problem?

It looks like the problem could be in this line:

 File.open(file, "rb") { |f| tf.write f.read }

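You're "slurping" the input file by calling f.read. Slurping means the entire file is read into memory at once, which doesn't scale at all, and it is the result of calling read without a length.

Instead, I'd read and write the file in blocks so that memory usage stays constant. This reads in blocks of roughly 1 MB (1,024,000 bytes). You can adjust that for your own needs: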

 BLOCKSIZE_TO_READ = 1024 * 1000
 File.open(file, "rb") do |fi|
   while buffer = fi.read(BLOCKSIZE_TO_READ)
     tf.write buffer
   end
 end

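Here's what the documentation has to say about read:

If length is a positive integer, it tries to read length bytes without any conversion (binary mode). It returns nil or a string whose length is 1 to length bytes. nil means it met EOF at the beginning. A string of 1 to length-1 bytes means it met EOF after reading the result. A string of length bytes means it did not meet EOF. The resulting string is always in ASCII-8BIT encoding.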
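To make those return values concrete, here is a minimal sketch of the same read loop run against an in-memory StringIO instead of a real file. The helper name copy_in_chunks and the tiny CHUNK_SIZE are made up for illustration; they are not part of the original code.

```ruby
require 'stringio'

CHUNK_SIZE = 4 # deliberately tiny so the chunk boundaries are easy to see

# Copy everything from src to dst in fixed-size chunks and return the
# size of each chunk that was read. (Hypothetical helper for this demo.)
def copy_in_chunks(src, dst, chunk_size = CHUNK_SIZE)
  sizes = []
  # read(chunk_size) returns nil once EOF is reached, which ends the loop
  while (buffer = src.read(chunk_size))
    dst.write(buffer)
    sizes << buffer.bytesize
  end
  sizes
end

source = StringIO.new('0123456789') # 10 bytes of input
target = StringIO.new

chunk_sizes = copy_in_chunks(source, target)
p chunk_sizes    # two full chunks, then a short final chunk
p target.string  # the complete input, reassembled
```

Only one chunk is ever held in memory at a time, so peak usage depends on the chunk size rather than on the size of the input.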

An additional problem is that it looks like you're not opening the output file correctly:

 tarfile = File.open("#{Pathname.new(path).realpath.to_s}.tar","w")

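Because of "w", you're writing it in "text" mode. Instead, you need to write in binary mode, "wb", because a tarball contains binary data, and text mode can translate line endings on platforms such as Windows: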

 tarfile = File.open("#{Pathname.new(path).realpath.to_s}.tar","wb")
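If you want to check which mode a handle is actually in, IO#binmode? reports it. This is just a quick sketch; the file names text.tar and binary.tar are placeholders created in a temporary directory.

```ruby
require 'tmpdir'

text_mode_is_binary = nil
binary_mode_is_binary = nil

Dir.mktmpdir do |dir|
  # "w" opens the file in text mode
  File.open(File.join(dir, 'text.tar'), 'w') do |f|
    text_mode_is_binary = f.binmode?
  end

  # "wb" opens the file in binary mode
  File.open(File.join(dir, 'binary.tar'), 'wb') do |f|
    binary_mode_is_binary = f.binmode?
  end
end

p text_mode_is_binary
p binary_mode_is_binary
```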

Rewriting the original code to be more like I'd want to see it results in:

 require 'pathname'
 require 'rubygems/package'

 BLOCKSIZE_TO_READ = 1024 * 1000

 def create_tarball(path)
   tar_filename = Pathname.new(path).realpath.to_path + '.tar'
   File.open(tar_filename, 'wb') do |tarfile|
     Gem::Package::TarWriter.new(tarfile) do |tar|
       Dir[File.join(path, '**/*')].each do |file|
         mode = File.stat(file).mode
         relative_file = file.sub(/^#{ Regexp.escape(path) }\/?/, '')
         if File.directory?(file)
           tar.mkdir(relative_file, mode)
         else
           tar.add_file(relative_file, mode) do |tf|
             # Copy the file one block at a time instead of slurping it
             File.open(file, 'rb') do |f|
               while buffer = f.read(BLOCKSIZE_TO_READ)
                 tf.write buffer
               end
             end
           end
         end
       end
     end
   end
   tar_filename
 end

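BLOCKSIZE_TO_READ should be at the top of your file since it's a constant and a "tweakable": something more likely to be changed than the body of the code.

The method returns the path to the tarball, not an IO handle like the original code. Using the block form of IO.open (here, File.open) closes the output automatically, so any later open starts again at the beginning of the file and no explicit rewind is needed. I much prefer passing around path strings to passing IO handles for files.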
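As a rough end-to-end check, the sketch below builds a small throwaway directory, tars it, and reads the archive back with Gem::Package::TarReader. It repeats the create_tarball method so that it runs on its own; the demo directory layout and file names are invented for the example.

```ruby
require 'pathname'
require 'rubygems/package'
require 'tmpdir'

BLOCKSIZE_TO_READ = 1024 * 1000

# Copy of the create_tarball method from the answer, repeated here so
# the example is self-contained.
def create_tarball(path)
  tar_filename = Pathname.new(path).realpath.to_path + '.tar'
  File.open(tar_filename, 'wb') do |tarfile|
    Gem::Package::TarWriter.new(tarfile) do |tar|
      Dir[File.join(path, '**/*')].each do |file|
        mode = File.stat(file).mode
        relative_file = file.sub(/^#{Regexp.escape(path)}\/?/, '')
        if File.directory?(file)
          tar.mkdir(relative_file, mode)
        else
          tar.add_file(relative_file, mode) do |tf|
            File.open(file, 'rb') do |f|
              while (buffer = f.read(BLOCKSIZE_TO_READ))
                tf.write(buffer)
              end
            end
          end
        end
      end
    end
  end
  tar_filename
end

entries = {}

Dir.mktmpdir do |tmp|
  # Hypothetical sample tree: demo/a.txt and demo/sub/b.bin
  source_dir = File.join(tmp, 'demo')
  Dir.mkdir(source_dir)
  Dir.mkdir(File.join(source_dir, 'sub'))
  File.binwrite(File.join(source_dir, 'a.txt'), 'hello')
  File.binwrite(File.join(source_dir, 'sub', 'b.bin'), "\x00\x01" * 1000)

  tar_path = create_tarball(source_dir)

  # Read the archive back and record each entry's name and size
  File.open(tar_path, 'rb') do |io|
    Gem::Package::TarReader.new(io) do |reader|
      reader.each do |entry|
        entries[entry.full_name] = entry.file? ? entry.read.bytesize : :directory
      end
    end
  end
end

p entries
```

Because the caller gets a path back, it can reopen the tarball in whatever mode it needs, as the read-back step does here with 'rb'.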

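I also wrapped some of the method parameters in parentheses. While parentheses aren't required around method parameters in Ruby, and some people eschew them, I think they make the code more maintainable by delimiting where the parameters start and end. They also avoid confusing Ruby when you're passing both parameters and a block to a method, which is a well-known cause of bugs.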
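Here is a small sketch of the block-binding confusion I mean. The methods inner and outer are made up for the demonstration; each one just reports whether it received the block.

```ruby
# Hypothetical methods that report whether they were given a block
def inner
  block_given? ? 'inner got the block' : 'inner got no block'
end

def outer(arg)
  [arg, block_given? ? 'outer got the block' : 'outer got no block']
end

# Without parentheses, do...end attaches to the leftmost method call,
# so the block goes to outer even though it sits right after inner.
no_parens = outer inner do
  :unused
end

# With parentheses, it is unambiguous which call owns the block.
with_parens = outer(inner { :unused })

p no_parens
p with_parens
```

The same thing happens with calls like tar.add_file relative_file, mode do |tf| ... end, which is why I prefer to spell out the parentheses there.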

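minitar looks like it writes to a stream, so I don't think memory will be a problem. Here are the comment and definition of the pack method (as of May 21, 2013):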

 # A convenience method to pack files specified by +src+ into +dest+. If
 # +src+ is an Array, then each file detailed therein will be packed into
 # the resulting Archive::Tar::Minitar::Output stream; if +recurse_dirs+
 # is true, then directories will be recursed.
 #
 # If +src+ is an Array, it will be treated as the argument to Find.find;
 # all files matching will be packed.
 def pack(src, dest, recurse_dirs = true, &block)
   Output.open(dest) do |outp|
     if src.kind_of?(Array)
       src.each do |entry|
         pack_file(entry, outp, &block)
         if dir?(entry) and recurse_dirs
           Dir["#{entry}/**/**"].each do |ee|
             pack_file(ee, outp, &block)
           end
         end
       end
     else
       Find.find(src) do |entry|
         pack_file(entry, outp, &block)
       end
     end
   end
 end

Example from the README of writing a tar:

 # Packs everything that matches Find.find('tests')
 File.open('test.tar', 'wb') { |tar| Minitar.pack('tests', tar) }

Example from the README of writing a gzipped tar:

 tgz = Zlib::GzipWriter.new(File.open('test.tgz', 'wb'))
 # Warning: tgz will be closed!
 Minitar.pack('tests', tgz)
