HttpZip
HttpZip is a Ruby gem to extract individual files from a remote ZIP archive, without the need to download the entire file.
If your ZIP file is hosted on a server that supports HTTP Range requests, you can extract individual files without downloading the entire archive. HttpZip uses Range requests to read only the archive's Central Directory first and builds a list of entries from it. You can then download and extract each entry on its own.
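To illustrate the idea, here is a minimal sketch of how a client could locate the Central Directory from only the tail of an archive. It is not HttpZip's actual implementation. The helper names `parse_end_of_central_directory` and `range_header` are hypothetical, and the sketch ignores ZIP64 archives. It relies on the ZIP format's End of Central Directory record, which sits at the end of the file and stores the Central Directory's size and offset.

```ruby
# Hypothetical helpers for illustration only; not part of HttpZip's API.

# Signature that starts the End of Central Directory (EOCD) record.
EOCD_SIGNATURE = "PK\x05\x06".b
# Size of the EOCD record without its optional trailing comment.
EOCD_MIN_SIZE = 22

# Given the last bytes of an archive (for example, fetched with the
# suffix range "bytes=-65557"), return where the Central Directory lives.
# Returns nil if no complete EOCD record is found.
def parse_end_of_central_directory(tail)
  tail = tail.b
  index = tail.rindex(EOCD_SIGNATURE)
  return nil if index.nil? || tail.bytesize - index < EOCD_MIN_SIZE

  # Fields after the 4-byte signature, all little-endian:
  # disk number, disk with CD, entries on this disk, total entries (16 bit each),
  # CD size, CD offset (32 bit each), comment length (16 bit).
  _disk, _cd_disk, _entries_on_disk, total_entries, cd_size, cd_offset, _comment_length =
    tail.byteslice(index + 4, 18).unpack("vvvvVVv")

  { entries: total_entries, size: cd_size, offset: cd_offset }
end

# Build the value of a Range header covering `size` bytes from `offset`.
# Byte ranges are inclusive on both ends.
def range_header(offset, size)
  "bytes=#{offset}-#{offset + size - 1}"
end
```

With the size and offset in hand, a second Range request built with `range_header` would fetch just the Central Directory, and each entry listed there can later be fetched the same way.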
Installation
Add this line to your application's Gemfile:
gem 'http_zip'
And then execute:
$ bundle install
Or install it yourself as:
$ gem install http_zip
Usage
# Create a new HttpZip::File referencing your remote archive.
# This only makes a HEAD request to check the server for
# Range request support.
zip = HttpZip::File.new("https://www.example.org/archive.zip")
# Get a reference to a specific file.
# This only requests the archive's Central Directory.
entry = zip.entries.find { |e| e.name == 'compressed.txt' }
# Read the extracted file contents into memory.
# This downloads the entry's compressed contents and uncompresses
# them locally.
content = entry.read
# You can also write the extracted entry directly to a local file.
entry.write_to_file('/path/extracted.txt')
If the server hosting the ZIP file doesn't support Range requests, HttpZip raises HttpZip::ContentRangeError.
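If you want to check a server yourself before handing a URL to HttpZip, a rough sketch could look like the following. It assumes the server advertises Range support through the `Accept-Ranges` response header. How HttpZip performs its own check is not described here, and both helper names are hypothetical.

```ruby
require "net/http"
require "uri"

# Hypothetical helpers for illustration only; not part of HttpZip's API.

# True if an Accept-Ranges header value lists the "bytes" range unit.
def accepts_byte_ranges?(accept_ranges_header)
  return false if accept_ranges_header.nil?

  accept_ranges_header.split(",").map { |unit| unit.strip.downcase }.include?("bytes")
end

# Send a HEAD request to the given URL and inspect the Accept-Ranges header.
def server_supports_ranges?(url)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.head(uri.request_uri)
  end
  accepts_byte_ranges?(response["Accept-Ranges"])
end
```

Note that some servers honor Range requests without sending `Accept-Ranges`, so a negative result from this check is a hint, not proof.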
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/peret/http_zip.
License
The gem is available as open source under the terms of the MIT License.