Ruby bindings for lzws library
See lzws library.
Other bindings: brotli, zstd, bzip2.
Installation
Operating systems: GNU/Linux, FreeBSD, OSX.
Dependencies: lzws version 1.4.0 or newer.
gem install ruby-lzws
You can build it from source.
rake gem
gem install pkg/ruby-lzws-*.gem
You can also use the overlay for Gentoo.
Usage
There are simple APIs: String and File. You can also use the generic streaming API: Stream::Writer and Stream::Reader.
require "lzws"
data = LZWS::String.compress "TOBEORNOTTOBEORTOBEORNOT"
puts LZWS::String.decompress(data)
LZWS::File.compress "file.txt", "file.txt.Z"
LZWS::File.decompress "file.txt.Z", "file.txt"
LZWS::Stream::Writer.open("file.txt.Z") { |writer| writer << "TOBEORNOTTOBEORTOBEORNOT" }
puts LZWS::Stream::Reader.open("file.txt.Z") { |reader| reader.read }
writer = LZWS::Stream::Writer.new output_socket
begin
  bytes_written = writer.write_nonblock "TOBEORNOTTOBEORTOBEORNOT"
  # handle "bytes_written"
rescue IO::WaitWritable
  # handle wait
ensure
  writer.close
end
reader = LZWS::Stream::Reader.new input_socket
begin
  puts reader.read_nonblock(512)
rescue IO::WaitReadable
  # handle wait
rescue ::EOFError
  # handle eof
ensure
  reader.close
end
You can create and read tar.Z archives with minitar.
LZWS is compatible with UNIX compress (with default options).
require "lzws"
require "minitar"
LZWS::Stream::Writer.open "file.tar.Z" do |writer|
  Minitar::Writer.open writer do |tar|
    tar.add_file_simple "file", :data => "TOBEORNOTTOBEORTOBEORNOT"
  end
end
LZWS::Stream::Reader.open "file.tar.Z" do |reader|
  Minitar::Reader.open reader do |tar|
    tar.each_entry do |entry|
      puts entry.name
      puts entry.read
    end
  end
end
You can also use Content-Encoding: compress with sinatra:
require "lzws"
require "sinatra"
get "/" do
  headers["Content-Encoding"] = "compress"
  LZWS::String.compress "TOBEORNOTTOBEORTOBEORNOT"
end
All functionality (including streaming) can be used from multiple threads with parallel. The following code puts a heavy load on your CPU.
require "lzws"
require "parallel"
Parallel.each large_datas do |large_data|
  LZWS::String.compress large_data
end
Docs
Please review rdoc generated docs.
Options
Option | Values | Default | Description |
---|---|---|---|
source_buffer_length | 0, 2 - inf | 0 (auto) | internal buffer length for source data |
destination_buffer_length | 0, 2 - inf | 0 (auto) | internal buffer length for destination data |
gvl | true/false | false | enables global VM lock where possible |
max_code_bit_length | 9 - 16 | 16 | max code bit length |
block_mode | true/false | true | enables block mode |
without_magic_header | true/false | false | disables magic header |
msb | true/false | false | enables most significant bit mode |
unaligned_bit_groups | true/false | false | enables unaligned bit groups |
quiet | true/false | false | disables lzws library logging |
There are internal buffers for compressed and decompressed data. Compressed data is usually smaller than the decompressed data, so the buffer on the compressed side can be smaller.
For example, if you want to use 1 KB as source_buffer_length for the compressor, please use 256 B as destination_buffer_length.
If you want to use 256 B as source_buffer_length for the decompressor, please use 1 KB as destination_buffer_length.
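The buffer pairing described above can be sketched as a small helper. This is only an illustration: `buffer_options_for` is a hypothetical name, and the 4:1 ratio is an assumption taken from the example numbers in this section, not a rule enforced by the library.

```ruby
# Hypothetical helper, not part of ruby-lzws.
# Assumed ratio between decompressed and compressed buffer sizes (1 KB : 256 B).
ASSUMED_RATIO = 4
# The options table lists 2 as the smallest explicit buffer length.
MIN_BUFFER_LENGTH = 2

def buffer_options_for(role, source_buffer_length)
  destination_buffer_length =
    case role
    when :compressor   then source_buffer_length / ASSUMED_RATIO
    when :decompressor then source_buffer_length * ASSUMED_RATIO
    else raise ArgumentError, "unknown role: #{role.inspect}"
    end

  {
    :source_buffer_length      => source_buffer_length,
    :destination_buffer_length => [destination_buffer_length, MIN_BUFFER_LENGTH].max
  }
end

compressor_options   = buffer_options_for :compressor, 1024
decompressor_options = buffer_options_for :decompressor, 256
```

The resulting hashes could be passed as the options argument of an API that accepts both buffer options, for example File.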
gvl is disabled by default; this mode allows running multiple compressors/decompressors in different threads simultaneously.
Please consider enabling gvl if you don't want to launch processors in separate threads.
If gvl is enabled, Ruby won't waste time on acquiring/releasing the VM lock.
You can also read lzws docs for more info about options.
Option | Related constants |
---|---|
max_code_bit_length | LZWS::Option::LOWEST_MAX_CODE_BIT_LENGTH = 9, LZWS::Option::BIGGEST_MAX_CODE_BIT_LENGTH = 16 |
Possible compressor options:
:source_buffer_length
:destination_buffer_length
:gvl
:max_code_bit_length
:block_mode
:without_magic_header
:msb
:unaligned_bit_groups
:quiet
Possible decompressor options:
:source_buffer_length
:destination_buffer_length
:gvl
:without_magic_header
:msb
:unaligned_bit_groups
:quiet
Example:
require "lzws"
data = LZWS::String.compress "TOBEORNOTTOBEORTOBEORNOT", :msb => true
puts LZWS::String.decompress(data, :msb => true)
Please read more about compatibility in lzws docs.
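For orientation, the max_code_bit_length, block_mode and without_magic_header options map onto the first three bytes of a UNIX compress (.Z) stream. The sketch below assumes the conventional .Z header layout (magic bytes 0x1F 0x9D, then a flags byte with the max code bit length in the low 5 bits and block mode in the top bit). `parse_compress_header` is a hypothetical helper, not part of the bindings, and it uses only the standard library.

```ruby
# Magic bytes that conventionally start a UNIX compress (.Z) stream.
COMPRESS_MAGIC = [0x1f, 0x9d].freeze

# Hypothetical helper: returns header fields or nil when the magic header is absent.
def parse_compress_header(bytes)
  return nil if bytes.bytesize < 3

  first, second, flags = bytes.byteslice(0, 3).unpack("C3")
  return nil unless [first, second] == COMPRESS_MAGIC

  {
    :max_code_bit_length => flags & 0x1f,       # low 5 bits
    :block_mode          => (flags & 0x80) != 0 # top bit
  }
end

# 0x90 encodes the default options: 16 bits with block mode enabled.
header = parse_compress_header "\x1F\x9D\x90rest of the stream".b
```

Data written with :without_magic_header => true would not start with these bytes, so a check like this returns nil for it.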
String
String maintains a destination buffer only, so it accepts the destination_buffer_length option only.
::compress(source, options = {})
::decompress(source, options = {})
source is a source string.
File
File maintains both source and destination buffers, so it accepts both the source_buffer_length and destination_buffer_length options.
::compress(source, destination, options = {})
::decompress(source, destination, options = {})
source and destination are file paths.
Stream::Writer
Its behaviour is similar to the builtin Zlib::GzipWriter.
Writer maintains a destination buffer only, so it accepts the destination_buffer_length option only.
::open(file_path, options = {}, :external_encoding => nil, :transcode_options => {}, &block)
Opens the file path and creates a stream writer associated with the opened file.
Data will be transcoded to :external_encoding using :transcode_options before compressing.
::new(destination_io, options = {}, :external_encoding => nil, :transcode_options => {})
Creates a stream writer associated with the destination io.
Data will be transcoded to :external_encoding using :transcode_options before compressing.
#set_encoding(external_encoding, nil, transcode_options)
Sets other encodings; the nil argument is there only for compatibility with IO.
#io
#to_io
#stat
#external_encoding
#transcode_options
#pos
#tell
See IO docs.
#write(*objects)
#flush
#rewind
#close
#closed?
See Zlib::GzipWriter docs.
#write_nonblock(object, *options)
#flush_nonblock(*options)
#rewind_nonblock(*options)
#close_nonblock(*options)
Special asynchronous methods that are missing in Zlib::GzipWriter.
rewind needs to close, close needs to write something and flush, and flush needs to write something.
Each of these steps may have to wait for the destination, so these synchronous methods have asynchronous variants.
Behaviour is the same as the IO#write_nonblock method.
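A retry loop around write_nonblock could look like the sketch below. It assumes, as the comparison with IO#write_nonblock suggests, that the method returns the number of bytes accepted from the given string. `write_all_nonblock` is a hypothetical helper, and the demo drives a plain pipe instead of a Stream::Writer so that it runs with the standard library alone.

```ruby
# Hypothetical helper: keeps calling write_nonblock until all of "data" is accepted.
# "writer" is anything with IO#write_nonblock semantics,
# "wait_io" is the underlying IO to wait on when the write would block.
def write_all_nonblock(writer, wait_io, data)
  remaining = data.b

  until remaining.empty?
    begin
      bytes_written = writer.write_nonblock remaining
      remaining = remaining.byteslice bytes_written, remaining.bytesize - bytes_written
    rescue IO::WaitWritable
      # Destination is not ready yet, wait until it becomes writable.
      IO.select nil, [wait_io]
    end
  end

  data.bytesize
end

# Demo: a plain pipe stands in for a socket, a thread consumes the other end.
read_io, write_io = IO.pipe
consumer = Thread.new { read_io.read }

payload = "TOBEORNOTTOBEORTOBEORNOT" * 10_000
total   = write_all_nonblock write_io, write_io, payload
write_io.close

received = consumer.value
read_io.close
```

With a real stream writer the first argument would be the writer and the second the socket it wraps; closing could then use close_nonblock with the same rescue pattern.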
#<<(object)
#print(*objects)
#printf(*args)
#putc(object, :encoding => 'ASCII-8BIT')
#puts(*objects)
Typical helpers, see Zlib::GzipWriter docs.
Stream::Reader
Its behaviour is similar to the builtin Zlib::GzipReader.
Reader maintains both source and destination buffers, so it accepts both the source_buffer_length and destination_buffer_length options.
::open(file_path, options = {}, :external_encoding => nil, :internal_encoding => nil, :transcode_options => {}, &block)
Opens the file path and creates a stream reader associated with the opened file.
Data will be force encoded to :external_encoding and transcoded to :internal_encoding using :transcode_options after decompressing.
::new(source_io, options = {}, :external_encoding => nil, :internal_encoding => nil, :transcode_options => {})
Creates a stream reader associated with the source io.
Data will be force encoded to :external_encoding and transcoded to :internal_encoding using :transcode_options after decompressing.
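The difference between "force encoded" and "transcoded" can be illustrated with plain String methods. This sketch assumes the terms correspond to String#force_encoding and String#encode; `decode_like_reader` is a hypothetical helper and does not show how the reader is actually implemented.

```ruby
# Hypothetical helper that mimics the described behaviour with plain String methods.
def decode_like_reader(bytes, external_encoding, internal_encoding = nil, transcode_options = {})
  # "Force encoded": the same bytes are reinterpreted, nothing is converted.
  text = bytes.dup.force_encoding external_encoding
  return text if internal_encoding.nil?

  # "Transcoded": the bytes are converted to the internal encoding.
  text.encode internal_encoding, **transcode_options
end

# UTF-8 bytes for "café", tagged as binary like freshly decompressed data.
decompressed = "caf\xC3\xA9".b

utf8_text   = decode_like_reader decompressed, Encoding::UTF_8
latin1_text = decode_like_reader decompressed, Encoding::UTF_8, Encoding::ISO_8859_1
```

Here utf8_text keeps the original 5 bytes, while latin1_text holds 4 bytes because "é" is a single byte in ISO-8859-1.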
#set_encoding(external_encoding, internal_encoding, transcode_options)
Sets other encodings.
#io
#to_io
#stat
#external_encoding
#internal_encoding
#transcode_options
#pos
#tell
See IO docs.
#read(bytes_to_read = nil, out_buffer = nil)
#eof?
#rewind
#close
#closed?
See Zlib::GzipReader docs.
#readpartial(bytes_to_read = nil, out_buffer = nil)
#read_nonblock(bytes_to_read, out_buffer = nil, *options)
See IO docs.
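Since read_nonblock follows the IO contract, a loop that collects everything until EOF could be sketched as follows. `read_all_nonblock` is a hypothetical helper, and the demo reads from a plain pipe instead of a Stream::Reader so that it runs with the standard library alone.

```ruby
# Hypothetical helper: reads chunks until EOF, waiting whenever no data is available.
# "reader" is anything with IO#read_nonblock semantics,
# "wait_io" is the underlying IO to wait on when the read would block.
def read_all_nonblock(reader, wait_io, chunk_size = 512)
  result = String.new

  loop do
    begin
      result << reader.read_nonblock(chunk_size)
    rescue IO::WaitReadable
      # Nothing to read yet, wait until the source becomes readable.
      IO.select [wait_io]
    rescue EOFError
      break
    end
  end

  result
end

# Demo: a thread writes to a pipe in small delayed portions.
read_io, write_io = IO.pipe
producer = Thread.new do
  3.times do
    write_io.write "TOBEORNOT"
    sleep 0.01
  end
  write_io.close
end

received = read_all_nonblock read_io, read_io
producer.join
read_io.close
```

With a real stream reader the first argument would be the reader and the second the socket it wraps.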
#getbyte
#each_byte(&block)
#readbyte
#ungetbyte(byte)
#getc
#readchar
#each_char(&block)
#ungetc(char)
#lineno
#lineno=
#gets(separator = $OUTPUT_RECORD_SEPARATOR, limit = nil)
#readline
#readlines
#each(&block)
#each_line(&block)
#ungetline(line)
Typical helpers, see Zlib::GzipReader docs.
Thread safety
The :gvl option is disabled by default, so you can use the bindings effectively in multiple threads.
Please be careful: the bindings are not thread safe.
You should lock all data shared between threads.
For example, you should not use the same compressor/decompressor inside multiple threads. Please verify that each processor is used by only one thread at a time.
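One way to follow this advice is to give every thread its own writer and to guard only the shared result container with a lock. The sketch below is an illustration under assumptions: `compress_in_threads` is a hypothetical helper, and Zlib::GzipWriter stands in for Stream::Writer (which is described above as behaving similarly) so that the code runs with the standard library alone.

```ruby
require "stringio"
require "zlib"

# Hypothetical helper: compresses each chunk in its own thread.
def compress_in_threads(chunks)
  results      = {}
  results_lock = Mutex.new

  threads = chunks.each_with_index.map do |chunk, index|
    Thread.new do
      # Writer and its io are created inside the thread and never shared.
      io     = StringIO.new String.new
      writer = Zlib::GzipWriter.new io
      writer.write chunk
      writer.close

      # The results hash is shared between threads, so access is locked.
      results_lock.synchronize { results[index] = io.string }
    end
  end

  threads.each(&:join)
  results
end

results = compress_in_threads ["TOBEORNOT" * 100, "TOBEORTOBEOR" * 100]
```

The same structure should apply when the stand-in is replaced with a stream writer from these bindings: one processor per thread, one lock per shared object.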
CI
Please visit scripts/test-images. See universal test script scripts/ci_test.sh for CI.