Examples
========

Read lines from a gzip-compressed file
--------------------------------------

The following snippet is an example of using CodecZlib.jl, which exports
`GzipDecompressorStream{S}` as an alias of
`TranscodingStream{GzipDecompressor,S}`, where `S` is a subtype of `IO`:
```julia
using CodecZlib
stream = GzipDecompressorStream(open("data.txt.gz"))
for line in eachline(stream)
    # do something...
end
close(stream)
```

Note that the last `close` call closes the wrapped file as well.
Alternatively, `open(<stream type>, <filepath>) do ... end` syntax closes the
file at the end:
```julia
using CodecZlib
open(GzipDecompressorStream, "data.txt.gz") do stream
    for line in eachline(stream)
        # do something...
    end
end
```

Read compressed data from a pipe
--------------------------------

The input is not limited to usual files. You can read data from a pipe
(actually, any `IO` object that implements standard I/O methods) as follows:
```julia
using CodecZlib
proc = open(`cat some.data.gz`)
stream = GzipDecompressorStream(proc)
for line in eachline(stream)
    # do something...
end
close(stream)  # This will finish the process as well.
```

Save a data matrix with Zstd compression
----------------------------------------

Writing compressed data is easy. One thing you need to keep in mind is to call
`close` after writing data; otherwise, the output file will be incomplete:
```julia
using CodecZstd
using DelimitedFiles
mat = randn(100, 100)
stream = ZstdCompressorStream(open("data.mat.zst", "w"))
writedlm(stream, mat)
close(stream)
```

Of course, `open(<stream type>, ...) do ... end` just works:
```julia
using CodecZstd
using DelimitedFiles
mat = randn(100, 100)
open(ZstdCompressorStream, "data.mat.zst", "w") do stream
    writedlm(stream, mat)
end
```

Explicitly finish transcoding by writing `TOKEN_END`
----------------------------------------------------

When writing data, the end of a data stream is indicated by calling `close`,
which writes an epilogue if necessary and flushes all buffered data to the
underlying I/O stream. If you want to explicitly specify the end of a data
chunk for some reason, you can write `TranscodingStreams.TOKEN_END` to the
transcoding stream, which finishes the current transcoding process without
closing the underlying stream:
```julia
using CodecZstd
using TranscodingStreams
buf = IOBuffer()
stream = ZstdCompressorStream(buf)
write(stream, "foobarbaz"^100, TranscodingStreams.TOKEN_END)
flush(stream)
compressed = take!(buf)
close(stream)
```

Use a noop codec
----------------

The `Noop` codec does nothing (i.e., buffering data without transformation).
`NoopStream` is an alias of `TranscodingStream{Noop}`.  The following example
creates a decompressor stream based on the extension of a filepath:
```julia
using CodecZlib
using CodecXz
using TranscodingStreams

function makestream(filepath)
    if endswith(filepath, ".gz")
        codec = GzipDecompressor()
    elseif endswith(filepath, ".xz")
        codec = XzDecompressor()
    else
        codec = Noop()
    end
    return TranscodingStream(codec, open(filepath))
end

makestream("data.txt.gz")
makestream("data.txt.xz")
makestream("data.txt")
```

Change the codec of a file
--------------------------

`TranscodingStream`s are composable: a stream can be an input/output of another
stream. You can use this to change the format of a file by composing different
codecs as below:
```julia
using CodecZlib
using CodecZstd

input  = open("data.txt.gz",  "r")
output = open("data.txt.zst", "w")

stream = GzipDecompressorStream(ZstdCompressorStream(output))
write(stream, input)
close(stream)
```

Effectively, this is equivalent to the following pipeline:

    cat data.txt.gz | gzip -d | zstd >data.txt.zst

Stop decoding on the end of a block
-----------------------------------

Many codecs support decoding concatenated data blocks (or chunks). For example,
if you concatenate two gzip files into a single file and read it using
`GzipDecompressorStream`, you will see the byte stream of concatenation of the
two files. If you need the part corresponding the first file, you can set
`stop_on_end` to `true` to stop transcoding at the end of the first block.
Note that setting `stop_on_end` to `true` does not close the wrapped stream
because you will often want to reuse it.
```julia
using CodecZlib
# cat foo.txt.gz bar.txt.gz > foobar.txt.gz
stream = GzipDecompressorStream(open("foobar.txt.gz"), stop_on_end=true)
read(stream)  #> the content of foo.txt
eof(stream)   #> true
```

In the case where you need to reuse the wrapped stream, the code above must be
slightly modified because the transcoding stream may read more bytes than
necessary from the wrapped stream. Wrapping the stream with `NoopStream` solves
the problem because any extra data read after the end of the chunk will be 
stored back in the internal buffer of the wrapped transcoding stream.
```julia
using CodecZlib
using TranscodingStreams
stream = NoopStream(open("foobar.txt.gz"))
read(GzipDecompressorStream(stream, stop_on_end=true))  #> the content of foo.txt
read(GzipDecompressorStream(stream, stop_on_end=true))  #> the content of bar.txt
```

Check I/O statistics
--------------------

`TranscodingStreams.stats` returns a snapshot of the I/O statistics. For
example, the following function shows progress of decompression to the standard
error:
```julia
using CodecZlib

function decompress(input, output)
    buffer = Vector{UInt8}(undef, 16 * 1024)
    while !eof(input)
        n = min(bytesavailable(input), length(buffer))
        unsafe_read(input, pointer(buffer), n)
        unsafe_write(output, pointer(buffer), n)
        stats = TranscodingStreams.stats(input)
        print(STDERR, "\rin: $(stats.in), out: $(stats.out)")
    end
    println(STDERR)
end

input = GzipDecompressorStream(open("foobar.txt.gz"))
output = IOBuffer()
decompress(input, output)
```

`stats.in` is the number of bytes supplied to the stream and `stats.out` is the
number of bytes consumed out of the stream.

Transcode data in one shot
--------------------------

TranscodingStreams.jl extends the `transcode` function to transcode a data
in one shot. `transcode` takes a codec object as its first argument and a data
vector as its second argument:
```julia
using CodecZlib
decompressed = transcode(ZlibDecompressor, b"x\x9cKL*JLNLI\x04R\x00\x19\xf2\x04U")
String(decompressed)
```

Transcode lots of strings
-------------------------

`transcode(<codec type>, data)` method is convenient but suboptimal when
transcoding a number of objects. This is because the method reallocates a new
codec object for every call. Instead, you can use `transcode(<codec object>,
data)` method that reuses the allocated object as follows. In this usage, you
need to explicitly allocate and free resources by calling
`TranscodingStreams.initialize` and `TranscodingStreams.finalize`,
respectively.

```julia
using CodecZstd
using TranscodingStreams
strings = ["foo", "bar", "baz"]
codec = ZstdCompressor()
TranscodingStreams.initialize(codec)  # allocate resources
try
    for s in strings
        data = transcode(codec, s)
        # do something...
    end
catch
    rethrow()
finally
    TranscodingStreams.finalize(codec)  # free resources
end
```

Unread data
-----------

`TranscodingStream` supports *unread* operation, which inserts data into the
current reading position. This is useful when you want to peek from the stream.
`TranscodingStreams.unread` and `TranscodingStreams.unsafe_unread` functions are
provided:
```julia
using TranscodingStreams
stream = NoopStream(open("data.txt"))
data1 = read(stream, 8)
TranscodingStreams.unread(stream, data1)
data2 = read(stream, 8)
@assert data1 == data2
```

The unread operation is different from the write operation in that the unreaded
data are not written to the wrapped stream. The unreaded data are stored in the
internal buffer of a transcoding stream.

Unfortunately, *unwrite* operation is not provided because there is no way to
cancel write operations that are already committed to the wrapped stream.