1 Star 1 Fork 0

ClickHouse-Build/snappy

加入 Gitee
与超过 1200万 开发者一起发现、参与优秀开源项目,私有仓库也完全免费 :)
免费加入
克隆/下载
贡献代码
同步代码
取消
提示: 由于 Git 不支持空文件夾,创建文件夹后会生成空的 .keep 文件
Loading...
README
BSD-3-Clause

Snappy, a fast compressor/decompressor.

Introduction

Snappy is a compression/decompression library. It does not aim for maximum compression, or compatibility with any other compression library; instead, it aims for very high speeds and reasonable compression. For instance, compared to the fastest mode of zlib, Snappy is an order of magnitude faster for most inputs, but the resulting compressed files are anywhere from 20% to 100% bigger. (For more information, see "Performance", below.)

Snappy has the following properties:

  • Fast: Compression speeds at 250 MB/sec and beyond, with no assembler code. See "Performance" below.
  • Stable: Over the last few years, Snappy has compressed and decompressed petabytes of data in Google's production environment. The Snappy bitstream format is stable and will not change between versions.
  • Robust: The Snappy decompressor is designed not to crash in the face of corrupted or malicious input.
  • Free and open source software: Snappy is licensed under a BSD-type license. For more information, see the included COPYING file.

Snappy has previously been called "Zippy" in some Google presentations and the like.

Performance

Snappy is intended to be fast. On a single core of a Core i7 processor in 64-bit mode, it compresses at about 250 MB/sec or more and decompresses at about 500 MB/sec or more. (These numbers are for the slowest inputs in our benchmark suite; others are much faster.) In our tests, Snappy usually is faster than algorithms in the same class (e.g. LZO, LZF, QuickLZ, etc.) while achieving comparable compression ratios.

Typical compression ratios (based on the benchmark suite) are about 1.5-1.7x for plain text, about 2-4x for HTML, and of course 1.0x for JPEGs, PNGs and other already-compressed data. Similar numbers for zlib in its fastest mode are 2.6-2.8x, 3-7x and 1.0x, respectively. More sophisticated algorithms are capable of achieving yet higher compression rates, although usually at the expense of speed. Of course, compression ratio will vary significantly with the input.

Although Snappy should be fairly portable, it is primarily optimized for 64-bit x86-compatible processors, and may run slower in other environments. In particular:

  • Snappy uses 64-bit operations in several places to process more data at once than would otherwise be possible.
  • Snappy assumes unaligned 32 and 64-bit loads and stores are cheap. On some platforms, these must be emulated with single-byte loads and stores, which is much slower.
  • Snappy assumes little-endian throughout, and needs to byte-swap data in several places if running on a big-endian platform.

Experience has shown that even heavily tuned code can be improved. Performance optimizations, whether for 64-bit x86 or other platforms, are of course most welcome; see "Contact", below.

Building

You need the CMake version specified in CMakeLists.txt or later to build:

mkdir build
cd build && cmake ../ && make

Usage

Note that Snappy, both the implementation and the main interface, is written in C++. However, several third-party bindings to other languages are available; see the home page for more information. Also, if you want to use Snappy from C code, you can use the included C bindings in snappy-c.h.

To use Snappy from your own C++ program, include the file "snappy.h" from your calling file, and link against the compiled library.

There are many ways to call Snappy, but the simplest possible is

snappy::Compress(input.data(), input.size(), &output);

and similarly

snappy::Uncompress(input.data(), input.size(), &output);

where "input" and "output" are both instances of std::string.

There are other interfaces that are more flexible in various ways, including support for custom (non-array) input sources. See the header file for more information.

Tests and benchmarks

When you compile Snappy, snappy_unittest is compiled in addition to the library itself. You do not need it to use the compressor from your own library, but it contains several useful components for Snappy development.

First of all, it contains unit tests, verifying correctness on your machine in various scenarios. If you want to change or optimize Snappy, please run the tests to verify you have not broken anything. Note that if you have the Google Test library installed, unit test behavior (especially failures) will be significantly more user-friendly. You can find Google Test at

https://github.com/google/googletest

You probably also want the gflags library for handling of command-line flags; you can find it at

https://gflags.github.io/gflags/

In addition to the unit tests, snappy contains microbenchmarks used to tune compression and decompression performance. These are automatically run before the unit tests, but you can disable them using the flag --run_microbenchmarks=false if you have gflags installed (otherwise you will need to edit the source).

Finally, snappy can benchmark Snappy against a few other compression libraries (zlib, LZO, LZF, and QuickLZ), if they were detected at configure time. To benchmark using a given file, give the compression algorithm you want to test Snappy against (e.g. --zlib) and then a list of one or more file names on the command line. The testdata/ directory contains the files used by the microbenchmark, which should provide a reasonably balanced starting point for benchmarking. (Note that baddata[1-3].snappy are not intended as benchmarks; they are used to verify correctness in the presence of corrupted data in the unit test.)

Contact

Snappy is distributed through GitHub. For the latest version, a bug tracker, and other information, see https://github.com/google/snappy.

Copyright 2011, Google Inc. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name of Google Inc. nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. === Some of the benchmark data in testdata/ is licensed differently: - fireworks.jpeg is Copyright 2013 Steinar H. Gunderson, and is licensed under the Creative Commons Attribution 3.0 license (CC-BY-3.0). See https://creativecommons.org/licenses/by/3.0/ for more information. - kppkn.gtb is taken from the Gaviota chess tablebase set, and is licensed under the MIT License. See https://sites.google.com/site/gaviotachessengine/Home/endgame-tablebases-1 for more information. - paper-100k.pdf is an excerpt (bytes 92160 to 194560) from the paper “Combinatorial Modeling of Chromatin Features Quantitatively Predicts DNA Replication Timing in _Drosophila_” by Federico Comoglio and Renato Paro, which is licensed under the CC-BY license. See http://www.ploscompbiol.org/static/license for more ifnormation. - alice29.txt, asyoulik.txt, plrabn12.txt and lcet10.txt are from Project Gutenberg. The first three have expired copyrights and are in the public domain; the latter does not have expired copyright, but is still in the public domain according to the license information (http://www.gutenberg.org/ebooks/53).

简介

A fast compressor/decompressor 展开 收起
README
BSD-3-Clause
取消

发行版

暂无发行版

贡献者

全部

近期动态

不能加载更多了
马建仓 AI 助手
尝试更多
代码解读
代码找茬
代码优化
1
https://gitee.com/ClickHouse-Build/snappy.git
git@gitee.com:ClickHouse-Build/snappy.git
ClickHouse-Build
snappy
snappy
master

搜索帮助