2 Star 6 Fork 0

第四范式 / pafka

加入 Gitee
与超过 1200万 开发者一起发现、参与优秀开源项目,私有仓库也完全免费 :)
免费加入
克隆/下载
贡献代码
同步代码
取消
提示: 由于 Git 不支持空文件夾,创建文件夹后会生成空的 .keep 文件
Loading...
README
Apache-2.0
logo

Slack Release Docker Pulls Stars Fork License

Pafka: Persistent Memory (PMem) Accelerated Kafka

1. Introduction

Pafka is an evolved version of Apache Kafka developed by MemArk. Kafka is an open-source distributed event streaming/message queue system for handling real-time data feeds efficiently and reliably. However, its performance (e.g., throughput and latency) is constrained by slow external storage. Pafka enhances Kafka based on tiered storage architecture, which is usually equipped with high-performance SSD or Intel® Optane™ Persistent Memory (PMem). With the careful design of data migration algorithms, it improves overall persistence performance with low cost. For example, it can well support the scenario that high data production rate is repeated after an interval of time (e.g., special discount is released every one hour from a shopping website); it is also capable of improving the overall performance when high throughput is required over a long period.

Please refer to our latest blog for Pafka benchmark and use cases :point_right: English (中文)

2. Architecture

The basic idea behind Pafka is to utilize tiered storage architecture to enhance overall performance of a system. Nowadays, a data center may have various kinds of storage devices, such as HDD, SSD, and state-of-the-art non-volatile persistent memory. However, Kafka is not aware of such storage hierarchy. In this project, we enhance Kafka by using the high performance storage device, e.g., PMem, as the first layer of storage, together with carefully designed migration algorithms, to significantly improve overall performance for specific scenarios.

The key challenge of taking advantage of tiered storage is to design the data partitioning and migration mechanism between the fast and slow devices. The overall architecture and key workflow of Pafka are illustrated in the below figure. Basically, data is written onto PMem when there is available space, otherwise onto HDD. Besides, there is a background migration task to balance the data between PMem and HDD. Specifically, a new parameter storage.migrate.threshold is introduced to indicate when the migration is triggered. Specifically, when the used space of PMem exceeds the threshold, the oldest data on the PMem is migrated to HDD; otherwise the newest data on the HDD is migrated to PMem. The migration tries to keep the fresh data in the PMem to achieve the efficiency of consumers.

arch

3. Get Started

For the complete documentation of Kafka, refer to here.

3.1. Docker Image

The easiest way to try Pafka is to use the docker image: https://hub.docker.com/r/4pdopensource/pafka-dev

docker run -it 4pdopensource/pafka-dev bash

If you use the docker image, you can skip the following Compile step.

3.2. Compile

3.2.1. Dependencies

:warning: We have done some modifications on the original pmdk source codes. Please download the source code from the two repositories provided above.

We have already shipped pcj and llpl jars in libs folder in the Pafka repository. They are compiled with java 8 and g++ 4.8.5. In general, you are not required to compile the two libraries by yourself. However, if you encounter any compilation/running error caused by these two libraries, you can download the source codes and compile on your own environment as below.

After cloning the source code:

# compile pcj
cd pcj
make && make jar
cp target/pcj.jar $PAFKA_HOME/libs

# compile llpl
cd llpl
make && make jar
cp target/llpl.jar $PAFKA_HOME/libs

3.2.2. Build Pafka Jar

./gradlew jar

3.3. Run

3.3.1. Environmental Setup

To verify the correctness, you can use any file systems with normal hard disks. To take advantage of tiered storage architecture, it requires the availability of PMem hardware mounted as a DAX file system.

3.3.2. Config

In order to support tiered storage, we add some more config fields to the Kafka server config.

Config Default Value Note
log.channel.type file log file channel type.
Options: "file", "pmem", "tiered".
"file": use normal file as vanilla Kafka does
"pmem": use pmem as the log storage
"tiered": use tiered storage
storage.tiers.types PMEM,HDD the storage types for each layers (separated by ,).
Available types are PMEM, NVME, SSD, HDD.
storage.tiers.first.paths /pmem first-layer storage paths (separated by ,).
(Only applicable if log.channel.type=tiered or pmem)
storage.tiers.first.sizes -1 first-layer storage capacities in bytes (separated by ,); -1 means use all the space
(Only applicable if log.channel.type=tiered or pmem)
storage.tiers.second.paths /hdd second-layer storage paths (separated by ,)
(Only applicable if log.channel.type=tiered)
storage.migrate.threshold 0.5 the threshold used to control when to start the migration.
-1 means no migration.
(Only applicable if log.channel.type=tiered)
storage.migrate.threads 1 the number of threads used for migration
(Only applicable if log.channel.type=tiered)
log.pmem.pool.ratio 0.8 A pool of log segments will be pre-allocated. This is the proportion of total pmem size. Pre-allocation will increase the first startup time, but can eliminate the dynamic allocation cost when serving requests.
(Only applicable if we are using pmem as the first-layer storage)

:warning: log.preallocate has to be set to true if pmem is used, as PMem MemoryBlock does not support append-like operations.

Sample config in config/server.properties is as follows:

######## start of tiered storage config ########

# log file channel type; Options: "file", "pmem", "tiered".
# if "file": use normal file as vanilla Kafka does. Following configs are not applicable.
log.channel.type=tiered
# the storage types for each layers (separated by ,)
storage.tiers.types=NVME,HDD
# first-layer storage paths (separated by ,)
storage.tiers.first.paths=/nvme
# first-layer storage capacities in bytes (separated by ,); -1 means use all the space
storage.tiers.first.sizes=-1
# second-layer storage paths (separated by ,)
storage.tiers.second.paths=/hdd
# threshold to control when to start the migration; -1 means no migration.
storage.migrate.threshold=0.5
# migration threads
storage.migrate.threads=1

# pmem-specific config
# pre-allocated pool ratio
log.pmem.pool.ratio=0.8
# log.preallocate have to set to true if pmem is used
log.preallocate=true

######## end of tiered storage config ########

3.3.3. Start Pafka

Follow instructions in https://kafka.apache.org/quickstart. Basically:

bin/zookeeper-server-start.sh config/zookeeper.properties > zk.log 2>&1 &
bin/kafka-server-start.sh config/server.properties > pafka.log 2>&1 &

Benchmark Pafka

Producer
Single Client
# bin/kafka-producer-perf-test.sh --topic $TOPIC --throughput $MAX_THROUGHPUT --num-records $NUM_RECORDS --record-size $RECORD_SIZE --producer.config config/producer.properties --producer-props bootstrap.servers=$BROKER_IP:$PORT
bin/kafka-producer-perf-test.sh --topic test --throughput 1000000 --num-records 1000000 --record-size 1024 --producer.config config/producer.properties --producer-props bootstrap.servers=localhost:9092
Multiple Clients

We provide a script to let you run multiple clients on multiple hosts. For example, if you want to run 16 producers in each of the hosts, node-1 and node-2, you can run the following command:

bin/bench.py --threads 16 --hosts "node-1 node-2" --num_records 100000000 --type producer

In total, there are 32 clients, which will generate 100000000 records. Each client is responsible for populating one topic.

In order to make it work, you have to configure password-less login from the running machine to the client machines.

You can run python3 bin/bench.py --help to see other benchmark options.

Consumer
Single Client
# bin/kafka-consumer-perf-test.sh --topic $TOPIC --consumer.config config/consumer.properties --bootstrap-server $BROKER_IP:$PORT --messages $NUM_RECORDS --show-detailed-stats --reporting-interval $REPORT_INTERVAL --timeout $TIMEOUT_IN_MS
bin/kafka-consumer-perf-test.sh --topic test --consumer.config config/consumer.properties --bootstrap-server localhost:9092 --messages 1000000 --show-detailed-stats --reporting-interval 1000 --timeout 100000
Multiple Clients

Similarly, you can use the same script as producer benchmark to launch multiple clients.

bin/bench.py --threads 16 --hosts "node-1 node-2" --num_records 100000000 --type consumer

4. Limitations

  • pmdk llpl MemoryPool does not provide a ByteBuffer API. We did some hacking to provide a zero-copy ByteBuffer API. You may see some warnings from JRE with version >= 9. We've tested on Java 8, Java 11 and Java 15.

    WARNING: An illegal reflective access operation has occurred WARNING: Illegal reflective access by com.intel.pmem.llpl.MemoryPoolImpl to field java.nio.Buffer.address WARNING: Please consider reporting this to the maintainers of com.intel.pmem.llpl.MemoryPoolImpl WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations WARNING: All illegal access operations will be denied in a future release

  • Currently, only the log files are stored in the tiered storage, while the indexes are still kept as normal files, as we do not see much performance gain if we move the indexes to fast storage.

5. Roadmap

Version Status Features
v0.1.1 Released - Use PMem for data storage
- Significant performance boost compared with Kafka
v0.2.0 Released - A two-layered storage strategy to utilize the total capacity of all storage devices while maintaining the efficiency by our cold-hot data migration algorithms
- Further PMem performance improvement by using libpmem
v0.3.0 Released - Configurable storage devices for both 1st and 2nd layers, to support using large SSD/NVMe as the first layer

6. Community

Pafka is developed by MemArk (https://memark.io/en), which is a tech community focusing on leveraging modern storage architecture for system enhancement. MemArk is led by 4Paradigm (https://www.4paradigm.com/) and other sponsors (such as Intel). Please join our community for:

  • Chatting: For any feedback, suggestions, issues, and anything about using Pafka or other storage related topics, we strongly encourage you to join our interactive discussion channel at Slack #pafka-help
  • Development discussion: If you would like to formally report a bug or suggestion, please use the GitHub Issues; if you would like to propose a new feature for some discussion, or would like to start a pull request, please use the GitHub Discussions, and our developers will respond promptly.

You can also contact the MemArk community for any feedback: contact@memark.io

Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 1. Definitions. "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. "Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License. "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. "Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. "Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution." "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. 2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. 3. Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. 4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and (b) You must cause any modified files to carry prominent notices stating that You changed the files; and (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and (d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. 5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. 6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. 7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. 8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability. END OF TERMS AND CONDITIONS APPENDIX: How to apply the Apache License to your work. To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives. Copyright [yyyy] [name of copyright owner] Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

简介

Pafka is originated from the OpenAIOS project to leverage an optimized tiered storage access strategy to improve overall performance for streaming/messaging system. 展开 收起
Apache-2.0
取消

发行版

暂无发行版

贡献者

全部

近期动态

加载更多
不能加载更多了
1
https://gitee.com/paradigm4/pafka.git
git@gitee.com:paradigm4/pafka.git
paradigm4
pafka
pafka
main

搜索帮助