# DataFlow
**Repository Path**: xmagictech/DataFlow
## Basic Information
- **Project Name**: DataFlow
- **Description**: An efficient framework for synthesizing text LLM training data, built on LLM-based operators and workflows
- **Primary Language**: Python
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: https://opendcai.github.io/DataFlow-Doc/zh/
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 1
- **Created**: 2026-04-13
- **Last Updated**: 2026-04-13
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# DataFlow
**Generate, Clean, and Prepare LLM Data, All-in-One**

[GitHub](https://github.com/OpenDCAI/DataFlow) | [Issues](https://github.com/OpenDCAI/DataFlow/issues) | [Pull Requests](https://github.com/OpenDCAI/DataFlow/pulls) | [Contributors](https://github.com/OpenDCAI/DataFlow/graphs/contributors) | [PyPI](https://pypi.org/project/open-dataflow/) | [PyPI Stats](https://pypistats.org/packages/open-dataflow) | [Pepy](https://pepy.tech/project/open-dataflow) | [Colab](https://colab.research.google.com/drive/1haosl2QS4N4HM7u7HvSsz_MnLabxexXl?usp=sharing) | [Docker Hub](https://hub.docker.com/r/molyheci/dataflow) | [Documentation](https://OpenDCAI.github.io/DataFlow-Doc/) | [arXiv](https://arxiv.org/abs/2512.16676) | [DeepWiki](https://deepwiki.com/OpenDCAI/DataFlow) | [Discord](https://discord.gg/e4mKEaFptu)

Visual, low-code pipelines with flexible orchestration across domains and use cases.💪
Turn raw data into high-quality LLM training datasets.🔧
🎉 Get smarter LLMs cheaply. Give us a star ⭐ on GitHub to follow the latest updates.
**Beginner-friendly learning resources (continuously updated)**:
[[🎬 Video Tutorials]](https://space.bilibili.com/3546929239689711?spm_id_from=333.337.0.0)
[[📚 Written Tutorials]](https://wcny4qa9krto.feishu.cn/wiki/I9tbw2qnBi0lEakmmAGclTysnFd)
[简体中文](./README-zh.md) | English
## 📰 0. News
* **[2026-02-02] 🖥️ DataFlow WebUI is now available!**
Launch the visual pipeline builder with a single command: `dataflow webui`. Build and run DataFlow pipelines through an intuitive web interface. 👉 [WebUI Docs](#dfwebui)