# DataFlow
**Repository Path**: huait-ossc/DataFlow
## Basic Information
- **Project Name**: DataFlow
- **Description**: https://github.com/OpenDCAI/DataFlow
- **Primary Language**: Python
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 1
- **Forks**: 2
- **Created**: 2025-07-15
- **Last Updated**: 2026-03-29
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# DataFlow
**One-stop data generation, cleaning, and preparation for large language models**

[GitHub](https://github.com/OpenDCAI/DataFlow)
[Issues](https://github.com/OpenDCAI/DataFlow/issues)
[Closed issues](https://github.com/OpenDCAI/DataFlow/issues?q=is%3Aissue%20state%3Aclosed)
[Pull requests](https://github.com/OpenDCAI/DataFlow/pulls)
[Closed pull requests](https://github.com/OpenDCAI/DataFlow/pulls?q=is%3Apr+is%3Aclosed)
[Contributors](https://github.com/OpenDCAI/DataFlow/graphs/contributors)
[PyPI](https://pypi.org/project/open-dataflow/)
[PyPI Stats](https://pypistats.org/packages/open-dataflow)
[Pepy downloads](https://pepy.tech/project/open-dataflow)
[Colab](https://colab.research.google.com/drive/1haosl2QS4N4HM7u7HvSsz_MnLabxexXl?usp=sharing)
[Docker Hub](https://hub.docker.com/r/molyheci/dataflow)
[Documentation](https://OpenDCAI.github.io/DataFlow-Doc/)
[arXiv](https://arxiv.org/abs/2512.16676)
[DeepWiki](https://deepwiki.com/OpenDCAI/DataFlow)
[Discord](https://discord.gg/e4mKEaFptu)
[](https://github.com/user-attachments/assets/3c2e5d4d-d1ea-4d8c-9146-ff14e657e857)

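Visual, low-code pipelines with flexible orchestration across domains and use cases. 💪
Turn raw data into high-quality LLM training datasets. 🔧
🎉 Get smarter LLMs at lower cost. Star us on GitHub ⭐ to get the latest updates.
**Beginner-friendly learning resources (continuously updated)**:
[[🎬 Video tutorials]](https://space.bilibili.com/3546929239689711?spm_id_from=333.337.0.0)
[[📚 Written tutorials]](https://wcny4qa9krto.feishu.cn/wiki/I9tbw2qnBi0lEakmmAGclTysnFd)
Simplified Chinese | [English](./README.md)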
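## 📰 0. News
* **[2026-02-02] 🖥️ DataFlow WebUI officially released!**
Launch the visual pipeline builder with a single command, `dataflow webui`, then build and run DataFlow pipelines in an intuitive web interface. 👉 [WebUI documentation](#dfwebui)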