# Toucan-1.5M

**Repository Path**: hf-datasets/Toucan-1.5M

## Basic Information

- **Project Name**: Toucan-1.5M
- **Description**: Mirror of https://huggingface.co/datasets/Agent-Ark/Toucan-1.5M
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-10-12
- **Last Updated**: 2025-10-12

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

---
dataset_info:
- config_name: Kimi-K2
  features:
  - name: uuid
    dtype: string
  - name: subset_name
    dtype: string
  - name: messages
    dtype: string
  - name: question
    dtype: string
  - name: available_tools
    dtype: string
  - name: target_tools
    dtype: string
  - name: question_quality_assessment
    dtype: string
  - name: response_quality_assessment
    dtype: string
  - name: metadata
    dtype: string
  splits:
  - name: train
    num_bytes: 19540301213
    num_examples: 518516
  download_size: 6392602476
  dataset_size: 19540301213
- config_name: OSS
  features:
  - name: uuid
    dtype: string
  - name: subset_name
    dtype: string
  - name: messages
    dtype: string
  - name: question
    dtype: string
  - name: available_tools
    dtype: string
  - name: target_tools
    dtype: string
  - name: question_quality_assessment
    dtype: string
  - name: response_quality_assessment
    dtype: string
  - name: metadata
    dtype: string
  splits:
  - name: train
    num_bytes: 23321900170
    num_examples: 457130
  download_size: 8158074700
  dataset_size: 23321900170
- config_name: Qwen3
  features:
  - name: uuid
    dtype: string
  - name: subset_name
    dtype: string
  - name: messages
    dtype: string
  - name: question
    dtype: string
  - name: available_tools
    dtype: string
  - name: target_tools
    dtype: string
  - name: question_quality_assessment
    dtype: string
  - name: response_quality_assessment
    dtype: string
  - name: metadata
    dtype: string
  splits:
  - name: train
    num_bytes: 21763561944
    num_examples: 551613
  download_size: 6837495729
  dataset_size: 21763561944
- config_name: SFT
  features:
  - name: uuid
    dtype: string
  - name: subset_name
    dtype: string
  - name: question
    dtype: string
  - name: target_tools
    dtype: string
  - name: tools
    dtype: string
  - name: messages
    dtype: string
  splits:
  - name: train
    num_bytes: 1346302110
    num_examples: 119287
  download_size: 425496735
  dataset_size: 1346302110
configs:
- config_name: Kimi-K2
  data_files:
  - split: train
    path: Kimi-K2/train-*
- config_name: OSS
  data_files:
  - split: train
    path: OSS/train-*
- config_name: Qwen3
  data_files:
  - split: train
    path: Qwen3/train-*
- config_name: SFT
  data_files:
  - split: train
    path: SFT/train-*
license: apache-2.0
size_categories:
- 1M