# CropGBM-Tutorial-data

**Repository Path**: cau-xyt/CropGBM-Tutorial-data

## Basic Information

- **Project Name**: CropGBM-Tutorial-data
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2021-04-26
- **Last Updated**: 2023-07-01

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

## These files contain all the data required by the Tutorial.

* genofile.map and genofile.ped.zip are the genotype files before preprocessing (ped format).
* genofile_filter.bed, genofile_filter.bim, and genofile_filter.fam are the genotype files after preprocessing (bed format).
* genofile_filter.geno is the genotype file after preprocessing. The first line holds the SNP IDs, the first column holds the sample IDs, and each genotype is coded as 0, 1, or 2.
* ksampleid_file.txt lists the sample IDs to be extracted.
* ksnpid_file.txt lists the SNP IDs to be extracted.
* rsampleid_file.txt lists the sample IDs to be removed.
* rsnpid_file.txt lists the SNP IDs to be removed.
* ppexsampleid_file.txt lists the sample IDs to be extracted.
* phefile.txt is the phenotype file. It has 4 columns: the first is the sample ID, the second is the paternal ID, the third is the strain, and the fourth is the phenotype value.
* phefile.numphe contains the sample IDs and the transformed phenotype values.
* phefile.word2num records the correspondence between each original phenotype and the number assigned to it when the phenotype was recoded.
* train.geno.zip is the genotype data used for training. Its format is the same as genofile_filter.geno.
* train.phe is the phenotype data corresponding to train.geno.zip. It has 2 columns: the first is the sample ID and the second is the phenotype value.
* train.lgb_model is the model file produced by training; it records the tree structure.
* valid.geno is the genotype data used for validation. Its format is the same as genofile_filter.geno.
* valid.phe is the phenotype data corresponding to valid.geno. Its format is the same as train.phe.
* test.geno is the genotype data used for testing. Its format is the same as genofile_filter.geno.
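
The `.geno` and `.phe` layouts described above can be read with a few lines of Python. The following is only a minimal sketch and not part of CropGBM itself: the helper names `read_geno` and `read_phe` are hypothetical, and the README does not state the field delimiter, so the code assumes fields are separated by commas and/or whitespace. It also tolerates a header line with or without a label above the sample ID column. The demo uses small made-up in-memory tables rather than the repository files.

```python
import io
import re


def _split(line):
    # Assumption: fields are separated by commas and/or whitespace.
    return [f for f in re.split(r"[,\s]+", line.strip()) if f]


def read_geno(handle):
    """Parse a .geno-style table into (snp_ids, {sample_id: [0/1/2, ...]})."""
    lines = [ln for ln in handle if ln.strip()]
    if not lines:
        return [], {}
    header = _split(lines[0])
    rows = [_split(ln) for ln in lines[1:]]
    # The header may or may not carry a label for the sample ID column;
    # if it is as long as a data row, drop its first cell.
    if rows and len(header) == len(rows[0]):
        header = header[1:]
    genotypes = {}
    for row in rows:
        sample_id, calls = row[0], [int(c) for c in row[1:]]
        if len(calls) != len(header) or any(c not in (0, 1, 2) for c in calls):
            raise ValueError(f"unexpected genotype row for sample {sample_id!r}")
        genotypes[sample_id] = calls
    return header, genotypes


def read_phe(handle):
    """Parse a two-column phenotype table into {sample_id: value}."""
    phenotypes = {}
    for line in handle:
        fields = _split(line)
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        try:
            phenotypes[fields[0]] = float(fields[1])
        except ValueError:
            continue  # skip a non-numeric header line, if present
    return phenotypes


# Made-up demo data in the layout the README describes.
geno_text = "sampleid,snp1,snp2,snp3\nS1,0,1,2\nS2,2,2,0\nS3,1,0,1\n"
phe_text = "sampleid,phe\nS1,1.5\nS2,3.0\n"

snp_ids, geno = read_geno(io.StringIO(geno_text))
phe = read_phe(io.StringIO(phe_text))

# Keep only samples present in both tables, in a stable order.
shared = sorted(set(geno) & set(phe))
X = [geno[s] for s in shared]
y = [phe[s] for s in shared]
print(snp_ids, shared, X, y)
```

To use the sketch on the real data, unzip `train.geno.zip` first and pass open file handles (for example `open("train.geno")` and `open("train.phe")`) in place of the `io.StringIO` objects; adjust `_split` if the files turn out to use a different delimiter.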