# single_cell
**Repository Path**: mo_xinwu/single_cell
## Basic Information
- **Project Name**: single_cell
- **Description**: Single-cell related
- **Primary Language**: R
- **License**: GPL-3.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 1
- **Forks**: 0
- **Created**: 2023-03-20
- **Last Updated**: 2024-10-10
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
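## Purpose of this repository
Mainly used to store single-cell analysis code.
## Functions
1. Beautify t-SNE and UMAP plots -> `tsne_umap_beautif.R`
2. Get the raw per-barcode information and add the percentage of mitochondrial genes, ribosomal genes, and red blood cell genes: `Seurat_get_barcode_info.R`
3. Added the `plot_gene_dot` function, which shows the cells expressing a given gene on a t-SNE or UMAP plot -> `tsne_umap_beautif.R`
4. `get_gene_set.R`: get gene set data
Before running GSVA or GSEA, we need gene set data. The commonly used gene sets are the nine collections C1, C2, ..., C8, and H. The `msigdbr` R package can retrieve each collection for different species.
- Available species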
| species_name | species_common_name |
| ------------------------------- | ------------------------------------------------------------ |
| Anolis carolinensis | Carolina anole, green anole |
| Bos taurus | bovine, cattle, cow, dairy cow, domestic cattle, domestic cow, ox, oxen |
| Caenorhabditis elegans | NA |
| Canis lupus familiaris | dog, dogs |
| Danio rerio | leopard danio, zebra danio, zebra fish, zebrafish |
| Drosophila melanogaster | fruit fly |
| Equus caballus | domestic horse, equine, horse |
| Felis catus | cat, cats, domestic cat |
| Gallus gallus | bantam, chicken, chickens, Gallus domesticus |
| Homo sapiens | human |
| Macaca mulatta | rhesus macaque, rhesus macaques, Rhesus monkey, rhesus monkeys |
| Monodelphis domestica | gray short-tailed opossum |
| Mus musculus | house mouse, mouse |
| Ornithorhynchus anatinus | duck-billed platypus, duckbill platypus, platypus |
| Pan troglodytes | chimpanzee |
| Rattus norvegicus | brown rat, Norway rat, rat, rats |
| Saccharomyces cerevisiae | baker's yeast, brewer's yeast, S. cerevisiae |
| Schizosaccharomyces pombe 972h- | NA |
| Sus scrofa | pig, pigs, swine, wild boar |
| Xenopus tropicalis | tropical clawed frog, western clawed frog |
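
As a rough illustration of how gene sets can be pulled with `msigdbr`, the sketch below fetches the hallmark (H) collection for human and reshapes it into a named list of gene symbols. It is not the content of `get_gene_set.R`. The variable names are illustrative, and it assumes the `msigdbr` package is installed (recent releases may also ask for a companion data package). The collection argument was renamed between releases, so the sketch checks which name the installed version accepts.

```R
# Minimal sketch; runs only if the msigdbr package is installed.
if (requireNamespace("msigdbr", quietly = TRUE)) {
  # Species names accepted by msigdbr (the source of the table above).
  species_table <- msigdbr::msigdbr_species()

  # Older msigdbr releases call the collection argument `category`,
  # newer ones call it `collection`; use whichever this install has.
  arg_name <- if ("collection" %in% names(formals(msigdbr::msigdbr))) {
    "collection"
  } else {
    "category"
  }
  args <- list(species = "Homo sapiens")
  args[[arg_name]] <- "H"
  hallmark <- do.call(msigdbr::msigdbr, args)

  # One character vector of gene symbols per gene set name.
  gene_sets <- split(hallmark$gene_symbol, hallmark$gs_name)
}
```

A named list like `gene_sets` is a common input shape for enrichment tools. Swap `"Homo sapiens"` for any `species_name` in the table and `"H"` for another collection such as `"C2"`.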
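5. Doublet prediction: `run_scrublet.py`
Parameters:
- `--gene_cell_matrix_input_file`: input gene x cell matrix file; rows are genes and columns are cells.
- `--expected_doublet_rate`: expected doublet rate; default 0.06.
- `--predicted_output_file`: output file for the results. Barcodes with `predicted_doublets` equal to 1 are predicted doublets and should be removed.
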
|Barcode |doublet_scores |predicted_doublets|
|------------------|--------------------|------------------|
|AAACCCACAAGGCGTA-1|0.04035002430724357 |0 |
|AAACCCACATGGGATG-1|0.031759656652360524|0 |
|AAACCCAGTACTCCGG-1|0.09191277946453216 |0 |
|AAACCCACAGCAGATG-1|0.31719128329297813 |1 |
|AAAGAACAGCTGTACT-1|0.27234342012667123 |1 |
|AAAGTCCAGCTCCACG-1|0.46285714285714286 |1 |
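
A minimal sketch of how the predicted doublets might be dropped downstream, using the six rows shown above as inline data. In practice the table would be read from the file given to `--predicted_output_file`. The object names (`scrublet_result`, `singlets`, `seurat_obj`) are illustrative and not part of this repository.

```R
# Rows copied from the example output above.
scrublet_result <- data.frame(
  Barcode = c("AAACCCACAAGGCGTA-1", "AAACCCACATGGGATG-1", "AAACCCAGTACTCCGG-1",
              "AAACCCACAGCAGATG-1", "AAAGAACAGCTGTACT-1", "AAAGTCCAGCTCCACG-1"),
  doublet_scores = c(0.04035002430724357, 0.031759656652360524,
                     0.09191277946453216, 0.31719128329297813,
                     0.27234342012667123, 0.46285714285714286),
  predicted_doublets = c(0, 0, 0, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Keep only barcodes that were not flagged as doublets.
singlets <- scrublet_result$Barcode[scrublet_result$predicted_doublets == 0]

# With a Seurat object (hypothetical name `seurat_obj`), the same barcodes
# could then be used to subset the cells, for example:
# seurat_obj <- subset(seurat_obj, cells = singlets)
```

Here the first three barcodes are kept and the last three are removed.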
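6. Legend with circled numbers
`plot_number_circle` was added to `tsne_umap_beautif.R` to draw a legend made of circled numbers. Notes:
- To change the size of the circles, modify `draw_number_circle`.
- It only works when each cluster corresponds to exactly one CellType.
- CellType must be converted to a factor, and the order of its levels must match the order of the clusters.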
```R
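library(dplyr)      # provides %>% and filter()
library(patchwork)  # assumed source of `|` for placing plots side by side
source("tsne_umap_beautif.R")  # defines plot_number_circle(); path assumed

# Keep clusters 1-5 of the demo embedding
data <- data.table::fread("demo_tsne_umap.txt") %>% filter(Cluster %in% c(1, 2, 3, 4, 5))
# Assign exactly one CellType to each cluster
data$CellType <- NA
data$CellType[data$Cluster == "1"] <- "Th2"
data$CellType[data$Cluster == "2"] <- "ILC1"
data$CellType[data$Cluster == "3"] <- "DC2"
data$CellType[data$Cluster == "4"] <- "Tc1"
data$CellType[data$Cluster == "5"] <- "Treg"
# Order the CellType factor levels by cluster number
data$CellType <- factor(data$CellType, levels = unique(data$CellType)[order(unique(data$Cluster))])
p1 <- plot_number_circle(reduction_data = data, colors_by = "CellType",
                         labels_by = "Cluster", show_labels = TRUE,
                         point_size = 1, adjust_axis = TRUE,
                         label_size = 5)
# Drop columns 2 and 3, then draw the same plot again
data[, c(2, 3)] <- NULL
p2 <- plot_number_circle(reduction_data = data, colors_by = "CellType",
                         labels_by = "Cluster", show_labels = TRUE,
                         point_size = 1, adjust_axis = TRUE,
                         label_size = 5)
p <- p1 | p2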
```
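
The two constraints in the notes (one CellType per cluster, factor levels in cluster order) can be checked explicitly before plotting. The sketch below does this on a small made-up data frame, so it runs without the demo file. `toy` and `pairs` are illustrative names, and this is one possible way to build the levels, not the approach `plot_number_circle` requires.

```R
# Toy data with rows deliberately out of cluster order.
toy <- data.frame(
  Cluster = c(3, 1, 2, 1, 3),
  CellType = c("DC2", "Th2", "ILC1", "Th2", "DC2"),
  stringsAsFactors = FALSE
)

# Distinct Cluster/CellType pairs; the mapping must be one-to-one.
pairs <- unique(toy[, c("Cluster", "CellType")])
stopifnot(!anyDuplicated(pairs$Cluster), !anyDuplicated(pairs$CellType))

# Sort the pairs by cluster and use that order for the factor levels.
pairs <- pairs[order(pairs$Cluster), ]
toy$CellType <- factor(toy$CellType, levels = pairs$CellType)

levels(toy$CellType)
```

The levels come out as `"Th2"`, `"ILC1"`, `"DC2"`, which matches clusters 1, 2, 3. If a cluster maps to more than one CellType, the `stopifnot()` call fails before anything is plotted.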
