# CellTics

**Repository Path**: handsun1/CellTics

## Basic Information

- **Project Name**: CellTics
- **Description**: A tool for merging adjacent sites in VCF files. Source: https://github.com/MGHComputationalPathology/CellTics. The configuration file information was modified to make installation easier, and one bug was fixed.
- **Primary Language**: Python
- **License**: BSD-3-Clause
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2022-03-08
- **Last Updated**: 2022-03-09

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# CellTics

NGS tools from the Center for Integrated Diagnostics at Mass General Hospital

## Installation

```
conda create -p /home/work/software/celltics python==3.5
cd /home/work/software/celltics
git clone https://gitee.com/handsun1/CellTics.git
cd CellTics
virtualenv --python=/home/work/software/celltics/bin/python3 venv3
source venv3/bin/activate
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
python3 setup.py install
```

If you receive an error pertaining to `lzma.h`, you may need to disable lzma and run `python setup.py install` again. (This occurs on macOS Mojave.)

```
export HTSLIB_CONFIGURE_OPTIONS=--disable-lzma
python setup.py install
```

macOS High Sierra introduced a security check that can crash a process that forks, which affects multiprocessing. If you see an error such as:

```bash
objc[49174]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called.
objc[49174]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
```

set the following environment variable:

```bash
export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES
```

## VarGroup

Scans a VCF file and combines multiple nearby SNPs and indels into a single genomic event.

The `pickle` library DOES NOT work with the CLI library that allows calling `celltics` directly.
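
To make the idea of combining nearby variants concrete, here is a minimal sketch of proximity-based grouping. It is not the VarGroup implementation: the `Variant` type, the `group_nearby` function, and the `max_distance` default are all hypothetical names and values chosen for illustration. The sketch looks only at positions and ignores any read-level evidence from a BAM file.

```python
from typing import List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not a CellTics type)."""
    chrom: str
    pos: int  # 1-based position, as in the VCF POS column
    ref: str
    alt: str


def group_nearby(variants: List[Variant], max_distance: int = 2) -> List[List[Variant]]:
    """Group position-sorted variants whose neighbours are within max_distance bases.

    max_distance is an illustrative default, not a CellTics setting.
    """
    groups: List[List[Variant]] = []
    for var in variants:
        last = groups[-1][-1] if groups else None
        same_chrom = last is not None and var.chrom == last.chrom
        if same_chrom and var.pos - last.pos <= max_distance:
            # Close enough to the previous variant: extend the current group.
            groups[-1].append(var)
        else:
            # New chromosome or too far away: start a new group.
            groups.append([var])
    return groups


variants = [
    Variant("chr1", 100, "A", "T"),
    Variant("chr1", 101, "C", "G"),
    Variant("chr1", 250, "G", "A"),
    Variant("chr2", 101, "T", "C"),
]
groups = group_nearby(variants)
for group in groups:
    print([(v.chrom, v.pos) for v in group])
```

With this input, the two adjacent `chr1` variants end up in one group and the other two variants each form their own group. A real tool would also need to merge the alleles of each group into one record using the reference sequence, which this sketch leaves out.
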
When multithreading, you must call the script directly with Python:

```bash
python /path/to/celltics/tools/vargroup.py -i -o
```

VarGrouper runs multithreaded by default. Use `--debug` or set threads to 1 (`-t 1`) to avoid multiprocessing.

### Simplest vargroup command

_This will produce a warning if invoked with the cli library._

```bash
celltics vargroup --input-file sorted_variants.vcf --output-file grouped_variants.vcf --ref-seq hg19.fasta -t 1
```

or (no warning message)

```bash
python celltics/tools/vargroup.py --input-file sorted_variants.vcf --output-file grouped_variants.vcf --ref-seq hg19.fasta -t 1
```

### Run vargroup with bam

```bash
celltics vargroup --input-file sorted_variants.vcf --output-file grouped_variants.vcf --bam-file sorted_alignment.bam --ref-seq hg19.fasta -t 1
```

or (no warning message)

```bash
python celltics/tools/vargroup.py --input-file sorted_variants.vcf --output-file grouped_variants.vcf --bam-file sorted_alignment.bam --ref-seq hg19.fasta -t 1
```

If a reference sequence is not supplied, the UCSC hg19 API is queried ([http://genome.ucsc.edu/](http://genome.ucsc.edu/)).

Variants will be grouped if they are within a certain distance and occur on the same reads. For more advanced options, run `celltics vargroup --help`.

### Troubleshooting

Error messages are not very informative when they are masked by Python's multiprocessing module. Run vargroup with `--debug` to get more informative error messages.

### Algorithm

![VarGrouper](https://gitee.com/handsun1/CellTics/blob/master/celltics/docs/graphics/vargrouper_flow.png)

### Version 2.0

- added multiprocessing
- converted from Python 2.7 to Python 3

### Contact

* [Allison MacLeay](mailto:allison.macleay@gmail.com)
* [Ryan Schmidt](mailto:RSCHMIDT@BWH.HARVARD.EDU)