Nucleosomes are the basic building blocks of chromatin. Their dynamics and arrangement shape the higher-order architecture of chromatin and thereby affect almost all nuclear processes. Thanks to its rather simple protocol, ATAC-seq has been rapidly adopted as a major tool for chromatin accessibility profiling at both the bulk and single-cell levels; however, picturing the arrangement of nucleosomes per se remains a challenge with ATAC-seq. We introduce a novel ATAC-seq analysis toolkit, named deNOPA, to predict nucleosome positions. Assessments showed that deNOPA not only outperformed state-of-the-art tools, but was also the only tool able to predict nucleosome positions precisely from ultrasparse ATAC-seq data. The remarkable performance of deNOPA is fueled by reads from short fragments, which make up nearly half of the sequenced reads in an ATAC-seq library and are normally discarded. We found that these short-fragment reads are enriched in information on nucleosome positions, and that linker regions can be predicted from reads of both short and long fragments using Gaussian smoothing.
[Figure: basic workflow of deNOPA]
The deNOPA package was initially developed using Python 2.7. Support for Python 3 was added in version 1.0.2 (tested with Python 3.7). The package was tested under the default environment of Anaconda-5.3.1, in both its Python 2.7 and Python 3.7 versions. Besides a Python environment, the following dependencies are also required.
Please make sure they are properly installed before installing the deNOPA package itself. Please also use the Python 3 version wherever possible; only this version is maintained now.
A tested list of prerequisites is provided in requirements.txt. Users can quickly build an environment using
pip install -r requirements.txt
Use the following commands to get deNOPA installed.
git clone https://gitee.com/bxxu/denopa.git
cd denopa
python setup.py install
Alternatively, download the wheel file that matches your version of Python from the dist directory in this repo. Then install it using the following command:
pip install deNOPA-x.y.z-pyX-none-any.whl
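Choosing the wheel that matches the running interpreter can be sketched in Python. This is only an illustration: it assumes the files in dist follow the standard wheel naming scheme (name-version-pythontag-abitag-platformtag.whl, without the optional build tag), and the file names in the example are hypothetical, not a listing of the actual dist directory.

```python
import re
import sys

# Standard wheel file name layout: name-version-pythontag-abitag-platformtag.whl
WHEEL_RE = re.compile(
    r"^(?P<name>[^-]+)-(?P<version>[^-]+)-(?P<pytag>[^-]+)"
    r"-(?P<abi>[^-]+)-(?P<platform>[^-]+)\.whl$"
)


def pick_wheels(filenames, python_major):
    """Return the wheel file names whose Python tag covers the given major version."""
    wanted = "py{}".format(python_major)
    matches = []
    for fname in filenames:
        m = WHEEL_RE.match(fname)
        if m is None:
            continue  # not a wheel file name in the assumed layout
        # A tag such as "py2.py3" lists several compatible interpreters.
        if wanted in m.group("pytag").split("."):
            matches.append(fname)
    return matches


if __name__ == "__main__":
    # Hypothetical file names; replace with the real contents of dist/.
    example = ["deNOPA-1.0.2-py2-none-any.whl", "deNOPA-1.0.2-py3-none-any.whl"]
    print(pick_wheels(example, sys.version_info[0]))
```

The selected file name can then be passed to pip install as shown above.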
The BAM files should be indexed using samtools before running the package. The package has only been tested on alignments produced by bowtie2; compatibility with other aligners is not guaranteed.
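A quick pre-flight check for missing indexes might look like the sketch below. It relies only on the file names that samtools index normally writes next to the BAM file (sample.bam.bai, or sample.bam.csi when a CSI index is requested) and also accepts sample.bai; the function name is hypothetical and it is not part of deNOPA.

```python
import os


def missing_index(bam_paths):
    """Return the BAM paths that have no index file next to them."""
    unindexed = []
    for bam in bam_paths:
        # Index names commonly found beside a BAM file.
        candidates = [bam + ".bai", bam + ".csi"]
        if bam.endswith(".bam"):
            candidates.append(bam[:-4] + ".bai")
        if not any(os.path.isfile(c) for c in candidates):
            unindexed.append(bam)
    return unindexed
```

Any path returned by this function would need to be indexed with samtools index before being passed to denopa.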
usage: denopa [-h] -i INPUT [-o OUTPUT] [-b BUFFERSIZE] [-s CHROMSKIP]
[-c CHROMINCLUDE] [-n NAME] [-m MAXLEN] [--proc PROC] [-p PARER]
[-q QNFR] [-r]
Decoding the nucleosome positions with ATAC-seq data at single cell level
optional arguments:
-h, --help show this help message and exit
-i INPUT, --input INPUT
The input bam files. The files should be sorted. This
argument could be given multiple times for multiple
input files.
-o OUTPUT, --output OUTPUT
The directory where the output files will go to. It
will be created if not exists (default .).
-b BUFFERSIZE, --bufferSize BUFFERSIZE
Number of reads buffered in reading the bam file
(default 1000000).
-s CHROMSKIP, --chromSkip CHROMSKIP
Names of chromosomes skiped from the processing.
Multiple values should be sepaated by ',' (default
chrY,chrM).
-c CHROMINCLUDE, --chromInclude CHROMINCLUDE
The regular expression of chromosome names included in
the analysis, for human genome
'chr[1-9][0-9]{,1}|chrX' should be enough. It can be
combined with -s.
-n NAME, --name NAME The name of the project (default deNOPA).
-m MAXLEN, --maxLen MAXLEN
The maximun fragment length in the input files
(default 2000).
--proc PROC Number of processors used in the analysis (default 1).
-p PARER, --pARER PARER
The p-value threshold used in determining the ATAC-seq
reads enriched regions (ARERs, default 0.1)
-q QNFR, --qNFR QNFR The q-value threshold used in determining the
nucleosome free regions (NFRs, default 0.1).
-r, --removeIntermediateFiles
The intermediate files will be removed if this flag is
set.
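The options above can be combined programmatically. The sketch below does two things: it previews which chromosome names would survive a -c regular expression together with a -s skip list, and it assembles an argument list for denopa from the documented flags. Both helper functions are hypothetical. The preview assumes the regular expression must match the whole chromosome name, which may differ from how deNOPA applies it internally.

```python
import re


def select_chromosomes(names, include=None, skip="chrY,chrM"):
    """Preview which chromosome names pass -c (regex) and -s (skip list).

    Assumption: the regex must match the whole name; deNOPA's own matching
    rule may differ.
    """
    skipped = set(skip.split(",")) if skip else set()
    pattern = re.compile(include) if include else None
    kept = []
    for name in names:
        if name in skipped:
            continue
        if pattern is not None and pattern.fullmatch(name) is None:
            continue
        kept.append(name)
    return kept


def build_denopa_command(bams, output=".", name="deNOPA", proc=1,
                         chrom_include=None, remove_intermediate=False):
    """Assemble an argument list for the denopa command from documented flags."""
    cmd = ["denopa"]
    for bam in bams:
        cmd += ["-i", bam]  # -i may be given once per input file
    cmd += ["-o", output, "-n", name, "--proc", str(proc)]
    if chrom_include:
        cmd += ["-c", chrom_include]
    if remove_intermediate:
        cmd.append("-r")
    return cmd


if __name__ == "__main__":
    names = ["chr1", "chr10", "chrX", "chrY", "chrM", "chr1_gl000191_random"]
    print(select_chromosomes(names, include="chr[1-9][0-9]{,1}|chrX"))
    print(build_denopa_command(["sample.bam"], output="results", proc=4))
```

The returned list can be handed to subprocess.run(cmd, check=True) once denopa is installed; building a list rather than a shell string avoids quoting problems with the '|' in the regular expression.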
Sparse data does not mean no data. When reads are too sparse, deNOPA cannot detect any ARERs and will fail. The package has been tested to work on data with 600K or more read pairs for Saccharomyces cerevisiae, or 10M or more read pairs for human or mouse. A test dataset is provided in the "test" directory; it contains about 25% (604K) of the aligned fragments in SRR1822145, together with the results.
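The tested depth limits above can be encoded as a simple guard to run before starting an analysis. This is a minimal sketch: the organism keys and the function name are hypothetical, the read-pair count has to be obtained separately, and the thresholds are the tested minimums quoted above rather than hard limits enforced by deNOPA.

```python
# Minimum read-pair counts at which deNOPA is reported as tested.
TESTED_MIN_PAIRS = {
    "yeast": 600000,     # Saccharomyces cerevisiae
    "human": 10000000,
    "mouse": 10000000,
}


def meets_tested_depth(n_read_pairs, organism):
    """Return True if the read-pair count reaches the tested minimum."""
    if organism not in TESTED_MIN_PAIRS:
        raise ValueError("no tested minimum recorded for %r" % organism)
    return n_read_pairs >= TESTED_MIN_PAIRS[organism]
```

For example, a count of 604,000 pairs passes the yeast threshold, while 5,000,000 pairs falls below the tested minimum for human or mouse.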
Please add the following citation if you use deNOPA in your study:
Xu B, Li X, Gao X, et al. DeNOPA: decoding nucleosome positions sensitively with sparse ATAC-seq data. Briefings in Bioinformatics, 2022, 23(1): bbab469.
You can send questions, discussions, bug reports and other useful information to Zhihua Zhang (zangzhihua@big.ac.cn) or Bingxiang Xu (xubingxiang@sus.edu.cn).