Benchmarking of ATAC Sequencing Data From BGIs Low-Cost DNBSEQ-G400 Instrument for Identification of Open and Occupied Chromatin Regions

Published in Front Mol Biosci, 2022

Recommended citation: Naval-Sanchez et al. (2022). "Benchmarking of ATAC Sequencing Data From BGI’s Low-Cost DNBSEQ-G400 Instrument for Identification of Open and Occupied Chromatin Regions" Frontiers in Molecular Biosciences vol 9. (2296). https://doi.org/10.3389/fmolb.2022.900323

Authors: Naval-Sanchez, Marina and Deshpande, Nikita and Tran, Minh and Zhang, Jingyu and Alhomrani, Majid and Alsanie, Walaa and Nguyen, Quan and Nefzger, Christian M.

Background: Chromatin falls into one of two major subtypes: closed heterochromatin and euchromatin which is accessible, transcriptionally active, and occupied by transcription factors (TFs). The most widely used approach to interrogate differences in the chromatin state landscape is the Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq). While library generation is relatively inexpensive, sequencing depth requirements can make this assay cost-prohibitive for some laboratories.Findings: Here, we benchmark data from Beijing Genomics Institute’s (BGI) DNBSEQ-G400 low-cost sequencer against data from a standard Illumina instrument (HiSeqX10). For comparisons, the same bulk ATAC-seq libraries generated from pluripotent stem cells (PSCs) and fibroblasts were sequenced on both platforms. Both instruments generate sequencing reads with comparable mapping rates and genomic context. However, DNBSEQ-G400 data contained a significantly higher number of small, sub-nucleosomal reads (>30% increase) and a reduced number of bi-nucleosomal reads (>75% decrease), which resulted in narrower peak bases and improved peak calling, enabling the identification of 4% more differentially accessible regions between PSCs and fibroblasts. The ability to identify master TFs that underpin the PSC state relative to fibroblasts (via HOMER, HINT-ATAC, TOBIAS), namely, foot-printing capacity, were highly similar between data generated on both platforms. Integrative analysis with transcriptional data equally enabled direct recovery of three published 3-factor combinations that have been shown to induce pluripotency.Conclusion: Other than a small increase in peak calling sensitivity for DNBSEQ-G400 data (BGI), both platforms enable comparable levels of open chromatin identification for ATAC-seq library sequencing, yielding similar analytical outcomes, albeit at low-data generation costs in the case of the BGI instrument.

Access paper here