Summary
Open or accessible regions of the genome are the primary positions of binding sites for transcription factors and chromatin regulators. Transposase-accessible chromatin sequencing (ATAC-seq) can probe chromatin accessibility in the intact nucleus. Here, we describe a protocol to generate ATAC-seq libraries from fresh Arabidopsis thaliana tissues and establish an easy-to-use bioinformatic analysis pipeline. Our method could be applied to other plants and other tissues and allows for the reliable detection of changes in chromatin accessibility throughout plant growth and development.
For complete details on the use and execution of this protocol, please refer to Wang et al. (2020).
Subject areas: Bioinformatics, Sequencing, Model Organisms, Gene Expression
Highlights
- A protocol to generate ATAC-seq libraries from fresh plant tissues
- An easy-to-use bioinformatic analysis pipeline for ATAC-seq
- Identification of differentially accessible peaks by ATAC-seq
Before you begin
Experimental considerations
-
1.
Here, we describe a detailed ATAC-seq protocol as applied to embryos and seedlings of Arabidopsis thaliana. This protocol can also be used for other plants and other tissues. Before beginning, please ensure that your core facility has instruments suitable for flow sorting, because the sorting step is extremely important for acquiring high-quality data.
-
2.
To ensure success, please perform pilot experiments to detect the position of nuclei with different ploidy levels in the peak diagram and to set appropriate gates for FACS sorting.
Key resources table
REAGENT or RESOURCE | SOURCE | IDENTIFIER |
---|---|---|
Chemicals, peptides, and recombinant proteins | ||
Gamborg B-5 basal medium | Phyto Technology | Cat#G398 |
Sucrose | ABCONE | Cat#57501 |
2-(N-morpholino) ethanesulfonic acid (MES) | BBI Life Sciences | Cat#145224948 |
Phytagel | Sigma-Aldrich | Cat#7101052-1 |
Agar Bacteriological Grade | Shanghai Jiafeng | Cat#H8145 |
Triton X-100 | Sigma-Aldrich | Cat#9002931 |
KCl | Sigma-Aldrich | Cat#7447407 |
NaCl | Sigma-Aldrich | Cat#7647145 |
MgCl2·6H2O | Sigma-Aldrich | Cat#7791186 |
Tris base | Sigma-Aldrich | Cat#77861 |
DAPI | AAT Bioquest | Cat#28718903 |
2-Mercaptoethanol | Ruibio | Cat#60242 |
Spermine | Sigma-Aldrich | Cat#85590 |
Eva green dye | Biotium | Cat#31000 |
Phusion high-fidelity DNA polymerase | Thermo Fisher Scientific | Cat#022021 |
Complete protease inhibitor cocktail | Merck | Cat#04693132001 |
Critical commercial assays | ||
MinElute Reaction Cleanup Kit | QIAGEN | Cat#28204 |
TruePrep DNA Library Prep Kit v2 | Vazyme Biotech | Cat#TD50102 |
TruePrep Index Kit v2 | Vazyme Biotech | Cat#TD202 |
2× NEBNext high-fidelity PCR mix | New England Biolabs | Cat#M0541L |
AMPure beads | Beckman | Cat#A63880 |
Deposited data | ||
ATAC-seq, ChIP-seq, and RNA-seq experiment data | This paper | BioProject PRJCA002620, Beijing Institute of Genomics Data Center (http://bigd.big.ac.cn) |
Experimental models: organisms/strains | ||
A. thaliana: Col-0 | N/A | N/A |
Software and algorithms | ||
R version 3.6 | The R Foundation | RRID: SCR_001905 https://www.r-project.org/ |
Adobe Photoshop CC 2018 | Adobe Acrobat | N/A |
Adobe Illustrator CC 2018 | Adobe Acrobat | N/A |
Fastp | (Chen et al., 2018) | RRID: SCR_016962 https://github.com/OpenGene/fastp |
FastQC v0.11.7 | FastQC | RRID:SCR_014583 http://www.bioinformatics.babraham.ac.uk/projects/fastqc/ |
MultiQC v1.6 | (Ewels et al., 2016) | https://multiqc.info/ |
Bowtie2 v2.3.4.3 | (Langmead and Salzberg, 2012) | http://bowtie-bio.sourceforge.net/bowtie2/ |
Samtools v1.9 | (Li et al., 2009) | RRID: SCR_002105 http://www.htslib.org/ |
sambamba v0.6.7 | (Tarasov et al., 2015) | https://lomereiter.github.io/sambamba/ |
bedtools v2.25.0 | (Quinlan and Hall, 2010) | RRID: SCR_006646 https://bedtools.readthedocs.io/en/latest/ |
deepTools v3.1.2 | (Ramirez et al., 2014) | RRID: SCR_016366 https://deeptools.readthedocs.io/en/develop/ |
featureCounts v1.6.2 | (Liao et al., 2014) | RRID: SCR_012919 http://bioinf.wehi.edu.au/featureCounts/ |
DESeq2 v1.26.0 | (Love et al., 2014) | RRID: SCR_015687 https://bioconductor.org/packages/release/bioc/html/DESeq2.html |
MACS2 v2.1.2 | (Zhang et al., 2008) | https://github.com/taoliu/MACS |
DiffBind v2.14.0 | (Ross-Innes et al., 2012) | RRID: SCR_012918 https://bioconductor.org/packages/release/bioc/html/DiffBind.html |
ChIPseeker v1.22.0 | (Yu et al., 2015) | https://bioconductor.org/packages/release/bioc/html/ChIPseeker.html |
HOMER v4.10 | (Heinz et al., 2010) | RRID: SCR_010881 http://homer.ucsd.edu/homer/ |
Intervene v0.6.1 | (Khan and Mathelier, 2017) | https://intervene.readthedocs.io/en/latest/ |
Integrative Genomics Viewer | (Robinson et al., 2011) | RRID: SCR_011793 http://software.broadinstitute.org/software/igv/ |
Other | ||
Eppendorf realplex2 | Eppendorf | Cat#A248709R |
The BD FACSAria II | BD | N/A |
Percival chamber | Percival | N/A |
Materials and equipment
The lysis buffer and wash buffer can be stored at 4°C for several months. Please add spermine and 2-mercaptoethanol before experiments.
Lysis buffer
Reagent | Final concentration | Amount |
---|---|---|
Tris-HCl pH 7.5 (1 M) | 15 mM | 0.75 mL |
NaCl (5 M) | 20 mM | 0.2 mL |
KCl (2.5 M) | 80 mM | 1.6 mL |
2-Mercaptoethanol (14.3 M) | 5 mM | 17.5 μL |
Spermine (1 M) | 0.5 mM | 25 μL |
Cocktail (100×) | 1× | 0.5 mL |
Triton X-100 (100%) | 0.2% | 0.1 mL |
ddH2O | n/a | 47.3 mL |
Total | n/a | 50 mL |
Wash buffer
Reagent | Final concentration | Amount |
---|---|---|
Tris-HCl pH 8.0 (1 M) | 10 mM | 0.5 mL |
MgCl2 (1 M) | 5 mM | 0.25 mL |
Cocktail (100×) | 1× | 0.5 mL |
ddH2O | n/a | 49.25 mL |
Total | n/a | 50 mL |
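The stock volumes in the two buffer tables follow from the dilution relation C_stock × V_stock = C_final × V_total. The following is a minimal sketch of that arithmetic for a 50 mL batch; the function and dictionary names are illustrative and not part of the original protocol, and the concentrations are copied from the tables above.

```python
def stock_volume_ml(stock_conc, final_conc, total_ml=50.0):
    """Volume of stock (mL) needed so that C_stock * V_stock = C_final * V_total.

    stock_conc and final_conc must share a unit (mM, %, or x-fold).
    """
    return final_conc * total_ml / stock_conc


# (stock concentration, final concentration), same unit within each pair.
LYSIS_BUFFER = {
    "Tris-HCl pH 7.5": (1000.0, 15.0),     # mM
    "NaCl": (5000.0, 20.0),                # mM
    "KCl": (2500.0, 80.0),                 # mM
    "2-Mercaptoethanol": (14300.0, 5.0),   # mM
    "Spermine": (1000.0, 0.5),             # mM
    "Cocktail": (100.0, 1.0),              # x-fold
    "Triton X-100": (100.0, 0.2),          # percent
}

WASH_BUFFER = {
    "Tris-HCl pH 8.0": (1000.0, 10.0),     # mM
    "MgCl2": (1000.0, 5.0),                # mM
    "Cocktail": (100.0, 1.0),              # x-fold
}


def recipe(buffer, total_ml=50.0):
    """Map each reagent to the stock volume (mL) for a batch of total_ml."""
    return {
        name: stock_volume_ml(stock, final, total_ml)
        for name, (stock, final) in buffer.items()
    }


if __name__ == "__main__":
    for name, ml in recipe(LYSIS_BUFFER).items():
        print(f"{name}: {ml * 1000:.1f} uL")
```

Passing a different `total_ml` scales the recipe, for example when only a few milliliters of fresh buffer are needed on the day of the experiment.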
CRITICAL: Both 2-mercaptoethanol and spermine have pungent odors. Please add them in a fume hood.
Alternatives: This protocol describes FACS of cell nuclei using a BD FACSAria II machine, but any other FACS machine can be used as well. The FACS parameters used here are: nozzle size (70 μm), psi (70), sorting speed (1), sheath fluid (1× PBS).
Step-by-step method details
Extraction of cell nuclei
Timing: 1–2 h
This step describes the procedure for preparation of the nuclei suspension. Steps 1 and 2 are used for preparation of cell nuclei from embryos, while steps 3 and 4 are used for seedlings.
-
1.
Approximately 200 embryos at the late-cotyledon stage were collected with a 1 mL injection syringe under a dissecting microscope at 25°C. The collected embryos were placed on a 3.5 cm diameter round dish supplemented with B5 agar media (Figure 1A). The B5 agar will keep the embryos alive. The dissected embryos were transferred into a 12 cm long square dish with 500 μL pre-chilled lysis buffer on ice.
-
2.
Finely chop embryos with a double-sided blade to release nuclei on ice in fume hood.
Note: The embryos naturally settle at the bottom of the lysis buffer, so the material can be gathered together and chopped repeatedly. Because the embryos are small, chop them repeatedly to release more nuclei until the lysis buffer turns yellow-green (Figure 1B), which should take approximately 10 min. 200 embryos should provide a sufficient number of nuclei.
-
3.
Approximately 150–200 3-day-old seedlings grown on B5 agar were collected with forceps at RT. The seedlings were transferred to a 12 cm long square dish with 500 μL pre-chilled lysis buffer on ice.
-
4.
Finely chop seedlings with a double-sided blade to release nuclei on ice in fume hood.
Note: For seedlings, 5 min of chopping is enough. The color of the lysis buffer will be bright green after chopping.
-
5.
Filter the slurry through a 40 μm filter into 15 mL collection tubes on ice. Rinse the petri dish with 1.5 mL lysis buffer and filter into the same 15 mL collection tube. The final nuclei suspension is about 2 mL in volume and the color should appear yellow-green.
CRITICAL: For acquiring more intact nuclei, we recommend extensively chopping plant materials in lysis buffer rather than grinding them in liquid nitrogen which leads to more tissue fragments that will disturb FACS.
Figure 1.
Generation of nuclei suspension
(A) The Arabidopsis embryos at the late-cotyledon stage were collected on the B5 agar.
(B) The nuclei lysis suspension of embryos after filtering.
(C) The 3-day-old Arabidopsis seedlings were collected on the B5 agar.
(D) The nuclei lysis suspension of seedlings after filtering.
FACS of cell nuclei
Timing: 1–5 h
This step describes the procedure for sorting nuclei by FACS.
-
6.
The nuclei suspensions are stained with DAPI for flow cytometric analysis. Set up an unstained control without DAPI.
Note: It is not necessary to estimate the density of nuclei suspension in this step because we will sort ∼50,000 nuclei per replicate to ensure repeatability. The amount of nuclei suspension required for control samples is flexible. The purpose of control is to determine the appropriate voltage needed to exclude the interference of other particles.
-
7.
Add 2 μL of 1 mg/mL DAPI to the nuclei suspension in a sample tube, flick gently to mix.
-
8.
Select parameters including FSC-A (forward scatter-A, particle size), SSC-A (side scatter-A, internal complexity), DAPI-A (intensity of DAPI staining), and DAPI-W (voltage pulse width).
-
9.
Use the unstained control and stained samples to set appropriate photomultiplier tube (PMT) voltages and adjust compensations on a BD FACSAria II cell analyzer.
-
10.
Create gates for the samples according to the position of 2C or 4C nuclei (Figure 2).
-
11.
Add 500 μL of lysis buffer to a 10 mL BD collection tube and collect a total of 50,000 events per sample (one sample per tube). The nozzle size, psi, and sorting speed of FACS are 70 μm, 70, and 1, respectively.
Note: 200 embryos and 150–200 3-day-old seedlings are enough material for getting 50,000 nuclei.
-
12.
Centrifuge at 1,000 × g for 10 min at 4°C.
Note: Since we cannot see the pellet at the bottom of the tube after centrifugation, please leave ∼50 μL of liquid to avoid nuclei loss when discarding the supernatant.
-
13.
Pipette 1 mL of pre-chilled wash buffer into the centrifuge tube.
-
14.
Centrifuge at 1,000 × g for 10 min at 4°C and discard the supernatant as much as possible.
Note: Immature embryo cells at the late-cotyledon developmental stage rarely divide, so nearly all nuclei detected by DAPI-A in FACS are 2C (Figure 2A). The nuclei are sorted into lysis buffer because plant nuclei do not remain intact in the PBS buffer that is used for FACS.
Figure 2.
Gate set for FACS
(A) FACS histogram of the embryo nuclei suspension. DAPI-A-2C and -4C represent nuclei with 2C and 4C DNA content, respectively. Only DAPI-A-2C nuclei were sorted.
(B) FACS histogram of the seedling nuclei suspension. DAPI-A-2C, -4C, and -8C represent nuclei with 2C, 4C, and 8C DNA content, respectively. All DAPI-A-2C, -4C, and -8C nuclei were sorted.
ATAC-seq library construction
Timing: 5–8 h
This step describes the procedure for generation of ATAC-seq library.
We use the TruePrep DNA Library Prep Kit V2 for Illumina including 5× TTBL and TTE Mix V50 (Vazyme, TD50102; see manual at http://www.vazymebiotech.com/products_detail/productId=70.html) to construct ATAC-seq libraries. We use the MinElute Reaction Cleanup Kit including Buffer PB, PE and EB (Qiagen) to purify transposed DNA.
-
15.
Prepare the transposition master mix in a 200 μL Lobind tube (Eppendorf) on ice. Add reagents in the order shown below.
Reagent | Amount |
---|---|
5× TTBL | 10 μL |
TTE Mix V50 | 5 μL |
Cocktail (100×) | 0.5 μL |
Nuclei suspension | x μL |
ddH2O | 34.5–x μL |
Total | 50 μL |
Note: The volume of nuclei suspension after step 14 is variable. Please ensure that the remaining nuclei suspension is less than 34.5 μL, since the total reaction volume for transposition is 50 μL. It is not necessary to optimize reaction conditions (e.g., transposition time) when working with different initial volumes or numbers of nuclei.
-
16.
Dispense master mix into a PCR tube containing nuclear suspension. Pipette to mix.
-
17.
Load the tube into a thermal cycler pre-warmed to 37°C and incubate for 30 min with occasional gentle mixing to keep the nuclei in suspension.
-
18.
Add 3–5 volumes of Buffer PB to 1 volume of the PCR sample.
-
19.
Transfer the mixture to a spin column that is situated in a provided 2 mL collection tube.
-
20.
Centrifuge at 13,523 × g for 1 min at 4°C.
-
21.
Discard the flow-through and add 750 μL of Buffer PE to the spin column.
-
22.
Centrifuge at 13,523 × g for 1 min at 4°C.
-
23.
Discard the flow-through and centrifuge the column for an additional 2 min.
-
24.
Place the column in a clean 1.5 mL DNA Lobind microcentrifuge tube, add 11 μL of Buffer EB to the center of the column, and let it stand for 2 min.
-
25.
Centrifuge at 13,523 × g for 1 min at 4°C.
Pause point: Samples can be safely stored overnight at 4°C or −20°C.
We use the TruePrep DNA Index Kit V2 for Illumina sequencing (Vazyme) and NEBNext high-fidelity PCR mix to amplify transposed DNA.
-
26.
Prepare the PCR amplification mix in a 0.2 mL PCR tube on ice as follows:
Reagent | Amount |
---|---|
Transposed DNA | 10 μL |
N5 Primer | 2.5 μL |
N7 Primer | 2.5 μL |
2× NEBNext high-fidelity PCR mix | 25 μL |
ddH2O | 10 μL |
Total | 50 μL |
-
27.
Incubate the reaction in a thermocycler with the following PCR program.
PCR cycling conditions | |||
---|---|---|---|
Steps | Temperature | Time | Cycles |
Pre-incubation | 72°C | 5 min | 1 |
Initial denaturation | 98°C | 30 s | 1 |
Denaturation | 98°C | 10 s | 5 cycles |
Annealing | 63°C | 30 s | |
Extension | 72°C | 1 min | |
Hold | 4°C | Forever |
-
28.
To determine the number of additional PCR cycles needed to adequately amplify the DNA library, prepare the quantitative PCR (qPCR) Library Amplification Mix as follows:
Reagent | Amount |
---|---|
Amplified library | 5 μL |
N5 Primer | 0.5 μL |
N7 Primer | 0.5 μL |
2× NEBNext high-fidelity PCR mix | 7.5 μL |
20× Eva green dye | 0.75 μL |
ddH2O | 0.75 μL |
Total | 15 μL |
-
29.
Incubate the reaction in a qPCR thermocycler with the following PCR program.
PCR cycling conditions | |||
---|---|---|---|
Steps | Temperature | Time | Cycles |
Initial denaturation | 98°C | 30 s | 1 |
Denaturation | 98°C | 10 s | 20 cycles |
Annealing | 63°C | 30 s | |
Extension | 72°C | 1 min |
-
30.
To determine the optimal number of cycles needed to amplify the remaining 45 μL of each library, view the linear fluorescence versus cycle number plot on the qPCR machine once the reaction is finished. The cycle number at which the fluorescence for a given reaction is at 1/3 of its maximum is the number of additional cycles (N) that each library requires for adequate amplification (Figure 3).
-
31.
Incubate the remaining reaction in a thermocycler with the following PCR program.
PCR cycling conditions | |||
---|---|---|---|
Steps | Temperature | Time | Cycles |
Initial denaturation | 98°C | 30 s | 1 |
Denaturation | 98°C | 10 s | N cycles |
Annealing | 63°C | 30 s | |
Extension | 72°C | 1 min | |
Hold | 4°C | Forever |
-
32.
Add 67.5 μL (1.5× of sample volume) resuspended Ampure XP beads to the PCR reaction. Mix well by pipetting up and down at least 10 times. Alternatively, samples can be mixed by vortexing for 3–5 s.
Note: If Ampure XP beads are stored at 4°C, please pre-warm to 25°C for at least 30 min and vortex before use.
-
33.
Incubate samples on the bench for at least 5 min at 25°C.
-
34.
Place the tube on an appropriate magnetic stand to separate the beads from the supernatant. Quickly centrifuge the sample to collect the liquid from the sides of the tube before placing it on the magnetic stand if necessary.
-
35.
After 5 min, carefully remove and discard the supernatant. Be careful not to disturb the beads that contain bound DNA.
-
36.
Add 200 μL of freshly prepared 80% ethanol to the tube while in the magnetic stand.
-
37.
Incubate at 25°C for 30 s, and carefully remove and discard the supernatant.
-
38.
Repeat steps 36 and 37 once for a total of two washes.
Note: Be sure to remove all visible liquid after the second wash. Briefly centrifuge the tube on a benchtop centrifuge if necessary.
-
39.
Air-dry the beads for up to 5 min while the tube is on the magnetic stand with the lid open.
Note: Do not over-dry the beads, as this may result in a lower recovery of DNA. Elute the samples when the beads still look dark brown and glossy but all visible liquid has evaporated. Beads that turn lighter brown and start to crack are too dry.
-
40.
Remove the tube from the magnetic stand. Elute the DNA from the beads by adding 30 μL EB buffer provided by Qiagen.
-
41.
Mix well by pipetting up and down 10 times, or on a vortex mixer. Incubate for at least 2 min at room temperature. Quickly spin the sample to collect the liquid from the sides of the tube if necessary.
-
42.
Place the tube on the magnetic stand for 5 min. Transfer 28 μL to a new 1.5 mL tube. Libraries can be stored at −20°C. The concentration of the library can be checked by NanoDrop.
Figure 3.
Schematic diagram of qPCR
The maximum fluorescence is 2,000, so one-third of the maximum is ~666 and the number of additional PCR cycles (N) is 9 (red line).
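The rule in step 30 can be sketched as a small function that reads a linear fluorescence trace and returns the first cycle at which the signal reaches one-third of its maximum. This is an illustrative sketch only: the trace below is hypothetical (chosen so that the maximum is 2,000, as in Figure 3), and taking the first cycle at or above the threshold is an assumption about how the plot is read.

```python
def additional_cycles(fluorescence, fraction=1 / 3):
    """Return the first cycle (1-based) whose fluorescence reaches
    `fraction` of the maximum value in the trace."""
    if not fluorescence:
        raise ValueError("fluorescence trace is empty")
    threshold = max(fluorescence) * fraction
    for cycle, value in enumerate(fluorescence, start=1):
        if value >= threshold:
            return cycle
    raise ValueError("threshold was never reached")


# Hypothetical 20-cycle qPCR trace on a linear scale (maximum = 2,000).
TRACE = [10, 15, 25, 40, 70, 120, 210, 380, 680, 1050,
         1400, 1650, 1800, 1900, 1950, 1980, 1990, 1995, 2000, 2000]

PRE_AMPLIFICATION_CYCLES = 5  # cycles already run in step 27

if __name__ == "__main__":
    n = additional_cycles(TRACE)
    print("additional cycles (N):", n)
    print("total cycles:", PRE_AMPLIFICATION_CYCLES + n)
```

With this trace the threshold is about 666, which is first reached at cycle 9, so the library would receive 5 + 9 = 14 cycles in total.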
ATAC-seq library quality assessment
This step describes the procedure for assessment of ATAC-seq library.
We pipette 20 μL of each library sample and send it to Novogene for library quality determination and sequencing (Figure 4). Library quality was assessed on the NGS3K platform, and sequencing was performed on the HiSeq platform (PE150, paired-end).
Note: If we started with 50,000 sorted nuclei, 20 μL of library sample is enough for quality assessment and sequencing. Most sequencing companies perform library quality analysis before sequencing.
Figure 4.
Assessment of ATAC-seq library
(A–D) The purified ATAC-seq DNA libraries of 4 samples on NGS3K platform. Em, immature embryo; G3, 3-day-old seedlings grown on the B5 agar plate.
Expected outcomes
We performed ATAC-seq on four samples following this protocol. The resultant datasets inform us of the dynamics of accessible chromatin regions during seed germination. First, our results show the distribution of open chromatin regions along the genome (Figure 5A). Second, the analysis of differences between samples suggests the major variable regions and, by gene ontology (GO) enrichment annotation, the principal biological processes encoded therein (Figures 5C and 5D). Finally, we inferred putative transcription factor binding sites that are enriched in open regions by HOMER motif analysis (Figures 5B and 5E).
Figure 5.
Results of ATAC-seq analysis
(A) The genome-wide distribution of ATAC-seq peaks. Window size: gene body ± 3.0 kb.
(B) HOMER DNA-motif enrichment analyses of accessible peaks. The enrichment of binding motifs of 8 transcription factors per sample is shown.
(C) Volcano plot showing the genes associated with decreased (blue) or increased (red) accessible peaks between immature embryos (Em) and 3-day-old seedlings (G3). The known genes related to seed or embryo development are shown. Ns (gray in color), no difference between two samples.
(D) GO term analyses showing distinct gene ontologies of target genes linked to differentially accessible peaks (G3 versus Em). The selected 20 enriched GO biological processes are indicated. The −log10(p.adj) is given.
(E) HOMER DNA-motif enrichment analyses of differentially accessible peaks (G3 versus Em). The enrichment of binding motifs of 5 transcription factors per sample is shown.
Quantification and statistical analysis
ATAC-seq analysis pipeline
This step describes the bioinformatic pipeline used for the analysis of ATAC-seq dataset and identification of differentially accessible peaks.
We describe the basic pipeline for processing and analyzing ATAC-seq data, including data filtering, quality control, read alignment, differential accessibility analysis, motif prediction, and GO enrichment analysis, as shown below. For the detailed analysis process and related code, refer to the following link: https://github.com/WangLab-SIPPE/2020STARProtocols_pipeline.
-
1.
For each library, raw.fastq was trimmed by fastp v0.20.0 with the parameter "-a CTGTCTCTTATACACATCT".
-
2.
After trimming, FastQC v0.11.7 and MultiQC v1.6 were run for quality control of the resulting clean FASTQ files.
-
3.
Reads were aligned using Bowtie2 v2.3.4.3 to the Arabidopsis genome (TAIR10). Alternative alignment tools such as BWA-MEM can also be used.
-
4.
The resulting SAM files containing mapped reads were converted to BAM format, sorted, and indexed using Samtools v1.9.
-
5.
The biological replicates were merged by Samtools v1.9.
-
6.
The sorted BAM files were processed to remove duplicated and organellar reads by sambamba v0.6.7 and bedtools v2.25.0. An alternative tool to mark and remove duplicated reads is Picard.
-
7.
To normalize and visualize the individual and merged replicate datasets, the BAM files were converted to bigWig using bamCoverage provided by deepTools v3.1.2 with a bin size of 10 bp and normalized by Bins Per Million mapped reads (BPM).
-
8.
MACS2 v2.1.2 was used to call peaks with the parameters "-t ATAC.bam -f BAMPE -g 1.1e8" and otherwise default settings.
-
9.
For differentially accessible region analyses, we used DiffBind with the parameter "minOverlap = 1". DiffBind was also used to calculate merged peak locations, based on the outer boundaries of overlapping peaks from all the analyzed samples.
-
10.
The peaks called by MACS2 and differential peaks identified by DiffBind v2.14.0 were annotated by TxDb.Athaliana.BioMart.plantsmart28 and ChIPseeker v1.22.0 package with the "annotatePeak" function. The gene promoters are defined as ± 3.0 kb from the transcription start site (TSS).
-
11.
The differential regions were scanned for the enrichment of motifs using the "findMotifsGenome.pl" function provided by HOMER v4.10 with default parameters.
-
12.
The GO and Gene Set Enrichment Analysis of differentially accessible regions were performed using clusterProfiler v3.14.0 and org.At.tair.db v3.10.0.
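The command-line portion of the steps above (trimming through peak calling) can be sketched as follows. This is a hedged outline rather than the authors' exact commands: only the fastp adapter, the bamCoverage bin size and BPM normalization, and the MACS2 parameters come from the text; the sample name, file names, index prefix, thread count, and the `-n` output prefix are hypothetical. The sketch only assembles the command lines and does not run them. Replicate merging, organellar read removal, and the R-based steps (DiffBind, ChIPseeker, clusterProfiler) are omitted because the text does not give their command lines; the repository linked above is the reference for those.

```python
import shlex

ADAPTER = "CTGTCTCTTATACACATCT"  # adapter sequence given in step 1


def build_commands(sample, index="TAIR10", threads=4):
    """Assemble command lines for one paired-end library.

    `sample`, `index`, `threads`, and every file name are hypothetical
    placeholders; nothing is executed here.
    """
    r1, r2 = f"{sample}_R1.fastq.gz", f"{sample}_R2.fastq.gz"
    c1, c2 = f"{sample}_R1.clean.fastq.gz", f"{sample}_R2.clean.fastq.gz"
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    dedup = f"{sample}.dedup.bam"
    return [
        # Step 1: adapter trimming with fastp
        ["fastp", "-i", r1, "-I", r2, "-o", c1, "-O", c2, "-a", ADAPTER],
        # Step 2: quality control of the trimmed reads
        ["fastqc", c1, c2],
        # Step 3: alignment to the reference genome
        ["bowtie2", "-p", str(threads), "-x", index,
         "-1", c1, "-2", c2, "-S", sam],
        # Step 4: convert to sorted BAM and index
        ["samtools", "sort", "-o", bam, sam],
        ["samtools", "index", bam],
        # Step 6 (duplicates only): remove duplicated reads
        ["sambamba", "markdup", "-r", bam, dedup],
        ["samtools", "index", dedup],
        # Step 7: BPM-normalized bigWig with 10 bp bins
        ["bamCoverage", "-b", dedup, "-o", f"{sample}.bw",
         "--binSize", "10", "--normalizeUsing", "BPM"],
        # Step 8: peak calling in paired-end mode
        ["macs2", "callpeak", "-t", dedup, "-f", "BAMPE",
         "-g", "1.1e8", "-n", sample],
    ]


if __name__ == "__main__":
    for cmd in build_commands("Em_rep1"):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping each command as a list of arguments makes it straightforward to pass the same lists to a job runner later, and printing them first allows the parameters to be checked before any data are processed.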
Limitations
This protocol is designed for fresh plant materials. If the materials are stored at −80°C or in liquid nitrogen, it will lead to a high amount of tissue debris, which can interfere with the flow sorting of intact nuclei.
Troubleshooting
Problem 1
Sequenced reads have a low mapping rate.
Potential solution
There are several reasons that can cause low mapping rates. Please ensure that all solutions, equipment, and work areas are free of DNA. Wear a mask and gloves when chopping samples and constructing libraries. Avoid excessive PCR cycles during amplification. In general, 12–17 cycles are sufficient.
Problem 2
Poor data quality.
Potential solution
This might be caused by low quality or a low amount of cell nuclei. Please make sure that fresh samples are used, and an appropriate sorting speed and sheath pressure are selected.
Problem 3
Sequence reads do not map primarily to non-coding regions.
Potential solution
If the mapped sequences are mostly aligned to exonic regions in the reference genome, this might be caused by low-quality nuclei or contamination with exogenous DNA. Low-quality cell nuclei might cause nucleosome instability, which leads to an increase in the mapping rate at non-signal regions. To fix this problem, please make sure that only high-quality cell nuclei are used for the Tn5 tagmentation. As such, the examination of the sorted nuclei under a microscope is strongly recommended. In addition, 75% ethanol or a DNA cleaner can be used to remove exogenous DNA, thereby improving your data quality.
Problem 4
There are far fewer differentially accessible regions than expected.
Potential solution
This problem can also be caused by poor data quality. Please check peak numbers, plot profiles, and the correlation between replicates. You can do this with the "plotCorrelation" function in deepTools, the "plotPCA" function in DESeq2, or the "dba.plotPCA" function in DiffBind. Batch effects should also be considered; you can account for them with the "block" parameter of "dba.contrast" in DiffBind. Finally, you can try different approaches to identify differentially accessible regions, because different bioinformatic tools may lead to different results (Reske et al., 2020).
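As a rough illustration of the replicate check, the Pearson correlation between per-peak read counts of two replicates can be computed directly. This is a minimal sketch, not a replacement for the tools named above: the counts are hypothetical, and the log2(count + 1) transform is an assumption commonly used to dampen the influence of a few very strong peaks.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def log_counts(counts):
    """log2(count + 1) transform of raw per-peak read counts."""
    return [math.log2(c + 1) for c in counts]


# Hypothetical read counts over the same six peaks in two replicates.
REP1 = [120, 15, 300, 42, 8, 95]
REP2 = [110, 20, 280, 50, 5, 100]

if __name__ == "__main__":
    r = pearson(log_counts(REP1), log_counts(REP2))
    print(f"replicate correlation (log2 counts): {r:.3f}")
```

A low correlation between replicates of the same sample would point to a data quality problem rather than a true lack of differential accessibility.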
Resource availability
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Jia-Wei Wang (jwwang@sippe.ac.cn).
Materials availability
This study did not generate new unique materials.
Data and code availability
The data can be found online at https://www.sciencedirect.com/science/article/pii/S1534580720305517
The analysis code can be found online at https://github.com/WangLab-SIPPE/2020STARProtocols_pipeline.
Acknowledgments
We thank Hong Zhu for technical support in FACS (Cell Biology Core Facility of Shanghai Center for Plant Stress Biology, Center for Excellence in Molecular Plant Sciences, CAS) and members of J.-W.W.’s lab for discussion and comments on the manuscript. This work was supported by National Natural Science Foundation of China grant numbers 31788103, 31525004, and 31721001 to J.-W.W. and the Strategic Priority Research Program of the Chinese Academy of Sciences grant number XDB27030101 to J.-W.W.
Author contributions
F.-X.W., S.-G.D., and Y.-L.W. contributed equally. Y.-X.M. and J.G. helped with FACS. Z.-G.X. helped with bioinformatic analysis. F.-X.W. and S.-G.D. wrote the manuscript. J.-W.W. polished the paper and supervised the study.
Declaration of interests
The authors declare no competing interests.
References
- Chen S., Zhou Y., Chen Y., Gu J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018;34:i884–i890. doi: 10.1093/bioinformatics/bty560.
- Ewels P., Magnusson M., Lundin S., Kaller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016;32:3047–3048. doi: 10.1093/bioinformatics/btw354.
- Heinz S., Benner C., Spann N., Bertolino E., Lin Y.C., Laslo P., Cheng J.X., Murre C., Singh H., Glass C.K. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell. 2010;38:576–589. doi: 10.1016/j.molcel.2010.05.004.
- Khan A., Mathelier A. Intervene: a tool for intersection and visualization of multiple gene or genomic region sets. BMC Bioinformatics. 2017;18:287. doi: 10.1186/s12859-017-1708-7.
- Langmead B., Salzberg S.L. Fast gapped-read alignment with Bowtie 2. Nat. Methods. 2012;9:357–359. doi: 10.1038/nmeth.1923.
- Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R., 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352.
- Liao Y., Smyth G.K., Shi W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics. 2014;30:923–930. doi: 10.1093/bioinformatics/btt656.
- Love M.I., Huber W., Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15:550. doi: 10.1186/s13059-014-0550-8.
- Quinlan A.R., Hall I.M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–842. doi: 10.1093/bioinformatics/btq033.
- Ramirez F., Dundar F., Diehl S., Gruning B.A., Manke T. deepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Res. 2014;42:W187–W191. doi: 10.1093/nar/gku365.
- Reske J.J., Wilson M.R., Chandler R.L. ATAC-seq normalization method can significantly affect differential accessibility analysis and interpretation. Epigenetics Chromatin. 2020;13:22. doi: 10.1186/s13072-020-00342-y.
- Robinson J.T., Thorvaldsdottir H., Winckler W., Guttman M., Lander E.S., Getz G., Mesirov J.P. Integrative genomics viewer. Nat. Biotechnol. 2011;29:24–26. doi: 10.1038/nbt.1754.
- Ross-Innes C.S., Stark R., Teschendorff A.E., Holmes K.A., Ali H.R., Dunning M.J., Brown G.D., Gojis O., Ellis I.O., Green A.R. Differential oestrogen receptor binding is associated with clinical outcome in breast cancer. Nature. 2012;481:389–393. doi: 10.1038/nature10730.
- Tarasov A., Vilella A.J., Cuppen E., Nijman I.J., Prins P. Sambamba: fast processing of NGS alignment formats. Bioinformatics. 2015;31:2032–2034. doi: 10.1093/bioinformatics/btv098.
- Wang F.X., Shang G.D., Wu L.Y., Xu Z.G., Zhao X.Y., Wang J.W. Chromatin accessibility dynamics and a hierarchical transcriptional regulatory network structure for plant somatic embryogenesis. Dev. Cell. 2020;54:742–757.e8. doi: 10.1016/j.devcel.2020.07.003.
- Yu G., Wang L.G., He Q.Y. ChIPseeker: an R/Bioconductor package for ChIP peak annotation, comparison and visualization. Bioinformatics. 2015;31:2382–2383. doi: 10.1093/bioinformatics/btv145.
- Zhang Y., Liu T., Meyer C.A., Eeckhoute J., Johnson D.S., Bernstein B.E., Nusbaum C., Myers R.M., Brown M., Li W. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 2008;9:R137. doi: 10.1186/gb-2008-9-9-r137.