Author manuscript; available in PMC: 2020 Dec 7.

Published in final edited form as: Nat Protoc. 2019 Jun 7;14(7):2036–2068. doi: 10.1038/s41596-019-0172-4

Table 2 | Main programs and parameters used throughout the analyses

Program	Function	Parameter and description	Steps
cutadapt		cutadapt trims and filters reads to expected length and quality	152
		-l trims the sequence to length (bp)
		--max-n discards reads with more than n N bases
bwa	index	bwa index builds the index of reference sequences	153, 156
		-p specifies the prefix used for output files
	mem	bwa mem maps reads to the indexed linker or reference genome	154, 156
		-p specifies the paired-end mode
		-k specifies minimum seed length; matches shorter than this will be missed
		-w specifies band width; gaps longer than this will not be found
		-T specifies minimum score; alignments scoring below this will not be output
		-L specifies clipping penalty; used for the score calculation
		-B specifies mismatch penalty; used for the score calculation
		-O specifies gap open penalty; used for the score calculation
samtools	view	samtools view reads, filters and transforms SAM/BAM files	154, 156
		-u specifies output of uncompressed BAM files
		-b specifies output in the BAM format
		-f specifies required flag bits; only alignments with all of these bits set will be output
	sort	samtools sort sorts alignments by coordinates or read names	154, 156
		-n specifies sort by read name rather than by chromosomal coordinates
		-m specifies maximum memory (in KB, MB or GB) per thread
	fixmate	samtools fixmate fills in mate coordinates and related flags from a name-sorted alignment	156
		-p disables FR (forward–reverse orientation) proper pair check
		-m adds ms (mate score) tags
	markdup	samtools markdup marks duplicate alignments in a coordinate-sorted file	156
GridTools.py	matefq	GridTools.py matefq parses the reads mapped to the GRID-seq linker in the BAM file into RNA–DNA mates in interleaved FASTQ format	155
		-l specifies minimum length; RNA or DNA with length less than specified will be omitted
		-n renames the prefix of each read
		-o outputs to file in HDF5 format
	evaluate	GridTools.py evaluate calculates quality and quantity of each pair of RNA–DNA mates from the BAM file mapped to the genome	157
		-g specifies gene annotation in GTF format
		-k specifies bin size (kb) of the genome (default: 10 kb)
		-m specifies moving window for smoothing in bins (default: 10)
		-o outputs mapping information to the HDF5 file
	stats	GridTools.py stats calculates statistics of GRID-seq data	158
		-p specifies prefix of output file names
		-b outputs the summary of base-position information for RNA, Linker and DNA
		-c outputs the summary of mapping information in read counts
		-l outputs the distribution of sequence length for RNA, Linker and DNA
		-r outputs the resolution information of the library
	RNA	GridTools.py RNA identifies chromatin-enriched RNAs and evaluates the gene expression levels as well as interaction scopes	159
		-e specifies output file for the gene expression
		-s specifies output file for the RNA interaction scope
	DNA	GridTools.py DNA identifies RNA-enriched chromatin regions in background (trans) and foreground (cis)	160
	matrix	GridTools.py matrix evaluates the RNA–chromatin interaction matrix	161
		-k specifies cutoff of RNA reads per kilobase in the gene body
		-x specifies cutoff of DNA reads per kilobase at the maximum bin
	model	GridTools.py model builds a network model to deduce the enhancer–promoter proximity	Box 3
		-e specifies BED file of regulatory elements (e.g., enhancers and promoters)
		-k specifies cutoff of RNA reads per kilobase in the gene body
		-x specifies cutoff of DNA reads per kilobase at the maximum bin size
		-z specifies z score used to filter for significant proximity
bgzip		Block compression/decompression utility	157
tabix		Generic indexer for TAB-delimited genome position files	157
		-p specifies input file in GFF or GTF format
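
The `-f` entry for samtools view in the table keeps an alignment only when every requested flag bit is set. The following is a minimal Python sketch of that rule, not code from samtools or from the protocol. The bit values follow the SAM specification; the function name `passes_required_flags` and the example flag values are illustrative assumptions.

```python
# Subset of the flag bits defined in the SAM specification.
PAIRED = 0x1        # read is paired in sequencing
PROPER_PAIR = 0x2   # read is mapped in a proper pair
UNMAPPED = 0x4      # read itself is unmapped


def passes_required_flags(flag: int, required: int) -> bool:
    """Return True only if every bit in `required` is also set in `flag`.

    Hypothetical helper that mirrors the `-f` rule described in the table.
    """
    return (flag & required) == required


# Example FLAG values (illustrative, not taken from the protocol's data).
flags = [99, 147, 73, 4]

# Require both "paired" and "proper pair", i.e. a bit mask of 0x3.
required = PAIRED | PROPER_PAIR
kept = [f for f in flags if passes_required_flags(f, required)]

print(kept)
```

Here 99 and 147 carry both required bits and are kept. 73 is paired but not a proper pair, and 4 is unmapped with neither bit set, so both are dropped.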