Table 1.

| Program | Profile | Peak Criteria¹ | Tag Shift | Control Data² | Rank By | FDR³ | User Input Parameters⁴ | Artifact Filtering: Strand-based / Duplicate⁵ | Reference |
|---|---|---|---|---|---|---|---|---|---|
| CisGenome v1.1 | Strand-specific window scan | 1: Number of reads in window; 2: Number of reads in window − control reads | Average for highest ranking peak pairs | Conditional binomial used to estimate FDR | No. of reads under peak | 1: Negative binomial; 2: Conditional binomial | Target FDR, optional window width, window interval | Yes / Yes | 10 |
| ERANGE v3.1 | Tag aggregation | 1: Height cutoff; 2: Height and fold enrichment over control counts in region | High quality peak estimate, per-region estimate, or input | Used to calculate fold enrichment and optionally p-values | p-value | 1: None; 2: # control / # ChIP | Optional peak height, ratio to background | Yes / No | 4, 18 |
| FindPeaks v3.1.9.2 | Aggregation of overlapped tags | Height threshold | Input or estimated | N/A | N | 1: Monte Carlo simulation; 2: N/A | Minimum peak height, subpeak valley depth | Yes / Yes | 19 |
| F-Seq v1.82 | Kernel density estimation | s standard deviations above KDE for 1: random background; 2: control | Input or estimated | KDE for local background | Peak height | 1: None; 2: None | Threshold standard deviation value, KDE bandwidth | No / No | 14 |
| GLITR | Aggregation of overlapped tags | Classification by height and relative enrichment | User input tag extension | Multiply sampled to estimate background class values | Peak height and fold enrichment | 2: # control / # ChIP | Target FDR, number of nearest neighbors for clustering | No / No | 17 |
| MACS v1.3.5 | Tags shifted then window scan | Local region Poisson p-value | Estimate from high quality peak pairs | Used for Poisson fit when available | p-value | 1: None; 2: # control / # ChIP | p-value threshold, tag length, mfold for shift estimate | No / Yes | 13 |
| PeakSeq | Extended tag aggregation | Local region binomial p-value | Input tag extension length | Used for significance of sample enrichment with binomial distribution | q-value | 1: Poisson background assumption; 2: From binomial for sample + control | Target FDR | No / No | 5 |
| QuEST v2.3 | Kernel density estimation | 2: Height threshold, background ratio | Mode of local shifts that maximize strand cross-correlation | KDE for enrichment and empirical FDR estimation | q-value | 1: N/A; 2: # control / # ChIP as a function of profile threshold | KDE bandwidth, peak height, subpeak valley depth, ratio to background | Yes / Yes | 9 |
| SICER v1.02 | Window scan with gaps allowed | p-value from random background model, enrichment relative to control | Input | Linearly rescaled for candidate peak rejection and p-values | q-value | 1: None; 2: From Poisson p-values | Window length, gap size, FDR (with control) or E-value (no control) | No / Yes | 15 |
| SiSSRs v1.4 | Window scan | N+ − N− sign change, N+ + N− threshold in region | Average nearest paired tag distance | Used to compute fold-enrichment distribution | p-value | 1: Poisson; 2: Control distribution | 1: FDR; 1, 2: N+ + N− threshold | Yes / Yes | 11 |
| spp v1.0 | Strand-specific window scan | Poisson p-value (paired peaks only) | Maximal strand cross-correlation | Subtracted before peak calling | p-value | 1: Monte Carlo simulation; 2: # control / # ChIP | Ratio to background | Yes / No | 12 |
| USeq v4.2 | Window scan | Binomial p-value | Estimated or user specified | Subtracted before peak calling | q-value | 1, 2: Binomial; 2: # control / # ChIP | Target FDR | No / Yes | 20 |

¹ Throughout the table, ‘1:’ and ‘2:’ refer to one-sample and two-sample experiments, respectively.
² The ‘Control Data’ column is intended to give a rough idea of how the software uses control data. ‘N/A’ means that control data is not handled.
³ The ‘FDR’ column describes how the FDR is, or optionally may be, computed. ‘None’ indicates that an FDR is not computed, although the experimental data may still be analyzed; ‘N/A’ indicates that the experimental setup (one sample or two) is not yet handled by the software.
⁴ The lists of ‘User Input Parameters’ for each program are not exhaustive; they comprise the subset of greatest interest to new users.
⁵ ‘Strand-based’ artifact filtering rejects peaks if the strand-specific distributions of reads do not conform to expectation, for example by exhibiting extreme bias of tag populations toward one strand or the other in a region. ‘Duplicate’ filtering refers either to removal of reads that occur in excess of expectation at a location or to filtering of called peaks to eliminate those due to low-complexity read pileups that may be associated with, for example, microsatellite DNA.
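
As a rough illustration of the two artifact filters described in the final note, the sketch below shows one simple way each could be expressed. It is not taken from any of the programs in the table: the `Read` tuple, the function names `remove_duplicates` and `passes_strand_filter`, and the thresholds `max_per_site` and `max_fraction` are all hypothetical choices made for this example, and real tools may use expectation-based or distribution-based criteria instead of fixed cutoffs.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical read representation: (chromosome, 5' position, strand).
Read = Tuple[str, int, str]


def remove_duplicates(reads: Iterable[Read], max_per_site: int = 1) -> List[Read]:
    """Keep at most max_per_site reads per (chromosome, position, strand).

    max_per_site is an assumed fixed cutoff standing in for "in excess of
    expectation"; it is not a parameter of any listed program.
    """
    seen: Counter = Counter()
    kept: List[Read] = []
    for read in reads:
        seen[read] += 1
        if seen[read] <= max_per_site:
            kept.append(read)
    return kept


def passes_strand_filter(reads: Iterable[Read], max_fraction: float = 0.9) -> bool:
    """Return False when one strand holds more than max_fraction of the reads.

    max_fraction is an assumed threshold for "extreme bias"; empty regions
    are rejected.
    """
    strands = Counter(strand for _, _, strand in reads)
    total = strands["+"] + strands["-"]
    if total == 0:
        return False
    return max(strands["+"], strands["-"]) / total <= max_fraction


# A pileup of identical forward-strand reads plus a single reverse-strand read.
region = [("chr1", 100, "+")] * 19 + [("chr1", 180, "-")]
biased = not passes_strand_filter(region)      # 19 of 20 reads on one strand
deduplicated = remove_duplicates(region)       # pileup collapses to one read
```

In this toy region the candidate peak is rejected by the strand filter as called, whereas removing duplicate reads first leaves one read per strand. The two filters can therefore interact, which is one reason the table reports them separately.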