2022 Jun 5;22(7):2599–2613. doi: 10.1111/1755-0998.13646

TABLE 1.

Description of commonly used filtering approaches in the analysis of RADseq data (“filter”), the reason for their usage (“usage”), and how they impact population genomic inference (“impact”)

Filter: Hardy–Weinberg equilibrium (HWE)
Usage:
  • Removes loci under selection
  • Removes library and sequencing artefacts
Impact:
  • Unknown
Reference: Gruber et al. (2018), Sethuraman et al. (2019), Waples (2015)

Filter: Linkage within loci
Usage:
  • Mitigates effects of nonindependence of single nucleotide polymorphisms (SNPs) by removing physically linked SNPs
Impact:
  • Reduces false signals of population structure
  • Necessary for STRUCTURE (if LD correction is not used)
Reference: O'Leary et al. (2018)

Filter: Locus level diversity
Usage:
  • Loci with high SNP density (i.e., many SNPs within a locus) may be the result of polyploidy
Impact:
  • Can remove putative paralogous loci
Reference: Hohenlohe et al. (2011), Mastretta‐Yanes et al. (2015)

Filter: Minor allele frequency (MAF)/count (MAC)
Usage:
  • Identification of genotyping errors
Impact:
  • Can remove informative loci if not applied carefully
  • MAF will affect loci differently based on missingness
  • Removes genotyping errors
Reference: Linck and Battey (2019), O'Leary et al. (2018)

Filter: Variant call rate
Usage:
  • Ensures SNP panel is well represented across individuals
Impact:
  • Can dramatically reduce number of loci
  • Helps ensure samples are comparable
Reference: O'Leary et al. (2018)
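To make three of the filters in Table 1 concrete (variant call rate, MAF/MAC, and linkage within loci), the following is a minimal Python sketch using NumPy. It is an illustration under stated assumptions, not the pipeline used in the cited studies: genotypes are assumed to be coded as alternate-allele counts (0, 1, 2) with -1 for missing, the function names (`call_rate`, `minor_allele_stats`, `thin_one_snp_per_locus`) and the locus labels are hypothetical, and the thresholds are arbitrary example values rather than recommendations.

```python
import numpy as np

MISSING = -1  # assumed code for an uncalled genotype


def call_rate(genotypes):
    """Fraction of individuals with a called genotype at each SNP (column)."""
    return (genotypes != MISSING).mean(axis=0)


def minor_allele_stats(genotypes):
    """Minor allele count (MAC) and frequency (MAF) per SNP, ignoring missing calls."""
    called = genotypes != MISSING
    n_alleles = 2 * called.sum(axis=0)  # diploid: two alleles per called genotype
    alt_count = np.where(called, genotypes, 0).sum(axis=0)
    mac = np.minimum(alt_count, n_alleles - alt_count)
    maf = np.divide(mac, n_alleles,
                    out=np.zeros(mac.shape, dtype=float),
                    where=n_alleles > 0)
    return mac, maf


def thin_one_snp_per_locus(locus_ids, keep):
    """Retain only the first passing SNP from each RAD locus."""
    seen = set()
    thinned = np.zeros(len(locus_ids), dtype=bool)
    for i, locus in enumerate(locus_ids):
        if keep[i] and locus not in seen:
            seen.add(locus)
            thinned[i] = True
    return thinned


# Toy data: 4 individuals (rows) x 5 SNPs (columns); values are illustrative only.
genotypes = np.array([
    [0, 1, 0,  2, 1],
    [1, 0, 0,  1, 1],
    [1, 2, 1, -1, 0],
    [0, 1, 0, -1, 2],
])
locus_ids = ["locA", "locA", "locB", "locC", "locC"]  # hypothetical RAD locus labels

MIN_CALL_RATE = 0.75  # example threshold, not a recommendation
MIN_MAC = 2           # example threshold: drops singletons

rate = call_rate(genotypes)
mac, maf = minor_allele_stats(genotypes)
keep = (rate >= MIN_CALL_RATE) & (mac >= MIN_MAC)
thinned = thin_one_snp_per_locus(locus_ids, keep)

print("call rate:", rate)
print("MAC:", mac, "MAF:", maf)
print("pass call rate + MAC:", keep)
print("after one SNP per locus:", thinned)
```

In this toy data the fourth SNP has the same MAF (0.25) as the first but a MAC of only 1, because half of its genotypes are missing. This is one way to see why a MAF threshold affects loci differently depending on missingness.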