2013 Mar 22;14:104. doi: 10.1186/1471-2105-14-104

Table 1.

PTM filtering tokens and information extraction assessment

PTM	Filtering	Generic corpus			Positive corpora
	token	*# Filtered abstracts*	*# Retrieved abstracts*	*Precision*	*# Abstracts*	*Recall*
Acetylation	“acet”	26,144	1,753	65%	97	89%
Amidation	“amid”	21,861	1,515	73%	61	95%
Disulfide bond	“disulf”	6,933	1,095	94%	514	75%
Glycosylation	“glyco”	31,379	2,746	73%	464	85%
Methylation	“methyl”	28,015	664	57%	47	87%
Phosphorylation	“phospho”	61,144	16,129	71%	906	93%
Sulfation	“sulf”	20,834	256	65%	40	92%

“Filtering token” is the term used to select abstracts, “# Filtered abstracts” is the number of abstracts that contain this token, and “# Retrieved abstracts” is the number of abstracts selected by the complete sentence extraction procedure. Precision was estimated by manual analysis of 100 positive abstracts.
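
The token filtering step and the two reported metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation: the case-insensitive substring match, the function names, and the toy abstracts are all assumptions made for illustration, and the sentence extraction procedure that produces the retrieved set is not modelled.

```python
def filter_abstracts(abstracts, token):
    """Keep abstracts that contain the filtering token.

    Assumption: a case-insensitive substring match; the legend does
    not specify the exact matching rule.
    """
    needle = token.lower()
    return [text for text in abstracts if needle in text.lower()]


def precision(n_correct, n_reviewed):
    """Fraction of manually reviewed retrieved abstracts judged correct."""
    return n_correct / n_reviewed


def recall(n_retrieved, n_positive):
    """Fraction of a positive corpus that the procedure retrieves."""
    return n_retrieved / n_positive


# Toy abstracts (hypothetical, not taken from the corpora in Table 1).
toy_abstracts = [
    "Histone acetylation regulates transcription.",
    "Phosphorylation of Ser-10 was observed.",
    "Acetate metabolism in yeast.",
]

filtered = filter_abstracts(toy_abstracts, "acet")
print(len(filtered))
```

In this toy set the token "acet" also matches the abstract about acetate metabolism, which shows why a short token is only a coarse first filter and why the later extraction step retrieves far fewer abstracts than the filter passes.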