Table 3.

Performance of Aggregator.

Set of Articles to be Clustered	split	purity	F1
Conditions queries, all retrieved articles	0.105	0.74	0.79
Conditions queries, only NCT-containing articles	0.107	0.72	0.77
Condition+Intervention queries, all retrieved articles	0.00	0.86	0.91
Condition+Intervention queries, only NCT-containing articles	0.0078	0.79	0.86

A series of 20 PubMed queries was carried out on various conditions (see Methods), and 22 queries were carried out on similar conditions plus specific interventions (Table 2). We used Aggregator to cluster either all articles retrieved by these searches or only the subset of articles that contained NCT numbers. Note that performance was somewhat better when all retrieved articles were clustered than when only the NCT-containing subset was clustered: F1 was 0.79 versus 0.77 for the condition queries and 0.91 versus 0.86 for the condition-plus-intervention queries, with purity showing the same pattern. This indicates that the clustering process, by taking into account multiple interactions among a larger number of articles, improved upon the predictions based on the pairwise similarity model alone.