2018 Aug 28;16(8):e2005343. doi: 10.1371/journal.pbio.2005343

Fig 1. The overall architecture of the new relevance search algorithm in PubMed.

(a) The algorithm consists of two stages: queries are first processed by BM25, a classic term-weighting algorithm; the top 500 results are then re-ranked by LambdaMART, a high-performance L2R algorithm. The machine-learning–based ranking model is learned offline using relevance-ranked training data together with a set of features extracted from queries, documents, or both. (b) Features designed and tested in this study, with brief descriptions and identifiers. D, document; IDF, inverse document frequency; L2R, learning to rank; Q, query; QD, query–document relationship; TIAB, title and abstract.
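
The two-stage flow described in panel (a) can be illustrated with a minimal sketch. It is not the PubMed implementation: the BM25 variant and its parameters (`k1`, `b`), the function names, and the toy documents are all assumptions made for illustration. The second stage is represented by an arbitrary `rerank_score` callable standing in for a trained LambdaMART model, and keeping results beyond the cutoff in their first-stage order is likewise an assumption.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25.

    k1 and b are common default values, assumed here for illustration.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in query_terms:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (n_docs - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores


def two_stage_rank(query_terms, docs, rerank_score, top_k=500):
    """Return document indices: BM25 first, then re-rank only the top_k.

    rerank_score is a hypothetical stand-in for a learned L2R model.
    """
    first = bm25_scores(query_terms, docs)
    order = sorted(range(len(docs)), key=lambda i: first[i], reverse=True)
    head, tail = order[:top_k], order[top_k:]
    # Only the head is re-scored; the tail keeps its BM25 order (assumption).
    head = sorted(head, key=lambda i: rerank_score(query_terms, docs[i]),
                  reverse=True)
    return head + tail


# Toy data (hypothetical): three tokenized "documents".
docs = [
    ["cancer", "therapy", "trial", "results"],
    ["cancer", "cancer", "genomics"],
    ["heart", "disease"],
]
query = ["cancer"]


def toy_reranker(query_terms, doc):
    """Hypothetical second-stage score: favors documents mentioning 'trial'."""
    return 1.0 if "trial" in doc else 0.0


print(two_stage_rank(query, docs, toy_reranker, top_k=2))
```

With a cutoff of 2, the toy re-ranker reorders the two BM25 matches and leaves the non-matching document last; with a cutoff of 1, only the top BM25 hit is passed to the second stage, so the first-stage order is unchanged. In a real system, the second stage would score a feature vector built from query, document, and query–document features, as listed in panel (b).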