PeerJ Comput Sci. 2023 Nov 30;9:e1710. doi: 10.7717/peerj-cs.1710

Table 3. Definition of different term-weighting approaches.

Note that some formulations include the expression max(X,1) to prevent the possibility of undefined values, such as divisions by zero or log(0).

| Scheme | Formulation |
|---|---|
| TGF (Salton & Buckley, 1988) | A + C |
| IDF (Salton & Buckley, 1988) | log(N / (A + C)) |
| TGF* (Domeniconi et al., 2015) | A |
| TGF*-IDFEC (Domeniconi et al., 2015) | A × log((C + D) / max(C, 1)) |
| χ² (Quinlan, 1986) | N × (AD − BC)² / ((A + C)(B + D)(A + B)(C + D)) |
| OR (van Rijsbergen, Harper & Porter, 1981) | log((max(A, 1) × D) / max(B × C, 1)) |
| IG (Quinlan, 1986) | (A/N) log(max(A, 1) / (A + C)) − ((A + B)/N) log((A + B)/N) + (B/N) log(B / (B + D)) |
| GR (Quinlan, 1986) | IG / (−((A + B)/N) log((A + B)/N) − ((C + D)/N) log((C + D)/N)) |
| FDD_0.5 (Maisonnave et al., 2021) | (1.25 × A/(A + C) × A/(A + B)) / ((0.25 × A/(A + C)) + A/(A + B)) |
| FDD_1.0 (Maisonnave et al., 2021) | (2.00 × A/(A + C) × A/(A + B)) / ((1.00 × A/(A + C)) + A/(A + B)) |
| FDD_10 (Maisonnave et al., 2021) | (101 × A/(A + C) × A/(A + B)) / ((100 × A/(A + C)) + A/(A + B)) |
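
To make the formulations concrete, the following minimal Python sketch evaluates several of them from the four counts. It rests on assumptions that this table does not state and that are inferred only from the shape of the formulas: A and B count the documents of the target class that do and do not contain the term, C and D count the documents outside the class that do and do not contain it, N = A + B + C + D, and log is the natural logarithm. The function name `term_weights` and the parameter `beta` are hypothetical. The three FDD rows are treated as one formula because their coefficients equal 1 + β² and β² for β = 0.5, 1 and 10. IG and GR are left out because the tabulated IG contains an unguarded log(B/(B + D)).

```python
import math


def term_weights(A, B, C, D, beta=1.0):
    """Evaluate some Table 3 weights from contingency counts.

    Assumed reading of the counts (not stated in the table):
      A: in-class documents containing the term
      B: in-class documents not containing the term
      C: out-of-class documents containing the term
      D: out-of-class documents not containing the term
    Preconditions for the logarithms: A + C > 0, C + D > 0 and D > 0;
    otherwise math.log raises ValueError or a division by zero occurs.
    """
    N = A + B + C + D  # assumption: N is the size of the collection

    weights = {
        "TGF": A + C,
        "TGF*": A,
        "IDF": math.log(N / (A + C)),
        # max(C, 1) is the guard mentioned in the table note.
        "TGF*-IDFEC": A * math.log((C + D) / max(C, 1)),
        "OR": math.log((max(A, 1) * D) / max(B * C, 1)),
    }

    # Chi-squared; returning 0.0 for an empty margin is a choice made
    # in this sketch, not something the table specifies.
    chi_den = (A + C) * (B + D) * (A + B) * (C + D)
    weights["chi2"] = N * (A * D - B * C) ** 2 / chi_den if chi_den else 0.0

    # FDD_beta: the two ratios that appear in the FDD rows.
    ratio_term = A / (A + C) if A + C else 0.0   # A / (A + C)
    ratio_class = A / (A + B) if A + B else 0.0  # A / (A + B)
    b2 = beta ** 2
    fdd_den = b2 * ratio_term + ratio_class
    weights["FDD"] = (1 + b2) * ratio_term * ratio_class / fdd_den if fdd_den else 0.0

    return weights


if __name__ == "__main__":
    # Hypothetical counts, chosen only for illustration.
    for name, value in term_weights(30, 10, 20, 40, beta=0.5).items():
        print(f"{name}: {value:.4f}")
```

With the illustrative counts A = 30, B = 10, C = 20, D = 40 and β = 0.5, the two ratios are 0.6 and 0.75, so the FDD value is (1.25 × 0.45) / (0.15 + 0.75) = 0.625.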