2020 Aug 12;22(8):e17478. doi: 10.2196/17478

Table 3.

Description of training and test data sets.

| Targets | Data set | Total number of tweets, n (%) | Number of tweets with positive target, n (%) | Number of tweets with negative target, n (%) |
|---|---|---|---|---|
| Relevance (positive: relevant; negative: nonrelevant) | Total | 4000 (100) | 3011 (75.28) | 989 (24.72) |
| | Training | 3600 (100) | 2709 (75.25) | 891 (24.75) |
| | Test | 400 (100) | 302 (75.5) | 98 (24.5) |
| Commercial (positive: noncommercial; negative: commercial) | Total | 3011 (100) | 2175 (72.24) | 836 (27.76) |
| | Training | 2709 (100) | 1957 (72.24) | 752 (27.76) |
| | Test | 302 (100) | 218 (72.2) | 84 (27.8) |
| Sentiment (positive: provape; negative: not provape) | Total | 2175 (100) | 1357 (62.39) | 818 (37.61) |
| | Training | 1957 (100) | 1221 (62.39) | 736 (37.61) |
| | Test | 218 (100) | 136 (62.4) | 82 (37.6) |
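
The arithmetic behind Table 3 can be checked with a short script. The following is a minimal sketch, not the authors' code: the `counts` dictionary is transcribed by hand from the table, and the helper names `share` and `summarize` are hypothetical. It recomputes the totals and the share of tweets held out for testing for each target.

```python
# Counts transcribed from Table 3 as (positive, negative) per data set.
# For the commercial target, the positive class is "noncommercial".
counts = {
    "Relevance": {"Training": (2709, 891), "Test": (302, 98)},
    "Commercial": {"Training": (1957, 752), "Test": (218, 84)},
    "Sentiment": {"Training": (1221, 736), "Test": (136, 82)},
}


def share(part, whole):
    """Return part as a percentage of whole, rounded to 2 decimals."""
    return round(100 * part / whole, 2)


def summarize(counts):
    """Recompute totals and class/test shares for each target."""
    rows = {}
    for target, splits in counts.items():
        positive = sum(pos for pos, _ in splits.values())
        negative = sum(neg for _, neg in splits.values())
        total = positive + negative
        test_total = sum(splits["Test"])
        rows[target] = {
            "total": total,
            "positive": positive,
            "negative": negative,
            "positive_pct": share(positive, total),
            "test_pct": share(test_total, total),
        }
    return rows


summary = summarize(counts)
for target, row in summary.items():
    print(target, row)
```

Run against these counts, the script shows that each target's total equals the positive count of the target before it (relevant tweets feed the commercial target, noncommercial tweets feed the sentiment target), and that roughly 10% of tweets are held out for testing in each case.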