Table 5.
Example of processed training data. Each row corresponds to a word from the collection of abstracts in the training data. The part-of-speech (POS) column tags each token as a common noun (NN), adjective (JJ), etc. The Sentence ID column identifies a training example (i.e., all tokens with the same Sentence ID are passed to the model as one training example). The Label column indicates whether a token is a software tool (T) or not (O).
Token | POS | Sentence ID | Label |
MetaComp | NN | 1.0 | T |
comprehensive | JJ | 1.0 | O |
analysis | NN | 1.0 | O |
software | NN | 1.0 | O |
comparative | JJ | 1.0 | O |
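
The grouping described in the caption, where all tokens sharing a Sentence ID form one training example, can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `group_by_sentence` and the dictionary layout are hypothetical, and the sketch assumes that rows belonging to the same sentence appear consecutively, as they do in the table.

```python
from itertools import groupby

# Rows copied from Table 5: (token, POS tag, sentence ID, label).
rows = [
    ("MetaComp", "NN", 1.0, "T"),
    ("comprehensive", "JJ", 1.0, "O"),
    ("analysis", "NN", 1.0, "O"),
    ("software", "NN", 1.0, "O"),
    ("comparative", "JJ", 1.0, "O"),
]


def group_by_sentence(rows):
    """Collect consecutive rows that share a sentence ID into one example.

    Hypothetical helper: assumes rows of the same sentence are contiguous.
    """
    examples = []
    for sentence_id, group in groupby(rows, key=lambda r: r[2]):
        group = list(group)
        examples.append({
            "sentence_id": sentence_id,
            "tokens": [r[0] for r in group],
            "pos": [r[1] for r in group],
            "labels": [r[3] for r in group],  # "T" = software tool, "O" = other
        })
    return examples


examples = group_by_sentence(rows)
for example in examples:
    print(example["sentence_id"], list(zip(example["tokens"], example["labels"])))
```

Each resulting dictionary holds the parallel token, POS, and label sequences for one sentence, which is the shape a sequence-labeling model would typically consume.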