Table 5.
An example for constructing string features
| Input and processed documents and condidate string features | Details |
| Input document | The Three Human Syntrophin Genes Are Expressed in Diverse Tissues, Have Distinct Chromosomal Locations, and Each Bind to Dystrophin and Its Relatives |
| Processed document | the three human syntrophin genes are expressed in diverse tissues have distinct chromosomal locations and each bind to dystrophin and its relatives |
| Candidate string features | the thr |
| he thre | |
| e three | |
| three h | |
| hree hu | |
| ree hum | |
| ee huma | |
| e human | |
| ... |
The length of substring is fixed to 7. The example document only has one sentence (the title of the document of PMID:8576247). A seven-character window moves along the sequential text. All characters are converted to lower case. Only alphabetical letters and the space character are processed. Punctuation is converted to the space character.