Table 1. Comparison of AES systems.
| AES system | Vendor | Release date | Primary focus | Technique(s) used | Training data | Feedback application | Correlation with human raters’ scores |
|---|---|---|---|---|---|---|---|
| PEG™ | Ellis Page | 1966 | Style | Statistical | Yes (100–400) | No | 0.87 |
| IEA™ | Landauer, Foltz, & Laham | 1997 | Content | LSA (KAT engine by PEARSON) | Yes (∼100) | Yes | 0.90 |
| E-rater® | ETS development team | 1998 | Style & Content | NLP | Yes (∼400) | Yes (Criterion) | ∼0.91 |
| IntelliMetric™ | Vantage Learning | 1998 | Style & Content | NLP | Yes (∼300) | Yes (MY Access!) | ∼0.83 |
| BETSY™ | Rudner | 1998 | Style & Content | Bayesian text classification | Yes (1000) | No | ∼0.80 |
| Alikaniotis, Yannakoudakis & Rei (2016) | Alikaniotis, Yannakoudakis, and Rei | 2016 | Style & Content | SSWE + Two-layer Bi-LSTM | Yes (∼8000) | No | ∼0.91 (Spearman), ∼0.96 (Pearson) |
| Taghipour & Ng (2016) | Taghipour and Ng | 2016 | Style & Content | Adopted LSTM | Yes (∼7786) | No | QWK for LSTM ∼0.761 |
| Dong & Zhang (2016) | Dong and Zhang | 2016 | Syntactic and semantic features | Word embedding and a two-layer Convolutional Neural Network | Yes (∼1500 to ∼1800) | No | Average kappa ∼0.734 versus 0.754 for human raters |
| Dasgupta et al. (2018) | Dasgupta, T., Naskar, A., Dey, L., & Saha, R. | 2018 | Style, content, linguistic, and psychological | Deep Convolutional Recurrent Neural Network | Yes (∼8000 to 10000) | No | Pearson’s and Spearman’s correlations of 0.94 and 0.97, respectively |
Notes.
Scorers.
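
The last column of Table 1 mixes several agreement measures (Pearson correlation, Spearman correlation, and quadratic weighted kappa, QWK), so the values are not directly comparable across rows. As a rough illustration of how these measures differ, the following minimal sketch computes all three from paired human and machine scores. The function names and the sample scores are hypothetical and are not taken from any of the systems listed; the QWK function assumes integer scores on a known scale.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def quadratic_weighted_kappa(a, b, min_score, max_score):
    """QWK between two integer score lists on the scale [min_score, max_score]."""
    n_cat = max_score - min_score + 1
    n = len(a)
    observed = Counter(zip(a, b))      # joint counts of (score_a, score_b)
    hist_a, hist_b = Counter(a), Counter(b)
    num = den = 0.0
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            w = (i - j) ** 2 / (n_cat - 1) ** 2   # quadratic disagreement weight
            num += w * observed[(i, j)]
            den += w * hist_a[i] * hist_b[j] / n  # expected count under independence
    return 1.0 - num / den


# Hypothetical scores on a 1-4 scale (illustrative only).
human = [1, 2, 3, 4, 4, 2]
machine = [1, 2, 3, 4, 3, 2]
print("Pearson: ", pearson(human, machine))
print("Spearman:", spearman(human, machine))
print("QWK:     ", quadratic_weighted_kappa(human, machine, 1, 4))
```

Pearson and Spearman reward scores that rise and fall together, even if the machine is consistently offset from the human rater, whereas QWK penalizes each disagreement in proportion to the squared distance between the two scores and corrects for chance agreement.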