Table 4. Overall performance metrics for Scenario 1: original vs. AI-generated abstracts.
| Metric | GPTZero (%) | ZeroGPT (%) | DetectGPT (%) | 
|---|---|---|---|
| Accuracy | 97.22 | 64.35 | 54.63 | 
| - By author categories | |||
| - Native | 99.07 | 64.81 | 58.33 | 
| - Non-Native | 97.22 | 63.89 | 50.93 | 
| - By disciplines | |||
| - Technology & Engineering | 100.00 | 61.11 | 56.94 | 
| - Social Sciences | 98.61 | 66.67 | 63.89 | 
| - Interdisciplinary | 95.83 | 65.28 | 43.06 | 
| False Positive Rate (FPR) | 0.00 | 16.67 | 31.94 | 
| - By author categories | |||
| - Native | 0.00 | 19.44 | 27.78 | 
| - Non-Native | 0.00 | 13.89 | 36.11 | 
| - By disciplines | |||
| - Technology & Engineering | 0.00 | 12.50 | 41.67 | 
| - Social Sciences | 0.00 | 12.50 | 12.50 | 
| - Interdisciplinary | 0.00 | 25.00 | 41.67 | 
| False Negative Rate (FNR) | 2.78 | 45.15 | 52.08 | 
| - By author categories | |||
| - Native | 2.78 | 43.06 | 48.61 | 
| - Non-Native | 8.33 | 47.22 | 55.56 | 
| - By disciplines | |||
| - Technology & Engineering | 2.08 | 52.08 | 43.75 | 
| - Social Sciences | 4.17 | 43.75 | 47.92 | 
| - Interdisciplinary | 2.08 | 39.58 | 64.58 | 
| False Accusation Rate (FAR) | 44.44 | ||
| - By author categories | |||
| - Native | 44.44 | ||
| - Non-Native | 44.44 | ||
| - By disciplines | |||
| - Technology & Engineering | 45.83 | ||
| - Social Sciences | 25.00 | ||
| - Interdisciplinary | 62.50 | ||
| Majority False Accusation Rate (MFAR) | 4.17 | ||
| - By author categories | |||
| - Native | 2.78 | ||
| - Non-Native | 5.56 | ||
| - By disciplines | |||
| - Technology & Engineering | 8.33 | ||
| - Social Sciences | 0.00 | ||
| - Interdisciplinary | 4.17 |
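
To make the cross-detector metrics concrete, the minimal sketch below shows one way FPR, FNR, FAR, and MFAR could be computed from binary detector verdicts. It assumes FAR counts a human-written abstract flagged as AI by at least one of the three detectors and MFAR one flagged by a strict majority of them; these definitions, the function names, and the toy data are illustrative assumptions rather than the study's actual code or data.

```python
# Illustrative sketch only. Each detector is assumed to return a binary verdict
# per abstract (True = "flagged as AI-generated"). FAR/MFAR definitions and the
# toy data below are assumptions for illustration, not the study's code or data.
from typing import Dict, List, Tuple


def fpr(human_flags: List[bool]) -> float:
    """False positive rate: share of human-written abstracts flagged as AI (%)."""
    return 100.0 * sum(human_flags) / len(human_flags)


def fnr(ai_flags: List[bool]) -> float:
    """False negative rate: share of AI-generated abstracts not flagged (%)."""
    return 100.0 * sum(not f for f in ai_flags) / len(ai_flags)


def far_mfar(human_flags_by_detector: Dict[str, List[bool]]) -> Tuple[float, float]:
    """Assumed cross-detector rates over human-written abstracts:
    FAR  -- flagged by at least one detector.
    MFAR -- flagged by a strict majority of the detectors."""
    detectors = list(human_flags_by_detector)
    n_items = len(human_flags_by_detector[detectors[0]])
    votes = [sum(human_flags_by_detector[d][i] for d in detectors)
             for i in range(n_items)]
    far = 100.0 * sum(v >= 1 for v in votes) / n_items
    mfar = 100.0 * sum(v > len(detectors) / 2 for v in votes) / n_items
    return far, mfar


if __name__ == "__main__":
    # Toy verdicts for four human-written abstracts (made-up data).
    human = {
        "GPTZero":   [False, False, False, False],
        "ZeroGPT":   [True,  False, False, False],
        "DetectGPT": [True,  True,  False, False],
    }
    # Toy verdicts for four AI-generated abstracts (made-up data).
    ai_gptzero = [True, True, True, False]

    print(f"GPTZero FPR={fpr(human['GPTZero']):.2f}%  FNR={fnr(ai_gptzero):.2f}%")
    far, mfar = far_mfar(human)
    print(f"FAR={far:.2f}%  MFAR={mfar:.2f}%")
```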