2024 Jun 1;8:e2400077. doi: 10.1200/CCI.24.00077

TABLE A4.

Performance Characteristics of AI Content Detectors in Distinguishing AI-Generated Text From Human-Written Abstracts

| Detector | Comparison | AUROC | AUPRC | Brier Score |
|---|---|---|---|---|
| Originality.ai | GPT-3.5 v human-written | 1.000 | 1.000 | 0.013 |
| Originality.ai | GPT-4 v human-written | 0.995 | 0.996 | 0.027 |
| Originality.ai | Mixed human/GPT-3.5 v human-written | 0.782 | 0.775 | 0.400 |
| Originality.ai | Mixed human/GPT-4 v human-written | 0.706 | 0.703 | 0.426 |
| Originality.ai | Translated v human-written | 0.912 | 0.921 | 0.199 |
| Sapling | GPT-3.5 v human-written | 0.991 | 0.990 | 0.024 |
| Sapling | GPT-4 v human-written | 0.973 | 0.975 | 0.043 |
| Sapling | Mixed human/GPT-3.5 v human-written | 0.617 | 0.671 | 0.338 |
| Sapling | Mixed human/GPT-4 v human-written | 0.606 | 0.655 | 0.359 |
| Sapling | Translated v human-written | 0.609 | 0.595 | 0.397 |
| GPTZero | GPT-3.5 v human-written | 1.000 | 1.000 | 0.010 |
| GPTZero | GPT-4 v human-written | 0.999 | 0.999 | 0.016 |
| GPTZero | Mixed human/GPT-3.5 v human-written | 0.684 | 0.692 | 0.346 |
| GPTZero | Mixed human/GPT-4 v human-written | 0.596 | 0.636 | 0.372 |
| GPTZero | Translated v human-written | 0.585 | 0.586 | 0.385 |
| Kashyap^a | GPT-3.5 v human-written | 0.970 | 0.970 | 0.477 |
| Kashyap^a | GPT-4 v human-written | 0.880 | 0.826 | 0.498 |
| Kashyap^a | Mixed human/GPT-3.5 v human-written | 0.670 | 0.701 | 0.499 |
| Kashyap^a | Mixed human/GPT-4 v human-written | 0.640 | 0.702 | 0.499 |

NOTE. Each comparison reports performance metrics for distinguishing 100 human-written abstracts from 100 generated, mixed, or translated abstracts. The exception is the Kashyap detector, which underwent only a preliminary assessment with 10 human-written versus 10 generated or mixed abstracts; its performance is listed here, but because it performed worse than the other detectors it was excluded from further analysis. Bold text indicates that the model achieved the best performance for the specified comparison as measured by the listed metric (ie, highest AUROC or AUPRC, or lowest Brier score).
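The selection rule in the note (highest AUROC or AUPRC, lowest Brier score) can be applied directly to the values in the table. The following is a minimal sketch, not the authors' code: the names `TABLE_A4`, `METRICS`, and `best_detectors` are hypothetical, the shortened comparison labels are ours, and the Kashyap rows are left out on the assumption that a 10 v 10 preliminary cohort should not be ranked against the 100 v 100 comparisons. Ties are reported as a list of all tied detectors.

```python
# Values transcribed from Table A4 as (AUROC, AUPRC, Brier score).
# Assumption: Kashyap is omitted because it was run on a smaller cohort.
TABLE_A4 = {
    "Originality.ai": {
        "GPT-3.5": (1.000, 1.000, 0.013),
        "GPT-4": (0.995, 0.996, 0.027),
        "Mixed GPT-3.5": (0.782, 0.775, 0.400),
        "Mixed GPT-4": (0.706, 0.703, 0.426),
        "Translated": (0.912, 0.921, 0.199),
    },
    "Sapling": {
        "GPT-3.5": (0.991, 0.990, 0.024),
        "GPT-4": (0.973, 0.975, 0.043),
        "Mixed GPT-3.5": (0.617, 0.671, 0.338),
        "Mixed GPT-4": (0.606, 0.655, 0.359),
        "Translated": (0.609, 0.595, 0.397),
    },
    "GPTZero": {
        "GPT-3.5": (1.000, 1.000, 0.010),
        "GPT-4": (0.999, 0.999, 0.016),
        "Mixed GPT-3.5": (0.684, 0.692, 0.346),
        "Mixed GPT-4": (0.596, 0.636, 0.372),
        "Translated": (0.585, 0.586, 0.385),
    },
}

# Higher is better for AUROC and AUPRC; lower is better for the Brier score.
METRICS = (("AUROC", max), ("AUPRC", max), ("Brier", min))


def best_detectors(table):
    """Map (comparison, metric) to the sorted list of best detectors."""
    best = {}
    comparisons = sorted({c for rows in table.values() for c in rows})
    for comparison in comparisons:
        for idx, (metric, pick) in enumerate(METRICS):
            values = {
                detector: rows[comparison][idx]
                for detector, rows in table.items()
                if comparison in rows
            }
            target = pick(values.values())
            best[(comparison, metric)] = sorted(
                d for d, v in values.items() if v == target
            )
    return best


if __name__ == "__main__":
    for key, winners in best_detectors(TABLE_A4).items():
        print(key, winners)
```

Returning a list rather than a single name keeps ties visible, which matters for the GPT-3.5 comparison, where two detectors share the same AUROC and AUPRC.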

Abbreviations: AI, artificial intelligence; AUPRC, area under the precision recall curve; AUROC, area under the receiver operating characteristic curve.
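For readers less familiar with these metrics, the sketch below shows one common way to compute each of them from a detector's output. It is illustrative only and makes several assumptions that the table does not state: labels are coded 1 for AI-generated (or mixed or translated) text and 0 for human-written text, each detector returns a probability-like score in [0, 1], and AUPRC is estimated with the step-wise average precision rather than a trapezoidal area. The function names and the toy data are hypothetical and are not taken from the study.

```python
def brier_score(labels, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


def auroc(labels, scores):
    """Chance that a random positive scores above a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise estimate of the area under the precision-recall curve.

    Averages the precision observed at the rank of each positive item.
    Tied scores are ordered arbitrarily by the sort (a simplification).
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


if __name__ == "__main__":
    # Toy data for illustration only; not from the study.
    toy_labels = [1, 1, 0, 0]
    toy_scores = [0.9, 0.4, 0.6, 0.1]
    print(auroc(toy_labels, toy_scores))
    print(average_precision(toy_labels, toy_scores))
    print(brier_score(toy_labels, toy_scores))
```

AUROC and AUPRC depend only on how the scores rank the abstracts, whereas the Brier score also penalizes poorly calibrated probabilities. That difference is one possible reason a detector can show a high AUROC alongside a Brier score near 0.5.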

^a Detector was run on a preliminary cohort with 10 abstracts in each category, but further analysis was not performed because of lower accuracy compared with other detectors.