2024 Jun 1;8:e2400077. doi: 10.1200/CCI.24.00077

TABLE A4.

Performance Characteristics of AI Content Detectors in Distinguishing AI-Generated Text From Human-Written Abstracts

Detector	Comparison	AUROC	AUPRC	Brier Score
Originality.ai	GPT-3.5 v human-written	1.000	1.000	0.013
	GPT-4 v human-written	0.995	0.996	0.027
	Mixed human/GPT-3.5 v human-written	0.782	0.775	0.400
	Mixed human/GPT-4 v human-written	0.706	0.703	0.426
	Translated v human-written	0.912	0.921	0.199
Sapling	GPT-3.5 v human-written	0.991	0.990	0.024
	GPT-4 v human-written	0.973	0.975	0.043
	Mixed human/GPT-3.5 v human-written	0.617	0.671	0.338
	Mixed human/GPT-4 v human-written	0.606	0.655	0.359
	Translated v human-written	0.609	0.595	0.397
GPTZero	GPT-3.5 v human-written	1.000	1.000	0.010
	GPT-4 v human-written	0.999	0.999	0.016
	Mixed human/GPT-3.5 v human-written	0.684	0.692	0.346
	Mixed human/GPT-4 v human-written	0.596	0.636	0.372
	Translated v human-written	0.585	0.586	0.385
Kashyap^a	GPT-3.5 v human-written	0.970	0.970	0.477
	GPT-4 v human-written	0.880	0.826	0.498
	Mixed human/GPT-3.5 v human-written	0.670	0.701	0.499
	Mixed human/GPT-4 v human-written	0.640	0.702	0.499

NOTE. Each comparison reports performance metrics for distinguishing 100 human-written abstracts from 100 generated, mixed, or translated abstracts. The exception is the Kashyap detector, which underwent a preliminary assessment on 10 human-written versus 10 generated or mixed abstracts, with performance as listed; because it performed worse than the other detectors, it was excluded from further analysis. Bold text indicates the model achieved the best performance for the specified comparison as measured by the listed metric (ie, highest AUROC or AUPRC, or lowest Brier Score).

Abbreviations: AI, artificial intelligence; AUPRC, area under the precision-recall curve; AUROC, area under the receiver operating characteristic curve.

^aDetector was run on a preliminary cohort with 10 abstracts in each category, but further analysis was not performed because of lower accuracy compared with other detectors.
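
For readers less familiar with the three metrics in the table, the following is a minimal Python sketch of how each can be computed from a detector's scores. It is not the authors' code. The function names, the toy labels and scores, and the use of step-wise average precision as the AUPRC estimate are all assumptions made for illustration; the source does not state how AUPRC was calculated. The second half shows one way, among several possible, that a detector can rank abstracts well (high AUROC) yet have a Brier score near 0.5: its scores are all pushed toward one end of the probability scale. This is an illustration of the metrics, not a diagnosis of any detector in the table.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise AUPRC estimate: mean precision at the rank of each positive.

    Tied scores are not grouped; they keep their input order.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    true_pos = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            true_pos += 1
            total += true_pos / rank
    return total / n_pos


def brier_score(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and the 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


# Hypothetical toy data: 1 = AI-generated, 0 = human-written.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]

# Same ranking, but every score squeezed into the range 0.99 to 1.00.
shifted = [0.99 + 0.01 * s for s in scores]

print("original:", auroc(labels, scores), average_precision(labels, scores),
      brier_score(labels, scores))
print("shifted: ", auroc(labels, shifted), average_precision(labels, shifted),
      brier_score(labels, shifted))
```

AUROC and AUPRC depend only on how the scores order the abstracts, so they are unchanged by the shift. The Brier score also penalizes how far each score lies from the true 0/1 label, so it rises sharply when the human-written abstracts are all assigned scores close to 1.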