PLOS One. 2024 Jan 2;19(1):e0290303. doi: 10.1371/journal.pone.0290303

GAN-based medical image small region forgery detection via a two-stage cascade framework

Jianyi Zhang 1,2,*,#, Xuanxi Huang 1,#, Yaqi Liu 1, Yuyang Han 1, Zixiao Xiang 1
Editor: Ali Mohammad Alqudah 3
PMCID: PMC10760893  PMID: 38166011

Abstract

Using generative adversarial networks (GANs) (Goodfellow et al., 2014) for data augmentation of medical images is significantly helpful for many computer-aided diagnosis (CAD) tasks. However, GAN-based automated tampering attacks, such as CT-GAN (Mirsky et al., 2019), have also emerged. CT-GAN can inject lung cancer lesions into CT scans or remove them. Because the tampered region may account for less than 1% of the original image, even state-of-the-art methods struggle to detect the traces of such tampering. This paper proposes a two-stage cascade framework to detect GAN-based small region forgery in medical images, such as that produced by CT-GAN. In the local detection stage, we train the detector network with small sub-images so that interference information in authentic regions does not affect the detector. We use depthwise separable convolution and residual networks to prevent the detector from over-fitting, and an attention mechanism to enhance its ability to find forged regions. The detection results of all sub-images in the same image are combined into a heatmap. In the global classification stage, we use the gray-level co-occurrence matrix (GLCM) to better extract the features of the heatmap. Because the shape and size of the tampered region are uncertain, we use hyperplanes in an infinite-dimensional space for classification. Our method can classify whether a CT image has been tampered with and locate the tampered position. Extensive experiments show that our method achieves better performance than state-of-the-art detection methods.

1 Introduction

Because medical images are private, the lack of data has always been a significant problem for machine learning tasks related to medical images. One way to solve this problem is the generative adversarial network (GAN) [1], which can generate images that are highly similar to real images. GAN has attracted wide attention in the medical image field. Several studies have used GAN to generate medical images for data augmentation and achieved gratifying performance. The quality of GAN-generated images is high enough to confuse radiologists. Therefore, once this technology is used for malicious attacks, it will lead to serious consequences.

Deep convolutional neural networks can detect GAN-generated images [2-4]. Moreover, the detection accuracy can be improved through feature engineering [5-10]. To the best of our knowledge, there is no detection method designed specifically for GAN-forged medical images, but some domain-generic methods exist. For example, Frank et al. used the discrete cosine transform (DCT) to detect GAN-generated images [9]. Marra et al. used incremental learning to detect images from new GANs with only a small number of samples [11]. Cozzolino et al. learned feature extraction through auto-encoders and generalized the model with a small number of samples [12].

CT-GAN [13], a GAN-based attack that is difficult to detect even with state-of-the-art methods, has emerged. It can inject large lung nodules into CT images or remove them. Examples of CT-GAN injection and removal tampering of lung nodules are shown in Fig 1. The number of large lung nodules is a significant marker of lung cancer. Therefore, CT-GAN can make doctors misjudge the patient’s condition, seriously threatening the patient’s life. In addition, this attack may also be used to defraud medical insurance and maliciously discredit competitors.

Fig 1. CT-GAN tampered samples.

The first row shows the removal tampering of CT-GAN: a lung nodule is removed from the CT slice image. The second row shows the injection tampering of CT-GAN: a small nodule is altered into a large nodule.

However, new attacks like CT-GAN challenge the current detection methods. This type of attack is a GAN-based automated 3D tampering attack. It generates only a minimal area, and the surrounding area is used as a constraint condition to train a conditional generative adversarial network (CGAN). As a result, the generated image is closer to the real image. We call an attack that uses a CGAN to forge a very small region in an image a GAN-based small region forgery attack. The characteristic of this attack is that the proportion of the image generated by GAN is very small. At present, no solution can effectively detect GAN-based small region forgery attacks in medical images. Some methods, such as that of Rössler et al. [3], can detect partially generated content, such as face manipulation. However, because the style, content, and storage format of medical images are very different from those of normal images and the tampered region is too small, even state-of-the-art detection methods cannot effectively detect GAN-based small region forgery attacks in medical images. Medical image security therefore faces a considerable threat.

To solve this problem, we propose a novel cascade framework based on a local detection network and a global classification method that can detect GAN-based small region forgery attacks in medical images. The first stage is local detection. We crop small sub-images from the CT slice image to train the detector network. The sub-image is small enough that interference information in authentic regions does not affect the detector. Because the training data has only a single channel and the training set is small, the network over-fits easily. Therefore, we design a lightweight neural network with fewer parameters and use early stopping to prevent over-fitting. After training, the detector can detect the tampered region effectively. Then, we traverse the entire CT slice image sub-image by sub-image. The detection results of all sub-images are combined and output as a heatmap, which indicates which regions may have been tampered with. The second stage is global classification. Since CT-GAN can adjust the size of the tampered region to a certain extent, we use gray-level co-occurrence matrix (GLCM) features to train principal component analysis (PCA) and classifier models for global classification. Compared with methods that use the whole image as input, this method can locate the tampered coordinates, requires less training data, trains faster, and achieves higher accuracy.

The main contributions are as follows:

  • We propose a novel cascade framework based on local detection and a global classification to detect and locate the tampering regions caused by GAN-based automated 3D medical imagery attacks, including injection and removal.

  • For detection of GAN-based small region forgery, we propose a local detection network with channel attention, spatial attention, depthwise separable convolution, and residual networks. It can better find information on small areas in the image and prevent over-fitting.

  • For detection of realistic alterations that take correlations between scans into account, we design a global classification method based on hyperplanes in an infinite-dimensional space with the gray-level co-occurrence matrix (GLCM) as input features. It can effectively cooperate with local detection to classify medical images.

  • Experiments show that, for GAN-based small region forgery attacks in medical images, such as CT-GAN, our method can achieve excellent performance.

For the reproducibility of the proposed method, we have published our source code online at https://github.com/BESTICSP/CT-GAN-Detector.

The rest of this paper is organized as follows. In Section 2, we discuss the background and related work on the detection of GAN-generated images in recent years. In Section 3, we explain our method in detail. In Section 4, we describe our experimental results. In Section 5, we discuss our method, and in Section 6 we draw our conclusions.

2 Background and related works

2.1 Medical image

Medical imaging uses a particular medium to interact with the human body in order to show the structure of internal tissues or organs. Digital imaging and communications in medicine (DICOM) is an international standard for medical images and their related information. It is widely used in various radiological diagnostic equipment (X-ray, CT, MR, ultrasound, etc.). All medical images of patients are stored in the DICOM file format. The data used in this paper are mainly CT images in DICOM format. CT equipment scans slices one after another around a certain part of the patient’s body. A complete CT scan may include about 300 slice images. The scanned image is multi-layered: a three-dimensional image can be formed by stacking the slice images along the z-axis.

The clarity of medical images such as CT is positively correlated with radiation dose. However, high-dose radiation may damage the patient’s health, so it is difficult to improve the clarity of medical images. Besides, medical images have only one channel. A GAN can therefore fit the distribution of medical images more easily than that of normal three-channel color images.

2.2 Generative adversarial network

Since GAN was proposed by Goodfellow et al., it has been one of the hot spots in the computer vision (CV) field. The GAN model is different from the traditional neural network structure. GAN includes a generative model G and a discriminative model D. G generates a new sample from random noise, and D distinguishes whether the input sample is a real sample. The task of G is to generate images that D cannot distinguish. At the same time, the task of D is to distinguish between the images generated by G and the real images. The two networks compete against each other during training through this min-max game. In this way, G can learn the data distribution of the real sample. Up to now, GAN has derived a large number of variants, such as WGAN [14], PGGAN [15], StyleGAN [16] and so on. These variants are widely used in various CV tasks.
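For reference, the min-max game described above is usually written as the value function below, as given in the original GAN formulation [1]. The symbols $p_{\mathrm{data}}$ (the distribution of real samples) and $p_z$ (the noise prior) are notation introduced here and do not appear elsewhere in this paper.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
```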

2.3 Application of GAN in medical image

Medical images are different from normal images and are subject to strict privacy protection. Even though there are many public data sets, such as LIDC-IDRI, DDSM, MIAS, and OASIS, medical data sets are still insufficient. Because GAN can effectively alleviate the lack of training data, a large number of medical imaging studies use GAN. In recent years, the most frequently used GAN variants in medical imaging have been pix2pix [17] and CycleGAN [18]. GAN is widely used in image synthesis [19-24], noise reduction [25-28], cross-modality synthesis [29-34], image enhancement [35], image super-resolution [36, 37], image segmentation [19], and many other tasks, providing significant help for computer-aided diagnosis (CAD).

2.4 Detect the GAN-generated image

Because of the high performance of GAN, using deep learning to distinguish whether an image is generated by GAN has gradually become a trend. Owing to the excellent performance of convolutional neural networks (CNNs) in CV tasks, CNNs such as ResNet [38], XceptionNet [39], and EfficientNet [40] are widely used in various CV fields, including digital image forensics [2, 4, 6]. In addition, Rössler et al. [3] demonstrated the superior performance of XceptionNet in image source detection.

Hand-crafted features can make the network perform better. One way to distinguish whether an image is generated by GAN is to use the GAN fingerprint [7]. Because of its structure, a GAN leaves special fingerprints in the generated image. These fingerprints can be learned as features through deep learning and then used to distinguish the source of the image. Other work exploits the shortcomings of GAN to find special features that better distinguish whether an image is generated by GAN. For example, McCloskey and Albright found that saturated or underexposed pixels are suppressed by the normalization operation of the GAN generator [5], and used this feature to distinguish real camera images from GAN images. Because the statistical characteristics of GAN images differ from those of real images, Nataraj et al.-style approaches use three co-occurrence matrices on the RGB channels as features to distinguish the source of the image [8]. Zhang et al. suppress the image content information by converting the image to the YCrCb color space and then use the Scharr operator and the gray-level co-occurrence matrix (GLCM) to obtain edge features, allowing them to detect GAN images and copy-move images simultaneously [6]. In addition, some methods distinguish the source of the image from the defects of the up-sampling operations in GAN. Frank et al. found that up-sampling in GAN causes grid-like artifacts in the generated images after the DCT operation [9], which can be used to distinguish the source of the image. Durall et al. found that images generated by GAN cannot reproduce the actual spectral distribution [10], which is also due to the up-sampling operation. Therefore, after azimuthal integration is used to extract the spectral features, an SVM or K-Means can distinguish the source of the image without the need to train a deep CNN.

2.5 Challenge

The CT-GAN paper also proposed some detection methods that may be useful. Unfortunately, these methods are not suitable for GAN-based small region forgery attacks in medical images, such as CT-GAN. The reasons are as follows.

On the one hand, there is a huge difference between medical images and normal images. Medical images show the structure and density of human internal tissues or organs, so they have unique content and style. Medical images, such as CT, MR, and X-ray, are all taken with special equipment different from general photographing equipment and are saved according to the DICOM standard. In terms of image data format, medical images are all single-channel, and their pixel values span a range of about 4096 levels. Compared with a normal gray-scale image, whose values range from 0 to 255, the range of pixel values of medical images is 16 times larger. Therefore, models pre-trained on normal images have little effect on medical images. In addition, methods that need to extract features from three channels of an image, such as [6], which compares three features extracted from different channels, are not appropriate. The co-occurrence matrix is one of the most effective features for distinguishing whether an image is generated by GAN. However, because of the expanded pixel value range, the cost of calculating the co-occurrence matrix becomes unacceptable. So the method using the co-occurrence matrix [8] cannot work either. We did a detailed analysis of the countermeasures listed in [41]. Unfortunately, all of them failed to detect or prevent nodule forgery in chest CT.
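To make the cost argument concrete, the short sketch below counts the entries of a single co-occurrence matrix for an 8-bit image and for the roughly 4096-level range quoted above. It only illustrates the quadratic growth in matrix size; the function name is hypothetical and the sketch does not model the actual running time of any cited method.

```python
def glcm_entries(levels: int) -> int:
    """Number of cells in one levels x levels co-occurrence matrix."""
    return levels * levels


natural = glcm_entries(256)    # normal 8-bit gray-scale image
medical = glcm_entries(4096)   # pixel range quoted for medical images
ratio = medical // natural     # growth factor of the matrix size

print(natural, medical, ratio)
```

A 16-fold larger value range therefore gives a 256-fold larger matrix, before any per-channel or per-angle repetition is taken into account.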

On the other hand, GAN-based small region forgery attacks are more difficult to detect. Take CT-GAN as an example. CT-GAN is a 3D CGAN. It designs a 3D network that references the pix2pix structure. The generator of CT-GAN is a 3D UNet [42] structure. It cuts out a small cuboid from a series of CT slices of the patient, then scales it into a small cube of (32 × 32 × 32) pixels and masks the (16 × 16 × 16) pixels in the center of the cube to zero. This cube with a masked center is input into the generator as a condition. CT-GAN trains two models, which can generate large or small nodules in the center of the cube. It is worth mentioning that the size of a CT image is (512 × 512) pixels, of which the region modified by CT-GAN is smaller than (32 × 32). That means the minimum number of pixels that have been tampered with is only 1/1024 of the total. Fig 2 shows a CT image into which a lung nodule has been injected.

Fig 2. Example of a CT image with an injected lung nodule.

(a) is a tampered CT image. The nodule in the red box was injected by CT-GAN. (b) is the heatmap corresponding to (a), in which the bright red spot corresponds to the injected nodule. Because the preset sliding window stride is greater than 1, the heatmap is smaller than the original CT slice image. To facilitate observation, we enlarged the heatmap, superimposed it on the CT image, and adjusted the colors.

Hence, as we can see from Figs 1 and 2, each tampering operation by CT-GAN modifies only 1/1024 to 1/256 of the pixels of the image. Even when a CT image has been tampered with at four different locations, the tampered pixels account for only about 1% of the image. In other words, 99% of the image is interference information. The untampered part acts as a “cover” for the tampered part, which seriously hinders the model from learning the difference between positive and negative images. That is why even state-of-the-art methods, such as those based on saturation cues [5], frequency analysis [9], and spectral regularization optimization [10], struggle to detect the tampering traces of CT-GAN. More seriously, CNNs are not sensitive enough to small tampered regions, so it is difficult for them to detect such attacks accurately. Unfortunately, almost all current detection methods are based on deep CNNs, so detecting the attack directly is challenging. Current methods for distinguishing whether an image is generated by GAN target images that are wholly or mostly generated by GAN. For the time being, there is no dedicated detection model for GAN-based small region forgery attacks in medical images, such as CT-GAN. In Section 4.4, we train state-of-the-art networks with the whole CT image as input. Unfortunately, the results are poor.
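The proportions quoted in this section can be checked with a few lines of arithmetic. The sketch below assumes that the lower bound corresponds to a (16 × 16) region and the upper bound to a (32 × 32) region within a (512 × 512) slice; the helper name is hypothetical.

```python
from fractions import Fraction

CT_SIDE = 512  # side length of a CT slice in pixels


def tampered_fraction(side: int, ct_side: int = CT_SIDE) -> Fraction:
    """Share of the slice covered by one square tampered region."""
    return Fraction(side * side, ct_side * ct_side)


smallest = tampered_fraction(16)   # assumed lower bound of one tampering
largest = tampered_fraction(32)    # upper bound of one tampering
four_regions = 4 * largest         # four tampered locations at the upper bound

print(smallest, largest, float(four_regions))
```

Even in the upper-bound case with four tampered locations, less than 2% of the slice is generated content, which is consistent with the "about 1%" figure above.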

3 Our method

3.1 Motivation

Medical images are critical private information and are vitally important to the patient’s life. At present, the integrity of medical images faces the threat of GAN-based small region forgery attacks. However, there is no practical method to detect GAN-based small region forgery attacks in medical images.

There are two main reasons why attacks like CT-GAN are difficult to detect. First, the style, content, and storage format of medical images are very different from those of normal images. Converting DICOM images to any other image format loses information such as pixel values or metadata. Therefore, a model trained on normal images cannot generalize well to CT images, which makes pre-trained models unusable. Training a new model would need much more data. Unfortunately, the sample data of medical images is very limited because of restrictions on the use of health data under privacy regulations such as the California Consumer Privacy Act (CCPA) and the General Data Protection Regulation (GDPR). Hence, we cannot address this attack from the perspective of training data. Second, what makes the detection task more challenging is that the tampered region is very small while the entire CT image is large. As mentioned above, the tampered region is smaller than (32 × 32) while the entire CT image is (512 × 512), so a single tampered region may occupy less than 0.4% of the original image. This greatly reduces the sensitivity of general CNN detection methods, since the loss of spatial information limits the learning ability of the CNN. Hence, detecting directly on the whole image results in very low accuracy. For these reasons, even state-of-the-art methods struggle to detect CT-GAN attacks.

Although no specific method can be implemented directly to detect the GAN-based small region forgery attack in medical images, some research works can still inspire us to design an effective method.

Rössler et al. [3] used a face tracking method to extract the face area of the image. They found that using the extracted facial information as the input of the detector is more accurate than directly using the entire image as input. This means that a neural network can achieve better performance if the classifier focuses on more precise regions. Following their idea, we borrow the common preprocessing method of copy-move forgery detection and make the detector pay more attention to the local part of the image through a sliding window. Specifically, we split the target CT image into many small sub-images to train a local detector, and pair it with a method that uses the local classification results to determine the global classification result.

Chollet [39] replaced the Inception modules with depthwise separable convolutions and proposed XceptionNet for computer vision. Because XceptionNet makes more efficient use of model parameters than Inception V3 [43], it shows better runtime performance and higher accuracy on large-scale datasets like ImageNet while having fewer parameters than general deep CNNs. This architecture can effectively reduce over-fitting when more data cannot be collected. Therefore, considering these features, we designed our method, inspired by XceptionNet, to detect GAN-based local tampering attacks.

3.2 Threat model

In this paper, we make the following assumptions. (i) The attacker’s target is the medical image. (ii) The attacker uses a GAN-based method instead of traditional methods such as copy-move or image splicing. (iii) The attacker realistically alters the contents of a 3D scan while considering nearby anatomy, and the attack can be completely automated.

3.3 Overview

As can be seen from Fig 3, our detection method is divided into two stages: local detection and global classification. The method we propose is outlined below.

Fig 3. Overview of our method.

We cut small sub-images out of CT slices to train the local detection neural network. The detector outputs a tampered probability for each sub-image. The detection results are combined according to position to generate a heatmap. We then use GLCM to extract features from the heatmap, which are used to train the PCA and classifier models. The trained models are used for global classification.

In the local detection stage, small sub-images are systematically cut out of CT slices to train the local detector neural network. The sub-image is small relative to the tampered region, so the authentic regions it contains are not enough to hinder the detector, and the detector can focus on learning the difference between real and GAN-generated content. At test time, the tampered region may be hidden anywhere in the original image. Therefore, to minimize missed detections, our method examines every sub-image produced by the sliding window and predicts the tampered probability of each one. When all sub-images have been examined, the results are combined according to position to generate a heatmap. This heatmap intuitively reflects which regions in the original image may have been tampered with by GAN. In the global classification stage, we use GLCM to extract features from the heatmap, which are used to train the PCA and classifier models. GLCM makes the features of the heatmap more prominent. We use the trained models for global classification.

Intuitively, our method allows the neural network to observe the details of the image more carefully instead of looking at the overall situation. Thus it has a better performance when facing GAN-based small region forgery attacks.
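The local detection stage can be sketched as a plain sliding-window loop. The example below is a simplified illustration rather than the released implementation: it scans the full square grid (the restriction to the circular scan area comes later in Section 3.5), and `toy_detector` is a hypothetical stand-in for the trained local detection network.

```python
import numpy as np


def build_heatmap(ct_slice, detector, window=32, stride=4):
    """Slide a fixed window over a 2-D slice and collect one score per window."""
    height, width = ct_slice.shape
    rows = (height - window) // stride + 1
    cols = (width - window) // stride + 1
    heatmap = np.zeros((rows, cols), dtype=np.float32)
    for r in range(rows):
        for c in range(cols):
            top, left = r * stride, c * stride
            patch = ct_slice[top:top + window, left:left + window]
            heatmap[r, c] = detector(patch)
    return heatmap


def toy_detector(patch):
    # Hypothetical stand-in for the CNN: share of "suspicious" pixels in the patch.
    return float(np.mean(patch > 0.5))


slice_ = np.zeros((512, 512), dtype=np.float32)
slice_[200:232, 300:332] = 1.0          # synthetic "tampered" block
heat = build_heatmap(slice_, toy_detector)
peak = np.unravel_index(heat.argmax(), heat.shape)
print(heat.shape, peak)
```

Because the stride is larger than one pixel, the heatmap is smaller than the slice, and the position of its peak maps back to the window that best covers the synthetic block.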

3.4 Local detection network architecture

Because our training data is insufficient and the hardware is not powerful, we use a lightweight network as the local detector. Depthwise separable convolution greatly reduces the number of trainable parameters while maintaining a good training effect. For example, XceptionNet [39] and MobileNet [44] both construct the primary part of the network with depthwise separable convolution, and they perform well in image classification. However, our classification task does not need a very deep network. Because the structure of the training data is simple, using a network like XceptionNet or MobileNet would waste computing resources and may lead to network degradation or over-fitting. Therefore, we designed a shallower network based on depthwise separable convolution as our sub-image classifier. Our network structure is shown in Fig 4. The input of the network is a (32 × 32) image matrix, and the features of the image are extracted through a small number of traditional convolutions and a large number of depthwise separable convolutions.
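The parameter saving of depthwise separable convolution can be illustrated with a simple count. The sketch below ignores biases and batch normalization, and the layer size (3 × 3 kernel, 64 input channels, 128 output channels) is a hypothetical example rather than a layer taken from Fig 4.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights of a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution that mixes the channels
    return depthwise + pointwise


std = standard_conv_params(3, 64, 128)
sep = separable_conv_params(3, 64, 128)
print(std, sep, round(std / sep, 1))
```

For this example layer the separable version needs roughly one eighth of the weights, which is the kind of reduction that helps a small single-channel training set.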

Fig 4. The network architecture.

Unless stated otherwise, the stride of each convolution operation is 1, the padding is “SAME”, the activation function is ReLU, and each convolution and depthwise separable convolution layer is followed by batch normalization.

The attention mechanism can effectively improve the performance of deep learning models and is often used in copy-move detection and other detail-oriented tamper detection methods. Inspired by Woo et al. [45], we design a simple attention mechanism for our network, shown in Fig 5. The spatial attention and channel attention are computed by Eqs (1) and (2). The network places the channel attention module after the convolution blocks with the largest number of channels, where channel attention is most useful. Similarly, because the pooling layer further reduces the size of the feature map, we place the spatial attention module before the convolution block containing the pooling operation, where the feature map is largest. Adding the attention modules does not increase the training cost much, but it significantly improves the global classification performance.

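$$A_s(F) = \sigma\left(\mathrm{Conv}\left([\mathrm{MAX}_s(F); \mathrm{AVG}_s(F)]\right)\right) \tag{1}$$

$$A_c(F) = \sigma\left(\mathrm{FC}(\mathrm{MAX}_c(F)) + \mathrm{FC}(\mathrm{AVG}_c(F))\right) \tag{2}$$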

Fig 5. The attention module of our network.

The numbers of neurons in the three FC layers of channel attention are C, C/4, and C, where C is the number of channels. The stride of the convolution is 1, the padding is SAME, and the activation function is Sigmoid.
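Eqs (1) and (2) can be sketched in NumPy as follows. This is a minimal illustration with random weights, not the trained module: sharing the FC layers between the max and average branches, the ReLU in the hidden FC layer, and the 3 × 3 kernel of the spatial convolution are all assumptions made for the example, and every function name is hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Eq (2): feat has shape (H, W, C); returns one weight per channel."""
    max_desc = feat.max(axis=(0, 1))     # MAX_c(F)
    avg_desc = feat.mean(axis=(0, 1))    # AVG_c(F)

    def fc(v):
        # C -> C/4 -> C; the hidden ReLU and weight sharing are assumptions.
        return np.maximum(v @ w1, 0.0) @ w2

    return sigmoid(fc(max_desc) + fc(avg_desc))


def spatial_attention(feat, kernel):
    """Eq (1): returns an (H, W) map; kernel has shape (k, k, 2), k odd."""
    stacked = np.stack([feat.max(axis=2), feat.mean(axis=2)], axis=2)
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(stacked, ((pad, pad), (pad, pad), (0, 0)))  # SAME padding
    h, w = feat.shape[:2]
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + k, j:j + k, :] * kernel)
    return sigmoid(out)


rng = np.random.default_rng(0)
C = 16
feat = rng.normal(size=(8, 8, C))
a_c = channel_attention(feat, rng.normal(size=(C, C // 4)), rng.normal(size=(C // 4, C)))
a_s = spatial_attention(feat, rng.normal(size=(3, 3, 2)))
refined = feat * a_c[None, None, :] * a_s[:, :, None]
print(a_c.shape, a_s.shape, refined.shape)
```

Both attention maps lie between 0 and 1 and are applied by element-wise multiplication, so they rescale the feature map without changing its shape.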

The design of the residual block follows ResNet [38]. It keeps the same number of convolution kernel channels when the size of the input feature map matches the output size. The number of convolution kernels and channels is doubled when the size of the input feature map differs from that of the output (because of the pooling layer). In the fully connected layer of our network, we employ SELU [46] as the activation function. The SELU function is given by Eq (3), where λ and α are two carefully chosen constants. It performs better than ReLU in the fully connected layer. With this structure, our network saves computing resources while maintaining high accuracy.

$$\mathrm{Selu}(x) = \lambda \begin{cases} x & (x > 0) \\ \alpha e^{x} - \alpha & (x \le 0) \end{cases} \tag{3}$$
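As a small illustration of Eq (3), the sketch below evaluates SELU for scalar inputs. The numeric values of λ and α are the constants commonly quoted for SELU; they are not stated in this paper and are therefore an assumption here.

```python
import math

# Commonly quoted SELU constants (assumed; the paper does not list them).
SELU_LAMBDA = 1.0507009873554805
SELU_ALPHA = 1.6732632423543772


def selu(x: float) -> float:
    """Scaled exponential linear unit following Eq (3)."""
    if x > 0:
        return SELU_LAMBDA * x
    return SELU_LAMBDA * (SELU_ALPHA * math.exp(x) - SELU_ALPHA)


print(selu(1.0), selu(0.0), selu(-5.0))
```

For large negative inputs the output saturates at −λα, whereas ReLU would return exactly zero.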

3.5 Global classification method

In our method, the window size is fixed. This is slightly different from the sliding window commonly used in target detection tasks. In the task of target detection, if the window only covers a part of the target, the model may not be able to classify the target correctly, so it is necessary to adjust the window size and traverse the image multiple times. However, in our task, even if the window is only a part of the GAN generation region, the model can determine whether it is generated by GAN with high accuracy. Therefore, we only need a smaller window size to avoid the influence of authentic regions.

First, we calculate a series of coordinates to serve as the center coordinates of the sub-images. Then, we crop a sub-image of size (32 × 32) around each center coordinate. The reason for restricting the crop positions is that the useful part of the CT image corresponds only to the interior of the circle inscribed in the square frame, and most tampering occurs in this area. Including the area outside the circle in the calculation would waste a lot of time and space.

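The size of the CT image is denoted $CT_{size}$, the sub-image size $img_{size}$, and the stride $s$. We calculate the vertical coordinates of all rows as follows:

$$y = \left\{ \frac{img_{size}}{2} + i \times s \right\} \tag{4}$$

where

$$i = 0, 1, 2, \ldots, \frac{CT_{size} - img_{size}}{s} \tag{5}$$

Then, for a row with y = h, we calculate the horizontal coordinates as follows:

$$x = \left\{ CT_{size}/2 + j \times s \right\} \tag{6}$$

where

$$j = 0, 1, 2, \ldots, \frac{w - \frac{img_{size}}{2}}{s}, \qquad w = \sqrt{\left(CT_{size}/2\right)^{2} - h^{2}} \tag{7}$$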

Our method records the output of each sub-image and then classifies whether the image has been tampered with according to these results. In addition to outputting a final prediction, our model can also generate a heatmap based on the results of each sub-image (see Fig 2). For areas not covered by the formulas above, we set the tampered probability to 0 by default.
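One possible reading of the center-coordinate calculation is sketched below. Two points are assumptions of this sketch rather than statements of the paper: the row offset h is measured from the center of the slice, and the horizontal positions are taken symmetrically on both sides of the vertical center line. The function name is hypothetical.

```python
import math


def window_centers(ct_size=512, img_size=32, stride=4):
    """Centers of sub-images that stay inside the inscribed circle of the slice."""
    radius = ct_size / 2
    centers = []
    n_rows = (ct_size - img_size) // stride
    for i in range(n_rows + 1):
        y = img_size // 2 + i * stride        # row coordinate of the window center
        h = y - radius                        # assumption: offset from the slice center
        w_sq = radius ** 2 - h ** 2
        if w_sq <= 0:
            continue
        w = math.sqrt(w_sq)                   # half-width of the circle at this row
        j_max = math.floor((w - img_size / 2) / stride)
        # Assumption: windows are placed symmetrically around the center line.
        for j in range(-j_max, j_max + 1):
            centers.append((ct_size // 2 + j * stride, y))
    return centers


centers = window_centers()
print(len(centers), centers[0], centers[-1])
```

With the settings used in the experiments (512-pixel slices, 32-pixel windows, stride 4), this visits noticeably fewer positions than the full 121 × 121 grid, which is the saving the circular restriction is meant to provide.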

Generally speaking, attackers usually set the tampered region to be square or round in GAN-based small region forgery attacks. Consequently, in theory, a series of sub-images will be identified as positive after local detection of a tampered slice. Hence, fixed patterns, such as n vertically and horizontally contiguous sub-images being judged as fake, could be used to detect the tampering traces of CT-GAN. Nevertheless, this detection rule is not flawless, because the size of the tampered region in CT-GAN is variable to some extent and can be manipulated by the attacker. Moreover, many studies use GAN to generate larger and higher-resolution images. Therefore, we need a flexible global classification method.

Firstly, to make the features more prominent, GLCM is used to extract the texture features of the heatmap. After multiplying the local detection results (heatmap) by 100 and rounding, we calculate the GLCM with a distance of 1 at four angles (0°, 45°, 90°, and 135°) and obtain a feature matrix of size (100 × 100 × 4). Secondly, a PCA model is trained to reduce the features to 256 dimensions. Thirdly, the reduced features are used to train a classifier model, and the best parameters of the model are found by grid search. This method can adapt to different GAN tampering region sizes.

Algorithm 1 Generate the GLCM of heatmap

Input: heatmap: the heatmap matrix with size H × W. a, b: two constant offsets determined by angle and distance.

Output: GLCM: a matrix with size g × g, where g is the number of gray levels.

heatmap = round(heatmap × 100); initialize GLCM to a zero matrix

for x = 0 to W − 1 do

  for y = 0 to H − 1 do

   if 0 ≤ (x + a) < W and 0 ≤ (y + b) < H then

    g1 = heatmap(x, y)

    g2 = heatmap(x + a, y + b)

    GLCM[g1, g2] = GLCM[g1, g2] + 1

   end if

  end for

end for
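Algorithm 1 can be sketched in NumPy as follows. This is an illustrative version, not the released code: the mapping from the four angles to (a, b) offsets is an assumed sign convention, heatmap(x, y) is read as the array element `q[y, x]`, and 101 gray levels are used so that a score of exactly 1.0 has its own bin (the text above reports a 100 × 100 × 4 feature matrix).

```python
import numpy as np

# Assumed (a, b) offsets for distance 1 at 0, 45, 90 and 135 degrees.
OFFSETS = {0: (1, 0), 45: (1, -1), 90: (0, -1), 135: (-1, -1)}


def heatmap_glcm(heatmap, a, b, levels=101):
    """Count co-occurring quantized scores at pixel offset (a, b)."""
    q = np.clip(np.rint(heatmap * 100).astype(int), 0, levels - 1)
    h, w = q.shape
    glcm = np.zeros((levels, levels), dtype=np.int64)
    for y in range(h):
        for x in range(w):
            x2, y2 = x + a, y + b
            if 0 <= x2 < w and 0 <= y2 < h:   # skip neighbors outside the heatmap
                glcm[q[y, x], q[y2, x2]] += 1
    return glcm


def glcm_features(heatmap, levels=101):
    """Stack the four directional GLCMs into one (levels, levels, 4) array."""
    mats = [heatmap_glcm(heatmap, a, b, levels) for a, b in OFFSETS.values()]
    return np.stack(mats, axis=-1)


demo = np.array([[0.0, 0.0, 0.5],
                 [0.0, 0.5, 0.5]])
g0 = heatmap_glcm(demo, 1, 0)
features = glcm_features(demo)
print(g0[0, 0], g0[0, 50], g0[50, 50], features.shape)
```

In the pipeline described above, the stacked array would be flattened before being passed to PCA and the classifier.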

4 Experiments

4.1 Implementation details

1) Dataset and tampering methods

Algorithms for medical image synthesis can be used to tamper with medical images [19-24]. However, without control over what the algorithm generates, the result of the tampering will raise suspicion. According to our threat model, if attackers want a forgery that realistically alters the contents of a 3D scan while considering nearby anatomy and that can be completely automated, they can only choose recent CGAN variants such as Pix2Pix or CycleGAN. For example, CT-GAN uses two conditional GANs to perform in-painting on 3D imagery.

For small region forgery, we use the source code of CT-GAN and train the injection and removal models with the LUNA16 data set [47]. Reichman et al. proposed a dataset named LuNoTim-CT [48], which is generated from the LIDC/IDRI dataset. However, the quantity and quality of LuNoTim-CT are unsatisfactory for our purpose, since it also contains copy-move and classical inpainting tampered images and is generated from LIDC/IDRI, whereas LUNA16 is much better. We then use the trained models to generate 3540 different CT scan samples. Among them, 1776 scans had lung cancer lesions (equivalent to large-diameter lung nodules) injected, and 1764 scans had lung cancer lesions removed. For each fake sample, we select the slice at the tampered point and the two slices before and after it (five CT slices in total), together with the corresponding five slices before tampering. In the end, 35400 CT slice images were obtained. The tampering points are the CT slices with lung nodules. Therefore, we randomly selected about half of the real CT slice images (about 8850) and replaced them with slices at random locations. Among the 35400 CT slice images, 1200 images are randomly selected as the test set, 4800 images are randomly selected as the training set of global classification, 2000 images are randomly selected as the validation set of local detection, and the remaining 27400 slices are used as the training set of local detection.
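The data set sizes in this paragraph can be cross-checked with a few lines of arithmetic. The variable and split names below are hypothetical labels for the quantities quoted above.

```python
scans = 1776 + 1764                 # injected scans + removed scans
slices_per_scan = 5 + 5             # five tampered slices and their five originals
total_slices = scans * slices_per_scan

splits = {
    "test": 1200,
    "global_train": 4800,
    "local_validation": 2000,
    "local_train": 27400,
}

print(scans, total_slices, sum(splits.values()))
```

The four subsets add up to the full set of slice images, so no slice is shared between the local and global stages or the test set.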

We refer to the test set described in the previous paragraph as CTGAN-ALL. In addition, we divide CTGAN-ALL into two parts according to whether the tampering is injection or removal. The CT slice images with injected large nodules and the real images with large nodules are marked as CTGAN-INJ. The CT images with large nodules removed and the real images with small nodules are marked as CTGAN-REM.

In addition, eight CT scans different from the above data sets were retained. Two of them were real lung CT scans. One of them had malignant lung cancer lesions, and the other did not. These two scans were marked as MAL and BEN. In addition, one, two, and three large nodules were injected into three scans, respectively. The three scans were marked as INJ1, INJ2, and INJ3. Similarly, one, two, and three large nodules were removed from the remaining three scans marked as REM1, REM2, and REM3.

Furthermore, for whole-image forgery, we use a CycleGAN trained on LUNA16 to generate the attack dataset. We first add impulse noise and Gaussian noise to 5000 CT slice images, then use CycleGAN to remove the noise. In the end, 5000 images modified by CycleGAN were obtained. The images without noise and the images denoised by CycleGAN, 10,000 slice images in total, are marked as the data set CycleGAN. We mark the images denoised by CycleGAN as the positive class and the images without noise as the negative class. Among them, 8000 images are used as the training set, 1000 as the validation set, and 1000 as the test set.

For each slice image in the test set, we traverse the whole CT image (512 × 512) with a 32 × 32 window and a stride of 4 pixels, using Eqs (4) and (6). For each slice image of the training set and the validation set, we shift the window by one pixel for data enhancement so that each slice image can generate 25 sub-images. For a fake image (positive class), we mark the coordinates of the injection center point as (0,0) and take the 25 coordinate points in the rectangle from (-2,-2) to (2,2). Using these coordinate points as center points, we cut out 25 sub-images of size 32 × 32. For a real image (negative class), we take 10 coordinate points in the rectangle from (-2,-2) to (-1,2), and then randomly select 20 different coordinates from the coordinates calculated by Eqs (4) and (6). Taking these coordinate points as centers, we cut out 25 sub-images of size 32 × 32. By adding n negative samples corresponding to the positive samples, the model can better learn the difference between the images before and after tampering. We found that the model performs better when n = 10. In the experiments, the classifier model can be any algorithm; here we choose SVM as an example.
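
The window traversal and the one-pixel-shift augmentation can be sketched as follows. This is a minimal illustration and not the released implementation: Eqs (4) and (6) are not reproduced here, so the sketch assumes that windows are not padded and must fit entirely inside the image, and the function names `window_origins` and `shifted_crops` are hypothetical.

```python
def window_origins(image_size=512, window=32, stride=4):
    """Top-left corners of every window that fits entirely inside the image.

    Assumption: no padding, so the last window starts at image_size - window.
    """
    steps = range(0, image_size - window + 1, stride)
    return [(row, col) for row in steps for col in steps]


def shifted_crops(image, center, window=32, max_shift=2):
    """Cut one sub-image per offset in the square from (-max_shift, -max_shift)
    to (max_shift, max_shift) around `center`, given as (row, col).

    Assumption: `center` is far enough from the border for every crop to fit.
    """
    half = window // 2
    crops = []
    for d_row in range(-max_shift, max_shift + 1):
        for d_col in range(-max_shift, max_shift + 1):
            top = center[0] + d_row - half
            left = center[1] + d_col - half
            crops.append([row[left:left + window]
                          for row in image[top:top + window]])
    return crops


# A synthetic 512 x 512 "slice" stored as nested lists of gray values.
image = [[(r * 512 + c) % 256 for c in range(512)] for r in range(512)]

origins = window_origins()
crops = shifted_crops(image, center=(256, 300))
print(len(origins), len(crops), len(crops[0]), len(crops[0][0]))
```

Under these assumptions a stride of 4 gives 121 window positions per axis, so each test slice is scored on 121 × 121 sub-images, while each training slice contributes 25 shifted crops.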

2) Setup

All experiments were implemented using the TensorFlow 1.13 framework and were trained on a single NVIDIA GeForce RTX 2080 Ti GPU. The parameters of the training phase are as follows. We set the initial learning rate to 0.0005 and use exponential decay, which decays every 600 steps with a decay rate of 0.85. The mini-batch size is 56, the batch normalization decay parameter is 0.95, and the L2 regularization weight decay parameter is 0.0001. We use the Adam optimizer to minimize the cross-entropy loss. Except for the learning rate, the default parameters of the Adam optimizer are used, namely β1 = 0.9, β2 = 0.999, ϵ = 1 × 10^−8. Early stopping ends training when the accuracy on the validation set has not increased for three consecutive epochs. If early stopping is not triggered, training stops after 30 epochs.
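
The learning-rate schedule and the stopping rule can be written out in plain Python. This is a sketch under assumptions: the text does not say whether the decay is applied in discrete steps, so `staircase=True` (a drop every 600 steps) is a guess that can be switched off, and both function names are hypothetical.

```python
def learning_rate(step, initial=0.0005, decay_rate=0.85, decay_steps=600,
                  staircase=True):
    """Exponentially decayed learning rate at a given training step.

    Assumption: with staircase=True the rate drops once every `decay_steps`
    steps; with staircase=False it decays continuously.
    """
    exponent = step // decay_steps if staircase else step / decay_steps
    return initial * decay_rate ** exponent


def stopping_epoch(val_accuracies, patience=3, max_epochs=30):
    """1-based epoch after which training stops.

    Training stops once validation accuracy has failed to improve for
    `patience` consecutive epochs, or after `max_epochs` at the latest.
    """
    best = float("-inf")
    stale = 0
    for epoch, accuracy in enumerate(val_accuracies[:max_epochs], start=1):
        if accuracy > best:
            best, stale = accuracy, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return min(len(val_accuracies), max_epochs)


for step in (0, 600, 1200):
    print(step, learning_rate(step))
print(stopping_epoch([0.80, 0.85, 0.84, 0.85, 0.83]))
```

In the example, accuracy peaks at the second epoch and does not improve in the next three, so training stops after the fifth epoch.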

3) Evaluation

We regard the tampered slice image as a positive example and the real slice image as a negative example. The number of positive and negative samples in the actual scene may differ. Therefore, in addition to accuracy (ACC), we also use precision (P), recall (R) and F1-score (F1) to evaluate the model’s performance. The tampering operation of CT-GAN is aimed at 3D medical images. The number of slices involved in a tampering operation can easily exceed 30, and it is larger when the slice interval is small. However, if the same region of 10 consecutive slices is predicted as tampered, we can easily judge that this position has been tampered with. If the above indicators are calculated in units of 2D slice images, the values will be low, which is unreasonable. Therefore, when detecting a complete CT scan, this paper also takes the 3D tampered region (a series of slice images) as the unit and counts the indicators in the following way.

For a tampered region, when 9 or more of 10 consecutive slices that include the tampered central slice (these slices must be positive examples in this experiment) are judged as positive examples, we consider that the tampering trace is accurately found and mark the region as a true positive. Otherwise, the region is regarded as a missed detection and marked as a false negative. A region that was not tampered with but is judged as tampered is regarded as a false positive. Finally, the precision, recall and F1-score are calculated in the above way.
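
The region-level rule can be sketched as a check over per-slice predictions. The sketch reads the rule as "some run of 10 consecutive slices that contains the central slice has at least 9 positives"; that reading, the function name `region_detected`, and the example prediction lists are assumptions made for illustration.

```python
def region_detected(predictions, center, window=10, min_hits=9):
    """Return True if some run of `window` consecutive slices that contains
    index `center` has at least `min_hits` slices predicted as tampered.

    `predictions` holds one 0/1 label per slice, in scan order.
    """
    first_start = max(0, center - window + 1)
    last_start = min(center, len(predictions) - window)
    for start in range(first_start, last_start + 1):
        if sum(predictions[start:start + window]) >= min_hits:
            return True
    return False


# Two hypothetical tampered regions, both centred on slice index 7.
dense = [0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0]
sparse = [0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0]
regions = [(dense, 7), (sparse, 7)]

true_positives = sum(region_detected(p, c) for p, c in regions)
false_negatives = len(regions) - true_positives
print(true_positives, false_negatives)
```

The first region contains a run of 10 slices with 9 positives and counts as a true positive; the second never exceeds 7 positives in any such run and counts as a false negative.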

4.2 Ablation study

To verify the effectiveness of each module in our method, we conducted ablation studies. Four experiments were used to verify the effectiveness of local detection, the attention mechanism, the SELU activation function, and GLCM feature extraction. In each experiment, we ablate one module from our method. Ablating local detection means feeding the complete slice image (512 × 512) to our network for training and prediction directly. The experimental results are shown in Table 1.

Table 1. The ablation study result of our method.

“-SW” means that sliding windows are not used and the whole image is classified directly without local detection. “-Attention” means that the attention mechanism is not used. “-Selu” means that ReLU is used instead of SELU in our network. “-GLCM” means that PCA and SVM are used directly to classify the heatmap, without GLCM feature extraction.

Ablated module ACC P R F1
Ours-SW 0.6583 0.6554 0.6660 0.6607
Ours-Attention 0.9158 0.9561 0.8717 0.9119
Ours-Selu 0.9192 0.9499 0.8850 0.9163
Ours-GLCM 0.8717 0.8729 0.8700 0.8715
Ours 0.9350 0.9628 0.9050 0.9330

The experimental results show that using the sliding window to divide the image into sub-images for local detection is very helpful for detecting GAN-based small region forgery attacks like CT-GAN and significantly improves detection performance. The SELU activation function and the attention mechanism each give a slight improvement. Moreover, the performance of our method is significantly improved by using GLCM. The best performance is obtained when all of the above modules are used together.
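
To make the GLCM step concrete, the sketch below computes a gray-level co-occurrence matrix and three common statistics for a small heatmap. It is only an illustration: the number of gray levels, the pixel offset, and the choice of statistics are assumptions rather than the settings used in the paper, and all function names are hypothetical. The resulting feature vector would then be passed to the global classifier.

```python
def quantise(heatmap, levels=8):
    """Map scores in [0, 1] to integer gray levels 0 .. levels - 1."""
    return [[min(int(v * levels), levels - 1) for v in row] for row in heatmap]


def glcm(image, levels, d_row=0, d_col=1):
    """Normalised, symmetric co-occurrence matrix for one pixel offset.

    Assumes the image has at least one valid pixel pair for the offset.
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Contrast, angular second moment and homogeneity of a normalised GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    contrast = sum(v * (i - j) ** 2 for i, j, v in cells)
    asm = sum(v * v for _, _, v in cells)
    homogeneity = sum(v / (1 + (i - j) ** 2) for i, j, v in cells)
    return contrast, asm, homogeneity


# A flat heatmap (no detections) and one with a small high-score blob.
flat = [[0.0] * 6 for _ in range(6)]
blob = [row[:] for row in flat]
for r in (2, 3):
    for c in (2, 3):
        blob[r][c] = 0.95

flat_features = glcm_features(glcm(quantise(flat), levels=8))
blob_features = glcm_features(glcm(quantise(blob), levels=8))
print(flat_features)
print(blob_features)
```

A flat heatmap yields zero contrast, while a localized blob of high scores raises contrast and lowers the angular second moment, which is the kind of texture cue a classifier can separate.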

4.3 Detection of CT-GAN inject or remove attack

A general GAN-based small region forgery attack may not, like CT-GAN, use the same GAN structure to train two different models. Therefore, we divided the training set into two parts in the same way as the test sets CTGAN-INJ and CTGAN-REM, then trained a separate detector for the inject model and for the remove model of CT-GAN. After training, we obtained two detectors, one for each tampering model, and tested each on its own kind of tampering. The test results are shown in Table 2.

Table 2. The detection results of CT-GAN inject and remove attacks.

The training and testing of the two are carried out separately.

Test set ACC P R F1
CTGAN-INJ 0.8999 0.9762 0.8200 0.8913
CTGAN-REM 0.9670 0.9937 0.9400 0.9661

The experimental results show that although the training data is reduced, our model can still detect CT-GAN’s inject or remove model with a high F1-score. The detection accuracy and F1-score for the inject tampering model are about 90%, while those for the remove tampering model are about 97%. These results show that our method can still effectively detect the traces of tampering when facing a single tampering model, and that it is more sensitive to the traces of removal.

4.4 Comparison with state-of-the-art detection methods

As mentioned in Section 2.5, other feature extraction methods for detecting GAN-generated images are not suitable for CT-GAN, and all the countermeasures listed in [41] failed to detect the lung nodule forgery. Since the effective methods are all based on XceptionNet and ResNet50, we use these two state-of-the-art deep convolutional neural networks (DCNNs) as baselines. Specifically, existing studies have shown that XceptionNet performs very well in GAN forged image detection, with detection accuracy comparable to the most advanced detection methods. Therefore, we choose XceptionNet as a baseline. In addition, because both XceptionNet and the local detection network take depthwise separable convolution as the main structure, we also select ResNet50 as another baseline. Moreover, inspired by [9], we also tried using the DCT of the sub-image as a feature (Ours-DCT).

1) Detect CT slices

This experiment has two settings. In the first, we use the sliding window, and each network acts as the local detector that predicts on 32 × 32 sub-images. In the second, we train and test in the general way, where the input of the network is the complete 512 × 512 slice image. The training set and test set are as described in Section 4.1. The training information is shown in Fig 6, and the test results are shown in Table 3.

Fig 6. The training accuracy curve.

(a) Input is the whole image (512 × 512). (b) Input is the sub-image (32 × 32).

Table 3. The detection results on CT-GAN for the state-of-the-art methods and ours.

“-W” means whole slice image input. “-DCT” means the local detection network is trained with the DCT features extracted from the sub-image.

Method ACC P R F1
XceptionNet-W 0.5912 0.5738 0.7064 0.6332
XceptionNet 0.7125 0.7986 0.5683 0.6641
ResNet50-W 0.5600 0.5488 0.6690 0.6030
ResNet50 0.6850 0.7643 0.5350 0.6294
Ours-W 0.6583 0.6554 0.6660 0.6607
Ours-DCT 0.8383 0.8512 0.8200 0.8353
Ours 0.9350 0.9628 0.9050 0.9330

Experimental results show that, on the current data set, even the most advanced DCNNs such as XceptionNet and ResNet50 reach a test accuracy and F1-score of only about 65%, which means that it is difficult for them to distinguish CT-GAN tampered images from real images. Their performance improves when they are used as local detectors but is still unsatisfactory, which may be due to overfitting and network degradation. Our method also overfits seriously without the sliding window. With the sliding window, however, the model converges faster, and the accuracy and F1-score of our method rise to about 93%, an increase of about 28 percentage points. All indicators decline when DCT is used as a feature, and the accuracy and F1-score are only about 84% (Table 3).

2) Detect CT scans

To test the performance of our method more comprehensively, we compared it with the latest methods on complete CT scans. The model used is still the one trained on the mixture of inject and remove tampering. The test set consists of the eight scans mentioned above. Fig 7 shows several consecutive CT slice images near the tampering center point and the corresponding heatmaps. Table 4 shows the results of our experiment.

Fig 7. A part of CT slice images and corresponding heatmaps of scans.

(a) is scan INJ1. (b) is scan REM1. The first column of each scan is the tampered CT slice image. The second column is the heatmaps output when XceptionNet is used as the local detector. The third column is the heatmaps output when ResNet50 is used as the local detector. The fourth column is the heatmaps output of our method.

Table 4. The detection results of complete CT scans.

“Spacing” is the distance between two adjacent slice images. “2D” means the indicators are calculated in units of 2D slice images. “3D” means the indicators are calculated in units of 3D tampering regions. “2D” and “3D” do not differ in detection, only in evaluation.

Test set  Spacing (mm)  Method    TP  TN   FP  FN   Accuracy  Precision  Recall  F1-score
BEN       2.5           Xception  0   162  16  0    0.9101    -          -       -
BEN       2.5           Resnet50  0   166  12  0    0.9326    -          -       -
BEN       2.5           Ours-2D   0   178  0   0    1.0       -          -       -
MAL       0.625         Xception  0   433  47  0    0.9101    -          -       -
MAL       0.625         Resnet50  0   413  67  0    0.9326    -          -       -
MAL       0.625         Ours-2D   0   479  1   0    0.9979    -          -       -
INJ1      1.0           Xception  23  161  30  53   0.6891    0.4340     0.3026  0.3566
INJ1      1.0           Resnet50  12  176  15  64   0.7041    0.4444     0.1579  0.2330
INJ1      1.0           Ours-2D   43  182  9   33   0.8427    0.8269     0.5658  0.6719
INJ1      1.0           Ours-3D   1   -    0   0    -         1.0        1.0     1.0
INJ2      2.5           Xception  25  68   1   37   0.7099    0.9615     0.4032  0.5682
INJ2      2.5           Resnet50  21  68   1   41   0.6794    0.9545     0.3387  0.50
INJ2      2.5           Ours-2D   15  69   0   47   0.6412    1.0        0.2419  0.3896
INJ2      2.5           Ours-3D   2   -    0   0    -         1.0        1.0     1.0
INJ3      0.625         Xception  39  290  14  137  0.6854    0.7358     0.2216  0.3406
INJ3      0.625         Resnet50  32  291  13  144  0.6729    0.7111     0.1818  0.2896
INJ3      0.625         Ours-2D   79  304  0   97   0.7979    1.0        0.4489  0.6196
INJ3      0.625         Ours-3D   2   -    0   0    -         1.0        1.0     1.0
REM1      1.8           Xception  11  88   2   55   0.6346    0.8462     0.1667  0.2785
REM1      1.8           Resnet50  7   89   1   59   0.6154    0.8750     0.1061  0.1892
REM1      1.8           Ours-2D   42  90   0   24   0.8462    1.0        0.6364  0.7778
REM1      1.8           Ours-3D   1   -    0   0    -         1.0        1.0     1.0
REM2      1.8           Xception  41  88   0   64   0.6346    0.8462     0.1667  0.2785
REM2      1.8           Resnet50  30  86   2   75   0.6010    0.9375     0.2857  0.4380
REM2      1.8           Ours-2D   80  88   0   25   0.8705    1.0        0.7619  0.8649
REM2      1.8           Ours-3D   2   -    0   0    -         1.0        1.0     1.0
REM3      2.5           Xception  20  50   1   61   0.5303    0.9524     0.2469  0.3922
REM3      2.5           Resnet50  19  50   1   62   0.5227    0.950      0.2346  0.3762
REM3      2.5           Ours-2D   59  51   0   22   0.8333    1.0        0.7284  0.8429
REM3      2.5           Ours-3D   3   -    0   0    -         1.0        1.0     1.0
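
The slice-level ("2D") indicators follow from the confusion counts by the standard definitions. As a short worked example, the sketch below recomputes the four indicators for the INJ1 scan with our method (TP = 43, TN = 182, FP = 9, FN = 33); the function name `slice_metrics` is illustrative.

```python
def slice_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 from slice-level confusion counts.

    Precision, recall and F1 are returned as None when they are undefined,
    for example on a scan that contains no tampered slices.
    """
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else None
    recall = tp / (tp + fn) if tp + fn else None
    if precision is None or recall is None or precision + recall == 0:
        f1 = None
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# INJ1 scan, Ours-2D.
acc, prec, rec, f1 = slice_metrics(tp=43, tn=182, fp=9, fn=33)
print(round(acc, 4), round(prec, 4), round(rec, 4), round(f1, 4))

# BEN scan, Ours-2D: no tampered slices, so only accuracy is defined.
print(slice_metrics(tp=0, tn=178, fp=0, fn=0))
```

On real scans such as BEN and MAL there are no positive slices, which is why only accuracy is reported for them.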

The experimental results show that our model can effectively find the traces of CT-GAN tampering and is more stable than other methods. Our method can automatically determine whether a scan has been tampered with through a simple strategy. For example, when any n of m consecutive images are classified as positive, the scan is considered to have been tampered with by CT-GAN. Therefore, even though a CT scan is three-dimensional while our model is two-dimensional, our model can effectively assist in distinguishing whether a CT scan has been tampered with.

In addition, for CT scans with smaller slice spacing, such as INJ1, INJ3, REM1 and REM2, our method can detect more consecutive positive samples (more than 15). However, when the slice spacing is larger, the continuous positive samples that the model can detect are fewer.

Furthermore, many misjudgments will occur in places that are unrelated to lung nodules, such as folds of clothes and calcified muscle tissue, which doctors can easily identify.

5 Discussion and limitations

1) The correlation of sub-images

In our method, we did not fully consider the correlation of sub-images, even though attacks like CT-GAN forge 3D medical imagery. Although the current experimental results meet the requirements of the detection task, we believe that introducing this correlation into the detection task will probably help improve the detection accuracy, efficiency, or generalization. Hence, in the future, we plan to conduct more extensive experiments to find the correlation between these sub-images, for example, whether there is potentially hidden information between adjacent sub-images in the tampered region and how this hidden information helps to improve the performance.

2) Efficiency

Since our method divides each medical image into multiple sub-images through a sliding window, the increased number of targets reduces the detection efficiency. Because a single sub-image is processed quickly, the overall time needed to detect a complete scan of the lungs is acceptable. As mentioned before, the main reason for using the sliding window is that the tampered region occupies only a small fraction of a normal image, which causes existing detection methods that treat the target image as a whole to fail. Hence, the proposed method is the better choice from the detection point of view, as it is the only detection method for the CT-GAN attack. It is worth noting that, although our method demonstrates high accuracy when the forgery attack is applied to the entire image, its efficiency still lags behind methods that treat the target as a single unit, such as [6].

3) Generalization

The two most commonly used GAN structures in the medical image field are pix2pix and CycleGAN. CT-GAN uses the pix2pix structure, and our experiments have demonstrated the accuracy of our method in detecting forged medical images generated by CT-GAN. On the other hand, many studies are based on the CycleGAN structure, for example, synthesizing missing PET from MRI [29], learning automatic X-ray image parsing from labeled CT scans [30], automatic tumor segmentation [31], synthesizing medical images [32, 33], and reconstructing CT [34]. Therefore, we chose CycleGAN to construct another data set to examine the performance of our method. The CycleGAN data set is described in Section 4.1. The medical images denoised by CycleGAN can be regarded as entirely generated by CycleGAN. The test results show that our method can distinguish CycleGAN tampered medical images from real medical images with 99.8% accuracy. In addition, if we do not use machine learning but instead use some fixed pattern for global classification, it is challenging to classify both GAN-based small region forgeries and images wholly generated by GAN.

CycleGAN and pix2pix are the two most commonly used GAN structures in medical image synthesis. Our method can effectively detect the images generated by CycleGAN and pix2pix. Although our method can detect images generated by the same or a similar GAN, its detection performance on other GAN models that are not in the training set is not as good. Many studies [2, 5, 6, 8–10, 49] aim to improve the generalization ability of GAN detection methods. This may be achieved by studying the common defects of CNNs or GANs. For example, Wang et al. [49] tested multiple recent image generation models and found that the images generated by today's CNNs share certain common defects. Chai et al. [2] summarize which parts are likely to cause GAN-generated face images to be recognized. However, the above studies did not take GAN-based small region forgery attacks into consideration. How to combine these studies with GAN-based small region forgery attacks is still an open problem. We plan to study this in the future.

4) Traditional vs. GAN-based tampering

As mentioned in Section 3.2, our work focuses on detecting GAN-based tampering. There are three reasons for this.

First, traditional methods like copy-move and image-splicing are commonly used in image forgeries. By these methods, the attacker can duplicate content within the same image or from one image to another to cover up, add or modify something [50]. Many works are already focused on how to detect traditional image forgery, and these techniques can effectively identify image tampering attacks.

More importantly, both copy-move and image-splicing are performed in 2D using image software such as Photoshop. In contrast, CT scans are 3D images composed of many 2D slices of the body taken over the axial plane (from front to back) along the body. The human body is complex and diverse in 3D views, making it difficult to inject or remove cancers and tumours realistically, since they are usually attached to nearby anatomy. Moreover, CT scanners have distinct localized noise patterns that are visually noticeable [51]. Copy-move or image-splicing will raise suspicion under the supervision of a specialist radiologist. Hence, identifying traditional medical image tampering is not a challenging problem.

It is also important to note that the tampering attacks need to automate the entire process since the radiologist will make a diagnosis immediately after performing the scan. However, the traditional tampering method can only partially be automated.

6 Conclusion

We propose a new method to detect GAN-based small region forgery attacks in medical images. Constrained GAN-based small region forgery attacks that target medical images, like CT-GAN, are challenging to detect for existing models that take whole images as input. Our two-stage cascade framework uses a sliding window to train and test a light neural network in units of sub-images, then makes the global classification with hyperplanes in an infinite-dimensional space. Experiments show that our method can detect SOTA GAN-based tampering traces more accurately than other detection methods on the same data set.

Data Availability

All the code and data can be found here: https://github.com/BESTICSP/CT-GAN-Detector.

Funding Statement

This work is supported by the Fundamental Research Funds for the Central Universities (328202204). There was no additional external funding received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1.I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, et al., “Generative adversarial nets,” 2014.
  • 2.L. Chai, D. Bau, S.-N. Lim, and P. Isola, “What makes fake images detectable? understanding properties that generalize,” in European Conference on Computer Vision. Springer, 2020, pp. 103–120.
  • 3.A. Rossler, D. Cozzolino, L. Verdoliva, C. Riess, J. Thies, and M. Nießner, “Faceforensics++: Learning to detect manipulated facial images,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 1–11.
  • 4.F. Marra, D. Gragnaniello, D. Cozzolino, and L. Verdoliva, “Detection of gan-generated fake images over social networks,” in 2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), 2018, pp. 384–389.
  • 5.S. McCloskey and M. Albright, “Detecting gan-generated imagery using saturation cues,” in 2019 IEEE International Conference on Image Processing (ICIP), 2019, pp. 4584–4588.
  • 6. Zhang K., Liang Y., Zhang J., Wang Z., and Li X., “No one can escape: A general approach to detect tampered and generated image,” IEEE Access, vol. 7, pp. 129 494–129 503, 2019.
  • 7.N. Yu, L. S. Davis, and M. Fritz, “Attributing fake images to gans: Learning and analyzing gan fingerprints,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 7556–7566.
  • 8. Nataraj L., Mohammed T. M., Manjunath B., Chandrasekaran S., Flenner A., Bappy J. H., and Roy-Chowdhury A. K., “Detecting gan generated fake images using co-occurrence matrices,” Electronic Imaging, vol. 2019, no. 5, pp. 532–1, 2019.
  • 9.J. Frank, T. Eisenhofer, L. Schönherr, A. Fischer, D. Kolossa, and T. Holz, “Leveraging frequency analysis for deep fake image recognition,” in International Conference on Machine Learning. PMLR, 2020, pp. 3247–3258.
  • 10.R. Durall, M. Keuper, and J. Keuper, “Watch your up-convolution: Cnn based generative deep neural networks are failing to reproduce spectral distributions,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 7890–7899.
  • 11.F. Marra, C. Saltori, G. Boato, and L. Verdoliva, “Incremental learning for the detection and classification of gan-generated images,” in 2019 IEEE International Workshop on Information Forensics and Security (WIFS). IEEE, 2019, pp. 1–6.
  • 12.D. Cozzolino, J. Thies, A. Rössler, C. Riess, M. Nießner, and L. Verdoliva, “Forensictransfer: Weakly-supervised domain adaptation for forgery detection,” arXiv preprint arXiv:1812.02510, 2018.
  • 13.Y. Mirsky, T. Mahler, I. Shelef, and Y. Elovici, “Ct-gan: Malicious tampering of 3d medical imagery using deep learning,” in 28th USENIX Security Symposium (USENIX Security 19), 2019, pp. 461–478.
  • 14.M. Arjovsky, S. Chintala, and L. Bottou, “Wasserstein generative adversarial networks,” in International conference on machine learning. PMLR, 2017, pp. 214–223.
  • 15.T. Karras, T. Aila, S. Laine, and J. Lehtinen, “Progressive growing of gans for improved quality, stability, and variation,” 2017.
  • 16.T. Karras, S. Laine, and T. Aila, “A style-based generator architecture for generative adversarial networks,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 4401–4410.
  • 17.P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, “Image-to-image translation with conditional adversarial networks,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 1125–1134.
  • 18.J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, “Unpaired image-to-image translation using cycle-consistent adversarial networks,” in Proceedings of the IEEE international conference on computer vision, 2017, pp. 2223–2232.
  • 19.D. Jin, Z. Xu, Y. Tang, A. P. Harrison, and D. J. Mollura, “Ct-realistic lung nodule simulation from 3d conditional generative adversarial networks for robust lung segmentation,” in International Conference on Medical Image Computing and Computer Assisted Intervention—MICCAI 2018, Springer, 2018, pp. 732–740.
  • 20.D. Mahapatra, B. Bozorgtabar, J.-P. Thiran, and M. Reyes, “Efficient active learning for image classification and segmentation using a sample selection and conditional generative adversarial network,” in Medical Image Computing and Computer Assisted Intervention—MICCAI 2018, 2018, pp. 580–588.
  • 21.A. Madani, M. Moradi, A. Karargyris, and T. Syeda-Mahmood, “Semi-supervised learning with generative adversarial networks for chest x-ray classification with ability of data domain adaptation,” in 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), 2018, pp. 1038–1042.
  • 22.M. J. M. Chuquicusma, S. Hussein, J. Burt, and U. Bagci, “How to fool radiologists with generative adversarial networks? a visual turing test for lung cancer diagnosis,” in 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), 2018, pp. 240–244.
  • 23.F. Tom and D. Sheet, “Simulating patho-realistic ultrasound images using deep generative networks with adversarial learning,” in 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), 2018, pp. 1174–1177.
  • 24.B. Cao, H. Zhang, N. Wang, X. Gao, and D. Shen, “Auto-gan: self-supervised collaborative learning for medical image synthesis,” in Proceedings of the AAAI conference on artificial intelligence, vol. 34, no. 07, 2020, pp. 10 486–10 493.
  • 25. Yang Q., Yan P., Zhang Y., Yu H., Shi Y., Mou X., et al., “Low-dose ct image denoising using a generative adversarial network with wasserstein distance and perceptual loss,” IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1348–1357, 2018. doi: 10.1109/TMI.2018.2827462
  • 26. You C., Yang Q., Shan H., Gjesteby L., Li G., Ju S., et al., “Structurally-sensitive multi-scale deep neural network for low-dose ct denoising,” IEEE Access, vol. 6, pp. 41 839–41 855, 2018. doi: 10.1109/ACCESS.2018.2858196
  • 27. Shan H., Zhang Y., Yang Q., Kruger U., Kalra M. K., Sun L., et al., “3-d convolutional encoder-decoder network for low-dose ct via transfer learning from a 2-d trained network,” IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1522–1534, 2018. doi: 10.1109/TMI.2018.2832217
  • 28. Ran M., Hu J., Chen Y., Chen H., Sun H., Zhou J., et al., “Denoising of 3d magnetic resonance images using a residual encoder–decoder wasserstein generative adversarial network,” Medical Image Analysis, vol. 55, pp. 165–180, 2019. doi: 10.1016/j.media.2019.05.001
  • 29.Y. Pan, M. Liu, C. Lian, T. Zhou, Y. Xia, and D. Shen, “Synthesizing missing pet from mri with cycle-consistent generative adversarial networks for alzheimer’s disease diagnosis,” in Medical Image Computing and Computer Assisted Intervention – MICCAI 2018, 2018, pp. 455–463.
  • 30.Y. Zhang, S. Miao, T. Mansi, and R. Liao, “Task driven generative modeling for unsupervised domain adaptation: Application to x-ray image segmentation,” in Medical Image Computing and Computer Assisted Intervention—MICCAI 2018, A. F. Frangi, J. A. Schnabel, C. Davatzikos, C. Alberola-López, and G. Fichtinger, Eds. Springer International Publishing, 2018, pp. 599–607.
  • 31.J. Jiang, Y.-C. Hu, N. Tyagi, P. Zhang, A. Rimner, G. S. Mageras, et al., “Tumor-aware, adversarial domain adaptation from ct to mri for lung cancer segmentation,” in Medical Image Computing and Computer Assisted Intervention – MICCAI 2018, 2018, pp. 777–785.
  • 32.Z. Zhang, L. Yang, and Y. Zheng, “Translating and segmenting multimodal medical volumes with cycle-and shape-consistency generative adversarial network,” in Proceedings of the IEEE conference on computer vision and pattern Recognition, 2018, pp. 9242–9251.
  • 33.R. Oulbacha and S. Kadoury, “Mri to ct synthesis of the lumbar spine from a pseudo-3d cycle gan,” in 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI), 2020, pp. 1784–1787.
  • 34.X. Ying, H. Guo, K. Ma, J. Wu, Z. Weng, and Y. Zheng, “X2ct-gan: reconstructing ct from biplanar x-rays with generative adversarial networks,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 10 619–10 628.
  • 35. Ma Y., Liu J., Liu Y., Fu H., Hu Y., Cheng J., et al., “Structure and illumination constrained gan for medical image enhancement,” IEEE Transactions on Medical Imaging, vol. 40, no. 12, pp. 3955–3967, 2021. doi: 10.1109/TMI.2021.3101937
  • 36.J. Zhu, G. Yang, and P. Lio, “How can we make gan perform better in single medical image super-resolution? a lesion focused multi-scale approach,” in 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019). IEEE, 2019, pp. 1669–1673.
  • 37. de Farias E. C., Di Noia C., Han C., Sala E., Castelli M., and Rundo L., “Impact of gan-based lesion-focused medical image super-resolution on the robustness of radiomic features,” Scientific reports, vol. 11, no. 1, pp. 1–12, 2021. doi: 10.1038/s41598-021-00898-z
  • 38.K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778.
  • 39.F. Chollet, “Xception: Deep learning with depthwise separable convolutions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 1251–1258.
  • 40.M. Tan and Q. Le, “Efficientnet: Rethinking model scaling for convolutional neural networks,” in Proceedings of the 36th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, vol. 97. PMLR, 09–15 Jun 2019, pp. 6105–6114.
  • 41. Mirsky Y. and Lee W., “The creation and detection of deepfakes: A survey,” ACM Computing Surveys (CSUR), vol. 54, no. 1, pp. 1–41, 2021. doi: 10.1145/3425780
  • 42.O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, 2015, pp. 234–241.
  • 43.C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the inception architecture for computer vision,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 2818–2826.
  • 44.A. Howard, M. Sandler, B. Chen, W. Wang, L. Chen, M. Tan, et al., “Searching for mobilenetv3,” in 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 1314–1324.
  • 45.S. Woo, J. Park, J.-Y. Lee, and I. S. Kweon, “Cbam: Convolutional block attention module,” ser. Computer Vision—ECCV 2018. Springer International Publishing, Conference Proceedings, pp. 3–19.
  • 46.G. Klambauer, T. Unterthiner, A. Mayr, and S. Hochreiter, “Self-normalizing neural networks,” 2017.
  • 47. Setio A. A. A., Traverso A., De Bel T., Berens M. S., Van Den Bogaard C., Cerello P., et al., “Validation, comparison, and combination of algorithms for automatic detection of pulmonary nodules in computed tomography images: the luna16 challenge,” Medical image analysis, vol. 42, pp. 1–13, 2017. doi: 10.1016/j.media.2017.06.015 [DOI] [PubMed] [Google Scholar]
  • 48.B. Reichman, L. Jing, O. Akin, and Y. Tian, “Medical image tampering detection: A new dataset and baseline,” in International Conference on Pattern Recognition. Springer, 2021, pp. 266–277.
  • 49.S.-Y. Wang, O. Wang, R. Zhang, A. Owens, and A. A. Efros, “Cnn-generated images are surprisingly easy to spot… for now,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 8695–8704.
  • 50. Sadeghi S., Dadkhah S., Jalab H. A., Mazzola G., and Uliyan D., “State of the art in passive digital image forgery detection: copy-move image forgery,” Pattern Analysis and Applications, vol. 21, no. 2, pp. 291–306, 2018. doi: 10.1007/s10044-017-0678-8 [DOI] [Google Scholar]
  • 51. Duan Y., Bouslimi D., Yang G., Shu H., and Coatrieux G., “Computed tomography image origin identification based on original sensor pattern noise and 3-d image reconstruction algorithm footprints,” IEEE journal of biomedical and health informatics, vol. 21, no. 4, pp. 1039–1048, 2016. doi: 10.1109/JBHI.2016.2575398 [DOI] [PubMed] [Google Scholar]

Decision Letter 0

Ali Mohammad Alqudah

29 May 2023

PONE-D-23-02950

GAN-based Medical Image Small Region Forgery Detection via a Two-Stage Cascade Framework

PLOS ONE

Dear Dr. Zhang,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Jul 13 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Ali Mohammad Alqudah

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Thank you for stating in your Funding Statement: 

"Yes. Supported by the Fundamental Research Funds for the Central Universities  (328202204). For Jianyi Zhang. China"

Please provide an amended statement that declares *all* the funding or sources of support (whether external or internal to your organization) received during this study, as detailed online in our guide for authors at http://journals.plos.org/plosone/s/submit-now.  Please also include the statement “There was no additional external funding received for this study.” in your updated Funding Statement. 

Please include your amended Funding Statement within your cover letter. We will change the online submission form on your behalf.

3. Thank you for stating the following financial disclosure: 

"Yes. Supported by the Fundamental Research Funds for the Central Universities  (328202204). For Jianyi Zhang. China"

Please state what role the funders took in the study. If the funders had no role, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

If this statement is not correct you must amend it as needed. 

Please include this amended Role of Funder statement in your cover letter; we will change the online submission form on your behalf.

4. Thank you for stating the following in your Competing Interests section:  

"NO authors have competing interests"

Please complete your Competing Interests on the online submission form to state any Competing Interests. If you have no competing interests, please state "The authors have declared that no competing interests exist.", as detailed online in our guide for authors at http://journals.plos.org/plosone/s/submit-now

 This information should be included in your cover letter; we will change the online submission form on your behalf.

5. We note that you have stated that you will provide repository information for your data at acceptance. Should your manuscript be accepted for publication, we will hold it until you provide the relevant accession numbers or DOIs necessary to access your data. If you wish to make changes to your Data Availability statement, please describe these changes in your cover letter and we will update your Data Availability statement to reflect the information you provide.

6. We note that Figure 3 in your submission contains a copyrighted image. All PLOS content is published under the Creative Commons Attribution License (CC BY 4.0), which means that the manuscript, images, and Supporting Information files will be freely available online, and any third party is permitted to access, download, copy, distribute, and use these materials in any way, even commercially, with proper attribution. For more information, see our copyright guidelines: http://journals.plos.org/plosone/s/licenses-and-copyright.

We require you to either (1) present written permission from the copyright holder to publish these figures specifically under the CC BY 4.0 license, or (2) remove the figures from your submission:

a. You may seek permission from the original copyright holder of Figure 3 to publish the content specifically under the CC BY 4.0 license. 

We recommend that you contact the original copyright holder with the Content Permission Form (http://journals.plos.org/plosone/s/file?id=7c09/content-permission-form.pdf) and the following text:

“I request permission for the open-access journal PLOS ONE to publish XXX under the Creative Commons Attribution License (CCAL) CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). Please be aware that this license allows unrestricted use and distribution, even commercially, by third parties. Please reply and provide explicit written permission to publish XXX under a CC BY license and complete the attached form.”

Please upload the completed Content Permission Form or other proof of granted permissions as an "Other" file with your submission.

In the figure caption of the copyrighted figure, please include the following text: “Reprinted from [ref] under a CC BY license, with permission from [name of publisher], original copyright [original copyright year].”

b. If you are unable to obtain permission from the original copyright holder to publish these figures under the CC BY 4.0 license or if the copyright holder’s requirements are incompatible with the CC BY 4.0 license, please either i) remove the figure or ii) supply a replacement figure that complies with the CC BY 4.0 license. Please check copyright information on all replacement figures and update the figure caption with source information. If applicable, please specify in the figure caption text when a figure is similar but not identical to the original image and is therefore for illustrative purposes only.

7. Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Additional Editor Comments:

The authors respond very well to reviewer comments, however, the paper still needs some editing to make it publishable. The paper has many grammatical and writing issues that made it challenging to read, and I highly encourage the authors to proofread the paper. Also, please make sure that all figures are cited correctly.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

********** 

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

********** 

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

********** 

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

********** 

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors have done good work on the title “GAN-based Medical Image Small Region Forgery Detection via a Two-Stage Cascade Framework”. It will add new knowledge and new areas of research to the subject area compared with other published material.

However, I have some minor concerns:

1. The authors have inserted Fig. 1 in the introduction section, but it was not cited inside the manuscript. Kindly check the guideline of the PLOS ONE journal and perform the required amendment.

2. It would be more appropriate for the authors to define abbreviations upon first appearance in the main text such as PCA, GAN, CAD, GLCM, CCPA, GDPR and CV field; principal component analysis (PCA), generative adversarial network (GAN), computer-aided diagnosis (CAD), gray-level co-occurrence matrix (GLCM), California consumer privacy act (CCPA) and general data protection regulation (GDPR) and computer vision (CV), respectively.

3. In the section of “Challenges”, it would be more appropriate for the authors to insert Fig. 4 immediately after the paragraph in which the figure is cited.

4. In the following sentence “That is why even state-of-the-art methods are challenging to detect the tampering trace of CT-GAN. Because of this, many methods based on statistical characteristics, such as [6, 10, 11]”, it would be more appropriate for the authors to detail the methods with their related citation.

5. In the section of “Local detection network architecture”, it would be more appropriate for the authors to insert Fig. 5 immediately after the paragraph in which the figure is cited.

6. In the section of “Detect CT scans”, it would be more appropriate for the authors to insert Fig. 8 immediately after the paragraph in which the figure is cited.

7. The following sentence “When the forgery attack is applied to the whole image, although we notice that our method can also detect the tampered region with high accuracy, the efficiency still cannot catch up with those methods that treat the target as a single unit, such as [7]”, seemed as incomplete sentence. Kindly check it and perform the required amendment.

8. In the following sentence “The works such as [29–34] are based on the CycleGAN structure”, it would be more appropriate for the authors to briefly indicate the works with their related citation.

9. In the section “Evaluation”, the following sentence “Similarly, for a real region, suppose 9 or more consecutive slices in the real slice are judged as positive examples, or 9 of the 10 consecutive slices are judged as positive examples”, seemed as redundant and hanged sentence. Kindly check it and perform the required amendment.

10. Moderate editing is required throughout the manuscript, for example:

a. “The design of residual block refers to ResNet [39]. The number of convolution kernels channels is unchanged when the input feature image size is the same as the output”. Moderate editing is required.

b. “At the end of our network. Selu [47] is used as the activation function in the full connection layer of the network”. Moderate editing is required.

c. “The crop operation is because the practical part of the CT image is only the inside of a circle tangent to the square frame.” Moderate editing is required.

d. “Therefore, after local detection of the tampered slice, a series of sub-images will be judged as positive in theory. Therefore, the tampering trace of CT-GAN can be detected through some fixed modes, such as the vertical and horizontal continuous n sub-images are judged to be false”. Moderate editing is required.

e. “However, this detection method is not perfect. The size of the tampered region of GAN is not fixed. The attacker can set the size of the CT-GAN tampered region to a certain extent”. Moderate editing is required.

Best regards,

Dr. Mai Abdel Haleem Abusalah

Faculty of Medical Allied Science,

Zarqa University,

Zarqa, 13110, Jordan.

Tel: +962-796862347

e-mail: ellamomo88@yahoo.com

Reviewer #2: The authors proposed a two-stage cascade framework as a solution to GAN-based medical image small region forgery detection. The authors first train the detector network with small sub-images as the input to recognize real/fake sub-images, and classify all sub-windows in a whole slice or volume to obtain the heatmap. Then a global classification is employed by extracting gray level co-occurrence matrix (GLCM) features from the heatmap and using the SVM for recognition. Experiments show that the proposed method can obtain better results. I think this manuscript can be accepted. On a less significant note, the paper has several grammatical and writing issues that made it challenging to read. I highly encourage the authors to proof-read the paper.
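As a rough illustration of the global classification stage summarized in this comment, the sketch below builds a gray-level co-occurrence matrix from a small quantized heatmap and derives two standard texture features from it. It is a minimal pure-Python sketch under stated assumptions, not the authors' implementation: the function names (`glcm`, `contrast`, `homogeneity`), the single horizontal offset, the two gray levels, and the choice of features are all hypothetical, and the SVM step that would consume these features is omitted.

```python
from typing import List, Sequence


def glcm(image: Sequence[Sequence[int]], levels: int,
         dx: int = 1, dy: int = 0) -> List[List[float]]:
    """Return a symmetric, normalized gray-level co-occurrence matrix.

    `image` holds integer gray levels in [0, levels). The offset (dx, dy)
    is an illustrative assumption, not a value taken from the paper.
    """
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("image too small for the chosen offset")
    return [[v / total for v in row] for row in counts]


def contrast(p: Sequence[Sequence[float]]) -> float:
    """Weighted sum of squared gray-level differences."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def homogeneity(p: Sequence[Sequence[float]]) -> float:
    """Large when co-occurring gray levels are close to each other."""
    n = len(p)
    return sum(p[i][j] / (1 + (i - j) ** 2) for i in range(n) for j in range(n))


# Hypothetical 4x4 heatmap: 1 marks sub-images flagged by the local detector.
heatmap = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
p = glcm(heatmap, levels=2)
features = [contrast(p), homogeneity(p)]  # would be passed to a classifier
```

A feature vector of this kind describes how flagged sub-images cluster in the heatmap without assuming a fixed shape or size for the tampered region.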

********** 

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: MAI ABDELHALEEM A. ABUSALAH

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Attachment

Submitted filename: reviewer comments - Mai-22022023.docx

PLoS One. 2024 Jan 2;19(1):e0290303. doi: 10.1371/journal.pone.0290303.r002

Author response to Decision Letter 0


17 Jul 2023

We warmly thank the reviewers for their insightful comments and constructive feedback. They have been beneficial for us to improve the quality of our paper and better communicate the core aspects and contributions of our method. We have made significant changes to the paper to address all expected requirements for the major revision. Furthermore, we have also addressed all questions and comments raised by the individual reviewers and have added explanations to clarify unclear aspects of our paper. Throughout the paper, we have sought to improve the quality of our writing.

RESPONSES TO INDIVIDUAL REVIEWS

===========================================

REVIEWER # 1

1: The authors have inserted Fig. 1 in the introduction section, but it was not cited inside the manuscript. Kindly check the guideline of the PLOS ONE journal and perform the required amendment.

Authors’ response: We apologize for the oversight in not citing Fig. 1 within the manuscript. We have relocated Fig. 1 to Chapter 3.3 as it is the overview of our method and ensured proper citation according to PLOS ONE guidelines.

2: It would be more appropriate for the authors to define abbreviations upon first appearance in the main text such as PCA, GAN, CAD, GLCM, CCPA, GDPR and CV field; principal component analysis (PCA), generative adversarial network (GAN), computer-aided diagnosis (CAD), gray-level co-occurrence matrix (GLCM), California consumer privacy act (CCPA) and general data protection regulation (GDPR) and computer vision (CV), respectively.

Authors’ response: Thank you for your comment. We have addressed your concern by defining all abbreviations upon their first appearance in the manuscript, including PCA, GAN, CAD, GLCM, CCPA, GDPR, and CV.

3: In the section of “Challenges”, it would be more appropriate for the authors to insert Fig. 4 immediately after the paragraph in which the figure is cited.

Authors’ response: Thank you for your valuable suggestion. We have followed your advice and inserted Fig. 4 immediately after the paragraph where the figure is cited in the "Challenges" section. This modification enhances the readability and coherence of the manuscript. We appreciate your feedback.

4: In the following sentence “That is why even state-of-the-art methods are challenging to detect the tampering trace of CT-GAN. Because of this, many methods based on statistical characteristics, such as [6, 10, 11]”, it would be more appropriate for the authors to detail the methods with their related citation.

Authors’ response: Thank you for your comment. The revised sentence now reads: “That is why even state-of-the-art methods, like the methods based on saturation cues [6], frequency analysis [10], and spectral regularization optimization [11], are challenging to detect the tampering trace of CT-GAN.”

5: In the section of “Local detection network architecture”, it would be more appropriate for the authors to insert Fig. 5 immediately after the paragraph in which the figure is cited.

Authors’ response: We apologize for the issue caused by the automatic typesetting in LaTeX. Based on your suggestion, we have made the necessary modification.

6: In the section of “Detect CT scans”, it would be more appropriate for the authors to insert Fig. 8 immediately after the paragraph in which the figure is cited.

Authors’ response: Thank you for your comment. Based on your suggestion, we have made the necessary adjustment. However, as can be seen, some of the images were automatically adjusted by LaTeX due to their size, preventing them from appearing on the same page. We will make overall modifications based on the editor's feedback during the camera-ready stage for similar cases.

7: The following sentence “When the forgery attack is applied to the whole image, although we notice that our method can also detect the tampered region with high accuracy, the efficiency still cannot catch up with those methods that treat the target as a single unit, such as [7]”, seemed as incomplete sentence. Kindly check it and perform the required amendment.

Authors’ response: Thank you for your comment. We have revised the sentence as follows: “Although our method demonstrates high accuracy in detecting the tampered region when the forgery attack is applied to the entire image, its efficiency still lags behind methods that treat the target as a single unit, such as [7].”

8: In the following sentence “The works such as [29–34] are based on the CycleGAN structure”, it would be more appropriate for the authors to briefly indicate the works with their related citation.

Authors’ response: Thank you for your comment. We have revised the sentence as follows: “On the other hand, many studies are based on the CycleGAN structure. Examples include using CycleGAN to synthesize missing PET from MRI [29], learning automatic X-ray image parsing from labeled CT scans [30], automatic tumor segmentation [31], synthesizing medical images [32, 33], and reconstructing CT [34].”

9: In the section “Evaluation”, the following sentence “Similarly, for a real region, suppose 9 or more consecutive slices in the real slice are judged as positive examples, or 9 of the 10 consecutive slices are judged as positive examples”, seemed as redundant and hanged sentence. Kindly check it and perform the required amendment.

Authors’ response: Thank you for your comment. We have revised the sentence as follows: “For a tampered region, when 9 or more of the 10 consecutive slices, which include the tampered central slice (these slices must be positive examples in this experiment), are judged as positive examples, we consider that the tampering trace is accurately found and marked as a true positive example.”

10: Moderate editing is required throughout the manuscript:

10.a: “The design of residual block refers to ResNet [39]. The number of convolution kernels channels is unchanged when the input feature image size is the same as the output”. Moderate editing is required.

Authors’ response: Thank you for your comment. We have revised the sentence as follows: “The design of the residual block is inspired by ResNet [39], and it maintains the same number of convolutional kernel channels when the input feature image size matches the output size.”

10.b: “At the end of our network. Selu [47] is used as the activation function in the full connection layer of the network”. Moderate editing is required.

Authors’ response: Thank you for your comment. The modified sentence now reads: “In the fully connected layer of our network, we employ the Selu [47] as the activation function.”

10.c: “The crop operation is because the practical part of the CT image is only the inside of a circle tangent to the square frame.” Moderate editing is required.

Authors’ response: Thank you for your comment. The modified sentence now reads: “The reason for performing the crop operation is that the practical part of the CT image corresponds only to the interior of a circle tangential to the square frame.”

10.d: “Therefore, after local detection of the tampered slice, a series of sub-images will be judged as positive in theory. Therefore, the tampering trace of CT-GAN can be detected through some fixed modes, such as the vertical and horizontal continuous $n$ sub-images are judged to be false”. Moderate editing is required.

Authors’ response: Thank you for your comment. The modified sentence now reads: “Consequently, in theory, a series of sub-images will be identified as positive after the local detection of the tampered slice. Hence, fixed patterns such as the false judgment of $n$ vertically and horizontally contiguous sub-images can be utilized to detect tampering traces of CT-GAN.”

10.e: “However, this detection method is not perfect. The size of the tampered region of GAN is not fixed. The attacker can set the size of the CT-GAN tampered region to a certain extent”. Moderate editing is required.

Authors’ response: Thank you for your comment. The modified sentence now reads: “Nevertheless, it is important to note that this detection method is not flawless, as the size of the tampered region in CT-GAN is variable to some extent, allowing the attacker to manipulate it.”
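
The 9-of-10 slice criterion quoted in response 9 above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `region_is_true_positive` and its parameters are hypothetical, and the caller is assumed to have already selected the 10 consecutive slices that contain the tampered central slice (the responses do not say how that window is positioned).

```python
from typing import Sequence


def region_is_true_positive(window_preds: Sequence[bool],
                            window: int = 10,
                            min_positive: int = 9) -> bool:
    """Apply the 9-of-10 rule to one window of per-slice predictions.

    `window_preds` holds one boolean per slice (True = judged tampered)
    for the consecutive slices around the tampered central slice.
    """
    if len(window_preds) != window:
        raise ValueError(f"expected {window} slice predictions")
    return sum(bool(p) for p in window_preds) >= min_positive


# Hypothetical predictions for 10 consecutive slices, one of them missed.
preds = [True, True, True, False, True, True, True, True, True, True]
detected = region_is_true_positive(preds)
```

Keeping the window length and the threshold as parameters makes it easy to see how sensitive the reported true-positive count is to the exact rule.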

===========================================

REVIEWER # 2

On a less significant note, the paper has several grammatical and writing issues that made it challenging to read. I highly encourage the authors to proof-read the paper.

Authors’ response: We have thoroughly proofread the entire paper, addressing the writing issues and grammatical errors as suggested. We have made the necessary amendments to improve the readability and quality of the manuscript. Thank you for pointing out these concerns, and thank you for your comment.

===========================================

Figure 3 in our submission contained a copyrighted image. Since all PLOS content is published under the Creative Commons Attribution License (CC BY 4.0), we deleted this figure.

Attachment

Submitted filename: CT_GAN_PLOS_Rebuttal.pdf

Decision Letter 1

Ali Mohammad Alqudah

7 Aug 2023

GAN-based Medical Image Small Region Forgery Detection via a Two-Stage Cascade Framework

PONE-D-23-02950R1

Dear Dr. Zhang,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Ali Mohammad Alqudah

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors have done good work on the title “GAN-based Medical Image Small Region Forgery Detection via a Two-Stage Cascade Framework”. It will add new knowledge and new areas of research to the subject area compared with other published material.

The authors have adequately addressed all comments and performed the required amendments; hence I highly recommend accepting this interesting article.

Reviewer #2: The authors have thoroughly proofread the entire paper, addressing the writing issues and grammatical errors as suggested. They have made the necessary amendments to improve the readability and quality of the manuscript. The method might be applicable to other images as well.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: MAI ABDEL HALEEM ABUSALAH

Reviewer #2: No

**********

Acceptance letter

Ali Mohammad Alqudah

17 Aug 2023

PONE-D-23-02950R1

GAN-based Medical Image Small Region Forgery Detection via a Two-Stage Cascade Framework

Dear Dr. Zhang:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Ali Mohammad Alqudah

Academic Editor

PLOS ONE

Associated Data


    Supplementary Materials

    Attachment

    Submitted filename: reviewer comments - Mai-22022023.docx

    Attachment

    Submitted filename: CT_GAN_PLOS_Rebuttal.pdf

    Data Availability Statement

    All the code and data can be found here: https://github.com/BESTICSP/CT-GAN-Detector.


    Articles from PLOS ONE are provided here courtesy of PLOS
