Sensors (Basel, Switzerland)
2025 Dec 6;25(24):7432. doi: 10.3390/s25247432

Recent Advances in Deep Learning-Based Source Camera Identification and Device Linking

Zimeng Li 1, Ngai-Fong Law 1,*
Editor: Perry Xiao
PMCID: PMC12737310  PMID: 41471430

Abstract

Photo-response non-uniformity (PRNU) has long been regarded as a reliable method for source camera identification and device linking in forensic applications. Recent advances in deep learning (DL) have introduced diverse architectures, including convolutional neural networks, residual learning, encoder–decoder representations, dual-branch structures, and contrastive learning, to capture specific sensor artifacts. This review summarizes the performance of these DL techniques across both tasks and compares their effectiveness at the model and device levels over time. While DL approaches achieve strong model-level accuracy, robust device-level identification remains challenging, particularly in modern imaging pipelines that involve camera-integrated or AI-driven enhancements during capture. These findings underscore the need for improved techniques and updated datasets to address evolving photograph capture practices.

Keywords: sensor artefacts, photo-response non-uniformity, camera identification, device linking

1. Introduction

The widespread use of digital images in entertainment, social networking, and legal contexts has highlighted the importance of accurately identifying the source camera that captured them. This issue typically arises in two distinct scenarios: (1) verifying the source of an image among several potential cameras and (2) determining whether the same camera took two images. The former is known as source camera identification [1], while the latter is referred to as the device linking problem [2]. Source camera identification is crucial when the set of potential source cameras is known, framing the task as a multi-class classification problem. Conversely, device linking is particularly relevant for verifying associations between online accounts by determining whether their profile images originate from the same camera.

Both source camera identification and device linking usually rely on extracting a unique sensor noise pattern, known as photo-response non-uniformity (PRNU). This statistical pattern is inherent to each camera sensor and is present in every image produced by that sensor [1]. By analyzing PRNU, it becomes possible to determine whether a photograph originated from a specific camera or whether the same device captured the two images.

Traditionally, signal processing-based filtering methods have been used to extract sensor noise from photographs. Although introduced in 2006 [1], the sensor noise-based approach remains widely accepted in forensic and legal contexts for verifying the authenticity of image sources. The emergence of deep learning (DL) techniques has introduced new possibilities for camera identification. Researchers have explored popular architectures such as residual networks, U-Net, and ConvNet for source camera identification. In this context, the task is formulated as a classification problem, where deep learning models are trained to learn forensic features directly from image data, thereby differentiating images from a predefined set of cameras. These methods often address identification at three levels:

  • Camera brand identification: identifying the brand of the camera that captured the photograph.

  • Camera model identification: identifying the camera model that captured the photograph.

  • Camera device identification: identifying the exact device that took the photograph.

Since the first deep learning-based study in 2016 [3], research has consistently shown that while deep learning methods excel at distinguishing camera brands and models, device-level identification remains challenging and requires further investigation. These methods also struggle when encountering previously unseen cameras not included in the training data. Moreover, deep learning approaches typically require a large number of images for training, which may not be feasible in forensic scenarios where only a limited number of images are available.

This constraint becomes particularly critical in device linking tasks, where the goal is to determine whether two images—often the only ones available—were captured by the same camera. In such cases, deep learning methods also attempt to extract camera-specific information from images. Two main approaches have been explored in the literature: one formulates the problem as a few-shot learning task, while the other employs Siamese networks and contrastive learning frameworks to measure similarity between camera-specific artefacts in image pairs. The weak nature of these artefacts makes the linking process particularly challenging.

The primary purpose of this review is to examine state-of-the-art deep learning-based methods for source camera identification and device linking, with a clear distinction between model-level and device-level performance. Recent reviews have primarily focused on source camera identification, leaving device linking comparatively underexplored despite its relevance in forensic and social media contexts. Specifically, prior reviews can be summarized as follows:

  • Ref. [4] surveyed works published before 2021 on source identification using noise patterns in machine learning-based systems.

  • Ref. [5] focused on PRNU and related techniques such as lens radial distortion, color filter array interpolation, and auto-white balance approximation.

  • Ref. [6] reviewed PRNU, statistical methods, and deep learning methods in classification settings. While it mentions model and device-level identification, it lacks performance and comparative analysis.

  • Ref. [7] explored PRNU, CNN-based, feature-based, and metadata-based methods, questioning whether PRNU remains the gold standard in modern imaging pipelines.

  • Ref. [8] provided an overview of camera noise types and neural network-based noise estimation techniques. However, it lacks performance analysis of identification.

This review aims to evaluate the performance of state-of-the-art DL-based methods and compare them with PRNU-based methods. We begin by reviewing traditional PRNU-based approaches, given their established role and legal relevance in forensic contexts. Then, we examine deep learning methods for both source camera identification and device linking developed between 2016 and 2025. As shown in Figure 1, a search was conducted across major journals, conference proceedings, and digital libraries (IEEE Xplore, Springer, Elsevier, MDPI) using keywords such as “source camera identification”, “device linking”, “deep learning”, “PRNU”, and “camera attribution”. Studies were included if they proposed DL-based methods for source camera identification or device linking, provided comprehensive experimental validation, and reported performance metrics at either the model or device level. Works lacking experimental results or focusing on video-based identification were excluded. Of approximately 180 papers initially identified, 30 met all criteria, 16 of which were published in or after 2022. Their performance was analyzed at both the model and device levels. Through this analysis, we aim to assess the achievements of deep learning methods relative to traditional PRNU-based approaches, highlighting their strengths, limitations, and practical implications.

Figure 1. Search strategy for selecting studies on DL-based source camera identification and device linking.

Modern cameras are equipped with automatic settings that enhance photographs in real-time during image capture, often complemented by AI-driven photo enhancement features. These developments may modify the extracted sensor noise and camera-specific artefacts, potentially affecting identification accuracy. However, most available datasets do not account for these effects, making it difficult to systematically assess their impact on source camera identification and device linking. As discussed in [7], the modern imaging pipeline introduces unique challenges for PRNU-based methods. Currently, no study has examined whether DL-based methods suffer from similar issues. This review addresses this critical issue by evaluating the impact of the modern imaging pipelines on DL-based methods. Given the increasing use of AI-enhanced photography, this review provides timely insights for forensic analysts and researchers.

The paper is structured as follows: Section 2 examines sensor noise-based methods and their performance on both source camera identification and device linking tasks. Subsequently, Section 3 and Section 4 explore deep learning methods for source camera identification and device linking, respectively. Section 5 compares the performance of these methods, while Section 6 discusses the challenges associated with them. Finally, Section 7 concludes the paper.

2. Sensor Noise-Based Methods for Source Camera Identification and Device Linking

Source camera identification typically involves two steps: (1) extracting camera-specific features and (2) verifying their presence in a test image. One of the most widely used features in source camera identification is PRNU (photo-response non-uniformity) noise, which arises from manufacturing imperfections in the camera sensor. Since this noise pattern originates from hardware defects, it is unique to each camera device [1,9]. Mathematically, PRNU can be extracted using denoising techniques, as shown below:

wi = Ii − D(Ii) = Ii K + ε (1)

where Ii represents the i-th image from a specific camera, D(·) denotes the denoising operator, wi is the noise residue of the i-th image, K is the camera PRNU, and ε is the additive noise component. Denoising can be achieved through wavelet-based denoising, BM3D [10], or the recently proposed optimal filters [11]. Due to additive noise and the inherently weak nature of the PRNU, it is more reliable to estimate the camera’s PRNU using multiple smooth content images captured by the same device, applying either simple averaging [1] or maximum likelihood estimation [9]. This camera’s PRNU can then be considered as its camera signature.
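As a concrete illustration, Equation (1) and the simple-averaging fingerprint estimate can be sketched in a few lines of Python. This is a minimal sketch, not a forensic-grade pipeline: a Gaussian filter stands in for the wavelet or BM3D denoisers used in the literature, and the function names are illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def noise_residue(image):
    """Eq. (1): w_i = I_i - D(I_i). A Gaussian filter stands in here for
    the wavelet/BM3D denoising operator D used in the literature."""
    image = image.astype(np.float64)
    return image - gaussian_filter(image, sigma=1.0)

def estimate_prnu(images):
    """Simple-averaging estimate of the camera PRNU K from several
    smooth-content images taken by the same device."""
    residues = [noise_residue(img) for img in images]
    return np.mean(residues, axis=0)
```

Averaging over many smooth images suppresses the additive term ε, which is why fifty or more flat-field images are typically recommended for building the camera signature.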

To determine whether a specific camera took an image, its noise residue is extracted using (1) and then compared with the camera PRNU. Metrics such as normalized cross correlation and peak correlation energy (PCE) [2,12,13] can be used to assess the similarity between the noise residue and the PRNU. A threshold value is required to act as a decision rule in digital camera identification [14]. Most research uses PCE because it is independent of image size. The PCE between x and y is defined as

PCE(x, y) = ρ(s_peak = 0; x, y)² / [ (1 / (MN − |A|)) Σ_{s ∉ A} ρ(s; x, y)² ] (2)

where ρ(s; x, y) is the dot product between x − x̄ and y(s) − ȳ, and y(s) is obtained by circularly shifting y by a two-dimensional vector s. The bar over a symbol denotes its mean value, A is a small neighborhood around the peak (containing |A| shifts), and M and N are the width and height of the image, respectively.
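A minimal implementation of Equation (2) might look as follows. It assumes the residue and the reference PRNU are already spatially aligned, so the correlation peak sits at shift s = 0; the FFT-based circular cross-correlation and the `half_width` parameter are implementation choices, not part of the original definition.

```python
import numpy as np

def pce(x, y, half_width=2):
    """Eq. (2): peak-to-correlation-energy between a noise residue x and a
    candidate camera PRNU y (same shape, peak assumed at shift s = 0)."""
    x = x - x.mean()
    y = y - y.mean()
    # Circular cross-correlation rho(s; x, y) for every 2-D shift s, via the FFT.
    rho = np.fft.ifft2(np.fft.fft2(x) * np.conj(np.fft.fft2(y))).real
    M, N = rho.shape
    peak = rho[0, 0] ** 2
    # Exclude a small neighborhood A around the peak (with wrap-around).
    k = half_width
    mask = np.ones((M, N), dtype=bool)
    mask[np.ix_(np.arange(-k, k + 1) % M, np.arange(-k, k + 1) % N)] = False
    energy = (rho[mask] ** 2).mean()  # (1/(MN - |A|)) * sum over s not in A
    return peak / energy
```

In practice the PCE score is compared against a fixed threshold to decide whether the test image matches the camera; values in the range of roughly 50–60 are commonly reported in the PRNU literature.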

In practice, test images may contain complex structures that introduce scene content artefacts into the noise residue, thereby degrading the performance of source camera identification. Various approaches have been proposed to address these scene content artefacts. For example, a strong component in the noise residue indicates that it is less trustworthy and should be attenuated [15]. In [16], both the camera PRNU and the noise residue were obtained by removing the magnitude of the noise residue in the frequency domain and averaging the phase component only. To further improve accuracy and compensate for scene content issues, region reliability is estimated to produce a weighted correlation value [17,18]. A guided filtering approach is proposed in [19] to enhance the PRNU by removing interference from low-frequency components. Table 1 shows the performance of PRNU-based source device identification methods. Using fifty smooth images to construct the camera PRNU, all these methods achieve high identification accuracy, and the true-positive rates remain satisfactory even at very low false-positive rates. Due to its high accuracy and low false-positive rate, the PRNU-based method remains widely accepted by courts for verifying image sources [19].

Table 1.

Performance of the PRNU-based source device identification methods for 19 camera devices in the Dresden dataset.

References Accuracy True-Positive Rate at a False-Positive Rate of 10−3
2009/2013, [2,13] 0.9032 0.7768
2009, [15] 0.9116 0.7674
2012, [16] 0.9000 0.7672
2017, [17] 0.9263 0.8011

The PRNU-based method is effective when there are enough smooth images to construct the camera PRNU. In device linking, however, the camera PRNU cannot be reliably estimated because only two images are available. Device linking therefore requires a direct comparison between the two noise residues, and it is essential to select plain, bright image regions when measuring the similarity between the two images [20]. Table 2 shows the performance of the PRNU-based method in this situation. Despite high accuracy and a low false-positive rate, the true-positive rate is notably low; that is, most image pairs that do share a source camera are judged not to have originated from the same device. The results in Table 2 indicate the difficulty of reliably linking the sources of two images based on sensor noise alone.
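The direct residue-to-residue comparison used in device linking reduces, in its simplest form, to a normalized cross-correlation between two noise residues. The sketch below is illustrative only and omits the region-selection step of [20]; its low score for same-camera pairs mirrors why the true-positive rates in Table 2 are low.

```python
import numpy as np

def ncc(w1, w2):
    """Normalized cross-correlation between two noise residues. In device
    linking the residues are compared directly, since no multi-image PRNU
    estimate is available; a threshold on this score acts as the decision."""
    a = (w1 - w1.mean()).ravel()
    b = (w2 - w2.mean()).ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Because the shared PRNU component is weak relative to the per-image noise, same-camera pairs yield only slightly elevated scores, which forces conservative thresholds and depresses the true-positive rate.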

Table 2.

Performance of the sensor noise-based methods in the device linking setting for 30 camera devices in the Dresden dataset.

References True Positive Rate False Positive Rate Accuracy F1 Score
2009, [15] 0.0013 0.0002 0.9354 0
2013, [13] 0.1040 0.0002 0.9420 0.19
2019, [14] 0.2240 0.0004 0.9496 0.36
2021, [10] 0.0267 0.0004 0.9369 0.05

Since its introduction in 2006 [1], the PRNU-based method has served as a benchmark for forensic comparisons. It is a court-accepted method with documented error rates: a false-negative rate of 2.38% and a false-positive rate of 0.0024% [21]. Recent studies [7,21] have focused on evaluating the reliability of PRNU-based methods in forensic image analysis, particularly under varying image conditions. For example, reference [21] shows that image brightness could affect error rates. Overexposed and underexposed images tend to have lower true-positive rates than bright images. This means that the PRNU-based methods may miss more images from the suspect camera.

Photo-response non-uniformity (PRNU) is highly sensitive to geometric transformations such as rotation and scaling [22,23,24]. These operations alter the spatial alignment of sensor noise patterns, which PRNU-based algorithms rely on for camera identification [25,26,27]. Even minor geometric changes can significantly degrade correlation scores, leading to false negatives in source camera identification [25]. This vulnerability poses a significant challenge in real-world forensic scenarios. To mitigate this, techniques such as robust feature extraction and watermarking have been considered to preserve device-level forensic traces in the presence of geometric attacks [26,27,28]. However, achieving robustness remains an open research problem.

3. Deep Learning Approaches for Source Camera Identification

While PRNU-based methods have demonstrated high accuracy and legal credibility, their performance can degrade under challenging conditions such as overexposure, underexposure, and complex scenes [15,16,17,18,19]. These limitations have motivated researchers to explore deep learning techniques, which aim to extract forensic features directly from images in a data-driven manner. This section reviews the evolution of deep learning techniques in this domain, with a focus on architectural developments in source camera identification. Table 3 presents a chronological overview of deep learning techniques proposed for source camera identification from 2016 to 2025, while Figure 2 summarizes their key design components.

Figure 2. Schematic diagrams of (a) CNN feature extraction-based methods, (b) pre-processing-based methods, (c) residual-based methods, (d) encoder–decoder-based methods, (e) multiscale-based methods, and (f) dual-path-based methods.

Early studies employed convolutional neural network (CNN) architectures to extract forensic features from images. As illustrated in Figure 2a, these features were fed into a classical machine learning model for classification. For example, in [3], a CNN was used to extract camera features, which were subsequently classified using a support vector machine (SVM). This work demonstrated the effectiveness of CNN models on camera model identification for the first time. More recent developments combined spectral features from wavelet transforms with spatial features from local binary patterns, using multiclass models such as SVM, LDA, and k-NN for classification [29].

Since image scene information is generally irrelevant to camera identification, various pre-processing techniques have been developed to suppress scene content and enhance the extraction of relevant camera features, as illustrated in Figure 2b. For example, researchers have considered fixed high-pass filters, wavelet-based denoising filters [30], and median filters [31]. An adaptive constraint convolution layer was introduced in [32] to suppress image content. The pre-processing module in [33] consisted of an edge map extraction followed by low-pass filtering. In contrast to these fixed filters, reference [34] employed a data-driven pre-processing block to remove irrelevant content from input images dynamically. Recent advancements involve extracting angular and radial image features to capture pixel variations and relationships within neighborhoods [35]. These features are then integrated into vision transformers for classification. All these pre-processing techniques have consistently demonstrated superior performance in camera model identification tasks.
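As a hedged illustration of such fixed pre-processing, the snippet below applies a 5 × 5 high-pass kernel of the kind borrowed from steganalysis to suppress scene content; the exact filters used in [30,31,32] differ, and this kernel is only a representative example.

```python
import numpy as np
from scipy.ndimage import convolve

# Representative 5x5 fixed high-pass kernel (zero-sum, symmetric); the
# specific filters used in the surveyed works may differ.
HIGH_PASS = np.array([[-1,  2,  -2,  2, -1],
                      [ 2, -6,   8, -6,  2],
                      [-2,  8, -12,  8, -2],
                      [ 2, -6,   8, -6,  2],
                      [-1,  2,  -2,  2, -1]], dtype=np.float64) / 12.0

def preprocess(image):
    """Remove low-frequency scene content so that the network sees mostly
    high-frequency, sensor-related residue."""
    return convolve(image.astype(np.float64), HIGH_PASS, mode='reflect')
```

Because the kernel sums to zero and is symmetric, smooth image regions (constants and gradients) are annihilated, while high-frequency sensor noise passes through largely intact.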

Residual networks have gained attention in this domain. Their core principle is to learn residual mappings, i.e., the difference between the input and the output of a set of layers, as illustrated in Figure 2c. This approach helps preserve subtle forensic traces in photos and mitigates issues such as vanishing gradients. Content-adaptive fusion residual networks were specifically designed for small-sized images [36]. The input images were categorized into three subsets based on their image content characteristics: saturation, smoothness, and others. Subsequently, each subset was individually trained on a residual network model. Later, the residual network was enhanced with a domain knowledge-driven pre-processing module [37]. A hierarchical multi-task learning mechanism was adopted for three-level identification: brand-level, model-level, and device-level. This involved a domain knowledge-driven pre-processing module incorporating multiscale high-pass filters, followed by convolutional layers and ResNet. Additionally, Ref. [38] explored a combination of convolutional layers and residual blocks. They also introduced a technique to identify images from unknown camera models by setting a threshold on output prediction scores, enhancing the network’s capability to handle previously unseen camera sources.

On another front, the U-Net model, known for its encoder–decoder structure, has gained popularity. As shown in Figure 2d, the encoder progressively reduces the spatial dimensions of the input image while capturing camera features, and the decoder reconstructs the image representation by upsampling, restoring spatial resolution and combining it with the encoder output through skip connections. This design enables the network to retain camera characteristics while learning contextual information. A hierarchical U-Net architecture with dense connectivity and residual learning was considered for PRNU extraction [39]. A U-Net encoder–decoder unit was designed as a residual noise feature extractor, in which the embedding from the residual noise map was subsequently used for classification [40]. In [41], the camera fingerprint was also extracted with a U-Net, and features from different scales of the Transformer block's feature maps were fused using a graph convolutional network.

Multiscale analysis is crucial for feature extraction across many domains. As illustrated in Figure 2e, an image can first be decomposed into multiple scales to capture global and local information. Building on this concept, a fine-grained, multiscale residual prediction strategy was employed in [42] to mitigate the impact of scene content on source identification. In [43], multiscale filters were used to suppress various types of scene content. Additionally, in [44], a multiscale encoder–decoder structure was considered, involving the selection of image patches at different scales to construct the camera fingerprint.

All previous works were based on a single-path architecture. The adoption of dual-path networks, as illustrated in Figure 2f, has emerged as a promising strategy to integrate complementary information collected from two distinct branches. For example, a compact dual-path attention-enhanced ConvNeXt network was developed in [45]. A channel attention mechanism was employed to preserve high-frequency residual information, which is crucial for camera fingerprint construction, while minimizing interference from scene content. An adaptive dual-branch fusion network was introduced to extract multiscale features, and a bottleneck residual module was proposed to facilitate the transfer of shallow features for capturing weak camera features [46]. Additionally, a dual-branch CNN-based framework that fused low-level features from color images and high-pass filtered images was introduced to provide complementary features for the identification task [47]. Building on this trend, Ref. [48] introduced a contrastive learning strategy using a heterogeneous dual-branch network to refine the learning of camera fingerprints. Two approaches to extract camera information were employed within the dual-branch framework. Through contrastive learning, the shared forensic features related to the camera model between the two branches are enhanced, effectively filtering out irrelevant scene content information. The camera fingerprint information extracted from both branches is then integrated into a classification module to perform the camera model identification task. The concept of dual-branch architecture is further enhanced in [49] through the integration of multiscale decomposition and wavelet-based feature refinement. By employing multi-level decomposition and fusing features across scales, the method effectively amplifies subtle noise patterns, thereby improving the extraction of forensic traces for source camera identification.

Table 3.

Chronological overview of deep learning techniques for source camera identification.

Year Technique References
2016 Pre-processing: high-pass filter [30]
2017 Pre-processing: median filter [31]
2017 Pre-processing: adaptive convolutional layer [32]
2017 CNN feature extraction with SVM [3]
2017 Residual network: content-adaptive fusion [36]
2019 Residual network: domain knowledge [37]
2020 Pre-processing: edge map [33]
2021 Pre-processing: data-driven [34]
2021 Multiscale: residual prediction [42]
2021 Multiscale: multiple-scale filters [43]
2022 U-Net: hierarchical [39]
2022 Multiscale: multiscale encoder–decoder [44]
2023 CNN feature extraction: wavelet with LBP [29]
2023 U-Net: residual-noise extraction [40]
2024 Residual network: convolutional layers with residual blocks [38]
2024 U-Net: multiscale with transformer [41]
2024 Dual-path: ConvNeXt [45]
2024 Dual-path: multiscale feature fusion [46]
2024 Dual-path: high- and low-pass fusion [47]
2025 Pre-processing: angular and radial feature extraction [35]
2025 Dual-path: contrastive learning [48]
2025 Dual-path: multiscale with wavelet [49]

4. Deep Learning Approaches for Device Linking

In device linking, the goal is to determine whether two images originate from the same camera. Although the deep learning approaches discussed in Section 3 can be applied to compare noise residues between two images, their performance is limited by the quality of these residues and the presence of other noise components, as illustrated in Equation (1). Consequently, their effectiveness is significantly hindered. Table 4 provides a chronological overview of the deep learning techniques proposed for device linking from 2019 to 2025. Two main strategies have been developed to address the device linking problem:

  • Few-shot learning, particularly one-shot learning.

  • Contrastive learning, which compares image pairs.

Their key design strategies are given in Figure 3. In few-shot learning, virtual samples are generated to augment limited training data, and semi-supervised learning strategies are generally employed to improve performance under data-scarce conditions. For example, Ref. [50] proposed generating virtual samples using a global fuzzification and information diffusion strategy. In [51], a distance-based ensemble strategy was combined with a self-correction mechanism for semi-supervised learning. Similarly, Ref. [52] utilized multiple distance measures and coordinate pseudo-label selection in its semi-supervised learning framework. Despite these efforts, success in device-level linking remains limited.

Figure 3. Schematic diagrams of (a) few-shot learning-based and (b) contrastive learning-based methods.

Contrastive learning has emerged as a promising technique for device linking. It enables models to learn discriminative feature representations by training on both similar and dissimilar image pairs. This approach is particularly effective in distinguishing between images captured by different cameras and those taken by the same device.

Reference [53] first discussed the use of a Siamese network to extract a camera model fingerprint, known as Noiseprint. As shown in Figure 3b, the Siamese network comprises two identical CNNs with a shared architecture and weights. It is trained using pairs of image patches from the same camera (label +1) or different cameras (label −1). For positive examples, weights are updated to reduce the distance between the outputs, while for negative examples, weights are updated to increase the distance.
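The training objective described above is typically a pairwise contrastive loss. A minimal NumPy sketch is given below; the ±1 labeling follows the description above, while the margin value is an illustrative default and the embedding extraction by the twin CNNs is omitted.

```python
import numpy as np

def contrastive_loss(e1, e2, label, margin=1.0):
    """Pairwise contrastive loss for a Siamese network: same-camera pairs
    (label +1) are pulled together; different-camera pairs (label -1) are
    pushed apart until they are at least `margin` away."""
    d = float(np.linalg.norm(e1 - e2))
    if label == 1:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2
```

Gradients of this loss with respect to the shared weights reduce the embedding distance for positive pairs and increase it (up to the margin) for negative pairs, which is exactly the weight-update behavior described above.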

In the same year, another Siamese network model with end-to-end training was proposed in [54]. This model, known as the forensic similarity network, comprises a CNN-based feature extractor and a three-layer similarity network. The feature extractor extracts forensic traces from the two images, while the similarity network produces a score indicating whether the forensic traces are consistent.

Later, a multi-layer perceptron network was introduced for feature extraction; it also adopted a Siamese architecture with contrastive loss and logistic prediction [55]. As reported in [53,54,55], these networks were effective for model-level camera linking but not for device-level linking. In 2025, a new approach was proposed [56] that also used a Siamese network to extract the noise residuals. This method further refined the noise residuals using contextual information aggregation operations, designed to suppress scene content interference and enhance the extraction of device-level, rather than model-level, forensic features. For the first time, device-level linking between two images was successfully demonstrated under a deep learning framework.

Table 4.

Chronological overview of deep learning techniques for device linking.

Year Technique References
2019 Contrastive learning: Siamese network [53]
2019 Contrastive learning: forensic similarity [54]
2020 Contrastive learning: multi-layer perceptron with Siamese architecture [55]
2022 Few-shot learning: global fuzzification with information diffusion [50]
2022 Few-shot learning: ensemble strategy [51]
2023 Few-shot learning: coordinate pseudo-label selection [52]
2025 Contrastive learning: SiamNet with contextual information aggregation [56]

5. Performance Comparison and Benchmarking

5.1. Dataset

To evaluate the effectiveness of source camera identification and device linking methods, it is essential to benchmark them across diverse datasets and scenarios. In forensic applications, datasets should reflect real-world conditions, including variations in camera models and devices, image content, and post-processing effects. Table 5 provides an overview of datasets commonly used in image forensics research. These datasets are broadly categorized based on the nature of the images they contain: images captured under default settings, compressed images, and manipulated images. Note that all these datasets primarily consist of RGB images in JPEG format, as it is the most common format used by digital cameras and mobile devices.

Table 5.

Datasets for image forensics applications.

Nature of Images References Dataset Name Number of Images Number of Camera Devices Number of Camera Models
Images acquired under default settings [57] Dresden 16,960 74 25
[58] MICHE-I 3700 3 3
[59] UNIFI 5415 23 21
[60] IMAGINE 2816 67 55
[61] SOCRatES 9700 103 65
[62] Daxing 43,400 90 22
Compressed images at different qualities [63] VISION 34,427 35 29
[64] Forchheim 23,000 27 25
[65] --- 13,210 400 ---
[66] --- 32,445 486 ---
Forged images (copy move, splicing, enhancement) [67] CASIA 5123 No camera information
Forged images (copy move, splicing, removal, enhancement) [68] IMD2020 2010 No camera information
[69] DF2023 1 million No camera information
Forged images (seam carving) [26] --- 2750 11 10
[70] --- 1560 13 12

Among them, the Dresden image database is one of the most widely used datasets for both source camera identification and device linking [57]. It consists of 16,960 full-size photographs captured by seventy-four devices across twenty-five camera models from fourteen different camera brands. This dataset serves as a standard benchmark for most algorithms in this domain. The VISION dataset was introduced in 2017 [63]. It includes data that has undergone compression to simulate real-world sharing scenarios. The dataset contains 34,427 images and 1914 videos from thirty-five devices across eleven camera brands. It includes both original and social media compressed versions (from platforms like Facebook, YouTube, and WhatsApp). Flickr is a popular image-sharing platform; some researchers have collected images from it for testing [65,66]. Between 2018 and 2023, several datasets were introduced for source camera identification [58,59,60,61,62,64]. However, most existing methods have yet to be evaluated using these datasets.

Other datasets, such as CASIA and DF2023, focus on image manipulation (e.g., copy-move, splicing, enhancement) [67,68,69]. While valuable for evaluating forgery detection algorithms, they are not suitable for source camera identification due to missing camera information. An exception is the seam carving manipulation datasets [26,70], which retain crucial source camera information and can be used to assess the robustness of identification methods under content-aware image alterations.

DL-based methods adopt a classification setting, where the outputs represent scores indicating the likelihood that a test photo belongs to each camera class. Standard evaluation metrics for performance assessment include true-positive rate, false-positive rate, accuracy, precision, and F1 score.
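These metrics follow directly from the raw confusion counts. The small helper below is shown only to make the definitions explicit; the function and key names are illustrative.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard evaluation metrics computed from confusion counts:
    true positives, false positives, true negatives, false negatives."""
    tpr = tp / (tp + fn)                        # true-positive rate (recall)
    fpr = fp / (fp + tn)                        # false-positive rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * tpr / (precision + tpr)
    return {"tpr": tpr, "fpr": fpr, "accuracy": accuracy,
            "precision": precision, "f1": f1}
```

Note that with heavily imbalanced pair sets, as in device linking, accuracy can be high even when the true-positive rate is very low, which is why the tables in this review report both.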

5.2. Performance of Deep Learning Methods in Source Camera Identification

Table 6 summarizes the performance of deep learning-based methods for source camera identification at both the model and device levels. The Dresden [57] and VISION [63] datasets are among the most widely used benchmarks in the literature for evaluating source camera identification methods. Accuracy trends over the years can also be compared across the different experimental setups.

Table 6.

(a) The model-level and (b) device-level performance of deep learning-based methods for source camera identification.

(a)
Number of Camera Models (Dataset) References Accuracy
25 models (Forchheim)
(selecting 50 patches)
2025, [49] 0.9673 (F1: 0.9604)
2024, [47] 0.9451 (F1: 0.9312)
2024, [45] 0.9413 (F1: 0.9266)
2021, [43] 0.9387 (F1: 0.9035)
2021, [34] 0.9497 (F1: 0.9475)
2021, [42] 0.9105 (F1: 0.9139)
2017, [31] 0.8717 (F1: 0.8667)
2017, [32] 0.8833 (F1: 0.8924)
25 models (Forchheim)
(selecting 256 image patches)
2024, [47] 1.000
2021, [34] 0.9987
2021, [42] 0.9948
2021, [43] 0.9961
2017, [3] 0.9361
23 models (Dresden) 2024, [46] 0.9933
2021, [42] 0.9862
2021, [43] 0.9851
2017, [31] 0.9806
2019, [36] 0.9735
18 models (Dresden) 2025, [49] 0.957 (F1: 0.951)
2025, [48] 0.950 (F1: 0.945)
2024, [45] 0.938 (F1: 0.931)
2024, [47] 0.944 (F1: 0.939)
2022, [44] 0.931 (F1: 0.932)
2021, [34] 0.942 (F1: 0.931)
2021, [42] 0.912 (F1: 0.904)
2021, [43] 0.932 (F1: 0.932)
2019, [53] 0.913 (F1: 0.916)
2017, [32] 0.898 (F1: 0.891)
13 models (Dresden) 2023, [40] 0.9760 (F1: 0.9759)
2019, [71] 0.9756 (F1: 0.9760)
2017, [3] 0.9034 (F1: 0.9050)
4 models (Dresden) 2024, [38] 0.9570
29 models (VISION) 2025, [49] 0.891 (F1: 0.920)
2025, [48] 0.855 (F1: 0.893)
2024, [45] 0.831 (F1: 0.876)
2024, [47] 0.837 (F1: 0.878)
2022, [44] 0.823 (F1: 0.862)
2021, [34] 0.829 (F1: 0.867)
2021, [42] 0.765 (F1: 0.792)
2021, [43] 0.826 (F1: 0.872)
2019, [53] 0.801 (F1: 0.813)
2017, [32] 0.732 (F1: 0.794)
4 models (VISION) 2024, [38] 0.9629
15 models (from [41]) 2024, [41] 0.9787
2021, [43] 0.9234
2019, [37] 0.9509
2019, [36] 0.9356
(b)
Number of Camera Devices (Dataset) References Accuracy
74 devices (Dresden)
Average number of devices per model = 2.96
2025, [48] 0.492 (F1: 0.486)
2024, [45] 0.414 (F1: 0.416)
2024, [47] 0.475 (F1: 0.471)
2022, [44] 0.439 (F1: 0.448)
2021, [34] 0.446 (F1: 0.467)
2021, [42] 0.393 (F1: 0.393)
2021, [43] 0.428 (F1: 0.446)
2019, [37] 0.5240
2019, [53] 0.427 (F1: 0.444)
2017, [31] 0.4581
2017, [32] 0.367 (F1: 0.392)
35 devices (VISION)
Average number of devices per model = 1.21
2025, [35] 0.943
2024, [45] 0.813
2022, [39] 0.811
2022, [44] 0.832
2021, [42] 0.721
2021, [43] 0.765
2019, [71] 0.831
2017, [32] 0.830
13 devices (from [41])
Average number of devices per model = 1.625
2024, [41] 0.9185
2021, [43] 0.7996
2019, [36] 0.8023
2019, [37] 0.7585

Examining Table 6a, it is evident that all methods perform strongly in model-level camera identification, with accuracies ranging from 0.73 to 1.00 and an average of 0.92. As illustrated in the boxplot in Figure 4, model-level identification on the Dresden dataset surpasses that on the VISION dataset. This can be attributed to the more complex and diverse scenes in the VISION dataset, which pose greater challenges for classification.

Figure 4.

Comparison of the model-level and device-level accuracies of deep learning-based methods for source camera identification, where r denotes the average number of devices per model. The orange lines represent the medians.

However, the transition to device-level identification results in a notable performance drop, as evident in both Table 6b and Figure 4. Generally, as the average number of devices per camera model increases, identification accuracy tends to decline. When the average number of devices per model is below 2, the average accuracy remains around 0.82; when it exceeds 2, the average accuracy drops to 0.44, underscoring the difficulty of distinguishing individual camera devices. This limitation stands in contrast to traditional PRNU-based methods, which consistently achieve over 0.9 accuracy in device-level identification (as shown in Table 1).

To enhance device-level performance, a separate network for extracting noise residues from each camera was considered in [71]. Although this approach improves device-level accuracy, it is time-consuming and impractical given the large number of camera devices in real-world applications. In fact, PRNU-based methods are computationally lightweight, relying on correlation measures and filtering. DL-based methods generally require substantial resources for training (e.g., GPUs and large datasets). Despite these higher computational costs, they still fail to outperform PRNU at device-level identification. This raises questions about the cost-benefit trade-off for practical deployment.
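To illustrate why PRNU matching is computationally lightweight, the core matching step reduces to a zero-mean normalized correlation between a test image's noise residual and a stored camera fingerprint. The following is an illustrative NumPy sketch on synthetic data, not any specific published implementation.

```python
import numpy as np

def normalized_correlation(residual, fingerprint):
    """Zero-mean normalized correlation between a noise residual and a
    stored camera fingerprint (two 2-D arrays of the same size)."""
    r = residual - residual.mean()
    f = fingerprint - fingerprint.mean()
    return float((r * f).sum() / (np.linalg.norm(r) * np.linalg.norm(f)))

rng = np.random.default_rng(0)
prnu = rng.standard_normal((64, 64))                   # synthetic "fingerprint"
residual = 0.3 * prnu + rng.standard_normal((64, 64))  # same-camera residual
other = rng.standard_normal((64, 64))                  # different-camera residual
print(normalized_correlation(residual, prnu))  # noticeably positive
print(normalized_correlation(other, prnu))     # close to zero
```

The entire test is a handful of element-wise operations and two norms, which is why PRNU matching scales cheaply to large device sets compared with a forward pass through a trained network per candidate camera.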

5.3. Performance of Deep Learning Methods in Device Linking

This section evaluates the performance of deep learning methods for device linking, focusing on two main strategies: one-shot learning and contrastive learning. Table 7 summarizes the results for both model-level and device-level linking. Generally, one-shot learning strategies show limited performance: model-level accuracy is around 0.55, while device-level accuracy drops further to 0.38, indicating challenges in reliably linking images from the same device.

Table 7.

The performance of (a) one-shot learning and (b) contrastive learning for device linking [52].

(a)
Linking Number of Cameras (Dataset) Accuracy
Model-level 14 models (Dresden) 0.5660
11 models (VISION) 0.5303
Device-level 27 devices (Dresden) around 0.40
35 devices (VISION) around 0.35
(b)
Linking Number of Cameras (Dataset) References Accuracy
Model-level 5 models (Dresden) 2020, [55] 0.7840
2017, [3] 0.6930
2016, [30] 0.6530
5 models (VISION) 2020, [55] 0.7450
2017, [3] 0.6250
2016, [30] 0.6380
Device-level 30 devices (Dresden)
Average number of devices per model = 3.33
2025, [56] 0.9520 (F1 = 0.82)
2022, [39] 0.9431 (F1 = 0.22)
2019, [54] 0.7448 (F1 = 0.12)
5 devices (Dresden) from one camera model 2020, [55] 0.6530
2017, [3] 0.6863
2016, [30] 0.6397

Contrastive learning methods have shown more promise in device linking. At the model level, the average accuracy is 0.69, which is lower than the model-level source camera identification performance shown in Table 6a and Figure 4. At the device level, however, the accuracy is around 0.65, which is comparable to the device-level source camera identification performance in Table 6b and Figure 4.

Among the reviewed methods, the approach in [56] stands out for its device-level performance. Because it was evaluated on an imbalanced dataset of 30 devices from 9 camera models, the F1 score should be considered alongside accuracy. Its F1 score of 0.82 outperforms earlier methods: the U-Net-based approach [39] achieved an F1 score of 0.22, comparable to the performance of PRNU-based methods shown in Table 2, and the forensic similarity network [54] reached only 0.12. These results demonstrate that device-level linking between two images is feasible under a contrastive learning framework.

We also consider metrics such as true-positive rate and false-positive rate, in addition to the F1 score. Figure 5 shows a plot of true-positive rates against false-positive rates. The blue markers represent PRNU-based methods for device linking, while the green markers correspond to the U-Net approach [39]. Both exhibit low true-positive rates, despite achieving excellent false-positive rates. This indicates that while they rarely link images from different cameras, they fail to link images from the same device correctly. The purple markers denote contrastive learning-based methods. Among these, the method in [54] achieves a higher true-positive rate, but at the same time, a high false-positive rate. In contrast, the method in [56] demonstrates an excellent true-positive rate, though its false-positive rate remains relatively high, which may limit its forensic applicability. The red markers illustrate the performance of PRNU-based methods for source camera identification, where enough images are available to construct a reliable camera fingerprint. Although the method in [56] has only two images for comparison and lacks a reliable camera fingerprint, it achieves a comparable true-positive rate to the traditional PRNU-based approach. However, the high false-positive rate highlights a critical challenge; further research is needed to reduce the false positives and enhance their reliability in practical forensic scenarios.

Figure 5.

A plot of true-positive rate against false-positive rate for PRNU-based device linking methods [10,13,14,15] (blue), U-Net [39] (green), and contrastive learning methods [54,56] (purple). The red color represents the performance of the PRNU-based method [21] in a source camera identification setting where enough images are used to construct the camera PRNU.

6. Discussions and Challenges

6.1. Device-Level Performance

Despite significant progress in DL-based methods for source camera identification and device linking, their performance at the device level remains far below that of PRNU-based methods. PRNU methods consistently achieve high accuracy and low false-positive rates, making them legally accepted benchmarks. In contrast, DL-based methods struggle to extract device-specific artefacts. This gap highlights the need for additional research to achieve device-level performance comparable to that of PRNU-based methods.

Further work can focus on developing strategies that capture device-specific forensic traces, such as using attention mechanisms [72] to direct the network’s focus toward camera-specific features while suppressing scene content. A contrastive learning framework [73] can be utilized to integrate information across various domains and at multiple scales, thereby enhancing device feature learning and characterization.
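To make the contrastive idea concrete, the following is a minimal NumPy sketch of the classic pairwise contrastive loss, which pulls embeddings of same-device image pairs together and pushes different-device pairs at least a margin apart. This is an illustrative formulation, not the specific framework of [73].

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, same_device, margin=1.0):
    """Classic pairwise contrastive loss on two embedding vectors:
    same-device pairs are penalized by their squared distance,
    different-device pairs only when closer than `margin`."""
    d = np.linalg.norm(emb_a - emb_b)
    if same_device:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2

# Toy embeddings (hypothetical 2-D features for illustration).
a = np.array([0.10, 0.20])
b = np.array([0.15, 0.18])   # close to a: plausible same-device pair
c = np.array([0.90, -0.70])  # far from a: plausible different-device pair
print(contrastive_loss(a, b, same_device=True))   # small: pair already close
print(contrastive_loss(a, c, same_device=False))  # zero: pair beyond the margin
```

In a device-feature setting, the embeddings would come from a shared network applied to noise residuals, and training would minimize this loss over many labeled pairs so that distance in embedding space reflects device identity rather than scene content.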

6.2. Cross-Dataset Validation

Forensic applications involve factors such as social media compression, varying image resolutions, and cameras not present in the training data. While current DL-based methods perform well on benchmark datasets such as Dresden and VISION, their accuracy on images from cameras outside the training set remains uncertain. These benchmarks also do not fully capture real-world diversity, leading to weak generalization to unseen cameras and datasets. To address this limitation, cross-dataset validation is essential: models are evaluated on datasets that differ from the training set to assess robustness and prevent overfitting to specific benchmarks [74]. Such validation ensures that algorithms can handle forensic scenarios involving unknown devices and diverse imaging conditions.
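The cross-dataset protocol itself is simple to state: fit on one benchmark, report accuracy on another. A minimal sketch follows; `train_model` and the toy data are hypothetical placeholders, not an actual dataset API.

```python
# Sketch of cross-dataset validation: the model never sees the test
# benchmark during fitting. All names and data here are illustrative.

def cross_dataset_accuracy(train_set, test_set, fit):
    model = fit(train_set)                             # e.g., fit on Dresden
    correct = sum(model(x) == y for x, y in test_set)  # e.g., score on VISION
    return correct / len(test_set)

# Toy stand-in "model": memorize the majority training label.
def train_model(data):
    labels = [y for _, y in data]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority

train = [(f"img{i}", "camA") for i in range(8)] + [("img8", "camB")]
test = [("t1", "camA"), ("t2", "camB"), ("t3", "camB")]
print(cross_dataset_accuracy(train, test, train_model))  # 1/3
```

The deliberately weak stand-in model illustrates the point of the protocol: a predictor that exploits regularities of the training benchmark (here, class imbalance) can look strong in-domain yet collapse when the evaluation distribution changes.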

6.3. Computational Costs

Given the large number of camera devices, computational efficiency is critical for practical deployment. Most existing studies emphasize classification performance and do not report efficiency metrics. Deep learning-based models typically demand substantial resources for training, including high-performance GPUs and large datasets, making them more resource-intensive than PRNU-based methods. Future studies could consider developing lightweight architectures, such as teacher–student frameworks [75], to reduce computational requirements and make DL-based solutions more feasible for forensic applications.

6.4. Modern Imaging Pipeline

Nowadays, image-quality enhancements can occur during the capture process itself. Modern cameras are equipped with automatic settings, such as scene recognition, object detection, and face identification, to optimize photographs in real time [76,77,78]. Additionally, popular mobile applications offer beautification options, including skin smoothing and blemish removal, during photo capture. However, existing datasets rarely account for these enhancements, making this an underexplored area of research. As noted in [7], the reliability of PRNU as the gold standard has been questioned in modern imaging pipelines because of such processing. In this section, we investigate whether these built-in enhancements influence the performance of DL-based source camera identification and device linking, as has been observed with PRNU-based methods.

Two common photography scenarios are considered: camera-built-in image enhancement (S1) and AI enhancement via apps (S2). The former applies the camera's built-in processing through various settings to enhance photograph quality, while the latter employs dedicated apps that offer beautification features. Figure 6 shows some examples. In the first scenario (S1), the camera is set to enhance vividness. In the second scenario (S2), an app named "Meitu" is used for AI-based enhancements, including "whitening", "face whitening", and "face slimming". The average intensity histograms are shown in Figure 7. Overall, the histograms exhibit a similar trend; however, S1 and S2 display higher peaks at intensity levels around 220, likely due to the whitening processes and the vividness enhancement. The mean intensities for S1 and S2 are 132 and 127, respectively, both higher than the average intensity of 115 observed in images captured at default settings.

Figure 6.

Example images considered in the robustness test. The first column displays the original images captured using the default camera settings. The second and third columns show images in scenario 1 (S1) and scenario 2 (S2), respectively.

Figure 7.

The average intensity histograms of the original images captured using the default camera settings and of the images in scenario 1 (S1) and scenario 2 (S2).

Three methods were examined to evaluate their performance on these two scenarios: the PRNU-based method (Section 2), the deep learning method for source camera identification (Section 3), and the deep learning method for device linking (Section 4).

For the PRNU-based algorithm, the filter used is the wavelet-based denoising filter, and the top left corner was chosen for source identification, following common practice to minimize textured regions [2]. Figure 8 shows the box plot for PCE values of the PRNU-based methods in the two scenarios. A large PCE value suggests that the images are highly likely to have been taken from the specific camera device. Using the default setting, the average PCE value is 63.37. Out of 15 images, 13 have PCE values greater than 20. However, after using software-built-in processing or AI enhancement, the average PCE values for both scenarios drop significantly. They decrease to 0.15 and 0.27 for the two scenarios, respectively. The highest PCE value is 2.06, far below the normal threshold value for source camera identification and device linking applications. This finding aligns with the descriptions in [7], indicating that PRNU-based methods struggle to identify camera devices, even with the minor enhancements evaluated in the two scenarios.

Figure 8.

Comparison of the PCE values of the PRNU-based method for the original images and the enhanced images in scenario 1 (S1) and scenario 2 (S2). The orange line represents the median.
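For readers unfamiliar with the PCE statistic used in this test, the following is an illustrative NumPy sketch on synthetic data (not the exact implementation used here): the squared peak of the circular cross-correlation surface, divided by the mean squared correlation away from the peak.

```python
import numpy as np

def pce(residual, fingerprint, exclude=5):
    """Peak-to-correlation-energy sketch: FFT-based circular
    cross-correlation, squared peak over the mean squared value of the
    surface with a small neighbourhood around the peak excluded."""
    xcorr = np.real(np.fft.ifft2(np.fft.fft2(residual) *
                                 np.conj(np.fft.fft2(fingerprint))))
    peak_idx = np.unravel_index(np.argmax(np.abs(xcorr)), xcorr.shape)
    peak = xcorr[peak_idx]
    mask = np.ones_like(xcorr, dtype=bool)
    r, c = peak_idx
    mask[max(0, r - exclude):r + exclude + 1,
         max(0, c - exclude):c + exclude + 1] = False
    return float(peak ** 2 / np.mean(xcorr[mask] ** 2))

rng = np.random.default_rng(1)
prnu = rng.standard_normal((128, 128))                 # synthetic fingerprint
same = 0.2 * prnu + rng.standard_normal((128, 128))    # matching residual
diff = rng.standard_normal((128, 128))                 # non-matching residual
print(pce(same, prnu) > pce(diff, prnu))  # True: matching pair peaks strongly
```

The large gap between matching and non-matching PCE values on clean data is exactly what collapses under the enhancements in S1 and S2, where the residual no longer correlates with the stored fingerprint.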

After confirming the effect on PRNU methods, we investigated the impact on deep learning-based source camera identification. Specifically, a CNN method with good model-level performance was examined [34]. In the original case without any enhancement, 14 out of 15 images were correctly identified; in scenarios S1 and S2, 13 out of 15 images were correctly identified. Compared to PRNU-based methods, the CNN method demonstrates greater robustness. This resilience can be attributed to the CNN primarily extracting model-level artefacts, which are less susceptible to alteration than device-level artefacts.

We then consider the effect on device linking. SiamNet was examined owing to its good device-level performance [56]. Figure 9 shows the similarity scores for the original case and the two scenarios. Using the default setting without any built-in software processing, the average similarity score is 0.627. With minor enhancement, the average similarity scores drop to 0.524 and 0.536 for the two scenarios, respectively. With a threshold of 0.5, the average accuracy is 88.9% in the original case but decreases to 60.5% and 58.0% for the two scenarios, indicating significant performance drops in both cases.

Figure 9.

Comparison of the similarity scores of the Siamese network method for the original images and the enhanced images in scenario 1 (S1) and scenario 2 (S2). The orange line represents the median.
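The thresholding step in this evaluation can be sketched in a few lines: a pair is declared "same device" when its similarity score exceeds the threshold, and accuracy is the fraction of correct calls. The scores below are toy values for illustration, not the measured ones.

```python
# Sketch of threshold-based linking decisions (hypothetical scores).

def linking_accuracy(scores, same_device, threshold=0.5):
    """Fraction of pairs whose thresholded decision matches the truth."""
    decisions = [s > threshold for s in scores]
    correct = sum(d == t for d, t in zip(decisions, same_device))
    return correct / len(scores)

scores = [0.63, 0.55, 0.48, 0.31, 0.52, 0.22]          # toy similarity scores
truth = [True, True, True, False, False, False]         # true pair labels
print(linking_accuracy(scores, truth))
```

As the example shows, accuracy degrades when enhancement pushes genuine-pair scores down toward the 0.5 threshold (the 0.48 pair) and pushes some impostor pairs above it (the 0.52 pair), which mirrors the drop from 88.9% to around 60% observed in S1 and S2.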

In summary, our findings indicate that, like PRNU-based methods [7], DL-based methods are also affected by the built-in processing features embedded in cameras. This underscores the need for updated datasets that reflect current imaging trends, including camera-integrated processing and AI-based enhancement, to strengthen the robustness of forensic methods against challenges posed by evolving practical photography practices.

7. Conclusions

Source camera identification and device linking are critical tasks in digital forensics, security, and online authentication. Over the years, a wide range of deep learning architectures have been explored to address these challenges. From early convolutional neural networks to advanced encoder–decoder structures, residual networks, dual-branch models, and contrastive learning frameworks, researchers have developed numerous sophisticated methods to extract forensic features from images. These deep learning approaches have shown strong performance in source camera identification, particularly at the model level. However, achieving reliable device-level identification remains a significant challenge due to the subtle nature of device-specific artefacts. Recent contrastive learning methods have demonstrated promising results in device linking, achieving higher F1 scores than earlier DL-based methods; however, they still suffer from higher false-positive rates compared to PRNU-based methods.

Recently, PRNU-based methods have been found to have limitations when confronted with modern image capture practices, such as built-in camera enhancements and AI-based image processing. Our findings reveal that DL-based methods also suffer from this issue. This underscores the need for continued innovation and updated datasets that reflect real-world conditions to improve the resilience of forensic techniques.

Acknowledgments

The authors acknowledge the support of the Department of Electrical and Electronic Engineering, The Hong Kong Polytechnic University. The authors would like to thank all the reviewers for their suggestions to improve this manuscript. During the preparation of this manuscript, the authors utilized Grammarly (version 14.1265.0) and Copilot (Version 2510 Build 16.0.19328.20244) to enhance the writing quality. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Abbreviations

The following abbreviations are used in this manuscript:

CNN Convolutional neural networks
DL Deep learning
PCE Peak correlation energy
PRNU Photo-response non-uniformity
SVM Support vector machine

Author Contributions

Conceptualization, N.-F.L.; methodology, N.-F.L. and Z.L.; software, Z.L.; validation, Z.L.; investigation, N.-F.L. and Z.L.; writing—original draft preparation, N.-F.L. and Z.L.; writing—review and editing, N.-F.L. and Z.L. All authors have read and agreed to the published version of the manuscript.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflicts of interest.

Funding Statement

This research received no external funding.

Footnotes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

References

  • 1.Lukáš J., Fridrich J., Goljan M. Proc. SPIE 6072, Security, Steganography, and Watermarking of Multimedia Contents. SPIE Press; Bellingham, WA, USA: 2006. Detecting digital image forgeries using sensor pattern noise; pp. 362–372. [Google Scholar]
  • 2.Goljan M., Fridrich J., Filler T. Proc. SPIE 7254, Media Forensics and Security. SPIE Press; Bellingham, WA, USA: 2009. Large scale test of sensor fingerprint camera identification; p. 725401. [Google Scholar]
  • 3.Bondi L., Baroffio L., Güera D., Bestagini P., Delp E.J., Tubaro S. First steps toward camera model identification with convolutional neural networks. IEEE Signal Process. Lett. 2017;24:259–263. doi: 10.1109/LSP.2016.2641006. [DOI] [Google Scholar]
  • 4.Gouda O., Bouridane A., Talib M.A., Nasir Q. Machine learning-based methods in source camera identification: A systematic review; Proceedings of the International Conference on Business Analytics for Technology and Security; Dubai, United Arab Emirates. 16–17 February 2022; pp. 1–7. [DOI] [Google Scholar]
  • 5.Nwokeji C.E., Sheikh-Akbari A., Gorbenko A., Mporars I. Source camera identification techniques: A survey. J. Imaging. 2024;10:31. doi: 10.3390/jimaging10020031. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Bernacki J., Scherer R. Algorithms and methods for individual source camera identification: A survey. Sensors. 2025;25:3027. doi: 10.3390/s25103027. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Klier S., Baier H. Source camera identification—Do we have a good standard? Forensic Sci. Int. Digit. Investig. 2025;52:301858. [Google Scholar]
  • 8.Volkov A.A., Kozlov A.V., Cheremkhin P.A., Rymov D.A., Shifrina A.V., Starikov R.S., Nebavskiy V.A., Petrova E.K., Zlokazov E.Y., Rodin V.G. A review of Neural network-based image noise processing methods. Sensors. 2025;25:6088. doi: 10.3390/s25196088. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Chen M., Fridrich J., Goljan M., Lukas J. Determining image origin and integrity using sensor noise. IEEE Trans. Inf. Forensics Secur. 2008;3:74–90. doi: 10.1109/TIFS.2007.916285. [DOI] [Google Scholar]
  • 10.Salazar D.A., Ramirez-Rodriguez A.E., Nakano M., Cedillo-Hernandez M., Perez-Meana H. Evaluating of denoising algorithms for source camera linking; Proceedings of the 13th Mexican Conference on Pattern Recognition; Mexico City, Mexico. 23–26 June 2021; pp. 282–291. [Google Scholar]
  • 11.Kozlov A.V., Nikitin N.V., Rodin V.G., Cheremkhin P.A. Improving the reliability of digital camera identification by optimizing the algorithm for comparing noise signatures. Meas. Tech. 2024;66:923–934. doi: 10.1007/s11018-024-02308-y. [DOI] [Google Scholar]
  • 12.Goljan M., Chen M., Fridrich J. Identifying common source digital camera from image pairs; Proceedings of the IEEE International Conference on Image Processing; San Antonio, TX, USA. 16 September–19 October 2007; pp. VI–125–VI–128. [Google Scholar]
  • 13.Fridrich J. Sensor defects in digital image forensic. In: Sencar H.T., Memon N., editors. Digital Image Forensics: There Is More to a Picture Than Meets the Eye. Springer; Berlin/Heidelberg, Germany: 2013. pp. 179–218. [Google Scholar]
  • 14.Mieremet A. Camera-identification and common-source identification: The correlation values of mismatches. Forensic Sci. Int. 2019;301:46–54. doi: 10.1016/j.forsciint.2019.05.008. [DOI] [PubMed] [Google Scholar]
  • 15.Li C.-T. Source camera linking using enhanced sensor pattern noise extracted from images; Proceedings of the International Conference on Imaging for Crime Detection and Prevention; London, UK. 3 December 2009. [Google Scholar]
  • 16.Kang X., Li Y., Qu Z., Huang J. Enhancing source camera identification performance with a camera reference phase sensor pattern noise. IEEE Trans. Inf. Forensics Secur. 2012;7:393–402. doi: 10.1109/TIFS.2011.2168214. [DOI] [Google Scholar]
  • 17.Chan L.H., Law N.F., Siu W.C. A confidence map and pixel-based weighted correlation for PRNU-based camera identification. Digit. Investig. 2013;10:215–225. doi: 10.1016/j.diin.2013.04.001. [DOI] [Google Scholar]
  • 18.Shi C., Law N.F., Leung H., Siu W.C. A local variance based approach to alleviate the scene content interference for source camera identification. Digit. Investig. 2017;22:74–87. doi: 10.1016/j.diin.2017.07.005. [DOI] [Google Scholar]
  • 19.Liu Y., Xiao Y., Tian H. Plug-and-Play PRNU enhancement algorithm with guided filtering. Sensors. 2024;24:7701. doi: 10.3390/s24237701. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Ramirez-Rodriguez A.E., Nakano M., Perez-Meana H. Source camera linking algorithm based on the analysis of plain image zones. Eng. Proc. 2024;60:17. doi: 10.3390/engproc2024060017. [DOI] [Google Scholar]
  • 21.Martin A., Newman J. Significance of image brightness levels for PRNU camera identification. J. Forensic Sci. 2025;70:132–149. doi: 10.1111/1556-4029.15673. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Bayram S., Sencar T., Memon N.D. Seam-carving based anonymization against image & video source attribution; Proceedings of the IEEE International Workshop on Multimedia Signal Processing; Pula, Italy. 30 September–2 October 2013; pp. 272–277. [Google Scholar]
  • 23.Dirik A.E., Sencar H.T., Memon N. Analysis of seam-carving-based anonymization of images against PRNU noise pattern-based source attribution. IEEE Trans. Inf. Forensics Secur. 2014;9:2277–2290. doi: 10.1109/TIFS.2014.2361200. [DOI] [Google Scholar]
  • 24.Martín-Rodríguez F., Isasi-de-Vicente F., Fernández-Barciela M. A Stress Test for Robustness of Photo Response Nonuniformity (Camera Sensor Fingerprint) Identification on Smartphones. Sensors. 2023;23:3462. doi: 10.3390/s23073462. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Taspinar S., Manoranjan M., Memon N. PRNU-based camera attribution from multiple seam-carved images. IEEE Trans. Inf. Forensics Secur. 2017;12:3065–3080. doi: 10.1109/TIFS.2017.2737961. [DOI] [Google Scholar]
  • 26.Irshad M., Law N.F., Loo K.H., Haider S. IMGCAT: An approach to dismantle the anonymity of a source camera using correlative features and an integrated 1D convolutional neural network. Array. 2023;18:100279. doi: 10.1016/j.array.2023.100279. [DOI] [Google Scholar]
  • 27.Li J., Zhang X., Ma B., Qin C., Wang C. Reversible PRNU anonymity for device privacy protection based on data hiding. Expert Syst. Appl. 2023;234:121017. doi: 10.1016/j.eswa.2023.121017. [DOI] [Google Scholar]
  • 28.Wang C., Zhang Q., Wang X., Zhou L., Li Q., Zia Z., Ma B., Shi Y.Q. Light-Field Image Multiple Reversible Robust Watermarking Against Geometric Attacks. IEEE Trans. Dependable Secur. Comput. 2025;22:5861–5875. doi: 10.1109/TDSC.2025.3576223. [DOI] [Google Scholar]
  • 29.Jaiswal A.K., Srivastava R. Role of Data-Intensive Distributed Computing Systems in Designing Data Solutions. Springer; Cham, Switzerland: 2023. Source camera identification using hybrid feature set and machine learning classifiers; pp. 111–127. [Google Scholar]
  • 30.Tuama A., Comby F., Chaumont M. Camera Model Identification with the Use of Deep Convolutional Neural Networks; Proceedings of the IEEE International Workshop on Information Forensics & Security; Abu Dhabi, United Arab Emirates. 4–7 December 2016; pp. 1–6. [Google Scholar]
  • 31.Chen Y., Huang Y., Ding X. Camera model identification with residual neural network; Proceedings of the IEEE International Conference on Image Processing; Beijing, China. 17–20 September 2017; pp. 4337–4341. [Google Scholar]
  • 32.Bayar B., Stamm M.C. Augmented convolutional feature maps for robust CNN-based camera model identification; Proceedings of the IEEE International Conference on Image Processing; Beijing, China. 17–20 September 2017; pp. 4098–4102. [Google Scholar]
  • 33.Kang C., Kang S. Camera model identification using a deep network and a reduced edge dataset. Neural Comput. Appl. 2020;32:13139–13146. doi: 10.1007/s00521-019-04619-6. [DOI] [Google Scholar]
  • 34.Rafi A.M., Tonmoy T.I., Kamal U., Wu Q.J., Hasan M.K. RemNet: Remnant convolutional neural network for camera model identification. Neural Comput. Appl. 2021;33:3655–3670. doi: 10.1007/s00521-020-05220-y. [DOI] [Google Scholar]
  • 35.Elharrouss O., Akbari Y., Almadeed N., Al-Maadeed S., Khelifi F., Bouridane A. PDC-ViT: Source camera identification using pixel difference convolution and vision transformer. Neural Comput. Appl. 2025;37:6933–6949. doi: 10.1007/s00521-025-11004-z. [DOI] [Google Scholar]
  • 36.Yang P., Ni R., Zhao Y., Zhao W. Source camera identification based on content-adaptive fusion residual networks. Pattern Recognit. Lett. 2019;119:195–204. doi: 10.1016/j.patrec.2017.10.016. [DOI] [Google Scholar]
  • 37.Ding X., Chen Y., Tang Z., Huang Y. Camera identification based on domain knowledge-driven deep multi-task learning. IEEE Access. 2019;7:25878–25890. doi: 10.1109/ACCESS.2019.2897360. [DOI] [Google Scholar]
  • 38.Sychandran C., Shreelekshmi R. SCCRNet: A framework for source camera identification on digital images. Neural Comput. Appl. 2024;36:1167–1179. doi: 10.1007/s00521-023-09088-6. [DOI] [Google Scholar]
  • 39.Xiao Y., Tian H., Cao G., Yang D., Li H. Effective PRNU extraction via densely connected hierarchical network. Multimed. Tools Appl. 2022;81:20443–20463. doi: 10.1007/s11042-022-12507-w. [DOI] [Google Scholar]
  • 40.Bharathiraja S., Rajesh Kanna B., Hariharan M. A deep learning framework for image authentication: An automatic source camera identification Deep-Net. Arab. J. Sci. Eng. 2023;48:1207–1219. doi: 10.1007/s13369-022-06743-3. [DOI] [Google Scholar]
  • 41.Lu J., Li C., Huang X., Cui C., Emam M. Source camera identification algorithm based on multi-scale feature fusion. Comput. Mater. Contin. 2024;80:3047–3065. doi: 10.32604/cmc.2024.053680. [DOI] [Google Scholar]
42. Liu Y., Zhou Z., Yang Y., Law N.F.B., Bharath A.A. Efficient source camera identification with diversity-enhanced patch selection and deep residual prediction. Sensors. 2021;21:4701. doi: 10.3390/s21144701.
43. You C., Zheng H., Guo Z., Wang T., Wu X. Multiscale content-independent feature fusion network for source camera identification. Appl. Sci. 2021;11:6752. doi: 10.3390/app11156752.
44. Hui C., Jiang F., Liu S., Zhao D. Source camera identification with multi-scale feature fusion network. Proceedings of the IEEE International Conference on Multimedia and Expo; Taipei, Taiwan, 18–22 July 2022; pp. 1–6.
45. Huan S., Liu Y., Yang Y., Law N.F. Camera model identification based on dual-path enhanced ConvNeXt network and patches selected by uniform local binary pattern. Expert Syst. Appl. 2024;241:122501. doi: 10.1016/j.eswa.2023.122501.
46. Zheng H., You C., Wang T., Ju J., Li X. Source camera identification based on an adaptive dual-branch fusion residual network. Multimed. Tools Appl. 2024;83:18479–18495. doi: 10.1007/s11042-023-16290-0.
47. Rana K., Goyal P., Sharma G. Dual-branch convolutional neural network for robust camera model identification. Expert Syst. Appl. 2024;238:121828. doi: 10.1016/j.eswa.2023.121828.
48. Han Z., Yang Y., Zhang J., Li Y., Liu Y., Law N.F. A contrastive learning-based heterogeneous dual-branch network for source camera identification. Neurocomputing. 2025;645:130406. doi: 10.1016/j.neucom.2025.130406.
49. Han Z., Yang Y., Zhang J., Liu Y., Law N.F. DWT-RFNet: A wavelet-based deep learning method for robust source camera identification. Appl. Soft Comput. 2025;185:114027. doi: 10.1016/j.asoc.2025.114027.
50. Wang B., Wu S., Wei F., Wang Y., Hou J., Sui X. Virtual sample generation for few-shot source camera identification. J. Inf. Secur. Appl. 2022;66:103153. doi: 10.1016/j.jisa.2022.103153.
51. Wang B., Hou J., Ma Y., Wang F., Wei F. Multi-DS strategy for source camera identification in few-shot sample data sets. Secur. Commun. Netw. 2022;2022:8716884. doi: 10.1155/2022/8716884.
52. Wang B., Hou J., Wei F., Yu F., Zheng W. MDM-CPS: A few-shot sample approach for source camera identification. Expert Syst. Appl. 2023;229:120315. doi: 10.1016/j.eswa.2023.120315.
53. Cozzolino D., Verdoliva L. Noiseprint: A CNN-based camera model fingerprint. IEEE Trans. Inf. Forensics Secur. 2019;15:144–159. doi: 10.1109/TIFS.2019.2916364.
54. Mayer O., Stamm M.C. Forensic similarity for digital images. IEEE Trans. Inf. Forensics Secur. 2019;15:1331–1346. doi: 10.1109/TIFS.2019.2924552.
55. Sameer V.U., Naskar R. Deep Siamese network for limited labels classification in source camera identification. Multimed. Tools Appl. 2020;79:28079–28104. doi: 10.1007/s11042-020-09106-y.
56. Zheng M., Law N.F., Siu W.C. Unveiling image source: Instance-level camera device linking via context-aware deep Siamese network. Expert Syst. Appl. 2025;262:125617. doi: 10.1016/j.eswa.2024.125617.
57. Gloe T., Böhme R. The Dresden image database for benchmarking digital image forensics. Proceedings of the 2010 ACM Symposium on Applied Computing; New York, NY, USA, 22 March 2010; pp. 1584–1590.
58. De Marsico M., Nappi M., Riccio D., Wechsler H. Mobile iris challenge evaluation (MICHE)-I, biometric iris dataset and protocols. Pattern Recognit. Lett. 2015;57:17–23. doi: 10.1016/j.patrec.2015.02.009.
59. Shaya O., Yang P., Ni R., Zhao Y., Piva A. A new dataset for source identification of high dynamic range images. Sensors. 2018;18:3801. doi: 10.3390/s18113801.
60. Bernacki J., Scherer R. IMAGINE dataset: Digital camera identification image benchmarking dataset. Proceedings of the 20th International Conference on Security and Cryptography (SECRYPT); Rome, Italy, 10–12 July 2023; pp. 799–804.
61. Galdi C., Hartung F., Dugelay J.L. SOCRatES: A database of realistic data for source camera recognition on smartphones. Proceedings of the 8th International Conference on Pattern Recognition Applications and Methods (ICPRAM); Prague, Czech Republic, 19–21 February 2019; pp. 648–655.
62. Tian H., Xiao Y., Cao G., Zhang Y., Xu Z., Zhao Y. Daxing smartphone identification dataset. IEEE Access. 2019;7:101046–101053. doi: 10.1109/ACCESS.2019.2928356.
63. Shullani D., Fontani M., Iuliani M., Shaya O.A., Piva A. VISION: A video and image dataset for source identification. EURASIP J. Inf. Secur. 2017;2017:15. doi: 10.1186/s13635-017-0067-2.
64. Hadwiger B., Riess C. The Forchheim image database for camera identification in the wild. In: Del Bimbo A., Cucchiara R., Sclaroff S., Farinella G.M., Mei T., Bertini M., Escalante H.J., Vezzani R., editors. Pattern Recognition. ICPR International Workshops and Challenges. Volume 12666. Springer International Publishing; Berlin/Heidelberg, Germany: 2021. pp. 500–515.
65. Costa F.O., Silva E., Eckmann M., Scheirer W.J., Rocha A. Open set source camera attribution and device linking. Pattern Recognit. Lett. 2014;39:92–101. doi: 10.1016/j.patrec.2013.09.006.
66. Iuliani M., Fontani M., Piva A. A leak in PRNU based source identification—Questioning fingerprint uniqueness. IEEE Access. 2021;9:52455–52463. doi: 10.1109/ACCESS.2021.3070478.
67. Dong J., Wang W., Tan T. CASIA image tampering detection evaluation database. Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing; Beijing, China, 6–10 July 2013; pp. 422–426.
68. Novozamsky A., Mahdian B., Saic S. IMD2020: A large-scale annotated dataset tailored for detecting manipulated images. Proceedings of the IEEE Winter Applications of Computer Vision Workshops (WACVW); Snowmass Village, CO, USA, 1–5 March 2020; pp. 71–80.
69. Fischinger D., Boyer M. DF2023: The digital forensics 2023 dataset for image forgery detection. arXiv. 2025. doi: 10.48550/arXiv.2503.22417.
70. Irshad M., Liew S.R.C., Law N.F., Loo K.H. CAMID: An assuasive approach to reveal source camera through inconspicuous evidence. Forensic Sci. Int. Digit. Investig. 2023;46:301616. doi: 10.1016/j.fsidi.2023.301616.
71. Kirchner M., Johnson C. SPN-CNN: Boosting sensor-based source camera attribution with deep learning. Proceedings of the IEEE International Workshop on Information Forensics and Security; Delft, The Netherlands, 9–12 December 2019.
72. Sun Y., Li Z., Zhang Y., Pan T., Dong B., Guo Y., Wang J. Efficient attention mechanisms for large language models: A survey. arXiv. 2025. doi: 10.48550/arXiv.2507.19595.
73. Hu H., Wang X., Zhang Y., Chen Q., Guan Q. A comprehensive survey on contrastive learning. Neurocomputing. 2024;610:128645. doi: 10.1016/j.neucom.2024.128645.
74. Lopez E., Etxebarria-Elezgarai J., Amigo J.M., Seifert A. The importance of choosing a proper validation strategy in predictive models: A tutorial with real examples. Anal. Chim. Acta. 2023;1275:341532. doi: 10.1016/j.aca.2023.341532.
75. Moslemi A., Briskina A., Dang Z., Li J. A survey on knowledge distillation: Recent advancements. Mach. Learn. Appl. 2024;18:100605. doi: 10.1016/j.mlwa.2024.100605.
76. Samsung Semiconductor Global. AI Camera: What It Is, How It Works. Available online: https://semiconductor.samsung.com/applications/ai/ai-camera/ (accessed on 25 November 2025).
77. Google Store. Improve Photos with Pixel’s AI Camera Technology. Available online: https://store.google.com/intl/en_uk/ideas/articles/what-is-an-ai-camera/ (accessed on 25 November 2025).
78. HONOR SA. What Is an AI Camera Phone, and How Does It Work? Available online: https://www.honor.com/sa-en/blog/what-are-ai-camera-phones/ (accessed on 25 November 2025).

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.


Articles from Sensors (Basel, Switzerland) are provided here courtesy of Multidisciplinary Digital Publishing Institute (MDPI)
