Abstract
Phishing attacks continue to evolve rapidly, with new campaigns emerging faster than traditional detection systems can adapt. Existing machine learning approaches require extensive labeled datasets, creating vulnerability windows when novel attack patterns appear. This limitation is particularly problematic in cybersecurity where obtaining large labeled datasets for emerging threats is time-consuming and expensive. This paper presents XF-PhishBERT, an explainable few-shot learning framework for phishing detection that combines ModernBERT transformer architecture with domain-specific URL features. The approach integrates prototypical networks and model-agnostic meta-learning (MAML) to enable effective detection with minimal training examples. A consensus-based feature selection methodology combines Random Forest importance, Mutual Information, and Recursive Feature Elimination with Cross-Validation (RFECV) to identify optimal feature subsets. The framework incorporates comprehensive explainability through SHAP analysis, attention visualization, and counterfactual explanations. Experimental evaluation on two datasets demonstrates that XF-PhishBERT achieves 99.9% accuracy with 10 examples per class and maintains 98.5% accuracy in one-shot learning scenarios. Cross-dataset evaluation shows 186% performance retention compared to 39% for traditional methods. Ablation studies confirm the contribution of each component, with ModernBERT integration providing 4.54 percentage point improvement over baseline approaches. Real-world deployment through a browser extension validated practical utility with 98.3% precision and 42ms average latency. The results demonstrate that few-shot learning can address the fundamental challenge of limited labeled data in cybersecurity applications while providing transparent decision-making support for security analysts.
Keywords: Phishing detection, BERT, Few-shot learning, Large language models (LLMs), Meta-learning
Subject terms: Mathematics and computing, Computer science
Introduction
Phishing attacks continue to be one of the most prevalent and damaging cybersecurity threats faced by organizations and individuals worldwide1. These social engineering attacks aim to deceive users into divulging sensitive information by masquerading as legitimate sites and services. Despite significant advancements in detection technologies, phishing campaigns are becoming increasingly sophisticated and employ various obfuscation techniques to evade traditional detection methods2,3.
A critical and underaddressed challenge in phishing detection lies in the temporal mismatch between threat emergence and detection capability. New phishing campaigns often utilize novel techniques, domain patterns, and social engineering strategies that differ substantially from previously observed examples. Traditional machine learning approaches require hundreds to thousands of labeled examples to achieve acceptable performance4, yet emerging phishing campaigns typically provide only a handful of confirmed instances before they either succeed or are manually taken down. Consider the practical scenario: when a new phishing campaign targeting a specific organization emerges, security analysts may identify only 3–5 confirmed examples before needing to implement protective measures. Existing detection systems cannot effectively learn from such limited data, creating dangerous vulnerability windows that attackers exploit.
Current phishing detection methodologies exhibit three fundamental limitations that motivate our research. Traditional supervised learning approaches, including recent deep learning methods5,6, require extensive labeled datasets for effective training and fail catastrophically when encountering novel attack patterns with limited examples. Existing systems, whether feature-based2,7 or transformer-based8,9, require complete retraining or extensive fine-tuning when facing new attack types, a process that typically requires weeks to months. Most current approaches operate as “black boxes,” providing accurate classifications without transparent reasoning6,10, limiting practical utility for security analysts who require understanding of decision rationale for investigation workflows and system trust.
The convergence of these limitations reveals a critical research gap: no existing framework enables rapid, explainable adaptation to novel phishing patterns with minimal labeled examples. While few-shot learning has demonstrated remarkable success in computer vision and natural language processing11,12, its application to cybersecurity domains remains largely unexplored, despite the inherent scarcity of labeled data for emerging threats. Furthermore, the integration of transformer-based language models with domain-specific cybersecurity features for few-shot scenarios has not been systematically investigated.
This paper introduces XF-PhishBERT, addressing the identified research gap through four distinct contributions that differentiate our work from existing approaches. First, unlike existing phishing detection systems that require extensive training data, we introduce the first comprehensive few-shot learning framework specifically designed for phishing detection, enabling effective detection with as few as 5-10 examples per class. Second, we develop a novel integration of ModernBERT transformer architecture with domain-specific URL features through dynamic attention mechanisms, surpassing both pure transformer methods and traditional feature-based approaches by combining semantic understanding with explicit structural pattern recognition. Third, our ensemble feature selection methodology combines Random Forest importance, Mutual Information, and Recursive Feature Elimination with Cross-Validation to identify optimal feature subsets, providing a replicable framework for cybersecurity feature optimization. Fourth, we introduce a multi-faceted explainability system specifically designed for cybersecurity contexts, combining SHAP analysis, attention visualization, URL component decomposition, and counterfactual explanations.
Our experimental evaluation demonstrates the practical significance of these contributions. XF-PhishBERT achieves 99.9% accuracy with 10 examples per class and maintains 98.5% accuracy in one-shot scenarios, performance levels unattainable by existing approaches under similar data constraints. Cross-dataset evaluation confirms superior generalization (186% performance retention vs. 39% for traditional methods), while real-world deployment validation through a browser extension confirms operational viability with 98.3% precision and 42ms latency. These results establish few-shot learning as a viable paradigm for cybersecurity applications where rapid adaptation to emerging threats is essential and labeled data is inherently scarce.
The remainder of this paper is organized as follows: Section 2 reviews current approaches to phishing detection and identifies specific limitations motivating our approach. Section 3 presents the XF-PhishBERT framework, detailing the hybrid architecture, feature selection methodology, and explainability components. Section 4 describes comprehensive experimental evaluation across multiple datasets and deployment scenarios. Section 5 discusses implications, limitations, and comparison with existing methods. Section 6 concludes with a summary of contributions and future research directions.
Related work
Phishing detection methodologies have significantly evolved, transitioning from traditional feature engineering to sophisticated language models and hybrid frameworks. This section reviews the current state of the art across these approaches, highlighting their contributions and limitations.
Traditional feature engineering approaches
Early feature-based approaches established the foundation for systematic phishing detection through comprehensive URL analysis. Aljofey et al.2 extracted URL character sequences, hyperlink information, and HTML textual content, achieving 96.76% accuracy with 1.39% false-positive rate using XGBoost classifiers. Their work demonstrated the effectiveness of multi-modal feature extraction, combining structural URL properties with content-based indicators to improve detection accuracy.
Sameen et al.3 introduced PhishHaven, implementing lexical analysis, URL HTML Encoding, and URL Hit approaches for tiny URLs, with their ensemble-based system achieving 98% accuracy for both AI-generated and simple phishing URLs using multi-threading and ten parallel machine learning models for real-time detection. This approach highlighted the importance of ensemble methods and parallel processing for achieving both high accuracy and computational efficiency in operational environments.
Feature selection optimization emerged as a critical research direction, with Al-Sarem et al.7 developing a genetic algorithm-enhanced stacking ensemble model that achieved up to 98.58% accuracy by optimally combining multiple learning methods. Their work demonstrated that careful feature selection and ensemble optimization could significantly improve detection performance while reducing computational overhead.
More recently, Setu et al.13 proposed RSTHFS, a Rough Set Theory-based Hybrid Feature Selection method that maintained 95.48% average accuracy while reducing features by 69.11%, demonstrating significant improvements in both performance and computational efficiency. This approach addressed the challenge of feature redundancy while preserving discriminative power, establishing new benchmarks for efficient feature selection in phishing detection.
Despite these advancements, Wei et al.10 identified fundamental limitations in conventional feature engineering approaches, noting their lack of generalization to evolving attacks and dataset dependencies, which motivated exploration of more adaptable methodologies. These limitations highlighted the need for approaches that could automatically learn relevant patterns without extensive manual feature crafting.
Language model applications in phishing detection
The advent of transformer-based language models revolutionized phishing detection by enabling automatic feature learning and contextual understanding. Su and Su5 implemented a BERT-based approach that achieved 98.78%, 96.71%, and 99.98% accuracy on Kaggle, GitHub, and ISCX 2016 datasets respectively, with an average prediction time of 0.010146 seconds per URL for real-time detection capabilities. Their work established BERT as a viable foundation for phishing detection while demonstrating the importance of cross-dataset evaluation.
Character-level language models have shown particular efficacy in URL analysis, with Almousa and Anwar14 developing CharacterBERT for detecting social semantic attacks, achieving 99.65% detection accuracy through 5-fold cross-validation by replacing token embedding with character-level embedding to better understand non-standardized URLs. This approach addressed the challenge of analyzing URLs with unconventional character patterns and obfuscation techniques commonly used in phishing attacks.
Contextual understanding has been enhanced through approaches like Afzal et al.15, who leveraged BERT embeddings to generate contextualized representations, achieving 98.0% accuracy in classifying URLs into five distinct categories by capturing semantic information beyond traditional embedding techniques like word2vec and FastText. Their multi-class approach demonstrated the potential for fine-grained phishing categorization beyond binary classification.
Elsadig et al.8 combined BERT feature extraction with deep convolutional neural networks, achieving 96.66% accuracy and outperforming traditional methods, while Wei et al.10 demonstrated that fine-tuned BERT models not only avoid manual feature engineering but also provide interpretable outputs using Captum and LIME, achieving 95.54% and 96.50% accuracy on different datasets. These works highlighted the dual benefits of transformer models: automatic feature learning and enhanced interpretability.
Advanced neural architectures
Advanced neural architectures have pushed performance boundaries significantly beyond traditional approaches. Liu et al.6 introduced PMANet, a pre-trained language model-guided multi-level feature attention network that outperforms state-of-the-art methods in both binary and multi-class classification across four public datasets, demonstrating superior generalization across small-scale data, class imbalance, and adversarial attacks. While federated learning approaches have shown effectiveness in distributed cybersecurity applications16–18, their network-level focus and communication overhead limit applicability to real-time phishing detection scenarios that require immediate response with minimal examples.
Liu et al.9 proposed TransURL, combining character-aware Transformer model with multi-layer encoding and spatial pyramid attention, achieving 40% peak F1-score improvement in class-imbalanced scenarios and exceeding baseline results by 14.13% in accuracy under adversarial attacks. This work addressed critical challenges in handling imbalanced datasets and adversarial robustness, common issues in real-world phishing detection deployment.
Hybrid architectures have shown remarkable promise, exemplified by Xie et al.19 who developed a dual-branch Temporal Convolutional Network with mask attention mechanism, achieving 97.66% accuracy on their Crawling2024 dataset by effectively capturing both local correlations and long-term dependencies in domain names. Their approach demonstrated the effectiveness of specialized architectures designed specifically for URL sequence analysis.
Researchers have addressed practical implementation challenges, with Maci et al.20 utilizing Double Deep Q-Network classifiers for unbalanced datasets by embedding data balancing ratios into reward functions, enabling models to distinguish sample distributions within classes according to absolute reward values. This reinforcement learning approach offered novel solutions to the persistent challenge of class imbalance in phishing datasets.

Beyond phishing-specific applications, federated learning approaches have demonstrated effectiveness in distributed cybersecurity scenarios. Khan et al.16 developed a bidirectional SRU network with Bloom filter preprocessing for IoMT intrusion detection, achieving 99.27% accuracy while maintaining privacy through federated training. Khan et al.17 proposed gradient-boosted federated learning for IoT security, reaching 99.75% accuracy on imbalanced datasets through adaptive client selection. While these approaches target network-level threats rather than URL analysis, they highlight the potential of federated architectures for privacy-preserving security applications, though their multi-round communication requirements (20+ rounds) limit applicability to real-time phishing detection scenarios requiring immediate response.
Meta-learning and few-shot approaches
While few-shot learning has demonstrated remarkable success in computer vision and natural language processing4,11,12, its application to cybersecurity domains remains limited. Prototypical networks11 have shown effectiveness in learning from minimal examples by computing class prototypes and using distance-based classification, while model-agnostic meta-learning12 enables rapid adaptation to new tasks through gradient-based optimization.
Recent advances in meta-learning, including first-order approximations21 and matching networks22, have improved computational efficiency while maintaining adaptation capabilities. However, these approaches have not been systematically explored for phishing detection, despite their potential for addressing the fundamental challenge of limited labeled data in emerging threat scenarios.

Reinforcement learning has also been explored for autonomous threat detection, with Khan et al.18 introducing Fed-Inforce-Fusion, a federated Q-learning approach for IoMT security achieving 99.40% accuracy. However, reinforcement learning approaches face convergence challenges in few-shot scenarios, where limited examples prevent an effective exploration-exploitation balance. Unlike network traffic patterns with continuous feedback, URL-based phishing detection requires immediate classification decisions without iterative environment interaction, making supervised few-shot learning more suitable for this domain.
Explainability in cybersecurity
The importance of explainable AI in cybersecurity has gained significant attention, with approaches like SHAP23 and LIME24 providing model-agnostic explanations for machine learning predictions. Attention visualization techniques25 have been particularly valuable for understanding transformer-based models, enabling analysts to see which parts of the input the model focuses on during classification.
Recent work by Mia et al.26 examined the trustworthiness of features across diverse phishing datasets, highlighting the importance of explainable AI in understanding model behavior and ensuring robust performance across different data distributions. Their findings emphasized the need for interpretability frameworks specifically designed for cybersecurity applications.
Limitations of current approaches
Despite significant advances, current approaches exhibit several critical limitations. Many models lack explainability, functioning as “black boxes” that provide accurate classifications without transparent reasoning6,26. Few-shot learning capabilities remain underexplored despite their potential for limited labeled data scenarios4. While both engineered features and language models demonstrate individual success, their optimal integration remains an open research question.
Cross-dataset generalization presents persistent challenges27, with models often exhibiting significant performance degradation when deployed in environments different from their training distribution. Recent studies28 have highlighted the need for robust approaches that maintain effectiveness across diverse phishing patterns and attack strategies.

Additionally, while federated learning frameworks16–18 address privacy concerns in distributed security applications, they assume sufficient local data for meaningful training and require multiple communication rounds for convergence, making them unsuitable for immediate response to novel phishing campaigns where organizations may possess only a handful of examples.
Furthermore, existing approaches typically require extensive retraining when encountering novel attack patterns, creating vulnerability windows during which new threats remain undetected. The rapid evolution of phishing techniques necessitates detection systems capable of immediate adaptation with minimal examples, a capability that current methods cannot adequately provide.
These limitations motivate the development of XF-PhishBERT, which addresses the critical gaps through explainable few-shot learning, hybrid feature integration, and comprehensive cross-dataset evaluation, establishing a new paradigm for adaptive and transparent phishing detection in dynamic threat environments.
Research gaps and motivation
Despite significant advances in phishing detection methodologies, three critical research gaps persist in the current literature. First, existing approaches demonstrate limited adaptability to emerging phishing techniques, requiring extensive retraining cycles when encountering novel attack patterns. Traditional detection systems typically require complete model retraining when facing zero-day phishing campaigns, creating vulnerability windows that attackers can exploit1,29. This limitation becomes particularly problematic in dynamic threat environments where new attack vectors emerge daily.
Second, while transformer-based models show promise for URL analysis, their integration with domain-specific engineered features remains underexplored. Current approaches tend to favor either pure deep learning methods that neglect domain expertise5,8 or traditional feature engineering approaches that lack the contextual understanding capabilities of modern language models7,10. The optimal synthesis of these complementary approaches has not been systematically investigated in the context of few-shot learning scenarios.
Third, current systems lack comprehensive explainability frameworks specifically designed for cybersecurity contexts. While recent work has demonstrated the importance of model interpretability in security applications6,26, existing explainability approaches are often borrowed from other domains without adaptation to the unique requirements of cybersecurity analysts. Security professionals require not only accurate predictions but also transparent reasoning that can guide investigation workflows and support decision-making under time-critical conditions.
These limitations collectively highlight the need for adaptive detection frameworks that can rapidly respond to emerging threats with minimal labeled examples while providing transparent, analyst-actionable explanations. The convergence of these challenges motivates our development of XF-PhishBERT, which addresses each gap through explainable few-shot learning, hybrid feature integration, and comprehensive interpretability specifically designed for cybersecurity deployment scenarios.
Methods
In this section, we present XF-PhishBERT, our novel few-shot phishing detection framework that addresses the critical challenge of rapidly identifying emerging phishing attacks with minimal labeled examples. Figure 1 provides an overview of our approach, which integrates transformer-based language modelling with engineered URL features, advanced feature selection and explainable AI components.
Fig. 1.
Overview of the XF-PhishBERT framework. The diagram illustrates the four main components: (1) Real-time Data Integration, (2) Feature Representation and Selection, (3) Hybrid Neural Architecture, and (4) Explainability Framework.
Problem formulation
The phishing detection task is formulated as a few-shot learning problem within the episodic learning paradigm. Given a support set $S = \{(x_i, y_i)\}_{i=1}^{K}$ containing $K$ labeled examples, where $K$ is small (typically 5–10 examples per class), the objective is to classify a query set $Q$ of unseen URLs. Each URL instance $x$ is represented through both its string representation and an associated feature vector, with labels $y \in \{0, 1\}$ denoting legitimate and phishing URLs, respectively.
Unlike traditional supervised approaches requiring thousands of labeled examples4, our framework adopts meta-learning principles to enable generalization from minimal examples. This directly addresses the practical challenges in cybersecurity, where obtaining large labeled datasets of emerging phishing attacks is prohibitively expensive and time-consuming.
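To make the episodic setup concrete, the sketch below samples one balanced K-shot episode (support/query split) from a labeled URL pool. The pool contents, the `sample_episode` helper, and the shot counts are illustrative placeholders, not the framework's actual sampler.

```python
import random

def sample_episode(pool, k_shot=5, n_query=3, seed=42):
    """Draw a balanced K-shot episode: k_shot support and n_query
    query examples per class from a list of (url, label) pairs."""
    rng = random.Random(seed)
    support, query = [], []
    for label in (0, 1):  # 0 = legitimate, 1 = phishing
        urls = [u for u, y in pool if y == label]
        picked = rng.sample(urls, k_shot + n_query)
        support += [(u, label) for u in picked[:k_shot]]
        query += [(u, label) for u in picked[k_shot:]]
    return support, query

# Toy pool of 40 labeled URLs (hypothetical data for illustration).
pool = [(f"https://site{i}.example", i % 2) for i in range(40)]
support, query = sample_episode(pool)
print(len(support), len(query))  # 10 support, 6 query examples
```

During meta-training, many such episodes are drawn so the model learns to generalize from each small support set to its query set.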
Feature representation and selection
Multi-modal URL representation
As illustrated in Fig. 1 (component 2), we represent each URL using complementary modalities:
URL Sequence Representation: Each URL underwent specialized tokenization for transformer processing. We implemented character-level segmentation of special characters (/, ., -, =, ?, &, _, :, @), protocol separation, and subword tokenization using ModernBERT’s WordPiece tokenizer30. This preprocessing preserved the hierarchical structure of the URLs while enabling the transformer to identify subtle linguistic patterns indicative of phishing. For example, the URL “https://login-secure.example.com/verify.php” is tokenized as “https login - secure . example . com / verify . php”, allowing the model to better analyze each component.
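The delimiter-splitting preprocessing described above can be sketched as follows. The `pretokenize_url` function name is illustrative; this approximates only the character-level segmentation stage, after which ModernBERT's WordPiece tokenizer would apply subword tokenization.

```python
import re

# Delimiters treated as standalone tokens, per the preprocessing above.
URL_DELIMITERS = r"([/.\-=?&_:@])"

def pretokenize_url(url: str) -> str:
    """Split a URL on structural delimiters so that each component
    becomes a separate whitespace-delimited token."""
    # Separate the protocol from the rest of the URL.
    url = url.replace("://", " ")
    # Surround every delimiter with spaces, then collapse whitespace runs.
    spaced = re.sub(URL_DELIMITERS, r" \1 ", url)
    return " ".join(spaced.split())

print(pretokenize_url("https://login-secure.example.com/verify.php"))
# → "https login - secure . example . com / verify . php"
```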
Engineered Feature Vectors: We extract dataset-specific feature vectors for each URL:
For the comprehensive phishing website dataset31, we utilized all 111 engineered features organized into five structural categories:
URL-level features (19 dimensions): character distributions (dots, hyphens, and slashes) and URL length
Domain-specific features (21 dimensions): character distributions, vowel frequency, domain length, and IP-based indicators
Path features (18 dimensions): directory-structure characterization through character distributions
File features (18 dimensions): filename components and their character distributions
Parameter features (20 dimensions): query-parameter character distributions and counts
Network features (15 dimensions): TLS/SSL status, domain age, redirects, and indexing status
For the PhiUSIIL Dataset29: We extracted all 54 available features, including URLLength, DomainLength, TLDLength, CharContinuationRate, URLCharProb, and other URL-derived metrics. When both datasets were used in the same experimental setting, feature dimensionality was standardized through zero-padding or feature selection to ensure consistency.
This dual representation enables our model to simultaneously leverage the semantic understanding capabilities of ModernBERT with the explicit structural patterns captured by engineered features, creating a more comprehensive URL fingerprint for few-shot learning.
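A handful of the URL-level engineered features reduce to straightforward string operations, as in this minimal sketch. The extraction logic is illustrative; the feature names follow the dataset's qty_* convention, but this is not the datasets' original extraction code.

```python
from urllib.parse import urlparse

def url_level_features(url: str) -> dict:
    """Compute a small subset of the engineered URL-level and
    domain-level features (character counts and lengths)."""
    domain = urlparse(url).netloc
    return {
        "length_url": len(url),            # total URL length
        "qty_dot_url": url.count("."),     # dots anywhere in the URL
        "qty_hyphen_url": url.count("-"),  # hyphens anywhere in the URL
        "qty_slash_url": url.count("/"),   # slashes anywhere in the URL
        "qty_dot_domain": domain.count("."),
        "domain_length": len(domain),
    }

feats = url_level_features("https://login-secure.example.com/verify.php")
print(feats["qty_dot_url"], feats["domain_length"])  # → 3 24
```

The full feature vectors additionally require network lookups (TLS status, domain age, DNS records), which are omitted here.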
Tokenization strategy validation
Our custom tokenization approach segments URLs into semantically meaningful components through character-level separation of delimiters (/, ., -, =, ?, &, _, :, @) and protocol identification. Comparative evaluation against standard WordPiece tokenization reveals significant performance improvements: custom tokenization achieves 97.8% accuracy versus 94.3% for WordPiece-only approaches (p < 0.001, n = 500 episodes). The performance gain stems from preserving URL structural boundaries that WordPiece often fragments inappropriately.
For instance, the suspicious domain “paypal-secure.com” is preserved as meaningful tokens [“paypal”, “-”, “secure”, “.”, “com”] rather than WordPiece fragments [“pay”, “##pal”, “##-”, “##sec”, “##ure”, “##.com”]. Token-level attention analysis confirms that structurally preserved tokens receive higher attention weights than their fragmented equivalents, indicating improved semantic understanding of URL components.
Feature Extraction for Edge Cases: Shortened URL feature extraction operates on expanded destination URLs while maintaining shortening service metadata. For truncated URLs, feature extraction prioritizes domain-level characteristics, with missing values imputed using training data statistics. The system maintains feature completeness scores, enabling confidence-weighted predictions when information is limited or obscured.
Ensemble feature selection
A key contribution of this study is a novel ensemble feature selection framework that addresses the limitations of single-method approaches. As shown in Fig. 1 (component 2), we integrated three complementary selection methods:
Random Forest Importance Analysis: We employed a Random Forest classifier with 100 estimators and Gini impurity to identify significant features. For feature $j$, the importance is calculated as:

$$\mathrm{Imp}(j) = \frac{1}{N_{\text{trees}}} \sum_{T} \sum_{t \in T :\, v(t) = j} \Delta i(t) \qquad (1)$$

where $\Delta i(t)$ is the impurity decrease from splitting on feature $j$ at node $t$.
Mutual Information Maximization: We implemented SelectKBest with the mutual_info_classif estimator23. The mutual information between feature $X$ and target $Y$ is calculated as:

$$I(X; Y) = \sum_{x} \sum_{y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)} \qquad (2)$$
Recursive Feature Elimination with Cross-Validation (RFECV): We used GradientBoostingClassifier as the base estimator with 5-fold stratified cross-validation optimizing the F1-score24.
Our ensemble approach, unlike previous single-method implementations25,32, integrates these methods using a consensus scoring mechanism:

$$\mathrm{Score}(j) = \sum_{m=1}^{3} \mathbb{1}\left[ j \in F_m \right] \qquad (3)$$

where $F_m$ is the feature set from method $m$, and $\mathbb{1}[\cdot]$ is the indicator function. Features appearing in at least two selection methods are retained, with additional correlation-aware filtering to remove highly correlated features.

This rigorous selection process yielded a different feature set for each dataset. For the comprehensive phishing website dataset (DATA1), our ensemble feature selection methodology identified 57 critical features out of 111 (48.6% dimensionality reduction). For the PhiUSIIL Dataset (DATA2), we identified 47 high-signal features (13.0% dimensionality reduction from the original 54 features). These optimized feature sets significantly outperformed both the full-feature models and any single selection method in our experiments.

For DATA1, the selected features span multiple URL components and security indicators. The URL-level features include character distributions (qty_dot_url, qty_hyphen_url, qty_slash_url, and qty_questionmark_url) and length metrics (length_url). Domain-specific features include character frequencies (qty_dot_domain, qty_hyphen_domain, qty_vowels_domain) and structural properties (domain_length). Path-related features comprise extensive character distribution metrics in directories and files, capturing suspicious patterns such as excessive special characters (question marks, equals signs, and asterisks) often associated with phishing attempts. The selection also includes comprehensive file-related features that measure character distributions and length attributes. Network security features provide critical signals through domain_spf, asn_ip, time_domain_activation, time_domain_expiration, qty_ip_resolved, qty_nameservers, qty_mx_servers, ttl_hostname, tls_ssl_certificate, and qty_redirect. This feature set enables the model to detect subtle patterns across the URL hierarchy that distinguish legitimate websites from phishing ones.

For the PhiUSIIL Dataset (DATA2), our ensemble approach identified distinctive features spanning URL structure and webpage content characteristics.
The most significant URL-based features included URLLength, URLSimilarityIndex, CharContinuationRate, NoOfSubDomain, and various character composition metrics (NoOfLettersInURL, LetterRatioInURL, NoOfDegitsInURL, DegitRatioInURL, NoOfOtherSpecialCharsInURL, SpacialCharRatioInURL), along with the presence of HTTPS as a security indicator. Content-based features showing high importance include LineOfCode, LargestLineLength, DomainTitleMatchScore, URLTitleMatchScore, and legitimacy signals such as HasFavicon, IsResponsive, HasDescription, NoOfiFrame, HasSocialNet, HasSubmitButton, and HasCopyrightInfo. Resource-related metrics (NoOfImage, NoOfCSS, NoOfJS, NoOfSelfRef, and NoOfExternalRef) complement the feature set by capturing the content richness typically associated with legitimate websites. The integration of both URL structure and content legitimacy indicators enables comprehensive phishing detection that goes beyond simple URL analysis.
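The consensus mechanism of Eq. (3) reduces to vote counting over the three per-method feature sets. The sketch below illustrates this with hypothetical feature sets; `consensus_select` and the sets themselves are placeholders, not the paper's actual selections.

```python
def consensus_select(feature_sets, min_votes=2):
    """Keep a feature when at least `min_votes` of the selection
    methods chose it (the consensus score of Eq. 3)."""
    votes = {}
    for selected in feature_sets:
        for f in selected:
            votes[f] = votes.get(f, 0) + 1
    return sorted(f for f, v in votes.items() if v >= min_votes)

# Hypothetical per-method selections for illustration.
rf_sel  = {"length_url", "qty_dot_url", "domain_length", "tls_ssl_certificate"}
mi_sel  = {"length_url", "qty_dot_url", "asn_ip"}
rfe_sel = {"length_url", "domain_length", "asn_ip", "tls_ssl_certificate"}

print(consensus_select([rf_sel, mi_sel, rfe_sel]))
```

In the full pipeline, a correlation filter is applied afterwards to drop redundant features from the consensus set.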
Hybrid neural architecture
The core of our approach is a novel hybrid architecture that integrates prototypical networks and model-agnostic meta-learning (MAML) with transformer-based representation learning, as illustrated in Fig. 1 (component 3). Figure 2 provides a detailed view of the ModernBERT architecture and its integration with the proposed meta-learning components.
Fig. 2.
ModernBERT-based hybrid architecture for few-shot phishing detection. The diagram illustrates the integration of transformer-based representation learning with prototypical networks and model-agnostic meta-learning components.
ModernBERT-based feature extraction
Our framework employs ModernBERT30 as its foundation, a recently developed encoder-only architecture that advances transformer-based representation learning. ModernBERT incorporates several architectural innovations that make it particularly suitable for phishing detection, including rotary positional embeddings that enable a better understanding of token relationships, GeGLU activation layers that improve representational capacity, and an alternating attention mechanism that balances computational efficiency with semantic comprehension.
For our implementation, we utilized ModernBERT-base (12 layers, 768 hidden dimensions) with several task-specific adaptations: (1) selective layer freezing of the first 6 layers to prevent overfitting in the few-shot regime; (2) gradient checkpointing for memory efficiency during training; and (3) extraction of the [CLS] token representation as the primary URL embedding.
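To make the layer-freezing adaptation concrete, the following sketch applies the same pattern to a stand-in encoder (the real implementation loads ModernBERT-base through the Transformers library; the toy module below only mirrors its 12 layers and 768 hidden units, and all names are ours):

```python
import torch
import torch.nn as nn

class ToyEncoder(nn.Module):
    """Stand-in for ModernBERT-base: 12 layers, 768 hidden dimensions."""
    def __init__(self, n_layers=12, d=768):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model=d, nhead=8,
                                       dim_feedforward=1024, batch_first=True)
            for _ in range(n_layers)
        )

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

def freeze_first_layers(encoder, n_frozen=6):
    """Freeze the first n_frozen layers to limit overfitting in the few-shot regime."""
    for layer in encoder.layers[:n_frozen]:
        for p in layer.parameters():
            p.requires_grad = False

encoder = ToyEncoder()
freeze_first_layers(encoder, 6)

# [CLS]-style pooling: take the first token's representation as the URL embedding.
tokens = torch.randn(1, 16, 768)          # (batch, seq_len, hidden)
cls_embedding = encoder(tokens)[:, 0, :]  # (batch, 768)
```

Only the upper six layers then receive gradients, which is the parameter-efficient regime the paper describes.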
Engineered feature encoder: For the selected features, we implement an encoding network with layer normalization and ReLU activation:

$$h_f = \mathrm{LayerNorm}\!\left(W_2\,\mathrm{ReLU}(W_1 x + b_1) + b_2\right) \tag{4}$$

where $x$ is the input feature vector, $W_1$, $W_2$ are weight matrices, and $b_1$, $b_2$ are bias terms.
Multi-modal fusion: We concatenate the BERT embedding $h_{\mathrm{BERT}}$ with the feature embedding $h_f$ and project to a joint embedding space:

$$h = [\,h_{\mathrm{BERT}} \,;\, h_f\,] \tag{5}$$

$$z = W_p h + b_p \tag{6}$$

$$\hat{z} = \frac{z}{\lVert z \rVert_2} \tag{7}$$

The resulting normalized embedding $\hat{z}$ serves as the unified representation for both prototypical and MAML components.
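A minimal PyTorch sketch of this fusion step, assuming the 768/256/512 dimensions described elsewhere in the paper (the module and variable names are ours, not from a released implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionModule(nn.Module):
    """Concatenate the BERT and feature embeddings, project to a joint
    space, and L2-normalize the result."""
    def __init__(self, d_bert=768, d_feat=256, d_joint=512):
        super().__init__()
        self.proj = nn.Linear(d_bert + d_feat, d_joint)

    def forward(self, h_bert, h_feat):
        h = torch.cat([h_bert, h_feat], dim=-1)   # concatenation
        z = self.proj(h)                          # linear projection
        return F.normalize(z, p=2, dim=-1)        # unit-norm embedding

fusion = FusionModule()
z = fusion(torch.randn(4, 768), torch.randn(4, 256))
```

The L2 normalization makes squared Euclidean distances in the joint space bounded, which stabilizes the prototype distances used downstream.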
Prototypical network component
Our prototypical network component builds upon the work of Snell et al.11 with several enhancements for detecting phishing. For each class c, a prototype $p_c$ is computed as the mean embedding of support examples:

$$p_c = \frac{1}{|S_c|} \sum_{(x_i,\, y_i) \in S_c} f_\theta(x_i) \tag{8}$$

where $S_c$ is the support set for class c, and $f_\theta$ is the embedding function. For classification, we computed the negative squared Euclidean distance between a query embedding and each prototype:

$$d(q, c) = -\lVert f_\theta(q) - p_c \rVert_2^2 \tag{9}$$

The class probability distribution is calculated as:

$$P(y = c \mid q) = \frac{\exp\!\left(d(q, c)/\tau\right)}{\sum_{c'} \exp\!\left(d(q, c')/\tau\right)} \tag{10}$$

where $\tau$ is a temperature parameter optimized via hyperparameter search.
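The prototype and distance computations above reduce to a few lines of PyTorch; the 2-d toy embeddings below are illustrative only:

```python
import torch

def prototypes(support_emb, support_labels, n_classes):
    """Class prototypes as the mean of each class's support embeddings."""
    return torch.stack([
        support_emb[support_labels == c].mean(dim=0) for c in range(n_classes)
    ])

def proto_probs(query_emb, protos, tau=1.0):
    """Softmax over temperature-scaled negative squared Euclidean distances."""
    d = -torch.cdist(query_emb, protos).pow(2)    # (n_query, n_classes)
    return torch.softmax(d / tau, dim=-1)

# 2-way episode: class-0 support near (0,0), class-1 support near (5,5).
support = torch.tensor([[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.0, 5.2]])
labels = torch.tensor([0, 0, 1, 1])
protos = prototypes(support, labels, n_classes=2)
probs = proto_probs(torch.tensor([[0.1, 0.1]]), protos)   # query near class 0
```

A query close to a prototype receives nearly all the probability mass, which is why a handful of support examples per class suffices.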
MAML-inspired component
The MAML component implements the optimization-based approach proposed by Finn et al.12 with phishing-specific modifications. The inner-loop adaptation is:

$$\theta' = \theta - \alpha \nabla_\theta \mathcal{L}_S(\theta) \tag{11}$$

where $\alpha$ is the inner-loop learning rate, and $\mathcal{L}_S$ is the cross-entropy loss on the support set. We perform $T$ adaptation steps on the support set before evaluating the query set.
Our implementation differs from standard MAML12 in two key aspects: (1) we employ parameter-efficient fine-tuning where only the final layers are adapted, and (2) we use a first-order approximation to avoid computationally expensive second-order derivatives, following the recommendations of Nichol et al.21.
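A first-order inner loop in the spirit of Eq. (11), adapting only a final head over a frozen backbone; the dimensions, learning rate, and step count below are illustrative placeholders, not the paper's tuned values:

```python
import torch
import torch.nn as nn

torch.manual_seed(42)

# Frozen embedding backbone (stand-in) and an adaptable classification head.
backbone = nn.Linear(8, 16)
head = nn.Linear(16, 2)
for p in backbone.parameters():
    p.requires_grad = False

def inner_loop(support_x, support_y, alpha=0.1, steps=5):
    """First-order MAML adaptation: plain gradient steps on the support
    loss, updating only the head (parameter-efficient fine-tuning)."""
    loss_fn = nn.CrossEntropyLoss()
    losses = []
    for _ in range(steps):
        logits = head(backbone(support_x))
        loss = loss_fn(logits, support_y)
        losses.append(loss.item())
        grads = torch.autograd.grad(loss, list(head.parameters()))
        with torch.no_grad():
            for p, g in zip(head.parameters(), grads):
                p -= alpha * g          # theta' = theta - alpha * grad
    return losses

support_x = torch.randn(10, 8)
support_y = torch.randint(0, 2, (10,))
losses = inner_loop(support_x, support_y)
```

Because no second-order graph is retained through `torch.autograd.grad`, this is the first-order approximation the paper adopts from Nichol et al.21.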
Hybrid model integration
Our full model combines both approaches using a weighted ensemble mechanism:

$$P_{\mathrm{hybrid}}(y \mid q) = \lambda\, P_{\mathrm{proto}}(y \mid q) + (1 - \lambda)\, P_{\mathrm{MAML}}(y \mid q) \tag{12}$$

where $\lambda$ was determined through validation performance. During training, we optimized a multi-objective loss function:

$$\mathcal{L} = w_1 \mathcal{L}_{\mathrm{proto}} + w_2 \mathcal{L}_{\mathrm{MAML}} \tag{13}$$

with weights $w_1$ and $w_2$ tuned on the validation set, following established meta-learning paradigms11,12.
Algorithmic framework and complexity analysis
The XF-PhishBERT framework operates through a systematic multi-phase process integrating feature extraction, selection, and few-shot learning. Algorithm 1 presents the complete algorithmic procedure.
Algorithm 1.
XF-PhishBERT framework
Detailed algorithmic components
The XF-PhishBERT framework processes each URL through parallel feature extraction pathways that capture both semantic and structural characteristics. Engineered feature extraction computes statistical and structural metrics including character distributions, domain properties, and network indicators, while custom tokenization segments URLs into semantically meaningful components that preserve structural boundaries often fragmented by standard tokenizers30. ModernBERT processes the tokenized sequences to generate 768-dimensional contextual embeddings, while a two-layer neural network encodes the engineered features to 256 dimensions. The fusion module concatenates both representations and projects them to a normalized 512-dimensional joint embedding space that serves as input for the meta-learning components.
The prototypical learning component computes class prototypes as centroids of support set embeddings, enabling effective classification through geometric relationships in the embedding space11. Distance-based classification employs negative squared Euclidean distance with temperature scaling ($\tau$) to produce class probability distributions from minimal examples. Concurrently, the MAML adaptation component performs $T$ gradient-based adaptation steps on support set examples, computing cross-entropy loss and updating model parameters using learning rate $\alpha$12. The first-order approximation avoids computationally expensive second-order derivatives while maintaining adaptation effectiveness for few-shot scenarios21.
Query predictions integrate both meta-learning components through weighted averaging with $\lambda$ determined via validation experiments. This ensemble approach leverages the complementary strengths of prototype-based and optimization-based meta-learning, providing robust performance across diverse phishing patterns. Post-prediction analysis generates comprehensive explanations through SHAP feature importance scoring23, attention visualization of token-level focus patterns, and counterfactual generation that identifies minimal changes required to alter classification outcomes. The integrated framework thus provides both accurate few-shot classification and transparent decision rationale essential for cybersecurity applications.
Computational complexity analysis
The overall time complexity is dominated by the ModernBERT forward pass, whose self-attention scales as $O(L^2 \cdot d)$, where $L$ is the sequence length and $d = 768$ is the hidden dimension. Feature extraction requires $O(n_f)$ operations, where $n_f$ ranges from 47 to 57 selected features. Prototypical network computation scales as $O(|S| \cdot d)$ for prototype calculation, where $|S|$ is the support set size (5–10 examples). MAML adaptation requires $O(T \cdot |S| \cdot d)$ operations for $T$ gradient steps.
Space complexity: Memory requirements include $O(L^2 + L \cdot d)$ for transformer activations, $O(|S| \cdot d)$ for support set embeddings, and $O(d)$ for class prototypes. The hybrid fusion requires $O(d_{\mathrm{BERT}} + d_{\mathrm{feat}})$, where $d_{\mathrm{BERT}} + d_{\mathrm{feat}} = 768 + 256 = 1024$, before projection to 512 dimensions.
Practical performance: Empirical measurements show 42ms average inference time per URL during real-world browser extension deployment, demonstrating the framework’s suitability for real-time phishing detection scenarios where sub-100ms response times are required for practical user experience.
Explainable AI framework
A major contribution of our work is a comprehensive explainability framework that provides transparent insights into the decision-making process of the model, as illustrated in Fig. 1 (component 4). Our approach integrates multiple state-of-the-art XAI techniques to provide multi-level interpretability, building on foundational work in model interpretability. We incorporate model-agnostic approaches like SHAP23 for global feature importance and LIME24 for local instance explanations. We complement these with model-specific attention visualization techniques25 that leverage ModernBERT’s self-attention mechanism.
User-facing explanation interface
Beyond analyst-focused explanations, we developed simplified interpretability components for non-technical users. The user-facing interface provides three levels of explanation complexity designed to accommodate different user expertise levels while maintaining decision-support utility:
Level 1 (Basic): Simple color-coded risk indicators (red/yellow/green) with single-sentence explanations such as “This URL contains suspicious keywords commonly used in phishing.” This level provides immediate risk assessment without requiring technical knowledge.
Level 2 (Intermediate): Visual URL component breakdown showing which parts trigger alerts, with plain-language descriptions of specific risks. For example, “The domain ‘paypal-secure.com’ mimics the legitimate PayPal website.” This level enables users to understand the reasoning behind classifications without technical details.
Level 3 (Detailed): Full technical analysis available on-demand for users requiring comprehensive information, including feature importance scores, attention visualizations, and counterfactual explanations as described in previous sections.
User testing with 50 non-technical participants demonstrated 87% comprehension rates for Level 1 explanations and 72% for Level 2, confirming accessibility for general browser users while maintaining decision-support utility. The tiered approach ensures that XF-PhishBERT can serve both security professionals requiring detailed analysis and general users needing immediate, understandable protection guidance.
Hierarchical URL component analysis
We decomposed URLs into semantically meaningful components and assigned suspiciousness scores to each component:

$$S(c) = \sum_i w_i\, f_i(c) \tag{14}$$

The weight normalization equation is:

$$w_i = \frac{\phi_i}{\sum_j \phi_j} \tag{15}$$

where $\phi_i$ represents the consensus importance score for feature $i$ from the ensemble feature selection methodology, $f_i$ are individual feature signals, and $w_i$ are importance weights derived from our feature selection process. This approach builds upon previous work in URL segmentation32,33 but extends it with quantitative suspiciousness metrics and integration with our few-shot learning framework.
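The component scoring amounts to a weighted sum with consensus-normalized weights; a small NumPy sketch with hypothetical consensus scores (the values and signal layout are ours, for illustration only):

```python
import numpy as np

def normalize_weights(phi):
    """Importance weights from consensus scores, normalized to sum to 1."""
    phi = np.asarray(phi, dtype=float)
    return phi / phi.sum()

def suspiciousness(signals, w):
    """Suspiciousness of a URL component as a weighted sum of feature signals."""
    return float(np.dot(w, signals))

# Hypothetical consensus importance scores for three feature signals.
phi = [0.6, 0.3, 0.1]
w = normalize_weights(phi)
score = suspiciousness([1.0, 0.0, 1.0], w)   # two of the three signals fire
```

Normalizing by the total consensus score keeps component scores comparable across URLs regardless of how many features the selection stage retained.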
Feature attribution methods
We implemented multiple complementary feature attribution techniques:
Perturbation-based analysis: For each feature $x_i$, we measure its importance as:

$$I(x_i) = \left| f(x) - f(x_{\setminus i}) \right| \tag{16}$$

where $x_{\setminus i}$ represents a modified version of the feature (zeroing, inversion, or scaling).
Integrated gradients: We compute attributions as:

$$\mathrm{IG}_i(x) = (x_i - x'_i) \int_0^1 \frac{\partial f\!\left(x' + \alpha (x - x')\right)}{\partial x_i}\, d\alpha \tag{17}$$

where $x'$ is the baseline (zero feature vector), and the integral is approximated using the Riemann sum with 50 steps.
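The Riemann-sum approximation of the path integral can be sketched as follows; for a linear model the result is exact, which makes a convenient sanity check (function and variable names are ours):

```python
import torch

def integrated_gradients(f, x, baseline=None, steps=50):
    """Integrated gradients approximated with a midpoint Riemann sum."""
    if baseline is None:
        baseline = torch.zeros_like(x)        # zero feature vector baseline
    total = torch.zeros_like(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps             # midpoint of each sub-interval
        point = (baseline + alpha * (x - baseline)).requires_grad_(True)
        grad, = torch.autograd.grad(f(point), point)
        total += grad
    return (x - baseline) * total / steps

# For a linear model f(x) = w . x, the attribution of feature i is w_i * x_i.
w = torch.tensor([2.0, -1.0, 0.5])
f = lambda x: (w * x).sum()
x = torch.tensor([1.0, 3.0, -2.0])
attr = integrated_gradients(f, x)
```

The completeness property (attributions sum to f(x) − f(baseline)) is what makes these scores directly comparable across URLs.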
Counterfactual explanation generation
Our counterfactual generation approach minimizes the multi-objective function:

$$\mathcal{L}_{\mathrm{cf}}(x') = \mathcal{L}_{\mathrm{pred}}(x') + \lambda_1\, d(x, x') + \lambda_2\, \lVert x - x' \rVert_0 \tag{18}$$

where $\mathcal{L}_{\mathrm{pred}}$ ensures the counterfactual has the desired classification, $d(x, x')$ maintains similarity to the original instance, and the sparsity term encourages minimal changes.
Figure 19 shows an example of a counterfactual explanation in which changing the “qty_questionmark” feature from 1 to 0 flips the classification from phishing (0.92 confidence) to legitimate (0.65 confidence).
Fig. 19.
Real-time prediction and XAI counterfactual explanation showing how changing specific URL characteristics alters the classification outcome.
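A greedy single-feature search is a simplified stand-in for the multi-objective counterfactual optimization; the classifier and feature names below (e.g. qty_questionmark) follow the example above, but the procedure itself is an illustrative sketch, not the paper's optimizer:

```python
def greedy_counterfactual(x, predict, max_changes=3):
    """Flip one binary feature at a time, returning the first flip (or
    smallest set of flips) that changes the predicted class."""
    original = predict(x)
    current = dict(x)
    changed = []
    for _ in range(max_changes):
        for name, value in current.items():
            candidate = {**current, name: 1 - value}   # flip one binary feature
            if predict(candidate) != original:
                changed.append(name)
                return candidate, changed
        # No single flip worked: commit one flip and keep searching (sketch).
        name = next(iter(current))
        current[name] = 1 - current[name]
        changed.append(name)
    return current, changed

# Hypothetical classifier: flags phishing whenever a '?' appears in the URL.
predict = lambda feats: "phishing" if feats["qty_questionmark"] >= 1 else "legitimate"
x = {"qty_questionmark": 1, "has_https": 0}
cf, changes = greedy_counterfactual(x, predict)
```

Restricting the search to minimal flips mirrors the sparsity term of the objective: the explanation names the fewest characteristics a URL would need to change.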
Attention visualization
We extract the self-attention weights from the final transformer layer and aggregate them across the attention heads:

$$\bar{A} = \frac{1}{H} \sum_{h=1}^{H} A_h \tag{19}$$

Token importance is then calculated as $I_j = \sum_i \bar{A}_{ij}$ and visualized using heatmaps.
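Head aggregation and token importance reduce to a mean over heads followed by a column sum; a short PyTorch sketch on hypothetical attention weights:

```python
import torch

def token_importance(attn):
    """Average attention over heads, then sum the attention each token
    receives (column sum) as its importance score."""
    a_bar = attn.mean(dim=0)        # (L, L): head-averaged attention
    return a_bar.sum(dim=0)         # I_j = sum_i a_bar[i, j]

# Hypothetical final-layer attention: H=12 heads over L=8 tokens,
# each row softmax-normalized as in a transformer.
attn = torch.softmax(torch.randn(12, 8, 8), dim=-1)
imp = token_importance(attn)
```

Because each attention row sums to one, the importance scores over all tokens sum to the sequence length, giving a natural normalization for the heatmap.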
Replication variance
Across repeated runs of each experiment, accuracy metrics exhibit a standard deviation of ±1.8%, and a sensitivity analysis of the hybrid integration weight $\lambda$ shows variation within ±2.3%. This variation captures randomness arising from the stochastic nature of training and random initialization, and all reported confidence intervals account for this natural experimental noise.
To ensure reproducible results, we implement controlled randomization through fixed seeds for PyTorch (seed=42), NumPy (seed=42), and Python’s random module. Despite this control, inherent stochasticity in CUDA operations and floating-point arithmetic introduces minor variations. Our statistical analysis accounts for this variance through bootstrap resampling with 1000 iterations, providing robust confidence intervals that reflect true model performance rather than random fluctuations.
Incremental learning and model adaptation
As new phishing examples emerge, XF-PhishBERT employs a dual-strategy adaptation mechanism. For immediate response, the few-shot learning framework enables rapid adaptation using 5-10 examples without full retraining. For long-term maintenance, we implement incremental learning that preserves previously acquired knowledge while incorporating new patterns through experience replay techniques.
Our framework combines robust, noise-resistant update aggregation with privacy preservation, building on the work of Pillutla et al.34, who demonstrated that an iterative version of geometric-median aggregation can be carried out securely, ensuring that no participant learns the raw values while still rejecting outliers. Because update vectors are nearly indistinguishable from random noise, these secure multi-party computation primitives give federated learning the confidentiality that repels many attack vectors.
In XF-PhishBERT, we adopt this same privacy layer for few-shot settings, enabling the system to protect URL fingerprints and per-device phishing markers even when training data is limited. Feature-selection and hybrid neural components further shield users by merging only anonymized importance scores, never raw URLs, thereby limiting any leakage to information that attackers cannot trace back to individuals.
The incremental learning component implements elastic weight consolidation (EWC) to prevent catastrophic forgetting:

$$\mathcal{L}_{\mathrm{EWC}}(\theta) = \mathcal{L}_{\mathrm{new}}(\theta) + \frac{\lambda_{\mathrm{EWC}}}{2} \sum_i F_i \left(\theta_i - \theta_i^*\right)^2 \tag{20}$$

where $F_i$ represents the Fisher information matrix diagonal elements, $\theta_i^*$ are the optimal parameters from previous tasks, and $\lambda_{\mathrm{EWC}}$ controls the importance of preserving previous knowledge.
Support set selection and quality assurance
To address the impact that unsuitable support examples can have on few-shot learning, we propose a careful validation procedure for the support set. In a sensitivity analysis, using average-quality samples lowered final accuracy by as much as 15%, reinforcing the need for stringent selection criteria.
Our stratified sampling method comprises three criteria:
(1) Structural diversity: Each URL is graded by length quartiles, special-character rate, and domain-hierarchy depth to ensure varied complexity. We compute structural diversity as:

$$D_{\mathrm{struct}} = \frac{1}{3}\left(\mathrm{CV}_{\mathrm{length}} + \mathrm{CV}_{\mathrm{chars}} + \mathrm{CV}_{\mathrm{depth}}\right) \tag{21}$$

where CV represents the coefficient of variation for each structural metric.
(2) Temporal range: When logs allow, examples stretch across calendar months to capture changing phishing tactics, yielding a temporal diversity score $D_{\mathrm{temp}}$ (Eq. 22).
(3) Campaign diversity: Samples are taken from separate phishing nodes to avoid memorizing a single attack fingerprint; campaign diversity $D_{\mathrm{camp}}$ (Eq. 23) uses clustering-based separation between campaign groups.
To confirm that the pooled set remains broad, we compute feature-diversity coefficients and automatically resample any group scoring below 0.3. The overall diversity score combines all three components:

$$D = w_1 D_{\mathrm{struct}} + w_2 D_{\mathrm{temp}} + w_3 D_{\mathrm{camp}} \tag{24}$$

with weights $w_1, w_2, w_3$ determined through grid search optimization.
Thanks to these quality assurance checks, we reduced performance variance during evaluation from ±4.2% to the tighter band of ±1.8%. This improvement demonstrates the critical importance of support set curation in few-shot learning scenarios, where each example significantly impacts model adaptation.
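As an illustration of the coefficient-of-variation check, the sketch below scores a support set's structural diversity; averaging the three CVs is one plausible reading of the combination described above, and all example values are hypothetical:

```python
import numpy as np

def coefficient_of_variation(values):
    """CV = std / mean: spread relative to the average magnitude."""
    values = np.asarray(values, dtype=float)
    return values.std() / values.mean()

def structural_diversity(lengths, special_char_rates, depths):
    """Combine the CVs of the three structural metrics (here: averaged)."""
    return np.mean([
        coefficient_of_variation(lengths),
        coefficient_of_variation(special_char_rates),
        coefficient_of_variation(depths),
    ])

# A degenerate support set (identical URLs) scores zero diversity and
# would be resampled under the 0.3 threshold described above.
flat = structural_diversity([30] * 5, [0.1] * 5, [2] * 5)
varied = structural_diversity([12, 30, 55, 80, 120],
                              [0.0, 0.1, 0.2, 0.3, 0.4],
                              [1, 2, 3, 4, 5])
```

Support sets built from a single campaign tend to score near zero on these coefficients, which is exactly the failure mode the resampling rule guards against.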
Results
Datasets and data collection
Comprehensive phishing websites dataset
The primary dataset employed in this study comprises 88,647 URLs annotated as legitimate (58,000, 65.43%) or phishing (30,647, 34.57%)31. This dataset, hereafter referred to as DATA1, is characterized by its extensive feature engineering approach, yielding 112 distinct features organized in a hierarchical URL component taxonomy: URL-level features (19 dimensions) including character distributions and length metrics, domain-specific features (21 dimensions) encompassing structural and lexical properties, directory path features (18 dimensions) characterizing path structure, file features (18 dimensions) examining filename components, and parameter features (20 dimensions) analyzing query parameters. The dataset incorporates network-based security signals including TLS/SSL certificate validation, domain activation chronology, DNS configuration metrics, and search-engine indexing status.
PhiUSIIL dataset
The secondary dataset utilized in our evaluation is the PhiUSIIL dataset introduced by Prasad and Chandra29, hereafter referred to as DATA2, containing 235,795 URLs with class distribution of 57.19% legitimate and 42.81% phishing instances. This dataset contains 54 diverse features encompassing URL lexical properties, domain characteristics, HTML content metrics, and security indicators. Notable derived features include URLSimilarityIndex, CharContinuationRate, and TLDLegitimateProb, which capture semantic patterns indicative of phishing attempts. The temporal diversity (2019-2023) and multi-source composition enhance ecological validity of experimental evaluation.
Real-time data integration
To address the rapid evolution of phishing techniques, we extended the static datasets with a dynamic data integration framework establishing direct API connections to authoritative phishing repositories. The enhanced dataset follows:

$$\mathcal{D}_{\mathrm{enhanced}} = \mathcal{D}_{\mathrm{static}} \cup \mathcal{V}(\mathcal{D}_{\mathrm{dynamic}}) \tag{25}$$

where $\mathcal{V}$ represents a verification function including DNS resolution, cross-reference checking, near-duplicate detection using Levenshtein distance, and TTL-based caching. The dynamic component $\mathcal{D}_{\mathrm{dynamic}}$ integrates feeds from PhishTank, the Phishing Database GitHub repository35, and the APWG eCrime exchange, with corresponding legitimate URLs sampled from the Majestic Million and Tranco lists to maintain class balance.
Shortened URL resolution and analysis
Shortened URLs present unique challenges as critical information is obscured. Our framework performs controlled URL expansion using HTTP HEAD requests with timeout constraints, extracting features from both shortened and expanded URLs. When expansion fails, the system relies on shortening service reputation scores with reduced confidence weighting.
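A minimal sketch of controlled expansion with a timeout-bounded HEAD request using the standard library; the confidence values are illustrative placeholders, not the deployed reputation weighting:

```python
import urllib.request
import urllib.error

def expand_url(url, timeout=3.0):
    """Resolve a shortened URL via a HEAD request, following redirects.
    Returns (final_url, confidence); on failure, fall back to the
    original URL with a reduced confidence weight, as described above."""
    try:
        req = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.geturl(), 1.0     # geturl() reflects any redirects
    except (urllib.error.URLError, ValueError):
        return url, 0.5                   # expansion failed: reduced confidence

# An unparseable URL triggers the fallback path without any network call.
final, conf = expand_url("::not-a-valid-url::")
```

Bounding the request with a timeout keeps the expansion step compatible with the sub-100ms latency budget of the browser extension.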
Truncated URL analysis for mobile environments
Mobile constraints result in URL truncation concealing critical indicators. For truncated URLs, feature extraction focuses on available components with confidence penalties proportional to missing information. Classification decisions include uncertainty estimates based on information completeness.
Implementation and system architecture
Our implementation uses PyTorch with the Transformers library accessing the ModernBERT model30. The hybrid architecture integrates custom implementations of prototypical networks11 and MAML components12 with URL feature extraction modules. Training employed gradient accumulation with mixed precision (FP16), early stopping with patience of three epochs, and AdamW optimization36 using parameter-specific learning rates: 1e−5 for ModernBERT, 3e−5 for the feature encoder, and 5e−5 for the classification heads. For real-time deployment, we developed a Chrome browser extension processing URLs in real time with visual explanations using our XAI framework. The extension extracts 111 engineered features locally and performs inference with optimized model weights, achieving sub-100ms response times.
Training configuration
Episodic meta-training
Following the meta-learning paradigm22, we implemented an episodic training framework. Each training iteration samples a few-shot episode consisting of a support set (five examples per class) and a query set (10 examples per class). The Episodic Meta-Training for Phishing Detection is shown in Algorithm 2.
Algorithm 2.
Episodic meta-training for phishing detection
Optimization strategy
We employed an advanced optimization strategy with AdamW36 using parameter-specific learning rates: 1e−5 for ModernBERT, 3e−5 for the feature encoder, and 5e−5 for the classification heads. Learning rate scheduling follows cosine annealing with warm-up for 10% of training steps. For regularization, we applied layer normalization after each linear transformation, dropout with rates varying from 0.1–0.5 based on layer depth, and gradient clipping at 1.0 maximum norm. The model was trained on four NVIDIA V100 GPUs using mixed precision training (FP16) and gradient accumulation (four steps), with early stopping (patience=3) monitored on the validation F1-score.
Reproducibility statement
All experiments were conducted using Google Colab Pro+ with NVIDIA V100/A100 GPUs. Training required approximately 22 hours per run with peak memory usage of 40 GB system RAM and 24 GB GPU memory. Total storage requirements were approximately 1.2 TB for datasets, features, and model checkpoints. The system achieved 35ms latency per URL during inference (28 URLs/second on GPU) and 220ms on CPU-only environments. To ensure reproducible results, we used fixed random seeds (PyTorch, NumPy, Python=42) and applied bootstrap resampling with 1000 iterations for statistical analysis.
Evaluation metrics
We evaluate XF-PhishBERT using standard binary classification metrics. Phishing URLs are treated as the positive class, with performance calculated from confusion matrix elements: True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN).
Standard classification metrics:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{26}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{27}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{28}$$

$$F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{29}$$

$$\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{30}$$
The Matthews Correlation Coefficient (MCC) provides balanced evaluation for imbalanced datasets, ranging from -1 (worst) to +1 (perfect). MCC considers all confusion matrix elements equally, making it ideal for cybersecurity applications where both false positives and false negatives have significant impact.
Few-shot episode evaluation: For few-shot learning, we calculate metrics per episode and average across all episodes:

$$\bar{M} = \frac{1}{E} \sum_{e=1}^{E} M_e \tag{31}$$

where $E$ is the total number of episodes and $M_e$ is the metric value for episode $e$. This ensures robust performance assessment independent of specific support set selections.
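The metric definitions above reduce to a few lines of Python; the sketch below computes MCC from confusion-matrix counts and the per-episode average on hypothetical values:

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def episode_average(metric_values):
    """Mean of a per-episode metric over E episodes."""
    return sum(metric_values) / len(metric_values)

perfect = mcc(tp=50, tn=50, fp=0, fn=0)            # perfect classification
avg_acc = episode_average([0.985, 0.995, 1.000])   # hypothetical episode accuracies
```

Guarding against a zero denominator (when a whole confusion-matrix margin is empty) is the usual convention for reporting MCC on degenerate episodes.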
Experiment 1: few-shot learning performance assessment
XF-PhishBERT demonstrated exceptional few-shot learning capabilities across multiple evaluation configurations. Table 1 presents quantitative results for different support set sizes, revealing remarkable data efficiency in phishing detection tasks.
Table 1.
Few-shot learning performance for different support set configurations. The model achieved perfect classification (99.9% accuracy) with only 10 examples per class, demonstrating exceptional data efficiency for phishing detection.
| Configuration | Dataset | Episodes | Accuracy (%) | Precision (%) | Recall (%) | F1 (%) | MCC |
|---|---|---|---|---|---|---|---|
| 2-way, 1-shot | DATA1 | 100 | 98.50 | 98.76 | 98.24 | 98.50 | 0.970 |
| 2-way, 1-shot | DATA2 | 100 | 99.00 | 99.25 | 98.75 | 99.00 | 0.980 |
| 2-way, 5-shot | DATA1 | 100 | 100.00 | 100.00 | 100.00 | 100.00 | 1.000 |
| 2-way, 5-shot | DATA2 | 100 | 99.50 | 99.57 | 99.43 | 99.50 | 0.990 |
| 2-way, 10-shot | DATA1 | 100 | 100.00 | 100.00 | 100.00 | 100.00 | 1.000 |
| 2-way, 10-shot | DATA2 | 100 | 99.99 | 99.99 | 99.99 | 99.99 | 0.998 |
| 2-way, 5-shot | DATA1 | 1000 | 99.98 | 99.99 | 100.00 | 99.98 | 0.996 |
| 2-way, 5-shot | DATA2 | 1000 | 99.63 | 99.71 | 99.58 | 99.64 | 0.993 |
| 2-way, 10-shot | DATA1 | 1000 | 99.99 | 100.00 | 99.98 | 100.00 | 0.998 |
| 2-way, 10-shot | DATA2 | 1000 | 99.99 | 100.00 | 100.00 | 99.99 | 0.998 |
The model achieved 98.50% accuracy on DATA1 and 99.00% accuracy on DATA2 using only a single example per class (1-shot learning), substantially outperforming traditional supervised approaches that typically require thousands of labeled instances4. Performance increased rapidly with additional support examples, reaching perfect classification (100.00% accuracy) on DATA1 and near-perfect performance (99.50% accuracy) on DATA2 with five examples per class. When evaluated with ten support examples, both datasets achieved 99.99–100.00% accuracy across all metrics. The scaled evaluation with 1000 episodes maintained consistent performance, with 5-shot learning achieving 99.98% accuracy on DATA1 and 99.63% on DATA2, while 10-shot learning sustained 99.99% accuracy across both datasets. Matthews Correlation Coefficient values ranged from 0.970 to 1.000, indicating excellent classification correlation and balanced performance across classes.
Figure 3 presents confusion matrices from few-shot learning experiments across both datasets and evaluation scales. Panel (a) shows DATA1 results where XF-PhishBERT accurately classifies 200 legitimate and 200 phishing URLs with zero misclassifications in the 5-shot configuration, achieving 99.99% accuracy. Panel (b) displays DATA2 performance with only one misclassification (199 out of 200 correct), demonstrating 99.5% accuracy with only five examples per class. Panel (c) illustrates scaled evaluation with 1000 episodes, where the model maintains perfect classification for DATA1 (10,000 correct classifications) and achieves 99.99% accuracy for DATA2 with 10-shot learning.
Fig. 3.
Few-shot learning confusion matrices across datasets and evaluation scales. (a) DATA1 results with 5-shot configuration (99.99% accuracy); (b) DATA2 performance with 5-shot configuration (99.5% accuracy); (c) Scaled evaluation with 1000 episodes showing perfect classification for DATA1 and DATA2 with 10-shot learning.
Experiment 2: standard supervised learning benchmark
Beyond few-shot learning, we evaluated XF-PhishBERT in traditional supervised learning settings to establish performance benchmarks on the complete datasets. When trained on the combined datasets in a standard supervised learning paradigm, XF-PhishBERT achieved near-perfect performance on DATA1 with 99.99% accuracy, precision, recall, and F1-score. On DATA2, the model demonstrated exceptional performance with 99.95% accuracy, 99.92% precision, 99.99% recall, and 99.96% F1-score. Figure 4 visualizes these performance metrics across both datasets, confirming XF-PhishBERT’s effectiveness in standard supervised configurations and establishing a robust foundation for its few-shot learning capabilities.
Fig. 4.
Performance metrics of the combined dataset training. The model achieved near-perfect classification with 99.9% accuracy when trained on the full dataset, demonstrating exceptional performance in standard supervised settings.
The superior performance in standard supervised settings provides a strong foundation for our few-shot learning approach, demonstrating that the model effectively learns generalizable patterns from large datasets that transfer well to few-shot scenarios. Figure 5 shows confusion matrices for both datasets without few-shot learning components. The model achieved perfect classification (99.99% accuracy) on DATA1, correctly classifying all 11,600 legitimate and 6,130 phishing URLs, while maintaining exceptional performance on larger DATA2 with 99.95% accuracy across 47,159 test samples (20,189 legitimate and 26,970 phishing URLs).
Fig. 5.
Standard supervised learning confusion matrices without few-shot components.
Experiment 3: architectural component contribution analysis
To quantify individual contributions of XF-PhishBERT’s architectural components, we conducted systematic ablation studies. Table 2 presents performance metrics when specific components are removed or modified, providing crucial insights into the role of each component in overall performance.
Table 2.
Ablation study results (2-way, 5-shot, 100 episodes). Each value represents the performance metrics when the specified component is removed or modified, demonstrating the contribution of individual elements to the complete architecture.
| Model configuration | Accuracy (%) | Precision (%) | Recall (%) | F1 (%) | MCC |
|---|---|---|---|---|---|
| XF-PhishBERT (Full) | 99.9 | 100 | 100 | 99.9 | 0.998 |
| Without modernBERT | 95.32 | 95.61 | 95.12 | 95.36 | 0.907 |
| Without ensemble selection | 97.83 | 98.12 | 97.58 | 97.85 | 0.957 |
| Without MAML component | 97.26 | 97.43 | 97.18 | 97.30 | 0.946 |
| Without prototypical component | 96.79 | 97.04 | 96.58 | 96.81 | 0.936 |
| With BERT instead of modernBERT | 97.21 | 97.35 | 97.08 | 97.21 | 0.944 |
| Without XAI framework | 99.9 | 99.9 | 99.9 | 99.9 | 0.998 |
The ablation analysis yielded several important insights. ModernBERT integration provided the most substantial performance contribution, with removal causing a 4.54 percentage point decrease in F1-score compared to the complete architecture. This empirically validates our hypothesis that ModernBERT’s architectural innovations (rotary positional embeddings, GeGLU activation layers, and alternating attention mechanisms) significantly enhance URL representation learning for phishing detection30. The ensemble feature selection methodology contributed a 2.05 percentage point improvement over single-method approaches, confirming the value of consensus-based feature identification combining Random Forest importance, Mutual Information, and RFECV techniques. The hybrid meta-learning architecture demonstrated clear superiority over individual components, with MAML removal causing a 2.60 percentage point decrease and prototypical network removal resulting in a 3.09 percentage point reduction in F1-score11,12. Replacing ModernBERT with standard BERT led to a 2.69 percentage point performance decrease, empirically validating ModernBERT’s advantages for URL analysis tasks. Notably, the explainability framework introduced minimal computational overhead while maintaining identical performance metrics.
Experiment 4: state-of-the-art performance comparison
To contextualize XF-PhishBERT’s contributions, we conducted comprehensive comparison with leading phishing detection approaches. Table 3 presents performance comparisons across multiple state-of-the-art methods and datasets.
Table 3.
Performance comparison of phishing detection methods.
| Model | Dataset | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%) | MCC | AUC |
|---|---|---|---|---|---|---|---|
| PhiUSIIL29 | PhiUSIIL dataset | 99.24 | 99.31 | 99.18 | 99.24 | – | – |
| PMANet6 | GramBeddings dataset | 96.38 | 95.72 | 96.91 | 96.31 | – | – |
| TransURL9 | Mendeley dataset | 95.62 | 94.89 | 95.32 | 95.10 | – | – |
| BERT-Based14 | Kaggle dataset | 95.54 | 94.23 | 96.78 | 95.48 | – | – |
| XF-PhishBERT (Ours) | DATA1 | 99.97 | 100.00 | 99.98 | 100.00 | 0.999 | 1.00 |
| XF-PhishBERT (Ours) | DATA2 | 99.95 | 99.92 | 100.00 | 99.96 | 0.999 | 0.999 |
XF-PhishBERT achieved superior performance compared to existing state-of-the-art methods. Our model substantially outperformed PhiUSIIL29 (99.97% vs 99.24% accuracy), PMANet6 (100.00% vs 96.31% F1-score), TransURL9 (100.00% vs 95.10% F1-score), and other transformer-based approaches14. The Matthews Correlation Coefficient of 0.999 on both datasets significantly exceeds reported values for competing methods, indicating superior balanced classification performance.
Experiment 5: cross-dataset generalization analysis
To rigorously evaluate generalizability across different data distributions, we conducted comprehensive cross-dataset experiments. These experiments involved training models on one dataset and evaluating on another, providing challenging and realistic assessment of real-world performance. Table 4 presents comprehensive cross-dataset generalizability analysis demonstrating XF-PhishBERT’s superior transferability compared to baseline approaches.
Table 4.
Cross-Dataset Generalizability Analysis (F1-Scores with 95% Confidence Intervals).
| Model type | DATA1 → DATA2 | DATA2 → DATA1 | Generalizability retention |
|---|---|---|---|
| Traditional ML | 0.2895 ± 0.024 | 0.3059 ± 0.031 | 39.37% ± 3.42% |
| ModernBERT (text-only) | 0.8121 ± 0.018 | 0.4517 ± 0.027 | 113.25% ± 2.85% |
| XF-PhishBERT (full) | 0.8879 ± 0.015 | 0.9971 ± 0.008 | 186.43% ± 3.12% |
XF-PhishBERT achieved remarkable generalizability retention of 186.43% ± 3.12% compared to 39.37% ± 3.42% for traditional machine learning approaches and 113.25% ± 2.85% for ModernBERT text-only models. The asymmetric performance between transfer directions suggests that DATA1’s comprehensive feature set provides superior generalization foundation.
Figure 6 illustrates feature concept group performance comparison across datasets, showing F1-scores for each conceptual feature group when evaluated in cross-dataset settings. The heatmap reveals that URL structure features show the highest generalization capacity (88.58% average retention), whereas legitimacy signals demonstrate the highest dataset dependency (42.95% retention).
Fig. 6.
Feature concept group performance comparison across the datasets. The heatmap illustrates the F1-scores for each conceptual feature group when evaluated in cross-dataset settings. URL structure features (top) show the highest generalization capacity (88.58% average retention), whereas legitimacy signals (bottom) demonstrate the highest dataset dependency (42.95% retention).
Figure 7 compares generalizability retention rates across model types, demonstrating that ModernBERT-based models achieved substantially higher generalizability retention than traditional ML approaches, with XF-PhishBERT achieving the highest performance preservation across datasets.
Fig. 7.
Cross-dataset performance retention ratios. The bar chart compares the generalizability retention rates (cross-dataset F1-score / within-dataset F1-score) across the model types. ModernBERT-based models demonstrated substantially higher generalizability retention (113.25%) than traditional ML approaches (39.37%), with XF-PhishBERT achieving the highest performance preservation across datasets.
Experiment 6: explainability framework assessment
A key contribution of XF-PhishBERT is its comprehensive explainability framework that transforms the model from a “black box” into a transparent system interpretable by security analysts. Our multi-faceted approach integrates four complementary techniques: feature importance analysis, URL component decomposition, attention visualization, and counterfactual explanations.
Feature importance analysis, visualized through SHAP values in Fig. 8, reveals critical URL characteristics for phishing detection. The visualization demonstrates that URL structural characteristics (particularly special characters) and domain properties (especially IP-based domains) are the most discriminative features for phishing detection.
Fig. 8.
SHAP analysis on DATA1, showing feature importance distribution. The visualization revealed that URL structural characteristics (particularly special characters) and domain properties (especially IP-based domains) are the most discriminative features for phishing detection.
This analysis is complemented by Fig. 9, which presents a detailed waterfall plot for a specific phishing example. The visualization demonstrates how different features push the prediction toward the phishing class, revealing feature interactions that would remain hidden in traditional approaches.
Fig. 9.
SHAP feature waterfall plot showing the cumulative contribution of individual features to the final prediction of a specific phishing URL.
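The waterfall plot above is built from Shapley values. As a self-contained illustration of what those values mean, the sketch below computes exact Shapley values by brute force for a tiny hypothetical phishing-score model; the actual system uses the SHAP library over the full feature set, and the `score` function and its coefficients here are invented for illustration:

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for a small feature set: phi_i averages the
    marginal contribution of feature i over all coalitions, where absent
    features take their baseline value."""
    n = len(x)
    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)
    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for k in range(n):
            for S in combinations(others, k):
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi += w * (value(set(S) | {i}) - value(set(S)))
        phis.append(phi)
    return phis

# Hypothetical phishing score over three URL features:
# [num_special_chars, has_ip_domain, url_length]
def score(z):
    return 0.05 * z[0] + 0.4 * z[1] + 0.002 * z[2]

baseline = [2, 0, 40]   # typical legitimate URL
x = [9, 1, 120]         # suspicious URL
phi = shapley_values(score, x, baseline)
# Efficiency property: contributions sum to score(x) - score(baseline)
assert abs(sum(phi) - (score(x) - score(baseline))) < 1e-9
```

The waterfall plot simply stacks these per-feature contributions on top of the baseline prediction until the model output for the instance is reached.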
Additional feature importance visualizations are provided in Figs. 10 and 11, highlighting the most discriminative URL characteristics from different perspectives.
Fig. 10.
Important feature XAI visualization showing the relative importance of different URL components for phishing detection. The chart highlights how certain structural elements and suspicious patterns contribute most strongly to the classification decisions.
Fig. 11.
Important feature visualization using an alternative representation. The chart provides security analysts with a ranked list of features that most strongly indicate phishing attempts, offering actionable insights for manual investigations.
The URL component analysis provides granular insights into decision-making processes. Figure 12 demonstrates this methodology applied to a legitimate URL, where all components (protocol: 0.1, domain: 0.12, path: 0.08) receive appropriately low suspiciousness scores, correctly identifying the URL as benign.
Fig. 12.
URL component analysis with suspiciousness scores for a legitimate example. The visualization shows low scores across all components (protocol: 0.1, domain: 0.12, path: 0.08), correctly identifying it as a benign URL with no suspicious elements.
Conversely, Fig. 13 illustrates a LIME-based explanation for a phishing URL, highlighting how suspicious domain elements and deceptive path structures contribute significantly to the phishing classification24.
Fig. 13.
LIME feature explanation for phishing URL. The visualization highlights suspicious components that contribute most strongly to phishing classification, particularly focusing on deceptive domain elements and unusual path structures.
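LIME fits a weighted linear surrogate around the instance being explained. The sketch below is a simplified, dependency-free version of that idea (the actual system uses the LIME library; the `score` function is a hypothetical stand-in): it masks binary features at random, weights each perturbed sample by its proximity to the instance, and solves the weighted least-squares problem whose coefficients approximate local feature contributions.

```python
import random

def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting for small systems."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[r][c]:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * x for a, x in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def lime_explain(predict, x, n_samples=2000, seed=0):
    """LIME-style local surrogate: randomly mask features, weight samples
    by similarity to x, and fit a weighted linear model whose coefficients
    approximate each feature's local contribution."""
    rng = random.Random(seed)
    d = len(x)
    rows, ys, ws = [], [], []
    for _ in range(n_samples):
        mask = [rng.random() < 0.5 for _ in range(d)]
        z = [xi if m else 0 for xi, m in zip(x, mask)]
        rows.append([1.0] + [float(m) for m in mask])  # intercept + mask
        ys.append(predict(z))
        ws.append(sum(mask) / d)                       # proximity kernel
    k = d + 1
    XtWX = [[sum(w * r[i] * r[j] for r, w in zip(rows, ws)) for j in range(k)]
            for i in range(k)]
    XtWy = [sum(w * r[i] * y for r, y, w in zip(rows, ys, ws)) for i in range(k)]
    return solve(XtWX, XtWy)[1:]  # per-feature local contributions

# Hypothetical score: a suspicious-domain flag dominates the decision
def score(z):
    return 0.7 * z[0] + 0.1 * z[1]  # z = [suspicious_domain, long_path]

coefs = lime_explain(score, [1, 1])
```

For this linear toy model the surrogate recovers the true coefficients, so the explanation correctly ranks the suspicious domain above the long path, mirroring the ranking shown in Fig. 13.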
Figure 14 provides an extended analysis of a legitimate URL, with a more detailed breakdown of component contributions.
Fig. 14.
Extended URL component analysis of a legitimate example. The visualization provides detailed feature contribution breakdowns for each URL component, showing how multiple factors combine to produce a legitimate classification.
ModernBERT’s self-attention mechanism provides crucial insights into the model’s focus during classification. Figure 15 presents an attention heatmap for a phishing example, revealing concentrated attention on deceptive terms such as “secure,” “login,” and “verify”, linguistic patterns frequently exploited in phishing campaigns25.
Fig. 15.
Attention map for phishing example. The heatmap shows intense attention focused on deceptive components such as “secure,” “login,” and “verify”, terms frequently exploited in phishing attacks to imitate legitimate services.
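The heatmaps above visualize attention weights. As a minimal, self-contained illustration of the mechanism (toy 2-d token embeddings invented for this example, not ModernBERT’s actual representations), the following computes one row of scaled dot-product attention over URL tokens and shows weight concentrating on deceptive terms:

```python
from math import exp, sqrt

def softmax(xs):
    m = max(xs)
    es = [exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention_row(query, keys):
    """One row of scaled dot-product attention: softmax(q . k / sqrt(d))."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / sqrt(d) for key in keys]
    return softmax(scores)

# Toy embeddings: deceptive terms align with the "phishing cue" direction
# [1, 0]; neutral tokens align with [0, 1].
tokens = ["paypal", "secure", "login", "example", "com"]
embeds = [[0.1, 1.0], [1.5, 0.1], [1.5, 0.1], [0.1, 1.0], [0.1, 1.0]]
weights = attention_row([1.0, 0.0], embeds)
```

With a query aligned to the phishing-cue direction, the attention mass concentrates on “secure” and “login”, the same qualitative pattern the heatmap in Fig. 15 reveals for real phishing URLs.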
Figure 16 contrasts this with attention patterns for legitimate URLs, showing characteristic distributed attention without concentration on potentially deceptive elements.
Fig. 16.
Attention map for legitimate example. The visualization shows a more distributed attention pattern characteristic of benign URLs, without the concentrated focus on potentially deceptive terms seen in phishing examples.
Figure 17 extends this analysis with detailed token-level attention visualization for legitimate examples, while Fig. 18 demonstrates the integrated XAI attention framework applied to phishing detection.
Fig. 17.
Token attention visualization for a legitimate URL. The pattern demonstrates how attention is distributed across domain-specific components rather than concentrating on potentially deceptive terms, revealing distinctive attention signatures for legitimate URLs.
Fig. 18.
XAI attention visualization for a phishing example. The attention pattern reveals a concentrated focus on suspicious URL components, providing an intuitive visual explanation for phishing classification that security analysts can quickly interpret.
Counterfactual explanations, as shown in Fig. 19, identify the minimal changes required to flip classification outcomes, revealing the model’s decision boundaries. The visualization demonstrates that changing the qty_questionmark_url feature from 1 to 0 flips the classification from phishing (0.92 confidence) to legitimate (0.65 confidence).
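A minimal sketch of this counterfactual search, assuming a hypothetical scoring function (the real model’s probabilities differ): greedily zero out one flippable feature at a time and report the first change that crosses the decision threshold.

```python
def counterfactual(predict, x, flippable, threshold=0.5):
    """Find a minimal single-feature change that flips a phishing verdict:
    try setting each flippable feature to 0 and return the first change
    that drops the phishing probability below the threshold."""
    base = predict(x)
    if base < threshold:
        return None  # already classified legitimate
    for name in flippable:
        cand = dict(x)
        cand[name] = 0
        p = predict(cand)
        if p < threshold:
            return {"changed": name, "from": x[name], "to": 0,
                    "prob_before": base, "prob_after": p}
    return None

# Hypothetical scoring function over two engineered features
def predict(x):
    return min(1.0, 0.55 * x.get("qty_questionmark_url", 0)
                    + 0.25 * x.get("qty_hyphen_domain", 0) + 0.15)

cf = counterfactual(predict,
                    {"qty_questionmark_url": 1, "qty_hyphen_domain": 1},
                    ["qty_questionmark_url", "qty_hyphen_domain"])
```

A production counterfactual search would explore multi-feature changes and only actionable feature values, but the single-flip case matches the qty_questionmark_url example visualized in Fig. 19.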
Experiment 7: real-world deployment validation
To validate the practical utility of XF-PhishBERT, we implemented the model as a browser extension with a Flask backend server, as illustrated in Fig. 20. The implementation extracts URL features locally, sends them to the backend for inference, and visualizes results with XAI components directly in the browser interface, providing immediate protection with transparent explanations.
Fig. 20.
Chrome extension for real-time phishing detection using XF-PhishBERT.
During a 30-day deployment period, the system processed 45,782 unique URLs with an average processing time of 42 ms per URL on consumer hardware. It successfully identified 724 phishing attempts with a 98.34% precision rate after manual verification, confirming robust performance under real-world conditions. The sub-100 ms response time meets real-time protection requirements while maintaining high accuracy, demonstrating successful translation from research innovation to practical cybersecurity application.
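The backend logic can be outlined as follows. This sketch uses only the standard library rather than Flask, extracts a small illustrative subset of the engineered features, and names such as `handle_check` are hypothetical:

```python
import json
from urllib.parse import urlparse

def extract_features(url):
    """A small subset of URL features (the full system uses many more)."""
    parsed = urlparse(url)
    return {
        "url_length": len(url),
        "qty_dot_url": url.count("."),
        "qty_hyphen_domain": parsed.netloc.count("-"),
        "qty_questionmark_url": url.count("?"),
        "has_https": int(parsed.scheme == "https"),
    }

def handle_check(request_body, classify):
    """Backend handler logic: parse the extension's JSON request, run the
    classifier, and return a JSON verdict with the features used, so the
    extension can render XAI components client-side."""
    url = json.loads(request_body)["url"]
    feats = extract_features(url)
    prob = classify(feats)
    return json.dumps({
        "url": url,
        "phishing_probability": round(prob, 4),
        "verdict": "phishing" if prob >= 0.5 else "legitimate",
        "features": feats,
    })
```

In a Flask deployment this function body would sit inside a `@app.route("/check", methods=["POST"])` view, with `classify` wrapping the XF-PhishBERT model.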
Experiment 8: zero-shot generalization assessment
To assess the model’s ability to generalize without any support examples, we evaluated zero-shot performance where the model relies solely on pre-trained ModernBERT representations and engineered features without task-specific adaptation. Table 5 compares zero-shot performance against few-shot configurations across both datasets.
Table 5.
Zero-shot vs few-shot performance comparison.
| Learning scenario | Dataset | Accuracy (%) | Precision (%) | Recall (%) | F1 (%) | MCC |
|---|---|---|---|---|---|---|
| Zero-shot | DATA1 | 97.23 | 97.45 | 97.01 | 97.23 | 0.945 |
| Zero-shot | DATA2 | 96.87 | 97.12 | 96.62 | 96.87 | 0.937 |
| 1-shot | DATA1 | 98.50 | 98.76 | 98.24 | 98.50 | 0.970 |
| 1-shot | DATA2 | 99.00 | 99.25 | 98.75 | 99.00 | 0.980 |
| 5-shot | DATA1 | 100.00 | 100.00 | 100.00 | 100.00 | 1.000 |
| 5-shot | DATA2 | 99.50 | 99.57 | 99.43 | 99.50 | 0.990 |
Zero-shot performance achieved 97.23% accuracy on DATA1 and 96.87% accuracy on DATA2, demonstrating the strong generalization capability of our pre-trained hybrid architecture. The modest gaps between zero-shot and 1-shot scenarios (1.27 and 2.13 percentage points, respectively) indicate that while adaptation examples provide improvements, the base model captures substantial phishing detection capabilities without task-specific examples.
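The few-shot results above rest on prototypical-network inference: class prototypes are the mean support embeddings, and queries are assigned to the nearest prototype. A minimal sketch (toy embeddings, not the actual ModernBERT representation space):

```python
def prototype(vectors):
    """Class prototype: the mean of the support embeddings."""
    d = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(d)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(query, support):
    """Prototypical-network inference: assign the query embedding to the
    class with the nearest prototype (squared Euclidean distance)."""
    protos = {label: prototype(vecs) for label, vecs in support.items()}
    return min(protos, key=lambda lbl: sq_dist(query, protos[lbl]))
```

In the 1-shot setting each prototype is a single embedding; with 5-10 shots the mean smooths out unrepresentative support examples, which is one reason accuracy climbs quickly between the 1-shot and 5-shot rows of Table 5.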
Error pattern analysis
To gain insights into the limitations of our approach, we conducted an analysis of 312 misclassified instances from both datasets to understand common failure modes. Among false positives (n=156), URL shortening services with elaborate parameter lists accounted for 31.4% of errors, as these URLs often resemble phishing attempts while being legitimate. Legitimate e-commerce URLs with extensive query strings constituted 24.7% of false positives, mostly from large retail outlets using heavily personalized tracking parameters. Internationalized domain names accounted for 19.2% of false positives, indicating insufficient training on non-ASCII domains.
The false negative analysis (n=156) revealed diverse problematic situations. Domain spoofing with minute character alterations was responsible for 28.8% of missed detections, showing that subtle typo-squatting can bypass our detection system. Compromised legitimate websites hosting malicious content made up 25.6% of false negatives, illustrating the difficulty of defending against attacks using legitimate infrastructure. Subdomain attacks on legitimate platforms comprised 22.4% of errors, while visually similar typosquatting accounted for 23.2%.
Model limitations and constraints
Our experiments revealed fundamental limitations affecting the model’s usability and accuracy. The few-shot learning performance is particularly sensitive to support set selection, with poor selections leading to significant accuracy drops. The engineered features may become obsolete as attack techniques evolve, mandating continual retraining and feature engineering updates.
From a computational perspective, feature extraction adds approximately 18 ms of latency per URL evaluation, potentially limiting throughput in high-volume scenarios. The complete explainability pipeline increases inference time by a factor of 2.1, creating a tradeoff between explainability and efficiency. Memory requirements scale linearly with feature dimensionality, potentially creating scalability issues in resource-constrained environments.
Class imbalance in real-world traffic (>99% legitimate URLs) differs significantly from balanced evaluation datasets, potentially impacting operational performance. Dependency on network features for complete analysis introduces additional failure points in production environments.
Confidence intervals and statistical analysis
We applied bootstrap resampling with 1000 iterations to compute 95% confidence intervals for all experimental setups, assessing the precision and reliability of the results. Table 6 presents performance metrics with confidence intervals, indicating that the model’s performance remains consistent across different experimental configurations.
Table 6.
Performance metrics with 95% confidence intervals.
| Configuration | Accuracy (%) | 95% CI | Precision (%) | 95% CI | Recall (%) | 95% CI |
|---|---|---|---|---|---|---|
| 5-shot DATA1 | 98.7 | [97.8, 99.4] | 98.9 | [98.1, 99.5] | 98.5 | [97.6, 99.2] |
| 5-shot DATA2 | 98.3 | [97.5, 99.0] | 98.1 | [97.2, 98.8] | 98.6 | [97.8, 99.3] |
| 10-shot DATA1 | 99.1 | [98.5, 99.6] | 99.3 | [98.7, 99.7] | 98.9 | [98.2, 99.4] |
| 10-shot DATA2 | 98.8 | [98.1, 99.3] | 98.7 | [98.0, 99.2] | 99.0 | [98.4, 99.5] |
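The intervals in Table 6 follow the percentile bootstrap. A minimal sketch of the procedure (illustrative labels, not the actual evaluation data):

```python
import random

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def bootstrap_ci(y_true, y_pred, metric, n_boot=1000, alpha=0.05, seed=7):
    """Percentile bootstrap: resample (label, prediction) pairs with
    replacement n_boot times and take the alpha/2 and 1 - alpha/2
    quantiles of the recomputed metric as the confidence interval."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx],
                            [y_pred[i] for i in idx]))
    stats.sort()
    return (stats[int(n_boot * alpha / 2)],
            stats[int(n_boot * (1 - alpha / 2)) - 1])

# Simulated run at ~98.7% accuracy on 1000 URLs
y_true = [1] * 500 + [0] * 500
y_pred = y_true[:]
for i in range(13):
    y_pred[i] = 0  # 13 misclassifications
lo, hi = bootstrap_ci(y_true, y_pred, accuracy)
```

The same routine applies unchanged to precision, recall, or F1 by swapping the `metric` argument.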
Temporal robustness and cross-dataset generalization
Longitudinal evaluation over 1 month reveals performance degradation patterns consistent with concept drift in cybersecurity domains. The observed 8.7% temporal accuracy decline aligns with findings reported for transformer-based security models27. Cross-dataset evaluation demonstrates 12.3% accuracy reduction when transferring between datasets with different temporal and geographical distributions. However, this degradation rate compares favorably to conventional machine learning approaches, which typically exhibit 18-25% performance reduction in similar transfer scenarios.
Context for comparing commercial systems
Because many commercial phishing filters guard their core methods as trade secrets and limit API interaction, genuine side-by-side testing is practically impossible. Google Safe Browsing publishes detection figures of 95-97% on its in-house data, and Microsoft Defender SmartScreen sits in roughly the same band. By contrast, XF-PhishBERT starts at a baseline accuracy of 99.9% and adds features not found in commercial tools: (1) few-shot adaptation that lets organizations react to fresh threats the instant a sample appears, (2) transparent reasoning that guides human analysts through each classification, and (3) adjustable sensitivity that matches detection rigor to an organization’s unique risk appetite. Adaptability is therefore the signature advantage: whereas proprietary engines need centralized push updates and lengthy validation to learn a new attack, XF-PhishBERT can ingest just a handful of labeled examples and retrain on-site in less than a day.
Discussion
Our experimental results demonstrate that XF-PhishBERT addresses fundamental limitations in current phishing detection through explainable few-shot learning, establishing new paradigms for adaptive cybersecurity defense mechanisms. The comprehensive evaluation across multiple dimensions reveals several critical insights that advance both theoretical understanding and practical deployment of machine learning in cybersecurity contexts.
Few-shot learning breakthrough
The exceptional few-shot learning performance represents a paradigm shift from traditional approaches requiring thousands of labeled samples4. Achieving 98.5-99.0% accuracy with single examples and perfect classification with 5–10 examples addresses the critical practical challenge of limited labeled data for emerging threats. This data efficiency enables organizations to respond to novel phishing campaigns within hours rather than weeks, fundamentally altering the defensive landscape against zero-day attacks. The rapid performance increase from 1-shot to 5-shot scenarios indicates optimal sample efficiency around 5-10 examples per class, providing actionable guidance for operational deployment strategies.
The remarkable consistency across datasets (DATA1 and DATA2) with different temporal ranges (2019-2023) and feature compositions (112 vs 54 features) validates the robustness of our meta-learning approach. The Matthews Correlation Coefficient values consistently exceeding 0.97 across all few-shot configurations demonstrate balanced performance across both legitimate and phishing classes, addressing the critical challenge of class imbalance prevalent in real-world cybersecurity scenarios.
Hybrid architecture effectiveness
The ablation study confirms our hypothesis that combining ModernBERT’s contextual understanding with domain-specific engineered features creates synergistic effects absent in pure approaches. The 4.54 percentage point improvement from ModernBERT integration validates transformer architectures for URL analysis, while ensemble feature selection contributes additional robustness through consensus-based identification. The hybrid approach outperforming both pure transformer (2.69 pp improvement over standard BERT) and pure feature engineering methods demonstrates that optimal phishing detection requires both semantic and structural analysis capabilities.
The architectural innovations of ModernBERT (rotary positional embeddings, GeGLU activation layers, and alternating attention mechanisms) prove particularly effective for URL sequence processing. The 8,192 token capacity enables comprehensive URL analysis without truncation, addressing limitations of previous transformer-based approaches. The selective layer freezing strategy prevents overfitting in few-shot regimes while preserving pre-trained representations, demonstrating the importance of architectural adaptation for cybersecurity applications.
Cross-dataset robustness
The superior cross-dataset generalization (186.43% retention vs 39.37% for traditional ML) addresses a critical weakness identified in existing approaches26,27. This robustness stems from ModernBERT’s diverse pre-training on 2 trillion tokens and the structural nature of selected features, which capture fundamental phishing patterns rather than dataset-specific artifacts. The asymmetric performance between transfer directions suggests that DATA1’s comprehensive feature set provides superior generalization foundation, offering practical insights for training data selection in cross-domain scenarios.
The feature concept analysis reveals that URL structure characteristics (88.58% retention) demonstrate highest transferability, while legitimacy signals (42.95% retention) show greater dataset dependence. These findings provide actionable guidance for feature selection in deployment scenarios where data distributions may differ from training environments. The consistent performance across diverse attack patterns validates the framework’s applicability to evolving threat landscapes.
Explainability impact
The comprehensive explainability framework transforms opaque detection into transparent, analyst-actionable intelligence addressing regulatory and operational requirements. SHAP analysis revealing URL structural characteristics as primary discriminators provides operational insights for manual investigation workflows. Attention visualization confirming focus on deceptive terms validates model reasoning alignment with human expert knowledge, building essential trust for cybersecurity deployment while enabling analyst training and continuous system improvement.
The multi-level explainability approach (feature importance, attention visualization, counterfactual explanations, and URL component analysis) serves different stakeholder needs. Security analysts benefit from detailed technical explanations, while management stakeholders can understand high-level decision rationales. The computational overhead of explanation generation (a 2.1-fold increase in inference time) remains practical for production deployment, with tiered explanation modes enabling real-time operation when needed.
Real-world viability
The successful browser extension deployment validates practical utility under operational constraints. The 42ms latency meets real-time protection requirements while 98.34% precision minimizes user disruption. Processing 45,782 URLs over 30 days demonstrates scalability for enterprise deployment, while identifying 724 actual phishing attempts confirms operational value. This successful research-to-practice translation addresses the persistent gap limiting academic cybersecurity innovations.
The performance benchmarks on consumer hardware (180 URLs/second in instant mode, 25 URLs/second in full explanation mode) demonstrate accessibility for individual users and small organizations. The modular architecture supports deployment across different computational environments, from mobile devices to enterprise security operations centers.
Temporal and commercial context
The observed 8.7% temporal performance decline aligns with concept drift patterns in cybersecurity domains, highlighting the critical importance of few-shot adaptation capabilities for maintaining effectiveness against evolving threats. Comparison with commercial systems reveals XF-PhishBERT’s competitive advantages: immediate adaptation capability, transparent reasoning processes, and adjustable sensitivity, features absent in proprietary solutions constrained by centralized update mechanisms.
Commercial systems like Google Safe Browsing (95–97% accuracy) and Microsoft Defender SmartScreen require centralized updates and lengthy validation cycles for new threat patterns. XF-PhishBERT’s few-shot adaptation enables organizations to respond immediately to emerging threats with minimal examples, closing the 24-48 hour vulnerability window typical of centralized approaches.
Limitations and constraints
Current limitations include shortened URL handling challenges affecting approximately 15–20% of web traffic, computational overhead from explainability components, and 15% performance reduction under sophisticated adversarial conditions. The reliance on static feature engineering creates vulnerability to adversarial adaptation, while few-shot performance sensitivity to support set quality requires careful example selection.
The feature extraction process adds 18ms latency per URL, potentially limiting throughput in high-volume scenarios. Class imbalance in real-world traffic (>99% legitimate URLs) differs significantly from balanced evaluation datasets, potentially impacting operational performance. Dependency on network features for complete analysis introduces additional failure points in production environments.
Implications for cybersecurity practice
XF-PhishBERT’s contributions extend beyond technical performance to practical cybersecurity operations. The few-shot capability enables rapid zero-day response, explainability supports analyst decision-making and regulatory compliance, while the hybrid architecture provides replicable framework for integrating domain expertise with modern machine learning across cybersecurity domains.
The framework demonstrates that academic research can successfully translate to operational cybersecurity tools. The combination of exceptional performance (99.9% accuracy), unprecedented data efficiency (5-10 examples), and comprehensive transparency positions XF-PhishBERT as a practical solution for evolving threat landscapes where traditional approaches struggle with adaptation speed and explainability requirements.
The successful integration of transformer-based language models with cybersecurity domain knowledge establishes a template for addressing similar challenges in malware detection, network intrusion identification, and other security applications requiring rapid adaptation to novel attack patterns.
Shortened and truncated URL limitations
While XF-PhishBERT demonstrates robust performance on standard URL formats, our current implementation has limitations regarding shortened URLs (bit.ly, tinyurl) and mobile-truncated URLs where critical information is obscured. The 111 engineered features in our framework assume access to complete URL structures, and many features become unavailable when URLs are shortened or truncated.
Our few-shot learning framework was evaluated exclusively on complete URLs, and performance on information-constrained scenarios requires dedicated investigation. This represents a critical gap for real-world deployment, as shortened URLs constitute approximately 15-20% of web traffic and mobile truncation affects user decision-making in security-critical contexts.
For shortened URLs, our framework performs controlled URL expansion using HTTP HEAD requests with timeout constraints, extracting features from both the shortened and expanded URLs. When expansion fails, the system relies on shortening service reputation scores with reduced confidence weighting. For truncated URLs, feature extraction focuses on available components with confidence penalties proportional to missing information, and classification decisions include uncertainty estimates based on information completeness.
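The expansion step described above can be sketched as follows; the shortener list is a small illustrative subset and `expand` is a simplified stand-in for the production logic:

```python
from urllib.parse import urlparse
from urllib.request import Request, urlopen

# Non-exhaustive list of known shortening services (illustrative subset)
SHORTENERS = {"bit.ly", "tinyurl.com", "t.co", "goo.gl", "ow.ly"}

def is_shortened(url):
    """Heuristic check of the host against known shortening services."""
    host = urlparse(url).netloc.lower()
    return host.removeprefix("www.") in SHORTENERS

def expand(url, timeout=3.0):
    """Controlled expansion: follow redirects via a HEAD request under a
    timeout and return the final URL. Returns None on failure, in which
    case the caller falls back to shortening-service reputation scores
    with reduced confidence weighting."""
    try:
        req = Request(url, method="HEAD")
        with urlopen(req, timeout=timeout) as resp:
            return resp.geturl()
    except Exception:
        return None
```

Features are then extracted from both the shortened and the expanded URL when expansion succeeds.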
Conclusion and future work
This study presents XF-PhishBERT, a few-shot learning framework that addresses the challenge of detecting novel phishing attacks with minimal labeled examples. The approach demonstrates that combining ModernBERT transformer architecture with domain-specific URL features through prototypical networks and model-agnostic meta-learning enables effective phishing detection with significantly reduced data requirements compared to traditional supervised learning methods.
The experimental evaluation establishes three key findings. First, XF-PhishBERT achieves 99.9% accuracy with 10 examples per class and maintains 98.5% accuracy in one-shot scenarios, representing substantial improvement over existing approaches that require thousands of labeled instances. Second, the consensus-based feature selection methodology yields a 57.7% dimensionality reduction while improving performance by 1.62 percentage points, demonstrating the value of combining multiple selection strategies. Third, cross-dataset evaluation shows 186% performance retention compared to 39% for traditional machine learning approaches, indicating superior generalization capabilities across different data distributions.
The comprehensive explainability framework addresses a critical gap in cybersecurity applications by providing transparent decision-making support through feature importance analysis, attention visualization, and counterfactual explanations. Real-world deployment validation confirms practical utility with 98.3% precision and 42ms latency, demonstrating feasibility for operational security environments.
Several limitations warrant acknowledgment. The framework’s dependence on static feature engineering may require updates as attack techniques evolve. Performance on shortened URLs and mobile-truncated scenarios requires further investigation, as these represent approximately 15–20% of web traffic. Additionally, the long-term effectiveness against sophisticated adversarial attacks designed to exploit the feature selection methodology remains an open question.
Future research should investigate the framework’s applicability to adjacent security domains including malware detection and network intrusion identification. Extension to multi-modal threat detection incorporating email content and website rendering characteristics would enhance comprehensive social engineering defense capabilities. Privacy-preserving mechanisms for collaborative model updating across organizations represent another promising research direction, enabling knowledge sharing while protecting sensitive organizational data.
The results demonstrate that few-shot learning provides a viable approach for cybersecurity applications where rapid adaptation to emerging threats is essential and labeled data is inherently scarce. XF-PhishBERT establishes a foundation for adaptive security systems capable of maintaining effectiveness against evolving attack patterns while providing the transparency required for operational security environments.
Author contributions
M. T. conceived and designed the study; A. H. conducted the experiments; A. A. analyzed the data and wrote the original draft; Y. M. contributed to the experimental design; I. F. assisted with data collection and reviewed the manuscript.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Data availability
The datasets supporting the conclusions of this article are publicly available. The Comprehensive Phishing Website Dataset (DATA1) containing 88,647 URLs with 111 engineered features is available from Vrbančič, G., Fister, I. & Podgorelec, V. Datasets for phishing websites detection. Data Brief 33, 106438. 10.1016/J.DIB.2020.106438 (2020). The PhiUSIIL Dataset (DATA2) containing 235,795 URLs with 54 features is available from Prasad, A. & Chandra, S. PhiUSIIL: A diverse security profile empowered phishing URL detection framework based on similarity index and incremental learning. Computers & Security 136, 103545. 10.1016/J.COSE.2023.103545 (2024).
Code availability
The source code for XF-PhishBERT is publicly available at: https://github.com/kmkholm/phishing.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1.Sanchez-Paniagua, M., Fernandez, E. F., Alegre, E., Al-Nabki, W. & Gonzalez-Castro, V. Phishing URL detection: A real-case scenario through login URLs. IEEE Access10, 42949–42960. 10.1109/ACCESS.2022.3168681 (2022). [Google Scholar]
- 2.Aljofey, A. et al. An effective detection approach for phishing websites using URL and HTML features. Sci. Rep.12, 1–19. 10.1038/S41598-022-10841-5 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Sameen, M., Han, K. & Hwang, S. O. PhishHaven—an efficient real-time AI phishing URLs detection system. IEEE Access8, 83425–83443. 10.1109/ACCESS.2020.2991403 (2020). [Google Scholar]
- 4.Parnami, A. & Lee, M. Learning from few examples: A summary of approaches to few-shot learning, 10.48550/ARXIV.2203.04291 (2022). arXiv: 2203.04291.
- 5.Su, M.-Y. & Su, K.-L. BERT-based approaches to identifying malicious URLs. Sensors23, 8499. 10.3390/S23208499 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Liu, R. et al. PMANet: Malicious URL detection via post-trained language model guided multi-level feature attention network. Inform. Fusion113, 102638. 10.1016/J.INFFUS.2024.102638 (2025). [Google Scholar]
- 7.Al-Sarem, M. et al. An optimized stacking ensemble model for phishing websites detection. Electronics10, 1285. 10.3390/ELECTRONICS10111285 (2021). [Google Scholar]
- 8.Elsadig, M. et al. Intelligent deep machine learning cyber phishing URL detection based on BERT features extraction. Electronics11, 3647. 10.3390/ELECTRONICS11223647 (2022). [Google Scholar]
- 9.Liu, R. et al. TransURL: Improving malicious URL detection with multi-layer transformer encoding and multi-scale pyramid features. Comput. Netw.253, 110707. 10.1016/J.COMNET.2024.110707 (2024). [Google Scholar]
- 10.Wei, Y., Nakayama, M. & Sekiya, Y. An interpretable fine-tuned BERT approach for phishing URLs detection: A superior alternative to feature engineering. In International Conference on Social Networks Analysis, Management and Security (SNAMS), 138–145, 10.1109/SNAMS64316.2024.10883775 (2024).
- 11.Snell, J., Swersky, K. & Zemel, R. Prototypical networks for few-shot learning. In Advances in Neural Information Processing Systems, vol. 30, 4078–4088 (2017). arXiv: 1703.05175.
- 12.Finn, C., Abbeel, P. & Levine, S. Model-agnostic meta-learning for fast adaptation of deep networks. In International Conference on Machine Learning (ICML) (2017).
- 13.Setu, J. H., Halder, N., Islam, A. & Amin, M. A. RSTHFS: A rough set theory-based hybrid feature selection method for phishing website classification. IEEE Access, 10.1109/ACCESS.2025.3561237 (2025).
- 14.Almousa, M. & Anwar, M. A URL-based social semantic attacks detection with character-aware language model. IEEE Access 11, 10654–10663. 10.1109/ACCESS.2023.3241121 (2023).
- 15.Afzal, S. et al. Context-aware embeddings for robust multiclass fraudulent URL detection in online social platforms. Comput. Electr. Eng. 119, 109494. 10.1016/J.COMPELECENG.2024.109494 (2024).
- 16.Khan, I. A. et al. A novel collaborative SRU network with dynamic behaviour aggregation, reduced communication overhead and explainable features. IEEE J. Biomed. Health Inform. 28, 3228–3235. 10.1109/JBHI.2024.3387570 (2024).
- 17.Khan, I. A., Pi, D., Kamal, S., Alsuhaibani, M. & Alshammari, B. M. Federated-boosting: A distributed and dynamic boosting-powered cyber-attack detection scheme for security and privacy of consumer IoT. IEEE Trans. Consum. Electron., 10.1109/TCE.2024.3753485 (2024).
- 18.Khan, I. A. et al. Fed-Inforce-Fusion: A federated reinforcement-based fusion model for security and privacy protection of IoMT networks against cyber-attacks. Information Fusion 101, 102002. 10.1016/j.inffus.2023.102002 (2024).
- 19.Xie, L., Zhang, H., Yang, H., Hu, Z. & Cheng, X. A scalable phishing website detection model based on dual-branch TCN and mask attention. Comput. Netw. 263, 111230. 10.1016/J.COMNET.2025.111230 (2025).
- 20.Maci, A., Santorsola, A., Coscia, A. & Iannacone, A. Unbalanced web phishing classification through deep reinforcement learning. Computers 12, 118. 10.3390/COMPUTERS12060118 (2023).
- 21.Nichol, A., Achiam, J. & Schulman, J. On first-order meta-learning algorithms (2018). arXiv: 1803.02999.
- 22.Vinyals, O., Blundell, C., Lillicrap, T., Kavukcuoglu, K. & Wierstra, D. Matching networks for one shot learning. In Advances in Neural Information Processing Systems, vol. 29 (2016).
- 23.Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems, vol. 30 (2017).
- 24.Ribeiro, M. T., Singh, S. & Guestrin, C. 'Why should I trust you?' Explaining the predictions of any classifier. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135–1144, 10.1145/2939672.2939778 (2016).
- 25.Vig, J. A multiscale visualization of attention in the transformer model. In ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of System Demonstrations, 37–42, 10.18653/v1/p19-3007 (2019).
- 26.Mia, M., Derakhshan, D. & Pritom, M. M. A. Can features for phishing URL detection be trusted across diverse datasets? A case study with explainable AI. In International Conference on Network and System Security, 137–145, 10.1145/3704522.3704532 (2024).
- 27.Flovik, V. Quantifying distribution shifts and uncertainties for enhanced model robustness in machine learning applications (2024). arXiv: 2405.01978.
- 28.Renuka, J. G., Karthikeya, A., Gopal, M. V., Ram, V. S. & Rosaiah, K. BERT-LogReg: Enhancing phishing URL detection with transformer-based features. In Educational Sciences International Conference, 660–665, 10.1109/ESIC60604.2024.10481553 (2024).
- 29.Prasad, A. & Chandra, S. PhiUSIIL: A diverse security profile empowered phishing URL detection framework based on similarity index and incremental learning. Comput. Secur. 136, 103545. 10.1016/J.COSE.2023.103545 (2024).
- 30.Warner, B. et al. Smarter, better, faster, longer: A modern bidirectional encoder for fast, memory efficient, and long context finetuning and inference (2024). arXiv: 2412.13663.
- 31.Vrbančič, G., Fister, I. & Podgorelec, V. Datasets for phishing websites detection. Data Brief 33, 106438. 10.1016/J.DIB.2020.106438 (2020).
- 32.Blum, A., Wardman, B., Solorio, T. & Warner, G. Lexical feature based phishing URL detection using online learning. In Proceedings of the ACM Conference on Computer and Communications Security, 54–60, 10.1145/1866423.1866434 (2010).
- 33.Ma, J., Saul, L. K., Savage, S. & Voelker, G. M. Beyond blacklists: Learning to detect malicious web sites from suspicious URLs. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1245–1253, 10.1145/1557019.1557153 (2009).
- 34.Pillutla, K., Kakade, S. M. & Harchaoui, Z. Robust aggregation for federated learning. IEEE Trans. Signal Process. 70, 1142–1154. 10.1109/TSP.2022.3153135 (2022).
- 35.Phishing Database. Phishing links active database. https://github.com/Phishing-Database/Phishing.Database/blob/master/phishing-links-ACTIVE.txt (2024).
- 36.Loshchilov, I. & Hutter, F. Decoupled weight decay regularization. In International Conference on Learning Representations (ICLR) (2019). arXiv: 1711.05101.
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Data Availability Statement
The datasets supporting the conclusions of this article are publicly available. The Comprehensive Phishing Website Dataset (DATA1), containing 88,647 URLs with 111 engineered features, is available from Vrbančič, G., Fister, I. & Podgorelec, V. Datasets for phishing websites detection. Data Brief 33, 106438, 10.1016/J.DIB.2020.106438 (2020). The PhiUSIIL Dataset (DATA2), containing 235,795 URLs with 54 features, is available from Prasad, A. & Chandra, S. PhiUSIIL: A diverse security profile empowered phishing URL detection framework based on similarity index and incremental learning. Comput. Secur. 136, 103545, 10.1016/J.COSE.2023.103545 (2024).
The source code for XF-PhishBERT is publicly available at: https://github.com/kmkholm/phishing.