Clinical Kidney Journal. 2023 Dec 4;16(Suppl 2):ii55–ii61. doi: 10.1093/ckj/sfad206

Understanding patient needs and predicting outcomes in IgA nephropathy using data analytics and artificial intelligence: a narrative review

Francesco Paolo Schena 1,2, Carlo Manno 3, Giovanni Strippoli 4,5
PMCID: PMC10695518  PMID: 38053972

ABSTRACT

This narrative review explores two case scenarios related to immunoglobulin A nephropathy (IgAN) and the application of predictive monitoring, big data analysis and artificial intelligence (AI) in improving treatment outcomes. The first scenario discusses how online service providers accurately understand consumer preferences and needs through the use of AI-powered big data analysis. The author, a clinical nephrologist, contemplates the potential application of similar methodologies, including AI, in his medical practice to better understand and meet patient needs. The second scenario presents a case study of a 20-year-old man with IgAN. The patient exhibited recurring symptoms, including gross haematuria and tonsillitis, over a 2-year period. Through histological examination and treatment with renin–angiotensin system blockade and corticosteroids, the patient experienced significant improvement in kidney function and reduced proteinuria over 15 years of follow-up. The case highlights the importance of individualized treatment strategies and the use of predictive tools, such as AI-based predictive models, in assessing treatment response and predicting long-term outcomes in IgAN patients. The article further discusses the collection and analysis of real-world big data, including electronic health records, for studying disease natural history, predicting treatment responses and identifying prognostic biomarkers. Challenges in integrating data from various sources and issues such as missing data and data processing limitations are also addressed. Mathematical models, including logistic regression and Cox regression analysis, are discussed for predicting clinical outcomes and analysing changes in variables over time. Additionally, the application of machine learning algorithms, including AI techniques, in analysing big data and predicting outcomes in IgAN is explored. 
In conclusion, the article highlights the potential benefits of leveraging AI-powered big data analysis, predictive monitoring and machine learning algorithms to enhance patient care and improve treatment outcomes in IgAN.

Keywords: artificial intelligence, IgA nephropathy, mathematical models, real-world data

NON-CLINICAL CASE SCENARIO: HOW ONLINE SERVICE PROVIDERS UNDERSTAND OUR NEEDS AND TRAJECTORIES

I am a seasoned clinical nephrologist renowned for my expertise in diagnosing and managing glomerular diseases, particularly immunoglobulin A nephropathy (IgAN). Over the course of my career I have successfully treated numerous patients with this condition. My family and I have been living in a stunning home we purchased 8 years ago, furnished with brand new furniture that we invested a significant amount of money in.

Recently we encountered a problem with our refrigerator, which prompted discussions with my wife about the need for a replacement. Interestingly, during the same period I started receiving multiple promotional e-mails from various sources, including my Amazon account, showcasing a wide array of refrigerators. Surprisingly, some of these refrigerators were identical to the one we bought 7–8 years ago. What intrigued me even more was the fact that my wife and I had expressed a desire to upgrade to a model that also dispenses cold water and ice cubes. Astonishingly, some of the promotional messages I received featured a refrigerator of the same brand as our current one, equipped with the exact features we were looking for.

These online service providers seemed to possess an uncanny ability to understand my preferences and accurately anticipate my needs, delivering tailored advertisements precisely when I required them. At that moment, I couldn't help but wish that I had the same level of insight in my clinical practice—knowing exactly what my patients needed, precisely when and where they needed it. If internet services can exhibit such remarkable accuracy in predicting consumer needs, it surely implies that there must be a methodology behind it. If I can acquire and apply this knowledge to my own medical practice, I am confident it will elevate me to a higher level of proficiency as a physician.

The following day I approached one of my junior residents and discussed the incident with the refrigerator. Unsurprised, my resident informed me about the utilization of big data analysis, artificial intelligence (AI) and other advanced tools used by online sellers to predict consumer demands. It became clear that the collection, analysis and application of big data are crucial factors in this process, enabling companies to refine their understanding and meet customer needs more effectively.

CLINICAL CASE SCENARIO: HOW USING PREDICTIVE MONITORING CAN IMPROVE TREATMENT IN IGAN

We present the case of a 20-year-old man who was admitted to the renal unit following a 2-week history of fever, tonsillitis and gross haematuria. The patient had a recurring pattern of gross haematuria and tonsillitis, along with persistent microhaematuria and proteinuria over the past 2 years. During the initial visit the patient exhibited normal body weight (73.2 kg), normal blood pressure (130/90 mmHg), serum creatinine of 1.10 mg/dl, estimated glomerular filtration rate (eGFR) of 96 ml/min/1.73 m2 and daily proteinuria of 2.5 g.

A kidney biopsy was performed and the histological examination, according to the Oxford classification, revealed IgAN, characterized by diffuse mesangial hypercellularity (M1), segmental glomerular sclerosis and flocculo-capsular adhesions in 20% of glomeruli (S1), as well as florid crescents in 15% of glomeruli (C1). Considering the presence of active renal lesions, the patient was treated with renin–angiotensin system (RAS) blockers in combination with corticosteroids (1 mg/kg body weight) for 2 months. Subsequently a gradual reduction in corticosteroid dosage (0.2 mg/kg body weight/month) was implemented over the following 4 months. RAS blocker therapy (ramipril 7.5 mg/day) was continued.

The DialCheck tool [1] was employed to assess the risk of end-stage kidney disease (ESKD) 10 years after the kidney biopsy, indicating a low probability of 23.48%. At the most recent follow-up visit, 15 years after the kidney biopsy, the patient exhibited normal serum creatinine levels (0.97 mg/dl), improved eGFR (101 ml/min/1.73 m2) and low proteinuria (0.3 g/day). These findings demonstrate the long-term benefits of corticosteroid therapy, as evidenced by the maintenance of normal serum creatinine levels, improved eGFR and reduced proteinuria even after 15 years.

The patient's favourable outcome surpassed the low percentage predicted by the DialCheck tool for reaching ESKD. This case report highlights the utility of the DialCheck tool in predicting ESKD, as it serves as a valuable test for evaluating the therapeutic effect and monitoring the long-term follow-up of patients with IgAN.

Overall, this case emphasizes the importance of individualized treatment strategies in IgAN, guided by predictive tools like DialCheck, to achieve favourable outcomes in management of the disease. Long-term follow-up and regular monitoring of kidney function and proteinuria are crucial in assessing treatment response and disease progression in patients with IgAN.

REAL-WORLD BIG DATA

The collection of electronic health records (EHRs) from daily in- and outpatient visits has resulted in the generation of extensive big data files. Longitudinal EHR data, available in many of these files, allow for the study of disease natural history and the prediction of treatment responses (outcomes). Additionally, EHR data offer opportunities for identifying prognostic biomarkers.

Integrating data files from various sources often presents challenges, including incongruencies, missing data and errors. The collection of big data through randomized clinical trials can mitigate bias and confounding, given the rigorous experimental conditions [2]. Another valuable data source is well-designed observational studies that encompass a diverse representation of the study population, enabling outcome prediction analyses.

However, developing a high-dimensional dataset may encounter the ‘curse of dimensionality,’ as noted by Sinha et al. [3]. Multicollinearity, where two or more predictors are not independent, is a common phenomenon that necessitates reducing the dimensionality of the data file or selecting specific variables for analysis.
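As a minimal sketch of how collinearity between two predictors might be screened before modelling, the following Python computes the Pearson correlation and the variance inflation factor (VIF) for the special case of a model with exactly two predictors. The serum creatinine and eGFR values are invented for illustration and the function names are hypothetical; real analyses would normally rely on a statistical package.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x, y):
    """VIF when the model contains exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r ** 2)

# Invented illustrative values (not patient data)
creatinine = [0.8, 1.0, 1.2, 1.5, 2.0, 2.6]   # mg/dl
egfr = [110, 95, 78, 60, 42, 30]              # ml/min/1.73 m2

r = pearson_r(creatinine, egfr)
vif = vif_two_predictors(creatinine, egfr)
print(f"r = {r:.3f}, VIF = {vif:.1f}")
```

A strongly negative correlation and a large VIF, as expected for these two closely related variables, would suggest keeping only one of them or combining them before fitting a regression model.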

Real-world data files often lack granularity, emphasizing the importance of adhering to the REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) guidelines. These guidelines provide a comprehensive list of codes or algorithms to identify eligible patients and define relevant outcomes, thus minimizing missing data, misclassification bias and unmeasured confounding [4].

Notably, EHRs exist for several cohorts of Caucasian and Asian patients affected by IgAN. These datasets, collected through observational and randomized clinical studies [5–7], include longitudinal data that can be leveraged to develop predictive models for clinical outcomes. Clinical variables were recorded at baseline and during follow-up. Barbour et al. [8] utilized these data files to develop an international tool for predicting ESKD in IgAN patients at the time of kidney biopsy. Additionally, a tool predicting ESKD 1 or 2 years post-biopsy has been recently published [9]. These tools were developed using traditional statistical methods.

Furthermore, other data derived from IgAN patients have been employed to create prediction models based on AI and machine learning algorithms. These models will be discussed in the section ‘Machine learning applications’.

MATHEMATICAL MODELS

The collection of patient data often results in disparate distributions. To determine appropriate statistical approaches, a skewness test can be performed to assess whether the data are normally distributed. Normally distributed quantitative variables are summarized as mean ± standard deviation, whereas non-normally distributed variables are summarized as median and interquartile range (25th–75th percentile). Independent sample t-tests or one-way analysis of variance tests are employed to compare normally distributed continuous data across groups, while the Mann–Whitney U test and Kruskal–Wallis test are used for non-normally distributed continuous data. Categorical variables are presented as absolute and percentage frequencies and are compared using Pearson's chi-squared test or Fisher's exact test, as appropriate.
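The choice between the two summary formats can be sketched as follows. This is a simplified illustration, not a formal normality test: it uses the moment-based skewness coefficient with an arbitrary cut-off of 1.0, which is an assumption made here, and the function names are hypothetical.

```python
import statistics

def sample_skewness(values):
    """Moment-based skewness coefficient g1 = m3 / m2^1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def summarize(values, skew_threshold=1.0):
    """Return 'mean ± SD' for roughly symmetric data, else 'median (Q1-Q3)'.

    The threshold is a rule-of-thumb assumption, not a formal test.
    """
    if abs(sample_skewness(values)) < skew_threshold:
        return f"{statistics.mean(values):.2f} ± {statistics.stdev(values):.2f}"
    q1, median, q3 = statistics.quantiles(values, n=4)
    return f"{median:.2f} ({q1:.2f}-{q3:.2f})"

# Invented illustrative values (not patient data)
symmetric = [1, 2, 3, 4, 5]
right_skewed = [0.2, 0.3, 0.3, 0.4, 0.5, 0.6, 4.0]  # e.g. proteinuria, g/day

print(summarize(symmetric))
print(summarize(right_skewed))
```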

For analysing cumulative renal survival time based on hard or surrogate endpoints, Kaplan–Meier curves are used for censored data. Potential non-linear effects of exposure factors are explored and reported when identified. Group comparisons are made using the logrank test and the Breslow test.
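The Kaplan–Meier product-limit estimate for censored data can be sketched in a few lines of Python. The follow-up times and event indicators below are invented, and the function name is hypothetical; dedicated survival analysis packages should be preferred in practice.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if the endpoint occurred at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients who share the same follow-up time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            # Product-limit step: multiply by the conditional survival at t
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after t
        n_at_risk -= n_removed
    return curve

# Invented illustrative data: years of follow-up and endpoint indicator
times = [2, 3, 3, 5, 8, 10]
events = [1, 0, 1, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t = {t:>2} years, S(t) = {s:.3f}")
```

Censored patients contribute to the number at risk up to their last visit but never trigger a drop in the curve, which is how the method uses incomplete follow-up without treating it as an event.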

It is important to note that clinical variables may change over time and traditional statistical models may not account for such changes. This limitation arises when incorporating time-varying covariates. Another challenge in EHRs is the long-term follow-up, as healthier and non-compliant patients may miss scheduled outpatient visits, resulting in missing data. Additionally, with large data files containing numerous variables and inputs, there can be inadequate processing power for statistical analyses. To address these issues, Brian et al. [10] propose the use of the landmark method initially introduced by Anderson et al. in 1983 [11]. This method utilizes the concept of ‘landmark time’ (e.g. age) to mitigate bias caused by varying degrees of healthcare exposure. It creates relative reference points independent of the outcomes, ultimately reducing the data file size.
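The landmark idea can be illustrated with a simplified sketch: only patients still under follow-up and event-free at the landmark time are retained, the covariate is fixed at its most recent value measured by that time, and follow-up is restarted from the landmark. The record layout and function name are hypothetical, and the values are invented.

```python
def landmark_dataset(patients, landmark):
    """Build a landmark analysis set.

    patients : list of dicts with keys
        'time'   : total follow-up time
        'event'  : 1 if the endpoint occurred at 'time', else 0
        'visits' : list of (visit_time, value) measurements
    landmark : landmark time on the same scale as 'time'
    """
    rows = []
    for p in patients:
        if p["time"] <= landmark:
            continue  # event or censoring before the landmark: not at risk
        prior = [(t, v) for t, v in p["visits"] if t <= landmark]
        if not prior:
            continue  # no measurement available by the landmark
        _, last_value = max(prior)  # most recent measurement by the landmark
        rows.append({
            "covariate": last_value,
            "time_from_landmark": p["time"] - landmark,
            "event": p["event"],
        })
    return rows

# Invented illustrative records: proteinuria (g/day) at visit times (years)
patients = [
    {"time": 1.5, "event": 1, "visits": [(0, 2.0), (1, 2.5)]},
    {"time": 6.0, "event": 0, "visits": [(0, 1.0), (1, 0.8), (3, 0.5)]},
    {"time": 4.0, "event": 1, "visits": [(2.5, 1.2)]},
]
rows = landmark_dataset(patients, landmark=2)
print(rows)
```

Because each retained patient contributes a single row, the resulting file is smaller than the full longitudinal record and can be analysed with a standard survival model.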

Logistic regression analysis is traditionally employed for prediction models based on dichotomous outcomes, such as the presence or absence of an event. The Cox regression proportional hazards model is widely used for survival analysis and time-to-event data, such as the time to progression to ESKD. This model assesses the influence of specific variables or prognostic factors on the relative risk or hazard of experiencing a clinical event within a defined time frame. Patient survival in this context encompasses outcomes of interest beyond mortality, such as time to dialysis initiation, hospitalization, doubling of serum creatinine or a 40% decrease in eGFR. Variables that are significant predictors of ESKD in univariate analysis (P < .05) or deemed clinically relevant are included in a multivariate model to achieve adequate statistical power. Backward or forward stepwise approaches are used for variable selection when dealing with numerous variables. Furthermore, the prediction model should be validated using an external independent cohort of patients. Risk estimates are presented as unadjusted and adjusted hazard ratios (HRs) with their corresponding 95% confidence intervals (CIs), calculated using estimated regression coefficients and standard errors. Baseline variables selected by the multivariate Cox regression model are collected at the time of kidney biopsy and used to evaluate the relative risk associated with baseline prognostic factors such as sex, age, serum creatinine, eGFR, hypertension, histological renal lesions and therapy.
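The step from an estimated Cox regression coefficient and its standard error to a hazard ratio with a 95% CI can be written out explicitly. The coefficient and standard error below are invented for illustration, and the function name is hypothetical.

```python
import math

def hazard_ratio_ci(beta, se, z=1.96):
    """Hazard ratio and Wald 95% CI from a Cox coefficient and its standard error.

    HR = exp(beta); CI = exp(beta - z * se) to exp(beta + z * se).
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Invented illustrative values, e.g. for a 1 g/day increase in proteinuria
hr, lower, upper = hazard_ratio_ci(beta=0.47, se=0.12)
print(f"HR = {hr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

A CI that excludes 1, as in this invented example, corresponds to a statistically significant association at the conventional 5% level.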

Cox regression analysis allows for studying two main types of survival variables: time-invariant variables, which do not change over time (e.g. patient's sex), and time-varying variables, which may change over time (e.g. proteinuria). However, collecting data, especially for infrequent or irregular outpatient visits, poses constraints and difficulties. In time-to-event data, it is common for some individuals not to be followed up by physicians until the event time, resulting in censored times instead of event times. Therefore, statistical models incorporating time-varying covariates have limitations. Long-term follow-up is particularly challenging, as healthier and non-compliant patients may miss scheduled outpatient visits, leading to missing data. Additionally, the presence of numerous variables and millions of inputs in data files often exceeds the processing capabilities of statistical analyses. Moreover, non-linearity of some predictors and the effects of drugs further complicate data interpretation and their applicability in clinical practice.

To analyse changes in variables such as eGFR and proteinuria, mixed models for repeated measurements are utilized. Joint modelling for longitudinal and time-to-event data, comprising classical survival analysis and linear mixed effects models, is employed to assess the time-dependent prognostic ability of variables measured repeatedly over time [12, 13]. The term ‘mixed’ indicates the inclusion of both fixed effects (covariates with constant mean effects across the population) and random effects (covariates varying among individuals) in the models. Discriminative capability is evaluated using a dynamic discrimination index, defined as the weighted average of the time-dependent area under the curve (AUC) measured repeatedly over time.
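One common formulation of such a joint model, given here as a sketch rather than as the specification used in the cited studies, combines a linear mixed effects submodel for the repeated measurement with a proportional hazards submodel. The symbols are assumptions introduced for illustration: $y_{ij}$ is the $j$-th measurement (e.g. eGFR) of patient $i$ at time $t_{ij}$, $m_i(t)$ is the underlying error-free trajectory, $w_i$ are baseline covariates and $\alpha$ quantifies the association between the trajectory and the hazard.

```latex
\begin{align}
  y_{ij} &= m_i(t_{ij}) + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, \sigma^2), \\
  m_i(t) &= (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\, t, \qquad (b_{0i}, b_{1i})^{\top} \sim N(0, D), \\
  h_i(t) &= h_0(t) \exp\!\left( \gamma^{\top} w_i + \alpha\, m_i(t) \right).
\end{align}
```

Here $\beta_0$ and $\beta_1$ are the fixed effects shared by the population, while $b_{0i}$ and $b_{1i}$ are the random intercept and slope of patient $i$.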

Mathematical models have been previously utilized to develop prediction tools for ESKD in IgAN patients [14–20]. Recently, Chinese researchers have developed new predictors of ESKD based on nomograms. Liu et al. [21] developed a nomogram based on a Chinese cohort of 869 IgAN patients, predicting disease progression using variables identified by Cox regression analysis (e.g. urinary protein excretion, eGFR, hyperuricemia, mesangial proliferation, segmental glomerulosclerosis, tubular atrophy/interstitial fibrosis, crescents and glomerulosclerosis). The nomogram achieved high predictive accuracy, with a C-index value of 0.945. Similarly, another study [22] developed and validated a nomogram for predicting IgAN prognosis in a cohort of 349 Chinese patients, considering variables such as mesangial hypercellularity, tubular atrophy/interstitial fibrosis, average proteinuria and average mean arterial pressure. The nomogram demonstrated good predictive accuracy, with a C-index of 0.88.
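The C-index reported for these nomograms measures how often, among comparable pairs of patients, the patient who reached the endpoint earlier was assigned the higher predicted risk. A minimal sketch of Harrell's concordance index follows; the data are invented and the function name is hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs ranked correctly by risk score.

    A pair (i, j) is comparable when patient i had the event and did so
    before patient j's follow-up time. Tied risk scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable

# Invented illustrative data
times = [2, 4, 6, 8]            # years of follow-up
events = [1, 1, 0, 1]           # 1 = ESKD, 0 = censored
risk = [0.9, 0.4, 0.5, 0.1]     # predicted risk scores
c_index = concordance_index(times, events, risk)
print(f"C-index = {c_index:.2f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which places the reported values of 0.945 and 0.88 in context.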

However, the principal limitation of these tools is that their applicability may be restricted to the specific races and ethnicities in which they were developed. In contrast, the International IgAN Prediction Tool developed by Barbour et al. [8] using a large population of 3927 biopsy-proven IgAN patients from different races and ethnicities is widely utilized in clinical practice. This tool provides predictions of ESKD as a percentage at a maximum estimated time of 7 years. However, it is important to note that IgAN is a long-term disease and many patients develop ESKD ≥2 decades after the kidney biopsy [23].

MACHINE LEARNING APPLICATIONS

Big data analysis can be conducted using data mining algorithms, which encompass supervised, unsupervised and semi-supervised learning approaches. Various methods such as classification techniques (e.g. logistic regression, decision trees, naive Bayesian methods, neural networks and support vector machines), clustering techniques (e.g. k-means, principal component analysis and self-organizing maps) and linear regression analysis can be employed for analysing big data. Prior to the analysis, the dataset is divided into a training set and a validation set using a bootstrapping method. Subsequently, the obtained data are tested on an independent external cohort of patients to validate the model's performance. Accuracy, sensitivity, specificity, receiver operating characteristics (ROC) curve, precision, recall, F-measure, number of positive predictions and false positive predictions are common metrics used to evaluate the tool's performance.
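Several of the performance metrics listed above derive from the same four counts of a confusion matrix. The following sketch computes them for a binary outcome; the labels are invented, the function name is hypothetical, and no guard against empty classes is included.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall), specificity, precision and F-measure.

    y_true, y_pred : sequences of 0/1 labels (1 = event, e.g. ESKD).
    Assumes both classes occur and at least one positive prediction is made.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": 2 * precision * sensitivity / (precision + sensitivity),
        "false_positives": fp,
    }

# Invented illustrative labels
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```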

Machine learning algorithms are models that learn to perform tasks or make decisions automatically based on the available data. The spectrum of machine learning applications ranges from classic machine learning approaches to deep learning. These applications have the potential to generate valuable tools for the modern healthcare system. It is crucial that these algorithms ensure robust and valid decision making in clinical practice, as their outcomes can significantly impact patient care.

ESKD PREDICTION

Table 1 provides an overview of different AI-based models used for predicting ESKD in patients with IgAN.

Table 1:

Data used to develop AI models predicting ESKD in patients with IgAN

Author Population Race Patients, n FU (years) Variations, n Training set, n Test set, n Validation test MLA Accuracy (%) Task model Limitations Prediction (years)
Geddes et al. [24] A Caucasian 54 7 7 6 Ns ANN S (86.4) NO KB 7
Pesce et al. [26] A Caucasian/Asian 1040 ND 7 830 210 5 Ns ANN AUC (91.6) CDSS KB classification 8
Schena et al. [1] A Caucasian 948 7 7 758 190 167 pts ANN AUC (82) DialCheck Only Caucasians 10
Hou et al. [30] A Asian 730 7 511 219 No BP-ANN AUC (88) No Only Asians
Liu et al. [31] A Asian 262 4.6 8 No RFM AUC (96) No Only Asians
Chen et al. [32] A Asian 2047 8 10 1022 1025 pts XGBoost C stat (89) Yes Only Asians
Zhang et al. [33] C Asian 1167 16 XGBoost AUC (85) No Only Asians 5
Li et al. [34] A Asian 2047 7.9 10 1022 1025 pts XGBoost-Surv C stat (82) No Only Asians

A: adults; C: children; FU: follow-up; MLA: machine learning algorithm; ND: not done; Ns: nephrologists; pts: patients; S: sensitivity; C stat: C statistic; ANN: artificial neural network; AUC: area under the curve; CDSS: clinical decision support system; BP-ANN: back propagation ANN; RFM: random forest model.

Geddes et al. [24] conducted the initial study on applying AI to predict ESKD in biopsy-proven IgAN patients, using a small cohort of 54 Scottish adult patients, among whom 23% developed ESKD. The researchers employed an artificial neural network (ANN) algorithm trained and tested with a jack-knife sampling technique. The model achieved a correct outcome assignment in 87% of patients, demonstrating good sensitivity (86.4%) and specificity (87.5%). Subsequently the model underwent validation by six nephrologists in clinical practice, yielding a mean score of 69.4% with a sensitivity of 72% and specificity of 66%. The predicted outcome time was based on a 7-year renal function follow-up. The researchers concluded that the ANN’s performance could be enhanced by incorporating a greater number of variables.

Building upon this work, DiNoia et al. [25] developed an advanced ANN model that included the kidney biopsy report in addition to the variables used by Scottish nephrologists. Using an experimental approach to determine the optimal ANN architecture [26], they developed a clinical decision support system (CDSS) consisting of two ANN models, one for predicting ESKD and another for predicting the time to ESKD development [1]. This model was constructed using a larger retrospective international cohort of 1040 IgAN patients. The CDSS tool exhibited high performance in predicting ESKD. The model's histologic variables were improved by incorporating the international Oxford classification [27–29] into a new retrospective cohort of European IgAN patients. The enhanced ANN model consisted of four hidden layers, each with 100 neurons for the classification model, and three hidden layers, each with 125 neurons for the regression model. This modification significantly improved the tool's performance, resulting in an ROC AUC of 0.82 for a 5-year follow-up and 0.89 for a 10-year follow-up. The CDSS tool is accessible as a mobile app and web-based application for Android and iOS cellular phones. The DialCheck tool has been validated in an independent external cohort of IgAN patients [1].
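To make the described classification architecture concrete, the sketch below builds a feed-forward network with four hidden layers of 100 neurons and one output, then runs a single forward pass. It is not the DialCheck implementation: the number of inputs (7), the ReLU hidden activations, the sigmoid output and the random untrained weights are all assumptions made for illustration, so the printed probability has no clinical meaning.

```python
import math
import random

def build_mlp(layer_sizes, seed=0):
    """Create randomly initialized (weights, biases) for each layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers

def forward(layers, inputs):
    """Forward pass: ReLU on hidden layers, sigmoid on the single output."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, activations)) + b
             for row, b in zip(weights, biases)]
        if index < len(layers) - 1:
            activations = [max(0.0, v) for v in z]
        else:
            activations = [1.0 / (1.0 + math.exp(-v)) for v in z]
    return activations[0]

def count_parameters(layers):
    """Total number of weights and biases in the network."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)

N_INPUTS = 7  # assumption: number of input variables
classifier = build_mlp([N_INPUTS, 100, 100, 100, 100, 1])
probability = forward(classifier, [0.5] * N_INPUTS)
print(count_parameters(classifier), round(probability, 3))
```

Even this small architecture has tens of thousands of trainable parameters, which is one reason why training on cohorts of a few hundred patients calls for careful external validation.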

Hou et al. [30] recently aimed to develop an ANN-based model to predict the risk of developing IgAN solely using laboratory data. They analysed a cohort of 212 biopsy-proven IgAN patients and 518 non-IgAN patients utilizing a backpropagation ANN algorithm. The results revealed that seven variables (age, haematuria, eGFR, serum albumin, serum IgA levels, IgA:C3 ratio and IgG) were independent risk factors for the presence of IgAN. The model achieved an area under the ROC curve of 0.88 in the validation set, surpassing the performance of the logistic regression model. However, it is important to note that this algorithm was developed using a small cohort of Chinese patients and has not yet been validated in a large external cohort in which the diagnosis is confirmed by kidney biopsy.

Liu et al. [31] developed a random forest (RF) model to predict ESKD in Chinese IgAN patients. The variables for the model were selected using the Gini impurity index in the RF model logistic regression analysis. The model was developed in a retrospective cohort of 262 IgAN patients and was trained and tested using the six predictors from the CDSS tool [26]. Subsequently the model was further trained by adding variable predictors with significant impact on the progression of kidney damage. The inclusion of data from the MEST [mesangial (M) and endocapillary (E) hypercellularity, segmental sclerosis (S) and interstitial fibrosis/tubular atrophy (T)] classification significantly improved the AUC from 0.92 to 0.97. However, this model has not been validated in an independent external cohort of patients.
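The Gini impurity underlying this kind of variable ranking can be sketched as follows. The example shows the impurity of a node and the decrease obtained by one candidate split; a random forest accumulates such decreases across its trees to rank variables. The labels are invented and the function names are hypothetical.

```python
def gini_impurity(labels):
    """Gini impurity of a node: 1 - sum over classes of p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def gini_decrease(parent, left, right):
    """Impurity decrease achieved by splitting parent into left and right."""
    n = len(parent)
    weighted = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted

# Invented illustrative labels: 1 = ESKD, 0 = no ESKD
parent = [1, 1, 1, 1, 0, 0, 0, 0]
left = [1, 1, 1, 0]    # e.g. patients above a proteinuria threshold
right = [1, 0, 0, 0]   # patients below it
print(gini_impurity(parent), gini_decrease(parent, left, right))
```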

Chen et al. [32] designed the Nanjing risk stratification model based on the gradient tree boosting method implemented in the eXtreme Gradient Boosting (XGBoost) system. They employed the Shapley Additive exPlanation (SHAP) method to explain the XGBoost prediction results. Data from the Nanjing Glomerulonephritis Registry were collected from 18 renal centres and a cohort of 2047 Chinese IgAN patients was divided into a derivation cohort (50%) and a validation cohort (50%). Compared with other machine learning algorithms and statistical methods, this model demonstrated the best performance, with an AUC of 0.89. The XGBoost and SHAP models were incorporated into the Nanjing IgAN Risk Stratification System, which is accessible through a web-based calculator. However, it should be noted that among the 36 variables considered, some show inconsistency in their clinical relevance throughout the disease course.
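The idea behind SHAP can be illustrated with exact Shapley values computed on a toy risk function. This sketch enumerates every ordering of three features, which is feasible only for very small models and is not the tree-specific algorithm used with XGBoost; the risk function, its coefficients and the feature values are invented for illustration.

```python
from itertools import permutations

def shapley_values(predict, x, baseline):
    """Exact Shapley values: average marginal contribution of each feature
    over all orderings in which features are switched from baseline to x."""
    n = len(x)
    contributions = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for feature in order:
            current[feature] = x[feature]
            value = predict(current)
            contributions[feature] += value - previous
            previous = value
    return [c / len(orderings) for c in contributions]

def toy_risk(features):
    """Hypothetical risk score with an interaction term (not a fitted model)."""
    proteinuria, egfr, crescents = features
    return 0.10 * proteinuria - 0.002 * egfr + 0.05 * crescents * proteinuria

# Invented illustrative inputs: [proteinuria g/day, eGFR, crescents (0/1)]
x = [2.5, 96.0, 1.0]
baseline = [0.5, 100.0, 0.0]
phi = shapley_values(toy_risk, x, baseline)
print([round(v, 4) for v in phi])
```

The contributions sum to the difference between the prediction for the patient and the prediction for the baseline, which is the property that allows an individual prediction to be decomposed variable by variable.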

Zhang et al. [33] analysed five different AI models to predict ESKD within 5 years in Chinese children with biopsy-proven IgAN. The researchers used the chi-squared test to select the most relevant variables from 37 attributes, of which one variable proved to be independent. The XGBoost model exhibited the best performance, with an accuracy of 85.11%. However, a significant limitation of this study is the prediction of ESKD within 5 years for a disease characterized by long-term outcomes, particularly in children who often experience renal function recovery. Consequently, the model's utility in clinical practice may be limited.

Li et al. [34] utilized real-world data from the Nanjing Glomerulonephritis Registry, which collected data from 18 renal centres in China. The cohort of 2047 biopsy-proven IgAN patients was divided into derivation (1022 patients) and validation cohorts (1025 patients). The researchers developed a survival model, Extreme Gradient Boosting for Survival (XGBoost-Surv), adapted to time-to-event prediction to forecast ESKD. They employed the SHAP method to interpret individual predicted results. A panel of 36 variables was considered as candidate predictors. The XGBoost-Surv tool was compared with other conventional machine learning algorithms and statistical methods. The model achieved a performance of 0.82 in the validation cohort using the time-dependent concordance index. The interpretation model identified 10 top variables, with tubulointerstitial damage and global sclerosis emerging as the strongest contributors. The study successfully demonstrated the complex relationship between certain predictors and outcomes. However, it is important to note that this model has not been validated in an external independent cohort of IgAN patients.

Overall, these studies show the potential of AI models for predicting ESKD in IgAN patients. Nevertheless, further validation in large external cohorts and incorporation into clinical practice are necessary to establish their robustness and utility.

AI: PROMISES AND LIMITATIONS

AI holds great promise for revolutionizing healthcare by enabling the development of new products and services. AI in the era of big data can assist physicians in improving the quality of patient care, but it cannot replace the traditional physician–patient relationship. In fact, AI cannot replace the intellectual work of the physician at the bedside and the conversation between physician and patient.

In the coming years, AI will become a routine part of nephrology, just as today GFR is estimated and not measured in clinical practice. The successful use of AI in nephrology will depend on the awareness of physicians in clinical practice. Nevertheless, lack of familiarity with AI can be overcome by dedicated courses offered by companies to practitioners and by official courses to students in the medical curriculum. However, companies that are involved in the development of AI tools must be legally liable for any medical errors in clinical practice.

AI also presents several important limitations. It is crucial to understand and address these limitations to fully leverage the potential of AI in the field of medicine.

One significant limitation of AI is its applicability to rare diseases. Due to the scarcity of real-world data, AI may struggle to generate high-quality systems for diagnosing and treating rare conditions. The lack of sufficient data hampers the development of valuable tools in such cases.

Another limitation arises from the unstructured nature of data involved in the initial diagnosis of kidney diseases, which relies on patient history. If AI lacks a deep understanding of human language, it faces challenges in processing and interpreting unstructured data. The current applications of natural language processing in healthcare, such as chatbots, are limited by this constraint.

AI excels when working with structured data. The development of machine learning tools relies heavily on data generated and collected by physicians. Yet physicians are fallible and can make errors in documentation. If AI models are trained on datasets with inaccuracies or mistakes, the generated output may be erroneous. Therefore, it is crucial to conduct a thorough review and revision of medical files before proceeding with the development of AI tools.

Recent research by Kellis et al. [35] highlights the importance of focusing on understanding the causal pathways underlying disease development rather than solely treating disease manifestations. AI can play a crucial role in unravelling these causal pathways, offering the potential to manipulate disease causation and reverse disease outcomes. Thus, investing time and effort in utilizing AI to comprehend the underlying causes of diseases can have a transformative impact on healthcare.

While AI brings promising opportunities to healthcare, it is vital to acknowledge and address its limitations. The scarcity of data for rare diseases, challenges in processing unstructured data, reliance on structured data, potential errors in physician-generated data and the need to delve into causal pathways are all important considerations when utilizing AI in medical practice. By understanding and mitigating these limitations, we can harness the full potential of AI to improve patient care and outcomes.

CONCLUSIONS

In this review we have highlighted the limitations of traditional mathematical techniques, such as regression models, in analysing survival data and longitudinal data, such as eGFR and proteinuria, in IgAN patients. We emphasize the need for more advanced computational approaches, including joint modelling and AI, to enable a more comprehensive analysis of dynamic and individualized prognostic factors.

We have discussed the different computational approaches used to predict ESKD in IgAN patients, ranging from traditional regression models to complex non-linear AI models. As the field continues to evolve, a competition between these two approaches is anticipated in the coming years.

While there have been significant advancements in the development of mathematical models and AI tools for IgAN, it is important to acknowledge their limitations. Mainly, these tools often overlook the heterogeneity of the world population, including differences in race and ethnicity, which may impact the accuracy of predictions.

ACKNOWLEDGEMENTS

The views and opinions expressed in this publication are those of the authors and do not necessarily reflect the policies of CKJ. This work was partially supported by grant ARS01_00876 from the Ministry of University, Italy.

Contributor Information

Francesco Paolo Schena, Department of Precision and Regenerative Medicine and Ionian Area, University of Bari, Bari, Italy; Schena Foundation, Policlinic, Bari, Italy.

Carlo Manno, Department of Precision and Regenerative Medicine and Ionian Area, University of Bari, Bari, Italy.

Giovanni Strippoli, Department of Precision and Regenerative Medicine and Ionian Area, University of Bari, Bari, Italy; School of Public Health, University of Sydney, Sydney, NSW, Australia.

AUTHORS’ CONTRIBUTIONS

All authors contributed to the search and analysis of the studies described in this narrative review.

FUNDING

This paper was published as part of a supplement funded by an educational grant from Otsuka America Pharmaceutical, Inc.

DATA AVAILABILITY STATEMENT

No new data were generated or analysed in support of this research.

CONFLICT OF INTEREST STATEMENT

The authors have no conflicts of interest to disclose.

REFERENCES

  • 1. Schena FP, Anelli VW, Trotta Jet al. Development and testing of an artificial intelligence tool for predicting end-stage kidney disease in patients with immunoglobulin A nephropathy. Kidney Int 2021;99:1179–88. 10.1016/j.kint.2020.07.046 [DOI] [PubMed] [Google Scholar]
  • 2. Lee CH, Yon HJ. Medical big data: promise and challenges. Kidney Res Clin Pract 2017;36:3–11. 10.23876/j.krcp.2017.36.1.3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3. Sinha A, Hripcsak G, Markatou M. Large datasets in biomedicine: a discussion of salient analytic issues. J Am Med Inform Assoc 2009;16:759–67. 10.1197/jamia.M2780 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4. Benchimol EI, Smeeth L, Guttmann Aet al. The REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement. PLoS Med 2015;12:e1001885. 10.1371/journal.pmed.1001885 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5. Coppo R, Troyanov S, Bellur Set al. Validation of the Oxford classification of IgA nephropathy in cohorts with different presentations and treatments. Kidney Int 2014;86:828–36. 10.1038/ki.2014.63 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6. Pitcher D, Braddon F, Hendry Bet al. Long-term outcomes in IgA nephropathy. Clin J Am Soc Nephrol 2023;18:727–38. 10.2215/CJN.0000000000000135 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7. Zhang Z, Zhang Y, Zhang H. IgA nephropathy: a Chinese perspective. Glomerular Dis 2021;2:30–41. 10.1159/000520039 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Barbour SJ, Coppo R, Zhang Het al. Evaluating a new international risk-prediction tool in IgA nephropathy. JAMA Intern Med 2019;179:942–52. 10.1001/jamainternmed.2019.0600 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. Barbour SJ, Coppo R, Zhang Het al. Application of the International IgA Nephropathy Prediction Tool one or two years post-biopsy. Kidney Int 2022;102:160–72. 10.1016/j.kint.2022.02.042 [DOI] [PubMed] [Google Scholar]
  • 10. Wells BJ, Chagin KM, Li Let al. Using the landmark method for creating prediction models in large datasets derived from electronic health records. Health Care Manag Sci 2015;18:86–92. 10.1007/s10729-014-9281-3 [DOI] [PubMed] [Google Scholar]
  • 11. Anderson JR, Cain KC, Gelber RD. Analysis of survival by tumor response. J Clin Oncol 1983;1:710–9. 10.1200/JCO.1983.1.11.710 [DOI] [PubMed] [Google Scholar]
  • 12. Rizopoulos DJM. An R package for the joint modelling of longitudinal and time-to-event data. J Stat Softw 2010;35:1–33. 10.18637/jss.v035.i0921603108 [DOI] [Google Scholar]
  • 13. Chesnaye NC, Tripepi G, Dekker FWet al. An introduction to joint models-applications in nephrology. Clin Kidney J 2020;13:143–9. 10.1093/ckj/sfaa024 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14. Wakai K, Kawamura T, Endoh Met al. A scoring system to predict renal outcome in IgA nephropathy: from a nationwide prospective study. Nephrol Dial Transplant 2006;21:2800–8. 10.1093/ndt/gfl342 [DOI] [PubMed] [Google Scholar]
  • 15. Goto M, Wakai K, Kawamura Tet al. A scoring system to predict renal outcome in IgA nephropathy: a nationwide 10-year prospective cohort study. Nephrol Dial Transplant 2009;24:3068–74. 10.1093/ndt/gfp273 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16. Okonogi H, Utsunomiya Y, Miyazaki Yet al. A predictive clinical grading system for immunoglobulin A nephropathy by combining proteinuria and estimated glomerular filtration rate. Nephron Clin Pract 2011;118:c292–300. 10.1159/000322613 [DOI] [PubMed] [Google Scholar]
  • 17. Berthoux F, Mohey H, Laurent Bet al. Predicting the risk for dialysis or death in IgA nephropathy. J Am Soc Nephrol 2011;22:752–61. 10.1681/ASN.2010040355 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18. Xie J, Kiryluk K, Wang Wet al. Predicting progression of IgA nephropathy: new clinical progression risk score. PLoS One 2012;7:e38904. 10.1371/journal.pone.0038904 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19. Tanaka S, Ninomiya T, Katafuchi Ret al. Development and validation of a prediction rule using the Oxford classification in IgA nephropathy. Clin J Am Soc Nephrol 2013;8:2082–90. 10.2215/CJN.03480413 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20. Knoop T, Vågane AM, Vikse BEet al. Addition of eGFR and age improves the prognostic absolute renal risk-model in 1,134 Norwegian patients with IgA nephropathy. Am J Nephrol 2015;41:210–9. 10.1159/000381403 [DOI] [PubMed] [Google Scholar]
  • 21. Liu J, Duan S, Chen Pet al. Development and validation of a prognostic nomogram for IgA nephropathy. Oncotarget 2017;8:94371–81. 10.18632/oncotarget.21721 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22. Liu LL, Zhu LB, Zheng JNet al. Development and assessment of a predictive nomogram for the progression of IgA nephropathy. Sci Rep 2018;8:7309. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23. Manno C, Strippoli GFM, D'Altri Cet al. A novel simpler histological classification for renal survival in IgA nephropathy: a retrospective study. Am J Kidney Dis 2007;49:763–75. 10.1053/j.ajkd.2007.03.013 [DOI] [PubMed] [Google Scholar]
  • 24. Geddes CC, Fox JC, Allison MEet al. An artificial neural network can select patients at high risk of developing progressive IgA nephropathy more accurately than experienced nephrologists. Nephrol Dial Transplant 1998;13:67–71. 10.1093/ndt/13.1.67 [DOI] [PubMed] [Google Scholar]
  • 25. DiNoia T, Ostuni VC, Pesce Fet al. An end stage kidney disease predictor based on an artificial neural networks ensemble. Expert Syst Applic 2013;40:4438–45. [Google Scholar]
  • 26. Pesce F, Diciolla M, Binetti Get al. Clinical decision support system for end-stage kidney disease risk estimation in IgA nephropathy patients. Nephrol Dial Transplant 2016;31:80–6. 10.1093/ndt/gfv232 [DOI] [PubMed] [Google Scholar]
  • 27. Working Group of the International IgA Nephropathy Network and the Renal Pathology Society, Cattran DC, Coppo Ret al. The Oxford classification of IgA nephropathy: rationale, clinicopathological correlations, and classification. Kidney Int 2009;76:534–45. 10.1038/ki.2009.243 [DOI] [PubMed] [Google Scholar]
  • 28. Working Group of the International IgA Nephropathy Network and the Renal Pathology Society, Roberts IS, Cook HTet al. The Oxford classification of IgA nephropathy: pathology definitions, correlations, and reproducibility. Kidney Int 2009;76:546–56. 10.1038/ki.2009.168 [DOI] [PubMed] [Google Scholar]
  • 29. Trimarchi H, Barratt J, Cattran DCet al. Oxford Classification of IgA nephropathy 2016: an update from the IgA Nephropathy Classification Working Group. Kidney Int 2017;91:1014–21. 10.1016/j.kint.2017.02.003 [DOI] [PubMed] [Google Scholar]
  • 30. Hou J, Fu S, Wang Xet al. A noninvasive artificial neural network model to predict IgA nephropathy risk in Chinese population. Sci Rep 2022;12:8296. 10.1038/s41598-022-11964-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31. Liu Y, Zhang Y, Liu Det al. Prediction of ESRD in IgA nephropathy patients from an Asian cohort: a random forest model. Kidney Blood Press Res 2018;43:1852–64. 10.1159/000495818 [DOI] [PubMed] [Google Scholar]
  • 32. Chen T, Li X, Li Yet al. Prediction and risk stratification of kidney outcomes in IgA nephropathy. Am J Kidney Dis 2019;74:300–9. 10.1053/j.ajkd.2019.02.016 [DOI] [PubMed] [Google Scholar]
  • 33. Zhang P, Wang R, Shi N. IgA nephropathy prediction in children with machine learning algorithms. Future Internet 2020;12:230. 10.3390/fi12120230 [DOI] [Google Scholar]
  • 34. Li Y, Chen T, Chen Tet al. An interpretable machine learning survival model for predicting long-term kidney outcomes in IgA nephropathy. AMIA Annu Symp Proc 2020;2020:737–46. [PMC free article] [PubMed] [Google Scholar]
  • 35. Bughin J, van Zeebroeck N. The promise and pitfalls of AI.McKinsey Global Institute. 2018. https://www.mckinsey.com/mgi/overview/in-the-news/the-promise-and-pitfalls-of-ai [Google Scholar]

Articles from Clinical Kidney Journal are provided here courtesy of Oxford University Press
