Abstract
As digital medicine expands, the growing volume of unused (or underutilized) data is creating a hidden epidemic of technological waste. For example, the concept of a digital twin has gained rapid traction: a virtual replica that mirrors an organ, physiological system or patient to support predictive simulation, real-time monitoring and/or “what-if” scenarios. Yet a digital twin generates big data, e.g., sensor streams, metadata, audit logs, simulations, and backups. Over time, much of that data may become dormant but still require storage. That is the burden of going digital: invisible waste from the accumulation of unused files, logs, archives, and dormant applications/apps, especially in Cloud and institutional infrastructures. The environmental, financial, and operational costs of digital waste are rarely discussed in medicine (or health), yet they matter as data ecosystems scale. In contrast, (physical) electronic/e-waste is broadly discussed. Here, we discuss why digital medicine researchers and institutions must take digital waste seriously. We highlight Digital Cleanup Day (21 March 2026) and raise awareness of the need to embed data sustainability metrics into digital medicine.
Subject terms: Computational biology and bioinformatics, Ecology, Environmental sciences, Health care, Mathematics and computing
Defining and framing digital waste
Many may be familiar with the term electronic/e-waste1,2, i.e., old or end-of-life electronic appliances such as computers, laptops and smartphones3. In contrast, we are discussing digital waste: data and software artifacts that persist far past their use-by date4. Examples include: duplicate or redundant file versions; historical model weights or simulation snapshots (never reused); log archives or audit trails that sit idle; unopened email attachments or legacy inbox content; cold archives never accessed but held in “online” tiers; unused mobile or desktop applications/apps; intermediate staging files, caches, or temporary backups that are never cleaned.
Accordingly, digital waste resembles physical waste: bits accumulate, consume infrastructure, and impose maintenance costs. In enterprise and Cloud settings, “dark data” (like those described above) and ROT (redundant, obsolete, trivial data) are recognized phenomena5,6. Some estimates suggest that a large share of enterprise storage (50% or more) comprises underutilized or redundant data. Yet, unlike e-waste, which is visible, digital waste is invisible and often unchecked. The result? When organizations hoard data “just in case”, i.e., out of fear of deleting something that might be needed later6, they incur hidden liabilities.
Digital twins
In 2024, npj Digital Medicine created a collection on digital twins (https://www.nature.com/collections/chjceifabd), which are symbolic of the tension between data richness and data burden. In medicine/healthcare, a twin may combine, e.g., continuous physiological monitoring, imaging, genomics, clinical records, and simulation outcomes. Each iteration or version contributes to the cumulative storage load. The ambition of digital twin implementations implicitly presumes a wealth of stored data over time. In many proposals for personalized models of physiology, researchers emphasize the need for longitudinal, multimodal data to calibrate and update the twin. But less attention seems to be paid to what may happen years or decades later, when versions and logs accumulate. As digital twin deployment scales across patient cohorts, unchecked storage of redundant or infrequently used data may become problematic.
Remote monitoring
Some recent papers show that environmental framing is not foreign to npj Digital Medicine. Two examples are presented below, although the next generation of digital work should go further than either. Generally, authors should report not only clinical or algorithmic outcomes but also data lifecycle metrics, e.g., volume(s) generated, percent archived/deleted, and energy/carbon estimates. (Could these become a new “data footprint” section in methods or supplementary materials?)
Firstly, Benedetto et al.7 deployed a mobile multi-screening unit (MMSU) and associated digital health tools in rural Tuscany to reduce travel emissions and centralize screening. The authors report that consolidating multiple screening visits reduced patient and caregiver travel-related CO₂ equivalent (CO₂e) emissions by >90% compared to conventional models. While the focus of that study is travel emissions, its framing of digital health as contributing to environmental sustainability is relevant, i.e., digital systems, when used strategically, can reduce one part of the carbon burden (transport). But the model implicitly relies on considerable data capture, processing, and storage, which carry a footprint of their own.
Secondly, in Lathan et al.’s8 prospective observational study of remote follow-up in lower limb arterial surgery patients, the authors quantify financial and environmental savings from remote-first models compared with in-person follow-up. While the principal savings derive from travel reduction, this work exemplifies how digital health interventions are increasingly evaluated through sustainability lenses. It also hints that remote monitoring generates data streams that must be stored and managed, which is another dimension of the footprint.
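The “data footprint” reporting suggested above could be assembled in many ways. The following is a minimal sketch only: the `DataFootprint` class, its field names, and the example volumes are hypothetical and are not drawn from either study discussed in this section.

```python
from dataclasses import dataclass


@dataclass
class DataFootprint:
    """Hypothetical per-study data lifecycle summary (all volumes in GB)."""

    generated_gb: float
    archived_gb: float
    deleted_gb: float

    def percent(self, part_gb: float) -> float:
        # Share of the generated volume, guarding against an empty study.
        if self.generated_gb <= 0:
            return 0.0
        return 100.0 * part_gb / self.generated_gb

    def summary(self) -> str:
        # Whatever was neither archived nor deleted is assumed to stay online.
        retained_gb = self.generated_gb - self.archived_gb - self.deleted_gb
        return (
            f"Generated: {self.generated_gb:.1f} GB; "
            f"archived: {self.percent(self.archived_gb):.1f}%; "
            f"deleted: {self.percent(self.deleted_gb):.1f}%; "
            f"retained online: {retained_gb:.1f} GB"
        )


if __name__ == "__main__":
    # Illustrative numbers only, not taken from any real study.
    footprint = DataFootprint(generated_gb=500.0, archived_gb=200.0, deleted_gb=50.0)
    print(footprint.summary())
```

A one-line summary of this kind could sit in a methods section or supplementary table; the choice of fields would need to be agreed by the community rather than taken from this sketch.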
Impact
Digital waste also incurs a financial cost: it is a creeping liability. Every stored byte must be powered, backed up, replicated, and cooled. Even in efficient data centers, small per-byte emissions scale. One estimate is that storing 1 GB yields ~0.04 kg CO₂e per year (i.e. ~35 kg CO₂e per TB9). While this figure depends heavily on data center design, energy mix, and utilization, it offers a useful approximation. At the rounded rate of 0.04 kg per GB (~40 kg CO₂e per TB per year), retaining an extra 100 TB of redundant data year after year equates to ~4000 kg CO₂e annually. Cooling systems, auxiliary power, and backup infrastructure magnify the footprint10. Large data stores slow down indexing, backups, searches, and migrations11,12. Legacy or redundant artifacts complicate pipelines and hinder maintainability13. Maintaining, migrating, validating, or re-ingesting old data consumes personnel, budget, and time. The more data that is hoarded, the more infrastructure and operational costs grow. Moreover, from a security perspective, older files may lack up-to-date encryption, and unused apps can become vulnerable to attack.
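The storage arithmetic above can be reproduced in a few lines. This is a rough sketch, assuming the ~0.04 kg CO₂e per GB per year figure quoted in the text and a decimal terabyte (1000 GB); the function name and the `GB_PER_TB` constant are illustrative, and a local emission factor should replace the default wherever one is available.

```python
# Assumption: decimal terabyte. Some providers report capacity in TiB (1024 GiB).
GB_PER_TB = 1000


def annual_storage_co2e_kg(terabytes: float, kg_co2e_per_gb_year: float = 0.04) -> float:
    """Estimate yearly emissions (kg CO2e) from keeping `terabytes` of data stored.

    The default factor is the approximate per-GB figure quoted in the text;
    it varies with data center design, energy mix, and utilization.
    """
    return terabytes * GB_PER_TB * kg_co2e_per_gb_year


if __name__ == "__main__":
    # 100 TB of redundant data, as in the example in the text (about 4000 kg CO2e).
    print(f"{annual_storage_co2e_kg(100):.0f} kg CO2e per year")
```

Because the estimate is linear in both volume and emission factor, halving either one halves the result, which makes the sensitivity to the chosen factor easy to report alongside the headline number.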
A call to action: Digital Cleanup Day
Digital Cleanup Day (held annually on the third Saturday of March, i.e., 21 March in 2026; www.digitalcleanupday.org) encourages individuals, labs, health institutions, hospitals, and digital health companies to declutter their digital infrastructures. The benefits of a cleanup include lower energy demand, reduced cooling and infrastructure burden, faster operations, lower storage costs, and tightened security. Healthcare organizations and clinical research units could achieve tangible savings with even a modest pruning of archived sensor logs, redundant imaging, or unused modeling artifacts across many patients.
Cleaning up
Acquiring digital medical data can be a laborious, costly and challenging task. Cleaning up should always be discussed with the wider research team and/or third parties to ensure any clean-up is appropriate and within legal requirements. Moreover, cleaning up should be undertaken with relevant expertise (e.g., from computing) to ensure appropriate and safe actions (Table 1).
Table 1. Suggested activities to facilitate a clean-up and reduce digital waste

| Activity | Description |
|---|---|
| Audit storage | Use tools to identify large unwieldy files, duplicate versions, long-untouched items, old email attachments, stale logs, unused apps, and legacy backups. |
| Delete or archive | Remove or shift data to cold storage (offline or very low-power tiers). Prune redundant versions or simulation snapshots no longer needed. |
| Compress or deduplicate | Where feasible, compress large media files or deduplicate content across repositories. |
| Review retention policies | For projects, labs, or clinical services, set rational retention windows (e.g. raw sensor logs beyond 5 years archived or pruned). |
| Automate hygiene | Deploy scripts or workflows that schedule periodic cleanups, version pruning, or expiration rules. |
| Report impact | Share how much data (in GB/TB) was removed or archived, and estimate the associated energy or CO₂ avoided (e.g. using ~0.04 kg CO₂e per GB-year or more precise local conversion). |
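As one possible starting point for the “Audit storage” activity in Table 1, the sketch below walks a directory tree and reports byte-identical duplicates and long-untouched files. It is read-only and deletes nothing. The function names and the 365-day threshold are assumptions made for illustration, and modification time is used as a stand-in for “untouched” because access times are often unreliable or disabled on storage systems.

```python
import hashlib
import time
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large files are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit(root: Path, stale_after_days: int = 365):
    """Return (duplicate_groups, stale_files) for regular files under `root`."""
    cutoff = time.time() - stale_after_days * 86400
    by_digest = defaultdict(list)
    stale = []
    for path in sorted(root.rglob("*")):
        # Skip directories and symbolic links; only real files are audited.
        if path.is_symlink() or not path.is_file():
            continue
        by_digest[file_digest(path)].append(path)
        if path.stat().st_mtime < cutoff:
            stale.append(path)
    # Files sharing a digest have identical content.
    duplicates = [paths for paths in by_digest.values() if len(paths) > 1]
    return duplicates, stale


if __name__ == "__main__":
    duplicate_groups, stale_files = audit(Path("."))
    print(f"{len(duplicate_groups)} duplicate group(s), {len(stale_files)} stale file(s)")
```

Hashing every file is slow on very large stores; grouping by file size first and hashing only the candidates would be a common refinement, left out here for brevity.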
Naturally, the primary question will probably be, “What if we delete data that we’ll need later?” That is the standard rationale for infinite retention of data, i.e., “we might need it again”. Accordingly, the approach should be cautious and staged, such as (i) quarantining data (isolated/hidden) so that it can be recovered if needed and (ii) creating review windows, i.e., predefined periods of time for research teams to check, approve or object to the cleanup. Such actions would help ensure that cleanups are not reckless but deliberate, safe and within a well-defined study protocol.
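The staged approach described above (quarantine first, delete only after a review window) might look like the following sketch. Everything here is hypothetical: the function names, the `manifest.json` bookkeeping file, and the 30-day default are assumptions for illustration, and any real deployment would need to respect institutional retention and legal requirements.

```python
import json
import shutil
import time
from pathlib import Path
from typing import List, Optional

MANIFEST_NAME = "manifest.json"  # hypothetical bookkeeping file inside the quarantine folder


def _load(quarantine_dir: Path) -> dict:
    manifest = quarantine_dir / MANIFEST_NAME
    return json.loads(manifest.read_text()) if manifest.exists() else {}


def _save(quarantine_dir: Path, entries: dict) -> None:
    (quarantine_dir / MANIFEST_NAME).write_text(json.dumps(entries, indent=2))


def quarantine(path: Path, quarantine_dir: Path, now: Optional[float] = None) -> Path:
    """Stage (i): move a file out of active storage and record where it came from."""
    quarantine_dir.mkdir(parents=True, exist_ok=True)
    entries = _load(quarantine_dir)
    target = quarantine_dir / path.name
    if target.exists() or path.name == MANIFEST_NAME:
        raise FileExistsError(f"{path.name} cannot be quarantined under that name")
    shutil.move(str(path), str(target))
    entries[target.name] = {
        "original": str(path),
        "quarantined_at": time.time() if now is None else now,
    }
    _save(quarantine_dir, entries)
    return target


def restore(name: str, quarantine_dir: Path) -> Path:
    """Recover a quarantined file to its original location, e.g. after an objection."""
    entries = _load(quarantine_dir)
    original = Path(entries.pop(name)["original"])
    shutil.move(str(quarantine_dir / name), str(original))
    _save(quarantine_dir, entries)
    return original


def purge_expired(quarantine_dir: Path, review_days: int = 30,
                  now: Optional[float] = None) -> List[str]:
    """Stage (ii): delete only files whose review window has fully elapsed."""
    now = time.time() if now is None else now
    entries = _load(quarantine_dir)
    expired = [name for name, entry in entries.items()
               if now - entry["quarantined_at"] >= review_days * 86400]
    for name in expired:
        (quarantine_dir / name).unlink()
        del entries[name]
    if expired:
        _save(quarantine_dir, entries)
    return expired
```

The design choice is that deletion is never the first action: a file must sit in quarantine for the whole review window, and `restore` remains available until `purge_expired` runs.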
Conclusion
The hazards of e-waste have long been recognized, with advocacy for appropriate hardware disposal, recycling and circular design as key drivers of sustainability. Yet in digital medicine (and health), there is a parallel threat that is overlooked: unused bits. Digital waste (e.g., redundant data, unused apps) is a hidden liability of our technological age. As digital twins, remote monitoring, and federated infrastructures scale, so does that hidden burden. By anchoring a Digital Cleanup Day (21 March 2026) and instilling data sustainability metrics in digital medicine publications, we can reclaim efficiency, reduce carbon, and improve digital hygiene. Bits matter, and digital medicine must treat them as part of the sustainability equation.
Acknowledgements
This editorial has received funding from the European Union’s Horizon Europe research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 101130572. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Any views expressed are the authors’ only. C.W., L.L.S., B.G., D.P. and A.G. are participants and members of the GETM4 project.
Author contributions
The first draft was written by C.W. and A.G. L.L.S., H.S., B.G. and D.P. provided input on revisions. A.G. approved the final draft.
Data availability
No datasets were generated or analysed during the current study.
Competing interests
A.G. is a deputy Editor-in-Chief of npj Digital Medicine. H.S. is President and CEO of Let’s Do It World/World Cleanup Day.
References
- 1. Niu, B., Xiao, J., Xu, Z. & Ruan, J. Tackling the complexity of e-waste for its reuse in functional materials. Nat. Rev. Methods Primers 5, 46 (2025).
- 2. Liu, K., Tan, Q., Yu, J. & Wang, M. A global perspective on e-waste recycling. Circular Econ. 2, 100028 (2023).
- 3. Jain, M. et al. Review on E-waste management and its impact on the environment and society. Waste Manag. Bull. 1, 34–44 (2023).
- 4. Alieva, J. & Haartman, R. Digital MudA-The new form of waste by Industry 4.0. Oper. Supply Chain Manag.: Int. J. 13, 269–278 (2020).
- 5. Hodgkinson, I., Jackson, L. & Jackson, T. On track for 6.8 billion years of continuous movie streaming: Data, energy & need for digital decarbonization. Observatory of Public Sector Innovation. https://oecd-opsi.org/blog/digital-decarbonization/ (accessed 22 Oct 2025).
- 6. Jackson, T., Hodgkinson, I., Tasker, T. & Smyth, S.-J. How digital waste is polluting the planet. Loughborough University. https://volume.lboro.ac.uk/digital-waste-polluting-the-planet/ (accessed 22 Oct 2025).
- 7. Benedetto, V. et al. Digital health for environmentally sustainable cancer screening. npj Digital Med. 8, 184 (2025).
- 8. Lathan, R. et al. Postoperative remote first care for financially and environmentally sustainable healthcare. npj Digital Med. 8, 299 (2025).
- 9. Al Kez, D., Foley, A. M., Laverty, D., Del Rio, D. F. & Sovacool, B. Exploring the sustainability challenges facing digitalization and internet data centers. J. Clean. Prod. 371, 133633 (2022).
- 10. Siddik, M. A. B., Shehabi, A. & Marston, L. The environmental footprint of data centers in the United States. Environ. Res. Lett. 16, 064017 (2021).
- 11. Pokorný, J. Big Data Storage and Management: Challenges and Opportunities. In Environmental Software Systems. Computer Science for Environmental Protection (eds Hřebíček, J., Denzer, R., Schimak, G. & Pitner, T.) 28–38 (Springer International Publishing, Cham, 2017).
- 12. Lawal, Z. K., Zakari, R. Y., Shuaibu, M. Z. & Bala, A. A review: Issues and Challenges in Big Data from Analytic and Storage perspectives. Int. J. Eng. Computer Sci. 5, 15947–15961 (2016).
- 13. Fucci, D., Alégroth, E. & Axelsson, T. When traceability goes awry: An industrial experience report. J. Syst. Softw. 192, 111389 (2022).






