2015 Jul 7;13(7):e1002195. doi: 10.1371/journal.pbio.1002195

Table 1. Four domains of Big Data in 2025.

In each of the four domains, the projected annual storage and computing needs are presented across the data lifecycle.

| Data Phase | Astronomy | Twitter | YouTube | Genomics |
|---|---|---|---|---|
| Acquisition | 25 zetta-bytes/year | 0.5–15 billion tweets/year | 500–900 million hours/year | 1 zetta-bases/year |
| Storage | 1 EB/year | 1–17 PB/year | 1–2 EB/year | 2–40 EB/year |
| Analysis | In situ data reduction; real-time processing; massive volumes | Topic and sentiment mining; metadata analysis | Limited requirements | Heterogeneous data and analysis; variant calling, ~2 trillion central processing unit (CPU) hours; all-pairs genome alignments, ~10,000 trillion CPU hours |
| Distribution | Dedicated lines from antennae to server (600 TB/s) | Small units of distribution | Major component of modern user’s bandwidth (10 MB/s) | Many small (10 MB/s) and fewer massive (10 TB/s) data movement |
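
The Storage row mixes petabytes and exabytes, which makes the four domains hard to compare at a glance. The sketch below is only an illustration, not part of the original analysis: it converts each range to petabytes per year, assuming decimal SI prefixes (1 EB = 1000 PB). The names `STORAGE`, `PB_PER_UNIT`, and `storage_in_pb` are hypothetical helpers introduced here.

```python
# Storage projections copied from Table 1 as (low, high, unit) per year.
# Assumption: decimal SI prefixes, so 1 EB = 1000 PB.
PB_PER_UNIT = {"PB": 1, "EB": 1000}

STORAGE = {
    "Astronomy": (1, 1, "EB"),
    "Twitter": (1, 17, "PB"),
    "YouTube": (1, 2, "EB"),
    "Genomics": (2, 40, "EB"),
}


def storage_in_pb(domain):
    """Return the (low, high) storage range for a domain in PB/year."""
    low, high, unit = STORAGE[domain]
    factor = PB_PER_UNIT[unit]
    return low * factor, high * factor


if __name__ == "__main__":
    # Print every domain on a common scale for side-by-side comparison.
    for domain in STORAGE:
        low, high = storage_in_pb(domain)
        print(f"{domain:<10} {low:>6,} to {high:>6,} PB/year")
```

On this common scale, the upper genomics projection (40 EB/year) is 40,000 PB/year, against an upper Twitter projection of 17 PB/year.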