2018 May 23;10(5):e2680. doi: 10.7759/cureus.2680

Figure 1. Schematic Illustration of Long Tail Data.

“Studies that have plotted data set size against the number of data sources reliably uncover a skewed distribution. Well-organized big science efforts featuring homogenous, well-organized data represent only a small proportion of the total data collected by scientists. A very large proportion of scientific data falls in the long-tail of the distribution, with numerous small independent research efforts yielding a rich variety of specialty research data sets. The extreme right portion of the long tail includes data that are unpublished; such as siloed databases, null findings, laboratory notes, animal care records, etc. These dark data hold a potential wealth of knowledge but are often inaccessible to the outside world” [1]. Image courtesy of Nature Publishing Group.