2017 May 3;67(6):546–557. doi: 10.1093/biosci/bix025

Table 2.

A taxonomy of skills for data-intensive research.

| Data management and processing | Software skills for science | Analysis | Visualization | Communication for collaboration and results dissemination |
| --- | --- | --- | --- | --- |
| Fundamentals of data management | Software development practices and engineering mindset | Basic statistical inference | Visual literacy and graphical principles | Reproducible open science |
| Modeling structure and organization of data | Version control | Exploratory analysis | Visualization services and libraries | Collaboration workflows for groups |
| Database management systems and queries (e.g., SQL) | Software testing for reliability | Geospatial information handling | Visualization tools | Collaborative online tools |
| Metadata concepts, standards, and authoring | Software workflows | Spatial analysis | Interactive visualizations | Conflict resolution |
| Data versioning, identification, and citation | Scripted programming (e.g., R and Python) | Time-series analysis | 2D and 3D visualization | Establishing collaboration policies |
| Archiving data in community repositories | Command-line programming | Advanced linear modeling | Web visualization tools and techniques | Composition of collaborative teams |
| Moving large data | Software design for reusability | Nonlinear modeling | | Interdisciplinary thinking |
| Data-preservation best practices | Algorithm design and development | Bayesian techniques | | Discussion facilitation |
| Units and dimensional analysis | Data structures and algorithms | Uncertainty propagation | | Documentation |
| Data transformation | Concepts of cloud and high-performance computing | Meta-analysis and systematic reviews | | Website development |
| Integrating heterogeneous, messy data | Practical cloud computing | Scientific workflows | | Licensing |
| Quality assessment | Code parallelization | Scientific algorithms | | Message development for diverse audiences |
| Quantifying data uncertainty | Numerical stability | Simulation modeling | | Social media |
| Data provenance and reproducibility | Algorithms for handling large data | Analytical modeling | | |
| Data semantics and ontologies | | Machine learning | | |

Note: Many if not most of these elements apply across multiple categories. This taxonomy was initially created in a workshop involving natural and physical scientists, information scientists, and computer scientists (isees.nceas.ucsb.edu), with modest refinements by the authors.