Microbiology Resource Announcements. 2023 Jan 18;12(2):e01310-22. doi: 10.1128/mra.01310-22

The Riffomonas YouTube Channel: An Educational Resource To Foster Reproducible Research Practices

Patrick D Schloss
Editor: Irene L G Newton
PMCID: PMC9933679  PMID: 36651754

ABSTRACT

Methods for analyzing data in a reproducible manner are often viewed as impenetrable to scientists more familiar with laboratory research. The Riffomonas YouTube channel is committed to teaching these scientists and others how to engage in reproducible research using modern data science tools.

ANNOUNCEMENT

As high-throughput data generation becomes more common in microbiology and other disciplines, there is a significant need for laboratory scientists to develop data science skills (1). Unfortunately, traditional undergraduate and graduate biology training programs are often deficient in opportunities for scientists to develop the skills necessary to analyze large datasets in a reproducible and robust manner (2, 3). Numerous organizations seek to fill this void, including the Carpentries, Codecademy, and DataCamp (4). There are also numerous video tutorials available on YouTube. Although the content available through these platforms is popular, there has been a gap in content that emphasizes project-based learning.

The Riffomonas YouTube channel (https://www.youtube.com/c/RiffomonasProject) seeks to fill this gap. I started consistently posting videos at the beginning of the coronavirus disease 2019 (COVID-19) pandemic in April 2020. As of the end of November 2022, the channel had 11,327 subscribers and included 285 videos that had been viewed 635,947 times. Most of these videos (264) are in the “Code Club” playlist (5) (Table 1). Other videos are related to a previously described tutorial series on reproducible research (6) and to series in which reproducible research practices are used to address topical questions. Code Club videos are typically between 20 and 30 min long. The code that is developed in the videos is available through a website (https://riffomonas.org/code_club/) and the channel’s GitHub-hosted account (https://github.com/riffomonas).

TABLE 1. Playlists found on the Riffomonas YouTube channel^a

Topic                 Playlist title                                              No. of videos
Data science          Data visualization with R’s tidyverse and allied packages   146
                      Data manipulation within R’s tidyverse and other packages   116
                      Data analysis with base R                                   39
                      Tools for reproducible data analysis                        33
                      Working at the command line                                 26
                      Literate programming with R markdown                        18
                      Machine learning with mikropml R package                    16
                      Version control with Git and GitHub                         15
                      Scientific writing                                          15
                      Project organization                                        3
Project-based series  All Code Club videos since 2 April 2020                     265
                      Microbiome data analysis and visualization                  86
                      ASV/OTU^b sensitivity and specificity analyses              67
                      Visualizing COVID-19 vaccination attitudes                  31
                      Climate change data visualization                           29
                      Evaluating rarefaction and its alternatives                 18
                      Drought index visualization                                 17
                      Reproducible research tutorial series                       14
                      Commemorating Juneteenth 2022 with a visualization          5
                      2018 MLB All Star Break data analysis sprint                4

^a Because most videos cover more than one topic, they are found in multiple playlists. Playlists and counts were current as of 1 December 2022. Playlists can be found under the Playlists tab at https://www.youtube.com/c/RiffomonasProject.

^b ASV, amplicon sequence variant; OTU, operational taxonomic unit.

The channel name, Riffomonas, comes from the concept of riffing, in which musical themes are adapted to achieve a similar sound, albeit perhaps in different contexts (6). The name emphasizes the value of reproducibility not only for recreating a set of results but also for applying a method to a different data set (7). The channel covers topics related to reproducible data analysis practices, including R programming, data visualization, project organization, version control, command line programming, workflow tools, and scientific publishing (Table 1). Each video includes a brief introduction, after which I live code to achieve a goal. I emphasize live coding to modulate the rate of instruction and to show viewers my own coding practices. Observing an experienced analyst make mistakes normalizes some level of failure and demonstrates strategies that viewers can use to resolve their own mistakes. Viewers are encouraged to follow along with each video and to apply the new information to their own projects.

Each video emphasizes a specific topic but includes other content that is selected to review topics covered in recent videos. Although videos can be watched individually, they often form a project arc (Table 1). For example, between July 2020 and July 2021, I formulated a research question, obtained and analyzed data to answer the question, and wrote a paper that was published in mSphere (8). This series of 67 videos covered every topic from creating the initial directory on my computer to house the project files through reviewing the proofs of the published manuscript. Other project arcs have included visualizing microbiome data, modeling microbiome data using machine learning tools, analyzing the impacts of rarefying microbiome data, and other topics. Going forward, the Riffomonas channel will continue to post project-based content to help researchers develop their reproducible research skills.

Data availability.

The Riffomonas YouTube channel is available at https://www.youtube.com/c/RiffomonasProject. The code developed in the Code Club videos is available at https://riffomonas.org/code_club/ and the channel’s GitHub-hosted account (https://github.com/riffomonas).

ACKNOWLEDGMENTS

I am grateful to the audience of the Riffomonas channel for their feedback on topics that I should cover in future episodes.

Contributor Information

Patrick D. Schloss, Email: pschloss@umich.edu.

Irene L. G. Newton, Indiana University, Bloomington

REFERENCES

1. Barone L, Williams J, Micklos D. 2017. Unmet needs for analyzing biological big data: a survey of 704 NSF principal investigators. PLoS Comput Biol 13:e1005755. doi: 10.1371/journal.pcbi.1005755.
2. Schloss PD. 2018. Identifying and overcoming threats to reproducibility, replicability, robustness, and generalizability in microbiome research. mBio 9:e00525-18. doi: 10.1128/mBio.00525-18.
3. Williams JJ, Drew JC, Galindo-Gonzalez S, Robic S, Dinsdale E, Morgan WR, Triplett EW, Burnette JM, III, Donovan SS, Fowlks ER, Goodman AL, Grandgenett NF, Goller CC, Hauser C, Jungck JR, Newman JD, Pearson WR, Ryder EF, Sierk M, Smith TM, Tosado-Acevedo R, Tapprich W, Tobin TC, Toro-Martínez A, Welch LR, Wilson MA, Ebenbach D, McWilliams M, Rosenwald AG, Pauley MA. 2019. Barriers to integration of bioinformatics into undergraduate life sciences education: a national study of US life sciences faculty uncover [sic] significant barriers to integrating bioinformatics into undergraduate instruction. PLoS One 14:e0224288. doi: 10.1371/journal.pone.0224288.
4. Wilson G. 2016. Software carpentry: lessons learned. F1000Res 3:62. doi: 10.12688/f1000research.3-62.v2.
5. Hagan AK, Lesniak NA, Balunas MJ, Bishop L, Close WL, Doherty MD, Elmore AG, Flynn KJ, Hannigan GD, Koumpouras CC, Jenior ML, Kozik AJ, McBride K, Rifkin SB, Stough JMA, Sovacool KL, Sze MA, Tomkovich S, Topcuoglu BD, Schloss PD. 2020. Ten simple rules to increase computational skills among biologists with code clubs. PLoS Comput Biol 16:e1008119. doi: 10.1371/journal.pcbi.1008119.
6. Schloss PD. 2018. The Riffomonas reproducible research tutorial series. J Open Source Educ 1:13. doi: 10.21105/jose.00013.
7. Leek JT, Peng RD. 2015. Reproducible research can still be wrong: adopting a prevention approach. Proc Natl Acad Sci USA 112:1645–1646. doi: 10.1073/pnas.1421412111.
8. Schloss PD. 2021. Amplicon sequence variants artificially split bacterial genomes into separate clusters. mSphere 6:e0019121. doi: 10.1128/mSphere.00191-21.


Articles from Microbiology Resource Announcements are provided here courtesy of American Society for Microbiology (ASM)
