Proceedings of the National Academy of Sciences of the United States of America. 2022 Jun 10;119(25):e2205897119. doi: 10.1073/pnas.2205897119

Closing the loop on crowdsourced science

James M Robson a,b, Alexander A Green a,b,c,1
PMCID: PMC9231617  PMID: 35687665

In the age of smartphones, tablets, and personal computers, information communication technologies have changed the way the world thinks, works, and functions. Digital technologies have influenced how people communicate with one another and how knowledge is shared. The dissemination of scientific knowledge has been revolutionized by the internet, enabling researchers to share their findings and data directly with followers outside the scientific community. While online sharing has facilitated scientific collaboration and aided in the distribution of scientific results, it has been far more challenging to bring the outside community into the process of scientific discovery and hypothesis testing to take advantage of crowdsourced insights. In PNAS, Andreasson et al. (1) present a platform for iterative hypothesis generation and high-throughput characterization that enables large-scale, video game–based crowdsourcing of tens of thousands of RNA-based sensor designs. They find that bringing the wisdom of an online community into the discovery process yields unanticipated RNA device architectures along with sensors that operate near the thermodynamic optimum.

RNA is an ideal polymer for biomolecular design; it can code for genetic information, fold into intricate structures known as aptamers to capture ligand molecules, catalyze chemical reactions, and play a role in nearly every biological process in living cells. Moreover, the pairing relationships of the RNA bases, where A binds to U and G binds to C, provide a means to direct how RNA molecules fold and how they interact with other transcripts. RNA, therefore, provides a promising way to construct regulatory elements based on its diversity of functions and our growing understanding of sequence–function relationships. Indeed, a diverse assortment of engineered RNA-based switches has been described, from riboswitches that bind to different metabolites and proteins to regulate gene expression (2–4) to riboregulators that can sense pathogen RNAs (5) and perform biomolecular computations (6).
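The pairing rule described above (A with U, G with C) can be illustrated with a minimal Python sketch. The function names `reverse_complement` and `can_form_stem` are hypothetical helpers introduced here for illustration, and the sketch deliberately ignores G-U wobble pairs and all thermodynamics, so it is a simplification rather than a model of how real RNA folds.

```python
# Watson-Crick pairs named in the text: A-U and G-C.
# Assumption: G-U wobble pairs, which do occur in real RNA, are left out for simplicity.
PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the RNA strand that would pair with `seq` in antiparallel orientation."""
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


def can_form_stem(five_prime: str, three_prime: str) -> bool:
    """True if the two segments are fully complementary when aligned antiparallel."""
    return reverse_complement(five_prime) == three_prime.upper()
```

For example, `reverse_complement("GGAC")` gives `"GUCC"`, so a designer who wants the segment `GGAC` sequestered in a stem could place `GUCC` downstream of it.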

While RNA switches hold great promise in many applications, their design remains challenging due to sequence-specific interactions, the difficulties in predicting the energies of three-dimensional conformations, and noncanonical base pairs that have not been extensively characterized. The fundamental rules that determine RNA switch performance remain poorly understood. Without rules to explain relationships between RNA sequence, structure, and behavior, rapid generation of RNA elements for biological control systems remains a challenge. Sequence design using software packages like NUPACK (7) and ViennaRNA (8) allows for the generation of thousands of potential sensors, but many RNA designs do not perform well when tested experimentally (9), which makes their performance difficult to predict and thus slows development.

In PNAS, Andreasson et al. (1) extend the iterative process of RNA design, synthesis, and testing, describing a pipeline for “citizen scientists” to make, validate, and modify hypotheses at scale, thereby accelerating design advances for RNA switches. Thousands of sensor designs were crowdsourced through Eterna (10), a massive open laboratory and discovery game, enabling the authors to overcome the limitations of existing computational RNA software design packages by leveraging both human insight and experimental testing (Fig. 1). The first community design challenge focused on implementing an RNA switch to detect the cellular metabolite flavin mononucleotide (FMN). Eterna players were tasked with engineering the RNA switch to form the MS2 aptamer RNA structure after binding of FMN. The MS2 aptamer can in turn capture a fluorescent protein ligand, enabling visualization under a microscope. Using repurposed Illumina sequencing chips, the player-devised RNA switches were evaluated at high throughput, providing quantitative RNA characterization data. High-performing player-designed switches from the Eterna community were compared with Ribologic (11), an automated algorithm for designing RNA molecules that are predicted to change their secondary structure in response to interactions with other molecules. Computer-generated designs via Ribologic were found to exhibit lower activation ratios than the final Eterna player designs after iterative refinement.
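The comparison of designs by activation ratio can be sketched as follows. This commentary does not define the metric, so the sketch assumes a simple fold change: a reporter readout measured with the ligand present, divided by the same readout measured without it. The class `SwitchMeasurement`, its field names, and the ranking helper are hypothetical and are not taken from Andreasson et al. (1), whose exact definition may differ.

```python
from dataclasses import dataclass


@dataclass
class SwitchMeasurement:
    """Hypothetical record of one design's reporter readout in two conditions."""
    name: str
    signal_without_ligand: float
    signal_with_ligand: float


def activation_ratio(m: SwitchMeasurement) -> float:
    """Fold change in reporter signal on adding ligand (assumed definition)."""
    if m.signal_without_ligand <= 0:
        raise ValueError("signal without ligand must be positive")
    return m.signal_with_ligand / m.signal_without_ligand


def rank_by_activation(measurements):
    """Return the measurements sorted from highest to lowest activation ratio."""
    return sorted(measurements, key=activation_ratio, reverse=True)
```

A ratio near 1 would mean the design barely responds to its ligand; ranking a round of designs this way is one plausible form for the quantitative feedback returned to players between rounds.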

Fig. 1.

Iterative cycle of community-based hypothesis testing. Video game–based crowdsourcing of RNA designs through Eterna is followed by synthesis of RNA libraries tested in massively parallel fashion on an array. High-throughput functional characterization and data quantification are released back to the Eterna community, allowing for iterative rounds of hypothesis generation, modification, and analysis by citizen scientists.

The generality of the crowdsourced design approach was tested by developing RNA sensors for small molecules with different aptamer reporters. In particular, Andreasson et al. (1) demonstrate RNA switches with malachite green and Spinach aptamer reporters (12, 13), which bind dye molecules to generate fluorescent signals. They show that when the total number of tested designs is reduced, the Eterna community players generate responsive switches, even in cases when the Ribologic algorithm fails. The authors go on to demonstrate that finding the optimal switch design, regardless of the type of riboswitch, requires testing not only multiple switch designs but also multiple switch architectures. For successful FMN–MS2 switches, the positions of the FMN aptamer and MS2 hairpin were sequestered, and other nucleotides were embedded into stems. However, they demonstrate that across both the FMN–MS2 switches and the fluorescent aptamer switches, more diverse motif orderings and toggling of mechanisms result in optimal performance. If the position, motif ordering, and structure toggling of RNA switch mechanisms strongly impact switch performance, the question arises as to how motif ordering might impact the design of riboswitches and riboregulators and whether other more optimal architectures have yet to be discovered.

Overall, the massive number of designs synthesized and validated through the workflow in Andreasson et al. (1) is a testament to the remarkable technologies for parallel synthesis and assembly of thousands of synthetic nucleic acid templates (12, 13), which can be transcribed both in vitro and in vivo into RNA. When these synthesis technologies are coupled with screening technologies (14, 15) and deep-sequencing platforms with turnaround times of less than a day, the gap between in silico design and experimental validation will narrow. Ultimately, the process will be limited only by the speed of DNA synthesis and the creativity with which we can devise new RNA switch structures.

The explosion of RNA data in recent years, with hundreds of thousands of RNA switches synthesized, validated, and released to the public, will allow for development of more quantitative and predictive theories for RNA structure design and eventually, de novo, forward-engineered RNA functional design (16). The cycle of RNA design and testing can be short, especially compared with protein engineering. With the application of more sophisticated algorithms and motif finding in RNA regulatory elements, machine learning applications for RNA design should eventually require only minimal experimental validation. Predicting the three-dimensional structure that a protein will adopt has been an important research problem for 50 y, but major advances have recently been achieved through artificial intelligence (AI) networks (17, 18). Only a few deep-learning AI approaches have been developed for the computational prediction of three-dimensional RNA structures (19), but with the expansion and massively parallel high-throughput characterization of these datasets, AI predictions for RNA devices like riboswitches are close at hand.

In the age of big data and with the growth of citizen science applications, the emergence of game-based projects that provide massive amounts of quantitative data makes exploration of the complex sequence space of riboswitches and riboregulators feasible. Over time, machine learning or AI and the development of more effective RNA design algorithms may mean that some types of data analysis no longer require any human input. However, science-based games, like the Foldit game (20), that rely on diversity in spatial recognition and problem-solving should continue to provide valuable insights. As it currently stands, development of citizen science initiatives, like Eterna, Foldit, or EyeWire (21), has enabled thousands of interested people to become involved in authentic scientific research from anywhere with internet connectivity.

Despite the ease with which games might be accessed and advancements in mobile technology, studies have shown that typical participants in citizen science initiatives are well-educated males with an existing interest in science or computing and are also likely to live in North America or Europe (22). Perhaps greater efforts targeting marginalized and minoritized audiences, groups that may feel that these opportunities are not meant for them, may result in a greater number of participants and may bring new perspectives and approaches to the data. While Andreasson et al. (1) make strides in “closing the loop” on crowdsourced science to enable iterative hypothesis generation and experimental testing by actively engaging and promoting Eterna to diverse audiences, the research might be more impactful if provisions are made to encourage participants to analyze their own data and ask their own questions. Opening research in an inclusive way may have wider societal benefits, such as increasing transparency of research, while simultaneously helping to build science capital, thus bringing populations that would otherwise be removed from the scientific process to the forefront of research discovery.

To cast the widest net, games built like Eterna must compete with the likes of Candy Crush, Pokemon GO, and other popular video games. If we build it, they do not necessarily come. In communities where there is low science capital, using online citizen science projects may help introduce scientific research to a range of ages and abilities, but the “democratization” of experimental biological science is far from a reality. The provisioning of a low-cost experimental validation workflow for hypotheses generated by the broader community is a crucial step toward making scientific discovery more accessible.

Acknowledgments

Our research is supported by NIH Director’s New Innovator Award DP2GM126892, NIH Grants U01AI148319 and R01EB031893, Research Corporation for Science Advancement Award 28422, and NSF Rapid Response Research Award 022329A. J.M.R. is supported by an NIH T32 Synthetic Biology and Biotechnology Training Grant (T32GM130546) and by the NSF Graduate Research Fellowship.

Footnotes

Competing interest statement: A.A.G. is a cofounder of En Carta Diagnostics Inc.

See companion article, “Crowdsourced RNA design discovers diverse, reversible, efficient, self-contained molecular switches,” 10.1073/pnas.2112979119.

References

  • 1. Andreasson J. O. L., et al., Crowdsourced RNA design discovers diverse, reversible, efficient, self-contained molecular switches. Proc. Natl. Acad. Sci. U.S.A. 119, e2112979119 (2022).
  • 2. Win M. N., Smolke C. D., Higher-order cellular information processing with synthetic RNA devices. Science 322, 456–460 (2008).
  • 3. Topp S., et al., Synthetic riboswitches that induce gene expression in diverse bacterial species. Appl. Environ. Microbiol. 76, 7881–7884 (2010).
  • 4. Townshend B., Kennedy A. B., Xiang J. S., Smolke C. D., High-throughput cellular RNA device engineering. Nat. Methods 12, 989–994 (2015).
  • 5. Pardee K., et al., Rapid, low-cost detection of zika virus using programmable biomolecular components. Cell 165, 1255–1266 (2016).
  • 6. Green A. A., et al., Complex cellular logic computation using ribocomputing devices. Nature 548, 117–121 (2017).
  • 7. Zadeh J. N., et al., NUPACK: Analysis and design of nucleic acid systems. J. Comput. Chem. 32, 170–173 (2011).
  • 8. Lorenz R., et al., ViennaRNA Package 2.0. Algorithms Mol. Biol. 6, 26 (2011).
  • 9. Green A. A., Silver P. A., Collins J. J., Yin P., Toehold switches: De-novo-designed regulators of gene expression. Cell 159, 925–939 (2014).
  • 10. Lee J., et al.; EteRNA Participants, RNA design rules from a massive open laboratory. Proc. Natl. Acad. Sci. U.S.A. 111, 2122–2127 (2014).
  • 11. Wu M. J., Andreasson J. O. L., Kladwang W., Greenleaf W., Das R., Automated design of diverse stand-alone riboswitches. ACS Synth. Biol. 8, 1838–1846 (2019).
  • 12. Zhou X., et al., Microfluidic PicoArray synthesis of oligodeoxynucleotides and simultaneous assembling of multiple DNA sequences. Nucleic Acids Res. 32, 5409–5417 (2004).
  • 13. LeProust E. M., et al., Synthesis of high-quality libraries of long (150mer) oligonucleotides by a novel depurination controlled process. Nucleic Acids Res. 38, 2522–2540 (2010).
  • 14. Lucks J. B., et al., Multiplexed RNA structure characterization with selective 2′-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq). Proc. Natl. Acad. Sci. U.S.A. 108, 11063–11068 (2011).
  • 15. Pitt J. N., Ferré-D’Amaré A. R., Rapid construction of empirical RNA fitness landscapes. Science 330, 376–379 (2010).
  • 16. Angenent-Mari N. M., Garruss A. S., Soenksen L. R., Church G., Collins J. J., A deep learning approach to programmable RNA switches. Nat. Commun. 11, 5057 (2020).
  • 17. Jumper J., et al., Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
  • 18. Baek M., et al., Accurate prediction of protein structures and interactions using a three-track neural network. Science 373, 871–876 (2021).
  • 19. Valeri J. A., et al., Sequence-to-function deep learning frameworks for engineered riboregulators. Nat. Commun. 11, 5058 (2020).
  • 20. Cooper S., et al., Predicting protein structures with a multiplayer online game. Nature 466, 756–760 (2010).
  • 21. Kim J. S., et al.; EyeWirers, Space-time wiring specificity supports direction selectivity in the retina. Nature 509, 331–336 (2014).
  • 22. Curtis V., “Who takes part in online citizen science?” in Online Citizen Science and the Widening of Academia: Distributed Engagement with Research and Knowledge Production, Curtis V., Ed. (Palgrave Studies in Alternative Education, Springer International Publishing, 2018), pp. 45–68.

