Author manuscript; available in PMC: 2013 Sep 27.
Published in final edited form as: Bioessays. 2012 Jul 16;34(9):718–720. doi: 10.1002/bies.201200072

Putting Proteins in Context: Scientific illustrations bring together information from diverse sources to provide an integrative view of the molecular biology of cells

David S Goodsell 1
PMCID: PMC3785232  NIHMSID: NIHMS508858  PMID: 22806423

As scientists, we often focus our attention on the details. For instance, in my graduate work with Richard Dickerson, I spent years synthesizing small pieces of DNA and trying to determine the structures of their complexes with small inhibitors. These days, I use the computer to try to predict these types of macromolecule-inhibitor interactions in HIV. It's fascinating work, and work that can fill an entire lifetime of study. But it's also disconnected from the things that got me excited about science: looking at pond water through microscopes, collecting native southern California insects, trying to find Neptune with the telescope, and all of the other fascinating things that call to a curious mind. In our day-to-day research, it's easy to lose touch with the larger scientific context.

Two decades ago, I began a project to reconnect with the biological context of my main love: molecular biology. I set a goal: to create an illustration of a 100 nm cube taken from a living cell, showing every molecule at the proper size, shape, concentration, and location. There isn't any way to look at this scale level experimentally. Microscopy doesn't have the resolution to see atomic structures in cells, and atomic methods like X-ray crystallography provide a view of the purified molecules, completely removed from their cellular context. I started with Escherichia coli, which was at the time (and still is today) the most data-rich organism. Remarkably, there was enough data from biochemistry, structural biology, and microscopy to support such an illustration [1] (Figure 1). I've spent the years since then creating other illustrations of cellular environments, attempting to integrate diverse data into a few images that show the cellular context of molecular biology [2-5].
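To give a feel for what "proper concentration" means at this scale, here is a minimal sketch of the unit conversion from a molar concentration to an expected number of molecules in a 100 nm cube. The function name and the example concentrations are illustrative assumptions, not values taken from the illustrations; only Avogadro's number and the cube size are fixed.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_in_cube(concentration_molar: float, edge_nm: float = 100.0) -> float:
    """Expected number of molecules in a cube with the given edge length.

    concentration_molar: concentration in mol/L (hypothetical input).
    edge_nm: cube edge in nanometers (100 nm matches the cube in the text).
    """
    edge_dm = edge_nm * 1e-8       # 1 nm = 1e-9 m = 1e-8 dm
    volume_liters = edge_dm ** 3   # 1 dm^3 is exactly 1 L
    return concentration_molar * volume_liters * AVOGADRO


if __name__ == "__main__":
    # Hypothetical concentrations chosen only to show the scale of the numbers.
    for label, conc in [("1 micromolar", 1e-6), ("1 millimolar", 1e-3)]:
        print(f"{label}: {molecules_in_cube(conc):.1f} molecules per cube")
```

A 100 nm cube holds about 1e-18 liters, so a 1 micromolar species averages well under one molecule per cube (about 0.6), while a 1 millimolar species contributes several hundred.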

Figure 1. Escherichia coli.

These two cross sections show a portion of an Escherichia coli bacterium, including all macromolecules. Small molecules like ATP, glucose, and water are omitted for clarity. The illustration on the left, from 1991, shows the nucleoid with a random arrangement of DNA helices, which causes many of them to be clipped in the cross section. The recent illustration on the right employs artistic license in the arrangement of DNA and membranes to minimize confusing artifacts. The coloring scheme is designed to highlight the functional compartments of the cell, with the cell wall and flagellar motor in green, soluble enzymes and other proteins in turquoise, ribosomes and tRNA in magenta, DNA in yellow, and proteins in the nucleoid in orange.

The process of researching and planning this type of integrative illustration is arguably the most exciting aspect of the task. As you might imagine, there are many decisions that need to be made. Many of these decisions are based firmly on data: for instance, in the Escherichia coli illustrations, the number and structure of ribosomes and the exact length and sequence of the genome are well determined. Other aspects, however, are more difficult to study, so an informed decision is necessary. How is the peptidoglycan arranged? (There are several competing models.) How compact are the proteins in the replisome? (I've shown them in a tight complex; other models have them loosely associated.) It's a humbling process, uncovering the many gray areas in our knowledge.

The Internet has revolutionized this process, allowing instant access to diverse forms of data. PubMed (http://www.ncbi.nlm.nih.gov/pubmed) allows fast access to the primary research reports, uncovering information at all levels. Of particular value are the many schematic illustrations sprinkled through biochemistry research reports, showing the details of the particular biomolecular interactions being studied. When I was researching the Escherichia coli images in Figure 1, these types of schematics were essential for sorting out the detailed arrangement of molecules in the cell wall and nucleoid. Interactome studies are attempting to do this in a more systematic way, but they don't yet provide the personal knowledge that is infused into the journal figures. Quite remarkably, finding structures for each of the components is the easiest part. The Protein Data Bank [6,7] allows instant access to tens of thousands of molecular structures, including nearly all of the central molecular machines found in cells. In cases where atomic structures are not available, structures from electron microscopy are available in the EMDataBank (http://www.emdatabank.org), or directly in published reports.

Creation of scientific illustrations such as these presents the artist with some engaging boundaries, boundaries that are not often present for other forms of art. Most importantly, a scientific illustration must be an accurate representation of the science. The term "accurate," however, is slippery, and in practice, scientific illustrators must employ all manner of approximations and artistic license to create a useful rendering [8]. For instance, cells contain many fibrous molecules, such as DNA and actin, and large planar membranes. A random cross-section through a cell would intersect with many of these elements, creating clipped views that would be visually confusing. The artist, however, can arrange the molecules in orientations that reflect their actual locations but minimize these types of visual artifacts. Compare the two illustrations in Figure 1. The small black-and-white illustration is one of the first that I created, and includes many clipped DNA strands. This gives the impression of many small, disconnected pieces. The more recent illustration in color artificially arranges all the DNA strands so that nothing is clipped, giving a better impression of a long, continuous DNA strand.

When I present these illustrations, both in educational settings and research settings, viewers are typically surprised by the density and complexity of the scenes. One of my major goals is to provide a bridge, linking molecular biology with cell biology. The illustrations are designed to remind us that molecules are performing their jobs in a complex environment, which has properties that often differ from those of purified molecules. The most obvious feature is the crowding, a property that has only recently been studied and appreciated in molecular biology laboratories [9]. For instance, the crowded environment affects the way proteins interact, favoring complexes relative to separate molecules. Proteins must also be highly specific in their interactions, to avoid unproductive interactions with their many neighbors [10]. The illustrations also make me think about the role of copy number. There are many ribosomes busily doing their jobs, but for a repressor, there may be only a small number. As described in a thought experiment by Peter Halling [11], looking across a population of cells, a molecule with low copy number may be much more prevalent in some cells, or even absent in others, as a result of statistical variation.
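The effect of copy number on cell-to-cell variation can be sketched with a simple calculation. The example below assumes that the number of copies per cell follows a Poisson distribution, which is a modeling assumption made here for illustration and not a claim about the thought experiment cited above; the mean copy numbers are likewise hypothetical.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k copies when the per-cell mean is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def fraction_of_cells_without(mean: float) -> float:
    """Fraction of cells expected to contain zero copies (Poisson assumption)."""
    return poisson_pmf(0, mean)


def relative_spread(mean: float) -> float:
    """Standard deviation divided by the mean; for a Poisson count, 1/sqrt(mean)."""
    return 1.0 / math.sqrt(mean)


if __name__ == "__main__":
    # Hypothetical mean copy numbers: a scarce repressor and an abundant machine.
    for label, mean in [("scarce molecule", 3.0), ("abundant molecule", 20000.0)]:
        print(
            f"{label}: mean={mean:g}, "
            f"cells with none={fraction_of_cells_without(mean):.3g}, "
            f"relative spread={relative_spread(mean):.3g}"
        )
```

Under this assumption, a molecule averaging three copies per cell would be entirely absent from roughly 5% of cells, while a molecule averaging thousands of copies is essentially never missing and varies by less than 1% from cell to cell.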

The process of creating this type of illustration forces us to look at the larger context of our subject, and to think more deeply about the many aspects that don't normally enter into our day-to-day research. When planning the illustrations for the second edition of The Machinery of Life [12], I had several epiphanies. First, I came to appreciate the role of unstructured proteins in cell function. The traditional view has proteins folding into perfect globular structures, but it is becoming apparent from many lines of research that proteins often act as unstructured chains [13]. Second, I came to appreciate the role of infrastructure in cell structure and function. Reading through the many papers on neuronal structure, I found that if a protein is in a particular place, there will be an infrastructure to hold it there, and a separate regulatory network to make sure it's put there at the proper time. Both of these features are included in the myelin illustration in Figure 2. Many unstructured chains form the infrastructure of the extracellular matrix, between the myelin membranes and in the axon cytoplasm, and flexible chains are important in the function of the voltage-gated channels. As for infrastructure, it can be argued that all of the molecules in this painting serve as infrastructure to ensure that the three membrane-bound proteins shown in yellow (an ion pump and two ion channels) are in the right place and environment to propagate a nerve signal.

Figure 2. Myelin Sheath.

This cross section shows a nerve axon at the bottom, surrounded by a Schwann cell, which wraps around the axon and forms multiple layers of insulation. The illustration highlights the many infrastructural proteins that support the complex interaction of these two cells.

It is my hope that these illustrations will inspire researchers and educators to create similar studies. This is a great way to organize what is known, and to identify areas that need more study. For instance, I've had the opportunity to work in this capacity with Dan Klionsky, a researcher in the field of autophagy. Over the past decade or so, we have worked together to create a series of illustrations encapsulating the data known at each point in time [14]. It has been an exciting project, watching the picture develop as more pieces of the puzzle are added each year.

Acknowledgements

This work was supported in part by the RCSB Protein Data Bank (NSF DBI 0829586). The author has no conflicts of interest in this work.

References

  • 1. Goodsell DS. Inside a living cell. Trends Biochem Sci. 1991;16(6):203–206. doi: 10.1016/0968-0004(91)90083-8.
  • 2. Goodsell DS. Neuromuscular synapse. Biochem Mol Biol Educ. 2009;37(4):204–210. doi: 10.1002/bmb.20297.
  • 3. Goodsell DS. Escherichia coli. Biochem Mol Biol Educ. 2009;37(6):325–332. doi: 10.1002/bmb.20345.
  • 4. Goodsell DS. Mitochondrion. Biochem Mol Biol Educ. 2010;38(3):134–140. doi: 10.1002/bmb.20406.
  • 5. Goodsell DS. Eukaryotic cell panorama. Biochem Mol Biol Educ. 2011;39(2):91–101. doi: 10.1002/bmb.20494.
  • 6. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. The Protein Data Bank. Nucleic Acids Res. 2000;28(1):235–242. doi: 10.1093/nar/28.1.235.
  • 7. Berman H, Henrick K, Nakamura H. Announcing the worldwide Protein Data Bank. Nat Struct Biol. 2003;10(12):980. doi: 10.1038/nsb1203-980.
  • 8. Goodsell DS, Johnson GT. Filling in the gaps: Artistic license in education and outreach. PLoS Biol. 2007;5(12):2759–2762. doi: 10.1371/journal.pbio.0050308.
  • 9. Ellis RJ. Macromolecular crowding: obvious but underappreciated. Trends Biochem Sci. 2001;26(10):597–604. doi: 10.1016/s0968-0004(01)01938-7.
  • 10. Sear RP. Specific protein-protein binding in many-component mixtures of proteins. Phys Biol. 2004;1(1–2):53–60. doi: 10.1088/1478-3967/1/2/001.
  • 11. Halling PJ. Do the laws of chemistry apply to living cells? Trends Biochem Sci. 1989;14(8):317–318. doi: 10.1016/0968-0004(89)90158-8.
  • 12. Goodsell DS. The Machinery of Life. New York: Springer; 2009.
  • 13. Wright PE, Dyson HJ. Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm. J Mol Biol. 1999;293(2):321–331. doi: 10.1006/jmbi.1999.3110.
  • 14. Goodsell DS, Klionsky DJ. Artophagy: the art of autophagy--the Cvt pathway. Autophagy. 2010;6(1):3–6. doi: 10.4161/auto.6.1.10812.
