The FASEB Journal. 2010 Sep;24(9):3341–3350. doi: 10.1096/fj.10-158782

Gene expression atlas for human embryogenesis

Hong Yi *,1, Lu Xue *,1, Ming-Xiong Guo *,1, Jian Ma *, Yan Zeng *, Wei Wang *, Jin-Yang Cai *, Hai-Ming Hu *, Hong-Bing Shu , Yun-Bo Shi ‡,2, Wen-Xin Li *,2
PMCID: PMC2923361  PMID: 20430792

Abstract

Human embryogenesis is believed to involve the complex yet coordinated development of different organs and tissues, mediated by changes in the spatiotemporal expression of many genes. Here, we report a genome-wide expression analysis during wk 4–9 of human embryogenesis, a critical period when most organs develop. About half of all human genes were expressed, and 18.6% of the expressed genes were significantly regulated during this important period. We further identified >5000 regulated genes, most of which previously were not known to be associated with animal development. Our study fills an important gap in mammalian developmental studies by identifying functional pathways involved in this critical but previously unstudied period. Our study also revealed that the genes involved here are distinct from those during early embryogenesis, which include three groups of maternal genes. Furthermore, we discovered that genes in a given developmental process are regulated coordinately. This led us to develop an easily searchable database of this entire collection of gene expression profiles, allowing for the identification of new genes important for a particular developmental process/pathway and for deducing the potential function of a novel gene. The validity of the predictions from the database was demonstrated with two examples through spatiotemporal analyses of two novel genes. Such a database should serve as a highly valuable resource for the molecular analysis of human development and pathogenesis. Yi, H., Xue, L., Guo, M.-X., Ma, J., Zeng, Y., Wang, W., Cai, J.-Y., Hu, H.-M., Shu, H.-B., Shi, Y.-B., Li, W.-X. Gene expression atlas for human embryogenesis.

Keywords: human embryonic development, gene regulation database, organogenesis, microarrays, maternal genes


After fertilization, the human embryo develops according to a precise genetic program. Although the morphological changes that occur during human embryonic development have been known for some time, to date, little is known about the underlying molecular mechanisms. Elucidation of the global patterns of gene expression and regulation is of fundamental importance for decoding the biological programs that control human embryogenesis (1).

With the advent of genomic sequencing and microarray analysis of gene expression, the major characteristics of the transcriptional programs that control the development of a number of model organisms have been determined (2,3,4,5). However, very few such studies have been performed in mammals, especially in humans. Recently, the transcriptomes of human oocytes and preimplantation embryos have been analyzed by using microarrays (6,7,8,9,10). In addition, gene expression profiles for some human organs have also been reported (11,12,13,14). However, these studies used either very early embryos (within the first 3 d) or organs very late in development. Thus, studies have failed to provide any information on the critical developmental period when the embryo switches from mainly rapid cell proliferation to the development of organs; i.e., the period from ∼20 to 60 d postfertilization, or approximately wk 4–9 (Carnegie stages 10–23) of human embryonic development (15).

We have now filled this important knowledge gap by performing gene expression profiling on human embryos during wk 4–9 of development with the most recent Affymetrix human GeneChip array, which contains 50,093 transcripts and includes 38,500 well-characterized human genes. Our analysis not only uncovered >5000, mostly novel, developmentally regulated genes but also revealed that maternal genes belonged to 3 groups, with many functioning only during the early cell proliferation phase (0–4 wk) of human development. More important, we discovered that genes with similar expression profiles function in the same biochemical and cellular processes. This finding allowed us to establish a searchable web-based database that could serve to identify new components of a biochemical process or cellular pathway during human development by virtue of their similar expression profiles or to deduce the likely function of a novel gene.

MATERIALS AND METHODS

Embryo collection

Human embryos were obtained after therapeutic termination of pregnancy induced by Mifepristone. Appropriate written consent was obtained from the patients, and approval was obtained from the Medical Ethics Committee of Zhongnan Hospital at Wuhan University, following national guidelines (16,17,18). Morphologically normal embryos were selected and staged according to the Carnegie stages of development (http://nmhm.washingtondc.museum/collections/hdac/index.htm) and the Chinese characteristics of embryonic development (n=3 for each week of wk 4–9 of embryonic development) (19). Each whole embryo was microdissected from an intact trophoblast and stored directly in liquid nitrogen or homogenized using Trizol solution and then stored at −80°C. Total RNA was isolated from each whole embryo individually. RT-PCR analysis was performed with a primer pair specific for each gene (Supplemental Table 1).

Microarray analysis

The transcriptional profile of each of the three RNA samples for wk 4–9 of human embryonic development was analyzed independently using the Human Genome U133 Plus 2.0 Array (Affymetrix, Santa Clara, CA, USA). The Affymetrix GeneChip system was used for hybridization, staining, scanning, and imaging of the arrays. Raw data were analyzed with the Affymetrix GeneChip Operating Software (GCOS 1.4) using the manufacturer’s default analysis settings and global scaling as the normalization method. The trimmed mean target intensity of each array was set arbitrarily to 250 after removing the highest 2% and lowest 2% of signals. Normalized data (also known as signal values) for all arrays have been deposited in the Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI; ref. 20), accessible through GEO Series accession number GSE15744 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE15744).
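The global scaling step described above can be illustrated with a minimal sketch. It assumes that scaling means multiplying every signal on an array by a single factor so that the 2%-trimmed mean equals the target of 250; the function names and the example intensities are hypothetical, and the actual GCOS implementation may differ in detail.

```python
def trimmed_mean(values, trim_fraction=0.02):
    """Mean after dropping the lowest and highest `trim_fraction` of values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # number of values dropped at each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def scale_to_target(signals, target=250.0, trim_fraction=0.02):
    """Multiply all signals by one factor so the trimmed mean equals `target`."""
    factor = target / trimmed_mean(signals, trim_fraction)
    return [s * factor for s in signals]


# Hypothetical raw intensities for one array: 98 typical probes and 2 outliers.
raw = [100.0] * 98 + [1.0, 50000.0]
scaled = scale_to_target(raw)
print(trimmed_mean(raw), trimmed_mean(scaled))
```

Because the outliers fall in the trimmed tails, they do not influence the scaling factor, which is the point of trimming before scaling arrays to a common target.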

One-way analysis of variance (ANOVA), with time as the independent factor, was performed with GeneSpring 7.31 (Agilent, Santa Clara, CA, USA) on the expression data for wk 4–9 to identify genes whose expression changed over time. Gene Ontology (GO) analysis was performed with GoMiner software (21). The GoMiner program assigns a functional category to each gene and then calculates the fold enrichment and two-sided P value (Fisher’s exact test) of the categories by comparing the input list of genes to all genes in each category in the database.
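As an illustration of the enrichment statistics named above, the following sketch computes the fold enrichment and a two-sided Fisher's exact P value for one GO category from the hypergeometric distribution. It is a textbook formulation, not GoMiner's code; the function names and the counts in the example are hypothetical.

```python
from math import comb


def fold_enrichment(k, n, K, N):
    """(fraction of the gene list in the category) / (fraction of the background)."""
    return (k / n) / (K / N)


def fisher_two_sided(k, n, K, N):
    """Two-sided Fisher's exact P value.

    k: genes in the input list that belong to the category
    n: size of the input list
    K: genes in the background that belong to the category
    N: size of the background
    """
    def pmf(x):
        return comb(K, x) * comb(N - K, n - x) / comb(N, n)

    lo, hi = max(0, n - (N - K)), min(n, K)
    p_obs = pmf(k)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    total = sum(pmf(x) for x in range(lo, hi + 1) if pmf(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts: 4 of 5 listed genes fall in a category of 5 among 20 genes.
print(fold_enrichment(4, 5, 5, 20), fisher_two_sided(4, 5, 5, 20))
```

The choice of background (all annotated genes versus only genes on the array) changes both numbers, so it should be stated whenever such values are reported.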

Probe labeling and nonradioactive in situ hybridization conditions were as described previously (22). Probe preparation and hybridization were performed using the DIG Random Labeling and Detection Kit (Booster Tech, Wuhan, China), with a sense probe as the control for specificity in each hybridization experiment.

RESULTS AND DISCUSSION

Microarray analysis discovered mostly new genes involved in human embryogenesis

Total RNA was isolated from 3 embryos individually for each week of wk 4–9 of human embryonic development. Each RNA sample was analyzed independently on a microarray. A transcript was scored as “detected” or “expressed” if significant signal was detected on 2 or 3 of the individual microarrays for the 3 independent samples for each time point (23). Overall, 28,761 of the 50,093 distinct transcripts on the array (57.4%) were detected in ≥1 of the 6 stages, whereas slightly <50% of the transcripts were detected in any given stage (Supplemental Table 2). Analysis of the microarray data by one-way analysis of variance (ANOVA) showed that the expression levels of 5358 (18.6%) of the 28,761 expressed transcripts changed significantly (P<0.05) during wk 4–9 of human development (Supplemental Table 3).
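The ANOVA design used here (6 weekly time points with 3 replicate embryos each) can be sketched as follows. This is the textbook one-way F statistic, not GeneSpring's implementation, whose variance model and any multiple-testing settings are not stated in the text. The expression values are hypothetical log-scale signals, and the critical value of about 3.11 for 5 and 12 degrees of freedom at P=0.05 is taken from standard F tables.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of replicate groups.

    Raises ZeroDivisionError if all replicates within every group are identical.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


F_CRIT_5_12 = 3.11  # approximate critical F value for df=(5, 12) at P=0.05

# Hypothetical gene rising from wk 4 to wk 9, three replicates per week.
rising = [[m - 0.1, m, m + 0.1] for m in (1.0, 2.0, 3.0, 4.0, 5.0, 6.0)]
# Hypothetical gene with the same replicate values at every week.
flat = [[5.0, 5.2, 4.8] for _ in range(6)]

f_rising, df1, df2 = one_way_anova_f(rising)
f_flat, _, _ = one_way_anova_f(flat)
print(f_rising > F_CRIT_5_12, f_flat > F_CRIT_5_12)
```

With 28,761 transcripts tested at P<0.05, a noticeable number of calls would be expected by chance alone, so a threshold of this kind is best read as a screening criterion.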

To validate the quality of our microarray data, we first analyzed the hybridization signals for all 5358 regulated genes (P<0.05) across all 18 arrays and found that the three independent samples for each time point clustered together (Supplemental Fig. 1), demonstrating the consistency of the embryo staging and microarray reproducibility. Then we performed RT-PCR on a number of genes whose microarray expression changed by 15-fold or more during wk 4–9. The results were all consistent with the microarray data (Fig. 1).

Figure 1.

RT-PCR analysis validated the findings from the microarrays. RT-PCR analysis was performed for 38 regulated genes. The results appear to the right of the microarray data for each gene. The color scale at top left indicates the relative levels of expression. Red and green indicate high and low expression levels, respectively. 18S rRNA was analyzed as a loading control.

Many genes have been classified previously as developmentally related genes in the Gene Ontology (GO) database (24) based on expression and functional studies in model organisms. Of the 5358 regulated transcripts, 4203 have assigned biological functions in the GO database. To determine how many of the genes regulated during wk 4–9 of human development were known to be involved in development, we used the GO database to identify any developmentally related genes among the 4203 genes. Surprisingly, only 1099 (26.1%) had been implicated previously in development (Supplemental Table 3). Thus, most of the genes that were regulated during this important period of human development represent novel development-related genes.

Regulation of various biological pathways correlates with embryonic organ development

To determine the gene regulation pathways that control wk 4–9 of human development, we grouped the 5358 regulated genes into 42 clusters according to their expression patterns by means of the self-organizing map (SOM) (25) (Supplemental Fig. 2). Figure 2 shows two examples of these clusters, with genes whose expression increased (Fig. 2A) or decreased (Fig. 2B) during this developmental period. The patterns of regulation of these 42 clusters likely reflect the potential roles of the genes during development. To investigate this possibility, we analyzed genes in three expression groups. First, we combined the 1053 genes in the clusters with profiles similar to that of cluster c28 (Fig. 2A, the “up” genes in Supplemental Table 3) and used the GO database to identify the biological categories that were statistically significantly enriched in this group of genes (21, 24). Of these 1053 genes, 835 had been annotated in GO. The most noticeable significantly enriched GO categories were multicellular organismal development/processes, cell adhesion, cell surface receptor-linked signaling, and cell–cell signaling (Supplemental Table 4). These correlated well with the differentiation and development of organs that take place during wk 4–9. In particular, wk 9 is the last Carnegie stage of embryonic development; from this point on, development is, in general, considered “fetal growth”. Thus, it is likely that this group of up-regulated genes, whose expression is highest in wk 8 and 9, participates in the ongoing growth and differentiation of the majority of organs around wk 9 and later. This finding is consistent with the presence of many genes encoding collagens, including COL3A1, a member of the collagen family known to be essential for normal collagen I fibrillogenesis in the cardiovascular system and other organs (26).
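To make the clustering step concrete, here is a minimal self-organizing map sketch. It is an assumption-laden illustration, not the software used in the study: the grid shape, learning schedule, per-gene standardization, and Euclidean distance are all choices made here for simplicity, and the six example profiles are hypothetical.

```python
import math
import random


def standardize(profile):
    """Scale one gene's profile to mean 0 and unit variance across time points."""
    mean = sum(profile) / len(profile)
    sd = math.sqrt(sum((x - mean) ** 2 for x in profile) / len(profile))
    return [(x - mean) / sd for x in profile] if sd > 0 else [0.0] * len(profile)


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_node(weights, profile):
    """Index of the map node whose weight vector is closest to the profile."""
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i], profile))


def train_som(profiles, rows, cols, epochs=50, seed=0):
    """Train a rows x cols SOM; return (node weights, node index per gene)."""
    rng = random.Random(seed)
    data = [standardize(p) for p in profiles]
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [list(rng.choice(data)) for _ in coords]  # initialize from data
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        rate = 0.5 * frac + 0.01                      # decaying learning rate
        radius = max(rows, cols) / 2.0 * frac + 0.5   # decaying neighborhood
        for p in rng.sample(data, len(data)):
            br, bc = coords[best_node(weights, p)]
            for i, (r, c) in enumerate(coords):
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * radius ** 2))
                weights[i] = [w + rate * h * (x - w) for w, x in zip(weights[i], p)]
    return weights, [best_node(weights, p) for p in data]


# Hypothetical wk 4-9 profiles: two rising, two falling, two arch-shaped genes.
profiles = [
    [1, 2, 3, 4, 5, 6], [2, 4, 6, 8, 10, 12],
    [6, 5, 4, 3, 2, 1], [12, 10, 8, 6, 4, 2],
    [1, 4, 6, 6, 4, 1], [2, 8, 12, 12, 8, 2],
]
weights, nodes = train_som(profiles, rows=1, cols=3)
print(nodes)
```

Because each profile is standardized, genes with the same shape but different magnitudes map to the same node. Whether different shapes end up on different nodes depends on the grid size, seed, and schedule, so a result like the 42 clusters reported here reflects the chosen map dimensions as well as the data.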

Figure 2.

SOM cluster analysis of the developmentally regulated genes. Of the 42 clusters, 2 are shown here as examples. The heading of each panel indicates the index of the cluster in the full SOM, followed by the number of genes in the cluster (Supplemental Fig. 1). A) Genes whose expression increased throughout the developmental period (cluster c28 with 103 genes). B) Genes whose expression decreased throughout the developmental period (c3 with 219 genes).

Genes in the second group are those in the clusters with peak levels of expression during wk 5–7 (arch-shaped expression pattern, e.g., cluster c15; Supplemental Fig. 2). Of the 997 genes in these clusters (Supplemental Table 3), 595 had been annotated in the GO database. These genes were highly enriched in GO categories associated with various metabolic processes and transcriptional regulation (data not shown) and are likely involved in the transition of the embryo mainly from cell proliferation in wk 4 to mostly organ development by wk 8–9 (see additional details in the next paragraph). In addition, a number of GO categories related to eye development were also enriched significantly among this group and the peak level expression of the genes correlated well with the formation of the lens and the development of retinal pigmentation during wk 5–7 of embryogenesis. These genes included MAB21L2, a vertebrate member of the Male-abnormal 21 family that is critical for eye morphogenesis (27); MITF, the microphthalmia-associated transcription factor that determines the retinal pigment epithelial identity (28); and SHH (sonic hedgehog), which is required for the normal laminar organization in the retina (29).

Finally, we combined 1618 genes that showed a profile of down-regulation similar to that of cluster c3 (Fig. 2B), of which 1183 had been annotated in GO (the “down” genes in Supplemental Table 3). The enriched GO categories included various metabolic processes, transcription and gene regulation, and the cell cycle (Supplemental Table 5). The down-regulation of metabolic and cell cycle genes suggested a reduction in cell proliferation, which was consistent with the fact that embryos switched from predominantly rapid growth due to cell division in wk 4 to extensive cell differentiation and organogenesis by wk 9. This switch might also explain the down-regulation of genes involved in transcriptional regulation, since the change from rapid cell proliferation to differentiation would require the down-regulation of transcription factors associated with cell proliferation and growth. Although many transcription factors involved in differentiation were likely up-regulated, different organs/tissues presumably utilized different transcription factors. The up-regulation of these organ/tissue-specific transcription factors would not be detected in whole-embryo RNA because the changes occurred only in a limited number of tissues/organs. Furthermore, it is important to note that the GO categories with a large number of up- and down-regulated genes did not overlap (compare the lists in Supplemental Tables 4 and 5), supporting the idea that the GO categories of the regulated genes reflect the developmental processes that occur during this period of embryogenesis.

Unique gene expression signatures underlie wk 4–9 of development and the anatomically distinct processes of early embryogenesis

The embryonic processes that occur between fertilization and gastrulation (wk 0–4) are anatomically distinct from the subsequent organogenesis and histogenesis (wk 4–9). However, the underlying molecular basis of this change remains elusive. To address this, we compared our data with transcriptome data of human oocytes and preimplantation embryos (7, 9). Venn diagrams showed that among the 5331 genes that are expressed at high levels in the oocyte, the expression of ∼15% (793 genes) (Fig. 3A) was regulated during wk 4–9 of embryogenesis. Similarly, among the 1461 transcripts that are highly expressed in the 3-d-old embryo, ∼24% (354 genes) (Fig. 3B) were regulated during wk 4–9 of embryogenesis. In addition, 136 genes were found to be highly expressed in both the oocyte and the 3-d-old embryo as well as significantly regulated during wk 4–9 of development (Fig. 3C). The expression levels of these common genes were highest during wk 4–5 and low by wk 9 (Fig. 3D–F). These genes were enriched for categories involved in metabolism, transcription, and the cell cycle (Supplemental Tables 6–8). These were the same categories that were enriched among genes down-regulated during wk 4–9, indicating that many of the genes involved in metabolism and the cell cycle were maternal genes. Thus, a number of maternally expressed genes are not only involved in early embryogenesis (e.g., the first few days) but are also required up to wk 4 of embryogenesis.

Figure 3.

Venn diagrams showing that a number of genes that are highly expressed in human oocytes and 3-d-old embryos are repressed during wk 4–9. A, B) Venn diagrams between genes whose expression changed significantly during wk 4–9 of human embryogenesis (HU EMR) and genes that were highly expressed in the human oocyte (HU OC+) (A) or in 3-d-old human embryos (HU 3d+) (B). C) Venn diagram showing genes regulated during wk 4–9 that were also highly expressed in the oocyte and 3-d-old embryo. D–F) The expression profiles of the common genes in A–C, respectively. Note that the expression of the majority of the genes was repressed by wk 9. Red and green indicate high and low expression levels, respectively. G, H) Venn diagrams between genes with no detectable expression during wk 4–9 and genes highly expressed in human oocyte (G) or 3-d-old embryos (H). I) Venn diagram showing genes with no detectable expression during wk 4–9 but high levels of expression in the oocyte and 3-d-old embryo.

In addition, we compared the genes that showed no detectable expression during wk 4–9 with those that are highly expressed in the oocyte or 3-d-old human embryo (7, 9). The results showed that, among the 5331 genes with high levels of expression in the oocyte, the expression of ∼27% (1424 genes) (Fig. 3G) could not be detected during wk 4–9 of embryogenesis. In contrast, among the 1461 transcripts that are highly expressed in the 3-d-old embryo, only 7% (98 genes) (Fig. 3H) showed no detectable expression during wk 4–9 of embryogenesis. Only 29 genes were expressed highly in both the oocyte and the 3-d-old embryo but not expressed during wk 4–9 (Fig. 3I). GO analysis of the 1424 genes that were highly expressed in the oocyte but absent by wk 4 of development revealed that these genes were enriched in categories related to the plasma membrane, metabolism, nucleotide binding, and reproduction (Supplemental Table 9). The down-regulation of these genes correlated well with the transition from the use of transcripts stored in the oocytes during early embryogenesis, when rapid cell proliferation requires large amounts of membrane components and nucleotides, to the use of zygotic transcripts by wk 4 of development. The fact that only 7% of the genes that are highly expressed in the 3-d-old embryo could not be detected by wk 4 suggests that many up-regulated genes during the first few days of development continue to function at least up to wk 4 of human embryogenesis.

Thus, we discovered that the 5331 highly expressed maternal transcripts can be classified into three groups: 1) those that are absent by wk 4; 2) those that are regulated during wk 4–9; and 3) those that are expressed constitutively during wk 4–9. The first two groups of maternal genes were highly enriched for genes involved in membrane utilization, metabolism, and cell cycle regulation, all of which are important for early embryogenesis, and were down-regulated subsequently. Genes in the third group were expressed constitutively, at least throughout wk 4–9, which suggested that their functions were not specific to any developmental processes, at least up to wk 9; i.e., they were likely to be housekeeping genes.
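The three-way grouping above amounts to simple set operations, sketched below with hypothetical gene identifiers. The function and variable names are illustrative; the sketch assumes that every regulated gene is also a detected gene, as follows from the way regulated transcripts were selected from the expressed ones.

```python
def classify_maternal(maternal, detected_wk4_9, regulated_wk4_9):
    """Split maternal genes into the three groups described in the text."""
    absent = maternal - detected_wk4_9                       # group 1
    regulated = maternal & regulated_wk4_9                   # group 2
    constitutive = (maternal & detected_wk4_9) - regulated_wk4_9  # group 3
    return {"absent": absent, "regulated": regulated, "constitutive": constitutive}


# Hypothetical identifiers, for illustration only.
maternal = {"g1", "g2", "g3", "g4", "g5"}
detected = {"g2", "g3", "g4", "g5", "g6"}
regulated = {"g2", "g3", "g6"}

groups = classify_maternal(maternal, detected, regulated)
print({name: sorted(genes) for name, genes in groups.items()})
```

If the three groups partition the 5331 maternal transcripts exactly, the counts given above (1424 absent and 793 regulated) would leave 5331 − 1424 − 793 = 3114 in the constitutive group; the text does not report this number, so it should be treated as a derived estimate.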

Coordinate regulation of genes involved in the same developmental processes

We next investigated how genes involved in a given biological/developmental process or related GO categories were regulated during wk 4–9 and whether their regulation correlates with their likely roles during development. One important change associated with the developmental transition from the early cell proliferation phase (wk 0–4) to organogenesis and histogenesis (wk 4–9) is that the number of stem cells or undifferentiated cells is reduced because more cells begin to differentiate into different organ/tissue-specific cell types. To determine how stem cell genes are regulated as development proceeds, we identified all the genes associated with stem cells (GenMaPP database; refs. 30, 31) and analyzed their patterns of expression. As shown in Fig. 4A, the vast majority of these genes were found to be highly expressed during wk 4 or 5 but were down-regulated by wk 9, consistent with their roles in stem cells.

Figure 4.

Coordinated regulation of genes involved in the same biological processes. A) Stem cell-specific genes among the 5358 regulated genes were gradually down-regulated during wk 4–9. B) Most of the genes that were specific to muscle, fat, and connective tissue were up-regulated during wk 4–9. C) The sequential but coordinated expression of specific genes correlated with the different stages of development of the nervous system. Note that genes in GO categories associated with different aspects of the development of the nervous system clustered together; some clusters were expressed early whereas others were expressed late in embryogenesis.

However, the vast majority of the genes likely involved in organ development, such as genes associated with muscle, fat, and connective tissues (Fig. 4B) or those associated with internal organs or blood and lymph tissues (Supplemental Fig. 3A, B), were found to be up-regulated coordinately, with the highest levels of expression occurring in wk 9. The up-regulation of these genes was consistent with their potential roles in organogenesis and histogenesis during this period. For example, the genes in this group include members of the keratin family; e.g., KRT4 and KRT5, which comprise the largest subgroup of intermediate filament (IF) proteins. The importance of these genes in organogenesis is demonstrated by the fact that mice that lack KRT5 or other IF proteins die shortly after birth (32, 33).

Next, we asked whether such correlations exist for a more complex process, the development of the nervous system. The early morphogenesis of the nervous system begins with the formation of neural plates by d 18 of embryogenesis, and this is followed by complex yet coordinated development during wk 4–9. However, little is known about the networks of gene regulation that control this process. We identified all genes in the GO categories related to the nervous system and found that of the 1588 nervous system-related genes present on the microarray, 710 of these genes were regulated during wk 4–9 of development. These genes belonged to many different clusters with distinct expression profiles (Fig. 4C). Notably, the expression patterns of these clusters correlated well with the sequential development of the nervous system. For example, the neural plates first develop into the neural fold and neural groove and subsequently fuse to form the neural tube, which is the rudiment of the central nervous system, by wk 4 of embryogenesis. Correspondingly, genes related to neural tube closure were expressed highly in wk 4 (Fig. 4C, part 4). Subsequently, growth of the neural tube results in an extensive brain region, which develops into the forebrain/midbrain/hindbrain in wk 5, and this is correlated with high levels of expression of the corresponding genes in wk 4–5 and their subsequent down-regulation (Fig. 4C, parts 5, 6). However, a number of genes were up-regulated by wk 8–9 and correlated with the maturation of neurons, development of the pallium, and neurite morphogenesis, etc. (Fig. 4C, parts 1–3). Thus, the different processes involved in the development of the nervous system are controlled by the coordinated expression of the genes required for these respective processes.

Predicting biological function through gene expression patterns

The above analysis suggests that genes that participate in the same biological processes tend to be regulated coordinately during wk 4–9 of development. This finding raises the possibility that we might be able to infer the biological functions/pathways in which any gene of interest is involved. This is important because the functions of many genes are unknown. Among the 5358 developmentally regulated genes identified here, 1218 genes, nearly 25%, have no assigned biological function in the GO database. In addition, for many genes that have been assigned to GO categories, knowledge of their involvement in specific biological processes is limited.

We first asked whether we could identify additional genes involved in a particular biological process from among those not assigned to GO categories. We used skeletal development as an example. During wk 4 of embryogenesis, somites increase in number as osteoblasts differentiate. During wk 5, upper- and lower-limb buds appear, followed by the segmentation of the trunk during wk 6. Subsequently, cartilages are replaced gradually by bones throughout the embryo. We searched our gene expression database to obtain the gene regulation profiles of all genes in the GO categories associated with skeletal development (ossification, skeletal morphogenesis, cartilage condensation, osteoblast differentiation, etc.). Of the 754 skeletal development-related genes present on the array, 148 genes were found to be regulated during wk 4–9 and belonged to a number of different expression clusters (Fig. 5A).

Figure 5.

Clustering of gene expression profiles identified novel genes potentially involved in skeletal development. A) 148 genes among the genes regulated during wk 4–9 were found to be in GO categories associated with skeletal development and were clustered based on their expression patterns. Four clusters indicated with colored bars were used for further analysis. B) The 148 genes were clustered with the 1218 genes whose GO categories were unknown, and the genes in the four color-coded clusters shown in A were identified in the combined clusters. The clusters that were highly enriched for the genes in the four color-coded clusters in A were identified and are shown in the four panels here. The genes indicated with a blue bar are genes of unknown GO categories, whereas those labeled with colored bars correspond to the genes in A. The red stars indicate 2 unknown genes, C2orf40 (chromosome 2 open reading frame 40) (top left panel) and C1orf61 (chromosome 1 open reading frame 61) (bottom left panel), that were used as examples in this study (Fig. 6 and panel C, respectively). C) In situ hybridization confirmed that C1orf61 was indeed involved in skeletal development. Cells with brown/yellow stains were considered positive. Note that C1orf61 was expressed at very low levels in wk 4 but was highly up-regulated in the skeletal system, including the vertebrae and limbs, by wk 7–9.

To determine whether any of the developmentally regulated genes with no known GO affiliations were involved in skeletal development, we combined these 148 skeletal genes with the 1218 developmentally regulated genes with no known GO categories and clustered them based on their expression profiles (Fig. 5B). We identified the clusters enriched with genes that belonged to the 4 clusters marked in Fig. 5A. As shown in Fig. 5B, the genes in the 4 color-coded clusters in Fig. 5A were found to be separate nonoverlapping clusters in Fig. 5B, and each cluster contained a number of genes with no assigned GO category (indicated with blue bars).

According to our hypothesis that genes with similar expression profiles could participate in the same biological processes, these unknown genes are likely to be involved in skeletal development as well. To test this possibility, we performed in situ hybridization analysis on some of the genes. The results for one of these genes, C1orf61 (chromosome 1 open reading frame 61), are shown in Fig. 5C. Consistent with the microarray data, the expression of the gene could not be detected in 4-wk-old embryos, but it was highly expressed by wk 7–9. Notably, it was expressed in the skeletal system, including the bones in the vertebrae and limbs (Fig. 5C), supporting a role in skeletal development.

Next, we asked whether we could predict the function of an unknown gene. We chose C2orf40 (chromosome 2 open reading frame 40) as an example. We obtained its expression pattern from our microarray database (Fig. 6A) and identified its location in the clusters of all developmentally regulated genes (Fig. 6B, left panel). We thus identified a small cluster of 59 genes with the most similar expression patterns (Fig. 6B, right panel). GO analysis of these genes revealed that they were enriched with genes involved in skeletal development, collagen, cartilage development, and the extracellular matrix. This finding suggests that C2orf40 is likely involved in skeletal development. It is worth noting that this gene was implicated independently in skeletal development during our above-mentioned search for novel skeletal genes (Fig. 5B, top panel, the gene indicated with a star).
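The idea of finding the genes whose profiles most resemble that of an uncharacterized gene can be sketched with a simple correlation ranking. Pearson correlation is an assumption made here for illustration; the text does not state which similarity measure underlies the clustering. All gene names and values below are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def most_similar(query, profiles, top_n=3):
    """Names of the `top_n` genes whose profiles correlate best with `query`."""
    target = profiles[query]
    scored = [(pearson(target, p), g) for g, p in profiles.items() if g != query]
    scored.sort(reverse=True)
    return [g for _, g in scored[:top_n]]


# Hypothetical wk 4-9 signal values for an unknown gene and four annotated genes.
profiles = {
    "unknown_gene": [1, 1, 2, 4, 6, 8],
    "skeletal_a": [2, 2, 3, 5, 7, 9],
    "skeletal_b": [1, 2, 3, 4, 5, 6],
    "cell_cycle_a": [9, 7, 5, 3, 2, 1],
    "flat_gene": [5, 5, 5, 5, 5, 5],
}
print(most_similar("unknown_gene", profiles, top_n=2))
```

The annotations of the top-ranked neighbors can then be passed to a GO enrichment test, which is the reasoning applied to the 59-gene cluster around C2orf40. Such a ranking suggests a candidate function; it does not establish one, which is why the prediction was followed up by in situ hybridization.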

Figure 6.

Gene expression pattern searches revealed that C2orf40 was involved in skeletal development. A) The C2orf40 expression profile as obtained from our expression database. B) Approximately 500 genes clustered close to C2orf40 (left panel), revealing the existence of different expression patterns. The cluster of genes with expression patterns most similar to that of C2orf40, indicated by the purple bar on the right, was expanded (right panel, with the location of C2orf40 indicated). Some significant GO categories associated with these clusters are shown on the right; they are all associated with skeletal development. C) In situ hybridization showed that C2orf40 was up-regulated in the vertebrae by wk 7–9. Cells with brown/yellow stains were considered positive. Note that C2orf40 had little expression in the limbs, unlike C1orf61.

We then performed in situ hybridization to determine the spatial and temporal expression profile of this gene. Again, consistent with our microarray database, C2orf40 showed little expression in wk 4 but was up-regulated in the skeletal system by wk 7–9. In particular, it was highly expressed in the vertebrae by wk 7. Interestingly, unlike the other newly discovered skeletal gene C1orf61, C2orf40 was expressed at very low levels in the limbs. Thus, C2orf40 is likely to be involved in skeletal development but with a distinct role from that of C1orf61. In agreement with this, C1orf61 and C2orf40 were in two different gene expression clusters within the genes implicated in skeletal development (Fig. 5B), which provided further evidence that genes with similar expression profiles tend to be involved in the same biological processes.

A searchable database for predicting gene function in development

The above findings prompted us to set up a gene expression matrix of 54,614 transcripts that is fully accessible to the scientific community via a dedicated web-based searchable format, with an easily navigated graphical interface (http://vmolab.whu.edu.cn:8080/HumanGeneSearch/). Some of the important functions of this database include the ability to: 1) search for expression regulation patterns of any human genes during development; 2) examine how the genes associated with particular biological processes such as skeletal development are regulated; 3) identify, among genes whose function is currently unknown, additional genes that are likely involved in a particular biological process; and 4) deduce the developmental function of a novel gene of interest based on its regulation profile.

CONCLUSIONS

Despite the obvious importance of determining global gene expression profiles during mammalian embryogenesis, few such studies have been carried out, largely because of the difficulty of obtaining sufficient material, even for model organisms such as the mouse. The few published reports on human development have been limited to oocytes or very early embryos cultured in vitro. Our study is the first comprehensive analysis of global expression patterns during wk 4–9 of human embryogenesis and has revealed that most of the genes regulated during this period were not previously known to be involved in development. Our global analysis identified many biological pathways that are important for the transition from cell proliferation to organogenesis during embryogenesis. In addition, many of the genes that are regulated during wk 4–9 are known to be associated with different diseases (data not shown), which highlights the importance of this developmental period. Furthermore, by comparing our results with published data, we identified three groups of maternal genes with distinct regulation profiles during wk 4–9 of development, which provides a molecular basis for the anatomically distinct developmental events that occur between early embryogenesis and subsequent organogenesis and histogenesis.

More importantly, our genome-wide and detailed developmental profiling of gene expression patterns has revealed that genes involved in the same biological process are often regulated coordinately. This finding led us to hypothesize that the expression pattern of a gene, if known in sufficient detail as in the present study, might be used to predict a role for that gene in a developmental process. Indeed, as our analyses have shown, our expression database can be used to identify previously unrecognized genes that might be involved in a particular biological process and, conversely, to predict the biological processes in which a novel gene may participate. Thus, our database and web site should serve as a critical resource for research aimed at understanding the genetic networks that control mammalian development, especially human embryogenesis.

Supplementary Material

Supplemental Data

Acknowledgments

This work was supported by the National High Technology Research and Development Program of China (863 Program) grant 2006AA02A306, National Natural Science Foundation of China grant 30871245, and Program of Introducing Talents of Discipline to Universities grant B06018. Y.-B.S. was supported by the National Institute of Child Health and Human Development Intramural Research Program, National Institutes of Health.

References

  1. Lindsay S, Copp A J. MRC-Wellcome Trust Human Developmental Biology Resource: enabling studies of human developmental gene expression. Trends Genet. 2005;21:586–590. doi: 10.1016/j.tig.2005.08.011.
  2. Arbeitman M N, Furlong E E, Imam F, Johnson E, Null B H, Baker B S, Krasnow M A, Scott M P, Davis R W, White K P. Gene expression during the life cycle of Drosophila melanogaster. Science. 2002;297:2270–2275. doi: 10.1126/science.1072152.
  3. Hill A A, Hunter C P, Tsung B T, Tucker-Kellogg G, Brown E L. Genomic analysis of gene expression in C. elegans. Science. 2000;290:809–812. doi: 10.1126/science.290.5492.809.
  4. White K P, Rifkin S A, Hurban P, Hogness D S. Microarray analysis of Drosophila development during metamorphosis. Science. 1999;286:2179–2184. doi: 10.1126/science.286.5447.2179.
  5. Zhang W, Morris Q D, Chang R, Shai O, Bakowski M A, Mitsakakis N, Mohammad N, Robinson M D, Zirngibl R, Somogyi E, Laurin N, Eftekharpour E, Sat E, Grigull J, Pan Q, Peng W T, Krogan N, Greenblatt J, Fehlings M, van der Kooy D, Aubin J, Bruneau B G, Rossant J, Blencowe B J, Frey B J, Hughes T R. The functional landscape of mouse gene expression. J Biol. 2004;3:21. doi: 10.1186/jbiol16.
  6. Li S S, Liu Y H, Tseng C N, Singh S. Analysis of gene expression in single human oocytes and preimplantation embryos. Biochem Biophys Res Commun. 2006;340:48–53. doi: 10.1016/j.bbrc.2005.11.149.
  7. Kocabas A M, Crosby J, Ross P J, Otu H H, Beyhan Z, Can H, Tam W L, Rosa G J, Halgren R G, Lim B, Fernandez E, Cibelli J B. The transcriptome of human oocytes. Proc Natl Acad Sci U S A. 2006;103:14027–14032. doi: 10.1073/pnas.0603227103.
  8. Assou S, Anahory T, Pantesco V, Le Carrour T, Pellestor F, Klein B, Reyftmann L, Dechaud H, De Vos J, Hamamah S. The human cumulus–oocyte complex gene-expression profile. Hum Reprod. 2006;21:1705–1719. doi: 10.1093/humrep/del065.
  9. Dobson A T, Raja R, Abeyta M J, Taylor T, Shen S, Haqq C, Pera R A. The unique transcriptome through day 3 of human preimplantation development. Hum Mol Genet. 2004;13:1461–1470. doi: 10.1093/hmg/ddh157.
  10. Bermudez M G, Wells D, Malter H, Munne S, Cohen J, Steuerwald N M. Expression profiles of individual human oocytes using microarray technology. Reprod Biomed Online. 2004;8:325–337. doi: 10.1016/s1472-6483(10)60913-3.
  11. Sarma S, Kerwin J, Puelles L, Scott M, Strachan T, Feng G, Sharpe J, Davidson D, Baldock R, Lindsay S. 3D modelling, gene expression mapping and post-mapping image analysis in the developing human brain. Brain Res Bull. 2005;66:449–453. doi: 10.1016/j.brainresbull.2005.05.022.
  12. Cai J, Ash D, Kotch L E, Jabs E W, Attie-Bitach T, Auge J, Mattei G, Etchevers H, Vekemans M, Korshunova Y, Tidwell R, Messina D N, Winston J B, Lovett M. Gene expression in pharyngeal arch 1 during human embryonic development. Hum Mol Genet. 2005;14:903–912. doi: 10.1093/hmg/ddi083.
  13. Piper K, Brickwood S, Turnpenny L W, Cameron I T, Ball S G, Wilson D I, Hanley N A. Beta cell differentiation during early human pancreas development. J Endocrinol. 2004;181:11–23. doi: 10.1677/joe.0.1810011.
  14. Gaskell T L, Esnal A, Robinson L L, Anderson R A, Saunders P T. Immunohistochemical profiling of germ cells within the human fetal testis: identification of three subpopulations. Biol Reprod. 2004;71:2012–2021. doi: 10.1095/biolreprod.104.028381.
  15. Hill M A. Early human development. Clin Obstet Gynecol. 2007;50:2–9. doi: 10.1097/GRF.0b013e31802f119d.
  16. Gou D M, Chow L M, Chen N Q, Jiang D H, Li W X. Construction and characterization of a cDNA library from 4-week-old human embryo. Gene. 2001;278:141–147. doi: 10.1016/s0378-1119(01)00701-6.
  17. Gou D M, Sun Y, Gao L, Chow L M, Huang J, Feng Y D, Jiang D H, Li W X. Cloning and characterization of a novel Kruppel-like zinc finger gene, ZNF268, expressed in early human embryo. Biochim Biophys Acta. 2001;1518:306–310. doi: 10.1016/s0167-4781(01)00194-4.
  18. Guo M X, Wang D, Shao H J, Qiu H L, Xue L, Zhao Z Z, Zhu C G, Shi Y B, Li W X. Transcription of human zinc finger ZNF268 gene requires an intragenic promoter element. J Biol Chem. 2006;281:24623–24636. doi: 10.1074/jbc.M602753200.
  19. Liu B, Gao Y M. Human Embryology. Beijing, P.R. China: People Health Press; 1996 (in Chinese).
  20. Edgar R, Domrachev M, Lash A E. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 2002;30:207–210. doi: 10.1093/nar/30.1.207.
  21. Zeeberg B R, Feng W, Wang G, Wang M D, Fojo A T, Sunshine M, Narasimhan S, Kane D W, Reinhold W C, Lababidi S, Bussey K J, Riss J, Barrett J C, Weinstein J N. GoMiner: a resource for biological interpretation of genomic and proteomic data. Genome Biol. 2003;4:R28. doi: 10.1186/gb-2003-4-4-r28.
  22. Darby I A. In situ hybridization protocols. Totowa, NJ, USA: Humana Press; 2000.
  23. Lee M L, Kuo F C, Whitmore G A, Sklar J. Importance of replication in microarray gene expression studies: statistical methods and evidence from repetitive cDNA hybridizations. Proc Natl Acad Sci U S A. 2000;97:9834–9839. doi: 10.1073/pnas.97.18.9834.
  24. Ashburner M, Ball C A, Blake J A, Botstein D, Butler H, Cherry J M, Davis A P, Dolinski K, Dwight S S, Eppig J T, Harris M A, Hill D P, Issel-Tarver L, Kasarskis A, Lewis S, Matese J C, Richardson J E, Ringwald M, Rubin G M, Sherlock G. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000;25:25–29. doi: 10.1038/75556.
  25. Tamayo P, Slonim D, Mesirov J, Zhu Q, Kitareewan S, Dmitrovsky E, Lander E S, Golub T R. Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation. Proc Natl Acad Sci U S A. 1999;96:2907–2912. doi: 10.1073/pnas.96.6.2907.
  26. Liu X, Wu H, Byrne M, Krane S, Jaenisch R. Type III collagen is crucial for collagen I fibrillogenesis and for normal cardiovascular development. Proc Natl Acad Sci U S A. 1997;94:1852–1856. doi: 10.1073/pnas.94.5.1852.
  27. Kennedy B N, Stearns G W, Smyth V A, Ramamurthy V, van Eeden F, Ankoudinova I, Raible D, Hurley J B, Brockerhoff S E. Zebrafish rx3 and mab21l2 are required during eye morphogenesis. Dev Biol. 2004;270:336–349. doi: 10.1016/j.ydbio.2004.02.026.
  28. Horsford D J, Nguyen M T, Sellar G C, Kothary R, Arnheiter H, McInnes R R. Chx10 repression of Mitf is required for the maintenance of mammalian neuroretinal identity. Development. 2005;132:177–187. doi: 10.1242/dev.01571.
  29. Wang Y P, Dakubo G, Howley P, Campsall K D, Mazarolle C J, Shiga S A, Lewis P M, McMahon A P, Wallace V A. Development of normal retinal organization depends on Sonic hedgehog signaling from ganglion cells. Nat Neurosci. 2002;5:831–832. doi: 10.1038/nn911.
  30. Salomonis N, Hanspers K, Zambon A C, Vranizan K, Lawlor S C, Dahlquist K D, Doniger S W, Stuart J, Conklin B R, Pico A R. GenMAPP 2: new features and resources for pathway analysis. BMC Bioinformatics. 2007;8:217. doi: 10.1186/1471-2105-8-217.
  31. Schug J, Schuller W P, Kappen C, Salbaum J M, Bucan M, Stoeckert C J., Jr Promoter features related to tissue specificity as measured by Shannon entropy. Genome Biol. 2005;6:R33. doi: 10.1186/gb-2005-6-4-r33.
  32. Gu L H, Coulombe P A. Keratin function in skin epithelia: a broadening palette with surprising shades. Curr Opin Cell Biol. 2007;19:13–23. doi: 10.1016/j.ceb.2006.12.007.
  33. Coulombe P A, Omary M B. ‘Hard’ and ‘soft’ principles defining the structure, function and regulation of keratin intermediate filaments. Curr Opin Cell Biol. 2002;14:110–122. doi: 10.1016/s0955-0674(01)00301-5.

Articles from The FASEB Journal are provided here courtesy of The Federation of American Societies for Experimental Biology