Global variability analysis of mRNA and protein concentrations across and within human tissues

Christine Wegler; Magnus Ölander; Jacek R Wiśniewski; Patrik Lundquist; Katharina Zettl; Anders Åsberg; Jøran Hjelmesæth; Tommy B Andersson; Per Artursson

doi:10.1093/nargab/lqz010

2019 Oct 30;2(1):lqz010. doi: 10.1093/nargab/lqz010

PMCID: PMC7671341 PMID: 33575562

Abstract

Genes and proteins show variable expression patterns throughout the human body. However, it is not clear whether relative differences in mRNA concentrations are retained on the protein level. Furthermore, inter-individual protein concentration variability within single tissue types has not been comprehensively explored. Here, we used the Gini index for in-depth concentration variability analysis of publicly available transcriptomics and proteomics data, and of an in-house proteomics dataset of human liver and jejunum from 38 donors. We found that the transfer of concentration variability from mRNA to protein is limited, that established ‘reference genes’ for data normalization vary markedly at the protein level, that protein concentrations cover a wide variability spectrum within single tissue types, and that concentration variability analysis can be a convenient starting point for identifying disease-associated proteins and novel biomarkers. Our results emphasize the importance of considering individual concentration levels, as opposed to population averages, for personalized systems biology analysis.

INTRODUCTION

Modern transcriptomics and proteomics technologies have enabled the comprehensive mapping of human mRNA and protein levels in different cells and tissues (1,2). It has been shown that some genes and proteins are expressed in tissue-specific patterns, while others are ubiquitously expressed throughout the body (3–7). However, correlations between mRNA and protein concentrations are typically poor (8,9). Further, it is unclear to what extent the relative concentration differences on the mRNA level are transferred to the protein level (10).

Notable inter-individual differences have been observed within single tissue types on the mRNA level (11,12), but this has not been comprehensively investigated on the protein level. Due to the time-consuming nature of mass spectrometry-based proteomics analysis, large-scale proteomics studies of multiple tissues often use samples from few or sometimes pooled donors (5–7). Further, studies of single tissue types commonly use concentration levels from multiple donors as biological replicates, with little emphasis on inter-individual variability (13,14). However, by using few or average protein concentration levels, the outcome of holistic systems biology analysis would not necessarily reflect inter-individual differences. In-depth knowledge of protein concentration variability in single tissue types would enable systems biology to deliver results that better represent the entire population, thereby advancing the field of personalized medicine (15,16).

We reasoned that in-depth concentration variability analysis could be used to address these two knowledge gaps: the transfer of relative concentration differences from mRNA to protein and the inter-individual protein concentration variability in single tissue types. For this purpose, we first used publicly available transcriptomics and proteomics data from paired human tissues, and then generated a unique proteomic dataset of liver and jejunum samples from 38 human donors. Our results show that mRNA variability is poorly reflected at the protein level, and that ‘reference genes’ proposed for normalization of ‘omics’ data, with low concentration variability on the mRNA level, had higher variability on the protein level across different tissues and within multiple samples of the same tissue type. Thus, proper normalization would necessitate specific reference genes for different types of samples and data, which would not be practical in proteomics. Furthermore, we show that proteins with essential cellular functions have low protein concentration variability within single tissue types. On the other hand, many proteins vary substantially between donors and we noted a high variability in proteins related to disease, suggesting that protein concentration variability analysis can be used as a starting point for the identification of disease-associated proteins. We also demonstrate that many proteins show large inter-individual concentration variability, with implications for personalized systems pharmacology.

MATERIALS AND METHODS

Samples for proteomics

Human liver and jejunum samples were obtained from 38 donors who had provided informed consent as part of the COCKTAIL study (17). Samples were collected from patients undergoing bariatric surgery preceded by a 3-week fasting period. The patients were between 23 and 63 years old, with BMI values ranging between 30 and 63 kg/m².

Data origin

For concentration variability analysis across human tissues (Supplementary Data S1), we used mRNA and protein concentrations from 29 paired human tissue types from previously published transcriptomics and proteomics datasets (7). For within-tissue variability analysis on the mRNA level, we used mRNA concentrations from liver (175 donors) and small intestine (137 donors) obtained from the GTEx project (18).

Within-tissue variability analysis on the protein level was performed using proteomics data of the 38 human liver and jejunum samples obtained here. For the proteomics analysis, proteins were solubilized with 2% SDS and processed with the multi-enzyme digestion filter-aided sample preparation (MED-FASP) protocol, using LysC and trypsin (19). Peptides were separated on a C₁₈ column (50 cm and 75 μm inner diameter) using a 2-h acetonitrile gradient in 0.1% formic acid at a flow rate of 300 nl/min. The LC was coupled to a Q Exactive HF or Q Exactive HF-X mass spectrometer (Thermo Fisher Scientific) operating in data-dependent mode with survey scans at a resolution of 60 000, AGC target of 3 × 10⁶ and maximum injection time of 20 ms. The top 15 most abundant isotope patterns were selected from the survey scan with an isolation window of 1.4 m/z and fragmented with nCE at 27. The MS/MS analysis was performed with a resolution of 15 000, AGC target of 1 × 10⁵ and maximum injection time of 60 ms. The resulting MS data were analyzed with MaxQuant (version 1.6.0.16) (20), where proteins were identified by searching peptides against a decoy version of the UniProtKB (May 2013). Carbamidomethylation was set as a fixed modification, and protein false discovery rates were specified as 0.01, allowing a maximum of two missed cleavages. Spectral raw intensities were normalized with variance stabilization (vsn) (21). Protein concentrations were calculated with the Total Protein Approach (22).
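As an illustration of the last step, the Total Protein Approach is commonly written as the protein's share of the total MS signal divided by its molecular weight. The sketch below assumes that form; the function name, the input dictionaries and the example intensities are hypothetical, and the vsn normalization step is not reproduced.

```python
def tpa_concentrations(intensities, mol_weights):
    """Total Protein Approach sketch (assumed form, not the authors' script).

    intensities: protein ID -> summed MS intensity in one sample
    mol_weights: protein ID -> molecular weight in g/mol
    Returns protein ID -> concentration in pmol per mg total protein.
    """
    total_signal = sum(intensities.values())
    concentrations = {}
    for protein_id, intensity in intensities.items():
        # Fraction of total signal divided by MW gives mol per g total protein.
        mol_per_g = intensity / (total_signal * mol_weights[protein_id])
        # 1 mol/g equals 1e9 pmol/mg.
        concentrations[protein_id] = mol_per_g * 1e9
    return concentrations


# Hypothetical two-protein sample, for illustration only.
example = tpa_concentrations(
    {"A": 2e9, "B": 8e9},
    {"A": 50000.0, "B": 100000.0},
)
```

Because the calculation uses only the signal fractions within a sample, it needs no spiked standards, which is convenient when many donor samples are compared.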

Calculation of concentration variability

Concentration variability analysis was performed with the Gini index, using a modified version of the ineq R package, version 0.2-13 (https://CRAN.R-project.org/package=ineq). More specifically, Gini indices were calculated for each mRNA or protein based on the mRNA or protein concentration levels in the 29 tissues (for the across-tissue analysis) or based on the protein concentration levels in the 38 human liver or jejunum samples (for the within-tissue analysis). Thus, mRNAs or proteins with low Gini indices (close to zero) had similar concentration levels across the 29 tissues or across the 38 donors in the across-tissue and within-tissue analysis, respectively.
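For orientation, a minimal Python sketch of the standard uncorrected sample Gini index is given here. It is not the modified ineq code used in this study (the modifications are not described in this section); the function name and the two example profiles are illustrative assumptions.

```python
def gini(values):
    """Uncorrected sample Gini index of non-negative values.

    0 means identical levels in all samples; values approach 1 when
    almost all of the total is found in a single sample.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive concentration")
    # G = sum_i (2i - n - 1) * x_i / (n * sum(x)), i = 1..n over sorted x
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


# Hypothetical across-tissue profiles for one gene product (29 tissues).
ubiquitous = gini([5.0] * 29)               # same level everywhere
tissue_specific = gini([0.0] * 28 + [5.0])  # present in one tissue only
```

With this uncorrected form, the maximum for n samples is (n − 1)/n, so a gene product found in only one of 29 tissues scores 28/29 rather than exactly 1.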

Statistical analysis

Statistical analyses were performed using GraphPad Prism, version 7.03, and partial least squares (PLS) modeling using SIMCA, version 15.0.0.4783. Functional annotation clustering of GOBP and KEGG terms was performed with DAVID, version 6.8 (23), where clusters were considered significant above an enrichment score of 1.3 (corresponding to a P-value < 0.05). Statistical details are provided at relevant places in the ‘Results’ section and figure legends. For the analysis of the proteomics datasets, the first protein ID was selected when multiple protein IDs were separated by ‘;’. This enabled matching between transcriptomics and proteomics datasets.
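Two conventions in this paragraph can be sketched briefly in Python. The helper names are hypothetical, and the enrichment score is assumed here to be the negative log10 of the geometric mean of the member P-values in a cluster, which is the reading under which a score of 1.3 corresponds to P = 0.05.

```python
import math


def first_protein_id(id_field):
    """Keep the first entry of a ';'-separated protein ID field."""
    return id_field.split(";")[0].strip()


def enrichment_score(p_values):
    """-log10 of the geometric mean of P-values (assumed definition)."""
    return -sum(math.log10(p) for p in p_values) / len(p_values)


# Hypothetical inputs, for illustration only.
matched_id = first_protein_id("P12345;Q67890")
threshold_score = enrichment_score([0.05])  # about 1.3
```

Keeping only the first ID is a simple matching rule; it discards the alternative group members, so a gene listed later in a protein group would not be matched to the transcriptomics data.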

RESULTS

Variability in mRNA and protein concentration levels across different tissue types

The Gini index is a measure of variability in a dataset (24), and was recently introduced for describing the variability of mRNA levels across different tissue types (25). The Gini index ranges from 0 to 1, where lower values indicate similar concentration levels across samples. Here, we extended the variability analysis to the closer-to-phenotype protein level to investigate whether the concentration variability was similar on the mRNA and protein levels. For this purpose, we calculated Gini indices (Supplementary Data S1) using mRNA and protein concentrations in 29 paired human tissue types from a previously published dataset (7). To ensure reliable Gini index calculations, we only included mRNAs and proteins that were detected in at least 15 tissues. The frequency distribution of Gini indices from the proteomics data was shifted toward higher values compared to the transcriptomics data, reflecting an overall higher variability at the protein level. The distribution of protein concentration variability across tissues peaked at a Gini index of 0.50, while the distribution of mRNA concentration variability peaked at 0.18 (bin width 0.02; Figure 1A and Supplementary Data S1). Gini indices were negatively correlated with median concentration levels across the 29 tissues at the protein level (r = −0.51; Supplementary Figure S1). However, this was only a general trend, and many highly abundant proteins still had high Gini indices (and vice versa). On the other hand, no such correlation was observed at the mRNA level (r = −0.03). Gini indices calculated from the transcriptomics data showed some discrepancies with the corresponding values from the proteomics data (r = 0.50, Figure 1B). In addition, there was very little overlap among the 100 and 500 least variable gene products between the transcriptomics and proteomics datasets (<14%; Figure 1C and D). 
The fact that low variability at the mRNA level across different tissues does not necessarily result in low variability at the protein level is likely due to translational or post-translational regulatory mechanisms, and presumably reflects tissue-specific phenotypes (26).
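The two summary statistics used in this comparison, the histogram mode and the overlap of the least variable gene products, can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the function names and toy Gini values are hypothetical, the mode is reported as the centre of the fullest bin, and ties are broken by sort order.

```python
from collections import Counter


def histogram_mode(gini_values, bin_width=0.02):
    """Centre of the most populated bin of a Gini index histogram."""
    bins = Counter(int(g / bin_width) for g in gini_values)
    fullest_bin = max(bins, key=bins.get)
    return (fullest_bin + 0.5) * bin_width


def least_variable(gini_by_gene, n):
    """The n gene products with the lowest Gini indices."""
    return set(sorted(gini_by_gene, key=gini_by_gene.get)[:n])


def overlap_fraction(gini_mrna, gini_protein, n):
    """Fraction of the n least variable mRNAs that are also among
    the n least variable proteins."""
    shared = least_variable(gini_mrna, n) & least_variable(gini_protein, n)
    return len(shared) / n


# Toy values, for illustration only.
mrna = {"A": 0.1, "B": 0.2, "C": 0.3, "D": 0.4}
protein = {"A": 0.5, "B": 0.1, "C": 0.2, "D": 0.6}
shared_fraction = overlap_fraction(mrna, protein, 2)  # only "B" is shared
```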

Figure 1. — Variability in mRNA and protein concentration levels across different tissue types. (A) Concentration variability distributions of matching mRNAs and proteins (n = 8828) across 29 paired human tissue types, based on previously published transcriptomics and proteomics data (7). Numbers in figure denote mode (bin width: 0.02). (B) Correlation of mRNA and protein concentration variability (n = 8828), using transcriptomics and proteomics data from the 29 tissue types (7). (C) Overlap of the 100 least variable mRNAs and proteins across tissues (7). (D) Overlap of the 500 least variable mRNAs and proteins across tissues (7); r = Pearson's correlation coefficient.

Reference gene variability across different tissue types

Historically, ubiquitously expressed gene products have been used as references for e.g. western blots (27) and qPCR (28). More recently, the use of reference genes has also been proposed to improve normalization of more complex omics data (29,30). Large-scale transcriptomics studies have provided lists of genes with low mRNA concentration variability as potential references (25,30). We investigated the concentration variability of mRNAs from these proposed genes, as well as from some traditionally used reference genes (25), across different tissue types in the transcriptomics and proteomics datasets. Unsurprisingly, the majority of the newly proposed references (23 of 33) were among the 10% least variable mRNAs in the transcriptomics dataset (Figure 2A and C). On the other hand, the traditionally used references (e.g. GAPDH and ACTB) generally showed much higher variability, possibly due to the diverse reasons for their selection.

Figure 2. — Reference gene variability in transcriptomics and proteomics data. (A and B) Gini indices of mRNAs and proteins in transcriptomics and proteomics data across 29 tissue types (7). Previously proposed reference genes are highlighted. (C) Normalized ranks of reference genes on the mRNA and protein levels across tissues, as well as across donors in transcriptomics data from 175 liver and 137 small intestine samples (18) and in proteomics data of liver and jejunum from 38 donors. Ranks were normalized by the number of entries in each separate dataset, and sorted by the rank in the transcriptomics data across tissues.

To assess the utility of references selected from transcriptomics data for normalization at the protein level, we studied the variability of the 33 proposed references (25,30) in the proteomics dataset across multiple tissue types. We found that protein concentrations of the proposed references were highly variable across the different tissue types, with Gini indices covering ∼75% of the entire variability spectrum (Figure 2B and C). This high variability on the protein level questions the suitability of selecting reference genes for proteomics based on transcriptomics information.

Inter-individual reference gene variability in single tissue types

Differences in protein concentrations across different tissue types, resulting from tissue specialization, might preclude the universal application of transcriptomics-based reference genes. However, the reference genes could still have a low concentration variability (and thus be useful for normalization) within a more homogeneous sample group, such as multiple samples of the same tissue type from different donors. Therefore, we analyzed reference gene variability on the mRNA level in liver (175 donors) and small intestine (137 donors) using previously published transcriptomics data (18), and on the protein level in liver and jejunum samples from 38 human donors (Supplementary Data S1). Similarly to the observation across different tissue types, many of the proposed references (14 of 33 from the liver and 19 of 33 from the small intestine) were among the 10% least variable mRNAs (Figure 2C; Supplementary Figure S2a and S2b). On the protein level, a few of the references were among the 10% least variable (5 in both liver and jejunum). However, the overall variability was, surprisingly, as high as in the multi-tissue proteomics dataset (Figure 2C). This indicates that caution should be exercised when selecting reference genes for normalization of proteomics data based on transcriptomics information.
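The rank-based comparison of reference genes across datasets of different sizes can be illustrated with a short Python sketch. It assumes that a normalized rank is the ascending Gini rank divided by the number of entries in the dataset, as described in the legend of Figure 2; the function names, gene labels and Gini values are hypothetical.

```python
def normalized_ranks(gini_by_gene):
    """Ascending Gini rank divided by the number of entries (range 0 to 1]."""
    ranked = sorted(gini_by_gene, key=gini_by_gene.get)
    n = len(ranked)
    return {gene: rank / n for rank, gene in enumerate(ranked, start=1)}


def count_in_lowest_fraction(gini_by_gene, references, fraction=0.10):
    """Number of reference genes among the least variable `fraction`."""
    ranks = normalized_ranks(gini_by_gene)
    return sum(1 for gene in references
               if gene in ranks and ranks[gene] <= fraction)


# Hypothetical dataset of 20 gene products with increasing Gini indices.
dataset = {f"g{i}": i / 100 for i in range(20)}
n_low = count_in_lowest_fraction(dataset, ["g0", "g5", "g15"])
```

Dividing by the dataset size makes ranks comparable between, for example, a transcriptomics dataset and a smaller proteomics dataset, but it says nothing about the absolute Gini values behind the ranks.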

Variability in protein concentrations in single tissue types

Inspired by this unexpectedly high protein concentration variability in single tissue types, we performed more in-depth analysis of within-tissue variability in the proteomes of the 38 liver and jejunum samples. For the calculation of Gini indices, we included proteins that were detected with at least three unique peptides. The final datasets comprised 5968 and 7662 proteins for liver and jejunum, respectively. The variability distributions were significantly lower than in the across-tissue proteomics data, with peaks at Gini indices of 0.12 and 0.16 in liver and jejunum, respectively, compared to 0.54 across tissues (P < 0.0001, one-way ANOVA followed by Tukey’s multiple comparisons test; Figure 3A and Supplementary Data S1). Using previously published transcriptomics data (18), we observed relatively similar within-tissue variability distributions on the mRNA level compared to the protein level, with peaks at Gini indices of 0.20 and 0.18 in liver and small intestine, respectively (Supplementary Figure S3a and S3b). In general, the variability distribution of protein concentrations in the jejunum samples was shifted toward higher values compared to liver (P < 0.0001, one-way ANOVA followed by Tukey’s multiple comparisons test). More specifically, Gini indices ranged between 0.01–0.84 and 0.04–0.94, with medians of 0.21 and 0.24 in liver and jejunum, respectively. Gini indices from the two tissues showed a relatively poor correlation (r = 0.48; 5496 matching proteins). However, when excluding a relatively small number of the most discrepant values between the two datasets (469 proteins), we observed a much stronger correlation (r = 0.71; Figure 3B and Supplementary Figure S4A). This shows that the overall variability across donors is similar in liver and jejunum.

Figure 3. — Variability in protein concentrations in single tissue types. (A) Variability distributions of proteins in human liver (n = 5968) and jejunum (n = 7662) from 38 donors, and in proteomics data across 29 tissues (n = 10373) (7), comparing within-tissue and across-tissue variability. Numbers in figure denote the mode (bin width: 0.02). (B) Correlation of protein concentration variability between the liver and jejunum datasets. The black line shows the regression with all proteins included, and the grey line shows the regression after exclusion of highly discrepant proteins (outside the dotted lines) between the two tissues. (C) Overlap of the 100 and 500 least variable proteins in the liver and jejunum datasets. (D) Variability distributions of 2562 matching proteins in Caco-2 cells, analyzed as triplicates, and human liver and jejunum, comparing technical and biological variability. Numbers in figure denote the mode (bin width: 0.02); r = Pearson’s correlation coefficient.

We further investigated the least variable proteins in liver and jejunum, and found that 34% and 53% of the 100 and 500 least variable proteins, respectively, were overlapping between the two tissue types (Figure 3C). This demonstrates that proteins with low inter-individual variability in liver often also have a low variability in jejunum, suggesting that different tissue types share many proteins that require concentration levels within narrow intervals. Biological functions represented by the 34 proteins with the lowest Gini indices in both tissue types included protein folding (CCT2, CCT8, HSP90AA1, HSP90B1, HSPA8 and PDIA3), energy metabolism (HADHA, MDH2, ENO1, PGK1, ATP5A1 and ATP5B) and vesicular trafficking (CLTC, COPB2, DYNC1H1, GDI2 and RAB11B).

To verify the biological relevance of our findings, and to exclude that the variability was introduced by measurement errors, we compared the Gini indices from the liver and jejunum proteomics data with a previous dataset of Caco-2 cells analyzed in triplicate (31). In theory, the cell samples should have Gini values close to 0 for all proteins, as they were replicates cultured together under identical conditions. For this comparison, we used the 2562 proteins that were detected with at least three peptides in all three datasets. Indeed, the replicate samples had significantly lower Gini indices, with a median value of 0.04 compared to 0.12 and 0.14 in liver and jejunum, respectively (P < 0.0001, one-way ANOVA followed by Dunnett’s multiple comparisons test; Figure 3D). Interestingly, however, some proteins showed relatively high variability even among the replicates. In contrast to the least variable proteins (below the 5th percentile), the most variable proteins (above the 95th percentile) in the replicate samples were over-represented by nucleic acid binding proteins and transcription factors (Supplementary Figure S4B). Nevertheless, the generally low variability among the replicates shows that the Gini indices we observed in the liver and jejunum samples mainly resulted from biological differences and not from technical measurement errors.

We further validated the biological relevance of our analyses by comparing Gini indices in the 38 liver samples with a separate in-house proteomics dataset containing 15 liver samples from another patient group. A stronger correlation was found between the two liver datasets than between the different tissue types (i.e. the 38 liver and jejunum samples; Supplementary Figure S4C). Thus, the results in this study should not be specific to particular datasets.

Within-tissue variability in relation to protein properties

We next investigated the characteristics of proteins from the entire within-tissue variability spectra in liver and jejunum. Proteins were classified by type using the PANTHER classification system (32). Median Gini indices of the protein classes from liver and jejunum were highly correlated (r = 0.87), showing that similar types of proteins had low and high variability in both tissues (Figure 4A). For example, lyases, oxidoreductases, and chaperones were among the least variable protein classes, while cell adhesion molecules and extracellular matrix proteins showed high variability.

Figure 4. — Within-tissue variability in relation to protein properties. (A) Gini indices of proteins in different classes in human liver and jejunum from 38 donors. The number of proteins in each class is shown in the figure. Boxes range between the 25th and 75th percentiles, lines show medians and whiskers denote the 10th and 90th percentiles. (B) Subcellular distribution of proteins in human liver, sorted by Gini index. The 10% least and most variable proteins are highlighted. (C) Major subcellular compartment of the 10% least and most variable proteins in human liver. Dashed lines show the corresponding values for all proteins. (D) Enriched biological processes in the 100 least variable proteins in human liver and jejunum. Scores show the enrichment score of the functional annotation clustering. (E) Gini indices of essential and non-essential proteins in human liver and jejunum. The number of proteins is displayed under each boxplot. Boxes range between the 25th and 75th percentiles, lines show medians and whiskers denote the 10th and 90th percentiles. (F) The effect of various protein properties on Gini indices, assessed by PLS modeling. Error bars indicate 95% confidence intervals for the model coefficients. ****P < 0.0001, one-way ANOVA followed by Dunnett’s multiple comparisons test.

Further, we used the Prolocate database of protein localization in rat liver (33) to study the subcellular distribution of the 2095 proteins that matched between our liver dataset and Prolocate. In general, proteins with high Gini indices were more widely distributed across subcellular compartments (Figure 4B). The cytosol was the major compartment for 46% of the 2095 proteins, followed by mitochondria (17%) and the endoplasmic reticulum (ER; 16%; Figure 4C). We then specifically considered the 10% least and the 10% most variable proteins. Here, differences between the two extremes were seen for compartments other than the cytosol (Figure 4C and Supplementary Figure S5A). The mitochondria and ER were the major compartments for many of the least variable proteins. On the other hand, the plasma membrane (PM) and lysosome were major compartments for the most variable proteins. The high variability of lysosomal proteins is logical in light of the fasting period that the patients underwent prior to surgery (see ‘Materials and Methods’ section), as fasting is known to trigger an increase in the number and activity of lysosomes (34).

We then used the DAVID database (23) to characterize the biological functions of the least and most variable proteins. The 100 least variable proteins represented similar basal cellular functions in both tissues, such as carbohydrate metabolism, protein processing and translation (Figure 4D and Supplementary Data S2). The 100 most variable proteins were more random, with few enriched functions. However, extending the analysis to the 500 most variable proteins resulted in interesting findings. Enriched functions included extracellular matrix organization in the liver, possibly due to varying levels of collagen deposition related to liver status, as well as drug metabolism in the jejunum, which could be related to induction of intestinal metabolic enzymes by dietary and environmental factors (Supplementary Figure S5B and Data S2).

To further probe the involvement of the least variable proteins in basal functions, we compared proteins from the ‘core essentialome’ (35), i.e. proteins essential for cell survival, with non-essential proteins. Indeed, essential proteins had significantly lower Gini indices in both tissues (P < 0.0001, Student’s t-test), showing that these proteins were found at similar concentrations in all donors (Figure 4E). This is logical, as cells need to maintain these proteins at certain levels to ensure survival. The corresponding mRNAs to these essential proteins also had significantly lower Gini indices across donors in the external transcriptomics dataset of liver and small intestinal samples (18) (Supplementary Figure S6A and S6B). Further, similar analysis of the external proteomics dataset containing 15 liver samples from another patient group supported these results (Supplementary Figure S6C).

To provide a summary of factors that influence within-tissue variability, we performed PLS modeling of the effect of various protein properties on Gini indices. We included GRAVY score (hydrophobicity), secondary structure (36), essentiality (35), isoelectric point (37), protein complex participation (38), protein/mRNA ratio (6), turnover rate (39), molecular weight and protein concentration (Supplementary Data S1). The results of the models generally indicated that highly abundant, large, essential, hydrophilic proteins with low turnover rates have low concentration variability in single tissue types (Figure 4F).

Capturing expected biological variability

We next investigated whether the Gini index could capture a biological process where high variability in protein concentrations across donors was expected. For this, we extracted proteins that were associated with diabetes in the DisGeNET database (40) and proteins annotated with the term ‘inflammatory response’ in the Gene Ontology (41). Diabetes-associated proteins were likely to have variable concentration levels across the 38 donors, since a subgroup of the patients (one-third) suffered from diabetes. Further, the patients had varying degrees of obesity, and inflammation is known to increase with increasing body weight (42). The Gini indices for these proteins were compared with proteins from the proteasome and ribosome, i.e. core cellular structures that were expected to have similar concentration levels across the donors. As expected, diabetes-associated proteins had significantly higher Gini indices than the core structures in both liver and jejunum (one-way ANOVA followed by Dunnett’s multiple comparisons test; Figure 5A). This was further investigated by dividing the donors into two groups: patients with diabetes and patients without diabetes, and calculating the Gini indices for each group (Supplementary Figure S7A). As patients in the respective groups were expected to have similar concentrations of diabetes-associated proteins (e.g. high levels of a protein in patients with diabetes but low levels in patients without diabetes), low Gini indices were expected in each group (Supplementary Figure S7A). Indeed, diabetes-associated proteins had significantly lower Gini indices in both groups compared to the Gini indices obtained when including all 38 donors (one-way ANOVA followed by Holm-Sidak’s multiple comparisons test; Supplementary Figure S7B and S7C). 
Acute inflammatory response proteins also had significantly higher Gini indices than the core structures in both liver and jejunum (one-way ANOVA followed by Dunnett’s multiple comparisons test; Figure 5A). These results further demonstrate the applicability of this approach for observing biologically relevant variability. Interestingly, some of the most variable inflammatory proteins have also been detected in plasma (43) and urine (44). Both well-established biomarkers, such as CRP (45), and less investigated proteins were found among the ten most variable inflammatory proteins, indicating that the latter could potentially be used as biomarkers for tissue inflammation (Figure 5B).
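The stratified analysis of diabetes-associated proteins can be illustrated with a small Python sketch. The concentrations below are entirely hypothetical (13 donors with high levels and 25 with low levels of one protein), and the Gini function is the standard uncorrected sample formula rather than the modified ineq code used in this study.

```python
def gini(values):
    """Uncorrected sample Gini index of non-negative values."""
    xs = sorted(values)
    n = len(xs)
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * sum(xs))


# Hypothetical levels of one diabetes-associated protein in 38 donors.
with_diabetes = [9.0, 10.0, 11.0] * 4 + [10.0]    # 13 donors, high levels
without_diabetes = [1.8, 2.0, 2.2] * 8 + [2.0]    # 25 donors, low levels

gini_all = gini(with_diabetes + without_diabetes)  # pooled over all donors
gini_with = gini(with_diabetes)                    # within one group
gini_without = gini(without_diabetes)              # within the other group
```

In this toy case, most of the pooled variability comes from the difference between the groups, so each within-group Gini index is lower than the pooled one, which is the pattern described for the diabetes-associated proteins.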

Figure 5. — Variability in proteins associated with inflammation, diabetes, and drug metabolism. (A) Gini indices of proteins from core cellular structures, with low expected variability, compared with Gini indices of diabetes-associated and inflammatory response proteins, with high expected variability. The number of proteins is displayed under each boxplot. Boxes range between the 25th and 75th percentiles, lines show medians, and whiskers denote the 10th and 90th percentiles. (B) The ten most variable inflammatory response proteins in liver and jejunum. (C) Gini indices of all proteins quantified in liver samples, ordered by gene name. Important transporter and metabolic enzyme families are highlighted. (D) Important drug metabolizing CYP enzymes quantified in liver samples, sorted by Gini index. (E) Gini indices of all proteins quantified in jejunum samples, ordered by gene name. Important transporter and metabolic enzyme families are highlighted. (F) Important drug metabolizing CYP enzymes quantified in jejunum samples, sorted by Gini index. Roman numerals indicate statistical significance from one-way ANOVA followed by Dunnett’s multiple comparisons test, in comparisons of diabetes-associated proteins with proteasome (i, iii; P < 0.0001) and ribosome (ii, iv; P < 0.05 and P < 0.0001, respectively), and comparisons of inflammatory response proteins with proteasome (v, vii; P < 0.0001) and ribosome (vi, viii; P < 0.001 and P < 0.0001, respectively) in liver and jejunum, respectively.

Variability in proteins related to drug metabolism

One progressing branch of systems biology is systems pharmacology, in which biological network structures are used to predict drug action (46). For this type of analysis, it is imperative to know the quantities of proteins involved in drug disposition. Therefore, we investigated the variability of the most important proteins involved in drug transport and metabolism. These included proteins from the two main transporter families, ATP-binding cassette (ABC) and solute carrier (SLC), and the two most prominent enzyme families in phase I and phase II metabolizing enzymes, i.e. the cytochrome P450s (CYPs) and UDP-glucuronosyltransferases (UGTs) (47–50). In general, these proteins covered large parts of the variability spectrum in both liver and jejunum (Figure 5C and E). More specifically, Gini indices for the CYPs responsible for the majority of drug metabolism (50) ranged from 0.09 to 0.34 and 0.19 to 0.88 in liver and jejunum, respectively (Figure 5D and F). This spread in Gini indices was also reflected by the fold differences of protein concentrations across donors (i.e. the ratio of maximum and minimum values), where the majority of the enzymes (67%) had a fold difference above 10. Fold differences are heavily affected by extreme values, but provide more tangible information for systems pharmacology analysis. The high variability we observed for many important metabolic enzymes implies that comprehensive systems pharmacology analysis, tailored to the patient, requires thorough consideration of inter-individual protein concentration variability.
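The sensitivity of fold differences to extreme values can be made concrete with a short Python sketch. The two concentration profiles are hypothetical, and the Gini function is the standard uncorrected sample formula, not the modified ineq code used in this study.

```python
def gini(values):
    """Uncorrected sample Gini index of non-negative values."""
    xs = sorted(values)
    n = len(xs)
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * sum(xs))


def fold_difference(values):
    """Ratio of the maximum to the minimum positive concentration."""
    positive = [v for v in values if v > 0]
    return max(positive) / min(positive)


# Two hypothetical enzymes measured in 38 donors.
one_outlier = [10.0] * 37 + [200.0]          # a single extreme donor
two_groups = [10.0] * 19 + [200.0] * 19      # half the donors are high

fold_outlier = fold_difference(one_outlier)  # 20.0
fold_groups = fold_difference(two_groups)    # 20.0
gini_outlier = gini(one_outlier)
gini_groups = gini(two_groups)               # higher than gini_outlier
```

Both profiles give the same fold difference because only the extremes enter the ratio, whereas the Gini index also reflects how many donors sit at each level.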

DISCUSSION

Concentrations of mRNAs and proteins show considerable variability across tissues, and it is well known that mRNA and protein levels are poorly correlated. However, little is known about the transfer of relative concentration differences from mRNA to protein. Furthermore, the variability in protein concentrations within single tissue types from multiple donors has not been comprehensively characterized. Here, we first used the Gini index to characterize mRNA and protein concentration variability across tissues using a publicly available dataset. We then investigated inter-individual differences within single tissue types, using in-house proteomics data of 38 human liver and jejunum samples. We found substantial variability differences between transcriptomics and proteomics data across tissues, and identified common themes in the single tissue types among proteins at both ends of the variability spectrum.

We observed a discrepancy in concentration variability between the mRNA and protein levels across tissue types. This could be explained by various factors. Tissue specialization constitutes a logical partial explanation, as different proteins are required at different levels to establish specific phenotypes (4). Furthermore, protein levels are controlled by a complex interplay of post-transcriptional processes. Protein synthesis consumes more energy than mRNA transcription (51,52), meaning that tissues will only produce proteins at levels that are required for their tissue-specific functions, whereas mRNA levels do not need to be as tightly controlled (53). Moreover, regulation of synthesis and degradation rates can also largely affect the variability in protein concentration irrespective of the corresponding mRNA levels (54).

We found that the discrepancy in variability between mRNA and protein levels across tissue types was also prominent among transcriptomics-based reference genes proposed for use in normalization procedures. Further, the reference genes showed different variabilities in the proteomics datasets containing 38 samples each of human liver and jejunum. In essence, these results mean that optimal normalization would require the selection of different reference genes for transcriptomics and proteomics data, and for different cells and tissues. In fact, specific reference genes might even be necessary for different samples of the same tissue type in different conditions (55). The complexity of this task makes the use of reference genes a highly impractical normalization approach for proteomics data. Instead, more sophisticated normalization methods, such as intensity-based variance stabilization normalization (vsn), should be used (21,56).

In the analysis of our 38 liver and jejunum samples, we observed that protein concentration variability was lower within single tissue types than across different tissues. Variability was generally higher in jejunum than in liver. Demographic differences between sample types could be ruled out as a cause for the higher variability, since the samples originated from the same 38 donors. Therefore, other explanations need to be considered. The liver is a relatively homogeneous tissue, where hepatocytes constitute 80% of the volume (57), whereas the jejunum is composed of several distinct tissue layers (58). Further, hepatocytes have a comparatively small proteome in relation to other cell types in the liver (59). This could explain why more proteins were detected in the jejunum samples. The layered nature of the jejunum means that the pinch-biopsy technique used for obtaining jejunal biopsies can result in slightly different sample compositions due to variable sampling depth (60). Deeper biopsies would contain higher proportions of collagen-rich submucosal connective tissue, thus diluting the tissue-specific expression of the jejunal epithelium to varying degrees. Indeed, we observed high variability in the major intestinal collagen types (61) between the jejunum samples (Supplementary Figure S5C). Further, inter-individual differences in jejunum length (62) could affect the relative position of the jejunal sampling site (17). This could influence the variability due to regional differences in protein concentrations in the human small intestine (63). Such sampling effects have previously been observed on the mRNA level in human lung tissue (12).
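The dilution argument above can be illustrated with a small worked example. The Python sketch below assumes, purely for illustration, an epithelium-specific protein that has the same concentration in the epithelium of every donor and is absent from the submucosa, so that the measured bulk concentration scales with the epithelial fraction of the biopsy. The fractions, the concentration and the function names are hypothetical, and the rank-based Gini estimator is an assumption.

```python
def gini_index(values):
    """Rank-based Gini estimator for non-negative values (assumed form)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive sum")
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def measured_concentration(epithelial_conc, epithelial_fraction):
    """Bulk concentration of a protein found only in the epithelium.

    Simplifying assumption: the rest of the biopsy (e.g. submucosa)
    contributes mass but none of this protein.
    """
    return epithelial_conc * epithelial_fraction


# Hypothetical: identical true concentration in all donors ...
true_concentration = 10.0
# ... but biopsies of different depth, hence different epithelial fractions.
epithelial_fractions = [0.9, 0.8, 0.6, 0.5, 0.3]

measured = [measured_concentration(true_concentration, f)
            for f in epithelial_fractions]
uniform = [measured_concentration(true_concentration, 0.8)
           for _ in epithelial_fractions]

print(gini_index(uniform))   # no sampling variability
print(gini_index(measured))  # variability introduced by sampling depth alone
```

Under these assumptions the protein shows a non-zero Gini index although its concentration in the epithelium is identical in all donors, which is the kind of sampling effect discussed for the jejunal biopsies.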

We also note that several technical parameters of the proteomics analysis itself may affect the Gini index. For instance, it is well known that the number of peptides detected for a certain protein affects the reliability of its quantification (64). We did observe a tendency for proteins detected with larger numbers of peptides to have lower Gini indices, but the correlations were relatively weak (r = –0.37 and r = –0.35 for liver and jejunum, respectively; Supplementary Figure S8A and S8B). To partly account for this problem, we only included proteins that were detected with at least three unique peptides. Another problem is represented by plasma membrane proteins, which are often of low abundance, show limited solubilization, and contain hydrophobic domains that are problematic to ionize, all of which make them more difficult to analyze (65). However, in the proteomics workflow used here, the lysis buffer contained a high concentration of SDS, which facilitates the solubilization of proteins with hydrophobic domains. Nevertheless, the challenge of quantifying low-abundance proteins remains, and is a plausible reason why plasma membrane proteins were over-represented among the most variable proteins in the liver and jejunum. The difficulties involved in their analysis thus indicate that their Gini indices should be interpreted with a measure of caution.
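The peptide-count filter and the correlation check described above can be sketched as follows. This is a minimal Python illustration, assuming a Pearson correlation on untransformed peptide counts (the text does not state which coefficient or transformation was used); the record layout, field names and toy values are hypothetical.

```python
from math import sqrt


def filter_by_peptides(proteins, min_peptides=3):
    """Keep proteins detected with at least `min_peptides` unique peptides."""
    return [p for p in proteins if p["unique_peptides"] >= min_peptides]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


# Hypothetical per-protein records: unique peptide count and Gini index.
proteins = [
    {"name": "P1", "unique_peptides": 2, "gini": 0.80},
    {"name": "P2", "unique_peptides": 3, "gini": 0.55},
    {"name": "P3", "unique_peptides": 8, "gini": 0.30},
    {"name": "P4", "unique_peptides": 20, "gini": 0.25},
    {"name": "P5", "unique_peptides": 45, "gini": 0.10},
]

kept = filter_by_peptides(proteins)
r = pearson_r([p["unique_peptides"] for p in kept],
              [p["gini"] for p in kept])
print([p["name"] for p in kept], r)
```

A negative coefficient in such a check would indicate that proteins quantified from fewer peptides tend to appear more variable, which is the technical effect the filter is meant to limit.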

Even though the protein concentration variability was generally higher in jejunum than in liver, the least variable proteins in each tissue were involved in similar essential cellular processes. The least variable proteins also seemed to capture basal differences in tissue physiology. For instance, the constant proliferation and migration of enterocytes along the villus axis (66) likely accounts for the low variability of proteins related to cell division in the jejunum. This highlights the relevance of using the Gini index for variability analysis. The Gini index also provides a simple approach to identify variability in proteins of importance for systems pharmacology analysis, such as transporters and metabolic enzymes, as well as in proteins involved in biological processes that deviate from the ‘normal’ situation (e.g. healthy controls). For example, we found high variability in proteins involved in diabetes and inflammation, indicating varying degrees of these disease-related processes in the different donors. By identifying highly variable proteins and subsequently comparing the actual protein levels with disease status in the different patients, the Gini index might provide an unbiased starting point for biomarker discovery in a more general sense. Supporting this, previous studies have shown that hypervariably expressed genes are largely associated with human diseases (67).

In summary, we found that the transfer of concentration variability from mRNA to protein across tissue types is limited, in addition to the already established poor correlation between quantitative levels of mRNA and protein. Our analysis also indicates that specific, rather than universal, reference genes would be required for different omics levels and sample types. This indicates that reference gene normalization is not feasible for proteomics data. At the level of single tissue types (liver and jejunum), we found that proteins with low concentration variability across donors were involved in essential cellular processes. On the other hand, proteins with variable concentrations reflected varying degrees of disease, indicating that this type of variability analysis could be a simple aid in early biomarker discovery. Furthermore, we observed that proteins within single tissue types covered a wide variability spectrum, which implies that individual concentration levels should be taken into consideration for truly personalized outcomes of systems biology analysis.

DATA AVAILABILITY

Concentration variability data (Gini indices), together with protein properties for the liver and jejunum datasets, are provided in Supplementary Data S1. Proteomics data of liver and jejunum are available upon email request to the corresponding author, provided that the proposed use of the data is approved by the COCKTAIL-study steering committee and is in accordance with the consent given by the participants and with Norwegian laws and legislation (see Supplementary trial protocol).

ACKNOWLEDGEMENTS

We thank Cecilia Karlsson and Shalini Andersson, Cardiovascular, Renal and Metabolism, Innovative Medicines and Early Development Biotech Unit, AstraZeneca, Jens Kristoffer Hertel, Morbid Obesity Centre, Department of Medicine, Vestfold Hospital Trust, and Dr André Mateus, Uppsala University, for valuable discussions. We also thank Lars Thomas Seeberg and colleagues at the Department of Gastrointestinal Surgery, Vestfold Hospital Trust for providing biopsies from jejunum and liver, and Ida Robertsen, Department of Pharmacy, University of Oslo, for sample logistics. Finally, we are grateful to the patients at the Morbid Obesity Center, Vestfold Hospital Trust, for participating in the COCKTAIL study and accepting tissue biopsies to be taken.

Author Contributions: Conceptualization: C.W., M.Ö., T.B.A. and P.A.; Methodology: C.W. and M.Ö.; Resources: K.Z., J.R.W., J.H. and P.A.; Data curation: C.W.; Investigation: C.W. and M.Ö.; Formal analysis: C.W. and M.Ö.; Visualization: C.W. and M.Ö.; Writing—original draft: C.W., M.Ö. and P.A.; Writing—review and editing, C.W., M.Ö., J.R.W., P.L., A.Å., J.H., T.B.A. and P.A.; Funding acquisition: T.B.A. and P.A.

SUPPLEMENTARY DATA

Supplementary Data are available at NARGAB Online.

FUNDING

Swedish Research Council [5715 to T.B.A., P.A., 01951 to P.A.].

Conflict of interest statement. None declared.

REFERENCES

1. Aebersold R., Mann M. Mass-spectrometric exploration of proteome structure and function. Nature. 2016; 537:347–355.
2. Wang Z., Gerstein M., Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nat. Rev. Genet. 2009; 10:57–63.
3. Melé M., Ferreira P.G., Reverter F., DeLuca D.S., Monlong J., Sammeth M., Young T.R., Goldmann J.M., Pervouchine D.D., Sullivan T.J. The human transcriptome across tissues and individuals. Science. 2015; 348:660–665.
4. Uhlén M., Fagerberg L., Hallström B.M., Lindskog C., Oksvold P., Mardinoglu A., Sivertsson Å., Kampf C., Sjöstedt E., Asplund A. Tissue-based map of the human proteome. Science. 2015; 347:1260419.
5. Kim M.-S., Pinto S.M., Getnet D., Nirujogi R.S., Manda S.S., Chaerkady R., Madugundu A.K., Kelkar D.S., Isserlin R., Jain S. A draft map of the human proteome. Nature. 2014; 509:575–581.
6. Wilhelm M., Schlegl J., Hahne H., Gholami A.M., Lieberenz M., Savitski M.M., Ziegler E., Butzmann L., Gessulat S., Marx H. Mass-spectrometry-based draft of the human proteome. Nature. 2014; 509:582–587.
7. Wang D., Eraslan B., Wieland T., Hallstrom B., Hopf T., Zolg D.P., Zecha J., Asplund A., Li L.H., Meng C. et al. A deep proteome and transcriptome abundance atlas of 29 healthy human tissues. Mol. Syst. Biol. 2019; 15:e8503.
8. Vogel C., Marcotte E.M. Insights into the regulation of protein abundance from proteomic and transcriptomic analyses. Nat. Rev. Genet. 2012; 13:227–232.
9. Edfors F., Danielsson F., Hallström B.M., Käll L., Lundberg E., Pontén F., Forsström B., Uhlén M. Gene‐specific correlation of RNA and protein levels in human cells and tissues. Mol. Syst. Biol. 2016; 12:883.
10. Fortelny N., Overall C.M., Pavlidis P., Freue G.V.C. Can we predict protein from mRNA levels? Nature. 2017; 547:E19.
11. Hughes D.A., Kircher M., He Z., Guo S., Fairbrother G.L., Moreno C.S., Khaitovich P., Stoneking M. Evaluating intra-and inter-individual variation in the human placental transcriptome. Genome Biol. 2015; 16:54.
12. McCall M.N., Illei P.B., Halushka M.K. Complex sources of variation in tissue expression data: analysis of the GTEx lung transcriptome. Am. J. Hum. Genet. 2016; 99:624–635.
13. Wiśniewski J.R., Duś-Szachniewicz K., Ostasiewicz P., Ziółkowski P., Rakus D., Mann M. Absolute proteome analysis of colorectal mucosa, adenoma, and cancer reveals drastic changes in fatty acid metabolism and plasma membrane transporters. J. Proteome Res. 2015; 14:4005–4018.
14. Hamdeh S.A., Shevchenko G., Mi J., Musunuri S., Bergquist J., Marklund N. Proteomic differences between focal and diffuse traumatic brain injury in human brain tissue. Sci. Rep. 2018; 8:6807.
15. Chen R., Snyder M. Systems biology: personalized medicine for the future. Curr. Opin. Pharmacol. 2012; 12:623–628.
16. Tavassoly I., Goldfarb J., Iyengar R. Systems biology primer: the basic methods and approaches. Essays Biochem. 2018; 62:487–500.
17. Hjelmesæth J., Åsberg A., Andersson S., Sandbu R., Robertsen I., Johnson L.K., Angeles P.C., Hertel J.K., Skovlund E., Heijer M. et al. Impact of body weight, low energy diet and gastric bypass on drug bioavailability, cardiovascular risk factors and metabolic biomarkers: protocol for an open, non-randomised, three-armed single centre study (COCKTAIL). BMJ Open. 2018; 8:e021878.
18. Lonsdale J., Thomas J., Salvatore M., Phillips R., Lo E., Shad S., Hasz R., Walters G., Garcia F., Young N. The genotype-tissue expression (GTEx) project. Nat. Genet. 2013; 45:580–585.
19. Wiśniewski J.R., Mann M. Consecutive proteolytic digestion in an enzyme reactor increases depth of proteomic and phosphoproteomic analysis. Anal. Chem. 2012; 84:2631–2637.
20. Cox J., Mann M. MaxQuant enables high peptide identification rates, individualized ppb-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol. 2008; 26:1367.
21. Huber W., Von Heydebreck A., Sültmann H., Poustka A., Vingron M. Variance stabilization applied to microarray data calibration and to the quantification of differential expression. Bioinformatics. 2002; 18:S96–S104.
22. Wiśniewski J.R., Rakus D. Multi-enzyme digestion FASP and the ‘Total Protein Approach’-based absolute quantification of the Escherichia coli proteome. J. Proteomics. 2014; 109:322–331.
23. Huang D.W., Sherman B.T., Lempicki R.A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 2008; 4:44.
24. Ceriani L., Verme P. The origins of the Gini index: extracts from Variabilità e Mutabilità (1912) by Corrado Gini. J. Econ. Inequal. 2012; 10:421–443.
25. O’Hagan S., Muelas M.W., Day P.J., Lundberg E., Kell D.B. GeneGini: Assessment via the gini coefficient of reference “Housekeeping” genes and diverse human transporter expression profiles. Cell Syst. 2018; 6:230–244.
26. Cenik C., Cenik E.S., Byeon G.W., Grubert F., Candille S.I., Spacek D., Alsallakh B., Tilgner H., Araya C.L., Tang H. Integrative analysis of RNA, translation and protein levels reveals distinct regulatory variation across humans. Genome Res. 2015; 25:1610–1621.
27. Gorr T.A., Vogel J. Western blotting revisited: critical perusal of underappreciated technical issues. Proteomics. 2015; 9:396–405.
28. Vandesompele J., De Preter K., Pattyn F., Poppe B., Van Roy N., De Paepe A., Speleman F. Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes. Genome Biol. 2002; 3:doi:10.1186/gb-2002-3-7-research0034.
29. Wiśniewski J.R., Mann M. A proteomics approach to the protein normalization problem: selection of unvarying proteins for MS-based proteomics and western blotting. J. Proteome Res. 2016; 15:2321–2326.
30. Eisenberg E., Levanon E.Y. Human housekeeping genes, revisited. Trends Genet. 2013; 29:569–574.
31. Ölander M., Wiśniewski J.R., Matsson P., Lundquist P., Artursson P. The proteome of filter-grown Caco-2 cells with a focus on proteins involved in drug disposition. J. Pharm. Sci. 2016; 105:817–827.
32. Thomas P.D., Campbell M.J., Kejariwal A., Mi H., Karlak B., Daverman R., Diemer K., Muruganujan A., Narechania A. PANTHER: a library of protein families and subfamilies indexed by function. Genome Res. 2003; 13:2129–2141.
33. Jadot M., Boonen M., Thirion J., Wang N., Xing J., Zhao C., Tannous A., Qian M., Zheng H., Everett J.K. Accounting for protein subcellular localization: A compartmental map of the rat liver proteome. Mol. Cell. Proteomics. 2017; 16:194–212.
34. Settembre C., Ballabio A. Lysosome: regulator of lipid degradation pathways. Trends Cell Biol. 2014; 24:743–750.
35. Blomen V.A., Májek P., Jae L.T., Bigenzahn J.W., Nieuwenhuis J., Staring J., Sacco R., van Diemen F.R., Olk N., Stukalov A. Gene essentiality and synthetic lethality in haploid human cells. Science. 2015; 350:1092–1096.
36. Zecha J., Meng C., Zolg D.P., Samaras P., Wilhelm M., Kuster B. Peptide level turnover measurements enable the study of proteoform dynamics. Mol. Cell. Proteomics. 2018; 17:974–992.
37. Kozlowski L.P. Proteome-pI: proteome isoelectric point database. Nucleic Acids Res. 2016; 45:D1112–D1116.
38. Giurgiu M., Reinhard J., Brauner B., Dunger-Kaltenbach I., Fobo G., Frishman G., Montrone C., Ruepp A. CORUM: the comprehensive resource of mammalian protein complexes—2019. Nucleic Acids Res. 2019; 47:D559–D563.
39. Mathieson T., Franken H., Kosinski J., Kurzawa N., Zinn N., Sweetman G., Poeckel D., Ratnu V.S., Schramm M., Becher I. Systematic analysis of protein turnover in primary cells. Nat. Commun. 2018; 9:689.
40. Gutiérrez-Sacristán A., Bravo À., Centeno E., Sanz F., Piñero J., García-García J., Deu-Pons J., Queralt-Rosinach N., Furlong L.I. DisGeNET: a comprehensive platform integrating information on human disease-associated genes and variants. Nucleic Acids Res. 2016; 45:D833–D839.
41. Ashburner M., Ball C.A., Blake J.A., Botstein D., Butler H., Cherry J.M., Davis A.P., Dolinski K., Dwight S.S., Eppig J.T. Gene Ontology: tool for the unification of biology. Nat. Genet. 2000; 25:25–29.
42. Gregor M.F., Hotamisligil G.S. Inflammatory mechanisms in obesity. Annu. Rev. Immunol. 2011; 29:415–445.
43. Geyer P.E., Kulak N.A., Pichler G., Holdt L.M., Teupser D., Mann M. Plasma proteome profiling to assess human health and disease. Cell Syst. 2016; 2:185–195.
44. Zhao M., Li M., Yang Y., Guo Z., Sun Y., Shao C., Li M., Sun W., Gao Y. A comprehensive analysis and annotation of human normal urinary proteome. Sci. Rep. 2017; 7:3024.
45. Plevy S., Silverberg M.S., Lockton S., Stockfisch T., Croner L., Stachelski J., Brown M., Triggs C., Chuang E., Princen F. et al. Combined serological, genetic, and inflammatory markers differentiate non-IBD, Crohn's disease, and ulcerative colitis patients. Inflamm. Bowel Dis. 2013; 19:1139–1148.
46. Danhof M. Systems pharmacology – Towards the modeling of network interactions. Eur. J. Pharm. Sci. 2016; 94:4–14.
47. Lin L., Yee S.W., Kim R.B., Giacomini K.M. SLC transporters as therapeutic targets: emerging opportunities. Nat. Rev. Drug Discovery. 2015; 14:543.
48. Vasiliou V., Vasiliou K., Nebert D.W. Human ATP-binding cassette (ABC) transporter family. Hum. Genomics. 2009; 3:281.
49. Williams J.A., Hyland R., Jones B.C., Smith D.A., Hurst S., Goosen T.C., Peterkin V., Koup J.R., Ball S.E. Drug-drug interactions for UDP-glucuronosyltransferase substrates: a pharmacokinetic explanation for typically observed low exposure (AUCi/AUC) ratios. Drug Metab. Dispos. 2004; 32:1201–1208.
50. Zanger U.M., Schwab M. Cytochrome P450 enzymes in drug metabolism: regulation of gene expression, enzyme activities, and impact of genetic variation. Pharmacol. Therap. 2013; 138:103–141.
51. Wessely F., Bartl M., Guthke R., Li P., Schuster S., Kaleta C. Optimal regulatory strategies for metabolic pathways in Escherichia coli depending on protein costs. Mol. Syst. Biol. 2011; 7:515.
52. Buttgereit F., Brand M.D. A hierarchy of ATP-consuming processes in mammalian cells. Biochem. J. 1995; 312:163–167.
53. Khan Z., Ford M.J., Cusanovich D.A., Mitrano A., Pritchard J.K., Gilad Y. Primate transcript and protein expression levels evolve under compensatory selection pressures. Science. 2013; 342:1100–1104.
54. Liu Y., Beyer A., Aebersold R. On the dependency of cellular protein levels on mRNA abundance. Cell. 2016; 165:535–550.
55. Gabrielsson B.G., Olofsson L.E., Sjögren A., Jernås M., Elander A., Lönn M., Rudemo M., Carlsson L.M.S. Evaluation of reference genes for studies of gene expression in human adipose tissue. Obes. Res. 2005; 13:649–652.
56. Välikangas T., Suomi T., Elo L.L. A systematic evaluation of normalization methods in quantitative label-free proteomics. Brief. Bioinform. 2016; 19:1–11.
57. Kmiec Z. Cooperation of liver cells in health and disease. Adv. Anat. Embryol. Cell Biol. 2001; 161:1–151.
58. Tortora G.J., Derrickson B.H. Principles of Anatomy and Physiology. 2008; Hoboken: John Wiley & Sons.
59. Ding C., Li Y., Guo F., Jiang Y., Ying W., Li D., Yang D., Xia X., Liu W., Zhao Y. et al. A Cell-type-resolved liver proteome. Mol. Cell Proteomics. 2016; 15:3190.
60. Lown K.S., Kolars J.C., Thummel K.E., Barnett J.L., Kunze K.L., Wrighton S.A., Watkins P.B. Interpatient heterogeneity in expression of CYP3A4 and CYP3A5 in small bowel. Lack of prediction by the erythromycin breath test. Drug Metab. Dispos. 1994; 22:947–955.
61. Graham M.F., Diegelmann R.F., Elson C.O., Lindblad W.J., Gotschalk N., Gay S., Gay R. Collagen content and types in the intestinal strictures of Crohn's disease. Gastroenterology. 1988; 94:257–265.
62. Hounnou G., Destrieux C., Desme J., Bertrand P., Velut S. Anatomical study of the length of the human intestine. Surg. Radiol. Anat. 2002; 24:290–294.
63. Drozdzik M., Busch D., Lapczuk J., Müller J., Ostrowski M., Kurzawski M., Oswald S. Protein abundance of clinically relevant Drug‐Metabolizing enzymes in the human liver and intestine: a comparative analysis in paired tissue specimens. Clin Pharmacol Ther. 2018; 104:515–524.
64. Wiśniewski J.R., Wegler C., Artursson P. Multiple-Enzyme-Digestion strategy improves accuracy and sensitivity of Label- and Standard-Free absolute quantification to a level that is achievable by analysis with stable Isotope-Labeled standard spiking. J. Proteome Res. 2019; 18:217–224.
65. Moore S.M., Hess S.M., Jorgenson J.W. Extraction, enrichment, solubilization, and digestion techniques for membrane proteomics. J. Proteome Res. 2016; 15:1243–1252.
66. Moor A.E., Harnik Y., Ben-Moshe S., Massasa E.E., Rozenberg M., Eilam R., Bahar Halpern K., Itzkovitz S. Spatial reconstruction of single enterocytes uncovers broad zonation along the intestinal villus axis. Cell. 2018; 175:1156–1167.
67. Alemu E.Y., Carl J.W. Jr, Corrada Bravo H., Hannenhalli S. Determinants of expression variability. Nucleic Acids Res. 2014; 42:3503–3514.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

lqz010_Supplemental_Files

Click here for additional data file.^{(3.9MB, zip)}

Data Availability Statement

[B1] 1. Aebersold R., Mann M.. Mass-spectrometric exploration of proteome structure and function. Nature. 2016; 537:347–355. [DOI] [PubMed] [Google Scholar]

[B2] 2. Wang Z., Gerstein M., Snyder M.. RNA-Seq: a revolutionary tool for transcriptomics. Nat. Rev. Genet. 2009; 10:57–63. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B3] 3. Melé M., Ferreira P.G., Reverter F., DeLuca D.S., Monlong J., Sammeth M., Young T.R., Goldmann J.M., Pervouchine D.D., Sullivan T.J.. The human transcriptome across tissues and individuals. Science. 2015; 348:660–665. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B4] 4. Uhlén M., Fagerberg L., Hallström B.M., Lindskog C., Oksvold P., Mardinoglu A., Sivertsson Å., Kampf C., Sjöstedt E., Asplund A.. Tissue-based map of the human proteome. Science. 2015; 347:1260419. [DOI] [PubMed] [Google Scholar]

[B5] 5. Kim M.-S., Pinto S.M., Getnet D., Nirujogi R.S., Manda S.S., Chaerkady R., Madugundu A.K., Kelkar D.S., Isserlin R., Jain S.. A draft map of the human proteome. Nature. 2014; 509:575–581. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B6] 6. Wilhelm M., Schlegl J., Hahne H., Gholami A.M., Lieberenz M., Savitski M.M., Ziegler E., Butzmann L., Gessulat S., Marx H.. Mass-spectrometry-based draft of the human proteome. Nature. 2014; 509:582–587. [DOI] [PubMed] [Google Scholar]

[B7] 7. Wang D., Eraslan B., Wieland T., Hallstrom B., Hopf T., Zolg D.P., Zecha J., Asplund A., Li L.H., Meng C. et al.. A deep proteome and transcriptome abundance atlas of 29 healthy human tissues. Mol. Syst. Biol. 2019; 15:e8503. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B8] 8. Vogel C., Marcotte E.M.. Insights into the regulation of protein abundance from proteomic and transcriptomic analyses. Nat. Rev. Genet. 2012; 13:227–232. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B9] 9. Edfors F., Danielsson F., Hallström B.M., Käll L., Lundberg E., Pontén F., Forsström B., Uhlén M.. Gene‐specific correlation of RNA and protein levels in human cells and tissues. Mol. Syst. Biol. 2016; 12:883. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B10] 10. Fortelny N., Overall C.M., Pavlidis P., Freue G.V.C.. Can we predict protein from mRNA levels?. Nature. 2017; 547:E19. [DOI] [PubMed] [Google Scholar]

[B11] 11. Hughes D.A., Kircher M., He Z., Guo S., Fairbrother G.L., Moreno C.S., Khaitovich P., Stoneking M.. Evaluating intra-and inter-individual variation in the human placental transcriptome. Genome Biol. 2015; 16:54. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B12] 12. McCall M.N., Illei P.B., Halushka M.K.. Complex sources of variation in tissue expression data: analysis of the GTEx lung transcriptome. Am. J. Hum. Genet. 2016; 99:624–635. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B13] 13. Wiśniewski J.R., Duś-Szachniewicz K., Ostasiewicz P., Ziółkowski P., Rakus D., Mann M.. Absolute proteome analysis of colorectal mucosa, adenoma, and cancer reveals drastic changes in fatty acid metabolism and plasma membrane transporters. J. Proteome Res. 2015; 14:4005–4018. [DOI] [PubMed] [Google Scholar]

[B14] 14. Hamdeh S.A., Shevchenko G., Mi J., Musunuri S., Bergquist J., Marklund N.. Proteomic differences between focal and diffuse traumatic brain injury in human brain tissue. Sci. Rep. 2018; 8:6807. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B15] 15. Chen R., Snyder M.. Systems biology: personalized medicine for the future. Curr. Opin. Pharmacol. 2012; 12:623–628. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B16] 16. Tavassoly I., Goldfarb J., Iyengar R.. Systems biology primer: the basic methods and approaches. Essays Biochem. 2018; 62:487–500. [DOI] [PubMed] [Google Scholar]

[B17] 17. Hjelmesæth J., Åsberg A., Andersson S., Sandbu R., Robertsen I., Johnson L.K., Angeles P.C., Hertel J.K., Skovlund E., Heijer M. et al.. Impact of body weight, low energy diet and gastric bypass on drug bioavailability, cardiovascular risk factors and metabolic biomarkers: protocol for an open, non-randomised, three-armed single centre study (COCKTAIL). BMJ Open. 2018; 8:e021878. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B18] 18. Lonsdale J., Thomas J., Salvatore M., Phillips R., Lo E., Shad S., Hasz R., Walters G., Garcia F., Young N.. The genotype-tissue expression (GTEx) project. Nat. Genet. 2013; 45:580–585. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B19] 19. Wiśniewski J.R., Mann M.. Consecutive proteolytic digestion in an enzyme reactor increases depth of proteomic and phosphoproteomic analysis. Anal. Chem. 2012; 84:2631–2637. [DOI] [PubMed] [Google Scholar]

[B20] 20. Cox J., Mann M.. MaxQuant enables high peptide identification rates, individualized ppb-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol. 2008; 26:1367. [DOI] [PubMed] [Google Scholar]

[B21] 21. Huber W., Von Heydebreck A., Sültmann H., Poustka A., Vingron M.. Variance stabilization applied to microarray data calibration and to the quantification of differential expression. Bioinformatics. 2002; 18:S96–S104. [DOI] [PubMed] [Google Scholar]

[B22] 22. Wiśniewski J.R., Rakus D.. Multi-enzyme digestion FASP and the ‘Total Protein Approach’-based absolute quantification of the Escherichia coli proteome. J. Proteomics. 2014; 109:322–331. [DOI] [PubMed] [Google Scholar]

[B23] 23. Huang D.W., Sherman B.T., Lempicki R.A.. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 2008; 4:44. [DOI] [PubMed] [Google Scholar]

[B24] 24. Ceriani L., Verme P.. The origins of the Gini index: extracts from Variabilità e Mutabilità (1912) by Corrado Gini. J. Econ. Inequal. 2012; 10:421–443. [Google Scholar]

[B25] 25. O’Hagan S., Muelas M.W., Day P.J., Lundberg E., Kell D.B.. GeneGini: Assessment via the gini coefficient of reference “Housekeeping” genes and diverse human transporter expression profiles. Cell Syst. 2018; 6:230–244. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B26] 26. Cenik C., Cenik E.S., Byeon G.W., Grubert F., Candille S.I., Spacek D., Alsallakh B., Tilgner H., Araya C.L., Tang H.. Integrative analysis of RNA, translation and protein levels reveals distinct regulatory variation across humans. Genome Res. 2015; 25:1610–1621. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B27] 27. Gorr T.A., Vogel J.. Western blotting revisited: critical perusal of underappreciated technical issues. Proteomics. 2015; 9:396–405. [DOI] [PubMed] [Google Scholar]

[B28] 28. Vandesompele J., De Preter K., Pattyn F., Poppe B., Van Roy N., De Paepe A., Speleman F.. Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes. Genome Biol. 2002; 3:doi:10.1186/gb-2002-3-7-research0034. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B29] 29. Wiśniewski J.R., Mann M.. A proteomics approach to the protein normalization problem: selection of unvarying proteins for MS-based proteomics and western blotting. J. Proteome Res. 2016; 15:2321–2326. [DOI] [PubMed] [Google Scholar]

[B30] 30. Eisenberg E., Levanon E.Y.. Human housekeeping genes, revisited. Trends Genet. 2013; 29:569–574. [DOI] [PubMed] [Google Scholar]

[B31] 31. Ölander M., Wiśniewski J.R., Matsson P., Lundquist P., Artursson P.. The proteome of filter-grown Caco-2 cells with a focus on proteins involved in drug disposition. J. Pharm. Sci. 2016; 105:817–827. [DOI] [PubMed] [Google Scholar]

[B32] 32. Thomas P.D., Campbell M.J., Kejariwal A., Mi H., Karlak B., Daverman R., Diemer K., Muruganujan A., Narechania A.. PANTHER: a library of protein families and subfamilies indexed by function. Genome Res. 2003; 13:2129–2141. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B33] 33. Jadot M., Boonen M., Thirion J., Wang N., Xing J., Zhao C., Tannous A., Qian M., Zheng H., Everett J.K.. Accounting for protein subcellular localization: a compartmental map of the rat liver proteome. Mol. Cell. Proteomics. 2017; 16:194–212. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B34] 34. Settembre C., Ballabio A.. Lysosome: regulator of lipid degradation pathways. Trends Cell Biol. 2014; 24:743–750. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B35] 35. Blomen V.A., Májek P., Jae L.T., Bigenzahn J.W., Nieuwenhuis J., Staring J., Sacco R., van Diemen F.R., Olk N., Stukalov A.. Gene essentiality and synthetic lethality in haploid human cells. Science. 2015; 350:1092–1096. [DOI] [PubMed] [Google Scholar]

[B36] 36. Zecha J., Meng C., Zolg D.P., Samaras P., Wilhelm M., Kuster B.. Peptide level turnover measurements enable the study of proteoform dynamics. Mol. Cell. Proteomics. 2018; 17:974–992. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B37] 37. Kozlowski L.P. Proteome-pI: proteome isoelectric point database. Nucleic Acids Res. 2016; 45:D1112–D1116. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B38] 38. Giurgiu M., Reinhard J., Brauner B., Dunger-Kaltenbach I., Fobo G., Frishman G., Montrone C., Ruepp A.. CORUM: the comprehensive resource of mammalian protein complexes—2019. Nucleic Acids Res. 2019; 47:D559–D563. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B39] 39. Mathieson T., Franken H., Kosinski J., Kurzawa N., Zinn N., Sweetman G., Poeckel D., Ratnu V.S., Schramm M., Becher I.. Systematic analysis of protein turnover in primary cells. Nat. Commun. 2018; 9:689. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B40] 40. Gutiérrez-Sacristán A., Bravo À., Centeno E., Sanz F., Piñero J., García-García J., Deu-Pons J., Queralt-Rosinach N., Furlong L.I.. DisGeNET: a comprehensive platform integrating information on human disease-associated genes and variants. Nucleic Acids Res. 2016; 45:D833–D839. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B41] 41. Ashburner M., Ball C.A., Blake J.A., Botstein D., Butler H., Cherry J.M., Davis A.P., Dolinski K., Dwight S.S., Eppig J.T.. Gene Ontology: tool for the unification of biology. Nat. Genet. 2000; 25:25–29. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B42] 42. Gregor M.F., Hotamisligil G.S.. Inflammatory mechanisms in obesity. Annu. Rev. Immunol. 2011; 29:415–445. [DOI] [PubMed] [Google Scholar]

[B43] 43. Geyer P.E., Kulak N.A., Pichler G., Holdt L.M., Teupser D., Mann M.. Plasma proteome profiling to assess human health and disease. Cell Syst. 2016; 2:185–195. [DOI] [PubMed] [Google Scholar]

[B44] 44. Zhao M., Li M., Yang Y., Guo Z., Sun Y., Shao C., Li M., Sun W., Gao Y.. A comprehensive analysis and annotation of human normal urinary proteome. Sci. Rep. 2017; 7:3024. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B45] 45. Plevy S., Silverberg M.S., Lockton S., Stockfisch T., Croner L., Stachelski J., Brown M., Triggs C., Chuang E., Princen F. et al.. Combined serological, genetic, and inflammatory markers differentiate non-IBD, Crohn's disease, and ulcerative colitis patients. Inflamm. Bowel Dis. 2013; 19:1139–1148. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B46] 46. Danhof M. Systems pharmacology – Towards the modeling of network interactions. Eur. J. Pharm. Sci. 2016; 94:4–14. [DOI] [PubMed] [Google Scholar]

[B47] 47. Lin L., Yee S.W., Kim R.B., Giacomini K.M.. SLC transporters as therapeutic targets: emerging opportunities. Nat. Rev. Drug Discovery. 2015; 14:543. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B48] 48. Vasiliou V., Vasiliou K., Nebert D.W.. Human ATP-binding cassette (ABC) transporter family. Hum. Genomics. 2009; 3:281. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B49] 49. Williams J.A., Hyland R., Jones B.C., Smith D.A., Hurst S., Goosen T.C., Peterkin V., Koup J.R., Ball S.E.. Drug-drug interactions for UDP-glucuronosyltransferase substrates: a pharmacokinetic explanation for typically observed low exposure (AUCi/AUC) ratios. Drug Metab. Dispos. 2004; 32:1201–1208. [DOI] [PubMed] [Google Scholar]

[B50] 50. Zanger U.M., Schwab M.. Cytochrome P450 enzymes in drug metabolism: regulation of gene expression, enzyme activities, and impact of genetic variation. Pharmacol. Therap. 2013; 138:103–141. [DOI] [PubMed] [Google Scholar]

[B51] 51. Wessely F., Bartl M., Guthke R., Li P., Schuster S., Kaleta C.. Optimal regulatory strategies for metabolic pathways in Escherichia coli depending on protein costs. Mol. Syst. Biol. 2011; 7:515. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B52] 52. Buttgereit F., Brand M.D.. A hierarchy of ATP-consuming processes in mammalian cells. Biochem. J. 1995; 312:163–167. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B53] 53. Khan Z., Ford M.J., Cusanovich D.A., Mitrano A., Pritchard J.K., Gilad Y.. Primate transcript and protein expression levels evolve under compensatory selection pressures. Science. 2013; 342:1100–1104. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B54] 54. Liu Y., Beyer A., Aebersold R.. On the dependency of cellular protein levels on mRNA abundance. Cell. 2016; 165:535–550. [DOI] [PubMed] [Google Scholar]

[B55] 55. Gabrielsson B.G., Olofsson L.E., Sjögren A., Jernås M., Elander A., Lönn M., Rudemo M., Carlsson L.M.S.. Evaluation of reference genes for studies of gene expression in human adipose tissue. Obes. Res. 2005; 13:649–652. [DOI] [PubMed] [Google Scholar]

[B56] 56. Välikangas T., Suomi T., Elo L.L.. A systematic evaluation of normalization methods in quantitative label-free proteomics. Brief. Bioinform. 2016; 19:1–11. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B57] 57. Kmiec Z. Cooperation of liver cells in health and disease. Adv. Anat. Embryol. Cell Biol. 2001; 161:1–151. [DOI] [PubMed] [Google Scholar]

[B58] 58. Tortora G.J., Derrickson B.H.. Principles of Anatomy and Physiology. 2008; Hoboken: John Wiley & Sons. [Google Scholar]

[B59] 59. Ding C., Li Y., Guo F., Jiang Y., Ying W., Li D., Yang D., Xia X., Liu W., Zhao Y. et al.. A cell-type-resolved liver proteome. Mol. Cell. Proteomics. 2016; 15:3190. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B60] 60. Lown K.S., Kolars J.C., Thummel K.E., Barnett J.L., Kunze K.L., Wrighton S.A., Watkins P.B.. Interpatient heterogeneity in expression of CYP3A4 and CYP3A5 in small bowel. Lack of prediction by the erythromycin breath test. Drug Metab. Dispos. 1994; 22:947–955. [PubMed] [Google Scholar]

[B61] 61. Graham M.F., Diegelmann R.F., Elson C.O., Lindblad W.J., Gotschalk N., Gay S., Gay R.. Collagen content and types in the intestinal strictures of Crohn's disease. Gastroenterology. 1988; 94:257–265. [DOI] [PubMed] [Google Scholar]

[B62] 62. Hounnou G., Destrieux C., Desme J., Bertrand P., Velut S.. Anatomical study of the length of the human intestine. Surg. Radiol. Anat. 2002; 24:290–294. [DOI] [PubMed] [Google Scholar]

[B63] 63. Drozdzik M., Busch D., Lapczuk J., Müller J., Ostrowski M., Kurzawski M., Oswald S.. Protein abundance of clinically relevant drug-metabolizing enzymes in the human liver and intestine: a comparative analysis in paired tissue specimens. Clin. Pharmacol. Ther. 2018; 104:515–524. [DOI] [PubMed] [Google Scholar]

[B64] 64. Wiśniewski J.R., Wegler C., Artursson P.. Multiple-enzyme-digestion strategy improves accuracy and sensitivity of label- and standard-free absolute quantification to a level that is achievable by analysis with stable isotope-labeled standard spiking. J. Proteome Res. 2019; 18:217–224. [DOI] [PubMed] [Google Scholar]

[B65] 65. Moore S.M., Hess S.M., Jorgenson J.W.. Extraction, enrichment, solubilization, and digestion techniques for membrane proteomics. J. Proteome Res. 2016; 15:1243–1252. [DOI] [PMC free article] [PubMed] [Google Scholar]

[B66] 66. Moor A.E., Harnik Y., Ben-Moshe S., Massasa E.E., Rozenberg M., Eilam R., Bahar Halpern K., Itzkovitz S.. Spatial reconstruction of single enterocytes uncovers broad zonation along the intestinal villus axis. Cell. 2018; 175:1156–1167. [DOI] [PubMed] [Google Scholar]

[B67] 67. Alemu E.Y., Carl J.W. Jr, Corrada Bravo H., Hannenhalli S.. Determinants of expression variability. Nucleic Acids Res. 2014; 42:3503–3514. [DOI] [PMC free article] [PubMed] [Google Scholar]
