
Keywords: aquaporin-2, Bayes’ theorem, collecting duct, kidney, transcription
Abstract
Aquaporin-2 (Aqp2) gene transcription is strongly regulated by vasopressin in the renal collecting duct. However, the transcription factors (TFs) responsible for the regulation of expression of Aqp2 remain largely unknown. We used Bayes’ theorem to integrate several -omics data sets to stratify the 1,344 TFs present in the mouse genome with regard to probabilities of regulating Aqp2 gene transcription. Also, we carried out new RNA sequencing experiments mapping the time course of vasopressin-induced changes in the transcriptome of mpkCCD cells to identify TFs that change in tandem with Aqp2. The analysis identified 17 of 1,344 TFs that are most likely to be involved in the regulation of Aqp2 gene transcription. These TFs included eight that have been proposed in prior studies to play a role in Aqp2 regulation, viz., Cebpb, Elf1, Elf3, Ets1, Jun, Junb, Nfkb1, and Sp1. The remaining nine represent new candidates for future studies (Atf1, Irf3, Klf5, Klf6, Mef2d, Nfyb, Nr2f6, Stat3, and Nr4a1). Conspicuously absent is CREB (Creb1), which has been widely proposed to mediate vasopressin-induced regulation of Aqp2 gene transcription (Nielsen S, Frokiaer J, Marples D, Kwon TH, Agre P, Knepper MA. Physiol Rev 82: 205–244, 2002; Kortenoeven ML, Fenton RA. Biochim Biophys Acta 1840: 1533–1549, 2014; Bockenhauer D, Bichet DG. Nat Rev Nephrol 11: 576–588, 2015; Pearce D, Soundararajan R, Trimpert C, Kashlan OB, Deen PM, Kohan DE. Clin J Am Soc Nephrol 10: 135–146, 2015). Instead, another CREB-like TF, Atf1, ranked fourth among all TFs. RNA sequencing time-course experiments showed a rapid increase in Aqp2 mRNA, within 3 h of vasopressin exposure. This response was matched by an equally rapid increase in the abundance of the mRNA coding for Cebpb, which we have shown by chromatin immunoprecipitation-sequencing studies to bind downstream from the Aqp2 gene. The identified TFs provide a roadmap for future studies to understand regulation of Aqp2 gene expression.
NEW & NOTEWORTHY Abetted by the advent of systems biology-based (“-omics”) techniques in the 21st century, there has been a massive expansion of published data relevant to virtually every physiological question. The authors have developed a large-scale data integration approach based on the application of Bayes’ theorem. In the current work, they integrated 12 different -omics data sets to identify the transcription factors most likely to mediate vasopressin-dependent regulation of transcription of the aquaporin-2 gene.
INTRODUCTION
The aquaporin-2 gene (Aqp2; see APPENDIX for nomenclature) is selectively expressed in three related renal epithelial cell types: collecting duct principal cells, connecting tubule cells, and inner medullary collecting duct cells (1, 2). It is regulated in two major ways by vasopressin: 1) short-term regulation by trafficking of AQP2-containing vesicles to and from the apical plasma membrane (3), which occurs seconds to minutes after vasopressin exposure and 2) long-term regulation by control of transcription of the Aqp2 gene, which occurs hours to days after vasopressin exposure (4). Vasopressin is a neurohypophyseal hormone that regulates both processes by binding to the V2 receptor, which signals largely through Gαs, adenylyl cyclase 6, cAMP, and activation of the two PKA catalytic proteins (PKA-Cα and PKA-Cβ) (5). Although both regulatory processes have been previously studied, in aggregate we have much less information about the latter process (transcriptional regulation of the Aqp2 gene), which appears to be important in a variety of water balance disorders (6). For example, lithium-induced nephrogenic diabetes insipidus is associated with loss of AQP2 protein from collecting duct cells (7) due to a marked reduction in Aqp2 gene transcription (8). It is important to understand how vasopressin regulates Aqp2 gene transcription under normal circumstances to understand the pathomechanisms involved in water balance disorders. Long-term regulation of AQP2 is also seen in an immortalized cell line, mpkCCD (9).
Transcriptional control is accomplished in large part by DNA-binding transcription factors (TFs) that bind to enhancers or promoters vicinal to the regulated gene to control transcriptional initiation or elongation. During interphase, the genome is organized into a series of loops, each formed from interactions between two CTCF (CCCTC-binding factor) molecules that bind at different genomic sites (10). There are ∼13,000 such loops covering the whole genome, with a median length of 190 kb (10). The CTCF loop containing the Aqp2 gene is 99 kb in length and contains three genes: Faim2, Aqp2, and Aqp5 (11). In most cases, the enhancers containing the TF-binding sites that regulate a given gene are found within the same CTCF loop as that gene (10).
There are more than 1,300 genes in mammalian genomes that code for TFs, and identification of the few that regulate Aqp2 transcription is a challenge. Several review papers have suggested that the TF CREB (i.e., Creb1) may be involved in mediating the effect of vasopressin on Aqp2 transcription (6, 12–14). However, chromatin immunoprecipitation-sequencing (ChIP-Seq) studies identifying Creb1-binding sites along the genome in collecting duct cells showed no binding within 390 kb of the Aqp2 gene body, despite strong peaks at previously documented Creb1-binding sites, indicating that other TFs are likely to be involved (11). Several additional candidate TFs for Aqp2 transcriptional regulation have been identified. For example, studies have pointed to Cebpb (11), Elf3 (15, 16), Ets1 (15, 17), Jun (15, 18), Junb (17, 18), Nfat5 (15, 18–21), Nfkb1 (17, 18, 22, 23), Nr1h2 (24), and Sp1 (25). Thorough studies of the roles of specific TFs would optimally be conducted using several contemporary methods, e.g., CRISPR/Cas9 modification of putative binding sites, CRISPR/Cas9-mediated deletion of candidate TFs, and ChIP-Seq analysis to identify TF-binding sites. These methods, although powerful, require substantial time and resources, making it important to prioritize the 1,300+ TFs with regard to their likelihood of playing roles in the regulation of Aqp2 transcription. We previously introduced a method for such prioritization, namely, probabilistic Bayesian analysis, in studies to identify kinases most likely to phosphorylate AQP2 (26, 27). Applied to TFs, such an analysis can take into account many types of existing -omics data to identify a short list of TFs for future study. We structured this analysis in three stages. First, we used -omics data from multiple sources to identify the TFs that are expressed in AQP2-expressing cells of the kidney. Second, we used published ChIP-Seq and ATAC sequencing (ATAC-Seq) data to identify the expressed TFs that are most likely to bind within the Aqp2 CTCF loop. Third, we used additional -omics data to identify which of these candidate TFs are regulated by vasopressin. In addition, we added new RNA sequencing (RNA-Seq) data mapping the time course of vasopressin-induced changes in the transcriptome of mouse mpkCCD cells as an additional means of identifying critical vasopressin-regulated TFs.
METHODS
Bioinformatics
Use of Bayes’ theorem for large-scale data integration.
To determine the TFs most likely to regulate transcription of the Aqp2 gene in the renal cortical collecting duct, we used data from multiple sources to rank all mammalian TFs using Bayes’ theorem (27, 28) (see Supplemental Dataset S1 for details of calculations; all Supplemental Material is available at https://esbl.nhlbi.nih.gov/Databases/TFBayesSuppData/). Bayes’ theorem (Fig. 1A) can be stated as follows: P(A|B) = P(B|A) × P(A)/P(B), where P(A|B) is the probability of A given B, P(B|A) is the probability of B given A, P(A) is the prior probability for A, and P(B) is the sum of probabilities of B over all A (28). P(B|A) represents new experimental data being integrated at a given step after mapping it to probability (“likelihood”) values (range: 0−1). For this mapping, except where explicitly stated, we used complements of minimum Bayes’ factors (see the next section of METHODS) (28). Thus, as shown in Fig. 1B, Bayes’ theorem can be viewed in the context of linear algebra as an “operator” that takes a set of “prior probabilities” [P(A) vector] and a set of new data likelihood values [P(B|A) vector] and generates a set of new “posterior probabilities” [P(A|B) vector]. This operator can be applied sequentially to integrate multiple datasets, using posterior probabilities from one step as prior probabilities in the next step. The operator is commutative; thus, the order of data integration does not affect the final values. Overall, Bayes’ theorem provides a systematic means of using existing data to provide answers to biological questions via a process that simulates the way that humans normally integrate multiple pieces of information. In this report, we used it to address the following three questions sequentially: 1) what TFs are expressed in AQP2-expressing collecting duct cells? 2) what TFs bind to the CTCF loop surrounding the Aqp2 gene? and 3) what TFs mediate the effect of vasopressin signaling to increase Aqp2 gene transcription?
Figure 1.
Bayes’ theorem. A: statement of Bayes’ theorem with definition of terms. B: representation of Bayes’ theorem as a mathematical operator. P(A|B) is the probability of A given B, P(B|A) is the probability of B given A, P(A) is the prior probability for A, and P(B) is the sum of probabilities of B over all A.
For these calculations (Supplemental Data sets S1 and S2), we started with all 1,344 TFs (Supplemental Data set S4), assigning them the same prior probabilities P(A) of 1/1,344 and used Bayes’ theorem to update the values from sets of likelihood values, P(B | A), based on different experimental datasets.
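The operator view of Bayes' theorem described above can be illustrated with a minimal Python sketch. It assumes that the posterior vector is normalized over all 1,344 TFs, so that P(B) is the sum of likelihood × prior; the function name `bayes_update` and the randomly generated likelihood vectors are hypothetical stand-ins, not the values in the Supplemental Data sets.

```python
import numpy as np

def bayes_update(prior, likelihood):
    """Elementwise Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    prior = np.asarray(prior, dtype=float)
    likelihood = np.asarray(likelihood, dtype=float)
    unnormalized = likelihood * prior
    # P(B) is the sum of P(B|A) * P(A) over all candidate TFs (assumption).
    return unnormalized / unnormalized.sum()

n_tfs = 1344
prior = np.full(n_tfs, 1.0 / n_tfs)  # same prior for every TF

# Hypothetical likelihood vectors standing in for two -omics data sets.
rng = np.random.default_rng(0)
lik_a = rng.uniform(0.5, 1.0, n_tfs)
lik_b = rng.uniform(0.5, 1.0, n_tfs)

# Sequential integration: the posterior of one step is the prior of the next.
post_ab = bayes_update(bayes_update(prior, lik_a), lik_b)
post_ba = bayes_update(bayes_update(prior, lik_b), lik_a)

# Ratio of posterior to the initial prior, as reported in the tables.
fold_increase = post_ab / prior
```

In this sketch, `post_ab` and `post_ba` agree to within floating-point error, which is the commutativity property noted in the text.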
Calculating likelihood values from the data.
In the case of quantitative data, we used minimum Bayes’ factors to assign likelihood values based on Goodman (29) and Held (30). Specifically, we used the complement of the minimum Bayes’ factor to calculate likelihoods, that is, $1 - \exp[-(Z^*)^2/2]$, for each TF, where $Z^*$ is the ratio of each value to the intrinsic noise in the measurement (Supplemental Data sets S1 and S2). The overall approach is “unbiased,” meaning that each of the 1,344 TFs is assumed to have the same prior probability before being updated with experimental data.
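As a rough illustration of this mapping, the following Python sketch converts a measured value into a likelihood using the complement of the minimum Bayes' factor. The function name and the way the floor is applied are assumptions: the data-source tables list a "Minimum Likelihood" of 0.5, and the sketch interprets this as a lower bound on the returned value.

```python
import math

def likelihood_from_value(value, noise, floor=0.5):
    """Complement of the minimum Bayes' factor, 1 - exp(-(Z*)^2 / 2).

    Z* is the ratio of the measured value to the intrinsic noise.
    The floor (assumed here to be a lower bound of 0.5) keeps weak or
    missing evidence from driving a TF's probability toward zero.
    """
    z_star = value / noise
    complement_mbf = 1.0 - math.exp(-(z_star ** 2) / 2.0)
    return max(floor, complement_mbf)

# Hypothetical examples with a noise threshold of TPM = 1.
strong = likelihood_from_value(value=2.0, noise=1.0)  # 1 - exp(-2)
weak = likelihood_from_value(value=0.5, noise=1.0)    # clipped to the floor
```

A value at twice the noise level maps to a likelihood of about 0.86, whereas a value at half the noise level falls back to the floor of 0.5.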
To assign likelihood values corresponding to qualitative enhancer/promoter mapping for the Aqp2 CTCF loop, we integrated ATAC-Seq, H3K27Ac ChIP-Seq, and RNA polymerase II ChIP-Seq data as previously described in Jung et al. (11) and Wen et al. (31). The mapped enhancers and promoter regions were analyzed using MEME (32) to identify TF-binding motifs.
Functional annotations of the top-ranked TFs were obtained from the “[FUNCTION]” field of individual UniProt Protein records as well as through use of “Biological Information Gatherer - Kidney” (BIG; see https://big.nhlbi.nih.gov/index.jsp) (33) to extract information from previous studies of renal collecting duct cells.
RNA-Seq Transcriptomics
Cell culture and total RNA isolation.
The AQP2-expressing mpkCCD clonal cell line (clone 11) was grown on membrane supports (Transwell, Millipore-Sigma Product CLS-3450, 0.4-µm pore size). mpkCCD cells were previously recloned in our laboratory (clone11-38) as previously described (5, 17) from collecting duct cells that were originally immortalized and cloned by Duong Van Huyen et al. (34). mpkCCD cells were used at passages 10 and 11. Cells were seeded on the membrane supports and grown in a complete medium: DMEM-F-12 containing 2% FBS and other supplements (5 μg/mL insulin, 50 nM dexamethasone, 1 nM triiodothyronine, 10 ng/mL epidermal growth factor, 60 nM sodium selenite, and 5 μg/mL transferrin) (11, 17) for 6 days before experiments. Twenty-four hours before the beginning of the experiments, the serum was removed and cells were grown in a defined medium (DMEM-F-12) containing sodium selenite and transferrin. DMEM-F-12 contains a broad range of nutrients including all 21 amino acids, 10 vitamins, glucose, pyruvate, fatty acids, zinc, and iron among other components (see https://www.thermofisher.com/us/en/home/technical-resources/media-formulation.54.html for the full composition). The V2 receptor-selective vasopressin analog desmopressin [dDAVP (0.1 nM)] or vehicle was added to the lower chamber (basolateral side) for different times (3, 6, 12, and 24 h, respectively). Each time point had its own vehicle controls. For RNA isolation, cells were lysed with TRIzol (TRI Reagent, R2050-1-50, Zymo Research). Total RNA was isolated using a Direct-zol RNA Miniprep Kit (R2070, Zymo Research) following the manufacturer’s protocol. To avoid genomic DNA contamination, DNase I was applied during RNA isolation. The isolated total RNA was quantified using a Qubit Fluorometer (Invitrogen) and stored at −80°C until library preparation.
Time-course transcriptome profiling using RNA-Seq.
cDNA was synthesized from total RNA using a SMART-Seq HT Kit (No. 634437, Takara Bio) as described in the manufacturer’s instructions. The synthesized cDNA (300 pg) was used for cDNA library preparation. cDNA libraries were generated using a Nextera XT DNA Library Preparation Kit (FC-131–1096, Illumina) as described in the manufacturer’s instructions. Libraries were quantified using the Qubit Fluorometer system and 2100 Agilent Bioanalyzer (Agilent) and sequenced as 2 × 50 bp (paired-end) on a HiSeq3000 (Illumina). Data from two technical replicates of each biological replicate were concatenated for further downstream data processing. Transcript level quantification using a pseudo-alignment quantification method (“Salmon”, 0.14.10) was performed to calculate transcript abundance in each sample using the “Ensembl” reference genome GRCm38.p6. Differential expression analysis between vehicle- and dDAVP-treated groups was then performed using “edgeR” followed by identification of differentially expressed genes (DEGs) for each time point comparison (dDAVP vs. vehicle, false discovery rate < 0.05).
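The false discovery rate cutoff used to call DEGs can be sketched as follows. The text does not state which adjustment was applied; the sketch assumes the Benjamini-Hochberg procedure, which is the default multiple-testing adjustment in edgeR, and uses hypothetical P values rather than data from this study.

```python
import numpy as np

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values (false discovery rates)."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale each sorted P value by n / rank.
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest P value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted

# Hypothetical per-gene P values for one dDAVP-versus-vehicle comparison.
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
fdr = benjamini_hochberg(p_values)
is_deg = fdr < 0.05  # genes passing the false discovery rate threshold
```

With these illustrative inputs, only the first two genes pass the 0.05 threshold, although four of the five raw P values are below 0.05.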
Data availability.
Raw fastq files and raw count information from the RNA-Seq analysis were deposited in the Gene Expression Omnibus (GEO; GSE163566).
RESULTS
A total of 1,344 TFs are represented in the mouse genome (35). To identify which of these 1,344 TFs are expressed in AQP2-expressing collecting duct cells, we integrated transcriptomic and proteomic expression data from multiple sources (Table 1) using Bayes’ theorem (Fig. 1). The expression data were from native collecting duct cells isolated or dissected from mouse or rat kidneys (36–38) and from cultured vasopressin-sensitive mpkCCD cells, which manifest transcriptional regulation of Aqp2 in response to vasopressin (17, 39, 40). Bayes’ analysis identified 102 TFs that are expressed in collecting duct cells with high probability (Table 2). (The full calculations are provided in Supplemental Table S1). This list includes TFs from multiple TF families. Some are newly identified with respect to potential roles in the renal collecting duct, but others have been previously investigated, notably Cebpb (11), Elf3 (15, 16), Ets1 (15, 17), Jun (15, 18), Junb (17, 18), Nfat5 (15, 18–21), Nfkb1 (17, 18, 22, 23), Nr1h2 (24), and Sp1 (25). Interestingly, the mineralocorticoid receptor, Nr3c2, is absent from this list because it is not expressed in mpkCCD cells. This implies that the mineralocorticoid receptor is not necessary for vasopressin-mediated regulation of Aqp2 gene transcription in mpkCCD cells.
Table 1.
Data sources for Bayes’ identification of TFs expressed in Aqp2-expressing collecting duct cells
| PMID | Data Sets | Rationale | Assignment of Likelihood Values | Noise Threshold | Minimum Likelihood |
|---|---|---|---|---|---|
| 19190182 (17) | Mouse mpkCCD transcriptome (microarray) | The ability of a TF to regulate Aqp2 transcription depends on whether or not it is expressed in collecting duct cells | Complements of minimum Bayes’ factors = $1 - \exp[-(Z^*)^2/2]$ for each TF, where $Z^*$ is the ratio of each value to the intrinsic noise in the measurement | Normalized intensity = 0.4 | 0.5 |
| 28973931 (5) | Mouse mpkCCD transcriptome (RNA-Seq) | The ability of a TF to regulate Aqp2 transcription depends on whether or not it is expressed in collecting duct cells | Complements of minimum Bayes’ factors = $1 - \exp[-(Z^*)^2/2]$ for each TF, where $Z^*$ is the ratio of each value to the intrinsic noise in the measurement | TPM = 1 | 0.5 |
| 28973931 (5) | Mouse mpkCCD proteome (SILAC) | The ability of a TF to regulate Aqp2 transcription depends on whether or not it is expressed in collecting duct cells | Complements of minimum Bayes’ factors = $1 - \exp[-(Z^*)^2/2]$ for each TF, where $Z^*$ is the ratio of each value to the intrinsic noise in the measurement | PSM/MW = 0.01 | 0.5 |
| 30826016 (36) | Transcriptome from the native mouse cortical collecting duct (RNA-Seq) | The ability of a TF to regulate Aqp2 transcription depends on whether or not it is expressed in collecting duct cells | Complements of minimum Bayes’ factors = $1 - \exp[-(Z^*)^2/2]$ for each TF, where $Z^*$ is the ratio of each value to the intrinsic noise in the measurement | TPM = 1 | 0.5 |
| 17956998 (37) | Rat native inner medullary collecting duct transcriptome (microarray) | The ability of a TF to regulate Aqp2 transcription depends on whether or not it is expressed in collecting duct cells | Complements of minimum Bayes’ factors = $1 - \exp[-(Z^*)^2/2]$ for each TF, where $Z^*$ is the ratio of each value to the intrinsic noise in the measurement | Normalized intensity = 1 | 0.5 |
| 25817355 (38) | Transcriptome from the microdissected rat connecting tubule + cortical collecting duct + outer medullary collecting duct + inner medullary collecting duct (RNA-Seq) | The ability of a TF to regulate Aqp2 transcription depends on whether or not it is expressed in collecting duct cells | Complements of minimum Bayes’ factors = $1 - \exp[-(Z^*)^2/2]$ for each TF, where $Z^*$ is the ratio of each value to the intrinsic noise in the measurement | RPKM = 0.5 | 0.5 |
Aqp2, aquaporin-2; RNA-Seq, RNA sequencing; SILAC, stable isotope labeling by amino acids in cell culture; TFs, transcription factors.
Table 2.
TFs expressed in AQP2-expressing collecting duct cells*
| TF | TF Family | Collecting Duct Expression Probability (Fold Increase)† | TF | TF Family | Collecting Duct Expression Probability (Fold Increase)† | TF | TF Family | Collecting Duct Expression Probability (Fold Increase)† |
|---|---|---|---|---|---|---|---|---|
| Ahr | bHLH | 2.88 | Ehf | ETS | 2.88 | Rxrb | Nuclear receptor | 2.88 |
| Arnt | bHLH | 2.86 | Elf1 | ETS | 5.72 | Nfyb | NF-YB/C | 8.72 |
| Hes1 | bHLH | 10.19 | Elf3 | ETS | 11.52 | Nfyc | NF-YB/C | 10.39 |
| Hes6 | bHLH | 3.80 | Elk4 | ETS | 2.88 | Pax8 | PAX | 11.53 |
| Hif1a | bHLH | 5.76 | Ets1 | ETS | 5.45 | Nfat5 | RHD | 5.57 |
| Id1 | bHLH | 2.87 | Ets2 | ETS | 2.88 | Nfkb1 | RHD | 8.64 |
| Mlx | bHLH | 4.80 | Gabpa | ETS | 2.88 | Rela | RHD | 5.34 |
| Mxd4 | bHLH | 3.37 | Foxj3 | Fork head | 2.88 | Mef2a | SRF | 2.88 |
| Mxi1 | bHLH | 5.76 | Foxo1 | Fork head | 2.88 | Mef2d | SRF | 2.83 |
| Myc | bHLH | 2.86 | Foxq1 | Fork head | 4.40 | Stat1 | STAT | 5.76 |
| Sim1 | bHLH | 2.92 | Adnp | Homeobox | 9.53 | Stat2 | STAT | 2.88 |
| Srebf1 | bHLH | 4.16 | Emx2 | Homeobox | 5.66 | Stat3 | STAT | 11.20 |
| Srebf2 | bHLH | 5.45 | Hnf1b | Homeobox | 2.88 | Stat6 | STAT | 2.83 |
| Tcf12 | bHLH | 2.88 | Hoxa9 | Homeobox | 2.88 | Tcf7l2 | TCF/LEF | 2.82 |
| Tcf3 | bHLH | 5.76 | Hoxb6 | Homeobox | 2.82 | Tsc22d1 | TSC22 | 11.53 |
| Atf1 | bZIP | 11.50 | Hoxb7 | Homeobox | 2.88 | Tsc22d3 | TSC22 | 4.72 |
| Atf2 | bZIP | 2.88 | Hoxb9 | Homeobox | 5.76 | Tsc22d4 | TSC22 | 2.84 |
| Atf4 | bZIP | 5.76 | Hoxd8 | Homeobox | 2.88 | Ctcf | zf-C2H2 | 10.73 |
| Creb1 | bZIP | 2.88 | Hoxd9 | Homeobox | 5.76 | Dpf2 | zf-C2H2 | 2.88 |
| Crem | bZIP | 2.88 | Pbx1 | Homeobox | 2.88 | Egr1 | zf-C2H2 | 2.88 |
| Dbp | bZIP | 2.87 | Pbx3 | Homeobox | 2.88 | Glis2 | zf-C2H2 | 2.88 |
| Fos | bZIP | 5.70 | Tgif1 | Homeobox | 2.82 | Klf3 | zf-C2H2 | 5.76 |
| Jun | bZIP | 11.53 | Irf3 | IRF | 8.92 | Klf5 | zf-C2H2 | 5.76 |
| Junb | bZIP | 11.42 | Irf6 | IRF | 5.76 | Klf6 | zf-C2H2 | 11.53 |
| Jund | bZIP | 2.88 | Smad1 | MH1 | 5.49 | Maz | zf-C2H2 | 4.98 |
| Nfe2l2 | bZIP | 5.76 | Smad3 | MH1 | 5.76 | Plagl1 | zf-C2H2 | 2.88 |
| Cebpb | C/EBP | 5.75 | Smad4 | MH1 | 5.76 | Sp1 | zf-C2H2 | 5.22 |
| Cebpg | C/EBP | 5.06 | Esrra | Nuclear receptor | 3.21 | Sp3 | zf-C2H2 | 2.88 |
| Ddit3 | C/EBP | 2.88 | Nr1d2 | Nuclear receptor | 2.88 | Yy1 | zf-C2H2 | 2.88 |
| Xbp1 | C/EBP | 2.88 | Nr1h2 | Nuclear receptor | 8.95 | Zfp91 | zf-C2H2 | 3.29 |
| Grhl2 | CP2 | 2.88 | Nr2c2 | Nuclear receptor | 2.88 | Gata3 | zf-GATA | 11.53 |
| Ubp1 | CP2 | 2.88 | Nr2f6 | Nuclear receptor | 7.90 | Mta2 | zf-GATA | 2.88 |
| Nfib | CTF/NFI | 9.54 | Nr3c1 | Nuclear receptor | 2.88 | Rere | zf-GATA | 5.72 |
| Tfdp1 | E2F | 2.88 | Rxra | Nuclear receptor | 3.42 | Nfx1 | zf-NF-X1 | 4.78 |
*Bayesian analysis identified 102 transcription factors (TFs) expressed in aquaporin-2 (AQP2)-expressing collecting duct cells among the 1,344 TFs in the mouse genome. †Ratio of posterior probability to prior probability from application of Bayes’ theorem using data from Table 1.
The TFs shown in Table 2 are likely to play roles in the regulation of collecting duct function but are not necessarily involved in the regulation of Aqp2 transcription. To identify which of the 102 TFs shown in Table 2 are most likely to be involved in the regulation of Aqp2 transcription, we used prior data from ATAC-Seq (11), histone H3K27Ac ChIP-Seq (11), and RNA polymerase II ChIP-Seq (39) to identify regions of high DNA accessibility corresponding to likely cis regulatory elements (enhancers and promoters) in the Aqp2 CTCF loop (Fig. 2). The DNA sequences in these high open probability regions were analyzed to identify TF-binding motifs that map to the TFs shown in Table 2. This identified 74 TFs (Supplemental Table S2) that may bind within the Aqp2 CTCF loop. Table 3 shows the top 33 of these TFs with the highest likelihood ranking from the Bayesian analysis.
Figure 2.

Identification of cis regulatory elements (enhancers and promoters) in the Aqp2 CTCF loop. Three genes are present, namely, Faim2, Aqp2, and Aqp5 (top) within a topologically associating domain of the Aqp2 gene. ENCODE data for CTCF-binding sites and ATAC sequencing (ATAC-Seq) in the whole kidney are shown (https://www.encodeproject.org/). Below that are open probability values as defined by Jung et al. (11) using ATAC-Seq, histone H3K27Ac chromatin immunoprecipitation-sequencing (ChIP-Seq), and RNA polymerase II ChIP-Seq data in mouse collecting duct mpkCCD cells. Colored bars show six putative cis regulatory elements that contained consensus sequences mapping to specific transcription factors grouped by JASPAR motif clusters (http://jaspar.genereg.net/matrix-clusters/) (bottom). Aqp, aquaporin.
Table 3.
The 33 transcription factors with the greatest likelihood of binding to the aquaporin-2 CTCF loop
| Rank | Gene Symbol | Annotation | Class | Binding Region (Fig. 2) | Probability Ratio (Posterior/ Initial Prior) |
|---|---|---|---|---|---|
| 1 | Klf6 | Krueppel-like factor 6 | zf-C2H2 | 1,2,3,4,5,6 | 16.62 |
| 2 | Jun | Transcription factor AP-1 | TF_bZIP | 6 | 16.62 |
| 3 | Elf3 | ETS-related transcription factor Elf-3 | ETS | 2,5,6 | 16.62 |
| 4 | Atf1 | cAMP-dependent transcription factor ATF-1 | TF_bZIP | 6 | 16.59 |
| 5 | Junb | Transcription factor jun-B | TF_bZIP | 6 | 16.47 |
| 6 | Stat3 | Signal transducer and activator of transcription 3 | STAT | 1,4,5,6 | 16.16 |
| 7 | Nfyc | Nuclear transcription factor Y subunit-γ | NF-YB/C | 1,2 | 14.98 |
| 8 | Hes1 | Transcription factor HES-1 | bHLH | 5,6 | 14.69 |
| 9 | Nfib | Nuclear factor 1 B-type | CTF/NFI | 2,6 | 13.76 |
| 10 | Nr1h2 | Oxysterols receptor LXR-β | Ecdystd | 6 | 12.90 |
| 11 | Irf3 | Interferon regulatory factor 3 | IRF | 2 | 12.87 |
| 12 | Nfyb | Nuclear transcription factor Y subunit-β | NF-YB/C | 1,2 | 12.58 |
| 13 | Nfkb1 | Nuclear factor NF-κB p105 subunit | RHD | 1,4,5 | 12.46 |
| 14 | Nr2f6 | Nuclear receptor subfamily 2 group F member 6 | COUP | 6 | 11.39 |
| 15 | Atf4 | cAMP-dependent transcription factor ATF-4 | TF_bZIP | 6 | 8.31 |
| 16 | Hif1a | Hypoxia-inducible factor 1-α | Others | - | 8.31 |
| 17 | Irf6 | Interferon regulatory factor 6 | IRF | 2 | 8.31 |
| 18 | Nfe2l2 | Nuclear factor erythroid 2-related factor 2 | TF_bZIP | 6 | 8.31 |
| 19 | Stat1 | Signal transducer and activator of transcription 1 | STAT | 1,2,4,5,6 | 8.31 |
| 20 | Mxi1 | Max-interacting protein 1 | bHLH | 2,3,4,5,6 | 8.31 |
| 21 | Klf5 | Krueppel-like factor 5 | zf-C2H2 | 1,2,3,4,5,6 | 8.31 |
| 22 | Tcf3 | Transcription factor E2-α | bHLH | 2,3,4,5,6 | 8.31 |
| 23 | Klf3 | Krueppel-like factor 3 | zf-C2H2 | 1,2,3,4,5,6 | 8.31 |
| 24 | Smad3 | Mothers against decapentaplegic homolog 3 | MH1 | 6 | 8.31 |
| 25 | Cebpb | CCAAT/enhancer-binding protein-β | C/EBP | 6 | 8.29 |
| 26 | Elf1 | ETS-related transcription factor Elf-1 | ETS | 2,5,6 | 8.26 |
| 27 | Fos | Proto-oncogene c-Fos | TF_bZIP | 6 | 8.23 |
| 28 | Emx2 | Homeobox protein EMX2 | Homeobox | 1,2 | 8.17 |
| 29 | Nfat5 | Nuclear factor of activated T cells 5 | RHD | 1,4,5,6 | 8.04 |
| 30 | Ets1 | Protein C-ets-1 | ETS | 2,5,6 | 7.87 |
| 31 | Srebf2 | Sterol regulatory element-binding protein 2 | bHLH | 1,2,3,4,5,6 | 7.86 |
| 32 | Rela | Transcription factor p65 | RHD | 1,4,5 | 7.71 |
| 33 | Sp1 | Transcription factor Sp1 | zf-C2H2 | 1,2,3,4,5,6 | 7.53 |
There are at least four signaling processes in collecting duct cells that are involved in the regulation of AQP2 abundance: 1) cell proliferative signaling governed largely by the state of activation of the MAPK pathway, which decreases Aqp2 gene expression as seen in the collecting duct response to lithium treatment (8) and during vasopressin escape (41); 2) inflammatory signaling, which is associated with decreased Aqp2 expression (8); 3) collecting duct development leading to cell type specific gene expression, thereby promoting Aqp2 gene expression; and 4) physiological regulation by vasopressin signaling, which increases Aqp2 gene expression. Based on Gene Ontology (GO) term analysis, 13 of the top 33 TFs are associated with proliferative responses (Jun, Junb, Fos, Stat3, Hes1, Nfib, Irf6, Smad3, Cebpb, Emx2, Ets1, Rela, and Sp1). Six TFs are associated with inflammatory signaling (Stat3, Irf3, Irf6, Nfkb1, Rela, and Smad3). Fifteen TFs are associated with development (Atf1, Atf4, Cebpb, Emx2, Elf3, Ets1, Hes1, Irf6, Jun, Junb, Klf3, Klf5, Klf6, Nfib, and Nr2f6). None of these are specifically expressed in the collecting duct (42), and collecting duct-specific expression of AQP2 may be dependent on a specific combination of TFs rather than a single TF (39). Only Gata2 and Gata3 were found to be collecting duct selective among all renal tubule segments (42), and neither ranked in the top 33 TFs shown in Table 3. We then carried out further Bayesian analysis to identify the TFs involved in physiological regulation by vasopressin signaling. We asked “what TFs are most likely to mediate the action of vasopressin signaling on transcription of the Aqp2 gene?” To do this, we extended the Bayesian analysis to datasets characterizing responses to vasopressin in the collecting duct.
TFs Most Likely to Mediate Vasopressin-Induced Increases in Aqp2 Gene Transcription
Here, we carried out further Bayesian analysis to identify the TFs most likely to mediate the action of vasopressin signaling through cAMP and PKA on transcription of the Aqp2 gene. The analysis started with posterior probabilities from the analysis shown in Table 3.
The following mechanisms could be involved in vasopressin-mediated regulation of TFs: 1) phosphorylation or dephosphorylation of TFs; 2) translocation of TFs to or from the nucleus in response to vasopressin; and 3) increasing or decreasing the total abundance of TFs due to regulation of transcription, translation, or degradation by vasopressin. The datasets describing these types of response are shown in Table 4. In addition, we added new data (last two lines in Table 4) reporting the time courses of mRNA changes in mpkCCD cells following the addition of vasopressin. These experiments are described in greater detail below in RNA-Seq analysis of the time course of the vasopressin response. The Bayesian analysis is provided in detail in Supplementary Spreadsheet S3 and is shown in Table 5, listing the top 17 ranked TFs in terms of likelihood of mediating vasopressin-induced regulation of Aqp2 gene transcription. This list updates the rank order of individual TFs from Table 3 and adds two TFs to the main list, viz., Mef2d and Nr4a1. Future studies are needed to test explicitly the roles of these TFs, singly or in combination, for example, through CRISPR/Cas9-mediated deletion of the TFs or their putative binding sites.
Table 4.
Data sources for Bayes’ identification of TFs likely to mediate vasopressin’s effect to increase Aqp2 gene transcription
| PMID | Data Sets | Rationale | Assignment of Likelihood Values | Noise Threshold | Minimum Likelihood |
|---|---|---|---|---|---|
| 32219907 (43) | Phosphoproteomics in mpkCCD, response to vasopressin (https://esbl.nhlbi.nih.gov/Databases/mpkCCD-AVP/) | Phosphorylation changes in TFs in response to vasopressin may play a role in Aqp2 transcription regulation | Complements of minimum Bayes’ factors = $1 - \exp[-(Z^*)^2/2]$ for each TF, where $Z^*$ is the ratio of each value to the intrinsic noise in the measurement | Abs[log2 (dDAVP/vehicle)] = 0.20 | 0.5 |
| 31313956 (44) | Phosphoproteomics in the rat inner medullary collecting duct, response to vasopressin (https://esbl.nhlbi.nih.gov/Databases/IMCD-Phos/) | Phosphorylation changes in TFs in response to vasopressin may play a role in Aqp2 transcription regulation | Complements of minimum Bayes’ factors = $1 - \exp[-(Z^*)^2/2]$ for each TF, where $Z^*$ is the ratio of each value to the intrinsic noise in the measurement | Abs[log2 (dDAVP/vehicle)] = 0.12 | 0.5 |
| 22440904 (45) | Phosphoproteomics of nuclear proteins, response to vasopressin (https://esbl.nhlbi.nih.gov/Databases/QuantNucProteomics/) | Translocation of TFs in response to vasopressin may play a role in Aqp2 transcription regulation | Complements of minimum Bayes’ factors = $1 - \exp[-(Z^*)^2/2]$ for each TF, where $Z^*$ is the ratio of each value to the intrinsic noise in the measurement | Absolute max ratio (dDAVP/control) between NE and NP SD = 0.2 | 0.5 |
| 28973931 (5) | Mouse mpkCCD transcriptome (RNA sequencing) (effect of PKA deletion) (https://esbl.nhlbi.nih.gov/Databases/PKA-KO/) | TFs whose mRNA abundances are altered by PKA deletion may play roles in the vasopressin response | Complements of minimum Bayes’ factors = $1 - \exp[-(Z^*)^2/2]$ for each TF, where $Z^*$ is the ratio of each value to the intrinsic noise in the measurement | Abs[log2 (dKO/control)] = 0.298 (SD) | 0.5 |
| | UniProt annotations of TFs | The three TFs directly regulated by PKA are Creb1, Crem, and Atf1 | If Creb1, Crem, or Atf1: likelihood = 0.8; if not, likelihood = 0.5 | None | 0.5 |
| This paper | Mouse mpkCCD cells, time course of transcriptomic changes, 3-h data | TFs whose mRNA abundances are regulated by vasopressin before a change in abundance in Aqp2 mRNA may play a role in the regulation of Aqp2 transcription | Complements of minimum Bayes’ factors = $1 - \exp[-(Z^*)^2/2]$ for each TF, where $Z^*$ is the ratio of each value to the intrinsic noise in the measurement estimated as the SD of absolute values of log2(dDAVP/control) | Abs[log2(dDAVP/vehicle)] = 0.2 (SD) | 0.5 |
| This paper | Mouse mpkCCD cells, time course of transcriptomic changes, 3-, 6-, 12-, and 24-h data, dot product versus Aqp2 mRNA | TFs whose mRNA abundance changes correlate best with Aqp2 mRNA abundance may play a role in the regulation of Aqp2 transcription | Complements of minimum Bayes’ factors = $1 - \exp[-(Z^*)^2/2]$ for each TF, where $Z^*$ is the ratio of each dot product to the intrinsic noise in the measurement estimated as the SD of absolute values of all dot products | Dot product = 4.2 (SD) | 0.5 |
Aqp2, aquaporin-2; dDAVP, desmopressin; TF, transcription factor.
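The likelihood function listed for most data sets in Table 4, the complement of the minimum Bayes’ factor, can be illustrated with a minimal sketch. The function name `mbf_complement` and the three example values are hypothetical and are not taken from the data; the noise SD of 0.2 is the value listed for the 3-h time-course data.

```python
import math

def mbf_complement(value, noise_sd):
    """Return 1 - exp(-(Z*)^2 / 2), where Z* = value / noise_sd."""
    z_star = value / noise_sd
    return 1.0 - math.exp(-(z_star ** 2) / 2.0)

# Hypothetical Abs[log2(dDAVP/vehicle)] values for three TFs (illustrative only).
examples = {"TF_a": 0.0, "TF_b": 0.2, "TF_c": 0.6}

# Noise SD of 0.2 as listed in Table 4 for the 3-h time-course data.
likelihoods = {tf: mbf_complement(v, noise_sd=0.2) for tf, v in examples.items()}

for tf, likelihood in likelihoods.items():
    print(f"{tf}: likelihood = {likelihood:.3f}")
```

Under this function, a change equal to the noise SD (Z* = 1) yields a likelihood of about 0.39, whereas a change three times the noise SD yields a likelihood of about 0.99, so only changes that clearly exceed the measurement noise are weighted heavily.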
Table 5.
The 17 transcription factors with the greatest likelihood of mediating vasopressin-dependent increases in aquaporin-2 gene transcription
| Rank | Gene Symbol | Annotation | Class | Binding Region (Fig. 2) | Probability Ratio (Posterior/Initial Prior) |
|---|---|---|---|---|---|
| 1 | Elf3 | ETS-related transcription factor Elf-3 | ETS | 2,5,6 | 45.4 |
| 2 | Cebpb | CCAAT/enhancer-binding protein-β | C/EBP | 6 | 35.8 |
| 3 | Junb | Transcription factor jun-B | TF_bZIP | 6 | 31.4 |
| 4 | Atf1 | cAMP-dependent transcription factor ATF-1 | TF_bZIP | 6 | 22.6 |
| 5 | Jun | Transcription factor AP-1 | TF_bZIP | 6 | 22.4 |
| 6 | Klf5 | Krueppel-like factor 5 | zf-C2H2 | 1,2,3,4,5,6 | 21.7 |
| 7 | Elf1 | ETS-related transcription factor Elf-1 | ETS | 2,5,6 | 18.1 |
| 8 | Nfkb1 | Nuclear factor NF-κB p105 subunit | RHD | 1,4,5 | 17.7 |
| 9 | Stat3 | Signal transducer and activator of transcription 3 | STAT | 1,4,5,6 | 17.4 |
| 10 | Nfyb | Nuclear transcription factor Y subunit-β | NF-YB/C | 1,2 | 16.9 |
| 11 | Klf6 | Krueppel-like factor 6 | zf-C2H2 | 1,2,3,4,5,6 | 14.4 |
| 12 | Nr2f6 | Nuclear receptor subfamily 2 group F member 6 | RXR-related | 6 | 13.4 |
| 13 | Mef2d | Myocyte-specific enhancer factor 2 D | MEF2 | none | 12.5 |
| 14 | Irf3 | Interferon regulatory factor 3 | IRF | 2 | 11.9 |
| 15 | Ets1 | Protein C-ets-1 isoform 6 | ETS | 2,5,6 | 10.8 |
| 16 | Sp1 | Transcription factor Sp1 | zf-C2H2 | 1,2,3,4,5,6 | 10.7 |
| 17 | Nr4a1 | Nuclear receptor subfamily 4 group A member 1 | NGFIB-like | 2,5 | 10.7 |
RNA-Seq Analysis of the Time Course of the Vasopressin Response
To provide data necessary for the above Bayesian analysis, we carried out RNA-Seq profiling of mpkCCD cells exposed to the vasopressin analog dDAVP for 3, 6, 12, and 24 h. Each time point had its own vehicle time-control observations (n = 3 for both dDAVP and time controls). The log2 values of dDAVP-to-vehicle ratios were graphed on a specialized interactive webpage (https://esbl.nhlbi.nih.gov/TIMEAVP/). A summary of the time courses of Aqp2 and the TFs from Table 5 is shown in Fig. 3. Aqp2 mRNA was already maximally increased at the 3-h time point, indicating a surprisingly rapid response. Among the TFs shown, CCAAT/enhancer-binding protein-β (Cebpb) changed most substantially, exhibiting a large increase at 3 and 6 h but declining at later time points. Previous studies using proteomics to profile nuclear translocation have demonstrated that Cebpb protein abundance increases in the nuclei of mpkCCD cells in response to vasopressin (https://esbl.nhlbi.nih.gov/Databases/QuantNucProteomics/). Subsequently, we demonstrated using ChIP-Seq that vasopressin increases Cebpb binding to the enhancer directly downstream from the Aqp2 gene body (region 6 in Fig. 2) (11). In the Bayesian analysis of TFs most likely to be involved in the regulation of Aqp2 gene transcription, Cebpb ranked second among the 1,344 TFs in the mouse genome (Table 5). Cebp family TFs are so-called “pioneer TFs” that can bind to cis regulatory elements without prior chromatin modifications to increase DNA accessibility (46). Cebpb is therefore an attractive candidate for further experimental studies.
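The two quantities derived from the time-course data, the log2 dDAVP-to-vehicle ratio and the dot product of a TF time course with the Aqp2 time course (Table 4), can be sketched as follows. This is an illustration under stated assumptions: replicates are summarized by their arithmetic mean before the ratio is taken, the dot product is computed on log2 ratios, and all profiles shown are hypothetical rather than values from the RNA-Seq data.

```python
import math

def log2_ratio(treated, vehicle):
    """log2 of mean(treated) / mean(vehicle) for one transcript at one time point.

    Averaging replicates before taking the ratio is an assumption of this sketch.
    """
    mean_treated = sum(treated) / len(treated)
    mean_vehicle = sum(vehicle) / len(vehicle)
    return math.log2(mean_treated / mean_vehicle)

def dot_product(profile_a, profile_b):
    """Sum of pairwise products of two equal-length time-course profiles."""
    return sum(a * b for a, b in zip(profile_a, profile_b))

# Hypothetical log2(dDAVP/vehicle) profiles at 3, 6, 12, and 24 h.
aqp2 = [2.0, 2.0, 1.5, 1.0]
tf_parallel = [1.0, 1.0, 0.5, 0.0]     # rises early, like the Aqp2 profile
tf_flat = [0.0, 0.0, 0.0, 0.0]         # no response
tf_opposite = [-1.0, -1.0, -0.5, 0.0]  # falls while Aqp2 rises

for name, profile in [("parallel", tf_parallel), ("flat", tf_flat), ("opposite", tf_opposite)]:
    print(name, dot_product(aqp2, profile))
```

A TF whose mRNA changes in the same direction and at the same times as Aqp2 mRNA gives a large positive dot product, an unresponsive TF gives a value near zero, and a TF that changes in the opposite direction gives a negative value.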
Figure 3.

Time courses of transcriptomic change in transcription factor abundances following the addition of desmopressin (dDAVP) to mpkCCD cells compared with aquaporin-2 (Aqp2) mRNA changes. A: time courses of Aqp2 transcription. B: time course of transcriptomic changes of transcription factors from Table 3. Only CCAAT/enhancer-binding protein-β (Cebpb) exhibited substantial changes at the 3- and 6-h time points parallel to Aqp2. The gene symbols shown in red are those that changed significantly at any time point following vasopressin addition.
An additional question was “What other genes have time courses similar to that of Aqp2 mRNA?” Fig. 4 shows non-TF transcripts that, like Aqp2, increased rapidly (within 3 h), viz., Aqp5, Arg2, Arhgef3, B3gnt7, Baiap2l2, Cited1, Nipal1, Pde4b, Ramp3, and Sult1d1. Many of these were previously found to be increased by vasopressin on Affymetrix microarrays (40) (Arg2, Arhgef3, B3gnt7, Nipal1, Pde4b, and Sult1d1), to be decreased when PKA was deleted from mpkCCD cells (5) (Aqp5, Baiap2l2, and Pde4b), or to be vasopressin-induced transcripts in the study by Robert-Nicoud et al. (47) (Ramp3). Theoretically, transcription of these genes could be regulated by the same TFs as Aqp2, a possibility to be tested in future studies.
Figure 4.
Time courses of desmopressin (dDAVP) responses of aquaporin-2 (Aqp2) and several other non-transcription factor transcripts. Several transcripts underwent significant increases in abundance within 3 h of dDAVP addition in mpkCCD cells. Data are from RNA sequencing. See the text for details.
DISCUSSION
Bayesian Analysis as a Model of Human Thought
The data integration method used in this report is based on application of Bayes’ theorem, a method that simulates natural human thought processes that integrate diverse information to derive mechanistic hypotheses. The goal was to use logic and data to narrow down possible answers to the following question: “What transcription factors mediate the effect of vasopressin signaling in collecting duct cells to increase the rate of Aqp2 gene transcription?” The analysis started with a list of 1,344 TFs that were initially given the same probability of involvement in the vasopressin response. -Omics data of various types were then added to stratify the TF list based on simple logic. For example, early steps in the process used quantitative proteomics and transcriptomics data to stratify TFs based on a simple idea: “TF proteins or mRNAs that are detectable in collecting duct cells are more likely to play a role in the regulation of Aqp2 transcription than TFs that are not detectable.” Next, we further stratified the TF list based on the idea: “TFs with potential binding sites in the Aqp2 CTCF loop enhancer regions are more likely to play a role in the regulation of Aqp2 transcription than TFs that do not have such sites.” The analysis advanced through other steps using Bayes’ theorem that assigned higher regulatory likelihoods to TFs that have been shown in proteomics studies to be altered by vasopressin in terms of protein phosphorylation, absolute protein abundance, or translocation into the nucleus, each with a logical premise specified in Table 4. Although the use of logic resembles human thought, formalized Bayesian analysis is advantageous because computers can track many more details than humans. At the end of the analysis, we obtained a ranked list of TF candidates that can then be studied further to rule in or out a role in the regulation of Aqp2 gene transcription (Fig. 5). Such studies can take the form of genome editing (CRISPR/Cas9) to delete each TF or to mutate its putative binding site as well as ChIP-Seq to determine whether it indeed binds to the proposed site on the Aqp2 CTCF loop. These experimental studies are beyond the scope of the current work.
Figure 5.

Bayesian integration scheme, showing each step in the overall calculation. Details of data sources are shown in Tables 1 and 4. Rankings obtained at specific sites in calculation are shown in Tables 2, 3 and 5. Aqp2, aquaporin-2; CCD, cortical collecting duct; CNT, connecting tubule; IMCD, inner medullary collecting duct; OMCD, outer medullary collecting duct; RNA-Seq, RNA sequencing; SILAC, stable isotope labeling by amino acids in cell culture; TFs, transcription factors.
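The stepwise integration summarized in Fig. 5 can be sketched in a few lines. This is a minimal illustration, not the code used for the analysis: it assumes that every candidate starts with the same prior (1/N), that each data set contributes one likelihood per TF, and that posteriors are renormalized across the whole candidate list after each step. The function names and the toy likelihood values for four hypothetical TFs are illustrative only.

```python
def bayes_update(priors, likelihoods):
    """posterior_i = prior_i * likelihood_i / sum_j(prior_j * likelihood_j)."""
    unnormalized = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

def integrate(n_candidates, likelihood_layers):
    """Apply Bayes' theorem once per data set, starting from a uniform prior.

    Returns the final posteriors and the posterior-to-initial-prior ratios.
    """
    initial_prior = 1.0 / n_candidates
    posterior = [initial_prior] * n_candidates
    for layer in likelihood_layers:
        # The posterior from one step becomes the prior for the next step.
        posterior = bayes_update(posterior, layer)
    ratios = [p / initial_prior for p in posterior]
    return posterior, ratios

# Toy example: four hypothetical TFs and two hypothetical data sets.
layers = [
    [0.9, 0.5, 0.5, 0.1],  # e.g., likelihoods from an expression data set
    [0.8, 0.5, 0.2, 0.5],  # e.g., likelihoods from a vasopressin-response data set
]
posterior, ratios = integrate(4, layers)

# Rank candidates by probability ratio, highest first.
ranking = sorted(range(4), key=lambda i: ratios[i], reverse=True)
print(ranking, [round(r, 2) for r in ratios])
```

Because each step only multiplies and renormalizes, the final ranking does not depend on the order in which the data sets are applied, and a ratio above 1 indicates a candidate whose probability rose relative to the uniform starting point.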
Time Course of the Vasopressin Response
Although this study focused primarily on Bayesian modeling with preexisting data, we also carried out new experiments to assess the time course of changes in gene expression in cultured mpkCCD cells after the addition of vasopressin. The observed increase in Aqp2 mRNA was surprisingly fast, rising to a near-maximal level within 3 h. Several other genes showed equally rapid responses (Fig. 4). Among highly ranked TFs in this study, only C/EBPβ increased in this timeframe (Fig. 3). The Aqp2 response at the protein level is substantially slower, requiring many hours or days to reach a maximum level (17). This suggests that other processes contribute to the rise in Aqp2 protein expression after dDAVP addition. The quick rise in Aqp2 mRNA indicates that the critical regulatory processes in mpkCCD cells must be triggered in 3 h or less. This timeframe is compatible with the regulation of TFs by phosphorylation or other posttranslational modifications or by regulation of nuclear translocation of TFs. Alternatively, the rapid response could be in part dependent on chromatin modifications, such as increased histone H3 lysine-27 acetylation in the vicinity of the Aqp2 gene in response to vasopressin, as we have previously demonstrated (11). Such chromatin modifications are required for TFs to access their DNA-binding sites.
Properties of the Most Highly Ranked TFs
In this study, we identified the 17 TFs, of the 1,344 TFs in the mouse genome, that are the most highly ranked candidates for roles in vasopressin-mediated increases in Aqp2 gene transcription. This analysis brings the list of candidates down to a number that can reasonably be studied individually in future investigations. Of the 17 TFs, 8 have been proposed in prior studies to play roles in Aqp2 regulation, viz., Cebpb, Elf1, Elf3, Ets1, Jun, Junb, Nfkb1, and Sp1 (see introduction). The remaining nine TFs represent new candidates for future studies (Atf1, Irf3, Klf5, Klf6, Mef2d, Nfyb, Nr2f6, Stat3, and Nr4a1). Below, we summarize the properties of these TFs and relate them to prior reductionist studies in the literature.
CREB family (Atf1).
CREB family proteins are bZIP transcription factors. CREB itself (Creb1) is widely viewed as a regulator of Aqp2 gene transcription (6, 12, 13, 14). However, in ChIP-Seq studies in vasopressin-responsive cultured mpkCCD cells, CREB was not found to bind anywhere in the vicinity of the Aqp2 gene, although it did bind to many previously annotated sites (11). Two other CREB-like TFs, Crem and Atf1, possess pKID domains that, like that of Creb1, can be phosphorylated by PKA (49, 50). The Bayesian analysis pointed to Atf1 (rank 4 of 1,344) as a likely regulator of Aqp2 gene expression. We are not aware of prior experimental studies that have investigated a role of Atf1 in the renal collecting duct. The putative binding site for Atf1 is in the enhancer downstream from the Aqp2 gene (region 6, Fig. 2).
Other bZIP TFs: AP-1 family.
Atf1 is a so-called bZIP family TF. The Bayesian analysis ranked three other bZIP TFs highly with regard to regulation of Aqp2 expression, namely, Junb, Jun, and Fosl2. These TFs are well-known elements of activator protein-1 (AP-1) dimers that regulate a large number of genes. Prior studies have implicated AP-1 in the regulation of Aqp2 transcription (17, 25, 51).
CCAAT-enhancer-binding proteins.
C/EBPs are so-called “pioneer” TFs that can bind to DNA without prior chromatin modifications to increase DNA accessibility (11). We have previously found, using ChIP-Seq in mpkCCD cells, that C/EBPβ (gene symbol: Cebpb) binds to the enhancer region immediately downstream from Aqp2 (region 6, Fig. 2) (11). Binding was increased by vasopressin, corresponding to an increase in the abundance of Cebpb in the nucleus, presumably due to vasopressin-mediated nuclear translocation (https://esbl.nhlbi.nih.gov/Databases/QuantNucProteomics/). Vasopressin increases the half-life of Cebpb protein from 8.5 to 17.7 h (https://esbl.nhlbi.nih.gov/Databases/ProteinHalfLives/), and the present study demonstrated a very rapid increase in Cebpb mRNA after the addition of vasopressin in mpkCCD cells (Fig. 3). Cebpb ranked second among all candidate TFs with regard to likelihood of a role in vasopressin-mediated Aqp2 transcriptional activation (Table 5).
ETS family.
ETS domain proteins are TFs that classically are targeted by MAPK pathways. Phosphorylation by MAPKs affects DNA binding and results in transcriptional activation or repression. This often triggers a so-called immediate early transcriptional response in cells following MAPK activation, resulting in a proliferative response and dedifferentiation. Since vasopressin signaling typically results in decreased activation of ERK and other MAPKs, signaling through ETS family TFs would be expected to increase collecting duct cell differentiation and Aqp2 gene expression. In this study, three ETS family TFs were highly ranked with respect to likelihood of a role in Aqp2 gene regulation, namely, Elf3 (rank 1), Elf1 (rank 7), and Ets1 (rank 15). These TFs are predicted to bind two different enhancers (regions 2 and 6, Fig. 2) as well as the Aqp2 promoter. Promoter-reporter assays demonstrated that expression of Elf3 conveys vasopressin-mediated increases in Aqp2 transcription (17). Also, dDAVP increased the abundance of Elf3 in nuclear extracts of mouse mpkCCD cells to 230% of control values consistent with nuclear translocation (45).
RHD and IRF families.
Prominent in Table 5 are several TFs that nominally mediate inflammatory signaling. NF-κB p105 (Nfkb1) was ranked eighth with regard to likelihood of a role in vasopressin-induced increases in Aqp2 gene transcription (Table 5), consistent with prior evidence from studies in mpkCCD cells (22). Interferon regulatory factor 3 (Irf3), another nominal regulator of inflammatory responses, was also highly ranked.
STAT family.
Stat3 was ranked ninth in the overall Bayesian analysis of TFs. In immune cells, it is normally involved in cytokine signaling and activated by tyrosine phosphorylation, resulting in a proliferative response by activation of the cell cycle (52), which would be expected to reduce Aqp2 expression. Classically, STATs are phosphorylated by JAKs but can also be phosphorylated by the epidermal growth factor receptor and by Src (52). Src is phosphorylated at Ser17 by PKA in the native rat inner medullary collecting duct (44), which increases its activity. In addition, vasopressin produced a small increase in Stat3 phosphorylation at Tyr705 (44), which increases its activity (53). We are not aware of studies that have directly addressed the role of Stat3 in the control of Aqp2 gene expression.
zf-C2H2 family.
Klf5, Klf6, and Sp1 are zinc finger TFs that appear in Table 5 as TFs likely to be involved in vasopressin-mediated regulation of Aqp2 gene transcription. In general, zinc finger TFs tend to mediate gene repression. Klf5 has been implicated in inflammatory signaling in the collecting duct (54). Klf6 is thought to play a role in the development of the renal collecting duct system, possibly in association with GATA-3 (55). Based on a promoter-reporter study, Sp1 has been proposed to play a role in the expression of Aqp2 (25).
Nuclear factor-Y.
NF-Y is a heterotrimeric TF composed of Nfya, Nfyb, and Nfyc. One subunit, Nfyb, ranked highly in the Bayesian analysis (rank 10, Table 5). This TF complex binds to 5′-CCAAT-3′ box motifs in gene promoters. Nfyc has been implicated as a transcriptional coregulator for the mineralocorticoid receptor in the renal collecting duct (56).
Other TFs.
Most of the TFs shown in Table 5 (candidates for vasopressin-mediated regulation of Aqp2 gene transcription) are also shown in Table 3 (general candidates, not necessarily acting via vasopressin-mediated regulation). Table 3 also includes a few TFs that are not in Table 5. Among these, Nfat5 (19), Nr1h2 (24), and Smad3 (41) have been implicated in the regulation of Aqp2 gene expression by mechanisms that are independent of vasopressin.
SUPPLEMENTAL DATA
All Supplemental Material: https://esbl.nhlbi.nih.gov/Databases/TFBayesSuppData/.
FUNDING
This work was primarily funded by the Division of Intramural Research, National Heart, Lung, and Blood Institute (Grants ZIA-HL001285 and ZIA-HL006129, to M.A.K.).
DISCLOSURES
No conflicts of interest, financial or otherwise, are declared by the authors.
AUTHOR CONTRIBUTIONS
H.K., H.J.J., V.R., C.-R.Y., C.-L.C., and M.A.K. conceived and designed research; H.K., H.J.J., C.-R.Y., and C.-L.C. performed experiments; H.K., H.J.J., V.R., K.T.L., E.P., C.-R.Y., C.-L.C., L.C., and M.A.K. analyzed data; H.K., H.J.J., V.R., K.T.L., E.P., C.-R.Y., C.-L.C., L.C., and M.A.K. interpreted results of experiments; H.K., H.J.J., and M.A.K. prepared figures; H.K., H.J.J., C.-L.C., and M.A.K. drafted manuscript; H.K., H.J.J., V.R., K.T.L., E.P., C.-R.Y., C.-L.C., L.C., and M.A.K. edited and revised manuscript; H.K., H.J.J., V.R., K.T.L., E.P., C.-R.Y., C.-L.C., L.C., and M.A.K. approved final version of manuscript.
ACKNOWLEDGMENTS
Next-generation sequencing was done at the NHLBI DNA Sequencing Core Facility (Dr. Yuesheng Li, Director).
APPENDIX: NOMENCLATURE
Nonitalicized gene symbols are used to designate proteins or transcripts. When referring specifically to the gene rather than the gene products, the gene symbols are italicized. Human gene symbols are given in “all caps,” whereas rodent gene symbols are capitalized in the first letter only. Official gene symbols are archived by UniProt (https://www.uniprot.org/).
REFERENCES
- 1. Nielsen S, DiGiovanni SR, Christensen EI, Knepper MA, Harris HW. Cellular and subcellular immunolocalization of vasopressin-regulated water channel in rat kidney. Proc Natl Acad Sci USA 90: 11663–11667, 1993. doi: 10.1073/pnas.90.24.11663.
- 2. Coleman RA, Wu DC, Liu J, Wade JB. Expression of aquaporins in the renal connecting tubule. Am J Physiol Renal Physiol 279: F874–F883, 2000. doi: 10.1152/ajprenal.2000.279.5.F874.
- 3. Nielsen S, Chou CL, Marples D, Christensen EI, Kishore BK, Knepper MA. Vasopressin increases water permeability of kidney collecting duct by inducing translocation of aquaporin-CD water channels to plasma membrane. Proc Natl Acad Sci USA 92: 1013–1017, 1995. doi: 10.1073/pnas.92.4.1013.
- 4. DiGiovanni SR, Nielsen S, Christensen EI, Knepper MA. Regulation of collecting duct water channel expression by vasopressin in Brattleboro rat. Proc Natl Acad Sci USA 91: 8984–8988, 1994. doi: 10.1073/pnas.91.19.8984.
- 5. Isobe K, Jung HJ, Yang CR, Claxton J, Sandoval P, Burg MB, Raghuram V, Knepper MA. Systems-level identification of PKA-dependent signaling in epithelial cells. Proc Natl Acad Sci USA 114: E8875–E8884, 2017. doi: 10.1073/pnas.1709123114.
- 6. Nielsen S, Frøkiaer J, Marples D, Kwon TH, Agre P, Knepper MA. Aquaporins in the kidney: from molecules to medicine. Physiol Rev 82: 205–244, 2002. doi: 10.1152/physrev.00024.2001.
- 7. Marples D, Christensen S, Christensen EI, Ottosen PD, Nielsen S. Lithium-induced downregulation of aquaporin-2 water channel expression in rat kidney medulla. J Clin Invest 95: 1838–1845, 1995. doi: 10.1172/JCI117863.
- 8. Sung CC, Chen L, Limbutara K, Jung HJ, Gilmer GG, Yang CR, Lin SH, Khositseth S, Chou CL, Knepper MA. RNA-Seq and protein mass spectrometry in microdissected kidney tubules reveal signaling processes initiating lithium-induced nephrogenic diabetes insipidus. Kidney Int 96: 363–377, 2019. doi: 10.1016/j.kint.2019.02.015.
- 9. Hasler U, Mordasini D, Bens M, Bianchi M, Cluzeaud F, Rousselot M, Vandewalle A, Feraille E, Martin PY. Long term regulation of aquaporin-2 expression in vasopressin-responsive renal collecting duct principal cells. J Biol Chem 277: 10379–10386, 2002. doi: 10.1074/jbc.M111880200.
- 10. Hnisz D, Day DS, Young RA. Insulated neighborhoods: structural and functional units of mammalian gene control. Cell 167: 1188–1200, 2016. doi: 10.1016/j.cell.2016.10.024.
- 11. Jung HJ, Raghuram V, Lee JW, Knepper MA. Genome-wide mapping of DNA accessibility and binding sites for CREB and C/EBPβ in vasopressin-sensitive collecting duct cells. J Am Soc Nephrol 29: 1490–1500, 2018. doi: 10.1681/ASN.2017050545.
- 12. Kortenoeven ML, Fenton RA. Renal aquaporins and water balance disorders. Biochim Biophys Acta 1840: 1533–1549, 2014. doi: 10.1016/j.bbagen.2013.12.002.
- 13. Bockenhauer D, Bichet DG. Pathophysiology, diagnosis and management of nephrogenic diabetes insipidus. Nat Rev Nephrol 11: 576–588, 2015. doi: 10.1038/nrneph.2015.89.
- 14. Pearce D, Soundararajan R, Trimpert C, Kashlan OB, Deen PM, Kohan DE. Collecting duct principal cell transport processes and their regulation. Clin J Am Soc Nephrol 10: 135–146, 2015. doi: 10.2215/CJN.05760513.
- 15. Wilson JL, Miranda CA, Knepper MA. Vasopressin and the regulation of aquaporin-2. Clin Exp Nephrol 17: 751–764, 2013. doi: 10.1007/s10157-013-0789-5.
- 16. Lin ST, Ma CC, Kuo KT, Su YF, Wang WL, Chan TH, Su SH, Weng SC, Yang CH, Lin SL, Yu MJ. Transcription factor Elf3 modulates vasopressin-induced aquaporin-2 gene expression in kidney collecting duct cells. Front Physiol 10: 1308, 2019. doi: 10.3389/fphys.2019.01308.
- 17. Yu MJ, Miller RL, Uawithya P, Rinschen MM, Khositseth S, Braucht DW, Chou CL, Pisitkun T, Nelson RD, Knepper MA. Systems-level analysis of cell-specific AQP2 gene expression in renal collecting duct. Proc Natl Acad Sci USA 106: 2441–2446, 2009. doi: 10.1073/pnas.0813002106.
- 18. Hasler U, Leroy V, Martin PY, Feraille E. Aquaporin-2 abundance in the renal collecting duct: new insights from cultured cell models. Am J Physiol Renal Physiol 297: F10–F18, 2009. doi: 10.1152/ajprenal.00053.2009.
- 19. Lam AK, Ko BC, Tam S, Morris R, Yang JY, Chung SK, Chung SS. Osmotic response element-binding protein (OREBP) is an essential regulator of the urine concentrating mechanism. J Biol Chem 279: 48048–48054, 2004. doi: 10.1074/jbc.M407224200.
- 20. Hasler U, Jeon US, Kim JA, Mordasini D, Kwon HM, Féraille E, Martin PY. Tonicity-responsive enhancer binding protein is an essential regulator of aquaporin-2 expression in renal collecting duct principal cells. J Am Soc Nephrol 17: 1521–1531, 2006. doi: 10.1681/ASN.2005121317.
- 21. Li SZ, McDill BW, Kovach PA, Ding L, Go WY, Ho SN, Chen F. Calcineurin-NFATc signaling pathway regulates AQP2 expression in response to calcium signals and osmotic stress. Am J Physiol Cell Physiol 292: C1606–C1616, 2007. doi: 10.1152/ajpcell.00588.2005.
- 22. Hasler U, Leroy V, Jeon US, Bouley R, Dimitrov M, Kim JA, Brown D, Kwon HM, Martin PY, Féraille E. NF-kappaB modulates aquaporin-2 transcription in renal collecting duct principal cells. J Biol Chem 283: 28095–28105, 2008. doi: 10.1074/jbc.M708350200.
- 23. Kortenoeven ML, van den Brand M, Wetzels JF, Deen PM. Hypotonicity-induced reduction of aquaporin-2 transcription in mpkCCD cells is independent of the tonicity responsive element, vasopressin, and cAMP. J Biol Chem 286: 13002–13010, 2011. doi: 10.1074/jbc.M110.207878.
- 24. Su W, Huang SZ, Gao M, Kong XM, Gustafsson JA, Xu SJ, Wang B, Zheng F, Chen LH, Wang NP, Guan YF, Zhang XY. Liver X receptor β increases aquaporin 2 protein level via a posttranscriptional mechanism in renal collecting ducts. Am J Physiol Renal Physiol 312: F619–F628, 2017. doi: 10.1152/ajprenal.00564.2016.
- 25. Hozawa S, Holtzman EJ, Ausiello DA. cAMP motifs regulating transcription in the aquaporin 2 gene. Am J Physiol 270: C1695–C1702, 1996. doi: 10.1152/ajpcell.1996.270.6.C1695.
- 26. Yang CR, Raghuram V, Emamian M, Sandoval PC, Knepper MA. Deep proteomic profiling of vasopressin-sensitive collecting duct cells. II. Bioinformatic analysis of vasopressin signaling. Am J Physiol Cell Physiol 309: C799–C812, 2015. doi: 10.1152/ajpcell.00214.2015.
- 27. Bradford D, Raghuram V, Wilson JL, Chou CL, Hoffert JD, Knepper MA, Pisitkun T. Use of LC-MS/MS and Bayes' theorem to identify protein kinases that phosphorylate aquaporin-2 at Ser256. Am J Physiol Cell Physiol 307: C123–C139, 2014. doi: 10.1152/ajpcell.00377.2012.
- 28. Xue Z, Chen JX, Zhao Y, Medvar B, Knepper MA. Data integration in physiology using Bayes' rule and minimum Bayes' factors: deubiquitylating enzymes in the renal collecting duct. Physiol Genomics 49: 151–159, 2017. doi: 10.1152/physiolgenomics.00120.2016.
- 29. Goodman SN. Toward evidence-based medical statistics. 2: The Bayes factor. Ann Intern Med 130: 1005–1013, 1999. doi: 10.7326/0003-4819-130-12-199906150-00019.
- 30. Held L. A nomogram for P values. BMC Med Res Methodol 10: 21, 2010. doi: 10.1186/1471-2288-10-21.
- 31. Wen B, Jung HJ, Chen L, Saeed F, Knepper MA. NGS-Integrator: An efficient tool for combining multiple NGS data tracks using minimum Bayes' factors. BMC Genomics 21: 806, 2020. doi: 10.1186/s12864-020-07220-7.
- 32. Bailey TL, Boden M, Buske FA, Frith M, Grant CE, Clementi L, Ren J, Li WW, Noble WS. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res 37: W202–W208, 2009. doi: 10.1093/nar/gkp335.
- 33. Zhao Y, Yang CR, Raghuram V, Parulekar J, Knepper MA. BIG: a large-scale data integration tool for renal physiology. Am J Physiol Renal Physiol 311: F787–F792, 2016. doi: 10.1152/ajprenal.00249.2016.
- 34. Duong Van Huyen J, Bens M, Vandewalle A. Differential effects of aldosterone and vasopressin on chloride fluxes in transimmortalized mouse cortical collecting duct cells. J Membr Biol 164: 79–90, 1998. doi: 10.1007/s002329900395.
- 35. Zhang HM, Chen H, Liu W, Liu H, Gong J, Wang H, Guo AY. AnimalTFDB: a comprehensive animal transcription factor database. Nucleic Acids Res 40: D144–D149, 2012. doi: 10.1093/nar/gkr965.
- 36. Clark JZ, Chen L, Chou CL, Jung HJ, Lee JW, Knepper MA. Representation and relative abundance of cell-type selective markers in whole-kidney RNA-Seq data. Kidney Int 95: 787–796, 2019. doi: 10.1016/j.kint.2018.11.028.
- 37. Uawithya P, Pisitkun T, Ruttenberg BE, Knepper MA. Transcriptional profiling of native inner medullary collecting duct cells from rat kidney. Physiol Genomics 32: 229–253, 2008. doi: 10.1152/physiolgenomics.00201.2007.
- 38. Lee JW, Chou CL, Knepper MA. Deep sequencing in microdissected renal tubules identifies nephron segment-specific transcriptomes. J Am Soc Nephrol 26: 2669–2677, 2015. doi: 10.1681/ASN.2014111067.
- 39. Sandoval PC, Claxton JS, Lee JW, Saeed F, Hoffert JD, Knepper MA. Systems-level analysis reveals selective regulation of Aqp2 gene expression by vasopressin. Sci Rep 6: 34863, 2016. doi: 10.1038/srep34863.
- 40. Khositseth S, Pisitkun T, Slentz DH, Wang G, Hoffert JD, Knepper MA, Yu MJ. Quantitative protein and mRNA profiling shows selective post-transcriptional control of protein expression by vasopressin in kidney cells. Mol Cell Proteomics 10: M110.004036, 2011. doi: 10.1074/mcp.M110.004036.
- 41. Lee JW, Alsady M, Chou CL, de Groot T, Deen PMT, Knepper MA, Ecelbarger CM. Single-tubule RNA-Seq uncovers signaling mechanisms that defend against hyponatremia in SIADH. Kidney Int 93: 128–146, 2018. doi: 10.1016/j.kint.2017.06.008.
- 42. Chen L, Chou CL, Knepper MA. A comprehensive map of mRNAs and their isoforms across all 14 renal tubule segments of mouse. J Am Soc Nephrol 32: 897–912, 2021. doi: 10.1681/ASN.2020101406.
- 43. Datta A, Yang CR, Limbutara K, Chou CL, Rinschen MM, Raghuram V, Knepper MA. PKA-independent vasopressin signaling in renal collecting duct. FASEB J 34: 6129–6146, 2020. doi: 10.1096/fj.201902982R.
- 44. Deshpande V, Kao A, Raghuram V, Datta A, Chou CL, Knepper MA. Phosphoproteomic identification of vasopressin V2 receptor-dependent signaling in the renal collecting duct. Am J Physiol Renal Physiol 317: F789–F804, 2019. doi: 10.1152/ajprenal.00281.2019.
- 45. Schenk LK, Bolger SJ, Luginbuhl K, Gonzales PA, Rinschen MM, Yu MJ, Hoffert JD, Pisitkun T, Knepper MA. Quantitative proteomics identifies vasopressin-responsive nuclear proteins in collecting duct cells. J Am Soc Nephrol 23: 1008–1018, 2012. doi: 10.1681/ASN.2011070738.
- 46. Zaret KS, Carroll JS. Pioneer transcription factors: establishing competence for gene expression. Genes Dev 25: 2227–2241, 2011. doi: 10.1101/gad.176826.111.
- 47. Robert-Nicoud M, Flahaut M, Elalouf JM, Nicod M, Salinas M, Bens M, Doucet A, Wincker P, Artiguenave F, Horisberger JD, Vandewalle A, Rossier BC, Firsov D. Transcriptome of a mouse kidney cortical collecting duct cell line: effects of aldosterone and vasopressin. Proc Natl Acad Sci USA 98: 2712–2716, 2001. doi: 10.1073/pnas.051603198.
- 48. Posner E, Skutil J. The great neglect: the fate of Mendel's classic paper between 1865 and 1900. Med Hist 12: 122–136, 1968. doi: 10.1017/s0025727300013016.
- 49. Della Fazia MA, Servillo G, Sassone-Corsi P. Cyclic AMP signalling and cellular proliferation: regulation of CREB and CREM. FEBS Lett 410: 22–24, 1997. doi: 10.1016/s0014-5793(97)00445-6.
- 50. De Cesare D, Sassone-Corsi P. Transcriptional regulation by cyclic AMP-responsive factors. Prog Nucleic Acid Res Mol Biol 64: 343–369, 2000. doi: 10.1016/s0079-6603(00)64009-6.
- 51. Yasui M, Zelenin SM, Celsi G, Aperia A. Adenylate cyclase-coupled vasopressin receptor activates AQP2 promoter via a dual effect on CRE and AP1 elements. Am J Physiol Renal Physiol 272: F443–F450, 1997. doi: 10.1152/ajprenal.1997.272.4.F443.
- 52. Bromberg J, Darnell JE Jr. The role of STATs in transcriptional control and their impact on cellular function. Oncogene 19: 2468–2473, 2000. doi: 10.1038/sj.onc.1203476.
- 53. Kaptein A, Paillard V, Saunders M. Dominant negative stat3 mutant inhibits interleukin-6-induced Jak-STAT signal transduction. J Biol Chem 271: 5961–5964, 1996. doi: 10.1074/jbc.271.11.5961.
- 54. Fujiu K, Manabe I, Nagai R. Renal collecting duct epithelial cells regulate inflammation in tubulointerstitial damage in mice. J Clin Invest 121: 3425–3441, 2011. doi: 10.1172/JCI57582.
- 55. Fischer EA, Verpont MC, Garrett-Sinha LA, Ronco PM, Rossert JA. Klf6 is a zinc finger protein expressed in a cell-specific manner during kidney development. J Am Soc Nephrol 12: 726–735, 2001. doi: 10.1681/ASN.V124726.
- 56. Murai-Takeda A, Shibata H, Kurihara I, Kobayashi S, Yokota K, Suda N, Mitsuishi Y, Jo R, Kitagawa H, Kato S, Saruta T, Itoh H. NF-YC functions as a corepressor of agonist-bound mineralocorticoid receptor. J Biol Chem 285: 8084–8093, 2010. doi: 10.1074/jbc.M109.053371.
Data Availability Statement
Raw fastq files and raw count information from the RNA-Seq analysis were deposited in the Gene Expression Omnibus (GEO; GSE163566).


