Abstract
The vertebrate immune system is tasked with the challenge of responding to any pathogen the organism might encounter, and retaining memory of that pathogen in case of future infection. Recognition and memory of pathogens are encoded within the adaptive immune system through the production of T and B lymphocytes with diverse antigen receptor repertoires. In B lymphocytes, diversity is generated by sequential recombination of Variable (V), Diversity (D) and Joining (J) gene segments in the immunoglobulin heavy chain gene (Igh) and subsequent V-J recombination in immunoglobulin light chain genes (Igκ followed by Igλ). However, the process by which B cells select particular V, D and J segments during recombination, and the mechanisms by which stochasticity of selection is maintained to ensure antibody repertoire diversity, are still unclear. In this review, we will focus on Igκ and recent findings regarding the relationships between gene structure, the generation of diversity and allelic choice. Surprisingly, the nuclear environment in which each Igκ allele resides, including transcription factories assembled on the nuclear matrix, plays critical roles in both gene regulation and in shaping the diversity of Vκ genes accessible to recombination. These findings provide a new paradigm for understanding Igκ recombination and Vκ diversity in the context of B lymphopoiesis.
Igκ structure
In mice, the Igκ locus stretches across 3.2 Mb in chromosome 6, with >100 Vκ genes arrayed on the 5’ end of the locus and 4 functional Jκ gene segments on the 3’ end, with a single constant (C) segment {Martinez-Jean, 2001 #17483}. However, Igκ is not a simple linear structure. Recent technical advances, including various chromatin conformation capture (3C) technologies {Dekker, 2002 #17571}{Davies, 2017 #17572}, have revealed that genes are organized into topologically associating domains (TADs), in which DNA regions within the TAD interact at higher frequency with other DNA regions in the TAD, and are insulated from DNA regions outside of the TAD{Cremer, 2001 #17573}{Dixon, 2012 #17574}. Boundaries between TADs are often defined by sites bound by the CCCTC-binding factor (CTCF){Phillips, 2009 #17564}. It is likely that TADs, or chromatin loops, are generated when a pair of tethered CTCF-cohesin dimers moves in opposite directions along DNA, extruding a loop of DNA behind them, until they reach a convergent pair of CTCF binding sites{Alipour, 2012 #17575}. The binding of CTCF-cohesin to these sites anchors a stable loop of DNA. There is now real-time imaging of DNA loop extrusion that supports this model {Ganji, 2018 #17576}.
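The loop extrusion model described above can be illustrated with a minimal one-dimensional sketch. It is not taken from the cited studies: the lattice of bins, the function name extrude_loop and all site positions are hypothetical, and the rule that each leg stops only at a CTCF site of the matching orientation is a deliberate simplification.

```python
def extrude_loop(load_site, forward_sites, reverse_sites, locus_length):
    """Extrude a loop outward from load_site on a 1D lattice of bins.

    The left leg stops at the first forward-oriented CTCF site it meets and
    the right leg stops at the first reverse-oriented site, so the final
    anchors form a convergent pair. The locus ends also act as hard stops.
    """
    forward, reverse = set(forward_sites), set(reverse_sites)
    left = right = load_site
    while True:
        left_blocked = left in forward or left == 0
        right_blocked = right in reverse or right == locus_length - 1
        if left_blocked and right_blocked:
            return left, right
        if not left_blocked:
            left -= 1   # left leg moves one bin toward the 5' end
        if not right_blocked:
            right += 1  # right leg moves one bin toward the 3' end


# Hypothetical bin positions, not real Igκ coordinates.
anchors = extrude_loop(load_site=50, forward_sites=[10, 40],
                       reverse_sites=[60, 90], locus_length=100)
print(anchors)  # (40, 60)
```

In this toy case the loop is anchored by the nearest convergent pair flanking the loading site, which is the behavior the model predicts for a stable CTCF-anchored loop.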
Based on known CTCF binding sites and Hi-C data in Rag2−/− pro-B cells {Aoki-Ota, 2012 #16612}{Choi, 2014 #16091}{Lin, 2012 #11132}{Ribeiro de Almeida, 2011 #16615}, Igκ is organized into six TADs or loops (L1-L6) {Karki, 2018 #17541}. Distal Vκ genes fall within three loops (L1-L3) that span Vκs 137–130, Vκs 129–122 and Vκs 121–84, respectively. The distal loops are relatively small and contain frequently expressed Vκ genes {Karki, 2018 #17541}. In contrast, L4 is large, spans Vκs 83–32 and is less frequently transcribed than the flanking loops. The proximal loops L5 and L6 are also relatively small, span Vκs 31–1 and are frequently expressed {Karki, 2018 #17541}.
Analysis of single cell transcription in small pre-B cells immediately prior to Igκ recombination reveals that there is little detectable transcription across the boundaries between the distal, intermediate and proximal TADs {Karki, 2018 #17541}. However, transcription is observed across loop boundaries within the distal and proximal loop clusters. This may reflect variability in how the smaller distal and proximal loops form in individual cells.
Generation of Vκ-Jκ repertoires and unanswered questions
High diversity of antigen receptors is expected based on stochastic combinatorial recombination of V, D and J gene segments. However, for Igκ, both Vκ and Jκ show bias in their bone marrow (BM) and splenic repertoires {Aoki-Ota, 2012 #16612}. Using 5’ RACE, out of 101 functional Vκs from C57BL/6 BM and spleen, only seven Vκ gene segments (1–135, 9–120, 10–96, 19–93, 6–23, 6–17 and 6–15) were found to make up approximately 40% of the primary Vκ repertoire, with most of these Vκ gene segments clustering in either the distal (1–135, 9–120, 19–93) or the proximal (6–17 and 6–15) domains. For Jκ, primary rearrangements favor Jκ1 and Jκ2, whereas secondary rearrangements favor Jκ4 and Jκ5 {Max, 1979 #3163}. The factors that contribute to this repertoire skewing and restriction remain unclear.
Recombination at Igκ follows the same general mechanism as other antigen receptors in which diverse Vκ segments, organized into TADs, contract onto a recombination center assembled at the Jκ segments. Recombination centers, which are marked by high density of H3K4me3, RAG1/2 protein binding and transcription factors, provide a scaffold for chromosomal organization required for recombination {Schatz, 2011 #11195}{Ji, 2010 #9233}{Matthews, 2007 #17022}. The Jκ-Cκ region is anchored by a matrix attachment region (MAR) {Cockerill, 1986 #5467}{Yi, 1999 #8892} consistent with a model in which the recombination center is a relatively fixed platform onto which the Vκ segments are recruited for Vκ-Jκ recombination.
For recombination, Vκ genes must be accessible, a state that is strongly linked to, and has been equated with, transcription {Yancopoulos, 1985 #3971}{Abarrategui, 2006 #16998}{Abarrategui, 2009 #17542}. All Vκs highly used in the initial Igκ repertoire {Aoki-Ota, 2012 #16612} are transcribed in single cells prior to recombination {Karki, 2018 #17541}. However, the frequency of specific Vκ gene transcription pre-recombination does not identify those Vκ genes most highly represented in the initial Igκ repertoire {Aoki-Ota, 2012 #16612}{Karki, 2018 #17541}. Therefore, while transcription might be required {Yancopoulos, 1985 #3971}{Abarrategui, 2006 #16998}{Abarrategui, 2009 #17542}, the frequency of Vκ transcription does not predict Vκ usage. These observations suggest that Vκ accessibility alone does not determine frequency of subsequent recombination.
There are at least two mechanisms of Vκ transcription. Pre-binding of the transcription factor E2A to Vκ promoters in pro-B cells is predictive of which Vκ gene segments are transcribed prior to recombination {Lin, 2010 #9235}{Karki, 2018 #17541}.
In addition, Vκ genes near CTCF sites are preferentially transcribed {Karki, 2018 #17541}. Indeed, CTCF sites can anchor transcription {Chernukhin, 2007 #17480}{Pena-Hernandez, 2015 #17543}. Similarly, Vh recombination frequency is related to both transcription factor binding at promoters and proximity to CTCF sites {Bolland, 2016 #17489}. Through these, and probably other mechanisms {Ribeiro de Almeida, 2015 #17548}{Matheson, 2017 #17563}, the Vκ genes are transcribed prior to recombination. However, as both E2A and CTCF-bound sites are present in pro-B cells well prior to Vκ transcription, other unknown mechanisms must regulate the initiation of Vκ transcription at these anchor sites.
Furthermore, it is unclear how the above mechanisms of transcription would make diverse, stochastic Vκ repertoires accessible to recombination. It is possible that critical transcription factors, such as E2A, are limiting and that in individual cells, they bind to stochastically distributed subsets of promoters. However, this would require many Vκ promoters to have similar, if not identical, affinities for E2A. Ensuring diverse Vκ repertoires might be more complex if combinations of transcription factors determine Vκ accessibility. Likewise, how different CTCF sites in individual cells might be chosen to anchor Vκ transcription is not known.
To ensure that B cells express one B cell antigen receptor, one Igκ allele must be first chosen for recombination. Until recently, how this occurs has been unclear. Germline transcription of Jκ-Cκ is biallelic and therefore does not provide a mechanism of allelic choice {Amin, 2009 #4842}{Karki, 2018 #17541}. In contrast, we have demonstrated that transcription of the Vκ genes prior to Igκ recombination is monoallelic, suggesting that allelic choice is determined by Vκ accessibility {Karki, 2018 #17541}. However, it is unlikely that transcription per se is the primary mechanism of allelic choice. In pro-B cells, the Igκ alleles replicate sequentially, with the earlier replicating allele being the one fated to be first transcribed and recombined {Mostoslavsky, 2001 #17570}. These data indicate that the alleles must be asymmetrical early in development in a way that dictates subsequent Vκ transcription and Igκ recombination. Monoallelic expression has been associated with H3K27me3 at the Vκ genes {Levin-Klein, 2017 #17388}{Karki, 2018 #17541}. However, quantitative analysis of H3K27me3 demonstrates that the Vκ genes are not marked in pro-B cells prior to the onset of transcription {Karki, 2018 #17541}.
Therefore, it is difficult to explain stochastic and monoallelic Vκ choice by conventional mechanisms of transcriptional and epigenetic regulation. Herein, we propose a model in which stochastic capture of Vκ-containing TADs by fixed transcription factories (TFs), and the regulation of this process by cyclin D3, provide a framework for understanding Vκ stochastic diversity and monoallelic choice in the context of B cell development.
Recruitment of Vκ gene-containing TADs to transcription factories
Fundamentally, there are two mechanisms of gene transcription. The first is one in which transcription factors and downstream mechanisms ultimately lead to recruitment of the RNA polymerase II (RNAP) complex to the gene promoter (Figure 1A, a mechanism we refer to as Type 1 transcription). This is the canonical model and many regulatory mechanisms affecting this process have been described {Orphanides, 1996 #17549}{Lee, 2000 #17550}{Ossipow, 1995 #17551}.
However, it has also been demonstrated that genes can be activated by recruitment, or translocation, to RNAP within fixed transcription factories (Figure 1B, Type 2) {Osborne, 2004 #17170}{Iborra, 1996 #17171}. In Type 1 transcription, RNAP and transcription initiation complexes are recruited to the gene, while in Type 2 the gene is recruited to RNAP. Several instances of Type 2 transcription have been reported {Chakalova, 2010 #17552}{Edelman, 2012 #17553}{Osborne, 2004 #17170}{Iborra, 1996 #17171}{Park, 2014 #17554}{Osborne, 2007 #17006}. Furthermore, evidence suggests that Type 2 transcription is a very common mechanism of gene activation {Papantonis, 2013 #17488}.
Prior to Igκ recombination, the Vκ genes are transcribed by a Type 2 mechanism {Karki, 2018 #17541}. From single cell RNA-sequencing and RNA-FISH of individual small pre-B cells, we observed that multiple Vκ genes translocate to fixed RNAP complexes and are transcribed from a single allele {Karki, 2018 #17541}. In other cells, TFs are discrete sites or hubs for transcription {Iborra, 1996 #17171}. In contrast, in pro- and pre-B cells, we observed that nuclear matrix-associated RNAP formed continuous strands that encompassed both Igκ alleles in RNAP “cages” {Karki, 2018 #17541} (Figure 1C–D).
In any particular cell, it is likely that the folding of TADs within the nuclear niche and the relative positioning of RNAP complexes will vary. All these variables are predicted to change the probability that a particular Vκ gene region would engage one or more RNAP complexes, be transcribed and therefore become available for recombination to Jκ (discussed below). Therefore, intrinsic to Type 2 transcription, and variance in relative geometries of RNAP and the Vκ genes, is a mechanism whereby different Vκ genes could be transcribed in individual cells ensuring a diverse primary Vκ repertoire.
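The idea that variable geometry yields different transcribed Vκ sets in different cells can be sketched as a small Monte Carlo simulation. This is an illustrative toy only: the per-loop capture probabilities are hypothetical placeholders rather than measured values, independence between loops is an assumption, and the function name simulate_cells is introduced here for illustration.

```python
import random


def simulate_cells(capture_prob, n_cells, seed=0):
    """Return, for each simulated cell, the set of loops captured by RNAP.

    capture_prob maps a loop name to a hypothetical probability that the
    loop engages a fixed RNAP complex in any one cell. Loops are treated
    as independent, which is a simplifying assumption.
    """
    rng = random.Random(seed)
    return [
        frozenset(loop for loop, p in capture_prob.items() if rng.random() < p)
        for _ in range(n_cells)
    ]


# Hypothetical probabilities: small flanking loops captured more often than L4.
probs = {"L1": 0.5, "L2": 0.5, "L3": 0.5, "L4": 0.1, "L5": 0.5, "L6": 0.5}
cells = simulate_cells(probs, n_cells=1000)
print(len(set(cells)), "distinct captured-loop combinations in 1000 cells")
```

Even with identical probabilities in every cell, the population contains many different combinations of captured loops, which is the sense in which stochastic capture could diversify the pool of accessible Vκ genes.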
In small pre-B cells prior to Igκ recombination, multiple contiguous Vκ genes in the same orientation are transcribed, suggesting that TFs can capture and read through multiple Vκ genes {Karki, 2018 #17541}. Transcription preferentially initiates at E2A-bound promoters or near CTCF sites (Figure 1E) and can extend over very long distances, encompassing multiple Vκs in a single transcript that is only limited by TAD boundaries. Furthermore, more than one Vκ-containing TAD can be transcribed in each cell. We propose that this mechanism of loop capture transcription defines a pre-repertoire of accessible Vκs from which one is productively captured by the recombination center at Jκ. It is likely that additional spatial and conformational constraints, imposed by how Vκ gene TADs contract onto Jκ, further restrict Vκ usage (Figure 1F).
Loop capture transcription bears mechanistic similarity to how RAG1/2 in recombination centers capture genomic DNA and scan for complementary recombination signal sequences over long distances limited only by TAD boundaries {Hu, 2015 #17490}. This suggests that sequential gene loop capture mechanisms, first by RNAP and then by RAG1/2 {Hu, 2015 #17490}, contribute to initial Igκ repertoire diversity.
Transcription by fixed RNAP requires that genomic DNA be pulled through the transcription complex. This is predicted to modulate chromatin loop structure and overall gene topology, including locus contraction. Fixed RNAP could therefore serve as a motor driving large and small scale changes in chromatin structure. This is consistent with evidence that Igκ and Igh locus contraction is associated with transcription {Corcoran, 2010 #17555}{Verma-Gaur, 2012 #17556}{Choi, 2014 #16091}.
Regulation of Vκ transcription by Cyclin D3
Remarkably, recruitment of Vκ genes to transcription factories is repressed by the cell cycle molecule, cyclin D3 {Karki, 2018 #17541}. Cyclin D3 also represses Vκ-Jκ contraction but not Jκ germline transcription. While cyclin D3 is normally considered a soluble regulator of CDK4 and 6 {Cooper, 2006 #9458}, in B cell progenitors, there is a large fraction of cyclin D3 associated with the nuclear matrix {Powers, 2012 #9529}. This might be a specific feature of lymphocytes, as no nuclear matrix-bound cyclin D3 fraction can be detected in mouse embryonic fibroblasts. In WT pro-B cells, nuclear matrix-bound cyclin D3 interdigitates with RNAP, forming intertwining strands where it prevents access of Vκ genes to transcription factories (Figure 1E). Cyclin D3 does not enforce monoallelic choice or determine how the Vκ genes are transcribed. Rather, it appears to repress productive access of the Vκ genes, on the Igκ allele fated for recombination, to TFs.
Cyclin D3 is a critical regulator of cell cycle in pro- and large pre-B cells {Cooper, 2006 #9458}. Cell cycle exit in small pre-B cells is associated both with repression of cyclin D3 transcription {Cooper, 2006 #9458} and the translocation of cyclin D3 protein to the nuclear membrane {Karki, 2018 #17541}. Our data indicate that this coordinated repression of cyclin D3 in small pre-B cells both directs cells to exit cell cycle and derepresses Vκ, setting the stage for Igκ recombination in non-cycling cells.
Cyclin D3 also repressed other V genes, including Igh and TcrγV genes, which share similarities with Vκ in their V gene organization {Lefranc, 2005 #17557}{Ribeiro de Almeida, 2015 #17548}{Ebert, 2015 #17566}. These similarities suggest V accessibility as a common determinant of monoallelic choice at antigen receptor genes. In addition, cyclin D3 repressed about 200 other genes, at least 70% of which are known to be monoallelically expressed {Karki, 2018 #17541}. Some of these repressed genes, such as olfactory and protocadherin genes, are members of diverse families in which monogenic choice occurs upon cell cycle exit and differentiation {Monahan, 2015 #17481}. Furthermore, and similar to antigen receptor genes, protocadherin and olfactory gene segments are clustered within TADs {Monahan, 2012 #17565}{Holwerda, 2012 #17567}{Guo, 2015 #17558}{Kim, 2007 #17561}. These data suggest that loop capture transcription at TFs, and its regulation by cyclin D3, is a general mechanism coupling cell cycle exit to monogenic choice among diverse gene families.
The mechanism by which cyclin D3 prevents Vκ transcription at TFs is not known. However, cyclin D3 is not the first nuclear matrix-associated protein to be implicated in gene regulation. Similar to cyclin D3, special AT-rich sequence-binding protein-1 (SATB1) is distributed in thymocyte nuclei in a cage-like pattern on the nuclear matrix, where it not only organizes chromatin folding but also establishes specific histone modifications over the region where it binds {Yasui, 2002 #17569}. For example, in the Il2ra locus, SATB1 recruits histone deacetylases and thereby contributes to repression of the locus. Interestingly, and consistent with our observation in Ccnd3-/- pro-B cells, SATB1 also represses neuron-specific genes in thymocytes {Cai, 2003 #17560}, suggesting a broad role in gene repression. These data highlight a critical, if poorly understood, role for nuclear matrix-associated proteins in gene regulation.
Chromatin loop structure is constrained in nuclear niches
A striking finding was that each Igκ allele was surrounded by cylindrical cages of nuclear matrix-bound RNAP that appeared to define and constrain the space in which the genes reside. To understand these nuclear niches better, we performed 3D imaging on WT pro-B and pre-B cells and measured distances (Imaris) between distal Vκ, Jκ and RNAP {Karki, 2018 #17541} (immuno-FISH and data not shown).
Shown in Figure 2A–B is a summary of approximate measured relationships between Igκ and nuclear matrix-bound RNAP in WT pro-B cells (Figure 2A) and small pre-B cells (Figure 2B). As demonstrated, the allele fated for recombination (Allele 1) in WT pro-B cells is in a tight niche with the Vκ genes extended in an RNAP cylinder with a diameter of approximately 200 to 250 nm. In contrast, the allele that will not initially recombine (Allele 2) is less restricted by an RNAP matrix, which is shaped like a truncated cone. Upon contraction in small pre-B cells, the Vκ genes of Allele 1 are pulled towards the recombination center assembled at Jκ which is anchored by the MAR {Cockerill, 1986 #5467} into a tight RNAP niche that approximates a sphere rather than a cylinder.
We then used polymer chain simulation of Igκ to investigate the implications of enclosing Igκ in a cylindrical niche. Simulation of Vκ structure was performed either without spatial constraints (Figure 3A–B) or constrained within a 0.8 μm × 0.2 μm cylinder, which approximates nuclear niche size when Vκ transcription is first initiated (Figure 3C–E). When unconfined, the Igκ polymer folds into a globular meshwork of DNA, without prominent loop structures (Figure 3A). Furthermore, this organization prevented CTCF sites from forming contiguous loops (Figure 3B). Predicted loop structure changes dramatically when restricted to a cylinder (Figure 3C). Proximal Vκ TADs interact more extensively with Jκ, consistent with preferential initial recombination {Karki, 2018 #17541} (Movie S1). In contrast, the intermediate loop L4 extends laterally across the cylinder, away from Jκ and more towards the cylinder interior (Movie S1). Strikingly, the L2 and L3 TADs, which contain most expressed Vκs, form more globular domains with borders extending to the cylinder edge in predicted close proximity with RNAP (Figure 3D and Movie S2). In multiple instances, Vκs exposed to the cylinder surface were close to CTCF sites. Highly used Vκs, which comprise a subset of expressed Vκs, had a similar spatial distribution to all expressed Vκs, lying either close to a CTCF site or on the outside of an exposed loop (Figure 3E). These results suggest that restriction within a nuclear matrix cylindrical niche positions transcriptionally permissive Vκ genes for capture by nuclear matrix-associated RNAP.
Constraining Igκ in a matrix-defined cylindrical space compressed and ordered Vκ-containing TADs such that Vκ genes close to CTCF sites, or those in small TADs, were exposed towards the surface of the cylinder and therefore were accessible to RNAP. In contrast, central regions of large TADs tended to fold towards the interior of the cylinder and were relatively unavailable. This gene topology predicts that nuclear matrix-bound RNAP would tend to engage Vκ genes near CTCF sites and then stochastically read in either direction. However, only reading away from CTCF sites would be productive. This model is consistent with the pattern of Vκ expression observed in single small pre-B cells {Karki, 2018 #17541}. Our data suggest that the shape and size of the nuclear niche in which an Igκ allele resides changes the 3D topology of the gene and the way in which Vκ gene-containing TADs are packed within the nucleus.
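To give a feel for what confinement to a cylinder means computationally, the following is a deliberately minimal sketch and not the simulation used for Figure 3. It grows an ideal freely jointed chain by rejection sampling inside a cylinder, reading the 0.8 μm × 0.2 μm dimensions as length × diameter (an assumption). The 30 nm step, the 25 nm surface shell and the function names are hypothetical, and CTCF loops and excluded volume are ignored.

```python
import math
import random


def confined_chain(n_beads, step_nm=30.0, length_nm=800.0,
                   radius_nm=100.0, seed=0):
    """Grow a freely jointed chain whose beads all lie inside a cylinder.

    The cylinder axis is z, running from 0 to length_nm. Any step that
    would leave the cylinder is rejected and redrawn.
    """
    rng = random.Random(seed)
    beads = [(0.0, 0.0, length_nm / 2.0)]  # start on the axis, mid-length
    while len(beads) < n_beads:
        x, y, z = beads[-1]
        dx, dy, dz = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        norm = math.sqrt(dx * dx + dy * dy + dz * dz)
        if norm == 0.0:
            continue
        nx = x + step_nm * dx / norm
        ny = y + step_nm * dy / norm
        nz = z + step_nm * dz / norm
        if math.hypot(nx, ny) <= radius_nm and 0.0 <= nz <= length_nm:
            beads.append((nx, ny, nz))
    return beads


def surface_fraction(beads, radius_nm=100.0, shell_nm=25.0):
    """Fraction of beads within shell_nm of the lateral cylinder wall."""
    near = sum(1 for x, y, _ in beads if math.hypot(x, y) >= radius_nm - shell_nm)
    return near / len(beads)


chain = confined_chain(500)
print("fraction of beads near the cylinder surface:", surface_fraction(chain))
```

A fuller model would add loop anchors at CTCF sites and excluded volume; the point here is only that a hard boundary changes which segments of the polymer lie near the surface where matrix-bound RNAP is proposed to reside.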
We observed a range of distances between the Vκ genes and RNAP (data not shown) predicted to enforce different Vκ topologies in individual cells thereby diversifying the expressed Vκ repertoire. Furthermore, our modeling assumes static CTCF-defined loops yet recent evidence indicates that they are very dynamic {Rao, 2017 #17562}{Fudenberg, 2016 #17568}. Loop movement in the observed nuclear niches, relative to fixed RNAP complexes is predicted to be another mechanism by which different Vκ repertoires, in individual cells, would be transcribed and available to recombination. Therefore, Type 2 transcription provides multiple potential mechanisms to understand Vκ diversity in the Igκ repertoire.
In pro-B cells, we observed that the Igκ allele not fated for immediate transcription was in a much larger niche. We do not know if differences in niche size or molecular composition dictate monoallelic choice. However, it raises the possibility that factors extrinsic to the Igκ genes, and not intrinsic differences such as H3K27me3, dictate which allele is first transcribed. How this might work will require a better understanding of the protein complexes assembled on the nuclear matrix and the functions they mediate.
Finally, the niche into which the Vκ genes were pulled onto Jκ was quite small, approximately 200 nm in diameter. Constraining Vκ, Jκ and the recombination machinery would limit degrees of freedom and increase local concentrations of reactants thereby favoring efficient recombination. Therefore, nuclear niches provide a conceptual framework with which to understand several fundamental aspects of both immunoglobulin gene diversity and mechanisms of recombination.
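As a rough back-of-envelope illustration of the concentration argument, one can compare the volume of an idealized 0.8 μm × 0.2 μm cylinder with that of an idealized 200 nm diameter sphere. Treating the niches as perfect geometric shapes, and assuming the same number of molecules occupies both, are simplifications introduced here and are not measurements from the cited work.

```python
import math


def cylinder_volume(diameter_um, length_um):
    """Volume of a cylinder in cubic micrometers."""
    return math.pi * (diameter_um / 2.0) ** 2 * length_um


def sphere_volume(diameter_um):
    """Volume of a sphere in cubic micrometers."""
    return (4.0 / 3.0) * math.pi * (diameter_um / 2.0) ** 3


extended = cylinder_volume(0.2, 0.8)   # idealized niche before contraction
contracted = sphere_volume(0.2)        # idealized niche after contraction
fold = extended / contracted           # fold increase in local concentration
print(round(fold, 2))  # 6.0
```

Under these assumptions, a fixed number of reactants would be roughly six-fold more concentrated in the contracted niche, which gives a sense of scale for the effect described above.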
Summary
Our recent observations {Karki, 2018 #17541} highlight the importance of the nuclear environment or niche in which the Igκ alleles reside. Surrounding each allele is RNAP arrayed on nuclear matrix strands, and Vκ gene TADs translocate to these RNAP complexes to be transcribed. The exact spatial relationships between the Vκ genes and the RNAP complexes are predicted to influence the probability with which Vκ genes are transcribed in a particular cell. Furthermore, in each pro- or pre-B cell the spatial relationships between RNAP and the Vκ gene-containing TADs vary. Therefore, intrinsic to this loop capture mechanism of transcription are features predicted to provide diversity to the primary Vκ repertoire.
Productive translocation of Vκ-containing TADs is repressed by cyclin D3, which is intertwined with RNAP on the nuclear matrix. The remarkable duality of cyclin D3, as both a cell cycle effector {Cooper, 2006 #9458}{Mandal, 2009 #14530} and repressor of Vκ transcription {Karki, 2018 #17541}{Powers, 2012 #9529}, ensures the tight coupling of cell cycle exit to the initiation of Igκ recombination required to ensure genomic integrity {Clark, 2014 #14857}. Furthermore, each cyclin D3 function is mediated by different fractions of cyclin D3 within the nucleus {Powers, 2012 #9529}, providing another example of how nuclear spatial relationships determine function. Defining the interactions between genes and nuclear matrix is complicated and experimentally difficult. However, it is necessary for understanding fundamental mechanisms of gene function including immunoglobulin gene recombination.
Supplementary Material