Abstract
Synovial tissue inflammation is a hallmark of rheumatoid arthritis (RA). Recent work has identified prominent pathogenic cell states in inflamed RA synovial tissue, such as T peripheral helper cells; however, the epigenetic regulation of these states has yet to be defined. Here, we examine genome-wide open chromatin at single-cell resolution in 30 synovial tissue samples, including 12 samples with transcriptional data in multimodal experiments. We identify 24 chromatin classes and predict their associated transcription factors, including a CD8+ GZMK+ class associated with EOMES and a lining fibroblast class associated with AP-1. By integrating with an RA tissue transcriptional atlas, we propose that these chromatin classes represent ‘superstates’ corresponding to multiple transcriptional cell states. Finally, we demonstrate the utility of this RA tissue chromatin atlas through the associations between disease phenotypes and chromatin class abundance, as well as the nomination of classes mediating the effects of putatively causal RA genetic variants.
Subject terms: Epigenetics in immune cells, Rheumatoid arthritis
The epigenetic changes underlying the heterogeneity of RA disease presentation have been the subject of intense scrutiny. In this study, the authors use multiple single-cell sequencing datasets to define ‘chromatin superstates’ in patients with RA, which associate with distinct transcription factors and disease phenotypes.
Introduction
Rheumatoid arthritis (RA) is a chronic autoimmune disease that affects ~1% of people in North America and Northern Europe1. In RA, the synovial joint tissue is infiltrated by immune cells that interact with stromal cells to sustain a cycle of inflammation. Untreated, RA can lead to joint destruction, disability, and a reduction in life expectancy2. The heterogeneous clinical features of RA, including differences in cyclic citrullinated peptide antibody autoreactivity3, underlying genetics4,5, and response to targeted therapies6–10, render it challenging to construct generic treatment plans that will be effective for most patients.
Recent studies have taken advantage of single-cell technologies to define key cell populations that are present and expanded in RA tissue inflammation11–14, demonstrating both the heterogeneous nature of tissue inflammation and the promise to identify novel targeted therapeutics for RA. Our recent Accelerating Medicines Partnership Program: Rheumatoid Arthritis (AMP-RA) reference study14 comprehensively classified pathogenic transcriptional cell states within synovial joint tissue using single-cell CITE-seq15, which simultaneously measures mRNA and surface protein marker expression at the single-cell level. Within 6 broad cell types (B/plasma, T, NK, myeloid, stromal [fibroblast/mural], and endothelial), the study defined 77 fine-grain cell states. Many of these cell states have been previously shown to be associated with RA pathology: for example, CD4+ T peripheral helper cells (TPH)11,12, HLA-DRhi sublining fibroblasts11, proinflammatory IL1B+ monocytes11, and autoimmune-associated B cells (ABC)11,16. However, we have a limited understanding of the chromatin accessibility profiles that underlie these pathogenic synovial tissue cell states.
Open chromatin at critical cis-regulatory regions allows essential transcription factors (TFs) to access DNA and epigenetically regulate gene expression17. Chromatin accessibility is a necessary, but not sufficient, condition for RNA polymerases to produce transcripts at gene promoters18. Therefore, one possibility is that each transcriptional cell state has its own unique chromatin profile19, which we will denote as a chromatin class. Alternatively, multiple transcriptional cell states could share a chromatin class if the cell states were dynamically transitioning from one to another in response to external stimuli without altering the chromatin landscape19. In RA, those external stimuli could be cytokines that activate TFs to induce the expression of key genes and drive pathogenic cell states20. For example, NOTCH3 signaling propels transcriptional programs coordinating the transformation from perivascular fibroblasts to inflammatory sublining fibroblasts21. Similarly, exposure to TNF and interferon-γ promotes the differentiation of monocytes into inflammatory myeloid cells22.
Here, we characterize synovial cells from patients with RA or osteoarthritis (OA) using unimodal single-cell ATAC-seq (scATAC-seq) and multimodal single-nucleus ATAC-seq (snATAC-seq) and RNA-seq (snRNA-seq) technologies to compare chromatin classes to transcriptional cell states (Fig. 1a). Our results support a model of open chromatin superstates shared by multiple fine-grain transcriptional cell states. We show these superstates may be regulated by key TFs and associated with clinical and genetic factors in the pathology of RA (Fig. 1a).
Results
Quality control of unimodal scATAC-seq and multimodal snATAC-seq synovial tissue datasets
We obtained synovial biopsy specimens from 25 people with RA and 5 with OA and disaggregated cells using well-established protocols from the AMP-RA/SLE consortium23 (Methods). We conducted unimodal scATAC-seq on samples from 14 RA patients and 4 OA patients and multimodal snATAC-/snRNA-seq on samples from 11 RA patients and 1 OA patient (Supplementary Table 1). Applying stringent quality control to the open chromatin modality, we retained cells with >10,000 reads, >50% of those reads falling in peak neighborhoods, >10% of reads in promoter regions, <10% of reads in the mitochondrial chromosome, and <10% of reads falling in the ENCODE blacklisted regions24 (Methods; Supplementary Figs. 1a, b and 2a, b; Supplementary Table 2). We further required that cells from the multimodal data pass quality control for the snRNA-seq modality (Methods; Supplementary Figs. 1b and 2c). After additional QC within individual cell types combining both technologies, the final dataset contained 86,994 cells from 30 samples (median of 2990 cells/sample) (Supplementary Figs. 1c, d and 2d, e). For consistency, we called a set of 132,520 consensus peaks from the unimodal scATAC-seq data to be used for all analyses (Methods). We observed that 95% of the called peaks overlapped ENCODE candidate cis-regulatory elements (cCREs)25 and 17% overlapped promoters26, suggesting highly accurate peak calls (Supplementary Fig. 2f).
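The open chromatin QC thresholds above can be written as a simple per-cell filter. The following is a minimal sketch, not the pipeline used in the study: the `CellQC` record and its field names are hypothetical, and it assumes the per-cell read fractions have already been computed upstream.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Hypothetical per-cell QC summary; field names are illustrative only."""
    n_reads: int                        # total reads for the cell
    frac_in_peak_neighborhoods: float   # fraction of reads in peak neighborhoods
    frac_in_promoters: float            # fraction of reads in promoter regions
    frac_mito: float                    # fraction of reads on the mitochondrial chromosome
    frac_blacklist: float               # fraction of reads in ENCODE blacklisted regions


def passes_atac_qc(cell: CellQC) -> bool:
    """Apply the thresholds stated in the text (all inequalities are strict)."""
    return (
        cell.n_reads > 10_000
        and cell.frac_in_peak_neighborhoods > 0.50
        and cell.frac_in_promoters > 0.10
        and cell.frac_mito < 0.10
        and cell.frac_blacklist < 0.10
    )


# Example usage with made-up values.
good = CellQC(25_000, 0.72, 0.21, 0.02, 0.01)
low_depth = CellQC(10_000, 0.72, 0.21, 0.02, 0.01)  # exactly 10,000 reads fails ">10,000"
kept = [c for c in (good, low_depth) if passes_atac_qc(c)]
```

Cells from the multimodal data would additionally need to pass the snRNA-seq QC described in Methods, which is not reproduced here.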
Defining RA broad cell types by clustering unimodal and multimodal datasets
To assign each cell to a broad cell type, we clustered the unimodal scATAC-seq and multimodal snATAC-seq datasets independently (Methods). In both instances, we characterized six cell types that we annotated based on the chromatin accessibility of “marker peaks,” defined as peaks in cell-type-specific marker gene promoters (Methods; Fig. 1b). We identified T cells (CD3D and CD3G), NK cells (NCAM1 and NCR1), B/plasma cells (MS4A1 and TNFRSF17), myeloid cells (CD163 and C1QA), stromal cells (PDPN and PDGFRB), and vascular endothelial cells (VWF and ERG) (Supplementary Fig. 2g–j). In the multimodal data, we observed consistent peak accessibility and gene expression for marker genes in these cell types (Supplementary Fig. 2k–m).
We combined cells from unimodal and multimodal chromatin technologies and then created datasets for each of the broad cell types. For cell types with more than 1500 cells, we applied Louvain clustering to a shared nearest neighbor graph based on batch corrected27 principal components (PCs) of chromatin accessibility to define fine-grain chromatin classes (Methods).
RA T cell chromatin classes
We first examined the accessible chromatin for 23,168 T cells across unimodal and multimodal datasets. Louvain clustering defined 5 T cell chromatin classes, denoted as TA for T cell ATAC, across 30 samples (Fig. 2a; Supplementary Fig. 3a). In the TA−2: CD4+ PD-1+ TFH/TPH chromatin class, we observed high promoter accessibility and gene expression for PD-1 (PDCD1) and CTLA4, marker genes for T follicular helper (TFH)/TPH cells (Fig. 2b; Supplementary Fig. 3b). A known expanded pathogenic cell state in RA, TFH/TPH cells help B cells respond to inflammation11,12. The TA−3: CD4+ IKZF2+ Treg cluster had high accessibility and expression for IKZF2 (Helios), which can stabilize the inhibitory activity of regulatory T cells28 (Treg) (Fig. 2b). We also observed open chromatin regions at both the FOXP3 transcription start site (TSS) as well as the downstream Treg-specific demethylated region29 (TSDR) specifically for TA−3 (Supplementary Fig. 3c); FOXP3 was also expressed exclusively in TA−3 cells (Supplementary Fig. 3b). We found one more predominantly CD4+ T cell class, TA−1: CD4+ IL7R+, with high expression and accessibility for IL7R, encoding the CD127 protein. This marker is typically lost with activation, suggesting that TA−1 is a population of naive or central memory T cells, as further evidenced by SELL and CCR7 expression (Fig. 2b; Supplementary Fig. 3b). The TA−0: CD8A+ GZMK+ cluster was marked by GZMK and CRTAM peak accessibility and gene expression (Fig. 2b; Supplementary Fig. 3b); a similar population has been shown to be expanded in RA and a major producer of inflammatory cytokines11,30. We found another primarily CD8+ group of T cells, the TA−4: CD8A+ PRF1+ cytotoxic cluster, which had high accessibility for the PRF1 promoter and expression for the PRF1, GNLY, and GZMB genes, suggesting an effector memory phenotype (Fig. 2b; Supplementary Fig. 3b).
Since T cells are primarily defined as CD4 and CD8 lineages that are not thought to cross-differentiate31, we next examined whether the chromatin classes were strictly segregated by CD4 or CD8 promoter peak accessibility. We observed that each chromatin class, while largely showing accessibility for only one lineage’s promoter, also included some cells with accessibility for the other lineage’s promoter (Supplementary Table 3). For example, cytotoxic T cells in TA−4 were more likely to have an accessible CD8A promoter, but also included a minority of cells with accessibility at the CD4 promoter. Therefore, we assessed which promoter peaks were associated with a specific lineage. While accounting for chromatin class, sample, and fragment count, we ran a logistic regression model over all T cells relating each promoter peak’s openness to CD4/CD8A promoter peak accessibility status: 1 for open CD4 and closed CD8A, −1 for open CD8A and closed CD4, or 0 otherwise (Methods). We only found 93 out of 16,383 promoter peaks open in T cells significantly associated with a lineage’s promoter accessibility, with 29 associating to CD4 and 64 to CD8A, at FDR < 0.20 (Supplementary Data 1). This indicated that T cell lineage is important for a small subset of genes’ local promoter chromatin environment, such as IL6ST in CD4 T cells and CRTAM in CD8 T cells, and those lineage-specific loci segregate by chromatin class as expected (Methods; Supplementary Fig. 3d). However, the majority of promoters appeared to be more specifically accessible within their chromatin classes across lineages. This might suggest that the corresponding gene’s function was critical for the class definition, as highlighted by functional genes such as PRF1 with expression in both cytotoxic CD4 and CD8 T cells32 as well as the homing gene CCR7 that acts across both lineages33.
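The three-level lineage status used in this model can be made explicit with a short sketch. Only the status encoding is shown; the regression itself, with its chromatin class, sample, and fragment count covariates, is described in Methods and is not reproduced here. The function name and boolean inputs are illustrative assumptions.

```python
def lineage_status(cd4_promoter_open: bool, cd8a_promoter_open: bool) -> int:
    """Encode per-cell CD4/CD8A promoter peak accessibility as in the text.

    Returns 1 for open CD4 and closed CD8A, -1 for open CD8A and closed CD4,
    and 0 otherwise (both open or both closed).
    """
    if cd4_promoter_open and not cd8a_promoter_open:
        return 1
    if cd8a_promoter_open and not cd4_promoter_open:
        return -1
    return 0


# Example: four hypothetical cells covering every combination.
cells = [(True, False), (False, True), (True, True), (False, False)]
statuses = [lineage_status(cd4, cd8a) for cd4, cd8a in cells]
```

Under this encoding, cells with ambiguous accessibility (both promoters open or both closed) contribute no lineage signal to the per-peak association.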
We next identified the TFs potentially regulating these T cell chromatin classes by calculating TF motif enrichments34 in class-specific peaks35 whose TFs were at least minimally expressed within that class (Methods). In the primarily CD8+ classes, TA−0: CD8A+ GZMK+ and TA−4: CD8A+ PRF1+ cytotoxic, we found EOMES (padj = 7.44e-99, 8.12e-44, respectively) and T-bet (TBX21) (padj = 4.92e-90, 2.75e-38, respectively) motifs enriched (Fig. 2c); the corresponding TFs are known to drive memory and effector CD8+ cell states36. EOMES had significantly higher gene expression in TA−0 cells compared to all other T cells (Wilcoxon FDR = 1.92e-84; Supplementary Data 2). Furthermore, we found both motifs in the promoter of KLRG1, a gene expressed in CD8+ effector T cells that might participate in the effector-to-memory transition37 (Fig. 2d). The cytotoxic TA−4 class was also enriched for RUNX338 motifs (padj = 5.81e-13) (Fig. 2c). Within the TA−2: CD4+ PD-1+ TFH/TPH class, we observed high enrichments for AP-1 motifs, especially BATF (padj = 3.31e-103; Fig. 2d), which promotes expression of key programs in TFH cells39 and had higher gene expression in this class’s cells (Wilcoxon FDR = 3.10e-125; Supplementary Data 2). We found TCF7 and LEF1 motifs40 within the non-activated TA−1: CD4+ IL7R+ cluster (padj = 1.14e-10, 3.97e-13, respectively; Fig. 2d).
RA stromal chromatin classes
Next, we analyzed 24,307 stromal cells (Methods). With Louvain clustering, we partitioned the cells into 4 open chromatin classes: lining fibroblasts (SA−1) along the synovial membrane, sublining fibroblasts (SA−0, SA−2) filling the interstitial space, and mural cells (SA−3) adjacent to blood vessels41 (Fig. 3a; Supplementary Fig. 4a). The most abundant sublining cluster, SA−0: CXCL12+ HLA-DRhi sublining fibroblasts, was a proinflammatory cluster marked by CXCL12, HLA-DRA, and CD74 accessibility and expression; SA−0 also expressed IL6, which is an established RA drug target7,8 (Fig. 3b; Supplementary Fig. 4b). The SA−2: CD34+ MFAP5+ sublining fibroblast class had accessible promoter peaks, where available, for the expressed CD34, MFAP5, PI16, and DPP4 genes, previously reported to represent a progenitor-like fibroblast state shared across tissue types42–44 (Fig. 3b; Supplementary Fig. 4b). The SA−1: PRG4+ lining fibroblast chromatin class was characterized with high accessibility and expression of PRG4 and CRTAC1 (Fig. 3b; Supplementary Fig. 4b). We also observed high expression of MMP1 and MMP3, matrix metalloproteinases responsible for extracellular matrix (ECM) destruction45, within SA−1 (Supplementary Fig. 4b). Finally, we found a mural cell class, SA−3: MCAM+ mural, with both gene expression and promoter peak accessibility for MCAM and NOTCH3 (Fig. 3b; Supplementary Fig. 4b). In RA, NOTCH3 signaling from the endothelium acts primarily on mural cells, which in turn stimulate sublining fibroblasts along a spatial axis21 as seen in the decreasing NOTCH3 gene expression from SA−3, SA−0, SA−2, to SA−1 in the multiome cells (Supplementary Fig. 4b). Knockout of NOTCH3 has been shown to reduce inflammation and joint destruction in mouse models21.
DNA methylation and chromatin accessibility work in tandem to define cell-type-specific gene regulation through silencing CpG-dense promoters and repressing methylation-sensitive TF binding46. Methylation changes have been previously described between cultured fibroblast cell lines from RA and OA patients47,48. Thus, we wondered if a specific subset of fibroblasts might be the source of these differentially methylated regions (DMRs). Using a published set of DMRs for RA versus OA fibroblast-like synoviocyte (FLS) cell lines47, we defined a per-cell score of peak accessibility associated with hypermethylated (positive) or hypomethylated (negative) loci in RA (Methods). The sublining fibroblasts in SA−0 were enriched for hypomethylated regions (Wilcoxon SA−0 versus other stromal cells one-sided p < 2.2e-16), suggesting that the RA synovial fibroblast DMRs were relatively enriched for putatively functional accessible chromatin regions specifically in sublining fibroblasts (Supplementary Fig. 4c). Furthermore, the genes associated with these FLS DMRs were expressed primarily in tissue SA−0 (Supplementary Fig. 4d, right; Methods) and are crucial to a number of signaling pathways potentially at play in these inflammatory fibroblasts47: STAT3 in IL-6 signaling, CASP1 in IL-1 signaling, TRAF2 in TNF signaling, and TGFB3 in TGFβ signaling. These results raised the possibility of epigenetic memory retention even after multiple FLS cell line passages49, as sublining fibroblasts, particularly HLA-DRhi and CD34− fibroblasts, are expanded in RA relative to OA in synovial tissue samples11.
We then considered if the retention of DNA methylation after multiple passages extended to a retention of chromatin accessibility or whether that would be lost alongside transcriptional identity21. To assess this, we developed two per-cell scores of fibroblast identity comparing tissue lining (SA−1) to sublining (SA−0, SA−2) cells; one score using differentially expressed genes and the other using differentially accessible peaks. Using a multiome dataset of isolated FLS from two RA synovial tissue samples cultured for three passages in a recent RA fibroblast heterogeneity study44, we compared their per-cell fibroblast identity score to our tissue fibroblast populations in both gene and peak space. Unsurprisingly, we found that differential genes from tissue were able to separate tissue lining and tissue sublining cells, but the cultured FLS did not have discernable lining and sublining populations by the same measure, consistent with previous results21 (Supplementary Fig. 4e). More surprisingly, we saw similar results using the fibroblast identity peak score (Supplementary Fig. 4f), suggesting that fibroblast peak accessibility, and more broadly chromatin class identity, was not maintained in cell culture after multiple passages. This disconnect between DNA methylation and chromatin accessibility has also been seen previously when assaying both directly using ATAC-Me in the monocyte-to-macrophage cell fate transition50.
Next, we investigated which TFs were putatively driving these chromatin classes (Fig. 3c). AP-1 motifs such as FOS::JUND were most significantly enriched in the SA−1 lining class (padj = 9.29e-152; Fig. 3c). These TFs are known to play many roles in RA and specifically regulate MMP1 and MMP3 promoters49,51 (Fig. 3d). The progenitor-like sublining SA−2 class harbored NFATC motifs, such as NFATC4 (padj = 2.89e-36; Fig. 3c). In the SA−0: CXCL12+ HLA-DRhi sublining chromatin class, we found TEAD152 (padj = 2.86e-52; Fig. 3c) and STAT1/3 TF motif enrichments (padj = 3.34e-37, 4.27e-38, respectively; Fig. 3c), with the latter likely regulating the JAK/STAT pathway responsible for the proinflammatory cytokine activation central to RA clinical activity9,53. The gene expression of TEAD1 and STAT3 in SA−0 cells was significantly higher than in the other stromal cells (Wilcoxon FDR = 1.05e-27 and 1.65e-17, respectively; Supplementary Data 2). Finally, SA−3: MCAM+ mural cells were enriched for KLF254,55 and EBF156,57 motifs (padj = 4.94e-119, 1.83e-119, respectively; Fig. 3c).
RA myeloid chromatin classes
We classified 25,691 myeloid cells into 5 chromatin classes (Fig. 4a; Supplementary Fig. 5a). The first class, MA−2: LYVE1+ TIMD4+ TRM, had markers for tissue-resident macrophages (TRM) with gene and peak signal at LYVE1, a perivascular localization marker13, and TIMD4, a scavenger receptor13 (Fig. 4b; Supplementary Fig. 5b). We found another TRM class, MA−0: F13A1+ MARCKS+ TRM, with high accessibility and expression at F13A1 and MARCKS, both known to be expressed in macrophages58,59 (Fig. 4b; Supplementary Fig. 5b). The MA−1: FCN1+ SAMSN1+ infiltrating monocytes had accessibility and expression for FCN1, PLAUR, CCR2, and IL1B, similar to an expanded proinflammatory population in a previous RA study11 (Fig. 4b; Supplementary Fig. 5b). The MA−4: SPP1+ FABP5+ intermediate class likely arose from bone marrow-derived macrophages60 with its high accessibility and expression for SPP1 (Fig. 4b); bone marrow-derived macrophages are known to be abundant in active RA and induce proinflammatory cytokines/chemokines13,61. Finally, we found the MA−3: CD1C+ AFF3+ DC chromatin class with expression markers CD1C, AFF3, CLEC10A, and FCER1A, whose corresponding promoter peaks generally showed more promiscuously open chromatin across classes (Fig. 4b; Supplementary Fig. 5b).
We next investigated the TF motifs enriched in the myeloid chromatin classes. MA−2 was enriched for KLF motifs (Fig. 4c), with KLF4 (padj = 1.34e-6) known to both establish residency of TRMs and to assist in their phagocytic function62. Furthermore, we found a KLF4 motif in the promoter of C1QB, whose protein product bridges phagocytes to the apoptotic cells they clear63 (Fig. 4d). Both the intermediate MA−4 and the infiltrating monocyte MA−1 classes had significant enrichments of AP-1 activation motifs (e.g., JUN padj = 1.77e-153, 3.65e-136, respectively; Fig. 4c). AP-1 TFs have been shown to function in human classical monocytes along with CEBP TFs64, also enriched in MA−1 (e.g., CEBPD padj = 2.10e-26; Fig. 4c). SPI1 (PU.1) is the master regulator of myeloid development65, including conventional DCs66. We found the SPI1 motif most strongly enriched in the DC cluster MA−3 (padj = 3.24e-55; Fig. 4c), though the related SPIB motif’s corresponding TF, known to function in pDCs67, was more specifically expressed in this class (Wilcoxon FDR = 6.93e-74; Supplementary Data 2).
RA B/plasma chromatin classes
Next, we clustered 8641 B and plasma cells into 4 MS4A1+ B cell and 2 SDC1+ plasma cell chromatin classes (Methods; Fig. 5a; Supplementary Fig. 6a). We defined a BA−3: FCER2+ IGHD+ naive B class with high accessibility and expression of FCER2, encoding naïve marker CD2368 (Fig. 5b; Supplementary Fig. 6b). We also labeled a BA−4: CD24+ MAST4+ unswitched memory B class (Supplementary Fig. 6b). IGHD and IGHM expression was lower in BA−2: TOX+ PDE4D+ switched memory B cells, and the TF TOX had its highest expression and accessibility within B cells in BA−2 as previously shown in switched memory B cells69,70 (Fig. 5b; Supplementary Fig. 6b). BA−5: ITGAX+ ABC had high accessibility and expression of ITGAX, encoding for CD11c, a key ABC marker71 (Fig. 5b; Supplementary Fig. 6b). ABCs were shown to be associated with leukocyte-rich RA11 with a potential role in antigen presentation72, which was supported here by the expression of LAMP1 and HLA-DRA in BA−5 (Supplementary Fig. 6b). The plasma chromatin class, BA−0: CREB3L2+ plasma, was marked by CREB3L2, a known TF in the transition between B and plasma cells73 (Fig. 5b; Supplementary Fig. 6b). These results suggested tissue in situ B cell activation and differentiation into plasma cells, as we have previously suggested74. Finally, BA−1: CD27+ plasma, had the highest accessibility and expression of CD27 (Fig. 5b; Supplementary Fig. 6b). We note that plasma cells were difficult to define using chromatin accessibility data, with many of the immunoglobulin genes having low signal (Supplementary Fig. 6b).
We then explored the TF motif landscape of B and plasma cells. B cells shared many TF motifs across clusters, with many ETS factors (e.g., SPIB, SPI1, ETS1) as well as EBF1 and NFkB1/2 (Fig. 5c). SPIB and SPI1 work together to regulate B cell receptor signaling75, which starts its dysregulation in RA at the naive B cell level76,77 (padj = 0, 0, respectively; Fig. 5c). Switched memory B cells were enriched for ETS1 motifs (padj = 9.51e-19; Fig. 5c), whose TF is required for IgG2a class switching in mice78. In plasma cells, BA−0 had over-represented motifs such as KLF279 and SP380 (padj = 8.94e-105, 3.84e-138, respectively; Fig. 5c, d). BA−1 was enriched for AP-1 factor motifs81, namely BATF::JUN (padj = 0; Fig. 5c, d, Supplementary Fig. 6c). Both BATF and JUN gene expression was higher in BA−1 cells compared to those in other B/plasma classes (Wilcoxon FDR = 9.29e-04 and 1.60e-47, respectively; Supplementary Data 2). In the locus of PRDM1, a known plasma cell TF80, the more BA−0 accessible peak had an SP3 motif while the more BA−1 accessible peaks had BATF::JUN motifs (Fig. 5d), suggesting potentially different regulatory strategies by class.
RA endothelial chromatin classes
Among the 3809 endothelial cells, we identified 4 chromatin classes (Fig. 6a; Supplementary Fig. 7a). The EA−2: SEMA3G+ arteriolar class had gene and peak markers for signaling-related genes including SEMA3G82, CXCL12, and JAG1 (Fig. 6b; Supplementary Fig. 7b). The NOTCH3 signaling gradient that causes inflammation and joint destruction in RA mouse models likely originates through Notch ligand JAG1 in these arteriolar endothelial cells21. We identified the EA−0: SELP+ venular class with markers for leukocyte trafficking to tissue such as SELP83 as well as inflammatory genes HLA-DRA and CD74 (Fig. 6b; Supplementary Fig. 7b). We also found a capillary class, EA−1: RGCC+ capillary marked by RGCC84 and SPARC chromatin accessibility and gene expression (Fig. 6b; Supplementary Fig. 7b). Finally, a small population of EA−3: PROX1+ lymphatic cells had gene expression of and promoter peak accessibility at PROX185 and PARD6G genes (Fig. 6b; Supplementary Fig. 7b).
We identified SOX motifs86 in EA−2, STAT motifs87 in EA−0, and AP-1 motifs88 in EA−1 (Fig. 6c). Sox17 is a crucial intermediary between Wnt and Notch signaling that specifically initiates and maintains endothelial arterial identity in mice86. Similarly, we found a SOX17 motif (padj = 3.27e-8) in the promoter of NES89,90 with its highest accessibility and expression (Wilcoxon FDR = 4.29e-19; Supplementary Data 2) in EA−2 cells (Fig. 6d).
Chromatin classes are stable irrespective of OA and low-cell-count samples
Our chromatin classes were determined using all samples for maximum power, so we next investigated the contribution of OA and low-cell-count samples to this classification. While we were underpowered to reliably detect differences between RA and OA, we saw that chromatin classes varied in their proportions between these two diseases (Supplementary Table 4). To determine if the chromatin class definitions were robust to the exclusion of OA samples, we removed the 2395 T cells corresponding to OA samples and reclustered the remaining cells. We only observed positive, significant odds ratios (ORs) for cells from a new RA-only cluster belonging to their corresponding original chromatin class, relative to the other classes (Supplementary Fig. 8a). This showed that the same groups of RA T cells cluster together regardless of whether OA T cells were included in the clustering. Since stromal cells had a higher proportion of OA cells, particularly in lining fibroblasts14,91 (Supplementary Table 4), we also reclustered the stromal cells after removing 4,462 cells from OA samples and found that all four of our original stromal chromatin classes had corresponding RA-only cluster(s) (Supplementary Fig. 8b). Furthermore, we sought to determine if including the low-cell-count samples was impacting the chromatin class definitions, especially for the cell types with lower cell counts overall. To test this, we removed 467 cells across 11 samples with fewer than 100 B/plasma cells and reclustered the remaining cells. We were able to recover all the original B/plasma chromatin classes (Supplementary Fig. 8c), suggesting that these low-cell-count samples did not drive our original classes. We saw similar results in endothelial cells after removing 954 cells across 19 samples (Supplementary Fig. 8d). These analyses suggested our chromatin classes were robust to the inclusion of both OA and low-cell-count samples.
Synovial tissue is key to identifying pathogenic RA chromatin classes
To determine if the chromatin classes identified in RA tissue were comparable with the known peripheral blood chromatin landscape, we clustered the tissue cells with those from a published healthy PBMC multiome dataset92,93 (Supplementary Fig. 9). To quantify the similarity between the PBMC and tissue chromatin classes, we calculated the OR between the newly defined clusters and the original blood and tissue labels; overall, there was good concordance. For example, the PBMC Treg cells and TA−3: CD4+ IKZF2+ Treg cells were both enriched in T cell combined cluster 5 (OR = 12 and 85, respectively) (Supplementary Fig. 9a) and PBMC cDC2 and pDC associated with MA−3: CD1C+ AFF3+ DC in myeloid combined cluster 4 (OR = 45, 78, and 100, respectively) (Supplementary Fig. 9b). However, there were some tissue chromatin classes that did not have clear counterparts in PBMCs, such as TA−2: CD4+ PD-1+ TFH/TPH, MA−2: LYVE1+ TIMD4+ TRM, MA−4: SPP1+ FABP5+ intermediate, and BA−5: ITGAX+ ABC (Supplementary Fig. 9). With the current dataset, we cannot conclusively determine whether these disparities reflect tissue and blood or RA and healthy differences. However, prior studies have shown that these cell states are both tissue-enriched12,71,94 and implicated in RA pathogenesis11–13,16,61, suggesting that the study of disease tissue is necessary for well-powered analyses of these populations.
Chromatin classes are epigenetic superstates of transcriptional cell states
To understand how these chromatin classes corresponded to transcriptionally defined cell states, we used Symphony95 to map the RA multimodal snRNA-seq profiles into the well-annotated AMP-RA cell type references14. After embedding the multimodal snRNA-seq profiles into the AMP-RA reference data, we annotated each multimodal cell by the most common cell state of its five nearest reference neighbors. 70% of T cells (24 states), 96% of stromal cells (10 states), 96% of myeloid cells (15 states), 87% of B/plasma cells (9 states), and 99% of endothelial cells (5 states) mapped well (i.e., at least 3/5 neighbors had the same cell state annotation). We also observed that the proportion of each cell state in the AMP-RA reference and the multimodal query datasets was consistent, suggesting that the reference and query datasets had comparable cell state distributions despite different technologies (Supplementary Fig. 10a–e).
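The neighbor-based annotation rule can be sketched as a majority vote over the five nearest reference neighbors, with a cell counted as well mapped when at least 3 of 5 neighbors agree. This is an illustration under stated assumptions rather than the Symphony implementation: the function name is hypothetical, the neighbor labels are assumed to be already retrieved, and ties are broken by first occurrence, which the text does not specify.

```python
from collections import Counter
from typing import Sequence, Tuple


def annotate_from_neighbors(
    neighbor_states: Sequence[str], min_agree: int = 3
) -> Tuple[str, bool]:
    """Return (most common neighbor state, whether the cell 'mapped well').

    A cell maps well when at least `min_agree` of its reference neighbors
    carry the winning cell state annotation (3 of 5 in the text).
    """
    label, count = Counter(neighbor_states).most_common(1)[0]
    return label, count >= min_agree


# Example with made-up neighbor labels for two query cells.
confident = annotate_from_neighbors(["T-14", "T-14", "T-13", "T-14", "T-5"])
ambiguous = annotate_from_neighbors(["T-14", "T-14", "T-13", "T-13", "T-5"])
```

In the second example the winning label is supported by only 2 of 5 neighbors, so the cell would be excluded from the "mapped well" percentages reported above.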
We then sought to understand the correspondence between the mapped transcriptional cell states and chromatin classes. We calculated an OR for each combination of state and class to measure the strength of association and used a Fisher’s exact test to assess significance. We observed that each transcriptional cell state generally corresponded to a single chromatin class (Fig. 7a–c; Supplementary Fig. 10g, h). In contrast, a single chromatin class represented a superstate encompassing multiple transcriptionally defined cell states. For example, cells in the TA−0: CD8A+ GZMK+ chromatin class were more likely to be labeled in the T-5: CD4+ GZMK+ memory, T-13: CD8+ GZMK/B+ memory, or T-14: CD8+ GZMK+ transcriptional cell states across CD4/CD8 lineages (OR = 11, 12, 11, respectively; Fig. 7a); the high GZMK promoter accessibility and expression shared by these states may have contributed to this categorization (Supplementary Fig. 10f). We saw examples of this model in every cell type: SA−1 linked to F-0/F-1 and SA−0 to F-6/F-5/F-3/F-8 in stromal cells; MA−1 to M-7/M-11 and MA−4 to M-3/M-4 in myeloid cells; BA−4 to B-1/B-3 in B/plasma cells; and EA−0 to E-1/E-2 in endothelial cells (Fig. 7b, c; Supplementary Fig. 10g, h; Supplementary Data 3). In all cell types, the transcriptional cell state classification was more accurate within cells whose transcriptional cell state and chromatin class were concordant (e.g., T-14 and TA−0), supporting our class-to-state mapping (Supplementary Fig. 10i).
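For a single state and class pair, the association test reduces to a 2×2 contingency table of cell counts. The sketch below computes the odds ratio and a Fisher's exact p-value using only the standard library. It assumes a one-sided (enrichment) alternative, which the text does not specify, and the function names and example counts are illustrative.

```python
from math import comb


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Sample odds ratio for the 2x2 table [[a, b], [c, d]].

    a: cells in the class and in the state    b: in the class, not the state
    c: not in the class, in the state         d: in neither
    Returns infinity when b * c == 0 (no continuity correction is applied).
    """
    if b * c == 0:
        return float("inf")
    return (a * d) / (b * c)


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact p-value, P(X >= a), with table margins fixed."""
    row1, col1, n = a + b, a + c, a + b + c + d
    total = comb(n, col1)
    upper = min(row1, col1)
    # Sum hypergeometric probabilities for tables at least as extreme as observed.
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / total


# Example with made-up counts.
or_value = odds_ratio(3, 1, 1, 3)
p_value = fisher_exact_greater(3, 1, 1, 3)
```

An OR well above 1 with a small p-value would indicate that cells of the given transcriptional state are over-represented in the chromatin class; with many state and class pairs, the p-values would also need multiple-testing correction.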
Indeed, when we aggregated the snATAC-seq reads by states, we observed shared openness between transcriptional cell states within the same class (i.e., superstate), as seen with the cytotoxic TA−4 grouped cell states T-12/T-15 at the cytotoxicity-associated32 FGFBP2 gene, lining fibroblast SA−1 grouped cell states F-0/F-1 at the lining-associated11 CLIC5 gene, and intermediate myeloid MA−4 grouped cell states M-3/M-4 at the bone marrow-derived macrophage-associated60 SPP1 gene (Supplementary Fig. 11). Furthermore, we found very few differential promoter peaks between transcriptional states in the same chromatin class even after pseudobulking by sample and state to decrease sparsity (Supplementary Fig. 12a). TA−1: CD4+ IL7R+ had one of the higher numbers of differential peaks within a class, but even there only 1.3% of the peaks tested were differential at FDR < 0.10. Among those were the expected CD4 and CD8A promoter peaks, since both the T-4: CD4+ naive state and T-16: CD8+ CD45ROlow/naive state corresponded to TA−1 (Supplementary Fig. 12b; Fig. 7a). These populations likely mapped together since they shared naïve T cell transcriptional profiles, consistent with a highly accessible SELL promoter peak. This contrasted sharply with the number of differential peaks found between states across classes within a cell type (median of 8717 within a cell type vs 23 within a single class; Supplementary Fig. 12a), suggesting that the chromatin landscape in states within a class is much more homogeneous than across classes, as proposed by our superstate model.
We next asked if evidence for chromatin superstates was sensitive to clustering resolution. We observed that the class and state relationships largely replicated when we increased the open chromatin clustering resolution (Supplementary Fig. 13). To further support the superstate hypothesis, we trained a linear discriminant analysis (LDA) model to predict the transcriptional cell state between each pair of states from the chromatin PCs, upon which the chromatin classes were defined. Generally, transcriptional cell states belonging to the same chromatin class were difficult to distinguish using chromatin accessibility data alone (Supplementary Fig. 14). As an example, transcriptional states T-14 and T-13 both belonged to chromatin class TA−0, and thus chromatin PCs could not easily discriminate between them (AUC = 0.61); on the other hand, T-14 and T-3 belonged to classes TA−0 and TA−2, respectively, and LDA nearly perfectly distinguished them (AUC = 0.98) (Supplementary Fig. 14a). In all cell types, the mean AUC between states within the same chromatin class was less than that of states across different chromatin classes. For instance in T cells, the mean AUC was 0.77 within the same classes and 0.88 across different classes, suggesting there was a limit to how well the chromatin accessibility data could differentiate between transcriptional cell states.
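To make the AUC comparison concrete, the sketch below computes a pairwise AUC from classifier scores and then averages pairwise AUCs by whether the two states share a chromatin class. It does not fit the LDA model itself; the function names are hypothetical, and the only values taken from the text are the two reported AUCs (0.61 and 0.98) and their class assignments.

```python
def pairwise_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

def mean_auc_by_class_sharing(pair_auc, state_to_class):
    """Average pairwise AUCs separately for same-class and cross-class pairs."""
    within, across = [], []
    for (s1, s2), auc in pair_auc.items():
        same = state_to_class[s1] == state_to_class[s2]
        (within if same else across).append(auc)
    return sum(within) / len(within), sum(across) / len(across)

# Class assignments and AUCs for the two state pairs reported in the text.
state_to_class = {"T-14": "TA-0", "T-13": "TA-0", "T-3": "TA-2"}
pair_auc = {("T-14", "T-13"): 0.61, ("T-14", "T-3"): 0.98}
print(mean_auc_by_class_sharing(pair_auc, state_to_class))
```

With all state pairs of a cell type supplied, the two means correspond to the within-class and across-class averages compared above.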
Finally, to more thoroughly investigate the validity of the chromatin superstate model, we profiled the chromatin accessibility and transcriptomes of select cell states known to be functionally distinct and defined by well-characterized surface markers12,96. We generated a multiome dataset of sorted RA PBMC subsets via fluorescence-activated cell sorting (FACS) of four populations spanning two chromatin classes and four transcriptional states: CD4+CD127−CD25hi Treg, CD4+CD127−CD25int Treg, CD4+CD25−PD1+CXCR5+ TFH, and CD4+CD25−PD1+CXCR5− TPH (Supplementary Fig. 15a). We performed quality control steps in all three modalities and identified FACS cell state labels before doing any downstream analysis for the remaining 2,998 cells (Supplementary Fig. 15b). When we de novo clustered the chromatin accessibility data of the combined PBMC and tissue cells (Supplementary Fig. 15c), we found that the sorted RA PBMC TFH/TPH cells were most enriched in combined cluster 2 (OR = 4), which was most highly enriched for RA tissue TFH/TPH cells (OR = 32). Similarly, sorted RA PBMC Tregs were most enriched for combined cluster 4 (OR = 3), which was most highly enriched for RA tissue Tregs (OR = 24). This confirmed that our tissue class annotations agreed with well-known subclasses of T cells sorted using established protein markers.
We also wanted to assess whether the two cell states within a chromatin class defined via cell surface proteins (e.g., CD4+CD25−PD1+CXCR5+ TFH and CD4+CD25−PD1+CXCR5− TPH) were transcriptionally distinct. By clustering the cells from the four sorted populations based on gene expression, we successfully distinguished between the pairs of transcriptomic states from each chromatin class (Supplementary Fig. 15d). Moreover, we observed that each gold-standard FACS-defined population had a distinct mRNA cluster identity. Next, we calculated the differentially expressed genes and differentially accessible promoter peaks between the transcriptional states within the same class. While we found significant transcriptional differences, we largely did not observe similar accessibility differences in the corresponding genes’ promoter peaks (Fig. 7d, e). This was consistent with the model of transcriptional cell states from a common superstate sharing open chromatin landscapes. For example, the PDE4D gene, which encodes an RA treatment target97, had significantly more expression in TPH than TFH cells (unadjusted P = 4.64e-19), but a non-significant change in the promoter peak accessibility (unadjusted P = 0.913) (Supplementary Fig. 15e). On the other hand, ZBTB10, a telomere-associated TF98, was a rare example where the chromatin accessibility and gene expression concurred across Treg states (Supplementary Fig. 15f). However, globally, the lack of these examples likely contributed to the lack of fully distinguished state-specific chromatin classes.
Cell neighborhood associations with histological metrics and cell state proportions
Next, we sought to investigate associations between the RA chromatin classes and RA clinical metrics using the larger AMP-RA reference dataset with clinical measurements for 82 RA or OA patients. Per cell type, we classified95 each cell from the AMP-RA reference dataset, now the query, into the RA chromatin classes based on the five nearest multimodal snRNA-seq neighbors, now the reference. To validate this annotation, we compared the relative proportions of chromatin classes between the unimodal scATAC-seq cells and the classified AMP-RA scRNA-seq cells for donors in both studies. We observed a generally high correlation between the two technologies (Fig. 8a; Supplementary Fig. 16a). We then investigated RA clinical associations calculated via Co-varying Neighborhood Analysis (CNA)99. In brief, CNA tests associations between sample-level attributes, such as clinical metrics, and cellular neighborhoods, which are small groups of cells that reflect granular cell states. We used the previously described CNA associations defined in the AMP-RA reference cells and re-aggregated them by their chromatin classes. For example, we found an association between myeloid cells and histology characterized by lymphoid infiltration density (p = 0.005). Specifically, the increase in lymphocyte populations was positively associated with the MA−4: SPP1+ FABP5+ intermediate class, whose inflammatory cytokines/chemokines production may be responsible for lymphocyte homing100, and negatively associated with MA−2: LYVE1+ TIMD4+ TRM, whose gene markers were found more often expressed in synovial TRMs from healthy and remission RA than active RA patients13 (Fig. 8b). Additionally, we observed an association between T cells and the histological Krenn inflammation score (p = 0.02), with TA−2: CD4+ PD-1+ TFH/TPH positively101 and TA−4: CD8A+ PRF1+ cytotoxic negatively correlated (Supplementary Fig. 16b). 
These results were consistent with the original transcriptional cell state findings14 and suggested that the connections between RA pathology and cell state may begin before transcription.
One of the key findings from the AMP-RA study was the identification of six Cell Type Abundance Phenotypes (CTAPs), which characterized RA patients into subtypes based on the relative proportions of their broad cell type abundances in synovial tissue14. For instance, CTAP-TB has primarily T and B/plasma cells. Specific cell neighborhoods within cell types were expanded or depleted in these CTAPs as defined by CNA associations in the AMP-RA reference cells. We recapitulated some of these transcriptional associations by re-aggregating the CNA results within the chromatin classes; for example, the RA T cell class TA−2 was positively associated with CTAP-TB compared to other T cell classes, likely reflecting the role of TFH/TPH cells in B cell inflammation response11,12, while TA−4 was negatively associated (p = 0.046; Fig. 8c). Furthermore, in stromal cells, we saw the SA−1: PRG4+ lining class positively associated with CTAP-F, a primarily fibroblast CTAP (p = 0.0027; Supplementary Fig. 16c). This indicated that the most expanded type of fibroblasts in CTAP-F individuals was predominantly from the synovial lining layer, which was consistent with lining marker CLIC5 protein having high staining in the lining fibroblasts and being expressed in the highest proportion of cells from high-density fragments of CTAP-F samples (ANOVA padj = 4.92e-03 between CTAPs)14. Therefore, we could meaningfully replicate the RA pathological associations of both clinical metrics and phenotypic subtypes to transcriptional cell states using their related chromatin class superstate, suggesting that the epigenetic regulation underlying the transcriptional cell states may be mined for further pathological insights into RA.
Chromatin classes prioritize RA-associated SNPs
We next asked whether RA risk variants overlapped the chromatin classes to help define the function of putatively causal variants, genes, and pathways at play in RA pathology102–106. Using an RA multi-ancestry genome-wide association meta-analysis study107, we overlapped fine-mapped non-coding variants with posterior inclusion probability (PIP) greater than 0.1 with the 200 bp open chromatin peaks and assessed peak accessibility across the 24 chromatin classes (Fig. 8d; Supplementary Table 5). For six loci, putatively causal variants overlapped a peak accessible in predominantly one cell type, such as rs11209051 in peak chr1: 67,333,106–67,333,306 in T cells (Wilcoxon T versus non-T class one-sided p = 4.17e-04) near the IL12RB2 gene and rs4840568 in peak chr8:11,493,501–11,493,701 in B/plasma cells (Wilcoxon p = 1.49e-05) near the BLK gene. In the other loci, variants overlapped with chromatin classes from two cell types, with most combinations involving T cells. There were four SNPs overlapping peaks accessible in the TA−2: CD4+ PD-1+ TFH/TPH class, which was the most targeted class within T cells and known to be important for RA pathogenesis11,12.
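The variant-to-peak overlap can be sketched as a simple interval lookup. The peak coordinates below are those quoted in the text; the variant identifiers, positions, and PIP values are hypothetical placeholders, and treating peaks as closed intervals is an assumption.

```python
def variants_in_peaks(variants, peaks, min_pip=0.1):
    """Return (variant_id, peak) pairs for variants with PIP > min_pip in a peak.

    variants: iterable of (variant_id, chrom, position, pip)
    peaks:    iterable of (chrom, start, end), treated as closed intervals
    """
    hits = []
    for vid, chrom, pos, pip in variants:
        if pip <= min_pip:
            continue
        for p_chrom, start, end in peaks:
            if chrom == p_chrom and start <= pos <= end:
                hits.append((vid, (p_chrom, start, end)))
    return hits

# 200 bp peaks quoted in the text.
peaks = [
    ("chr1", 67333106, 67333306),
    ("chr8", 11493501, 11493701),
    ("chr1", 116737968, 116738168),
]
# Hypothetical variants for illustration only (not real positions or PIPs).
variants = [
    ("variant_a", "chr1", 67333200, 0.35),
    ("variant_b", "chr8", 11493600, 0.05),   # fails the PIP filter
    ("variant_c", "chr1", 116738000, 1.00),
    ("variant_d", "chr2", 500, 0.90),        # overlaps no peak
]
for vid, peak in variants_in_peaks(variants, peaks):
    print(vid, peak)
```

For genome-scale inputs, an interval tree or a sorted-sweep approach would replace the nested loop.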
As an example, we observed the putatively causal SNP rs798000 (PIP = 1.00) overlapped with peak chr1: 116,737,968–116,738,168, accessible primarily in T cells (Wilcoxon p = 2.35e-05) with TA−2 as its most accessible class (z = 3.03) (Fig. 8d, e, top). In a previous study93, we linked active chromatin regions to their target genes, which suggested CD2 was the causal gene in this locus. CD2 is a co-stimulatory receptor primarily expressed in T and NK cells108, which likely explains why it was only accessible in our T cell chromatin classes among the five cell types investigated (Fig. 8e, bottom). Intriguingly, rs798000 overlaps a STAT1/2 binding site at a high information content half site position (Fig. 8e, top, position 8 in JASPAR109 motif MA0517.1), suggesting a potential direct link to TF regulation of the JAK/STAT pathway commonly upregulated in RA53.
We also discovered SNP rs9927316 (PIP = 0.54) in myeloid-specific peak chr16:85,982,638–85,982,838 (Wilcoxon p = 4.17e-04), downstream of IRF8, one of the master regulator TFs of myeloid and B cell fates110–112 (Supplementary Fig. 17a). The SNP disrupts a KLF4 motif62, one of the TRM TFs highlighted earlier (Supplementary Fig. 17a; Fig. 4c, d). Furthermore, we observed SNP rs734094 (PIP = 0.41) overlapping peak chr11:2,301,916–2,302,116 with its most accessible classes in T and myeloid cells: TA−4: CD8A+ PRF1+ cytotoxic and MA−3: CD1C+ AFF3+ DC (z = 1.94, 1.65, respectively) (Fig. 8d; Supplementary Fig. 17b). While existing in the promoters of both TSPAN32 and C11orf21 gene isoforms (Supplementary Fig. 17b), we93 proposed the causal gene as Lymphocyte-specific Protein 1 (LSP1), shown to negatively regulate T cell migration and T cell-dependent inflammation in arthritic mouse models113.
For each of these loci, we also aggregated chromatin accessibility reads by classified transcriptional cell state and saw that the multiple states underlying each class had similar patterns, such as rs734094 having some of the strongest signal in TA−4 associated classes T-12, T-21, and MA−3 associated classes M-10, M-14 (Supplementary Fig. 18). This both reaffirmed our chromatin class superstate model and suggested that the classes are useful functional units that simplify mapping risk loci to affected cell states. The RA tissue chromatin classes can help prioritize putative cell states of action for non-coding RA risk variants to assist in their functional characterization within disease etiology.
Discussion
In this study, we described 24 chromatin classes across 5 broad cell types in 30 synovial tissue samples assayed with unimodal scATAC-seq and multimodal snATAC-seq along with the TFs potentially regulating them. Based on our observation that cells from the same chromatin class corresponded to multiple transcriptional cell states, we proposed that these chromatin classes were putative superstates of related transcriptional cell states. Finally, we assessed these chromatin classes’ relationship to RA clinical metrics, subtypes, and genetic risk variants. Our main findings are summarized in Supplementary Table 6 and Supplementary Data 4.
Chromatin accessibility is a key piece in the puzzle of gene regulation. It determines which regions of the genome may participate in regulatory events such as TF binding or may be impacted by non-coding genetic variants. Accessible TF motifs are not guaranteed to be bound, in contrast to the regions identified in gold-standard TF ChIP-seq114 or CUT&RUN115. However, chromatin accessibility datasets are not TF-specific or dependent on antibodies, so they can capture potential regulatory sites for a broader set of factors. At a small scale, the regulation of key loci can be interrogated using scATAC-seq. For example, we found accessible AP-1 motifs in the differentially accessible promoter peak of MMP3, a key driver of RA extracellular matrix destruction51, in lining fibroblasts compared to other stromal cells (Fig. 3c, d). Multiple drugs (e.g., CKD-506, T-5224, Roflumilast) are under investigation to disrupt this specific interaction of AP-1 at the MMP3 promoter, and AP-1 signaling targets more broadly, in models of arthritis as well as clinical trials of RA patients116. At a large scale, these TF-gene interactions can be linked together to form gene regulatory networks in silico117,118 to interrogate the more widespread effects of disrupting signaling cascades. Furthermore, as ~90% of disease causal genetic variants fall in non-coding regions119, chromatin accessibility can prioritize where to look for functional effects of putatively causal RA genetic variants, particularly for those that disrupt TF motifs. Our analyses suggested that the likely causal SNP rs798000 may disrupt STAT binding in a TFH/TPH regulatory region reported to act on CD2, an important T cell co-stimulatory gene120,121. Therefore, our study underscores the value of chromatin accessibility studies in disease-specific transcriptional regulation.
Simultaneous chromatin accessibility and gene expression measurements in the multiome cells were essential to test the relationship between marker peaks and genes. Across cell types, the correlations between scaled marker peak accessibility and gene expression across our chosen markers varied. T cells had higher correlation (R = 0.92; Supplementary Fig. 3b) while myeloid cells had lower correlation (R = 0.76; Supplementary Fig. 5b), potentially due to more heterogeneous subpopulations such as TRMs, infiltrating monocytes, and dendritic cells. Furthermore, when we did not see class correspondence between chromatin accessibility and gene expression on the individual gene level, we observed more class-specific gene expression in the context of promiscuous chromatin accessibility. This suggested a poised chromatin state that depends on the presence of a specific TF or extracellular signal to give rise to a particular transcriptional outcome. For example, the promoter peak of RTKN2 was accessible in all CD4 T cells, but the gene was primarily expressed in Tregs (Supplementary Fig. 3b), likely because it is a direct target of the Treg master regulator FOXP3122. CCL2 in stromal fibroblasts had an accessible promoter peak in both sublining populations, but was primarily expressed in the inflammatory subset (Supplementary Fig. 4b), likely due to stimulation by TNF/IFNγ44,123.
Indeed, when expanding genome-wide, we saw a similar pattern of class-specific transcriptional cell states but chromatin classes encompassing multiple related states in our proposed superstate model (Fig. 7a–c; Supplementary Fig. 10g, h). To validate this model, we conducted an RA PBMC multiome experiment of FAC-sorted populations. While we saw differentially expressed genes between transcriptional cell states within a chromatin class, there was an almost complete lack of differentially accessible promoter peaks corresponding to those genes (Fig. 7d, e). Biologically, open chromatin is necessary but not sufficient for gene expression18, so it is reasonable to expect related cell states to have similar open chromatin landscapes with further specificity coming from TFs among other epigenetic regulators. Technically, the robustness of the observed class-state relationships across multiple clustering resolutions mitigated concerns that this proposed model was an artifact (Supplementary Fig. 13). Even in the absence of clusters, classifiers based on continuous chromatin PCs also demonstrated the lack of resolution chromatin accessibility has to distinguish between similar transcriptional states (Supplementary Fig. 14).
Defining the relationship between transcriptional cell states and chromatin classes may have important therapeutic implications. One effective RA treatment strategy is the deletion of a pathogenic cell state: the use of B cell-depleting antibodies (e.g., rituximab10) is an example. However, if one chromatin class corresponds to multiple transcriptional cell states, then deleting very specific pathogenic populations may be ineffective as other non-pathogenic states may transition into the pathogenic state in response to the same pathogenic tissue environment. As an example, a recent study124 of ILCs in a mouse model of psoriasis showed chromatin accessibility in a disease-relevant population of ILC3s even before disease induction using IL-23, particularly at ILC3 TFs, that then increased further after induction. In that case, altering the environment or removing exogenous factors (e.g., TFs, cytokines) might be a more effective treatment. Within RA, the SA−0: CXCL12+ HLA-DRhi sublining fibroblast class, with its four related transcriptional states in our superstate model, may merit further investigation in this regard. SA−0 accessible peaks were enriched for STAT motifs, suggesting potential regulation by the JAK/STAT signaling pathway. Indeed, JAK inhibition via tofacitinib and upadacitinib has been shown to prevent pro-inflammatory HLA-DR induction in RA synovial fibroblasts125. Additional experiments would be required to determine if the F-3: POSTN+ sublining transcriptional cell state could transform into the RA-expanded14 F-5: CD74hiHLAhi sublining or F-6: CXCL12+ SFRP1+ sublining fibroblast populations under JAK/STAT stimulation.
More broadly, the results presented here suggest some interesting next steps. First, our chromatin class superstate model indicated that certain transcriptional cell states were more closely linked, but further experimentation would be required to ascertain whether these related cell states have a plastic enough chromatin landscape to allow for cross-differentiation or whether they are more broadly grouped by function. Second, to better understand whether the more pathogenic chromatin classes such as TA−2: CD4+ PD-1+ TFH/TPH and MA−1: FCN1+ SAMSN1+ infiltrating monocytes are indeed only in tissue, an RA PBMC scATAC-seq study may be warranted. While we saw a general consensus between the chromatin landscapes of RA tissue class TA−2 and our small population of RA blood TFH/TPH cells, a larger PBMC study would be better powered to determine if the chromatin environment in blood may be a proxy for the environment in tissue that gives rise to pathogenic transcriptional populations. Third, even though we did not see large effects of OA and low-cell-count samples on our chromatin classes, a larger study with a more even distribution of RA and OA samples with higher cell counts would be better able to distinguish between RA- and OA-specific chromatin variation.
In conclusion, we presented an atlas for RA tissue chromatin classes that will be a useful resource for linking chromatin accessibility to gene expression and the interpretation of genetic information.
Methods
Patient recruitment
Fourteen RA and 4 OA patients were recruited by the Accelerating Medicines Partnership (AMP) Network for RA and SLE to provide samples for use in the unimodal scATAC-seq experiments. Separately, synovial tissue samples from 11 RA patients and 1 OA patient were collected from Brigham and Women’s Hospital (BWH) and the Hospital for Special Surgery (HSS) for use in the multimodal ATAC + Gene Expression experiments. Histologic sections of RA synovial tissue were examined, and samples with inflammatory features were selected in both cases.
Patients were recruited from Brigham and Women’s Hospital, Columbia University, Hospital for Special Surgery, Queen Mary University of London UK, University of Birmingham UK, University of California San Diego, University of Pittsburgh, and University of Rochester. All sites obtained approval for this study from their Institutional Review Boards. All patients gave written informed consent. We have complied with all relevant ethical regulations.
Synovial tissue collection and preparation
Synovial tissue samples from 14 RA patients and 4 OA patients were collected and cryopreserved as part of a larger study cohort by the AMP Network for RA and SLE, as previously described14. Synovial tissue samples were thawed and disaggregated as previously described14,23. The resulting single-cell suspensions were stained with anti-CD235a antibodies (clone 11E4B-7-6 (KC16), Beckman Coulter, 1:100 dilution) and Fixable Viability Dye (FVD) eFluor 780 (eBioscience/ThermoFisher). Live non-erythrocyte (i.e., FVD− CD235−) cells were collected by fluorescence-activated cell sorting (BD FACSAria Fusion). The sorted live cells were then re-frozen in Cryostor and stored in liquid nitrogen. The cells were later thawed and processed as described above for droplet-based scATAC-seq according to manufacturer’s protocols (10X Genomics). For the multimodal experiments, the 11 RA and 1 OA synovial tissue samples were collected and cryopreserved before being thawed, disaggregated, and FAC-sorted, as described above.
Unimodal scATAC-seq experimental protocol
Unimodal scATAC-seq experiments were performed by the BWH Center for Cellular Profiling. Each sample was processed separately in the cell capture step. Nuclei were isolated using an adaptation of the manufacturer’s protocol (10X Genomics). Approximately ten thousand nuclei were incubated with Tn5 Transposase. The transposed nuclei were then loaded on a Chromium Next GEM Chip H and partitioned into Gel Beads in-emulsion (GEMs), followed by GEM incubation and library generation. The ATAC libraries were sequenced to an average of 30,000 reads per cell with the recommended number of cycles according to the manufacturer’s protocol (Single Cell ATAC V1.1, 10X Genomics) using Illumina Novaseq. Samples were initially processed using 10x Genomics Cell Ranger ATAC 1.1.0, which included barcode processing and read alignment to the hg38 reference genome.
Multiome experimental protocol
Multiome experiments were performed by the BWH Center for Cellular Profiling. Each sample was processed separately in the cell capture step. Nuclei were isolated as above. Approximately ten thousand transposed nuclei were loaded on Chromium Next GEM Chip J followed by GEM generation. 10x Barcoded DNA from the transposed DNA (for ATAC) and 10x Barcoded, full-length cDNA from poly-adenylated mRNA (for Gene Expression) were produced during GEM incubation. The ATAC libraries and Gene Expression libraries were then generated separately. Both library types were sequenced to an average of 30,000 reads per cell on different flow cells with the recommended sequencing cycles according to the manufacturer’s protocol (Chromium Next GEM Single Cell Multiome ATAC + Gene Expression, 10X Genomics) using Illumina Novaseq. Samples were initially processed using 10x Genomics Cell Ranger ARC 2.0.0, which included barcode processing and read alignment to the hg38 reference genome, for both ATAC and GEX information.
Computational methods
Supplementary Fig. 1 shows an overview of the computational methodology for cell type/state identification, as many of the methods were reused in different contexts. In the following sections, we explain the core methodology the first time it is used, and then only the ways in which the methodology differs in the different contexts afterwards.
ATAC read QC
Reads were quality controlled from the Cell Ranger BAM files via a new cell-aware strategy that removes likely duplicate reads from PCR amplification bias within a cell while keeping reads originating from the same positions but from different cells. For unimodal scATAC-seq data, duplicate reads from the same cell were called based on read and mate start positions and CIGAR strings, but the multimodal snATAC-seq data only used start positions since Cell Ranger ARC did not provide a mate CIGAR string (MC:Z tag). Reads that were not properly mapped within a pair, had a MAPQ < 60, did not have a cell barcode, or were overlapping the ENCODE blacklisted regions24 of ‘sticky DNA’ were also removed. We converted the deduplicated BAM files to fragment BED files using BEDOPS126 bam2bed while accounting for the 9-bp Tn5 binding site.
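The cell-aware deduplication idea can be sketched as keying each read on its cell barcode plus its alignment coordinates, so that identical coordinates are collapsed only within a cell. The record fields and the choice to keep the first read seen are assumptions for illustration; the authors' implementation operates on BAM files and may differ.

```python
def dedup_cell_aware(reads, use_cigar=True):
    """Drop duplicate reads within a cell, keeping reads shared across cells.

    reads: iterable of dicts with keys 'barcode', 'chrom', 'start',
           'mate_start' and, when use_cigar is True, 'cigar' and 'mate_cigar'.
    The first read seen for each key is kept (an assumption).
    """
    seen = set()
    kept = []
    for r in reads:
        key = (r["barcode"], r["chrom"], r["start"], r["mate_start"])
        if use_cigar:
            key += (r["cigar"], r["mate_cigar"])
        if key not in seen:
            seen.add(key)
            kept.append(r)
    return kept

# Hypothetical reads: two identical reads in cell AAAC, one in cell TTTG.
example_reads = [
    {"barcode": "AAAC", "chrom": "chr1", "start": 100, "mate_start": 250,
     "cigar": "50M", "mate_cigar": "50M"},
    {"barcode": "AAAC", "chrom": "chr1", "start": 100, "mate_start": 250,
     "cigar": "50M", "mate_cigar": "50M"},
    {"barcode": "TTTG", "chrom": "chr1", "start": 100, "mate_start": 250,
     "cigar": "50M", "mate_cigar": "50M"},
]
print(len(dedup_cell_aware(example_reads)))
```

Setting `use_cigar=False` mirrors the multimodal case described above, where only start positions were available for the key.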
ATAC peak calling
Peaks were called twice on the unimodal scATAC-seq cells, before and after “ATAC cell QC”, to first provide general peak information to be used in the cell QC step and then afterwards on the post QC cells to provide the final, refined peak set. Individual sample unimodal scATAC-seq BAM files were converted to MACS2127 BEDPE files using macs2 randsample, concatenated across samples, and then used to call consensus peaks with macs2 callpeak --call-summits using a control file128 where ATAC-seq was done on free DNA to account for Tn5’s inherent cutting bias. Each sub-peak was trimmed to 200 bp (summit ± 100 bp) to localize the signal and avoid confounding any statistical analysis with peak length. Any overlapping peaks were removed iteratively, keeping the best sub-peak, as determined by q-value, to avoid double counting. For consistent analysis, we used the post cell QC unimodal scATAC-seq trimmed consensus peaks for all downstream analyses unless otherwise stated. We wanted to confirm that these unimodal scATAC-seq consensus peaks were reasonable to use for the multimodal snATAC-seq datasets, beyond just that the datasets were done on the same tissue type. Therefore, we called peaks, as done above, on the individual sample multimodal snATAC-seq BAM files and found that an average of 75% (n = 12 samples; range: 66–83%) of the 200 bp trimmed multimodal snATAC-seq sample-specific peaks overlapped the unimodal scATAC-seq consensus peaks. Furthermore, we used the 5x full consensus peak neighborhoods in the cell QC step for multiome datasets as an added safeguard. We also confirmed our peaks’ quality by seeing good overlap with ENCODE SCREEN v3 candidate cis-regulatory elements (cCREs)25 and the GENCODE v2826 promoter annotations via bedtools129 intersectBed (Supplementary Fig. 2f).
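The trimming and overlap-removal step can be sketched as follows. This is one plausible reading of "removed iteratively, keeping the best sub-peak": a greedy pass in order of significance. The input layout, the half-open interval convention, and the use of a score where larger means more significant (for example, −log10 q) are assumptions.

```python
def trim_and_prune(summits, half_width=100):
    """Trim peaks to summit +/- half_width and drop overlapping peaks.

    summits: list of (chrom, summit_position, score), where a larger score
             means a more significant peak (assumed, e.g. -log10 q-value).
    Returns non-overlapping (chrom, start, end, score) tuples, using
    half-open intervals [start, end) so each peak is 2 * half_width wide.
    """
    peaks = [(c, s - half_width, s + half_width, q) for c, s, q in summits]
    kept = []
    # Visit the most significant peaks first; keep a peak only if it does
    # not overlap any peak that was already kept.
    for peak in sorted(peaks, key=lambda p: p[3], reverse=True):
        chrom, start, end, _ = peak
        if all(k[0] != chrom or end <= k[1] or start >= k[2] for k in kept):
            kept.append(peak)
    return sorted(kept, key=lambda p: (p[0], p[1]))

# Hypothetical summits for illustration.
example_summits = [
    ("chr1", 1000, 50.0),
    ("chr1", 1100, 20.0),  # overlaps the stronger peak at 1000
    ("chr1", 1300, 10.0),
    ("chr2", 1000, 5.0),
]
for peak in trim_and_prune(example_summits):
    print(peak)
```

Fixing every peak at 200 bp means downstream per-peak counts are not confounded by peak length, which is the motivation given above.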
ATAC cell QC
We kept cells with more than 10,000 reads with at least 50% of those reads falling in peak neighborhoods (5x full peak size), at least 10% of reads in promoter regions, not more than 10% of reads called in the mitochondrial chromosome, and not more than 10% of pre-deduplication reads falling in the ENCODE blacklisted regions24. The genome annotation we used to define promoters was GENCODE v28 basic26 as was done for Cell Ranger ATAC read mapping; we defined promoter regions for the QC step as 2 kb upstream of HAVANA protein coding transcripts that we subsequently merged to avoid double counting. The fragments from the post QC cells were quantified within the 200 bp trimmed consensus peaks (see “ATAC peak calling”) via GenomicRanges::findOverlaps130 into a peaks x cells matrix.
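The cell-level thresholds above translate into a simple predicate. The field names are hypothetical, and the choice of denominator for each fraction (post-deduplication reads for all but the blacklist fraction) is an assumption based on the wording of the text.

```python
def passes_atac_qc(cell):
    """Return True if a cell meets the ATAC QC thresholds described above.

    cell: dict of per-cell read counts with hypothetical field names.
    """
    n = cell["reads"]
    if n <= 10_000:
        return False
    return (
        cell["reads_in_peak_neighborhoods"] / n >= 0.50
        and cell["reads_in_promoters"] / n >= 0.10
        and cell["reads_mito"] / n <= 0.10
        and cell["prededup_reads_blacklist"] / cell["prededup_reads"] <= 0.10
    )

# Hypothetical cells for illustration.
good_cell = {
    "reads": 20_000, "reads_in_peak_neighborhoods": 12_000,
    "reads_in_promoters": 3_000, "reads_mito": 500,
    "prededup_reads": 30_000, "prededup_reads_blacklist": 300,
}
low_depth_cell = dict(good_cell, reads=8_000)
high_mito_cell = dict(good_cell, reads_mito=4_000)
print([passes_atac_qc(c) for c in (good_cell, low_depth_cell, high_mito_cell)])
```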
ATAC clustering
We did multiple rounds of clustering with different inputs. Generally, we did: binarize peaks x cells matrix, log(TFxIDF) normalization using Seurat::TF.IDF131, most variable peak feature selection using Symphony::vargenes_vst95, center/scale features to mean 0 and variance 1 across cells using base::scale, PCA dimensionality reduction using irlba::prcomp_irlba, batch correction by sample using Harmony::HarmonyMatrix27, shared nearest neighbor creation using RANN::nn2 and Seurat::ComputeSNN131, Louvain clustering using Seurat::RunModularityClustering131, and cluster visualization using UMAP coordinates via umap::umap. For the unimodal scATAC-seq feature selection, we chose peaks that had at least one fragment in at least five percent of cells and applied TFxIDF normalization using Seurat::TF.IDF131 before continuing with the above steps. We used 20 PCs for the broad cell type clustering and 10 PCs for the chromatin class clustering since there was less variation within a cell type.
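The first two steps of the pipeline, binarization and log(TFxIDF) normalization, can be illustrated on a toy matrix. The formula below is one common formulation used in latent semantic indexing of scATAC-seq data; the exact scaling inside Seurat::TF.IDF may differ, so the scale factor and the use of log(1 + x) are assumptions.

```python
from math import log

def binarize(matrix):
    """Convert a peaks x cells count matrix to 0/1 accessibility."""
    return [[1 if x > 0 else 0 for x in row] for row in matrix]

def log_tfidf(binary, scale=1e4):
    """log(1 + scale * TF * IDF) for a binary peaks x cells matrix.

    TF  = entry / total accessible peaks in that cell
    IDF = number of cells / number of cells in which the peak is accessible
    """
    n_peaks, n_cells = len(binary), len(binary[0])
    cell_totals = [sum(binary[i][j] for i in range(n_peaks))
                   for j in range(n_cells)]
    out = []
    for row in binary:
        peak_total = sum(row)
        idf = n_cells / peak_total if peak_total else 0.0
        out.append([
            log(1 + scale * (x / cell_totals[j]) * idf) if cell_totals[j] else 0.0
            for j, x in enumerate(row)
        ])
    return out

# Toy matrix: 2 peaks (rows) x 2 cells (columns).
counts = [[3, 0],
          [1, 2]]
normalized = log_tfidf(binarize(counts))
print(normalized)
```

The weighting up-weights peaks that are open in few cells, which is why it is applied before variable-feature selection and PCA.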
For cluster identification, we used marker peaks, defined as peaks overlapping the promoters of marker genes; if there were multiple peaks overlapping a gene’s promoter or multiple isoforms of a gene, the peak that best tracked with the gene’s expression in the multiome cells was chosen. The broad cell type marker peaks we used are in Supplementary Fig. 2g–j and the chromatin class marker peaks in panel b of Supplementary Figs. 3–7.
ATAC doublet cluster removal
Within the unimodal scATAC-seq and multimodal snATAC-seq separately, we then did an initial round of ATAC clustering using all post cell QC cells to find doublet clusters. We removed doublet clusters with multiple cell-type-specific marker peaks, intermediate placement between broad cell type clusters in PC space, high fragment counts, and high doublet scores determined per cell per sample by ArchR35. Note that this does not necessarily preclude doublets of the same cell type.
RNA cell QC
Multimodal snRNA-seq cells had to pass Cell Ranger ARC cell filtering and have at least 500 genes and <20% of mitochondrial reads. The Cell Ranger ARC filtered genes x cells matrix was subsetted to only these cells passing cell QC.
RNA clustering
To cluster genes x cells matrices, we did: log normalization to 10,000 reads using Seurat::NormalizeData131, most variable gene feature selection using a variance stabilizing transformation (VST)131, center/scale features to mean 0 and variance 1 across cells using base::scale, PCA dimensionality reduction using irlba::prcomp_irlba, batch correction by sample via Harmony::HarmonyMatrix27, shared nearest neighbor creation using RANN::nn2 and Seurat::ComputeSNN131, Louvain clustering using Seurat::RunModularityClustering131, and cluster visualization using UMAP coordinates via umap::umap. We used 20 PCs for the broad cell type clustering and 10 PCs for the sorted RA PBMC mRNA clustering since there was less variation within a cell type.
For cluster identification, we used marker genes seen in Supplementary Fig. 2l, m for the broad cell types and in panel b of Supplementary Figs. 3–7 for the chromatin classes.
RNA doublet cluster removal
After doing an initial round of RNA clustering on the post cell QC cells, we removed doublet clusters with multiple cell-type-specific genes, intermediate placement between broad cell type clusters in PC space, high UMI counts, and high doublet scores determined per cell per sample by Scrublet132. Note that this does not necessarily preclude doublets of the same cell type.
Symphony classification of transcriptional identity
To determine the RA transcriptional cell types/states within our multimodal data, we used Symphony95 to map the multimodal snRNA-seq profiles into the AMP-RA reference synovial tissue transcriptional cell types/states14 (Supplementary Fig. 1b, d). We used one Symphony reference object from that study for the broad cell types together and one for each broad cell type we tested (T cell, stromal, myeloid, B/plasma, and endothelial) for the fine-grain cell state identities. The broad cell types and lymphocyte states were defined using both gene and surface protein expression while the others were defined using gene expression only. In each case, we mapped the multimodal snRNA-seq gene x cells matrix into the appropriate Symphony reference object using the mapQuery function, accounting for sample as a batch variable. Using the knnPredict function with k = 5, each multiome cell was classified into a reference transcriptional cell type/state by the most common annotation of its five nearest AMP-RA reference neighbors in the harmonized embedding. We considered it a high confidence mapping if at least 3 out of the 5 nearest reference neighbors were the same cell type/state, though the number of cell types/states will affect this as more cell types/states means more boundary regions between cell types/states.
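The k = 5 majority vote and the high-confidence rule can be sketched as below. This illustrates only the voting logic, not Symphony's knnPredict itself; the function name is hypothetical, and ties between equally common labels are resolved here by whichever label appears first, which is an assumption.

```python
from collections import Counter

def knn_vote(neighbor_labels, min_agree=3):
    """Classify a query cell from the labels of its nearest reference neighbors.

    Returns (predicted_label, is_high_confidence), where high confidence
    means at least min_agree neighbors share the predicted label.
    """
    label, count = Counter(neighbor_labels).most_common(1)[0]
    return label, count >= min_agree

# Hypothetical neighbor annotations for two query cells.
print(knn_vote(["T-14", "T-14", "T-13", "T-14", "T-5"]))
print(knn_vote(["T-14", "T-13", "T-14", "T-13", "T-5"]))
```

As the text notes, the more reference states there are, the more often a query cell sits near a boundary and falls short of the 3-of-5 agreement.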
Broad cell type clustering
For non-doublet cells passing cell QC, we subsetted the feature x cells matrices and performed broad cell type clustering within modalities as described above in “ATAC clustering” for the unimodal scATAC-seq and multimodal snATAC-seq datasets separately and “RNA clustering” for the multimodal snRNA-seq datasets (Supplementary Fig. 1a, b). We also classified the multimodal snRNA-seq cells into the AMP-RA CITE-seq study14 broad cell types using Symphony95 (see “Symphony classification of transcriptional identity”). The small minority of cells (2%) with discordant cell types defined in the snATAC-, snRNA-, and CITE-seq modalities for the multiome datasets were removed (Supplementary Fig. 1b). Here, as in all analyses unless otherwise stated, we included OA samples to increase cell counts, but we did not make any OA versus RA comparisons due to low power.
Fine-grain chromatin class clustering
To define chromatin classes within broad cell types (Supplementary Fig. 1c), we made peaks x cells matrices for each broad cell type by concatenating unimodal scATAC-seq and multimodal snATAC-seq cells of that type across the consensus peaks. Since peaks were called on all unimodal scATAC-seq cells regardless of cell type, we first subset each consensus peaks x broad cell type cells matrix by “peaks with minimal accessibility” (PMA). We defined minimal accessibility as consensus peaks that had a fragment in at least 0.5% of cells of that type, except for endothelial cells, for which we raised the threshold to a minimum of 50 cells. After subsetting the matrix by PMA peaks, we ran the same clustering pipeline detailed in “ATAC clustering”. For endothelial cells, due to small cell counts, we batch-corrected on both sample and assay and updated Harmony’s sigma parameter to 0.2. We did another round of QC to exclude cells that clustered primarily due to relatively fewer total fragments per cell and fewer peaks with at least one fragment per cell, and then re-clustered. We tried a number of clustering resolutions (see Supplementary Fig. 13 for a subset) and chose the resolution at which known cell-state-specific gene markers’ promoter peak chromatin accessibility and gene expression largely respected cluster boundaries, such as PRF1 in TA−4: CD4+ PRF1+ cytotoxic (Fig. 2b) or SPP1 in MA−4: SPP1+ FABP5+ intermediate (Fig. 4b).
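As an illustration of the PMA filter, the sketch below keeps peaks with a fragment in at least a given fraction of cells, or in a fixed minimum number of cells when one is supplied. The function name and the dictionary input are hypothetical conveniences; the study operated on peaks x cells matrices.

```python
def pma_peaks(cells_with_fragment, n_cells, min_fraction=0.005, min_cells=None):
    """Select peaks with minimal accessibility (PMA).

    cells_with_fragment: dict mapping peak name -> number of cells of this
        type with at least one fragment in the peak.
    min_cells: if given, overrides the fractional threshold (the text uses
        a minimum of 50 cells for endothelial cells).
    """
    threshold = min_cells if min_cells is not None else min_fraction * n_cells
    return [peak for peak, n in cells_with_fragment.items() if n >= threshold]

# Toy example: 1000 cells, so the default threshold is 5 cells.
counts = {"peak1": 6, "peak2": 4, "peak3": 100}
default_pma = pma_peaks(counts, n_cells=1000)
endothelial_pma = pma_peaks(counts, n_cells=1000, min_cells=50)
```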
To label chromatin classes, we used the first letter of the broad cell types (T - T cell; S - stromal; M - myeloid; B - B/plasma; E - endothelial), a subscript A for accessibility, and a cluster number (ordered by number of cells, with the biggest cluster named 0). To give biological context, we took advantage of both the peak accessibility and gene expression profiles. We chose a class’s markers based on a number of factors: (1) the class-specificity of the marker gene’s expression, (2) the class-specificity of the marker peak associated with that gene’s promoter, (3) previous reports of that gene as a cell type marker in the literature, and (4) corroboration with our well-annotated AMP-RA tissue CITE-seq dataset14 via reference mapping95 (Figs. 2–6b, 7a–c; Supplementary Figs. 1d, 3–7b, 10g, h; Supplementary Data 3, 4). We proposed a cell identity based on known markers in the field; for example, PDCD1 and CXCL13 in TFH/TPH12 or PRG4 and CD55 in lining fibroblasts21. We further supported the proposed identity by the correspondence to the transcriptional cell state annotation from our well-annotated AMP-RA reference of synovial tissue CITE-seq data14 (Fig. 7a–c; Supplementary Fig. 10g, h; Supplementary Data 3).
T cell lineage analysis
We used a logistic regression model to investigate how promoter peaks align with the CD4 and CD8 lineage distinction (‘lineage’) across T cells beyond their chromatin class identity (‘class’), sample identity (‘sample’), and overall fragment counts (‘nFragments’). The lineage variable was defined as the cell’s chromatin accessibility at the promoter peaks of: CD4+ CD8A- (+1), CD4+ CD8A+ or CD4− CD8A− (0), CD4− CD8A+ (−1); cell counts by lineage and class are in Supplementary Table 3. A plus sign (+) signifies that the CD4 or CD8 lineage promoter peak is accessible while a minus sign (−) signifies that it is not. Genome-wide T cell promoter peaks were defined as those T cell PMA peaks that overlapped an ENCODE promoter-like cCRE25, whose proposed target gene was assessed via overlapping ENSEMBL133 hg38 release 92 transcript annotations. We note that if there were multiple overlapping transcripts, we selected one gene to annotate the cCREs by excluding lincRNA, miRNA, antisense genes, orfs, and other pseudogenes then selecting one of the remaining genes. We excluded peaks that were uniformly positive or negative after binarizing. For each of these binarized promoter peaks (‘peak’), we calculated two logistic regressions using lme4::glmer134 with a nloptwrap optimizer for speed: a full model of peak that included the lineage, class, sample, and nFragments terms, and a null model that omitted the lineage term.
A lineage beta in the model is positive if the peak is associated with CD4 and negative if associated with CD8. We calculated significance as a likelihood ratio test (LRT) between the full and null models with multiple hypothesis test correction using FDR < 0.20; significant results are shown in Supplementary Data 1. Furthermore, we defined a lineage score per cell via: (1) subsetting the normalized chromatin accessibility matrix by the lineage-significant peaks; (2) dividing CD4-associated peaks by the number of CD4-associated peaks to normalize; (3) dividing CD8A-associated peaks by the number of CD8A-associated peaks to normalize; (4) multiplying CD8A-associated peaks by −1 to differentiate lineage; (5) summing over peaks by cell to get a cell score. Thus, if a cell’s lineage score is positive, that cell is more associated with CD4; if it is negative, that cell is more associated with CD8. We aggregated these cell scores by chromatin class in Supplementary Fig. 3d.
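Steps (1) to (5) of the per-cell lineage score reduce to the mean accessibility over CD4-associated peaks minus the mean accessibility over CD8A-associated peaks. A minimal sketch for one cell follows; the function name and the dictionary input are hypothetical.

```python
def lineage_score(cell_accessibility, cd4_peaks, cd8_peaks):
    """Per-cell lineage score; positive leans CD4, negative leans CD8.

    cell_accessibility: dict mapping peak name -> normalized accessibility
        for one cell (missing peaks are treated as 0).
    cd4_peaks, cd8_peaks: non-empty lists of lineage-significant peaks.
    """
    # Steps 1-3: subset to significant peaks and normalize by set size.
    cd4 = sum(cell_accessibility.get(p, 0.0) for p in cd4_peaks) / len(cd4_peaks)
    cd8 = sum(cell_accessibility.get(p, 0.0) for p in cd8_peaks) / len(cd8_peaks)
    # Steps 4-5: flip the sign of the CD8A contribution and sum.
    return cd4 - cd8

# Toy example with illustrative peak names.
cell = {"peakA": 1.0, "peakB": 3.0, "peakC": 2.0}
score = lineage_score(cell, cd4_peaks=["peakA"], cd8_peaks=["peakC"])
```

The same construction applies wherever the text defines a per-cell score "as in the T cell lineage analysis section", with the two peak or gene sets swapped for the relevant pair.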
TF motif analysis
We used ArchR35 version 1.0.2 for our TF motif analysis. For each cell type’s final QC cells, we subsetted each sample’s fragments using awk135, bgzip136, and tabix137 before creating arrow files from them using createArrowFiles with all additional QC flags nullified. ArchR removed samples with two or fewer cells, so one sample with only two B/plasma cells was removed in that cell type. From the arrow files, we created an ArchR project via ArchRProject. We added our peak set into the project by addPeakSet and recreated a peaks by cells matrix via addPeakMatrix. We added our chromatin classes to the project’s cell metadata with addCellColData. Then, we added motif annotations to our peaks using addMotifAnnotations with the JASPAR2020 motif set version 2, a 4 bp motif search window width, and motif p value of 5e-05. We added chromVAR background peaks via addBgdPeaks and then calculated chromVAR deviations using addDeviationsMatrix. Next, we found class-specific peaks for each chromatin class using getMarkerFeatures via a Wilcoxon test and accounting for TSS Enrichment and log10(nFragments). Within those peaks, we found motif enrichment via peakAnnoEnrichment with cutoffs FDR ≤ 0.1 and Log2FC ≥ 0.5. We modeled our heatmap of motif enrichment on plotEnrichHeatmap, but we added some filters. As in the default plotEnrichHeatmap method, we used the −log10(padj), where the p value is calculated via a hypergeometric test, as the motif enrichment value. For each chromatin class sorted by maximum motif enrichment value, we chose the top motifs not already chosen that had at least an enrichment value of 5 for that class, had the maximal or within 95% of the maximal enrichment for that class, and whose corresponding TF had at least 0.05 mean-aggregated normalized gene expression for that class. For myeloid cells, the enrichment cutoff was set to 2 to show some motifs for MA−0. 
In endothelial cells, there were so few EA−3 cells that only 1 class-specific peak was called, resulting in no useful motif information to be shown; we also added a SOX17 motif (JASPAR109 ID MA0078.1), a prominent arteriolar endothelial TF86, to the JASPAR2020 motif set for endothelial cells. For the chosen motifs, we plotted the percentage of the max enrichment value across classes with the max value in parentheses in the motif label as in plotEnrichHeatmap.
For the TFs associated with the top class-specific accessible motifs, we used a one-sided Wilcoxon test to compare the normalized gene expression for the TF between cells in that chromatin class and the other cells within that cell type, with the alternative hypothesis being “greater” and multiple hypothesis test correction within cell types using FDR (Supplementary Data 2).
Loci visualization
To visualize the chromatin accessibility read buildups by chromatin class or transcriptional cell state (class/state), we first subsetted the deduplicated BAM files for each sample by the cells in the specific state/class using an awk135 command looking for the samtools CB:Z (i.e., cell barcode) flag; a BAM index file was made for each BAM file for region subsetting purposes later. Then for each class/state at each locus, we subsetted each sample’s BAM file for that region using samtools view, merged the BAM files across samples using samtools merge, converted the BAM files to bedgraph files using bedtools129 genomecov, and then divided the bedgraph counts by the total read count (by 1e7 reads) in that class/state to allow for comparison between classes/states. The bedgraph files were then imported to IGV138 and the data range for each class/state was set to the maximum value across classes/states. Tracks were colored by their class/state. We did not always show all classes/states for space reasons, but we picked representatives that were similar in the locus shown. Peaks (see “ATAC peak calling”), motifs (see “TF motif analysis”), and SNPs (see “Genetic variant analysis”) were imported into IGV as BED files. We could not label all motifs found in these loci for space reasons, so we picked the enriched motif we were highlighting and a few other enriched motifs. We also could not always show all the gene isoforms for all loci for space reasons, but we did always show a representative isoform for those that looked similar in the locus shown.
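The depth normalization of the bedgraph tracks can be sketched as below. This assumes that "by 1e7 reads" means coverage is expressed per 10 million reads in the class/state; that reading, the function name, and the tuple layout are assumptions of this sketch.

```python
def normalize_bedgraph(rows, total_reads, per=1e7):
    """Rescale bedgraph coverage to counts per `per` reads.

    rows: list of (chrom, start, end, count) tuples for one class/state.
    total_reads: total read count in that class/state (must be > 0).
    """
    factor = total_reads / per
    return [(chrom, start, end, count / factor) for chrom, start, end, count in rows]

# Toy example: a class/state with 20 million reads.
track = normalize_bedgraph([("chr1", 0, 10, 5)], total_reads=2e7)
```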
Stromal DNA methylation analysis
We downloaded 1859 DM loci for RA versus OA synovial fibroblast cell lines from Nakano et al., 2013 (ref. 47). We converted the 1 bp DM regions from hg19 to hg38 reference genomes using liftOver139; 1 region did not map. Next, we overlapped these DM loci with our 200 bp stromal PMA peaks using intersectBed129 to get 152 DM loci, with 67 associated with hypermethylation and 85 with hypomethylation. We defined a per-cell score as in the “T cell lineage analysis” section, but with positive scores corresponding to hypermethylation and negative scores to hypomethylation. We calculated a one-sided Wilcoxon test p value of DNA methylation cell scores between the 11,733 cells in SA−0 and the 12,574 stromal cells not in SA−0 to get significance of SA−0 enrichment for hypomethylated regions.
We used the genes assigned to the DM loci from the original paper47. For the genes related to hypermethylated DM and hypomethylated DM accessible loci separately, we plotted their scaled mean normalized gene expression within fibroblast classes SA−0, SA−1, and SA−2 to assess fibroblast class preferences.
Cultured fibroblast datasets
We obtained two cultured unstimulated FLS multiome datasets from Smith et al.44. We downloaded their genes x cells matrices from ImmPort accession ID SDY2213 and fragment files from the authors. We subset these files by their QCed cells found in ImmPort file adata_scatac_chromVAR_motif_cultured.968213.h5; there were 19,573 QC cells across the two samples. We overlapped this subsetted fragment file with our peaks to create a peaks x cells matrix. We saw good overlap in that matrix with 99.99% of our peaks having at least 1 cell represented and all cells having overlapping fragments with at least a few hundred peaks. For both gene and peak matrices individually, we concatenated the two samples and normalized as above.
Fibroblast identity analysis
We subsetted our stromal tissue datasets to only include fibroblast populations (SA−0, SA−1, SA−2). We calculated differentially expressed genes between tissue lining (SA−1) and sublining (SA−0, SA−2) populations in the normalized gene expression matrix using presto::wilcoxauc and adjusted p values using FDR. We created gene sets of 382 lining and 254 sublining genes using the cutoffs: FDR < 0.1, logFC > 0.25, and AUC > 0.6. We then calculated a per-cell score as in the “T cell lineage analysis” section, but with positive scores corresponding to lining fibroblasts and negative scores to sublining fibroblasts. Using the tissue-defined gene sets, we calculated this per-cell fibroblast identity gene score in the normalized cultured fibroblast gene expression matrix (see “Cultured fibroblast datasets”). We used a two-sided Wilcoxon test of fibroblast identity gene scores between all pairs of fibroblast sources to determine significance via ggpubr::compare_means. We did the same analysis with differentially accessible peaks in the normalized chromatin accessibility matrix using cutoffs FDR < 0.1, logFC > 0.1, and AUC > 0.58 to get 248 lining peaks and 294 sublining peaks.
Tissue and blood analysis
We downloaded a publicly available 10x Single Cell Multiome ATAC + Gene Expression dataset92 of healthy donor (female, age 25) PBMCs with granulocytes removed through cell sorting as part of our sister study93 (‘Public PBMC’ dataset). The PBMC cell labels were generated using the processing defined in that study. No further quality control was done on the fragment file downloaded from the 10x website (https://cf.10xgenomics.com/samples/cell-arc/2.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz). For each cell type (B, T, and myeloid), we subset the fragment file by that cell type’s cells and then overlapped them with our peaks to get a peaks x cells matrix as done in “ATAC quality control”. We concatenated this matrix to our RA tissue’s peaks x cells matrix for each corresponding cell type and then re-clustered using the same PMA and variable peaks chosen for tissue and harmonizing by sample. We chose the resolution that best mirrored the RA tissue chromatin classes. The odds ratio for each individual biological source’s cell label and the combined tissue and blood cluster label was calculated as in “Class/state odds ratio”. We replicated this analysis using the RA PBMCs for TFH/TPH and Treg FACS populations and the 5 RA tissue chromatin classes.
Class/state odds ratio
For each combination of chromatin class and transcriptional cell state within a cell type, we constructed a 2 × 2 contingency table of the number of cells belonging or not to the class and/or state. For cell states that had >10 cells, we then calculated the odds ratio (OR) and p value via stats::fisher.test. We did multiple hypothesis test correction via stats::p.adjust using FDR < 0.05. We displayed the natural log of the OR via base::log, and if the value was infinite, we capped it at 1 plus the ceiling of the non-infinite max absolute value of logged OR for display purposes; negative infinity was the negative capped number. All the ORs and p values for all class/state combinations from Fig. 7a–c and Supplementary Fig. 10g, h are in Supplementary Data 3.
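The display rule for infinite log odds ratios can be sketched as follows. Note one simplification: stats::fisher.test reports a conditional maximum likelihood estimate of the OR, whereas this sketch uses the plain sample odds ratio (a·d)/(b·c). Both function names are hypothetical.

```python
import math

def log_odds_ratio(a, b, c, d):
    """Natural log of the sample odds ratio for the 2x2 table [[a, b], [c, d]]."""
    if (a == 0 or d == 0) and (b == 0 or c == 0):
        return float("nan")  # 0/0 is undefined
    if b == 0 or c == 0:
        return math.inf
    if a == 0 or d == 0:
        return -math.inf
    return math.log((a * d) / (b * c))

def cap_infinite(log_ors):
    """Cap +/-inf at 1 plus the ceiling of the largest finite |log OR|."""
    finite = [abs(x) for x in log_ors if math.isfinite(x)]
    cap = 1 + math.ceil(max(finite, default=0))
    return [cap if x == math.inf else -cap if x == -math.inf else x
            for x in log_ors]

# Toy example: two infinite values are capped for display.
capped = cap_infinite([math.inf, 2.3, -math.inf, -0.5])
```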
We defined the accuracy of the class/state correspondence as the percentage of multiome cells with perfect mapping (i.e., all 5 nearest neighbors in the reference had the same cell state) within each group of ‘concordant’ (i.e., cells whose class and state agreed as determined by the odds ratio) or ‘discordant’ (i.e., cells whose class and state disagreed) cells per cell type. For example, cells mapping to class TA−0: CD8A+ GZMK+ and state T-14: CD8+ GZMK+ memory would be ‘concordant’ cells while cells mapping to class TA−2: CD4+ PD-1+ TFH/TPH and state T-14: CD8+ GZMK+ memory would be ‘discordant’ cells.
ATAC pseudobulk differential peak analysis
For T, stromal, and myeloid cell types, we summed the non-binary peaks x cells matrix by sample and transcriptional cell state combinations across cells. We subset the summed matrix to include only samples with more than 150 cells, states with more than 130 cells, and combinations with more than 10 cells. For the within-class analysis, we split the matrix by the transcriptional cell states that belonged to the same chromatin class (e.g., 5 T cell matrices); we excluded any class with only 1 state passing our QC thresholds. We also kept the full matrix per cell type for the across-classes analysis. We subset peaks by each cell type’s promoter PMA peaks (see “T cell lineage analysis”) that had at least 5 reads across the pseudobulks within that analysis. For each peak for each set of states (either within or across classes), we calculated two negative binomial models of that peak’s sample/state pseudobulk distribution using MASS::glm.nb, accounting for covariates of sample identity (‘sample’) and the number of fragments (‘nFragments’) in the sample and cell state combination and differing by the inclusion of transcriptional cell state (‘cell state’).
Cell state and sample were represented by 1-hot encoded matrices. We calculated an ANOVA log-likelihood ratio test (LRT) p value between these two models and applied multiple hypothesis test correction within each analysis separately via FDR. Peaks were considered differential if they had FDR < 0.10.
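The correction and thresholding step can be sketched as below, assuming that "FDR" refers to the Benjamini-Hochberg procedure (the method that R's stats::p.adjust applies for "fdr"). The function name is hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    n = len(pvalues)
    # Walk from the largest p value down, keeping a running minimum so the
    # adjusted values are monotone in the original p values.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, i in enumerate(order):
        rank = n - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Toy example: flag peaks as differential at FDR < 0.10.
p = [0.01, 0.04, 0.03, 0.5]
differential = [q < 0.10 for q in bh_adjust(p)]
```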
Linear discriminant analysis
We used LDA to determine how well knowing the chromatin harmonized principal component (hPC) information helped predict the mRNA fine-grain cell states for each pairwise combination of states. We specifically used pairwise combinations instead of 1 versus all comparisons to assess the chromatin accessibility data’s ability to give rise to one or multiple transcriptional cell states. For each pair of transcriptional cell states within a broad cell type, we subset all data structures by those cells and remade the cell state vector into a 1-hot encoding. If either cell state of the pair had <50 cells, we excluded it from further analysis. We used the 10 chromatin hPCs from the fine-grain chromatin class clustering (see “Fine-grain chromatin class clustering”). Covariates of sample (1-hot encoded for 12 samples) and scaled logged number of fragments (nFragments) were used since both can affect cell type identity. We trained an LDA model using MASS::lda on 75% of cells across the pair of states, verifying that the training and testing sets had cells from both states; the model predicted the cell state from the 10 chromatin hPCs, sample, and nFragments.
We tested the model using stats::predict for the 25% of held-out data and quantified the discriminative value of the model using an area under the curve (AUC) metric from ROCR140 library functions ROCR::prediction and ROCR::performance. Pairs of distinct clusters were only calculated once; the square matrices of results have the triangles mirrored. If the cell states were the same and a model was not run (identity line) or the model between pairs of clusters had a constant variable due to samples with too few cells (non-identity line), the box is greyed out.
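For intuition about the AUC metric, the sketch below computes it directly as the probability that a held-out cell from one state receives a higher model score than a cell from the other state, with ties counted as one half. This rank-based form is equivalent to the area under the ROC curve; it is a stand-in for illustration, not the ROCR code path, and the function name is hypothetical.

```python
def pairwise_auc(scores, labels):
    """AUC for binary labels (1 = first state, 0 = second state)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("held-out set must contain cells from both states")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example: posterior scores for four held-out cells.
auc_value = pairwise_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
```

An AUC near 0.5 would indicate that the chromatin hPCs carry little information for separating that pair of transcriptional states.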
Superstate FACS protocol
From pooled PBMC samples from 4 RA patients, we enriched for CD4 T cells using the MACS protocol and sorted for 4 populations using FACS (CD4+CD127−CD25hi Tregs, CD4+CD127−CD25int Tregs, CD4+CD25−PD1+CXCR5+ TFH, and CD4+CD25−PD1+CXCR5− TPH). FACS sequential gating plots can be found in Supplementary Fig. 15a. We used the following antibodies: CD3-FITC, CD4-BV421, CD25-PE-Cy7, CD127-BV650, CXCR5-PE, PD1-APC. All antibodies were purchased from BioLegend and used at one microliter per million cells. The Live/Dead dye 7-AAD was purchased from ThermoFisher Scientific and used at five microliters per million cells. After nuclei isolation, each sorted population was tagged with a nuclear hashing antibody before pooling across populations. Total-Seq™-A hashtag antibodies (A0451-A0454) were purchased from BioLegend and used at a 1:40 dilution.
Superstate multiome experimental protocol
We performed a multiome experiment as described in “Multiome experimental protocol”, with the additional step of producing cDNA from Hashtag oligos (for Protein Antibody Hashtags) during GEM incubation, generating the Hashtag library alongside the Gene Expression library. The Hashtag library was sequenced at approximately five thousand reads per cell.
Superstate multiome quality control
Quality control steps for the superstate multiome experiment were the same as the RA tissue multiome experiments, up to and not including the doublet step in both modalities (Supplementary Fig. 1b). To better account for doublets between these very similar cell states, we only included cells with a single identity determined by running Seurat::HTODemux131 on the normalized hashtag library. Those cell state identities were strictly used as a label. Cells needed to pass QC in all three modalities to be included in the downstream analysis. We kept 402 CD4+CD127−CD25hi Tregs, 1690 CD4+CD127−CD25int Tregs, 535 CD4+CD25−PD1+CXCR5+ TFH, and 371 CD4+CD25−PD1+CXCR5− TPH cells.
Single-cell differential peak analysis
We used a logistic regression model to determine differential promoter peaks across chromatin class identity. We did this at the single cell level for the combined unimodal scATAC-seq and multimodal snATAC-seq cells and took into account the cell’s sample (‘sample’) and overall fragment counts (‘nFragments’) as covariates. Genome-wide promoter peaks were defined per cell type as in “T cell lineage analysis”. For each peak and class combination, we calculated two logistic regressions using lme4::glmer134 with a nloptwrap optimizer for speed: a full model of peak that included the class, sample, and nFragments terms, and a null model that omitted the class term.
The log2FC was determined as the cell type beta. We calculated significance as a LRT between the full and null models with multiple hypothesis test corrections using FDR. The top 5 peaks per class, defined as having log2FC > 0.5 and −log10(FDR) > 5, ordered by FDR, are shown in Supplementary Data 4.
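The LRT p value can be sketched in closed form under the assumption that the full and null models differ by a single coefficient, so that the test statistic is referred to a chi-square distribution with 1 degree of freedom. The function name is hypothetical, and the log-likelihoods would come from the fitted models.

```python
import math

def lrt_pvalue_1df(loglik_full, loglik_null):
    """Likelihood ratio test p value for nested models differing by 1 term.

    For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    """
    statistic = 2.0 * (loglik_full - loglik_null)
    statistic = max(statistic, 0.0)  # guard against small negative values
    return math.erfc(math.sqrt(statistic / 2.0))

# Toy example with illustrative log-likelihoods.
p_value = lrt_pvalue_1df(loglik_full=-100.0, loglik_null=-105.0)
```

Models that differ by more than one coefficient need the general chi-square survival function, which the Python standard library does not provide.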
Single-cell differential gene analysis
For the multiome cells only, we calculated differentially expressed genes between chromatin class identities within a cell type via a two-sided Wilcoxon test using a normalized gene expression matrix input to presto::wilcoxauc. The top 5 genes per class, defined as having logFC > 0.5 and −log10(FDR) > 5, ordered by FDR and logFC, are shown in Supplementary Data 4. We selected one peak of potentially multiple that overlapped the annotated gene based on the differential peak’s significance in the corresponding class.
TFH/TPH/Treg differential feature analysis
For the sorted RA PBMCs, we determined differential genes and peaks between each pair of states within one chromatin class: (1) CD4+CD127−CD25hi Tregs and CD4+CD127−CD25int Tregs; (2) CD4+CD25−PD1+CXCR5+ TFH and CD4+CD25−PD1+CXCR5− TPH. We calculated differential genes as in “Single-cell differential gene analysis”. Differential promoter peaks were calculated similarly to “Single-cell differential peak analysis”, but we excluded sample as a covariate since there was a single pooled RA PBMC sample and used stats::glm instead of lme4::glmer since we removed the random effect of sample, thus negating the need for a mixed effect model. If a gene had multiple promoter peaks, we chose the peak with the max normalized peak accessibility summed across cells in that pair of states. Furthermore, we only included peak/gene pairs with at least 1 fragment/UMI in greater than 50 cells in that pair of states. We corrected p values using FDR separately within modalities.
Symphony classification of chromatin class
To utilize the richer clinical information in the more abundant AMP-RA reference datasets, we classified each AMP-RA reference cell into a chromatin class. We used the same shared transcriptional spaces by cell type defined in “Symphony classification of transcriptional identity”, but we reversed the reference and query objects in the knnPredict function, such that the multiome cells were in the ‘reference’ and the AMP-RA reference cells were in the ‘query’. We used the most common annotation of the 5 nearest multiome neighbors to classify the chromatin class in the AMP-RA reference cells. We averaged the 5 nearest multiome neighbors’ UMAP dimensions to visualize the classified chromatin classes in the AMP-RA reference cells on the chromatin class UMAPs.
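The reversed mapping can be sketched as follows: each query cell takes the most common chromatin class of its 5 nearest multiome neighbors and is placed at the mean of their UMAP coordinates. Euclidean distance in the shared embedding, the function name, and the tuple layout are assumptions of this sketch rather than details of the Symphony implementation.

```python
import math
from collections import Counter

def classify_and_place(query_embedding, multiome_cells, k=5):
    """Assign a chromatin class and UMAP position to one query cell.

    multiome_cells: list of (embedding, chromatin_class, (umap1, umap2)).
    """
    nearest = sorted(
        multiome_cells, key=lambda cell: math.dist(query_embedding, cell[0])
    )[:k]
    label = Counter(cls for _, cls, _ in nearest).most_common(1)[0][0]
    umap1 = sum(u[0] for _, _, u in nearest) / len(nearest)
    umap2 = sum(u[1] for _, _, u in nearest) / len(nearest)
    return label, (umap1, umap2)

# Toy example in a 2-dimensional embedding; class names are illustrative.
multiome = [
    ((0.0, 0.0), "TA-0", (1.0, 1.0)),
    ((0.1, 0.0), "TA-0", (1.0, 2.0)),
    ((0.0, 0.1), "TA-0", (2.0, 1.0)),
    ((0.2, 0.0), "TA-2", (3.0, 3.0)),
    ((0.0, 0.2), "TA-2", (3.0, 3.0)),
    ((5.0, 5.0), "TA-2", (9.0, 9.0)),
]
result = classify_and_place((0.0, 0.0), multiome)
```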
Unimodal scATAC-seq and AMP-RA CITE-seq shared donor analysis
There were different samples that came from the same donors in the unimodal scATAC-seq and AMP-RA reference CITE-seq datasets. We expected similar, but not the same, chromatin class proportions for samples coming from the same donor’s tissue but put through different experimental protocols and class assignment methods. First, we filtered out any donors that did not have at least 200 scATAC-seq or CITE-seq cells in all cell types except endothelial, in which we lowered the threshold to 100 cells. We then calculated the proportion of each sample’s cells coming from each chromatin class for each technology and plotted the CITE-seq proportion by scATAC-seq proportion for each donor, faceted by chromatin class in Fig. 8a and Supplementary Fig. 16a. We calculated the Pearson correlation and two-sided p value for each chromatin class by stats::cor.test.
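The per-class comparison can be sketched in two small steps: compute each sample's chromatin class proportions, then correlate the proportions of one class across donors between the two technologies. Function names are hypothetical, and the p value that stats::cor.test also returns is omitted here.

```python
import math
from collections import Counter

def class_proportions(class_labels):
    """Fraction of a sample's cells assigned to each chromatin class."""
    counts = Counter(class_labels)
    total = len(class_labels)
    return {cls: n / total for cls, n in counts.items()}

def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Toy example: one class's proportion in three donors, per technology.
atac_props = [0.10, 0.20, 0.30]
cite_props = [0.15, 0.22, 0.41]
r = pearson_r(atac_props, cite_props)
```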
Co-varying neighborhood analysis
We used the significant CNA99 correlations between AMP-RA reference cell neighborhoods and sample-level covariates from our AMP-RA reference study14. We re-plotted the AMP-RA reference cell CNA correlations on the chromatin class UMAPs and re-aggregated them by classified chromatin class calculated in “Symphony classification of chromatin class”. In Supplementary Table 6, clinical metrics and CTAPs were listed if the median abundance correlation of the AMP-RA reference cells within their Symphony-classified chromatin class was more extreme than the FDR threshold for that patient attribute14. Classes were considered significantly expanded if that class’s cells were positively correlated with that attribute’s per-sample class abundance within a cell type and depleted if negatively correlated.
Genetic variant analysis
We used the set of RA-associated non-coding SNP locations and statistically fine-mapped PIPs from our previously published RA multi-ancestry genome-wide association meta-analysis study107. We subsetted the SNPs by PIP > 0.1 and overlapped their locations with our 200 bp trimmed peaks using intersectBed129. For the overlapping peaks, we plotted their normalized chromatin accessibility mean-aggregated by chromatin class and scaled in Fig. 8d with more description in Supplementary Table 5. To determine broad cell type specificity of a peak’s accessibility, we calculated a Wilcoxon test one-sided “greater” p value between the normalized, mean aggregated, scaled peak accessibility in the broad cell type’s classes versus those classes in the other broad cell types. Classes were considered accessible for that peak if z > 1, where z is the mean normalized peak accessibility scaled over the 24 classes and 11 peaks. We plotted example loci in Fig. 8e and Supplementary Fig. 17 as described in “Loci visualization”; we excluded some chromatin classes for space, but we kept the most accessible chromatin classes and at least one chromatin class from each cell type at each locus. The TF motif logos in Fig. 8e and Supplementary Fig. 17 were downloaded from the JASPAR motif database109 for accession IDs MA0517.1 (STAT1::STAT2), MA0039.4 (KLF4), and MA1483.1 (ELF2); they were not to scale, but the motif position the SNP disrupts was aligned to the SNP. We further aggregated multimodal snATAC-seq reads by transcriptional cell state for visualization purposes in Supplementary Fig. 18.
Computational versions used
Specific software versions are listed here, but more information about how they were used within this study can be found in the appropriate Methods sections.
Flow cytometry data was analyzed using FlowJo (v10.7.2 for tissue samples and v10.8.1 for blood samples).
We used R v3.6.1 for most analyses with the following packages: argparse v2.0.3, aricode v1.0.0, BiocGenerics v0.30.0, class v7.3-17, data.table v1.12.8, dplyr v1.0.2, GenomeInfoDb v1.20.0, GenomicRanges v1.36.1, ggbeeswarm v0.6.0, ggplot2 v3.3.0, ggpubr v0.4.0, ggrastr v0.2.3, ggrepel v0.8.2, ggthemes v4.2.0, gplots v3.0.1.1, gridExtra v2.3, gtools v3.8.2, harmony v1.0, IRanges v2.18.3, irlba v2.3.3, lattice v0.20-41, lme4 v1.1-21, magrittr v1.5, MASS v7.3-51.6, Matrix v1.2-18, Matrix.utils v0.9.7, matrixStats v0.56.0, patchwork v1.1.0.9000, pheatmap v1.0.12, plyr v1.8.6, presto v1.0.0, RANN v2.6.1, RColorBrewer v1.1-2, rcompanion v2.4.1, Rcpp v1.0.4.6, RcppCNPy v0.2.10, repr v1.0.1, reticulate v1.13, Rmisc v1.5.1, ROCR v1.0-7, rstatix v0.7.0, S4Vectors v0.22.1, scales v1.1.1, Seurat v3.2.0, Signac v1.1.0, stringr v1.4.0, symphony v1.0, tibble v3.0.1, tidyr v1.0.3, umap v0.2.3.1, uwot v0.1.8, viridis v0.5.1, viridisLite v0.3.0.
For ArchR analyses, we used R v4.2.0 with the following packages: ArchR v1.0.2, argparse v2.1.6, Biobase v2.56.0, BiocGenerics v0.42.0, Biostrings v2.64.1, BSgenome v1.64.0, BSgenome.Hsapiens.UCSC.hg38 v1.4.4, chromVARmotifs v0.2.0, data.table v1.14.4, GenomeInfoDb v1.32.4, GenomicRanges v1.48.0, ggplot2 v3.3.6, gridExtra v2.3, gtable v0.3.1, gtools v3.9.3, IRanges v2.30.1, JASPAR2016 v1.24.0, JASPAR2018 v1.1.1, JASPAR2020 v0.99.10, magrittr v2.0.3, Matrix v1.5-1, MatrixGenerics v1.8.1, matrixStats v0.62.0, plyr v1.8.7, Rcpp v1.0.9, rhdf5 v2.40.0, rtracklayer v1.56.1, S4Vectors v0.34.0, stringr v1.4.1, SummarizedExperiment v1.26.1, TFBSTools v1.34.0, tidyr v1.2.1, XVector v0.36.0.
We also used python v3.7.3, scrublet v0.2.3, samtools v1.9, bedtools v2.28.0, bedops v2.4.36, GNU Awk 3.1.7, jupyter v4.4.0.
Acknowledgements
This work was supported by the Accelerating Medicines Partnership (AMP) in Rheumatoid Arthritis and Lupus Network. AMP is a public-private partnership (AbbVie Inc., Arthritis Foundation, Bristol-Myers Squibb Company, Foundation for the National Institutes of Health, GlaxoSmithKline, Janssen Research and Development, LLC, Lupus Foundation of America, Lupus Research Alliance, Merck Sharp & Dohme Corp., National Institute of Allergy and Infectious Diseases, National Institute of Arthritis and Musculoskeletal and Skin Diseases, Pfizer Inc., Rheumatology Research Foundation, Sanofi and Takeda Pharmaceuticals International, Inc.) created to develop new ways of identifying and validating promising biological targets for diagnostics and drug development. Funding was provided through grants from the National Institutes of Health (UH2-AR067676, UH2-AR067677, UH2-AR067679, UH2-AR067681, UH2-AR067685, UH2-AR067690, UH2-AR067691, UH2-AR067694, and UM2-AR067678). Accelerating Medicines Partnership and AMP are registered service marks of the U.S. Department of Health and Human Services. This work is supported in part by funding from the National Institutes of Health (1UH2AR067677-01, U01HG009379, UC2AR081023). We also acknowledge support by NIH NHGRI T32HG002295 and NIAMS T32AR007530 (to K. Weinand and A.N.); Arthritis National Research Foundation (to F.Z.); NIAID T32AR007258 (to K.S.); and NIH NIAMS R01AR078769 (to D.A.R.). S.S. was in part supported by the Uehara Memorial Foundation and the Osamu Hayaishi Memorial Scholarship. K. Wei is supported by NIH-NIAMS K08AR077037, Burroughs Wellcome Fund Career Awards for Medical Scientists, a Doris Duke Charitable Foundation Clinical Scientist Development Award, and a Rheumatology Research Foundation Innovative Research Award. UK Birmingham is supported by the Versus Arthritis Research Into Inflammatory Arthritis Centre Versus Arthritis (Versus Arthritis grant 22072) and the EU Innovative Medicines Initiative RT CURE. 
We wish to thank Tiffany Amariuta, Kaitlyn A. Lagattuta, Anika Gupta, Angela Zou, Miles Tran, and Nicholas Sugiarto for the helpful discussion.
Author contributions
K. Weinand, S.S., and S.R. conceptualized the study. K. Weinand conducted all computational analyses. S.S., A.N., F.Z., and S.R. provided input on statistical analyses and study design. S.S., A.N., A.H.J., D.A.R., J.H.A., M.B.B., K. Wei, and S.R. provided input on cellular analyses and interpretation. S.S. and S.R. supervised the study. The AMP RA/SLE Network recruited patients and obtained synovial biopsies for unimodal scATAC-seq samples. L.T.D. and K. Wei recruited patients and obtained synovial biopsies for multimodal samples. K. Wei, A.H.J., G.F.M.W., A.N., and M.B.B. designed and implemented the tissue disaggregation, cell sorting, and single-cell sequencing pipeline. A.H.J., K. Wei, and G.F.M.W. supervised and executed the tissue disaggregation pipeline for unimodal scATAC-seq samples. K. Wei, G.F.M.W., and Z.Z. supervised and executed the tissue disaggregation pipeline for multimodal tissue samples. K. Weinand, S.S., G.F.M.W., M.A.S., D.A.R., K. Wei, and S.R. designed the RA PBMC multimodal experiment, L.T.D. provided the samples, and M.A.S. executed the experiment. K. Weinand, S.S., and S.R. wrote the initial manuscript. All authors contributed to editing the final manuscript.
Peer review
Peer review information
Nature Communications thanks Caroline Ospelt, Richard Scheuermann and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Data availability
The raw FASTQ files generated in this study have been deposited in the dbGaP database under accession code phs003417.v2.p1. These data are available under restricted access as patient-identifiable data; access can be requested from dbGaP. The processed data files generated in this study have been deposited in Synapse under accession code syn53650034 (ref. 141). Source data are provided with this paper. Symphony references from ref. 14 are available in Synapse under accession code syn52297840 (ref. 142). Cultured unstimulated FLS multiome datasets from ref. 44 are available in ImmPort under accession ID SDY2213. JASPAR motifs from ref. 109 are available in JASPAR under accession codes MA0517.1, MA0039.4, MA1483.1, and MA0078.1.
Code availability
The code used to generate the results presented herein can be found on GitHub (https://github.com/immunogenomics/RA_ATAC_multiome/).
Competing interests
S.R. is a founder of Mestag Therapeutics and a scientific advisor for Janssen, Sonoma, and Pfizer. D.A.R. reports personal fees from Pfizer, Janssen, Merck, GlaxoSmithKline, AstraZeneca, Scipher Medicine, HiFiBio, and Bristol-Myers Squibb, and grant support from Merck, Janssen, and Bristol-Myers Squibb outside the submitted work. D.A.R. is a co-inventor on the patent for TPH cells as a biomarker of autoimmunity. The remaining authors declare no competing interests.
Footnotes
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
A list of authors and their affiliations appears at the end of the paper.
Contributor Information
Soumya Raychaudhuri, Email: soumya@broadinstitute.org.
Accelerating Medicines Partnership Program: Rheumatoid Arthritis and Systemic Lupus Erythematosus (AMP RA/SLE) Network:
Jennifer Albrecht, William Apruzzese, Nirmal Banda, Jennifer L. Barnas, Joan M. Bathon, Ami Ben-Artzi, Brendan F. Boyce, David L. Boyle, S. Louis Bridges, Jr., Vivian P. Bykerk, Debbie Campbell, Hayley L. Carr, Arnold Ceponis, Adam Chicoine, Andrew Cordle, Michelle Curtis, Kevin D. Deane, Edward DiCarlo, Patrick Dunn, Andrew Filer, Gary S. Firestein, Lindsy Forbess, Laura Geraldino-Pardilla, Susan M. Goodman, Ellen M. Gravallese, Peter K. Gregersen, Joel M. Guthridge, Maria Gutierrez-Arcelus, Siddarth Gurajala, V. Michael Holers, Diane Horowitz, Laura B. Hughes, Kazuyoshi Ishigaki, Lionel B. Ivashkiv, Judith A. James, Joyce B. Kang, Gregory Keras, Ilya Korsunsky, Amit Lakhanpal, James A. Lederer, Zhihan J. Li, Yuhong Li, Katherine P. Liao, Arthur M. Mandelin, II, Ian Mantel, Mark Maybury, Andrew McDavid, Joseph Mears, Nida Meednu, Nghia Millard, Larry W. Moreland, Alessandra Nerviani, Dana E. Orange, Harris Perlman, Costantino Pitzalis, Javier Rangel-Moreno, Karim Raza, Yakir Reshef, Christopher Ritchlin, Felice Rivellese, William H. Robinson, Laurie Rumker, Ilfita Sahbudin, Dagmar Scheel-Toellner, Jennifer A. Seifert, Kamil Slowikowski, Melanie H. Smith, Darren Tabechian, Paul J. Utz, Dana Weisenfeld, Michael H. Weisman, and Qian Xiao
Supplementary information
The online version contains supplementary material available at 10.1038/s41467-024-48620-7.
References
- 1.Smolen JS, et al. Rheumatoid arthritis. Nat. Rev. Dis. Prim. 2018;4:1–23.
- 2.Smolen JS, Aletaha D, McInnes IB. Rheumatoid arthritis. Lancet. 2016;388:2023–2038. doi: 10.1016/S0140-6736(16)30173-8.
- 3.Han B, et al. Fine mapping seronegative and seropositive rheumatoid arthritis to shared and distinct HLA alleles by adjusting for the effects of heterogeneity. Am. J. Hum. Genet. 2014;94:522–532. doi: 10.1016/j.ajhg.2014.02.013.
- 4.Padyukov L. Genetics of rheumatoid arthritis. Semin. Immunopathol. 2022;44:47–62. doi: 10.1007/s00281-022-00912-0.
- 5.Viatte S, Barton A. Genetics of rheumatoid arthritis susceptibility, severity, and treatment response. Semin. Immunopathol. 2017;39:395–408. doi: 10.1007/s00281-017-0630-4.
- 6.Pitzalis C, Choy EHS, Buch MH. Transforming clinical trials in rheumatology: towards patient-centric precision medicine. Nat. Rev. Rheumatol. 2020;16:590–599. doi: 10.1038/s41584-020-0491-4.
- 7.Yazici Y, et al. Efficacy of tocilizumab in patients with moderate to severe active rheumatoid arthritis and a previous inadequate response to disease-modifying antirheumatic drugs: the ROSE study. Ann. Rheum. Dis. 2012;71:198–205. doi: 10.1136/ard.2010.148700.
- 8.Genovese MC, et al. Interleukin-6 receptor inhibition with tocilizumab reduces disease activity in rheumatoid arthritis with inadequate response to disease-modifying antirheumatic drugs: the tocilizumab in combination with traditional disease-modifying antirheumatic drug therapy study. Arthritis Rheum. 2008;58:2968–2980. doi: 10.1002/art.23940.
- 9.Schwartz DM, et al. JAK inhibition as a therapeutic strategy for immune and inflammatory diseases. Nat. Rev. Drug Discov. 2017;17:78. doi: 10.1038/nrd.2017.267.
- 10.Humby F, et al. Rituximab versus tocilizumab in anti-TNF inadequate responder patients with rheumatoid arthritis (R4RA): 16-week outcomes of a stratified, biopsy-driven, multicentre, open-label, phase 4 randomised controlled trial. Lancet. 2021;397:305–317. doi: 10.1016/S0140-6736(20)32341-2.
- 11.Zhang F, et al. Defining inflammatory cell states in rheumatoid arthritis joint synovial tissues by integrating single-cell transcriptomics and mass cytometry. Nat. Immunol. 2019;20:928–942. doi: 10.1038/s41590-019-0378-1.
- 12.Rao DA, et al. Pathologically expanded peripheral T helper cell subset drives B cells in rheumatoid arthritis. Nature. 2017;542:110–114. doi: 10.1038/nature20810.
- 13.Alivernini S, et al. Distinct synovial tissue macrophage subsets regulate inflammation and remission in rheumatoid arthritis. Nat. Med. 2020;26:1295–1306. doi: 10.1038/s41591-020-0939-8.
- 14.Zhang F, et al. Deconstruction of rheumatoid arthritis synovium defines inflammatory subtypes. Nature. 2023;623:616–624. doi: 10.1038/s41586-023-06708-y.
- 15.Stoeckius M, et al. Simultaneous epitope and transcriptome measurement in single cells. Nat. Methods. 2017;14:865–868. doi: 10.1038/nmeth.4380.
- 16.Qin Y, et al. Age-associated B cells contribute to the pathogenesis of rheumatoid arthritis by inducing activation of fibroblast-like synoviocytes via TNF-α-mediated ERK1/2 and JAK-STAT1 pathways. Ann. Rheum. Dis. 2022;81:1504–1514. doi: 10.1136/ard-2022-222605.
- 17.Roadmap Epigenomics Consortium, et al. Integrative analysis of 111 reference human epigenomes. Nature. 2015;518:317–330. doi: 10.1038/nature14248.
- 18.Yarrington RM, Rudd JS, Stillman DJ. Spatiotemporal cascade of transcription factor binding required for promoter activation. Mol. Cell Biol. 2015;35:688–698. doi: 10.1128/MCB.01285-14.
- 19.Lynch AW, et al. MIRA: joint regulatory modeling of multimodal expression and chromatin accessibility in single cells. Nat. Methods. 2022;19:1097–1108. doi: 10.1038/s41592-022-01595-z.
- 20.Heinz S, Romanoski CE, Benner C, Glass CK. The selection and function of cell type-specific enhancers. Nat. Rev. Mol. Cell Biol. 2015;16:144–154.
- 21.Wei K, et al. Notch signalling drives synovial fibroblast identity and arthritis pathology. Nature. 2020;582:259–264. doi: 10.1038/s41586-020-2222-z.
- 22.Zhang F, et al. IFN-γ and TNF-α drive a CXCL10+ CCL2+ macrophage phenotype expanded in severe COVID-19 lungs and inflammatory diseases with tissue inflammation. Genome Med. 2021;13:64. doi: 10.1186/s13073-021-00881-3.
- 23.Donlin LT, et al. Methods for high-dimensional analysis of cells dissociated from cryopreserved synovial tissue. Arthritis Res. Ther. 2018;20:139. doi: 10.1186/s13075-018-1631-y.
- 24.Amemiya HM, Kundaje A, Boyle AP. The ENCODE blacklist: identification of problematic regions of the genome. Sci. Rep. 2019;9:1–5. doi: 10.1038/s41598-019-45839-z.
- 25.ENCODE Project Consortium, et al. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature. 2020;583:699–710. doi: 10.1038/s41586-020-2493-4.
- 26.Frankish A, et al. GENCODE 2021. Nucleic Acids Res. 2021;49:D916–D923. doi: 10.1093/nar/gkaa1087.
- 27.Korsunsky I, et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods. 2019;16:1289–1296. doi: 10.1038/s41592-019-0619-0.
- 28.Kim H-J, et al. Stable inhibitory activity of regulatory T cells requires the transcription factor Helios. Science. 2015;350:334–339. doi: 10.1126/science.aad0616.
- 29.Perry N, et al. Methylation-sensitive restriction enzyme quantitative polymerase chain reaction enables rapid, accurate, and precise detection of methylation status of the regulatory T cell (Treg)-specific demethylation region in primary human tregs. J. Immunol. 2021;206:446–451. doi: 10.4049/jimmunol.1901275.
- 30.Jonsson AH, et al. Granzyme K+ CD8 T cells form a core population in inflamed human tissue. Sci. Transl. Med. 2022;14:eabo0686. doi: 10.1126/scitranslmed.abo0686.
- 31.Wang L, Xiong Y, Bosselut R. Maintaining CD4–CD8 lineage integrity in T cells: where plasticity serves versatility. Semin. Immunol. 2011;23:360–367. doi: 10.1016/j.smim.2011.08.008.
- 32.Hidalgo LG, Einecke G, Allanach K, Halloran PF. The transcriptome of human cytotoxic T cells: similarities and disparities among allostimulated CD4+ CTL, CD8+ CTL and NK cells. Am. J. Transplant. 2008;8:627–636. doi: 10.1111/j.1600-6143.2007.02128.x.
- 33.Campbell JJ, et al. CCR7 Expression and Memory T Cell Diversity in Humans. J. Immunol. 2001;166:877–884. doi: 10.4049/jimmunol.166.2.877.
- 34.Schep AN, Wu B, Buenrostro JD, Greenleaf WJ. ChromVAR: inferring transcription-factor-associated accessibility from single-cell epigenomic data. Nat. Methods. 2017. doi: 10.1038/nmeth.4401.
- 35.Granja JM, et al. ArchR is a scalable software package for integrative single-cell chromatin accessibility analysis. Nat. Genet. 2021;53:403–411. doi: 10.1038/s41588-021-00790-6.
- 36.Intlekofer AM, et al. Effector and memory CD8+ T cell fate coupled by T-bet and eomesodermin. Nat. Immunol. 2005;6:1236–1244. doi: 10.1038/ni1268.
- 37.Herndler-Brandstetter D, et al. KLRG1+ effector CD8+ T cells lose KLRG1, differentiate into all memory T cell lineages, and convey enhanced protective immunity. Immunity. 2018;48:716–729.e8. doi: 10.1016/j.immuni.2018.03.015.
- 38.Shan Q, et al. The transcription factor Runx3 guards cytotoxic CD8+ effector T cells against deviation towards follicular helper T cell lineage. Nat. Immunol. 2017;18:931–939. doi: 10.1038/ni.3773.
- 39.Ise W, et al. The transcription factor BATF controls the global regulators of class-switch recombination in both B cells and T cells. Nat. Immunol. 2011;12:536–543. doi: 10.1038/ni.2037.
- 40.Chen AF, et al. NEAT-seq: simultaneous profiling of intra-nuclear proteins, chromatin accessibility and gene expression in single cells. Nat. Methods. 2022;19:547–553. doi: 10.1038/s41592-022-01461-y.
- 41.Knab K, Chambers D, Krönke G. Synovial macrophage and fibroblast heterogeneity in joint homeostasis and inflammation. Front. Med. (Lausanne) 2022;9:862161. doi: 10.3389/fmed.2022.862161.
- 42.Buechler MB, et al. Cross-tissue organization of the fibroblast lineage. Nature. 2021;593:575–579. doi: 10.1038/s41586-021-03549-5.
- 43.Korsunsky I, et al. Cross-tissue, single-cell stromal atlas identifies shared pathological fibroblast phenotypes in four chronic inflammatory diseases. Med. (N. Y) 2022;3:481–518.e14. doi: 10.1016/j.medj.2022.05.002.
- 44.Smith MH, et al. Drivers of heterogeneity in synovial fibroblasts in rheumatoid arthritis. Nat. Immunol. 2023;24:1200–1210. doi: 10.1038/s41590-023-01527-9.
- 45.Wei K, Nguyen HN, Brenner MB. Fibroblast pathology in inflammatory diseases. J. Clin. Invest. 2021;131:e149538.
- 46.Kaluscha S, et al. Evidence that direct inhibition of transcription factor binding is the prevailing mode of gene and repeat repression by DNA methylation. Nat. Genet. 2022;54:1895–1906. doi: 10.1038/s41588-022-01241-6.
- 47.Nakano K, Whitaker JW, Boyle DL, Wang W, Firestein GS. DNA methylome signature in rheumatoid arthritis. Ann. Rheum. Dis. 2013;72:110–117. doi: 10.1136/annrheumdis-2012-201526.
- 48.Whitaker JW, et al. An imprinted rheumatoid arthritis methylome signature reflects pathogenic phenotype. Genome Med. 2013;5:40. doi: 10.1186/gm444.
- 49.Bartok B, Firestein GS. Fibroblast-like synoviocytes: key effector cells in rheumatoid arthritis. Immunol. Rev. 2010;233:233–255. doi: 10.1111/j.0105-2896.2009.00859.x.
- 50.Barnett KR, et al. ATAC-Me captures prolonged DNA methylation of dynamic chromatin accessibility loci during cell fate transitions. Mol. Cell. 2020;77:1350–1364.e6. doi: 10.1016/j.molcel.2020.01.004.
- 51.Shiozawa S, Tsumiyama K, Yoshida K, Hashiramoto A. Pathogenesis of joint destruction in rheumatoid arthritis. Arch. Immunol. Ther. Exp. (Warsz.) 2011;59:89–95. doi: 10.1007/s00005-011-0116-3.
- 52.Caire R, et al. YAP transcriptional activity dictates cell response to TNF in vitro. Front. Immunol. 2022;13:856247. doi: 10.3389/fimmu.2022.856247.
- 53.Malemud CJ. The role of the JAK/STAT signal pathway in rheumatoid arthritis. Ther. Adv. Musculoskelet. Dis. 2018;10:117–127. doi: 10.1177/1759720X18776224.
- 54.Stratman AN, et al. Chemokine mediated signalling within arteries promotes vascular smooth muscle cell recruitment. Commun. Biol. 2020;3:734. doi: 10.1038/s42003-020-01462-7.
- 55.Wu J, Bohanan CS, Neumann JC, Lingrel JB. KLF2 transcription factor modulates blood vessel maturation through smooth muscle cell migration. J. Biol. Chem. 2008;283:3942–3950. doi: 10.1074/jbc.M707882200.
- 56.Pagani F, Tratta E, Dell’Era P, Cominelli M, Poliani PL. EBF1 is expressed in pericytes and contributes to pericyte cell commitment. Histochem. Cell Biol. 2021;156:333–347. doi: 10.1007/s00418-021-02015-7.
- 57.Parker KR, et al. Single-cell analyses identify brain mural cells expressing CD19 as potential off-tumor targets for CAR-T immunotherapies. Cell. 2020;183:126–142.e17. doi: 10.1016/j.cell.2020.08.022.
- 58.Wang Y, Yan K, Lin J, Li J, Bi J. Macrophage M2 co-expression factors correlate with the immune microenvironment and predict outcome of renal clear cell carcinoma. Front. Genet. 2021;12:615655. doi: 10.3389/fgene.2021.615655.
- 59.Meng Q, Pan B, Sheng P. Histone deacetylase 1 is increased in rheumatoid arthritis synovium and promotes synovial cell hyperplasia and synovial inflammation in the collagen-induced arthritis mouse model via the microRNA-124-dependent MARCKS-JAK/STAT axis. Clin. Exp. Rheumatol. 2021;39:970–981. doi: 10.55563/clinexprheumatol/1xsigp.
- 60.Remmerie A, et al. Osteopontin expression identifies a subset of recruited macrophages distinct from Kupffer cells in the fatty liver. Immunity. 2020;53:641–657.e14. doi: 10.1016/j.immuni.2020.08.004.
- 61.Zhang F, Luo W, Li Y, Gao S, Lei G. Role of osteopontin in rheumatoid arthritis. Rheumatol. Int. 2015;35:589–595. doi: 10.1007/s00296-014-3122-z.
- 62.Roberts AW, et al. Tissue-resident macrophages are locally programmed for silent clearance of apoptotic cells. Immunity. 2017;47:913–927.e6. doi: 10.1016/j.immuni.2017.10.006.
- 63.Galvan MD, Foreman DB, Zeng E, Tan JC, Bohlson SS. Complement component C1q regulates macrophage expression of mer tyrosine kinase to promote clearance of apoptotic cells. J. Immunol. 2012;188:3716–3723. doi: 10.4049/jimmunol.1102920.
- 64.Schmidl C, et al. Transcription and enhancer profiling in human monocyte subsets. Blood. 2014;123:e90–e99. doi: 10.1182/blood-2013-02-484188.
- 65.Resendes KK, Rosmarin AG. Sp1 control of gene expression in myeloid cells. Crit. Rev. Eukaryot. Gene Expr. 2004;14:171–182. doi: 10.1615/CritRevEukaryotGeneExpr.v14.i3.20.
- 66.Chopin M, et al. Transcription factor PU.1 promotes conventional dendritic cell identity and function via induction of transcriptional regulator DC-SCRIPT. Immunity. 2019;50:77–90.e5. doi: 10.1016/j.immuni.2018.11.010.
- 67.Schotte R, Nagasawa M, Weijer K, Spits H, Blom B. The ETS transcription factor Spi-B is required for human plasmacytoid dendritic cell development. J. Exp. Med. 2004;200:1503–1509. doi: 10.1084/jem.20041231.
- 68.Sanz I, et al. Challenges and opportunities for consistent classification of human B cell and plasma cell populations. Front. Immunol. 2019;10:2458. doi: 10.3389/fimmu.2019.02458.
- 69.Descatoire M, et al. Identification of a human splenic marginal zone B cell precursor with NOTCH2-dependent differentiation properties. J. Exp. Med. 2014;211:987–1000. doi: 10.1084/jem.20132203.
- 70.Moroney JB, Vasudev A, Pertsemlidis A, Zan H, Casali P. Integrative transcriptome and chromatin landscape analysis reveals distinct epigenetic regulations in human memory B cells. Nat. Commun. 2020;11:5435. doi: 10.1038/s41467-020-19242-6.
- 71.Mouat IC, Goldberg E, Horwitz MS. Age-associated B cells in autoimmune diseases. Cell Mol. Life Sci. 2022;79:402. doi: 10.1007/s00018-022-04433-9.
- 72.Rubtsov AV, et al. CD11c-expressing B cells are located at the T cell/B cell border in spleen and are potent APCs. J. Immunol. 2015;195:71–79. doi: 10.4049/jimmunol.1500055.
- 73.Al-Maskari M, et al. Site-1 protease function is essential for the generation of antibody secreting cells and reprogramming for secretory activity. Sci. Rep. 2018;8:14338. doi: 10.1038/s41598-018-32705-7.
- 74.Meednu N, et al. Dynamic spectrum of ectopic lymphoid B cell activation and hypermutation in the RA synovium characterized by NR4A nuclear receptor expression. Cell Rep. 2022;39:110766. doi: 10.1016/j.celrep.2022.110766.
- 75.Willis SN, et al. Environmental sensing by mature B cells is controlled by the transcription factors PU.1 and SpiB. Nat. Commun. 2017;8:1426. doi: 10.1038/s41467-017-01605-1.
- 76.Wang Y, et al. Rheumatoid arthritis patients display B-cell dysregulation already in the naïve repertoire consistent with defects in B-cell tolerance. Sci. Rep. 2019;9:19995. doi: 10.1038/s41598-019-56279-0.
- 77.Wu F, et al. B cells in rheumatoid arthritis: pathogenic mechanisms and treatment prospects. Front. Immunol. 2021;12:750753. doi: 10.3389/fimmu.2021.750753.
- 78.Nguyen HV, et al. The Ets-1 transcription factor is required for Stat1-mediated T-bet expression and IgG2a class switching in mouse B cells. Blood. 2012;119:4174–4181. doi: 10.1182/blood-2011-09-378182.
- 79.Winkelmann R, et al. B cell homeostasis and plasma cell homing controlled by Krüppel-like factor 2. Proc. Natl Acad. Sci. USA. 2011;108:710–715. doi: 10.1073/pnas.1012858108.
- 80.Mora-López F, Pedreño-Horrillo N, Delgado-Pérez L, Brieva JA, Campos-Caro A. Transcription of PRDM1, the master regulator for plasma cell differentiation, depends on an SP1/SP3/EGR-1 GC-box. Eur. J. Immunol. 2008;38:2316–2324. doi: 10.1002/eji.200737861.
- 81.Fan F, Podar K. The role of AP-1 transcription factors in plasma cell biology and multiple myeloma pathophysiology. Cancers (Basel) 2021;13:2326.
- 82.Kutschera S, et al. Differential endothelial transcriptomics identifies semaphorin 3G as a vascular class 3 semaphorin. Arterioscler. Thromb. Vasc. Biol. 2011;31:151–159. doi: 10.1161/ATVBAHA.110.215871.
- 83.Thiriot A, et al. Differential DARC/ACKR1 expression distinguishes venular from non-venular endothelial cells in murine tissues. BMC Biol. 2017;15:45. doi: 10.1186/s12915-017-0381-7.
- 84.Kalucka J, et al. Single-cell transcriptome atlas of murine endothelial cells. Cell. 2020;180:764–779.e20. doi: 10.1016/j.cell.2020.01.015.
- 85.Wigle JT, et al. An essential role for Prox1 in the induction of the lymphatic endothelial cell phenotype. EMBO J. 2002;21:1505–1513. doi: 10.1093/emboj/21.7.1505.
- 86.Corada M, et al. Sox17 is indispensable for acquisition and maintenance of arterial identity. Nat. Commun. 2013;4:2609. doi: 10.1038/ncomms3609.
- 87.Cheng W-X, et al. Genistein inhibits angiogenesis developed during rheumatoid arthritis through the IL-6/JAK2/STAT3/VEGF signalling pathway. J. Orthop. Transl. 2020;22:92–100. doi: 10.1016/j.jot.2019.07.007.
- 88.Yoshitomi Y, Ikeda T, Saito-Takatsuji H, Yonekura H. Emerging role of AP-1 transcription factor JunB in angiogenesis and vascular development. Int. J. Mol. Sci. 2021;22:2804.
- 89.González-Hernández S, et al. Sox17 controls emergence and remodeling of nestin-expressing coronary vessels. Circ. Res. 2020;127:e252–e270.
- 90.Dusart P, et al. A systems-approach reveals human nestin is an endothelial-enriched, angiogenesis-independent intermediate filament protein. Sci. Rep. 2018;8:14668. doi: 10.1038/s41598-018-32859-4.
- 91.Mizoguchi F, et al. Functionally distinct disease-associated fibroblast subsets in rheumatoid arthritis. Nat. Commun. 2018;9:789. doi: 10.1038/s41467-018-02892-y.
- 92.PBMC from a healthy donor - granulocytes removed through cell sorting (10k). Single cell multiome ATAC + gene expression dataset by cell ranger ARC 2.0.0. 10x Genomics https://www.10xgenomics.com/datasets/pbmc-from-a-healthy-donor-granulocytes-removed-through-cell-sorting-10-k-1-standard-2-0-0 (2021).
- 93.Sakaue S, et al. Tissue-specific enhancer-gene maps from multimodal single-cell data identify causal disease alleles. Nat. Genet. 2024;56:615–626.
- 94.Gordon S, Plüddemann A. Tissue macrophages: heterogeneity and functions. BMC Biol. 2017;15:53. doi: 10.1186/s12915-017-0392-4.
- 95.Kang JB, et al. Efficient and precise single-cell reference atlas mapping with symphony. Nat. Commun. 2021;12:1–21. doi: 10.1038/s41467-021-25957-x.
- 96.Bonelli M, et al. Phenotypic and functional analysis of CD4+CD25−Foxp3+ T cells in patients with systemic lupus erythematosus. J. Immunol. 2009;182:1689–1695. doi: 10.4049/jimmunol.182.3.1689.
- 97.McCann FE, et al. Apremilast, a novel PDE4 inhibitor, inhibits spontaneous production of tumour necrosis factor-alpha from human rheumatoid synovial cells and ameliorates experimental arthritis. Arthritis Res. Ther. 2010;12:R107. doi: 10.1186/ar3041.
- 98.Bluhm A, et al. ZBTB10 binds the telomeric variant repeat TTGGGG and interacts with TRF2. Nucleic Acids Res. 2019;47:1896–1907. doi: 10.1093/nar/gky1289.
- 99.Reshef YA, et al. Co-varying neighborhood analysis identifies cell populations associated with phenotypes of interest from single-cell transcriptomics. Nat. Biotechnol. 2022;40:355–363. doi: 10.1038/s41587-021-01066-4.
- 100.Song Y, Yuan M, Xu Y, Xu H. Tackling inflammatory bowel diseases: targeting proinflammatory cytokines and lymphocyte homing. Pharmaceuticals (Basel) 2022;15:1080.
- 101.Seth A, Craft J. Spatial and functional heterogeneity of follicular helper T cells in autoimmunity. Curr. Opin. Immunol. 2019;61:1–9. doi: 10.1016/j.coi.2019.06.005.
- 102.Trynka G, et al. Chromatin marks identify critical cell types for fine mapping complex trait variants. Nat. Genet. 2013;45:124–130. doi: 10.1038/ng.2504.
- 103.Nathan A, et al. Single-cell eQTL models reveal dynamic T cell state dependence of disease loci. Nature. 2022;606:120–128. doi: 10.1038/s41586-022-04713-1.
- 104.Sanyal A, Lajoie BR, Jain G, Dekker J. The long-range interaction landscape of gene promoters. Nature. 2012;489:109–113. doi: 10.1038/nature11279.
- 105.Maurano MT, et al. Systematic localization of common disease-associated variation in regulatory DNA. Science. 2012;337:1190–1195. doi: 10.1126/science.1222794.
- 106.Amariuta T, Luo Y, Knevel R, Okada Y, Raychaudhuri S. Advances in genetics toward identifying pathogenic cell states of rheumatoid arthritis. Immunol. Rev. 2020;294:188–204. doi: 10.1111/imr.12827.
- 107.Ishigaki K, et al. Multi-ancestry genome-wide association analyses identify novel genetic mechanisms in rheumatoid arthritis. Nat. Genet. 2022;54:1640–1651. doi: 10.1038/s41588-022-01213-w.
- 108.Binder C, et al. CD2 Immunobiology. Front. Immunol. 2020;11:1090.
- 109.Castro-Mondragon JA, et al. JASPAR 2022: the 9th release of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 2022;50:D165–D173. doi: 10.1093/nar/gkab1113.
- 110.Tamura T, Kurotaki D, Koizumi S. Regulation of myelopoiesis by the transcription factor IRF8. Int. J. Hematol. 2015;101:342–351. doi: 10.1007/s12185-015-1761-9.
- 111.Kurotaki D, et al. Transcription factor IRF8 governs enhancer landscape dynamics in mononuclear phagocyte progenitors. Cell Rep. 2018;22:2628–2641. doi: 10.1016/j.celrep.2018.02.048.
- 112.Wang H, Morse HC. IRF8 regulates myeloid and B lymphoid lineage diversification. Immunol. Res. 2009;43:109–117. doi: 10.1007/s12026-008-8055-8.
- 113.Hwang S-H, et al. Leukocyte-specific protein 1 regulates T cell migration in rheumatoid arthritis. Proc. Natl Acad. Sci. USA. 2015;112:E6535–E6543.
- 114.Solomon MJ, Larsen PL, Varshavsky A. Mapping protein-DNA interactions in vivo with formaldehyde: evidence that histone H4 is retained on a highly transcribed gene. Cell. 1988;53:937–947. doi: 10.1016/S0092-8674(88)90469-2.
- 115.Skene PJ, Henikoff S. An efficient targeted nuclease strategy for high-resolution mapping of DNA binding sites. Elife. 2017;6:e21856.
- 116.Balendran T, Lim K, Hamilton JA, Achuthan AA. Targeting transcription factors for therapeutic benefit in rheumatoid arthritis. Front. Immunol. 2023;14:1196931. doi: 10.3389/fimmu.2023.1196931.
- 117.Bravo González-Blas C, et al. SCENIC+: single-cell multiomic inference of enhancers and gene regulatory networks. Nat. Methods. 2023;20:1355–1367. doi: 10.1038/s41592-023-01938-4.
- 118.Kamimoto K, et al. Dissecting cell identity via network inference and in silico gene perturbation. Nature. 2023;614:742–751. doi: 10.1038/s41586-022-05688-9.
- 119.Farh KK-H, et al. Genetic and epigenetic fine mapping of causal autoimmune disease variants. Nature. 2015;518:337–343. doi: 10.1038/nature13835.
- 120.Orlik C, et al. Keratinocytes costimulate naive human T cells via CD2: a potential target to prevent the development of proinflammatory Th1 cells in the skin. Cell Mol. Immunol. 2020;17:380–394. doi: 10.1038/s41423-019-0261-x.
- 121.Mahajan S, Gollob JA, Ritz J, Frank DA. CD2 stimulation leads to the delayed and prolonged activation of STAT1 in T cells but not NK cells. Exp. Hematol. 2001;29:209–220. doi: 10.1016/S0301-472X(00)00652-4. [DOI] [PubMed] [Google Scholar]
- 122.Ferraro A, et al. Interindividual variation in human T regulatory cells. Proc. Natl Acad. Sci. USA. 2014;111:E1111–E1120. doi: 10.1073/pnas.1401343111. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 123.Koch AE, et al. Enhanced production of monocyte chemoattractant protein-1 in rheumatoid arthritis. J. Clin. Invest. 1992;90:772–779. doi: 10.1172/JCI115950. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 124.Bielecki P, et al. Skin-resident innate lymphoid cells converge on a pathogenic effector state. Nature. 2021;592:128–132. doi: 10.1038/s41586-021-03188-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 125.Zhao S, et al. Effect of JAK inhibition on the induction of proinflammatory HLA-DR+ CD90+ rheumatoid arthritis synovial fibroblasts by interferon‐γ. Arthritis Rheumatol. 2022;74:441–452. doi: 10.1002/art.41958. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 126.Neph S, et al. BEDOPS: high-performance genomic feature operations. Bioinformatics. 2012;28:1919–1920. doi: 10.1093/bioinformatics/bts277. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 127. Zhang Y, et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 2008;9:R137. doi: 10.1186/gb-2008-9-9-r137.
- 128. Martins AL, Walavalkar NM, Anderson WD, Zang C, Guertin MJ. Universal correction of enzymatic sequence bias reveals molecular signatures of protein/DNA interactions. Nucleic Acids Res. 2018. doi: 10.1093/nar/gkx1053.
- 129. Quinlan AR. BEDTools: the swiss-army tool for genome feature analysis. Curr. Protoc. Bioinformatics. 2014;47:11.12.1–11.12.34.
- 130. Lawrence M, et al. Software for computing and annotating genomic ranges. PLoS Comput. Biol. 2013;9:e1003118. doi: 10.1371/journal.pcbi.1003118.
- 131. Stuart T, et al. Comprehensive integration of single-cell data. Cell. 2019;177:1888–1902.e21. doi: 10.1016/j.cell.2019.05.031.
- 132. Wolock SL, Lopez R, Klein AM. Scrublet: computational identification of cell doublets in single-cell transcriptomic data. Cell Syst. 2019;8:281–291.e9. doi: 10.1016/j.cels.2018.11.005.
- 133. Cunningham F, et al. Ensembl 2022. Nucleic Acids Res. 2022;50:D988–D995. doi: 10.1093/nar/gkab1049.
- 134. Bates D, Mächler M, Bolker BM, Walker SC. Fitting linear mixed-effects models using lme4. J. Stat. Softw. 2015;67:1–48. doi: 10.18637/jss.v067.i01.
- 135. Aho AV, Kernighan BW, Weinberger PJ. Awk – a pattern scanning and processing language. Softw. Pract. Exp. 1979;9:267–279. doi: 10.1002/spe.4380090403.
- 136. Li H, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009. doi: 10.1093/bioinformatics/btp352.
- 137. Li H. Tabix: fast retrieval of sequence features from generic TAB-delimited files. Bioinformatics. 2011;27:718–719. doi: 10.1093/bioinformatics/btq671.
- 138. Robinson JT, et al. Integrative genomics viewer. Nat. Biotechnol. 2011;29:24–26. doi: 10.1038/nbt.1754.
- 139. Lee BT, et al. The UCSC genome browser database: 2022 update. Nucleic Acids Res. 2022;50:D1115–D1122. doi: 10.1093/nar/gkab959.
- 140. Sing T, Sander O, Beerenwinkel N, Lengauer T. ROCR: visualizing classifier performance in R. Bioinformatics. 2005;21:3940–3941. doi: 10.1093/bioinformatics/bti623.
- 141. Weinand K, et al. Dataset: the chromatin landscape of pathogenic transcriptional cell states in rheumatoid arthritis. Synapse. 2024. doi: 10.7303/syn53650034.
- 142. Zhang F, et al. Dataset: deconstruction of rheumatoid arthritis synovium defines inflammatory subtypes. Synapse. 2023. doi: 10.7303/syn52297840.
Data Availability Statement
The raw FASTQ files generated in this study have been deposited in the dbGaP database under accession code phs003417.v2.p1. These data are available under restricted access because they are patient-identifiable; access can be requested from dbGaP. The processed data files generated in this study have been deposited in Synapse under accession code syn53650034 (ref. 141). Source data are provided with this paper. Symphony references from ref. 14 are available in Synapse under accession code syn52297840 (ref. 142). Cultured unstimulated FLS multiome datasets from ref. 44 are available in ImmPort under accession ID SDY2213. JASPAR motifs from ref. 109 are available in JASPAR under accession codes MA0517.1, MA0039.4, MA1483.1, and MA0078.1.
The code used to generate the results presented herein can be found on GitHub (https://github.com/immunogenomics/RA_ATAC_multiome/).