Author manuscript; available in PMC: 2018 Mar 30.

Published in final edited form as: Nat Rev Genet. 2016 Oct 24;17(12):758–772. doi: 10.1038/nrg.2016.119

a) Capture Hi-C¹⁰⁵ indicates that the nuclear receptor interacting protein 1 (NRIP1) locus on human chromosome 21 forms a loop with a previously unannotated region nearby. Pacific Biosciences (PacBio) CaptureSeq data could be aligned here (R. Johnson, personal communication), leading to the annotation of lncRNA OTTHUMG00000488671 in GENCODE. b) A long non-coding RNA (lncRNA) transcription start site (TSS) falls within an ENCODE-defined enhancer¹⁰² (red and orange blocks; processed by Ensembl¹³⁴). Three transcription factor binding (TFB) regions (E2F1, E2F4 and E2F6) co-localize based on ENCODE chromatin immunoprecipitation followed by sequencing (ChIP-seq) data¹⁰². In combination, these data suggest an ‘extended gene model’ for NRIP1, which may aid the interpretation of three genome-wide association study (GWAS) signals linked to Crohn’s disease (rs2823286, rs1297265 and rs1736020; shown as asterisks), as previously noted by Mifsud et al.¹⁰⁵ c) NRIP1 contains one transcript in RefSeq and six in GENCODE. The coding sequence (CDS; shown as an open green box) has Swiss-Prot support and a PhyloCSF conservation signal¹³¹. The untranslated regions (UTRs) are shown as filled red boxes. d) Two distinct first exons of NRIP1 are annotated, both supported by 5’ Cap Analysis of Gene Expression (CAGE) data⁴⁵. RNA-seq from Uhlen et al.¹¹⁵ indicates differential expression, with usage of the upstream exon apparently limited to bone marrow (and adipose; not shown). This TSS is dominant in white blood cells, which are bone-marrow-derived. RNA-seq and CAGE support a more general expression profile for the downstream first exon, with evidence of TSS variability.