PLOS Genetics. 2021 Jul 19;17(7):e1009681. doi: 10.1371/journal.pgen.1009681

Highly conserved and cis-acting lncRNAs produced from paralogous regions in the center of HOXA and HOXB clusters in the endoderm lineage

Neta Degani 1, Yoav Lubelsky 1, Rotem Ben-Tov Perry 1, Elena Ainbinder 2, Igor Ulitsky 1,*
Editor: Eric A Miska 3
PMCID: PMC8330917  PMID: 34280202

Abstract

Long noncoding RNAs (lncRNAs) have been shown to play important roles in gene regulatory networks acting in early development. There has been rapid turnover of lncRNA loci during vertebrate evolution, with few human lncRNAs conserved beyond mammals. The sequences of these rare deeply conserved lncRNAs are typically not similar to each other. Here, we characterize HOXA-AS3 and HOXB-AS3, lncRNAs produced from the central regions of the HOXA and HOXB clusters. Sequence-similar orthologs of both lncRNAs are found in multiple vertebrate species and there is evident sequence similarity between their promoters, suggesting that the production of these lncRNAs predates the duplication of the HOX clusters at the root of the vertebrate lineage. This conservation extends to similar expression patterns of the two lncRNAs, in particular in cells transiently arising during early development or in the adult colon. Functionally, the RNA products of HOXA-AS3 and HOXB-AS3 regulate the expression of their overlapping HOX5–7 genes both in HT-29 cells and during differentiation of human embryonic stem cells. Beyond production of paralogous protein-coding and microRNA genes, the regulatory program in the HOX clusters therefore also relies on paralogous lncRNAs acting in restricted spatial and temporal windows of embryonic development and cell differentiation.

Author summary

Each of the four Hox clusters in vertebrate genomes encodes up to 11 transcription factors whose activity is extensively regulated spatially and temporally, and which help determine the developmental and adult transcriptome in space and time. These Hox transcription factors belong to 13 homology groups, and Hox clusters also encode various noncoding transcripts, including microRNAs and long noncoding RNAs (lncRNAs). We characterize in detail two lncRNAs, HOXA-AS3 and HOXB-AS3, which are transcribed from matching regions in the HOXA and HOXB clusters, respectively. These lncRNAs are highly conserved in vertebrate evolution and transcribed antisense to Hox protein-coding genes from groups 5–7. Beyond the matching positions, the promoters of HOXA-AS3 and HOXB-AS3 share sequence similarity, their expression patterns are correlated with each other, mostly in the endoderm lineage, and they positively regulate the expression of the Hox protein-coding genes that they overlap. Regulation by lncRNAs thus appears to be an ancestral feature of HOX clusters, likely pre-dating the duplication of the Hox clusters at the root of the vertebrate lineage.

Introduction

Over the past decade, genome-wide transcriptome analyses have revealed a plethora of noncoding RNAs that are expressed from a large number of genomic loci. Among these noncoding genes are long noncoding RNAs (lncRNAs), RNA polymerase II (Pol II) products that are longer than 200 nt. Like mRNAs, lncRNAs begin with a 5’ cap and end with a poly(A) tail. To date, thousands of lncRNAs have been reported in different vertebrates [1,2], but it is not yet known how many of them are functional or what the full extent of their biological diversity is. Many lncRNAs display highly restricted expression profiles during development, potentially allowing them to control gene expression in specific cellular contexts [2,3]. Some lncRNAs have indeed been shown to contribute to proper embryonic development [4].

Mouse and human Hox genes are organized in four genomic clusters (HOXA to HOXD) that exhibit a unique mode of transcriptional regulation–temporal and spatial collinearity–the position of the genes along the chromosome roughly corresponds to the time and place of their expression during development. The sequential activation of Hox genes in the primitive streak helps determine the subsequent pattern of expression along the anterior–posterior axis of the embryo [5,6]. Despite the crucial importance of Hox genes during development [7], the molecular pathways that dictate their collinear expression are not fully understood.

Noncoding RNAs are likely to play important roles in Hox gene regulation. For example, Hox clusters encode two conserved miRNAs, miR-10 and miR-196, that target some of the Hox genes and help establish specific regulatory programs in the embryo [8,9]. One of the first lncRNAs to be studied in detail, HOTAIR, is produced from the HOXC cluster and was reported to regulate expression of HOXD genes [10]. Since this seminal discovery, numerous lncRNAs have been implicated in Hox gene regulation [11]. For example, HOTTIP, a lncRNA located at the 5’ end of the HOXA cluster, was shown to control activation of 5’ HOXA genes in cis via cooperation with an MLL histone methyltransferase complex and chromosomal looping that brings it into close proximity with 5’ HOXA gene loci [12].

The protein-coding genes in the four vertebrate Hox clusters belong to 13 groups of orthologs that can be traced to ancestral clusters that existed before the two rounds of genome duplication [13]. The two conserved microRNA families encoded in the Hox clusters, miR-10 and miR-196, are represented in multiple clusters [14]. lncRNAs have been described in each of the four clusters, but so far there have been no known cases of clear similarity between lncRNAs across clusters. Here, we focus on a pair of lncRNAs that appear to be some of the most conserved lncRNAs produced from the vertebrate Hox clusters–HOXA-AS3 and HOXB-AS3. We provide evidence that the production of these lncRNAs likely predates the duplication of the ancestral Hox cluster into HOXA and HOXB. Both lncRNAs are expressed predominantly in the embryo, with expression patterns more similar to each other than to nearby protein-coding genes. In the adult, HOXA-AS3 expression is mostly restricted to tissues of endodermal lineage, and specifically to immature goblet cells and tuft cells. The similar expression of HOXA-AS3 and HOXB-AS3 is likely driven by conserved and shared binding sites for CDX transcription factors in the HOXA-AS3 and HOXB-AS3 promoters. Using human cell lines and human embryonic stem cells, we show that perturbation of HOXA-AS3 and HOXB-AS3 expression results in corresponding changes in expression of HOX-6 and HOX-7 genes. These results suggest coordinated and ancient lncRNA production from central regions of the Hox clusters that plays important cis-acting gene regulatory roles in cells of the endodermal lineage.

Results

A pair of conserved lncRNAs in the middle of HOXA and HOXB clusters

The central regions of HOX clusters give rise to a large variety of transcription products that undergo extensive alternative splicing (S1A Fig). We first focused on HOXA-AS3, the main transcription start site of which lies ~700 nt downstream of the annotated 3’ end of HOXA5 and which is transcribed antisense to HOXA5 and HOXA6, terminating in the single intron of HOXA7 (Figs 1A and S1A). The region in the mouse genome that aligns to the HOXA-AS3 promoter is the promoter of Hoxaas3 (2700086A05Rik), which terminates in the intergenic region between Hoxa6 and Hoxa7 (Fig 1A). The promoter of HOXA-AS3 is highly conserved in other vertebrates, but transcripts originating from it are not consistently annotated, likely due to its very restricted expression in adult tissues, as it is expressed predominantly in the embryo (see below). Using available RNA-seq data we could identify orthologs for HOXA-AS3 in opossum and X. tropicalis (Figs 1A and S1). Transcription of these orthologs, similarly to that of the human HOXA-AS3, started ~700 nt downstream of the 3’ end of HOXA5 and ended in the intron of HOXA7. HOXA-AS3 exhibited significant sequence similarity with the orthologs from mouse, opossum, and X. tropicalis (BLAST E-value < 10^-40). Notably, homology with the X. tropicalis ortholog was restricted to the region overlapping HOXA7.

Fig 1. Orthologs of HOXA-AS3 and HOXB-AS3 in different vertebrate species.

Transcript models annotated by Ensembl, Refseq, or PLAR [61], or manually reconstructed based on RNA-seq data (see S1 Fig) for HOXA-AS3 (A) and HOXB-AS3 (B) are shown alongside the annotated protein-coding genes in the locus. The lncRNAs are transcribed from the ‘+’ strand and all other genes are transcribed from the ‘-’ strand. The regions of HOXA-AS3 and HOXB-AS3 are shaded.

HOXB-AS3 transcription in human starts ~900 nt downstream of the 3’ end of HOXB5 and terminates in the intergenic region between HOXB6 and HOXB7 (Figs 1B and S1A). Presumably because of its broader expression compared to HOXA-AS3, orthologs of HOXB-AS3 were readily identifiable in more species. In mouse, it is annotated as Hoxb5os (0610040B09Rik), and we could identify orthologs in opossum, X. tropicalis, coelacanth, spotted gar, medaka, and elephant shark (Figs 1B and S1). HOXB-AS3 exhibited significant sequence similarity with the orthologs from mouse and opossum (BLAST E-value < 10^-40), but not with more distant species. Comparison of the sequences with LncLOOM [15] identified four motifs conserved in mammals and in X. tropicalis but no deeper conservation was detected (S1 Dataset). Both HOXA-AS3 and HOXB-AS3 show negative PhyloCSF [16] scores throughout the locus (S2 Fig), and so it is unlikely that they encode highly conserved proteins. Notably, a primate-specific protein has been recently found to be encoded by HOXB-AS3 [17] (see Discussion).

The corresponding positions of the two lncRNAs and the high conservation of their presence in other species made us scrutinize and compare the sequences of their promoters. BLAST comparison of the corresponding promoters from HOXA and HOXB clusters found significant homology in representative vertebrate species all the way to the cartilaginous fish elephant shark (E-value 6e-31 in human, 5e-32 in mouse, 1e-21 in Xenopus, 73–80% base identity). Mapping the transcription start sites of HOXA-AS3 and HOXB-AS3 transcripts based on RNA-seq data (where available) suggested that the precise position of transcription initiation varies between the clusters and to a lesser extent between the species (S3A Fig). Among the highly conserved sequences preserved in both classes, we note a pair of tandem binding sites for the CDX1/2 proteins (CCATAAA and CCATTAAA [18]) that appear once on the sense and once on the antisense strand. When considering all the human promoters annotated in FANTOM5.5 [19], the HOXA-AS3 promoter contained 11 predicted CDX binding sites, more than in 99.95% of human promoters annotated in FANTOM5.5 (only 83 of the 200K promoters had 11 predicted sites or more). The HOXB-AS3 promoter had two predicted sites, a number comparable to that of several other Hox genes (S3B Fig).
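The motif scan described above can be illustrated with a minimal sketch. It assumes exact string matching of the two quoted CDX motifs on both strands, which is likely simpler than the prediction method used for the FANTOM5.5 promoters; the function names and the toy sequence are hypothetical and are not taken from the study.

```python
# Minimal sketch: exact-match counting of the two CDX motifs quoted in the text.
# Assumption: exact matching stands in for whatever site-prediction method was
# actually used; the toy sequence below is invented for illustration.
CDX_MOTIFS = ("CCATAAA", "CCATTAAA")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_overlapping(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq, allowing overlaps."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def count_cdx_sites(seq: str, motifs=CDX_MOTIFS) -> dict:
    """Count motif matches on the given strand and on its reverse complement."""
    sense = seq.upper()
    antisense = reverse_complement(sense)
    return {
        "sense": sum(count_overlapping(sense, m) for m in motifs),
        "antisense": sum(count_overlapping(antisense, m) for m in motifs),
    }


def fraction_below(n_at_or_above: int, n_total: int) -> float:
    """Fraction of promoters with fewer sites than the promoter of interest."""
    return 1.0 - n_at_or_above / n_total


# Toy promoter: one CCATAAA on the sense strand, one CCATTAAA on the antisense.
toy_promoter = "GGCCATAAAGTTTTAATGGCC"
site_counts = count_cdx_sites(toy_promoter)
# Rank of a promoter among all annotated promoters (83 of ~200,000 in the text).
rank_fraction = fraction_below(83, 200_000)
```

Counting matches on the reverse complement is what allows a site to be reported "once on the sense and once on the antisense strand"; the last line reproduces the arithmetic behind the quoted promoter ranking.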

HOXA-AS3 and HOXB-AS3 are co-expressed in embryonic and adult tissues

In order to obtain a comprehensive picture of where the two lncRNAs are expressed in both fetal and adult cell types, we relied on data from the FANTOM5.5 project [19], which provides strand-specific data across hundreds of cell types. Both lncRNAs were expressed in a highly specific manner and with patterns largely distinct from those of the overlapping HOX-5, HOX-6, HOX-7 genes (Figs 2A and S4B). In particular, expression of HOXA-AS3 in human was more tissue-specific than that of the genes it overlapped and more closely resembled that of HOXB-AS3 than that of any of the genes whose transcription units it overlapped (Fig 2A). The correlation between HOXA-AS3 and HOXB-AS3 was comparable to the correlations between Hox protein-coding genes with a posterior expression domain (Fig 2A), and larger than that found typically between Hox genes in different clusters (S4A Fig). In mouse, Hoxaas3 and Hoxb5os were also more tissue-specific than the Hox genes they overlapped, but the correlation between them was weaker (S4B Fig). Notably, correlations in mouse were more difficult to assess, as Hoxaas3 was expressed with TPM≥1 in only 17 samples (for comparison, HOXA-AS3 was expressed with TPM≥1 in 65 human samples). The samples in which HOXA-AS3 and HOXB-AS3 were expressed (S1 Table) were mostly embryonic or derived from embryonic stem cells. In human, both lncRNAs were co-expressed during late-stage differentiation of embryonic stem cells to embryoid bodies. In mouse, both Hoxaas3 and Hoxb5os were co-expressed in the E8.5 mesoderm (consistent with the single-cell RNA-seq data, Fig 2B) and in the neonate intestine.

Fig 2. Expression of HOXA-AS3 and HOXB-AS3 in adult and embryonic tissues.

(A) Left: Correlation coefficients between log-transformed FANTOM5.5 expression levels [19] in hundreds of samples for the indicated genes. Right: Number of samples in which each gene is expressed within the indicated TPM ranges. (B) Expression levels of the indicated genes in clusters of single cells during gastrulation, data from [25].
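The two quantities summarized in panel A, correlation of log-transformed expression and the number of samples above a TPM threshold, can be sketched as follows. This is only an illustration: the choice of Pearson correlation, the log2 transform with a pseudocount of 1, and the toy TPM values are assumptions, not details reported in the text.

```python
import math

# Minimal sketch of the two summaries in Fig 2A.
# Assumptions: Pearson correlation and log2(TPM + 1); the text only states
# that expression levels were log-transformed. All values are invented.


def log_transform(tpm, pseudocount=1.0):
    """log2-transform TPM values, adding a pseudocount so zeros are defined."""
    return [math.log2(v + pseudocount) for v in tpm]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def samples_at_or_above(tpm, threshold=1.0):
    """Number of samples in which the gene reaches the TPM threshold."""
    return sum(v >= threshold for v in tpm)


# Hypothetical TPM values for two lncRNAs across six samples.
lnc_a = [0.0, 0.2, 5.0, 12.0, 0.0, 3.0]
lnc_b = [0.1, 0.0, 8.0, 20.0, 0.3, 2.0]

r = pearson(log_transform(lnc_a), log_transform(lnc_b))
n_expressed_a = samples_at_or_above(lnc_a)
```

The threshold count mirrors the TPM≥1 comparison made in the text (17 mouse samples versus 65 human samples), which is why a weak correlation in a gene expressed in few samples is harder to interpret.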

In order to assess the spatial expression patterns of Hoxaas3 and Hoxb5os and other Hox genes, we reanalyzed Geo-seq data from mouse E7.5 embryos [20] (S5 Fig). Both lncRNAs exhibited specific and overlapping expression domains in the region corresponding to primitive streak or ‘late mesoderm’ (sections 9–10, posterior region), consistent with the scRNA-seq data. Notably, it has been suggested that some of the cells in this region are endoderm cells that egress through the mesoderm late in gastrulation [20–23]. The expression domain of Hoxaas3 and Hoxb5os was more specific than that of the overlapping Hox genes and, interestingly, overlapped with the expression of Cdx1 and, to a lesser extent, Cdx2.

We next examined Hoxaas3 expression in adult mouse tissues in the Tabula Muris scRNA-seq dataset (Hoxb5os is not annotated in this dataset). The only cell types with appreciable expression were goblet and epithelial cells from the large intestine (S4C Fig), consistent with our more detailed analysis (see below). HOXA-AS3 and HOXB-AS3 thus exhibit very high tissue specificity in adult tissues, similarly to other lncRNAs [24].

To obtain single-cell resolution on the expression of the two lncRNAs during early embryonic development, we used the large-scale single-cell dataset recently published by the Sanger Institute [25], which profiled mouse embryos at E6.5–E8.5 (Fig 2B). At these stages, Hoxaas3 and Hoxb5os were generally more highly expressed than the protein-coding genes overlapping their gene bodies. As in the FANTOM data, Hoxaas3 expression was most similar to the expression of Hoxb5os, and the two lncRNAs were highly expressed in neuro-mesodermal progenitors (NMP in Fig 2B), various mesodermal populations, caudal epiblast, and gut cells.

HOXA-AS3 and HOXB-AS3 regulate their adjacent Hox genes in HT-29 cells

Inspection of ENCODE data suggested that HOXA-AS3 is not well expressed in commonly used human cell lines, consistent with its overall low expression in adult tissues. HOXB-AS3 is somewhat more broadly expressed, as it is also expressed in Ag04450, IMR-90, and NHLF cells. Surprisingly, there was no substantial expression of HOXA-AS3 or HOXB-AS3 in the A549 cell line and limited expression in HUVEC, where they have been previously studied [26,27] (S6 Fig and Discussion). In contrast, ENCODE RNA-seq data showed that HOXA-AS3 and HOXB-AS3 are well expressed in HT-29 (S6 Fig)–a human colon adenocarcinoma cell line that under certain growth conditions exhibits characteristics of mature intestinal cells, such as enterocytes or mucus-producing cells, which have brush borders and express Villin and additional intestinal microvilli proteins [28,29].

In order to perturb the expression of HOXA-AS3 and HOXB-AS3, we first used CRISPR interference (CRISPRi) [30]–a catalytically inactive version of Cas9 (dCas9), optionally fused to a KRAB domain (dCas9-KRAB), together with guide RNAs (gRNAs) directed to a region downstream of the TSS of the target [31]. We transfected HT-29 cells with pools of three gRNAs targeting the HOXA-AS3 and HOXB-AS3 promoters and dCas9-KRAB vectors, which reduced lncRNA levels by 50%–80% compared to cells transfected with the dCas9-KRAB vector and an empty gRNA plasmid (Fig 3A and 3B). When HOXA-AS3 levels were reduced, HOXA5, HOXA6 and HOXA7 RNA levels were also down-regulated, by 30–40% (Fig 3A). Similarly, HOXB-AS3 knockdown (KD) was followed by a down-regulation of HOXB5 and HOXB6 by 50–70% (Fig 3B).
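Percent reductions such as those above are typically derived from qRT-PCR cycle thresholds normalized to a reference gene. The sketch below uses the common 2^(-ΔΔCt) calculation as an assumption; the text states only that values were normalized to actin, and the Ct values and function names here are hypothetical.

```python
# Minimal sketch of relative quantification from qRT-PCR Ct values.
# Assumptions: the 2^(-ddCt) method with ~100% primer efficiency; the text only
# says that measurements were normalized to actin. Ct values are invented.


def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of the target gene in treated vs. control cells."""
    delta_treated = ct_target_treated - ct_ref_treated    # normalize to reference
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)        # 2^(-ddCt)


def percent_reduction(fold_change):
    """Express a fold change below 1 as a percent reduction."""
    return (1.0 - fold_change) * 100.0


# Hypothetical example: the target crosses threshold one cycle later after
# knockdown, while the reference gene (actin) is unchanged.
fold = relative_expression(26.0, 18.0, 25.0, 18.0)
reduction = percent_reduction(fold)
```

Under these assumptions a one-cycle delay corresponds to a halving of transcript levels, so a 50% KD of the lncRNA would appear as a ΔΔCt of about 1.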

Fig 3. CRISPR inhibition and activation of HOXA-AS3 and HOXB-AS3 in HT-29 cells.

(A) Changes in expression of the indicated genes are shown following inhibition of HOXA-AS3. n = 4. (B) As in A, following inhibition of HOXB-AS3. n = 4. (C) As in A, following activation of HOXA-AS3. n = 4. (D) As in A, following activation of HOXB-AS3. Normalized to actin. *: P<0.05, **: P<0.005, ***: P<0.0005. Two-sided t-test compared to the transfection control.

Next, we tested the effect of over-expression (OE) of the lncRNAs using CRISPR activation (CRISPRa)–dCas9 fused to VP64 transcriptional activation domain and directed by the sgRNAs to a region upstream to the HOXA-AS3 and HOXB-AS3 TSSs. In this system, the VP64 domain recruits the transcription machinery to activate expression of the lncRNA of interest [31]. OE in HT-29 cells resulted in effects opposite to those observed following lncRNA KD, as it increased expression of the adjacent genes, significantly for HOXA-AS3 (Fig 3C and 3D).

These results suggest that HOXA-AS3 and HOXB-AS3 production or their RNA products have a positive regulatory effect on the expression of the neighboring HOX5–7 genes.

HOXA-AS3 and HOXB-AS3 RNA products are required for their cis-regulatory activity

In order to differentiate between the potential effects on chromatin caused by the use of the KRAB effectors and the transcription or the RNA products of HOXA-AS3 and HOXB-AS3, we used RNAi to target the RNA products of HOXA-AS3 and HOXB-AS3. First we transfected siRNA pools targeting HOXA-AS3 or HOXB-AS3 into HT-29 cells. This resulted in a substantial reduction in RNA levels for both HOXA-AS3 and HOXB-AS3 and a concomitant reduction in the expression of neighboring genes that was similar to the effects observed with CRISPRi (Fig 4A and 4B). When HOXA-AS3 was reduced by 60%, HOXA5/6/7 were significantly downregulated by 20–45% (Fig 4A). Similarly, when HOXB-AS3 was reduced by ~40%, there was a significant downregulation of HOXB5 and HOXB6 (Fig 4B). As an alternative approach, a stably expressed shRNA targeting HOXA-AS3 introduced via a lentiviral infection led to a stronger effect with the same trend as that observed using CRISPRi and siRNA, where KD of the lncRNA was accompanied by a decrease of expression of the neighboring genes (Fig 4C). HOXA-AS3 and HOXB-AS3 RNA products are therefore important for regulation of the adjacent genes.

Fig 4. RNA products of HOXA-AS3 and HOXB-AS3 are required for regulation of their adjacent Hox genes.

(A-C) qRT-PCR measurements of the indicated genes in HT-29 cells treated with the indicated reagents. Normalized to Actin. n = 4 for siHOXA-AS3 and siHOXB-AS3. n = 3 for shHOXA-AS3. *—P<0.05, **—P<0.005, ***—P<0.0005. Two-sided t-test compared to the transfection control. (D) Changes in gene expression in RNA-seq data of HT-29 cells treated with HOXA-AS3 or HOXB-AS3 siRNAs. Shown are HOXA-AS3, HOXB-AS3, and all other HOX genes with average FPKM≥1. Asterisks indicate adjusted P<0.05 as computed by DESeq2.

In order to characterize more broadly the consequences of down-regulation of HOXA-AS3 and HOXB-AS3, we used RNA-seq to profile transcriptome-wide gene expression in HT-29 cells treated with siRNAs targeting these lncRNAs or with a non-targeting control. RNAi resulted in reduction in expression of the lncRNAs, concomitantly with reduction in the overlapping genes, and a broad mild reduction in expression of genes in the HOXA and HOXB clusters (HOXC and HOXD clusters are mostly silent in HT-29 cells) (Fig 4D and S2 Table), with more significant effects observed in the HOXB cluster, which is overall more highly expressed than HOXA in HT-29 cells (S7A Fig). In the case of HOXB-AS3 it was apparent that the KD had a strong effect on the levels of the overlapping HOXB5–7 genes relative to the other HOX genes. The repressive effect of KD of HOXA-AS3 on HOXB genes, and of HOXB-AS3 KD on HOXA genes, was validated by qRT-PCR following siRNA KD or CRISPRi of these genes (S7A and S7B Fig). These results suggest that loss of HOXA-AS3 and HOXB-AS3 has broad effects on expression of genes from the HOXA and HOXB clusters.

Beyond the effect on the expression of HOX genes, HOXB-AS3 KD had a larger effect on gene expression than HOXA-AS3 KD (S7C Fig), consistent with its higher expression levels in HT-29 cells. Analysis of the gene expression changes using GOrilla [32] (S2 Table) showed that HOXA-AS3 KD was associated with a significant reduction in genes related to cell cycle and proliferation (top down-regulated GO category “mitotic cell cycle process”, adjusted P = 1.52×10^-6), consistent with its positive effect on proliferation reported in other cell lines [27,33,34] (see Discussion). HOXB-AS3 KD led to a significant up-regulation of genes whose protein products are involved in ncRNA processing, and specifically in rRNA processing (adjusted P = 5.92×10^-5), potentially related to its reported functions in rRNA biogenesis observed in leukemia cells [35]. The changes in gene expression outside of the HOX clusters following HOXA-AS3 or HOXB-AS3 KD could be downstream consequences of the changes in HOX gene expression or could reflect additional trans-acting functions of these lncRNAs (see Discussion).

HOXA-AS3 is localized in both the nucleus and cytoplasm of HT-29 cells

We next focused on HOXA-AS3 and characterized its precise expression pattern at higher resolution, as it is more narrowly expressed than HOXB-AS3 and also has a longer exonic sequence, which permits the use of the Stellaris smFISH protocol with 96 exonic probes for the human HOXA-AS3 and 94 for the mouse Hoxaas3 (S3 Table), whereas only 24 probes were possible for HOXB-AS3. We first analyzed the subcellular localization of HOXA-AS3 and HOXA5 in HT-29 cells (Fig 5A). We observed variable expression of both genes among cells: in some cells we could detect only one of the transcripts, while others expressed both genes. The HOXA-AS3 transcript was detectable in just ~15% of the >100 imaged cells, in up to 3 foci per cell and with localization mainly in the nucleus, though it could also be detected in the cytoplasm. Interestingly, in some of the cells that express both HOXA-AS3 and HOXA5 we detected a rare yet highly specific co-localization in the perinuclear area (Fig 5C). As expected from their genomic co-location, HOXA-AS3 and HOXA5 are co-localized in what is likely their site of transcription in the nucleus (Fig 5B).

Fig 5. Single-molecule FISH detection of HOXA-AS3 and HOXA5 in HT-29 cells.

(A) HOXA-AS3 (red) and HOXA5 (green) transcripts in a sample of HT-29 cells. Scale bar: 10 μm. (B) HOXA-AS3 and HOXA5 are co-localized at their presumed site of transcription. (C) HOXA-AS3 and HOXA5 are occasionally co-localized in the perinuclear area (white arrow). (D) HOXA-AS3 and HOXA5 are occasionally expressed separately.

HOXA-AS3 is expressed in a specific subset of colon epithelial cells

As HT-29 cells contain a mixture of cellular states from the colon epithelium [28,29], HOXA-AS3 expression in a small subset of cells may imply that it is only found in a defined subpopulation of cells. We therefore analyzed the expression pattern of HOXA-AS3 and Hoxaas3 in normal intestinal epithelial cells, using single-cell RNA sequencing (scRNA-seq) data.

In scRNA-seq data from the human colon, HOXA-AS3 was expressed predominantly in epithelial cells, and within those it was detected specifically in tuft cells and in immature goblet cells, which are deep crypt goblet cells that are part of the stem cell niche [36] (Fig 6A). Similarly, in the mouse small intestine [37] Hoxaas3 is mainly expressed in tuft cells, at expression levels comparable to those of the tuft marker Dclk1 (Fig 6B). In contrast, in the mouse colon scRNA-seq data Hoxaas3 is mainly detected in goblet cells (Fig 6C). In order to examine expression in intact tissue, we performed smFISH for Hoxaas3 in the jejunum of the mouse small intestine, which contains a relatively high fraction of goblet cells, and compared it to smFISH of the goblet cell marker Gob5, the tuft cell marker Dclk1, and Atoh1, which marks intestinal secretory precursor cells, including immature goblet and tuft cells. Based on the marker expression and the positions of the cells, we conclude that Hoxaas3 is expressed in early immature goblet cells and in secretory precursor cells (Fig 6D). Hoxaas3 and Hoxa5 were occasionally co-localized, similar to the observations in HT-29 cells (Fig 6D).

Fig 6. Expression of HOXA-AS3 in the human and mouse gut.

(A) Expression of HOXA-AS3 in single cells of the human colon (data from [62]). (B-C) Expression of the indicated genes in scRNA-seq from the mouse small intestine (B) and colon (C). Data from [37]. (D) smFISH of Hoxaas3, Hoxa5, Gob5 and Atoh1 expression in the mouse intestine. Scale bar: 10 μm. Arrows indicate a subset of RNA molecules detected in the images.

scRNA-seq and smFISH data from both human and mouse samples thus support the notion that HOXA-AS3 is expressed in a specific subpopulation of cells, which may explain the apparently variable expression pattern that we observed in HT-29 cells.

HOXA-AS3 and HOXB-AS3 are induced during early differentiation of human embryonic stem cells towards endoderm

As both HOXA-AS3 and HOXB-AS3 were more highly expressed in embryonic stages than in adult tissues, we next wanted to evaluate the expression and activities of HOXA-AS3 and HOXB-AS3 during early developmental transitions. Endoderm is one of the three primary germ layers, and endoderm patterning is controlled by a series of reciprocal interactions with nearby mesoderm tissues. As development proceeds, broad gene expression patterns within the foregut, midgut, and hindgut become progressively refined into precise domains from which specific organs will arise. Human embryonic stem cells can be differentiated towards endodermal cell lineages in a robust manner, resulting, within seven days, in three different populations–anterior foregut (AFG), posterior foregut (PFG) and midgut/hindgut (MHG)–using a protocol established by Loh et al. [38] (S8A–S8C Fig). During this differentiation process, graded, spatially collinear Hox gene expression is observed after in vitro patterning, whereby PFG cells express 3’ anterior Hox genes (e.g. HOXA1) and MHG cells express 5’ posterior Hox genes (including HOXA10) [38] (S8A Fig).

Pluripotent hESCs and cells from each stage of the differentiation were validated by multiple markers (S4 Table) using qRT-PCR (S8D Fig) and by immunostaining (S8E Fig), matching the expression patterns observed in the RNA-seq data from [38] (Fig 7A). HOXA-AS3 and HOXB-AS3 were strongly induced and expressed only in the MHG population, alongside their adjacent HOX-6 and HOX-7 genes, whereas HOXA5 and HOXB5 were also expressed in PFG cells (Fig 7A).

Fig 7. Function of HOXA-AS3 and HOXB-AS3 during endodermal differentiation of hESCs.

(A) Read coverage in RNA-seq data from [38] for shown parts of the HOXA (top) and HOXB (bottom) clusters. In each cluster all the tracks are normalized together. (B-C) Expression levels estimated by qRT-PCR for the indicated genes in hESCs following CRISPRa-mediated 48h activation of HOXA-AS3 (n = 7/4) (B) and HOXB-AS3 (n = 3) (C). (D-E) Expression levels estimated by qRT-PCR in MHG cells following CRISPRi-mediated repression of HOXA-AS3 (n = 3) (D) and HOXB-AS3 (n = 3) (E). (F) Changes in expression of the indicated genes following infection of hESCs with two separate HOXA-AS3 shRNAs, followed by differentiation to MHG. n = 6. *: P<0.05; **: P<0.005; ***: P<0.0005. Two-sided t-test. Error bars: SEM.

HOXA-AS3 and HOXB-AS3 regulate expression of their adjacent Hox genes during hESC differentiation

To study the functions of HOXA-AS3 and HOXB-AS3 during early steps of stem cell differentiation, we established dCas9-expressing H9 hESCs, using viral infection of Tet-dependent inducible versions of dCas9 and dCas9-VP64. We preferred to avoid the use of dCas9-KRAB in this system, as we were able to obtain efficient KD using dCas9 alone, which does not by itself directly affect chromatin modifications. We then established derivatives of these stable lines expressing specific gRNAs targeting the promoters of HOXA-AS3 or HOXB-AS3.

After 48 hr of doxycycline (Dox) addition to the dCas9-VP64 expressing lines, we observed an up-regulation of HOXA-AS3 and HOXB-AS3 lncRNAs in their respective lines (Fig 7B and 7C). Furthermore, we observed up-regulation of the genes adjacent to these lncRNAs, even though none of these genes is normally expressed in hESCs, and the chromatin of the HOXA cluster in hESCs is in an inactive conformation [39–42]. Activation of HOXA-AS3 in hESCs resulted in increased expression of HOXA5–7 (Fig 7B). Similarly, HOXB-AS3 activation led to an activation of HOXB5 and HOXB6 (Fig 7C). Next, we tested for changes in the pluripotency and differentiation markers in the CRISPRa lines. Although there was no remarkable change in the Oct4 pluripotency marker, we observed an increase in endodermal markers: HOXA-AS3 or HOXB-AS3 overexpression (OE) led to an upregulation of Sox17, a definitive endoderm marker known to be required for normal development of the definitive gut endoderm [43], and of Cdx2, a marker of later stages of endodermal differentiation that is expressed mainly in the MHG cells (Fig 7C).

Next, we wanted to examine the effect of reducing the levels of HOXA-AS3 and HOXB-AS3 during endodermal differentiation at the time points at which they are endogenously induced, namely during the third stage of differentiation, as the cells are transitioning from DE to MHG (Fig 7A). For both HOXA-AS3 and HOXB-AS3 we obtained a ~50% KD using the Dox-inducible dCas9, and targeting HOXA-AS3 resulted in downregulation of HOXA5/6/7 (Fig 7D), with a relatively smaller effect on HOXA7, which is generally expressed at low levels in MHG cells. KD of HOXB-AS3 led to downregulation of HOXB5 (Fig 7E). In both cases we observed no major changes in expression of markers for pluripotency (Oct4), endoderm (Sox17) and mid/hindgut (Cdx2) (Fig 7D and 7E).

In order to study the role of the HOXA-AS3 RNA product during hESC differentiation, we used the two shRNA constructs described above to generate hESC lines where HOXA-AS3 is stably targeted by RNAi. In this system, a stable reduction of HOXA-AS3 expression had a similar effect on HOXA5–7 (Fig 7F). There was also a significant reduction in levels of Oct4, although its expression is low at the MHG stages, and so the physiological significance of this reduction is unclear. Knockdown of HOXA-AS3 and HOXB-AS3 also led to a reduction in the expression of genes from the other cluster, similar to the observations in HT-29 cells (S8F Fig).

Discussion

We found here that HOXA-AS3 and HOXB-AS3 are ultraconserved lncRNAs that show high conservation of promoter sequence, genomic configuration and regulation, which underpins their similar expression patterns in specific biological processes and their related functions in regulating expression of their proximal genes. The expression of HOXA-AS3 and HOXB-AS3 during embryonic development is particularly high in intestine-specifying lineages, such as the MHG cell population that emerges during endodermal differentiation of hESCs, the primitive streak around E7.5, and hindstomach and small intestine epithelial cells at E12 during mouse development (S9 Fig). Moreover, we observe co-expression of HOXA-AS3 and HOXB-AS3 in the adult intestine and colon in human and in mouse, specifically in cells that transition from the stem cell niche to fully specified intestinal cells in the crypt, presumably utilizing some of the same regulatory programs that are used during early development. smFISH in mouse intestine showed specific enrichment of Hoxaas3 expression in early immature goblet cells and in secretory precursor cells, placing its expression at mid-differentiation, the phase in which cells commit to and acquire their specific fate. This timing agrees with its induction during hESC differentiation and is potentially related to its expression in only a small subset of HT-29 cells in culture.

There have been several recent reports about the functions of HOXA-AS3 and HOXB-AS3 in other systems. HOXA-AS3 was reported to be induced during adipogenic induction of human mesenchymal stem cells (MSCs), and its silencing promoted proliferation of MSCs and inhibited osteogenesis in vitro and in vivo, in both human and mouse cells [44]. Positive effects of HOXA-AS3 on proliferation and migration in vitro and during tumorigenesis in vivo were also observed in glioma cells [33]. Another study found that HOXA-AS3 promoted proliferation, migration and invasion in the A549 lung carcinoma cell line and tumor growth in vivo [27], where it was found in both the nucleus and the cytoplasm, consistent with our data in HT-29 cells. In that study HOXA-AS3 was suggested to positively regulate HOXA6, as siRNA-mediated KD of HOXA-AS3 reduced levels of HOXA6 mRNA and protein (but not those of HOXA5) in A549 cells. A more recent publication extended the positive effects of HOXA-AS3 on proliferation to additional non-small-cell lung carcinoma cell lines [34]. These studies are overall consistent with our observations that in normal tissues HOXA-AS3 is preferentially expressed in proliferating progenitors, and with the reduction in a proliferation signature in the RNA-seq data of HOXA-AS3 KD cells. Lin et al. [34] found HOXA-AS3 to have a negative effect on HOXA3 expression, by binding to both HOXA3 mRNA and HOXA3 protein. Lastly, HOXA-AS3 was recently proposed to regulate NF-κB signalling in HUVECs [26]. Notably, in ENCODE RNA-seq data HOXA-AS3 is undetectable in both A549 cells and HUVECs (which do express HOXA6 and HOXA7), whereas it is well-expressed in the HT-29 cells we used in this study (S6 Fig).

HOXB-AS3 was reported to be down-regulated in colorectal cancers and to produce a 53 aa protein conserved in primates [17]. Notably, PhyloCSF scores throughout HOXB-AS3 are negative (S2 Fig), so it is very unlikely that it encodes a conserved protein. In colorectal cancer cells HOXB-AS3 was shown to inhibit cell proliferation [17]. In NPM1-mutated acute myeloid leukemia cells, HOXB-AS3 does not associate with polysomes and promotes cell proliferation in both human and mouse leukemia cells [35,45]. Interestingly, in this system, KD of HOXB-AS3 using antisense oligonucleotides did not affect expression of other Hox genes, but rather regulated expression of ribosomal RNA, in trans, via interaction with EBP1 [35], consistent with our observation of changes in rRNA processing genes upon HOXB-AS3 KD in HT-29 cells. There is therefore evidence of trans-acting activities of HOXB-AS3. We note that our findings about cis-acting regulation of HOXB6 and HOXB7 by HOXB-AS3 do not exclude these additional functions, and in fact it is likely that lncRNAs that are robustly expressed and highly conserved have acquired additional, species- or clade-specific functions during evolution.

We report a positive effect of HOXA-AS3 and HOXB-AS3 production on the expression levels of their overlapping HOX5–7 genes. We studied these effects in vitro in cultured cells and mostly in a cancer cell line with an abnormal karyotype, and future studies will elucidate the roles of HOXA-AS3 and HOXB-AS3 RNA products in vivo. Notably, the positive effect we report does not translate into a tight co-expression between the lncRNAs and the protein-coding genes in this region when considering a broad range of conditions and cell types (Figs 2A and S4A), likely because other mechanisms contribute to expression of the HOX5–7 genes in cells which do not express the lncRNAs. For example, we see strong expression of HOXA5 in the PFG cell population in differentiating hESCs (Fig 7A). Various mechanisms for cis-acting regulation of gene expression by lncRNAs have been demonstrated in different systems [46]. Future studies will elucidate the mechanism underlying the regulation of HOX5–7 gene expression by HOXA-AS3 and HOXB-AS3, which may resemble those of other lncRNAs. It is of particular interest to study whether HOXA-AS3 and HOXB-AS3 influence the nature of the transcripts produced in the complex loci of the HOX clusters, e.g., by influencing promoter choice. Genome editing of the loci can be particularly powerful for promoting understanding of lncRNA biology, but it is particularly difficult to perform and interpret in the Hox clusters, due to the high density of gene regulatory elements within the clusters and the complex relations between them. The most relevant system for editing the human HOXA-AS3 and HOXB-AS3 loci is likely hESCs, which can then be differentiated to MHG cells, but CRISPR-mediated editing in hESCs is inefficient [47]. Indeed, despite screening hundreds of clones, we were so far unsuccessful in obtaining homozygous deletions of the HOXA-AS3 promoter in hESCs. Mouse models carrying specific manipulations, such as insertion of polyA sites, will also be highly informative.

Some of the protein-coding genes and miRNAs in the Hox clusters were shown to be functionally equivalent to each other and to contribute differentially to organismal function via their divergent expression patterns [48,49]. These paralogs formed by duplication during the formation of the four vertebrate Hox gene clusters. The paucity of known lncRNA paralogs present in different Hox clusters can be rather easily explained by the overall high rate of lncRNA evolution [50], which likely rewired the sequences and exon-intron architectures of Hox lncRNAs extensively over the past 500 million years. The numerous features we identified as shared between HOXA-AS3 and HOXB-AS3 suggest that at least some lncRNAs were duplicated and maintained regulatory functions in the Hox cluster throughout vertebrate evolution, during which individual clusters also acquired additional lncRNAs, some of which are functional and have further sculpted gene expression within each cluster. Importantly, there is also evidence of extensive cross-regulation between the clusters, including by lncRNAs [10]. Future studies will examine the potential contribution of HOXA-AS3 and HOXB-AS3 lncRNAs to cross-cluster regulation, as well as the extent of similarity that they maintained in their modes of action.

Materials and methods

Tissue culture

H9 hESCs were routinely cultured on irradiated MEFs in hESC medium: DMEM/F-12 (Sigma, D6421), 15% KNOCK-OUT Serum Replacement (Gibco, 10828–028), 1X GlutaMAX (Gibco, 35050–038), 1% non-essential amino acids (NEAA) (Biological Industries, 01-340-1B), 0.1 mM 2-mercaptoethanol (Gibco, 31350–010), and 8 ng/ml bFGF (Peprotech, 100-18B), at 37°C in a humidified incubator with 5% CO2. HT-29, MCF7 and HEK293T cell lines were routinely cultured in DMEM containing 10% fetal bovine serum and 100 U penicillin/0.1 mg ml−1 streptomycin, at 37°C in a humidified incubator with 5% CO2.

Endodermal differentiation

Endodermal differentiation was performed as previously described [38]. Pluripotent human stem cells were grown in the absence of MEFs for four passages in mTeSR1 (StemCell Technologies, 85850) and seeded on Geltrex (Invitrogen, A1413202). After 1–2 days of recovery in mTeSR1, hESCs were washed with F12 (Gibco, 21765–029) and then treated for 24 hours with Activin A (100 ng/mL, R&D Systems, 338-AC-010), CHIR99021 (2 μM, Stemgent, 04–0004), and PI-103 (50 nM, Tocris, 2930) in CDM2 to specify APS. Afterwards, cells were washed (F12), then treated for 48 hours with Activin A (100 ng/mL) and LDN-193189/DM3189 (250 nM, Stemgent, 04–0074) in CDM2 to generate DE by day 3. Day 3 DE was patterned into AFG, PFG, or MHG by 4 days of continued differentiation in CDM2. DE was washed (F12), then differentiated as follows: AFG, A-83-01 (1 μM, Tocris, 2939) and LDN-193189 (250 nM, Stemgent, 04–0074); PFG, RA (2 μM, Sigma, R2625) and LDN-193189 (250 nM); MHG, BMP4 (10 ng/mL, R&D Systems, 314-BP-010), CHIR99021 (3 μM, Stemgent, 04–0004), and FGF2 (100 ng/mL, Peprotech, 100-18B), yielding day 7 anteroposterior domains. Media was refreshed every 24 hours for each differentiation step.
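The timeline of this protocol can be written down as a small data structure, which makes it easy to check that each lineage reaches its endpoint on day 7. The following is only an illustrative sketch: the names Step, COMMON_STEPS, PATTERNING_STEPS and schedule are hypothetical and not part of the original protocol; the durations and concentrations are copied from the text above.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Step:
    """One differentiation step (hypothetical helper type)."""
    product: str             # cell population produced by the step
    hours: int               # duration of the treatment
    factors: Dict[str, str]  # factor name -> concentration, as stated in the text


# Steps shared by all lineages: hESC -> APS (24 h) -> DE (48 h, i.e. day 3).
COMMON_STEPS: List[Step] = [
    Step("APS", 24, {"Activin A": "100 ng/mL", "CHIR99021": "2 μM", "PI-103": "50 nM"}),
    Step("DE", 48, {"Activin A": "100 ng/mL", "LDN-193189": "250 nM"}),
]

# Lineage-specific patterning of day 3 DE, 4 days (96 h) each.
PATTERNING_STEPS: Dict[str, Step] = {
    "AFG": Step("AFG", 96, {"A-83-01": "1 μM", "LDN-193189": "250 nM"}),
    "PFG": Step("PFG", 96, {"RA": "2 μM", "LDN-193189": "250 nM"}),
    "MHG": Step("MHG", 96, {"BMP4": "10 ng/mL", "CHIR99021": "3 μM", "FGF2": "100 ng/mL"}),
}


def schedule(lineage: str) -> List[Tuple[str, int]]:
    """Return (population, day on which it is obtained) for one lineage."""
    elapsed_hours = 0
    timeline = []
    for step in COMMON_STEPS + [PATTERNING_STEPS[lineage]]:
        elapsed_hours += step.hours
        timeline.append((step.product, elapsed_hours // 24))
    return timeline


if __name__ == "__main__":
    for lineage in PATTERNING_STEPS:
        print(lineage, schedule(lineage))
```

For the MHG lineage, the sketch reproduces the day 1 (APS), day 3 (DE) and day 7 (MHG) milestones described above.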

HT-29 enterocytic differentiation

Enterocyte differentiation was performed as previously described [51]. HT-29 cells were seeded at 90% confluence on ThInCerts (Greiner, 60–657641) in 6-well plates. Cells were cultured for 31 days in glucose-free conditions (Sigma, 11966–025) and the medium was changed every 2 days.

Transfections

Plasmid transfections of HEK293T, MCF7 and HT-29 cells were performed using polyethylenimine (PEI) (PEI linear, Mr 25,000, Polyscience). Cells from transient CRISPRi/a experiments were harvested after 72 h. siRNAs were transfected into HT-29 cells at 25 nM siRNA pool or control pool (Dharmacon) using DharmaFECT 4 (Horizon, T-2004) following the manufacturer’s protocol. Cells were harvested after 48 h of siRNA treatment.

Lentivirus production and stable lines generation

All lentivirus production was performed as previously described [52]. Medium was collected from plates 72 hr after transfection, filtered by VIVASPIN (Sartorius, VS2001), concentrated and stored at –80°C. hESCs and HT-29 cells were infected by adding lentiviral particles, in growth medium containing 8 μg/ml Polybrene (Sigma, 107689), to attached cells. Selection was started after 24 h and continued for several passages to isolate pools.

RNA and RT-qPCR

Total RNA was extracted from different cell lines and mouse tissues, by using RNeasy (Qiagen) according to the manufacturer’s protocol. cDNA was synthesized by using qScript Flex cDNA synthesis kit (Quanta, 95049). Fast SYBR Green master mix (Life, 4385614) was used for qPCR with gene-specific primers (S5 Table).

Immunofluorescence

Cultured cells were fixed with 4% paraformaldehyde for 10 minutes. Fixed cells were permeabilized using 0.1% Triton X-100, blocked with 5% normal goat serum, and incubated with a primary antibody, followed by incubation with a secondary antibody conjugated to a fluorescent dye. Antibodies used: Rabbit α-Eomes (Abcam, ab23345), Goat α-Sox17 (R&D Systems, AF1924), Goat α-Cdx2 (R&D Systems, AF3665), Goat α-Otx2 (R&D Systems, AF1979).

Single-molecule FISH

Cultured cells were fixed with 4% paraformaldehyde 24 hr after plating. Tissue was frozen in Tissue-Tek O.C.T compound (Sakura 4583) blocks and sectioned using a Leica cryostat (CM3050) at 10 μm thickness. Libraries of 96 and 94 probes (S4 Table) were designed to target human HOXA-AS3 and mouse Hoxaas3 RNA sequences, respectively, and a commercially available library of 48 probes was used to detect HOXA5 (cat # VSMF-2538-5) (Stellaris RNA FISH probes, Biosearch Technologies). Hybridization conditions and imaging were as previously described [53,54]. smFISH imaging was performed on a Nikon-Ti-E inverted fluorescence microscope with a 100 × oil-immersion objective and a Photometrics Pixis 1024 CCD camera using MetaMorph software as previously described [55].

RNA-seq

HT-29 cells were transfected with 25 nM siRNA against HOXA-AS3, HOXB-AS3, or with control siRNA using DharmaFECT 4 transfection reagent. RNA was extracted using TRI Reagent (MRC, TR 118) 48 hours post transfection, and 1 μg of total RNA was used for RNA-seq library preparation using the SENSE mRNA-Seq Library Prep Kit V2 for Illumina (Lexogen, LX-001.96) according to the manufacturer’s recommended protocol. Gene expression levels were quantified using RSEM [56] and a RefSeq gene annotation database. Differential expression was computed using DESeq2 with default settings [57]. RNA-seq datasets are deposited in the GEO database under the accession GSE168444. RNA-seq data from previous studies were downloaded from the SRA database and quantified using RSEM with the same annotation file. Gene Ontology enrichment was analyzed using GOrilla [32] on gene lists sorted by DESeq2 log2FoldChange, restricted to genes with an average FPKM larger than 1, after excluding pseudogenes and transcripts shorter than 200 nt.
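The preparation of the ranked gene list for the Gene Ontology analysis can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the GeneRecord type, its field names and the ranked_gene_list function are hypothetical, the sort direction is not specified in the text and is therefore left as a parameter, and the toy records are invented placeholders rather than results.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class GeneRecord:
    """Per-gene summary; field names are assumptions for illustration."""
    name: str
    log2_fold_change: float  # e.g. the DESeq2 log2FoldChange column
    mean_fpkm: float         # average FPKM across samples
    biotype: str             # annotation class, e.g. "protein_coding"
    length_nt: int           # transcript length in nucleotides


def ranked_gene_list(
    records: Iterable[GeneRecord],
    min_fpkm: float = 1.0,
    min_length_nt: int = 200,
    descending: bool = True,
) -> List[str]:
    """Filter genes as described in the text and rank them by fold change."""
    kept = [
        r for r in records
        if r.mean_fpkm > min_fpkm            # average FPKM larger than 1
        and "pseudogene" not in r.biotype    # exclude pseudogenes
        and r.length_nt >= min_length_nt     # exclude transcripts shorter than 200 nt
    ]
    kept.sort(key=lambda r: r.log2_fold_change, reverse=descending)
    return [r.name for r in kept]


# Toy input with invented values, only to exercise each filter once.
TOY_RECORDS = [
    GeneRecord("geneA", -1.5, 12.0, "protein_coding", 1800),
    GeneRecord("geneB", 0.8, 0.4, "protein_coding", 2500),       # fails FPKM filter
    GeneRecord("geneC", 2.1, 5.0, "processed_pseudogene", 900),  # pseudogene
    GeneRecord("geneD", 0.3, 30.0, "snRNA", 150),                # too short
    GeneRecord("geneE", 1.2, 3.0, "lncRNA", 700),
]

if __name__ == "__main__":
    print(ranked_gene_list(TOY_RECORDS))
    print(ranked_gene_list(TOY_RECORDS, descending=False))
```

Running the ranking in both directions yields one list headed by the most up-regulated genes and one headed by the most down-regulated genes, each of which can be submitted as a single ranked list.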

gRNA cloning

Guide RNAs were designed using CHOPCHOP. For single sgRNA expression, guide sequences were cloned into pKLV-U6gRNA(BbsI)-PGKpuro2ABFP (Addgene plasmid #50946) [58] following the Zhang Lab General Cloning Protocol (https://media.addgene.org/cms/filer_public/6d/d8/6dd83407-3b07-47db-8adb-4fada30bde8a/zhang-lab-general-cloning-protocol-target-sequencing_1.pdf). For dual sgRNA expression, a mega-primer donor was generated by PCR using primers with the following structure:

Fw primer: tacatcttgtggaaaggacgaaacaccg-gRNA1-gttttagagctagaaatagcaagttaaaataaggc

Rev primer: cttgctatttctagctctaaaac-gRNA2(rev-complement)-gggaaagagtggtctcatacagaacttataag

with pDecko-GFP (Addgene plasmid #72619 [59]) as template.

The PCR product was cloned into pDecko-GFP by restriction free cloning [60].
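The primer structure above can be illustrated with a short sketch that assembles both mega-primer oligos from a pair of guide sequences. The constant flanks are copied from the templates given in this section; the function names and the two example guides are hypothetical placeholders and do not correspond to guides used in the study.

```python
# Constant flanking sequences, copied from the primer templates in the text.
FW_PREFIX = "tacatcttgtggaaaggacgaaacaccg"
FW_SUFFIX = "gttttagagctagaaatagcaagttaaaataaggc"
REV_PREFIX = "cttgctatttctagctctaaaac"
REV_SUFFIX = "gggaaagagtggtctcatacagaacttataag"

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def _validate_guide(guide: str) -> str:
    """Upper-case a guide and reject anything that is not plain A/C/G/T."""
    guide = guide.upper()
    if not guide or set(guide) - set("ACGT"):
        raise ValueError(f"guide must be a non-empty A/C/G/T string: {guide!r}")
    return guide


def dual_guide_primers(grna1: str, grna2: str) -> tuple:
    """Build the Fw and Rev primers for the dual-sgRNA mega-primer PCR.

    gRNA1 is inserted as given into the Fw primer; gRNA2 is inserted as its
    reverse complement into the Rev primer. Guides are kept in upper case so
    that they stand out from the lower-case constant flanks.
    """
    g1 = _validate_guide(grna1)
    g2 = _validate_guide(grna2)
    fw = FW_PREFIX + g1 + FW_SUFFIX
    rev = REV_PREFIX + reverse_complement(g2) + REV_SUFFIX
    return fw, rev


if __name__ == "__main__":
    # Placeholder 20-nt guides, for illustration only.
    fw_primer, rev_primer = dual_guide_primers(
        "GACCTTGACGTACGATCGAT", "ACGTTGCAAGCTTAGGCCTA"
    )
    print("Fw primer: ", fw_primer)
    print("Rev primer:", rev_primer)
```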

Supporting information

S1 Fig. CAGE and RNA-seq read coverage support for gene models of HOXA-AS3 and HOXB-AS3 orthologs.

(A) Human gene models annotated in GENCODE v36 in the central part of the HOXA and HOXB clusters (protein-coding transcripts are in blue and noncoding are in green). The total CAGE read coverage from FANTOM5.5 is shown on top. (B) In each species, annotated or reconstructed gene models are shown for HOXA-AS3 or HOXB-AS3 (the strand from which they are produced is defined as the ‘+’ strand) and the protein-coding HOX5–7 genes (transcribed from the ‘-’ strand). RNA-seq data are from the following datasets: SRP023152 (opossum), SRP041863 (Chicken), GSE136018 (Medaka), and SRP013772 (Shark).

(EPS)

S2 Fig. PhyloCSF scores for HOXA-AS3 and HOXB-AS3.

PhyloCSF scores [16] taken from the PhyloCSF UCSC genome browser, for each of the three frames for the ‘+’ strand from which HOXA-AS3 and HOXB-AS3 are transcribed. The position of the proposed ORF from [17] is shown.

(EPS)

S3 Fig

(A) Sequence conservation and similarity in the HOXA-AS3 and HOXB-AS3 promoter regions. Exonic sequences (where known) are in bold. Predicted binding sites of the indicated transcription factors, taken from the UCSC genome browser, are shaded in yellow. Regions of the 5’ splice sites at the end of the first exon, where known, are shaded in blue. (B) Number of CDX1 or CDX2 binding sites predicted by JASPAR [63] in 201,802 human promoters annotated in FANTOM5.5 [19]. For each TSS we considered the region -100 to 100 relative to the TSS. Selected genes are highlighted.

(EPS)

S4 Fig

(A) Distribution of correlation coefficients between expression patterns of pairs of genes within the same Hox cluster (red) and found in different Hox clusters (blue). The correlation coefficient between HOXA-AS3 and HOXB-AS3 is shown in green. All the coefficients are computed across all the samples from the human FANTOM5.5 data. (B) as in Fig 2A for the mouse FANTOM5.5 data. (C) Expression of Hoxaas3 in clusters of single cells from the Tabula Muris database [64]. Eight cell groups with the highest expression are shown.

(EPS)

S5 Fig. Spatial expression patterns of HOXA-AS3 and HOXB-AS3 and other genes in the E7.5 mouse embryo.

Geo-seq data were re-mapped to the RefSeq annotations and visualized as in [20]. Each gene is shown on a separate scale. Genes are grouped based on their genomic location or gene family.

(EPS)

S6 Fig. Expression of central HOXA and HOXB genes in ENCODE cell lines.

(A) Shown are selected RefSeq gene models for the indicated genes alongside RNA-seq strand-specific read coverage from the indicated cell lines from ENCODE datasets of total (HT-29) or polyA-selected (A549, HUVEC) RNA on the indicated strand as depicted in the UCSC genome browser. HOXA-AS3 and HOXB-AS3 are transcribed from the ‘+’ strand and the protein-coding genes from the ‘-’ strand. (B) Read coverage from the ENCODE datasets in the IGV genome browser, showing the aggregated coverage from both strands, and the splice-junction-supporting reads from the ‘+’ strand (blue) and the ‘-’ strand (red).

(EPS)

S7 Fig. Cross-regulation by HOXA-AS3 and HOXB-AS3 of HOXB and HOXA clusters.

(A) Expression levels in HT-29 cells of the protein-coding genes in the indicated paralogous HOX gene group. Shown are the average expression levels across our RNA-seq dataset. (B) As in Fig 4A and 4B, for the indicated genes from the HOXB (left) and HOXA (right) clusters. (C) As in Fig 3A and 3B, for the indicated genes in the HOXB (left) and HOXA (right) clusters. (D) Changes in gene expression and DESeq2 p-values for the transcriptome-wide changes in gene expression following siRNA-mediated KD of HOXA-AS3 (left) and HOXB-AS3 (right) in HT-29 cells. Genes with adjusted P<0.05 are in red.

(EPS)

S8 Fig. hESC endodermal differentiation overview.

(A) Expression levels of Hox genes and lncRNAs in data from [38]. (B) Endodermal differentiation process and signaling molecules. (C) Cell morphology at different stages of the differentiation. (D) RNA levels and expression dynamics were measured by qRT-PCR at different stages of endodermal differentiation, and normalized to actin. (E) Immunofluorescence staining of human ESCs at different stages of differentiation. (F) As in Fig 7E (left) and Fig 7F (right) for the indicated genes.

(EPS)

S9 Fig. Regulation of HOXA-AS3 and HOXB-AS3 by CDX1/2 transcription factors.

CDX2 ChIP-seq read coverage from the gut at the indicated stage and RNA-seq at E12 in the indicated tissue, data from [65]. The region corresponding to the promoters of Hoxaas3 and Hoxb5os is shaded.

(EPS)

S1 Table. FANTOM5 expression data.

(XLSX)

S2 Table. RNA-seq data and GO enrichments.

(XLSX)

S3 Table. Markers used for validation.

(XLSX)

S4 Table. smFISH probes.

(XLSX)

S5 Table. Primers and siRNAs.

(XLSX)

S1 Dataset. LncLOOM analysis of the HOXB-AS3 conservation.

(ZIP)

Acknowledgments

We thank members of the Ulitsky lab for useful discussions, Thomas Toubul, Peter DeHoff, and Louise Laurent for discussions on the use of CRISPRi and CRISPRa in hESCs, Gilad Beck for stem cell advice and valued contributions to the hESC work, and Shani Ben-Moshe for help with smFISH of intestinal markers and mouse intestine sections.

Data Availability

RNA-seq datasets are deposited in GEO database under the accession GSE168444 (reviewer token olepogycpjsrdcp).

Funding Statement

This study was funded by grants from the US-Israel Binational Science Foundation (grant number 2015171), the Minerva Foundation, Israel Science Foundation grant 1242/14 and European Research Council grant lincSAFARI, all to IU. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1. Guttman M, Amit I, Garber M, French C, Lin MF, Feldser D, et al. Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature. 2009;458: 223–227. doi: 10.1038/nature07672
2. Sarropoulos I, Marin R, Cardoso-Moreira M, Kaessmann H. Developmental dynamics of lncRNAs across mammalian organs and species. Nature. 2019;571: 510–514. doi: 10.1038/s41586-019-1341-x
3. Mercer TR, Dinger ME, Sunkin SM, Mehler MF, Mattick JS. Specific expression of long noncoding RNAs in the mouse brain. Proc Natl Acad Sci U S A. 2008;105: 716–721. doi: 10.1073/pnas.0706729105
4. Perry RB-T, Ulitsky I. The functions of long noncoding RNAs in development and stem cells. Development. 2016;143: 3882–3894. doi: 10.1242/dev.140962
5. Deschamps J, van Nes J. Developmental regulation of the Hox genes during axial morphogenesis in the mouse. Development. 2005;132: 2931–2942. doi: 10.1242/dev.01897
6. Izpisúa-Belmonte JC, Tickle C, Dollé P, Wolpert L, Duboule D. Expression of the homeobox Hox-4 genes and the specification of position in chick wing development. Nature. 1991;350: 585–589. doi: 10.1038/350585a0
7. Kmita M, Tarchini B, Zàkàny J, Logan M, Tabin CJ, Duboule D. Early developmental arrest of mammalian limbs lacking HoxA/HoxD gene function. Nature. 2005;435: 1113–1116. doi: 10.1038/nature03648
8. Hornstein E, Mansfield JH, Yekta S, Hu JK-H, Harfe BD, McManus MT, et al. The microRNA miR-196 acts upstream of Hoxb8 and Shh in limb development. Nature. 2005;438: 671–674. doi: 10.1038/nature04138
9. Mansfield JH, Harfe BD, Nissen R, Obenauer J, Srineel J, Chaudhuri A, et al. MicroRNA-responsive ‘sensor’ transgenes uncover Hox-like and other developmentally regulated patterns of vertebrate microRNA expression. Nat Genet. 2004;36: 1079–1083. doi: 10.1038/ng1421
10. Rinn JL, Kertesz M, Wang JK, Squazzo SL, Xu X, Brugmann SA, et al. Functional demarcation of active and silent chromatin domains in human HOX loci by noncoding RNAs. Cell. 2007;129: 1311–1323. doi: 10.1016/j.cell.2007.05.022
11. Casaca A, Hauswirth GM, Bildsoe H, Mallo M, McGlinn E. Regulatory landscape of the Hox transcriptome. Int J Dev Biol. 2018;62: 693–704. doi: 10.1387/ijdb.180270em
12. Wang KC, Yang YW, Liu B, Sanyal A, Corces-Zimmerman R, Chen Y, et al. A long noncoding RNA maintains active chromatin to coordinate homeotic gene expression. Nature. 2011;472: 120–124. doi: 10.1038/nature09819
13. Hoegg S, Meyer A. Hox clusters as models for vertebrate genome evolution. Trends Genet. 2005;21: 421–424. doi: 10.1016/j.tig.2005.06.004
14. Yekta S, Tabin CJ, Bartel DP. MicroRNAs in the Hox network: an apparent link to posterior prevalence. Nat Rev Genet. 2008;9: 789–796. doi: 10.1038/nrg2400
15. Ross CJ, Rom A, Spinrad A, Gelbard-Solodkin D, Degani N, Ulitsky I. Uncovering deeply conserved motif combinations in rapidly evolving noncoding sequences. Genome Biol. 2021;22: 29. doi: 10.1186/s13059-020-02247-1
16. Lin MF, Jungreis I, Kellis M. PhyloCSF: a comparative genomics method to distinguish protein coding and non-coding regions. Bioinformatics. 2011;27: i275–82. doi: 10.1093/bioinformatics/btr209
17. Huang J-Z, Chen M, Chen D, Gao X-C, Zhu S, Huang H, et al. A Peptide Encoded by a Putative lncRNA HOXB-AS3 Suppresses Colon Cancer Growth. Mol Cell. 2017;68: 171–184.e6. doi: 10.1016/j.molcel.2017.09.015
18. Verzi MP, Hatzis P, Sulahian R, Philips J, Schuijers J, Shin H, et al. TCF4 and CDX2, major transcription factors for intestinal function, converge on the same cis-regulatory regions. Proc Natl Acad Sci U S A. 2010;107: 15157–15162. doi: 10.1073/pnas.1003822107
19. FANTOM Consortium and the RIKEN PMI and CLST (DGT), Forrest AR, Kawaji H, Rehli M, et al. A promoter-level mammalian expression atlas. Nature. 2014;507: 462–470. doi: 10.1038/nature13182
20. Peng G, Suo S, Cui G, Yu F, Wang R, Chen J, et al. Molecular architecture of lineage allocation and tissue organization in early mouse embryo. Nature. 2019;572: 528–532. doi: 10.1038/s41586-019-1469-8
21. Nowotschin S, Setty M, Kuo Y-Y, Liu V, Garg V, Sharma R, et al. The emergent landscape of the mouse gut endoderm at single-cell resolution. Nature. 2019;569: 361–367. doi: 10.1038/s41586-019-1127-1
22. Kwon GS, Viotti M, Hadjantonakis A-K. The endoderm of the mouse embryo arises by dynamic widespread intercalation of embryonic and extraembryonic lineages. Dev Cell. 2008;15: 509–520. doi: 10.1016/j.devcel.2008.07.017
23. Chan MM, Smith ZD, Grosswendt S, Kretzmer H, Norman TM, Adamson B, et al. Molecular recording of mammalian embryogenesis. Nature. 2019;570: 77–82. doi: 10.1038/s41586-019-1184-5
24. Cabili MN, Trapnell C, Goff L, Koziol M, Tazon-Vega B, Regev A, et al. Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev. 2011;25: 1915–1927. doi: 10.1101/gad.17446611
25. Pijuan-Sala B, Griffiths JA, Guibentif C, Hiscock TW, Jawaid W, Calero-Nieto FJ, et al. A single-cell molecular map of mouse gastrulation and early organogenesis. Nature. 2019;566: 490–495. doi: 10.1038/s41586-019-0933-9
26. Zhu X, Chen D, Liu Y, Yu J, Qiao L, Lin S, et al. Long Noncoding RNA HOXA-AS3 Integrates NF-κB Signaling To Regulate Endothelium Inflammation. Mol Cell Biol. 2019;39. doi: 10.1128/MCB.00139-19
27. Zhang H, Liu Y, Yan L, Zhang M, Yu X, Du W, et al. Increased levels of the long noncoding RNA, HOXA-AS3, promote proliferation of A549 cells. Cell Death Dis. 2018;9: 707. doi: 10.1038/s41419-018-0725-4
28. Martínez-Maqueda D, Miralles B, Recio I. HT29 Cell Line. The Impact of Food Bioactives on Health. 2015. pp. 113–124. doi: 10.1007/978-3-319-16104-4_11
29. Rousset M. The human colon carcinoma cell lines HT-29 and Caco-2: Two in vitro models for the study of intestinal differentiation. Biochimie. 1986. pp. 1035–1040. doi: 10.1016/s0300-9084(86)80177-8
30. Gilbert LA, Larson MH, Morsut L, Liu Z, Brar GA, Torres SE, et al. CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes. Cell. 2013;154: 442–451. doi: 10.1016/j.cell.2013.06.044
31. Qi LS, Larson MH, Gilbert LA, Doudna JA, Weissman JS, Arkin AP, et al. Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell. 2013;152: 1173–1183. doi: 10.1016/j.cell.2013.02.022
32. Eden E, Navon R, Steinfeld I, Lipson D, Yakhini Z. GOrilla: a tool for discovery and visualization of enriched GO terms in ranked gene lists. BMC Bioinformatics. 2009;10: 48. doi: 10.1186/1471-2105-10-48
33. Wu F, Zhang C, Cai J, Yang F, Liang T, Yan X, et al. Upregulation of long noncoding RNA HOXA-AS3 promotes tumor progression and predicts poor prognosis in glioma. Oncotarget. 2017;8: 53110–53123. doi: 10.18632/oncotarget.18162
34. Lin S, Zhang R, An X, Li Z, Fang C, Pan B, et al. LncRNA HOXA-AS3 confers cisplatin resistance by interacting with HOXA3 in non-small-cell lung carcinoma cells. Oncogenesis. 2019;8: 60. doi: 10.1038/s41389-019-0170-y
35. Papaioannou D, Petri A, Dovey OM, Terreri S, Wang E, Collins FA, et al. The long non-coding RNA HOXB-AS3 regulates ribosomal RNA transcription in NPM1-mutated acute myeloid leukemia. Nat Commun. 2019;10: 5351. doi: 10.1038/s41467-019-13259-2
36. Clevers H. The intestinal crypt, a prototype stem cell compartment. Cell. 2013;154: 274–284. doi: 10.1016/j.cell.2013.07.004
37. Haber AL, Biton M, Rogel N, Herbst RH, Shekhar K, Smillie C, et al. A single-cell survey of the small intestinal epithelium. Nature. 2017;551: 333–339. doi: 10.1038/nature24489
38. Loh KM, Ang LT, Zhang J, Kumar V, Ang J, Auyeong JQ, et al. Efficient endoderm induction from human pluripotent stem cells by logically directing signals controlling lineage bifurcations. Cell Stem Cell. 2014;14: 237–252. doi: 10.1016/j.stem.2013.12.007
39. Soshnikova N, Duboule D. Epigenetic regulation of vertebrate Hox genes: a dynamic equilibrium. Epigenetics. 2009;4: 537–540. doi: 10.4161/epi.4.8.10132
40. Soshnikova N, Duboule D. Epigenetic regulation of Hox gene activation: the waltz of methyls. Bioessays. 2008;30: 199–202. doi: 10.1002/bies.20724
41. Kashyap V, Gudas LJ, Brenet F, Funk P, Viale A, Scandura JM. Epigenomic reorganization of the clustered Hox genes in embryonic stem cells induced by retinoic acid. J Biol Chem. 2011;286: 3250–3260. doi: 10.1074/jbc.M110.157545
42. Varlakhanova N, Cotterman R, Bradnam K, Korf I, Knoepfler PS. Myc and Miz-1 have coordinate genomic functions including targeting Hox genes in human embryonic stem cells. Epigenetics Chromatin. 2011;4: 20. doi: 10.1186/1756-8935-4-20
43. Kanai-Azuma M, Kanai Y, Gad JM, Tajima Y, Taya C, Kurohmaru M, et al. Depletion of definitive gut endoderm in Sox17-null mutant mice. Development. 2002;129: 2367–2379.
44. Zhu X-X, Yan Y-W, Chen D, Ai C-Z, Lu X, Xu S-S, et al. Long non-coding RNA HoxA-AS3 interacts with EZH2 to regulate lineage commitment of mesenchymal stem cells. Oncotarget. 2016;7: 63561–63570. doi: 10.18632/oncotarget.11538
45. Huang H-H, Chen F-Y, Chou W-C, Hou H-A, Ko B-S, Lin C-T, et al. Long non-coding RNA HOXB-AS3 promotes myeloid cell proliferation and its higher expression is an adverse prognostic marker in patients with acute myeloid leukemia and myelodysplastic syndrome. BMC Cancer. 2019;19: 617. doi: 10.1186/s12885-019-5822-y
46. Gil N, Ulitsky I. Regulation of gene expression by cis-acting long non-coding RNAs. Nat Rev Genet. 2020;21: 102–117. doi: 10.1038/s41576-019-0184-5
47. Ihry RJ, Worringer KA, Salick MR, Frias E, Ho D, Theriault K, et al. p53 inhibits CRISPR–Cas9 engineering in human pluripotent stem cells. Nat Med. 2018;24: 939–946. doi: 10.1038/s41591-018-0050-6
48. Greer JM, Puetz J, Thomas KR, Capecchi MR. Maintenance of functional equivalence during paralogous Hox gene evolution. Nature. 2000;403: 661–665. doi: 10.1038/35001077
49. Wong SFL, Agarwal V, Mansfield JH, Denans N, Schwartz MG, Prosser HM, et al. Independent regulation of vertebral number and vertebral identity by microRNA-196 paralogs. Proc Natl Acad Sci U S A. 2015;112: E4884–93. doi: 10.1073/pnas.1512655112
50. Ulitsky I. Evolution to the rescue: using comparative genomics to understand long non-coding RNAs. Nat Rev Genet. 2016;17: 601–614. doi: 10.1038/nrg.2016.85
51. Zweibaum A, Pinto M, Chevalier G, Dussaulx E, Triadou N, Lacroix B, et al. Enterocytic differentiation of a subpopulation of the human colon tumor cell line HT-29 selected for growth in sugar-free medium and its inhibition by glucose. Journal of Cellular Physiology. 1985. pp. 21–29. doi: 10.1002/jcp.1041220105
52. Tiscornia G, Singer O, Verma IM. Production and purification of lentiviral vectors. Nature Protocols. 2006. pp. 241–245. doi: 10.1038/nprot.2006.37
53. Itzkovitz S, Lyubimova A, Blat IC, Maynard M, van Es J, Lees J, et al. Single-molecule transcript counting of stem-cell markers in the mouse intestine. Nat Cell Biol. 2011;14: 106–114. doi: 10.1038/ncb2384
54. Lyubimova A, Itzkovitz S, Junker JP, Fan ZP, Wu X, van Oudenaarden A. Single-molecule mRNA detection and counting in mammalian tissue. Nat Protoc. 2013;8: 1743–1758. doi: 10.1038/nprot.2013.109
55. Bahar Halpern K, Itzkovitz S. Single molecule approaches for quantifying transcription and degradation rates in intact mammalian tissues. Methods. 2016;98: 134–142. doi: 10.1016/j.ymeth.2015.11.015
56. Li B, Dewey CN. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics. 2011;12: 323. doi: 10.1186/1471-2105-12-323
57. Love M, Anders S, Huber W. Differential analysis of count data – the DESeq2 package. Genome Biol. 2014;15: 550. doi: 10.1186/s13059-014-0550-8
58. Koike-Yusa H, Li Y, Tan EP, Velasco-Herrera Mdel C, Yusa K. Genome-wide recessive genetic screening in mammalian cells with a lentiviral CRISPR-guide RNA library. Nat Biotechnol. 2014;32: 267–273. doi: 10.1038/nbt.2800
59. Aparicio-Prat E, Arnan C, Sala I, Bosch N, Guigó R, Johnson R. DECKO: Single-oligo, dual-CRISPR deletion of genomic elements including long non-coding RNAs. BMC Genomics. 2015;16: 846. doi: 10.1186/s12864-015-2086-z
60. Unger T, Jacobovitch Y, Dantes A, Bernheim R, Peleg Y. Applications of the Restriction Free (RF) cloning procedure for molecular manipulations and protein expression. J Struct Biol. 2010;172: 34–44. doi: 10.1016/j.jsb.2010.06.016
61. Hezroni H, Koppstein D, Schwartz MG, Avrutin A, Bartel DP, Ulitsky I. Principles of Long Noncoding RNA Evolution Derived from Direct Comparison of Transcriptomes in 17 Species. Cell Rep. 2015. doi: 10.1016/j.celrep.2015.04.023
62. Smillie CS, Biton M, Ordovas-Montanes J, Sullivan KM, Burgin G, Graham DB, et al. Intra- and Inter-cellular Rewiring of the Human Colon during Ulcerative Colitis. Cell. 2019;178: 714–730.e22. doi: 10.1016/j.cell.2019.06.029
63. Khan A, Fornes O, Stigliani A, Gheorghe M, Castro-Mondragon JA, van der Lee R, et al. JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework. Nucleic Acids Res. 2018;46: D1284. doi: 10.1093/nar/gkx1188
  • 64.Tabula Muris Consortium, Overall coordination, Logistical coordination, Organ collection and processing, Library preparation and sequencing, Computational data analysis, et al. Single-cell transcriptomics of 20 mouse organs creates a Tabula Muris. Nature. 2018;562: 367–372. doi: 10.1038/s41586-018-0590-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Kumar N, Tsai Y-H, Chen L, Zhou A, Banerjee KK, Saxena M, et al. The lineage-specific transcription factor CDX2 navigates dynamic chromatin to control distinct stages of intestine development. Development. 2019;146. doi: 10.1242/dev.172189 [DOI] [PMC free article] [PubMed] [Google Scholar]

Decision Letter 0

Bret Payseur, Eric A Miska

20 Jan 2021

* Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. *

Dear Dr Ulitsky,

Thank you very much for submitting your Research Article entitled 'Highly conserved and cis-acting lncRNAs produced from paralogous regions in the center of HOXA and HOXB clusters in the endoderm lineage' to PLOS Genetics.

The manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated the attention to an important topic but identified some concerns that we ask you address in a revised manuscript.

We therefore ask you to modify the manuscript according to the review recommendations. Your revisions should address the specific points made by each reviewer.

In addition we ask that you:

1) Provide a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.

2) Upload a Striking Image with a corresponding caption to accompany your manuscript if one is available (either a new image or an existing one from within your manuscript). If this image is judged to be suitable, it may be featured on our website. Images should ideally be high resolution, eye-catching, single panel square images. For examples, please browse our archive. If your image is from someone other than yourself, please ensure that the artist has read and agreed to the terms and conditions of the Creative Commons Attribution License. Note: we cannot publish copyrighted images.

We hope to receive your revised manuscript within the next 30 days. If you anticipate any delay in its return, we would ask you to let us know the expected resubmission date by email to plosgenetics@plos.org.

If present, accompanying reviewer attachments should be included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Please be aware that our data availability policy requires that all numerical data underlying graphs or summary statistics are included with the submission, and you will need to provide this upon resubmission if not already present. In addition, we do not permit the inclusion of phrases such as "data not shown" or "unpublished results" in manuscripts. All points should be backed up by data provided with the submission.

PLOS has incorporated Similarity Check, powered by iThenticate, into its journal-wide submission system in order to screen submitted content for originality before publication. Each PLOS journal undertakes screening on a proportion of submitted articles. You will be contacted if needed following the screening process.

To resubmit, you will need to go to the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder.

[LINK]

Please let us know if you have any questions while making these revisions.

Yours sincerely,

Eric A Miska, PhD

Associate Editor

PLOS Genetics

Bret Payseur

Section Editor: Evolution

PLOS Genetics

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The manuscript by Degani et al. presents an analysis of the expression and regulation of two antisense lncRNAs within the Hox cluster. The study is of interest, as much remains to be learned regarding the function and impact of lncRNAs. The interest of the study stems from the high conservation of these two lncRNAs and their impact on Hox gene expression, but most importantly from the approach developed by the authors, which combines mining of existing omic data with carefully designed experiments that assess the regulation of Hox genes by these lncRNAs and make it possible to properly test transcription- vs transcript-mediated effects. The work is presented in a succinct, clear manner and is discussed in the context of the current state of knowledge.

I have only minor comments

In the discussion it would be good to contrast the results shown in Figure 2a and 2b, with no or negative correlation between HOXA-AS3, HOXB-AS3 and the other HOX genes, with the results stemming from the analyses of HT-29 cells, as they appear somewhat conflicting. This is most likely a consequence of the tissues available and the cell populations represented in the data used for Figure 2.

- FigS1 - The labelling of strand and directionality for each of the species needs to be clarified. Perhaps more information in the legend, and a mention of the species-specificity of the strandedness where it is mentioned in the introduction. It is currently a little confusing as the figures seem to contradict the text without clarification that strandedness varies in the species presented.

- Fig S1: can the authors check the tracks for the RNA-Seq and the annotations for the opossum? They are not aligned. Please also check the spelling on the shark track: "Kindey".

- Fig 2 - the expression of these lncRNAs is very low and the TPM>=1 threshold is understandable, but how robust are these findings to raising that threshold?

- Fig S5 shows very large transcriptional activity on both strands; I would be careful in drawing conclusions here. To reduce the noise, could you plot the splicing junctions, as the exons are clearly marked, especially for HOXB-AS3?

Page 9: “These results suggest that HOXA-AS3 and HOXB-AS3 production or their RNA products have a positive regulatory effect on the expression of the neighboring genes HOX5–7 genes.” The authors do not show a significant effect of activation for HOXB-AS3; it would be good to slightly change the phrasing to also include this observation.

- Fig 6D - the arrows for HOXA-AS3, are these exhaustive? It looks like there are other molecules (including on the edge of the next cell in the green "Phall region", top left of the image). Do the different arrow colours represent anything? They are not referred to anywhere in the manuscript. Please refer to them in the manuscript or remove them.

Reviewer #2: The authors of the study "Highly conserved and cis-acting lncRNAs produced from paralogous regions in the center of HOXA and HOXB clusters in the endoderm lineage" focus on a lncRNA derived antisense to HOXA3 in several species. Using their expertise in evolutionary conservation, they start by finding syntenic regions across multiple species spanning an array of evolutionary distances. The PhyloCSF analysis indicates that in most species this lncRNA is not likely to encode a protein (whereas the positive control of HOX exons shows clear synonymous mutations throughout evolution, as expected for proteins). Deeper sequence analysis reveals tandem CDX1/2 sites that are very conserved. The authors perform co-expression analysis and find that these two loci are highly correlated in expression. Moreover, it is expected that IMR90 cells, which only express proximal HOX genes, would express these genes; the authors find the same for the HT-29 cancer line (whereas a primary fibroblast line, a system developed by Howard Chang's lab, may have been more beneficial for normal HOX biology; Rinn et al. PLoS Genetics 2006, Figure 7). The authors used the HT-29 line to perform LOF studies using CRISPR-I. They observe that the neighboring genes in both cases are down-regulated upon LOF of the lncRNA. Similarly, GOF results in increased expression of neighboring genes. The authors continue to show that HOXA3-as is expressed in specific regions of the intestine, but not relative to the HOXB-as. Moreover, the authors investigate the roles of these lncRNAs in hESC differentiation and find they are specifically expressed in mid/hind gut (MHG). The authors then perform CRISPR-based LOF/GOF and see significant but very small effect sizes in regulation of MHG differentiation. Finally, the authors explore the transcriptional regulation of these antisense lncRNAs by CDX1/2. They show that depletion of CDX1 down-regulates both HOXA3/B-as lncRNAs.

Overall, the evolutionary analysis of this study is interesting and the co-expression compelling. Yet all the LOF/GOF studies are far less convincing, mostly owing to highly variable and/or small effect sizes. I have the following concerns before high-profile publication in PLoS Genetics.

1) How often does the CCATAA motif show up in randomly selected transcripts? Can the authors determine if this is unique to the HOXA/B-as genes relative to all other HOX genes?

2) HOXA/B-as lncRNAs are highly correlated. If all hox pairs are considered, is the correlation more than expected for other hox genes that are highly correlated?

For example, HoxA1-7 should be highly correlated with HOXD1-7. Essentially, one would expect a high correlation between the binary proximal and distal information expressed in each hox cluster. This was found by the Duboule lab in vivo as well. Counter to the co-linearity model, it is noted in this and other studies that the hox cluster expresses either HOX1-7 or HOX9-13 paralogs (at least for the HOXA and D clusters). Thus, it would not be surprising to find strong correlation across HOX clusters, since there are typically only binary patterns of prox-on or distal-on, but not prox & distal on (as would have been suggested by the collinearity model).

Similarly, is HOTAIRM1 as or more correlated with HOXA1 & HOXA3 as the HOXB3-as and HOXA3-as pair? Or how does the HOXA3/B3-as correlation compare to intra-cluster correlation? This could be done systematically with all HOX pairs to determine the observed value of HOXA3/B3 relative to all other correlations: is the observed value significant relative to the expected empirical null?

3) The authors show that LOF and GOF decrease and increase the neighboring genes, respectively. I may have missed it, but did the authors check if there was cross-talk between the HOXA/B LOF/GOF experiments? For example, does LOF/GOF of HOXA3-as affect HOXB? Vice versa? From my reading, when HOXA3-as CRISPR-I/A was performed the authors only looked at the neighboring genes in cis, whereas it would seem important to see if HOXA3-as also affected HOXB-as. Same for Figure 7.

4) Minor: It is somewhat concerning that HT-29 cell lines have low expression and extra copies of HOX clusters, whereas IMR90 would not have this issue.

5) RNA-FISH: wouldn't one expect to see more cytoplasmic localization of HOXA5 as a positive control for spatial resolution of RNA-FISH?

6) Intestine FISH: it would be good to see HOXA3-as and HOXB-as co-localized with target genes, as in Figure 5.

7) Figure 7: everything in this figure trends the right way, but the differences that are significant are very small in effect size. While the HOXA3-as LOF/GOF seems OK in hESCs, the depletion in MHG shows ~40% depletion of HOXA3-as and 20-30% depletion of HOXA5, whilst the non-significant OCT4 and SOX17 markers have a variance that encompasses the effect size of the "significant" changes. Does this have a physiologically visible phenotype? For example, can the authors quantify, using the FISH approaches in previous figures, whether these markers and/or genes are making a significant change in differentiation? Again, is there cross-talk, as the qRT-PCR only focuses on neighboring genes of the LOF/GOF target and not the reciprocal cluster?

8) Surprising that the combo of CDX 1/2 has less of an effect than either alone?
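
As a rough illustration of the comparison requested in point 2 above (placing the observed HOXA-AS3/HOXB-AS3 correlation within the distribution of correlations across all HOX gene pairs), a minimal Python sketch might look like the following. It is not the analysis performed by the authors: the function names `pearson` and `empirical_p`, the choice of Pearson correlation, the add-one correction, and the toy expression values are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists of expression values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def empirical_p(expr, pair):
    """Return (observed r, empirical p) for `pair` against all other gene pairs.

    expr: dict mapping gene name -> list of expression values (same samples).
    The p-value is the fraction of other pairs whose correlation is at least
    as high as the observed one, with an add-one correction (an assumption).
    """
    corr = {
        frozenset(p): pearson(expr[p[0]], expr[p[1]])
        for p in combinations(sorted(expr), 2)
    }
    key = frozenset(pair)
    observed = corr[key]
    null = [r for k, r in corr.items() if k != key]
    p = (1 + sum(r >= observed for r in null)) / (1 + len(null))
    return observed, p


# Toy values only (hypothetical, not data from the manuscript).
expr = {
    "HOXA-AS3": [1.0, 2.0, 3.0, 4.0, 5.0],
    "HOXB-AS3": [2.0, 4.0, 6.0, 8.0, 10.0],
    "HOXA1": [5.0, 3.0, 4.0, 1.0, 2.0],
    "HOXD10": [1.0, 3.0, 2.0, 5.0, 4.0],
}
observed, p = empirical_p(expr, ("HOXA-AS3", "HOXB-AS3"))
print(f"observed r = {observed:.2f}, empirical p = {p:.3f}")
```

With real data, `expr` would hold one vector per HOX gene or lncRNA across the same set of samples, and a rank-based correlation could be swapped in for `pearson` if expression values are heavily skewed.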

Reviewer #3: Hox cluster regulation has long been a central setting in which to reveal and dissect mechanisms of gene/genome regulation. Assessment of lncRNAs within Hox clusters has not been without significant controversy (Hotair), and the continued identification and functional assessment of Hox-embedded ncRNAs is important to understand their quantitative impact in shaping Hox output and in revealing commonalities/differences in the mechanisms by which they control Hox cluster expression.

In this manuscript, Degani and colleagues i) demonstrate genomic conservation of two Hox-embedded lncRNAs across vertebrate species; ii) utilise published resources to good effect to characterise the developmental and adult expression of these lncRNAs, with additional fluorescent in situ hybridisation to provide spatial detail; iii) perform in vitro functional analyses indicating these lncRNAs have a positive effect on Hox gene expression of the opposite strand. Overall, the manuscript is quite straightforward and well written. I have no major technical concerns, with minor technical comments below. Greater understanding of global Hox impact, and assessment of any altered ESC differentiation endpoint downstream of altered Hox signatures, would strengthen the impact of this work.

Genomic and transcriptomic analyses:

The transcript assessment in Figure 1 looks simplified. For example, we know the Hoxa5/a6 locus is highly complex, with various transcripts produced (see work of Lucie Jeannotte). I understand this figure does not need to reflect that level of complexity in Hox protein coding genes, but is there any evidence of alternate antisense transcripts produced? What do the opening words “in human” mean, i.e., what cumulative datasets is this data derived from? (I later see in Supp fig 2 there are 2 variants? … and in Supp fig 5 that Hoxb-AS3 does appear to have 3 transcripts in cell lines.) Please clarify.

I believe the authors mean orthologue rather than homolog throughout when comparing species?

Pg 4:

“The corresponding positions of the two lncRNAs and their high conservation in other species made us scrutinize…” Just to be clear, please add… the high conservation of their presence in other species (as written it could lead the reader to think this was sequence conservation, which I don’t believe was assessed directly). Regarding this last point, is there any evidence of sequence conservation?

The single cell analysis is a good addition. The relatively higher expression of Hoxaas3 in NMP, caudal epiblast, caudal mesoderm would suggest these antisense transcripts are expressed similarly to Hox protein coding genes in the overall whole embryo A-P context. Whole mount ISH of each antisense transcript would be a good addition here, particularly as this data indicates the antisense transcripts are more highly expressed than the protein-coding genes they overlap.

Functional and expression analyses:

Elegant strategies employed, in some cases multiple strategies to corroborate.

For each gain- or loss-of-function in vitro perturbation, a clear effect was observed for the Hox genes overlapping the antisense transcript in question (a5/a6/a7 for Hoxaas3). This manuscript would greatly benefit from a more comprehensive assessment of Hox genes both in cis and in trans, particularly given that the authors later show it is the RNA transcript itself that is functional, not simply a local chromatin mechanism.

Figure S6: MCF7 cell line, lncRNA activation – this experiment was done twice, with no indication of whether the Hox protein coding activation is significant (for a6 and a7 at least). Please repeat with stats or remove from manuscript.

Regarding Figure 6, I find it quite difficult to interpret image 6D. There appears to be a haze of red signal indicating Hoxaas3, however it is indicated in the text and from scRNAseq data to be restricted to certain cell types – this is not apparent. Can the authors please clarify, and describe what technical controls are performed.

The ESC differentiation section was an excellent addition; I strongly suggest comprehensively characterising Hox expression in this system.

The Cdx1/2 direct regulation of HoxAS transcripts is interesting but of course preliminary. I find it difficult to interpret FigS8 (B): siCdx1 results in the expected knockdown of Cdx1, but siCdx2 also does, and when both siRNAs are used the level of knockdown is less than with either alone? Similar questions apply to the other graphs. The benefit of its inclusion in general, and moreover without directly supportive ChIP-seq data, is questionable.

Very minor text points

Abstract

“Sequence-similar homologs of both lncRNAs are found in multiple vertebrate species.” As mentioned above, I believe this is meant to be orthologs, and second, I believe you have compared the promoter sequence but have you compared the lncRNA sequence similarity to support this statement?

Pg 2:

“the molecular pathways that dictate their collinear expression remain mostly unknown.” Not sure this is strictly true, there’s increasing work in both ESCs and in vivo showing the signals and mechanisms that guide correct temporal Hox activation. This is of course not the point of this ms, but I would just slightly reword.

Pg 2:

“miR-196 (iab-4 in D. melanogaster)”

These microRNAs show functional conservation, but they are actually not conserved in sequence, nor located at the exact syntenic position, so they are not related.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: John Rinn

Reviewer #3: Yes: Edwina McGlinn

Decision Letter 1

Bret Payseur, Eric A Miska

23 Apr 2021

Dear Dr Ulitsky,

Thank you very much for submitting your Research Article entitled 'Highly conserved and cis-acting lncRNAs produced from paralogous regions in the center of HOXA and HOXB clusters in the endoderm lineage' to PLOS Genetics.

The manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated the attention to an important topic but identified some concerns that we ask you address in a revised manuscript.

We therefore ask you to modify the manuscript according to the review recommendations. Your revisions should address the specific points made by each reviewer.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Please let us know if you have any questions while making these revisions.

Yours sincerely,

Eric A Miska, PhD

Associate Editor

PLOS Genetics

Bret Payseur

Section Editor: Evolution

PLOS Genetics

Reviewer's Responses to Questions

Comments to the Authors:

Reviewer #1: The authors have addressed appropriately the comments raised previously

Reviewer #3: The revised manuscript is very good; the various transcriptomic analyses and figures incorporated greatly strengthen the ms, and taking out the more preliminary work focuses the story. I still would have loved to see the detailed characterisation of ESC differentiation, but comprehensive analysis in another cell line is sufficient. I have no further concerns.

Reviewer #4: The manuscript by Degani and co-authors presents an evolutionary and functional analysis of two long non-coding RNAs, transcribed on the antisense strand in the HOXA and HOXB clusters. The authors show that these lncRNAs are highly conserved during evolution and that their origin likely predates the whole-genome duplications that led to the emergence of the HOX gene clusters. They perform computational analyses of publicly available gene expression resources (including both bulk tissue expression and single cell RNA-seq data), as well as their own expression experiments to determine the spatial and temporal expression patterns of these two lncRNAs and of the neighboring HOXA and HOXB genes. Moreover, they perform in vitro knockdown (with CRISPRi and siRNA) and activation experiments for these two lncRNAs and they observe an up-regulation of the neighboring HOXA5-7 and HOXB5-7 genes. They conclude that these lncRNAs contribute to the complex regulatory mechanisms that control HOXA gene expression.

I find the evolutionary analyses convincing; there is no doubt that these lncRNAs are ancient and that they share similar structures (though not necessarily sequence conservation) in vertebrates.

I am less convinced by the loss-of-function and gain-of-function experiments and by the associated transcriptomics analyses. Here are my main comments:

1) As the authors themselves acknowledge in this manuscript, the transcriptional organization of the HOX clusters is very complex. There are many alternative isoforms, on the sense or antisense strand, not to mention numerous regulatory elements embedded in the locus. The molecular consequences of gene editing in these loci are thus hard to predict, and have to be carefully analyzed before drawing a definitive conclusion. I thus wonder what exactly happens with the transcriptional organization of the locus in the CRISPRi and CRISPRa experiments. Are the preferred transcript start sites identical? Albeit in a different experimental setting (a targeted genomic deletion of the HOTAIR lncRNA), it was shown that the HOTAIR alteration leads to the emergence of a new lncRNA at the locus and potentially to transcriptional leakage on neighboring HOX genes (Amandio et al, PLoS Genetics, 2016). As unfortunately we still know little of the molecular consequences of CRISPRi and CRISPRa experiments, it is worth examining the resulting transcript organization with RNA-seq assays rather than examining a few target genes with qRT-PCR.

2) The authors do present an RNA-seq analysis for the siRNA knockdown performed on HT-29 cells (S2 table, qRT-PCR experiments in Figure 4). However, the differential expression analysis is not convincing: unless I have misunderstood the table (a detailed legend would help), there is no significant difference in expression for any HOX genes upon siRNA down-regulation of HOXA-AS3 and HOXB-AS3. In fact, even the HOXA-AS3 and HOXB-AS3 transcripts do not show any significant expression change; I am not sure if this is because the expression levels are too low or too variable. This RNA-seq analysis (for the HOXA6-7 and HOXB6-7 genes) is not consistent with the qRT-PCR analysis shown in Figure 4. In general, qRT-PCR analyses, which show only fold changes but do not inform on the basal expression levels of the focus genes, are not sufficient to prove that there is an effect on the target gene expression.

Overall, I think it is important to emphasize that all the experiments performed in the manuscript are performed in vitro, often on cancer cell lines that likely have numerous chromosomal alterations, and may not adequately reflect the situation in vivo. I would thus recommend that the results be interpreted with extreme caution.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #3: No: 

Reviewer #4: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #3: Yes: Edwina McGlinn

Reviewer #4: No

Decision Letter 2

Bret Payseur

24 Jun 2021

Dear Dr Ulitsky,

We are pleased to inform you that your manuscript entitled "Highly conserved and cis-acting lncRNAs produced from paralogous regions in the center of HOXA and HOXB clusters in the endoderm lineage" has been editorially accepted for publication in PLOS Genetics. Congratulations!

Before your submission can be formally accepted and sent to production you will need to complete our formatting changes, which you will receive in a follow up email. Please be aware that it may take several days for you to receive this email; during this time no action is required by you. Please note: the accept date on your published article will reflect the date of this provisional acceptance, but your manuscript will not be scheduled for publication until the required changes have been made.

Once your paper is formally accepted, an uncorrected proof of your manuscript will be published online ahead of the final version, unless you’ve already opted out via the online submission form. If, for any reason, you do not want an earlier version of your manuscript published online or are unsure if you have already indicated as such, please let the journal staff know immediately at plosgenetics@plos.org.

In the meantime, please log into Editorial Manager at https://www.editorialmanager.com/pgenetics/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production and billing process. Note that PLOS requires an ORCID iD for all corresponding authors. Therefore, please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to "Update My Information" (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field. This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager.

If you have a press-related query, or would like to know about making your underlying data available (as you will be aware, this is required for publication), please see the end of this email. If your institution or institutions have a press office, please notify them about your upcoming article at this point, to enable them to help maximise its impact. Inform journal staff as soon as possible if you are preparing a press release for your article and need a publication date.

Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Genetics!

Yours sincerely,

Eric A. Miska

Associate Editor

PLOS Genetics

Bret Payseur

Section Editor: Evolution

PLOS Genetics

www.plosgenetics.org

Twitter: @PLOSGenetics

----------------------------------------------------

Comments from the reviewers (if applicable):

----------------------------------------------------

Data Deposition

If you have submitted a Research Article or Front Matter that has associated data that are not suitable for deposition in a subject-specific public repository (such as GenBank or ArrayExpress), one way to make that data available is to deposit it in the Dryad Digital Repository. As you may recall, we ask all authors to agree to make data available; this is one way to achieve that. A full list of recommended repositories can be found on our website.

The following link will take you to the Dryad record for your article, so you won't have to re‐enter its bibliographic information, and can upload your files directly: 

http://datadryad.org/submit?journalID=pgenetics&manu=PGENETICS-D-20-01741R2

More information about depositing data in Dryad is available at http://www.datadryad.org/depositing. If you experience any difficulties in submitting your data, please contact help@datadryad.org for support.

Additionally, please be aware that our data availability policy requires that all numerical data underlying display items are included with the submission, and you will need to provide this before we can formally accept your manuscript, if not already present.

----------------------------------------------------

Press Queries

If you or your institution will be preparing press materials for this manuscript, or if you need to know your paper's publication date for media purposes, please inform the journal staff as soon as possible so that your submission can be scheduled accordingly. Your manuscript will remain under a strict press embargo until the publication date and time. This means an early version of your manuscript will not be published ahead of your final version. PLOS Genetics may also choose to issue a press release for your article. If there's anything the journal should know or you'd like more information, please get in touch via plosgenetics@plos.org.

Acceptance letter

Bret Payseur

6 Jul 2021

PGENETICS-D-20-01741R2

Highly conserved and cis-acting lncRNAs produced from paralogous regions in the center of HOXA and HOXB clusters in the endoderm lineage

Dear Dr Ulitsky,

We are pleased to inform you that your manuscript entitled "Highly conserved and cis-acting lncRNAs produced from paralogous regions in the center of HOXA and HOXB clusters in the endoderm lineage" has been formally accepted for publication in PLOS Genetics! Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out or your manuscript is a front-matter piece, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Genetics and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Zsofi Zombor

PLOS Genetics

On behalf of:

The PLOS Genetics Team

Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom

plosgenetics@plos.org | +44 (0) 1223-442823

plosgenetics.org | Twitter: @PLOSGenetics

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. CAGE and RNA-seq read coverage support for gene models of HOXA-AS3 and HOXB-AS3 orthologs.

    (A) Human gene models annotated in GENCODE v36 in the central part of the HOXA and HOXB clusters (protein-coding transcripts are in blue and noncoding are in green). The total CAGE read coverage from FANTOM5.5 is shown on top. (B) In each species, annotated or reconstructed gene models are shown for HOXA-AS3 or HOXB-AS3 (the strand from which they are produced is defined as the ‘+’ strand) and the protein-coding HOX5–7 genes (transcribed from the ‘-’ strand). RNA-seq data are from the following datasets: SRP023152 (opossum), SRP041863 (chicken), GSE136018 (medaka), and SRP013772 (shark).

    (EPS)

    S2 Fig. PhyloCSF scores for HOXA-AS3 and HOXB-AS3.

    PhyloCSF scores [16], taken from the PhyloCSF UCSC genome browser, for each of the three frames of the ‘+’ strand from which HOXA-AS3 and HOXB-AS3 are transcribed. The position of the proposed ORF from [17] is shown.

    (EPS)

    S3 Fig

    (A) Sequence conservation and similarity in the HOXA-AS3 and HOXB-AS3 promoter regions. Exonic sequences (where known) are in bold. Predicted binding sites of the indicated transcription factors, taken from the UCSC genome browser, are shaded in yellow. Regions of the 5’ splice sites at the end of the first exon, where known, are shaded in blue. (B) Number of CDX1 or CDX2 binding sites predicted by JASPAR [63] in 201,802 human promoters annotated in FANTOM5.5 [19]. For each TSS, we considered the region from -100 to 100 relative to the TSS. Selected genes are highlighted.

    (EPS)

    S4 Fig

    (A) Distribution of correlation coefficients between expression patterns of pairs of genes within the same Hox cluster (red) and in different Hox clusters (blue). The correlation coefficient between HOXA-AS3 and HOXB-AS3 is shown in green. All coefficients are computed across all the samples from the human FANTOM5.5 data. (B) As in Fig 2A, for the mouse FANTOM5.5 data. (C) Expression of Hoxaas3 in clusters of single cells from the Tabula Muris database [64]. The eight cell groups with the highest expression are shown.

    (EPS)

    S5 Fig. Spatial expression patterns of HOXA-AS3 and HOXB-AS3 and other genes in the E7.5 mouse embryo.

    Geo-seq data were re-mapped to the RefSeq annotations and visualized as in [20]. Each gene is shown on a separate scale. Genes are grouped based on their genomic location or gene family.

    (EPS)

    S6 Fig. Expression of central HOXA and HOXB genes in ENCODE cell lines.

    (A) Shown are selected RefSeq gene models for the indicated genes alongside strand-specific RNA-seq read coverage from the indicated cell lines from ENCODE datasets of total (HT-29) or polyA-selected (A549, HUVEC) RNA on the indicated strand, as depicted in the UCSC genome browser. HOXA-AS3 and HOXB-AS3 are transcribed from the ‘+’ strand and the protein-coding genes from the ‘-’ strand. (B) Read coverage from the ENCODE datasets in the IGV genome browser, showing the agglomerated coverage from both strands, and the splice-junction-supporting reads from the ‘+’ strand (blue) and the ‘-’ strand (red).

    (EPS)

    S7 Fig. Cross-regulation by HOXA-AS3 and HOXB-AS3 of HOXB and HOXA clusters.

    (A) Expression levels in HT-29 cells of the protein-coding genes in the indicated paralogous HOX gene group. Shown are the average expression levels across our RNA-seq dataset. (B) As in Fig 4A and 4B, for the indicated genes from the HOXB (left) and HOXA (right) clusters. (C) As in Fig 3A and 3B, for the indicated genes in the HOXB (left) and HOXA (right) clusters. (D) Changes in gene expression and DESeq2 p-values for the transcriptome-wide changes in gene expression following siRNA-mediated KD of HOXA-AS3 (left) and HOXB-AS3 (right) in HT-29 cells. Genes with adjusted P<0.05 are in red.

    (EPS)

    S8 Fig. hESC endodermal differentiation overview.

    (A) Expression levels of Hox genes and lncRNAs in data from [38]. (B) Endodermal differentiation process and signaling molecules. (C) Cell morphology at different stages of differentiation. (D) RNA levels and expression dynamics were measured by qRT-PCR at different stages of endodermal differentiation and normalized to actin. (E) Immunofluorescence staining of human ESCs at different stages of differentiation. (F) As in Fig 7E (left) and Fig 7F (right) for the indicated genes.

    (EPS)

    S9 Fig. Regulation of HOXA-AS3 and HOXB-AS3 by CDX1/2 transcription factors.

    CDX2 ChIP-seq read coverage from the gut at the indicated stage and RNA-seq at E12 in the indicated tissue, data from [65]. The region corresponding to the promoters of Hoxaas3 and Hoxb5os is shaded.

    (EPS)

    S1 Table. FANTOM5 expression data.

    (XLSX)

    S2 Table. RNA-seq data and GO enrichments.

    (XLSX)

    S3 Table. Markers used for validation.

    (XLSX)

    S4 Table. smFISH probes.

    (XLSX)

    S5 Table. Primers and siRNAs.

    (XLSX)

    S1 Dataset. LncLOOM analysis of the HOXB-AS3 conservation.

    (ZIP)

    Attachment

    Submitted filename: HOX-AS PLoS Genetics point-by-point.pdf

    Attachment

    Submitted filename: Point-by-point HOXA_B-AS3 manuscript, round 2.pdf

    Data Availability Statement

    RNA-seq datasets are deposited in the GEO database under the accession GSE168444 (reviewer token olepogycpjsrdcp).


    Articles from PLoS Genetics are provided here courtesy of PLOS
