Abstract
Introduction
Pseudogenes are paralogues of functional genes historically viewed as defunct due to either the lack of regulatory elements or the presence of frameshift mutations. Recent evidence, however, suggests that pseudogenes may regulate gene expression, although the functional role of pseudogenes remains largely unknown. We previously reported that MYLKP1, the pseudogene of MYLK that encodes myosin light chain kinase (MLCK), is highly expressed in lung and colon cancer cell lines and tissues but not in normal lung or colon. The MYLKP1 promoter is minimally active in normal bronchial epithelial cells but highly active in lung adenocarcinoma cells. In this study, we further validate MYLKP1 as an oncogene via elucidation of the functional role of MYLKP1 genetic variants in colon cancer risk.
Methods
Proliferation and migration assays were performed in MYLKP1-transfected colon and lung cancer cell lines (H441, A549) and commercially available normal lung and colon cells. Fourteen MYLKP1 SNPs (MAFs >0.01) residing within the 4 kb MYLKP1 promoter region, the core 1.4 kb of the MYLKP1 gene, and a 4 kb enhancer region were selected and genotyped in a colorectal cancer cohort. The influence of MYLKP1 SNPs on the activity of the MYLKP1 promoter (2 kb) was assessed by dual luciferase reporter assay.
Results
Cancer cell lines, H441 and A549, exhibited increased MYLKP1 expression, increased MYLKP1 luciferase promoter activity, increased proliferation and migration. Genotyping studies identified two MYLKP1 SNPs (rs12490683; rs12497343) that significantly increase risk of colon cancer in African Americans compared to African American controls. Rs12490683 and rs12497343 further increase MYLKP1 promoter activity compared to the wild type MYLKP1 promoter.
Conclusion
MYLKP1 is a cancer-promoting pseudogene whose genetic variants differentially enhance cancer risk in African American populations.
Introduction
Pseudogenes are a type of long non-coding RNA originally derived from paralogues of functional genes. Historically, pseudogenes were considered non-functional genomic artifacts of catastrophic pathways, due to either the lack of regulatory elements or the presence of frameshift mutations [1]. However, nucleotides within these pseudogenes are conserved, suggesting that there is selective pressure to maintain the original genetic elements within the pseudogene [1]. Nearby regulatory elements regulate pseudogene transcription, and pseudogenes often share elements of the original gene's 5’ UTR and 3’ UTR regions, allowing for differential regulation across tissue types. Recent evidence further suggests that pseudogenes may also serve as microRNA decoys leading to senescence susceptibility [2–4] and aberrantly regulate gene expression in cancer tissues [5–7]. For example, PTENP1 [8] is a pseudogene of the tumor suppressor gene PTEN [9, 10] that is downregulated via methylation in renal cell carcinoma, where PTENP1 acts as a competing endogenous RNA to suppress cancer progression [11]. Overall, pseudogenes require additional functional exploration in both cancer and non-neoplastic processes [5, 6].
We previously reported the functionality of MYLKP1, a pseudogene partially duplicated from MYLK on chromosome 3p13, with divergence from MYLK unique to higher hominids [12]. MYLK encodes three variants of myosin light chain kinase (MLCK) [13, 14] that participate in regulating cytoskeletal elements involved in maintaining cell integrity, contractility, motility, cell division [14, 15] and vascular barrier integrity [15, 16]. MYLK is associated with signaling pathways that include Rho/ROCK and Ca2+ signaling, which participate in colon cancer metastasis [17, 18]. MYLK downregulation is a hallmark of colon cancer metastasis, and MYLK mRNA and smooth muscle MLCK (smMLCK) protein are dysregulated in lung cancer [19, 20]. We previously demonstrated that genes influenced by MYLK expression are associated with a poor prognosis in a variety of cancers [21].
Evolutionarily, exons 13 through 17 of MYLK have been subjected to intrachromosomal duplication, generating the partially duplicated MYLKP1 pseudogene [22]. MYLKP1 transcribes a sense strand of MYLK that decreases MYLK RNA stability [15]. Despite strong homology with the smMLCK promoter (~90%), the MYLKP1 promoter is minimally active in normal bronchial epithelial cells but highly active as the smMLCK promoter in lung adenocarcinoma cells. Moreover, MYLKP1 and smMLCK exhibit differential transcriptional profiling with MYLKP1 strongly expressed in cancer cell lines (cervix, leukemia, uterus, colon) and tissues (colon, lymph node, vulva, bladder carcinoma), whereas smMLCK is highly expressed in non-neoplastic cells (bone marrow stem, uterine fibroblast, airway smooth muscle) and tissues (brain, breast, cervix, colon, liver, uterus, vein), tissues where MYLKP1 expression is virtually absent. Thus, mechanistically, MYLKP1 over-expression dramatically inhibits smMLCK expression in cancer cells and increases cell proliferation.
We have previously demonstrated that MYLK SNPs confer increased susceptibility to inflammatory disease that drives disease severity and mortality, particularly in African descent subjects with asthma and acute inflammatory lung injury [23, 24]. These results suggest the possibility that SNPs in the conserved MYLKP1 promoter may exhibit higher minor allele frequencies (MAFs) in colon cancer subjects. Selected MYLKP1 promoter SNPs were genotyped in a colorectal cancer cohort and further assessed by luciferase reporter promoter activity assays. Two known MYLKP1 SNPs, rs12497343 (C>G) and rs12490683 (G>A) [25], affected MYLKP1 promoter activity and were significantly associated with colon cancer risk in African Americans. These studies provide evidence for the functional involvement of MYLKP1 pseudogenes in human carcinogenesis and suggest potential roles of MYLKP1 as a novel population-specific diagnostic or therapeutic target in human colon cancer.
Methods
Primary cell cultures and cell lines
Beas-2b is a human bronchial epithelial cell line, H460 is a non-small cell lung cancer cell line, and A549 is an adenocarcinoma cell line provided by American Type Culture Collection (Manassas, VA, USA). All cell lines were grown according to the manufacturer’s protocol. Beas-2b and A549 were used to assess promoter function in MYLKP1. Promoter activity was measured using a standard luciferase assay that has been previously described [14, 15]. The H23 non-small cell lung cancer cell line, the H441 adenocarcinoma cell line, and the H522 lung cancer cell line were also obtained from American Type Culture Collection, were grown according to the manufacturer’s protocols, and were used to assess proliferation and migration.
MYLKP1 luciferase assay
MYLKP1 promoter (2kb) luciferase constructs were designed in a basic pGL4 vector containing each combination of the major and minor alleles of rs12497343 (C>G) and rs12490683 (G>A) (4 constructs in total). For dual luciferase reporter gene assays, cells grown in 12-well plates were cotransfected with 1 μg of the firefly luciferase vector containing the MYLKP1 promoter and 20 ng of TK-renilla luciferase vector (Promega, Madison, WI, USA) using Fugene HD transfection reagent (Roche, Basel, Switzerland) as described previously [20].
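As a rough illustration of how dual luciferase readings are commonly reduced to a relative promoter activity, the sketch below divides each firefly signal by its co-transfected Renilla signal and expresses the result relative to a reference construct. The function names, construct labels, and readings are hypothetical and are not taken from this study; the normalization actually used is described in [20].

```python
from statistics import mean

def normalized_activity(firefly, renilla):
    """Per-well firefly/Renilla ratios (corrects for transfection efficiency)."""
    if len(firefly) != len(renilla):
        raise ValueError("firefly and Renilla readings must be paired per well")
    return [f / r for f, r in zip(firefly, renilla)]

def fold_change(construct_ratios, reference_ratios):
    """Mean normalized activity of a construct relative to the reference construct."""
    return mean(construct_ratios) / mean(reference_ratios)

# Hypothetical triplicate readings in arbitrary light units (not study data).
reference = normalized_activity([1200.0, 1100.0, 1300.0], [400.0, 400.0, 400.0])
variant = normalized_activity([2400.0, 2200.0, 2600.0], [400.0, 400.0, 400.0])
print(round(fold_change(variant, reference), 2))
```

Normalizing well by well before averaging keeps a single poorly transfected well from dominating the comparison between constructs.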
Cell proliferation and migration
For proliferation assays, cells were transfected with pcDNA 3.1 control or pcDNA 3.1 with MYLKP1 gene clone using Fugene 6 transfection reagent (Roche) [15]. Two days after transfection, cells were selected with 400 μg/ml of Geneticin (G418; Sigma-Aldrich, St. Louis, MO, USA) and maintained with 200 μg/ml of G418. Cells grown in a 12-well plate with an initial number of 10⁵ cells/well were harvested each day and counted using a Countess Automated Cell Counter (Invitrogen, Carlsbad, CA, USA) for up to 5 days.
PCR differential detection
Total RNA was purchased from Agilent Technologies (Santa Clara, CA, USA) or isolated using TRIzol reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's protocol. For a conventional RT-PCR, each reaction was carried out with 2 μl cDNA, 0.5 μM forward (3bf) and reverse (3ar) primers, and 0.01 U Phusion DNA polymerase (Finnzymes, Espoo, Finland). Three-step PCR was performed according to the manufacturer's protocol as previously described [15]. The signal was detected by ethidium bromide staining after being run on a 2% agarose gel [15].
Colorectal cases and controls
Individuals with colorectal cancer (n = 853; 400 AAs and 453 whites) who underwent surgical resection at the University of Chicago Department of Medicine between 1994 and 2008 were retrospectively ascertained from the Cancer Center and Pathology Department databases. Individuals known to have hereditary syndromes (familial adenomatous polyposis and Lynch syndrome) or inflammatory bowel disease were excluded. Available baseline characteristics included age, gender, race, colorectal tumor location, histological grade, depth of invasion, nodal involvement and recorded metastases.
Cancer-free control samples (n = 498; 302 AAs and 196 whites) were ascertained through our Pathology Department database (n = 305) and the University of Chicago Department of Medicine TRIDOM biobank (n = 93). The pathology-based controls included cancer-free individuals who had thyroidectomies and amputations, and the biobank controls included cancer-free individuals visiting for a variety of benign complaints. Controls were matched to cases by age at diagnosis, 10-year birth cohort, gender and race as recorded in the database. The details of sample collection and DNA preparation from archived surgical specimens have been validated and described previously [26].
Genotyping
Using Tagger in Haploview, we selected a total of 14 SNPs with frequencies greater than 0.05 from the region spanning MYLKP1, the MLCK pseudogene. iPLEX assays for these 14 SNPs and 100 ancestry informative markers (AIMs) were designed using the Sequenom Assay Design software, and genotyped on the Sequenom MassARRAY platform. Selection and genotyping of the AIMs utilized have been published previously [27]. The methods for genotyping were also described previously [26].
Genetic analysis
Utilizing AIMs information, the percent West African ancestry was estimated for each individual using STRUCTURE 2.1. Using prior population information from 60 Europeans and 131 West Africans, a model was run with K = 2 populations and a burn-in length of 30,000 iterations followed by 70,000 replications [28]. We excluded from the genetic analysis any African American subjects whose West African ancestry was < 0.25 (N = 10) and European American subjects whose West African ancestry was >0.1 (N = 46). Percent West African ancestry for heterozygotes and homozygotes was compared between controls and colorectal African American cases for each SNP genotyped. Percent West African ancestry was also compared via Welch two sample t-test for the homozygotes of the major allele and the homozygotes of the minor allele for controls and African American colorectal cancer cases. P-values were corrected for false discovery rate (FDR) using the Benjamini-Hochberg adjustment in R, and both unadjusted and adjusted p-values are reported.
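For readers unfamiliar with the Welch two-sample t-test mentioned above, the following minimal sketch computes its test statistic and the Welch-Satterthwaite degrees of freedom. The function name and the ancestry proportions are hypothetical; the study itself ran the test in R, and a p-value would additionally require the t distribution (for example via `scipy.stats.ttest_ind` with `equal_var=False`).

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error of each group mean (sample variance / n).
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical West African ancestry proportions for two genotype groups.
major_homozygotes = [0.80, 0.85, 0.90, 0.75, 0.82]
minor_homozygotes = [0.70, 0.78, 0.74, 0.72]
t_stat, dof = welch_t(major_homozygotes, minor_homozygotes)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

Unlike Student's t-test, this form does not assume equal variances, which suits genotype groups of very different sizes.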
PLINK was utilized for the genetic analysis [29]. SNPs were tested for departures from Hardy-Weinberg equilibrium (HWE), which excluded three SNPs with p values < 0.005. We further removed any individual in which more than two SNPs were not successfully genotyped. After removal of poor quality DNAs, the average genotype rate in the remaining 11 SNPs was greater than 95%. We excluded SNPs with minor allele frequencies less than 0.05. Association with colorectal cancer was tested in European and African Americans separately. We tested association by calculation of the chi square statistic for the difference in allele frequency between cases and controls and calculated odds ratios and 95% confidence intervals. We further tested dominant, recessive, and log-additive genetic models. P-values were corrected for false discovery rate (FDR) using the Benjamini-Hochberg adjustment in R for all tests (chi-squared, dominant, recessive, and additive). Using logistic regression, p values were adjusted for West African ancestry estimates, sex, and age. Nominal significance was p < 0.05. Haplotype analysis was performed with the haplo.stats package in R. A chi-squared test was performed for each reported haplotype ([A,C], [A,G], [G,C], [G,G]) across African and European control and case haplotype frequencies.
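The HWE filter described above can be illustrated with a simple chi-square goodness-of-fit test on genotype counts. This is a sketch under stated assumptions: the text does not say which HWE test PLINK was run with, so the chi-square form, the function name, and the genotype counts below are illustrative only; the 0.005 threshold is the one given in the text.

```python
from math import erfc, sqrt

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test (1 df) against Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p in (0.0, 1.0):
        raise ValueError("monomorphic SNP: HWE test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = erfc(sqrt(chi2 / 2.0))  # upper tail of chi-square with 1 df
    return chi2, p_value

HWE_THRESHOLD = 0.005  # exclusion threshold stated in the text

# Hypothetical genotype counts (AA, AB, BB) with a heterozygote deficit.
chi2, p_value = hwe_chi_square(40, 20, 40)
print(f"chi2 = {chi2:.1f}, exclude = {p_value < HWE_THRESHOLD}")
```

For SNPs with rare genotypes an exact test is generally preferred over this approximation, because expected counts become too small for the chi-square distribution to hold.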
Results
Detection of MYLKP1 expression in human cancer cells and transfected non-cancer cells
MYLKP1 contains a 72-base pair deletion compared with the MYLK gene (nt 342–413). PCR primers designed to flank the region containing the deletion were used to simultaneously amplify a segment of both MYLK and MYLKP1 via traditional PCR techniques [15]. PCR products on a 2% agarose gel revealed two bands with the lower band reflecting the amplified MYLKP1 mRNA transcript and the upper band reflecting MYLK mRNA transcript (Fig 1). We employed this method to detect MYLKP1 expression in several cell lines including human cancer cells (H23, H460, H441) and non-cancer epithelial cells (Beas2b) transfected with the MYLKP1 plasmid (Fig 1). Genomic DNA (gDNA) showed both bands due to the presence of both amplicons in the human genome and was used as a positive control. Non-cancer lung epithelial cells (Beas2b) displayed expression of only MYLK, however, these cells expressed both MYLKP1 and MYLK after transfection with MYLKP1. Cancer cells (H23, H460, H441) displayed basal expression of both MYLK and MYLKP1. After MYLKP1 transfection, cancer cells preferentially over-express the smaller target, MYLKP1, indicating that MYLKP1 suppresses expression of MYLK (Fig 1).
MYLKP1 expression enhances cancer cell proliferation and migration
Histological staining demonstrated increased MYLKP1 expression in A549 lung cancer cells (Fig 2A) corresponding with significant proliferation (Fig 2C) (p<0.05), consistent with our previous report that MYLKP1 promotes proliferation in cancer cell lines and tissues [15]. Both H441 and A549 cell lines demonstrated significantly increased cell migration following MYLKP1 transfection compared to control (p<0.05) (Fig 2B).
MYLKP1 promoter SNPs increase colon cancer risk in African Americans
We have previously shown that MYLKP1 expression inhibits the expression of MYLK in cancer cells [15]. To further test MYLKP1 as a potential oncogene, 11 MYLKP1 SNPs surviving QC filtering were evaluated for genetic association in a cohort of African American and European American colorectal cancer subjects (Table 1). No MYLKP1 SNP achieved statistical significance in the analysis of European American colorectal cancer cases and controls (Table 1). In the allele frequency test, both rs12497343 (p = 0.047) and rs12490683 (p = 0.023), present in the genomic region corresponding to the smooth muscle MLCK promoter in exon 16 and intron 15 (Fig 3A), were nominally associated with colorectal cancer risk in African Americans (Table 1). After adjustment for multiple testing (Benjamini and Hochberg false discovery rate, FDR), no SNP achieved significance (Table 1); however, these specific sites were selected for evaluation of potential functionality. We also tested dominant and recessive genetic models and found that rs12497343 and rs12490683 achieved smaller p values in the recessive genetic model (0.018 and 0.002, respectively). After FDR correction, rs12490683 retained a p value < 0.05 (p = 0.030) in the recessive genetic model (Table 1). Percent of West African heritage was compared between genotypes for each SNP via chi-square test and corrected for FDR (Figure A in S1 File). By logistic regression, we tested a log-additive genetic model and adjusted for age, sex, and West African ancestry (Table 2). rs7638312 was the only SNP with a significant difference in percentage of West African ancestry between genotypes (p = 0.001) (Table 3). After adjusting for age and sex, p values for rs12497343 and rs12490683 remained less than 0.05 but became insignificant after adjustment for West African ancestry (p values >0.05) (Table 2).
A single SNP (rs4677496) in exon 17 region that we previously identified to be essential for smooth muscle MYLK expression [14] was excluded from the analysis due to a poor genotyping rate (Table 1). Haplotype analysis for rs12497343 and rs12490683 was performed for each haplotype ([A,C], [A,G], [G,C], [G,G]) across the four groups (European controls, European cases, African controls, and African cases), and chi-square p-values are reported with none being significant (Table A in S1 File).
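The dominant and recessive models referred to above differ only in how the three genotype classes are collapsed into a 2x2 table before testing. The sketch below shows one common way to do this with a Pearson chi-square test; the function names and genotype counts are hypothetical, and the study's own tests were run in PLINK.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, erfc(sqrt(chi2 / 2.0))  # p value: upper tail of chi-square, 1 df

def collapse(genotype_counts, model):
    """Reduce (hom_major, het, hom_minor) counts to (at_risk, not_at_risk)."""
    hom_major, het, hom_minor = genotype_counts
    if model == "dominant":   # one or two copies of the minor allele
        return het + hom_minor, hom_major
    if model == "recessive":  # two copies of the minor allele
        return hom_minor, hom_major + het
    raise ValueError(f"unknown model: {model}")

def model_test(cases, controls, model):
    """Chi-square test of case/control status under the given genetic model."""
    a, b = collapse(cases, model)
    c, d = collapse(controls, model)
    return chi_square_2x2(a, b, c, d)

# Hypothetical genotype counts (hom_major, het, hom_minor), not study data.
cases = (200, 120, 40)
controls = (210, 130, 20)
for model in ("dominant", "recessive"):
    chi2, p = model_test(cases, controls, model)
    print(f"{model}: chi2 = {chi2:.2f}, p = {p:.3f}")
```

With these made-up counts the excess of minor-allele homozygotes among cases is visible only under the recessive coding, which is the kind of pattern that makes a recessive p value smaller than the allelic or dominant one.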
Table 1. P values and odds ratios for associations with MYLKP1 polymorphisms in African and European American colorectal cancer.
SNP | BP | MA | F_A | F_U | P_allele | OR (95% CI) | P_dom | P_rec | P_add | OR_add (95% CI) |
---|---|---|---|---|---|---|---|---|---|---|
African Americans | ||||||||||
rs10490780 | 75325508 | G | 0.140 | 0.148 | 0.687 | 0.94 (0.69,1.28) | 0.761 | 0.694 | 0.698 | 0.94 (0.70,1.28) |
rs9824516 | 75326959 | A | 0.132 | 0.124 | 0.657 | 1.08 (0.77,1.51) | 0.575 | 0.845 | 0.663 | 1.08 (0.77,1.50) |
rs7638312 | 75327606 | C | 0.061 | 0.057 | 0.775 | 1.07 (0.67,1.71) | NA | NA | 0.780 | 1.07 (0.68,1.69) |
rs6796799 | 75328126 | A | 0.273 | 0.270 | 0.907 | 1.02 (0.79,1.30) | 0.919 | 0.636 | 0.907 | 1.02 (0.79,1.30) |
rs4677497 | 75328974 | G | 0.155 | 0.164 | 0.666 | 0.93 (0.69,1.27) | 0.581 | 0.853 | 0.665 | 0.93 (0.69,1.27) |
rs12490683 | 75329934 | A | 0.238 | 0.186 | 0.023 | 1.37 (1.04,1.80) | 0.208 | 0.002 | 0.029 | 1.35 (1.03,1.76) |
rs12497343 | 75330074 | G | 0.264 | 0.216 | 0.047 | 1.30 (1.00,1.69) | 0.159 | 0.018 | 0.041 | 1.33 (1.01,1.74) |
rs6801219 | 75332618 | G | 0.102 | 0.121 | 0.293 | 0.83 (0.59,1.18) | 0.282 | 0.802 | 0.312 | 0.84 (0.60,1.12) |
rs2091870 | 75333283 | G | 0.338 | 0.345 | 0.781 | 0.97 (0.77,1.22) | 0.509 | 0.665 | 0.784 | 0.97 (0.77,1.22) |
rs4552385 | 75336163 | C | 0.466 | 0.457 | 0.747 | 1.04 (0.83,1.29) | 0.711 | 0.342 | 0.753 | 1.04 (0.83,1.28) |
rs4677503 | 75338306 | A | 0.323 | 0.311 | 0.466 | 1.09 (0.86,1.39) | 0.357 | 0.946 | 0.482 | 1.09 (0.86,1.36) |
European Americans | ||||||||||
rs6796799 | 75328126 | A | 0.134 | 0.124 | 0.623 | 1.10 (0.75,1.61) | NA | NA | 0.62 | 1.10 (0.75,1.61) |
rs4677497 | 75328974 | G | 0.118 | 0.112 | 0.795 | 1.06 (0.70,1.59) | NA | NA | 0.80 | 1.05 (0.71,1.58) |
rs12490683 | 75329934 | G | 0.381 | 0.364 | 0.603 | 1.07 (0.83,1.39) | 0.831 | 0.432 | 0.58 | 1.08 (0.82,1.41) |
rs12497343 | 75330074 | C | 0.393 | 0.382 | 0.714 | 1.05 (0.81,1.36) | 0.609 | 0.964 | 0.69 | 1.06 (0.80,1.39) |
rs2091870 | 75333283 | A | 0.368 | 0.339 | 0.374 | 1.13 (0.87,1.48) | 0.728 | 0.165 | 0.35 | 1.14 (0.87,1.50) |
rs4552385 | 75336163 | T | 0.305 | 0.300 | 0.882 | 1.02 (0.77,1.35) | 0.724 | 0.304 | 0.88 | 1.02 (0.77,1.37) |
rs4677503 | 75338306 | G | 0.333 | 0.311 | 0.470 | 1.11 (0.84,1.45) | 0.963 | 0.109 | 0.47 | 1.10 (0.84,1.44) |
P_allele is the p value obtained in the chi square test of allele frequency, and it is associated with odds ratio (OR) and 95% confidence interval (CI) in the adjacent column; P_dom is the value obtained assuming a dominant genetic model; P_rec is the value obtained assuming a recessive genetic model. P_add is the unadjusted p value obtained from a logistic regression assuming a log-additive genetic model, and it is associated with OR and 95% CI in the adjacent column. NA indicates that the frequency of one of the genotypes was too low to perform the test of the model. For the nominally significant SNPs, rs12490683 and rs12497343, the lowest p values obtained in the analysis are in bold. There are four fewer SNPs displayed in the European American part of the table because the minor allele frequency of the excluded SNPs was less than 0.05.
BP, base pair position on chromosome 3, GRCh38; F_A, frequency of minor allele in colorectal cancer cases; F_U, frequency of minor allele in controls; MA, minor allele; SNP, single nucleotide polymorphism.
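The allelic odds ratios in Table 1 follow directly from the case and control minor allele frequencies (F_A and F_U), as the short sketch below illustrates. The confidence-interval helper uses the standard log-odds (Woolf) approximation, which is an assumption about the method since the text does not state it, and the allele counts passed to it are hypothetical because the post-QC sample sizes are not given here.

```python
from math import exp, log, sqrt

def odds_ratio_from_freqs(f_cases, f_controls):
    """Allelic odds ratio from minor allele frequencies in cases and controls."""
    return (f_cases / (1.0 - f_cases)) / (f_controls / (1.0 - f_controls))

def woolf_ci(case_minor, case_major, ctrl_minor, ctrl_major, z=1.96):
    """Odds ratio and approximate 95% CI from allele counts (Woolf method)."""
    odds_ratio = (case_minor * ctrl_major) / (case_major * ctrl_minor)
    se = sqrt(1 / case_minor + 1 / case_major + 1 / ctrl_minor + 1 / ctrl_major)
    return odds_ratio, exp(log(odds_ratio) - z * se), exp(log(odds_ratio) + z * se)

# F_A and F_U for African Americans, taken from Table 1.
print(round(odds_ratio_from_freqs(0.238, 0.186), 2))  # rs12490683: 1.37, as in Table 1
print(round(odds_ratio_from_freqs(0.264, 0.216), 2))  # rs12497343: 1.3, as in Table 1

# Hypothetical allele counts, used only to show the interval calculation.
or_, lower, upper = woolf_ci(100, 300, 80, 320)
print(f"OR = {or_:.2f} (95% CI {lower:.2f}, {upper:.2f})")
```

An interval whose lower bound sits at or just above 1.00, as for rs12497343 in Table 1, corresponds to a p value close to the 0.05 threshold.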
Table 2. P values and odds ratios for associations with MYLKP1 polymorphisms in African and European American colorectal cancer, adjusted for age, sex, and West African ancestry.
SNP | BP | MA | P_adj1 | OR_adj1 | P_adj2 | OR_adj2 |
---|---|---|---|---|---|---|
African Americans | ||||||
rs10490780 | 75325508 | G | 0.791 | 0.96 | 0.718 | 0.95 |
rs9824516 | 75326959 | A | 0.452 | 1.14 | 0.518 | 1.12 |
rs7638312 | 75327606 | C | 0.623 | 1.12 | 0.672 | 1.11 |
rs6796799 | 75328126 | A | 0.728 | 1.05 | 0.811 | 1.03 |
rs4677497 | 75328974 | G | 0.630 | 0.93 | 0.603 | 0.92 |
rs12490683 | 75329934 | A | 0.138 | 1.24 | 0.038 | 1.33 |
rs12497343 | 75330074 | G | 0.220 | 1.19 | 0.078 | 1.28 |
rs6801219 | 75332618 | G | 0.383 | 0.86 | 0.318 | 0.84 |
rs2091870 | 75333283 | G | 0.421 | 0.91 | 0.744 | 0.96 |
rs4552385 | 75336163 | C | 0.842 | 1.02 | 0.660 | 1.05 |
rs4677503 | 75338306 | A | 0.545 | 1.08 | 0.367 | 1.11 |
European Americans | ||||||
rs6796799 | 75328126 | A | 0.610 | 1.11 | 0.558 | 1.12 |
rs4677497 | 75328974 | G | 0.809 | 1.05 | 0.781 | 1.06 |
rs12490683 | 75329934 | G | 0.601 | 1.08 | 0.579 | 1.08 |
rs12497343 | 75330074 | C | 0.673 | 1.06 | 0.685 | 1.06 |
rs2091870 | 75333283 | A | 0.354 | 1.14 | 0.347 | 1.14 |
rs4552385 | 75336163 | T | 0.869 | 1.03 | 0.915 | 1.02 |
rs4677503 | 75338306 | G | 0.496 | 1.10 | 0.494 | 1.10 |
P_adj1 is the p value for association adjusted for age, sex, and West African ancestry and its associated odds ratio (OR_adj1) is in the adjacent column. P_adj2 is the p value for association adjusted for age and sex and its associated OR (OR_adj2) is in the adjacent column. There are four fewer SNPs displayed in the European American part of the table because the minor allele frequency of the excluded SNPs was less than 0.05.
BP, base pair position on chromosome 3, GRCh38; MA, minor allele; SNP, single nucleotide polymorphism. Bolded are the two SNPs chosen for functional analyses
Table 3. Percentage of West African Heritage by SNP for African American Colorectal Cancer Cases.
SNP | Minor Allele | West African Ancestry | P-Value (Adjusted)a |
---|---|---|---|
rs10490780 | G | 0.838 | 0.926 (0.926) |
rs9824516 | A | 0.826 | 0.186 (0.292) |
rs7638312 | C | 0.826 | 0.001 (0.011) |
rs6796799 | A | 0.815 | 0.596 (0.755) |
rs4677497 | G | 0.809 | 0.821 (0.903) |
rs12490683 | A | 0.784 | 0.068 (0.193) |
rs12497343 | G | 0.788 | 0.040 (0.193) |
rs6801219 | G | 0.836 | 0.154 (0.282) |
rs2091870 | G | 0.807 | 0.136 (0.282) |
rs4552385 | C | 0.817 | 0.618 (0.755) |
rs4677503 | A | 0.804 | 0.070 (0.193) |
West African Ancestry heritage for each SNP. FDR is calculated using the Benjamini and Hochberg adjusted p-values via the R program p.adjust.
a Chi-squared p value calculated for each SNP.
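The Benjamini-Hochberg adjustment reported in Table 3 can be reproduced with a few lines of code. The sketch below is a plain re-implementation of the step-up procedure (the same adjustment R's `p.adjust` applies with `method = "BH"`), written here only for illustration; the function name is ours, and the input values are the unadjusted p-values listed in Table 3, in table order.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Unadjusted and adjusted p-values from Table 3, in table order.
table3_p = [0.926, 0.186, 0.001, 0.596, 0.821, 0.068,
            0.040, 0.154, 0.136, 0.618, 0.070]
table3_adj = [0.926, 0.292, 0.011, 0.755, 0.903, 0.193,
              0.193, 0.282, 0.282, 0.755, 0.193]

for raw, adj in zip(table3_p, benjamini_hochberg(table3_p)):
    print(f"{raw:.3f} -> {adj:.4f}")
```

Run on the 11 unadjusted values, this agrees with the adjusted column of Table 3 to within rounding; for example, 0.040, 0.068, and 0.070 all map to about 0.193 because the adjusted values are forced to be non-decreasing in the raw p-value.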
MYLKP1 SNPs associated with colon cancer risk alter MYLKP1 promoter activity
After confirming the role of MYLKP1 in the H441 and A549 cell lines (Fig 2), we investigated the role of two SNPs of interest, rs12490683 G>A and rs12497343 C>G, in regulation of MYLKP1 promoter activity (Fig 3B). MYLKP1 promoter luciferase reporter assays were conducted in a human adenocarcinoma cell line (H522) and a non-cancer cell line (Beas2b). The wild type vector, utilizing the major allelic pairing (rs12490683-G and rs12497343-C) showed MYLKP1 to be significantly upregulated in cancer cells (H522) over epithelial cells (Beas-2b) (p<0.05) (Fig 3B). Furthermore, transfection of a MYLKP1 promoter luciferase reporter harboring the minor allelic pairing (rs12497343-G and rs12490683-A) into H522 cancer cells resulted in significantly greater promoter activity (p<0.05) when compared to the major allelic pairing in H522 cancer cells (Fig 3B).
Discussion
We and others have demonstrated that the pseudogene, MYLKP1, located on 3p12.3 (HGNC ID:7591) representing an intrachromosomal duplication of exons 13 to 17 of MYLK copied from 3q21.1 (HGNC ID:7590) [19, 30], is selectively expressed in cancer, regulates MLCK levels, and increases cancer cell proliferation in vitro [15, 22]. While MYLKP1 and functional MYLK share high levels of DNA sequence similarity (93%), MYLK is an intricate gene spanning over 270 kb and containing 34 exons which via alternative splicing [2], generates 9 transcripts that encode 3 proteins including a 220 kDa non-muscle MLCK isoform (nmMLCK), a 130 kDa smooth muscle MLCK isoform (smMLCK) [20], and a 20 kDa protein isoform known as telokin [31]. MYLK encodes the multi-functional myosin light chain kinase (MLCK) which is involved in diverse functions in multiple types of cancer.
Similar to other documented pseudogenes [30, 32, 33], we have shown that MYLK and MYLKP1 have a pseudogene/parent gene crosstalk relationship. Due to high sequence similarity to the functional gene, pseudogenes often pose a challenge for gene prediction programs with frequent misidentification as real genes. For instance, initial interpretation of the sequence data from human chromosome 22 indicated that 19% of the coding sequences are pseudogenic [12]. More robust direct surveys of pseudogenes revealed that the estimated number of pseudogenes is ~20,000 [6, 14], a figure comparable to the number of protein-coding genes in the human genome [17]. Despite the abundance of pseudogenes in the human genome, the pathophysiological roles of pseudogenes remain poorly understood. Unlike duplicated pseudogenes and retrotransposed pseudogenes [14, 15], other pseudogenes are potentially transcriptionally active, expressing mRNAs utilizing their own promoters or adjacent promoters [16, 18]. Duplicated pseudogenes including MYLKP1, generated by tandem duplication or unequal crossover events [34], produce antisense RNAs and inhibit functional gene expression through antisense-sense mechanism [8] with functional effects on human disease [5, 15, 35, 36].
We identified MYLKP1 as a pseudogene of MYLK that regulates levels of cellular MLCK and is selectively expressed in cancer cells, a finding observed with other pseudogenes [5, 37, 38]. The pseudogene, PTENP1, acts as a microRNA decoy and thus helps maintain cellular levels of PTEN, however, the PTENP1 locus is selectively lost in specific cancer cells, resulting in decreased PTEN expression and increased proliferation [5]. Our studies indicate that MYLKP1 may function similarly to regulate levels of MLCK, a Ca2+/CaM-dependent enzyme that functions as a critical regulator of cytoskeletal function [39], cell contraction, cytokinesis [10], cellular motility [11, 40–42], mitosis [7], apoptosis [32], cell migration [31, 39] and inflammatory cell trafficking [33]. Both the smMLCK and nmMLCK isoforms are essential participants in many key pathophysiologic features of human diseases including essential hypertension [4, 20, 22], acute inflammatory lung injury, asthma [14, 43] as well as breast, pancreatic and non-small cell lung cancer [44, 45]. MYLK expression is also increased in angiogenesis and in tumors that exhibit increased invasiveness [1]. We have previously shown that nmMLCK is an independent predictor of poor clinical outcome among cancer patients that was independent of other clinic-pathologic factors [2]. Specifically, MLCK participates in migration, metastasis, and increased cellular proliferation [6, 46–48].
Previously, we have shown that an upregulation in MYLKP1 mRNA expression produces a functional transcript in multiple cancer cell lines [14, 15], and this corresponds with the downregulation of functional MYLK mRNA in cancer cell lines. MYLKP1 expression inhibits the functional gene products of MYLK, including smMLCK protein expression. We attempted to elucidate a potentially active biological role for MYLKP1 and to clarify its participation as a candidate gene in cancer risk. We now show that MYLKP1 selectively transcribes mRNA in cancer cells and dramatically decreases the expression of the functional MYLK (Fig 1). Moreover, expression of the pseudogene increases cell proliferation of normal and cancer cells (Fig 2A, Fig 2B), indicating an active role of MYLKP1 during carcinogenesis. We previously demonstrated that MYLKP1 is selectively expressed in cancer cells, functions as a regulator of MLCK levels, and increases cancer cell proliferation in vitro [14]. The potential for cross-talk between the parent gene and the pseudogene (MYLK and MYLKP1) and nmMLCK's potential as a cancer biomarker provide unique targets for cancer therapeutics that have the potential to affect cancer cell proliferation.
The rate of colon cancer mortality among African Americans is significantly higher than among Caucasian Americans, independent of socioeconomic status [49]. Mutations with a higher MAF in African Americans with colon cancer could provide a particularly valuable therapeutic target, and the unique regulation of the parent gene (MYLK) by its pseudogene (MYLKP1) provides a possible mechanistic explanation for the increased severity of colon cancer and its development at younger ages in African Americans [49]. Two SNPs (rs12497343 and rs12490683) in the MYLKP1 promoter region are promising candidates that could contribute to the regulation of MYLKP1 in cancer. These SNPs were discovered to be significant among populations of African descent and could contribute to health disparity in colon cancer outcomes but require independent replication for confirmation of this potentially important association. Improved reference panels that account for the unique diversity in African American genetic backgrounds, and the use of imputation to overcome obstacles posed by the homology between the MYLK and MYLKP1 promoter regions, may reveal unique therapeutic targets for cancer and elucidate mechanisms and pathways that contribute to greater colon cancer severity in African American populations [50]. Either next generation sequencing or imputation of the MYLKP1 promoter could provide genotypes for the rs4677496 SNP, which could not be genotyped.
Together, these studies, which provide further support for the functional involvement of pseudogenes in human pathobiology, suggest MYLKP1 should be considered as a novel diagnostic or therapeutic target in human cancer.
Supporting information
Abbreviations
- MLCK
myosin light chain kinase
- MYLK
myosin light chain kinase
- smMLCK
smooth muscle MLCK
- MYLKP1
myosin light chain kinase pseudogene
Data Availability
All relevant data are within the paper and its Supporting Information files.
Funding Statement
The authors received no specific funding for this work.
References
- 1. Harrison PM, Zheng D, Zhang Z, Carriero N, Gerstein M. Transcribed processed pseudogenes in the human genome: an intermediate form of expressed retrosequence lacking protein-coding ability. Nucleic Acids Res. 2005 April 28;33(8):2374–83. Print 2005. doi: 10.1093/nar/gki531
- 2. Bosia C, Pagnani A, Zecchina R. Modelling Competing Endogenous RNA Networks. PLoS One. 2013 June 26;8(6):e66609. Print 2013. doi: 10.1371/journal.pone.0066609
- 3. Tam OH, Aravin AA, Stein P, Girard A, Murchison EP, Cheloufi S, et al. Pseudogene-derived small interfering RNAs regulate gene expression in mouse oocytes. Nature. 2008 May 22;453(7194):534–8. Epub 2008 Apr 10. doi: 10.1038/nature06904
- 4. Gutschner T, Diederichs S. The hallmarks of cancer: A long non-coding RNA point of view. RNA Biol. 2012 June;9(6):703–19. Epub 2012 Jun 1. doi: 10.4161/rna.20481
- 5. Poliseno L, Salmena L, Zhang J, Carver B, Haveman WJ, Pandolfi PP. A coding-independent function of gene and pseudogene mRNAs regulates tumour biology. Nature. 2010 June 24;465(7301):1033–8. doi: 10.1038/nature09144
- 6. Zhou S, Wang J, Zhang Z. An emerging understanding of long noncoding RNAs in kidney cancer. J Cancer Res Clin Oncol. 2014 December;140(12):1989–95. Epub 2014 May 11. doi: 10.1007/s00432-014-1699-y
- 7. Zheng L, Li X, Gu Y, Lv X, Xi T. The 3’UTR of the pseudogene CYP4Z2P promotes tumor angiogenesis in breast cancer by acting as a ceRNA for CYP4Z1. Breast Cancer Res Treat. 2015 February;150(1):105–18. Epub 2015 Feb 22. doi: 10.1007/s10549-015-3298-2
- 8. Liu J, Xing Y, Xu L, Chen W, Cao W, Zhang C. Decreased expression of pseudogene PTENP1 promotes malignant behaviours and is associated with the poor survival of patients with HNSCC. Sci Rep. 2017 January 23;7:41179. doi: 10.1038/srep41179
- 9. Poliseno L, Haimovic A, Christos PJ, Vega Y Saenz de Miera EC, Shapiro R, Pavlick A, et al. Deletion of PTENP1 pseudogene in human melanoma. J Invest Dermatol. 2011 December;131(12):2497–500. Epub 2011 Aug 11. doi: 10.1038/jid.2011.232
- 10. Khan I, Kerwin J, Owen K, Griner E, Reproducibility Project: Cancer Biology. Registered report: A coding-independent function of gene and pseudogene mRNAs regulates tumour biology. Elife. 2015 September 3;4. doi: 10.7554/eLife.08245
- 11. Yu G, Yao W, Gumireddy K, Li A, Wang J, Xiao W, et al. Pseudogene PTENP1 Functions as a Competing Endogenous RNA to Suppress Clear-Cell Renal Cell Carcinoma Progression. Mol Cancer Ther. 2014 December;13(12):3086–97. Epub 2014 Sep 23. doi: 10.1158/1535-7163.MCT-14-0245
- 12. Brand-Arpon V, Rouquier S, Massa H, de Jong PJ, Ferraz C, Ioannou PA, et al. A Genomic Region Encompassing a Cluster of Olfactory Receptor Genes and a Myosin Light Chain Kinase (MYLK) Gene Is Duplicated on Human Chromosome Regions 3q13–q21 and 3p13. Genomics. 1999 February 15;56(1):98–110. doi: 10.1006/geno.1998.5690
- 13. Yin F, Hoggatt AM, Zhou J, Herring BP. 130-kDa smooth muscle myosin light chain kinase is transcribed from a CArG-dependent, internal promoter within the mouse mylk gene. Am J Physiol Cell Physiol. 2006 June;290(6):C1599–609. Epub 2006 Jan 11. doi: 10.1152/ajpcell.00289.2005
- 14. Han YJ, Ma SF, Wade MS, Flores C, Garcia JG. An intronic MYLK variant associated with inflammatory lung disease regulates promoter activity of the smooth muscle myosin light chain kinase isoform. J Mol Med (Berl). 2012 March;90(3):299–308. Epub 2011 Oct 21. doi: 10.1007/s00109-011-0820-9
- 15. Han YJ, Ma SF, Yourek G, Park YD, Garcia JG. A transcribed pseudogene of MYLK promotes cell proliferation. FASEB J. 2011 July;25(7):2305–12. Epub 2011 Mar 25. doi: 10.1096/fj.10-177808
- 16. Dudek SM, Garcia JG. Cytoskeletal regulation of pulmonary vascular permeability. J Appl Physiol (1985). 2001 October;91(4):1487–500. doi: 10.1152/jappl.2001.91.4.1487
- 17. Lazar V, Garcia JG. A Single Human Myosin Light Chain Kinase Gene (MLCK; MYLK) 1 Transcribes Multiple Nonmuscle Isoforms. Genomics. 1999 April 15;57(2):256–67.
- 18. Stadler S, Nguyen CH, Schachner H, Milovanovic D, Holzner S, Brenner S, et al. Colon cancer cell-derived 12(S)-HETE induces the retraction of cancer-associated fibroblast via MLC2, RHO/ROCK and Ca2+ signaling. Cell Mol Life Sci. 2017 May;74(10):1907–1921. Epub 2016 Dec 24. doi: 10.1007/s00018-016-2441-5
- 19. Sayagués JM, Corchete LA, Gutiérrez ML, Sarasquete ME, Del Mar Abad M, Bengoechea O, et al. Genomic characterization of liver metastases from colorectal cancer patients. Oncotarget. 2016 November 8;7(45):72908–72922. doi: 10.18632/oncotarget.12140
- 20. Tan X, Chen M. MYLK and MYL9 expression in non-small cell lung cancer identified by bioinformatics analysis of public expression data. Tumour Biol. 2014 December;35(12):12189–200. Epub 2014 Sep 2. doi: 10.1007/s13277-014-2527-3
- 21. Tong Z, Wang T, Garcia JG. Genes influenced by the non-muscle isoform of myosin light chain kinase impact human cancer prognosis. PLoS One. 2014 April 8;9(4):e94325. eCollection 2014. doi: 10.1371/journal.pone.0094325
- 22. Giorgi D, Ferraz C, Mattei MG, Demaille J, Rouquier S. The Myosin Light Chain Kinase Gene Is Not Duplicated in Mouse: Partial Structure and Chromosomal Localization of Mylk. Genomics. 2001 July;75(1–3):49–56. doi: 10.1006/geno.2001.6571
- 23. Christie JD, Ma SF, Aplenc R, Li M, Lanken PN, Shah CV, et al. Variation in the myosin light chain kinase gene is associated with development of acute lung injury after major trauma. Crit Care Med. 2008 October;36(10):2794–800.
- 24. Flores C, Ma SF, Maresso K, Ober C, Garcia JG. A variant of the myosin light chain kinase gene is associated with severe asthma in African Americans. Genet Epidemiol. 2007 May;31(4):296–305. doi: 10.1002/gepi.20210
- 25. Sherry ST, Ward MH, Kholodov M, Baker J, Phan L, Smigielski EM, et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 2001 January 1;29(1):308–11.
- 26. Kupfer SS, Torres JB, Hooker S, Anderson JR, Skol AD, Ellis NA, et al. Novel single nucleotide polymorphism associations with colorectal cancer on chromosome 8q24 in African and European Americans. Carcinogenesis. 2009 August;30(8):1353–7. Epub 2009 Jun 11. doi: 10.1093/carcin/bgp123
- 27. Robbins C, Torres JB, Hooker S, Bonilla C, Hernandez W, Candreva A, et al. Confirmation study of prostate cancer risk variants at 8q24 in African Americans identifies a novel risk locus. Genome Res. 2007 December;17(12):1717–22. Epub 2007 Oct 31. doi: 10.1101/gr.6782707
- 28. Falush D, Stephens M, Pritchard JK. Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics. 2003 August;164(4):1567–87.
- 29. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, et al. PLINK: a toolset for whole-genome association and population-based linkage analysis. American Journal of Human Genetics, 81 http://pngu.mgh.harvard.edu/purcell/plink/
- 30. Esposito F, De Martino M, Petti MG, Forzati F, Tornincasa M, Federico A, et al. HMGA1 pseudogenes as candidate proto-oncogenic competitive endogenous RNAs. Oncotarget. 2014 September 30;5(18):8341–54. doi: 10.18632/oncotarget.2202
- 31. Matsumura F. Regulation of myosin II during cytokinesis in higher eukaryotes. Trends Cell Biol. 2005 July;15(7):371–7. doi: 10.1016/j.tcb.2005.05.004
- 32. Karreth FA, Reschke M, Ruocco A, Ng C, Chapuy B, Leopold V, et al. The BRAF Pseudogene Functions as a Competitive Endogenous RNA and Induces Lymphoma In Vivo. Cell. 2015 April 9;161(2):319–32. Epub 2015 Apr 2. doi: 10.1016/j.cell.2015.02.043
- 33. Zhang H, Xiong Y, Xia R, Wei C, Shi X, Nie F. The pseudogene-derived long noncoding RNA SFTA1P is down-regulated and suppresses cell migration and invasion in lung adenocarcinoma. Tumour Biol. 2017 February;39(2):1010428317691418. doi: 10.1177/1010428317691418
- 34. Pink RC, Wicks K, Caley DP, Punch EK, Jacobs L, Carter DR. Pseudogenes: pseudo-functional or key regulators in health and disease? RNA. 2011 May;17(5):792–8. Epub 2011 Mar 11. doi: 10.1261/rna.2658311
- 35. Chiefari E, Iiritano S, Paonessa F, Le Pera I, Arcidiacono B, Filocamo M, et al. Pseudogene-mediated posttranscriptional silencing of HMGA1 can result in insulin resistance and type 2 diabetes. Nat Commun. 2010 July 27;1:40. doi: 10.1038/ncomms1040
- 36. Wen YZ, Zheng LL, Qu LH, Ayala FJ, Lun ZR. Pseudogenes are not pseudo any more. RNA Biol. 2012 January;9(1):27–32. Epub 2012 Jan 1. doi: 10.4161/rna.9.1.18277
- 37. Kalyana-Sundaram S, Kumar-Sinha C, Shankar S, Robinson DR, Wu YM, Asangani IA, et al. Expressed pseudogenes in the transcriptional landscape of human cancers. Cell. 2012 June 22;149(7):1622–34. doi: 10.1016/j.cell.2012.04.041
- 38. Muro EM, Mah N, Andrade-Navarro MA. Functional evidence of post-transcriptional regulation by pseudogenes. Biochimie. 2011 November;93(11):1916–21. Epub 2011 Jul 27. doi: 10.1016/j.biochi.2011.07.024
- 39. Verin AD, Gilbert-McClain LI, Patterson CE, Garcia JG. Biochemical regulation of the nonmuscle myosin light chain kinase isoform in bovine endothelium. Am J Respir Cell Mol Biol. 1998 November;19(5):767–76. doi: 10.1165/ajrcmb.19.5.3126
- 40. Herring BP, El-Mounayri O, Gallagher PJ, Yin F, Zhou J. Regulation of myosin light chain kinase and telokin expression in smooth muscle tissues. Am J Physiol Cell Physiol. 2006 November;291(5):C817–27. Epub 2006 Jun 14. doi: 10.1152/ajpcell.00198.2006
- 41. Matsumura F, Totsukawa G, Yamakita Y, Yamashiro S. Role of myosin light chain phosphorylation in the regulation of cytokinesis. Cell Struct Funct. 2001 December;26(6):639–44.
- 42. Garcia JG, Verin AD, Schaphorst KL. Regulation of thrombin-mediated endothelial cell contraction and permeability. Semin Thromb Hemost. 1996;22(4):309–15. doi: 10.1055/s-2007-999025
- 43. Mirzapoiazova T, Moitra J, Moreno-Vinasco L, Sammani S, Turner JR, Chiang ET, et al. Non-muscle myosin light chain kinase isoform is a viable molecular target in acute inflammatory lung injury. Am J Respir Cell Mol Biol. 2011 January;44(1):40–52. Epub 2010 Feb 5. doi: 10.1165/rcmb.2009-0197OC
- 44. Matsushita I, Hanai H, Kajimura M, Tamakoshi K, Nakajima T, Matsubayashi Y, et al. Should gastric cancer patients more than 80 years of age undergo surgery? Comparison with patients not treated surgically concerning prognosis and quality of life. J Clin Gastroenterol. 2002 July;35(1):29–34.
- 45. Minamiya Y, Nakagawa T, Saito H, Matsuzaki I, Taguchi K, Ito M, et al. Increased expression of myosin light chain kinase mRNA is related to metastasis in non-small cell lung cancer. Tumour Biol. 2005 May-Jun;26(3):153–7. Epub 2005 Jun 20. doi: 10.1159/000086487
- 46. Khuon S, Liang L, Dettman RW, Sporn PH, Wysolmerski RB, Chew TL. Myosin light chain kinase mediates transcellular intravasation of breast cancer cells through the underlying endothelial cells: a three-dimensional FRET study. J Cell Sci. 2010 February 1;123(Pt 3):431–40. Epub 2010 Jan 12. doi: 10.1242/jcs.053793
- 47. Cui WJ, Liu Y, Zhou XL, Wang FZ, Zhang XD, Ye LH. Myosin light chain kinase is responsible for high proliferative ability of breast cancer cells via anti-apoptosis involving p38 pathway. Acta Pharmacol Sin. 2010 June;31(6):725–32. Epub 2010 May 10. doi: 10.1038/aps.2010.56
- 48. Barkan D, Kleinman H, Simmons JL, Asmussen H, Kamaraju AK, Hoenorhoff MJ, et al. Inhibition of metastatic outgrowth from single dormant tumor cells by targeting the cytoskeleton. Cancer Res. 2008 August 1;68(15):6241–50. doi: 10.1158/0008-5472.CAN-07-6849
- 49. Ahuja N, Chang D, Gearhart SL. Disparities in colon cancer presentation and in-hospital mortality in Maryland: a ten-year review. Ann Surg Oncol. 2007 February;14(2):411–6. Epub 2006 Nov 1. doi: 10.1245/s10434-006-9130-9
- 50. Kessler MD, Yerges-Armstrong L, Taub MA, Shetty AC, Maloney K, Jeng LJ, et al. Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA), O'Connor TD. Challenges and disparities in the application of personalized genomic medicine to populations with African ancestry. Nat Commun. 2016 October 11;7:12521. doi: 10.1038/ncomms12521