Abstract
Nursing informatics requires an understanding of patient-centered data and clinical workflow, and epigenetic research requires an understanding of data analysis. The purpose of this article is to document the methodology that nursing informatics specialists can use to conduct epigenetic research and subsequently strengthen patient-centered care. A pilot study of a secondary methylation data analysis using The Cancer Genome Atlas data from individuals with colon cancer is utilized to illustrate the methodology. The steps for conducting the study using public and free resources are discussed. These steps include finding a data source; downloading and analyzing differentially methylated regions; annotating differentially methylated regions; performing gene ontology and function analysis; and reporting results. A model of epigenetic testing workflow is provided, as is a list of publicly available data and analysis sources that can be used to conduct epigenetic research.
KEY WORDS: Differentially methylated regions, Epigenetics, Gene ontology, Methodology, Methylation, Nursing informatics
Nursing informatics is a data-driven subspecialty of nursing that can improve patient care outcomes through the interdisciplinary sharing of patient data in epigenetic research.1 Epigenetics is defined as modifications to DNA transcription that are not due to changes in the DNA sequence.2 These changes often lead to chronic diseases such as cancer and immune disorders.3 Using data to assess epigenetic changes can strengthen the delivery of precision medicine.4 Assessing and measuring epigenetic changes between groups requires bioinformatics and statistical instruments. There are several epigenetic processes, including histone modification and nucleosome positioning, known to impact gene expression; this article focuses on differential methylation.5–7 Although many programs exist to analyze differential methylation between groups, little information exists on which easy-to-use, reliable resources informatics nurse specialists can use to compare and analyze valid methylation data in epigenetic research. The purpose of this article is to describe methodology that informatics nurse specialists can use to conduct epigenetic research for various patient populations. The article is based on an unpublished pilot study performed in 2017 by the author, and although the results are accurate, the focus of this article is to describe the methodology used to report the results.
Colorectal cancer (CRC) is the third most commonly diagnosed malignant cancer among both men and women in the United States.8 It is estimated that approximately 151 000 new cases of CRC will be diagnosed, with approximately 53 000 deaths in 2022 in the United States alone.9 The 5-year survival rate for stage I CRC is approximately 92%, decreasing to 12% for patients with stage IV CRC.10 Approximately 80% of CRC is attributable to epigenetic factors, rather than heritable mutations, and a 2018 study found that approximately 55% of CRC cases were caused by modifiable risk factors.11 Investigating the epigenetic changes that occur during stages of colon cancer (CC) may provide insight into finding epigenetic markers associated with CC.
Methylation is a chemical process in which a methyl group attaches via a covalent bond to the nucleobase cytosine.5 Methylation of cytosine prevents the nucleotide from being transcribed, potentially leading to loss of expression of the gene formed by the transcription of the nucleotide sequence.12 When cytosine is followed by guanine, it is referred to as a CpG site. CpG islands are clusters of CpG sites, generally found near gene promoter regions, and approximately half of all promoters have CpG islands that lead to gene silencing when methylated.13 Differentially methylated regions (DMRs) are areas of multiple methylated CpG sites in close proximity to each other.13 The process of identifying DMRs involves analyzing the individual differentially methylated CpG sites and then determining the sequential proximity of each differentially methylated CpG site to find the DMR. By analyzing the methylation patterns associated with each stage of CC, we can develop methods to better understand the mechanisms of cancer pathogenesis and provide insight into prevention of metastasis and future treatment and detection methods.14
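The proximity step described above can be illustrated with a minimal Python sketch. It is not the algorithm used by any particular package; the function name, the 500-base maximum gap, the three-site minimum, and the example positions are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def group_cpgs_into_dmrs(
    positions: List[int], max_gap: int = 500, min_cpgs: int = 3
) -> List[Tuple[int, int, int]]:
    """Cluster significant CpG positions on one chromosome into candidate DMRs.

    Neighboring sites closer than max_gap bases fall into the same cluster.
    Clusters with fewer than min_cpgs sites are discarded.
    Returns a list of (start, end, number_of_cpgs).
    """
    dmrs = []
    cluster: List[int] = []
    for pos in sorted(positions):
        # A large jump from the previous site closes the current cluster.
        if cluster and pos - cluster[-1] > max_gap:
            if len(cluster) >= min_cpgs:
                dmrs.append((cluster[0], cluster[-1], len(cluster)))
            cluster = []
        cluster.append(pos)
    if len(cluster) >= min_cpgs:
        dmrs.append((cluster[0], cluster[-1], len(cluster)))
    return dmrs


# Hypothetical positions of differentially methylated CpG sites.
sites = [1000, 1100, 1250, 5000, 9000, 9200, 9300, 9450]
print(group_cpgs_into_dmrs(sites))
# [(1000, 1250, 3), (9000, 9450, 4)]
```

The isolated site at position 5000 is dropped because it has no close neighbors, which mirrors the idea that a single differentially methylated CpG site does not make a region.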
In one commonly used method to detect methylated cytosine, DNA is bathed in sodium bisulfite to convert nonmethylated cytosine to uracil, which can then be measured, because methylated cytosine remains as cytosine.15 The bisulfite prepared DNA is compared with the original to determine which cytosine CpG sites are methylated. Based on this principle, several methods to detect methylation are currently used in research. The most common, because of the lower cost, is methylation microarrays.16 In this process, specific areas of DNA associated with genes are targeted and analyzed, and approximately 85 000 CpG sites are analyzed, covering the promoter region of 99% of known genes.17 A second process called reduced representation bisulfite sequencing produces small fragments of DNA with CpG sites at both ends and then analyzes them for methylated sites.15 This method generates data on CpG-rich areas in DNA and covers 10 times more of the genome than microarrays, but requires more data storage (up to 5 gigabytes per patient).17 A third method is whole genome bisulfite sequencing, where the entire genome is analyzed for any methylated cytosine, and all methylated sites are reported, requiring data storage of up to 90 gigabytes per patient.18
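The bisulfite principle can be shown with a toy simulation in Python. This is only a sketch of the logic, not a laboratory or sequencing pipeline; the function names and the example sequence are hypothetical.

```python
from typing import List, Set


def bisulfite_convert(sequence: str, methylated_positions: Set[int]) -> str:
    """Simulate bisulfite treatment of a DNA string.

    Unmethylated cytosine (C) becomes uracil (U); methylated cytosine stays C.
    Positions are 0-based indexes into the sequence.
    """
    return "".join(
        "U" if base == "C" and index not in methylated_positions else base
        for index, base in enumerate(sequence)
    )


def call_methylated_cpgs(original: str, converted: str) -> List[int]:
    """Compare treated and original DNA and return methylated CpG positions.

    A CpG cytosine that is still C after conversion is called methylated.
    """
    return [
        i
        for i in range(len(original) - 1)
        if original[i] == "C" and original[i + 1] == "G" and converted[i] == "C"
    ]


original = "ACGTTCGACCGA"   # CpG cytosines at indexes 1, 5, and 9
methylated = {1, 9}         # hypothetical methylation state
converted = bisulfite_convert(original, methylated)
print(converted)                                   # ACGTTUGAUCGA
print(call_methylated_cpgs(original, converted))   # [1, 9]
```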
Once the methylation data are collected and analyzed, the CpG or DMRs are generally reported in a data table using the generic feature format and/or the gene transfer format.19 These data file formats were developed to allow applications to share genomic data for analysis in a smaller file size than the raw genomic data.19 The data can then be imported into annotation software or Web sites that will convert the methylation data into gene information, a process called annotation.
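As an illustration of how such a file can be read, the following Python sketch parses one line in the nine-column, tab-separated generic feature format (version 3 layout). The record values, the "DMR" feature type, and the attribute names are hypothetical and are not taken from the pilot study.

```python
from typing import Dict, NamedTuple


class GffRecord(NamedTuple):
    seqid: str
    source: str
    type: str
    start: int
    end: int
    score: str
    strand: str
    phase: str
    attributes: Dict[str, str]


def parse_gff3_line(line: str) -> GffRecord:
    """Parse one feature line: nine tab-separated columns, with the last
    column holding semicolon-separated key=value attributes."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, found {len(fields)}")
    attributes = dict(
        item.split("=", 1) for item in fields[8].split(";") if "=" in item
    )
    return GffRecord(
        fields[0], fields[1], fields[2],
        int(fields[3]), int(fields[4]),
        fields[5], fields[6], fields[7],
        attributes,
    )


# Hypothetical record describing one DMR.
example = "chr1\tdemo\tDMR\t1000\t1250\t.\t+\t.\tID=dmr1;n_cpgs=3"
record = parse_gff3_line(example)
print(record.seqid, record.start, record.end, record.attributes["n_cpgs"])
# chr1 1000 1250 3
```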
Annotation is the process of taking raw genomic data and identifying the regions of the genes that are impacted by the data.20 Annotation can be either a manual or automated process using programs and Web sites and includes both structural as well as functional annotation. The process will provide a list of genomic locations of the CpG or DMR as well as the genes impacted.21 The genomic location will provide information about specific genes and gene parts (transcripts) where the methylation is occurring.22 The genes will be provided as gene symbols approved by the Human Gene Nomenclature Committee, an international group of scientists that developed an unambiguous list of gene symbols and names.23
Understanding and communicating gene functions for reproducible research through the use of controlled vocabulary are the purpose of gene ontology (GO).24 Gene ontology allows for the classification of single genes or groups of genes by molecular function, biological process, and cellular component, as well as functional annotation.25 Gene ontology is reported as connections using evidence codes and P values showing associations between gene groups.26 The P values reported for GO analysis also include a corrected P value with either false discovery rate, Benjamini, Bayesian, or others to enhance the accuracy of reports.27
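To give a sense of where an uncorrected GO P value can come from, the sketch below computes a plain hypergeometric tail probability in Python. It is an assumption-laden illustration: ontology tools may use modified versions of this test, and the counts in the example are invented.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """Probability of seeing at least k term-annotated genes in a list.

    N: genes in the background
    K: background genes annotated with the GO term
    n: genes in the submitted list
    k: genes in the submitted list annotated with the GO term
    """
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


# Invented counts: 20 background genes, 5 carry the term,
# and 2 of the 4 genes in the submitted list carry it.
p = hypergeom_enrichment_p(k=2, n=4, K=5, N=20)
print(round(p, 4))  # 0.2487
```

A small P value would suggest that the term appears in the gene list more often than chance alone would predict; the corrected value discussed in the text would still need to be computed across all terms tested.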
When only a few statistical tests are performed, unadjusted P values can be used to determine statistical significance. When many tests are performed at once, as in epigenetics, where each CpG site is tested separately, correction for multiple testing is necessary.28 Many (if not all) biostatistical software programs will provide a corrected P value, reported as false discovery rate, Benjamini, or Bayesian-corrected, and this value should be used as the reported measure of statistical significance.29 Functional annotation is a component of GO that provides insight into the biological interaction of genes.30
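The Benjamini-Hochberg false discovery rate adjustment is one common form of this correction. The following Python sketch shows the arithmetic on four invented P values; statistical packages provide their own implementations, and this version is for illustration only.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the original order.

    Each P value is multiplied by (number of tests / its rank), and the
    results are made monotone by taking a running minimum from the
    largest P value down to the smallest.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.20]  # invented example values
print([round(p, 4) for p in benjamini_hochberg(raw)])
# [0.04, 0.0533, 0.0533, 0.2]
```

In this example, two raw P values below .05 (0.04 and 0.03) are no longer below .05 after adjustment, which is why the corrected value is the one to report.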
The effects of methylation on CC risk, carcinogenesis, and progression present mixed results,31 with some showing promise toward developing methylation-based biomarkers for the detection of CC.32 Known levels of methylation for each cancer stage could lead to recommended alternate treatments based on epigenetic processes.33 By analyzing the methylation changes associated with CC stages, we were able to isolate the cancer-related methylation processes to prevent and treat cancer progression and metastasis.
MATERIALS AND METHODS
There are five distinct steps undertaken during an epigenetic research project, with the assumption made that a patient population of interest has already been determined prior to undertaking the research. Unlike traditional research, epigenetics does not rely on the researcher to have a distinct hypothesis to test against a null when beginning the process; rather, the research itself leads to hypotheses from the findings. Figure 1 shows the basic workflow steps in an epigenetic research process, and Table 1 provides an abridged list of online sites with available data and resources that can be used for successful research projects.
FIGURE 1.

General workflow steps for epigenetic research.
Table 1.
Brief Epigenetic and Bioinformatics Web Sites With Free Access
| Site Name | Web Address | Purpose |
|---|---|---|
| COSMIC Browserᵃ | https://cancer.sanger.ac.uk/cosmic | Annotation |
| DNA Methylation Interactive Visualization Database (DNMIVD) | http://119.3.41.228/dnmivd/index | Annotation |
| Ensembl | http://useast.ensembl.org/index.html | Annotation |
| GeneCardsᵃ | https://www.genecards.org | Annotation |
| Genome Tools | http://genometools.org/index.html | Annotation |
| UCSC Genome Browserᵃ | https://genome.ucsc.edu | Annotation |
| Washington University | http://epigenomegateway.wustl.edu/browser/ | Annotation |
| National Center for Biotechnology Information (NCBI) | https://www.ncbi.nlm.nih.gov | Annotation/data source |
| Bioconductorᵃ | https://www.bioconductor.org/ | Bioinformatics data analysis |
| Biopython | https://biopython.org/ | Bioinformatics data analysis |
| Vennyᵃ | https://bioinfogp.cnb.csic.es/tools/venny | Comparative analysis |
| Python | https://www.python.org/ | Data analysis |
| R Projectᵃ | https://www.r-project.org/ | Data analysis |
| Array Express | https://www.ebi.ac.uk/arrayexpress/ | Data source |
| Blueprint Epigenome | https://www.blueprint-epigenome.eu/ | Data source |
| Gene Expression Omnibus (GEO) | https://www.ncbi.nlm.nih.gov/gds | Data source |
| Genome Aggregation Database | https://gnomad.broadinstitute.org/ | Data source |
| Genomics Data Lake | https://docs.microsoft.com/en-us/azure/open-datasets/dataset-genomics-data-lake | Data source |
| International Genome Sample Resource (IGSR) | https://www.internationalgenome.org/data-portal/ | Data source |
| TCGAᵃ | https://portal.gdc.cancer.gov | Data source |
| ggplot | https://ggplot2.tidyverse.org/ | Data visualization |
| Genboree | http://www.genboree.org/site/epigenomics_toolset | Education |
| National Human Genome Research Institute (NHGRI) | https://www.genome.gov | Education |
| National Library of Medicine (MEDLINE) | https://medlineplus.gov/genetics/understanding/ | Education |
| University of Utah | https://learn.genetics.utah.edu/content/epigenetics/ | Education |
| Database for Annotation Visualization and Integrated Discovery (DAVID)ᵃ | https://david.ncifcrf.gov | Ontology |
| Kyoto Encyclopedia of Genes and Genomes (KEGG)ᵃ | https://www.genome.jp/kegg | Ontology |
| Protein Analysis Through Evolutionary Relationships (PANTHER) | http://www.pantherdb.org | Ontology |
ᵃ Source used/cited in this article. Inclusion in this list does not imply endorsement, nor is this list meant to be exhaustive; it is merely an example of reliable online resources available for epigenetic research.
As previously mentioned, differential methylation research is data-intensive, requiring significant storage and computing capacity. Current differential methylation analysis methods using whole genome bisulfite sequencing provide raw data files between 700 megabytes and 1 gigabyte in size per patient. Methylation data files can approach 500 megabytes, or 7 million rows of comma separated data per patient. This exceeds the row limit for most spreadsheet programs, so data editing can be difficult without data analysis programs. Online data processing platforms exist that allow researchers to overcome the row limits of most personal computers, but data transfer speeds can cause data uploads to take hours, if not days, depending on Internet connections. As the raw data are analyzed to DMR and then annotated to genes, the data become more manageable, and data analysis can be accomplished by most personal computers.
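One way around spreadsheet row limits is to stream a file one row at a time instead of opening it whole. The Python sketch below does this with the standard library; the column names (`cpg_id`, `beta`), the "NA" marker for missing values, and the three-row demonstration file are assumptions, not the layout of any specific download.

```python
import csv
import io
from typing import TextIO, Tuple


def summarize_beta_stream(handle: TextIO, beta_column: str = "beta") -> Tuple[int, float]:
    """Read a comma-separated methylation file row by row.

    Only one row is held in memory at a time, so the file can have far more
    rows than a spreadsheet program allows. Rows with a missing value
    (empty or "NA") are skipped. Returns (rows_used, mean_beta).
    """
    rows_used = 0
    total = 0.0
    for row in csv.DictReader(handle):
        value = row[beta_column]
        if value in ("", "NA"):
            continue
        total += float(value)
        rows_used += 1
    mean_beta = total / rows_used if rows_used else float("nan")
    return rows_used, mean_beta


# A tiny in-memory stand-in for a large file on disk.
demo = io.StringIO("cpg_id,beta\ncg0001,0.10\ncg0002,NA\ncg0003,0.50\n")
rows, mean_beta = summarize_beta_stream(demo)
print(rows, round(mean_beta, 2))  # 2 0.3
```

With a real file, the `io.StringIO` object would be replaced by `open(path, newline="")`.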
Step 1—Find Data Source
The data for this analysis were retrieved from The Cancer Genome Atlas (TCGA), and DMR analysis was conducted using a publicly available R package, TCGAbiolinks.34 Cases were chosen based on available data by cancer stage. The TCGA colon adenocarcinoma database had 30 stage IV cases with methylation data generated on the Illumina 450K platform, aligned to Hg38, and with age, weight, and sex documented. To match this number, 30 cases were randomly chosen from the database for each of the other three stages (I, II, and III). In every stage, cases were eligible for random selection only if they used the same methylation analysis platform and had age, sex, and weight documented.
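The selection logic can be sketched in Python as follows. The case records, field names, and random seed are hypothetical; the sketch only shows filtering on documented covariates and then drawing the same number of cases from each stage in a reproducible way.

```python
import random
from collections import defaultdict
from typing import Dict, List


def sample_cases_per_stage(
    cases: List[dict], n_per_stage: int, seed: int = 0
) -> Dict[str, List[dict]]:
    """Randomly draw up to n_per_stage eligible cases from each stage.

    A case is eligible only if age, weight, and sex are all documented.
    A fixed seed makes the random draw reproducible.
    """
    rng = random.Random(seed)
    by_stage = defaultdict(list)
    for case in cases:
        if all(case.get(key) is not None for key in ("age", "weight", "sex")):
            by_stage[case["stage"]].append(case)
    return {
        stage: rng.sample(group, min(n_per_stage, len(group)))
        for stage, group in sorted(by_stage.items())
    }


# Hypothetical case list: five complete cases per stage plus one incomplete case.
cases = [
    {"id": f"{stage}-{i}", "stage": stage, "age": 60, "weight": 70, "sex": "F"}
    for stage in ("I", "II")
    for i in range(5)
]
cases.append({"id": "I-x", "stage": "I", "age": 55, "weight": None, "sex": "M"})

sampled = sample_cases_per_stage(cases, n_per_stage=3)
print({stage: len(group) for stage, group in sampled.items()})
# {'I': 3, 'II': 3}
```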
Colorectal cancer staging is performed using the American Joint Committee on Cancer staging system.35 This system relies on visualization of the cancer and surrounding tissue. Stages are ranked from stage 0 to IV based on the size of the Tumor, whether there is lymph Node involvement, and whether there is Metastasis involved (TNM scale). Stages II-IV each have subgroupings of A-C with progressive ratings on the TNM scale. In CRC, stage 0 is the earliest stage, sometimes referred to as “in situ,” as the cancer has not grown outside of the mucosal tissue (TisN0M0). In stage I, the tumor has grown into the submucosa but has not spread to lymph nodes or other sites (T1-2N0M0). In stage II, the tumor has grown into the outermost layer of the colon or rectum but has not yet spread to lymph nodes or distant sites (T3-4N0M0). Stage III is characterized by lymph node or fat involvement with no distant spreading (Tis-4 N1-2 M0), and stage IV involves metastasis to distant organs or peritoneum (TanyNanyM1a-1c).35 Stage I was used as the control group in this study because of the high survival rate and low tumor involvement, and at the time of data acquisition, no stage 0 samples were available.
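For readers who think in terms of data rules, the stage groupings described above can be written as a small Python function. This is a deliberately simplified sketch that follows only the summary in this section: it ignores the A-C subgroups, uses a hypothetical input encoding, and is not a clinical staging tool.

```python
def crc_stage(t: str, n: int, m: int) -> str:
    """Map simplified TNM values to a stage group (0, I, II, III, or IV).

    t: tumor category, one of "Tis", "T1", "T2", "T3", "T4"
    n: lymph node category as an integer (0, 1, or 2)
    m: metastasis category as an integer (0 or 1)
    """
    if m >= 1:
        return "IV"    # any T, any N, M1
    if n >= 1:
        return "III"   # lymph node involvement, no distant spread
    if t == "Tis":
        return "0"     # in situ
    if t in ("T1", "T2"):
        return "I"
    if t in ("T3", "T4"):
        return "II"
    raise ValueError(f"unrecognized tumor category: {t}")


print(crc_stage("T2", 0, 0))  # I
print(crc_stage("T3", 1, 0))  # III
print(crc_stage("T1", 0, 1))  # IV
```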
Step 2—Download and Analyze Differentially Methylated Region
Using an R package called TCGAbiolinks, DMRs were identified by calculating the differences between the mean methylation of each group. This package provided the platform for DMR analysis by CpG site identifier as well as statistically significant DMRs by gene coordinates.34 This process allowed for the identification of individual CpG sites that were contained within the DMRs. Differentially methylated regions were aligned to genomic location to determine significant regions using Wilcoxon tests adjusted with the Benjamini-Hochberg false discovery rate. Limma, another R package, was used to calculate P values for all the CpG sites.36 The design matrix for Limma allowed for P value adjustment for the identified covariates of weight, age, race, and sex by cancer stage, using linear modeling per CpG site followed by a Bayesian correction on each site. The individual CpG site P values provided by Limma were then annotated to the CpG site data provided by TCGAbiolinks. When determining statistical significance for the DMRs, the CpG site with the lowest adjusted P value from the Limma results was used as the P value for the DMRs. This provided a more robust analysis and annotation of the DMR.
The Cancer Genome Atlas provides β values for each CpG site, with values ranging from 0 to 1 (completely unmethylated to completely methylated).37 TCGAbiolinks offers calculation of the difference between two groups using Wilcoxon and adjusting using the Benjamini-Hochberg method. Differentially methylated regions were then filtered to include only those with at least three CpGs and a P < .05 per group. Further classification was performed to group the DMRs into categories of 5%, 10%, and 15% methylation differences between the two cancer groups for analysis.
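The mean-difference and threshold steps can be illustrated with a short Python sketch. It does not reproduce TCGAbiolinks; the β values are invented, the function names are hypothetical, and treating each category as "at least" the stated percentage is an assumption.

```python
from statistics import mean
from typing import Sequence


def methylation_difference(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Difference in mean beta value (group_a minus group_b) for one CpG site or region."""
    return mean(group_a) - mean(group_b)


def classify_difference(delta: float, thresholds=(0.15, 0.10, 0.05)) -> str:
    """Label a mean beta difference by direction and by the largest threshold it reaches.

    A positive delta means group_a is more methylated (hyper); a negative
    delta means it is less methylated (hypo).
    """
    direction = "hyper" if delta > 0 else "hypo"
    for threshold in thresholds:
        if abs(delta) >= threshold:
            return f"{direction} >= {round(threshold * 100)}%"
    return "below 5%"


# Invented beta values (0 = unmethylated, 1 = fully methylated).
stage_iv = [0.20, 0.30, 0.25]
stage_i = [0.40, 0.50, 0.45]
delta = methylation_difference(stage_iv, stage_i)
print(round(delta, 2), classify_difference(delta))
# -0.2 hypo >= 15%
```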
Steps 3 and 4—Annotate Differentially Methylated Region, Gene Ontology, and Function Analysis
To determine genes of interest, the significant DMR coordinates from each comparison group were then annotated to protein-coding genes using the University of California Santa Cruz Genome browser and Human Genome browser.38 This process provided a list of genes per pair of comparison groups (IV to I, III to I, II to I, IV to III, IV to II, and III to II) for each methylation threshold: 5%, 10%, and 15%. To obtain genes that were found only in specific groups, the list of DMRs was entered into a program called Venny,39 which provides comparative lists and a Venn-type output with unique items in each group. Figure 2 shows the result output from Venny. The unique genes from each group were linked back to the significant DMRs to determine significant genes from each paired group, and genes that overlapped groups were not considered significant for this study. Gene ontology was performed on each paired group list, to determine the overall biological impact and significant genes for each comparison group using the DAVID (Database For Annotation Visualization and Integrated Discovery) database.40 Significance was determined based on a Benjamini-adjusted P < .05.
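The comparison that a Venn-type tool performs can be approximated with set operations. The Python sketch below is an illustration only; the group labels follow the comparisons named above, but the gene identifiers are placeholders rather than study results.

```python
from typing import Dict, Set


def unique_per_group(groups: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each comparison group, keep only the items found in no other group."""
    unique = {}
    for name, genes in groups.items():
        others = set().union(
            *(other_genes for other, other_genes in groups.items() if other != name)
        )
        unique[name] = set(genes) - others
    return unique


# Placeholder gene lists for three stage comparisons.
gene_lists = {
    "II vs I": {"GENE_A", "GENE_B", "GENE_C"},
    "III vs I": {"GENE_B", "GENE_D"},
    "IV vs I": {"GENE_C", "GENE_E", "GENE_F"},
}
for group, genes in unique_per_group(gene_lists).items():
    print(group, sorted(genes))
# II vs I ['GENE_A']
# III vs I ['GENE_D']
# IV vs I ['GENE_E', 'GENE_F']
```

Genes shared by two or more comparisons (GENE_B and GENE_C here) drop out, matching the rule that overlapping genes were not considered significant.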
FIGURE 2.

Venn diagram showing significant (P < .05) overlapping DMR with >5% (left) and >10% (right) methylation (hypo and hyper together) change between CC stages II-IV versus stage I (yellow: stage II vs I, pink: stage III vs I, blue: stage IV vs I).
Step 5—Report Results
It is important to note that the focus of this article is to explain the methodology that was used during the process of conducting epigenetic research. Selected study results are discussed to illustrate how the results of this type of study can be reported. Not all results obtained from the study are reported. The study used to demonstrate the methodology was originally conducted in 2018; thus, some of the scientific references are outdated. Augusta University institutional review board determined that the study was not human subject research prior to initiating the study.
Participants
We included 120 subjects by randomly sampling 30 individuals each with four stages of CC who had age, weight, sex, and race documented. Data included Illumina 450K array methylation information from primary solid tumor tissue for all 120 subjects, covering >470 000 CpG sites per subject. The sample participants with stage I CC showed a high probability of survival, significantly decreasing as the CC stages increased. In the study sample, there were no stage I cases with days to death recorded, and the mean survival for individuals with stage II CC was 1615 days (±993; average, <5 years). Individuals with stage III CC had a mean survival of 513 days (±305 days), and individuals with stage IV CC had a mean survival of 709 days (±637 days). The combined survival for all CC cases was a mean of 781 ± 695 days (P = .019).
Differentially Methylated Region Analysis by Stage
Differentially methylated region analysis was conducted by grouping the patients per cancer stages. For a more complete analysis, a full six-way analysis including linear regression was performed. Table 2 shows the results of the overall DMR analysis, reported by hypermethylated and hypomethylated DMRs and by percent methylation differences in between-stage comparisons. Each stage was analyzed for significant DMRs, as determined by a Bayesian-corrected P < .05.
Table 2.
Number and Percent of Total by Stage of Hypermethylated and Hypomethylated DMR Comparison at 5%, 10% and 15% Methylation Difference Between Stages
| Methylation difference | DMR type | II vs I | III vs I | IV vs I | III vs II | IV vs II | IV vs III |
|---|---|---|---|---|---|---|---|
| 5% | Hypo | 157 (1.16) | 161 (1.53) | 1691 (12.43) | 156 (1.96) | 1898 (13.11) | 670 (4.07) |
| 5% | Hyper | 126 (0.93) | 24 (0.23) | 8 (0.06) | 58 (0.73) | 11 (0.08) | 3 (0.02) |
| 10% | Hypo | 4 (0.03) | 8 (0.08) | 188 (1.38) | 2 (0.03) | 66 (0.46) | 18 (0.11) |
| 10% | Hyper | 1 (0.01) | 2 (0.02) | 1 (0.01) | 2 (0.03) | 1 (0.01) | 0 (0) |
Abbreviations: Hyper, hypermethylated DMR; Hypo, hypomethylated DMR; vs, versus (comparison by stage).
All values with P < .05. There were 26 940 total DMRs identified: 13 543 in II versus I, 10 513 in III versus I, 13 603 in IV versus I, 7792 in III versus II, 14 479 in IV versus II and 16 449 in IV versus III; all with P < .05.
When compared with stage I, stage II revealed a fairly even distribution of hypermethylated and hypomethylated DMRs at 5% methylation difference and no DMRs with greater than 10% methylation difference. Stage III, when compared with stage I, showed a significant reduction in 5% hypermethylated DMRs and a similar number of 5% hypomethylated DMRs as stage II (161 vs 157), again, with no DMRs having greater than 10% methylation change. When comparing stage IV with stage I, there were both a significant reduction in 5% hypermethylated DMRs and a significant increase in 5% hypomethylated DMRs. Overall, the trend among DMRs was toward hypomethylation as the cancer stage progressed. This is consistent with findings that cancer overall is a process of global hypomethylation as the disease progresses.41 Table 3 shows the results of the top methylated DMRs from each comparison group annotated back to a specific gene. For brevity, only the top DMRs from each group are reported.
Table 3.
Top Methylated DMRs With ≥10% Methylation Difference and P < .05 by Stage With Gene Annotation
| Direction | Stage | Gene | # CpGs | DMR Start | DMR End | DMR Region | Str | Dis. to TSS | Methylation % | P | Gene Name or Role |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Hypomethylated | II to I | ZFP28 | 8 | 56 538 306 | 56 539 277 | Promoter | + | −32 | 15.8 | .030 | Transcription regulator |
| Hypomethylated | III to I | SATB2 | 8 | 199 470 774 | 199 471 690 | CDS | − | 234 | 14.1 | .047 | Transcription regulator |
| Hypomethylated | IV to I | SLC6A15 | 4 | 84 911 884 | 84 912 097 | Intron | − | 635 | 17.5 | .004 | Amino acid transport |
| Hypomethylated | IV to I | SLC6A15 | 4 | 84 911 421 | 84 911 643 | Intron | − | 1248 | 15.6 | .038 | Amino acid transport |
| Hypomethylated | III to II | CDH22 | 4 | 46 174 425 | 46 175 023 | Promoter | − | 133 505 | 11.1 | .027 | Cell adhesion |
| Hypomethylated | IV to II | RIMS4 | 6 | 44 810 096 | 44 810 905 | Promoter | − | 104 | 16.3 | .014 | Membrane regulator |
| Hypomethylated | IV to III | ZNF665 | 7 | 53 192 776 | 53 193 397 | CDS | − | −13 | 11.8 | .005 | Transcription regulator |
| Hypermethylated | II to I | CHRNA4 | 3 | 63 347 973 | 63 348 221 | Intron | − | 13 094 | 10.2 | < .001 | Cholinergic receptor nicotinic alpha 4 subunit |
| Hypermethylated | III to I | PITX1 | 8 | 135 031 223 | 135 031 748 | Promoter | − | 1912 | 10.4 | .027 | Transcription regulator |
| Hypermethylated | III to II | NRTN | 5 | 5 827 887 | 5 828 394 | Promoter | + | 2495 | 10.5 | .012 | Neurturin |
| Hypermethylated | IV to II | ALOX5 | 8 | 45 418 926 | 45 419 435 | Intron | + | 44 600 | 14.3 | .008 | Inflammatory process |
Abbreviations: #CpGs, number of differentially methylated CpG sites that are in the DMR; Dis to TSS, distance to transcription start site for closest CpG site; Methylation % is negative value for all entries; Str, DNA strand.
P is adjusted P value. Stage: Comparison where methylation difference occurs. Differentially methylated region in red can be linked through annotation to cancer pathways and/or functions. Differentially methylated regions in bold font have significant methylation in more than one stage comparison. Groups not listed contained no significant DMR with gene annotations.
Gene Function Analysis and Gene Ontology
To explain the methodology, one gene from the hypermethylated DMRs and one from the hypomethylated DMRs group were selected for reporting. Box-and-whisker plots (Figure 3) show overall methylation changes among CC groups. Stages with significant P values are indicated in the plot graph.
FIGURE 3.

Sample gene methylation box-and-whisker plots. Left gene showing overall hypermethylation, right gene showing overall hypomethylation.
ALOX5 is an oncogene, and it is upregulated (increased expression) in CC, where it enhances cell proliferation and survival.42 It is suggested that inhibition of expression of ALOX5 might be valuable in the prevention and treatment of CRC, and elevated expression in CRC correlates with tumor aggressiveness.43 Our study revealed significant hypermethylated DMRs between stages IV and II (P = .006), with no other statistically significant methylation occurring associated with the ALOX5 gene, possibly indicating a reduced expression. Further study into the lifestyle and epigenetic changes leading to methylation changes and the rationale for hypermethylation of the DMRs on this oncogene at later stages of CC could provide valuable information.
CDH22 is an oncogenic member of the cadherin family and is highly expressed in the pituitary gland and the brain.44 A previous study found that CDH22 was overexpressed in CC and lymphatic metastasis of CC when compared with normal colon tissue and that CDH22 knockdown (decreased expression) inhibited CRC tumor metastasis and was involved in cancer metastasis through cell migration, invasion, and adhesion.45,46 This validates the DMR methylation data found in this pilot study, where CDH22 had DMR with >10% hypomethylation between stages II and III progression, as well as between stages II and IV progression.
Gene ontology revealed molecular functions, processes, and pathways associated with cancer-related functionality in most cases. Table 4 shows the top ontology result output from the hypomethylated genes. The functions and processes that were significantly impacted relate to cancer functions, where the pathway is related to metabolism for cellular energy. For brevity, the list was truncated to present only the top process with the lowest Benjamini-corrected P value. Hypermethylated GO revealed no significant results between any stages of CC.
Table 4.
Top Gene Ontology From Hypomethylated DMR With Methylation Change >5% and P < .05 by Stage
| Group | Category | Term | No. Genesᵃ | P | Benjaminiᵇ |
|---|---|---|---|---|---|
| III to I | GO molecular functions | GO:0005509, calcium ion binding | 32 | 3.14E-15 | 5.84E-13 |
| III to I | GO biological processes | GO:0007156, homophilic cell adhesion via plasma membrane adhesion molecules | 27 | 8.48E-27 | 6.39E-24 |
| IV to I | GO pathways | hsa04974, protein digestion and absorption | 16 | 6.78E-05 | 0.005 |
| IV to I | GO molecular functions | GO:0043565, sequence-specific DNA binding | 90 | 2.00E-21 | 1.84E-18 |
| IV to I | GO biological processes | GO:0045944, positive regulation of transcription from RNA polymerase II promoter | 119 | 6.34E-14 | 2.17E-10 |
Only groups compared with control listed. Groups not listed contained no significant ontology analyses. Items in red indicate annotation to cancer-related processes. Terms in bold font appear in more than one stage comparison.
ᵃ Number of genes from gene list appearing in ontology process.
ᵇ Benjamini-corrected P value.
DISCUSSION
Methylation is a process that occurs naturally as a function of DNA transcription, influenced by external environmental factors and internal disease processes. Both hypomethylation and hypermethylation, depending on the nature of genes, can cause and exacerbate disease processes. By analyzing the methylation processes associated with cancer stages, we were able to isolate epigenetic changes associated with cancer-related processes. DNA methylation analysis gives nursing informatics and informatics nurse specialists (INSs) an opportunity to offer new insights into identifying biomarkers for CC diagnosis, progression, prognosis, and therapeutics, which may make cancer prevention more feasible. Known standard levels of methylation could lead to recommended alternative treatments to enhance precision medicine by improving survival in CC.
Cancer is a disease process that could be prevented by effectively regulating the environment to provide ideal conditions to slow down its growth and spreading. By analyzing the methylation patterns associated with each stage of CC, we hope to develop methods to better understand the mechanisms of cancer pathogenesis and provide insight into future treatment and detection of CC. By detecting CC at an early stage, patients have a greater chance for survival and may have fewer sequelae from the disease process. In the event it is not detected early, understanding the methylation processes involved in the progression and spread of CC can provide mechanisms to work with patients to minimize or potentially reverse the cancer with methylation changes being caused by the disease process.
Lifestyle and environmental exposures play a significant role in methylation processes. Modification of one methylation risk factor may reduce the spread of CC, and full analysis may reveal more efficient and effective treatment options for improved outcomes from CC. The INS is positioned to contribute to interdisciplinary epigenetic research that can eventually be used to help patients optimize lifestyle choices to prevent cancer progression.
In the second edition of the Scope and Standards of Nursing Informatics practice, the American Nurses Association noted that because of advances in genetic mapping and clinical decision support, INSs need to increase their knowledge of genomics and genomic research to support the expanding practice.47 This position changed, and the third edition only notes that INSs need to consider the Genetic Information Nondiscrimination Act as a regulatory requirement.48 However, the methods and workflow associated with differential methylation research, as well as how the research is reported, provide INSs with resources to build a strengthened foundation for patient-centered data reporting and recording. These methods are a valuable resource for INSs as they start to collaborate in interdisciplinary epigenetic research that will improve patient care. Contributions to improved healthcare can be accomplished through the incorporation of epigenetic research findings into the electronic health record, which INSs can guide as patient-centered precision healthcare emerges.
Acknowledgments
The authors thank Dr Shaoyong Su, Georgia Prevention Institute, for his original assistance with the pilot study, which has been used to illustrate the methodology described in this article. The results discussed here are based on data generated from the TCGA Research Network: http://cancergenome.nih.gov/.
Footnotes
John J. Milner ORCID: orcid.org/0000-0003-1988-3344
Julie K. Zadinsky ORCID: orcid.org/0000-0002-1158-8300
S. Pamela K. Shiao ORCID: orcid.org/0000-0002-9714-8372
The authors have disclosed that they have no significant relationships with, or financial interest in, any commercial companies pertaining to this article.
Contributor Information
Julie K. Zadinsky, Email: jzadinsky@augusta.edu.
S. Pamela K. Shiao, Email: sypshiao@gmail.com.
References
- 1. Milner JJ, Zadinsky JK. Nursing informatics and epigenetics: an interdisciplinary approach to patient-focused research. CIN: Computers, Informatics, Nursing. 2022;40(8): 515–520. 10.1097/cin.0000000000000922.
- 2. Dupont C, Armant DR, Brenner CA. Epigenetics: definition, mechanisms and clinical perspective. Seminars in Reproductive Medicine. 2009;27(5): 351–357. 10.1055/s-0029-1237423.
- 3. Jones PA, Baylin SB. The epigenomics of cancer. Cell. 2007;128(4): 683–692. 10.1016/j.cell.2007.01.029.
- 4. Khoury MJ. Precision medicine vs preventive medicine. Journal of the American Medical Association. 2019;321(4): 406. 10.1001/jama.2018.18636.
- 5. Costello JF, Plass C. Methylation matters. Journal of Medical Genetics. 2001;38(5): 285–303. 10.1136/jmg.38.5.285.
- 6. Kurdyukov S, Bullock M. DNA methylation analysis: choosing the right method. Biology (Basel). 2016;5(1): 3. 10.3390/biology5010003.
- 7. You JS, Jones PA. Cancer genetics and epigenetics: two sides of the same coin? Cancer Cell. 2012;22(1): 9–20. 10.1016/j.ccr.2012.06.008.
- 8. Cancer facts and figures 2022. American Cancer Society; 2022.
- 9. Siegel RL, Miller KD, Goding Sauer A, et al. Colorectal cancer statistics, 2020. CA: A Cancer Journal for Clinicians. 2020;70(3): 145–164. 10.3322/caac.21601.
- 10. American Cancer Society. Survival rates for colorectal cancer, by stage. 2018. Updated February 21, 2018. https://www.cancer.org/cancer/colon-rectal-cancer/detection-diagnosis-staging/survival-rates.html
- 11. Islami F, Goding Sauer A, Miller KD, et al. Proportion and number of cancer cases and deaths attributable to potentially modifiable risk factors in the United States. CA: A Cancer Journal for Clinicians. 2018;68(1): 31–54. 10.3322/caac.21440.
- 12. Deaton AM, Bird A. CpG islands and the regulation of transcription. Genes & Development. 2011;25(10): 1010–1022. 10.1101/gad.2037511.
- 13. Griffiths AJF, Wessler SR, Carroll SB, Doebley J. Introduction to Genetic Analysis. 11th ed. W. H. Freeman; 2015.
- 14. Puccini A, Berger MD, Naseem M, et al. Colorectal cancer: epigenetic alterations and their clinical implications. Biochimica et Biophysica Acta - Reviews on Cancer. 2017;1868(2): 439–448. 10.1016/j.bbcan.2017.09.003.
- 15. Khodadadi E, Fahmideh L, Khodadadi E, et al. Current advances in DNA methylation analysis methods. BioMed Research International. 2021;2021: 8827516. 10.1155/2021/8827516.
- 16. Bock C. Analysing and interpreting DNA methylation data. Nature Reviews Genetics. 2012;13(10): 705–719. 10.1038/nrg3273.
- 17. Field Guide to Methylation Methods. Illumina; 2016. https://www.illumina.com/content/dam/illumina-marketing/documents/products/other/field_guide_methylation.pdf
- 18. Satterlee JS, Beckel-Mitchener A, McAllister K, et al. Community resources and technologies developed through the NIH Roadmap Epigenomics Program. Methods in Molecular Biology. 2015;1238: 27–49. 10.1007/978-1-4939-1804-1_2.
- 19. Stein L. GFF3. 2022. Updated August 18, 2020. https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md
- 20. Davis C. Medical definition of genome annotation. Medterms Medical Dictionary. 2021.
- 21. Biolyse. What is gene annotation in bioinformatics. 2022. Updated November 3, 2018. http://biolyse.ca/what-is-gene-annotation-in-bioinformatics/
- 22. Zerbino DR, Wilder SP, Johnson N, Juettemann T, Flicek PR. The Ensembl regulatory build. Genome Biology. 2015;16(1): 56. 10.1186/s13059-015-0621-5.
- 23. HGNC. About the HGNC. 2022. https://www.genenames.org/about/
- 24. Smith B, Williams J, Schulze-Kremer S. The ontology of the gene ontology. AMIA Annual Symposium Proceedings. 2003;609–613.
- 25. du Plessis L, Skunca N, Dessimoz C. The what, where, how and why of gene ontology: a primer for bioinformaticians. Briefings in Bioinformatics. 2011;12(6): 723–735. 10.1093/bib/bbr002.
- 26. Hill DP, Smith B, McAndrews-Hill MS, Blake JA. Gene ontology annotations: what they mean and where they come from. BMC Bioinformatics. 2008;9(suppl 5): S2. 10.1186/1471-2105-9-S5-S2.
- 27. Tanha K, Mohammadi N, Janani L. P-value: what is and what is not. Medical Journal of the Islamic Republic of Iran. 2017;31: 65–65. 10.14196/mjiri.31.65.
- 28. Teng M, Wang Y, Kim S, et al. Empirical Bayes model comparisons for differential methylation analysis. Comparative and Functional Genomics. 2012;2012: 376706. 10.1155/2012/376706.
- 29. Benjamini Y. Discovering the false discovery rate. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 2010;72(4): 405–416. 10.1111/j.1467-9868.2010.00746.x.
- 30. Reddy TE. Chapter 2: the functional genome: epigenetics and epigenomics. In: Ginsburg GS, Willard HF, eds. Genomic and Precision Medicine. 3rd ed. Silver Spring, MD: Academic Press; 2017: 21–44.
- 31. Draht MXG, Goudkade D, Koch A, et al. Prognostic DNA methylation markers for sporadic colorectal cancer: a systematic review. Clinical Epigenetics. 2018;10: 35. 10.1186/s13148-018-0461-8.
- 32. Mahasneh A, Al-Shaheri F, Jamal E. Molecular biomarkers for an early diagnosis, effective treatment and prognosis of colorectal cancer: current updates. Experimental and Molecular Pathology. 2017;102(3): 475–483. 10.1016/j.yexmp.2017.05.005.
- 33. Chen JJ, Wang AQ, Chen QQ. DNA methylation assay for colorectal carcinoma. Cancer Biology & Medicine. 2017;14(1): 42–49. 10.20892/j.issn.2095-3941.2016.0082.
- 34. Silva TC, Colaprico A, Olsen C, et al. TCGAbiolinksGUI: a graphical user interface to analyze cancer molecular and clinical data. F1000Research. 2018;7(439): 1–16. 10.12688/f1000research.14197.1.
- 35. 2018 Cancer Staging Manual. 7th ed. Springer-Verlag; 2018.
- 36. Ritchie ME, Phipson B, Wu D, et al. Limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research. 2015;43(7): e47. 10.1093/nar/gkv007.
- 37. NCI. Genomic Data Commons Users Guide. 2018. GDC Web site. https://docs.gdc.cancer.gov/Data_Dictionary/
- 38. Karolchik D, Hinrichs AS, Furey TS, et al. The UCSC table browser data retrieval tool. Nucleic Acids Research. 2004;32(database issue): D493–D496. 10.1093/nar/gkh103.
- 39. Oliveros JC. Venny: an interactive tool for comparing lists with Venn's diagrams. Updated 2015. http://bioinfogp.cnb.csic.es/tools/venny/index.html
- 40. Huang da W, Sherman BT, Lempicki RA. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Research. 2009;37(1): 1–13. 10.1093/nar/gkn923.
- 41. Bardhan K, Liu K. Epigenetics and colorectal cancer pathogenesis. Cancers (Basel). 2013;5(2): 676–713. 10.3390/cancers5020676.
- 42. Kleinstein SE, Heath L, Makar KW, et al. Genetic variation in the lipoxygenase pathway and risk of colorectal neoplasia. Genes, Chromosomes & Cancer. 2013;52(5): 437–449. 10.1002/gcc.22042.
- 43. Barresi V, Grosso M, Vitarelli E, Tuccari G, Barresi G. 5-lipoxygenase is coexpressed with Cox-2 in sporadic colorectal cancer: a correlation with advanced stage. Diseases of the Colon & Rectum. 2007;50(10): 1576–1584. 10.1007/s10350-007-0311-9.
- 44. Wu J, Jester WF Jr, Laslett AL, Meinhardt A, Orth JM. Expression of a novel factor, short-type PB-cadherin, in Sertoli cells and spermatogenic stem cells of the neonatal rat testis. Journal of Endocrinology. 2003;176(3): 381–391. 10.1677/joe.0.1760381.
- 45. Kelly NJ, Varga JFA, Specker EJ, Romeo CM, Coomber BL, Uniacke J. Hypoxia activates cadherin-22 synthesis via eIF4E2 to drive cancer cell migration, invasion and adhesion. Oncogene. 2018;37(5): 651–662. 10.1038/onc.2017.372.
- 46. Zhou J, Li J, Chen J, Liu Y, Gao W, Ding Y. Over-expression of CDH22 is associated with tumor progression in colorectal cancer. Tumour Biology. 2009;30(3): 130–140. 10.1159/000225242.
- 47. American Nurses Association. Nursing Informatics: Scope and Standards of Practice. 2nd ed. American Nurses Association; 2014.
- 48. American Nurses Association. Nursing Informatics: Scope and Standards of Practice. 3rd ed. American Nurses Association; 2022.
