Abstract
An accurate assessment of the cervical lymph node metastasis status in oral cavity cancer not only helps predict the prognosis of patients, but also helps surgeons to select the appropriate treatment. We used microarray technology to investigate the differences in gene expression profiles between primary tumors of oral squamous cell carcinoma that had metastasized to cervical lymph nodes and those that had not, in the hope of finding new biomarkers for the diagnosis and treatment of oral cavity cancer. For this experiment, we prepared two groups: the learning case group with 30 patients and the test case group with 13 patients. Cancer cells were isolated from all tissue samples by laser captured microdissection, and RNA was isolated from the purified cancer cells. To identify a predictive gene expression signature, gene expression was compared between the groups with and without metastasis in the learning case (n = 30), and 85 differentially expressed genes were selected. Subsequently, to construct a more accurate prediction model, we further selected the genes with a high power for prediction from the 85 genes using the AdaBoost algorithm. The eight candidate genes, DCTD, IL‐15, THBD, GSDML, SH3GL3, PTHLH, RP5‐1022P6 and C9orf46, were selected to achieve the minimum error rate. Quantitative reverse transcription–polymerase chain reaction was carried out to validate the selected genes. A prediction model comprising the eight genes was then constructed and evaluated using the test case group. The model predicted 12 of 13 cases (∼92.3%) correctly. (Cancer Sci 2007; 98: 740–746)
In 2005, 335 870 Japanese people died from cancer. Of these, 5679 people had oral cavity cancer (http://www.mhlw.go.jp/toukei/saikin/hw/jinkou/suikei05/index.html) and the major cause of death by cancer was metastasis. An accurate assessment of the cervical lymph node metastasis status in oral cavity cancer not only helps predict the prognosis of patients, but also helps surgeons to carry out the appropriate treatment. When the disease is localized, surgical procedures can be used to remove the tumor in its entirety. For patients who are diagnosed clinically as cervical lymph node metastasis‐positive (N+), a surgical procedure, known as radical neck dissection (RND), is used to remove all lymph node groups from levels I, II, III, IV and V, which involves the sacrifice of the internal jugular vein, sternocleidomastoid muscle and spinal accessory nerve.
The clinical staging of cervical lymph nodes is carried out by clinical examination of the neck region or by ultrasound, computed tomography and magnetic resonance imaging. However, the sensitivity of these methods is still limited. Post‐operative histological examination shows that approximately 30% of clinically diagnosed metastasis‐negative (N0) patients have metastasis‐positive lymph nodes in the neck,( 1 ) and 10–20% of clinically diagnosed metastasis‐positive (N+) patients turn out to be metastasis‐free. Because the false‐negative rate is high in clinically diagnosed metastasis‐negative (N0) patients, most surgeons are reluctant to adopt a ‘wait and watch’ policy, because it may allow metastasis to spread further. Thus, surgeons usually carry out a supraomohyoid neck dissection (SOHND) to remove lymph nodes at levels I, II and III to screen for metastasis. Although SOHND is not as stringent as RND and the technique of neck dissection has been perfected over the last century, surgeons still face minor and major complications during the surgical procedure,( 2 ) as well as sequelae such as chronic pain and limitation of shoulder movement due to a weak trapezius muscle.
Metastasis is a very complicated process. To metastasize, the cancer cells must break away from the tumor, increase their mobility and move through the extracellular matrix. Next they must invade the lymph vessels and grow in the lymph nodes or invade blood vessels and travel in the circulatory system. They then can pass through the vessel walls into surrounding tissue (distant metastasis). We think that the original genes controlling this process and the gene products of this process may be used as predictive markers of cervical lymph node metastasis. In the present study, microarray technology was used to investigate the differences in gene expression profiles between primary tumors of oral squamous cell carcinoma (OSCC) that metastasized to cervical lymph nodes and those that did not metastasize, in an effort to find new biomarkers that will provide more accurate diagnosis and more appropriate treatment for OSCC.
Materials and Methods
Tumor samples. All of the primary oral cancer specimens were obtained from anonymous, previously untreated patients at the Faculty of Dentistry, Tokyo Medical and Dental University, and were confirmed as squamous cell carcinoma of the oral cavity by histopathology. Informed consent was obtained from all of the patients. The use of all clinical materials was approved by the ethics committee. The samples were embedded in Tissue‐Tek OCT Compound (Sakura Finetek USA) and stored at −80°C until use. The samples were divided into a metastasis group and a non‐metastasis group based on clinical diagnosis and histological examination. For the learning case, 30 samples were prepared (Table 1), including 13 samples from patients who were found to be N+ in the cervical lymph node and 17 samples from patients who were found to be N0 in the cervical lymph node. Those that remained metastasis‐free were monitored for at least 1 year after the primary tumor was removed. (The primary tumors were surgically removed between April 2004 and October 2005.) For the test case, 13 samples were prepared (Table 1), including seven samples found to be N+ in the cervical lymph node and six samples found to be N0 in the cervical lymph node. Those remaining metastasis‐free were monitored for at least 6 months after the primary tumor was removed. (The primary tumors were surgically removed between November 2005 and April 2006.) To determine the technical reproducibility, we prepared eight samples from the learning case for a replicate experiment.
Table 1.
Clinical and histological characteristics of individual patients
| Case | Sex | Age (years) | Primary site | TN | Differentiation | Prediction score |
|---|---|---|---|---|---|---|
| Learning case | ||||||
| 1 | M | 56 | Lower gingival | T2N0 | Moderately | −0.791 |
| 2 | M | 62 | Buccal mucosa | T1N0 | Well | −0.032 |
| 3 | M | 55 | Upper gingival | T2N0 | Well | −0.780 |
| 4 | F | 66 | Tongue | T2N0 | Moderately | −0.309 |
| 5 | M | 72 | Hard palate | T1N0 | Moderately | −0.481 |
| 6 | M | 58 | Mouth floor | T2N0 | Well | −0.716 |
| 7 | M | 80 | Tongue | T2N0 | Well | −0.564 |
| 8 | M | 30 | Tongue | T2N0 | Well | −0.609 |
| 9 | F | 81 | Retromolar trigone | T1N0 | Well | −0.507 |
| 10 | F | 60 | Tongue | T2N0 | Well | −0.264 |
| 11 | M | 59 | Tongue | T3N0 | Moderately | −0.481 |
| 12 | M | 68 | Retromolar trigone | T2N0 | Moderately | −0.428 |
| 13 | M | 54 | Tongue | T1N0 | Well | −0.534 |
| 14 | M | 56 | Upper gingival | T2N0 | Well | −0.303 |
| 15 | M | 43 | Tongue | T2N0 | Moderately | −0.716 |
| 16 | M | 46 | Tongue | T2N0 | Well | −0.564 |
| 17 | M | 58 | Tongue | T2N0 | Well | −0.282 |
| 18 | F | 64 | Lower gingival | T4aN1 | Well | 0.348 |
| 19 | M | 66 | Tongue | T1N2b | Moderately | 0.411 |
| 20 | M | 77 | Lower gingival | T2N2b | Poor | 0.577 |
| 21 | F | 74 | Buccal mucosa | T2N1 | Moderately | 0.780 |
| 22 | M | 78 | Lower gingival | T2N1 | Well | 0.499 |
| 23 | M | 71 | Buccal mucosa | T3N2b | Poor | 0.499 |
| 24 | M | 71 | Lower gingival | T3N2b | Well | 0.564 |
| 25 | M | 61 | Tongue | T2N2c | Moderately | 0.318 |
| 26 | M | 60 | Lower gingival | T4N2b | Moderately | 1.000 |
| 27 | M | 57 | Upper gingival | T4aN1 | Poor | 0.571 |
| 28 | M | 70 | Tongue | T3N1 | Moderately | 0.592 |
| 29 | M | 52 | Lower gingival | T4aN2b | Moderately | 0.817 |
| 30 | M | 37 | Tongue | T2N3 | Poor | 0.292 |
| Test case | ||||||
| 1 | M | 66 | Lower gingival | T2N0 | Well | −0.318 |
| 2 | F | 66 | Upper gingival | T2N0 | Moderately | −0.288 |
| 3 | M | 73 | Tongue | T2N0 | Poor | −0.507 |
| 4 | M | 32 | Tongue | T3N0 | Moderately | −0.260 |
| 5 | M | 58 | Mouth floor | T3N0 | Moderately | −0.053 |
| 6 | M | 72 | Lower gingival | T2N0 | Well | −0.165 |
| 7 | M | 66 | Mouth floor | T3N1 | Moderately | 0.162 |
| 8 | M | 68 | Lower gingival | T4aN2b | Moderately | 0.214 |
| 9 | F | 54 | Tongue | T2N2b | Moderately | 0.329 |
| 10 | M | 59 | Mouth floor | T4N1 | Moderately | 0.115 |
| 11 | M | 67 | Mouth floor | T4N1 | Well | 0.143 |
| 12 | M | 60 | Tongue | T4N2c | Moderately | 0.164 |
| 13 | M | 53 | Tongue | T2N2b | Moderately | −0.228 |
T1, tumor ≤2 cm in greatest dimension; T2, tumor >2 cm but ≤4 cm in greatest dimension; T3, tumor >4 cm in greatest dimension; T4, (lip) tumor invades through cortical bone, inferior alveolar nerve, floor of mouth, or skin of face (i.e. chin or nose); T4a, (oral cavity) tumor invades adjacent structures (e.g. through cortical bone, into deep [extrinsic] muscle of tongue [genioglossus, hyoglossus, palatoglossus, and styloglossus], maxillary sinus, and skin of face); T4b, tumor invades masticator space, pterygoid plates, or skull base and/or encases internal carotid artery; N0, no regional lymph node metastasis; N1, metastasis in a single ipsilateral lymph node, ≤3 cm in greatest dimension; N2, metastasis in a single ipsilateral lymph node, >3 cm but ≤6 cm in greatest dimension, or in multiple ipsilateral lymph nodes, ≤6 cm in greatest dimension, or in bilateral or contralateral lymph nodes, ≤6 cm in greatest dimension; N2a, metastasis in a single ipsilateral lymph node >3 cm but ≤6 cm in greatest dimension; N2b, metastasis in multiple ipsilateral lymph nodes, ≤6 cm in greatest dimension; N2c, metastasis in bilateral or contralateral lymph nodes, ≤6 cm in greatest dimension; N3, metastasis in a lymph node >6 cm in greatest dimension.
Laser captured microdissection. All primary tumor specimens were cut into 9‐µm sections at −20°C using a Leica cryostat model 3050S. The sections were mounted on a special slide for use in laser captured microdissection (LCM) and immediately placed at −80°C until use. First, the sections were fixed in cold ethanol for 3 min. They were then washed in deionized water for 30 s, stained with hematoxylin for 40 s, and again washed in deionized water for 30 s. The sections were dried in cold air for 2 or 3 min before the LCM. Squamous cell carcinoma cells were then collected precisely from the hematoxylin‐stained tissue sections by LCM.
RNA isolation and quality assessment. Total RNA was extracted from the harvested cells using the RNeasy Micro Kit of Qiagen, and the concentration was measured using a NanoDrop ND‐100 Spectrophotometer. All RNA was run with RNA 6000 Pico LabChip kits on the Agilent 2100 Bioanalyzer to analyze the quality of total RNA. The total RNA quality was assessed by RNA integrity number (RIN) value,( 3 , 4 ) and the samples with RIN values below 5( 5 ) were not used for the next step.
cRNA amplification and biotin labeling. Total RNA (100 ng) of each sample was used for starting the protocol of Two‐Cycle cDNA Synthesis and labeling of cRNA, following the recommendations of Affymetrix.( 6 ) The yield of biotin‐labeled cRNA was measured using a NanoDrop ND‐100 Spectrophotometer and the quality was analyzed using an Agilent 2100 Bioanalyzer. We removed samples with a yield less than 40 µg or with a median size of biotin‐labeled cRNA fragments less than 500 bp.
Microarray production. The Human Genome U133 Plus 2.0 array was purchased from the Affymetrix company in Japan. The array comprised 1 300 000 distinct oligonucleotides and featured over 47 000 transcripts and variants, including approximately 39 000 of the best‐characterized human genes.
Cocktail solution and microarray hybridization. Before making the cocktail solution, we fragmented 20 µg of biotin‐labeled cRNA into 35–200 base fragments. We then used 15 µg of the fragmented cRNA to make the cocktail solution, which was loaded onto the GeneChip HG U133 Plus 2.0 array and hybridized for 16 h at 45°C. After hybridization, the arrays were washed and stained using the Fluidics Station 450 with protocol EukGE‐WS2v5_450, and were scanned using the Affymetrix GeneChip Scanner 3000.
Statistical analysis. After scanning, the fluorescence intensity was measured using Affymetrix Microarray Suite 5.0 software, and an array was excluded if its report showed a scale factor larger than 6, a 3′/5′ β‐actin ratio larger than 35 or a 3′/5′ glyceraldehyde‐3‐phosphate dehydrogenase ratio larger than 7. For low‐level analysis, the arrays were imported into the RMAExpress software (http://rmaexpress.bmbolstad.com) to perform normalization and compute expression levels using the RMA algorithm,( 7 , 8 ) because the RMA algorithm gave the most reproducible results and showed the highest correlation coefficients with real‐time polymerase chain reaction data.( 9 , 10 ) After the expression levels were calculated, the array data were imported into DNA‐Chip analysis software (http://www.dchip.org) for high‐level analysis. Gene filtering was carried out using the variation‐across‐samples criterion (0.3 < standard deviation/mean < 100). For group comparison, two‐group t‐tests were used with a threshold of P < 0.04, an absolute value of the difference in mean expression between the two groups (Δ) > 100 intensity units and a fold change in mean expression >1.5 or <0.66. In this way, 85 genes (Table 2) were selected. After selecting the 85 genes that showed a difference in expression levels between the two groups, we further selected genes from among them with software using the AdaBoost algorithm.( 11 ) The software was able to autoselect the best gene combination for separating the metastasis group from the non‐metastasis group with the lowest cross validation (CV) error rate. Eight genes (Table 3) with a higher power for prediction were extracted and used to evaluate the 13 samples from the test case.
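The group-comparison criteria described above can be illustrated with a minimal sketch. This is not the dChip implementation: the function names, the toy intensity values and the choice to pass the t‐test P‐value in as a precomputed argument are all assumptions made for illustration, and expression values are assumed to be on the natural intensity scale (as implied by the threshold of 100 intensity units).

```python
from statistics import mean, stdev


def signed_fold_change(mean_met, mean_non):
    """Fold change of the metastasis group relative to the non-metastasis group.

    Downregulated genes are reported as negative values, following the sign
    convention visible in Table 2 (an assumption about how it was produced).
    """
    if mean_met >= mean_non:
        return mean_met / mean_non
    return -(mean_non / mean_met)


def passes_filters(met, non, p_value,
                   cv_low=0.3, cv_high=100.0,
                   p_max=0.04, delta_min=100.0,
                   ratio_up=1.5, ratio_down=0.66):
    """Apply the variation filter and the group-comparison thresholds.

    met, non : expression values of one gene in the two groups
    p_value  : two-group t-test P-value, computed elsewhere (hypothetical input)
    """
    values = list(met) + list(non)
    cv = stdev(values) / mean(values)          # standard deviation / mean
    if not (cv_low < cv < cv_high):
        return False
    if p_value >= p_max:
        return False
    m_met, m_non = mean(met), mean(non)
    if abs(m_met - m_non) <= delta_min:        # difference in mean expression
        return False
    ratio = m_met / m_non
    return ratio > ratio_up or ratio < ratio_down


# Toy values only; these are not data from the study.
up_gene = ([900.0, 1100.0, 1000.0], [300.0, 350.0, 250.0])
print(passes_filters(*up_gene, p_value=0.01))
print(round(signed_fold_change(110.0, 750.0), 2))
```

A gene must satisfy every criterion at once, which is why the two fold-change bounds are combined with "or" while the remaining thresholds are combined with "and".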
Table 2.
The 85 genes related to lymph node metastasis
| Accession no. | Gene symbol | Description | Fold change | P‐value |
|---|---|---|---|---|
| Downregulated genes in the metastasis group | ||||
| NM_000597 | IGFBP2 | Insulin‐like growth factor binding protein 2, 36 kDa | −6.81 | 0.004005 |
| NM_002276 | KRT19 | Keratin 19 | −5.78 | 0.028855 |
| BG401568 | SLC16A9 | Solute carrier family 16 (monocarboxylic acid transporters), member 9 | −3.54 | 0.000823 |
| NM_001387 | DPYSL3 | Dihydropyrimidinase‐like 3 | −3.33 | 0.035334 |
| NM_016140 | CGI‐38 | Brain specific protein /// brain specific protein | −2.49 | 0.019781 |
| NM_001823 | CKB | Creatine kinase, brain | −2.42 | 0.006229 |
| AF288571 | LEF1 | Lymphoid enhancer‐binding factor 1 | −2.35 | 0.033874 |
| NM_002820 | PTHLH | Parathyroid hormone‐like hormone | −2.34 | 0.02038 |
| AW451197 | | CDNA clone IMAGE:5278089 | −2.32 | 0.035398 |
| AI278995 | | Predicted: Homo sapiens similar to B230208J24Rik protein (LOC201501), mRNA | −2.24 | 0.01941 |
| M31157 | PTHLH | Parathyroid hormone‐like hormone | −2.16 | 0.038689 |
| NM_003027 | SH3GL3 | SH3‐domain GRB2‐like 3 | −2.15 | 0.029178 |
| AL567411 | CDK5R1 | Cyclin‐dependent kinase 5, regulatory subunit 1 (p35) | −2.13 | 0.011045 |
| BG434174 | | Stoned B‐like factor | −2.11 | 0.023189 |
| AI522132 | | Hypothetical protein LOC115749 | −2.04 | 0.038373 |
| BC005961 | PTHLH | Parathyroid hormone‐like hormone /// parathyroid hormone‐like hormone | −2.00 | 0.022891 |
| NM_001759 | CCND2 | Cyclin D2 | −1.98 | 0.018311 |
| BG290193 | ZNF703 | Zinc finger protein 703 | −1.93 | 0.037675 |
| AI189753 | TM4SF1 | Transmembrane 4 L six family member 1 | −1.92 | 0.011249 |
| AA143793 | RAB11FIP1 | RAB11 family interacting protein 1 (class I) | −1.89 | 0.013964 |
| BF111651 | PPAPDC1B | Phosphatidic acid phosphatase type 2 domain containing 1B | −1.89 | 0.03133 |
| AL137763 | GRHL3 | Grainyhead‐like 3 (Drosophila) | −1.89 | 0.035038 |
| BC000408 | ACAT2 | Acetyl‐Coenzyme A acetyltransferase 2 (acetoacetyl Coenzyme A thiolase) | −1.88 | 0.006495 |
| AW026491 | CCND2 | Cyclin D2 | −1.87 | 0.032209 |
| AL137629 | KALRN | Kalirin, RhoGEF kinase | −1.82 | 0.031132 |
| BF514585 | SESN3 | Sestrin 3 | −1.77 | 0.021875 |
| M90657 | TM4SF1 | Transmembrane 4 L six family member 1 | −1.76 | 0.011547 |
| AI458128 | CBX6 | Chromobox homolog 6 | −1.73 | 0.018388 |
| BF680438 | LONRF1 | LON peptidase N‐terminal domain and ring finger 1 | −1.70 | 0.01133 |
| AU154469 | SLC11A2 | Solute carrier family 11 (proton‐coupled divalent metal ion transporters), member 2 | −1.69 | 0.018722 |
| BG165333 | CNKSR3 | CNKSR family member 3 | −1.68 | 0.016795 |
| AI654238 | B4GALNT3 | β1,4‐N‐acetylgalactosaminyltransferase‐transferase‐III | −1.63 | 0.018618 |
| AI346835 | TM4SF1 | Transmembrane 4 L six family member 1 | −1.52 | 0.006659 |
| Upregulated genes in the metastasis group | ||||
| BE501952 | SATL1 | Spermidine/spermine N1‐acetyl transferase‐like 1 | 1.51 | 0.015987 |
| AI809870 | SMYD2 | SET and MYND domain containing 2 | 1.54 | 0.003793 |
| NM_017665 | ZCCHC10 | Zinc finger, CCHC domain containing 10 | 1.54 | 0.010698 |
| BF214329 | | Mitochondrial fission regulator 1 | 1.56 | 0.008586 |
| AI123233 | RANBP6 | RAN binding protein 6 | 1.58 | 0.0279 |
| NM_016040 | TMED5 | Transmembrane emp24 protein transport domain containing 5 | 1.59 | 0.015651 |
| NM_001889 | CRYZ | Crystallin, zeta (quinone reductase) | 1.59 | 0.012586 |
| NM_013322 | SNX10 | Sorting nexin 10 | 1.59 | 0.028971 |
| NM_024699 | ZFAND1 | Zinc finger, AN1‐type domain 1 | 1.6 | 0.018314 |
| U13700 | CASP1 | Caspase 1, apoptosis‐related cysteine peptidase (interleukin 1, beta, convertase) | 1.61 | 0.02315 |
| AW157773 | ZFP62 | Zinc finger protein 62 homolog (mouse) | 1.61 | 0.016554 |
| NM_016283 | TAF9 | TAF9 RNA polymerase II, TATA box binding protein (TBP)‐associated factor, 32 kDa | 1.62 | 0.012154 |
| BF439522 | MGC23909 | Hypothetical protein MGC23909 | 1.63 | 0.017789 |
| NM_003187 | TAF9 | TAF9 RNA polymerase II, TATA box binding protein (TBP)‐associated factor, 32 kDa | 1.65 | 0.006175 |
| NM_014873 | LPGAT1 | Lysophosphatidylglycerol acyltransferase 1 | 1.65 | 0.010458 |
| AK001947 | RP5‐1022P6.2 | Hypothetical protein KIAA1434 | 1.65 | 0.026428 |
| NM_016576 | GMPR2 | Guanosine monophosphate reductase 2 | 1.66 | 0.039452 |
| NM_024430 | PSTPIP2 | Proline‐serine‐threonine phosphatase interacting protein 2 | 1.67 | 0.016465 |
| AW612657 | LYPLAL1 | Lysophospholipase‐like 1 | 1.67 | 0.021165 |
| L12723 | HSPA4 | Heat shock 70 kDa protein 4 | 1.69 | 0.006059 |
| BC001025 | RCL1 | RNA terminal phosphate cyclase‐like 1 | 1.71 | 0.007725 |
| AI042152 | TncRNA | Trophoblast‐derived noncoding RNA | 1.71 | 0.02217 |
| AW962511 | FLJ22531 | Hypothetical protein FLJ22531 | 1.73 | 0.036123 |
| AI634046 | CFLAR | CASP8 and FADD‐like apoptosis regulator | 1.74 | 0.03599 |
| AF183569 | ARTS‐1 | Type 1 tumor necrosis factor receptor shedding aminopeptidase regulator | 1.76 | 0.009405 |
| AI805560 | ZMYM6 | Zinc finger, MYM‐type 6 | 1.77 | 0.01367 |
| AI347128 | IGBP1 | Immunoglobulin (CD79A) binding protein 1 | 1.77 | 0.001289 |
| NM_002198 | IRF1 | Interferon regulatory factor 1 | 1.8 | 0.016707 |
| NM_000361 | THBD | Thrombomodulin | 1.81 | 0.007352 |
| NM_000361 | THBD | Thrombomodulin | 1.84 | 0.007352 |
| NM_012485 | HMMR | Hyaluronan‐mediated motility receptor (RHAMM) | 1.84 | 0.009623 |
| BF735901 | NUDCD2 | NudC domain containing 2 | 1.86 | 0.00967 |
| AL559202 | | Full‐length cDNA clone CS0DF034YI03 of fetal brain of Homo sapiens (human) | 1.86 | 0.009237 |
| AW973232 | | gb:AW973232 /DB_XREF=gi:8163078 /DB_XREF=EST385330 /FEA=EST /CNT=5 /TID=Hs.293553.0 /TIER=ConsEnd /STK=0 /UG=Hs.293553 /UG_TITLE=ESTs | 1.89 | 0.008534 |
| NM_004120 | GBP2 | Guanylate binding protein 2, interferon‐inducible /// guanylate binding protein 2, interferon‐inducible | 1.89 | 0.029524 |
| U29343 | HMMR | Hyaluronan‐mediated motility receptor (RHAMM) | 1.92 | 0.004741 |
| AW119113 | THBD | Thrombomodulin | 1.92 | 0.010555 |
| NM_018530 | GSDML | Gasdermin‐like | 1.95 | 0.027119 |
| NM_014349 | APOL3 | Apolipoprotein L, 3 | 1.95 | 0.018873 |
| AI224133 | | Transcribed locus, weakly similar to XP_517454.1 PREDICTED: similar to hypothetical protein MGC45438[Pan troglodytes] | 1.96 | 0.039636 |
| AI928035 | IRX2 | Iroquois homeobox protein 2 | 1.99 | 0.022852 |
| NM_018465 | C9orf46 | Chromosome 9 open reading frame 46 | 1.99 | 0.014382 |
| AW003140 | | mRNA; cDNA DKFZp686K1098 (from clone DKFZp686K1098) | 2.04 | 0.025115 |
| AW613387 | | Endothelial cell growth factor 1 (platelet‐derived) | 2.09 | 0.013803 |
| NM_001657 | AREG | Amphiregulin (schwannoma‐derived growth factor) | 2.10 | 0.013069 |
| BC005254 | CLEC2B | C‐type lectin domain family 2, member B | 2.11 | 0.024784 |
| AI656493 | DCTD | dCMP deaminase | 2.17 | 0.004514 |
| NM_005415 | SLC20A1 | Solute carrier family 20 (phosphate transporter), member 1 | 2.17 | 0.022575 |
| NM_004815 | ARHGAP29 | Rho GTPase activating protein 29 | 2.27 | 0.009215 |
| AA976354 | KIAA1618 | KIAA1618 | 2.61 | 0.017377 |
| NM_000585 | IL15 | Interleukin 15 | 2.80 | 0.00711 |
| AI539443 | STAT1 | Signal transducer and activator of transcription 1, 91 kDa | 3.00 | 0.033578 |
Table 3.
Genes selected for the prediction model
| Accession | Gene symbol | Description | Fold change | P‐value |
|---|---|---|---|---|
| AI656493 | DCTD | dCMP deaminase | 2.17 | 0.004514 |
| NM_000585 | IL15 | Interleukin 15 | 2.80 | 0.00711 |
| AW119113 | THBD | Thrombomodulin | 1.92 | 0.010555 |
| NM_018530 | GSDML | Gasdermin‐like | 1.92 | 0.027119 |
| NM_003027 | SH3GL3 | SH3‐domain GRB2‐like 3 | −2.15 | 0.029178 |
| BC005961 | PTHLH | Parathyroid hormone‐like hormone | −2.00 | 0.022891 |
| BE328402 | RP5‐1022P6 | Hypothetical protein KIAA1434 | 1.92 | 0.020426 |
| NM_018465 | C9orf46 | Chromosome 9 open reading frame 46 | 1.99 | 0.014382 |
Quantitative reverse transcription–polymerase chain reaction analysis. Quantitative reverse transcription–polymerase chain reaction (RT‐PCR)( 12 ) was used to validate the results for the eight meaningfully expressed genes from the analyzed microarray data. For each sample, 100 ng of original total RNA was used to synthesize the first strand of cDNA by reverse transcriptase using oligo dT primer following the protocol recommended by Invitrogen (Superscript III First‐Strand Synthesis System for RT‐PCR). Primer sets for quantitative RT‐PCR (Table 4) were designed using PRIMER 3 software (http://www.genome.wi.mit.edu) and were synthesized by the Sigma Corporation. The PCR reaction was carried out using an ABI Prism 7900 Sequence Detection system with Power SYBR Green Master Mix (15 µL Power SYBR Green Master Mix, 0.3 µL of each primer at 5 µM, 5 µL cDNA, 9.4 µL water). For each sample, reactions were carried out in triplicate following the program: denaturation for 15 s at 95°C, and annealing and extension for 60 s at 60°C. Cumulative fluorescence was measured at the end of the extension phase of each cycle. Quantification was based on standard curves from serial dilution of human normal total RNA purchased from Stratagene Corporation. The results were normalized to actin, and then compared with the microarray data for the eight genes.
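Standard-curve quantification followed by normalization to actin can be sketched as follows. This is an illustrative sketch under stated assumptions, not the instrument software's procedure: the function names and the dilution values are hypothetical, the curve is assumed to be a straight line of threshold cycle (Ct) against log10 of input quantity, and triplicate Ct values are assumed to be averaged before conversion.

```python
import math
from statistics import mean


def fit_standard_curve(quantities, cts):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    mx, my = mean(xs), mean(cts)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to obtain a quantity from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


def relative_expression(target_cts, actin_cts, target_curve, actin_curve):
    """Average triplicate Ct values, convert to quantities, divide by actin."""
    target_qty = quantity_from_ct(mean(target_cts), *target_curve)
    actin_qty = quantity_from_ct(mean(actin_cts), *actin_curve)
    return target_qty / actin_qty


# Hypothetical 10-fold dilution series (ng of reference RNA) and Ct values.
dilutions = [100.0, 10.0, 1.0, 0.1]
ct_values = [20.0, 23.3, 26.6, 29.9]
curve = fit_standard_curve(dilutions, ct_values)

# Toy triplicates for one target gene and for actin, both read off the same curve.
ratio = relative_expression([23.3, 23.2, 23.4], [20.0, 20.1, 19.9], curve, curve)
print(curve, ratio)
```

In practice each primer pair would have its own standard curve; the same curve is reused here only to keep the example short.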
Table 4.
Primers for the genes used in real‐time polymerase chain reaction
| Gene symbol | Product size (bp) | Forward primer | Reverse primer |
|---|---|---|---|
| DCTD | 113 | ctgcgaggctcctgtttaat | aagcttttgactcggtctgc |
| IL15 | 103 | acaaacatcactctgctgcttagac | ctgatccaaggtctgatcatcttct |
| THBD | 105 | agcacttgtgttgtctggtggt | tgtgcacacagagatagcatgaa |
| GSDML | 149 | tgaggcacgaattctctgtg | ggcagtgaggacagactggt |
| SH3GL3 | 103 | gcttcctgtcctaaaagtcattggt | ctgaggaatataggccattcgttg |
| PTHLH | 122 | tgtggcttgtttatccttagctc | cttgccctaggttgtgaact |
| RP5‐1022P6 | 104 | caatgagctttgcacagtttga | tagtcccttagcttttgcctcttg |
| C9orf46 | 121 | cttcctggtcccgattgttc | actcttttctgtttccagtatgtcctc |
| Actin | 150 | atgtggccgaggactttga | tgtgtggacttgggagagga |
Results
To identify a predictive gene expression signature, 30 primary tumor samples (learning case group) located in the oral cavity region were analyzed. These included 13 samples from individuals who were found postoperatively to have metastasis in the lymph node of the neck, and 17 samples from individuals who were found postoperatively to have no metastasis in the lymph node of the neck and who remained metastasis‐free when monitored for at least 1 year after primary tumor removal. The cancer cells of the tumor were obtained by LCM technology. Total RNA was isolated and its quality checked. We removed samples with RIN values below 5. First, technical reproducibility was determined using eight samples from the learning case. The technical replicates of the same two‐sample comparison showed a high Pearson correlation coefficient; the lowest was 0.9433 (Supplementary Fig. S1). This result indicated that the technical reproducibility of gene expression was high. To analyze the results of the 30 primary tumor samples, two‐group t‐tests were used with a threshold of P < 0.04 and an absolute value of the difference in mean expression between the two groups (Δ) > 100 intensity units, with fold change in mean expression >1.5 or <0.66. The 85 genes expressed differentially between the two patient groups with and without cervical lymph node metastasis were selected (Table 2), including 33 genes that were downregulated and 52 genes that were upregulated in the metastasis group. Next, hierarchical clustering of the 30 samples was carried out on the 85 genes using Pearson's correlation distance metric and average linkage (Fig. 1). Two major cluster branches were created. One major cluster included 16 non‐metastasis samples and the other included 13 metastasis samples and one non‐metastasis sample (missed clustering).
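Hierarchical clustering with a Pearson correlation distance and average linkage can be sketched in a few lines. The sketch below is a simplified stand-in for the clustering software, not a reproduction of it: the distance is taken to be 1 − r (an assumption), any row standardization applied by the software is omitted, and the four sample profiles are toy values rather than study data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_linkage(profiles, n_clusters=2):
    """Agglomerative clustering of samples with distance 1 - r."""
    names = list(profiles)
    dist = {(a, b): 1.0 - pearson(profiles[a], profiles[b])
            for a in names for b in names if a != b}
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Average of all pairwise distances between the two clusters.
        return sum(dist[(a, b)] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]   # merge the closest pair
        del clusters[j]
    return clusters


# Toy expression profiles over four genes (hypothetical sample names and values).
samples = {
    "N0_a": [1.0, 2.0, 8.0, 9.0],
    "N0_b": [1.5, 2.5, 7.0, 9.5],
    "Npos_a": [9.0, 8.0, 2.0, 1.0],
    "Npos_b": [8.5, 9.0, 1.0, 2.0],
}
print(average_linkage(samples, n_clusters=2))
```

Stopping the merging at two clusters corresponds to reading off the two major branches of the dendrogram.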
Subsequently, to construct a more accurate and practical prediction model using a smaller number of genes, we further selected the genes with a high power for prediction from the 85 genes using the AdaBoost algorithm.( 11 ) In the AdaBoost algorithm, the optimal gene and its weight are determined in each boosting step, and the prediction model is constructed by weighted voting of the selected genes. We performed 1000 replicates of five‐fold cross validation for the learning cases. Eight candidate genes (DCTD, IL‐15, THBD, GSDML, SH3GL3, PTHLH, RP5‐1022P6 and C9orf46), which achieved the minimum error rate, were selected (Table 3). Next, a prediction score was established for each sample (Table 1). Prediction scores have a value from −1 to 1, and the borderline is 0. A positive score indicates cervical lymph node metastasis, whereas a negative score indicates that the sample is metastasis‐free. In the learning case, all 13 metastasis samples had positive scores and all 17 non‐metastasis samples had negative scores (Fig. 2A). Microarrays can analyze the expression of tens of thousands of genes, but they have limitations in accuracy and general applicability. For a prediction system with high accuracy, we needed to verify that gene expression was measured accurately by this method. Thus, to confirm the prediction results, quantitative RT‐PCR of the selected eight genes was carried out and normalized to actin before being compared with the microarray data. The Pearson correlation values of the eight genes between microarray data and quantitative RT‐PCR data all exceeded 0.73 (Table 5), showing a high correlation between microarray data and quantitative RT‐PCR data in this study. We evaluated the prediction model by using the test case group. A prediction score was calculated for each sample using the prediction model constructed in this study (Fig. 2B).
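The boosting scheme described in this paragraph, in which one gene and its weight are chosen per step and the final model is a weighted vote, can be sketched with single-gene threshold classifiers. The following is a generic discrete AdaBoost sketch, not the software used in the study: the threshold "stumps", the normalization of the score by the sum of weights (which bounds it between −1 and 1) and the toy data are all assumptions, and the cross validation used to choose the number of genes is omitted.

```python
import math


def stump_predict(row, gene, threshold, polarity):
    """Vote +1 or -1 depending on which side of the threshold the gene falls."""
    return polarity if row[gene] > threshold else -polarity


def best_stump(X, y, w):
    """Find the single-gene threshold rule with the lowest weighted error."""
    best = None
    for gene in range(len(X[0])):
        values = sorted(set(row[gene] for row in X))
        # Candidate thresholds are midpoints between consecutive values.
        for threshold in [(a + b) / 2 for a, b in zip(values, values[1:])]:
            for polarity in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, gene, threshold, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, gene, threshold, polarity)
    return best


def adaboost(X, y, n_rounds):
    """Return a list of (alpha, gene, threshold, polarity), one per round."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(n_rounds):
        err, gene, threshold, polarity = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1.0 - 1e-10)       # avoid log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)     # weight of this gene's vote
        model.append((alpha, gene, threshold, polarity))
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, gene, threshold, polarity))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def prediction_score(model, row):
    """Weighted vote scaled to [-1, 1]; positive is read as metastasis."""
    total = sum(alpha for alpha, _, _, _ in model)
    return sum(alpha * stump_predict(row, g, t, p) for alpha, g, t, p in model) / total


# Toy data: six samples, three genes; label -1 = no metastasis, +1 = metastasis.
X = [[1, 1, 9], [2, 2, 1], [9, 3, 2], [5, 0.5, 5], [7, 7, 7], [8, 8, 8]]
y = [-1, -1, -1, 1, 1, 1]
model = adaboost(X, y, n_rounds=3)
for row, label in zip(X, y):
    print(label, round(prediction_score(model, row), 3))
```

In this toy set no single gene separates the two groups, but the weighted vote of three genes does, which is the motivation for combining several weak single-gene predictors.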
All six non‐metastasis samples and six of the seven metastasis samples (12 of 13 cases, ∼92.3%) were predicted correctly by the prediction model. Only one case (∼7.7%) was misclassified by this prediction model (circled in red in Fig. 2B).
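The test‐case figures can be recomputed directly from the prediction scores and N status listed in Table 1, using the rule that a positive score is read as metastasis. The sketch below does only this arithmetic; the variable and function names are illustrative, and the sensitivity and specificity it reports are derived here from the table rather than quoted from the text.

```python
# Test-case rows from Table 1: (case number, N status from the TN column, score).
TEST_CASES = [
    (1, "N0", -0.318), (2, "N0", -0.288), (3, "N0", -0.507),
    (4, "N0", -0.260), (5, "N0", -0.053), (6, "N0", -0.165),
    (7, "N1", 0.162), (8, "N2b", 0.214), (9, "N2b", 0.329),
    (10, "N1", 0.115), (11, "N1", 0.143), (12, "N2c", 0.164),
    (13, "N2b", -0.228),
]


def evaluate(rows):
    """Compare the sign of each score with the observed N status."""
    tp = fn = tn = fp = 0
    misclassified = []
    for case, n_status, score in rows:
        has_metastasis = n_status != "N0"
        predicted_metastasis = score > 0       # borderline is 0
        if has_metastasis and predicted_metastasis:
            tp += 1
        elif has_metastasis:
            fn += 1
            misclassified.append(case)
        elif predicted_metastasis:
            fp += 1
            misclassified.append(case)
        else:
            tn += 1
    return {
        "correct": tp + tn,
        "total": len(rows),
        "accuracy": (tp + tn) / len(rows),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "misclassified": misclassified,
    }


summary = evaluate(TEST_CASES)
print(summary)
```

The single misclassified sample is test case 13, a metastasis‐positive tumor with a negative score.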
Figure 1.

Hierarchical clustering for 85 genes from 30 samples, including 13 metastasis samples (shown by pink color +) and 17 non‐metastasis samples (shown by green color –). Red color shows that the gene is upregulated and blue color shows that the gene is downregulated. Two major cluster branches were created. One major cluster included 16 non‐metastasis samples and the other one included 13 metastasis samples and one non‐metastasis sample (missed clustering).
Figure 2.

The samples were rank‐ordered by their score (determined by the AdaBoost algorithm). A vertical line shows the total discriminant score. Samples with negative scores were predicted to be free of lymph node metastasis. Samples with positive scores were predicted to have metastasized to the cervical lymph node. (A) The prediction results of learning cases. All of the 17 samples in the non‐metastasis group were negative and all of the 13 samples in the metastasis group were positive. (B) The prediction results of test cases. The six samples in the non‐metastasis group were negative. Six of seven samples in the metastasis group were positive and one was negative (failure sample, circled in red).
Table 5.
Pearson correlation of expression values between microarray data and real‐time polymerase chain reaction data of the eight genes
| Gene symbol | Pearson correlation | P‐value |
|---|---|---|
| DCTD | 0.796 | 2.46 × 10⁻⁷ |
| IL15 | 0.739 | 4.74 × 10⁻⁶ |
| THBD | 0.753 | 2.49 × 10⁻⁶ |
| GSDML | 0.868 | 1.06 × 10⁻⁹ |
| SH3GL3 | 0.911 | 6.86 × 10⁻¹² |
| PTHLH | 0.852 | 4.69 × 10⁻⁹ |
| RP5‐1022P6 | 0.768 | 1.18 × 10⁻⁶ |
| C9orf46 | 0.742 | 4.05 × 10⁻⁶ |
Discussion
At present, the methods for diagnosing the status of lymph node metastasis in oral cavity cancer are not accurate. Two opinions have therefore formed about the treatment of oral cavity cancer cases that are clinically diagnosed as free of cervical lymph node metastasis. The first is the ‘neck dissection’ policy and the other is the ‘wait and watch’ policy. However, neither policy provides appropriate treatment for the disease. Because the neck dissection policy can cause pain and discomfort, and in some cases leads to complications (such as chronic pain and shoulder palsy), it is important to ascertain whether a patient really is metastasis‐free. The alternative ‘wait and watch’ policy may allow an overlooked metastasis to spread widely in a patient who has micrometastasis. The goal of our study is to devise a novel diagnostic system that may improve the diagnosis of N status in oral cavity cancer. Our prediction model predicted 12 of 13 cases (∼92.3%) correctly and failed in only one case (∼7.7%). The misjudged metastasis case was a 53‐year‐old man with moderately differentiated tongue squamous cell carcinoma. It is very difficult to explain why this case was missed, because we could not find any relationship between the clinicopathological features of the patient and the score. We also note that quantitative RT‐PCR data should not be used with this prediction model, because such data could not be predicted accurately. The reason is that the microarray data used for the prediction score were normalized by the RMA algorithm, whereas the quantitative RT‐PCR data were normalized to actin; therefore the expression value of each gene obtained by quantitative RT‐PCR differs from the microarray value.
Of the eight genes identified, IL‐15 is of particular interest. IL‐15 is a cytokine that regulates T and natural killer cell activation and proliferation. Studies on mice suggest that IL‐15 may increase the expression of an apoptosis inhibitor. Recent studies have reported that IL‐15 expression plays an important role in cell proliferation, invasion and metastasis of human colorectal cancer.( 13 , 14 ) In the present study, we observed IL‐15 overexpression in the metastasis group (fold change [FC]: 2.8, P = 0.00711; Pearson correlation between microarray data and real‐time PCR, 0.739). This suggests that IL‐15 may also play a role in the metastasis of oral squamous cell carcinoma. Further study is required to learn more about the roles of IL‐15 in the metastasis of OSCC.
A second interesting gene is PTHLH. The protein encoded by this gene is a member of the parathyroid hormone family. This hormone regulates endochondral bone development and epithelial–mesenchymal interactions during formation of the mammary glands and teeth. Some articles have reported that upregulation of PTHLH may play a role in metastasis in breast cancer and prostate cancer cell lines.(15,16) In our results, however, PTHLH was upregulated in the non‐metastasis group and downregulated in the metastasis group (FC: −2, P < 0.022891; Pearson correlation between array data and real‐time PCR, 0.852). This discrepancy is difficult to explain. It may be that the PTHLH mechanism differs between in vitro and in vivo conditions, or that the role of PTHLH differs between cell types. It could also be that cancer cells produce PTHLH to promote cancer cell migration and invasion, but once the metastasis process is finished PTHLH is no longer necessary and is therefore downregulated in the metastasis group. Further study may clarify the role of PTHLH in OSCC.
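The two summary statistics quoted for IL‐15 and PTHLH, the signed fold change and the Pearson correlation between array and real‐time PCR values, can be sketched as follows. This is only an illustration: it assumes that fold change is taken as the ratio of group means on the log2 scale (which may differ from the exact procedure used here), and the function names and any input values are hypothetical.

```python
import math

def signed_fold_change(log2_metastasis, log2_non_metastasis):
    """Signed fold change of the metastasis group versus the non-metastasis group.

    Inputs are log2 expression values. A result of 2.8 means 2.8-fold higher in
    the metastasis group; -2 means 2-fold lower in the metastasis group.
    """
    mean_met = sum(log2_metastasis) / len(log2_metastasis)
    mean_non = sum(log2_non_metastasis) / len(log2_non_metastasis)
    ratio = 2.0 ** (mean_met - mean_non)
    return ratio if ratio >= 1.0 else -1.0 / ratio

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

With this convention, a gene whose mean log2 expression is one unit lower in the metastasis group yields a fold change of −2, which matches the sign convention used for PTHLH above. Because the Pearson coefficient is unaffected by linear rescaling, array and PCR values can correlate well even though they lie on different scales.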
The novel diagnostic system based on gene sets may be applicable to the diagnosis of N status in oral cavity cancer. Furthermore, the system may also be applicable to other diseases in the future.
Acknowledgments
Many thanks to Drs Takashi Shimoji, Koichi Nagasaki, Kiyotsugu Yoshida, Masaru Uekusa and Fumiyuki Uematsu for helpful discussion during the preparation of this article. We also thank Professor Marie Cosgrove for having checked the English language of this paper.
References
- 1. Jones AS, Phillips DE, Helliwell TR, Roland NJ. Occult node metastases in head and neck squamous carcinoma. Eur Arch Otorhinolaryngol 1993; 250: 446–9.
- 2. Genden EM, Ferlito A, Shaha AR et al. Complications of neck dissection. Acta Otolaryngol 2003; 123: 795–801.
- 3. Mueller O, Lightfoot S, Schroeder A. RNA integrity number (RIN) standardization of RNA quality control. Agilent Application Note, May 1 2004. Publication no. 5989‐1165EN. Available from URL: http://www.gene‐quantification.de/RIN.pdf
- 4. Imbeaud S, Graudens E, Boulanger V et al. Towards standardization of RNA quality assessment using user‐independent classifiers of microcapillary electrophoresis traces. Nucleic Acids Res 2005; 33: e56.
- 5. Lee J, Hever A, Willhite D, Zlotnik A, Hevezi P. Effects of RNA degradation on gene expression analysis of human postmortem tissues. FASEB J 2005; 19: 1356–8.
- 6. Technical note, GeneChip Eukaryotic Small Sample Target Labeling Assay Version II. Available from URL: http://genomics.msu.edu/RTSF/small_sample_labeling.pdf
- 7. Irizarry RA, Bolstad BM, Collin F, Cope LM, Hobbs B, Speed TP. Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res 2003; 31: e15.
- 8. Li J, Spletter ML, Johnson JA. Dissecting tBHQ induced ARE‐driven gene expression through long and short oligonucleotide arrays. Physiol Genomics 2005; 21: 43–58.
- 9. Irizarry RA, Hobbs B, Collin F et al. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 2003; 4: 249–64.
- 10. Millenaar FF, Okyere J, May ST, Van Zanten M, Voesenek LA, Peeters AJ. How to decide? Different methods of calculating gene expression from short oligonucleotide array data will give different results. BMC Bioinformatics 2006; 7: 137.
- 11. Freund Y, Schapire RE. A short introduction to boosting. J. Japan. Soc. Artif. Intel. 1999; 14: 771–80.
- 12. Ginzinger DG. Gene quantification using real‐time quantitative PCR: an emerging technology hits the mainstream. Exp Hematol 2002; 30: 503–12.
- 13. Kuniyasu H, Ohmori H, Sasaki T et al. Production of interleukin 15 by human colon cancer cells is associated with induction of mucosal hyperplasia, angiogenesis, and metastasis. Clin Cancer Res 2003; 9: 4802–10.
- 14. Kuniyasu H, Oue N, Nakae D et al. Interleukin‐15 expression is associated with malignant potential in colon cancer cells. Pathobiology 2001; 69: 86–95.
- 15. Shen X, Qian L, Falzon M. PTH‐related protein enhances MCF‐7 breast cancer cell adhesion, migration, and invasion via an intracrine pathway. Exp Cell Res 2004; 294: 420–33.
- 16. Shen X, Falzon M. PTH‐related protein modulates PC‐3 prostate cancer cell adhesion and integrin subunit profile. Mol Cell Endocrinol 2003; 199: 165–77.
