Identification of a predictive gene expression signature of cervical lymph node metastasis in oral squamous cell carcinoma

Su Tien Nguyen; Shogo Hasegawa; Hitoshi Tsuda; Hirofumi Tomioka; Masaru Ushijima; Masaki Noda; Ken Omura; Yoshio Miki

doi:10.1111/j.1349-7006.2007.00454.x

Cancer Sci. 2007 Mar 28;98(5):740–746.

PMCID: PMC11158652 PMID: 17391312

Abstract

An accurate assessment of cervical lymph node metastasis status in oral cavity cancer not only helps predict patient prognosis, but also helps surgeons choose the appropriate treatment. We used microarray technology to examine differences in gene expression profiles between primary tumors of oral squamous cell carcinoma that had metastasized to cervical lymph nodes and those that had not, in the hope of finding new biomarkers for the diagnosis and treatment of oral cavity cancer. We prepared two groups: a learning case group of 30 patients and a test case group of 13 patients. All tissue samples were processed by laser captured microdissection to collect cancer cells, and RNA was isolated from the purified cancer cells. To identify a predictive gene expression signature, gene expression was compared between the groups with and without metastasis in the learning cases (n = 30), and 85 differentially expressed genes were selected. Subsequently, to construct a more accurate prediction model, we used the AdaBoost algorithm to select the genes with high predictive power from among the 85 genes. Eight candidate genes, DCTD, IL‐15, THBD, GSDML, SH3GL3, PTHLH, RP5‐1022P6 and C9orf46, were selected as achieving the minimum error rate. Quantitative reverse transcription–polymerase chain reaction was carried out to validate the selected genes. A prediction model comprising the eight genes was constructed and evaluated using the test case group. Twelve of 13 cases (∼92.3%) were predicted correctly. (Cancer Sci 2007; 98: 740–746)

In 2005, 335 870 Japanese people died from cancer. Of these, 5679 people had oral cavity cancer (http://www.mhlw.go.jp/toukei/saikin/hw/jinkou/suikei05/index.html) and the major cause of death by cancer was metastasis. An accurate assessment of the cervical lymph node metastasis status in oral cavity cancer not only helps predict the prognosis of patients, but also helps surgeons to carry out the appropriate treatment. When the disease is localized, surgical procedures can be used to remove the tumor in its entirety. For patients who are diagnosed clinically as cervical lymph node metastasis‐positive (N+), a surgical procedure, known as radical neck dissection (RND), is used to remove all lymph node groups from levels I, II, III, IV and V, which involves the sacrifice of the internal jugular vein, sternocleidomastoid muscle and spinal accessory nerve.

Clinical staging of the cervical lymph nodes is carried out by clinical examination of the neck region or by ultrasound, computed tomography and magnetic resonance imaging. However, the sensitivity of these methods is still limited. Post‐operative histological examination shows that approximately 30% of clinically diagnosed metastasis‐negative (N0) patients have metastasis‐positive lymph nodes in the neck,⁽ ¹ ⁾ and 10–20% of clinically diagnosed metastasis‐positive (N+) patients turn out to be metastasis‐free. Because the false‐negative rate is high in clinically diagnosed metastasis‐negative (N0) patients, most surgeons are reluctant to adopt a ‘wait and watch’ policy, which may allow metastasis to spread further. Thus, surgeons usually carry out a supraomohyoid neck dissection (SOHND) to remove lymph nodes at levels I, II and III to screen for metastasis. Although SOHND is less extensive than RND and the technique of neck dissection has been refined over the last century, surgeons still face minor and major complications during the surgical procedure,⁽ ² ⁾ as well as sequelae such as chronic pain and limitation of shoulder movement due to a weak trapezius muscle.

Metastasis is a very complicated process. To metastasize, the cancer cells must break away from the tumor, increase their mobility and move through the extracellular matrix. Next they must invade the lymph vessels and grow in the lymph nodes or invade blood vessels and travel in the circulatory system. They then can pass through the vessel walls into surrounding tissue (distant metastasis). We think that the original genes controlling this process and the gene products of this process may be used as predictive markers of cervical lymph node metastasis. In the present study, microarray technology was used to investigate the differences in gene expression profiles between primary tumors of oral squamous cell carcinoma (OSCC) that metastasized to cervical lymph nodes and those that did not metastasize, in an effort to find new biomarkers that will provide more accurate diagnosis and more appropriate treatment for OSCC.

Materials and Methods

Tumor samples. All of the primary oral cancer specimens were obtained from anonymous patients who were previously untreated at the Faculty of Dentistry, Tokyo Medical and Dental University and were defined as squamous cell carcinoma of the oral cavity by histopathology. Informed consent was obtained from all of the patients. All clinical materials were approved by the ethics committee. The samples were embedded using Tissue‐Tek OCT Compound (Sakura Finetek USA) and stored at −80°C until use. These samples were grouped into metastasis group and non‐metastasis group based on clinical diagnosis and histological examination. For the learning case, 30 samples were prepared (Table 1) including 13 samples from patients who were found to be N+ in the cervical lymph node and 17 samples from patients who were found to be N0 in the cervical lymph node. Those that remained metastasis‐free were monitored for at least 1 year after the primary tumor was removed. (The primary tumors were surgically removed between April 2004 and October 2005.) For the test case, 13 samples were prepared (Table 1) including seven samples found to be N+ in the cervical lymph node and six samples found to be N0 in the cervical lymph node. Those remaining metastasis‐free were monitored for at least 6 months after the primary tumor was removed. (The primary tumors were surgically removed between November 2005 and April 2006.) To determine the technical reproducibility, we prepared eight samples from the learning case for a replicate experiment.

Table 1.

Clinical and histological characteristics of individual patients

Case	Sex	Age (years)	Primary site	TN	Differentiation	Prediction score
Learning case
1	M	56	Lower gingival	T2N0	Moderately	−0.791
2	M	62	Buccal mucosa	T1N0	Well	−0.032
3	M	55	Upper gingival	T2N0	Well	−0.780
4	F	66	Tongue	T2N0	Moderately	−0.309
5	M	72	Hard palate	T1N0	Moderately	−0.481
6	M	58	Mouth floor	T2N0	Well	−0.716
7	M	80	Tongue	T2N0	Well	−0.564
8	M	30	Tongue	T2N0	Well	−0.609
9	F	81	retromolar trigone	T1N0	Well	−0.507
10	F	60	Tongue	T2N0	Well	−0.264
11	M	59	Tongue	T3N0	Moderately	−0.481
12	M	68	retromolar trigone	T2N0	Moderately	−0.428
13	M	54	Tongue	T1N0	Well	−0.534
14	M	56	Upper gingival	T2N0	Well	−0.303
15	M	43	Tongue	T2N0	Moderately	−0.716
16	M	46	Tongue	T2N0	Well	−0.564
17	M	58	Tongue	T2N0	Well	−0.282
18	F	64	Lower gingival	T4aN1	Well	0.348
19	M	66	Tongue	T1N2b	Moderately	0.411
20	M	77	Lower gingival	T2N2b	Poor	0.577
21	F	74	Buccal mucosa	T2N1	Moderately	0.780
22	M	78	Lower gingival	T2N1	Well	0.499
23	M	71	Buccal mucosa	T3N2b	Poor	0.499
24	M	71	Lower gingival	T3N2b	Well	0.564
25	M	61	Tongue	T2N2c	Moderately	0.318
26	M	60	Lower gingival	T4N2b	Moderately	1.000
27	M	57	Upper gingival	T4aN1	Poor	0.571
28	M	70	Tongue	T3N1	Moderately	0.592
29	M	52	Lower gingival	T4aN2b	Moderately	0.817
30	M	37	Tongue	T2N3	Poor	0.292
Test case
1	M	66	Lower gingival	T2N0	Well	−0.318
2	F	66	Upper gingival	T2N0	Moderately	−0.288
3	M	73	Tongue	T2N0	Poor	−0.507
4	M	32	Tongue	T3N0	Moderately	−0.260
5	M	58	Mouth floor	T3N0	Moderately	−0.053
6	M	72	Lower gingival	T2N0	Well	−0.165
7	M	66	Mouth floor	T3N1	Moderately	0.162
8	M	68	Lower gingival	T4aN2b	Moderately	0.214
9	F	54	Tongue	T2N2b	Moderately	0.329
10	M	59	Mouth floor	T4N1	Moderately	0.115
11	M	67	Mouth floor	T4N1	Well	0.143
12	M	60	Tongue	T4N2c	Moderately	0.164
13	M	53	Tongue	T2N2b	Moderately	−0.228

T1, tumor ≤2 cm in greatest dimension; T2, tumor >2 cm but ≤4 cm in greatest dimension; T3, tumor >4 cm in greatest dimension; T4, (lip) tumor invades through cortical bone, inferior alveolar nerve, floor of mouth, or skin of face (i.e. chin or nose); T4a, (oral cavity) tumor invades adjacent structures (e.g. through cortical bone, into deep [extrinsic] muscle of tongue [genioglossus, hyoglossus, palatoglossus, and styloglossus], maxillary sinus, and skin of face); T4b, tumor invades masticator space, pterygoid plates, or skull base and/or encases internal carotid artery; N0, no regional lymph node metastasis; N1, metastasis in a single ipsilateral lymph node, ≤3 cm in greatest dimension; N2, metastasis in a single ipsilateral lymph node, >3 cm but ≤6 cm in greatest dimension, or in multiple ipsilateral lymph nodes, ≤6 cm in greatest dimension, or in bilateral or contralateral lymph nodes, ≤6 cm in greatest dimension; N2a, metastasis in a single ipsilateral lymph node >3 cm but ≤6 cm in greatest dimension; N2b, metastasis in multiple ipsilateral lymph nodes, ≤6 cm in greatest dimension; N2c, metastasis in bilateral or contralateral lymph nodes, ≤6 cm in greatest dimension; N3, metastasis in a lymph node >6 cm in greatest dimension.

Laser captured microdissection. All primary tumor specimens were cut into 9‐µm sections at −20°C using a LEICA cryostat model 3050S. The sections were mounted on a special slide for use in laser captured microdissection (LCM) and immediately stored at −80°C until use. First, the sections were fixed in cold ethanol for 3 min. They were then washed in deionized water for 30 s, stained with hematoxylin for 40 s, and washed again in deionized water for 30 s. The sections were dried in cold air for 2 or 3 min before LCM. Squamous cell carcinoma cells were then collected from the hematoxylin‐stained tissue sections by LCM.

RNA isolation and quality assessment. Total RNA was extracted from the harvested cells using the RNeasy Micro Kit (Qiagen), and the concentration was measured using a NanoDrop ND‐100 Spectrophotometer. All RNA samples were run with RNA 6000 Pico LabChip kits on the Agilent 2100 Bioanalyzer to analyze the quality of total RNA. Total RNA quality was assessed by the RNA integrity number (RIN) value,⁽ ³ , ⁴ ⁾ and samples with RIN values below 5⁽ ⁵ ⁾ were not used for the next step.

cRNA amplification and biotin labeling. Total RNA (100 ng) of each sample was used for starting the protocol of Two‐Cycle cDNA Synthesis and labeling of cRNA, following the recommendations of Affymetrix.⁽ ⁶ ⁾ The yield of biotin‐labeled cRNA was measured using a NanoDrop ND‐100 Spectrophotometer and the quality was analyzed using an Agilent 2100 Bioanalyzer. We removed samples with a yield less than 40 µg or with a median size of biotin‐labeled cRNA fragments less than 500 bp.
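The exclusion rules quoted in the two paragraphs above (RIN below 5, cRNA yield below 40 µg, median fragment size below 500 bp) can be written down as a simple filter. The following is only a minimal sketch: the `SampleQC` record, its field names and the `qc_failures` helper are hypothetical, and the sketch checks all three criteria at once even though in the workflow the RIN check precedes cRNA synthesis.

```python
from dataclasses import dataclass

# Cut-offs quoted in the Methods text above.
MIN_RIN = 5.0                   # samples with RIN below 5 were excluded
MIN_CRNA_YIELD_UG = 40.0        # biotin-labeled cRNA yield (micrograms)
MIN_MEDIAN_FRAGMENT_BP = 500.0  # median cRNA fragment size (bp)


@dataclass
class SampleQC:
    """QC measurements for one sample; the field names are hypothetical."""
    sample_id: str
    rin: float
    crna_yield_ug: float
    median_fragment_bp: float


def qc_failures(sample):
    """Return the names of the failed criteria (an empty list means 'keep')."""
    failures = []
    if sample.rin < MIN_RIN:
        failures.append("RIN")
    if sample.crna_yield_ug < MIN_CRNA_YIELD_UG:
        failures.append("cRNA yield")
    if sample.median_fragment_bp < MIN_MEDIAN_FRAGMENT_BP:
        failures.append("fragment size")
    return failures
```

A sample sitting exactly on a threshold (for example RIN = 5) is kept here, which follows the wording "below 5" and "less than 40 µg".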

Microarray production. The Human Genome U133 Plus 2.0 array was purchased from the Affymetrix company in Japan. The array comprised 1 300 000 distinct oligonucleotides and featured over 47 000 transcripts and variants, including approximately 39 000 of the best‐characterized human genes.

Cocktail solution and microarray hybridization. Before making the cocktail solution, we fragmented 20 µg of full‐length biotin‐labeled cRNA into 35–200 base fragments. We then used 15 µg of the fragmented cRNA to make the cocktail solution, which was loaded onto a GeneChip HG U133 Plus 2 array and hybridized for 16 h at 45°C. After hybridization, the arrays were washed and stained using the Fluidics Station 450 with protocol EukGE‐WS2v5_450, and were scanned using the Affymetrix GeneChip Scanner 3000.

Statistical analysis. After scanning, the fluorescence intensity was measured using Affymetrix Microarray Suite 5.0 software, and an array was excluded if its report showed a scale factor larger than 6, a 3′/5′ β‐actin ratio larger than 35 or a 3′/5′ glyceraldehyde‐3‐phosphate dehydrogenase ratio larger than 7. For low‐level analysis, the arrays were imported into the RMAExpress software (http://rmaexpress.bmbolstad.com) to normalize the data and compute expression levels using the RMA algorithm,⁽ ⁷ , ⁸ ⁾ because the RMA algorithm gave the most reproducible results and showed the highest correlation coefficients with real‐time polymerase chain reaction data.⁽ ⁹ , ¹⁰ ⁾ After the expression levels were calculated, the array data were imported into DNA‐Chip analysis software (http://www.dchip.org) for high‐level analysis. Gene filtering was carried out using the variation across samples criterion (0.3 < standard deviation/mean < 100). For group comparison, two‐group t‐tests were used with a threshold of P < 0.04, an absolute difference in mean expression between the two groups (Δ) > 100 intensity units and a fold change in mean expression >1.5 or <0.66. In total, 85 genes (Table 2) that differed in expression level between the two groups were selected. From these 85 genes, we then selected a subset using software based on the AdaBoost algorithm.⁽ ¹¹ ⁾ The software automatically selected the gene combination that best separated the metastasis group from the non‐metastasis group with the lowest cross validation (CV) error rate. Eight genes (Table 3) with higher predictive power were extracted and used to evaluate the 13 samples of the test case.
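The selection criteria in this paragraph can be illustrated with a short sketch. It is not the dChip implementation: it assumes linear‐scale, positive intensity values, uses the sample standard deviation for the variation filter, and takes the t‐test P‐value as an input computed elsewhere. The function names are hypothetical. The signed fold change follows the convention that Table 2 appears to use, where downregulated genes are reported as negative reciprocals.

```python
from statistics import mean, stdev


def signed_fold_change(mean_met, mean_non):
    """Ratio of group means; reported as a negative reciprocal when the
    metastasis group is lower (e.g. a ratio of 0.5 becomes -2.0)."""
    ratio = mean_met / mean_non
    return ratio if ratio >= 1.0 else -1.0 / ratio


def passes_selection(met, non, p_value):
    """Apply the four criteria from the text to one gene.

    met, non : expression values in the metastasis / non-metastasis group
    p_value  : two-group t-test P-value, assumed to be computed elsewhere
    """
    all_values = list(met) + list(non)
    # Variation filter: 0.3 < standard deviation / mean < 100
    variation = stdev(all_values) / mean(all_values)
    if not (0.3 < variation < 100):
        return False
    mean_met, mean_non = mean(met), mean(non)
    # Absolute difference in group means must exceed 100 intensity units
    if abs(mean_met - mean_non) <= 100:
        return False
    # Fold change must be above 1.5 or below 0.66
    ratio = mean_met / mean_non
    if not (ratio > 1.5 or ratio < 0.66):
        return False
    return p_value < 0.04
```

For example, a gene with group means of 1000 and 450 has a signed fold change of about 2.22 and passes the filter only if its P‐value is also below 0.04.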

Table 2.

The 85 genes related to lymph node metastasis

Accession no.	Gene symbol	Description	Fold change	P‐value
Downregulated genes in the metastasis group
NM_000597	IGFBP2	Insulin‐like growth factor binding protein 2, 36 kDa	−6.81	0.004005
NM_002276	KRT19	Keratin 19	−5.78	0.028855
BG401568	SLC16A9	Solute carrier family 16 (monocarboxylic acid transporters), member 9	−3.54	0.000823
NM_001387	DPYSL3	Dihydropyrimidinase‐like 3	−3.33	0.035334
NM_016140	CGI‐38	Brain specific protein /// brain specific protein	−2.49	0.019781
NM_001823	CKB	Creatine kinase, brain	−2.42	0.006229
AF288571	LEF1	Lymphoid enhancer‐binding factor 1	−2.35	0.033874
NM_002820	PTHLH	Parathyroid hormone‐like hormone	−2.34	0.02038
AW451197		CDNA clone IMAGE:5278089	−2.32	0.035398
AI278995		Predicted: Homo sapiens similar to B230208J24Rik protein (LOC201501), mRNA	−2.24	0.01941
M31157	PTHLH	Parathyroid hormone‐like hormone	−2.16	0.038689
NM_003027	SH3GL3	SH3‐domain GRB2‐like 3	−2.15	0.029178
AL567411	CDK5R1	Cyclin‐dependent kinase 5, regulatory subunit 1 (p35)	−2.13	0.011045
BG434174		Stoned B‐like factor	−2.11	0.023189
AI522132		Hypothetical protein LOC115749	−2.04	0.038373
BC005961	PTHLH	Parathyroid hormone‐like hormone /// parathyroid hormone‐like hormone	−2.00	0.022891
NM_001759	CCND2	Cyclin D2	−1.98	0.018311
BG290193	ZNF703	Zinc finger protein 703	−1.93	0.037675
AI189753	TM4SF1	Transmembrane 4 L six family member 1	−1.92	0.011249
AA143793	RAB11FIP1	RAB11 family interacting protein 1 (class I)	−1.89	0.013964
BF111651	PPAPDC1B	Phosphatidic acid phosphatase type 2 domain containing 1B	−1.89	0.03133
AL137763	GRHL3	Grainyhead‐like 3 (Drosophila)	−1.89	0.035038
BC000408	ACAT2	Acetyl‐Coenzyme A acetyltransferase 2 (acetoacetyl Coenzyme A thiolase)	−1.88	0.006495
AW026491	CCND2	Cyclin D2	−1.87	0.032209
AL137629	KALRN	Kalirin, RhoGEF kinase	−1.82	0.031132
BF514585	SESN3	Sestrin 3	−1.77	0.021875
M90657	TM4SF1	Transmembrane 4 L six family member 1	−1.76	0.011547
AI458128	CBX6	Chromobox homolog 6	−1.73	0.018388
BF680438	LONRF1	LON peptidase N‐terminal domain and ring finger 1	−1.70	0.01133
AU154469	SLC11A2	Solute carrier family 11 (proton‐coupled divalent metal ion transporters), member 2	−1.69	0.018722
BG165333	CNKSR3	CNKSR family member 3	−1.68	0.016795
AI654238	B4GALNT3	β1,4‐N‐acetylgalactosaminyltransferase‐transferase‐III	−1.63	0.018618
AI346835	TM4SF1	Transmembrane 4 L six family member 1	−1.52	0.006659
Upregulated genes in the metastasis group
BE501952	SATL1	Spermidine/spermine N1‐acetyl transferase‐like 1	1.51	0.015987
AI809870	SMYD2	SET and MYND domain containing 2	1.54	0.003793
NM_017665	ZCCHC10	Zinc finger, CCHC domain containing 10	1.54	0.010698
BF214329		Mitochondrial fission regulator 1	1.56	0.008586
AI123233	RANBP6	RAN binding protein 6	1.58	0.0279
NM_016040	TMED5	Transmembrane emp24 protein transport domain containing 5	1.59	0.015651
NM_001889	CRYZ	Crystallin, zeta (quinone reductase)	1.59	0.012586
NM_013322	SNX10	Sorting nexin 10	1.59	0.028971
NM_024699	ZFAND1	Zinc finger, AN1‐type domain 1	1.6	0.018314
U13700	CASP1	Caspase 1, apoptosis‐related cysteine peptidase (interleukin 1, beta, convertase)	1.61	0.02315
AW157773	ZFP62	Zinc finger protein 62 homolog (mouse)	1.61	0.016554
NM_016283	TAF9	TAF9 RNA polymerase II, TATA box binding protein (TBP)‐associated factor, 32 kDa	1.62	0.012154
BF439522	MGC23909	Hypothetical protein MGC23909	1.63	0.017789
NM_003187	TAF9	TAF9 RNA polymerase II, TATA box binding protein (TBP)‐associated factor, 32 kDa	1.65	0.006175
NM_014873	LPGAT1	Lysophosphatidylglycerol acyltransferase 1	1.65	0.010458
AK001947	RP5‐1022P6.2	Hypothetical protein KIAA1434	1.65	0.026428
NM_016576	GMPR2	Guanosine monophosphate reductase 2	1.66	0.039452
NM_024430	PSTPIP2	Proline‐serine‐threonine phosphatase interacting protein 2	1.67	0.016465
AW612657	LYPLAL1	Lysophospholipase‐like 1	1.67	0.021165
L12723	HSPA4	Heat shock 70 kDa protein 4	1.69	0.006059
BC001025	RCL1	RNA terminal phosphate cyclase‐like 1	1.71	0.007725
AI042152	TncRNA	Trophoblast‐derived noncoding RNA	1.71	0.02217
AW962511	FLJ22531	Hypothetical protein FLJ22531	1.73	0.036123
AI634046	CFLAR	CASP8 and FADD‐like apoptosis regulator	1.74	0.03599
AF183569	ARTS‐1	Type 1 tumor necrosis factor receptor shedding aminopeptidase regulator	1.76	0.009405
AI805560	ZMYM6	Zinc finger, MYM‐type 6	1.77	0.01367
AI347128	IGBP1	Immunoglobulin (CD79A) binding protein 1	1.77	0.001289
NM_002198	IRF1	Interferon regulatory factor 1	1.8	0.016707
NM_000361	THBD	Thrombomodulin	1.81	0.007352
NM_000361	THBD	Thrombomodulin	1.84	0.007352
NM_012485	HMMR	Hyaluronan‐mediated motility receptor (RHAMM)	1.84	0.009623
BF735901	NUDCD2	NudC domain containing 2	1.86	0.00967
AL559202		Full‐length cDNA clone CS0DF034YI03 of fetal brain of Homo sapiens (human)	1.86	0.009237
AW973232		gb:AW973232 /DB_XREF=gi:8163078 /DB_XREF=EST385330 /FEA=EST /CNT=5 /TID=Hs.293553.0 /TIER=ConsEnd /STK=0 /UG=Hs.293553 /UG_TITLE=ESTs	1.89	0.008534
NM_004120	GBP2	Guanylate binding protein 2, interferon‐inducible /// guanylate binding protein 2, interferon‐inducible	1.89	0.029524
U29343	HMMR	Hyaluronan‐mediated motility receptor (RHAMM)	1.92	0.004741
AW119113	THBD	Thrombomodulin	1.92	0.010555
NM_018530	GSDML	Gasdermin‐like	1.95	0.027119
NM_014349	APOL3	Apolipoprotein L, 3	1.95	0.018873
AI224133		Transcribed locus, weakly similar to XP_517454.1 PREDICTED: similar to hypothetical protein MGC45438[Pan troglodytes]	1.96	0.039636
AI928035	IRX2	Iroquois homeobox protein 2	1.99	0.022852
NM_018465	C9orf46	Chromosome 9 open reading frame 46	1.99	0.014382
AW003140		mRNA; cDNA DKFZp686K1098 (from clone DKFZp686K1098)	2.04	0.025115
AW613387		Endothelial cell growth factor 1 (platelet‐derived)	2.09	0.013803
NM_001657	AREG	Amphiregulin (schwannoma‐derived growth factor)	2.10	0.013069
BC005254	CLEC2B	C‐type lectin domain family 2, member B	2.11	0.024784
AI656493	DCTD	dCMP deaminase	2.17	0.004514
NM_005415	SLC20A1	Solute carrier family 20 (phosphate transporter), member 1	2.17	0.022575
NM_004815	ARHGAP29	Rho GTPase activating protein 29	2.27	0.009215
AA976354	KIAA1618	KIAA1618	2.61	0.017377
NM_000585	IL15	Interleukin 15	2.80	0.00711
AI539443	STAT1	Signal transducer and activator of transcription 1, 91 kDa	3.00	0.033578

Table 3.

Genes selected for the prediction model

Accession	Gene symbol	Description	Fold change	P‐value
AI656493	DCTD	dCMP deaminase	2.17	0.004514
NM_000585	IL15	Interleukin 15	2.80	0.00711
AW119113	THBD	Thrombomodulin	1.92	0.010555
NM_018530	GSDML	Gasdermin‐like	1.92	0.027119
NM_003027	SH3GL3	SH3‐domain GRB2‐like 3	−2.15	0.029178
BC005961	PTHLH	Parathyroid hormone‐like hormone	−2.00	0.022891
BE328402	RP5‐1022P6	Hypothetical protein KIAA1434	1.92	0.020426
NM_018465	C9orf46	Chromosome 9 open reading frame 46	1.99	0.014382

Quantitative reverse transcription–polymerase chain reaction analysis. Quantitative reverse transcription–polymerase chain reaction (RT‐PCR)⁽ ¹² ⁾ was used to validate the microarray results for the eight selected genes. For each sample, 100 ng of the original total RNA was used to synthesize first‐strand cDNA by reverse transcriptase with an oligo dT primer, following the protocol recommended by Invitrogen (Superscript III First‐Strand Synthesis System for RT‐PCR). Primer sets for quantitative RT‐PCR (Table 4) were designed using PRIMER 3 software (http://www.genome.wi.mit.edu) and were synthesized by the Sigma Corporation. The PCR reaction was carried out using an ABI Prism 7900 Sequence Detection system with Power SYBR Green Master Mix (15 µL Power SYBR Green Master Mix, 0.3 µL of each primer at 5 µM, 5 µL cDNA, 9.4 µL water). For each sample, reactions were carried out in triplicate using the following program: denaturation for 15 s at 95°C, and annealing and extension for 60 s at 60°C. Cumulative fluorescence was measured at the end of the extension phase of each cycle. Quantification was based on standard curves from serial dilutions of human normal total RNA purchased from Stratagene Corporation. The results were normalized to actin, and then compared with the microarray data for the eight genes.
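Standard‐curve quantification followed by normalization to actin can be sketched as below. This is an illustration under stated assumptions, not the instrument software: it assumes the usual linear relation between threshold cycle (Ct) and log10 of input quantity, averages the triplicate Ct values before conversion, and uses hypothetical function names and dilution values.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to get a relative quantity from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(target_cts, actin_cts, target_curve, actin_curve):
    """Average the replicate Ct values, convert each gene to a quantity with
    its own standard curve, and return the target/actin ratio."""
    target_ct = sum(target_cts) / len(target_cts)
    actin_ct = sum(actin_cts) / len(actin_cts)
    target_qty = quantity_from_ct(target_ct, *target_curve)
    actin_qty = quantity_from_ct(actin_ct, *actin_curve)
    return target_qty / actin_qty
```

With a hypothetical ten‐fold dilution series in which Ct rises by 3.3 cycles per dilution step, the fitted slope is −3.3, and a sample whose target Ct is 3.3 cycles higher than its actin Ct on identical curves has a normalized expression of 0.1.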

Table 4.

Primers for the genes used in real‐time polymerase chain reaction

Gene symbol	Product size (bps)	Forward primer	Reverse primer
DCTD	113	ctgcgaggctcctgtttaat	aagcttttgactcggtctgc
IL15	103	acaaacatcactctgctgcttagac	ctgatccaaggtctgatcatcttct
THBD	105	agcacttgtgttgtctggtggt	tgtgcacacagagatagcatgaa
GSDML	149	tgaggcacgaattctctgtg	ggcagtgaggacagactggt
SH3GL3	103	gcttcctgtcctaaaagtcattggt	ctgaggaatataggccattcgttg
PTHLH	122	tgtggcttgtttatccttagctc	cttgccctaggttgtgaact
RP5‐1022P6	104	caatgagctttgcacagtttga	tagtcccttagcttttgcctcttg
C9orf46	121	cttcctggtcccgattgttc	actcttttctgtttccagtatgtcctc
Actin	150	atgtggccgaggactttga	tgtgtggacttgggagagga

Results

To identify a predictive gene expression signature, 30 primary tumor samples (learning case group) from the oral cavity region were analyzed. These included 13 samples from individuals who were found postoperatively to have metastasis in the lymph nodes of the neck, and 17 samples from individuals who were found postoperatively to have no metastasis in the lymph nodes of the neck and who remained metastasis‐free when monitored for at least 1 year after primary tumor removal. The cancer cells of each tumor were obtained by LCM. Total RNA was isolated and its quality checked. We removed samples with RIN values below 5. First, technical reproducibility was determined using eight samples from the learning case. Pairwise comparison of the technical replicates showed high Pearson correlation coefficients; the lowest was 0.9433 (Supplementary Fig. S1). This result indicated that the technical reproducibility of gene expression was high. To analyze the results of the 30 primary tumor samples, two‐group t‐tests were used with a threshold of P < 0.04 and an absolute difference in mean expression between the two groups (Δ) > 100 intensity units, with a fold change in mean expression >1.5 or <0.66. The 85 genes expressed differentially between the two patient groups with and without cervical lymph node metastasis were selected (Table 2), including 33 genes that were downregulated and 52 genes that were upregulated in the metastasis group. Next, hierarchical clustering of the 30 samples was carried out using the 85 genes, with Pearson's correlation as the distance metric and average linkage (Fig. 1). Two major cluster branches were created. One major cluster included 16 non‐metastasis samples and the other included 13 metastasis samples and one non‐metastasis sample (misclustered).
Subsequently, to construct a more accurate and practical prediction model using a smaller number of genes, we used the AdaBoost algorithm⁽ ¹¹ ⁾ to further select genes with high predictive power from the 85 genes. In the AdaBoost algorithm, the optimal gene and its weight are determined in each boosting step, and the prediction model is constructed by weighted voting of the selected genes. We performed 1000 replicates of five‐fold cross validation for the learning cases. Eight candidate genes (DCTD, IL‐15, THBD, GSDML, SH3GL3, PTHLH, RP5‐1022P6 and C9orf46), which achieved the minimum error rate, were selected (Table 3). Next, a prediction score was calculated for each sample (Table 1). Prediction scores range from −1 to 1, with a cutoff of 0. A positive score indicates cervical lymph node metastasis, whereas a negative score indicates that the sample is metastasis‐free. In the learning case, all 13 metastasis samples had positive scores and all 17 non‐metastasis samples had negative scores (Fig. 2A). Microarray analysis is an excellent tool that can measure the expression of tens of thousands of genes. However, it has some limitations in accuracy and general applicability. A prediction system with high accuracy requires verification that gene expression is measured accurately by this method. Thus, to confirm the microarray results, quantitative RT‐PCR of the eight selected genes was carried out and normalized to actin before being compared with the microarray data. The Pearson correlation coefficients between microarray data and quantitative RT‐PCR data for the eight genes were all above 0.73 (Table 5), showing a high correlation between the two methods in this study. We then evaluated the prediction model using the test case group. A prediction score was calculated for each sample using the prediction model constructed in this study (Fig. 2B).
All six non‐metastasis samples and six of the seven metastasis samples (12 of 13 cases, ∼92.3%) were predicted correctly by the prediction model. Only one case (∼7.7%) was misclassified (circled in red in Fig. 2B).
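The weighted‐voting idea described above can be sketched with a textbook discrete AdaBoost that uses single‐gene threshold rules (decision stumps) as weak learners. This is an assumption‐laden illustration rather than the software used in the study: the choice of stumps, the candidate thresholds, and the division of the weighted vote by the sum of the weights (so that scores fall between −1 and 1) are all assumptions, and the function names are hypothetical. In the study, the number of genes was chosen by cross validation, which this sketch omits.

```python
import math


def stump_predict(row, gene, threshold, polarity):
    """Single-gene rule: vote `polarity` if expression exceeds the threshold,
    otherwise vote the opposite class (+1 = metastasis, -1 = non-metastasis)."""
    return polarity if row[gene] > threshold else -polarity


def best_stump(X, y, weights):
    """Return (weighted_error, gene, threshold, polarity) of the best rule."""
    best = None
    for gene in range(len(X[0])):
        values = sorted(set(row[gene] for row in X))
        # Candidate thresholds: midpoints between consecutive observed values
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            for polarity in (1, -1):
                error = sum(w for row, label, w in zip(X, y, weights)
                            if stump_predict(row, gene, threshold, polarity) != label)
                if best is None or error < best[0]:
                    best = (error, gene, threshold, polarity)
    return best


def train_adaboost(X, y, n_rounds):
    """Each round selects one gene rule and its voting weight (alpha)."""
    n = len(X)
    weights = [1.0 / n] * n
    model = []
    for _ in range(n_rounds):
        error, gene, threshold, polarity = best_stump(X, y, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - error) / error)
        model.append((alpha, gene, threshold, polarity))
        # Increase the weight of misclassified samples, then renormalize
        weights = [w * math.exp(-alpha * label * stump_predict(row, gene, threshold, polarity))
                   for row, label, w in zip(X, y, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def prediction_score(model, row):
    """Weighted vote scaled by the total weight, so the score lies in [-1, 1]."""
    total_alpha = sum(alpha for alpha, _, _, _ in model)
    if total_alpha == 0:
        return 0.0
    vote = sum(alpha * stump_predict(row, gene, threshold, polarity)
               for alpha, gene, threshold, polarity in model)
    return vote / total_alpha
```

Under this scheme a sample is called metastasis‐positive when `prediction_score` is above 0, matching the cutoff stated in the text.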

Figure 1.

Hierarchical clustering of the 85 genes across the 30 samples, including 13 metastasis samples (shown in pink, +) and 17 non‐metastasis samples (shown in green, –). Red indicates that a gene is upregulated and blue indicates that it is downregulated. Two major cluster branches were created. One major cluster included 16 non‐metastasis samples and the other included 13 metastasis samples and one non‐metastasis sample (misclustered).
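Clustering with a Pearson correlation distance and average linkage, as used for this figure, can be sketched in a few lines. The sketch assumes the distance between two samples is 1 − r and that clusters are merged until a requested number remain; the function names are hypothetical, and a profile with no variation would cause a division by zero.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_linkage_clusters(profiles, n_clusters):
    """Agglomerative clustering of samples.

    profiles   : one expression vector per sample (same gene order in each)
    n_clusters : stop merging when this many clusters remain
    Returns a list of clusters, each a sorted list of sample indices.
    """
    n = len(profiles)
    # Pairwise distance = 1 - Pearson correlation
    dist = [[0.0 if i == j else 1.0 - pearson(profiles[i], profiles[j])
             for j in range(n)] for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average linkage: mean distance over all cross-cluster pairs
                pair_sum = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d = pair_sum / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]
```

Calling `average_linkage_clusters(profiles, 2)` on the 30 sample profiles would return the two major branches; the full dendrogram shown in the figure needs the complete merge history, which this sketch does not record.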

Figure 2.

The samples were rank‐ordered by their prediction score (determined by the AdaBoost algorithm). A vertical line shows the total discriminant score. Negative scores indicate tumors free of lymph node metastasis; positive scores indicate tumors that metastasized to the cervical lymph nodes. (A) Prediction results for the learning cases. All 17 samples in the non‐metastasis group were negative and all 13 samples in the metastasis group were positive. (B) Prediction results for the test cases. All six samples in the non‐metastasis group were negative. Six of seven samples in the metastasis group were positive and one was negative (misclassified sample, circled in red).
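The test‐case tally in panel B can be reproduced directly from the prediction scores listed in Table 1. The sketch below copies the 13 test‐case rows from that table and applies the cutoff of 0; the helper names are hypothetical, and the sensitivity and specificity figures are simple arithmetic on the same counts rather than values reported in the text.

```python
# Test-case rows from Table 1: (case number, N status, prediction score)
TEST_CASES = [
    (1, "N0", -0.318), (2, "N0", -0.288), (3, "N0", -0.507),
    (4, "N0", -0.260), (5, "N0", -0.053), (6, "N0", -0.165),
    (7, "N1", 0.162), (8, "N2b", 0.214), (9, "N2b", 0.329),
    (10, "N1", 0.115), (11, "N1", 0.143), (12, "N2c", 0.164),
    (13, "N2b", -0.228),
]


def tally(cases, cutoff=0.0):
    """Count true/false positives and negatives at the given score cutoff."""
    counts = {"tp": 0, "fn": 0, "tn": 0, "fp": 0}
    for _, n_status, score in cases:
        has_metastasis = n_status != "N0"
        predicted_positive = score > cutoff
        if has_metastasis and predicted_positive:
            counts["tp"] += 1
        elif has_metastasis:
            counts["fn"] += 1
        elif predicted_positive:
            counts["fp"] += 1
        else:
            counts["tn"] += 1
    return counts


def accuracy(counts):
    """Fraction of all cases that were predicted correctly."""
    return (counts["tp"] + counts["tn"]) / sum(counts.values())


def misclassified(cases, cutoff=0.0):
    """Case numbers whose predicted status disagrees with the N status."""
    return [case for case, n_status, score in cases
            if (n_status != "N0") != (score > cutoff)]
```

On these rows the tally gives 6 true positives, 1 false negative, 6 true negatives and no false positives, that is 12 of 13 correct (about 92.3%), with a sensitivity of 6/7 and a specificity of 6/6. The single miss is test case 13.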

Table 5.

Pearson correlation of expression values between microarray data and real‐time polymerase chain reaction data of the eight genes

Gene symbol	Pearson correlation	P‐value
DCTD	0.796	2.46 × 10⁻⁷
IL15	0.739	4.74 × 10⁻⁶
THBD	0.753	2.49 × 10⁻⁶
GSDML	0.868	1.06 × 10⁻⁹
SH3GL3	0.911	6.86 × 10⁻¹²
PTHLH	0.852	4.69 × 10⁻⁹
RP5‐1022P6	0.768	1.18 × 10⁻⁶
C9orf46	0.742	4.05 × 10⁻⁶

Discussion

At present, the methods for diagnosing the status of lymph node metastasis in oral cavity cancer are not accurate. Consequently, two approaches have emerged for treating oral cavity cancer cases that are clinically diagnosed as free of cervical lymph node metastasis: the ‘neck dissection’ policy and the ‘wait and watch’ policy. However, neither policy provides appropriate treatment for every patient. Because the neck dissection policy can cause pain and discomfort and in some cases leads to complications (such as chronic pain and shoulder palsy), it is important to ascertain whether a patient really is metastasis‐free. The alternative ‘wait and watch’ policy may allow an overlooked metastasis to spread widely in a patient who has micrometastasis. The goal of our study was to devise a novel diagnostic system that may improve the diagnosis of N status in oral cavity cancer. The model predicted 12 of 13 test cases (∼92.3%) correctly; only one case (∼7.7%) was misclassified. The misclassified metastasis case was a 53‐year‐old man with moderately differentiated squamous cell carcinoma of the tongue. It is difficult to explain why this case was missed, because we could not find any relationship between the clinicopathological features of the patient and the score. We also note that quantitative RT‐PCR data should not be used as input for this prediction model, because such data could not be predicted accurately. The reason is that the prediction scores were derived from microarray data normalized by the RMA algorithm, whereas the quantitative RT‐PCR data were normalized to actin; the expression value of each gene therefore differs between the two platforms.

Of the eight genes identified, IL‐15 is of particular interest. IL‐15 is a cytokine that regulates T and natural killer cell activation and proliferation. Studies in mice suggest that IL‐15 may increase the expression of apoptosis inhibitors. Recent studies have reported that IL‐15 expression plays an important role in cell proliferation, invasion and metastasis of human colorectal cancer.⁽ ¹³ , ¹⁴ ⁾ In the present study, we observed IL‐15 overexpression in the metastasis group (fold change [FC]: 2.8, P = 0.00711; Pearson correlation between microarray data and real‐time PCR, 0.739). This suggests that IL‐15 may also play a role in the metastasis of oral squamous cell carcinoma. Further study is required to learn more about the roles of IL‐15 in the metastasis of OSCC.

A second interesting gene is PTHLH. The protein encoded by this gene is a member of the parathyroid hormone family. This hormone regulates endochondral bone development and epithelial–mesenchymal interactions during formation of the mammary glands and teeth. Some articles have reported that upregulation of PTHLH may play a role in metastasis of breast cancer and prostate cancer cell lines.⁽ ¹⁵ , ¹⁶ ⁾ In our results, however, PTHLH was upregulated in the non‐metastasis group and downregulated in the metastasis group (FC: −2, P = 0.022891; Pearson correlation between array data and real‐time PCR, 0.852). This is difficult to explain, but the PTHLH mechanism may differ in vitro and in vivo, or the role of PTHLH may differ between cell types. It could also be that cancer cells produce PTHLH to promote cancer cell migration and invasion, but that PTHLH is no longer necessary once the metastasis process is finished and so was downregulated in the metastasis group. Further study may clarify the role of PTHLH in OSCC.

The novel diagnostic system based on gene sets may be applicable to the diagnosis of cervical lymph node metastasis in oral cavity cancer. In the future, the system may also be applied to other diseases.

Supporting information

Supporting info item

CAS-98-740-s001.pdf (85.1 KB, pdf)

Acknowledgments

Many thanks to Drs Takashi Shimoji, Koichi Nagasaki, Kiyotsugu Yoshida, Masaru Uekusa and Fumiyuki Uematsu for helpful discussion during the preparation of this article. We also thank Professor Marie Cosgrove for having checked the English language of this paper.

References

1. Jones AS, Phillips DE, Helliwell TR, Roland NJ. Occult node metastases in head and neck squamous carcinoma. Eur Arch Otorhinolaryngol 1993; 250: 446–9.
2. Genden EM, Ferlito A, Shaha AR et al. Complications of neck dissection. Acta Otolaryngol 2003; 123: 795–801.
3. Mueller O, Lightfoot S, Schroeder A. RNA integrity number (RIN) standardization of RNA quality control. Agilent Application Note, May 1 2004. Publication no. 5989‐1165EN. Available from URL: http://www.gene‐quantification.de/RIN.pdf
4. Imbeaud S, Graudens E, Boulanger V et al. Towards standardization of RNA quality assessment using user‐independent classifiers of microcapillary electrophoresis traces. Nucleic Acids Res 2005; 33: e56.
5. Lee J, Hever A, Willhite D, Zlotnik A, Hevezi P. Effects of RNA degradation on gene expression analysis of human postmortem tissues. FASEB J 2005; 19: 1356–8.
6. Technical note, GeneChip Eukaryotic Small Sample Target Labeling Assay Version II. Available from URL: http://genomics.msu.edu/RTSF/small_sample_labeling.pdf
7. Irizarry RA, Bolstad BM, Collin F, Cope LM, Hobbs B, Speed TP. Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res 2003; 31: e15.
8. Li J, Spletter ML, Johnson JA. Dissecting tBHQ induced ARE‐driven gene expression through long and short oligonucleotide arrays. Physiol Genomics 2005; 21: 43–58.
9. Irizarry RA, Hobbs B, Collin F et al. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 2003; 4: 249–64.
10. Millenaar FF, Okyere J, May ST, Van Zanten M, Voesenek LA, Peeters AJ. How to decide? Different methods of calculating gene expression from short oligonucleotide array data will give different results. BMC Bioinformatics 2006; 7: 137.
11. Freund Y, Schapire RE. A short introduction to boosting. J. Japan. Soc. Artif. Intel. 1999; 14: 771–80.
12. Ginzinger DG. Gene quantification using real‐time quantitative PCR: an emerging technology hits the mainstream. Exp Hematol 2002; 30: 503–12.
13. Kuniyasu H, Ohmori H, Sasaki T et al. Production of interleukin 15 by human colon cancer cells is associated with induction of mucosal hyperplasia, angiogenesis, and metastasis. Clin Cancer Res 2003; 9: 4802–10.
14. Kuniyasu H, Oue N, Nakae D et al. Interleukin‐15 expression is associated with malignant potential in colon cancer cells. Pathobiology 2001; 69: 86–95.
15. Shen X, Qian L, Falzon M. PTH‐related protein enhances MCF‐7 breast cancer cell adhesion, migration, and invasion via an intracrine pathway. Exp Cell Res 2004; 294: 420–33.
16. Shen X, Falzon M. PTH‐related protein modulates PC‐3 prostate cancer cell adhesion and integrin subunit profile. Mol Cell Endocrinol 2003; 199: 165–77.
