Abstract
Objective
A pilot study was undertaken to examine accrual rates, efficiency of data capture approaches, study design, and genotyping capacity for a future genetic validation study.
Design
The process pilot evaluated the feasibility of applying a matched case-control design to validate the association of two candidate estrogen receptor (ER) single nucleotide polymorphisms (SNPs) with incidence of venous thromboembolic events (VTE) in breast cancer patients treated with tamoxifen. Matching criteria included frequency matching by age, year of breast cancer diagnosis within 4-year intervals, and geographic residency.
Setting
The study was conducted at Marshfield Clinic, in central Wisconsin.
Participants
Study-eligible cases with a breast cancer diagnosis between 1994 and 2006 who experienced a VTE within 5 years of last tamoxifen exposure were matched at a ratio of 1:4 to controls with a breast cancer diagnosed between 1994 and 2006 with no VTE history following tamoxifen exposure for ≥2 years.
Methods
Feasibility of enrolling, phenotyping, and genotyping 20% of the total number of validated eligible cases and controls was tested in order to project enrollment rates and assess probability of enrolling the projected sample size for the prospective validation study and adequacy of planned data capture. Conditional logistic regression analysis was conducted for the matched case-control study design.
Results
Enrollment accruals included 19 of 24 targeted cases (79%), and 74 of 96 (77%) targeted controls. Electronic data capture for most variables was nearly 100%. No unexpected statistically significant differences were observed between cases and controls. Capacity to conduct in-house screening for rs2234693 (ER1 PvuII), rs9340799 (ER1 XbaI), rs13146272 (CYP4V2), rs2227589 (SERPINC1) and rs1613662 (GP6) was successfully established. Association of GP6 with VTE was further validated (P=0.0403; OR, 0.19).
Conclusion
Observed accrual rates indicate that the larger prospective study will require a multi-center design to enroll enough cases and controls to reach the projected sample size required to validate association of the ER SNPs. The importance of conducting feasibility studies before launching large-scale validation studies of genetic association with adverse drug events, in order to prevent study failure due to poor accrual, is discussed.
Keywords: Breast cancer, Estrogen receptors, Genetic association, Tamoxifen, Venous thromboembolism
Validating genetic associations underlying adverse drug events (ADE) is daunting due to the relatively low incidence of these events, yet such studies hold high significance due to the substantial morbidity and mortality associated with event incidence. Paradoxically, these studies require large numbers of patients incurring incident events. Validation studies thus require careful examination of feasibility and study design to ensure their success. The current study explored feasibility of a study designed to validate an association, observed in a prior pilot study, between genetic variants and incidence of thromboembolic events in women exposed to tamoxifen.1 Although tamoxifen has proven efficacy against all stages of breast cancer, venous thromboembolic events (VTE), including deep venous thrombosis (DVT) and pulmonary embolism (PE), are uncommon but serious complications of tamoxifen treatment, with an estimated incidence of 1.7% to 8.4%.2,3,4 Genetic variability in estrogen-dependent signaling has been postulated to be associated with incidence of adverse events.3
Activity of tamoxifen is associated with competitive binding to estrogen receptors 1 and 2 (ER1, ER2), which blocks ligand binding and thereby inhibits estrogen-mediated growth of estrogen-sensitive tumors.5 ER-mediated alteration of hepatic synthesis of hemostatic proteins has been postulated to increase risk of VTE.6 We extended this hypothesis and proposed that tamoxifen-related VTE may occur due to polymorphisms in the ER1 and ER2 genes, which affect hemostatic factor synthesis variably depending on which single nucleotide polymorphisms (SNPs) are present. Specifically, we posited that functionally important SNPs in ER candidate genes influence risk for DVT/PE in patients receiving tamoxifen for breast cancer treatment. We tested this hypothesis by comparing genotypes of SNPs occurring in ER1 in previously banked DNA from women with breast cancer treated with tamoxifen who experienced VTE and from women drawn from the same population who experienced no VTE following exposure. Study findings demonstrated a tentative association of XbaI (rs9340799) genotype and the ER1 XbaI/PvuII haplotype (rs9340799 and rs2234693) (P=0.035) among subjects who developed VTE.1 These data remain to be validated in a larger cohort of subjects, in addition to SNPs in three genes that demonstrated association with DVT as previously reported by Bezemer et al,7 including the SERPINC1 antithrombin gene (rs2227589) (odds ratio [OR] 1.29; risk allele frequency 0.10), GP6 platelet collagen receptor (rs1613662) (OR=1.15; risk allele frequency 0.84), and CYP4V2 in the cytochrome P450 family 4 (rs13146272) (OR=1.24; risk allele frequency 0.64) (P≤0.05 for all three SNPs). The modest relative risk contributed by these SNPs was deemed important because it represented a large population-attributable risk percentage due to their widespread distribution in the population.
A dosage effect with increased ORs was further noted in homozygotes compared to heterozygotes.7 These SNPs were also explored in the present study to determine their relative contribution of additional risk in the tamoxifen-exposed target population and their potential for genetic interaction with the ER1 SNPs XbaI/PvuII and/or other known risk factors.
Assuming a 4% incidence of VTE among women in our population, enrollment of several hundred women with incident VTE after tamoxifen exposure was projected to be necessary for a powered validation study. The study tested feasibility of enrolling, phenotyping, and genotyping 20% of the total number of validated eligible cases and controls using the proposed matched case-control design in order to project enrollment rates and assess probability of enrolling an adequate sample size into a prospective genetic association validation study with 80% power at 5% level of significance. Further, the study sought to (1) test proposed methodology for identification of subjects, (2) assess clinical data capture and prospective collection of DNA, (3) establish genotyping capacity of collected DNA, and (4) test preliminary analytical approaches to the sample data, although such analyses were under-powered. Study outcomes are summarized in this report, and the importance of conducting feasibility studies when designing validation studies of genetic associations that examine prediction of adverse events related to pharmacological treatment is highlighted.
Methods
Study Design
The study was conducted at Marshfield Clinic (MC), the largest physician-owned private group medical practice in Wisconsin, with an extensive regional oncology practice providing care to residents of mostly rural central, northern, and western Wisconsin and Michigan’s Upper Peninsula through its network of 45 regional centers in collaboration with local and regional hospitals. The healthcare system is supported by an electronic health record (EHR) that provides real-time access to a patient’s health record, including data related to diagnoses, procedures, clinical notes, laboratory data, imaging, pharmacy data, hospital progress notes, and discharge summaries. These data are captured in a data warehouse that is interfaced with dedicated patient registries, including the chemotherapy database and cancer registries.
This study tested enrollment to a matched case-control study design. Cases were defined as MC patients with a breast cancer diagnosis between 1994 and 2006 (to allow for a follow-up period of at least 3 years in the more recently exposed patients) who were treated with tamoxifen and experienced a VTE within 5 years of last tamoxifen exposure. Controls were defined as MC patients diagnosed with breast cancer in the same period who had no history of incident VTE following tamoxifen exposure for a minimum of 2 years. Cases were matched to controls at a ratio of four controls per case. Matching criteria included (1) frequency matching by age group (ie, cases and controls were matched by age ±5 years from age at diagnosis); (2) frequency matching by year of diagnosis (ie, women diagnosed within the same 4-year block were grouped together: 1994–1997, 1998–2001, 2002–2006); and (3) stratification by Marshfield Epidemiologic Study Area (MESA) residency, where MESA is defined as residency within a 19 zip code region centered geographically around Marshfield, Wisconsin, and a 9 zip code area in northern Wisconsin. Established in 1991, MESA is a population-based study area with a total population of approximately 90,000 largely rural-dwelling residents with a history of low in-and-out migration, who seek care almost exclusively at MC, making comprehensive tracking of their health status possible via MC’s EHR and data warehouse. The population is genetically fairly homogeneous, with >88% of northern European ancestry. The Personalized Medicine Research Project (PMRP) is a population-based DNA biobank of 20,000 individuals drawn largely from MESA. Non-MESA resident cases were enrolled and matched regionally to other non-MESA resident controls when sufficient data were available to ascertain that subjects met inclusion/exclusion criteria. Subjects were enrolled prospectively in order to collect DNA. Relevant medical history was validated with subjects at time of enrollment.
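The matching criteria described above can be illustrated with a minimal Python sketch. The `Subject` fields, the helper names, and the sample records are hypothetical and are not drawn from the study database. The sketch also simplifies residency to a single MESA/non-MESA flag, whereas the study matched non-MESA subjects regionally.

```python
from dataclasses import dataclass

# Diagnosis-year blocks as listed in the matching criteria.
DX_BLOCKS = [(1994, 1997), (1998, 2001), (2002, 2006)]


@dataclass(frozen=True)
class Subject:
    subject_id: str   # hypothetical identifier
    age_at_dx: int    # age at breast cancer diagnosis, in years
    dx_year: int      # calendar year of diagnosis
    mesa: bool        # True if the subject resides in a MESA zip code


def dx_block(year):
    """Return the diagnosis-year block containing `year`."""
    for low, high in DX_BLOCKS:
        if low <= year <= high:
            return (low, high)
    raise ValueError(f"diagnosis year {year} is outside 1994-2006")


def is_match(case, control, age_window=5):
    """Apply the three matching criteria to one case/control pair."""
    return (
        dx_block(case.dx_year) == dx_block(control.dx_year)
        and case.mesa == control.mesa
        and abs(case.age_at_dx - control.age_at_dx) <= age_window
    )


def select_controls(case, pool, ratio=4):
    """Return up to `ratio` matching controls, closest in age first."""
    eligible = [c for c in pool if is_match(case, c)]
    eligible.sort(key=lambda c: abs(c.age_at_dx - case.age_at_dx))
    return eligible[:ratio]


# Hypothetical example data.
case = Subject("case-01", age_at_dx=60, dx_year=1999, mesa=True)
pool = [
    Subject("ctrl-01", 58, 2000, True),   # meets all criteria
    Subject("ctrl-02", 66, 1998, True),   # age differs by more than 5 years
    Subject("ctrl-03", 61, 2003, True),   # different diagnosis block
    Subject("ctrl-04", 63, 2001, False),  # different residency stratum
    Subject("ctrl-05", 64, 1998, True),   # meets all criteria
]
matched = select_controls(case, pool)
print([c.subject_id for c in matched])
```

Sorting eligible controls by age difference is one possible tie-breaking choice; the text does not state how controls were prioritized beyond sorting by year of diagnosis, residence, and age at diagnosis.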
The study was reviewed and approved by MC’s institutional review board (IRB). Participants were enrolled after obtaining IRB-approved, written informed consent.
The study tested feasibility of enrolling, phenotyping, and genotyping 20% of the total number of validated eligible cases and controls in order to project enrollment rates and assess probability of enrolling an adequate sample size into a prospective, genetic association validation study with 80% power at 5% level of significance. Table 1 summarizes the projected sample size for the prospective, powered validation study, giving the total number of subjects (cases and controls) projected to achieve adequate sample sizes for the risk ratios shown if cases are matched to controls at a ratio of 1:4, including matching by age at diagnosis and by year of diagnosis within three groups defined in 4-year blocks between 1994 and 2006. The minor allele frequencies for ER1 PvuII (rs2234693) and XbaI (rs9340799) in a European population are both within the range shown (ie, approximately 0.3) according to three separate databases queried by an Entrez search.
Table 1.
Expected frequency, minor allele (a) | Frequency, homozygous/heterozygous variants (aa/aA) | Sample size, risk ratio 1.50 | Sample size, risk ratio 1.75 | Sample size, risk ratio 2.00 |
---|---|---|---|---|
0.10 | 0.19 | 1200 | 630 | 405 |
0.20 | 0.36 | 505 | 245 | 155 |
0.30 | 0.49 | 265 | 130 | 80 |
Interrogation of the EHR identified all individuals with a breast cancer diagnosis (n=3848) and tamoxifen exposure between 1994 and 2006 who were seen within the MC system. Exclusion criteria applied to the electronically identified cases (n=240) included DVT incidence before tamoxifen exposure (n=62), DVT incidence >5 years following tamoxifen exposure (n=14), and PMRP subjects already genetically characterized (n=16), leaving 148 subjects who met the case definition. Sample size estimates projected a requirement for 130 cases for a study with 80% power to validate association of the ER1 SNPs (table 1), and 520 controls, assuming a 1:4 case:control ratio to minimize the number of cases required without loss of power. Assuming a participation rate of approximately 80%, it was projected that 120 of the 130 required validated cases could potentially be enrolled for the full validation study, if all assumptions were valid and all cases and controls could be confirmed. Additionally, because of the number of controls required to achieve a 1:4 ratio of cases to controls, assessing the feasibility of enrolling a subset of the 520 required controls was a further aim of the study. Therefore, this pilot study proposed to test enrollment of 20% (n=24) of the 120 potentially eligible cases, together with controls (n=104) in sufficient numbers to achieve a 1:4 case:control ratio, for a total of 128 subjects, in order to examine the ability of the enrollment plan to achieve the enrollment target and the feasibility of planned data collection. In addition, establishment of in-house genotyping capacity was undertaken by genotyping DNA collected in this study for targeted alleles. Enrollment of study controls was focused first on patients residing within the MESA zip codes, since likelihood of enrolling this population has historically been highest for other studies.
Exclusion/Inclusion Criteria
Patients were excluded from the feasibility study if (1) they were enrolled in PMRP and were previously genotyped in the original association study;1 (2) VTE occurred more than 5 years after the last exposure to tamoxifen; (3) they were deceased; or (4) they had onset of VTE before tamoxifen exposure. Study-eligible cases were defined as living patients with breast cancer who experienced first DVT or DVT recurrence following tamoxifen exposure. Living patients with breast cancer who did not experience DVT following tamoxifen exposure during an observational window of at least 2 years following first exposure were defined as study-eligible controls for an appropriately matched case, as defined by the matching criteria outlined above.
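The case-identification and enrollment-target arithmetic reported in the Study Design section can be retraced with a short Python sketch. All counts are taken from the text; the variable names are illustrative only, and the step from 118.4 to the reported figure of approximately 120 is treated here as rounding, which is an assumption.

```python
# Counts reported in the text for the electronic case-identification step.
electronically_identified = 240
exclusions = {
    "DVT before tamoxifen exposure": 62,
    "DVT >5 years after tamoxifen exposure": 14,
    "PMRP subjects already genetically characterized": 16,
}
cases_meeting_definition = electronically_identified - sum(exclusions.values())

# Projected requirement for the full validation study (table 1, 1:4 matching).
required_cases = 130
controls_per_case = 4
required_controls = required_cases * controls_per_case

# Assumed participation rate of approximately 80%.
participation_rate = 0.80
expected_enrollable = cases_meeting_definition * participation_rate

# The pilot targeted 20% of the approximately 120 potentially enrollable cases.
potentially_enrollable_cases = 120  # figure reported in the text
pilot_case_target = round(potentially_enrollable_cases * 0.20)

print(cases_meeting_definition)       # subjects meeting the case definition
print(required_controls)              # controls needed at a 1:4 ratio
print(round(expected_enrollable, 1))  # compare with the reported ~120
print(pilot_case_target)              # pilot case enrollment target
```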
Screening and Enrollment
Potentially eligible subjects treated with oral tamoxifen from January 1992 to December 2006 were validated using the chemotherapy database or using FreePharma text search software and the Medications Manager application. Electronic documentation of dose was required to increase the likelihood that tamoxifen exposure had actually occurred. Manual validation of tamoxifen use was performed on all identified patients. Incidence of DVT/PE (or lack thereof) in study subjects was identified electronically through interrogation of the EHR, and potential cases were validated by manual abstraction of radiographic data from the vascular laboratory to confirm VTE. Eligible controls that met matching criteria for each consenting case were selected and matched from the subset of eligible controls, which were sorted by year of diagnosis, residence, and age at diagnosis. Among the 143 that had been identified by electronic feasibility assessment, only 91 remained eligible following the validation process. Reasons for eligibility failure were that the patient (1) was deceased, (2) had never initiated tamoxifen, or (3) had experienced VTE prior to tamoxifen exposure.
Patient Recruitment Protocol
Eligible subjects were sent a recruitment letter that was followed up by telephone 2 weeks later. Following verbal consent, subjects were scheduled for an appointment to provide written informed consent, undergo phlebotomy, and validate their medical history.
DNA Collection
Heparinized blood (10 ml) was drawn from each consenting participant and sent to the research laboratory for DNA extraction from the buffy coat and genotyping for Factor V Leiden; prothrombin gene mutation; the estrogen receptor candidate SNPs ER1 PvuII and XbaI; and CYP4V2 (rs13146272), SERPINC1 (rs2227589), and GP6 (rs1613662). Additional blood was collected and submitted to Marshfield Laboratories for determination of Proteins C and S, lupus anticoagulant panel, homocysteine, Antithrombin III, and cardiolipin levels. Patients who had previously been screened for the clinical markers (eg, patients who had experienced a DVT) were not required to be re-tested, and available historical laboratory data were evaluated for these subjects.
Phenotypic Data Collection
To build a phenotypic profile, data were abstracted electronically or manually from the EHR or registries, and data points were manually validated. Variables collected included age; gender; race; education level; weight, height, and body mass index (BMI) at start of tamoxifen therapy; smoking history; breast cancer data (date of diagnosis, tumor stage, grade, morphology, regional lymph node involvement, human epidermal growth factor receptor 2 [HER2 Neu] status, estrogen receptor/progesterone receptor [ER/PR] status, types and dates of treatment, and recurrence); tamoxifen therapy data (start and stop dates and duration of treatment); DVT data (date of diagnosis, site of DVT, port association); surgery within 3 to 6 months of DVT diagnosis; and laboratory values captured at the date closest to initiation of tamoxifen (including total cholesterol, low and high density lipoproteins, and triglycerides). If electronically available, laboratory data concerning the following markers were also abstracted: lupus anticoagulant panel, protein S, protein C, homocysteine, factor V Leiden, prothrombin gene mutation, Antithrombin III, and cardiolipin antibodies. Data related to concomitant medications, including hormone replacement therapy, oral contraceptives, nonsteroidal anti-inflammatory drugs, acetyl salicylic acid (aspirin), statins, cholesterol lowering agents, corticosteroids, and antihypertensives, were also collected. Presence of the following co-morbid conditions was also documented: hypertension, diabetes, congestive heart failure, depression, liver failure, kidney failure, coronary artery disease, chronic obstructive pulmonary disease, previous history of DVT or PE, or other cancers.
Laboratory Approaches to Genotyping
DNA samples of eligible consented subjects were extracted from collected blood using the Autopure LS instrument, which performs automated purification of archival-quality DNA. Genotyping was accomplished using a single Sequenom assay for all of the relevant polymorphisms: ER1 PvuII (rs2234693), XbaI (rs9340799), CYP4V2 (rs13146272), SERPINC1 (rs2227589), and GP6 (rs1613662). Each polymorphism was masked using SNP masker and placed into the Sequenom Assay Design 3.1 program to develop the multiplex assay. A single assay with all of the polymorphisms was created. Polymorphisms had previously been tested in a theoretical multiplex reaction using this program to create the assay design, so creation of a single multiplex reaction was possible without delay.
Allele detection was carried out using the MALDI-TOF mass spectrometric method of allele determination on a Sequenom platform (MassARRAY Typer 3.4 Software, Sequenom Corp, San Diego, California). A multiplexed PCR reaction was carried out on genomic DNA to amplify regions of interest. These products were then annealed with primers in the region directly adjacent to the polymorphism of interest, and a single base pair primer extension reaction was performed to generate the allele-specific products. The products were placed on a proprietary chip and analyzed using MALDI-TOF mass spectrometry and Sequenom Typer 3.4 software to make the allele determination. Genotype determinations that were flagged as lower confidence, or that the program failed to make automatically, were checked manually to determine whether allele calls could be made.
For quality control, the assay was tested on a panel of 28 Caucasian HapMap Central European samples from the CEPH population provided by Coriell. Results were compared with those released through the dbSNP Genotyping detail (Genotype Query Form–Beta, http://www.ncbi.nlm.nih.gov/projects/SNP/snp_gf.cgi?pg=2&RSPick=1&tax_id=9606&RSlist=486907) or the cancer500 web site (Cancer Genome Anatomy Project, SNP500Cancer Database, http://snp500cancer.nci.nih.gov/home_1.cfm?CFID=2111817&CFTOKEN=11851595) to ensure genotyping accuracy. Polymorphisms that were not automatically determined at least 80% of the time were redesigned or excluded from the assay. The frequencies of the alleles and genotypes obtained with the assay were compared to the expected frequencies from dbSNP, and all polymorphisms in the assay were checked for departure from Hardy-Weinberg equilibrium. Any polymorphisms that failed were further analyzed to ensure that no genotyping error had occurred. On the chip that included the sample genotypes, several (2 to 4) previously genotyped Coriell samples were also included to ensure that the assay was working as intended. Water negative control samples (2 to 4) were also included to ensure that accurate calls were made.
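The two quality-control checks described above, the 80% automatic call threshold and the test for departure from Hardy-Weinberg equilibrium, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names and genotype counts are hypothetical, and a one-degree-of-freedom chi-square test is used here as a common choice, since the text does not state which test was applied. For small panels an exact test is often preferred.

```python
import math


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square test (1 df) for departure from Hardy-Weinberg equilibrium.

    Returns (chi2, p_value) for observed counts of the three genotypes.
    """
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)   # frequency of allele A
    q = 1 - p                         # frequency of allele a
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def call_rate_ok(n_called, n_total, threshold=0.80):
    """True if the automatic call rate meets the threshold named in the text."""
    return n_called / n_total >= threshold


# Hypothetical genotype counts chosen to sit at Hardy-Weinberg proportions.
chi2, p_value = hwe_chi_square(49, 42, 9)
print(round(chi2, 6), round(p_value, 6))

# Hypothetical call counts on a 28-sample panel.
print(call_rate_ok(23, 28), call_rate_ok(22, 28))
```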
Statistical Analysis Summary
Since the proposed study represented a feasibility process pilot for a future clinical trial, it lacked statistical power for hypothesis testing even if the projected enrollment targets of 24 cases and 96 controls were achieved. Therefore, statistical approaches were largely intended to simulate analyses to be applied to the future study. In univariate analysis, descriptive statistics (for continuous measurements: mean, standard deviation, median, and range; for categorical measurements: percentage and corresponding 95% confidence interval) were reported for each patient attribute, including BMI, age at diagnosis, history of co-morbid conditions, specific genetic markers of interest, and other variables included in the data collection forms. Conditional logistic regression models were fitted to obtain odds ratios and corresponding 95% confidence intervals for VTE status, with each of the above-mentioned risk factors and each of the targeted genetic markers in turn as the predictor variable, in order to test this approach for the future study design, although the genetic data lacked statistical power (power=20%). To conduct these analyses, SNPs were classified as binary markers by coding the homozygous risk allele as ‘1’ and the heterozygous or homozygous non-risk allele as ‘0’. Results were then analyzed using SAS PROC PHREG, with case-control status specified in the STRATA statement in order to account for matching. Further descriptive analyses included determination of enrollment rate, analysis of data capture rates for each data point, and assessment of accuracy of genotyping. All statistical analyses were carried out using a commercially available statistical software package (SAS).
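The binary SNP coding described above can be illustrated with a brief Python sketch. The function name, the genotype string format, and the example records are hypothetical. The records carry a matched-set identifier because a conditional logistic model compares each case with its own controls within such a set; the model fitting itself, which the study performed in SAS, is not reproduced here.

```python
def code_genotype(genotype, risk_allele):
    """Return 1 for the homozygous risk genotype, otherwise 0.

    `genotype` is a two-allele string such as "A/G" or "AG" (assumed format).
    """
    alleles = genotype.upper().replace("/", "")
    if len(alleles) != 2:
        raise ValueError(f"expected two alleles, got {genotype!r}")
    return 1 if alleles == risk_allele.upper() * 2 else 0


# Hypothetical records for one matched set: (matched_set, is_case, genotype).
records = [
    ("set-1", 1, "A/A"),
    ("set-1", 0, "A/G"),
    ("set-1", 0, "G/G"),
    ("set-1", 0, "A/A"),
    ("set-1", 0, "G/A"),
]

# Attach the binary marker that would serve as the predictor variable.
coded = [
    (matched_set, is_case, code_genotype(genotype, "A"))
    for matched_set, is_case, genotype in records
]
for row in coded:
    print(row)
```

Collapsing heterozygotes with non-risk homozygotes corresponds to a recessive coding of the risk allele; other codings, such as additive allele counts, would need a different function.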
Results
The following narrative summarizes outcomes of this feasibility study.
Enrollment
Overall enrollment achieved was 77.5% of total targeted enrollment, with 93 of the targeted 120 subjects enrolled. Cases achieved 79% of the enrollment target (19 of 24 targeted cases), and controls achieved 77% of the enrollment target (74 of 96 targeted controls) over a timeframe of 22 months, at which time enrollment was terminated. Screening of 91 individuals to achieve enrollment of 19 of 24 cases, and of 352 subjects to enroll 74 of 104 controls, resulted in a screening failure rate of 79% for both cases and controls.
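The enrollment and screening failure rates reported above follow directly from the counts given in the text, as the short Python check below shows. The helper name `pct` is illustrative only.

```python
def pct(numerator, denominator):
    """Percentage of `denominator` represented by `numerator`."""
    return 100 * numerator / denominator


# Enrollment against targets (counts from the text).
case_rate = pct(19, 24)
control_rate = pct(74, 96)
overall_rate = pct(19 + 74, 24 + 96)

# Screening failure: share of screened subjects who were not enrolled.
case_screen_failure = 100 - pct(19, 91)
control_screen_failure = 100 - pct(74, 352)

print(round(case_rate), round(control_rate), overall_rate)
print(round(case_screen_failure), round(control_screen_failure))
```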
Barriers to Enrollment Identified by Patients
Among patients who chose not to participate, 17 of 70 (24%) offered one of the following explanations:
12 cited distance from the Clinic or unwillingness to make additional Clinic visits
1 cited being overwhelmed by current circumstances
1 was scheduled for surgery
2 cited difficulty in ambulation or barriers to getting to the Clinic
1 declined to be involved in research.
Among patients enrolled, blood and DNA were obtained from 100% of subjects, and no subjects who enrolled withdrew.
Matched Analysis
Matched analysis between cases and controls applying conditional logistic regression found no statistically significant differences across the variables analyzed, which included demographic variables (eg, age, race, education), clinical variables (eg, BMI, blood pressure), environmental/pharmaceutical exposures (eg, smoking, hormone replacement therapy, contraceptive use, anti-inflammatory medications, cholesterol lowering agents, and anti-hypertensive drugs), laboratory values (eg, lupus anticoagulants, clotting factors, homocysteine, lipid panels), cancer-related variables (eg, surgical or radiological breast cancer treatment, lymph node involvement, recurrence, hormone receptor status, age at diagnosis, age at tamoxifen initiation, age at time of data abstraction, and tumor characteristics [stage, grade, morphology]), and comorbidities (eg, hypertension, dementia, cardiac, vascular, hepatic, renal or connective tissue disease, diabetes, and ulcers).
Demographic Data Summary
Demographic data of enrolled subjects are summarized in table 2. Cancer-related variables for cases and controls are summarized in table 3. A statistically significant difference in breast cancer treatment by chemotherapy was observed between cases and controls. Table 4 summarizes comparisons of pharmaceutical exposures between cases and controls. The difference in warfarin use between cases and controls achieved statistical significance (P<0.0001). In matched analysis, mean duration of tamoxifen treatment was 3.4 (±1.8) years among cases and 4.6 (±1.0) years among controls, and this difference achieved statistical significance (P=0.0258). For all other variables, including age at abstraction, age at initiation and termination of tamoxifen treatment, BMI, lupus anticoagulant, protein S, protein C, homocysteine, antithrombin III, lipid panels, and blood pressure measures, no statistically significant differences were found between cases and controls when matched analysis was performed. Conditions and relative frequencies of other comorbidities seen most commonly among cases and controls are summarized in table 5.
Table 2.
Variable | Cases (N=19) | % Capture (n) | Controls (N=74) | % Capture (n) |
---|---|---|---|---|
Caucasian race | 18 (100%) | 95% (18) | 65 (100%) | 88% (65) |
Education ≥12 years | 14 (100%) | 73% (14) | 26 (96%) | 37% (27) |
MESA resident | 6 (32%) | 100% (19) | 24 (32%) | 100% (74) |
Age at abstraction (y) | 69.1 ± 8.1 | 100% (19) | 69.2 ± 7.4 | 100% (74) |
MESA, Marshfield Epidemiologic Study Area
Table 3.
Variable | Cases | % Capture (n) | Controls | % Capture (n) |
---|---|---|---|---|
Age at primary breast cancer diagnosis (y) | 59.8 ± 7.7 | 100% (19/19) | 59.6 ± 6.9 | 100% (74/74) |
Lymph node involvement | 7/19 (37%) | 100% (19/19) | 18/71 (25%) | 96% (71/74) |
Lumpectomy treatment | 11/19 (58%) | 100% (19/19) | 38/74 (51%) | 100% (74/74) |
Mastectomy treatment | 9/19 (47%) | 100% (19/19) | 40/74 (54%) | 100% (74/74) |
Radiation therapy | 14/19 (74%) | 100% (19/19) | 53/74 (72%) | 100% (74/74) |
Chemotherapy* | 14/19 (74%) | 100% (19/19) | 26/73 (36%) | 99% (73/74) |
Breast cancer recurrence | 2/19 (11%) | 100% (19/19) | 3/74 (4%) | 100% (74/74) |
HER2 1, 2 or 3+ status | 7/19 (37%) | 100% (19/19) | 16/26 (62%) | 35% (26/74) |
ER positive status | 2/17 (12%) | 90% (17/19) | 2/59 (3%) | 80% (59/74) |
PR positive status | 4/17 (24%) | 90% (17/19) | 9/56 (16%) | 76% (56/74) |
Cancer stage | | 84% (16/19) | | 73% (54/74) |
Stage I | 8/16 (50%) | | 26/54 (48%)† | |
Stage II A/B | 7/16 (44%) | | 24/54 (44%) | |
Stage III A/B/C | 1/16 (6%) | | 4/54 (7%) | |
Tumor grade | | 100% (19/19) | | 100% (74/74) |
I/III | 2/19 (11%) | | 10 (14%) | |
II/III | 6/19 (32%) | | 21 (28%) | |
III/III | 4/19 (21%) | | 8 (11%) | |
Tumor morphology | | 100% (19/19) | | 100% (74/74) |
Ductal involvement | 14/19 (74%) | | 52/74 (70%) | |
Invasive tumors | 7/19 (37%) | | 23/74 (31%) | |
*P=0.0225. †Controls were matched 1:3 for stage I (insufficient stage I controls to match at ratio of 1:4); matching at all other stages was 1:4. HER2, human epidermal growth factor receptor 2; ER, estrogen receptor; PR, progesterone receptor.
Table 4.
Variable | Cases | % Capture (n) | Controls | % Capture (n) |
---|---|---|---|---|
Hormone replacement therapy | 15/19 (79%) | 100% (19/19) | 40/68 (59%) | 92% (68/74) |
Contraception | | | | |
Oral | 12/17 (71%) | 90% (17/19) | 28/49 (57%) | 66% (49/74) |
Intrauterine | 2/16 (13%) | 84% (16/19) | 4/47 (9%) | 64% (47/74) |
Non-steroidal anti-inflammatory drugs | 14/19 (74%) | 100% (19/19) | 39/74 (53%) | 100% (74/74) |
Acetyl salicylic acid (aspirin) | 13/19 (68%) | 100% (19/19) | 43/73 (59%) | 99% (73/74) |
Corticosteroids | 7/19 (37%) | 100% (19/19) | 20/73 (27%) | 99% (73/74) |
Statins | 10/19 (53%) | 100% (19/19) | 31/74 (42%) | 100% (74/74) |
Other cholesterol lowering agent | 5/19 (26%) | 100% (19/19) | 7/74 (9%) | 100% (74/74) |
Antihypertensive agents | 11/19 (58%) | 100% (19/19) | 36/74 (49%) | 100% (74/74) |
Warfarin* | 17/19 (89%) | 100% (19/19) | 5/73 (7%) | 99% (73/74) |
*P<0.0001 for warfarin use.
Table 5.
Comorbidity | Cases (N=19) n (%) | Controls (N=74) n (%) |
---|---|---|
Hypertension | 11 (58%) | 39 (43%) |
Myocardial infarction | 2 (11%) | 3 (4%) |
Congestive heart failure | 1 (5%) | 2 (3%) |
Peripheral vascular disease | 2 (11%) | 1 (1%) |
Coronary artery disease | 4 (21%) | 5 (7%) |
Chronic obstructive pulmonary disease | 1 (5%) | 2 (3%) |
Diabetes | 5 (26%) | 10 (14%) |
Other cancer | 5 (26%) | 8 (11%) |
Connective tissue disease | 1 (5%) | 4 (5%) |
Ulcers | 1 (5%) | 1 (1%) |
Comorbidity Analysis Summary
No statistically significant differences were noted among cases and controls for distribution of comorbidities, and data were available for 100% of both cases (19/19) and controls (74/74). No cases (0/19) or controls (0/74) had a history of the following comorbidities: leukemia, lymphomas, cardiovascular disease, diabetes-associated end organ damage, moderate to severe liver disease, hemiplegia or HIV/AIDS. No cases (0/19) had a history of the following conditions (prevalence among controls is shown in parentheses): mild liver disease (1 control [1%]), renal disease (4 controls [5%]), metastatic solid tumor (1 control [1%]), dementia (1 control [1%]), or stroke (2 controls [3%]).
Genetic Analysis
Capacity to screen in-house for the following SNPs was successfully established: rs2234693 (ER1 PvuII), rs9340799 (XbaI), rs13146272 (CYP4V2), rs2227589 (SERPINC1), and rs1613662 (GP6). With only 20% power to observe an association between incidence of VTE and most risk alleles, conditional logistic regression analysis adjusting for matching and other variables between cases and controls did not achieve statistical significance for most alleles examined (data not shown). The exception was the risk allele ‘A’ for GP6, which achieved statistical significance (P=0.0403) among the 10 cases and 58 matched controls following conditional logistic regression analysis, adjusting for matching and other variables.
Discussion
Rationale for Undertaking a Process Feasibility Study
A previous study conducted by our group identified two SNPs in the estrogen receptor 1 gene (ER1), PvuII and XbaI, that demonstrated a possible association with emergence of DVT in women with breast cancer treated with tamoxifen.1 The sample size projected to validate these findings in a further statistically powered study was n=120 cases and n=480 controls, assuming a 1:4 ratio of cases to controls. Given the historically low success rate in validating genetic association, this sample size proved challenging to achieve at a single institution in the face of the following considerations: (1) low historical accrual rates to non-treatment oncology studies, (2) the genetic nature of the validation study, and (3) the low adverse event rate among tamoxifen-exposed cases identified at our institution. A feasibility trial assessing enrollment, overall study design, and the need for a multi-center study was viewed as essential for generating preliminary data to accurately project enrollment rates and cost across institutions, and to demonstrate feasibility of robust enrollment and appropriateness of the study design, before applying for funding support for the validation study.
Conducting feasibility pilot studies in advance of implementing study designs that require robust prospective accrual is becoming increasingly important in the era of translational medicine. Some NIH institutes, including the National Heart Lung and Blood Institute, have already begun to monitor recruitment adequacy with the purpose of predicting failure of clinical trials based on inadequacy of accrual, and terminating such trials early.8 Several recent studies examining accrual to federally funded treatment trials reported that 50% to 80% of trials did not achieve the projected accrual goals within the proposed accrual timeframe, and over one-third failed to meet minimum accrual goals at the time of closure.9,10 Notably, among 82 cancer cooperative group trials, therapeutic trials had superior accrual to non-therapeutic trials (59% vs. 27%, P=0.05).10 Importantly, the group indicated that studies that conducted pre-trial accrual assessment achieved higher rates of sufficient accrual than studies that did not (67% vs 47%) and emphasized a need to identify approaches that more accurately project accrual to clinical studies.10
Studies seeking to validate genetic association generally require large sample sizes. This becomes particularly challenging when seeking to define biomarkers predictive of serious adverse response to pharmaceutical intervention, where a relatively large cohort of persons impacted by the target adverse event must be enrolled despite the low event rate among users of the medication. Research examining enrollment to oncology genetic studies in the context of epidemiology has largely focused on hereditary breast cancer; enrollment into other cancer-related genetic epidemiological research has not been characterized11 and represents a relatively new area of research.12
A recent clinical trial, published in 2012 by Regan et al,13 examined CYP2D6 SNP association with tamoxifen response in post-menopausal women (n=4861). The study reported no association between CYP2D6 genotype and disease control in tamoxifen-treated women, but did detect an association with hot flashes (a frequent side effect) as an ADE. Incidence of VTE was not addressed in their study. The authors emphasized the need for more studies defining tamoxifen metabolism, mediation of drug effects, and collateral response to drugs.13 Moreover, in the context of newly defined candidate alleles, such as ER1 PvuII and XbaI identified in our previous study,1 there is little historical precedent to inform optimal design of studies to validate association of these markers with ADEs associated with tamoxifen or other oncological adjuvant therapy. The present feasibility pilot was undertaken to inform optimization of clinical trial design to validate these newly identified candidate SNPs.
Demonstration and validation of genetic association in the context of complex diseases in population-based studies of unrelated individuals is generally driven by a number of factors, including true effect size of the polymorphism, distribution of the allele frequency in the given population, case-control matching, degree of genetic heterogeneity of the population, frequency of the disease event, and presence of other confounding variables or interactions that impact the candidate association.14 Estimates of the strength of effect of a given allele from exploratory studies enrolling modest numbers of cases and controls have often failed to retain the statistical significance reported in initial studies, emphasizing the need for validation studies. Historically, estimates of the magnitude of effect for the variant under study, and many candidate associations, have not been supportable.15–17
Key Findings: Interpretation and Discussion of Study Outcomes
Due to a historically high research participation rate in our population, minimal time investment by the participant for collection of a single blood sample at their convenience during a clinic visit, and relatively low risk of the study, an optimistic assumption of 80% enrollment among the eligible, electronically identified subjects was projected to achieve the 20% enrollment of the full validation study. Accrual of eligible cases and controls proved challenging, however, despite extension of the projected enrollment window from 5 months to 22 months and intensive screening efforts to achieve the target of 24 cases and 104 controls. Among the 148 eligible cases identified by interrogation of the EHR, 63% (n=93) were validated manually as true cases. Loss of eligibility was largely attributable to death of patients before study initiation, false positive status regarding tamoxifen exposure, or VTE incidence in a timeframe before drug exposure. Ultimately the entire pool of eligible cases (n=93, 100%) identified as having experienced a VTE following tamoxifen exposure was approached for enrollment, resulting in 21% accrual among all eligible cases screened and achieving 79% of the enrollment target (19/24). Among controls, the accrual rate was 21% (74/352), achieving 71% of targeted enrollment and providing a case:control ratio of nearly 1:4. The overall institutional contribution to the projected sample size for the full validation study was 14.6% of the required number of cases (19/130) and 14.2% of controls (74/520).
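The accrual percentages in this paragraph can be re-derived directly from the reported counts. The short sketch below does only that arithmetic; the labels and the helper `pct` are illustrative names, the count of 19 enrolled cases is the one used in the 19/130 figure, and the denominators are taken as reported without further interpretation.

```python
def pct(numerator, denominator):
    """Return a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)

# (numerator, denominator) pairs as reported in the text; labels are illustrative.
counts = {
    "cases validated / EHR-identified cases": (93, 148),
    "cases enrolled / validated cases": (19, 93),
    "cases enrolled / pilot case target": (19, 24),
    "controls enrolled / controls in reported denominator": (74, 352),
    "controls enrolled / pilot control target": (74, 104),
    "cases enrolled / full-study case requirement": (19, 130),
    "controls enrolled / full-study control requirement": (74, 520),
}

rates = {label: pct(n, d) for label, (n, d) in counts.items()}

for label, value in rates.items():
    print(f"{label}: {value}%")
```

Computed this way, case accrual among validated cases comes to 20.4%, marginally below the rounded 21% quoted for both groups; the other values agree with the text to the nearest percentage point.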
The 21% observed enrollment rate among both cases and controls was somewhat lower than the 27% historical enrollment reported by Schroen et al10 for NCI-supported non-therapeutic trials performed by the Clinical Trials Cooperative Group. Institutional access was the reason cited most frequently by subjects who declined enrollment. This is not entirely unexpected, given the rural nature of the population served by our institution and the large distances patients travel to receive care, although other studies have accrued well in the same setting. It is further borne out by the low percentage of MESA residency (32%) among enrollees (table 2); MESA residents live in closest proximity to the institution, since MESA encompasses the 19 zip codes surrounding the city of Marshfield, the healthcare system’s headquarters. These data indicated that >60% of enrolled subjects may have travelled considerable distances to participate in this study. This argues strongly for an enrollment strategy that links enrollment of targeted patients to future scheduled clinic visits in order to reduce travel as a barrier to enrollment. Another potential variable contributing to the lower rate of enrollment observed in the present study is the genetic nature of the study.
Cases and controls were generally well matched, exhibiting few statistically significant differences, with cases generally being matched at a ratio of 1:3 to 1:4 during analyses. Only two variables differed significantly between cases and controls: (1) warfarin utilization (P<0.0001), which was expected, since cases experiencing VTE would have received anticoagulation therapy, whereas controls with no history of VTE would not be expected to have warfarin exposure; and (2) chemotherapy treatment (P=0.0225). Cases were twice as likely to have had chemotherapy as controls (74% vs 36%, respectively). This finding is not particularly surprising, since chemotherapy and intravenous vascular access used for chemotherapy administration are risk factors for VTE. Overall, collection of data for most variables was possible for nearly 100% of cases and controls. Variables for which data capture fell short of 100%, with probable causes and proposed solutions, are summarized in table 6.
Table 6.
Variable | Probable Cause | Solution |
---|---|---|
Race | Inconsistently captured in EHR | Collect at enrollment |
Education | Not well documented in EHR | Re-evaluate value of data/collect at enrollment |
Cancer stage | Not entered in EHR | Supplement with manual chart abstraction |
HER2/neu; ER/PR status | May not have been evaluated | Better evaluated as widely accepted standard in recent years. Limited testing may be required for older samples. |
Contraception | Not well captured in EHR | Collect on enrollment |
HER2/neu, Human Epidermal Growth Factor Receptor 2; ER, Estrogen receptor; PR, Progesterone receptor; EHR, Electronic health record
Important Insights and Next Steps
The study successfully established and validated in-house genotyping accuracy for the alleles of interest in preparation for the expanded study. A limitation of the current study was the low statistical power (20%) to validate SNP association with incidence of VTE. Notably, the risk allele ‘A’ for GP6 achieved and retained statistical significance (P=0.0403), even at 20% power, among the 10 cases and 58 matched controls following conditional logistic regression analysis and adjustment for age, year of diagnosis, BMI, time to follow up, and cancer stage, achieving an OR of 0.19. These data tentatively validate the putative association of GP6 with DVT previously reported.7
An additional limitation of the study is survival bias. Due to the high mortality rate among breast cancer patients, the potential for death either before observation of an ADE following tamoxifen exposure or as a consequence of an ADE with no opportunity for sample collection will introduce study bias and ‘effect size erosion’.17 Anderson et al18 used simulation to develop formulas for estimating the magnitude of effect size erosion based on a variant’s OR for disease, OR of lethality, and minor allele frequency, with the aim of improving the precision of power calculations for replication of genetic association outcomes in case-control designs. Revisiting the sample size estimates projected in the current feasibility study via application of the proposed formula is warranted before the full study is advanced.
The decision to conduct a multi-center study brings with it additional challenges. Greene et al19 suggested that failure to replicate genetic association may be a function of allele frequency of a second allele if the alleles are interactive. Thus, the validation study will need to test for allelic interactions and examine allele frequencies among populations in which association replication is undertaken. Whereas our population is relatively homogeneous, conducting the study at a multi-institutional level may introduce population stratification and heterogeneity, which will need to be factored into the design of an expanded study.
The Way Forward
The outcomes of our study reinforce the importance of conducting enrollment feasibility trials if trial failure is to be prevented. In a recent analysis of low enrollment at a single institution due to failure of clinical trials to recruit participants, Kitterman et al20 projected an annual financial loss of $1 million for studies that failed to achieve target enrollment, with the majority of these representing government-funded studies. Nassar et al21 argued that under-enrolled clinical studies may be construed as unethical, in light of wasted resources, risk incurred by research participants, wasted time, and lack of clinical or scientific value of these trials. These authors advocated use of professional recruitment agencies to help define recruitment strategies for challenging studies.
Projection of accrual rates established in this feasibility study reinforced the need for a multi-institutional design to achieve target enrollment into the validation study, and enrollment outcomes observed will inform scaling required to achieve a successful outcome. An extensive review by Wilke et al22 of successes and challenges for defining genetic risk factors specifically in the context of clinically significant ADEs advocates multi-center approaches to accomplish validation of genetic association. These authors recommended that exploration of genetic association with low-frequency ADEs be undertaken in the context of consortia, or be nationally or globally scaled, encompassing collaborations between government agencies, healthcare systems, academic medical centers, and pharmaceutical companies in order to achieve the scale required to validate association. Further recommendations included modeling of functional relevance of the SNP to pharmacokinetics and pharmacodynamics associated with drug metabolism, establishment of the background incidence rates, and unambiguous diagnostic criteria.22 In the context of exploring the genetics of tamoxifen metabolism and outcomes, a previously established consortium will be explored as a venue for advancing this validation study.
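As a rough illustration of the scaling question raised above, the sketch below estimates how many sites would be needed under one simplifying assumption that is not made in this article: that every participating site yields the same number of cases and controls as the single-site pilot (19 and 74), measured against the full-study requirement of 130 cases and 520 controls used in the contribution figures. The function name `sites_needed` is hypothetical, and real site yields would depend on population size, event rate, and enrollment window.

```python
import math

def sites_needed(required, per_site_yield):
    """Smallest number of equally productive sites that meets the requirement."""
    return math.ceil(required / per_site_yield)

# Full-study requirement and single-site pilot yield, as reported in this article.
required_cases, required_controls = 130, 520
pilot_cases, pilot_controls = 19, 74

# ASSUMPTION (illustrative only): each site matches the pilot site's yield.
case_sites = sites_needed(required_cases, pilot_cases)
control_sites = sites_needed(required_controls, pilot_controls)

# The binding constraint is whichever group needs more sites.
total_sites = max(case_sites, control_sites)
print(case_sites, control_sites, total_sites)
```

Under this assumption the requirement works out to roughly seven or eight comparable sites, which is meant only to convey the order of magnitude implied by a 14% to 15% single-site contribution.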
Acknowledgements
We wish to acknowledge the work of Carla Rottscheit, project programmer, and Katrina Moore, research coordinator, for their assistance on this project. We acknowledge Marie Fleisner for editorial assistance with manuscript preparation.
References
- 1. Onitilo AA, McCarty CA, Wilke RA, Glurich I, Engel JM, Flockhart DA, Nguyen A, Li L, Mi D, Skaar TC, Jin Y. Estrogen receptor genotype is associated with risk of venous thromboembolism during tamoxifen therapy. Breast Cancer Res Treat 2009;115:643–650
- 2. Fisher B, Costantino JP, Wickerham DL, Redmond CK, Kavanah M, Cronin WM, Vogel V, Robidoux A, Dimitrov N, Atkins J, Daly M, Wieand S, Tan-Chiu E, Ford L, Wolmark N. Tamoxifen for prevention of breast cancer: report of the National Surgical Adjuvant Breast and Bowel Project P-1 Study. J Natl Cancer Inst 1998;90:1371–1388
- 3. Normanno N, Di Maio M, De Maio E, De Luca A, de Matteis A, Giodano A, Perrone F, NCI-Naple Breast Cancer Group. Mechanisms of endocrine resistance and novel therapeutic strategies in breast cancer. Endocr Relat Cancer 2005;12:721–747
- 4. Baum M, Budzar AU, Cuzick J, Forbes J, Houghton JH, Klijn JG, Sahmoud T, ATAC Trialists’ Group. Anastrozole alone or in combination with tamoxifen versus tamoxifen alone for adjuvant treatment of postmenopausal women with early breast cancer: first results of the ATAC randomised trial. Lancet 2002;359:2131–2139
- 5. Jordan VC, O’Malley BW. Selective estrogen-receptor modulators and antihormonal resistance in breast cancer. J Clin Oncol 2007;25:5815–5824
- 6. Dhingra K. Antiestrogens--tamoxifen, SERMs and beyond. Invest New Drugs 1999;17:285–311
- 7. Bezemer ID, Bare LA, Doggen CJ, Arellano AR, Tong C, Rowland CM, Catanese J, Young BA, Reitsma PH, Devlin JJ, Rosendaal FR. Gene variants associated with deep vein thrombosis. JAMA 2008;299:1306–1314
- 8. National Heart Lung and Blood Institute. Guidance and Implementation for Monitoring Adequacy of Accrual of Participants to NHLBI Supported Human Subjects Research. Available at: http://www.nhlbi.nih.gov/funding/policies/accrual_guidelines.htm Accessed: May 17, 2012
- 9. Cheng SK, Dietrich MS, Dilts DM. Predicting accrual achievement: monitoring accrual milestones of NCI-CTEP sponsored clinical trials. Clin Cancer Res 2011;17:1947–1955
- 10. Schroen AT, Petroni GR, Wang H, Gray R, Wang XF, Cronin W, Sargent DJ, Benedetti J, Wickerham DL, Djulbegovic B, Slingluff CL Jr. Preliminary evaluation of factors associated with premature trial closure and feasibility of accrual benchmarks in phase III oncology trials. Clin Trials 2010;7:312–321
- 11. Ford BM, Evans JS, Stoffel EM, Balmana J, Regan MM, Syngal S. Factors associated with enrollment in cancer genetic trials. Cancer Epidemiol Biomarkers Prev 2006;15:1355–1359
- 12. Fleeman N, Martin Saborido C, Payne K, Boland A, Dickson R, Dundar Y, Fernandez Santander A, Howell S, Newman W, Oyee J, Walley T. The clinical effectiveness and cost-effectiveness of genotyping for CYP2D6 for the management of women with breast cancer treated with tamoxifen: a systematic review. Health Technol Assess 2011;15:1–102
- 13. Regan MM, Leyland-Jones B, Bouzyk M, Pagani O, Tang W, Kammler R, Dell’orto P, Biasi MO, Thurlimann B, Lyng MB, Ditzel HJ, Neven P, Debled M, Maibach R, Price KN, Gelber RD, Coates AS, Goldhirsch A, Rae JM, Viale G, Breast International Group (BIG) 1-98 Collaborative Group. CYP2D6 genotype and tamoxifen response in postmenopausal women with endocrine-responsive breast cancer: the Breast International Group 1-98 Trial. J Natl Cancer Inst 2012;104:441–451
- 14. Peters SP. Reporting and evaluating genetic association studies. Respir Res 2009;10:109.
- 15. Shriner D, Vaughan LK, Padilla MA, Tiwari HK. Problems with genome-wide association studies. Science 2007;316:1840–1842
- 16. Williams SM, Canter JA, Crawford DC, Moore JH, Ritchie MD, Haines JL. Problems with genome-wide association studies. Science 2007;316:1840–1842
- 17. Hirschhorn JN, Lohmueller K, Byrne E, Hirschhorn K. A comprehensive review of genetic association studies. Genet Med 2002;4:45–61
- 18. Anderson CD, Nalls MA, Biffi A, Rost NS, Greenberg SM, Singleton AB, Meschia JF, Rosand J. The effect of survival bias on case-control genetic association studies of highly lethal diseases. Circ Cardiovasc Genet 2011;4:188–196
- 19. Greene CS, Penrod NM, Williams SM, Moore JH. Failure to replicate genetic association may provide important clues about genetic architecture. PLoS One 2009;4:e5639.
- 20. Kitterman DR, Cheng SK, Dilts DM, Orwoll ES. The prevalence and economic impact of low-enrolling clinical studies at an academic medical center. Acad Med 2011;86:1360–1366
- 21. Nassar N, Grady D, Balke CW. Commentary: Improving participant recruitment in clinical and translational research. Acad Med 2011;86:1334–1335
- 22. Wilke RA, Lin DW, Roden DM, Watkins PB, Flockhart D, Zineh I, Giacomini KM, Krauss RM. Identifying genetic risk factors for serious adverse drug reactions: current progress and challenges. Nat Rev Drug Discov 2007;6:904–916