Investigation of 95 variants identified in a genome-wide study for association with mortality after acute coronary syndrome

Background Genome-wide association studies (GWAS) have identified new candidate genes for the occurrence of acute coronary syndrome (ACS), but possible effects of such genes on survival following ACS have yet to be investigated. Methods We examined 95 polymorphisms in 69 distinct gene regions identified in a GWAS for premature myocardial infarction for their association with post-ACS mortality among 811 whites recruited from university-affiliated hospitals in Kansas City, Missouri. We then sought replication of a positive genetic association in a large, racially diverse cohort of myocardial infarction patients (N = 2284) using Kaplan-Meier survival analyses and Cox regression to adjust for relevant covariates. Finally, we investigated the apparent association further in 6086 additional coronary artery disease patients. Results After Cox adjustment for other ACS risk factors, of 95 SNPs tested in 811 whites only the association with the rs6922269 in MTHFD1L was statistically significant, with a 2.6-fold mortality hazard (P = 0.007). The recessive A/A genotype was of borderline significance in an age- and race-adjusted analysis of the entire combined cohort (N = 3095; P = 0.052), but this finding was not confirmed in independent cohorts (N = 6086). Conclusions We found no support for the hypothesis that the GWAS-identified variants in this study substantially alter the probability of post-ACS survival. Large-scale, collaborative, genome-wide studies may be required in order to detect genetic variants that are robustly associated with survival in patients with coronary artery disease.


Background
Genome-wide association studies (GWAS) have identified robust genetic associations in a variety of common diseases [1], including myocardial infarction (MI) [2][3][4][5][6]. The GWAS approach, with its emphasis on large sample sizes and inclusion of hundreds of thousands of genetic markers, has produced a degree of reproducibility that was generally lacking in earlier candidate gene studies of MI [7]. However, nine GWAS-identified genetic susceptibility markers, all meeting criteria for genome-wide statistical significance, collectively account for only 3% of the estimated heritability of early-onset myocardial infarction (MI), raising questions about the clinical utility of such markers for predicting MI [1].
In contrast to the identification of risk markers for incident disease, risk-stratifying patients with established disease, including those who have suffered an MI, is a cornerstone of modern cardiovascular care. Yet association studies of GWAS-generated candidate genes with prognosis following acute coronary syndromes (ACS), including unstable angina, non-ST elevation myocardial infarction (NSTEMI), and ST-elevation myocardial infarction (STEMI), require resource-intensive longitudinal designs and have rarely been performed. Although the post-ACS period is considered high risk, there is substantial heterogeneity in patient outcomes, and genetic markers could potentially be useful in defining risk, guiding treatment, and understanding the mechanisms of recurrent events and mortality. In contrast to genetic screening for incident MI, which requires screening large populations of patients without recognized disease, prognostically important genetic variations after an ACS could accelerate translation to clinical practice by focusing upon a narrower cohort of patients at high risk. Moreover, the identification of genetic pathways leading to a poor prognosis following ACS may identify new pathways of disease progression that could become novel targets for the chemoprevention of recurrent ACS.
GWAS-identified risk factors for incident MI pose an important opportunity to identify genetic markers of prognosis in an ACS population, given the clinical logic that a validated risk factor for MI occurrence may also lead to more rapid disease progression after an initial event. To test this possibility, we selected as a pool of candidate prognostic markers the 95 most statistically significant of approximately 2.5 million genetic variants tested in a GWAS of premature MI occurrence (Myocardial Infarction Genetics Consortium) [5]. We specifically tested the hypothesis that these risk markers for MI would be associated with all-cause mortality within 3 years following ACS.

Identification of Candidate Genes
We tested 95 SNPs in 63 individual genes, and an additional 6 distinct gene regions containing more than one genetic locus. The 95 candidate SNPs were ranked the most statistically significant (P < 1 × 10 -5 ) of all~2.5 million SNPs that were included on, or imputed from, the Affymetrix 6.0 microarray and brought forward into replication stage 3 of the Myocardial Infarction Genetics Consortium Study [5].

Study Population and Genotyping
Our study design called for testing genetic markers for prognostic association with 3-year mortality in an ACS cohort, and attempting replication of significant associations in additional cohorts of MI and/or coronary artery disease (CAD) patients. The discovery cohort was comprised of 811 self-reported white patients of European ancestry with ACS who were identified from a consecutive series of patients presenting to two Kansas City, MO hospitals (Mid-America Heart Institute and Truman Medical Center), from March 2001 through June 2003. Standard definitions were used to diagnose ACS patients with either myocardial infarction or unstable angina [8,9]. Individuals were monitored for incident deaths from any cause, as determined by periodic queries of the Social Security Administration Death Master File [10]. Follow-up was planned for a minimum of 3-years.
The Translational Research Investigating Underlying disparities in acute Myocardial infarction Patients' Health status study (TRIUMPH) served as the replication cohort, as described in a recent publication [11]. This study recruited several sites that were enriched for African-American patients with MI and was specifically designed to address racial disparities in outcomes, including genetic variations between races. Patients were adults > 18 years of age with elevated cardiac biomarkers (troponin or creatine kinase-MB fraction) as well as other clinical evidence of MI (ECG ST-segment changes or prolonged ischemia signs/symptoms). Collected data included baseline chart abstractions for demographic and medical history, followed by study coordinators contacting patients for follow-up interviews at 1, 6, and 12 months after MI. Long-term mortality was assessed by periodic queries of the Social Security Administration Death Master File [10].
Additional validation cohorts for survival analysis involving one SNP, rs6922269 in the MTHFD1L gene, were provided by Cleveland Clinic GeneBank (coronary artery disease including acute myocardial infarction) [12,13], Emory Cardiology Biobank (cardiac catheterization patients) [14], and the Ludwigshafen Risk and Cardiovascular Health (LURIC) study (patients hospitalized for coronary angiography) [15]. Clinical details of the cohorts are described in Appendix A.
Genotyping of the 95 SNPs was performed at the Broad Institute at MIT using the iPLEX MassARRAY platform (Sequenom) on extracted leukocyte DNA (TRI-UMPH) or whole genome amplified DNA (Mid-America Heart cohort) [16,17]. More extensive details of the cases, DNA extraction methods, and genotyping procedures have been recently described [5,7,18]. Flanking DNA sequence and other identifiers for each genetic variant are available upon request from the authors.

Statistical Analysis
Genotype distributions were examined for significant deviations (P < 0.05) from Hardy-Weinberg equilibrium (HWE). Chi-square testing was used to screen for possible HWE violations, which were further investigated for statistically significant departure from HWE expectations by Monte Carlo testing involving 10, 000 random reshufflings of alleles [19].
Initially, Kaplan-Meier survival analysis was performed for each variant using SPSS 17.0 (Chicago, IL). The equality of survival curves was tested by the log-rank test pooled over all genotypic strata (one degree of freedom). If the log-rank P value was < 0.05, then pairwise log-rank tests were performed to explore which genotype or genotypes were most likely to confer mortality risk. Cox regression models were then used to adjust positive associations for age, sex, hypertension, ACS type, prior myocardial infarction, prior revascularization, congestive heart failure, diabetes, renal failure, marital status, educational level, menopause, and smoking and alcohol use prior to the ACS. We tested proportionalhazards assumption for each covariate using Shoenfeld residuals. We report raw P values and considered the conservative Bonferroni correction (0.05/95 = 0.0005) to represent the study-wide statistical significance threshold [20], but in addition to considering chance, we also considered canonical epidemiological principles of causation, including magnitude of effect, allelic doseresponse, and adjustment for confounding.
Our sample had 93% power to detect an association, by the log rank test (P < 0.05), for a hazard ratio of 2.5 or higher, given a frequent genotype (0.5), and 80% power to detect a hazard ratio of 3.3 or higher, given an infrequent genotype (0.1) [21]. Given that genetic variants conferring more modest effects may not reach the conventional statistical significance level of P < 0.05, we sought to explore the possibility that null results might be related to lack of power (type II error), by examining the characteristics of the overall P value distribution by visualizing a Q-Q plot (quantile-quantile) using SPSS 17.0 (Chicago, IL).

Results
The clinical characteristics of the 811 cases are described in Table 1. The population of ACS cases included 308 STEMI (38%), 284 NSTEMI (35%), and 219 unstable angina patients (27%). A family history of coronary artery disease or myocardial infarction among first-degree relatives was found in over half of cases (52%). In addition, cardiac risk factor profiles were typical of a population with ACS, with over one half of patients having hypercholesterolemia and hypertension, a third with history of smoking, and over one fifth with diagnosed diabetes. Previous revascularization had been performed in over a third of the cases.
There were 90 deaths in the cohort, which was followed for a mean of 39.1 months, with maximum follow-up time of 60 months. Risk factor data were missing for 2 individuals. There were marked differences in overall cardiac risk factor profiles, as expected, with deceased patients having relatively advanced age, as well as more extensive cardiovascular co-morbidity, including congestive heart failure ( Table 2).
A total of 95 variants in 69 distinct genes or gene regions were genotyped. The average genotype call rate for these variants was 99.3%. Two assays failed (FTO rs9941349, CETP rs6499863). Eight variants violated HWE at the P < 0.05 level (additional file 1: Table S1).
The genotype distributions, numbers of deaths by genotypic category, and unadjusted P values for all 95 genetic variables are shown in additional file 1: Table  S1, with the P value distributions summarized in the Q-Q plot shown in Figure 1. Overall, there were 4 positive associations (P < 0.05). Three of these associations did not retain statistical significance in a Cox proportional hazards adjusted model (rs6458545, P = 0.36; rs769449, P = 0.41; rs7754840, P = 0.13). However, the MTHFD1L association with mortality hazard remained statistically significant following adjustment for traditional cardiac risk factors. Of these covariates, age (HR 1.04/year; P <   However, the frequencies of such co-morbidities were not significantly different from the 78 other deceased patients with any other MTHFD1L rs6922269 genotype. In order to limit multiple comparisons, only the MTHFD1L rs6922269 variant was brought forward for additional genotyping in the TRIUMPH cohort. In order to maximize power, the entire group of patients was subjected to combined Kaplan-Meier and Cox regression analysis. Characteristics, by genotype, of the entire study group appear in Table 3. As expected, mean follow-up time did not significantly differ by genotype. The A/A genotype of rs6922269 was strongly associated with all-cause mortality in the entire group (Figure 2; P < 0.0001). However, after adjustment for age, sex, and race, the hazard ratio for mortality in association with the A/A genotype was 1.35, with a borderline statistically significant P value of 0.052. Although sample size was too limited to document any formal trend towards increased mortality risk by number of A alleles, a suggestive decrement in risk was observed in Cox-adjusted hazard ratios, from 1.35 (A/A), to 1.21 (A/G) to 1.0 (G/ G reference), as shown in Table 4. In the TRIUMPH replication cohort alone (N = 2284), a striking racial dichotomy was noted, with the lowest survival rate (81%) observed in African-American A/A homozygotes (additional file 1, Table S2), compared with 90% or greater survival in all other genotype categories for Whites and African-Americans. The difference in survival by genotype was statistically significant in African-Americans (P = 0.015) but not in Whites (P = 0.284).
Given the equivocal, but nominally statistically significant, results in our primary replication cohort, we collaborated with independent investigators to determine the

Discussion
We initially found a nominally statistically significant association between the A/A genotype of MTHFD1L and mortality in an initial cohort of 811 white patients with ACS. However, in the primary replication cohort, as well as three additional replication cohorts of patients with MI and/or CAD, this association was not confirmed, with a total of over 9, 000 patients studied.
Our experience in this study highlights that robust replication in multiple independent studies is a critical criterion for judging the validity of genetic associations. However, assembling the clinical data required to perform such replication studies is challenging, given that it may require not only the ascertainment of many thousands of patients, but also oversampling for minorities (by race and/or sex), and then tracking outcomes for many years. Our report of survival data for multiple high-priority SNPs advances the relatively neglected field of post-ACS genetic prognosis.
Although chance is the likely explanation for the apparently statistically significant association that we initially observed between mortality and rs6922269, other epidemiological factors should be considered in judging the cause-effect relationship between a putative risk factor and an outcome. In particular, prior evidence that the A allele is a risk factor for myocardial infarction is equivocal. Two major independent GWAS have reported the A allele of MTHFD1L to be associated with early-onset myocardial infarction [5,22]. First, the Wellcome Trust Case Control Consortium (WTCCC) reported a per-allele risk of 1.23 [1.15-1.33] for the A allele of rs6922269 and coronary artery disease, with a genome-wide statistical significance of 2.90 × 10 -8 [22]. Subsequently, the transatlantic Coronary ARtery DIsease Genome wide Replication and Meta-analysis (CARDIo-GRAM) consortium performed a meta-analysis of 14 GWAS studies (22, 233 cases, 64, 763 controls), and MTHFD1L A risk allele not among the statistically significant genome-wide associations (P = 7.38 × 10-5; A frequency 0.28) [23]. The A allele was not associated with the presence of significant coronary arterial stenosis in a series of consecutive patients referred for coronary angiography due to known or suspected stable CAD [24].
Clearly, even findings that meet accepted genomewide criteria for statistical significance in individual studies should be replicated widely before genetic associations are accepted as valid and robust. In addition, in  light of our study, it appears that realistic sample size calculations for prognostic studies may need to posit effect sizes at least as small as those that have emerged from GWAS studies of the occurrence of CAD, which are typically substantially less than 1.4. The challenge of mustering sufficient power is further amplified in prognosis studies by the fact that only a small proportion of patients will experience mortality within several years of follow-up. Therefore, sample size requirements for prognostic studies are expected to be at least as large as those for case-control studies. While it is beyond the scope of the present study to prescribe detailed sample size recommendations for future studies of prognosis in patients with common cardiovascular diseases, we would note that the CARDIoGRAM consortium included 22, 233 patients with CAD, with an approximate 3:1 controlto-case ratio. In addition to requiring large sample sizes, considerations for oversampling clinically relevant subgroups, outcome adjudication, and genotyping scope and methods are all emerging challenges to field of cardiovascular genetics as it seeks to leverage genetic risk factors to better risk-stratify outcomes in AMI suvivors. Additional limitations to this study include the heterogeneity of the present study, with respect to rs6922269 and our inability to discern whether or not there could be an effect in particular subgroups of patients, given the overall absence of a significant association.

Conclusions
We found no convincing support for the hypothesis that SNPs identified from GWAS studies of cardiac risk are associated with all-cause mortality following ACS, suggesting that independent GWAS studies of cohorts of many thousands of ACS patients may be required in order to identify prognostic factors in biological pathways promoting post-ACS mortality.