Genetic correlates of longevity and selected age-related phenotypes: a genome-wide association study in the Framingham Study

Background Family studies and heritability estimates provide evidence for a genetic contribution to variation in the human life span.
Methods We conducted a genome-wide association study (Affymetrix 100K SNP GeneChip) for longevity-related traits in a community-based sample. We report on 5 longevity and aging traits in up to 1345 Framingham Study participants from 330 families. Multivariable-adjusted residuals were computed using appropriate models (Cox proportional hazards, logistic, or linear regression) and the residuals from these models were used to test for association with qualifying SNPs (70,987 autosomal SNPs with genotypic call rate ≥80%, minor allele frequency ≥10%, Hardy-Weinberg test p ≥ 0.001).
Results In family-based association test (FBAT) models, 8 SNPs in two regions approximately 500 kb apart on chromosome 1 (physical positions 73,091,610 and 73,527,652) were associated with age at death (p-value < 10⁻⁵). The two sets of SNPs were in high linkage disequilibrium (minimum r² = 0.58). The top 30 SNPs for generalized estimating equation (GEE) tests of association with age at death included rs10507486 (p = 0.0001) and rs4943794 (p = 0.0002), SNPs intronic to FOXO1A, a gene implicated in lifespan extension in animal models. FBAT models identified 7 SNPs and GEE models identified 9 SNPs associated with both age at death and morbidity-free survival at age 65, including rs2374983 near PON1. In the analysis of selected candidate genes, SNP associations (FBAT or GEE p-value < 0.01) were identified for age at death in or near the following genes: FOXO1A, GAPDH, KL, LEPR, PON1, PSEN1, SOD2, and WRN. Top ranked SNP associations in the GEE model for age at natural menopause included rs6910534 (p = 0.00003) near FOXO3a and rs3751591 (p = 0.00006) in CYP19A1. Results of all longevity phenotype-genotype associations for all autosomal SNPs are web posted at http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?id=phs000007.
Conclusion Longevity and aging traits are associated with SNPs on the Affymetrix 100K GeneChip. None of the associations achieved genome-wide significance. These data generate hypotheses and serve as a resource for replication as more genes and biologic pathways are proposed as contributing to longevity and healthy aging.


Background
Genetic factors associated with human longevity and healthy aging remain largely unknown. Heritability estimates of longevity derived from twin registries and large population-based samples suggest a significant but modest genetic contribution to the human lifespan (heritability ~15 to 30%) [1][2][3][4]. However, genetic influences on lifespan may be greater once an individual achieves age 60 years [5]. Moreover, the reported magnitude of the genetic contribution to other important aspects of aging, such as healthy physical aging (wellness) [6], physical performance [7,8], cognitive function [9], and bone aging [10], is much larger. Both exceptional longevity and a healthy aging phenotype have been linked to the same region on chromosome 4 [11,12], suggesting that although longevity per se and healthy aging are different phenotypes, they may share some common genetic pathways.
A number of potential candidate genes in a variety of biological pathways have been associated with longevity in model organisms. Genes involved in the regulation of DNA repair and genes in the evolutionarily conserved insulin/insulin-like growth factor signaling pathway [13,14] are emerging as holding great promise in the future elucidation of the underlying physiology controlling lifespan. Many of these genes have human homologs and thus have potential to provide insights into human longevity [15][16][17][18][19][20]. Although numerous candidate genes have been proposed, studies in humans are limited and initial findings often fail replication [21,22]. More recently genome-wide association studies (GWAS) have become feasible and offer a more comprehensive and untargeted approach to detect genes with modest phenotypic effects that underlie common complex conditions [23].
We had the opportunity to use the Framingham Heart Study (FHS) Affymetrix 100K SNP genotyping resource for a GWAS of longevity and aging-related phenotypes. The FHS offers the unique advantage of a longitudinal family-based community sample with participants who have been well-characterized throughout adulthood with respect to prospectively ascertained risk factors and diseases and continuously followed until death. We report several strategies for 100K SNP associations: 1) a simple low p-value SNP ranking strategy; 2) SNP selection due to associations with more than one related phenotype; and 3) SNP associations within candidate genes and regions previously reported to be associated with longevity in model organisms or humans.

Study sample
The genotyped study sample comprises 1345 Original Cohort (n = 258) and Offspring (n = 1087) participants who are members of the 330 largest FHS families. The Overview [24] provides further details of this sample. With respect to aging and longevity traits, 149 deaths occurred at a mean age at death of 83 years (range 46 to 99 years) and 713 participants achieved age 65 years or greater. The Boston University Medical Center Institutional Review Board approved the examination content of Original Cohort and Offspring examinations. All participants provided written informed consent at every examination, including consent for genetic studies.

Longevity and aging phenotype definitions and residual creation
Age at death
Both the Original Cohort and the Offspring Cohort remain under continuous surveillance and all deaths that occurred prior to January 1, 2005 were included in this study. Deaths were identified using multiple strategies including routine participant contact for research examinations or health history updates, surveillance at the local hospital, search of obituaries in the local newspaper, and, if needed, use of the National Death Index. Death certificates were routinely obtained and all hospital and nursing home records prior to death and autopsy reports (if performed) were requested. In addition, if there was insufficient information to determine a cause of death, the next of kin were interviewed by a senior investigator. All records pertinent to the death were reviewed by an endpoint panel of three senior investigators. The date and cause of death (classified as due to coronary heart disease, stroke, other cardiovascular disease [CVD], cancer, other causes, or unknown cause) were recorded.
Cox proportional hazards models were used to generate martingale residuals using the PHREG procedure in SAS to perform the regression analysis of survival time from age at study entry to age at death. Models were sex-specific and adjusted for 1) birth cohort and 2) birth cohort, education, current smoking status (yes/no), obesity (body mass index ≥30 kg/m²), hypertension (blood pressure ≥140/90 mmHg or on antihypertensive treatment), elevated cholesterol (cholesterol > 239 mg/dL), diabetes (fasting blood sugar ≥126 mg/dL, random blood sugar of ≥200 mg/dL, or use of insulin or oral hypoglycemic agents) and comorbidity defined as CVD and cancer. Birth cohort was defined as a categorical variable for all regression models with the following categories based on year of birth: birth year prior to 1900, 1900 to 1909, 1910 to 1919, 1920 to 1929, 1930 to 1939, 1940 to 1949, and 1950 and later. All covariates were measured at study entry. Residuals from Original Cohort and Offspring participants were pooled.
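As an illustration of what a martingale residual is, the following minimal Python sketch computes it for survival from age at entry to age at death. It is not the SAS PHREG procedure used in the study: it assumes the linear predictor from an already fitted Cox model is supplied, uses the Breslow estimate of the baseline hazard, and the function and argument names are hypothetical.

```python
import numpy as np

def martingale_residuals(entry_age, exit_age, died, linear_predictor):
    """Martingale residuals with delayed entry (Breslow baseline hazard).

    Assumes entry_age < exit_age for everyone and that linear_predictor
    (x'beta) comes from a Cox model fitted elsewhere.
    """
    entry_age = np.asarray(entry_age, dtype=float)
    exit_age = np.asarray(exit_age, dtype=float)
    died = np.asarray(died, dtype=bool)
    risk = np.exp(np.asarray(linear_predictor, dtype=float))

    # Baseline cumulative hazard accumulated over each person's time at risk.
    cumulative = np.zeros(len(exit_age))
    for t in np.unique(exit_age[died]):
        at_risk = (entry_age < t) & (exit_age >= t)
        deaths = np.sum(died & (exit_age == t))
        cumulative[at_risk] += deaths / risk[at_risk].sum()

    # Observed events minus expected events under the model.
    return died.astype(float) - risk * cumulative

# Four hypothetical participants, no covariate effect (linear predictor 0).
example = martingale_residuals([40, 45, 50, 55], [70, 80, 75, 90],
                               [1, 0, 1, 1], [0, 0, 0, 0])
```

A positive residual marks a participant who died earlier than the model expected, and the residuals sum to zero across the sample, which is why they can serve as an adjusted trait in a subsequent association test.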

Morbidity-free survival at age 65 years
Morbidity-free survival was defined as achieving age 65 years free of CVD, dementia, and cancer. CVD events included angina pectoris, coronary insufficiency, myocardial infarction, heart failure, stroke, transient ischemic attack (TIA), intermittent claudication and coronary or CVD death. Suspected CVD events were reviewed by a panel of three investigators who adjudicated events using previously established criteria in place since study inception [25]. A separate panel of study neurologists determined the presence of stroke or TIA and a team of at least one neurologist and one neuropsychologist determined the presence of dementia. Two independent reviewers examined records for all cancers, and the vast majority of cancer cases were microscopically confirmed with pathology reports.
Logistic regression models were used to generate deviance residuals. Models were sex-specific and adjusted for 1) birth cohort and 2) birth cohort, education, current smoking status, obesity, hypertension, elevated cholesterol, and diabetes. Covariates were defined as above for age at death. All covariates were measured at the examination closest to the participant attaining age 65 years using a 5 year window around age 65 years. Residuals from Original Cohort and Offspring participants were pooled.
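The deviance residual of a logistic model can be written down directly from the fitted probability. The sketch below assumes the fitted probabilities of morbidity-free survival are already available from a model fitted elsewhere; the function name is hypothetical.

```python
import numpy as np

def deviance_residuals(outcome, fitted_prob):
    """Deviance residuals for a binary outcome (1 = morbidity-free at 65)."""
    y = np.asarray(outcome, dtype=float)
    # Clip to avoid log(0) when a fitted probability is exactly 0 or 1.
    p = np.clip(np.asarray(fitted_prob, dtype=float), 1e-12, 1 - 1e-12)
    log_lik = y * np.log(p) + (1 - y) * np.log(1 - p)
    # Signed square root of each observation's deviance contribution.
    return np.sign(y - p) * np.sqrt(-2.0 * log_lik)

# Two hypothetical participants with the same fitted probability.
example = deviance_residuals([1, 0], [0.5, 0.5])
```

The residual is positive for participants who achieved the outcome and negative otherwise, with larger magnitude when the fitted probability disagreed more strongly with what was observed.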
Age at natural menopause
Natural menopause occurred after a woman had ceased menstruating naturally for one year and the age at natural menopause was the self-reported age at last menstruation. Mean age at natural menopause was similar in Original Cohort and Offspring women and the distribution of naturally menopausal ages in women in the 330 FHS families was similar to that of women in all 1643 FHS families [26,27]. The mean age at natural menopause in women in the 100K sample was 50.2 years (range 38 to 57 years) in Original Cohort women and 49.1 years (range 29 to 60 years) in Offspring women.
Crude age at natural menopause and standardized residuals from multiple linear regressions in SAS [28] that adjusted age at natural menopause for covariates of interest were used as traits for analysis. Covariates were obtained at all attended examinations prior to the onset of menopause and included mean number of cigarettes smoked per day, mean body mass index, parity (0 versus 1 or more live births), and generation (Original Cohort vs. Offspring).

Walking speed
Walking speed was measured on Original Cohort participants at examination 27 (January 2002 through December 2003, mean age of Original Cohort at exam 27: 86.7 years) and Offspring participants attending an ancillary study to examination 7 (1999 to 2004, mean age at exam: 62.0 years). Trained technicians timed participants walking at their normal pace on a four meter course twice and subsequently asked participants to repeat the course walking at a rapid pace. The mean timed fast walk among Offspring participants in the 100K genotyping sample was 2.44 seconds (standard deviation 0.89). The timed fast walk was used for analysis. Sex-specific linear regression was used to generate residuals adjusted for age and height measured at the time of the walk.
Biologic age by osseographic scoring system
An osseographic scoring system (OSS) was applied to hand radiographs obtained on Original Cohort (1967 to 1969, mean age 58.7 years) and Offspring participants (1992 to 1993, mean age 51.6 years) [10]. Biologic age was then defined as the standardized residual between the OSS predicted age and the actual age. Biologic age defined by this system predicted mortality [10,29] and was very heritable (h² = 0.57 ± 0.06); a genome-wide linkage analysis yielded LOD scores >1.8 on chromosomes 3q, 11p, 16q, and 21q [10]. Sex- and cohort-specific ranked residuals generated from linear regression of age on log-OSS, adjusted for height, body mass index, menopause, and estrogen therapy, were used for analysis.

Genotyping
Affymetrix 100K SNP GeneChip genotyping and the Marshfield STR genotyping performed by the Mammalian Genotyping Service (http://research.marshfieldclinic.org/genetics) are described in the Overview paper [24].

Statistical analysis
The statistical methods for genome-wide linkage and association analyses are described in the Overview [24].

Association
All residual traits described above as well as the additional traits listed in Table 1 were computed using Cox proportional hazards with martingale residuals for survival traits, logistic regression with deviance residuals for dichotomous traits, and linear regression with standard residuals for quantitative traits. The full set of FHS participants with the phenotype was used to create the residuals. The residuals were then tested for association with the SNPs in the genotyped subset of individuals using additive family-based association test (FBAT) and generalized estimating equations (GEE) models as described in the Overview [24]. A total of 70,987 autosomal SNPs met the criteria of genotypic call rate ≥80%, minor allele frequency ≥10%, Hardy-Weinberg test p ≥ 0.001, and ≥10 informative families for FBAT. The number of tests with an FBAT p < 0.001, p < 0.0001, and p < 0.00001 for all phenotypes was similar to what would be expected under the assumptions that the 70,987 tested SNPs were independent and there were no true associations. The GEE tests tended to give an excess of very small p-values over what would be expected under these assumptions.
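The SNP inclusion criteria and the null expectation can be illustrated with a short Python sketch. It assumes a 1-degree-of-freedom chi-square test for Hardy-Weinberg equilibrium (the text does not say which HWE test was used), it omits the FBAT requirement of ≥10 informative families, and all function names are hypothetical.

```python
import math

def hwe_p_value(n_aa, n_ab, n_bb):
    """Chi-square (1 df) Hardy-Weinberg test from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

def passes_qc(n_aa, n_ab, n_bb, n_attempted):
    """Call rate >= 80%, minor allele frequency >= 10%, HWE p >= 0.001."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    freq_a = (2 * n_aa + n_ab) / (2 * called)
    maf = min(freq_a, 1 - freq_a)
    return (called / n_attempted >= 0.80
            and maf >= 0.10
            and hwe_p_value(n_aa, n_ab, n_bb) >= 0.001)

def expected_null_hits(n_tests, alpha):
    """Expected count of p-values below alpha for independent null tests."""
    return n_tests * alpha

# With 70,987 SNPs: about 71 tests at p < 0.001, 7 at p < 0.0001,
# and fewer than 1 at p < 0.00001 are expected by chance alone.
expected = [expected_null_hits(70987, a) for a in (1e-3, 1e-4, 1e-5)]
```

Comparing observed counts against these expectations is the check described above; because SNPs in LD are not independent, the comparison is approximate.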

SNP prioritization
We used several strategies to prioritize SNPs associated with longevity and aging traits. First, we used an untargeted approach whereby the top 50 SNP associations ranked according to the strength of the p-value for each trait were examined. Next, we explored the consistency of SNP associations across related sets of traits chosen a priori (trait set one: age at death and morbidity-free survival at age 65 years; trait set two: biologic age and walking speed). Trait set one was chosen based upon linkage data in humans demonstrating that both longevity and a healthy aging trait were linked to the same region on chromosome 4, raising the hypothesis that the two phenotypes may share common genetic pathways [11,12]. The traits in set two reflect aging with good physical functioning and thus we postulated that biologic age and walking speed may have genetic variants in common. We also investigated SNP associations in candidate genes and regions reported to be associated with longevity, identified from established databases including NCBI [14] using the search term "longevity" and the Science of Aging Knowledge Environment genes/intervention database (http://sageke.sciencemag.org/cgi/genesdb) [30], choosing genes potentially related to lifespan in humans.
The SNPs were annotated using the UCSC genome browser tables from the May 2004 assembly (http://genome.ucsc.edu/) [31,32]. All genes within 60 kb of the top ranked SNPs were identified.
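The 60 kb annotation window can be sketched as a distance check between a SNP position and gene intervals. This is an illustration only: the function name is hypothetical, the gene records are placeholders rather than UCSC data, and distance is assumed to be measured to the nearest gene boundary.

```python
def genes_near_snp(snp_chrom, snp_pos, genes, window=60_000):
    """Return (gene, distance) pairs for genes within `window` bp of a SNP.

    `genes` is an iterable of (name, chromosome, start, end) tuples.
    A SNP inside a gene has distance 0.
    """
    hits = []
    for name, chrom, start, end in genes:
        if chrom != snp_chrom:
            continue
        if start <= snp_pos <= end:
            distance = 0
        else:
            distance = min(abs(snp_pos - start), abs(snp_pos - end))
        if distance <= window:
            hits.append((name, distance))
    # Nearest gene first.
    return sorted(hits, key=lambda hit: hit[1])

# Placeholder gene table (hypothetical names and coordinates).
gene_table = [
    ("GENE_A", "1", 100_000, 150_000),
    ("GENE_B", "1", 200_000, 260_000),
    ("GENE_C", "2", 100_000, 150_000),
]
nearby = genes_near_snp("1", 180_000, gene_table)
```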

Results
The longevity and aging traits available in the FHS 100K SNP resource are listed in Table 1. In this report, we consider only five of the traits listed in Table 1: multivariable-adjusted age at death, morbidity-free survival at age 65 years, age at natural menopause, walking speed, and biologic age by OSS. These traits include a pooled sample of Original Cohort and Offspring participants, with the exception of walking speed, which is reported in Offspring participants only. Details of the sample size and covariate adjustment for each trait are provided in Table 1.
For each of the five phenotypes, Tables 2a and 2b provide the top five SNPs ranked by lowest p-value for the GEE and FBAT models (all associations can be viewed on the web at http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?id=phs000007). If multiple SNPs in linkage disequilibrium (LD r² > 0.80) were included in the top 5, additional SNPs were included until a set of 5 independent associations was listed. Eight SNPs on chromosome 1 were associated with age at death in the FBAT analysis, all with p-value < 10⁻⁴ and two with p-value < 10⁻⁵. The 8 SNPs consisted of two sets of SNPs (rs10493513, rs10493514, rs6689491, rs6657082, rs1405051) and (rs10493515, rs10493518, rs10493517), clustered in two regions approximately 500 kb apart. There was exceptionally high LD across this 500 kb region: the minimum r² between pairs of the eight SNPs was 0.58. The nearest genes in this region existing in public databases were >500 kb from any of these SNPs [31,32].
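The rule of listing SNPs until five independent associations appear can be sketched as a single pass down the ranked list. The sketch assumes one reading of that rule, namely that a SNP counts as non-independent when its r² with any higher-ranked listed SNP exceeds 0.80. The function name, SNP labels, and LD values are hypothetical.

```python
def list_until_independent(ranked_snps, r2, n_independent=5, ld_threshold=0.80):
    """Walk SNPs in p-value order until n_independent signals are listed.

    ranked_snps: SNP ids sorted by ascending p-value.
    r2: callable (snp_a, snp_b) -> pairwise LD r-squared.
    Returns (snp, in_ld_with_higher_ranked) pairs in rank order.
    """
    listed = []
    independent = 0
    for snp in ranked_snps:
        in_ld = any(r2(snp, prev) > ld_threshold for prev, _ in listed)
        listed.append((snp, in_ld))
        if not in_ld:
            independent += 1
            if independent == n_independent:
                break
    return listed

# Hypothetical LD table: s1/s2 and s3/s4 are in strong LD.
ld_table = {frozenset(("s1", "s2")): 0.95, frozenset(("s3", "s4")): 0.90}

def lookup_r2(snp_a, snp_b):
    return ld_table.get(frozenset((snp_a, snp_b)), 0.0)

ranked = ["s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"]
table_rows = list_until_independent(ranked, lookup_r2)
```

With this hypothetical input, seven SNPs are listed before five independent signals are reached, which mirrors why some phenotypes show more than five rows.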
There were several additional associations not listed in Tables 2a and 2b that were of interest. For age at death in the GEE analysis, SNP associations ranked numbers 9 and 13 were rs10507486 (p-value 0.000128) and rs4943794 (p-value 0.000277), both intronic FOXO1A SNPs. For age at natural menopause, top ranked SNP associations in the GEE model included number 11, rs6910534 (p = 0.00003) near FOXO3a and number 18, rs3751591 (p = 0.00006) in CYP19A1. Table 2c presents the LOD scores ≥2.0 and the corresponding 1.5-LOD support interval from genome-wide linkage for the three quantitative aging traits. None of the regions overlapped with SNPs associated with these aging traits in the FBAT and GEE analyses. Of note, for biologic age by OSS the linkage peak on chromosome 21 confirmed a prior Framingham Study report using a genome-wide scan with 401 microsatellite markers [10]. Table 3 provides all SNP associations with a GEE or FBAT p < 0.01 for both traits within the two pairs of related traits. For age at death and morbidity-free survival at age 65 years, FBAT models identified 7 SNPs and GEE models identified 9 SNPs associated with both traits, including rs2374983 near PON1 (Tables 3a and 3b). For biologic age by OSS and walking speed, 13 SNPs in FBAT models and 6 SNPs in GEE models were associated with both traits (Tables 3c and 3d).
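The cross-trait screen summarized in Table 3 amounts to keeping SNPs whose p-value falls below 0.01 for both traits of a pair under the same model. A minimal Python sketch follows; the function name, SNP labels, and p-values are hypothetical.

```python
def shared_associations(pvalues_trait_a, pvalues_trait_b, threshold=0.01):
    """SNPs with p < threshold for both traits (same test, e.g. FBAT or GEE).

    Each argument maps SNP id -> p-value for one trait.
    Returns {snp: (p_trait_a, p_trait_b)}.
    """
    shared = {}
    for snp, p_a in pvalues_trait_a.items():
        p_b = pvalues_trait_b.get(snp)
        if p_b is not None and p_a < threshold and p_b < threshold:
            shared[snp] = (p_a, p_b)
    return shared

# Hypothetical p-values for the two traits of one trait set.
age_at_death = {"snp_1": 0.004, "snp_2": 0.20, "snp_3": 0.0009}
morbidity_free = {"snp_1": 0.008, "snp_2": 0.003, "snp_4": 0.001}
both = shared_associations(age_at_death, morbidity_free)
```

Running the screen separately for FBAT and GEE p-values gives one list per model, as in Tables 3a through 3d.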
We identified from the literature 79 potential candidate genes and regions associated with longevity (see Additional file 1 for listing). Of these, 12 genes had no SNPs and 67 genes had 1 to 45 SNPs within 60 kb of the gene on the 100K Affymetrix GeneChip. There were 2036 SNPs in the LGV1 region on chromosome 4 previously linked to exceptional longevity [11]. Table 4 shows the candidate genes with SNPs associated with an FBAT or GEE p-value < 0.01 for age at death, including FOXO1a, GAPDH, KL, LEPR, PON1, PSEN1, SOD2, and WRN, and for morbidity-free survival at 65 years, including GHR, LEPR, MORF4L1, PON1, PTH, and WRN. Biologic age by OSS shared 2 SNPs in common with age at death: rs4943794 intronic to FOXO1a and rs911847 near SOD2.

Discussion
To our knowledge, this is the first dense GWAS of longevity and aging traits in a community-based sample of adults from two generations of the same families. Over 1300 men and women have detailed longevity and aging-related phenotypes and 100K SNP genotyping results available on the web. This resource has the potential to detect novel susceptibility genes for human longevity and aging and to examine the relevance of promising candidate gene associations reported in animal models to human aging. We describe several strategies to prioritize SNP associations in this unique resource to enhance the discovery of various genes and pathways that contribute to the control of human longevity. Furthermore, FHS investigators are part of the NIA-sponsored Longevity Consortium (http://www.longevityconsortium.org), which offers the opportunity of collaboration with other investigators to replicate important findings in additional cohorts.
Table footnotes: A SNP in LD (r² > 0.8) with a higher ranked SNP is identified with an asterisk. All SNPs for a phenotype are listed until 5 independent SNPs are identified. Thus, for some phenotypes more than 5 SNPs are listed. For the age at death trait, the FBAT analysis identified two areas on chromosome 1 in LD, with r² = 0.5 to 0.6 between the two regions and r² of nearly 1.0 within the region. † Multivariable-adjusted trait results are presented. ‡ Trait had <500 participants in the sample. ¶ Results limited to traits presented
In our untargeted approach of ranking SNP associations by the strength of the p-value, 2 intronic FOXO1a SNPs were associated with age at death. One of these SNPs (rs4943794) also was associated with biologic age by OSS in our a priori evaluation of select candidate genes. Studies of this gene in humans are limited; two case-control studies have not identified an association between FOXO1a and longevity [36,37]. However, the prospective population-based Leiden 85-plus Study found that FOXO1a was associated with increased mortality attributable to diabetes-related deaths in participants aged 85 years and older [38]. The Leiden 85-plus Study also reported that genetic variation causing a reduction in insulin/IGF-1 signaling resulted in improved old age survival among women [20]. However, that report examined other genes in the insulin/insulin-like signaling pathway and did not specifically examine FOXO1a. Finally, the untargeted approach to SNP selection also identified a SNP near FOXO3a associated with age at natural menopause. This gene has been implicated in oocyte death.
We examined pleiotropic effects by identifying SNP associations across two pairs of related traits. One SNP near PON1 emerged as associated with both age at death and morbidity-free survival. Surprisingly, there were relatively few SNPs associated with both traits; prior work had suggested that longevity per se and healthy aging may share common genetic pathways [11,12]. However, morbidity-free survival was measured at age 65 years; it is possible that as our participants age, morbidity-free survival defined at age 75 or 85 years will share additional SNP associations with our longevity trait, age at death. A SNP near SOX5, a gene potentially related to musculoskeletal function, was associated with both biologic age by OSS and walking speed.
Our strategy of selecting SNPs in candidate genes and regions previously reported to be associated with longevity yielded interesting findings.  [15]. Thus, results from this GWAS may direct resources to the most relevant candidate genes and pathways for further investigation in humans.
Several important limitations merit comment. First, we acknowledge that there may be a survival bias as participants in this sample had to survive to provide DNA (first systematic DNA collection began 1995) and hence are likely healthier than the full FHS sample. To ameliorate this issue, we adjusted for covariates using the full Framingham sample, and used the residual traits for the subset of individuals genotyped using the 100K Affymetrix GeneChip to test for association with the SNPs using linear regression models. Residual traits from Cox and logistic models typically are not ideally distributed for linear regression models, but our adjustment method using the full sample precludes the testing of SNP associations with age at death and morbidity-free survival using Cox and logistic models. Second, the 100K Affymetrix GeneChip provides limited coverage of the genome; many of our a priori candidate genes did not have any SNP coverage on the chip. For example, several genes that have been studied in model organisms or even in humans, such as ACE, Lamin A, SIRT2 and SIRT3, had no SNPs within 60 kb of the gene on the 100K Affymetrix GeneChip. However, genotyping is near-complete for the NHLBI-funded 550K genome-wide scan on all FHS participants. This will enable deeper exploration of our initial 100K SNP associations in a larger sample with denser coverage of the genome. Third, in this analysis we did not examine epistasis or gene-environment interactions, which may modify the associations in this study. Importantly, this study is hypothesis generating. Our findings need to be replicated in other samples.

Conclusion
In summary, the untargeted genome-wide approach to detect genetic associations with longevity and aging traits provides an opportunity to identify novel biologic pathways related to lifespan control. GWAS also have the potential to direct investigators of human aging to the most promising candidate gene associations and biologic pathways reported to regulate lifespan in animal models. Enhancing our understanding of the mechanisms responsible for aging may in turn identify directions for health promotion and disease prevention efforts in middle-aged and older adults so that older persons can enjoy more time in good health. These data generate hypotheses regarding novel biologic pathways contributing to longevity and healthy aging and serve as a resource for replication of findings from other population-based samples.