Genetic variants associated with fasting glucose and insulin concentrations in an ethnically diverse population: results from the Population Architecture using Genomics and Epidemiology (PAGE) study

Background: Multiple genome-wide association studies (GWAS) within European populations have implicated common genetic variants associated with insulin and glucose concentrations. In contrast, few studies have been conducted within minority groups, which carry the highest burden of impaired glucose homeostasis and type 2 diabetes in the U.S.

Methods: As part of the 'Population Architecture using Genomics and Epidemiology (PAGE)' Consortium, we investigated the association of up to 10 GWAS-identified single nucleotide polymorphisms (SNPs) in 8 genetic regions with glucose or insulin concentrations in up to 36,579 non-diabetic subjects, including 23,323 European Americans (EA), 7,526 African Americans (AA), 3,140 Hispanics, 1,779 American Indians (AI), and 811 Asians. We estimated the association between each SNP and fasting glucose or log-transformed fasting insulin, followed by meta-analysis to combine results across PAGE sites.

Results: Overall, our results show that 9/9 GWAS SNPs are associated with glucose in EA (p = 0.04 to 9 × 10⁻¹⁵), versus 3/9 in AA (p = 0.03 to 6 × 10⁻⁵), 3/4 SNPs in Hispanics, 2/4 SNPs in AI, and 1/2 SNPs in Asians. For insulin we observed a significant association with rs780094/GCKR in EA, Hispanics and AI only.

Conclusions: Generalization of results across multiple racial/ethnic groups helps confirm the relevance of some of these loci for glucose and insulin metabolism. Lack of association in non-EA groups may be due to insufficient power, or to unique patterns of linkage disequilibrium.


Background
Impaired glucose homeostasis is associated with increased risk of cardiovascular disease and type 2 diabetes [1,2]. While plasma glucose and insulin concentrations normally fluctuate in response to dietary intake and physical activity levels, several genome-wide association studies (GWAS) have demonstrated that common genetic variants contribute to glucose homeostasis. Most studies of this complex trait have focused primarily on populations of European ancestry, despite the high prevalence of insulin resistance and diabetes in many minority groups.
Investigating the clinical and public health implications of these genetic discoveries requires not only confirmation in European American (EA) populations but also, importantly, generalization of these associations to other groups, such as African Americans, Hispanics, American Indians, and Asians, that were not adequately represented in most GWAS. A prior study found that a genetic risk score composed of 16 SNPs previously associated with fasting glucose in GWAS in European populations was associated with fasting glucose in non-Hispanic blacks and Mexican Americans, indicating that genetic factors associated with fasting glucose may be shared across some racial/ethnic groups [3]. The purpose of this study is to examine 36,579 participants from diverse racial and ethnic backgrounds, as part of the NHGRI-supported 'Population Architecture using Genomics and Epidemiology (PAGE)' Consortium, and to investigate associations between SNPs previously identified in genome-wide scans and glucose and insulin concentrations.

Study populations
PAGE involves several studies, described briefly below and in greater detail on the PAGE website (https://www.pagestudy.org). All studies were approved by Institutional Review Boards at their respective sites, and all participants provided informed consent.
Causal Variants across the Life Course (CALiCo) is a consortium of six demographically diverse population-based studies and a central laboratory, and includes approximately 58,000 men and women ranging in age from adolescence to older adulthood. Three CALiCo studies participated in the present analysis: Atherosclerosis Risk in Communities Study (ARIC) (N = 13,383) [4], Cardiovascular Health Study (CHS) (N = 4,509) [5], and Strong Heart Cohort Study (SHCS) (N = 1,714) [6]. In addition to the studies involved in the CALiCo consortium, PAGE includes three other large studies. The Multiethnic Cohort (MEC) is a population-based prospective cohort study of over 215,000 men and women in Hawaii and California aged 45-75 at baseline (1993-1996) and primarily of five ancestries [7]. Participants eligible for the present study were controls in nested case-control studies of breast, colorectal, or prostate cancer, or in biomarker studies, who had glucose and/or insulin measurements (N = 942). This analysis also included data from the Epidemiologic Architecture for Genes Linked to Environment (EAGLE) study. EAGLE accesses the genetic component of three National Health and Nutrition Examination Surveys (NHANES): NHANES III (phase 2 collected between 1991 and 1994), NHANES 1999-2000, and NHANES 2001-2002. Overall, 7,719 NHANES participants aged 18 and older were included in these analyses. Finally, the Women's Health Initiative (WHI) is a multifaceted clinical trial and cohort study investigating post-menopausal women's health in the U.S. [11]. Out of the 161,808 women enrolled in WHI, 8,312 were selected and included in the present study. Except for the Women's Health Initiative, all studies recruited men and women. All studies collected self-identified racial/ethnic group via questionnaire.
In the current analysis, we included "East Asians" defined as MEC participants who identified themselves as of sole or mixed Japanese descent, and WHI participants of Japanese, Chinese, Filipino, Vietnamese, and/or Korean ancestry. Fasting glucose and insulin concentrations were measured using standard assays, at laboratories specific to each PAGE site.
At all PAGE sites, we excluded underweight (BMI < 18.5 kg/m²) and extremely overweight (BMI > 60 kg/m²) individuals with the assumption that these extremes could be attributable to data coding errors, an underlying illness or possibly to a familial syndrome and hence, a rare mutation. We excluded individuals self-reporting that they have ever been diagnosed with diabetes, or who report taking diabetes medications. In addition, to mirror typical exclusion criteria of other studies of glucose homeostasis, we also excluded individuals with fasting glucose concentrations consistent with diabetes (i.e., ≥126 mg/dl or ≥7.0 mmol/L), regardless of self-reported diabetes status.
After applying the above exclusion criteria, a total of 36,579 participants were selected from the PAGE consortium for analysis.
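The exclusion criteria above can be sketched as a simple per-participant filter. This is a minimal illustration, not the study's actual data-cleaning code: the function names, record fields, and example records are hypothetical, and the mg/dL to mmol/L conversion assumes a glucose molar mass of about 180.16 g/mol.

```python
# Conversion factor: glucose has a molar mass of about 180.16 g/mol,
# so 1 mmol/L corresponds to about 18.016 mg/dL.
MGDL_PER_MMOLL = 18.016


def glucose_mgdl_to_mmoll(mg_dl):
    """Convert a glucose concentration from mg/dL to mmol/L."""
    return mg_dl / MGDL_PER_MMOLL


def is_eligible(bmi, fasting_glucose_mgdl, self_reported_diabetes, on_diabetes_medication):
    """Apply the exclusion criteria described in the text to one participant."""
    if bmi < 18.5 or bmi > 60:
        return False  # underweight or extremely overweight
    if self_reported_diabetes or on_diabetes_medication:
        return False  # self-reported diagnosis or diabetes medication use
    if fasting_glucose_mgdl >= 126:
        return False  # fasting glucose consistent with diabetes
    return True


# Hypothetical records, for illustration only (not study data).
participants = [
    {"bmi": 27.0, "glucose": 95.0, "dx": False, "meds": False},
    {"bmi": 17.9, "glucose": 88.0, "dx": False, "meds": False},
    {"bmi": 31.2, "glucose": 130.0, "dx": False, "meds": False},
    {"bmi": 29.5, "glucose": 101.0, "dx": True, "meds": True},
]
eligible = [
    p for p in participants
    if is_eligible(p["bmi"], p["glucose"], p["dx"], p["meds"])
]
print(len(eligible), "of", len(participants), "hypothetical participants retained")
```

Under this conversion factor, 126 mg/dl corresponds to roughly 7.0 mmol/L, which is why the two thresholds in the text are treated as equivalent.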

SNP selection and genotyping
Ten SNPs in 8 genetic regions were selected for genotyping based on prior GWAS findings of positive association with glucose or insulin concentrations that exceeded a genome-wide significance threshold of p < 5 × 10⁻⁸ in studies published through 2010 [12-14]. Nine SNPs were previously associated with glucose, and 2 were associated with insulin, with 1 of these SNPs associated with both quantitative traits (rs780094/GCKR). In the glucose analysis, we included an additional GWAS finding for type 2 diabetes (rs7903146/TCF7L2) that had been subsequently associated with fasting glucose concentrations [15]. Each PAGE site prioritized which SNPs to genotype based on investigator interests, genotyping platforms, and resources, resulting in heterogeneity of available glucose or insulin SNPs across racial/ethnic groups. Ten SNPs were genotyped in European Americans and African Americans, 4 were genotyped in Hispanics and in American Indians, and 2 were genotyped in East Asians.
DNA extraction and genotyping methods followed standard protocols. Each PAGE site employed different genotyping platforms, with similar quality control criteria. CALiCo sites used TaqMan, the Illumina 370CNV BeadChip, the Affymetrix Genome-Wide Human SNP Array 6.0, and the Illumina HumanCVD BeadChip. A portion of CHS genotype data was obtained from a previous GWAS. EAGLE used Sequenom's iPLEX® Gold coupled with MassARRAY MALDI-TOF MS detection and Illumina's BeadXpress with a custom GoldenGate genotyping assay. MEC used Applied Biosystems OpenArray and TaqMan. WHI used Illumina BeadXpress with the Veracode GoldenGate genotyping assay. All sites used internal and blinded external controls, and excluded genotypes deviating from Hardy-Weinberg expectations (p-value < 0.001) or with low concordance (typically, <95%-99%). In addition to site-specific quality control, all PAGE study sites genotyped 360 DNA samples from the International HapMap Project and submitted these data to the PAGE Coordinating Center for concordance checks [16]. Additional details on data collection, specimen processing, and genotyping are found in Additional file 1: Supplementary Methods.

Statistical analysis
To maximize comparability with prior studies of glucose homeostasis, we converted insulin and glucose concentrations into units commonly reported in the literature. Thus, we investigated continuous fasting glucose (mmol/L) and natural log-transformed fasting insulin (pmol/L). The association between each SNP and its related quantitative trait was estimated using linear regression with robust standard errors (SEs) [17]. SNP genotype was coded assuming an additive genetic model (i.e., 0, 1, or 2 copies of the coded allele). For ease in interpreting the results, we coded the allele that was associated with an increased insulin or glucose concentration in the prior GWAS. All analyses were stratified by self-identified racial/ethnic group, and adjusted for covariates known to be associated with insulin and/or glucose concentrations: smoking (current vs. former/never; smoking increases insulin resistance) [18], continuous BMI (obesity is associated with insulin resistance) [19], sex (insulin metabolism differs by sex) [20], and continuous age (insulin metabolism varies by age) [21]. Analyses were performed for each of the 6 participating PAGE studies separately, and study-specific results (effect sizes and robust SEs) were combined with fixed-effects meta-analysis using R.
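The pooling step can be illustrated with a short sketch. The text states only that study-specific effect sizes and robust SEs were combined by fixed-effects meta-analysis in R; the example below assumes the standard inverse-variance weighted estimator and is written in Python purely for illustration. The function name and the input values are hypothetical.

```python
import math


def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted fixed-effects pooling of per-study estimates.

    betas: per-study effect sizes (e.g., mmol/L per copy of the coded allele)
    ses:   per-study standard errors of those effect sizes
    Returns the pooled beta, its standard error, and a two-sided p-value
    from a normal approximation.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, p_value


# Hypothetical per-study results, for illustration only (not study data).
example_betas = [0.020, 0.040, 0.030]
example_ses = [0.010, 0.015, 0.020]
beta, se, p = fixed_effects_meta(example_betas, example_ses)
print("pooled beta = %.4f, SE = %.4f, p = %.2e" % (beta, se, p))
```

Because each study is weighted by the inverse of its squared SE, larger or more precise studies dominate the pooled estimate.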
Based on our hypothesis that GWAS-identified glucose and insulin SNPs are associated with glucose and/or insulin concentrations across all race/ethnicities, we did not adjust for multiple testing. We labeled meta-analysis results as "replicating" (for EA) or "generalizing" (for other racial/ethnic groups) if the beta was in the same direction as the original GWAS, and was statistically significant (i.e., p < 0.05). All aggregate results will be available via dbGaP (http://www.ncbi.nlm.nih.gov/gap) at a future date.
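The labeling rule can be written as a small decision function. This sketch assumes, as described in the statistical analysis, that the coded allele is the one associated with a higher trait value in the prior GWAS, so a positive beta means "same direction". The function name, group codes, and the "not significant" label are hypothetical.

```python
def classify_result(beta, p_value, group, alpha=0.05):
    """Label a meta-analysis result using the rule described in the text.

    beta:    pooled effect per copy of the coded (trait-increasing) allele
    p_value: two-sided p-value for the pooled effect
    group:   racial/ethnic group code; "EA" is treated as the replication group
    """
    same_direction = beta > 0
    if not (same_direction and p_value < alpha):
        return "not significant"
    return "replicating" if group == "EA" else "generalizing"


# Hypothetical inputs, for illustration only.
print(classify_result(0.03, 1e-5, "EA"))
print(classify_result(0.05, 0.03, "AA"))
print(classify_result(0.02, 0.12, "AA"))
```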
Approximately 13% of the overall WHI study cohort was selected to contribute to PAGE. This selection was non-random, and was enriched for subjects with certain incident health conditions (e.g., cardiovascular disease and stroke), non-European American race/ethnicity, and BMI>40. Therefore, analyses of WHI data incorporated inverse probability weighting to account for this sampling strategy.
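As a rough illustration of inverse probability weighting, the sketch below fits a weighted least-squares slope in which each participant is weighted by the inverse of her probability of selection into the subsample. The text does not report the selection probabilities or the exact weighting procedure, so the function name, the single-predictor model, and all values are assumptions; robust SEs and covariate adjustment are omitted.

```python
def ipw_weighted_slope(genotypes, outcomes, selection_probs):
    """Weighted least-squares slope with weights 1 / P(selected).

    genotypes:       coded allele counts (0, 1, or 2)
    outcomes:        trait values (e.g., fasting glucose in mmol/L)
    selection_probs: each participant's probability of being sampled
    """
    weights = [1.0 / p for p in selection_probs]
    total = sum(weights)
    mean_g = sum(w * g for w, g in zip(weights, genotypes)) / total
    mean_y = sum(w * y for w, y in zip(weights, outcomes)) / total
    cov_gy = sum(
        w * (g - mean_g) * (y - mean_y)
        for w, g, y in zip(weights, genotypes, outcomes)
    )
    var_g = sum(w * (g - mean_g) ** 2 for w, g in zip(weights, genotypes))
    return cov_gy / var_g


# Hypothetical data, for illustration only: oversampled participants
# (higher selection probability) receive smaller weights.
g = [0, 1, 2, 1, 0, 2]
y = [5.0, 5.2, 5.3, 5.1, 4.9, 5.4]
probs = [0.10, 0.10, 0.50, 0.50, 0.10, 0.50]
print("IPW slope per coded allele: %.3f" % ipw_weighted_slope(g, y, probs))
```

Down-weighting oversampled participants is intended to make estimates reflect the full cohort rather than the enriched subsample.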
We only reported results if the meta-analysis sample size exceeded 400. For each racial/ethnic group, we estimated the statistical power to detect the GWAS-reported effect sizes for each SNP using Quanto (hydra.usc.edu/gxe/), assuming the same effect size as reported in the prior GWAS, an additive genetic model and a two-sided test of association at p = 0.05. Power calculations were based on allele frequencies specific to each racial/ethnic group. We evaluated I² as a measure of heterogeneity [22], to describe the presence or absence of excess variation across the PAGE study sites.
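To show why power depends on both sample size and allele frequency, the sketch below uses a normal approximation for the test of an additive SNP effect in linear regression. It is not Quanto's algorithm and is not expected to reproduce the power values reported in this paper. It assumes Hardy-Weinberg proportions, so the genotype variance is 2f(1 - f), and it requires a residual standard deviation for the trait, which is a hypothetical value here.

```python
import math
from statistics import NormalDist


def approx_power(beta, resid_sd, allele_freq, n, alpha=0.05):
    """Approximate two-sided power to detect an additive per-allele effect.

    beta:        per-allele effect size (trait units)
    resid_sd:    residual standard deviation of the trait (assumed value)
    allele_freq: coded allele frequency f; Var(genotype) = 2 f (1 - f)
    n:           sample size
    """
    std_normal = NormalDist()
    genotype_var = 2.0 * allele_freq * (1.0 - allele_freq)
    se = resid_sd / math.sqrt(n * genotype_var)  # approximate SE of the slope
    ncp = beta / se                              # expected z statistic
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    return std_normal.cdf(ncp - z_crit) + std_normal.cdf(-ncp - z_crit)


# Effect size, allele frequencies, and group sizes are taken from the text;
# ASSUMED_RESID_SD is a hypothetical value chosen only for illustration.
ASSUMED_RESID_SD = 0.5  # mmol/L
power_ea = approx_power(0.029, ASSUMED_RESID_SD, 0.59, 23323)
power_aa = approx_power(0.029, ASSUMED_RESID_SD, 0.81, 7526)
print("approximate power, EA: %.2f" % power_ea)
print("approximate power, AA: %.2f" % power_aa)
```

A coded allele frequency further from 0.5 and a smaller sample both shrink n × 2f(1 - f), which inflates the standard error and lowers power.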

Results
The distribution of insulin and glucose measurements and demographic characteristics for participants in each PAGE site by racial/ethnic group are detailed in Table 1. In general, subjects were middle-aged to older adults (mean age ranging from 41 to 73 years across studies), with average BMI ranging from normal to obese (24-32 kg/m²). Coded allele frequencies, stratified by racial/ethnic group, are presented in Table 2. Analyses involved a total of 36,579 subjects, including 23,323 European Americans, 7,526 African Americans, 3,140 Hispanics, 1,779 American Indians, and 811 East Asians.

Table 3 lists meta-analysis results for EA and AA, for 9 SNPs previously associated with glucose in GWAS. Within European Americans all coded alleles were significantly associated with increased glucose concentrations, and thus replicated findings of prior GWAS. In African Americans, 7 of 9 SNPs demonstrated associations in the same direction as the original GWAS report, and 3 were statistically significant (rs10830963/MTNR1B, p = 3.7 × 10⁻⁴; rs4506565/TCF7L2, p = 0.03; and rs7903146/TCF7L2, p = 5.9 × 10⁻⁵). In Hispanics, 3 out of 4 genotyped SNPs were significantly associated with increased glucose concentrations (rs560887/G6PC2, p = 5.5 × 10⁻⁵; rs780094/GCKR, p = 2.7 × 10⁻⁵; and rs10830963/MTNR1B, p = 3.3 × 10⁻⁵). Two out of four SNPs were also associated with increased glucose concentrations in American Indians (rs4607517/GCK, p = 0.03; and rs780094/GCKR, p = 0.04). Only 2 SNPs were genotyped in East Asians, and 1 was associated with increased glucose concentrations (rs780094/GCKR, p = 0.03).

Table 3 lists meta-analysis results for EA and AA, for 2 SNPs (rs780094/GCKR and rs35767/IGF1) associated with insulin in prior GWAS. The association between rs780094/GCKR and (ln) insulin replicated in EA (p = 1.3 × 10⁻¹⁰), but did not generalize to AA (p = 0.12). This association was also significant in Hispanics (p = 0.01) and American Indians (p = 4.8 × 10⁻⁴), but not East Asians (p = 0.39). The association between rs35767/IGF1 and insulin was not significant in EA or AA.

Evidence for heterogeneity
Overall, we observed little evidence of heterogeneity across studies. In EA, rs7903146/TCF7L2 had a statistically significant p-value for heterogeneity in the association with glucose (χ² = 18.79, p = 0.001). For this association, the site-specific betas for WHI, CHS, EAGLE, ARIC, and MEC were 0.03, 0.03, 0.02, 0.14, and 0.12 mmol/L, respectively. In American Indians, rs560887/G6PC2 had a statistically significant p-value for heterogeneity in the association with glucose (χ² = 7.31, p = 0.03). For this association, the data came from 3 sites of SHCS (Arizona, Oklahoma, and South Dakota), and the betas were 0.42, -0.01, and 0.17 mmol/L, respectively. For each of these SNPs, results of random effects models are presented in Table 3.
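The heterogeneity statistics can be related to one another with a short sketch. It assumes the reported χ² values are Cochran's Q statistics with degrees of freedom equal to the number of contributing sites minus one (5 sites for the EA result, 3 SHCS sites for the American Indian result), and it uses the usual definition I² = max(0, (Q - df) / Q). The function names are hypothetical, and the closed-form chi-square tail used here applies only to even degrees of freedom.

```python
import math


def cochran_q(betas, ses):
    """Cochran's Q: weighted squared deviations from the fixed-effects estimate."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    return sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))


def i_squared(q, df):
    """Proportion of variation across sites attributed to heterogeneity."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form used here requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


# Q values are those reported in the text; site counts are inferred from the text.
reported = [("rs7903146/TCF7L2 in EA", 18.79, 5), ("rs560887/G6PC2 in AI", 7.31, 3)]
for label, q, n_sites in reported:
    df = n_sites - 1
    print("%s: I2 = %.2f, p = %.4f" % (label, i_squared(q, df), chi2_sf_even_df(q, df)))
```

Under these assumptions, the reported Q values imply I² of roughly 0.79 and 0.73, and the computed tail probabilities are consistent with the reported p-values of 0.001 and 0.03.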

Discussion
Overall, our results demonstrate that all 9 GWAS findings for glucose replicate in EA, yet fewer generalize to other racial/ethnic groups (3/9 in AA, 3/4 in Hispanics, 2/4 in American Indians, and 1/2 in East Asians). In the analysis of insulin, we found that rs780094/GCKR replicated in EA, and generalized to Hispanics and American Indians. We observed limited evidence for excess heterogeneity by site in the meta-analyses, with significant heterogeneity detected for only two SNPs. Some earlier studies have examined the generalizability of some or all of these index SNPs for glucose and insulin in populations of African ancestry, with limited success. The Howard University Family Study [23] found nominal significance (p < 0.05) at the SLC30A8 locus, the Multi-Ethnic Study of Atherosclerosis [24] at MTNR1B, and the Candidate Gene Association Resource at G6PC2, GCK, and MTNR1B [25]. The sparse generalization of glucose and insulin-related GWAS findings to African Americans could be attributable to several phenomena. First, reduced power in AA: as illustrated in Table 3, several of the non-significant effect sizes in AA were very similar in magnitude to the effect sizes reported in the original GWAS. For example, for rs780094/GCKR, the coded allele was associated with similarly higher glucose concentrations in the original GWAS in EA (0.029 mmol/L) [12]. However, the coded allele frequency in PAGE EA was 0.59, versus 0.81 in AA. This difference in coded allele frequency may have reduced our ability to detect an association in AA; the power to detect the previously reported effect size for this SNP was 0.99 in EA, and 0.51 in AA. Second, reduced linkage disequilibrium in AA: lack of association, particularly for analyses having close to adequate power (such as rs560887/G6PC2 in AA), may be due to differences in   [24,26], and at GCKR in East Asians [27].
Larger follow-up studies are needed to determine whether other genetic regions identified in European GWAS are important in non-Europeans, and to expand research into other populations, such as American Indians, that have not been as thoroughly investigated at these genes.
The PAGE consortium offers a unique opportunity to investigate associations between candidate SNPs and glucose and insulin concentrations in ancestrally diverse cohorts with well-characterized phenotypes. A substantial strength of PAGE is its relatively large samples of ancestrally diverse participants, groups for which very little is known about the genetic etiology of insulin and glucose concentrations. However, this analysis had several limitations. First, smaller sample sizes in Hispanics (N = 3,140), American Indians (N = 1,779), and East Asians (N = 811) limited our ability to detect statistically significant associations in these groups. As shown in Table 2, coded allele frequencies varied considerably between racial/ethnic groups, which likely resulted in reduced power in several analyses. Further, not all SNPs were genotyped in all racial/ethnic groups, which limited our ability to assess generalization of GWAS findings to Hispanics, American Indians, and East Asians. In addition, our genotyping approach was limited to SNPs previously associated with glucose and/or insulin in GWAS conducted prior to 2010. These GWAS were all conducted in European-descent populations, and thus the most promising SNPs from those studies may not be relevant to other populations, due to between-population differences in linkage disequilibrium, particularly in AA.

Conclusion
In conclusion, in this large and diverse study we were able to replicate 9 GWAS-identified glucose SNPs and 1 of 2 insulin SNPs in EA. Even with limited sample sizes for additional racial/ethnic groups, we found that most of the nine GWAS glucose findings analyzed generalized to at least one non-EA racial/ethnic group, with several SNPs generalizing to multiple groups (e.g., rs780094/GCKR generalized to Hispanics, American Indians, and East Asians). These findings indicate that it would be worthwhile to pursue additional genotype data on larger samples drawn from these populations, and to perform an even more comprehensive investigation of the generalizability of GWAS findings for glucose and insulin in diverse populations. In addition, an investigation of gene-environment and gene-gene interactions may help resolve ancestry-based differences in the genetic basis of glucose and insulin concentrations.

Additional file
Additional file 1: Supplementary Methods.
Strong Heart Study (SHS) is supported by NHLBI grants U01 HL65520, U01 HL41642, U01 HL41652, U01 HL41654, and U01 HL65521. The opinions expressed in this paper are those of the author(s) and do not necessarily reflect the views of the Indian Health Service. Assistance with phenotype harmonization, SNP selection and annotation, data cleaning, data management, integration and dissemination, and general study coordination was provided by the PAGE Coordinating Center (U01HG004801-01 and its NHGRI ARRA supplement). The National Institute of Mental Health also contributes to the support for the Coordinating Center. The PAGE consortium thanks the staff and participants of all PAGE studies for their important contributions.