Race-ethnic differences in the association of genetic loci with HbA1c levels and mortality in U.S. adults: the third National Health and Nutrition Examination Survey (NHANES III)

Background Hemoglobin A1c (HbA1c) levels diagnose diabetes, predict mortality and are associated with ten single nucleotide polymorphisms (SNPs) in white individuals. Genetic associations in other race groups are not known. We tested the hypotheses that there is race-ethnic variation in 1) HbA1c-associated risk allele frequencies (RAFs) for SNPs near SPTA1, HFE, ANK1, HK1, ATP11A, FN3K, TMPRSS6, G6PC2, GCK, MTNR1B; 2) association of SNPs with HbA1c and 3) association of SNPs with mortality. Methods We studied 3,041 non-diabetic individuals in the NHANES (National Health and Nutrition Examination Survey) III. We stratified the analysis by race/ethnicity (NHW: non-Hispanic white; NHB: non-Hispanic black; MA: Mexican American) to calculate RAF, calculated a genotype score by adding risk SNPs, and tested associations with SNPs and the genotype score using an additive genetic model, with type 1 error = 0.05. Results RAFs varied widely and at six loci race-ethnic differences in RAF were significant (p < 0.0002), with NHB usually the most divergent. For instance, at ATP11A, the SNP RAF was 54% in NHB, 18% in MA and 14% in NHW (p < .0001). The mean genotype score differed by race-ethnicity (NHW: 10.4, NHB: 11.0, MA: 10.7, p < .0001), and was associated with increase in HbA1c in NHW (β = 0.012 HbA1c increase per risk allele, p = 0.04) and MA (β = 0.021, p = 0.005) but not NHB (β = 0.007, p = 0.39). The genotype score was not associated with mortality in any group (NHW: OR (per risk allele increase in mortality) = 1.07, p = 0.09; NHB: OR = 1.04, p = 0.39; MA: OR = 1.03, p = 0.71). Conclusion At many HbA1c loci in NHANES III there is substantial RAF race-ethnic heterogeneity. The combined impact of common HbA1c-associated variants on HbA1c levels varied by race-ethnicity, but did not influence mortality.


Background
The prevalence of type 2 diabetes (T2D) is not equal among race-ethnic groups in the United States, with a prevalence of 12.8% in non-Hispanic blacks (NHB), 8.4% in Mexican Americans (MA), and 6.6% in non-Hispanic whites (NHW) aged 20 yrs or older [1]. Diabetes-related complications also differ between race-ethnicities [2] and there is greater impact of diabetes on life-years in minority groups [3]. Race-ethnic differences in environmental exposures and health care experiences [4] likely influence different outcomes for people with diabetes, but genetic differences may also play an important role. Despite recent advances in the study of T2D genetics, relatively little is known about how race-ethnic genetic differences contribute to inter-race variability in diabetes risk or diabetes-related traits.
Given the selection pressure by infectious diseases such as malaria on some erythrocyte-related genes in African populations [8][9][10] and the influence of erythrocyte genes on HbA 1c [7,11], we hypothesized that risk alleles at HbA 1c -associated loci may have substantial race-ethnic frequency variation and that associations with HbA 1c levels may also differ by race. Furthermore, since elevated HbA 1c is associated with risk of cardiovascular disease or mortality [12][13][14][15][16][17][18][19], we hypothesized that an association between HbA 1c -associated SNPs and mortality may exist and there may be race-ethnic differences in this association. Using 11 confirmed HbA 1c -associated SNPs at ten loci [7], we compared NHB, MA, and NHW individuals from NHANES (National Health and Nutrition Examination Survey) III to test the hypotheses that there is significant race-ethnic variation in HbA 1c risk (HbA 1c -raising) allele frequency, risk-allele association with HbA 1c levels and risk-allele association with mortality.

Methods
Study subjects from the third National Health and Nutrition Examination Survey
NHANES III was a nationally representative sample of the non-institutionalized civilian U.S. population collected using stratified multistage probability sampling. NHANES participants underwent a physical examination, phlebotomy, and a household interview [20]. This study was limited to non-diabetic participants (aged 20 or older) with 8-23 hours of fasting prior to blood sampling. Blood from NHANES III Phase II (1991-1994) participants aged 12 or older was used to generate Epstein-Barr virus-transformed lymphocyte cell lines for DNA extraction. Mortality data (death within a mean of 13.5 years of follow-up) were merged from the NHANES III mortality-linked data file. Race-ethnic group was assigned based on self-report. The survey asked each subject to categorize his/her race as "white," "black," or "other" and his/her ethnicity as "Mexican-American," "other Hispanic," or "not Hispanic." Of 3,894 individuals with complete data for analysis, we excluded 149 who were not of NHB, MA or NHW race-ethnicity and 704 with diabetes (293 NHW, 167 NHB and 244 MA), leaving 901 NHB, 909 MA, and 1,231 NHW individuals in the analysis. Written informed consent was obtained from all subjects and this study was approved by the National Center for Health Statistics (NCHS) Ethics Review Board.

Diabetes definition and HbA 1c measures
Individuals with diabetes were excluded to avoid the confounding effects of treatment on HbA 1c . We defined diabetes as a fasting plasma glucose ≥ 7.0 mmol/L, report of a diagnosis of diabetes or use of hypoglycemic medications. HbA 1c levels were measured using HPLC (Bio-Rad DIA-MAT glycosylated hemoglobin analyzer system) [21].

SNP genotyping and allele frequencies
Genotyping was performed using Sequenom iPLEX. We genotyped 11 SNPs at ten loci shown among white non-diabetic individuals in MAGIC to have genome-wide significant association with HbA 1c [7]. We used SNP rs282606 as a proxy for ATP11A rs7998202 (CEU r 2 = 1.0), SNP rs10830956 as a proxy for MTNR1B rs1387153 (CEU r 2 = 1.0), and rs2022003 as a proxy for SPTA1 rs2779116 (CEU r 2 = 0.927) [r 2 for ASW and MEX populations not available]. The minimum call rate for genotyping was 95%. All SNPs were in Hardy-Weinberg equilibrium (HWE) based on National Center for Health Statistics standards (HWE rejected if p < 0.01 in two or more race-ethnic groups). We compared NHANES observed allele frequencies with those available from HapMap (http://hapmap.ncbi.nlm.nih.gov/, Release 27, Phases II and III, NCBI build 36), comparing NHW with CEU (Utah residents with Northern and Western European ancestry from the CEPH collection), NHB with ASW (African ancestry in Southwest USA), and MA with MEX (Mexican ancestry in Los Angeles, California).
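The allele-frequency and Hardy-Weinberg checks described above can be illustrated with a minimal Python sketch. It is unweighted and uses hypothetical genotype counts, whereas the NHANES analysis applied sampling weights and NCHS criteria. The function names are illustrative, and the one-degree-of-freedom chi-square test is an assumed choice, not necessarily the test the authors ran.

```python
import math


def allele_frequency(n_aa, n_ab, n_bb):
    """Frequency of allele A from genotype counts (AA, AB, BB)."""
    n = n_aa + n_ab + n_bb
    return (2 * n_aa + n_ab) / (2 * n)


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Returns (chi2, p_value) using 1 degree of freedom.
    Assumption: unweighted counts; survey weights are ignored here.
    """
    n = n_aa + n_ab + n_bb
    p = allele_frequency(n_aa, n_ab, n_bb)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts for one SNP in one race-ethnic group.
chi2, p_value = hwe_chi_square(30, 40, 30)
reject_in_group = p_value < 0.01  # threshold quoted in the text
```

Under the rule quoted in the text, a SNP would be flagged only if `reject_in_group` were true in two or more race-ethnic groups.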

Genotype risk score
We calculated a genotype risk score to test the collective association with HbA 1c of 11 SNPs at 10 loci (2 uncorrelated SNPs at ANK1). We assumed that each SNP was associated with HbA 1c based on previous association results in whites, despite potential ancestral differences in NHB or MA in linkage disequilibrium (LD) patterns [22]. Since we did not know the effect size of the MAGIC SNPs in non-white populations, we did not apply SNP-specific weights to account for SNP-specific differences in effect on HbA 1c , but simply summed the presence of 0, 1, or 2 risk alleles carried by individuals at each SNP. In addition to the 11-SNP GRS, we also performed a secondary analysis using an eight SNP "non-glycemic" risk score by excluding the three glycemic loci (G6PC2, GCK, MTNR1B) for score calculation.
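The unweighted score described above is a simple sum of risk-allele counts. The following Python sketch shows the 11-SNP score and the 8-SNP "non-glycemic" variant; the dictionary layout, the function name, and the example individual are hypothetical and serve only to illustrate the arithmetic.

```python
# Loci excluded from the "non-glycemic" score, as listed in the text.
GLYCEMIC_LOCI = frozenset({"G6PC2", "GCK", "MTNR1B"})


def genotype_risk_score(risk_allele_counts, exclude_loci=frozenset()):
    """Sum the number of HbA1c-raising alleles (0, 1 or 2) across SNPs.

    Keys are SNP labels of the form "LOCUS" or "LOCUS_rsID" (a hypothetical
    convention); the locus is the part before the first underscore.
    """
    total = 0
    for label, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"invalid risk-allele count for {label}: {count}")
        locus = label.split("_")[0]
        if locus not in exclude_loci:
            total += count
    return total


# Hypothetical individual: 11 SNPs at 10 loci (two SNPs at ANK1).
person = {
    "SPTA1": 1, "HFE": 2, "ANK1_rs4737009": 0, "ANK1_rs6474359": 2,
    "HK1": 2, "ATP11A": 0, "FN3K": 1, "TMPRSS6": 1,
    "G6PC2": 1, "GCK": 0, "MTNR1B": 1,
}

score_11 = genotype_risk_score(person)
score_8 = genotype_risk_score(person, exclude_loci=GLYCEMIC_LOCI)
```

Because no SNP-specific weights are applied, every risk allele contributes equally, which matches the rationale given in the text for populations where effect sizes are unknown.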

Statistical analyses of association
We stratified the analysis by race-ethnicity (NHB, MA, and NHW) and, to estimate rates and proportions within groups, used weights to account for sampling probabilities, following methods previously described [23]. P-values for differences across race-ethnic groups were calculated using Satterthwaite adjusted-F statistics for continuous variables and chi-square tests for categorical variables. To estimate the significance of differences in allele frequencies across groups we used Fisher's Exact tests.
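As a rough illustration of how sampling weights enter a within-group estimate, the Python sketch below computes a weighted risk-allele frequency. The weights and genotypes are hypothetical, the function name is illustrative, and the sketch ignores the stratified multistage design that SUDAAN accounts for when estimating variances.

```python
def weighted_risk_allele_frequency(dosages, weights):
    """Weighted frequency of the risk allele in one race-ethnic group.

    dosages: risk-allele counts (0, 1 or 2) per individual.
    weights: sampling weights per individual (hypothetical values here).
    Each person carries two alleles, hence the factor of 2 in the denominator.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    weighted_alleles = sum(w * d for d, w in zip(dosages, weights))
    return weighted_alleles / (2.0 * sum(weights))


# Two hypothetical participants: the second represents three times as many
# people in the target population as the first.
raf = weighted_risk_allele_frequency([2, 0], [1.0, 3.0])
```

With equal weights the estimate reduces to the ordinary sample allele frequency.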
To investigate the relationship between SNPs and HbA 1c level we used linear regression and an additive genetic model adjusted for age and sex. We included one SNP at a time in the models for individual SNP associations with HbA 1c , with genotypes coded as 0, 1 or 2 depending on the number of HbA 1c -raising alleles present. To study the collective effect of the 11 SNPs on HbA 1c we used linear regression adjusted for age and sex, totaled the number of risk alleles at all 11 SNPs to calculate a risk score, and tested associations of a per-risk-allele increase in genotype risk score with HbA 1c . We calculated the adjusted model R 2 with and without the genotype risk score for each group to determine the percent variance in HbA 1c explained by genetic effects. The same procedure was carried out for the 8 SNP "non-glycemic" risk score, as well as for genetic associations with mortality (percent dead as of 13.5 years post-baseline exam). To determine if a significant genetic risk score x ethnicity interaction effect on HbA 1c exists, we also applied the following linear regression model on the whole sample: HbA 1c level (outcome) = sex, age, genetic risk score, ethnicity, genetic risk score x ethnicity interaction.
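The additive model and the comparison of R 2 with and without the genotype score can be sketched in plain Python as follows. This is an unweighted ordinary least squares fit on hypothetical, noise-free toy data, so it recovers the per-allele slope exactly; it reports plain rather than adjusted R 2 and is not the SUDAAN procedure used in the study. All function and variable names are illustrative.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Fit y on the predictors in rows (intercept added). Returns (beta, r2)."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, x)) for x in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Hypothetical toy data: age (years), sex (0/1), genotype risk score.
age = [30, 45, 60, 50, 35, 70]
sex = [0, 1, 0, 1, 1, 0]
grs = [9, 12, 10, 13, 8, 11]
# Noise-free outcome built with a per-allele effect of 0.02 HbA1c units.
hba1c = [5.0 + 0.01 * a + 0.1 * s + 0.02 * g for a, s, g in zip(age, sex, grs)]

_, r2_base = ols(list(zip(age, sex)), hba1c)
beta_full, r2_full = ols(list(zip(age, sex, grs)), hba1c)
per_allele_beta = beta_full[3]
variance_explained_by_score = r2_full - r2_base
```

The difference `r2_full - r2_base` is the quantity the text describes as the percent variance in HbA 1c explained by genetic effects, here computed without the adjustment for the number of predictors.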
For tests of association with mortality we used logistic regression to estimate the odds of mortality with per-risk-allele increase in HbA 1c . For analysis of mortality, Cox models yielded similar results to logistic regression, so Cox model results are not shown. We also applied the following logistic regression model on the whole sample: mortality (outcome) = sex, age, GRS, ethnicity, GRS x ethnicity interaction. For the analyses we used SUDAAN (version 10.0) [24] and SAS (version 9.2, SAS Institute Inc, Cary, NC). We considered p values less than 0.05 to indicate statistical significance, based on one test per previously established SNP at each locus for each hypothesis (SNP is associated with HbA1c; SNP is associated with mortality).
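The two mortality models described above can be written out explicitly. The notation below is an assumption introduced for illustration: E_{ik} are indicator variables for the non-reference race-ethnic groups, and the choice of reference group is not stated in the text.

```latex
% Stratified model, fitted separately within each race-ethnic group
\operatorname{logit}\,\Pr(\text{death}_i = 1)
  = \beta_0 + \beta_1\,\text{sex}_i + \beta_2\,\text{age}_i + \beta_3\,\text{GRS}_i ,
\qquad
\text{OR}_{\text{per risk allele}} = e^{\beta_3}

% Pooled model with genotype risk score x ethnicity interaction
\operatorname{logit}\,\Pr(\text{death}_i = 1)
  = \beta_0 + \beta_1\,\text{sex}_i + \beta_2\,\text{age}_i + \beta_3\,\text{GRS}_i
  + \sum_{k} \gamma_k\,E_{ik}
  + \sum_{k} \delta_k\,\bigl(\text{GRS}_i \times E_{ik}\bigr)
```

In the pooled model, the interaction test asks whether the coefficients \delta_k are jointly zero, that is, whether the per-allele odds ratio is the same in every race-ethnic group.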
Linkage disequilibrium, signatures of population differentiation and natural selection at HbA 1c -associated loci
To evaluate inter-ethnic differences in LD near the SNPs, we examined 500 kb around each SNP (HapMap Release 27, Build 36, phases II and III) for four populations (CEU, YRI, ASW, and MEX). Using Haploview version 4.2 [25], we counted the number of "Gabriel" LD regions (based on confidence intervals) [26] in that region for each population. We investigated natural selection around the ten loci using Haplotter [27] and HapMap Phase II data. Standardized Integrated Haplotype Score (iHS) (a statistic based on differential LD around positively selected alleles that compares haplotype length with ancestral allele versus derived allele to detect positive selection) [27], Fay and Wu's H + statistic (a measure used to scan a region for allele frequencies that are skewed from the neutral model) [28] and the Fixation Index (F ST ) (a statistic using allele frequencies to measure genetic divergence between subpopulations) [29] were obtained through Haplotter SNP queries spanning 2 Mb regions at each locus.
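To make the fixation index concrete, the following Python sketch computes a simple two-population F ST from allele frequencies, using Wright's definition based on expected heterozygosity with equal population weights. This is an assumed textbook estimator, not necessarily the one implemented in Haplotter, and the input frequencies are the NHB and NHW risk allele frequencies quoted for ATP11A in the abstract, used purely to show the arithmetic rather than to reproduce any reported F ST value.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) for a biallelic SNP."""
    return 2.0 * p * (1.0 - p)


def wright_fst(freqs):
    """Wright's F_ST = (H_T - H_S) / H_T for equally weighted subpopulations.

    freqs: allele frequency of the same allele in each subpopulation.
    Returns 0.0 when the allele is fixed or absent overall (H_T == 0).
    """
    mean_p = sum(freqs) / len(freqs)
    h_total = expected_heterozygosity(mean_p)
    if h_total == 0.0:
        return 0.0
    h_within = sum(expected_heterozygosity(p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total


# Illustrative input: ATP11A risk allele frequency of 54% in NHB and 14% in NHW.
fst_atp11a = wright_fst([0.54, 0.14])
```

Identical frequencies in all subpopulations give F ST = 0, and the value grows as the frequencies diverge.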

Characteristics of participants
NHW individuals were older, had lower BMI and lower mean HbA 1c than did NHB and MA individuals (global p values all <0.0001, Table 1).

Risk allele frequencies of HbA 1c -associated variants
Risk allele frequencies across the 11 SNPs varied widely within the three race-ethnic groups (Additional file 1: Table S1). Six out of 11 HbA 1c -associated SNPs had risk allele frequencies that differed significantly across race-ethnic groups (Fisher's p < 0.0002). At five of these six SNPs, the risk allele frequency in NHB was the most divergent, including SNPs near ANK1 (two uncorrelated SNPs), MTNR1B, ATP11A/TUBGCP3 and TMPRSS6. At the SNP near SPTA1, risk allele frequency differed most in MA. The HbA 1c -raising allele was the minor (less frequent) allele in all three ethnic groups for SNPs near SPTA1, GCK, MTNR1B, FN3K, and TMPRSS6. The HbA 1c -raising allele was the major (more frequent) allele at SNPs near ABCB11, HFE, ANK1 (rs6474359) and HK1. At two loci, ATP11A and ANK1 (rs4737009), the HbA 1c -raising allele was the minor allele in NHW and MA, but the major allele in NHB (Additional file 1: Table S1). Risk allele frequencies observed in this study and those available from HapMap were generally similar, although at some loci minor dissimilarity with HapMap was observed in NHB and MA cohorts (Figure 1; Additional file 1: Table S2).

SNP associations with HbA 1c
Though single-SNP associations are underpowered (Additional file 1: Table S3), we did observe that eight of the 11 SNPs in NHW were consistent with Soranzo et al. ) serving as proxies for those in the MAGIC study. Beta coefficients were negative for three, three and four of the 11 SNPs in NHW, NHB and MA groups, respectively, but corresponding SNPs did not generate significant associations (Table 2). Three out of 11 HbA 1c -associated SNPs had nominally significant (p < 0.05) associations with HbA 1c levels in at least one of the three race-ethnic groups, but altogether only four of the 33 possible associations (11 SNPs x three race-ethnic groups) were significant (p < 0.05). No significant associations were observed in NHB. Two HbA 1c -SNPs produced a significant association only in NHW (both SNPs at ANK1), and one produced a significant association in both NHW and MA (rs855791 near TMPRSS6).
Combined associations of 11 HbA 1c SNPs with HbA 1c
The mean 11-SNP genotype scores (actual scores ranged from 1-18) were 11.0 (± 0.09 [SE]) in NHB, 10.7 (± 0.08) in MA and 10.4 (± 0.07) in NHW (p value for global difference across race-ethnicity < 0.0001, Table 3). Median genetic risk scores (unweighted) were 11.0 (SD = 2.2), 11.0 (SD = 2.3) and 11.0 (SD = 2.0) in NHW, NHB and MA, respectively, with distributions of genetic risk scores negatively skewed toward a lower score in all three ethnic groups. The per-risk allele increase in the score was significantly associated with HbA 1c levels in NHW and MA, but not NHB. When comparing the top and bottom 10% of the genotype score distribution for each race-ethnic group, the smallest difference in HbA 1c was observed in NHW (NHW: 0.49%; NHB: 0.56%; MA: 0.54%). The genotype score explained very little of the variance in HbA 1c levels in NHB (0.0005%) compared with NHW (0.0016%) and MA (0.0121%). Variance explained in NHW is comparable to the previously published value [7]. We observed no significant genetic risk score x ethnicity interaction on HbA 1c level (p = 0.68).

Combined associations of eight non-glycemic SNPs with HbA 1c
The mean "non-glycemic" 8-SNP genotype scores (actual scores ranged from 4-15) were 8.80 (± 0.06 [SE]) in NHB, 8.72 (± 0.06) in MA and 8.41 (± 0.06) in NHW (p value for global difference across race-ethnicity < 0.0001) (Additional file 1: Table S4). The per-risk allele increase in the score was significantly associated with HbA 1c levels in NHW, but not in NHB and MA.

Association of 11 HbA 1c SNPs with mortality
Mortality rates differed between race-ethnic groups (Table 4) with a higher mortality rate observed in NHB (19.4%) compared with NHW (12.8%) and MA (14.5%). The 11-SNP genotype score was not associated with mortality in any race-ethnic group. We observed no significant genetic risk score x ethnicity interaction on mortality (p=0.62). Power calculations for the mortality analysis are provided in Additional file 1: Table S5.
Linkage disequilibrium at HbA 1c -associated loci
There were consistently fewer LD regions in the CEU population compared to YRI at every locus (Additional file 1: Table S6). ASW, which represents a population with African ancestry in the southwestern United States, only had higher numbers of LD regions compared to CEU in two out of 11 regions, possibly due to lower coverage of ASW compared to CEU (and YRI) in HapMap Release 27.

Evidence of population differentiation and natural selection at HbA 1c -associated loci
Fay and Wu's H + was highly skewed at two loci (HK1 and ATP11A) in CEU (Additional file 1: Table S7). Integrated haplotype scores (iHS) were not highly negative or positive at these SNPs, as would be characteristic for regions undergoing recent natural selection. F ST , a measure of the amount of allelic fixation due to drift, was greater than 15% at ANK1 and ATP11A in both CEU and YRI, suggesting population differentiation at these loci [29]. Haplotter queries by gene did not reveal evidence of natural selection directly at the genes queried, but evidence of natural selection was observed within a 2 Mb region of ABCB11/G6PC2 and TMPRSS6 for CEU and YRI, respectively.

Discussion
Genome-wide association studies of HbA 1c levels in cohorts of white individuals of European ancestry revealed a combination of glycemic and non-glycemic biological influences on HbA 1c , with three loci associated with HbA 1c in or near genes likely involved in glycemic control pathways and seven loci associated with HbA 1c in or near genes likely to be involved in erythrocyte biology [7]. In this study we found that in the nationally representative NHANES III sample of US adults, heterogeneity in risk allele frequencies exists across race-ethnic groups for six of these HbA 1c -associated SNPs. Five SNP risk allele frequencies in NHB were significantly lower or higher than the other two groups. Risk allele frequencies observed in NHANES III were generally consistent with frequencies of comparable populations available in HapMap, suggesting that HapMap and NHANES III can be considered representative of each other at these SNPs at least with respect to white, African American and Mexican American race-ethnic populations. An 11-HbA 1c -associated SNP genotype score was subtly different by race-ethnicity and was associated with increase in HbA 1c in NHW and MA but not NHB. The 11-SNP genotype score was not significantly associated with mortality in any group.
There are several potential sources for the inter-race-ethnic heterogeneity of SNP and genotype risk score associations with HbA 1c that we observed. One potential source of heterogeneity is race-specific selection acting on erythrocyte-related loci that influence HbA 1c . Variants in the β-hemoglobin gene (HBB), for example, produce abnormal erythrocytes that can affect HbA 1c levels [30] but are protective against malaria and are thus maintained in populations and found at highest frequencies in regions historically exposed to this disease like Africa and India [31].
Rare mutations in many loci associated with HbA 1c (SPTA1, ANK1, HK1, TMPRSS6) are known to cause hereditary red blood cell disorders [7] and common variants at several loci (SPTA1, HFE, ANK1, HK1, TMPRSS6) are associated with hematological traits like hemoglobin concentration and mean corpuscular volume [32][33][34]. Adjustment of models of these common variants predicting HbA 1c levels for levels of hemoglobin concentration or mean corpuscular volume attenuates SNP-HbA 1c relationships, suggesting mediation of HbA 1c variation by elements of erythrocyte biology [7]. Further, a recent genetic association study showed some differences in the genetic regulation of hematological traits in Europeans compared with Africans [35]. Our analyses of differentiation and selection suggest that there may be some selection pressure at the ANK1, HK1, ATP11A, TMPRSS6 and ABCB11/G6PC2 loci, the first four of which are erythrocyte-related loci. However, in the present study, race-ethnic differences in association with HbA 1c by SNP were observed at only two of these loci (ANK1 [rs4737009] and TMPRSS6). We also examined inter-population allele frequency differences of trait-associated SNPs which may indicate that selection is operating on the trait [36]. While frequencies of some disease-associated alleles have been reported as largely heterogeneous between race-ethnicities [36][37][38][39], other data suggest no greater differentiation than would be expected from a random set of SNPs [40]. We found heterogeneous inter-race-ethnic risk allele frequencies at six of the HbA 1c -associated SNPs and three of these (SNPs near ANK1 [both SNPs] and TMPRSS6) showed inter-race heterogeneity in SNP association with HbA 1c . We found modest race-ethnic differences in the association of individual or collective HbA 1c -associated SNPs and levels of HbA 1c .
We found nominally significant associations with an HbA 1c -associated SNP genotype score and levels of HbA 1c in NHW, as expected, and also in MA, but not in NHB individuals. Ancestral variation in LD probably accounts for some of this difference in association. LD is more fine-grained in genomes of African individuals [22], so some of the HbA 1c -associated SNPs may be more tightly linked to putative functional alleles in NHW and MA than in NHB. Modest power given the relatively small sample size of NHANES III could also account for the relatively weak association of HbA 1c SNPs with HbA 1c in each race-ethnic group (Additional file 1: Table S3). No significant interactions were observed, also possibly due to low power. T2D diagnosis was based on fasting glucose with no OGTT, which may have introduced misclassification in T2D status of study subjects. Furthermore, greater heterogeneity exists in NHB, and this heterogeneity may have influenced variability in HbA 1c levels. Since there are no ancestry markers available in NHANES to evaluate genetic heterogeneity within populations, we were unable to evaluate substructure within ethnic groups and, for the purposes of this study, assumed little to no intra-population substructure.
Despite previous epidemiological associations of HbA 1c levels with mortality or cardiovascular disease [12][13][14][15][16][17][18][19] and race-ethnic variation in mortality rates in NHANES III, we did not see any evidence of an association of HbA 1c -associated loci with mortality in any race-ethnic group. If HbA 1c is associated with mortality, it is likely to be mediated through HbA 1c 's association with hyperglycemia and insulin resistance, but many HbA 1c -associated loci are associated with erythrocyte biology and not hyperglycemia. A lack of association of the HbA 1c -associated SNPs studied here and cardiovascular disease events has also been shown previously in white cohorts [7]. This unlinking of hyperglycemia from HbA 1c biology also has bearing on diabetes screening and diagnosis. Another explanation for a lack of association of the HbA 1c genetic risk score with mortality is the lack of statistical power due to small sample size within each ethnicity (Additional file 1: Table S5). When pooling the entire sample and carrying out an interaction model we also observed no significant genetic risk score x ethnicity interaction on mortality.
Race-ethnic differences in HbA 1c levels were observed in the present study and have been shown previously [41][42][43][44][45][46]. Population differences in HbA 1c levels are partly attributable to variability in non-biological factors including race-ethnic differences in lifestyle, socioeconomics, health insurance access or screening intensity [41,44]. Further, there are likely race-ethnic differences in non-glycemic biological factors including glycemic level, hemoglobinopathies [30,[47][48][49], iron deficiency anemias [21,48,[50][51][52][53][54], and erythrocyte survival [48,55,56]. The data suggest that glycemic control is not the only root cause of inter-race-ethnic differences in HbA 1c . Although the clinical impact of HbA 1c genetics on diabetes detection appears to be modest in whites, at least, whether race-ethnic heterogeneity in HbA 1c genetics influences diabetes diagnosis in other race-ethnic groups requires further investigation.
The major strengths of this study include genotyping of all 11 known HbA 1c -associated SNPs in the nationally representative, multi-race-ethnic NHANES III cohort. The heterogeneity of HbA 1c -associated SNP frequencies across race-ethnic groups and the limited impact of these SNPs on HbA 1c level in NHB individuals underscore the importance of extending association studies and the discovery of causal variants to diverse populations for a comprehensive understanding of HbA 1c genetic architecture. As diverse populations become increasingly incorporated into genetic studies for variant detection, inter-race-ethnic variation will likely continue to be revealed, necessitating careful investigation of its sources and significance.

Conclusions
In NHANES III there is substantial RAF race-ethnic heterogeneity at many HbA 1c loci. An 11-HbA 1c -associated SNP genotype score was subtly different by race-ethnicity and was associated with increase in HbA 1c in NHW and MA but not NHB. While the numerous potential sources for this race-ethnic heterogeneity in association with HbA 1c require further exploration, the data underscore the importance of extending genetic analysis to non-white populations, especially where they may have impact on guidelines for disease screening, diagnosis or management.

Additional file
Additional file 1: Table S1. Weighted Risk (HbA 1c -raising) allele frequencies of 11 HbA 1c -associated SNPs by race-ethnicity, Third National Health and Nutrition Survey (NHANES III); Table S2. Weighted allele and genotype frequencies of 11 HbA 1c -associated SNPs by race-ethnicity, Third National Health and Nutrition Examination Survey (NHANES III) versus HapMap; Table S3. Power calculations for HbA 1c at alpha = 0.05 and alpha = 0.05/11 (Bonferroni corrected) assuming similar effect sizes to those published by Soranzo et al. (2010); Table S4. Adjusted mean HbA 1c levels (%) and an 8-SNP "non-glycemic" genetic risk score by race-ethnicity, Third National Health and Nutrition Examination Survey (NHANES III); Table S5. Power calculations for mortality at alpha = 0.05 and alpha = 0.05/11 (Bonferroni corrected); Table S6. Number of LD blocks in 500 kb regions flanking each SNP (based on Haploview version 4.2, HapMap release 27, build 36, Phases II and III, February 2009); and Table S7. Scans for signatures of population differentiation and natural selection in 2 Mb regions surrounding 10 SNPs associated with HbA 1c in Europeans from Haplotter queries by SNP and queries by locus (2 Mb regions).

Competing interests
The author(s) declare that they have no competing interests.