Association of Nrf2-encoding NFE2L2 haplotypes with Parkinson's disease

Background Oxidative stress is heavily implicated in the pathogenic process of Parkinson's disease. Varying capacity to detoxify reactive oxygen species through induction of phase II antioxidant enzymes in the substantia nigra may influence disease risk. Here, we hypothesize that variation in NFE2L2 and KEAP1, the genes encoding the two major regulators of the phase II response, may affect the risk of Parkinson's disease. Methods The study included a Swedish discovery case-control material (165 cases and 190 controls) and a Polish replication case-control material (192 cases and 192 controls). Eight tag single nucleotide polymorphisms representing the variation in NFE2L2 and three representing the variation in KEAP1 were chosen using HapMap data and were genotyped using TaqMan Allelic Discrimination. Results We identified a protective NFE2L2 haplotype in both of our European case-control materials. Each haplotype allele was associated with a five-year delay in age at onset of the disease (p = 0.001) in the Swedish material, and with decreased risk of PD (p = 2 × 10⁻⁶), with an odds ratio of 0.4 (95% CI 0.3-0.6) for heterozygous and 0.2 (95% CI 0.1-0.4) for homozygous carriers, in the Polish material. The identified haplotype includes a functional promoter haplotype previously associated with high transcriptional activity. Genetic variation in KEAP1 did not show any associations. Conclusion These data suggest that variation in NFE2L2 modifies the Parkinson's disease process and provide another link between oxidative stress and neurodegeneration.


Background
Oxidative stress has been implicated as a major contributing factor in neurodegenerative diseases in general and Parkinson's disease (PD) in particular [1]. Cellular responses to oxidative stress are major determinants of disease susceptibility and aging, particularly in tissues that are sensitive to oxidative stress, such as the central nervous system [2,3]. In PD brain specimens, signs of oxidative stress are especially prominent in the substantia nigra. This may be the result of the combined presence of a high dopamine metabolism generating reactive oxygen species (ROS), low levels of the antioxidant glutathione and increased levels of iron catalyzing ROS formation [4]. Furthermore, genetic aberrations in oxidative responses may cause neurodegenerative diseases. Examples include mutations in SOD1 (encoding superoxide dismutase 1) that cause amyotrophic lateral sclerosis [5] and loss-of-function mutations in DJ-1 (encoding Parkinson disease protein 7) that lead to early-onset PD with high penetrance [6].
Nuclear factor-erythroid 2 (NF-E2)-related factor 2 (Nrf2) is a member of the cap 'n' collar family of basic leucine zipper transcription factors that regulate the expression of many antioxidant pathway genes in the so-called phase II response [7]. Nrf2 is maintained at basal levels in cells by binding to its inhibitor protein, Kelch-like erythroid-cell-derived protein with CNC homology (ECH)-associated protein 1 (Keap1) [8,9]. Keap1 is a BTB (Broad complex, Tramtrack, Bric-a-Brac) domain-containing protein [9] that targets Nrf2 for ubiquitination by Cul3/Roc-1, leading to its constitutive degradation [10-13]. Upon exposure to oxidative stress, xenobiotics, or electrophilic metabolites of phase I enzymes, repression of Nrf2 by Keap1 ubiquitination is disrupted and newly produced Nrf2 enters the nucleus [14]. There, it forms heterodimers with other transcription regulators, such as small Maf proteins, and induces the expression of antioxidant phase II genes through interaction with the antioxidant responsive element (ARE) in the promoter of these genes [15,16]. Binding of Nrf2 to ARE drives the expression of phase II enzymes, such as NQO1 [NAD(P)H dehydrogenase (quinone) 1] and HO-1 (heme oxygenase 1), that generate antioxidant molecules, such as glutathione [17,18]. Nrf2 has been shown to protect neurons from acute injury in culture [19-21] and in vivo [22]. Furthermore, upregulation of Nrf2 activity in astrocytes delays motor neuron degeneration in a mouse model of familial amyotrophic lateral sclerosis [23].
Apart from the undisputed involvement of oxidative stress in the PD process [1], there are additional and more specific links between Nrf2 function and PD. First, nuclear localization of Nrf2 is induced in PD-affected substantia nigra, even though the response appears insufficient to protect neurons from degeneration [24]. Second, treatment of nigrostriatal cultures with Nrf2 activators protects from oxidative stress-induced loss of dopaminergic cells [25]. Third, a recently discovered function of DJ-1 is to stabilize Nrf2 by preventing its interaction with Keap1 [26]. Fourth, recent and quite striking data show that induced expression of Nrf2 in brains of transgenic mice protects from MPTP (1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine)-induced damage to the nigrostriatal dopaminergic pathway as seen in PD [27].
Despite these extensive preclinical data, no association has yet been demonstrated between neurodegenerative disease and NFE2L2 or KEAP1, the genes encoding Nrf2 and Keap1 [28]. Here, we have, for the first time, performed a complete haplotype analysis of the NFE2L2 and KEAP1 genes in relation to risk of PD in two independent case-control materials. We found strong protective effects of an NFE2L2 haplotype in both materials, indicating that varying efficiency in the oxidative protection by Nrf2 may influence PD pathogenesis.

Case-control materials
The Swedish discovery material consisted of 165 PD cases and 190 age-matched controls. All individuals were of Caucasian origin. The cases fulfilled the Parkinson's Disease Society Brain Bank criteria for idiopathic PD [29], except that the presence of more than one relative with PD was not considered an exclusion criterion. PD cases with an age at onset (AAO) of <50 years were screened to exclude carriers of recognized PD-causing mutations in the DJ-1, Parkin, PINK1 and LRRK2 genes [30,31]. Demographic characteristics are given in table 1.
The Polish replication material consisted of 192 PD cases and 192 sex-matched controls. Controls were selected to be as old as possible at inclusion in the study, to minimize the number of controls developing PD later in life. All individuals were of Caucasian origin and had no familial aggregation of PD. Demographic characteristics are given in table 1.

Tag SNP selection
Single nucleotide polymorphism (SNP) genotyping data covering NFE2L2 and KEAP1 for the European material CEU (Utah residents with ancestry from northern and western Europe) were downloaded from the International Haplotype Mapping Project web site http://www.hapmap.org [32] and processed in the Haploview software [33]. Linkage disequilibrium (LD) blocks were constructed according to Gabriel et al. [34] and tag SNPs were assigned using the tagger function [33]. A minor allele frequency of ≥ 5% and pairwise tagging with a minimum r² of 0.80 were applied to capture the common variations within the blocks covering NFE2L2 and KEAP1. The common genetic variation of NFE2L2 was tagged by eight tag SNPs: rs16865105, rs7557529, rs2886161, rs1806649, rs2001350, rs10183914, rs2706110 and rs13035806, and

Statistical Analyses
Demographics for the PD cases and controls were compared using χ² statistics for categorical parameters, i.e. sex, family history of neurodegenerative disorders and smoking habits, and using the Mann-Whitney U test for age at sampling. Effects of sex, family history of neurodegenerative disease and smoking habits were analyzed by identifying significantly relevant covariates using forward stepwise logistic or linear regression in each material. All tag SNPs were analyzed for deviation from Hardy-Weinberg equilibrium using χ² statistics. Single-marker association analyses were performed using logistic or linear regression in an additive model (dd = 0, Dd = 1 and DD = 2, where D = minor allele and d = major allele).
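The Hardy-Weinberg check and the additive genotype coding described above can be sketched as follows. This is a minimal illustration in Python rather than the SYSTAT procedure used in the study; the function names `hwe_chi_square` and `additive_code` are hypothetical, and the sketch assumes a polymorphic biallelic marker (it divides by the expected counts).

```python
import math


def hwe_chi_square(n_dd, n_Dd, n_DD):
    """Chi-square test (1 df) for deviation from Hardy-Weinberg equilibrium.

    Arguments are observed genotype counts; d = major allele, D = minor allele.
    Assumes the marker is polymorphic, so no expected count is zero.
    """
    n = n_dd + n_Dd + n_DD
    q = (2 * n_DD + n_Dd) / (2 * n)  # frequency of allele D
    p = 1.0 - q                      # frequency of allele d
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_dd, n_Dd, n_DD)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 df: erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def additive_code(genotype, minor="D"):
    """Additive coding: number of minor alleles (dd = 0, Dd = 1, DD = 2)."""
    return genotype.count(minor)
```

For example, `hwe_chi_square(49, 42, 9)` describes a sample in exact Hardy-Weinberg proportions, whereas a sample with no heterozygotes at all, such as `hwe_chi_square(50, 0, 50)`, deviates strongly.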
Haplotype frequencies were estimated in the HelixTree 6.3 software using the expectation-maximization algorithm [36], yielding all possible haplotypes present in our materials. In subsequent analyses, however, only haplotypes with an overall estimated frequency of >1.0% were included, while the rarer haplotypes were pooled. In the regression analysis, the phase uncertainty for each individual was accounted for by coding each haplotype covariate according to the phase probabilities.
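To illustrate the idea, the following is a minimal sketch of expectation-maximization haplotype estimation for just two biallelic SNPs, together with the phase-probability-weighted haplotype dosage that can serve as a regression covariate. It is not the HelixTree implementation; the names `HAPS`, `compatible_pairs`, `expected_dosages` and `em_haplotype_frequencies` are hypothetical, and genotypes are assumed to be complete (no missing calls).

```python
# Haplotypes over two SNPs; 0 = major allele, 1 = minor allele at each site.
HAPS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def compatible_pairs(g1, g2):
    """Ordered haplotype pairs consistent with minor-allele counts (g1, g2)."""
    return [(h, k) for h in HAPS for k in HAPS
            if h[0] + k[0] == g1 and h[1] + k[1] == g2]


def expected_dosages(genotype, freq):
    """Expected copies (0..2) of each haplotype given current frequencies.

    The phase probability of each compatible pair is proportional to the
    product of its two haplotype frequencies.
    """
    pairs = compatible_pairs(*genotype)
    weights = [freq[h] * freq[k] for h, k in pairs]
    total = sum(weights)
    dosage = {h: 0.0 for h in HAPS}
    if total == 0:
        return dosage
    for (h, k), w in zip(pairs, weights):
        dosage[h] += w / total
        dosage[k] += w / total
    return dosage


def em_haplotype_frequencies(genotypes, n_iter=200):
    """EM estimate of haplotype frequencies from unphased genotypes."""
    freq = {h: 0.25 for h in HAPS}  # uniform starting point
    for _ in range(n_iter):
        counts = {h: 0.0 for h in HAPS}
        for g in genotypes:                      # E-step
            for h, d in expected_dosages(g, freq).items():
                counts[h] += d
        n_chrom = sum(counts.values())
        freq = {h: c / n_chrom for h, c in counts.items()}  # M-step
    return freq
```

Only double heterozygotes are phase-ambiguous in the two-SNP case; with more SNPs the same scheme applies, but the number of compatible haplotype pairs grows quickly.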
The genes were analyzed to identify the haplotype window with the strongest association to PD diagnosis and AAO of the disease, and to identify the haplotype alleles responsible for the association. To this end, we used a sliding window approach with stepwise forward logistic or linear haplotype regression including relevant covariates. The impact of each associated haplotype allele of the identified window was then investigated with logistic or linear regression including relevant covariates. Pairwise LD between the individual tag SNPs and promoter SNPs was calculated as r² according to Gabriel et al. [34].
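As a rough illustration of the two computations named above, the sketch below enumerates windows of consecutive markers and computes pairwise r² from haplotype and allele frequencies. The window sizes actually scanned are not stated in the text, so enumerating every size from two markers upward is an assumption, and the function names `sliding_windows` and `r_squared` are hypothetical.

```python
def sliding_windows(markers, min_size=2):
    """All windows of consecutive markers with at least min_size markers."""
    windows = []
    for size in range(min_size, len(markers) + 1):
        for start in range(len(markers) - size + 1):
            windows.append(tuple(markers[start:start + size]))
    return windows


def r_squared(p_ab, p_a, p_b):
    """Pairwise LD between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B (both strictly in (0, 1))
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# The eight NFE2L2 tag SNPs in the order listed under Tag SNP selection.
NFE2L2_TAGS = ["rs16865105", "rs7557529", "rs2886161", "rs1806649",
               "rs2001350", "rs10183914", "rs2706110", "rs13035806"]
windows = sliding_windows(NFE2L2_TAGS)
```

Each window would then be phased and tested by haplotype regression; the window of tag SNPs 2-6 is one element of `windows`.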
The p-value threshold for statistical significance used in this study was p = 0.05. To correct for multiple testing, Bonferroni correction for the number of studied SNPs was used in all single-marker analyses, and permutation tests with 10,000 permutations were performed in the sliding window model. Corrected p-values are designated as p_c. The statistical software packages used were SYSTAT11 (SYSTAT Software GmbH, Erkrath, Germany) and HelixTree 6.3 (Golden Helix, Bozeman, MT, USA).
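The two correction schemes can be sketched in a few lines. This is a generic illustration, not the SYSTAT or HelixTree procedure: the function names are hypothetical, and the test statistic used here (absolute difference in mean haplotype dosage between cases and controls) is a stand-in for the regression statistic of the study. In a sliding window setting, the statistic would typically be the strongest association over all windows, so that the permutation p-value accounts for the number of windows tested.

```python
import random


def bonferroni(p_value, n_tests):
    """Bonferroni-corrected p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


def mean_difference(labels, values):
    """Absolute difference in mean value between cases (1) and controls (0)."""
    cases = [v for l, v in zip(labels, values) if l == 1]
    controls = [v for l, v in zip(labels, values) if l == 0]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def permutation_p_value(labels, values, statistic=mean_difference,
                        n_perm=10000, seed=0):
    """Empirical p-value obtained by shuffling case/control labels."""
    rng = random.Random(seed)
    observed = statistic(labels, values)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if statistic(shuffled, values) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (n_perm + 1)
```

With 10,000 permutations the smallest attainable empirical p-value under this estimator is 1/10,001, which bounds the resolution of the corrected values.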

Ethics
The study was approved by the regional ethics committee at the University of Gothenburg, Sweden, and the ethics committee of the Pomeranian Medical University, Szczecin, Poland and was in compliance with the Helsinki Declaration of 1975. Written informed consent was obtained from all subjects.

Demographics
Swedish cases and controls were well matched in age. There were significant differences in the distribution of sex, family history of PD and smoking habits (table 1). We identified sex, family history of PD and ever smoking as significantly relevant covariates for the analyses of association with disease risk. Family history of PD alone was identified as a significant covariate in analyses of association with AAO.
Polish cases and controls were matched in sex. Age at sampling was significantly higher in controls than in PD cases (table 1). No covariates were identified as significantly relevant for the analysis of association with disease risk. We identified sex as a significant covariate for analysis of association with AAO.

Tag SNP genotyping
None of the studied markers in either the Swedish or the Polish material had a Hardy-Weinberg equilibrium p-value of < 0.01. The call rate was >95% for both TaqMan genotyping and sequencing.

Association analysis - Swedish material
After correction for multiple testing, none of the tag SNPs alone significantly affected risk of PD (table 3). The A allele of tag SNP 6 (rs10183914) in NFE2L2 was estimated to increase AAO of PD by four years per allele (p_c = 0.028). No associations were seen for the markers in KEAP1 with either risk of PD (table 3) or AAO in PD (data not shown).
The haplotype window of NFE2L2 consisting of the five consecutive tag SNPs 2-6 (rs7557529, rs2886161, rs1806649, rs2001350 and rs10183914) was strongly associated with risk of PD (p_c = 0.008), as well as with AAO of the disease (p_c = 0.003). Phasing of this window resulted in six haplotypes with a frequency of ≥ 5% in the PD group (table 4). Within this window, the haplotypes GAGGG and GAAAG were both associated with increased risk of PD, with odds ratios of 2.4 (p = 0.007) and 3.7 (p = 0.010) per haplotype allele, respectively (table 5). Additionally, the haplotype GAAAA was estimated to increase AAO of PD by approximately five years per haplotype allele (p = 7 × 10⁻⁴, table 5).
With regard to KEAP1, phasing of all three tag SNPs (rs1048290, rs11085735 and rs1048287) resulted in four haplotypes with a frequency of ≥ 5% (table 4), none of which was significantly associated with risk or AAO of the disease (data not shown).

Promoter analysis - Swedish material
SNPs in the NFE2L2 promoter (figure 1) have previously been shown to affect promoter activity and Nrf2 expression in vitro [37]. To test the hypothesis that the observed haplotype associations described above may be explained by linkage disequilibrium to these functional polymorphisms, we genotyped all individuals for three SNPs in the NFE2L2 promoter region by sequencing.
Phasing of the promoter window (rs35652124, rs6706649 and rs6721961) resulted in four haplotypes with a frequency of ≥ 5% (table 4). The AGA haplotype with low promoter activity [37] showed a tendency toward association with increased risk of PD (p = 0.056, table 5) and was in LD (r² = 0.53) with the risk haplotype GAGGG. The protective, disease-delaying haplotype GAAAA was in LD (r² = 0.39) with the promoter haplotype AGC, i.e. the common promoter variant associated with full Nrf2 expression. The promoter SNPs are located between tag SNPs 2 and 3 (figure 1), and the full risk haplotype GAGAAGGG (freq: PD = 10.8%, controls = 4.9%) was associated with risk of PD (p = 0.002) with an odds ratio for PD of 2.8 per haplotype allele (table 5). The full protective haplotype GAGCAAAA (freq: PD = 22.5%, controls = 26.4%) was estimated to delay AAO in PD (p = 0.001) by approximately 5 years per haplotype allele (table 5).

Replication of NFE2L2 associations - Polish material
In line with the results from the Swedish material, none of the tag SNPs or promoter SNPs alone had a significant effect on the risk of PD after correction for multiple testing (table 3). The observed association of rs10183914 with AAO in the Swedish material could not be replicated (data not shown).
In line with the results from the Swedish material, the haplotype window consisting of tag SNPs 2-6 was associated with risk of PD (p = 0.005). The haplotype GAAAA of this window was associated with decreased risk of PD (p = 0.005) with an odds ratio of 0.6 (table 5). Notably, this haplotype is identical to the protective haplotype associated with delayed AAO in the Swedish material. As in the Swedish material, GAAAA was in LD (r² = 0.42) with the common promoter haplotype AGC. Furthermore, the AGC haplotype alone showed association with reduced risk of PD (p = 0.003) with an odds ratio of 0.6 per haplotype allele (table 5). The full protective haplotype GAGCAAAA (freq: PD = 12.6%, controls = 27.8%) showed strong association with risk of PD (p = 2 × 10⁻⁶) with an odds ratio for PD of 0.4 for heterozygous carriers (table 5) and 0.2 (95% CI 0.1-0.4) for homozygous carriers. We could not replicate the risk association of the full haplotype GAGAAGGG (freq: PD = 5.9%, controls = 9.6%), since this haplotype showed association with reduced, rather than increased, risk of PD (p = 0.031) in the Polish material. There was no association with AAO in the Polish material (data not shown).
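As a consistency check on the heterozygous and homozygous odds ratios for GAGCAAAA, and assuming the haplotype effect was modelled log-additively (in line with the additive coding described under Statistical Analyses; the symbol β for the per-allele log odds ratio is introduced here only for illustration), the homozygous odds ratio is the square of the per-allele value:

```latex
\[
\mathrm{OR}_{\text{het}} = e^{\beta} \approx 0.4,
\qquad
\mathrm{OR}_{\text{hom}} = e^{2\beta}
  = \left(\mathrm{OR}_{\text{het}}\right)^{2}
  \approx 0.4^{2} = 0.16 \approx 0.2 .
\]
```

The squared value rounds to the reported 0.2, so the two figures are mutually consistent under this assumption.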

Discussion
To our knowledge, this is the first case-control haplotype study showing association of the Nrf2-encoding NFE2L2 gene with a neurodegenerative disease. In NFE2L2, we found a region, including the promoter, which was clearly associated with risk of PD in two independent case-control materials. A haplotype including the fully functional variant of the promoter (GAGCAAAA) was associated with delayed AAO in the Swedish material and reduced risk of PD in the Polish material. These results support each other and are in agreement with data from animal and in vitro models suggesting important protective functions of Nrf2 in the central nervous system [22,23,25], and more specifically, in the nigrostriatal dopaminergic pathway affected in PD [27]. SNPs in NFE2L2 have previously been investigated for association with PD in two data sets for which data have been released to the public. The first was a Japanese multiple candidate gene study [28]. This study included three NFE2L2 SNPs: rs2886161, rs2886162 and rs2706112, which showed no evidence of single-marker associations with PD. The other study was an American two-tiered whole-genome association study (first: sib pair, second: case-control) [38]. This study included six NFE2L2 SNPs (rs2706110, rs10183914, rs6726395, rs34820876, rs13005431, and rs6433657), of which two (rs2706110 and rs10183914) were also included in our study. The SNP rs6726395 showed association with PD (p = 9 × 10⁻³) in the first tier of their study, but this was not replicated in the second tier (p = 0.9). This SNP is in LD (r² = 0.9) with rs7557529, which in our study showed association with risk of PD in the Polish material (p = 0.04), but not in the Swedish (p = 0.8).
The haplotype block identified in the Swedish discovery material consisted of five consecutive tag SNPs that are located upstream of or in intronic regions of NFE2L2. The disease-associated haplotypes could thus influence expression of Nrf2 or be linked to non-synonymous SNPs. However, SNPs that affect transcription factor function are particularly rare [39] and there are no non-synonymous SNPs with a reported frequency ≥ 5% within NFE2L2 either in HapMap or in the NCBI SNP database (dbSNP). In addition, the gene and protein sequences are >80% conserved across mammalian species, supporting a strong selection pressure against genetic variation in coding sequences of NFE2L2. NFE2L2 promoter polymorphisms, on the other hand, have previously been studied and found to affect NFE2L2 promoter activity in vitro [37]. Analysis of the promoter haplotypes in our materials suggests that part of the haplotype associations with risk of PD is explained by linkage to these functional promoter SNPs. Indeed, the protective haplotype GAAAA is in LD with the wild-type, well-functioning version of the promoter, AGC, and the full AGC-including haplotype GAGCAAAA was associated with decreased risk of PD in the Polish material and older AAO of PD in the Swedish material.
While an AAO-modifying gene is conceptually not the same as a risk gene, the two concepts overlap. An AAO-modifying gene will easily appear as a risk gene in certain study designs. The two studies differ substantially in the PD patients' AAO. The Swedish patients were on average 4 years older at disease onset compared with Polish patients. Depending on the shape of the age-dependent penetrance function for the NFE2L2 haplotypes, the Polish material may be better suited for detecting risk associations, while it may be easier to detect a genetic influence on AAO in the Swedish material.
In the Polish material the significant association was with the protective GAAAA haplotype, while in the Swedish material the GAGGG haplotype was associated with increased risk. The presence of a risk (or protective) haplotype necessarily implies that other haplotypes act as protective (or risk) haplotypes relative to it. Since the different haplotypes viewed as covariates are negative confounders (due to the constraint of a total sum of two haplotypes per individual), it is a statistical phenomenon that the regression analysis can find significance for a protective haplotype in one material, while in another material a risk haplotype is found significant.
KEAP1 showed no association with PD in the Swedish or the Polish material. This finding is consistent with the American genome-wide study discussed above in which three KEAP1 SNPs (rs11085735, rs1048287, and rs2007529) were included but did not show any association with PD [38].

Conclusions
In summary, a common NFE2L2 haplotype influences risk of PD in two discrete Caucasian case-control materials. The molecular consequence of this haplotype may be increased efficiency in the Keap1-Nrf2-ARE response to oxidative stress and thereby higher capacity to withstand endogenous or environmental risk factors for PD. Further investigations in other populations as well as functional studies addressing how the disease-associated NFE2L2 haplotype affects gene expression are now needed. To conclude, these results together with recent preclinical data provide another link between oxidative stress and the pathogenesis of PD and support NFE2L2 as a novel susceptibility gene for PD.