Genome-wide association study identifies candidate genes for Parkinson's disease in an Ashkenazi Jewish population

Background To date, nine Parkinson disease (PD) genome-wide association studies in North American, European and Asian populations have been published. The majority of studies have confirmed the association of the previously identified genetic risk factors, SNCA and MAPT, and two studies have identified three new PD susceptibility loci/genes (PARK16, BST1 and HLA-DRB5). In a recent meta-analysis of datasets from five of the published PD GWAS an additional 6 novel candidate genes (SYT11, ACMSD, STK39, MCCC1/LAMP3, GAK and CCDC62/HIP1R) were identified. Collectively the associations identified in these GWAS account for only a small proportion of the estimated total heritability of PD suggesting that an 'unknown' component of the genetic architecture of PD remains to be identified. Methods We applied a GWAS approach to a relatively homogeneous Ashkenazi Jewish (AJ) population from New York to search for both 'rare' and 'common' genetic variants that confer risk of PD by examining any SNPs with allele frequencies exceeding 2%. We have focused on a genetic isolate, the AJ population, as a discovery dataset since this cohort has a higher sharing of genetic background and historically experienced a significant bottleneck. We also conducted a replication study using two publicly available datasets from dbGaP. The joint analysis dataset had a combined sample size of 2,050 cases and 1,836 controls. Results We identified the top 57 SNPs showing the strongest evidence of association in the AJ dataset (p < 9.9 × 10^-5). Six SNPs located within gene regions had positive signals in at least one other independent dbGaP dataset: LOC100505836 (Chr3p24), LOC153328/SLC25A48 (Chr5q31.1), UNC13B (9p13.3), SLCO3A1 (15q26.1), WNT3 (17q21.3) and NSF (17q21.3).
We also replicated published associations for the gene regions SNCA (Chr4q21; rs3775442, p = 0.037), PARK16 (Chr1q32.1; rs823114 (NUCKS1), p = 6.12 × 10^-4), BST1 (Chr4p15; rs12502586, p = 0.027), STK39 (Chr2q24.3; rs3754775, p = 0.005), and LAMP3 (Chr3; rs12493050, p = 0.005) in addition to the two most common PD susceptibility genes in the AJ population LRRK2 (Chr12q12; rs34637584, p = 1.56 × 10^-4) and GBA (Chr1q21; rs2990245, p = 0.015). Conclusions We have demonstrated the utility of the AJ dataset in PD candidate gene and SNP discovery both by replication in dbGaP datasets with a larger sample size and by replicating association of previously identified PD susceptibility genes. Our GWAS study has identified candidate gene regions for PD that are implicated in neuronal signalling and the dopamine pathway.


Background
Genetic linkage studies of Parkinson's disease (PD) have identified susceptibility loci in five genes: SNCA [1] (PARK1), Parkin [2] (PARK2), PTEN-induced putative kinase [3] (PINK1; PARK6), DJ-1 [4] (PARK7) and leucine-rich repeat kinase 2 [5] (LRRK2; PARK8). Mutations in these genes are rare and highly penetrant with large effects (e.g., Parkin, PARK2), and their prevalence may vary substantially by age at onset (AAO), family history of PD (FHPD), and ethnicity [6,7]. On the other hand, common genetic variants, defined as variants with a minor allele frequency (MAF) of 5% to 20-30%, are also believed to contribute to PD susceptibility. Genome-wide association studies (GWAS) often exclude SNPs with low allele frequencies (MAF < 5%), thereby excluding some rare variants that may contribute to disease susceptibility. To date, nine PD GWAS in North American, European and Asian populations have been published [8][9][10][11][12][13][14][15][16]. While the majority of studies have confirmed the association of the previously identified genetic risk factors, SNCA and MAPT, only two studies have identified three new PD susceptibility genes that reached genome-wide significance [13,16]. In a Japanese population, a GWAS identified the new susceptibility loci PARK16 at chr1q32 and BST1 on 4p15, and the HLA region was identified as a susceptibility locus in a late-onset sporadic PD population from North America [13,16]. In a recent meta-analysis of datasets from five of the published PD GWAS, an additional six novel candidate genes (SYT11, ACMSD, STK39, MCCC1/LAMP3, GAK and CCDC62/HIP1R) were identified [17]. Collectively, the associations identified in these GWAS account for only a small proportion of the estimated total heritability of PD, suggesting that an 'unknown' component of the genetic architecture of PD remains to be identified.
Some of the contributing factors to the difficulties in identification of risk variants are: etiologic heterogeneity across populations, other genetic mechanisms like methylation, and the importance of multiple, rare variants in common diseases, which are not well captured by current GWAS approaches.
To overcome some of these limitations, we applied a GWAS approach to a relatively homogeneous Ashkenazi Jewish (AJ) population living in the New York area to search for both 'rare' and 'common' genetic variants that confer risk of PD by examining any SNPs with allele frequencies exceeding 2%. We have focused on a genetic isolate, the AJ population, as a discovery dataset since this cohort has a higher sharing of genetic background, and historically experienced a significant bottleneck, thereby potentially increasing allele frequencies such that some rare variants in other European populations may be more frequent in the AJ population [18][19][20][21][22][23].
Our study had three main aims. First, we used our AJ case-control population as a discovery dataset and performed a GWAS using an overall MAF threshold of > 2% to identify novel candidate SNPs, and conducted a replication study using two publicly available datasets from dbGaP (CIDR/Pankratz et al 2009 [24] Genome wide association study in familial PD and NINDS: The National Institute of Neurological Disorders and Stroke [11]). Second, we re-analyzed the dbGaP datasets from CIDR/Pankratz et al 2009 and NINDS using an overall MAF threshold of 2% or higher to identify rare genetic variants in these datasets, and attempted to replicate the findings in the AJ dataset. Third, we examined susceptibility and candidate genes identified in previously published GWAS and other association studies, including MAPT, SNCA, LRRK2, GBA, PARK16, BST1, HLA-DRA, SYT11, ACMSD, STK39, MCCC1/LAMP3, GAK, and CCDC62/HIP1R in all three datasets. While the sample size of the AJ discovery set is relatively small, the joint analysis dataset had a combined sample size of 2,050 cases and 1,836 controls.

Subjects
The AJ GWAS dataset was created by combining participants from two studies: the Genetic Epidemiology of PD study (PD EPI) and the AJ Study. The ascertainment of cases (n = 168) and controls (n = 84) for the PD EPI study was described in detail in Marder et al. [25], and the ascertainment of cases (n = 100) and controls (n = 94) for the AJ study is described below. Briefly, for both the PD EPI and AJ studies, PD cases were recruited from the Center for PD and Other Movement Disorders at Columbia University. All met research criteria for PD. All controls underwent the same evaluation as cases, which included a medical history, the Unified Parkinson's Disease Rating Scale (UPDRS) and the Mini Mental State Exam (MMSE). Family history of PD and related disorders in first-degree relatives was obtained using a structured interview that has been shown to be reliable and valid.
The PD EPI study was enriched for cases with AAO of 50 years of age or younger, and the majority of controls were recruited via random digit dialling. Information on Jewish ancestry in each of the grandparents was obtained during an interview. Information about Ashkenazi origin was not specifically obtained; however, approximately 90% of Jews in the United States are Ashkenazi. For the AJ study, PD cases were recruited specifically based on their AJ ancestry, and information on Ashkenazi Jewish ancestry in each of the grandparents was obtained during an interview. This study was approved by the Institutional Review Board at Columbia University Medical Center. Each study participant signed a written informed consent approved by the University Human Ethics Committee.

Genotyping and Quality Control Assessment
A total of 268 cases and 178 controls were genotyped using the Illumina Human 610-quad bead arrays (Cases n = 91 and Controls n = 96) or the Illumina Human 660-quad bead arrays (Cases n = 191 and Controls n = 84). All DNAs were derived from whole blood.
Quality scores were determined from allele cluster definitions for each SNP as determined by the Illumina GenomeStudio Genotyping Module version 3.0 and the combined intensity data from 100% of study samples. Genotype calls with a quality score (Gencall value) of 0.25 or higher were considered acceptable. We genotyped 10 samples in duplicate to assess genotyping accuracy and found blind duplicate reproducibility to be 100%. In addition, 10 cases and 2 controls were genotyped on both of the above platforms, and for the overlapping SNPs, genotypes matched. Six individuals whose genotypes were highly similar to those of other participants in the identity-by-descent (IBD) analysis were removed using PLINK (http://pngu.mgh.harvard.edu/~purcell/plink/) [26]. Subsequently, for the samples with duplicates, we used the data from the Illumina Human 660-quad bead arrays. We then performed additional quality control (QC) measures using PLINK. We excluded SNPs with the following characteristics: missing genotyping rate > 5%; minor allele frequency < 2%; Hardy-Weinberg Equilibrium (HWE) test [27] p-value < 0.0001 in controls. This screen reduced the total number of analyzed SNPs by 1.67%. Following all QC measures, we analyzed 525,124 SNPs. Figure 1a shows the Q-Q plot for the AJ dataset, which was generated using the WGAviewer program [28]. SNAP (http://broad.mit.edu/mpg/snap/) was used to identify and annotate nearby SNPs in linkage disequilibrium (proxies) based on HapMap, and to query and display LD and regional association plots with the GWAS results [29].
GBA and LRRK2 mutation status has been reported previously in the AJ sample derived from the PD EPI study [22,30].
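The SNP-level exclusion criteria described above (missing genotyping rate, minor allele frequency, and HWE in controls) can be illustrated with a minimal Python sketch. This is not the PLINK implementation: genotypes are assumed to be coded as 0, 1 or 2 copies of one allele with None for a missing call, all function names are hypothetical, and the HWE check uses an asymptotic 1-df chi-square test as a stand-in, which may differ from the test cited in the text [27].

```python
import math


def missing_rate(genotypes):
    """Fraction of samples with no genotype call (None) at one SNP."""
    return sum(g is None for g in genotypes) / len(genotypes)


def minor_allele_freq(genotypes):
    """Minor allele frequency among called genotypes (coded 0/1/2)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def hwe_pvalue(genotypes):
    """Asymptotic 1-df chi-square p-value for Hardy-Weinberg equilibrium.

    Assumption: a chi-square approximation, not an exact test.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        return 1.0
    observed = [called.count(k) for k in (0, 1, 2)]
    p = (observed[1] + 2 * observed[2]) / (2 * n)
    q = 1.0 - p
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    if min(expected) == 0:
        return 1.0  # monomorphic SNP: nothing to test
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(all_genotypes, control_genotypes,
              max_missing=0.05, min_maf=0.02, min_hwe_p=1e-4):
    """Apply the three SNP filters with the thresholds quoted in the text."""
    if missing_rate(all_genotypes) > max_missing:
        return False
    if minor_allele_freq(all_genotypes) < min_maf:
        return False
    if hwe_pvalue(control_genotypes) < min_hwe_p:
        return False
    return True
```

In this sketch the HWE filter is evaluated on control genotypes only, mirroring the description above, while missingness and MAF are evaluated on all samples; whether the original analysis computed MAF on all samples or on a subset is not stated.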

Population stratification
We examined ancestry for each subject to estimate cryptic population stratification using the identity-by-state (IBS) based clustering method as implemented in PLINK [26]. Briefly, we used all available SNPs (n = 522,578 autosomal SNPs) for the PLINK analysis (version 1.05) to assess underlying population structure. We augmented the 446 AJ samples with reference subjects from the HapMap website (http://www.hapmap.org/), comprising 60 European Americans, 60 Yorubans and 90 Asians. The best fitting model assumed two underlying populations; however, the second cluster was small (n = 14), and this group of individuals clustered with the HapMap European American subjects. These subjects were not dropped from the analysis.
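As a rough illustration of the quantity that underlies IBS-based clustering, the sketch below computes the proportion of alleles shared identical-by-state between two individuals. It is a simplified stand-in, not PLINK's clustering procedure: the function name is hypothetical, and genotypes are assumed to be coded as 0, 1 or 2 copies of one allele with None for a missing call.

```python
def ibs_similarity(genotypes_a, genotypes_b):
    """Proportion of alleles shared IBS between two individuals.

    At each SNP the pair shares 2 - |g_a - g_b| alleles (0, 1 or 2).
    SNPs with a missing call in either individual are skipped.
    """
    shared = 0
    n_snps = 0
    for g_a, g_b in zip(genotypes_a, genotypes_b):
        if g_a is None or g_b is None:
            continue
        shared += 2 - abs(g_a - g_b)
        n_snps += 1
    if n_snps == 0:
        raise ValueError("no SNPs with calls in both individuals")
    return shared / (2 * n_snps)
```

A clustering step would then group individuals whose pairwise similarities are high; the specific linkage rule and stopping criteria used in the original analysis are not described here and are not reproduced.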

Statistical Analysis
We conducted single-point allelic association analysis using the Mantel-Haenszel chi-squared test statistic, which tests for SNP-disease association conditional on the population subcluster estimated from the PLINK analysis described above (Additional file 1). For the SNPs that had the strongest support for association and were located within a gene, we performed two additional analyses for the region containing the top SNPs and for several PD genes that have been previously reported. First, we conducted haplotype analysis using 2 or 3 contiguous SNPs as implemented in the PLINK program. This approach computes a Wald statistic p-value comparing each haplotype between cases and controls, as well as an overall p-value that compares the frequencies across all the haplotypes [26]. Second, we computed odds ratios adjusted for age and sex as implemented in PLINK (Results; Additional files 1 and 2).
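The stratified allelic test can be sketched as follows. This is a minimal textbook Cochran-Mantel-Haenszel calculation over 2x2 allele-count tables, one per population subcluster, and is offered only as an illustration: the function name and table layout are hypothetical, no continuity correction is applied (an assumption), and it is not claimed to match PLINK's output exactly.

```python
import math


def cmh_test(strata):
    """Cochran-Mantel-Haenszel test over 2x2 allele-count tables.

    Each stratum is a tuple (a, b, c, d):
      a = minor alleles in cases,    b = major alleles in cases,
      c = minor alleles in controls, d = major alleles in controls.
    Returns (chi2, p_value, common_odds_ratio).
    """
    diff_sum = 0.0  # sum over strata of observed minus expected a
    var_sum = 0.0   # sum over strata of the hypergeometric variance of a
    or_num = 0.0    # numerator of the Mantel-Haenszel odds ratio
    or_den = 0.0    # denominator of the Mantel-Haenszel odds ratio
    for a, b, c, d in strata:
        n = a + b + c + d
        if n < 2:
            continue  # stratum carries no information
        expected_a = (a + b) * (a + c) / n
        variance_a = (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
        diff_sum += a - expected_a
        var_sum += variance_a
        or_num += a * d / n
        or_den += b * c / n
    if var_sum == 0:
        raise ValueError("no informative strata")
    chi2 = diff_sum ** 2 / var_sum
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = or_num / or_den if or_den > 0 else float("inf")
    return chi2, p_value, odds_ratio
```

Pooling evidence within strata, rather than collapsing all subjects into one table, is what makes the test conditional on subcluster: an allele-frequency difference between subclusters does not by itself produce a signal.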

Candidate Gene Analyses
We performed separate analyses focusing on SNPs in the candidate genes that were identified from previous genome-wide association studies, including MAPT, SNCA, LRRK2, GBA, PARK16, BST1, HLA-DRA, SYT11, ACMSD, STK39, MCCC1/LAMP3, GAK and CCDC62/HIP1R. For these genes, we computed the Mantel-Haenszel chi-squared test to assess allelic association and computed odds ratios adjusted for sex and age to assess the effect size of the SNP in the AJ population.

Replication Datasets
To determine whether the findings from the Ashkenazi Jewish discovery samples are supported in independent samples, we examined the publicly available GWAS data for the CIDR/Pankratz et al 2009 [24] and NINDS PD GWAS [11] (Table 1).
We applied the same allelic association model to the replication datasets to determine whether the candidate SNPs from the AJ dataset are associated with PD in these two unrelated datasets. To assess the overall effect of candidate SNPs, we then conducted a meta-analysis using the weighted Z-score meta-analysis as implemented in METAL (http://www.sph.umich.edu/csg/abecasis/metal/). For some SNPs, only two of three datasets were used because the SNPs were not available in all datasets and imputation was unreliable. Table 1 shows the summary characteristics of genotyped subjects for the AJ discovery dataset and two replication datasets of white subjects that were comparable in demographic and clinical characteristics. The AJ dataset had a slightly larger number of cases (n = 268) versus controls (n = 178), but the remaining two datasets had a comparable number of cases versus controls. To control for population stratification, we excluded subjects who clustered differently from the majority of Ashkenazi Jews (see subsection Population stratification in the Materials and Methods section for details).
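The weighted Z-score meta-analysis mentioned above can be sketched in a few lines. The version below follows the commonly described sample-size weighting scheme, in which each study's p-value is converted to a signed Z-score and weighted by the square root of its sample size; the function names, the input layout, and the choice of total sample size as the weight are assumptions for illustration rather than a description of the exact METAL settings used in this study.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def signed_z(p_value, direction):
    """Convert a two-sided p-value and effect direction (+1 or -1) to a Z-score."""
    # inv_cdf(p/2) is negative, so negate it to get the magnitude.
    return -direction * _STD_NORMAL.inv_cdf(p_value / 2.0)


def weighted_z_meta(studies):
    """Combine studies given as (p_value, direction, n_samples) tuples.

    Returns (combined_z, combined_two_sided_p).
    """
    numerator = 0.0
    weight_sq_sum = 0.0
    for p_value, direction, n_samples in studies:
        weight = sqrt(n_samples)
        numerator += weight * signed_z(p_value, direction)
        weight_sq_sum += weight * weight
    combined_z = numerator / sqrt(weight_sq_sum)
    combined_p = 2.0 * _STD_NORMAL.cdf(-abs(combined_z))
    return combined_z, combined_p
```

Because the direction of effect enters the Z-score, two studies with similar p-values but opposite risk alleles cancel rather than reinforce each other, which is why consistency of allelic direction across datasets matters when interpreting the combined p-value.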

Results and discussion
Overall, the NINDS dataset obtained from dbGaP consisted of 931 PD cases and 798 controls; a total of eight PD cases were excluded because these individuals were missing genotype data from the Human Hap300v1. The CIDR/Pankratz et al 2009 dataset consisted of 900 cases and 867 controls. We excluded 41 cases and seven controls with genotyping rates < 99%. These two datasets were downloaded from the dbGaP site on September 25th, 2009; therefore, the numbers of cases and controls do not necessarily agree with the publications for the two studies [11,24]. For these two external datasets, we did not have

AJ dataset
We identified seven candidate SNPs of high priority from the AJ discovery dataset (Table 2). Specifically, we identified the top 57 SNPs with p-value < 9.9 × 10^-5 from the AJ discovery dataset (Additional file 1; Figure 2a). Although these SNPs do not meet the stringent Bonferroni corrected genome-wide significance p-value of 9.5 × 10^-8, these SNPs provide the strongest support for harbouring susceptibility genes for PD. To further screen these 57 candidate SNPs, we checked to see whether they were: (1) located within genic regions, and (2) replicated in at least one independent dataset. Twenty-seven out of 57 SNPs were located in or near to genes (Additional file 1), and the remainder of SNPs were located in intergenic regions. When we evaluated those SNPs in the two replication datasets, we identified six SNPs which were located within six candidate genes, namely LOC100505836, LOC153328/SLC25A48, UNC13B, SLCO3A1, WNT3, and NSF (Table 2, Figure 3). Of the six SNPs located within a gene, for three SNPs (rs10121009, rs7171137, and rs183211), the direction of allelic association was the same in all three datasets, whereas for SNPs rs415430, rs4976493 and rs1694037 the direction was the same in two datasets.
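The Bonferroni threshold quoted above is consistent with dividing a family-wise error rate by the number of SNPs tested. The short sketch below reproduces the arithmetic under two assumptions that are not stated explicitly in the text: a family-wise alpha of 0.05, and one test for each of the 525,124 SNPs retained after QC. The function name is hypothetical.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold for a family-wise error rate alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# 525,124 SNPs analyzed after QC in the AJ dataset; alpha = 0.05 is assumed.
threshold = bonferroni_threshold(525_124)
print(f"{threshold:.2e}")  # prints 9.52e-08
```

Under these assumptions the result agrees with the reported value of 9.5 × 10^-8 to two significant figures, and it shows why the top 57 SNPs (p < 9.9 × 10^-5) fall roughly three orders of magnitude short of genome-wide significance.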

NINDS and CIDR/PANKRATZ
In the NINDS dataset, we re-examined the data and identified four SNPs that reached genome-wide significance at p < 9.7 × 10^-8 (Figure 2b, Additional file 2). Of the four SNPs, one SNP (rs3784847) was located within a gene, CDH8, and the remaining three SNPs were in intergenic regions. The allele frequencies were
rs4976493 in LOC153328/SLC25A48 was associated with PD in the AJ and NINDS datasets, but not in the CIDR/Pankratz et al 2009 dataset. However, the meta-analysis based on the three datasets supported association with PD (rs4976493, p = 0.005) (Table 2). We then performed a 2-mer and 3-mer sliding window haplotype analysis (Additional file 3a). The strongest association in the AJ dataset was observed for a 2-mer haplotype rs4976493-rs4246802 ('GG') (AJ: p = 6.93 × 10^-5; NINDS: p = 0.025) (Additional file 3a). In the NINDS dataset, the haplotype involving rs2304075-rs6596270 'TT' was most significantly associated with PD (p = 0.008, with global-p = 0.027) and this haplotype was also significant in the AJ dataset (p = 0.003, with global-p = 0.004).

UNC13B (9p13.3)
rs10121009 was consistently associated with PD in all three datasets (
Allele A in rs7171137 was consistently associated with increased risk of PD in the AJ and NINDS datasets, and the meta-analysis supported the association (p = 4.09 × 10^-5, Table 2). Moreover, haplotype 'GA' at SNPs rs2387400-rs7171137 was associated with PD (p = 4.04 × 10^-5 in the AJ dataset and 0.014 in the NINDS dataset) (Additional file 3c).

NSF (17q21.31) and WNT3 (17q21.32)
Because the two genes are located close to each other and to MAPT, we present each gene independently first, then the region as a whole, including MAPT. We observed a strong single-SNP and haplotype association between PD and rs183211 (NSF) in the AJ and CIDR/Pankratz et al 2009 datasets, but not in the NINDS dataset (Figure 4a, b, c). In addition, WNT3, located adjacent to NSF, was also associated with PD in the AJ and NINDS datasets (Figure 3a, Table 2, Additional file 3d). In comparison, rs1981997, the SNP with the strongest support in MAPT, was similarly, but slightly more weakly, associated with PD in the AJ dataset (p = 0.0009) and the CIDR/Pankratz et al 2009 dataset (p = 0.0002).
Because NSF and WNT3 are located close to MAPT and multiple datasets show support for possible association with PD, we examined this region encompassing MAPT-NSF-WNT3 further. For this purpose, we used the most significant SNP for each gene: rs183211 in NSF, one 'H1 TagSNP' (rs1981997) in MAPT, and rs415430 in WNT3. This analysis aimed to determine whether the NSF and WNT3 SNPs confer an independent association or whether the association with the NSF SNP was primarily due to high linkage disequilibrium between the two SNPs. We reasoned that, if there exists an independent contribution from NSF, WNT3, or both, we would expect to see allelic or haplotype association in NSF, WNT3 or both, regardless of the SNP allele at MAPT. However, this analysis is limited in the present study, because the frequency of the 'A' allele at the MAPT SNP is low in all three datasets (i.e., 0.216 in the AJ, 0.202 in the NINDS, and 0.202 in the CIDR/Pankratz et al 2009 dataset). Table 3 supports that the C-T haplotype at NSF and WNT3 was associated with PD (p = 1.91 × 10^-5). When we extended the analysis to 3-mer haplotypes by including the associated SNP at MAPT, we observed that the G-C-T haplotype was strongly associated with PD, as was the C-T haplotype at NSF and WNT3 (p = 0.00014), but A-C-T was not (p = 0.1172). This suggests that LD may play a role in the association with NSF and WNT3. However, the frequency of haplotype A-C-T was higher in cases than in controls. While the haplotype A-C-T association was not statistically significant, this observation does support the possibility that a variant(s) in NSF and WNT3 may contribute to PD, independent of MAPT. This association was replicated in the NINDS dataset, but not in the CIDR/Pankratz et al 2009 dataset because that dataset lacked the SNP in WNT3. Taken together, there is suggestive evidence that NSF and WNT3 are candidate genes that need to be further studied.

LOC100505836 (3p24)
The SNP rs1694037, located in LOC100505836, was replicated in the CIDR dataset (p = 0.049) but not in the NINDS dataset (p = 0.849) and was not significant in the meta-analysis of all three datasets (Table 2).

Replication of Previously identified PD Susceptibility Genes
In the analysis of the discovery AJ dataset, the previously identified PD susceptibility genes MAPT, SNCA, LRRK2, GBA, PARK16, BST1, HLA, SYT11, ACMSD, STK39, LAMP3, GAK and CCDC62/HIP1R were not included in the top 57 candidate SNPs/genes. GBA and LRRK2 were reported previously in the AJ sample derived from the PD EPI study [22,30]. These genes also failed to reach genome-wide significance in the NINDS and CIDR/Pankratz et al 2009 datasets. As shown in previous studies, it is not unexpected to miss risk genes in GWAS when SNP coverage is sparse in the candidate gene regions. Thus we further assessed association of SNPs at the chromosomal regions harbouring these genes in the AJ dataset and performed a meta-analysis including the NINDS and CIDR/Pankratz et al 2009 datasets. PD susceptibility genes that were associated in the AJ dataset are reported below.

MAPT (17q21.31)
We assessed associated SNPs and the presence of chromosome 17q21.31 alleles in the H1-H2 haplotype clades in MAPT using 'rs1981997' (HAPMAP CEU: 'A' = 0.208 and 'G' = 0.792) as a haplotype Tag SNP because the major (G) allele and the minor (A) allele of this SNP are fixed in the H1 and H2 clades respectively [31]. H1-H2 haplotype Tag SNP rs1981997 was associated with PD in the allelic and haplotype association analyses in both AJ and CIDR/Pankratz et al 2009 datasets. As discussed above, the present study supports the notion that in addition to MAPT, NSF, WNT3, or both contribute to PD susceptibility (Table 3).

LRRK2 (12q12)
We assessed association of SNPs at the 12q12 region harbouring the LRRK2 gene in each dataset. SNPs within or near to LRRK2 did not reach genome-wide significance in any of the datasets and were not included in the top 57 SNPs in the AJ dataset. However, we genotyped all subjects in the AJ dataset for the LRRK2 'G2019S' mutation, and observed an association of haplotypes consistent with previously published studies [20,30]. The strongest association was observed for the haplotype rs1427271-rs10735934-rs34637584 'GTA' (p = 7.66 × 10^-5) (Additional file 4 for single point analysis; haplotype results not shown).

GBA (1q21)
Although SNPs within the GBA gene from the GWAS did not reach genome-wide significance in any of the datasets analyzed, we did observe strong association of SNPs and haplotypes at the GBA locus in our AJ dataset. When we assessed association of 8 SNPs at chromosome 1q21 spanning a 74.4 kb region from TRIM46 (rs4971100) to SCAMP3 (rs3180018), we observed that SNPs located in GBA were significantly associated with disease (e.g., rs2990245: OR = 1.39; p = 0.015) (Additional file 4). Using the SNPs from the GWAS along with the GBA N370S allele (which was genotyped in all subjects), our 2-mer and 3-mer sliding window haplotype analyses flanking the GBA N370S allele revealed that a risk haplotype 'ATG' spanning 12.5 kb (GBA 'N370S', rs2049805 and rs1045253) was associated with PD in the AJ dataset (p = 8.19 × 10^-4) but not in the replication datasets (Figure 5). We previously reported that the GBA 'N370S' mutation is associated with PD [22].

PARK16 (1q32.1)
The PARK16 locus was previously identified as a susceptibility locus in a GWAS of PD in a Japanese population (p = 1.52 × 10^-12) [13] and was also confirmed in a meta-analysis of PD GWAS. This region encompasses multiple genes; therefore, we assessed SNPs for association across a region spanning ~170 kb (203905087-204074636 bp) at the PARK16 locus in all three datasets (Figure 6). In the AJ dataset the most strongly associated SNP, rs823114 (p = 6.12 × 10^-4), was located in an intergenic region proximal to NUCKS1. Our analysis confirmed the finding of Satake et al (2009) [13] in a Japanese population, in which they found that rs823114 (p = 2.7 × 10^-34) was strongly associated with transcript levels of NUCKS1, suggesting that this gene is a promising candidate for PARK16. However, this SNP was not associated with PD in the replication datasets. In addition, two SNPs (rs708730 and rs1891094) in SLC41A1 were modestly associated with PD in the AJ dataset and NINDS dataset.

BST1 (4p15.32)
The BST1 gene was previously identified as a susceptibility gene in a GWAS of PD in a Japanese population [13] and was also confirmed in a meta-analysis of PD GWAS. On 4p15.32, four SNPs (rs11931532, rs12645693, rs4698412 and rs4538475) reached p < 5 × 10^-7 in the
The HLA region was recently identified as a susceptibility locus in a late-onset sporadic PD population from North America [16]. We did not find evidence for association of SNPs at the HLA-DRA region with PD in the AJ dataset (Additional file 4).

Conclusions
This study identifies the candidate gene regions LOC100505836, SLC25A48, UNC13B, SLCO3A1, WNT3, and NSF as new candidates for PD using an AJ case-control population as a discovery dataset and two other large publicly available datasets as replication datasets. By utilizing a relatively genetically homogeneous AJ population and including less common variants (using a MAF threshold of 2% or higher), we report additional susceptibility variants for PD. In addition, we examined the magnitude of association of previously reported PD candidate genes including MAPT, SNCA, LRRK2, GBA, PARK16, BST1, HLA, SYT11, ACMSD, STK39, MCCC1/LAMP3, GAK and CCDC62/HIP1R in the AJ dataset and found them to be comparable to several reports in North American and European populations.
Of the new candidate genes that we identified in this study, many represent interesting candidates for PD based on function, as discussed below, and warrant additional follow up in independent studies and different PD populations. Functional studies suggest a role for three of the genes that we identified (SLC25A48, UNC13B, and NSF) in neuronal signalling and the dopamine pathway. SLC25A48 is a member of the solute carrier family 25 proteins that function as transporters of a large variety of molecules including ATP/ADP and amino acids [32]. Characterized SLC25s localize to the inner mitochondrial membrane and are also often referred to as mitochondrial carriers or uncoupling proteins (UCPs) [33]. SLC25A48 is highly expressed in the central nervous system (CNS) including the hypothalamus, pituitary and brainstem and has been shown to be important in healthy neurons for energy production and to have a role in neuronal signalling [34]. Previous studies have suggested a role for mitochondrial UCPs in PD, Alzheimer disease and amyotrophic lateral sclerosis [32].
The SNP rs10121009, located in UNC13B (MUNC13), was included in the top 57 SNPs in the AJ dataset and also showed evidence of strong association in a meta-analysis of all three datasets (p = 2.75 × 10^-6). Experiments in Caenorhabditis elegans (C. elegans) and mammalian cellular model systems suggest a role for the MUNC13 family of proteins in the priming of synaptic and secretory vesicles in a step just preceding fusion with the plasma membrane. MUNC13 has been shown to control the release of both neurotransmitters and neuropeptides from motor neurons at the C. elegans neuromuscular junction [35]. The lipids and proteins involved in these networks are highly conserved between C. elegans and mammals.
Because NSF and WNT3 are located close to MAPT and multiple datasets show support for possible association with PD, we examined this region encompassing MAPT-NSF-WNT3 further. Our data support the possibility that a variant(s) in NSF and WNT3 may contribute to PD, independent of MAPT. This association was replicated in the NINDS dataset, but not in the CIDR/Pankratz et al 2009 dataset because that dataset lacked the associated SNP in WNT3. Taken together and based on the function of these genes, there is suggestive evidence that NSF and WNT3 are candidate genes that need to be further studied. The function of NSF in vesicular trafficking and membrane fusion is well documented, and the protein has also been shown to play a role in the fusion of synaptic vesicles in the presynaptic membrane during neurotransmission and to interact with neurotransmission receptors at the postsynaptic side [36]. More recent studies suggest an interaction between NSF and the dopamine D1 receptor (D1R) which is important for the membrane localization of D1R [37]. D1R plays important roles in regulating motor coordination, working memory, learning and reward, and D1R dysfunction is associated with both psychiatric and neurological disorders including PD [38]. WNT3 is a member of the WNT gene family, which encodes secreted signaling proteins that play a role in several developmental processes, including embryonic and adult neurogenesis. Postnatal neurogenesis has been observed in two brain regions in vertebrates including humans: the subventricular zone (SVZ) of the lateral ventricle and the subgranular zone (SGZ) of the dentate gyrus in the hippocampus. Genetic factors essential for neural development, including WNT3, are also expressed in adult neurogenic regions. Cell proliferation of neural progenitors in the SVZ of PD patients and animal models has been shown to be decreased and modulated by dopamine.
We also replicated the association of several previously identified PD genes and loci in our AJ population, including MAPT, SNCA, LRRK2, GBA, PARK16, BST1, STK39 and LAMP3. LRRK2 and GBA represent the most common risk factors in the AJ PD population. In the AJ dataset, we observed a significant association for the LRRK2 'G2019S' mutation as well as for a single haplotype, and these findings are consistent with previously published studies. Among 268 PD cases, 31 (11.6%) individuals carried the LRRK2 G2019S mutation, and their mean age at onset was younger than or similar to that of non-carriers (mean age at onset of 56.5 (SD = 11.1) vs. 60.3 (SD = 12.3), respectively). The GBA 'N370S' mutation is the most common allele reported in AJ PD cases in several studies; however, a risk haplotype supporting a founder effect has not been previously reported. In our AJ PD dataset we identified a risk haplotype 'ATG' (GBA 'N370S', rs2049805 and rs1045253) (p = 8.19 × 10^-4) spanning 12.5 kb, suggesting that these individuals share a common founder. Among 268 PD cases, 28 individuals carried the GBA N370S mutation (10.4%), and their mean age at onset was younger than or similar to that of non-carriers (mean age at onset of 57.4 (SD = 12.4) vs. 60.2 (SD = 12.1), respectively).
Our analysis of the PARK16 locus in our AJ dataset confirms the finding of Satake et al (2009) [13] in a Japanese population and suggests that NUCKS1 is a promising candidate for PARK16. More recently, Tucci et al (2010) [39] analysed the coding regions of 3 candidate genes (NUCKS1, RAB7L1, and SLC41A1) at PARK16 in a British cohort of 182 PD patients. Novel mutations were identified in 1 PD patient in RAB7L1 (K157R) and in another patient in SLC41A1 (A350V). Follow-up studies including re-sequencing of the NUCKS1 gene and other candidate genes at the PARK16 region are warranted.
In summary, our GWAS has identified candidate gene regions for PD that are implicated in neuronal signalling and the dopamine pathway. Although the power to detect genome-wide significance in the AJ dataset was low because of the small sample size, we have demonstrated the utility of this dataset in gene and SNP discovery, both by replication and joint analysis with larger dbGaP datasets and by replicating the association of previously identified PD susceptibility genes. Follow-up genotyping, replication studies and sequencing will be needed to confirm our findings.