Association between CFL1 gene polymorphisms and spina bifida risk in a California population

Background: CFL1 encodes human non-muscle cofilin (n-cofilin), which is an actin-depolymerizing factor and is essential in cytokinesis, endocytosis, and the development of all embryonic tissues. Cfl1 knockout mice exhibit failure of neural tube closure at E10.5 and die in utero. We hypothesized that genetic variation within the human CFL1 gene may alter the protein's function and result in defective actin-depolymerizing and cellular activity during neural tube closure. Such alterations may be associated with an increased risk for neural tube defects (NTDs). Methods: Having re-sequenced the human CFL1 gene and identified five common single nucleotide polymorphisms (SNPs) in our target population, we investigated whether genetic variation in the CFL1 gene is associated with risk of spina bifida. Samples were obtained from a large population-based case-control study in California. Allele, genotype, and haplotype associations were evaluated in two ethnic groups, non-Hispanic white and Hispanic white. Results: Homozygosity for the minor alleles of the SNPs studied (rs652021, rs665306, rs667555, rs4621 and rs11227332) appeared to produce an increased risk for spina bifida. Subjects with the haplotype composed of all minor alleles (CGGTT) appeared to have increased spina bifida risk (OR = 1.6, 95% CI: 0.9~2.9); however, this finding was not statistically significant, likely due to limited sample size. Conclusion: The sequence variation of the human CFL1 gene is a genetic modifier for spina bifida risk in this California population.


Background
Neural tube defects (NTDs) are a group of severe congenital malformations characterized by a failure of neural tube closure during early embryonic development. NTDs are complex birth defects with a multi-factorial pattern of inheritance, requiring both genetic and environmental factors to contribute to their etiology [1]. The development and closure of the neural tube is usually completed within 28 days post-conception, in a process that is tightly regulated yet prone to environmental perturbation [2]. Periconceptional folic acid supplementation has been repeatedly reported to prevent 50~70% of NTDs [3][4][5]. Mutations and polymorphisms in folate pathway genes such as methylenetetrahydrofolate reductase (MTHFR), methionine synthase (MTR), methionine synthase reductase (MTRR), and betaine-homocysteine methyltransferase (BHMT), have been intensively investigated and in some studies have shown an association with NTD risk [6][7][8][9]. However, none of these factors individually contributes substantially to the population burden of NTDs.
In addition to folate, other developmental mechanisms have been postulated as contributors to abnormal neural tube development. Animal models have provided crucial mechanistic information and possible candidate genes to explain susceptibility to NTDs [10]. More than 80 genetic mouse models exhibit NTDs, with new ones emerging from gene targeting studies and large-scale mutagenesis screens on a regular basis [2]. A survey of the genes whose disruption causes NTDs indicates multiple key signaling pathways and cellular functions that are essential for neural tube closure. One such candidate gene encodes non-muscle cofilin (n-cofilin), an actin-depolymerizing factor. N-cofilin, encoded by the CFL1 gene, is essential for cytokinesis and endocytosis and plays a critical role in the development of all embryonic tissues. Inactivation of the Cfl1 gene in mice results in embryolethality and failure of neural tube closure by E10.5 [11]. It has been suggested that the neural tube closure defects are due to compromised delamination and migration of neural crest cells in these animals. In vitro migration assays performed on neural crest cells from these knockout embryos demonstrated limited traveling distance, failure of cell polarization, and a lack of F-actin structures such as fibers, bundles and cortical F-actin [11]. These findings suggest that n-cofilin regulates cytoskeletal dynamics during neural crest migration.
The human CFL1 gene (NM_005507), which maps to chromosome 11q13.1 [12], contains four exons and encodes an 18.5 kDa phosphoprotein. The human n-cofilin protein shares 98.8% homology with the mouse protein, and the human CFL1 gene is 92.9% homologous with the mouse gene. In this study, we re-sequenced the genomic region on human chromosome 11 that encompasses the CFL1 gene, and tested the hypothesis that genetic polymorphisms in the human CFL1 gene may modify human spina bifida risk. This hypothesis was evaluated in a population-based case-control study of infants with spina bifida.

Subjects
Epidemiological data and biological specimens were derived from the California Birth Defects Monitoring Program, a population-based active surveillance system for collecting information on infants and fetuses with congenital malformations [13]. Program staff collected diagnostic and demographic information from multiple sources of medical records for all live-born and stillborn fetuses (defined as 20 weeks gestation) and pregnancies electively or spontaneously terminated. Nearly all structural anomalies diagnosed within one year of delivery were ascertained. Overall ascertainment has been estimated as being 97% complete [14].
A total of 246 cases (infants with spina bifida) and 336 controls (non-malformed infants) were included in this study. Among the 246 cases, 86 (35.0%) were non-Hispanic white, 128 (52.0%) were Hispanic white, and 32 (13.0%) were of other ethnicities (African American, Asian, etc.). Among the 336 controls, 154 (45.8%) were non-Hispanic white, 113 (33.6%) were Hispanic white, and 69 (20.5%) were of other ethnicities (African American, Asian, etc.). These cases and controls were derived from 1983-86 and 1994-95 birth cohorts in selected California counties. Each case and control infant was linked to its newborn bloodspot, which served as the source of DNA in our genotyping analyses. All samples were obtained with approval from the State of California Health and Welfare Agency Committee for the Protection of Human Subjects. Genomic DNA was extracted from dried newborn screening bloodspots using the Puregene DNA Extraction Kit (Gentra, Minneapolis, MN).

Re-sequencing of CFL1 gene
DNA re-sequencing of CFL1 gene was conducted in a subset of samples (48 cases and 48 controls) in order to identify all sequence variations of the target genome region. Primers listed in Table 1 were designed using an online program, Primer3 [15]. The amplicons generated by the primers covered the majority of the CFL1 gene locus (GC11M065378: NCBI build 35:11:65378866~65383462), including the complete coding region, 5' and 3' un-translated regions, and all introns. Sequencing analyses were performed using BigDye Terminator Kit version 3.0 (Applied Biosystems, Foster City, CA) on an ABI 3730 Genetic Analyzer (Applied Biosystems, Foster City, CA). Sequencing results were exported to Sequencher™ software version 4.2.2 (Gene Code Corp., Ann Arbor, MI) for alignment and multiple comparisons.

Genotyping analysis
Five common single nucleotide polymorphisms (SNPs) were identified from the re-sequenced CFL1 gene. SNPs rs652021, rs665306, and rs667555 were genotyped using TaqMan SNP assays (Assay-on-Demand, Applied Biosystems, Foster City, CA) on an ABI PRISM 7900 Sequence Detection System (Applied Biosystems, Foster City, CA) following the manufacturer's instructions. SNPs rs4621 and rs11227332 were genotyped using direct sequencing analyses. Genotyping assays were performed by laboratory technicians who were blinded to the case-control status of the samples. Five percent of the samples were duplicated in order to assess possible genotyping error.

Statistical analysis
Haploview software version 3.3.2 (Daly Lab at the Broad Institute, Cambridge, MA) was used to perform haplotype analysis [16]. Haploview estimates haplotypes using the expectation-maximization (EM) algorithm. Pair-wise linkage disequilibrium (LD) between markers was measured by D' (defined as the linkage disequilibrium measure, D, divided by the theoretical maximum for the observed allele frequencies) and the squared correlation coefficient r². Spina bifida risk was measured in three ways: allelic association, haplotype association, and genotype association. Single SNP allele association and haplotype association analyses were performed using Haploview software. For each SNP, minor allele frequency (MAF) and heterozygosity (HET) were computed, and deviation from Hardy-Weinberg Equilibrium (HWE) was tested among control infants. For each haplotype, counts for case and control association tests were obtained by summing the fractional likelihoods of each individual. For example, if an individual was determined by the EM algorithm to have a 40% likelihood of haplotype A and a 60% likelihood of haplotype B, 0.4 and 0.6 would be added to the counts for A and B, respectively. The 95% confidence intervals for allele frequencies and haplotype frequencies were calculated using an online program, VassarStats [17]. Spina bifida risks were measured by odds ratios (ORs). Haplotype-specific odds ratios were computed taking the most common haplotype as the reference. For genotype analysis, ORs were computed for each SNP by logistic regression models using SAS software (version 9.1). Some models were adjusted for race/ethnicity (defined as non-Hispanic white and Hispanic white).
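The LD measures and the fractional haplotype counting described above can be illustrated with a minimal Python sketch. It is not Haploview's implementation; the function names and the input values in the usage note are hypothetical and are not taken from the study data. D' is returned as an absolute value, which is an assumption about the reporting convention.

```python
from collections import defaultdict


def ld_measures(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab -- frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a  -- frequency of allele A at locus 1
    p_b  -- frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b
    # Theoretical maximum of |D| given the observed allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


def fractional_haplotype_counts(posteriors):
    """Sum per-individual haplotype likelihoods into fractional counts.

    posteriors -- iterable of dicts mapping haplotype -> likelihood,
                  one dict per individual (hypothetical input format).
    """
    counts = defaultdict(float)
    for probs in posteriors:
        for haplotype, likelihood in probs.items():
            counts[haplotype] += likelihood
    return dict(counts)
```

For instance, `fractional_haplotype_counts([{"A": 0.4, "B": 0.6}])` adds 0.4 to the count for A and 0.6 to the count for B, mirroring the example in the text.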

Results
Five common SNPs were detected through our DNA resequencing effort: rs665306 (intron 1), rs667555 (intron 1), rs4621 (exon 2, synonymous, Asp66Asp), rs652021 (intron 2), and rs11227332 (intron 1). The functional effects of these polymorphisms were predicted through FASTSNP (Functional Analysis and Selection Tool for SNP in Large Scale Association Study) [18] and are listed in Table 2. rs4621 is a synonymous SNP in exon 2, and the A allele abolishes a putative exonic splicing enhancer (ESE) domain. rs11227332 is located in intron 1, and the G allele of this SNP disrupts a putative transcription factor (TF) binding site for cAMP-responsive element binding protein (CREB). SNP rs665306 in intron 1 is located in a putative binding site for several TFs, including AP-1, NF-E2 and USF; the nucleotide change from C to T disrupts the putative USF binding site. Another SNP in intron 1, rs667555, lies in a putative TF binding site for MZF1, and the less common T allele disrupts the binding. rs652021 is located in intron 2 and has no functional effect that can be predicted based on current knowledge. One novel G->A change, located 2 bp downstream of rs11227332, was observed in 1 spina bifida infant and 1 control infant. Non-synonymous SNPs listed in the NCBI SNP database (Build 36.1), rs11550147, rs11550151, rs11550152, rs11550156, rs11550157, rs11550158 and rs11550160, were not found in our study population. Therefore, these SNPs were not included in our analyses of the case-control data.
The five common SNPs were genotyped in the case-control study. Genotyping was successful for 99.6%, 99.2%, 95.1%, 98.0% and 96.7% among cases, and 99.1%, 98.8%, 86.9%, 97.0% and 94.9% among controls, for rs652021, rs4621, rs11227332, rs665306 and rs667555, respectively. Missing genotypes were due to failure of PCR amplification caused by a limited amount of DNA. Genotyping success rates did not differ significantly between cases and controls except for SNP rs11227332. Characteristics of the SNP markers genotyped for the CFL1 gene, as well as allelic association with spina bifida risk in the two major subpopulations, non-Hispanic whites and Hispanic whites, are listed in Table 2 and sorted by their physical location on the chromosome. Among controls, non-Hispanic whites and Hispanic whites were in Hardy-Weinberg Equilibrium (HWE) for all SNPs studied. Strong linkage disequilibrium was observed for all markers in both populations based on the calculation of D' and r² (Table 3). Among the non-Hispanic whites, SNPs rs652021, rs4621 and rs11227332 presented significantly higher minor allele frequencies (MAF) in the spina bifida infants than in the controls (p < 0.05), with no appreciable difference between the Hispanic white cases and controls; this suggests a possible association between these less common alleles and spina bifida risk among the non-Hispanic white population.
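The HWE check among controls can be illustrated with a chi-square goodness-of-fit test on genotype counts. The text does not state which HWE test was applied, so the choice of a one-degree-of-freedom chi-square test here is an assumption, and the function name and any counts passed to it are hypothetical rather than study data.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg Equilibrium.

    n_aa, n_ab, n_bb -- observed counts of the three genotypes at a
                        biallelic SNP (hypothetical inputs).
    Returns (chi2, p_value) with one degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele a
    q = 1 - p                          # frequency of allele b
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df:
    # P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

A large p-value indicates no detectable deviation from HWE; for small samples or rare alleles an exact test would usually be preferred over this approximation.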
We performed association analyses between haplotypes and spina bifida risk in the two major ethnic groups using the Haploview program; the results are presented in Table 4. Haplotypes were estimated based on the EM algorithm. Only haplotypes with frequencies greater than 1% were included in the analyses. In non-Hispanic white controls, the most common haplotype was TAACG (0.642), while in Hispanic white controls, the most common haplotype was CGATT (0.441). In both ethnic groups, the haplotype CGGTT, which is composed of the minor alleles of all five SNPs, appeared to be associated with a slightly increased risk for spina bifida (OR = 1.6). We then tested haplotypes of each pair of adjacent SNPs. In non-Hispanic whites, haplotype CG for rs652021-rs4621 (OR = 1.5, 95% CI: 1.0, 2.2) and haplotype GT for rs11227332-rs665306 (OR = 1.6, 95% CI: 1.0, 2.6) appeared to be associated with a modestly increased risk of spina bifida compared to wild types TA or AC, respectively. In Hispanic whites, haplotype GT for rs11227332-rs665306 exhibited a slightly elevated risk (OR = 1.6, 95% CI: 0.9, 2.9).
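A haplotype-specific odds ratio relative to a reference haplotype, with an approximate 95% confidence interval, can be sketched as follows. The interval uses the Woolf (log-odds) method, which is an assumption: the text does not specify how the intervals for the ORs were derived. The function name is hypothetical, and the counts may be fractional, as produced by EM-based haplotype estimation.

```python
import math


def haplotype_odds_ratio(case_hap, case_ref, ctrl_hap, ctrl_ref, z=1.96):
    """Odds ratio and Woolf confidence interval for one haplotype.

    case_hap, ctrl_hap -- counts of the haplotype of interest in cases/controls
    case_ref, ctrl_ref -- counts of the reference haplotype in cases/controls
    z                  -- normal quantile (1.96 gives an approximate 95% CI)
    All four counts must be positive; fractional counts are allowed.
    """
    odds_ratio = (case_hap * ctrl_ref) / (case_ref * ctrl_hap)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(
        1 / case_hap + 1 / case_ref + 1 / ctrl_hap + 1 / ctrl_ref
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

An interval that includes 1.0, such as the 0.9 to 2.9 interval reported for Hispanic whites, is compatible with no association at the 5% level.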
We also evaluated the association between individual CFL1 SNP marker genotypes and spina bifida risk (Table 5) in the overall population, as well as in the two major ethnic subpopulations, non-Hispanic white and Hispanic white. Among non-Hispanic whites, homozygotes for the minor alleles for all SNPs exhibited more than a two-fold increase in risk for spina bifida. Increases in risk were also seen in Hispanic whites; however, these increases were not statistically significant, i.e., consistent with random variation.
Table 3 note: D' is shown below the diagonal; r² is shown above the diagonal.

Discussion and conclusion
To our knowledge, this is the first investigation of human CFL1 gene polymorphisms as risk factors for spina bifida. Infants with the CGGTT haplotype had an increased spina bifida risk in the two ethnic groups studied. The observed increases in spina bifida risk indicate a possible role of actin depolymerizing factor in human neural tube morphogenesis.
One possible mechanism underlying the observed results is as follows. During early embryonic development, genes that regulate various morphogenetic activities must function cooperatively for the neural tube to close properly. Neural crest cell migration is one of these critical activities [2]. Neural crest cells migrate from the neural tube to the periphery and give rise to a variety of cell types. Therefore, faulty neural crest cell migration might interfere with normal neural tube closure. This suggestion is bolstered by observations derived from experimental models. Mammals have at least three highly conserved genes encoding F-actin depolymerizing factors: n-cofilin [19], m-cofilin (muscle cofilin) [20], and ADF (actin depolymerizing factor) [21]. Actin depolymerization is one of the key activities required for actin-driven motility [22]. Recently, detailed analysis of n-cofilin function in mouse embryonic development using gene knockout technology suggested that n-cofilin is essential for neural crest cell migration [11]. Our study, from a population perspective, suggests that genetic variation in the actin depolymerizing factor n-cofilin among infants may contribute to the risk of spina bifida.
The strengths of this study include its population-based ascertainment of cases and controls and its evaluation of race/ethnicity as a potentially important modifier of risk in the presence of variant genotypes and haplotypes. Population variation needs to be evaluated in population-based case-control studies of gene-risk association. Haplotype association analysis in both sub-populations suggested that haplotype CGGTT, which comprises the minor alleles of all five SNPs, conferred an increased risk of spina bifida. Although not unexpected, it is noteworthy that the distribution of haplotypes differed between the two ethnic groups. For example, the frequency of the high-risk haplotype CGGTT in non-Hispanic whites (0.161) was higher than that in Hispanic whites (0.108). Ordinal logistic regression analysis revealed that homozygosity for the minor allele of each of the five SNPs was associated with a greater than two-fold increase in spina bifida risk in the non-Hispanic white population. These increases were also seen in Hispanic whites, although they did not reach statistical significance. There are currently no experimental data on the functional consequences of these variants. We performed functional prediction using an online program, FASTSNP, which predicts the functional impact of SNPs according to current knowledge. FASTSNP provides a "risk score" for each SNP based on its putative biological function. Among the five SNPs we studied, rs4621, which is a synonymous change located in exon 2, has the highest "risk score". SNP rs652021 has a risk score of zero, which means that it has essentially no known functional impact. Strong linkage disequilibrium (LD) was observed among the five SNPs in both sub-populations (non-Hispanic white and Hispanic white) in our study. Therefore, the observed associations of these SNPs with spina bifida could be due to strong LD with the "real" functional variant.
Our results, although based on multiple tests on small sample sizes and therefore lowered statistical power, represent a preliminary step in elucidating the association between CFL1 gene variations and spina bifida risk. The haplotype CGGTT, which is composed of minor alleles of all five SNPs and was commonly found in both major race/ethnic groups studied, conferred an increased risk of spina bifida. Combined with existing knowledge about