MTHFR and F5 genetic variations have association with preeclampsia in Pakistani patients: a case control study

Background: This study investigated the role of single nucleotide variants (SNVs) of preeclampsia-related genes in Pakistani pregnant women.

Methods: After ethical approval and informed consent, 250 pregnant women were enrolled and divided equally into two groups (125 preeclamptic cases and 125 normotensive pregnant women). Demographic details and medical history were recorded, and a 10 ml blood sample was obtained for DNA extraction. Tetra-primer amplification refractory mutation system (ARMS) assays were developed to assess the variants of three preeclampsia-related genes: F5, MTHFR and VEGFA. The association of six SNVs, F5:c.1601G > A (rs6025), F5:c.6665A > G (rs6027), MTHFR: c.665C > T (rs1801133), MTHFR: c.1286A > C (rs1801131), VEGFA: c.-2055A > C (rs699947) and VEGFA: c.*237C > T (rs3025039), with preeclampsia was determined using different genetic models.

Results: Genotyping of the SNVs revealed that patients with MTHFR:c.665C > T have increased susceptibility to preeclampsia (CT versus CC/TT: OR = 2.79, 95% CI = 1.18–6.59; P* = 0.046 and CT/TT vs CC: OR = 2.91, 95% CI = 1.29–6.57; P* = 0.0497, in the overdominant and dominant models, respectively), whereas F5:c.6665A > G (A/G vs AA/GG: OR = 0.42, 95% CI = 0.21–0.84; P* = 0.038 in the overdominant model) and MTHFR:c.1286A > C (CC versus AA: OR = 0.36, 95% CI = 0.18–0.72; P* = 0.0392 in the codominant model) were associated with a significantly decreased risk of preeclampsia. The F5:c.1601G > A, VEGFA: c.-2055A > C and VEGFA: c.*237C > T variants showed no relationship with the disease.

Conclusion: This is the first case-control study describing a protective role of F5:c.6665A > G against preeclampsia in any world population. In addition, the present study confirmed the association and role of MTHFR gene variations in the development of preeclampsia in Pakistani patients. Further genetic studies may be required to better understand the complex genetic mechanism of SNVs in preeclampsia-related genes in pregnant women.


Background
Preeclampsia has been estimated to affect 2-8% of pregnancies, causing 10-15% of maternal deaths worldwide [1,2]. It is a multifactorial and complex disorder and various studies have proposed genetic, environmental, immunological and nutritional factors for its occurrence, though the exact cause largely remains debatable [3][4][5].
More than 70 candidate genes related to thrombophilia, blood pressure regulation, angiogenesis, hormones and lipid metabolism have been studied to detect an association with preeclampsia; however, results from these studies are inconsistent and conflicting [6].
The Pakistani population is genetically heterogeneous and has unique genetic profiles; several novel genes and alleles have been identified in it that help to better understand disease prediction and the course of pathogenicity. Limited literature is available for Pakistani patients explaining the interaction of genetic variations for prediction and understanding of the pathogenic mechanisms of preeclampsia [16]. It is hypothesized that genotypes of SNVs in genes crucial for the development and function of the placenta may reveal novel genetic associations that predict the disease and its mode of manifestation. In the present study, 6 SNVs of the F5, MTHFR, and VEGFA genes were studied in patients with preeclampsia and normal controls. This study may help to better understand the genetics of preeclampsia and the role of variants in the related genes for better prognosis and management of the disorder.

Study design and participants
This case-control study was conducted after approval from the Research Ethics Committee of Liaquat University of Medical and Health Sciences (LUMHS), Jamshoro. Written informed consent was taken from all participants. A total of 250 pregnant women (125 cases and 125 controls) were selected from the labour room, wards and outpatient department of the Gynaecology and Obstetrics units. An attempt was made to recruit all preeclamptic patients admitted from March 2014 to Feb 2015. During this period, 187 preeclamptic patients from 20 different districts of Sindh were admitted and, after applying the exclusion criteria, 125 cases were recruited.

Inclusion and exclusion criteria
Preeclampsia was defined, according to the American College of Obstetricians and Gynecologists, as the development of gestational hypertension (blood pressure ≥ 140/90 mmHg on two occasions at least 6 h apart) and significant proteinuria (≥ 0.3 g protein in a 24-h urine specimen or ≥ 1+ on dipstick test) after 20 weeks of gestation in previously normotensive women. Severe preeclampsia was defined by the presence of one of the following symptoms or signs in the presence of preeclampsia: blood pressure ≥ 160/110 mmHg (readings taken on two occasions at least 6 h apart), proteinuria ≥ 5 g in a 24-h urine specimen (or ≥ 3+ on two urine samples at least 4 h apart), oliguria (urine volume < 500 ml / 24 h), cerebral or visual disturbances, pulmonary edema or cyanosis, epigastric or right upper quadrant pain, platelet count less than 100,000/mm³, haemolysis, elevated liver enzymes and low platelets (HELLP) syndrome, and fetal growth restriction. Eclampsia was defined as the occurrence of convulsions in women with preeclampsia [17]. Early and late onset preeclampsia were defined as the development of preeclampsia before 34 weeks and at or after 34 weeks of gestation, respectively [18].
Controls were pregnant women beyond 20 weeks of gestation who did not meet the diagnostic criteria for preeclampsia at any time until discharge after delivery. Subjects with a history of chronic hypertension, renal disease, multiple pregnancy, molar pregnancy, diabetes mellitus, chronic infectious diseases, thromboembolic events, or antiphospholipid syndrome were excluded from the study. Controls were matched to cases for age, ethnicity and parity.
The sample size was calculated from the prevalence of combined thrombophilic mutations (F5: c.1601G > A and MTHFR:c.665C > T) described by Mello G et al. [21]. Assuming a prevalence of 19.8% in cases and 5.3% in controls, the sample size was calculated at the 95% confidence level with alpha = 0.05 and > 80% power, using the Ausvet Epitools epidemiological calculator (http://epitools.ausvet.com.au/). The minimum sample size was calculated to be n = 95 in each group; however, to increase confidence, 125 subjects were selected in each group.
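As an illustration of the calculation above, the following is a minimal sketch of a two-proportion sample size formula with an optional Fleiss continuity correction. The text does not state which formula the Epitools calculator applied, so the formula choice, the function name `n_per_group` and the exact power of 0.80 are assumptions made for this example.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80, continuity=True):
    """Per-group sample size for comparing two independent proportions.

    Assumed formula (not confirmed by the source): normal approximation
    with pooled variance under H0, optionally with Fleiss correction.
    """
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_b = NormalDist().inv_cdf(power)          # quantile for the target power
    p_bar = (p1 + p2) / 2
    delta = abs(p1 - p2)
    n = (z_a * sqrt(2 * p_bar * (1 - p_bar))
         + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / delta ** 2
    if continuity:
        # Fleiss continuity correction inflates n for small samples
        n = n / 4 * (1 + sqrt(1 + 4 / (n * delta))) ** 2
    return ceil(n)


# Prevalences taken from the text: 19.8% in cases, 5.3% in controls
n_corrected = n_per_group(0.198, 0.053)
n_uncorrected = n_per_group(0.198, 0.053, continuity=False)
print(n_corrected, n_uncorrected)
```

Under these assumptions the continuity-corrected result is 95 per group, which agrees with the minimum reported above; without the correction the same inputs give 81.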

Sample collection and DNA extraction
Ten ml of venous blood was collected from both preeclamptic and normal pregnant women in a 50 ml tube containing 400 μl of 0.5 M ethylene diamine tetra acetic acid (EDTA) as anticoagulant. Genomic DNA was extracted by the inorganic method, as described previously [22].

Development of tetra-primer ARMS PCR assay
Genotyping of variants of the selected genes was carried out by tetra-primer amplification refractory mutation system polymerase chain reaction (ARMS PCR) [23]. Primers were designed using the PRIMER1 web tool [24]. Primer sequences were confirmed with the UCSC In-Silico PCR and BLAT tools of the UCSC genome browser [25]. Amplification conditions were optimized and the desired fragments were amplified using a 2720 thermocycler (Applied Biosystems). For SNVs F5: c.1601G > A and VEGFA: c.*237C > T, PCR was performed in a 20 μl reaction containing 100 ng genomic DNA, 1.25 mM dNTP, 0.6 U Taq polymerase, 2.5 mM MgCl2 buffer and 10 μM of each forward and reverse outer primer and forward and reverse inner primer. PCR for the remaining SNVs was carried out in a 20 μl reaction containing 100 ng of genomic DNA, 1.25 μM of dNTP, 0.6 U Taq polymerase, 2 mM MgCl2 buffer and 8 μM of each forward and reverse outer primer and forward and reverse inner primer. PCR conditions were 95°C for 5 min, followed by 30 cycles at 94°C for 30 s, annealing for 45 s, extension at 65°C for 2 min and a final extension at 72°C for 10 min. PCR products were separated on a 2% agarose gel (Fig. 1).
Selected samples were Sanger sequenced to confirm the amplicons and to validate the ARMS assays. The forward and reverse outer primers were used to amplify the desired segment for DNA sequencing. Primer sequences for the ARMS assays and product sizes are listed in Table 1.
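In a tetra-primer ARMS assay, the two outer primers yield a control band in every successful reaction, while each inner primer pairs with the opposite outer primer to yield an allele-specific band of a distinct size. The sketch below shows how a genotype could be read from the bands seen in one gel lane. The band sizes (400, 250 and 190 bp), the tolerance and the function name `call_genotype` are hypothetical and are not the values of Table 1.

```python
def call_genotype(bands, control_bp, ref_bp, alt_bp, tol=10):
    """Call a genotype from the band sizes (bp) observed in one gel lane.

    All sizes passed in are illustrative; real sizes depend on the assay.
    """
    def seen(size):
        # A band counts as present if it lies within tol bp of the expected size
        return any(abs(b - size) <= tol for b in bands)

    if not seen(control_bp):
        return "failed"  # no outer-primer product: amplification failed
    ref, alt = seen(ref_bp), seen(alt_bp)
    if ref and alt:
        return "heterozygous"
    if ref:
        return "homozygous reference"
    if alt:
        return "homozygous variant"
    return "undetermined"  # control band only


# Hypothetical assay: 400 bp control, 250 bp reference allele, 190 bp variant allele
lanes = {
    "sample_1": [402, 251],
    "sample_2": [398, 249, 192],
    "sample_3": [401, 188],
    "sample_4": [],
}
calls = {name: call_genotype(b, 400, 250, 190) for name, b in lanes.items()}
print(calls)
```

Requiring the control band before calling a genotype keeps a failed reaction from being misread as a homozygote.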

Statistical analysis
Student's t-test and the two-sided Fisher exact test or Chi-square test were applied to continuous and categorical variables, respectively. A p-value of ≤0.05 was considered statistically significant. Genotype, allele and haplotype frequencies and Hardy-Weinberg equilibrium (HWE) were calculated. The association of SNVs with case-control status in the codominant, dominant, recessive, overdominant and log-additive genetic models, and the haplotype association test, were assessed by logistic regression using SNPStats software [26]. Odds ratios (OR) and 95% confidence intervals were calculated to compare allele frequencies between the two groups. To adjust for multiple comparisons, the false discovery rate (FDR), which is the expected proportion of type 1 errors among all positive tests, was controlled by applying the step-up approach of Benjamini and Hochberg [27].

Fig. 1 Agarose gel electrophoresis of the newly developed tetra-primer ARMS PCR assays for F5, MTHFR and VEGFA gene variants
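The Benjamini-Hochberg step-up adjustment named in the statistical analysis can be sketched as follows. This is a generic illustration rather than the code used in the study; the raw p-values are hypothetical and the function name `benjamini_hochberg` is chosen for this example.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Step-up: walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for six tests (not data from this study)
raw = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
adj = benjamini_hochberg(raw)
significant = [p <= 0.05 for p in adj]
print(adj, significant)
```

In this made-up example the third and fourth tests are nominally significant (0.039 and 0.041) but both adjust to 0.0615, which shows how an association can be lost after FDR correction.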

Demographic and clinical characteristics
The demographic and clinical characteristics of the participants are shown in Table 2. Among the demographic variables, age at marriage, height, weight and body mass index (BMI) did not differ between the preeclamptic and control groups (P > 0.05). Gestational age at presentation and family history differed significantly between the two groups. The preeclamptic and control groups were matched for ethnicity; the ethnic distribution of the preeclamptic patients is presented in Table 2. Fifty (40%) patients presented with early onset and 75 (60%) with late onset preeclampsia, whereas 55 (44%) preeclamptic women developed mild preeclampsia and 70 (56%) presented with severe preeclampsia.

Genotype distributions and allele frequencies
The genotype distributions of the six SNVs in both the control and preeclamptic groups were concordant with HWE. The genotype frequencies and association tests are presented in Table 3.
The F5:c.6665A > G variant was associated with a decreased risk of preeclampsia in the overdominant model (A/G vs AA/GG: OR = 0.42, 95% CI = 0.21-0.84; P* = 0.038). It was also associated with preeclampsia in the codominant and dominant models; however, these associations were lost after FDR correction.
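A concordance check with HWE of the kind reported above can be illustrated with a one-degree-of-freedom chi-square goodness-of-fit test. This is a minimal sketch under the assumption of a biallelic variant; the genotype counts are hypothetical (they are not taken from Table 3) and the function name `hwe_chi_square` is chosen for this example.

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of genotype counts against HWE expectations."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    if p in (0.0, 1.0):
        raise ValueError("monomorphic site: HWE test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts in a group of 125 women
chi2, p_value = hwe_chi_square(81, 38, 6)
print(round(chi2, 4), round(p_value, 3))
```

A p-value above 0.05 is read as no evidence of departure from HWE. For rare variants with small expected counts an exact test would be a safer choice than this approximation.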
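To illustrate how the genetic models group genotypes, the sketch below collapses three genotype counts into a 2x2 table for the dominant, recessive and overdominant models and computes an unadjusted odds ratio with a Wald 95% confidence interval. For a single binary predictor this matches what an unadjusted logistic regression would return, but it is not the SNPStats implementation. The codominant and log-additive models are omitted, and the genotype counts are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist

# Genotype order in each tuple: (AA, AG, GG), where G is the variant allele.
# Each model lists the genotype indices treated as "exposed".
MODELS = {
    "dominant": {1, 2},      # AG + GG versus AA
    "recessive": {2},        # GG versus AA + AG
    "overdominant": {1},     # AG versus AA + GG
}


def odds_ratio(cases, controls, model, level=0.95):
    """Unadjusted OR and Wald confidence interval for one genetic model."""
    exposed = MODELS[model]
    a = sum(cases[g] for g in exposed)       # exposed cases
    b = sum(cases) - a                       # unexposed cases
    c = sum(controls[g] for g in exposed)    # exposed controls
    d = sum(controls) - c                    # unexposed controls
    or_ = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return or_, exp(log(or_) - z * se), exp(log(or_) + z * se)


# Hypothetical counts for 125 cases and 125 controls (not data from Table 3)
cases = (100, 20, 5)
controls = (80, 40, 5)
or_, lo, hi = odds_ratio(cases, controls, "overdominant")
print(round(or_, 2), round(lo, 2), round(hi, 2))
```

An odds ratio below 1 with an upper confidence limit below 1, as in this made-up example, is what a protective association looks like in the overdominant model.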
Allele frequencies for MTHFR: c.665C > T and c.1286A > C differed significantly between the case and control groups (P* = 0.0425; P* = 0.0308, respectively). The distribution of preeclamptic patients by onset and severity of preeclampsia was further compared across the genotypes of the F5:c.6665A > G and MTHFR variants (Table 4).
The frequency of the heterozygous MTHFR:c.1286A > C genotype differed between preeclamptic patients who presented before 34 weeks of gestation and those who presented at or after 34 weeks. Patients with the AC genotype of MTHFR: c.1286A > C had 0.19-fold the odds of presenting with early onset preeclampsia compared with late onset preeclampsia (95% CI = 0.08-0.49; P* = 0.012). However, the association of MTHFR: c.1286A > C with severity of preeclampsia did not remain significant at the Benjamini-Hochberg adjusted P-value. The haplotype frequencies of the variants in preeclamptic patients and controls, and the linkage disequilibrium, are shown in Table 5.
The G-G haplotype of F5:c.1601G > A and c.6665A > G, and the C-C and T-C haplotypes of MTHFR: c.665C > T and c.1286A > C, showed significant association with preeclampsia.

Discussion
Preeclampsia is a complex disorder involving multiple genes related to placental pathophysiology. Variations of several genes have been studied in preeclamptic patients belonging to different populations and ethnic groups [6]. The Pakistani population is genetically heterogeneous and has remained an excellent source for studying the relationship between genes and disease. Despite numerous genetic studies conducted in different populations [3], to the best of our knowledge, only one study has reported the association of the angiotensin-converting enzyme (ACE) gene I/D variant with preeclampsia in the Pakistani population [16]. Preeclampsia has been related to thrombophilia, and the candidate variants commonly include F5:c.1601G > A, MTHFR: c.665C > T and c.1286A > C. Although various studies have reported an association of SNVs with preeclampsia [19,28], the role of F5, MTHFR and VEGFA gene variants in Pakistani preeclamptic women has not yet been determined. Our data showed significant differences between preeclamptic women and controls in the genotype frequencies of the MTHFR: c.665C > T, c.1286A > C and F5:c.6665A > G variants under different genetic models, and in the allelic frequencies of the MTHFR: c.665C > T and c.1286A > C variants. This is the first study to report an association of the F5: c.6665A > G variant with preeclampsia in any population. A previous study investigated the role of 20 missense variations of the F5 gene, including F5:c.6665A > G, in Japanese preeclamptic women and found a significant association for only two variants, rs6033 and rs6020. The F5:c.6665A > G variant was not significantly associated with preeclampsia in Japanese patients [10], which disagrees with our finding of a decreased risk of preeclampsia. F5:c.6665A > G results in the substitution of aspartic acid with glycine in the C2 domain of the encoded protein. The mutated residue is smaller, neutral and more hydrophobic than the wild type [29].
The substitution may affect hydrogen bond formation, ionic interactions and the rigidity of the protein. It may influence conformational changes in the protein and may be associated with unexplained activated protein C (APC) resistance [9][10][11]. F5:c.6665A > G has been reported as a likely benign functional variant and has a significantly higher frequency among Asians and Arabians [30].
We did not find any association of F5:c.1601G > A with preeclampsia in our cohort. Although F5 is one of the thrombophilic genes and its variants may have a key role in the development of preeclampsia, the association of F5:c.1601G > A with thrombophilia and preeclampsia has remained controversial among studies conducted on patients from different ethnic groups [31]. F5:c.1601G > A has been found to be common in Northern India and to predispose women to preeclampsia [20], which contradicts our results even though the patients may share ethnic lineages. In agreement with our study, no significant relationship of F5:c.1601G > A with preeclampsia was found in Turkish and Iranian patients [32,33]. This finding strengthens the hypothesis that F5 gene variations may have a role in

The results of the present study showed an increased risk of preeclampsia with the CT genotype under the overdominant model, and with the T allele of MTHFR: c.665C > T. In a meta-analysis, Yang et al. [34] included 57 different studies that showed TT as a risk factor for preeclampsia in Caucasians, South Americans, East Asians and Africans, whereas TT + CT was observed as a risk factor in East Asians. In the same study, a protective role was observed with CC, CT, and CC + CT in East Asians and CT in South Asians; however, no significant association was found in Hispanic and Middle Eastern populations. Another meta-analysis demonstrated a 1.45-fold increased risk of preeclampsia with the CT genotype in East Asians [35]. Contrary to these findings and the present study, Aggarwal et al. [20] found the T allele to be protective against preeclampsia in North Indian women.
MTHFR: c.665C > T is a missense variant, resulting in the substitution of alanine with valine. This leads to the production of a thermolabile protein product with reduced catalytic activity. The TT and CT genotypes have shown a 20-65% reduction in the enzyme activity that processes folic acid, as compared with the CC genotype. Further, this variant increases the risk of hyperhomocysteinemia, which is aggravated by folic acid deficiency [36,37]. PolyPhen prediction indicated it as probably damaging, whereas HOPE inference showed that the affected residue is bigger than the wild type and located in a domain that is important for the protein activity and its interactions with other domains [29,38]. MTHFR:c.1286A > C has been associated with reduced enzyme activity, though not with thermolability [36]. PolyPhen predicted it to be benign. HOPE analysis predicted the affected residue to be neutral and more hydrophobic, which might disturb correct folding [29,38].
We also found a significant association of MTHFR: c.1286A > C with preeclampsia. This contradicts the results reported in Australian [39], Dutch [40] and Mexican [41] women, in whom no association of MTHFR gene variants with preeclampsia was found; the difference may be due to genetic heterogeneity and the different ethnic backgrounds of the patients. However, a study from Southeast Iran reported an association of MTHFR:c.1286A > C with preeclampsia, suggesting its role as a risk factor for preeclampsia in Asians; in contrast to the findings of the current study, that study observed the AC genotype as a risk factor for preeclampsia [19].
We analysed the association of F5:c.6665A > G and MTHFR variants with early and late onset preeclampsia. Previously, the AC and CC genotypes of MTHFR:c.1286A > C and the GA genotype of F5:c.1601G > A have been associated with an over 2.5-fold increased risk of early onset preeclampsia [19]. Contrary to these findings, our study suggested a lower risk of early onset preeclampsia with the AC genotype of MTHFR: c.1286A > C. Further research is warranted to explore the role of genotypes in severity and outcome.

VEGF is expressed in the placenta and has a significant role in its development and maintenance. VEGFA variants have been widely studied and their association with preeclampsia has been found in several studies, including in Chinese, Brazilian, Hungarian and Korean patients [42][43][44][45]. However, our study found no association of the VEGFA: c.-2055A > C and c.*237C > T variants with preeclampsia in Pakistani patients, and no significant difference in allelic frequencies between cases and controls. These findings are in agreement with studies of VEGFA variants in North American, Greek, Mexican and Sri Lankan patients [46][47][48][49][50][51]. The differences in results among studies may be due to different inclusion criteria for cases and controls, sample sizes, geographical locations and environmental factors, as well as different ethnicities and genetic features.
The Pakistani population consists of at least 18 ethnic groups with more than 60 spoken languages [52]. The major ethnicities are the Punjabi, Sindhi, Balochi, Urdu speaking, Pathan and Saraiki groups. The ethnic groups have different genetic lineages, resulting in genetic heterogeneity. Punjabis have a complex admixture of South Asian, East Asian and West Eurasian lineages, whereas Pathans, Balochis and Sindhis share alleles with Greeks and Georgians [53][54][55], and the Urdu speaking community has heterogeneous Indian ancestry [56]. Thus, genetic variations are associated with significant differences in the risk of developing various disorders and in disease progression in the Pakistani population [15,57,58]. The majority of the preeclamptic patients in our study were Sindhi (56%) and Urdu speaking (27.2%). The frequencies and genotypes of the genes analysed in this study support the ethnic biases of the genetic variants. The frequency of the MTHFR: c.665C > T allele is up to 16.7% in Indian ethnic groups, with the highest frequency of the TT genotype (7.8%) in the Rajput ethnicity, whereas the MTHFR: c.1286A > C allele was found to be more frequent among Dravidians of east India and south India [59]. Moreover, F5:c.1601G > A heterozygotes are more frequent among people of European descent (5 to 9%) than among people of Asian and African descent (less than 1%) [60].
In the present study, BMI did not differ significantly between the preeclamptic and control groups. Several factors may affect BMI, such as differences between residents of urban and rural areas, between low and middle-income countries, and in the socioeconomic status of participants [61][62][63]. In this study, the majority of the preeclamptic patients were referred from rural areas of Sindh and had lower socioeconomic status; pre-pregnancy BMI records were not available and the patients lacked knowledge of self-measurement. BMI was therefore recorded at the time of presentation in both groups. The current findings are supported by a multicentre study conducted in Pakistan that did not find a significant association of BMI with preeclampsia [64]; furthermore, a meta-analysis reported BMI as a weak predictor of preeclampsia [65].
The present study has certain limitations. Although our sample included some major ethnicities of Pakistan, namely the Sindhi and Urdu speaking groups, ethnicities of other provinces were in the minority; for this reason, the findings on ethnic diversity in relation to preeclampsia may not be generalizable, and large-scale studies in other provinces are required. Similarly, the limited literature on genetic aspects of preeclampsia in Pakistani women hindered comparison of the present findings. We investigated only three genes, so it is possible that other genes have a role in the development of preeclampsia in the Pakistani population.

Conclusion
The MTHFR: c.665C > T variant was associated with preeclampsia, whereas the F5:c.6665A > G and MTHFR: c.1286A > C variants may have a protective effect against preeclampsia in Pakistani pregnant women. The significant association of SNVs with predisposition to preeclampsia may require further research to identify more genetic variants in preeclampsia-related genes. This may help to better understand the pathophysiological mechanisms of preeclampsia and may pave the way for effective therapeutic approaches. Furthermore, the cost-effective ARMS assays developed here may offer a rapid, simple and economical method to genotype the SNVs in further studies.

Abbreviations
ACE: Angiotensin-converting enzyme; APC: Activated protein C; ARMS: Amplification refractory mutation system; BMI: Body mass index; CI: Confidence interval; EDTA: Ethylene diamine tetra acetic acid; F5: Factor 5; FDR: False discovery rate; HELLP: Haemolysis elevated liver enzymes and low platelets; HWE: Hardy-Weinberg equilibrium; MTHFR: Methylenetetrahydrofolate reductase; OR: Odds ratio; PCR: Polymerase chain reaction; SNVs: Single nucleotide variants; VEGFA: Vascular endothelial growth factor A

Authors' contributions
enrolments, performed clinical investigations. HS: Assisted in bioinformatics and sequencing. IDU: Manuscript writing and critical review. AMW: Conceived the study, supervised the laboratory experiments, and verified the data and manuscript drafting. All authors read and approved the final manuscript.

Funding
The study was funded by Liaquat University of Medical and Health Sciences, Jamshoro, Pakistan. The funding body played no role in the design of the study, in the collection, analysis, and interpretation of data, or in writing the manuscript.

Availability of data and materials
The data supporting this manuscript are available from the corresponding author on request.