Absence of association between SERPINE2 genetic polymorphisms and chronic obstructive pulmonary disease in Han Chinese: a case-control cohort study

Background Recent studies have proposed the serine protease inhibitor E2 gene (SERPINE2) as a novel susceptibility gene for chronic obstructive pulmonary disease (COPD) in Caucasians. However, this association remains controversial. Additional evidence from populations with different environments and/or genetic backgrounds, such as East Asians, would help to resolve the issue. Methods In this study, five proposed causal SNPs in SERPINE2 were genotyped in 327 COPD patients and 349 controls, all of whom belonged to the Han population sampled from Southwest China. The frequency of each SNP was compared between patients and controls, both individually and in combination. The potential relationship between these SNPs and the severity of COPD was also investigated. Results Three SNPs (rs3795877, rs6747096, and rs3795879) showed complete linkage disequilibrium (r² = 1), and their minor allele frequencies were 13.0% and 12.9% in the case and control cohorts, respectively, with no significant difference (P = 0.96). We also failed to observe any significant correlation between these SNPs and COPD severity (P = 0.67). The other two SNPs (rs7579646 and rs840088) presented a similar pattern. Moreover, four major haplotypes were observed in our sample, but none showed a significant difference between case and control groups (P > 0.1). Conclusion Our results provide no evidence that these SNPs in SERPINE2 contribute to COPD susceptibility in the Han Chinese population.


Background
Chronic obstructive pulmonary disease (COPD, MIM# 606963), one of the major sources of morbidity all over the world [1], is characterized by airflow limitation that is not fully reversible and a chronic persistent inflammatory process [2]. Tobacco smoking has been proven to be the predominant environmental factor for COPD. However, only approximately 15% of smokers develop airway obstruction [3,4]. This phenomenon, together with the familial clustering in patients with COPD [5], suggested that genetic factors might play an important role in COPD development.
Numerous studies have demonstrated that the balance between serine proteases and serine protease inhibitors (SERPINs) is important in maintaining the matrix biology of lung cells [6][7][8], and an imbalance between these two factors has been a long-standing hypothesis to explain lung destruction in emphysema during the development of COPD [7]. However, within the SERPIN family, only SERPINA1 (α1-antitrypsin) has been confirmed as a genetic risk factor for this disease so far. In addition, the mutant protease inhibitor (Pi) Z homozygote of the SERPINA1 gene, which increases individual susceptibility to COPD, is rare across worldwide populations (0.001%-4.5%), especially in Asians (< 0.004%), and accounts for only 2% of COPD patients [9]. These observations suggest that other members of this gene family may also be involved in the onset of COPD and deserve further investigation.
Serine protease inhibitor E2 (SERPINE2, NM_006216) is one member of the SERPIN family. Because of its elevated expression during murine lung development and its consistent correlations with COPD-related phenotypes in human lung tissues, SERPINE2 was proposed as a novel candidate gene for COPD [10]. Further investigation identified some SNPs in SERPINE2 that might confer individual susceptibility to COPD by influencing matrix metalloproteinase pathways [10,11]. However, these associations remain questionable, since they have not yet been verified [12]. To resolve this controversy and clarify the relationship between SERPINE2 and COPD, additional evidence from populations with different environments and/or genetic backgrounds is warranted. Moreover, all of the above studies focused only on Caucasians, so it remains unclear whether their conclusions also apply to East Asian populations, whose genetic background has been shown to differ markedly from that of Caucasians.
In the present study, the five putative causal SNPs in SERPINE2 were genotyped in a case-control sample of Han Chinese from Southwest China, and their potential correlations with COPD were evaluated. Furthermore, the relationship between SERPINE2 and COPD severity was also investigated. Our results should help to clarify the genetic contribution to this disease.

Study population
A total of 327 unrelated patients with COPD and 349 healthy smokers were recruited from the First Affiliated Hospital of Kunming Medical College (Kunming, China). All participants belonged to the Han nationality, to minimize the potential sampling bias due to population stratification. COPD patients were diagnosed based on the results of multiple examinations, including the ratio of forced expiratory volume in one second to forced vital capacity (FEV1/FVC < 70%) and FEV1 < 80% predicted, according to the Global Initiative for Chronic Obstructive Lung Disease criteria [2]. Based on pulmonary function, the patients were further classified into two subgroups, mild/moderate COPD (FEV1 > 50% pred) and severe/very severe COPD (FEV1 ≤ 50% pred), to evaluate whether SERPINE2 contributed to disease severity. The healthy smokers exhibited normal pulmonary function (FEV1/FVC > 70% and FEV1 > 80% pred) and a smoking history ≥ 10 pack-years, and COPD was ruled out by chest CT. This study was approved by our institutional ethics committee, and informed consent was obtained from all participants. Detailed information on the patients and healthy smokers is listed in Table 1.
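The spirometric thresholds above can be written as a small decision rule. The following is only a sketch in Python: the function name and the group labels are illustrative assumptions, and it deliberately leaves out the additional clinical examinations and the chest CT that the study also relied on.

```python
from typing import Optional


def classify_subject(fev1_fvc_pct: float,
                     fev1_pct_pred: float,
                     pack_years: Optional[float] = None) -> str:
    """Assign a study group from the spirometric cut-offs quoted in the text.

    fev1_fvc_pct  : FEV1/FVC ratio, in percent
    fev1_pct_pred : FEV1 as percent of the predicted value
    pack_years    : smoking history (only needed for the control definition)
    """
    # Case definition: FEV1/FVC < 70% and FEV1 < 80% predicted
    if fev1_fvc_pct < 70 and fev1_pct_pred < 80:
        # Severity split at FEV1 = 50% predicted
        if fev1_pct_pred > 50:
            return "mild/moderate COPD"
        return "severe/very severe COPD"
    # Control definition: normal spirometry and >= 10 pack-years of smoking
    if (fev1_fvc_pct > 70 and fev1_pct_pred > 80
            and pack_years is not None and pack_years >= 10):
        return "healthy smoker"
    # Anything else does not meet either definition
    return "unclassified"
```

Subjects who meet neither definition are returned as "unclassified" rather than forced into a group, which mirrors the way borderline individuals would not enter either arm of the comparison.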

Genotyping procedures
Genomic DNA was isolated from whole blood leukocytes by the conventional phenol-chloroform method.
Five SNPs in SERPINE2 (rs3795877, rs6747096, rs3795879, rs7579646 and rs840088) were genotyped by PCR amplification and direct resequencing. Since three of them (rs3795877, rs6747096 and rs3795879) showed strong linkage disequilibrium (LD) but yielded different results in previous studies [10,12], we genotyped all three rather than a single tag SNP, to investigate whether this was also the case in our cohort. PCR was performed in a 50 μL reaction mixture consisting of 250 ng genomic DNA, 5 μL 10 × PCR Buffer (20 mM Tris-HCl pH 8.3, 500 mM KCl, 15 mM MgCl2), 2 μL dNTP mixture, 1 μL of each primer (10 μM) and 1 U Taq DNA polymerase.
Reaction conditions for PCR were as follows: 95°C for 5 minutes; 35 cycles of 94°C for 1 minute, 56°C for 1 minute (the annealing temperature for each amplicon is given in Additional file 1), and 72°C for 1 minute; and a final incubation at 72°C for 5 minutes. The PCR products were separated by electrophoresis on a 1.5% agarose gel, recovered with a column kit (Watson Biotechnologies Inc., China), and resequenced directly using the BigDye™ Terminator v3.1 Cycle Sequencing kit and an ABI PRISM 3730 sequencer (Applied Biosystems Inc., USA). Known positive and negative controls were included in the sequencing process, and all samples were sequenced on both strands. The resulting sequences were analyzed with the DNAStar software package (DNASTAR Inc., USA).

Statistical analysis
Age, smoking history, and pulmonary function data are presented as mean ± SD. Hardy-Weinberg equilibrium was tested using a goodness-of-fit χ² test with one degree of freedom. The frequencies of each SNP between patients and controls, and between different disease groups (mild/moderate vs. severe/very severe), were compared by two-tailed chi-square tests. All statistical tests were performed in SPSS 13.0 (SPSS Inc., USA). Pairwise LD between each pair of SNPs was evaluated by a maximum likelihood method to infer phase for dual heterozygotes, and quantified as r².
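The two chi-square tests described above can be illustrated with a short standard-library sketch. This is not the SPSS procedure used in the study: the function names are hypothetical, the allele test is the plain Pearson statistic on a 2 × 2 table without continuity correction (an assumption), and any counts passed in are made up for illustration.

```python
import math


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def hwe_test(n_aa: int, n_ab: int, n_bb: int):
    """Goodness-of-fit test for Hardy-Weinberg equilibrium at a biallelic SNP.

    Takes the three genotype counts and returns (chi2, p). Assumes both
    alleles are present, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele a
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, chi2_sf_1df(chi2)


def allele_chi2(case_minor: int, case_major: int,
                ctrl_minor: int, ctrl_major: int):
    """Pearson chi-square test (1 df) on a 2 x 2 table of allele counts."""
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, chi2_sf_1df(chi2)
```

For example, `hwe_test(25, 50, 25)` describes a sample whose genotype counts match the Hardy-Weinberg expectation exactly, so the statistic is zero, and `allele_chi2(10, 90, 20, 80)` compares a 10% minor allele frequency in one group with 20% in the other.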
The individual haplotypes composed of the five SNPs were estimated from genotype data using the PHASE program (v.2.0) [13]. Odds ratios (OR) and 95% confidence intervals (CI) were also calculated to assess the relative disease risk. A P value < 0.05 was considered statistically significant.
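As an illustration of the odds ratio calculation, the sketch below uses the common log-based (Woolf) interval. The text does not state which interval method was used, so that choice, the function name, and the example counts are all assumptions.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio with an approximate 95% CI from a 2 x 2 table.

    a: cases carrying the allele/haplotype    b: cases not carrying it
    c: controls carrying it                   d: controls not carrying it
    All four cells must be non-zero for the log-based interval.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

With the made-up table `odds_ratio_ci(20, 80, 10, 90)` the point estimate is 2.25, but the interval still includes 1, which is the situation in which an apparent effect would not be called significant at the 0.05 level.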

Results
The genotype and allele frequencies of the five SERPINE2 SNPs in patients with COPD and in healthy smokers are listed in Table 2. All five SNPs were in Hardy-Weinberg equilibrium in both the case and control groups (P > 0.1). Three SNPs (rs3795877, rs6747096 and rs3795879) were in complete LD, which was consistent with HapMap data (CEU, CHB and JPT populations), and thus presented an identical pattern in genotype distribution. Their minor allele frequencies were 13.0% in cases and 12.9% in controls, with no significant difference (P = 0.96), and the other two SNPs (rs7579646 and rs840088) presented a similar pattern (Table 2). Since COPD was observed mainly in males, we excluded 64 women from the case cohort and 98 women from the control population and repeated the analysis in males only. The results, however, were still not significant (data not shown).
To investigate whether SERPINE2 SNPs contributed to the severity of COPD, we also compared the frequencies of these five SNPs between the mild/moderate and severe/very severe subgroups (Table 3). The three SNPs in complete LD (rs3795877, rs6747096 and rs3795879) presented a similar minor allele frequency in the two subgroups, 13.5% in the mild/moderate group and 12.4% in the severe/very severe group, with no significant difference (P = 0.67). The genotype distribution of these three SNPs also failed to yield a significant result (P = 0.86). The other two SNPs (rs7579646 and rs840088) displayed a similar result (see Table 3). In both the total cohort and the subgroup analysis, these results did not change after adjusting for multiple testing by Bonferroni correction or false discovery rate (FDR).
To further assess whether any of the SNPs influenced COPD onset, logistic regression analysis was performed in 418 subjects (210 cases and 208 controls) with full records of age, sex, smoking history and all genotyping data. In this analysis, FEV1 (% pred), which classified cases and controls, was regarded as the outcome variable, while sex and smoking history were covariates, together with the allele count (0/1/2) for each SNP genotype. Our results indicated that men were more likely to develop COPD than women, and that COPD was more prevalent in elderly persons, in agreement with a previous study [14]. However, other factors, including the genetic variants, did not influence FEV1 (% pred) (data not shown).
We also calculated the pairwise LD for the five SNPs. There was complete LD among the three SNPs rs3795877, rs6747096 and rs3795879 (r² = 1). The other two SNPs (rs7579646 and rs840088) were in low LD both with the three above-mentioned SNPs (r² = 0.478 and 0.133, respectively) and with each other (r² = 0.288). In addition to the single-SNP analysis, a combined analysis of these SNPs was also performed. There were four major haplotypes with a frequency above 3% in our samples, and their frequencies ranged from 7.1% to 54.6% in patients (Table 4). No haplotype was observed to affect the OR of COPD significantly (ORs range 0.87-1.19; P > 0.1), which was consistent with the results of the single-SNP analysis.
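The r² values reported here depend on resolving the phase of double heterozygotes. One standard way to do this is an expectation-maximization (EM) estimate of the four haplotype frequencies. The sketch below follows that general approach. It is not necessarily the exact algorithm behind the reported values, and the function name and input layout are assumptions.

```python
def ld_r2(n, tol: float = 1e-10, max_iter: int = 1000) -> float:
    """Estimate r^2 between two biallelic SNPs from a 3 x 3 genotype table.

    n[i][j] = number of subjects carrying i copies of allele A at SNP 1
    and j copies of allele B at SNP 2. Both SNPs must be polymorphic.
    """
    total = 2 * sum(sum(row) for row in n)          # number of chromosomes
    # Haplotype counts that are unambiguous from the genotypes
    k_AB = 2 * n[2][2] + n[2][1] + n[1][2]
    k_Ab = 2 * n[2][0] + n[2][1] + n[1][0]
    k_aB = 2 * n[0][2] + n[1][2] + n[0][1]
    k_ab = 2 * n[0][0] + n[0][1] + n[1][0]
    dh = n[1][1]                                    # double heterozygotes
    p_A = (k_AB + k_Ab + dh) / total
    p_B = (k_AB + k_aB + dh) / total
    # Start the EM iteration from linkage equilibrium
    f_AB, f_Ab = p_A * p_B, p_A * (1 - p_B)
    f_aB, f_ab = (1 - p_A) * p_B, (1 - p_A) * (1 - p_B)
    for _ in range(max_iter):
        cis, trans = f_AB * f_ab, f_Ab * f_aB
        # E-step: expected share of double heterozygotes that are AB/ab
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: update haplotype frequencies
        new_AB = (k_AB + w * dh) / total
        new_ab = (k_ab + w * dh) / total
        new_Ab = (k_Ab + (1 - w) * dh) / total
        new_aB = (k_aB + (1 - w) * dh) / total
        converged = abs(new_AB - f_AB) < tol
        f_AB, f_Ab, f_aB, f_ab = new_AB, new_Ab, new_aB, new_ab
        if converged:
            break
    d = f_AB - p_A * p_B                            # disequilibrium coefficient D
    return d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
```

When every subject has the same genotype at both SNPs (only the diagonal cells of the table are filled), the estimate converges to r² = 1, which corresponds to the complete LD reported for rs3795877, rs6747096 and rs3795879.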

Discussion
Since COPD is a complex disease caused by multiple genetic and environmental factors, it is essential to obtain more data from different populations to elucidate the role of SERPINE2 in COPD. In the present study, we provided new genotyping data on SERPINE2 SNPs in the Han Chinese population, but failed to find any significant association between these SNPs, at either the genotype or the haplotype level, and the disease phenotype. Our results were not consistent with those of DeMeo et al. [10], but confirmed the conclusion of Chappell et al. [12]. Interestingly, a recent paper from the DeMeo group also failed to replicate the original result for these five SNPs in another population [11]. It is also worth mentioning that several other SNPs, such as rs6734100, which showed no correlation with COPD in the original study [10], showed the most significant association in Zhu et al. [11] and deserve further investigation.
When conflicting results among different studies of this kind occur, common explanations include population stratification, genetic heterogeneity, phenotypic heterogeneity, and statistical power. Generally, genetic heterogeneity is often a factor to explain this conflict [15]. Considering that the genetic background of Caucasians is strikingly different from that of East Asians, we cannot rule out the possibility that the conflict may just reflect the different pathology of this disease in different ethnic populations.
Sometimes a small sample size may bring some bias into association studies [15]. To investigate whether our sample size was sufficient to detect genetic determinants of minor effect, we assessed the power of our sample size by using the genetic power calculator [16]. With the 2.0% disease prevalence in the Kunming population [17] and α = 0.05, for a variant with 0.1-0.4 minor allele frequency in an additive model, our sample size provided 60% power to detect a genetic relative risk (or OR) of 1.5, while for a genotypic OR of 1.75, the power increased to ≥ 80%.
In addition, it is worth noting that some results of the statistical analyses were not interpreted appropriately in DeMeo et al. [10]. Since as many as 48 SNPs were genotyped and investigated in that study, more than two SNPs (48 × 0.05 = 2.4) would be expected to yield a significant result by chance, given a random distribution and a threshold of 0.05. To reduce false positives and obtain a reasonable result, a correction such as Bonferroni correction or FDR would be crucial, and would substantially lower the significance threshold. Since the original genotyping data of DeMeo et al. [10] were not available and thus the FDR could not be calculated, Bonferroni correction was used, despite the fact that this method is somewhat conservative.
Another concern is the potential mechanism by which these SNPs in SERPINE2 could play a role in COPD susceptibility. Since these five SNPs are located in the first three introns, and over-expression of SERPINE2 was proposed to be correlated with COPD [10], it is reasonable to hypothesize that these SNPs might alter SERPINE2 expression in tissues, if they are truly correlated with COPD susceptibility. To test this hypothesis, we retrieved the gene expression data of SERPINE2 from HapMap lymphoblastoid cell lines [18] and HapMap genotype data [19], and performed a linear regression analysis on these five SNPs in four HapMap populations. The three SNPs in strong LD (rs3795877, rs6747096 and rs3795879) did not show any correlation with gene expression, except in the JPT population (r² = 0.32, P = 0.0012). Similarly, the remaining two SNPs (rs7579646 and rs840088) failed to show any correlation in the four HapMap populations, except for rs840088 in the CHB population (r² = 0.15, P = 0.027). Considering that this relationship was observed sporadically, with two SNPs in two populations, and that expression data were available only for lymphoblastoid cell lines, which represent only one tissue type, no conclusion can yet be drawn on whether these SNPs regulate the expression of SERPINE2. More studies on gene expression, especially in lung cells or lung-derived cell lines, are necessary to clarify this issue.
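The multiple-testing argument made earlier in this discussion comes down to simple arithmetic, which the following sketch spells out together with a generic Benjamini-Hochberg step-up procedure for FDR control. The function names are hypothetical, and the p-values in the usage note are invented for illustration. They are not taken from any of the cited studies.

```python
def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Expected number of tests with P < alpha when every null hypothesis is true."""
    return n_tests * alpha


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def benjamini_hochberg(pvalues, q: float = 0.05):
    """Return a list of booleans marking the p-values rejected at FDR level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k such that p_(k) <= k * q / m
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / m:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected
```

For 48 tests, `expected_false_positives(48)` gives about 2.4 and `bonferroni_threshold(48)` gives roughly 0.001, so a nominal P value just below 0.05 would not survive correction. With the invented list `[0.001, 0.04, 0.03, 0.5]`, the Benjamini-Hochberg procedure at q = 0.05 rejects only the first entry.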

Conclusion
In summary, our study is the first report to investigate the association between SERPINE2 SNPs and COPD in an East Asian population. The negative results we obtained, in combination with the available evidence in the literature, suggest that a consensus is still far from being reached and that more studies are needed.