Association of 42 SNPs with genetic risk for cervical cancer: an extensive meta-analysis

Background A large number of single nucleotide polymorphisms (SNPs) associated with cervical cancer have been identified through candidate gene association studies and genome-wide association studies (GWASs). However, some studies have yielded different results for the same SNP. To obtain a more comprehensive understanding, we performed a meta-analysis of previously published case–control studies involving the SNPs associated with cervical cancer.
Methods Electronic searches of PubMed and Embase were conducted for all publications on the association between gene polymorphisms and cervical cancer. One hundred and sixty-seven association studies were included in our research. For each SNP, three models (the allele, dominant and recessive effect models) were adopted in the meta-analysis. For each model, the summary odds ratio (OR) and 95% confidence interval (CI) were calculated. Heterogeneity between studies was evaluated by Cochran’s Q test. If the P value of the Q test was less than 0.01, a random effects model was used; otherwise, a fixed effects model was used.
Results The results of our meta-analysis showed that: (1) 8, 2 and 8 SNPs were significantly associated with cervical cancer (P < 0.01) in the allele, dominant and recessive effect models, respectively. (2) rs1048943 (CYP1A1 A4889G) showed the strongest association with cervical cancer in the allele effect model (OR = 1.83, 95% CI [1.57, 2.13]); in addition, rs1048943 (CYP1A1 A4889G) had a very strong association in the dominant and recessive effect models. (3) 15, 11 and 10 SNPs had high heterogeneity (P < 0.01) in the three models, respectively. (4) There was no publication bias for most of the SNPs according to Egger’s test (P < 0.01) and funnel plot analysis. For some SNPs, the association with cervical cancer was tested in only a few studies and, therefore, might have been subject to publication bias. More studies on these loci are required.
Conclusion Our meta-analysis provides a comprehensive evaluation of cervical cancer association studies. Electronic supplementary material The online version of this article (doi:10.1186/s12881-015-0168-z) contains supplementary material, which is available to authorized users.

HPV, nearly 90% of women with HPV infection are able to clear the virus. Thus, only a very small proportion of women with persistent HPV infection ultimately develop cervical cancer, which indicates that HPV infection is a necessary but not sufficient risk factor for the origin and development of cervical cancer. Consequently, host genetic differences in the effectiveness of the immune response may influence the risk for cervical cancer among those infected with HPV. Therefore, it is very important to identify the gene loci related to cervical cancer origin and progression. Over the past few decades, the genetic susceptibility to cervical cancer has been examined by candidate gene association and genome-wide association studies, and researchers have found that the most important SNP was located in 6q12, within the human leukocyte antigen (HLA), or MHC, genes [7,8]. The HLA-II (DRB1) gene contains many mutations, and these mutations result in changes in the amino acid sequence of HLA-II. Many studies have reported that HLA-II (DRB1) is strongly associated with cervical cancer. However, the structure of the DRB1 gene is complex, and thus, it is very difficult to analyze SNPs of DRB1 with the standard SNP gene effect models. At the same time, other genetic intervals and SNPs have been reported to be related to the pathogenesis of cervical cancer and to play an important role in this process. Therefore, our meta-analysis does not include SNPs in the HLA genes, but focuses on these other reported SNPs. Although researchers have had great success in their research on the gene mutations associated with cervical cancer, many problems remain. Some studies show conflicting results for the same SNP. For example, in studies of the relationship between TNF-α-308G > A and the pathogenesis of cervical cancer, Duarte I [9] found that this SNP is significantly associated with cervical cancer (OR = 1.8, 95% CI [1.21, 2.69]).
However, Gostout BS found that TNF-α-308G > A does not increase the incidence rate of cervical cancer (OR = 0.98, 95% CI [0.64, 1.50]) [10]. These conflicting results may be caused by small sample sizes, racial or ethnic differences, or clinical and genetic heterogeneity. Therefore, it is very important to assess whether the combined evidence shows an association between a SNP and cervical cancer. Meta-analysis is a very effective method by which the results of many studies with small sample sizes are combined. Through this method, the association of some SNPs, such as TNF-α-308G > A and TNF-α-238G > A, with cervical cancer has been demonstrated. TNF-α-308G > A can increase the susceptibility to cervical cancer, while TNF-α-238G > A can significantly decrease it [11]. However, previously published meta-analyses on SNP loci and cervical cancer each examined only one or two SNPs. To comprehensively and systematically assess the association between all of the available SNPs and cervical cancer susceptibility, we searched the PubMed and Embase databases and performed a meta-analysis of the results of the selected studies. For each SNP, three genetic models were considered: the allele, dominant and recessive effect models. We also examined the heterogeneity between studies and the existence of publication bias using Egger's test. As far as we know, this is the most detailed meta-analysis of SNPs and cervical cancer to date.

Data collection
PubMed and Embase were searched for appropriate studies using the following keywords: (polymorphism OR mutation OR single nucleotide polymorphisms OR genome-wide association study OR SNP OR GWAS) AND (cervical cancer OR cervical carcinoma). The studies to be included in the meta-analysis were selected in accordance with the following criteria: (1) the articles must have been published between January 1990 and June 2014; (2) the studies must employ a case-control design and must examine the association between SNPs and cervical cancer; (3) data on the SNP genotypes of patients and controls must be available; (4) the studies must be published as a full paper, not as a meeting abstract or review; and (5) the SNPs must not be located in the HLA genes. For each study, we extracted the following information: the gene polymorphisms, first author, date of publication, title, population and number of cases and controls. We then retained only those SNPs that had been reported in at least 2 studies. Using these criteria, 152 papers involving 42 SNPs were selected for the meta-analysis (Figure 1).

Selection of the genetic model
To comprehensively analyze the association between SNPs and cervical cancer, we adopted three genetic models: the allele effect model, the dominant effect model, and the recessive effect model. In these models, we assumed that each SNP marker locus has two alleles (A and a). A is the high-risk candidate allele, and a is the low-risk allele. The three models are described as follows: 1) Allele model: the effect of the A allele vs. the effect of the a allele; 2) Dominant model: the SNP produces a cervical cancer phenotype when either one or two copies of the A allele are present, i.e., the AA + Aa vs. aa genotypes; 3) Recessive model: the SNP produces a cervical cancer phenotype only when two copies of the A allele are present, i.e., the AA vs. Aa + aa genotypes.
All meta-analyses were performed using RevMan 5.2 software. For each model, we calculated the OR and 95% CI for each individual study. To evaluate the influence of each individual study on the overall pooled OR, we performed a sensitivity analysis by removing one article at a time.
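As an illustration of how the three genetic models translate into 2×2 tables, the following minimal sketch derives the per-study OR and a Wald-type 95% CI from genotype counts. It is not the RevMan procedure itself; the function names `odds_ratio` and `model_tables` and the example counts are hypothetical, and the sketch assumes that no cell of the table is zero.

```python
import math

def odds_ratio(a, b, c, d, z=1.96):
    """OR and Wald 95% CI for a 2x2 table.

    a, b: exposed / unexposed counts among cases
    c, d: exposed / unexposed counts among controls
    Assumes all four cells are non-zero (illustrative simplification).
    """
    or_value = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_value) - z * se_log_or)
    upper = math.exp(math.log(or_value) + z * se_log_or)
    return or_value, lower, upper

def model_tables(case, ctrl):
    """Build the 2x2 table for each genetic model.

    case and ctrl are (AA, Aa, aa) genotype counts; A is the high-risk allele.
    """
    case_AA, case_Aa, case_aa = case
    ctrl_AA, ctrl_Aa, ctrl_aa = ctrl
    return {
        # Allele model: A alleles vs. a alleles
        "allele": (2 * case_AA + case_Aa, 2 * case_aa + case_Aa,
                   2 * ctrl_AA + ctrl_Aa, 2 * ctrl_aa + ctrl_Aa),
        # Dominant model: AA + Aa vs. aa
        "dominant": (case_AA + case_Aa, case_aa,
                     ctrl_AA + ctrl_Aa, ctrl_aa),
        # Recessive model: AA vs. Aa + aa
        "recessive": (case_AA, case_Aa + case_aa,
                      ctrl_AA, ctrl_Aa + ctrl_aa),
    }

# Hypothetical genotype counts (AA, Aa, aa), for illustration only
tables = model_tables(case=(30, 50, 20), ctrl=(20, 50, 30))
for name, table in tables.items():
    print(name, odds_ratio(*table))
```

The same genotype counts feed all three models; only the grouping of genotypes into "exposed" and "unexposed" changes.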

Evaluation of heterogeneity
Cochran's Q test was used to evaluate the heterogeneity of between- and within-study variation. In fact, Cochran's Q test is simply a chi-square test [12]. The null hypothesis was that all studies were evaluating the same effect. Rejecting the null hypothesis meant that heterogeneity exists between studies. P < 0.01 was considered to be significant. Another indicator of heterogeneity is $I^2$, which measures the degree of inconsistency across studies. The formula is as follows: $I^2 = \frac{Q - (k - 1)}{Q} \times 100\%$ (where $k$ is the number of studies). When the value of $I^2$ is more than 25%, 50% or 75%, low-, mid- or high-grade heterogeneity is present, respectively [13][14][15][16].
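The two heterogeneity measures can be sketched as follows. This is a minimal illustration under the assumption of inverse-variance weights on the log OR scale, which may differ from the weighting RevMan applies; the function names and inputs are hypothetical. The P value of Q would come from a chi-square distribution with k - 1 degrees of freedom (for example via `scipy.stats.chi2.sf`), which is omitted here to keep the sketch dependency-free.

```python
def cochran_q(log_ors, ses):
    """Cochran's Q with inverse-variance weights (assumed weighting scheme).

    log_ors: per-study log odds ratios
    ses: per-study standard errors of the log odds ratios
    Returns (Q, degrees of freedom).
    """
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))
    return q, len(log_ors) - 1

def i_squared(q, k):
    """I^2 = (Q - (k - 1)) / Q * 100%, truncated at 0 when Q < k - 1."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - (k - 1)) / q) * 100.0

# Hypothetical two-study example
q, df = cochran_q([0.0, 2.0], [0.5, 0.5])
print(q, df, i_squared(q, k=2))
```

Truncating negative values at zero is the usual convention, since Q can fall below its degrees of freedom by chance.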

Evaluation of the statistical association between the identified SNPs and cervical cancer
In this meta-analysis, Cochran's Q test was used to evaluate the heterogeneity between studies. If the Q statistic was not significant, we considered all of the differences between studies to be caused by sampling error and selected the fixed effects model for the meta-analysis. In contrast, if the P value was significant (P < 0.01), meaning that heterogeneity exists between studies, we chose the random effects model.
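The model-selection rule above can be sketched in code. The example below is an assumption-laden illustration, not the RevMan implementation: it pools log ORs with inverse-variance weights for the fixed effects case and uses the DerSimonian-Laird estimate of the between-study variance for the random effects case. The names `choose_random_effects` and `pool_or` are hypothetical, and the Q-test P value is taken as an input.

```python
import math

def choose_random_effects(q_p_value, alpha=0.01):
    """Rule from the text: random effects if the Q test is significant."""
    return q_p_value < alpha

def pool_or(log_ors, ses, random_effects, z=1.96):
    """Pooled OR with 95% CI.

    Fixed effects: inverse-variance weights (assumed).
    Random effects: DerSimonian-Laird tau^2 added to each study variance (assumed).
    """
    weights = [1.0 / se ** 2 for se in ses]
    if random_effects:
        fixed = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
        q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_ors))
        c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
        tau2 = max(0.0, (q - (len(log_ors) - 1)) / c) if c > 0 else 0.0
        weights = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    se_pooled = math.sqrt(1.0 / sum(weights))
    return (math.exp(pooled),
            math.exp(pooled - z * se_pooled),
            math.exp(pooled + z * se_pooled))

# Hypothetical inputs: two studies and an assumed Q-test P value of 0.005
use_random = choose_random_effects(q_p_value=0.005)
print(pool_or([0.0, 2.0], [0.5, 0.5], random_effects=use_random))
```

With heterogeneous inputs, the random effects interval is wider than the fixed effects interval because the between-study variance is added to every study's variance.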

Evaluation of publication bias
Funnel plots were used to visually assess publication bias. The horizontal axis of a funnel plot corresponds to the study effects. If the variable was continuous, the effects are shown as the original values; otherwise, the effects are shown as log values. The vertical axis corresponds to the sample size, standard error or precision. The smaller the sample, the more scattered the distribution; the larger the sample, the more concentrated the distribution. If there is no bias, the funnel plot is symmetrical. In contrast, an asymmetrical plot suggests that publication bias exists. In addition, Egger's test was used to quantitatively assess the symmetry of the funnel plots [17,18]. Egger's test cannot be used in a meta-analysis when the number of studies is less than 2. Therefore, we only used Egger's test for SNPs reported in at least 2 studies. Egger's test was carried out using Stata 12.0 software.
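The idea behind Egger's test can be sketched as a simple linear regression of the standardized effect (log OR divided by its standard error) on precision (the reciprocal of the standard error); an intercept far from zero suggests funnel plot asymmetry. The code below is a minimal illustration with hypothetical names and data, not the Stata routine. Note that this sketch needs at least three studies, because the standard error of the intercept has k - 2 residual degrees of freedom; the P value would come from a t distribution with k - 2 degrees of freedom and is not computed here.

```python
import math

def egger_regression(log_ors, ses):
    """Egger's regression: (log OR / SE) regressed on (1 / SE).

    Returns (intercept, standard error of intercept, t statistic).
    Requires at least 3 studies in this sketch.
    """
    n = len(log_ors)
    if n < 3:
        raise ValueError("this sketch needs at least 3 studies")
    x = [1.0 / se for se in ses]                    # precision
    y = [e / se for e, se in zip(log_ors, ses)]     # standardized effect
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_intercept = math.sqrt(sse / (n - 2) * (1.0 / n + x_bar ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat

# Hypothetical data: smaller studies (larger SE) report larger effects
print(egger_regression([0.2, 0.5, 0.9], [0.1, 0.2, 0.4]))
```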

Results
In our search for eligible studies and loci, we entered the aforementioned keywords into PubMed and Embase and obtained 2552 studies. After screening by the criteria described in the Data collection section, 152 of these 2552 studies, involving 42 SNPs, were included in our meta-analysis (Additional file 1: Table S1 and Additional file 2). The Cohen's Kappa value was 0.79 (P < 0.05). Each of the 42 SNPs was reported in at least two studies. The number of studies for each locus was also counted. Fourteen SNPs were reported more than five times, and five SNPs were reported more than 10 times. The five SNPs genotypes in the cases and controls were extracted for subsequent analysis.

Meta-analysis results for the dominant effect model
Based on the dominant model (AA + Aa vs. aa genotype), we tested the heterogeneity between studies. Heterogeneity was found for eleven SNPs (P < 0.01). For these SNPs, the random effects model was used in the meta-analysis. For the others, which did not show heterogeneity, the fixed effects model was used. Table 2 lists all of the SNPs analyzed under the dominant genetic model; we found a significant association between two of these SNPs and cervical cancer. These two SNPs had no heterogeneity, and the fixed effects model was adopted. rs1048943
Publication bias was tested using funnel plots and Egger's test. We found that CCND1 (rs603965) and CD28 (rs3116496) had publication bias. This bias may have arisen because these SNPs were analyzed in only a few studies or because of differences in the selection of the cases and controls.

Meta-analysis results for the recessive effect model
Based on the recessive model (AA vs. Aa + aa), ten SNPs showed heterogeneity (Q test P < 0.01). The random effects model was used for these ten SNPs. The fixed effects model was used for the remaining SNPs.

Meta-analysis of special phenotype
During the data collection process, we noticed that some publications provided additional testing, such as genotyping for GSTM1 (positive or negative) and CYP2E1 (c1 or c2). These SNPs were also included in the meta-analysis. The results are shown in Table 4. SNPs with heterogeneity were tested using the random effects model. The fixed effects model was used for the remaining SNPs. As shown in Table 4, no SNP was significantly associated with cervical cancer (P < 0.01).
For the SNPs reported in at least 4 studies, we performed a sensitivity analysis under each of the three models by removing one article at a time. We found that only CTLA-318 rs5742909 and XRCC1 codon 194 in the dominant genetic model and IFNr rs62559044 4 in the allele model affected the overall pooled OR. The data can be seen in Additional file 4.
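A leave-one-out sensitivity analysis of this kind can be sketched as follows. The example is illustrative only: it assumes inverse-variance fixed effects pooling on the log OR scale, and the names `fixed_effect_or` and `leave_one_out` as well as the input values are hypothetical.

```python
import math

def fixed_effect_or(log_ors, ses):
    """Inverse-variance fixed effects pooled OR (assumed pooling method)."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    return math.exp(pooled)

def leave_one_out(log_ors, ses):
    """Pooled OR recomputed with each study removed in turn.

    log_ors and ses must be lists of equal length.
    """
    results = []
    for i in range(len(log_ors)):
        rest_effects = log_ors[:i] + log_ors[i + 1:]
        rest_ses = ses[:i] + ses[i + 1:]
        results.append(fixed_effect_or(rest_effects, rest_ses))
    return results

# Hypothetical example: the third study drives the pooled estimate
effects = [0.0, 0.0, math.log(4)]
errors = [1.0, 1.0, 1.0]
print(fixed_effect_or(effects, errors))
print(leave_one_out(effects, errors))
```

A study is flagged as influential when removing it moves the pooled OR, or its significance, noticeably relative to the full analysis.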

Meta-analysis results for SNP subgroups
In our meta-analysis, the SNPs that showed heterogeneity were subjected to subgroup analysis to explore the causes of their heterogeneity. Most SNPs were reported by only a few individual studies and were not suitable for classification into subgroups; thus, we selected only 5 SNPs for subgroup analysis (Table 5). For P53 codon 72 Arg/Pro, the 44 studies were divided into two subgroups: the Asian group (17 studies) and the Caucasian group (11 studies). We selected the random effects model if the SNP had heterogeneity; otherwise, we selected the fixed effects model. We found that this SNP was significantly associated with cervical cancer in the allele effect model and that the Arg allele increased the susceptibility to cervical cancer in the Caucasian group but did not show a significant association in the Asian group. The remaining 4 SNPs did not show a significant association with cervical cancer in either group under any of the three effect models (all P values were larger than 0.01). In addition, we found that some SNPs had heterogeneity when considering the total population but did not have heterogeneity when divided into subgroups. This phenomenon indicates that population size is one reason for heterogeneity. However, if the SNPs