A comprehensive evaluation of the role of genetic variation in follicular lymphoma survival

Background Survival in follicular lymphoma (FL) is highly variable, even within prognostic groups defined by tumor grade and the Follicular Lymphoma International Prognostic Index. Studies suggest that germline single nucleotide polymorphisms (SNPs) may hold prognostic information but further investigation is needed. Methods We explored the association between SNPs and FL outcome using two approaches: 1) Two independent genome-wide association studies (GWAS) of ~300.000 SNPs followed by a meta-analysis encompassing 586 FL patients diagnosed in Denmark/Sweden 1999–2002 and in the United States 2001–2006; and 2) Investigation of 22 candidate-gene variants previously associated with FL outcome in the Danish/Swedish cohort (N = 373). We estimated time to lymphoma-specific death (approach 1 and 2) and lymphoma progression (approach 2) with hazard ratios (HR) and 95% confidence intervals (CI) in a multivariable Cox regression model. Results In the GWAS meta-analysis, using a random effects model, no variants were associated with lymphoma-specific death at a genome-wide significant level (p < 5.0 ×10−8). The strongest association was observed for tightly linked SNPs on 17q24 near the ABCA10 and ABCA6 genes (rs10491178 HRrandom = 3.17, 95% CI 2.09-4.79, prandom = 5.24 ×10−8). The ABCA10 and ABCA6 genes belong to a family of genes encoding for ABC transporter proteins, implicated in multidrug resistance. In line with a previous study, rs2466571 in CD46 (HR = 0.73, 95% CI 0.58-0.91, p = 0.006) showed nominal association with lymphoma progression, as did two highly linked SNPs in IL8 (rs4073 HR = 0.78, 95% CI 0.62-0.97, p = 0.02; rs2227307 HR = 0.75, 95% CI 0.60-0.94, p = 0.01) previously associated with overall survival. Conclusions The results suggest a possible role for multidrug resistance in FL survival and add to the evidence that genetic variation in CD46 and IL8 may have prognostic implications in FL. Our findings need further confirmation in other independent populations or in a larger multicenter GWAS. Electronic supplementary material The online version of this article (doi:10.1186/s12881-014-0113-6) contains supplementary material, which is available to authorized users.

To gain additional knowledge of the role of genetic variation in FL outcome, we explored the association of~300.000 SNPs with lymphoma-specific death in two independent GWAS in Sweden/Denmark [25] and the USA [22], followed by a full meta-analysis encompassing a total of 586 FL cases. We also investigated genetic variants previously reported to be associated with FL survival as well as recently established FL risk-associated SNPs for the association with lymphoma-specific death and progression in the Swedish/Danish (N = 373) and Swedish datasets (N = 231), respectively.

Methods
This study was approved by the Ethical Review Board in each country (Sweden: The Regional Ethical Review Board in Stockholm; Denmark: Scientific Ethics Committee for the Capital Region of Copenhagen; California, USA: UCSF Human Research Protection Program, Committee on Human Subjects Research) and all study participants gave informed consent.

SCALE study participants
We included cases with FL in the Scandinavian Lymphoma Etiology (SCALE) study, described in detail elsewhere [27]. Briefly, SCALE is a population-based case-control study of the etiology of malignant lymphomas conducted in Sweden and Denmark 1999 to 2002. Overall, 3,740 incident malignant lymphoma patients 18-74 years of age were included. Through a rapid case ascertainment network, the patients were identified shortly after diagnosis. Lymphoma subtypes were reviewed and reclassified according to the WHO classification [28]. Study participants were restricted to individuals with sufficient knowledge of the Danish or Swedish language to answer questions in a telephone interview and without a history of organ transplantation, HIV infection, or other hematopoietic malignancy. In the study overall, 85% of eligible NHL patients were included. In Sweden, a detailed registration of subtypes of all eligible patients permitted evaluation of participation rates by NHL subtype, which was 94% for FL. Early death was an uncommon reason for non-participation among the FL patients (1%). Four-hundred ninety-nine (85%) of all interviewed FL patients also gave blood. The median time from diagnosis to venipuncture was three months (interquartile range: one to five months). In SCALE overall, patients who gave blood were similar to all participating patients with regard to age, sex and educational level.

Clinical data and follow-up
In Sweden, information about established prognostic factors (Ann Arbor stage, performance status, number of involved nodal areas, hemoglobin and lactate dehydrogenase levels), treatments and signs of progression, was collected from medical records. Lymphoma progression was defined as the time to progression, date of start of second-line treatment or lymphoma-specific death, if no second-line treatment was given (data only available in Sweden). Dates and causes of death were obtained through the Cause of Death registry [29], with complete follow-up through February 2, 2012. In Denmark, clinical parameters (Ann Arbor stage, performance status, treatment) and complete follow-up of survival through December 17, 2009, were obtained from the Danish national lymphoma database (LYFO) [30]. Lymphomaspecific death was defined as having lymphoma as the main underlying cause of death. Based on the number of FLIPI risk factors, the Swedish cases were classified as having low, intermediate or high risk of death (0-1, 2 or ≥3 risk factors, respectively). The Danish patients were classified according to a modified FLIPI based on age (≥60) and stage (I-II vs. III-IV) only, as low risk if they had none of these factors, intermediate if they had one, and high risk if they had two risk factors.

Genotyping
In 400 FL cases for whom a sufficient amount of DNA was available, genotyping of 317,503 SNPs was conducted at the Genome Institute of Singapore within a GWAS using the Illumina HumanHap 300 (version 1.0) array [25]. The resulting dataset was filtered on the basis of genotyping call rates (≥95%), sample completion rate (≥90%), minor allele frequency (≥0.03) and non-deviation from the Hardy-Weinberg equilibrium (p < 10 −6 ). SNPs on the X or Y chromosomes or with cluster plot problems were excluded. Study participants with gender discrepancies and/or labeling errors were removed, as were samples with evidence of cryptic family relationship (identified using the genome command in PLINK [31]) and population outliers on the basis of their values of the first three principal components (identified using EIGENSTRAT software and unlinked SNPs only [32]). Complete genotyping was obtained for 298,702 SNPs in 373 (94%) of the genotyped cases.

The University of California San Francisco (UCSF) study
The UCSF study population consisted of 213 FL cases included in a population-based case-control study of NHL (>2,055 cases, 2,081 controls) conducted in the San Francisco Bay area of incident NHL at ages 20-84 years 2001 to 2006. The cases were identified through rapid case ascertainment methods by the Cancer Prevention Institute of California with case reporting supplemented by Surveillance, Epidemiology and End Results data [33]. Overall, 69% of eligible cases participated in the study. Diagnostic material was re-reviewed and classified according to the WHO lymphoma classification by the study pathologist [28]. Blood and/or buccal specimens were collected from 87% of the cases within median 27 days from diagnosis. Genotyping of 329,294 SNPs was performed using the Illumina HumanCNV370-Duo BeadChip (Illumina, Inc., San Diego, CA). SNPs were excluded for low call rates (<90%), low minor allele frequency (<0.03), location on the sex chromosomes or cluster plot problems. Non-European samples according to MDS plot inspection were removed. Vital status and cause of death were updated through August 2012 through record linkage with the Greater Bay Area Cancer registry. Patients were censored at date of death or date of last known contact.

Selection of SNPs associated with FL outcome and established FL risk SNPs
Published studies of germline polymorphisms and FL outcome were found by searching PubMed and Web of Science (through June 30, 2013). We selected SNPs statistically significantly (p < 0.05) associated with overall or lymphoma-specific survival or with time to progression/ treatment in at least one study and investigated the association of these with lymphoma-specific death and progression in SCALE and the Swedish cohort, respectively. Fifteen studies and 30 SNPs representing 29 separate loci were identified (Table 1). Genotype data on 22 of these markers (21 loci) were available through genotyping (N = 9) or imputation (N = 13). We used 1000 genomes multi-ethnic reference panel and Impute 2 for imputation (http://mathgen.stats.ox.ac.uk/impute/impute_v2. html#download_reference_data). A strict threshold was set to genotypes with probabilities >0.9, SNPs with information scores >0.8 and call rates >0.9. Six SNPs strongly associated with FL risk in recent GWAS [22][23][24][25][26], of which one (rs6457327 in the HLA class II region) also has been associated with FL prognosis in two studies [4,18], were also selected for survival analysis.

Genome-wide association studies
In SCALE and UCSF separately, we estimated the association of SNPs with lymphoma-specific death using the Cox proportional hazards model. Time was measured from diagnosis to lymphoma-specific death or end of follow-up. Deaths from other causes than lymphoma were censored. Patients without an event were censored at the time of last known follow-up. The SNPs were coded as 0, 1 or 2 based on the number of minor alleles, and treated as continuous variables in the model. Adjustment was made for age at diagnosis, sex and population stratification (using the first three dimensions calculated by using EIGENSTRAT [32] in SCALE and multidimensional scaling [31] in UCSF; in both cohorts unlinked SNPs only were used for these calculations). Data on FLIPI and first-line rituximab was available in SCALE and the Swedish cohort, respectively, but not in UCSF, and hence these covariates were not included in the model. A full meta-analysis of the GWAS results was performed using the DerSimonian-Laird random effects method [34]. PLINK v 1.07 (http://pngu.mgh. harvard.edu/purcell/plink/) [31] and R [35] were used for these analyses. In complementary analyses, top variants in the meta-analysis were also adjusted for FLIPI risk groups (in SCALE) and first-line rituximab (in the Swedish cohort). The top locus in the meta-analysis was further explored in a regional association plot, combining the p-values of association in the meta-analysis, LD data from 1000 Genomes European population, gene information from the UCSC browser, and estimated recombination rates (http://csg.sph.umich.edu/locuszoom/) [36]. The potential functionality of the top SNPs and SNPs in strong LD (r 2 ≥ 0.8) with these in the 1000 Genomes European population were explored using RegulomeDB, which integrates multiple types of functional data generated by ENCODE and other sources (http://RegulomeDB.org/) [37].

Investigation of SNPs with reported prognostic impact in FL and established FL risk SNPs
We investigated the association of the selected candidate SNPs and established FL risk SNPs with lymphomaspecific death in SCALE and lymphoma progression in SCALE Sweden using the same Cox model as in the GWAS. The proportional hazard assumption was tested by plotting Schoenfeld residuals against follow-up time; no violations were found. In complementary analyses, associated variants were also adjusted for FLIPI risk groups and first-line rituximab. Also, potential immortal time bias was assessed in the Swedish cohort by starting follow-up at date of venipuncture instead of diagnosis. We further applied the multi-SNP modeling of risk alleles performed in three previous studies (rs5361, rs3799488, rs1799864 and rs1800796 [3]; rs4073, rs2069762, rs3212227 and rs454078 [7]; rs1801131, rs1127717 and rs719235 [15]) on the SCALE dataset. The SNPs were then coded as 0 or 1, based on the presence of the risk (protective/deleterious) allele [3,7,15]. The impact of having one, two or three or more as compared to no risk alleles on lymphoma-specific death was assessed in a Cox model, adjusting for age and sex. Multiple testing was adjusted for with the Bonferroni  method [38]. The analyses were done in SAS 9.2. Power calculations were performed in R (using survSNP) [35].

Patient characteristics
The Danish-Swedish FL cohort was followed for a median of 8.9 years from diagnosis (range 4.3 months to 12.1 years). Median age at diagnosis was 57 years and 65% had Ann Arbor stage III or IV disease (Table 2). One-hundred thirty-seven patients (37%) died of which 88 deaths (64%) were classified as lymphoma-specific.
The UCSF cohort was followed for a median of 7.5 years (range 6.9 months to 10.3 years). Twenty-six percent (N = 56) died, of which 43% (N = 24) were classified as lymphoma-specific deaths. In Swedish patients, progression occurred in 67% (N = 155), and rituximab was used in first-line treatment in 10% and overall in 47%. Overall survival differed as expected between risk groups defined by FLIPI (low, intermediate, high; p log-rank < 0.05) in the total SCALE study population and in the Swedish and Danish populations separately (data not shown).

Genome-wide association studies and meta-analysis
In SCALE and UCSF separately, no variants were associated with lymphoma-specific death at a genome-wide significant level (p ≤ 5.0 × 10 −8 ). Thirteen variants in SCALE (on chromosomes 1, 4, 8 and 17; Additional file 1: Figure S1a) and five variants in UCSF (on chromosomes 1, 2, 6, 9 and 20; Additional file 1: Figure S1b) were associated with lymphoma-specific death with a p ≤ 10 −6 , which was more than would be expected by chance (Additional file 1: Figure S2a and S2b).
Among the top 48 SNPs in the UCSF GWAS with p ≤ 10 −5 seven were not genotyped in SCALE. We were able to impute six of these with high confidence (information score >0.8). The meta-analysis of these SNPs gave no additional suggestive associations (data not shown).
There was no correlation between rs10491178 and performance status, lactate dehydrogenase level (Swedish cohort, N = 231) or FLIPI risk groups (SCALE cohort, N = 373; Fisher's Exact test p ≥ 0.15; data not shown). For rs10491178, rs3131729, rs11932201 and rs2250066, additional adjustment for FLIPI and first-line rituximab in SCALE and the Swedish cohort, respectively, did not alter the results meaningfully (data not shown). The association of the top SNP at 17q24 with overall survival (SCALE) was weaker compared with lymphoma-specific death (rs10491178 HR = 2.00, 95% CI 1.35-2.97, p = 5.68 × 10 −4 ), and there was no association with lymphoma progression (Swedish cohort; Additional file 1: Table S2).
Combining data in the RegulomeDB suggested a putative functional role for six of the ten top SNPs, of which rs10491178 (chr 17), rs11932201 (chr 4) and rs2250066 (chr 19) had RegulomeDB scores of 5, indicating that they may affect transcription factor binding or are located in a DNAse hypersensitivity site. There was no data for rs3131729 (chr 1). The lowest RegulomeDB score (i.e. strongest evidence of being in a regulatory site) among our top SNPs and highly linked variants (r 2 ≥ 0.8) was found at the SNP rs113464685 in ABCA10 (score 3b), in LD with rs10491178 (data not shown). After imputing the genotypes for rs113464685 in SCALE (as described in Methods for the candidate gene SNPs), we ran the Cox regression for this SNP and found estimates for lymphoma-specific death virtually identical to those of the other SNPs on 17q24 (HR = 3.10, 95% CI 1.98-4.89, p = 1.12 × 10 −6 ).

Discussion
In this GWAS meta-analysis of common genetic variation in FL prognosis in a total of 586 patients, we did not identify any variants associated with lymphoma-specific death at a genome-wide significant level (p < 5.0 ×10 −8 ). The strongest association was observed for tightly linked SNPs located on 17q24 near the ABCA10 and ABCA6 genes (rs10491178 HR random = 3.17, 95% CI 2.09-4.79, p random = 5.24 ×10 −8 ). Investigating previously reported candidate genes, we found further support of a role for rs2466571 in CD46 in FL progression. Two SNPs in IL8 (rs4073 and rs2227307) previously linked with overall survival in FL, where here associated with FL progression. We also observed an association with MTHFR and FL progression but the direction was opposite of a previous report on overall survival in FL.

Genome-wide association study and meta-analysis
The increased rate of lymphoma-specific death for carriers of the A allele of rs10491178 in ABCA10 on 17p24 was consistent in the SCALE and UCSF cohorts, strengthening the notion of a possible causal association Table 4 Relative risk A of lymphoma-specific death and lymphoma progression for selected SNPs previously associated with any follicular lymphoma outcome in at least one previous study Swedish cases only (n = 231) for time to progression. C Imputed SNP. D rs6457327 has previously been associated with both FL risk and prognosis. E Opposite direction of association to previous study [15]. The minor allele (A1) was investigated for association except for rs2466571 (CD46), where the major allele (A2) was used as reference for easier comparison with previous study result.
with this region. The association was independent of established prognostic risk factors (lactate dehydrogenase levels, performance status and FLIPI). ABCA10 is clustered among four other members of the ATP binding cassette (ABC) 1 family on 17q24, including ABCA6 harboring another five linked associated SNP markers. The ABC transporters -a family of proteins responsible for the movement of a wide variety of xenobiotics, lipids and metabolic products across the cell membranes -are implicated in multidrug resistance [39]. They are overexpressed in several tumor types [39], including FL [40]. The exact function of ABCA10 and ABCA6 remains to be elucidated [41], but the ABCA3 transporter, similar in terms of amino acid sequences and structural organization to ABCA10 and ABCA6 [41], has been shown to impede the efficacy of rituximab in aggressive B-cell lymphomas, and vincristine, anthracyclines and etopside in acute myeloid leukemia when expressed in high levels [42]. Interestingly, rs10491178 (C > T) encodes a premature stop codon (Arg (CGA) -> Stop (TGA)) in the ABCA10 transcript (Refseq NM_080282; Ensembl ABCA10-001) [43].
We also looked up SNPs in high LD (r 2 > 0.8) with rs10491178 for potential regulatory functions in Regulo-meDB, and identified another SNP rs113464685 in ABCA10, in strong LD (r 2 = 1.00) with the top SNPs at 17q24, that lies in the binding site of the transcription factor paired box 5 (PAX5). PAX5 has a pivotal regulatory function in B-cell development and its aberrant expression is correlated with aggressive subsets of B-cell NHL [44]. This SNP is thus an alternative candidate to explain the observed association with the region.
The suggestive associations (p ≤ 10 −6 ) of rs3131729 in the Dab, reelin signal transducer, homolog 1 (DAB1) gene on 1p32-p31, the intergenic SNP rs11932201 on chromosome 4, and rs2250066 in the kallikrein (KLK) 11 gene on 19q13, were also consistent in the SCALE and UCSF studies. The role of the DAB1 protein in the developing nervous system is well studied but for other tissues knowledge is scarce [45]. rs11932201 is flanked by two genes (long intergenic non-protein coding RNA 1085 (LINC01085) and CPEB2 antisense RNA 1 (CPEB2-AS1)) of which little is known. The KLK locus on 19q13 harbors genes encoding a family of serine proteases associated with cancer risk and prognosis in several studies [46].

Investigation of SNPs with reported prognostic impact in FL and established FL risk SNPs
Upon investigating SNPs linked to FL survival in previous studies, we observed a shorter time to lymphoma progression for C compared to A allele carriers at rs2466571 in CD46 on 1q32 (HR = 1.37, p trend = 0.006), at a nominally significant level. This is consistent with a recent study of 107 FL patients investigating event-free survival in FL (HR = 1.49, p trend < 0.05) [8]. CD46 is a complement inhibitor, protecting the cell against complement-mediated lysis [47]. High expression levels of membrane-bound complement regulatory proteins such as CD46 in tumors has been shown to suppress anti-tumor T-cell responses, and may inhibit anti-tumor therapeutic activity of monoclonal antibodies, including rituximab in B-cell NHL [47].
In our study, rs1801131 in MTHFR was associated with time to progression and lymphoma-specific death, and rs2069762 in the promoter of IL2 with lymphomaspecific death. However, these results are conflicting with previous reports of overall survival in FL and are therefore difficult to interpret [7,15]. For rs1801274, located in FCGR2A on 1q23, one of the two most studied variants [5][6][7]10,13,14,16,17], we observed no associations with lymphoma progression or lymphoma-specific death. This is in line with the results of most previous studies, irrespective of how outcome was defined (progression-free [6,10,14,16,17], event-free [5,6] or overall survival [5,7]).
Several SNPs, including rs6457327, rs10484561 and rs2647012 in the HLA class I and II regions on 6p21, are convincingly associated with FL risk [22,24,25], suggesting an important role of this genetic region in the development of FL, and thereby possibly also in FL progression. Indeed, rs6457327 has been associated with FL transformation [4,18] and overall survival [4] in two previous studies (Table 1). However, we did not observe any clear associations for rs6457327, rs10484561 and rs2647012 with FL progression or lymphoma-specific death in our cohort.

Strengths and weaknesses
The strengths of the present analysis include the population-based design of the two cohorts, the large number of patients compared to previous studies, and the comprehensive evaluation of the impact of genetic variation in FL outcome. Population stratification was adjusted for in the GWAS and we observed little or no evidence of inflation of the statistics (lambda GC SCALE = lambda GC UCSF = 1.00) [49]. In the analysis of candidate gene variants and lymphoma progression or death, adjustment for established risk factors including age, sex, FLIPI categories and first-line rituximab resulted in marginal changes of the point estimates only.
Although candidate gene studies have obvious weaknesses compared with genome-wide studies, a strength include a higher prior probability of true findings, since they are based on a biological understanding of cancer survival pathways [50]. Still, many candidate gene findings cannot be replicated, suggesting some false-positive results in the existing literature [50]. Most of the previous studies were small (N < 150) and the number of SNPs tested were sometimes large (up to 6679 variants), resulting in a considerable risk of chance findings. However, the present study was limited to detect moderate to strong effects (HR >1.5-2.0) for the investigated SNPs with acceptable probability (≥0.80). Hence, chance findings cannot be excluded.
In the study of genetic variation and FL prognosis, different outcome measures have been used, including overall and lymphoma-specific survival and event-free survival. Progression-free survival, defined as the time from study entry to lymphoma progression or all-cause death, is generally recognized as the most valid measure in intervention studies [51]. We argue that lymphomaspecific survival/death may be superior to overall survival (but not to progression-free survival) if the study aims to test an association with lymphoma progression leading to death. In view of the high median age at diagnosis of FL and the often indolent clinical course [1], a relatively large proportion of the patients are expected to die of non-lymphoma related causes, which could dilute associations with lymphoma progression and death if overall survival is used instead. On the other hand, with the use of lymphoma-specific survival some deaths due in part to progression may have been missed potentially leading to a lower specificity. We did not observe similar results for three out of four top genetic variants and lymphoma progression, as with lymphoma-specific death, although this analysis could only be performed in a subset of the patients. Explanations for such a difference could lie in the definitions of these outcomes. While time to progression and progression-free survival primarily reflect the first part of the follow-up period, including mainly time to first-line or second-line treatment, time to lymphoma-specific death reflects a larger part of the follow-up and thus, to a higher degree, treatment response or lack of such. In the future, attempts to validate the current findings, and when comparing results between investigations, the use of similar definitions of outcome will be important for the interpretation.

Conclusions
In the present GWAS meta-analysis, there was suggestive evidence that inherited polymorphisms in the ABCA10 or ABCA6 genes may be associated with risk of lymphomaspecific death. In candidate-gene analysis, an association with lymphoma progression was observed for SNPs in CD46 and IL8, previously linked with lymphoma progression (CD46) and overall survival (IL8). These findings need further confirmation in other independent populations or in a larger multicenter GWAS.

Additional file
Additional file 1: Table S1. Top SNPs associated with lymphoma-specific death (prandom<5.0 x10-5) in the meta-analysis of FL patients in SCALE and UCSF (N=586). The minor allele (A1) was investigated for association. A2=major allele, MAF= minor allele frequency, NA= not applicable. Table S2. Relative risk of lymphoma-specific and all-cause death in SCALE (N=373), and lymphoma progression in SCALE Sweden (N=231) for the top SNPs in the pooled analysis of SCALE and UCSF. The minor allele was investigated for association. Table S3. Relative risk of lymphoma-specific death and lymphoma progression for SNPs previously associated with follicular lymphoma risk. Figure S1. Manhattan plot of the -log10(p-values) for the association with lymphoma-specific death in SCALE (A) and UCSF (B), by chromosome and within chromosome location. The top ten SNPs in the meta-analysis are highlighted (light green). The red line marks -log10(5.0 x10-8). Figure S2. QQ plot of the observed versus the expected -log(p-values) for the association with lymphoma-specific death in SCALE (A) and UCSF (B). Figure S3. Manhattan plot of the -log10(p-values) for the association with lymphoma-specific death in the meta-analysis of SCALE and UCSF. The top ten SNPs are highlighted (light green). The red line marks -log10(5.0 x10-8). Figure S4. QQ plot of the observed versus the expected -log(p-values) for the association with lymphoma-specific death in the meta-analysis of SCALE and UCSF. Figure S5. Regional association plot of the top locus in the combined analysis on 17q, displaying the p-values of association in SCALE, extent of LD with rs10491178 as reference, relation of SNPs and genes, and estimated recombination rates at each position. Figure S6. Estimations of the power to detect associations between selected candidate SNPs and lymphoma-specific death and lymphoma progression, respectively, for different effect sizes and minor allele frequencies (MAF).