The candidate genes TAF5L, TCF7, PDCD1, IL6 and ICAM1 cannot be excluded from having effects in type 1 diabetes

Background: As genes associated with immune-mediated diseases have an increased prior probability of being associated with other immune-mediated diseases, we tested three such genes, IL23R, IRF5 and CD40, for an association with type 1 diabetes. In addition, we tested seven genes, TAF5L, PDCD1, TCF7, IL12B, IL6, ICAM1 and TBX21, with published marginal or inconsistent evidence of an association with type 1 diabetes.
Methods: We genotyped reported polymorphisms of the ten genes, nonsynonymous SNPs (nsSNPs) and, for the IL12B and IL6 regions, tag SNPs in up to 7,888 case, 8,858 control and 3,142 parent-child trio samples. In addition, we analysed data from the Wellcome Trust Case Control Consortium genome-wide association study to determine whether there was any further evidence of an association in each gene region.
Results: We found some evidence of associations between type 1 diabetes and TAF5L, PDCD1, TCF7 and IL6 (ORs = 1.05 – 1.13; P = 0.0291 – 4.16 × 10⁻⁴). No evidence of an association was obtained for IL12B, IRF5, IL23R, ICAM1, TBX21 and CD40, although there was some evidence of an association (OR = 1.10; P = 0.0257) from the genome-wide association study for the ICAM1 region.
Conclusion: We failed to exclude the possibility of some effect in type 1 diabetes for TAF5L, PDCD1, TCF7, IL6 and ICAM1. Additional studies, of these and other candidate genes, employing much larger sample sizes and analysis of additional polymorphisms in each gene and its flanking region will be required to ascertain their contributions to type 1 diabetes susceptibility.


Background
Type 1 diabetes is a chronic autoimmune disease with a complex pathogenesis involving multiple genetic and environmental factors. Before the advent of genome-wide association (GWA) studies, disease loci were primarily sought through the testing of candidate genes, usually selected on the basis of limited prior information about the function of the gene and the pathogenic mechanisms of disease. The candidate gene approach has been successful in finding disease loci, but as only relatively small numbers of genes have been studied, few true positive associations have been found, despite numerous studies and enormous effort [1]. Only five type 1 diabetes loci with compelling evidence had been identified before the advent of GWA studies: the HLA class II genes on chromosome 6p21 [2]; the insulin gene (INS) on 11p15 [3]; CTLA4 on 2q33 [4]; PTPN22 on 1q13 [5,6]; and IL2RA/CD25 on 10p15 [7,8]. Another five type 1 diabetes loci with convincing evidence have so far been identified by GWA studies in chromosome regions 2q24.3 [9], 12q24, 12q13, 16p13 and 18p11 [10,11].
Previously, we noted that, with the exception of INS [12], the type 1 diabetes loci contain polymorphisms that have been associated with susceptibility to other immune-mediated diseases, such as Graves' disease (GD) and systemic lupus erythematosus (SLE), suggesting the existence of shared disease loci [13]. In this study, we tested three genes, namely, IL23R, IRF5 and CD40, which have been associated with other immune-mediated diseases (Table 1), including Crohn's disease (CD), psoriasis, SLE and GD, for an association with type 1 diabetes. In addition, we tested seven genes, namely, ICAM1, TAF5L, PDCD1, TCF7, IL12B, IL6 and TBX21, with marginal or inconsistent evidence of an association with type 1 diabetes (Table 1). PDCD1 has also been associated with SLE and rheumatoid arthritis (RA). We genotyped reported polymorphisms, nonsynonymous SNPs (nsSNPs) and, for the IL12B and IL6 regions, tag SNPs in up to 7,888 case, 8,858 control and 3,142 parent-child trio samples. In addition, we used Wellcome Trust Case Control Consortium (WTCCC) [10] GWA study data to determine whether there was any further evidence of an association with type 1 diabetes in the linkage disequilibrium (LD) blocks containing the reported polymorphisms of interest.

Subjects
Type 1 diabetes families were white European or of white European descent, each with two parents and at least one affected child. DNA samples were available from up to 918 Finnish multiplex/simplex families [14], 456 multiplex Diabetes UK Warren I families [15], 278 multiplex Human Biological Data Interchange (HBDI) families [16], 80 Yorkshire simplex families, 263 Belfast multiplex/simplex families [17], 360 Norwegian simplex families [18] and 240 Romanian simplex families [19]. The single SNPs from TAF5L and IL12B, two SNPs from PDCD1 and five SNPs from IL6 were genotyped in all the families. The TCF7 SNP was genotyped in Warren, Yorkshire, Belfast and Romanian families. The TBX21 SNP was genotyped in Warren, Yorkshire, HBDI, Belfast and Romanian families. The CD40 SNP was genotyped in Warren and HBDI families.
The type 1 diabetes cases (maximum 7,888) [20], the British 1958 Birth Cohort (maximum 8,858) [21] and the UK Blood Services controls (1,500) have been described previously [6,10]. All cases and controls are white European. All DNA samples were collected after approval from the relevant research ethics committees and written informed consent was obtained from the participants.
Additional file 1, Table S1 contains a summary of the samples genotyped for each gene.

SNP identification and genotyping
IL12B is among the genes resequenced by the University of Washington-Fred Hutchinson Cancer Research Centre (UW-FHCRC) Variation Discovery Resource (SeattleSNPs) [22]. The IL12B SeattleSNPs data encompass the introns and exons from 1.7 kb 5' of exon 2 to 2 kb 3' of the untranslated exon 8 of IL12B. In this region, they detected 33 polymorphisms in their set of 23 DNA samples from Centre d'Etude du Polymorphisme Humain (CEPH) parents of European descent. As the untranslated exon 1 and the 5' region beyond it were not sequenced, we resequenced a further 2.9 kb 5', including exon 1 and the known promoter, in the same 23 CEPH parents used by SeattleSNPs. This identified a further four polymorphisms, including the CTCTAA/CG complex promoter deletion-insertion polymorphism (DIP) (rs17860508), incorrectly described previously as a 4 bp deletion [23].
IL6 was also resequenced by the SeattleSNP Project.
SNPs were genotyped using either Taqman MGB chemistry (Applied Biosystems) or Invader Biplex Assay (Third Wave Technologies, Madison). The D5S2941 microsatellite was genotyped using PCR, and size differences were evaluated using an ABI 3700 capillary sequencer.

Wellcome Trust Case Control Consortium
We used data from the WTCCC GWA study [10] to determine whether there was any evidence of an association with type 1 diabetes in the LD blocks containing the polymorphisms of interest. LD blocks were defined using Haploview [24] and HapMap Project [25] data for the 60 Centre d'Etude du Polymorphisme Humain (CEPH) parents. We used the four gamete rule [26] for defining LD blocks within Haploview. We note that the 2,000 case and 3,000 control samples (1,500 from the British 1958 Birth Cohort and 1,500 from the UK Blood Services) used by the WTCCC were also genotyped in this study. We required WTCCC SNPs to have a minor allele frequency (MAF) ≥ 0.05, a call rate ≥ 0.99 and no extreme deviation from Hardy-Weinberg equilibrium (HWE) (χ² on 1 df ≤ 25) [27].
[Table 1, gene function notes: TBX21 has a role in the complex regulation of T lymphocyte responses, as a master regulator of Th1 cytokine IFN-γ gene expression. The protein encoded pairs with the receptor molecule IL12RB1/IL12Rbeta1, and both are required for IL23A signaling. This protein associates constitutively with Janus kinase 2 (JAK2), and also binds to transcription activator STAT3 in a ligand-dependent manner.]
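The three WTCCC SNP filters described above (MAF, call rate and HWE) can be illustrated with a short Python sketch. This is not the WTCCC pipeline: the `SnpCounts` container and the function names are hypothetical, and the HWE statistic is assumed to be the standard Pearson χ² on 1 df computed from genotype counts.

```python
from dataclasses import dataclass


@dataclass
class SnpCounts:
    """Genotype counts for one biallelic SNP (hypothetical container)."""
    name: str
    n_aa: int       # homozygotes for allele a
    n_ab: int       # heterozygotes
    n_bb: int       # homozygotes for allele b
    n_missing: int  # samples with no genotype call


def maf(s):
    """Minor allele frequency among called genotypes."""
    n = s.n_aa + s.n_ab + s.n_bb
    p_b = (2 * s.n_bb + s.n_ab) / (2 * n)
    return min(p_b, 1 - p_b)


def call_rate(s):
    """Fraction of samples with a genotype call."""
    called = s.n_aa + s.n_ab + s.n_bb
    return called / (called + s.n_missing)


def hwe_chi2(s):
    """Pearson chi-squared (1 df): observed vs HWE-expected genotype counts."""
    n = s.n_aa + s.n_ab + s.n_bb
    p = (2 * s.n_aa + s.n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (s.n_aa, s.n_ab, s.n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def passes_qc(s, min_maf=0.05, min_call=0.99, max_chi2=25.0):
    """Apply the thresholds quoted in the text (MAF, call rate, HWE)."""
    return (maf(s) >= min_maf
            and call_rate(s) >= min_call
            and hwe_chi2(s) <= max_chi2)


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    for snp in (SnpCounts("in_hwe", 640, 320, 40, 5),
                SnpCounts("het_deficit", 700, 200, 100, 0)):
        print(snp.name, round(hwe_chi2(snp), 2), passes_qc(snp))
```

The counts in the usage lines are invented; only the three thresholds are taken from the text.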

Statistical Analysis
All statistical analyses were performed in either Stata [28] or R [29] statistical systems. Additional routines may be downloaded [30].
All unaffected parent and control genotypes were assessed for, and found to be in, Hardy-Weinberg equilibrium (P > 0.05). SNPs genotyped in the family collection were analysed using the transmission/disequilibrium test [31] and, after estimating pseudo-controls [32], conditional logistic regression, respectively modelling allelic relative risks (RRs; a one-degree-of-freedom (df) test) and genotype RRs (a two-df test). In the case and pseudo-control analysis, we consider the pair of alleles transmitted to the affected offspring as the "case" and the other three pairs of alleles that the parents could have transmitted as "pseudo-controls" in a matched case-control study [32].
The one-df test assumes multiplicative allelic effects, whereas the two-df test assumes no specific mode of inheritance; for example, in the analysis of TCF7 C883A, genotype risks of C/A and A/A were modelled relative to the C/C genotype.
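As an illustration of the one-df family-based test, the sketch below computes the classical TDT statistic, (b - c)²/(b + c), where b and c are the numbers of times heterozygous parents did and did not transmit allele A. It assumes a biallelic SNP with genotypes coded as the number of copies of allele A (0, 1 or 2); the function names are hypothetical, and the code is not the Stata or R routines used in this study.

```python
from math import erfc, sqrt


def tdt_counts(trios):
    """Count transmissions of allele A from heterozygous parents.

    trios: iterable of (father, mother, child) genotypes, each coded as the
    number of copies of allele A (0, 1 or 2). Returns (b, c), where b is the
    number of A alleles transmitted and c the number not transmitted.
    """
    b = c = 0
    for father, mother, child in trios:
        parents = (father, mother)
        n_het = sum(1 for g in parents if g == 1)
        # A alleles that homozygous parents must have transmitted.
        from_hom = sum(g // 2 for g in parents if g != 1)
        from_het = child - from_hom
        if not 0 <= from_het <= n_het:
            continue  # Mendelian inconsistency: skip the trio
        b += from_het
        c += n_het - from_het
    return b, c


def tdt_test(b, c):
    """Return (chi2, p_value, rr) for the TDT; rr is the ratio b / c."""
    chi2 = (b - c) ** 2 / (b + c)
    p_value = erfc(sqrt(chi2 / 2))  # upper tail of chi-squared on 1 df
    return chi2, p_value, b / c


if __name__ == "__main__":
    demo = [(1, 2, 2), (1, 0, 0), (1, 1, 2), (1, 1, 1), (2, 0, 1)]
    b, c = tdt_counts(demo)
    print(b, c, tdt_test(b, c))
```

Because only heterozygous parents contribute to b and c, trios with two homozygous parents add nothing to the statistic.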
In the case-control collection, we performed similar tests using logistic regression models, stratified by 12 broad geographical regions (Southwestern; Southern; Southeastern; London; Eastern; Wales; Midlands; North Midlands; Northwestern; East and West Riding; Northern; and, Scotland), to allow for geographic variation in allele frequencies across Great Britain [27].
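To illustrate why stratifying by region matters, the sketch below compares a pooled allelic odds ratio with a Mantel-Haenszel common odds ratio across strata. The Mantel-Haenszel estimator is swapped in here as a simple stand-in for the stratified logistic regression used in this study; it is not the authors' method, and the allele counts are invented.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table of allele counts:
    a = case risk alleles, b = case other alleles,
    c = control risk alleles, d = control other alleles."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel common odds ratio across strata of (a, b, c, d)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


if __name__ == "__main__":
    # Two hypothetical regions with different control allele frequencies
    # but the same within-region odds ratio.
    regions = [(20, 80, 10, 90), (40, 60, 16, 54)]
    pooled = [sum(col) for col in zip(*regions)]
    print("pooled OR:", odds_ratio(*pooled))
    print("stratified OR:", mantel_haenszel_or(regions))
```

In this toy example both regions have the same within-region odds ratio, yet pooling the counts gives a different estimate because allele frequencies and case-control ratios differ between regions.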
The tag SNPs were selected and analysed using a multilocus test, as previously described [7,33]. Qu et al. [8] recently reported that the family-based multilocus test was not confined to heterozygous parents, which compromised the protection against population stratification. This is incorrect: as described in the studies by Chapman et al. [33] and Lowe et al. [34], only transmissions from heterozygous parents contribute to the test.

TAF5L
We genotyped one TAF5L SNP (C744A; rs3753886), which has previously been associated with type 1 diabetes [35] (Table 1). We found inconsistent evidence of an association between type 1 diabetes and C744A in the case-control and family collections. In 7,497 case and 7,496 control genotypes, we obtained marginal evidence of an association (P = 7.32 × 10⁻³; OR for allele A = 1.07, 95% CI 1.02-1.12; Table 2). In 2,645 parent-child trio genotypes, there was borderline evidence of an association (P = 0.0519), but the effect of the minor allele (RR for allele A = 0.92, 95% CI 0.85-0.99; Table 3) was in the opposite direction to that in the case-control collection.
We found borderline evidence of an association in the WTCCC, which had two SNPs with a MAF ≥ 0.05 in the 5 kb LD block containing C744A. The lowest P-value was for C744A (P = 0.0561).

PDCD1
We genotyped two PDCD1 SNPs (7146G>A and 872C>T), which have previously been associated with type 1 diabetes [36] and RA respectively [37] (Table 1). We found inconsistent evidence of an association between type 1 diabetes and 7146G>A in the case-control and family collections (Tables 2 and 3). In 7,888 case and 8,858 control genotypes, we obtained limited evidence of an association (P = 0.0102; OR for allele A = 1.10, 95% CI = 1.02-1.17). In 3,125 parent-child trio genotypes, no evidence of an association was found (P = 0.498). For the second SNP, 872C>T, no evidence of an association was found in either collection (Tables 2 and 3).
There were no WTCCC SNPs with a MAF ≥ 0.05 in the 12 kb LD block containing 7146G>A. We note that 7146G>A was not included in either HapMap or the WTCCC study.

TCF7
We genotyped one TCF7 SNP (C883A; rs5742913), which had previously been associated with type 1 diabetes [38] (Table 1). Despite obtaining no evidence of an overall association, Noble et al. [38] proceeded to analyse subgroups, defined by up to three criteria, in which they found an excess of the A allele transmitted: from fathers; to male offspring; to low HLA risk (non-DR3/DR4) offspring; and to early onset offspring (Table 1). We found some evidence of an association between type 1 diabetes and C883A in the case-control and family collections (Tables 2 and 3). In 7,434 case and 8,637 control genotypes, in contrast to Noble et al. [38], we obtained evidence of an association with type 1 diabetes (P = 4.16 × 10⁻⁴; OR for allele A = 1.13, 95% CI = 1.06-1.22). In 1,556 parent-child trio genotypes, no evidence of an association was found (P = 0.608). In addition, we found no evidence of an association between type 1 diabetes and C883A when comparable subgroup analyses, as described by Noble et al. [38], were performed (see Additional file 1, Table S2). We also performed a case-only regression analysis of cases and affected offspring from the case-control and family collections, respectively, which showed no con- [...] Table S3). We note that the lack of evidence for an interaction between C883A and PTPN22 1858C>T contradicts a previous study reporting an over-transmission of PTPN22 1858T to individuals who have at least one copy of TCF7 883A (P = 0.015) [39].
We found no evidence of an association in the WTCCC, which had two SNPs with a MAF ≥ 0.05 in the 66 kb LD block containing C883A. The SNP with the lowest P-value was rs756699 (P = 0.694). We note that C883A was not included in either HapMap or the WTCCC study.

IL12B
IL12B has been reported to be associated with type 1 diabetes in some but not all studies (Table 1). Therefore, we investigated the contribution of IL12B to type 1 diabetes susceptibility as thoroughly as possible by genotyping a SNP (A1159C; rs3212227), two rare nsSNPs (rs3213096 and rs3213119), a microsatellite (D5S2941) and a set of tag SNPs for the IL12B region. In 4,321 case, 4,711 control and 3,015 parent-child genotypes, we obtained no evidence of an association between type 1 diabetes and A1159C (P = 0.134 and 0.630, respectively; Tables 2 and 3). In addition, we performed a case-only analysis to replicate the association reported by Windsor et al. [40] between A1159C and age-at-diagnosis (Table 1). However, no evidence of an age-at-diagnosis effect at A1159C was found (see Additional file 1, Table S4).
For the D5S2941 microsatellite, Davoodi-Semiromi et al. [41] detected two alleles, a major allele 1 with eight repeat units and a minor allele 2 with nine repeat units, and found over-transmission of allele 2 to affected offspring. We detected an additional two, extremely rare, alleles: allele 3 (ten repeat units) and allele 4 (seven repeat units), both found at a frequency of <0.001. In 1,327 case and 1,160 control genotypes, we obtained no evidence of an association between type 1 diabetes and D5S2941 allele 2 (P = 0.114). Davoodi-Semiromi et al. also reported an association between type 1 diabetes and the D5S2941 allele 2-1159C haplotype (P = 0.02) and suggested the possibility that the causal variant remained ungenotyped and elsewhere in IL12B [41]. Consequently, we also analysed this haplotype using an EM algorithm-based routine to assign haplotypes to cases and controls, which were then analysed using a linear model weighted by the posterior haplotype probabilities for each case or control. Again, we found no association with disease for this haplotype in 1,298 case and 1,111 control genotypes (P = 0.195).
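The haplotype assignment step can be illustrated with a minimal EM sketch for two biallelic loci, in which only double heterozygotes have ambiguous phase. This is an assumption-laden illustration, not the routine used in this study: it treats D5S2941 as biallelic (allele 2 versus all others), covers only the haplotype frequency estimation and not the weighted linear model, and the function name is hypothetical.

```python
from collections import Counter

HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def em_haplotype_freqs(genotypes, n_iter=200, tol=1e-10):
    """Estimate two-locus haplotype frequencies from unphased genotypes.

    genotypes: iterable of (g1, g2), each the count (0, 1 or 2) of allele '1'
    at locus 1 and locus 2. Returns a dict mapping haplotype -> frequency.
    """
    counts = Counter(genotypes)
    n_hap = 2 * sum(counts.values())
    n_double_het = counts.get((1, 1), 0)

    # Haplotypes that can be read off directly (at most one heterozygous locus).
    known = {h: 0.0 for h in HAPLOTYPES}
    for (g1, g2), n in counts.items():
        if (g1, g2) == (1, 1):
            continue
        locus1 = [1] * g1 + [0] * (2 - g1)
        locus2 = [1] * g2 + [0] * (2 - g2)
        for hap in zip(locus1, locus2):
            known[hap] += n

    freqs = {h: 0.25 for h in HAPLOTYPES}
    for _ in range(n_iter):
        # E-step: posterior probability that a double heterozygote
        # carries haplotypes 11/00 rather than 10/01.
        cis = freqs[(1, 1)] * freqs[(0, 0)]
        trans = freqs[(1, 0)] * freqs[(0, 1)]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: expected haplotype counts, renormalised to frequencies.
        expected = dict(known)
        for hap in ((1, 1), (0, 0)):
            expected[hap] += n_double_het * w
        for hap in ((1, 0), (0, 1)):
            expected[hap] += n_double_het * (1 - w)
        new = {h: c / n_hap for h, c in expected.items()}
        delta = max(abs(new[h] - freqs[h]) for h in HAPLOTYPES)
        freqs = new
        if delta < tol:
            break
    return freqs


if __name__ == "__main__":
    demo = [(2, 2)] * 10 + [(0, 0)] * 30 + [(1, 1)] * 20
    print(em_haplotype_freqs(demo))
```

The weight `w` computed in the E-step is the posterior phase probability for a double heterozygote, which is the kind of quantity a downstream weighted model would use.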
To investigate the possibility of a polymorphism associated with type 1 diabetes in IL12B that we, or others, have not yet genotyped, we adopted a linkage disequilibrium (LD) mapping approach using tag SNPs (Table 4). We combined resequencing data in 23 CEPH parents from the SeattleSNP project [22] with in-house resequencing of the untranslated exon 1 and 5' region, not resequenced in the SeattleSNP project (Methods). In the combined resequencing data, we identified 38 polymorphisms, comprising 34 SNPs (four SNPs provided by the in-house resequencing), three deletion-insertion polymorphisms (DIPs) and the microsatellite D5S2941 (see Additional file 1, Table S5). Six tag SNPs were selected (minimum R² = 0.80) from 25 SNPs with a minor allele frequency (MAF) ≥ 0.10. The set of tag SNPs was genotyped in the case-control collection and analysed using a multilocus test, which tests for an association between type 1 diabetes and the tag SNPs due to LD with one or more causal variants [33]. The multilocus test P-value was 0.940, providing no evidence of an association between type 1 diabetes and the IL12B region (Table 4). We note that rs17860508, the promoter DIP CTCTAA/CG, was selected as a tag SNP. Previously, this polymorphism had been associated with asthma susceptibility [23] and IgE levels [42], but we found no association with type 1 diabetes (4,367 case and 4,714 control genotypes, P = 0.878).
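The idea behind tag SNP selection can be sketched with a simple greedy procedure based on pairwise r² between phased haplotype columns. This is a simplification and an assumption: the selection used in this study is based on a multilocus R² criterion [33], whereas the sketch below tags each common SNP with a single tag at pairwise r² ≥ 0.80. The function names and toy haplotypes are hypothetical.

```python
def allele_freq(column):
    """Frequency of the allele coded 1 in a column of 0/1 haplotype alleles."""
    return sum(column) / len(column)


def r_squared(x, y):
    """Pairwise LD r^2 between two SNPs given phased 0/1 haplotype columns."""
    n = len(x)
    px, py = sum(x) / n, sum(y) / n
    pxy = sum(a * b for a, b in zip(x, y)) / n
    d = pxy - px * py
    denom = px * (1 - px) * py * (1 - py)
    return d * d / denom if denom > 0 else 0.0


def greedy_tags(snps, min_maf=0.10, min_r2=0.80):
    """Greedily pick tags so every common SNP has r^2 >= min_r2 with a tag.

    snps: dict mapping SNP name -> list of 0/1 alleles across haplotypes.
    """
    common = {
        name: col for name, col in snps.items()
        if min(allele_freq(col), 1 - allele_freq(col)) >= min_maf
    }
    untagged = set(common)
    tags = []
    while untagged:
        # Choose the SNP that captures the most still-untagged SNPs
        # (ties broken by name so the result is deterministic).
        best = max(
            sorted(common),
            key=lambda s: sum(
                r_squared(common[s], common[t]) >= min_r2 for t in untagged
            ),
        )
        captured = {
            t for t in untagged if r_squared(common[best], common[t]) >= min_r2
        }
        tags.append(best)
        untagged -= captured
    return tags


if __name__ == "__main__":
    toy = {
        "snp1": [1] * 6 + [0] * 14,
        "snp2": [1] * 6 + [0] * 14,   # in perfect LD with snp1
        "snp3": [0] * 10 + [1] * 10,  # weak LD with snp1 and snp2
        "snp4": [1] + [0] * 19,       # MAF 0.05, below the threshold
    }
    print(greedy_tags(toy))
```

In the toy data, one of the two perfectly correlated SNPs is enough to tag both, and the rare SNP is excluded by the MAF filter before tagging.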
Finally, we tested for an association between type 1 diabetes and two rare nsSNPs (rs3213096 and rs3213119). Although the MAF was 0.022 for both nsSNPs in the SeattleSNPs CEPH sequencing panel, in the case-control collection, rs3213096 had a much lower MAF of 0.0035 in controls and, consequently, we had no power in a collection of 4,383 case and 4,732 control genotypes to test for an association. The other nsSNP, rs3213119, with a MAF of 0.034 in controls, showed no evidence of an association with type 1 diabetes in 4,348 case and 4,691 control genotypes (P = 0.0941).
We found limited evidence of an association in the WTCCC, which had six SNPs with a MAF ≥ 0.05 in the 14 kb LD block containing A1159C. The only associated SNP was rs6859018 (P = 0.0274), which is in perfect LD (r² = 1) with A1159C in 60 CEPH parents. Consequently, as A1159C itself showed no evidence of an association in our larger case-control and family collections, and given the perfect LD between rs6859018 and A1159C, we can conclude that this association is a false positive. We note that A1159C was not included in the WTCCC study.

IL6
As IL6 has been reported to be associated with type 1 diabetes [43] (Table 1), we sought to replicate this association by genotyping IL6-174G>C (rs1800795) and a set of tag SNPs for the IL6 region. We found inconsistent evidence of an association between type 1 diabetes and IL6-174G>C in the case-control and family collections (Tables 2 and 3). In 7,785 case and 8,852 control genotypes, we obtained limited evidence of an association with type 1 diabetes (P = 0.0291; OR for allele C = 1.05, 95% CI = 1.01-1.10). In 2,803 parent-child trio genotypes, no evidence of an association was found (P = 0.333). As Kristiansen et al. [43] found evidence that the IL6-174C allele was only associated with type 1 diabetes in female offspring (Table 1), we analysed IL6-174G>C by sex. In the cases and controls, we obtained limited evidence of an association in males (P = 0.0378), but not in females (see Additional file 1, Table S6a). In the families, we found no evidence of an association in either male or female type 1 diabetes offspring (see Additional file 1, Table S6b).
Previously, Gillespie et al. [44] found frequency differences in IL6-174G>C genotypes between males and females diagnosed at ages > 10 years. Consequently, we conducted a similar case-only analysis using a multinomial logistic regression model to adjust for broad geographical region within Great Britain and for population of the cases and affected offspring, respectively. We found no evidence of genotype differences between 5,700 male and 5,292 female cases and affected offspring (P = 0.370), or when analysed by age-at-diagnosis (see Additional file 1, Table S7).
To test for an association between type 1 diabetes and the IL6 region, we adopted an LD mapping approach. Using SeattleSNP resequencing data in 23 CEPH parents, four tag SNPs were selected (minimum R² = 0.80) from twelve SNPs with a MAF ≥ 0.10 (see Additional file 1, Table S8). We found no evidence for association in the case-control collection (multilocus test P = 0.236). However, in the family collection, limited evidence of an association was found (multilocus test P = 0.0231) (Table 4).
We found no evidence of an association in the WTCCC, which had two SNPs with a MAF ≥ 0.05 in the 4 kb LD block containing IL6-174G>C. The SNP with the lowest P-value was rs2069835 (P = 0.343). We note that IL6-174G>C was included in the WTCCC study, but was dropped as the call rate was below 0.99.

ICAM1
We found some evidence of an association in the WTCCC, which had two SNPs with a MAF ≥ 0.05 in the 15 kb LD block containing G241R. The most associated SNP was rs892188 (P = 0.0257; OR for allele T = 1.10, 95% CI = 1.01-1.19), located just over 2 kb upstream of the 3' UTR of ICAM5, which has low LD (r² = 0.274) with G241R. We note that G241R was not included in the WTCCC study.

TBX21
We genotyped one TBX21 SNP (His33Gln; rs2240017), which had previously been associated with type 1 diabetes in a Japanese case and control collection [46] (Table 1).
No evidence for association in either collection was found (Tables 2 and 3). We note that the G (Gln) allele frequency in controls was 0.024 (Table 2) and in parents 0.027, considerably lower than reported in Japanese by Sasaki et al. (MAF = 0.105) [46], but similar to that found in other European populations (MAF = 0.030) [47].
We found no evidence of an association in the WTCCC, which had one SNP with a MAF ≥ 0.05 in the 14 kb LD block containing His33Gln. The SNP, rs2240017, had a P-value of 0.131. We note that His33Gln was not included in the WTCCC study.

IL23R
We genotyped one IL23R SNP (Arg381Gln; rs11209026), which has previously been associated with inflammatory bowel disease (IBD) and psoriasis (Table 1). In 6,087 cases and 6,303 controls, we found no evidence of an association with type 1 diabetes (P = 0.183; Table 2).
The only WTCCC SNP with a MAF ≥ 0.05 in a 15 kb LD block containing Arg381Gln was Arg381Gln (P = 0.857).

IRF5 and CD40
We genotyped one SNP each from IRF5 and CD40 (-3835/rs2004640 and 168A>G/rs1883832, respectively), which have previously been associated with SLE and GD, respectively (Table 1). The CD40 -168A>G SNP has been reported to disrupt the Kozak consensus sequence necessary for efficient translation [48]. No evidence of an association was found for either SNP (Tables 2 and 3).
We found no evidence of an association in the WTCCC, which had no SNPs with a MAF ≥ 0.05 in the 5 kb LD block containing -3835; 168A>G was not contained within a block. Neither -3835 nor 168A>G was included in the WTCCC study.

Discussion
In this study, we have tested ten candidate genes for an association with type 1 diabetes using large case-control and family collections. We did obtain some evidence, albeit inconsistent between collections, of an association with TAF5L (C744A), PDCD1 (7146G>A), TCF7 (C883A) and IL6 (IL6-174G>C, rs2069849 and the IL6 region). Although TAF5L (C744A; rs3753886), PDCD1 (7146G>A; rs11568821), TCF7 (C883A; rs5742913) and IL6 (IL6-174G>C; rs1800795) have previously been associated with type 1 diabetes, the possibility remains that these associations are false positives. However, the findings reported here may be the result of the case-control collection having more power than the family collection to detect SNPs that have relatively small effects in type 1 diabetes or are in weak LD with the causal locus. Consequently, additional studies will be required to ascertain the contribution of TAF5L, PDCD1, TCF7 and IL6 to type 1 diabetes susceptibility. The case-control collection (8,000 cases and 8,000 controls) provided about 60% power to detect an OR of 1.2 for a MAF of 0.10 at a P-value of about 1 × 10⁻⁶, and about 96% power for a MAF of 0.20. The family collection (3,125 parent-child trios) provided less power to detect an OR of 1.2: after relaxing the P-value threshold to 1 × 10⁻³, there was about 45.1% power for a MAF of 0.10 and about 81.4% power for a MAF of 0.20.
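The case-control power figures above can be approximated with a normal-approximation sketch for an allelic test. The method used for the quoted figures is not stated in the text, so everything below is an assumption: the MAF is taken to be the control allele frequency, effects are multiplicative, the test statistic is the log odds ratio divided by its standard error, and the function name is hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist


def allelic_power(n_cases, n_controls, maf, odds_ratio, alpha):
    """Approximate power of a two-sided allelic test (Wald test on log OR).

    maf is assumed to be the risk-allele frequency in controls; the case
    frequency follows from the allelic odds ratio.
    """
    p0 = maf
    p1 = odds_ratio * p0 / (1 - p0 + odds_ratio * p0)
    # Standard error of the log odds ratio from expected allele counts
    # (each individual contributes two alleles).
    se = sqrt(
        1 / (2 * n_cases * p1) + 1 / (2 * n_cases * (1 - p1))
        + 1 / (2 * n_controls * p0) + 1 / (2 * n_controls * (1 - p0))
    )
    z_effect = abs(log(odds_ratio)) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # The probability of rejecting in the wrong direction is ignored.
    return NormalDist().cdf(z_effect - z_crit)


if __name__ == "__main__":
    for maf in (0.10, 0.20):
        power = allelic_power(8000, 8000, maf, 1.2, 1e-6)
        print(f"MAF {maf:.2f}: power = {power:.3f}")
```

Under these assumptions the sketch gives values in the same range as the case-control figures quoted above; it does not cover the family-based power calculation.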
We did not obtain any evidence of an association between type 1 diabetes and either ICAM1, IL12B or TBX21, all of which had previously been associated with type 1 diabetes [45,46,49]. However, limited evidence of an association with rs892188, located in the LD block containing the reported SNP in ICAM1, was provided by the WTCCC. Additional studies will be required to ascertain the contribution of rs892188 to type 1 diabetes susceptibility. The previous reports of disease associations may well have been false positives, which have to be expected given: the low prior probability, even for candidate genes [1,50], of detecting a true causal locus of complex disease; the frequent use of relatively small sample sizes and of nominal levels of significance; and the large numbers of SNPs tested for an association with type 1 diabetes. It is interesting to note that when, by chance, a true positive result is found, as was the case for the PTPN22 Arg620Trp SNP in type 1 diabetes [5], it is replicated by many groups, very rapidly [6,39,51,52], although this SNP has a large effect, with an OR approaching 2.
Finally, we did not obtain any evidence of an association between type 1 diabetes and IL23R, IRF5 or CD40, all of which have previously been associated with other immune-mediated diseases including, most recently, IL23R in Crohn's disease [10,53] and in psoriasis [54]. Lack of association of IL23R, IRF5 and CD40 with type 1 diabetes helps to delineate pathogenic mechanisms between type 1 diabetes and other immune-mediated diseases, especially when the associations reported for the other diseases are highly likely to be true positive results. Nevertheless, for these and other loci, it is possible that one or more of these genes could have allelic heterogeneity in which one allele predisposes to certain autoimmune diseases and a second allele at a different location in the gene predisposes to another. This possibility necessitates the continued investigation of further polymorphisms within each gene.

Conclusion
The functional candidate gene approach has now been superseded by GWA studies, which are detecting major susceptibility loci [10,11,53,55]. Most of the confirmed loci from GWA studies have ORs ≤ 1.3, consistent with an L-shaped distribution of allelic effect sizes (that is, a small number of genes with large effects and a large number of genes with small effects) [1,11]. TAF5L, PDCD1, TCF7, IL6 and ICAM1 may be amongst the numerous loci with small effects in type 1 diabetes. We note that genes found with small effects on disease may have much larger effects in subgroups of phenotypically defined cases. For example, CTLA4 genotype has a small effect overall in type 1 diabetes (OR = 1.20, 95% CI 1.13-1.27), but subclassification of cases with or without the thyroid peroxidase autoantibodies revealed an increased effect (OR = 1.49, 95% CI 1.29-1.72) for cases with autoantibodies (without autoantibodies OR = 1.16, 95% CI 1.10-1.24) [56].