Linkage analysis of HLA and candidate genes for celiac disease in a North American family-based study

Background: Celiac disease has a strong genetic association with HLA. However, this association explains only approximately half of the sibling risk for celiac disease. Therefore, other genes must be involved in susceptibility to celiac disease. We tested for linkage to genes or loci that could play a role in the pathogenesis of celiac disease. Methods: DNA samples from members of 62 families, each with at least two cases of celiac disease, were genotyped at HLA and at 13 candidate gene regions, including CD4, CTLA4, four T-cell receptor regions, and 7 insulin-dependent diabetes regions. Two-point and multipoint heterogeneity LOD (HLOD) scores were examined. Results: The highest two-point and multipoint HLOD scores were obtained in the HLA region, with a two-point HLOD of 3.1 and a multipoint HLOD of 5.0. For the candidate genes, we found no evidence for linkage. Conclusions: Our significant evidence of linkage to HLA replicates the known linkage and association of HLA with celiac disease. In our families, likely candidate genes did not explain the susceptibility to celiac disease.


Background
Celiac disease (CD) is a common, familial, autoimmune gastrointestinal disease. It is caused by sensitivity to the dietary protein gluten, which is present in wheat, rye and barley. Symptoms include growth failure, abdominal pain, and diarrhea. Dermatitis herpetiformis is a cutaneous manifestation of CD. Complications of CD include lymphoma, osteoporosis, anemia, and seizures. The prevalence of CD in the US is 1:250 [1] and the ratio of symptomatic to asymptomatic cases is between 1:5 and 1:7 [2]. Before the advent of serological testing for diagnosing CD, it was considered a rare disease in the US.
The clinical standard for diagnosis of CD is a small intestinal biopsy showing villus atrophy and resolution of symptoms on a gluten-free diet. However, small intestinal biopsy is expensive, invasive, and often rejected by the US patient population. The serological IgA endomysial antibody (EMA) test is a screening tool that has greatly facilitated evaluation for CD in people with suggestive symptoms and in high-risk populations. IgA EMA testing has proven to be greater than 95% sensitive for adults and children with classic symptomatic CD [3][4][5][6][7][8][9][10] and greater than 98% specific in controls without known clinical disease [11,12]. It is therefore an inexpensive and specific method of screening family members for genetic studies. Moreover, a recent study identified symptomatic, EMA-positive individuals with CD whose intestinal biopsies were judged normal, with only minor mucosal lesions. All of these patients showed clinical and serological recovery on a gluten-free diet. The authors propose that serologic criteria may be more definitive in the diagnostic process than traditional biopsy criteria [13].
CD has a strong genetic association with the HLA class II DQ2 genotype composed of the DQA1*05 and DQB1*02 alleles [14]. However, the HLA association alone is insufficient to explain the hereditary nature of the disease, and is estimated to explain less than half the sibling risk [15][16][17][18]. There appears to be genetic heterogeneity, implying that more than one additional gene is involved in the disease. With current analysis software, it is possible to map complex traits like CD, where several genetic loci are probably involved and the mode of inheritance is unclear.
A first step toward identifying genes predisposing to CD is to investigate candidate genes. Likely candidates include the classes of genes involved in immune function, e.g., T-cell receptor (TCR) genes and immune-modulating genes. Other candidate genes are those from associated, independent diseases in which there is a higher rate of CD than in the general population, e.g., other autoimmune diseases such as insulin-dependent diabetes mellitus (IDDM). These associations may be explained by common gene(s) responsible for both diseases, or the diseases may share a similar autoimmune pathogenic mechanism [19]. There have been several European studies to localize genes for CD, but no significant evidence for linkage has been reported other than at HLA [20][21][22][23][24][25][26][27][28][29].
In this first study of families with CD from North America, we investigated linkage to several candidate genes that could play a role in the pathogenesis of CD using 62 families with at least two cases of CD.

Ascertainment of families with CD
Families with at least two cases of CD or dermatitis herpetiformis were ascertained through local gastroenterologists, gluten intolerance support groups, and advertising at local and national celiac disease support meetings. There was no selection of cases based on sex or race, although all individuals were Caucasian. None of the families appear to be related. The research study was approved by the University of Utah Health Sciences Center Institutional Review Board. Participants ranged in age from 2 years to 100+ years. Blood samples were collected from affected individuals and their first-degree relatives. For more distantly related cases, we also collected blood from the individuals who connect the cases in the pedigree. For example, for two affected grandchildren (with different parents) and an affected grandparent, we would collect samples from the grandchildren, their parents, and the grandparent. The breakdown of the affected individuals is shown in Table 1.

Diagnostic criteria
Medical records were obtained to confirm previous biopsy-proven CD or dermatitis herpetiformis. IgA EMA testing was performed for participants who did not have a biopsy-proven diagnosis of CD or dermatitis herpetiformis. Since IgA EMA is highly sensitive and specific for CD, we did not require biopsy confirmation for phenotype assignment.
IgA EMA was measured by indirect immunofluorescence using primate smooth muscle (IMCO Diagnostics, Buffalo, New York) as substrate [30]. IgA EMA titers greater than or equal to 1:5 were considered positive. Limiting dilution was performed on the positive sera.
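The phenotype rule described above can be sketched in code. This is a minimal illustration only, not the study's actual procedure: the function names, the `"1:N"` titer string format, and the label strings are assumptions introduced here. Only the 1:5 cutoff and the biopsy-or-serology logic come from the text.

```python
from typing import Optional

# From the text: IgA EMA titers greater than or equal to 1:5 count as positive.
EMA_CUTOFF_DILUTION = 5


def titer_dilution(titer: str) -> int:
    """Parse a titer written as '1:N' (assumed format) and return N."""
    numerator, _, denominator = titer.partition(":")
    if numerator.strip() != "1" or not denominator.strip().isdigit():
        raise ValueError(f"unrecognized titer format: {titer!r}")
    return int(denominator)


def assign_phenotype(biopsy_proven: bool, ema_titer: Optional[str] = None) -> str:
    """Return 'affected' or 'unknown/unaffected' (hypothetical labels).

    A biopsy-proven diagnosis is sufficient on its own; otherwise a
    positive IgA EMA titer is accepted without biopsy confirmation.
    """
    if biopsy_proven:
        return "affected"
    if ema_titer is not None and titer_dilution(ema_titer) >= EMA_CUTOFF_DILUTION:
        return "affected"
    return "unknown/unaffected"


if __name__ == "__main__":
    # Hypothetical participants, for illustration only.
    print(assign_phenotype(True))
    print(assign_phenotype(False, "1:160"))
    print(assign_phenotype(False, "1:2"))
```

The two output labels mirror the two liability classes used in the linkage models, where unaffected and untested relatives are not distinguished.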

Genotyping at short tandem repeat markers (STRs)
DNA was extracted from lymphocytes using PureGene DNA isolation kits (Gentra Systems Inc.). HLA DQA1 and DQB1 genotypes were determined as described in Feolo et al. [31]. Genotyping of DNA samples from 175 affected individuals, their parents, and any connecting relatives from 62 families was performed with 25 markers at 13 candidate gene regions and 4 markers at HLA. However, not all families were genotyped with all markers, because some families were collected after genotyping had been completed for some of the STRs. The candidate gene regions, markers, and chromosomal locations are listed in Table 2. For all markers, amplification of 20 ng genomic DNA in a total reaction mix of 10 µl was performed according to standard PCR procedures, with minor modifications to optimize product clarity. Genotyping was performed either on an ABI373 or radioactively on polyacrylamide gels. Genotypic data were stored in the same database as all kindred and phenotype information.

Linkage analysis methods
Analyses were performed using dominant and recessive genetic models, each with 2 liability classes of either affected or unknown/unaffected based on diagnostic criteria (Table 3). For each model, unaffected individuals and individuals with a serology- or biopsy-based diagnosis were given a penetrance function based on disease prevalence. For linkage analysis, we used the FASTLINK [32] implementation of the LINKAGE program [33,34] for two-point analysis, and the GENEHUNTER program [35] for both parametric and non-parametric (NPL) multipoint analyses. Two-point linkage in the presence of locus heterogeneity was assessed by the admixture test of Ott, using HOMOG [36]. We used a heterogeneity LOD (HLOD) of > 1.3 to indicate nominal evidence for linkage for all linkage analyses [37].
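To make the heterogeneity LOD concrete, the sketch below uses the standard admixture formulation, in which a proportion α of families is linked: at a fixed recombination fraction, HLOD(α) is the sum over families of log10[α · 10^LOD_i + (1 − α)]. It is an illustration under stated assumptions, not the HOMOG or GENEHUNTER implementation. The function names, the grid search over α, and the example per-family LOD values are hypothetical, and in practice the score is also maximized over the recombination fraction θ. Only the 1.3 threshold comes from the text.

```python
import math

# From the text: HLOD > 1.3 is taken as nominal evidence for linkage.
NOMINAL_HLOD = 1.3


def hlod(family_lods, alpha):
    """Admixture HLOD at one fixed recombination fraction.

    family_lods: per-family LOD scores (finite floats).
    alpha: assumed proportion of linked families, between 0 and 1.
    """
    return sum(math.log10(alpha * 10 ** lod + (1 - alpha)) for lod in family_lods)


def max_hlod(family_lods, steps=1000):
    """Grid-search alpha in [0, 1]; return (best_hlod, best_alpha)."""
    best_score, best_alpha = 0.0, 0.0  # alpha = 0 always gives HLOD = 0
    for i in range(steps + 1):
        alpha = i / steps
        score = hlod(family_lods, alpha)
        if score > best_score:
            best_score, best_alpha = score, alpha
    return best_score, best_alpha


if __name__ == "__main__":
    # Hypothetical per-family LOD scores, for illustration only.
    lods = [0.8, 0.5, -0.3, 0.6, -0.9, 0.4]
    score, alpha = max_hlod(lods)
    print(f"HLOD = {score:.2f} at alpha = {alpha:.2f}")
    print("nominal evidence" if score > NOMINAL_HLOD else "below nominal threshold")
```

Because α = 0 always yields a score of zero, the maximized HLOD is never negative, which is why families with negative LOD scores reduce the estimated α and do not drive the overall score below zero.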

Results
Candidate genes were selected based on function of those genes (i.e., T-cell receptors, CTLA4, and CD4) or from loci of associated diseases (i.e., IDDM). Although associated diseases were not considered in the selection of families, in several families, members had IDDM. In one family, a CD case, his sibling and 3 extended relatives had IDDM; in a second family, the CD case had IDDM; in a third family, the mother, 2 siblings, a daughter, and a cousin of a CD case had IDDM; and in a fourth family, the sister of a CD case had IDDM.
The highest 2-point HLOD scores obtained with either model are shown in Table 3.
Table footnotes: * There were a total of 62 families; however, not all families were genotyped for all markers, or missing genotypes precluded analysis in the family. + α = proportion of families linked; θ = recombination fraction; best HLOD from the analyses of dominant and recessive models.

Discussion
In this study, we examined linkage to a set of candidate genes for CD. These genes were selected because they could be related to CD through function or through an associated disease. For linkage analysis of this complex disease, we used general recessive and dominant models. Several biostatisticians have suggested that general models provide power to distinguish linkage signals independent of the true underlying disease mode of inheritance, provided both dominant and recessive models are used [38][39][40]. As expected, the highest two-point and multipoint HLOD scores were obtained in the HLA region, with a two-point HLOD of 3.1 and a multipoint HLOD of 5.0. This result replicates the known association and linkage of HLA to CD [22,25,29] and demonstrates the power of the family resource to detect linkage in the set of candidate gene markers.
We were interested in identifying non-HLA loci for celiac disease, but were unable to detect even nominal evidence for linkage at any of the loci investigated. For regions where we examined only 1 marker, a single marker may have been insufficient to detect linkage even if it existed. A number of candidate genes investigated in this study were examined previously in European populations. Our results are in agreement with previous linkage and/or association studies of CD and T-cell receptor genes (TCRα, TCRγ, TCRβ, and TCRδ), which found no evidence for linkage or association, although sample sizes were small [28,41]. CD28 and CTLA-4, two genes encoding receptors that regulate T-lymphocyte activation, are located at 2q33. Holopainen et al. [24] reported linkage and association to this region in a study of 100 Finnish families with CD, which may suggest a possible founder effect in these families. [...] [22]. They found significant evidence [...] [27] reported heterogeneity LOD scores > 2.0 at 5 regions, including 11p11 previously reported by Zhong et al. [29]. From these studies, the only region with at least nominal evidence for linkage that overlapped with the candidate regions studied here was IDDM3 at 15q26. One study reported possible evidence for linkage [29], one reported weak evidence [42], and two reported no linkage [20,22]. We were unable to detect linkage at this locus.

Conclusions
Our significant evidence of linkage to HLA replicates the known linkage and association of HLA with CD. In our families, likely candidate genes/loci did not explain the susceptibility to CD. It may be that these genes/loci are not involved in CD, that we had insufficient genotyping within regions, or that one or more of these genes has an effect too small for us to detect linkage with our set of families. We were unable to detect linkage at IDDM3 and at CTLA4, for which positive linkages were previously reported. This is similar to the experience in most other reported studies of celiac disease. Non-replication of linkage results in complex diseases is common, and may be due to the low power of studies to detect genes of relatively small effect and/or to a high degree of genetic heterogeneity among families. Larger data sets with more power are likely needed to find strong evidence for linkage.