Genetic mapping of high caries experience on human chromosome 13

Background Our previous genome-wide linkage scan mapped five loci for caries experience. The purpose of this study was to fine map one of these loci, the locus 13q31.1, in order to identify genetic contributors to caries. Methods Seventy-two pedigrees from the Philippines were studied. Caries experience was recorded and DNA was extracted from blood samples obtained from all subjects. Sixty-one single nucleotide polymorphisms (SNPs) in 13q31.1 were genotyped. Association between caries experience and alleles was tested. To follow up the results found in the Filipino families, we also studied 1,481 DNA samples obtained from saliva of subjects from the USA, 918 children from Brazil, and 275 children from Turkey. We used the AliBaba2.1 software to determine whether the nucleotide changes of the associated SNPs altered the predicted transcription factor binding sites, and we analyzed the expression of the genes selected on the basis of these binding predictions. Mutation analysis of a segment of 13q31.1 that is highly conserved in mammals was also performed in 33 Filipino individuals. Results Statistically significant association with high caries experience was found for 11 markers in 13q31.1 in the Filipino families. Haplotype analysis also confirmed these results. In the populations used for follow-up purposes, associations were found between high caries experience and a subset of these markers. Regarding the transcription factor binding site predictions, the base change of the SNP rs17074565 was found to change the predicted binding of transcription factors that could be involved in the pathogenesis of caries. When the sequence has the C allele of rs17074565, the transcription factors predicted to bind the sequence are GR and GATA1. When the subject carries the G allele of rs17074565, the transcription factor predicted to bind the sequence is GATA3. The expression of GR in whole saliva was higher in individuals with low caries experience than in individuals with high caries experience (p = 0.046). No mutations were found in the highly conserved sequence. Conclusions Genetic factors contributing to caries experience may exist in 13q31.1. The SNP rs17074565 is located in an intergenic region and is predicted to disrupt the binding sites of two different transcription factors that might be involved with caries experience. GR expression in saliva may be a biomarker for caries risk and should be further explored.


Background
Caries is a multifactorial disease and our ongoing research continues to provide evidence that genetic factors related to the host are involved in caries susceptibility. Our previous studies focused on genetic variation of genes involved in the enamel formation [1][2][3][4] and in the immunological system [5]. We complemented these studies by performing a genome-wide linkage scan to unravel novel loci for caries [6].
The genome-wide linkage study identified three loci for low caries experience (5q13.3, 14q11.2, and Xq27.1) and two loci for high caries experience (13q31.1 and 14q24.3) [6]. The fine mapping of the locus 5q12.1-13.3 suggested that BTF3 has a functional role in the pathogenesis of caries [7]. The fine mapping of the locus 14q11.2 pointed to TRAV4 as involved in caries experience [8]. Both genes are suggested to be protective factors against caries. These results show that focusing on the regions identified by the genome-wide linkage analysis can lead to the identification of genetic contributors to caries. In the present study we fine mapped the locus 13q31.1 in order to identify genetic contributors involved in high caries experience.

Studied population
We studied 3,151 individuals from six population data sets, including samples from the Philippines, USA, Brazil, and Turkey.
The Filipino sample set consisted of DNA samples from 477 subjects (224 females and 253 males) from 72 pedigrees, recruited between 2005 and 2007 and living on Cebu Island. The mean age of the individuals was 25.8 years and ages ranged from one to 82 years. The mean DMFT/dmft score was 9.7 and scores ranged from 0 to 32. We compared individuals living in the same area in the Philippines, and therefore with similar cultural backgrounds and access to dental care, in an attempt to reduce the influence of environmental confounders. The families studied all come from the central part of the country, mostly Cebu Island and the surrounding islands. All families are small-scale fishermen or landless rural dwellers. They all appear to be descendants of a proto-Malay stock. Most parents reported brushing the teeth of their children and similar dietary habits.
The sample from Pittsburgh, USA consisted of 1,481 unrelated subjects (715 males and 766 females) who sought treatment at the University of Pittsburgh and were part of the Dental Registry and DNA Repository project. The mean age of the individuals was 40.9 years and ages ranged from six to 92 years. The mean DMFT/dmft score was 15.9 and scores ranged from 0 to 28. This population is at high risk for oral and systemic diseases, but no detailed data on caries risk factors are available for this study group. Pittsburgh is the largest city in the Appalachian region of the United States, which is one of the poorest in the country. Pittsburgh has had fluoridated water since 1953; however, nearly half of the children in Pittsburgh between six and eight years of age have had cavities according to the State Department of Health of Pennsylvania (http://www.portal.state.pa.us/portal/server.pt/community/oral_health/14180). More than 70% of 15-year-olds in the city have had cavities, the highest percentage in the state. Close to 30% of the city's children have untreated cavities, more than double the state average of 14%.
From Brazil, two sample data sets were available for this study. The first consisted of DNA samples from 598 unrelated children and teenagers (313 males and 285 females) who sought treatment at the Federal University of Rio de Janeiro during 2010 and 2011. The mean age of the children was 9.0 years and ages ranged from two to 18 years. The mean DMFT/dmft score was 2.5 and scores ranged from 0 to 17. The second sample set included DNA samples of children from Nova Friburgo recruited during 2012. The city of Nova Friburgo is located in the northern mountainous region of the Rio de Janeiro state, 136 km from downtown Rio de Janeiro. Children (N = 320, 158 males and 162 females) were from eight daycare centers in Nova Friburgo. The mean age of the children was three and a half years and ages ranged from one to six years. The mean dmft score was 1.4 and scores ranged from 0 to 16. Variables related to risk factors for caries were not available for all participants in these two Brazilian cohorts, and these data could not be included in the analyses.
From Istanbul, Turkey, two sample data sets were also available for this study. The first sample was from a study originally designed as a case-control study and consisted of 172 unrelated children (93 females and 79 males) from three to six years of age recruited during the year of 2006. Ninety children had a dmft score of four or more and 82 children were caries free [2]. The second sample was designed as a cohort study and included 103 children (45 males and 58 females). The mean age of the children was five years and ages ranged from four to six years. The mean dmft score was 2.5 and scores ranged from 0 to 9. For this study group, most parents reported not brushing the teeth of their children. Drinking water in the region is not artificially fluoridated.
These Review Board, Turkey) and appropriate informed consent was obtained from all participants. Age-appropriate assent documents were used for children between seven and 14 years, and informed written consent was obtained from the child as well as from the parents.

Determination of caries experience
Caries was diagnosed using a modified World Health Organization protocol recommended for oral health surveys [9]. Teeth lost to trauma or primary teeth lost to exfoliation were not included in the final DMFT/dmft scores. When records indicated that teeth were extracted for orthodontic reasons or periodontal disease, or treatments were performed in sound teeth, these situations were not included in the final DMFT/dmft scores. The studies developed in Turkey included white spot lesions as evidence of caries. For all studies, carious lesions were recorded as present when a break in enamel was apparent on visual inspection. All the examiners carried out the clinical examination after being calibrated by an experienced specialist. Details about the determination of caries experience were previously described [1,2,4,6].
In this study, subjects were classified as having either 'low caries experience' or 'high caries experience', based on the DMFT/dmft distribution in each cohort (DMFT/dmft mean and standard deviation) and the subject's age. The criteria used here for classification of caries experience took age into consideration, since caries experience is expected to increase with age in the general population [10]. Table 1 presents caries experience definitions for the Filipino and US cohorts. For the Turkish and Brazilian cohorts (which included only children), subjects with a DMFT/dmft score of 0 to 2 were classified as 'low caries experience' and subjects with a DMFT/dmft score of 3 or higher were classified as 'high caries experience.'
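The rule for the child cohorts can be written as a small function. This is a minimal sketch: the function name `classify_child_caries` is hypothetical, and the age-dependent definitions of Table 1 for the Filipino and US cohorts are not reproduced here.

```python
def classify_child_caries(dmft_score: int) -> str:
    """Classify a child from the Turkish or Brazilian cohorts by DMFT/dmft score.

    Hypothetical helper illustrating the cutoff stated in the text:
    scores of 0-2 are 'low', scores of 3 or higher are 'high'.
    """
    if dmft_score < 0:
        raise ValueError("DMFT/dmft score cannot be negative")
    if dmft_score <= 2:
        return "low caries experience"
    return "high caries experience"
```

Under this rule a child with a dmft score of 2 falls in the low group and a child with a score of 3 falls in the high group.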

Single Nucleotide Polymorphism (SNP) genotyping
A target region at the locus 13q31.1 was fine mapped based on our previous genome-wide linkage results [6]. This region covers approximately one million base pairs. For the selection of the SNPs we used data from the International HapMap Project on Whites and Chinese (www.hapmap.org), viewed through the software Haploview [11]. Based on pairwise linkage disequilibrium and haplotype blocks we selected 61 SNPs (Table 2) in the region. Genotyping was performed by polymerase chain reaction with the TaqMan method on the real-time PCR system ABI PRISM® 7900HT Sequence Detection System (Foster City, CA, USA). Probes were supplied by Applied Biosystems (Foster City, CA, USA).
For the first step of the genotype analyses, we evaluated the 61 selected SNPs in the Filipino families. The association between caries experience and the SNPs was tested with the transmission disequilibrium test (TDT) within the program Family-Based Association Test (FBAT) under a recessive model [12], since the original linkage results [6] suggested a recessive model. To account for multiple testing, a Bonferroni correction was applied (0.05/61), giving an alpha of 0.00082. In the second step of the genotyping analyses, we followed up eleven SNPs selected from the original 61-SNP panel based on the p-values obtained. The data sets from the US, Brazil, and Turkey were tested. The differences in genotype and allele frequencies between 'high' and 'low' caries experience groups were tested using PLINK with an established alpha of 0.05. Haplotype analysis was also performed. Hardy-Weinberg equilibrium was evaluated using the chi-square test for each SNP in each population, and only the results that were in Hardy-Weinberg equilibrium were further analyzed.
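The two routine calculations in this paragraph, the Bonferroni threshold and the Hardy-Weinberg chi-square test, can be sketched as follows. This is an illustration only, not the FBAT or PLINK implementation; the function names and the genotype counts in the usage note are hypothetical.

```python
import math


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Chi-square test (1 degree of freedom) for Hardy-Weinberg equilibrium.

    Takes observed genotype counts for a biallelic SNP and returns
    (chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    if p == 0.0 or q == 0.0:
        raise ValueError("SNP is monomorphic; the test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

`bonferroni_alpha(0.05, 61)` gives about 0.00082, the threshold quoted above. For made-up counts such as `hwe_chi_square(50, 40, 10)`, the statistic is small and the SNP would be retained at an alpha of 0.05.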

Bioinformatics analysis to predict transcription factor binding sites
Since the 13q31.1 region studied contains no genes, sequences containing the eleven associated SNPs were analyzed with the AliBaba 2.1 software (http://www.gene-regulation.com/pub/programs/alibaba2/index.html). This analysis was performed to identify changes in the predicted transcription factor binding sites according to the base change of each SNP.

Gene expression analyses
DNA and RNA extracted from whole saliva were used to assess expression levels of genes selected from the bioinformatics analysis. These samples came from 143 unrelated individuals living in twelve sites of the Patagonian region of Argentina, recruited over two weeks, one in December 2006 and the other in May 2008, and are detailed elsewhere [7,8]. Samples are part of the University of Pittsburgh Center for Craniofacial and Dental Genetics studies. The mean age of the subjects was 21.7 years (range 1 to 72 years). Both the Centro de Educación Médica e Investigaciones Clínicas "Norberto Quirno" (CEMIC) and University of Pittsburgh Institutional Review Boards approved the study of these samples, and appropriate written informed consent was obtained from all participants (parents provided consent for the participation of individuals 17 years of age and under). The criteria used for classification of caries experience are presented in Table 1. Quantitative real-time PCR was used to determine expression in whole saliva of the GATA1, GR, GATA3, IL4, IL5, and IL13 genes (Table 3). GATA1, GR, and GATA3 were selected because they are predicted to bind the sequence affected by the SNP rs17074565. IL4, IL5, and IL13 were selected due to evidence that GATA3 can promote secretion of these interleukins [13].
Parametric and nonparametric tests were used to compare differences in expression between high and low caries experience individuals. The Pearson or Spearman correlation tests were used to analyze the strength of the relationship between GATA3 and interleukins (IL4, IL5, and IL13) to verify if there is evidence of co-expression of GATA3 and interleukins in whole saliva.
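The two correlation coefficients mentioned here can be sketched in plain Python. This is a minimal illustration, not the software used in the study (the text does not name it); the function names are hypothetical, and p-values are omitted.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values: list[float]) -> list[float]:
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x: list[float], y: list[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))
```

Pearson's coefficient measures linear association between expression levels, whereas Spearman's coefficient depends only on their rank order and is therefore less sensitive to skewed expression values.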

Mutation analysis
We sequenced a region in 13q31.1 that is highly conserved (Figure 1). This highly conserved area was identified by evaluation of data available at the UCSC genome browser (http://genome.ucsc.edu/). Three primers for the amplification of the entire region of approximately 1,200 base pairs were designed using the software PRIMER3. Primer sequences and PCR conditions are presented in Table 3. The sequences obtained were verified against a consensus sequence obtained from the UCSC genome browser with the software Sequencher 5.1.

Association results in the Filipino families
Out of 61 SNPs used for fine mapping the target chromosomal region, eleven showed statistically significant or borderline association (p ≤ 0.001) with caries experience. These results are summarized in Table 4. Associations could also be seen between caries experience and the haplotypes of these markers (Table 5).

Association results in the follow-up populations
Follow-up studies showed significant association for the markers rs17074565 and rs980635 in the Brazilian data set from Nova Friburgo, as well as additional borderline results, which are presented in Table 6. A borderline result was found for the marker rs9601986 in a recessive model (p = 0.06) in the population data set from the US. In this same population, the TT genotype of the marker rs4885849 was a protective factor against caries in the logistic regression model (p = 0.029; OR = 0.26, 95% confidence interval 0.07-0.87). The logistic regression analysis included as covariates the genotypes of the eleven markers selected after the analyses in the Filipino families.
In the case-control study from Turkey, the marker rs9318803 was a protective factor for caries experience (p = 0.02; OR = 0.37, 95% confidence interval 0.16-0.85) in the logistic regression. An association with this same marker in the same population data set was also observed when the recessive model was tested (p = 0.03).
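The odds ratios and confidence intervals reported above can be sanity-checked with a little arithmetic. Assuming the intervals are Wald-type intervals from the logistic regression (the text does not say how they were computed), they are symmetric on the log scale, so the point estimate should lie near the geometric mean of the two bounds. The helper names below are hypothetical.

```python
import math


def ci_geometric_midpoint(ci_lower: float, ci_upper: float) -> float:
    """Geometric mean of the bounds; approximates the OR for a Wald interval."""
    return math.sqrt(ci_lower * ci_upper)


def log_or_standard_error(ci_lower: float, ci_upper: float, z: float = 1.96) -> float:
    """Standard error of log(OR) implied by a Wald confidence interval."""
    return (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)


# rs9318803 in the Turkish case-control sample: reported OR = 0.37 (0.16-0.85)
midpoint = ci_geometric_midpoint(0.16, 0.85)
standard_error = log_or_standard_error(0.16, 0.85)
print(f"midpoint = {midpoint:.3f}, SE(log OR) = {standard_error:.3f}")
```

For rs9318803 the midpoint is close to the reported odds ratio of 0.37. For rs4885849 (0.07-0.87) it is about 0.25, slightly below the reported 0.26, which may reflect rounding of the lower bound.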

Transcription factor binding site predictions according to the base change in each SNP
We determined potential transcription factor binding sites according to the base change of each associated SNP in the DNA sequence. Three of the eleven SNPs (rs17074565, rs9601669, and rs4885849) were predicted to alter the transcription factors binding to the sequence (Table 7). The prediction changes for SNP rs17074565 were of particular interest. When the sequence has the C allele, the transcription factors predicted to bind are GR (glucocorticoid receptor) and GATA1 (GATA binding protein 1). When the G allele is present, the transcription factor predicted to bind is GATA3 (GATA binding protein 3).

Gene expression in whole saliva
The genotype distribution of the SNP rs17074565 in the tested samples was 84 CC, 4 CG, and 10 GG. There was no association between caries experience and genotype distribution.
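Allele frequencies follow directly from these genotype counts, since each individual carries two alleles. The sketch below uses a hypothetical function name and works through the arithmetic.

```python
def allele_frequencies(n_cc: int, n_cg: int, n_gg: int) -> tuple[float, float]:
    """Return (frequency of C, frequency of G) from genotype counts."""
    total_alleles = 2 * (n_cc + n_cg + n_gg)
    freq_c = (2 * n_cc + n_cg) / total_alleles
    freq_g = (2 * n_gg + n_cg) / total_alleles
    return freq_c, freq_g


# Genotype counts reported for rs17074565: 84 CC, 4 CG, 10 GG (98 individuals)
freq_c, freq_g = allele_frequencies(84, 4, 10)
print(f"C: {freq_c:.3f}, G: {freq_g:.3f}")
```

With 98 genotyped individuals there are 196 alleles, of which 24 are G (4 from heterozygotes and 20 from GG homozygotes), giving a G allele frequency of about 12%, the figure cited in the Discussion.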
In the real-time PCR analysis, mRNA expression comparisons were performed between low and high caries experience groups, and according to genotypes. A statistically significant difference in the expression level of GR was found between individuals with low and high caries experience (Table 8). No differences were found when genotypes and gene expression levels were compared between individuals with low and high caries experience (Table 9). GATA3 expression was significantly correlated with IL4 expression (r = 0.46; p < 0.0001), IL5 expression (r = 0.23; p = 0.019), and IL13 expression (r = 0.66; p < 0.0001).

Mutation analysis
For mutation analyses, we selected 33 unrelated individuals from the Philippines who carried two copies of the associated alleles of markers rs6563245, rs17074565, rs9601669, rs1490023, and rs9318803. These markers were selected due to their proximity to the highly conserved region identified in the multispecies comparison (Figure 1). No mutations were found.

Discussion
Our previous genome-wide linkage analysis showed suggestive linkage (LOD score above 2.0) to 13q31.1 when high caries experience was tested under a recessive model [6]. Our fine-mapping studies in the expanded data set of Filipino families confirmed the initial linkage results and showed association with markers in the locus. Eleven markers were overrepresented in allele transmissions to individuals with high caries experience. Follow-up studies of these eleven markers in five independent population data sets showed associations, and trends for association, between a subset of these markers in 13q31.1 and high caries experience. One possible explanation for the different results found in the Filipino samples in comparison to the other data sets is that the population from the Philippines studied here was more homogeneous, with very limited access to dental care, very similar diets based on rice and corn, no exposure to fluorides, and similar oral hygiene habits.
Since there are no genes in the studied region identified originally in our genome-wide linkage analysis, one of the intergenic SNPs in the region could in fact contribute to high caries experience. Another possibility is that the associated SNPs in the region could be in linkage disequilibrium to genetic variants outside the studied region, since the extent of linkage disequilibrium in the human genome can be greater than 100 kilobases [14][15][16].
One mechanism we are proposing is that the locus 13q31.1 may influence caries by altering transcription efficiency. The genetic variant (SNP) associated with high caries experience may disrupt a transcription factor binding site. Transcription factors bind directly to DNA to cause changes in gene expression. To gain insight into this hypothesis we predicted the transcription factors that bind to the sites of the SNPs associated with caries experience. Our analyses suggested that three markers (rs17074565, rs9601669, and rs4885849) potentially alter the prediction of the transcription factors binding to the region depending on the base change. Genes such as OCT1 (organic cation transporters), CPC1 (central pair complex 1), c-Rel (reticuloendotheliosis viral oncogene homolog), and LyF-1 (IKAROS family zinc finger 1) were predicted to bind at the sequences of certain SNPs depending on the base change. The genes most likely to be involved in the pathogenesis of caries based on their function were related to the predictions made using the base change of SNP rs17074565. The C allele was predicted to have GR (glucocorticoid receptor) and GATA1 (GATA binding protein 1) binding, but this prediction changed to GATA3 (GATA binding protein 3) when the G allele was input. GR is the receptor to which glucocorticoids bind, and there is evidence that the use of glucocorticoid-containing anti-asthmatic medications decreases salivary flow rate and changes saliva composition and pH [15][16][17][18][19][20][21][22][23]. In addition, rats receiving continuous glucocorticoid infusion show significantly increased caries progression, which may mean that glucocorticoids reduce the response of odontoblasts in the presence of a carious lesion. Our expression data show a statistically significant difference in GR expression when individuals with low and high caries experience were compared.
The use of glucocorticoids to treat asthma is a likely mechanism that explains, at least in part, the evidence suggesting that individuals with asthma have higher caries experience [24]; however, we have no record of whether our study samples came from individuals being treated with glucocorticoids. One possible reason we detected differential GR expression, with higher levels in individuals with lower caries experience, relates to individual salivary cortisol levels. Cortisol provides a quick burst of energy for survival purposes, heightens memory functions, boosts immunity, lowers sensitivity to pain, and helps maintain homeostasis. When secreted at higher levels for longer periods of time, it relates to a state of chronic stress. Salivary cortisol levels are found to be elevated in children with rampant caries, and those levels decrease after restorative treatment is provided [25]. Furthermore, children with early childhood caries have significantly higher levels of salivary cortisol when compared to caries-free children, and a positive correlation of salivary cortisol levels of the mothers of children affected by early childhood caries exists [26], suggesting that both child and mother salivary cortisol levels may impact the incidence of early childhood caries. Higher expression of GR in whole saliva of individuals with lower caries experience in our data may indicate that these individuals have lower levels of stress, but it may also indicate a more active immune response system. Conversely, individuals with higher caries experience showing lower levels of GR expression in whole saliva may be more susceptible to cariogenic biofilm formation. Our study has the obvious limitation of sample sizes that might not allow for the detection of relatively small effects. The frequency of the G allele of rs17074565 in the Argentina dataset is 12%, and comparisons of relative gene expression stratified by genotypes were likely impaired in our data.
However, our data also showed a correlation between GATA3 levels of expression in whole saliva and expression of IL4, IL5, and IL13 in the same individuals, which confirms previous findings [27]. Although the expression of these genes in whole saliva was not different depending on the caries experience in our data, it is possible that these genes may be involved in the pathogenesis of caries by the modulation of immune responses.
Another limitation is the difference in caries definitions among the study samples. The Turkish case-control cohort included white spot lesions as evidence of caries, while the other populations had carious lesions recorded as present when a break in enamel was apparent on visual inspection. The different definitions are due to the individual study designs, which included examinations in a dental office versus in the house of the family participants, or the possibility of using compressed air to dry teeth. These differences can impact our results, particularly for the cohorts with lower caries levels (i.e., from Brazil). One area of active research in our group is the determination of the best definition of caries for genetic analyses. The distribution of caries differed among the study populations and we scaled low and high caries experience differently to accommodate those distinctions; hence, the different definitions in Table 1. The average caries experience level (mean DMFT scores) was used to help guide the determination of caries experience by age in each group.

Conclusions
Genetic factors that contribute to high caries experience may exist in the gene desert 13q31.1 studied. rs17074565 may have a functional role in caries, disrupting the binding site for two different transcription factors that might be involved with immune responses. GR expression in saliva and salivary cortisol may be biomarkers for caries risk and should be further explored.