Frequency of single nucleotide polymorphisms in NOD1 gene of ulcerative colitis patients: a case-control study in the Indian population

Background Epidemiological studies have provided ample evidence that genetic factors play an important role in determining susceptibility to inflammatory bowel disease (IBD). The most significant finding in IBD research has been the identification of mutations in the gene that encodes the Nod2 (nucleotide-binding oligomerization domain 2) protein in a subgroup of patients with Crohn's disease. However, the association of a very similar gene, encoding the Nod1 protein, with ulcerative colitis has not been well documented. In the present study, we attempted to detect polymorphisms in the NOD1 gene using SNP analysis. We evaluated the frequency and significance of mutations present in the nucleotide-binding domain (NBD) of the NOD1 gene in the Indian population.
Methods A total of 95 patients with ulcerative colitis and 102 controls enrolled in the Gastroenterology department of All India Institute of Medical Sciences, New Delhi were screened for SNPs by DHPLC and RFLP techniques. The Exon 6 locus in the NBD of the NOD1 gene was amplified and sequenced. Genotype and allele frequencies of the patients and controls were compared using Pearson's χ2 test, Fisher's exact test and ANOVA with Bonferroni's correction in SPSS software version 12.
Results DHPLC screening revealed the presence of SNPs in the Exon 6 locus of the NBD of the NOD1 gene and proved suitable for rapid detection of base pair changes. The data were validated by sequencing of clones and subsequently by RFLP analysis. Analysis of the SNP data revealed 3 significant mutations out of 5 (W219R, p = 0.002; L349P, p = 0.002 and L370R, p = 0.039) in the Exon 6 locus of the NBD, which encompasses the ATP and Mg2+ binding sites. No significant association was observed with the different sub-phenotypes.
Conclusion We propose that the location of the mutations in Exon 6, spanning the ATP and Mg2+ binding sites of the NBD in the NOD1 gene, may affect the process of oligomerization and the subsequent function of the LRR domain. Further studies are being conducted at the protein level to test this possibility.


Background
Ulcerative colitis (UC) and Crohn's disease (CD) are the two diseases grouped under idiopathic inflammatory bowel disease (IBD). The two diseases can be distinguished by differences in their clinical and pathological features [1]. Both environmental factors and genetic predisposition contribute to the emergence of the disease [2]. Several IBD linkage regions have been identified in genome-wide linkage scans [3]. Both CD and UC are considered complex genetic traits, as inheritance does not follow any simple Mendelian model [3]. The discovery of mutations in the NOD2/CARD15 gene (the first susceptibility gene known for CD, associated mainly with ileal involvement) in western European and North American countries as well as in Hungary was striking [4,5]. However, case-control studies of Japanese, Chinese, Korean and Turkish populations did not find NOD2/CARD15 polymorphisms in either patients or control groups [6][7][8][9][10]. A nonsynonymous SNP scan for ulcerative colitis identified a previously unknown susceptibility locus at ECM1 and showed that several risk loci were common to ulcerative colitis and Crohn's disease (IL23R, IL12B, HLA, NKX2-3 and MST1), whereas the autophagy genes ATG16L1 and IRGM, along with NOD2, were specific for Crohn's disease [11].
Nod1 is a cytosolic protein and a member of the protein family known as NLR/Nod (the CATERPILLER family) [12]. Nod1 has been recognized as a pattern-recognition receptor (PRR). The NOD1/CARD4 gene is located on chromosome 7p14 and has been genetically linked to asthma [13]. This protein family also includes a closely related protein, Nod2. Both Nod1 and Nod2 are thought to function in inflammation, innate and adaptive immunity, as well as in a variety of other processes that determine the balance between health and disease. These proteins are involved in the recognition of intracellular bacteria, primarily through sensing glycopeptides derived from microbial peptidoglycan. NLR family members are characterized by a centrally located oligomerization and nucleotide-binding domain (NBD), followed by a domain containing multiple leucine-rich repeats (LRRs) at the carboxy terminal end, and a caspase recruitment domain (CARD) at the amino terminal end [14]. Nod2, however, contains two CARD domains [15]. Activation of Nod1 and Nod2 by ligand binding initiates a variety of cellular responses including NF-kB and MAPK activation, cytokine production and apoptosis [16][17][18]. In the present study, we screened for SNPs in the NOD1 gene in ulcerative colitis patients in order to determine whether any significant mutation is associated with the disease. The analysis is based on the DHPLC technique, which can accurately screen a large number of samples within a short period of time. The results were validated by sequencing of PCR products and by PCR-RFLP analysis.

Patients and Healthy controls
The study group consisted of 95 unrelated patients with ulcerative colitis enrolled in the Gastroenterology department of All India Institute of Medical Sciences, New Delhi, India. A total of 102 healthy controls matched for age and sex were also evaluated. The control subjects were healthy volunteers or patients with functional dyspepsia. They had no gastrointestinal or liver diseases. The diagnosis of UC was established according to clinical guidelines and criteria based on endoscopic, radiological, and histopathological examinations. The demographic and clinical features of the ulcerative colitis patients are presented in Table 1. Patients with UC were classified according to the Montreal classification for age at onset, disease extent and behavior [19]. The mean age at diagnosis was 37 ± 12 years in UC patients and the mean disease duration was 4.3 ± 3.92 years. All patients and healthy controls gave informed consent and the study was approved by the ethical committee of the institute.

DNA extraction
1. Biopsy samples of patients
In the present study, we analyzed genomic DNA from biopsy samples of patients because a parallel study is being carried out on the gut bacterial profile during disease conditions. Genomic DNA was extracted from the colon biopsy samples (0.10-0.30 g) according to the modified protocol of Taggart [20]. Tissue pieces were placed in 500 μl STE buffer (0.1 M NaCl, 0.05 M Tris-HCl and 0.01 M EDTA, pH 8.0), 10 μl of 10% SDS, and 30 μl proteinase K (10 mg/ml). The solution was incubated at 50°C for 2 h for digestion. The tubes were inverted several times to accelerate the digestion process. After digestion, 3 μl of DNase-free RNase was added and the mixture was incubated for 30 min at 37°C. The resulting digestion mixture was extracted once with 500 μl of buffer-saturated phenol, pH 8, once with phenol-chloroform-isoamyl alcohol (24:1), and finally with chloroform-isoamyl alcohol. The DNA was then precipitated in 1 ml cold ethanol and sodium acetate (3 M, NaOAc, pH 5.

Blood samples for controls
Genomic DNA was isolated for control individuals from peripheral blood leucocytes following standard protocols [21].

Detection of polymorphisms
1. DHPLC Analysis
Analysis of SNPs was carried out using Denaturing High-Performance Liquid Chromatography (DHPLC) on a fully automated WAVE DNA fragment analysis system equipped with a DNA Sep column (Transgenomic, Crewe, UK). Prior to DHPLC analysis, PCR products from a reference sample with known allele contribution were added in equimolar amounts to PCR products from all patient samples. The mixtures were denatured and then cooled from 95 to 65°C to allow heteroduplex formation, and were automatically loaded onto the column with an autosampler. At a critical denaturing temperature, homo- and heteroduplexes are released from the column at different times. PCR products were examined for heteroduplexes by subjecting 20 μl of each PCR product to a denaturation step (10 min at 95°C) followed by a gradual re-annealing step in which the sample temperature was decreased from 95 to 65°C over a period of 45 min. The PCR products were then separated (flow rate of 0.9 ml/min) through a 2% linear acetonitrile gradient and detected by absorbance at 260 nm. The running temperature was first calculated by software that gives a computer-assisted determination of the melting profile and analytical conditions for each fragment. The actual running temperature was then established by repeatedly injecting the sample 1-2°C below and above the calculated temperature (64°C). Heteroduplex formation was checked using the melting profile of a known sequence of Exon 6. The temperature giving a 70 to 80% double helical fraction of wild-type DNA was determined. Positive controls were used to determine the DHPLC conditions. Primer sequences and annealing temperatures for PCR amplification, as well as resolution temperatures and start concentrations of buffer B for DHPLC analysis, are listed in Table 2. PCR fragments spanning the ATPase domain and Mg2+ binding site were amplified from all 95 patients and 102 control individuals and screened by DHPLC using the established gradient and temperature conditions.

Sequencing
The PCR products demonstrating a differential DHPLC profile were subsequently cloned into the pGEM-T vector and sequenced on both strands to confirm the sequence variations. Sequencing reactions were performed with the ABI BigDye Terminator cycle sequencing kit v1.1 (Applied Biosystems, Foster City, CA, USA) and samples were sequenced on an ABI Prism 310 Genetic Analyzer (Applied Biosystems). The sequences were deposited in the NCBI database.

RFLP analysis
In order to check the specificity of the DHPLC technique, the samples screened by DHPLC were further subjected to PCR-RFLP analysis. The 840 bp PCR product (amplified with the N1Ex6-1 forward and N1Ex6-2 reverse primers) was digested with Eco88I and SdaI to resolve the E266K and L370R mutations, respectively. The 427 bp PCR product (amplified with the N1Ex6-2 primer set) was digested with MbiI to resolve the 4773delG mutation, and the 412 bp PCR product (amplified with the N1Ex6-1 primer set) was digested with Eco81I to resolve the W219R mutation. The restriction enzymes were procured from Fermentas. All digestions were run overnight at 37°C, and the products were electrophoresed on a 2% agarose gel, stained with 0.4 mg/l ethidium bromide and visualized under UV illumination.

Statistical Analysis
Data were evaluated with SPSS software version 12, using standard contingency χ2 tests or Fisher's exact test to calculate genotype frequency differences between cases and controls. A two-tailed P-value < .05 was considered significant. Hardy-Weinberg equilibrium was tested using Pearson's chi-square test to determine whether the observed proportion of each genotype was in agreement with the expected values calculated from the allele frequencies.
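As an illustration of the case-control comparison described above, the following minimal Python sketch computes Pearson's χ2 statistic and a two-sided Fisher's exact P-value for a 2 × 2 table of allele counts. The study itself used SPSS, so this is not the authors' implementation. The function names and the allele counts are hypothetical and are not taken from Table 4; only the allele totals (2 × 95 for patients, 2 × 102 for controls) follow from the cohort sizes.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p_value). With 1 degree of freedom the upper-tail
    probability equals erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / denom
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact P-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = math.comb(row1 + row2, col1)

    def prob(x):
        return math.comb(row1, x) * math.comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(p, 1.0)


# Hypothetical allele counts (variant, wild type); NOT data from this study.
patients = (30, 160)   # 2 x 95 = 190 alleles
controls = (12, 192)   # 2 x 102 = 204 alleles

chi2, p_chi2 = pearson_chi2_2x2(*patients, *controls)
p_fisher = fisher_exact_2x2(*patients, *controls)
print(f"chi2 = {chi2:.2f}, P(chi2) = {p_chi2:.4f}, P(Fisher) = {p_fisher:.4f}")
print("significant at 0.05" if p_fisher < 0.05 else "not significant at 0.05")
```

Fisher's exact test is generally preferred when any expected cell count is small, which is common for rare variant alleles in cohorts of this size.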
Multiple comparisons were done using one-way ANOVA with the conservative Bonferroni correction. The significance level of α = .05 was chosen for all sets.

Results
Figure 1 shows a typical DHPLC chromatogram of the SNP profile of the NOD1 gene in the Exon 6 locus of the NBD. The variants observed in the ATP binding domain are shown in Figure 1B and those in the Mg2+ binding domain in Figure 1C to 1F. Homozygous nucleotide exchanges could be distinguished by a slight shift in the elution time compared to the reference. Adding an approximately equal amount of wild-type DNA to the sample (1:1) before the denaturation step allows homozygous alterations to be detected reliably. All samples were therefore analyzed first without added wild-type DNA to detect heterozygous mutations, and then after mixing with an equal amount of wild-type DNA to identify homozygous sequence variations. The variants were later confirmed by sequencing, as shown in Figure 2. Table 3 summarizes the SNPs in Exon 6 of the NOD1 gene. The E266K amino acid substitution was previously observed in CD patients by Walters et al., 2006 [22].
RFLP analysis using selected restriction enzymes further confirmed the status of the SNPs in our samples and reconfirmed the DHPLC data. Figure 3 shows representative results for genotyping of the Exon 6 locus of the NOD1 gene; the variants were clearly distinguishable after restriction digestion. For E266K (Figure 3A), digestion with Eco88I generated three bands of 424 bp, 303 bp and 113 bp in the homozygous wild type (GG) and two bands of 727 bp and 113 bp in the homozygous mutant (AA), whereas the heterozygous type (GA) exhibited four bands. L370R was studied after digesting the PCR product with SdaI: the wild type (TT) yielded two bands of 736 bp and 104 bp, the homozygous mutant (GG) a single band of 840 bp, and the heterozygous form (TG) three bands (Figure 3B). For W219R, three forms were resolved after digestion with Eco81I: a single band of 412 bp in the wild type (TT), two bands of 274 bp and 138 bp in the homozygous mutant (AA), and, as expected, three bands in the heterozygous condition (TA) (Figure 3C). The 4773delG mutation was detected after digesting the PCR product with MbiI: two bands of 331 bp and 96 bp were observed in the wild type (GG), the deleted form was visible as a single band of 427 bp because the missing nucleotide abolishes the restriction site, and heterozygous types yielded the expected three bands (Figure 3D).
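The banding logic described above can be sketched in Python as follows. The fragment sizes are those reported in this section; the dictionary layout and the function names are hypothetical and serve only to show why a heterozygote displays the union of the wild-type and mutant fragments. The sketch assumes idealized, exactly sized bands, which real gels do not provide.

```python
# Fragment sizes (bp) per allele, taken from the digestion patterns reported above.
ASSAYS = {
    "E266K (Eco88I)": {"amplicon": 840, "wild": (424, 303, 113), "mutant": (727, 113)},
    "L370R (SdaI)": {"amplicon": 840, "wild": (736, 104), "mutant": (840,)},
    "W219R (Eco81I)": {"amplicon": 412, "wild": (412,), "mutant": (274, 138)},
    "4773delG (MbiI)": {"amplicon": 427, "wild": (331, 96), "mutant": (427,)},
}

GENOTYPES = ("wild", "mutant", "heterozygous")


def expected_bands(assay, genotype):
    """Return the distinct band sizes (bp, largest first) expected for a genotype."""
    frags = ASSAYS[assay]
    alleles = {
        "wild": [frags["wild"], frags["wild"]],
        "mutant": [frags["mutant"], frags["mutant"]],
        "heterozygous": [frags["wild"], frags["mutant"]],
    }[genotype]
    return sorted({size for allele in alleles for size in allele}, reverse=True)


def call_genotype(assay, observed):
    """Match a collection of observed band sizes to a genotype, or return None."""
    target = sorted(set(observed), reverse=True)
    for genotype in GENOTYPES:
        if expected_bands(assay, genotype) == target:
            return genotype
    return None


for name, frags in ASSAYS.items():
    # Sanity check: the fragments of each allele must add up to the amplicon length.
    assert sum(frags["wild"]) == sum(frags["mutant"]) == frags["amplicon"]
    print(name, {g: expected_bands(name, g) for g in GENOTYPES})
```

In practice, small fragments may be faint and bands of similar size may co-migrate, so a tolerance on band size would be needed before applying such matching to real gel images.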
The genotype and allele distributions of the NOD1 variants in UC patients and controls were compared at the different loci of Exon 6 (Table 4). No significant departures from Hardy-Weinberg equilibrium were noted (data not shown). Of the five SNPs reported in this study, the frequencies of W219R (p = 0.002), L349P (p = 0.002) and L370R (p = 0.039) differed significantly between patients and controls, whereas the previously reported E266K mutation did not show a significant result in our study group. Figure 4 presents a comprehensive list of the mutations detected and their locations in the NOD1 gene. The nucleotide change T4644A leads to the amino acid change W219R, which is located in the ATP binding site of the NBD, whereas both the L349P and L370R mutations are located in the Mg2+ binding site of the domain.
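Because the Hardy-Weinberg data are not shown, the following minimal Python sketch illustrates how such a check is typically computed from genotype counts using Pearson's chi-square test with one degree of freedom. The function name and the genotype counts are hypothetical and are not taken from Table 4.

```python
import math


def hardy_weinberg_test(n_wt, n_het, n_mut):
    """Pearson chi-square test (1 degree of freedom) for Hardy-Weinberg equilibrium.

    Arguments are observed genotype counts: homozygous wild type,
    heterozygous and homozygous variant.
    Returns (variant_allele_frequency, chi2, p_value).
    """
    n = n_wt + n_het + n_mut
    if n == 0:
        raise ValueError("no genotypes supplied")
    q = (2 * n_mut + n_het) / (2 * n)   # variant allele frequency
    p = 1.0 - q                          # wild-type allele frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_wt, n_het, n_mut)
    # Skip classes with zero expectation (monomorphic locus) to avoid 0/0.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return q, chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical genotype counts for 102 controls; NOT data from this study.
q, chi2, p_value = hardy_weinberg_test(80, 20, 2)
print(f"variant allele frequency = {q:.3f}, chi2 = {chi2:.3f}, P = {p_value:.3f}")
print("consistent with HWE" if p_value >= 0.05 else "departure from HWE")
```

A P-value at or above 0.05 indicates no detectable departure from equilibrium; with rare alleles the expected homozygous-variant count is small, so an exact test may be more appropriate than the chi-square approximation.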
In order to establish any genotype/phenotype correlation, genotype and allele frequencies of the following SNPs were stratified by phenotypic subgroups. Analysis of the allele and genotype frequencies of 4644T>A, 5035T>C
Figure caption: Representative results of the NOD1 mutation profile identified in the Mg2+ binding site and ATPase domain of Exon 6 by DHPLC analysis.

Discussion
This is the first report on the prevalence of NOD1 polymorphisms in patients with ulcerative colitis from the northern part of India. The DHPLC scanning procedure described here was found to be efficient and fast in screening and detecting point mutations in samples including both wild types and mutants. The sensitivity of the procedure was assessed by sequencing the PCR products. We found the procedure highly reproducible, since the patterns of the DHPLC chromatograms matched the sequencing data well. This method has previously been used for detecting point mutations in factor IX gene scanning [23] and in other clinical applications [24]. We also confirmed these results by RFLP analysis.
In the course of the study, DNA from biopsy samples as well as blood samples was analyzed, since a parallel study is being carried out to investigate the importance of the commensal bacterial flora and its communication strategies with the host during IBD.
We did not observe any difference in the quality of DNA from colon biopsy samples versus blood. Several lines of evidence suggest that poorly regulated activation of the innate immune system could result in chronic inflammatory diseases. Mutations in the NBD and LRR domains of the NOD2 gene are frequently observed in patients with Crohn's disease [7,25,26]. So far, an association of the NOD1 gene with ulcerative colitis has not been documented. We chose Exon 6, spanning the NBD of the NOD1 gene, for our study. Earlier studies have shown that mutations within the exon encoding the nucleotide-binding domain in the CATERPILLER gene family are associated with hereditary periodic fevers characterized by constitutive IL-1β production [27]. The CATERPILLER protein cryopyrin/NALP3 regulates IL-1β processing by assembling the multimeric inflammasome complex, which is regulated by ATP binding [27,28]. Mutation of the nucleotide-binding domain might affect ATP binding, which may in turn alter downstream processes such as caspase-1 activation, IL-1β production, cell death, macromolecular complex formation, self-association and association with the inflammasome component.
Table footnote: Nucleotide numbering is based on the NOD1 gene sequence available under GenBank accession no. AF149774. The first base of the initiator methionine is taken as the start of the cDNA (PTC = premature termination codon at specified amino acid residue, aa = amino acid, E = Glutamic acid, K = Lysine, P = Proline, L = Leucine, R = Arginine, W = Tryptophan). ss represents the accession nos. of SNPs submitted to the NCBI database.
Figure caption: NOD1 E266K, L370R, W219R and 4773delG genotypes were deduced from the migration profile on a 1.5% agarose gel.
The E266K mutation in the Exon 6 region of the NOD1 gene was not significantly associated with disease (P = 0.272) in ulcerative colitis patients of Indian origin when compared with the non-IBD population. However, this polymorphism has earlier been reported to be significantly associated with Crohn's disease susceptibility [5]. Although based on a limited number of samples, our data show that this genotype does not demonstrate any association with ulcerative colitis. The new SNPs that we detected in the Mg2+ binding domain of the protein were L370R (P = 0.039) and L349P (P = 0.002). The third significant mutation, W219R (P = 0.002), was observed in the ATP binding domain. These mutations have so far not been reported in UC patients. The genomic organization demonstrates a high degree of conservation of the NBD- and LRR-encoding exons, and all the predicted NBD/LRR proteins are likely Mg2+ and ATP binding proteins [29,30]. These domains play an important role in the oligomerization process; thus a mutation in the ATP binding domain could lead to defective oligomerization owing to the non-availability of the ATP required for this process. Deletion of G at position 4773, which causes a frame-shift, was observed in a few ulcerative colitis patients, although not at a significant frequency. It can nevertheless be regarded as a potentially important locus, because it gives rise to a premature termination codon at amino acid position 295, encoding a truncated protein that may considerably affect the function of NOD1. Interestingly, the patients carrying this variant exhibited symptoms of acute inflammation.
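The way a single-base deletion such as 4773delG can shift the reading frame and introduce a premature termination codon can be illustrated with a short Python sketch. The coding sequence below is a toy example invented for illustration and is not the NOD1 sequence; the function names are likewise hypothetical. Only the standard genetic code is assumed.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid ('*' marks a stop codon).
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence from its first base up to the first stop codon."""
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[pos:pos + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def delete_base(cds, index):
    """Return the sequence with the base at 0-based position `index` removed."""
    return cds[:index] + cds[index + 1:]


# Toy coding sequence, invented for illustration; it is NOT the NOD1 sequence.
wild_type = "ATGGAACTGATCGCTTAA"      # ATG GAA CTG ATC GCT TAA
mutant = delete_base(wild_type, 3)     # remove the G that starts the second codon

print("wild type:", translate(wild_type))
print("deletion :", translate(mutant))
```

In the toy example, removing one G changes every downstream codon and brings a TGA stop codon into frame, so the translated product is shorter than the wild-type peptide. This is the same mechanism by which the deletion reported here is predicted to truncate the protein at position 295.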
We did not find any significant association between the different genotypes and the demographic data of the patients or the clinical characteristics of UC. There was an increasing trend in the frequency of the 5035T>C variant allele with disease extent, from rectum to pancolitis and left colon, but without significant association with any sub-phenotype.
Certain limitations of our data, such as the limited sample size, must be considered when interpreting our results. Before we can draw a firm conclusion on the association of these mutations with the disease, replication in an independent cohort is needed. Given the importance of these results, further confirmatory studies are warranted in a larger UC population.

Conclusion
Screening of samples for SNP analysis using the DHPLC technique has been quite useful and less time-consuming when analyzing a large number of patient samples. This high-throughput genotyping technique is particularly suitable for routine detection of SNPs.
Our study indicates an association of three SNPs with ulcerative colitis. The significant mutations observed in the ATP (W219R, p = 0.002) and Mg2+ (L370R, p = 0.039 and L349P, p = 0.002) binding domains of Exon 6 may lead to defective oligomerization of the protein, which may subsequently lead to a 'loss of function' by preventing the recognition of MDP that is necessary for subsequent NF-kB activation.