BMC Medical Genetics
Resequencing of genes for transforming growth factor β1 (TGFB1) type 1 and 2 receptors (TGFBR1, TGFBR2), and association analysis of variants with diabetic nephropathy

Background: Diabetic nephropathy is the leading cause of end-stage renal failure in the western world. There is substantial epidemiological evidence supporting a genetic predisposition to diabetic nephropathy; however, the exact molecular mechanisms remain unknown. Transforming growth factor β1 (TGFβ1) is a crucial mediator in the pathogenesis of diabetic nephropathy.


Background
Diabetic nephropathy is a major clinical complication of diabetes mellitus. Epidemiological evidence supporting a genetic contribution to this disease includes ethnicity differences [1], familial clustering [2,3] and the fact that only a subset of individuals with type 1 diabetes develops diabetic nephropathy regardless of metabolic control [4]. In addition, simulation studies have shown that environmental effects are insufficient to account for the familial aggregation of this disease [3,5].
Transforming growth factor beta 1 (TGFβ1) is a multifunctional cytokine implicated in the pathogenesis of many forms of progressive renal disease, including diabetic nephropathy, by promoting renal hypertrophy and the accumulation of extracellular matrix [6]. Protein and mRNA levels of TGFβ1 are significantly increased in the renal glomeruli and tubulointerstitium of animal models of diabetes and in humans with diabetes [7]. TGFβ1 is transcriptionally activated by high extracellular glucose in murine glomerular mesangial cells [8]. Transgenic mice over-expressing TGFβ1 develop progressive renal failure, suggesting that chronically elevated levels of circulating TGFβ1 are integral to the pathogenesis of kidney disease [9,10]. Direct blockade of TGFβ protein by chronic administration of anti-TGFβ1 antibodies has been shown to decrease renal insufficiency [11]. In addition, antisense TGFβ1 oligonucleotides reduce the cellular hypertrophy and stimulation of matrix synthesis normally seen in renal cells exposed to high extracellular glucose [12].
The activity of TGFβ1 in regulating cell proliferation, differentiation and extracellular matrix production is mediated by a heterodimeric complex of type 1 and type 2 receptors. Upregulation of TGFβ1 receptors has been reported in animal models of glomerulosclerosis [13,14]. It has been proposed that upregulation of TGFβR2 induced by high extracellular glucose may contribute to distal tubular hypertrophy in diabetic nephropathy [15]. Isono and colleagues demonstrated that increased expression of TGFβR2 in the diabetic kidney is primarily due to stimulation of gene transcription rather than increased mRNA stability [16].
TGFβ1 is encoded by the TGFB1 gene located at chromosome 19q13.1 [17]. We investigated five known single nucleotide polymorphisms that may influence TGFB1 gene expression (TGFB1: -800G>A, -509C>T, +72InsC, +869T>C, +915G>C) for association with diabetic nephropathy. TGFβ receptors type 1 and type 2 are encoded by the TGFBR1 and TGFBR2 genes respectively. At present there are over six hundred variants recorded in dbSNP for these genes, with little information available on the role of these variants in relation to renal complications of diabetes. We screened the genomic draft sequence for the TGFBR1 and TGFBR2 genes in an Irish population to identify genomic variants. Allele frequencies were subsequently determined in a healthy control population and selected SNPs were genotyped in a case-control collection. In summary, we investigated whether putatively functional variants in three genes, TGFB1, TGFBR1 and TGFBR2, contribute to genetic susceptibility to diabetic nephropathy in type 1 diabetes.

Subjects
Ethical approval was obtained from the appropriate Research Ethics Committees in each country and written, informed consent obtained from individuals prior to conducting this study. The case and control groups used for this study (Table 1) have been described previously [18]. All patients were at least third generation Irish Caucasians diagnosed with type 1 diabetes mellitus before 31 years of age, and required insulin from diagnosis. Patients with nephropathy (cases, n = 272) had diabetes for at least 10 years before the onset of proteinuria (>0.5 g/24 h). Patients without nephropathy (controls, n = 367) had diabetes for at least 15 years, were not in receipt of antihypertensive medication, and had no evidence of non-diabetic renal disease. Patients with microalbuminuria were excluded from both groups.

In silico analysis
For TGFBR1, the nucleotide sequence of draft clone RP11-96L7 for human chromosome 9 was downloaded from the National Center for Biotechnology Information (NCBI) [19]. Similarly, the sequence for TGFBR2 was obtained for draft clone RP11-1024P17 on human chromosome 3. Reference mRNA (NM_004612; NM_003242) and protein (NP_004603; NP_003233) sequences were also downloaded from NCBI for TGFBR1 and TGFBR2 respectively. These were used to determine intron-exon boundaries for genomic DNA using Vector NTI Advance (suite 2, version 8, Informax Inc (Europe), Oxford, UK). The nomenclature for all identified variants follows the Human Genome Variation Society recommendations for coding sequences, updated 21st May 2005 [20]. In addition, we have provided rs numbers for all previously identified SNPs and ss numbers for novel SNPs to facilitate comparison between research groups.

Amplification and mutation screening
For PCR and screening purposes, 6464 bases of TGFBR1 and 5204 bases of TGFBR2 genomic sequence were divided into fragments with an average size of approximately 500 base pairs and screened in 15 case and 15 control individuals. As the TGFBR1 and TGFBR2 gene sequences cover approximately 45 kb and 84 kb respectively from start to stop codon, only the coding regions of these genes (including all exons, exon-intron boundaries and untranslated regions) were screened to prioritise the identification of potentially functional gene variants. Each PCR product was then evaluated using WaveMaker v3.4 software (Transgenomic Ltd, Crewe, UK) and analysed on the WAVE™ (dHPLC) DNA Fragment Analysis System (Transgenomic Ltd) following the manufacturer's recommendations. Differentially separating fragments (representing DNA variants) were bidirectionally sequenced to identify variants using an ABI PRISM® 3100 Genetic Analyser (Applied Biosystems, Warrington, UK). Forty-eight healthy controls (n = 96 chromosomes) from the Young Hearts collection [21] (a healthy Irish Caucasian population) were genotyped by direct capillary sequencing (Applied Biosystems) to establish allele frequencies for all gene variants.

Genotyping
Five SNPs in the TGFB1 gene were selected for genotyping because they have previously been suggested to influence the expression of TGFβ1 and have a minor allele frequency greater than 5%. TaqMan assays were successfully designed for the TGFB1: -800G>A (rs1800468), TGFB1: -509C>T (rs1800469), TGFB1: +869T>C (rs1982073) and TGFB1: +915G>C (rs1800471) SNPs, but proved problematic for TGFB1: +72InsC (rs1800999) due to the presence of a long C homopolymer. TGFB1: +72InsC was instead genotyped successfully using a biplex Invader™ assay (Third Wave Technologies Inc, Madison, WI, USA). Receptor variants were genotyped using Pyrosequencing® technology according to the manufacturer's instructions (Biotage, Uppsala, Sweden). Details of the primer sequences used for resequencing, together with the WAVE conditions and the oligonucleotides used for the genotyping assays, are listed in Tables 2, 3, 4 and 5, with further details available from the authors on request. In total, 272 case and 367 control samples were available for genotyping TGFB1 SNPs; fewer samples (241 cases and 322 controls) were available for genotyping TGFBR2 gene variants.
Genotype frequencies were assessed for Hardy-Weinberg equilibrium using a χ² goodness-of-fit test. The χ² test for contingency tables was used to compare genotype and allele frequencies between case and control subjects, with the level of significance set at p < 0.05. Haploview [22] was used to visualise linkage disequilibrium (LD) and haplotype blocks within each gene.
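As an illustration of the Hardy-Weinberg check described above, the following is a minimal sketch of a χ² goodness-of-fit test for a single biallelic SNP. The function name `hwe_chi_square` and the genotype counts in the usage line are hypothetical and are not taken from the study data; the sketch assumes one degree of freedom and no continuity correction, which may differ from the software the authors used.

```python
import math


def hwe_chi_square(n_major_hom, n_het, n_minor_hom):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Takes observed genotype counts for one biallelic SNP and returns
    (chi2, p_value), assuming 1 degree of freedom. Assumes at least one
    genotyped individual.
    """
    n = n_major_hom + n_het + n_minor_hom
    # Major allele frequency estimated from the observed genotypes.
    p = (2 * n_major_hom + n_het) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_major_hom, n_het, n_minor_hom)
    # Skip classes with zero expectation (monomorphic marker).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts, for illustration only.
chi2, p_value = hwe_chi_square(180, 75, 12)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

A p value above the chosen threshold gives no evidence of departure from Hardy-Weinberg proportions for that marker.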

Results
We have submitted our annotated sequencing data for TGFBR1 and TGFBR2 genes as GenBank accession numbers DQ383416-DQ383424 and DQ377553-DQ377559 respectively. A total of fifteen variants were identified in these genes (TGFBR1, n = 5; TGFBR2, n = 10) of which eight were previously recorded in dbSNP; we have obtained unique NCBI identifiers for all novel SNPs (n = 7; Table 6).
The distribution of genotypes was found to be in Hardy-Weinberg equilibrium for all SNPs in both case and control groups. No significant differences were observed in genotype and allele frequencies between case and control groups for any of the SNPs assessed (Table 7). Logistic regression analysis for the clinical characteristics described in Table 1 did not reveal a significant association between any variant and diabetic nephropathy. Adjusted p values for these potential covariates are shown in Table 8.
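The allele frequency comparison between cases and controls reported above can be sketched as a χ² test on a 2 × 2 table of allele counts. This is a minimal sketch under stated assumptions: the function name `allele_chi_square` and the counts in the usage line are hypothetical, no continuity correction is applied, and the genotype (2 × 3) comparison, which has two degrees of freedom, is not covered.

```python
import math


def allele_chi_square(case_minor, case_major, control_minor, control_major):
    """Pearson chi-square test on a 2x2 table of allele counts.

    Rows are cases and controls, columns are minor and major alleles.
    Returns (chi2, p_value) with 1 degree of freedom, no continuity
    correction.
    """
    a, b, c, d = case_minor, case_major, control_minor, control_major
    n = a + b + c + d
    margins = (a + b, c + d, a + c, b + d)
    if 0 in margins:
        raise ValueError("every row and column total must be non-zero")
    # Shortcut formula for the 2x2 Pearson statistic.
    chi2 = n * (a * d - b * c) ** 2 / math.prod(margins)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts, for illustration only.
chi2, p_value = allele_chi_square(60, 484, 70, 664)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

Each individual contributes two alleles, so the row totals are twice the number of genotyped cases and controls.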
The level of observed LD between all genotyped variants within each gene, together with the raw |D'| and R² scores, is shown in Figures 1 and 2. The most common combinations of alleles observed 5' to 3' were GC-TG

Discussion
We observed no significant differences in genotype or allele frequencies between case and control groups for any of the SNPs assessed. The TGFB1: +869T>C SNP has been associated with diabetic nephropathy in a Chinese population with type 2 diabetes [31]; however, our results do not support this finding for nephropathy in type 1 diabetes. The results from Wong and colleagues' smaller Chinese study (cases, n = 58; controls, n = 65) may be explained by a difference in genetic factors between type 1 and type 2 diabetes or by differences between the Chinese and Irish populations.
Our study employed rigorous phenotypic criteria for inclusion of cases and controls. The annual incidence of diabetic nephropathy increases over the first fifteen to twenty years' duration of type 1 diabetes, but after twenty-five years the absence of overt proteinuria makes the subsequent development of nephropathy unlikely [32,33]. The present report utilised cases and controls that were well matched for prolonged duration of diabetes (cases: mean duration = 26.9 ± SD 8.3 years; controls: mean duration = 27.7 ± SD 9.0 years). Our results for TGFB1 are in accord with Ng and colleagues' study, which also failed to find an association between the TGFB1: -800G>A, -509C>T, +869T>C or +915G>C polymorphisms and diabetic nephropathy in US Caucasians with type 1 diabetes [34]. This is in contrast to a larger study in UK Caucasians where a significant association (p = 0.027) was identified between TGFB1: +869T>C and diabetic nephropathy [35]. This UK study utilised the Golden Years cohort of type 1 diabetic individuals as a control population [31], with all the recruited subjects (n = 410) having a very long duration of type 1 diabetes (>50 years). Although these individuals did not have renal failure due to diabetic nephropathy, 29% were taking antihypertensive medication and 35.7% had evidence of micro- or macroalbuminuria [36]. These clinical features (antihypertensive medication and micro- or macroalbuminuria) are exclusion criteria for our own diabetic control group. Although our sample size has ~90% power to detect a doubling in the minor allele frequency in cases relative to controls (e.g. 10% vs. 5%), there is a need for a collaborative genotyping effort in larger sample collections to definitively determine the role of these SNPs in predisposition to diabetic nephropathy.
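The power statement above can be explored with a simple two-proportion calculation on allele counts. The paper does not state how power was calculated, so the following is only a sketch under assumptions: a two-sided normal-approximation test at α = 0.05, alleles treated as independent observations (two per individual), and the contribution of the opposite tail ignored. The function name `allelic_power` is hypothetical.

```python
from statistics import NormalDist


def allelic_power(maf_cases, maf_controls, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided test comparing allele frequencies.

    Uses the normal approximation for two independent proportions, with
    2 * n chromosomes per group. This is an assumed method, not the one
    documented in the study.
    """
    norm = NormalDist()
    n1, n2 = 2 * n_cases, 2 * n_controls
    pooled = (maf_cases * n1 + maf_controls * n2) / (n1 + n2)
    # Standard error under the null (pooled) and under the alternative.
    se_null = (pooled * (1 - pooled) * (1 / n1 + 1 / n2)) ** 0.5
    se_alt = (maf_cases * (1 - maf_cases) / n1
              + maf_controls * (1 - maf_controls) / n2) ** 0.5
    z_crit = norm.inv_cdf(1 - alpha / 2)
    diff = abs(maf_cases - maf_controls)
    return norm.cdf((diff - z_crit * se_null) / se_alt)


# Sample sizes and frequencies quoted in the text: 272 cases, 367 controls,
# minor allele frequency 10% in cases versus 5% in controls.
print(f"power = {allelic_power(0.10, 0.05, 272, 367):.2f}")
```

Under these assumptions the result for 10% vs. 5% is in the region of the ~90% quoted above. The approximation becomes unreliable for rare alleles, where an exact or simulation-based calculation would be more appropriate.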
TGFβ type I receptors form a heterodimeric complex with TGFβ type II receptors and bind TGFβ to mediate many TGFβ activities, including regulation of cell proliferation, differentiation and extracellular matrix production. It has recently been reported that TGFβ1-mediated epithelial-to-mesenchymal transition requires functional TGFBR2 [37]. Variants have been recorded for both TGFBR1 and TGFBR2 genes; however, there is limited genomic information regarding their influence on diabetic nephropathy. TGFBR1 is composed of nine exons and maps to chromosome 9q33-q34 [38]. TGFBR2 is composed of seven exons and maps to 3p22 [39]. There are presently 409 validated SNPs recorded in dbSNP for these two genes (TGFBR1, n = 115; TGFBR2, n = 294; dbSNP, accessed 12/01/06). Due to the large number of reported SNPs and potential ethnic variation in SNP occurrence and frequency [40], we resequenced these genes in our population.
We prioritised screening of the protein coding regions of these genes to aid identification of potentially functional gene variants. We screened for variants directly affecting lariat regions, splice sites, exonic/intronic splice enhancers, signal sequences, protein coding sequence, polyadenylation signals and untranslated regions. It is possible that other variants may affect regulatory mechanisms (promoter or enhancer elements, microRNA etc.) or that features such as post-translational modifications may affect these candidate genes and their subsequent protein activity. It is also possible that rare variants may play a role in susceptibility to diabetic nephropathy; however, this study lacks sufficient sample numbers to definitively assess the role of rare variants in this disease. Five novel variants were identified in TGFBR1, of which none were at sufficient frequency to assess in this case-control collection.
We have identified nine SNPs in TGFBR2. Two SNPs are located in exons (TGFBR2: c.1157C>T, TGFBR2: c.1149G>A) and one SNP, found to be putatively functional, is located in the 3' UTR (TGFBR2: c.*747C>G). Genotyping of TGFBR2: c.*747C>G did not reveal a significant association with diabetic nephropathy. A microsatellite [AT] del was also identified in the 3' UTR of TGFBR2. TGFBR2: c.1157C>T in exon four was found in only one sample in the heterozygous state (MAF: 1.1%), and does not lead to a change in amino acid (aac → aat = N389N). TGFBR2: c.1149G>A was found in only two samples (MAF: 2.2%), but leads to a non-synonymous change in amino acid (gtg → atg = V387M) in the serine-threonine protein kinase active domain of the mature chain for TGFβR2 (PROSITE: PS00108; Pfam: PF00069, accessed 03/02/06). Genotyping TGFBR2: c.1149G>A did not reveal a significant association with diabetic nephropathy; however, we did observe a doubling of the minor allele frequency in cases (MAF: 1.9% in cases vs. 0.8% in controls). This finding may be due to the low frequency of the minor allele, and our available sample numbers do not provide sufficient power to appropriately assess the association of this SNP with diabetic nephropathy.
The search for causative variants for susceptibility to diabetic nephropathy is constrained by limited numbers of well-characterised, precisely phenotyped cases and controls, which represents a major challenge in the study of complex disease genetics. While the power to identify disease gene loci is influenced by many factors, the requirement for adequate sample sizes of stringently phenotyped individuals is critical to the success and validity of complex disease association studies. Our results warrant further investigation of rare variants, particularly the TGFBR2 exonic SNPs, provided the sample population is sufficiently powered to assess the association.
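The distinction drawn above between the synonymous and non-synonymous exonic changes in TGFBR2 can be illustrated with a codon lookup. This is a minimal sketch: the table `CODON_TABLE_SUBSET` is a hypothetical helper containing only the four codons quoted in the text (taken from the standard genetic code), and the function name `classify_codon_change` is likewise an assumption for illustration.

```python
# Only the four codons mentioned in the text, from the standard genetic code.
# A full 64-codon table would be needed for general use.
CODON_TABLE_SUBSET = {
    "GTG": "V",  # valine
    "ATG": "M",  # methionine
    "AAC": "N",  # asparagine
    "AAT": "N",  # asparagine
}


def classify_codon_change(ref_codon, alt_codon):
    """Return (ref_aa, alt_aa, effect) for a single-codon substitution."""
    ref_aa = CODON_TABLE_SUBSET[ref_codon.upper()]
    alt_aa = CODON_TABLE_SUBSET[alt_codon.upper()]
    effect = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return ref_aa, alt_aa, effect


# Codon changes as quoted in the text for the two exonic TGFBR2 SNPs.
print(classify_codon_change("gtg", "atg"))
print(classify_codon_change("aac", "aat"))
```

A codon outside the subset raises a `KeyError`, which is deliberate here so that an incomplete table cannot silently misclassify a variant.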

Conclusion
Although experimental evidence suggests that TGFβ1 blockade may be an important therapeutic approach, we were unable to identify any association between TGFB1 gene variants and diabetic nephropathy. In resequencing the genes we identified eight novel variants for TGFB1, TGFBR1 and TGFBR2 genes but did not detect significant association between any of the common SNPs and nephropathy in this Caucasian population with type 1 diabetes.

Figure 1. Although |D'| values were not particularly large for TGFB1 markers (D' Plot), they were statistically significant. By the R² measure, there was little correlation observed between the genotyped markers (R² Plot). Further details are displayed in the descriptive tables shown below the LD plots. D' is the value of D prime between the two loci; LOD is the log of the likelihood odds ratio (a measure of confidence in the value of D'); R² is the correlation coefficient between the two loci; and CI low/CI high represent the 95% confidence limits for D' where the minor allele frequency is greater than 5%.
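The two LD measures defined in the figure caption above can be sketched for a pair of biallelic loci as follows. This is a simplified illustration that assumes the haplotype frequency is already known; Haploview estimates haplotype frequencies from unphased genotype data, which is not reproduced here. The function name `ld_measures` and the frequencies in the usage line are hypothetical.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab is the frequency of the haplotype carrying allele A at the first
    locus and allele B at the second; p_a and p_b are the frequencies of
    those alleles. Both loci must be polymorphic (0 < frequency < 1).
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    d = p_ab - p_a * p_b
    # D' rescales D by its maximum possible magnitude given the allele
    # frequencies, which depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical frequencies: a rare allele found only on one background.
d, d_prime, r2 = ld_measures(p_ab=0.1, p_a=0.1, p_b=0.5)
print(f"D = {d:.3f}, D' = {d_prime:.2f}, r2 = {r2:.2f}")
```

Because D' is normalised by allele frequencies and r² is not, two markers with very different allele frequencies can show a high |D'| alongside a low r², which is why the two plots can give different impressions of the same data.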
APM: Participated in study conception and design, supervised recruitment of patients in Northern Ireland, contributed to data interpretation, co-wrote the manuscript and approved final manuscript.