Identification of novel mutations in Chinese Hans with autosomal dominant polycystic kidney disease

Background: Autosomal dominant polycystic kidney disease (ADPKD) is the most common inherited renal disease, with an incidence of 1 in 400 to 1000. The disease is genetically heterogeneous, with two genes identified: PKD1 (16p13.3) and PKD2 (4q21). Molecular diagnosis of the disease in at-risk individuals is complicated by the structural complexity of the PKD1 gene and the high diversity of the mutations. This study is the first systematic ADPKD mutation analysis of both PKD1 and PKD2 genes in Chinese patients using denaturing high-performance liquid chromatography (DHPLC).
Methods: Both PKD1 and PKD2 were screened for mutations in each proband from 65 families using DHPLC followed by DNA sequencing. Novel variations found in the probands were checked in their available family members and in 100 unrelated normal controls. The pathogenic potential of the variations of unknown significance was then examined by evolutionary comparison, by the effects of amino acid substitutions on protein structure, and by the effects of splice-site alterations, using online mutation prediction resources.
Results: A total of 92 variations were identified, including 27 reported previously. Definitely pathogenic mutations (ten frameshift, ten nonsense, two splicing defects and one duplication) were identified in 28 families, and probably pathogenic mutations were found in an additional six families, giving a total detection rate of 52.3% (34/65). About 69% (20/29) of the mutations are reported here for the first time, and recurrent mutations account for about 31%.
Conclusions: Mutation study of PKD1 and PKD2 genes in Chinese Hans with ADPKD may contribute to a better understanding of the genetic diversity between different ethnic groups and enrich the mutation database. In addition, evaluating the pathogenic potential of novel variations should facilitate the clinical diagnosis and genetic counseling of the disease.


Background
Autosomal dominant polycystic kidney disease (ADPKD) is a severe inherited disorder accounting for up to 10% of end-stage renal diseases [1]. The disease is characterized by numerous gradually enlarging fluid-filled epithelial cysts in bilateral kidneys. Two mapped genes, PKD1 (MIM 601313) and PKD2 (MIM 173910), are known to cause the disease [2,3]. The former, mutated in 85% of all cases, encodes polycystin-1 (PC1), which is a receptor protein for cell-cell/matrix interactions in the regulation of cell proliferation and apoptosis; the latter, mutated in 15% of the cases, encodes polycystin-2 (PC2), which functions as a transient receptor potential ion channel and regulates intracellular Ca2+ concentration. PC1 interacts with PC2 to form a functional complex that acts as a flow-dependent mechanosensor for regulating the differentiated state of tubular epithelial cells [4][5][6][7].
Early diagnosis of ADPKD is established primarily by ultrasound imaging with age-related cyst number criteria [8]; however, for younger at-risk individuals and those with PKD2 mutations, ultrasonography may be insufficient for providing a definite diagnosis [9,10]. In such cases, linkage analysis is helpful for predicting the diagnosis, but it requires the participation of at least two affected relatives. In recent years, mutation screening, a direct and efficient approach, has proven applicable to all cases of suspected ADPKD. According to a recent study, the mutation detection rate is approximately 86% using a combination of SURVEYOR Nuclease-Wave HS analysis and direct sequencing [11]. However, mutation screening of PKD genes for clinical diagnostic purposes has proven difficult because of the structural complexity of the PKD genes and the high diversity of their mutations.
The PKD1 gene encodes an approximately 14 kb transcript with 46 exons spanning about 50 kb of genomic DNA [12], and the 5' part of the gene, covering exons 1-33, is duplicated three or more times proximally on chromosome 16 [13]. Therefore, locus-specific amplification of PKD1 is required to obtain a single copy of the gene's duplicated region. PKD2, which encodes a 3 kb open reading frame with 15 exons, spans a 70 kb genomic region [14]. Furthermore, no mutation hotspot has been reported in either gene, and distinguishing pathogenic mutations from non-pathogenic variations remains a major difficulty in direct gene diagnosis of the disease.
The present study describes a group of novel mutations discovered by direct mutation screening of both PKD1 and PKD2 in 65 Chinese families with ADPKD. A total of 100 unrelated normal controls were also recruited to differentiate between possible mutations and polymorphic changes. The mutation data obtained should be helpful in direct gene diagnosis, as well as in genetic counseling in clinical practice.

Methods
The patients and the normal controls
A total of 121 individuals from 65 unrelated families were recruited from West China Hospital, Sichuan University. Among them, 86 individuals were diagnosed with ADPKD according to the ultrasound criteria recommended by Ravine et al. The general clinical data of the patient cohort are summarized in Table 1. In addition, 100 unrelated healthy volunteers 35 to 56 years old were also recruited as controls after exclusion of any renal cysts by ultrasound examination. Peripheral blood samples were collected from all participants, with prior informed consent. This study was approved by the Institutional Ethical Review Boards, Sichuan University.

PCR amplification
For each proband, genomic DNA was extracted from the peripheral blood sample using a standard phenol-chloroform procedure and prepared in duplicate. The duplicated region of PKD1 encoding exons 1-33 was amplified as five specific long fragments, which were electrophoresed on 1% agarose gel to detect possible large sequence rearrangements [15][16][17]. The five long-range PCR products were then diluted 1:10^4 to avoid genomic DNA carryover and served as templates for 50 nested PCR reactions. Meanwhile, exons from the unique region of the PKD1 gene and the entire PKD2 gene were amplified from genomic DNA by 31 additional PCR reactions. A total of 81 PCR products, ranging from 150 to 450 bp, were separated on 2.0% agarose gel to check amplification efficiency and then prepared for DHPLC analysis.

Mutation screening by DHPLC
DNA fragments were analyzed using an automated WAVE Nucleic Acid Fragment Analysis System (Transgenomic, Omaha, Nebraska). Wavemaker 4.2 software (Transgenomic, Omaha, Nebraska) was used to determine the optimal melting temperature for the tested fragments. Heteroduplexes of amplicons were generated prior to DHPLC by denaturing the PCR products at 94°C for 5 min and cooling them at room temperature for 45 min. Then, about 5-8 μL of each product was injected into a high-throughput DNASep column and eluted with a linear acetonitrile gradient of 2% per minute at a flow rate of 0.9 mL/min. All chromatograms were grouped based on the differences in profiles between the normal controls and patients.

DNA sequencing
All fragments with aberrant elution profiles were sequenced to confirm the possible changes, using the same forward and reverse primers as for the PCR amplifications. Each change found was then verified in the duplicate sample. If more than one sequence change was observed in a DNA fragment, the fragment was cloned and sequenced. The NCBI RefSeq sequences were used for PKD1 [GeneBank:  [18,19].

Evaluation of the pathogenicity of sequence variations
Frameshifting deletions or insertions, nonsense changes, typical splicing changes, and in-frame changes of five or more amino acids were defined as pathogenic mutations in this study [20]. The pathogenic potential of missense, synonymous, and intronic changes was evaluated using a method recommended by Tan et al.
First, gene variations were classified by analyzing recurrence as reported in the literature, the ADPKD mutation database (PKDB) [21], and the Single Nucleotide Polymorphism database (dbSNP). Second, novel variations and previously unclassified variations were checked in available family members and in 100 unrelated normal controls. Variations found in unaffected family members or unrelated normal controls were classified as polymorphisms. Then, the functional significance of the remaining unclassified variations was evaluated computationally using web-based software programs. SIFT and PolyPhen-2 were used to predict the possible impact of substitutions on protein function and/or structure [22][23][24][25][26]. The Align-GVGD program was used to determine the Grantham Matrix Score (GMS) for evaluating evolutionary conservation (Grantham Variation [GV]) and chemical differences of the resulting amino acid substitutions (Grantham Distance [GD]) [27][28][29]. Potential splice-site effects were predicted using NNSplice and NetGene2 with default settings for missense, synonymous, and intronic changes [30][31][32][33][34].
All variations analyzed by these web-based software programs were finally sorted into four categories: 1) probably pathogenic; 2) indeterminate; 3) probable polymorphism; and 4) polymorphism. Only gene variations that were unanimously predicted to be deleterious by SIFT, PolyPhen-2 and Align-GVGD, or to affect splicing by both NNSplice and NetGene2, were considered "probably pathogenic", provided no other definite mutation was found in the same patient. If a definite mutation coexisted with a deleterious missense change or a likely atypical splicing variation in the same patient, the missense change or atypical splicing variation was considered "indeterminate". Similarly, only variations that were scored as benign or predicted to have no effect on splicing by all corresponding applications were considered "polymorphisms". Otherwise, they were classified as "probable polymorphisms".
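The sorting rules above can be read as a small decision procedure. The following Python sketch is one illustrative reading of them, not the software used in the study: the function name `classify_variation` and its boolean inputs are hypothetical, each tool's output is assumed to have been reduced by hand to a yes/no call, and an empty list means that a group of tools does not apply to the variation (for example, protein-level predictors for an intronic change).

```python
from typing import List


def classify_variation(
    protein_calls: List[bool],
    splice_calls: List[bool],
    has_definite_mutation: bool,
) -> str:
    """Sort one variation into the four categories described in the text.

    protein_calls: one entry per protein-level tool (SIFT, PolyPhen-2,
        Align-GVGD); True means "predicted deleterious". Hypothetical input.
    splice_calls: one entry per splice tool (NNSplice, NetGene2);
        True means "predicted to affect splicing". Hypothetical input.
    has_definite_mutation: True if the same patient already carries a
        definite pathogenic mutation.
    """
    if not protein_calls and not splice_calls:
        raise ValueError("at least one prediction is required")

    # Unanimous means every applicable tool in the group agrees.
    all_deleterious = bool(protein_calls) and all(protein_calls)
    all_splice = bool(splice_calls) and all(splice_calls)

    if all_deleterious or all_splice:
        # A coexisting definite mutation downgrades the call.
        return "indeterminate" if has_definite_mutation else "probably pathogenic"

    if not any(protein_calls) and not any(splice_calls):
        # Benign / no splicing effect according to every tool consulted.
        return "polymorphism"

    # Mixed predictions: only some of the tools flag the variation.
    return "probable polymorphism"


if __name__ == "__main__":
    print(classify_variation([True, True, True], [], False))
    print(classify_variation([True, False, True], [False, False], False))
```

With these inputs, the first call falls into "probably pathogenic" and the second into "probable polymorphism", because only two of the three protein-level tools agree.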

Results
Definite mutations were found in 28 of the families, including 10 frameshift, 10 nonsense, two typical splicing and one duplication of five amino acids. These disease-causing mutations are reported in Table 3. In total, 28 missense changes were detected in the patients, of which 9 had previously been reported as polymorphisms. Additionally, NP_001009944.2: p.Ser372Asn and p.Arg2654Gly, which coexisted with the definite mutation NP_001009944.2: p.Arg2430* in patient 09032, were found in unaffected family members; NP_001009944.2: p.Leu1290Val, which coexisted with NP_001009944.2: p.Arg462fs in patient 08006, NP_001009944.2: p.Arg3169Gln, which coexisted with NP_001009944.2: p.Trp3785* in patient 08020, and NP_001009944.2: p.Ala1792Thr in patient 09026 were found in unrelated normal controls. These five missense variations were classified as polymorphisms. The pathogenic potential of the remaining 14 unclassified missense changes was evaluated by SIFT, PolyPhen-2 and Align-GVGD (see Additional file 1). Of these, six were predicted to be deleterious by all three software applications and classified as "probably pathogenic" (Table 4); two were unanimously scored as benign and defined as "polymorphisms"; the others, scored as deleterious by only one or two of these applications, were considered "probable polymorphisms". Novel synonymous variations, intronic changes, and missense variations scored as "benign" or "unclassified" by SIFT, PolyPhen-2 and Align-GVGD were evaluated for splice-site effects (see Additional file 2). Three variations, NM_001009944.2: c.7704-12C>T, NM_001009944.2: c.7796T>G (p.Leu2599Arg), and NM_001009944.2: c.10618+16_10618+18delinsAAA, were predicted to have a slight effect on the splice site by only one of the applications and were therefore considered "probable polymorphisms".
Based on the analysis criteria indicated above, six variations were predicted to be "probably pathogenic", eight were classified as "probable polymorphisms", and 55 were classified as "polymorphisms". These polymorphisms and probable polymorphisms are shown in Additional file 3.
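As a cross-check, the counts reported in this paper can be reconciled with a few lines of arithmetic. The Python sketch below uses only figures stated in the text; the variable names are illustrative, and the number of missense changes with mixed predictions is inferred here by subtraction rather than quoted from the paper.

```python
# Missense changes: figures taken from the text.
missense_total = 28
previously_reported_polymorphisms = 9
found_in_relatives_or_controls = 5

remaining_unclassified = (
    missense_total
    - previously_reported_polymorphisms
    - found_in_relatives_or_controls
)

probably_pathogenic = 6   # deleterious according to all three tools
unanimously_benign = 2    # benign according to all three tools
# Inferred by subtraction: deleterious according to one or two tools only.
mixed_predictions = (
    remaining_unclassified - probably_pathogenic - unanimously_benign
)

# Family-level detection rate.
families_total = 65
families_definite = 28    # families with a definite mutation
families_probable = 6     # additional families with a probably pathogenic change


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


definite_rate = percent(families_definite, families_total)
overall_rate = percent(families_definite + families_probable, families_total)

if __name__ == "__main__":
    print(remaining_unclassified, mixed_predictions)
    print(definite_rate, overall_rate)
```

The subtraction reproduces the 14 unclassified missense changes, and the family counts reproduce the overall detection rate of 52.3% (34/65); counting definite mutations alone gives 43.1% (28/65).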

Discussion
Mutation analysis of PKD1 and PKD2 in Chinese ADPKD patients has previously focused on the unique region of the genes [35][36][37]; only one systematic mutation analysis of both genes in Chinese patients, by single-strand conformation polymorphism (SSCP), has been reported, and it included only 24 families [38]. Therefore, it is essential to understand how mutations are distributed across all regions of the genes in Chinese patients and the genetic diversity between different ethnic groups. The present study analyzed 65 ADPKD families using DHPLC and DNA sequencing, giving a mutation detection rate of 52.3%. Among the 29 mutations, 69% are reported for the first time, and recurrent mutations account for about 31%. No mutation hotspot was found in this study. Gene mutations detected in the study include frameshift, nonsense, missense, and splice-site changes, and the proportion of each type of mutation is in agreement with that reported by Rossetti et al. (P > 0.05, Chi-square test). Of the 62 variations detected in the PKD1 gene of the patients, 21 (33.9%) were located in exon 15, a proportion higher than that recorded in PKDB (181/873) (P < 0.05, Chi-square test). No patient in our study carried more than one definitely pathogenic mutation; only some missense, synonymous, or intronic changes were found to coexist with a definite pathogenic mutation, and these were scored as benign according to the analysis criteria. A duplication of five amino acids (NP_001009944.2: p.Val2217_Leu2221dup) found in patient 09024 is an exonic rearrangement that may affect the structure of the protein and was therefore classified as a pathogenic mutation, as recommended by Rossetti et al.
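The exon 15 comparison can be reproduced approximately from the two proportions quoted above. The Python sketch below runs a Pearson chi-square test on the 2 x 2 table, assuming no continuity correction and treating 181/873 as exon 15 entries over all PKD1 entries in PKDB; the paper states neither detail, so the exact statistic may differ from the authors' calculation. The function name `chi_square_2x2` is illustrative.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square test (1 degree of freedom, no continuity
    correction) for the 2 x 2 table [[a, b], [c, d]].

    Returns (statistic, p_value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs a non-zero total")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the upper tail of the chi-square
    # distribution equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Exon 15 versus other PKD1 locations: this study (21 of 62) against
# the PKDB figure quoted in the text (181 of 873).
study_exon15, study_total = 21, 62
pkdb_exon15, pkdb_total = 181, 873

statistic, p_value = chi_square_2x2(
    study_exon15, study_total - study_exon15,
    pkdb_exon15, pkdb_total - pkdb_exon15,
)

if __name__ == "__main__":
    print(f"chi-square = {statistic:.2f}, p = {p_value:.3f}")
```

Under these assumptions the p-value comes out below 0.05, consistent with the significance level reported in the text.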
A number of sequence changes in the 5' duplicated region of PKD1 were also present in the homologous loci, such as the mutations NM_001009944.2: c.7288C>T (p.Arg2430*) and c.8614delA (p.Ile2872Serfs*3), and four polymorphisms, NM_001009944.2: c.1849+14_1849+26delTGGTGGGTGGTGG, c.8087T>G (p.Leu2696Arg), c.8681_8689delCCAACTCCG (p.Ala2894_Ser2896del) and c.9506G>A (p.Arg3169Gln). The nucleotide sequences of PKD1 exhibiting these changes were identical to at least one of the homologous copies on chromosome 16. A similar phenomenon has also been observed in other ethnic groups [39,40], notably in a population of 41 unrelated Thai and six unrelated Korean families with ADPKD studied by Phakdeekitcharoen et al. A possible explanation is that gene conversion has occurred between PKD1 and its homologous loci [41,42]. Nevertheless, variations of this kind should be interpreted carefully when used in clinical diagnosis.
Because of the high prevalence of polymorphisms and private mutations, particularly in PKD1, it is difficult to determine whether a specific genetic change is a mutation or a polymorphism. In the present study, all novel variations and previously unclassified variations were first checked in family members and unrelated normal controls. The pathogenic potential of the remaining variations was then analyzed by web-based software applications. SIFT, PolyPhen-2, NNSplice, and NetGene2 were used with default settings. The GMS was used to score the GD and GV of each substitution, a method now automated in the program Align-GVGD. Together, these analyses identified an additional six probably pathogenic mutations, increasing the overall detection rate to 52.3%, which demonstrates the utility of bioinformatic evaluation of gene variations in PKD genes.
DHPLC is a well-known method for detecting heterozygous variants. However, DNA fragments with variants located in GC-rich regions or near the 5' and 3' ends tend not to generate recognizable elution peaks, which could lead to false-negative results [43]. Therefore, direct sequencing of both PKD1 and PKD2 in patients, especially in mutation-negative cases, could be one of the most efficient methods for mutation detection. Large deletions and duplications have been reported to account for 1%-3% of the mutations in PKD1 patients [44,45]; however, if a rearrangement extends beyond the limits of the large amplicons, only the wild-type allele will be amplified by PCR. For mutations of this kind, multiplex ligation-dependent probe amplification may serve as a more reliable detection assay [46,47]. Considering the structural complexity of the PKD1 gene and the diversity of mutation types, a combination of multiple methods rather than a single assay is highly recommended to meet patients' demand for a complete molecular genetic diagnosis of ADPKD.

Conclusions
The present mutation analysis of PKD1 and PKD2 genes in Chinese Hans with ADPKD may contribute to a better understanding of the genetic diversity between different ethnic groups and enrich the mutation database. In addition, evaluating the pathogenic potential of novel variations should facilitate the clinical diagnosis and genetic counseling of the disease, particularly through the direct gene approach.
Additional file 2: Supplementary Table S2. Atypical splicing prediction. Atypical splicing prediction of novel synonymous variations, intronic changes, and missense variations was performed using NNSplice and NetGene2.
Additional file 3: Supplementary Table S3. Summary of PKD1 and PKD2 Genetic Variations (Polymorphisms, Probable Polymorphisms). A brief summary of polymorphisms and probable polymorphisms detected from patients, unaffected family members and normal controls in this study.