MLPA identification of dystrophin mutations and in silico evaluation of the predicted protein in dystrophinopathy cases from India

Background Duchenne muscular dystrophy (DMD) and Becker muscular dystrophy (BMD) are X-linked recessive disorders caused by mutations in the DMD gene. The aim of this study was to predict the effect of gene mutations on the dystrophin protein and study its impact on clinical phenotype. Methods In this study, 415 clinically diagnosed patients were tested for mutations by Multiplex ligation dependent probe amplification (MLPA). Muscle biopsy was performed in 34 patients with negative MLPA. Phenotype-genotype correlation was done using PROVEAN, hydrophobicity and eDystrophin analysis. We have utilized bioinformatics tools in order to evaluate the observed mutations both at the level of primary as well as secondary structure. Results Mutations were identified in 75.42% cases, of which there were deletions in 91.6% and duplications in 8.30%. As per the reading frame rule, 84.6% out-of frame and 15.3% in-frame mutations were noted. Exon 50 was the most frequently deleted exon and the exon 45–52 region was the hot-spot for deletions in this cohort. There was no correlation noted between age of onset or creatine kinase (CK) values with extent of gene mutation. The PROVEAN analysis showed a deleterious effect in 94.5% cases and a neutral effect in 5.09% cases. Mutations in exon 45–54 (out of frame) and exon 46–54 (in-frame) regions in the central rod domain of dystrophin showed more negative scores compared to other domains in the present study. Hydrophobicity profile analysis showed that the hydrophobic regions I & III were equally affected. Analysis of deletions in hinge III hydrophobic region by the eDystrophin programme also predicted a hybrid repeat seen to be associated with a BMD like disease progression, thus making the hinge III region relatively tolerant to mutations. Conclusions We found that, while the predictions made by the software utilized might have overall significance, the results were not convincing on a case by case basis. 
This reflects the inadequacy of the currently available tools and also underlines the possible inadequacy of MLPA to detect other minor mutations that might enhance or suppress the effect of the primary mutation in this large gene. Next Generation Sequencing or targeted Sanger sequencing on a case by case basis might improve phenotype-genotype correlation. Electronic supplementary material The online version of this article (doi:10.1186/s12881-017-0431-6) contains supplementary material, which is available to authorized users.


Background
Duchenne muscular dystrophy (DMD) is the most common and most severe X-linked recessive neuromuscular degenerative disorder, affecting 1 in 3500 live male births [1]. It is clinically characterized by progressive muscle weakness, calf hypertrophy and elevated creatine kinase (CK) levels; patients typically become wheelchair bound before the age of 12 and die of respiratory failure. Becker muscular dystrophy (BMD) is a milder form with an incidence of 1 in 20,000 male births [2,3]. Both are caused by defects in the DMD gene, located at Xp21.2, which encodes dystrophin, a 427 kDa cytoskeletal protein. The dystrophin gene is the largest human gene; it consists of 79 exons and encodes a 14.6 kb mRNA expressed mainly in skeletal muscle, heart and brain [4,5]. Clinical severity depends on whether the reading frame is maintained. Disruption of the reading frame (out-of-frame mutation), leading to a prematurely truncated, nonfunctional dystrophin, usually gives rise to a severe DMD phenotype. In-frame mutations, which retain the open reading frame (ORF), code for a semi-functional dystrophin and are predicted to be associated with a mild BMD phenotype. There are exceptions to this general rule, however, as some patients with severe DMD carry in-frame mutations [5][6][7]. About 65% of DMD gene mutations are intragenic deletions, 10-15% are duplications and the remainder are point mutations [8]. Deletions are mostly clustered in two hotspots, at either the proximal (towards the 5' end) or the distal (towards the 3' end) part of the gene [9]. Therapeutic approaches have also been designed to convert the Duchenne phenotype into the milder Becker phenotype by restoring dystrophin expression via exon-skipping strategies [10,11]. As no effective treatment is available for DMD/BMD, an accurate genetic diagnosis for prenatal screening is crucial. Several techniques are available to identify mutations in the dystrophin gene.
The multiplex ligation-dependent probe amplification (MLPA) technique can determine chromosomal DNA copy number changes for each exon in a single multiplex PCR-based reaction. MLPA covers all 79 exons of the DMD gene and detects deletion or duplication of one or more exons [12].
In this study, phenotype-genotype correlation was performed based on the mutational findings of 415 clinically suspected DMD/BMD patients at our centre in southern India. This paper is an attempt to understand the impact of mutations on the structure of the dystrophin protein using bioinformatics tools.
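The reading-frame rule described in the Background can be illustrated with a short sketch: a contiguous exon deletion is in-frame when the total number of deleted base pairs is a multiple of three. The exon lengths below are assumed illustrative values, not taken from this paper, and should be verified against the reference DMD transcript before any real use; the function names are hypothetical.

```python
# Assumed exon lengths in base pairs for DMD exons 45-54 (illustrative only;
# verify against the reference transcript before relying on them).
EXON_LENGTHS_BP = {
    45: 176, 46: 148, 47: 150, 48: 186, 49: 102,
    50: 109, 51: 233, 52: 118, 53: 212, 54: 155,
}


def deleted_length(first_exon, last_exon, exon_lengths=EXON_LENGTHS_BP):
    """Total number of coding base pairs removed by deleting first..last."""
    return sum(exon_lengths[e] for e in range(first_exon, last_exon + 1))


def is_in_frame(first_exon, last_exon, exon_lengths=EXON_LENGTHS_BP):
    """True if the deletion removes a multiple of 3 bp (reading frame kept)."""
    return deleted_length(first_exon, last_exon, exon_lengths) % 3 == 0


def predicted_phenotype(first_exon, last_exon):
    """Phenotype expected under the reading-frame rule (exceptions exist)."""
    if is_in_frame(first_exon, last_exon):
        return "BMD (in-frame)"
    return "DMD (out-of-frame)"


if __name__ == "__main__":
    for first, last in [(45, 47), (45, 48), (51, 51), (45, 54), (46, 54)]:
        print(f"del {first}-{last}: {predicted_phenotype(first, last)}")
```

Under these assumed lengths, the exon 45-47 and 46-54 deletions come out in-frame and the exon 51 and 45-54 deletions out-of-frame, matching how those mutations are described in this paper. The rule is only a first approximation, since in-frame mutations can still produce a severe phenotype.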

Subjects
Clinically suspected cases (n = 415) of DMD/BMD referred for genetic testing as a part of diagnosis from August 2013 to July 2015 were included in this study. Diagnosis was based on clinical presentation, elevated CK level, pattern of inheritance and muscle biopsy. Muscle biopsy was performed in thirty-four patients in whom the genetic analysis was negative. The study was approved by the Institutional Ethics Committee and written informed consent was obtained from all patients.

Genetic testing by Multiplex ligation-dependent probe amplification
Blood samples were collected in EDTA vacutainers, and genomic DNA was extracted by the salting out method and stored at -20°C until tested [13]. The MLPA reaction was carried out to screen all exons of the dystrophin gene using the commercially available SALSA MLPA P034 and P035 probe sets (MRC Holland, Netherlands). The procedure was performed according to the manufacturer's instructions [12]. Amplified products were separated using an ABI 3500 XL Genetic Analyzer and data were analyzed with Coffalyser software. Normal healthy individuals were used as controls and included in every run.
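As a rough illustration of how per-exon MLPA signals translate into deletion or duplication calls, the sketch below classifies a normalized probe ratio (patient signal divided by control signal). It is not the Coffalyser algorithm. It assumes a male patient with a single X chromosome, for whom a deleted exon gives a ratio near 0 and a duplicated exon a ratio near 2; the cutoffs 0.3 and 1.7 and the function names are hypothetical.

```python
def call_copy_number(ratio, deletion_max=0.3, duplication_min=1.7):
    """Classify one normalized probe ratio for a hemizygous (male) sample.

    The cutoffs are hypothetical illustration values, not validated thresholds.
    """
    if ratio < 0:
        raise ValueError("a normalized ratio cannot be negative")
    if ratio <= deletion_max:
        return "deletion"
    if ratio >= duplication_min:
        return "duplication"
    return "normal"


def call_exons(ratios_by_exon):
    """Map each exon number to its call, given {exon: ratio}."""
    return {exon: call_copy_number(r) for exon, r in ratios_by_exon.items()}


def affected_exons(ratios_by_exon, kind):
    """Sorted exon numbers whose call equals kind ('deletion'/'duplication')."""
    calls = call_exons(ratios_by_exon)
    return sorted(exon for exon, call in calls.items() if call == kind)


if __name__ == "__main__":
    # Made-up ratios for a hypothetical patient, for demonstration only.
    example = {44: 1.02, 45: 0.0, 46: 0.03, 47: 0.01, 48: 0.97}
    print(affected_exons(example, "deletion"))
```

In practice a single-exon call would normally be confirmed by an independent method before being reported.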

Muscle biopsy
Open muscle biopsy was performed in 34 patients under local anaesthesia after obtaining informed consent. Tissue samples were immediately frozen in isopentane precooled in liquid nitrogen. Serial 6-μm thick sections were cut using a cryostat and stained with routine histological stains (hematoxylin-eosin (HE) and modified Gomori trichrome) and enzyme histochemical stains (NADH-tetrazolium reductase, succinic dehydrogenase, cytochrome oxidase, and ATPase at pH 9.5 and 4.6).
Immunohistochemical staining was carried out using monoclonal antibodies against dystrophin (dys1, dys2, dys3) and sarcoglycans (α, β, γ, δ) as primary antibodies and HRP-conjugated Novolink polymer as secondary. All sections were compared with control samples (from patients with conditions other than muscular dystrophy) labelled in parallel.

Bioinformatics analysis
SIFT, PolyPhen-2, Mutation Assessor, MAPP, PANTHER, Condel and several others are computational methods based on evolutionary principles that predict the effect of coding variants on protein function. These tools focus only on single amino acid substitutions, whereas the PROVEAN (Protein Variation Effect Analyzer) tool predicts the functional impact of all classes of protein sequence variation: not only single amino acid substitutions, but also insertions, deletions and multiple substitutions (http://provean.jcvi.org). The PROVEAN tool was applied to generate a PROVEAN score for each variant. This score can be used as a measure to distinguish disease variants from common polymorphisms. The tool was used in this study to predict the functional effects of protein sequence variations (deletion/duplication) [14].
Hydrophobicity profile analysis was also carried out. The dystrophin protein sequence was obtained from GenBank (http://www.ncbi.nlm.nih.gov/genbank) and imported into BioEdit software 7.0.1. Kyte-Doolittle scale mean hydrophobicity profile analysis was performed to map the hydrophobic regions of the dystrophin protein and to determine whether mutations in these regions have a role in the pathogenesis of DMD [15].
The eDystrophin database (http://edystrophin.genouest.org) was used to analyze the consequences of in-frame mutations on the dystrophin protein in the BMD patients in this cohort. It provides a three-dimensional structural model of the mutation site and shows the changes in the interacting partners of the protein caused by the mutation [16].
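A Kyte-Doolittle mean hydrophobicity profile of the kind mentioned above is a sliding-window average of per-residue hydropathy values. The sketch below is a minimal standalone version, not the BioEdit implementation; the window size of 13 and the function name are assumptions, since the window used in this study is not stated.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobicity_profile(sequence, window=13):
    """Mean hydropathy of every full window along a protein sequence.

    Returns len(sequence) - window + 1 values; element i is the mean over
    residues i .. i + window - 1. The default window is an assumed value.
    """
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]  # KeyError on unknown residue
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    # Arbitrary toy peptide, not a dystrophin fragment.
    profile = hydrophobicity_profile("MLWWEEVEDCYEREDVQKKTFTKW", window=7)
    print([round(v, 2) for v in profile])
```

Windows with a positive mean are more hydrophobic on this scale; comparing the profiles of the wild-type and mutant sequences shows whether a deletion removes or shifts such a region.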

Clinical findings
A total of 415 clinically suspected cases of DMD/BMD were subjected to MLPA testing. Most of the patients had delayed milestones and difficulty in climbing stairs and rising from the floor. The mean ages of onset for DMD and BMD were 4.40 ± 2.30 years and 12.53 ± 6.55 years, respectively. The mean age at presentation was 9.72 ± 6.36 years and the mean creatine kinase value was 11218.9 ± 9799 U/L. A family history of DMD/BMD was observed in 18.5% of cases. Contractures were common and were observed in 64.6% of cases. Thirty patients in this cohort had become wheelchair dependent, at an average age of 9.5 years. Intelligence quotient (IQ) assessment performed in 30 patients using the Binet-Kamat scale showed average intelligence in 15 (50%), dull normal intelligence in 6 (20%), mild mental retardation in 3 (10%) and borderline intelligence in 3 (10%).

PROVEAN analysis
The possible functional effect of sequence variations on the dystrophin protein was tested for 313 cases by PROVEAN analysis. The output consisted of a PROVEAN score and a prediction of 'deleterious' or 'neutral', based on the magnitude of the score relative to a set threshold of -2.5. A deleterious effect was observed in 297 (94.5%) cases and a neutral effect in 16 (5.09%) cases. Further examination of the neutral effect mutations, which included both out-of-frame and in-frame mutations, revealed them to be either exon 51 deletion/duplication or duplications in the exon 2-7 and 2-11 regions. Figure 5 shows a graph of PROVEAN score plotted against age of onset.
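The thresholding step described above can be sketched as follows. The -2.5 cutoff is the one stated in this section, and the sketch assumes the usual PROVEAN convention that a score equal to or below the cutoff is called deleterious. The function names are hypothetical and the example scores are made up.

```python
PROVEAN_CUTOFF = -2.5  # threshold stated in this section


def classify_provean(score, cutoff=PROVEAN_CUTOFF):
    """Return 'deleterious' if score <= cutoff, otherwise 'neutral'."""
    return "deleterious" if score <= cutoff else "neutral"


def summarize(scores, cutoff=PROVEAN_CUTOFF):
    """Count predictions and express each count as a percentage of the total."""
    counts = {"deleterious": 0, "neutral": 0}
    for score in scores:
        counts[classify_provean(score, cutoff)] += 1
    total = len(scores)
    percent = {
        label: (100.0 * n / total if total else 0.0)
        for label, n in counts.items()
    }
    return counts, percent


if __name__ == "__main__":
    # Made-up scores for demonstration only, not data from this cohort.
    counts, percent = summarize([-310.2, -45.7, -2.5, -1.1, 0.4])
    print(counts, percent)
```

Large deletions tend to produce strongly negative scores, so the binary call is much coarser than the score itself.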

Hydrophobicity profile analysis
Kyte-Doolittle scale mean hydrophobicity profile analysis was performed for 48 cases with in-frame mutations. Mutational disruption of hydrophobic regions I and III was found in 7 cases each, so these two regions were equally affected in this group. Table 2 lists the mutations in the hydrophobic regions of dystrophin.

eDystrophin analysis
Using the eDystrophin database, we analyzed the consequences of in-frame mutations on dystrophin protein structure for 44 available mutations. On 3D structure modelling of the dystrophin protein, the typical filamentous structure of dystrophin was retained in 12 cases and was not maintained in 25 cases. Mutations between exons 1 and 30 did not affect the protein structural domains (7 out of 44 cases). Figure 6 depicts the effect of the most frequent in-frame mutation in our sample, the exon 45-47 deletion.

Fig. 3 Transversely cut skeletal muscle tissue shows dystrophic features on HE staining in both DMD and β-sarcoglycanopathy (Fig I & Q) as against normal muscle tissue (Fig A). Immunohistochemically, antibodies against dystrophin (dys1,2,3) and sarcoglycans (α,β,γ,δ) show preserved expression along the membrane in all the fibres (Fig B-H) in normal muscle tissue, while total loss of expression for dystrophin (Fig J,K,L) and preserved expression for sarcoglycans (Fig M,N

Discussion
This study presents a retrospective analysis of genetic testing by MLPA for 415 clinically suspected DMD/BMD patients at our centre in southern India. MLPA is a rapid and highly sensitive technique used to detect deletions and duplications in the DMD gene [17][18][19][20][21]. In this cohort, the overall detection rate by MLPA was 75.42%. Our findings are comparable to those of Wang et al. [22], who reported a mutation rate of 72.5% in the Chinese population. The present study showed deletions in 91.6% of cases and duplications in 8.30% of cases in the dystrophin gene. Deletions were more common than duplications, similar to the frequencies reported from other parts of India [23][24][25][26][27]. The reported deletion rates are 40.7% in Pakistani, 66.25% in Chinese, 45.5% in Korean and 36% in Taiwanese patients, suggesting possible variation among populations [22][28][29][30]. The duplications in our cases mainly involved larger fragments and lay more towards the distal part of the gene, unlike in other populations [22]. A random age distribution was observed in this cohort, i.e. there was no correlation between the extent of deletion/duplication or the position of the mutation and the age of onset of clinical symptoms. This finding was similar to that of the Dubowitz study, in which no correlation could be drawn between age of onset or severity and the extent of mutation [31]. Muscle biopsy was undertaken for patients who tested negative by MLPA. Immunohistochemically, the diagnosis of DMD was established for patients with complete absence of staining along the sarcolemma. However, BMD patients showed heterogeneous dystrophin expression ranging from reduced patchy staining to normal staining on IHC [32][33][34]. The dystrophin-glycoprotein complex is responsible for stabilizing the muscle fiber, and a perturbation in any of its components may result in an overlapping clinical presentation.
Six patients with suspected DMD showed normal dystrophin labelling but absent sarcoglycan expression. Immunohistochemistry thus remains the gold standard method for diagnosing muscular dystrophies [24]. IHC should be considered to detect dysfunctional dystrophin expression when genetic testing results are negative.

Genotype-Phenotype correlation
Age of onset, CK values, age at wheelchair dependence and IQ score were evaluated in this study to define the correlation between genotype and clinical phenotype. Patients who lost ambulation at an average age of 9.5 years had deletions in the exon 45-55 region of the DMD gene (n = 30). Lower IQ scores were noted largely in patients who had distal gene deletions. This was in keeping with expectation, as the full-length isoform Dp427 is minimally expressed in the brain [35]. The dystrophin isoforms Dp140 and Dp71, which are highly expressed in the brain, lack the proximal exons. The role of dystrophin in the brain remains unclear; however, mutations at the 3' end of the gene have been associated with compromised brain function. Ricotti et al. [36] observed that mutations disrupting the isoforms Dp140 and Dp71 are more frequently associated with lower IQ scores. No correlation was noted between CK values and gene mutation, as this was a cross-sectional study [37].
The PROVEAN analysis predicts the effect of a mutation based on the changed amino acid sequence of the mutated dystrophin protein. Mutations in the exon 45-54 (out-of-frame) and exon 46-54 (in-frame) regions in the central rod domain of dystrophin showed more negative scores compared to other domains in the present study. Previous reports demonstrated that the phosphorylation sites of dystrophin present within the central rod domain, including T2621, which is encoded by exon 53, might affect the structure of the N-terminal domain. Dystrophin upon phosphorylation is believed to undergo a conformational change in the N-terminal actin-binding domain, thereby enhancing its affinity for myofibrillar actin [38,39]. Actin also binds the central rod domain encoded by exons 31-45, which is located between spectrin-type repeats 11-17 [40]. This reconfirms the role of the rod domain in dystrophin function [41]. Dystrophin interacts with integral membrane proteins to form the dystrophin-glycoprotein complex (DGC). The role of the DGC is to stabilize the sarcolemma and protect the muscle fibers from long-term damage. The hydrophobic regions of dystrophin play an important role in maintaining stability and interaction with other proteins. There are four hydrophobic regions in dystrophin, coded by exons 3-6 (region I), 42 (region II), 51 (region III) and 65-68 (region IV), which are found in the calponin homology CH2 domain of the actin-binding domain (ABD), spectrin-type repeat 16, hinge III and the EF-hand domain, respectively. Liang et al. [16] observed that mutational disruption of hydrophobic regions I, II and IV directly impairs DGC function, which leads to a severe DMD phenotype, whereas disruption of region III leads to a less severe BMD phenotype. Carsana et al. [42] demonstrated that an in-frame deletion of the hinge region in the distal rod domain shows a milder phenotype compared with deletions that do not include the hinge III region. Further analysis by the PROVEAN programme showed that deletion of the hinge III region has a more negative score compared to deletions that do not include the hinge III region. This suggests that the clinical severity of BMD may be determined by the presence or absence of the hinge III region in the dystrophin protein. However, all patients (n = 12) with exon 51 deletion/duplication, corresponding to region III, with age of onset ranging from 1-8 years, had a severe DMD phenotype as predicted by the reading frame rule.

Table 2 Hydrophobic region mutations identified in this cohort by Kyte-Doolittle scale mean hydrophobicity profile analysis using BioEdit software
Hydrophobic region | No. of cases with in-frame mutation (n = 48)
Not involved | 33
Involved | 15
Region I | 7
Region II | 1
Region III | 7
Region IV | 0
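The exon-to-region mapping given above (region I: exons 3-6, region II: exon 42, region III: exon 51, region IV: exons 65-68) lends itself to a simple interval-overlap check for any contiguous exon deletion or duplication. The sketch below is only an illustration of that lookup; the function name is hypothetical.

```python
# Hydrophobic regions of dystrophin and the exons coding for them,
# as listed in the text: (first exon, last exon), both inclusive.
HYDROPHOBIC_REGIONS = {
    "I": (3, 6),
    "II": (42, 42),
    "III": (51, 51),
    "IV": (65, 68),
}


def affected_hydrophobic_regions(first_exon, last_exon):
    """Names of the hydrophobic regions overlapped by exons first..last."""
    if first_exon > last_exon:
        raise ValueError("first_exon must not exceed last_exon")
    return [
        name
        for name, (start, end) in HYDROPHOBIC_REGIONS.items()
        if first_exon <= end and last_exon >= start  # closed-interval overlap
    ]


if __name__ == "__main__":
    for first, last in [(45, 47), (51, 51), (3, 7), (45, 52)]:
        regions = affected_hydrophobic_regions(first, last)
        print(f"exons {first}-{last}: {regions or 'not involved'}")
```

A mutation that overlaps none of the four intervals corresponds to the 'Not involved' row of Table 2.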
Dystrophin is a large cytoskeletal protein comprising four domains. The large central rod domain has 24 repeating units similar to spectrin-like repeats. Each repeat is a triple coiled-coil structure made up of three helices with a heptad pattern of amino acids [43,44]. This filamentous protein acts as a scaffold for several interacting partners and also provides resistance to the stress of muscle contraction. Any mutation altering this structure of dystrophin might be expected to affect its function along with that of its binding partners. The eDystrophin programme provides a computational model for each in-frame mutation and shows whether an approximate 3D filamentous structure is reconstituted (hybrid repeat) or a more deleterious structure (fractional repeat) is formed. Nicholas et al. [44] reported that differences in the structure of the mutant dystrophin protein may be responsible for clinical heterogeneity in BMD patients. They observed earlier wheelchair dependency and earlier development of cardiomyopathy in patients with the exon 45-47 deletion (fractional repeat) compared to the exon 45-48 deletion (hybrid repeat). A fractional repeat has slower refolding dynamics and higher molecular surface hydrophobicity compared to a hybrid repeat. In this study, the most prevalent in-frame deletions observed were the exon 45-47 deletion, which was associated with an age of onset of 4-20 years, and the exon 45-48 deletion, which was associated with an age of onset of 5-20 years. Analysis of hinge III deletion in the eDystrophin programme also predicted retention of the typical filamentous structure of dystrophin (hybrid repeat). Hybrid repeat reconstitution depends on exon phasing, and though the presence of a hybrid repeat does not restore dystrophin function completely, it results in a more functional protein compared to a fractional repeat [15]. Exon phasing, if considered along with restoration of the reading frame for exon-skipping therapy, might result in improved clinical outcome.
To assess the effect of mutation on clinical severity, we correlated the pathogenicity score with the age of onset of clinical symptoms, primarily observed muscle weakness. In both DMD and BMD patients, no definite correlation was found between sequence variation, as assessed by PROVEAN score, and clinical symptoms. In this cohort, a 'neutral effect' was predicted both in patients with exon 51 deletion/duplication, which would produce a truncated protein, and in patients with duplications in the exon 2-11 region, where the entire amino acid sequence is disturbed. We hypothesize that the milder disease progression seen despite a large predicted 'out of frame' mutation in the proximal part of the protein could be due to compensatory changes in the downstream region. Further, the possibility of false positive deletion calls due to variations at the site of primer binding cannot be ruled out. Such variants, which cannot be detected by MLPA, should be further evaluated by sequencing.
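The comparison of pathogenicity score with age of onset can be illustrated with a rank correlation. The paper does not state which correlation statistic was used, so the choice of Spearman's coefficient here is an assumption, as are the function names; the values in the usage example are made up and are not data from this cohort.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Made-up PROVEAN scores and ages of onset (years), for illustration only.
    scores = [-320.5, -150.2, -45.0, -2.1, -610.8]
    onset = [4.0, 6.5, 3.0, 12.0, 5.0]
    print(round(spearman(scores, onset), 3))
```

A coefficient close to zero would be consistent with the absence of a definite correlation reported here, although a significance test would still be needed on real data.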

Conclusion
In this study, the mutational spectrum of patients at this centre was compared with that of global populations. Our data reiterate that muscle biopsy followed by immunohistochemistry should be considered only when genetic test results are negative. The phenotype-genotype correlation revealed that the clinical severity of BMD depends, to some extent, on the site and type of deletion. It also indicates that the presence of the central rod domain plays an important role in dystrophin function and in the disease progression of DMD/BMD. Identification and characterization of dystrophin domains and their binding partners is important for understanding the pathways involved, which in turn might help in devising treatments for this devastating disorder. An accurate genetic diagnosis is essential for genetic counselling and for the patient's treatment because therapies are mutation-specific. It may be advisable to carry out targeted sequencing to detect point mutations or any additional variants that may affect disease severity.