Targeted next generation sequencing with an extended gene panel does not impact variant detection in mitochondrial diseases

Background: Since the advent of next generation sequencing (NGS), several studies have tried to evaluate the relevance of targeted gene panel sequencing and whole exome sequencing for the molecular diagnosis of mitochondrial diseases. Comparing these strategies is extremely difficult. A recent study analysed a cohort of patients affected by a mitochondrial disease using an NGS approach based on a targeted panel of 132 genes. This strategy identified the causative mutations in 15.2% of cases. Meanwhile, the number of novel genes responsible for respiratory chain deficiency is increasing very rapidly.
Methods: In order to determine the impact of larger panels used as a first screening strategy on the success of molecular diagnosis, we analysed a cohort of 80 patients affected by a mitochondrial disease, first by NGS screening of mitochondrial DNA (mtDNA) and then with a targeted mitochondrial panel of 281 nuclear genes.
Results: Pathogenic mtDNA abnormalities were identified in 4.1% (1/24) of children and 25% (14/56) of adult patients. The remaining 65 patients were analysed with our targeted mitochondrial panel, which gave an identification rate of 21.7% (5/23) in children versus 7.1% (3/42) in adults.
Conclusions: Our results confirm that larger gene panels do not improve the diagnostic yield in mitochondrial diseases because of (i) their very high genetic heterogeneity, (ii) the ongoing discovery of novel genes and (iii) mutations in genes apparently not related to mitochondrial function that lead to secondary respiratory chain deficiency.
Electronic supplementary material: The online version of this article (10.1186/s12881-018-0568-y) contains supplementary material, which is available to authorized users.


Background
Mitochondrial disorders (MD) are a group of rare diseases characterized by an impairment of the mitochondrial respiratory chain (RC), with a prevalence of 1:5000 live births [1]. Deficiency of the mitochondrial RC impairs the production of ATP by oxidative phosphorylation (OXPHOS), which provides energy in each cell. The diagnosis of such diseases is challenging because of extreme phenotypic heterogeneity, variable age of onset and different modes of inheritance. Some of these diseases affect a single specific organ (such as Leber Hereditary Optic Neuropathy, LHON), but the majority involve multiple organ systems.
The clinical spectrum is very wide, from mild clinical features such as Chronic Progressive External Ophthalmoplegia (CPEO) to very severe neurologic impairment such as Leigh Syndrome (LS). The common clinical features of mitochondrial diseases include ptosis, ophthalmoplegia, myopathy, cardiomyopathy, sensorineural deafness, optic atrophy, pigmentary retinopathy and diabetes mellitus. Encephalopathy, epilepsy, cerebellar ataxia, axonal neuropathy, migraine, stroke-like episodes, cognitive impairment and movement disorders are mainly found in patients presenting with neurological symptoms.
Most of the proteins required for structure, biogenesis and function of mitochondria are encoded by nuclear genes (nDNA) but 13 essential subunits of RC complexes are encoded by the mitochondrial genome.
Human mitochondrial DNA (mtDNA) is a circular double-stranded molecule of 16,569 base pairs, encoding 13 respiratory chain subunits, 22 tRNAs and 2 rRNAs. MtDNA is present in multiple copies within the mitochondria of each cell [2]. Although mitochondria possess their own genome, they depend on nuclear genes to encode the proteins required for their biogenesis, including mtDNA maintenance, mitochondrial dynamics (fusion and fission), coenzyme Q10 biosynthesis, and the assembly, activity and turnover of the respiratory chain complexes. More than 250 nuclear genes have already been linked to mitochondrial disorders, and the list of candidate genes is growing, as over 1500 genes have been identified as controlling mitochondrial structure and function [3,4]. Finding the pathogenic variants by molecular genetic testing confirms the diagnosis and allows the geneticist to provide genetic counselling. However, the responsible genes remain to be identified in most patients suspected of having a mitochondrial disease. Recently, Next Generation Sequencing (NGS) has improved the efficiency of mutation discovery and made routine molecular diagnosis of such diseases faster and less costly [5,6].
In a recent study, Legati and colleagues analysed a cohort of patients affected by a mitochondrial disease, mostly characterized by early onset, using a combined NGS approach based on a targeted gene panel and whole exome sequencing (WES) [6]. Their custom-made targeted mitochondrial panel included 132 genes and identified the causative mutations in 15.2% of cases. The authors then identified the causative molecular abnormalities in 6 out of 10 patients tested by WES. In order to determine the impact of larger panels on the success of molecular diagnosis, we analysed a cohort of 80 patients affected by a mitochondrial disorder. A mtDNA disease was identified in 15 patients (18.7%), including 1 child out of 24 (4.1%) and 14 adults out of 56 (25%). In a second step, the remaining 65 patients were analysed with a targeted mitochondrial panel of 281 genes, giving a detection rate of 12.3%, with 21.7% (5/23) in children and 7.1% (3/42) in adult patients.
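The two-step design above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming only the counts quoted in the text; the function names are illustrative and not part of the study's pipeline. It recomputes each detection rate and verifies that the patients left after mtDNA screening match the denominators of the nuclear panel step. Note that the text appears to truncate percentages to one decimal (1/24 is given as 4.1%, whereas rounding gives 4.2%).

```python
# Counts quoted in the text: (patients with a molecular diagnosis, patients tested).
REPORTED = {
    "mtDNA screening, children": (1, 24),
    "mtDNA screening, adults": (14, 56),
    "mtDNA screening, all": (15, 80),
    "nuclear panel, children": (5, 23),
    "nuclear panel, adults": (3, 42),
    "nuclear panel, all": (8, 65),
}


def detection_rate(solved: int, tested: int) -> float:
    """Return the detection rate as a percentage (not rounded)."""
    if tested <= 0:
        raise ValueError("tested must be a positive number of patients")
    return 100.0 * solved / tested


def remaining_after_mtdna(tested: int, mtdna_positive: int) -> int:
    """Patients passed on to the nuclear panel after mtDNA screening."""
    return tested - mtdna_positive


if __name__ == "__main__":
    for label, (solved, tested) in REPORTED.items():
        rate = detection_rate(solved, tested)
        print(f"{label}: {solved}/{tested} = {rate:.2f}%")
```

Keeping the raw counts next to the percentages makes it easy to see which denominator each rate refers to (80 patients for mtDNA screening, 65 for the nuclear panel).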

Clinical and biochemical investigation of patients
Eighty patients (38 males and 42 females), comprising 24 children (9 with age of onset < 1 year; median age of the cohort = 1 ± 3.3 years) and 56 adults (median age of the cohort = 58 ± 14.3 years), diagnosed with a mitochondrial disease on the basis of clinical, biochemical and histological data, were included in this cohort (Table 1). All had been referred to the National Centre of Mitochondrial Diseases (CHU Nice, France), which is also certified by the European Network of reference centers for rare neuromuscular diseases (EURO-NMD). Histological analysis was available in 63 cases out of 80, with muscle biopsies suggestive of a mitochondrial myopathy in 45 patients out of 63 (10/19 children and 35/44 adults). Based on the biochemical data obtained in muscle, liver or fibroblasts, or in several of these tissues, and available in the large majority of patients (59 out of 80), we identified isolated defects in complex I (n = 4; 2/21 children and 2/38 adults), complex II (n = 1; 1/21 child), complex III (n = 6; 6/38 adults) and complex IV (n = 3; 2/21 children and 1/38 adult), and multiple defects (n = 10; 6/21 children and 4/38 adults) (Table 1). Informed consent for diagnostic and research studies was obtained for all subjects in accordance with the Declaration of Helsinki protocols.

Molecular genetics

mtDNA analysis
Single mtDNA deletions and point mutations were identified using XL-PCR and NGS protocols, respectively [7]. The presence of mtDNA deletions was confirmed by Southern blot analysis [8].

Custom targeted panel analysis
We designed a custom panel of genomic regions corresponding to 281 genes, selected in 2016 because they were already known to be involved in mitochondrial disorders (NIH Genetic Testing Registry) or were candidate genes (Additional file 1: Table S1). We designed RNA probes to capture the transcribed sequences of the genes (exons and exon/intron junctions) with the Agilent SureSelect kit. One μg of genomic DNA was fragmented and adaptors were added in a single enzymatic step by the library builder (Thermo Fisher Scientific). The adaptor-tagged DNA library was purified and amplified. 750 ng of each library was hybridized with the SureSelect capture library overnight at 65°C. The resulting libraries were recovered using streptavidin beads and a post-capture PCR amplification was carried out. Libraries were pooled, and emulsion PCR, enrichment and loading of template-positive ion sphere particles were performed on an IonChef system. Ion PI chips V3 were sequenced on the Ion Proton, using the Ion PI Hi-Q sequencing kit. The sequences were aligned against the human reference sequence (GRCh37/hg19) using Torrent Suite Software 5.0.4. Variant calling was then performed using variant caller version 5.0.4.0. The variants were annotated and filtered by submitting them to Ion Reporter Software version 5.2. Filtering was carried out in a series of steps: variants with a minor allele frequency (MAF) < 1% in the 1000 genomes project or in the 5000 exomes European-American dataset (NHLBI ESP) were kept. We focused on predicted missense, frame-shift, stop-gain or stop-loss, and splice-site variants. For the variants remaining in the final list, we also checked the prediction score in Polyphen 2 (http://genetics.bwh.harvard.edu/pph2/), SIFT
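The frequency and consequence filters described above can be sketched as follows. This is an illustration only, not the Ion Reporter workflow used in the study: the `Variant` record, the consequence labels and the toy data are hypothetical, and the code assumes that a variant is kept only when no available population frequency reaches 1% (a variant absent from a database is treated as rare).

```python
from dataclasses import dataclass
from typing import List, Optional

MAF_THRESHOLD = 0.01  # 1% cut-off stated in the text

# Consequence classes retained in the text; the label spellings are hypothetical.
KEPT_CONSEQUENCES = {"missense", "frameshift", "stop_gain", "stop_loss", "splice_site"}


@dataclass(frozen=True)
class Variant:
    gene: str
    consequence: str
    maf_1000g: Optional[float] = None  # None when absent from the database
    maf_esp: Optional[float] = None


def is_rare(v: Variant, threshold: float = MAF_THRESHOLD) -> bool:
    """True when every available frequency is below the threshold (assumption)."""
    freqs = [f for f in (v.maf_1000g, v.maf_esp) if f is not None]
    return all(f < threshold for f in freqs)


def filter_variants(variants: List[Variant]) -> List[Variant]:
    """Keep rare variants with one of the retained predicted consequences."""
    return [v for v in variants if is_rare(v) and v.consequence in KEPT_CONSEQUENCES]


# Toy records with placeholder gene names, for illustration only.
TOY = [
    Variant("GENE_A", "missense", maf_1000g=0.0002),
    Variant("GENE_B", "synonymous"),
    Variant("GENE_C", "frameshift", maf_1000g=0.03, maf_esp=0.02),
    Variant("GENE_D", "splice_site", maf_esp=0.004),
]

if __name__ == "__main__":
    for v in filter_variants(TOY):
        print(v.gene, v.consequence)
```

In silico prediction scores would then be checked only for the variants that survive both filters, which keeps the manual review list short.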

Validation of variants identified by NGS
Variants identified by NGS were validated by Sanger sequencing. Coding regions with exon/intron junctions were amplified by PCR (primers available upon request). PCR products were sequenced using an ABI Prism 3100XL apparatus (Applied Biosystems). The chromatogram traces were analyzed using Sequencing Analysis software.

Analysis of the nuclear genome
We used a targeted gene panel to analyze the coding sequences as well as the exon/intron junctions of 281 genes in the remaining 65 patients who were negative for mtDNA screening. The regions were captured with 34,839 RNA probes corresponding to a total amount of Depending on the clinical and familial history, we focused on an autosomal recessive, dominant or X-linked mode of disease inheritance. If an autosomal recessive mode of inheritance was suspected, we specifically analyzed homozygous and compound heterozygous variants. When only one heterozygous variant was found, we performed an extended systematic analysis of NGS coverage in search of a second allelic variant, including genomic deletions. When the mode of inheritance was unknown or compatible with autosomal dominant or X-linked transmission, single heterozygous variants were also considered.
Four patients (N°6, 10, 13, 43) carried homozygous pathogenic variants within the AGK, DNAJC19, SDHAF1 and TYMP genes, whereas 2 (N°7, 15) carried compound heterozygous pathogenic variants within ETHE1 and TK2, respectively. Two patients (N°29, 68) carried one heterozygous causative mutation, within TWNK or OPA1, responsible for a dominant disease. Among the 10 different pathogenic variants identified, six corresponded to already known causative mutations. Four were predicted pathogenic variants in genes responsible for mitochondrial diseases. The phenotypes of these eight patients overlapped with the clinical presentations previously described for the corresponding genes, and segregation studies, available in 4 cases, were concordant with the pathogenicity of the identified variants (Table 3).
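The inheritance-based triage described above can be sketched in code. This is a simplified illustration under stated assumptions, not the procedure actually implemented in the study: the `Call` record, the mode labels and the gene names are hypothetical, and two heterozygous calls in the same gene are only flagged as possibly compound heterozygous, since phase can be established only by segregation analysis.

```python
from collections import Counter
from typing import Dict, Iterable, List, NamedTuple


class Call(NamedTuple):
    gene: str
    zygosity: str  # "hom", "het" or "hemi"


def triage(calls: Iterable[Call], mode: str) -> Dict[str, List[str]]:
    """Group genes by how they fit the suspected mode of inheritance.

    mode "AR": keep genes with a homozygous call or at least two heterozygous
    calls, and list genes with a single heterozygous call as needing a search
    for a second allele (for example a deletion seen in coverage data).
    Any other mode (dominant, X-linked, unknown): single heterozygous and
    hemizygous calls are considered as well.
    """
    calls = list(calls)
    hom = {c.gene for c in calls if c.zygosity == "hom"}
    hemi = {c.gene for c in calls if c.zygosity == "hemi"}
    het_counts = Counter(c.gene for c in calls if c.zygosity == "het")

    biallelic = hom | {g for g, n in het_counts.items() if n >= 2}
    single_het = {g for g, n in het_counts.items() if n == 1} - hom

    if mode == "AR":
        return {
            "candidates": sorted(biallelic),
            "search_second_allele": sorted(single_het),
        }
    return {
        "candidates": sorted(biallelic | single_het | hemi),
        "search_second_allele": [],
    }


# Toy calls with placeholder gene names, for illustration only.
TOY_CALLS = [
    Call("GENE_A", "hom"),
    Call("GENE_B", "het"),
    Call("GENE_B", "het"),
    Call("GENE_C", "het"),
    Call("GENE_D", "hemi"),
]

if __name__ == "__main__":
    print(triage(TOY_CALLS, "AR"))
    print(triage(TOY_CALLS, "unknown"))
```

Separating the "search for a second allele" list from the candidate list mirrors the text: a single heterozygous variant in a recessive context is not discarded but triggers an extended coverage analysis.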

Variants of uncertain significance (VUS)
We also identified VUS in 7 patients. One male adult patient (N°66) carried a novel hemizygous variant (c.893G > A; p.Arg298Gln) in the AIFM1 gene, located on the X chromosome (Table 4). Since the age of 60, the patient had had CPEO associated with dysphagia but without any other associated symptom (Table 1). His mother presented exactly the same clinical signs with the same age of onset, but she was not available for testing. At the age of 69, muscle biopsy revealed COX-negative fibers with a complex IV deficiency by spectrophotometry. Blue native PAGE analysis revealed an assembly defect or increased instability of complex IV. The c.893G > A variant is rare in the ExAC database (< 0.01%; 5/87658 tested alleles, including 2 in a hemizygous state in male individuals). It affects a highly conserved residue in the NADH-binding domain and is predicted to be probably pathogenic. Mutations in AIFM1 reported so far cause a progressive disorder affecting the muscles and the nervous system, with a more severe phenotype than the one presented by our patient [9]. Without further functional analyses, it will be difficult to definitively confirm the deleterious consequences of the identified variant. We identified 2 subjects carrying a heterozygous variant in KIF5A, which encodes a kinesin-like protein. Both patients presented with a sensory axonal polyneuropathy (Table 1). The first case, a 70-year-old patient (N°71), also had ataxia, hearing loss and cachexia. COX-negative fibers were found in muscle but without respiratory chain deficiency. The second case (N°77) was a 77-year-old patient who suffered from frontotemporal dementia and Paget disease. The 2 variants, c.1248A > T; p.Lys416Asn and c.2354A > G; p.Glu785Gly, respectively, are not found in the ExAC database and in silico analysis predicts them to be probably pathogenic (Table 4). The corresponding amino acids are located in the coiled-coil domain and are conserved across species.
Mutations in KIF5A have been described in a wide clinical spectrum, from hereditary spastic paraplegia (HSP) 10 to axonal neuropathy [10], and were recently implicated in an early-onset phenotype with severe myoclonus and evidence of mitochondrial dysfunction [11].
We also identified a novel heterozygous variant, c.4682G > A; p.Cys1561Tyr, in another kinesin family member (KIF1B) (Table 4). The patient (N°34) presented with an early-onset disease including frequent falls and lower limb weakness. At 16 years of age, he developed tetraparesis during an infectious episode, with progressive incomplete recovery. At 43 years of age, he had lower limb weakness with areflexia, and jaw muscle weakness. An electromyogram study showed evidence of anterior horn cell involvement and we found a complex III deficiency in muscle. The missense variant affects a highly conserved amino acid located in the PH (Pleckstrin Homology) domain, is rare in the ExAC database (< 0.01%) and is predicted to be deleterious (Table 4).
Heterozygous pathogenic variants in DNA2 have been identified in adult-onset mitochondrial myopathy with mtDNA instability [12]. We found a novel heterozygous DNA2 variant, c.2862G > C; p.Leu954Phe, in a patient presenting with cerebellar ataxia, myoclonic epilepsy, cataract and bilateral hearing loss (Patient N°67) (Table 4). Muscle biopsy revealed histological signs of mitochondrial myopathy with a low level of complex I and multiple mtDNA deletions. The variant, located in the helicase domain, is not found in the ExAC database and in silico analysis predicts it to be pathogenic (Table 4).
We identified 2 heterozygous variants in POLG2 in a 38-year-old patient (N°35) presenting with sensory neuropathy and multiple mtDNA deletions (Table 4). The c.1105A > G; p.Arg369Gly variant was first classified as functionally pathogenic [13]. It was then reclassified by ClinVar as a variant of uncertain significance. More recently, this variant has been identified in a homozygous state in control individuals and is now considered likely benign [14]. The second variant (c.390-2A > C) is predicted by in silico analysis to affect splicing (http://www.mutationtaster.org/). Mutations that affect splicing and POLG2 expression have previously been described in patients [15,16]. Thus the c.390-2A > C splice acceptor variant in POLG2 that we identified is consistent with previous POLG2 pathogenic variants responsible for mtDNA deletions and late-onset mitochondrial disease.
Two compound heterozygous variants, c.23315G > A; p.Arg7772Gln and c.15337G > A; p.Val5113Ile, were found within SYNE1 (spectrin repeat-containing nuclear envelope protein 1) in a 46-year-old patient (N°64) presenting with a cerebellar syndrome and CPEO. DNA samples from his parents were not available, but his asymptomatic daughter carried only one variant, c.15337G > A; p.Val5113Ile, which had been previously described (Table 4) [17]. There was mitochondrial aggregation with mtDNA deletions but no RC deficiency in the patient's muscle. The 2 variants are rare (< 0.01% in ExAC) and predicted to be deleterious. Disease-causing SYNE1 variants have previously been reported in a large clinical spectrum, with biallelic mutations responsible for the SCAR8 phenotype, which includes pure cerebellar atrophy, ataxia and dysarthria, with a variable age at onset of symptoms (6-50 years) [18].

Misannotated mutations and likely benign variants
In several samples, we found variants that have previously been annotated as pathogenic, but the current patients did not present with the symptoms attributed to those mutations. For instance, a known pathogenic variant, c.1987C > T; p.Arg663Cys, in MFN2, which encodes a mitochondrial GTPase mitofusin protein, was found in a 14-year-old patient (N°23) (Table 5) [19]. Another heterozygous variant c.1085C > T; p.Thr362Met in the same [20,21]. Neither patient, however, presented with peripheral neuropathy. Pathogenic rare variants in OPA3 have previously been shown to cause optic atrophy, with either autosomal dominant or autosomal recessive inheritance [22]. We identified a novel heterozygous variant in OPA3, c.229G > A; p.Ala77Thr, in a young patient (N°22) presenting with delayed motor development and refractory epilepsy with pyramidal and extrapyramidal syndrome (Table 5). This variant was inherited from his mother, suggesting recessive inheritance, but we could not find a second mutation. However, the absence of optic atrophy at 9 years of age is not consistent with a causative effect of the identified variant. In the last 3 cases (N°69, 59, 2), we identified possibly deleterious variants, in a heterozygous state, in the DGUOK, MLYCD and PC genes, which are responsible for recessive diseases (Table 5) [23-25]. The respective clinical presentations and the absence of associated deletions on the other allele allowed us to exclude their involvement in the disease.

Discussion
Since the advent of NGS, several studies have tried to evaluate the relevance of targeted gene panel sequencing and whole exome sequencing for the molecular diagnosis of mitochondrial diseases. To date, the responsible genes remain unknown in more than two thirds of patients affected by a mitochondrial disorder. This situation is due to the large genetic heterogeneity of these diseases. NGS has greatly improved the screening of the mitochondrial genome. Nevertheless, once mtDNA involvement has been excluded, only around 250 nuclear genes out of the 1500 potentially involved are known to date. Several studies have reported an NGS approach based on targeted gene panel sequencing, on WES or on both in cohorts of patients suspected of having a mitochondrial disorder [6,26-32]. Comparing these studies is extremely difficult. The success of molecular diagnosis is highly dependent on the quality of the clinical diagnosis and biochemical characterization. The number of patients reported in the different cohorts ranged from 24 to 148, with heterogeneous populations in terms of age, isolated or familial cases, biochemically proven respiratory chain deficiency or previous screening for specific sets of genes known to be associated with the phenotypes. The success rate varies from 8 to 24% when gene panels are used, the highest rate being obtained with a "MitoExome" (1034 nuclear DNA genes and 37 mtDNA genes) [26,27,29]. It ranges from 17 to more than 50% with WES-based strategies [30,31]. Results also depend on bioinformatic data processing, variant prioritization and numerous other parameters. Recently, Legati and colleagues analysed 125 patients with a mitochondrial disease. Their cohort included 78 children with an age of onset ≤ 1 year; the mean age of onset of the remaining patients was 18.6 years. A previous screening had excluded mtDNA mutations and nuclear genes known to be associated with the observed phenotypes [6].
Using a targeted gene panel including 132 genes, they identified the causative mutations in 19 patients (15.2%). They estimated that the diagnostic success of the NGS panel strategy would be around 25% when used as a first-line approach [6]. We wondered whether using larger panels could improve the success rate. We studied a cohort of 80 patients highly suspected of having mitochondrial disease, including children and adults, sent to our center during the first 3 months of 2016. Abnormalities of the mitochondrial genome were found in 15 cases and we used a custom-made targeted panel including 281 genes in the 65 remaining patients. Among these genes, 266 were known to be involved in mitochondrial disorders and 15 were candidates based on their role or involvement in pathways that include genes responsible for respiratory chain deficiency. The number of novel genes responsible for RC deficiency is increasing dramatically (more than 110 in the last five years) and it should be noted that the version of the panel we use today is already different from that used in this study. With the panel described, we identified pathogenic variants in 8 patients out of 65 (12.3%), including 5/23 children (21.7%) and 3/42 adults (7.1%). Our cohort is a mix of pediatric and adult patients and the success rate in the pediatric population (21.7%) is higher than the one found in the cohort described by Legati and colleagues [6]. However, they had previously excluded the main candidate genes, so the 2 studies are difficult to compare. We also clearly show that mtDNA abnormalities are mainly found in adult patients (25%) compared to children (4.1%), whereas the situation is reversed for pathogenic variants in nuclear genes (7.1% in adults and 21.7% in children). However, our data suggest that a gradual increase in the size of the panels would not resolve all undiagnosed cases.
WES can improve the diagnostic rate by discovering novel mitochondrial disease-linked genes. Another reason to prefer WES to panels as a first-line approach is that patients with respiratory chain deficiency may harbor pathogenic variants in genes apparently not related to mitochondrial function, and that respiratory chain abnormalities may be secondary to other diseases. Patient N°64, who carries two compound heterozygous mutations in SYNE1, illustrates this situation (Table 4). SYNE1 encodes a multi-isomeric modular protein which forms a network between organelles and the actin cytoskeleton to maintain the subcellular spatial organization [33]. This gene is responsible for different neuromuscular disorders, including autosomal dominant Emery-Dreifuss muscular dystrophy 4 and autosomal recessive spinocerebellar ataxia 8 (SCAR8) [34,35].
It was not an obvious candidate gene for mitochondrial disease. However, unpublished results suggested that patients presenting with ataxia associated with SYNE1 mutations may have a secondary mitochondrial dysfunction. For this reason we included SYNE1 in our panel, which led to the identification of two variants in a patient who presented with ataxia and CPEO in adulthood, with mitochondrial aggregates and multiple deletions of mtDNA in muscle. Further analyses will be necessary to explain these unexpected secondary effects and to exclude an artefact. However, mutations in non-mitochondrial proteins can make finding the pathogenic variants even more difficult. Functional analyses are also required to distinguish between confirmed and possible disease-causing mutations. Familial segregation and a concordant clinical phenotype are 2 critical parameters. Nevertheless, segregation studies are often impossible in adult patients, and in pediatric cohorts the family size is mostly small. Furthermore, mitochondrial disease in general displays a poor correlation between genotype and phenotype.

Conclusions
Our data highlight the great underlying genetic heterogeneity of suspected mitochondrial disease and how hard it is to assign a pathogenic role to each variant identified by NGS. International networks are required to create a common genetic database in order to improve the molecular diagnosis of these diseases. Our results also confirm that a panel strategy is not optimal for identifying the molecular abnormalities associated with mitochondrial disease, even when the number of genes analyzed is increased. Data from the literature suggest that exome sequencing may be of greater interest than gene panels, which need to be constantly reassessed as new responsible genes are identified. Nevertheless, further comparative studies will be needed to determine the best strategy for the diagnosis of these pathologies.

Additional file
Additional file 1: Table S1.