A de novo synonymous variant in EFTUD2 disrupts normal splicing and causes mandibulofacial dysostosis with microcephaly: case report

Background Mandibulofacial dysostosis with microcephaly (MFDM) is a rare autosomal dominant genetic disease characterized by intellectual and growth retardation, as well as major microcephaly, caused by missense and splice site variants or microdeletions in the EFTUD2 gene. Case presentation Here, we investigate the case of a young girl with symptoms of MFDM and a normal karyotype. Whole-exome sequencing of the family was performed to identify genetic alterations responsible for this phenotype. We identified a de novo synonymous variant in the EFTUD2 gene. We demonstrated that this synonymous variant disrupts the donor splice site of intron 9, resulting in the skipping of exon 9 and a frameshift that leads to a premature stop codon. Conclusions We present the first case of MFDM caused by a synonymous variant disrupting the donor splice site, leading to exon skipping.

The exact prevalence of MFDM is unknown, but more than 80 cases have been described in the literature to date. MFDM is mostly caused by de novo variants in the EFTUD2 gene (MIM# 603892) [5]. In rarer instances, MFDM is transmitted from a parent in an autosomal dominant manner (19% of cases) or results from germline mosaicism (6% of cases). EFTUD2 encodes U5-116kD, a highly conserved GTPase component of the major spliceosome complex, which processes precursor mRNAs into mature mRNAs by allowing the dissociation of U4 and U6 snRNPs during splicing in a GTP-dependent manner [6].
The EFTUD2 gene is composed of 29 exons and has four transcript variants encoding three different isoforms. Seventy-six distinct single-nucleotide variants (SNVs) and seven microdeletions in EFTUD2 involved in MFDM have been described to date [5]. These variants can alter basic, surface-forming residues that are potentially available for protein-protein interactions in the internal face of the protein, and could conceivably affect protein function through several mechanisms acting on protein stability, conformation, localization, and/or post-translational modifications. Various types of EFTUD2 variants have been identified, including missense, frameshift, and intronic splice site variants, as well as deletions. However, synonymous splice site variants in the gene have not previously been implicated in this disease.
At first sight, synonymous variants do not appear to alter the structure and function of proteins, and they have long been interpreted as "silent" variants. Studies in evolutionary genetics have, however, shown that not all synonymous codons are used at the same frequency in the genome and that selection pressure is exerted even on synonymous codons, because they are used differently by the mRNA splicing, translation, and processing machinery. The association of synonymous variants with over 50 human diseases has further confirmed the importance of these phenomena [7].

Case presentation
Here, we report a seven-year-old female patient, a native of Libya, who presents with postnatal microcephaly (-3 SD), sensorineural hearing loss, and global intellectual delay with difficulties in comprehension. She also presents with epileptic seizures, livedo, and facial dysmorphisms such as micro-retrognathism, malar hypoplasia, dental malocclusion, limited mouth opening, and large protruding ears.
As her karyotype was normal and her parents were both healthy, we performed whole-exome sequencing (WES) of the child and her parents to identify putative genetic alterations responsible for this phenotype. WES was performed on genomic DNA prepared from blood samples of the patient and her parents. The mean exome-wide coverage was 139.09, 119.25, and 148.62 reads, with 95.99, 95.91, and 96.08% of the exome covered by at least 10 reads, for the patient, mother, and father, respectively. In our variant analysis, we prioritized variants that were rare in the healthy population according to the gnomAD v3 database (< 1%), predicted to be deleterious to protein function by the SIFT and PolyPhen tools, and either transmitted as compound heterozygous or arising de novo, consistent with the context of non-consanguineous, healthy parents (Table S1).
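The prioritization logic described above can be sketched as a simple filter over annotated trio genotypes. This is a minimal illustration only, not the Ingenuity Variant Analysis workflow used in the study: the record fields (`gene`, `af`, `proband`, `mother`, `father`) and the example records are hypothetical, and the compound-heterozygous branch and the SIFT/PolyPhen predictions are left out for brevity.

```python
from dataclasses import dataclass

MAX_POPULATION_AF = 0.01  # "rare" threshold stated in the text (< 1%)


@dataclass
class TrioVariant:
    gene: str
    af: float      # population allele frequency (0.0 if absent from the database)
    proband: str   # genotypes written as "0/0", "0/1" or "1/1"
    mother: str
    father: str


def is_rare(v):
    return v.af < MAX_POPULATION_AF


def is_de_novo(v):
    """Alternate allele present in the child and absent from both parents."""
    return "1" in v.proband and v.mother == "0/0" and v.father == "0/0"


def prioritize(variants):
    return [v for v in variants if is_rare(v) and is_de_novo(v)]


# Hypothetical records, for illustration only.
candidates = [
    TrioVariant("EFTUD2", 0.0, "0/1", "0/0", "0/0"),   # rare and de novo
    TrioVariant("GENE_A", 0.20, "0/1", "0/0", "0/0"),  # de novo but common
    TrioVariant("GENE_B", 0.0, "0/1", "0/1", "0/0"),   # rare but inherited
]
print([v.gene for v in prioritize(candidates)])
```

In a real analysis the de novo call would also take read depth and genotype quality in all three samples into account, which is why the candidate was confirmed independently by Sanger sequencing.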
Among these pertinent variants, the only one that could explain the patient's phenotype was the de novo synonymous variant c.702G>T (transcript NM_004247.4) in exon 9 of EFTUD2 at position chr17:42956924 (GRCh37/hg19) in the patient (Fig. 1a). This variant changes a GGG codon to GGT, retaining glycine at amino acid residue 234 (p.G234G). According to the ACMG 2015 guidelines [8], this variant is classified as a variant of uncertain significance. Sanger sequencing confirmed that neither parent carried the variant (Fig. 1b). The variant is located in the G-domain of the protein, which is known to bind and hydrolyze GTP and which harbors other EFTUD2 variants associated with MFDM (Fig. 1c). As the MFDM disease pattern corresponds closely to the symptoms of the patient (Table 1), we decided to investigate the potential impact of this synonymous variant on EFTUD2 function.
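The coordinates of the variant can be checked with a few lines of arithmetic. The sketch below uses the standard genetic code and the values quoted in the text (coding position 702, GGG to GGT); the helper names are illustrative and not part of any pipeline used in the study.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_number(cds_position):
    """1-based codon containing a 1-based coding-sequence position."""
    return (cds_position - 1) // 3 + 1


def position_in_codon(cds_position):
    """Position (1, 2 or 3) of a coding-sequence position within its codon."""
    return (cds_position - 1) % 3 + 1


def is_synonymous(ref_codon, alt_codon):
    return CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]


print(codon_number(702))            # codon 234
print(position_in_codon(702))       # third base of the codon
print(is_synonymous("GGG", "GGT"))  # both encode glycine (G)
```

Position c.702 is therefore the third base of codon 234, and the G>T change leaves the encoded glycine unchanged, in agreement with p.G234G.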
The T allele at this position is absent from public databases, including the NHLBI Exome Sequencing Project, the 1000 Genomes Project, and gnomAD v3, suggesting very high conservation of the G allele in the population. The mutated nucleotide is the last nucleotide of exon 9, located at the exon/intron junction immediately adjacent to the splice donor site GT (c.702+1 and c.702+2). According to three splicing prediction tools, SpliceSiteFinder-like (SSF), MaxEntScan (MES), and Human Splicing Finder (HSF), our variant affects the donor splice site by creating an alternative cryptic donor site "GT" preceding the original one (Fig. 2a, b).
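Why the tools report a cryptic donor can be seen by listing the GT dinucleotides around the junction. The sketch below is only a positional illustration and does not reproduce the scoring models of SSF, MES, or HSF. It uses only the five nucleotides implied by the text (exon ending in GGG or GGT, intron starting with gt), and the function name is hypothetical.

```python
def gt_positions(exon_tail, intron_head):
    """Return junction-relative positions at which a GT dinucleotide starts.

    Exonic positions are negative (-1 is the last exonic nucleotide);
    intronic positions are positive (+1 is the first intronic nucleotide).
    """
    seq = (exon_tail + intron_head).upper()
    n_exon = len(exon_tail)
    positions = []
    for i in range(len(seq) - 1):
        if seq[i:i + 2] == "GT":
            positions.append(i - n_exon if i < n_exon else i - n_exon + 1)
    return positions


reference = gt_positions("GGG", "gt")  # wild-type junction GGG|gt
variant = gt_positions("GGT", "gt")    # c.702G>T junction GGT|gt

print(reference)  # [1]
print(variant)    # [-2, 1]
```

The variant allele carries an extra GT starting two nucleotides upstream of the junction (c.701 and c.702), immediately preceding the canonical GT at +1/+2. Whether the spliceosome actually uses such a site cannot be read from the dinucleotide alone and has to be tested on RNA.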
To test this prediction, we investigated the consequence of the variant on the splicing of the EFTUD2 gene in vivo, in peripheral blood of the proband and her parents. After RNA isolation from leukocytes, we performed RT-PCR and amplified a 360-base fragment spanning exon 8 to exon 12 of the EFTUD2 cDNA. We observed the expected PCR product of ~360 bp in all three individuals and an additional PCR product of ~280 bp in the proband only (Fig. 3a). This result suggests a deletion of about 80 bp in the patient's EFTUD2 cDNA.
Sequencing of the alternative cDNA showed complete deletion of exon 9 (Fig. 3b and c). As the length of exon 9 (83 bp) is not a multiple of 3, its deletion triggers a frameshift leading to a premature stop codon that truncates the protein (c.620_702del, p.His209Aspfs*25; Supplementary Fig. 1). This result demonstrates that the de novo synonymous variant identified in EFTUD2 is responsible for the splicing defect leading to the skipping of exon 9, an exon that is present in all splice isoforms of EFTUD2.
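The size and reading-frame arithmetic behind this conclusion can be written out explicitly. The sketch below uses only numbers given in the text (the c.620_702del interval and the approximately 360 bp amplicon); the variable and function names are illustrative.

```python
def interval_length(first, last):
    """Number of nucleotides in an inclusive coding-sequence interval."""
    return last - first + 1


exon9_length = interval_length(620, 702)  # c.620_702del
frame_remainder = exon9_length % 3        # non-zero means a frameshift

full_amplicon = 360                       # approximate size of the normal RT-PCR product
skipped_amplicon = full_amplicon - exon9_length

print(exon9_length)      # 83
print(frame_remainder)   # 2
print(skipped_amplicon)  # 277
```

A product of about 277 bp is consistent with the additional band of about 280 bp seen on the gel, and the remainder of 2 confirms that removing exon 9 shifts the reading frame downstream of the deletion.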

Patients
The patient was recruited at the "Unité de Diagnostic Prénatal -CPDP" of the American Hospital of Paris. The

Whole exome sequencing
Genomic DNA was isolated from peripheral blood using standard protocols. Exome sequencing libraries were prepared with the TruSeq Exome Kit (Illumina, San Diego, CA, USA) following the manufacturer's recommendations. Paired-end (2 × 75 bp) sequencing was performed on a NextSeq500 sequencer (Illumina, San Diego, CA, USA).

Bioinformatic analysis
FastQ data were aligned to the GRCh37 (hg19) reference genome with bwa-0.7.12 [9], sorted and indexed with samtools-1.2 [10], deduplicated with PICARD-1.110, and base recalibrated and indel realigned with GATK-3.8 [11,12]. Variant calling was done with GATK-3.8 HaplotypeCaller in GVCF ERC mode. Variants were called individually for each sample and then combined with GATK-3.8 GenotypeGVCFs to produce a combined VCF. The combined VCF was then uploaded and analyzed with Ingenuity Variant Analysis software. Alignments were visualized with GenomeBrowse (Golden Helix, Massachusetts). FastQC-0.11.5 was used to calculate quality metrics for FastQ files, and Qualimap-2.2.1 [13] was used to calculate coverage statistics using the truseq-exome-targeted-regions-manifest-v1-2.bed file. The reference file used for alignment and variant calling was human_g1k_v37.fasta, which was provided with the GATK b37 resource bundle.
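The per-sample calling and joint genotyping steps can be sketched as the command lines one would assemble for a trio. This is a hedged illustration in GATK 3 argument style, not the authors' actual scripts: the jar path, sample names, and BAM/GVCF/VCF file names are assumptions, and the alignment, deduplication, recalibration, and realignment steps are not shown. The code only builds and prints the commands; it does not run them.

```python
REFERENCE = "human_g1k_v37.fasta"  # reference file named in the text
GATK_JAR = "GenomeAnalysisTK.jar"  # assumed location of the GATK 3.8 jar


def haplotype_caller_cmd(sample):
    """Per-sample calling in GVCF mode (input and output file names are hypothetical)."""
    return [
        "java", "-jar", GATK_JAR,
        "-T", "HaplotypeCaller",
        "-R", REFERENCE,
        "-I", f"{sample}.bam",
        "--emitRefConfidence", "GVCF",
        "-o", f"{sample}.g.vcf",
    ]


def genotype_gvcfs_cmd(samples, out_vcf):
    """Joint genotyping of the per-sample GVCFs into one combined VCF."""
    cmd = ["java", "-jar", GATK_JAR, "-T", "GenotypeGVCFs", "-R", REFERENCE]
    for sample in samples:
        cmd += ["--variant", f"{sample}.g.vcf"]
    return cmd + ["-o", out_vcf]


def trio_commands(samples, out_vcf="trio.vcf"):
    return [haplotype_caller_cmd(s) for s in samples] + [genotype_gvcfs_cmd(samples, out_vcf)]


for command in trio_commands(["proband", "mother", "father"]):
    print(" ".join(command))
```

Calling each sample separately and genotyping them together gives every family member an explicit genotype at each variant site, which is what makes it possible to recognize a variant as de novo.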

RNA isolation and RT-PCR
Peripheral blood samples from the proband and her parents were used for the analyses in this study. Peripheral blood mononuclear cells were isolated by Ficoll-Paque™ density gradient centrifugation. After total RNA extraction using Trizol, reverse transcription and PCR were performed as described in [14]. The forward and reverse primers, purchased from IDT, were 5′ GTGGAATACATGCTTATTAATCCATTGACC 3′ and 5′ GAGCAAGAGAGAGGTGTAGGCATC 3′, respectively. PCR products were analyzed on a 2% agarose gel as described in [14]. Finally, we used the PCR clean-up gel extraction kit from Macherey-Nagel to isolate DNA bands from the agarose gel for sequencing.

Sanger sequencing
The EFTUD2 variant was validated using capillary Sanger sequencing. Briefly, a 262 bp DNA stretch of EFTUD2 was amplified using the Expand Long Template PCR System (Roche, Meylan, France), following the manufacturer's recommendations. The PCR primer pair was 5′-TTCAAGTTCTCTGGCTCCCA-3′ (forward) and 5′-CCCTCAGTTCACCCTACCAG-3′ (reverse). After purification with the Exostar kit (GE Healthcare, Little Chalfont, UK), PCR products were bi-directionally sequenced with the same primers using Big Dye Terminator Kit v3.1 (Life Technologies). Sequence reactions were run on an ABI PRISM 3730xl sequencer (Life Technologies).

Discussion and conclusions
The increased access to next-generation sequencing for clinical purposes has allowed the identification of thousands of novel pathogenic variants in different individuals. One of the main challenges in clinical genetics is the interpretation of pathogenicity from a sea of variants that remain largely of unknown significance.
Synonymous variants are often interpreted by default as being silent and benign, given their predicted null impact on the protein sequence. However, there is evidence that some synonymous SNVs affect RNA splicing, expression, folding, and ultimately function and, in doing so, contribute to the pathophysiology of many diseases [15][16][17].
In this case study, we report a synonymous c.702G>T variant in the EFTUD2 gene. This variant has not previously been reported in the literature and is absent from large population databases (gnomAD, 1000 Genomes); without further analysis, our initial classification would have been of uncertain significance. However, in silico analysis predicted disruption of the normal splice site, prompting experimental investigation of its biological significance. Whole-exome sequencing did not identify other deleterious variants that could be of clinical interest. Although we cannot exclude the presence of relevant deleterious variations in non-coding regions, the strong correlation between the patient's phenotype and the clinical consequences of heterozygous alteration of EFTUD2 was sufficient to assume its implication in the disease.
The synonymous variant modifies the consensus sequence between exon 9 and intron 9 from GGG|gt to GGT|gt. Contrary to the in silico tools, which predicted the creation of an additional GT donor site (Fig. 2), the study of cDNA from blood showed that this variant disrupts the recognition of the donor site by the splicing machinery and results in complete skipping of exon 9. This result points to a limitation of predictive splicing tools, which did not predict the disruption of the splice site induced by this variant. Our study is the first description of a synonymous SNV of EFTUD2 in an MFDM patient. Studying cDNA from blood can have limitations, particularly if the gene of interest has different transcripts with tissue-specific expression; however, we verified that the EFTUD2 gene is ubiquitously expressed and that the different transcripts do not present differences such as alternative splicing in the region of interest.
Some exonic regions are involved in splicing regulation through highly conserved sites called exonic splice enhancers (ESEs) [18]. In 80% of splicing consensus sites, the last nucleotide of the exon is a "G", which is highly important for recognition by the splicing machinery [19]. Recently, Savisaar et al. showed that ESEs are under strong selection pressure at synonymous sites, suggesting that synonymous variants in these sites may be a common cause of single-locus genetic diseases [20]. A deleterious missense variant in the last G nucleotide of an exon, resulting in exon skipping, has already been reported in BRCA1 in 2 patients who developed breast cancer at a young age [21] and in a patient with retinitis pigmentosa [22]. To our knowledge, our study is the first to report a deleterious synonymous variant in the final nucleotide of an exon that results in exon skipping.
In conclusion, synonymous variants should not be disregarded, especially when they are predicted to affect splicing according to in silico tools. This study provides important evidence for the classification of such variants.
Additional file 1: Supplementary Fig. 1. mRNA sequence of the WT allele versus the mutant allele. The skipping of exon 9 in the mutant allele is predicted to cause a frameshift, leading to a premature stop codon. Exon 8 is shown in red, exon 9 in green, and exon 10 in blue.
Additional file 2: Table S1. Number of prioritized variants during the WES data filtering analysis.