Targeted next generation sequencing in 112 Chinese patients with intellectual disability/developmental delay: novel mutations and candidate gene

Background Intellectual disability/developmental delay is a complex condition with extraordinary heterogeneity. A large proportion of patients lack a specific diagnosis. Next generation sequencing, which enables identification of genetic variations in multiple genes, has become an efficient strategy for genetic analysis in intellectual disability/developmental delay. Methods Clinical data of 112 Chinese families with unexplained intellectual disability/developmental delay were collected. Targeted next generation sequencing of 454 genes related to intellectual disability/developmental delay was performed for all 112 index patients. Patients with promising variants and their other family members underwent Sanger sequencing to validate the authenticity and segregation of the variants. Results Fourteen promising variants in the genes EFNB1, MECP2, ATRX, NAA10, ANKRD11, DHCR7, LAMA1, NFIX, UBE3A, ARID1B and PTPRD were identified in 11 of 112 patients (11/112, 9.82%). Of the 14 variants, eight arose de novo, and 13 are novel. Nine patients (9/112, 8.03%) received definite molecular diagnoses. This is the first report of variants in EFNB1, NAA10, DHCR7, LAMA1 and NFIX in Chinese patients with intellectual disability/developmental delay, and the first report of variants in NAA10 and LAMA1 in affected individuals of Asian ancestry. Conclusions Targeted next generation sequencing of 454 genes is an effective test strategy for patients with unexplained intellectual disability/developmental delay. Genetic heterogeneity is significant in this Chinese cohort, and de novo variants play an important role in the diagnosis. The findings of this study further delineate the corresponding phenotypes, expand the mutation spectrum and support the involvement of PTPRD in the disease. Electronic supplementary material The online version of this article (10.1186/s12881-019-0794-y) contains supplementary material, which is available to authorized users.


Background
Intellectual disability/developmental delay (ID/DD) is a common group of neurodevelopmental disorders with a prevalence of 1-3%, which starts before the age of 18 years and is characterized by substantial limitations in intellectual functioning and adaptive behavior [1]. ID cannot be diagnosed until the child is older than five years, when standardized measures of developmental skills, such as the Wechsler Intelligence Scale for Children (WISC), are more reliable and valid. DD is defined as delay in two or more developmental domains, including gross or fine motor, speech/language, social/personal, cognitive and activities of daily living. DD can be assessed with the Gesell Developmental Scale and mostly predicts a future diagnosis of ID. Although ID/DD can also be caused by exogenous factors such as maternal alcohol abuse during pregnancy, birth complications, infections and extreme malnutrition, genetics plays a vital role in its etiology [2]. Discerning the precise genetic causes in ID/DD patients will inform prognosis, management and therapy, enable access to disorder-specific support groups, and facilitate family planning [3]. Unfortunately, owing to extreme genetic heterogeneity, the genetic causes remain to be clarified in most ID/DD patients [4].
Genomic variants, including structural variants and sequence variants, can both lead to ID/DD. The former encompass both copy number variants (CNVs) and balanced rearrangements and can be detected by conventional karyotyping and chromosomal microarray analysis (CMA), explaining up to 15% of ID/DD cases [1,2]. Sequence variants may cause monogenic disorders and can be discovered by DNA sequencing. In recent years, next generation sequencing (NGS), which enables identification of genetic variations in multiple genes, has become an effective strategy for genetic analysis in ID/DD. Based on NGS technology, three diagnostic tests are currently used for the diagnosis of ID/DD: targeted NGS (also known as gene panel sequencing), whole exome sequencing (WES) and whole genome sequencing (WGS). The main difference among the three tests is the range of the sequenced regions. Targeted NGS focuses on hundreds of disorder-specific genes. By contrast, WES, covering all ~20,000 protein-coding genes, and WGS, sequencing the entire genome, are non-targeted tests [5]. With more genomic regions covered, WES and WGS achieve higher diagnostic rates in ID/DD (~40% and ~42%, respectively) than targeted NGS (11-32%) [3,6-9]. However, given its lower cost, deeper coverage and easier data management, targeted NGS is still a common approach in routine clinical diagnostic laboratories. In this study, targeted NGS of 454 genes related to ID/DD was performed for 112 Chinese patients with unexplained ID/DD to elucidate their genetic causes and enable access to further medical management.

Patients
Patients with unexplained ID/DD were defined as those who did not get an etiological diagnosis after prior etiology tests including screening for inborn errors of metabolism and karyotype analysis. The inclusion criteria were: 1) age at first exam was from 3 months to 18 years; 2) ID: IQ < 70, assessed by WISC; DD: DQ < 76 in two or more developmental domains assessed by Gesell Developmental Scale; 3) no history of perinatal brain injury, postnatal hypoxia, intoxication, cranial trauma or central nervous system infection; 4) no evidence of recognizable inherited metabolic disorder or neurodegenerative disorders.
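The inclusion criteria above can be written as a simple screening check. The following is a minimal sketch only: the `Candidate` record, its field names and the inclusive treatment of the age limits are assumptions made for illustration, not part of the study protocol.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Candidate:
    """Hypothetical record for one patient being screened for enrollment."""
    age_months: float                      # age at first exam, in months
    iq: Optional[float] = None             # WISC IQ, if assessed
    dq_by_domain: Dict[str, float] = field(default_factory=dict)  # Gesell DQ per domain
    has_exclusion_history: bool = False    # perinatal brain injury, hypoxia, CNS infection, etc.
    has_metabolic_or_neurodegenerative_disorder: bool = False


def meets_inclusion_criteria(c: Candidate) -> bool:
    """Apply criteria 1-4; age bounds are treated as inclusive (an assumption)."""
    age_ok = 3 <= c.age_months <= 18 * 12
    has_id = c.iq is not None and c.iq < 70
    delayed_domains = sum(1 for dq in c.dq_by_domain.values() if dq < 76)
    has_dd = delayed_domains >= 2
    return (
        age_ok
        and (has_id or has_dd)
        and not c.has_exclusion_history
        and not c.has_metabolic_or_neurodegenerative_disorder
    )


# Example: a 40-month-old with DQ < 76 in two domains satisfies the DD criterion.
toddler = Candidate(age_months=40, dq_by_domain={"gross_motor": 60, "language": 55})
print(meets_inclusion_criteria(toddler))
```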
All 112 Chinese patients were examined and enrolled by pediatric neurologists at Peking University First Hospital from May 2014 to August 2016. Genomic DNA of each index patient and his or her parents or other family members was extracted from peripheral leukocytes using the FlexiGene DNA Kit (QIAGEN, Germany) according to the standard procedure.
Sanger sequencing was performed for index patients and other family members to validate the authenticity and segregation of the promising variants.

Statistical analyses
Differences were analyzed statistically with the Chi-square test in IBM SPSS Statistics 19.
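For readers without SPSS, a Pearson chi-square test on a 2 × 2 table can be sketched in plain Python as below. This is an illustration under stated assumptions, not the authors' analysis: the function name is hypothetical, no continuity correction is applied, and the counts in the usage example are placeholders, not values from Table 1.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    `table` is [[a, b], [c, d]] of observed counts. Returns (chi2, p_value).
    Every row and column total must be non-zero.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # With 1 degree of freedom, chi2 is the square of a standard normal,
    # so the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Placeholder counts (NOT from the study): rows = with / without a meaningful
# NGS result, columns = male / female.
placeholder = [[7, 4], [62, 39]]
chi2, p = chi_square_2x2(placeholder)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

When expected counts are small (commonly taken as below 5), the chi-square approximation becomes unreliable and an exact test is usually preferred.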

Results
Of the 112 Chinese patients, 69 were male and 43 were female. The median age was 3 years and 7 months (range 4 months to 17 years). Forty-nine patients, older than 5 years of age, were diagnosed with ID, while the remaining 63 patients, younger than 5 years of age, were diagnosed with DD. Eighteen patients (18/112, 16.07%) presented with mild ID/DD, while the remaining 94 patients (94/112, 83.93%) were affected by moderate to severe ID/DD. Congenital malformation, abnormal behavior, epilepsy, positive family history and MRI abnormality were observed in 52.68, 22.32, 17.86, 13.39 and 59.70% of patients, respectively. There was no statistically significant difference (p > 0.05) between patients with and without meaningful targeted NGS results in sex ratio (male versus female), severity of ID/DD (mild versus moderate to severe), or the incidence of congenital malformation, abnormal behavior, epilepsy, positive family history and MRI abnormality (Table 1).
A hemizygous variant c.6257T > C; p.(Leu2086Ser) in ATRX (NM_000489.3) was detected in Patient 3, who was characterized by moderate ID, dysmorphic face (large forehead, low anterior hairline, hypertelorism, broad nasal bridge, small ears, strabismus), ventricular septal defect (repaired at the age of 5 years), scoliosis, and high arch of the left foot. No microcephaly, genitourinary malformation, deafness or signs of anemia including hepatosplenomegaly, anemia-like bone changes, jaundice or abnormal red blood cell indices were observed. He had a complicated family history (Fig. 2). Sanger sequencing for family members (
Her main complaints were ID, epilepsy and sinus block. No ataxia or ocular anomalies were noted at evaluation at the age of 9 years and 4 months. The patient was fitted with a pacemaker and could not undergo MRI examination. Therefore, it was unknown whether Patient 7 had a brain abnormality. To clarify whether the patient had any other promising damaging variants, trio-based WES was performed for Patient 7 and her parents. Interestingly, apart from the variants in LAMA1, no other promising variants stood out.

Variants in autosomal dominant ID/DD genes
A de novo heterozygous variant c.613C > T; p.(Gln205Ter) in NFIX (NM_001271043.1) was detected in Patient 8. The boy, at the age of 4 years and 2 months, was clinically suspected to have Sotos syndrome (MIM# 614753) with facial dysmorphia (long and narrow face, high forehead and downslanting palpebral fissures), mild DD, significant delay in language (started to speak at the age of 3 years) and overgrowth. Previous fluorescence in situ hybridization (FISH) did not detect deletion of the 5q35 region, a common defect leading to Sotos syndrome.
A de novo heterozygous variant c.403G > T; p.(Glu135Ter) in UBE3A (NM_000462.3) was identified in Patient 9. Evaluated at the age of 3 years and 3 months, the boy presented with global DD, significantly delayed language (starting to speak at 3 years of age), epilepsy (onset at 2 years of age), inappropriate smile, microcephaly and facial dysmorphia (hypertelorism, small ear, long philtrum and prominent jaw). A clinical diagnosis of Angelman syndrome (MIM# 105830) was established. Deletion of 15q11-q13 was excluded by multiplex ligation-dependent probe amplification (MLPA).
A de novo missense variant c.6212T > A; p.(Ile2071Asn) in ARID1B (NM_001346813.1) was identified in Patient 10. The boy, at the age of 1 year, presented with mild DD, coarse face (low anterior hairline, thick eyebrows, broad nasal tip, long philtrum, thin upper vermilion and low-set ears), nystagmus, strabismus, delayed dentition, single transverse palmar crease, prominent distal phalanges of the fourth toe of the right foot, delayed myelination and agenesis of the splenium of the corpus callosum (Fig. 1c).

A variant in candidate gene PTPRD
Patient 11 was a girl with moderate nonsyndromic DD. She was able to walk alone and speak at 16 months and 3.5 years, respectively. At the age of 5 years and 5  (Fig. 3a). The variant is not seen in control population databases. Although the variant site is not conserved, with a low GERP++RS score (−10.0) [27], the variant is predicted to disrupt the wild-type splice donor site in intron 44, with a consensus value (CV) of −26.8% in HSF and a drop in the SSP prediction score from 1 to 0. A new splice donor site at c.5534 + 73_5534 + 74GT, with a high SSP prediction score of 0.97, is predicted to be created. The altered splicing is predicted to cause the retention of 72 nucleotides of intron 44, change the reading frame and lead to a premature stop codon at position 1846 (p.(Ser1845ArgfsTer2)) (Fig. 3b). The changed/missing region p.1845_1912 of the predicted mutated protein is part of the tyrosine-protein phosphatase 1 domain (Fig. 3c), which is highly conserved in all seven isoforms of the PTPRD protein.

Discussion
In this study, through targeted sequencing of 454 ID/DD genes, promising variants were identified in 9.82% of patients. As expected, genetic heterogeneity was significant in this Chinese cohort. Eleven patients presented with 14 distinct variants in 11 genes. Only ATRX was found mutated in two patients. In addition, of the 14 variants, all except the variant in MECP2 are novel. These 13 novel variants expand the mutation spectrum, especially for the genes EFNB1, NAA10, LAMA1 and NFIX, which were identified only recently and for which few mutations have been reported so far. This is the first report of mutations in EFNB1, NAA10, DHCR7, LAMA1 and NFIX in Chinese ID/DD patients and the first report of mutations in NAA10 and LAMA1 in ID/DD patients of Asian ancestry.
The high rate of de novo variants (8/14, 57.14%) is remarkable. De novo variants were found in 72.72% (8/11) of patients, in X-linked as well as autosomal dominant conditions. The critical role of de novo variants in ID/DD has been reported previously [7,28-30]. However, it is important to note that many factors have to be evaluated when using the de novo evidence criteria, PS2 and PM6, to judge the pathogenicity of variants [23-25]. For instance, in this study, the testing strategy was a gene panel followed by parental testing of the variant, without confirmation of paternity and maternity. Therefore, PS2 could not be used here. Whether PM6 can be applied depends on the consistency and specificity of the phenotype, the number of de novo observations and the inheritance.
Interpreting the variants correctly and achieving a robust genetic diagnosis is still a considerable challenge. Besides the origin of the variant, such as the de novo status discussed above, many other factors such as variant allele frequency, inheritance model and the patient's phenotype should also be evaluated carefully to determine whether the variants impair gene function and underlie the phenotype. In this study, after comprehensive analysis, 9 of 112 patients obtained a definite diagnosis, for a final diagnostic rate of 8.03%. The diagnostic rate of targeted NGS is associated with multiple factors, including 1) features of the study subjects such as gender, severity of ID/DD, and the presence or absence of a positive family history or complications, 2) the genes included in the panel, 3) the test strategy, such as a gene panel followed by parental testing of the variant or NGS for trios or larger family groups, and 4) the data analysis pipeline and the definition of "diagnosis". Previous studies using targeted NGS in ID/DD led to a conclusive diagnostic rate of 11-32% [6-9]. The yield of this study is lower than that reported. This may be because, in this study, 1) subjects were not limited to "syndromic ID/DD" or "moderate to severe ID/DD", 2) the number of studied genes is relatively small, and 3) only the proband underwent NGS. For a patient with unexplained ID/DD, sequencing more genes and more family members increases the chance of diagnosis. However, it also means higher costs. Clinicians and geneticists should weigh the cost effectiveness.
Fig. 3 [...] (Ser1845ArgfsTer2). (c) PTPRD protein structure. PTPRD protein is a single-pass type I membrane protein and is predicted to contain conserved functional domains: three Ig-like C2 domains (Ig-like C2 type1, Ig-like C2 type2 and Ig-like C2 type3) (Purple), eight fibronectin type-III domains (Orange), and two tyrosine-protein phosphatase domains (Green) (https://www.uniprot.org/uniprot/P23468). As predicted, the truncated protein (p.(Ser1845ArgfsTer2)) will lack part of the tyrosine-protein phosphatase 1 domain (dotted box)
Patient 5 harbored two promising variants in two different genes, NAA10 and ANKRD11. The variant c.248G > A; p.(Arg83His) in NAA10 is absent from control populations, located in the N-acetyltransferase domain and predicted to be deleterious in silico. A different missense change at the same amino acid residue, p.(Arg83Cys), is classified as "pathogenic" by three submitters in the ClinVar database [31]. NAA10 is located in Xq28, and mono-allelic mutation in NAA10 can cause non-syndromic ID/DD in both males and females [32,33]. The main clinical features of Patient 5, including DD, hyperactivity, electrocardiographic T-wave abnormality and delayed bone age, were all previously reported in other patients with NAA10 mutations [32,33]. Moreover, his mother, who is heterozygous for the variant in NAA10, also demonstrated a slight intellectual deficit. The variant in NAA10 is classified as "likely pathogenic" based on evidence criteria PM1, PM2, PM5, PM6, PP3 and PP4. The variant c.884G > A; p.(Ser295Asn) in ANKRD11 arose de novo in the patient. Heterozygous mutations in ANKRD11 lead to KBG syndrome, which is characterized by macrodontia of the upper central incisors, distinctive craniofacial findings, short stature, skeletal anomalies and neurologic symptoms including ID/DD and seizures [34]. Phenotypic heterogeneity is significant in ANKRD11-related KBG syndrome, and none of the features mentioned above is a prerequisite for diagnosis. The variant in ANKRD11 in Patient 5 is classified as "uncertain significance" based on evidence criterion PM2 (absent from population databases), while the de novo evidence criteria PS2 or PM6 cannot be used because paternity and maternity were not confirmed and the phenotype is unspecific with high genetic heterogeneity. It is unclear whether the variant in ANKRD11 contributes to the phenotype of Patient 5.
It is worth noting that the variant in ARID1B in Patient 10 is a missense variant. Mono-allelic mutations in ARID1B lead to Coffin-Siris syndrome 1 (CSS1). The common features of CSS1 are ID/DD, speech delay, coarse facies, hypertrichosis, small fifth finger or toenails, and feeding difficulties [35]. Agenesis of the corpus callosum, seizures, myopia, growth delay, abnormal dentition and single transverse palmar crease are also recorded in some patients [35]. Of 98 reported pathogenic or likely pathogenic mutations in ARID1B, only 3 (3/98, 3.06%) are missense variants; the remainder are truncating variants (91/98, 92.85%) or splice site variants (4/98, 4.08%) [36]. The phenotypes of Patient 10, including DD, classical coarse face, delayed myelination and agenesis of the splenium, are highly concordant with CSS1. Therefore, although missense mutations are rare in ARID1B, combining the in silico features of the variant (absent from control populations, conserved, predicted to be damaging) with the consistent inheritance pattern (arising de novo, autosomal dominant), we propose that the missense variant in ARID1B is the genetic cause of the patient's phenotype. This finding further supports the view that missense variants in ARID1B, which may lead to a gain of function or a dominant negative effect, can also result in disease.
Patient 7 did not obtain a definite diagnosis after comprehensive analysis. The patient carried two compound heterozygous variants, c.1711_1712del; p.(Ala571ProfsTer8) and c.2755G > C; p.(Gly919Arg), in LAMA1. Bi-allelic mutation in LAMA1 causes a cerebellar dysplasia syndrome named Poretti-Boltshauser syndrome (PBS) (MIM# 615960) [37]. Recently, Whiffin et al. presented a statistical framework for calculating the maximum credible population allele frequency (AF) of a pathogenic variant based on disease inheritance mode, prevalence, genetic and allelic heterogeneity, penetrance and sampling variance [38]. PBS is an autosomal recessive disorder with a prevalence estimated to be less than 1/1,000,000 [39]. Characterized by cerebellar dysplasia with cysts and an enlarged, elongated and square-like shaped fourth ventricle on neuroimaging, the phenotype of PBS is highly specific [40,41]. At the moment, bi-allelic variants in LAMA1 are the only known cause of PBS, indicating no appreciable genetic heterogeneity. To date, 20 PBS families with bi-allelic variants in LAMA1 have been published [37,40]. The two variants in LAMA1 in Patient 7 have not been reported before. Therefore, the maximum allelic contribution (or allelic heterogeneity) is 1/(21 × 2), approximately 0.024. There is no published evidence that bi-allelic loss-of-function variants in LAMA1 do not cause PBS. There is only one individual with a homozygous splicing variant in LAMA1 in gnomAD. Despite this, and considering the huge number of controls in this database, we assume full penetrance. According to Whiffin's calculator [42], the maximum credible population AF of a pathogenic variant in LAMA1 is 3e-5 (setting inheritance = biallelic, prevalence = 1/1,000,000, genetic heterogeneity = 1, allelic heterogeneity = 0.03, penetrance = 1). The frameshift variant c.1711_1712del; p.(Ala571ProfsTer8) is absent from population databases and is predicted to lead to truncation of the protein at the 578th amino acid residue. Several variants in LAMA1 leading to a longer truncated protein have been reported to be pathogenic. Therefore, the variant c.1711_1712del; p.(Ala571ProfsTer8) is considered pathogenic (PM2, PVS1). The frequency of the missense variant c.2755G > C; p.(Gly919Arg) in the East Asian control population is 1.85e-3, although it is never seen in the homozygous state (gnomAD). This frequency is much higher than 3e-5, suggesting that the missense variant may be benign (BS1). However, the missense variant is predicted to be damaging by multiple prediction tools and is in trans with the pathogenic truncating variant, which are supporting (PP1) and moderate (PM3) evidence for its pathogenicity, respectively. Based on the evidence above, the missense variant is classified as "uncertain significance." Although presenting with ID, the patient did not show other typical features of PBS such as ataxia or ocular anomalies [37,40]. It is also unclear whether the patient has cerebellar dysplasia, the essential feature of PBS. Her epilepsy and sinus block have also not been reported in patients with PBS before. Therefore, the patient's diagnosis remained uncertain.
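The allele-frequency threshold used above can be reproduced with a short calculation. The sketch below assumes the biallelic form of the Whiffin et al. relation, in which the maximum credible AF is the square root of prevalence multiplied by the allelic heterogeneity and the square root of the genetic heterogeneity, divided by the square root of penetrance; the function name is hypothetical, and the sampling-variance step of the framework is not modeled here.

```python
import math


def max_credible_af_biallelic(prevalence, allelic_het, genetic_het=1.0, penetrance=1.0):
    """Maximum credible population AF for a recessive (biallelic) disorder.

    Assumed relation:
        sqrt(prevalence) * allelic_het * sqrt(genetic_het) / sqrt(penetrance)
    """
    if not (0 < prevalence <= 1 and 0 < allelic_het <= 1
            and 0 < genetic_het <= 1 and 0 < penetrance <= 1):
        raise ValueError("all parameters must lie in (0, 1]")
    return (math.sqrt(prevalence) * allelic_het
            * math.sqrt(genetic_het) / math.sqrt(penetrance))


# Parameters as stated in the text for LAMA1 / PBS.
threshold = max_credible_af_biallelic(
    prevalence=1 / 1_000_000,
    allelic_het=0.03,      # 1/(21 x 2) is about 0.024, entered as 0.03 in the text
    genetic_het=1.0,
    penetrance=1.0,
)
print(f"maximum credible AF = {threshold:.1e}")  # about 3e-5, as in the text

# East Asian frequency of c.2755G > C; p.(Gly919Arg) reported in the text.
observed_af = 1.85e-3
print("exceeds threshold:", observed_af > threshold)
```

Because the observed frequency is far above the threshold, the comparison supports applying BS1 to the missense variant, as described above.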
In addition, one de novo variant in the candidate gene PTPRD was detected. PTPRD is a receptor-type protein-tyrosine phosphatase that is highly expressed in the human brain (HPA RNA-seq normal tissues) [43], especially in neurons and oligodendrocytes [44]. Ptprd-deficient mice exhibit learning impairment, and Ptprd is an important regulator of synaptic plasticity [45]. It has been shown that PTPRD interacts with IL1RAPL1, mutations in which lead to non-syndromic ID (MIM# 300143). In silico, the residual variation intolerance score (RVIS) [46] and pLI score [13] of PTPRD are "-3.08 (4.8%)" and "1", respectively, which suggests that the gene is intolerant to functional genetic variation and to loss-of-function variation, respectively. Although no intragenic mutation in PTPRD has been reported in ID/DD patients yet, Choucair et al. [47] found a homozygous PTPRD gene microdeletion in one patient with trigonocephaly, hearing loss, and ID. Recently, Gao et al. [48] found that PTPRD, together with BTD, GALNT10, NMUR2, AUTS2 and DLG2, forms a small gene network related to epilepsy and ID/DD. All of the above suggests that PTPRD is a promising candidate gene for ID/DD. Our case provides further evidence for the association between PTPRD and ID/DD.

Conclusions
Here, through targeted NGS of 454 genes and comprehensive analysis, we obtained genetic diagnoses for 8.0% of patients, which confirms the effectiveness of the test strategy. The study emphasizes the high genetic heterogeneity of Chinese ID/DD patients and the important role of de novo variants. Its findings further support the related genes as causative genes of ID/DD, delineate the corresponding phenotypes and expand the mutation spectrum. Identification of the variant in PTPRD provides more evidence to support its involvement in ID/DD.