Novel variants underlying autosomal recessive intellectual disability in Pakistani consanguineous families
BMC Medical Genetics volume 21, Article number: 59 (2020)
Intellectual disability (ID) is a clinically diverse and genetically heterogeneous group of disorders, with onset of cognitive impairment before the age of 18 years. ID is characterized by significant limitations in intellectual functioning and adaptive behaviour. Whole-exome sequencing (WES) has proven successful in identifying genetic variants that cause ID and neurodevelopmental disorders. So far, more than 1222 primary and 1127 candidate genes have been associated with ID.
To determine pathogenic variants causative of ID in three unrelated consanguineous Pakistani families, we used a combination of WES, homozygosity-by-descent mapping, dideoxy (Sanger) sequencing and bioinformatics analysis.
Rare pathogenic single nucleotide variants identified by WES which passed our filtering strategy were confirmed by traditional Sanger sequencing and segregation analysis. Novel and deleterious variants in VPS53, GLB1, and MLC1, genes previously associated with variable neurodevelopmental anomalies, were found to segregate with the disease in the three families.
This study expands our knowledge of the molecular basis of ID as well as the clinical heterogeneity associated with different rare genetic causes of neurodevelopmental disorders. This genetic study could also provide additional knowledge to help genetic assessment as well as clinical and social management of ID in Pakistani families.
Next generation sequencing (NGS) methods have diversified the field of medical genetics. The increase in the number of causative ID genes is directly associated with the implementation of NGS technology for the diagnosis of patients. Intellectual disability (ID) is often part of a wide spectrum of neurodevelopmental disorders, with a total prevalence of 1–3% around the globe. ID is characterized clinically by below-average intellectual functioning and adaptive behaviour, with onset before the age of 18 years. Whole exome sequencing (WES) is highly useful for identifying rare genetic variants implicated in ID. Until 2016, only 746 genes were directly associated with ID based on clinical features and cognitive assessments, with more than 50% of these genes causing autosomal recessive ID. The list of known and candidate ID genes has been increasing rapidly in the last few years according to the sysID database, with at least 1222 primary and 1127 candidate genes.
Clinical and molecular diagnosis of ID is challenging due to its phenotypic and molecular heterogeneity. Because of this heterogeneity, WES-based studies investigating patients with ID of variable severity have shown a low yield of causative variants, ranging from 16 to 68% [5,6,7,8,9,10]. The genetic etiology underlying autosomal dominant ID still remains elusive in many cases. The low diagnostic yield in this group may be due to its syndromic nature, reduced penetrance, the lack of genomic material from additional family members for segregation studies, and the large number of variants potentially contributing to or causing the patients’ phenotypes.
In contrast, the diagnostic yield for ID-causing genes in consanguineous families is often higher. In the literature, cohorts of consanguineous families from the Greater Middle East, including Pakistan, Iran and Saudi Arabia, show a diagnostic yield of up to 90% in several WGS/WES studies [9, 12,13,14,15,16].
In our study, we identified three novel homozygous variants in different ID-related genes. These results were obtained by carrying out WES in consanguineous Pakistani families. The involvement of the novel variants in autosomal recessive inheritance is supported by linkage analysis using short tandem repeat (STR) markers, homozygosity-by-descent (HBD) mapping, brain expression databases and published literature related to neurodevelopmental disorders.
The study was approved by the board of advanced studies and research and the ethical committee of the International Islamic University, Islamabad, Pakistan, and by University College London, UCL Institute of Neurology, in accordance with the Declaration of Helsinki. The study included five probands originating from south Punjab in Pakistan. Blood samples were collected from all family members and genomic DNA was extracted by the phenol-chloroform method. Linkage analysis was performed with STR markers mapping to genes known to cause autosomal recessive ID. STR markers were obtained from the LDB genetic map database of the Psychiatric University Hospital in Zurich, Switzerland. However, the genotypes of the families remained unclear because the microsatellite markers were uninformative, and patient samples therefore underwent whole exome sequencing (WES) analysis.
Whole exome sequencing and analysis
WES was performed on the probands selected from the participating families. WES was carried out on an Illumina HiSeq 2500 system at an average coverage of 150× by Macrogen (Geumcheon-gu, Seoul, South Korea). To filter the patients' FASTQ files, the quality control tool Trimmomatic was applied to generate clean reads. Reads were then aligned to the human reference genome (GRCh38) using the Burrows-Wheeler Aligner (BWA), and duplicates were removed using Picard. Variant calling was performed using the Genome Analysis Toolkit (GATK). Initially, common and intronic variants were removed. All functional variants were prioritized for rarity by filtering against databases such as the Exome Aggregation Consortium (ExAC). Only homozygous or compound heterozygous, non-synonymous, frameshift, splice site and coding indel variants with allelic frequencies below 0.001% in the 1000 Genomes Project and ExAC databases were selected for further analysis. The pathogenicity scores of these shortlisted variants were then checked using the Ensembl genome browser.
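The filtering criteria above can be sketched as a small predicate over annotated variant records. This is only an illustration under stated assumptions, not the authors' pipeline: the record fields (`gene`, `zygosity`, `consequence`, `af_1000g`, `af_exac`), the consequence labels and the toy records are hypothetical, and the threshold is taken literally as the 0.001% stated in the text. True compound heterozygosity requires phasing, which is approximated here by requiring at least two heterozygous candidates in the same gene.

```python
from collections import defaultdict

# 0.001% written as a fraction, following the threshold stated in the text.
AF_THRESHOLD = 0.001 / 100

# Hypothetical consequence labels for the variant classes kept in the text.
KEPT_CONSEQUENCES = {"nonsynonymous", "frameshift", "splice_site", "coding_indel"}


def is_rare(variant):
    """Return True if the variant is below the threshold in both databases.

    A frequency of None means the variant is absent from that database.
    """
    return all((variant.get(key) or 0.0) < AF_THRESHOLD
               for key in ("af_1000g", "af_exac"))


def shortlist(variants):
    """Keep rare functional variants compatible with a recessive model."""
    candidates = [v for v in variants
                  if v["consequence"] in KEPT_CONSEQUENCES and is_rare(v)]
    # Count heterozygous candidates per gene as a crude proxy for
    # compound heterozygosity (no phasing information is used).
    het_per_gene = defaultdict(int)
    for v in candidates:
        if v["zygosity"] == "het":
            het_per_gene[v["gene"]] += 1
    return [v for v in candidates
            if v["zygosity"] == "hom"
            or (v["zygosity"] == "het" and het_per_gene[v["gene"]] >= 2)]


# Toy records with placeholder gene names; not data from the study.
EXAMPLE = [
    {"gene": "GENE_A", "zygosity": "hom", "consequence": "nonsynonymous",
     "af_1000g": None, "af_exac": None},
    {"gene": "GENE_B", "zygosity": "hom", "consequence": "synonymous",
     "af_1000g": None, "af_exac": None},
    {"gene": "GENE_C", "zygosity": "hom", "consequence": "nonsynonymous",
     "af_1000g": 0.02, "af_exac": 0.01},
    {"gene": "GENE_D", "zygosity": "het", "consequence": "frameshift",
     "af_1000g": None, "af_exac": None},
    {"gene": "GENE_E", "zygosity": "het", "consequence": "nonsynonymous",
     "af_1000g": None, "af_exac": None},
    {"gene": "GENE_E", "zygosity": "het", "consequence": "splice_site",
     "af_1000g": None, "af_exac": None},
]

print([v["gene"] for v in shortlist(EXAMPLE)])  # ['GENE_A', 'GENE_E', 'GENE_E']
```

In this toy input, the synonymous, common and single heterozygous variants are dropped, which mirrors the order of the filtering steps described in the paragraph.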
Sanger sequencing primers for the shortlisted variants from all three families were designed with Primer3Plus (www.primer3plus.com). PCR products were amplified using allele-specific primers for the selected variants (see Additional File 1, Table 1). Sequencing analysis was performed using an ABI-3730 DNA analyzer.
For each family, a protein model carrying the variant was constructed with SWISS-MODEL in order to assess the effect of the variant relative to the normal protein model. Multiple sequence alignments across orthologues from different species showed absolute conservation of the affected residue in all three families.
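The conservation check described above amounts to asking whether a single alignment column holds the same residue in every species. A minimal sketch follows, assuming the sequences are already aligned to equal length; the function name, species labels and toy sequences are hypothetical placeholders, not the orthologue sequences used in the study.

```python
def column_is_conserved(aligned, column):
    """Return True if every aligned sequence has the same residue at `column`.

    `aligned` maps a species label to an aligned sequence; gaps are "-".
    A column that contains a gap is not counted as conserved.
    """
    residues = {sequence[column] for sequence in aligned.values()}
    return len(residues) == 1 and "-" not in residues


# Toy alignment with placeholder labels; not sequences from the study.
TOY_ALIGNMENT = {
    "species_1": "MKPLTA",
    "species_2": "MRPLSA",
    "species_3": "MKPL-A",
}

print(column_is_conserved(TOY_ALIGNMENT, 2))  # True: P in all three
print(column_is_conserved(TOY_ALIGNMENT, 1))  # False: K, R, K
```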
Two patients from the consanguineous Pakistani family MR-4 were studied (IV: 2 and IV: 5, Fig. 1a). Pregnancy and birth were normal for both affected individuals, and head circumference was normal at birth. In the first proband (IV: 2), cognitive development was initially normal, but irascible behaviour was noticed in the early months of life. After 8 to 10 months, psychomotor delay became evident with reduced head size. Generalized tonic-clonic seizures began at 3 years of age, with at most one seizure per day lasting up to 5 minutes. She has progressive spasticity accompanied by microcephaly, failed to achieve -4SD at the age of 3 years. Brain CT scan performed at 3 years of age demonstrated subtle hypodensity in gray matter, more marked in the left basal ganglia, and cerebral atrophic changes (Fig. 1b). The second proband (IV: 5) showed a similar phenotype. His seizures began at 5 years of age, with up to 10 seizures per day lasting up to 1 min. Brain CT scan demonstrated cerebellar atrophic changes with subcortical cysts (Fig. 1b). Electroencephalogram (EEG) results demonstrated hypsarrhythmia (a specific EEG pattern seen in both patients with structural abnormalities of the brain). Metabolic test results in both patients were within normal limits.
WES performed in the two affected members of family MR-4 (IV: 2 and IV: 5) revealed a homozygous missense variant in VPS53 shared by the 2 siblings (c.C605T, p.Pro202Leu) (Additional File 1, Supplementary Table 2). The VPS53 gene is located on chromosome 17 and consists of 18 exons. This variant in VPS53 (OMIM 615851) is predicted to be disease causing based on its pathogenicity scores (Table 1). The variant was validated in this family through co-segregation analysis. The parents (III: 1 and III: 2) were heterozygous carriers, while both affected individuals (IV: 2 and IV: 5) were homozygous and the unaffected sibling (IV: 4) was wild type (Fig. 1c). Multiple species alignment showed complete conservation of the affected amino acid residue (p.Pro202Leu) across different species (Fig. 1d).
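The segregation pattern reported for family MR-4 can be checked against a simple rule for a fully penetrant autosomal recessive variant: every affected individual carries two copies, no unaffected individual does, and both parents are heterozygous carriers. The sketch below encodes the genotypes stated in the text as alternate-allele counts; the function name and data layout are illustrative assumptions, not part of the study's methods.

```python
def segregates_recessive(alt_allele_counts, affected, parents=()):
    """Check a fully penetrant autosomal recessive segregation pattern.

    alt_allele_counts: individual ID -> copies of the variant allele (0, 1, 2).
    affected: set of affected individual IDs.
    parents: IDs expected to be heterozygous carriers; each must be genotyped.
    """
    for person, count in alt_allele_counts.items():
        if person in affected and count != 2:
            return False
        if person not in affected and count == 2:
            return False
    return all(alt_allele_counts.get(parent) == 1 for parent in parents)


# Genotypes for the VPS53 variant as described in the text for family MR-4.
MR4_GENOTYPES = {"III:1": 1, "III:2": 1, "IV:2": 2, "IV:4": 0, "IV:5": 2}
MR4_AFFECTED = {"IV:2", "IV:5"}

print(segregates_recessive(MR4_GENOTYPES, MR4_AFFECTED,
                           parents=("III:1", "III:2")))  # True
```

A homozygous unaffected sibling, or a parent who is not a carrier, would make the function return False, which is the kind of result that would argue against the variant being causal under this model.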
In family MR-7, two patients (IV: 4 and IV: 5, Fig. 2a) were the 4th and 5th children of healthy, consanguineous parents from Pakistan. A 3-year-old boy (IV: 4) manifested developmental regression and seizures at the age of 1.5 years. His disease progressed slowly; he gradually lost his motor skills, became bedridden and is unable to sit without support. His occipitofrontal head circumference is 45.5 cm, which is considered microcephalic, and he is not able to talk. Brain MRI scans demonstrated abnormal deep white matter signal in the subcortical area and prominent ventricular and extraventricular spaces. EEG results show diffuse slowing of background activity (Fig. 2b). The second proband (IV: 5) presented with developmental regression which started at the age of 9 months. She is able to say just a few words, and her head circumference is overly large. Brain MRI scans demonstrated cerebellar emotional changes with reduced periventricular deep white matter (Fig. 2b).
The two patients (IV: 4 and IV: 5) were subjected to WES to identify the underlying genetic cause of the disease (Additional File 1, Table 3). Genomic Evolutionary Rate Profiling (GERP) and Combined Annotation Dependent Depletion (CADD) Phred pathogenicity scores were also analyzed using the Ensembl Variant Effect Predictor (VEP) tool. WES data and all online pathogenicity score predictors indicate that the variant c.C1318T, p.His440Tyr in the GLB1 gene (OMIM: 611458) is the most probable cause of the phenotype in family MR-7 (Table 1). Co-segregation analysis showed complete segregation of the GLB1 variant in all family members (Fig. 2c). SWISS-MODEL was used to generate the normal and mutated protein structures of GLB1 (Fig. 2d). Multiple alignment of GLB1 across orthologous species showed complete conservation of the affected region (Fig. 2e).
In family MR-8, a 7-year-old female patient (IV: 2) was the second child of healthy, consanguineous parents from south Punjab in Pakistan (Fig. 3a). The pregnancy was normal and she was delivered at term via caesarean section, with a birth weight of 2.2 kg and head circumference of 41 cm. She developed macrocephaly within the first few months of life, and thereafter showed motor deterioration and cognitive decline. Her head circumference was 43.5 cm (>95th percentile) at 4 months of age, which indicated macrocephaly. She controlled her head at 8 months of age. She walked independently at 15 months of age, but with some difficulty. She developed recurrent episodes of seizures at 2.5 years of age, as well as mental retardation and progressive motor dysfunction. Brain CT scans performed at 7 years of age revealed extensive bilaterally symmetrical white matter changes with subcortical cysts in the bilateral anterior temporal region of the brain (Fig. 3b).
WES in this patient (IV: 2) was performed as described in the Methods. In total, 832 variants were selected in the exonic and adjacent intronic regions. The variants with allelic frequencies < 0.001% were shortlisted (Additional File 1, Table 4). Variants were checked against the Human Protein Atlas database for expression, and only one of the rare variants, in the MLC1 gene (OMIM: 605908), showed high levels of expression in the brain at both the RNA and protein level. This variant (c.C959A, p.Thr320Lys) was not present in ExAC or the Genome Aggregation Database (gnomAD). The pathogenicity scores of the filtered variants were checked and only the MLC1 variant was predicted to be pathogenic. The following in-silico tools predicted the variant to be pathogenic: MutationTaster (disease_causing), PolyPhen-2 (probably damaging), SIFT (deleterious), GERP and CADD Phred (Table 1). After in-silico analysis, co-segregation analysis was performed for further validation of the variant, and family members showed complete segregation of the MLC1 variant c.C959A, p.Thr320Lys (Fig. 3c). Multiple alignment of MLC1 across orthologous species showed complete conservation at the mutated site (Fig. 3d).
Genetic studies of autosomal recessive ID in consanguineous families can yield variants that segregate with the phenotype of the family and can be considered disease-causing variants. Our study describes three ID families collected from the South Punjab area of Pakistan. Using combined genotyping, homozygosity-by-descent (HBD) mapping and WES, we identified three novel variants in known genes. In family MR-4, we identified a novel missense variant at nucleotide position c.C605T in exon 8 of the VPS53 gene, which substitutes the amino acid proline (CCA) with leucine (CTA) (p.Pro202Leu). The variant was checked against control samples from the population (n = 100) and the in-silico predictions for the identified variant were evaluated using different tools (Table 1). The Golgi-associated retrograde protein (GARP) complex consists of four subunits encoded by the VPS53, VPS52, VPS54, and ANG2 genes. The GARP complex plays an important role in directing retrograde vesicles from endosomes to the trans-Golgi network (TGN) [20, 21]. Dysfunction of the tethering mechanism between the retrograde vesicles and the GARP complex results in an accumulation of lysosomal receptor molecules within the TGN, leading to swelling of lysosomes [20, 21]. The previously reported variant, c.A2084; p.Gln695Arg, replaces a glutamine with an arginine in the conserved C-terminal domain of VPS53. The splice-site variant c.1556 + 5 G > A is predicted to result in a truncated protein, with dysfunctional VPS53 and GARP complex. Reported work suggests that variants in the VPS53 subunit may lead to progressive cerebello-cerebral atrophy (PCCA). The phenotypes in our family clinically resemble PCCA. Interestingly, families with PCCA-associated ID from Jewish or Moroccan ancestry were previously reported to carry VPS53 variants. However, the pathomechanism of VPS53 variants in PCCA is not yet fully understood and future functional studies will be needed.
In family MR-7, a novel homozygous variant was identified in GLB1 (c.C1318T, p.His440Tyr), in which the amino acid histidine (CAC) is replaced with tyrosine (TAC). GLB1 encodes the enzyme β-galactosidase, which is located in lysosomes and degrades GM1 gangliosides there. Deficiency of the β-galactosidase enzyme leads to accumulation of GM1 gangliosides and thus to lysosomal storage disease [23, 24]. So far, at least 185 disease-causing variants have been reported in the GLB1 gene. The variant we identified is located within the B-domain 1 of the β-galactosidase structure. The B-domain consists of four beta sheets, and variants in the protein core region have been implicated in large structural changes of the β-galactosidase protein. These variants could directly impair the degradative activity of lysosomes.
In family MR-8, we identified a novel homozygous missense variant in MLC1 (c.C959A, p.Thr320Lys), in which the amino acid threonine (ACG) is replaced with lysine (AAG). Variants in the MLC1 gene have been implicated in megalencephalic leukoencephalopathy with subcortical cysts, an autosomal recessive disorder characterized by macrocephaly, progressive motor and cognitive features, and variable presence of subcortical cysts [26,27,28]. So far, at least 111 disease-causing variants have been identified in the MLC1 gene. In the present study, we identified a novel variant (p.Thr320Lys) in MLC1 segregating with a consistent phenotype in our family. In addition, in-silico analysis of the identified variant and co-segregation studies strongly implicate this variant in the disease of these patients.
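The three reported changes can be cross-checked with simple arithmetic: a coding position p falls in codon (p - 1) // 3 + 1, at offset (p - 1) % 3 within that codon, and the stated reference and variant codons should differ only at that offset. The sketch below applies this to the codons quoted in the text, using the standard genetic code and assuming coding positions are numbered from the A of the start codon; the helper name is hypothetical and the output uses one-letter amino acid codes.

```python
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, built in TCAG order for all 64 codons.
CODON_TABLE = dict(zip(
    (a + b + c for a in BASES for b in BASES for c in BASES),
    AMINO_ACIDS,
))


def describe_substitution(cdna_pos, ref_codon, alt_codon):
    """Return e.g. 'P202L' after checking position/codon consistency."""
    codon_number = (cdna_pos - 1) // 3 + 1
    offset = (cdna_pos - 1) % 3
    changed = [i for i in range(3) if ref_codon[i] != alt_codon[i]]
    if changed != [offset]:
        raise ValueError("codons do not differ exactly at the stated position")
    return f"{CODON_TABLE[ref_codon]}{codon_number}{CODON_TABLE[alt_codon]}"


print(describe_substitution(605, "CCA", "CTA"))   # P202L (VPS53)
print(describe_substitution(1318, "CAC", "TAC"))  # H440Y (GLB1)
print(describe_substitution(959, "ACG", "AAG"))   # T320K (MLC1)
```

All three results agree with the nucleotide positions and codon changes given in the Discussion, so residue numbers 202, 440 and 320 are the ones consistent with c.605, c.1318 and c.959 respectively.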
In conclusion, we have been able to identify novel pathogenic variants in the ID-related genes VPS53, GLB1 and MLC1. These results allowed us to identify the molecular diagnosis for the recruited families and will also be important for interpreting variants that will be identified in other Pakistani patients and families in the future. The study also helps families to acquire better disease treatment and management.
Availability of data and materials
The patients’ data are available from the corresponding author on request. The Exome datasets analysed in the study have been deposited in the Harvard dataverse under the following links: https://doi.org/10.7910/DVN/0LN0GK, https://doi.org/10.7910/DVN/CKCSRJ, https://doi.org/10.7910/DVN/PLJLNA. The additional file is also available using the link: https://doi.org/10.7910/DVN/HBDJAP.
WES: Whole Exome Sequencing
VPS53: Vacuolar Protein Sorting 53
GLB1: Galactosidase, beta 1
MLC1: Megalencephalic Leukoencephalopathy with Subcortical Cysts
STR: Short tandem repeat
OMIM: Online Mendelian Inheritance in Man
CADD: Combined Annotation Dependent Depletion
GERP: Genomic Evolutionary Rate Profiling
ExAC: Exome Aggregation Consortium
gnomAD: Genome Aggregation Database
GATK: The Genome Analysis Toolkit
Di Resta C, Galbiati S, Carrera P, Ferrari M. Next-generation sequencing approach for the diagnosis of human diseases: open challenges and new opportunities. EJIFCC. 2018;29(1):4.
American Psychiatric Association. Diagnostic and statistical manual of mental disorders (DSM-5®). Washington, D.C.: American Psychiatric Pub; 2013.
Moeschler JB, Shevell M. Comprehensive evaluation of the child with intellectual disability or global developmental delays. Pediatrics. 2014;134(3):e903–18.
Kochinke K, Zweier C, Nijhof B, Fenckova M, Cizek P, Honti F, Keerthikumar S, Oortveld MA, Kleefstra T, Kramer JM. Systematic phenomics analysis deconvolutes genes mutated in intellectual disability into biologically coherent modules. Am J Hum Genet. 2016;98(1):149–64.
De Ligt J, Willemsen MH, Van Bon BW, Kleefstra T, Yntema HG, Kroes T, Vulto-van Silfhout AT, Koolen DA, De Vries P, Gilissen C. Diagnostic exome sequencing in persons with severe intellectual disability. N Engl J Med. 2012;367(20):1921–9.
Reading R. Diagnostic exome sequencing in persons with severe intellectual disability. Child Care Health Dev. 2013;39(2):301.
Wright CF, FitzPatrick DR, Firth HV. Paediatric genomics: diagnosing rare disease in children. Nat Rev Genet. 2018;19(5):253.
Srivastava S, Cohen JS, Vernon H, Barañano K, McClellan R, Jamal L, Naidu S, Fatemi A. Clinical whole exome sequencing in child neurology practice. Ann Neurol. 2014;76(4):473–83.
Santos-Cortez RLP, Khan V, Khan FS, Chakchouk I, Lee K, Rasheed M, Hamza R, Acharya A, Ullah E, Saqib MAN. Novel candidate genes and variants underlying autosomal recessive neurodevelopmental disorders with intellectual disability. Hum Genet. 2018;137(9):735–52.
Han JY, Jang JH, Park J, Lee IG. Targeted next-generation sequencing of Korean patients with developmental delay and/or intellectual disability. Front Pediatr. 2018;6:391.
Jamra R. Genetics of autosomal recessive intellectual disability. Med Genet. 2018;30(3):323–7.
Najmabadi H, Hu H, Garshasbi M, Zemojtel T, Abedini SS, Chen W, Hosseini M, Behjati F, Haas S, Jamali P. Deep sequencing reveals 50 novel genes for recessive cognitive disorders. Nature. 2011;478(7367):57.
Megahed H, Nicouleau M, Barcia G, Medina-Cano D, Siquier-Pernet K, Bole-Feysot C, Parisot M, Masson C, Nitschké P, Rio M. Utility of whole exome sequencing for the early diagnosis of pediatric-onset cerebellar atrophy associated with developmental delay in an inbred population. Orphanet J Rare Dis. 2016;11(1):57.
Riazuddin S, Hussain M, Razzaq A, Iqbal Z, Shahzad M, Polla D, Song Y, van Beusekom E, Khan A, Tomas-Roca L. Exome sequencing of Pakistani consanguineous families identifies 30 novel candidate genes for recessive intellectual disability. Mol Psychiatry. 2017;22(11):1604.
Harripaul R, Vasli N, Mikhailov A, Rafiq MA, Mittal K, Windpassinger C, Sheikh TI, Noor A, Mahmood H, Downey S. Mapping autosomal recessive intellectual disability: combined microarray and exome sequencing identifies 26 novel candidate genes in 192 consanguineous families. Mol Psychiatry. 2018;23(4):973.
Hu H, Kahrizi K, Musante L, Fattahi Z, Herwig R, Hosseini M, Oppitz C, Abedini SS, Suckow V, Larti F. Genetics of intellectual disability in consanguineous families. Mol Psychiatry. 2019;24(7):1027.
Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114–20.
Sherry ST, Ward M-H, Kholodov M, Baker J, Phan L, Smigielski EM, Sirotkin K. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 2001;29(1):308–11.
Lek M, Karczewski KJ, Minikel EV, Samocha KE, Banks E, Fennell T, O’Donnell-Luria AH, Ware JS, Hill AJ, Cummings BB. Analysis of protein-coding genetic variation in 60,706 humans. Nature. 2016;536(7616):285.
Conibear E, Stevens TH. Vps52p, Vps53p, and Vps54p form a novel multisubunit complex required for protein sorting at the yeast late Golgi. Mol Biol Cell. 2000;11(1):305–23.
Liewen H, Meinhold-Heerlein I, Oliveira V, Schwarzenbacher R, Luo G, Wadle A, Jung M, Pfreundschuh M, Stenner-Liewen F. Characterization of the human GARP (Golgi associated retrograde protein) complex. Exp Cell Res. 2005;306(1):24–34.
Feinstein M, Flusser H, Lerman-Sagie T, Ben-Zeev B, Lev D, Agamy O, Cohen I, Kadir R, Sivan S, Leshinsky-Silver E. VPS53 mutations cause progressive cerebello-cerebral atrophy type 2 (PCCA2). J Med Genet. 2014;51(5):303–8.
Kannebley JS, Silveira-Moriyama L, Bastos LO, Steiner CE. Clinical findings and natural history in ten unrelated families with juvenile and adult GM1 gangliosidosis. In: JIMD Reports, vol. 24. Berlin, Heidelberg: Springer; 2015. p. 115–22.
Sandhoff K, Harzer K. Gangliosides and gangliosidoses: principles of molecular and metabolic pathogenesis. J Neurosci. 2013;33(25):10195–208.
Stenson PD, Mort M, Ball EV, Evans K, Hayden M, Heywood S, Hussain M, Phillips AD, Cooper DN. The human gene mutation database: towards a comprehensive repository of inherited mutation data for medical research, genetic diagnosis and next-generation sequencing studies. Hum Genet. 2017;136(6):665–77.
Van der Knaap M, Barth P, Stroink HA, Van Nieuwenhuizen O, Arts W, Hoogenraad F, Valk J. Leukoencephalopathy with swelling and a discrepantly mild clinical course in eight children. Ann Neurol. 1995;37(3):324–34.
Leegwater PA, Yuan BQ, van der Steen J, Mulders J, Könst AA, Boor PI, Mejaski-Bosnjak V, van der Maarel SM, Frants RR, Oudejans CB. Mutations of MLC1 (KIAA0027), encoding a putative membrane protein, cause megalencephalic leukoencephalopathy with subcortical cysts. Am J Hum Genet. 2001;68(4):831–8.
Bonkowsky JL, Nelson C, Kingston J, Filloux F, Mundorff M, Srivastava R. The burden of inherited leukodystrophies in children. Neurology. 2010;75(8):718–25.
Stenson PD, Ball EV, Mort M, Phillips AD, Shiel JA, Thomas NS, Abeysinghe S, Krawczak M, Cooper DN. Human gene mutation database (HGMD®): 2003 update. Hum Mutat. 2003;21(6):577–81.
We gratefully acknowledge the participation of both affected and unaffected members of the three families in the present study. We also acknowledge the Higher Education Commission project_7028 and IRSIP, Islamabad, Pakistan, and the UCL Institute of Neurology, Queen Square, London, which financially supported the research program of PhD student Muhammad Ilyas.
Ethical approval and consent to participate
The study was approved by the board of advanced studies and research and the ethical committee (No: IIU (BI&BT)/FBAS-2018-3595) of the International Islamic University, Islamabad, Pakistan, and by University College London, UCL Institute of Neurology, in accordance with the principles of the Declaration of Helsinki. Written informed consent was obtained from all individuals, with parental or legal guardian consent for children.
The project was partially funded by the Higher Education Commission Pakistan (project No_7028), by an International Research Support Initiative Program (IRSIP) award (Grant No: 1–8/HEC/HRD/2017/8378, PIN: IRSIP 39 BMS 42) to MI, and by the UCL Institute of Neurology, Queen Square, London, UK. Exome and Sanger sequencing work was carried out at, and funded by, the UCL Institute of Neurology, Queen Square, London, UK. The funders did not play any role in the study design, data collection, interpretation or preparation of the manuscript.
Consent for publication
The authors declare that they have no competing interests.
List of primers used for segregation analysis. Supplementary Table 2. Exome sequencing of two patients from family MR-4 revealed a VPS53 mutation. Supplementary Table 3. Exome sequencing of two patients from family MR-7 revealed a GLB1 mutation. Supplementary Table 4. Exome sequencing of one patient from family MR-8 revealed an MLC1 mutation.
Cite this article
Ilyas, M., Efthymiou, S., Salpietro, V. et al. Novel variants underlying autosomal recessive intellectual disability in Pakistani consanguineous families. BMC Med Genet 21, 59 (2020). https://doi.org/10.1186/s12881-020-00998-z