A novel G6PD deleterious variant identified in three families with severe glucose-6-phosphate dehydrogenase deficiency

Background Glucose-6-phosphate dehydrogenase deficiency (D-G6PD) is an X-linked recessive disorder resulting from deleterious variants in the housekeeping gene glucose-6-phosphate 1-dehydrogenase (G6PD), which impair the response to oxidizing agents. Screening for new variants of the gene helps with early diagnosis of D-G6PD, reducing disease-related complications and ultimately increasing the life expectancy of patients. Methods One thousand five hundred sixty-five infants with pathological jaundice were screened for G6PD variants by Sanger sequencing of all 13 exons and the exon-intron junctions of the G6PD gene. Results We detected G6PD variants in 439 (28.1%) of the 1565 infants with pathological jaundice. In total, 9 types of G6PD variants were identified in our cohort, and a novel G6PD missense variant, c.1118 T > C, p.Phe373Ser, in exon 9 of the G6PD gene was detected in three families. Infants with this novel variant showed decreased G6PD activity, severe anemia, and pathological jaundice, consistent with a Class I G6PD deleterious variant. Analysis of the resulting protein’s structure revealed that this novel variant affects G6PD protein stability, which could be responsible for the pathogenesis of D-G6PD in these patients. Conclusions High rates of G6PD variants were detected in infants with pathological jaundice, and a novel Class I G6PD deleterious variant was identified in our cohort. Our data reveal that variant analysis is helpful for the diagnosis of D-G6PD in patients, and also for expanding the spectrum of known G6PD variants used for carrier detection and prenatal diagnosis.

More than 300 G6PD gene variants have been described in D-G6PD. These variants have been categorized into five classes, from Class I to Class V, by the World Health Organization (WHO) based on biochemical phenotype and clinical manifestations [11][12][13]. About 400 million people worldwide are estimated to have D-G6PD [14]. This condition appears most frequently in certain parts of Africa, the Mediterranean, Asia, and the Middle East. Some genetic variants have reached a high incidence in people from certain parts of the world because they confer a selective advantage against malaria [15]. Mutations at different sites of the G6PD gene result in different effects on enzyme activity [16][17][18][19] (Fig. 1). The majority of variants of the G6PD gene result in red cell enzyme deficiency by decreasing enzyme stability [18,19]. The polymorphic variants of the G6PD gene affect amino acid residues at multiple sites all over the enzyme and decrease the stability of the enzyme in the red cell, possibly by affecting protein folding [18][19][20]. These unfavorable variants of the G6PD gene mostly disturb residues at the dimer interface, or the residues responsible for association with a structural NADP molecule that stabilizes the enzyme [21][22][23][24][25]. De novo variants appear to be very rare and cause the more severe condition of chronic non-spherocytic hemolytic anemia [20].
Fig. 1 The common variants and classification in the G6PD gene. The G6PD gene variants are classified into Class I to Class IV based on the genotype and clinical manifestations (WHO classification). Red color indicates a Class I severe enzyme deficiency with chronic non-spherocytic hemolytic anemia (CNSHA). Blue color indicates a Class II severe enzyme deficiency with less than 10% of the normal activity. Orange color indicates a Class III mild to moderate enzyme deficiency (10 to 60% of normal activity). Black color indicates a Class IV very mild or almost normal enzyme activity (> 60% normal activity and no clinical problem). Data from G6PD gene variant database (http://www.bioinf.org.uk)
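The activity thresholds given in the Fig. 1 caption can be sketched as a small lookup. This is only an illustrative sketch: the function name `who_class` and the `has_cnsha` flag are hypothetical, Class V is omitted because no threshold for it is stated here, and treating "activity below 10% plus CNSHA" as Class I is a simplifying assumption rather than a clinical rule.

```python
def who_class(activity_percent, has_cnsha=False):
    """Map residual G6PD activity (% of normal) to a WHO class label.

    Thresholds follow the Fig. 1 caption; `has_cnsha` (hypothetical flag)
    marks chronic non-spherocytic hemolytic anemia.
    """
    if activity_percent < 0:
        raise ValueError("activity must be non-negative")
    if activity_percent < 10:
        # Severe deficiency: Class I if CNSHA is present, otherwise Class II.
        return "I" if has_cnsha else "II"
    if activity_percent <= 60:
        # Mild to moderate deficiency (10 to 60% of normal activity).
        return "III"
    # Very mild or almost normal activity (> 60% of normal).
    return "IV"


# Example calls with made-up activity values.
for activity, cnsha in [(4.0, True), (4.0, False), (35.0, False), (80.0, False)]:
    print(activity, cnsha, who_class(activity, cnsha))
```

A real classification would also weigh clinical findings that a single activity value cannot capture.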
Pathological jaundice is an important condition and accounts for a large number of Neonatal Intensive Care Unit (NICU) admissions. Jaundice in infants commonly presents in the first week of life. Pathological jaundice appears as early as the first day of life and can lead to adverse complications in the absence of timely intervention. A total serum bilirubin (TSB) level above the 95th percentile for an infant's age (in hours) is defined as hyperbilirubinemia, which occurs in 8-9% of infants during the first week of life [26,27]. The frequency of G6PD deficiency in infants with jaundice is well reported; however, the frequency of G6PD variants in infants with both G6PD deficiency and jaundice has not been studied to a great extent [28][29][30]. In this study, we screened for G6PD variants via DNA sequencing using blood samples from infants with pathological jaundice who were suspected to have D-G6PD. We identified a new G6PD deleterious variant in three families with D-G6PD. Our data are indicative of the molecular mechanism underlying D-G6PD, and of the importance of molecular diagnosis and genetic screening for this disease.

Patient data and family consent
One thousand five hundred sixty-five infants born with pathological jaundice at Renmin Hospital of Wuhan University between September 2015 and September 2018 were screened for G6PD gene variants. The identified novel variant was verified in 350 infants without jaundice as unrelated controls, and 80 blood donors serving as healthy controls were also screened. The red cell count (RBC), hemoglobin (Hb), hematocrit (HCT), total bilirubin (TBIL) and direct bilirubin (DBIL) were tested in all the newborns using routine clinical laboratory methods as previously reported [31][32][33]. The Ethical Committee of Renmin Hospital of Wuhan University approved this study. Written consent was obtained from the families for reporting their clinical details.

G6PD variants analyzed in infants with pathological jaundice
One thousand five hundred sixty-five infants with pathological jaundice underwent G6PD variant screening by PCR amplification of the exons of the G6PD gene followed by Sanger sequencing. As a control, more than 300 unrelated infants (without jaundice) were also screened. The variant detection rate was 28.1%: 439 of the 1565 infants carried one of 9 different G6PD variants. Detailed characteristics of each G6PD gene variant in two splice forms and the number of times each was detected are summarized in Table 1. We also calculated the proportion of each variant among the 439 positive infants (Table 1). The most frequently detected G6PD variants were c.1388G > A, p.Arg463His (detection rate 9.33%) and c.1376G > T, p.Arg459Leu (8.95%). These two variants accounted for 33.26% and 31.89% of the variant-positive infants, respectively, and represent the majority of variants found in our cohort. The other variants, from most to least frequent, were: c.95A > G, p.His32Arg (3.19%) with a proportion of 11.39%; c.871G > A, p.Val291Met (2.04%) with a proportion of 7.29%; c.1024C > T, p.Leu342Phe (1.79%) with a proportion of 6.38%; c.466G > A, p.Glu156Lys (1.02%) with a proportion of 3.64%; and c.1192G > A, p.Glu398Lys and c.1004C > A, p.Ala335Asp, each with a very low incidence of 0.77% and a proportion of 2.73%.
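The two percentages quoted for each variant are related by simple arithmetic: the detection rate is taken over all 1565 screened infants, while the proportion is taken over the 439 variant-positive infants. The sketch below illustrates the conversion using only the figures quoted above; the function and variable names are illustrative, and because the published rates are rounded to two decimals, the recomputed proportions agree with Table 1 only approximately.

```python
# Cohort sizes as stated in the text.
TOTAL_SCREENED = 1565  # infants with pathological jaundice
TOTAL_POSITIVE = 439   # infants in whom a G6PD variant was detected


def overall_detection_rate(positive=TOTAL_POSITIVE, screened=TOTAL_SCREENED):
    """Percentage of screened infants carrying any G6PD variant."""
    return 100.0 * positive / screened


def proportion_among_positive(rate_percent, positive=TOTAL_POSITIVE,
                              screened=TOTAL_SCREENED):
    """Convert a per-cohort detection rate (%) into a share of positives (%)."""
    return rate_percent * screened / positive


# Detection rates (%) quoted in the text for the most frequent variants.
DETECTION_RATES = {
    "c.1388G>A": 9.33,
    "c.1376G>T": 8.95,
    "c.95A>G": 3.19,
    "c.871G>A": 2.04,
}

print(f"overall detection rate: {overall_detection_rate():.1f}%")
for variant, rate in DETECTION_RATES.items():
    share = proportion_among_positive(rate)
    print(f"{variant}: rate {rate:.2f}% -> about {share:.2f}% of positives")
```

Working from the raw counts in Table 1 instead of rounded rates would avoid the small rounding differences.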

Mapping and sequencing for the novel G6PD gene variant
A novel G6PD variant, c.1118 T > C, p.Phe373Ser, had the lowest incidence (0.19%) in our cohort, with a proportion of 0.68% among the detected variants (Table 1). This variant appeared in 3 families, with a total of 14 family members affected (Family 1: 7 affected; Family 2: 4 affected; Family 3: 3 affected) (Fig. 2). Analysis of all 13 exons of G6PD in the three probands, compared with the unrelated cohort, revealed a hemizygous missense variant, c.1118 T > C, in exon 9. This results in a putative amino acid change from phenylalanine (TTC) to serine (TCC) at codon 373, p.Phe373Ser (Fig. 3a~c). Exon 9 of the G6PD gene was then examined in all the other members of these three families. The missense variant c.1118 T > C, p.Phe373Ser was inherited from the mothers and not the fathers (Fig. 3d~f) (data shown only for Family 1). Moreover, c.1118 T > C, p.Phe373Ser was not present in 350 unrelated infants or in 80 healthy controls. The CADD score indicated that G6PD c.1118 T > C is a potentially pathogenic variant (CADD score 25.5). The GERP score showed that G6PD c.1118 T is highly conserved (score 5.82).
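The link between the coding position c.1118 and codon 373 follows from counting in triplets. As a minimal sketch (the helper names are hypothetical, and the codon table holds only the two codons named in the text), the position can be mapped to its codon and the T > C substitution applied to TTC:

```python
# Only the two codons mentioned in the text; standard genetic code.
CODON_TABLE = {"TTC": "Phe", "TCC": "Ser"}


def codon_position(cds_pos):
    """Return (codon number, 0-based offset within the codon) for a 1-based
    coding-sequence position."""
    codon_number = (cds_pos - 1) // 3 + 1
    offset = (cds_pos - 1) % 3
    return codon_number, offset


def apply_substitution(codon, offset, ref, alt):
    """Replace the base at `offset` after checking it matches the reference."""
    if codon[offset] != ref:
        raise ValueError("reference base does not match the codon")
    return codon[:offset] + alt + codon[offset + 1:]


number, offset = codon_position(1118)
mutant = apply_substitution("TTC", offset, "T", "C")
print(f"codon {number}, base {offset + 1} of 3: "
      f"TTC ({CODON_TABLE['TTC']}) -> {mutant} ({CODON_TABLE[mutant]})")
```

Position 1118 falls on the second base of codon 373, which is why a single T > C change converts the phenylalanine codon into a serine codon.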

Novel Class I G6PD gene variant identified in the infants from 3 families
To examine whether the novel variant impacts G6PD function, we also examined RBC, Hb, HCT, TBIL, DBIL and G6PD activity in all members of the three families. Results are shown in Table 2. The G6PD activity in the three probands was 3.5% (Family 1 III1), 4.6% (Family 2 II2) and 5.1% (Family 3 II1), all below 10% of normal activity (100%). The Hb values in the three probands were examined and severe anemia was indicated in the infants (Family 1 III1, 89 g/L;

Pathogenicity analysis of the novel G6PD mutant
Using bioinformatics tools that predict the effect of amino acid substitutions on protein function, the missense variant c.1118 T > C, p.Phe373Ser was classified as pathogenic (Table 3). The MutPred prediction score for this mutant was 0.902, indicating that the variant is deleterious and is likely to cause a gain of disorder (statistically significant, p = 0.007). MutPred2 prediction analysis showed that the amino acid substitution of this variant is pathogenic (score 0.885), affecting amino acids 137, 173 and 233, and also the eukaryotic linear motifs.
We further analyzed whether the c.1118 T > C, p.Phe373Ser variant meets the pathogenic criteria of the American College of Medical Genetics (ACMG) for the classification of variants [34]. Our data showed that the c.1118 T > C, p.Phe373Ser variant meets 2 pathogenic strong (PS) criteria: de novo variant confirmed in parents (PS2) and appeared in several members of 3 families with increased segregation (PS1), and 2 pathogenic moderate (PM) criteria: a novel missense change at an amino acid (PM5) localizing at the well-studied functional domain (PM1). In addition, this

Conservation and stability analysis of the novel G6PD mutant
According to Alamut, both nucleotide c.1118 T and amino acid phenylalanine 373 are highly conserved. MultAlin multiple sequence alignments of G6PD protein sequences from different species (Fig. 4) also showed high evolutionary conservation of phenylalanine-373, which is substituted by serine in the affected members of the Chinese families. The angles and interatomic distances around the Phe373 and Phe373Ser residues were studied using the software Swiss-PdbViewer (DeepView). When the phenylalanine residue was substituted by a serine residue, the distance between the atoms increased markedly from 3.81 Å (wild type) to 7.51 Å (variant type), and the angle of the three amino acids expanded significantly from 20.67° (wild type) to 44.39° (variant type) (Fig. 5).
The stability prediction for the variant and the properties of the structural environment along with its values for wildtype and mutant residues were also studied.

Discussion
Several factors play a role in infants developing pathological jaundice, including an imbalance between production, conjugation, and elimination of bilirubin, environmental factors, and ethnicity [35]. G6PD deficiency is one of the common etiological factors for infant pathological jaundice, and G6PD variants are the major cause of D-G6PD in China. We screened for G6PD gene variants in infants with pathological jaundice in the Wuhan area and found gene variants in 28.1% of infants, which is comparable to the 31.5% detection rate reported by Yazd in Neonatal Pathologic Hyperbilirubinemia [28]. Our data are the first report of the incidence of G6PD gene variants in infants with G6PD deficiency and pathological jaundice in the southeastern area of China.
Screening for G6PD variants and prevention of the clinical manifestations of D-G6PD are essential for public health. Many factors, such as chemotherapeutic drugs and household and environmental agents, trigger hemolytic anemia in patients with G6PD deficiency. Newborn screening is usually performed 1 week after birth. This screening helps prevent hemolysis prompted by treatment of infections and other triggers. In addition, screening will allow for immediate treatment of severe anemia and hemolysis with resuscitation and
Table 2 The clinical data of three families
Items    Family 1                              Family 2           Family 3
patient  I1  I2  II1  II2  II3  III1  III2     I1  I2  II1  II2   I1  I2  II1
age      61  59  39   36   27   2.5   0.5      36  37  8    1     32  29
The screening results should be shared with the parents and necessary education on D-G6PD should be provided. All of these approaches might be excellent prophylactic measures for preventing hemolytic crises later in life for the infants. Our study identified a novel G6PD variant, which will broaden the spectrum of G6PD variants used for diagnosis of G6PD deficiency, carrier detection and prenatal diagnosis. Several methods have been used for screening G6PD variants, for example, G6PD variant detection arrays and multicolor melting curve analysis. However, these methods have difficulty detecting novel G6PD variants. We used Sanger sequencing following PCR amplification of all G6PD exons and the exon-intron boundaries, which can detect both known and unknown variants in patients. Next-generation sequencing is also used for identifying novel G6PD variants; however, it is neither time-effective nor cost-effective and is not suitable for large-scale screening. Our method is a simple, quick and economical screening method for G6PD variants. Different G6PD gene variants cause different levels of enzyme deficiency and different disease manifestations [36].
Our data showed that c.1388G > A, c.1376G > T and c.95A > G were the three most common G6PD variants, accounting for 76.54% of the total disease alleles in this study cohort. The detection rate and proportion of each variant were similar to those in other regions of south China [31-33, 37, 38]. The two common variants (detection rate > 5%) and the low frequency variants (detection rate 1-5%) found in this study each belong to one of the five classes of deleterious variants, with 2 variants (c.95A > G and c.1192G > A) in Class I, 2 variants (c.1376G > T and c.1388G > A) in Class II, and 2 variants (c.466G > A and c.871G > A) in Class III, all of which are reported to cause different degrees of enzyme deficiency [33].
Fig. 4 Evolutionary conservation of identified G6PD missense variant. a As shown, the secondary structure and p.Phe373 are located at the end of the beta strand, the green dot depicts the NADP binding site, the orange dot shows the N6-acetyllysine modified residue, and the red dot shows the substrate binding site. b G6PD sequences from different species were aligned using Clustal Omega in order to investigate the evolutionary conservation of the phenylalanine residue located in the codon. The multiple alignments revealed total conservation of this residue across all species, suggesting that this residue is crucial for the protein functionality
We also identified a novel G6PD variant, c.1118 T > C. Infants with this variant presented with pathological jaundice. Pathogenicity analysis showed that this is a deleterious, pathogenic variant. Conservation and stability analysis showed that this variant would reduce the stability of the G6PD protein. Infants with this variant had severe anemia, which showed the morphological characteristics of nonspherocytic hemolytic anemia (data not shown). Therefore, the identified novel variant c.1118 T > C belongs to Class I. These data provide evidence that this novel G6PD variant is a cause of nonspherocytic hemolytic anemia and has a significant clinical impact on the pathology of D-G6PD, although the frequency of the variant is low in this cohort. In the future, functional analysis of the novel variant will be performed, particularly evaluating the effect of the variant on G6PD protein stability and cellular activity. Combining the cellular effect of the novel variant with a clinical cohort study focusing on the novel variant will clarify its role in the pathogenesis of D-G6PD.

Conclusions
High rates of G6PD variants were detected in infants with pathological jaundice, and a novel Class I G6PD variant was identified in our cohort. Our data reveal that variant analysis is helpful for the diagnosis of D-G6PD and also for expanding the spectrum of G6PD variants evaluated in carrier detection and prenatal diagnosis.
Additional file 1 : Table S1. The primers for amplification and sequencing the exons of G6PD gene.
Abbreviations
G6PD: Glucose-6-phosphate dehydrogenase; D-G6PD: Glucose-6-phosphate dehydrogenase deficiency; NADPH: Nicotinamide Adenine Dinucleotide Phosphate; RBC: Red cell count; Hb: Hemoglobin; HCT: Hematocrit; TBIL: Total bilirubin; DBIL: Direct bilirubin; NBT: Nitroblue tetrazolium

The authors' contributions to the study were as follows: YT collected clinical data, performed data analysis/sequencing and provided overall guidance. BL performed clinical data collection, DNA sequencing and data analysis. HZ collected clinical data and performed sequencing and data analysis. AB, ZW and JG performed clinical data collection/analysis. CS, MM, SK and BT drafted the manuscript, developed manuscript concepts and analyzed data. YL performed clinical data collection/analysis, provided overall instruction, participated in manuscript concept development, and provided the grant support.

Funding
This project was supported by the National Natural Science Foundation of China (81502087 to YL) for some tests and sequencing of the clinical samples. All the lab tests were performed in the Department of Clinical Laboratory, Renmin Hospital of Wuhan University, China.

Availability of data and materials
All data supporting the results reported in a published article can be found. The patient datasets for the current study are not publicly accessible in