Updated carrier rates for c.35delG (GJB2) associated with hearing loss in Russia and common c.35delG haplotypes in Siberia

Background: Mutations in the GJB2 gene are a major cause of deafness, and their spectrum and prevalence are specific to various populations. The well-known mutation c.35delG is more frequent in populations of Caucasian origin. Data on the c.35delG prevalence in Russia are mainly restricted to the European part of the country. We aimed to estimate the carrier frequency of c.35delG in Western Siberia and thereby update current data on the c.35delG prevalence in Russia. According to a generally accepted hypothesis, c.35delG originated from a common ancestor in the Middle East or the Mediterranean ~10,000–14,000 years ago and spread throughout Europe with Neolithic migrations. To test the c.35delG common origin hypothesis, we reconstructed haplotypes bearing c.35delG and evaluated the approximate age of c.35delG in Siberia.
Methods: The carrier frequency of c.35delG was estimated in 122 unrelated hearing individuals living in Western Siberia. For reconstruction of haplotypes bearing c.35delG, the polymorphic markers D13S141, D13S175, and D13S1853 flanking the GJB2 gene and the intragenic SNP rs3751385 were genotyped in deaf patients homozygous for c.35delG (n = 24) and in unrelated healthy individuals negative for c.35delG (n = 67) living in Siberia.
Results: We present updated carrier rates for c.35delG in Russia complemented by new data on the c.35delG carrier frequency in Russians living in Western Siberia (4.1%). Two common D13S141-c.35delG-D13S175-D13S1853 haplotypes, 126-c.35delG-105-202 and 124-c.35delG-105-202, were reconstructed in the c.35delG homozygotes from Siberia. Moreover, identical allelic composition of the two most frequent c.35delG haplotypes delimited by D13S141 and D13S175 was established in geographically remote regions: Siberia and the Volga-Ural region (Russia) and Belarus (Eastern Europe).
Conclusions: Distribution of the c.35delG carrier frequency in Russia is characterized by pronounced ethno-geographic specificity with a downward trend from west to east. Comparative analysis of the c.35delG haplotypes supports a common origin of c.35delG in some regions of Russia (Volga-Ural region and Siberia) and in Eastern Europe (Belarus). A rough estimate of the c.35delG age in Siberia (about 4800 to 8100 years ago) probably reflects the early formation stages of the modern European population (including the European part of the contemporary territory of Russia), since the settlement of Siberia by Russians started only at the end of the sixteenth century.
Electronic supplementary material: The online version of this article (10.1186/s12881-018-0650-5) contains supplementary material, which is available to authorized users.

The recessive pathogenic GJB2 variant c.35delG (p.Gly12Valfs*2) (NM_004004.5) is known to be prevalent in deaf patients of Caucasian origin [2,12,13]. Previous studies have revealed the c.35delG carrier frequency to be around 1 in 50 overall in Europe [12], reaching 1 in 31 in Southern Europe [14]. In a meta-analysis of the data published up to 2008, mean carrier frequencies of c.35delG were found to be 1.89, 1.52, 0.64, 1, and 0.64% for European, American, Asian, Oceanic, and African populations, respectively [13]. The c.35delG variant is a deletion of one guanine (G) from a string of six (GGGGGG) in the GJB2 coding sequence, resulting in a frameshift and termination of the Cx26 protein sequence at amino acid 13 (p.Gly12Valfs*2). The occurrence of c.35delG as a possible "hot spot" caused by DNA polymerase "slippage" was previously assumed to explain the high prevalence of this pathogenic variant in the GJB2 gene [5,15,16]. Nevertheless, convincing evidence has emerged that the founder effect plays an important role in the prevalence and accumulation of c.35delG in populations of Caucasian origin, since lower rates or absence of c.35delG are observed in other populations. According to a generally accepted hypothesis, c.35delG originated from a common ancestor in the Middle East or the Mediterranean approximately 10,000–14,000 years ago and spread throughout Europe with Neolithic migrations. The specific c.35delG prevalence and the discovery of common STR and SNP haplotypes bearing the c.35delG mutation in Mediterranean, Middle Eastern, and North-European populations, and in individuals of European origin in the USA, support this hypothesis [14,17–27]. Relevant data were also obtained for populations of the Volga-Ural region of Russia [28] and Belarus [29].
The c.35delG predominance in deaf patients was reported in several studies conducted in the European part of Russia [30–38]. In the ethnically heterogeneous population of Siberia, epidemiological and molecular genetic studies of hereditary deafness are currently limited to regions of the Altai Republic, the Tuva Republic (Southern Siberia), and the Sakha Republic (Yakutia, Eastern Siberia) [39–41]. The presence of c.35delG in a homozygous or compound-heterozygous state was the main cause of hereditary hearing loss in deaf Russian patients living in these regions, in contrast to deaf patients belonging to Siberian indigenous peoples (Altaians, Tuvinians, Yakuts), who were negative for the c.35delG mutation [39–41].
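The frameshift described above can be illustrated with a short sketch. The nucleotide string below is an assumption: it is the start of the GJB2 coding sequence written from memory of NM_004004 and should be checked against the RefSeq record before any real use. The helper names (`translate`, `delete_base`) are hypothetical. Deleting the G at coding position 35 shifts the reading frame so that codon 12 becomes Val and codon 13 becomes a stop, which is what p.Gly12Valfs*2 denotes.

```python
# ASSUMPTION: first 18 codons of the GJB2 coding sequence, recalled from
# NM_004004; verify against the RefSeq record before relying on it.
WT_CDS = "".join(
    "ATG GAT TGG GGC ACG CTG CAG ACG ATC CTG GGG GGT GTG AAC AAA CAC TCC ACC".split()
)

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate complete codons from the first base until a stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete_base(cds: str, position: int) -> str:
    """Remove the base at a 1-based coding position."""
    return cds[:position - 1] + cds[position:]


# 1-based start of the run of six guanines (the run ends at position 35).
G_RUN_START = WT_CDS.find("GGGGGG") + 1
MUTANT_CDS = delete_base(WT_CDS, 35)

print("G run starts at c.", G_RUN_START)
print("wild type:", translate(WT_CDS))
print("c.35delG :", translate(MUTANT_CDS))
```

Because all six guanines are identical, deleting any one of them gives the same mutant sequence; the deletion is conventionally assigned to the most 3' position, hence c.35delG.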
This study presents an updated summary of published data on the c.35delG (p.Gly12Valfs*2) prevalence in Russia complemented by our original data on the c.35delG carrier frequency in Western Siberia. To test the c.35delG common origin hypothesis, we genotyped polymorphic markers flanking the GJB2 gene and reconstructed haplotypes bearing c.35delG in deaf patients from Siberia homozygous for c.35delG.

Subjects
Twenty-four unrelated patients (mostly Russians) with congenital or early-onset profound hearing loss living in several Siberian regions (Altai, Tuva, Yakutia) were previously found to be homozygous for c.35delG [39–41]. The carrier frequency of c.35delG was estimated in 122 unrelated normal-hearing individuals (mostly Russians) from the Novosibirsk region (Western Siberia). Genotyping of three polymorphic short tandem repeat (STR) markers (D13S141, D13S175, D13S1853) and an intragenic SNP (rs3751385) was performed in the 24 unrelated deaf patients homozygous for c.35delG and in 67 unrelated healthy individuals from the Novosibirsk region (Western Siberia) who were negative for c.35delG.

c.35delG screening and analysis of genetic markers
All primers and genotyping methods are summarized in Table 1. The c.35delG screening was performed as described in [42]. Polymorphic STR markers flanking the GJB2 gene, namely D13S141 (~39.2 kb centromeric to c.35delG) and D13S175 and D13S1853 (~84.8 kb and ~277.1 kb telomeric to c.35delG, respectively), together with the intragenic SNP rs3751385 located ~0.7 kb centromeric to c.35delG, were used to reconstruct haplotypes bearing c.35delG. These markers had been used in previous relevant studies and were therefore chosen to maintain compatibility and enable comparative analysis with already available data. Genotyping of D13S141, D13S175, and D13S1853 was performed in the SB RAS Genomics Core Facility (Institute of Chemical Biology and Fundamental Medicine SB RAS, Novosibirsk, Russia).
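As a quick check of the marker geometry, the quoted distances can be placed on a single axis and the haplotype spans derived from them. This is a minimal sketch: the sign convention (negative for centromeric, positive for telomeric) and the names `MARKER_OFFSETS_KB` and `span_kb` are illustrative assumptions, while the distances themselves are those given in the text.

```python
# Offsets in kb relative to c.35delG, as quoted in the text.
# ASSUMED convention: negative = centromeric, positive = telomeric.
MARKER_OFFSETS_KB = {
    "D13S141": -39.2,
    "rs3751385": -0.7,
    "c.35delG": 0.0,
    "D13S175": 84.8,
    "D13S1853": 277.1,
}


def span_kb(marker_a: str, marker_b: str) -> float:
    """Physical distance in kb between two markers on the shared axis."""
    return abs(MARKER_OFFSETS_KB[marker_a] - MARKER_OFFSETS_KB[marker_b])


# Marker order along the chromosome, centromeric to telomeric.
ORDER = sorted(MARKER_OFFSETS_KB, key=MARKER_OFFSETS_KB.get)

print(" - ".join(ORDER))
print("D13S141 to D13S175 :", round(span_kb("D13S141", "D13S175"), 1), "kb")
print("D13S141 to D13S1853:", round(span_kb("D13S141", "D13S1853"), 1), "kb")
```

The two spans come out at about 124 kb and about 316 kb, the sizes of the shorter and longer haplotype intervals considered in this study.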

Statistical analysis
Haplotype frequencies were estimated from observed genotype data using the Expectation-Maximization (EM) algorithm implemented in the Arlequin 3.5.2.2 software [43]. Fisher's exact test (significance level 0.05) was used to compare the allelic and haplotype distributions. Linkage disequilibrium between the marker alleles and c.35delG, as well as the age of c.35delG, were estimated as described previously [44,45]. The linkage disequilibrium was calculated as δ = (Pd - Pn)/(1 - Pn), where δ is the measure of linkage disequilibrium, Pd is the frequency of the marker allele among all chromosomes carrying c.35delG, and Pn is the frequency of the same allele among chromosomes without c.35delG. The age of c.35delG was estimated as g = log[1 - Q/(1 - Pn)]/log(1 - θ), where g is the number of generations from the moment of the c.35delG appearance to the present, Q is the share of chromosomes carrying c.35delG unlinked with the founder haplotype, Pn is the frequency of the allele included in the founder haplotype in the population, and θ is the recombination fraction calculated from the physical distance of the markers from the c.35delG location on the assumption of 1 cM = 1000 kb.
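The two estimators can be sketched in a few lines. The formulas used here, δ = (Pd - Pn)/(1 - Pn) and g = log[1 - Q/(1 - Pn)]/log(1 - θ), are the standard forms that match the symbol definitions above; that they are exactly the forms used in [44,45] is an assumption. The function names and the input frequencies in the example are hypothetical and are not data from this study; only the 39.2 kb marker distance and the 1 cM = 1000 kb conversion come from the text.

```python
import math


def recombination_fraction(distance_kb: float, kb_per_cm: float = 1000.0) -> float:
    """Convert physical distance to a recombination fraction (1 cM = 1%)."""
    return (distance_kb / kb_per_cm) / 100.0


def linkage_disequilibrium(p_d: float, p_n: float) -> float:
    """delta = (Pd - Pn) / (1 - Pn)."""
    return (p_d - p_n) / (1.0 - p_n)


def mutation_age_generations(q: float, p_n: float, theta: float) -> float:
    """g = log[1 - Q / (1 - Pn)] / log(1 - theta)."""
    return math.log(1.0 - q / (1.0 - p_n)) / math.log(1.0 - theta)


# HYPOTHETICAL inputs for illustration only (not values from this study).
P_D = 0.80      # founder allele frequency on c.35delG chromosomes
P_N = 0.20      # same allele frequency on chromosomes without c.35delG
Q = 1.0 - P_D   # share of c.35delG chromosomes without the founder allele

THETA = recombination_fraction(39.2)  # D13S141 lies ~39.2 kb from c.35delG
DELTA = linkage_disequilibrium(P_D, P_N)
G = mutation_age_generations(Q, P_N, THETA)

print(f"theta = {THETA:.6f}, delta = {DELTA:.3f}, g = {G:.0f} generations")
```

When Q = 1 - Pd, the argument of the logarithm equals δ, so the age estimate is the number of generations needed for the initial complete association to decay to the observed δ at rate (1 - θ) per generation. Converting generations to years requires an assumed generation length, which this sketch leaves out.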

Carrier frequency of c.35delG in Russia
Screening for c.35delG in unrelated hearing individuals (mostly Russians) living in the Novosibirsk region (Western Siberia) revealed 5 c.35delG carriers among the 122 examined subjects (4.1%). These data supplement current information about the c.35delG prevalence in Russia. We analyzed all available literature data (published up to 2018) on c.35delG carrier frequencies in Russia and in some countries of the former Soviet Union (USSR) whose populations undoubtedly contributed to the contemporary population of Russia. The distribution of c.35delG carrier frequencies is presented in Fig. 1. High c.35delG frequencies are observed in the populations of the north-western and central parts of Russia, with a downward trend from west to east.
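The reported carrier frequency follows directly from the counts, and a confidence interval conveys how much uncertainty a sample of 122 carries. The sketch below uses a Wilson score interval; this is an illustrative choice and not a method stated in the paper, and the function name is hypothetical.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = successes / total
    denom = 1.0 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


CARRIERS = 5
SUBJECTS = 122

CARRIER_FREQ = CARRIERS / SUBJECTS          # heterozygous carriers per person
ALLELE_FREQ = CARRIERS / (2 * SUBJECTS)     # c.35delG alleles per chromosome
LOW, HIGH = wilson_interval(CARRIERS, SUBJECTS)

print(f"carrier frequency: {CARRIER_FREQ:.1%}")
print(f"allele frequency : {ALLELE_FREQ:.2%}")
print(f"approx. 95% interval for carrier frequency: {LOW:.1%} to {HIGH:.1%}")
```

The interval is wide because only five carriers were observed, which is worth keeping in mind when comparing this estimate with frequencies from other regions.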

Discussion
Distribution of the c.35delG carrier frequency in Russia is characterized by pronounced ethno-geographic specificity (Fig. 1). High c.35delG rates are typical for [...] [41].
Haplotype (D13S141-rs3751385-D13S175-D13S1853) analysis of chromosome 13 in c.35delG homozygous deaf Russian patients from Siberia revealed the two most common haplotypes (126-T-c.35delG-105-202 and 124-T-c.35delG-105-202). It is interesting to compare the common c.35delG haplotypes identified in Siberia with available relevant data for other populations [19,21,23–26,28,29] (Table 3). Such comparisons are to some extent possible for the c.35delG haplotypes flanked by D13S141 and D13S175 (~124 kb). For these markers, the [...] (Fig. 2B). Unfortunately, there is no unified classification of the D13S141 and D13S175 alleles based on their size in nucleotides or number of dinucleotide (CA) repeats, because previous studies used different genotyping methods or simple numerical designations for alleles, which makes accurate comparative analysis difficult. Nonetheless, the 105 bp allele of D13S175 was detected in the most common c.35delG haplotypes in the majority of studies, while a broad variety of alleles was observed for D13S141. We were able to compare both allelic size in nucleotides (determined by fragment analysis) and numbers of CA repeats (detected by Sanger sequencing) in the two most frequent D13S141 alleles in the c.35delG homozygotes from Siberia (analyzed in this study), the Volga-Ural region of Russia (kindly provided by L. Dzhemileva [28]), and Belarus [29]. Fourteen CA repeats (CA14) were found in D13S141 alleles 126, 127, and 125, while thirteen CA repeats (CA13) were found in D13S141 alleles 124, 125, and 123 in DNA samples from Siberia, Belarus, and the Volga-Ural region, respectively (Table 3). Thus, identical allelic composition of the two most frequent D13S141-c.35delG-D13S175 haplotypes (~124 kb) was revealed in these geographically remote regions (Table 3). In addition, we assume that the conserved region of the c.35delG haplotypes from Siberia may be longer, being flanked by D13S141 and D13S1853 (~316 kb).
Table 3 notes: Allele designations are taken from the original sources. a, fourteen CA repeats (CA14) were revealed by Sanger sequencing; b, thirteen CA repeats (CA13) were revealed by Sanger sequencing; c, haplotype frequencies were calculated on the basis of the data given in the original sources [19,21,24,26].
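The allele comparison across laboratories amounts to translating each laboratory's fragment-size label into a repeat count before comparing haplotypes. The sketch below encodes only the D13S141 correspondences stated in the text (CA14 for alleles 126, 127, and 125, and CA13 for alleles 124, 125, and 123 in Siberia, Belarus, and the Volga-Ural region, respectively); the dictionary layout and function names are hypothetical.

```python
# D13S141 size label -> number of CA repeats, per data set, as stated in the text.
D13S141_CA_REPEATS = {
    "Siberia": {126: 14, 124: 13},
    "Belarus": {127: 14, 125: 13},
    "Volga-Ural": {125: 14, 123: 13},
}


def harmonize(region: str, allele_size: int) -> str:
    """Return a repeat-based label for a region-specific size label."""
    repeats = D13S141_CA_REPEATS[region][allele_size]
    return f"CA{repeats}"


def same_allele(region_a: str, size_a: int, region_b: str, size_b: int) -> bool:
    """Compare two alleles by repeat count and not by reported size."""
    return harmonize(region_a, size_a) == harmonize(region_b, size_b)


# The same label (125) denotes different alleles in two data sets ...
print(harmonize("Belarus", 125), harmonize("Volga-Ural", 125))
# ... while different labels can denote the same allele.
print(same_allele("Siberia", 126, "Belarus", 127))
```

Comparing by repeat count removes the one or two nucleotide offsets between genotyping setups, which is why raw size labels alone cannot be matched across studies.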
The contemporary Siberian population of Caucasian origin (mostly Russians) was formed as a result of multiple migration flows from the European part of Russia that started with the first settlement of Siberia by Russians at the end of the sixteenth century [51]. Our rough dating of c.35delG in Siberia (about 4800–8100 years ago) could therefore reflect the complex processes of the early formation stages of the modern European population (including the European part of Russia). These data do not contradict the earlier hypothesis that c.35delG arose in the Middle East or the Mediterranean approximately 10,000–14,000 years ago and subsequently spread with migration flows across Europe [14,17–27]. However, taking into account the current data on three ancient components in the origin of modern Europeans [52], it cannot be ruled out that c.35delG could also have independently originated among any ancient populations of North-West Europe or anywhere else.

Conclusions
Distribution of the c.35delG carrier frequency in Russia is characterized by pronounced ethno-geographic specificity. High frequencies of c.35delG are observed in the populations living in the north-western and central parts of Russia, with a downward trend from west to east. The territory of Siberia can be regarded as the north-eastern geographic "end point" of the c.35delG prevalence in Eurasia. Comparative analysis of the c.35delG haplotypes supports the common origin of c.35delG in some regions of Russia (Siberia and the Volga-Ural region) and in Eastern Europe (Belarus). A thorough study of the haplotypes associated with c.35delG in populations from different world regions could further elucidate its origin and age.

Additional file
Additional file 1: Table S1. Detailed data on c.35delG carrier frequencies in Russia and in some countries of the former Soviet Union, obtained from all available papers published up to 2018. This file also includes the list of references for Table S1. (DOCX 42 kb)

Abbreviations
GJB2: Gap junction protein, beta-2; SNP marker: Single nucleotide polymorphic marker; STR marker: Short tandem repeat marker