Gene spectrum analysis of thalassemia for people residing in northern China

Background The southern provinces of China have a high incidence of thalassemia; however, sporadic cases are also found in northern China. Methods People residing in northern China who were suspected of having thalassemia were tested for mutations by gap-polymerase chain reaction (Gap-PCR) and reverse dot blot (RDB) analyses. Those with positive findings from 2012 to 2017 were further analyzed for basic clinical data and ancestral information, obtained from medical records, telephone follow-up, or both. Results Most people enrolled in our study had no or mild symptoms. Among those with positive gene findings, people of northern origin had a higher percentage of β-thalassemia gene mutations than those of southern origin (72.8% vs. 62.4%, χ2 = 9.92, P = 0.001). The distribution of individual mutations did not differ significantly between people of southern and northern origin for either α-thalassemia (P = 0.221) or β-thalassemia (P = 0.979). No significant difference was found in the frequency of α mutations among people living at different altitudes. For β-thalassemia, however, the frequency of the 6 most common mutations differed significantly among people living in provinces with altitudes below 500 m, 500–1000 m, and above 1000 m (χ2 test, P < 0.05). Conclusion Most people in northern China with thalassemia gene mutations were thalassemia carriers. People of northern origin had a higher frequency of β mutations than those of southern origin, but the two groups had similar individual mutation profiles for both α and β mutations. People living at different altitudes had different spectra of β mutations.


Background
Thalassemia is an autosomal inherited disorder caused by reduced or absent synthesis of the alpha or beta globin chains of the hemoglobin (Hb) tetramer. It leads to hereditary anemia and is one of the most pervasive monogenic diseases worldwide [1]. Thalassemia heterozygotes have shown resistance to malaria caused by Plasmodium falciparum [2].
The two main types of thalassemia, α and β thalassemia, can each be subdivided into two forms: α0 and α+ thalassemia, and β0 and β+ thalassemia.
The heterozygous states of α+ thalassemia and α0 thalassemia are often clinically asymptomatic, while the compound heterozygous state for α+ thalassemia and α0 thalassemia usually results in hemoglobin H disease. Hemoglobin Bart's, which represents the homozygous state for α0 thalassemia, leads to death in utero or just after birth. Heterozygous inheritance of a β thalassemia mutation, which is called β thalassemia minor, usually presents as asymptomatic microcytic anemia, while other heterozygotes are silent carriers. Both β thalassemia major and β thalassemia intermedia can result from homozygous or compound heterozygous inheritance of β mutations. Patients with β thalassemia major usually present with severe anemia in infancy and become transfusion dependent for life, whereas patients with β thalassemia intermedia may develop mild to moderate anemia and have variable blood transfusion requirements [3].
Inherited hemoglobin variants occur at high frequency in an area extending from sub-Saharan Africa to the Indian subcontinent and East and Southeast Asia, and especially in the Mediterranean region [4]. In China, thalassemia is endemic mainly in the south, where provinces such as Guangxi, Guangdong, and Fujian are known to be high-incidence areas [5]. The incidence in northern China is extremely low, and previous studies in the north all had sample sizes of fewer than 100.
Because of this rarity, very few reports have focused on the detection of thalassemia mutations and their clinical relevance in northern China [6]. Because of continued migration, these diseases are now becoming increasingly common in international metropolitan cities such as Beijing. Furthermore, sporadic clinical cases do occur in people with purely northern ancestry within three generations.
This study analyzes data from 1059 people living in northern China who carried thalassemia gene mutations, including detailed family histories and gene spectra. It aims to trace the geographic origins of the various alleles identified in northern China and to describe the geographical distribution of mutations among people resident there.

Probands
Among a total of 2136 people who came to Peking Union Medical College Hospital (PUMCH) from 2012 to 2017 to be screened for thalassemia mutations, 1059 (299 males and 760 females) had at least one positive finding (any mutation, heterozygous or homozygous) for an α mutation, a β mutation, or both. PUMCH has a fully equipped hematology clinic and is the only center for the detection of thalassemia gene mutations in northern China, so our data were considered representative of northern China. The majority of people with positive findings originated from the 15 provinces in southern China, while the rest had ancestral homes sporadically distributed across the 15 northern provinces. The diagnosis was confirmed according to the 'Criteria for diagnosis and treatment of hematologic diseases' and the 'Guidelines for the prevention and control of thalassemia' [1].

Analysis of mutations
Samples were collected by standard methods: 3 tubes of fresh venous blood (2 ml each), anticoagulated with EDTA-K2, were used for gene detection, hemoglobin electrophoresis, and blood cell analysis, respectively. Genomic DNA was isolated from the blood samples with a rapid DNA extraction kit (QIAamp DNA Blood Mini Kit, QIAGEN, Hilden, Germany).
The reaction conditions were 96°C for 5 min; 10 cycles of 98°C for 45 s, 65°C for 90 s, and 72°C for 3 min; 25 cycles of 98°C for 30 s, 65°C for 45 s, and 72°C for 3 min; and 72°C for 10 min. Samples were then stored at 4°C. The reaction system and conditions for detecting the 3 common non-deletional α-globin point mutations (HBA2:c.427T>C, HBA2:c.369C>G, and HBA2:c.377T>C) were similar to those for the β-globin genes. The PCR reaction system was as follows: a total volume of 25 μl, containing 23 μl of PCR reaction mix and 2 μl of DNA template. The reaction conditions were 96°C for 5 min; 10 cycles of 98°C for 45 s, 65°C for 90 s, and 72°C for 3 min; 25 cycles of 98°C for 30 s, 65°C for 45 s, and 72°C for 3 min; and 72°C for 10 min. Samples were then stored at 4°C. The genotype was determined from the spot color pattern of the PCR products on the hybridization membrane.
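The cycling program described above can be written down as data, which makes it easy to check the arithmetic of a run. The following is a minimal sketch, not the authors' instrument settings: the `Step` and `Stage` names are hypothetical, and the total counts only the programmed hold times, assuming zero ramp time between temperatures and excluding the final 4°C storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    seconds: int   # programmed hold time


@dataclass(frozen=True)
class Stage:
    steps: tuple   # Steps executed in order
    cycles: int = 1

    def duration(self) -> int:
        """Programmed hold time of the whole stage, in seconds."""
        return self.cycles * sum(step.seconds for step in self.steps)


# Temperatures, times, and cycle counts as stated in the text.
PROGRAM = (
    Stage((Step(96, 300),)),                                        # 96 C, 5 min
    Stage((Step(98, 45), Step(65, 90), Step(72, 180)), cycles=10),  # 10 cycles
    Stage((Step(98, 30), Step(65, 45), Step(72, 180)), cycles=25),  # 25 cycles
    Stage((Step(72, 600),)),                                        # 72 C, 10 min
)


def total_hold_seconds(program) -> int:
    """Sum of programmed hold times; ramp times are assumed to be zero."""
    return sum(stage.duration() for stage in program)


if __name__ == "__main__":
    total = total_hold_seconds(PROGRAM)
    print(f"{total} s = {total / 60:.2f} min")
```

Under these assumptions the program holds for 10425 s, or about 174 min; an actual run would take longer because of ramping.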

Ancestral home statistics
People without searchable identity card records or follow-up information were excluded from the study. Using the hospital information system, people were first classified by identity card number. A telephone follow-up survey was then conducted to confirm their ancestral information, including household registration and the migration history of their ancestors within three generations, in order to determine whether these northern residents were of southern origin.

Data analysis
The frequencies of the 6 α- and 14 β-thalassemia mutations previously shown to be the most common in China were summarized for all 1059 patients. Medical data and basic laboratory parameters were collected for people carrying these mutations, who were further grouped by ancestral origin to examine differences between north and south. People from areas with different average altitudes (below 500 m, 500 to 1000 m, and above 1000 m) were further compared for differences in gene distribution.
SPSS 24.0 software (IBM, NY, USA) was used for statistical analysis. The ratio of α- to β-thalassemia alleles was calculated, and data were analyzed by Fisher's exact test and Pearson's chi-squared test. P < 0.05 was considered statistically significant.
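To make the comparison of α and β allele proportions between two groups concrete, the following is a minimal sketch of Pearson's chi-squared test for a 2 × 2 table. It is an illustration only, not the SPSS procedure used in the study: it applies no continuity correction, the function name `chi2_2x2` is hypothetical, and the counts in the usage example are invented rather than taken from the study data.

```python
import math


def chi2_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-squared test (no continuity correction) for the table

                 alpha   beta
        group 1    a       b
        group 2    c       d

    Returns (chi2, p) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, chi2 is the square of a standard normal
    # variable, so the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


if __name__ == "__main__":
    # Invented counts for illustration only (NOT the study data).
    chi2, p = chi2_2x2(30, 70, 45, 55)
    print(f"chi2 = {chi2:.2f}, P = {p:.3f}")
```

When any expected cell count is small (a common rule of thumb is below 5), Fisher's exact test is the usual alternative, which is presumably why both tests are listed above.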

Demographic data
All subjects investigated here were northern residents who had lived in northern China for more than 3 years. There were 1059 patients in this study, including 359 α-thalassemia carriers, 683 β-thalassemia carriers, and 17 carriers of both α- and β-thalassemia. The mean age of these patients was 30.2 ± 13.9 years (range, 0 to 82 years). Their age and sex distributions were relatively nonhomogeneous: most patients were aged between 21 and 40 years (79.9%), and more than two-thirds of carriers were female (71.8%), probably because hypochromic anemia is more likely in this particular population. Although thalassemia is rare in northern China, there were 42 α-thalassemia (9.5%) and 140 β-thalassemia (15.6%) carriers of purely northern descent within three generations in our study, spread across the northern provinces (Table 1).
Among people of northern origin (northerners), α mutations accounted for 27.2% of all thalassemia genes and β mutations for the remaining 72.8%, while among people of southern origin (southerners) the percentages of α and β mutations were 37.6% and 62.4%, respectively. Northerners seemed to have a higher percentage of β mutations, and people of southern descent a higher frequency of α mutations (χ2 = 6.76, P < 0.05).
Not surprisingly, the majority of people with thalassemia gene mutations in our study could be defined as 'carriers' who had no symptoms, and the rest were 'patients' with mild clinical manifestations. More than 95% of people in our study had no or mild symptoms, including mild anemia (Hb: 9-11 g/dl) and a normal or moderately elevated serum ferritin level (<500 μg/L). However, some people (less than 5%) had severe anemia, elevated total bilirubin, and an enlarged spleen, accompanied by iron overload and other complications. In total, 18 β-thalassemia cases (3 of purely northern origin) were diagnosed as β-thalassemia intermedia (Hb: <9 g/dl), 4 of them transfusion-dependent (TDT). No cases of β-thalassemia major, hemoglobin H disease, or hemoglobin Bart's were found in our study.
Detailed analysis of the spectrum of α-thalassemia mutations in different regions showed no significant difference in the percentages of different α-thalassemia mutations between people from the north and the south (P = 0.275) (Fig. 1).
Similarly, no difference was found in the percentages of different β-thalassemia mutations between the northern and southern regions (P = 0.661, Fig. 1).

Spectrum in different altitudes
As shown in Tables 3 and 4 ), and all these provinces had a frequency of higher than 40%. On the other hand, the mutation HBB:c.52A>T was more common in higher-altitude areas, especially Xinjiang (100.0%), Gansu (75.0%), and Sichuan (47.3%). We found that provinces with similar average altitudes had similar gene distributions. We then classified the provinces according to their average altitude: 15 provinces were below 500 m, 8 were at about 500-1000 m, and 5 were above 1000 m. The frequency of HBB:c.316-197C>T was highest in provinces below 500 m, lower in those above 1000 m, and lowest in those at about 500-1000 m, whereas HBB:c.126_129delCTTT was more common in the provinces with elevations between 500 and 1000 m. The frequency of HBB:c.52A>T increased gradually from plain to plateau areas. The numbers of genes in these regions are summarized in Table 5.
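The altitude grouping described above can be sketched as a small tallying routine. This is a hedged illustration rather than the authors' procedure: the function names are hypothetical, the text does not say which band the exact boundary values of 500 m and 1000 m belong to (here both are assumed to fall in the middle band), and the records in the usage example are invented.

```python
from collections import Counter, defaultdict


def altitude_band(mean_altitude_m: float) -> str:
    """Map a province's average altitude to one of the three bands.

    Assumption: exactly 500 m and exactly 1000 m count as the middle band.
    """
    if mean_altitude_m < 500:
        return "<500 m"
    if mean_altitude_m <= 1000:
        return "500-1000 m"
    return ">1000 m"


def frequencies_by_band(records):
    """records: iterable of (mean_altitude_m, mutation_name) pairs.

    Returns {band: {mutation: fraction of that band's alleles}}.
    """
    counts = defaultdict(Counter)
    for altitude, mutation in records:
        counts[altitude_band(altitude)][mutation] += 1
    return {
        band: {m: n / sum(c.values()) for m, n in c.items()}
        for band, c in counts.items()
    }


if __name__ == "__main__":
    # Invented records for illustration only (NOT the study data).
    demo = [
        (50, "HBB:c.126_129delCTTT"),
        (50, "HBB:c.52A>T"),
        (1500, "HBB:c.52A>T"),
        (1500, "HBB:c.52A>T"),
    ]
    for band, freqs in frequencies_by_band(demo).items():
        print(band, freqs)
```

The per-band counts produced this way form the contingency table on which a chi-squared test across altitude bands would be run.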

Discussion
Although the majority of the people in our study were descendants of southerners, some were indeed 'original' northerners, and this report includes the largest number of cases of northern origin to date. The purpose of this study was to compare the distribution of thalassemia genes between low-incidence areas (north) and 'hot spot' areas (south). To this end, people with the 6 most common α mutations and 17 β mutations detected by Gap-PCR or RDB analysis, and with traceable clinical data, were studied.
It is noteworthy that people of southern origin had a higher frequency of α mutations (37.6%) than those of northern origin (27.2%, P < 0.05) in our study. This finding is consistent with the previous literature [9][10][11][12][13][14][15][16], which shows that β-thalassemia is more common in China, in both the South and the North [18][19][20]. Our study gathered the largest thalassemia mutation cohort in the north and demonstrated that people of northern origin seemed to have an even higher percentage of β-thalassemia. The reason for this difference in distribution is not clear. Clinical analysis showed that most people had no symptoms or only mild anemia. No symptomatic α-thalassemia or β-thalassemia major was found in our study. Since most people carrying thalassemia mutations in northern China have no or mild symptoms, they were suspected of having thalassemia only because of microcytic anemia or, in some cases, a low mean corpuscular volume (MCV), abnormal erythrocyte morphology or electrophoresis results, or as part of a differential diagnosis. Silent α-thalassemia may not lower the MCV and can be missed during routine medical checks. Of course, the low incidence of thalassemia mutations in northerners may also contribute to this deviation. Therefore, the data in our study cannot represent the true prevalence of α- or β-thalassemia [16]. Even so, as five years of cumulative data from the largest thalassemia gene detection center, they may reflect to some extent the true situation of "detectable" thalassemia in northern China. Given that silent carriers are largely asymptomatic, it is important to understand the general distribution of thalassemia mutations and their clinical relevance, as shown in our study.
As for the percentages of individual thalassemia mutations, there was no significant difference between people from northern and southern China for either α- or β-thalassemia, suggesting ethnic coherence between northern and southern China.
Although α-thalassemia mutations were evenly distributed across provinces, with NG_000006.1:g.26264_45564del19301 the most frequent, β-thalassemia mutations differed significantly among geographical regions. After trying several ways of classifying the regions, we found that the altitude of the provinces best reflected the distribution of β-thalassemia mutations. The frequency of HBB:c.316-197C>T was relatively high in the provinces below 500 m, HBB:c.126_129delCTTT was concentrated in the provinces with elevations between 500 and 1000 m, and the frequency of HBB:c.52A>T increased gradually from plain to plateau areas. To our knowledge this is the first report of this phenomenon, and we have not found any literature that verifies it so far. Although there are numerous reports on gene distribution in different provinces of southern China, their subjects came from a limited number of provinces, mostly located in the southern part of China at relatively low altitude.
All genetic mutations of thalassemia may have been influenced by natural selection [17]. It has been shown that malaria is a strong selective factor for many genotypes (e.g., G6PD deficiency, thalassemia, ABO, Rh, MN, Duffy, secretory types (Ss), and human leukocyte antigens (HLA)) [21]. Malaria may vary with altitude, probably because the distribution of mosquitoes changes along an altitudinal gradient [20]. However, few studies have focused on how malaria varies with altitude in China. Therefore, the change in the β-thalassemia gene pattern across altitudes may provide a clue to the historical distribution of malaria in China [22,23].
Ethnic differences may also play a significant role in genetic differences, and in general ethnic diversity across altitudes cannot be excluded. In this study, however, we confirmed that all the people tested were Han Chinese, so any difference arising from ethnic diversity was small.
The reasons for thalassemia gene diversity are very complicated. One report showed that even the frequency of the PC allele for acid phosphatase in fourteen Sardinian villages correlated positively with altitude and negatively with past malarial morbidity. The same report suggested that the thalassemia trait exerts a protective action only in subjects carrying the PA allele for acid phosphatase [24]. In addition to malaria, other environmental factors related to altitude may also play a role in shaping the present distribution of thalassemia, and these need to be investigated further. Some reports have also pointed out that some thalassemia mutations in northern China may be a result of genetic founder effects, which requires further research to confirm or refute [25,26].
For the first time, we report the major thalassemia gene profile of people residing in northern China, where thalassemia has been considered to occur only sporadically. Although these findings need to be verified in a larger cohort, we describe a possible relationship between altitude and the pattern of β-thalassemia mutations. Future whole-genome sequencing studies, which may better define the genetic polymorphisms, together with detailed analysis of the altitude-related changes that may shift the thalassemia gene spectrum, are worth pursuing. These findings will enrich our understanding of the etiology and mechanism of thalassemia in China.

Conclusions
This study described the geographical distribution of mutations among people resident in northern China. It showed that most people in northern China with thalassemia gene mutations were thalassemia carriers and indicated that the spectra of individual α and β mutations may not differ significantly between northern and southern China. Furthermore, it suggested that people originating from regions at different altitudes may have different spectra of β mutations, which broadens our understanding of the etiology and mechanism of thalassemia in China.