Genetic studies of the Roma (Gypsies): a review
© Kalaydjieva et al 2001
Received: 15 January 2001
Accepted: 2 April 2001
Published: 2 April 2001
Data provided by the social sciences as well as genetic research suggest that the 8-10 million Roma (Gypsies) who live in Europe today are best described as a conglomerate of genetically isolated founder populations. The relationship between the traditional social structure observed by the Roma, where the Group is the primary unit, and the boundaries, demographic history and biological relatedness of the diverse founder populations appears complex and has not been addressed by population genetic studies.
Recent medical genetic research has identified a number of novel, or previously known but rare, conditions caused by private founder mutations. A summary of the findings, provided in this review, should assist diagnosis and counselling in affected families, and promote future collaborative research. The available incomplete epidemiological data suggest a non-random distribution of disease-causing mutations among Romani groups.
Although far from systematic, the published information indicates that medical genetics has an important role to play in improving the health of this underprivileged and forgotten people of Europe. Reported carrier rates for some Mendelian disorders are in the range of 5-15%, sufficient to justify newborn screening and early treatment, or community-based education and carrier testing programs for disorders where no therapy is currently available. To be most productive, future studies of the epidemiology of single gene disorders should take social organisation and cultural anthropology into consideration, thus allowing the targeting of public health programs and contributing to the understanding of population structure and demographic history of the Roma.
This review of genetic studies of the Roma was prompted by two recent developments: (i) Studies conducted over the last decade have resulted in the identification of a number of novel single gene disorders and disease-causing mutations. The accumulating data are already sufficient to outline a pattern and draw conclusions about public health policies and future research. (ii) The economic and political changes in Eastern Europe and the wars in former Yugoslavia have led to the west-bound migration of large numbers of Roma [7,8], changing the traditional demographic profile of Gypsy minorities across Europe. A predictable consequence of this new diaspora is that medical practitioners in many countries will encounter Romani patients with previously unknown or very rare disorders. A summary of the available information should facilitate diagnostic investigations and counselling in these affected families and stimulate international collaboration.
Literature searches were performed using the U.S. National Library of Medicine PubMed/MEDLINE databases for the period 1960 to December 2000. Database searches using the keyword "Gypsies" identified 297 articles, whilst the keyword "Gypsy" produced 573 articles. The discrepancy is due mainly to the inclusion of articles about the "gypsy retrotransposable element" and the "gypsy moth". Searches using the terms "Roma", "Romani" and "Romany" yielded results that were either not relevant to the topic (e.g. Roma, the capital of Italy) or incomplete.
The majority of the 297 articles dealt with issues beyond the focus of this review, namely social problems related to the health of the Roma (28.6%), or general medical problems (29.6%). The remainder were reports on genetic research, of which 41 studies (13.8%) were in the field of clinical genetics, 44 (14.8%) were molecular studies of genetic disorders, and 39 (13.1%) covered population genetic research. In the clinical and molecular genetics fields, we have given preference to publications which were not limited to single case descriptions, and dealt with disorders with public health impact. Population genetics papers were selected on the basis of the compatibility of study design, specifically the analysis of comparable polymorphic systems.
Complementary data on history, linguistics, cultural anthropology and demography were found through standard library and bibliographic searches, and included publications recommended by consulting experts in Romani studies (Drs. Elena Marushiakova and Vesselin Popov from the Bulgarian Academy of Sciences and Dr. Ian Hancock from the University of Texas at Austin).
Genetic studies of the Roma have been conducted for over 70 years, with thousands of individuals sampled across Europe. During the years of the Third Reich, Gypsies, together with Jews, attracted the special attention of German geneticists. A grant proposal signed by Nobel prize winner Ferdinand Sauerbruch and funded by the Deutsche Forschungsgemeinschaft designed the "genetic and medical research" at the death camp in Auschwitz. The Race Hygiene and Population Biology Research Centre, established in 1936, organised thorough records of Jewish and Romani pedigrees and provided "the scientific basis" for the "final solution", the annihilation of millions of Jews and Roma in the concentration camps of Nazi-occupied Europe.
Post-war genetic research has been preoccupied with the Indian origins of the Roma [10–16], pursuing the "Indian connection" even in studies meant to focus on severe genetic disorders. Most studies have remained in the realm of scientific exploration, away from the health needs of the Roma. Many publications display judgemental and paternalistic attitudes that would be considered unacceptable if used with regard to other populations.
This historical "track record", the persisting practices of discrimination and marginalisation [3–6], and the fact that, unlike the Jews, the Finns and the French Canadians, the Roma are still the "object" of investigations conducted by outsiders, are all likely to impact on the attitudes of the Roma towards genetics. Building up the trust and collaboration necessary for both public health programs and research should become a goal of the health care systems of Europe.
Population genetic studies have used mostly "classical" polymorphisms to investigate Romani individuals from different European countries and address three main questions: (i) similarity between Roma and Indians; (ii) relatedness to European populations; (iii) affinities between Romani populations from different countries [10–24]. Single locus comparisons have resulted in controversy, with some pointing to close genetic affinity between Roma and Indians, and others indicating that the Roma are indistinguishable from Europeans. Heterogeneity between countries has become apparent and has led to the conclusion that the European Roma are composed of two different populations, characterised respectively by a high and a low frequency of blood group B, or defined as East and West European Roma, with the former closely related to Indian populations. Heterogeneity of Romani populations within the same country has been suggested by the very small number of studies addressing this issue [19,21,25,26].
Table 1. Multi-locus reanalysis of previously published data on European Roma
Proportion of the variance explained by differences
All Roma (n = 1287) versus
1.81 ± 1.45%
0.58 ± 0.29%
(n = 5169)
Roma (n = 1287) versus
0.36 ± 0.69%
2.8 ± 0.39%
North Indians (n = 315)
3.47 ± 0.46%
populations (n = 1287) in
0.19 ± 0.12%
Populations (n = 5169)
As a result of traditionally low socio-economic status and limited access of the Roma to health care, their unique genetic heritage has long escaped the attention of European medicine and is now being randomly "discovered".
Table 2. Mendelian disorders of the Roma caused by private founder mutations
Hereditary motor and
Hereditary motor and
Congenital cataracts facial
Limb girdle muscular
dystrophy type 2C
The list includes three novel neurological disorders, namely hereditary motor and sensory neuropathies type Lom (HMSN-L) [37–39] and type Russe (HMSN-R), and the congenital cataracts facial dysmorphism neuropathy syndrome (CCFDN) [41,42].
In addition, a number of previously known but rare disorders have been identified and shown to be caused by novel private mutations (Table 2). Examples include limb-girdle muscular dystrophy type 2C (LGMD2C), galactokinase deficiency, primary congenital glaucoma, and congenital myasthenia.
In view of the lack of systematic studies, the list cannot be comprehensive and is likely to represent the biases and interests of individual medical researchers working in this field. Data in the literature, particularly from the Spanish Collaborative Study of Congenital Malformations, point to the existence of a number of additional rare single gene disorders, whose molecular basis is still to be identified. These include hereditary idiopathic torsion dystonia (ITD), epidermolysis bullosa, albinism, and some rare autosomal recessive malformation syndromes, such as Bowen-Conradi, Jarcho-Levin, Meckel, Smith-Lemli-Opitz, and Fraser [47,49].
A third group of Mendelian disorders includes common conditions, where the mutation prevalent in the surrounding or in global populations is likely to have been introduced by admixture, for example cystic fibrosis and delF508, phenylketonuria and the R252W and IVS10nt546 mutations [51,52], and medium chain acyl-coenzyme A dehydrogenase (MCAD) deficiency and G985.
With the exception of phenylketonuria, Mendelian disorders have been described as genetically homogeneous, with a single mutation accounting for all affected individuals and related polymorphic haplotypes unambiguously indicating a common origin and founder effect [37–40, 42–46].
Reported gene frequencies are high for both private and "imported" mutations, and often exceed by an order of magnitude those for global populations. For example, galactokinase deficiency, whose worldwide frequency is 1:150,000 to 1:1,000,000 [56,57], affects 1 in 5,000 Romani children; autosomal dominant polycystic kidney disease (ADPKD) has a prevalence of 1:1,000 individuals worldwide and 1:40 among the Roma in some parts of Hungary; primary congenital glaucoma ranges between 1:5,000 and 1:22,000 worldwide [59,60] and is about 1:400 among the Roma in Central Slovakia [61,62].
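The size of these differences can be made explicit with a few lines of arithmetic. The following is a minimal sketch, not part of the original review: it takes only the "1 in N" figures quoted in the paragraph above and computes the ratio of the reported Romani frequency to the worldwide one. The names `REPORTED`, `fold_enrichment` and `enrichment_ranges` are illustrative, and the comparison is approximate because the quoted figures mix incidence and prevalence and refer to specific regions.

```python
from fractions import Fraction

# "1 in N" figures quoted in the text, stored as the denominator N.
# Layout: disorder -> ((worldwide most frequent, worldwide least frequent), Roma)
REPORTED = {
    "galactokinase deficiency": ((150_000, 1_000_000), 5_000),
    "ADPKD": ((1_000, 1_000), 40),  # Roma figure: some parts of Hungary
    "primary congenital glaucoma": ((5_000, 22_000), 400),  # Central Slovakia
}


def fold_enrichment(worldwide_one_in, roma_one_in):
    """Ratio of the Roma frequency (1/roma) to the worldwide frequency (1/worldwide)."""
    return Fraction(worldwide_one_in, roma_one_in)


def enrichment_ranges(reported=REPORTED):
    """Return disorder -> (lowest fold difference, highest fold difference)."""
    ranges = {}
    for disorder, ((common, rare), roma) in reported.items():
        ranges[disorder] = (
            float(fold_enrichment(common, roma)),
            float(fold_enrichment(rare, roma)),
        )
    return ranges


if __name__ == "__main__":
    for disorder, (low, high) in enrichment_ranges().items():
        print(f"{disorder}: {low:g}-fold to {high:g}-fold")
```

On the quoted figures, every disorder listed shows at least a tenfold difference, which is what "an order of magnitude" refers to.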
Table 3. Reported carrier rates for single gene disorders among the Roma
Primary congenital glaucoma
Autosomal dominant polycystic
Hereditary motor and sensory
Limb girdle muscular dystrophy
Although incomplete, the available data already lead to some practical conclusions: (i) What may appear to be a novel disorder confined to a single family, could in fact be an indication of a common problem affecting large numbers of individuals. Research should therefore extend beyond case descriptions and aim at more comprehensive epidemiological information. (ii) The emphasis on consanguinity in affected families displaces the focus from an obvious need for public health intervention to patterns of personal behaviour. In the face of the reported high gene frequencies, consanguinity is no more relevant than it would be as a cause of beta-thalassemia in Mediterranean countries. (iii) High gene frequencies may result in the parallel segregation of phenotypically similar but genetically distinct disorders within the same kindred [40,42]. This clustering should be borne in mind in diagnostic studies, where assumptions based on pedigree structure should be avoided and independent clinical and genetic assessment should be conducted in all cases.
Research into Mendelian disorders has provided ample evidence of genetic stratification, with mutations occurring at high frequencies in some Romani communities and altogether absent in others, located in close geographic proximity. In some cases, such as Glanzmann thrombasthenia [63,64], LGMD2C [65,66], galactokinase deficiency, CCFDN and HMSN-R, the identity of the affected groups has been specified. Other studies, for example of congenital glaucoma [61,62] and ADPKD, provide only an indication of the area of residence of the affected communities. In the few cases where gene frequencies can be compared between high-risk groups and the general Romani population of the same country, substantial differences become apparent (Table 3).
The pattern emerging from genetic research is that of a conglomerate of founder populations which extend across Europe but at the same time differ within individual countries, and whose demographic history, internal structure and relationships are poorly understood. An insight is provided by the social sciences.
The 18th century theory on the Indian origins of the Roma [reviewed in 1] is based on the similarities between Romani and languages spoken in the Indian subcontinent and is supported by genetic evidence. However, the lack of a close relationship to any specific language or dialect has left unresolved the question of the original ethnic composition of the proto-Roma, with both single [72,73] and diverse origins proposed by linguists. Translated into the language of genetics, this is a relevant question related to the homogeneity or diversity of the founding population.
Inferred from linguistic influences retained in all Romani dialects, the major migration routes pass through Persia, Armenia, Greece and the Slavic-speaking parts of the Balkans. The first documents pointing to the arrival of the Roma in the Balkans date from the 11th-12th century [1,75]. By the 15th century, mention of their presence can be found in historical records from all parts of Europe [1,2].
Ottoman Empire Tax
11-15,000 males of
During its subsequent history in Europe, this founder population split into numerous socially divided and geographically dispersed endogamous groups, with historical records from different parts of the continent consistently describing the travelling Gypsies as "a group of 30 to 100 people led by an elder" [1,2]. These splits, a possible compound product of the ancestral tradition of the jatis of India, and the new social pressures in Europe (e.g. Gypsy slavery in Romania and repressive legislation banning Gypsies from most western European countries [1,2]), can be regarded as secondary bottlenecks, reducing further the number of unrelated founders in each group. The historical formation of the present-day 8 million Romani population of Europe is therefore the product of the complex initial migrations of numerous small groups, superimposed on which are two large waves of recent migrations from the Balkans into Western Europe: in the 19th to early 20th century, after the abolition of slavery in Romania [1,2,76], and over the last decade, after the political changes in Eastern Europe [7,8].
The Group is still the primary building block of the social organisation of the Roma [1,2]. Group identity and the ensuing divisions and rules of endogamy are based on tradition, customs and organs of self-rule, language and dialects, trades, history of migrations, and religion. Individual groups can be classified into major metagroups [1,2,75]: the Roma of East European extraction; the Sinti in Germany and Manouches in France and Catalonia; the Kaló in Spain, Ciganos in Portugal and Gitans of southern France; and the Romanichals of Britain. The greatest diversity is found in the Balkans, where numerous groups with well defined social boundaries exist. The 700-800,000 Roma in Bulgaria belong to three metagroups, comprising a large number of smaller groups.
Linguistics, history and cultural anthropology suggest two major, equally plausible historical scenarios that could lead to a "jigsaw puzzle" of founder populations: (i) a genetically substructured ancestral population, where the old social traditions of strict endogamy have been retained and subsequent splits of the comprising groups have enhanced the original genetic differences; (ii) a small homogeneous ancestral population spawning numerous subgroups where strong drift effects have resulted in substantial genetic divergence. Genetic research has indeed faced the "jigsaw puzzle" and has thus far been unable to resolve it. The genetic data provide evidence of population stratification; however, a closer examination is precluded by the random cross-section sampling design of most population genetic studies, where the traditional social organisation and self-identity of the Roma have been ignored and subjects classified on the basis of the political boundaries of Europe. The relationship between social organisation and genetic structure does not appear to be straightforward and is still to be addressed in population genetic research based on the long standing identity of Gypsy groups. The issue is of relevance to public health policies and the targeted prevention of Mendelian disorders, as well as to future studies of genetically complex disorders.
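To give a sense of how strongly drift can act in scenario (ii), the following minimal sketch uses the standard textbook expectation for an idealised, isolated, randomly mating population, in which expected heterozygosity decays as H_t = H_0 (1 - 1/(2N))^t, with N the effective population size and t the number of generations. This is not an analysis from the reviewed studies. The starting heterozygosity, the number of generations, and the use of the historically recorded group sizes of 30 to 100 people as effective sizes are all assumptions made for illustration; real effective sizes are usually smaller than census sizes, and gene flow between groups is ignored.

```python
def expected_heterozygosity(h0, effective_size, generations):
    """Expected heterozygosity after `generations` of pure drift.

    Uses the classical idealised-population result
    H_t = H_0 * (1 - 1 / (2 * N_e)) ** t, with no mutation,
    selection or migration.
    """
    if effective_size <= 0 or generations < 0:
        raise ValueError("effective_size must be > 0 and generations >= 0")
    return h0 * (1 - 1 / (2 * effective_size)) ** generations


if __name__ == "__main__":
    H0 = 0.5          # assumed starting heterozygosity (hypothetical)
    GENERATIONS = 20  # assumed, roughly five centuries at 25 years per generation
    for n_e in (30, 100, 1000):  # 30 and 100 echo the recorded group sizes
        h = expected_heterozygosity(H0, n_e, GENERATIONS)
        print(f"N_e = {n_e:4d}: retained fraction of H0 = {h / H0:.2f}")
```

Under these assumptions, groups at the lower end of the recorded size range lose variation far faster than a population of a thousand, so independent drift in many small endogamous groups is one way that substantial divergence between groups could arise without any substructure in the ancestral population.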
The existing information on single gene disorders is certainly not exclusive to the Roma. The phenomenon of clustering of rare disorders and private founder mutations has been studied in detail in well characterised founder populations, such as the Jews [77,78], Finns [79,80] and French Canadians. Unlike the above examples, however, genetic studies of the Roma have failed to take the immediate benefits of research back to the individuals and families that have been the object of research. Yet by now it should be obvious that genetics has an important role to play in improving the quality of health care for the Roma. Treatable disorders such as galactokinase and MCAD deficiency, with an expected incidence of affected births in the range of 1:1,000 to 1:5,000, meet the standard criteria for newborn screening more than does phenylketonuria, with its average incidence of 1:10,000. Adding the simple, sensitive and specific mutation tests to existing newborn screening programs would be technically simple and highly efficient due to the homogeneous genetic basis of the disorders.
Carrier testing should be made available to Romani communities at high risk for severe untreatable disorders. Information on the identity of affected Romani populations is important for public health intervention since it would allow the planning and facilitate the implementation of targeted prevention programs, especially in the Eastern European countries where economic resources are limited. The importance of the educational component of such programs has already been demonstrated by the highly successful prevention of Tay-Sachs disease among Ashkenazi Jews and the failure of sickle-cell screening among African Americans. This component would be particularly important for a population like the Roma, which has been subject to racism and persecution throughout its co-existence with European societies.
The attention of geneticists is increasingly attracted by genetically isolated populations in the third world. In terms of living standards and the major health indicators, the Roma are much closer to the developing world than to their European neighbours. This forgotten people of Europe can be regarded as a test case for the capacity of genetics to provide better health.
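The link between carrier rates and the incidence of affected births quoted in this review can be illustrated with the Hardy-Weinberg relation for an autosomal recessive disorder: if q is the frequency of the disease allele, the incidence of affected births is q² and the carrier rate is 2q(1 - q). The sketch below is a simplification, not a calculation from the cited studies. It assumes random mating within the population concerned, which the endogamy and stratification described above may violate, and the function names are illustrative.

```python
import math


def carrier_rate_from_incidence(incidence):
    """Carrier frequency 2q(1 - q), given incidence of affected births q**2."""
    q = math.sqrt(incidence)
    return 2 * q * (1 - q)


def incidence_from_carrier_rate(carrier_rate):
    """Incidence q**2, solving 2q(1 - q) = carrier_rate for the smaller root q."""
    if not 0 <= carrier_rate <= 0.5:
        raise ValueError("carrier_rate must lie between 0 and 0.5")
    q = (1 - math.sqrt(1 - 2 * carrier_rate)) / 2
    return q * q


if __name__ == "__main__":
    # Incidences quoted in the text (1:10,000 is the phenylketonuria average).
    for one_in in (1_000, 5_000, 10_000):
        rate = carrier_rate_from_incidence(1 / one_in)
        print(f"incidence 1:{one_in:,} -> carrier rate {rate:.1%}")
    # Carrier rates at the ends of the reported 5-15% range.
    for rate in (0.05, 0.15):
        incidence = incidence_from_carrier_rate(rate)
        print(f"carrier rate {rate:.0%} -> about 1 affected birth in {1 / incidence:,.0f}")
```

Under these assumptions, incidences of 1:1,000 and 1:5,000 correspond to carrier rates of roughly 6% and 3%, compared with about 2% for an incidence of 1:10,000.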
We thank the Romani families and communities, and the numerous colleagues in different countries who have made our research into the genetics of the Roma possible, the members of the Gypsy Genetic Heritage Consortium Prof. J.-C. Kaplan, Prof. A. Urtizberea, Prof. J.-P. Liegeois, Drs. M. Jeanpierre and L. Merlini for their commitment to international collaboration, and Drs. E. Marushiakova, V. Popov and I. Hancock for enlightening discussions of the ethnology, history and linguistics of the Roma. Special thanks to the research team at the Centre for Human Genetics of Edith Cowan University.
L.K. wishes to acknowledge funding from The Wellcome Trust, The National Health and Medical Research Council of Australia, The Muscular Dystrophy Association of the US, L'Association Française contre les Myopathies, The Australian Research Council and Edith Cowan University.
This article is published under license to BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.