Breakpoint characterization of large deletions in EXT1 or EXT2 in 10 Multiple Osteochondromas families

Background Osteochondromas (cartilage-capped bone tumors) are by far the most commonly treated of all primary benign bone tumors (50%). In 15% of cases, these tumors occur in the context of a hereditary syndrome called multiple osteochondromas (MO), an autosomal dominant skeletal disorder characterized by the formation of multiple cartilage-capped bone tumors at children's metaphyses. MO is caused by various mutations in EXT1 or EXT2, with large genomic deletions (single- or multi-exonic) responsible for up to 8% of MO-cases. Methods Here we report the first molecular characterization of ten large EXT1- and EXT2-deletions in MO-patients. Deletions were initially identified using MLPA or FISH analysis and were subsequently characterized using an MO-specific tiling path array, allele-specific PCR-amplification and sequencing analysis. Results Within the set of ten large deletions, the deleted regions ranged from 2.7 to 260 kb. One EXT2 exon 8 deletion was found to be recurrent. All breakpoints were located outside the coding exons of EXT1 and EXT2. Non-allelic homologous recombination (NAHR) mediated by Alu-sequences, microhomology-mediated replication-dependent recombination (MMRDR) and non-homologous end-joining (NHEJ) were hypothesized as the causal mechanisms in different deletions. Conclusions Molecular characterization of EXT1- and EXT2-deletion breakpoints in MO-patients indicates that NAHR between Alu-sequences as well as NHEJ are causal and that the majority of these deletions are nonrecurring. These observations emphasize once more the huge genetic variability that is characteristic of MO. To our knowledge, this is the first study characterizing large genomic deletions in EXT1 and EXT2.


Background
Osteochondromas (cartilage-capped bone tumors) are by far the most commonly treated of all primary benign bone tumors (50%) [1]. Most osteochondromas appear as solitary, nonhereditary lesions, but in 15% of cases these tumors occur as multiple lesions in the context of multiple osteochondromas (MO) [2] (OMIM 133700-133701), previously known as osteocartilaginous exostosis or multiple hereditary exostosis (MHE/HME). The prevalence of this autosomal dominant skeletal disorder is estimated to be 1/50,000 in the Western population [3]. MO is characterized by the formation of multiple osteochondromas mainly arising from the growth plate area in the juxta-epiphyseal region of long tubular bones. These bone neoplasms are caused by increased chondrocyte proliferation and bone growth at children's metaphyses [4]. During the first decades of life, they develop gradually and grow in size and number, until skeletal maturation is achieved at the end of puberty with the closing of the growth plates [5]. MO is characterized by a significant inter- and intrafamilial phenotypic variability, including variation in the number and size of osteochondromas, the number and location of involved bones, and the degree of the deformities. Various complications arise from these benign tumors, but malignant transformation towards a chondrosarcoma is by far the most serious one, occurring in 0.5-2% of patients [2,3,6]. MO is caused by mutations in Exostosin-1 (EXT1) (OMIM *608177) [7] or Exostosin-2 (EXT2) (OMIM *608210) [8,9]. EXT1 consists of 11 exons, is located at 8q24.11-q24.13 and spans ~350 kb [10], while EXT2, located at 11p12-p11, comprises 16 exons and spans ~108 kb [11]. Both genes act as tumor suppressor genes that belong to the EXT multigene family [12] and are ubiquitously expressed. All members of this family encode proteins that are involved in the adhesion and/or polymerization of heparan sulphate (HS) chains at HS proteoglycans (HSPGs).
The HSPGs play a crucial role in the diffusion of Indian Hedgehog (Ihh), which is important for the regulation of chondrocyte proliferation and differentiation. In osteochondromas, the cartilage cells of the tumor cap are heterogeneous with regard to the mutation status of EXT1 or EXT2, with a mixture of homozygous and heterozygous EXT-inactivated cells [13]. MO is characterized by a huge genetic heterogeneity. The online Multiple Osteochondromas Mutation Database (MOdb) http://medgen.ua.ac.be/LOVDv.2.0/home.php provides an overview of all currently reported MO-causing mutations. To date, the database contains more than 1000 MO-patients representing more than 600 unique EXT1- or EXT2-mutations. In 3.7% of the MOdb-cases, the disease is caused by a large deletion comprising at least one EXT1- or EXT2-exon [14]. However, these rearrangements are expected to account for up to 5-8% of MO cases [15,16].
Smaller single- or multi-exon rearrangements in EXT1 or EXT2 cause a phenotype of multiple osteochondromas only. They are routinely screened for by Multiplex Ligation-dependent Probe Amplification (MLPA) [16]. However, this diagnostic technique does not provide detailed information on the breakpoints, so different deletions involving the same exon(s) cannot be distinguished, nor can insight be acquired into the developmental mechanism of these gross rearrangements. Therefore, we characterized for the first time the MO-causing genomic deletions in EXT1 or EXT2 in index patients from ten unrelated families, using an MO-specific tiling path array, allele-specific PCR-amplification and direct sequencing.

Patients
This study included index patients from ten unrelated families (Families 89, 122, 150, 200, 250, 279, 300, 334, 338 and 361) originating from all over Europe and the USA. The study was approved by the ethical committee of the University of Antwerp under number A04-64. Consent of the patients was obtained. All patients had radiologically confirmed multiple osteochondromas. Genomic DNA was isolated from peripheral blood according to standard procedures. Two patients were reported to be sporadic cases, while for six patients the disease was reported to be inherited. Details on patients are listed in Table 1.
All patients were selected from a larger cohort of MO patients based on the presence of a single- or multi-exon deletion in the EXT1 or EXT2 gene, detected by MLPA analysis [16] or FISH analysis with EXT1-probes 46F10, 65G5 and 90D8 [7] or EXT2-probes A1151 and D0694 [9]. The MLPA/FISH findings on two of these families (Families 150 and 200) have been reported previously [16].

ArrayCGH tiling path array
High-resolution oligonucleotide arrayCGH analysis was performed using a custom-made Agilent oligonucleotide-based chip (Agilent Technologies, Santa Clara, CA, USA) in accordance with the manufacturer's instructions. This array contains ~44,000 probes and has a tiling coverage for EXT1 and EXT2, additional genes involved in HS-pathways and adjacent genes. This MO-specific tiling path array has been described in detail in [22].

PCR-amplification & sequencing
For the exact mapping of the deletion breakpoints, allele-specific PCRs were designed around the deletion breakpoints that had been mapped using arrayCGH. PCR-amplifications were performed with a touchdown temperature protocol using Platinum Taq DNA Polymerase (Invitrogen, San Diego, CA, USA) and the Advantage UltraPure dNTP Combination Kit (Clontech, Mountain View, CA, USA). Amplification products were subsequently sequenced using the Big Dye Terminator Cycle Sequencing kit, with sequencing analysis on an ABI3130xl genetic analyzer (Applied Biosystems, Foster City, CA, USA). Primers for PCR-amplification and sequencing analysis of breakpoints are listed in Table 1.

In silico analysis of breakpoint regions
Sequences located 25 bp down- and upstream from the breakpoints, as well as the breakpoint region itself, were analyzed for the presence of DNA-motifs (and their complements) that are known to be associated with genomic deletions, namely translin target sites (ATGCAG, GCCCWSSW), immunoglobulin heavy chain class switch repeats (GAGCT, GGGCT, GGGGT, TGGGG, TGAGC) and the deletion hotspot consensus sequence (TGRRKM) [23]. The same regions were also analyzed for purine-pyrimidine repeats. Finally, sequences 125 bp down- and upstream from the breakpoint junctions were analyzed for their AT-content.
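As an illustration of the motif search described above, the following is a minimal sketch and not the procedure used in this study. It expands the IUPAC ambiguity codes of the listed motifs into regular expressions and reports hits on both strands. The function names, the motif labels and the choice to interpret "complements" as reverse complements are assumptions made for this example.

```python
import re

# IUPAC ambiguity codes needed for the motifs listed in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "S": "[CG]", "R": "[AG]", "Y": "[CT]",
    "K": "[GT]", "M": "[AC]",
}
COMPLEMENT = str.maketrans("ACGTWSRYKM", "TGCAWSYRMK")

# Motifs from the text; the dictionary keys are hypothetical labels.
MOTIFS = {
    "translin_1": "ATGCAG",
    "translin_2": "GCCCWSSW",
    "ig_switch_GAGCT": "GAGCT",
    "ig_switch_GGGCT": "GGGCT",
    "ig_switch_GGGGT": "GGGGT",
    "ig_switch_TGGGG": "TGGGG",
    "ig_switch_TGAGC": "TGAGC",
    "deletion_hotspot": "TGRRKM",
}


def reverse_complement(motif):
    """Reverse complement of a motif, keeping IUPAC ambiguity codes."""
    return motif.translate(COMPLEMENT)[::-1]


def to_regex(motif):
    """Compile a motif into a regex; the lookahead also finds overlapping hits."""
    return re.compile("(?=(" + "".join(IUPAC[base] for base in motif) + "))")


def scan_motifs(seq):
    """Return (label, strand, 0-based start, matched text) for every hit."""
    seq = seq.upper()
    hits = []
    for label, motif in MOTIFS.items():
        patterns = [("+", motif)]
        rc = reverse_complement(motif)
        if rc != motif:  # avoid reporting palindromic motifs twice
            patterns.append(("-", rc))
        for strand, pattern in patterns:
            for match in to_regex(pattern).finditer(seq):
                hits.append((label, strand, match.start(), match.group(1)))
    return sorted(hits, key=lambda hit: hit[2])


def longest_alternating_run(seq):
    """Length of the longest stretch in which purines and pyrimidines alternate."""
    best = run = 0
    previous = None
    for base in seq.upper():
        kind = "R" if base in "AG" else "Y" if base in "CT" else None
        if kind is None:
            run = 0
        elif previous is not None and kind != previous:
            run += 1
        else:
            run = 1
        previous = kind
        best = max(best, run)
    return best
```

In practice, `scan_motifs` would be applied to the 25 bp flanks and the breakpoint region of each deletion, and `longest_alternating_run` to the same sequences to flag alternating purine-pyrimidine stretches.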
Regions of 1 kb down- and upstream from the breakpoint junctions were additionally analyzed for repeats (Alu-elements and LINEs) with the UCSC human genome browser http://genome.ucsc.edu.
In the search for homologous sequences, alignment of sequences up to 10 kb down- and upstream from the breakpoint junctions was performed using the NCBI BLAST2-software http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi.

Identification of deletion breakpoints with ArrayCGH, PCR-amplification & sequencing
We subjected samples from the ten index patients with heterozygous causal EXT1 or EXT2 deletions to tiling path arrayCGH, allele-specific PCR-amplification and sequencing analysis for characterization of the deletion breakpoints. This allowed determination of the precise size of the respective deletions and the position of the proximal and distal breakpoints or breakpoint regions for all ten patients. A to-scale overview of the extent of all analyzed EXT1 and EXT2 deletions is provided in Figure 1, while Table 2 summarizes the sequences flanking the deletion breakpoints.

In silico analysis of deletion breakpoints
An overview of the results on the presence of microhomologies at the breakpoint junctions, alternating purine-pyrimidine sequences and deletion-associated DNA sequence motifs can be found in Table 2. Analysis of the AT-percentages for the sequences located 125 bp down- and upstream of the proximal and distal breakpoints showed AT-enriched regions (≥65%) at both breakpoints for families 338 and 279.
A schematic illustration of all recombination events can be found in Figure 2, including the BLAST2 results concerning homologous regions as well as the UCSC results concerning the presence of Alu- and LINE1-elements. Only the 2 kb regions around the breakpoints are represented. No other homologies that could have played a role in the different recombination events were identified outside these regions.
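The AT-content criterion can be made concrete with a short sketch. It assumes that the AT-percentage is computed over the combined window of 125 bp on either side of a breakpoint; whether the flanks were evaluated jointly or separately is not stated here, so this windowing, like the function names, is an assumption for illustration only.

```python
def at_content(seq):
    """Percentage of A and T among the unambiguous bases of a sequence."""
    called = [base for base in seq.upper() if base in "ACGT"]
    if not called:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * sum(base in "AT" for base in called) / len(called)


def window_at_content(reference, breakpoint, flank=125):
    """AT-percentage of up to `flank` bases on each side of a 0-based breakpoint."""
    window = reference[max(0, breakpoint - flank):breakpoint + flank]
    return at_content(window)


def is_at_enriched(reference, breakpoint, flank=125, threshold=65.0):
    """True if the window around the breakpoint reaches the AT threshold (>= 65%)."""
    return window_at_content(reference, breakpoint, flank) >= threshold
```

A deletion would be called AT-enriched at both breakpoints, as reported for families 338 and 279, only if `is_at_enriched` holds for the proximal and for the distal breakpoint.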

Discussion
The most prominent DNA-repair mechanism in human cells is NHEJ, involving simple ligation of any two double-strand break (DSB) ends in the absence of extensive sequence homology. NHEJ is divided into two sub-pathways, classical and non-classical. In classical NHEJ, end-resection is very limited, since this pathway only efficiently joins DSBs with overhangs of fewer than four bases. Typical for this mutation mechanism is that it can be facilitated by terminal microhomologies (1-4 bp), although their presence is not necessary [24]. In non-classical NHEJ or "microhomology-mediated end joining" (MMEJ), however, a rare end-joining event takes place which uses longer microhomology regions (5-25 bases) [25]. Non-classical NHEJ can also repair the ends of a single DSB in a way similar to single-strand annealing (SSA), resulting in the generation of small-scale deletions [26]. MMRDR comprises all replication-based mutational models that are predicated upon the use of microhomology for strand misalignment. NHEJ-compatible events involving microhomologies can be explained either by NHEJ or by MMRDR, implying that the two mechanisms are not mutually exclusive [17].
Homologous recombination (HR) is the second major repair pathway for DSBs. It uses a homologous sequence as a template for repair. However, erroneous recombination at a non-allelic homologous site can lead to chromosomal rearrangements, including deletions. The different submechanisms depend on the homologous sequence that is used. NAHR or "unequal HR" represents the main homologous recombination mechanism. It is the typical mechanism for recurrent deletions and is the most common mechanism underlying disease-associated genomic rearrangements. It occurs between two nonallelic homologous sequences, generally with a length of at least 200 bp, often consisting of repetitive elements such as long or short interspersed nuclear elements (LINEs or SINEs, including Alu-elements) or low copy repeats (LCRs). A second HR-mechanism is SSA, which can take place when DSBs are flanked by direct repeats. This pathway uses these repeat sequences as the identical sequences that HR needs for repair, instead of requiring the presence of a homologous sequence. After 5'-end resection, the 3'-tails simply anneal to each other before one of the 3'-tails can find and base-pair with a homologous sequence. SSA results in simple rearrangements with deletion of the DNA-fragment located between the repeats as well as one of the two repeats.
Since the success rate of this pathway is inversely related to the distance separating the direct repeats (SSA depends on the formation of a short hairpin loop between breakpoint ends), SSA only accounts for small-scale deletions [27,28]. It is therefore clear that gross genetic deletions do not appear randomly in the genome. They are associated with DNA-sequences promoting one of the above-mentioned mutation mechanisms. Additional features that have been shown to play a role in the appearance of large deletions are sequences rich in adenine and thymine (AT-enriched sequences), alternating purine-pyrimidine repeats and recombination-associated motifs such as translin binding sites, immunoglobulin heavy chain class switch sites and deletion hotspot consensus sequences [23].
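The length ranges quoted above for the different mechanisms can be summarized as a simple rule of thumb. The sketch below is illustrative only: it maps the junction microhomology length and the length of homology between the breakpoint regions onto candidate mechanisms using the thresholds given in the text (no microhomology or 1-4 bp for classical NHEJ, 5-25 bp for non-classical NHEJ, at least 200 bp for NAHR). The function name, the labels and the "unclassified" fallback are assumptions, and real cases need the additional sequence context discussed here.

```python
def candidate_mechanisms(microhomology_bp, flanking_homology_bp=0):
    """Suggest candidate deletion mechanisms from junction features.

    microhomology_bp: length of the microhomology at the breakpoint junction.
    flanking_homology_bp: length of homologous sequence shared by the two
        breakpoint regions (for example two Alu-elements); 0 if none was found.
    """
    if microhomology_bp < 0 or flanking_homology_bp < 0:
        raise ValueError("lengths must be non-negative")
    if flanking_homology_bp >= 200:
        # Extensive homology between the breakpoint regions points to NAHR.
        return ["NAHR"]
    if microhomology_bp == 0:
        return ["classical NHEJ"]
    if microhomology_bp <= 4:
        # Short microhomologies are compatible with both models.
        return ["classical NHEJ", "MMRDR"]
    if microhomology_bp <= 25:
        return ["non-classical NHEJ (MMEJ)", "MMRDR"]
    # Lengths between 26 and 199 bp are not covered by the ranges in the text.
    return ["unclassified"]
```

The function returns a list because, as noted above, NHEJ and MMRDR cannot always be told apart from the junction sequence alone.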
To date, single or multi-exon rearrangements in EXT1 or EXT2 have been routinely screened for by Multiplex Ligation-dependent Probe Amplification (MLPA) [16]. However, since this diagnostic technique does not provide insight into the developmental mechanism of these rearrangements, we characterized for the first time the large genomic deletions in EXT1 and EXT2 in index patients from ten unrelated families using sequencing analysis and hypothesized on the developmental mechanism of these rearrangements.
Recent studies have demonstrated that a high content of Alu-elements results in an increased frequency of gene disruption by large deletions in several human diseases [29]. Alu-elements are by far the most abundant short interspersed nuclear elements (SINEs), with an estimated copy number of ~1.4 million [30,31]. They are conserved repeats with a consensus sequence of ~300 bp that have been amplified in primate genomes through retroposition [32] and consist of different subfamilies (AluY, AluSx, ...) (reviewed in [33]). These elements have been proposed to have a number of functions in the human genome, but it is certain that they have had a major impact on genomic architecture, since dispersion of the Alu-sequences throughout the genome offers many opportunities for NAHR [34,35], leading to Alu-recombination mediated deletion (ARMD)-events [36,37]. LINE1-elements, a subgroup of the long interspersed nuclear elements (LINEs), are also known to frequently provide sequence homology for NAHR. Consequently, we analyzed the whole genomic sequences of EXT1 and EXT2 for the presence of these elements. The EXT1 genomic region was found to be enriched in Alu-elements (13.4% compared to 10.6% in the human genome), while the EXT2 region was found to be rich in LINE1-elements (20.4% compared to 16.9% in the human genome) [38]. This might explain why both genes are more prone to deletions.
In the EXT1 exon 2-3 deletion of family 250 as well as in the EXT2 exon 8 deletion of unrelated families 122 (Spain, familial MO) and 334 (Bulgaria, de novo MO) extensive sequence similarity was identified between the breakpoint regions. These rearrangements were shown to be caused by recombination between Alu-repeats, giving rise to the formation of a novel complete recombinant Alu-sequence. We concluded these deletions to be caused by NAHR, with this hypothesis further supported by the fact that the EXT2 exon 8 deletion was recurrent. Additionally, these deletion-causing recombinations might have been facilitated by the presence of multiple deletion-associated DNA sequence motifs.
For families 200 and 279, no homology was found at the breakpoints of the respective EXT1 exon 8 deletion and EXT2 exon 8 deletion, although multiple LINE1- and Alu-elements were identified in the proximity of the breakpoints, as well as multiple deletion-associated DNA sequence motifs. Consequently, these cases were found to be consistent with the classical NHEJ mutation mechanism.
For family 89, the breakpoints of the EXT1 exon 2-11 deletion were located within a 5-bp homologous sequence. Multiple LINE1- and Alu-elements were identified at the distal breakpoint, while the proximal breakpoint only showed the presence of the complement of a deletion hotspot consensus sequence. The 5-bp microhomology at the breakpoint junctions implied that this deletion was probably caused by non-classical NHEJ or by MMRDR.
For the remaining 4 families (Family 300 with EXT1 exon 6-7 deletion, Family 338 with EXT1 exon 11 deletion, Family 361 with EXT1 exon 2-11 deletion, Family 150 with EXT2 exon 2 deletion), deletion breakpoints were located within microhomologies (2-4 bp), consistent with either the classical NHEJ or the MMRDR mechanism. Alu-elements were found in the proximity of all breakpoint regions, except for the distal breakpoint of the EXT2 exon 2 deletion in family 150, while deletion-associated motifs were found at the proximal breakpoint of the EXT1 exon 2-11 deletion from family 361.
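The microhomology lengths discussed above follow from comparing the reference sequence at the two breakpoints: the junction is ambiguous over every base that is identical at the proximal and the distal breakpoint. The sketch below shows one way to measure this, assuming the deletion is given as a 0-based, half-open interval on a reference string. It is an illustration with hypothetical names, not the analysis pipeline used in this study.

```python
def junction_microhomology(reference, del_start, del_end):
    """Return (length, sequence) of the microhomology at a deletion junction.

    The deleted segment is reference[del_start:del_end] (0-based, half-open).
    """
    if not 0 <= del_start < del_end <= len(reference):
        raise ValueError("deletion coordinates are outside the reference")

    # Bases at the start of the deleted segment that equal the bases directly
    # after the deletion: the junction could be shifted to the right by these.
    right = 0
    while (del_end + right < len(reference)
           and del_start + right < del_end
           and reference[del_start + right] == reference[del_end + right]):
        right += 1

    # Bases directly before the deletion that equal the bases at the end of
    # the deleted segment: the junction could be shifted to the left by these.
    left = 0
    while (del_start - 1 - left >= 0
           and del_end - 1 - left >= del_start
           and reference[del_start - 1 - left] == reference[del_end - 1 - left]):
        left += 1

    return left + right, reference[del_start - left:del_start + right]


# Toy example: the deleted segment and the sequence after it both start with GTC.
toy_reference = "CCCTA" + "GTCAAAAAGGG" + "GTCTTT"
length, homology = junction_microhomology(toy_reference, 5, 16)
```

For the toy sequence, `length` is 3 and `homology` is `"GTC"`, which falls in the 2-4 bp range reported for families 300, 338, 361 and 150.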
In our series, only two patients harbored an identical deletion (EXT2 exon 8 deletion). In MO, the size and location of the various intragenic EXT deletions do not seem to correlate with phenotypic differences, as all these deletions are assumed to cause loss-of-function of the respective tumor suppressor gene EXT1 or EXT2. The same is true for EXT loss-of-function point mutations, where no intragenic genotype-phenotype correlation is observed [16]. The only correlation that can be observed is for larger deletions causing the contiguous gene syndromes Langer-Giedion syndrome [19] and Proximal 11p Deletion syndrome (Potocki-Shaffer syndrome) [20,21], but such patients were not included in our series. Furthermore, our patient dataset was too small to confirm previous observations of EXT1 patients being more severely affected than EXT2 patients [16].
The identification of the breakpoints in this subset of patients makes it possible to design allele-specific PCR assays allowing targeted screening for recurrent deletions. Performing these allele-specific PCR-amplifications on a larger set of EXT-deletion patients can further confirm the absence or presence of deletion hot spots for MO-causing single- or multi-exon deletions in EXT1 and EXT2.

Conclusions
Out of the ten deletions analyzed in this study, NAHR was found to be the causal mechanism for two distinct deletions, one of which (the EXT2 exon 8 deletion) occurred in two unrelated families. Two deletions were caused by classical NHEJ, while four other rearrangement events could be explained by either classical NHEJ or MMRDR. Finally, non-classical NHEJ or MMRDR were the possible mutation mechanisms for the last deletion. One deletion, caused by NAHR, was shown to be recurrent in 2 patients, but no clear deletion breakpoint hotspots could be identified within our set of patients.
We can therefore conclude that MO-causing large genomic deletions in EXT1 or EXT2 are caused by a variety of mutation mechanisms, emphasizing once more the huge genetic variability of MO.