Samples
DNA was extracted from 54 CRC cell lines (C10, C32, C70, C75, C80, C99, C106, C125PM, C170, CACO2, CL11, COLO201, COLO205, COLO320, COLO678, COLO741, CX1, GP5D, H508, H716, HCA7, HCA46, HCT8, HCT116, HCT15/DLD1, HRA19, HT29, HT55, HT115, HUTU80, LOVO, LIM1863, LS1034, LS123, LS174T, LS411, MBDA8, NCI-H747, PC/JW, RKO, SKCO1, SNUC2B, SW1116, SW1222, SW1417, SW403, SW48, SW620, SW837, SW948, T84, VACO4S, VACO5, VACO10MS).
Research was undertaken in compliance with the Helsinki Declaration and with full ethical approval (Oxfordshire REC B, 05/Q1605/66). A total of 268 anonymised, formalin-fixed, paraffin-embedded (FFPE) samples of colorectal adenomas from thirteen patients with FAP were identified. To increase the frequency of tumours showing LOH, we selected cases with germline mutations around codon 1300. We also used larger lesions (> 0.5 cm diameter) to minimise problems of polyclonality in smaller colorectal adenomas [8]. For 10 of the larger polyps (diameter ≥ 0.9 cm), more than one sample was obtained from different parts of the same polyp to check for consistency of the molecular data; in each case, complete concordance was observed. After enrichment for dysplastic epithelium using a fine gauge needle and the dissecting microscope, DNA was extracted from each adenoma using a standard proteinase K digestion and the Qiagen DNeasy kit, except that elution with water was undertaken twice to increase the yield. DNA from normal tissue in the same block was extracted using the same method.
For analysis of associations between mitotic recombination breakpoints and colorectal cancer risk, two case-control series were analysed: (i) cases and controls from the UK CORGI study of familial colorectal tumour patients [9]; and (ii) cases from the VICTOR (http://www.octo-oxford.org.uk/alltrials/infollowup/vic.html) and QUASAR 2 (http://www.octo-oxford.org.uk/alltrials/trials/q2.html) clinical trials and controls from the UK 1958 Birth Cohort (http://www.b58cgene.sgul.ac.uk/). All cases and controls, comprising about 4,000 samples in total, were of white UK ethnic origin. Further details of ascertainment, inclusion criteria and exclusion criteria can be found in [9]. Genotyping data were derived from the Illumina Hap300, Hap370 or Hap550 arrays (see below). Each series was analysed separately for significant allele or genotype frequency differences between cases and controls, followed by a weighted meta-analysis using the Mantel-Haenszel method in Stata 9.0.
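As an illustration of the pooling step, the Mantel-Haenszel odds ratio across the two series can be sketched as follows. This is a minimal Python sketch, not the Stata procedure used in the study; the function name, the table layout and the allele counts are hypothetical and chosen only to show the arithmetic.

```python
from typing import Iterable, Tuple

# One 2x2 allele-count table per case-control series (stratum):
# (case_alt, case_ref, control_alt, control_ref). Layout is an assumption.
Table = Tuple[int, int, int, int]


def mantel_haenszel_or(tables: Iterable[Table]) -> float:
    """Pooled odds ratio: sum(a*d/n) / sum(b*c/n) over all strata."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("pooled odds ratio is undefined for these tables")
    return numerator / denominator


# Hypothetical counts for two series; not data from the study.
example_series = [(120, 880, 100, 900), (60, 440, 50, 450)]
pooled_or = mantel_haenszel_or(example_series)
```

Each stratum is weighted by its own sample size, so a larger series contributes more to the pooled estimate than a smaller one.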
SNP microarray analyses
Patient and colorectal cancer cell line DNAs were prepared and hybridised to arrays (Affymetrix 10 K HuSNP and Illumina Hap300, Hap370 and Hap550) using the manufacturer's standard protocols. Genotyping calls were made using the manufacturer's software, resulting in call rates of over 98% and sample failure rates of < 2% using good quality DNA. LOH and copy number changes were scored using the manufacturer's software in each case, supplemented by visual inspection of allele frequency plots along each chromosome.
For the custom GoldenGate arrays, designed to assess LOH on 5q in FFPE tumour samples, the Illumina custom probe design software was used to test the suitability of all SNPs from dbSNP126 that mapped between the chromosome 5 centromere and the APC locus (~112 Mb). After elimination of SNPs with low design scores, a panel of 360 remained. An additional 11 SNPs distal to the APC gene were chosen to provide evidence that LOH extended as far as the telomere; 7 were located on chromosome 5p to give evidence of whole-chromosome LOH; and 10 SNPs mapped elsewhere in the genome to give evidence of copy number changes. For each DNA sample, 200–300 ng was hybridised to the arrays using the manufacturer's standard protocols. After excluding SNPs with GenTrain scores of less than 0.3, the Illumina BeadStudio software was used to indicate LOH and copy number change on 5q, either in unpaired sample mode (for CRC cell lines) or, after training on FFPE samples, in paired mode for the colorectal adenomas. We found, however, that visual inspection of allelic binning and manual correction of clear errors by two or more independent observers (SR, KH, IT) improved data quality and allowed breakpoints to be mapped more closely.
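The score-based exclusion described above amounts to a simple threshold filter. The Python sketch below assumes the per-SNP scores have already been exported to a mapping from SNP identifier to score; the function name and the SNP identifiers are placeholders, and this is not the Illumina software's own interface.

```python
from typing import Dict, List

SCORE_MIN = 0.3  # exclusion threshold stated in the text


def filter_by_score(scores: Dict[str, float], minimum: float = SCORE_MIN) -> List[str]:
    """Return SNP ids whose score is at least `minimum`, in input order."""
    return [snp for snp, score in scores.items() if score >= minimum]


# Placeholder identifiers and scores, for illustration only.
example_scores = {"snp_a": 0.82, "snp_b": 0.29, "snp_c": 0.30, "snp_d": 0.05}
kept_snps = filter_by_score(example_scores)
```

A SNP scoring exactly 0.3 is retained here, since only scores of less than 0.3 are excluded.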
Microsatellite and in-del analysis
Chromosome 5q microsatellite markers (D5S623, D5S664, D5S407, D5S398, D5S2107, D5S624, D5S1990, D5S2089, D5S647, D5S2019, D5S2003, D5S2041, D5S2029, D5S107, D5S644, D5S669, D5S346) and in-del polymorphisms (rs2067135, rs1305058, rs3087334, rs2307799, rs1610940) were chosen according to their location in the human genome March 2006 build (http://genome.ucsc.edu). PCR genotyping used a single dye-labelled primer, the ABI 3730 sequencer and the Genescan/Genotyper software. For analysis of CRC cell lines, absence of one allele was required to score possible LOH. Because paired constitutional DNA was not available in most cases, a single allele could reflect either LOH or constitutional homozygosity; this approach was therefore more useful for indicating definite retention of heterozygosity than for mapping LOH.
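The scoring rule for unpaired cell line data can be sketched in Python as follows. The marker names are taken from the panel above, but the allele sizes, function names and category labels are hypothetical; the sketch only encodes the logic that one observed allele is compatible with LOH, whereas two distinct alleles show that heterozygosity is retained.

```python
from typing import Dict, Sequence


def score_marker(allele_sizes: Sequence[int]) -> str:
    """Classify one marker from the allele sizes observed in a cell line."""
    distinct = set(allele_sizes)
    if not distinct:
        return "no call"
    if len(distinct) == 1:
        # Without constitutional DNA this may also be constitutional homozygosity.
        return "possible LOH"
    return "heterozygosity retained"


def score_panel(calls: Dict[str, Sequence[int]]) -> Dict[str, str]:
    """Apply score_marker to every marker in a panel."""
    return {marker: score_marker(sizes) for marker, sizes in calls.items()}


# Hypothetical allele sizes (base pairs) for three markers from the panel.
example_calls = {"D5S623": [152, 160], "D5S664": [201], "D5S346": []}
panel_scores = score_panel(example_calls)
```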
Pyrosequencing
A Pyrosequencing assay was designed, according to the manufacturer's standard protocols, to discriminate between the SMN1 and SMN2 genes using a C>T substitution [10] corresponding to SNP rs4916 (location chr5:69,408,109 in the March 2006 Human Genome Build). Briefly, primers were designed using the proprietary Pyrosequencing assay design software, giving: (i) biotinylated forward PCR primer, TCCTTTATTTTCCTTACAGGGTTT; (ii) reverse PCR primer, ATGCTGGCAGACTTACTCCTTAAT; and (iii) sequencing primer, TCCTTCTTTTTGATTTTGT. Allelic intensities were derived from the standard Pyrosequencing software and used to derive genotypes (see below).
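As a generic illustration of working with two allelic intensities, the Python sketch below expresses them as a proportion of the total signal. It is an assumption for illustration only: the function name and the intensity values are hypothetical, no genotype thresholds are implied, and it does not reproduce the Pyrosequencing software or the genotyping rule used in the study.

```python
def allele_fraction(first_intensity: float, second_intensity: float) -> float:
    """Fraction of the total signal contributed by the first allele."""
    total = first_intensity + second_intensity
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return first_intensity / total


# Hypothetical peak intensities for the two alleles at the assayed position.
example_fraction = allele_fraction(30.0, 10.0)
```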