Participants in the present study were recruited in the context of a wider research project to evaluate stroke risk factors in a Portuguese population sample, which enrolled first-ever stroke patients under 65 years of age through the Neurology and Internal Medicine Departments of several hospitals in Portugal. Stroke was defined as a focal neurological deficit of sudden or rapid onset lasting more than 24 hours, and was classified as ischemic stroke or intracerebral hemorrhage based on brain imaging (computed tomography and/or magnetic resonance imaging). The diagnosis of stroke was confirmed by a neurologist. Demographic characteristics (age and gender), information on previous vascular risk factors and comorbid conditions (diabetes mellitus, hypertension, cardiac disease, dyslipidemia, obesity), lifestyle risk factors (smoking status, alcohol consumption, physical inactivity and others), and detailed clinical data during hospitalization, including neurological symptoms, complications and interventions, were collected for the majority of patients. The occurrence of aphasia, neglect, paresis, gaze paresis, dysphagia, permanent consciousness disturbance, urinary incontinence, and medical and neurological complications were taken as clinical parameters indicative of stroke severity. Stroke outcome at discharge and at three months was assessed by direct interview, using the modified Rankin Scale (mRS).
For the present study, 568 patients with relevant clinical data and a DNA sample were available. Eight patients had a second stroke event after enrolment, affecting patient recovery, and were thus excluded. Of the remaining 560, 14 did not return after discharge for the three-month evaluation, and therefore only 546 patients were included in the analysis. Patients were classified into two groups according to their mRS at three months: patients with mRS ≤ 1 were assigned to the "good recovery" group and patients with mRS > 1 were assigned to the "poor recovery" group (handicapped patients). The good recovery group included 276 individuals (63.0% males and 37.0% females) and the poor recovery group included 270 (64.4% males and 35.6% females). The poor recovery group included 12 patients who died before the three-month evaluation (seven of them before hospital discharge and five after discharge). Genetic power calculations were performed using the CaTS software.
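As an illustration only, the dichotomization of the three-month mRS described above can be sketched as follows. The function name `recovery_group` and the example scores are hypothetical and are not study data; the sketch also assumes the standard mRS range of 0 to 6, in which a score of 6 denotes death, consistent with deceased patients falling in the poor recovery group.

```python
from collections import Counter


def recovery_group(mrs_3m: int) -> str:
    """Assign a patient to a recovery group from the mRS at three months.

    mRS <= 1 -> "good"; mRS > 1 -> "poor" (assumed range 0..6, 6 = death).
    """
    if not 0 <= mrs_3m <= 6:
        raise ValueError(f"mRS out of range: {mrs_3m}")
    return "good" if mrs_3m <= 1 else "poor"


# Hypothetical scores, for illustration only (not study data).
scores = [0, 1, 2, 3, 6, 1, 4]
counts = Counter(recovery_group(s) for s in scores)

# Consistency check on the cohort sizes reported in the text.
assert 568 - 8 - 14 == 276 + 270 == 546
```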
The study was approved by the Ethics Committees of Instituto Nacional de Saúde Dr. Ricardo Jorge and of the other hospitals involved. Subjects gave informed consent, and the procedures followed were in accordance with institutional guidelines.
Single nucleotide polymorphisms (SNPs) within the MMP-2 and MMP-9 genes and up to 5 kb of the flanking regions were selected using the Haploview software (v4.0), based on their tagging potential (HapMap Release 21/phase II July 2006). Four SNPs in MMP-9 and 20 SNPs in MMP-2 were genotyped using Sequenom iPLEX assays with allele detection by mass spectrometry, using Sequenom MassARRAY technology (Sequenom, San Diego, USA) and following the manufacturer's protocol. Primer sequences were designed using Sequenom's MassARRAY Assay Design 3.0 software. One additional SNP in MMP-2 was genotyped using TaqMan® Pre-Designed SNP Genotyping Assays in an ABI PRISM 7900HT Sequence Detector System (Applied Biosystems, Foster City, USA). Extensive quality control was performed using eight HapMap individuals, duplicated samples within and across genotyping plates, Mendelian segregation in three pedigrees and no-template samples. SNPs with a call rate < 90% or deviating from Hardy-Weinberg equilibrium were excluded from the analysis. Two SNPs in MMP-9 failed quality control and were replaced. In total, 21 MMP-2 SNPs and 4 MMP-9 SNPs were analysed.
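The two SNP exclusion criteria can be illustrated with a minimal sketch. It assumes a 1-degree-of-freedom χ² goodness-of-fit test for Hardy-Weinberg equilibrium and a significance threshold of 0.05; neither the test nor the threshold is specified in the text, and the function names are hypothetical. Only the 90% call-rate cut-off is taken from the description above.

```python
import math


def hwe_chi2_pvalue(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(n_aa, n_ab, n_bb, n_samples,
              min_call_rate=0.90,  # threshold stated in the text
              hwe_alpha=0.05):     # ASSUMPTION: not stated in the text
    """Return True if a SNP meets both the call-rate and the HWE criteria."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    call_rate = called / n_samples
    return call_rate >= min_call_rate and hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= hwe_alpha


# Hypothetical genotype counts (AA, AB, BB) out of 100 samples.
in_equilibrium = passes_qc(25, 50, 25, n_samples=100)
no_heterozygotes = passes_qc(50, 0, 50, n_samples=100)
low_call_rate = passes_qc(20, 40, 20, n_samples=100)
```

In this sketch a SNP is dropped as soon as either criterion fails, so the first example is retained and the other two are excluded.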
The effect of discrete and continuous non-genetic variables on stroke outcome at three months was determined using the Pearson χ² test and the Mann-Whitney test, respectively. These variables included age, gender and stroke risk factors, as well as clinical variables collected during hospitalization (such as the occurrence of paresis, aphasia and medical complications). Variables with P < 0.25 in univariate analysis or of particular clinical relevance were entered into a logistic regression model using forward selection, and were retained in the model if they were associated with stroke outcome at P ≤ 0.05. Logistic regression analyses were then used to determine the effect of each genetic variable on stroke outcome after adjustment for those significant non-genetic variables. Odds ratios (OR) and 95% confidence intervals (95% CI) were computed for the log-additive model. Univariate and logistic regression analyses were performed using the MASS and SNPassoc packages of the R software (v2.6.0). The default method of the Haploview software (v4.0), that of Gabriel et al. (2002), was used to determine haplotype blocks in the MMP-2 and MMP-9 genes. Since recovery processes may be regulated differently in ischemic and hemorrhagic stroke patients, we performed the same analyses in the subset of ischemic stroke patients. The small number of hemorrhagic stroke patients (N = 105) precluded an independent analysis of this subset.
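To make the log-additive model concrete, the following is a simplified sketch rather than the analysis actually run (which used the R packages named above and adjusted for non-genetic covariates). It fits an unadjusted logistic regression of outcome on the number of minor alleles (0, 1 or 2) by Newton-Raphson and reports the per-allele OR with a Wald 95% CI. The coding of poor recovery as 1, the function names and the genotype counts are all assumptions made for illustration.

```python
import math


def _score_and_information(b0, b1, genotypes, outcomes):
    """Gradient and Fisher information of the logistic log-likelihood."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in zip(genotypes, outcomes):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        w = p * (1.0 - p)
        g0 += y - p
        g1 += (y - p) * x
        h00 += w
        h01 += w * x
        h11 += w * x * x
    return g0, g1, h00, h01, h11


def log_additive_or(genotypes, outcomes, max_iter=50, tol=1e-10):
    """Per-allele OR and Wald 95% CI from logit(P) = b0 + b1 * allele count."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0, g1, h00, h01, h11 = _score_and_information(b0, b1, genotypes, outcomes)
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det  # Newton step: information^-1 * score
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    # Standard error of b1 from the inverse information at the estimate.
    _, _, h00, h01, h11 = _score_and_information(b0, b1, genotypes, outcomes)
    se = math.sqrt(h00 / (h00 * h11 - h01 * h01))
    z = 1.959964  # 97.5th percentile of the standard normal distribution
    return math.exp(b1), math.exp(b1 - z * se), math.exp(b1 + z * se)


# Hypothetical counts (poor, good) for 0, 1 and 2 copies of the minor allele,
# constructed so that the odds of poor recovery double with each copy.
genotypes, outcomes = [], []
for copies, (n_poor, n_good) in enumerate([(20, 80), (30, 60), (20, 20)]):
    genotypes += [copies] * (n_poor + n_good)
    outcomes += [1] * n_poor + [0] * n_good

odds_ratio, ci_low, ci_high = log_additive_or(genotypes, outcomes)
```

Under this coding a single coefficient describes the effect of each additional allele, which is why one OR per SNP is reported; adjusting for covariates would add further columns to the design matrix but leave the interpretation of the genotype coefficient unchanged.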
Significant associations in individual SNP analysis were corrected for multiple testing using the Bonferroni method. The alternative SNPSpD approach, based on the spectral decomposition (SpD) of matrices of pairwise linkage disequilibrium (LD) between SNPs, was also applied. Since some of the 21 SNPs genotyped in the MMP-2 gene are in LD with each other in our sample, the Bonferroni correction is conservative; we therefore used the SNPSpD approach to estimate the effective number of independent SNPs for multiple testing correction.
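The idea behind the effective number of independent SNPs can be sketched as below, assuming the variance-of-eigenvalues estimate on which spectral decomposition methods of this kind are based, M_eff = 1 + (M - 1)(1 - Var(λ)/M), where λ are the eigenvalues of the M × M pairwise LD correlation matrix. The example matrix is hypothetical, and dividing the significance level by M_eff is an assumed Bonferroni-style use of the estimate; the SNPSpD software itself may report a different adjusted threshold.

```python
def effective_number_of_snps(ld_corr):
    """Estimate M_eff from a symmetric LD correlation matrix with unit diagonal.

    For such a matrix the eigenvalues sum to M and their squares sum to the
    sum of squared entries, so Var(lambda) needs no explicit eigendecomposition.
    """
    m = len(ld_corr)
    if m == 1:
        return 1.0
    sum_sq_eigen = sum(v * v for row in ld_corr for v in row)
    var_lambda = (sum_sq_eigen - m) / (m - 1)  # sample variance; mean eigenvalue is 1
    return 1.0 + (m - 1) * (1.0 - var_lambda / m)


def corrected_threshold(alpha, n_tests):
    """Bonferroni-style per-test significance threshold."""
    return alpha / n_tests


# Hypothetical LD correlations for four SNPs; the first two are in strong LD.
ld = [
    [1.0, 0.8, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
m_eff = effective_number_of_snps(ld)

bonferroni_21 = corrected_threshold(0.05, 21)      # plain Bonferroni, 21 MMP-2 SNPs
spd_threshold = corrected_threshold(0.05, m_eff)   # ASSUMED use of M_eff
```

When all SNPs are uncorrelated the estimate equals the number of SNPs and the two thresholds coincide; when all SNPs are in complete LD it collapses to 1.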