The medical literature grows weekly with studies identifying DNA variants and their possible interactions with environmental factors that may affect the risk of disease. The growth of such studies has been spurred by the promise of understanding the genetic and environmental basis of complex diseases, and by the possibility of identifying therapeutically responsive targets for drug development. Enormous numbers of DNA variants have been associated with diseases and traits, and this number will only grow as it becomes economically feasible to sequence an individual patient's entire genome.
One key data interpretation challenge lies in how best to assess the phenotypic heterogeneity and risk factor heterogeneity within the affected patient population. Even in situations where the association between a risk factor and disease is highly significant, there are individuals with the disease who do not manifest all risk factors and those with risk factors who manifest no disease. The presence of a risk factor is not a sufficient determinant of disease. This point is a critical consideration in drug development, as the effective size of the patient population that may be treated with a drug designed to target a particular genetic risk factor may in fact be much smaller than the total patient population. This has implications for the design of clinical trials that may incorporate genetic data, and ultimately for decisions on the feasibility of producing a medication.
Age-related macular degeneration (AMD) is an example of a complex disease that has been shown to have clear genetic and environmental antecedents. The leading cause of visual loss in the aging population, neovascular AMD is characterized by the growth of abnormal new blood vessels beneath the retina, which can cause severe and rapid vision loss due to hemorrhage and exudation (for review, see Miller, 2008).
The general population harbors both modifiable and non-modifiable characteristics associated with AMD; however, the current study examines the affected subsample rather than the entire population. Epidemiologic characteristics previously shown to be associated with the risk of AMD include age, gender, elevated body mass index (BMI), hyperlipidemia, hypertension, and cigarette smoking[4–12]. These factors are all well documented to be associated with the risk of cardiovascular disease and of events such as myocardial infarction and stroke. In terms of cardiovascular risk factors, several studies have found that cigarette smoking (perhaps through oxidative stress and injury) elevates the risk of AMD[13, 14]. Heavy alcohol consumption, another risk factor for cardiovascular disease, was associated with late-stage AMD, including neovascular AMD, in one study, but other studies were unable to replicate this association[15–17]. Similarly, elevated BMI has been associated with both AMD progression and elevated risk of AMD[19, 20].
Cholesterol and lipid metabolism have been implicated in the pathogenesis of AMD[21–32], and there is evidence both for and against the hypothesis that cholesterol-lowering statin therapy may protect against the development of AMD[33–36]. For hypertension, the evidence for an association with neovascular AMD is conflicting[6, 19, 37, 38].
Several genes have been associated with all subtypes of AMD, including the advanced stages, with the most strongly associated variants seen within the complement factor H (CFH) gene on chromosome 1q25. The CFH gene is known to play a role in the immune/inflammatory system[39–43]. Other strongly associated variants with a large influence on AMD risk, particularly for the neovascular subtype, are found in the ARMS2/HTRA1 genes on chromosome 10q26[44–49].
Nevertheless, the ability to predict AMD risk would be greatly enhanced if the effects of both genetic and environmental risk factors were considered collectively, although the degree to which these factors interact in the risk of AMD or its progression is unclear. For example, although cigarette smoking has been shown to elevate the risk of AMD and its progression, significant interactions between smoking and CFH variants in predicting AMD risk have not been shown[50, 51]. Although one report describes an interaction between variation within ARMS2 and smoking, others have not demonstrated this finding[46, 49]. In terms of cardiovascular risk factors, when smoking was included in a multivariate model, alcohol consumption, hypertension, and BMI were no longer associated with neovascular AMD. Only a history of cigarette smoking remained significantly associated with neovascular AMD, with each pack-year being associated with a 2% increase in the risk of disease. It is therefore important that presymptomatic diagnostic tests (and presumably any therapeutic agents in development) be designed to assess all informative genetic variants along with documented disease-associated environmental factors[2, 51].
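To make the per-pack-year figure concrete, the following minimal sketch shows how a 2% increase per pack-year would accumulate over a smoking history. It assumes that the 2% figure behaves as an odds ratio of 1.02 per pack-year in a log-linear (logistic-type) model, so that effects multiply across pack-years; that modeling assumption and the function name `relative_odds` are illustrative and are not stated in the text.

```python
def relative_odds(pack_years: float, per_pack_year_ratio: float = 1.02) -> float:
    """Relative odds versus zero pack-years.

    Assumption (not from the source): the effect is multiplicative,
    i.e. each additional pack-year scales the odds by the same ratio.
    """
    if pack_years < 0:
        raise ValueError("pack_years must be non-negative")
    return per_pack_year_ratio ** pack_years


if __name__ == "__main__":
    # Illustrative smoking histories, not data from the study.
    for pack_years in (0, 10, 20, 40):
        print(f"{pack_years:>2} pack-years -> relative odds {relative_odds(pack_years):.2f}")
```

Under this assumption a 20 pack-year history corresponds to roughly a 1.5-fold increase in odds, which illustrates how a small per-unit effect can become substantial over a long exposure.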
Recognizing that any patient population with the same disease phenotype will be heterogeneous to some degree for any single risk factor or collection of factors, it is critical that a multivariate or multifactor approach be used to consider risk. Another important consideration in interpreting measures of association is that, even when an association is statistically significant, not all cases with the disease will have the risk factor. For example, our group has shown that having two copies of the risk allele (TT) at ARMS2/HTRA1 rs1049331 significantly increases the risk of developing neovascular AMD compared with being homozygous for the common allele (CC), with an effect many times larger than that of important non-genetic factors such as smoking. However, only 33% of the neovascular AMD patients evaluated actually carry the TT genotype, compared with 16% of their matched sibling controls.
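The arithmetic behind this point can be sketched from the two carrier frequencies alone. The example below computes the share of cases that do not carry TT and a crude odds ratio for TT carriage. It is only an illustration: it compares TT with all other genotypes pooled (not TT with CC, as in the comparison described above) and it ignores the sibling matching, so it is not the effect estimate reported by the study. The function names are hypothetical.

```python
def odds(proportion: float) -> float:
    """Convert a proportion strictly between 0 and 1 into odds."""
    if not 0.0 < proportion < 1.0:
        raise ValueError("proportion must be strictly between 0 and 1")
    return proportion / (1.0 - proportion)


def crude_odds_ratio(p_cases: float, p_controls: float) -> float:
    """Unmatched odds ratio for carrying the genotype, cases versus controls."""
    return odds(p_cases) / odds(p_controls)


# Carrier frequencies quoted in the text.
tt_in_cases = 0.33
tt_in_controls = 0.16

# Share of patients who have the disease without the TT genotype.
cases_without_tt = 1.0 - tt_in_cases

# Crude (unmatched, TT versus all other genotypes) odds ratio.
tt_odds_ratio = crude_odds_ratio(tt_in_cases, tt_in_controls)

if __name__ == "__main__":
    print(f"Cases without TT: {cases_without_tt:.0%}")
    print(f"Crude odds ratio for TT carriage: {tt_odds_ratio:.2f}")
```

Even with TT carriage clearly enriched among cases, about two thirds of the patients do not carry the genotype, which is the heterogeneity the paragraph describes.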
Consequently, it is reasonable to hypothesize that appropriately designed studies may be able to identify meaningfully distinct subtypes or clusters of patients within the neovascular AMD population on the basis of genetic or environmental characteristics predictive of the risk of disease. If, for example, a pharmaceutical company were developing a drug for neovascular AMD that targeted specific genetic and cardiovascular risk characteristics, the patient population that might respond to or benefit from such an agent would be a subset of the total, comprising only those patients with that particular risk profile. There may well be overlapping pathophysiological antecedents between the risk of cardiovascular disease and that of neovascular AMD[4, 10–12, 53, 54].
In the present study we examine the genetic and cardiovascular risk characteristics of patients with neovascular AMD in a multivariate segmentation analysis to identify clusters of patients with distinct epidemiologic and genetic risk profiles. To do this, we use cluster analysis, a multivariate method that yields groups of individuals who share underlying similarities across a number of behavioral, attitudinal, and/or demographic characteristics. In the public health sector, standard clustering methods have been used to identify relevant subgroups of individuals with a particular disorder. For example, three distinct subgroups of individuals with obsessive compulsive disorder have been identified. Each group was characterized by different pathophysiologic mechanisms and treatment outcomes, which may have significance in classifying and treating these patients. Other clustering studies have been conducted with suicidal psychiatric patients, substance abusers, patients with Parkinson's disease, and caregivers of eating disorder patients, among others.
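The general idea of segmenting patients by risk profile can be sketched as follows. This is a minimal illustration only: the text does not name the clustering algorithm used, so k-means with a deterministic farthest-first initialization is swapped in here as a stand-in. The feature columns (risk-allele count, pack-years, BMI), the treatment of genotype as a numeric count, and the `toy_patients` values are all hypothetical and synthetic, not study data.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def standardize(rows):
    """Rescale each column to zero mean and unit variance so that no
    single feature dominates the distance calculation."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    # Population standard deviation; fall back to 1.0 for constant columns.
    sds = [(sum((v - m) ** 2 for v in col) / len(col)) ** 0.5 or 1.0
           for col, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def farthest_first_centroids(rows, k):
    """Deterministic start: first row, then repeatedly the row farthest
    from all centroids chosen so far."""
    centroids = [list(rows[0])]
    while len(centroids) < k:
        farthest = max(
            rows, key=lambda r: min(squared_distance(r, c) for c in centroids)
        )
        centroids.append(list(farthest))
    return centroids


def k_means(rows, k, max_iterations=100):
    """Plain k-means: alternate nearest-centroid assignment and centroid
    update until the labels stop changing."""
    centroids = farthest_first_centroids(rows, k)
    labels = None
    for _ in range(max_iterations):
        new_labels = [
            min(range(k), key=lambda j: squared_distance(row, centroids[j]))
            for row in rows
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [row for row, label in zip(rows, labels) if label == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(col) for col in zip(*members)]
    return labels, centroids


# Synthetic toy patients: [risk-allele count, pack-years, BMI].
toy_patients = [
    [2, 5, 24], [2, 0, 23], [2, 8, 25], [1, 3, 22],      # mostly genetic risk
    [0, 40, 31], [0, 35, 33], [1, 45, 30], [0, 50, 32],  # mostly cardiovascular risk
]

labels, centroids = k_means(standardize(toy_patients), k=2)

if __name__ == "__main__":
    for patient, label in zip(toy_patients, labels):
        print(patient, "-> cluster", label)
```

Standardizing first matters because pack-years and BMI are on much larger scales than an allele count. In practice the number of clusters and the handling of categorical variables would need to be chosen and validated for the data at hand.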