Effects of interacting networks of cardiovascular risk genes on the risk of type 2 diabetes mellitus (the CODAM study)

Background: Genetic dissection of complex diseases requires innovative approaches for identification of disease-predisposing genes. A well-known example of a human complex disease with a strong genetic component is Type 2 Diabetes Mellitus (T2DM). Methods: We genotyped normal-glucose-tolerant subjects (NGT; n = 54), subjects with an impaired glucose metabolism (IGM; n = 111) and T2DM (n = 142) subjects, in an assay (designed by Roche Molecular Systems) for detection of 68 polymorphisms in 36 cardiovascular risk genes. Using the single-locus logistic regression and the so-called haplotype entropy, we explored the possibility that (1) common pathways underlie development of T2DM and cardiovascular disease -which would imply enrichment of cardiovascular risk polymorphisms in "pre-diabetic" (IGM) and diabetic (T2DM) populations- and (2) that gene-gene interactions are relevant for the effects of risk polymorphisms. Results: In single-locus analyses, we showed suggestive association with disturbed glucose metabolism (i.e. subjects who were either IGM or had T2DM), or with T2DM only. Moreover, in the haplotype entropy analysis, we identified a total of 14 pairs of polymorphisms (with a false discovery rate of 0.125) that may confer risk of disturbed glucose metabolism, or T2DM only, as members of interacting networks of genes. We substantiated gene-gene interactions by showing that these interacting networks can indeed identify potential "disease-predisposing allele-combinations". Conclusion: Gene-gene interactions of cardiovascular risk polymorphisms can be detected in prediabetes and T2DM, supporting the hypothesis that common pathways may underlie development of T2DM and cardiovascular disease. Thus, a specific set of risk polymorphisms, when simultaneously present, increases the risk of disease and hence is indeed relevant in the transfer of risk.


Background
Genetic strategies for dissection of complex diseases require innovative approaches for identification of disease-predisposing genes or combinations of polymorphisms. The development and progression of complex diseases involve interplay of genetic and environmental factors. This implies involvement of several susceptibility genes in the development of a complex disease, and an interaction between these genes and the environment, e.g. lifestyle habits [1]. Not only gene-environment interactions, but also gene-gene interactions may be highly relevant for development of a complex disease.
Type 2 Diabetes Mellitus (T2DM) is an example of a complex, multigenic disorder with a high prevalence in man [2,3]. A large number of candidate susceptibility genes and loci for T2DM have been proposed and some of those have been replicated in more than one study population using linkage analysis (reviewed by Stern [4]).
It is becoming increasingly clear that simple assessment of the contribution of individual genes to T2DM susceptibility via analysis of the individual effects of common polymorphisms may not be sufficient for identification of the processes and genes involved in T2DM. Indeed, genetic interactions in T2DM have been described for some pairs of genes or SNPs [5][6][7]. Therefore, these polymorphisms should be studied both individually and as members of genetic networks. To address complex relations between potential susceptibility genes, Zhang et al. [8] developed a method for detecting complex haplotype interactions (or allele coupling) among a set of polymorphisms, which may influence the susceptibility to a complex disease. The advantage of this method over single-locus analyses is that it allows for identification of the combined presence of specific (functional) SNPs in disease. This approach has been shown to be very effective for the analysis of gene-gene interactions in which the genetic polymorphisms do not reside on the same chromosome, i.e. are evidently physically unlinked, but interact with each other to contribute to the risk of complex disease [9]. The rationale behind this approach is that simultaneous presence of particular gene variations, even when physically unlinked, may predispose to disease because some variations may affect the "local environment" (e.g. insulin resistance or glucose homeostasis) in a way that affects the consequences of other variations (e.g. reduced beta-cell function).
T2DM is known to be associated with increased risk of cardiovascular disease and we hypothesize that the increased cardiovascular risk in T2DM implies common metabolic pathways for development of T2DM and cardiovascular disease and, hence, that "pre-diabetic" (IGM) and diabetic (T2DM) populations will be enriched in cardiovascular risk polymorphisms. This motivated us to use an assay for candidate markers of cardiovascular risk designed by Roche Molecular Systems [10] for our current analyses.
For this study we determined a set of known cardiovascular risk polymorphisms in the CODAM (Cohort Study on Diabetes and Atherosclerosis Maastricht) population [11]. We used both single locus-based logistic regression and the method developed by Zhang et al. [8] to explore the possibility that complex haplotype interactions (or allele coupling) in this set of cardiovascular risk polymorphisms can be implicated in risk of glucose intolerance or T2DM.

Description of subjects
The Cohort study of Diabetes and Atherosclerosis Maastricht (CODAM) is a prospective population-based cohort study in The Netherlands [10]. Inclusion criteria were: age 40-70 years and Caucasian descent (i.e. four Caucasian grandparents), and in addition either a body mass index (BMI)>25 kg/m2, and/or a positive family history for T2DM, and/or a history of gestational diabetes, and/or the use of antihypertensive medication, and/or a postprandial blood glucose larger than 6.0 mmol/l and/or glucosuria. All subjects were genetically independent (i.e. unrelated) and the NGT, IGM and T2DM phenotype was assigned based on the results of an oral glucose tolerance test. IGM subjects were either impaired glucose tolerant or had impaired fasting glucose levels. A majority of the subjects (>85%) were unaware of their glucose tolerance status prior to inclusion in this cohort. The study was approved by the local Medical Ethical Committee of the Maastricht University/Maastricht University Hospital and all subjects gave written informed consent. All subjects with disturbed glucose metabolism (i.e. subjects who were IGM or who had T2DM) and a random sample of the control (NGT) individuals were included in the current study. A summary of the characteristics of these subjects is provided in Table 1.

Genotyping
Research assays for cardiovascular disease genetics designed by Roche Molecular Systems [11] were used to genotype 68 SNPs in genes in pathways of lipid and homocysteine metabolism, regulation of blood pressure and coagulation, inflammation, cellular adhesion, and matrix integrity (listed in Table 2). DNA samples of the subjects included in the study were genotyped by using the polymerase chain reaction (PCR). This led to 309 genotypes of 68 loci that can each be assigned to one of the following three groups: a NGT (control) group (n = 54), a glucose intolerant group (n = 111 IGM subjects), and a T2DM group (n = 142 patients). Each genotype can be divided into 16 blocks according to chromosome identity (see Table 2). A cut-off value of ≥ 21% for "heavy missing" was selected on the basis of experience [8]. We decided not to include the APOE(Arg158Cys) locus in the logistic regression analyses because it had a missing rate of 41.7% (≥ 21%). All other loci had less than 10% missing.

Phase 1: Detection of main individual effects of the polymorphisms
To evaluate the main effects of the polymorphisms on insulin resistance or T2DM, without taking into account the genotype for the other 67 polymorphisms, single-locus logistic regression analyses were performed. Four polymorphisms (i.e., ACE, ADRB2, GNB3, APOC3) that provided significant results in these analyses were subsequently evaluated together in a multiple logistic regression model to determine whether these effects were independent of one another. We applied the procedure of Storey and Tibshirani [12] to control the overall error rate of multiple testing. This procedure uses a statistical significance measure called the false discovery rate (FDR), which is the expected proportion of false positives among the tests called significant [12,13].
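The FDR measure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `q_values` and the parameter `pi0` (the assumed proportion of true null hypotheses) are hypothetical, and the sketch takes `pi0` as given, whereas Storey and Tibshirani estimate it from the P-value distribution. With `pi0 = 1.0` the result equals the Benjamini-Hochberg adjusted P-values.

```python
def q_values(p_values, pi0=1.0):
    """Return FDR q-values in the original order of p_values.

    pi0 is the assumed proportion of true null hypotheses (an assumption of
    this sketch; Storey and Tibshirani estimate it from the data). With
    pi0 = 1.0 the result equals Benjamini-Hochberg adjusted P-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, so that each q-value is the
    # smallest estimated FDR over all thresholds at or above that P-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        estimate = pi0 * m * p_values[i] / rank
        running_min = min(running_min, estimate)
        q[i] = running_min
    return q


if __name__ == "__main__":
    # Hypothetical P-values for five loci; these are not the values of Table 3.
    example = [0.001, 0.002, 0.008, 0.20, 0.60]
    for p, qv in zip(example, q_values(example)):
        print(f"P = {p:.3f}  q = {qv:.3f}")
```

A locus is then called significant at FDR level α when its q-value is at most α.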

Phase 2: Detection of gene-gene interactions
To detect the haplotype interactions that influence the susceptibility to IGM and T2DM, we performed the following haplotype entropy procedure on every pair of SNP blocks and on every SNP pair.
Consider the unphased genotype data $G = \{G_1, \ldots, G_n\}$ on the $n$ subjects at $m$ loci from any two SNP blocks or two SNPs, where $G_i = (G_{i1}, \ldots, G_{im})$ and $G_{ij}$ takes the values 0, 1, 2 according to whether the genotype at locus $j$ is homozygous for allele 0, homozygous for allele 1, or heterozygous. Additional categories are created for two missing alleles ($G_{ij} = 9$) and for a single missing allele ($G_{ij} = 7$ when allele 0 is missing at locus $j$ and $G_{ij} = 8$ when allele 1 is missing). Each $G_i$ can be partitioned into two sub-genotypes, one for each block. We want to test the null hypothesis that the above two SNP blocks or two SNPs are independent. In the case of testing the independence between two SNPs, the problem can be viewed as testing the independence of two single-SNP blocks with $m = 2$. Note that for $m$ loci, the number of possible genotypes is $3^m$, much higher than $2^m$, the number of possible haplotypes. This means that the genotype space involved in modelling the dataset $G$ has a much higher dimension than the corresponding haplotype space. Therefore, using haplotype frequencies to test the above null hypothesis would be more efficient than using the genotype-based $\chi^2$ test and Fisher exact test in contingency tables. However, these haplotype frequencies are not directly available because the underlying haplotype pairs for these genotypes are unknown.
In an effort to tackle these limitations, Zhang et al. [8] proposed a profile likelihood for the haplotype frequencies, given a set $H$ of candidate haplotype pairs for the genotypes. Taking the negative logarithm of this profile likelihood and dividing it by $2n$, we obtain the entropy of the estimated haplotype frequencies, denoted $e(G \mid H)$. Taking account of the uncertainty of $H$, we define the haplotype-entropy $e(G)$ of these genotypes as the minimal value of $e(G \mid H)$ when $H$ runs over all the possible sets of candidate haplotype pairs. That is, $e(G)$ is the value of $e(G \mid H)$ when $H$ attains the maximum of the above profile likelihood [9]. Zhang et al. [8] demonstrated that the smaller the haplotype-entropy, the stronger the interaction between the above two SNP blocks. This implies that when two SNP blocks are interacting, the haplotype-entropy of their combined genotypes is expected to be lower than that of genotypes generated from two independent blocks. Note that when the two SNP blocks are independent, exchanging the labels among sub-genotypes in one block does not change the distribution of $e(G)$. To take advantage of this property in calculating the null distribution of $e(G)$ (i.e., the distribution when the two blocks are independent), we conducted a permutation $\pi$ only on the labels of the sub-genotypes in one block, which leads to a permuted genotype sample $G^{\pi}$ with haplotype-entropy $e(G^{\pi})$. Repeating this permutation procedure, say, 200 times, we obtained a permutation distribution of $e(G)$ as an approximation to the null distribution of $e(G)$. The P-value for an observed value of $e(G)$ is then defined as the proportion of times that $e(G)$ is larger than $e(G^{\pi})$. A Z-score can also be defined as in [8].
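The haplotype-entropy and its permutation P-value can be sketched in Python for the simplest case. This is an illustration under stated assumptions, not the authors' C/S-Plus implementation: it covers only two biallelic SNPs (m = 2) without missing data, where the only phase-ambiguous genotype is the double heterozygote, so the minimum over candidate haplotype pairs can be found by trying every split of double heterozygotes between the two possible phases. All function names are hypothetical.

```python
import math
import random
from collections import Counter

# Genotype coding from the text: 0 = homozygous for allele 0,
# 1 = homozygous for allele 1, 2 = heterozygous.
# The missing-data codes (7, 8, 9) are not handled in this sketch.
ALLELES = {0: (0, 0), 1: (1, 1), 2: (0, 1)}


def entropy(counts, total):
    """Shannon entropy (natural log) of the frequencies count / total."""
    return -sum(c / total * math.log(c / total) for c in counts.values() if c > 0)


def haplotype_entropy(genotypes):
    """Minimal haplotype entropy e(G) for a list of (snp1, snp2) genotypes."""
    base = Counter()
    n_double_het = 0
    for g1, g2 in genotypes:
        if g1 == 2 and g2 == 2:
            n_double_het += 1  # phase unknown: 00/11 or 01/10
            continue
        (a1, a2), (b1, b2) = ALLELES[g1], ALLELES[g2]
        base[(a1, b1)] += 1  # phase is unambiguous when at least
        base[(a2, b2)] += 1  # one of the two loci is homozygous
    total = 2 * len(genotypes)
    best = math.inf
    # Double heterozygotes are interchangeable, so only the number k
    # assigned to phase 00/11 (versus 01/10) matters.
    for k in range(n_double_het + 1):
        counts = Counter(base)
        counts[(0, 0)] += k
        counts[(1, 1)] += k
        counts[(0, 1)] += n_double_het - k
        counts[(1, 0)] += n_double_het - k
        best = min(best, entropy(counts, total))
    return best


def permutation_p_value(genotypes, n_perm=200, seed=0):
    """Proportion of permutations with e(G_pi) strictly below the observed e(G)."""
    rng = random.Random(seed)
    observed = haplotype_entropy(genotypes)
    first = [g[0] for g in genotypes]
    second = [g[1] for g in genotypes]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(second)  # permute the labels of one block only
        if haplotype_entropy(list(zip(first, second))) < observed:
            hits += 1
    return hits / n_perm


if __name__ == "__main__":
    # Hypothetical, perfectly coupled genotypes (not study data).
    coupled = [(0, 0)] * 10 + [(1, 1)] * 10 + [(2, 2)] * 10
    print(haplotype_entropy(coupled), permutation_p_value(coupled))
```

The strict inequality follows the definition in the text; with few subjects, ties between the observed and permuted entropies can occur, and how they are counted is a design choice. For blocks of more than one SNP the number of candidate phasings grows quickly, so this brute-force search would need to be replaced by an optimisation step.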
Our haplotype entropy procedure involves two stages: In the first stage the above permutation procedure is performed on each of three individual groups for interactions between and within haplotype blocks defined in Table 2.
The glucose intolerance and T2DM predisposing interactions are found in the second stage by contrasting the interaction patterns observed for patients with the interaction patterns for controls. Thus, significant interaction between polymorphisms in one glucose tolerance state, but not in the others, implies an up- or down-interaction.
Here, an up-interaction means that two blocks (or two polymorphisms) are independent in controls (NGT subjects) but become dependent in cases (IGM or T2DM subjects). A down-interaction is defined similarly: the two blocks are dependent in controls but independent in cases. Up-interactions would suggest that those interactions lead to a susceptibility to the disease, whereas down-interactions could imply that the related interactions may have a protective effect on developing the disease [8].
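The up/down definitions can be written as a small decision rule. The sketch below is illustrative only: the function name and return labels are hypothetical, and the default thresholds are the P ≤ 0.05 / P ≥ 0.145 criteria used in this study. The example values are the NPPA-ADRB2 P-values reported in the Results (P = 0.015 in NGT, P = 0.455 in IGM).

```python
def classify_interaction(p_controls, p_cases, p_sig=0.05, p_nonsig=0.145):
    """Classify a polymorphism pair from its permutation P-values.

    p_controls: P-value of the interaction test in controls (NGT).
    p_cases:    P-value of the interaction test in cases (IGM and/or T2DM).
    Returns "up", "down" or "none" (labels chosen for this sketch).
    """
    if p_cases <= p_sig and p_controls >= p_nonsig:
        return "up"      # independent in controls, dependent in cases
    if p_controls <= p_sig and p_cases >= p_nonsig:
        return "down"    # dependent in controls, independent in cases
    return "none"        # includes pairs that interact in every group


if __name__ == "__main__":
    print(classify_interaction(p_controls=0.015, p_cases=0.455))
```

Pairs with P-values between the two thresholds in either group are left unclassified, which is what the gap between 0.05 and 0.145 is for.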
Significant up- or down-interaction between two polymorphisms was established according to the criteria P ≤ 0.05/P ≥ 0.145. This was guided by a simulation study using a coalescent-based program called MS, by R Hudson [14]. We simulated genotypes for three different situations (i.e., closely linked with a low mutation rate, weakly linked with a low mutation rate, and weakly linked with a high mutation rate) described by the quantities $(\theta, R) = (4,4)$, $(4,20)$, and $(16,16)$. Here $\theta = 4N_e\mu$, $R = 4N_e r$, $N_e$ is the effective population size, $\mu$ is the total per-generation mutation rate across the region sequenced, and $r$ is the genetic distance, in morgans, between loci. For each setting of $(\theta, R)$, we generated 20 samples of size 20 as control samples from a population in which the underlying two genotype blocks are independent, and 20 samples of size 20 as case samples from a population in which the two genotype blocks are dependent. We then applied our testing procedure to these data sets. The accuracy of our procedure can be measured by the quantities $F_a$ and $F_p$, where $F_a$ is the proportion of false positives when the underlying two blocks in genotypes are independent, and $F_p$ is the proportion of successes in identifying an up- or down-interaction. Note that $F_a$ and $F_p$ can be roughly viewed as the type I error rate and the power of our testing procedure. For $(\theta, R) = (4,4)$, $(4,20)$, and $(16,16)$, $(F_a, F_p) = (0.10, 0.80)$, $(0.05, 0.80)$, and $(0.00, 0.85)$, respectively. The results imply that if we use the thresholds P ≤ 0.05/P ≥ 0.145 in our multiple testing procedure, the overall type I error rate could be reasonably controlled at the level of 0.10 or less for the above simulated situations.
Unfortunately, since the above coalescent model may be biased against our data, the true type I error could be significantly different from 0.10. We have addressed this issue by correcting for multiple testing using the so-called Bayesian FDR-controlling procedure of Storey and Tibshirani [12].
The interactions that have been found in the above two stages may facilitate understanding of the pathological mechanisms involved in the diseases, as well as the further identification of some SNP blocks that provide significant association with the diseases only when their interactions with other blocks are taken into account. Note that the significant interactions between two polymorphisms detected in stage 1 of these analyses may be present irrespective of the insulin resistant state (i.e. present in NGT, IGM and T2DM). Such polymorphisms can be considered to be in complete association in this Caucasian study population. Such interactions cannot be directly used to discriminate between insulin resistant states (see also Tables 4 and 5).

Phase 3: Disease-predisposing allele combinations
The data obtained in Phase 2 of the study only provide information on which combinations of polymorphisms may be related to disease risk, but this procedure does not identify risk alleles. We have further analysed the gene-pairs identified in phase 2 to specify which combinations of alleles may actually be predisposing to disease. For this, frequencies of allele combinations in NGT, IGM and/or T2DM subjects were compared (comparisons of proportions; χ² tests).
Statistical analyses of phase 1 and 3 studies were done using SPSS 9.0 (Chicago IL, USA). The analyses in phase 2 were done using C and Splus 7.0 (Insightful Corp).

Results
The general characteristics of the subjects (Table 1) show the expected metabolic differences between the NGT, IGM and T2DM subjects such as higher BMI, glucose, insulin, triglycerides and blood pressure and lower HDL (high density lipoprotein)-cholesterol in the T2DM patients. The groups did not differ with respect to gender and age. The NGT, IGM and T2DM groups were also comparable with respect to the presence of coronary heart disease.

Phase 1: Main individual effects of polymorphisms
The results of logistic regression analyses (Table 3) suggest that the heterozygous genotype of ACE(ins/del) and ADRB2(gln27glu) and the homozygous AA genotype of APOC3 (C-641A) may predict disturbed glucose metabolism (i.e. subjects who were IGM or who had T2DM; P = 0.001, P = 0.002 and P = 0.008, respectively). A homozygous genotype for the minor allele of GNB3 (C825T), on the other hand, predicts protection from T2DM with a P-value that may be considered borderline significant (P = 0.0007). Applying the FDR-controlling procedure of Storey and Tibshirani [12] to the P-values in Table 3, we found that the expected proportion of false positives among our findings for ACE (ins/del), ADRB2 (gln27glu), GNB3 (C825T) and APOC3 (C-641A) is 10%.
Footnotes to Table 3: a If total of minor + major alleles < 100%, this means that there were some missing genotype data; b The 4 polymorphisms that had a P-value < 0.05 in individual logistic analyses were entered in this logistic regression analysis simultaneously, hence the effects of these polymorphisms are adjusted for one another; c The expected proportion of false positives among these significant findings is 10% (FDR = 0.1).

Phase 2: Gene-gene interactions
We searched for pair-wise interactions between the 16 SNP blocks and between the polymorphisms. An up-interaction between two polymorphisms is claimed to be associated with disturbed glucose metabolism if significant evidence of the interaction between these two polymorphisms was found in the whole group of subjects with a disturbed glucose metabolism (those who were IGM or had T2DM) but not in the control (NGT) subjects. An up-interaction associated with T2DM is claimed if the interaction between these two polymorphisms was found significant in the T2DM subjects but not in the NGT or IGM subjects. Similarly we can define a down-interaction associated with disturbed glucose metabolism or associated with T2DM.
Note to Tables 4 and 5: Criteria for significant up- or down-interaction are P ≤ 0.05 and P ≥ 0.145. Thus, for instance, in the NGT subjects there is significant interaction between NPPA and ADRB2 (P = 0.015), which is absent in the IGM (P = 0.455) and the T2DM subjects (P = 0.410; Table 4). Hence, IGM and T2DM status (i.e. a disturbed glucose metabolism) are associated with down-interaction between NPPA and ADRB2. When significant interaction between two polymorphisms is detected irrespective of the insulin resistant state (i.e. is present in NGT, IGM as well as in T2DM), this means that these polymorphisms are in complete association. For example, the interaction between MTHFR and NPPA has P = 0.005 in the NGT subjects, P < 0.001 in the IGM subjects and P < 0.001 in the T2DM subjects (Table 5). This association is, by itself, not informative for the insulin resistance state but may indicate involvement of the genes via a transitive interaction (see results section).
This revealed interacting pairs of genetic polymorphisms that are described in Table 6 (underlying data are in Tables 4 and 5). Notably, three of the four polymorphisms that were identified as potential main factors in the previous section, i.e. ADRB2(Gln27Glu), APOC3(C-641A) and GNB3(C825T), were also members of one of these networks. If we accept an FDR of 0.125, then 14 pairs of polymorphisms were detected to be associated with either IGM or with T2DM or with both (Table 6). This implies that about 14 × FDR = 1.75 of these 14 pairs are expected to be false positives. Two important groups of networks can be identified. The first group, which might influence susceptibility to disturbed glucose metabolism (i.e. IGM and T2DM combined), was formed by down-interactions between LPA(G121A) and CBS(844 68bp-/Ins), between APOC3(C1100T) and F7(-323 10-bp Del/Ins), and between APOB(Thr71Ile) and SCNN1A(Ala663Thr), and an up-interaction between APOA4(Gln360His) and APOC3(C-641A). The second group, which was found to be potentially associated with T2DM only, consisted of up-interactions between SELE(Leu554Phe) and ITGA2(G873A), between APOB(Thr71Ile) and GNB3(C825T), between APOB(Thr71Ile) and CETP(Ile405Val), between ADRB2(Gln27Glu) and CBS(844 68bp-/Ins), and between APOA4(Thr347Ser) and APOC3(C1100T). In summary, at the FDR level of 0.125, polymorphisms in 7 genes (LPA, CBS, APOC3, F7, APOB, SCNN1A, APOA4) are involved in susceptibility to a disturbed glucose metabolism (subjects who were IGM or who had T2DM), and susceptibility polymorphisms in 9 genes (SELE, ITGA2, APOB, GNB3, CETP, ADRB2, CBS, APOA4, APOC3) may predispose to T2DM. Moreover, polymorphisms in 6 genes (ITGB3, CBS, LPA, F7, SCNN1A, APOC3) are involved in different networks that are IGM specific.
If we relax the FDR level to 0.461 (in which case P-values less than 0.05 are called significant), then 18 extra pairs of interacting polymorphisms can be identified (Table 6), but 32 × 0.461 = 14.75 of the resulting total of 32 pairs are expected to be false positives.
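The expected numbers of false positives quoted for the two FDR levels follow directly from the definition of the FDR as the expected proportion of false positives among the pairs called significant. A short check in Python (the helper name is hypothetical):

```python
def expected_false_positives(n_significant, fdr):
    """Expected number of false positives among n_significant findings."""
    return n_significant * fdr


if __name__ == "__main__":
    # 14 pairs at FDR = 0.125, and 14 + 18 = 32 pairs at FDR = 0.461.
    print(expected_false_positives(14, 0.125))
    print(round(expected_false_positives(14 + 18, 0.461), 2))
```

Relaxing the threshold therefore adds 18 pairs, while the expected number of false positives rises by about 13.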
In addition to these data on up- and down-interaction, we found that several polymorphisms were in complete association with some members of the interaction networks (i.e., these polymorphism pairs interact significantly in all three groups, NGT, IGM and T2DM; Table 5). This indicates that, besides the above-mentioned direct interactions via allele coupling, additional polymorphisms may also, but perhaps indirectly, contribute to the interaction networks. For example, MTHFR is in complete association with NPPA (Table 5), AGT is in complete association with ADRB2 (Table 4), and NPPA is down-interacting with ADRB2 (Table 4). This implies that MTHFR and AGT are transitively down-interacting, with effects on susceptibility to a disturbed glucose metabolism.

Phase 3: Disease-predisposing allele combinations
To substantiate the applicability of the data obtained in phase 2 of this study, we analysed the allelic distribution in the gene-pairs that were identified in phase 2. For the gene-pairs that were significantly associated with T2DM, ADRB2(Gln27Glu)-CBS(Ile278/Ins), APOB(Thr71Ile)-CETP(Ile405Val), APOA4(Thr347Ser)-APOC3(C1100T), APOB(Thr71Ile)-GNB3(C825T) and ITGA2(G873A)-SELE(Leu554Phe), we found that when the C-allele of APOB was present, the distribution of the C and T alleles of GNB3 differed between T2DM (n = 207 of which 15.9% were T) and non-T2DM (n = 222 of which 26.6% were T, P = 0.007). When the T-allele of APOB was present, the distribution of the GNB3 alleles did not differ between T2DM (n = 106; 35.4% T) and non-T2DM (n = 79; 47.2% T, n.s.). In other words, the allelic distribution of GNB3 between T2DM and non-T2DM is related to a specific genetic background for the APOB polymorphism, which implies gene-gene interaction. Likewise, we found that when the T-allele of SELE was present, the distribution of the A and G alleles of ITGA2 differed between T2DM (n = 12; 100% A) and non-T2DM (n = 8; 62.5% A; P = 0.02). When the C-allele of SELE was present, the distribution of the ITGA2 alleles did not differ between T2DM and non-T2DM. The gene-pairs that were significantly associated with IGM or T2DM subjects, APOB(Thr71Ile)-SCNN1A(Ala663Thr), F7(-323 10-bp Del/Ins)-APOC3(C1100T), APOA4(Gln360His)-APOC3(C-641A) and LPA(G121A)-CBS(Ile278/Ins), were analysed in a similar way. The distribution of the C and T alleles of APOC3 differed between disturbed glucose metabolism subjects (n = 71; 57.7% C) and NGT subjects (n = 16; 25.0% C, P = 0.018) when the insertion was present in F7, but not when the deletion was present. The distribution of the A and G alleles of LPA differed between subjects with a disturbed glucose metabolism (n = 68; 89.7% G) and NGT (n = 12; 33.3% G, P < 0.001) when the insertion was present in CBS, but not when it was absent.
The distribution of the C and A alleles of APOC3 differed between disturbed glucose metabolism (n = 422; 53.4% A) and NGT (n = 89; 36.0% A, P = 0.003) when the G allele of APOA4 was present, but not when the T allele was present. For the other gene-pairs we could not identify interaction using this approach.
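The comparisons of proportions reported above can be reproduced approximately with a 2 × 2 χ² test. The Python sketch below is an illustration, not the SPSS analysis used in the study. It assumes a Pearson χ² test without continuity correction, and the allele counts (33 of 207 and 59 of 222 T alleles of GNB3 on the APOB C-allele background) are reconstructed by rounding the reported percentages, so they are an assumption rather than data taken from the paper.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity correction).

    Table layout: row 1 = (a, b), row 2 = (c, d).
    Returns (chi2, p_value). All four margins must be non-zero.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper tail of the chi-square
    # distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Counts reconstructed from the reported percentages (assumption):
    # T2DM: 15.9% of 207 alleles are T; non-T2DM: 26.6% of 222 alleles are T.
    t2dm_t, t2dm_c = 33, 207 - 33
    non_t, non_c = 59, 222 - 59
    chi2, p = chi_square_2x2(t2dm_t, t2dm_c, non_t, non_c)
    print(f"chi2 = {chi2:.2f}, P = {p:.3f}")
```

Under these assumptions the test gives a P-value close to the reported P = 0.007. For small tables such as the SELE-ITGA2 comparison (n = 12 and n = 8), an exact test may be a more cautious choice than the χ² approximation.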

Discussion
The rationale behind the current study is that the effects of various risk genes on development of a complex disease will most likely not be independent of one another. Therefore, such genes should be studied both individually and as members of genetic networks. T2DM is a well-known example of a complex disease and, in our view, the increased cardiovascular risk in T2DM implies that common metabolic pathways may be involved in development of T2DM and cardiovascular disease. We explored this possible involvement (including the possibility that gene-gene interactions are relevant for the effects of risk polymorphisms) in normoglycaemic, "pre-diabetic" (IGM) and diabetic (T2DM) subjects. Of note, the prevalence of coronary heart disease (CHD) was not different between the three groups. The risk genes and gene-gene interactions that we report here for subjects with a disturbed glucose metabolism (i.e. who were IGM or who had T2DM) or T2DM only will therefore not result from imbalance in the distribution of coronary heart disease between the groups, but rather be related to their glucose tolerance status.
In phase 1 we performed single locus-based logistic regression analyses to identify main independent predictors for risk of a disturbed glucose metabolism or T2DM. This may be considered a more "traditional" approach. In phase 2 we used a novel approach to detect complex haplotype interactions (or allele coupling) among a set of polymorphisms to evaluate their effects on risk of glucose intolerance or T2DM. This resulted in identification of several networks of interacting genes. In phase 3 we assessed the applicability of data as obtained in phase 2 by constructing "disease predisposing allele combinations", thus showing that these add valuable information to that obtained in phase 1.
Twelve of the genes that we report here to be associated with susceptibility for a disturbed glucose metabolism and/or T2DM alone when FDR = 0.125 (LPA, CBS, APOC3, F7, APOB, SCNN1A, APOA4, SELE, ITGA2, GNB3, CETP, ADRB2) reside on chromosomal regions that have previously been implicated in T2DM or related traits, or have directly or indirectly been implicated in "diabetes associated" traits (for more detailed information see Tables 7 and 8). Our present findings may therefore be cautiously considered to be replications/confirmations of those previously published results. We propose that the fact that most of these genes/polymorphisms do not confer risk "on their own" but rather via interaction with other susceptibility genes may be one of the reasons for the apparent inconsistency in replication of the disease predisposing effects of several diabetes risk genes. In other words, the contribution of genetic polymorphisms to risk of human diseases with a complex genetic background such as T2DM may preferably be evaluated within their own genetic environment, i.e. while taking into account the genotype of other relevant polymorphisms carried by an individual. The approach described in phase 2 and 3 of this study provides a method for this. For 2 of the 5 gene-pairs that were significantly associated with T2DM, we could show that the distribution of polymorphism A between T2DM and non-T2DM was dependent on which allele of polymorphism B was present. This indicates significant gene-gene interaction. Also for 3 of the 4 gene-pairs that were significantly associated with a disturbed glucose metabolism we could show gene-gene interaction. For the other gene-pairs we could not identify gene-gene interaction using this approach. One reason for this could be that the method used in phase 2 is more sensitive to networks that interact at a more complex level. Some of the gene-gene interactions may be more subtle than just pair-wise interactions, e.g. in predisposition to T2DM, there are pair-wise interactions between APOB(Thr71Ile) and CETP(Ile405Val), but also between APOB(Thr71Ile) and GNB3(C825T), so CETP and GNB3 polymorphisms may influence each other via the APOB polymorphism or, alternatively, the effect of the APOB polymorphism may be influenced by variations in both CETP and GNB3.

Table fragment. Columns: Gene (chromosomal area); Gene or chromosomal area was implicated in; Variation; Polymorphism was implicated in.
LPA (6q27): the metabolic syndrome [15], which is characterised by insulin resistance. Variation: G121A.
CBS (21q22.3): hyperhomocysteinemia [16] which, in turn, is related to diabetic nephropathy [17].
Variation: -323 10bp Ins/Del.
Although the above results represent statistical associations, they may also hold true at the functional level. For instance, in our current analyses the gene-gene combinations that are involved in susceptibility for a disturbed glucose metabolism (Table 7) all include at least one gene involved in plasma lipid/lipoprotein metabolism. This suggests that disturbances in lipid metabolism may predispose to insulin resistance that is aggravated by the simultaneous presence of another polymorphism in a different gene. In the gene-gene combinations that predispose to T2DM (Table 8), this holds true for 3 out of the 5 combinations.
A comparison with previous data on type 1 diabetes (T1DM) that were obtained with a highly similar genotyping assay revealed that the gene-gene interaction network that was identified by Zhang et al. [8] as a potential genetic aetiology of T1DM is different from the network for T2DM we present here. However, these networks do share a common set of polymorphisms including: NPPA(T2238C), APOB(Thr71Ile), ADRB2(Gln27Glu), NOS3(A-922G), APOA4(Gln360His), APOC3(C-641A), SCNN1A(Ala663Thr), LIPC(C-480T), LDLR(NcoI+/-), ACE (Intron 16 Ins/Del), and CBS(Ile278/Ins). This suggests that genes that predispose to T1DM may, especially in combination with other gene polymorphisms, add to the risk of T2DM and vice versa. This may relate to genes affecting beta cell function (e.g. NOS, ADRB2) and to the effects of lipids on insulin secretion, the so-called lipotoxic effect on beta cells (e.g. apoB, apoA4, apoC3, LIPC, LDL-R). These relations require further investigation.
The main limitation of the current study is that, due to the small sample size, the epistatic interactions between SNPs must be quite strong to be detected. The small sample size is also why the main effects of individual SNPs selected from the logistic regression were only marginally significant. On the other hand, although we have performed a simulation study on the power of the haplotype entropy-based test, the power of a haplotype entropy-based SNP networking study and the appropriate sample size for such a study have not been addressed. Despite this limitation, we believe our study provides further evidence to support the hypothesis that the genetic basis of a disturbed glucose metabolism and T2DM is complex, with many genes acting in concert in the transfer of risk.

Conclusion
Our current approach to explore the presence of complex haplotype interactions (or allele coupling) among a set of polymorphisms has led to identification of several interacting networks of genes that may be relevant in predisposition to disturbed glucose metabolism or T2DM. Our current data thus imply that a specific set of risk polymorphisms, when simultaneously present, increases the risk of disease and is therefore relevant in the transfer of risk. Such interacting polymorphisms may represent (functional) polymorphisms on different, physically independent genes. Simultaneous analysis of multiple risk polymorphisms located on different genes may be an important step in the identification of gene-gene interaction, and identification of the metabolic routes that are influenced by the "disease-predisposing allele combinations" may prove to be instrumental in further identification of the processes that underlie development of T2DM.