We performed a two-stage DNA pooling-based GWA study for asthma and atopy. In the first stage, we evaluated a genome-wide panel of markers in a subset of subjects using a DNA pooling strategy; in the second stage, we evaluated the most promising markers at the individual level, both in the samples that made up the pools and in a second set of cases and controls. We identified SGK493 as a new potential gene associated with atopy. The MAP3K5, COL18A1 and COL29A1 genes were associated with atopy to a lesser extent.
The most significant results were observed for the SGK493 gene, on chromosome 2, with similar results for atopy and for atopic asthma, probably because the two phenotypes share predisposing factors. Significant polymorphisms were located in two different linkage disequilibrium blocks. The first block includes the putative promoter region of SGK493 (5' upstream), and the second block covers part of the gene, the 3' untranslated region (UTR) and the 3' downstream region (Figure 1). Haplotype-based analysis confirmed the results obtained in the single-marker analysis. We could not elucidate the functional role of these SNPs; fine mapping of the region in a better-powered sample would be required. We cannot exclude that the functional variant is tagged by these SNPs and located in a nearby gene. The SGK493 gene was identified during the creation of a catalogue of human protein kinases, but its particular function is unknown. SGK493 could be involved in pathological states, as protein kinases mediate most signal transduction in eukaryotic cells. We report that SGK493 is ubiquitously expressed in human tissues; however, higher expression in lung and uterus was previously identified in a microarray experiment covering 73 human tissues (http://biogps.gnf.org/). Recently, the knockout of Sgk493 (also known as Pkdcc) in mouse has been described, presenting extreme phenotypes that are not a priori related to atopy or asthma. The knockout mice showed abnormal respiration and died within a day, possibly due to cleft palate.
Another kinase, MAP3K5, was also related to atopy as well as atopic asthma in this study. MAP3K5 encodes a member of the mitogen-activated protein kinase family that regulates the activation of the transcription factor activator protein-1 (AP1) in leukotriene D4 (LTD4)-stimulated airway smooth muscle cells and in nitric oxide (NO)-stimulated bronchial epithelial cells [25, 26]. AP1 plays a role in the production of airway inflammation.
Polymorphisms in COL18A1 and COL29A1 were not consistently replicated. The COL18A1 polymorphism that was significantly associated with atopy in the pooling sample is located in the 3' coding region of the gene. COL18A1 encodes a protein expressed in epithelial and endothelial basement membranes that is involved in the regulation of angiogenesis and endothelial cell proliferation [27–29]. The COL29A1 polymorphism was previously found to be associated with atopic dermatitis. Finally, a region on chromosome 8, which contains a predicted gene (NT_007995.50), was also analyzed in detail. Nominal associations with atopy were observed in the pooling sample, but they were not replicated.
Signals described in previous GWA studies of atopy, asthma and related phenotypes were not detected in this study [3–5]. The lack of replication of GWA results can be caused by differences in the definition used to classify affected individuals, in the genetic coverage of the genome and in the p-value threshold. In particular, the region detected by Moffatt et al. on chromosome 17 has been associated with childhood-onset asthma, whereas in this study the asthmatic individuals were selected independently of their age at asthma onset. We did not detect the FCER1A region identified by Weidinger et al. for atopy, but a different genotyping platform was used.
Despite significant replication of the most promising SNPs found in the GWA in pooled DNA, we acknowledge some limitations, such as the lack of replication for some of the loci identified in the pooling-based analysis. Non-replication of initial findings is a common feature of GWA studies, mainly due to heterogeneity in the aetiology of the disease and to biases. In addition, the impossibility of detecting and adjusting for potential confounders in the analysis of DNA pools could have produced the inconsistencies observed.
Another main limitation is the sample size for pool construction and replication. Regarding DNA pooling, the method for measuring allele frequency differences corrects for the number of subjects included in each pool. After this correction we obtained some signals with very low p-values, which reinforces the strength of these results. Given the limited sample, our study was able to detect variants with larger effects in this population, and other variants with smaller effects probably went undetected. False positives are controlled in the replication phase by individual analysis. Other pooling strategies, such as the use of different sub-samples of pools, would allow more biological variation to be captured. We acknowledge that by using larger pools we may be losing positive signals, but we consider this a less crucial issue for this analysis. The power calculation shows that the replication sample was powered to detect the reported associations given the parameters observed in the individual-level analysis of the pooled samples (see Additional file 1: table S3). Accordingly, some of the results were replicated and the significance for a SNP in SGK493 reached the Bonferroni level. Although this type of correction for multiple testing is over-conservative and induces false negatives, it is a good indicator of the significance level. In addition to the significance level, some of the results were replicated in the additional sample, and replication is considered essential to establish the validity of associations.
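To illustrate how the number of subjects per pool and a Bonferroni threshold enter this kind of comparison, the following is a minimal sketch. It assumes a simple two-pool z-test in which the sampling variance of each estimated allele frequency is p(1 - p)/(2N) for a pool of N diploid subjects. This is an illustrative stand-in and not necessarily the method used in this study; the function names, the optional `pooling_var` term for measurement error and all numbers in the usage lines are hypothetical.

```python
import math


def pooled_allele_z(freq_cases, freq_controls, n_cases, n_controls,
                    pooling_var=0.0):
    """Z statistic for an allele-frequency difference between two DNA pools.

    Each pool of N diploid subjects carries 2N chromosomes, so the sampling
    variance of its allele frequency is p * (1 - p) / (2 * N).
    pooling_var is a hypothetical per-pool measurement-error variance.
    """
    var_cases = freq_cases * (1 - freq_cases) / (2 * n_cases)
    var_controls = freq_controls * (1 - freq_controls) / (2 * n_controls)
    se = math.sqrt(var_cases + var_controls + 2 * pooling_var)
    return (freq_cases - freq_controls) / se


def two_sided_p(z):
    """Two-sided p-value of a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / n_tests


# Hypothetical usage: the same frequency difference measured in small and
# in large pools, compared against a Bonferroni level for 20 tested SNPs.
z_small = pooled_allele_z(0.30, 0.20, n_cases=100, n_controls=100)
z_large = pooled_allele_z(0.30, 0.20, n_cases=400, n_controls=400)
threshold = bonferroni_threshold(0.05, 20)
passes_small = two_sided_p(z_small) < threshold
passes_large = two_sided_p(z_large) < threshold
```

Under these assumptions, the same observed difference in allele frequency yields a larger test statistic when more subjects contribute to each pool, and the Bonferroni level becomes stricter as the number of tested markers grows.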
Finally, another limitation is the small number of regions analyzed in detail in the individual analysis. We discarded complex regions with segmental duplications and putative insertions or deletions because their complexity could a priori reduce the power to detect associations, owing to non-classical inheritance patterns of the markers located in them. In contrast, disease definition is a strength of this study, since it is highly homogeneous among patients from the different ECRHS centres, and this is a keystone for valid inferences in association studies.