This study takes a novel approach to identifying previously undescribed molecular pathways that link DVT to the development of stroke. By removing all known overlaps between the curated gene sets involved in each of these pathophysiological processes, we ensured that the 24 DVT-specific genes had not already been described as stroke-related. We then used 11 stroke RNA expression array datasets acquired from GEO (Table 1) to test the association between each of these 24 genes and stroke.
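The filtering step described above amounts to a set difference between two curated gene lists. The following is a minimal sketch, assuming each list is available as a set of gene symbols. The function name and the `SHARED_GENE_*` entries are hypothetical placeholders, and the toy inputs are not the study's actual curated lists.

```python
def disease_specific_genes(genes_a, genes_b):
    """Return genes curated for disease A that have no known link to disease B."""
    return set(genes_a) - set(genes_b)


# Toy inputs for illustration only (assumption: not the real curated lists).
# SHARED_GENE_1 and SHARED_GENE_2 stand in for genes already linked to both diseases.
dvt_genes = {"SP1", "PF4", "CYP4V2", "SHARED_GENE_1", "SHARED_GENE_2"}
stroke_genes = {"IL10", "LEP", "SHARED_GENE_1", "SHARED_GENE_2"}

dvt_only = disease_specific_genes(dvt_genes, stroke_genes)
print(sorted(dvt_only))
```

Only the genes left after this subtraction would be carried forward to the expression analysis, so any association found for them cannot be a rediscovery of an already curated DVT-stroke link.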
The mega-analysis showed that the expression levels of eight DVT-alone genes were significantly changed in stroke cases compared with normal controls (p-value < 0.05; see DVT_Stroke→Mega_Analysis). However, only one of these eight genes, SP1, passed the pre-selected criteria for significant association (p < 0.005 and LFC > 1) (Table 2, Fig. 1). In particular, the LFC of SP1 in the mega-analysis was 1.34, indicating that SP1 expression increased by more than 150% (Table 2). This suggests that SP1 is a potential stroke biomarker and may be involved in the development of stroke. Although the sample size, sample age, and sample organism of the 11 datasets were included in the mega-analysis, population region (country) was the only factor that significantly influenced SP1 expression in stroke (p = 0.037, Table 2).
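The link between an LFC of 1.34 and an increase of more than 150% can be checked with a short calculation. The sketch below assumes that LFC denotes a base-2 log fold change, which is the usual convention for expression arrays but is not stated explicitly here. The function names are illustrative, and the PF4 p-value is entered as its reported upper bound (p < 10⁻³) rather than an exact value.

```python
def lfc_to_percent_change(lfc, base=2.0):
    """Convert a log fold change to a percent change relative to controls.

    Assumes the fold change was logged with the given base (base 2 by default).
    """
    return (base ** lfc - 1.0) * 100.0


def passes_criteria(p_value, lfc, p_cutoff=0.005, lfc_cutoff=1.0):
    """Apply the pre-selected association criteria: p < 0.005 and LFC > 1."""
    return p_value < p_cutoff and lfc > lfc_cutoff


sp1_percent = lfc_to_percent_change(1.34)
print(f"SP1 percent increase: {sp1_percent:.1f}%")

# Values reported elsewhere in this section; 1e-3 is an upper bound for PF4.
print("PF4 passes:", passes_criteria(p_value=1e-3, lfc=0.79))
print("CYP4V2 passes:", passes_criteria(p_value=0.046, lfc=0.72))
```

Under the base-2 assumption, an LFC of 1.34 corresponds to a fold change of about 2.53, that is, an increase of roughly 153%. PF4 meets the p-value cutoff but not the LFC cutoff, and CYP4V2 meets neither.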
As shown in Fig. 2 (a), SP1 acts as a hub gene that builds multiple potential pathways that could contribute to stroke at the “promoter binding” and “expression” levels. To confirm the pathways (edges) presented in Fig. 2, a further mega-analysis was performed on the pathway genes, using the 11 datasets employed in this study (Table 1). As expected, not all of the pathways in Fig. 2 were supported by the 11 datasets, because the pathways were literature-based and integrated information from different data modalities and platforms. However, the activity of the majority of the genes was confirmed by the expression datasets, which strengthens the validity of the identified pathways. As shown in Fig. 2, SP1 promotes seven inhibitors of stroke: PTGS2, IL10, IGF1, LEP, ENTPD1, HSPA1A, and HIF1A. The literature-based pathways also suggested that SP1 could inhibit three stroke promoters (IL1B, TGFB1, and ACE), but this was not supported by the 11 datasets employed in this study, in which these genes showed increased expression when SP1 expression was increased. These results suggest that SP1 is more likely to play a therapeutic role than a preventive role in the pathological development of stroke.
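The edge-level check described above can be expressed as a comparison of signs: a literature edge is supported when the observed direction of change in the target gene matches what the edge predicts for an up-regulated SP1. The sketch below is illustrative only. The function name and data layout are hypothetical, the directions for IL1B, TGFB1, and ACE follow the text, and the directions for LEP and IL10 are assumed to be upward for the purpose of the example.

```python
def edge_supported(regulator_direction, target_direction, effect):
    """Check whether an observed expression change agrees with a literature edge.

    regulator_direction, target_direction: +1 (up in stroke) or -1 (down in stroke).
    effect: +1 if the literature reports that the regulator promotes the target,
            -1 if it reports that the regulator inhibits the target.
    """
    return regulator_direction * effect == target_direction


SP1_DIRECTION = +1  # SP1 was up-regulated in stroke (LFC = 1.34)

# "observed" holds directions only, not magnitudes.
edges = {
    "IL1B": {"effect": -1, "observed": +1},   # literature: inhibited; data: increased
    "TGFB1": {"effect": -1, "observed": +1},  # literature: inhibited; data: increased
    "ACE": {"effect": -1, "observed": +1},    # literature: inhibited; data: increased
    "LEP": {"effect": +1, "observed": +1},    # assumed direction for illustration
    "IL10": {"effect": +1, "observed": +1},   # assumed direction for illustration
}

support = {
    gene: edge_supported(SP1_DIRECTION, edge["observed"], edge["effect"])
    for gene, edge in edges.items()
}
for gene, ok in support.items():
    print(gene, "supported" if ok else "not supported")
```

A sign-only comparison ignores effect size and statistical significance, so it serves as a first-pass screen rather than a replacement for the mega-analysis itself.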
The protein encoded by SP1 is a zinc finger transcription factor that binds to GC-rich motifs in many promoters. It is involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. In principle, variations in promoter sequences can alter gene expression directly by altering a transcription factor binding site, and promoter variants that affect the transcriptional activity of certain human genes have been identified as disease risk factors [10]. For instance, in the SP1 → LEP → stroke pathway, SP1 binds to the promoter region of the LEP gene. As a result, altered SP1 transcriptional activity leads to increased production of leptin [11], which ameliorates neurological deficits and reduces infarct volumes after stroke [12]. In another pathway, SP1 → IL10 → stroke, SP1 positively regulates the transcription of IL10 [13], which has been used successfully as a therapeutic mediator to reduce post-stroke secondary neuroinflammation [14]. Further pathways are shown in Fig. 2, with details presented in DVT_Stroke→ShortestPath. These pathways were supported by the 11 datasets employed in this study. Together, these results suggest possible mechanisms through which SP1 plays a post-stroke therapeutic role.
Although the discussion has focused mainly on SP1, which showed a significant expression change in stroke, other genes with smaller expression changes may also be worth a closer look, including PF4 (LFC: 0.79; p-value < 10⁻³) and CYP4V2 (LFC: 0.72; p-value = 0.046). Literature-based pathway analysis suggested that these two genes may be related to stroke through multiple pathways (see DVT_Stroke: PF4_CYP4V2). However, further studies using experimental data are needed to validate these pathways. In addition, 8 of the 24 DVT-specific genes were not included in the 11 stroke expression datasets collected in this study and were therefore not reported in the mega-analysis (see DVT_Stroke: Mega_Analysis). Analyses of datasets that include these genes are needed to explore their potential role in stroke.