- Research article
- Open Access
- Open Peer Review
Weighted correlation network and differential expression analyses identify candidate genes associated with BRAF gene in melanoma
BMC Medical Genetics volume 20, Article number: 54 (2019)
Primary cutaneous malignant melanoma is a cancer of the pigment cells of the skin, some of which are accompanied by BRAF mutation. Melanoma incidence and mortality rates have been rising around the world. As the current knowledge about pathogenesis, clinical and genetic features of cutaneous melanoma is not very clear, we aim to use bioinformatics to identify the potential key genes involved in the expression and mutation status of BRAF.
Firstly, we used UCSC public hub datasets of melanoma (Lin et al., Cancer Res 68(3):664, 2008) to perform weighted genes co-expression network analysis (WGCNA) and differentially expressed genes analysis (DEGs), respectively. Secondly, overlapping genes between significant gene modules and DEGs were screened and validated at transcriptional levels and overall survival in TCGA and GTEx datasets. Lastly, the functional enrichment analysis was accomplished to find biological functions on the web-server database.
We performed weighted correlation network and differential expression analyses, using gene expression data in melanoma samples. We identified 20 genes whose expression was correlated with the mutation status of BRAF. For further validation, three of these genes (CYR61, DUSP1, and RNASE4) were found to have similar expression patterns in skin tumors from TCGA compared with normal skin samples from GTEx. We also found that weak expression of these three genes was associated with worse overall survival in the TCGA data. These three genes were involved in the nucleic acid metabolic process.
In this study, CYR61, DUSP1, and RNASE4 were identified as potential genes of interest for future molecular studies in melanoma, which would improve our understanding of its causes and underlying molecular events. These candidate genes may provide a promising avenue of future research for therapeutic targets in melanoma.
Skin cutaneous melanoma (SKCM) is a malignant cancer that originates from melanocytes and exists in different forms. The main types are basal cell cancer (BCC), squamous cell cancer (SCC) and melanoma [1, 2]. Melanoma is the most dangerous type of skin cancer. The primary cause of melanoma is ultraviolet light (UV) exposure in those with low levels of skin pigment [1, 2]. The UV light may come from the sun or other sources, such as artificial light devices. Besides, about 25 % of melanoma derives from moles. Those with many moles, a history of affected family members, and who have poor immune function were at greater risk . A number of rare genetic defects such as xeroderma pigmentosum also increase risk . Diagnosis can be finished by biopsy of any concerning skin lesion .
At least 50 % of melanomas harbor a V600E mutation in the BRAF gene. Tumors with BRAF mutations could respond to BRAF kinase inhibitor vemurafenib that was approved by the FDA in 2011 for therapy of patients with advanced melanoma and late-stage (metastatic) melanoma [4, 5]. Recently, the FDA approved the other two drugs named dabrafenib and ipilimumab as therapy for patients with BRAF V600E mutation-positive in melanoma .
Existing research has revealed that cancer cannot be caused by only one gene or factor. It must be a network of different genes and pathways working together. Weighted gene co-expression network analysis (WGCNA)  is a methodology used to analyze novel gene modules co-expressing in gene expression data. Many studies have shown that WGCNA can be used to explore genes, a network of genes and correlation of genes in different cancers [8, 9]. Moreover, differentially expressed genes (DEGs) analysis method has been applied in gene expression data .
In this paper, the study was designed to find potential genes and correlated pathways associated with the expression level and mutation status of BRAF in melanoma samples. By analyzing gene expression data  from UCSC public hub with the WGCNA algorithm and DEGs analysis, significant gene modules associated with the expression level of BRAF were identified and differentially expressed genes associated with the mutation status of BRAF were screened, then overlapping genes were validated in TCGA and GTEx database.
Materials and methods
A dataset containing the gene expression and basic phenotypes information of 95 melanoma samples was downloaded from the Cancer Browser website (https:// xenabrowser.net/datapages/?cohort=Melanoma%20 (Lin%202,008)). The gene expression information was experimentally collected through GeneChip Fluidics Station (Affymetrix), and the matrix values were log2 ratio transformed. Genes were mapped onto Affymetrix HT-HGU133A probeMAP.
Melanoma samples that had both expression data and BRAF mutation status were included for further analysis. According to this criterion, there were 67 melanoma samples (30 BRAF wild-type and 37 BRAF mutation) corresponding to our analysis requirement.
After the dataset was downloaded, probe identification numbers (IDs) were transformed into gene symbols. For multiple probes corresponding to one gene, the probe with the most significant p-value from the downstream differential analysis was retained as the gene expression value. As for DEGs analysis, we divided 67 samples into two groups (BRAF wild-type and BRAF mutation group) for screening differentially expressed genes. As for WGCNA analysis, we used BRAF gene expression values as clinical trait data. Figure 1 shows the paths of the data analysis.
Weighted gene co-expression network construction
The full set of genes with available expression data (10,994 genes) was applied to find the scale-free gene modules of co-expression and highly correlated genes constructed by WGCNA . To construct a weighted gene network, the soft threshold power β was set to 3, which was the lowest power based on scale-free topology . We set the parameter maxBlockSize = 11,000, and TOMType = “unsigned”. Topological overlap matrix (TOM) was calculated by adjacency transformation, and the value (1-TOM) was designated to the distance for identification of hierarchical clustering genes and modules. The minimum module size was set to 30.
Module clinical feature associations
In order to identify modules that were significantly associated with the designated clinical trait (the expression level of BRAF), we plotted the heat map of modules-trait relationship according to the tutorial of the WGCNA package for R.
Identification of DEGs
Linear models for microarray data (limma package) is a library used for analyzing gene expression microarray data , especially for the assessment of differential expression and the analysis of designed experiments [14, 15]. limma package in R has been applied to identify the DEGs between BRAF mutation and wild-type (marked as control group) samples. Genes with |log2 fold change (FC)| ≥ 1 and adjusted p-value < 0.05 as the cut-off criterion were selected for subsequent analysis.
Validation of candidate genes
The overlapping genes between significant modules and DEGs were chosen as the potential genes for deep analysis and validation. GEPIA  (website: http://gepia.cancer-pku.cn/) is a web server for analyzing the RNA sequencing expression data of 9736 tumors and 8587 normal samples from the TCGA and the GTEx projects, using a standard processing pipeline. Survival analysis and expression consistency evaluation of potential genes were carried out in GEPIA built-in SKCM and GTEx datasets, which contain 461TCGA-SKCM tumor patients, 1TCGA-SKCM normal control, and 557 GTEx normal skin samples. For the transcriptional level validation, the criteria of significant results was set to |log2 fold change| ≥ 1 and p-value < 0.01. For the overall survival analysis in TCGA datasets, the 458 samples with available overall survival data were divided into high and low expression groups using the median TPM as a breakpoint, and significance was determined using a logrank test with p < 0.05.
Functional enrichment analysis
GenCLiP 2.0  is a web-based text-mining server, which can analyze human genes associated with biological functions and molecular networks. We uploaded filtered genes to online analysis tool GenCLiP 2.0 (http://ci.smu.edu.cn/GenCLiP2/ analysis.php) to find correlated significant pathways.
Expression value analysis of microarray data
We chose 10,994 genes and 67 samples to construct the gene co-expression network by WGCNA. Figure 2a showed the relationship between the expression level of BRAF and melanoma samples.
Weighted gene co-expression network construction
Choosing a proper soft-thresholding power is a critical step when constructing a WGCNA network. As shown in Fig. 2b, power value 3(β = 3) was selected to produce a hierarchical clustering tree (Fig. 3) with different colors representing different modules.
Module clinical feature associations
Since we had a summary profile (eigengene) for each module, we simply correlated eigengenes with external traits (marked BRAF expression) and looked for the most significant associations. It was clear that the MEbrown (1021 genes) was most positive associated with the expression of BRAF (Fig. 4a). The results also demonstrated that the MEturquoise (1858 genes) was most negative associated with the expression of BRAF (Fig. 4a).
As shown in Fig. 4b, there were 27 eigengenes. The upper panels presented hierarchical clustering dendrograms of the eigengenes, in which the dissimilarity of eigengenes had been visualized. The bottle heatmaps presented the eigengene adjacencies for the expression of BRAF. The dendrogram indicated that brown and black modules were highly related and their correlations were stronger than their individual correlations with the expression BRAF (Fig. 4b).
Identification of DEGs
Compared with BRAF wild type group, a total of 36 genes were identified in BRAF mutation group by the threshold of |log2 fold change (FC)| ≥ 1 and adjusted p-value < 0.05, of which 9 were up-regulated genes and 27 were down-regulated genes (Table 1).
Validation of candidate genes
There were 1021 genes in the brown module, 1858 genes in the turquoise module and 36 genes in the DEGs (Fig. 5). As shown in Venn diagram, it had 5 genes (ANG, RNASE4, FOS, WSB1, ZSCAN18) between MEbrown and DEGs, and 15 genes (FHOD3, FERMT2, TNFAIP3, ANGPTL4, NCRNA00312, MYL9, ID3, CYR61, TXNIP, MFAP2, DACT1, DUSP1, COX7A1, FXYD3, NID2) between MEturquoise and DEGs (Fig. 5).
In order to verify these 20 overlapping candidate genes, we validated on online web server GEPIA, which contained the TCGA and GTEx melanoma samples. Figure 6a-b demonstrated the expression level of 3 genes in BRAF wild-type and BRAF mutation samples of melanoma, which was in accordance with its expression level in normal and tumor patients of SKCM. It was also revealed that low expression of these three genes has a worse overall survival in SKCM patients (Fig. 6c). Besides, we had discarded the other 17 genes that did not exhibit significant differential expression in the TCGA/GTEx data concordant with that observed in the Lin et al. data, and were not associated with significantly worse overall survival compared high expression group with low expression group in the TCGA/GTEx data (Additional file 1: Figure S1, S2 and S3).
Functional enrichment analysis
We used online website GenCLiP 2.0 tools to perform the functional and signaling pathway enrichment analysis of the above three genes (CYR61, DUSP1, and RNASE4). As shown in Table 2, the potential candidate genes (CYR61, DUSP1, and RNASE4) were involved in the nucleic acid metabolic process, while CYR61 and DUSP1 were most significantly enriched in the growth factor binding, ERK1 and ERK2 cascade, and regulation of ERK1 and ERK2 cascade.
Melanoma is the most fatal form of skin cancer and strikes tens of thousands of people worldwide each year. The amount of cases is increasing faster than any other type of malignant cancer .
Many patients with BRAF mutation have received target treatments and therapies which activate their body’s own immune system. There is BRAF mutation in melanoma. Besides, mutation also exists in NRAS gene and PTEN gene. Some scientists have struggled to find drugs targeting the mutated NRAS protein or NRAS protein , while others have uncovered a mechanism of resistance of targeted therapies for melanoma and identified compounds that inhibit eIF4F and enhance the effectiveness of vemurafenib in mice with melanomas .
In this study, firstly we applied WGCNA to identify the two key modules in melanoma that were associated with the expression of BRAF gene (the brown module was positive, and the turquoise was negative). At the same time, we identified the DEGs in the BRAF mutation group compared with BRAF wild-type group. Then, we chose the overlapping genes between modules and DEGs. Finally, as to the gene expression level and overall survival validation, we expand the scope of comparison range to the tumor group versus the normal group in TCGA/GTEx datasets.
We found that CYR61, DUSP1, and RNASE4 were significantly related to gene expression level and survival analysis results. CYR61 (Cysteine-rich angiogenic inducer 61) is a secreted, matricellular protein , which is associated with a range of cellular activities, such as cell adhesion, migration, differentiation, proliferation, apoptosis [21, 22]. Beak et al. suggested that CYR61 was highly expressed in colorectal carcinomas (CRC) and CYR61 might play a role as meaningful targets for therapeutic intervention of patients with CRC . D’ Antonio et al. also found that decreased expression level of CRY61 was associated with prostate cancer recurrence after surgical treatment . DUSP1 (Dual specificity protein phosphatase 1) is an oncogene that is associated with cancer progression in gastric cancer as well as a negative regulator of the mitogen-activated protein kinase (MAPK) signaling pathway, has anti-inflammatory properties [25,26,27]. Xiaoyi et al. also found that DUSP1 phosphatase regulated the pro-inflammatory milieu in head and neck squamous cell carcinoma , in addition to promoting angiogenesis, invasion, and metastasis in non-small-cell lung cancer (NSCLC) . RNASE4 (Ribonuclease 4) is an RNase that belongs to the pancreatic ribonuclease family and has marked specificity towards the 3′ side of uridine nucleotides . Unfortunately, to date there has been no research focused on the relationship between these several genes with melanoma.
The primary purpose of the study focuses on the prediction of key potential genes in cancers via data mining and data analysis. Though we have validated results in the TCGA and GTEx datasets, results need to be confirmed through molecular and cellular experiments.
Firstly, we have identified overlapping genes associated with the expression and the mutation status of BRAF in melanoma through WGCNA and DEGs analysis, respectively. Then, validation was applied to these overlapping genes, and three genes (CYR61, DUSP1, and RNASE4) were screened. However, more direct evidence is needed to confirm their association with melanoma. The study may be helpful for future studies concerning melanoma with the aim of finding potential key molecule targets of melanoma.
B-Raf proto-oncogene, serine/threonine kinase
Differentially Expressed Genes
University of California Santa Cruz
Weighted Gene Co-expression Network Analysis
Stewart B, Wild C. World cancer report 2014: International Agency for Research on Cancer; 2014.
Version LCPP, Prevention LC. Skin Cancer Prevention (PDQ®) - National Library of Medicine - PubMed Health. National Cancer Institute 2013. https://www.cancer.gov/types/skin/hp/skin-prevention-pdq.
Azoury SC, Lange JR. Epidemiology, risk factors, Prevention, and early detection of melanoma. Surg Clin North Am. 2014;94(5):945–62.
Reports FS. FDA approves vemurafenib and companion diagnostic test for late-stage skin cancer; 2011.
Sponghini AP, Rondonotti D, Giavarra M, Giorgione R, Platini F. Safety and efficacy of vemurafenib in BRAF V600E mutation-positive metastatic melanomas. J Transl Med. 2015;13(1):2075.
Shahabi V, Hamid O, Schmidt H, Chasalow SD, Alaparthy S, Jackson JR. Assessment of association between BRAF-V600E mutation status in melanomas and clinical response to ipilimumab. Cancer Immunol Immunother. 2012;61(5):733–7.
Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008;9(1):559.
Yang Q, Wang R, Wei B, Peng C, Wang L, Hu G, Kong D, Du C. Candidate biomarkers and molecular mechanism investigation for glioblastoma Multiforme utilizing WGCNA. Biomed Res Int. 2018;2018.
Zhai X, Xue Q, Liu Q, Guo Y, Chen Z. Colon cancer recurrence-associated genes revealed by WGCNA co-expression network analysis. Mol Med Rep. 2017;16(5):6499–505.
Rau A, Flister MJ, Rui H, Livermore Auer P. Exploring Drivers of Gene Expression in The Cancer Genome Atlas. 2018:227926.
Lin WM, Baker AC, Beroukhim R, Winckler W, Feng W, Marmion JM, Laine E, Greulich H, Tseng H, Gates C. Modeling genomic diversity and tumor dependency in malignant melanoma. Cancer Res. 2008;68(3):664.
Zhang B, Horvath S. A General Framework For Weighted Gene Co-Expression Network Analysis. Stat Appl Genet Mol Biol. 2005;4(1):Article17.
Chen-An T, Yi-Ju C, Chen JJ. Testing for differentially expressed genes with microarray data. Nucleic Acids Res. 2003;31(9):e52.
Phipson B, Lee S, Majewski IJ, Alexander WS, Smyth GK. Robust hyperparameter estimation protects against hypervariable genes and improves power to detect differential expression. Ann Appl Stat. 2016;10(2):946.
Smyth GK. Limma: linear models for microarray data[M]//Bioinformatics and computational biology solutions using R and Bioconductor. New York: Springer; 2005. p. 397-420.
Tang Z, Li C, Kang B, Gao G, Li C, Zhang Z. GEPIA: a web server for cancer and normal gene expression profiling and interactive analyses. Nucleic Acids Res. 2017;45 Web Server issue.
Wang JH, Zhao LF, Lin P, Su XR, Chen SJ, Huang LQ, Wang HF, Zhang H, Hu ZF, Yao KT. GenCLiP 2.0: a web server for functional clustering of genes and construction of molecular networks based on free terms. Bioinformatics. 2014;30(17):2534–6.
Owens B. Melanoma. Nature. 2014;515:S109.
Posch C, Moslehi H, Feeney L, Green GA, Ebaee A, Feichtenschlager V, Chong K, Peng L, Dimon MT, Phillips T. Combined targeting of MEK and PI3K/mTOR effector pathways is necessary to effectively inhibit NRAS mutant melanoma in vitro and in vivo. Proc Natl Acad Sci U S A. 2013;110(10):4015–20.
Boussemart L, Malka-Mahieu H, Girault I, Allard D, Hemmingsson O, Tomasic G, Thomas M, Basmadjian C, Ribeiro N, Thuaud F, et al. eIF4F is a nexus of resistance to anti-BRAF and anti-MEK cancer therapies. Nature. 2014;513(7516):105–9.
Lau LF. CCN1/CYR61: the very model of a modern matricellular protein. Cell Mol Life Sci. 2011;68(19):3149–63.
Jun JI, Lau LF. Taking aim at the extracellular matrix: CCN proteins as emerging therapeutic targets. Nat Rev Drug Discov. 2011;10(12):945–63.
Baek M, Bae S, Jeong D. Relationship of pro-angiogenic factor Cyr61to colorectal cancer development and prognosis, vol. 29; 2011.
D'Antonio KB, Lucianna S, Roula A, Mondul AM, Platz EA, Netto GJ, Getzenberg RH. Decreased expression of Cyr61 is associated with prostate cancer recurrence after surgical treatment. Clin Cancer Res. 2010;16(23):5908.
Peng HZ, Yun Z, Wang W, Ma BA. Dual specificity phosphatase 1 has a protective role in osteoarthritis fibroblastlike synoviocytes via inhibition of the MAPK signaling pathway. Mol Med Rep. 2017;16(6):8441–7.
Teng F, Xu Z, Chen J, Zheng G, Zheng G, Lv H, Wang Y, Wang L, Cheng X. DUSP1 induces apatinib resistance by activating the MAPK pathway in gastric cancer. Oncol Rep. 2018;40(3):1203–22.
Keyse SM, Emslie EA. Oxidative stress and heat shock induce a human gene encoding a protein-tyrosine phosphatase. Nature. 1992;359(6396):644–7.
Xiaoyi Z, J Madison H, Hong Y, D'Silva NJ, Kirkwood KL: DUSP1 phosphatase regulates the proinflammatory milieu in head and neck squamous cell carcinoma. Cancer Res 2014, 74(24):7191–7197.
Moncho-Amor V, Cáceres IIED, Bandres E, Martínez-Poveda B, Orgaz JL, Sánchez-Pérez I, Zazo S, Rovira A, Albanell J, Jiménez B. DUSP1/MKP1 promotes angiogenesis, invasion and metastasis in non-small-cell lung cancer. Oncogene. 2011;30(6):668–78.
Rosenberg HF, Dyer KD. Human ribonuclease 4 (RNase 4): coding sequence, chromosomal localization and identification of two distinct transcripts in human somatic tissues. Nucleic Acids Res. 1995;23(21):4290–5.
This study was supported by the National Natural Science Foundation of China (No. 81770294). The funder had no role in study design, data collection and analysis, except for bioinformatics training, writing the manuscript, and decision to publish.
Availability of data and materials
The expression data associated with this article has been deposited in the UCSC public hub under the https://xenabrowser.net/datapages/?dataset=lin2008_public%2Flin2008Exp_genomicMatrix&host=https%3A%2F%2Fucscpublic.xenahubs.net.
Phenotypic data is available on UCSC public hub (https://xenabrowser.net/datapages/?dataset=lin2008_public%2Flin2008_public_clinicalMatrix&host=https%3A%2F%2Fucscpublic.xenahubs.net).
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Figures S1-S3. Rows represent expression of 17 genes in melanoma samples (first), TCGA/GTEx (second), and TCGA (third), where genes were aligned by column. As the NCRNA00312 gene could not be retrieved, expression and survival results could not be obtained in GEPIA. Significance was determined as described in the caption of Fig. 6. (ZIP 3130 kb)
About this article
Cite this article
Zhao, B., You, Y., Wan, Z. et al. Weighted correlation network and differential expression analyses identify candidate genes associated with BRAF gene in melanoma. BMC Med Genet 20, 54 (2019) doi:10.1186/s12881-019-0791-1
- Weighted gene co-expression network analysis
- Differentially expressed genes
- Overall survival
- BRAF gene