E- ISSN: 2320 - 3528
P- ISSN: 2347 - 2286
Machine Learning Approach for Screening Key Genes of Atherosclerosis and Performing Pan-Cancer Analysis
Objective: Atherosclerosis (AS), a lipid-driven immunoinflammatory disease, is a significant factor in coronary heart disease and stroke. Researchers worldwide are working to develop more effective ways of diagnosing and treating it. This article presents a machine-learning approach that screens biomarkers of AS and performs a pan-cancer analysis as a reference.
Patients and methods: We first downloaded a Gene Expression Omnibus (GEO) dataset containing data from patients with atherosclerosis, screened it for differentially expressed genes, and analyzed the differential mRNAs using Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) enrichment analysis. Then, we combined Weighted Gene Co-expression Network Analysis (WGCNA), protein interaction network analysis, and machine learning algorithms to screen for core genes, followed by basic experimental validation and immune infiltration analysis. Finally, we used The Cancer Genome Atlas Program (TCGA) and Genotype-Tissue Expression (GTEx) databases for pan-cancer analysis of the differentially expressed genes.
Results: After screening differentially expressed genes from the GSE28829 dataset, we performed WGCNA and obtained 122 key genes by combining module genes and differentially expressed genes. We used four machine learning algorithms (XGBoost, Random Forest, Support Vector Machine with Recursive Feature Elimination (SVM-RFE), and Generalized Linear Model (GLM)) to rank the critical genes and took the intersection of their results, which identified four hub genes: Signaling Lymphocytic Activation Molecule Family Member 8 (SLAMF8), Toll-Like Receptor 2 (TLR2), Vesicle-Associated Membrane Protein 8 (VAMP8), and V-set and Immunoglobulin Domain Containing 4 (VSIG4). Next, we built a diagnostic model and assessed its performance. The Receiver Operating Characteristic (ROC) curves of the four genes suggested that they play a critical role in the development of atherosclerosis. We then performed gene correlation analysis, immune infiltration analysis, and Reverse Transcription Polymerase Chain Reaction (RT-PCR) verification of the four core genes, and finally selected TLR2 for pan-cancer analysis. Our analysis found that TLR2 expression in patients with various tumors differed from that in healthy individuals and was strongly associated with prognosis in patients with diverse cancers.
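The intersection step described above can be illustrated with a minimal Python sketch. It assumes each algorithm has already produced a list of selected genes. The function name `consensus_genes`, the per-algorithm lists, and the `GENE_*` entries are hypothetical placeholders for illustration, not data or code from the study; only the four hub gene names come from the text.

```python
def consensus_genes(selections):
    """Return the genes selected by every algorithm, sorted alphabetically.

    `selections` maps an algorithm name to the list of genes it selected.
    """
    gene_sets = [set(genes) for genes in selections.values()]
    if not gene_sets:
        return []
    # Keep only genes that appear in all per-algorithm selections.
    return sorted(set.intersection(*gene_sets))


# Toy input: the GENE_* names are made-up placeholders, not study results.
selections = {
    "XGBoost": ["SLAMF8", "TLR2", "VAMP8", "VSIG4", "GENE_A"],
    "RandomForest": ["TLR2", "VSIG4", "SLAMF8", "VAMP8", "GENE_B"],
    "SVM-RFE": ["VAMP8", "SLAMF8", "GENE_A", "TLR2", "VSIG4"],
    "GLM": ["VSIG4", "VAMP8", "TLR2", "SLAMF8", "GENE_C"],
}

hub_genes = consensus_genes(selections)
print(hub_genes)
```

Requiring agreement across all four selectors is a conservative choice: a gene favored by only one model, such as the placeholder `GENE_B`, is dropped from the consensus set.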
Conclusion: TLR2 may be a target for intervention in diseases such as atherosclerosis and tumors.
Tingting Zhao1,3*, Zhenrun Zhan2, Xiaoyuan He3, Xu Tang1,3