E- ISSN: 2320 - 3528
P- ISSN: 2347 - 2286
Tingting Zhao1,3*, Zhenrun Zhan2, Xiaoyuan He3, Xu Tang1,3
1 Heping Hospital Affiliated to Changzhi Medical College, Changzhi, Shanxi 046000, China
2 Department of Endocrinology, the First Affiliated Hospital, Fujian Medical University, Fuzhou, 350005, China
3 Changzhi Medical College, Changzhi, Shanxi 046000, China
Received: 08-Aug-2024, Manuscript No. JMB-24-144862; Editor assigned: 12-Aug-2024, PreQC No. JMB-24-144862 (PQ); Reviewed: 26-Aug-2024, QC No. JMB-24-144862; Revised: 02-Sep-2024, Manuscript No. JMB-24-144862 (R); Published: 09-Sep-2024, DOI: 10.4172/2320-3528.13.3.004
Citation: Zhao T, et al. Machine Learning Approach for Screening Key Genes of Atherosclerosis and Performing Pan-Cancer Analysis. J Microbiol Biotechnol. 2024;13:004
Copyright: © 2024 Zhao T, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Objective: Atherosclerosis (AS), an immunoinflammatory disease caused by lipids, is a significant factor in coronary heart disease and stroke. Researchers worldwide are working to develop more effective ways of diagnosing and treating it. This article introduces a machine-learning algorithm that screens biomarkers and performs pan-cancer analysis as a reference.
Patients and methods: We first downloaded a Gene Expression Omnibus (GEO) dataset of atherosclerotic patients, screened it for differentially expressed genes, and analyzed the differential mRNAs using Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) enrichment analysis. We then combined Weighted Gene Co-expression Network Analysis (WGCNA), protein interaction network analysis, and machine learning algorithms to screen for core genes, followed by basic experimental validation and immune infiltration analysis. Finally, we used The Cancer Genome Atlas Program (TCGA) and Genotype-Tissue Expression (GTEx) databases for pan-cancer analysis of the differentially expressed genes.
Results: After screening differentially expressed genes from the GSE28829 dataset, we performed WGCNA and obtained 122 key genes by intersecting the module genes with the differentially expressed genes. We used four machine learning algorithms (XGBoost, RandomForest, Support Vector Machine with Recursive Feature Elimination (SVM-RFE), and Generalized Linear Model (GLM)) to rank the critical genes and took the intersection of the results, which identified four hub genes (Signaling Lymphocytic Activation Molecule Family Member 8 (SLAMF8), Toll-Like Receptor 2 (TLR2), Vesicle-Associated Membrane Protein 8 (VAMP8), and V-set and Immunoglobulin Domain Containing 4 (VSIG4)). Next, we built a diagnostic model and assessed its performance. The Receiver Operating Characteristic (ROC) curves of the four genes suggested their critical role in the development of atherosclerosis. We then performed gene correlation analysis, immune infiltration analysis, and Reverse Transcription Polymerase Chain Reaction (RT-PCR) verification of the four core genes and finally selected TLR2 for pan-cancer analysis. TLR2 expression in patients with various tumors differed from that in healthy individuals and was strongly associated with prognosis in several cancers.
Conclusion: TLR2 may be a target for intervention in developing diseases such as atherosclerosis and tumors.
WGCNA; Atherosclerosis; Pan-cancer; Machine learning; Pancarcinoma; Database
Cardiovascular Disease (CVD) caused by atherosclerosis is the leading cause of death worldwide. The World Health Organization predicts that by 2030, 23.6 million people will die from CVD [1]. Atherosclerosis most commonly occurs in the coronary and carotid arteries. Coronary atherosclerosis often leads to diseases of the cardiovascular system, while carotid atherosclerosis often causes diseases of the cerebrovascular system. Carotid atherosclerotic stenosis is caused by the chronic accumulation of subintimal atherosclerotic plaques in the carotid artery, and 20%-30% of ischemic strokes or transient ischemic attacks are related to carotid atherosclerotic stenosis [2]. The mechanism of atherosclerosis formation involves vascular endothelial injury, monocyte-macrophage adhesion, lipid deposition, and other factors [3]. Although many factors are known to affect carotid atherosclerosis, the mechanism that drives its progression is not fully understood. It is therefore essential to identify the key regulatory genes and potential signaling pathways, particularly with a view to preventing atherosclerosis and delaying its progression.
Using gene chip technology to analyze clinical patient specimens, screen for informative genes, and then study the pathogenesis of atherosclerosis in depth is of great clinical significance. Bioinformatics analysis can screen for disease-related differential genes at the genomic level. However, analysis of a single independent microarray often yields a high false-positive rate. In this study, we downloaded the GSE28829 dataset, which covers early and advanced carotid atherosclerosis, from the GEO database. We performed a series of preprocessing steps on the original data to screen out the differentially expressed genes associated with carotid atherosclerosis progression. We then conducted functional annotation and signaling pathway enrichment analysis of the differentially expressed genes to identify potential signaling pathways. Finally, we used WGCNA combined with machine learning analysis to screen out the hub gene that regulates the progression of carotid atherosclerosis and carried out a pan-cancer analysis [4,5]. The outcomes of this research should improve understanding of the molecular mechanism underlying the onset and progression of carotid atherosclerosis and of the correlation between the hub gene and the occurrence of various tumors.
Data set collection and processing
The GSE28829 dataset, containing 13 samples with early and 16 with severe atherosclerosis lesions, was downloaded from the GEO database [6]. The dataset was annotated, and batch effect elimination was performed to acquire an ideal expression matrix file.
Differentially Expressed Gene (DEG) screening and Gene Set Enrichment Analysis (GSEA)
DEG screening was performed using the "limma" package, with screening criteria of |log2FC| > 0.5 and FDR < 0.01. Volcano plots and expression heat maps of the significant genes were drawn using the "ggplot2" and "pheatmap" packages. Gene set enrichment analysis of the candidate genes was also performed.
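The screening rule above can be illustrated with a minimal sketch. It is not the authors' limma pipeline: it assumes that "FDR" means Benjamini-Hochberg adjusted p-values, and the function names (`benjamini_hochberg`, `screen_degs`) and the toy statistics are hypothetical.

```python
from typing import Dict, List, Tuple


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values stay monotone in the rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def screen_degs(
    stats: Dict[str, Tuple[float, float]],
    lfc_cutoff: float = 0.5,
    fdr_cutoff: float = 0.01,
) -> Dict[str, str]:
    """Keep genes with |log2FC| > lfc_cutoff and FDR < fdr_cutoff.

    `stats` maps a gene symbol to (log2 fold change, raw p-value).
    """
    genes = list(stats)
    fdr = benjamini_hochberg([stats[g][1] for g in genes])
    selected = {}
    for gene, q in zip(genes, fdr):
        lfc = stats[gene][0]
        if abs(lfc) > lfc_cutoff and q < fdr_cutoff:
            selected[gene] = "up" if lfc > 0 else "down"
    return selected


# Hypothetical statistics for illustration only (not values from GSE28829).
toy_stats = {
    "GENE_A": (1.20, 0.0001),
    "GENE_B": (-0.80, 0.0004),
    "GENE_C": (0.30, 0.0002),   # significant but fold change too small
    "GENE_D": (0.90, 0.2000),   # large fold change but not significant
}
degs = screen_degs(toy_stats)
print(degs)
```

Both conditions must hold at once, which is why GENE_C and GENE_D are dropped in the toy example.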
Screening WGCNA-based targeting modules and genes and functional enrichment analysis
WGCNA of the gene set was performed to identify atherosclerosis-related modules [7]. The soft-threshold power β that satisfied the scale-free network criterion was selected to construct the gene co-expression network, and the modules were then preliminarily divided.
We calculated correlation coefficients between genes and modules to assess their association and screened the hub genes within the module. Finally, these hub genes were functionally annotated and subjected to pathway enrichment analysis [8].
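As a rough illustration of how a soft-threshold power shapes the co-expression network, the sketch below raises absolute Pearson correlations to the power β and sums them into a per-gene connectivity. It assumes an unsigned network (the text does not say whether a signed or unsigned network was used), and the helper names and toy expression values are hypothetical; the WGCNA R package does considerably more than this.

```python
from math import sqrt
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def soft_adjacency(expr: Dict[str, List[float]], beta: int) -> Dict[str, Dict[str, float]]:
    """Unsigned soft-threshold adjacency: a_ij = |cor(x_i, x_j)| ** beta."""
    genes = list(expr)
    adjacency: Dict[str, Dict[str, float]] = {g: {} for g in genes}
    for g in genes:
        for h in genes:
            if g == h:
                adjacency[g][h] = 1.0
            else:
                adjacency[g][h] = abs(pearson(expr[g], expr[h])) ** beta
    return adjacency


def connectivity(adjacency: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Sum of each gene's adjacencies to all other genes."""
    return {
        g: sum(value for h, value in row.items() if h != g)
        for g, row in adjacency.items()
    }


# Hypothetical expression values across four samples (illustration only).
toy_expr = {
    "G1": [1.0, 2.0, 3.0, 4.0],
    "G2": [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with G1
    "G3": [1.0, 3.0, 2.0, 4.0],   # correlation 0.8 with G1 and G2
}
adj = soft_adjacency(toy_expr, beta=7)
k = connectivity(adj)
print(k)
```

Raising correlations to a power suppresses moderate values (0.8 becomes about 0.21 at β = 7) while leaving strong ones almost intact, which is what pushes the network toward a scale-free topology.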
Combined with machine learning to screen and validate atherosclerotic biomarkers
The XGBoost algorithm, the random forest algorithm [9], the GLM algorithm, and the SVM-RFE algorithm [10] based on the e1071 package were used to screen core genes from the important DEGs. A Venn package was used to intersect the four gene sets and obtain the key genes, and the expression of these key genes was compared between the early and advanced atherosclerosis groups.
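The intersection step amounts to a set operation over the four per-algorithm gene lists. The sketch below is illustrative only: the per-algorithm lists are hypothetical (the actual lists are not given in the text), and only the four overlapping symbols are taken from the article.

```python
from typing import Dict, Iterable, Set


def consensus_genes(selections: Dict[str, Iterable[str]]) -> Set[str]:
    """Return the genes selected by every algorithm."""
    gene_sets = [set(genes) for genes in selections.values()]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Hypothetical per-algorithm selections; FILLER_* symbols are placeholders.
selections = {
    "XGBoost": ["SLAMF8", "TLR2", "VAMP8", "VSIG4", "FILLER_1"],
    "RandomForest": ["TLR2", "VSIG4", "SLAMF8", "VAMP8", "FILLER_2"],
    "SVM-RFE": ["VAMP8", "SLAMF8", "TLR2", "VSIG4", "FILLER_1", "FILLER_3"],
    "GLM": ["VSIG4", "VAMP8", "TLR2", "SLAMF8"],
}
hub_genes = consensus_genes(selections)
print(sorted(hub_genes))
```

Requiring agreement across all four methods is a conservative choice: a gene favored by only one model's inductive bias is excluded.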
The ROC curve of key genes was drawn using the pROC package, and their diagnostic efficiency in diagnosing atherosclerosis was evaluated.
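For readers unfamiliar with what the area under a ROC curve measures, the following sketch computes it directly as the probability that a randomly chosen case scores higher than a randomly chosen control (ties count one half). It is not the pROC implementation; it assumes higher expression indicates the advanced group, and the scores and labels are hypothetical.

```python
from typing import List


def roc_auc(scores: List[float], labels: List[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for cases (e.g. advanced lesions), 0 for controls.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical expression values of one gene (illustration only).
toy_scores = [2.1, 3.4, 2.8, 4.0, 3.9, 3.0]
toy_labels = [0, 0, 0, 1, 1, 1]
auc = roc_auc(toy_scores, toy_labels)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.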
Construction and validation of the signature
A diagnostic nomogram was drawn using R to construct a diagnostic model containing four core genes, and its ROC curve was plotted using the pROC software package to reflect its diagnostic efficiency [11].
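A nomogram is typically a graphical rendering of a regression model. The sketch below assumes the underlying model is a logistic regression on the four genes, which the text does not state explicitly; the intercept, coefficients, and expression values are hypothetical placeholders rather than fitted values.

```python
from math import exp
from typing import Dict


def predicted_risk(
    intercept: float,
    coefficients: Dict[str, float],
    expression: Dict[str, float],
) -> float:
    """Logistic-model probability from a linear predictor."""
    linear_predictor = intercept + sum(
        coefficients[gene] * expression[gene] for gene in coefficients
    )
    return 1.0 / (1.0 + exp(-linear_predictor))


# Hypothetical model parameters and one hypothetical sample.
toy_intercept = -4.0
toy_coefficients = {"SLAMF8": 0.5, "TLR2": 0.5, "VAMP8": 0.5, "VSIG4": 0.5}
toy_sample = {"SLAMF8": 2.0, "TLR2": 2.0, "VAMP8": 2.0, "VSIG4": 2.0}
risk = predicted_risk(toy_intercept, toy_coefficients, toy_sample)
print(risk)
```

In a nomogram, each gene's contribution to the linear predictor is drawn as a points scale, and the total points map to the predicted probability.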
Enrichment analysis of core biomarkers
GSEA of SLAMF8, TLR2, VAMP8, and VSIG4 was performed to investigate their association with atherosclerosis development. Differences in signaling pathway enrichment between the groups in the GSE28829 dataset were explored using the "GSVA" package.
Sample collection and quantitative Real-Time Polymerase Chain Reaction (qRT-PCR)
Twenty patients who underwent cervical vascular ultrasound were enrolled from May 2022 to November 2022 at Heping Hospital, Affiliated with Changzhi Medical College. Ten patients had carotid atherosclerosis; the remaining 10 were controls without atherosclerosis. Peripheral blood was collected and stored at -80ºC for later use.
Total RNA was extracted using TRIzol® reagent and cDNA synthesis was performed according to the reverse transcription kit instructions (YEASEN Biotech). qRT-PCR was performed using cDNA as a template, and the results were analyzed with the 2^(-ΔΔCt) method. Primers used for PCR are provided in Table 1.
Gene | Direction | Primer sequence (5′→3′)
---|---|---
GAPDH | Forward | GGAGCGAGATCCCTCCAAAAT
GAPDH | Reverse | GGCTGTTGTCATACTTCTCATGG
SLAMF8 | Forward | CTGGAGACTCTGTACCATTCCC
SLAMF8 | Reverse | AGCAATGAACACTTGTACCACG
TLR2 | Forward | ATCCTCCAATCAGGCTTCTCT
TLR2 | Reverse | GGACAGGTCAAGGCTTTTTACA
VAMP8 | Forward | TGTGCGGAACCTGCAAAGT
VAMP8 | Reverse | CTTCTGCGATGTCGTCTTGAA
VSIG4 | Forward | GGGGCACCTAACAGTGGAC
VSIG4 | Reverse | GTCTGAGCCACGTTGTACCAG
Table 1. All primers used for amplification.
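The relative quantification used for the qRT-PCR data can be written out as a short calculation. The sketch below assumes GAPDH (listed in Table 1) serves as the reference gene; the function name and the Ct values are hypothetical.

```python
def relative_expression(
    ct_target_case: float,
    ct_reference_case: float,
    ct_target_control: float,
    ct_reference_control: float,
) -> float:
    """Fold change of a target gene by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference), computed within each group
    ddCt = dCt(case) - dCt(control)
    """
    delta_ct_case = ct_target_case - ct_reference_case
    delta_ct_control = ct_target_control - ct_reference_control
    delta_delta_ct = delta_ct_case - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical mean Ct values (illustration only, not measured data).
fold_change = relative_expression(
    ct_target_case=24.0,       # target gene, atherosclerosis group
    ct_reference_case=18.0,    # reference gene, atherosclerosis group
    ct_target_control=26.0,    # target gene, control group
    ct_reference_control=18.0, # reference gene, control group
)
print(fold_change)
```

Because a lower Ct means more template, a negative ΔΔCt translates into a fold change above 1, i.e. higher expression in the case group.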
Flow cytometry
Peripheral Blood Mononuclear Cells (PBMCs) were washed twice, re-suspended in RPMI medium (Sigma-Aldrich, USA) supplemented with 5% fetal bovine serum (Gibco, Australia), and surface-stained with FITC-labeled anti-CD206, PE-labeled anti-CD11b, PerCP/Cyanine 5.5-labeled anti-CD27 and APC-labeled anti-CD38 on ice in the dark for 30 min. A fluorescence-minus-one (FMO) control tube was also prepared for each antibody. Finally, cell samples were collected and analyzed on a FACSCanto flow cytometer using BD FACSDiva software (BD Biosciences).
Pan-cancer candidate biomarker expression
RNA expression profiles and clinical data for 33 tumors were downloaded from https://portal.gdc.com, and pan-cancer analysis was conducted to assess the expression of candidate genes in each tumor. P-values below 0.05 were considered significant, indicating differential expression between tumors and normal tissues.
Analysis of prognosis
RNA expression profiles and clinical information for 33 cancers were obtained from the TCGA database. Univariate Cox regression analysis was performed to calculate the p-value, hazard ratio (HR) and 95% confidence interval (CI), and the results were displayed with the 'forestplot' R package.
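The quantities reported from a univariate Cox model are related in a simple way: the hazard ratio is the exponential of the regression coefficient, and its 95% confidence interval is obtained by exponentiating the coefficient plus or minus 1.96 standard errors. The sketch below shows only this final step, not the model fitting; the coefficients and standard errors are hypothetical.

```python
from math import exp
from typing import Tuple


def hazard_ratio(beta: float, standard_error: float, z: float = 1.96) -> Tuple[float, float, float]:
    """Return (HR, lower 95% bound, upper 95% bound) from a Cox coefficient."""
    return (
        exp(beta),
        exp(beta - z * standard_error),
        exp(beta + z * standard_error),
    )


def classify(ci_low: float, ci_high: float) -> str:
    """Label a gene by whether the HR confidence interval excludes 1."""
    if ci_low > 1.0:
        return "risk factor"
    if ci_high < 1.0:
        return "protective factor"
    return "not significant"


# Hypothetical coefficients (illustration only, not TCGA results).
hr_risk = hazard_ratio(beta=0.405, standard_error=0.1)
hr_protective = hazard_ratio(beta=-0.405, standard_error=0.1)
hr_null = hazard_ratio(beta=0.1, standard_error=0.1)

for hr, low, high in (hr_risk, hr_protective, hr_null):
    print(round(hr, 2), classify(low, high))
```

An HR above 1 with an interval that excludes 1 marks higher expression as a risk factor, while an HR below 1 marks it as protective.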
Data processing and screening and analysis of differential genes
Using a boxplot, we first evaluated the batch effect in the expression matrix from the GSE28829 dataset. Figure 1 displays the expression distribution of the genes across the samples. The effectiveness of the "sva" package in eliminating batch effects is shown by comparing Figure 1A with Figure 1B. The expression matrix file was then analyzed, identifying 23 downregulated and 154 upregulated differentially expressed genes, visualized as a volcano plot in Figure 2A. We created heat maps of the top 60 upregulated genes with the most significant differences using the "pheatmap" package (Figure 2B).
GSEA analysis
To better understand the role of differentially expressed genes in atherosclerosis, we analyzed atherosclerosis tissues using GSEA. Circadian rhythm, beta-alanine metabolism, arginine and proline metabolism, propanoate metabolism, and histidine metabolism were enriched in the advanced atherosclerosis samples. In contrast, graft-versus-host disease, the intestinal immune network for IgA production, asthma, allograft rejection, and Staphylococcus aureus infection were mainly enriched in the early atherosclerosis samples (Figures 3A-3B).
WGCNA
We clustered the 13 early and 16 advanced samples from the GSE28829 dataset, as shown in Figure 4A. The soft threshold was set to 7, at which R² > 0.9 and the average connectivity remained high (Figure 4B). We set the clustering height to 0.25 to merge strongly associated modules, as shown in Figures 4C-4D. No obvious correlation was observed between modules (Figure 4E), and the transcriptional correlation analysis within the modules likewise indicated no substantial connection between them (Figure 4F). We investigated the association between modules and clinical traits and found that the green module was positively associated with advanced atherosclerosis (r=0.81, p=1e-07) and negatively correlated with early atherosclerosis (r=-0.81, p=1e-07) (Figure 4G). The scatter plot of module membership (MM) vs. gene significance (GS) showed that the green module was strongly associated with atherosclerosis (Figure 4H). The genes in this module were therefore analyzed further.
Figure 4: Conducting WGCNA. A) Clustering dendrogram of two sample groups; B) Soft threshold analysis; C) A cutoff value of 0.25 was used for the combination of similar modules; D) Display of the primary and combined modules; E) Module gene correlation; red indicates a positive association and blue indicates a negative association; F) Module gene clustering tree diagram; G) Red indicates a positive correlation between modules and traits and blue indicates a negative correlation; H) MM vs. GS scatter plot of atherosclerosis.
Functional annotation and pathway enrichment
We intersected the differentially expressed genes with the module genes using a Venn plot, resulting in 122 genes (Figure 5A). Functional annotation and pathway enrichment were then performed to analyze their biological functions further.
GO enrichment analysis showed that leukocyte-mediated immunity, leukocyte migration, endocytic vesicles, the external side of the plasma membrane, immune receptor activity and amide binding were closely related to the candidate DEGs (Figure 5B). Disease Ontology (DO) analysis revealed that the candidate DEGs were related to hepatitis, atherosclerosis, arteriosclerosis, lung disease and arteriosclerotic cardiovascular disease (Figure 5C). KEGG analysis showed an association with rheumatoid arthritis, cytokine-cytokine receptor interaction, Staphylococcus aureus infection, tuberculosis, and phagosome (Figure 5D).
Selection of signature genes
We used four machine learning algorithms (XGBoost, RandomForest, SVM-RFE, and GLM) to identify biomarkers, and each algorithm selected 30 core genes from the differentially expressed module genes. All four algorithms showed high sensitivity and specificity (Figure 6A-6C). A Venn plot was used to identify four genes (SLAMF8, TLR2, VAMP8 and VSIG4) that were selected by all four methods, indicating significant functional similarities (Figure 6D-6E).
Construction and testing of the nomogram for atherosclerosis
We developed an atherosclerosis diagnostic nomogram using the rms package (Figure 7A) based on the hub genes (SLAMF8, TLR2, VAMP8 and VSIG4) and assessed its predictive ability with ROC curves. The ROC curve demonstrated the high accuracy of the nomogram model (Figure 7B).
ssGSEA hallmark analysis in the atherosclerotic group and control group
We used ssGSEA to perform hallmark analysis in patients with advanced-stage and early-stage atherosclerosis. Most of the hallmark pathways were enriched with more DEGs in the advanced group. The top three pathways with significant enrichment differences were HALLMARK_WNT_BETA_CATENIN_SIGNALING, HALLMARK_APICAL_JUNCTION and HALLMARK_DNA_REPAIR (Figure 8A). This suggests that the advanced group is characterized mainly by genes upregulated by activation of WNT signaling through accumulation of beta-catenin (CTNNB1), genes encoding components of the apical junction complex, and genes involved in DNA repair. Figure 6E shows that the four signature genes act in a highly consistent, complementary manner.
SLAMF8, TLR2, VAMP8 and VSIG4 were associated with hallmark gene sets comprising genes upregulated by activation of Notch signaling, hedgehog signaling, and the mTORC1 complex; genes downregulated by KRAS activation; and genes upregulated during the formation of blood vessels (angiogenesis) (Figure 8B). They were also associated with the unfolded protein response (a cellular stress response related to the endoplasmic reticulum), genes regulated by MYC, genes regulated by NF-kB in response to TNF, and genes involved in heme metabolism and erythroblast differentiation (Figure 8B).
Quantitative RT-PCR
We used qRT-PCR to detect the transcripts of SLAMF8, TLR2, VAMP8 and VSIG4 in the peripheral blood of atherosclerotic patients and controls. TLR2 expression was markedly higher in the atherosclerosis subjects compared to control subjects (Figure 9). TLR2 could serve as a diagnostic marker for atherosclerosis and is important for its prediction and diagnosis, consistent with the results of preliminary bioinformatics analysis.
Immune infiltration analysis
Immune infiltration analysis revealed that the infiltration of naive and memory B cells, regulatory T cells (Tregs), gamma delta T cells, M0 and M2 macrophages and activated dendritic cells was significantly associated with the development of atherosclerosis (Figure 10A). We also performed an immune correlation analysis on the signature gene TLR2. TLR2 expression was positively associated with the infiltration of CD8 T cells, naive B cells, resting memory CD4 T cells, and Tregs and negatively associated with the infiltration of M0 macrophages (Figure 10B-10F).
Flow cytometry
The number of infiltrating immune cells was measured by flow cytometry. As shown in Figure 11, memory B cells were significantly increased in atherosclerosis blood samples. We also observed an increasing trend in M2 macrophages.
Pan-cancer TLR2 expression
After analyzing TLR2 expression using PCR and immune infiltration analysis, we delved deeper into the gene's potential impact on immunity and tumor development. Previous research has shown that genetically mediated immune responses can play a role in developing atherosclerosis.
We examined TLR2 expression across different tumors in the TCGA and found that it was highly expressed in Bladder Urothelial Carcinoma (BLCA), Cholangiocarcinoma (CHOL) (bile duct cancer), Colon Adenocarcinoma (COAD), Esophageal Carcinoma (ESCA), Glioblastoma Multiforme (GBM), Head and Neck Squamous Cell Carcinoma (HNSC), Kidney Renal Papillary Cell Carcinoma (KIRP), Stomach Adenocarcinoma (STAD), Thyroid Carcinoma (THCA) and Uterine Corpus Endometrial Carcinoma (UCEC), while being expressed at low levels in Breast Cancer (BRCA), Liver Hepatocellular Carcinoma (LIHC), Lung Adenocarcinoma (LUAD), Lung Squamous Cell Carcinoma (LUSC), Pancreatic Adenocarcinoma (PAAD), and Prostate Adenocarcinoma (PRAD) (Figure 12A).
When normal tissue data from the GTEx database were also included, TLR2 was found to be highly expressed in Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma (CESC), CHOL, ESCA, GBM, HNSC, Kidney Renal Clear Cell Carcinoma (KIRC), KIRP, Lower Grade Glioma (LGG), PAAD, Skin Cutaneous Melanoma (SKCM), Stomach Adenocarcinoma (STAD), Testicular Germ Cell Tumors (TGCT), and THCA, and to be expressed at low levels in Adrenocortical Carcinoma (ACC), Breast Cancer (BRCA), Colon Adenocarcinoma (COAD), Diffuse Large B-cell Lymphoma (DLBC), Liver Hepatocellular Carcinoma (LIHC), Lung Adenocarcinoma (LUAD), Lung Squamous Cell Carcinoma (LUSC), and Rectum Adenocarcinoma (READ) (Figure 12B).
Prognostic value of TLR2 in pan-cancer
We conducted a pan-cancer analysis to understand the association between TLR2 expression and patient prognosis, focusing on Progression-Free Survival (PFS), Disease-Free Survival (DFS), and Overall Survival (OS). Our Cox regression analysis of 33 tumors revealed that TLR2 expression was significantly associated with OS in SKCM (p<0.001), LGG (p<0.001), Mesothelioma (MESO) (p<0.05), and Thymoma (THYM) (p<0.05), with TLR2 being a protective factor in SKCM and MESO and a risk factor in THYM and LGG (Figure 13A). Furthermore, TLR2 expression was significantly associated with DFS in KIRP (p<0.05), where TLR2 was identified as a risk factor (Figure 13B). We also found that TLR2 expression was significantly associated with PFS in LGG (p<0.001), SKCM (p<0.001), CESC (p<0.01), LUAD (p<0.05), GBM (p<0.05), and Pheochromocytoma and Paraganglioma (PCPG) (p<0.05), with TLR2 being a protective factor in SKCM, CESC, and LUAD and a risk factor in LGG, GBM, and PCPG (Figure 13C).
Figure 13: The correlation between TLR2 and the prognosis of cancer patients. A) Pan-cancer Cox regression analysis of the association between TLR2 expression and OS; B) Pan-cancer Cox regression analysis of the association between TLR2 expression and DFS; C) Pan-cancer Cox regression analysis of the association between TLR2 expression and PFS.
Atherosclerosis is a widespread inflammatory disease that affects large and medium-sized arteries and is responsible for various cardiovascular diseases, including stroke and ischemic heart disease. Stroke is the second most common cause of death globally and one of the leading causes of disability in adults [12]. Approximately 20%-30% of strokes and transient ischemic attacks are associated with atherosclerotic stenosis of the carotid arteries, and another 15% are associated with the progressive growth and rupture of atherosclerotic plaques [13,14]. Understanding the pathogenesis and progression of carotid atherosclerosis is therefore critical to preventing and treating cerebrovascular accidents, and we set out to explore new gene targets that could inform treatment and prevention strategies. Recent years have seen significant developments in bioinformatics analysis and in the study of molecular mechanisms. Through enrichment analysis of cellular components, molecular functions, and biological processes, we can better understand genes' biological functions, regulatory mechanisms, and how gene variants affect disease progression and protein expression. WGCNA has also become widely used to screen for key gene modules highly correlated with phenotypes [15]. In carotid atherosclerosis, WGCNA has begun to be applied to the study of genes and mechanisms, although further work on immune infiltration and functional analysis is still required [16,17]. Our goal was therefore to establish a reliable evaluation system by combining WGCNA and machine learning techniques to validate the core genes and molecular pathways involved in carotid atherosclerosis, which may help identify potential therapeutic targets for carotid atherosclerotic plaques. Among the 177 DEGs that passed filtering, 154 were upregulated and 23 were downregulated.
Gene set enrichment analysis of target DEGs showed they were primarily associated with propanoate metabolism, histidine metabolism, beta-alanine metabolism, allograft rejection, and the intestinal immune network for IgA production.
WGCNA was performed on 29 samples, and 13 modules were clustered. GO and DO functional annotation and KEGG signaling pathway enrichment analysis of the differentially expressed module genes showed that they were mainly associated with complement and coagulation cascades, phagosome, cytokine-cytokine receptor interaction, positive regulation of cytokine production, leukocyte-mediated immunity, leukocyte migration, arteriosclerosis and hepatitis. Atherosclerosis develops through increases in cytokines, inflammatory cytokines, and chemokines, through immune-related inflammatory mechanisms, and through the infiltration of immune cells into the endothelium. This process contributes to plaque formation, rupture, and thrombosis [18-20]. Four machine learning methods were used to filter out four core genes, SLAMF8, TLR2, VAMP8 and VSIG4, from the target modules. The expression levels of these four core genes differed significantly between patients with early and advanced carotid atherosclerosis, and the diagnostic model built on them performed well. Furthermore, these four genes showed a high degree of functional similarity.
Several studies have demonstrated a link between these four core genes and the pathogenesis of carotid atherosclerosis. The Signaling Lymphocyte Activation Molecule (SLAM) family receptors are expressed in various immune cell types and contain a transmembrane fragment and a cytoplasmic domain with tyrosine motifs. SLAMF8 is a non-classical receptor of the SLAM family, a cell surface receptor expressed on various immune cells [21]. Deletion of SLAMF8 can inhibit activation of the Mitogen-activated protein kinase (MAPK) signaling pathway by downregulating the expression of Toll-like receptor-4 (TLR4) and can also inhibit the NF-κB signaling pathway, reducing the secretion of inflammatory cytokines. SLAMF8 has been reported as a therapeutic target for hepatitis and rheumatoid arthritis [22,23]. Toll-Like Receptors (TLRs) are pattern recognition receptors that participate in innate immunity and play a critical role in the first line of defense against microbial invasion [24]. Recent studies have demonstrated that blocking Toll-like receptor 2 (TLR2) can slow down or reverse the induction of atherosclerosis by Chlamydia pneumoniae infection, and that TLR2 can be used as a new therapeutic target for the prevention and treatment of atherosclerosis [25]. Vesicle-associated membrane protein 8 (VAMP8) is a novel oncoprotein that promotes cell proliferation and drug resistance in gliomas, but its role in carotid atherosclerosis remains unclear because of insufficient research [26]. Lastly, v-set and immunoglobulin domain containing 4 (VSIG4) participates in the pathogenesis of atherosclerosis by promoting the migration of inflammatory cells to the lesion area and inducing chemokines that activate macrophages [27,28].
After analyzing the PCR results, we selected TLR2 for further investigation through immune infiltration and pan-cancer analysis. In the advanced stage of carotid atherosclerosis, the immune infiltration analysis showed higher levels of memory B cells, gamma delta T cells, M0 macrophages and M2 macrophages than in the early stage. Dyslipidemia and immune inflammation are related to atherosclerosis formation, and although B cells play an important role in maintaining immune system stability, they can also be harmful in autoimmune diseases and promote atherosclerosis [29-31]. In contrast, an increase in M2 macrophages may promote the resolution of atherosclerotic inflammation and plaques, potentially acting as a protective and restorative factor in advanced atherosclerosis [32]. Previous research has found that T cells and M0 and M2 macrophages account for the largest proportion of the leukocyte population in advanced unstable plaque, consistent with our findings [33,34]. Our study also confirmed previous research showing that TLR2 is mainly associated with naive B cells, M0 macrophages, memory CD4+ T cells, CD8+ T cells, and Tregs. Additionally, TLR2 is a Toll-like receptor that regulates various immune cell subsets, and its expression has been linked to the progression of atherosclerosis [35]. Interestingly, recent studies have established a link between carotid atherosclerosis and the development of several tumors, including colorectal adenoma [36,37]. Similarly, radiotherapy to the neck in nasopharyngeal carcinoma patients has been associated with an increased risk of carotid atherosclerosis [38]. We therefore went on to explore the role of TLR2, a core gene in carotid atherosclerosis, in different types of cancer. Our pan-cancer analysis found that TLR2 was a prognostic protective factor for SKCM and MESO but a risk factor for THYM and LGG.
Although previous research has shown that TLR2-mediated innate immunity affects the development of various tumors, its expression and relationship to the prognosis of specific tumors require further investigation. This research could help identify therapeutic targets and prognostic markers.
In this study, we conducted a comprehensive analysis of the immune profile and pan-cancer relevance of the core genes and, after applying a series of algorithms and enrichment and pathway analyses, identified TLR2 as a hub gene. We investigated the immune activities and pathways involved and their impact on the prognosis of different tumors. Based on our findings, we conclude that TLR2 is a potential intervention target for both carotid atherosclerosis and multiple tumors and may be a promising focus for developing therapeutic strategies for cardiovascular and cerebrovascular diseases and cancer.
This study was supported by the Shanxi Province Graduate Education Innovation Project (2022Y737), the Basic Research Program of Shanxi Province (202203021212010), and the Youth Start-up Fund of Heping Hospital affiliated to Changzhi Medical College (HPYJ202225).
The public data that support the findings of this study are available from the Gene Expression Omnibus (GEO) (https://www.ncbi.nlm.nih.gov/geo/), and the experimental data are available from the corresponding author upon reasonable request.