5-Year Impact Factor: 0.9
Volume 35, 12 Issues, 2025
  Original Article     December 2025  

GPX8, COL1A1, RAB31, and Other Genes as Diagnostic Biomarkers for Crohn's Disease: A Bioinformatics Analysis Using Machine Learning

By Ruan Qiang1, Lu Ya Wei2, Gu Guo Sheng1

Affiliations

  1. Department of General Surgery, Anhui No.2 Provincial People's Hospital, Hefei, China
  2. Department of General Surgery, Anhui Medical University, Hefei, China
doi: 10.29271/jcpsp.2025.12.1511

ABSTRACT
Objective: To identify potential pivotal genes associated with Crohn's disease (CD) that may serve as potential diagnostic biomarkers.
Study Design: A bioinformatics analysis.
Place and Duration of the Study: Department of Gastrointestinal Dysfunction Centre, Anhui No.2 Provincial People's Hospital, Hefei, China, from June 2023 to June 2024.
Methodology: Multiple gene expression datasets from the NCBI Gene Expression Omnibus (GEO) were used to identify potential diagnostic biomarkers. Weighted gene co-expression network analysis (WGCNA) was conducted; CD-related genes were selected through WGCNA, gene-gene interaction network analysis, and Least Absolute Shrinkage and Selection Operator (LASSO) regression. Logistic regression and six supervised learning techniques—including support vector machine, random forest, K-nearest neighbour, neural network, decision tree, and extreme gradient boosting (XGB)—were used to assess and compare the diagnostic performance of various models.
Results: Hub gene screening, immune infiltration analysis, and gene set enrichment analysis were used to explore the association between the potential genes and CD. A nomogram based on the logistic regression model was developed, and the clinical utility of the hub genes was evaluated through decision curve analysis (DCA). Additionally, six machine learning models were developed, with the XGB model demonstrating the highest performance. The area under the curve (AUC) for the XGB model was 99.4% for the training set, 77.7% for the validation set, and 83.5% for external validation data, indicating its superior diagnostic potential.
Conclusion: This study identified five genes that are closely associated with CD. Based on these genes, a CD diagnostic model was established, with the XGB model showing the most promising results among the six tested models.

Key Words: Crohn's disease, Genes, Prediction model, Machine learning.

INTRODUCTION

In the early 21st century, the incidence of inflammatory bowel disease (IBD) rose sharply in newly industrialised countries, turning it into a major global health issue.1 IBD, which encompasses CD and ulcerative colitis (UC), is marked by persistent intestinal inflammation and characterised by alternating periods of remission and flare-ups. Major complications of IBD can include the development of fibrotic stenosis, fistulas, impaired absorption, an increased risk of cancer, and the need for surgical intervention. Research on biomarkers that support CD diagnosis, along with studies on the underlying immune mechanisms, continues to be a vital area for improving both the under- standing and treatment of the disease.

CD is an autoimmune condition marked by repeated bouts of intestinal inflammation. The development of CD is thought to be closely linked to alterations in immune cell types and numbers, which disrupt immune homeostasis, trigger an inflammatory cascade, and cause sustained damage to the intestinal lining.2 Recent research on immune cell infiltration and pathway enrichment reinforces this idea, showing that differentially expressed genes are highly linked to immune responses and associated pathways.

In recent years, machine learning (ML) has emerged as a widely used bioinformatics approach for mining medical data.3 Additionally, biological networks, including weighted gene co-expression network analysis (WGCNA) and gene-gene interaction networks, are essential for uncovering the relationships between genes and their biological functions. This study aimed to identify potential biomarkers through bionetwork analysis and ML methods, assessing their uniqueness using ML-based diagnostic models.

METHODOLOGY

Seven datasets were retrieved from the GEO database. Each dataset included over 10 samples and was generated using four distinct microarray platforms. The raw data underwent pre- processing using the oligo and affy R packages, followed by background correction and normalisation through the Robust Multichip Average (RMA) approach. The GSE83448 dataset was utilised to build a WGCNA. Next, four additional datasets (GSE16879, GSE59071, GSE75214, and GSE102133) were analysed using the limma R package to identify differentially expressed genes (DEGs) between CD patients and healthy controls. DEGs were filtered with a threshold of |log2FC| >1 and p <0.05, and the findings were visualised using heat maps and volcano plots generated by the complex heat maps and ggplot2 R packages. The GSE186582 dataset included 317 samples from CD patients and 25 intestinal mucosal samples from healthy controls. The Combat algorithm from the sva R package was applied to adjust for batch effects across various platforms and experiments. Lastly, the GSE95095 dataset was used as an independent dataset for external validation (EV).

The R package WGCNA was used to identify gene modules and assess the association between each module and the sample type. The GSE83448 dataset provided 14 healthy control samples and 19 CD patient samples, totalling 33 samples; these samples were categorised into a CD group and a Healthy Control group. The hclust function was employed to cluster samples and detect outliers using the average method on the expression matrix. A soft threshold of seven was selected to optimise the scale-free topological fitting index, yielding an index value of 0.86, which enhanced the average network connectivity. The correlation between module Eigen genes and sample types was assessed by examining the characteristic gene expression profiles and the classification of clinical samples. Inclusion criteria included patients aged between 18 and 65 years and colonoscopy pathological diagnosis of Crohn's disease. Exclusion criteria included patients aged under 18 years, diagnosed with inflammatory diseases other than Crohn's disease (such as ulcerative colitis), and patients with intestinal tuberculosis or irritable bowel syndrome. Genes labelled as green within the selected module were exported to Cytoscape for detailed examination. To explore the roles of common DEGs in greater detail, Gene Ontology (GO) annotation and Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathway enrichment analyses were conducted using the clustering analyser in R.

The intersection of 196 genes screened from four gene sets, with 68 genes in the green module analysed by WGCNA, was taken to obtain 22 common genes. Their relationship with immune cell infiltration was evaluated, and finally, 5 hub genes were obtained through Least Absolute Shrinkage and Selection Operator (LASSO) regression dimension reduction processing. The intersection of these analyses pinpointed five immune cell types that exhibited significantly distinct infiltration patterns. Spearman’s correlation analysis was performed in R software to assess the relationship between the five hub genes and 22 significantly elevated immune cell types. The ggplot2 R package was then used to create visual representations of these correlations.

The datasets were randomly divided into training and testing sets, with 70% allocated to the training set and 30% to the testing set. Repeated stratified cross-validation was then applied to the training set. The stratified K-fold method maintained consistent proportions of sample types in each fold, aligning with the overall dataset distribution. To further assess model robustness, external validation was conducted using an independent dataset, GSE95095. This study evaluated six commonly used ML algorithms to determine the most effective classifier for the diagnostic model. The algorithm tested models included extreme gradient boosting (XGB), k-nearest neighbour (KNN), support vector machine (SVM), random forest (RF), decision tree (DT), and neural network (NN). Ultimately, model performance on both the test and external validation datasets was evaluated using a comprehensive array of metrics.

RESULTS

As shown in Figure 1, the differential genes between patients the CD group and the Healthy Control group were analysed using the limma package in R. After filtering the genes with thresholds of |log2FC| >1 and p <0.05, a total of 1,843 DEGs were screened in the GSE16879 dataset. A total of 691 DEGs were screened in the GSE59071 dataset. The GSE75214 dataset filtered out a total of 400 differential genes. In the GSE102133 dataset, 534 DEGs were identified (Figure 1A-D). After taking the intersection of the Venn diagram of the above differential genes, 196 common differential genes were identified (Figure 1E). The DEGs table intersected four gene clusters, which included 147 commonly upregulated genes and 49 commonly downregulated genes.

The WGCNA R package was used to assess the correlations between clinical features and genes by building a gene network with 820 genes and 33 samples from the GSE83448 dataset. A sample clustering diagram was generated (Figure 2A). To ensure a scale-free topology and preserve zero mean connectivity, a threshold value of 17 was selected (Figure 2C). The module merging threshold was set at 0.2, resulting in the creation of four distinct modules (Figure 2B). Yellow and green modules have the highest positive correlation and are important modules for subsequent analysis (Figure 2D).

To further elucidate the functions of these shared DEGs, GO annotation together with KEGG pathway enrichment analyses were performed (Table I).

To identify the pivotal genes, a cross-analysis was performed between the 68 related genes in the green module and the previously identified 196 DEGs, resulting in 22 key genes. These genes were analysed by immune infiltration, and subsequently, dimensionality reduction was performed by LASSO regression. Finally, five key genes were obtained: GPX8, COL1A1, RAB31, S100A9, and STX11 (Figure 3A-B). In CD samples, S100A9 showed a positive correlation with resting NK cell infiltration levels, while demonstrating a negative correlation with macrophage M1, monocytes, and eosinophils. STX11 was also positively associated with resting NK cell infiltration.

Figure 1: Identification of common DEGs. (A-D) Volcano maps show the differential genes between CD patients and the healthy controls in the GSE16879, GSE75214, GSE59071, and GSE102133 datasets, respectively. (E) The Venn diagram shows 196 common DEGs in the intersection of the four datasets.  Figure 2: Progress of GSE83448 WGCNA (A) Cluster tree of 33 samples from GSE83448. (B) Aggregation in GSE83448 across conditions (p  <0.05; |logFC|>1). A tree of 820 genes was screened. Using the topological overlap matrix, the genes were organised into distinct modules. (C) Determine soft thresholds (left) and average connectivity (right) for the fit index of the optimal scale-free topological model. (D) Heat map showing relationship between GSE83448 module feature genes and clinical traits. Figure 3: Association between hub gene and immune infiltration. (A) LASSO coefficient distribution of 5 hub genes. (B) Tuning parameters (λ) select cross-validation error curves. (C-N) Visualisation of the correlation between immune cells and hub genes.  Figure 4: Final selection of the hub gene logistic regression modelling. (A) Drawing of the nomogram of the prediction model for CD; (B)  Calibration curve of the model; (C) The representation of single factor and comprehensive factor in the model DCA curve.

Table I: Analysis of the biological processes, molecular functions, cellular components and KEGG pathway enrichment of genes.

Categories

Terms

Counts

p-values

Biological process

Cytokine-mediated signalling pathway

20~30

p <0.01

Biological process

Leucocyte migration

20~30

p <0.01

Biological process

Cell chemotaxis

20~30

p <0.01

Biological process

Response to molecule of bacterial origin

20~30

p <0.01

Biological process

Response to lipopolysaccharide

20~30

p <0.01

Biological process

Leucocyte chemotaxis

20~30

p <0.01

Biological process

Myeloid leucocyte migration

20~30

p <0.01

Biological process

Granulocyte migration

20~30

p <0.01

Biological process

Neutrophil migration

20~30

p <0.01

Biological process

Neutrophil chemotaxis

20~30

p <0.01

Molecular function

Receptor ligand activity

8~20

p <0.01

Molecular function

Signalling receptor activator activity

8~20

p <0.01

Molecular function

Extracellular matrix structural constituent

8~20

p <0.01

Molecular function

Cytokine activity

8~20

p <0.01

Molecular function

Cytokine receptor binding

8~20

p <0.01

Molecular function

Immune receptor activity

8~20

p <0.01

Molecular function

Chemokine receptor binding

8~20

p <0.01

Molecular function

Chemokine activity

8~20

p <0.01

Molecular function

CXCR chemokine receptor binding

8~20

p <0.01

Molecular function

Extracellular matrix structural constituent conferring tensile strength

8~20

p <0.01

Cellular component

Collagen-containing extracellular matrix

5~25

p <0.01

Cellular component

External side of plasma membrane

5~25

p <0.01

Cellular component

Endoplasmic reticulum lumen

5~25

p <0.01

Cellular component

Secretory granule membrane

5~25

p <0.01

Cellular component

Secretory granule lumen

5~25

p <0.01

Cellular component

Cytoplasmic vesicle lumen

5~25

p <0.01

Cellular component

Vesicle lumen

5~25

p <0.01

Cellular component

Collagen trimer

5~25

p <0.01

Cellular component

Platelet alpha granule

5~5

p <0.01

Cellular component

Complex of collagen trimers

5~25

p <0.01

KEGG pathway

Cytokine-cytokine receptor interaction

9~21

p <0.01

KEGG pathway

Rheumatoid arthritis

9~21

p <0.01

KEGG pathway

IL-17 signalling pathway

9~21

p <0.01

KEGG pathway

AGE-RAGE signalling pathway in diabetic complications

9~21

p <0.01

KEGG pathway

Viral protein interaction with cytokine and cytokine receptor

9~21

p <0.01

KEGG pathway

TNF signalling pathway

9~21

p <0.01

KEGG pathway

Staphylococcus aureus infection

9~21

p <0.01

KEGG pathway

Amoebiasis

9~21

p <0.01

KEGG pathway

Leishmaniasis

9~21

p <0.01

KEGG pathway

Malaria

9~21

p <0.01

Table II: Accuracy, Kappa value, sensitivity, specificity, precision, model recall, F, and AUC score of six supervised ML models.

Datasets

ML models

Accuracy

Kappa values

Sensitivity

Specificity

Precision

Recall

F1

AUC

Training sets

KNN

0.8833

0.5096

0.8739

1

1

0.8739

0.9327

0.971

DT

0.8167

0.3462

0.8108

0.8889

0.989

0.8108

0.8911

0.888

NN

0.775

0.3182

0.7568

1

1

0.7568

0.8615

0.94

RF

1

1

1

1

1

1

1

1

XGB

0.9542

0.7421

0.9505

1

1

0.9505

0.9746

0.994

SVM

0.9458

0.6688

0.955

0.8333

0.986

0.955

0.9703

0.933

Testing sets

KNN

0.8529

0.4226

0.8421

1

1

0.8421

0.9143

0.921

DT

0.8235

0.1692

0.8526

0.4286

0.9529

0.8526

0.9

0.651

NN

0.7647

0.2078

0.7684

0.7143

0.9733

0.7684

0.8588

0.886

RF

0.9412

0.3742

0.9895

0.2857

0.9495

0.9895

0.9691

0.844

XGB

0.8922

0.2087

0.9368

0.2857

0.9468

0.9368

0.9418

0.777

SVM

0.8824

0.3412

0.9053

0.5714

0.9663

0.9053

0.9348

0.839

External validation sets

KNN

0.8056

0.5532

0.875

0.6667

0.84

0.875

0.8571

0.773

DT

0.7778

0.4783

0.875

0.5833

0.8077

0.875

0.84

0.769

NN

0.8056

0.5714

0.8333

0.75

0.8696

0.8333

0.8511

0.802

RF

0.6667

0.1429

0.875

0.25

0.7

0.875

0.7778

0.769

XGB

0.8333

0.625

0.875

0.75

0.875

0.875

0.875

0.835

SVM

0.7222

0.4

0.75

0.6667

0.8182

0.75

0.7826

0.688

Similarly, GPX8 exhibited a positive correlation with the infiltration of both resting NK cells and macrophages M0. RAB31 was positively linked to the infiltration of macrophages M0, macrophages M2, resting NK cells, and activated mast cells (Figure 3C-N).

First, a logistic regression model was developed for the five selected hub genes, followed by the creation of a column graph and a calibration curve, and subsequently, a DCA curve of single factor and comprehensive factor for comparative analysis was drawn. Afterwards, six models were built using ML algorithms (Figure 4A-C), and ROC curves for all six of the CD categorical diagnostic assistants were established using the predicted probabilities for different categories. The diagnostic performance of the six CD prediction models was further assessed using the EV dataset, which included 24 CD samples and 12 healthy controls. The training dataset results for all six models are presented in Table II. Among them, the RF model demonstrated the highest accuracy at 100%, achieving the best specificity and sensitivity. In contrast, the NN model demonstrated the weakest performance, with an accuracy of merely 77.5% and a sensitivity of 75.7%, whereas the SVM model exhibited a specificity of only 83.3%. The testing dataset performance of all diagnostic models is illustrated in Table II. After training, the RF model continued to outperform others with an accuracy of 94.1%. Meanwhile, the NN model remained the least accurate, achieving nearly 76.5%. Sensitivity analysis revealed that this metric was generally the weakest, with a sensitivity of 76.8%. On the other hand, the RF model demonstrated the highest sensitivity at 98.9%, highlighting its strong predictive ability for detecting CD samples. Regarding specificity, the KNN algorithm performed best, achieving 100% specificity for CD samples. The specificity of the SVM algorithm is the lowest at 57.1%. The RF and XGB algorithms predicted the highest specificity of normal samples (approximately 72.5%). The results show that each of the six models has its own advantages and disadvantages. The F1 score can be used as a more reliable indicator. According to the F1 score, among the evaluated models, the RF algorithm exhibited superior performance, achieving the accuracy and sensitivity values of 100% and 96.9%, respectively. The performance metrics for all six CD diagnostic models on the external validation dataset are presented in Table II. The results indicate that all six models accurately classified normal samples; however, their performance varied for CD samples. In terms of accuracy, the XGB algorithm performed the best, achieving 83.3%, while the RF model showed the lowest accuracy at 66.7%. The sensitivity of all models was relatively consistent (approximately 87%). The NN and XGB boosting models demonstrated the highest specificity, both reaching 75%. In the end, gradient-lifted F1 scored the highest at 87.5%. The AUC of gradient lifting on EV was 83.5%.

DISCUSSION

The protein encoded by the GPX8 gene is Glutathione Peroxidase 8, which is a member of the glutathione peroxidase family. Glutathione peroxidase is an important class of antioxidant enzymes; it protects cells from oxidative damage. GPx7 (NPGPx) and GPx8 exhibit a high level of structural similarity.4 Studies have demonstrated that GPx8-deficient mice are more prone to colitis. GPx8 is particularly important in macrophage function. Colon tissue samples from IBD patients showed low GPx8 expression and high caspase-4 levels, suggesting that GPx8 may protect against colitis by inhibiting caspase-4/11 activity.5

S100A9 is a member of the S100 protein family. It is involved in intracellular and intercellular signalling, as well as the migration and activation of inflammatory cells. S100A9 also function independently as monomers, contributing to inflammatory processes when released into the extracellular environment.6 Research has demonstrated a correlation between the expression levels of S100A9 in blood leucocytes and the clinical characteristics of IBD.7 S100A9 is, therefore, considered to be a potential biomarker that can be used to assess the activity of inflammation and the prognosis of disease.

The COL1A1 gene encodes a protein that is crucial for preserving normal tissue structure and function. Fibrosis in CD results from diverse pathological mechanisms, including the accumulation of extracellular matrices (ECM). These processes compromise intestinal tissue integrity.8 High COL1A1 expression is closely linked to the fibrotic phenotype in the intestine,9 as it promotes ECM accumulation and thereby accelerates the development of intestinal fibrosis.10

The STX11 gene encodes the protein grammarin 11, which plays a crucial role in membrane and vesicle fusion processes both within and between cells. Grammarin 11 functions mainly in immune cells, especially in lymphocytes and antigen-presenting cells. It also plays a role in intercellular signalling and the modulation of immune responses.11 Additionally, STX11 may contribute to regulating inflammation and contributing to the onset and progression of autoimmune diseases.12 RAB31, a member of the RAB protein family,13 shows high RAB31 expression in breast cancer tissues.14 Nevertheless, the specific molecular pathways through which RAB31 influences the onset and progression of IBD remain insufficiently elucidated.

The underlying mechanisms of CD are believed to involve changes in the types and numbers of immune cells, leading to disrupted immune balance and sustained damage to the intestinal mucosa.15 The analyses of immune cell infiltration and pathway enrichment have demonstrated that genes with differential expression are intimately linked to immune responses and their associated signalling pathways. Notably, the intestinal samples from CD patients display significant diversity in immune cell composition, including effector memory CD8+ T cells, neutrophils, mast cells, activated dendritic cells, and γδ T cells. Furthermore, macrophages are recognised as the primary source of IL-23 in the gut, and they crosstalk with T cell subsets and innate lymphocytes to amplify intestinal inflammatory responses.16 Patients with CD often show a reduced number of regulatory T cells and impaired inhibitory function.17 In addition, other immune cells, including dendritic cells and mast cells, are important contributors to the pathogenesis of CD.18,19 Neutrophils activate and recruit other immune cells through degranulation, further exacerbating the inflammatory response.20 Spearman’s correlation analysis was performed on five key hub genes and 22 distinct immune cell types.21 The study's results further imply that S100A9, RAB31, STX11, and GPx8 could influence the fibrotic processes in CD by modulating immune responses and the ECM process.

However, the study has some limitations. First, this study relies on bioinformatics analysis of multiple databases, thus further experimental verification is needed. Second, while the study conducted extensive bioinformatics analysis, the underlying mechanisms of the identified hub genes remain under- explored.

CONCLUSION

During the process of gene screening, it was found that RAB31 and STX11 were newly related to the occurrence and development of CD, in addition to COL1A1, S100A9, and GPX8. These genes can be used as potential biomarkers to build clinical predictive models through ML methods.

ETHICAL APPROVAL:
The public databases involved have been ethically reviewed and this article compiles with the terms and policies of the data provider, so ethical approval is not required.

COMPETING INTEREST:
The authors declared no conflict of interest.

AUTHORS’ CONTRIBUTION:
RQ: Statistical analysis of the data and writing of the paper.
LY: Research design and scattered work.
GGS: Design of the study.
All authors approved the final version of the manuscript to be published.

REFERENCES

  1. Ng SC, Shi HY, Hamidi N, Underwood FE, Tang W, Benchimol EI, et al. Worldwide incidence and preva- lence of inflammatory bowel disease in the 21st century: A systematic review of population-based studies. Lancet 2017; 390(10114):2769-78. doi: 10.1016/S0140-6736(17) 32448-0.
  2. Huang R, Wang W, Chen Z, Chai J, Qi Q, Zheng H, et al. Identifying immune cell infiltration and effective diagnostic bio markers in Crohn's disease by bioinformatics analysis. Front Immunol 2023; 14:1162473. doi: 10.3389/fimmu. 2023.1162473.
  3. Yang A, Zhang W, Wang J, Yang K, Han Y, Zhang L. Review on the application of machine learning algorithms in the sequence data mining of DNA. Front Bioeng Biotechnol 2020; 8:1032. doi: 10.3389/fbioe.2020.01032
  4. Pei J, Pan X, Wei G, Hua Y. Research progress of glutathione peroxidase family (GPX) in redoxidation. Front Pharmacol 2023; 14:1147414. doi: 10.3389/fphar.2023.1147414.
  5. Hsu JL, Chou JW, Chen TF, Hsu JT, Su FY, Lan JL, et al. Gluta-thione peroxidase 8 negatively regulates caspase-4/11 to protect against colitis. EMBO Mol Med 2020; 12(1):e9386. doi: 10.15252/emmm.201809386.
  6. Xia P, Ji X, Yan L, Lian S, Chen Z, Luo Y. Roles of S100A8, S100A9 and S100A12 in infection, inflammation and immunity. Immunology 2024; 171(3):365-76. doi: 10.1111/imm. 13722.
  7. Yao Z, Zhang B, Niu G, Yan Z, Tong X, Zou Y, et al. Neutrophil infiltration characterized by upregulation of S100A8, S100A9, S100A12, and CXCR2 is associated with the co-occurrence of Crohn's disease and peripheral artery disease. Front Immunol 2022; 13:896645. doi: 10.3389/fimmu.2022.896645.
  8. Antar SA, Ashour NA, Marawan ME, Al-Karmalawy AA. Fibrosis: Types, effects, markers, mechanisms for disease progression, and its relation with oxidative stress, immunity, and inflammation. Int J Mol Sci 2023; 24(4):4004. doi: 10.3390/ijms24044004.
  9. Amamou A, Yaker L, Leboutte M, Bole-Feysot C, Savoye G, Marion-Letellier R. Dietary AhR ligands have no anti-fibrotic properties in TGF-β1-stimulated human colonic fibroblasts. Nutrients 2022; 14(16):3253. doi: 10.3390/nu14163253.
  10. Yao Y, Hu C, Song Q, Li Y, Da X, Yu Y, et al. ADAMTS16 activates latent TGF-β, accentuating fibrosis and dysfunction of the pressure-overloaded heart. Cardiovasc Res 2020; 116(5):956-69. doi: 10.1093/cvr/cvz187.
  11. Zhang G, Han J, Wang L, Yang X, Yan Z, Qu M, et al. The vesicular transporter STX11 governs ATGL-mediated hepatic lipolysis and lipophagy. iScience 2022; 25(4): 104085. doi: 10.1016/j.isci.2022.104085.
  12. Chen X, Wang F, Zhang Y, Teng W, Wang M, Nie D, et al. Genetic variant spectrum in 265 Chinese patients with hemophagocytic lymphohistiocytosis: Molecular analyses of PRF1, UNC13D, STX11, STXBP2, SH2D1A, and XIAP. Clin Genet 2018; 94(2):200-12. doi: 10.1111/cge.13363.
  13. Wei D, Zhan W, Gao Y, Huang L, Gong R, Wang W, et al. RAB31 marks and controls an ESCRT-independent exosome pathway. Cell Res 2021; 31(2):157-77. doi: 10.1038/s41 422-020-00409-1.
  14. Soelch S, Beaufort N, Loessner D, Kotzsch M, Reuning U, Luther T, et al. Rab31-dependent regulation of transforming growth factor ß expression in breast cancer cells. Mol Med 2021; 27(1):158. doi: 10.1186/s10020-021-00 419-8.
  15. Martin-Reyes F, Kentaoui-Bousellam I, Ruiz-Malagon AJ, Ho-Plagaro A, Lopez-Gomez C, Herraiz-Vilela M, et al. Mesenteric adipose tissue in Crohn's disease alters the immune response of the ileal mucosa and increases angiogenesis and fibrosis. Sci Rep 2025; 15(1):22591. doi: 10.1038/s41 598-025-07147-7.
  16. Schmitt H, Neurath MF, Atreya R. Role of the IL23/IL17 pathway in Crohn's disease. Front Immunol 2021; 12: 622934. doi: 10.3389/fimmu.2021.622934.
  17. Clough JN, Omer OS, Tasker S, Lord GM, Irving PM. Regulatory T-cell therapy in Crohn's disease: challenges and advances. Gut 2020; 69(5):942-52. doi: 10.1136/gutjnl- 2019-319850.
  18. Mitsialis V, Wall S, Liu P, Ordovas-Montanes J, Parmet T, Vukovic M, et al. Single-cell analyses of colon and blood reveal distinct immune cell signatures of ulcerative colitis and Crohn's disease. Gastroenterology 2020; 159(2):591- 608.e10. doi: 10.1053/j.gastro.2020.04.074.
  19. Lyons DO, Pullen NA. Beyond IgE: Alternative mast cell activation across different disease states. Int J Mol Sci 2020; 21(4):1498. doi: 10.3390/ijms21041498.
  20. Zhang N, Aiyasiding X, Li WJ, Liao HH, Tang QZ. Neutrophil degranulation and myocardial infarction. Cell Commun Signal 2022; 20(1):50. doi: 10.1186/s12964-022-00824-4.
  21. Zhang B, Wu Q, Li B, Wang D, Wang L, Zhou YL. m6A regulator-mediated methylation modification patterns and tumor microenvironment infiltration characterization in gastric cancer. Mol Cancer 2020; 19(1):53. doi: 10.1186/s12943- 020-01170-0.