Accepted papers of the 9th International Biomedical Congress
Identification of DNA methylation biomarkers for early detection and prognosis of colorectal cancer using genome-wide DNA methylation profiles
Mohammad Ebrahim Golchin,1,* Mahboobeh Golchin,2 Esmaeil Babaei,3
1. Department of Genetics, Faculty of Science, Shahrekord University, Shahrekord, Iran 2. Department of Biology, Faculty of Natural Sciences, University of Tabriz, Tabriz, Iran 3. Department of Biology, Faculty of Natural Sciences, University of Tabriz, Tabriz, Iran
Introduction: Colorectal cancer is a common malignancy and a leading cause of cancer-related mortality, posing a challenge to healthcare systems. It is frequently diagnosed at advanced stages, which is associated with reduced survival and increased therapeutic complexity. Early detection and prognosis are critical for improving survival and treatment outcomes. DNA methylation, a stable epigenetic alteration in carcinogenesis, is a promising biomarker for diagnosis and prognosis. Characterizing DNA methylation patterns in pre-neoplastic stages (particularly adenomatous lesions) and in malignant tissues can substantially aid early detection and prognosis of colorectal cancer. These methylation changes, as early events in tumorigenesis, provide insights into epigenetic mechanisms and may serve not only as biomarkers but also as therapeutic targets for the development of agents that modulate DNA methylation.
Methods: Raw data from two array-based methylation profiling datasets, GSE139404 and GSE149282, were downloaded from the Gene Expression Omnibus (GEO) database. GSE139404 comprises 22 high-grade adenomas (HGA) and 18 low-grade adenomas (LGA) as pre-malignant colorectal samples, together with 20 normal colon samples. GSE149282 includes 12 colorectal cancer (CRC) samples and 12 paired adjacent normal tissues. The datasets were normalized separately with the preprocessFunnorm function of the minfi package, and all samples passed quality control. To reduce technical noise, probes containing SNPs and probes that failed in any sample were removed. Differentially methylated positions (DMPs) were then identified for three comparisons (HGA vs. LGA, HGA vs. normal, and CRC vs. adjacent normal) using thresholds of adjusted p-value < 0.05 and |logFC| ≥ 2. Differentially methylated regions (DMRs) for the same comparisons were detected with the DMRcate package, applying thresholds of HMFDR < 0.05 and |meandiff| ≥ 0.2. Weighted gene co-methylation network analysis (WGCNA) was performed on GSE139404, with samples classified into three groups (HGA, LGA, and normal). Prior to network construction, the 5% most variable probes by M-value variance across samples (n = 46655) were selected. Pearson correlation matrices were then computed for these probes across the colon samples. Using the WGCNA pickSoftThreshold function, a soft-thresholding power of β = 12 (scale-free topology fit R² = 0.9) was selected. The topological overlap matrix (TOM) and its dissimilarity (dissTOM) were then computed, and co-methylation modules were identified by hierarchical clustering with the dynamic tree-cut algorithm. Modules most strongly associated with HGA were selected, and hub CpG sites were defined by Module Membership (MM) > 0.8 and Gene Significance (GS) > 0.2.
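The probe-selection and network steps described above can be illustrated with a small sketch. This is not the authors' pipeline, which used the R packages minfi and WGCNA. It is a plain-Python illustration with hypothetical function names and toy beta values, and it assumes an unsigned network (adjacency = |r|^β) and absolute-value cut-offs for MM and GS, neither of which is stated in the abstract.

```python
import math
from statistics import variance

def beta_to_m(beta, eps=1e-6):
    """Convert a methylation beta value (0..1) to an M-value: log2(beta / (1 - beta))."""
    b = min(max(beta, eps), 1.0 - eps)  # clamp so the logarithm stays finite
    return math.log2(b / (1.0 - b))

def top_variable_probes(beta_by_probe, fraction=0.05):
    """Return the probe IDs with the highest M-value variance across samples."""
    m_var = {p: variance([beta_to_m(b) for b in betas])
             for p, betas in beta_by_probe.items()}
    n_keep = max(1, round(len(m_var) * fraction))
    return sorted(m_var, key=m_var.get, reverse=True)[:n_keep]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def soft_threshold_adjacency(r, power=12):
    """Unsigned soft-threshold adjacency (assumption): |r| raised to the chosen power."""
    return abs(r) ** power

def select_hubs(stats, mm_cut=0.8, gs_cut=0.2):
    """stats maps probe -> (module membership, gene significance).
    Absolute values are compared against the cut-offs (assumption)."""
    return sorted(p for p, (mm, gs) in stats.items()
                  if abs(mm) > mm_cut and abs(gs) > gs_cut)

# Toy beta values for three hypothetical probes across four samples.
toy = {
    "cg_a": [0.10, 0.90, 0.15, 0.85],
    "cg_b": [0.50, 0.52, 0.49, 0.51],
    "cg_c": [0.30, 0.60, 0.35, 0.55],
}
kept = top_variable_probes(toy, fraction=0.34)  # keeps 1 of 3 probes
adj = soft_threshold_adjacency(pearson(toy["cg_a"], toy["cg_c"]))
```

Raising correlations to a high power suppresses weak correlations far more than strong ones, which is why a power such as β = 12 yields a sparser, approximately scale-free network.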
Intersecting the identified DMPs, DMRs, and hub CpG sites within the module of interest yielded genes with altered methylation at both positional and regional levels across CRC, HGA, LGA, and normal samples. The mCSEA package was used to visualize and further examine the implicated positions, regions, and genes. Finally, to assess biomarker potential at the positional level, ROC curves were plotted, and genes with p-value < 0.01 and AUC > 0.8 were selected as candidate biomarkers.
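The intersection and ROC screening steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' code. The gene labels and values are hypothetical placeholders, and the AUC is computed with the rank-based (Mann-Whitney) formulation. The p-value filter from the abstract is not reproduced here, because the abstract does not say which test was used.

```python
def shared_genes(*gene_sets):
    """Genes present in every input set (e.g. DMP genes, DMR genes, hub-CpG genes)."""
    return set.intersection(*map(set, gene_sets))

def auc(case_values, control_values):
    """Area under the ROC curve as the probability that a random case value
    exceeds a random control value (ties count as 0.5)."""
    wins = 0.0
    for c in case_values:
        for n in control_values:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))

def direction_free_auc(case_values, control_values):
    """AUC that scores hypo- and hypermethylated markers on the same scale."""
    a = auc(case_values, control_values)
    return max(a, 1.0 - a)

# Hypothetical gene sets from the three analyses.
dmp_genes = {"GENE_A", "GENE_B", "GENE_C"}
dmr_genes = {"GENE_A", "GENE_C"}
hub_genes = {"GENE_A", "GENE_D"}
candidates = shared_genes(dmp_genes, dmr_genes, hub_genes)

# Hypothetical beta values at one CpG in lesions versus normal tissue.
lesion = [0.80, 0.90, 0.70]
normal = [0.20, 0.30, 0.75]
passes_auc_cutoff = direction_free_auc(lesion, normal) > 0.8
```

Taking the larger of AUC and 1 − AUC lets a hypomethylated marker, such as the one reported for LPIN1, be assessed against the same AUC > 0.8 cut-off as hypermethylated markers.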
Results: WGCNA on the GSE139404 dataset showed that the blue (r = –0.68, p = 2e-09) and red (r = 0.61, p = 3e-07) modules were most strongly associated with the HGA trait. Intersecting hub CpG sites from the blue module with DMPs and DMRs from the three comparisons (HGA vs. LGA, HGA vs. normal, and CRC vs. adjacent normal tissue) identified LPIN1, which was hypomethylated in all comparisons. Similarly, intersecting hub CpG sites from the red module with DMPs and DMRs revealed FAM115A, FGF12, EBF1, EVC, RYR2, IGLON5, and DAB1, all of which were hypermethylated. ROC curves were plotted for the top CpG positions of two genes, cg24011073 (LPIN1) and cg03225210, cg01030534, and cg02245020 (FAM115A), under the different comparison conditions. Both genes, with AUC > 0.8 and p-value < 0.01, were identified as key potential biomarkers for early detection and diagnosis of CRC.
Conclusion: Integrative methylation analysis identified LPIN1, FAM115A, FGF12, EBF1, EVC, RYR2, IGLON5, and DAB1 as key genes with altered methylation across normal, adenomatous, and malignant colorectal tissues. Among these, LPIN1 and FAM115A showed the strongest diagnostic potential, with robust ROC performance, highlighting them as promising biomarkers for early detection and prognosis of colorectal cancer. These findings provide novel insights into the epigenetic landscape of colorectal tumorigenesis and underscore the utility of DNA methylation signatures in biomarker development.
Keywords: colorectal cancer, early detection, DNA methylation, biomarker, adenomatous lesions