Machine learning can help identify potential cancer-driving mutations at CTCF binding sites, according to a recent study published in the journal Nucleic Acids Research. The study focused on persistent CCCTC-binding factor (CTCF) binding sites (P-CTCFBSs) and asked whether machine learning could detect pan-cancer mutational hotspots in these regions.
CTCF is a key regulator of nuclear architecture and of the transcription of non-coding DNA, and mutations at its binding sites can impair that function. P-CTCFBSs show higher binding strength and constitutive binding, are enriched at chromatin loop anchors, and are involved in topologically associating domain (TAD) boundaries. Mutations at these sites can activate cancer-driving genes, but identifying such mutations has been challenging.
To address this, researchers developed a machine learning tool called CTCF-INSITE (CTCF In-Silico Investigation of PersisTEnt Binding). The tool uses genetic and epigenetic characteristics of a binding site to predict whether CTCF binding will persist after CTCF knockdown in cancer cells. To analyze mutational loads at P-CTCFBSs, the study drew on data from the International Cancer Genome Consortium (ICGC), the Encyclopedia of DNA Elements (ENCODE), and the National Center for Biotechnology Information (NCBI), along with high-coverage whole-genome sequencing (WGS) data for the GM12878 cell line.
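To make the idea of predicting persistence from site-level features concrete, here is a minimal sketch. It is not the study's model: it swaps in a plain logistic regression trained by gradient descent as a stand-in classifier, and the three features (a motif score, a binding signal, and a loop-anchor flag), the toy values, and the labels are all hypothetical, chosen only for illustration.

```python
import math

# Hypothetical feature vectors for CTCF binding sites:
# (motif_score, binding_signal, at_loop_anchor). All values are toy numbers.
SITES = [
    (0.9, 0.8, 1.0),
    (0.8, 0.9, 1.0),
    (0.7, 0.7, 1.0),
    (0.3, 0.2, 0.0),
    (0.2, 0.4, 0.0),
    (0.4, 0.1, 0.0),
]
# Toy labels: 1 = binding persisted after knockdown, 0 = binding was lost.
LABELS = [1, 1, 1, 0, 0, 0]


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Predicted probability that binding at a site is persistent."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(sites, labels, lr=0.5, epochs=2000):
    """Fit logistic regression with batch gradient descent (stand-in model)."""
    weights = [0.0] * len(sites[0])
    bias = 0.0
    n = len(sites)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, y in zip(sites, labels):
            err = predict_proba(weights, bias, features) - y
            for j, x in enumerate(features):
                grad_w[j] += err * x
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


weights, bias = train(SITES, LABELS)
strong_site = predict_proba(weights, bias, (0.85, 0.9, 1.0))
weak_site = predict_proba(weights, bias, (0.25, 0.2, 0.0))
print(f"strong site: {strong_site:.2f}, weak site: {weak_site:.2f}")
```

A real predictor would be trained on measured knockdown data with many more sites and features; the sketch only shows the general shape of the task, in which each site is scored from its features and the score is read as a probability of persistence.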
After analyzing data from various cancer types, the study found that in prostate and breast cancers, P-CTCFBSs had significantly higher mutation rates than CTCF binding sites overall. Mutations at P-CTCFBSs were also more likely to disrupt CTCF binding and thereby affect chromatin looping. In addition, the study identified notable enrichment of disruptive mutations at P-CTCFBSs across different cancer types.
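The kind of comparison described above, mutation rate at persistent sites versus all CTCF binding sites, can be sketched as follows. This is an illustration under stated assumptions, not the study's pipeline: the coordinates and mutations are made-up toy data, the helper `mutations_per_kb` is hypothetical, intervals are treated as half-open and non-overlapping, and no significance test is included.

```python
def mutations_per_kb(sites, mutations):
    """Mutations per kilobase across a set of binding sites.

    sites: list of (chrom, start, end), half-open and non-overlapping.
    mutations: list of (chrom, pos).
    """
    total_bp = sum(end - start for _, start, end in sites)
    hits = sum(
        1
        for chrom, pos in mutations
        if any(c == chrom and start <= pos < end for c, start, end in sites)
    )
    return 1000.0 * hits / total_bp


# Toy data: four CTCF binding sites, two of which are labeled persistent.
ALL_SITES = [
    ("chr1", 100, 200),
    ("chr1", 500, 600),
    ("chr2", 50, 150),
    ("chr2", 900, 1000),
]
PERSISTENT_SITES = [("chr1", 100, 200), ("chr2", 50, 150)]
MUTATIONS = [
    ("chr1", 120),
    ("chr1", 150),
    ("chr1", 550),
    ("chr2", 60),
    ("chr2", 2000),  # falls outside every site
]

all_rate = mutations_per_kb(ALL_SITES, MUTATIONS)
persistent_rate = mutations_per_kb(PERSISTENT_SITES, MUTATIONS)
fold = persistent_rate / all_rate
print(f"all sites: {all_rate:.1f}/kb, persistent: {persistent_rate:.1f}/kb, fold: {fold:.2f}")
```

In this toy data the persistent sites carry 15 mutations per kilobase against 10 for all sites, a 1.5-fold difference. Establishing that such a difference is significant in real cohorts would additionally require a statistical test and per-cancer-type mutation calls.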
Overall, the findings suggest that machine learning can effectively identify cancer-specific mutations at CTCF binding sites, shedding light on the role of these sites in pan-cancer genomic structure. The study underscores the need for further research into these mutations to better understand cancer etiology and identify potential therapeutic targets. It also opens new avenues for studying mutational profiles across cancer types.