PeerJ, vol. 14, 2026 (SCI-Expanded, Scopus)
Background: Anticancer peptides (ACPs) are increasingly recognized as promising therapeutic candidates due to their ability to selectively target cancer cells. However, the systematic discovery of novel ACPs, particularly from high-throughput sequencing datasets, remains hindered by technical and methodological limitations. Current prediction frameworks require pre-extracted peptide sequences, involve manual preprocessing, and yield variable results, which restricts their applicability for large-scale, data-driven discovery.
Methods: To address these limitations, we developed MetaPepticon, a modular, end-to-end pipeline for the discovery of ACP candidates from diverse sequencing inputs, including raw genomic, metagenomic, transcriptomic, and metatranscriptomic reads, as well as assembled contigs and peptide sequences. MetaPepticon automates quality control, filtering, assembly, small open reading frame prediction, ACP classification using multiple predictive algorithms, and in silico toxicity filtering.
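As an illustration of the small open reading frame prediction step, the following is a minimal sketch and not MetaPepticon's actual implementation. It assumes the standard genetic code, ATG start codons, and a six-frame scan; the function name `find_small_orfs` and the length bounds `min_aa` and `max_aa` are hypothetical and chosen only for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_small_orfs(dna, min_aa=5, max_aa=50):
    """Translate ATG-to-stop ORFs in all six frames and keep short peptides.

    min_aa and max_aa are illustrative bounds (assumptions), not values
    taken from the pipeline.
    """
    dna = dna.upper()
    peptides = []
    for strand in (dna, reverse_complement(dna)):
        for frame in range(3):
            current = None  # peptide being built, or None outside an ORF
            for i in range(frame, len(strand) - 2, 3):
                # Codons with ambiguous bases are translated as "X".
                aa = CODON_TABLE.get(strand[i:i + 3], "X")
                if current is None:
                    if aa == "M":
                        current = ["M"]
                elif aa == "*":
                    if min_aa <= len(current) <= max_aa:
                        peptides.append("".join(current))
                    current = None
                else:
                    current.append(aa)
    return peptides
```

Calling `find_small_orfs("ATGAAATTTGGGTAA", min_aa=3)` on this toy sequence yields the single peptide `MKFG`. A production tool would also need to handle alternative start codons and ORFs truncated at contig ends.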
Results: MetaPepticon enables scalable and reproducible ACP prediction from raw sequences through integration of multiple predictors within a configurable agreement framework. Applied to 41,171 microbial genomes and 4,072,884 peptides, MetaPepticon identified 10,725 moderate-agreement ACP candidates, including 4,590 novel, non-toxic peptides. MetaPepticon expands the practical applicability of existing ACP prediction methods to high-throughput sequencing data and is freely available at: https://github.com/arikanlab/MetaPepticon.
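One way to picture a configurable agreement framework is simple vote counting across predictors. The sketch below assumes each predictor reports the set of peptides it classifies as ACPs; the function name `consensus_candidates`, the `min_votes` parameter, and the predictor labels are hypothetical, and the abstract does not state how MetaPepticon defines its agreement levels.

```python
from collections import Counter


def consensus_candidates(calls, min_votes):
    """Return peptides called as ACPs by at least min_votes predictors.

    calls maps a predictor name to the collection of peptides that the
    predictor classified as ACPs (an assumed input format).
    """
    votes = Counter()
    for predicted in calls.values():
        # set() ensures each predictor contributes at most one vote per peptide.
        votes.update(set(predicted))
    return {peptide for peptide, n in votes.items() if n >= min_votes}


# Toy input with three hypothetical predictors.
calls = {
    "predictor_a": {"PEP1", "PEP2"},
    "predictor_b": {"PEP2", "PEP3"},
    "predictor_c": {"PEP1", "PEP2"},
}
moderate = consensus_candidates(calls, min_votes=2)  # PEP1 and PEP2
strict = consensus_candidates(calls, min_votes=3)    # PEP2 only
```

Raising `min_votes` trades sensitivity for precision, which is the kind of adjustment a configurable agreement threshold makes possible.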