Researchers have presented a comprehensive dataset of therapeutic peptides, comprising 58,583 experimentally validated peptides with annotated structure information. The peptides are sorted into 47 groups by therapeutic property, such as antimicrobial or glucose-regulatory, and a substantial number of them are multifunctional. The dataset is intended to support research and development of therapeutic peptides, particularly the development of computational tools for peptide discovery and the study of the ‘sequence-structure-function’ relationship.
Peptide drugs have gained prominence as therapeutic agents due to their specificity, low immunogenicity, and potency. Understanding therapeutic peptides is crucial for accelerating drug development, because they follow the ‘sequence-structure-function’ paradigm and often show moonlighting behavior, in which a single peptide performs more than one function. Various peptide databases have been established to consolidate peptides with specific functions, but existing databases pay little attention to multifunctional peptides and to structural information.
The dataset compiled in this study comprises 58,583 experimentally validated therapeutic peptides, of which 21,130 are multifunctional. Structural information for these peptides was obtained from the Protein Data Bank and through computational tools such as AlphaFold2. The dataset is described as the most comprehensive collection of therapeutic peptides available, offering insights into the sequence, structure, and function relationships that are crucial for peptide drug design.
The dataset classifies peptides into 15 major categories and 47 subcategories based on their therapeutic properties. Notably, a peptide is designated as multifunctional only when it exhibits at least two distinct functions that are not in a subordinate relationship, that is, neither function is a subcategory of the other.
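The multifunctionality rule can be illustrated with a minimal sketch. The parent map below is hypothetical: apart from "antimicrobial" and "glucose-regulatory", the label names are invented for illustration and are not taken from the dataset, and the study's actual labeling procedure may differ.

```python
# Hypothetical parent map: subcategory -> major category.
# Label names are illustrative assumptions, not the dataset's real taxonomy.
PARENT = {
    "antibacterial": "antimicrobial",
    "antifungal": "antimicrobial",
}


def is_subordinate(label, other):
    """Return True if `label` is a descendant of `other` in PARENT."""
    node = PARENT.get(label)
    while node is not None:
        if node == other:
            return True
        node = PARENT.get(node)
    return False


def independent_functions(labels):
    """Keep only the most specific labels: drop any label that is an
    ancestor of another label annotated on the same peptide."""
    unique = set(labels)
    return sorted(
        label
        for label in unique
        if not any(is_subordinate(other, label) for other in unique)
    )


def is_multifunctional(labels):
    """At least two functions with no subordinate relationship between them."""
    return len(independent_functions(labels)) >= 2
```

Under these assumptions, a peptide annotated as both "antibacterial" and "antimicrobial" counts as having a single function, because one label is subordinate to the other, while one annotated as "antibacterial" and "glucose-regulatory" counts as multifunctional.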
The dataset is available on FigShare and provides detailed information on each peptide's sequence, functions, modifications, origin, secondary structure, and tertiary structure. Its completeness and consensus were validated to ensure data quality and consistency, and the distributions of peptide lengths, origins, and functional categories were analyzed to demonstrate its comprehensiveness.
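A distribution analysis of this kind can be sketched in a few lines. The records below are toy data with made-up sequences and labels, and the field names (`sequence`, `origin`, `functions`) are assumptions, since the actual file layout of the FigShare release is not described here.

```python
from collections import Counter

# Toy records: sequences, origins, and labels are invented for illustration.
# Field names are assumptions, not the dataset's real column names.
records = [
    {"sequence": "GIGKFLHSAK", "origin": "natural", "functions": ["antimicrobial"]},
    {"sequence": "KWKLFKKI", "origin": "synthetic", "functions": ["antimicrobial", "anticancer"]},
    {"sequence": "HAEGTFTSDV", "origin": "natural", "functions": ["glucose-regulatory"]},
]


def summarize(rows):
    """Count peptides by length and origin, and count function annotations."""
    lengths = Counter(len(row["sequence"]) for row in rows)
    origins = Counter(row["origin"] for row in rows)
    functions = Counter(f for row in rows for f in row["functions"])
    return lengths, origins, functions


lengths, origins, functions = summarize(records)
```

A multifunctional peptide contributes once to each of its function counts, so the function totals can exceed the number of peptides.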
Researchers can use this dataset for a range of applications, including building computational pipelines for therapeutic peptide discovery, exploring sequence-structure-function relationships, and repurposing peptide drugs.