AggrescanAI: Prediction of Aggregation-Prone Regions Using Contextualized Embeddings
| Main authors: | , , , , , |
|---|---|
| Format: | Preprint |
| Language: | en_US |
| Published: | Journal of Molecular Biology, 2026 |
| Subjects: | |
| Online access: | https://hdl.handle.net/20.500.14769/5227 https://doi.org/10.1016/j.jmb.2026.169643 |
| Contributed by: | |
| Summary: | Protein aggregation plays a central role in the pathogenesis of many neurodegenerative diseases and poses major challenges in protein engineering. A key driver of this process is the presence of aggregation-prone regions (APRs) within protein sequences. We present AggrescanAI, a deep learning-based tool that predicts residue-level aggregation propensity directly from sequence. It leverages contextual embeddings from the ProtT5 protein language model, which captures rich information implicitly encoded in the sequence, without requiring structural data. The model was trained on a set of experimentally annotated APRs, expanded via homology transfer, evaluated by cross-validation, and validated with an external benchmark. AggrescanAI outperforms state-of-the-art predictors and captures aggregation shifts induced by pathogenic mutations. To facilitate accessibility, we provide a user-friendly and fully open Google Colab notebook: https://gitlab.com/bioinformatics-fil/aggrescanai. AggrescanAI represents a new generation of sequence-based aggregation predictors, powered by deep learning and protein language models. |
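
The summary describes residue-level aggregation propensity scores and the aggregation-prone regions (APRs) they indicate. The sketch below illustrates one generic way such per-residue scores could be grouped into contiguous regions. It is not AggrescanAI's own post-processing: the function name `find_aprs`, the `threshold` of 0.5, the `min_len` of 5 residues, and the toy scores are all hypothetical values chosen for illustration.

```python
from typing import List, Tuple


def find_aprs(
    scores: List[float],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the paper
    min_len: int = 5,        # hypothetical minimum region length
) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) spans where every score >= threshold."""
    regions: List[Tuple[int, int]] = []
    start = None  # position where the current candidate region began
    for i, score in enumerate(scores, start=1):
        if score >= threshold:
            if start is None:
                start = i
        else:
            # The run ended at position i - 1; keep it only if long enough.
            if start is not None and i - start >= min_len:
                regions.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(scores) - start + 1 >= min_len:
        regions.append((start, len(scores)))
    return regions


# Toy per-residue scores (invented for this example, not model output).
toy_scores = [0.1, 0.2, 0.8, 0.9, 0.7, 0.9, 0.8, 0.3,
              0.6, 0.7, 0.1, 0.9, 0.9, 0.9, 0.9, 0.9]
print(find_aprs(toy_scores))  # [(3, 7), (12, 16)]
```

The short run at positions 9 to 10 is discarded because it falls below the assumed minimum length. Real scores would come from the predictor itself, and a suitable cutoff would have to be taken from the tool's documentation.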