Expertise
The Toth-Petroczy group comprises an interdisciplinary team of computational biologists, computer scientists, molecular biologists, biochemists and biophysicists. The group aims to understand protein sequence space in the light of evolution and erroneous protein production, with a focus on intrinsically disordered protein regions (IDRs).
The lab's two main research directions are:
1) The sequence-to-function paradigm of intrinsically disordered regions and condensate-forming proteins.
2) Phenotypic mutations in evolution and diseases [1].
The group takes an interdisciplinary approach: it develops novel mechanistic [2,3] and machine learning/deep learning algorithms that draw on evolutionary and/or structural information [4], and it performs experimental validation and high-throughput screening [5]. Specifically, the group has designed an alignment-free algorithm to assess homology between unalignable IDR sequences (SHARK-dive) [6] and developed a method to identify conserved motifs within a set of homologous IDR sequences (SHARK-capture). The group has also developed the CrowDsourcing Condensate Database and Encyclopedia (CD-CODE.org) [7], which catalogues known condensates and their constituent proteins, and trained a predictor of proteins involved in condensates (PICNIC.cd-code.org) that applies across organisms [8].
References:
- Romero Romero et al. Phenotypic mutations contribute to protein diversity and shape protein evolution. Protein Sci. 2022;31(9):e4397. https://doi.org/10.1002/pro.4397
- Landerer et al. Fitness Effects of Phenotypic Mutations at Proteome-Scale Reveal Optimality of Translation Machinery. Mol Biol Evol. 2024;41(3):msae048. https://doi.org/10.1093/molbev/msae048
- Landerer et al. deTELpy: Python package for high-throughput detection of amino acid substitutions in mass spectrometry datasets. Bioinformatics. 2024;40(7):btae424. https://doi.org/10.1093/bioinformatics/btae424
- Luppino et al. DeMAG predicts the effects of variants in clinically actionable genes by integrating structural and evolutionary epistatic features. Nat Commun. 2023;14(1). https://doi.org/10.1038/s41467-023-37661-z
- Romero Romero et al. Environment modulates protein heterogeneity through transcriptional and translational stop codon readthrough. Nat Commun. 2024;15(1):4446. https://doi.org/10.1038/s41467-024-48387-x
- Chow et al. SHARK enables sensitive detection of evolutionary homologs and functional analogs in unalignable and disordered sequences. Proc Natl Acad Sci. 2024;121(42):e2401622121. https://doi.org/10.1073/pnas.2401622121
- Rostam et al. CD-CODE: crowdsourcing condensate database and encyclopedia. Nat Methods. 2023;20(5):673-676. https://doi.org/10.1038/s41592-023-01831-0
- Hadarovich et al. PICNIC accurately predicts condensate-forming proteins regardless of their structural disorder across organisms. bioRxiv. Published online June 5, 2023. https://doi.org/10.1101/2023.06.01.543229