LIVIVO - The Search Portal for Life Sciences


Search results

Results 1-10 of 263

  1. Article ; Online: Letter by the ISCB President.

    Orengo, Christine

    Bioinformatics advances

    2021  Volume 1, Issue 1, Page(s) vbab002

    Language English
    Publishing date 2021-05-12
    Publishing country England
    Document type Journal Article
    ISSN 2635-0041
    ISSN (online) 2635-0041
    DOI 10.1093/bioadv/vbab002
    Database MEDical Literature Analysis and Retrieval System OnLINE


  2. Article ; Online: FunPredCATH: An ensemble method for predicting protein function using CATH.

    Bonello, Joseph / Orengo, Christine

    Biochimica et biophysica acta. Proteins and proteomics

    2023  Volume 1872, Issue 2, Page(s) 140985

    Abstract Motivation: The growth of unannotated proteins in UniProt increases at a very high rate every year due to more efficient sequencing methods. However, the experimental annotation of proteins is a lengthy and expensive process. Using computational techniques to narrow the search can speed up the process by providing highly specific Gene Ontology (GO) terms.
    Methodology: We propose an ensemble approach that combines three generic base predictors that predict Gene Ontology (BP, CC and MF) terms from sequences across different species. We train our models on UniProtGOA annotation data and use the CATH domain resources to identify the protein families. We then calculate a score based on the prevalence of individual GO terms in the functional families that is then used as an indicator of confidence when assigning the GO term to an uncharacterised protein.
    Methods: In the ensemble, we use a statistics-based method that scores the occurrence of GO terms in a CATH FunFam against a background set of proteins annotated by the same GO term. We also developed a set-based method that uses Set Intersection and Set Union to score the occurrence of GO terms within the same CATH FunFam. Finally, we also use FunFams-Plus, a predictor method developed by the Orengo Group at UCL to predict GO terms for uncharacterised proteins in the CAFA3 challenge.
    Evaluation: We evaluated the methods against the CAFA3 benchmark and DomFun. We used the Precision, Recall and F
    Contributions: FunPredCATH compares well with other prediction methods on CAFA3, and the ensemble approach outperforms the base methods. We show that non-IEA models obtain higher F
    MeSH term(s) Databases, Protein ; Proteins/metabolism ; Molecular Sequence Annotation ; Sequence Analysis, Protein/methods ; Gene Ontology
    Chemical Substances Proteins
    Language English
    Publishing date 2023-12-19
    Publishing country Netherlands
    Document type Journal Article
    ZDB-ID 2918798-9
    ISSN 1878-1454 ; 1570-9639
    ISSN (online) 1878-1454
    ISSN 1570-9639
    DOI 10.1016/j.bbapap.2023.140985
    Database MEDical Literature Analysis and Retrieval System OnLINE
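
The abstract above mentions a set-based method that scores GO terms within a CATH FunFam using set intersection and set union, but it does not give the formula. The following is a minimal sketch of one plausible reading, a Jaccard-style overlap between a family's members and the proteins annotated with each term. The function name `jaccard_term_scores`, the input layout, and the protein identifiers are all hypothetical, the GO identifiers are placeholders, and this is not the published FunPredCATH scoring.

```python
def jaccard_term_scores(family_members, term_to_proteins):
    """Score each GO term by |family ∩ annotated| / |family ∪ annotated|.

    family_members: iterable of protein ids in one functional family (assumed input).
    term_to_proteins: dict mapping a GO term id to the ids of proteins annotated with it.
    Terms with no annotated member in the family are left out of the result.
    """
    family = set(family_members)
    scores = {}
    for term, proteins in term_to_proteins.items():
        annotated = set(proteins)
        overlap = family & annotated   # set intersection
        union = family | annotated     # set union
        if overlap:
            scores[term] = len(overlap) / len(union)
    return scores


# Toy data, invented purely for illustration.
family = {"P1", "P2", "P3", "P4"}
annotations = {
    "GO:0000001": {"P1", "P2", "P3"},  # three of four members carry this term
    "GO:0000002": {"P4", "Q9"},        # one member plus one outside protein
    "GO:0000003": {"Q7"},              # no member carries this term
}
scores = jaccard_term_scores(family, annotations)
```

A score near 1 means the term's annotated proteins and the family's members are almost the same set; a real predictor would still need a threshold or calibration step, which the abstract does not describe.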


  3. Article ; Online: Large-scale clustering of AlphaFold2 3D models shines light on the structure and function of proteins.

    Bordin, Nicola / Lau, Andy M / Orengo, Christine

    Molecular cell

    2023  Volume 83, Issue 22, Page(s) 3950–3952

    Abstract Two recent studies exploited ultra-fast structural aligners and deep-learning approaches to cluster the protein structure space in the AlphaFold Database. Barrio-Hernandez et al.
    MeSH term(s) Cluster Analysis ; Databases, Factual
    Language English
    Publishing date 2023-11-16
    Publishing country United States
    Document type Journal Article
    ZDB-ID 1415236-8
    ISSN 1097-4164 ; 1097-2765
    ISSN (online) 1097-4164
    ISSN 1097-2765
    DOI 10.1016/j.molcel.2023.10.039
    Database MEDical Literature Analysis and Retrieval System OnLINE


  4. Article ; Online: The opportunities and challenges posed by the new generation of deep learning-based protein structure predictors.

    Varadi, Mihaly / Bordin, Nicola / Orengo, Christine / Velankar, Sameer

    Current opinion in structural biology

    2023  Volume 79, Page(s) 102543

    Abstract The function of proteins can often be inferred from their three-dimensional structures. Experimental structural biologists spent decades studying these structures, but the accelerated pace of protein sequencing continuously increases the gaps between sequences and structures. The early 2020s saw the advent of a new generation of deep learning-based protein structure prediction tools that offer the potential to predict structures based on any number of protein sequences. In this review, we give an overview of the impact of this new generation of structure prediction tools, with examples of the impacted field in the life sciences. We discuss the novel opportunities and new scientific and technical challenges these tools present to the broader scientific community. Finally, we highlight some potential directions for the future of computational protein structure prediction.
    MeSH term(s) Deep Learning ; Computational Biology/methods ; Proteins/chemistry ; Amino Acid Sequence
    Chemical Substances Proteins
    Language English
    Publishing date 2023-02-18
    Publishing country England
    Document type Journal Article ; Review ; Research Support, Non-U.S. Gov't
    ZDB-ID 1068353-7
    ISSN 1879-033X ; 0959-440X
    ISSN (online) 1879-033X
    ISSN 0959-440X
    DOI 10.1016/j.sbi.2023.102543
    Database MEDical Literature Analysis and Retrieval System OnLINE


  5. Article ; Online: Enhancing missense variant pathogenicity prediction with protein language models using VariPred.

    Lin, Weining / Wells, Jude / Wang, Zeyuan / Orengo, Christine / Martin, Andrew C R

    Scientific reports

    2024  Volume 14, Issue 1, Page(s) 8136

    Abstract Computational approaches for predicting the pathogenicity of genetic variants have advanced in recent years. These methods enable researchers to determine the possible clinical impact of rare and novel variants. Historically these prediction methods used hand-crafted features based on structural, evolutionary, or physicochemical properties of the variant. In this study we propose a novel framework that leverages the power of pre-trained protein language models to predict variant pathogenicity. We show that our approach VariPred (Variant impact Predictor) outperforms current state-of-the-art methods by using an end-to-end model that only requires the protein sequence as input. Using one of the best-performing protein language models (ESM-1b), we establish a robust classifier that requires no calculation of structural features or multiple sequence alignments. We compare the performance of VariPred with other representative models including 3Cnet, Polyphen-2, REVEL, MetaLR, FATHMM and ESM variant. VariPred performs as well as, or in most cases better than these other predictors using six variant impact prediction benchmarks despite requiring only sequence data and no pre-processing of the data.
    MeSH term(s) Virulence ; Proteins/genetics ; Mutation, Missense ; Amino Acid Sequence ; Computational Biology/methods
    Chemical Substances Proteins
    Language English
    Publishing date 2024-04-07
    Publishing country England
    Document type Journal Article
    ZDB-ID 2615211-3
    ISSN 2045-2322 ; 2045-2322
    ISSN (online) 2045-2322
    ISSN 2045-2322
    DOI 10.1038/s41598-024-51489-7
    Database MEDical Literature Analysis and Retrieval System OnLINE


  6. Article ; Online: Protein diversification through post-translational modifications, alternative splicing, and gene duplication.

    Goldtzvik, Yonathan / Sen, Neeladri / Lam, Su Datt / Orengo, Christine

    Current opinion in structural biology

    2023  Volume 81, Page(s) 102640

    Abstract Proteins provide the basis for cellular function. Having multiple versions of the same protein within a single organism provides a way of regulating its activity or developing novel functions. Post-translational modifications of proteins, by means of adding/removing chemical groups to amino acids, allow for a well-regulated and controlled way of generating functionally distinct protein species. Alternative splicing is another method with which organisms possibly generate new isoforms. Additionally, gene duplication events throughout evolution generate multiple paralogs of the same genes, resulting in multiple versions of the same protein within an organism. In this review, we discuss recent advancements in the study of these three methods of protein diversification and provide illustrative examples of how they affect protein structure and function.
    MeSH term(s) Alternative Splicing ; Gene Duplication ; Evolution, Molecular ; Protein Isoforms/genetics ; Protein Processing, Post-Translational
    Chemical Substances Protein Isoforms
    Language English
    Publishing date 2023-06-23
    Publishing country England
    Document type Journal Article ; Review ; Research Support, Non-U.S. Gov't
    ZDB-ID 1068353-7
    ISSN 1879-033X ; 0959-440X
    ISSN (online) 1879-033X
    ISSN 0959-440X
    DOI 10.1016/j.sbi.2023.102640
    Database MEDical Literature Analysis and Retrieval System OnLINE


  7. Article ; Online: Exploiting protein family and protein network data to identify novel drug targets for bladder cancer.

    Adeyelu, Tolulope Tosin / Moya-Garcia, Aurelio A / Orengo, Christine

    Oncotarget

    2022  Volume 13, Page(s) 105–117

    Abstract Bladder cancer remains one of the most common forms of cancer and yet there are limited small molecule targeted therapies. Here, we present a computational platform to identify new potential targets for bladder cancer therapy. Our method initially exploited a set of known driver genes for bladder cancer combined with predicted bladder cancer genes from mutationally enriched protein domain families. We enriched this initial set of genes using protein network data to identify a comprehensive set of 323 putative bladder cancer targets. Pathway and cancer hallmarks analyses highlighted putative mechanisms in agreement with those previously reported for this cancer and revealed protein network modules highly enriched in potential drivers likely to be good targets for targeted therapies. 21 of our potential drug targets are targeted by FDA approved drugs for other diseases - some of them are known drivers or are already being targeted for bladder cancer (FGFR3, ERBB3, HDAC3, EGFR). A further 4 potential drug targets were identified by inheriting drug mappings across our in-house CATH domain functional families (FunFams). Our FunFam data also allowed us to identify drug targets in families that are less prone to side effects i.e., where structurally similar protein domain relatives are less dispersed across the human protein network. We provide information on our novel potential cancer driver genes, together with information on pathways, network modules and hallmarks associated with the predicted and known bladder cancer drivers and we highlight those drivers we predict to be likely drug targets.
    MeSH term(s) ErbB Receptors/genetics ; Humans ; Molecular Targeted Therapy ; Oncogenes ; Proteins/metabolism ; Urinary Bladder Neoplasms/drug therapy ; Urinary Bladder Neoplasms/genetics
    Chemical Substances Proteins ; ErbB Receptors (EC 2.7.10.1)
    Language English
    Publishing date 2022-01-12
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2560162-3
    ISSN 1949-2553 ; 1949-2553
    ISSN (online) 1949-2553
    ISSN 1949-2553
    DOI 10.18632/oncotarget.28175
    Database MEDical Literature Analysis and Retrieval System OnLINE


  8. Article ; Online: Dissecting peripheral protein-membrane interfaces.

    Tubiana, Thibault / Sillitoe, Ian / Orengo, Christine / Reuter, Nathalie

    PLoS computational biology

    2022  Volume 18, Issue 12, Page(s) e1010346

    Abstract Peripheral membrane proteins (PMPs) include a wide variety of proteins that have in common to bind transiently to the chemically complex interfacial region of membranes through their interfacial binding site (IBS). In contrast to protein-protein or protein-DNA/RNA interfaces, peripheral protein-membrane interfaces are poorly characterized. We collected a dataset of PMP domains representative of the variety of PMP functions: membrane-targeting domains (Annexin, C1, C2, discoidin C2, PH, PX), enzymes (PLA, PLC/D) and lipid-transfer proteins (START). The dataset contains 1328 experimental structures and 1194 AlphaFold models. We mapped the amino acid composition and structural patterns of the IBS of each protein in this dataset, and evaluated which were more likely to be found at the IBS compared to the rest of the domains' accessible surface. In agreement with earlier work we find that about two thirds of the PMPs in the dataset have protruding hydrophobes (Leu, Ile, Phe, Tyr, Trp and Met) at their IBS. The three aromatic amino acids Trp, Tyr and Phe are a hallmark of PMPs IBS regardless of whether they protrude on loops or not. This is also the case for lysines but not arginines suggesting that, unlike for Arg-rich membrane-active peptides, the less membrane-disruptive lysine is preferred in PMPs. Another striking observation was the over-representation of glycines at the IBS of PMPs compared to the rest of their surface, possibly procuring IBS loops a much-needed flexibility to insert in-between membrane lipids. The analysis of the 9 superfamilies revealed amino acid distribution patterns in agreement with their known functions and membrane-binding mechanisms. Besides revealing novel amino acids patterns at protein-membrane interfaces, our work contributes a new PMP dataset and an analysis pipeline that can be further built upon for future studies of PMPs properties, or for developing PMPs prediction tools using for example, machine learning approaches.
    MeSH term(s) Amino Acids/chemistry ; Binding Sites ; Peptides/chemistry ; Cell Membrane/chemistry
    Chemical Substances Amino Acids ; Peptides
    Language English
    Publishing date 2022-12-14
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2193340-6
    ISSN 1553-7358 ; 1553-734X
    ISSN (online) 1553-7358
    ISSN 1553-734X
    DOI 10.1371/journal.pcbi.1010346
    Database MEDical Literature Analysis and Retrieval System OnLINE
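
The abstract above compares how often each amino acid occurs at the interfacial binding site (IBS) with how often it occurs on the rest of the accessible surface. A minimal sketch of that kind of comparison is a per-residue frequency ratio, shown below. The function name `ibs_enrichment`, the input format, and the residue lists are hypothetical toy data; the authors' actual pipeline and statistics are not described in this record and may differ.

```python
from collections import Counter


def ibs_enrichment(ibs_residues, other_surface_residues):
    """Return, per amino acid, (frequency at IBS) / (frequency on the rest of the surface).

    Both arguments are lists of three-letter residue names (assumed input format).
    A ratio above 1 means the residue is over-represented at the IBS.
    """
    ibs = Counter(ibs_residues)
    rest = Counter(other_surface_residues)
    n_ibs = sum(ibs.values())
    n_rest = sum(rest.values())
    if n_ibs == 0 or n_rest == 0:
        raise ValueError("both residue lists must be non-empty")

    ratios = {}
    for aa in sorted(set(ibs) | set(rest)):
        f_ibs = ibs[aa] / n_ibs
        f_rest = rest[aa] / n_rest
        # A residue never seen off the IBS has no finite ratio in this toy version.
        ratios[aa] = f_ibs / f_rest if f_rest > 0 else float("inf")
    return ratios


# Invented residue lists, for illustration only.
ibs_toy = ["TRP", "LYS", "GLY", "TRP"]
rest_toy = ["TRP", "ASP", "GLU", "LYS", "ASP", "GLU", "ALA", "SER"]
ratios = ibs_enrichment(ibs_toy, rest_toy)
```

With real data one would normally add pseudocounts or a significance test before reading anything into a ratio, since small counts make raw ratios unstable.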


  9. Article ; Online: CATH 2024: CATH-AlphaFlow Doubles the Number of Structures in CATH and Reveals Nearly 200 New Folds.

    Waman, Vaishali P / Bordin, Nicola / Alcraft, Rachel / Vickerstaff, Robert / Rauer, Clemens / Chan, Qian / Sillitoe, Ian / Yamamori, Hazuki / Orengo, Christine

    Journal of molecular biology

    2024  Page(s) 168551

    Abstract CATH (https://www.cathdb.info) classifies domain structures from experimental protein structures in the PDB and predicted structures in the AlphaFold Database (AFDB). To cope with the scale of the predicted data, a new NextFlow workflow (CATH-AlphaFlow) has been developed to classify high-quality domains into CATH superfamilies and identify novel fold groups and superfamilies. CATH-AlphaFlow uses a novel state-of-the-art structure-based domain boundary prediction method (ChainSaw) for identifying domains in multi-domain proteins. We applied CATH-AlphaFlow to process PDB structures not classified in CATH and AFDB structures from 21 model organisms, expanding CATH by over 100%. Domains not classified in existing CATH superfamilies or fold groups were used to seed novel folds, giving 253 new folds from PDB structures (September 2023 release) and 96 from AFDB structures of proteomes of 21 model organisms. Where possible, functional annotations were obtained using (i) predictions from publicly available methods and (ii) annotations from structural relatives in AFDB/UniProt50. We also predicted functional sites and highly conserved residues. Some folds are associated with important functions such as photosynthetic acclimation (in flowering plants), iron permease activity (in fungi) and post-natal spermatogenesis (in mice). CATH-AlphaFlow will allow us to identify many more CATH relatives in the AFDB, further characterising the protein structure landscape.
    Language English
    Publishing date 2024-03-27
    Publishing country Netherlands
    Document type Journal Article
    ZDB-ID 80229-3
    ISSN 1089-8638 ; 0022-2836
    ISSN (online) 1089-8638
    ISSN 0022-2836
    DOI 10.1016/j.jmb.2024.168551
    Database MEDical Literature Analysis and Retrieval System OnLINE


  10. Article ; Online: Three-dimensional Structure Databases of Biological Macromolecules.

    Waman, Vaishali P / Orengo, Christine / Kleywegt, Gerard J / Lesk, Arthur M

    Methods in molecular biology (Clifton, N.J.)

    2022  Volume 2449, Page(s) 43–91

    Abstract Databases of three-dimensional structures of proteins (and their associated molecules) provide: (a) Curated repositories of coordinates of experimentally determined structures, including extensive metadata; for instance information about provenance, details about data collection and interpretation, and validation of results. (b) Information-retrieval tools to allow searching to identify entries of interest and provide access to them. (c) Links among databases, especially to databases of amino-acid and genetic sequences, and of protein function; and links to software for analysis of amino-acid sequence and protein structure, and for structure prediction. (d) Collections of predicted three-dimensional structures of proteins. These will become more and more important after the breakthrough in structure prediction achieved by AlphaFold2. The single global archive of experimentally determined biomacromolecular structures is the Protein Data Bank (PDB). It is managed by wwPDB, a consortium of five partner institutions: the Protein Data Bank in Europe (PDBe), the Research Collaboratory for Structural Bioinformatics (RCSB), the Protein Data Bank Japan (PDBj), the BioMagResBank (BMRB), and the Electron Microscopy Data Bank (EMDB). In addition to jointly managing the PDB repository, the individual wwPDB partners offer many tools for analysis of protein and nucleic acid structures and their complexes, including providing computer-graphic representations. Their collective and individual websites serve as hubs of the community of structural biologists, offering newsletters, reports from Task Forces, training courses, and "helpdesks," as well as links to external software. Many specialized projects are based on the information contained in the PDB. Especially important are SCOP, CATH, and ECOD, which present classifications of protein domains.
    MeSH term(s) Computational Biology ; Databases, Protein ; Protein Conformation ; Proteins/chemistry ; Software
    Chemical Substances Proteins
    Language English
    Publishing date 2022-05-04
    Publishing country United States
    Document type Journal Article
    ISSN 1940-6029
    ISSN (online) 1940-6029
    DOI 10.1007/978-1-0716-2095-3_3
    Database MEDical Literature Analysis and Retrieval System OnLINE
