LIVIVO - The Search Portal for Life Sciences

Search results

Result 1 - 10 of total 29

  1. Article ; Online: CATH 2024: CATH-AlphaFlow Doubles the Number of Structures in CATH and Reveals Nearly 200 New Folds.

    Waman, Vaishali P / Bordin, Nicola / Alcraft, Rachel / Vickerstaff, Robert / Rauer, Clemens / Chan, Qian / Sillitoe, Ian / Yamamori, Hazuki / Orengo, Christine

    Journal of molecular biology

    2024, Page(s) 168551

    Abstract CATH (https://www.cathdb.info) classifies domain structures from experimental protein structures in the PDB and predicted structures in the AlphaFold Database (AFDB). To cope with the scale of the predicted data, a new NextFlow workflow (CATH-AlphaFlow) has been developed to classify high-quality domains into CATH superfamilies and identify novel fold groups and superfamilies. CATH-AlphaFlow uses a novel state-of-the-art structure-based domain boundary prediction method (ChainSaw) for identifying domains in multi-domain proteins. We applied CATH-AlphaFlow to process PDB structures not classified in CATH and AFDB structures from 21 model organisms, expanding CATH by over 100%. Domains not classified in existing CATH superfamilies or fold groups were used to seed novel folds, giving 253 new folds from PDB structures (September 2023 release) and 96 from AFDB structures of proteomes of 21 model organisms. Where possible, functional annotations were obtained using (i) predictions from publicly available methods and (ii) annotations from structural relatives in AFDB/UniProt50. We also predicted functional sites and highly conserved residues. Some folds are associated with important functions such as photosynthetic acclimation (in flowering plants), iron permease activity (in fungi) and post-natal spermatogenesis (in mice). CATH-AlphaFlow will allow us to identify many more CATH relatives in the AFDB, further characterising the protein structure landscape.
    Language English
    Publishing date 2024-03-27
    Publishing country Netherlands
    Document type Journal Article
    ZDB-ID 80229-3
    ISSN 1089-8638 ; 0022-2836
    ISSN (online) 1089-8638
    ISSN 0022-2836
    DOI 10.1016/j.jmb.2024.168551
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Article ; Online: Three-dimensional Structure Databases of Biological Macromolecules.

    Waman, Vaishali P / Orengo, Christine / Kleywegt, Gerard J / Lesk, Arthur M

    Methods in molecular biology (Clifton, N.J.)

    2022  Volume 2449, Page(s) 43–91

    Abstract Databases of three-dimensional structures of proteins (and their associated molecules) provide: (a) Curated repositories of coordinates of experimentally determined structures, including extensive metadata; for instance information about provenance, details about data collection and interpretation, and validation of results. (b) Information-retrieval tools to allow searching to identify entries of interest and provide access to them. (c) Links among databases, especially to databases of amino-acid and genetic sequences, and of protein function; and links to software for analysis of amino-acid sequence and protein structure, and for structure prediction. (d) Collections of predicted three-dimensional structures of proteins. These will become more and more important after the breakthrough in structure prediction achieved by AlphaFold2. The single global archive of experimentally determined biomacromolecular structures is the Protein Data Bank (PDB). It is managed by wwPDB, a consortium of five partner institutions: the Protein Data Bank in Europe (PDBe), the Research Collaboratory for Structural Bioinformatics (RCSB), the Protein Data Bank Japan (PDBj), the BioMagResBank (BMRB), and the Electron Microscopy Data Bank (EMDB). In addition to jointly managing the PDB repository, the individual wwPDB partners offer many tools for analysis of protein and nucleic acid structures and their complexes, including providing computer-graphic representations. Their collective and individual websites serve as hubs of the community of structural biologists, offering newsletters, reports from Task Forces, training courses, and "helpdesks," as well as links to external software. Many specialized projects are based on the information contained in the PDB. Especially important are SCOP, CATH, and ECOD, which present classifications of protein domains.
    MeSH term(s) Computational Biology ; Databases, Protein ; Protein Conformation ; Proteins/chemistry ; Software
    Chemical Substances Proteins
    Language English
    Publishing date 2022-05-04
    Publishing country United States
    Document type Journal Article
    ISSN 1940-6029
    ISSN (online) 1940-6029
    DOI 10.1007/978-1-0716-2095-3_3
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article: Structural and energetic analyses of SARS-CoV-2

    Lam, Su Datt / Waman, Vaishali P / Fraternali, Franca / Orengo, Christine / Lees, Jonathan

    Computational and structural biotechnology journal

    2022  Volume 20, Page(s) 6302–6316

    Abstract Coronavirus disease 2019 (COVID-19) caused by SARS-CoV-2 is an ongoing pandemic that causes significant health/socioeconomic burden. Variants of concern (VOCs) have emerged affecting transmissibility, disease severity and re-infection risk. Studies suggest that the -
    Language English
    Publishing date 2022-11-07
    Publishing country Netherlands
    Document type Journal Article
    ZDB-ID 2694435-2
    ISSN 2001-0370
    DOI 10.1016/j.csbj.2022.11.004
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Article ; Online: CATHe: detection of remote homologues for CATH superfamilies using embeddings from protein language models.

    Nallapareddy, Vamsi / Bordin, Nicola / Sillitoe, Ian / Heinzinger, Michael / Littmann, Maria / Waman, Vaishali P / Sen, Neeladri / Rost, Burkhard / Orengo, Christine

    Bioinformatics (Oxford, England)

    2023  Volume 39, Issue 1

    Abstract Motivation: CATH is a protein domain classification resource that exploits an automated workflow of structure and sequence comparison alongside expert manual curation to construct a hierarchical classification of evolutionary and structural relationships. The aim of this study was to develop algorithms for detecting remote homologues missed by state-of-the-art hidden Markov model (HMM)-based approaches. The method developed (CATHe) combines a neural network with sequence representations obtained from protein language models. It was assessed using a dataset of remote homologues having less than 20% sequence identity to any domain in the training set.
    Results: The CATHe models trained on 1773 largest and 50 largest CATH superfamilies had an accuracy of 85.6 ± 0.4% and 98.2 ± 0.3%, respectively. As a further test of the power of CATHe to detect more remote homologues missed by HMMs derived from CATH domains, we used a dataset consisting of protein domains that had annotations in Pfam, but not in CATH. By using highly reliable CATHe predictions (expected error rate <0.5%), we were able to provide CATH annotations for 4.62 million Pfam domains. For a subset of these domains from Homo sapiens, we structurally validated 90.86% of the predictions by comparing their corresponding AlphaFold2 structures with structures from the CATH superfamilies to which they were assigned.
    Availability and implementation: The code for the developed models is available on https://github.com/vam-sin/CATHe, and the datasets developed in this study can be accessed on https://zenodo.org/record/6327572.
    Supplementary information: Supplementary data are available at Bioinformatics online.
    MeSH term(s) Humans ; Sequence Homology, Amino Acid ; Proteins/chemistry ; Algorithms ; Databases, Protein
    Chemical Substances Proteins
    Language English
    Publishing date 2023-01-17
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 1422668-6
    ISSN 1367-4811 ; 1367-4803
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/btad029
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Article ; Online: KinFams: De-Novo Classification of Protein Kinases Using CATH Functional Units.

    Adeyelu, Tolulope / Bordin, Nicola / Waman, Vaishali P / Sadlej, Marta / Sillitoe, Ian / Moya-Garcia, Aurelio A / Orengo, Christine A

    Biomolecules

    2023  Volume 13, Issue 2

    Abstract Protein kinases are important targets for treating human disorders, and they are the second most targeted families after G-protein coupled receptors. Several resources provide classification of kinases into evolutionary families (based on sequence homology); however, very few systematically classify functional families (FunFams) comprising evolutionary relatives that share similar functional properties. We have developed the FunFam-MARC (Multidomain ARchitecture-based Clustering) protocol, which uses multi-domain architectures of protein kinases and specificity-determining residues for functional family classification. FunFam-MARC predicts 2210 kinase functional families (KinFams), which have increased functional coherence, in terms of EC annotations, compared to the widely used KinBase classification. Our protocol provides a comprehensive classification for kinase sequences from >10,000 organisms. We associate human KinFams with diseases and drugs and identify 28 druggable human KinFams, i.e., enriched in clinically approved drugs. Since relatives in the same druggable KinFam tend to be structurally conserved, including the drug-binding site, these KinFams may be valuable for shortlisting therapeutic targets. Information on the human KinFams and associated 3D structures from AlphaFold2 are provided via our CATH FTP website and Zenodo. This gives the domain structure representative of each KinFam together with information on any drug compounds available. For 32% of the KinFams, we provide information on highly conserved residue sites that may be associated with specificity.
    MeSH term(s) Humans ; Protein Kinases/metabolism ; Proteins/chemistry ; Databases, Protein ; Sequence Homology, Amino Acid
    Chemical Substances Protein Kinases (EC 2.7.-) ; Proteins
    Language English
    Publishing date 2023-02-02
    Publishing country Switzerland
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2701262-1
    ISSN 2218-273X
    ISSN (online) 2218-273X
    DOI 10.3390/biom13020277
    Database MEDical Literature Analysis and Retrieval System OnLINE

  6. Article ; Online: Computational approaches to predict protein functional families and functional sites.

    Rauer, Clemens / Sen, Neeladri / Waman, Vaishali P / Abbasian, Mahnaz / Orengo, Christine A

    Current opinion in structural biology

    2021  Volume 70, Page(s) 108–122

    Abstract Understanding the mechanisms of protein function is indispensable for many biological applications, such as protein engineering and drug design. However, experimental annotations are sparse, and therefore, theoretical strategies are needed to fill the gap. Here, we present the latest developments in building functional subclassifications of protein superfamilies and using evolutionary conservation to detect functional determinants, for example, catalytic-, binding- and specificity-determining residues important for delineating the functional families. We also briefly review other features exploited for functional site detection and new machine learning strategies for combining multiple features.
    MeSH term(s) Binding Sites ; Biological Evolution ; Catalysis ; Computational Biology ; Humans ; Machine Learning ; Protein Engineering ; Proteins/genetics
    Chemical Substances Proteins
    Language English
    Publishing date 2021-07-02
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't ; Review
    ZDB-ID 1068353-7
    ISSN 1879-033X ; 0959-440X
    ISSN (online) 1879-033X
    ISSN 0959-440X
    DOI 10.1016/j.sbi.2021.05.012
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article ; Online: AlphaFold2 reveals commonalities and novelties in protein structure space for 21 model organisms.

    Bordin, Nicola / Sillitoe, Ian / Nallapareddy, Vamsi / Rauer, Clemens / Lam, Su Datt / Waman, Vaishali P / Sen, Neeladri / Heinzinger, Michael / Littmann, Maria / Kim, Stephanie / Velankar, Sameer / Steinegger, Martin / Rost, Burkhard / Orengo, Christine

    Communications biology

    2023  Volume 6, Issue 1, Page(s) 160

    Abstract Deep-learning (DL) methods like DeepMind's AlphaFold2 (AF2) have led to substantial improvements in protein structure prediction. We analyse confident AF2 models from 21 model organisms using a new classification protocol (CATH-Assign) which exploits novel DL methods for structural comparison and classification. Of ~370,000 confident models, 92% can be assigned to 3253 superfamilies in our CATH domain superfamily classification. The remaining cluster into 2367 putative novel superfamilies. Detailed manual analysis of 618 of these, having at least one human relative, reveals extremely remote homologies and further unusual features. Only 25 novel superfamilies could be confirmed. Although most models map to existing superfamilies, AF2 domains expand CATH by 67% and increase the number of unique 'global' folds by 36% and will provide valuable insights on structure-function relationships. CATH-Assign will harness the huge expansion in structural data provided by DeepMind to rationalise evolutionary changes driving functional divergence.
    MeSH term(s) Humans ; Furylfuramide ; Databases, Protein ; Proteins/chemistry
    Chemical Substances Furylfuramide (054NR2135Y) ; Proteins
    Language English
    Publishing date 2023-02-08
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ISSN 2399-3642
    ISSN (online) 2399-3642
    DOI 10.1038/s42003-023-04488-9
    Database MEDical Literature Analysis and Retrieval System OnLINE

  8. Article ; Online: AlphaFold2 reveals commonalities and novelties in protein structure space for 21 model organisms

    Nicola Bordin / Ian Sillitoe / Vamsi Nallapareddy / Clemens Rauer / Su Datt Lam / Vaishali P. Waman / Neeladri Sen / Michael Heinzinger / Maria Littmann / Stephanie Kim / Sameer Velankar / Martin Steinegger / Burkhard Rost / Christine Orengo

    Communications Biology, Vol 6, Iss 1, Pp 1-

    2023  Volume 12

    Abstract A new protein domain classification protocol incorporating deep learning strategies for detecting sequence and structure similarities between domains is used to systematically study and analyse the predicted AlphaFold2 structural models for proteins of 21 organisms.
    Keywords Biology (General) ; QH301-705.5
    Language English
    Publishing date 2023-02-01
    Publisher Nature Portfolio
    Document type Article ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  9. Article ; Online: Insertions in the SARS-CoV-2 Spike N-Terminal Domain May Aid COVID-19 Transmission

    Lam, Su Datt / Waman, Vaishali P / Orengo, Christine / Lees, Jonathan

    bioRxiv

    Abstract Coronavirus disease 2019 (COVID-19) caused by SARS-CoV-2 is an ongoing pandemic that causes significant health/socioeconomic burden. Variants of concern (VOCs) have emerged which may affect transmissibility, disease severity and re-infection risk. Most studies focus on the receptor-binding domain (RBD) of the Spike protein. However, some studies suggest that the Spike N-terminal domain (NTD) may have a role in facilitating virus entry via sialic-acid receptor binding. Furthermore, most VOCs include novel NTD variants. Recent analyses demonstrated that NTD insertions in VOCs tend to lie close to loop regions likely to be involved in binding sialic acids. We extended the structural characterisation of these putative sugar binding pockets and explored whether variants could enhance the binding to sialic acids and therefore to the host membrane, thereby contributing to increased transmissibility. We found that recent NTD insertions in VOCs (i.e., Gamma, Delta and Omicron variants) and emerging variants of interest (VOIs) (i.e., Iota, Lambda, Theta variants) frequently lie close to known and putative sugar-binding pockets. For some variants, including the recent Omicron VOC, we find increases in predicted sialic acid binding energy, compared to the original SARS-CoV-2, which may contribute to increased transmission. We examined the similarity of NTD across a range of related Betacoronaviruses to determine whether the putative sugar-binding pockets are sufficiently similar to be exploited in drug design. Despite global sequence and structure similarity, most sialic-acid binding pockets of NTD vary across related coronaviruses. Typically, SARS-CoV-2 possesses additional loops in these pockets that increase contact with polysaccharides. Our work suggests ongoing evolutionary tuning of the sugar-binding pockets in the virus. 
    Whilst three of the pockets are too structurally variable to be amenable to pan Betacoronavirus drug design, we detected a fourth pocket that is highly structurally conserved and could therefore be investigated in pursuit of a generic drug. Our structure-based analyses help rationalise the effects of VOCs and provide hypotheses for experiments. For example, the Omicron variant, which has increased binding to sialic acids in pocket 3, has a rather unique insertion near pocket 3. Our work suggests a strong need for experimental monitoring of VOC changes in NTD.
    Keywords covid19
    Language English
    Publishing date 2021-12-07
    Publisher Cold Spring Harbor Laboratory
    Document type Article ; Online
    DOI 10.1101/2021.12.06.471394
    Database COVID19

  10. Article ; Online: Genetic diversity and evolution of dengue virus serotype 3: A comparative genomics study.

    Waman, Vaishali P / Kale, Mohan M / Kulkarni-Kale, Urmila

    Infection, genetics and evolution : journal of molecular epidemiology and evolutionary genetics in infectious diseases

    2017  Volume 49, Page(s) 234–240

    Abstract Dengue virus serotype 3 (DENV-3), one of the four serotypes of Dengue viruses, is geographically diverse. There are five distinct genotypes (I-V) of DENV-3. Emerging strains and lineages of DENV-3 are increasingly being reported. Availability of genomic data for DENV-3 strains provides an opportunity to study its population structure. Complete genome sequences are available for 860 strains of four genotypes (I, II, III and V) isolated worldwide and were analyzed using population genetics and evolutionary approaches to map the landscape of genomic diversity. The DENV-3 population is observed to be stratified into five major subpopulations. Genotypes I and II formed independent subpopulations while genotype III is subdivided into three subpopulations (GIII-a, GIII-b and GIII-c) and is therefore heterogeneous. Genotypes I, II and GIII-a subpopulations comprise Asian strains whereas GIII-c comprises American strains. The GIII-b subpopulation consists mainly of American strains along with a few strains from Sri Lanka. Genetic admixture is predominantly observed in Sri Lankan strains of genotype III and all strains of genotype V. Inter-genotype recombination was observed to occur in the non-structural region of several Asian strains whereas the extent of recombination was limited in American strains. Significant positive selection was found to be operational on all genes and observed to be the main driving force of genetic diversity. Positive selection was strongly operational on the branches leading to Asian genotypes and helped to delineate the genetic differences between Asian and American lineages. Thus, inter-genotype recombination, migration and adaptive evolution are the major determinants of evolution of DENV-3.
    MeSH term(s) Asia/epidemiology ; Biological Evolution ; Dengue/epidemiology ; Dengue/virology ; Dengue Virus/classification ; Dengue Virus/genetics ; Dengue Virus/isolation & purification ; Genetic Variation ; Genome, Viral ; Genotype ; Humans ; Molecular Epidemiology ; North America/epidemiology ; Phylogeny ; Phylogeography ; Recombination, Genetic ; Selection, Genetic ; Serogroup ; South America/epidemiology
    Language English
    Publishing date 2017-04
    Publishing country Netherlands
    Document type Comparative Study ; Journal Article
    ZDB-ID 2037068-4
    ISSN 1567-7257 ; 1567-1348
    ISSN (online) 1567-7257
    ISSN 1567-1348
    DOI 10.1016/j.meegid.2017.01.022
    Database MEDical Literature Analysis and Retrieval System OnLINE
