LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 10 of 256 total

  1. Article ; Online: Using alternative SMILES representations to identify novel functional analogues in chemical similarity vector searches.

    Kosonocky, Clayton W / Feller, Aaron L / Wilke, Claus O / Ellington, Andrew D

    Patterns (New York, N.Y.)

    2023  Volume 4, Issue 12, Page(s) 100865

    Abstract Chemical similarity searches are a widely used family of
    Language English
    Publishing date 2023-10-30
    Publishing country United States
    Document type Journal Article
    ISSN 2666-3899
    ISSN (online) 2666-3899
    DOI 10.1016/j.patter.2023.100865
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Article: Distinct horizontal transfer mechanisms for type I and type V CRISPR-associated transposons.

    Hu, Kuang / Chia-Wei, Chou / Wilke, Claus O / Finkelstein, Ilya J

    bioRxiv : the preprint server for biology

    2023  

    Abstract CRISPR-associated transposons (CASTs) co-opt CRISPR-Cas proteins and Tn7-family transposons for RNA-guided vertical and horizontal transmission. CASTs encode minimal CRISPR arrays but can't acquire new spacers. Here, we show that CASTs instead co-opt defense-associated CRISPR arrays for horizontal transmission. A bioinformatic analysis shows that all CAST sub-types co-occur with defense-associated CRISPR-Cas systems. Using an
    Language English
    Publishing date 2023-07-11
    Publishing country United States
    Document type Preprint
    DOI 10.1101/2023.03.03.531003
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article: BACPHLIP: predicting bacteriophage lifestyle from conserved protein domains.

    Hockenberry, Adam J / Wilke, Claus O

    PeerJ

    2021  Volume 9, Page(s) e11396

    Abstract Bacteriophages are broadly classified into two distinct lifestyles: temperate and virulent. Temperate phages are capable of a latent phase of infection within a host cell (lysogenic cycle), whereas virulent phages directly replicate and lyse host cells upon infection (lytic cycle). Accurate lifestyle identification is critical for determining the role of individual phage species within ecosystems and their effect on host evolution. Here, we present BACPHLIP, a BACterioPHage LIfestyle Predictor. BACPHLIP detects the presence of a set of conserved protein domains within an input genome and uses this data to predict lifestyle via a Random Forest classifier that was trained on a dataset of 634 phage genomes. On an independent test set of 423 phages, BACPHLIP has an accuracy of 98%, greatly exceeding that of the previously existing tools (79%). BACPHLIP is freely available on GitHub (https://github.com/adamhockenberry/bacphlip) and the code used to build and test the classifier is provided in a separate repository (https://github.com/adamhockenberry/bacphlip-model-dev) for users wishing to interrogate and re-train the underlying classification model.
    Language English
    Publishing date 2021-05-06
    Publishing country United States
    Document type Journal Article
    ZDB-ID 2703241-3
    ISSN 2167-8359
    DOI 10.7717/peerj.11396
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Article ; Online: Predicting an epidemic trajectory is difficult.

    Wilke, Claus O / Bergstrom, Carl T

    Proceedings of the National Academy of Sciences of the United States of America

    2020  Volume 117, Issue 46, Page(s) 28549–28551

    MeSH term(s) Epidemics ; Forecasting
    Keywords covid19
    Language English
    Publishing date 2020-11-03
    Publishing country United States
    Document type Journal Article ; Comment
    ZDB-ID 209104-5
    ISSN 1091-6490 ; 0027-8424
    ISSN (online) 1091-6490
    ISSN 0027-8424
    DOI 10.1073/pnas.2020200117
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Article: Semantic search using protein large language models detects class II microcins in bacterial genomes.

    Kulikova, Anastasiya V / Parker, Jennifer K / Davies, Bryan W / Wilke, Claus O

    bioRxiv : the preprint server for biology

    2023  

    Abstract Class II microcins are antimicrobial peptides that have shown some potential as novel antibiotics. However, to date only ten class II microcins have been described, and discovery of novel microcins has been hampered by their short length and high sequence divergence. Here, we ask if we can use numerical embeddings generated by protein large language models to detect microcins in bacterial genome assemblies and whether this method can outperform sequence-based methods such as BLAST. We find that embeddings detect known class II microcins much more reliably than does BLAST and that any two microcins tend to have a small distance in embedding space even though they typically are highly diverged at the sequence level. In datasets of
    Language English
    Publishing date 2023-11-15
    Publishing country United States
    Document type Preprint
    DOI 10.1101/2023.11.15.567263
    Database MEDical Literature Analysis and Retrieval System OnLINE

  6. Article ; Online: Using machine learning to predict the effects and consequences of mutations in proteins.

    Diaz, Daniel J / Kulikova, Anastasiya V / Ellington, Andrew D / Wilke, Claus O

    Current opinion in structural biology

    2023  Volume 78, Page(s) 102518

    Abstract Machine and deep learning approaches can leverage the increasingly available massive datasets of protein sequences, structures, and mutational effects to predict variants with improved fitness. Many different approaches are being developed, but systematic benchmarking studies indicate that even though the specifics of the machine learning algorithms matter, the more important constraint comes from the data availability and quality utilized during training. In cases where little experimental data are available, unsupervised and self-supervised pre-training with generic protein datasets can still perform well after subsequent refinement via hybrid or transfer learning approaches. Overall, recent progress in this field has been staggering, and machine learning approaches will likely play a major role in future breakthroughs in protein biochemistry and engineering.
    MeSH term(s) Neural Networks, Computer ; Machine Learning ; Algorithms ; Amino Acid Sequence ; Mutation
    Language English
    Publishing date 2023-01-03
    Publishing country England
    Document type Journal Article ; Review ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't
    ZDB-ID 1068353-7
    ISSN 1879-033X ; 0959-440X
    ISSN (online) 1879-033X
    ISSN 0959-440X
    DOI 10.1016/j.sbi.2022.102518
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article ; Online: Mining Patents with Large Language Models Elucidates the Chemical Function Landscape.

    Kosonocky, Clayton W / Wilke, Claus O / Marcotte, Edward M / Ellington, Andrew D

    ArXiv

    2023  

    Abstract The fundamental goal of small molecule discovery is to generate chemicals with target functionality. While this often proceeds through structure-based methods, we set out to investigate the practicality of orthogonal methods that leverage the extensive corpus of chemical literature. We hypothesize that a sufficiently large text-derived chemical function dataset would mirror the actual landscape of chemical functionality. Such a landscape would implicitly capture complex physical and biological interactions given that chemical function arises from both a molecule's structure and its interacting partners. To evaluate this hypothesis, we built a Chemical Function (CheF) dataset of patent-derived functional labels. This dataset, comprising 631K molecule-function pairs, was created using an LLM- and embedding-based method to obtain functional labels for approximately 100K molecules from their corresponding 188K unique patents. We carry out a series of analyses demonstrating that the CheF dataset contains a semantically coherent textual representation of the functional landscape congruent with chemical structural relationships, thus approximating the actual chemical function landscape. We then demonstrate that this text-based functional landscape can be leveraged to identify drugs with target functionality using a model able to predict functional profiles from structure alone. We believe that functional label-guided molecular discovery may serve as an orthogonal approach to traditional structure-based methods in the pursuit of designing novel functional molecules.
    Language English
    Publishing date 2023-12-18
    Publishing country United States
    Document type Preprint
    ISSN 2331-8422
    ISSN (online) 2331-8422
    Database MEDical Literature Analysis and Retrieval System OnLINE

  8. Article: Deep mutational scanning and machine learning uncover antimicrobial peptide features driving membrane selectivity.

    Randall, Justin R / Vieira, Luiz C / Wilke, Claus O / Davies, Bryan W

    Research square

    2023  

    Abstract Antimicrobial peptides commonly act by disrupting bacterial membranes, but also frequently damage mammalian membranes. Deciphering the rules governing membrane selectivity is critical to understanding their function and enabling their therapeutic use. Past attempts to decipher these rules have failed because they cannot interrogate adequate peptide sequence variation. To overcome this problem, we develop deep mutational surface localized antimicrobial display (dmSLAY), which reveals comprehensive positional residue importance and flexibility across an antimicrobial peptide sequence. We apply dmSLAY to Protegrin-1, a potent yet toxic antimicrobial peptide, and identify thousands of sequence variants that positively or negatively influence its antibacterial activity. Further analysis reveals that avoiding large aromatic residues and eliminating disulfide bound cysteine pairs while maintaining membrane bound secondary structure greatly improves Protegrin-1 bacterial specificity. Moreover, dmSLAY datasets enable machine learning to expand our analysis to include over 5.7 million sequence variants and reveal full Protegrin-1 mutational profiles driving either bacterial or mammalian membrane specificity. Our results describe an innovative, high-throughput approach for elucidating antimicrobial peptide sequence-structure-function relationships which can inform synthetic peptide-based drug design.
    Language English
    Publishing date 2023-09-13
    Publishing country United States
    Document type Preprint
    DOI 10.21203/rs.3.rs-3280212/v1
    Database MEDical Literature Analysis and Retrieval System OnLINE

  9. Article: Deep mutational scanning and machine learning uncover antimicrobial peptide features driving membrane selectivity.

    Randall, Justin R / Vieira, Luiz C / Wilke, Claus O / Davies, Bryan W

    bioRxiv : the preprint server for biology

    2023  

    Abstract Antimicrobial peptides commonly act by disrupting bacterial membranes, but also frequently damage mammalian membranes. Deciphering the rules governing membrane selectivity is critical to understanding their function and enabling their therapeutic use. Past attempts to decipher these rules have failed because they cannot interrogate adequate peptide sequence variation. To overcome this problem, we develop deep mutational surface localized antimicrobial display (dmSLAY), which reveals comprehensive positional residue importance and flexibility across an antimicrobial peptide sequence. We apply dmSLAY to Protegrin-1, a potent yet toxic antimicrobial peptide, and identify thousands of sequence variants that positively or negatively influence its antibacterial activity. Further analysis reveals that avoiding large aromatic residues and eliminating disulfide bound cysteine pairs while maintaining membrane bound secondary structure greatly improves Protegrin-1 bacterial specificity. Moreover, dmSLAY datasets enable machine learning to expand our analysis to include over 5.7 million sequence variants and reveal full Protegrin-1 mutational profiles driving either bacterial or mammalian membrane specificity. Our results describe an innovative, high-throughput approach for elucidating antimicrobial peptide sequence-structure-function relationships which can inform synthetic peptide-based drug design.
    Language English
    Publishing date 2023-09-10
    Publishing country United States
    Document type Preprint
    DOI 10.1101/2023.07.28.551017
    Database MEDical Literature Analysis and Retrieval System OnLINE

  10. Article ; Online: Site-Specific Amino Acid Distributions Follow a Universal Shape.

    Johnson, Mackenzie M / Wilke, Claus O

    Journal of molecular evolution

    2020  Volume 88, Issue 10, Page(s) 731–741

    Abstract In many applications of evolutionary inference, a model of protein evolution needs to be fitted to the amino acid variation at individual sites in a multiple sequence alignment. Most existing models fall into one of two extremes: Either they provide a coarse-grained description that lacks biophysical realism (e.g., dN/dS models), or they require a large number of parameters to be fitted (e.g., mutation-selection models). Here, we ask whether a middle ground is possible: Can we obtain a realistic description of site-specific amino acid frequencies while severely restricting the number of free parameters in the model? We show that a distribution with a single free parameter can accurately capture the variation in amino acid frequency at most sites in an alignment, as long as we are willing to restrict our analysis to predicting amino acid frequencies by rank rather than by amino acid identity. This result holds equally well both in alignments of empirical protein sequences and of sequences evolved under a biophysically realistic all-atom force field. Our analysis reveals a near universal shape of the frequency distributions of amino acids. This insight has the potential to lead to new models of evolution that have both increased realism and a limited number of free parameters.
    MeSH term(s) Amino Acid Sequence ; Amino Acid Substitution ; Amino Acids/genetics ; Evolution, Molecular ; Models, Genetic ; Sequence Alignment
    Chemical Substances Amino Acids
    Language English
    Publishing date 2020-11-24
    Publishing country Germany
    Document type Journal Article ; Research Support, N.I.H., Extramural
    ZDB-ID 120148-7
    ISSN 1432-1432 ; 0022-2844
    ISSN (online) 1432-1432
    ISSN 0022-2844
    DOI 10.1007/s00239-020-09976-8
    Database MEDical Literature Analysis and Retrieval System OnLINE
