LIVIVO - The Search Portal for Life Sciences

Search results

Result 1 - 10 of total 111

  1. Article ; Online: Methods for discrete coding of morphological characters for numerical analysis.

    Goldman, Nick

    Cladistics : the international journal of the Willi Hennig Society

    2021  Volume 4, Issue 1, Page(s) 59–71

    Language English
    Publishing date 2021-12-21
    Publishing country United States
    Document type Letter
    ZDB-ID 1462608-1
    ISSN 1096-0031 ; 0748-3007
    ISSN (online) 1096-0031
    ISSN 0748-3007
    DOI 10.1111/j.1096-0031.1988.tb00468.x
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Article ; Online: Integrated structural and evolutionary analysis reveals common mechanisms underlying adaptive evolution in mammals.

    Slodkowicz, Greg / Goldman, Nick

    Proceedings of the National Academy of Sciences of the United States of America

    2020  Volume 117, Issue 11, Page(s) 5977–5986

    Abstract Understanding the molecular basis of adaptation to the environment is a central question in evolutionary biology, yet linking detected signatures of positive selection to molecular mechanisms remains challenging. Here we demonstrate that combining sequence-based phylogenetic methods with structural information assists in making such mechanistic interpretations on a genomic scale. Our integrative analysis shows that positively selected sites tend to colocalize on protein structures and that positively selected clusters are found in functionally important regions of proteins, indicating that positive selection can contravene the well-known principle of evolutionary conservation of functionally important regions. This unexpected finding, along with our discovery that positive selection acts on structural clusters, opens previously unexplored strategies for the development of better models of protein evolution. Remarkably, proteins where we detect the strongest evidence of clustering belong to just two functional groups: Components of immune response and metabolic enzymes. This gives a coherent picture of pathogens and xenobiotics as important drivers of adaptive evolution of mammals.
    MeSH term(s) Adaptation, Physiological ; Animals ; Environment ; Enzymes/chemistry ; Evolution, Molecular ; Genomics ; Immunity ; Mammals/genetics ; Mammals/immunology ; Mammals/physiology ; Models, Molecular ; Phylogeny ; Protein Conformation ; Proteins/chemistry ; Selection, Genetic
    Chemical Substances Enzymes ; Proteins
    Language English
    Publishing date 2020-03-02
    Publishing country United States
    Document type Journal Article
    ZDB-ID 209104-5
    ISSN 1091-6490 ; 0027-8424
    ISSN (online) 1091-6490
    ISSN 0027-8424
    DOI 10.1073/pnas.1916786117
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article ; Online: DNA Sequences Are as Useful as Protein Sequences for Inferring Deep Phylogenies.

    Kapli, Paschalia / Kotari, Ioanna / Telford, Maximilian J / Goldman, Nick / Yang, Ziheng

    Systematic biology

    2023  Volume 72, Issue 5, Page(s) 1119–1135

    Abstract Inference of deep phylogenies has almost exclusively used protein rather than DNA sequences based on the perception that protein sequences are less prone to homoplasy and saturation or to issues of compositional heterogeneity than DNA sequences. Here, we analyze a model of codon evolution under an idealized genetic code and demonstrate that those perceptions may be misconceptions. We conduct a simulation study to assess the utility of protein versus DNA sequences for inferring deep phylogenies, with protein-coding data generated under models of heterogeneous substitution processes across sites in the sequence and among lineages on the tree, and then analyzed using nucleotide, amino acid, and codon models. Analysis of DNA sequences under nucleotide-substitution models (possibly with the third codon positions excluded) recovered the correct tree at least as often as analysis of the corresponding protein sequences under modern amino acid models. We also applied the different data-analysis strategies to an empirical dataset to infer the metazoan phylogeny. Our results from both simulated and real data suggest that DNA sequences may be as useful as proteins for inferring deep phylogenies and should not be excluded from such analyses. Analysis of DNA data under nucleotide models has a major computational advantage over protein-data analysis, potentially making it feasible to use advanced models that account for among-site and among-lineage heterogeneity in the nucleotide-substitution process in inference of deep phylogenies.
    MeSH term(s) Animals ; Phylogeny ; Base Sequence ; Models, Genetic ; Codon ; Nucleotides ; Amino Acids/genetics ; Evolution, Molecular
    Chemical Substances Codon ; Nucleotides ; Amino Acids
    Language English
    Publishing date 2023-06-27
    Publishing country England
    Document type Journal Article
    ZDB-ID 1482572-7
    ISSN 1076-836X ; 1063-5157
    ISSN (online) 1076-836X
    ISSN 1063-5157
    DOI 10.1093/sysbio/syad036
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Article ; Online: Short-range template switching in great ape genomes explored using pair hidden Markov models.

    Walker, Conor R / Scally, Aylwyn / De Maio, Nicola / Goldman, Nick

    PLoS genetics

    2021  Volume 17, Issue 3, Page(s) e1009221

    Abstract Many complex genomic rearrangements arise through template switch errors, which occur in DNA replication when there is a transient polymerase switch to an alternate template nearby in three-dimensional space. While typically investigated at kilobase-to-megabase scales, the genomic and evolutionary consequences of this mutational process are not well characterised at smaller scales, where they are often interpreted as clusters of independent substitutions, insertions and deletions. Here we present an improved statistical approach using pair hidden Markov models, and use it to detect and describe short-range template switches underlying clusters of mutations in the multi-way alignment of hominid genomes. Using robust statistics derived from evolutionary genomic simulations, we show that template switch events have been widespread in the evolution of the great apes' genomes and provide a parsimonious explanation for the presence of many complex mutation clusters in their phylogenetic context. Larger-scale mechanisms of genome rearrangement are typically associated with structural features around breakpoints, and accordingly we show that atypical patterns of secondary structure formation and DNA bending are present at the initial template switch loci. Our methods improve on previous non-probabilistic approaches for computational detection of template switch mutations, allowing the statistical significance of events to be assessed. By specifying realistic evolutionary parameters based on the genomes and taxa involved, our methods can be readily adapted to other intra- or inter-species comparisons.
    MeSH term(s) Algorithms ; Animals ; DNA Replication ; Genome ; Genomics/methods ; Hominidae/genetics ; Humans ; Markov Chains ; Models, Genetic ; Poly A-U ; Quantitative Trait Loci ; Templates, Genetic
    Chemical Substances Poly A-U (24936-38-7)
    Language English
    Publishing date 2021-03-02
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2186725-2
    ISSN 1553-7404 ; 1553-7390
    ISSN (online) 1553-7404
    ISSN 1553-7390
    DOI 10.1371/journal.pgen.1009221
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Article ; Online: Short template switch events explain mutation clusters in the human genome.

    Löytynoja, Ari / Goldman, Nick

    Genome research

    2017  Volume 27, Issue 6, Page(s) 1039–1049

    Abstract Resequencing efforts are uncovering the extent of genetic variation in humans and provide data to study the evolutionary processes shaping our genome. One recurring puzzle in both intra- and inter-species studies is the high frequency of complex mutations comprising multiple nearby base substitutions or insertion-deletions. We devised a generalized mutation model of template switching during replication that extends existing models of genome rearrangement and used this to study the role of template switch events in the origin of short mutation clusters. Applied to the human genome, our model detects thousands of template switch events during the evolution of human and chimp from their common ancestor and hundreds of events between two independently sequenced human genomes. Although many of these are consistent with a template switch mechanism previously proposed for bacteria, our model also identifies new types of mutations that create short inversions, some flanked by paired inverted repeats. The local template switch process can create numerous complex mutation patterns, including hairpin loop structures, and explains multinucleotide mutations and compensatory substitutions without invoking positive selection, speculative mechanisms, or implausible coincidence. Clustered sequence differences are challenging for current mapping and variant calling methods, and we show that many erroneous variant annotations exist in human reference data. Local template switch events may have been neglected as an explanation for complex mutations because of biases in commonly used analyses. Incorporation of our model into reference-based analysis pipelines and comparisons of de novo assembled genomes will lead to improved understanding of genome variation and evolution.
    MeSH term(s) Animals ; Base Sequence ; Biological Evolution ; Genome, Human ; High-Throughput Nucleotide Sequencing ; Humans ; INDEL Mutation ; Inverted Repeat Sequences ; Models, Genetic ; Pan troglodytes ; Polymorphism, Single Nucleotide ; Sequence Alignment
    Language English
    Publishing date 2017-04-06
    Publishing country United States
    Document type Journal Article
    ZDB-ID 1284872-4
    ISSN 1549-5469 ; 1088-9051 ; 1054-9803
    ISSN (online) 1549-5469
    ISSN 1088-9051 ; 1054-9803
    DOI 10.1101/gr.214973.116
    Database MEDical Literature Analysis and Retrieval System OnLINE

  6. Article ; Online: Dynamic, adaptive sampling during nanopore sequencing using Bayesian experimental design.

    Weilguny, Lukas / De Maio, Nicola / Munro, Rory / Manser, Charlotte / Birney, Ewan / Loose, Matthew / Goldman, Nick

    Nature biotechnology

    2023  Volume 41, Issue 7, Page(s) 1018–1025

    Abstract Nanopore sequencers can select which DNA molecules to sequence, rejecting a molecule after analysis of a small initial part. Currently, selection is based on predetermined regions of interest that remain constant throughout an experiment. Sequencing efforts, thus, cannot be re-focused on molecules likely contributing most to experimental success. Here we present BOSS-RUNS, an algorithmic framework and software to generate dynamically updated decision strategies. We quantify uncertainty at each genome position with real-time updates from data already observed. For each DNA fragment, we decide whether the expected decrease in uncertainty that it would provide warrants fully sequencing it, thus optimizing information gain. BOSS-RUNS mitigates coverage bias between and within members of a microbial community, leading to improved variant calling; for example, low-coverage sites of a species at 1% abundance were reduced by 87.5%, with 12.5% more single-nucleotide polymorphisms detected. Such data-driven updates to molecule selection are applicable to many sequencing scenarios, such as enriching for regions with increased divergence or low coverage, reducing time-to-answer.
    MeSH term(s) Nanopore Sequencing ; Research Design ; Bayes Theorem ; Genome ; Software ; High-Throughput Nucleotide Sequencing ; Sequence Analysis, DNA ; Nanopores
    Language English
    Publishing date 2023-01-02
    Publishing country United States
    Document type Journal Article
    ZDB-ID 1311932-1
    ISSN 1546-1696 ; 1087-0156
    ISSN (online) 1546-1696
    ISSN 1087-0156
    DOI 10.1038/s41587-022-01580-z
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article ; Online: Maximum likelihood pandemic-scale phylogenetics.

    De Maio, Nicola / Kalaghatgi, Prabhav / Turakhia, Yatish / Corbett-Detig, Russell / Minh, Bui Quang / Goldman, Nick

    Nature genetics

    2023  Volume 55, Issue 5, Page(s) 746–752

    Abstract Phylogenetics has a crucial role in genomic epidemiology. Enabled by unparalleled volumes of genome sequence data generated to study and help contain the COVID-19 pandemic, phylogenetic analyses of SARS-CoV-2 genomes have shed light on the virus's origins, spread, and the emergence and reproductive success of new variants. However, most phylogenetic approaches, including maximum likelihood and Bayesian methods, cannot scale to the size of the datasets from the current pandemic. We present 'MAximum Parsimonious Likelihood Estimation' (MAPLE), an approach for likelihood-based phylogenetic analysis of epidemiological genomic datasets at unprecedented scales. MAPLE infers SARS-CoV-2 phylogenies more accurately than existing maximum likelihood approaches while running up to thousands of times faster, and requiring at least 100 times less memory on large datasets. This extends the reach of genomic epidemiology, allowing the continued use of accurate phylogenetic, phylogeographic and phylodynamic analyses on datasets of millions of genomes.
    MeSH term(s) Humans ; Phylogeny ; COVID-19/epidemiology ; COVID-19/genetics ; SARS-CoV-2/genetics ; Likelihood Functions ; Pandemics ; Bayes Theorem
    Language English
    Publishing date 2023-04-10
    Publishing country United States
    Document type Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't ; Research Support, U.S. Gov't, P.H.S.
    ZDB-ID 1108734-1
    ISSN 1546-1718 ; 1061-4036
    ISSN (online) 1546-1718
    ISSN 1061-4036
    DOI 10.1038/s41588-023-01368-0
    Database MEDical Literature Analysis and Retrieval System OnLINE

  8. Article ; Online: SWAMPy: Simulating SARS-CoV-2 Wastewater Amplicon Metagenomes with Python

    Fidan, Fatma Rabia / Boulton, William / De Maio, Nicola / Goldman, Nick

    bioRxiv

    Abstract Motivation: Tracking SARS-CoV-2 variants through genomic sequencing has been an important part of the global response to the pandemic. As well as whole-genome sequencing of clinical samples, this surveillance effort has been aided by amplicon sequencing of wastewater samples, which proved effective in real case studies. Because of its relevance to public healthcare decisions, testing and benchmarking wastewater sequencing analysis methods is also crucial, which necessitates a simulator. Although metagenomic simulators exist, none are fit for the purpose of simulating the metagenomes produced through amplicon sequencing of wastewater. Results: Our new simulation tool, SWAMPy (Simulating SARS-CoV-2 Wastewater Amplicon Metagenomes with Python), is intended to provide realistic simulated SARS-CoV-2 wastewater sequencing datasets with which other programs that rely on this type of data can be evaluated and improved. Availability: The code for this project is available at https://github.com/goldman-gp-ebi/SWAMPy. It can be installed on any Unix-based operating system and is available under the GPL-v3 license.
    Keywords covid19
    Language English
    Publishing date 2022-12-12
    Publisher Cold Spring Harbor Laboratory
    Document type Article ; Online
    DOI 10.1101/2022.12.10.519890
    Database COVID19

  9. Article: Genetic Variability of the SARS-CoV-2 Pocketome

    Yazdani, Setayesh / De Maio, Nicola / Ding, Yining / Shahani, Vijay / Goldman, Nick / Schapira, Matthieu

    Journal of proteome research. 2021 June 28, v. 20, no. 8

    2021  

    Abstract In the absence of effective treatment, COVID-19 is likely to remain a global disease burden. Compounding this threat is the near certainty that novel coronaviruses with pandemic potential will emerge in years to come. Pan-coronavirus drugs—agents active against both SARS-CoV-2 and other coronaviruses—would address both threats. A strategy to develop such broad-spectrum inhibitors is to pharmacologically target binding sites on SARS-CoV-2 proteins that are highly conserved in other known coronaviruses, the assumption being that any selective pressure to keep a site conserved across past viruses will apply to future ones. Here we systematically mapped druggable binding pockets on the experimental structure of 15 SARS-CoV-2 proteins and analyzed their variation across 27 α- and β-coronaviruses and across thousands of SARS-CoV-2 samples from COVID-19 patients. We find that the two most conserved druggable sites are a pocket overlapping the RNA binding site of the helicase nsp13 and the catalytic site of the RNA-dependent RNA polymerase nsp12, both components of the viral replication–transcription complex. We present the data on a public web portal (https://www.thesgc.org/SARSCoV2_pocketome/), where users can interactively navigate individual protein structures and view the genetic variability of drug-binding pockets in 3D.
    Keywords COVID-19 infection ; RNA ; RNA-directed RNA polymerase ; Severe acute respiratory syndrome coronavirus 2 ; active sites ; burden of disease ; genetic variation ; pandemic ; proteome ; research
    Language English
    Dates of publication 2021-06-28
    Size p. 4212-4215.
    Publishing place American Chemical Society
    Document type Article
    ZDB-ID 2078618-9
    ISSN 1535-3907 ; 1535-3893
    ISSN (online) 1535-3907
    ISSN 1535-3893
    DOI 10.1021/acs.jproteome.1c00206
    Database NAL-Catalogue (AGRICOLA)

  10. Article ; Online: Genetic Variability of the SARS-CoV-2 Pocketome.

    Yazdani, Setayesh / De Maio, Nicola / Ding, Yining / Shahani, Vijay / Goldman, Nick / Schapira, Matthieu

    Journal of proteome research

    2021  Volume 20, Issue 8, Page(s) 4212–4215

    Abstract In the absence of effective treatment, COVID-19 is likely to remain a global disease burden. Compounding this threat is the near certainty that novel coronaviruses with pandemic potential will emerge in years to come. Pan-coronavirus drugs, agents active against both SARS-CoV-2 and other coronaviruses, would address both threats. A strategy to develop such broad-spectrum inhibitors is to pharmacologically target binding sites on SARS-CoV-2 proteins that are highly conserved in other known coronaviruses, the assumption being that any selective pressure to keep a site conserved across past viruses will apply to future ones. Here we systematically mapped druggable binding pockets on the experimental structure of 15 SARS-CoV-2 proteins and analyzed their variation across 27 α- and β-coronaviruses and across thousands of SARS-CoV-2 samples from COVID-19 patients. We find that the two most conserved druggable sites are a pocket overlapping the RNA binding site of the helicase nsp13 and the catalytic site of the RNA-dependent RNA polymerase nsp12, both components of the viral replication-transcription complex. We present the data on a public web portal (https://www.thesgc.org/SARSCoV2_pocketome/), where users can interactively navigate individual protein structures and view the genetic variability of drug-binding pockets in 3D.
    MeSH term(s) Antiviral Agents/pharmacology ; Antiviral Agents/therapeutic use ; COVID-19 ; Humans ; Pandemics ; RNA-Dependent RNA Polymerase/genetics ; SARS-CoV-2
    Chemical Substances Antiviral Agents ; RNA-Dependent RNA Polymerase (EC 2.7.7.48)
    Language English
    Publishing date 2021-06-28
    Publishing country United States
    Document type Letter ; Research Support, Non-U.S. Gov't
    ZDB-ID 2078618-9
    ISSN 1535-3907 ; 1535-3893
    ISSN (online) 1535-3907
    ISSN 1535-3893
    DOI 10.1021/acs.jproteome.1c00206
    Database MEDical Literature Analysis and Retrieval System OnLINE
