LIVIVO - The Search Portal for Life Sciences

Search results

Results 1-10 of 43

  1. Article: AUTO-TUNE: SELECTING THE DISTANCE THRESHOLD FOR INFERRING HIV TRANSMISSION CLUSTERS.

    Weaver, Steven / Dávila-Conn, Vanessa / Ji, Daniel / Verdonk, Hannah / Ávila-Ríos, Santiago / Leigh Brown, Andrew J / Wertheim, Joel O / Kosakovsky Pond, Sergei L

    bioRxiv : the preprint server for biology

    2024  

    Abstract Molecular surveillance of viral pathogens and inference of transmission networks from genomic data play an increasingly important role in public health efforts, especially for HIV-1. For many methods, the genetic distance threshold used to connect sequences in the transmission network is a key parameter informing the properties of inferred networks. Using a distance threshold that is too high can result in a network with many spurious links, making it difficult to interpret. Conversely, a distance threshold that is too low can result in a network with too few links, which may not capture key insights into clusters of public health concern. Published research using the HIV-TRACE software package frequently uses the default threshold of 0.015 substitutions/site for HIV pol gene sequences, but in many cases, investigators heuristically select other threshold parameters to better capture the underlying dynamics of the epidemic they are studying. Here, we present a general heuristic scoring approach for tuning a distance threshold adaptively, which seeks to prevent the formation of giant clusters. We prioritize the ratio of the sizes of the largest and the second largest cluster, maximizing the number of clusters present in the network. We apply our scoring heuristic to outbreaks with different characteristics, such as regional or temporal variability, and demonstrate the utility of using the scoring mechanism's suggested distance threshold to identify clusters exhibiting risk factors that would have otherwise been more difficult to identify. For example, while we found that a 0.015 substitutions/site distance threshold is typical for US-like epidemics, recent outbreaks like the CRF07_BC subtype among men who have sex with men (MSM) in China have been found to have a lower optimal threshold of 0.005 to better capture the transition from injected drug use (IDU) to MSM as the primary risk factor. 
    Alternatively, in communities surrounding Lake Victoria in Uganda, where there has been sustained heterosexual transmission for many years, we found that a larger distance threshold is necessary to capture a more risk factor-diverse population with sparse sampling over a longer period of time. Such identification may allow for more informed intervention action by respective public health officials.
    Language English
    Publishing date 2024-03-14
    Publishing country United States
    Document type Preprint
    DOI 10.1101/2024.03.11.584522
    Database MEDical Literature Analysis and Retrieval System OnLINE
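
The two quantities the abstract above says the heuristic looks at (the number of clusters and the ratio of the largest to the second-largest cluster) can be illustrated with a minimal sketch. This is not the AUTO-TUNE implementation or its scoring formula, which the abstract does not give. The function names, the toy distances, and the choice to count only clusters of two or more sequences are assumptions made for illustration.

```python
from collections import Counter


def cluster_sizes(n_nodes, distances, threshold):
    """Sizes (largest first) of clusters with >= 2 members when every pair
    at or below `threshold` is linked. `distances` maps (i, j) -> distance."""
    parent = list(range(n_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), d in distances.items():
        if d <= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_a] = root_b

    counts = Counter(find(i) for i in range(n_nodes))
    return sorted((s for s in counts.values() if s >= 2), reverse=True)


def threshold_summary(n_nodes, distances, thresholds):
    """For each candidate threshold, report the cluster count and the
    largest-to-second-largest size ratio (None if fewer than two clusters)."""
    rows = []
    for t in thresholds:
        sizes = cluster_sizes(n_nodes, distances, t)
        ratio = sizes[0] / sizes[1] if len(sizes) >= 2 else None
        rows.append({"threshold": t, "n_clusters": len(sizes),
                     "largest_to_second": ratio})
    return rows


# Hypothetical pairwise distances (substitutions/site) among six sequences.
toy = {(0, 1): 0.004, (2, 3): 0.004, (1, 2): 0.012,
       (4, 5): 0.014, (3, 4): 0.02}
summary = threshold_summary(6, toy, [0.005, 0.015, 0.025])
for row in summary:
    print(row)
```

In this toy data the largest threshold merges everything into a single cluster, which is the giant-cluster outcome the abstract says the heuristic seeks to prevent.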

  2. Article ; Online: Extra base hits: Widespread empirical support for instantaneous multiple-nucleotide changes.

    Lucaci, Alexander G / Wisotsky, Sadie R / Shank, Stephen D / Weaver, Steven / Kosakovsky Pond, Sergei L

    PloS one

    2021  Volume 16, Issue 3, Page(s) e0248337

    Abstract Despite many attempts to introduce evolutionary models that permit substitutions to instantly alter more than one nucleotide in a codon, the prevailing wisdom remains that such changes are rare and generally negligible or are reflective of non-biological artifacts, such as alignment errors. Codon models continue to posit that only single-nucleotide changes have non-zero rates. Here, we develop and test a simple hierarchy of codon-substitution models with non-zero evolutionary rates for only one-nucleotide (1H), one- and two-nucleotide (2H), or any (3H) codon substitutions. Using over 42,000 empirical alignments, we find widespread statistical support for multiple hits: 61% of alignments prefer models with 2H allowed, and 23% prefer models with 3H allowed. Analyses of simulated data suggest that these results are not likely to be due to simple artifacts such as model misspecification or alignment errors. Further modeling reveals that synonymous codon island jumping among codons encoding serine, especially along short branches, contributes significantly to this 3H signal. While serine codons were prominently involved in multiple-hit substitutions, there were other common exchanges contributing to better model fit. It appears that a small subset of sites in most alignments have unusual evolutionary dynamics not well explained by existing model formalisms, and that commonly estimated quantities, such as dN/dS ratios, may be biased by model misspecification. Our findings highlight the need for continued evaluation of assumptions underlying workhorse evolutionary models and subsequent evolutionary inference techniques. We provide a software implementation for evolutionary biologists to assess the potential impact of extra base hits in their data in the HyPhy package and in the Datamonkey.org server.
    MeSH term(s) Codon/genetics ; Evolution, Molecular ; Models, Genetic ; Nucleotides ; Phylogeny ; Software
    Chemical Substances Codon ; Nucleotides
    Language English
    Publishing date 2021-03-12
    Publishing country United States
    Document type Journal Article ; Research Support, N.I.H., Extramural
    ISSN 1932-6203
    ISSN (online) 1932-6203
    DOI 10.1371/journal.pone.0248337
    Database MEDical Literature Analysis and Retrieval System OnLINE
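
The 1H/2H/3H hierarchy in the abstract above separates codon substitutions by how many nucleotide positions change at once. A minimal sketch of that classification follows. It is not code from HyPhy, and the function names `hit_class` and `allowed` are hypothetical. It only counts differing positions and checks which model class would permit a non-zero rate for the change.

```python
def hit_class(codon_a, codon_b):
    """Number of nucleotide positions (0 to 3) at which two codons differ."""
    if len(codon_a) != 3 or len(codon_b) != 3:
        raise ValueError("codons must be exactly three nucleotides long")
    return sum(x != y for x, y in zip(codon_a.upper(), codon_b.upper()))


def allowed(codon_a, codon_b, model):
    """True if `model` ("1H", "2H" or "3H") gives this substitution a
    non-zero instantaneous rate, based only on the number of positions changed."""
    max_hits = {"1H": 1, "2H": 2, "3H": 3}[model]
    return 1 <= hit_class(codon_a, codon_b) <= max_hits


# Serine is encoded by two codon "islands": TCN and AGT/AGC.
print(hit_class("AGT", "TCT"))      # two positions differ
print(hit_class("AGC", "TCA"))      # all three positions differ
print(allowed("AGC", "TCA", "2H"))  # a 3H change is excluded under 2H
```

Moving between the two serine islands always changes at least the first two positions, so no such jump is a single-nucleotide change. This is why a 1H-only model cannot treat these synonymous exchanges as instantaneous.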

  3. Article ; Online: phylotree.js - a JavaScript library for application development and interactive data visualization in phylogenetics.

    Shank, Stephen D / Weaver, Steven / Kosakovsky Pond, Sergei L

    BMC bioinformatics

    2018  Volume 19, Issue 1, Page(s) 276

    Abstract Background: While several JavaScript packages for visualizing phylogenetic trees exist, most are best characterized as frameworks that are designed with a specific set of tasks in mind. Extending such packages to use cases that are not available as features often ends up being difficult. Moreover, existing packages tend to produce standalone widgets that are not designed to serve as middleware, as opposed to flexible tools that can integrate with other components of an application.
    Results: phylotree.js is a library that extends the popular data visualization framework d3.js, and is suitable for building JavaScript applications where users can view and interact with phylogenetic trees. The effects of such interactions can be captured and communicated to other package components, making it possible to engineer complex and responsive applications that include phylogenetic trees. phylotree.js implements several abstractions in addition to features, and comes with a documented application programming interface, thus promoting interoperability and extensibility. Example applications include a tool to visualize and annotate phylogenetic trees, a web application for comparative sequence analysis, a structural viewer that interacts with a large phylogenetic tree, and an interactive tanglegram.
    Conclusions: phylotree.js is a useful tool and application module for a variety of computational biology software applications. The code is available on Github and is released under the MIT license.
    MeSH term(s) Computational Biology/methods ; Phylogeny ; Sequence Analysis, DNA ; Software ; User-Computer Interface
    Language English
    Publishing date 2018-07-25
    Publishing country England
    Document type Journal Article
    ZDB-ID 2041484-5
    ISSN (online) 1471-2105
    ISSN 1471-2105
    DOI 10.1186/s12859-018-2283-2
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Article ; Online: TopHap: rapid inference of key phylogenetic structures from common haplotypes in large genome collections with limited diversity.

    Caraballo-Ortiz, Marcos A / Miura, Sayaka / Sanderford, Maxwell / Dolker, Tenzin / Tao, Qiqing / Weaver, Steven / Pond, Sergei L K / Kumar, Sudhir

    Bioinformatics (Oxford, England)

    2022  Volume 38, Issue 10, Page(s) 2719–2726

    Abstract Motivation: Building reliable phylogenies from very large collections of sequences with a limited number of phylogenetically informative sites is challenging because sequencing errors and recurrent/backward mutations interfere with the phylogenetic signal, confounding true evolutionary relationships. Massive global efforts of sequencing genomes and reconstructing the phylogeny of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) strains exemplify these difficulties since there are only hundreds of phylogenetically informative sites but millions of genomes. For such datasets, we set out to develop a method for building the phylogenetic tree of genomic haplotypes consisting of positions harboring common variants to improve the signal-to-noise ratio for more accurate and fast phylogenetic inference of resolvable phylogenetic features.
    Results: We present the TopHap approach that determines spatiotemporally common haplotypes of common variants and builds their phylogeny at a fraction of the computational time of traditional methods. We develop a bootstrap strategy that resamples genomes spatiotemporally to assess topological robustness. The application of TopHap to build a phylogeny of 68 057 SARS-CoV-2 genomes (68KG) from the first year of the pandemic produced an evolutionary tree of major SARS-CoV-2 haplotypes. This phylogeny is concordant with the mutation tree inferred using the co-occurrence pattern of mutations and recovers key phylogenetic relationships from more traditional analyses. We also evaluated alternative roots of the SARS-CoV-2 phylogeny and found that the earliest sampled genomes in 2019 likely evolved by four mutations of the most recent common ancestor of all SARS-CoV-2 genomes. An application of TopHap to more than 1 million SARS-CoV-2 genomes reconstructed the most comprehensive evolutionary relationships of major variants, which confirmed the 68KG phylogeny and provided evolutionary origins of major and recent variants of concern.
    Availability and implementation: TopHap is available at https://github.com/SayakaMiura/TopHap.
    Supplementary information: Supplementary data are available at Bioinformatics online.
    MeSH term(s) COVID-19 ; Genome, Viral ; Haplotypes ; Humans ; Mutation ; Phylogeny ; SARS-CoV-2/genetics
    Language English
    Publishing date 2022-04-20
    Publishing country England
    Document type Journal Article ; Research Support, U.S. Gov't, Non-P.H.S. ; Research Support, N.I.H., Extramural
    ZDB-ID 1422668-6
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/btac186
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Article ; Online: Contrast-FEL-A Test for Differences in Selective Pressures at Individual Sites among Clades and Sets of Branches.

    Kosakovsky Pond, Sergei L / Wisotsky, Sadie R / Escalante, Ananias / Magalis, Brittany Rife / Weaver, Steven

    Molecular biology and evolution

    2020  Volume 38, Issue 3, Page(s) 1184–1198

    Abstract A number of evolutionary hypotheses can be tested by comparing selective pressures among sets of branches in a phylogenetic tree. When the question of interest is to identify specific sites within genes that may be evolving differently, a common approach is to perform separate analyses on subsets of sequences and compare parameter estimates in a post hoc fashion. This approach is statistically suboptimal and not always applicable. Here, we develop a simple extension of a popular fixed effects likelihood method in the context of codon-based evolutionary phylogenetic maximum likelihood testing, Contrast-FEL. It is suitable for identifying individual alignment sites where any among the K≥2 sets of branches in a phylogenetic tree have detectably different ω ratios, indicative of different selective regimes. Using extensive simulations, we show that Contrast-FEL delivers good power, exceeding 90% for sufficiently large differences, while maintaining tight control over false positive rates, when the model is correctly specified. We conclude by applying Contrast-FEL to data from five previously published studies spanning a diverse range of organisms and focusing on different evolutionary questions.
    MeSH term(s) Brassicaceae/genetics ; Cytochromes b/genetics ; Genetic Techniques ; HIV Reverse Transcriptase/genetics ; Haemosporida/genetics ; Phylogeny ; Rhodopsin/genetics ; Ribulose-Bisphosphate Carboxylase/genetics ; Selection, Genetic ; Trichomes/genetics
    Chemical Substances Rhodopsin (9009-81-8) ; Cytochromes b (9035-37-4) ; reverse transcriptase, Human immunodeficiency virus 1 (EC 2.7.7.-) ; HIV Reverse Transcriptase (EC 2.7.7.49) ; Ribulose-Bisphosphate Carboxylase (EC 4.1.1.39)
    Language English
    Publishing date 2020-10-16
    Publishing country United States
    Document type Comparative Study ; Journal Article ; Research Support, N.I.H., Extramural ; Validation Study
    ZDB-ID 998579-7
    ISSN (online) 1537-1719
    ISSN 0737-4038
    DOI 10.1093/molbev/msaa263
    Database MEDical Literature Analysis and Retrieval System OnLINE

  6. Article: TopHap: Rapid inference of key phylogenetic structures from common haplotypes in large genome collections with limited diversity.

    Caraballo-Ortiz, Marcos A / Miura, Sayaka / Sanderford, Maxwell / Dolker, Tenzin / Tao, Qiqing / Weaver, Steven / Pond, Sergei L K / Kumar, Sudhir

    bioRxiv : the preprint server for biology

    2021  

    Abstract Motivation: Building reliable phylogenies from very large collections of sequences with a limited number of phylogenetically informative sites is challenging because sequencing errors and recurrent/backward mutations interfere with the phylogenetic signal, confounding true evolutionary relationships. Massive global efforts of sequencing genomes and reconstructing the phylogeny of SARS-CoV-2 strains exemplify these difficulties since there are only hundreds of phylogenetically informative sites and millions of genomes. For such datasets, we set out to develop a method for building the phylogenetic tree of genomic haplotypes consisting of positions harboring common variants to improve the signal-to-noise ratio for more accurate phylogenetic inference of resolvable phylogenetic features.
    Results: We present the
    Availability: TopHap
    Contact: s.kumar@temple.edu.
    Language English
    Publishing date 2021-12-14
    Publishing country United States
    Document type Preprint
    DOI 10.1101/2021.12.13.472454
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article: An evolutionary portrait of the progenitor SARS-CoV-2 and its dominant offshoots in COVID-19 pandemic.

    Kumar, Sudhir / Tao, Qiqing / Weaver, Steven / Sanderford, Maxwell / Caraballo-Ortiz, Marcos A / Sharma, Sudip / Pond, Sergei L K / Miura, Sayaka

    bioRxiv : the preprint server for biology

    2021  

    Abstract We report the likely most recent common ancestor of SARS-CoV-2 - the coronavirus that causes COVID-19. This progenitor SARS-CoV-2 genome was recovered through a novel application and advancement of computational methods initially developed to reconstruct the mutational history of tumor cells in a patient. The progenitor differs from the earliest coronaviruses sampled in China by three variants, implying that none of the earliest patients represent the index case or gave rise to all the human infections. However, multiple coronavirus infections in China and the USA harbored the progenitor genetic fingerprint in January 2020 and later, suggesting that the progenitor was spreading worldwide as soon as weeks after the first reported cases of COVID-19. Mutations of the progenitor and its offshoots have produced many dominant coronavirus strains, which have spread episodically over time. Fingerprinting based on common mutations reveals that the same coronavirus lineage has dominated North America for most of the pandemic. There have been multiple replacements of predominant coronavirus strains in Europe and Asia and the continued presence of multiple high-frequency strains in Asia and North America. We provide a continually updating dashboard of global evolution and spatiotemporal trends of SARS-CoV-2 spread (http://sars2evo.datamonkey.org/).
    Keywords covid19
    Language English
    Publishing date 2021-01-19
    Publishing country United States
    Document type Preprint
    DOI 10.1101/2020.09.24.311845
    Database MEDical Literature Analysis and Retrieval System OnLINE

  8. Article ; Online: An Evolutionary Portrait of the Progenitor SARS-CoV-2 and Its Dominant Offshoots in COVID-19 Pandemic.

    Kumar, Sudhir / Tao, Qiqing / Weaver, Steven / Sanderford, Maxwell / Caraballo-Ortiz, Marcos A / Sharma, Sudip / Pond, Sergei L K / Miura, Sayaka

    Molecular biology and evolution

    2021  Volume 38, Issue 8, Page(s) 3046–3059

    Abstract Global sequencing of genomes of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has continued to reveal new genetic variants that are the key to unraveling its early evolutionary history and tracking its global spread over time. Here we present the heretofore cryptic mutational history and spatiotemporal dynamics of SARS-CoV-2 from an analysis of thousands of high-quality genomes. We report the likely most recent common ancestor of SARS-CoV-2, reconstructed through a novel application and advancement of computational methods initially developed to infer the mutational history of tumor cells in a patient. This progenitor genome differs from genomes of the first coronaviruses sampled in China by three variants, implying that none of the earliest patients represent the index case or gave rise to all the human infections. However, multiple coronavirus infections in China and the United States harbored the progenitor genetic fingerprint in January 2020 and later, suggesting that the progenitor was spreading worldwide months before and after the first reported cases of COVID-19 in China. Mutations of the progenitor and its offshoots have produced many dominant coronavirus strains that have spread episodically over time. Fingerprinting based on common mutations reveals that the same coronavirus lineage has dominated North America for most of the pandemic in 2020. There have been multiple replacements of predominant coronavirus strains in Europe and Asia as well as continued presence of multiple high-frequency strains in Asia and North America. We have developed a continually updating dashboard of global evolution and spatiotemporal trends of SARS-CoV-2 spread (http://sars2evo.datamonkey.org/).
    MeSH term(s) Biological Evolution ; COVID-19/genetics ; COVID-19/metabolism ; Computational Biology/methods ; Contact Tracing/methods ; Evolution, Molecular ; Genome, Viral ; Humans ; Mutation ; Pandemics ; Phylogeny ; SARS-CoV-2/genetics ; SARS-CoV-2/metabolism ; SARS-CoV-2/pathogenicity ; Sequence Analysis, DNA/methods
    Language English
    Publishing date 2021-05-04
    Publishing country United States
    Document type Journal Article ; Research Support, N.I.H., Extramural ; Research Support, U.S. Gov't, Non-P.H.S.
    ZDB-ID 998579-7
    ISSN (online) 1537-1719
    ISSN 0737-4038
    DOI 10.1093/molbev/msab118
    Database MEDical Literature Analysis and Retrieval System OnLINE

  9. Article ; Online: HIV-TRACE (TRAnsmission Cluster Engine): a Tool for Large Scale Molecular Epidemiology of HIV-1 and Other Rapidly Evolving Pathogens.

    Kosakovsky Pond, Sergei L / Weaver, Steven / Leigh Brown, Andrew J / Wertheim, Joel O

    Molecular biology and evolution

    2018  Volume 35, Issue 7, Page(s) 1812–1819

    Abstract In modern applications of molecular epidemiology, genetic sequence data are routinely used to identify clusters of transmission in rapidly evolving pathogens, most notably HIV-1. Traditional 'shoe-leather' epidemiology infers transmission clusters by tracing chains of partners sharing epidemiological connections (e.g., sexual contact). Here, we present a computational tool for identifying a molecular transmission analog of such clusters: HIV-TRACE (TRAnsmission Cluster Engine). HIV-TRACE implements an approach inspired by traditional epidemiology, by identifying chains of partners whose viral genetic relatedness implies direct or indirect epidemiological connections. Molecular transmission clusters are constructed using codon-aware pairwise alignment to a reference sequence followed by pairwise genetic distance estimation among all sequences. This approach is computationally tractable and is capable of identifying HIV-1 transmission clusters in large surveillance databases comprising tens or hundreds of thousands of sequences in near real time, that is, on the order of minutes to hours. HIV-TRACE is available at www.hivtrace.org and from www.github.com/veg/hivtrace, along with the accompanying result visualization module from www.github.com/veg/hivtrace-viz. Importantly, the approach underlying HIV-TRACE is not limited to the study of HIV-1 and can be applied to study outbreaks and epidemics of other rapidly evolving pathogens.
    MeSH term(s) Computational Biology ; HIV Infections/epidemiology ; HIV Infections/transmission ; HIV-1/genetics ; Humans ; Molecular Epidemiology/methods ; Software
    Language English
    Publishing date 2018-02-01
    Publishing country United States
    Document type Journal Article ; Research Support, N.I.H., Extramural ; Research Support, U.S. Gov't, Non-P.H.S.
    ZDB-ID 998579-7
    ISSN (online) 1537-1719
    ISSN 0737-4038
    DOI 10.1093/molbev/msy016
    Database MEDical Literature Analysis and Retrieval System OnLINE

  10. Article: Assessment of potential agricultural and short-rotation forest bioenergy crop establishment sites in Jackson County, Florida, USA

    Merry, Krista / Bettinger, Pete / Cieszewski, Chris / Grebner, Donald / Siry, Jacek / Ucar, Zennure / Weaver, Steven

    Biomass and bioenergy. 2017 Oct., v. 105

    2017  

    Abstract We present an analysis framework composed of a geospatial process that can be employed to identify suitable areas for increasing the supply of biomass material on landscapes that include a large number of private landowners. Our analysis estimated bare land and a broad open land class, and considered restrictions on bioenergy projects pertaining to soils, parcel size, landowner, and landowner residence. In our case study county within the southern United States, the public land area remaining after excluded areas were removed represented less than 1% of the non-excluded land in the county, and broader-classified, open areas comprised about 1% of the non-excluded land. About 30% of the non-excluded land in the county was located on privately-owned land. However, when logical restrictions were considered, this estimate was reduced by as much as 43% (open areas) to 50% (bare ground) of the non-excluded private lands. The analysis suggests that large reductions in estimates of potential land areas available for bioenergy production may be observed when realistic assumptions are placed on geospatial analyses of land.
    Keywords bioenergy ; biomass ; case studies ; energy crops ; forests ; landowners ; landscapes ; plant establishment ; private lands ; public lands ; soil ; Florida
    Language English
    Dates of publication 2017-10
    Size p. 453-463.
    Publishing place Elsevier Ltd
    Document type Article
    ZDB-ID 1090121-8
    ISSN 0961-9534
    DOI 10.1016/j.biombioe.2017.08.004
    Database NAL-Catalogue (AGRICOLA)
