LIVIVO - The Search Portal for Life Sciences


Search results

Results 1 - 10 of 70

  1. Article ; Online: Hybrid-hybrid correction of errors in long reads with HERO.

    Kang, Xiongbin / Xu, Jialu / Luo, Xiao / Schönhuth, Alexander

    Genome biology

    2023  Volume 24, Issue 1, Page(s) 275

    Abstract Although generally superior, hybrid approaches for correcting errors in third-generation sequencing (TGS) reads, using next-generation sequencing (NGS) reads, mistake haplotype-specific variants for errors in polyploid and mixed samples. We suggest HERO, as the first "hybrid-hybrid" approach, to make use of both de Bruijn graphs and overlap graphs for optimal catering to the particular strengths of NGS and TGS reads. Extensive benchmarking experiments demonstrate that HERO improves indel and mismatch error rates by on average 65% (27–95%) and 20% (4–61%). Using HERO prior to genome assembly significantly improves the assemblies in the majority of the relevant categories.
    MeSH term(s) Sequence Analysis, DNA ; Algorithms ; High-Throughput Nucleotide Sequencing ; Benchmarking
    Language English
    Publishing date 2023-12-01
    Publishing country England
    Document type Journal Article
    ZDB-ID 2040529-7
    ISSN (online) 1474-760X
    DOI 10.1186/s13059-023-03112-7
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Article: Strainline: full-length de novo viral haplotype reconstruction from noisy long reads

    Luo, Xiao / Kang, Xiongbin / Schönhuth, Alexander

    Genome biology

    2022  Volume 23, Issue 1, Page(s) 29

    Abstract Haplotype-resolved de novo assembly of highly diverse virus genomes is critical in prevention, control and treatment of viral diseases. Current methods can either handle only relatively accurate short read data, or collapse haplotype-specific variations into a consensus sequence. Here, we present Strainline, a novel approach to assemble viral haplotypes from noisy long reads without a reference genome. Strainline is the first approach to provide strain-resolved, full-length de novo assemblies of viral quasispecies from noisy third-generation sequencing data. Benchmarking on simulated and real datasets of varying complexity and diversity confirms this novelty and demonstrates the superiority of Strainline.
    Keywords consensus sequence ; data collection ; genome ; haplotypes ; viruses
    Language English
    Dates of publication 2022-12
    Size p. 29.
    Publishing place BioMed Central
    Document type Article
    ZDB-ID 2040529-7
    ISSN 1474-760X
    DOI 10.1186/s13059-021-02587-6
    Database NAL-Catalogue (AGRICOLA)

  3. Article: Enhancing Long-Read-Based Strain-Aware Metagenome Assembly.

    Luo, Xiao / Kang, Xiongbin / Schönhuth, Alexander

    Frontiers in genetics

    2022  Volume 13, Page(s) 868280

    Abstract Microbial communities are usually highly diverse and often involve multiple strains from the participating species due to the rapid evolution of microorganisms. In such a complex microecosystem, different strains may show different biological functions. While reconstruction of individual genomes at the strain level is vital for accurately deciphering the composition of microbial communities, the problem has largely remained unresolved so far. Next-generation sequencing has been routinely used in metagenome assembly but there have been struggles to generate strain-specific genome sequences due to the short-read length. This explains why long-read sequencing technologies have recently provided unprecedented opportunities to carry out haplotype- or strain-resolved genome assembly. Here, we propose MetaBooster and MetaBooster-HiFi, as two pipelines for strain-aware metagenome assembly from PacBio CLR and Oxford Nanopore long-read sequencing data. Benchmarking experiments on both simulated and real sequencing data demonstrate that either the MetaBooster or the MetaBooster-HiFi pipeline drastically outperforms the state-of-the-art
    Language English
    Publishing date 2022-05-13
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2606823-0
    ISSN 1664-8021
    DOI 10.3389/fgene.2022.868280
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Article ; Online: Strainline: full-length de novo viral haplotype reconstruction from noisy long reads.

    Luo, Xiao / Kang, Xiongbin / Schönhuth, Alexander

    Genome biology

    2022  Volume 23, Issue 1, Page(s) 29

    Abstract Haplotype-resolved de novo assembly of highly diverse virus genomes is critical in prevention, control and treatment of viral diseases. Current methods can either handle only relatively accurate short read data, or collapse haplotype-specific variations into a consensus sequence. Here, we present Strainline, a novel approach to assemble viral haplotypes from noisy long reads without a reference genome. Strainline is the first approach to provide strain-resolved, full-length de novo assemblies of viral quasispecies from noisy third-generation sequencing data. Benchmarking on simulated and real datasets of varying complexity and diversity confirms this novelty and demonstrates the superiority of Strainline.
    MeSH term(s) Benchmarking ; COVID-19/virology ; Contig Mapping/methods ; Genome, Viral ; Haplotypes ; High-Throughput Nucleotide Sequencing ; Humans ; SARS-CoV-2/classification ; SARS-CoV-2/genetics ; Sequence Analysis, DNA ; Software
    Language English
    Publishing date 2022-01-20
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2040529-7
    ISSN (online) 1474-760X
    DOI 10.1186/s13059-021-02587-6
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Article ; Online: StrainXpress: strain aware metagenome assembly from short reads.

    Kang, Xiongbin / Luo, Xiao / Schönhuth, Alexander

    Nucleic acids research

    2022  Volume 50, Issue 17, Page(s) e101

    Abstract Next-generation sequencing-based metagenomics has made it possible to identify microorganisms in characteristic habitats without the need for lengthy cultivation. Importantly, clinically relevant phenomena such as resistance to medication, virulence or interactions with the environment can already vary within species. Therefore, a major current challenge is to reconstruct individual genomes from the sequencing reads at the level of strains, and not just the level of species. However, strains of one species can differ by only a small number of variants, which makes it difficult to distinguish them. Despite considerable recent progress, related approaches have remained fragmentary so far. Here, we present StrainXpress, as a comprehensive solution to the problem of strain aware metagenome assembly from next-generation sequencing reads. In experiments, StrainXpress reconstructs strain-specific genomes from metagenomes that involve more than 1000 strains and proves to successfully deal with poorly covered strains. The amount of reconstructed strain-specific sequence exceeds that of the current state-of-the-art approaches by on average 26.75% across all data sets (first quartile: 18.51%, median: 26.60%, third quartile: 35.05%).
    MeSH term(s) High-Throughput Nucleotide Sequencing ; Metagenome ; Metagenomics ; Sequence Analysis, DNA
    Language English
    Publishing date 2022-07-18
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 186809-3
    ISSN (online) 1362-4962 ; 1362-4954
    ISSN 0301-5610 ; 0305-1048
    DOI 10.1093/nar/gkac543
    Database MEDical Literature Analysis and Retrieval System OnLINE

  6. Article ; Online: VeChat: correcting errors in long reads using variation graphs.

    Luo, Xiao / Kang, Xiongbin / Schönhuth, Alexander

    Nature communications

    2022  Volume 13, Issue 1, Page(s) 6657

    Abstract Error correction is the canonical first step in long-read sequencing data analysis. Current self-correction methods, however, are affected by consensus sequence induced biases that mask true variants in haplotypes of lower frequency showing in mixed samples. Unlike consensus sequence templates, graph-based reference systems are not affected by such biases, so do not mistakenly mask true variants as errors. We present VeChat, as an approach to implement this idea: VeChat is based on variation graphs, as a popular type of data structure for pangenome reference systems. Extensive benchmarking experiments demonstrate that long reads corrected by VeChat contain 4 to 15 times (Pacific Biosciences) and 1 to 10 times (Oxford Nanopore Technologies) fewer errors than reads corrected by state-of-the-art approaches. Further, using VeChat prior to long-read assembly significantly improves the haplotype awareness of the assemblies. VeChat is an easy-to-use open-source tool and publicly available at https://github.com/HaploKit/vechat.
    MeSH term(s) Sequence Analysis, DNA/methods ; Algorithms ; Nanopores ; Haplotypes ; Data Analysis ; High-Throughput Nucleotide Sequencing ; Software
    Language English
    Publishing date 2022-11-04
    Publishing country England
    Document type Journal Article
    ZDB-ID 2553671-0
    ISSN (online) 2041-1723
    DOI 10.1038/s41467-022-34381-8
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article: phasebook: haplotype-aware de novo assembly of diploid genomes from long reads

    Luo, Xiao / Kang, Xiongbin / Schönhuth, Alexander

    Genome biology

    2021  Volume 22, Issue 1, Page(s) 299

    Abstract Haplotype-aware diploid genome assembly is crucial in genomics, precision medicine, and many other disciplines. Long-read sequencing technologies have greatly improved genome assembly. However, current long-read assemblers are either reference based, so introduce biases, or fail to capture the haplotype diversity of diploid genomes. We present phasebook, a de novo approach for reconstructing the haplotypes of diploid genomes from long reads. phasebook outperforms other approaches in terms of haplotype coverage by large margins, in addition to achieving competitive performance in terms of assembly errors and assembly contiguity.
    Keywords diploidy ; genome ; genome assembly ; genomics ; haplotypes ; precision medicine
    Language English
    Dates of publication 2021-12
    Size p. 299.
    Publishing place BioMed Central
    Document type Article
    ZDB-ID 2040529-7
    ISSN 1474-760X
    DOI 10.1186/s13059-021-02512-x
    Database NAL-Catalogue (AGRICOLA)

  8. Article ; Online: Overlap graph-based generation of haplotigs for diploids and polyploids.

    Baaijens, Jasmijn A / Schönhuth, Alexander

    Bioinformatics (Oxford, England)

    2019  Volume 35, Issue 21, Page(s) 4281–4289

    Abstract Motivation: Haplotype-aware genome assembly plays an important role in genetics, medicine and various other disciplines, yet generation of haplotype-resolved de novo assemblies remains a major challenge. Beyond distinguishing between errors and true sequential variants, one needs to assign the true variants to the different genome copies. Recent work has pointed out that the enormous quantities of traditional NGS read data have been greatly underexploited in terms of haplotig computation so far, which reflects that methodology for reference independent haplotig computation has not yet reached maturity.
    Results: We present POLYploid genome fitTEr (POLYTE) as a new approach to de novo generation of haplotigs for diploid and polyploid genomes of known ploidy. Our method follows an iterative scheme where in each iteration reads or contigs are joined, based on their interplay in terms of an underlying haplotype-aware overlap graph. Along the iterations, contigs grow while preserving their haplotype identity. Benchmarking experiments on both real and simulated data demonstrate that POLYTE establishes new standards in terms of error-free reconstruction of haplotype-specific sequence. As a consequence, POLYTE outperforms state-of-the-art approaches in various relevant aspects, where advantages become particularly distinct in polyploid settings.
    Availability and implementation: POLYTE is freely available as part of the HaploConduct package at https://github.com/HaploConduct/HaploConduct, implemented in Python and C++.
    Supplementary information: Supplementary data are available at Bioinformatics online.
    MeSH term(s) Algorithms ; Diploidy ; Genome ; Haplotypes ; High-Throughput Nucleotide Sequencing ; Humans ; Polyploidy ; Sequence Analysis, DNA
    Language English
    Publishing date 2019-04-16
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 1422668-6
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/btz255
    Database MEDical Literature Analysis and Retrieval System OnLINE

  9. Article ; Online: phasebook: haplotype-aware de novo assembly of diploid genomes from long reads.

    Luo, Xiao / Kang, Xiongbin / Schönhuth, Alexander

    Genome biology

    2021  Volume 22, Issue 1, Page(s) 299

    Abstract Haplotype-aware diploid genome assembly is crucial in genomics, precision medicine, and many other disciplines. Long-read sequencing technologies have greatly improved genome assembly. However, current long-read assemblers are either reference based, so introduce biases, or fail to capture the haplotype diversity of diploid genomes. We present phasebook, a de novo approach for reconstructing the haplotypes of diploid genomes from long reads. phasebook outperforms other approaches in terms of haplotype coverage by large margins, in addition to achieving competitive performance in terms of assembly errors and assembly contiguity.
    MeSH term(s) Diploidy ; Genomics ; Haplotypes ; High-Throughput Nucleotide Sequencing/methods ; Humans ; Nanopore Sequencing ; Sequence Analysis, DNA/methods ; Software
    Language English
    Publishing date 2021-10-27
    Publishing country England
    Document type Evaluation Study ; Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2040529-7
    ISSN (online) 1474-760X
    DOI 10.1186/s13059-021-02512-x
    Database MEDical Literature Analysis and Retrieval System OnLINE

  10. Article ; Online: Deep variational graph autoencoders for novel host-directed therapy options against COVID-19.

    Ray, Sumanta / Lall, Snehalika / Mukhopadhyay, Anirban / Bandyopadhyay, Sanghamitra / Schönhuth, Alexander

    Artificial intelligence in medicine

    2022  Volume 134, Page(s) 102418

    Abstract The COVID-19 pandemic has continued to raise urgent questions about therapeutic options. Existing drugs that can be repurposed promise rapid implementation in practice because of their prior approval. Conceivably, there is still room for substantial improvement, because most advanced artificial intelligence techniques for screening drug repositories have not been exploited so far. We construct a comprehensive network by combining year-long curated drug-protein/protein-protein interaction data on the one hand, and most recent SARS-CoV-2 protein interaction data on the other hand. We learn the structure of the resulting encompassing molecular interaction network and predict missing links using variational graph autoencoders (VGAEs), an advanced deep learning technique that has not been explored so far. We focus on hitherto unknown links between drugs and human proteins that play key roles in the replication cycle of SARS-CoV-2. Thereby, we establish novel host-directed therapy (HDT) options whose utmost plausibility is confirmed by realistic simulations. As a consequence, many of the predicted links are likely to be crucial for the virus to thrive on the one hand, and can be targeted with existing drugs on the other hand.
    MeSH term(s) Humans ; COVID-19 ; SARS-CoV-2 ; Artificial Intelligence ; Pandemics ; Upper Extremity
    Language English
    Publishing date 2022-10-13
    Publishing country Netherlands
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 645179-2
    ISSN (online) 1873-2860
    ISSN 0933-3657
    DOI 10.1016/j.artmed.2022.102418
    Database MEDical Literature Analysis and Retrieval System OnLINE
