LIVIVO - The Search Portal for Life Sciences

Search results

Result 1 - 10 of total 13

  1. Article ; Online: Sequencing error profiles of Illumina sequencing instruments.

    Stoler, Nicholas / Nekrutenko, Anton

    NAR genomics and bioinformatics

    2021  Volume 3, Issue 1, Page(s) lqab019

    Abstract Sequencing technology has achieved great advances in the past decade. Studies have previously shown the quality of specific instruments in controlled conditions. Here, we developed a method able to retroactively determine the error rate of most public sequencing datasets. To do this, we utilized the overlaps between reads that are a feature of many sequencing libraries. With this method, we surveyed 1943 different datasets from seven different sequencing instruments produced by Illumina. We show that among public datasets, the more expensive platforms like HiSeq and NovaSeq have a lower error rate and less variation. But we also discovered that there is great variation within each platform, with the accuracy of a sequencing experiment depending greatly on the experimenter. We show the importance of sequence context, especially the phenomenon where preceding bases bias the following bases toward the same identity. We also show the difference in patterns of sequence bias between instruments. Contrary to expectations based on the underlying chemistry, HiSeq X Ten and NovaSeq 6000 share notable exceptions to the preceding-base bias. Our results demonstrate the importance of the specific circumstances of every sequencing experiment, and the importance of evaluating the quality of each one.
    Language English
    Publishing date 2021-03-27
    Publishing country England
    Document type Journal Article
    ISSN 2631-9268
    ISSN (online) 2631-9268
    DOI 10.1093/nargab/lqab019
    Database MEDical Literature Analysis and Retrieval System OnLINE
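
The overlap idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the position where the mate begins inside the first read is already known (the hypothetical `offset` argument), and it only counts disagreements between the two reads of a pair. A disagreement means at least one of the two base calls is wrong, so the resulting rate is a rough proxy for sequencing error.

```python
# Illustrative sketch only; function names and the `offset` parameter are assumptions.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def overlap_mismatches(read1, read2, offset):
    """Compare read1 with the reverse-complemented mate read2.

    offset is the index in read1 where the reverse-complemented mate starts.
    Returns (mismatches, compared positions); positions with N are skipped.
    """
    mate = reverse_complement(read2)
    mismatches = 0
    compared = 0
    for a, b in zip(read1[offset:], mate):
        if a == "N" or b == "N":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    return mismatches, compared


def disagreement_rate(pairs):
    """Pool (read1, read2, offset) triples into one disagreement rate."""
    total_mismatches = 0
    total_compared = 0
    for read1, read2, offset in pairs:
        m, c = overlap_mismatches(read1, read2, offset)
        total_mismatches += m
        total_compared += c
    return total_mismatches / total_compared if total_compared else 0.0
```

Pooling counts across many read pairs, rather than averaging per-pair rates, keeps short overlaps from dominating the estimate.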

  2. Article ; Online: Erratum: Increased yields of duplex sequencing data by a series of quality control tools.

    Povysil, Gundula / Heinzl, Monika / Salazar, Renato / Stoler, Nicholas / Nekrutenko, Anton / Tiemann-Boege, Irene

    NAR genomics and bioinformatics

    2021  Volume 3, Issue 1, Page(s) lqab014

    Abstract This corrects the article DOI: 10.1093/nargab/lqab002.
    Language English
    Publishing date 2021-03-01
    Publishing country England
    Document type Published Erratum
    ISSN 2631-9268
    ISSN (online) 2631-9268
    DOI 10.1093/nargab/lqab014
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article ; Online: Increased yields of duplex sequencing data by a series of quality control tools.

    Povysil, Gundula / Heinzl, Monika / Salazar, Renato / Stoler, Nicholas / Nekrutenko, Anton / Tiemann-Boege, Irene

    NAR genomics and bioinformatics

    2021  Volume 3, Issue 1, Page(s) lqab002

    Abstract Duplex sequencing is currently the most reliable method to identify ultra-low frequency DNA variants by grouping sequence reads derived from the same DNA molecule into families with information on the forward and reverse strand. However, only a small proportion of reads are assembled into duplex consensus sequences (DCS), and reads with potentially valuable information are discarded at different steps of the bioinformatics pipeline, especially reads without a family. We developed a bioinformatics toolset that analyses the tag and family composition in order to understand data loss and implement modifications to maximize the data output for the variant calling. Specifically, our tools show that tags contain polymerase chain reaction and sequencing errors that contribute to data loss and lower DCS yields. Our tools also identified chimeras, which likely reflect barcode collisions. Finally, we also developed a tool that re-examines variant calls from raw reads and provides different summary data that categorizes the confidence level of a variant call by a tier-based system. With this tool, we can include reads without a family and check the reliability of the call, which substantially increases the sequencing depth for variant calling, a particularly important advantage for low-input samples or low-coverage regions.
    Language English
    Publishing date 2021-02-09
    Publishing country England
    Document type Journal Article
    ISSN 2631-9268
    ISSN (online) 2631-9268
    DOI 10.1093/nargab/lqab002
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Article ; Online: Accurate sequencing of DNA motifs able to form alternative (non-B) structures.

    Weissensteiner, Matthias H / Cremona, Marzia A / Guiblet, Wilfried M / Stoler, Nicholas / Harris, Robert S / Cechova, Monika / Eckert, Kristin A / Chiaromonte, Francesca / Huang, Yi-Fei / Makova, Kateryna D

    Genome research

    2023  Volume 33, Issue 6, Page(s) 907–922

    Abstract Approximately 13% of the human genome at certain motifs has the potential to form noncanonical (non-B) DNA structures (e.g., G-quadruplexes, cruciforms, and Z-DNA), which regulate many cellular processes but also affect the activity of polymerases and helicases. Because sequencing technologies use these enzymes, they might possess increased errors at non-B structures. To evaluate this, we analyzed error rates, read depth, and base quality of Illumina, Pacific Biosciences (PacBio) HiFi, and Oxford Nanopore Technologies (ONT) sequencing at non-B motifs. All technologies showed altered sequencing success for most non-B motif types, although this could be owing to several factors, including structure formation, biased GC content, and the presence of homopolymers. Single-nucleotide mismatch errors had low biases in HiFi and ONT for all non-B motif types but were increased for G-quadruplexes and Z-DNA in all three technologies. Deletion errors were increased for all non-B types but Z-DNA in Illumina and HiFi, as well as only for G-quadruplexes in ONT. Insertion errors for non-B motifs were highly, moderately, and slightly elevated in Illumina, HiFi, and ONT, respectively. Additionally, we developed a probabilistic approach to determine the number of false positives at non-B motifs depending on sample size and variant frequency, and applied it to publicly available data sets (1000 Genomes, Simons Genome Diversity Project, and gnomAD). We conclude that elevated sequencing errors at non-B DNA motifs should be considered in low-read-depth studies (single-cell, ancient DNA, and pooled-sample population sequencing) and in scoring rare variants. Combining technologies should maximize sequencing accuracy in future studies of non-B DNA.
    MeSH term(s) Humans ; Nucleotide Motifs ; DNA, Z-Form ; Sequence Analysis, DNA ; DNA/genetics ; Base Composition ; High-Throughput Nucleotide Sequencing ; Nanopores
    Chemical Substances DNA, Z-Form ; DNA (9007-49-2)
    Language English
    Publishing date 2023-07-11
    Publishing country United States
    Document type Journal Article ; Research Support, N.I.H., Extramural
    ZDB-ID 1284872-4
    ISSN 1549-5469 ; 1088-9051 ; 1054-9803
    ISSN (online) 1549-5469
    DOI 10.1101/gr.277490.122
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Article ; Online: Family reunion via error correction: an efficient analysis of duplex sequencing data.

    Stoler, Nicholas / Arbeithuber, Barbara / Povysil, Gundula / Heinzl, Monika / Salazar, Renato / Makova, Kateryna D / Tiemann-Boege, Irene / Nekrutenko, Anton

    BMC bioinformatics

    2020  Volume 21, Issue 1, Page(s) 96

    Abstract Background: Duplex sequencing is the most accurate approach for identification of sequence variants present at very low frequencies. Its power comes from pooling together multiple descendants of both strands of original DNA molecules, which allows distinguishing true nucleotide substitutions from PCR amplification and sequencing artifacts. This strategy comes at a cost: sequencing the same molecule multiple times increases dynamic range but significantly diminishes coverage, making whole genome duplex sequencing prohibitively expensive. Furthermore, every duplex experiment produces a substantial proportion of singleton reads that cannot be used in the analysis and are thrown away.
    Results: In this paper we demonstrate that a significant fraction of these reads contains PCR or sequencing errors within duplex tags. Correction of such errors allows "reuniting" these reads with their respective families increasing the output of the method and making it more cost effective.
    Conclusions: We combine an error correction strategy with a number of algorithmic improvements in a new version of the duplex analysis software, Du Novo 2.0. It is written in Python, C, AWK, and Bash. It is open source and readily available through Galaxy, Bioconda, and GitHub: https://github.com/galaxyproject/dunovo.
    MeSH term(s) Algorithms ; DNA/chemistry ; DNA/metabolism ; Humans ; Sequence Alignment ; Sequence Analysis, DNA ; User-Computer Interface
    Chemical Substances DNA (9007-49-2)
    Language English
    Publishing date 2020-03-04
    Publishing country England
    Document type Journal Article
    ZDB-ID 2041484-5
    ISSN 1471-2105
    ISSN (online) 1471-2105
    DOI 10.1186/s12859-020-3419-8
    Database MEDical Literature Analysis and Retrieval System OnLINE
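
The "reuniting" of singleton reads described in the Results above can be sketched as follows. This is a simplified illustration under stated assumptions, not the Du Novo 2.0 algorithm: tags are plain strings of equal length, a family is any tag seen more than once, and a singleton tag is reassigned only when exactly one family tag lies within a Hamming distance of `max_dist` (a hypothetical parameter).

```python
# Illustrative sketch only; names and thresholds are assumptions, not Du Novo's API.
from collections import Counter


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def correct_tags(tags, max_dist=1):
    """Reassign singleton tags to a unique nearby family tag.

    Returns a list of the same length as `tags`, with corrected tags.
    Ambiguous singletons (zero or several nearby families) are left unchanged.
    """
    counts = Counter(tags)
    families = [tag for tag, n in counts.items() if n > 1]
    mapping = {}
    for tag, n in counts.items():
        if n > 1:
            mapping[tag] = tag
            continue
        close = [
            fam for fam in families
            if len(fam) == len(tag) and hamming(tag, fam) <= max_dist
        ]
        mapping[tag] = close[0] if len(close) == 1 else tag
    return [mapping[tag] for tag in tags]
```

Leaving ambiguous singletons untouched is a conservative choice: merging a read into the wrong family would contaminate that family's consensus.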

  6. Article ; Online: A 1,000-Year-Old RNA Virus.

    Peyambari, Mahtab / Warner, Sylvia / Stoler, Nicholas / Rainer, Drew / Roossinck, Marilyn J

    Journal of virology

    2018  Volume 93, Issue 1

    Abstract Only a few RNA viruses have been discovered from archaeological samples, the oldest dating from about 750 years ago. Using ancient maize cobs from Antelope House, Arizona, dating from ca. 1,000 CE, we discovered a novel plant virus with a double-stranded RNA genome. The virus is a member of the family
    MeSH term(s) Arizona ; Evolution, Molecular ; Genome Size ; Geologic Sediments/virology ; High-Throughput Nucleotide Sequencing ; Phylogeny ; Plant Viruses/classification ; Plant Viruses/isolation & purification ; RNA Viruses/genetics ; RNA, Double-Stranded/genetics ; Sequence Analysis, DNA ; Sequence Analysis, RNA ; Zea mays/virology
    Chemical Substances RNA, Double-Stranded
    Language English
    Publishing date 2018-12-10
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 80174-4
    ISSN 1098-5514 ; 0022-538X
    ISSN (online) 1098-5514
    DOI 10.1128/JVI.01188-18
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article: Streamlined analysis of duplex sequencing data with Du Novo

    Stoler, Nicholas / Arbeithuber, Barbara / Guiblet, Wilfried / Makova, Kateryna D / Nekrutenko, Anton

    Genome biology

    2016  Volume 17, Issue 1, Page(s) 180

    Abstract Duplex sequencing was originally developed to detect rare nucleotide polymorphisms normally obscured by the noise of high-throughput sequencing. Here we describe a new, streamlined, reference-free approach for the analysis of duplex sequencing data. We show the approach performs well on simulated data and precisely reproduces previously published results, and apply it to a newly produced dataset, enabling us to type low-frequency variants in human mitochondrial DNA. Finally, we provide all necessary tools as stand-alone components as well as integrate them into the Galaxy platform. All analyses performed in this manuscript can be repeated exactly as described at http://usegalaxy.org/duplex.
    Keywords data collection ; high-throughput nucleotide sequencing ; humans ; mitochondrial DNA
    Language English
    Dates of publication 2016-12
    Size p. 180.
    Publishing place BioMed Central
    Document type Article
    ZDB-ID 2040529-7
    ISSN 1474-760X ; 1465-6906
    ISSN (online) 1474-760X
    DOI 10.1186/s13059-016-1039-4
    Database NAL-Catalogue (AGRICOLA)

  8. Article ; Online: Streamlined analysis of duplex sequencing data with Du Novo.

    Stoler, Nicholas / Arbeithuber, Barbara / Guiblet, Wilfried / Makova, Kateryna D / Nekrutenko, Anton

    Genome biology

    2016  Volume 17, Issue 1, Page(s) 180

    Abstract Duplex sequencing was originally developed to detect rare nucleotide polymorphisms normally obscured by the noise of high-throughput sequencing. Here we describe a new, streamlined, reference-free approach for the analysis of duplex sequencing data. We show the approach performs well on simulated data and precisely reproduces previously published results, and apply it to a newly produced dataset, enabling us to type low-frequency variants in human mitochondrial DNA. Finally, we provide all necessary tools as stand-alone components as well as integrate them into the Galaxy platform. All analyses performed in this manuscript can be repeated exactly as described at http://usegalaxy.org/duplex.
    MeSH term(s) DNA, Mitochondrial/genetics ; Genomics ; High-Throughput Nucleotide Sequencing ; Humans ; Polymorphism, Single Nucleotide/genetics ; Sequence Analysis, DNA/methods ; Software
    Chemical Substances DNA, Mitochondrial
    Language English
    Publishing date 2016-08-26
    Publishing country England
    Document type Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't
    ZDB-ID 2040529-7
    ISSN 1474-760X
    ISSN (online) 1474-760X
    DOI 10.1186/s13059-016-1039-4
    Database MEDical Literature Analysis and Retrieval System OnLINE

  9. Article ; Online: Age-related accumulation of de novo mitochondrial mutations in mammalian oocytes and somatic tissues.

    Arbeithuber, Barbara / Hester, James / Cremona, Marzia A / Stoler, Nicholas / Zaidi, Arslan / Higgins, Bonnie / Anthony, Kate / Chiaromonte, Francesca / Diaz, Francisco J / Makova, Kateryna D

    PLoS biology

    2020  Volume 18, Issue 7, Page(s) e3000745

    Abstract Mutations create genetic variation for other evolutionary forces to operate on and cause numerous genetic diseases. Nevertheless, how de novo mutations arise remains poorly understood. Progress in the area is hindered by the fact that error rates of conventional sequencing technologies (1 in 100 or 1,000 base pairs) are several orders of magnitude higher than de novo mutation rates (1 in 10,000,000 or 100,000,000 base pairs per generation). Moreover, previous analyses of germline de novo mutations examined pedigrees (and not germ cells) and thus were likely affected by selection. Here, we applied highly accurate duplex sequencing to detect low-frequency, de novo mutations in mitochondrial DNA (mtDNA) directly from oocytes and from somatic tissues (brain and muscle) of 36 mice from two independent pedigrees. We found mtDNA mutation frequencies 2- to 3-fold higher in 10-month-old than in 1-month-old mice, demonstrating mutation accumulation during a period of only 9 months. Mutation frequencies and patterns differed between germline and somatic tissues and among mtDNA regions, suggestive of distinct mutagenesis mechanisms. Additionally, we discovered a more pronounced genetic drift of mitochondrial genetic variants in the germline of older versus younger mice, arguing for mtDNA turnover during oocyte meiotic arrest. Our study deciphered for the first time the intricacies of germline de novo mutagenesis using duplex sequencing directly in oocytes, which provided unprecedented resolution and minimized selection effects present in pedigree studies. Moreover, our work provides important information about the origins and accumulation of mutations with aging/maturation and has implications for delayed reproduction in modern human societies. Furthermore, the duplex sequencing method we optimized for single cells opens avenues for investigating low-frequency mutations in other studies.
    MeSH term(s) Aging/genetics ; Animals ; DNA Mutational Analysis ; DNA, Mitochondrial/genetics ; Female ; Gene Frequency/genetics ; Genetic Drift ; Germ Cells/metabolism ; Inheritance Patterns/genetics ; Logistic Models ; Male ; Mammals/genetics ; Mice ; Mitochondria/genetics ; Models, Genetic ; Mutation/genetics ; Mutation Rate ; Nucleotides/genetics ; Oocytes/metabolism ; Organ Specificity/genetics ; Pedigree
    Chemical Substances DNA, Mitochondrial ; Nucleotides
    Language English
    Publishing date 2020-07-15
    Publishing country United States
    Document type Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't
    ZDB-ID 2126776-5
    ISSN 1545-7885 ; 1544-9173
    ISSN (online) 1545-7885
    DOI 10.1371/journal.pbio.3000745
    Database MEDical Literature Analysis and Retrieval System OnLINE

  10. Article ; Online: Dissemination of scientific software with Galaxy ToolShed.

    Blankenberg, Daniel / Von Kuster, Gregory / Bouvier, Emil / Baker, Dannon / Afgan, Enis / Stoler, Nicholas / Taylor, James / Nekrutenko, Anton

    Genome biology

    2014  Volume 15, Issue 2, Page(s) 403

    Abstract The proliferation of web-based integrative analysis frameworks has enabled users to perform complex analyses directly through the web. Unfortunately, it also revoked the freedom to easily select the most appropriate tools. To address this, we have developed Galaxy ToolShed.
    MeSH term(s) Computational Biology ; Internet ; Science ; Software
    Language English
    Publishing date 2014-02-20
    Publishing country England
    Document type Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't ; Research Support, U.S. Gov't, Non-P.H.S.
    ZDB-ID 2040529-7
    ISSN 1474-760X
    ISSN (online) 1474-760X
    DOI 10.1186/gb4161
    Database MEDical Literature Analysis and Retrieval System OnLINE
