LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 10 of 251 in total

  1. Article ; Online: pathMap: a path-based mapping tool for long noisy reads with high sensitivity.

    Wei, Ze-Gang / Zhang, Xiao-Dan / Fan, Xing-Guo / Qian, Yu / Liu, Fei / Wu, Fang-Xiang

    Briefings in bioinformatics

    2024  Volume 25, Issue 2

    Abstract With the rapid development of single-molecule sequencing (SMS) technologies, the output read length is continuously increasing. Mapping such reads onto a reference genome is one of the most fundamental tasks in sequence analysis. Mapping sensitivity is becoming a major concern since high sensitivity can detect more aligned regions on the reference and obtain more aligned bases, which are useful for downstream analysis. In this study, we present pathMap, a novel k-mer graph-based mapper that is specifically designed for mapping SMS reads with high sensitivity. By viewing the alignment chain as a path containing as many anchors as possible in the matched k-mer graph, pathMap treats chaining as a path selection problem in the directed graph. pathMap iteratively searches for the longest path among the remaining nodes, so that more high-quality candidate chains can be effectively detected and aligned. Compared to other state-of-the-art mapping methods such as minimap2 and Winnowmap2, experimental results on simulated and real-life datasets demonstrate that pathMap obtains at least 11.50% more mapped chains than its closest competitor and increases the mapping sensitivity by 17.28% and 13.84% of bases over the next-best mapper for Pacific Biosciences and Oxford Nanopore sequencing data, respectively. In addition, pathMap is more robust to sequence errors and more sensitive to species- and strain-specific identification of pathogens using MinION reads.
    MeSH term(s) Sequence Analysis, DNA/methods ; High-Throughput Nucleotide Sequencing/methods ; Genome ; Nanopore Sequencing ; Software ; Algorithms
    Language English
    Publishing date 2024-03-22
    Publishing country England
    Document type Journal Article
    ZDB-ID 2068142-2
    ISSN 1477-4054 ; 1467-5463
    ISSN (online) 1477-4054
    ISSN 1467-5463
    DOI 10.1093/bib/bbae107
    Database MEDical Literature Analysis and Retrieval System OnLINE
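
    The chaining idea in the abstract above (treat anchors as nodes of a directed graph, take the longest path as a chain, then repeat on the remaining nodes) can be illustrated with a minimal sketch. This is not the pathMap implementation: the anchor representation, the edge rule (both coordinates strictly increase), the function names, and the `min_anchors` cutoff are all assumptions made for illustration, and real mappers also score gaps and overlaps.

```python
from typing import List, Tuple

Anchor = Tuple[int, int]  # (read_pos, ref_pos) of a matched k-mer; hypothetical format


def longest_chain(anchors: List[Anchor]) -> List[Anchor]:
    """Longest path in the DAG where a -> b if b follows a on both read and reference."""
    pts = sorted(anchors)
    n = len(pts)
    if n == 0:
        return []
    best = [1] * n    # best[j]: number of anchors on the longest path ending at j
    prev = [-1] * n   # back-pointer used to recover that path
    for j in range(n):
        for i in range(j):
            colinear = pts[i][0] < pts[j][0] and pts[i][1] < pts[j][1]
            if colinear and best[i] + 1 > best[j]:
                best[j] = best[i] + 1
                prev[j] = i
    end = max(range(n), key=best.__getitem__)
    chain = []
    while end != -1:
        chain.append(pts[end])
        end = prev[end]
    return chain[::-1]


def iterative_chains(anchors: List[Anchor], min_anchors: int = 2) -> List[List[Anchor]]:
    """Repeatedly extract the longest chain from the anchors not yet used."""
    remaining = list(anchors)
    chains = []
    while remaining:
        chain = longest_chain(remaining)
        if len(chain) < min_anchors:  # assumed stopping rule
            break
        chains.append(chain)
        used = set(chain)
        remaining = [a for a in remaining if a not in used]
    return chains
```

    The quadratic dynamic program keeps the sketch short; it only shows why iterating the search can surface secondary chains that a single best-chain pass would discard.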

  2. Article ; Online: invMap: a sensitive mapping tool for long noisy reads with inversion structural variants.

    Wei, Ze-Gang / Bu, Peng-Yu / Zhang, Xiao-Dan / Liu, Fei / Qian, Yu / Wu, Fang-Xiang

    Bioinformatics (Oxford, England)

    2023  Volume 39, Issue 12

    Abstract Motivation: Longer reads produced by PacBio or Oxford Nanopore sequencers could more frequently span the breakpoints of structural variations (SVs) than shorter reads. Therefore, existing long-read mapping methods often generate wrong alignments and variant calls. Compared to deletions and insertions, inversion events are more difficult to detect since the anchors in inversion regions are nonlinear to those in SV-free regions. To address this issue, this study presents a novel long-read mapping algorithm (named invMap).
    Results: For each long noisy read, invMap first locates the aligned region with a specifically designed scoring method for chaining, then checks the remaining anchors in the aligned region to discover potential inversions. We benchmark invMap on simulated datasets across different genomes and sequencing coverages; experimental results demonstrate that invMap locates aligned regions and calls SVs for inversions more accurately than the competing methods. The real human genome sequencing dataset of NA12878 illustrates that invMap can effectively find more candidate variant calls for inversions than the competing methods.
    Availability and implementation: The invMap software is available at https://github.com/zhang134/invMap.git.
    MeSH term(s) Humans ; Genomics/methods ; High-Throughput Nucleotide Sequencing/methods ; Software ; Algorithms ; Genome, Human ; Chromosome Inversion ; Sequence Analysis, DNA/methods
    Language English
    Publishing date 2023-12-28
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 1422668-6
    ISSN 1367-4811 ; 1367-4803
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/btad726
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article: DMSC: A Dynamic Multi-Seeds Method for Clustering 16S rRNA Sequences Into OTUs.

    Wei, Ze-Gang / Zhang, Shao-Wu

    Frontiers in microbiology

    2019  Volume 10, Page(s) 428

    Abstract Next-generation sequencing (NGS)-based 16S rRNA sequencing by jointly using the PCR amplification and NGS technology is a cost-effective technique, which has been successfully used to study the phylogeny and taxonomy of samples from complex microbiomes or environments. Clustering 16S rRNA sequences into operational taxonomic units (OTUs) is often the first step for many downstream analyses. Heuristic clustering is one of the most widely employed approaches for generating OTUs. However, most heuristic OTUs clustering methods just select one single seed sequence to represent each cluster, resulting in their outcomes suffer from either overestimation of OTUs number or sensitivity to sequencing errors. In this paper, we present a novel dynamic multi-seeds clustering method (namely DMSC) to pick OTUs. DMSC first heuristically generates clusters according to the distance threshold. When the size of a cluster reaches the pre-defined minimum size, then DMSC selects the multi-core sequences (MCS) as the seeds that are defined as the
    Language English
    Publishing date 2019-03-12
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2587354-4
    ISSN 1664-302X
    DOI 10.3389/fmicb.2019.00428
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Article ; Online: smsMap: mapping single molecule sequencing reads by locating the alignment starting positions

    Ze-Gang Wei / Shao-Wu Zhang / Fei Liu

    BMC Bioinformatics, Vol 21, Iss 1, Pp 1-

    2020  Volume 15

    Abstract Background: Single Molecule Sequencing (SMS) technology can produce longer reads with higher sequencing error rate. Mapping these reads to a reference genome is often the most fundamental and computing-intensive step for downstream analysis. Most existing mapping tools generally adopt the traditional seed-and-extend strategy, and the candidate aligned regions for each query read are selected either by counting the number of matched seeds or chaining a group of seeds. However, for all the existing mapping tools, the coverage ratio of the alignment region to the query read is lower, and the read alignment quality and efficiency need to be improved. Here, we introduce smsMap, a novel mapping tool that is specifically designed to map the long reads of SMS to a reference genome.
    Results: smsMap was evaluated with other existing seven SMS mapping tools (e.g., BLASR, minimap2, and BWA-MEM) on both simulated and real-life SMS datasets. The experimental results show that smsMap can efficiently achieve higher aligned read coverage ratio and has higher sensitivity that can align more sequences and bases to the reference genome. Additionally, smsMap is more robust to sequencing errors.
    Conclusions: smsMap is computationally efficient to align SMS reads, especially for the larger size of the reference genome (e.g., H. sapiens genome with over 3 billion base pairs). The source code of smsMap can be freely downloaded from https://github.com/NWPU-903PR/smsMap .
    Keywords Computer applications to medicine. Medical informatics ; R858-859.7 ; Biology (General) ; QH301-705.5
    Language English
    Publishing date 2020-08-01T00:00:00Z
    Publisher BMC
    Document type Article ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  5. Article ; Online: A strategy for the treatment of non-small-cell lung cancer by Ag nanoparticles

    Zheng Gong / Ze-Gang Liu / Kun-Yu Du / Jiang-Hai Wu / Na Yang / Jing-Kui Shu / Sara Amirpour Amraii

    Arabian Journal of Chemistry, Vol 16, Iss 8, Pp 104960- (2023)

    2023  

    Abstract This work describes an eco-friendly approach for the green formulation of Ag nanoparticles by Allium ampeloprasum extract, without using any toxic reducing and capping agents. The morphology, structure, and physicochemical properties were characterized by several analytical techniques such as fourier transformed infrared spectroscopy (FT-IR), field emission scanning electron microscopy (FESEM), transmission electron microscopy (TEM), energy dispersive X-ray spectroscopy (EDS), X-ray diffraction (XRD), and Ultraviolet–visible spectroscopy (UV–Vis). The nanoparticles were explored biologically in the anticancer assays. Exposure of the nanoparticles samples to non-small-cell lung cancer cells resulted in cell death, which was mostly due to necrosis but slightly due to late apoptosis. The viability of malignant cell lines reduced dose-dependently in the presence of nanoparticles. The IC50 of nanoparticles were 301, 266, 255, and 250 µg/mL against EKVX, HOP-62, A549 and NCI-H460 cancer cell lines, respectively. The green-synthesized nanoparticles induced cell death, suggesting anticancer prospects that may offer new insight into the development of an anticancer nanomedicine.
    Keywords Ag ; Characterization ; Plant ; Non-small-cell lung cancer ; Chemistry ; QD1-999
    Subject code 500
    Language English
    Publishing date 2023-08-01T00:00:00Z
    Publisher Elsevier
    Document type Article ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  6. Article ; Online: NPBSS: a new PacBio sequencing simulator for generating the continuous long reads with an empirical model.

    Wei, Ze-Gang / Zhang, Shao-Wu

    BMC bioinformatics

    2018  Volume 19, Issue 1, Page(s) 177

    Abstract Background: PacBio sequencing platform offers longer read lengths than the second-generation sequencing technologies. It has revolutionized de novo genome assembly and enabled the automated reconstruction of reference-quality genomes. Due to its extremely wide range of application areas, fast sequencing simulation systems with high fidelity are in great demand to facilitate the development and comparison of subsequent analysis tools. Although there are several available simulators (e.g., PBSIM, SimLoRD and FASTQSim) that target the specific generation of PacBio libraries, the error rate of simulated sequences is not well matched to the quality value of raw PacBio datasets, especially for PacBio's continuous long reads (CLR).
    Results: By analyzing the characteristic features of CLR data from PacBio SMRT (single molecule real time) sequencing, we developed a new PacBio sequencing simulator (called NPBSS) for producing CLR reads. NPBSS simulator firstly samples the read sequences according to the read length logarithmic normal distribution, and chooses different base quality values with different proportions. Then, NPBSS computes the overall error probability of each base in the read sequence with an empirical model, and calculates the deletion, substitution and insertion probabilities with the overall error probability to generate the PacBio CLR reads. Alignment results demonstrate that NPBSS fits the error rate of the PacBio CLR reads better than PBSIM and FASTQSim. In addition, the assembly results also show that simulated sequences of NPBSS are more like real PacBio CLR data.
    Conclusion: NPBSS simulator is convenient to use with efficient computation and flexible parameters setting. Its generating PacBio CLR reads are more like real PacBio datasets.
    MeSH term(s) Genome/genetics ; High-Throughput Nucleotide Sequencing/methods ; Humans ; Sequence Analysis, DNA/methods
    Language English
    Publishing date 2018-05-22
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2041484-5
    ISSN 1471-2105 ; 1471-2105
    ISSN (online) 1471-2105
    ISSN 1471-2105
    DOI 10.1186/s12859-018-2208-0
    Database MEDical Literature Analysis and Retrieval System OnLINE
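
    The simulation steps named in the abstract above (draw a read length from a log-normal distribution, derive a per-base error probability from its quality value, then split that probability into deletion, substitution and insertion) can be sketched as follows. This is not the NPBSS code: the standard Phred relation P = 10^(-Q/10) is swapped in for NPBSS's empirical model, which the abstract does not specify, and the `mu`, `sigma`, `min_len` and `ratios` values are hypothetical placeholders.

```python
import random
from typing import List, Tuple


def phred_error(q: float) -> float:
    """Standard Phred relation; a stand-in for the empirical model (assumption)."""
    return 10 ** (-q / 10)


def sample_read_length(rng: random.Random, mu: float = 8.0,
                       sigma: float = 0.5, min_len: int = 50) -> int:
    """Draw a read length from a log-normal distribution (parameters are hypothetical)."""
    return max(min_len, int(rng.lognormvariate(mu, sigma)))


def simulate_read(template: str, quals: List[float], rng: random.Random,
                  ratios: Tuple[float, float, float] = (0.5, 0.2, 0.3)) -> str:
    """Apply per-base errors; ratios = (insertion, deletion, substitution), assumed to sum to 1."""
    ins_r, del_r, sub_r = ratios
    out = []
    for base, q in zip(template, quals):
        p = phred_error(q)            # overall error probability of this base
        u = rng.random()
        if u < p * del_r:             # deletion: drop the base
            continue
        if u < p * (del_r + sub_r):   # substitution: emit a different base
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        elif u < p:                   # insertion: keep the base, add a random one
            out.append(base)
            out.append(rng.choice("ACGT"))
        else:                         # no error
            out.append(base)
    return "".join(out)
```

    A typical use would draw a length with `sample_read_length`, cut a template of that length from a reference, assign quality values, and pass both to `simulate_read` with a seeded `random.Random` for reproducibility.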

  7. Article ; Online: smsMap: mapping single molecule sequencing reads by locating the alignment starting positions.

    Wei, Ze-Gang / Zhang, Shao-Wu / Liu, Fei

    BMC bioinformatics

    2020  Volume 21, Issue 1, Page(s) 341

    Abstract Background: Single Molecule Sequencing (SMS) technology can produce longer reads with higher sequencing error rate. Mapping these reads to a reference genome is often the most fundamental and computing-intensive step for downstream analysis. Most existing mapping tools generally adopt the traditional seed-and-extend strategy, and the candidate aligned regions for each query read are selected either by counting the number of matched seeds or chaining a group of seeds. However, for all the existing mapping tools, the coverage ratio of the alignment region to the query read is lower, and the read alignment quality and efficiency need to be improved. Here, we introduce smsMap, a novel mapping tool that is specifically designed to map the long reads of SMS to a reference genome.
    Results: smsMap was evaluated with other existing seven SMS mapping tools (e.g., BLASR, minimap2, and BWA-MEM) on both simulated and real-life SMS datasets. The experimental results show that smsMap can efficiently achieve higher aligned read coverage ratio and has higher sensitivity that can align more sequences and bases to the reference genome. Additionally, smsMap is more robust to sequencing errors.
    Conclusions: smsMap is computationally efficient to align SMS reads, especially for the larger size of the reference genome (e.g., H. sapiens genome with over 3 billion base pairs). The source code of smsMap can be freely downloaded from https://github.com/NWPU-903PR/smsMap .
    MeSH term(s) Algorithms ; Computer Simulation ; Databases, Genetic ; Escherichia coli/genetics ; Humans ; Sequence Alignment ; Sequence Analysis, DNA/methods ; Software ; Time Factors
    Language English
    Publishing date 2020-08-04
    Publishing country England
    Document type Journal Article
    ZDB-ID 2041484-5
    ISSN 1471-2105 ; 1471-2105
    ISSN (online) 1471-2105
    ISSN 1471-2105
    DOI 10.1186/s12859-020-03698-w
    Database MEDical Literature Analysis and Retrieval System OnLINE

  8. Article ; Online: DBH: A de Bruijn graph-based heuristic method for clustering large-scale 16S rRNA sequences into OTUs.

    Wei, Ze-Gang / Zhang, Shao-Wu

    Journal of theoretical biology

    2017  Volume 425, Page(s) 80–87

    Abstract Recent sequencing revolution driven by high-throughput technologies has led to rapid accumulation of 16S rRNA sequences for microbial communities. Clustering short sequences into operational taxonomic units (OTUs) is an initial crucial process in analyzing metagenomic data. Although many heuristic methods have been proposed for OTU inferences with low computational complexity, they just select one sequence as the seed for each cluster and the results are sensitive to the selected sequences that represent the clusters. To address this issue, we present a de Bruijn graph-based heuristic clustering method (DBH) for clustering massive 16S rRNA sequences into OTUs by introducing a novel seed selection strategy and greedy clustering approach. Compared with existing widely used methods on several simulated and real-life metagenomic datasets, the results show that DBH has higher clustering performance and low memory usage, facilitating the overestimation of OTUs number. DBH is more effective to handle large-scale metagenomic datasets. The DBH software can be freely downloaded from https://github.com/nwpu134/DBH.git for academic users.
    MeSH term(s) Algorithms ; Cluster Analysis ; Computational Biology/methods ; Gastrointestinal Microbiome/genetics ; Heuristics ; Humans ; Metagenomics/methods ; Phylogeny ; RNA, Bacterial/genetics ; RNA, Ribosomal, 16S/genetics ; Sequence Analysis, DNA/methods
    Chemical Substances RNA, Bacterial ; RNA, Ribosomal, 16S
    Language English
    Publishing date 2017-04-26
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2972-5
    ISSN 1095-8541 ; 0022-5193
    ISSN (online) 1095-8541
    ISSN 0022-5193
    DOI 10.1016/j.jtbi.2017.04.019
    Database MEDical Literature Analysis and Retrieval System OnLINE

  9. Article ; Online: Retraction notice to "Potential of β-elemene induced ferroptosis through Pole2-mediated p53 and PI3K/AKT signaling in lung cancer cells" [Chem. Biol. Interact. 365 (25 September 2022) 110088].

    Gong, Zheng / Liu, Ze-Gang / Du, Kun-Yu / Wu, Jiang-Hai / Yang, Na / Malhotra, Anshoo / Shu, Jing-Kui

    Chemico-biological interactions

    2022  Volume 369, Page(s) 110302

    Abstract This article has been retracted: please see Elsevier Policy on Article Withdrawal (https://www.elsevier.com/about/our-business/policies/article-withdrawal). The entire 'Reason' text must be identical to that in the XML version Box 6).
    Language English
    Publishing date 2022-12-10
    Publishing country Ireland
    Document type Journal Article ; Retraction of Publication
    ZDB-ID 218799-1
    ISSN 1872-7786 ; 0009-2797
    ISSN (online) 1872-7786
    ISSN 0009-2797
    DOI 10.1016/j.cbi.2022.110302
    Database MEDical Literature Analysis and Retrieval System OnLINE

  10. Article ; Online: NPBSS: a new PacBio sequencing simulator for generating the continuous long reads with an empirical model

    Ze-Gang Wei / Shao-Wu Zhang

    BMC Bioinformatics, Vol 19, Iss 1, Pp 1-

    2018  Volume 9

    Abstract Background: PacBio sequencing platform offers longer read lengths than the second-generation sequencing technologies. It has revolutionized de novo genome assembly and enabled the automated reconstruction of reference-quality genomes. Due to its extremely wide range of application areas, fast sequencing simulation systems with high fidelity are in great demand to facilitate the development and comparison of subsequent analysis tools. Although there are several available simulators (e.g., PBSIM, SimLoRD and FASTQSim) that target the specific generation of PacBio libraries, the error rate of simulated sequences is not well matched to the quality value of raw PacBio datasets, especially for PacBio’s continuous long reads (CLR).
    Results: By analyzing the characteristic features of CLR data from PacBio SMRT (single molecule real time) sequencing, we developed a new PacBio sequencing simulator (called NPBSS) for producing CLR reads. NPBSS simulator firstly samples the read sequences according to the read length logarithmic normal distribution, and chooses different base quality values with different proportions. Then, NPBSS computes the overall error probability of each base in the read sequence with an empirical model, and calculates the deletion, substitution and insertion probabilities with the overall error probability to generate the PacBio CLR reads. Alignment results demonstrate that NPBSS fits the error rate of the PacBio CLR reads better than PBSIM and FASTQSim. In addition, the assembly results also show that simulated sequences of NPBSS are more like real PacBio CLR data.
    Conclusion: NPBSS simulator is convenient to use with efficient computation and flexible parameters setting. Its generating PacBio CLR reads are more like real PacBio datasets.
    Keywords Sequence simulator ; Quality value ; Continuous long reads ; SMRT ; PacBio ; Computer applications to medicine. Medical informatics ; R858-859.7 ; Biology (General) ; QH301-705.5
    Subject code 620
    Language English
    Publishing date 2018-05-01T00:00:00Z
    Publisher BMC
    Document type Article ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
