LIVIVO - The Search Portal for Life Sciences

Search results

Result 1 - 10 of total 26

  1. Article: MegaD: Deep Learning for Rapid and Accurate Disease Status Prediction of Metagenomic Samples.

    Mreyoud, Yassin / Song, Myoungkyu / Lim, Jihun / Ahn, Tae-Hyuk

    Life (Basel, Switzerland)

    2022  Volume 12, Issue 5

    Abstract The diversity within different microbiome communities that drive biogeochemical processes influences many different phenotypes. Analyses of these communities and their diversity by countless microbiome projects have revealed an important role of metagenomics in understanding the complex relation between microbes and their environments. This relationship can be understood in the context of microbiome composition of specific known environments. These compositions can then be used as a template for predicting the status of similar environments. Machine learning has been applied as a key component to this predictive task. Several analysis tools have already been published utilizing machine learning methods for metagenomic analysis. Despite the previously proposed machine learning models, the performance of deep neural networks is still under-researched. Given the nature of metagenomic data, deep neural networks could provide a strong boost to growth in the prediction accuracy in metagenomic analysis applications. To meet this urgent demand, we present a deep learning based tool that utilizes a deep neural network implementation for phenotypic prediction of unknown metagenomic samples. (1) First, our tool takes as input taxonomic profiles from 16S or WGS sequencing data. (2) Second, given the samples, our tool builds a model based on a deep neural network by computing multi-level classification. (3) Lastly, given the model, our tool classifies an unknown sample with its unlabeled taxonomic profile. In the benchmark experiments, we deduced that an analysis method facilitating a deep neural network such as our tool can show promising results in increasing the prediction accuracy on several samples compared to other machine learning models.
    Language English
    Publishing date 2022-04-30
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2662250-6
    ISSN 2075-1729
    DOI 10.3390/life12050669
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Article: Chromosome-level Subgenome-aware

    Gardner, Cory / Chen, Junhao / Hadfield, Christina / Lu, Zhaolian / Debruin, David / Zhan, Yu / Donlin, Maureen J / Lin, Zhenguo / Ahn, Tae-Hyuk

    bioRxiv : the preprint server for biology

    2024  

    Abstract Interspecies hybridization is prevalent in various eukaryotic lineages and plays important roles in phenotypic diversification, adaption, and speciation. To better understand the changes that occurred in the different subgenomes of a hybrid species and how they facilitated adaptation, we completed chromosome-level
    Language English
    Publishing date 2024-03-19
    Publishing country United States
    Document type Preprint
    DOI 10.1101/2024.03.17.585453
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article: Comparison of 16S and whole genome dog microbiomes using machine learning.

    Lewis, Scott / Nash, Andrea / Li, Qinghong / Ahn, Tae-Hyuk

    BioData mining

    2021  Volume 14, Issue 1, Page(s) 41

    Abstract Background: Recent advances in sequencing technologies have driven studies identifying the microbiome as a key regulator of overall health and disease in the host. Both 16S amplicon and whole genome shotgun sequencing technologies are currently being used to investigate this relationship, however, the choice of sequencing technology often depends on the nature and experimental design of the study. In principle, the outputs rendered by analysis pipelines are heavily influenced by the data used as input; it is then important to consider that the genomic features produced by different sequencing technologies may emphasize different results.
    Results: In this work, we use public 16S amplicon and whole genome shotgun sequencing (WGS) data from the same dogs to investigate the relationship between sequencing technology and the captured gut metagenomic landscape in dogs. In our analyses, we compare the taxonomic resolution at the species and phyla levels and benchmark 12 classification algorithms in their ability to accurately identify host phenotype using only taxonomic relative abundance information from 16S and WGS datasets with identical study designs. Our best performing model, a random forest trained by the WGS dataset, identified a species (Bacteroides coprocola) that predominantly contributes to the abundance of leuB, a gene involved in branched chain amino acid biosynthesis; a risk factor for glucose intolerance, insulin resistance, and type 2 diabetes. This trend was not conserved when we trained the model using 16S sequencing profiles from the same dogs.
    Conclusions: Our results indicate that WGS sequencing of dog microbiomes detects a greater taxonomic diversity than 16S sequencing of the same dogs at the species level and with respect to four gut-enriched phyla levels. This difference in detection does not significantly impact the performance metrics of machine learning algorithms after down-sampling. Although the important features extracted from our best performing model are not conserved between the two technologies, the important features extracted from either instance indicate the utility of machine learning algorithms in identifying biologically meaningful relationships between the host and microbiome community members. In conclusion, this work provides the first systematic machine learning comparison of dog 16S and WGS microbiomes derived from identical study designs.
    Language English
    Publishing date 2021-08-21
    Publishing country England
    Document type Journal Article
    ZDB-ID 2438773-3
    ISSN 1756-0381
    DOI 10.1186/s13040-021-00270-x
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Article ; Online: TSSr: an R package for comprehensive analyses of TSS sequencing data.

    Lu, Zhaolian / Berry, Keenan / Hu, Zhenbin / Zhan, Yu / Ahn, Tae-Hyuk / Lin, Zhenguo

    NAR genomics and bioinformatics

    2021  Volume 3, Issue 4, Page(s) lqab108

    Abstract Transcription initiation is regulated in a highly organized fashion to ensure proper cellular functions. Accurate identification of transcription start sites (TSSs) and quantitative characterization of transcription initiation activities are fundamental steps for studies of regulated transcriptions and core promoter structures. Several high-throughput techniques have been developed to sequence the very 5'end of RNA transcripts (TSS sequencing) on the genome scale. Bioinformatics tools are essential for processing, analysis, and visualization of TSS sequencing data. Here, we present TSSr, an R package that provides rich functions for mapping TSS and characterizations of structures and activities of core promoters based on all types of TSS sequencing data. Specifically, TSSr implements several newly developed algorithms for accurately identifying TSSs from mapped sequencing reads and inference of core promoters, which are a prerequisite for subsequent functional analyses of TSS data. Furthermore, TSSr also enables users to export various types of TSS data that can be visualized by genome browser for inspection of promoter activities in association with other genomic features, and to generate publication-ready TSS graphs. These user-friendly features could greatly facilitate studies of transcription initiation based on TSS sequencing data. The source code and detailed documentations of TSSr can be freely accessed at https://github.com/Linlab-slu/TSSr.
    Language English
    Publishing date 2021-11-12
    Publishing country England
    Document type Journal Article
    ISSN 2631-9268
    ISSN (online) 2631-9268
    DOI 10.1093/nargab/lqab108
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Article ; Online: MegaR: an interactive R package for rapid sample classification and phenotype prediction using metagenome profiles and machine learning.

    Dhungel, Eliza / Mreyoud, Yassin / Gwak, Ho-Jin / Rajeh, Ahmad / Rho, Mina / Ahn, Tae-Hyuk

    BMC bioinformatics

    2021  Volume 22, Issue 1, Page(s) 25

    Abstract Background: Diverse microbiome communities drive biogeochemical processes and evolution of animals in their ecosystems. Many microbiome projects have demonstrated the power of using metagenomics to understand the structures and factors influencing the function of the microbiomes in their environments. In order to characterize the effects of microbiome composition on human health, diseases, and even ecosystems, one must first understand the relationship of microbes and their environment in different samples. Running machine learning models on metagenomic sequencing data is encouraged for this purpose, but it is not easy to build an appropriate machine learning model for every metagenomic dataset.
    Results: We introduce MegaR, an R Shiny package and web application, to build an unbiased machine learning model effortlessly with interactive visual analysis. The MegaR employs taxonomic profiles from either whole metagenome sequencing or 16S rRNA sequencing data to develop machine learning models and classify the samples into two or more categories. It provides various options for model fine tuning throughout the analysis pipeline such as data processing, multiple machine learning techniques, model validation, and unknown sample prediction that can be used to achieve the highest prediction accuracy possible for any given dataset while still maintaining a user-friendly experience.
    Conclusions: Metagenomic sample classification and phenotype prediction are important, particularly when applied to diagnostic methods for identifying and predicting microbe-related human diseases. MegaR provides various interactive visualizations that let users build an accurate machine learning model without difficulty. Predicting unknown samples with a properly trained MegaR model will help researchers identify sample properties with a fast turnaround time.
    MeSH term(s) Humans ; Machine Learning ; Metagenome ; Metagenomics ; Phenotype ; RNA, Ribosomal, 16S/genetics
    Chemical Substances RNA, Ribosomal, 16S
    Language English
    Publishing date 2021-01-18
    Publishing country England
    Document type Journal Article
    ZDB-ID 2041484-5
    ISSN 1471-2105
    ISSN (online) 1471-2105
    DOI 10.1186/s12859-020-03933-4
    Database MEDical Literature Analysis and Retrieval System OnLINE

  6. Article ; Online: Massive metagenomic data analysis using abundance-based machine learning.

    Harris, Zachary N / Dhungel, Eliza / Mosior, Matthew / Ahn, Tae-Hyuk

    Biology direct

    2019  Volume 14, Issue 1, Page(s) 12

    Abstract Background: Metagenomics is the application of modern genomic techniques to investigate the members of a microbial community directly in their natural environments and is widely used in many studies to survey the communities of microbial organisms that live in diverse ecosystems. In order to understand the metagenomic profile of one of the densest interaction spaces for millions of people, the public transit system, the MetaSUB international Consortium has collected and sequenced metagenomes from subways of different cities across the world. In collaboration with CAMDA, MetaSUB has made the metagenomic samples from these cities available for an open challenge of data analysis including, but not limited in scope to, the identification of unknown samples.
    Results: To distinguish the metagenomic profiling among different cities and also predict unknown samples precisely based on the profiling, two different approaches are proposed using machine learning techniques; one is a read-based taxonomy profiling of each sample and prediction method, and the other is a reduced representation assembly-based method. Among various machine learning techniques tested, the random forest technique showed promising results as a suitable classifier for both approaches. Random forest models developed from read-based taxonomic profiling could achieve an accuracy of 91% with 95% confidence interval between 80 and 93%. The assembly-based random forest model prediction also reached 90% accuracy. However, both models achieved roughly the same accuracy on the testing set, whereby they both failed to predict the most abundant label.
    Conclusion: Our results suggest that both read-based and assembly-based approaches are powerful tools for the analysis of metagenomics data. Moreover, our results suggest that reduced representation assembly-based methods are able to simultaneously provide high-accuracy prediction on available data. Overall, we show that metagenomic samples can be traced back to their location with careful generation of features from the composition of microbes and utilizing existing machine learning algorithms. Proposed approaches show high accuracy of prediction, but require careful inspection before making any decisions due to sample noise or complexity.
    Reviewers: This article was reviewed by Eugene V. Koonin, Jing Zhou and Serghei Mangul.
    MeSH term(s) Data Analysis ; Machine Learning ; Metagenome ; Metagenomics/methods ; Microbiota/genetics
    Language English
    Publishing date 2019-08-01
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't ; Research Support, U.S. Gov't, Non-P.H.S.
    ISSN 1745-6150
    ISSN (online) 1745-6150
    DOI 10.1186/s13062-019-0242-0
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article ; Online: iCAT: diagnostic assessment tool of immunological history using high-throughput T-cell receptor sequencing.

    Rajeh, Ahmad / Wolf, Kyle / Schiebout, Courtney / Sait, Nabeel / Kosfeld, Tim / DiPaolo, Richard J / Ahn, Tae-Hyuk

    F1000Research

    2021  Volume 10, Page(s) 65

    Abstract The pathogen exposure history of an individual is recorded in their T-cell repertoire and can be accessed through the study of T-cell receptors (TCRs) if the tools to identify them are available. For each T-cell, the TCR loci undergo genetic rearrangement that creates a unique DNA sequence. In theory these unique sequences can be used as biomarkers for tracking T-cell responses and cataloging immunological history. We developed the immune Cell Analysis Tool (iCAT), an R software package that analyzes TCR sequencing data from exposed (positive) and unexposed (negative) samples to identify TCR sequences statistically associated with positive samples. The presence and absence of associated sequences in samples trains a classifier to diagnose pathogen-specific exposure. We demonstrate the high accuracy of iCAT by testing on three TCR sequencing datasets. First, iCAT successfully diagnosed smallpox vaccinated versus naïve samples in an independent cohort of mice with 95% accuracy. Second, iCAT displayed 100% accuracy classifying naïve and monkeypox vaccinated mice. Finally, we demonstrate the use of iCAT on human samples before and after exposure to SARS-CoV-2, the virus behind the COVID-19 global pandemic. We were able to correctly classify the exposed samples with perfect accuracy. These experimental results show that iCAT capitalizes on the power of TCR sequencing to simplify infection diagnostics. iCAT provides the option of a graphical, user-friendly interface on top of the usual R interface, allowing it to reach a wider audience.
    MeSH term(s) Animals ; COVID-19 ; High-Throughput Nucleotide Sequencing ; Humans ; Mice ; Receptors, Antigen, T-Cell/genetics ; SARS-CoV-2 ; Software
    Chemical Substances Receptors, Antigen, T-Cell
    Language English
    Publishing date 2021-02-03
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2699932-8
    ISSN 2046-1402
    ISSN (online) 2046-1402
    DOI 10.12688/f1000research.27214.2
    Database MEDical Literature Analysis and Retrieval System OnLINE

  8. Article ; Online: Novel Candidate Genes Differentially Expressed in Glyphosate-Treated Horseweed (

    Yang, Yongil / Gardner, Cory / Gupta, Pallavi / Peng, Yanhui / Piasecki, Cristiano / Millwood, Reginald J / Ahn, Tae-Hyuk / Stewart, C Neal

    Genes

    2021  Volume 12, Issue 10

    Abstract The evolution of herbicide-resistant weed species is a serious threat for weed control. Therefore, we need an improved understanding of how gene regulation confers herbicide resistance in order to slow the evolution of resistance. The present study analyzed differentially expressed genes after glyphosate treatment on a glyphosate-resistant Tennessee ecotype (TNR) of horseweed (
    MeSH term(s) Computational Biology ; Conyza/drug effects ; Conyza/genetics ; DNA, Plant ; Genes, Plant ; Glycine/analogs & derivatives ; Glycine/pharmacology ; Herbicide Resistance/genetics ; Herbicides/pharmacology ; Sequence Analysis, DNA/methods ; Transcriptome ; Weed Control/methods ; Glyphosate
    Chemical Substances DNA, Plant ; Herbicides ; Glycine (TE7660XO1C)
    Language English
    Publishing date 2021-10-14
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2527218-4
    ISSN 2073-4425
    ISSN (online) 2073-4425
    DOI 10.3390/genes12101616
    Database MEDical Literature Analysis and Retrieval System OnLINE

  9. Article: Novel Candidate Genes Differentially Expressed in Glyphosate-Treated Horseweed (Conyza canadensis)

    Yang, Yongil / Gardner, Cory / Gupta, Pallavi / Peng, Yanhui / Piasecki, Cristiano / Millwood, Reginald J. / Ahn, Tae-Hyuk / Stewart, C. Neal

    Genes. 2021 Oct. 14, v. 12, no. 10

    2021  

    Abstract The evolution of herbicide-resistant weed species is a serious threat for weed control. Therefore, we need an improved understanding of how gene regulation confers herbicide resistance in order to slow the evolution of resistance. The present study analyzed differentially expressed genes after glyphosate treatment on a glyphosate-resistant Tennessee ecotype (TNR) of horseweed (Conyza canadensis), compared to a susceptible biotype (TNS). A read size of 100.2 M was sequenced on the Illumina platform and subjected to de novo assembly, resulting in 77,072 gene-level contigs, of which 32,493 were uniquely annotated by a BlastX alignment of protein sequence similarity. The most differentially expressed genes were enriched in the gene ontology (GO) term of the transmembrane transport protein. In addition, fifteen upregulated genes were identified in TNR after glyphosate treatment but were not detected in TNS. Ten of these upregulated genes were transmembrane transporter or kinase receptor proteins. Therefore, a combination of changes in gene expression among transmembrane receptor and kinase receptor proteins may be important for endowing non-target-site glyphosate-resistant C. canadensis.
    Keywords Conyza canadensis ; amino acid sequences ; ecotypes ; evolution ; gene expression regulation ; gene ontology ; genes ; glyphosate ; glyphosate resistance ; herbicide-resistant weeds ; sequence homology ; transport proteins ; weed control ; Tennessee
    Language English
    Dates of publication 2021-10-14
    Publishing place Multidisciplinary Digital Publishing Institute
    Document type Article
    ZDB-ID 2527218-4
    ISSN 2073-4425
    DOI 10.3390/genes12101616
    Database NAL-Catalogue (AGRICOLA)

  10. Article ; Online: YeasTSS: an integrative web database of yeast transcription start sites.

    McMillan, Jonathan / Lu, Zhaolian / Rodriguez, Judith S / Ahn, Tae-Hyuk / Lin, Zhenguo

    Database : the journal of biological databases and curation

    2019  Volume 2019

    Abstract The transcription initiation landscape of eukaryotic genes is complex and highly dynamic. In eukaryotes, genes can generate multiple transcript variants that differ in 5' boundaries due to usages of alternative transcription start sites (TSSs), and the abundance of transcript isoforms are highly variable. Due to a large number and complexity of the TSSs, it is not feasible to depict details of transcript initiation landscape of all genes using text-format genome annotation files. Therefore, it is necessary to provide data visualization of TSSs to represent quantitative TSS maps and the core promoters (CPs). In addition, the selection and activity of TSSs are influenced by various factors, such as transcription factors, chromatin remodeling and histone modifications. Thus, integration and visualization of functional genomic data related to these features could provide a better understanding of the gene promoter architecture and regulatory mechanism of transcription initiation. Yeast species play important roles for the research and human society, yet no database provides visualization and integration of functional genomic data in yeast. Here, we generated quantitative TSS maps for 12 important yeast species, inferred their CPs and built a public database, YeasTSS (www.yeastss.org). YeasTSS was designed as a central portal for visualization and integration of the TSS maps, CPs and functional genomic data related to transcription initiation in yeast. YeasTSS is expected to benefit the research community and public education for improving genome annotation, studies of promoter structure, regulated control of transcription initiation and inferring gene regulatory network.
    MeSH term(s) Databases, Nucleic Acid ; Internet ; Transcription Initiation Site ; Yeasts/classification ; Yeasts/genetics
    Language English
    Publishing date 2019-04-29
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't ; Research Support, U.S. Gov't, Non-P.H.S.
    ZDB-ID 2496706-3
    ISSN 1758-0463
    ISSN (online) 1758-0463
    DOI 10.1093/database/baz048
    Database MEDical Literature Analysis and Retrieval System OnLINE
