LIVIVO - The Search Portal for Life Sciences

Search results

Result 1 - 10 of total 23

  1. Article ; Online: Maize Feature Store: A centralized resource to manage and analyze curated maize multi-omics features for machine learning applications.

    Sen, Shatabdi / Woodhouse, Margaret R / Portwood, John L / Andorf, Carson M

    Database : the journal of biological databases and curation

    2023  Volume 2023

    Abstract The big-data analysis of complex data associated with maize genomes accelerates genetic research and improves agronomic traits. As a result, efforts have increased to integrate diverse datasets and extract meaning from these measurements. Machine learning models are a powerful tool for gaining knowledge from large and complex datasets. However, these models must be trained on high-quality features to succeed. Currently, there are no solutions to host maize multi-omics datasets with end-to-end solutions for evaluating and linking features to target gene annotations. Our work presents the Maize Feature Store (MFS), a versatile application that combines features built on complex data to facilitate exploration, modeling and analysis. Feature stores allow researchers to rapidly deploy machine learning applications by managing and providing access to frequently used features. We populated the MFS for the maize reference genome with over 14 000 gene-based features based on published genomic, transcriptomic, epigenomic, variomic and proteomics datasets. Using the MFS, we created an accurate pan-genome classification model with an AUC-ROC score of 0.87. The MFS is publicly available through the maize genetics and genomics database. Database URL: https://mfs.maizegdb.org/
    MeSH term(s) Zea mays/genetics ; Multiomics ; Databases, Genetic ; Genomics ; Machine Learning
    Language English
    Publishing date 2023-11-08
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't ; Research Support, U.S. Gov't, Non-P.H.S.
    ZDB-ID 2496706-3
    ISSN (online) 1758-0463
    DOI 10.1093/database/baad078
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Article ; Online: Enhanced pan-genomic resources at the maize genetics and genomics database.

    Cannon, Ethalinda K / Portwood, John L / Hayford, Rita K / Haley, Olivia C / Gardiner, Jack M / Andorf, Carson M / Woodhouse, Margaret R

    Genetics

    2024  Volume 227, Issue 1

    Abstract Pan-genomes, encompassing the entirety of genetic sequences found in a collection of genomes within a clade, are more useful than single reference genomes for studying species diversity. This is especially true for a species like Zea mays, which has a particularly diverse and complex genome. Presenting pan-genome data, analyses, and visualization is challenging, especially for a diverse species, but more so when pan-genomic data is linked to extensive gene model and gene data, including classical gene information, markers, insertions, expression and proteomic data, and protein structures, as is the case at MaizeGDB. Here, we describe MaizeGDB's expansion to include the genic subset of the Zea pan-genome in a pan-gene data center featuring the maize genomes hosted at MaizeGDB, and the outgroup teosinte Zea genomes from the Pan-Andropogoneae project. The new data center offers a variety of browsing and visualization tools, including sequence alignment visualization, gene trees and other tools, to explore pan-genes in Zea that were calculated by the pipeline Pandagma. Combined, these data will help maize researchers study the complexity and diversity of Zea and use the comparative functions to validate pan-gene relationships for a selected gene model.
    MeSH term(s) Zea mays/genetics ; Genome, Plant ; Genomics/methods ; Databases, Genetic ; Phylogeny
    Language English
    Publishing date 2024-04-05
    Publishing country United States
    Document type Journal Article ; Research Support, U.S. Gov't, Non-P.H.S. ; Research Support, Non-U.S. Gov't
    ZDB-ID 2167-2
    ISSN (online) 1943-2631
    ISSN 0016-6731
    DOI 10.1093/genetics/iyae036
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article ; Online: Maize protein structure resources at the maize genetics and genomics database.

    Woodhouse, Margaret R / Portwood, John L / Sen, Shatabdi / Hayford, Rita K / Gardiner, Jack M / Cannon, Ethalinda K / Harper, Lisa C / Andorf, Carson M

    Genetics

    2023  Volume 224, Issue 1

    Abstract Protein structures play an important role in bioinformatics, such as in predicting gene function or validating gene model annotation. However, determining protein structure was, until now, costly and time-consuming, which resulted in a structural biology bottleneck. With the release of programs such as AlphaFold and ESMFold, this bottleneck has been reduced by several orders of magnitude, permitting protein structural comparisons of entire genomes within reasonable timeframes. MaizeGDB has leveraged this technological breakthrough by offering several new tools to accelerate protein structural comparisons between maize and other plants as well as human and yeast outgroups. MaizeGDB also offers bulk downloads of these comparative protein structure data, along with predicted functional annotation information. In this way, MaizeGDB is poised to assist maize researchers in assessing functional homology, gene model annotation quality, and other information unavailable to maize scientists even a few years ago.
    MeSH term(s) Humans ; Zea mays/genetics ; Zea mays/metabolism ; User-Computer Interface ; Databases, Genetic ; Computational Biology/methods ; Genome, Plant ; Molecular Sequence Annotation ; Genomics/methods
    Language English
    Publishing date 2023-02-09
    Publishing country United States
    Document type Journal Article ; Research Support, U.S. Gov't, Non-P.H.S.
    ZDB-ID 2167-2
    ISSN (online) 1943-2631
    ISSN 0016-6731
    DOI 10.1093/genetics/iyad016
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Article ; Online: PanEffect: a pan-genome visualization tool for variant effects in maize.

    Andorf, Carson M / Haley, Olivia C / Hayford, Rita K / Portwood, John L / Harding, Stephen / Sen, Shatabdi / Cannon, Ethalinda K / Gardiner, Jack M / Kim, Hye-Seon / Woodhouse, Margaret R

    Bioinformatics (Oxford, England)

    2024  Volume 40, Issue 2

    Abstract Summary: Understanding the effects of genetic variants is crucial for accurately predicting traits and functional outcomes. Recent approaches have utilized artificial intelligence and protein language models to score all possible missense variant effects at the proteome level for a single genome, but a reliable tool is needed to explore these effects at the pan-genome level. To address this gap, we introduce a new tool called PanEffect. We implemented PanEffect at MaizeGDB to enable a comprehensive examination of the potential effects of coding variants across 50 maize genomes. The tool allows users to visualize over 550 million possible amino acid substitutions in the B73 maize reference genome and to observe the effects of the 2.3 million natural variations in the maize pan-genome. Each variant effect score, calculated from the Evolutionary Scale Modeling (ESM) protein language model, shows the log-likelihood ratio difference between B73 and all variants in the pan-genome. These scores are shown using heatmaps spanning benign outcomes to potential functional consequences. In addition, PanEffect displays secondary structures and functional domains along with the variant effects, offering additional functional and structural context. Using PanEffect, researchers now have a platform to explore protein variants and identify genetic targets for crop enhancement.
    Availability and implementation: The PanEffect code is freely available on GitHub (https://github.com/Maize-Genetics-and-Genomics-Database/PanEffect). A maize implementation of PanEffect and underlying datasets are available at MaizeGDB (https://www.maizegdb.org/effect/maize/).
    MeSH term(s) Zea mays/genetics ; Databases, Genetic ; Artificial Intelligence ; Genome, Plant ; Phenotype ; Software
    Language English
    Publishing date 2024-02-09
    Publishing country England
    Document type Journal Article ; Research Support, U.S. Gov't, Non-P.H.S. ; Research Support, Non-U.S. Gov't
    ZDB-ID 1422668-6
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/btae073
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Article: qTeller: a tool for comparative multi-genomic gene expression analysis

    Woodhouse, Margaret R. / Sen, Shatabdi / Schott, David / Portwood, John L. / Freeling, Michael / Walley, Justin W. / Andorf, Carson M. / Schnable, James C.

    Bioinformatics. 2022 Jan. 1, v. 38, no. 1

    2022  

    Abstract Motivation: Over the last decade, RNA-Seq whole-genome sequencing has become a widely used method for measuring and understanding transcriptome-level changes in gene expression. Since RNA-Seq is relatively inexpensive, it can be used on multiple genomes to evaluate gene expression across many different conditions, tissues, and cell types. Although many tools exist to map and compare RNA-Seq at the genomics level, few web-based tools are dedicated to making data generated for individual genomic analysis accessible and reusable at a gene-level scale for comparative analysis between genes, across different genomes, and meta-analyses. Results: To address this challenge, we revamped the comparative gene expression tool qTeller to take advantage of the growing number of public RNA-Seq datasets. qTeller allows users to evaluate gene expression data in a defined genomic interval and also perform two-gene comparisons across multiple user-chosen tissues. Though previously unpublished, qTeller has been cited extensively in the scientific literature, demonstrating its importance to researchers. Our new version of qTeller now supports multiple genomes for intergenomic comparisons, and includes capabilities for both mRNA and protein abundance datasets. Other new features include support for additional data formats, modernized interface and back-end database, and an optimized framework for adoption by other organisms' databases. Availability: The source code for qTeller is open-source and available through GitHub (https://github.com/Maize-Genetics-and-Genomics-Database/qTeller). A maize instance of qTeller is available at the Maize Genetics and Genomics database (MaizeGDB) (https://qteller.maizegdb.org/), where we have mapped over 200 unique datasets from GenBank across 27 maize genomes. Supplementary information: Supplementary data are available at Bioinformatics online.
    Keywords Internet ; bioinformatics ; corn ; data collection ; databases ; gene expression ; genomics ; meta-analysis ; sequence analysis
    Language English
    Dates of publication 2022-01-01
    Size p. 236-242.
    Document type Article
    ZDB-ID 1422668-6
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/btab604
    Database NAL-Catalogue (AGRICOLA)

  6. Article ; Online: A pan-genomic approach to genome databases using maize as a model system.

    Woodhouse, Margaret R / Cannon, Ethalinda K / Portwood, John L / Harper, Lisa C / Gardiner, Jack M / Schaeffer, Mary L / Andorf, Carson M

    BMC plant biology

    2021  Volume 21, Issue 1, Page(s) 385

    Abstract Research in the past decade has demonstrated that a single reference genome is not representative of a species' diversity. MaizeGDB introduces a pan-genomic approach to hosting genomic data, leveraging the large number of diverse maize genomes and their associated datasets to quickly and efficiently connect genomes, gene models, expression, epigenome, sequence variation, structural variation, transposable elements, and diversity data across genomes so that researchers can easily track the structural and functional differences of a locus and its orthologs across maize. We believe our framework is unique and provides a template for any genomic database poised to host large-scale pan-genomic data.
    MeSH term(s) Data Accuracy ; Data Collection/methods ; Databases as Topic ; Genetic Variation ; Genome, Plant ; Genomics ; Zea mays/genetics
    Language English
    Publishing date 2021-08-20
    Publishing country England
    Document type Journal Article
    ISSN (online) 1471-2229
    DOI 10.1186/s12870-021-03173-5
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article ; Online: qTeller: a tool for comparative multi-genomic gene expression analysis.

    Woodhouse, Margaret R / Sen, Shatabdi / Schott, David / Portwood, John L / Freeling, Michael / Walley, Justin W / Andorf, Carson M / Schnable, James C

    Bioinformatics (Oxford, England)

    2021  Volume 38, Issue 1, Page(s) 236–242

    Abstract Motivation: Over the last decade, RNA-Seq whole-genome sequencing has become a widely used method for measuring and understanding transcriptome-level changes in gene expression. Since RNA-Seq is relatively inexpensive, it can be used on multiple genomes to evaluate gene expression across many different conditions, tissues and cell types. Although many tools exist to map and compare RNA-Seq at the genomics level, few web-based tools are dedicated to making data generated for individual genomic analysis accessible and reusable at a gene-level scale for comparative analysis between genes, across different genomes and meta-analyses.
    Results: To address this challenge, we revamped the comparative gene expression tool qTeller to take advantage of the growing number of public RNA-Seq datasets. qTeller allows users to evaluate gene expression data in a defined genomic interval and also perform two-gene comparisons across multiple user-chosen tissues. Though previously unpublished, qTeller has been cited extensively in the scientific literature, demonstrating its importance to researchers. Our new version of qTeller now supports multiple genomes for intergenomic comparisons, and includes capabilities for both mRNA and protein abundance datasets. Other new features include support for additional data formats, modernized interface and back-end database and an optimized framework for adoption by other organisms' databases.
    Availability and implementation: The source code for qTeller is open-source and available through GitHub (https://github.com/Maize-Genetics-and-Genomics-Database/qTeller). A maize instance of qTeller is available at the Maize Genetics and Genomics database (MaizeGDB) (https://qteller.maizegdb.org/), where we have mapped over 200 unique datasets from GenBank across 27 maize genomes.
    Supplementary information: Supplementary data are available at Bioinformatics online.
    MeSH term(s) Genomics ; Genome ; Software ; Databases, Nucleic Acid ; Zea mays/genetics ; Gene Expression Profiling
    Language English
    Publishing date 2021-08-17
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't ; Research Support, U.S. Gov't, Non-P.H.S.
    ZDB-ID 1422668-6
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/btab604
    Database MEDical Literature Analysis and Retrieval System OnLINE

  8. Article ; Online: A pan-genomic approach to genome databases using maize as a model system

    Woodhouse, Margaret R. / Cannon, Ethalinda K. / Portwood, John L., II / Harper, Lisa C. / Gardiner, Jack M. / Schaeffer, Mary L. / Andorf, Carson M.

    BMC Plant Biol. 2021 Dec., v. 21, no. 1 p.385-385

    2021  

    Abstract Research in the past decade has demonstrated that a single reference genome is not representative of a species’ diversity. MaizeGDB introduces a pan-genomic approach to hosting genomic data, leveraging the large number of diverse maize genomes and their associated datasets to quickly and efficiently connect genomes, gene models, expression, epigenome, sequence variation, structural variation, transposable elements, and diversity data across genomes so that researchers can easily track the structural and functional differences of a locus and its orthologs across maize. We believe our framework is unique and provides a template for any genomic database poised to host large-scale pan-genomic data.
    Keywords corn ; data collection ; databases ; epigenome ; genes ; genomics ; loci ; sequence diversity
    Language English
    Dates of publication 2021-12
    Size p. 385.
    Publishing place BioMed Central
    Document type Article ; Online
    ZDB-ID 2059868-3
    ISSN 1471-2229
    DOI 10.1186/s12870-021-03173-5
    Database NAL-Catalogue (AGRICOLA)

  9. Article: qTeller: A tool for comparative multi-genomic gene expression analysis

    Woodhouse, Margaret R. / Sen, Shatabdi / Schott, David / Portwood, John L. / Freeling, Michael / Walley, Justin W. / Andorf, Carson M. / Schnable, James C.

    Bioinformatics. 2021 Aug. 18, v. 604

    2021  

    Abstract Over the last decade, whole-genome-level expression sequencing using RNA-Seq has become a widely used method for measuring and understanding transcriptome-level changes in gene expression. Since RNA-Seq is relatively inexpensive, it can be used on multiple genomes to evaluate gene expression across many different conditions, tissues, and cell types. Common applications for RNA-Seq data include differential gene expression, gene identification, and splicing analysis. Although many tools exist to map and compare RNA-Seq at the genomics level, few web-based tools are dedicated to making data originally generated for individual genomic analysis accessible and reusable at a gene-level scale to allow for comparative analysis between genes, across different genomes, and meta-analyses. To address this challenge, we revamped the comparative gene expression tool qTeller to take advantage of the growing number of public RNA-Seq datasets. qTeller was originally designed as an RNA-Seq processing pipeline that allowed users to evaluate gene expression data in a defined genomic interval and also perform two-gene comparisons across multiple user-chosen tissues. Though previously unpublished, qTeller has been cited hundreds of times in the scientific literature, demonstrating its importance to researchers. We now present this version of qTeller that has been updated in several useful ways. qTeller is no longer reference-based; it supports data from multiple genomes and allows for intergenomic comparisons. qTeller's functionality has been expanded to allow for both mRNA and protein abundance datasets. Other new features include support for additional data formats, a modernized interface and back-end database, and an optimized framework for adoption by other model organisms' databases. A working instance of qTeller is available for the maize research community where we have mapped over 200 unique datasets from GenBank across 27 maize genomes and made them available through the Maize Genetics and Genomics database.
    Keywords Internet ; bioinformatics ; corn ; data collection ; databases ; gene expression ; gene expression regulation ; genes ; genomics ; meta-analysis ; sequence analysis
    Language English
    Dates of publication 2021-08-18
    Size p. 236-242.
    Document type Article
    ZDB-ID 1422668-6
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/btab604
    Database NAL-Catalogue (AGRICOLA)

  10. Article ; Online: PedigreeNet: a web-based pedigree viewer for biological databases.

    Braun, Bremen L / Schott, David A / Portwood, John L / Andorf, Carson M / Sen, Taner Z

    Bioinformatics (Oxford, England)

    2019  Volume 35, Issue 20, Page(s) 4184–4186

    Abstract Motivation: Plant breeding aims to improve current germplasm that can tolerate a wide range of biotic and abiotic stresses. To accomplish this goal, breeders rely on developing a deeper understanding of genetic makeup and relationships between plant varieties to make informed plant selections. Although rapid advances in genotyping technology generated a large amount of data for breeders, tools that facilitate pedigree analysis and visualization are scant, leaving breeders to use classical, but inherently limited, hierarchical pedigree diagrams for a handful of plant varieties. To answer this need, we developed a simple web-based tool that can be easily implemented at biological databases, called PedigreeNet, to create and visualize customizable pedigree relationships in a network context, displaying pre- and user-uploaded data.
    Results: As a proof-of-concept, we implemented PedigreeNet at the maize model organism database, MaizeGDB. The PedigreeNet viewer at MaizeGDB has a dynamically-generated pedigree network of 4706 maize lines and 5487 relationships that are currently available as both a stand-alone web-based tool and integrated directly on the MaizeGDB Stock Pages. The tool allows the user to apply a number of filters, select or upload their own breeding relationships, center a pedigree network on a plant variety, identify the common ancestor between two varieties, and display the shortest path(s) between two varieties on the pedigree network. The PedigreeNet code layer is written as a JavaScript wrapper around Cytoscape Web. PedigreeNet fills a great need for breeders to have access to an online tool to represent and visually customize pedigree relationships.
    Availability and implementation: PedigreeNet is accessible at https://www.maizegdb.org/breeders_toolbox. The open source code is publicly and freely available at GitHub: https://github.com/Maize-Genetics-and-Genomics-Database/PedigreeNet.
    Supplementary information: Supplementary data are available at Bioinformatics online.
    MeSH term(s) Databases, Factual ; Databases, Genetic ; Internet ; Pedigree ; Software ; Zea mays
    Language English
    Publishing date 2019-03-20
    Publishing country England
    Document type Journal Article ; Research Support, U.S. Gov't, Non-P.H.S.
    ZDB-ID 1422668-6
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/btz208
    Database MEDical Literature Analysis and Retrieval System OnLINE
