LIVIVO - The Search Portal for Life Sciences

Search results

Result 1 - 10 of total 27

  1. Article: Hypothesizing mechanistic links between microbes and disease using knowledge graphs.

    Santangelo, Brook / Bada, Michael / Hunter, Lawrence / Lozupone, Catherine

    bioRxiv : the preprint server for biology

    2023  

    Abstract Knowledge graphs have found broad biomedical applications, providing useful representations of complex knowledge. Although plentiful evidence exists linking the gut microbiome to disease, mechanistic understanding of those relationships remains generally elusive. Here we demonstrate the potential of knowledge graphs to hypothesize plausible mechanistic accounts of host-microbe interactions in disease. To do so, we constructed a knowledge graph of linked microbes, genes and metabolites called MGMLink. Using a semantically constrained shortest path search through the graph and a novel path prioritization methodology based on cosine similarity, we show that this knowledge supports inference of mechanistic hypotheses that explain observed relationships between microbes and disease phenotypes. We discuss specific applications of this methodology in inflammatory bowel disease and Parkinson's disease. This approach enables mechanistic hypotheses surrounding the complex interactions between gut microbes and disease to be generated in a scalable and comprehensive manner.
    Language English
    Publishing date 2023-12-04
    Publishing country United States
    Document type Preprint
    DOI 10.1101/2023.12.01.569645
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Article ; Online: Mapping of biomedical text to concepts of lexicons, terminologies, and ontologies.

    Bada, Michael

    Methods in molecular biology (Clifton, N.J.)

    2014  Volume 1159, Page(s) 33–45

    Abstract Concept mapping is a fundamental task in biomedical text mining in which textual mentions of concepts of interest are annotated with specific entries of lexicons, terminologies, ontologies, or databases representing these concepts. Though there has been a significant amount of research, there are still a limited number of practical, publicly available tools for concept mapping of biomedical text specified by the user as an independent task. In this chapter, several tools that can automatically map biomedical text to concepts from a wide range of terminological resources are presented, followed by those that can map to more restricted sets of these resources. This presentation is intended to serve as a guide to researchers without a background in biomedical concept mapping of text for the selection of an appropriate tool based on usability, scalability, configurability, balance between precision and recall, and the desired set of terminological resources with which to annotate the text. Only with effective automatic concept-mapping tools will systems be able to scalably analyze the biomedical literature and other large sets of documents as a fundamental part of more complex text-mining tasks such as information extraction and hypothesis evaluation and generation.
    MeSH term(s) Biological Ontologies ; Concept Formation ; Data Mining/methods ; Terminology as Topic
    Language English
    Publishing date 2014
    Publishing country United States
    Document type Journal Article ; Review
    ISSN (online) 1940-6029
    DOI 10.1007/978-1-4939-0709-0_3
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article ; Online: Creating an ignorance-base: Exploring known unknowns in the scientific literature.

    Boguslav, Mayla R / Salem, Nourah M / White, Elizabeth K / Sullivan, Katherine J / Bada, Michael / Hernandez, Teri L / Leach, Sonia M / Hunter, Lawrence E

    Journal of biomedical informatics

    2023  Volume 143, Page(s) 104405

    Abstract Background: Scientific discovery progresses by exploring new and uncharted territory. More specifically, it advances by a process of transforming unknown unknowns first into known unknowns, and then into knowns. Over the last few decades, researchers have developed many knowledge bases to capture and connect the knowns, which has enabled topic exploration and contextualization of experimental results. But recognizing the unknowns is also critical for finding the most pertinent questions and their answers. Prior work on known unknowns has sought to understand them, annotate them, and automate their identification. However, no knowledge-bases yet exist to capture these unknowns, and little work has focused on how scientists might use them to trace a given topic or experimental result in search of open questions and new avenues for exploration. We show here that a knowledge base of unknowns can be connected to ontologically grounded biomedical knowledge to accelerate research in the field of prenatal nutrition.
    Results: We present the first ignorance-base, a knowledge-base created by combining classifiers to recognize ignorance statements (statements of missing or incomplete knowledge that imply a goal for knowledge) and biomedical concepts over the prenatal nutrition literature. This knowledge-base places biomedical concepts mentioned in the literature in context with the ignorance statements authors have made about them. Using our system, researchers interested in the topic of vitamin D and prenatal health were able to uncover three new avenues for exploration (immune system, respiratory system, and brain development) by searching for concepts enriched in ignorance statements. These were buried among the many standard enriched concepts. Additionally, we used the ignorance-base to enrich concepts connected to a gene list associated with vitamin D and spontaneous preterm birth and found an emerging topic of study (brain development) in an implied field (neuroscience). The researchers could look to the field of neuroscience for potential answers to the ignorance statements.
    Conclusion: Our goal is to help students, researchers, funders, and publishers better understand the state of our collective scientific ignorance (known unknowns) in order to help accelerate research through the continued illumination of and focus on the known unknowns and their respective goals for scientific knowledge.
    MeSH term(s) Female ; Humans ; Infant, Newborn ; Knowledge ; Knowledge Bases ; Premature Birth ; Publications ; Vitamin D ; Natural Language Processing
    Chemical Substances Vitamin D (1406-16-2)
    Language English
    Publishing date 2023-06-01
    Publishing country United States
    Document type Journal Article ; Research Support, N.I.H., Extramural
    ZDB-ID 2057141-0
    ISSN (online) 1532-0480
    ISSN 1532-0464
    DOI 10.1016/j.jbi.2023.104405
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Article ; Online: Concept recognition as a machine translation problem.

    Boguslav, Mayla R / Hailu, Negacy D / Bada, Michael / Baumgartner, William A / Hunter, Lawrence E

    BMC bioinformatics

    2021  Volume 22, Issue Suppl 1, Page(s) 598

    Abstract Background: Automated assignment of specific ontology concepts to mentions in text is a critical task in biomedical natural language processing, and the subject of many open shared tasks. Although the current state of the art involves the use of neural network language models as a post-processing step, the very large number of ontology classes to be recognized and the limited amount of gold-standard training data have impeded the creation of end-to-end systems based entirely on machine learning. Recently, Hailu et al. recast the concept recognition problem as a type of machine translation and demonstrated that sequence-to-sequence machine learning models have the potential to outperform multi-class classification approaches.
    Methods: We systematically characterize the factors that contribute to the accuracy and efficiency of several approaches to sequence-to-sequence machine learning through extensive studies of alternative methods and hyperparameter selections. We not only identify the best-performing systems and parameters across a wide variety of ontologies but also provide insights into the widely varying resource requirements and hyperparameter robustness of alternative approaches. Analysis of the strengths and weaknesses of such systems suggests promising avenues for future improvements as well as design choices that can increase computational efficiency with small costs in performance.
    Results: Bidirectional encoder representations from transformers for biomedical text mining (BioBERT) for span detection along with the open-source toolkit for neural machine translation (OpenNMT) for concept normalization achieve state-of-the-art performance for most ontologies annotated in the CRAFT Corpus. This approach uses substantially fewer computational resources, including hardware, memory, and time, than several alternative approaches.
    Conclusions: Machine translation is a promising avenue for fully machine-learning-based concept recognition that achieves state-of-the-art results on the CRAFT Corpus, evaluated via a direct comparison to previous results from the 2019 CRAFT shared task. Experiments illuminating the reasons for the surprisingly good performance of sequence-to-sequence methods targeting ontology identifiers suggest that further progress may be possible by mapping to alternative target concept representations. All code and models can be found at: https://github.com/UCDenver-ccp/Concept-Recognition-as-Translation .
    Language English
    Publishing date 2021-12-17
    Publishing country England
    Document type Journal Article
    ZDB-ID 2041484-5
    ISSN (online) 1471-2105
    DOI 10.1186/s12859-021-04141-4
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Article ; Online: Gold-standard ontology-based anatomical annotation in the CRAFT Corpus.

    Bada, Michael / Vasilevsky, Nicole / Baumgartner, William A / Haendel, Melissa / Hunter, Lawrence E

    Database : the journal of biological databases and curation

    2019  Volume 2017

    Abstract Gold-standard annotated corpora have become important resources for the training and testing of natural-language-processing (NLP) systems designed to support biocuration efforts, and ontologies are increasingly used to facilitate curational consistency and semantic integration across disparate resources. Bringing together the respective power of these, the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of full-length, open-access biomedical journal articles with extensive manually created syntactic, formatting and semantic markup, was previously created and released. This initial public release has already been used in multiple projects to drive development of systems focused on a variety of biocuration, search, visualization, and semantic and syntactic NLP tasks. Building on its demonstrated utility, we have expanded the CRAFT Corpus with a large set of manually created semantic annotations relying on Uberon, an ontology representing anatomical entities and life-cycle stages of multicellular organisms across species as well as types of multicellular organisms defined in terms of life-cycle stage and sexual characteristics. This newly created set of annotations, which has been added for v2.1 of the corpus, is by far the largest publicly available collection of gold-standard anatomical markup and is the first large-scale effort at manual markup of biomedical text relying on the entirety of an anatomical terminology, as opposed to annotation with a small number of high-level anatomical categories, as performed in previous corpora. In addition to presenting and discussing this newly available resource, we apply it to provide a performance baseline for the automatic annotation of anatomical concepts in biomedical text using a prominent concept recognition system. The full corpus, released with a CC BY 3.0 license, may be downloaded from http://bionlp-corpora.sourceforge.net/CRAFT/index.shtml. 
    Database URL: http://bionlp-corpora.sourceforge.net/CRAFT/index.shtml.
    Language English
    Publishing date 2019-11-14
    Publishing country England
    Document type Journal Article
    ZDB-ID 2496706-3
    ISSN (online) 1758-0463
    DOI 10.1093/database/bax087
    Database MEDical Literature Analysis and Retrieval System OnLINE

  6. Article ; Online: Desiderata for ontologies to be used in semantic annotation of biomedical documents.

    Bada, Michael / Hunter, Lawrence

    Journal of biomedical informatics

    2010  Volume 44, Issue 1, Page(s) 94–101

    Abstract A wealth of knowledge valuable to the translational research scientist is contained within the vast biomedical literature, but this knowledge is typically in the form of natural language. Sophisticated natural-language-processing systems are needed to translate text into unambiguous formal representations grounded in high-quality consensus ontologies, and these systems in turn rely on gold-standard corpora of annotated documents for training and testing. To this end, we are constructing the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-text biomedical journal articles that are being manually annotated with the entire sets of terms from select vocabularies, predominantly from the Open Biomedical Ontologies (OBO) library. Our efforts in building this corpus have illuminated infelicities of these ontologies with respect to the semantic annotation of biomedical documents, and we propose desiderata whose implementation could substantially improve their utility in this task; these include the integration of overlapping terms across OBOs, the resolution of OBO-specific ambiguities, the integration of the BFO with the OBOs and the use of mid-level ontologies, the inclusion of noncanonical instances, and the expansion of relations and realizable entities.
    MeSH term(s) Animals ; Biomedical Research ; Databases, Factual ; Documentation ; Humans ; Medical Informatics ; Natural Language Processing ; Semantics
    Language English
    Publishing date 2010-10-26
    Publishing country United States
    Document type Journal Article ; Research Support, N.I.H., Extramural
    ZDB-ID 2057141-0
    ISSN (online) 1532-0480
    ISSN 1532-0464
    DOI 10.1016/j.jbi.2010.10.002
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article ; Online: KaBOB: ontology-based semantic integration of biomedical databases.

    Livingston, Kevin M / Bada, Michael / Baumgartner, William A / Hunter, Lawrence E

    BMC bioinformatics

    2015  Volume 16, Page(s) 126

    Abstract Background: The ability to query many independent biological databases using a common ontology-based semantic model would facilitate deeper integration and more effective utilization of these diverse and rapidly growing resources. Despite ongoing work moving toward shared data formats and linked identifiers, significant problems persist in semantic data integration in order to establish shared identity and shared meaning across heterogeneous biomedical data sources.
    Results: We present five processes for semantic data integration that, when applied collectively, solve seven key problems. These processes include making explicit the differences between biomedical concepts and database records, aggregating sets of identifiers denoting the same biomedical concepts across data sources, and using declaratively represented forward-chaining rules to take information that is variably represented in source databases and integrating it into a consistent biomedical representation. We demonstrate these processes and solutions by presenting KaBOB (the Knowledge Base Of Biomedicine), a knowledge base of semantically integrated data from 18 prominent biomedical databases using common representations grounded in Open Biomedical Ontologies. An instance of KaBOB with data about humans and seven major model organisms can be built using on the order of 500 million RDF triples. All source code for building KaBOB is available under an open-source license.
    Conclusions: KaBOB is an integrated knowledge base of biomedical data representationally based in prominent, actively maintained Open Biomedical Ontologies, thus enabling queries of the underlying data in terms of biomedical concepts (e.g., genes and gene products, interactions and processes) rather than features of source-specific data schemas or file formats. KaBOB resolves many of the issues that routinely plague biomedical researchers intending to work with data from multiple data sources and provides a platform for ongoing data integration and development and for formal reasoning over a wealth of integrated biomedical data.
    MeSH term(s) Biological Ontologies ; Biomedical Research ; Computational Biology/methods ; Data Collection ; Databases, Factual ; Humans ; Information Storage and Retrieval/methods ; Internet ; Knowledge Bases ; PubMed ; Semantics
    Language English
    Publishing date 2015-04-23
    Publishing country England
    Document type Journal Article ; Research Support, N.I.H., Extramural
    ZDB-ID 2041484-5
    ISSN (online) 1471-2105
    DOI 10.1186/s12859-015-0559-3
    Database MEDical Literature Analysis and Retrieval System OnLINE

  8. Article ; Online: An open source knowledge graph ecosystem for the life sciences.

    Callahan, Tiffany J / Tripodi, Ignacio J / Stefanski, Adrianne L / Cappelletti, Luca / Taneja, Sanya B / Wyrwa, Jordan M / Casiraghi, Elena / Matentzoglu, Nicolas A / Reese, Justin / Silverstein, Jonathan C / Hoyt, Charles Tapley / Boyce, Richard D / Malec, Scott A / Unni, Deepak R / Joachimiak, Marcin P / Robinson, Peter N / Mungall, Christopher J / Cavalleri, Emanuele / Fontana, Tommaso / Valentini, Giorgio / Mesiti, Marco / Gillenwater, Lucas A / Santangelo, Brook / Vasilevsky, Nicole A / Hoehndorf, Robert / Bennett, Tellen D / Ryan, Patrick B / Hripcsak, George / Kahn, Michael G / Bada, Michael / Baumgartner, William A / Hunter, Lawrence E

    Scientific data

    2024  Volume 11, Issue 1, Page(s) 363

    Abstract Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoint resources and abstraction algorithms), and benchmarks (e.g., prebuilt KGs). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 different large-scale KGs. With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability.
    MeSH term(s) Algorithms ; Biological Science Disciplines ; Pattern Recognition, Automated ; Translational Research, Biomedical ; Knowledge Bases
    Language English
    Publishing date 2024-04-11
    Publishing country England
    Document type Journal Article
    ZDB-ID 2775191-0
    ISSN (online) 2052-4463
    DOI 10.1038/s41597-024-03171-w
    Database MEDical Literature Analysis and Retrieval System OnLINE

  9. Article ; Online: Identification of OBO nonalignments and its implications for OBO enrichment.

    Bada, Michael / Hunter, Lawrence

    Bioinformatics (Oxford, England)

    2008  Volume 24, Issue 12, Page(s) 1448–1455

    Abstract Motivation: Existing projects that focus on the semiautomatic addition of links between existing terms in the Open Biomedical Ontologies can take advantage of reasoners that can make new inferences between terms that are based on the added formal definitions and that reflect nonalignments between the linked terms. However, these projects require that these definitions be necessary and sufficient, a strong requirement that often does not hold. If such definitions cannot be added, the reasoners cannot point to the nonalignments through the suggestion of new inferences.
    Results: We describe a methodology by which we have identified over 1900 instances of nonredundant nonalignments between terms from the Gene Ontology (GO) biological process (BP), cellular component (CC) and molecular function (MF) ontologies, Chemical Entities of Biological Interest (ChEBI) and the Cell Type Ontology (CL). Many of the 39.8% of these nonalignments whose object terms are more atomic than the subject terms are not currently examined in other ontology-enrichment projects because the necessary and sufficient conditions required for the inferences are not currently examined. Analysis of the ratios of nonalignments to assertions from which the nonalignments were identified suggests that BP-MF, BP-BP, BP-CL and CC-CC terms are relatively well aligned, while ChEBI-MF, BP-ChEBI and CC-MF terms are relatively poorly aligned. We propose four ways to resolve an identified nonalignment and recommend an analogous implementation of our methodology in ontology-enrichment tools to identify types of nonalignments that are currently not detected.
    Availability: The nonalignments discussed in this article may be viewed at http://compbio.uchsc.edu/Hunter_lab/Bada/nonalignments_2008_03_06.html. Code for the generation of these nonalignments is available upon request.
    Contact: mike.bada@uchsc.edu.
    MeSH term(s) Algorithms ; Artificial Intelligence ; Database Management Systems ; Databases, Genetic ; Information Storage and Retrieval/methods ; Natural Language Processing ; Systems Integration ; Vocabulary, Controlled
    Language English
    Publishing date 2008-05-07
    Publishing country England
    Document type Journal Article ; Research Support, N.I.H., Extramural
    ZDB-ID 1422668-6
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/btn194
    Database MEDical Literature Analysis and Retrieval System OnLINE

  10. Article ; Online: Enrichment of OBO ontologies.

    Bada, Michael / Hunter, Lawrence

    Journal of biomedical informatics

    2007  Volume 40, Issue 3, Page(s) 300–315

    Abstract This paper describes a frame-based integration of the three GO subontologies, the Chemical Entities of Biological Interest ontology, and the Cell Type Ontology in which relationships are modeled in a way that better captures the semantics between biological concepts represented by the terms, rather than between the terms themselves, than previous frame-based efforts. We also describe a methodology for creating suggested enriching assertions by identifying patterns in GO terms, mapping these patterns to new, specific relationships, and matching term substrings to concepts. Using this methodology, a predicted assertion was made for 62% of GO terms that matched one of 31 patterns, and 97% of these predicted assertions were assessed to be valid, resulting in an initial set of over 4000 assertions. Furthermore, this methodology programmatically integrates assertions into an ontology such that each assertion is fully consistent with respect to higher (i.e., more general) relevant class and slot levels.
    MeSH term(s) Computational Biology/methods ; Database Management Systems ; Humans ; Information Science ; Information Storage and Retrieval ; Models, Biological ; Models, Genetic ; Models, Statistical ; Models, Theoretical ; Natural Language Processing ; Programming Languages ; Software ; Unified Medical Language System ; Vocabulary, Controlled
    Language English
    Publishing date 2007-06
    Publishing country United States
    Document type Journal Article
    ZDB-ID 2057141-0
    ISSN (online) 1532-0480
    ISSN 1532-0464
    DOI 10.1016/j.jbi.2006.07.003
    Database MEDical Literature Analysis and Retrieval System OnLINE
