LIVIVO - The Search Portal for Life Sciences

Search results

Result 1 - 3 of total 3

  1. Book ; Online: MapperGPT

    Matentzoglu, Nicolas / Caufield, J. Harry / Hegde, Harshad B. / Reese, Justin T. / Moxon, Sierra / Kim, Hyeongsik / Harris, Nomi L. / Haendel, Melissa A / Mungall, Christopher J.

    Large Language Models for Linking and Mapping Entities

    2023  

    Abstract Aligning terminological resources, including ontologies, controlled vocabularies, taxonomies, and value sets is a critical part of data integration in many domains such as healthcare, chemistry, and biomedical research. Entity mapping is the process of determining correspondences between entities across these resources, such as gene identifiers, disease concepts, or chemical entity identifiers. Many tools have been developed to compute such mappings based on common structural features and lexical information such as labels and synonyms. Lexical approaches in particular often provide very high recall, but low precision, due to lexical ambiguity. As a consequence of this, mapping efforts often resort to a labor intensive manual mapping refinement through a human curator. Large Language Models (LLMs), such as the ones employed by ChatGPT, have generalizable abilities to perform a wide range of tasks, including question-answering and information extraction. Here we present MapperGPT, an approach that uses LLMs to review and refine mapping relationships as a post-processing step, in concert with existing high-recall methods that are based on lexical and structural heuristics. We evaluated MapperGPT on a series of alignment tasks from different domains, including anatomy, developmental biology, and renal diseases. We devised a collection of tasks that are designed to be particularly challenging for lexical methods. We show that when used in combination with high-recall methods, MapperGPT can provide a substantial improvement in accuracy, beating state-of-the-art (SOTA) methods such as LogMap.
    Keywords Computer Science - Computation and Language ; Computer Science - Artificial Intelligence
    Subject code 401
    Publishing date 2023-10-05
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  2. Article ; Online: Unifying the identification of biomedical entities with the Bioregistry.

    Hoyt, Charles Tapley / Balk, Meghan / Callahan, Tiffany J / Domingo-Fernández, Daniel / Haendel, Melissa A / Hegde, Harshad B / Himmelstein, Daniel S / Karis, Klas / Kunze, John / Lubiana, Tiago / Matentzoglu, Nicolas / McMurry, Julie / Moxon, Sierra / Mungall, Christopher J / Rutz, Adriano / Unni, Deepak R / Willighagen, Egon / Winston, Donald / Gyori, Benjamin M

    Scientific data

    2022  Volume 9, Issue 1, Page(s) 714

    Abstract The standardized identification of biomedical entities is a cornerstone of interoperability, reuse, and data integration in the life sciences. Several registries have been developed to catalog resources maintaining identifiers for biomedical entities such as small molecules, proteins, cell lines, and clinical trials. However, existing registries have struggled to provide sufficient coverage and metadata standards that meet the evolving needs of modern life sciences researchers. Here, we introduce the Bioregistry, an integrative, open, community-driven metaregistry that synthesizes and substantially expands upon 23 existing registries. The Bioregistry addresses the need for a sustainable registry by leveraging public infrastructure and automation, and employing a progressive governance model centered around open code and open data to foster community contribution. The Bioregistry can be used to support the standardized annotation of data, models, ontologies, and scientific literature, thereby promoting their interoperability and reuse. The Bioregistry can be accessed through https://bioregistry.io and its source code and data are available under the MIT and CC0 Licenses at https://github.com/biopragmatics/bioregistry .
    Language English
    Publishing date 2022-11-19
    Publishing country England
    Document type Journal Article
    ZDB-ID 2775191-0
    ISSN (online) 2052-4463
    DOI 10.1038/s41597-022-01807-3
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article ; Online: A Simple Standard for Sharing Ontological Mappings (SSSOM).

    Matentzoglu, Nicolas / Balhoff, James P / Bello, Susan M / Bizon, Chris / Brush, Matthew / Callahan, Tiffany J / Chute, Christopher G / Duncan, William D / Evelo, Chris T / Gabriel, Davera / Graybeal, John / Gray, Alasdair / Gyori, Benjamin M / Haendel, Melissa / Harmse, Henriette / Harris, Nomi L / Harrow, Ian / Hegde, Harshad B / Hoyt, Amelia L / Hoyt, Charles T / Jiao, Dazhi / Jiménez-Ruiz, Ernesto / Jupp, Simon / Kim, Hyeongsik / Koehler, Sebastian / Liener, Thomas / Long, Qinqin / Malone, James / McLaughlin, James A / McMurry, Julie A / Moxon, Sierra / Munoz-Torres, Monica C / Osumi-Sutherland, David / Overton, James A / Peters, Bjoern / Putman, Tim / Queralt-Rosinach, Núria / Shefchek, Kent / Solbrig, Harold / Thessen, Anne / Tudorache, Tania / Vasilevsky, Nicole / Wagner, Alex H / Mungall, Christopher J

    Database : the journal of biological databases and curation

    2023  Volume 2022

    Abstract Despite progress in the development of standards for describing and exchanging scientific information, the lack of easy-to-use standards for mapping between different representations of the same or similar objects in different databases poses a major impediment to data integration and interoperability. Mappings often lack the metadata needed to be correctly interpreted and applied. For example, are two terms equivalent or merely related? Are they narrow or broad matches? Or are they associated in some other way? Such relationships between the mapped terms are often not documented, which leads to incorrect assumptions and makes them hard to use in scenarios that require a high degree of precision (such as diagnostics or risk prediction). Furthermore, the lack of descriptions of how mappings were done makes it hard to combine and reconcile mappings, particularly curated and automated ones. We have developed the Simple Standard for Sharing Ontological Mappings (SSSOM) which addresses these problems by: (i) Introducing a machine-readable and extensible vocabulary to describe metadata that makes imprecision, inaccuracy and incompleteness in mappings explicit. (ii) Defining an easy-to-use simple table-based format that can be integrated into existing data science pipelines without the need to parse or query ontologies, and that integrates seamlessly with Linked Data principles. (iii) Implementing open and community-driven collaborative workflows that are designed to evolve the standard continuously to address changing requirements and mapping practices. (iv) Providing reference tools and software libraries for working with the standard. In this paper, we present the SSSOM standard, describe several use cases in detail and survey some of the existing work on standardizing the exchange of mappings, with the goal of making mappings Findable, Accessible, Interoperable and Reusable (FAIR). The SSSOM specification can be found at http://w3id.org/sssom/spec. 
    Database URL: http://w3id.org/sssom/spec
    MeSH term(s) Data Management ; Databases, Factual ; Metadata ; Semantic Web ; Workflow
    Language English
    Publishing date 2023-02-01
    Publishing country England
    Document type Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't ; Research Support, U.S. Gov't, Non-P.H.S.
    ZDB-ID 2496706-3
    ISSN (online) 1758-0463
    DOI 10.1093/database/baac035
    Database MEDical Literature Analysis and Retrieval System OnLINE
