LIVIVO - Search results

Result 1 - 10 of total 172

Article ; Online: Gene Set Summarization using Large Language Models.

Joachimiak, Marcin P / Caufield, J Harry / Harris, Nomi L / Kim, Hyeongsik / Mungall, Christopher J

ArXiv

2023

Abstract	Molecular biologists frequently interpret gene lists derived from high-throughput experiments and computational analysis. This is typically done as a statistical enrichment analysis that measures the over- or under-representation of biological function terms associated with genes or their properties, based on curated assertions from a knowledge base (KB) such as the Gene Ontology (GO). Interpreting gene lists can also be framed as a textual summarization task, enabling the use of Large Language Models (LLMs), potentially utilizing scientific texts directly and avoiding reliance on a KB. We developed SPINDOCTOR (Structured Prompt Interpolation of Natural Language Descriptions of Controlled Terms for Ontology Reporting), a method that uses GPT models to perform gene set function summarization as a complement to standard enrichment analysis. This method can use different sources of gene functional information: (1) structured text derived from curated ontological KB annotations, (2) ontology-free narrative gene summaries, or (3) direct model retrieval. We demonstrate that these methods are able to generate plausible and biologically valid summary GO term lists for gene sets. However, GPT-based approaches are unable to deliver reliable scores or p-values and often return terms that are not statistically significant. Crucially, these methods were rarely able to recapitulate the most precise and informative term from standard enrichment, likely due to an inability to generalize and reason using an ontology. Results are highly nondeterministic, with minor variations in prompt resulting in radically different term lists. Our results show that at this point, LLM-based methods are unsuitable as a replacement for standard term enrichment analysis and that manual curation of ontological assertions remains necessary.
Language	English
Publishing date	2023-05-25
Publishing country	United States
Document type	Preprint
ISSN	2331-8422
ISSN (online)	2331-8422
Database	MEDical Literature Analysis and Retrieval System OnLINE

Article ; Online: BOSC 2022: the first hybrid and 23rd annual Bioinformatics Open Source Conference.

Harris, Nomi L / Hokamp, Karsten / Ménager, Hervé / Munoz-Torres, Monica / Unni, Deepak / Vasilevsky, Nicole / Williams, Jason

F1000Research

2022 Volume 11, Page(s) 1034

Abstract	The 23
MeSH term(s)	Computational Biology ; Congresses as Topic ; Humans ; Systems Biology
Language	English
Publishing date	2022-09-12
Publishing country	England
Document type	Editorial
ZDB-ID	2699932-8
ISSN (online)	2046-1402
DOI	10.12688/f1000research.125043.1
Database	MEDical Literature Analysis and Retrieval System OnLINE

Article ; Online: Structured Prompt Interrogation and Recursive Extraction of Semantics (SPIRES): a method for populating knowledge bases using zero-shot learning.

Caufield, J Harry / Hegde, Harshad / Emonet, Vincent / Harris, Nomi L / Joachimiak, Marcin P / Matentzoglu, Nicolas / Kim, HyeongSik / Moxon, Sierra / Reese, Justin T / Haendel, Melissa A / Robinson, Peter N / Mungall, Christopher J

Bioinformatics (Oxford, England)

2024 Volume 40, Issue 3

Abstract	Motivation: Creating knowledge bases and ontologies is a time consuming task that relies on manual curation. AI/NLP approaches can assist expert curators in populating these knowledge bases, but current approaches rely on extensive training data, and are not able to populate arbitrarily complex nested knowledge schemas. Results: Here we present Structured Prompt Interrogation and Recursive Extraction of Semantics (SPIRES), a Knowledge Extraction approach that relies on the ability of Large Language Models (LLMs) to perform zero-shot learning and general-purpose query answering from flexible prompts and return information conforming to a specified schema. Given a detailed, user-defined knowledge schema and an input text, SPIRES recursively performs prompt interrogation against an LLM to obtain a set of responses matching the provided schema. SPIRES uses existing ontologies and vocabularies to provide identifiers for matched elements. We present examples of applying SPIRES in different domains, including extraction of food recipes, multi-species cellular signaling pathways, disease treatments, multi-step drug mechanisms, and chemical to disease relationships. Current SPIRES accuracy is comparable to the mid-range of existing Relation Extraction methods, but greatly surpasses an LLM's native capability of grounding entities with unique identifiers. SPIRES has the advantage of easy customization, flexibility, and, crucially, the ability to perform new tasks in the absence of any new training data. This method supports a general strategy of leveraging the language interpreting capabilities of LLMs to assemble knowledge bases, assisting manual knowledge curation and acquisition while supporting validation with publicly-available databases and ontologies external to the LLM. Availability and implementation: SPIRES is available as part of the open source OntoGPT package: https://github.com/monarch-initiative/ontogpt.
MeSH term(s)	Semantics ; Knowledge Bases ; Databases, Factual
Language	English
Publishing date	2024-02-20
Publishing country	England
Document type	Journal Article
ZDB-ID	1422668-6
ISSN (online)	1367-4811
ISSN	1367-4803
DOI	10.1093/bioinformatics/btae104
Database	MEDical Literature Analysis and Retrieval System OnLINE

In stock of ZB MED Cologne/Königswinter

Zs.A 2374: Show issues

Location:
Depending on availability (see the holdings information)
up to 1994 volume: article orders via the online order form
1995 - 2021 volumes: reading room (2nd floor)
from 2022 volume: reading room (ground floor)

Book ; Online: Gene Set Summarization using Large Language Models

Joachimiak, Marcin P. / Caufield, J. Harry / Harris, Nomi L. / Kim, Hyeongsik / Mungall, Christopher J.

2023

Abstract	Molecular biologists frequently interpret gene lists derived from high-throughput experiments and computational analysis. This is typically done as a statistical enrichment analysis that measures the over- or under-representation of biological function terms associated with genes or their properties, based on curated assertions from a knowledge base (KB) such as the Gene Ontology (GO). Interpreting gene lists can also be framed as a textual summarization task, enabling the use of Large Language Models (LLMs), potentially utilizing scientific texts directly and avoiding reliance on a KB. We developed SPINDOCTOR (Structured Prompt Interpolation of Natural Language Descriptions of Controlled Terms for Ontology Reporting), a method that uses GPT models to perform gene set function summarization as a complement to standard enrichment analysis. This method can use different sources of gene functional information: (1) structured text derived from curated ontological KB annotations, (2) ontology-free narrative gene summaries, or (3) direct model retrieval. We demonstrate that these methods are able to generate plausible and biologically valid summary GO term lists for gene sets. However, GPT-based approaches are unable to deliver reliable scores or p-values and often return terms that are not statistically significant. Crucially, these methods were rarely able to recapitulate the most precise and informative term from standard enrichment, likely due to an inability to generalize and reason using an ontology. Results are highly nondeterministic, with minor variations in prompt resulting in radically different term lists. Our results show that at this point, LLM-based methods are unsuitable as a replacement for standard term enrichment analysis and that manual curation of ontological assertions remains necessary.
Keywords	Quantitative Biology - Genomics ; Computer Science - Artificial Intelligence ; Computer Science - Computation and Language ; Quantitative Biology - Quantitative Methods
Subject code	004
Publishing date	2023-05-20
Publishing country	United States
Document type	Book ; Online
Database	BASE - Bielefeld Academic Search Engine (life sciences selection)

Article ; Online: BOSC 2023, the 24th annual Bioinformatics Open Source Conference.

Harris, Nomi L / Fields, Christopher J / Hokamp, Karsten / Just, Jérémy / Khetani, Radhika / Maia, Jessica / Ménager, Hervé / Munoz-Torres, Monica C / Unni, Deepak / Williams, Jason

F1000Research

2023 Volume 12, Page(s) 1568

Abstract	The 24th annual Bioinformatics Open Source Conference (BOSC 2023) was part of the 2023 conference on Intelligent Systems for Molecular Biology and the European Conference on Computational Biology (ISMB/ECCB 2023). Launched in 2000 and held yearly since, BOSC is the premier meeting covering open-source bioinformatics and open science. Like ISMB 2022, the 2023 meeting was a hybrid conference, with the in-person component hosted in Lyon, France. ISMB/ECCB attracted a near-record number of attendees, with over 2100 in person and about 900 more online. Approximately 200 people participated in BOSC sessions. In addition to 43 talks and 49 posters, BOSC featured two keynotes: Sara El-Gebali, who spoke about "A New Odyssey: Pioneering the Future of Scientific Progress Through Open Collaboration", and Joseph Yracheta, who spoke about "The Dissonance between Scientific Altruism & Capitalist Extraction: The Zero Trust and Federated Data Sovereignty Solution." Once again, a joint session brought together BOSC and the Bio-Ontologies COSI. The conference ended with a panel on Open and Ethical Data Sharing. As in prior years, BOSC was preceded by a CollaborationFest, a collaborative work event that brought together about 40 participants interested in synergistically combining ideas, shaping project plans, developing software, and more.
MeSH term(s)	Humans ; Computational Biology ; Software ; Information Dissemination
Language	English
Publishing date	2023-12-07
Publishing country	England
Document type	Editorial
ZDB-ID	2699932-8
ISSN (online)	2046-1402
DOI	10.12688/f1000research.143015.1
Database	MEDical Literature Analysis and Retrieval System OnLINE

Article ; Online: BOSC 2021, the 22nd Annual Bioinformatics Open Source Conference.

Harris, Nomi L / Cock, Peter J A / Fields, Christopher J / Hokamp, Karsten / Maia, Jessica / Munoz-Torres, Monica / Sharan, Malvika / Williams, Jason

F1000Research

2021 Volume 10

Abstract	The 22nd annual Bioinformatics Open Source Conference (BOSC 2021, open-bio.org/events/bosc-2021/) was held online as a track of the 2021 Intelligent Systems for Molecular Biology / European Conference on Computational Biology (ISMB/ECCB) conference. Launched in 2000 and held every year since, BOSC is the premier meeting covering topics related to open source software and open science in bioinformatics. In 2020, BOSC partnered with the Galaxy Community Conference to form the Bioinformatics Community Conference (BCC2020); that was the first BOSC to be held online. This year, BOSC returned to its roots as part of ISMB/ECCB 2021. As in 2020, the Covid-19 pandemic made it impossible to hold the conference in person, so ISMB/ECCB 2021 took place as an online meeting attended by over 2000 people from 79 countries. Nearly 200 people participated in BOSC sessions, which included 27 talks reviewed and selected from submitted abstracts, and three invited keynote talks representing a range of global perspectives on the role of open science and open source in driving research and inclusivity in the biosciences, one of which was presented in French with English subtitles.
MeSH term(s)	Computational Biology ; Humans ; Pandemics ; Software
Language	English
Publishing date	2021-10-18
Publishing country	England
Document type	Congress ; Editorial
ZDB-ID	2699932-8
ISSN (online)	2046-1402
DOI	10.12688/f1000research.74074.1
Database	MEDical Literature Analysis and Retrieval System OnLINE

Article ; Online: Author Correction: Brain Data Standards - A method for building data-driven cell-type ontologies.

Scientific data

2023 Volume 10, Issue 1, Page(s) 246

Language	English
Publishing date	2023-04-28
Publishing country	England
Document type	Published Erratum
ZDB-ID	2775191-0
ISSN (online)	2052-4463
DOI	10.1038/s41597-023-02165-4
Database	MEDical Literature Analysis and Retrieval System OnLINE

Article ; Online: The 21st annual Bioinformatics Open Source Conference (BOSC 2020, part of BCC2020).

Harris, Nomi L / Cock, Peter J A / Fields, Christopher J / Hokamp, Karsten / Maia, Jessica / Munoz-Torres, Monica / Taschuk, Morgan / Yehudi, Yo

F1000Research

2020 Volume 9

Abstract	Launched in 2000 and held every year since, the Bioinformatics Open Source Conference (BOSC) is a volunteer-run meeting coordinated by the Open Bioinformatics Foundation (OBF) that covers open source software development and open science in bioinformatics. Most years, BOSC has been part of the Intelligent Systems for Molecular Biology (ISMB) conference, but in 2018, and again in 2020, BOSC partnered with the Galaxy Community Conference (GCC). This year's combined BOSC + GCC conference was called the Bioinformatics Community Conference (BCC2020, bcc2020.github.io). Originally slated to take place in Toronto, Canada, BCC2020 was moved online due to COVID-19. The meeting started with a wide array of training sessions; continued with a main program of keynote presentations, talks, posters, Birds of a Feather, and more; and ended with four days of collaboration (CoFest). Efforts to make the meeting accessible and inclusive included very low registration fees, talks presented twice a day, and closed captioning for all videos. More than 800 people from 61 countries registered for at least one part of the meeting, which was held mostly in the Remo.co video-conferencing platform.
MeSH term(s)	Canada ; Computational Biology ; Congresses as Topic ; Humans
Keywords	covid19
Language	English
Publishing date	2020-09-21
Publishing country	England
Document type	Editorial
ZDB-ID	2699932-8
ISSN (online)	2046-1402
DOI	10.12688/f1000research.26498.1
Database	MEDical Literature Analysis and Retrieval System OnLINE

Article ; Online: Brain Data Standards - A method for building data-driven cell-type ontologies

Scientific Data, Vol 10, Iss 1, Pp 1-

2023 Volume 11

Abstract	Large-scale single-cell 'omics profiling is being used to define a complete catalogue of brain cell types, something that traditional methods struggle with due to the diversity and complexity of the brain. But this poses a problem: How do we organise such a catalogue - providing a standard way to refer to the cell types discovered, linking their classification and properties to supporting data? Cell ontologies provide a partial solution to these problems, but no existing ontology schemas support the definition of cell types by direct reference to supporting data, classification of cell types using classifications derived directly from data, or links from cell types to marker sets along with confidence scores. Here we describe a generally applicable schema that solves these problems and its application in a semi-automated pipeline to build a data-linked extension to the Cell Ontology representing cell types in the Primary Motor Cortex of humans, mice and marmosets. The methods and resulting ontology are designed to be scalable and applicable to similar whole-brain atlases currently in preparation.
Keywords	Science ; Q
Subject code	004
Language	English
Publishing date	2023-01-01
Publisher	Nature Portfolio
Document type	Article ; Online
Database	BASE - Bielefeld Academic Search Engine (life sciences selection)

Article ; Online: Brain Data Standards - A method for building data-driven cell-type ontologies.

Scientific data

2023 Volume 10, Issue 1, Page(s) 50

Abstract	Large-scale single-cell 'omics profiling is being used to define a complete catalogue of brain cell types, something that traditional methods struggle with due to the diversity and complexity of the brain. But this poses a problem: How do we organise such a catalogue - providing a standard way to refer to the cell types discovered, linking their classification and properties to supporting data? Cell ontologies provide a partial solution to these problems, but no existing ontology schemas support the definition of cell types by direct reference to supporting data, classification of cell types using classifications derived directly from data, or links from cell types to marker sets along with confidence scores. Here we describe a generally applicable schema that solves these problems and its application in a semi-automated pipeline to build a data-linked extension to the Cell Ontology representing cell types in the Primary Motor Cortex of humans, mice and marmosets. The methods and resulting ontology are designed to be scalable and applicable to similar whole-brain atlases currently in preparation.
MeSH term(s)	Animals ; Humans ; Mice ; Biological Ontologies ; Brain ; Callithrix ; Data Collection/standards
Language	English
Publishing date	2023-01-24
Publishing country	England
Document type	Journal Article
ZDB-ID	2775191-0
ISSN (online)	2052-4463
DOI	10.1038/s41597-022-01886-2
Database	MEDical Literature Analysis and Retrieval System OnLINE
