LIVIVO - The Search Portal for Life Sciences

Search results

Results 1-10 of 39

  1. Article ; Online: Semantics-enabled biomedical literature analytics.

    Kilicoglu, Halil / Ensan, Faezeh / McInnes, Bridget / Wang, Lucy Lu

    Journal of biomedical informatics

    2024  Volume 150, Page(s) 104588

    MeSH term(s) Semantics ; Data Mining ; Data Science
    Language English
    Publishing date 2024-01-19
    Publishing country United States
    Document type Editorial ; Research Support, N.I.H., Extramural
    ZDB-ID 2057141-0
    ISSN (online) 1532-0480
    ISSN (print) 1532-0464
    DOI 10.1016/j.jbi.2024.104588
    Database MEDical Literature Analysis and Retrieval System OnLINE


  2. Article ; Online: Text mining approaches for dealing with the rapidly expanding literature on COVID-19.

    Wang, Lucy Lu / Lo, Kyle

    Briefings in bioinformatics

    2020  Volume 22, Issue 2, Page(s) 781–799

    Abstract More than 50 000 papers have been published about COVID-19 since the beginning of 2020 and several hundred new papers continue to be published every day. This incredible rate of scientific productivity leads to information overload, making it difficult for researchers, clinicians and public health officials to keep up with the latest findings. Automated text mining techniques for searching, reading and summarizing papers are helpful for addressing information overload. In this review, we describe the many resources that have been introduced to support text mining applications over the COVID-19 literature; specifically, we discuss the corpora, modeling resources, systems and shared tasks that have been introduced for COVID-19. We compile a list of 39 systems that provide functionality such as search, discovery, visualization and summarization over the COVID-19 literature. For each system, we provide a qualitative description and assessment of the system's performance, unique data or user interface features and modeling decisions. Many systems focus on search and discovery, though several systems provide novel features, such as the ability to summarize findings over multiple documents or linking between scientific articles and clinical trials. We also describe the public corpora, models and shared tasks that have been introduced to help reduce repeated effort among community members; some of these resources (especially shared tasks) can provide a basis for comparing the performance of different systems. Finally, we summarize promising results and open challenges for text mining the COVID-19 literature.
    MeSH term(s) COVID-19/epidemiology ; COVID-19/virology ; Data Mining/methods ; Humans ; SARS-CoV-2/isolation & purification
    Language English
    Publishing date 2020-12-04
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't ; Review
    ZDB-ID 2068142-2
    ISSN (online) 1477-4054
    ISSN (print) 1467-5463
    DOI 10.1093/bib/bbaa296
    Database MEDical Literature Analysis and Retrieval System OnLINE


  3. Article ; Online: Call for papers: Semantics-enabled biomedical literature analytics.

    Kilicoglu, Halil / Ensan, Faezeh / McInnes, Bridget / Wang, Lucy Lu

    Journal of biomedical informatics

    2022  Volume 132, Page(s) 104134

    MeSH term(s) Data Mining ; Data Science ; Publications ; Semantics
    Language English
    Publishing date 2022-07-16
    Publishing country United States
    Document type Editorial
    ZDB-ID 2057141-0
    ISSN (online) 1532-0480
    ISSN (print) 1532-0464
    DOI 10.1016/j.jbi.2022.104134
    Database MEDical Literature Analysis and Retrieval System OnLINE


  4. Book ; Online: The Rise of Open Science

    Cao, Hancheng / Dodge, Jesse / Lo, Kyle / McFarland, Daniel A. / Wang, Lucy Lu

    Tracking the Evolution and Perceived Value of Data and Methods Link-Sharing Practices

    2023  

    Abstract In recent years, funding agencies and journals increasingly advocate for open science practices (e.g. data and method sharing) to improve the transparency, access, and reproducibility of science. However, quantifying these practices at scale has proven difficult. In this work, we leverage a large-scale dataset of 1.1M papers from arXiv that are representative of the fields of physics, math, and computer science to analyze the adoption of data and method link-sharing practices over time and their impact on article reception. To identify links to data and methods, we train a neural text classification model to automatically classify URL types based on contextual mentions in papers. We find evidence that the practice of link-sharing to methods and data is spreading as more papers include such URLs over time. Reproducibility efforts may also be spreading because the same links are being increasingly reused across papers (especially in computer science); and these links are increasingly concentrated within fewer web domains (e.g. Github) over time. Lastly, articles that share data and method links receive increased recognition in terms of citation count, with a stronger effect when the shared links are active (rather than defunct). Together, these findings demonstrate the increased spread and perceived value of data and method sharing practices in open science.
    Keywords Computer Science - Digital Libraries ; Computer Science - Computation and Language ; Computer Science - Computers and Society ; Physics - History and Philosophy of Physics ; Physics - Physics and Society
    Subject code 001
    Publishing date 2023-10-04
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  5. Book ; Online: Designing Guiding Principles for NLP for Healthcare

    Antoniak, Maria / Naik, Aakanksha / Alvarado, Carla S. / Wang, Lucy Lu / Chen, Irene Y.

    A Case Study of Maternal Health

    2023  

    Abstract Objective: An ethical framework for the use of large language models (LLMs) is urgently needed to shape how natural language processing (NLP) tools are used for healthcare applications. Drawing directly from the voices of those most affected, we propose a set of guiding principles for the use of NLP in healthcare, with examples based on applications in maternal health. Materials and Methods: We led an interactive session centered on an LLM-based chatbot demonstration during a full-day workshop with 39 participants, and additionally surveyed 30 healthcare workers and 30 birthing people about their values, needs, and perceptions of AI and LLMs. We conducted quantitative and qualitative analyses of the interactive discussions to consolidate our findings into a set of guiding principles. Results: Using the case study of maternal health, we propose nine principles for ethical use of LLMs, grouped into three categories: (i) contextual significance, (ii) measurements, and (iii) who/what is valued. We describe rationales underlying these principles and provide practical advice. Discussion: Healthcare faces existing challenges including the balance of power in clinician-patient relationships, systemic health disparities, historical injustices, and economic constraints. Our principles serve as a framework for surfacing key considerations when deploying LLMs in medicine, as well as providing a methodological pattern for other researchers to follow. Conclusion: This set of principles can serve as a resource to practitioners working on maternal health and other healthcare fields to emphasize the importance of technical nuance, historical context, and inclusive design when developing LLMs for use in clinical settings.
    Keywords Computer Science - Computation and Language
    Subject code 170
    Publishing date 2023-12-18
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  6. Book ; Online: APPLS

    Guo, Yue / August, Tal / Leroy, Gondy / Cohen, Trevor / Wang, Lucy Lu

    Evaluating Evaluation Metrics for Plain Language Summarization

    2023  

    Abstract While there has been significant development of models for Plain Language Summarization (PLS), evaluation remains a challenge. PLS lacks a dedicated assessment metric, and the suitability of text generation evaluation metrics is unclear due to the unique transformations involved (e.g., adding background explanations, removing specialized terminology). To address these concerns, our study presents a granular meta-evaluation testbed, APPLS, designed to evaluate metrics for PLS. We define a set of perturbations along four criteria inspired by previous work that a PLS metric should capture: informativeness, simplification, coherence, and faithfulness. An analysis of metrics using our testbed reveals that current metrics fail to capture simplification consistently. In response, we introduce POMME, a new metric designed to assess text simplification in PLS; the metric is calculated as the normalized perplexity difference between an in-domain and out-of-domain language model. We demonstrate POMME's correlation with fine-grained variations in simplification and validate its sensitivity across 4 text simplification datasets. This work contributes the first meta-evaluation testbed for PLS and a comprehensive evaluation of existing metrics. The APPLS testbed and POMME are available at https://github.com/LinguisticAnomalies/APPLS.
    Keywords Computer Science - Computation and Language
    Subject code 410
    Publishing date 2023-05-23
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  7. Article ; Online: Automatic question answering for multiple stakeholders, the epidemic question answering dataset.

    Goodwin, Travis R / Demner-Fushman, Dina / Lo, Kyle / Wang, Lucy Lu / Dang, Hoa T / Soboroff, Ian M

    Scientific data

    2022  Volume 9, Issue 1, Page(s) 432

    Abstract One of the effects of the COVID-19 pandemic is a rapidly growing and changing stream of publications to inform clinicians, researchers, policy makers, and patients about the health, socio-economic, and cultural consequences of the pandemic. Managing this information stream manually is not feasible. Automatic Question Answering can quickly bring the most salient points to the user's attention. Leveraging a collection of scientific articles, government websites, relevant news articles, curated social media posts, and questions asked by researchers, clinicians, and the general public, we developed a dataset to explore automatic Question Answering for multiple stakeholders. Analysis of questions asked by various stakeholders shows that while information needs of experts and the public may overlap, satisfactory answers to these questions often originate from different information sources or benefit from different approaches to answer generation. We believe that this dataset has the potential to support the development of question answering systems not only for epidemic questions, but for other domains with varying expertise such as legal or finance.
    MeSH term(s) COVID-19 ; Humans ; Pandemics
    Language English
    Publishing date 2022-07-21
    Publishing country England
    Document type Dataset ; Journal Article
    ZDB-ID 2775191-0
    ISSN (online) 2052-4463
    DOI 10.1038/s41597-022-01533-w
    Database MEDical Literature Analysis and Retrieval System OnLINE


  8. Book ; Online: Paper Plain

    August, Tal / Wang, Lucy Lu / Bragg, Jonathan / Hearst, Marti A. / Head, Andrew / Lo, Kyle

    Making Medical Research Papers Approachable to Healthcare Consumers with Natural Language Processing

    2022  

    Abstract When seeking information not covered in patient-friendly documents, like medical pamphlets, healthcare consumers may turn to the research literature. Reading medical papers, however, can be a challenging experience. To improve access to medical papers, we introduce a novel interactive interface-Paper Plain-with four features powered by natural language processing: definitions of unfamiliar terms, in-situ plain language section summaries, a collection of key questions that guide readers to answering passages, and plain language summaries of the answering passages. We evaluate Paper Plain, finding that participants who use Paper Plain have an easier time reading and understanding research papers without a loss in paper comprehension compared to those who use a typical PDF reader. Altogether, the study results suggest that guiding readers to relevant passages and providing plain language summaries, or "gists," alongside the original paper content can make reading medical papers easier and give readers more confidence to approach these papers.

    Comment: 39 pages, 10 figures
    Keywords Computer Science - Human-Computer Interaction ; Computer Science - Computation and Language ; H.5.2
    Subject code 303
    Publishing date 2022-02-28
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  9. Book ; Online: Personalized Jargon Identification for Enhanced Interdisciplinary Communication

    Guo, Yue / Chang, Joseph Chee / Antoniak, Maria / Bransom, Erin / Cohen, Trevor / Wang, Lucy Lu / August, Tal

    2023  

    Abstract Scientific jargon can impede researchers when they read materials from other domains. Current methods of jargon identification mainly use corpus-level familiarity indicators (e.g., Simple Wikipedia represents plain language). However, researchers' familiarity with a term can vary greatly based on their own background. We collect a dataset of over 10K term familiarity annotations from 11 computer science researchers for terms drawn from 100 paper abstracts. Analysis of this data reveals that jargon familiarity and information needs vary widely across annotators, even within the same sub-domain (e.g., NLP). We investigate features representing individual, sub-domain, and domain knowledge to predict individual jargon familiarity. We compare supervised and prompt-based approaches, finding that prompt-based methods including personal publications yield the highest accuracy, though zero-shot prompting provides a strong baseline. This research offers insight into features and methods to integrate personal data into scientific jargon identification.
    Keywords Computer Science - Computation and Language
    Subject code 306
    Publishing date 2023-11-15
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  10. Book ; Online: Automated Metrics for Medical Multi-Document Summarization Disagree with Human Evaluations

    Wang, Lucy Lu / Otmakhova, Yulia / DeYoung, Jay / Truong, Thinh Hung / Kuehl, Bailey E. / Bransom, Erin / Wallace, Byron C.

    2023  

    Abstract Evaluating multi-document summarization (MDS) quality is difficult. This is especially true in the case of MDS for biomedical literature reviews, where models must synthesize contradicting evidence reported across different documents. Prior work has shown that rather than performing the task, models may exploit shortcuts that are difficult to detect using standard n-gram similarity metrics such as ROUGE. Better automated evaluation metrics are needed, but few resources exist to assess metrics when they are proposed. Therefore, we introduce a dataset of human-assessed summary quality facets and pairwise preferences to encourage and support the development of better automated evaluation methods for literature review MDS. We take advantage of community submissions to the Multi-document Summarization for Literature Review (MSLR) shared task to compile a diverse and representative sample of generated summaries. We analyze how automated summarization evaluation metrics correlate with lexical features of generated summaries, with other automated metrics including several we propose in this work, and with aspects of human-assessed summary quality. We find that not only do automated metrics fail to capture aspects of quality as assessed by humans, in many cases the system rankings produced by these metrics are anti-correlated with rankings according to human annotators.

    Comment: ACL 2023; Github: https://github.com/allenai/mslr-annotated-dataset
    Keywords Computer Science - Computation and Language
    Subject code 410
    Publishing date 2023-05-23
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
