LIVIVO - The Search Portal for Life Sciences

Search results

Results 1-10 of 60

  1. Article ; Online: Clinical natural language processing for secondary uses.

    Gao, Yanjun / Mahajan, Diwakar / Uzuner, Özlem / Yetisgen, Meliha

    Journal of biomedical informatics

    2024  Volume 150, Page(s) 104596

    MeSH term(s) Natural Language Processing
    Language English
    Publishing date 2024-01-24
    Publishing country United States
    Document type Editorial
    ZDB-ID 2057141-0
    ISSN (online) 1532-0480
    ISSN 1532-0464
    DOI 10.1016/j.jbi.2024.104596
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Article ; Online: Backdoor Adjustment of Confounding by Provenance for Robust Text Classification of Multi-institutional Clinical Notes.

    Ding, Xiruo / Sheng, Zhecheng / Yetişgen, Meliha / Pakhomov, Serguei / Cohen, Trevor

    AMIA ... Annual Symposium proceedings. AMIA Symposium

    2024  Volume 2023, Page(s) 923–932

    Abstract Natural Language Processing (NLP) methods have been broadly applied to clinical tasks. Machine learning and deep learning approaches have been used to improve the performance of clinical NLP. However, these approaches require sufficiently large datasets for training, and trained models have been shown to transfer poorly across sites. These issues have led to the promotion of data collection and integration across different institutions for accurate and portable models. However, this can introduce a form of bias called confounding by provenance. When source-specific data distributions differ at deployment, this may harm model performance. To address this issue, we evaluate the utility of backdoor adjustment for text classification in a multi-site dataset of clinical notes annotated for mentions of substance abuse. Using an evaluation framework devised to measure robustness to distributional shifts, we assess the utility of backdoor adjustment. Our results indicate that backdoor adjustment can effectively mitigate confounding shift.
    MeSH term(s) Humans ; Data Collection ; Electronic Health Records ; Machine Learning ; Natural Language Processing ; Substance-Related Disorders ; Multicenter Studies as Topic
    Language English
    Publishing date 2024-01-11
    Publishing country United States
    Document type Journal Article
    ISSN (online) 1942-597X
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article ; Online: Generalizing through Forgetting - Domain Generalization for Symptom Event Extraction in Clinical Notes.

    Zhou, Sitong / Lybarger, Kevin / Yetisgen, Meliha / Ostendorf, Mari

    AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science

    2023  Volume 2023, Page(s) 622–631

    Abstract Symptom information is primarily documented in free-text clinical notes and is not directly accessible for downstream applications. To address this challenge, information extraction approaches that can handle clinical language variation across different institutions and specialties are needed. In this paper, we present domain generalization for symptom extraction using pretraining and fine-tuning data that differs from the target domain in terms of institution and/or specialty and patient population. We extract symptom events using a transformer-based joint entity and relation extraction method. To reduce reliance on domain-specific features, we propose a domain generalization method that dynamically masks frequent symptom words in the source domain. Additionally, we pretrain the transformer language model (LM) on task-related unlabeled texts for better representation. Our experiments indicate that masking and adaptive pretraining methods can significantly improve performance when the source domain is more distant from the target domain.
    Language English
    Publishing date 2023-06-16
    Publishing country United States
    Document type Journal Article
    ZDB-ID 2676378-3
    ISSN (online) 2153-4063
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Article ; Online: The 2022 n2c2/UW shared task on extracting social determinants of health.

    Lybarger, Kevin / Yetisgen, Meliha / Uzuner, Özlem

    Journal of the American Medical Informatics Association : JAMIA

    2023  Volume 30, Issue 8, Page(s) 1367–1378

    Abstract Objective: The n2c2/UW SDOH Challenge explores the extraction of social determinant of health (SDOH) information from clinical notes. The objectives include the advancement of natural language processing (NLP) information extraction techniques for SDOH and clinical information more broadly. This article presents the shared task, data, participating teams, performance results, and considerations for future work.
    Materials and methods: The task used the Social History Annotated Corpus (SHAC), which consists of clinical text with detailed event-based annotations for SDOH events, such as alcohol, drug, tobacco, employment, and living situation. Each SDOH event is characterized through attributes related to status, extent, and temporality. The task includes 3 subtasks related to information extraction (Subtask A), generalizability (Subtask B), and learning transfer (Subtask C). In addressing this task, participants utilized a range of techniques, including rules, knowledge bases, n-grams, word embeddings, and pretrained language models (LM).
    Results: A total of 15 teams participated, and the top teams utilized pretrained deep learning LM. The top team across all subtasks used a sequence-to-sequence approach, achieving 0.901 F1 for Subtask A, 0.774 F1 for Subtask B, and 0.889 F1 for Subtask C.
    Conclusions: Similar to many NLP tasks and domains, pretrained LM yielded the best performance, including generalizability and learning transfer. An error analysis indicates extraction performance varies by SDOH, with lower performance achieved for conditions, like substance use and homelessness, which increase health risks (risk factors) and higher performance achieved for conditions, like substance abstinence and living with family, which reduce health risks (protective factors).
    MeSH term(s) Humans ; Social Determinants of Health ; Natural Language Processing ; Information Storage and Retrieval ; Electronic Health Records
    Language English
    Publishing date 2023-02-15
    Publishing country England
    Document type Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't
    ZDB-ID 1205156-1
    ISSN (online) 1527-974X
    ISSN 1067-5027
    DOI 10.1093/jamia/ocad012
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Article ; Online: Automatic Assignment of Radiology Examination Protocols Using Pre-trained Language Models with Knowledge Distillation.

    Lau, Wilson / Aaltonen, Laura / Gunn, Martin / Yetisgen, Meliha

    AMIA ... Annual Symposium proceedings. AMIA Symposium

    2022  Volume 2021, Page(s) 668–676

    Abstract Selecting a radiology examination protocol is a repetitive and time-consuming process. In this paper, we present a deep learning approach to automatically assign protocols to computed tomography examinations, by pre-training a domain-specific BERT model (BERT
    MeSH term(s) Humans ; Language ; Natural Language Processing ; Radiography ; Radiology ; Support Vector Machine
    Language English
    Publishing date 2022-02-21
    Publishing country United States
    Document type Journal Article ; Research Support, N.I.H., Extramural
    ISSN (online) 1942-597X
    Database MEDical Literature Analysis and Retrieval System OnLINE

  6. Article ; Online: Call for papers: Special issue on clinical natural language processing for secondary use applications.

    Yetisgen, Meliha / Uzuner, Ozlem / Gao, Yanjun / Mahajan, Diwakar

    Journal of biomedical informatics

    2022  Volume 133, Page(s) 104152

    MeSH term(s) Natural Language Processing
    Language English
    Publishing date 2022-08-17
    Publishing country United States
    Document type Editorial
    ZDB-ID 2057141-0
    ISSN (online) 1532-0480
    ISSN 1532-0464
    DOI 10.1016/j.jbi.2022.104152
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Book ; Online: The 2022 n2c2/UW Shared Task on Extracting Social Determinants of Health

    Lybarger, Kevin / Yetisgen, Meliha / Uzuner, Özlem

    2023  

    Abstract Objective: The n2c2/UW SDOH Challenge explores the extraction of social determinant of health (SDOH) information from clinical notes. The objectives include the advancement of natural language processing (NLP) information extraction techniques for SDOH and clinical information more broadly. This paper presents the shared task, data, participating teams, performance results, and considerations for future work.
    Materials and Methods: The task used the Social History Annotated Corpus (SHAC), which consists of clinical text with detailed event-based annotations for SDOH events such as alcohol, drug, tobacco, employment, and living situation. Each SDOH event is characterized through attributes related to status, extent, and temporality. The task includes three subtasks related to information extraction (Subtask A), generalizability (Subtask B), and learning transfer (Subtask C). In addressing this task, participants utilized a range of techniques, including rules, knowledge bases, n-grams, word embeddings, and pretrained language models (LM).
    Results: A total of 15 teams participated, and the top teams utilized pretrained deep learning LM. The top team across all subtasks used a sequence-to-sequence approach, achieving 0.901 F1 for Subtask A, 0.774 F1 for Subtask B, and 0.889 F1 for Subtask C.
    Conclusions: Similar to many NLP tasks and domains, pretrained LM yielded the best performance, including generalizability and learning transfer. An error analysis indicates extraction performance varies by SDOH, with lower performance achieved for conditions, like substance use and homelessness, that increase health risks (risk factors) and higher performance achieved for conditions, like substance abstinence and living with family, that reduce health risks (protective factors).
    Keywords Computer Science - Computation and Language ; I.2.7
    Subject code 400 ; 006
    Publishing date 2023-01-13
    Publishing country United States
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  8. Book ; Online: Building blocks for complex tasks: Robust generative event extraction for radiology reports under domain shifts

    Zhou, Sitong / Yetisgen, Meliha / Ostendorf, Mari

    2023  

    Abstract This paper explores methods for extracting information from radiology reports that generalize across exam modalities to reduce requirements for annotated data. We demonstrate that multi-pass T5-based text-to-text generative models exhibit better generalization across exam modalities compared to approaches that employ BERT-based task-specific classification layers. We then develop methods that reduce the inference cost of the model, making large-scale corpus processing more feasible for clinical applications. Specifically, we introduce a generative technique that decomposes complex tasks into smaller subtask blocks, which improves a single-pass model when combined with multitask training. In addition, we leverage target-domain contexts during inference to enhance domain adaptation, enabling use of smaller models. Analyses offer insights into the benefits of different cost reduction strategies.
    Keywords Computer Science - Computation and Language
    Subject code 006 ; 004
    Publishing date 2023-06-15
    Publishing country United States
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  9. Article ; Online: Event-Based Clinical Finding Extraction from Radiology Reports with Pre-trained Language Model.

    Lau, Wilson / Lybarger, Kevin / Gunn, Martin L / Yetisgen, Meliha

    Journal of digital imaging

    2022  Volume 36, Issue 1, Page(s) 91–104

    Abstract Radiology reports contain a diverse and rich set of clinical abnormalities documented by radiologists during their interpretation of the images. Comprehensive semantic representations of radiological findings would enable a wide range of secondary use applications to support diagnosis, triage, outcomes prediction, and clinical research. In this paper, we present a new corpus of radiology reports annotated with clinical findings. Our annotation schema captures detailed representations of pathologic findings that are observable on imaging ("lesions") and other types of clinical problems ("medical problems"). The schema used an event-based representation to capture fine-grained details, including assertion, anatomy, characteristics, size, and count. Our gold standard corpus contained a total of 500 annotated computed tomography (CT) reports. We extracted triggers and argument entities using two state-of-the-art deep learning architectures, including BERT. We then predicted the linkages between trigger and argument entities (referred to as argument roles) using a BERT-based relation extraction model. We achieved the best extraction performance using a BERT model pre-trained on 3 million radiology reports from our institution: 90.9-93.4% F1 for finding triggers and 72.0-85.6% F1 for argument roles. To assess model generalizability, we used an external validation set randomly sampled from the MIMIC Chest X-ray (MIMIC-CXR) database. The extraction performance on this validation set was 95.6% for finding triggers and 79.1-89.7% for argument roles, demonstrating that the model generalized well to the cross-institutional data with a different imaging modality. We extracted the finding events from all the radiology reports in the MIMIC-CXR database and provided the extractions to the research community.
    MeSH term(s) Humans ; Radiology ; Tomography, X-Ray Computed ; Semantics ; Research Report ; Natural Language Processing
    Language English
    Publishing date 2022-10-17
    Publishing country United States
    Document type Journal Article ; Research Support, N.I.H., Extramural
    ZDB-ID 1033897-4
    ISSN (online) 1618-727X
    ISSN 0897-1889
    DOI 10.1007/s10278-022-00717-5
    Database MEDical Literature Analysis and Retrieval System OnLINE

  10. Article ; Online: The Leaf Clinical Trials Corpus: a new resource for query generation from clinical trial eligibility criteria.

    Dobbins, Nicholas J / Mullen, Tony / Uzuner, Özlem / Yetisgen, Meliha

    Scientific data

    2022  Volume 9, Issue 1, Page(s) 490

    Abstract Identifying cohorts of patients based on eligibility criteria such as medical conditions, procedures, and medication use is critical to recruitment for clinical trials. Such criteria are often most naturally described in free text, using language familiar to clinicians and researchers. In order to identify potential participants at scale, these criteria must first be translated into queries on clinical databases, which can be labor-intensive and error-prone. Natural language processing (NLP) methods offer a potential means of converting such criteria into database queries automatically. However, they must first be trained and evaluated using corpora that capture clinical trial criteria in sufficient detail. In this paper, we introduce the Leaf Clinical Trials (LCT) corpus, a human-annotated corpus of over 1,000 clinical trial eligibility criteria descriptions using highly granular structured labels capturing a range of biomedical phenomena. We provide details of our schema, annotation process, corpus quality, and statistics. Additionally, we present baseline information extraction results on this corpus as benchmarks for future work.
    MeSH term(s) Clinical Trials as Topic/standards ; Databases, Factual ; Humans ; Information Storage and Retrieval ; Natural Language Processing ; Patient Selection
    Language English
    Publishing date 2022-08-11
    Publishing country England
    Document type Dataset ; Journal Article
    ZDB-ID 2775191-0
    ISSN (online) 2052-4463
    DOI 10.1038/s41597-022-01521-0
    Database MEDical Literature Analysis and Retrieval System OnLINE
