LIVIVO - Search results

Result 1 - 10 of total 36

Article ; Online: Generalizing through Forgetting - Domain Generalization for Symptom Event Extraction in Clinical Notes.

Zhou, Sitong / Lybarger, Kevin / Yetisgen, Meliha / Ostendorf, Mari

AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science

2023 Volume 2023, Page(s) 622–631

Abstract	Symptom information is primarily documented in free-text clinical notes and is not directly accessible for downstream applications. To address this challenge, information extraction approaches that can handle clinical language variation across different institutions and specialties are needed. In this paper, we present domain generalization for symptom extraction using pretraining and fine-tuning data that differs from the target domain in terms of institution and/or specialty and patient population. We extract symptom events using a transformer-based joint entity and relation extraction method. To reduce reliance on domain-specific features, we propose a domain generalization method that dynamically masks frequent symptom words in the source domain. Additionally, we pretrain the transformer language model (LM) on task-related unlabeled texts for better representation. Our experiments indicate that masking and adaptive pretraining methods can significantly improve performance when the source domain is more distant from the target domain.
Language	English
Publishing date	2023-06-16
Publishing country	United States
Document type	Journal Article
ZDB-ID	2676378-3
ISSN (online)	2153-4063
Database	MEDical Literature Analysis and Retrieval System OnLINE

Article ; Online: The 2022 n2c2/UW shared task on extracting social determinants of health.

Lybarger, Kevin / Yetisgen, Meliha / Uzuner, Özlem

Journal of the American Medical Informatics Association : JAMIA

2023 Volume 30, Issue 8, Page(s) 1367–1378

Abstract	Objective: The n2c2/UW SDOH Challenge explores the extraction of social determinant of health (SDOH) information from clinical notes. The objectives include the advancement of natural language processing (NLP) information extraction techniques for SDOH and clinical information more broadly. This article presents the shared task, data, participating teams, performance results, and considerations for future work. Materials and methods: The task used the Social History Annotated Corpus (SHAC), which consists of clinical text with detailed event-based annotations for SDOH events, such as alcohol, drug, tobacco, employment, and living situation. Each SDOH event is characterized through attributes related to status, extent, and temporality. The task includes 3 subtasks related to information extraction (Subtask A), generalizability (Subtask B), and learning transfer (Subtask C). In addressing this task, participants utilized a range of techniques, including rules, knowledge bases, n-grams, word embeddings, and pretrained language models (LM). Results: A total of 15 teams participated, and the top teams utilized pretrained deep learning LM. The top team across all subtasks used a sequence-to-sequence approach achieving 0.901 F1 for Subtask A, 0.774 F1 for Subtask B, and 0.889 F1 for Subtask C. Conclusions: Similar to many NLP tasks and domains, pretrained LM yielded the best performance, including generalizability and learning transfer. An error analysis indicates extraction performance varies by SDOH, with lower performance achieved for conditions, like substance use and homelessness, which increase health risks (risk factors) and higher performance achieved for conditions, like substance abstinence and living with family, which reduce health risks (protective factors).
MeSH term(s)	Humans ; Social Determinants of Health ; Natural Language Processing ; Information Storage and Retrieval ; Electronic Health Records
Language	English
Publishing date	2023-02-15
Publishing country	England
Document type	Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't
ZDB-ID	1205156-1
ISSN (online)	1527-974X
ISSN	1067-5027
DOI	10.1093/jamia/ocad012
Database	MEDical Literature Analysis and Retrieval System OnLINE

In stock of ZB MED Cologne/Königswinter

Zs.A 4128
Location: depends on availability (see the holdings information). Up to vol. 1994: order articles via the online order form. Vol. 1995 - 2021: reading room (2nd floor). From vol. 2022: reading room (ground floor).
Zs.MO 312

Thesis ; Online: Extracting information from clinical text with limited annotated data

Lybarger, Kevin James

2020

Abstract	Thesis (Ph.D.)--University of Washington, 2020 Electronic health record (EHR) data informs decision-making in clinical care; however, EHR data are generally underused for other purposes, including secondary use applications. The need to leverage EHR data, including clinical notes, is highlighted by the COVID-19 pandemic, as clinicians, researchers, and policymakers struggle to understand, treat, and contain a new disease. Secondary use cases for EHR data extend to many research areas related to healthcare effectiveness, epidemiology, and public health. Clinical notes contain many types of patient information that are not well characterized through structured data in the EHR, including social determinants of health (SDOH), symptoms, and other factors relevant to clinical informatics research. These patient data are frequently represented in the clinical narrative, rather than structured data, because structured data entry tools can be time-consuming and free-text entry allows richer descriptions. This text-encoded information can benefit secondary use applications, like large retrospective studies and clinical decision-support systems; however, the key information must first be automatically extracted, creating structured representations from unstructured clinical text. Data driven information extraction models require annotated data for training and evaluation, and annotated clinical data is limited by the high cost of annotation and privacy regulations. This work explores the automatic extraction of SDOH and COVID-19 diagnosis, testing, and symptom information from clinical text. The exploration of SDOH and COVID-19 focuses on addressing the challenges associated with the limited availability of annotated clinical text. Here, "limited" is intended to mean a relatively small data set or low resource setting.
The primary contributions of this work include the introduction of neural clinical information extraction models, new annotated clinical corpora, a novel active learning framework, and a secondary use application utilizing automatically extracted data. We present state-of-the-art neural information extraction approaches for SDOH and COVID-19 information, specifically designing the data-driven extraction architectures to achieve high performance with limited training data, by using multi-task learning and unsupervised pre-training. The extraction models generate event-based predictions that provide a detailed characterization of SDOH and COVID-19, achieving performance levels comparable to the inter-annotator agreement for several important factors. These information extraction approaches are relevant to a range of clinical data. As part of the exploration of SDOH and COVID-19, two new annotated corpora are developed: the Social History Annotation Corpus (SHAC) and the COVID-19 Annotated Clinical Text (CACT) Corpus. These corpora include detailed, high-quality annotations that characterize SDOH and COVID-19 across multiple dimensions. SHAC is unique in its annotation detail, size, and heterogeneity, and CACT is one of the first corpora with COVID-19 related annotations. These corpora are a substantial contribution to the available resources for training and evaluating machine learning-based extraction models at the University of Washington and for the larger clinical informatics community. In collecting SHAC, we introduced a novel active learning framework that uses a relatively simple text classification task as a proxy for a more complex event extraction task. The framework increased corpus richness and heterogeneity and improved extraction performance, relative to random selection. The largest performance improvements are associated with prominent risk factors, like drug and tobacco use, homelessness, and living with others. 
To demonstrate the utility of the automatically extracted data, this work presents a secondary use application exploring the prediction of COVID-19 infection. Incorporating automatically extracted symptom data improves COVID-19 infection prediction performance, beyond just using existing structured data.
Keywords	clinical informatics ; data science ; machine learning ; natural language processing ; Electrical engineering ; Computer science ; To Be Assigned ; covid19
Subject code	006
Language	English
Publishing country	us
Document type	Thesis ; Online
Database	BASE - Bielefeld Academic Search Engine (life sciences selection)

Article ; Online: Advancements in extracting social determinants of health information from narrative text.

Lybarger, Kevin / Bear Don't Walk, Oliver J / Yetisgen, Meliha / Uzuner, Özlem

Journal of the American Medical Informatics Association : JAMIA

2023 Volume 30, Issue 8, Page(s) 1363–1366

MeSH term(s)	Social Determinants of Health ; Narration ; Text Messaging
Language	English
Publishing date	2023-07-19
Publishing country	England
Document type	Research Support, N.I.H., Extramural ; Editorial
ZDB-ID	1205156-1
ISSN (online)	1527-974X
ISSN	1067-5027
DOI	10.1093/jamia/ocad121
Database	MEDical Literature Analysis and Retrieval System OnLINE

In stock of ZB MED Cologne/Königswinter

Zs.A 4128
Location: depends on availability (see the holdings information). Up to vol. 1994: order articles via the online order form. Vol. 1995 - 2021: reading room (2nd floor). From vol. 2022: reading room (ground floor).
Zs.MO 312

Book ; Online: The 2022 n2c2/UW Shared Task on Extracting Social Determinants of Health

Lybarger, Kevin / Yetisgen, Meliha / Uzuner, Özlem

2023

Abstract	Objective: The n2c2/UW SDOH Challenge explores the extraction of social determinant of health (SDOH) information from clinical notes. The objectives include the advancement of natural language processing (NLP) information extraction techniques for SDOH and clinical information more broadly. This paper presents the shared task, data, participating teams, performance results, and considerations for future work. Materials and Methods: The task used the Social History Annotated Corpus (SHAC), which consists of clinical text with detailed event-based annotations for SDOH events such as alcohol, drug, tobacco, employment, and living situation. Each SDOH event is characterized through attributes related to status, extent, and temporality. The task includes three subtasks related to information extraction (Subtask A), generalizability (Subtask B), and learning transfer (Subtask C). In addressing this task, participants utilized a range of techniques, including rules, knowledge bases, n-grams, word embeddings, and pretrained language models (LM). Results: A total of 15 teams participated, and the top teams utilized pretrained deep learning LM. The top team across all subtasks used a sequence-to-sequence approach achieving 0.901 F1 for Subtask A, 0.774 F1 for Subtask B, and 0.889 F1 for Subtask C. Conclusions: Similar to many NLP tasks and domains, pretrained LM yielded the best performance, including generalizability and learning transfer. An error analysis indicates extraction performance varies by SDOH, with lower performance achieved for conditions, like substance use and homelessness, that increase health risks (risk factors) and higher performance achieved for conditions, like substance abstinence and living with family, that reduce health risks (protective factors).
Keywords	Computer Science - Computation and Language ; I.2.7
Subject code	400 ; 006
Publishing date	2023-01-13
Publishing country	us
Document type	Book ; Online
Database	BASE - Bielefeld Academic Search Engine (life sciences selection)

Article ; Online: Event-Based Clinical Finding Extraction from Radiology Reports with Pre-trained Language Model.

Lau, Wilson / Lybarger, Kevin / Gunn, Martin L / Yetisgen, Meliha

Journal of digital imaging

2022 Volume 36, Issue 1, Page(s) 91–104

Abstract	Radiology reports contain a diverse and rich set of clinical abnormalities documented by radiologists during their interpretation of the images. Comprehensive semantic representations of radiological findings would enable a wide range of secondary use applications to support diagnosis, triage, outcomes prediction, and clinical research. In this paper, we present a new corpus of radiology reports annotated with clinical findings. Our annotation schema captures detailed representations of pathologic findings that are observable on imaging ("lesions") and other types of clinical problems ("medical problems"). The schema used an event-based representation to capture fine-grained details, including assertion, anatomy, characteristics, size, and count. Our gold standard corpus contained a total of 500 annotated computed tomography (CT) reports. We extracted triggers and argument entities using two state-of-the-art deep learning architectures, including BERT. We then predicted the linkages between trigger and argument entities (referred to as argument roles) using a BERT-based relation extraction model. We achieved the best extraction performance using a BERT model pre-trained on 3 million radiology reports from our institution: 90.9-93.4% F1 for finding triggers and 72.0-85.6% F1 for argument roles. To assess model generalizability, we used an external validation set randomly sampled from the MIMIC Chest X-ray (MIMIC-CXR) database. The extraction performance on this validation set was 95.6% for finding triggers and 79.1-89.7% for argument roles, demonstrating that the model generalized well to the cross-institutional data with a different imaging modality. We extracted the finding events from all the radiology reports in the MIMIC-CXR database and provided the extractions to the research community.
MeSH term(s)	Humans ; Radiology ; Tomography, X-Ray Computed ; Semantics ; Research Report ; Natural Language Processing
Language	English
Publishing date	2022-10-17
Publishing country	United States
Document type	Journal Article ; Research Support, N.I.H., Extramural
ZDB-ID	1033897-4
ISSN (online)	1618-727X
ISSN	0897-1889
DOI	10.1007/s10278-022-00717-5
Database	MEDical Literature Analysis and Retrieval System OnLINE

In stock of ZB MED Cologne/Königswinter

Zs.A 2941
Location: depends on availability (see the holdings information). Up to vol. 1994: order articles via the online order form. Vol. 1995 - 2021: reading room (2nd floor). From vol. 2022: reading room (ground floor).

Book ; Online: Health Text Simplification

Rahman, Md Mushfiqur / Irbaz, Mohammad Sabik / North, Kai / Williams, Michelle S. / Zampieri, Marcos / Lybarger, Kevin

An Annotated Corpus for Digestive Cancer Education and Novel Strategies for Reinforcement Learning

2024

Abstract	Objective: The reading level of health educational materials significantly influences information understandability and accessibility, particularly for minoritized populations. Many patient educational resources surpass the reading level and complexity of widely accepted standards. There is a critical need for high-performing text simplification models in health information to enhance dissemination and literacy. This need is particularly acute in cancer education, where effective prevention and screening education can substantially reduce morbidity and mortality. Methods: We introduce Simplified Digestive Cancer (SimpleDC), a parallel corpus of cancer education materials tailored for health text simplification research. Utilizing SimpleDC alongside the existing Med-EASi corpus, we explore Large Language Model (LLM)-based simplification methods, including fine-tuning, reinforcement learning (RL), reinforcement learning with human feedback (RLHF), domain adaptation, and prompt-based approaches. Our experimentation encompasses Llama 2 and GPT-4. A novel RLHF reward function is introduced, featuring a lightweight model adept at distinguishing between original and simplified texts, thereby enhancing the model's effectiveness with unlabeled data. Results: Fine-tuned Llama 2 models demonstrated high performance across various metrics. Our innovative RLHF reward function surpassed existing RL text simplification reward functions in effectiveness. The results underscore that RL/RLHF can augment fine-tuning, facilitating model training on unlabeled text and improving performance. Additionally, these methods effectively adapt out-of-domain text simplification models to targeted domains.
Keywords	Computer Science - Computation and Language ; Computer Science - Artificial Intelligence ; Computer Science - Machine Learning
Subject code	006
Publishing date	2024-01-26
Publishing country	us
Document type	Book ; Online
Database	BASE - Bielefeld Academic Search Engine (life sciences selection)

Article ; Online: Extracting COVID-19 Diagnoses and Symptoms From Clinical Text: A New Annotated Corpus and Neural Event Extraction Framework.

Lybarger, Kevin / Ostendorf, Mari / Thompson, Matthew / Yetisgen, Meliha

ArXiv

2021

Abstract	Coronavirus disease 2019 (COVID-19) is a global pandemic. Although much has been learned about the novel coronavirus since its emergence, there are many open questions related to tracking its spread, describing symptomology, predicting the severity of infection, and forecasting healthcare utilization. Free-text clinical notes contain critical information for resolving these questions. Data-driven, automatic information extraction models are needed to use this text-encoded information in large-scale studies. This work presents a new clinical corpus, referred to as the COVID-19 Annotated Clinical Text (CACT) Corpus, which comprises 1,472 notes with detailed annotations characterizing COVID-19 diagnoses, testing, and clinical presentation. We introduce a span-based event extraction model that jointly extracts all annotated phenomena, achieving high performance in identifying COVID-19 and symptom events with associated assertion values (0.83-0.97 F1 for events and 0.73-0.79 F1 for assertions). In a secondary use application, we explored the prediction of COVID-19 test results using structured patient data (e.g. vital signs and laboratory results) and automatically extracted symptom information. The automatically extracted symptoms improve prediction performance, beyond structured data alone.
Language	English
Publishing date	2021-03-10
Publishing country	United States
Document type	Preprint
ISSN (online)	2331-8422
Database	MEDical Literature Analysis and Retrieval System OnLINE

Article ; Online: Extracting COVID-19 diagnoses and symptoms from clinical text: A new annotated corpus and neural event extraction framework.

Lybarger, Kevin / Ostendorf, Mari / Thompson, Matthew / Yetisgen, Meliha

Journal of biomedical informatics

2021 Volume 117, Page(s) 103761

Abstract	Coronavirus disease 2019 (COVID-19) is a global pandemic. Although much has been learned about the novel coronavirus since its emergence, there are many open questions related to tracking its spread, describing symptomology, predicting the severity of infection, and forecasting healthcare utilization. Free-text clinical notes contain critical information for resolving these questions. Data-driven, automatic information extraction models are needed to use this text-encoded information in large-scale studies. This work presents a new clinical corpus, referred to as the COVID-19 Annotated Clinical Text (CACT) Corpus, which comprises 1,472 notes with detailed annotations characterizing COVID-19 diagnoses, testing, and clinical presentation. We introduce a span-based event extraction model that jointly extracts all annotated phenomena, achieving high performance in identifying COVID-19 and symptom events with associated assertion values (0.83-0.97 F1 for events and 0.73-0.79 F1 for assertions). Our span-based event extraction model outperforms an extractor built on MetaMapLite for the identification of symptoms with assertion values. In a secondary use application, we predicted COVID-19 test results using structured patient data (e.g. vital signs and laboratory results) and automatically extracted symptom information, to explore the clinical presentation of COVID-19. Automatically extracted symptoms improve COVID-19 prediction performance, beyond structured data alone.
MeSH term(s)	COVID-19/diagnosis ; Electronic Health Records ; Humans ; Information Storage and Retrieval ; Natural Language Processing ; Symptom Assessment
Language	English
Publishing date	2021-03-26
Publishing country	United States
Document type	Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't
ZDB-ID	2057141-0
ISSN (online)	1532-0480
ISSN	1532-0464
DOI	10.1016/j.jbi.2021.103761
Database	MEDical Literature Analysis and Retrieval System OnLINE

Book ; Online: MasonNLP+ at SemEval-2023 Task 8

Ramachandran, Giridhar Kaushik / Gangavarapu, Haritha / Lybarger, Kevin / Uzuner, Ozlem

Extracting Medical Questions, Experiences and Claims from Social Media using Knowledge-Augmented Pre-trained Language Models

2023

Abstract	In online forums like Reddit, users share their experiences with medical conditions and treatments, including making claims, asking questions, and discussing the effects of treatments on their health. Building systems to understand this information can effectively monitor the spread of misinformation and verify user claims. The Task-8 of the 2023 International Workshop on Semantic Evaluation focused on medical applications, specifically extracting patient experience- and medical condition-related entities from user posts on social media. The Reddit Health Online Talk (RedHot) corpus contains posts from medical condition-related subreddits with annotations characterizing the patient experience and medical conditions. In Subtask-1, patient experience is characterized by personal experience, questions, and claims. In Subtask-2, medical conditions are characterized by population, intervention, and outcome. For the automatic extraction of patient experiences and medical condition information, as a part of the challenge, we proposed language-model-based extraction systems that ranked 3rd on both subtasks' leaderboards. In this work, we describe our approach and, in addition, explore the automatic extraction of this information using domain-specific language models and the inclusion of external knowledge.
Keywords	Computer Science - Computation and Language ; Computer Science - Machine Learning
Publishing date	2023-04-26
Publishing country	us
Document type	Book ; Online
Database	BASE - Bielefeld Academic Search Engine (life sciences selection)
