LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 10 of 55 in total

  1. Article: Word centrality constrained representation for keyphrase extraction.

    Gero, Zelalem / Ho, Joyce C

    Proceedings of the conference. Association for Computational Linguistics. North American Chapter. Meeting

    2022  Volume 2021, Page(s) 155–161

    Abstract To keep pace with the increased generation and digitization of documents, automated methods that can improve search, discovery and mining of the vast body of literature are essential. Keyphrases provide a concise representation by identifying salient concepts in a document. Various supervised approaches model keyphrase extraction using local context to predict the label for each token and perform much better than the unsupervised counterparts. Unfortunately, this method fails for short documents where the context is unclear. Moreover, keyphrases, which are usually the gist of a document, need to be the central theme. We propose a new extraction model that introduces a centrality constraint to enrich the word representation of a bidirectional long short-term memory (BiLSTM) network. Performance evaluation on two publicly available datasets demonstrates that our model outperforms existing state-of-the-art approaches. Our model is publicly available at https://github.com/ZHgero/keyphrases_centrality.git.
    Language English
    Publication date 2022-06-24
    Country of publication United States
    Document type Journal Article
    DOI 10.18653/v1/2021.bionlp-1.17
    Data source MEDical Literature Analysis and Retrieval System OnLINE

  2. Book ; Online: PGB

    Lee, Eric W / Ho, Joyce C

    A PubMed Graph Benchmark for Heterogeneous Network Representation Learning

    2023  

    Abstract There has been rapid growth in biomedical literature, yet capturing the heterogeneity of the bibliographic information of these articles remains relatively understudied. Although graph mining research via heterogeneous graph neural networks has taken center stage, it remains unclear whether these approaches capture the heterogeneity of the PubMed database, a vast digital repository containing over 33 million articles. We introduce PubMed Graph Benchmark (PGB), a new benchmark dataset for evaluating heterogeneous graph embeddings for biomedical literature. The benchmark contains rich metadata including abstract, authors, citations, MeSH terms, MeSH hierarchy, and some other information. The benchmark contains three different evaluation tasks encompassing systematic reviews, node classification, and node clustering. In PGB, we aggregate the metadata associated with the biomedical articles from PubMed into a unified source and make the benchmark publicly available for future work.
    Keywords Computer Science - Machine Learning ; Computer Science - Social and Information Networks
    Subject/category (code) 006
    Publication date 2023-05-04
    Country of publication United States
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (life sciences selection)

  3. Article: Uncertainty-based Self-training for Biomedical Keyphrase Extraction.

    Gero, Zelalem / Ho, Joyce C

    ... IEEE-EMBS International Conference on Biomedical and Health Informatics. IEEE-EMBS International Conference on Biomedical and Health Informatics

    2021  Volume 2021

    Abstract To keep pace with the increased generation and digitization of documents, automated methods that can improve search, discovery and mining of the vast body of literature are essential. Keyphrases provide a concise representation by identifying salient concepts in a document. Various supervised approaches model keyphrase extraction using local context to predict the label for each token and perform much better than the unsupervised counterparts. However, existing supervised datasets have limited annotated examples to train better deep learning models. In contrast, many domains have large amounts of unannotated data that can be leveraged to improve model performance in keyphrase extraction. We introduce a self-learning based model that incorporates uncertainty estimates to select instances from large-scale unlabeled data to augment the small labeled training set. Performance evaluation on a publicly available biomedical dataset demonstrates that our method improves performance of keyphrase extraction over state-of-the-art models.
    Language English
    Publication date 2021-08-10
    Country of publication United States
    Document type Journal Article
    ISSN 2641-3590
    DOI 10.1109/bhi50953.2021.9508592
    Data source MEDical Literature Analysis and Retrieval System OnLINE

  4. Article: CATAN: Chart-aware temporal attention network for adverse outcome prediction.

    Gero, Zelalem / Ho, Joyce C

    IEEE International Conference on Healthcare Informatics. IEEE International Conference on Healthcare Informatics

    2021  Volume 2021, Page(s) 83–92

    Abstract There is an increased adoption of electronic health record systems by a variety of hospitals and medical centers. This provides an opportunity to leverage automated computer systems in assisting healthcare workers. One of the least utilized but rich sources of patient information is the unstructured clinical text. In this work, we develop CATAN, a chart-aware temporal attention network for learning patient representations from clinical notes. We introduce a novel representation where each note is considered a single unit, like a sentence, and composed of attention-weighted words. The notes in turn are aggregated into a patient representation using a second weighting unit, note attention. Unlike standard attention computations which focus only on the content of the note, we incorporate the chart-time for each note as a constraint for attention calculation. This allows our model to focus on notes closer to the prediction time. Using the MIMIC-III dataset, we empirically show that our patient representation and attention calculation achieve the best performance in comparison with various state-of-the-art baselines for one-year mortality prediction and 30-day hospital readmission. Moreover, the attention weights can be used to offer transparency into our model's predictions.
    Language English
    Publication date 2021-10-15
    Country of publication United States
    Document type Journal Article
    ISSN 2575-2626
    DOI 10.1109/ichi52183.2021.00024
    Data source MEDical Literature Analysis and Retrieval System OnLINE

  5. Article ; Online: Evaluating Natural Language Processing Packages for Predicting Hospital-Acquired Pressure Injuries From Clinical Notes.

    Gu, Siyi / Lee, Eric W / Zhang, Wenhui / Simpson, Roy L / Hertzberg, Vicki Stover / Ho, Joyce C

    Computers, informatics, nursing : CIN

    2024  Volume 42, Issue 3, Page(s) 184–192

    Abstract Incidence of hospital-acquired pressure injury, a key indicator of nursing quality, is directly proportional to adverse outcomes, increased hospital stays, and economic burdens on patients, caregivers, and society. Thus, predicting hospital-acquired pressure injury is important. Prediction models use structured data more often than unstructured notes, although the latter often contain useful patient information. We hypothesize that unstructured notes, such as nursing notes, can predict hospital-acquired pressure injury. We evaluate the impact of using various natural language processing packages to identify salient patient information from unstructured text. We use named entity recognition to identify keywords, which comprise the feature space of our classifier for hospital-acquired pressure injury prediction. We compare scispaCy and Stanza, two different named entity recognition models, using unstructured notes in Medical Information Mart for Intensive Care III, a publicly available ICU data set. To assess the impact of vocabulary size reduction, we compare the use of all clinical notes with only nursing notes. Our results suggest that named entity recognition extraction using nursing notes can yield accurate models. Moreover, the extracted keywords play a significant role in the prediction of hospital-acquired pressure injury.
    MeSH term(s) Humans ; Natural Language Processing ; Pressure Ulcer/diagnosis ; Critical Care ; Hospitals
    Language English
    Publication date 2024-03-01
    Country of publication United States
    Document type Journal Article
    ZDB-ID 2078463-6
    ISSN (online) 1538-9774
    ISSN 1538-2931
    DOI 10.1097/CIN.0000000000001053
    Data source MEDical Literature Analysis and Retrieval System OnLINE

  6. Article ; Online: Hypergraph Transformers for EHR-based Clinical Predictions.

    Xu, Ran / Ali, Mohammed K / Ho, Joyce C / Yang, Carl

    AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science

    2023  Volume 2023, Page(s) 582–591

    Abstract Electronic health records (EHR) data contain rich information about patients' health conditions including diagnoses, procedures, medications, etc., which have been widely used to facilitate digital medicine. Despite its importance, it is often non-trivial to learn useful representations for patients' visits that support downstream clinical predictions, as each visit contains massive and diverse medical codes. As a result, the complex interactions among medical codes are often not captured, which leads to substandard predictions. To better model these complex relations, we leverage hypergraphs, which go beyond pairwise relations to jointly learn the representations for visits and medical codes. We also propose to use the self-attention mechanism to automatically identify the most relevant medical codes for each visit based on the downstream clinical predictions with better generalization power. Experiments on two EHR datasets show that our proposed method not only yields superior performance, but also provides reasonable insights towards the target tasks.
    Language English
    Publication date 2023-06-16
    Country of publication United States
    Document type Journal Article
    ZDB-ID 2676378-3
    ISSN (online) 2153-4063
    Data source MEDical Literature Analysis and Retrieval System OnLINE

  7. Article: PubMed Author-assigned Keyword Extraction (PubMedAKE) Benchmark.

    Sheng, Jiasheng / Gero, Zelalem / Ho, Joyce C

    Proceedings of the ... ACM International Conference on Information & Knowledge Management. ACM International Conference on Information and Knowledge Management

    2022  Volume 2022, Page(s) 4470–4474

    Abstract With the ever-increasing abundance of biomedical articles, improving the accuracy of keyword search results becomes crucial for ensuring reproducible research. However, keyword extraction for biomedical articles is hard due to the existence of obscure keywords and the lack of a comprehensive benchmark. PubMedAKE is an author-assigned keyword extraction dataset that contains the title, abstract, and keywords of over 843,269 articles from the PubMed open access subset database. This dataset, publicly available on Zenodo, is the largest keyword extraction benchmark with sufficient samples to train neural networks. Experimental results using state-of-the-art baseline methods illustrate the need for developing automatic keyword extraction methods for biomedical literature.
    Language English
    Publication date 2022-10-17
    Country of publication United States
    Document type Journal Article
    ISSN 2155-0751
    DOI 10.1145/3511808.3557675
    Data source MEDical Literature Analysis and Retrieval System OnLINE

  8. Article: CaliForest: Calibrated Random Forest for Health Data.

    Park, Yubin / Ho, Joyce C

    Proceedings of the ACM Conference on Health, Inference, and Learning

    2020  Volume 2020, Page(s) 40–50

    Abstract Real-world predictive models in healthcare should be evaluated in terms of discrimination, the ability to differentiate between high and low risk events, and calibration, or the accuracy of the risk estimates. Unfortunately, calibration is often neglected and only discrimination is analyzed. Calibration is crucial for personalized medicine as these models play an increasing role in the decision-making process. Since random forest is a popular model for many healthcare applications, we propose CaliForest, a new calibrated random forest. Unlike existing calibration methodologies, CaliForest utilizes the out-of-bag samples to avoid the explicit construction of a calibration set. We evaluated CaliForest on two risk prediction tasks obtained from the publicly available MIMIC-III database. Evaluation on these binary prediction tasks demonstrates that CaliForest can achieve the same discriminative power as random forest while obtaining a better-calibrated model evaluated across six different metrics. CaliForest is published on the standard Python software repository and the code is openly available on GitHub.
    Language English
    Publication date 2020-04-02
    Document type Journal Article
    DOI 10.1145/3368555.3384461
    Data source MEDical Literature Analysis and Retrieval System OnLINE

  9. Article: Controlled Molecule Generator for Optimizing Multiple Chemical Properties.

    Shin, Bonggun / Park, Sungsoo / Bak, JinYeong / Ho, Joyce C

    ACM CHIL 2021 : proceedings of the 2021 ACM Conference on Health, Inference, and Learning : April 8-9, 2021, Virtual Event. ACM Conference on Health, Inference, and Learning (2021 : Online)

    2022  Volume 2021, Page(s) 146–153

    Abstract Generating a novel and optimized molecule with desired chemical properties is an essential part of the drug discovery process. Failure to meet one of the required properties can frequently lead to failure in a clinical test which is costly. In addition, optimizing these multiple properties is a challenging task because the optimization of one property is prone to changing other properties. In this paper, we pose this multi-property optimization problem as a sequence translation process and propose a new optimized molecule generator model based on the Transformer with two constraint networks: property prediction and similarity prediction. We further improve the model by incorporating score predictions from these constraint networks in a modified beam search algorithm. The experiments demonstrate that our proposed model, Controlled Molecule Generator (CMG), outperforms state-of-the-art models by a significant margin for optimizing multiple properties simultaneously.
    Language English
    Publication date 2022-02-18
    Country of publication United States
    Document type Journal Article
    DOI 10.1145/3450439.3451879
    Data source MEDical Literature Analysis and Retrieval System OnLINE

  10. Article: FuzzyGap: Sequential Pattern Mining for Predicting Chronic Heart Failure in Clinical Pathways.

    Lee, Eric W / Ho, Joyce C

    AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science

    2019  Volume 2019, Page(s) 222–231

    Abstract The rapid growth of electronic health records (EHRs) facilitates the use of clinical pathways, an actionable plan for patients which is represented as sequences of diagnostic records ordered by visit dates. We propose to extract discriminative and representative clinical pathways from EHRs using sequential pattern mining. However, existing sequential pattern mining methods cannot efficiently extract patterns because patients vary in sequence length and in the time period between visits. To resolve this problem, we propose FuzzyGap, a sequential pattern mining-based framework that extracts a discriminative subsequent pattern from the proper representation of the sequence of encounters, which also emphasizes the last visit as more significant than the others. We demonstrate FuzzyGap using a case study of heart failure and show the effectiveness of sequential pattern mining.
    Language English
    Publication date 2019-05-06
    Country of publication United States
    Document type Journal Article
    ZDB-ID 2676378-3
    ISSN 2153-4063
    Data source MEDical Literature Analysis and Retrieval System OnLINE
