LIVIVO - Search results

Results 1-2 of 2

Article ; Online: Better synonyms for enriching biomedical search.

Yeganova, Lana / Kim, Sun / Chen, Qingyu / Balasanov, Grigory / Wilbur, W John / Lu, Zhiyong

Journal of the American Medical Informatics Association : JAMIA

2020 Volume 27, Issue 12, Page(s) 1894–1902

Abstract	Objective: In a biomedical literature search, the link between a query and a document is often not established, because they use different terms to refer to the same concept. Distributional word embeddings are frequently used for detecting related words by computing the cosine similarity between them. However, previous research has not established either the best embedding methods for detecting synonyms among related word pairs or how effective such methods may be. Materials and methods: In this study, we first create the BioSearchSyn set, a manually annotated set of synonyms, to assess and compare 3 widely used word-embedding methods (word2vec, fastText, and GloVe) in their ability to detect synonyms among related pairs of words. We demonstrate the shortcomings of the cosine similarity score between word embeddings for this task: the same scores have very different meanings for the different methods. To address the problem, we propose utilizing pool adjacent violators (PAV), an isotonic regression algorithm, to transform a cosine similarity into a probability of 2 words being synonyms. Results: Experimental results using the BioSearchSyn set as a gold standard reveal which embedding methods have the best performance in identifying synonym pairs. The BioSearchSyn set also allows converting cosine similarity scores into probabilities, which provides a uniform interpretation of the synonymy score over different methods. Conclusions: We introduced the BioSearchSyn corpus of 1000 term pairs, which allowed us to identify the best embedding method for detecting synonymy for biomedical search. Using the proposed method, we created PubTermVariants2.0: a large, automatically extracted set of synonym pairs that have augmented PubMed searches since the spring of 2019.
MeSH term(s)	Algorithms ; Biomedical Research ; Information Storage and Retrieval/methods ; Linguistics ; Probability ; PubMed ; Terminology as Topic
Language	English
Publishing date	2020-10-15
Publishing country	England
Document type	Journal Article ; Research Support, N.I.H., Intramural
ZDB-ID	1205156-1
ISSN (online)	1527-974X
ISSN (print)	1067-5027
DOI	10.1093/jamia/ocaa151
Database	MEDical Literature Analysis and Retrieval System OnLINE
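
The calibration step described in the abstract above (cosine similarity mapped to a synonymy probability with pool adjacent violators) can be illustrated with a minimal sketch. This is not the authors' code: the function names `cosine_similarity`, `pav_fit`, and `pav_predict` and the toy scores and labels are assumptions made for illustration only.

```python
import bisect
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pav_fit(scores, labels):
    """Pool adjacent violators: fit a non-decreasing step function.

    scores: cosine similarities of annotated word pairs (hypothetical data).
    labels: 1 if the pair was annotated as synonyms, else 0.
    Returns (sorted_scores, fitted_probabilities).
    """
    pairs = sorted(zip(scores, labels))
    blocks = []  # each block is [sum of labels, number of points]
    for _, label in pairs:
        blocks.append([float(label), 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return [score for score, _ in pairs], fitted


def pav_predict(sorted_scores, fitted, score):
    """Probability for a new score: value at the nearest training score below it."""
    i = bisect.bisect_right(sorted_scores, score) - 1
    return fitted[max(i, 0)]


# Toy gold standard, invented for this sketch.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
labels = [0, 1, 0, 0, 1, 1]
xs, probs = pav_fit(scores, labels)
print(pav_predict(xs, probs, 0.35))
```

Because the fitted values are pooled label means, they lie between 0 and 1 and never decrease as the score increases, which is what allows scores from different embedding methods to be read on one probability scale.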

In stock of ZB MED Cologne/Königswinter

Zs.A 4128: Show issues			Location: depends on availability (see the holdings information). Volumes up to 1994: order articles via the online order form. Volumes 1995-2021: reading room (2nd floor). Volumes from 2022: reading room (ground floor).
Zs.MO 312: Show issues

Article ; Online: Discovering themes in biomedical literature using a projection-based algorithm.

Yeganova, Lana / Kim, Sun / Balasanov, Grigory / Wilbur, W John

BMC bioinformatics

2018 Volume 19, Issue 1, Page(s) 269

Abstract	Background: The need to organize any large document collection in a manner that facilitates human comprehension has become crucial with the increasing volume of information available. Two common approaches to provide a broad overview of the information space are document clustering and topic modeling. Clustering aims to group documents or terms into meaningful clusters. Topic modeling, on the other hand, focuses on finding coherent keywords for describing topics appearing in a set of documents. In addition, there have been efforts for clustering documents and finding keywords simultaneously. Results: We present an algorithm to analyze document collections that is based on a notion of a theme, defined as a dual representation based on a set of documents and key terms. In this work, a novel vector space mechanism is proposed for computing themes. Starting with a single document, the theme algorithm treats terms and documents as explicit components, and iteratively uses each representation to refine the other until the theme is detected. The method heavily relies on an optimization routine that we refer to as the projection algorithm which, under specific conditions, is guaranteed to converge to the first singular vector of a data matrix. We apply our algorithm to a collection of about sixty thousand PubMed documents [...]. Conclusions: This study presents a contribution on theoretical and algorithmic levels, as well as demonstrates the feasibility of the method for large scale applications. The evaluation of our system on benchmark datasets demonstrates that our method compares favorably with the current state-of-the-art methods in computing clusters of documents with coherent topic terms.
MeSH term(s)	Algorithms ; Cluster Analysis ; Databases, Genetic ; Humans ; Polymorphism, Single Nucleotide/genetics ; Publications
Language	English
Publishing date	2018-07-16
Publishing country	England
Document type	Journal Article
ZDB-ID	2041484-5
ISSN (online)	1471-2105
DOI	10.1186/s12859-018-2240-0
Database	MEDical Literature Analysis and Retrieval System OnLINE
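
The abstract above mentions convergence to the first singular vector of a data matrix. The authors' projection algorithm is not given in this record, so the sketch below substitutes plain power iteration, a standard technique with the same limit when the largest singular value is strictly greater than the rest. The function names and the toy document-term matrix are assumptions made for illustration.

```python
import math


def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def rmatvec(A, y):
    """Compute A^T y for a matrix stored as a list of rows."""
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]


def normalize(v):
    """Scale a non-zero vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def first_right_singular_vector(A, iters=200):
    """Power iteration on A^T A, starting from a uniform unit vector."""
    v = normalize([1.0] * len(A[0]))
    for _ in range(iters):
        v = normalize(rmatvec(A, matvec(A, v)))
    return v


# Toy matrix, invented for this sketch: rows are documents, columns are terms.
A = [[3.0, 0.0],
     [0.0, 1.0]]
term_weights = first_right_singular_vector(A)  # weights over terms
doc_scores = matvec(A, term_weights)           # scores over documents
print(term_weights, doc_scores)
```

Here the term weights and document scores are two views of the same direction in the data, which loosely mirrors the dual term and document representation the abstract describes for a theme.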
