LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 10 of 15

  1. Article ; Online: Improving Medical Speech-to-Text Accuracy using Vision-Language Pre-training Models.

    Huh, Jaeyoung / Park, Sangjoon / Lee, Jeong Eun / Ye, Jong Chul

    IEEE journal of biomedical and health informatics

    2024  Volume 28, Issue 3, Page(s) 1692–1703

    Abstract Automatic Speech Recognition (ASR) is a technology that converts spoken words into text, facilitating interaction between humans and machines. One of the most common applications of ASR is Speech-To-Text (STT) technology, which simplifies user workflows by transcribing spoken words into text. In the medical field, STT has the potential to significantly reduce the workload of clinicians who rely on typists to transcribe their voice recordings. However, developing an STT model for the medical domain is challenging due to the lack of sufficient speech and text datasets. To address this issue, we propose a medical-domain text correction method that modifies the output text of a general STT system using the Vision Language Pre-training (VLP) method. VLP combines textual and visual information to correct text based on image knowledge. Our extensive experiments demonstrate that the proposed method offers quantitatively and clinically significant improvements in STT performance in the medical field. We further show that multi-modal understanding of image and text information outperforms single-modal understanding using only text information.
    MeSH term(s) Humans ; Speech ; Language ; Voice
    Language English
    Publishing date 2024-03-06
    Publishing country United States
    Document type Journal Article
    ZDB-ID 2695320-1
    ISSN (online) 2168-2208
    ISSN 2168-2194
    DOI 10.1109/JBHI.2023.3345897
    Database MEDical Literature Analysis and Retrieval System OnLINE

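    A minimal sketch of the image-conditioned text-correction idea summarized in the abstract of entry 1: the raw STT transcript is re-decoded while cross-attending to features from a paired medical image, so visual context can help fix domain-specific terms. The module names, sizes, and tokenization are illustrative assumptions, not the authors' VLP model.

    import torch
    import torch.nn as nn

    class ImageGroundedCorrector(nn.Module):
        """Toy text-correction decoder conditioned on image patch features."""
        def __init__(self, vocab=1000, d_model=128, img_feat=512):
            super().__init__()
            self.embed = nn.Embedding(vocab, d_model)
            self.img_proj = nn.Linear(img_feat, d_model)   # project pre-trained patch features
            layer = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
            self.decoder = nn.TransformerDecoder(layer, num_layers=2)
            self.out = nn.Linear(d_model, vocab)

        def forward(self, stt_tokens, img_feats):
            # stt_tokens: (B, T) token ids of the raw STT output
            # img_feats:  (B, N, img_feat) patch features from a pre-trained image encoder
            txt = self.embed(stt_tokens)
            mem = self.img_proj(img_feats)
            return self.out(self.decoder(tgt=txt, memory=mem))   # (B, T, vocab) corrected-token logits

    model = ImageGroundedCorrector()
    logits = model(torch.randint(0, 1000, (2, 16)), torch.randn(2, 49, 512))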

  2. Article ; Online: Switchable and Tunable Deep Beamformer Using Adaptive Instance Normalization for Medical Ultrasound.

    Khan, Shujaat / Huh, Jaeyoung / Ye, Jong Chul

    IEEE transactions on medical imaging

    2022  Volume 41, Issue 2, Page(s) 266–278

    Abstract Recent proposals of deep learning-based beamformers for ultrasound imaging (US) have attracted significant attention as computationally efficient alternatives to adaptive and compressive beamformers. Moreover, deep beamformers are versatile in that image post-processing algorithms can be readily combined. Unfortunately, with the existing technology, a large number of beamformers need to be trained and stored for different probes, organs, depth ranges, operating frequencies, and desired target 'styles', demanding significant resources such as training data. To address this problem, here we propose a switchable and tunable deep beamformer that can switch between various types of outputs such as DAS, MVBF, DMAS, GCF, etc., and also adjust noise removal levels at the inference phase, by using a simple switch or tunable nozzle. This novel mechanism is implemented through Adaptive Instance Normalization (AdaIN) layers, so that distinct outputs can be generated using a single generator by merely changing the AdaIN codes. Experimental results using B-mode focused ultrasound confirm the flexibility and efficacy of the proposed method for various applications.
    MeSH term(s) Algorithms ; Data Compression ; Image Processing, Computer-Assisted/methods ; Phantoms, Imaging ; Ultrasonography/methods
    Language English
    Publishing date 2022-02-02
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 622531-7
    ISSN (online) 1558-254X
    ISSN 0278-0062
    DOI 10.1109/TMI.2021.3110730
    Database MEDical Literature Analysis and Retrieval System OnLINE

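    A minimal sketch of the AdaIN switching mechanism described in the abstract of entry 2: a single generator whose per-channel feature statistics are rescaled by a style code, so different beamforming outputs are selected by swapping codes. Module names and sizes are illustrative assumptions, not the authors' network.

    import torch
    import torch.nn as nn

    class AdaIN(nn.Module):
        """Adaptive Instance Normalization: per-channel scale/shift predicted from a style code."""
        def __init__(self, num_features, style_dim):
            super().__init__()
            self.norm = nn.InstanceNorm2d(num_features, affine=False)
            self.affine = nn.Linear(style_dim, 2 * num_features)

        def forward(self, x, style):
            gamma, beta = self.affine(style).chunk(2, dim=1)
            return gamma[..., None, None] * self.norm(x) + beta[..., None, None]

    class SwitchableBeamformer(nn.Module):
        """One generator; the output style (e.g. DAS-like vs. MVBF-like) is chosen by the AdaIN code."""
        def __init__(self, feat=32, style_dim=8):
            super().__init__()
            self.enc = nn.Conv2d(1, feat, 3, padding=1)
            self.adain = AdaIN(feat, style_dim)
            self.dec = nn.Conv2d(feat, 1, 3, padding=1)

        def forward(self, rf, style):
            return self.dec(torch.relu(self.adain(self.enc(rf), style)))

    net = SwitchableBeamformer()
    rf = torch.randn(1, 1, 64, 64)                        # toy input tensor
    das_code = torch.zeros(1, 8); das_code[0, 0] = 1.0    # code for one target style
    mvbf_code = torch.zeros(1, 8); mvbf_code[0, 1] = 1.0  # code for another target style
    img_das, img_mvbf = net(rf, das_code), net(rf, mvbf_code)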

  3. Book ; Online: Breast Ultrasound Report Generation using LangChain

    Huh, Jaeyoung / Park, Hyun Jeong / Ye, Jong Chul

    2023  

    Abstract Breast ultrasound (BUS) is a critical diagnostic tool in the field of breast imaging, aiding in the early detection and characterization of breast abnormalities. Interpreting breast ultrasound images commonly involves creating comprehensive medical reports containing vital information to promptly assess the patient's condition. However, the ultrasound imaging system necessitates capturing multiple images of various parts to compile a single report, presenting a time-consuming challenge. To address this problem, we propose integrating multiple image analysis tools into the breast reporting process through a LangChain using Large Language Models (LLMs). Through a combination of designated tools and text generation through LangChain, our method can accurately extract relevant features from ultrasound images, interpret them in a clinical context, and produce comprehensive and standardized reports. This approach not only reduces the burden on radiologists and healthcare professionals but also enhances the consistency and quality of reports. Extensive experiments show that each tool involved in the proposed method offers qualitatively and quantitatively significant results. Furthermore, clinical evaluation of the generated reports demonstrates that the proposed method can produce reports in a clinically meaningful way.
    Keywords Electrical Engineering and Systems Science - Image and Video Processing ; Computer Science - Artificial Intelligence ; Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Machine Learning
    Subject code 004 ; 006
    Publishing date 2023-12-04
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

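    A minimal, hypothetical sketch of the tool-plus-LLM report pipeline described in the abstract of entry 3: several image-analysis tools each contribute a finding, and a language model turns the collected findings into a report. The tool functions and the llm_generate call are placeholders, not the authors' LangChain implementation.

    from dataclasses import dataclass

    @dataclass
    class BusFinding:
        shape: str
        margin: str
        echo_pattern: str

    # Placeholder image-analysis "tools"; in practice each would wrap a trained model.
    def shape_tool(image) -> str:
        return "oval"

    def margin_tool(image) -> str:
        return "circumscribed"

    def echo_tool(image) -> str:
        return "hypoechoic"

    def llm_generate(prompt: str) -> str:
        # Stand-in for the text-generation step (e.g. an LLM invoked through LangChain).
        return f"[draft report based on]: {prompt}"

    def generate_report(images) -> str:
        findings = [BusFinding(shape_tool(i), margin_tool(i), echo_tool(i)) for i in images]
        prompt = ("Summarize these breast-ultrasound findings as a standardized report: "
                  + "; ".join(f"{f.shape}, {f.margin}, {f.echo_pattern}" for f in findings))
        return llm_generate(prompt)

    print(generate_report(images=["view1.png", "view2.png"]))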

  4. Article ; Online: Variational Formulation of Unsupervised Deep Learning for Ultrasound Image Artifact Removal.

    Khan, Shujaat / Huh, Jaeyoung / Ye, Jong Chul

    IEEE transactions on ultrasonics, ferroelectrics, and frequency control

    2021  Volume 68, Issue 6, Page(s) 2086–2100

    Abstract Recently, deep learning approaches have been successfully used for ultrasound (US) image artifact removal. However, paired high-quality images for supervised training are difficult to obtain in many practical situations. Inspired by the recent theory of unsupervised learning using optimal transport driven CycleGAN (OT-CycleGAN), here, we investigate the applicability of unsupervised deep learning for US artifact removal problems without matched reference data. Two types of OT-CycleGAN approaches are employed: one with the partial knowledge of the image degradation physics and the other with the lack of such knowledge. Various US artifact removal problems are then addressed using the two types of OT-CycleGAN. Experimental results for various unsupervised US artifact removal tasks confirmed that our unsupervised learning method delivers results comparable to supervised learning in many practical applications.
    MeSH term(s) Artifacts ; Deep Learning ; Image Processing, Computer-Assisted ; Ultrasonography
    Language English
    Publishing date 2021-05-25
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ISSN 1525-8955
    ISSN (online) 1525-8955
    DOI 10.1109/TUFFC.2021.3056197
    Database MEDical Literature Analysis and Retrieval System OnLINE

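    A minimal sketch of the unpaired training signal behind the approach in entry 4: two generators map between the artifact-corrupted and clean domains and are trained with adversarial plus cycle-consistency losses. This shows only the generic CycleGAN-style losses, not the optimal-transport derivation or the physics-informed variant from the paper; the tiny networks are illustrative.

    import torch
    import torch.nn as nn

    def small_cnn():
        return nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    G_a2c, G_c2a = small_cnn(), small_cnn()   # artifact->clean and clean->artifact generators
    D_clean = small_cnn()                     # toy discriminator on the clean domain
    l1, bce = nn.L1Loss(), nn.BCEWithLogitsLoss()

    x_art = torch.randn(4, 1, 64, 64)         # unpaired artifact-corrupted images
    x_cln = torch.randn(4, 1, 64, 64)         # unpaired clean images (would train D_clean, omitted here)

    fake_clean = G_a2c(x_art)                 # try to remove the artifact
    recon_art = G_c2a(fake_clean)             # map back for cycle consistency

    d_out = D_clean(fake_clean)
    adv_loss = bce(d_out, torch.ones_like(d_out))   # fool the clean-domain discriminator
    cycle_loss = l1(recon_art, x_art)               # preserve content without paired labels
    gen_loss = adv_loss + 10.0 * cycle_loss         # lambda = 10 is a common weighting
    gen_loss.backward()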

  5. Book ; Online: Improving Medical Speech-to-Text Accuracy with Vision-Language Pre-training Model

    Huh, Jaeyoung / Park, Sangjoon / Lee, Jeong Eun / Ye, Jong Chul

    2023  

    Abstract Automatic Speech Recognition (ASR) is a technology that converts spoken words into text, facilitating interaction between humans and machines. One of the most common applications of ASR is Speech-To-Text (STT) technology, which simplifies user workflows by transcribing spoken words into text. In the medical field, STT has the potential to significantly reduce the workload of clinicians who rely on typists to transcribe their voice recordings. However, developing an STT model for the medical domain is challenging due to the lack of sufficient speech and text datasets. To address this issue, we propose a medical-domain text correction method that modifies the output text of a general STT system using the Vision Language Pre-training (VLP) method. VLP combines textual and visual information to correct text based on image knowledge. Our extensive experiments demonstrate that the proposed method offers quantitatively and clinically significant improvements in STT performance in the medical field. We further show that multi-modal understanding of image and text information outperforms single-modal understanding using only text information.
    Keywords Electrical Engineering and Systems Science - Audio and Speech Processing ; Computer Science - Artificial Intelligence ; Computer Science - Computation and Language ; Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Sound ; Electrical Engineering and Systems Science - Image and Video Processing
    Subject code 410
    Publishing date 2023-02-27
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  6. Article ; Online: Adaptive and Compressive Beamforming Using Deep Learning for Medical Ultrasound.

    Khan, Shujaat / Huh, Jaeyoung / Ye, Jong Chul

    IEEE transactions on ultrasonics, ferroelectrics, and frequency control

    2020  Volume 67, Issue 8, Page(s) 1558–1572

    Abstract In ultrasound (US) imaging, various types of adaptive beamforming techniques have been investigated to improve the resolution and the contrast-to-noise ratio of the delay and sum (DAS) beamformers. Unfortunately, the performance of these adaptive beamforming approaches degrades when the underlying model is not sufficiently accurate and the number of channels decreases. To address this problem, here, we propose a deep-learning-based beamformer to generate significantly improved images over widely varying measurement conditions and channel subsampling patterns. In particular, our deep neural network is designed to directly process full or subsampled radio frequency (RF) data acquired at various subsampling rates and detector configurations so that it can generate high-quality US images using a single beamformer. The origin of such input-dependent adaptivity is also theoretically analyzed. Experimental results using the B-mode focused US confirm the efficacy of the proposed methods.
    MeSH term(s) Algorithms ; Deep Learning ; Humans ; Image Processing, Computer-Assisted/methods ; Phantoms, Imaging ; Ultrasonography/methods
    Language English
    Publishing date 2020-03-05
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ISSN (online) 1525-8955
    DOI 10.1109/TUFFC.2020.2977202
    Database MEDical Literature Analysis and Retrieval System OnLINE

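    A minimal sketch of the setting described in the abstract of entry 6: a CNN maps time-delayed RF channel data, possibly with some receive channels missing, directly to a beamformed image, so a single network covers several subsampling patterns. The toy architecture and tensor shapes are illustrative assumptions, not the paper's design.

    import torch
    import torch.nn as nn

    class DeepBeamformer(nn.Module):
        """Toy CNN beamformer: RF channel data in, beamformed image out."""
        def __init__(self, n_channels=64, feat=32):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(n_channels, feat, 3, padding=1), nn.ReLU(),
                nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(),
                nn.Conv2d(feat, 1, 3, padding=1),
            )

        def forward(self, rf):
            # rf: (B, n_channels, depth, scanlines) time-delayed RF channel data
            return self.net(rf)

    bf = DeepBeamformer(n_channels=64)
    rf_full = torch.randn(1, 64, 256, 96)

    # Channel subsampling: zero out half of the receive channels to mimic
    # the varying measurement conditions mentioned in the abstract.
    keep = torch.zeros(64)
    keep[torch.randperm(64)[:32]] = 1.0
    rf_sub = rf_full * keep.view(1, 64, 1, 1)

    img_full, img_sub = bf(rf_full), bf(rf_sub)   # the same network handles both inputs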

  7. Book ; Online: OT-driven Multi-Domain Unsupervised Ultrasound Image Artifact Removal using a Single CNN

    Huh, Jaeyoung / Khan, Shujaat / Ye, Jong Chul

    2020  

    Abstract Ultrasound imaging (US) often suffers from distinct image artifacts from various sources. Classic approaches for solving these problems are usually model-based iterative approaches that have been developed specifically for each type of artifact and are often computationally intensive. Recently, deep learning approaches have been proposed as computationally efficient and high-performance alternatives. Unfortunately, in the current deep learning approaches, a dedicated neural network should be trained with matched training data for each specific artifact type. This poses a fundamental limitation on the practical use of deep learning for US, since a large number of models must be stored to deal with various US image artifacts. Inspired by the recent success of multi-domain image transfer, here we propose a novel, unsupervised, deep learning approach in which a single neural network can be used to deal with different types of US artifacts simply by changing a mask vector that switches between different target domains. Our algorithm is rigorously derived using an optimal transport (OT) theory for cascaded probability measures. Experimental results using phantom and in vivo data demonstrate that the proposed method can generate high-quality images by removing distinct artifacts, comparable to those obtained by multiple separately trained neural networks.
    Keywords Electrical Engineering and Systems Science - Image and Video Processing ; Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Machine Learning ; Statistics - Machine Learning
    Subject code 006
    Publishing date 2020-07-10
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

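    A minimal sketch of the "single network switched by a mask vector" idea in the abstract of entry 7: the one-hot target-domain vector is tiled over the spatial dimensions and concatenated to the input channels, so one CNN serves several artifact-removal tasks. The architecture is an illustrative assumption, not the paper's network.

    import torch
    import torch.nn as nn

    class MaskConditionedCNN(nn.Module):
        def __init__(self, num_domains=3, feat=32):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(1 + num_domains, feat, 3, padding=1), nn.ReLU(),
                nn.Conv2d(feat, 1, 3, padding=1),
            )

        def forward(self, x, mask):
            # mask: (B, num_domains) one-hot selector for the target domain,
            # broadcast over the spatial dimensions and stacked onto the input.
            b, _, h, w = x.shape
            m = mask.view(b, -1, 1, 1).expand(-1, -1, h, w)
            return self.body(torch.cat([x, m], dim=1))

    net = MaskConditionedCNN(num_domains=3)
    x = torch.randn(2, 1, 64, 64)
    despeckle = torch.tensor([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])   # select one artifact type
    deblur = torch.tensor([[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]])      # select another
    y1, y2 = net(x, despeckle), net(x, deblur)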

  8. Book ; Online: Switchable Deep Beamformer

    Khan, Shujaat / Huh, Jaeyoung / Ye, Jong Chul

    2020  

    Abstract Recent proposals of deep beamformers using deep neural networks have attracted significant attention as computationally efficient alternatives to adaptive and compressive beamformers. Moreover, deep beamformers are versatile in that image post-processing algorithms can be combined with the beamforming. Unfortunately, in the current technology, a separate beamformer must be trained and stored for each application, demanding significant scanner resources. To address this problem, here we propose a switchable deep beamformer that can produce various types of output such as DAS, speckle removal, deconvolution, etc., using a single network with a simple switch. In particular, the switch is implemented through Adaptive Instance Normalization (AdaIN) layers, so that various outputs can be generated by merely changing the AdaIN code. Experimental results using B-mode focused ultrasound confirm the flexibility and efficacy of the proposed method for various applications.
    Keywords Electrical Engineering and Systems Science - Image and Video Processing ; Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Machine Learning ; Statistics - Machine Learning
    Subject code 006
    Publishing date 2020-08-31
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  9. Book ; Online: Pushing the Limit of Unsupervised Learning for Ultrasound Image Artifact Removal

    Khan, Shujaat / Huh, Jaeyoung / Ye, Jong Chul

    2020  

    Abstract Ultrasound (US) imaging is a fast and non-invasive imaging modality that is widely used for real-time clinical imaging applications without concerns about radiation hazards. Unfortunately, it often suffers from poor visual quality from various origins, such as speckle noise, blurring, multi-line acquisition (MLA), limited RF channels, a small number of view angles in the case of plane-wave imaging, etc. Classical methods to deal with these problems include image-domain signal processing approaches using various adaptive filtering and model-based approaches. Recently, deep learning approaches have been successfully used in the ultrasound imaging field. However, one of the limitations of these approaches is that paired high-quality images for supervised training are difficult to obtain in many practical applications. In this paper, inspired by the recent theory of unsupervised learning using optimal transport driven CycleGAN (OT-CycleGAN), we investigate the applicability of unsupervised deep learning for US artifact removal problems without matched reference data. Experimental results for various tasks such as deconvolution, speckle removal, and limited data artifact removal confirmed that our unsupervised learning method provides results comparable to supervised learning for many practical applications.
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Machine Learning ; Electrical Engineering and Systems Science - Image and Video Processing ; Statistics - Machine Learning
    Subject code 006 ; 004
    Publishing date 2020-06-25
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  10. Article ; Online: Tunable image quality control of 3-D ultrasound using switchable CycleGAN.

    Huh, Jaeyoung / Khan, Shujaat / Choi, Sungjin / Shin, Dongkuk / Lee, Jeong Eun / Lee, Eun Sun / Ye, Jong Chul

    Medical image analysis

    2022  Volume 83, Page(s) 102651

    Abstract In contrast to 2-D ultrasound (US) for uniaxial plane imaging, a 3-D US imaging system can visualize a volume along three axial planes. This allows for a full view of the anatomy, which is useful for gynecological (GYN) and obstetrical (OB) applications. Unfortunately, 3-D US has an inherent limitation in resolution compared to 2-D US. In the case of 3-D US with a 3-D mechanical probe, for example, the image quality is comparable along the beam direction, but significant deterioration in image quality is often observed in the other two axial image planes. To address this, here we propose a novel unsupervised deep learning approach to improve 3-D US image quality. In particular, using unmatched high-quality 2-D US images as a reference, we trained a recently proposed switchable CycleGAN architecture so that every mapping plane in 3-D US can learn the image quality of 2-D US images. Thanks to the switchable architecture, our network can also provide real-time control of the image enhancement level based on user preference, which is ideal for a user-centric scanner setup. Extensive experiments with clinical evaluation confirm that our method offers significantly improved image quality as well as user-friendly flexibility.
    MeSH term(s) Humans ; Quality Control
    Language English
    Publishing date 2022-10-17
    Publishing country Netherlands
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 1356436-5
    ISSN (online) 1361-8423 ; 1361-8431
    ISSN 1361-8415
    DOI 10.1016/j.media.2022.102651
    Database MEDical Literature Analysis and Retrieval System OnLINE

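    A minimal sketch of the "real-time control of the enhancement level" idea in the abstract of entry 10: with an AdaIN-conditioned generator (as in the sketch under entry 2), the enhancement strength can be set at inference time by linearly interpolating between an identity-style code and the full 2-D-quality style code. The code names and dimensions are illustrative assumptions, not the paper's implementation.

    import torch

    def blended_code(code_identity, code_enhanced, level):
        """level in [0, 1]: 0 keeps the input appearance, 1 applies full enhancement."""
        return (1.0 - level) * code_identity + level * code_enhanced

    code_id = torch.zeros(1, 8); code_id[0, 0] = 1.0    # style code for "leave as-is"
    code_2d = torch.zeros(1, 8); code_2d[0, 1] = 1.0    # style code for "2-D-like quality"

    mild = blended_code(code_id, code_2d, 0.3)
    strong = blended_code(code_id, code_2d, 0.9)
    # mild / strong would be passed, together with a 3-D slice, to an AdaIN generator
    # such as the SwitchableBeamformer sketched under entry 2.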
