LIVIVO - The Search Portal for Life Sciences

Search results

Results 1–10 of 149

  1. Article ; Online: Task-Specific Normalization for Continual Learning of Blind Image Quality Models.

    Zhang, Weixia / Ma, Kede / Zhai, Guangtao / Yang, Xiaokang

    IEEE transactions on image processing : a publication of the IEEE Signal Processing Society

    2024  Volume 33, Page(s) 1898–1910

    Abstract In this paper, we present a simple yet effective continual learning method for blind image quality assessment (BIQA) with improved quality prediction accuracy, plasticity-stability trade-off, and task-order/-length robustness. The key step in our approach is to freeze all convolution filters of a pre-trained deep neural network (DNN) for an explicit promise of stability, and learn task-specific normalization parameters for plasticity. We assign each new IQA dataset (i.e., task) a prediction head, and load the corresponding normalization parameters to produce a quality score. The final quality estimate is computed by a weighted summation of predictions from all heads with a lightweight K-means gating mechanism. Extensive experiments on six IQA datasets demonstrate the advantages of the proposed method in comparison to previous training techniques for BIQA.
    Language English
    Publishing date 2024-03-12
    Publishing country United States
    Document type Journal Article
    ISSN (online) 1941-0042
    DOI 10.1109/TIP.2024.3371349
    Database MEDical Literature Analysis and Retrieval System OnLINE

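    Code sketch (illustrative) A minimal Python sketch of the weighted summation of per-head predictions with a K-means gating mechanism, as summarized in the abstract above; the feature dimension, number of tasks, and the softmax-style weighting are assumptions for illustration rather than the authors' exact design.

        import numpy as np

        def gated_quality_score(feat, head_scores, centroids_per_task, temperature=1.0):
            """feat: (D,) image feature; head_scores: (T,) per-head quality predictions;
            centroids_per_task: list of (K, D) K-means centroid arrays, one per task."""
            # distance from the feature to the closest centroid of each task
            d = np.array([np.linalg.norm(c - feat, axis=1).min() for c in centroids_per_task])
            # closer tasks receive larger weights (softmax over negative distances)
            w = np.exp(-d / temperature)
            w /= w.sum()
            return float(np.dot(w, head_scores))

        rng = np.random.default_rng(0)
        feat = rng.normal(size=128)                                # query image feature
        centroids = [rng.normal(size=(8, 128)) for _ in range(6)]  # 6 tasks, K = 8
        head_scores = rng.uniform(0, 100, size=6)                  # per-head predictions
        print(gated_quality_score(feat, head_scores, centroids))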

  2. Article ; Online: A novel approach to the detection of facial wrinkles: Database, detection algorithm, and evaluation metrics.

    Liu, Zijia / Qi, Quan / Wang, Sijia / Zhai, Guangtao

    Computers in biology and medicine

    2024  Volume 174, Page(s) 108431

    Abstract Skin wrinkles result from intrinsic aging processes and extrinsic influences, including prolonged exposure to ultraviolet radiation and tobacco smoking. Hence, the identification of wrinkles holds significant importance in skin aging and medical aesthetic investigation. Nevertheless, current methods cannot comprehensively identify facial wrinkles, particularly those that appear insignificant. Furthermore, current assessment techniques neglect the blurred boundaries of wrinkles and cannot differentiate images with varying resolutions. This research introduces a novel wrinkle detection algorithm and a distance-based loss function to identify full-face wrinkles. Furthermore, we develop a wrinkle detection evaluation metric that assesses outcomes based on curve, location, and gradient similarity. We collected and annotated a dataset for wrinkle detection consisting of 1021 images of Chinese faces. The dataset will be made publicly available to further promote wrinkle detection research. The results demonstrate a substantial improvement in the detection of subtle wrinkles when the proposed method is applied. Furthermore, the proposed evaluation procedure effectively accounts for the indistinct boundaries of wrinkles and is applicable to images with various resolutions.
    MeSH term(s) Humans ; Skin Aging/physiology ; Algorithms ; Face/diagnostic imaging ; Databases, Factual ; Female ; Male ; Image Processing, Computer-Assisted/methods ; Adult
    Language English
    Publishing date 2024-04-09
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 127557-4
    ISSN (online) 1879-0534
    ISSN 0010-4825
    DOI 10.1016/j.compbiomed.2024.108431
    Database MEDical Literature Analysis and Retrieval System OnLINE

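    Code sketch (illustrative) One plausible reading of the distance-based loss mentioned above, written in Python with scipy: pixel-wise binary cross-entropy is down-weighted near annotated wrinkles so that blurred wrinkle boundaries are penalised less; the Gaussian weighting and its sigma are assumptions, not the paper's exact formulation.

        import numpy as np
        from scipy.ndimage import distance_transform_edt

        def distance_weighted_bce(pred, gt, sigma=3.0, eps=1e-7):
            """pred: (H, W) probabilities in (0, 1); gt: (H, W) binary wrinkle mask."""
            # distance of every pixel to the nearest ground-truth wrinkle pixel
            dist = distance_transform_edt(1 - gt)
            # background pixels close to true wrinkles are penalised less,
            # far-away false positives keep full weight
            w = 1.0 - np.exp(-(dist ** 2) / (2 * sigma ** 2)) * (1 - gt)
            bce = -(gt * np.log(pred + eps) + (1 - gt) * np.log(1 - pred + eps))
            return float(np.mean(w * bce))

        rng = np.random.default_rng(0)
        gt = (rng.random((64, 64)) > 0.95).astype(float)             # sparse synthetic mask
        pred = np.clip(gt + 0.1 * rng.random((64, 64)), 0.0, 1 - 1e-6)
        print(distance_weighted_bce(pred, gt))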

  3. Article ; Online: Image Quality Assessment for Realistic Zoom Photos.

    Han, Zongxi / Liu, Yutao / Xie, Rong / Zhai, Guangtao

    Sensors (Basel, Switzerland)

    2023  Volume 23, Issue 10

    Abstract New CMOS imaging sensor (CIS) techniques in smartphones have helped user-generated content displace traditional DSLRs in our daily lives. However, tiny sensor sizes and fixed focal lengths also lead to more grainy details, especially for zoom photos. Moreover, multi-frame stacking and post-sharpening algorithms produce zigzag textures and over-sharpened appearances, whose quality traditional image-quality metrics may over-estimate. To solve this problem, a real-world zoom photo database is first constructed in this paper, which includes 900 tele-photos from 20 different mobile sensors and ISPs. Then we propose a novel no-reference zoom quality metric which incorporates the traditional estimation of sharpness and the concept of image naturalness. More specifically, for the measurement of image sharpness, we are the first to combine the total energy of the predicted gradient image with the entropy of the residual term under the framework of free-energy theory. To further compensate for the influence of the over-sharpening effect and other artifacts, a set of model parameters of mean subtracted contrast normalized (MSCN) coefficients are utilized as the natural statistics representatives. Finally, these two measures are combined linearly. Experimental results on the zoom photo database demonstrate that our quality metric can achieve SROCC and PLCC over 0.91, while the performance of a single sharpness or naturalness index is around 0.85. Moreover, compared with the best tested general-purpose and sharpness models, our zoom metric outperforms them by 0.072 and 0.064 in SROCC, respectively.
    Language English
    Publishing date 2023-05-13
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2052857-7
    ISSN (online) 1424-8220
    DOI 10.3390/s23104724
    Database MEDical Literature Analysis and Retrieval System OnLINE

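    Code sketch (illustrative) The mean subtracted contrast normalized (MSCN) coefficients used above as natural-statistics representatives, in a standard Python/scipy form; the Gaussian window width and stabilising constant are common defaults and may differ from the paper's settings.

        import numpy as np
        from scipy.ndimage import gaussian_filter

        def mscn_coefficients(image, sigma=7.0 / 6.0, c=1.0):
            """image: (H, W) grayscale array; returns the MSCN map (I - mu) / (std + c)."""
            image = image.astype(np.float64)
            mu = gaussian_filter(image, sigma)                   # local mean
            var = gaussian_filter(image * image, sigma) - mu * mu
            std = np.sqrt(np.maximum(var, 0.0))                  # local contrast
            return (image - mu) / (std + c)

        patch = np.random.default_rng(0).random((128, 128)) * 255
        print(mscn_coefficients(patch).std())                    # spread of the coefficients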

  4. Article ; Online: Subjective and Objective Audio-Visual Quality Assessment for User Generated Content.

    Cao, Yuqin / Min, Xiongkuo / Sun, Wei / Zhai, Guangtao

    IEEE transactions on image processing : a publication of the IEEE Signal Processing Society

    2023  Volume 32, Page(s) 3847–3861

    Abstract In recent years, User Generated Content (UGC) has grown dramatically in video sharing applications. It is necessary for service providers to use video quality assessment (VQA) to monitor and control users' Quality of Experience when watching UGC videos. However, most existing UGC VQA studies only focus on the visual distortions of videos, ignoring that the perceptual quality also depends on the accompanying audio signals. In this paper, we conduct a comprehensive study on UGC audio-visual quality assessment (AVQA) from both subjective and objective perspectives. Specifically, we construct the first UGC AVQA database, named the SJTU-UAV database, which includes 520 in-the-wild UGC audio and video (A/V) sequences collected from the YFCC100m database. A subjective AVQA experiment is conducted on the database to obtain the mean opinion scores (MOSs) of the A/V sequences. To demonstrate the content diversity of the SJTU-UAV database, we give a detailed analysis of the SJTU-UAV database as well as two other synthetically distorted AVQA databases and one authentically distorted VQA database, from both the audio and video aspects. Then, to facilitate the development of the AVQA field, we construct a benchmark of AVQA models on the proposed SJTU-UAV database and two other AVQA databases; the benchmark models consist of AVQA models designed for synthetically distorted A/V sequences and AVQA models built by combining popular VQA methods and audio features via a support vector regressor (SVR). Finally, considering that the benchmark AVQA models perform poorly in assessing in-the-wild UGC videos, we further propose an effective AVQA model via jointly learning quality-aware audio and visual feature representations in the temporal domain, which is seldom investigated by existing AVQA models. Our proposed model outperforms the aforementioned benchmark AVQA models on the SJTU-UAV database and two synthetically distorted AVQA databases. The SJTU-UAV database and the code of the proposed model will be released to facilitate further research.
    MeSH term(s) Databases, Factual ; Learning ; Video Recording/methods ; Humans
    Language English
    Publishing date 2023-07-14
    Publishing country United States
    Document type Journal Article
    ISSN (online) 1941-0042
    DOI 10.1109/TIP.2023.3290528
    Database MEDical Literature Analysis and Retrieval System OnLINE

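    Code sketch (illustrative) A minimal scikit-learn sketch of the SVR-based fusion baseline described above: per-video visual-quality features and audio features are concatenated and regressed to MOS. The feature dimensions and the random data stand in for real extracted features.

        import numpy as np
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVR

        rng = np.random.default_rng(0)
        video_feats = rng.normal(size=(520, 64))   # placeholder VQA-model features
        audio_feats = rng.normal(size=(520, 32))   # placeholder audio features
        mos = rng.uniform(1, 5, size=520)          # subjective mean opinion scores

        X = np.hstack([video_feats, audio_feats])  # simple A/V feature fusion
        model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
        model.fit(X[:400], mos[:400])              # train split
        predicted = model.predict(X[400:])         # predictions on held-out videos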

  5. Article ; Online: Attention-Guided Neural Networks for Full-Reference and No-Reference Audio-Visual Quality Assessment.

    Cao, Yuqin / Min, Xiongkuo / Sun, Wei / Zhai, Guangtao

    IEEE transactions on image processing : a publication of the IEEE Signal Processing Society

    2023  Volume PP

    Abstract With the popularity of the mobile Internet, audio and video (A/V) have become the main way people entertain themselves and socialize daily. However, in order to reduce the cost of media storage and transmission, A/V signals are compressed by service providers before they are transmitted to end-users, which inevitably causes distortions in the A/V signals and degrades the end-user's Quality of Experience (QoE). This motivates us to study objective audio-visual quality assessment (AVQA). In the field of AVQA, most previous works only focus on single-mode audio or visual signals, ignoring the fact that the perceptual quality of users depends on both audio and video signals. Therefore, we propose an objective AVQA architecture for multi-mode signals based on attentional neural networks. Specifically, we first utilize an attention prediction model to extract the salient regions of video frames. Then, a pre-trained convolutional neural network is used to extract short-time features of the salient regions and the corresponding audio signals. Next, the short-time features are fed into Gated Recurrent Unit (GRU) networks to model the temporal relationship between adjacent frames. Finally, fully connected layers are utilized to fuse the temporally related features of the A/V signals modeled by the GRU network into the final quality score. The proposed architecture is flexible and can be applied to both full-reference and no-reference AVQA. Experimental results on the LIVE-SJTU Database and UnB-AVC Database demonstrate that our model outperforms the state-of-the-art AVQA methods. The code of the proposed method will be publicly available to promote the development of the field of AVQA.
    Language English
    Publishing date 2023-03-16
    Publishing country United States
    Document type Journal Article
    ISSN (online) 1941-0042
    DOI 10.1109/TIP.2023.3251695
    Database MEDical Literature Analysis and Retrieval System OnLINE

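    Code sketch (illustrative) A compact PyTorch sketch of the GRU stage described above, in which short-time A/V features are modelled over time and regressed to a quality score; feature and hidden sizes are placeholders.

        import torch
        import torch.nn as nn

        class TemporalQualityHead(nn.Module):
            def __init__(self, feat_dim=512, hidden=128):
                super().__init__()
                self.gru = nn.GRU(feat_dim, hidden, batch_first=True)
                self.fc = nn.Linear(hidden, 1)

            def forward(self, feats):                    # feats: (batch, time, feat_dim)
                out, _ = self.gru(feats)                 # temporal relation between frames
                return self.fc(out[:, -1]).squeeze(-1)   # quality score from the last step

        scores = TemporalQualityHead()(torch.randn(4, 30, 512))   # 4 clips, 30 time steps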

  6. Article ; Online: A Coding Framework and Benchmark towards Low-Bitrate Video Understanding.

    Tian, Yuan / Lu, Guo / Yan, Yichao / Zhai, Guangtao / Chen, Li / Gao, Zhiyong

    IEEE transactions on pattern analysis and machine intelligence

    2024  Volume PP

    Abstract Video compression is indispensable to most video analysis systems. Despite saving transmission bandwidth, it also deteriorates downstream video understanding tasks, especially at low-bitrate settings. To systematically investigate this problem, we first thoroughly review the previous methods, revealing that three principles, i.e., task-decoupled, label-free, and data-emerged semantic prior, are critical to a machine-friendly coding framework but are not fully satisfied so far. In this paper, we propose a traditional-neural mixed coding framework that simultaneously fulfills all these principles, by taking advantage of both traditional codecs and neural networks (NNs). On one hand, traditional codecs can efficiently encode the pixel signal of videos but may distort the semantic information. On the other hand, highly non-linear NNs are proficient in condensing video semantics into a compact representation. The framework is optimized by ensuring that a transportation-efficient semantic representation of the video is preserved w.r.t. the coding procedure, which is spontaneously learned from unlabeled data in a self-supervised manner. The videos collaboratively decoded from the two streams (codec and NN) are semantically rich as well as visually photo-realistic, empirically boosting performance on several mainstream downstream video analysis tasks without any post-adaptation procedure. Furthermore, by introducing the attention mechanism and an adaptive modeling scheme, the video semantic modeling ability of our approach is further enhanced. Finally, we build a low-bitrate video understanding benchmark with three downstream tasks on eight datasets, demonstrating the notable superiority of our approach. All codes, data, and models will be open-sourced to facilitate future research.
    Language English
    Publishing date 2024-02-20
    Publishing country United States
    Document type Journal Article
    ISSN (online) 1939-3539
    DOI 10.1109/TPAMI.2024.3367879
    Database MEDical Literature Analysis and Retrieval System OnLINE

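    Code sketch (illustrative) A loose PyTorch sketch of the self-supervised principle stated above, namely that a compact semantic representation should be preserved across the coding procedure; here a heavy down-/up-sampling stands in for a traditional codec, and the tiny network and loss are illustrative only.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        semantic_net = nn.Sequential(                  # condenses frames into a compact code
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())

        def codec_stand_in(x, scale=8):                # crude proxy for lossy coding
            small = F.interpolate(x, scale_factor=1 / scale, mode="bilinear", align_corners=False)
            return F.interpolate(small, size=x.shape[-2:], mode="bilinear", align_corners=False)

        frames = torch.rand(2, 3, 128, 128)            # a toy batch of frames
        loss = F.mse_loss(semantic_net(codec_stand_in(frames)), semantic_net(frames))
        loss.backward()                                # gradients flow into the semantic network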

  7. Article ; Online: Neighborhood evaluator for efficient super-resolution reconstruction of 2D medical images.

    Liu, Zijia / Han, Jing / Liu, Jiannan / Li, Zhi-Cheng / Zhai, Guangtao

    Computers in biology and medicine

    2024  Volume 171, Page(s) 108212

    Abstract Background: Deep learning-based super-resolution (SR) algorithms aim to reconstruct low-resolution (LR) images into high-fidelity high-resolution (HR) images by learning the low- and high-frequency information. Experts' diagnostic requirements are fulfilled in medical application scenarios through the high-quality reconstruction of LR digital medical images.
    Purpose: Medical image SR algorithms should satisfy the requirements of arbitrary resolution and high efficiency in applications. However, there is currently no relevant study available. Several SR studies on natural images have achieved reconstruction at unrestricted resolutions, but these methods are difficult to deploy in medical applications because of their large model size, which significantly limits efficiency. Hence, we propose a highly efficient method for reconstructing medical images at any desired resolution.
    Methods: Statistical features of medical images exhibit greater continuity in the region of neighboring pixels than natural images. Hence, the process of reconstructing medical images is comparatively less challenging. Utilizing this property, we develop a neighborhood evaluator to represent the continuity of the neighborhood while controlling the network's depth.
    Results: The proposed method achieves superior performance across seven reconstruction scales, as evidenced by experiments conducted on panoramic radiographs and two external public datasets. Furthermore, the proposed network decreases the parameter count by over 20× and the computational workload by over 10× compared with prior methods. For large-scale reconstruction, the inference speed is improved by over 5×.
    Conclusion: The novel proposed SR strategy for medical images performs efficient reconstruction at arbitrary resolution, marking a significant breakthrough in the field. The given scheme facilitates the implementation of SR in mobile medical platforms.
    MeSH term(s) Algorithms ; Image Processing, Computer-Assisted/methods
    Language English
    Publishing date 2024-02-28
    Publishing country United States
    Document type Journal Article
    ZDB-ID 127557-4
    ISSN (online) 1879-0534
    ISSN 0010-4825
    DOI 10.1016/j.compbiomed.2024.108212
    Database MEDical Literature Analysis and Retrieval System OnLINE

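    Code sketch (illustrative) A toy numpy illustration of the neighbourhood-continuity property exploited above (medical images tend to vary less between neighbouring pixels than natural images); the paper's neighborhood evaluator additionally controls network depth, which is not modelled here.

        import numpy as np

        def neighborhood_continuity(img):
            """img: (H, W) grayscale array; lower values indicate smoother neighbourhoods."""
            img = img.astype(np.float64)
            dv = np.abs(np.diff(img, axis=0)).mean()   # vertical neighbour differences
            dh = np.abs(np.diff(img, axis=1)).mean()   # horizontal neighbour differences
            return 0.5 * (dv + dh)

        rng = np.random.default_rng(0)
        noisy_natural = rng.random((256, 256))                             # rough natural-image proxy
        smooth_medical = np.linspace(0, 1, 256)[None, :].repeat(256, 0)    # smooth-gradient proxy
        print(neighborhood_continuity(noisy_natural) > neighborhood_continuity(smooth_medical))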

  8. Article ; Online: Developing and Validating an Intelligent Mouth-Opening Training Device: A New Solution for Restricted Mouth Opening.

    Wu, Hao / Wang, Zilin / Han, Jing / Wu, Tianchi / Zhai, Guangtao / Zhang, Chenping / Liu, Jiannan

    Sensors (Basel, Switzerland)

    2024  Volume 24, Issue 6

    Abstract Restricted mouth opening (trismus) is one of the most common complications following head and neck cancer treatment. Early initiation of mouth-opening exercises is crucial for preventing or minimizing trismus. Current methods for these exercises predominantly involve finger exercises and traditional mouth-opening training devices. Our research group successfully designed an intelligent mouth-opening training device (IMOTD) that addresses the limitations of traditional home training methods, including the inability to quantify mouth-opening exercises, a lack of guided training resulting in temporomandibular joint injuries, and poor training continuity leading to reduced training effectiveness. To address these concerns, an interactive remote guidance mode is introduced for this device. The device was designed with a focus on the safety and effectiveness of medical devices. The accuracy of the training data was verified through piezoelectric sensor calibration. Through mechanical analysis, the stress points of the structure were identified, and finite element analysis of the connecting rod and the occlusal plate connection structure was conducted to ensure the safety of the device. Preclinical experiments support the effectiveness of the intelligent device in rehabilitation when compared with conventional mouth-opening training methods. This intelligent device facilitates the quantification and visualization of mouth-opening training indicators, ensuring both the comfort and safety of the training process. Additionally, it enables remote supervision and guidance for patient training, thereby enhancing patient compliance and ultimately ensuring the effectiveness of mouth-opening exercises.
    MeSH term(s) Humans ; Trismus/etiology ; Trismus/rehabilitation ; Exercise Therapy/methods ; Exercise ; Head and Neck Neoplasms ; Mouth
    Language English
    Publishing date 2024-03-20
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2052857-7
    ISSN (online) 1424-8220
    DOI 10.3390/s24061988
    Database MEDical Literature Analysis and Retrieval System OnLINE

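    Code sketch (illustrative) A simple least-squares calibration of the kind implied by the piezoelectric sensor calibration mentioned above, mapping raw sensor readings to reference mouth-opening distances; all numbers are synthetic and the device's actual calibration procedure is not reproduced here.

        import numpy as np

        raw_readings = np.array([120.0, 233.0, 348.0, 465.0, 579.0, 694.0])  # sensor output (a.u.)
        reference_mm = np.array([5.0, 10.0, 15.0, 20.0, 25.0, 30.0])         # measured opening (mm)

        slope, intercept = np.polyfit(raw_readings, reference_mm, deg=1)     # linear calibration

        def opening_in_mm(reading):
            return slope * reading + intercept

        print(round(opening_in_mm(400.0), 2))   # estimated mouth opening for a new reading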

  9. Article ; Online: Analysis of Video Quality Datasets via Design of Minimalistic Video Quality Models.

    Sun, Wei / Wen, Wen / Min, Xiongkuo / Lan, Long / Zhai, Guangtao / Ma, Kede

    IEEE transactions on pattern analysis and machine intelligence

    2024  Volume PP

    Abstract Blind video quality assessment (BVQA) plays an indispensable role in monitoring and improving the end-users' viewing experience in various real-world video-enabled media applications. As BVQA is an experimental field, the improvements of BVQA models have been measured primarily on a few human-rated VQA datasets. Thus, it is crucial to gain a better understanding of existing VQA datasets in order to properly evaluate the current progress in BVQA. Towards this goal, we conduct a first-of-its-kind computational analysis of VQA datasets via designing minimalistic BVQA models. By minimalistic, we restrict our family of BVQA models to build only upon basic blocks: a video preprocessor (for aggressive spatiotemporal downsampling), a spatial quality analyzer, an optional temporal quality analyzer, and a quality regressor, all with the simplest possible instantiations. By comparing the quality prediction performance of different model variants on eight VQA datasets with realistic distortions, we find that nearly all datasets suffer, to varying degrees, from the easy dataset problem, and some even admit blind image quality assessment (BIQA) solutions. We additionally justify our claims by comparing the generalization capabilities of our models on these VQA datasets, and by ablating a dizzying set of BVQA design choices related to the basic building blocks. Our results cast doubt on the current progress in BVQA, and meanwhile shed light on good practices for constructing next-generation VQA datasets and models.
    Language English
    Publishing date 2024-04-16
    Publishing country United States
    Document type Journal Article
    ISSN (online) 1939-3539
    DOI 10.1109/TPAMI.2024.3385364
    Database MEDical Literature Analysis and Retrieval System OnLINE

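    Code sketch (illustrative) A PyTorch skeleton of the minimalistic BVQA family outlined above: a video preprocessor (aggressive spatiotemporal downsampling), a spatial quality analyzer, temporal average pooling, and a quality regressor; the ResNet-18 backbone, sampling rate, and resolution are placeholder choices, not the paper's exact instantiations.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F
        from torchvision.models import resnet18

        class MinimalisticBVQA(nn.Module):
            def __init__(self):
                super().__init__()
                self.backbone = resnet18(weights=None)     # spatial quality analyzer
                self.backbone.fc = nn.Identity()
                self.regressor = nn.Linear(512, 1)         # quality regressor

            def forward(self, video):                      # video: (T, 3, H, W)
                video = video[::8]                         # temporal downsampling
                video = F.interpolate(video, size=(224, 224),
                                      mode="bilinear", align_corners=False)
                feats = self.backbone(video)               # (T', 512) per-frame features
                return self.regressor(feats.mean(dim=0))   # temporal pooling + regression

        score = MinimalisticBVQA()(torch.rand(32, 3, 540, 960))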

  10. Article ; Online: Robust Mesh Representation Learning via Efficient Local Structure-Aware Anisotropic Convolution.

    Gao, Zhongpai / Yan, Junchi / Zhai, Guangtao / Zhang, Juyong / Yang, Xiaokang

    IEEE transactions on neural networks and learning systems

    2023  Volume 34, Issue 11, Page(s) 8566–8578

    Abstract A mesh is a type of data structure commonly used for 3-D shapes. Representation learning for 3-D meshes is essential in many computer vision and graphics applications. The recent success of convolutional neural networks (CNNs) for structured data (e.g., images) suggests the value of adapting insights from CNNs to 3-D shapes. However, 3-D shape data are irregular since each node's neighbors are unordered. Various graph neural networks for 3-D shapes have been developed with isotropic filters or predefined local coordinate systems to overcome the node inconsistency on graphs. However, isotropic filters or predefined local coordinate systems limit the representation power. In this article, we propose a local structure-aware anisotropic convolutional operation (LSA-Conv) that learns adaptive weighting matrices for each node of the template according to its neighboring structure and applies shared anisotropic filters. In fact, the learnable weighting matrix is similar to the attention matrix in the Random Synthesizer, a new Transformer model for natural language processing (NLP). Since the learnable weighting matrices require a large number of parameters for high-resolution 3-D shapes, we introduce a matrix factorization technique to notably reduce the parameter size, denoted as LSA-small. Furthermore, a residual connection with a linear transformation is introduced to improve the performance of our LSA-Conv. Comprehensive experiments demonstrate that our model produces significant improvement in 3-D shape reconstruction compared to state-of-the-art methods.
    Language English
    Publishing date 2023-10-27
    Publishing country United States
    Document type Journal Article
    ISSN (online) 2162-2388
    DOI 10.1109/TNNLS.2022.3151609
    Database MEDical Literature Analysis and Retrieval System OnLINE

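    Code sketch (illustrative) A rough PyTorch sketch in the spirit of LSA-Conv as summarized above: each mesh node gets a learnable neighbourhood-weighting matrix, factorised into a low-rank product (the LSA-small idea), followed by a shared filter; the exact parameterisation, initialisation, and residual branch of the paper are not reproduced.

        import torch
        import torch.nn as nn

        class LowRankNodeConv(nn.Module):
            def __init__(self, num_nodes, k, c_in, c_out, rank=4):
                super().__init__()
                self.a = nn.Parameter(torch.randn(num_nodes, k, rank) * 0.1)  # per-node factor
                self.b = nn.Parameter(torch.randn(num_nodes, rank, k) * 0.1)  # per-node factor
                self.filter = nn.Linear(k * c_in, c_out)                      # shared filter

            def forward(self, neighbor_feats):        # (batch, num_nodes, k, c_in)
                w = self.a @ self.b                   # (num_nodes, k, k) weighting matrices
                mixed = torch.einsum("nkj,bnjc->bnkc", w, neighbor_feats)
                return self.filter(mixed.flatten(2))  # (batch, num_nodes, c_out)

        layer = LowRankNodeConv(num_nodes=100, k=8, c_in=16, c_out=32)
        out = layer(torch.randn(2, 100, 8, 16))       # a batch of 2 toy meshes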
