LIVIVO - The Search Portal for Life Sciences

Search results

Results 1-10 of 1838 in total

  1. Book ; Online: Efficient Multiscale Multimodal Bottleneck Transformer for Audio-Video Classification

    Zhu, Wentao

    2024  

    Abstract In recent years, researchers combine both audio and video signals to deal with challenges where actions are not well represented or captured by visual cues. However, how to effectively leverage the two modalities is still under development. In this work, we develop a multiscale multimodal Transformer (MMT) that leverages hierarchical representation learning. Particularly, MMT is composed of a novel multiscale audio Transformer (MAT) and a multiscale video Transformer [43]. To learn a discriminative cross-modality fusion, we further design multimodal supervised contrastive objectives called audio-video contrastive loss (AVC) and intra-modal contrastive loss (IMC) that robustly align the two modalities. MMT surpasses previous state-of-the-art approaches by 7.3% and 2.1% on Kinetics-Sounds and VGGSound in terms of the top-1 accuracy without external training data. Moreover, the proposed MAT significantly outperforms AST [28] by 22.2%, 4.4% and 4.7% on three public benchmark datasets, and is about 3% more efficient based on the number of FLOPs and 9.8% more efficient based on GPU memory usage.

    Comment: Accepted by WACV 2024; well-formatted PDF is in https://drive.google.com/file/d/10Zo_ydJZFAm7YsxHDgTjhyc4dEJbW_dk/view?usp=sharing
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Artificial Intelligence ; Computer Science - Machine Learning ; Computer Science - Multimedia ; Computer Science - Sound ; Electrical Engineering and Systems Science - Audio and Speech Processing
    Subject code 004 ; 006
    Publishing date 2024-01-08
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  2. Book ; Online: Efficient Selective Audio Masked Multimodal Bottleneck Transformer for Audio-Video Classification

    Zhu, Wentao

    2024  

    Abstract Audio and video are two of the most common modalities on mainstream media platforms, e.g., YouTube. To learn from multimodal videos effectively, in this work, we propose a novel audio-video recognition approach termed audio video Transformer, AVT, leveraging the effective spatio-temporal representation by the video Transformer to improve action recognition accuracy. For multimodal fusion, simply concatenating multimodal tokens in a cross-modal Transformer requires large computational and memory resources; instead, we reduce the cross-modality complexity through an audio-video bottleneck Transformer. To improve the learning efficiency of the multimodal Transformer, we integrate self-supervised objectives, i.e., audio-video contrastive learning, audio-video matching, and masked audio and video learning, into AVT training, which maps diverse audio and video representations into a common multimodal representation space. We further propose a masked audio segment loss to learn semantic audio activities in AVT. Extensive experiments and ablation studies on three public datasets and two in-house datasets consistently demonstrate the effectiveness of the proposed AVT. Specifically, AVT outperforms its previous state-of-the-art counterparts on Kinetics-Sounds by 8%. AVT also surpasses one of the previous state-of-the-art video Transformers [25] by 10% on VGGSound by leveraging the audio signal. Compared to one of the previous state-of-the-art multimodal methods, MBT [32], AVT is 1.3% more efficient in terms of FLOPs and improves the accuracy by 3.8% on Epic-Kitchens-100.

    Comment: Accepted by WACV 2024; well-formatted PDF is in https://drive.google.com/file/d/1qvW52lamsvNGMCqPS7q8g8L4NaR_LlbR/view?usp=sharing. arXiv admin note: text overlap with arXiv:2401.04023
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Artificial Intelligence ; Computer Science - Machine Learning ; Computer Science - Multimedia ; Computer Science - Sound ; Electrical Engineering and Systems Science - Audio and Speech Processing
    Subject code 004
    Publishing date 2024-01-08
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  3. Book ; Online: TPC-ViT: Token Propagation Controller for Efficient Vision Transformer

    Zhu, Wentao

    2024  

    Abstract Vision transformers (ViTs) have achieved promising results on a variety of Computer Vision tasks; however, their quadratic complexity in the number of input tokens has limited their application, especially in resource-constrained settings. Previous approaches that employ gradual token reduction to address this challenge assume that token redundancy in one layer implies redundancy in all the following layers. We empirically demonstrate that this assumption is often not correct, i.e., tokens that are redundant in one layer can be useful in later layers. We employ this key insight to propose a novel token propagation controller (TPC) that incorporates two different token-distributions, i.e., pause probability and restart probability to control the reduction and reuse of tokens respectively, which results in more efficient token utilization. To improve the estimates of token distributions, we propose a smoothing mechanism that acts as a regularizer and helps remove noisy outliers. Furthermore, to improve the training-stability of our proposed TPC, we introduce a model stabilizer that is able to implicitly encode local image structures and minimize accuracy fluctuations during model training. We present extensive experimental results on the ImageNet-1K dataset using DeiT, LV-ViT and Swin models to demonstrate the effectiveness of our proposed method. For example, compared to baseline models, our proposed method improves the inference speed of the DeiT-S by 250% while increasing the classification accuracy by 1.0%.

    Comment: Accepted by the main conference of WACV 2024; well-formatted PDF is in https://drive.google.com/file/d/1Id3oEdYv3OWing1qojQMyjvhZO-gG-Dm/view?usp=sharing; supplementary is in https://drive.google.com/file/d/15LhYlBdCXtompA0_TLAp_ZJb4_sq2N5V/view?usp=sharing
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Artificial Intelligence ; Computer Science - Machine Learning ; Computer Science - Multimedia ; Computer Science - Neural and Evolutionary Computing
    Subject code 006
    Publishing date 2024-01-02
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  4. Book ; Online: Deformable Audio Transformer for Audio Event Detection

    Zhu, Wentao

    2023  

    Abstract Transformers have achieved promising results on a variety of tasks. However, the quadratic complexity in self-attention computation has limited the applications, especially in low-resource settings and mobile or edge devices. Existing works have proposed to exploit hand-crafted attention patterns to reduce computation complexity. However, such hand-crafted patterns are data-agnostic and may not be optimal. Hence, it is likely that relevant keys or values are being reduced, while less important ones are still preserved. Based on this key insight, we propose a novel deformable audio Transformer for audio recognition, named DATAR, where a deformable attention equipping with a pyramid transformer backbone is constructed and learnable. Such an architecture has been proven effective in prediction tasks, e.g., event classification. Moreover, we identify that the deformable attention map computation may over-simplify the input feature, which can be further enhanced. Hence, we introduce a learnable input adaptor to alleviate this issue, and DATAR achieves state-of-the-art performance.

    Comment: ICASSP 2024. arXiv admin note: substantial text overlap with arXiv:2201.00520 by other authors
    Keywords Computer Science - Sound ; Computer Science - Machine Learning ; Computer Science - Multimedia ; Computer Science - Neural and Evolutionary Computing ; Electrical Engineering and Systems Science - Audio and Speech Processing
    Subject code 006
    Publishing date 2023-12-24
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  5. Article ; Online: CrnnCrispr: An Interpretable Deep Learning Method for CRISPR/Cas9 sgRNA On-Target Activity Prediction.

    Zhu, Wentao / Xie, Huanzeng / Chen, Yaowen / Zhang, Guishan

    International journal of molecular sciences

    2024  Volume 25, Issue 8

    Abstract CRISPR/Cas9 is a powerful genome-editing tool in biology, but its wide applications are challenged by a lack of knowledge governing single-guide RNA (sgRNA) activity. Several deep-learning-based methods have been developed for the prediction of on-target activity. However, there is still room for improvement. Here, we proposed a hybrid neural network named CrnnCrispr, which integrates a convolutional neural network and a recurrent neural network for on-target activity prediction. We performed unbiased experiments with four mainstream methods on nine public datasets with varying sample sizes. Additionally, we incorporated a transfer learning strategy to boost the prediction power on small-scale datasets. Our results showed that CrnnCrispr outperformed existing methods in terms of accuracy and generalizability. Finally, we applied a visualization approach to investigate the generalizable nucleotide-position-dependent patterns of sgRNAs for on-target activity, which shows potential in terms of model interpretability and further helps in understanding the principles of sgRNA design.
    MeSH term(s) CRISPR-Cas Systems ; Deep Learning ; RNA, Guide, CRISPR-Cas Systems/genetics ; Gene Editing/methods ; Humans ; Neural Networks, Computer
    Chemical Substances RNA, Guide, CRISPR-Cas Systems
    Language English
    Publishing date 2024-04-17
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2019364-6
    ISSN 1422-0067 ; 1661-6596
    ISSN (online) 1422-0067
    DOI 10.3390/ijms25084429
    Database MEDical Literature Analysis and Retrieval System OnLINE

  6. Article ; Online: Clinical, epidemiological, and genomic characteristics of a seasonal influenza A virus outbreak in Beijing: A descriptive study.

    Zhu, Wentao / Gu, Li

    Journal of medical virology

    2023  Volume 95, Issue 9, Page(s) e29106

    Abstract China experienced a severe influenza season that began at the end of February 2023. The aim of this post hoc analysis was to investigate the clinical, epidemiological, and genomic features of this outbreak in Beijing. The number of cases increased rapidly from the end of February and reached its peak in March, with 7262 confirmed cases included in this study. The median age was 33 years, and 50.3% of them were male. The average daily positive rate reached 69% during the peak period. The instantaneous reproduction number (Rt) showed a median of 2.1, exceeding 2.5 initially and remaining above 1 for the following month. The most common symptoms were fever (75.0%), cough (51.0%), and expectoration (42.9%), with a median body temperature of 38.5°C (interquartile range 38-39). Eight clinical symptoms were more likely to be observed in cases with fever, with odds ratios greater than 1. Viral shedding time ranged from 3 to 25 days, with a median of 7.5 days. The circulating viruses in Beijing mainly included H1N1pdm09 (clades 5a.2a and 5a.2a.1), followed by H3N2 (clade 2a.2a.3a.1). The descriptive study suggests that influenza viruses in this influenza season had a higher transmissibility and longer shedding duration, with fever being the most common symptom.
    MeSH term(s) Male ; Humans ; Adult ; Female ; Seasons ; Beijing/epidemiology ; Influenza A virus/genetics ; Influenza A Virus, H3N2 Subtype/genetics ; Influenza, Human/epidemiology ; Genomics ; Disease Outbreaks ; Fever/epidemiology
    Language English
    Publishing date 2023-09-27
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 752392-0
    ISSN 1096-9071 ; 0146-6615
    ISSN (online) 1096-9071
    ISSN 0146-6615
    DOI 10.1002/jmv.29106
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article ; Online: GCRV-II invades monocytes/macrophages and induces macrophage polarization and apoptosis in tissues to facilitate viral replication and dissemination.

    Xia, Ning / Zhang, Yanqi / Zhu, Wentao / Su, Jianguo

    Journal of virology

    2024  Volume 98, Issue 3, Page(s) e0146923

    Abstract Grass carp reovirus (GCRV), particularly the highly prevalent type II GCRV (GCRV-II), causes huge losses in the aquaculture industry. However, little is known about the mechanisms by which GCRV-II invades grass carp and further disseminates among tissues. In the present study, monocytes/macrophages (Mo/Mφs) were isolated from the peripheral blood of grass carp and infected with GCRV-II. The results of indirect immunofluorescent microscopy, transmission electron microscopy, real-time quantitative RT-PCR (qRT-PCR), western blot (WB), and flow cytometry analysis collectively demonstrated that GCRV-II invaded Mo/Mφs and replicated in them. Additionally, we observed that GCRV-II induced different types (M1 and M2) of polarization of Mo/Mφs in multiple tissues, especially in the brain, head kidney, and intestine. To assess the impact of different types of polarization on GCRV-II replication, we recombinantly expressed and purified the intact cytokines CiIFN-γ2, CiIL-4/13A, and CiIL-4/13B and successfully induced M1 and M2 type polarization of macrophages using these cytokines through
    MeSH term(s) Animals ; Apoptosis ; Carps ; Cytokines ; Fish Diseases/metabolism ; Fish Diseases/pathology ; Fish Diseases/virology ; Macrophages/metabolism ; Macrophages/pathology ; Macrophages/virology ; Monocytes/metabolism ; Orthoreovirus ; Reoviridae Infections/metabolism ; Reoviridae Infections/pathology ; Reoviridae Infections/veterinary ; Virus Replication
    Chemical Substances Cytokines
    Language English
    Publishing date 2024-02-12
    Publishing country United States
    Document type Journal Article
    ZDB-ID 80174-4
    ISSN 1098-5514 ; 0022-538X
    ISSN (online) 1098-5514
    ISSN 0022-538X
    DOI 10.1128/jvi.01469-23
    Database MEDical Literature Analysis and Retrieval System OnLINE

  8. Article: Genome-wide identification and characterization of

    Zhu, Qingdong / Han, Yading / Yang, Wentao / Zhu, Hang / Li, Guangtong / Xu, Ke / Long, Mingxin

    Frontiers in genetics

    2023  Volume 14, Page(s) 1186192

    Abstract The
    Language English
    Publishing date 2023-09-01
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2606823-0
    ISSN 1664-8021
    DOI 10.3389/fgene.2023.1186192
    Database MEDical Literature Analysis and Retrieval System OnLINE

  9. Article ; Online: The mechanisms of mitochondrial abnormalities that contribute to sleep disorders and related neurodegenerative diseases.

    Zhang, Wentao / Liu, Dan / Yuan, Mei / Zhu, Ling-Qiang

    Ageing research reviews

    2024  Volume 97, Page(s) 102307

    Abstract Sleep is a highly intricate biological phenomenon, and its disorders play a pivotal role in numerous diseases. However, the specific regulatory mechanisms remain elusive. In recent years, the role of mitochondria in sleep disorders has gained considerable attention. Sleep deprivation not only impairs mitochondrial morphology but also decreases the number of mitochondria and triggers mitochondrial dysfunction. Furthermore, mitochondrial dysfunction has been implicated in the onset and progression of various sleep disorder-related neurological diseases, especially neurodegenerative conditions. Therefore, a greater understanding of the impact of sleep disorders on mitochondrial dysfunction may reveal new therapeutic targets for neurodegenerative diseases. In this review, we comprehensively summarize the recent key findings on the mechanisms underlying mitochondrial dysfunction caused by sleep disorders and their role in initiating or exacerbating common neurodegenerative diseases. In addition, we provide fresh insights into the diagnosis and treatment of sleep disorder-related diseases.
    Language English
    Publishing date 2024-04-12
    Publishing country England
    Document type Journal Article ; Review
    ZDB-ID 2075672-0
    ISSN 1872-9649 ; 1568-1637
    ISSN (online) 1872-9649
    ISSN 1568-1637
    DOI 10.1016/j.arr.2024.102307
    Database MEDical Literature Analysis and Retrieval System OnLINE

  10. Article ; Online: 24-Epibrassinolide confers zinc stress tolerance in watermelon seedlings through modulating antioxidative capacities and lignin accumulation.

    Liu, Xuefang / Zhu, Quanwen / Liu, Wentao / Zhang, Jun

    PeerJ

    2023  Volume 11, Page(s) e15330

    Abstract Zinc (Zn) is an important element in plants, but over-accumulation of Zn is harmful. It is well-known that brassinolide (BR) plays a key role in the regulation of abiotic stress responses in plants. However, the effects of brassinolide on alleviating Zn phytotoxicity in watermelon (
    MeSH term(s) Antioxidants/pharmacology ; Seedlings ; Zinc/pharmacology ; Lignin/pharmacology ; Citrullus ; Ascorbic Acid/pharmacology ; Glutathione/pharmacology
    Chemical Substances Antioxidants ; brassinolide (Y9IQ1L53OX) ; Zinc (J41CSQ7QDS) ; Lignin (9005-53-2) ; Ascorbic Acid (PQ6CK8PD0R) ; Glutathione (GAN16C9B8O)
    Language English
    Publishing date 2023-05-09
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2703241-3
    ISSN 2167-8359
    ISSN (online) 2167-8359
    DOI 10.7717/peerj.15330
    Database MEDical Literature Analysis and Retrieval System OnLINE
