LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 6 of 6

  1. Book ; Online: Visual Prompt Based Personalized Federated Learning

    Li, Guanghao / Wu, Wansen / Sun, Yan / Shen, Li / Wu, Baoyuan / Tao, Dacheng

    2023  

    Abstract As a popular paradigm of distributed learning, personalized federated learning (PFL) allows personalized models to improve generalization ability and robustness by utilizing knowledge from all distributed clients. Most existing PFL algorithms tackle personalization in a model-centric way, such as personalized layer partition, model regularization, and model interpolation, all of which fail to take the data characteristics of distributed clients into account. In this paper, we propose a novel PFL framework for image classification tasks, dubbed pFedPT, that leverages personalized visual prompts to implicitly represent the local data distribution information of clients and provides that information to the aggregation model to help with classification tasks. Specifically, in each round of pFedPT training, each client generates a local personalized prompt related to its local data distribution. Then, the local model is trained on the input composed of raw data and the visual prompt to learn the distribution information contained in the prompt. During model testing, the aggregated model obtains prior knowledge of the data distributions from the prompts, which can be seen as adaptive fine-tuning of the aggregation model to improve model performance on different clients. Furthermore, the visual prompt can be added to existing FL methods as an orthogonal way to implement personalization on the client and boost their performance. Experiments on the CIFAR10 and CIFAR100 datasets show that pFedPT outperforms several state-of-the-art (SOTA) PFL algorithms by a large margin in various settings.

    Comment: 14 pages
    Keywords Computer Science - Machine Learning ; Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Distributed, Parallel, and Cluster Computing
    Subject code 006
    Publishing date 2023-03-15
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

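    As a rough illustration of the idea in the abstract above, the PyTorch sketch below adds a learnable visual prompt to a client's input images, trained jointly with the shared model. The module name, the border-style parameterization, and all shapes are illustrative assumptions, not the paper's exact design.

        import torch
        import torch.nn as nn

        class VisualPrompt(nn.Module):
            """A learnable perturbation a client adds to every input image."""
            def __init__(self, channels=3, size=32, pad=4):
                super().__init__()
                # Only a border of width `pad` is learnable; the interior stays zero.
                self.delta = nn.Parameter(torch.zeros(1, channels, size, size))
                mask = torch.ones(1, channels, size, size)
                mask[:, :, pad:size - pad, pad:size - pad] = 0.0
                self.register_buffer("mask", mask)

            def forward(self, x):
                # Prompted input = raw image + masked learnable prompt.
                return x + self.delta * self.mask

        # Each client keeps its own prompt and trains it jointly with the shared
        # model; the server aggregates only the model weights, not the prompts.
        prompt = VisualPrompt()
        images = torch.randn(8, 3, 32, 32)  # a CIFAR-sized batch
        prompted = prompt(images)           # fed to the shared classifier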

  2. Book ; Online: Vision-Language Navigation

    Wu, Wansen / Chang, Tao / Li, Xinmeng

    A Survey and Taxonomy

    2021  

    Abstract Vision-Language Navigation (VLN) tasks require an agent to follow human language instructions to navigate in previously unseen environments. This challenging field, involving problems in natural language processing, computer vision, robotics, etc., has spawned many excellent works focusing on various VLN tasks. This paper provides a comprehensive survey and an insightful taxonomy of these tasks based on the different characteristics of their language instructions. Depending on whether the navigation instructions are given once or multiple times, this paper divides the tasks into two categories, i.e., single-turn and multi-turn tasks. For single-turn tasks, we further subdivide them into goal-oriented and route-oriented tasks based on whether the instructions designate a single goal location or specify a sequence of multiple locations. For multi-turn tasks, we subdivide them into passive and interactive tasks based on whether the agent is allowed to question the instruction or not. These tasks require different capabilities of the agent and entail various model designs. We identify the progress made on these tasks and look into the limitations of existing VLN models and task settings. Finally, we discuss several open issues of VLN and point out some opportunities for the future, i.e., incorporating knowledge with VLN models and implementing them in the real physical world.
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Multimedia
    Subject code 004
    Publishing date 2021-08-25
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

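    The survey's two-level taxonomy can be written down compactly as data. In the sketch below, the class and field names are my own shorthand, not identifiers from the paper, and the R2R classification at the end is a common reading rather than a quote.

        from dataclasses import dataclass
        from enum import Enum

        class Turns(Enum):
            SINGLE = "single-turn"  # instructions given once
            MULTI = "multi-turn"    # instructions given over several exchanges

        class SingleTurnKind(Enum):
            GOAL_ORIENTED = "goal-oriented"    # a single goal location is designated
            ROUTE_ORIENTED = "route-oriented"  # a sequence of locations is specified

        class MultiTurnKind(Enum):
            PASSIVE = "passive"          # the agent cannot question the instruction
            INTERACTIVE = "interactive"  # the agent may question the instruction

        @dataclass
        class VLNTask:
            name: str
            turns: Turns
            kind: Enum  # a SingleTurnKind or MultiTurnKind matching `turns`

        # Example: Room-to-Room is commonly treated as single-turn, route-oriented.
        r2r = VLNTask("R2R", Turns.SINGLE, SingleTurnKind.ROUTE_ORIENTED)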

  3. Book ; Online: How to Evaluate Your Dialogue Models

    Li, Xinmeng / Wu, Wansen / Qin, Long / Yin, Quanjun

    A Review of Approaches

    2021  

    Abstract Evaluating the quality of a dialogue system is an understudied problem. The recent evolution of evaluation methods motivated this survey, in which an explicit and comprehensive analysis of the existing methods is sought. We are the first to divide the evaluation methods into three classes, i.e., automatic evaluation, human-involved evaluation, and user-simulator-based evaluation. Then, each class is covered with its main features and the related evaluation metrics. Benchmarks suitable for the evaluation of dialogue techniques are also discussed in detail. Finally, some open issues are pointed out to bring evaluation methods to a new frontier.
    Keywords Computer Science - Computation and Language
    Publishing date 2021-08-03
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

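    To make the first of the three classes concrete, here is a minimal implementation of distinct-n, a common automatic diversity metric for generated dialogue responses. The function is my own sketch, not code from the survey.

        def distinct_n(responses, n=2):
            """Ratio of unique n-grams to total n-grams across responses."""
            ngrams = []
            for response in responses:
                tokens = response.split()
                ngrams += [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
            return len(set(ngrams)) / len(ngrams) if ngrams else 0.0

        # Duplicated responses lower the score; diverse ones raise it.
        print(distinct_n(["i am fine", "i am fine", "how are you"]))  # 4/6 ~= 0.67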

  4. Book ; Online: Prompt-based Context- and Domain-aware Pretraining for Vision and Language Navigation

    Liu, Ting / Hu, Yue / Wu, Wansen / Wang, Youkai / Xu, Kai / Yin, Quanjun

    2023  

    Abstract Pretrained visual-language models have extensive world knowledge and are widely used in visual and language navigation (VLN). However, they are not sensitive to indoor scenarios for VLN tasks. Another challenge for VLN is how the agent understands the contextual relations between actions on a path and performs cross-modal alignment sequentially. In this paper, we propose a novel Prompt-bAsed coNtext- and inDoor-Aware (PANDA) pretraining framework to address these problems. It performs prompting in two stages. In the indoor-aware stage, we apply an efficient tuning paradigm to learn deep visual prompts from an indoor dataset, in order to augment pretrained models with inductive biases towards indoor environments. This can enable more sample-efficient adaptation for VLN agents. Furthermore, in the context-aware stage, we design a set of hard context prompts to capture the sequence-level semantics in the instruction. They enable further tuning of the pretrained models via contrastive learning. Experimental results on both R2R and REVERIE show the superiority of PANDA compared to existing state-of-the-art methods.

    Comment: 12 pages
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 004
    Publishing date 2023-09-07
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

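    The "deep visual prompts" of the indoor-aware stage can be pictured as learnable tokens injected at every encoder layer. The sketch below is a generic deep-prompting setup under assumed dimensions, not PANDA's actual architecture or training recipe (which also involves hard context prompts and contrastive learning).

        import torch
        import torch.nn as nn

        class DeepPromptEncoder(nn.Module):
            """Transformer encoder with fresh learnable prompt tokens per layer."""
            def __init__(self, layers=4, tokens=5, dim=256):
                super().__init__()
                self.blocks = nn.ModuleList(
                    nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
                    for _ in range(layers)
                )
                # One set of prompt tokens per layer -- the "deep" part.
                self.prompts = nn.ParameterList(
                    nn.Parameter(torch.randn(1, tokens, dim) * 0.02)
                    for _ in range(layers)
                )
                self.tokens = tokens

            def forward(self, patches):
                x = patches
                for block, prompt in zip(self.blocks, self.prompts):
                    p = prompt.expand(x.size(0), -1, -1)
                    # Prepend this layer's prompts, run the block, then drop
                    # them so the next layer receives its own fresh prompts.
                    x = block(torch.cat([p, x], dim=1))[:, self.tokens:]
                return x

        encoder = DeepPromptEncoder()
        features = encoder(torch.randn(2, 49, 256))  # 2 images, 49 patch embeddings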

  5. Book ; Online: DAP

    Liu, Ting / Hu, Yue / Wu, Wansen / Wang, Youkai / Xu, Kai / Yin, Quanjun

    Domain-aware Prompt Learning for Vision-and-Language Navigation

    2023  

    Abstract Following language instructions to navigate in unseen environments is a challenging task for autonomous embodied agents. With strong representation capabilities, pretrained vision-and-language models are widely used in VLN. However, most of them are trained on web-crawled general-purpose datasets, which incurs a considerable domain gap when they are used for VLN tasks. To address the problem, we propose a novel and model-agnostic domain-aware prompt learning (DAP) framework. To equip the pretrained models with specific object-level and scene-level cross-modal alignment for VLN tasks, DAP applies a low-cost prompt tuning paradigm to learn soft visual prompts for extracting in-domain image semantics. Specifically, we first generate a set of in-domain image-text pairs with the help of the CLIP model. Then we introduce soft visual prompts in the input space of the visual encoder in a pretrained model. DAP injects in-domain visual knowledge into the visual encoder of the pretrained model in an efficient way. Experimental results on both R2R and REVERIE show the superiority of DAP compared to existing state-of-the-art methods.

    Comment: 4 pages. arXiv admin note: substantial text overlap with arXiv:2309.03661
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 004
    Publishing date 2023-11-29
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

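    In contrast to layer-wise deep prompts, DAP's soft visual prompts live only in the input space of the visual encoder. Below is a minimal sketch of that shallow setup, with assumed shapes and without the CLIP-generated image-text pairs or the encoder itself.

        import torch
        import torch.nn as nn

        class SoftInputPrompt(nn.Module):
            """Learnable tokens prepended once, in the encoder's input space."""
            def __init__(self, tokens=8, dim=256):
                super().__init__()
                self.prompt = nn.Parameter(torch.randn(1, tokens, dim) * 0.02)

            def forward(self, patch_embeddings):
                # Prepend the shared prompt tokens to each image's patch
                # sequence; the pretrained visual encoder then runs on the
                # extended sequence while only the prompt is tuned.
                p = self.prompt.expand(patch_embeddings.size(0), -1, -1)
                return torch.cat([p, patch_embeddings], dim=1)

        prompted = SoftInputPrompt()(torch.randn(4, 49, 256))  # -> (4, 57, 256)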

  6. Book ; Online: Autonomous Crowdsensing

    Wu, Wansen / Yang, Weiyi / Li, Juanjuan / Zhao, Yong / Zhu, Zhengqiu / Chen, Bin / Qiu, Sihang / Peng, Yong / Wang, Fei-Yue

    Operating and Organizing Crowdsensing for Sensing Automation

    2024  

    Abstract The precise characterization and modeling of Cyber-Physical-Social Systems (CPSS) requires more comprehensive and accurate data, which imposes heightened demands on intelligent sensing capabilities. To address this issue, Crowdsensing Intelligence (CSI) has been proposed to collect data from CPSS by harnessing the collective intelligence of a diverse workforce. Our first and second Distributed/Decentralized Hybrid Workshops on Crowdsensing Intelligence (DHW-CSI) focused on the principles and high-level processes of organizing and operating CSI, as well as the participants, methods, and stages involved in CSI. This letter reports the outcomes of the latest DHW-CSI, focusing on Autonomous Crowdsensing (ACS) enabled by a range of technologies such as decentralized autonomous organizations and operations, large language models, and human-oriented operating systems. Specifically, we explain what ACS is and explore its distinctive features in comparison to traditional crowdsensing. Moreover, we present the "6A-goal" of ACS and propose potential avenues for future research.
    Keywords Computer Science - Computers and Society
    Publishing date 2024-01-06
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

