LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 10 of 111

  1. Book ; Online: Finding Support Examples for In-Context Learning

    Li, Xiaonan / Qiu, Xipeng

    2023  

    Abstract Additionally, the strong dependency among in-context examples makes it an NP-hard combinatorial optimization problem and enumerating all permutations is infeasible. Hence we propose LENS, a fiLter-thEN-Search method to tackle this challenge in two stages: First we filter the dataset to obtain informative in-context examples individually. Specifically, we propose a novel metric, InfoScore, to evaluate the example's in-context informativeness based on the language model's feedback, and further propose a progressive filtering process to filter out uninformative examples. Then we propose diversity-guided example search which iteratively refines and evaluates the selected example permutations, to find examples that fully depict the task. The experimental results show that LENS significantly outperforms a wide range of baselines.
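
    A minimal sketch of the two-stage filter-then-search procedure the abstract describes; the llm_log_prob and eval_permutation callables are illustrative assumptions, not the authors' code:

        import random

        def info_score(example, probe_set, llm_log_prob):
            # InfoScore-style metric: how much prepending `example` helps the
            # language model score the gold answers of a small probe set.
            return sum(llm_log_prob(example, query, gold) for query, gold in probe_set)

        def progressive_filter(pool, probe_set, llm_log_prob, keep_ratio=0.5, rounds=3):
            # Stage 1: repeatedly score candidates and keep the most informative
            # fraction, so expensive scoring focuses on promising examples.
            for _ in range(rounds):
                pool = sorted(pool, key=lambda ex: info_score(ex, probe_set, llm_log_prob),
                              reverse=True)
                pool = pool[: max(1, int(len(pool) * keep_ratio))]
            return pool

        def diversity_guided_search(candidates, k, eval_permutation, iters=100):
            # Stage 2: iteratively refine a k-example permutation, accepting
            # random replacements that improve the validation score.
            best = random.sample(candidates, k)
            best_score = eval_permutation(best)
            for _ in range(iters):
                trial = list(best)
                trial[random.randrange(k)] = random.choice(candidates)
                score = eval_permutation(trial)
                if score > best_score:
                    best, best_score = trial, score
            return best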

    Comment: Accepted to the Findings of EMNLP 2023
    Keywords Computer Science - Computation and Language
    Publishing date 2023-02-27
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  2. Book ; Online: MoT

    Li, Xiaonan / Qiu, Xipeng

    Memory-of-Thought Enables ChatGPT to Self-Improve

    2023  

    Abstract Large Language Models (LLMs) have shown impressive abilities in various tasks. However, fundamentally improving them depends on high-quality datasets or computationally expensive fine-tuning. In contrast, humans can easily improve themselves by self-thinking and memory, without external resources. In this paper, we propose a framework, MoT, to let the LLM self-improve through Memory-of-Thought, without annotated datasets and parameter updates. Specifically, MoT is divided into two stages: 1. Before the test stage, the LLM pre-thinks on the unlabeled dataset and saves the high-confidence thoughts as external memory; 2. During the test stage, given a test question, the LLM recalls relevant memory to help itself reason and answer it. Experimental results show that MoT can help ChatGPT significantly improve its abilities in arithmetic reasoning, commonsense reasoning, factual reasoning, and natural language inference. Further analyses show that each component contributes critically to the improvements and that MoT can lead to consistent improvements across various CoT methods and LLMs.
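
    A minimal sketch of the two MoT stages as described in the abstract; the llm callable and the self-consistency confidence gate are illustrative assumptions rather than the paper's implementation:

        from difflib import SequenceMatcher

        def pre_think(llm, unlabeled_questions, n_samples=5, threshold=0.8):
            # Stage 1: reason over unlabeled questions and keep only
            # high-confidence (self-consistent) thoughts as external memory.
            memory = []
            for q in unlabeled_questions:
                thoughts = [llm(f"Q: {q}\nLet's think step by step:") for _ in range(n_samples)]
                answers = [(t.splitlines() or [""])[-1] for t in thoughts]
                majority = max(set(answers), key=answers.count)
                if answers.count(majority) / n_samples >= threshold:
                    memory.append((q, thoughts[answers.index(majority)]))
            return memory

        def answer_with_memory(llm, memory, question, k=2):
            # Stage 2: recall the k most similar memorized thoughts and prepend
            # them as demonstrations before answering the test question.
            similarity = lambda a, b: SequenceMatcher(None, a, b).ratio()
            recalled = sorted(memory, key=lambda m: similarity(m[0], question), reverse=True)[:k]
            demos = "\n\n".join(f"Q: {q}\n{t}" for q, t in recalled)
            return llm(f"{demos}\n\nQ: {question}\nLet's think step by step:")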

    Comment: Accepted to appear at EMNLP 2023
    Keywords Computer Science - Computation and Language ; Computer Science - Artificial Intelligence
    Publishing date 2023-05-09
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  3. Book ; Online: Agent Alignment in Evolving Social Norms

    Li, Shimin / Sun, Tianxiang / Qiu, Xipeng

    2024  

    Abstract Agents based on Large Language Models (LLMs) are increasingly permeating various domains of human production and life, highlighting the importance of aligning them with human values. The current alignment of AI systems primarily focuses on passively aligning LLMs through human intervention. However, agents possess characteristics like receiving environmental feedback and self-evolution, rendering such LLM alignment methods inadequate. In response, we propose an evolutionary framework for agent evolution and alignment, named EvolutionaryAgent, which transforms agent alignment into a process of evolution and selection under the principle of survival of the fittest. In an environment where social norms continuously evolve, agents better adapted to the current social norms will have a higher probability of survival and proliferation, while those inadequately aligned dwindle over time. Experimental results assessing the agents from multiple perspectives in aligning with social norms demonstrate that EvolutionaryAgent can align progressively better with the evolving social norms while maintaining its proficiency in general tasks. Effectiveness tests conducted on various open- and closed-source LLMs as the foundation for agents also prove the applicability of our approach.
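
    A minimal sketch of the evolution-and-selection loop the abstract describes; the norm_fitness and mutate functions are illustrative assumptions:

        import random

        def evolve(agents, norm_fitness, mutate, generations=10, survive_ratio=0.5):
            # Each generation: score agents against the *current* (evolving)
            # social norms, keep the best-adapted fraction, and refill the
            # population with mutated copies of the survivors.
            for generation in range(generations):
                ranked = sorted(agents, key=lambda a: norm_fitness(a, generation), reverse=True)
                survivors = ranked[: max(1, int(len(ranked) * survive_ratio))]
                offspring = [mutate(random.choice(survivors))
                             for _ in range(len(agents) - len(survivors))]
                agents = survivors + offspring
            return agents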
    Keywords Computer Science - Computation and Language ; Computer Science - Artificial Intelligence
    Subject code 004
    Publishing date 2024-01-09
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  4. Book ; Online: PromptNER

    Zhang, Mozhi / Yan, Hang / Zhou, Yaqian / Qiu, Xipeng

    A Prompting Method for Few-shot Named Entity Recognition via k Nearest Neighbor Search

    2023  

    Abstract Few-shot Named Entity Recognition (NER) is a task aiming to identify named entities via limited annotated samples. Recently, prototypical networks have shown promising performance in few-shot NER. Most prototypical networks utilize the entities from the support set to construct label prototypes and use the query set to compute span-level similarities and optimize these label prototype representations. However, these methods are usually unsuitable for fine-tuning in the target domain, where only the support set is available. In this paper, we propose PromptNER: a novel prompting method for few-shot NER via k nearest neighbor search. We use prompts that contain entity category information to construct label prototypes, which enables our model to fine-tune with only the support set. Our approach achieves excellent transfer learning ability, and extensive experiments on the Few-NERD and CrossNER datasets demonstrate that our model achieves superior performance over state-of-the-art methods.
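
    A minimal sketch of prompt-built label prototypes combined with k-nearest-neighbor search, as described; the encode function and the prompt template are illustrative assumptions:

        import numpy as np

        def cosine(a, b):
            return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

        def build_prototypes(encode, entity_types):
            # One prototype per entity type, built from a category-describing
            # prompt rather than from support-set entities alone.
            return {t: encode(f"{t} refers to a named entity of type {t}.") for t in entity_types}

        def classify_span(encode, prototypes, support, span_text, k=3):
            # Combine prototype similarity with votes from the k nearest
            # labeled support spans; `support` is a list of (label, vector).
            v = encode(span_text)
            scores = {label: cosine(v, proto) for label, proto in prototypes.items()}
            neighbors = sorted(support, key=lambda item: cosine(v, item[1]), reverse=True)[:k]
            for label, _ in neighbors:
                scores[label] = scores.get(label, 0.0) + 1.0 / k
            return max(scores, key=scores.get)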

    Comment: work in progress
    Keywords Computer Science - Computation and Language
    Subject code 410
    Publishing date 2023-05-20
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  5. Book ; Online: Character-LLM

    Shao, Yunfan / Li, Linyang / Dai, Junqi / Qiu, Xipeng

    A Trainable Agent for Role-Playing

    2023  

    Abstract Large language models (LLMs) can be used to serve as agents to simulate human behaviors, given their powerful ability to understand human instructions and provide high-quality generated texts. Such ability stimulates us to wonder whether LLMs can simulate a person in a higher form than simple human behaviors. Therefore, we aim to train an agent with the profile, experience, and emotional states of a specific person instead of using limited prompts to instruct the ChatGPT API. In this work, we introduce Character-LLM, which teaches LLMs to act as specific people such as Beethoven, Queen Cleopatra, Julius Caesar, etc. Our method focuses on editing profiles as experiences of a certain character and training models to be personal simulacra with these experiences. To assess the effectiveness of our approach, we build a test playground that interviews trained agents and evaluates whether the agents memorize their characters and experiences. Experimental results show interesting observations that help build future simulacra of humankind.
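
    A minimal sketch of turning reconstructed character experiences into role-playing training samples, in the spirit of the abstract; the scene and turn fields are illustrative assumptions:

        def experiences_to_samples(character, scenes):
            # Each reconstructed experience scene becomes training samples in
            # which the model must continue the dialogue *as the character*.
            samples = []
            for scene in scenes:
                context = f"I am {character}. Scene: {scene['setting']}\n"
                for turn in scene["turns"]:
                    if turn["speaker"] == character:
                        samples.append({"prompt": context, "completion": turn["text"]})
                    context += f"{turn['speaker']}: {turn['text']}\n"
            return samples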

    Comment: To appear at EMNLP 2023; Repo at https://github.com/choosewhatulike/trainable-agents
    Keywords Computer Science - Computation and Language ; Computer Science - Artificial Intelligence
    Subject code 006
    Publishing date 2023-10-16
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  6. Book ; Online: SpeechAgents

    Zhang, Dong / Li, Zhaowei / Wang, Pengyu / Zhang, Xin / Zhou, Yaqian / Qiu, Xipeng

    Human-Communication Simulation with Multi-Modal Multi-Agent Systems

    2024  

    Abstract Human communication is a complex and diverse process that not only involves multiple factors such as language, commonsense, and cultural backgrounds but also requires the participation of multimodal information, such as speech. Large Language Model (LLM)-based multi-agent systems have demonstrated promising performance in simulating human society. Can we leverage LLM-based multi-agent systems to simulate human communication? However, current LLM-based multi-agent systems mainly rely on text as the primary medium. In this paper, we propose SpeechAgents, a multi-modal LLM-based multi-agent system designed for simulating human communication. SpeechAgents utilizes a multi-modal LLM as the control center for each individual agent and employs multi-modal signals as the medium for messages exchanged among agents. Additionally, we propose Multi-Agent Tuning to enhance the multi-agent capabilities of the LLM without compromising its general abilities. To strengthen and evaluate the effectiveness of human communication simulation, we build the Human-Communication Simulation Benchmark. Experimental results demonstrate that SpeechAgents can simulate human communication dialogues with consistent content, authentic rhythm, and rich emotions, and demonstrate excellent scalability even with up to 25 agents, which can apply to tasks such as drama creation and audio novel generation. Code and models will be open-sourced at https://github.com/0nutation/SpeechAgents
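
    A minimal sketch of a round-robin multi-agent exchange in which each message is a multi-modal signal rather than plain text, as described; the Signal type and agent interface are illustrative assumptions:

        from dataclasses import dataclass
        from typing import Callable, List

        @dataclass
        class Signal:
            # A multi-modal message: a transcript channel plus a speech channel.
            speaker: str
            text: str
            audio: bytes = b""

        def simulate(agents: List[Callable[[List[Signal]], Signal]], rounds: int) -> List[Signal]:
            # Round-robin exchange: each agent reads the shared history of
            # multi-modal signals and emits its own next signal.
            history: List[Signal] = []
            for _ in range(rounds):
                for agent in agents:
                    history.append(agent(history))
            return history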

    Comment: work in progress
    Keywords Computer Science - Computation and Language
    Subject code 006
    Publishing date 2024-01-08
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  7. Book ; Online: Query of CC

    Fei, Zhaoye / Shao, Yunfan / Li, Linyang / Zeng, Zhiyuan / Yan, Hang / Qiu, Xipeng / Lin, Dahua

    Unearthing Large Scale Domain-Specific Knowledge from Public Corpora

    2024  

    Abstract Large language models have demonstrated remarkable potential in various tasks; however, there remains a significant scarcity of open-source models and data for specific domains. Previous works have primarily focused on manually specifying resources and collecting high-quality data on specific domains, which significantly consumes time and effort. To address this limitation, we propose an efficient data collection method, Query of CC, based on large language models. This method bootstraps seed information through a large language model and retrieves related data from public corpora. It not only collects knowledge-related data for specific domains but also unearths data with potential reasoning procedures. Through the application of this method, we have curated a high-quality dataset called Knowledge Pile, encompassing four major domains, including STEM and humanities sciences, among others. Experimental results demonstrate that Knowledge Pile significantly improves the performance of large language models in mathematical and knowledge-related reasoning ability tests. To facilitate academic sharing, we open-source our dataset and code, providing valuable support to the academic community.
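
    A minimal sketch of the bootstrap-then-retrieve pipeline the abstract describes; the llm and search_corpus callables are illustrative assumptions:

        def query_of_cc(llm, search_corpus, seed_topics, expansions=5, per_query=100):
            # Step 1: bootstrap each seed topic into concrete search queries.
            queries = []
            for topic in seed_topics:
                prompt = f"List {expansions} specific search queries about: {topic}"
                queries += [line.strip("- ").strip() for line in llm(prompt).splitlines() if line.strip()]
            # Step 2: retrieve and deduplicate related documents from the public corpus.
            collected, seen = [], set()
            for query in queries:
                for doc in search_corpus(query, top_k=per_query):
                    if doc["id"] not in seen:
                        seen.add(doc["id"])
                        collected.append(doc)
            return collected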
    Keywords Computer Science - Computation and Language
    Subject code 004
    Publishing date 2024-01-25
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  8. Book ; Online: Optimizing Non-Autoregressive Transformers with Contrastive Learning

    An, Chenxin / Feng, Jiangtao / Huang, Fei / Qiu, Xipeng / Kong, Lingpeng

    2023  

    Abstract Non-autoregressive Transformers (NATs) reduce the inference latency of Autoregressive Transformers (ATs) by predicting words all at once rather than in sequential order. They have achieved remarkable progress in machine translation as well as many other applications. However, a long-standing challenge for NATs is learning the multi-modality of the data distribution, which is the main cause of the performance gap between NATs and ATs. In this paper, we propose to ease the difficulty of modality learning via sampling from the model distribution instead of the data distribution. We derive contrastive constraints to stabilize the training process and integrate the resulting objective with the state-of-the-art NAT architecture DA-Transformer. Our model is examined on 3 different tasks (machine translation, text summarization, and paraphrasing) with 5 benchmarks. Results show that our approach outperforms previous non-autoregressive baselines by a significant margin and establishes new state-of-the-art results for non-autoregressive transformers on all the benchmarks.
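
    A minimal sketch of a contrastive ranking objective over sequences sampled from the model distribution, as described; the pairwise margin form is an illustrative choice, not necessarily the paper's exact loss (assumes PyTorch):

        import torch

        def contrastive_rank_loss(log_probs, sample_scores, margin=1.0):
            # log_probs: model log-likelihoods of n sequences *sampled from the
            # model*, shape (n,); sample_scores: task quality of each sample
            # (e.g., BLEU against the reference), shape (n,).
            # Encourage the model to rank higher-quality samples above lower ones.
            order = torch.argsort(sample_scores, descending=True)
            ranked = log_probs[order]
            loss = torch.zeros((), dtype=ranked.dtype)
            pairs = 0
            for i in range(len(ranked)):
                for j in range(i + 1, len(ranked)):
                    loss = loss + torch.clamp(margin - (ranked[i] - ranked[j]), min=0.0)
                    pairs += 1
            return loss / max(pairs, 1)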
    Keywords Computer Science - Computation and Language
    Subject code 006
    Publishing date 2023-05-23
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  9. Book ; Online: Multijugate Dual Learning for Low-Resource Task-Oriented Dialogue System

    Li, Shimin / Zhang, Xiaotian / Zheng, Yanjun / Li, Linyang / Qiu, Xipeng

    2023  

    Abstract Dialogue data in real scenarios tend to be sparsely available, rendering data-starved end-to-end dialogue systems trained inadequately. We discover that data utilization efficiency in low-resource scenarios can be enhanced by mining the alignment information between uncertain utterances and deterministic dialogue states. Therefore, we innovatively implement dual learning in task-oriented dialogues to exploit the correlation of heterogeneous data. In addition, the one-to-one duality is converted into a multijugate duality to reduce the influence of spurious correlations in dual training for generalization. Without introducing additional parameters, our method can be implemented in arbitrary networks. Extensive empirical analyses demonstrate that our proposed method improves the effectiveness of end-to-end task-oriented dialogue systems under multiple benchmarks and obtains state-of-the-art results in low-resource scenarios.
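
    A minimal sketch of constructing multijugate dual training examples as described; the paraphrase function is an illustrative assumption:

        def build_dual_examples(paraphrase, utterance, state, n_para=3):
            # Primal direction: utterance -> dialogue state;
            # dual direction: dialogue state -> utterance.
            variants = [utterance] + [paraphrase(utterance) for _ in range(n_para)]
            examples = []
            # Multijugate duality: several paraphrased utterances map to the
            # same deterministic state, diluting spurious surface correlations.
            for u in variants:
                examples.append({"source": u, "target": state, "task": "utterance->state"})
                examples.append({"source": state, "target": u, "task": "state->utterance"})
            return examples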

    Comment: Accepted to Findings of ACL 2023
    Keywords Computer Science - Computation and Language
    Subject code 006
    Publishing date 2023-05-25
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  10. Book ; Online: Origin Tracing and Detecting of LLMs

    Li, Linyang / Wang, Pengyu / Ren, Ke / Sun, Tianxiang / Qiu, Xipeng

    2023  

    Abstract The extraordinary performance of large language models (LLMs) heightens the importance of detecting whether a given context is generated by an AI system. More importantly, as more and more companies and institutions release their LLMs, the origin can be hard to trace. Since LLMs are heading towards the era of AGI, similar to origin tracing in anthropology, it is of great importance to trace the origin of LLMs. In this paper, we first raise the concern of the origin tracing of LLMs and propose an effective method to trace and detect AI-generated contexts. We introduce a novel algorithm that leverages the contrastive features between LLMs and extracts model-wise features to trace the text origins. Our proposed method works under both white-box and black-box settings and can therefore be widely generalized to detect various LLMs (e.g., it can be generalized to detect GPT-3 models without access to the GPT-3 models). Our proposed method also requires only limited data compared with supervised learning methods and can be extended to trace the origins of newly released models. We conduct extensive experiments to examine whether we can trace the origins of given texts. Based on the experimental results, we provide valuable observations, such as the difficulty level of AI origin tracing and the similarities between AI origins, and call attention to the ethical concerns of LLM providers. We are releasing all code and data as a toolkit and benchmark for future AI origin tracing and detection studies. (All available resources are released at https://github.com/OpenLMLab/.)
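
    A minimal sketch of model-wise feature extraction for origin tracing as described; the per-model scorers and the scikit-learn-style classifier are illustrative assumptions:

        import numpy as np

        def origin_features(text, scorers):
            # One feature per candidate model: e.g., the average token
            # log-likelihood of `text` under that model (white-box) or a
            # proxy score from its API (black-box).
            return np.array([score(text) for score in scorers.values()])

        def trace_origin(text, scorers, classifier):
            # `classifier` is any fitted scikit-learn-style estimator trained
            # on a limited set of labeled (features, origin) pairs.
            features = origin_features(text, scorers).reshape(1, -1)
            return classifier.predict(features)[0]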

    Comment: work in progress
    Keywords Computer Science - Computation and Language
    Subject code 410
    Publishing date 2023-04-27
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
