LIVIVO - The Search Portal for Life Sciences

zur deutschen Oberfläche wechseln
Advanced search

Search results

Result 1 - 10 of total 34

Search options

  1. Book ; Online: Learnings from Data Integration for Augmented Language Models

    Halevy, Alon / Dwivedi-Yu, Jane

    2023  

    Abstract: One of the limitations of large language models is that they do not have access to up-to-date, proprietary or personal data. As a result, there are multiple efforts to extend language models with techniques for accessing external data. In that sense, ... ...

    Abstract One of the limitations of large language models is that they do not have access to up-to-date, proprietary or personal data. As a result, there are multiple efforts to extend language models with techniques for accessing external data. In that sense, LLMs share the vision of data integration systems whose goal is to provide seamless access to a large collection of heterogeneous data sources. While the details and the techniques of LLMs differ greatly from those of data integration, this paper shows that some of the lessons learned from research on data integration can elucidate the research path we are conducting today on language models.
    Keywords Computer Science - Computation and Language
    Publishing date 2023-04-10
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

    More links

    Kategorien

  2. Book ; Online: Human-Centered Planning

    Li, Yuliang / Kamra, Nitin / Desai, Ruta / Halevy, Alon

    2023  

    Abstract: LLMs have recently made impressive inroads on tasks whose output is structured, such as coding, robotic planning and querying databases. The vision of creating AI-powered personal assistants also involves creating structured outputs, such as a plan for ... ...

    Abstract LLMs have recently made impressive inroads on tasks whose output is structured, such as coding, robotic planning and querying databases. The vision of creating AI-powered personal assistants also involves creating structured outputs, such as a plan for one's day, or for an overseas trip. Here, since the plan is executed by a human, the output doesn't have to satisfy strict syntactic constraints. A useful assistant should also be able to incorporate vague constraints specified by the user in natural language. This makes LLMs an attractive option for planning. We consider the problem of planning one's day. We develop an LLM-based planner (LLMPlan) extended with the ability to self-reflect on its output and a symbolic planner (SymPlan) with the ability to translate text constraints into a symbolic representation. Despite no formal specification of constraints, we find that LLMPlan performs explicit constraint satisfaction akin to the traditional symbolic planners on average (2% performance difference), while retaining the reasoning of implicit requirements. Consequently, LLM-based planners outperform their symbolic counterparts in user satisfaction (70.5% vs. 40.4%) during interactive evaluation with 40 users.
    Keywords Computer Science - Artificial Intelligence
    Subject code 004
    Publishing date 2023-11-07
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

    More links

    Kategorien

  3. Book ; Online: Multimodal Neural Databases

    Trappolini, Giovanni / Santilli, Andrea / Rodolà, Emanuele / Halevy, Alon / Silvestri, Fabrizio

    2023  

    Abstract: The rise in loosely-structured data available through text, images, and other modalities has called for new ways of querying them. Multimedia Information Retrieval has filled this gap and has witnessed exciting progress in recent years. Tasks such as ... ...

    Abstract The rise in loosely-structured data available through text, images, and other modalities has called for new ways of querying them. Multimedia Information Retrieval has filled this gap and has witnessed exciting progress in recent years. Tasks such as search and retrieval of extensive multimedia archives have undergone massive performance improvements, driven to a large extent by recent developments in multimodal deep learning. However, methods in this field remain limited in the kinds of queries they support and, in particular, their inability to answer database-like queries. For this reason, inspired by recent work on neural databases, we propose a new framework, which we name Multimodal Neural Databases (MMNDBs). MMNDBs can answer complex database-like queries that involve reasoning over different input modalities, such as text and images, at scale. In this paper, we present the first architecture able to fulfill this set of requirements and test it with several baselines, showing the limitations of currently available models. The results show the potential of these new techniques to process unstructured data coming from different modalities, paving the way for future research in the area. Code to replicate the experiments will be released at https://github.com/GiovanniTRA/MultimodalNeuralDatabases
    Keywords Computer Science - Multimedia ; Computer Science - Computation and Language ; Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Databases ; Computer Science - Information Retrieval
    Subject code 004
    Publishing date 2023-05-02
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

    More links

    Kategorien

  4. Book ; Online: "That's so cute!"

    Dwivedi-Yu, Jane / Halevy, Alon Y.

    The CARE Dataset for Affective Response Detection

    2022  

    Abstract: Social media plays an increasing role in our communication with friends and family, and our consumption of information and entertainment. Hence, to design effective ranking functions for posts on social media, it would be useful to predict the affective ... ...

    Abstract Social media plays an increasing role in our communication with friends and family, and our consumption of information and entertainment. Hence, to design effective ranking functions for posts on social media, it would be useful to predict the affective response to a post (e.g., whether the user is likely to be humored, inspired, angered, informed). Similar to work on emotion recognition (which focuses on the affect of the publisher of the post), the traditional approach to recognizing affective response would involve an expensive investment in human annotation of training data. We introduce CARE$_{db}$, a dataset of 230k social media posts annotated according to 7 affective responses using the Common Affective Response Expression (CARE) method. The CARE method is a means of leveraging the signal that is present in comments that are posted in response to a post, providing high-precision evidence about the affective response of the readers to the post without human annotation. Unlike human annotation, the annotation process we describe here can be iterated upon to expand the coverage of the method, particularly for new affective responses. We present experiments that demonstrate that the CARE annotations compare favorably with crowd-sourced annotations. Finally, we use CARE$_{db}$ to train competitive BERT-based models for predicting affective response as well as emotion detection, demonstrating the utility of the dataset for related tasks.
    Keywords Computer Science - Machine Learning ; Computer Science - Computation and Language ; Computer Science - Information Retrieval
    Publishing date 2022-01-27
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

    More links

    Kategorien

  5. Book ; Online: NormBank

    Ziems, Caleb / Dwivedi-Yu, Jane / Wang, Yi-Chia / Halevy, Alon / Yang, Diyi

    A Knowledge Bank of Situational Social Norms

    2023  

    Abstract: We present NormBank, a knowledge bank of 155k situational norms. This resource is designed to ground flexible normative reasoning for interactive, assistive, and collaborative AI systems. Unlike prior commonsense resources, NormBank grounds each ... ...

    Abstract We present NormBank, a knowledge bank of 155k situational norms. This resource is designed to ground flexible normative reasoning for interactive, assistive, and collaborative AI systems. Unlike prior commonsense resources, NormBank grounds each inference within a multivalent sociocultural frame, which includes the setting (e.g., restaurant), the agents' contingent roles (waiter, customer), their attributes (age, gender), and other physical, social, and cultural constraints (e.g., the temperature or the country of operation). In total, NormBank contains 63k unique constraints from a taxonomy that we introduce and iteratively refine here. Constraints then apply in different combinations to frame social norms. Under these manipulations, norms are non-monotonic - one can cancel an inference by updating its frame even slightly. Still, we find evidence that neural models can help reliably extend the scope and coverage of NormBank. We further demonstrate the utility of this resource with a series of transfer experiments.
    Keywords Computer Science - Computation and Language
    Publishing date 2023-05-26
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

    More links

    Kategorien

  6. Book ; Online: VerifAI

    Tang, Nan / Yang, Chenyu / Fan, Ju / Cao, Lei / Luo, Yuyu / Halevy, Alon

    Verified Generative AI

    2023  

    Abstract: Generative AI has made significant strides, yet concerns about the accuracy and reliability of its outputs continue to grow. Such inaccuracies can have serious consequences such as inaccurate decision-making, the spread of false information, privacy ... ...

    Abstract Generative AI has made significant strides, yet concerns about the accuracy and reliability of its outputs continue to grow. Such inaccuracies can have serious consequences such as inaccurate decision-making, the spread of false information, privacy violations, legal liabilities, and more. Although efforts to address these risks are underway, including explainable AI and responsible AI practices such as transparency, privacy protection, bias mitigation, and social and environmental responsibility, misinformation caused by generative AI will remain a significant challenge. We propose that verifying the outputs of generative AI from a data management perspective is an emerging issue for generative AI. This involves analyzing the underlying data from multi-modal data lakes, including text files, tables, and knowledge graphs, and assessing its quality and consistency. By doing so, we can establish a stronger foundation for evaluating the outputs of generative AI models. Such an approach can ensure the correctness of generative AI, promote transparency, and enable decision-making with greater confidence. Our vision is to promote the development of verifiable generative AI and contribute to a more trustworthy and responsible use of AI.

    Comment: 8 pages, 4 figures
    Keywords Computer Science - Databases ; Computer Science - Computation and Language ; Computer Science - Machine Learning
    Publishing date 2023-07-06
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

    More links

    Kategorien

  7. Book ; Online: Detecting Inspiring Content on Social Media

    Ignat, Oana / Boureau, Y-Lan / Yu, Jane A. / Halevy, Alon

    2021  

    Abstract: Inspiration moves a person to see new possibilities and transforms the way they perceive their own potential. Inspiration has received little attention in psychology, and has not been researched before in the NLP community. To the best of our knowledge, ... ...

    Abstract Inspiration moves a person to see new possibilities and transforms the way they perceive their own potential. Inspiration has received little attention in psychology, and has not been researched before in the NLP community. To the best of our knowledge, this work is the first to study inspiration through machine learning methods. We aim to automatically detect inspiring content from social media data. To this end, we analyze social media posts to tease out what makes a post inspiring and what topics are inspiring. We release a dataset of 5,800 inspiring and 5,800 non-inspiring English-language public post unique ids collected from a dump of Reddit public posts made available by a third party and use linguistic heuristics to automatically detect which social media English-language posts are inspiring.

    Comment: accepted at ACII 2021
    Keywords Computer Science - Computation and Language
    Publishing date 2021-09-06
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

    More links

    Kategorien

  8. Book ; Online: The Moral Integrity Corpus

    Ziems, Caleb / Yu, Jane A. / Wang, Yi-Chia / Halevy, Alon / Yang, Diyi

    A Benchmark for Ethical Dialogue Systems

    2022  

    Abstract: Conversational agents have come increasingly closer to human competence in open-domain dialogue settings; however, such models can reflect insensitive, hurtful, or entirely incoherent viewpoints that erode a user's trust in the moral integrity of the ... ...

    Abstract Conversational agents have come increasingly closer to human competence in open-domain dialogue settings; however, such models can reflect insensitive, hurtful, or entirely incoherent viewpoints that erode a user's trust in the moral integrity of the system. Moral deviations are difficult to mitigate because moral judgments are not universal, and there may be multiple competing judgments that apply to a situation simultaneously. In this work, we introduce a new resource, not to authoritatively resolve moral ambiguities, but instead to facilitate systematic understanding of the intuitions, values and moral judgments reflected in the utterances of dialogue systems. The Moral Integrity Corpus, MIC, is such a resource, which captures the moral assumptions of 38k prompt-reply pairs, using 99k distinct Rules of Thumb (RoTs). Each RoT reflects a particular moral conviction that can explain why a chatbot's reply may appear acceptable or problematic. We further organize RoTs with a set of 9 moral and social attributes and benchmark performance for attribute classification. Most importantly, we show that current neural language models can automatically generate new RoTs that reasonably describe previously unseen interactions, but they still struggle with certain scenarios. Our findings suggest that MIC will be a useful resource for understanding and language models' implicit moral assumptions and flexibly benchmarking the integrity of conversational agents. To download the data, see https://github.com/GT-SALT/mic

    Comment: ACL 2022 main conference
    Keywords Computer Science - Computation and Language
    Subject code 170
    Publishing date 2022-04-06
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

    More links

    Kategorien

  9. Book ; Online: Affective Signals in a Social Media Recommender System

    Dwivedi-Yu, Jane / Wang, Yi-Chia / Qin, Lijing / Canton-Ferrer, Cristian / Halevy, Alon Y.

    2022  

    Abstract: People come to social media to satisfy a variety of needs, such as being informed, entertained and inspired, or connected to their friends and community. Hence, to design a ranking function that gives useful and personalized post recommendations, it ... ...

    Abstract People come to social media to satisfy a variety of needs, such as being informed, entertained and inspired, or connected to their friends and community. Hence, to design a ranking function that gives useful and personalized post recommendations, it would be helpful to be able to predict the affective response a user may have to a post (e.g., entertained, informed, angered). This paper describes the challenges and solutions we developed to apply Affective Computing to social media recommendation systems. We address several types of challenges. First, we devise a taxonomy of affects that was small (for practical purposes) yet covers the important nuances needed for the application. Second, to collect training data for our models, we balance between signals that are already available to us (namely, different types of user engagement) and data we collected through a carefully crafted human annotation effort on 800k posts. We demonstrate that affective response information learned from this dataset improves a module in the recommendation system by more than 8%. Online experimentation also demonstrates statistically significant decreases in surfaced violating content and increases in surfaced content that users find valuable.
    Keywords Computer Science - Social and Information Networks
    Publishing date 2022-06-24
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

    More links

    Kategorien

  10. Book ; Online: TimelineQA

    Tan, Wang-Chiew / Dwivedi-Yu, Jane / Li, Yuliang / Mathias, Lambert / Saeidi, Marzieh / Yan, Jing Nathan / Halevy, Alon Y.

    A Benchmark for Question Answering over Timelines

    2023  

    Abstract: Lifelogs are descriptions of experiences that a person had during their life. Lifelogs are created by fusing data from the multitude of digital services, such as online photos, maps, shopping and content streaming services. Question answering over ... ...

    Abstract Lifelogs are descriptions of experiences that a person had during their life. Lifelogs are created by fusing data from the multitude of digital services, such as online photos, maps, shopping and content streaming services. Question answering over lifelogs can offer personal assistants a critical resource when they try to provide advice in context. However, obtaining answers to questions over lifelogs is beyond the current state of the art of question answering techniques for a variety of reasons, the most pronounced of which is that lifelogs combine free text with some degree of structure such as temporal and geographical information. We create and publicly release TimelineQA1, a benchmark for accelerating progress on querying lifelogs. TimelineQA generates lifelogs of imaginary people. The episodes in the lifelog range from major life episodes such as high school graduation to those that occur on a daily basis such as going for a run. We describe a set of experiments on TimelineQA with several state-of-the-art QA models. Our experiments reveal that for atomic queries, an extractive QA system significantly out-performs a state-of-the-art retrieval-augmented QA system. For multi-hop queries involving aggregates, we show that the best result is obtained with a state-of-the-art table QA technique, assuming the ground truth set of episodes for deriving the answer is available.
    Keywords Computer Science - Computation and Language ; Computer Science - Artificial Intelligence ; Computer Science - Information Retrieval
    Subject code 006
    Publishing date 2023-06-01
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

    More links

    Kategorien

To top