LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 10 of 63 in total

  1. Book ; Online: Reproducibility Analysis and Enhancements for Multi-Aspect Dense Retriever with Aspect Learning

    Bi, Keping / Sun, Xiaojie / Guo, Jiafeng / Cheng, Xueqi

    2024  

    Abstract Multi-aspect dense retrieval aims to incorporate aspect information (e.g., brand and category) into dual encoders to facilitate relevance matching. As an early and representative multi-aspect dense retriever, MADRAL learns several extra aspect embeddings and fuses the explicit aspects with an implicit aspect "OTHER" for final representation. MADRAL was evaluated on proprietary data and its code was not released, making it challenging to validate its effectiveness on other datasets. We failed to reproduce its effectiveness on the public MA-Amazon data, motivating us to probe the reasons and re-examine its components. We propose several component alternatives for comparisons, including replacing "OTHER" with "CLS" and representing aspects with the first several content tokens. Through extensive experiments, we confirm that learning "OTHER" from scratch in aspect fusion is harmful. In contrast, our proposed variants can greatly enhance the retrieval performance. Our research not only sheds light on the limitations of MADRAL but also provides valuable insights for future studies on more powerful multi-aspect dense retrieval models. Code will be released at: https://github.com/sunxiaojie99/Reproducibility-for-MADRAL.

    Comment: Accepted at ECIR 2024 as a reproducibility paper
    Keywords Computer Science - Information Retrieval
    Subject code 006
    Publishing date 2024-01-07
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  2. Article ; Online: Visual Reasoning: From State to Transformation.

    Hong, Xin / Lan, Yanyan / Pang, Liang / Guo, Jiafeng / Cheng, Xueqi

    IEEE transactions on pattern analysis and machine intelligence

    2023  Volume 45, Issue 9, Page(s) 11352–11364

    Abstract Most existing visual reasoning tasks, such as CLEVR in VQA, ignore an important factor, i.e., transformation. They are solely defined to test how well machines understand concepts and relations within static settings, like one image. Such state driven visual reasoning has limitations in reflecting the ability to infer the dynamics between different states, which has shown to be equally important for human cognition in Piaget's theory. To tackle this problem, we propose a novel transformation driven visual reasoning (TVR) task. Given both the initial and final states, the target becomes to infer the corresponding intermediate transformation. Following this definition, a new synthetic dataset namely TRANCE is first constructed on the basis of CLEVR, including three levels of settings, i.e., Basic (single-step transformation), Event (multi-step transformation), and View (multi-step transformation with variant views). Next, we build another real dataset called TRANCO based on COIN, to cover the loss of transformation diversity on TRANCE. Inspired by human reasoning, we propose a three-staged reasoning framework called TranNet, including observing, analyzing, and concluding, to test how recent advanced techniques perform on TVR. Experimental results show that the state-of-the-art visual reasoning models perform well on Basic, but are still far from human-level intelligence on Event, View, and TRANCO. We believe the proposed new paradigm will boost the development of machine visual reasoning. More advanced methods and new problems need to be investigated in this direction.
    Language English
    Publishing date 2023-08-07
    Publishing country United States
    Document type Journal Article
    ISSN 1939-3539
    ISSN (online) 1939-3539
    DOI 10.1109/TPAMI.2023.3268093
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Book ; Online: A Comparative Study of Training Objectives for Clarification Facet Generation

    Ni, Shiyu / Bi, Keping / Guo, Jiafeng / Cheng, Xueqi

    2023  

    Abstract Due to the ambiguity and vagueness of a user query, it is essential to identify the query facets for the clarification of user intents. Existing work on query facet generation has achieved compelling performance by sequentially predicting the next facet given previously generated facets based on pre-trained language generation models such as BART. Given a query, there are mainly two types of training objectives to guide the facet generation models. One is to generate the default sequence of ground-truth facets, and the other is to enumerate all the permutations of ground-truth facets and use the sequence that has the minimum loss for model updates. The second is permutation-invariant while the first is not. In this paper, we aim to conduct a systematic comparative study of various types of training objectives, with different properties of not only whether it is permutation-invariant but also whether it conducts sequential prediction and whether it can control the count of output facets. To this end, we propose another three training objectives of different aforementioned properties. For comprehensive comparisons, besides the commonly used evaluation that measures the matching with ground-truth facets, we also introduce two diversity metrics to measure the diversity of the generated facets. Based on an open-domain query facet dataset, i.e., MIMICS, we conduct extensive analyses and show the pros and cons of each method, which could shed light on model training for clarification facet generation. The code can be found at https://github.com/ShiyuNee/Facet-Generation
    Keywords Computer Science - Information Retrieval
    Subject code 006
    Publishing date 2023-10-01
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  4. Book ; Online: CIR at the NTCIR-17 ULTRE-2 Task

    Yu, Lulu / Bi, Keping / Guo, Jiafeng / Cheng, Xueqi

    2023  

    Abstract The Chinese Academy of Sciences Information Retrieval team (CIR) participated in the NTCIR-17 ULTRE-2 task. This paper describes our approaches and reports our results on the ULTRE-2 task. We recognize that the issue of false negatives in the Baidu search data in this competition is very severe, much more severe than position bias. Hence, we adopt the Dual Learning Algorithm (DLA) to address the position bias and use it as an auxiliary model to study how to alleviate the false negative issue. We approach the problem from two perspectives: 1) correcting the labels for non-clicked items with a relevance judgment model trained from DLA, and learning a new ranker that is initialized from DLA; 2) including random documents as true negatives and documents that have partial matching as hard negatives. Both methods can enhance the model performance, and our best method achieved an nDCG@10 of 0.5355, which is 2.66% better than the best score from the organizer.

    Comment: 5 pages, 1 figure, NTCIR-17
    Keywords Computer Science - Information Retrieval
    Subject code 006
    Publishing date 2023-10-18
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  5. Article ; Online: In Situ Toxicity Reduction and Food Safety Assessment of Pak Choi (

    Jiao, Wei / Li, Zhi / Li, Ruiping / Guo, Jiafeng / Hou, Xiaoshu / Zhang, Xi / Wang, Fangli

    Toxics

    2023  Volume 11, Issue 10

    Abstract An economical and effective method is still lacking for cadmium (Cd) toxicity reduction and food product safety improvement in soil-vegetable systems. Therefore, this study aimed to reduce the Cd toxicity to pak choi (
    Language English
    Publishing date 2023-09-29
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2733883-6
    ISSN 2305-6304
    ISSN (online) 2305-6304
    DOI 10.3390/toxics11100824
    Database MEDical Literature Analysis and Retrieval System OnLINE

  6. Book ; Online: Ensemble Ranking Model with Multiple Pretraining Strategies for Web Search

    Sun, Xiaojie / Yu, Lulu / Wang, Yiting / Bi, Keping / Guo, Jiafeng

    2023  

    Abstract An effective ranking model usually requires a large amount of training data to learn the relevance between documents and queries. User clicks are often used as training data since they can indicate relevance and are cheap to collect, but they contain substantial bias and noise. There has been some work on mitigating various types of bias in simulated user clicks to train effective learning-to-rank models based on multiple features. However, how to effectively use such methods on large-scale pre-trained models with real-world click data is unknown. To alleviate the data bias in the real world, we incorporate heuristic-based features, refine the ranking objective, add random negatives, and calibrate the propensity calculation in the pre-training stage. Then we fine-tune several pre-trained models and train an ensemble model to aggregate all the predictions from various pre-trained models with human-annotation data in the fine-tuning stage. Our approaches won 3rd place in the "Pre-training for Web Search" task in WSDM Cup 2023 and are 22.6% better than the 4th-ranked team.

    Comment: 4 pages, 2 figures, WSDM Cup 2023
    Keywords Computer Science - Information Retrieval
    Subject code 006
    Publishing date 2023-02-18
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  7. Book ; Online: Feature-Enhanced Network with Hybrid Debiasing Strategies for Unbiased Learning to Rank

    Yu, Lulu / Wang, Yiting / Sun, Xiaojie / Bi, Keping / Guo, Jiafeng

    2023  

    Abstract Unbiased learning to rank (ULTR) aims to mitigate various biases existing in user clicks, such as position bias, trust bias, presentation bias, and learn an effective ranker. In this paper, we introduce our winning approach for the "Unbiased Learning to Rank" task in WSDM Cup 2023. We find that the provided data is severely biased so neural models trained directly with the top 10 results with click information are unsatisfactory. So we extract multiple heuristic-based features for multi-fields of the results, adjust the click labels, add true negatives, and re-weight the samples during model training. Since the propensities learned by existing ULTR methods are not decreasing w.r.t. positions, we also calibrate the propensities according to the click ratios and ensemble the models trained in two different ways. Our method won the 3rd prize with a DCG@10 score of 9.80, which is 1.1% worse than the 2nd and 25.3% higher than the 4th.

    Comment: 5 pages, 1 figure, WSDM Cup 2023
    Keywords Computer Science - Information Retrieval
    Subject code 006
    Publishing date 2023-02-15
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  8. Book ; Online: Visual Transformation Telling

    Hong, Xin / Lan, Yanyan / Pang, Liang / Guo, Jiafeng / Cheng, Xueqi

    2023  

    Abstract In this paper, we propose a new visual reasoning task, called Visual Transformation Telling (VTT). This task requires a machine to describe the transformation that occurred between every two adjacent states (i.e., images) in a series. Unlike most existing visual reasoning tasks that focus on state reasoning, VTT emphasizes transformation reasoning. We collected 13,547 samples from two instructional video datasets, CrossTask and COIN, and extracted desired states and transformation descriptions to create a suitable VTT benchmark dataset. Humans can naturally reason from superficial state differences (e.g., ground wetness) to transformation descriptions (e.g., raining) according to their life experience, but how to model this process to bridge this semantic gap is challenging. We designed TTNet on top of existing visual storytelling models by enhancing the model's state-difference sensitivity and transformation-context awareness. TTNet significantly outperforms other baseline models adapted from similar tasks, such as visual storytelling and dense video captioning, demonstrating the effectiveness of our modeling on transformations. Through comprehensive diagnostic analyses, we found TTNet has strong context utilization abilities, but even with some state-of-the-art techniques such as CLIP, there remain challenges in generalization that need to be further explored.
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 006
    Publishing date 2023-05-03
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  9. Book ; Online: Visual Reasoning: from State to Transformation

    Hong, Xin / Lan, Yanyan / Pang, Liang / Guo, Jiafeng / Cheng, Xueqi

    2023  

    Abstract Most existing visual reasoning tasks, such as CLEVR in VQA, ignore an important factor, i.e., transformation. They are solely defined to test how well machines understand concepts and relations within static settings, like one image. Such state driven visual reasoning has limitations in reflecting the ability to infer the dynamics between different states, which has shown to be equally important for human cognition in Piaget's theory. To tackle this problem, we propose a novel transformation driven visual reasoning (TVR) task. Given both the initial and final states, the target becomes to infer the corresponding intermediate transformation. Following this definition, a new synthetic dataset namely TRANCE is first constructed on the basis of CLEVR, including three levels of settings, i.e., Basic (single-step transformation), Event (multi-step transformation), and View (multi-step transformation with variant views). Next, we build another real dataset called TRANCO based on COIN, to cover the loss of transformation diversity on TRANCE. Inspired by human reasoning, we propose a three-staged reasoning framework called TranNet, including observing, analyzing, and concluding, to test how recent advanced techniques perform on TVR. Experimental results show that the state-of-the-art visual reasoning models perform well on Basic, but are still far from human-level intelligence on Event, View, and TRANCO. We believe the proposed new paradigm will boost the development of machine visual reasoning. More advanced methods and new problems need to be investigated in this direction. The resource of TVR is available at https://hongxin2019.github.io/TVR/.

    Comment: Accepted by TPAMI. arXiv admin note: substantial text overlap with arXiv:2011.13160
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Artificial Intelligence
    Publishing date 2023-05-02
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  10. Book ; Online: Few-shot Link Prediction on N-ary Facts

    Wei, Jiyao / Guan, Saiping / Jin, Xiaolong / Guo, Jiafeng / Cheng, Xueqi

    2023  

    Abstract N-ary facts composed of a primary triple (head entity, relation, tail entity) and an arbitrary number of auxiliary attribute-value pairs, are prevalent in real-world knowledge graphs (KGs). Link prediction on n-ary facts is to predict a missing element in an n-ary fact. This helps populate and enrich KGs and further promotes numerous downstream applications. Previous studies usually require a substantial amount of high-quality data to understand the elements in n-ary facts. However, these studies overlook few-shot relations, which have limited labeled instances, yet are common in real-world scenarios. Thus, this paper introduces a new task, few-shot link prediction on n-ary facts. It aims to predict a missing entity in an n-ary fact with limited labeled instances. We further propose a model for Few-shot Link prEdict on N-ary facts, thus called FLEN, which consists of three modules: the relation learning, support-specific adjusting, and query inference modules. FLEN captures relation meta information from limited instances to predict a missing entity in a query instance. To validate the effectiveness of FLEN, we construct three datasets based on existing benchmark data. Our experimental results show that FLEN significantly outperforms existing related models in both few-shot link prediction on n-ary facts and binary facts.
    Keywords Computer Science - Artificial Intelligence ; Computer Science - Information Retrieval ; Computer Science - Machine Learning
    Subject code 006
    Publishing date 2023-05-10
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
