LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 7 of 7

  1. Book ; Online: The Manga Whisperer

    Sachdeva, Ragav / Zisserman, Andrew

    Automatically Generating Transcriptions for Comics

    2024  

    Abstract In the past few decades, Japanese comics, commonly referred to as Manga, have transcended both cultural and linguistic boundaries to become a true worldwide sensation. Yet, the inherent reliance on visual cues and illustration within manga renders it largely inaccessible to individuals with visual impairments. In this work, we seek to address this substantial barrier, with the aim of ensuring that manga can be appreciated and actively engaged by everyone. Specifically, we tackle the problem of diarisation i.e. generating a transcription of who said what and when, in a fully automatic way. To this end, we make the following contributions: (1) we present a unified model, Magi, that is able to (a) detect panels, text boxes and character boxes, (b) cluster characters by identity (without knowing the number of clusters apriori), and (c) associate dialogues to their speakers; (2) we propose a novel approach that is able to sort the detected text boxes in their reading order and generate a dialogue transcript; (3) we annotate an evaluation benchmark for this task using publicly available [English] manga pages. The code, evaluation datasets and the pre-trained model can be found at: https://github.com/ragavsachdeva/magi.
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Publishing date 2024-01-18
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

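    As a loose illustration of the diarisation output described in entry 1, the sketch below orders hypothetical text-box detections in manga reading order (right-to-left, top-to-bottom) and prints a "who said what" transcript. All class and field names are invented for illustration; the actual Magi model uses learned detection, clustering and ordering, not this geometric heuristic.

        # Hedged sketch: a naive geometric reading-order heuristic plus a transcript
        # printer. Not the Magi model's algorithm; all names here are hypothetical.
        from dataclasses import dataclass

        @dataclass
        class TextBox:
            x: float          # left edge of the text box (page coordinates)
            y: float          # top edge
            text: str         # OCR'd dialogue
            speaker_id: int   # character cluster assigned by the speaker-association step

        def reading_order(boxes):
            # Manga is read top-to-bottom and right-to-left: sort by coarse row,
            # then by descending x within a row.
            return sorted(boxes, key=lambda b: (round(b.y, -1), -b.x))

        def transcript(boxes):
            return "\n".join(f"CHARACTER_{b.speaker_id}: {b.text}"
                             for b in reading_order(boxes))

        if __name__ == "__main__":
            page = [
                TextBox(x=400, y=50, text="Where are we?", speaker_id=0),
                TextBox(x=120, y=55, text="No idea...", speaker_id=1),
                TextBox(x=300, y=600, text="Let's keep moving.", speaker_id=0),
            ]
            print(transcript(page))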

  2. Book ; Online: The Change You Want to See (Now in 3D)

    Sachdeva, Ragav / Zisserman, Andrew

    2023  

    Abstract The goal of this paper is to detect what has changed, if anything, between two "in the wild" images of the same 3D scene acquired from different camera positions and at different temporal instances. The open-set nature of this problem, occlusions/dis-occlusions due to the shift in viewpoint, and the lack of suitable training datasets, presents substantial challenges in devising a solution. To address this problem, we contribute a change detection model that is trained entirely on synthetic data and is class-agnostic, yet it is performant out-of-the-box on real world images without requiring fine-tuning. Our solution entails a "register and difference" approach that leverages self-supervised frozen embeddings and feature differences, which allows the model to generalise to a wide variety of scenes and domains. The model is able to operate directly on two RGB images, without requiring access to ground truth camera intrinsics, extrinsics, depth maps, point clouds, or additional before-after images. Finally, we collect and release a new evaluation dataset consisting of real-world image pairs with human-annotated differences and demonstrate the efficacy of our method. The code, datasets and pre-trained model can be found at: https://github.com/ragavsachdeva/CYWS-3D
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 004
    Publishing date 2023-08-20
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

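    The "register and difference" idea in entry 2 can be pictured with a few lines of PyTorch, assuming the two RGB images are already registered and substituting a random convolutional stand-in for the frozen self-supervised backbone. The real CYWS-3D model handles the viewpoint change itself and uses learned features, so this is only a schematic sketch.

        # Minimal "register and difference" sketch under the assumptions above.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        def change_mask(img_a, img_b, encoder, threshold=0.5):
            """img_a, img_b: (1, 3, H, W) registered RGB tensors.
            Returns a (1, 1, H, W) binary change mask."""
            with torch.no_grad():                  # frozen features, no gradients
                feat_a = encoder(img_a)            # (1, C, h, w)
                feat_b = encoder(img_b)
            diff = (feat_a - feat_b).norm(dim=1, keepdim=True)   # per-location feature distance
            diff = (diff - diff.min()) / (diff.max() - diff.min() + 1e-8)
            diff = F.interpolate(diff, size=img_a.shape[-2:],    # back to image resolution
                                 mode="bilinear", align_corners=False)
            return (diff > threshold).float()

        if __name__ == "__main__":
            stand_in_encoder = nn.Sequential(      # placeholder for a frozen backbone
                nn.Conv2d(3, 16, 7, stride=4, padding=3), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1),
            )
            a, b = torch.rand(1, 3, 224, 224), torch.rand(1, 3, 224, 224)
            print(change_mask(a, b, stand_in_encoder).mean())    # fraction of pixels flagged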

  3. Book ; Online: The Dynamic Travelling Thief Problem

    Sachdeva, Ragav / Neumann, Frank / Wagner, Markus

    Benchmarks and Performance of Evolutionary Algorithms

    2020  

    Abstract Many real-world optimisation problems involve dynamic and stochastic components. While problems with multiple interacting components are omnipresent in inherently dynamic domains like supply-chain optimisation and logistics, most research on dynamic problems focuses on single-component problems. With this article, we define a number of scenarios based on the Travelling Thief Problem to enable research on the effect of dynamic changes to sub-components. Our investigations of 72 scenarios and seven algorithms show that -- depending on the instance, the magnitude of the change, and the algorithms in the portfolio -- it is preferable to either restart the optimisation from scratch or to continue with the previously valid solutions.

    Comment: Accepted for publication and presentation at ICONIP 2020, https://iconip2020.apnns.org/
    Keywords Computer Science - Neural and Evolutionary Computing
    Publishing date 2020-04-24
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

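    For readers unfamiliar with the problem underlying entry 3, the standard Travelling Thief Problem objective couples a tour and a packing plan: collected profit minus a rent rate times total travel time, where speed drops linearly as the knapsack fills. The evaluator below follows that common formulation; it is not the authors' benchmark code, and the variable names are illustrative.

        # Compact evaluator for the standard (static) TTP objective.
        import math

        def ttp_objective(tour, packing, coords, items, capacity,
                          v_max=1.0, v_min=0.1, rent=1.0):
            """tour: city indices, first city is the start; packing: set of picked item ids;
            coords: {city: (x, y)}; items: {item_id: (profit, weight, home_city)}."""
            profit = sum(items[i][0] for i in packing)
            per_city = {}                               # items to be picked up at each city
            for i in packing:
                per_city.setdefault(items[i][2], []).append(i)
            weight, time = 0.0, 0.0
            for k, city in enumerate(tour):
                for i in per_city.get(city, []):        # load items when the city is visited
                    weight += items[i][1]
                nxt = tour[(k + 1) % len(tour)]         # return to the start at the end
                dist = math.dist(coords[city], coords[nxt])
                speed = v_max - weight * (v_max - v_min) / capacity
                time += dist / speed
            return profit - rent * time

        if __name__ == "__main__":
            coords = {0: (0, 0), 1: (3, 0), 2: (3, 4)}
            items = {0: (50, 4, 1), 1: (30, 2, 2)}      # item: (profit, weight, home city)
            print(ttp_objective(tour=[0, 1, 2], packing={0, 1},
                                coords=coords, items=items, capacity=10))

    A dynamic scenario of the kind the paper studies would then change items, weights or city positions mid-run, forcing the optimiser to either restart or adapt its current solutions.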

  4. Book ; Online: ScanMix

    Sachdeva, Ragav / Cordeiro, Filipe R / Belagiannis, Vasileios / Reid, Ian / Carneiro, Gustavo

    Learning from Severe Label Noise via Semantic Clustering and Semi-Supervised Learning

    2021  

    Abstract In this paper, we address the problem of training deep neural networks in the presence of severe label noise. Our proposed training algorithm ScanMix, combines semantic clustering with semi-supervised learning (SSL) to improve the feature representations and enable an accurate identification of noisy samples, even in severe label noise scenarios. To be specific, ScanMix is designed based on the expectation maximisation (EM) framework, where the E-step estimates the value of a latent variable to cluster the training images based on their appearance representations and classification results, and the M-step optimises the SSL classification and learns effective feature representations via semantic clustering. In our evaluations, we show state-of-the-art results on standard benchmarks for symmetric, asymmetric and semantic label noise on CIFAR-10 and CIFAR-100, as well as large scale real label noise on WebVision. Most notably, for the benchmarks contaminated with large noise rates (80% and above), our results are up to 27% better than the related work. The code is available at https://github.com/ragavsachdeva/ScanMix.
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Machine Learning
    Subject code 006
    Publishing date 2021-03-21
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

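    The E-step / M-step alternation mentioned in entry 4 can be pictured with a toy k-means-style loop: the E-step assigns each sample to a latent cluster from its current representation, and the M-step re-estimates parameters given those assignments. This bare numpy illustration is not the ScanMix algorithm, which couples semantic clustering with semi-supervised classification over noisy labels.

        # Toy EM-style alternation (k-means flavour), for intuition only.
        import numpy as np

        rng = np.random.default_rng(0)
        X = np.vstack([rng.normal(0, 1, (100, 2)),
                       rng.normal(5, 1, (100, 2))])     # two toy feature clusters
        K = 2
        centroids = X[rng.choice(len(X), K, replace=False)]

        for step in range(10):
            # E-step: assign each sample to its nearest centroid (latent cluster variable)
            assign = np.argmin(((X[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
            # M-step: re-estimate parameters (here just the centroids) given the assignments
            centroids = np.stack([X[assign == k].mean(0) for k in range(K)])

        print(centroids.round(2))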

  5. Book ; Online: LongReMix

    Cordeiro, Filipe R. / Sachdeva, Ragav / Belagiannis, Vasileios / Reid, Ian / Carneiro, Gustavo

    Robust Learning with High Confidence Samples in a Noisy Label Environment

    2021  

    Abstract Deep neural network models are robust to a limited amount of label noise, but their ability to memorise noisy labels in high noise rate problems is still an open issue. The most competitive noisy-label learning algorithms rely on a 2-stage process comprising an unsupervised learning to classify training samples as clean or noisy, followed by a semi-supervised learning that minimises the empirical vicinal risk (EVR) using a labelled set formed by samples classified as clean, and an unlabelled set with samples classified as noisy. In this paper, we hypothesise that the generalisation of such 2-stage noisy-label learning methods depends on the precision of the unsupervised classifier and the size of the training set to minimise the EVR. We empirically validate these two hypotheses and propose the new 2-stage noisy-label training algorithm LongReMix. We test LongReMix on the noisy-label benchmarks CIFAR-10, CIFAR-100, WebVision, Clothing1M, and Food101-N. The results show that our LongReMix generalises better than competing approaches, particularly in high label noise problems. Furthermore, our approach achieves state-of-the-art performance in most datasets. The code will be available upon paper acceptance.
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 006
    Publishing date 2021-03-06
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

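    A common way to implement the first stage described in entry 5 (classifying training samples as clean or noisy without supervision) is to fit a two-component Gaussian mixture on per-sample training losses and treat the low-loss component as clean. The sketch below shows that generic recipe with scikit-learn; it is an assumption for illustration, not necessarily the exact procedure LongReMix uses.

        # Generic clean/noisy split via a 2-component GMM on per-sample losses.
        import numpy as np
        from sklearn.mixture import GaussianMixture

        def split_clean_noisy(per_sample_loss, clean_threshold=0.5):
            losses = per_sample_loss.reshape(-1, 1)
            gmm = GaussianMixture(n_components=2, random_state=0).fit(losses)
            low_loss_comp = int(np.argmin(gmm.means_.ravel()))    # component with smaller mean loss
            p_clean = gmm.predict_proba(losses)[:, low_loss_comp] # posterior of being clean
            clean_idx = np.where(p_clean >= clean_threshold)[0]   # labelled set for the SSL stage
            noisy_idx = np.where(p_clean < clean_threshold)[0]    # unlabelled set for the SSL stage
            return clean_idx, noisy_idx, p_clean

        if __name__ == "__main__":
            rng = np.random.default_rng(0)
            losses = np.concatenate([rng.normal(0.2, 0.05, 800),  # mostly clean samples
                                     rng.normal(2.0, 0.40, 200)]) # mislabelled samples
            clean, noisy, _ = split_clean_noisy(losses)
            print(len(clean), "clean /", len(noisy), "noisy")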

  6. Book ; Online: EvidentialMix

    Sachdeva, Ragav / Cordeiro, Filipe R. / Belagiannis, Vasileios / Reid, Ian / Carneiro, Gustavo

    Learning with Combined Open-set and Closed-set Noisy Labels

    2020  

    Abstract The efficacy of deep learning depends on large-scale data sets that have been carefully curated with reliable data acquisition and annotation processes. However, acquiring such large-scale data sets with precise annotations is very expensive and time-consuming, and the cheap alternatives often yield data sets that have noisy labels. The field has addressed this problem by focusing on training models under two types of label noise: 1) closed-set noise, where some training samples are incorrectly annotated to a training label other than their known true class; and 2) open-set noise, where the training set includes samples that possess a true class that is (strictly) not contained in the set of known training labels. In this work, we study a new variant of the noisy label problem that combines the open-set and closed-set noisy labels, and introduce a benchmark evaluation to assess the performance of training algorithms under this setup. We argue that such problem is more general and better reflects the noisy label scenarios in practice. Furthermore, we propose a novel algorithm, called EvidentialMix, that addresses this problem and compare its performance with the state-of-the-art methods for both closed-set and open-set noise on the proposed benchmark. Our results show that our method produces superior classification results and better feature representations than previous state-of-the-art methods. The code is available at https://github.com/ragavsachdeva/EvidentialMix.

    Comment: Paper accepted at WACV'21: Winter Conference on Applications of Computer Vision
    Keywords Computer Science - Machine Learning ; Computer Science - Computer Vision and Pattern Recognition
    Subject code 006
    Publishing date 2020-11-11
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

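    Entry 6's combined setup can be made concrete with a small corruption routine: a noise rate picks which samples are corrupted, and a mixing ratio decides whether a corrupted sample keeps its image but gets a wrong in-distribution label (closed-set) or keeps an in-distribution label while its image is swapped for an out-of-distribution one (open-set). The parameter names and exact scheme below are assumptions for illustration, not the paper's benchmark code.

        # Illustrative combined open-set / closed-set label-noise injection.
        import numpy as np

        def corrupt(labels, ood_pool, n_classes, noise_rate=0.4, closed_ratio=0.5, seed=0):
            rng = np.random.default_rng(seed)
            labels = labels.copy()
            is_ood = np.zeros(len(labels), dtype=bool)           # samples whose image gets swapped
            noisy = rng.choice(len(labels), int(noise_rate * len(labels)), replace=False)
            for i in noisy:
                if rng.random() < closed_ratio:                  # closed-set: flip the label
                    labels[i] = rng.choice([c for c in range(n_classes) if c != labels[i]])
                else:                                            # open-set: mark for an OOD image
                    is_ood[i] = True
            ood_assign = rng.choice(len(ood_pool), is_ood.sum()) # which OOD images to substitute
            return labels, is_ood, ood_assign

        if __name__ == "__main__":
            y = np.random.default_rng(1).integers(0, 10, size=1000)
            new_y, ood_mask, _ = corrupt(y, ood_pool=np.arange(5000), n_classes=10)
            print("closed-set flips:", int(((new_y != y) & ~ood_mask).sum()),
                  "| open-set swaps:", int(ood_mask.sum()))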

  7. Book ; Online: Autonomy and Perception for Space Mining

    Sachdeva, Ragav / Hammond, Ravi / Bockman, James / Arthur, Alec / Smart, Brandon / Craggs, Dustin / Doan, Anh-Dzung / Rowntree, Thomas / Schutz, Elijah / Orenstein, Adrian / Yu, Andy / Chin, Tat-Jun / Reid, Ian

    2021  

    Abstract Future Moon bases will likely be constructed using resources mined from the surface of the Moon. The difficulty of maintaining a human workforce on the Moon and communications lag with Earth means that mining will need to be conducted using collaborative robots with a high degree of autonomy. In this paper, we describe our solution for Phase 2 of the NASA Space Robotics Challenge, which provided a simulated lunar environment in which teams were tasked to develop software systems to achieve autonomous collaborative robots for mining on the Moon. Our 3rd place and innovation award winning solution shows how machine learning-enabled vision could alleviate major challenges posed by the lunar environment towards autonomous space mining, chiefly the lack of satellite positioning systems, hazardous terrain, and delicate robot interactions. A robust multi-robot coordinator was also developed to achieve long-term operation and effective collaboration between robots.

    Comment: This paper describes our 3rd place and innovation award winning solution to the NASA Space Robotics Challenge Phase 2
    Keywords Computer Science - Robotics ; Computer Science - Artificial Intelligence ; Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Machine Learning
    Subject code 629
    Publishing date 2021-09-26
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

