LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 7 of 7

  1. Book ; Online: The Manga Whisperer

    Sachdeva, Ragav / Zisserman, Andrew

    Automatically Generating Transcriptions for Comics

    2024  

    Abstract In the past few decades, Japanese comics, commonly referred to as Manga, have transcended both cultural and linguistic boundaries to become a true worldwide sensation. Yet, the inherent reliance on visual cues and illustration within manga renders it largely inaccessible to individuals with visual impairments. In this work, we seek to address this substantial barrier, with the aim of ensuring that manga can be appreciated and actively engaged by everyone. Specifically, we tackle the problem of diarisation i.e. generating a transcription of who said what and when, in a fully automatic way. To this end, we make the following contributions: (1) we present a unified model, Magi, that is able to (a) detect panels, text boxes and character boxes, (b) cluster characters by identity (without knowing the number of clusters apriori), and (c) associate dialogues to their speakers; (2) we propose a novel approach that is able to sort the detected text boxes in their reading order and generate a dialogue transcript; (3) we annotate an evaluation benchmark for this task using publicly available [English] manga pages. The code, evaluation datasets and the pre-trained model can be found at: https://github.com/ragavsachdeva/magi.
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Publishing date 2024-01-18
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

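    As a loose illustration of the diarisation output described in entry 1, the sketch below orders hypothetical text-box detections in manga reading order (right-to-left, top-to-bottom) and prints a "who said what" transcript. All class and field names are invented for illustration; the actual Magi model uses learned detection, clustering and ordering, not this geometric heuristic.

        # Hedged sketch: a naive geometric reading-order heuristic plus a transcript
        # printer. Not the Magi model's algorithm; all names here are hypothetical.
        from dataclasses import dataclass

        @dataclass
        class TextBox:
            x: float          # left edge of the text box (page coordinates)
            y: float          # top edge
            text: str         # OCR'd dialogue
            speaker_id: int   # character cluster assigned by the speaker-association step

        def reading_order(boxes):
            # Manga is read top-to-bottom and right-to-left: sort by coarse row,
            # then by descending x within a row.
            return sorted(boxes, key=lambda b: (round(b.y, -1), -b.x))

        def transcript(boxes):
            return "\n".join(f"CHARACTER_{b.speaker_id}: {b.text}"
                             for b in reading_order(boxes))

        if __name__ == "__main__":
            page = [
                TextBox(x=400, y=50, text="Where are we?", speaker_id=0),
                TextBox(x=120, y=55, text="No idea...", speaker_id=1),
                TextBox(x=300, y=600, text="Let's keep moving.", speaker_id=0),
            ]
            print(transcript(page))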

  2. Book ; Online: The Change You Want to See (Now in 3D)

    Sachdeva, Ragav / Zisserman, Andrew

    2023  

    Abstract The goal of this paper is to detect what has changed, if anything, between two "in the wild" images of the same 3D scene acquired from different camera positions and at different temporal instances. The open-set nature of this problem, occlusions/dis-occlusions due to the shift in viewpoint, and the lack of suitable training datasets, presents substantial challenges in devising a solution. To address this problem, we contribute a change detection model that is trained entirely on synthetic data and is class-agnostic, yet it is performant out-of-the-box on real world images without requiring fine-tuning. Our solution entails a "register and difference" approach that leverages self-supervised frozen embeddings and feature differences, which allows the model to generalise to a wide variety of scenes and domains. The model is able to operate directly on two RGB images, without requiring access to ground truth camera intrinsics, extrinsics, depth maps, point clouds, or additional before-after images. Finally, we collect and release a new evaluation dataset consisting of real-world image pairs with human-annotated differences and demonstrate the efficacy of our method. The code, datasets and pre-trained model can be found at: https://github.com/ragavsachdeva/CYWS-3D
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 004
    Publishing date 2023-08-20
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

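    The "register and difference" idea in entry 2 can be pictured with a few lines of PyTorch, assuming the two RGB images are already registered and substituting a random convolutional stand-in for the frozen self-supervised backbone. The real CYWS-3D model handles the viewpoint change itself and uses learned features, so this is only a schematic sketch.

        # Minimal "register and difference" sketch under the assumptions above.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        def change_mask(img_a, img_b, encoder, threshold=0.5):
            """img_a, img_b: (1, 3, H, W) registered RGB tensors.
            Returns a (1, 1, H, W) binary change mask."""
            with torch.no_grad():                  # frozen features, no gradients
                feat_a = encoder(img_a)            # (1, C, h, w)
                feat_b = encoder(img_b)
            diff = (feat_a - feat_b).norm(dim=1, keepdim=True)   # per-location feature distance
            diff = (diff - diff.min()) / (diff.max() - diff.min() + 1e-8)
            diff = F.interpolate(diff, size=img_a.shape[-2:],    # back to image resolution
                                 mode="bilinear", align_corners=False)
            return (diff > threshold).float()

        if __name__ == "__main__":
            stand_in_encoder = nn.Sequential(      # placeholder for a frozen backbone
                nn.Conv2d(3, 16, 7, stride=4, padding=3), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1),
            )
            a, b = torch.rand(1, 3, 224, 224), torch.rand(1, 3, 224, 224)
            print(change_mask(a, b, stand_in_encoder).mean())    # fraction of pixels flagged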

  3. Book ; Online: The Dynamic Travelling Thief Problem

    Sachdeva, Ragav / Neumann, Frank / Wagner, Markus

    Benchmarks and Performance of Evolutionary Algorithms

    2020  

    Abstract Many real-world optimisation problems involve dynamic and stochastic components. While problems with multiple interacting components are omnipresent in inherently dynamic domains like supply-chain optimisation and logistics, most research on dynamic problems focuses on single-component problems. With this article, we define a number of scenarios based on the Travelling Thief Problem to enable research on the effect of dynamic changes to sub-components. Our investigations of 72 scenarios and seven algorithms show that -- depending on the instance, the magnitude of the change, and the algorithms in the portfolio -- it is preferable to either restart the optimisation from scratch or to continue with the previously valid solutions.

    Comment: Accepted for publication and presentation at ICONIP 2020, https://iconip2020.apnns.org/
    Keywords Computer Science - Neural and Evolutionary Computing
    Publishing date 2020-04-24
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

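    For readers unfamiliar with the problem underlying entry 3, the standard Travelling Thief Problem objective couples a tour and a packing plan: collected profit minus a rent rate times total travel time, where speed drops linearly as the knapsack fills. The evaluator below follows that common formulation; it is not the authors' benchmark code, and the variable names are illustrative.

        # Compact evaluator for the standard (static) TTP objective.
        import math

        def ttp_objective(tour, packing, coords, items, capacity,
                          v_max=1.0, v_min=0.1, rent=1.0):
            """tour: city indices, first city is the start; packing: set of picked item ids;
            coords: {city: (x, y)}; items: {item_id: (profit, weight, home_city)}."""
            profit = sum(items[i][0] for i in packing)
            per_city = {}                               # items to be picked up at each city
            for i in packing:
                per_city.setdefault(items[i][2], []).append(i)
            weight, time = 0.0, 0.0
            for k, city in enumerate(tour):
                for i in per_city.get(city, []):        # load items when the city is visited
                    weight += items[i][1]
                nxt = tour[(k + 1) % len(tour)]         # return to the start at the end
                dist = math.dist(coords[city], coords[nxt])
                speed = v_max - weight * (v_max - v_min) / capacity
                time += dist / speed
            return profit - rent * time

        if __name__ == "__main__":
            coords = {0: (0, 0), 1: (3, 0), 2: (3, 4)}
            items = {0: (50, 4, 1), 1: (30, 2, 2)}      # item: (profit, weight, home city)
            print(ttp_objective(tour=[0, 1, 2], packing={0, 1},
                                coords=coords, items=items, capacity=10))

    A dynamic scenario of the kind the paper studies would then change items, weights or city positions mid-run, forcing the optimiser to either restart or adapt its current solutions.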

  4. Book ; Online: ScanMix

    Sachdeva, Ragav / Cordeiro, Filipe R / Belagiannis, Vasileios / Reid, Ian / Carneiro, Gustavo

    Learning from Severe Label Noise via Semantic Clustering and Semi-Supervised Learning

    2021  

    Abstract In this paper, we address the problem of training deep neural networks in the presence of severe label noise. Our proposed training algorithm ScanMix, combines semantic clustering with semi-supervised learning (SSL) to improve the feature representations and enable an accurate identification of noisy samples, even in severe label noise scenarios. To be specific, ScanMix is designed based on the expectation maximisation (EM) framework, where the E-step estimates the value of a latent variable to cluster the training images based on their appearance representations and classification results, and the M-step optimises the SSL classification and learns effective feature representations via semantic clustering. In our evaluations, we show state-of-the-art results on standard benchmarks for symmetric, asymmetric and semantic label noise on CIFAR-10 and CIFAR-100, as well as large scale real label noise on WebVision. Most notably, for the benchmarks contaminated with large noise rates (80% and above), our results are up to 27% better than the related work. The code is available at https://github.com/ragavsachdeva/ScanMix.
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Machine Learning
    Subject code 006
    Publishing date 2021-03-21
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

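    The E-step / M-step alternation mentioned in entry 4 can be pictured with a toy k-means-style loop: the E-step assigns each sample to a latent cluster from its current representation, and the M-step re-estimates parameters given those assignments. This bare numpy illustration is not the ScanMix algorithm, which couples semantic clustering with semi-supervised classification over noisy labels.

        # Toy EM-style alternation (k-means flavour), for intuition only.
        import numpy as np

        rng = np.random.default_rng(0)
        X = np.vstack([rng.normal(0, 1, (100, 2)),
                       rng.normal(5, 1, (100, 2))])     # two toy feature clusters
        K = 2
        centroids = X[rng.choice(len(X), K, replace=False)]

        for step in range(10):
            # E-step: assign each sample to its nearest centroid (latent cluster variable)
            assign = np.argmin(((X[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
            # M-step: re-estimate parameters (here just the centroids) given the assignments
            centroids = np.stack([X[assign == k].mean(0) for k in range(K)])

        print(centroids.round(2))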

  5. Book ; Online: LongReMix

    Cordeiro, Filipe R. / Sachdeva, Ragav / Belagiannis, Vasileios / Reid, Ian / Carneiro, Gustavo

    Robust Learning with High Confidence Samples in a Noisy Label Environment

    2021  

    Abstract Deep neural network models are robust to a limited amount of label noise, but their ability to memorise noisy labels in high noise rate problems is still an open issue. The most competitive noisy-label learning algorithms rely on a 2-stage process comprising an unsupervised learning to classify training samples as clean or noisy, followed by a semi-supervised learning that minimises the empirical vicinal risk (EVR) using a labelled set formed by samples classified as clean, and an unlabelled set with samples classified as noisy. In this paper, we hypothesise that the generalisation of such 2-stage noisy-label learning methods depends on the precision of the unsupervised classifier and the size of the training set to minimise the EVR. We empirically validate these two hypotheses and propose the new 2-stage noisy-label training algorithm LongReMix. We test LongReMix on the noisy-label benchmarks CIFAR-10, CIFAR-100, WebVision, Clothing1M, and Food101-N. The results show that our LongReMix generalises better than competing approaches, particularly in high label noise problems. Furthermore, our approach achieves state-of-the-art performance in most datasets. The code will be available upon paper acceptance.
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 006
    Publishing date 2021-03-06
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

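    A common way to implement the first stage described in entry 5 (classifying training samples as clean or noisy without supervision) is to fit a two-component Gaussian mixture on per-sample training losses and treat the low-loss component as clean. The sketch below shows that generic recipe with scikit-learn; it is an assumption for illustration, not necessarily the exact procedure LongReMix uses.

        # Generic clean/noisy split via a 2-component GMM on per-sample losses.
        import numpy as np
        from sklearn.mixture import GaussianMixture

        def split_clean_noisy(per_sample_loss, clean_threshold=0.5):
            losses = per_sample_loss.reshape(-1, 1)
            gmm = GaussianMixture(n_components=2, random_state=0).fit(losses)
            low_loss_comp = int(np.argmin(gmm.means_.ravel()))    # component with smaller mean loss
            p_clean = gmm.predict_proba(losses)[:, low_loss_comp] # posterior of being clean
            clean_idx = np.where(p_clean >= clean_threshold)[0]   # labelled set for the SSL stage
            noisy_idx = np.where(p_clean < clean_threshold)[0]    # unlabelled set for the SSL stage
            return clean_idx, noisy_idx, p_clean

        if __name__ == "__main__":
            rng = np.random.default_rng(0)
            losses = np.concatenate([rng.normal(0.2, 0.05, 800),  # mostly clean samples
                                     rng.normal(2.0, 0.40, 200)]) # mislabelled samples
            clean, noisy, _ = split_clean_noisy(losses)
            print(len(clean), "clean /", len(noisy), "noisy")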

  6. Book ; Online: EvidentialMix

    Sachdeva, Ragav / Cordeiro, Filipe R. / Belagiannis, Vasileios / Reid, Ian / Carneiro, Gustavo

    Learning with Combined Open-set and Closed-set Noisy Labels

    2020  

    Abstract The efficacy of deep learning depends on large-scale data sets that have been carefully curated with reliable data acquisition and annotation processes. However, acquiring such large-scale data sets with precise annotations is very expensive and time-consuming, and the cheap alternatives often yield data sets that have noisy labels. The field has addressed this problem by focusing on training models under two types of label noise: 1) closed-set noise, where some training samples are incorrectly annotated to a training label other than their known true class; and 2) open-set noise, where the training set includes samples that possess a true class that is (strictly) not contained in the set of known training labels. In this work, we study a new variant of the noisy label problem that combines the open-set and closed-set noisy labels, and introduce a benchmark evaluation to assess the performance of training algorithms under this setup. We argue that such problem is more general and better reflects the noisy label scenarios in practice. Furthermore, we propose a novel algorithm, called EvidentialMix, that addresses this problem and compare its performance with the state-of-the-art methods for both closed-set and open-set noise on the proposed benchmark. Our results show that our method produces superior classification results and better feature representations than previous state-of-the-art methods. The code is available at https://github.com/ragavsachdeva/EvidentialMix.

    Comment: Paper accepted at WACV'21: Winter Conference on Applications of Computer Vision
    Keywords Computer Science - Machine Learning ; Computer Science - Computer Vision and Pattern Recognition
    Subject code 006
    Publishing date 2020-11-11
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

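    Entry 6's combined setup can be made concrete with a small corruption routine: a noise rate picks which samples are corrupted, and a mixing ratio decides whether a corrupted sample keeps its image but gets a wrong in-distribution label (closed-set) or keeps an in-distribution label while its image is swapped for an out-of-distribution one (open-set). The parameter names and exact scheme below are assumptions for illustration, not the paper's benchmark code.

        # Illustrative combined open-set / closed-set label-noise injection.
        import numpy as np

        def corrupt(labels, ood_pool, n_classes, noise_rate=0.4, closed_ratio=0.5, seed=0):
            rng = np.random.default_rng(seed)
            labels = labels.copy()
            is_ood = np.zeros(len(labels), dtype=bool)           # samples whose image gets swapped
            noisy = rng.choice(len(labels), int(noise_rate * len(labels)), replace=False)
            for i in noisy:
                if rng.random() < closed_ratio:                  # closed-set: flip the label
                    labels[i] = rng.choice([c for c in range(n_classes) if c != labels[i]])
                else:                                            # open-set: mark for an OOD image
                    is_ood[i] = True
            ood_assign = rng.choice(len(ood_pool), is_ood.sum()) # which OOD images to substitute
            return labels, is_ood, ood_assign

        if __name__ == "__main__":
            y = np.random.default_rng(1).integers(0, 10, size=1000)
            new_y, ood_mask, _ = corrupt(y, ood_pool=np.arange(5000), n_classes=10)
            print("closed-set flips:", int(((new_y != y) & ~ood_mask).sum()),
                  "| open-set swaps:", int(ood_mask.sum()))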

  7. Book ; Online: Autonomy and Perception for Space Mining

    Sachdeva, Ragav / Hammond, Ravi / Bockman, James / Arthur, Alec / Smart, Brandon / Craggs, Dustin / Doan, Anh-Dzung / Rowntree, Thomas / Schutz, Elijah / Orenstein, Adrian / Yu, Andy / Chin, Tat-Jun / Reid, Ian

    2021  

    Abstract Future Moon bases will likely be constructed using resources mined from the surface of the Moon. The difficulty of maintaining a human workforce on the Moon and communications lag with Earth means that mining will need to be conducted using collaborative robots with a high degree of autonomy. In this paper, we describe our solution for Phase 2 of the NASA Space Robotics Challenge, which provided a simulated lunar environment in which teams were tasked to develop software systems to achieve autonomous collaborative robots for mining on the Moon. Our 3rd place and innovation award winning solution shows how machine learning-enabled vision could alleviate major challenges posed by the lunar environment towards autonomous space mining, chiefly the lack of satellite positioning systems, hazardous terrain, and delicate robot interactions. A robust multi-robot coordinator was also developed to achieve long-term operation and effective collaboration between robots.

    Comment: This paper describes our 3rd place and innovation award winning solution to the NASA Space Robotics Challenge Phase 2
    Keywords Computer Science - Robotics ; Computer Science - Artificial Intelligence ; Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Machine Learning
    Subject code 629
    Publishing date 2021-09-26
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

