LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 10 of 31 in total

  1. Article ; Online: Improving the perception of low-light enhanced images.

    Vazquez-Corral, Javier / Finlayson, Graham D / Herranz, Luis

    Optics express

    2024  Volume 32, Issue 4, Page(s) 5174–5190

    Abstract Improving images captured under low-light conditions has become an important topic in computational color imaging, as it has a wide range of applications. Most current methods are either based on handcrafted features or on end-to-end training of deep neural networks that mostly focus on minimizing some distortion metric, such as PSNR or SSIM, on a set of training images. However, the minimization of distortion metrics does not mean that the results are optimal in terms of perception (i.e. perceptual quality). As an example, the perception-distortion trade-off states that, close to the optimal results, improving distortion results in worsening perception. This means that current low-light image enhancement methods, which focus on distortion minimization, cannot be optimal in terms of perceptual quality. In this paper, we propose a post-processing approach in which, given the original low-light image and the result of a specific method, we obtain a result that resembles that of the original method as closely as possible while improving the perception of the final image. In more detail, our method follows the hypothesis that, to minimally modify the perception of an input image, any modification should be a combination of a local change in the shading across the scene and a global change in illumination color. We demonstrate the ability of our method quantitatively using blind perceptual image metrics such as BRISQUE, NIQE, and UNIQUE, and through user preference tests.
    Language English
    Publishing date 2024-03-04
    Publishing country United States
    Document type Journal Article
    ZDB-ID 1491859-6
    ISSN 1094-4087
    ISSN (online) 1094-4087
    DOI 10.1364/OE.509713
    Database MEDical Literature Analysis and Retrieval System OnLINE
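
The decomposition hypothesized in this abstract (a perception-preserving modification equals a smooth local shading change times a global illumination-color change) can be illustrated with a minimal sketch. This is not the authors' optimization; the smoothing choice (a Gaussian filter) and the helper name are assumptions, and the function only shows the constrained recomposition.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def constrained_enhancement(original, enhanced, sigma=15, eps=1e-6):
    """original, enhanced: float arrays in [0, 1] with shape (H, W, 3)."""
    # Global illumination color: one multiplicative gain per channel,
    # normalized so that overall brightness is carried by the shading map.
    gain = enhanced.reshape(-1, 3).mean(0) / (original.reshape(-1, 3).mean(0) + eps)
    gain = gain / gain.mean()
    # Local shading: a smooth, channel-independent luminance ratio.
    shading = gaussian_filter(enhanced.mean(2) / (original.mean(2) + eps), sigma=sigma)
    # Recompose: the original modified only by shading and a global color change.
    return np.clip(original * shading[..., None] * gain[None, None, :], 0.0, 1.0)
```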

  2. Article ; Online: Lightweight Deep Exemplar Colorization via Semantic Attention-Guided Laplacian Pyramid.

    Zou, Chengyi / Wan, Shuai / Blanch, Marc Gorriz / Murn, Luka / Mrak, Marta / Sock, Juil / Yang, Fei / Herranz, Luis

    IEEE transactions on visualization and computer graphics

    2024  Volume PP

    Abstract Exemplar-based colorization aims to generate plausible colors for a grayscale image with the guidance of a color reference image. The main challenge is finding the correct semantic correspondence between the target image and the reference image. However, existing methods often confuse the colors of objects and background. Moreover, these methods usually use simple encoder-decoder architectures or pyramid structures to extract features and lack appropriate fusion mechanisms, which results in the loss of high-frequency information or in high complexity. To address these problems, this paper proposes a lightweight semantic attention-guided Laplacian pyramid network (SAGLP-Net) for deep exemplar-based colorization, exploiting the inherent multi-scale properties of color representations through a Laplacian pyramid, with semantic information introduced as high-level guidance to align object and background information. Specifically, a semantic-guided non-local attention fusion module is designed to exploit long-range dependencies and fuse local and global features. Moreover, a Laplacian pyramid fusion module based on criss-cross attention is proposed to fuse high-frequency components in the large-scale domain. An unsupervised multi-scale multi-loss training strategy is further introduced for network training, which combines pixel loss, color histogram loss, total variation regularisation, and adversarial loss. Experimental results demonstrate that our colorization method achieves better subjective and objective performance with lower complexity than state-of-the-art methods.
    Language English
    Publishing date 2024-05-09
    Publishing country United States
    Document type Journal Article
    ISSN 1941-0506
    ISSN (online) 1941-0506
    DOI 10.1109/TVCG.2024.3398791
    Database MEDical Literature Analysis and Retrieval System OnLINE
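
For readers unfamiliar with the multi-scale decomposition this network builds on, here is a plain Laplacian pyramid constructed with standard OpenCV calls; the attention and fusion modules of SAGLP-Net are not reproduced, and the function is purely illustrative.

```python
import cv2

def laplacian_pyramid(img, levels=3):
    """img: float32 array (H, W, 3). Returns [band-pass levels..., low-pass residual]."""
    pyramid, current = [], img
    for _ in range(levels):
        down = cv2.pyrDown(current)
        up = cv2.pyrUp(down, dstsize=(current.shape[1], current.shape[0]))
        pyramid.append(current - up)   # high-frequency detail at this scale
        current = down
    pyramid.append(current)            # coarsest low-pass image
    return pyramid

# The original image can be recovered by upsampling each level and adding the details back.
```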

  3. Article ; Online: Trust Your Good Friends: Source-Free Domain Adaptation by Reciprocal Neighborhood Clustering.

    Yang, Shiqi / Wang, Yaxing / van de Weijer, Joost / Herranz, Luis / Jui, Shangling / Yang, Jian

    IEEE transactions on pattern analysis and machine intelligence

    2023  Volume 45, Issue 12, Page(s) 15883–15895

    Abstract Domain adaptation (DA) aims to alleviate the domain shift between a source domain and a target domain. Most DA methods require access to the source data, but often this is not possible (e.g., due to data privacy or intellectual property). In this paper, we address the challenging source-free domain adaptation (SFDA) problem, where the source pretrained model is adapted to the target domain in the absence of source data. Our method is based on the observation that target data, which might not align with the source domain classifier, still form clear clusters. We capture this intrinsic structure by defining local affinity of the target data and encouraging label consistency among data with high local affinity. We observe that higher affinity should be assigned to reciprocal neighbors. To aggregate information with more context, we consider expanded neighborhoods with small affinity values. Furthermore, we consider the density around each target sample, which can alleviate the negative impact of potential outliers. In the experiments we verify that the inherent structure of the target features is an important source of information for domain adaptation. We demonstrate that this local structure can be efficiently captured by considering the local neighbors, the reciprocal neighbors, and the expanded neighborhood. Finally, we achieve state-of-the-art performance on several 2D image and 3D point cloud recognition datasets.
    Language English
    Publishing date 2023-11-03
    Publishing country United States
    Document type Journal Article
    ISSN 1939-3539
    ISSN (online) 1939-3539
    DOI 10.1109/TPAMI.2023.3310791
    Database MEDical Literature Analysis and Retrieval System OnLINE
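
A minimal numpy sketch of the neighborhood-affinity idea described in this abstract: reciprocal K-nearest neighbors receive high affinity, non-reciprocal neighbors a small one (standing in for the expanded neighborhood), and predictions of high-affinity pairs are pushed to agree. The specific weights and the loss form are illustrative assumptions, not the paper's exact objective.

```python
import numpy as np

def reciprocal_affinity(features, k=5):
    """features: (N, D) L2-normalized target features. Returns an (N, N) affinity matrix."""
    sim = features @ features.T
    np.fill_diagonal(sim, -np.inf)                    # exclude self-matches
    knn = np.argsort(-sim, axis=1)[:, :k]             # indices of the K nearest neighbors
    is_nn = np.zeros(sim.shape, dtype=bool)
    is_nn[np.repeat(np.arange(len(features)), k), knn.ravel()] = True
    reciprocal = is_nn & is_nn.T                      # i is in knn(j) and j is in knn(i)
    return np.where(reciprocal, 1.0, np.where(is_nn, 0.1, 0.0))

def neighborhood_consistency_loss(probs, affinity):
    """probs: (N, C) class probabilities. Lower when high-affinity pairs predict alike."""
    agreement = probs @ probs.T
    return -(affinity * agreement).sum() / (affinity.sum() + 1e-8)
```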

  4. Book ; Online: Continual learning in cross-modal retrieval

    Wang, Kai / Herranz, Luis / van de Weijer, Joost

    2021  

    Abstract Multimodal representations and continual learning are two areas closely related to human intelligence. The former considers the learning of shared representation spaces where information from different modalities can be compared and integrated (we focus on cross-modal retrieval between language and visual representations). The latter studies how to prevent forgetting a previously learned task when learning a new one. While humans excel in these two aspects, deep neural networks are still quite limited. In this paper, we combine both problems into a continual cross-modal retrieval setting, where we study how the catastrophic interference caused by new tasks impacts the embedding spaces and their cross-modal alignment required for effective retrieval. We propose a general framework that decouples the training, indexing and querying stages. We also identify and study different factors that may lead to forgetting, and propose tools to alleviate it. We find that the indexing stage plays an important role and that simply avoiding reindexing the database with updated embedding networks can lead to significant gains. We evaluate our methods on two image-text retrieval datasets, obtaining significant gains with respect to the fine-tuning baseline.

    Comment: 2nd CLVISION workshop in CVPR 2021
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 004
    Publishing date 2021-04-14
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
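
The decoupling of indexing and querying described in this abstract, together with the finding that the database should not be re-indexed with updated embedding networks, can be summarized in a small sketch; the class and encoder interfaces are assumed for illustration only.

```python
import numpy as np

class CrossModalIndex:
    """Indexes image embeddings once; later queries may come from an updated text encoder."""

    def __init__(self):
        self.ids, self.embeddings = [], []

    def add(self, item_id, image_embedding):
        # Indexing stage: the stored embedding is frozen at insertion time and is
        # deliberately not recomputed when the encoders are later updated on new tasks.
        self.ids.append(item_id)
        self.embeddings.append(np.asarray(image_embedding, dtype=np.float32))

    def query(self, text_embedding, top_k=5):
        # Querying stage: uses whatever (possibly updated) text encoder produced the query.
        sims = np.stack(self.embeddings) @ np.asarray(text_embedding, dtype=np.float32)
        order = np.argsort(-sims)[:top_k]
        return [(self.ids[i], float(sims[i])) for i in order]
```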

  5. Article ; Online: Distributed Learning and Inference With Compressed Images.

    Katakol, Sudeep / Elbarashy, Basem / Herranz, Luis / van de Weijer, Joost / Lopez, Antonio M

    IEEE transactions on image processing : a publication of the IEEE Signal Processing Society

    2021  Volume 30, Page(s) 3069–3083

    Abstract Modern computer vision requires processing large amounts of data, both while training the model and during inference, once the model is deployed. Scenarios where images are captured and processed in physically separated locations are increasingly common (e.g. autonomous vehicles, cloud computing, smartphones). In addition, many devices have limited resources to store or transmit data (e.g. storage space, channel capacity). In these scenarios, lossy image compression plays a crucial role in effectively increasing the number of images that can be collected under such constraints. However, lossy compression entails some undesired degradation of the data that may harm the performance of the downstream analysis task at hand, since important semantic information may be lost in the process. Moreover, we may only have compressed images at training time but be able to use original images at inference time (i.e. test), or vice versa, and in such a case the downstream model suffers from covariate shift. In this paper, we analyze this phenomenon, with a special focus on vision-based perception for autonomous driving as a paradigmatic scenario. We show that loss of semantic information and covariate shift do indeed exist, resulting in a drop in performance that depends on the compression rate. In order to address the problem, we propose dataset restoration, based on image restoration with generative adversarial networks (GANs). Our method is agnostic to both the particular image compression method and the downstream task, and has the advantage of not adding additional cost to the deployed models, which is particularly important in resource-limited devices. The presented experiments focus on semantic segmentation as a challenging use case, cover a broad range of compression rates and diverse datasets, and show how our method is able to significantly alleviate the negative effects of compression on the downstream visual task.
    Language English
    Publishing date 2021-02-24
    Publishing country United States
    Document type Journal Article
    ISSN 1941-0042
    ISSN (online) 1941-0042
    DOI 10.1109/TIP.2021.3058545
    Database MEDical Literature Analysis and Retrieval System OnLINE
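
A schematic sketch of the dataset-restoration idea from this abstract: training images are passed through the same lossy codec they would see in deployment and then through a GAN-based restoration model before the downstream task is trained, so that train and test distributions match. The codec and restorer interfaces here are placeholders, not the paper's implementation.

```python
def build_restored_training_set(images, codec, restorer):
    """codec: object with encode/decode methods; restorer: callable GAN-based restoration model."""
    restored = []
    for img in images:
        degraded = codec.decode(codec.encode(img))   # lossy round trip, as in deployment
        restored.append(restorer(degraded))          # restoration removes compression artifacts
    return restored
```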

  6. Article ; Online: Learning Effective RGB-D Representations for Scene Recognition.

    Song, Xinhang / Jiang, Shuqiang / Herranz, Luis / Chen, Chengpeng

    IEEE transactions on image processing : a publication of the IEEE Signal Processing Society

    2018  

    Abstract Deep convolutional neural networks (CNNs) can achieve impressive results on RGB scene recognition thanks to large datasets such as Places. In contrast, RGB-D scene recognition remains underdeveloped, due to two limitations of RGB-D data that we address in this paper. The first limitation is the lack of depth data for training deep learning models. Rather than fine-tuning or transferring RGB-specific features, we address this limitation by proposing an architecture and a two-step training approach that directly learn effective depth-specific features using weak supervision via patches. The resulting RGB-D model also benefits from more complementary multimodal features. Another limitation is the short range of depth sensors (typically 0.5 m to 5.5 m), which means that depth images do not capture distant objects in the scenes that RGB images can. We show that this limitation can be addressed by using RGB-D videos, where more comprehensive depth information is accumulated as the camera travels across the scene. Focusing on this scenario, we introduce the ISIA RGB-D video dataset to evaluate RGB-D scene recognition with videos. Our video recognition architecture combines convolutional and recurrent neural networks (RNNs) that are trained in three steps with increasingly complex data to learn effective features (i.e. patches, frames and sequences). Our approach obtains state-of-the-art performance on RGB-D image (NYUD2 and SUN RGB-D) and video (ISIA RGB-D) scene recognition.
    Language English
    Publishing date 2018-09-28
    Publishing country United States
    Document type Journal Article
    ISSN 1941-0042
    ISSN (online) 1941-0042
    DOI 10.1109/TIP.2018.2872629
    Database MEDical Literature Analysis and Retrieval System OnLINE
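
The convolutional-plus-recurrent combination described in this abstract can be sketched as follows: per-frame features from a stand-in backbone over 4-channel RGB-D frames are aggregated over time by a GRU. Layer sizes and the backbone are placeholders, not the paper's architecture.

```python
import torch
import torch.nn as nn

class RGBDVideoNet(nn.Module):
    def __init__(self, feat_dim=512, hidden=256, num_classes=10):
        super().__init__()
        self.cnn = nn.Sequential(                       # stand-in for a pretrained backbone
            nn.Conv2d(4, 32, 3, stride=2, padding=1),   # 4 channels = RGB + depth
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, clip):                            # clip: (B, T, 4, H, W)
        b, t = clip.shape[:2]
        feats = self.cnn(clip.flatten(0, 1)).view(b, t, -1)   # per-frame features
        _, h = self.rnn(feats)                          # temporal aggregation
        return self.head(h[-1])                         # classify from the last hidden state
```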

  7. Book ; Online: Slimmable Compressive Autoencoders for Practical Neural Image Compression

    Yang, Fei / Herranz, Luis / Cheng, Yongmei / Mozerov, Mikhail G.

    2021  

    Abstract Neural image compression leverages deep neural networks to outperform traditional image codecs in rate-distortion performance. However, the resulting models are also heavy, computationally demanding and generally optimized for a single rate, limiting their practical use. Focusing on practical image compression, we propose slimmable compressive autoencoders (SlimCAEs), where rate (R) and distortion (D) are jointly optimized for different capacities. Once trained, encoders and decoders can be executed at different capacities, leading to different rates and complexities. We show that a successful implementation of SlimCAEs requires suitable capacity-specific RD tradeoffs. Our experiments show that SlimCAEs are highly flexible models that provide excellent rate-distortion performance, variable rate, and dynamic adjustment of memory, computational cost and latency, thus addressing the main requirements of practical image compression.

    Comment: Accepted to CVPR 2021
    Keywords Electrical Engineering and Systems Science - Image and Video Processing ; Computer Science - Computer Vision and Pattern Recognition
    Subject code 006
    Publishing date 2021-03-29
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
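
The capacity-specific rate-distortion trade-offs described in this abstract can be written as a single training objective, sketched here with an assumed model interface (the model returns a reconstruction and a rate estimate for a requested width); the widths and lambda values are made up for illustration.

```python
import torch

WIDTHS = (0.25, 0.5, 1.0)                        # fractions of the full capacity
LAMBDAS = {0.25: 0.002, 0.5: 0.01, 1.0: 0.05}    # one RD trade-off per capacity

def slimmable_rd_loss(model, x):
    """Sum of rate + lambda_w * distortion over all widths of one slimmable autoencoder."""
    total = x.new_zeros(())
    for w in WIDTHS:
        x_hat, rate = model(x, width=w)          # assumed interface: reconstruction, rate estimate
        distortion = torch.mean((x - x_hat) ** 2)
        total = total + rate + LAMBDAS[w] * distortion
    return total
```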

  8. Book ; Online: A Novel Framework for Image-to-image Translation and Image Compression

    Yang, Fei / Wang, Yaxing / Herranz, Luis / Cheng, Yongmei / Mozerov, Mikhail

    2021  

    Abstract Data-driven paradigms using machine learning are becoming ubiquitous in image processing and communications. In particular, image-to-image (I2I) translation is a generic and widely used approach to image processing problems such as image synthesis, style transfer, and image restoration. At the same time, neural image compression has emerged as a data-driven alternative to traditional coding approaches in visual communications. In this paper, we study the combination of these two paradigms into a joint I2I compression and translation framework, focusing on multi-domain image synthesis. We first propose distributed I2I translation by integrating quantization and entropy coding into an I2I translation framework (i.e. I2Icodec). In practice, image compression functionality (i.e. autoencoding) is also desirable, which would require deploying a regular image codec alongside I2Icodec. Thus, we further propose a unified framework that allows both translation and autoencoding capabilities in a single codec. Adaptive residual blocks conditioned on the translation/compression mode provide flexible adaptation to the desired functionality. The experiments show promising results in both I2I translation and image compression using a single model.

    Comment: 14 pages, 15 figures, accepted by Neurocomputing
    Keywords Electrical Engineering and Systems Science - Image and Video Processing ; Computer Science - Computer Vision and Pattern Recognition
    Publishing date 2021-11-25
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
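
One way to realize the mode-conditioned adaptive residual blocks mentioned in this abstract is a per-mode channel scale and shift, sketched below; this conditioning mechanism is an assumption for illustration and not necessarily the paper's exact design.

```python
import torch
import torch.nn as nn

class AdaptiveResBlock(nn.Module):
    """Residual block whose activations are modulated by a mode id (e.g. 0 = compress, 1 = translate)."""

    def __init__(self, channels, num_modes=2):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.scale = nn.Embedding(num_modes, channels)   # per-mode channel scale
        self.shift = nn.Embedding(num_modes, channels)   # per-mode channel shift

    def forward(self, x, mode):
        s = self.scale(mode).view(1, -1, 1, 1)
        b = self.shift(mode).view(1, -1, 1, 1)
        h = torch.relu(self.conv1(x)) * s + b
        return x + self.conv2(h)

# Usage: block = AdaptiveResBlock(64); y = block(x, torch.tensor(1))  # run in "translate" mode
```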

  9. Book ; Online: On Implicit Attribute Localization for Generalized Zero-Shot Learning

    Yang, Shiqi / Wang, Kai / Herranz, Luis / van de Weijer, Joost

    2021  

    Abstract Zero-shot learning (ZSL) aims to discriminate images from unseen classes by exploiting relations to seen classes via their attribute-based descriptions. Since attributes are often related to specific parts of objects, many recent works focus on discovering discriminative regions. However, these methods usually require additional complex part detection modules or attention mechanisms. In this paper, 1) we show that common ZSL backbones (without explicit attention or part detection) can implicitly localize attributes, yet this property is not exploited; 2) exploiting it, we propose SELAR, a simple method that further encourages attribute localization, surprisingly achieving very competitive generalized ZSL (GZSL) performance when compared with more complex state-of-the-art methods. Our findings provide useful insight for designing future GZSL methods, and SELAR provides an easy-to-implement yet strong baseline.

    Comment: To appear in IEEE Signal Processing Letters. Overlapped with arXiv:2006.05938
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 006
    Publishing date 2021-03-08
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
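
The implicit attribute localization referred to in this abstract can be made visible by projecting spatial backbone features onto the attribute embeddings before pooling, so each attribute gets a score map; the function below is a generic illustration of that readout, not SELAR's exact recipe.

```python
import numpy as np

def attribute_score_maps(feature_map, attribute_embeddings):
    """feature_map: (H, W, D) backbone features; attribute_embeddings: (A, D).
    Returns per-attribute spatial maps (A, H, W) and pooled attribute scores (A,)."""
    maps = np.einsum('hwd,ad->ahw', feature_map, attribute_embeddings)
    scores = maps.reshape(maps.shape[0], -1).max(axis=1)   # strongest location per attribute
    return maps, scores
```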

  10. Book ; Online: HCV: Hierarchy-Consistency Verification for Incremental Implicitly-Refined Classification

    Wang, Kai / Liu, Xialei / Herranz, Luis / van de Weijer, Joost

    2021  

    Abstract Human beings learn and accumulate hierarchical knowledge over their lifetime. This knowledge is associated with previous concepts for consolidation and hierarchical construction. However, current incremental learning methods lack the ability to build a concept hierarchy by associating new concepts with old ones. A more realistic setting that tackles this problem is referred to as Incremental Implicitly-Refined Classification (IIRC), which simulates the recognition process from coarse-grained categories to fine-grained categories. To overcome forgetting in this benchmark, we propose Hierarchy-Consistency Verification (HCV) as an enhancement to existing continual learning methods. Our method incrementally discovers the hierarchical relations between classes. We then show how this knowledge can be exploited during both training and inference. Experiments on three setups of varying difficulty demonstrate that our HCV module improves the performance of existing continual learning methods in the IIRC setting by a large margin. Code is available at https://github.com/wangkai930418/HCV_IIRC.

    Comment: accepted in BMVC 2021
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 006
    Publishing date 2021-10-21
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
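
A toy sketch of the hierarchy-consistency idea from this abstract: once a mapping from fine-grained to coarse-grained classes has been discovered, a fine prediction that contradicts the coarse prediction is replaced by the best-scoring consistent one. The class names and the correction rule are illustrative only; in the paper the hierarchy is discovered incrementally during training.

```python
HIERARCHY = {"labrador": "dog", "poodle": "dog", "siamese": "cat"}   # hypothetical classes

def verify_consistency(coarse_pred, fine_scores):
    """fine_scores: dict mapping fine class -> score. Returns a fine label consistent with coarse_pred."""
    consistent = {c: s for c, s in fine_scores.items() if HIERARCHY.get(c) == coarse_pred}
    candidates = consistent or fine_scores        # fall back if nothing is consistent
    return max(candidates, key=candidates.get)

print(verify_consistency("dog", {"labrador": 0.40, "poodle": 0.35, "siamese": 0.25}))
# -> "labrador"
```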
