LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 6 of 6

  1. Book ; Online: TAME

    Ntrougkas, Mariano / Gkalelis, Nikolaos / Mezaris, Vasileios

    Attention Mechanism Based Feature Fusion for Generating Explanation Maps of Convolutional Neural Networks

    2023  

    Abstract: The apparent "black box" nature of neural networks is a barrier to adoption in applications where explainability is essential. This paper presents TAME (Trainable Attention Mechanism for Explanations), a method for generating explanation maps with a multi-branch hierarchical attention mechanism. TAME combines a target model's feature maps from multiple layers using an attention mechanism, transforming them into an explanation map. TAME can easily be applied to any convolutional neural network (CNN) by streamlining the optimization of the attention mechanism's training method and the selection of the target model's feature maps. After training, explanation maps can be computed in a single forward pass. We apply TAME to two widely used models, i.e. VGG-16 and ResNet-50, trained on ImageNet and show improvements over previous top-performing methods. We also provide a comprehensive ablation study comparing the performance of different variations of TAME's architecture. TAME source code is made publicly available at https://github.com/bmezaris/TAME

    Comment: Accepted for publication in the proceedings of IEEE Int. Symposium on Multimedia (ISM), Naples, Italy, Dec. 2022
    Keywords: Computer Science - Computer Vision and Pattern Recognition
    Subject code: 629
    Publishing date: 2023-01-18
    Publishing country: US
    Document type: Book ; Online
    Database: BASE - Bielefeld Academic Search Engine (life sciences selection)

  2. Book ; Online: Gated-ViGAT

    Gkalelis, Nikolaos / Daskalakis, Dimitrios / Mezaris, Vasileios

    Efficient Bottom-Up Event Recognition and Explanation Using a New Frame Selection Policy and Gating Mechanism

    2023  

    Abstract: In this paper, Gated-ViGAT, an efficient approach for video event recognition, utilizing bottom-up (object) information, a new frame sampling policy and a gating mechanism is proposed. Specifically, the frame sampling policy uses weighted in-degrees (WiDs), derived from the adjacency matrices of graph attention networks (GATs), and a dissimilarity measure to select the most salient and at the same time diverse frames representing the event in the video. Additionally, the proposed gating mechanism fetches the selected frames sequentially, and exits early once a sufficiently confident decision is reached. In this way, only a few frames are processed by the computationally expensive branch of our network that is responsible for the bottom-up information extraction. The experimental evaluation on two large, publicly available video datasets (MiniKinetics, ActivityNet) demonstrates that Gated-ViGAT provides a large computational complexity reduction in comparison to our previous approach (ViGAT), while maintaining the excellent event recognition and explainability performance. Gated-ViGAT source code is made publicly available at https://github.com/bmezaris/Gated-ViGAT

    Comment: Accepted for publication in the proceedings of IEEE Int. Symposium on Multimedia (ISM), Naples, Italy, Dec. 2022
    Keywords: Computer Science - Computer Vision and Pattern Recognition
    Subject code: 004
    Publishing date: 2023-01-18
    Publishing country: US
    Document type: Book ; Online
    Database: BASE - Bielefeld Academic Search Engine (life sciences selection)
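
The sequential fetching and early exit described in the abstract of this record can be illustrated with a minimal sketch. It is not the authors' implementation (see their repository for that): the gate here is a plain confidence threshold, and the names `early_exit_predict`, `score_fn` and `threshold` are hypothetical, introduced only for illustration.

```python
def early_exit_predict(frames, score_fn, threshold=0.9):
    """Process frames one at a time and stop once the decision is confident.

    frames    -- frames already ordered from most to least salient (assumption)
    score_fn  -- hypothetical callable: list of frames -> (label, confidence)
    threshold -- hypothetical confidence level that triggers the early exit
    """
    processed = []
    label, confidence = None, 0.0
    for frame in frames:
        processed.append(frame)
        # Only the frames fetched so far reach the expensive scoring step.
        label, confidence = score_fn(processed)
        if confidence >= threshold:
            break  # early exit: the remaining frames are never processed
    return label, confidence, len(processed)


# Toy scorer: confidence grows with the number of frames seen.
def toy_score(seen):
    return "event", min(1.0, 0.4 * len(seen))


print(early_exit_predict(["f1", "f2", "f3", "f4"], toy_score))
# -> ('event', 1.0, 3)
```

In this toy run the fourth frame is never scored, which is the source of the computational saving the abstract refers to.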

  3. Book ; Online: Masked Feature Modelling

    Daskalakis, Dimitrios / Gkalelis, Nikolaos / Mezaris, Vasileios

    Feature Masking for the Unsupervised Pre-training of a Graph Attention Network Block for Bottom-up Video Event Recognition

    2023  

    Abstract: In this paper, we introduce Masked Feature Modelling (MFM), a novel approach for the unsupervised pre-training of a Graph Attention Network (GAT) block. MFM utilizes a pretrained Visual Tokenizer to reconstruct masked features of objects within a video, leveraging the MiniKinetics dataset. We then incorporate the pre-trained GAT block into a state-of-the-art bottom-up supervised video-event recognition architecture, ViGAT, to improve the model's starting point and overall accuracy. Experimental evaluations on the YLI-MED dataset demonstrate the effectiveness of MFM in improving event recognition performance.

    Comment: 8 pages
    Keywords: Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Machine Learning ; Computer Science - Multimedia
    Publishing date: 2023-08-24
    Publishing country: US
    Document type: Book ; Online
    Database: BASE - Bielefeld Academic Search Engine (life sciences selection)

  4. Book ; Online: Filter-Pruning of Lightweight Face Detectors Using a Geometric Median Criterion

    Gkrispanis, Konstantinos / Gkalelis, Nikolaos / Mezaris, Vasileios

    2023  

    Abstract: Face detectors are becoming a crucial component of many applications, including surveillance, that often have to run on edge devices with limited processing power and memory. Therefore, there's a pressing demand for compact face detection models that can function efficiently across resource-constrained devices. Over recent years, network pruning techniques have attracted a lot of attention from researchers. These methods haven't been well examined in the context of face detectors, despite their expanding popularity. In this paper, we implement filter pruning on two already small and compact face detectors, named EXTD (Extremely Tiny Face Detector) and EResFD (Efficient ResNet Face Detector). The main pruning algorithm that we utilize is Filter Pruning via Geometric Median (FPGM), combined with the Soft Filter Pruning (SFP) iterative procedure. We also apply L1 Norm pruning, as a baseline to compare with the proposed approach. The experimental evaluation on the WIDER FACE dataset indicates that the proposed approach has the potential to further reduce the model size of already lightweight face detectors, with limited accuracy loss, or even with small accuracy gain for low pruning rates.

    Comment: Accepted for publication in the IEEE/CVF WACV 2024 Workshops proceedings, Hawaii, USA, Jan. 2024
    Keywords: Computer Science - Computer Vision and Pattern Recognition
    Subject code: 006
    Publishing date: 2023-11-28
    Publishing country: US
    Document type: Book ; Online
    Database: BASE - Bielefeld Academic Search Engine (life sciences selection)
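
As a rough illustration of the geometric-median criterion named in this record's abstract, the sketch below ranks the filters of one layer by their summed Euclidean distance to all other filters and marks those with the smallest sum (the ones nearest the geometric median, hence the most replaceable) for pruning. It is a simplified stand-in rather than the paper's code: filters are assumed to be already flattened into plain lists of weights, the SFP training loop is omitted, and `fpgm_prune_indices` and `prune_ratio` are hypothetical names.

```python
import math


def fpgm_prune_indices(filters, prune_ratio):
    """Return the indices of the filters selected for pruning in one layer.

    filters     -- list of flattened filters (equal-length lists of floats)
    prune_ratio -- hypothetical fraction of filters to prune, in [0, 1)
    """
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")
    n_prune = int(len(filters) * prune_ratio)
    # Summed distance of each filter to every other filter in the layer.
    totals = [sum(math.dist(f, g) for g in filters) for f in filters]
    # A small total means the filter lies close to the geometric median.
    order = sorted(range(len(filters)), key=lambda i: totals[i])
    return sorted(order[:n_prune])


layer = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0], [0.4, 0.4]]
print(fpgm_prune_indices(layer, 0.2))
# -> [4]
```

Note that the outlier filter `[10.0, 10.0]` is kept. Unlike a norm-based baseline, this criterion removes redundant filters, not small or large ones.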

  5. Book ; Online: ViGAT

    Gkalelis, Nikolaos / Daskalakis, Dimitrios / Mezaris, Vasileios

    Bottom-up event recognition and explanation in video using factorized graph attention network

    2022  

    Abstract: In this paper, a pure-attention bottom-up approach, called ViGAT, that utilizes an object detector together with a Vision Transformer (ViT) backbone network to derive object and frame features, and a head network to process these features for the task of event recognition and explanation in video, is proposed. The ViGAT head consists of graph attention network (GAT) blocks factorized along the spatial and temporal dimensions in order to capture effectively both local and long-term dependencies between objects or frames. Moreover, using the weighted in-degrees (WiDs) derived from the adjacency matrices at the various GAT blocks, we show that the proposed architecture can identify the most salient objects and frames that explain the decision of the network. A comprehensive evaluation study is performed, demonstrating that the proposed approach provides state-of-the-art results on three large, publicly available video datasets (FCVID, Mini-Kinetics, ActivityNet).
    Keywords: Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Artificial Intelligence ; Computer Science - Machine Learning ; Computer Science - Multimedia
    Subject code: 004 ; 006
    Publishing date: 2022-07-20
    Publishing country: US
    Document type: Book ; Online
    Database: BASE - Bielefeld Academic Search Engine (life sciences selection)
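
The weighted in-degrees (WiDs) mentioned in this record's abstract can be sketched as column sums of an attention adjacency matrix. The convention that `adjacency[i][j]` holds the weight of the edge from node i to node j is an assumption made for this example, and `weighted_in_degrees` and `top_k_salient` are hypothetical helper names, not functions from the ViGAT code base.

```python
def weighted_in_degrees(adjacency):
    """Sum the incoming edge weights of every node.

    adjacency -- square matrix (list of rows); adjacency[i][j] is assumed
                 to be the attention weight of the edge from node i to node j
    """
    n = len(adjacency)
    return [sum(adjacency[i][j] for i in range(n)) for j in range(n)]


def top_k_salient(adjacency, k):
    """Indices of the k nodes (objects or frames) with the largest WiD."""
    wids = weighted_in_degrees(adjacency)
    order = sorted(range(len(wids)), key=lambda j: wids[j], reverse=True)
    return order[:k]


attention = [
    [0.1, 0.7, 0.2],
    [0.3, 0.4, 0.3],
    [0.2, 0.6, 0.2],
]
print(top_k_salient(attention, 2))
# -> [1, 2]
```

Node 1 receives the most attention from the other nodes, so under this reading it would be reported as the most salient object or frame.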

  6. Article: Mixture subclass discriminant analysis link to restricted Gaussian model and other generalizations.

    Gkalelis, Nikolaos / Mezaris, Vasileios / Kompatsiaris, Ioannis / Stathaki, Tania

    IEEE transactions on neural networks and learning systems

    2013, Volume 24, Issue 1, Pages 8–21

    Abstract: In this paper, a theoretical link between mixture subclass discriminant analysis (MSDA) and a restricted Gaussian model is first presented. Then, two further discriminant analysis (DA) methods, i.e., fractional step MSDA (FSMSDA) and kernel MSDA (KMSDA) are proposed. Linking MSDA to an appropriate Gaussian model allows the derivation of a new DA method under the expectation maximization (EM) framework (EM-MSDA), which simultaneously derives the discriminant subspace and the maximum likelihood estimates. The two other proposed methods generalize MSDA in order to solve problems inherited from conventional DA. FSMSDA solves the subclass separation problem, that is, the situation in which the dimensionality of the discriminant subspace is strictly smaller than the rank of the inter-between-subclass scatter matrix. This is done by an appropriate weighting scheme and the utilization of an iterative algorithm for preserving useful discriminant directions. On the other hand, KMSDA uses the kernel trick to separate data with nonlinearly separable subclass structure. Extensive experimentation shows that the proposed methods outperform conventional MSDA and other linear discriminant analysis variants.
    Language: English
    Publishing date: 2013-01
    Publishing country: United States
    Document type: Journal Article ; Research Support, Non-U.S. Gov't
    ISSN: 2162-237X
    DOI: 10.1109/TNNLS.2012.2216545
    Database: MEDical Literature Analysis and Retrieval System OnLINE
