LIVIVO - The Search Portal for Life Sciences

Search results

Results 1-8 of 8

  1. Book ; Online: CubeTR

    Chasmai, Mustafa Ebrahim

    Learning to Solve The Rubiks Cube Using Transformers

    2021  

    Abstract Since their first appearance, transformers have been successfully used in wide-ranging domains, from computer vision to natural language processing. Applying transformers to reinforcement learning by reformulating it as a sequence modelling problem was proposed only recently. Compared to other commonly explored reinforcement learning problems, the Rubik's cube poses a unique set of challenges: it has a single solved state among quintillions of possible configurations, which leads to extremely sparse rewards. The proposed model, CubeTR, attends to longer sequences of actions and addresses the problem of sparse rewards. CubeTR learns how to solve the Rubik's cube from arbitrary starting states without any human prior, and after move regularisation, the lengths of the solutions it generates are expected to be very close to those given by algorithms used by expert human solvers. CubeTR provides insights into the generalisability of learning algorithms to higher-dimensional cubes and the applicability of transformers in other relevant sparse-reward scenarios.

    Comment: Contains untested ideas without supporting experimentation; work in this direction has been discontinued.
    Keywords Computer Science - Machine Learning ; Computer Science - Artificial Intelligence
    Subject code 006
    Publishing date 2021-11-10
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

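The CubeTR abstract above describes casting Rubik's cube solving as sequence modelling with a transformer under extremely sparse rewards. The sketch below is only a hedged illustration of that framing, not the authors' CubeTR model: the sticker-based state encoding, the 12-move action set, and all dimensions are assumptions.

```python
# Illustrative sketch only: a transformer policy that maps a history of cube
# states to next-move logits. Tokenisation and sizes are assumptions, not the
# CubeTR architecture itself.
import torch
import torch.nn as nn

N_MOVES = 12          # assumed action set: quarter turns of the 6 faces, CW/CCW
STATE_DIM = 54 * 6    # 54 stickers, one-hot over 6 colours, flattened

class CubeSequencePolicy(nn.Module):
    def __init__(self, d_model=256, n_layers=4, n_heads=8, max_len=64):
        super().__init__()
        self.state_proj = nn.Linear(STATE_DIM, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, N_MOVES)

    def forward(self, states):
        # states: (batch, seq_len, STATE_DIM) one-hot encoded cube states
        b, t, _ = states.shape
        pos = torch.arange(t, device=states.device)
        x = self.state_proj(states) + self.pos_emb(pos)
        # causal mask: each step attends only to earlier states, matching the
        # autoregressive sequence-modelling view of RL mentioned in the abstract
        mask = torch.triu(
            torch.full((t, t), float("-inf"), device=states.device), diagonal=1
        )
        h = self.encoder(x, mask=mask)
        return self.head(h)  # (batch, seq_len, N_MOVES) next-move logits

# Usage: greedy next-move prediction for a batch of scrambled trajectories
policy = CubeSequencePolicy()
dummy_states = torch.zeros(2, 10, STATE_DIM)
next_move = policy(dummy_states)[:, -1].argmax(dim=-1)
```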

  2. Article ; Online: Robust Prototypical Few-Shot Organ Segmentation With Regularized Neural-ODEs.

    Pandey, Prashant / Chasmai, Mustafa / Sur, Tanuj / Lall, Brejesh

    IEEE transactions on medical imaging

    2023  Volume 42, Issue 9, Page(s) 2490–2501

    Abstract Despite the tremendous progress made by deep learning models in image semantic segmentation, they typically require large numbers of annotated examples, and increasing attention is being diverted to problem settings like Few-Shot Learning (FSL) where only a small amount of annotation is needed for generalisation to novel classes. This is especially seen in medical domains where dense pixel-level annotations are expensive to obtain. In this paper, we propose Regularized Prototypical Neural Ordinary Differential Equation (R-PNODE), a method that leverages intrinsic properties of Neural-ODEs, assisted and enhanced by additional cluster and consistency losses, to perform Few-Shot Segmentation (FSS) of organs. R-PNODE constrains support and query features from the same classes to lie closer in the representation space, thereby improving the performance over the existing Convolutional Neural Network (CNN) based FSS methods. We further demonstrate that while many existing Deep CNN-based methods tend to be extremely vulnerable to adversarial attacks, R-PNODE exhibits increased adversarial robustness for a wide array of these attacks. We experiment with three publicly available multi-organ segmentation datasets in both in-domain and cross-domain FSS settings to demonstrate the efficacy of our method. In addition, we perform experiments with seven commonly used adversarial attacks in various settings to demonstrate R-PNODE's robustness. R-PNODE outperforms the baselines for FSS by significant margins and also shows superior performance for a wide array of attacks varying in intensity and design.
    MeSH term(s) Image Processing, Computer-Assisted/methods ; Neural Networks, Computer ; Semantics
    Language English
    Publishing date 2023-08-31
    Publishing country United States
    Document type Journal Article
    ZDB-ID 622531-7
    ISSN 1558-254X ; 0278-0062
    ISSN (online) 1558-254X
    ISSN 0278-0062
    DOI 10.1109/TMI.2023.3258069
    Database MEDical Literature Analysis and Retrieval System OnLINE

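The abstract of this entry (and of entry 6, its preprint) centres on prototypical few-shot segmentation with additional cluster and consistency losses. The following is a minimal, hedged sketch of the prototypical step only: the Neural-ODE backbone is replaced by a plain CNN for brevity, and the regulariser shown is a generic formulation rather than R-PNODE's exact losses.

```python
# Illustrative sketch: support features are pooled into class prototypes and
# query pixels are labelled by feature similarity to those prototypes.
import torch
import torch.nn as nn
import torch.nn.functional as F

backbone = nn.Sequential(            # stand-in feature extractor (not a Neural-ODE)
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
)

def masked_average_pool(feats, mask):
    # feats: (B, C, H, W); mask: (B, 1, H, W) binary foreground mask
    mask = F.interpolate(mask, size=feats.shape[-2:], mode="nearest")
    return (feats * mask).sum(dim=(2, 3)) / (mask.sum(dim=(2, 3)) + 1e-6)

def prototype_logits(query_feats, prototypes, scale=20.0):
    # cosine similarity between every query pixel and each class prototype
    q = F.normalize(query_feats, dim=1)                    # (B, C, H, W)
    p = F.normalize(prototypes, dim=1)                     # (K, C)
    return scale * torch.einsum("bchw,kc->bkhw", q, p)     # (B, K, H, W)

# One 1-shot episode: a support image with its mask, and a query image.
support_img = torch.randn(1, 1, 64, 64)
support_mask = (torch.randn(1, 1, 64, 64) > 0).float()
query_img = torch.randn(1, 1, 64, 64)

s_feats, q_feats = backbone(support_img), backbone(query_img)
fg_proto = masked_average_pool(s_feats, support_mask)          # (1, 64)
bg_proto = masked_average_pool(s_feats, 1.0 - support_mask)    # (1, 64)
prototypes = torch.cat([bg_proto, fg_proto], dim=0)            # (2, 64)

logits = prototype_logits(q_feats, prototypes)                 # (1, 2, H, W)
pred = logits.argmax(dim=1)                                    # query segmentation

# A generic cluster-style regulariser: pull foreground query features towards
# the foreground prototype, in the spirit of (but not identical to) the
# paper's cluster and consistency losses.
fg_query = masked_average_pool(q_feats, (pred == 1).float().unsqueeze(1))
cluster_loss = F.mse_loss(fg_query, fg_proto)
```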

  3. Article ; Online: A View Independent Classification Framework for Yoga Postures.

    Chasmai, Mustafa / Das, Nirjhar / Bhardwaj, Aman / Garg, Rahul

    SN computer science

    2022  Volume 3, Issue 6, Page(s) 476

    Abstract Yoga is a globally acclaimed and widely recommended practice for healthy living. Maintaining correct posture while performing a Yogasana is of utmost importance. In this work, we employ transfer learning from human pose estimation models to extract 136 key-points spread over the body and train a random forest classifier, which is used for classification of the Yogasanas. The results are evaluated on an in-house collected, extensive yoga video database of 51 subjects recorded from four different camera angles. We use a three-step scheme for evaluating the generalizability of a Yoga classifier by testing it on (1) unseen frames, (2) unseen subjects, and (3) unseen camera angles. We argue that for most applications, validation accuracies on unseen subjects and unseen camera angles would be most important. Across three public datasets, we empirically analyze the advantage of transfer learning and the possibility of target leakage. We further demonstrate that classification accuracies critically depend on the cross-validation method employed and can often be misleading. To promote further research, we have made the key-points dataset and code publicly available.
    Supplementary information: The online version contains supplementary material available at 10.1007/s42979-022-01376-7.
    Language English
    Publishing date 2022-09-13
    Publishing country Singapore
    Document type Journal Article
    ISSN 2661-8907
    ISSN (online) 2661-8907
    DOI 10.1007/s42979-022-01376-7
    Database MEDical Literature Analysis and Retrieval System OnLINE

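The abstract above stresses that reported accuracy depends heavily on the cross-validation protocol, in particular whether validation subjects were unseen during training. The sketch below illustrates that evaluation idea with a random forest over pose key-point features; the synthetic feature matrix and classifier settings are assumptions, not the paper's pipeline.

```python
# Illustrative sketch: pose key-points as a feature vector for a random forest,
# with cross-validation grouped by subject so that validation frames never come
# from a subject seen in training (the "unseen subjects" protocol).
# Key-point extraction (a pose-estimation model producing 136 points) is assumed
# to have already produced the feature matrix below.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(0)
n_frames, n_keypoints = 2000, 136
X = rng.normal(size=(n_frames, n_keypoints * 2))   # (x, y) per key-point
y = rng.integers(0, 10, size=n_frames)             # 10 hypothetical asana labels
subjects = rng.integers(0, 51, size=n_frames)      # 51 subjects, as in the paper

clf = RandomForestClassifier(n_estimators=200, random_state=0)

# Grouped CV: frames of the same subject never appear in both train and test,
# which is the validation the abstract argues matters most in practice.
scores = cross_val_score(clf, X, y, cv=GroupKFold(n_splits=5), groups=subjects)
print("unseen-subject accuracy:", scores.mean())
```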

  4. Book ; Online: Person Re-Identification

    Chasmai, Mustafa Ebrahim / Banerjee, Tamajit

    2022  

    Abstract Person Re-Identification (Re-ID) is an important problem in computer vision-based surveillance applications, in which one aims to identify a person across different surveillance photographs taken from different cameras having varying orientations and fields of view. Due to the increasing demand for intelligent video surveillance, Re-ID has gained significant interest in the computer vision community. In this work, we experiment with some existing Re-ID methods that obtain state-of-the-art performance on some open benchmarks. We qualitatively and quantitatively analyse their performance on a provided dataset and then propose methods to improve the results. This work was the report submitted for the COL780 final project at IIT Delhi.
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Publishing date 2022-04-27
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  5. Book ; Online: A Language-Guided Benchmark for Weakly Supervised Open Vocabulary Semantic Segmentation

    Pandey, Prashant / Chasmai, Mustafa / Natarajan, Monish / Lall, Brejesh

    2023  

    Abstract Increasing attention is being diverted to data-efficient problem settings like Open Vocabulary Semantic Segmentation (OVSS) which deals with segmenting an arbitrary object that may or may not be seen during training. The closest standard problems related to OVSS are Zero-Shot and Few-Shot Segmentation (ZSS, FSS) and their Cross-dataset variants where zero to few annotations are needed to segment novel classes. The existing FSS and ZSS methods utilize fully supervised pixel-labelled seen classes to segment unseen classes. Pixel-level labels are hard to obtain, and using weak supervision in the form of inexpensive image-level labels is often more practical. To this end, we propose a novel unified weakly supervised OVSS pipeline that can perform ZSS, FSS and Cross-dataset segmentation on novel classes without using pixel-level labels for either the base (seen) or the novel (unseen) classes in an inductive setting. We propose Weakly-Supervised Language-Guided Segmentation Network (WLSegNet), a novel language-guided segmentation pipeline that i) learns generalizable context vectors with batch aggregates (mean) to map class prompts to image features using frozen CLIP (a vision-language model) and ii) decouples weak ZSS/FSS into weak semantic segmentation and Zero-Shot segmentation. The learned context vectors avoid overfitting on seen classes during training and transfer better to novel classes during testing. WLSegNet avoids fine-tuning and the use of external datasets during training. The proposed pipeline beats existing methods for weak generalized Zero-Shot and weak Few-Shot semantic segmentation by 39 and 3 mIOU points respectively on PASCAL VOC and weak Few-Shot semantic segmentation by 5 mIOU points on MS COCO. On a harder setting of 2-way 1-shot weak FSS, WLSegNet beats the baselines by 13 and 22 mIOU points on PASCAL VOC and MS COCO, respectively.
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 004
    Publishing date 2023-02-27
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

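The WLSegNet abstract describes learning generalizable context vectors that map class prompts to image features while the vision-language backbone (CLIP) stays frozen. The sketch below illustrates that prompt-learning idea in isolation with toy stand-in encoders; it is not the WLSegNet pipeline, and the aggregation, dimensions, and loss are assumptions.

```python
# Illustrative sketch: learnable context vectors are prepended to each frozen
# class-name embedding and mapped to a text feature; only the context vectors
# are trained, using weak image-level labels.
import torch
import torch.nn as nn
import torch.nn.functional as F

D, N_CTX, N_CLASSES = 512, 8, 20

class PromptLearner(nn.Module):
    def __init__(self, class_name_embeddings):
        super().__init__()
        # learnable context vectors shared across classes
        self.ctx = nn.Parameter(torch.randn(N_CTX, D) * 0.02)
        # frozen per-class name embeddings, shape (N_CLASSES, D)
        self.register_buffer("names", class_name_embeddings)

    def forward(self):
        ctx = self.ctx.unsqueeze(0).expand(N_CLASSES, -1, -1)   # (K, N_CTX, D)
        prompts = torch.cat([ctx, self.names.unsqueeze(1)], dim=1)
        return prompts.mean(dim=1)  # simple aggregate: one text feature per class

# Frozen stand-ins for the text/image branches of a vision-language model.
class_name_emb = torch.randn(N_CLASSES, D)
prompt_learner = PromptLearner(class_name_emb)
image_features = torch.randn(4, D)                 # from a frozen image encoder
image_labels = torch.randint(0, N_CLASSES, (4,))   # weak, image-level labels only

text_features = F.normalize(prompt_learner(), dim=-1)       # (K, D)
image_features = F.normalize(image_features, dim=-1)        # (B, D)
logits = 100.0 * image_features @ text_features.t()         # CLIP-style scaling

# Only the context vectors receive gradients; the backbone features are frozen.
loss = F.cross_entropy(logits, image_labels)
loss.backward()
print(prompt_learner.ctx.grad.shape)   # (N_CTX, D): the learnable part
```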

  6. Book ; Online: Robust Prototypical Few-Shot Organ Segmentation with Regularized Neural-ODEs

    Pandey, Prashant / Chasmai, Mustafa / Sur, Tanuj / Lall, Brejesh

    2022  

    Abstract Despite the tremendous progress made by deep learning models in image semantic segmentation, they typically require large numbers of annotated examples, and increasing attention is being diverted to problem settings like Few-Shot Learning (FSL) where only a small amount of annotation is needed for generalisation to novel classes. This is especially seen in medical domains where dense pixel-level annotations are expensive to obtain. In this paper, we propose Regularized Prototypical Neural Ordinary Differential Equation (R-PNODE), a method that leverages intrinsic properties of Neural-ODEs, assisted and enhanced by additional cluster and consistency losses, to perform Few-Shot Segmentation (FSS) of organs. R-PNODE constrains support and query features from the same classes to lie closer in the representation space, thereby improving the performance over the existing Convolutional Neural Network (CNN) based FSS methods. We further demonstrate that while many existing Deep CNN-based methods tend to be extremely vulnerable to adversarial attacks, R-PNODE exhibits increased adversarial robustness for a wide array of these attacks. We experiment with three publicly available multi-organ segmentation datasets in both in-domain and cross-domain FSS settings to demonstrate the efficacy of our method. In addition, we perform experiments with seven commonly used adversarial attacks in various settings to demonstrate R-PNODE's robustness. R-PNODE outperforms the baselines for FSS by significant margins and also shows superior performance for a wide array of attacks varying in intensity and design.
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Artificial Intelligence
    Subject code 006
    Publishing date 2022-08-25
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  7. Book ; Online: A View Independent Classification Framework for Yoga Postures

    Chasmai, Mustafa / Das, Nirjhar / Bhardwaj, Aman / Garg, Rahul

    2022  

    Abstract Yoga is a globally acclaimed and widely recommended practice for healthy living. Maintaining correct posture while performing a Yogasana is of utmost importance. In this work, we employ transfer learning from Human Pose Estimation models to extract 136 key-points spread over the body and train a Random Forest classifier, which is used for classification of the Yogasanas. The results are evaluated on an in-house collected, extensive yoga video database of 51 subjects recorded from 4 different camera angles. We propose a 3-step scheme for evaluating the generalizability of a Yoga classifier by testing it on 1) unseen frames, 2) unseen subjects, and 3) unseen camera angles. We argue that for most applications, validation accuracies on unseen subjects and unseen camera angles would be most important. Across three public datasets, we empirically analyze the advantage of transfer learning and the possibility of target leakage. We further demonstrate that classification accuracies critically depend on the cross-validation method employed and can often be misleading. To promote further research, we have made the key-points dataset and code publicly available.
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Artificial Intelligence ; Computer Science - Machine Learning
    Subject code 004
    Publishing date 2022-06-27
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  8. Book ; Online: From Forks to Forceps

    Baby, Britty / Thapar, Daksh / Chasmai, Mustafa / Banerjee, Tamajit / Dargan, Kunal / Suri, Ashish / Banerjee, Subhashis / Arora, Chetan

    A New Framework for Instance Segmentation of Surgical Instruments

    2022  

    Abstract Minimally invasive surgeries and related applications demand surgical tool classification and segmentation at the instance level. Surgical tools are similar in appearance and are long, thin, and handled at an angle. Fine-tuned state-of-the-art (SOTA) instance segmentation models trained on natural images have difficulty discriminating instrument classes. Our research demonstrates that while the bounding box and segmentation mask are often accurate, the classification head misclassifies the class label of the surgical instrument. We present a new neural network framework that adds a classification module as a new stage to existing instance segmentation models. This module specializes in improving the classification of instrument masks generated by the existing model. The module comprises multi-scale mask attention, which attends to the instrument region and masks out the distracting background features. We propose training our classifier module using metric learning with arc loss to handle the low inter-class variance of surgical instruments. We conduct exhaustive experiments on the benchmark datasets EndoVis2017 and EndoVis2018. We demonstrate that our method outperforms all (more than 18) SOTA methods it is compared with, improves the SOTA performance by at least 12 points (20%) on the EndoVis2017 benchmark challenge, and generalizes effectively across the datasets.

    Comment: WACV 2023
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Artificial Intelligence
    Subject code 006
    Publishing date 2022-11-26
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

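Two ingredients named in the abstract above, masking out background features with the predicted instance mask and metric learning with an arc (ArcFace-style) margin loss, are standard building blocks that can be sketched generically. The code below is such a generic sketch, not the paper's multi-scale mask-attention classification module; the feature dimensions and class count are assumptions.

```python
# Illustrative sketch: (i) pool features inside the predicted instance mask,
# (ii) classify the pooled vector with an ArcFace-style angular-margin loss,
# which helps when inter-class variance is low (similar-looking instruments).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ArcMarginHead(nn.Module):
    def __init__(self, feat_dim, n_classes, s=30.0, m=0.5):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(n_classes, feat_dim) * 0.01)
        self.s, self.m = s, m

    def forward(self, feats, labels):
        # cosine similarity between normalised features and class weights
        cos = F.linear(F.normalize(feats), F.normalize(self.weight))
        cos = cos.clamp(-1 + 1e-7, 1 - 1e-7)
        theta = torch.acos(cos)
        # add the angular margin only on the ground-truth class
        target = F.one_hot(labels, cos.size(1)).bool()
        cos_margin = torch.where(target, torch.cos(theta + self.m), cos)
        return F.cross_entropy(self.s * cos_margin, labels)

# Mask attention in its simplest form: keep only features inside the instance
# mask produced by the base segmentation model, then pool them into one vector.
def masked_instance_feature(feature_map, instance_mask):
    # feature_map: (B, C, H, W); instance_mask: (B, 1, H, W) in [0, 1]
    attended = feature_map * instance_mask
    return attended.sum(dim=(2, 3)) / (instance_mask.sum(dim=(2, 3)) + 1e-6)

feature_map = torch.randn(2, 256, 32, 32)
instance_mask = (torch.randn(2, 1, 32, 32) > 0).float()
labels = torch.tensor([3, 7])                      # hypothetical instrument classes

head = ArcMarginHead(feat_dim=256, n_classes=10)
loss = head(masked_instance_feature(feature_map, instance_mask), labels)
loss.backward()
```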
