LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 10 of 50 in total

  1. Article ; Online: Robust Prototypical Few-Shot Organ Segmentation With Regularized Neural-ODEs.

    Pandey, Prashant / Chasmai, Mustafa / Sur, Tanuj / Lall, Brejesh

    IEEE transactions on medical imaging

    2023  Volume 42, Issue 9, Page(s) 2490–2501

    Abstract Despite the tremendous progress made by deep learning models in image semantic segmentation, they typically require large annotated examples, and increasing attention is being diverted to problem settings like Few-Shot Learning (FSL) where only a small amount of annotation is needed for generalisation to novel classes. This is especially seen in medical domains where dense pixel-level annotations are expensive to obtain. In this paper, we propose Regularized Prototypical Neural Ordinary Differential Equation (R-PNODE), a method that leverages intrinsic properties of Neural-ODEs, assisted and enhanced by additional cluster and consistency losses to perform Few-Shot Segmentation (FSS) of organs. R-PNODE constrains support and query features from the same classes to lie closer in the representation space thereby improving the performance over the existing Convolutional Neural Network (CNN) based FSS methods. We further demonstrate that while many existing Deep CNN-based methods tend to be extremely vulnerable to adversarial attacks, R-PNODE exhibits increased adversarial robustness for a wide array of these attacks. We experiment with three publicly available multi-organ segmentation datasets in both in-domain and cross-domain FSS settings to demonstrate the efficacy of our method. In addition, we perform experiments with seven commonly used adversarial attacks in various settings to demonstrate R-PNODE's robustness. R-PNODE outperforms the baselines for FSS by significant margins and also shows superior performance for a wide array of attacks varying in intensity and design.
    MeSH term(s) Image Processing, Computer-Assisted/methods ; Neural Networks, Computer ; Semantics
    Language English
    Publishing date 2023-08-31
    Publishing country United States
    Document type Journal Article
    ZDB-ID 622531-7
    ISSN 1558-254X ; 0278-0062
    ISSN (online) 1558-254X
    ISSN 0278-0062
    DOI 10.1109/TMI.2023.3258069
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Article ; Online: OptiDistillNet: Learning nonlinear pulse propagation using the student-teacher model.

    Gautam, Naveenta / Kaushik, Vinay / Choudhary, Amol / Lall, Brejesh

    Optics express

    2022  Volume 30, Issue 23, Page(s) 42430–42439

    Abstract We present a unique approach for learning the pulse evolution in a nonlinear fiber using a deep convolutional neural network (CNN) by solving the nonlinear Schrödinger equation (NLSE). Deep network model compression has become widespread for deploying such models in real-world applications. A knowledge distillation (KD) based framework for compressing a CNN is presented here. The student network, termed here OptiDistillNet, generalises better, converges faster, runs faster, and uses fewer trainable parameters. This work represents the first effort, to the best of our knowledge, that successfully applies a KD-based technique to any nonlinear optics application. Our tests show that even by reducing the model size by up to 91.2%, we can still achieve a mean square error (MSE) which is very close to the MSE of 1.04*10
    Language English
    Publishing date 2022-11-10
    Publishing country United States
    Document type Journal Article
    ZDB-ID 1491859-6
    ISSN (online) 1094-4087
    DOI 10.1364/OE.463450
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article ; Online: Satellite hyperspectral imaging technology as a potential rapid pollution assessment tool for urban landfill sites: case study of Ghazipur and Okhla landfill sites in Delhi, India.

    Dutta, Amitava / Chaudhary, Priya / Sharma, Shilpi / Lall, Brejesh

    Environmental science and pollution research international

    2022  Volume 30, Issue 55, Page(s) 116742–116750

    Abstract Hyperspectral imaging technology has been used for biochemical analysis of Earth's surface, exploiting the spectral reflectance signatures of various materials. The new-generation Italian PRISMA (PRecursore IperSpettrale della Missione Applicativa) hyperspectral satellite launched by the Italian Space Agency (ASI) provides a unique opportunity to map various materials through spectral signature analysis for resource management and sustainable development. In this study, multiple spectral indices based on PRISMA hyperspectral satellite imagery were generated for rapid pollution assessment at the Ghazipur and Okhla landfill sites in Delhi, India. It was found that the combined risk score for the Okhla landfill site was higher than that for the Ghazipur landfill site. Various manmade materials identified by exploiting the hyperspectral imagery and spectral signature libraries indicated the presence of highly saline water, plastic (black, ABS, pipe, netting, etc.), asphalt tar, black tar paper, kerogen BK-Cornell, black paint and graphite, chalcocite minerals, etc. in large quantities at both landfill sites. The methodology provides a rapid pollution assessment tool for municipal landfill sites.
    MeSH term(s) Waste Disposal Facilities ; Hyperspectral Imaging ; India ; Satellite Imagery ; Biological Products
    Chemical Substances Biological Products
    Language English
    Publishing date 2022-08-18
    Publishing country Germany
    Document type Journal Article
    ZDB-ID 1178791-0
    ISSN 1614-7499 ; 0944-1344
    ISSN (online) 1614-7499
    ISSN 0944-1344
    DOI 10.1007/s11356-022-22421-1
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Book ; Online: A Language-Guided Benchmark for Weakly Supervised Open Vocabulary Semantic Segmentation

    Pandey, Prashant / Chasmai, Mustafa / Natarajan, Monish / Lall, Brejesh

    2023  

    Abstract Increasing attention is being diverted to data-efficient problem settings like Open Vocabulary Semantic Segmentation (OVSS) which deals with segmenting an arbitrary object that may or may not be seen during training. The closest standard problems related to OVSS are Zero-Shot and Few-Shot Segmentation (ZSS, FSS) and their Cross-dataset variants where zero to few annotations are needed to segment novel classes. The existing FSS and ZSS methods utilize fully supervised pixel-labelled seen classes to segment unseen classes. Pixel-level labels are hard to obtain, and using weak supervision in the form of inexpensive image-level labels is often more practical. To this end, we propose a novel unified weakly supervised OVSS pipeline that can perform ZSS, FSS and Cross-dataset segmentation on novel classes without using pixel-level labels for either the base (seen) or the novel (unseen) classes in an inductive setting. We propose Weakly-Supervised Language-Guided Segmentation Network (WLSegNet), a novel language-guided segmentation pipeline that i) learns generalizable context vectors with batch aggregates (mean) to map class prompts to image features using frozen CLIP (a vision-language model) and ii) decouples weak ZSS/FSS into weak semantic segmentation and Zero-Shot segmentation. The learned context vectors avoid overfitting on seen classes during training and transfer better to novel classes during testing. WLSegNet avoids fine-tuning and the use of external datasets during training. The proposed pipeline beats existing methods for weak generalized Zero-Shot and weak Few-Shot semantic segmentation by 39 and 3 mIOU points respectively on PASCAL VOC and weak Few-Shot semantic segmentation by 5 mIOU points on MS COCO. On a harder setting of 2-way 1-shot weak FSS, WLSegNet beats the baselines by 13 and 22 mIOU points on PASCAL VOC and MS COCO, respectively.
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 004
    Publishing date 2023-02-27
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  5. Book ; Online: Knowledge Distillation in Vision Transformers: A Critical Review

    Habib, Gousia / Saleem, Tausifa Jan / Lall, Brejesh

    2023  

    Abstract In Natural Language Processing (NLP), Transformers have already revolutionized the field by utilizing an attention-based encoder-decoder model. Recently, some pioneering works have employed Transformer-like architectures in Computer Vision (CV) and have reported outstanding performance of these architectures in tasks such as image classification, object detection, and semantic segmentation. Vision Transformers (ViTs) have demonstrated impressive performance improvements over Convolutional Neural Networks (CNNs) due to their competitive modelling capabilities. However, these architectures demand massive computational resources, which makes these models difficult to deploy in resource-constrained applications. Many solutions have been developed to combat this issue, such as compressive transformers and compression functions such as dilated convolution, min-max pooling, 1D convolution, etc. Model compression has recently attracted considerable research attention as a potential remedy. A number of model compression methods have been proposed in the literature, such as weight quantization, weight multiplexing, pruning and Knowledge Distillation (KD). However, techniques like weight quantization, pruning and weight multiplexing typically involve complex pipelines for performing the compression. KD has been found to be a simple and highly effective model compression technique that allows a relatively simple model to perform tasks almost as accurately as a complex model. This paper discusses various approaches based upon KD for effective compression of ViT models. The paper elucidates the role played by KD in reducing the computational and memory requirements of these models. The paper also presents the various challenges faced by ViTs that are yet to be resolved.

    Comment: 28 pages, 16 figures
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 004
    Publishing date 2023-02-04
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  6. Book ; Online: Style Description based Text-to-Speech with Conditional Prosodic Layer Normalization based Diffusion GAN

    Kumar, Neeraj / Narang, Ankur / Lall, Brejesh

    2023  

    Abstract In this paper, we present a Diffusion GAN based approach (Prosodic Diff-TTS) that takes a style description and content text as input and generates high-fidelity speech samples within only 4 denoising steps. It leverages the novel conditional prosodic layer normalization to incorporate the style embeddings into a generator architecture built from a multi-head attention based phoneme encoder and a mel spectrogram decoder. The style embedding is generated by fine-tuning the pretrained BERT model on auxiliary tasks such as pitch, speaking speed, emotion, and gender classification. We demonstrate the efficacy of our proposed architecture on the multi-speaker LibriTTS and PromptSpeech datasets, using multiple quantitative metrics that measure generated accuracy and MOS.
    Keywords Computer Science - Sound ; Computer Science - Computation and Language ; Electrical Engineering and Systems Science - Audio and Speech Processing
    Subject code 400
    Publishing date 2023-10-27
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  7. Book ; Online: Single Stage Warped Cloth Learning and Semantic-Contextual Attention Feature Fusion for Virtual TryOn

    Pathak, Sanhita / Kaushik, Vinay / Lall, Brejesh

    2023  

    Abstract Image-based virtual try-on aims to fit an in-shop garment onto a clothed person image. Garment warping, which aligns the target garment with the corresponding body parts in the person image, is a crucial step in achieving this goal. Existing methods often use multi-stage frameworks to handle clothes warping, person body synthesis and try-on generation separately, or rely on noisy intermediate parser-based labels. We propose a novel single-stage framework that implicitly learns the same without explicit multi-stage learning. Our approach utilizes a novel semantic-contextual fusion attention module for garment-person feature fusion, enabling efficient and realistic cloth warping and body synthesis from target pose keypoints. By introducing a lightweight linear attention framework that attends to garment regions and fuses multiple sampled flow fields, we also address misalignment and artifacts present in previous methods. To achieve simultaneous learning of warped garment and try-on results, we introduce a Warped Cloth Learning Module (WCLM). WCLM uses segmented warped garments as ground truth, operating within a single-stage paradigm. Our proposed approach significantly improves the quality and efficiency of virtual try-on methods, providing users with a more reliable and realistic virtual try-on experience. We evaluate our method on the VITON dataset and demonstrate its state-of-the-art performance in terms of both qualitative and quantitative metrics.

    Comment: 8 pages, 4 figures
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 006
    Publishing date 2023-10-08
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  8. Book ; Online: Distilling Inductive Bias: Knowledge Distillation Beyond Model Compression

    Habib, Gousia / Saleem, Tausifa Jan / Lall, Brejesh

    2023  

    Abstract With the rapid development of computer vision, Vision Transformers (ViTs) offer the tantalizing prospect of unified information processing across visual and textual domains. But due to the lack of inherent inductive biases in ViTs, they require an enormous amount of data for training. To make their applications practical, we introduce an innovative ensemble-based distillation approach that distills inductive bias from complementary lightweight teacher models. Prior systems relied solely on convolution-based teaching. However, this method incorporates an ensemble of light teachers with different architectural tendencies, such as convolution and involution, to instruct the student transformer jointly. Because of these unique inductive biases, instructors can accumulate a wide range of knowledge, even from readily identifiable stored datasets, which leads to enhanced student performance. Our proposed framework also involves precomputing and storing the logits, which are essentially the unnormalized predictions of the model. This optimization can accelerate the distillation process by eliminating the need for repeated forward passes during knowledge distillation, significantly reducing the computational burden and enhancing efficiency.
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Machine Learning
    Subject code 004
    Publishing date 2023-09-30
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  9. Article ; Online: Riemannian Curvature of Deep Neural Networks.

    Kaul, Piyush / Lall, Brejesh

    IEEE transactions on neural networks and learning systems

    2019  Volume 31, Issue 4, Page(s) 1410–1416

    Abstract We analyze deep neural networks using the theory of Riemannian geometry and curvature. The objective is to gain insight into how Riemannian geometry can characterize and predict the trained behavior of neural networks. We define a method for calculating Riemann and Ricci curvature tensors, and Ricci scalar curvature values, for a trained neural net, in such a way that the output classifier softmax values are related to the input transformations through the curvature equations. We also measure these curvature tensors experimentally for different networks pretrained with stochastic gradient descent, and offer a way of visualizing and understanding the measurements to gain insight into the effect curvature has on the local behavior of the neural networks, and possibly to predict their behavior for different transformations of the test data. We also analyze the effect of varying the depth of the neural networks, as well as how the curvature behaves for different choices of data set.
    MeSH term(s) Algorithms ; Databases, Factual/statistics & numerical data ; Deep Learning/statistics & numerical data ; Neural Networks, Computer
    Language English
    Publishing date 2019-06-26
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ISSN 2162-2388
    ISSN (online) 2162-2388
    DOI 10.1109/TNNLS.2019.2919705
    Database MEDical Literature Analysis and Retrieval System OnLINE

  10. Book ; Online: On the Optimal Beamwidth of UAV-Assisted Networks Operating at Millimeter Waves

    Rawat, Manishika / Giordani, Marco / Lall, Brejesh / Chaoub, Abdelaali / Zorzi, Michele

    2023  

    Abstract The millimeter-wave (mm-wave) bands enable very large antenna arrays that can generate narrow beams for beamforming and spatial multiplexing. However, directionality introduces beam misalignment and leads to reduced energy efficiency. Thus, employing the narrowest possible beam in a cell may not necessarily imply maximum coverage. The objective of this work is to determine the optimal sector beamwidth for a cellular architecture served by an unmanned aerial vehicle (UAV) acting as a base station (BS). The users in a cell are assumed to be distributed according to a Poisson Point Process (PPP) with a given user density. We consider hybrid beamforming at the UAV, such that multiple concurrent beams serve all the sectors simultaneously. An optimization problem is formulated to maximize the sum rate over a given area while limiting the total power available to each sector. We observe that, for a given transmit power, the optimal sector beamwidth increases as the user density in a cell decreases, and varies based on the height of the UAV. Thus, we provide guidelines towards the optimal beamforming configurations for users in rural areas.

    Comment: 7 pages, 7 figures
    Keywords Computer Science - Information Theory ; Computer Science - Networking and Internet Architecture
    Subject code 003
    Publishing date 2023-01-26
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
