LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 10 of 14

  1. Article ; Online: DINI: data imputation using neural inversion for edge applications.

    Tuli, Shikhar / Jha, Niraj K

    Scientific reports

    2022  Volume 12, Issue 1, Page(s) 20210

    Abstract The edge computing paradigm has recently drawn significant attention from industry and academia. Due to the advantages in quality-of-service metrics, namely, latency, bandwidth, energy efficiency, privacy, and security, deploying artificial intelligence (AI) models at the network edge has attracted widespread interest. Edge-AI has seen applications in diverse domains that involve large amounts of data. However, poor dataset quality plagues this compute regime owing to numerous data corruption sources, including missing data. As such systems are increasingly being deployed in mission-critical applications, mitigating the effects of corrupted data becomes important. In this work, we propose a strategy based on data imputation using neural inversion, DINI. It trains a surrogate model and runs data imputation in an interleaved fashion. Unlike previous works, DINI is a model-agnostic framework applicable to diverse deep learning architectures. DINI outperforms state-of-the-art methods by at least 10.7% in average imputation error. Applying DINI to mission-critical applications can increase prediction accuracy to up to 99% (F1 score of 0.99), resulting in significant gains compared to baseline methods.
    MeSH term(s) Humans ; Artificial Intelligence ; Benchmarking ; Chromosome Inversion ; Industry ; Privacy
    Language English
    Publishing date 2022-11-23
    Publishing country England
    Document type Journal Article
    ZDB-ID 2615211-3
    ISSN 2045-2322
    ISSN (online) 2045-2322
    DOI 10.1038/s41598-022-24369-1
    Database MEDical Literature Analysis and Retrieval System OnLINE
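
    The interleaved train-then-impute loop the abstract describes can be sketched in a few lines of Python. Note that this sketch substitutes a simple per-column regression surrogate for DINI's gradient-based neural inversion, so it conveys only the loop structure; every name and parameter here is illustrative, not taken from the authors' code.

    ```python
    # Surrogate-assisted iterative imputation, loosely in the spirit of DINI's
    # interleaved loop (illustrative sketch only; not the authors' method).
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    def interleaved_impute(X, mask, n_rounds=5):
        """X: data matrix with NaNs at missing entries; mask: True where missing."""
        X_hat = np.where(mask, np.nanmean(X, axis=0), X)   # crude mean-fill start
        for _ in range(n_rounds):                          # interleave: fit, impute
            for j in range(X.shape[1]):
                miss = mask[:, j]
                if not miss.any() or miss.all():
                    continue
                rest = np.delete(X_hat, j, axis=1)
                surrogate = MLPRegressor(hidden_layer_sizes=(32,), max_iter=300)
                surrogate.fit(rest[~miss], X_hat[~miss, j])     # train on observed rows
                X_hat[miss, j] = surrogate.predict(rest[miss])  # refresh imputations
        return X_hat
    ```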

  2. Book ; Online: TransCODE

    Tuli, Shikhar / Jha, Niraj K.

    Co-design of Transformers and Accelerators for Efficient Training and Inference

    2023  

    Abstract Automated co-design of machine learning models and evaluation hardware is critical for efficiently deploying such models at scale. Despite the state-of-the-art performance of transformer models, they are not yet ready for execution on resource-constrained hardware platforms. High memory requirements and low parallelizability of the transformer architecture exacerbate this problem. Recently proposed accelerators attempt to optimize the throughput and energy consumption of transformer models. However, such works are either limited to a one-sided search of the model architecture or a restricted set of off-the-shelf devices. Furthermore, previous works only accelerate model inference and not training, which requires substantially more memory and compute resources, making the problem even more challenging. To address these limitations, this work proposes a dynamic training framework, called DynaProp, that speeds up the training process and reduces memory consumption. DynaProp is a low-overhead pruning method that prunes activations and gradients at runtime. To effectively execute this method on hardware for a diverse set of transformer architectures, we propose ELECTOR, a framework that simulates transformer inference and training on a design space of accelerators. We use this simulator in conjunction with the proposed co-design technique, called TransCODE, to obtain the best-performing models with high accuracy on the given task and minimize latency, energy consumption, and chip area. The obtained transformer-accelerator pair achieves 0.3% higher accuracy than the state-of-the-art pair while incurring 5.2× lower latency and 3.0× lower energy consumption.
    Keywords Computer Science - Machine Learning ; Computer Science - Hardware Architecture
    Subject code 006
    Publishing date 2023-03-26
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
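
    The runtime pruning idea behind DynaProp can be sketched as a magnitude threshold applied to activation and gradient tensors before they are written back to memory; the function below is a minimal illustrative stand-in, not the paper's method.

    ```python
    # Magnitude-based runtime pruning of activations/gradients (illustrative).
    import numpy as np

    def threshold_prune(tensor, tau):
        """Zero out entries below magnitude tau; report achieved sparsity so
        downstream hardware can skip the ineffectual operations."""
        keep = np.abs(tensor) >= tau
        return tensor * keep, 1.0 - keep.mean()

    # Both forward activations and backward gradients would pass through
    # such a filter at runtime:
    acts = np.random.randn(4, 768)
    pruned, sparsity = threshold_prune(acts, tau=0.5)
    print(f"activation sparsity: {sparsity:.1%}")
    ```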

  3. Book ; Online: AccelTran

    Tuli, Shikhar / Jha, Niraj K.

    A Sparsity-Aware Accelerator for Dynamic Inference with Transformers

    2023  

    Abstract Self-attention-based transformer models have achieved tremendous success in the domain of natural language processing. Despite their efficacy, accelerating the transformer is challenging due to its quadratic computational complexity and large activation sizes. Existing transformer accelerators attempt to prune its tokens to reduce memory access, albeit with high compute overheads. Moreover, previous works directly operate on large matrices involved in the attention operation, which limits hardware utilization. In order to address these challenges, this work proposes a novel dynamic inference scheme, DynaTran, which prunes activations at runtime with low overhead, substantially reducing the number of ineffectual operations. This improves the throughput of transformer inference. We further propose tiling the matrices in transformer operations along with diverse dataflows to improve data reuse, thus enabling higher energy efficiency. To effectively implement these methods, we propose AccelTran, a novel accelerator architecture for transformers. Extensive experiments with different models and benchmarks demonstrate that DynaTran achieves higher accuracy than the state-of-the-art top-k hardware-aware pruning strategy while attaining up to 1.2× higher sparsity. One of our proposed accelerators, AccelTran-Edge, achieves 330K× higher throughput with 93K× lower energy requirement when compared to a Raspberry Pi device. On the other hand, AccelTran-Server achieves 5.73× higher throughput and 3.69× lower energy consumption compared to the state-of-the-art transformer co-processor, Energon. The simulation source code is available at https://github.com/jha-lab/acceltran.
    Keywords Computer Science - Hardware Architecture ; Computer Science - Machine Learning
    Subject code 004
    Publishing date 2023-02-28
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
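
    The tiling-for-data-reuse point in the abstract can be illustrated with a toy tiled matrix multiply. The sizes and tile shape below are placeholders, not the accelerator's parameters.

    ```python
    # Toy tiled matrix multiply illustrating on-chip data reuse (illustrative).
    import numpy as np

    def tiled_matmul(A, B, T=64):
        n, k = A.shape
        _, m = B.shape
        C = np.zeros((n, m))
        for i in range(0, n, T):
            for j in range(0, m, T):
                for p in range(0, k, T):
                    # The C tile accumulates in place (an output-stationary
                    # dataflow) while A and B tiles stream through, amortizing
                    # memory traffic per output tile.
                    C[i:i+T, j:j+T] += A[i:i+T, p:p+T] @ B[p:p+T, j:j+T]
        return C
    ```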

  4. Book ; Online: EdgeTran

    Tuli, Shikhar / Jha, Niraj K.

    Co-designing Transformers for Efficient Inference on Mobile Edge Platforms

    2023  

    Abstract Automated design of efficient transformer models has recently attracted significant attention from industry and academia. However, most works only focus on certain metrics while searching for the best-performing transformer architecture. Furthermore, running traditional, complex, and large transformer models on low-compute edge platforms is a challenging problem. In this work, we propose a framework, called ProTran, to profile the hardware performance measures for a design space of transformer architectures and a diverse set of edge devices. We use this profiler in conjunction with the proposed co-design technique to obtain the best-performing models that have high accuracy on the given task and minimize latency, energy consumption, and peak power draw to enable edge deployment. We refer to our framework for co-optimizing accuracy and hardware performance measures as EdgeTran. It searches for the best transformer model and edge device pair. Finally, we propose GPTran, a multi-stage block-level grow-and-prune post-processing step that further improves accuracy in a hardware-aware manner. The obtained transformer model is 2.8× smaller and has a 0.8% higher GLUE score than the baseline (BERT-Base). Inference with it on the selected edge device enables 15.0% lower latency, 10.0× lower energy, and 10.8× lower peak power draw compared to an off-the-shelf GPU.
    Keywords Computer Science - Machine Learning
    Subject code 006
    Publishing date 2023-03-23
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
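
    A hedged sketch of the co-optimization the abstract describes: score every profiled (model, device) pair on accuracy, latency, energy, and peak power, and keep the argmax. The weights and candidate entries below are invented placeholders, not values from the paper.

    ```python
    # Scalarized multi-objective selection over profiled (model, device) pairs.
    def codesign_score(acc, latency_ms, energy_mj, peak_w,
                       w=(1.0, 0.01, 0.005, 0.02)):
        # Reward accuracy; penalize latency, energy, and peak power draw.
        return w[0]*acc - w[1]*latency_ms - w[2]*energy_mj - w[3]*peak_w

    def best_pair(profiles):
        """profiles: dict mapping (model, device) -> (acc, latency, energy, power)."""
        return max(profiles, key=lambda pair: codesign_score(*profiles[pair]))

    profiles = {  # hypothetical measurements
        ("tiny-transformer", "pi4"):    (0.78, 42.0, 15.0, 3.2),
        ("base-transformer", "jetson"): (0.83, 30.0, 55.0, 9.5),
    }
    print(best_pair(profiles))
    ```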

  5. Book ; Online: AVAC

    Tuli, Shikhar / Tuli, Shreshth

    A Machine Learning based Adaptive RRAM Variability-Aware Controller for Edge Devices

    2020  

    Abstract Recently, the Edge Computing paradigm has gained significant popularity both in industry and academia. Researchers now increasingly aim to improve the performance and reduce the energy consumption of such devices. Some recent efforts focus on using emerging RRAM technologies for improving energy efficiency, thanks to their zero-leakage property and high integration density. As the complexity and dynamism of applications supported by such devices escalate, it has become difficult for static RRAM controllers to maintain ideal performance. Machine Learning provides a promising solution for this, and hence, this work focuses on extending such controllers to allow dynamic parameter updates. In this work, we propose an Adaptive RRAM Variability-Aware Controller, AVAC, which periodically updates Wait Buffer and batch sizes using on-the-fly learning models and gradient ascent. AVAC allows Edge devices to adapt to different applications and their stages, to improve computation performance and reduce energy consumption. Simulations demonstrate that the proposed model can provide up to a 29% increase in performance and a 19% decrease in energy, compared to static controllers, using traces of real-life healthcare applications on a Raspberry-Pi based Edge deployment.

    Comment: Accepted at 2020 IEEE International Symposium on Circuits and Systems (ISCAS)
    Keywords Electrical Engineering and Systems Science - Systems and Control ; Computer Science - Machine Learning
    Subject code 621
    Publishing date 2020-05-06
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
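
    The gradient-ascent knob tuning the abstract mentions can be sketched as a finite-difference update on the two controller parameters it names (wait-buffer size and batch size); the reward function and step sizes here are stand-ins, not AVAC's actual model.

    ```python
    # Finite-difference gradient ascent on integer controller knobs (illustrative).
    def tune(knobs, reward, lr=0.5, eps=1):
        """knobs: dict of integer parameters; reward: callable(knobs) -> float."""
        new = dict(knobs)
        for k in knobs:
            up = dict(knobs); up[k] += eps
            grad = (reward(up) - reward(knobs)) / eps   # one-sided slope estimate
            new[k] = max(1, knobs[k] + int(round(lr * grad)))
        return new

    # Hypothetical reward that peaks at a batch size of 8 and mildly
    # penalizes a large wait buffer:
    reward = lambda p: -(p["batch"] - 8) ** 2 - 0.1 * p["wait_buf"]
    knobs = {"batch": 4, "wait_buf": 16}
    for _ in range(5):
        knobs = tune(knobs, reward)
    print(knobs)   # batch climbs toward 8
    ```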

  6. Book ; Online: CODEBench

    Tuli, Shikhar / Li, Chia-Hao / Sharma, Ritvik / Jha, Niraj K.

    A Neural Architecture and Hardware Accelerator Co-Design Framework

    2022  

    Abstract Recently, automated co-design of machine learning (ML) models and accelerator architectures has attracted significant attention from both the industry and academia. However, most co-design frameworks either explore a limited search space or employ suboptimal exploration techniques for simultaneous design decision investigations of the ML model and the accelerator. Furthermore, training the ML model and simulating the accelerator performance is computationally expensive. To address these limitations, this work proposes a novel neural architecture and hardware accelerator co-design framework, called CODEBench. It is composed of two new benchmarking sub-frameworks, CNNBench and AccelBench, which explore expanded design spaces of convolutional neural networks (CNNs) and CNN accelerators. CNNBench leverages an advanced search technique, BOSHNAS, to efficiently train a neural heteroscedastic surrogate model to converge to an optimal CNN architecture by employing second-order gradients. AccelBench performs cycle-accurate simulations for a diverse set of accelerator architectures in a vast design space. With the proposed co-design method, called BOSHCODE, our best CNN-accelerator pair achieves 1.4% higher accuracy on the CIFAR-10 dataset compared to the state-of-the-art pair, while enabling 59.1% lower latency and 60.8% lower energy consumption. On the ImageNet dataset, it achieves 3.7% higher Top1 accuracy at 43.8% lower latency and 11.2% lower energy consumption. CODEBench outperforms the state-of-the-art framework, i.e., Auto-NBA, by achieving 1.5% higher accuracy and 34.7x higher throughput, while enabling 11.0x lower energy-delay product (EDP) and 4.0x lower chip area on CIFAR-10.

    Comment: Published at ACM Transactions on Embedded Computing Systems. Code available at https://github.com/jha-lab/codebench
    Keywords Computer Science - Hardware Architecture ; Computer Science - Machine Learning ; Electrical Engineering and Systems Science - Image and Video Processing
    Subject code 006
    Publishing date 2022-12-07
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
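
    A heavily reduced sketch of the surrogate-guided search loop behind BOSHNAS: fit a predictor on the configurations evaluated so far, then pick the next candidate by an optimism-under-uncertainty score. The heteroscedastic modeling and second-order gradients from the paper are omitted, and the nearest-neighbor distance used as an uncertainty proxy below is my own simplification.

    ```python
    # Surrogate-guided candidate selection (simplified; not BOSHNAS itself).
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    def next_config(evaluated, candidates, k=1.0):
        """evaluated: list of (config_vector, accuracy); candidates: config vectors."""
        X = np.array([c for c, _ in evaluated], dtype=float)
        y = np.array([a for _, a in evaluated])
        model = GradientBoostingRegressor().fit(X, y)
        C = np.array(candidates, dtype=float)
        preds = model.predict(C)
        # Crude uncertainty proxy: distance to the nearest evaluated config.
        dists = np.min(np.linalg.norm(C[:, None, :] - X[None, :, :], axis=-1), axis=1)
        return candidates[int(np.argmax(preds + k * dists))]

    evaluated = [((2, 128), 0.71), ((4, 256), 0.78)]   # hypothetical results
    print(next_config(evaluated, [(2, 256), (6, 128), (4, 512)]))
    ```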

  7. Book ; Online: FlexiBERT

    Tuli, Shikhar / Dedhia, Bhishma / Tuli, Shreshth / Jha, Niraj K.

    Are Current Transformer Architectures too Homogeneous and Rigid?

    2022  

    Abstract The existence of a plethora of language models makes the problem of selecting the best one for a custom task challenging. Most state-of-the-art methods leverage transformer-based models (e.g., BERT) or their variants. Training such models and exploring their hyperparameter space, however, is computationally expensive. Prior work proposes several neural architecture search (NAS) methods that employ performance predictors (e.g., surrogate models) to address this issue; however, analysis has been limited to homogeneous models that use fixed dimensionality throughout the network. This leads to sub-optimal architectures. To address this limitation, we propose a suite of heterogeneous and flexible models, namely FlexiBERT, that have varied encoder layers with a diverse set of possible operations and different hidden dimensions. For better-posed surrogate modeling in this expanded design space, we propose a new graph-similarity-based embedding scheme. We also propose a novel NAS policy, called BOSHNAS, that leverages this new scheme, Bayesian modeling, and second-order optimization, to quickly train and use a neural surrogate model to converge to the optimal architecture. A comprehensive set of experiments shows that the proposed policy, when applied to the FlexiBERT design space, pushes the performance frontier upwards compared to traditional models. FlexiBERT-Mini, one of our proposed models, has 3% fewer parameters than BERT-Mini and achieves 8.9% higher GLUE score. A FlexiBERT model with equivalent performance as the best homogeneous model achieves 2.6x smaller size. FlexiBERT-Large, another proposed model, achieves state-of-the-art results, outperforming the baseline models by at least 5.7% on the GLUE benchmark.

    Comment: Preprint. In review
    Keywords Computer Science - Machine Learning ; Computer Science - Computation and Language
    Subject code 006
    Publishing date 2022-05-23
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
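
    The heterogeneity the abstract argues for can be made concrete as a per-layer configuration in which the operation type, hidden dimension, and head count vary layer by layer, unlike a homogeneous BERT-style stack. The field names below are illustrative, not the paper's actual schema.

    ```python
    # Per-layer heterogeneous transformer configuration (illustrative schema).
    from dataclasses import dataclass

    @dataclass
    class LayerCfg:
        op: str        # e.g. "self_attn", "conv", "linear_attn"
        hidden: int    # hidden dimension may differ across layers
        heads: int

    # A homogeneous model repeats one LayerCfg; a FlexiBERT-style model mixes them:
    model = [
        LayerCfg("self_attn", 256, 4),
        LayerCfg("conv",      128, 1),
        LayerCfg("self_attn", 512, 8),
    ]
    ```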

  8. Article ; Online: Predicting the growth and trend of COVID-19 pandemic using machine learning and cloud computing.

    Tuli, Shreshth / Tuli, Shikhar / Tuli, Rakesh / Gill, Sukhpal Singh

    Internet of things (Amsterdam, Netherlands)

    2020  Volume 11, Page(s) 100222

    Abstract The outbreak of COVID-19 Coronavirus, namely SARS-CoV-2, has created a calamitous situation throughout the world. The cumulative incidence of COVID-19 is rapidly increasing day by day. Machine Learning (ML) and Cloud Computing can be deployed very effectively to track the disease, predict the growth of the epidemic, and design strategies and policies to manage its spread. This study applies an improved mathematical model to analyse and predict the growth of the epidemic. An ML-based improved model has been applied to predict the potential threat of COVID-19 in countries worldwide. We show that using iterative weighting for fitting the Generalized Inverse Weibull distribution, a better fit can be obtained to develop a prediction framework. This has been deployed on a cloud computing platform for more accurate and real-time prediction of the growth behavior of the epidemic. A data-driven approach with higher accuracy, such as the one presented here, can be very useful for a proactive response from the government and citizens. Finally, we propose a set of research opportunities and set the grounds for further practical applications.
    Language English
    Publishing date 2020-05-12
    Publishing country Netherlands
    Document type Journal Article
    ISSN 2542-6605
    ISSN (online) 2542-6605
    DOI 10.1016/j.iot.2020.100222
    Database MEDical Literature Analysis and Retrieval System OnLINE
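
    The fitting step the abstract describes can be sketched with SciPy: fit an inverse-Weibull-shaped growth curve to cumulative case counts by weighted least squares, with recent observations upweighted. The parameterization, weights, and synthetic data below are assumptions standing in for the paper's iteratively reweighted procedure.

    ```python
    # Weighted fit of an inverse-Weibull-shaped growth curve (illustrative).
    import numpy as np
    from scipy.optimize import curve_fit

    def gen_inv_weibull(t, a, b, c, scale):
        # One common generalized inverse Weibull form; the paper's exact
        # parameterization may differ.
        return scale * np.exp(-a * t ** (-c)) ** b

    t = np.arange(1.0, 61.0)                       # days since outbreak
    y = gen_inv_weibull(t, 5.0, 1.2, 1.1, 1e4)     # synthetic cumulative cases
    y *= 1 + 0.02 * np.random.randn(t.size)        # observation noise
    w = np.linspace(0.5, 1.5, t.size)              # upweight recent days
    params, _ = curve_fit(gen_inv_weibull, t, y, sigma=1.0 / w,
                          p0=(1.0, 1.0, 1.0, 1e4), maxfev=10000)
    print(params)
    ```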

  9. Book ; Online: Generative Optimization Networks for Memory Efficient Data Generation

    Tuli, Shreshth / Tuli, Shikhar / Casale, Giuliano / Jennings, Nicholas R.

    2021  

    Abstract In standard generative deep learning models, such as autoencoders or GANs, the size of the parameter set is proportional to the complexity of the generated data distribution. A significant challenge is to deploy resource-hungry deep learning models in devices with limited memory to prevent system upgrade costs. To combat this, we propose a novel framework called generative optimization networks (GON) that is similar to GANs, but does not use a generator, significantly reducing its memory footprint. GONs use a single discriminator network and run optimization in the input space to generate new data samples, achieving an effective compromise between training time and memory consumption. GONs are most suited for data generation problems in limited memory settings. Here we illustrate their use for the problem of anomaly detection in memory-constrained edge devices arising from attacks or intrusion events. Specifically, we use a GON to calculate a reconstruction-based anomaly score for input time-series windows. Experiments on a Raspberry-Pi testbed with two existing and a new suite of datasets show that our framework gives up to 32% higher detection F1 scores and 58% lower memory consumption, with only 5% higher training overheads compared to the state-of-the-art.

    Comment: Accepted in NeurIPS 2021 - Workshop on ML for Systems
    Keywords Computer Science - Machine Learning
    Subject code 006
    Publishing date 2021-10-06
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
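
    The core GON mechanism, generating samples by gradient ascent in input space against a single scoring network instead of a separate generator, fits in a few lines of PyTorch; the architecture, dimensions, and step counts below are illustrative.

    ```python
    # Generator-free sample synthesis by input-space optimization (illustrative).
    import torch
    import torch.nn as nn

    D = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))  # scorer

    def generate(n_steps=100, lr=0.1):
        x = torch.randn(1, 16, requires_grad=True)   # start from random noise
        opt = torch.optim.Adam([x], lr=lr)
        for _ in range(n_steps):
            opt.zero_grad()
            (-D(x).mean()).backward()   # ascend the scorer's output
            opt.step()
        return x.detach()

    sample = generate()
    ```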

  10. Book ; Online: APEX

    Tuli, Shreshth / Tuli, Shikhar / Jain, Udit / Buyya, Rajkumar

    Adaptive Ext4 File System for Enhanced Data Recoverability in Edge Devices

    2019  

    Abstract Recently, the Edge Computing paradigm has gained significant popularity both in industry and academia. With its increased usage in real-life scenarios, the security, privacy, and integrity of data in such environments have become critical. Malicious deletion of mission-critical data due to ransomware, trojans, and viruses has been a huge menace, and recovering such lost data is an active field of research. As most Edge computing devices have compute and storage limitations, difficult constraints arise in providing an optimal scheme for data protection. These devices mostly use Linux/Unix-based operating systems. Hence, this work focuses on extending the Ext4 file system to APEX (Adaptive Ext4): a file system based on a novel on-the-fly learning model that provides an Adaptive Recoverability-Aware file allocation platform for efficient post-deletion data recovery, thereby maintaining data integrity. Our recovery model and its lightweight implementation allow significant improvement in the recoverability of lost data with lower compute, space, time, and cost overheads compared to other methods. We demonstrate the effectiveness of APEX through a case study of overwriting surveillance videos by CryPy malware on a Raspberry-Pi based Edge deployment and show 678% and 32% higher recovery than Ext4 and current state-of-the-art file systems, respectively. We also evaluate the overhead characteristics and experimentally show that they are lower than those of other related works.
    Keywords Computer Science - Operating Systems
    Subject code 005
    Publishing date 2019-10-03
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
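
    The recoverability-aware allocation idea can be reduced to a toy policy: when choosing a free block, prefer never-used blocks and, among previously used ones, those freed longest ago, so freshly deleted data survives longest before being overwritten. Everything below is an illustrative toy, not APEX's learned allocator.

    ```python
    # Toy recoverability-aware free-block selection (illustrative policy).
    def pick_block(free_blocks, freed_at, now):
        """free_blocks: iterable of block ids; freed_at: id -> deletion time.
        Never-used blocks (absent from freed_at) are preferred outright."""
        return max(free_blocks, key=lambda b: now - freed_at.get(b, float("-inf")))

    freed_at = {7: 100.0, 9: 500.0}   # block 9 holds the most recently deleted data
    print(pick_block([7, 9, 11], freed_at, now=600.0))   # -> 11 (never used)
    ```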
