LIVIVO - The Search Portal for Life Sciences

Search results

Results 1-10 of 13

  1. Article ; Online: LMCrot: An enhanced protein crotonylation site predictor by leveraging an interpretable window-level embedding from a transformer-based protein language model.

    Pratyush, Pawel / Bahmani, Soufia / Pokharel, Suresh / Ismail, Hamid D / Kc, Dukka B

    Bioinformatics (Oxford, England)

    2024  

    Abstract Motivation: Recent advancements in natural language processing have highlighted the effectiveness of global contextualized representations from Protein Language Models (pLMs) in numerous downstream tasks. Nonetheless, strategies to encode the site-of-interest leveraging pLMs for per-residue prediction tasks, such as crotonylation (Kcr) prediction, remain largely uncharted.
    Results: Herein, we adopt a range of approaches for utilizing pLMs by experimenting with different input sequence types (full-length protein sequence versus window sequence), assessing the implications of utilizing per-residue embedding of the site-of-interest as well as embeddings of window residues centered around it. Building upon these insights, we developed a novel residual ConvBiLSTM network designed to process window-level embeddings of the site-of-interest generated by the ProtT5-XL-UniRef50 pLM using full-length sequences as input. This model, termed T5ResConvBiLSTM, surpasses existing state-of-the-art Kcr predictors in performance across three diverse datasets. To validate our approach of utilizing full sequence-based window-level embeddings, we also delved into the interpretability of ProtT5-derived embedding tensors in two ways: firstly, by scrutinizing the attention weights obtained from the transformer's encoder block; and secondly, by computing SHAP values for these tensors, providing a model-agnostic interpretation of the prediction results. Additionally, we enhance the latent representation of ProtT5 by incorporating two additional local representations, one derived from amino acid properties and the other from a supervised embedding layer, through an intermediate-fusion stacked generalization approach, using an n-mer window sequence (or peptide fragment). The resultant stacked model, dubbed LMCrot, exhibits a more pronounced improvement in predictive performance across the tested datasets.
    Availability and implementation: LMCrot is publicly available at https://github.com/KCLabMTU/LMCrot.
    Language English
    Publishing date 2024-04-25
    Publishing country England
    Document type Journal Article
    ZDB-ID 1422668-6
    ISSN 1367-4811 ; 1367-4803
    ISSN (online) 1367-4811
    ISSN 1367-4803
    DOI 10.1093/bioinformatics/btae290
    Database MEDical Literature Analysis and Retrieval System OnLINE
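
    Note on result 1: the abstract describes taking a window-level embedding of the site of interest from per-residue embeddings computed on the full-length sequence. The sketch below only illustrates that slicing step and is not the LMCrot code. The function name `window_embedding`, the `half_window` parameter, the zero-padding at the protein termini, and the toy 2-dimensional embeddings are all assumptions made for illustration; in practice the per-residue matrix would come from a pLM such as ProtT5.

```python
from typing import List

Vector = List[float]


def window_embedding(per_residue: List[Vector], site: int, half_window: int) -> List[Vector]:
    """Return the (2 * half_window + 1) embedding rows centred on `site`.

    Positions that fall outside the protein are filled with zero vectors
    (an assumed padding choice, not taken from the paper).
    """
    if not 0 <= site < len(per_residue):
        raise IndexError("site is outside the sequence")
    dim = len(per_residue[0])
    rows: List[Vector] = []
    for pos in range(site - half_window, site + half_window + 1):
        if 0 <= pos < len(per_residue):
            rows.append(list(per_residue[pos]))  # copy the residue's embedding
        else:
            rows.append([0.0] * dim)             # pad beyond the termini
    return rows


# Toy stand-in for a pLM output: 6 residues, 2-dimensional embeddings.
toy = [[float(i), float(i) * 10] for i in range(6)]

# Window of 5 residues centred on residue index 1 (0-based).
window = window_embedding(toy, site=1, half_window=2)
```

    Because the embeddings are computed on the whole protein first and sliced afterwards, each row in the window can still carry context from residues outside the window, which is the distinction the abstract draws between full-length and window-sequence input.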

  2. Article ; Online: pLMSNOSite: an ensemble-based approach for predicting protein S-nitrosylation sites by integrating supervised word embedding and embedding from pre-trained protein language model.

    Pratyush, Pawel / Pokharel, Suresh / Saigo, Hiroto / Kc, Dukka B

    BMC bioinformatics

    2023  Volume 24, Issue 1, Page(s) 41

    Abstract Background: Protein S-nitrosylation (SNO) plays a key role in transferring nitric oxide-mediated signals in both animals and plants and has emerged as an important mechanism for regulating protein functions and cell signaling of all main classes of protein. It is involved in several biological processes including immune response, protein stability, transcription regulation, post translational regulation, DNA damage repair, redox regulation, and is an emerging paradigm of redox signaling for protection against oxidative stress. The development of robust computational tools to predict protein SNO sites would contribute to further interpretation of the pathological and physiological mechanisms of SNO.
    Results: Using an intermediate fusion-based stacked generalization approach, we integrated embeddings from a supervised embedding layer and a contextualized protein language model (ProtT5) and developed a tool called pLMSNOSite (protein language model-based SNO site predictor). On an independent test set of experimentally identified SNO sites, pLMSNOSite achieved values of 0.340, 0.735 and 0.773 for MCC, sensitivity and specificity respectively. These results show that pLMSNOSite performs better than the compared approaches for the prediction of S-nitrosylation sites.
    Conclusion: Together, the experimental results suggest that pLMSNOSite achieves significant improvement in the prediction performance of S-nitrosylation sites and represents a robust computational approach for predicting protein S-nitrosylation sites. pLMSNOSite could be a useful resource for further elucidation of SNO and is publicly available at https://github.com/KCLabMTU/pLMSNOSite .
    MeSH term(s) Animals ; Proteins/metabolism ; Nitric Oxide/metabolism ; Oxidation-Reduction ; Protein Processing, Post-Translational ; Signal Transduction
    Chemical Substances Proteins ; Nitric Oxide (31C4KY9ESH)
    Language English
    Publishing date 2023-02-08
    Publishing country England
    Document type Journal Article
    ZDB-ID 2041484-5
    ISSN 1471-2105 ; 1471-2105
    ISSN (online) 1471-2105
    ISSN 1471-2105
    DOI 10.1186/s12859-023-05164-9
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article ; Online: Deep Learning-Based Advances In Protein Posttranslational Modification Site and Protein Cleavage Prediction.

    Pakhrin, Subash C / Pokharel, Suresh / Saigo, Hiroto / Kc, Dukka B

    Methods in molecular biology (Clifton, N.J.)

    2022  Volume 2499, Page(s) 285–322

    Abstract Posttranslational modification (PTM) is a ubiquitous phenomenon in both eukaryotes and prokaryotes which gives rise to enormous proteomic diversity. PTM mostly comes in two flavors: covalent modification to polypeptide chain and proteolytic cleavage. Understanding and characterization of PTM is a fundamental step toward understanding the underpinning of biology. Recent advances in experimental approaches, mainly mass-spectrometry-based approaches, have immensely helped in obtaining and characterizing PTMs. However, experimental approaches are not enough to understand and characterize more than 450 different types of PTMs and complementary computational approaches are becoming popular. Recently, due to the various advancements in the field of Deep Learning (DL), along with the explosion of applications of DL to various fields, the field of computational prediction of PTM has also witnessed the development of a plethora of deep learning (DL)-based approaches. In this book chapter, we first review some recent DL-based approaches in the field of PTM site prediction. In addition, we also review the recent advances in the not-so-studied PTM, that is, proteolytic cleavage predictions. We describe advances in PTM prediction by highlighting the Deep learning architecture, feature encoding, novelty of the approaches, and availability of the tools/approaches. Finally, we provide an outlook and possible future research directions for DL-based approaches for PTM prediction.
    MeSH term(s) Deep Learning ; Mass Spectrometry ; Protein Processing, Post-Translational ; Proteins/chemistry ; Proteomics
    Chemical Substances Proteins
    Language English
    Publishing date 2022-06-13
    Publishing country United States
    Document type Journal Article
    ISSN 1940-6029
    ISSN (online) 1940-6029
    DOI 10.1007/978-1-0716-2317-6_15
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Article ; Online: Integrating Embeddings from Multiple Protein Language Models to Improve Protein

    Pokharel, Suresh / Pratyush, Pawel / Ismail, Hamid D / Ma, Junfeng / Kc, Dukka B

    International journal of molecular sciences

    2023  Volume 24, Issue 21

    Abstract O
    MeSH term(s) Proteins/chemistry ; Protein Processing, Post-Translational ; Amino Acid Sequence ; Acetylglucosamine/metabolism ; N-Acetylglucosaminyltransferases/metabolism
    Chemical Substances Proteins ; Acetylglucosamine (V956696549) ; N-Acetylglucosaminyltransferases (EC 2.4.1.-)
    Language English
    Publishing date 2023-11-06
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2019364-6
    ISSN 1422-0067 ; 1422-0067 ; 1661-6596
    ISSN (online) 1422-0067
    ISSN 1422-0067 ; 1661-6596
    DOI 10.3390/ijms242116000
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Article ; Online: DFT-aided machine learning-based discovery of magnetism in Fe-based bimetallic chalcogenides.

    Pant, Dharmendra / Pokharel, Suresh / Mandal, Subhasish / Kc, Dukka B / Pati, Ranjit

    Scientific reports

    2023  Volume 13, Issue 1, Page(s) 3277

    Abstract With the technological advancement in recent years and the widespread use of magnetism in every sector of the current technology, a search for a low-cost magnetic material has been more important than ever. The discovery of magnetism in alternate materials such as metal chalcogenides with abundant atomic constituents would be a milestone in such a scenario. However, considering the multitude of possible chalcogenide configurations, predictive computational modeling or experimental synthesis is an open challenge. Here, we recourse to a stacked generalization machine learning model to predict magnetic moment (µB) in hexagonal Fe-based bimetallic chalcogenides, Fe
    Language English
    Publishing date 2023-02-25
    Publishing country England
    Document type Journal Article
    ZDB-ID 2615211-3
    ISSN 2045-2322 ; 2045-2322
    ISSN (online) 2045-2322
    ISSN 2045-2322
    DOI 10.1038/s41598-023-30438-w
    Database MEDical Literature Analysis and Retrieval System OnLINE

  6. Article ; Online: Experimental Study: Deep Learning-Based Fall Monitoring among Older Adults with Skin-Wearable Electronics.

    Lee, Yongkuk / Pokharel, Suresh / Muslim, Asra Al / Kc, Dukka B / Lee, Kyoung Hag / Yeo, Woon-Hong

    Sensors (Basel, Switzerland)

    2023  Volume 23, Issue 8

    Abstract Older adults are more vulnerable to falling due to normal changes due to aging, and their falls are a serious medical risk with high healthcare and societal costs. However, there is a lack of automatic fall detection systems for older adults. This paper reports (1) a wireless, flexible, skin-wearable electronic device for both accurate motion sensing and user comfort, and (2) a deep learning-based classification algorithm for reliable fall detection of older adults. The cost-effective skin-wearable motion monitoring device is designed and fabricated using thin copper films. It includes a six-axis motion sensor and is directly laminated on the skin without adhesives for the collection of accurate motion data. To study accurate fall detection using the proposed device, different deep learning models, body locations for the device placement, and input datasets are investigated using motion data based on various human activities. Our results indicate the optimal location to place the device is the chest, achieving accuracy of more than 98% for falls with motion data from older adults. Moreover, our results suggest a large motion dataset directly collected from older adults is essential to improve the accuracy of fall detection for the older adult population.
    MeSH term(s) Humans ; Aged ; Deep Learning ; Wearable Electronic Devices ; Algorithms ; Motion
    Language English
    Publishing date 2023-04-14
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2052857-7
    ISSN 1424-8220 ; 1424-8220
    ISSN (online) 1424-8220
    ISSN 1424-8220
    DOI 10.3390/s23083983
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article ; Online: LMPhosSite: A Deep Learning-Based Approach for General Protein Phosphorylation Site Prediction Using Embeddings from the Local Window Sequence and Pretrained Protein Language Model.

    Pakhrin, Subash C / Pokharel, Suresh / Pratyush, Pawel / Chaudhari, Meenal / Ismail, Hamid D / Kc, Dukka B

    Journal of proteome research

    2023  Volume 22, Issue 8, Page(s) 2548–2557

    Abstract Phosphorylation is one of the most important post-translational modifications and plays a pivotal role in various cellular processes. Although there exist several computational tools to predict phosphorylation sites, existing tools have not yet harnessed the knowledge distilled by pretrained protein language models. Herein, we present a novel deep learning-based approach called LMPhosSite for the general phosphorylation site prediction that integrates embeddings from the local window sequence and the contextualized embedding obtained using global (overall) protein sequence from a pretrained protein language model to improve the prediction performance. Thus, the LMPhosSite consists of two base-models: one for capturing effective local representation and the other for capturing global per-residue contextualized embedding from a pretrained protein language model. The output of these base-models is integrated using a score-level fusion approach. LMPhosSite achieves a precision, recall, Matthews correlation coefficient, and F1-score of 38.78%, 67.12%, 0.390, and 49.15%, for the combined serine and threonine independent test data set and 34.90%, 62.03%, 0.298, and 44.67%, respectively, for the tyrosine independent test data set, which is better than the compared approaches. These results demonstrate that LMPhosSite is a robust computational tool for the prediction of the general phosphorylation sites in proteins.
    MeSH term(s) Phosphorylation ; Deep Learning ; Proteins/metabolism ; Protein Processing, Post-Translational ; Amino Acid Sequence
    Chemical Substances Proteins
    Language English
    Publishing date 2023-07-17
    Publishing country United States
    Document type Journal Article ; Research Support, U.S. Gov't, Non-P.H.S.
    ZDB-ID 2078618-9
    ISSN 1535-3907 ; 1535-3893
    ISSN (online) 1535-3907
    ISSN 1535-3893
    DOI 10.1021/acs.jproteome.2c00667
    Database MEDical Literature Analysis and Retrieval System OnLINE
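
    Note on result 7: the abstract reports precision, recall, and F1-score together. Since F1 is the harmonic mean of precision and recall, the reported triples can be checked for internal consistency. The sketch below does only that arithmetic on the figures quoted in the abstract; the helper name `f1_score` and the `reported` dictionary are illustrative, and small differences are expected because the published inputs are already rounded.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) as quoted in the abstract.
reported = {
    "serine/threonine": (38.78, 67.12, 49.15),
    "tyrosine": (34.90, 62.03, 44.67),
}

for name, (p, r, f1) in reported.items():
    print(f"{name}: computed F1 = {f1_score(p, r):.2f}%, reported F1 = {f1:.2f}%")
```

    The computed values should agree with the reported ones to within rounding, which suggests the three figures in each set were derived from the same confusion matrix.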

  8. Article ; Online: Discovering symptom patterns of COVID-19 patients using association rule mining.

    Tandan, Meera / Acharya, Yogesh / Pokharel, Suresh / Timilsina, Mohan

    Computers in biology and medicine

    2021  Volume 131, Page(s) 104249

    Abstract Background: The COVID-19 pandemic is a significant public health crisis that is hitting hard on people's health, well-being, and freedom of movement, and affecting the global economy. Scientists worldwide are competing to develop therapeutics and vaccines; currently, three drugs and two vaccine candidates have been given emergency use authorization. However, there are still questions of efficacy with regard to specific subgroups of patients and the vaccine's scalability to the general public. Under such circumstances, understanding COVID-19 symptoms is vital in initial triage; it is crucial to distinguish the severity of cases for effective management and treatment. This study aimed to discover symptom patterns and overall symptom rules, including rules disaggregated by age, sex, chronic condition, and mortality status, among COVID-19 patients.
    Methods: This study was a retrospective analysis of COVID-19 patient data made available online by the Wolfram Data Repository through May 27, 2020. We applied a widely used rule-based machine learning technique called association rule mining to identify frequent symptoms and define patterns in the rules discovered.
    Result: In total, 1,560 patients with COVID-19 were included in the study, with a median age of 52 years. The most frequently occurring symptom was fever (67%), followed by cough (37%), malaise/body soreness (11%), pneumonia (11%), and sore throat (8%). Myocardial infarction, heart failure, and renal disease were present in less than 1% of patients. The top ten significant symptom rules (out of 71 generated) showed cough, septic shock, and respiratory distress syndrome as frequent consequents. If a patient had a breathing problem and sputum production, then there was a higher confidence of that patient having a cough; if cardiac disease, renal disease, or pneumonia was present, then there was a higher confidence of septic shock or respiratory distress syndrome. Symptom rules differed between younger and older patients and between male and female patients. Patients who had chronic conditions or died of COVID-19 had more severe symptom rules than those patients who did not have chronic conditions or survived COVID-19. Concerning chronic condition rules among 147 patients, if a patient had diabetes, prerenal azotemia, and coronary bypass surgery, there was a certainty of hypertension.
    Conclusion: The most frequently reported symptoms in patients with COVID-19 were fever, cough, pneumonia, and sore throat, while 1% had severe symptoms, such as septic shock, respiratory distress syndrome, and respiratory failure. Symptom rules differed by age and sex. Patients with chronic disease and patients who died of COVID-19 had severe symptom rules, more specifically cardiovascular-related symptoms accompanied by pneumonia, fever, and cough as consequents.
    MeSH term(s) Biomarkers/metabolism ; COVID-19/diagnosis ; COVID-19/epidemiology ; COVID-19/metabolism ; Data Mining ; Databases, Factual ; Diagnosis, Computer-Assisted ; Female ; Humans ; Male ; Middle Aged ; Pandemics ; Retrospective Studies ; SARS-CoV-2/metabolism
    Chemical Substances Biomarkers
    Language English
    Publishing date 2021-02-01
    Publishing country United States
    Document type Journal Article
    ZDB-ID 127557-4
    ISSN 1879-0534 ; 0010-4825
    ISSN (online) 1879-0534
    ISSN 0010-4825
    DOI 10.1016/j.compbiomed.2021.104249
    Database MEDical Literature Analysis and Retrieval System OnLINE
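
    Note on result 8: the abstract reports rules of the form "if a patient had a breathing problem and sputum production, then there was a higher confidence of a cough". In association rule mining, such statements rest on two standard quantities, support and confidence. The sketch below computes both on a tiny invented patient list. The records, the helper names `support` and `confidence`, and the resulting numbers are hypothetical and are not taken from the study's data.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], items: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `items`."""
    wanted = frozenset(items)
    return sum(1 for t in transactions if wanted <= t) / len(transactions)


def confidence(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """Estimate P(consequent | antecedent) as support(A and C) / support(A)."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    support_a = support(transactions, a)
    if support_a == 0:
        raise ValueError("antecedent never occurs in the data")
    return support(transactions, both) / support_a


# Hypothetical symptom sets, one per patient (not the study's data).
patients = [
    frozenset({"fever", "cough"}),
    frozenset({"fever"}),
    frozenset({"breathing problem", "sputum", "cough"}),
    frozenset({"breathing problem", "sputum", "cough", "fever"}),
    frozenset({"sore throat"}),
]

fever_support = support(patients, {"fever"})
rule_conf = confidence(patients, {"breathing problem", "sputum"}, {"cough"})
```

    Symptom frequencies such as "fever (67%)" correspond to the support of a single item, while an if-then rule is ranked by its confidence (often together with lift, which is omitted here).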

  9. Article ; Online: Building a knowledge base for colorectal cancer patient care using formal concept analysis.

    Xiang, Jing / Xu, Hanbing / Pokharel, Suresh / Li, Jiqing / Xue, Fuzhong / Zhang, Ping

    BMC medical informatics and decision making

    2022  Volume 21, Issue Suppl 11, Page(s) 369

    Abstract Background: Colorectal cancer (CRC) is a heterogeneous disease with different responses to targeted therapies due to various factors, and the treatment effect differs significantly between individuals. Personalized medical treatment (PMT) is a method that takes individual patient characteristics into consideration, making it the most effective way to deal with this issue. Patient similarity and clustering analysis is an important aspect of PMT. This paper describes how to build a knowledge base using formal concept analysis (FCA), which clusters patients based on their similarity and preserves the relations between clusters in hierarchical structural form.
    Methods: Prognostic factors (attributes) of 2442 CRC patients, including patient age, cancer cell differentiation, lymphatic invasion and metastasis stages were used to build a formal context in FCA. A concept was defined as a set of patients with their shared attributes. The formal context was formed based on the similarity scores between each concept identified from the dataset, which can be used as a knowledge base.
    Results: A hierarchical knowledge base was constructed along with the clinical records of the diagnosed CRC patients. For each new patient, a similarity score to each existing concept in the knowledge base can be retrieved with different similarity calculations. The ranked similarity scores that are associated with the concepts can offer references for treatment plans.
    Conclusions: That patients share the same concept indicates a potentially similar effect from the same clinical procedures or treatments. In conjunction with a clinician's ability to undertake flexible analyses and apply appropriate judgement, the knowledge base allows faster and more effective decisions to be made for patient treatment and care.
    MeSH term(s) Humans ; Patient Care ; Knowledge Bases ; Cluster Analysis ; Judgment ; Colorectal Neoplasms/diagnosis ; Colorectal Neoplasms/therapy
    Language English
    Publishing date 2022-11-23
    Publishing country England
    Document type Journal Article
    ZDB-ID 2046490-3
    ISSN 1472-6947 ; 1472-6947
    ISSN (online) 1472-6947
    ISSN 1472-6947
    DOI 10.1186/s12911-021-01728-y
    Database MEDical Literature Analysis and Retrieval System OnLINE

  10. Article ; Online: Improving protein succinylation sites prediction using embeddings from protein language model.

    Pokharel, Suresh / Pratyush, Pawel / Heinzinger, Michael / Newman, Robert H / Kc, Dukka B

    Scientific reports

    2022  Volume 12, Issue 1, Page(s) 16933

    Abstract Protein succinylation is an important post-translational modification (PTM) responsible for many vital metabolic activities in cells, including cellular respiration, regulation, and repair. Here, we present a novel approach that combines features from supervised word embedding with embedding from a protein language model called ProtT5-XL-UniRef50 (hereafter termed, ProtT5) in a deep learning framework to predict protein succinylation sites. To our knowledge, this is one of the first attempts to employ embedding from a pre-trained protein language model to predict protein succinylation sites. The proposed model, dubbed LMSuccSite, achieves state-of-the-art results compared to existing methods, with performance scores of 0.36, 0.79, 0.79 for MCC, sensitivity, and specificity, respectively. LMSuccSite is likely to serve as a valuable resource for exploration of succinylation and its role in cellular physiology and disease.
    MeSH term(s) Computational Biology/methods ; Language ; Lysine/metabolism ; Protein Processing, Post-Translational ; Proteins/metabolism
    Chemical Substances Proteins ; Lysine (K3Z4F929H6)
    Language English
    Publishing date 2022-10-08
    Publishing country England
    Document type Journal Article
    ZDB-ID 2615211-3
    ISSN 2045-2322 ; 2045-2322
    ISSN (online) 2045-2322
    ISSN 2045-2322
    DOI 10.1038/s41598-022-21366-2
    Database MEDical Literature Analysis and Retrieval System OnLINE
