LIVIVO - The Search Portal for Life Sciences


Search results

Results 1-10 of 47

  1. Book ; Online: More for Less

    Gao, Andrew Kean

    Compact Convolutional Transformers Enable Robust Medical Image Classification with Limited Data

    2023  

    Abstract Transformers are very powerful tools for a variety of tasks across domains, from text generation to image captioning. However, transformers require substantial amounts of training data, which is often a challenge in biomedical settings, where high quality labeled data can be challenging or expensive to obtain. This study investigates the efficacy of Compact Convolutional Transformers (CCT) for robust medical image classification with limited data, addressing a key issue faced by conventional Vision Transformers - their requirement for large datasets. A hybrid of transformers and convolutional layers, CCTs demonstrate high accuracy on modestly sized datasets. We employed a benchmark dataset of peripheral blood cell images of eight distinct cell types, each represented by approximately 2,000 low-resolution (28x28x3 pixel) samples. Despite the dataset size being smaller than those typically used with Vision Transformers, we achieved a commendable classification accuracy of 92.49% and a micro-average ROC AUC of 0.9935. The CCT also learned quickly, exceeding 80% validation accuracy after five epochs. Analysis of per-class precision, recall, F1, and ROC showed that performance was strong across cell types. Our findings underscore the robustness of CCTs, indicating their potential as a solution to data scarcity issues prevalent in biomedical imaging. We substantiate the applicability of CCTs in data-constrained areas and encourage further work on CCTs.

    Comment: 9 pages, 4 figures, 2 tables
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Machine Learning ; I.4.9 ; I.2.10
    Subject code 006
    Publishing date 2023-06-30
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  2. Book ; Online: NLP Meets RNA

    Gao, Andrew Kean

    Unsupervised Embedding Learning for Ribozymes with Word2Vec

    2023  

    Abstract Ribozymes, RNA molecules with distinct 3D structures and catalytic activity, have widespread applications in synthetic biology and therapeutics. However, relatively little research has focused on leveraging deep learning to enhance our understanding of ribozymes. This study implements Word2Vec, an unsupervised learning technique for natural language processing, to learn ribozyme embeddings. Ribo2Vec was trained on over 9,000 diverse ribozymes, learning to map sequences to 128 and 256-dimensional vector spaces. Using Ribo2Vec, sequence embeddings for five classes of ribozymes (hatchet, pistol, hairpin, hovlinc, and twister sister) were calculated. Principal component analysis demonstrated the ability of these embeddings to distinguish between ribozyme classes. Furthermore, a simple SVM classifier trained on ribozyme embeddings showed promising results in accurately classifying ribozyme types. Our results suggest that the embedding vectors contained meaningful information about ribozymes. Interestingly, 256-dimensional embeddings behaved similarly to 128-dimensional embeddings, suggesting that a lower dimension vector space is generally sufficient to capture ribozyme features. This approach demonstrates the potential of Word2Vec for bioinformatics, opening new avenues for ribozyme research. Future research includes using a Transformer-based method to learn RNA embeddings, which can capture long-range interactions between nucleotides.
    Keywords Computer Science - Machine Learning ; Quantitative Biology - Biomolecules ; I.2.7
    Subject code 610
    Publishing date 2023-07-08
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  3. Book ; Online: Vec2Vec

    Gao, Andrew Kean

    A Compact Neural Network Approach for Transforming Text Embeddings with High Fidelity

    2023  

    Abstract Vector embeddings have become ubiquitous tools for many language-related tasks. A leading embedding model is OpenAI's text-ada-002, which can embed approximately 6,000 words into a 1,536-dimensional vector. While powerful, text-ada-002 is not open source and is only available via API. We trained a simple neural network to convert open-source 768-dimensional MPNet embeddings into text-ada-002 embeddings. We compiled a subset of 50,000 online food reviews. We calculated MPNet and text-ada-002 embeddings for each review and trained a simple neural network for 75 epochs. The neural network was designed to predict the corresponding text-ada-002 embedding for a given MPNet embedding. Our model achieved an average cosine similarity of 0.932 on 10,000 unseen reviews in our held-out test dataset. We manually assessed the quality of our predicted embeddings for vector search over text-ada-002-embedded reviews. While not as good as real text-ada-002 embeddings, predicted embeddings were able to retrieve highly relevant reviews. Our final model, Vec2Vec, is lightweight (<80 MB) and fast. Future steps include training a neural network with a more sophisticated architecture and a larger dataset of paired embeddings to achieve greater performance. The ability to convert between and align embedding spaces may be helpful for interoperability, limiting dependence on proprietary models, protecting data privacy, reducing costs, and offline operations.

    Comment: 14 pages, 6 figures, 5 tables
    Keywords Computer Science - Computation and Language ; Computer Science - Artificial Intelligence ; Computer Science - Information Retrieval ; Computer Science - Machine Learning ; I.2.7 ; D.2.12
    Subject code 410
    Publishing date 2023-06-22
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
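
The Vec2Vec abstract above reports an average cosine similarity between predicted and true embeddings. As a minimal sketch of how such a metric could be computed (the function names are hypothetical, and this is not the authors' evaluation code), using plain Python lists as vectors:

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def mean_cosine_similarity(predicted, target):
    """Average cosine similarity over paired lists of vectors."""
    if not predicted or len(predicted) != len(target):
        raise ValueError("need two non-empty lists of equal length")
    sims = [cosine_similarity(p, t) for p, t in zip(predicted, target)]
    return sum(sims) / len(sims)
```

In the setting the abstract describes, `predicted` would hold the network's outputs for held-out reviews and `target` the corresponding real embeddings; both names are assumptions made for this illustration.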

  4. Book ; Online: On the Origin of LLMs

    Gao, Sarah / Gao, Andrew Kean

    An Evolutionary Tree and Graph for 15,821 Large Language Models

    2023  

    Abstract Since late 2022, Large Language Models (LLMs) have become very prominent with LLMs like ChatGPT and Bard receiving millions of users. Hundreds of new LLMs are announced each week, many of which are deposited to Hugging Face, a repository of machine learning models and datasets. To date, nearly 16,000 Text Generation models have been uploaded to the site. Given the huge influx of LLMs, it is of interest to know which LLM backbones, settings, training methods, and families are popular or trending. However, there is no comprehensive index of LLMs available. We take advantage of the relatively systematic nomenclature of Hugging Face LLMs to perform hierarchical clustering and identify communities amongst LLMs using n-grams and term frequency-inverse document frequency. Our methods successfully identify families of LLMs and accurately cluster LLMs into meaningful subgroups. We present a public web application to navigate and explore Constellation, our atlas of 15,821 LLMs. Constellation rapidly generates a variety of visualizations, namely dendrograms, graphs, word clouds, and scatter plots. Constellation is available at the following link: https://constellation.sites.stanford.edu/.

    Comment: 14 pages, 6 figures, 1 table
    Keywords Computer Science - Digital Libraries ; Computer Science - Computation and Language ; I.2.1 ; H.5.0
    Publishing date 2023-07-19
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
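
The abstract above describes grouping model names using n-grams and term frequency-inverse document frequency. A minimal sketch of that idea follows, assuming character trigrams and a smoothed IDF variant; the abstract does not specify either choice, the model names in the usage note are made-up examples, and the sketch stops at pairwise similarity rather than performing the hierarchical clustering itself.

```python
import math
from collections import Counter


def char_ngrams(name, n=3):
    """Lower-cased character n-grams of a model name (assumed tokenisation)."""
    s = name.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def tfidf_vectors(names, n=3):
    """Sparse TF-IDF vectors (dicts) for each name, using a smoothed IDF."""
    docs = [Counter(char_ngrams(name, n)) for name in names]
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(doc.keys())  # count each n-gram once per name
    total_docs = len(docs)
    vectors = []
    for doc in docs:
        length = sum(doc.values())
        vectors.append({
            gram: (count / length)
            * (math.log((1 + total_docs) / (1 + doc_freq[gram])) + 1.0)
            for gram, count in doc.items()
        })
    return vectors


def sparse_cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(gram, 0.0) for gram, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

For example, with the hypothetical names `["llama-7b-chat", "llama-13b-chat", "gpt2-medium"]`, the first two share most of their trigrams and so score a higher similarity with each other than either does with the third. A clustering step could then consume these pairwise similarities.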

  5. Article ; Online: Machine-learning-based virtual screening to repurpose drugs for treatment of Candida albicans infection.

    Gao, Andrew / Kouznetsova, Valentina L / Tsigelny, Igor F

    Mycoses

    2022  Volume 65, Issue 8, Page(s) 794–805

    Abstract Background: Approximately 30% of Candida genus isolates are resistant to all currently available antifungal drugs and it is highly important to develop new treatments. Additionally, many current drugs are toxic and cause unwanted side effects. 1,3-beta-glucan synthase is an essential enzyme that builds the cell walls of Candida.
    Objectives: Targeting CaFKS1, a subunit of the synthase, could be used to fight Candida.
    Methods: In the present study, a machine-learning model based on chemical descriptors was trained to recognise drugs that inhibit CaFKS1. The model attained 96.72% accuracy for classifying between active and inactive drug compounds. Descriptors for FDA-approved and other drugs were calculated, and the model was used to predict the potential activity of these drugs against CaFKS1.
    Results: Several drugs, including goserelin and icatibant, were detected as active with high confidence. Many of the drugs, interestingly, were gonadotrophin-releasing hormone (GnRH) antagonists or agonists. A literature search found that five of the predicted drugs inhibit Candida experimentally.
    Conclusions: This study yields promising drugs to be repurposed to combat Candida albicans infection. Future steps include testing the drugs on fungal cells in vitro.
    MeSH term(s) Antifungal Agents/pharmacology ; Antifungal Agents/therapeutic use ; Candida ; Candida albicans ; Candidiasis/drug therapy ; Candidiasis/microbiology ; Humans ; Machine Learning ; Microbial Sensitivity Tests
    Chemical Substances Antifungal Agents
    Language English
    Publishing date 2022-06-19
    Publishing country Germany
    Document type Journal Article
    ZDB-ID 392487-7
    ISSN 1439-0507 ; 0933-7407
    ISSN (online) 1439-0507
    ISSN 0933-7407
    DOI 10.1111/myc.13475
    Database MEDical Literature Analysis and Retrieval System OnLINE

  6. Article ; Online: Computation by Convective Logic Gates and Thermal Communication.

    Bartlett, Stuart / Gao, Andrew K / Yung, Yuk L

    Artificial life

    2022  Volume 28, Issue 1, Page(s) 96–107

    Abstract We demonstrate a novel computational architecture based on fluid convection logic gates and heat flux-mediated information flows. Our previous work demonstrated that Boolean logic operations can be performed by thermally driven convection flows. In this work, we use numerical simulations to demonstrate a different, but universal Boolean logic operation (NOR), performed by simpler convective gates. The gates in the present work do not rely on obstacle flows or periodic boundary conditions, a significant improvement in terms of experimental realizability. Conductive heat transfer links can be used to connect the convective gates, and we demonstrate this with the example of binary half addition. These simulated circuits could be constructed in an experimental setting with modern, 2-dimensional fluidics equipment, such as a thin layer of fluid between acrylic plates. The presented approach thus introduces a new realm of unconventional, thermal fluid-based computation.
    MeSH term(s) Communication ; Logic
    Language English
    Publishing date 2022-04-01
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2024100-8
    ISSN 1530-9185 ; 1064-5462
    ISSN (online) 1530-9185
    ISSN 1064-5462
    DOI 10.1162/artl_a_00358
    Database MEDical Literature Analysis and Retrieval System OnLINE
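
The abstract above notes that NOR is a universal Boolean operation and uses binary half addition as its example. The following sketch shows, purely at the logic level, one way a half adder can be composed from NOR gates alone; the wiring is a standard textbook construction chosen for illustration and is not claimed to match the gate layout used in the paper.

```python
def nor(a, b):
    """NOR gate on 0/1 inputs: returns 1 only when both inputs are 0."""
    return int(not (a or b))


def half_adder(a, b):
    """Half adder built only from NOR gates; returns (sum_bit, carry_bit)."""
    n1 = nor(a, b)
    n2 = nor(a, n1)           # equals (not a) and b
    n3 = nor(b, n1)           # equals a and (not b)
    xnor = nor(n2, n3)        # 1 when a == b
    sum_bit = nor(xnor, xnor)  # NOT xnor, i.e. a XOR b
    carry_bit = nor(nor(a, a), nor(b, b))  # NOR of the inverted inputs = a AND b
    return sum_bit, carry_bit
```

Calling `half_adder(1, 1)` returns `(0, 1)`, the binary representation of 1 + 1 = 2 with the carry bit set.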

  7. Article: Methylation Profiling Identifies Stability of Isocitrate Dehydrogenase Mutation Over Time.

    Voisin, Mathew R / Gui, Chloe / Patil, Vikas / Gao, Andrew F / Zadeh, Gelareh

    The Canadian journal of neurological sciences. Le journal canadien des sciences neurologiques

    2023, Page(s) 1–7

    Abstract Objective: Isocitrate dehydrogenase (IDH) mutation status is a key diagnostic and prognostic feature of gliomas. It is thought to occur early in glioma tumorigenesis and remain stable over time. However, there are reports documenting a loss of IDH mutation status in a subset of patients with glioma recurrence. Here, we identified patients with a documented loss of IDH mutation status longitudinally and performed multi-platform analysis in order to determine if IDH mutations are stable throughout glioma evolution.
    Methods: We retrospectively identified patients from our institution from 2009 to 2018 with immunohistochemistry (IHC)-recorded IDH mutation status changes longitudinally. Archived formalin-fixed paraffin-embedded and frozen tissue samples from these patients were collected from our institution's tumour bank. Samples were analysed using methylation profiling, copy number variation, Sanger sequencing, droplet digital PCR (ddPCR) and IHC.
    Results: We reviewed 1491 archived glioma samples including 78 patients with multiple IDH mutant tumour samples collected longitudinally. In all instances of documented loss of IDH mutation status, multi-platform profiling identified a mixture of low tumour cell content and non-neoplastic tissue including perilesional, reactive or inflammatory cells.
    Conclusions: All patients with a documented loss of IDH mutation status longitudinally were resolved through multi-platform analysis. These findings support the hypothesis that IDH mutations occur early in gliomagenesis and in the absence of copy number changes at the IDH loci and are stable throughout tumour treatment and evolution. Our study highlights the importance of accurate surgical sampling and the role of DNA methylome profiling in diagnostically uncertain cases for integrated pathological and molecular diagnosis.
    Language English
    Publishing date 2023-07-12
    Publishing country England
    Document type Journal Article
    ZDB-ID 197622-9
    ISSN 0317-1671
    DOI 10.1017/cjn.2023.253
    Database MEDical Literature Analysis and Retrieval System OnLINE

  8. Article ; Online: Bovine leukemia virus relation to human breast cancer: Meta-analysis.

    Gao, Andrew / Kouznetsova, Valentina L / Tsigelny, Igor F

    Microbial pathogenesis

    2020  Volume 149, Page(s) 104417

    Abstract Bovine leukemia virus (BLV) is a virus that infects cattle around the world and is very similar to the human T-cell leukemia virus (HTLV), which causes adult T-cell leukemia/lymphoma (ATL). Recently, presence of BLV DNA and protein was demonstrated in commercial bovine products and in humans. BLV DNA is generally found at higher rates in humans who have or will develop breast cancer, according to research done with subjects from several countries. These findings have led to a hypothesis that BLV transmission plays a role in breast cancer oncogenesis in humans. Here we summarize the current knowledge in the field.
    MeSH term(s) Adult ; Animals ; Breast Neoplasms ; Cattle ; Female ; Humans ; Leukemia Virus, Bovine/genetics
    Keywords covid19
    Language English
    Publishing date 2020-07-27
    Publishing country England
    Document type Journal Article ; Meta-Analysis
    ZDB-ID 632772-2
    ISSN 1096-1208 ; 0882-4010
    ISSN (online) 1096-1208
    ISSN 0882-4010
    DOI 10.1016/j.micpath.2020.104417
    Database MEDical Literature Analysis and Retrieval System OnLINE

  9. Article ; Online: A case of disseminated microscopic demyelination with multifocal dystrophic calcification.

    Abdollahi, Maryam / Gao, Andrew F / Munoz, David G

    Neuropathology : official journal of the Japanese Society of Neuropathology

    2020  Volume 40, Issue 3, Page(s) 308–314

    Abstract We present a 47-year-old woman with a 10-year disease course consisting of episodic confusion, aphasia, psychosis, depression, migrainous headaches and seizures. There was mild elevation of protein levels in the cerebrospinal fluid, progressive cerebral atrophy, and numerous small T1 hypointensities appearing as central "holes" in the corpus callosum on magnetic resonance imaging. She eventually expired due to status epilepticus and subsequent significant respiratory complications. In the central nervous system, there was generalized brain atrophy, and patchy labeling of blood vessels by antibodies to complement component 4d (C4d) and membrane attack complex. Innumerable small patches with loss of cell bodies (neurons and glial cells in gray matter and glial cells in white matter) and demyelination were scattered throughout the brain and spinal cord. There was no cavitation and the passing axons were mostly preserved. Large solid calcified foci were present predominantly in the pons along with disseminated focal calcification involving neuron cell bodies, neurites, and capillaries. Patchy labeling of glial cells and linear structures suggestive of myelin sheaths with C4d antibodies was observed while immunostains for SV40, tau, β-amyloid, alpha synuclein, p62, and trans-activation response DNA-binding protein 43 kDa were negative. Whole-exome sequencing did not reveal any clinically significant variants. Although the radiological findings are suggestive of Susac's syndrome (a rare condition characterized by encephalopathy, hearing loss, and branch retinal artery occlusion), in the absence of audiovisual manifestations, a definitive diagnosis cannot be rendered and therefore, this case may be representing a new entity. Further reports of similar cases are needed for clarification.
    MeSH term(s) Brain/pathology ; Calcinosis/pathology ; Demyelinating Diseases/pathology ; Fatal Outcome ; Female ; Humans ; Middle Aged ; Neurodegenerative Diseases/pathology
    Language English
    Publishing date 2020-03-03
    Publishing country Australia
    Document type Case Reports
    ZDB-ID 1483794-8
    ISSN 1440-1789 ; 0919-6544
    ISSN (online) 1440-1789
    ISSN 0919-6544
    DOI 10.1111/neup.12642
    Database MEDical Literature Analysis and Retrieval System OnLINE

  10. Article: Malignant Mimics of Trigeminal Schwannoma.

    Raswoli, Musthafa / Tsang, Derek S / Zadeh, Gelareh / Gao, Andrew F / Shultz, David B

    Advances in radiation oncology

    2022  Volume 8, Issue 1, Page(s) 101056

    Language English
    Publishing date 2022-08-27
    Publishing country United States
    Document type Case Reports
    ISSN 2452-1094
    DOI 10.1016/j.adro.2022.101056
    Database MEDical Literature Analysis and Retrieval System OnLINE
