LIVIVO - The Search Portal for Life Sciences

Search results

Result 1 - 10 of total 179

  1. Book ; Online: Artificial Intelligence Bioinformatics: Development and Application of Tools for Omics and Inter-Omics Studies

    Facchiano, Angelo / Heider, Dominik / Chicco, Davide

    2020  

    Keywords Science: general issues ; Medical genetics ; artificial intelligence ; bioinformatics ; genomics ; omics ; inter-omics ; machine learning ; data mining ; proteomics
    Size 1 electronic resource (175 pages)
    Publisher Frontiers Media SA
    Document type Book ; Online
    Note English ; Open Access
    HBZ-ID HT021230286
    ISBN 9782889637522 ; 2889637522
    Database ZB MED Catalogue: Medicine, Health, Nutrition, Environment, Agriculture

  2. Article: Editorial: Artificial intelligence and bioinformatics applications for omics and multi-omics studies.

    Facchiano, Angelo / Heider, Dominik / Mutarelli, Margherita

    Frontiers in genetics

    2024  Volume 15, Page(s) 1371473

    Language English
    Publishing date 2024-01-30
    Publishing country Switzerland
    Document type Editorial
    ZDB-ID 2606823-0
    ISSN 1664-8021
    DOI 10.3389/fgene.2024.1371473
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article: NeuralBeds: Neural embeddings for efficient DNA data compression and optimized similarity search.

    Sarumi, Oluwafemi A / Hahn, Maximilian / Heider, Dominik

    Computational and structural biotechnology journal

    2024  Volume 23, Page(s) 732–741

    Abstract The availability of high-throughput sequencing tools, coupled with the declining costs of producing DNA sequences, has led to the generation of enormous amounts of omics data curated in several databases such as NCBI and EMBL. Identification of similar DNA sequences from these databases is one of the fundamental tasks in bioinformatics. It is essential for discovering homologous sequences in organisms, phylogenetic studies of evolutionary relationships among several biological entities, or detection of pathogens. Improving DNA similarity search is of utmost importance because of the increased complexity of the ever-growing repositories of sequences. Therefore, instead of using the conventional approach of comparing raw sequences, e.g., in FASTA format, a numerical representation of the sequences can be used to calculate their similarities and optimize the search process. In this study, we analyzed different approaches for numerical embeddings, including Chaos Game Representation, hashing, and neural networks, and compared them with classical approaches such as principal component analysis. It turned out that neural networks generate embeddings that capture the similarity between DNA sequences as a distance measure and significantly outperform the other approaches on DNA similarity search.
    Language English
    Publishing date 2024-01-15
    Publishing country Netherlands
    Document type Journal Article
    ZDB-ID 2694435-2
    ISSN 2001-0370
    DOI 10.1016/j.csbj.2023.12.046
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Article ; Online: Turbo autoencoders for the DNA data storage channel with Autoturbo-DNA.

    Welzel, Marius / Dreßler, Hagen / Heider, Dominik

    iScience

    2024  Volume 27, Issue 5, Page(s) 109575

    Abstract DNA, with its high storage density and long-term stability, is a potential candidate for a next-generation storage device. The DNA data storage channel, composed of synthesis, amplification, storage, and sequencing, exhibits error probabilities and error profiles specific to the components of the channel. Here, we present Autoturbo-DNA, a PyTorch framework for training error-correcting, overcomplete autoencoders specifically tailored for the DNA data storage channel. It allows training different architecture combinations and using a wide variety of channel component models for noise generation during training. It further supports training the encoder to generate DNA sequences that adhere to user-defined constraints. Autoturbo-DNA exhibits error-correction capabilities close to non-neural-network state-of-the-art error correction and constrained codes for DNA data storage. Our results indicate that neural-network-based codes can be a viable alternative to traditionally designed codes for the DNA data storage channel.
    Language English
    Publishing date 2024-03-27
    Publishing country United States
    Document type Journal Article
    ISSN 2589-0042
    ISSN (online) 2589-0042
    DOI 10.1016/j.isci.2024.109575
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Article: Unsupervised encoding selection through ensemble pruning for biomedical classification.

    Spänig, Sebastian / Michel, Alexander / Heider, Dominik

    BioData mining

    2023  Volume 16, Issue 1, Page(s) 10

    Abstract Background: Owing to the rising levels of multi-resistant pathogens, antimicrobial peptides, an alternative strategy to classic antibiotics, have received more attention. A crucial and costly part of this strategy is their identification and validation. With the ever-growing amount of annotated peptides, researchers leverage artificial intelligence to circumvent the cumbersome, wet-lab-based identification and automate the detection of promising candidates. However, the prediction of a peptide's function is not limited to antimicrobial efficiency. To date, multiple studies have successfully classified additional properties, e.g., antiviral or cell-penetrating effects. In this light, ensemble classifiers are employed to further improve the prediction. Although we recently presented a workflow that significantly narrows the initial encoding choice, a fully unsupervised encoding selection that considers various machine learning models is still lacking.
    Results: We developed a workflow, automatically selecting encodings and generating classifier ensembles by employing sophisticated pruning methods. We observed that the Pareto frontier pruning is a good method to create encoding ensembles for the datasets at hand. In addition, encodings combined with the Decision Tree classifier as the base model are often superior. However, our results also demonstrate that none of the ensemble building techniques is outstanding for all datasets.
    Conclusion: The workflow conducts multiple pruning methods to evaluate ensemble classifiers composed from a wide range of peptide encodings and base models. Consequently, researchers can use the workflow for unsupervised encoding selection and ensemble creation. Ultimately, the extensible workflow can be used as a plugin for the PEPTIDE REACToR, further establishing it as a versatile tool in the domain.
    Language English
    Publishing date 2023-03-16
    Publishing country England
    Document type Journal Article
    ZDB-ID 2438773-3
    ISSN 1756-0381
    DOI 10.1186/s13040-022-00317-7
    Database MEDical Literature Analysis and Retrieval System OnLINE

  6. Article: DNAsmart: Multiple attribute ranking tool for DNA data storage systems.

    Ezekannagha, Chisom / Welzel, Marius / Heider, Dominik / Hattab, Georges

    Computational and structural biotechnology journal

    2023  Volume 21, Page(s) 1448–1460

    Abstract With the ever-growing need for data storage capacity, the Deoxyribonucleic Acid (DNA) molecule is gaining traction as a new storage medium with a larger capacity, higher density, and a longer lifespan than conventional storage media. To effectively use DNA for data storage, it is important to understand the different methods of encoding information in DNA and compare their effectiveness. This requires evaluating which decoded DNA sequences carry the most encoded information based on various attributes. However, navigating the field of coding theory requires years of experience and domain expertise. For instance, domain experts rely on various mathematical functions and attributes to score and evaluate their encodings. To enable such analytical tasks, we provide an interactive and visual analytical framework for multi-attribute ranking in DNA storage systems. Our framework follows a three-step view with user-settable parameters. It enables users to find the optimal en-/de-coding approaches by setting different weights and combining multiple attributes. We assess the validity of our work through a task-specific user study with domain experts based on three tasks. Results indicate that all participants completed their tasks successfully in under two minutes, then rated the framework for design choices, perceived usefulness, and intuitiveness. In addition, two real-world use cases are shared and analyzed as direct applications of the proposed tool. DNAsmart enables the ranking of decoded sequences based on multiple attributes. In sum, this work makes the evaluation of en-/de-coding approaches accessible and tractable through visualization and interactivity to solve comparison and ranking tasks.
    Language English
    Publishing date 2023-02-10
    Publishing country Netherlands
    Document type Journal Article
    ZDB-ID 2694435-2
    ISSN 2001-0370
    DOI 10.1016/j.csbj.2023.02.016
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article ; Online: Interactive polar diagrams for model comparison.

    Anžel, Aleksandar / Heider, Dominik / Hattab, Georges

    Computer methods and programs in biomedicine

    2023  Volume 242, Page(s) 107843

    Abstract Objective: Evaluating the performance of multiple complex models, such as those found in biology, medicine, climatology, and machine learning, using conventional approaches is often challenging when using various evaluation metrics simultaneously. The traditional approach, which relies on presenting multi-model evaluation scores in a table, presents an obstacle when determining the similarities between the models and the order of performance.
    Methods: By combining statistics, information theory, and data visualization, juxtaposed Taylor and Mutual Information Diagrams permit users to track and summarize the performance of one model or a collection of different models. To uncover linear and nonlinear relationships between models, users may visualize one or both charts.
    Results: Our library presents the first publicly available implementation of the Mutual Information Diagram and its new interactive capabilities, as well as the first publicly available implementation of an interactive Taylor Diagram. Extensions have been implemented so that both diagrams can display temporality, multimodality, and multivariate data sets, and feature one scalar model property such as uncertainty. Our library, named polar-diagrams, supports both continuous and categorical attributes.
    Conclusion: The library can be used to quickly and easily assess the performances of complex models, such as those found in machine learning, climate, or biomedical domains.
    Language English
    Publishing date 2023-10-06
    Publishing country Ireland
    Document type Journal Article
    ZDB-ID 632564-6
    ISSN 1872-7565 ; 0169-2607
    ISSN (online) 1872-7565
    ISSN 0169-2607
    DOI 10.1016/j.cmpb.2023.107843
    Database MEDical Literature Analysis and Retrieval System OnLINE

  8. Book ; Online ; Thesis: Application of Machine Learning in the Detection of Antimicrobial Resistance

    Ren, Yunxiao [Author] / Heider, Dominik [Academic supervisor]

    2024  

    Author's details Yunxiao Ren ; Supervisor: Dominik Heider
    Keywords Medicine, Health
    Subject code sg610
    Language English
    Publisher Philipps-Universität Marburg
    Publishing place Marburg
    Document type Book ; Online ; Thesis
    Database Digital theses on the web

  9. Book ; Online ; Thesis: Analyse von Ursachen und Co-Faktoren bei frustraner mechanischer Rekanalisation beim ischämischen Schlaganfall

    Heider, Dominik [Author]

    2020  

    Author's details Dominik Michael Heider
    Keywords Science
    Subject code sg500
    Language German
    Publisher Saarländische Universitäts- und Landesbibliothek
    Publishing place Saarbrücken
    Document type Book ; Online ; Thesis
    Database Digital theses on the web

  10. Article: Gaussian noise up-sampling is better suited than SMOTE and ADASYN for clinical decision making.

    Beinecke, Jacqueline / Heider, Dominik

    BioData mining

    2021  Volume 14, Issue 1, Page(s) 49

    Abstract Clinical data sets have very special properties and suffer from many caveats in machine learning. They typically show a high class imbalance, have a small number of samples and a large number of parameters, and have missing values. While feature selection approaches and imputation techniques address the latter problems, the class imbalance is typically addressed using augmentation techniques. However, these techniques have been developed for big data analytics, and their suitability for clinical data sets is unclear. This study analyzed different augmentation techniques for use in clinical data sets and subsequent employment of machine learning-based classification. It turns out that Gaussian Noise Up-Sampling (GNUS) is generally, though not always, as good as SMOTE and ADASYN and even outperforms them on some datasets. However, it has also been shown that augmentation does not improve classification at all in some cases.
    Language English
    Publishing date 2021-11-29
    Publishing country England
    Document type Journal Article
    ZDB-ID 2438773-3
    ISSN 1756-0381
    DOI 10.1186/s13040-021-00283-6
    Database MEDical Literature Analysis and Retrieval System OnLINE
