LIVIVO - The Search Portal for Life Sciences

Search results

Result 1 - 10 of total 162

  1. Book ; Online: Fundamentals of Clinical Data Science

    Kubben, Pieter / Dumontier, Michel / Dekker, Andre

    2019  

    Author's details edited by Pieter Kubben, Michel Dumontier, Andre Dekker
    Keywords Medical records/Data processing ; Bioinformatics
    Subject code 502.85
    Language English
    Size 1 online resource (VIII, 219 p., 45 illus., 35 illus. in color)
    Publisher Springer International Publishing ; Imprint: Springer
    Publishing place Cham
    Document type Book ; Online
    HBZ-ID HT019924497
    ISBN 978-3-319-99713-1 ; 9783319997124 ; 9783319997148 ; 3-319-99713-0 ; 3319997122 ; 3319997149
    DOI 10.1007/978-3-319-99713-1
    Database ZB MED Catalogue: Medicine, Health, Nutrition, Environment, Agriculture

  2. Book ; Online: Fundamentals of Clinical Data Science

    Kubben, Pieter / Dumontier, Michel / Dekker, Andre

    2019  

    Keywords Medical equipment & techniques ; Life sciences: general issues ; Medicine ; Health informatics ; Bioinformatics
    Size 1 electronic resource (219 pages)
    Publisher Springer Nature
    Publishing place Cham
    Document type Book ; Online
    Note English ; Open Access
    HBZ-ID HT021028824
    ISBN 978-3-319-99713-1 ; 3-319-99713-0
    Database ZB MED Catalogue: Medicine, Health, Nutrition, Environment, Agriculture

  3. Article ; Online: Generating synthetic personal health data using conditional generative adversarial networks combining with differential privacy.

    Sun, Chang / van Soest, Johan / Dumontier, Michel

    Journal of biomedical informatics

    2023  Volume 143, Page(s) 104404

    Abstract A large amount of personal health data that is highly valuable to the scientific community is still not accessible or requires a lengthy request process due to privacy concerns and legal restrictions. As a solution, synthetic data has been studied and proposed to be a promising alternative to this issue. However, generating realistic and privacy-preserving synthetic personal health data retains challenges such as simulating the characteristics of the patients' data that are in the minority classes, capturing the relations among variables in imbalanced data and transferring them to the synthetic data, and preserving individual patients' privacy. In this paper, we propose a differentially private conditional Generative Adversarial Network model (DP-CGANS) consisting of data transformation, sampling, conditioning, and network training to generate realistic and privacy-preserving personal data. Our model distinguishes categorical and continuous variables and transforms them into latent space separately for better training performance. We tackle the unique challenges of generating synthetic patient data due to the special data characteristics of personal health data. For example, patients with a certain disease are typically the minority in the dataset and the relations among variables are crucial to be observed. Our model is structured with a conditional vector as an additional input to present the minority class in the imbalanced data and maximally capture the dependency between variables. Moreover, we inject statistical noise into the gradients in the networking training process of DP-CGANS to provide a differential privacy guarantee. We extensively evaluate our model with state-of-the-art generative models on personal socio-economic datasets and real-world personal health datasets in terms of statistical similarity, machine learning performance, and privacy measurement. 
We demonstrate that our model outperforms other comparable models, especially in capturing the dependence between variables. Finally, we present the balance between data utility and privacy in synthetic data generation considering the different data structures and characteristics of real-world personal health data such as imbalanced classes, abnormal distributions, and data sparsity.
    MeSH term(s) Humans ; Privacy ; Machine Learning ; Minority Groups
    Language English
    Publishing date 2023-06-01
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2057141-0
    ISSN (online) 1532-0480
    ISSN 1532-0464
    DOI 10.1016/j.jbi.2023.104404
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Article ; Online: Towards an extensible FAIRness assessment of FAIR Digital Objects

    Emonet, Vincent / Çelebi, Remzi / Yang, Jinzhou / Dumontier, Michel

    Research Ideas and Outcomes. 2022 Oct. 12, v. 8, p. e94988

    2022  

    Abstract The objective of the FAIR Digital Objects Framework (FDOF) is for objects published in a digital environment to comply with a set of requirements, such as identifiability, and the use of a rich metadata record (Santos 2021, Schultes and Wittenburg 2019, Schwardmann 2020). With the increasing prevalence of the FAIR (Findable, Accessible, Interoperable, Reusable) principles, and FAIR Digital Objects (FDO), used within different communities and domains (Wise et al. 2019), there will be a need to evaluate whether a FDO meets the requirements of the ecosystem in which it is used. Without a dedicated framework, communities will develop isolated assessment systems from the ground up (Sun et al. 2022, Bahim et al. 2020), which will cost them time, and lead to FAIRness assessments with limited interoperability and comparability. Previous work from the FAIR Metrics working group defined a framework for deploying individual FAIR metrics tests as separate services endpoints (Wilkinson et al. 2018, Wilkinson et al. 2019). To work in accordance with this framework, each test should take a subject URL as input, and return a score, either 0 or 1, a test version, and the test execution logs. A central service can then be used to assess the FAIRness of digital objects using collections of individual assessments. Such a framework could be easily extended, but there are currently no guidelines or tools to implement and publish new FAIRness assessments complying with this framework. To amend this problem, we published the fair-test library in python and its documentation, which help with developing and deploying individual FAIRness assessments. With this library, developers define their metric tests using custom python objects, which will guide them to provide all required metadata for their test as attributes, and implement the test evaluation logic as a function. 
The library also provides additional helper functions for common tasks, such as retrieving metadata from a URL, or testing a metric test. These tests can then be deployed as a web API, and registered in a central FAIR evaluation service supporting the FAIR metrics working group framework, such as FAIR enough or the FAIR evaluator. Finally, users of the evaluation services will be able to group the registered metrics tests in collections used to assess the quality of publicly available digital objects. There are currently as many as 47 tests that have been defined to assess compliance with various FAIR metrics, of which 25 have been defined using the fair-test library, including assessing if the identifier used is persistent, or if the metadata record attached to a digital object complies with a specific schema. This presentation introduces a user-friendly and extensible tool, which can assess whether specific requirements are met for a digital resource. Our contributions are: (1) developing and publishing the fair-test library to make the development and deployment of independent FAIRness assessment tests easier; (2) developing and publishing tests in Python for existing FAIR metrics: 23 generic tests covering most of the FAIR metrics, and 2 domain-specific tests for the Rare Disease research community. We aim to engage with the FDO community to explore potential use-cases for an extensible tool to evaluate FDOs, and discuss their expectations related to the evaluation of digital objects. Insights and guidelines from the FDO community would contribute to further improving the fair-test ecosystem. Among improvements that are currently under consideration, we can cite improving the collaborative aspect of metadata extraction, or adding new metadata to be returned by the tests.
    Keywords compliance ; ecosystems ; metadata ; research ; FAIR evaluations ; library ; validation
    Language English
    Dates of publication 2022-10-12
    Publisher Pensoft Publishers
    Document type Article ; Online
    ZDB-ID 2833254-4
    ISSN 2367-7163
    DOI 10.3897/rio.8.e94988
    Database NAL-Catalogue (AGRICOLA)

  5. Book ; Online: Models towards Risk Behavior Prediction and Analysis

    Adekunle, Onaopepo / Riedl, Arno / Dumontier, Michel

    A Netherlands Case study

    2023  

    Abstract In many countries financial service providers have to elicit their customers' risk preferences when offering products and services. For instance, in the Netherlands pension funds will be legally obliged to factor in their clients' risk preferences when devising their investment strategies. Therefore, assessing and measuring the risk preferences of individuals is critical for the analysis of individuals' behavior and policy prescriptions. In psychology and economics, a number of methods to elicit risk preferences have been developed using hypothetical scenarios and economic experiments. These methods of eliciting individual risk preferences are usually applied to small samples because they are expensive and the implementation can be complex and not suitable when large cohorts need to be measured. A large number of supervised learning models ranging from linear regression to support vector machines are used to predict risk preference measures using socio-economic register data such as age, gender, migration background and other demographic variables in combination with data on income, wealth, pension fund contributions, and other financial data. The employed machine learning models cover a range of assumptions and properties as well as a diverse set of regression metrics. The optimum model is selected using the metrics and interpretability of the model. The optimal models are lasso regression and gradient boosting machines with mean average percentage error of about 30%. This is important as it helps to estimate risk attitudes without actually measuring them. It should be noted that with the current accuracy the tested models are not ready for deployment for applications that require high accuracy. However, the results do indicate which models should be used in situations that do not require the most accurate predictions such as augmentation data for pensions' recommendation.
    Keywords Computer Science - Computational Engineering, Finance, and Science
    Subject code 310
    Publishing date 2023-11-07
    Publishing country United States
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  6. Article ; Online: FAIR Principles for Clinical Practice Guidelines in a Learning Health System.

    Leung, Tiffany I / Dumontier, Michel

    Studies in health technology and informatics

    2019  Volume 264, Page(s) 1690–1691

    Abstract The learning health system depends on a cycle of evidence generation, translation to practice, and continuous practice-based data collection. Clinical practice guidelines (CPGs) represent medical evidence, translated into recommendations on appropriate clinical care. The FAIR guiding principles offer a framework for publishing the extensive knowledge work of CPGs and their resources. In this narrative literature review, we propose that FAIR CPGs would lead to more efficient production and dissemination of CPG knowledge to practice.
    MeSH term(s) Data Accuracy ; Government Programs ; Health Information Systems ; Practice Guidelines as Topic
    Language English
    Publishing date 2019-08-12
    Publishing country Netherlands
    Document type Journal Article ; Review
    ISSN (online) 1879-8365
    DOI 10.3233/SHTI190599
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article ; Online: Putting FAIR Evidence into Practice.

    Leung, Tiffany I / Dumontier, Michel

    Journal of general internal medicine

    2019  Volume 34, Issue 8, Page(s) 1369

    MeSH term(s) Humans ; Learning Health System
    Language English
    Publishing date 2019-05-06
    Publishing country United States
    Document type Letter ; Comment
    ZDB-ID 639008-0
    ISSN (online) 1525-1497
    ISSN 0884-8734
    DOI 10.1007/s11606-019-05021-7
    Database MEDical Literature Analysis and Retrieval System OnLINE

  8. Article ; Online: SIENA: Semi-automatic semantic enhancement of datasets using concept recognition.

    Grigoriu, Andreea / Zaveri, Amrapali / Weiss, Gerhard / Dumontier, Michel

    Journal of biomedical semantics

    2021  Volume 12, Issue 1, Page(s) 5

    Abstract Background: The amount of available data, which can facilitate answering scientific research questions, is growing. However, the different formats of published data are expanding as well, creating a serious challenge when multiple datasets need to be integrated for answering a question.
    Results: This paper presents a semi-automated framework that provides semantic enhancement of biomedical data, specifically gene datasets. The framework involved a concept recognition task using machine learning, in combination with the BioPortal annotator. Compared to using methods which require only the BioPortal annotator for semantic enhancement, the proposed framework achieves the highest results.
    Conclusions: Using concept recognition combined with machine learning techniques and annotation with a biomedical ontology, the proposed framework can provide datasets to reach their full potential of providing meaningful information, which can answer scientific research questions.
    MeSH term(s) Biological Ontologies ; Machine Learning ; Semantics
    Language English
    Publishing date 2021-03-24
    Publishing country England
    Document type Journal Article ; Research Support, N.I.H., Extramural
    ZDB-ID 2548651-2
    ISSN (online) 2041-1480
    ISSN 2041-1480
    DOI 10.1186/s13326-021-00239-z
    Database MEDical Literature Analysis and Retrieval System OnLINE

  9. Article ; Online: Relation extraction from DailyMed structured product labels by optimally combining crowd, experts and machines.

    Shingjergji, Krist / Celebi, Remzi / Scholtes, Jan / Dumontier, Michel

    Journal of biomedical informatics

    2021  Volume 122, Page(s) 103902

    Abstract The effectiveness of machine learning models to provide accurate and consistent results in drug discovery and clinical decision support is strongly dependent on the quality of the data used. However, substantive amounts of open data that drive drug discovery suffer from a number of issues including inconsistent representation, inaccurate reporting, and incomplete context. For example, databases of FDA-approved drug indications used in computational drug repositioning studies do not distinguish between treatments that simply offer symptomatic relief from those that target the underlying pathology. Moreover, drug indication sources often lack proper provenance and have little overlap. Consequently, new predictions can be of poor quality as they offer little in the way of new insights. Hence, work remains to be done to establish higher quality databases of drug indications that are suitable for use in drug discovery and repositioning studies. Here, we report on the combination of weak supervision (i.e., programmatic labeling and crowdsourcing) and deep learning methods for relation extraction from DailyMed text to create a higher quality drug-disease relation dataset. The generated drug-disease relation data shows a high overlap with DrugCentral, a manually curated dataset. Using this dataset, we constructed a machine learning model to classify relations between drugs and diseases from text into four categories; treatment, symptomatic relief, contradiction, and effect, exhibiting an improvement of 15.5% with Bi-LSTM (F1 score of 71.8%) over the best performing discrete method. Access to high quality data is crucial to building accurate and reliable drug repurposing prediction models. Our work suggests how the combination of crowds, experts, and machine learning methods can go hand-in-hand to improve datasets and predictive models.
    MeSH term(s) Crowdsourcing ; Drug Repositioning ; Machine Learning
    Language English
    Publishing date 2021-09-01
    Publishing country United States
    Document type Journal Article
    ZDB-ID 2057141-0
    ISSN (online) 1532-0480
    ISSN 1532-0464
    DOI 10.1016/j.jbi.2021.103902
    Database MEDical Literature Analysis and Retrieval System OnLINE

  10. Book ; Conference proceedings ; Online: ICBO 2013, International Conference on Biomedical Ontology 2013

    Dumontier, Michel

    proceedings of the 4th International Conference on Biomedical Ontology 2013, Montreal, Canada, July 7 - 12, 2013

    (CEUR workshop proceedings ; 1060)

    2013  

    Event/congress ICBO (4, 2013.07.07-12, Montreal) ; International Conference on Biomedical Ontology (4, 2013.07.07-12, Montreal)
    Author's details ed. by Michel Dumontier
    Series title CEUR workshop proceedings ; 1060
    Language English
    Size Online resource ([139] pp.)
    Publisher RWTH
    Publishing place Aachen
    Document type Book ; Conference proceedings ; Online
    Note Includes bibliographical references
    Database Library catalogue of the German National Library of Science and Technology (TIB), Hannover
