LIVIVO - The Search Portal for Life Sciences


Search results

Result 1 - 10 of total 71

  1. Article: Association between Asthma and Lower Levels of Physical Activity: Results of a Population-Based Case-Control Study in Spain.

    De-Miguel-Diez, Javier / Llamas-Saez, Carlos / Vaquero, Teresa Saez / Jiménez-García, Rodrigo / López-de-Andrés, Ana / Carabantes-Alarcón, David / Carricondo, Francisco / Romero-Gómez, Barbara / Pérez-Farinos, Napoleón

    Journal of clinical medicine

    2024  Volume 13, Issue 2

    Abstract (1) Background: Our aim was to determine changes in the prevalence of physical activity (PA) in adults with asthma between 2014 and 2020 in Spain, investigate sex differences and the effect of other variables on adherence to PA, and compare the prevalence of PA between individuals with and without asthma. (2) Methods: This study was a cross-sectional, population-based, matched, case-control study using European Health Interview Surveys for Spain (EHISS) for 2014 and 2020. (3) Results: We identified 1262 and 1103 patients with asthma in the 2014 and 2020 EHISS, respectively. The prevalence of PA remained stable (57.2% vs. 55.7%, respectively), while the percentage of persons who reported walking continuously for at least 2 days a week increased from 73.9% to 82.2% (
    Language English
    Publishing date 2024-01-19
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2662592-1
    ISSN 2077-0383
    DOI 10.3390/jcm13020591
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Article: New methodology to measure in vivo permeance on blueberry (Vaccinium corymbosum) skin: A correlation to quality during storage

    Ramírez, Lina / Sáez, Carlos / Matiacevich, Silvia

    Postharvest biology and technology. 2019 Apr. 25

    2019  

    Abstract A new methodology to measure the in vivo water permeance (Pw) of the blueberry cuticle, and its correlation with blueberry quality parameters during storage, is presented. Physical surface properties [contact angle (θ), surface free energy (γSV), drop volume (Vd) and Pw] along with quality parameters (soluble solids, firmness, moisture content, and percentages of shriveled blueberries and blueberries with molds) were determined at different storage times. Neither θ nor γSV changed significantly, showing that the cuticle composition did not change. Additionally, Vd decreased linearly over time, indicating that Pw was constant during the measurement; furthermore, Pw magnitude decreased after 7 days of storage, indicating a structural change in the cuticle. Although soluble solids, firmness and moisture content did not change during the storage, the percentages of shriveled blueberries and blueberries with molds increased significantly, and both were correlated with Pw change. Therefore, this new methodology can be used to measure blueberry quality.
    Keywords Gibbs free energy ; Vaccinium corymbosum ; blueberries ; contact angle ; firmness ; fruit quality ; storage time ; total soluble solids ; water content
    Language English
    Dates of publication 2019-04-25
    Publishing place Elsevier B.V.
    Document type Article
    Note Pre-press version
    ZDB-ID 1082798-5
    ISSN 1873-2356 ; 0925-5214
    ISSN (online) 1873-2356
    ISSN 0925-5214
    DOI 10.1016/j.postharvbio.2019.04.020
    Database NAL-Catalogue (AGRICOLA)

  3. Article ; Online: Potential limitations in COVID-19 machine learning due to data source variability: A case study in the nCov2019 dataset.

    Sáez, Carlos / Romero, Nekane / Conejero, J Alberto / García-Gómez, Juan M

    Journal of the American Medical Informatics Association : JAMIA

    2020  Volume 28, Issue 2, Page(s) 360–364

    Abstract Objective: The lack of representative coronavirus disease 2019 (COVID-19) data is a bottleneck for reliable and generalizable machine learning. Data sharing is insufficient without data quality, in which source variability plays an important role. We showcase and discuss potential biases from data source variability for COVID-19 machine learning.
    Materials and methods: We used the publicly available nCov2019 dataset, including patient-level data from several countries. We aimed to discover and classify severity subgroups using symptoms and comorbidities.
    Results: Cases from the 2 countries with the highest prevalence were divided into separate subgroups with distinct severity manifestations. This variability can reduce the representativeness of training data with respect to the model target populations and increase model complexity at risk of overfitting.
    Conclusions: Data source variability is a potential contributor to bias in distributed research networks. We call for systematic assessment and reporting of data source variability and data quality in COVID-19 data sharing, as key information for reliable and generalizable machine learning.
    MeSH term(s) Adult ; Aged ; COVID-19/classification ; Computer Communication Networks ; Data Accuracy ; Datasets as Topic/standards ; Female ; Humans ; Information Dissemination ; Machine Learning ; Male ; Middle Aged ; Patient Acuity
    Keywords covid19
    Language English
    Publishing date 2020-10-07
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 1205156-1
    ISSN 1527-974X ; 1067-5027
    ISSN (online) 1527-974X
    ISSN 1067-5027
    DOI 10.1093/jamia/ocaa258
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Article ; Online: Kinematics of Big Biomedical Data to characterize temporal variability and seasonality of data repositories: Functional Data Analysis of data temporal evolution over non-parametric statistical manifolds.

    Sáez, Carlos / García-Gómez, Juan M

    International journal of medical informatics

    2018  Volume 119, Page(s) 109–124

    Abstract Aim: The increasing availability of Big Biomedical Data is leading to large research data samples collected over long periods of time. We propose the analysis of the kinematics of data probability distributions over time towards the characterization of data temporal variability.
    Methods: First, we propose a kinematic model based on the estimation of a continuous data temporal trajectory, using Functional Data Analysis over the embedding of a non-parametric statistical manifold whose points represent data temporal batches, the Information Geometric Temporal (IGT) plot. This model allows measuring the velocity and acceleration of data changes. Next, we propose a coordinate-free method to characterize the oriented seasonality of data based on the parallelism of lagged velocity vectors of the data trajectory throughout the IGT space, the Auto-Parallelism of Velocity Vectors (APVV) and APVVmap. Finally, we automatically explain the maximum variance components of the IGT space coordinates by means of correlating data points with known temporal factors from the domain application.
    Materials: Methods are evaluated on the US National Hospital Discharge Survey open dataset, consisting of 3.25M hospital discharges between 2000 and 2010.
    Results: Seasonal and abrupt behaviours were present on the estimated multivariate and univariate data trajectories. The kinematic analysis revealed seasonal effects and punctual increments in data celerity, the latter mainly related to abrupt changes in coding. The APVV and APVVmap revealed oriented seasonal changes on data trajectories. For most variables, their distributions tended to change in the same direction at a 12-month period, with a peak of change of directionality at mid-year and year-end. Diagnosis and Procedure codes also included a 9-month periodic component. Kinematics and APVV methods were able to detect seasonal effects on extreme temporal subgrouped data, such as in Procedure code, where Fourier and autocorrelation methods could not. The automated explanation of IGT space coordinates was consistent with the results provided by the kinematic and seasonal analysis. Coordinates received different meanings according to the trajectory trend, seasonality and abrupt changes.
    Discussion: Treating data as a particle moving over time through a multidimensional probabilistic space and studying the kinematics of its trajectory has turned out to be a new temporal variability methodology. Its results on the NHDS were aligned with the dataset and population descriptions found in the literature, contributing a novel temporal variability characterization. We have demonstrated that the APVV and APVVmap are appropriate tools for the coordinate-free and oriented analysis of trajectories or complex multivariate signals.
    Conclusion: The proposed methods comprise an exploratory methodology for the characterization of data temporal variability, which may be useful for a reliable reuse of Big Biomedical Data repositories acquired over long periods of time.
    MeSH term(s) Biomechanical Phenomena ; Biomedical Research ; Data Analysis ; Data Interpretation, Statistical ; Humans ; Models, Statistical ; Seasons ; Spatio-Temporal Analysis
    Language English
    Publishing date 2018-09-17
    Publishing country Ireland
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 1466296-6
    ISSN 1872-8243 ; 1386-5056
    ISSN (online) 1872-8243
    ISSN 1386-5056
    DOI 10.1016/j.ijmedinf.2018.09.015
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Article ; Online: Multisource and temporal variability in Portuguese hospital administrative datasets: Data quality implications.

    Souza, Júlio / Caballero, Ismael / Vasco Santos, João / Lobo, Mariana / Pinto, Andreia / Viana, João / Sáez, Carlos / Lopes, Fernando / Freitas, Alberto

    Journal of biomedical informatics

    2022  Volume 136, Page(s) 104242

    Abstract Background: Unexpected variability across healthcare datasets may indicate data quality issues and thereby affect the credibility of these data for reutilization. No gold-standard reference dataset or methods for variability assessment are usually available for these datasets. In this study, we aim to describe the process of discovering data quality implications by applying a set of methods for assessing variability between sources and over time in a large hospital database.
    Methods: We described and applied a set of multisource and temporal variability assessment methods in a large Portuguese hospitalization database, in which variation in condition-specific hospitalization ratios derived from clinically coded data was assessed between hospitals (sources) and over time. We identified condition-specific admissions using the Clinical Classifications Software (CCS), developed by the Agency for Healthcare Research and Quality. A Statistical Process Control (SPC) approach based on funnel plots of condition-specific standardized hospitalization ratios (SHR) was used to assess multisource variability, whereas temporal heat maps and Information-Geometric Temporal (IGT) plots were used to assess temporal variability by displaying temporal abrupt changes in data distributions. Results were presented for the 15 most common inpatient conditions (CCS) in Portugal.
    Main findings: Funnel plot assessment allowed the detection of several outlying hospitals whose SHRs were much lower or higher than expected. Adjusting SHR for hospital characteristics, beyond age and sex, considerably affected the degree of multisource variability for most diseases. Overall, probability distributions changed over time for most diseases, although heterogeneously. Abrupt temporal changes in data distributions for acute myocardial infarction and congestive heart failure coincided with the periods comprising the transition to the International Classification of Diseases, 10th revision, Clinical Modification, whereas changes in the Diagnosis-Related Groups software seem to have driven changes in data distributions for both acute myocardial infarction and liveborn admissions. The analysis of heat maps also allowed the detection of several discontinuities at hospital level over time, in some cases also coinciding with the aforementioned factors.
    Conclusions: This paper described the successful application of a set of reproducible, generalizable and systematic methods for variability assessment, including visualization tools that can be useful for detecting abnormal patterns in healthcare data, also addressing some limitations of common approaches. The presented method for multisource variability assessment is based on SPC, which is an advantage considering the lack of a gold standard for such a process. Properly controlling for hospital characteristics and differences in case-mix for estimating SHR is critical for isolating data quality-related variability among data sources. The use of IGT plots provides an advantage over common methods for temporal variability assessment due to its suitability for multitype and multimodal data, which are common characteristics of healthcare data. The novelty of this work is the use of a set of methods to discover new data quality insights in healthcare data.
    MeSH term(s) Humans ; Portugal ; Data Accuracy ; Hospitals ; Hospitalization ; Myocardial Infarction
    Language English
    Publishing date 2022-11-11
    Publishing country United States
    Document type Journal Article
    ZDB-ID 2057141-0
    ISSN 1532-0480 ; 1532-0464
    ISSN (online) 1532-0480
    ISSN 1532-0464
    DOI 10.1016/j.jbi.2022.104242
    Database MEDical Literature Analysis and Retrieval System OnLINE
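
The funnel-plot idea in the abstract above can be illustrated with a minimal sketch. It assumes the observed count is Poisson with mean equal to the expected count, so that the SHR (observed / expected) has variance 1 / expected, and it uses 3-sigma control limits around a target of 1. The abstract does not state the limits or adjustments the authors used; the function name `shr_funnel` and the default `z=3.0` are hypothetical choices for illustration only.

```python
import math


def shr_funnel(observed, expected, z=3.0):
    """Return (shr, lower, upper, flag) for one hospital.

    Assumption (not from the source): normal approximation to a Poisson
    count, giving control limits 1 +/- z * sqrt(1 / expected).
    """
    if expected <= 0:
        raise ValueError("expected count must be positive")
    shr = observed / expected
    half_width = z * math.sqrt(1.0 / expected)
    lower = max(0.0, 1.0 - half_width)  # a ratio cannot be negative
    upper = 1.0 + half_width
    if shr < lower:
        flag = "low"
    elif shr > upper:
        flag = "high"
    else:
        flag = "in control"
    return shr, lower, upper, flag


if __name__ == "__main__":
    # Hypothetical (observed, expected) admission counts per hospital.
    hospitals = {"A": (150, 100.0), "B": (100, 100.0), "C": (60, 100.0)}
    for name, (obs, exp) in hospitals.items():
        print(name, shr_funnel(obs, exp))
```

The limits narrow as the expected count grows, which is what gives the plot its funnel shape: small hospitals need a larger deviation from 1 before they are flagged.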

  6. Article ; Online: Subphenotyping of Mexican Patients With COVID-19 at Preadmission To Anticipate Severity Stratification: Age-Sex Unbiased Meta-Clustering Technique.

    Zhou, Lexin / Romero-García, Nekane / Martínez-Miranda, Juan / Conejero, J Alberto / García-Gómez, Juan M / Sáez, Carlos

    JMIR public health and surveillance

    2022  Volume 8, Issue 3, Page(s) e30032

    Abstract Background: The COVID-19 pandemic has led to an unprecedented global health care challenge for both medical institutions and researchers. Recognizing different COVID-19 subphenotypes (the division of populations of patients into more meaningful subgroups driven by clinical features) and their severity characterization may assist clinicians during the clinical course, the vaccination process, research efforts, the surveillance system, and the allocation of limited resources.
    Objective: We aimed to discover age-sex unbiased COVID-19 patient subphenotypes based on easily available phenotypical data before admission, such as pre-existing comorbidities, lifestyle habits, and demographic features, to study the potential early severity stratification capabilities of the discovered subgroups through characterizing their severity patterns, including prognostic, intensive care unit (ICU), and morbimortality outcomes.
    Methods: We used the Mexican Government COVID-19 open data, including 778,692 SARS-CoV-2 population-based patient-level data as of September 2020. We applied a meta-clustering technique that consists of a 2-stage clustering approach combining dimensionality reduction (ie, principal components analysis and multiple correspondence analysis) and hierarchical clustering using the Ward minimum variance method with Euclidean squared distance.
    Results: In the independent age-sex clustering analyses, 56 clusters supported 11 clinically distinguishable meta-clusters (MCs). MCs 1-3 showed high recovery rates (90.27%-95.22%), including healthy patients of all ages, children with comorbidities and priority in receiving medical resources (ie, higher rates of hospitalization, intubation, and ICU admission) compared with other adult subgroups that have similar conditions, and young obese smokers. MCs 4-5 showed moderate recovery rates (81.30%-82.81%), including patients with hypertension or diabetes of all ages and obese patients with pneumonia, hypertension, and diabetes. MCs 6-11 showed low recovery rates (53.96%-66.94%), including immunosuppressed patients with high comorbidity rates, patients with chronic kidney disease with a poor survival length and probability of recovery, older smokers with chronic obstructive pulmonary disease, older adults with severe diabetes and hypertension, and the oldest obese smokers with chronic obstructive pulmonary disease and mild cardiovascular disease. Group outcomes conformed to the recent literature on dedicated age-sex groups. Mexican states and several types of clinical institutions showed relevant heterogeneity regarding severity, potentially linked to socioeconomic or health inequalities.
    Conclusions: The proposed 2-stage cluster analysis methodology produced a discriminative characterization of the sample and explainability over age and sex. These results can potentially help in understanding the clinical patient and their stratification for automated early triage before further tests and laboratory results are available and even in locations where additional tests are not available or to help decide resource allocation among vulnerable subgroups such as to prioritize vaccination or treatments.
    MeSH term(s) Aged ; COVID-19/epidemiology ; Child ; Cluster Analysis ; Humans ; Intensive Care Units ; Pandemics ; SARS-CoV-2
    Language English
    Publishing date 2022-03-30
    Publishing country Canada
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ISSN 2369-2960
    ISSN (online) 2369-2960
    DOI 10.2196/30032
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article ; Online: Data-driven discovery of changes in clinical code usage over time: a case-study on changes in cardiovascular disease recording in two English electronic health records databases (2001-2015).

    Rockenschaub, Patrick / Nguyen, Vincent / Aldridge, Robert W / Acosta, Dionisio / García-Gómez, Juan Miguel / Sáez, Carlos

    BMJ open

    2020  Volume 10, Issue 2, Page(s) e034396

    Abstract Objectives: To demonstrate how data-driven variability methods can be used to identify changes in disease recording in two English electronic health records databases between 2001 and 2015.
    Design: Repeated cross-sectional analysis that applied data-driven temporal variability methods to assess month-by-month changes in routinely collected medical data. A measure of difference between months was calculated based on joint distributions of age, gender, socioeconomic status and recorded cardiovascular diseases. Distances between months were used to identify temporal trends in data recording.
    Setting: 400 English primary care practices from the Clinical Practice Research Datalink (CPRD GOLD) and 451 hospital providers from the Hospital Episode Statistics (HES).
    Main outcomes: The proportion of patients (CPRD GOLD) and hospital admissions (HES) with a recorded cardiovascular disease (CPRD GOLD: coronary heart disease, heart failure, peripheral arterial disease, stroke; HES: International Classification of Disease codes I20-I69/G45).
    Results: Both databases showed gradual changes in cardiovascular disease recording between 2001 and 2008. The recorded prevalence of included cardiovascular diseases in CPRD GOLD increased by 47%-62%, which partially reversed after 2008. For hospital records in HES, there was a relative decrease in angina pectoris (-34.4%) and unspecified stroke (-42.3%) over the same time period, with a concomitant increase in chronic coronary heart disease (+14.3%). Multiple abrupt changes in the use of myocardial infarction codes in hospital were found in March/April 2010, 2012 and 2014, possibly linked to updates of clinical coding guidelines.
    Conclusions: Identified temporal variability could be related to potentially non-medical causes such as updated coding guidelines. These artificial changes may introduce temporal correlation among diagnoses inferred from routine data, violating the assumptions of frequently used statistical methods. Temporal variability measures provide an objective and robust technique to identify, and subsequently account for, those changes in electronic health records studies without any prior knowledge of the data collection process.
    MeSH term(s) Cardiovascular Diseases/epidemiology ; Clinical Coding/trends ; Cross-Sectional Studies ; Databases, Factual ; Electronic Health Records ; Humans
    Language English
    Publishing date 2020-02-13
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2599832-8
    ISSN 2044-6055
    ISSN (online) 2044-6055
    DOI 10.1136/bmjopen-2019-034396
    Database MEDical Literature Analysis and Retrieval System OnLINE

  8. Article: Cross Sectional and Case-Control Study to Assess Time Trend, Gender Differences and Factors Associated with Physical Activity among Adults with Diabetes: Analysis of the European Health Interview Surveys for Spain (2014 & 2020).

    Llamas-Saez, Carlos / Saez-Vaquero, Teresa / Jiménez-García, Rodrigo / López-de-Andrés, Ana / Carabantes-Alarcón, David / Zamorano-León, José J / Cuadrado-Corrales, Natividad / Pérez-Farinos, Napoleón / Wärnberg, Julia

    Journal of clinical medicine

    2023  Volume 12, Issue 6

    Abstract (1) Background: We aim to assess the time trend from 2014 to 2020 in the prevalence of physical activity (PA), identify gender differences and sociodemographic and health-related factors associated with PA among people with diabetes, and compare PA between people with and without diabetes. (2) Methods: We conducted a cross-sectional and a case-control study using as data source the European Health Interview Surveys for Spain (EHISS) conducted in years 2014 and 2020. The presence of diabetes and PA were self-reported. Covariates included socio-demographic characteristics, health-related variables, and lifestyles. To compare people with and without diabetes, we matched individuals by age and sex. (3) Results: The number of participants aged ≥18 years with self-reported diabetes were 1852 and 1889 in the EHISS2014 and EHISS2020, respectively. The proportion of people with diabetes that had a medium or high frequency of PA improved from 48.3% in 2014 to 52.6% in 2020 (
    Language English
    Publishing date 2023-03-22
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2662592-1
    ISSN 2077-0383
    DOI 10.3390/jcm12062443
    Database MEDical Literature Analysis and Retrieval System OnLINE

  9. Article ; Online: Stability metrics for multi-source biomedical data based on simplicial projections from probability distribution distances.

    Sáez, Carlos / Robles, Montserrat / García-Gómez, Juan M

    Statistical methods in medical research

    2017  Volume 26, Issue 1, Page(s) 312–336

    Abstract Biomedical data may be composed of individuals generated from distinct, meaningful sources. Due to possible contextual biases in the processes that generate data, there may exist an undesirable and unexpected variability among the probability distribution functions (PDFs) of the source subsamples, which, when uncontrolled, may lead to inaccurate or unreproducible research results. Classical statistical methods may have difficulties uncovering such variabilities when dealing with multi-modal, multi-type, multi-variate data. This work proposes two metrics for the analysis of stability among multiple data sources, robust to the aforementioned conditions, and defined in the context of data quality assessment. Specifically, a global probabilistic deviation metric and a source probabilistic outlyingness metric are proposed. The first provides a bounded degree of the global multi-source variability, designed as an estimator equivalent to the notion of normalized standard deviation of PDFs. The second provides a bounded degree of the dissimilarity of each source to a latent central distribution. The metrics are based on the projection of a simplex geometrical structure constructed from the Jensen-Shannon distances among the sources' PDFs. The metrics have been evaluated and shown to behave correctly on a simulated benchmark and with real multi-source biomedical data using the UCI Heart Disease data set. The biomedical data quality assessment based on the proposed stability metrics may improve the efficiency and effectiveness of biomedical data exploitation and research.
    Language English
    Publishing date 2017-02
    Publishing country England
    Document type Journal Article
    ZDB-ID 1136948-6
    ISSN 1477-0334 ; 0962-2802
    ISSN (online) 1477-0334
    ISSN 0962-2802
    DOI 10.1177/0962280214545122
    Database MEDical Literature Analysis and Retrieval System OnLINE
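
The abstract above builds its metrics on Jensen-Shannon distances among the sources' PDFs. The sketch below shows only that building block, for discrete distributions given as lists of probabilities; it does not reproduce the simplex projection or the two proposed metrics. The function names and the example source histograms are hypothetical.

```python
import math


def js_distance(p, q):
    """Jensen-Shannon distance (square root of the base-2 JS divergence).

    With base-2 logarithms the result is bounded in [0, 1].
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same support size")
    m = [(a + b) / 2.0 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence; terms with x_i == 0 contribute 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(divergence, 0.0))  # guard against tiny negatives


def pairwise_distances(sources):
    """Map each unordered pair of source names to its JS distance."""
    names = sorted(sources)
    return {
        (a, b): js_distance(sources[a], sources[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


if __name__ == "__main__":
    # Hypothetical histograms of one categorical variable in three sources.
    sources = {
        "site_1": [0.5, 0.3, 0.2],
        "site_2": [0.5, 0.3, 0.2],
        "site_3": [0.1, 0.2, 0.7],
    }
    for pair, d in pairwise_distances(sources).items():
        print(pair, round(d, 4))
```

Identical sources give a distance of 0 and sources with disjoint support give 1, which is why a bounded, symmetric distance of this kind suits comparisons across many sources.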

  10. Article ; Online: Extremely missing numerical data in Electronic Health Records for machine learning can be managed through simple imputation methods considering informative missingness: A comparative of solutions in a COVID-19 mortality case study.

    Ferri, Pablo / Romero-Garcia, Nekane / Badenes, Rafael / Lora-Pablos, David / Morales, Teresa García / Gómez de la Cámara, Agustín / García-Gómez, Juan M / Sáez, Carlos

    Computer methods and programs in biomedicine

    2023  Volume 242, Page(s) 107803

    Abstract Background and objective: Reusing Electronic Health Records (EHRs) for Machine Learning (ML) leads on many occasions to extremely incomplete and sparse tabular datasets, which can hinder the model development processes and limit their performance and generalization. In this study, we aimed to characterize the most effective data imputation techniques and ML models for dealing with highly missing numerical data in EHRs, in the case where only a very limited number of data are complete, as opposed to the usual case of having a reduced number of missing values.
    Methods: We used a case study including full blood count laboratory data, demographic and survival data in the context of COVID-19 hospital admissions and evaluated 30 processing pipelines combining imputation methods with ML classifiers. The imputation methods included missing mask, translation and encoding, mean imputation, k-nearest neighbors imputation, Bayesian ridge regression imputation and generative adversarial imputation networks. The classifiers included k-nearest neighbors, logistic regression, random forest, gradient boosting and deep multilayer perceptron.
    Results: Our results suggest that in the presence of highly missing data, combining translation and encoding imputation (which considers informative missingness) with tree ensemble classifiers (random forest and gradient boosting) is a sensible choice when aiming to maximize performance, in terms of area under curve.
    Conclusions: Based on our findings, we recommend the consideration of this imputer-classifier configuration when constructing models in the presence of extremely incomplete numerical data in EHR.
    MeSH term(s) Humans ; Algorithms ; Electronic Health Records ; Bayes Theorem ; COVID-19 ; Machine Learning
    Language English
    Publishing date 2023-09-07
    Publishing country Ireland
    Document type Journal Article
    ZDB-ID 632564-6
    ISSN 1872-7565 ; 0169-2607
    ISSN (online) 1872-7565
    ISSN 0169-2607
    DOI 10.1016/j.cmpb.2023.107803
    Database MEDical Literature Analysis and Retrieval System OnLINE
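
As a rough illustration of what "considering informative missingness" can mean in practice, the sketch below fills each missing numeric value with its column mean and appends a 0/1 missingness indicator per column, so that a downstream classifier can still see which values were absent. This is an assumption-laden toy, not the paper's "missing mask" or "translation and encoding" procedure, whose exact definitions the abstract does not give. The function name and the example rows are hypothetical.

```python
def impute_mean_with_mask(rows):
    """Mean-impute None entries and append one indicator per column.

    rows: list of equal-length lists of numbers, with None for missing.
    Returns a new list of rows: [imputed values..., missing flags...].
    A column with no observed values is filled with 0.0 (arbitrary choice).
    """
    if not rows:
        return []
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)

    result = []
    for r in rows:
        values = [means[j] if v is None else float(v) for j, v in enumerate(r)]
        flags = [1.0 if v is None else 0.0 for v in r]
        result.append(values + flags)
    return result


if __name__ == "__main__":
    # Hypothetical lab values: two analytes, several missing.
    rows = [[1.0, None], [3.0, 4.0], [None, None]]
    for row in impute_mean_with_mask(rows):
        print(row)
```

Keeping the indicator columns matters when missingness itself carries signal, for example when a test is ordered only for sicker patients; plain mean imputation alone would erase that information.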
