LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 5 of 5

  1. Article ; Online: The validity of electronic health data for measuring smoking status: a systematic review and meta-analysis.

    Haque, Md Ashiqul / Gedara, Muditha Lakmali Bodawatte / Nickel, Nathan / Turgeon, Maxime / Lix, Lisa M

    BMC medical informatics and decision making

    2024  Volume 24, Issue 1, Page(s) 33

    Abstract Background: Smoking is a risk factor for many chronic diseases. Multiple smoking status ascertainment algorithms have been developed for population-based electronic health databases such as administrative databases and electronic medical records (EMRs). Evidence syntheses of algorithm validation studies have often focused on chronic diseases rather than risk factors. We conducted a systematic review and meta-analysis of smoking status ascertainment algorithms to describe the characteristics and validity of these algorithms.
    Methods: The Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines were followed. We searched articles published from 1990 to 2022 in EMBASE, MEDLINE, Scopus, and Web of Science with key terms such as validity, administrative data, electronic health records, smoking, and tobacco use. The extracted information, including article characteristics, algorithm characteristics, and validity measures, was descriptively analyzed. Sources of heterogeneity in validity measures were estimated using a meta-regression model. Risk of bias (ROB) in the reviewed articles was assessed using the Quality Assessment of Diagnostic Accuracy Studies-2 tool.
    Results: The initial search yielded 2086 articles; 57 were selected for review and 116 algorithms were identified. Almost three-quarters (71.6%) of algorithms were based on EMR data. The algorithms were primarily constructed using diagnosis codes for smoking-related conditions, although prescription medication codes for smoking treatments were also adopted. About half of the algorithms were developed using machine-learning models. The pooled estimates of positive predictive value, sensitivity, and specificity were 0.843, 0.672, and 0.918 respectively. Algorithm sensitivity and specificity were highly variable and ranged from 3 to 100% and 36 to 100%, respectively. Model-based algorithms had significantly greater sensitivity (p = 0.006) than rule-based algorithms. Algorithms for EMR data had higher sensitivity than algorithms for administrative data (p = 0.001). The ROB was low in most of the articles (76.3%) that underwent the assessment.
    Conclusions: Multiple algorithms using different data sources and methods have been proposed to ascertain smoking status in electronic health data. Many algorithms had low sensitivity and positive predictive value, but the data source influenced their validity. Algorithms based on machine-learning models for multiple linked data sources have improved validity.
    MeSH term(s) Humans ; Electronic Health Records ; Predictive Value of Tests ; Sensitivity and Specificity ; Smoking/epidemiology ; Algorithms ; Chronic Disease
    Language English
    Publishing date 2024-02-02
    Publishing country England
    Document type Meta-Analysis ; Systematic Review ; Journal Article
    ZDB-ID 2046490-3
    ISSN 1472-6947
    ISSN (online) 1472-6947
    DOI 10.1186/s12911-024-02416-3
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Book ; Online: Case-Base Neural Networks

    Islam, Jesse / Turgeon, Maxime / Sladek, Robert / Bhatnagar, Sahir

    survival analysis with time-varying, higher-order interactions

    2023  

    Abstract In the context of survival analysis, data-driven neural network-based methods have been developed to model complex covariate effects. While these methods may provide better predictive performance than regression-based approaches, not all can model time-varying interactions and complex baseline hazards. To address this, we propose Case-Base Neural Networks (CBNNs) as a new approach that combines the case-base sampling framework with flexible neural network architectures. Using a novel sampling scheme and data augmentation to naturally account for censoring, we construct a feed-forward neural network that includes time as an input. CBNNs predict the probability of an event occurring at a given moment to estimate the full hazard function. We compare the performance of CBNNs to regression and neural network-based survival methods in a simulation and three case studies using two time-dependent metrics. First, we examine performance on a simulation involving a complex baseline hazard and time-varying interactions to assess all methods, with CBNN outperforming competitors. Then, we apply all methods to three real data applications, with CBNNs outperforming the competing models in two studies and showing similar performance in the third. Our results highlight the benefit of combining case-base sampling with deep learning to provide a simple and flexible framework for data-driven modeling of single event survival outcomes that estimates time-varying effects and a complex baseline hazard by design. An R package is available at https://github.com/Jesse-Islam/cbnn.
    Keywords Statistics - Machine Learning ; Computer Science - Machine Learning
    Subject code 310
    Publishing date 2023-01-16
    Publishing country US
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  3. Article: A Novel SARS-CoV-2 Viral Sequence Bioinformatic Pipeline Has Found Genetic Evidence That the Viral 3' Untranslated Region (UTR) Is Evolving and Generating Increased Viral Diversity.

    Farkas, Carlos / Mella, Andy / Turgeon, Maxime / Haigh, Jody J

    Frontiers in microbiology

    2021  Volume 12, Page(s) 665041

    Abstract An unprecedented amount of SARS-CoV-2 sequencing has been performed; however, novel bioinformatic tools to cope with and process these large datasets are needed. Here, we have devised a bioinformatic pipeline that inputs SARS-CoV-2 genome sequencing in FASTA/FASTQ format and outputs a single Variant Call Format (VCF) file that can be processed to obtain variant annotations and perform downstream population genetic testing. As proof of concept, we have analyzed over 229,000 SARS-CoV-2 viral sequences up until November 30, 2020. We have identified over 39,000 variants worldwide with increased polymorphisms, spanning the ORF3a gene as well as the 3' untranslated (UTR) regions, specifically in the conserved stem loop region of SARS-CoV-2, which is accumulating greater observed viral diversity relative to chance variation. Our analysis pipeline has also discovered the existence of SARS-CoV-2 hypermutation with low frequency (in less than 2% of genomes), likely arising through host immune responses and not due to sequencing errors. Among annotated non-sense variants with a population frequency over 1%, recurrent inactivation of the ORF8 gene was found. This was found to be present in the newly identified B.1.1.7 SARS-CoV-2 lineage that originated in the United Kingdom. Almost all VOC-containing genomes possess one stop codon in the ORF8 gene (Q27
    Language English
    Publishing date 2021-06-21
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2587354-4
    ISSN 1664-302X
    DOI 10.3389/fmicb.2021.665041
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Article ; Online: Principal component of explained variance: An efficient and optimal data dimension reduction framework for association studies.

    Turgeon, Maxime / Oualkacha, Karim / Ciampi, Antonio / Miftah, Hanane / Dehghan, Golsa / Zanke, Brent W / Benedet, Andréa L / Rosa-Neto, Pedro / Greenwood, Celia M T / Labbe, Aurélie

    Statistical methods in medical research

    2016  Volume 27, Issue 5, Page(s) 1331–1350

    Abstract The genomics era has led to an increase in the dimensionality of data collected in the investigation of biological questions. In this context, dimension-reduction techniques can be used to summarise high-dimensional signals into low-dimensional ones, to further test for association with one or more covariates of interest. This paper revisits one such approach, previously known as principal component of heritability and renamed here as principal component of explained variance (PCEV). As its name suggests, the PCEV seeks a linear combination of outcomes in an optimal manner, by maximising the proportion of variance explained by one or several covariates of interest. By construction, this method optimises power; however, due to its computational complexity, it has unfortunately received little attention in the past. Here, we propose a general analytical PCEV framework that builds on the assets of the original method, i.e. conceptually simple and free of tuning parameters. Moreover, our framework extends the range of applications of the original procedure by providing a computationally simple strategy for high-dimensional outcomes, along with exact and asymptotic testing procedures that drastically reduce its computational cost. We investigate the merits of the PCEV using an extensive set of simulations. Furthermore, the use of the PCEV approach is illustrated using three examples taken from the fields of epigenetics and brain imaging.
    MeSH term(s) Analysis of Variance ; Computer Simulation ; DNA Methylation ; Data Interpretation, Statistical ; Genes/genetics ; Humans ; Models, Statistical ; Multivariate Analysis ; Neuroimaging/statistics & numerical data ; Principal Component Analysis/methods
    Language English
    Publishing date 2016-07-26
    Publishing country England
    Document type Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't ; Research Support, U.S. Gov't, Non-P.H.S.
    ZDB-ID 1136948-6
    ISSN (online) 1477-0334
    ISSN 0962-2802
    DOI 10.1177/0962280216660128
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Article ; Online: A Mendelian randomization study of the effect of type-2 diabetes on coronary heart disease.

    Ahmad, Omar S / Morris, John A / Mujammami, Muhammad / Forgetta, Vincenzo / Leong, Aaron / Li, Rui / Turgeon, Maxime / Greenwood, Celia M T / Thanassoulis, George / Meigs, James B / Sladek, Robert / Richards, J Brent

    Nature communications

    2015  Volume 6, Page(s) 7060

    Abstract In observational studies, type-2 diabetes (T2D) is associated with an increased risk of coronary heart disease (CHD), yet interventional trials have shown no clear effect of glucose-lowering on CHD. Confounding may have therefore influenced these observational estimates. Here we use Mendelian randomization to obtain unconfounded estimates of the influence of T2D and fasting glucose (FG) on CHD risk. Using multiple genetic variants associated with T2D and FG, we find that risk of T2D increases CHD risk (odds ratio (OR)=1.11 (1.05-1.17), per unit increase in odds of T2D, P=8.8 × 10^-5; using data from 34,840/114,981 T2D cases/controls and 63,746/130,681 CHD cases/controls). FG in non-diabetic individuals tends to increase CHD risk (OR=1.15 (1.00-1.32), per mmol/l, P=0.05; 133,010 non-diabetic individuals and 63,746/130,681 CHD cases/controls). These findings provide evidence supporting a causal relationship between T2D and CHD and suggest that long-term trials may be required to discern the effects of T2D therapies on CHD risk.
    MeSH term(s) Adult ; Aged ; Blood Glucose/genetics ; Blood Glucose/metabolism ; Case-Control Studies ; Causality ; Coronary Disease/epidemiology ; Coronary Disease/genetics ; Diabetes Mellitus, Type 2/epidemiology ; Diabetes Mellitus, Type 2/genetics ; Diabetes Mellitus, Type 2/metabolism ; Glycated Hemoglobin A/genetics ; Glycated Hemoglobin A/metabolism ; Humans ; Mendelian Randomization Analysis ; Middle Aged ; Odds Ratio ; Polymorphism, Single Nucleotide ; Risk Factors
    Chemical Substances Blood Glucose ; Glycated Hemoglobin A ; hemoglobin A1c protein, human
    Language English
    Publishing date 2015-05-28
    Publishing country England
    Document type Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't
    ISSN 2041-1723
    ISSN (online) 2041-1723
    DOI 10.1038/ncomms8060
    Database MEDical Literature Analysis and Retrieval System OnLINE
