LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 10 of 106 in total

  1. Article: Efficient storage and regression computation for population-scale genome sequencing studies.

    Rivas, Manuel A / Chang, Christopher

    bioRxiv : the preprint server for biology

    2024  

    Abstract: In the era of big data in human genetics, large-scale biobanks aggregating genetic data from diverse populations have emerged as important for advancing our understanding of human health and disease. However, the computational and storage demands of whole genome sequencing (WGS) studies pose significant challenges, especially for researchers from underfunded institutions or developing countries, creating a disparity in research capabilities. We introduce new approaches that significantly enhance computational efficiency and reduce data storage requirements for WGS studies. By developing algorithms for compressed storage of genetic data, focusing particularly on optimizing the representation of rare variants, and designing regression methods tailored for the scale and complexity of WGS data, we significantly lower computational and storage costs. We integrate our approach into PLINK 2.0. The implementation demonstrates considerable reductions in storage space and computational time without compromising analytical accuracy, as evidenced by the application to the All of Us project data. We improve the runtime of an exome-wide association analysis of 19.4 million variants and a single phenotype from 695.35 minutes (approximately 11.5 hours) on a single machine to 1.57 minutes using 30 GB of memory and 50 threads (8.67 minutes using 4 threads). Similarly, we generalize to multi-phenotype analysis. We anticipate that our approach will enable researchers across the globe to unlock the potential of population biobanks, accelerating the pace of discoveries that can improve our understanding of human health and disease.
    Language: English
    Publication date: 2024-04-15
    Country of publication: United States
    Document type: Preprint
    DOI: 10.1101/2024.04.11.589062
    Data source: MEDical Literature Analysis and Retrieval System OnLINE

  2. Article: Integrative machine learning approaches for predicting disease risk using multi-omics data from the UK Biobank.

    Aguilar, Oscar / Chang, Cheng / Bismuth, Elsa / Rivas, Manuel A

    bioRxiv : the preprint server for biology

    2024  

    Abstract: We train prediction and survival models using multi-omics data for disease risk identification and stratification. Existing work on disease prediction focuses on risk analysis using datasets of individual data types (metabolomic, genomic, demographic), while our study creates an integrated model for disease risk assessment. We compare machine learning models such as Lasso Regression, Multi-Layer Perceptron, XGBoost, and AdaBoost to analyze multi-omics data, incorporating ROC-AUC score comparisons for various diseases and feature combinations. Additionally, we train Cox proportional hazard models for each disease to perform survival analysis. Although the integration of multi-omics data significantly improves risk prediction for 8 diseases, we find that the contribution of metabolomic data is marginal when compared to standard demographic, genetic, and biomarker features. Nonetheless, we see that metabolomics is a useful replacement for the standard biomarker panel when that panel is not readily available.
    Language: English
    Publication date: 2024-04-20
    Country of publication: United States
    Document type: Preprint
    DOI: 10.1101/2024.04.16.589819
    Data source: MEDical Literature Analysis and Retrieval System OnLINE

  3. Article ; Online: Rare and common variant discovery in complex disease: the IBD case study.

    Venkataraman, Guhan R / Rivas, Manuel A

    Human molecular genetics

    2019  Volume 28, Issue R2, Page(s) R162–R169

    Abstract: Complex diseases such as inflammatory bowel disease (IBD), which consists of ulcerative colitis and Crohn's disease, are a significant medical burden: 70,000 new cases of IBD are diagnosed in the United States annually. In this review, we examine the history of genetic variant discovery in complex disease with a focus on IBD. We cover methods that have been applied to microsatellite, common variant, targeted resequencing and whole-exome and -genome data, specifically focusing on the progression of technologies towards rare-variant discovery. The inception of these methods combined with better availability of population level variation data has led to rapid discovery of IBD-causative and/or -associated variants at over 200 loci; over time, these methods have grown exponentially in both power and ascertainment to detect rare variation. We highlight rare-variant discoveries critical to the elucidation of the pathogenesis of IBD, including those in NOD2, IL23R, CARD9, RNF186 and ADCY7. We additionally identify the major areas of rare-variant discovery that will evolve in the coming years. A better understanding of the genetic basis of IBD and other complex diseases will lead to improved diagnosis, prognosis, treatment and surveillance.
    MeSH term(s): Asian Continental Ancestry Group/genetics ; Asian Continental Ancestry Group/statistics & numerical data ; Case-Control Studies ; European Continental Ancestry Group/genetics ; European Continental Ancestry Group/statistics & numerical data ; Genetic Linkage ; Genetic Predisposition to Disease ; Genome-Wide Association Study/history ; Genome-Wide Association Study/statistics & numerical data ; History, 20th Century ; History, 21st Century ; Humans ; Inflammatory Bowel Diseases/genetics ; Inflammatory Bowel Diseases/history ; Models, Statistical ; Polymorphism, Single Nucleotide ; Receptors, Interleukin/genetics ; Whole Exome Sequencing/statistics & numerical data
    Chemical substances: IL23R protein, human ; Receptors, Interleukin
    Language: English
    Publication date: 2019-07-29
    Country of publication: England
    Document type: Historical Article ; Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't ; Review
    ZDB-ID: 1108742-0
    ISSN (online): 1460-2083
    ISSN: 0964-6906
    DOI: 10.1093/hmg/ddz189
    Data source: MEDical Literature Analysis and Retrieval System OnLINE

  4. Article ; Online: Graphical analysis for phenome-wide causal discovery in genotyped population-scale biobanks.

    Amar, David / Sinnott-Armstrong, Nasa / Ashley, Euan A / Rivas, Manuel A

    Nature communications

    2021  Volume 12, Issue 1, Page(s) 350

    Abstract: Causal inference via Mendelian randomization requires making strong assumptions about horizontal pleiotropy, where genetic instruments are connected to the outcome not only through the exposure. Here, we present causal Graphical Analysis Using Genetics (cGAUGE), a pipeline that overcomes these limitations using instrument filters with provable properties. This is achievable by identifying conditional independencies while examining multiple traits. cGAUGE also uses ExSep (Exposure-based Separation), a novel test for the existence of causal pathways that does not require selecting instruments. In simulated data, we illustrate how cGAUGE can reduce the empirical false discovery rate by up to 30%, while retaining the majority of true discoveries. On 96 complex traits from 337,198 subjects from the UK Biobank, our results cover expected causal links and many new ones that were previously suggested by correlation-based observational studies. Notably, we identify multiple risk factors for cardiovascular disease, including red blood cell distribution width.
    MeSH term(s): Biological Specimen Banks ; Cardiovascular Diseases/genetics ; Causality ; Computer Simulation ; Gene Regulatory Networks ; Genetic Pleiotropy/genetics ; Genetic Variation ; Genome-Wide Association Study/methods ; Genotype ; Humans ; Mendelian Randomization Analysis/methods ; Models, Theoretical ; Multifactorial Inheritance/genetics ; Phenotype ; Risk Factors
    Language: English
    Publication date: 2021-01-13
    Country of publication: England
    Document type: Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't ; Research Support, U.S. Gov't, Non-P.H.S.
    ZDB-ID: 2553671-0
    ISSN (online): 2041-1723
    DOI: 10.1038/s41467-020-20516-2
    Data source: MEDical Literature Analysis and Retrieval System OnLINE

  5. Article: Large-scale multivariate sparse regression with applications to UK Biobank.

    Qian, Junyang / Tanigawa, Yosuke / Li, Ruilin / Tibshirani, Robert / Rivas, Manuel A / Hastie, Trevor

    The annals of applied statistics

    2022  Volume 16, Issue 3, Page(s) 1891–1918

    Abstract: In high-dimensional regression problems, often a relatively small subset of the features is relevant for predicting the outcome, and methods that impose sparsity on the solution are popular. When multiple correlated outcomes are available (multitask), reduced rank regression is an effective way to borrow strength and capture latent structures that underlie the data. Our proposal is motivated by the UK Biobank population-based cohort study, where we are faced with large-scale, ultrahigh-dimensional features, and have access to a large number of outcomes (phenotypes): lifestyle measures, biomarkers, and disease outcomes. We are hence led to fit sparse reduced-rank regression models, using computational strategies that allow us to scale to problems of this size. We use a scheme that alternates between solving the sparse regression problem and solving the reduced rank decomposition. For the sparse regression component we propose a scalable iterative algorithm based on adaptive screening that leverages the sparsity assumption and enables us to focus on solving much smaller subproblems. The full solution is reconstructed and tested via an optimality condition to make sure it is a valid solution for the original problem. We further extend the method to cope with practical issues, such as the inclusion of confounding variables and imputation of missing values among the phenotypes. Experiments on both synthetic data and the UK Biobank data demonstrate the effectiveness of the method and the algorithm. We present the multiSnpnet package, available at http://github.com/junyangq/multiSnpnet, which works on top of PLINK2 files and which we anticipate will be a valuable tool for generating polygenic risk scores from human genetic studies.
    Language: English
    Publication date: 2022-07-19
    Country of publication: United States
    Document type: Journal Article
    ZDB-ID: 2376910-5
    ISSN (online): 1941-7330
    ISSN: 1932-6157
    DOI: 10.1214/21-aoas1575
    Data source: MEDical Literature Analysis and Retrieval System OnLINE

  6. Article ; Online: Phenome-wide Burden of Copy-Number Variation in the UK Biobank.

    Aguirre, Matthew / Rivas, Manuel A / Priest, James

    American journal of human genetics

    2019  Volume 105, Issue 2, Page(s) 373–383

    Abstract: Copy-number variations (CNVs) represent a significant proportion of the genetic differences between individuals, and many CNVs associate causally with syndromic disease and clinical outcomes. Here, we characterize the landscape of copy-number variation and its phenome-wide effects in a sample of 472,228 array-genotyped individuals from the UK Biobank. In addition to population-level selection effects against genic loci conferring high mortality, we describe genetic burden from potentially pathogenic and previously uncharacterized CNV loci across more than 3,000 quantitative and dichotomous traits, with separate analyses for common and rare classes of variation. Specifically, we highlight the effects of CNVs at two well-known syndromic loci, 16p11.2 and 22q11.2, previously uncharacterized variation at 9p23, and several genic associations in the context of acute coronary artery disease and high body mass index. Our data constitute a deeply contextualized portrait of population-wide burden of copy-number variation, as well as a series of dosage-mediated genic associations across the medical phenome.
    MeSH term(s): Autistic Disorder/genetics ; Biological Specimen Banks ; Case-Control Studies ; Chromosome Deletion ; Chromosome Disorders/genetics ; Chromosomes, Human, Pair 16/genetics ; Chromosomes, Human, Pair 9/genetics ; Coronary Artery Disease/genetics ; DNA Copy Number Variations ; DiGeorge Syndrome/genetics ; Female ; Genetic Loci ; Genetic Predisposition to Disease ; Genome-Wide Association Study ; Genotype ; Humans ; Intellectual Disability/genetics ; Male ; Phenomics ; Phenotype ; Polymorphism, Single Nucleotide ; United Kingdom
    Language: English
    Publication date: 2019-07-25
    Country of publication: United States
    Document type: Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't
    ZDB-ID: 219384-x
    ISSN (online): 1537-6605
    ISSN: 0002-9297
    DOI: 10.1016/j.ajhg.2019.07.001
    Data source: MEDical Literature Analysis and Retrieval System OnLINE

  7. Article ; Online: Time trajectories in the transcriptomic response to exercise - a meta-analysis.

    Amar, David / Lindholm, Malene E / Norrbom, Jessica / Wheeler, Matthew T / Rivas, Manuel A / Ashley, Euan A

    Nature communications

    2021  Volume 12, Issue 1, Page(s) 3471

    Abstract: Exercise training prevents multiple diseases, yet the molecular mechanisms that drive exercise adaptation are incompletely understood. To address this, we create a computational framework comprising data from skeletal muscle or blood from 43 studies, including 739 individuals before and after exercise or training. Using linear mixed effects meta-regression, we detect specific time patterns and regulatory modulators of the exercise response. Acute and long-term responses are transcriptionally distinct and we identify SMAD3 as a central regulator of the exercise response. Exercise induces a more pronounced inflammatory response in skeletal muscle of older individuals and our models reveal multiple sex-associated responses. We validate seven of our top genes in a separate human cohort. In this work, we provide a powerful resource (www.extrameta.org) that expands the transcriptional landscape of exercise adaptation by extending previously known responses and their regulatory networks, and identifying novel modality-, time-, age-, and sex-associated changes.
    MeSH term(s): Adaptation, Physiological/genetics ; Age Factors ; Endurance Training ; Exercise/physiology ; Extracellular Matrix Proteins/genetics ; Gene Regulatory Networks ; Humans ; Inflammation/genetics ; Muscle, Skeletal/physiology ; Reproducibility of Results ; Resistance Training ; Smad3 Protein/genetics ; Systems Biology ; Time Factors ; Transcriptome
    Chemical substances: Extracellular Matrix Proteins ; SMAD3 protein, human ; Smad3 Protein
    Language: English
    Publication date: 2021-06-09
    Country of publication: England
    Document type: Journal Article ; Meta-Analysis ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't
    ZDB-ID: 2553671-0
    ISSN (online): 2041-1723
    DOI: 10.1038/s41467-021-23579-x
    Data source: MEDical Literature Analysis and Retrieval System OnLINE

  8. Article ; Online: Polygenic risk modeling with latent trait-related genetic components.

    Aguirre, Matthew / Tanigawa, Yosuke / Venkataraman, Guhan Ram / Tibshirani, Rob / Hastie, Trevor / Rivas, Manuel A

    European journal of human genetics : EJHG

    2021  Volume 29, Issue 7, Page(s) 1071–1081

    Abstract: Polygenic risk models have led to significant advances in understanding complex diseases and their clinical presentation. While polygenic risk scores (PRS) can effectively predict outcomes, they do not generally account for disease subtypes or pathways that underlie within-trait diversity. Here, we introduce a latent factor model of genetic risk based on components from Decomposition of Genetic Associations (DeGAs), which we call the DeGAs polygenic risk score (dPRS). We compute DeGAs using genetic associations for 977 traits and find that dPRS performs comparably to standard PRS while offering greater interpretability. We show how to decompose an individual's genetic risk for a trait across DeGAs components, with examples for body mass index (BMI) and myocardial infarction (heart attack) in 337,151 white British individuals in the UK Biobank, with replication in a further set of 25,486 non-British white individuals. We find that BMI polygenic risk factorizes into components related to fat-free mass, fat mass, and overall health indicators like physical activity. Most individuals with high dPRS for BMI have strong contributions from both a fat-mass component and a fat-free mass component, whereas a few "outlier" individuals have strong contributions from only one of the two components. Overall, our method enables fine-scale interpretation of the drivers of genetic risk for complex traits.
    MeSH term(s): Algorithms ; Biological Specimen Banks ; Databases, Genetic ; Genetic Association Studies/methods ; Genetic Predisposition to Disease ; Genome-Wide Association Study ; Humans ; Models, Genetic ; Multifactorial Inheritance ; Phenotype ; Population Surveillance ; Quantitative Trait, Heritable ; Reproducibility of Results ; Risk Assessment ; Risk Factors ; United Kingdom/epidemiology
    Language: English
    Publication date: 2021-02-08
    Country of publication: England
    Document type: Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't
    ZDB-ID: 1141470-4
    ISSN (online): 1476-5438
    ISSN: 1018-4813
    DOI: 10.1038/s41431-021-00813-0
    Data source: MEDical Literature Analysis and Retrieval System OnLINE

  9. Book ; Online: Efficient computation and analysis of distributional Shapley values

    Kwon, Yongchan / Rivas, Manuel A. / Zou, James

    2020  

    Abstract: Distributional data Shapley value (DShapley) has been recently proposed as a principled framework to quantify the contribution of an individual datum in machine learning. DShapley develops the foundational game theory concept of Shapley values into a statistical framework and can be applied to identify data points that are useful (or harmful) to a learning algorithm. Estimating DShapley is computationally expensive, however, and this can be a major challenge to using it in practice. Moreover, there has been little mathematical analysis of how this value depends on data characteristics. In this paper, we derive the first analytic expressions for DShapley for the canonical problems of linear regression and non-parametric density estimation. These analytic forms provide new algorithms to compute DShapley that are several orders of magnitude faster than the previous state of the art. Furthermore, our formulas are directly interpretable and provide quantitative insights into how the value varies for different types of data. We demonstrate the efficacy of our DShapley approach on multiple real and synthetic datasets.

    Comment: 23 pages
    Keywords: Statistics - Machine Learning ; Computer Science - Machine Learning
    Subject/category (code): 006
    Publication date: 2020-07-02
    Country of publication: United States
    Document type: Book ; Online
    Data source: BASE - Bielefeld Academic Search Engine (life sciences selection)

  10. Article ; Online: Significant sparse polygenic risk scores across 813 traits in UK Biobank.

    Tanigawa, Yosuke / Qian, Junyang / Venkataraman, Guhan / Justesen, Johanne Marie / Li, Ruilin / Tibshirani, Robert / Hastie, Trevor / Rivas, Manuel A

    PLoS genetics

    2022  Volume 18, Issue 3, Page(s) e1010105

    Abstract: We present a systematic assessment of polygenic risk score (PRS) prediction across more than 1,500 traits using genetic and phenotype data in the UK Biobank. We report 813 sparse PRS models with significant (p < 2.5 × 10^-5) incremental predictive performance when compared against the covariate-only model that considers age, sex, types of genotyping arrays, and the principal component loadings of genotypes. We report a significant correlation between the number of genetic variants selected in the sparse PRS model and the incremental predictive performance (Spearman's ρ = 0.61, p = 2.2 × 10^-59 for quantitative traits; ρ = 0.21, p = 9.6 × 10^-4 for binary traits). The sparse PRS model trained on European individuals showed limited transferability when evaluated on non-European individuals in the UK Biobank. We provide the PRS model weights on the Global Biobank Engine (https://biobankengine.stanford.edu/prs).
    MeSH term(s): Biological Specimen Banks ; Genetic Predisposition to Disease ; Genome-Wide Association Study ; Humans ; Multifactorial Inheritance/genetics ; Phenotype ; Risk Factors ; United Kingdom
    Language: English
    Publication date: 2022-03-24
    Country of publication: United States
    Document type: Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't ; Research Support, U.S. Gov't, Non-P.H.S.
    ZDB-ID: 2186725-2
    ISSN (online): 1553-7404
    ISSN: 1553-7390
    DOI: 10.1371/journal.pgen.1010105
    Data source: MEDical Literature Analysis and Retrieval System OnLINE
