LIVIVO - The Search Portal for Life Sciences

Search results

Result 1 - 10 of total 45

  1. Book ; Online: Mixture of multilayer stochastic block models for multiview clustering

    De Santiago, Kylliann / Szafranski, Marie / Ambroise, Christophe

    2024  

    Abstract In this work, we propose an original method for aggregating multiple clustering coming from different sources of information. Each partition is encoded by a co-membership matrix between observations. Our approach uses a mixture of multilayer Stochastic Block Models (SBM) to group co-membership matrices with similar information into components and to partition observations into different clusters, taking into account their specificities within the components. The identifiability of the model parameters is established and a variational Bayesian EM algorithm is proposed for the estimation of these parameters. The Bayesian framework allows for selecting an optimal number of clusters and components. The proposed approach is compared using synthetic data with consensus clustering and tensor-based algorithms for community detection in large-scale complex networks. Finally, the method is utilized to analyze global food trading networks, leading to structures of interest.
    Keywords Computer Science - Machine Learning ; Mathematics - Statistics Theory ; Statistics - Machine Learning
    Subject code 006
    Publishing date 2024-01-09
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  2. Article ; Online: Holistic view of the seascape dynamics and environment impact on macro-scale genetic connectivity of marine plankton populations.

    Laso-Jadart, Romuald / O'Malley, Michael / Sykulski, Adam M / Ambroise, Christophe / Madoui, Mohammed-Amin

    BMC ecology and evolution

    2023  Volume 23, Issue 1, Page(s) 46

    Abstract Background: Plankton seascape genomics studies have revealed different trends from large-scale weak differentiation to microscale structures. Previous studies have underlined the influence of the environment and seascape on species differentiation and adaptation. However, these studies have generally focused on a few single species, sparse molecular markers, or local scales. Here, we investigated the genomic differentiation of plankton at the macro-scale in a holistic approach using Tara Oceans metagenomic data together with a reference-free computational method.
    Results: We reconstructed the F
    Conclusion: Our results validate the isolation-by-current hypothesis for a non-negligible proportion of taxa and highlight the role of other physicochemical parameters in large-scale plankton genetic connectivity. The reference-free approach used in this study offers a new systematic framework to analyse the population genomics of non-model and undocumented marine organisms from a large-scale and holistic point of view.
    MeSH term(s) Animals ; Plankton/genetics ; Acclimatization ; Zooplankton/genetics ; Genomics ; Atlantic Ocean ; Eukaryota
    Language English
    Publishing date 2023-09-01
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ISSN 2730-7182
    ISSN (online) 2730-7182
    DOI 10.1186/s12862-023-02160-8
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Book ; Online: Inference of Multiscale Gaussian Graphical Model

    Sanou, Do Edmond / Ambroise, Christophe / Robin, Geneviève

    2022  

    Abstract Gaussian Graphical Models (GGMs) are widely used for exploratory data analysis in various fields such as genomics, ecology, psychometry. In a high-dimensional setting, when the number of variables exceeds the number of observations by several orders of magnitude, the estimation of GGM is a difficult and unstable optimization problem. Clustering of variables or variable selection is often performed prior to GGM estimation. We propose a new method allowing to simultaneously infer a hierarchical clustering structure and the graphs describing the structure of independence at each level of the hierarchy. This method is based on solving a convex optimization problem combining a graphical lasso penalty with a fused type lasso penalty. Results on real and synthetic data are presented.
    Keywords Statistics - Machine Learning ; Computer Science - Machine Learning
    Publishing date 2022-02-11
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  4. Article: Incorporating Phylogenetic Information in Microbiome Differential Abundance Studies Has No Effect on Detection Power and FDR Control.

    Bichat, Antoine / Plassais, Jonathan / Ambroise, Christophe / Mariadassou, Mahendra

    Frontiers in microbiology

    2020  Volume 11, Page(s) 649

    Abstract We consider the problem of incorporating evolutionary information (e.g., taxonomic or phylogenic trees) in the context of metagenomics differential analysis. Recent results published in the literature propose different ways to leverage the tree structure to increase the detection rate of differentially abundant taxa. Here, we propose instead to use a different hierarchical structure, in the form of a correlation-based tree, as it may capture the structure of the data better than the phylogeny. We first show that the correlation tree and the phylogeny are significantly different before turning to the impact of tree choice on detection rates. Using synthetic data, we show that the tree does have an impact: smoothing
    Language English
    Publishing date 2020-04-15
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2587354-4
    ISSN 1664-302X
    DOI 10.3389/fmicb.2020.00649
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Article ; Online: metaVaR: Introducing metavariant species models for reference-free metagenomic-based population genomics.

    Laso-Jadart, Romuald / Ambroise, Christophe / Peterlongo, Pierre / Madoui, Mohammed-Amin

    PloS one

    2020  Volume 15, Issue 12, Page(s) e0244637

    Abstract The availability of large metagenomic data offers great opportunities for the population genomic analysis of uncultured organisms, which represent a large part of the unexplored biosphere and play a key ecological role. However, the majority of these organisms lack a reference genome or transcriptome, which constitutes a technical obstacle for classical population genomic analyses. We introduce the metavariant species (MVS) model, in which a species is represented only by intra-species nucleotide polymorphism. We designed a method combining reference-free variant calling, multiple density-based clustering and maximum-weighted independent set algorithms to cluster intra-species variants into MVSs directly from multisample metagenomic raw reads without a reference genome or read assembly. The frequencies of the MVS variants are then used to compute population genomic statistics such as FST, in order to estimate genomic differentiation between populations and to identify loci under natural selection. The MVS construction was tested on simulated and real metagenomic data. MVSs showed the required quality for robust population genomics and allowed an accurate estimation of genomic differentiation (ΔFST < 0.0001 and <0.03 on simulated and real data respectively). Loci predicted under natural selection on real data were all detected by MVSs. MVSs represent a new paradigm that may simplify and enhance holistic approaches for population genomics and the evolution of microorganisms.
    MeSH term(s) Cluster Analysis ; Computational Biology/methods ; Genetic Variation ; Genetics, Population ; Metagenomics/methods ; Models, Genetic ; Selection, Genetic ; Software
    Language English
    Publishing date 2020-12-30
    Publishing country United States
    Document type Journal Article
    ISSN 1932-6203
    ISSN (online) 1932-6203
    DOI 10.1371/journal.pone.0244637
    Database MEDical Literature Analysis and Retrieval System OnLINE

  6. Article ; Online: Learning the optimal scale for GWAS through hierarchical SNP aggregation.

    Guinot, Florent / Szafranski, Marie / Ambroise, Christophe / Samson, Franck

    BMC bioinformatics

    2018  Volume 19, Issue 1, Page(s) 459

    Abstract Background: Genome-Wide Association Studies (GWAS) seek to identify causal genomic variants associated with rare human diseases. The classical statistical approach for detecting these variants is based on univariate hypothesis testing, with healthy individuals being tested against affected individuals at each locus. Given that an individual's genotype is characterized by up to one million SNPs, this approach lacks precision, since it may yield a large number of false positives that can lead to erroneous conclusions about genetic associations with the disease. One way to improve the detection of true genetic associations is to reduce the number of hypotheses to be tested by grouping SNPs.
    Results: We propose a dimension-reduction approach which can be applied in the context of GWAS by making use of the haplotype structure of the human genome. We compare our method with standard univariate and group-based approaches on both synthetic and real GWAS data.
    Conclusion: We show that reducing the dimension of the predictor matrix by aggregating SNPs gives a greater precision in the detection of associations between the phenotype and genomic regions.
    MeSH term(s) Algorithms ; Area Under Curve ; Case-Control Studies ; Computer Simulation ; Gene Frequency/genetics ; Genome-Wide Association Study ; Humans ; Linkage Disequilibrium/genetics ; Numerical Analysis, Computer-Assisted ; Phenotype ; Polymorphism, Single Nucleotide/genetics ; ROC Curve ; Spondylitis, Ankylosing/genetics
    Language English
    Publishing date 2018-11-29
    Publishing country England
    Document type Journal Article
    ZDB-ID 2041484-5
    ISSN 1471-2105
    ISSN (online) 1471-2105
    DOI 10.1186/s12859-018-2475-9
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article: A Sparse Mixture-of-Experts Model With Screening of Genetic Associations to Guide Disease Subtyping.

    Courbariaux, Marie / De Santiago, Kylliann / Dalmasso, Cyril / Danjou, Fabrice / Bekadar, Samir / Corvol, Jean-Christophe / Martinez, Maria / Szafranski, Marie / Ambroise, Christophe

    Frontiers in genetics

    2022  Volume 13, Page(s) 859462

    Abstract Motivation:
    Language English
    Publishing date 2022-06-06
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2606823-0
    ISSN 1664-8021
    DOI 10.3389/fgene.2022.859462
    Database MEDical Literature Analysis and Retrieval System OnLINE

  8. Article: Adjacency-constrained hierarchical clustering of a band similarity matrix with application to genomics.

    Ambroise, Christophe / Dehman, Alia / Neuvial, Pierre / Rigaill, Guillem / Vialaneix, Nathalie

    Algorithms for molecular biology : AMB

    2019  Volume 14, Page(s) 22

    Abstract Background: Genomic data analyses such as Genome-Wide Association Studies (GWAS) or Hi-C studies are often faced with the problem of partitioning chromosomes into successive regions based on a similarity matrix of high-resolution, locus-level measurements. An intuitive way of doing this is to perform a modified Hierarchical Agglomerative Clustering (HAC), where only adjacent clusters (according to the ordering of positions within a chromosome) are allowed to be merged. But a major practical drawback of this method is its quadratic time and space complexity in the number of loci, which is typically of the order of
    Results: By assuming that the similarity between physically distant objects is negligible, we are able to propose an implementation of adjacency-constrained HAC with quasi-linear complexity. This is achieved by pre-calculating specific sums of similarities, and storing candidate fusions in a min-heap. Our illustrations on GWAS and Hi-C datasets demonstrate the relevance of this assumption, and show that this method highlights biologically meaningful signals. Thanks to its small time and memory footprint, the method can be run on a standard laptop in minutes or even seconds.
    Availability and implementation: Software and sample data are available as an R package,
    Language English
    Publishing date 2019-11-15
    Publishing country England
    Document type Journal Article
    ZDB-ID 2224970-9
    ISSN 1748-7188
    DOI 10.1186/s13015-019-0157-4
    Database MEDical Literature Analysis and Retrieval System OnLINE

  9. Article ; Online: Eigen-Epistasis for detecting gene-gene interactions.

    Stanislas, Virginie / Dalmasso, Cyril / Ambroise, Christophe

    BMC bioinformatics

    2017  Volume 18, Issue 1, Page(s) 54

    Abstract Background: A large amount of research has been devoted to the detection and investigation of epistatic interactions in genome-wide association studies (GWASs). Most of the literature focuses on low-order interactions between single-nucleotide polymorphisms (SNPs) with significant main effects.
    Results: In this paper we propose an original approach for detecting epistasis at the gene level, without systematically filtering on significant genes. We first compute interaction variables for each gene pair by finding its Eigen-Epistasis component, defined as the linear combination of Gene SNPs having the highest correlation with the phenotype. The selection of significant effects is done using a penalized regression method based on Group Lasso controlling the False Discovery Rate.
    Conclusion: The method is tested against two recent alternative proposals from the literature using synthetic data, and shows good performances in different settings. We demonstrate the power of our approach by detecting new gene-gene interactions on three genome-wide association studies.
    MeSH term(s) Computational Biology/methods ; Computer Simulation ; Epistasis, Genetic ; Genome-Wide Association Study ; Genotype ; Humans ; Inflammatory Bowel Diseases/genetics ; Models, Theoretical ; Phenotype ; Polymorphism, Single Nucleotide ; Principal Component Analysis ; Thyroid Neoplasms/genetics
    Language English
    Publishing date 2017-01-23
    Publishing country England
    Document type Journal Article
    ZDB-ID 2041484-5
    ISSN 1471-2105
    ISSN (online) 1471-2105
    DOI 10.1186/s12859-017-1488-0
    Database MEDical Literature Analysis and Retrieval System OnLINE

  10. Book ; Online: Tree-based Inference of Species Interaction Network from Abundance Data

    Momal, Raphaëlle / Robin, Stéphane / Ambroise, Christophe

    2019  

    Abstract The behavior of ecological systems mainly relies on the interactions between the species it involves. We consider the problem of inferring the species interaction network from abundance data. To be relevant, any network inference methodology needs to handle count data and to account for possible environmental effects. It also needs to distinguish between direct interactions and indirect associations and graphical models provide a convenient framework for this purpose. We introduce a generic statistical model for network inference based on abundance data. The model includes fixed effects to account for environmental covariates and sampling efforts, and correlated random effects to encode species interactions. The inferred network is obtained by averaging over all possible tree-shaped (and therefore sparse) networks, in a computationally efficient manner. An output of the procedure is the probability for each edge to be part of the underlying network. A simulation study shows that the proposed methodology compares well with state-of-the-art approaches, even when the underlying graph strongly differs from a tree. The analysis of two datasets highlights the influence of covariates on the inferred network. Accounting for covariates is critical to avoid spurious edges. The proposed approach could be extended to perform network comparison or to look for missing species.
    Keywords Statistics - Applications ; Quantitative Biology - Populations and Evolution
    Subject code 310
    Publishing date 2019-05-07
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
