LIVIVO - The Search Portal for Life Sciences

Search results

Result 1 - 10 of total 53

  1. Article ; Online: Introducing User-Prescribed Constraints in Markov Chains for Nonlinear Dimensionality Reduction.

    Dixit, Purushottam D

    Neural computation

    2019  Volume 31, Issue 5, Page(s) 980–997

    Abstract Stochastic kernel-based dimensionality-reduction approaches have become popular in the past decade. The central component of many of these methods is a symmetric kernel that quantifies the vicinity between pairs of data points and a kernel-induced Markov chain on the data. Typically, the Markov chain is fully specified by the kernel through row normalization. However, in many cases, it is desirable to impose user-specified stationary-state and dynamical constraints on the Markov chain. Unfortunately, no systematic framework exists to impose such user-defined constraints. Here, based on our previous work on inference of Markov models, we introduce a path entropy maximization based approach to derive the transition probabilities of Markov chains using a kernel and additional user-specified constraints. We illustrate the usefulness of these Markov chains with examples.
    Language English
    Publishing date 2019-03-18
    Publishing country United States
    Document type Journal Article
    ZDB-ID 1025692-1
    ISSN 1530-888X ; 0899-7667
    ISSN (online) 1530-888X
    ISSN 0899-7667
    DOI 10.1162/neco_a_01184
    Database MEDical Literature Analysis and Retrieval System OnLINE
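
    Illustration (not taken from the article): the abstract above notes that a kernel-induced Markov chain is typically obtained from a symmetric kernel by row normalization. The sketch below shows only that baseline step, assuming a Gaussian kernel and a hypothetical `bandwidth` parameter; the function names are illustrative, and the path-entropy method with user-specified constraints is not implemented here.

```python
import numpy as np


def gaussian_kernel(points, bandwidth=1.0):
    """Symmetric Gaussian kernel between all pairs of rows in `points`.

    The Gaussian form and `bandwidth` are assumptions made for illustration.
    """
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def row_normalize(kernel):
    """Turn a non-negative kernel into a row-stochastic transition matrix."""
    return kernel / kernel.sum(axis=1, keepdims=True)


def stationary_distribution(kernel):
    """Stationary distribution of the row-normalized chain.

    For a symmetric kernel it is proportional to the row sums, so the
    kernel alone fixes it and the user has no direct control over it.
    """
    row_sums = kernel.sum(axis=1)
    return row_sums / row_sums.sum()


rng = np.random.default_rng(0)
points = rng.normal(size=(50, 3))   # synthetic data, 50 points in 3 dimensions
K = gaussian_kernel(points)
P = row_normalize(K)                # transition probabilities, rows sum to 1
pi = stationary_distribution(K)     # satisfies pi @ P == pi
```

    Because the stationary distribution here follows directly from the kernel's row sums, imposing a different, user-chosen stationary state requires a different construction, which is the gap the abstract describes.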

  2. Article ; Online: EMBED: Essential MicroBiomE Dynamics, a dimensionality reduction approach for longitudinal microbiome studies.

    Shahin, Mayar / Ji, Brian / Dixit, Purushottam D

    NPJ systems biology and applications

    2023  Volume 9, Issue 1, Page(s) 26

    Abstract Dimensionality reduction offers unique insights into high-dimensional microbiome dynamics by leveraging collective abundance fluctuations of multiple bacteria driven by similar ecological perturbations. However, methods providing lower-dimensional representations of microbiome dynamics both at the community and individual taxa levels are not currently available. To that end, we present EMBED: Essential MicroBiomE Dynamics, a probabilistic nonlinear tensor factorization approach. Like normal mode analysis in structural biophysics, EMBED infers ecological normal modes (ECNs), which represent the unique orthogonal modes capturing the collective behavior of microbial communities. Using multiple real and synthetic datasets, we show that a very small number of ECNs can accurately approximate microbiome dynamics. Inferred ECNs reflect specific ecological behaviors, providing natural templates along which the dynamics of individual bacteria may be partitioned. Moreover, the multi-subject treatment in EMBED systematically identifies subject-specific and universal abundance dynamics that are not detected by traditional approaches. Collectively, these results highlight the utility of EMBED as a versatile dimensionality reduction tool for studies of microbiome dynamics.
    MeSH term(s) Microbiota/genetics ; Bacteria/genetics
    Language English
    Publishing date 2023-06-20
    Publishing country England
    Document type Journal Article ; Research Support, N.I.H., Extramural
    ISSN 2056-7189
    ISSN (online) 2056-7189
    DOI 10.1038/s41540-023-00285-6
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article ; Online: EMBED: Essential MicroBiomE Dynamics, a dimensionality reduction approach for longitudinal microbiome studies.

    Mayar Shahin / Brian Ji / Purushottam D. Dixit

    npj Systems Biology and Applications

    2023  Volume 9, Issue 1, Page(s) 1–11

    Abstract Dimensionality reduction offers unique insights into high-dimensional microbiome dynamics by leveraging collective abundance fluctuations of multiple bacteria driven by similar ecological perturbations. However, methods providing lower-dimensional representations of microbiome dynamics both at the community and individual taxa levels are not currently available. To that end, we present EMBED: Essential MicroBiomE Dynamics, a probabilistic nonlinear tensor factorization approach. Like normal mode analysis in structural biophysics, EMBED infers ecological normal modes (ECNs), which represent the unique orthogonal modes capturing the collective behavior of microbial communities. Using multiple real and synthetic datasets, we show that a very small number of ECNs can accurately approximate microbiome dynamics. Inferred ECNs reflect specific ecological behaviors, providing natural templates along which the dynamics of individual bacteria may be partitioned. Moreover, the multi-subject treatment in EMBED systematically identifies subject-specific and universal abundance dynamics that are not detected by traditional approaches. Collectively, these results highlight the utility of EMBED as a versatile dimensionality reduction tool for studies of microbiome dynamics.
    Keywords Biology (General) ; QH301-705.5
    Language English
    Publishing date 2023-06-01
    Publisher Nature Portfolio
    Document type Article ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  4. Article ; Online: GENERALIST: A latent space based generative model for protein sequence families.

    Akl, Hoda / Emison, Brooke / Zhao, Xiaochuan / Mondal, Arup / Perez, Alberto / Dixit, Purushottam D

    PLoS computational biology

    2023  Volume 19, Issue 11, Page(s) e1011655

    Abstract Generative models of protein sequence families are an important tool in the repertoire of protein scientists and engineers alike. However, state-of-the-art generative approaches face inference, accuracy, and overfitting-related obstacles when modeling moderately sized to large proteins and/or protein families with low sequence coverage. Here, we present a simple to learn, tunable, and accurate generative model, GENERALIST: GENERAtive nonLInear tenSor-factorizaTion for protein sequences. GENERALIST accurately captures several high order summary statistics of amino acid covariation. GENERALIST also predicts conservative local optimal sequences which are likely to fold in stable 3D structure. Importantly, unlike current methods, the density of sequences in GENERALIST-modeled sequence ensembles closely resembles the corresponding natural ensembles. Finally, GENERALIST embeds protein sequences in an informative latent space. GENERALIST will be an important tool to study protein sequence variability.
    MeSH term(s) Proteins/chemistry ; Amino Acid Sequence ; Amino Acids
    Chemical Substances Proteins ; Amino Acids
    Language English
    Publishing date 2023-11-27
    Publishing country United States
    Document type Journal Article
    ZDB-ID 2193340-6
    ISSN 1553-7358 ; 1553-734X
    ISSN (online) 1553-7358
    ISSN 1553-734X
    DOI 10.1371/journal.pcbi.1011655
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Book ; Online: TMI: Thermodynamic inference of data manifolds

    Dixit, Purushottam D.

    2019

    Abstract The Gibbs-Boltzmann distribution offers a physically interpretable way to massively reduce the dimensionality of high dimensional probability distributions where the extensive variables are 'features' and the intensive variables are 'descriptors'. However, not all probability distributions can be modeled using the Gibbs-Boltzmann form. Here, we present TMI: Thermodynamic Manifold Inference, a thermodynamic approach to approximate a collection of arbitrary distributions. TMI simultaneously learns from data intensive and extensive variables and achieves dimensionality reduction through a multiplicative, positive valued, and interpretable decomposition of the data. Importantly, the reduced dimensional space of intensive parameters is not homogeneous. The Gibbs-Boltzmann distribution defines an analytically tractable Riemannian metric on the space of intensive variables allowing us to calculate geodesics and volume elements. We discuss the applications of TMI with multiple real and artificial data sets. Possible extensions are discussed as well.
    Keywords Condensed Matter - Statistical Mechanics ; Computer Science - Machine Learning ; Quantitative Biology - Quantitative Methods
    Subject code 310
    Publishing date 2019-11-21
    Publishing country United States
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  6. Article ; Online: Building Markov state models using optimal transport theory.

    Dixit, Purushottam D / Dill, Ken A

    The Journal of chemical physics

    2019  Volume 150, Issue 5, Page(s) 054105

    Abstract Markov State Models (MSMs) describe the rates and routes in conformational dynamics of biomolecules. Computational estimation of MSMs can be expensive because molecular simulations are slow to find and sample the rare transient events. We describe here an efficient approximate way to determine MSM rate matrices by combining maximum caliber (maximizing path entropies) with optimal transport theory (minimizing some path cost function, as when routing trucks on transportation networks) to patch together transient dynamical information from multiple non-equilibrium simulations. We give toy examples.
    Language English
    Publishing date 2019-02-07
    Publishing country United States
    Document type Journal Article
    ZDB-ID 3113-6
    ISSN 1089-7690 ; 0021-9606
    ISSN (online) 1089-7690
    ISSN 0021-9606
    DOI 10.1063/1.5086681
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article ; Online: GENERALIST: A latent space based generative model for protein sequence families.

    Hoda Akl / Brooke Emison / Xiaochuan Zhao / Arup Mondal / Alberto Perez / Purushottam D Dixit

    PLoS Computational Biology

    2023  Volume 19, Issue 11, Page(s) e1011655

    Abstract Generative models of protein sequence families are an important tool in the repertoire of protein scientists and engineers alike. However, state-of-the-art generative approaches face inference, accuracy, and overfitting-related obstacles when modeling moderately sized to large proteins and/or protein families with low sequence coverage. Here, we present a simple to learn, tunable, and accurate generative model, GENERALIST: GENERAtive nonLInear tenSor-factorizaTion for protein sequences. GENERALIST accurately captures several high order summary statistics of amino acid covariation. GENERALIST also predicts conservative local optimal sequences which are likely to fold in stable 3D structure. Importantly, unlike current methods, the density of sequences in GENERALIST-modeled sequence ensembles closely resembles the corresponding natural ensembles. Finally, GENERALIST embeds protein sequences in an informative latent space. GENERALIST will be an important tool to study protein sequence variability.
    Keywords Biology (General) ; QH301-705.5
    Subject code 612
    Language English
    Publishing date 2023-11-01
    Publisher Public Library of Science (PLoS)
    Document type Article ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  8. Article ; Online: SiGMoiD: A super-statistical generative model for binary data.

    Zhao, Xiaochuan / Plata, Germán / Dixit, Purushottam D

    PLoS computational biology

    2021  Volume 17, Issue 8, Page(s) e1009275

    Abstract In modern computational biology, there is great interest in building probabilistic models to describe collections of a large number of co-varying binary variables. However, current approaches to build generative models rely on modelers' identification of constraints and are computationally expensive to infer when the number of variables is large (N~100). Here, we address both these issues with Super-statistical Generative Model for binary Data (SiGMoiD). SiGMoiD is a maximum entropy-based framework where we imagine the data as arising from super-statistical system; individual binary variables in a given sample are coupled to the same 'bath' whose intensive variables vary from sample to sample. Importantly, unlike standard maximum entropy approaches where modeler specifies the constraints, the SiGMoiD algorithm infers them directly from the data. Due to this optimal choice of constraints, SiGMoiD allows us to model collections of a very large number (N>1000) of binary variables. Finally, SiGMoiD offers a reduced dimensional description of the data, allowing us to identify clusters of similar data points as well as binary variables. We illustrate the versatility of SiGMoiD using multiple datasets spanning several time- and length-scales.
    MeSH term(s) Algorithms ; Computational Biology/methods ; Entropy ; Models, Statistical
    Language English
    Publishing date 2021-08-06
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 2193340-6
    ISSN 1553-7358 ; 1553-734X
    ISSN (online) 1553-7358
    ISSN 1553-734X
    DOI 10.1371/journal.pcbi.1009275
    Database MEDical Literature Analysis and Retrieval System OnLINE

  9. Article ; Online: SiGMoiD: A super-statistical generative model for binary data.

    Xiaochuan Zhao / Germán Plata / Purushottam D Dixit

    PLoS Computational Biology

    2021  Volume 17, Issue 8, Page(s) e1009275

    Abstract In modern computational biology, there is great interest in building probabilistic models to describe collections of a large number of co-varying binary variables. However, current approaches to build generative models rely on modelers' identification of constraints and are computationally expensive to infer when the number of variables is large (N~100). Here, we address both these issues with Super-statistical Generative Model for binary Data (SiGMoiD). SiGMoiD is a maximum entropy-based framework where we imagine the data as arising from super-statistical system; individual binary variables in a given sample are coupled to the same 'bath' whose intensive variables vary from sample to sample. Importantly, unlike standard maximum entropy approaches where modeler specifies the constraints, the SiGMoiD algorithm infers them directly from the data. Due to this optimal choice of constraints, SiGMoiD allows us to model collections of a very large number (N>1000) of binary variables. Finally, SiGMoiD offers a reduced dimensional description of the data, allowing us to identify clusters of similar data points as well as binary variables. We illustrate the versatility of SiGMoiD using multiple datasets spanning several time- and length-scales.
    Keywords Biology (General) ; QH301-705.5
    Subject code 310 ; 006
    Language English
    Publishing date 2021-08-01
    Publisher Public Library of Science (PLoS)
    Document type Article ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  10. Article ; Online: Stationary properties of maximum-entropy random walks.

    Dixit, Purushottam D

    Physical review. E, Statistical, nonlinear, and soft matter physics

    2015  Volume 92, Issue 4, Page(s) 042149

    Abstract Maximum-entropy (ME) inference of state probabilities using state-dependent constraints is popular in the study of complex systems. In stochastic systems, how state space topology and path-dependent constraints affect ME-inferred state probabilities remains unknown. To that end, we derive the transition probabilities and the stationary distribution of a maximum path entropy Markov process subject to state- and path-dependent constraints. A main finding is that the stationary distribution over states differs significantly from the Boltzmann distribution and reflects a competition between path multiplicity and imposed constraints. We illustrate our results with particle diffusion on a two-dimensional landscape. Connections with the path integral approach to diffusion are discussed.
    MeSH term(s) Diffusion ; Entropy ; Markov Chains ; Models, Theoretical
    Language English
    Publishing date 2015-10
    Publishing country United States
    Document type Journal Article
    ISSN 1550-2376
    ISSN (online) 1550-2376
    DOI 10.1103/PhysRevE.92.042149
    Database MEDical Literature Analysis and Retrieval System OnLINE
