LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 6 of 6

  1. Thesis ; Online: Efficient Markov Chain Monte Carlo Methods

    Fang, Youhan

    2018  

    Abstract Generating random samples from a prescribed distribution is one of the most important and challenging problems in machine learning, Bayesian statistics, and the simulation of materials. Markov Chain Monte Carlo (MCMC) methods are usually the required tool for this task if the desired distribution is known only up to a multiplicative constant. Samples produced by an MCMC method are real values in N-dimensional space, called the configuration space. The distribution of such samples converges to the target distribution in the limit. However, existing MCMC methods still face many challenges that are not well resolved. Difficulties in sampling with MCMC methods include, but are not limited to, high-dimensional and multimodal problems, high computational cost due to extremely large datasets in Bayesian machine learning models, and the lack of reliable indicators for detecting convergence and measuring the accuracy of sampling. This dissertation focuses on new theory and methodology for efficient MCMC methods that aim to overcome these difficulties. One contribution of this dissertation is generalizations of hybrid Monte Carlo (HMC). An HMC method combines a discretized dynamical system in an extended space, called the state space, with an acceptance test based on the Metropolis criterion. The discretized dynamical system used in HMC is volume preserving, meaning that in the state space the absolute value of the Jacobian determinant of a map from one point on the trajectory to another is 1. Volume preservation is, however, not necessary for the general purpose of sampling. A general theory allowing the use of non-volume-preserving dynamics for proposing MCMC moves is proposed. Examples, including isokinetic dynamics and variable mass Hamiltonian dynamics with an explicit integrator, are designed with fewer restrictions based on the general theory. Experiments show improvement in efficiency for sampling high-dimensional multimodal problems.
    A second contribution is stochastic gradient samplers with reduced bias. An in-depth analysis of the noise introduced by the stochastic gradient is provided. Two methods to reduce the bias in the distribution of samples are proposed. One is to correct the dynamics by using an estimated noise based on subsampled data, and the other is to introduce additional variables and corresponding dynamics to adaptively reduce the bias. Extensive experiments show that both methods outperform existing methods. A third contribution is quasi-reliable estimates of effective sample size. Proposed is a more reliable indicator, the longest integrated autocorrelation time over all functions in the state space, for detecting convergence and measuring the accuracy of MCMC methods. The superiority of the new indicator is supported by experiments on both synthetic and real problems. Minor contributions include a general framework for changing variables and a numerical integrator for Hamiltonian dynamics with fourth-order accuracy. The idea of changing variables is to transform the potential energy function from a function of the original variable to a function of the new variable, such that undesired properties can be removed. Two examples are provided, and preliminary experimental results support this idea. The fourth-order integrator is constructed by combining the idea of the simplified Takahashi-Imada method and a two-stage Hessian-based integrator. The proposed method, called the two-stage simplified Takahashi-Imada method, outperforms existing methods in high-dimensional sampling problems.
    Keywords Statistics ; Computer science
    Subject code 310
    Language ENG
    Publishing date 2018-01-01
    Publisher Purdue University
    Publishing country us
    Document type Thesis ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
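
The thesis abstract above names integrated autocorrelation time and effective sample size as convergence indicators. As a rough illustration only, the following sketch estimates both for a single scalar series. It is the conventional single-observable estimate, not the "longest integrated autocorrelation time over all functions" that the thesis proposes. The function names, the `max_lag` cap, and the rule of truncating the sum at the first non-positive autocorrelation are assumptions made for this example.

```python
import random


def integrated_autocorrelation_time(x, max_lag=1000):
    """Estimate tau = 1 + 2 * sum_k rho_k for a scalar series x.

    Assumption: the sum is truncated at the first non-positive
    autocorrelation estimate (a simple heuristic, one of several in use).
    """
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    c0 = sum(v * v for v in d) / n  # lag-0 autocovariance (sample variance)
    if c0 == 0.0:
        raise ValueError("series is constant; autocorrelation is undefined")
    tau = 1.0
    for lag in range(1, min(max_lag, n - 1) + 1):
        rho = sum(d[i] * d[i + lag] for i in range(n - lag)) / (n * c0)
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return tau


def effective_sample_size(x, max_lag=1000):
    """Effective sample size N / tau for a scalar series x."""
    return len(x) / integrated_autocorrelation_time(x, max_lag)


if __name__ == "__main__":
    # Hypothetical test series: an AR(1) chain x_t = 0.5 * x_{t-1} + noise,
    # whose exact integrated autocorrelation time is (1 + 0.5) / (1 - 0.5) = 3.
    rng = random.Random(0)
    chain = [0.0]
    for _ in range(9999):
        chain.append(0.5 * chain[-1] + rng.gauss(0.0, 1.0))
    print("tau estimate:", integrated_autocorrelation_time(chain))
    print("ESS estimate:", effective_sample_size(chain))
```

A larger integrated autocorrelation time means fewer effectively independent samples, so a chain of length N carries roughly N divided by tau samples' worth of information about the chosen observable.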

  2. Article ; Online: Erratum: "Compressible generalized hybrid Monte Carlo" [J. Chem. Phys. 140, 174108 (2014)].

    Fang, Youhan / Sanz-Serna, J M / Skeel, Robert D

    The Journal of chemical physics

    2016  Volume 144, Issue 2, Page(s) 29901

    Language English
    Publishing date 2016-01-14
    Publishing country United States
    Document type Published Erratum
    ZDB-ID 3113-6
    ISSN 1089-7690 ; 0021-9606
    ISSN (online) 1089-7690
    ISSN 0021-9606
    DOI 10.1063/1.4940219
    Database MEDical Literature Analysis and Retrieval System OnLINE

  3. Article ; Online: Compressible generalized hybrid Monte Carlo.

    Fang, Youhan / Sanz-Serna, J M / Skeel, Robert D

    The Journal of chemical physics

    2014  Volume 140, Issue 17, Page(s) 174108

    Abstract One of the most demanding calculations is to generate random samples from a specified probability distribution (usually with an unknown normalizing prefactor) in a high-dimensional configuration space. One often has to resort to using a Markov chain Monte Carlo method, which converges only in the limit to the prescribed distribution. Such methods typically inch through configuration space step by step, with acceptance of a step based on a Metropolis(-Hastings) criterion. An acceptance rate of 100% is possible in principle by embedding configuration space in a higher dimensional phase space and using ordinary differential equations. In practice, numerical integrators must be used, lowering the acceptance rate. This is the essence of hybrid Monte Carlo methods. Presented is a general framework for constructing such methods under relaxed conditions: the only geometric property needed is (weakened) reversibility; volume preservation is not needed. The possibilities are illustrated by deriving a couple of explicit hybrid Monte Carlo methods, one based on barrier-lowering variable-metric dynamics and another based on isokinetic dynamics.
    Language English
    Publishing date 2014-05-07
    Publishing country United States
    Document type Journal Article
    ZDB-ID 3113-6
    ISSN 1089-7690 ; 0021-9606
    ISSN (online) 1089-7690
    ISSN 0021-9606
    DOI 10.1063/1.4874000
    Database MEDical Literature Analysis and Retrieval System OnLINE
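
The abstract above describes hybrid Monte Carlo as embedding configuration space in a phase space, integrating differential equations numerically, and accepting or rejecting each step with a Metropolis test. A minimal sketch of the standard volume-preserving variant follows; it is not the compressible generalization the article proposes. It assumes a one-dimensional target with unit mass, and the names `leapfrog` and `hmc_sample` as well as the step size and trajectory length are illustrative choices.

```python
import math
import random


def leapfrog(q, p, grad_u, step, n_steps):
    """Leapfrog integration of dq/dt = p, dp/dt = -grad_u(q).

    The scheme is reversible and volume preserving, which is what the
    plain Metropolis test below relies on.
    """
    p = p - 0.5 * step * grad_u(q)          # initial half step in momentum
    for i in range(n_steps):
        q = q + step * p                    # full step in position
        if i < n_steps - 1:
            p = p - step * grad_u(q)        # full step in momentum
    p = p - 0.5 * step * grad_u(q)          # final half step in momentum
    return q, p


def hmc_sample(u, grad_u, q0, n_samples, step=0.2, n_steps=10, seed=0):
    """Draw samples from a density proportional to exp(-u(q)).

    Returns the list of samples and the observed acceptance rate.
    """
    rng = random.Random(seed)
    q = q0
    samples = []
    accepted = 0
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)             # fresh momentum each iteration
        h_old = u(q) + 0.5 * p * p
        q_new, p_new = leapfrog(q, p, grad_u, step, n_steps)
        h_new = u(q_new) + 0.5 * p_new * p_new
        # Metropolis test on the change in total energy
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
            accepted += 1
        samples.append(q)
    return samples, accepted / n_samples


if __name__ == "__main__":
    # Hypothetical target: standard normal, u(q) = q^2 / 2.
    draws, rate = hmc_sample(lambda q: 0.5 * q * q, lambda q: q, 0.0, 2000)
    print("acceptance rate:", rate)
    print("sample mean:", sum(draws) / len(draws))
```

With exact dynamics the energy would be conserved and every proposal accepted; the integrator's discretization error is what pushes the acceptance rate below 100%, as the abstract notes.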

  4. Article ; Online: High-speed automatic characterization of rare events in flow cytometric data.

    Qi, Yuan / Fang, Youhan / Sinclair, David R / Guo, Shangqin / Alberich-Jorda, Meritxell / Lu, Jun / Tenen, Daniel G / Kharas, Michael G / Pyne, Saumyadipta

    PloS one

    2020  Volume 15, Issue 2, Page(s) e0228651

    Abstract A new computational framework for FLow cytometric Analysis of Rare Events (FLARE) has been developed specifically for fast and automatic identification of rare cell populations in very large samples generated by platforms like multi-parametric flow cytometry. Using a hierarchical Bayesian model and information-sharing via parallel computation, FLARE rapidly explores the high-dimensional marker-space to detect highly rare populations that are consistent across multiple samples. Further, it can focus within specified regions of interest in marker-space to detect subpopulations with desired precision.
    MeSH term(s) Automation, Laboratory/methods ; Flow Cytometry/methods ; Models, Theoretical ; Probability
    Language English
    Publishing date 2020-02-11
    Publishing country United States
    Document type Journal Article
    ISSN 1932-6203
    ISSN (online) 1932-6203
    DOI 10.1371/journal.pone.0228651
    Database MEDical Literature Analysis and Retrieval System OnLINE

  5. Book: Zhong guo jin dai shi xue xi shou ce

    Gong, Shuduo / Fang, Youhan

    1989  

    Author's details Gong shu duo, fang you han zhu bian
    Keywords China
    Language Chinese
    Size 3, 316 p, 20 cm
    Edition Di 1 ban
    Publisher Bei jing da xue chu ban she
    Publishing place Bei jing
    Document type Book
    Note Includes bibliographical references
    ISBN 7301000820 ; 9787301000823
    Database Former special subject collection: coastal and deep sea fishing

  6. Book: Zhong guo jin dai shi gang

    Gong, Shuduo / Fang, Youhan / Zou, Fanlin / Wang, Yini / Liang, Xiaorui

    1993  

    Author's details Gong shu duo, fang you han zhu bian; zou fan lin, wang yi ni, liang xiao rui bian
    Keywords China
    Language Chinese
    Size 5, 433 p, 21 cm
    Edition Di 2 ban
    Publisher Bei jing da xue chu ban she
    Publishing place Bei jing
    Document type Book
    Note Includes bibliographical references
    ISBN 7301021666 ; 9787301021668
    Database Former special subject collection: coastal and deep sea fishing
