LIVIVO - The Search Portal for Life Sciences

Search results

Result 1 - 10 of total 14

  1. Article ; Online: An Efficient Fisher Matrix Approximation Method for Large-Scale Neural Network Optimization.

    Yang, Minghan / Xu, Dong / Cui, Qiwen / Wen, Zaiwen / Xu, Pengxiang

    IEEE transactions on pattern analysis and machine intelligence

    2023  Volume 45, Issue 5, Page(s) 5391–5403

    Abstract Although the shapes of the parameters are not crucial for designing first-order optimization methods in large scale empirical risk minimization problems, they have an important impact on the size of the matrix to be inverted when developing second-order type methods. In this article, we propose an efficient and novel second-order method based on the parameters in the real matrix space [Formula: see text] and a matrix-product approximate Fisher matrix (MatFisher) by using the products of gradients. The size of the matrix to be inverted is much smaller than that of the Fisher information matrix in the real vector space [Formula: see text]. Moreover, by utilizing the matrix delayed update and the block diagonal approximation techniques, the computational cost can be controlled and is comparable with first-order methods. A global convergence and a superlinear local convergence analysis are established under mild conditions. Numerical results on image classification with ResNet50, quantum chemistry modeling with SchNet, and data-driven partial differential equations solution with PINN illustrate that our method is quite competitive with the state-of-the-art methods.
    Language English
    Publishing date 2023-04-03
    Publishing country United States
    Document type Journal Article
    ISSN 1939-3539
    ISSN (online) 1939-3539
    DOI 10.1109/TPAMI.2022.3213654
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Book ; Online: Provably Efficient Gauss-Newton Temporal Difference Learning Method with Function Approximation

    Ke, Zhifa / Wen, Zaiwen / Zhang, Junyu

    2023  

    Abstract In this paper, based on the spirit of Fitted Q-Iteration (FQI), we propose a Gauss-Newton Temporal Difference (GNTD) method to solve the Q-value estimation problem with function approximation. In each iteration, unlike the original FQI that solves a nonlinear least-squares subproblem to fit the Q-iteration, the GNTD method can be viewed as an inexact FQI that takes only one Gauss-Newton step to optimize this subproblem, which is much cheaper in computation. Compared to the popular Temporal Difference (TD) learning, which can be viewed as taking a single gradient descent step to FQI's subproblem per iteration, the Gauss-Newton step of GNTD better retains the structure of FQI and hence leads to better convergence. In our work, we derive the finite-sample non-asymptotic convergence of GNTD under linear, neural network, and general smooth function approximations. In particular, recent works on neural TD only guarantee a suboptimal $\mathcal{O}(\epsilon^{-4})$ sample complexity, while GNTD obtains an improved complexity of $\tilde{\mathcal{O}}(\epsilon^{-2})$. Finally, we validate our method via extensive experiments in both online and offline RL problems. Our method exhibits both higher rewards and faster convergence than TD-type methods, including DQN.
    Keywords Mathematics - Optimization and Control ; Computer Science - Machine Learning
    Subject code 518
    Publishing date 2023-02-25
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  3. Article ; Online: Predicting sequenced dental treatment plans from electronic dental records using deep learning.

    Chen, Haifan / Liu, Pufan / Chen, Zhaoxing / Chen, Qingxiao / Wen, Zaiwen / Xie, Ziqing

    Artificial intelligence in medicine

    2023  Volume 147, Page(s) 102734

    Abstract Background: Designing appropriate clinical dental treatment plans is an urgent need because a growing number of dental patients are suffering from partial edentulism with the population getting older.
    Objectives: The aim of this study is to predict sequential treatment plans from electronic dental records.
    Methods: We construct a clinical decision support model, MultiTP, which explores the unique topology of teeth information and the variation of complicated treatments, adaptively integrates deep learning models (a convolutional neural network and a recurrent neural network), and embeds the attention mechanism to produce optimal treatment plans.
    Results: MultiTP shows its promising performance with an AUC of 0.9079 and an F score of 0.8472 over five treatment plans. The interpretability analysis also indicates its capability in mining clinical knowledge from the textual data.
    Conclusions: MultiTP's novel problem formulation, neural network framework, and interpretability analysis techniques allow for broad applications of deep learning in dental healthcare, providing valuable support for predicting dental treatment plans in the clinic and benefiting dental patients.
    Clinical implications: The MultiTP is an efficient tool that can be implemented in clinical practice and integrated into the existing EDR system. By predicting treatment plans for partial edentulism, the model will help dentists improve their clinical decisions.
    MeSH term(s) Humans ; Deep Learning ; Dental Records ; Electronics ; Neural Networks, Computer ; Dental Care
    Language English
    Publishing date 2023-11-29
    Publishing country Netherlands
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 645179-2
    ISSN 1873-2860 ; 0933-3657
    ISSN (online) 1873-2860
    ISSN 0933-3657
    DOI 10.1016/j.artmed.2023.102734
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Book ; Online: Monte Carlo Policy Gradient Method for Binary Optimization

    Chen, Cheng / Chen, Ruitao / Li, Tianyou / Ao, Ruichen / Wen, Zaiwen

    2023  

    Abstract Binary optimization has a wide range of applications in combinatorial optimization problems such as MaxCut, MIMO detection, and MaxSAT. However, these problems are typically NP-hard due to the binary constraints. We develop a novel probabilistic model to sample the binary solution according to a parameterized policy distribution. Specifically, minimizing the KL divergence between the parameterized policy distribution and the Gibbs distributions of the function value leads to a stochastic optimization problem whose policy gradient can be derived explicitly similar to reinforcement learning. For coherent exploration in discrete spaces, parallel Markov Chain Monte Carlo (MCMC) methods are employed to sample from the policy distribution with diversity and approximate the gradient efficiently. We further develop a filter scheme to replace the original objective function by the one with the local search technique to broaden the horizon of the function landscape. Convergence to stationary points in expectation of the policy gradient method is established based on the concentration inequality for MCMC. Numerical results show that this framework is very promising to provide near-optimal solutions for quite a few binary optimization problems.
    Keywords Mathematics - Optimization and Control ; Computer Science - Artificial Intelligence ; Computer Science - Machine Learning ; 90C09 ; 90C27 ; 90C59 ; 60J45 ; 60J20
    Subject code 510
    Publishing date 2023-07-03
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  5. Book ; Online: A Near-Optimal Primal-Dual Method for Off-Policy Learning in CMDP

    Chen, Fan / Zhang, Junyu / Wen, Zaiwen

    2022  

    Abstract As an important framework for safe Reinforcement Learning, the Constrained Markov Decision Process (CMDP) has been extensively studied in the recent literature. However, despite the rich results under various on-policy learning settings, an essential understanding of the offline CMDP problems is still lacking, in terms of both the algorithm design and the information theoretic sample complexity lower bound. In this paper, we focus on solving the CMDP problems where only offline data are available. By adopting the concept of the single-policy concentrability coefficient $C^*$, we establish an $\Omega\left(\frac{\min\left\{|\mathcal{S}||\mathcal{A}|,|\mathcal{S}|+I\right\} C^*}{(1-\gamma)^3\epsilon^2}\right)$ sample complexity lower bound for the offline CMDP problem, where $I$ stands for the number of constraints. By introducing a simple but novel deviation control mechanism, we propose a near-optimal primal-dual learning algorithm called DPDL. This algorithm provably guarantees zero constraint violation and its sample complexity matches the above lower bound except for an $\tilde{\mathcal{O}}((1-\gamma)^{-1})$ factor. A comprehensive discussion of how to deal with the unknown constant $C^*$ and the potential asynchronous structure of the offline dataset is also included.
    Keywords Computer Science - Machine Learning ; Computer Science - Artificial Intelligence ; Statistics - Machine Learning
    Subject code 006
    Publishing date 2022-07-13
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  6. Article ; Online: Clinical decision support model for tooth extraction therapy derived from electronic dental records.

    Cui, Qiwen / Chen, Qingxiao / Liu, Pufan / Liu, Debin / Wen, Zaiwen

    The Journal of prosthetic dentistry

    2020  Volume 126, Issue 1, Page(s) 83–90

    Abstract Statement of problem: Tooth extraction therapy serves as a key initial step in many prosthodontic treatment plans. Dentists must make an appropriate decision on the tooth extraction therapy considering multiple determinants and whether a clinical decision support (CDS) model might help.
    Purpose: The purpose of this retrospective records study was to construct a CDS model to predict tooth extraction therapy in clinical situations by using electronic dental records (EDRs).
    Material and methods: The cohort involved 4135 deidentified EDRs of 3559 patients from the database of a prosthodontics department. Knowledge-based algorithms were first proposed to convert raw data from EDRs into structured data for feature extraction. Redundant features were filtered by a recursive feature-elimination method. The tooth extraction problem was then modeled alternatively as a binary or triple classification problem to be solved by 5 machine learning algorithms. Five machine learning algorithms within each model were compared, as well as the efficiency between 2 models. In addition, the proposed CDS was verified by 2 prosthodontists.
    Results: The triple classification model outperformed the binary model with the F1 score of the Extreme Gradient Boost (XGBoost) algorithm as 0.856 and 0.847, respectively. The XGBoost outperformed the other 4 algorithms. The accuracy, precision, and recall of the XGBoost algorithm were 0.962, 0.865, and 0.830 in the binary classification and 0.924, 0.879, and 0.836 in the triple classification, respectively. The performance of the 2 prosthodontists was inferior to the models.
    Conclusions: The CDS model for tooth extraction therapy achieved high performance in terms of decision-making derived from EDRs.
    MeSH term(s) Algorithms ; Decision Support Systems, Clinical ; Dental Records ; Electronics ; Humans ; Retrospective Studies ; Tooth Extraction
    Language English
    Publishing date 2020-07-20
    Publishing country United States
    Document type Journal Article
    ZDB-ID 218157-5
    ISSN 1097-6841 ; 0022-3913
    ISSN (online) 1097-6841
    ISSN 0022-3913
    DOI 10.1016/j.prosdent.2020.04.010
    Database MEDical Literature Analysis and Retrieval System OnLINE

  7. Article ; Online: Pure State

    Faulstich, Fabian M / Kim, Raehyun / Cui, Zhi-Hao / Wen, Zaiwen / Kin-Lic Chan, Garnet / Lin, Lin

    Journal of chemical theory and computation

    2022  Volume 18, Issue 2, Page(s) 851–864

    Abstract Density matrix embedding theory (DMET) formally requires the matching of density matrix blocks obtained from high-level and low-level theories, but this is sometimes not achievable in practical calculations. In such a case, the global band gap of the low-level theory vanishes, and this can require additional numerical considerations. We find that both the violation of the exact matching condition and the vanishing low-level gap are related to the assumption that the high-level density matrix blocks are noninteracting pure-state
    Language English
    Publishing date 2022-01-27
    Publishing country United States
    Document type Journal Article
    ISSN 1549-9626
    ISSN (online) 1549-9626
    DOI 10.1021/acs.jctc.1c01061
    Database MEDical Literature Analysis and Retrieval System OnLINE

  8. Book ; Online: NG+: A Multi-Step Matrix-Product Natural Gradient Method for Deep Learning

    Yang, Minghan / Xu, Dong / Cui, Qiwen / Wen, Zaiwen / Xu, Pengxiang

    2021  

    Abstract In this paper, a novel second-order method called NG+ is proposed. By following the rule "the shape of the gradient equals the shape of the parameter", we define a generalized Fisher information matrix (GFIM) using the products of gradients in the matrix form rather than the traditional vectorization. Then, our generalized natural gradient direction is simply the inverse of the GFIM multiplied by the gradient in the matrix form. Moreover, the GFIM and its inverse are kept unchanged for multiple steps so that the computational cost can be controlled and is comparable with that of first-order methods. Global convergence is established under some mild conditions and a regret bound is also given for the online learning setting. Numerical results on image classification with ResNet50, quantum chemistry modeling with SchNet, neural machine translation with Transformer and recommendation system with DLRM illustrate that NG+ is competitive with the state-of-the-art methods.
    Keywords Mathematics - Optimization and Control ; Computer Science - Artificial Intelligence ; Computer Science - Machine Learning ; Statistics - Machine Learning
    Subject code 518
    Publishing date 2021-06-14
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  9. Book ; Online: Riemannian Natural Gradient Methods

    Hu, Jiang / Ao, Ruicheng / So, Anthony Man-Cho / Yang, Minghan / Wen, Zaiwen

    2022  

    Abstract This paper studies large-scale optimization problems on Riemannian manifolds whose objective function is a finite sum of negative log-probability losses. Such problems arise in various machine learning and signal processing applications. By introducing the notion of Fisher information matrix in the manifold setting, we propose a novel Riemannian natural gradient method, which can be viewed as a natural extension of the natural gradient method from the Euclidean setting to the manifold setting. We establish the almost-sure global convergence of our proposed method under standard assumptions. Moreover, we show that if the loss function satisfies certain convexity and smoothness conditions and the input-output map satisfies a Riemannian Jacobian stability condition, then our proposed method enjoys a local linear -- or, under the Lipschitz continuity of the Riemannian Jacobian of the input-output map, even quadratic -- rate of convergence. We then prove that the Riemannian Jacobian stability condition will be satisfied by a two-layer fully connected neural network with batch normalization with high probability, provided that the width of the network is sufficiently large. This demonstrates the practical relevance of our convergence rate result. Numerical experiments on applications arising from machine learning demonstrate the advantages of the proposed method over state-of-the-art ones.
    Keywords Mathematics - Optimization and Control ; Computer Science - Machine Learning
    Subject code 518
    Publishing date 2022-07-15
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  10. Book ; Online: Toward Solving 2-TBSG Efficiently

    Jia, Zeyu / Wen, Zaiwen / Ye, Yinyu

    2019  

    Abstract 2-TBSG is a two-player game model which aims to find Nash equilibria and is widely utilized in reinforcement learning and AI. Inspired by the fact that the simplex method for solving the deterministic discounted Markov decision processes (MDPs) is strongly polynomial independent of the discount factor, we try to answer the open problem of whether there is a similar algorithm for 2-TBSG. We develop a simplex strategy iteration where one player updates its strategy with a simplex step while the other player finds an optimal counterstrategy in turn, and a modified simplex strategy iteration. Both of them belong to a class of geometrically converging algorithms. We establish the strongly polynomial property of these algorithms by considering a strategy combined from the current strategy and the equilibrium strategy. Moreover, we present a method to transform general 2-TBSGs into special 2-TBSGs where each state has exactly two actions.

    Comment: 16 pages
    Keywords Computer Science - Computer Science and Game Theory ; Mathematics - Optimization and Control
    Subject code 510
    Publishing date 2019-06-08
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
