LIVIVO - The Search Portal for Life Sciences


Search results

Results 1 - 10 of 123 in total

Search options

  1. Book ; Online: High-dimensional analysis of double descent for linear regression with random projections

    Bach, Francis

    2023  

    Abstract We consider linear regression problems with a varying number of random projections, where we provably exhibit a double descent curve for a fixed prediction problem, with a high-dimensional analysis based on random matrix theory. We first consider the ridge regression estimator and review earlier results using classical notions from non-parametric statistics, namely degrees of freedom, also known as effective dimensionality. We then compute asymptotic equivalents of the generalization performance (in terms of squared bias and variance) of the minimum norm least-squares fit with random projections, providing simple expressions for the double descent phenomenon.
    Keywords Computer Science - Machine Learning ; Statistics - Machine Learning
    Publication date 2023-03-02
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (life sciences selection)
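    Illustration (not part of the record): the double-descent behaviour described in the abstract above can be reproduced with a few lines of NumPy. The sketch below is a toy setup under assumed choices (Gaussian data, arbitrary sizes, pseudo-inverse for the minimum-norm fit), not the paper's experiments; the test error typically peaks when the number of projections matches the number of samples and descends again beyond it.

    # Toy double-descent sketch: minimum-norm least squares on random projections.
    # All sizes and the data-generating model are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 200, 400                               # samples, ambient dimension
    theta = rng.normal(size=d) / np.sqrt(d)
    X = rng.normal(size=(n, d))
    y = X @ theta + 0.1 * rng.normal(size=n)
    X_test = rng.normal(size=(2000, d))
    y_test = X_test @ theta

    for k in [50, 100, 190, 200, 210, 400, 800]:  # number of random projections
        S = rng.normal(size=(d, k)) / np.sqrt(d)  # random projection matrix
        Z, Z_test = X @ S, X_test @ S
        w = np.linalg.pinv(Z) @ y                 # minimum-norm least-squares fit
        err = np.mean((Z_test @ w - y_test) ** 2)
        print(f"k = {k:4d}   test MSE = {err:.3f}")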


  2. Book ; Online: On the relationship between multivariate splines and infinitely-wide neural networks

    Bach, Francis

    2023  

    Abstract We consider multivariate splines and show that they have a random feature expansion as infinitely wide neural networks with one hidden layer and a homogeneous activation function which is a power of the rectified linear unit. We show that the associated function space is a Sobolev space on a Euclidean ball, with an explicit bound on the norms of derivatives. This link provides a new random feature expansion for multivariate splines that allows efficient algorithms. This random feature expansion is numerically better behaved than usual random Fourier features, both in theory and in practice. In particular, in dimension one, we use the associated leverage scores to compare the two random expansions and show a better scaling for the neural network expansion.
    Keywords Computer Science - Machine Learning ; Mathematics - Statistics Theory ; Statistics - Machine Learning
    Publication date 2023-02-07
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (life sciences selection)
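    Illustration (not part of the record): a generic random-feature construction of the kind this abstract builds on, with the homogeneous activation max(t, 0)**k and random directions on the unit sphere. The normalizations and the exact spline correspondence from the paper are not reproduced; this is only a Monte-Carlo kernel approximation under assumed sizes.

    # Random-feature (Monte-Carlo) approximation of the kernel
    # E_w[ relu(w.x)^k * relu(w.y)^k ] with w uniform on the sphere.
    # Dimensions, number of features and the power k are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    d, m, k = 3, 5000, 2                          # input dim, features, ReLU power

    W = rng.normal(size=(m, d))
    W /= np.linalg.norm(W, axis=1, keepdims=True) # directions on the unit sphere

    def features(X):
        return np.maximum(X @ W.T, 0.0) ** k / np.sqrt(m)

    X = rng.normal(size=(10, d))
    Y = rng.normal(size=(10, d))
    K_approx = features(X) @ features(Y).T        # kernel estimate between X and Y
    print(K_approx.shape, K_approx[0, 0])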


  3. Book ; Online: Information Theory with Kernel Methods

    Bach, Francis

    2022  

    Abstract We consider the analysis of probability distributions through their associated covariance operators from reproducing kernel Hilbert spaces. We show that the von Neumann entropy and relative entropy of these operators are intimately related to the usual notions of Shannon entropy and relative entropy, and share many of their properties. They come together with efficient estimation algorithms from various oracles on the probability distributions. We also consider product spaces and show that for tensor product kernels, we can define notions of mutual information and joint entropies, which can then characterize independence perfectly, but conditional independence only partially. We finally show how these new notions of relative entropy lead to new upper bounds on log-partition functions, which can be used together with convex optimization within variational inference methods, providing a new family of probabilistic inference methods.
    Keywords Computer Science - Information Theory ; Computer Science - Machine Learning ; Mathematics - Optimization and Control ; Statistics - Machine Learning
    Subject/category (code) 006
    Publication date 2022-02-17
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (life sciences selection)
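    Illustration (not part of the record): the central quantity in this abstract, the von Neumann entropy -tr(A log A) of a trace-normalized covariance operator, can be estimated from the eigenvalues of a kernel matrix, since the empirical covariance operator in the RKHS and the Gram matrix share their nonzero spectrum. Kernel choice, bandwidth and data below are illustrative assumptions.

    # Von Neumann entropy of a trace-normalized empirical covariance operator,
    # computed from the eigenvalues of a Gaussian kernel matrix (assumed bandwidth 1).
    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=(300, 2))                       # samples from the distribution
    sq = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / 2.0)                               # Gaussian kernel, bandwidth 1

    A = K / np.trace(K)                                 # same nonzero spectrum as the
                                                        # trace-normalized covariance operator
    lam = np.linalg.eigvalsh(A)
    lam = lam[lam > 1e-12]
    entropy = -(lam * np.log(lam)).sum()
    print(f"von Neumann entropy: {entropy:.3f}")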


  4. Book ; Online: Sum-of-Squares Relaxations for Information Theory and Variational Inference

    Bach, Francis

    2022  

    Abstract We consider extensions of the Shannon relative entropy, referred to as f-divergences. Three classical related computational problems are typically associated with these divergences: (a) estimation from moments, (b) computing normalizing integrals, and (c) variational inference in probabilistic models. These problems are related to one another through convex duality, and for all of them, there are many applications throughout data science; we aim for computationally tractable approximation algorithms that preserve properties of the original problem such as potential convexity or monotonicity. In order to achieve this, we derive a sequence of convex relaxations for computing these divergences from non-centered covariance matrices associated with a given feature vector: starting from the typically non-tractable optimal lower-bound, we consider an additional relaxation based on "sums-of-squares", which is now computable in polynomial time as a semidefinite program, as well as further computationally more efficient relaxations based on spectral information divergences from quantum information theory. For all of the tasks above, beyond proposing new relaxations, we derive tractable algorithms based on augmented Lagrangians and first-order methods, and we present illustrations on multivariate trigonometric polynomials and functions on the Boolean hypercube.
    Keywords Computer Science - Information Theory ; Computer Science - Machine Learning ; Mathematics - Optimization and Control ; Statistics - Machine Learning
    Subject/category (code) 004
    Publication date 2022-06-27
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (life sciences selection)


  5. Book ; Online: On the Effectiveness of Richardson Extrapolation in Machine Learning

    Bach, Francis

    2020  

    Abstract Richardson extrapolation is a classical technique from numerical analysis that can improve the approximation error of an estimation method by linearly combining several estimates obtained from different values of one of its hyperparameters, without the need to know in detail the inner structure of the original estimation method. The main goal of this paper is to study when Richardson extrapolation can be used within machine learning, beyond the existing applications to step-size adaptations in stochastic gradient descent. We identify two situations where Richardson extrapolation can be useful: (1) when the hyperparameter is the number of iterations of an existing iterative optimization algorithm, with applications to averaged gradient descent and Frank-Wolfe algorithms (where we obtain asymptotic rates of $O(1/k^2)$ on polytopes, where $k$ is the number of iterations), and (2) when it is a regularization parameter, with applications to Nesterov smoothing techniques for minimizing non-smooth functions (where we obtain asymptotic rates close to $O(1/k^2)$ for non-smooth functions), and ridge regression. In all these cases, we show that extrapolation techniques come with no significant loss in performance, but with sometimes strong gains, and we provide theoretical justifications based on asymptotic expansions for such gains, as well as empirical illustrations on classical problems from machine learning.
    Keywords Computer Science - Machine Learning ; Mathematics - Numerical Analysis ; Mathematics - Optimization and Control
    Subject/category (code) 518
    Publication date 2020-02-07
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (life sciences selection)
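    Illustration (not part of the record): the basic Richardson-extrapolation step this abstract refers to, shown on a toy numerical-differentiation problem (the example is illustrative, not from the paper). Combining the estimates at step sizes h and h/2 linearly cancels the leading O(h) error term.

    # Richardson extrapolation on a toy problem: forward-difference derivative
    # of sin at x = 1 (illustrative example, not from the paper).
    import math

    def fwd_diff(f, x, h):
        # first-order estimate: error = c1*h + c2*h^2 + ...
        return (f(x + h) - f(x)) / h

    x, h = 1.0, 1e-2
    d1 = fwd_diff(math.sin, x, h)
    d2 = fwd_diff(math.sin, x, h / 2)
    richardson = 2 * d2 - d1                # cancels the O(h) term, leaves O(h^2)

    exact = math.cos(x)
    print(f"plain estimate   error = {abs(d1 - exact):.2e}")
    print(f"extrapolated     error = {abs(richardson - exact):.2e}")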


  6. Book ; Online: Kernelized Diffusion maps

    Pillaud-Vivien, Loucas / Bach, Francis

    2023  

    Abstract Spectral clustering and diffusion maps are celebrated dimensionality reduction algorithms built on eigen-elements related to the diffusive structure of the data. The core of these procedures is the approximation of a Laplacian through a graph kernel approach; however, this local averaging construction is known to suffer from the curse of dimensionality in the dimension d. In this article, we build a different estimator of the Laplacian, via a reproducing kernel Hilbert space method, which adapts naturally to the regularity of the problem. We provide non-asymptotic statistical rates proving that the kernel estimator we build can circumvent the curse of dimensionality. Finally, we discuss techniques (Nyström subsampling, Fourier features) that make it possible to reduce the computational cost of the estimator without degrading its overall performance.

    Comment: 19 pages, 1 Figure
    Keywords Statistics - Machine Learning ; Computer Science - Machine Learning ; Mathematics - Numerical Analysis ; Mathematics - Statistics Theory
    Subject/category (code) 510
    Publication date 2023-02-13
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (life sciences selection)
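    Illustration (not part of the record): the classical graph-kernel diffusion-map construction that this abstract takes as its starting point and improves upon. Affinity kernel, bandwidth, diffusion time and data are illustrative assumptions; the paper's RKHS-based Laplacian estimator is not reproduced here.

    # Classical diffusion map: Gaussian affinity -> Markov matrix -> eigen-embedding.
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 3))                        # data points
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / (2 * 0.5 ** 2))                     # graph kernel / affinity

    P = K / K.sum(axis=1, keepdims=True)                 # row-stochastic transition matrix
    evals, evecs = np.linalg.eig(P)
    order = np.argsort(-evals.real)
    t = 2                                                # diffusion time
    # embedding: nontrivial eigenvectors scaled by eigenvalue^t
    embedding = evecs.real[:, order[1:3]] * (evals.real[order[1:3]] ** t)
    print(embedding.shape)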


  7. Book ; Online: Convergence Rates for Non-Log-Concave Sampling and Log-Partition Estimation

    Holzmüller, David / Bach, Francis

    2023  

    Abstract Sampling from Gibbs distributions $p(x) \propto \exp(-V(x)/\varepsilon)$ and computing their log-partition function are fundamental tasks in statistics, machine learning, and statistical physics. However, while efficient algorithms are known for convex potentials $V$, the situation is much more difficult in the non-convex case, where algorithms necessarily suffer from the curse of dimensionality in the worst case. For optimization, which can be seen as a low-temperature limit of sampling, it is known that smooth functions $V$ allow faster convergence rates. Specifically, for $m$-times differentiable functions in $d$ dimensions, the optimal rate for algorithms with $n$ function evaluations is known to be $O(n^{-m/d})$, where the constant can potentially depend on $m, d$ and the function to be optimized. Hence, the curse of dimensionality can be alleviated for smooth functions at least in terms of the convergence rate. Recently, it has been shown that similarly fast rates can also be achieved with polynomial runtime $O(n^{3.5})$, where the exponent $3.5$ is independent of $m$ or $d$. Hence, it is natural to ask whether similar rates for sampling and log-partition computation are possible, and whether they can be realized in polynomial time with an exponent independent of $m$ and $d$. We show that the optimal rates for sampling and log-partition computation are sometimes equal and sometimes faster than for optimization. We then analyze various polynomial-time sampling algorithms, including an extension of a recent promising optimization approach, and find that they sometimes exhibit interesting behavior but no near-optimal rates. Our results also give further insights on the relation between sampling, log-partition, and optimization problems.

    Comment: Changes in v2: Minor corrections and formatting changes. Plots can be reproduced using the code at https://github.com/dholzmueller/sampling_experiments
    Keywords Statistics - Machine Learning ; Computer Science - Machine Learning ; Mathematics - Statistics Theory ; Statistics - Computation
    Subject/category (code) 510
    Publication date 2023-03-06
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (life sciences selection)
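    Illustration (not part of the record): the log-partition quantity in this abstract, the logarithm of the integral of exp(-V(x)/epsilon), computed by brute-force one-dimensional quadrature for a simple non-convex potential. Potential, epsilon and grid are illustrative assumptions; the paper is precisely about doing better than such grid-based schemes in higher dimension.

    # Brute-force 1-D estimate of log Z = log \int exp(-V(x)/eps) dx
    # for an illustrative double-well (non-convex) potential.
    import numpy as np

    def V(x):
        return (x ** 2 - 1.0) ** 2      # double-well potential

    eps = 0.25
    x = np.linspace(-3.0, 3.0, 10_001)
    dx = x[1] - x[0]

    logw = -V(x) / eps
    # log-sum-exp trick for a numerically stable Riemann-sum estimate
    log_Z = logw.max() + np.log(np.sum(np.exp(logw - logw.max())) * dx)
    print(f"log Z approx {log_Z:.4f}")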


  8. Book ; Online: Going Deeper with Spectral Decompositions

    Cabannes, Vivien / Bach, Francis

    2023  

    Abstract Historically, the machine learning community has derived spectral decompositions from graph-based approaches. We break with this approach and prove the statistical and computational superiority of the Galerkin method, which consists in restricting the study to a small set of test functions. In particular, we introduce implementation tricks to deal with differential operators in large dimensions with structured kernels. Finally, we extend the core principles beyond our approach, applying them to non-linear spaces of functions, such as the ones parameterized by deep neural networks, through loss-based optimization procedures.
    Keywords Computer Science - Machine Learning ; Computer Science - Artificial Intelligence ; Statistics - Machine Learning
    Publication date 2023-06-01
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (life sciences selection)


  9. Article ; Online: AriaNN

    Ryffel, Théo / Tholoniat, Pierre / Pointcheval, David / Bach, Francis

    Proceedings on Privacy Enhancing Technologies, Vol 2022, Iss 1, Pp 291-

    Low-Interaction Privacy-Preserving Deep Learning via Function Secret Sharing

    2022  Volume 316

    Abstract We propose AriaNN, a low-interaction privacy-preserving framework for private neural network training and inference on sensitive data.
    Keywords multi party computation ; function secret sharing ; secure comparison ; deep learning ; Ethics ; BJ1-1725 ; Electronic computers. Computer science ; QA75.5-76.95
    Language English
    Publication date 2022-01-01T00:00:00Z
    Publisher Sciendo
    Document type Article ; Online
    Data source BASE - Bielefeld Academic Search Engine (life sciences selection)


  10. Book ; Online: Polynomial-time Sparse Measure Recovery

    Daneshmand, Hadi / Bach, Francis

    From Mean Field Theory to Algorithm Design

    2022  

    Abstract Mean field theory has provided theoretical insights into various algorithms by letting the problem size tend to infinity. We argue that the applications of mean-field theory go beyond theoretical insights as it can inspire the design of practical algorithms. Leveraging mean-field analyses in physics, we propose a novel algorithm for sparse measure recovery. For sparse measures over $\mathbb{R}$, we propose a polynomial-time recovery method from Fourier moments that improves upon convex relaxation methods in a specific parameter regime; then, we demonstrate the application of our results for the optimization of particular two-dimensional, single-layer neural networks in realizable settings.
    Keywords Computer Science - Machine Learning ; Statistics - Machine Learning
    Publication date 2022-04-16
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (life sciences selection)

