LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 7 of 7

  1. Book ; Online: Fantastic Generalization Measures are Nowhere to be Found

    Gastpar, Michael / Nachum, Ido / Shafer, Jonathan / Weinberger, Thomas

    2023  

    Abstract We study the notion of a generalization bound being uniformly tight, meaning that the difference between the bound and the population loss is small for all learning algorithms and all population distributions. Numerous generalization bounds have been proposed in the literature as potential explanations for the ability of neural networks to generalize in the overparameterized setting. However, in their paper "Fantastic Generalization Measures and Where to Find Them," Jiang et al. (2020) examine more than a dozen generalization bounds, and show empirically that none of them are uniformly tight. This raises the question of whether uniformly-tight generalization bounds are at all possible in the overparameterized setting. We consider two types of generalization bounds: (1) bounds that may depend on the training set and the learned hypothesis (e.g., margin bounds). We prove mathematically that no such bound can be uniformly tight in the overparameterized setting; (2) bounds that may in addition also depend on the learning algorithm (e.g., stability bounds). For these bounds, we show a trade-off between the algorithm's performance and the bound's tightness. Namely, if the algorithm achieves good accuracy on certain distributions, then no generalization bound can be uniformly tight for it in the overparameterized setting. We explain how these formal results can, in our view, inform research on generalization bounds for neural networks, while stressing that other interpretations of these results are also possible.

    Comment: 34 pages, 1 figure. Minor fix: subsection 6.2 -> section 7
    Keywords Computer Science - Machine Learning ; Computer Science - Neural and Evolutionary Computing ; Statistics - Machine Learning
    Subject code 006
    Publishing date 2023-09-24
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
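
    A rough formalization of "uniformly tight", read off the abstract (the notation and the choice of expectation here are not taken from the paper, which may quantify differently): a bound $B(S,h)$, computed from the training set $S \sim \mathcal{D}^n$ and the learned hypothesis $h$, is uniformly $\varepsilon$-tight if

    $\sup_{A}\, \sup_{\mathcal{D}}\ \mathbb{E}_{S \sim \mathcal{D}^n}\big[\,|B(S, A(S)) - L_{\mathcal{D}}(A(S))|\,\big] \le \varepsilon,$

    where $L_{\mathcal{D}}$ denotes the population loss. Result (1) then says no bound of the form $B(S,h)$ can satisfy this in the overparameterized setting; result (2) says that even a bound that additionally inspects the algorithm $A$ cannot, once $A$ achieves good accuracy on certain distributions.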

  2. Book ; Online: Finite Littlestone Dimension Implies Finite Information Complexity

    Pradeep, Aditya / Nachum, Ido / Gastpar, Michael

    2022  

    Abstract We prove that every online learnable class of functions of Littlestone dimension $d$ admits a learning algorithm with finite information complexity. Towards this end, we use the notion of a globally stable algorithm. Generally, the information complexity of such a globally stable algorithm is large yet finite, roughly exponential in $d$. We also show there is room for improvement; for a canonical online learnable class, indicator functions of affine subspaces of dimension $d$, the information complexity can be upper bounded logarithmically in $d$.
    Keywords Computer Science - Machine Learning ; Computer Science - Information Theory
    Publishing date 2022-06-27
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
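
    Stated as a formula (notation not taken from the paper, constants left implicit as in the abstract): if a class $\mathcal{H}$ has Littlestone dimension $d$, there is a learning algorithm $A$, built from a globally stable algorithm, whose information complexity is finite, roughly $I(S; A(S)) \le 2^{O(d)}$, where $S$ is the input sample and $A(S)$ the output hypothesis; for the particular class of indicator functions of $d$-dimensional affine subspaces, the abstract states an upper bound logarithmic in $d$, i.e. $O(\log d)$.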

  3. Book ; Online: On Symmetry and Initialization for Neural Networks

    Nachum, Ido / Yehudayoff, Amir

    2019  

    Abstract This work provides an additional step in the theoretical understanding of neural networks. We consider neural networks with one hidden layer and show that when learning symmetric functions, one can choose initial conditions so that standard SGD training efficiently produces generalization guarantees. We empirically verify this and show that this does not hold when the initial conditions are chosen at random. The proof of convergence investigates the interaction between the two layers of the network. Our results highlight the importance of using symmetry in the design of neural networks.
    Keywords Computer Science - Machine Learning ; Statistics - Machine Learning
    Publishing date 2019-07-01
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
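
    A minimal numerical illustration of the symmetry idea, under a toy setup of our own (this is not the paper's construction, and it only looks at initialization, not at SGD training): if every hidden unit of a one-hidden-layer ReLU network starts with a constant weight vector, the network sees an input only through its coordinate sum, so it is automatically invariant under input permutations, exactly the invariance a symmetric target function has; a randomly initialized network is not.

        import numpy as np

        rng = np.random.default_rng(0)
        n, width = 10, 16                               # input dimension, hidden width

        def forward(W, v, b, x):
            """One-hidden-layer ReLU network: v^T relu(W x + b)."""
            return v @ np.maximum(W @ x + b, 0.0)

        x = rng.integers(0, 2, size=n).astype(float)    # a Boolean input
        x_perm = x[rng.permutation(n)]                  # same multiset of bits, so any
                                                        # symmetric target labels it identically

        # "Symmetry-aware" initialization (toy version): hidden unit j uses the
        # constant weight vector c_j * (1, ..., 1), so it only sees sum(x).
        c = rng.normal(size=(width, 1))
        W_sym = c * np.ones((width, n))
        v, b = rng.normal(size=width), rng.normal(size=width)
        print(forward(W_sym, v, b, x) - forward(W_sym, v, b, x_perm))    # exactly 0.0

        # Random i.i.d. Gaussian initialization: no permutation invariance.
        W_rand = rng.normal(size=(width, n))
        print(forward(W_rand, v, b, x) - forward(W_rand, v, b, x_perm))  # generally nonzero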

  4. Book ; Online: Regularization by Misclassification in ReLU Neural Networks

    Cornacchia, Elisabetta / Hązła, Jan / Nachum, Ido / Yehudayoff, Amir

    2021  

    Abstract We study the implicit bias of ReLU neural networks trained by a variant of SGD where at each step, the label is changed with probability $p$ to a random label (label smoothing being a close variant of this procedure). Our experiments demonstrate that label noise propels the network to a sparse solution in the following sense: for a typical input, a small fraction of neurons are active, and the firing pattern of the hidden layers is sparser. In fact, for some instances, an appropriate amount of label noise does not only sparsify the network but further reduces the test error. We then turn to the theoretical analysis of such sparsification mechanisms, focusing on the extremal case of $p=1$. We show that in this case, the network withers as anticipated from experiments, but surprisingly, in different ways that depend on the learning rate and the presence of bias, with either weights vanishing or neurons ceasing to fire.
    Keywords Computer Science - Machine Learning ; Computer Science - Neural and Evolutionary Computing
    Subject code 006
    Publishing date 2021-11-03
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
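
    A small sketch of the two ingredients the abstract describes, with stand-in names of our own (corrupt_labels, active_fraction) rather than the paper's experimental setup: the SGD variant replaces each label with a random one with probability $p$, and sparsity can be probed as the fraction of hidden ReLU units that fire on typical inputs.

        import numpy as np

        rng = np.random.default_rng(1)

        def corrupt_labels(y, p, num_classes, rng):
            """With probability p, replace each label by a uniformly random label."""
            flip = rng.random(y.shape[0]) < p
            random_labels = rng.integers(0, num_classes, size=y.shape[0])
            return np.where(flip, random_labels, y)

        def active_fraction(W, b, X):
            """Fraction of hidden ReLU units that fire, averaged over the inputs X."""
            pre = X @ W.T + b                 # (num_inputs, width) pre-activations
            return float((pre > 0).mean())

        # Toy usage: a 10-class label vector corrupted at p = 0.2, and the firing
        # rate of a random hidden layer on Gaussian inputs.
        y = rng.integers(0, 10, size=1000)
        y_noisy = corrupt_labels(y, p=0.2, num_classes=10, rng=rng)
        print("labels changed:", float((y != y_noisy).mean()))    # about 0.2 * 9/10
        W, b = rng.normal(size=(64, 20)), rng.normal(size=64)
        X = rng.normal(size=(1000, 20))
        print("active fraction:", active_fraction(W, b, X))       # roughly 0.5 at random init

    The paper's claim concerns how this active fraction shrinks under training with label noise (including the extremal case $p=1$); the sketch only sets up the measurement.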

  5. Book ; Online: A Johnson--Lindenstrauss Framework for Randomly Initialized CNNs

    Nachum, Ido / Hązła, Jan / Gastpar, Michael / Khina, Anatoly

    2021  

    Abstract How does the geometric representation of a dataset change after the application of each randomly initialized layer of a neural network? The celebrated Johnson--Lindenstrauss lemma answers this question for linear fully-connected neural networks (FNNs), stating that the geometry is essentially preserved. For FNNs with the ReLU activation, the angle between two inputs contracts according to a known mapping. The question for non-linear convolutional neural networks (CNNs) becomes much more intricate. To answer this question, we introduce a geometric framework. For linear CNNs, we show that the Johnson--Lindenstrauss lemma continues to hold, namely, that the angle between two inputs is preserved. For CNNs with ReLU activation, on the other hand, the behavior is richer: The angle between the outputs contracts, where the level of contraction depends on the nature of the inputs. In particular, after one layer, the geometry of natural images is essentially preserved, whereas for Gaussian correlated inputs, CNNs exhibit the same contracting behavior as FNNs with ReLU activation.
    Keywords Computer Science - Machine Learning ; Computer Science - Neural and Evolutionary Computing ; Mathematics - Probability
    Publishing date 2021-11-03
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
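
    An easy-to-run version of the question the abstract opens with, restricted to the fully-connected case it cites as known (our own check, not the paper's CNN framework): compare the angle between two inputs with the angle after a random linear layer and after a random ReLU layer.

        import numpy as np

        rng = np.random.default_rng(2)

        def angle(u, v):
            """Angle between two vectors, in radians."""
            cos = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
            return float(np.arccos(np.clip(cos, -1.0, 1.0)))

        d_in, d_out = 100, 4000
        x, y = rng.normal(size=d_in), rng.normal(size=d_in)
        W = rng.normal(size=(d_out, d_in)) / np.sqrt(d_in)    # random linear layer

        print("input angle:       ", angle(x, y))
        print("after linear layer:", angle(W @ x, W @ y))     # approximately preserved (J-L)
        print("after ReLU layer:  ", angle(np.maximum(W @ x, 0.0),
                                           np.maximum(W @ y, 0.0)))   # strictly smaller

    The interesting case in the paper is the convolutional one, where the amount of contraction depends on the nature of the inputs; that is beyond this sketch.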

  6. Article ; Online: On the Perceptron’s Compression

    Moran, Shay / Nachum, Ido / Panasoff, Itai / Yehudayoff, Amir

    Beyond the Horizon of Computability

    Abstract We study and provide exposition to several phenomena that are related to the perceptron’s compression. One theme concerns modifications of the perceptron algorithm that yield better guarantees on the margin of the hyperplane it outputs. These modifications can be useful in training neural networks as well, and we demonstrate them with some experimental data. In a second theme, we deduce conclusions from the perceptron’s compression in various contexts.
    Keywords covid19
    Publisher PMC
    Document type Article ; Online
    DOI 10.1007/978-3-030-51466-2_29
    Database COVID19
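
    For orientation, the baseline algorithm the article studies, in textbook form (the article's margin-improving modifications are not reproduced here):

        import numpy as np

        def perceptron(X, y, max_epochs=1000):
            """Classic perceptron: X is (n, d), y in {-1, +1}. If the data are
            linearly separable, repeated passes find w with sign(X @ w) == y."""
            w = np.zeros(X.shape[1])
            for _ in range(max_epochs):
                mistakes = 0
                for xi, yi in zip(X, y):
                    if yi * (xi @ w) <= 0:    # misclassified (or on the boundary)
                        w += yi * xi          # standard additive update
                        mistakes += 1
                if mistakes == 0:
                    break
            return w

        # Tiny separable example.
        X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
        y = np.array([1, 1, -1, -1])
        w = perceptron(X, y)
        print(w, np.sign(X @ w))              # predictions match y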

  7. Book ; Online: Almost-Reed--Muller Codes Achieve Constant Rates for Random Errors

    Abbe, Emmanuel / Hązła, Jan / Nachum, Ido

    2020  

    Abstract This paper considers '$\delta$-almost Reed-Muller codes', i.e., linear codes spanned by evaluations of all but a $\delta$ fraction of monomials of degree at most $d$. It is shown that for any $\delta > 0$ and any $\varepsilon>0$, there exists a family of $\delta$-almost Reed-Muller codes of constant rate that correct $1/2-\varepsilon$ fraction of random errors with high probability. For exact Reed-Muller codes, the analogous result is not known and represents a weaker version of the longstanding conjecture that Reed-Muller codes achieve capacity for random errors (Abbe-Shpilka-Wigderson STOC '15). Our approach is based on the recent polarization result for Reed-Muller codes, combined with a combinatorial approach to establishing inequalities between the Reed-Muller code entropies.
    Keywords Computer Science - Information Theory
    Publishing date 2020-04-20
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
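
    A concrete view of the object in the abstract, for illustration only (the construction below is our own, and the paper's decoding analysis is not attempted): the Reed-Muller code $RM(d, m)$ is spanned by the evaluations over $\{0,1\}^m$ of all monomials of degree at most $d$, and a $\delta$-almost version keeps all but a $\delta$ fraction of those monomials.

        import itertools
        import numpy as np

        def rm_generator(m, d):
            """Rows = evaluations over {0,1}^m of all monomials of degree <= d."""
            points = np.array(list(itertools.product([0, 1], repeat=m)), dtype=np.uint8)
            rows = []
            for deg in range(d + 1):
                for support in itertools.combinations(range(m), deg):
                    monomial = np.ones(len(points), dtype=np.uint8)
                    for i in support:                   # prod_{i in support} x_i
                        monomial = monomial * points[:, i]
                    rows.append(monomial)
            return np.array(rows)

        def almost_rm_generator(m, d, delta, rng):
            """Drop each monomial independently with probability delta
            (a delta fraction in expectation)."""
            G = rm_generator(m, d)
            keep = rng.random(G.shape[0]) >= delta
            return G[keep]

        rng = np.random.default_rng(3)
        print(rm_generator(m=4, d=2).shape)                        # (11, 16): 1+4+6 monomials
        print(almost_rm_generator(4, 2, delta=0.1, rng=rng).shape)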
