LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 10 of 17

  1. Article ; Online: Towards a Unified Theory of Learning and Information.

    Alabdulmohsin, Ibrahim

    Entropy (Basel, Switzerland)

    2020  Volume 22, Issue 4

    Abstract In this paper, we introduce the notion of "learning capacity" for algorithms that learn from data, which is analogous to the Shannon channel capacity for communication systems. We show how "learning capacity" bridges the gap between statistical learning theory and information theory, and we use it to derive generalization bounds for finite hypothesis spaces, differential privacy, and countable domains, among others. Moreover, we prove that under the Axiom of Choice, the existence of an empirical risk minimization (ERM) rule that has a vanishing learning capacity is equivalent to the assertion that the hypothesis space has a finite Vapnik-Chervonenkis (VC) dimension, thus establishing an equivalence relation between two of the most fundamental concepts in statistical learning theory and information theory. In addition, we show how the learning capacity of an algorithm provides important qualitative results, such as the relation between generalization and algorithmic stability, information leakage, and data processing. Finally, we conclude by listing some open problems and suggesting future directions of research.
    Language English
    Publishing date 2020-04-13
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2014734-X
    ISSN (online) 1099-4300
    DOI 10.3390/e22040438
    Database MEDLINE (Medical Literature Analysis and Retrieval System Online)
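
    The abstract notes that learning capacity yields generalization bounds for finite hypothesis spaces. As a point of reference, here is a minimal sketch of the classical finite-class bound (Hoeffding's inequality plus a union bound) that such frameworks recover; the function name and the example numbers are illustrative, not taken from the paper.

```python
import math

def finite_class_bound(num_hypotheses: int, n: int, delta: float = 0.05) -> float:
    """With probability >= 1 - delta, every h in a finite class H satisfies
    |true_risk(h) - empirical_risk(h)| <= sqrt(ln(2|H|/delta) / (2n))
    (Hoeffding's inequality plus a union bound over H)."""
    return math.sqrt(math.log(2 * num_hypotheses / delta) / (2 * n))

# Example: 1,000 hypotheses and 10,000 training samples give a gap of ~0.023.
print(finite_class_bound(1000, 10_000))
```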

  2. Book ; Online: Fair Classification via Unconstrained Optimization

    Alabdulmohsin, Ibrahim

    2020  

    Abstract Achieving the Bayes optimal binary classification rule subject to group fairness constraints is known to be reducible, in some cases, to learning a group-wise thresholding rule over the Bayes regressor. In this paper, we extend this result by proving that, in a broader setting, the Bayes optimal fair learning rule remains a group-wise thresholding rule over the Bayes regressor but with a (possible) randomization at the thresholds. This provides a stronger justification for the post-processing approach in fair classification, in which (1) a predictor is learned first, after which (2) its output is adjusted to remove bias. We show how the post-processing rule in this two-stage approach can be learned quite efficiently by solving an unconstrained optimization problem. The proposed algorithm can be applied to any black-box machine learning model, such as deep neural networks, random forests and support vector machines. In addition, it can accommodate many fairness criteria that have been previously proposed in the literature, such as equalized odds and statistical parity. We prove that the algorithm is Bayes consistent and motivate it, furthermore, via an impossibility result that quantifies the tradeoff between accuracy and fairness across multiple demographic groups. Finally, we conclude by validating the algorithm on the Adult benchmark dataset.
    Keywords Computer Science - Machine Learning ; Computer Science - Artificial Intelligence ; Statistics - Applications ; Statistics - Machine Learning ; 68T05 ; I.2.6 ; I.5
    Subject code 006
    Publishing date 2020-05-21
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
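
    The abstract describes post-processing a black-box predictor with a group-wise thresholding rule. Below is a minimal sketch, assuming statistical parity as the fairness criterion: each group's threshold is set so that all groups are accepted at the same rate. The paper instead learns the thresholds (with possible randomization) by solving an unconstrained optimization problem, so this quantile rule is only an illustration; all names here are made up.

```python
import numpy as np

def group_thresholds_for_parity(scores, groups, target_rate=0.3):
    """Pick a per-group threshold on the predictor's scores so that every
    group is accepted at the same target rate (statistical parity).
    The paper learns its thresholds, with possible randomization, via
    unconstrained optimization; this quantile rule is only a stand-in."""
    return {g: np.quantile(scores[groups == g], 1.0 - target_rate)
            for g in np.unique(groups)}

rng = np.random.default_rng(0)
scores = rng.uniform(size=1000)          # scores from any black-box predictor
groups = rng.integers(0, 2, size=1000)   # binary sensitive attribute
print(group_thresholds_for_parity(scores, groups))
```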

  3. Book ; Online: Getting ViT in Shape

    Alabdulmohsin, Ibrahim / Zhai, Xiaohua / Kolesnikov, Alexander / Beyer, Lucas

    Scaling Laws for Compute-Optimal Model Design

    2023  

    Abstract Scaling laws have recently been employed to derive compute-optimal model size (number of parameters) for a given compute duration. We advance and refine such methods to infer compute-optimal model shapes, such as width and depth, and successfully implement this in vision transformers. Our shape-optimized vision transformer, SoViT, achieves results competitive with models that exceed twice its size, despite being pre-trained with an equivalent amount of compute. For example, SoViT-400m/14 achieves 90.3% fine-tuning accuracy on ILSVRC-2012, surpassing the much larger ViT-g/14 and approaching ViT-G/14 under identical settings, at less than half the inference cost. We conduct a thorough evaluation across multiple tasks, such as image classification, captioning, VQA and zero-shot transfer, demonstrating the effectiveness of our model across a broad range of domains and identifying limitations. Overall, our findings challenge the prevailing approach of blindly scaling up vision models and pave a path toward more informed scaling.

    Comment: 10 pages, 7 figures, 9 tables. Version 2: Layout fixes
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Machine Learning ; I.2.10 ; I.2.6
    Subject code 006
    Publishing date 2023-05-22
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
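
    The abstract's idea of a compute-optimal model shape can be illustrated with a toy search: given (hypothetical) power-law fits of loss against width and depth and a crude cost model, pick the shape that minimizes predicted loss within a compute budget. All coefficients below are invented for illustration; the paper derives its exponents from actual scaling experiments.

```python
def predicted_loss(width, depth, a=80.0, b=0.35, c=60.0, d=0.30, l0=1.0):
    # Hypothetical separable scaling law: loss decays as a power law in
    # width and depth. All coefficients are made up for illustration.
    return l0 + a * width ** (-b) + c * depth ** (-d)

def flops_per_token(width, depth):
    # Crude transformer cost model: compute grows ~ depth * width^2.
    return 12 * depth * width ** 2

budget = 5e9  # target compute per token
candidates = ((w, d) for w in range(256, 4097, 64) for d in range(4, 65)
              if flops_per_token(w, d) <= budget)
width, depth = min(candidates, key=lambda wd: predicted_loss(*wd))
print("compute-optimal shape (width, depth):", width, depth)
```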

  4. Book ; Online: A Near-Optimal Algorithm for Debiasing Trained Machine Learning Models

    Alabdulmohsin, Ibrahim / Lucic, Mario

    2021  

    Abstract We present a scalable post-processing algorithm for debiasing trained models, including deep neural networks (DNNs), which we prove to be near-optimal by bounding its excess Bayes risk. We empirically validate its advantages on standard benchmark datasets across both classical algorithms and modern DNN architectures and demonstrate that it outperforms previous post-processing methods while performing on par with in-processing methods. In addition, we show that the proposed algorithm is particularly effective for models trained at scale, where post-processing is a natural and practical choice.

    Comment: 21 pages, 5 figures
    Keywords Computer Science - Machine Learning ; Computer Science - Artificial Intelligence ; Statistics - Machine Learning ; 68T05 ; 68T45 ; 93E35 ; I.2.6 ; I.2.10
    Publishing date 2021-06-06
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  5. Book ; Online: Revisiting Neural Scaling Laws in Language and Vision

    Alabdulmohsin, Ibrahim / Neyshabur, Behnam / Zhai, Xiaohua

    2022  

    Abstract The remarkable progress in deep learning in recent years is largely driven by improvements in scale, where bigger models are trained on larger datasets for longer schedules. To predict the benefit of scale empirically, we argue for a more rigorous methodology based on the extrapolation loss, instead of reporting the best-fitting (interpolating) parameters. We then present a recipe for estimating scaling law parameters reliably from learning curves. We demonstrate that it extrapolates more accurately than previous methods in a wide range of architecture families across several domains, including image classification, neural machine translation (NMT) and language modeling, in addition to tasks from the BIG-Bench evaluation benchmark. Finally, we release a benchmark dataset comprising 90 evaluation tasks to facilitate research in this domain.
    Keywords Computer Science - Machine Learning ; Computer Science - Artificial Intelligence
    Publishing date 2022-09-13
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
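
    The abstract argues for judging scaling-law fits by extrapolation rather than interpolation. Here is a minimal sketch of that protocol, using an invented learning curve and the commonly used saturating power-law form (not necessarily the exact functional form or estimator the paper proposes):

```python
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(n, a, b, e_inf):
    # Power law with an irreducible-error floor: L(n) = a * n^(-b) + e_inf
    return a * np.power(n, -b) + e_inf

# Hypothetical learning curve: validation loss vs. training-set size.
n = np.array([1e3, 2e3, 4e3, 8e3, 16e3, 32e3])
loss = np.array([1.80, 1.52, 1.31, 1.16, 1.05, 0.97])

# Fit on the small-n prefix, then judge the fit by its *extrapolation*
# error on the held-out large-n points, as the abstract advocates.
fit_n, held_n = n[:4], n[4:]
params, _ = curve_fit(scaling_law, fit_n, loss[:4], p0=[10.0, 0.3, 0.5],
                      maxfev=10000)
extrapolation_err = np.abs(scaling_law(held_n, *params) - loss[4:])
print("extrapolation error:", extrapolation_err)
```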

  6. Book ; Online: A Reduction to Binary Approach for Debiasing Multiclass Datasets

    Alabdulmohsin, Ibrahim / Schrouff, Jessica / Koyejo, Oluwasanmi

    2022  

    Abstract We propose a novel reduction-to-binary (R2B) approach that enforces demographic parity for multiclass classification with non-binary sensitive attributes via a reduction to a sequence of binary debiasing tasks. We prove that R2B satisfies optimality and bias guarantees and demonstrate empirically that it can lead to an improvement over two baselines: (1) treating multiclass problems as multi-label by debiasing labels independently and (2) transforming the features instead of the labels. Surprisingly, we also demonstrate that independent label debiasing yields competitive results in most (but not all) settings. We validate these conclusions on synthetic and real-world datasets from social science, computer vision, and healthcare.

    Comment: 18 pages, 5 figures
    Keywords Computer Science - Machine Learning ; I.2.6 ; I.2.10
    Publishing date 2022-05-31
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
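
    The abstract reduces multiclass label debiasing to a sequence of binary tasks. Below is a toy sketch of that reduction under demographic parity, where each class's per-group rate is matched to the overall rate by randomly reassigning labels; the real R2B algorithm carries optimality and bias guarantees that this simplification does not claim.

```python
import numpy as np

rng = np.random.default_rng(0)

def r2b_debias_labels(labels, groups, num_classes):
    # Handle classes 0..K-2 as one-vs-rest binary tasks in sequence; the last
    # class absorbs demotions, so already-balanced classes stay untouched.
    labels = labels.copy()
    for k in range(num_classes - 1):
        target = np.mean(labels == k)                 # overall rate of class k
        for g in np.unique(groups):
            idx = np.where(groups == g)[0]
            quota = int(round(target * len(idx)))
            pos = idx[labels[idx] == k]
            if len(pos) > quota:                      # demote random extras
                out = rng.choice(pos, len(pos) - quota, replace=False)
                labels[out] = num_classes - 1
            elif len(pos) < quota:                    # promote from later classes
                later = idx[labels[idx] > k]
                take = min(quota - len(pos), len(later))
                if take:
                    labels[rng.choice(later, take, replace=False)] = k
    return labels

labels = rng.integers(0, 3, size=1000)
groups = rng.integers(0, 2, size=1000)
balanced = r2b_debias_labels(labels, groups, num_classes=3)
```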

  7. Book ; Online: The Impact of Reinitialization on Generalization in Convolutional Neural Networks

    Alabdulmohsin, Ibrahim / Maennel, Hartmut / Keysers, Daniel

    2021  

    Abstract Recent results suggest that reinitializing a subset of the parameters of a neural network during training can improve generalization, particularly for small training sets. We study the impact of different reinitialization methods in several convolutional architectures across 12 benchmark image classification datasets, analyzing their potential gains and highlighting limitations. We also introduce a new layerwise reinitialization algorithm that outperforms previous methods and suggest explanations for the observed improvement in generalization. First, we show that layerwise reinitialization increases the margin on the training examples without increasing the norm of the weights, hence leading to an improvement in margin-based generalization bounds for neural networks. Second, we demonstrate that it settles in flatter local minima of the loss surface. Third, it encourages learning general rules and discourages memorization by placing emphasis on the lower layers of the neural network. Our takeaway message is that the accuracy of convolutional neural networks can be improved for small datasets using bottom-up layerwise reinitialization, where the number of reinitialized layers may vary depending on the available compute budget.

    Comment: 12 figures, 7 tables
    Keywords Computer Science - Machine Learning ; 68T07 ; 68T45
    Subject code 006
    Publishing date 2021-09-01
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
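
    The bottom-up layerwise reinitialization described in the abstract can be sketched in a few lines of PyTorch: keep the weights of the lowest k parameterized layers and re-draw the rest from their initializers. This is only the core idea under standard layers; the paper's full algorithm and its details live in the text.

```python
import torch.nn as nn

def reinitialize_top_layers(model: nn.Module, keep: int):
    """Keep the first `keep` parameterized layers (bottom of the network)
    and re-draw all later ones from their default initializers."""
    layers = [m for m in model.modules() if isinstance(m, (nn.Conv2d, nn.Linear))]
    for layer in layers[keep:]:
        layer.reset_parameters()

model = nn.Sequential(
    nn.Conv2d(3, 32, 3), nn.ReLU(),
    nn.Conv2d(32, 64, 3), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 10),
)
# Periodically during training: keep the bottom 2 layers, reinitialize the rest.
reinitialize_top_layers(model, keep=2)
```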

  8. Book ; Online: A Generalized Lottery Ticket Hypothesis

    Alabdulmohsin, Ibrahim / Markeeva, Larisa / Keysers, Daniel / Tolstikhin, Ilya

    2021  

    Abstract We introduce a generalization to the lottery ticket hypothesis in which the notion of "sparsity" is relaxed by choosing an arbitrary basis in the space of parameters. We present evidence that the original results reported for the canonical basis continue to hold in this broader setting. We describe how structured pruning methods, including pruning units or factorizing fully-connected layers into products of low-rank matrices, can be cast as particular instances of this "generalized" lottery ticket hypothesis. The investigations reported here are preliminary and are provided to encourage further research along this direction.

    Comment: Workshop on Sparsity in Neural Networks: Advancing Understanding and Practice (SNN'21). Updates: New curve on Figure 2(left) and discussion on Li et al
    Keywords Computer Science - Machine Learning ; Computer Science - Computer Vision and Pattern Recognition ; 68T05 ; I.2.6 ; I.2.10
    Publishing date 2021-07-03
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
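
    The abstract relaxes sparsity to an arbitrary basis over parameter space. A minimal numpy sketch of pruning in such a basis: rotate the flattened parameters into the basis, zero the smallest coefficients, and rotate back. With the identity basis this reduces to ordinary magnitude pruning; the random orthonormal basis below is just one example.

```python
import numpy as np

def prune_in_basis(w, basis, sparsity=0.9):
    """Express w in an orthonormal basis, zero out the smallest-magnitude
    coefficients, and map back to parameter space."""
    coeffs = basis.T @ w                      # coordinates of w in the basis
    k = int(sparsity * len(coeffs))
    cut = np.argsort(np.abs(coeffs))[:k]      # smallest-magnitude coefficients
    coeffs[cut] = 0.0
    return basis @ coeffs                     # back to parameter space

rng = np.random.default_rng(0)
w = rng.normal(size=256)                      # flattened model parameters
q, _ = np.linalg.qr(rng.normal(size=(256, 256)))  # random orthonormal basis
w_pruned = prune_in_basis(w, q, sparsity=0.9)
```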

  9. Book ; Online: FlexiViT

    Beyer, Lucas / Izmailov, Pavel / Kolesnikov, Alexander / Caron, Mathilde / Kornblith, Simon / Zhai, Xiaohua / Minderer, Matthias / Tschannen, Michael / Alabdulmohsin, Ibrahim / Pavetic, Filip

    One Model for All Patch Sizes

    2022  

    Abstract Vision Transformers convert images to sequences by slicing them into patches. The size of these patches controls a speed/accuracy tradeoff, with smaller patches leading to higher accuracy at greater computational cost, but changing the patch size typically requires retraining the model. In this paper, we demonstrate that simply randomizing the patch size at training time leads to a single set of weights that performs well across a wide range of patch sizes, making it possible to tailor the model to different compute budgets at deployment time. We extensively evaluate the resulting model, which we call FlexiViT, on a wide range of tasks, including classification, image-text retrieval, open-world detection, panoptic segmentation, and semantic segmentation, concluding that it usually matches, and sometimes outperforms, standard ViT models trained at a single patch size in an otherwise identical setup. Hence, FlexiViT training is a simple drop-in improvement for ViT that makes it easy to add compute-adaptive capabilities to most models relying on a ViT backbone architecture. Code and pre-trained models are available at https://github.com/google-research/big_vision

    Comment: Code and pre-trained models available at https://github.com/google-research/big_vision. All authors made significant technical contributions. CVPR 2023
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Artificial Intelligence ; Computer Science - Machine Learning
    Subject code 006
    Publishing date 2022-12-15
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
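
    The abstract's core trick, randomizing the patch size during training, needs one flexible component: a patch embedding whose kernel can be resampled to any patch size. A minimal PyTorch sketch using plain bilinear resizing follows (the paper proposes a more careful pseudo-inverse-based resize; the shapes and sizes below are arbitrary).

```python
import torch
import torch.nn.functional as F

def embed_with_patch_size(images, patch_weight, patch_size):
    """Resample the patch-embedding kernel to the requested patch size, then
    extract and embed patches with a strided convolution (one token/patch)."""
    w = F.interpolate(patch_weight, size=(patch_size, patch_size),
                      mode="bilinear", align_corners=False)
    return F.conv2d(images, w, stride=patch_size)

base = torch.randn(768, 3, 32, 32)       # underlying 32x32 patch kernel
images = torch.randn(8, 3, 240, 240)
for patch_size in (48, 30, 16):          # randomized across training steps
    tokens = embed_with_patch_size(images, base, patch_size)
    print(patch_size, tokens.shape)      # token grid shrinks as patches grow
```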

  10. Book ; Online: Fair Wrapping for Black-box Predictions

    Soen, Alexander / Alabdulmohsin, Ibrahim / Koyejo, Sanmi / Mansour, Yishay / Moorosi, Nyalleng / Nock, Richard / Sun, Ke / Xie, Lexing

    2022  

    Abstract We introduce a new family of techniques to post-process ("wrap") a black-box classifier in order to reduce its bias. Our technique builds on the recent analysis of improper loss functions whose optimization can correct any twist in prediction, unfairness being treated as a twist. In the post-processing, we learn a wrapper function which we define as an $\alpha$-tree, which modifies the prediction. We provide two generic boosting algorithms to learn $\alpha$-trees. We show that our modification has appealing properties in terms of composition of $\alpha$-trees, generalization, interpretability, and KL divergence between modified and original predictions. We exemplify the use of our technique in three fairness notions: conditional value-at-risk, equality of opportunity, and statistical parity; and provide experiments on several readily available datasets.

    Comment: Published in Advances in Neural Information Processing Systems 35 (NeurIPS 2022)
    Keywords Statistics - Machine Learning ; Computer Science - Machine Learning
    Subject code 006
    Publishing date 2022-01-30
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
