LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 7 of 7

  1. Article ; Online: Augmenting interpretable models with large language models during training.

    Singh, Chandan / Askari, Armin / Caruana, Rich / Gao, Jianfeng

    Nature communications

    2023  Volume 14, Issue 1, Page(s) 7913

    Abstract Recent large language models (LLMs), such as ChatGPT, have demonstrated remarkable prediction performance for a growing array of tasks. However, their proliferation into high-stakes domains and compute-limited settings has created a burgeoning need for interpretability and efficiency. We address this need by proposing Aug-imodels, a framework for leveraging the knowledge learned by LLMs to build extremely efficient and interpretable prediction models. Aug-imodels use LLMs during fitting but not during inference, allowing complete transparency and often a speed/memory improvement of greater than 1000x for inference compared to LLMs. We explore two instantiations of Aug-imodels in natural-language processing: Aug-Linear, which augments a linear model with decoupled embeddings from an LLM, and Aug-Tree, which augments a decision tree with LLM feature expansions. Across a variety of text-classification datasets, both outperform their non-augmented, interpretable counterparts. Aug-Linear can even outperform much larger models, e.g. a 6-billion parameter GPT-J model, despite having 10,000x fewer parameters and being fully transparent. We further explore Aug-imodels in a natural-language fMRI study, where they generate interesting interpretations from scientific data.
    MeSH term(s) Learning ; Knowledge ; Language ; Linear Models ; Natural Language Processing
    Language English
    Publishing date 2023-11-30
    Publishing country England
    Document type Journal Article
    ZDB-ID 2553671-0
    ISSN (online) 2041-1723
    DOI 10.1038/s41467-023-43713-1
    Database MEDLINE (MEDical Literature Analysis and Retrieval System OnLINE)
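
    A minimal sketch of the decoupled-embedding idea from the abstract above: every ngram is embedded on its own by a frozen LLM during fitting, a linear classifier is trained on the summed embeddings, and the fitted model then collapses to a transparent ngram-to-weight table, so inference never calls the LLM. The embedding model name and the sentence-transformers dependency are illustrative assumptions, not the paper's exact setup.

```python
# Sketch of an Aug-Linear-style model (assumptions: scikit-learn and
# sentence-transformers installed; embedding model chosen for illustration).
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sentence_transformers import SentenceTransformer

def fit_aug_linear(texts, labels):
    vec = CountVectorizer(ngram_range=(1, 2))
    counts = vec.fit_transform(texts)                 # (n_docs, n_ngrams)
    ngrams = list(vec.get_feature_names_out())

    # Decoupled embeddings: each ngram is embedded independently, and a
    # document is the count-weighted sum of its ngram embeddings.
    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    E = np.asarray(embedder.encode(ngrams))           # (n_ngrams, d)
    X = counts @ E                                    # (n_docs, d)

    clf = LogisticRegression(max_iter=1000).fit(X, labels)

    # Linearity collapses each ngram's contribution to a single number:
    # the fitted model is just an ngram -> weight lookup table.
    weights = dict(zip(ngrams, E @ clf.coef_.ravel()))
    return vec, weights, float(clf.intercept_[0])

def predict_aug_linear(text, vec, weights, intercept):
    # Inference is LLM-free: sum the cached weights of ngrams in the text.
    ngrams = vec.build_analyzer()(text)
    return intercept + sum(weights.get(g, 0.0) for g in ngrams) > 0
```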

  2. Book ; Online: Augmenting Interpretable Models with LLMs during Training

    Singh, Chandan / Askari, Armin / Caruana, Rich / Gao, Jianfeng

    2022  

    Abstract Recent large language models (LLMs) have demonstrated remarkable prediction performance for a growing array of tasks. However, their proliferation into high-stakes domains (e.g. medicine) and compute-limited settings has created a burgeoning need for interpretability and efficiency. We address this need by proposing Augmented Interpretable Models (Aug-imodels), a framework for leveraging the knowledge learned by LLMs to build extremely efficient and interpretable models. Aug-imodels use LLMs during fitting but not during inference, allowing complete transparency and often a speed/memory improvement of greater than 1,000x for inference compared to LLMs. We explore two instantiations of Aug-imodels in natural-language processing: (i) Aug-GAM, which augments a generalized additive model with decoupled embeddings from an LLM and (ii) Aug-Tree, which augments a decision tree with LLM feature expansions. Across a variety of text-classification datasets, both outperform their non-augmented counterparts. Aug-GAM can even outperform much larger models (e.g. a 6-billion parameter GPT-J model), despite having 10,000x fewer parameters and being fully transparent. We further explore Aug-imodels in a natural-language fMRI study, where they generate interesting interpretations from scientific data. All code for using Aug-imodels and reproducing results is made available on Github.
    Keywords Computer Science - Artificial Intelligence ; Computer Science - Computation and Language ; Computer Science - Machine Learning ; Statistics - Methodology
    Subject code 006
    Publishing date 2022-09-23
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
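
    A hedged sketch of the Aug-Tree instantiation this preprint describes: a tree split on a single keyword is broadened with LLM-proposed paraphrases, so one split matches a concept rather than a single surface string. The expand_keyword helper is a hypothetical stand-in for an LLM call (e.g. a synonym prompt), and the split criterion below is plain accuracy rather than the paper's exact procedure.

```python
# Sketch of an LLM feature expansion for a decision-tree split. The
# expand_keyword table is a stand-in for an LLM synonym prompt (assumption).
def expand_keyword(word):
    table = {"great": ["excellent", "fantastic"],
             "awful": ["terrible", "dreadful"]}
    return [word] + table.get(word, [])

def contains_any(text, keywords):
    return any(k in text for k in keywords)

def best_expanded_split(texts, labels):
    """Greedy stump: pick the keyword whose expanded set best splits the
    labels (sketch criterion: accuracy of the induced binary split)."""
    vocab = {w for t in texts for w in t.lower().split()}
    best_ks, best_acc = None, -1.0
    for w in vocab:
        ks = expand_keyword(w)
        pred = [int(contains_any(t.lower(), ks)) for t in texts]
        acc = sum(p == y for p, y in zip(pred, labels)) / len(labels)
        acc = max(acc, 1.0 - acc)        # either branch may carry the label
        if acc > best_acc:
            best_ks, best_acc = ks, acc
    return best_ks

texts = ["great movie", "fantastic acting", "awful plot", "dreadful pacing"]
print(best_expanded_split(texts, labels=[1, 1, 0, 0]))
# -> e.g. ['great', 'excellent', 'fantastic']: one split covers synonyms too
```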

  3. Book ; Online: Naive Feature Selection

    Askari, Armin / d'Aspremont, Alexandre / Ghaoui, Laurent El

    Sparsity in Naive Bayes

    2019  

    Abstract Due to its linear complexity, naive Bayes classification remains an attractive supervised learning method, especially in very large-scale settings. We propose a sparse version of naive Bayes, which can be used for feature selection. This leads to a combinatorial maximum-likelihood problem, for which we provide an exact solution in the case of binary data, or a bound in the multinomial case. We prove that our bound becomes tight as the marginal contribution of additional features decreases. Both binary and multinomial sparse models are solvable in time almost linear in problem size, representing a very small extra relative cost compared to the classical naive Bayes. Numerical experiments on text data show that the naive Bayes feature selection method is as statistically effective as state-of-the-art feature selection methods such as recursive feature elimination, $l_1$-penalized logistic regression, and LASSO, while being orders of magnitude faster. For a large data set with more than $1.6$ million training points and about $12$ million features, and with a non-optimized CPU implementation, our sparse naive Bayes model can be trained in less than 15 seconds.
    Keywords Computer Science - Machine Learning ; Statistics - Machine Learning
    Subject code 005 ; 519
    Publishing date 2019-05-23
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
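
    A hedged sketch of the selection principle on binary data: each feature can be scored independently by how much class-specific Bernoulli parameters improve the likelihood over a single pooled parameter, and selection is then a top-$k$ sort, hence the near-linear runtime. This per-feature likelihood-ratio score is a simplification in the spirit of the paper, not its exact closed-form solution.

```python
# Sketch of naive-Bayes feature selection on binary data: score each feature
# by the likelihood gain of class-specific over pooled Bernoulli parameters.
# A simplified criterion, not the paper's exact formulation.
import numpy as np

def bernoulli_ll(s, n):
    """Maximized Bernoulli log-likelihood of s successes out of n trials."""
    p = np.clip(s / n, 1e-12, 1 - 1e-12)
    return s * np.log(p) + (n - s) * np.log(1 - p)

def nb_feature_scores(X, y):
    """X: (n, d) binary array; y: (n,) binary labels."""
    pos, neg = X[y == 1], X[y == 0]
    split = bernoulli_ll(pos.sum(0), len(pos)) + bernoulli_ll(neg.sum(0), len(neg))
    pooled = bernoulli_ll(X.sum(0), len(X))
    return split - pooled            # nonnegative per-feature gain

def select_features(X, y, k):
    # One counting pass plus a sort: almost linear in problem size.
    return np.argsort(nb_feature_scores(X, y))[::-1][:k]
```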

  4. Book ; Online: FANOK

    Askari, Armin / Rebjock, Quentin / d'Aspremont, Alexandre / Ghaoui, Laurent El

    Knockoffs in Linear Time

    2020  

    Abstract We describe a series of algorithms that efficiently implement Gaussian model-X knockoffs to control the false discovery rate on large-scale feature selection problems. Identifying the knockoff distribution requires solving a large-scale semidefinite program, for which we derive several efficient methods. One handles generic covariance matrices and has complexity scaling as $O(p^3)$, where $p$ is the ambient dimension; another assumes a rank-$k$ factor model on the covariance matrix to reduce this complexity bound to $O(pk^2)$. We also derive efficient procedures to both estimate factor models and sample knockoff covariates with complexity linear in the dimension. We test our methods on problems with $p$ as large as $500,000$.

    Comment: For code see https://github.com/qrebjock/fanok
    Keywords Computer Science - Machine Learning ; Statistics - Methodology ; Statistics - Machine Learning
    Publishing date 2020-06-15
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
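
    For context, a hedged sketch of the step the paper accelerates: once a feasible vector $s$ for the knockoff semidefinite constraint is available, Gaussian model-X knockoffs are drawn from the standard conditional distribution. The equicorrelated choice of $s$ below is a common simple baseline, not FANOK's optimized SDP solution.

```python
# Sketch of sampling Gaussian model-X knockoffs for X ~ N(0, Sigma), given a
# feasible s (diag(s) <= 2*Sigma). Formulas are the standard construction;
# the equicorrelated s is a baseline, not the paper's large-scale solver.
import numpy as np

def equicorrelated_s(Sigma):
    # s_j = min(1, 2*lambda_min(Sigma)) for a correlation matrix: a simple
    # feasible point of the knockoff SDP.
    lam_min = np.linalg.eigvalsh(Sigma)[0]
    return np.full(Sigma.shape[0], min(1.0, 2.0 * lam_min))

def sample_knockoffs(X, Sigma, s, rng=None):
    rng = np.random.default_rng(rng)
    S = np.diag(s)
    Sigma_inv_S = np.linalg.solve(Sigma, S)      # Sigma^{-1} S
    mean = X - X @ Sigma_inv_S                   # conditional mean
    cov = 2.0 * S - S @ Sigma_inv_S              # conditional covariance
    L = np.linalg.cholesky(cov + 1e-10 * np.eye(len(s)))  # jitter for PSD
    return mean + rng.standard_normal(X.shape) @ L.T
```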

  5. Book ; Online: Fenchel Lifted Networks

    Gu, Fangda / Askari, Armin / Ghaoui, Laurent El

    A Lagrange Relaxation of Neural Network Training

    2018  

    Abstract Despite the recent successes of deep neural networks, the corresponding training problem remains highly non-convex and difficult to optimize. Classes of models have been proposed that introduce greater structure to the objective function at the cost of lifting the dimension of the problem. However, these lifted methods sometimes perform poorly compared to traditional neural networks. In this paper, we introduce a new class of lifted models, Fenchel lifted networks, that enjoy the same benefits as previous lifted models, without suffering a degradation in performance over classical networks. Our model represents activation functions as equivalent biconvex constraints and uses Lagrange multipliers to arrive at a rigorous lower bound of the traditional neural network training problem. This model is efficiently trained using block-coordinate descent and is parallelizable across data points and/or layers. We compare our model against standard fully connected and convolutional networks and show that we are able to match or beat their performance.
    Keywords Computer Science - Machine Learning ; Statistics - Machine Learning
    Subject code 006
    Publishing date 2018-11-19
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
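
    A hedged sketch of the lifted-training pattern the abstract describes: layer activations become explicit optimization variables, the activation constraint is relaxed (here with a crude quadratic penalty and ReLU projection rather than the paper's Fenchel/Lagrangian construction), and training alternates least-squares block updates.

```python
# Sketch of lifted training for a one-hidden-layer ReLU network. The penalty
# surrogate below illustrates the general lifted scheme only; it is not the
# paper's exact Fenchel / Lagrangian formulation.
import numpy as np

def lifted_train(X, Y, hidden=32, rho=1.0, steps=50, seed=0):
    """X: (n, d) inputs; Y: (n, out) targets."""
    rng = np.random.default_rng(seed)
    W1 = rng.standard_normal((hidden, X.shape[1])) * 0.1
    W2 = rng.standard_normal((Y.shape[1], hidden)) * 0.1
    Z = np.maximum(W1 @ X.T, 0)                 # lifted activations, (h, n)
    for _ in range(steps):
        # Block 1: W2 given Z -- plain least squares.
        W2 = Y.T @ np.linalg.pinv(Z)
        # Block 2: Z given W1, W2 -- least squares on the stacked objective
        # ||W2 Z - Y^T||^2 + rho ||Z - relu(W1 X^T)||^2, then a ReLU
        # projection as a crude surrogate for the activation constraint.
        A = np.vstack([W2, np.sqrt(rho) * np.eye(hidden)])
        B = np.vstack([Y.T, np.sqrt(rho) * np.maximum(W1 @ X.T, 0)])
        Z = np.maximum(np.linalg.lstsq(A, B, rcond=None)[0], 0)
        # Block 3: W1 given Z -- regress pre-activations onto Z (crude:
        # ignores the ReLU when forming the target).
        W1 = Z @ np.linalg.pinv(X.T)
    return W1, W2
```

    Each block update is a convex least-squares problem, which is what makes the scheme amenable to block-coordinate descent and to parallelization across data points or layers, as the abstract notes.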

  6. Book ; Online: Greedy Frank-Wolfe Algorithm for Exemplar Selection

    Cheng, Gary / Askari, Armin / Ramchandran, Kannan / Ghaoui, Laurent El

    2018  

    Abstract In this paper, we consider the problem of selecting representatives from a data set for arbitrary supervised/unsupervised learning tasks. We identify a subset $S$ of a data set $A$ such that 1) the size of $S$ is much smaller than that of $A$ and 2) $S$ efficiently describes the entire data set, in a way formalized via convex optimization. In order to generate $|S| = k$ exemplars, our kernelizable algorithm, Frank-Wolfe Sparse Representation (FWSR), only needs to execute $\approx k$ iterations with a per-iteration cost that is quadratic in the size of $A$. This is in contrast to other state-of-the-art methods which need to execute until convergence, with each iteration costing an extra factor of $d$ (the dimension of the data). Moreover, we also provide a proof of linear convergence for our method. We support our results with empirical experiments; we test our algorithm against current methods in three different experimental setups on four different data sets. FWSR outperforms other exemplar finding methods both in speed and accuracy in almost all scenarios.
    Keywords Computer Science - Machine Learning ; Statistics - Machine Learning
    Subject code 006
    Publishing date 2018-11-06
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
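
    A hedged sketch of the greedy Frank-Wolfe pattern for exemplar selection: each iteration solves the linear-minimization oracle over single data points, so exactly one exemplar enters per step and $\approx k$ iterations yield $k$ exemplars. The kernel-herding objective below (match the dataset's mean kernel embedding) is a simplified stand-in for FWSR's sparse self-representation objective.

```python
# Sketch of greedy Frank-Wolfe exemplar selection over the simplex: minimize
# ||sum_i w_i phi(x_i) - mean embedding||^2 in an RBF kernel space. A
# simplified objective, not the paper's FWSR formulation.
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def greedy_fw_exemplars(A, k, gamma=1.0):
    K = rbf_kernel(A, A, gamma)          # (n, n) kernel matrix
    mean_embed = K.mean(axis=1)          # <phi(x_i), dataset mean embedding>
    selected, w = [], np.zeros(len(A))
    for t in range(k):
        # FW linear-minimization oracle over simplex vertices: the single
        # point most aligned with the negative gradient K w - mean_embed.
        j = int(np.argmin(K @ w - mean_embed))
        selected.append(j)               # a repeat re-weights an exemplar
        w *= t / (t + 1)                 # standard FW step size 1/(t+1)
        w[j] += 1 / (t + 1)
    return selected
```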

  7. Book ; Online: Implicit Deep Learning

    Ghaoui, Laurent El / Gu, Fangda / Travacca, Bertrand / Askari, Armin / Tsai, Alicia Y.

    2019  

    Abstract Implicit deep learning prediction rules generalize the recursive rules of feedforward neural networks. Such rules are based on the solution of a fixed-point equation involving a single vector of hidden features, which is thus only implicitly defined. The implicit framework greatly simplifies the notation of deep learning, and opens up many new possibilities, in terms of novel architectures and algorithms, robustness analysis and design, interpretability, sparsity, and network architecture optimization.
    Keywords Computer Science - Machine Learning ; Mathematics - Optimization and Control ; Statistics - Machine Learning
    Publishing date 2019-08-17
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
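
    A hedged sketch of the implicit prediction rule the abstract describes: the hidden features $x$ solve a fixed-point equation $x = \phi(Ax + Bu)$, and the output is $y = Cx + Du$. Picard iteration converges under a well-posedness condition, e.g. a 1-Lipschitz componentwise $\phi$ with $\|A\|_\infty < 1$.

```python
# Sketch of an implicit prediction rule: solve x = phi(A x + B u) by
# fixed-point iteration, then read out y = C x + D u. Convergence assumes a
# well-posed A (e.g. infinity norm below 1 for the 1-Lipschitz ReLU).
import numpy as np

def implicit_predict(A, B, C, D, u, tol=1e-8, max_iter=500):
    phi = lambda z: np.maximum(z, 0)     # ReLU: componentwise, 1-Lipschitz
    x = np.zeros(A.shape[0])
    for _ in range(max_iter):
        x_new = phi(A @ x + B @ u)
        if np.max(np.abs(x_new - x)) < tol:
            break
        x = x_new
    return C @ x + D @ u

# A feedforward network is recovered as the special case where A is strictly
# block upper-triangular, so the iteration settles in one pass per layer.
```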
