LIVIVO - The Search Portal for Life Sciences

Search results

Results 1-10 of 14

  1. Book ; Online: Transformers Meet Directed Graphs

    Geisler, Simon / Li, Yujia / Mankowitz, Daniel / Cemgil, Ali Taylan / Günnemann, Stephan / Paduraru, Cosmin

    2023  

    Abstract Transformers were originally proposed as a sequence-to-sequence model for text but have become vital for a wide range of modalities, including images, audio, video, and undirected graphs. However, transformers for directed graphs are a surprisingly underexplored topic, despite their applicability to ubiquitous domains, including source code and logic circuits. In this work, we propose two direction- and structure-aware positional encodings for directed graphs: (1) the eigenvectors of the Magnetic Laplacian - a direction-aware generalization of the combinatorial Laplacian; (2) directional random walk encodings. Empirically, we show that the extra directionality information is useful in various downstream tasks, including correctness testing of sorting networks and source code understanding. Together with a data-flow-centric graph construction, our model outperforms the prior state of the art on the Open Graph Benchmark Code2 relatively by 14.7%.

    Comment: 29 pages
    Keywords Computer Science - Machine Learning
    Subject code 006
    Publishing date 2023-01-31
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
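The Magnetic Laplacian named in the abstract has a compact definition that is easy to sketch. Below is a minimal NumPy illustration (our own toy construction, not the paper's code; the function name and the choice q = 0.25 are ours): symmetrize the adjacency, then encode edge direction as a complex phase, giving a Hermitian matrix whose eigenvectors can serve as direction-aware positional encodings.

```python
import numpy as np

def magnetic_laplacian(A, q=0.25):
    """Magnetic Laplacian of a directed graph with adjacency matrix A.

    The matrix is Hermitian by construction, so its eigenvalues are real
    and its eigenvectors can be used as positional encodings that are
    aware of edge direction.
    """
    A = np.asarray(A, dtype=float)
    A_sym = np.minimum(A + A.T, 1.0)          # undirected support
    deg = np.diag(A_sym.sum(axis=1))          # symmetrized degree matrix
    theta = 2.0 * np.pi * q * (A - A.T)       # antisymmetric phase term
    return deg - A_sym * np.exp(1j * theta)   # Hermitian matrix

# Toy example: a directed 3-cycle 0 -> 1 -> 2 -> 0.
A = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]])
L = magnetic_laplacian(A, q=0.25)
eigvals, eigvecs = np.linalg.eigh(L)  # real spectrum; columns of eigvecs are the encodings
```

Setting q = 0 recovers the ordinary combinatorial Laplacian of the symmetrized graph, which is one way to see the construction as a direction-aware generalization.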

  2. Book ; Online: COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation

    Lee, Jongmin / Paduraru, Cosmin / Mankowitz, Daniel J. / Heess, Nicolas / Precup, Doina / Kim, Kee-Eung / Guez, Arthur

    2022  

    Abstract We consider the offline constrained reinforcement learning (RL) problem, in which the agent aims to compute a policy that maximizes expected return while satisfying given cost constraints, learning only from a pre-collected dataset. This problem setting is appealing in many real-world scenarios, where direct interaction with the environment is costly or risky, and where the resulting policy should comply with safety constraints. However, it is challenging to compute a policy that guarantees satisfying the cost constraints in the offline RL setting, since the off-policy evaluation inherently has an estimation error. In this paper, we present an offline constrained RL algorithm that optimizes the policy in the space of the stationary distribution. Our algorithm, COptiDICE, directly estimates the stationary distribution corrections of the optimal policy with respect to returns, while constraining the cost upper bound, with the goal of yielding a cost-conservative policy for actual constraint satisfaction. Experimental results show that COptiDICE attains better policies in terms of constraint satisfaction and return-maximization, outperforming baseline algorithms.

    Comment: 24 pages, 6 figures, Accepted at ICLR 2022 (spotlight)
    Keywords Computer Science - Machine Learning ; Computer Science - Artificial Intelligence
    Subject code 006
    Publishing date 2022-04-19
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
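The stationary-distribution view described in the abstract can be summarized, in generic DICE-style notation (our paraphrase, not the paper's exact objective), as a constrained linear program over the discounted occupancy measure d(s, a):

```latex
\begin{aligned}
\max_{d \ge 0} \quad & \textstyle\sum_{s,a} d(s,a)\, R(s,a) \\
\text{s.t.} \quad    & \textstyle\sum_{s,a} d(s,a)\, C(s,a) \le \hat{c}, \\
                     & \textstyle\sum_{a} d(s,a)
                       = (1-\gamma)\, p_0(s)
                       + \gamma \sum_{s',a'} P(s \mid s',a')\, d(s',a')
                       \quad \forall s .
\end{aligned}
```

COptiDICE then estimates correction ratios w(s,a) = d(s,a) / d_D(s,a) relative to the dataset distribution d_D, which is what allows the optimization to be carried out from logged data alone.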

  3. Book ; Online: Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization

    Zhang, Michael R. / Paine, Tom Le / Nachum, Ofir / Paduraru, Cosmin / Tucker, George / Wang, Ziyu / Norouzi, Mohammad

    2021  

    Abstract Standard dynamics models for continuous control make use of feedforward computation to predict the conditional distribution of next state and reward given current state and action using a multivariate Gaussian with a diagonal covariance structure. This modeling choice assumes that different dimensions of the next state and reward are conditionally independent given the current state and action and may be driven by the fact that fully observable physics-based simulation environments entail deterministic transition dynamics. In this paper, we challenge this conditional independence assumption and propose a family of expressive autoregressive dynamics models that generate different dimensions of the next state and reward sequentially conditioned on previous dimensions. We demonstrate that autoregressive dynamics models indeed outperform standard feedforward models in log-likelihood on heldout transitions. Furthermore, we compare different model-based and model-free off-policy evaluation (OPE) methods on RL Unplugged, a suite of offline MuJoCo datasets, and find that autoregressive dynamics models consistently outperform all baselines, achieving a new state-of-the-art. Finally, we show that autoregressive dynamics models are useful for offline policy optimization by serving as a way to enrich the replay buffer through data augmentation and improving performance using model-based planning.

    Comment: ICLR 2021. 17 pages
    Keywords Computer Science - Machine Learning ; Computer Science - Artificial Intelligence ; Statistics - Machine Learning
    Subject code 006
    Publishing date 2021-04-28
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
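The autoregressive factorization described in the abstract (each dimension of the next state is generated conditioned on the previously generated ones) can be sketched as follows. This toy uses linear-Gaussian conditionals as a stand-in for the learned networks; all names, shapes, and weights are ours:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_next_state(state, action, weights, sigma=0.1):
    """Sample s' one dimension at a time: p(s') = prod_i p(s'_i | s, a, s'_{<i}).

    Each conditional is a Gaussian whose mean is a linear function of the
    state, the action, and the dimensions sampled so far.
    """
    context = np.concatenate([state, action])
    sampled = []
    for w in weights:                        # one weight vector per output dimension
        inp = np.concatenate([context, sampled])
        mean = float(inp @ w)
        sampled.append(mean + sigma * rng.normal())
    return np.array(sampled)

state, action = np.zeros(3), np.zeros(1)
dim_in = state.size + action.size
# weights[i] conditions on (s, a) plus the i previously generated dimensions
weights = [rng.normal(size=dim_in + i) for i in range(3)]
next_state = sample_next_state(state, action, weights)
```

A standard feedforward model with a diagonal Gaussian would instead emit all three means at once, which is exactly the conditional-independence assumption the paper challenges.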

  4. Book ; Online: Active Offline Policy Selection

    Konyushkova, Ksenia / Chen, Yutian / Paine, Tom Le / Gulcehre, Caglar / Paduraru, Cosmin / Mankowitz, Daniel J / Denil, Misha / de Freitas, Nando

    2021  

    Abstract This paper addresses the problem of policy selection in domains with abundant logged data, but with a restricted interaction budget. Solving this problem would enable safe evaluation and deployment of offline reinforcement learning policies in industry, robotics, and recommendation domains among others. Several off-policy evaluation (OPE) techniques have been proposed to assess the value of policies using only logged data. However, there is still a big gap between the evaluation by OPE and the full online evaluation. Yet, large amounts of online interactions are often not possible in practice. To overcome this problem, we introduce active offline policy selection - a novel sequential decision approach that combines logged data with online interaction to identify the best policy. We use OPE estimates to warm start the online evaluation. Then, in order to utilize the limited environment interactions wisely we decide which policy to evaluate next based on a Bayesian optimization method with a kernel that represents policy similarity. We use multiple benchmarks, including real-world robotics, with a large number of candidate policies to show that the proposed approach improves upon state-of-the-art OPE estimates and pure online policy evaluation.

    Comment: Presented at NeurIPS 2021
    Keywords Computer Science - Machine Learning ; Computer Science - Artificial Intelligence ; Statistics - Machine Learning
    Subject code 006 ; 004
    Publishing date 2021-06-18
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
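The selection loop described in the abstract (warm-start from OPE estimates, then spend a limited interaction budget adaptively) can be illustrated with a much-simplified UCB-style stand-in for the paper's Bayesian optimization with a policy-similarity kernel. Everything below is our toy, not the authors' method:

```python
import numpy as np

rng = np.random.default_rng(1)

def select_policy(ope_means, counts, online_means, t, c=1.0):
    """Pick the next policy to roll out.

    Policies with no online data fall back to their OPE estimate
    (the warm start); an exploration bonus shrinks with visit count.
    """
    est = np.where(counts > 0, online_means, ope_means)
    bonus = c * np.sqrt(np.log(t + 1) / (counts + 1))
    return int(np.argmax(est + bonus))

true_values = np.array([0.2, 0.8, 0.5])            # unknown to the selector
ope = true_values + rng.normal(scale=0.3, size=3)  # noisy OPE warm start
counts, sums = np.zeros(3), np.zeros(3)
for t in range(200):                               # the interaction budget
    i = select_policy(ope, counts, np.divide(sums, np.maximum(counts, 1)), t)
    counts[i] += 1
    sums[i] += true_values[i] + rng.normal(scale=0.1)  # one online episode
best = int(np.argmax(np.divide(sums, np.maximum(counts, 1))))
```

The budget is concentrated on promising policies rather than spread uniformly, which is the core advantage over either pure OPE ranking or pure online evaluation.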

  5. Book ; Online: Towards practical reinforcement learning for tokamak magnetic control

    Tracey, Brendan D. / Michi, Andrea / Chervonyi, Yuri / Davies, Ian / Paduraru, Cosmin / Lazic, Nevena / Felici, Federico / Ewalds, Timo / Donner, Craig / Galperti, Cristian / Buchli, Jonas / Neunert, Michael / Huber, Andrea / Evens, Jonathan / Kurylowicz, Paula / Mankowitz, Daniel J. / Riedmiller, Martin / Team, The TCV

    2023  

    Abstract Reinforcement learning (RL) has shown promising results for real-time control systems, including the domain of plasma magnetic control. However, there are still significant drawbacks compared to traditional feedback control approaches for magnetic confinement. In this work, we address key drawbacks of the RL method; achieving higher control accuracy for desired plasma properties, reducing the steady-state error, and decreasing the required time to learn new tasks. We build on top of Degrave et al. (2022), and present algorithmic improvements to the agent architecture and training procedure. We present simulation results that show up to 65% improvement in shape accuracy, achieve substantial reduction in the long-term bias of the plasma current, and additionally reduce the training time required to learn new tasks by a factor of 3 or more. We present new experiments using the upgraded RL-based controllers on the TCV tokamak, which validate the simulation results achieved, and point the way towards routinely achieving accurate discharges using the RL approach.
    Keywords Physics - Plasma Physics ; Computer Science - Machine Learning
    Publishing date 2023-07-21
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  6. Book ; Online: Optimizing Memory Mapping Using Deep Reinforcement Learning

    Wang, Pengming / Sazanovich, Mikita / Ilbeyi, Berkin / Phothilimthana, Phitchaya Mangpo / Purohit, Manish / Tay, Han Yang / Vũ, Ngân / Wang, Miaosen / Paduraru, Cosmin / Leurent, Edouard / Zhernov, Anton / Huang, Po-Sen / Schrittwieser, Julian / Hubert, Thomas / Tung, Robert / Kurylowicz, Paula / Milan, Kieran / Vinyals, Oriol / Mankowitz, Daniel J.

    2023  

    Abstract Resource scheduling and allocation is a critical component of many high impact systems ranging from congestion control to cloud computing. Finding more optimal solutions to these problems often has significant impact on resource and time savings, reducing device wear-and-tear, and even potentially improving carbon emissions. In this paper, we focus on a specific instance of a scheduling problem, namely the memory mapping problem that occurs during compilation of machine learning programs: That is, mapping tensors to different memory layers to optimize execution time. We introduce an approach for solving the memory mapping problem using Reinforcement Learning. RL is a solution paradigm well-suited for sequential decision making problems that are amenable to planning, and combinatorial search spaces with high-dimensional data inputs. We formulate the problem as a single-player game, which we call the mallocGame, such that high-reward trajectories of the game correspond to efficient memory mappings on the target hardware. We also introduce a Reinforcement Learning agent, mallocMuZero, and show that it is capable of playing this game to discover new and improved memory mapping solutions that lead to faster execution times on real ML workloads on ML accelerators. We compare the performance of mallocMuZero to the default solver used by the Accelerated Linear Algebra (XLA) compiler on a benchmark of realistic ML workloads. In addition, we show that mallocMuZero is capable of improving the execution time of the recently published AlphaTensor matrix multiplication model.
    Keywords Computer Science - Performance ; Computer Science - Artificial Intelligence ; Computer Science - Machine Learning
    Subject code 006
    Publishing date 2023-05-11
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
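To make the "mallocGame" framing concrete: a complete trajectory of the game assigns each tensor to a memory layer, and a high-reward trajectory is one with low total access cost subject to a fast-memory capacity limit. The toy below scores such assignments and finds the best one by brute force rather than RL; all sizes, costs, and names are invented for illustration:

```python
from itertools import product

def mapping_cost(sizes, uses, placement, fast_capacity, slow_penalty=10):
    """Cost of one complete game trajectory: a memory layer per tensor.

    placement[i] is True if tensor i goes to fast memory.  Exceeding the
    fast-memory capacity makes the mapping invalid (infinite cost).
    """
    fast_used = sum(s for s, p in zip(sizes, placement) if p)
    if fast_used > fast_capacity:
        return float("inf")
    return sum(u * (1 if p else slow_penalty) for u, p in zip(uses, placement))

sizes = [4, 3, 2, 1]   # tensor sizes
uses  = [9, 1, 6, 4]   # access counts: high-use tensors want fast memory
best = min(product([True, False], repeat=4),
           key=lambda p: mapping_cost(sizes, uses, p, fast_capacity=6))
```

Brute force is viable only for a handful of tensors; the exponential search space over real compiler workloads is what motivates a learned, planning-based agent like mallocMuZero.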

  7. Book ; Online: Semi-analytical Industrial Cooling System Model for Reinforcement Learning

    Chervonyi, Yuri / Dutta, Praneet / Trochim, Piotr / Voicu, Octavian / Paduraru, Cosmin / Qian, Crystal / Karagozler, Emre / Davis, Jared Quincy / Chippendale, Richard / Bajaj, Gautam / Witherspoon, Sims / Luo, Jerry

    2022  

    Abstract We present a hybrid industrial cooling system model that embeds analytical solutions within a multi-physics simulation. This model is designed for reinforcement learning (RL) applications and balances simplicity with simulation fidelity and interpretability. The model's fidelity is evaluated against real world data from a large scale cooling system. This is followed by a case study illustrating how the model can be used for RL research. For this, we develop an industrial task suite that allows specifying different problem settings and levels of complexity, and use it to evaluate the performance of different RL algorithms.

    Comment: 27 pages, 13 figures
    Keywords Computer Science - Artificial Intelligence ; Computer Science - Machine Learning ; Computer Science - Robotics
    Publishing date 2022-07-26
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  8. Book ; Online: An empirical investigation of the challenges of real-world reinforcement learning

    Dulac-Arnold, Gabriel / Levine, Nir / Mankowitz, Daniel J. / Li, Jerry / Paduraru, Cosmin / Gowal, Sven / Hester, Todd

    2020  

    Abstract Reinforcement learning (RL) has proven its worth in a series of artificial domains, and is beginning to show some successes in real-world scenarios. However, much of the research advances in RL are hard to leverage in real-world systems due to a series of assumptions that are rarely satisfied in practice. In this work, we identify and formalize a series of independent challenges that embody the difficulties that must be addressed for RL to be commonly deployed in real-world systems. For each challenge, we define it formally in the context of a Markov Decision Process, analyze the effects of the challenge on state-of-the-art learning algorithms, and present some existing attempts at tackling it. We believe that an approach that addresses our set of proposed challenges would be readily deployable in a large number of real world problems. Our proposed challenges are implemented in a suite of continuous control environments called the realworldrl-suite, which we propose as an open-source benchmark.

    Comment: arXiv admin note: text overlap with arXiv:1904.12901
    Keywords Computer Science - Machine Learning ; Computer Science - Artificial Intelligence
    Subject code 006
    Publishing date 2020-03-24
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  9. Book ; Online: Robust Constrained Reinforcement Learning for Continuous Control with Model Misspecification

    Mankowitz, Daniel J. / Calian, Dan A. / Jeong, Rae / Paduraru, Cosmin / Heess, Nicolas / Dathathri, Sumanth / Riedmiller, Martin / Mann, Timothy

    2020  

    Abstract Many real-world physical control systems are required to satisfy constraints upon deployment. Furthermore, real-world systems are often subject to effects such as non-stationarity, wear-and-tear, uncalibrated sensors and so on. Such effects effectively perturb the system dynamics and can cause a policy trained successfully in one domain to perform poorly when deployed to a perturbed version of the same domain. This can affect a policy's ability to maximize future rewards as well as the extent to which it satisfies constraints. We refer to this as constrained model misspecification. We present an algorithm that mitigates this form of misspecification, and showcase its performance in multiple simulated Mujoco tasks from the Real World Reinforcement Learning (RWRL) suite.
    Keywords Computer Science - Machine Learning ; Computer Science - Artificial Intelligence ; Statistics - Machine Learning
    Publishing date 2020-10-20
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
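In outline (our hedged paraphrase, not the paper's exact statement), robustness to constrained model misspecification amounts to optimizing a worst case over an uncertainty set P of perturbed transition models, applied to both the return and the cost:

```latex
\max_{\pi} \; \min_{p \in \mathcal{P}} \;
  \mathbb{E}_{p,\pi}\!\Big[\textstyle\sum_{t} \gamma^{t} r_t\Big]
\qquad \text{s.t.} \qquad
\max_{p \in \mathcal{P}} \;
  \mathbb{E}_{p,\pi}\!\Big[\textstyle\sum_{t} \gamma^{t} c_t\Big] \le \beta .
```

A policy trained this way hedges against the non-stationarity and wear-and-tear effects the abstract lists, instead of being optimal only under the nominal dynamics.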

  10. Article ; Online: Faster sorting algorithms discovered using deep reinforcement learning.

    Mankowitz, Daniel J / Michi, Andrea / Zhernov, Anton / Gelmi, Marco / Selvi, Marco / Paduraru, Cosmin / Leurent, Edouard / Iqbal, Shariq / Lespiau, Jean-Baptiste / Ahern, Alex / Köppe, Thomas / Millikin, Kevin / Gaffney, Stephen / Elster, Sophie / Broshear, Jackson / Gamble, Chris / Milan, Kieran / Tung, Robert / Hwang, Minjae / Cemgil, Taylan / Barekatain, Mohammadamin / Li, Yujia / Mandhane, Amol / Hubert, Thomas / Schrittwieser, Julian / Hassabis, Demis / Kohli, Pushmeet / Riedmiller, Martin / Vinyals, Oriol / Silver, David

    Nature

    2023  Volume 618, Issue 7964, Page(s) 257–263

    Abstract Fundamental algorithms such as sorting or hashing are used trillions of times on any given day ...
    Language English
    Publishing date 2023-06-07
    Publishing country England
    Document type Journal Article
    ZDB-ID 120714-3
    ISSN (online) 1476-4687
    ISSN (print) 0028-0836
    DOI 10.1038/s41586-023-06004-9
    Database MEDical Literature Analysis and Retrieval System OnLINE
