LIVIVO - The Search Portal for Life Sciences

Search results

Results 1-10 of 14

  1. Book ; Online: Transformers Meet Directed Graphs

    Geisler, Simon / Li, Yujia / Mankowitz, Daniel / Cemgil, Ali Taylan / Günnemann, Stephan / Paduraru, Cosmin

    2023  

    Abstract Transformers were originally proposed as a sequence-to-sequence model for text but have become vital for a wide range of modalities, including images, audio, video, and undirected graphs. However, transformers for directed graphs are a surprisingly underexplored topic, despite their applicability to ubiquitous domains, including source code and logic circuits. In this work, we propose two direction- and structure-aware positional encodings for directed graphs: (1) the eigenvectors of the Magnetic Laplacian - a direction-aware generalization of the combinatorial Laplacian; (2) directional random walk encodings. Empirically, we show that the extra directionality information is useful in various downstream tasks, including correctness testing of sorting networks and source code understanding. Together with a data-flow-centric graph construction, our model outperforms the prior state of the art on the Open Graph Benchmark Code2 relatively by 14.7%.

    Comment: 29 pages
    Keywords Computer Science - Machine Learning
    Subject code 006
    Publishing date 2023-01-31
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
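The Magnetic Laplacian named in the abstract has a compact definition that is easy to sketch. Below is a minimal NumPy illustration (our own toy construction, not the paper's code; the function name and the choice q = 0.25 are ours): symmetrize the adjacency, then encode edge direction as a complex phase, giving a Hermitian matrix whose eigenvectors can serve as direction-aware positional encodings.

```python
import numpy as np

def magnetic_laplacian(A, q=0.25):
    """Magnetic Laplacian of a directed graph with adjacency matrix A.

    The matrix is Hermitian by construction, so its eigenvalues are real
    and its eigenvectors can be used as positional encodings that are
    aware of edge direction.
    """
    A = np.asarray(A, dtype=float)
    A_sym = np.minimum(A + A.T, 1.0)          # undirected support
    deg = np.diag(A_sym.sum(axis=1))          # symmetrized degree matrix
    theta = 2.0 * np.pi * q * (A - A.T)       # antisymmetric phase term
    return deg - A_sym * np.exp(1j * theta)   # Hermitian matrix

# Toy example: a directed 3-cycle 0 -> 1 -> 2 -> 0.
A = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]])
L = magnetic_laplacian(A, q=0.25)
eigvals, eigvecs = np.linalg.eigh(L)  # real spectrum; columns of eigvecs are the encodings
```

Setting q = 0 recovers the ordinary combinatorial Laplacian of the symmetrized graph, which is one way to see the construction as a direction-aware generalization.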

  2. Book ; Online: COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation

    Lee, Jongmin / Paduraru, Cosmin / Mankowitz, Daniel J. / Heess, Nicolas / Precup, Doina / Kim, Kee-Eung / Guez, Arthur

    2022  

    Abstract We consider the offline constrained reinforcement learning (RL) problem, in which the agent aims to compute a policy that maximizes expected return while satisfying given cost constraints, learning only from a pre-collected dataset. This problem setting is appealing in many real-world scenarios, where direct interaction with the environment is costly or risky, and where the resulting policy should comply with safety constraints. However, it is challenging to compute a policy that guarantees satisfying the cost constraints in the offline RL setting, since the off-policy evaluation inherently has an estimation error. In this paper, we present an offline constrained RL algorithm that optimizes the policy in the space of the stationary distribution. Our algorithm, COptiDICE, directly estimates the stationary distribution corrections of the optimal policy with respect to returns, while constraining the cost upper bound, with the goal of yielding a cost-conservative policy for actual constraint satisfaction. Experimental results show that COptiDICE attains better policies in terms of constraint satisfaction and return-maximization, outperforming baseline algorithms.

    Comment: 24 pages, 6 figures, Accepted at ICLR 2022 (spotlight)
    Keywords Computer Science - Machine Learning ; Computer Science - Artificial Intelligence
    Subject code 006
    Publishing date 2022-04-19
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
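The stationary-distribution view described in the abstract can be summarized, in generic DICE-style notation (our paraphrase, not the paper's exact objective), as a constrained linear program over the discounted occupancy measure d(s, a):

```latex
\begin{aligned}
\max_{d \ge 0} \quad & \textstyle\sum_{s,a} d(s,a)\, R(s,a) \\
\text{s.t.} \quad    & \textstyle\sum_{s,a} d(s,a)\, C(s,a) \le \hat{c}, \\
                     & \textstyle\sum_{a} d(s,a)
                       = (1-\gamma)\, p_0(s)
                       + \gamma \sum_{s',a'} P(s \mid s',a')\, d(s',a')
                       \quad \forall s .
\end{aligned}
```

COptiDICE then estimates correction ratios w(s,a) = d(s,a) / d_D(s,a) relative to the dataset distribution d_D, which is what allows the optimization to be carried out from logged data alone.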

  3. Book ; Online: Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization

    Zhang, Michael R. / Paine, Tom Le / Nachum, Ofir / Paduraru, Cosmin / Tucker, George / Wang, Ziyu / Norouzi, Mohammad

    2021  

    Abstract Standard dynamics models for continuous control make use of feedforward computation to predict the conditional distribution of next state and reward given current state and action using a multivariate Gaussian with a diagonal covariance structure. This modeling choice assumes that different dimensions of the next state and reward are conditionally independent given the current state and action and may be driven by the fact that fully observable physics-based simulation environments entail deterministic transition dynamics. In this paper, we challenge this conditional independence assumption and propose a family of expressive autoregressive dynamics models that generate different dimensions of the next state and reward sequentially conditioned on previous dimensions. We demonstrate that autoregressive dynamics models indeed outperform standard feedforward models in log-likelihood on heldout transitions. Furthermore, we compare different model-based and model-free off-policy evaluation (OPE) methods on RL Unplugged, a suite of offline MuJoCo datasets, and find that autoregressive dynamics models consistently outperform all baselines, achieving a new state-of-the-art. Finally, we show that autoregressive dynamics models are useful for offline policy optimization by serving as a way to enrich the replay buffer through data augmentation and improving performance using model-based planning.

    Comment: ICLR 2021. 17 pages
    Keywords Computer Science - Machine Learning ; Computer Science - Artificial Intelligence ; Statistics - Machine Learning
    Subject code 006
    Publishing date 2021-04-28
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
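The autoregressive factorization described in the abstract (each dimension of the next state is generated conditioned on the previously generated ones) can be sketched as follows. This toy uses linear-Gaussian conditionals as a stand-in for the learned networks; all names, shapes, and weights are ours:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_next_state(state, action, weights, sigma=0.1):
    """Sample s' one dimension at a time: p(s') = prod_i p(s'_i | s, a, s'_{<i}).

    Each conditional is a Gaussian whose mean is a linear function of the
    state, the action, and the dimensions sampled so far.
    """
    context = np.concatenate([state, action])
    sampled = []
    for w in weights:                        # one weight vector per output dimension
        inp = np.concatenate([context, sampled])
        mean = float(inp @ w)
        sampled.append(mean + sigma * rng.normal())
    return np.array(sampled)

state, action = np.zeros(3), np.zeros(1)
dim_in = state.size + action.size
# weights[i] conditions on (s, a) plus the i previously generated dimensions
weights = [rng.normal(size=dim_in + i) for i in range(3)]
next_state = sample_next_state(state, action, weights)
```

A standard feedforward model with a diagonal Gaussian would instead emit all three means at once, which is exactly the conditional-independence assumption the paper challenges.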

  4. Book ; Online: Active Offline Policy Selection

    Konyushkova, Ksenia / Chen, Yutian / Paine, Tom Le / Gulcehre, Caglar / Paduraru, Cosmin / Mankowitz, Daniel J / Denil, Misha / de Freitas, Nando

    2021  

    Abstract This paper addresses the problem of policy selection in domains with abundant logged data, but with a restricted interaction budget. Solving this problem would enable safe evaluation and deployment of offline reinforcement learning policies in industry, robotics, and recommendation domains among others. Several off-policy evaluation (OPE) techniques have been proposed to assess the value of policies using only logged data. However, there is still a big gap between the evaluation by OPE and the full online evaluation. Yet, large amounts of online interactions are often not possible in practice. To overcome this problem, we introduce active offline policy selection - a novel sequential decision approach that combines logged data with online interaction to identify the best policy. We use OPE estimates to warm start the online evaluation. Then, in order to utilize the limited environment interactions wisely we decide which policy to evaluate next based on a Bayesian optimization method with a kernel that represents policy similarity. We use multiple benchmarks, including real-world robotics, with a large number of candidate policies to show that the proposed approach improves upon state-of-the-art OPE estimates and pure online policy evaluation.

    Comment: Presented at NeurIPS 2021
    Keywords Computer Science - Machine Learning ; Computer Science - Artificial Intelligence ; Statistics - Machine Learning
    Subject code 006 ; 004
    Publishing date 2021-06-18
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
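The selection loop described in the abstract (warm-start from OPE estimates, then spend a limited interaction budget adaptively) can be illustrated with a much-simplified UCB-style stand-in for the paper's Bayesian optimization with a policy-similarity kernel. Everything below is our toy, not the authors' method:

```python
import numpy as np

rng = np.random.default_rng(1)

def select_policy(ope_means, counts, online_means, t, c=1.0):
    """Pick the next policy to roll out.

    Policies with no online data fall back to their OPE estimate
    (the warm start); an exploration bonus shrinks with visit count.
    """
    est = np.where(counts > 0, online_means, ope_means)
    bonus = c * np.sqrt(np.log(t + 1) / (counts + 1))
    return int(np.argmax(est + bonus))

true_values = np.array([0.2, 0.8, 0.5])            # unknown to the selector
ope = true_values + rng.normal(scale=0.3, size=3)  # noisy OPE warm start
counts, sums = np.zeros(3), np.zeros(3)
for t in range(200):                               # the interaction budget
    i = select_policy(ope, counts, np.divide(sums, np.maximum(counts, 1)), t)
    counts[i] += 1
    sums[i] += true_values[i] + rng.normal(scale=0.1)  # one online episode
best = int(np.argmax(np.divide(sums, np.maximum(counts, 1))))
```

The budget is concentrated on promising policies rather than spread uniformly, which is the core advantage over either pure OPE ranking or pure online evaluation.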

  5. Book ; Online: Towards practical reinforcement learning for tokamak magnetic control

    Tracey, Brendan D. / Michi, Andrea / Chervonyi, Yuri / Davies, Ian / Paduraru, Cosmin / Lazic, Nevena / Felici, Federico / Ewalds, Timo / Donner, Craig / Galperti, Cristian / Buchli, Jonas / Neunert, Michael / Huber, Andrea / Evens, Jonathan / Kurylowicz, Paula / Mankowitz, Daniel J. / Riedmiller, Martin / Team, The TCV

    2023  

    Abstract Reinforcement learning (RL) has shown promising results for real-time control systems, including the domain of plasma magnetic control. However, there are still significant drawbacks compared to traditional feedback control approaches for magnetic confinement. In this work, we address key drawbacks of the RL method; achieving higher control accuracy for desired plasma properties, reducing the steady-state error, and decreasing the required time to learn new tasks. We build on top of Degrave et al. (2022), and present algorithmic improvements to the agent architecture and training procedure. We present simulation results that show up to 65% improvement in shape accuracy, achieve substantial reduction in the long-term bias of the plasma current, and additionally reduce the training time required to learn new tasks by a factor of 3 or more. We present new experiments using the upgraded RL-based controllers on the TCV tokamak, which validate the simulation results achieved, and point the way towards routinely achieving accurate discharges using the RL approach.
    Keywords Physics - Plasma Physics ; Computer Science - Machine Learning
    Publishing date 2023-07-21
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  6. Book ; Online: Optimizing Memory Mapping Using Deep Reinforcement Learning

    Wang, Pengming / Sazanovich, Mikita / Ilbeyi, Berkin / Phothilimthana, Phitchaya Mangpo / Purohit, Manish / Tay, Han Yang / Vũ, Ngân / Wang, Miaosen / Paduraru, Cosmin / Leurent, Edouard / Zhernov, Anton / Huang, Po-Sen / Schrittwieser, Julian / Hubert, Thomas / Tung, Robert / Kurylowicz, Paula / Milan, Kieran / Vinyals, Oriol / Mankowitz, Daniel J.

    2023  

    Abstract Resource scheduling and allocation is a critical component of many high impact systems ranging from congestion control to cloud computing. Finding more optimal solutions to these problems often has significant impact on resource and time savings, reducing device wear-and-tear, and even potentially improving carbon emissions. In this paper, we focus on a specific instance of a scheduling problem, namely the memory mapping problem that occurs during compilation of machine learning programs: That is, mapping tensors to different memory layers to optimize execution time. We introduce an approach for solving the memory mapping problem using Reinforcement Learning. RL is a solution paradigm well-suited for sequential decision making problems that are amenable to planning, and combinatorial search spaces with high-dimensional data inputs. We formulate the problem as a single-player game, which we call the mallocGame, such that high-reward trajectories of the game correspond to efficient memory mappings on the target hardware. We also introduce a Reinforcement Learning agent, mallocMuZero, and show that it is capable of playing this game to discover new and improved memory mapping solutions that lead to faster execution times on real ML workloads on ML accelerators. We compare the performance of mallocMuZero to the default solver used by the Accelerated Linear Algebra (XLA) compiler on a benchmark of realistic ML workloads. In addition, we show that mallocMuZero is capable of improving the execution time of the recently published AlphaTensor matrix multiplication model.
    Keywords Computer Science - Performance ; Computer Science - Artificial Intelligence ; Computer Science - Machine Learning
    Subject code 006
    Publishing date 2023-05-11
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
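To make the "mallocGame" framing concrete: a complete trajectory of the game assigns each tensor to a memory layer, and a high-reward trajectory is one with low total access cost subject to a fast-memory capacity limit. The toy below scores such assignments and finds the best one by brute force rather than RL; all sizes, costs, and names are invented for illustration:

```python
from itertools import product

def mapping_cost(sizes, uses, placement, fast_capacity, slow_penalty=10):
    """Cost of one complete game trajectory: a memory layer per tensor.

    placement[i] is True if tensor i goes to fast memory.  Exceeding the
    fast-memory capacity makes the mapping invalid (infinite cost).
    """
    fast_used = sum(s for s, p in zip(sizes, placement) if p)
    if fast_used > fast_capacity:
        return float("inf")
    return sum(u * (1 if p else slow_penalty) for u, p in zip(uses, placement))

sizes = [4, 3, 2, 1]   # tensor sizes
uses  = [9, 1, 6, 4]   # access counts: high-use tensors want fast memory
best = min(product([True, False], repeat=4),
           key=lambda p: mapping_cost(sizes, uses, p, fast_capacity=6))
```

Brute force is viable only for a handful of tensors; the exponential search space over real compiler workloads is what motivates a learned, planning-based agent like mallocMuZero.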

  7. Book ; Online: Semi-analytical Industrial Cooling System Model for Reinforcement Learning

    Chervonyi, Yuri / Dutta, Praneet / Trochim, Piotr / Voicu, Octavian / Paduraru, Cosmin / Qian, Crystal / Karagozler, Emre / Davis, Jared Quincy / Chippendale, Richard / Bajaj, Gautam / Witherspoon, Sims / Luo, Jerry

    2022  

    Abstract We present a hybrid industrial cooling system model that embeds analytical solutions within a multi-physics simulation. This model is designed for reinforcement learning (RL) applications and balances simplicity with simulation fidelity and interpretability. The model's fidelity is evaluated against real world data from a large scale cooling system. This is followed by a case study illustrating how the model can be used for RL research. For this, we develop an industrial task suite that allows specifying different problem settings and levels of complexity, and use it to evaluate the performance of different RL algorithms.

    Comment: 27 pages, 13 figures
    Keywords Computer Science - Artificial Intelligence ; Computer Science - Machine Learning ; Computer Science - Robotics
    Publishing date 2022-07-26
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  8. Book ; Online: An empirical investigation of the challenges of real-world reinforcement learning

    Dulac-Arnold, Gabriel / Levine, Nir / Mankowitz, Daniel J. / Li, Jerry / Paduraru, Cosmin / Gowal, Sven / Hester, Todd

    2020  

    Abstract Reinforcement learning (RL) has proven its worth in a series of artificial domains, and is beginning to show some successes in real-world scenarios. However, much of the research advances in RL are hard to leverage in real-world systems due to a series of assumptions that are rarely satisfied in practice. In this work, we identify and formalize a series of independent challenges that embody the difficulties that must be addressed for RL to be commonly deployed in real-world systems. For each challenge, we define it formally in the context of a Markov Decision Process, analyze the effects of the challenge on state-of-the-art learning algorithms, and present some existing attempts at tackling it. We believe that an approach that addresses our set of proposed challenges would be readily deployable in a large number of real world problems. Our proposed challenges are implemented in a suite of continuous control environments called the realworldrl-suite, which we propose as an open-source benchmark.

    Comment: arXiv admin note: text overlap with arXiv:1904.12901
    Keywords Computer Science - Machine Learning ; Computer Science - Artificial Intelligence
    Subject code 006
    Publishing date 2020-03-24
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  9. Book ; Online: Robust Constrained Reinforcement Learning for Continuous Control with Model Misspecification

    Mankowitz, Daniel J. / Calian, Dan A. / Jeong, Rae / Paduraru, Cosmin / Heess, Nicolas / Dathathri, Sumanth / Riedmiller, Martin / Mann, Timothy

    2020  

    Abstract Many real-world physical control systems are required to satisfy constraints upon deployment. Furthermore, real-world systems are often subject to effects such as non-stationarity, wear-and-tear, uncalibrated sensors and so on. Such effects effectively perturb the system dynamics and can cause a policy trained successfully in one domain to perform poorly when deployed to a perturbed version of the same domain. This can affect a policy's ability to maximize future rewards as well as the extent to which it satisfies constraints. We refer to this as constrained model misspecification. We present an algorithm that mitigates this form of misspecification, and showcase its performance in multiple simulated Mujoco tasks from the Real World Reinforcement Learning (RWRL) suite.
    Keywords Computer Science - Machine Learning ; Computer Science - Artificial Intelligence ; Statistics - Machine Learning
    Publishing date 2020-10-20
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
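In outline (our hedged paraphrase, not the paper's exact statement), robustness to constrained model misspecification amounts to optimizing a worst case over an uncertainty set P of perturbed transition models, applied to both the return and the cost:

```latex
\max_{\pi} \; \min_{p \in \mathcal{P}} \;
  \mathbb{E}_{p,\pi}\!\Big[\textstyle\sum_{t} \gamma^{t} r_t\Big]
\qquad \text{s.t.} \qquad
\max_{p \in \mathcal{P}} \;
  \mathbb{E}_{p,\pi}\!\Big[\textstyle\sum_{t} \gamma^{t} c_t\Big] \le \beta .
```

A policy trained this way hedges against the non-stationarity and wear-and-tear effects the abstract lists, instead of being optimal only under the nominal dynamics.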

  10. Article ; Online: Faster sorting algorithms discovered using deep reinforcement learning.

    Mankowitz, Daniel J / Michi, Andrea / Zhernov, Anton / Gelmi, Marco / Selvi, Marco / Paduraru, Cosmin / Leurent, Edouard / Iqbal, Shariq / Lespiau, Jean-Baptiste / Ahern, Alex / Köppe, Thomas / Millikin, Kevin / Gaffney, Stephen / Elster, Sophie / Broshear, Jackson / Gamble, Chris / Milan, Kieran / Tung, Robert / Hwang, Minjae / Cemgil, Taylan / Barekatain, Mohammadamin / Li, Yujia / Mandhane, Amol / Hubert, Thomas / Schrittwieser, Julian / Hassabis, Demis / Kohli, Pushmeet / Riedmiller, Martin / Vinyals, Oriol / Silver, David

    Nature

    2023  Volume 618, Issue 7964, Page(s) 257–263

    Abstract Fundamental algorithms such as sorting or hashing are used trillions of times on any given day ...
    Language English
    Publishing date 2023-06-07
    Publishing country England
    Document type Journal Article
    ZDB-ID 120714-3
    ISSN (online) 1476-4687
    ISSN (print) 0028-0836
    DOI 10.1038/s41586-023-06004-9
    Database MEDical Literature Analysis and Retrieval System OnLINE
