LIVIVO - Search results -

Search results

Result 1 - 10 of total 169

Article ; Online: Robust Losses for Learning Value Functions.

Patterson, Andrew / Liao, Victor / White, Martha

IEEE transactions on pattern analysis and machine intelligence

2023 Volume 45, Issue 5, Page(s) 6157–6167

Abstract	Most value function learning algorithms in reinforcement learning are based on the mean squared (projected) Bellman error. However, squared errors are known to be sensitive to outliers, both skewing the solution of the objective and resulting in high-magnitude and high-variance gradients. To control these high-magnitude updates, typical strategies in RL involve clipping gradients, clipping rewards, rescaling rewards, or clipping errors. While these strategies appear to be related to robust losses, like the Huber loss, they are built on semi-gradient update rules which do not minimize a known loss. In this work, we build on recent insights reformulating squared Bellman errors as a saddlepoint optimization problem and propose a saddlepoint reformulation for a Huber Bellman error and Absolute Bellman error. We start from a formalization of robust losses, then derive sound gradient-based approaches to minimize these losses in both the online off-policy prediction and control settings. We characterize the solutions of the robust losses, providing insight into the problem settings where the robust losses define notably better solutions than the mean squared Bellman error. Finally, we show that the resulting gradient-based algorithms are more stable, for both prediction and control, with less sensitivity to meta-parameters.
Language	English
Publishing date	2023-04-03
Publishing country	United States
Document type	Journal Article
ISSN	1939-3539
ISSN (online)	1939-3539
DOI	10.1109/TPAMI.2022.3213503
Database	MEDical Literature Analysis and Retrieval System OnLINE

Order via subito

This service is chargeable due to the Delivery terms set by subito. Orders including an article and supplementary material will be classified as separate orders. In these cases, fees will be demanded for each order.

Inter-library loan at ZB MED

Your chosen title can be delivered directly to ZB MED Cologne location if you are registered as a user at ZB MED Cologne.

Article ; Online: Neighborhood risk and prenatal care utilization in Rhode Island, 2005-2014.

Habtemariam, Helena / Schlichting, Lauren E / Kole-White, Martha B / Berger, Blythe / Vivier, Patrick

Birth (Berkeley, Calif.)

2024

Abstract	Background: The importance of prenatal care is undeniable, as pregnant persons who receive on-time, adequate prenatal care have better maternal and infant health outcomes compared with those receiving late, less than adequate prenatal care. Previous studies assessing the relationship between neighborhood factors and maternal health outcomes have typically looked at singular neighborhood variables and their relationship with maternal health outcomes. In order to examine a greater number of place-based risk factors simultaneously, our analysis used a unique neighborhood risk index to assess the association between cumulative risk and prenatal care utilization, which no other studies have done. Methods: Data from Rhode Island Vital Statistics for births between 2005 and 2014 were used to assess the relationship between neighborhood risk and prenatal care utilization using two established indices. We assessed neighborhood risk with an index composed of eight socioeconomic block-group variables. A multivariate logistic regression model was used to examine the association between adequate use and neighborhood risk. Results: Individuals living in a high-risk neighborhood were less likely to have adequate or better prenatal care utilization according to both the APNCU Index (adjusted odds ratio [aOR] 0.91, 95% confidence interval [CI] 0.87-0.95) and the R-GINDEX (aOR 0.88, 95% CI 0.85-0.91) compared with those in low-risk neighborhoods. Conclusion: Understanding the impact of neighborhood-level factors on prenatal care use is a critical first step in ensuring that underserved neighborhoods are prioritized in interventions aimed at making access to prenatal care more equitable.
Language	English
Publishing date	2024-01-11
Publishing country	United States
Document type	Journal Article
ZDB-ID	604869-9
ISSN	1523-536X ; 0730-7659
ISSN (online)	1523-536X
ISSN	0730-7659
DOI	10.1111/birt.12810
Database	MEDical Literature Analysis and Retrieval System OnLINE

In stock of ZB MED Cologne/Königswinter

Zs.B 2442: Show issues

Location:
Depending on availability (see the holdings information)
up to vol. 2021: order articles via the online order form
from vol. 2022: reading room (ground floor)

Order via subito

Article ; Online: Guidance for compassionate restraint of small children to prevent injuries with epinephrine autoinjectors.

White, Martha V

Allergy and asthma proceedings

2017 Volume 39, Issue 2, Page(s) 161–165

Abstract	Background: Without securing a child properly, injuries can happen with the use of pediatric epinephrine autoinjectors (EAI), and lacerations and embedded needles have been reported. Health care providers should ensure that instruction is provided to parents on how to hold a child during an injection with an EAI. Objective: To demonstrate the compassionate restraint of small children during an allergic emergency to ensure the safe use of an EAI. Methods: A patient was used to illustrate a compassionate restraint technique during a mock injection with an EAI. Results: One possible technique was illustrated here to reinforce the need for complete, yet compassionate restraint of small children during the use of an EAI. The exact position intended to be used by parents or caregivers will need to be practiced with their children to ensure a safe injection in the event of an allergic emergency. Conclusion: Reinforcement of proper EAI use and visual guidance that illustrate compassionate restraint can potentially prevent EAI-related injuries.
MeSH term(s)	Anaphylaxis/drug therapy ; Caregivers ; Child ; Child, Preschool ; Empathy ; Epinephrine/adverse effects ; Epinephrine/therapeutic use ; Female ; Humans ; Injections ; Male ; Parents ; Restraint, Physical/methods ; Self Administration ; Surveys and Questionnaires ; Wounds and Injuries/etiology ; Wounds and Injuries/prevention & control
Chemical Substances	Epinephrine (YKH834O4BH)
Language	English
Publishing date	2017-11-29
Publishing country	United States
Document type	Journal Article
ZDB-ID	1312445-6
ISSN	1539-6304 ; 1088-5412
ISSN (online)	1539-6304
ISSN	1088-5412
DOI	10.2500/aap.2018.39.4110
Database	MEDical Literature Analysis and Retrieval System OnLINE

Full text online

Accessible to users with ZB MED library card

In stock of ZB MED Cologne/Königswinter

Zs.A 1804: Show issues

Location:
Depending on availability (see the holdings information)
up to vol. 1994: order articles via the online order form
vol. 1995 - 2021: reading room (1st floor)
from vol. 2022: reading room (ground floor)

Order via subito

Book ; Online: Asymptotically Unbiased Off-Policy Policy Evaluation when Reusing Old Data in Nonstationary Environments

Liu, Vincent / Chandak, Yash / Thomas, Philip / White, Martha

2023

Abstract	In this work, we consider the off-policy policy evaluation problem for contextual bandits and finite horizon reinforcement learning in the nonstationary setting. Reusing old data is critical for policy evaluation, but existing estimators that reuse old data introduce large bias such that we cannot obtain a valid confidence interval. Inspired by a related field called survey sampling, we introduce a variant of the doubly robust (DR) estimator, called the regression-assisted DR estimator, that can incorporate the past data without introducing a large bias. The estimator unifies several existing off-policy policy evaluation methods and improves on them with the use of auxiliary information and a regression approach. We prove that the new estimator is asymptotically unbiased, and provide a consistent variance estimator to construct a large-sample confidence interval. Finally, we empirically show that the new estimator improves estimation for the current and future policy values, and provides a tight and valid interval estimation in several nonstationary recommendation environments. Comment: AISTATS 2023
Keywords	Computer Science - Machine Learning
Subject code	310
Publishing date	2023-02-22
Publishing country	us
Document type	Book ; Online
Database	BASE - Bielefeld Academic Search Engine (life sciences selection)

Full text online

Full text

Inter-library loan at ZB MED

Your chosen title can be delivered directly to ZB MED Cologne location if you are registered as a user at ZB MED Cologne.

Book ; Online: Empirical Design in Reinforcement Learning

Patterson, Andrew / Neumann, Samuel / White, Martha / White, Adam

2023

Abstract	Empirical design in reinforcement learning is no small task. Running good experiments requires attention to detail and at times significant computational resources. While compute resources available per dollar have continued to grow rapidly, so has the scale of typical experiments in reinforcement learning. It is now common to benchmark agents with millions of parameters against dozens of tasks, each using the equivalent of 30 days of experience. The scale of these experiments often conflicts with the need for proper statistical evidence, especially when comparing algorithms. Recent studies have highlighted how popular algorithms are sensitive to hyper-parameter settings and implementation details, and that common empirical practice leads to weak statistical evidence (Machado et al., 2018; Henderson et al., 2018). Here we take this one step further. This manuscript represents both a call to action, and a comprehensive resource for how to do good experiments in reinforcement learning. In particular, we cover: the statistical assumptions underlying common performance measures, how to properly characterize performance variation and stability, hypothesis testing, special considerations for comparing multiple agents, baseline and illustrative example construction, and how to deal with hyper-parameters and experimenter bias. Throughout we highlight common mistakes found in the literature and the statistical consequences of those in example experiments. The objective of this document is to provide answers on how we can use our unprecedented compute to do good science in reinforcement learning, as well as stay alert to potential pitfalls in our empirical design. Comment: In submission to JMLR
Keywords	Computer Science - Machine Learning ; Computer Science - Artificial Intelligence
Subject code	006
Publishing date	2023-04-03
Publishing country	us
Document type	Book ; Online
Database	BASE - Bielefeld Academic Search Engine (life sciences selection)

Full text online

Full text

Inter-library loan at ZB MED

Your chosen title can be delivered directly to ZB MED Cologne location if you are registered as a user at ZB MED Cologne.

Book ; Online: Scalable Real-Time Recurrent Learning Using Sparse Connections and Selective Learning

Javed, Khurram / Shah, Haseeb / Sutton, Rich / White, Martha

2023

Abstract	State construction from sensory observations is an important component of a reinforcement learning agent. One solution for state construction is to use recurrent neural networks. Back-propagation through time (BPTT) and real-time recurrent learning (RTRL) are two popular gradient-based methods for recurrent learning. BPTT requires the complete sequence of observations before computing gradients and is unsuitable for online real-time updates. RTRL can do online updates but scales poorly to large networks. In this paper, we propose two constraints that make RTRL scalable. We show that by either decomposing the network into independent modules, or learning the network incrementally, we can make RTRL scale linearly with the number of parameters. Unlike prior scalable gradient estimation algorithms, such as UORO and Truncated-BPTT, our algorithms do not add noise or bias to the gradient estimate. Instead, they trade off the functional capacity of the network to achieve scalable learning. We demonstrate the effectiveness of our approach over Truncated-BPTT on a benchmark inspired by animal learning and by doing policy evaluation for pre-trained Rainbow-DQN agents in the Arcade Learning Environment (ALE). Comment: Scalable recurrent learning, Online learning, RTRL, Cascade correlation networks, Agent-state construction
Keywords	Computer Science - Machine Learning ; Computer Science - Artificial Intelligence
Subject code	006
Publishing date	2023-01-20
Publishing country	us
Document type	Book ; Online
Database	BASE - Bielefeld Academic Search Engine (life sciences selection)

Full text online

Full text

Inter-library loan at ZB MED

Your chosen title can be delivered directly to ZB MED Cologne location if you are registered as a user at ZB MED Cologne.

Book ; Online: Generalized Munchausen Reinforcement Learning using Tsallis KL Divergence

Zhu, Lingwei / Chen, Zheng / Schlegel, Matthew / White, Martha

2023

Abstract	Many policy optimization approaches in reinforcement learning incorporate a Kullback-Leibler (KL) divergence to the previous policy, to prevent the policy from changing too quickly. This idea was initially proposed in a seminal paper on Conservative Policy Iteration, with approximations given by algorithms like TRPO and Munchausen Value Iteration (MVI). We continue this line of work by investigating a generalized KL divergence -- called the Tsallis KL divergence -- which uses the $q$-logarithm in the definition. The approach is a strict generalization, as $q = 1$ corresponds to the standard KL divergence; $q > 1$ provides a range of new options. We characterize the types of policies learned under the Tsallis KL, and motivate when $q > 1$ could be beneficial. To obtain a practical algorithm that incorporates Tsallis KL regularization, we extend MVI, which is one of the simplest approaches to incorporate KL regularization. We show that this generalized MVI($q$) obtains significant improvements over the standard MVI($q = 1$) across 35 Atari games. Comment: Accepted by NeurIPS 2023
Keywords	Computer Science - Machine Learning ; Computer Science - Artificial Intelligence
Subject code	006
Publishing date	2023-01-26
Publishing country	us
Document type	Book ; Online
Database	BASE - Bielefeld Academic Search Engine (life sciences selection)

Full text online

Full text

Inter-library loan at ZB MED

Your chosen title can be delivered directly to ZB MED Cologne location if you are registered as a user at ZB MED Cologne.

Book ; Online: When is Offline Policy Selection Sample Efficient for Reinforcement Learning?

Liu, Vincent / Nagarajan, Prabhat / Patterson, Andrew / White, Martha

2023

Abstract	Offline reinforcement learning algorithms often require careful hyperparameter tuning. Consequently, before deployment, we need to select amongst a set of candidate policies. As yet, however, there is little understanding about the fundamental limits of this offline policy selection (OPS) problem. In this work we aim to provide clarity on when sample efficient OPS is possible, primarily by connecting OPS to off-policy policy evaluation (OPE) and Bellman error (BE) estimation. We first show a hardness result, that in the worst case, OPS is just as hard as OPE, by proving a reduction of OPE to OPS. As a result, no OPS method can be more sample efficient than OPE in the worst case. We then propose a BE method for OPS, called Identifiable BE Selection (IBES), that has a straightforward method for selecting its own hyperparameters. We highlight that using IBES for OPS generally has more requirements than OPE methods, but if satisfied, can be more sample efficient. We conclude with an empirical study comparing OPE and IBES, and by showing the difficulty of OPS on an offline Atari benchmark dataset.
Keywords	Computer Science - Machine Learning ; Computer Science - Artificial Intelligence
Subject code	004
Publishing date	2023-12-04
Publishing country	us
Document type	Book ; Online
Database	BASE - Bielefeld Academic Search Engine (life sciences selection)

Full text online

Full text

Inter-library loan at ZB MED

Your chosen title can be delivered directly to ZB MED Cologne location if you are registered as a user at ZB MED Cologne.

Book ; Online: Unifying task specification in reinforcement learning

White, Martha

2016

Abstract	Reinforcement learning tasks are typically specified as Markov decision processes. This formalism has been highly successful, though specifications often couple the dynamics of the environment and the learning objective. This lack of modularity can complicate generalization of the task specification, as well as obfuscate connections between different task settings, such as episodic and continuing. In this work, we introduce the RL task formalism, which provides a unification through simple constructs including a generalization to transition-based discounting. Through a series of examples, we demonstrate the generality and utility of this formalism. Finally, we extend standard learning constructs, including Bellman operators, and extend some seminal theoretical results, including approximation error bounds. Overall, we provide a well-understood and sound formalism on which to build theoretical results and simplify algorithm use and development. Comment: Published at the International Conference on Machine Learning, 2017. This version includes minor typo and error fixes
Keywords	Computer Science - Artificial Intelligence
Subject code	006
Publishing date	2016-09-07
Publishing country	us
Document type	Book ; Online
Database	BASE - Bielefeld Academic Search Engine (life sciences selection)

Full text online

Full text

Inter-library loan at ZB MED

Your chosen title can be delivered directly to ZB MED Cologne location if you are registered as a user at ZB MED Cologne.

Book ; Online: Trajectory-Aware Eligibility Traces for Off-Policy Reinforcement Learning

Daley, Brett / White, Martha / Amato, Christopher / Machado, Marlos C.

2023

Abstract	Off-policy learning from multistep returns is crucial for sample-efficient reinforcement learning, but counteracting off-policy bias without exacerbating variance is challenging. Classically, off-policy bias is corrected in a per-decision manner: past temporal-difference errors are re-weighted by the instantaneous Importance Sampling (IS) ratio after each action via eligibility traces. Many off-policy algorithms rely on this mechanism, along with differing protocols for cutting the IS ratios to combat the variance of the IS estimator. Unfortunately, once a trace has been fully cut, the effect cannot be reversed. This has led to the development of credit-assignment strategies that account for multiple past experiences at a time. These trajectory-aware methods have not been extensively analyzed, and their theoretical justification remains uncertain. In this paper, we propose a multistep operator that can express both per-decision and trajectory-aware methods. We prove convergence conditions for our operator in the tabular setting, establishing the first guarantees for several existing methods as well as many new ones. Finally, we introduce Recency-Bounded Importance Sampling (RBIS), which leverages trajectory awareness to perform robustly across $\lambda$-values in an off-policy control task. Comment: ICML 2023. 8 pages, 2 figures. arXiv admin note: text overlap with arXiv:2112.12281
Keywords	Computer Science - Machine Learning
Publishing date	2023-01-26
Publishing country	us
Document type	Book ; Online
Database	BASE - Bielefeld Academic Search Engine (life sciences selection)

Full text online

Full text

Inter-library loan at ZB MED

Your chosen title can be delivered directly to ZB MED Cologne location if you are registered as a user at ZB MED Cologne.
