LIVIVO - The Search Portal for Life Sciences

Search results

Results 1–10 of 14

  1. Thesis ; Online: Interpretability of Learning Algorithms Encoded in Deep Neural Networks

    von Oswald, Johannes

    2024  

    Keywords info:eu-repo/classification/ddc/004 ; Data processing ; computer science
    Language English
    Publisher ETH Zurich
    Publishing country ch
    Document type Thesis ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  2. Article ; Online: Informationsplattform COVID-19 für die Primärversorgung: Entstehung und Entwicklung einer webbasierten Wissensplattform in der COVID-19 Pandemie.

    Wendler, Maria / Erber, Patrick / Dolcic, Johanna / Oswald, Johannes / Rabady, Susanne

    Gesundheitswesen (Bundesverband der Ärzte des Öffentlichen Gesundheitsdienstes (Germany))

    2024  Volume 86, Issue 4, Page(s) 294–303

    Title translation COVID-19 Information Platform for Primary Care: Emergence and Development of a Web-based Knowledge Platform in the COVID-19 Pandemic.
    Abstract Background: The SARS-CoV-2 outbreak in the spring of 2020 challenged the health care system, and thus primary care, on an unprecedented scale. Knowledge about the new disease was low, whereas the dynamics of knowledge generation were high and inhomogeneous. A number of new primary care tasks related to the pandemic situation emerged. Guidance in the management of COVID-19 was therefore needed, although robust evidence was not yet available. The information required concerned not only the virus and the new disease COVID-19, but also regulatory requirements and organizational issues. In this situation, a flexible, web-based information tool, easy to update and usable at the point of care, was developed at Karl Landsteiner Private University Krems and put online under the name of "COVID-19: prevention and management in primary care practices" on March 25, 2020. In a retrospective process description, we describe the needs-triggered process of developing and disseminating a practice-based tool to support practicing primary care physicians in a period of high uncertainty with an urgent need for information. Afterwards, we reflect on the learning process from a purely pragmatic to an increasingly structured approach and try to draw conclusions regarding optimization possibilities in terms of creation and dissemination.
    Conclusion and outlook: In situations of high uncertainty combined with an acute need for action and decision-making, there is a significant need for information that is as reliable as possible. Science transfer must be done in such a way that information can be implemented quickly. Dissemination, as always, plays an essential role. Gaps must be accepted. A structured process of quality assurance must be established in parallel. Funds and resources for knowledge transfer should be included in future pandemic plans.
    MeSH term(s) Humans ; COVID-19/epidemiology ; SARS-CoV-2 ; Pandemics ; Retrospective Studies ; Germany ; Primary Health Care ; Internet
    Language German
    Publishing date 2024-03-11
    Publishing country Germany
    Document type English Abstract ; Journal Article
    ZDB-ID 1101426-x
    ISSN (online) 1439-4421
    ISSN 0941-3790 ; 0949-7013
    DOI 10.1055/a-2173-8232
    Database MEDical Literature Analysis and Retrieval System OnLINE


  3. Article: Informationsplattform COVID-19 für die Primärversorgung: Entstehung und Entwicklung einer webbasierten Wissensplattform in der COVID-19 Pandemie

    Wendler, Maria / Erber, Patrick / Dolcic, Johanna / Oswald, Johannes / Rabady, Susanne

    Das Gesundheitswesen

    2024  Volume 86, Issue 04, Page(s) 294–303

    Abstract Background: The outbreak of SARS-CoV-2 in the spring of 2020 challenged the health care system, and with it primary care, to an unprecedented extent. Knowledge about the new disease was scarce, while the dynamics of knowledge generation were high and inhomogeneous. A number of new primary care tasks arose in connection with the pandemic situation. Support in dealing with COVID-19 was therefore necessary, even though robust evidence was not yet available. Information was needed not only on the virus and the new disease COVID-19, but also on regulatory requirements and organizational questions. In this situation, a flexible, easily updatable web-based knowledge platform suitable for use in practice was developed at Karl Landsteiner Private University Krems and put online on March 25, 2020 as "COVID-19: Prävention und Umgang in Primärversorgungspraxen" (COVID-19: prevention and management in primary care practices). In the form of a retrospective process description, this paper first describes the needs-driven process of developing and disseminating a practice-oriented tool to support practicing general practitioners in a phase of high uncertainty and urgent need for information. We then reflect on the learning process from a purely pragmatic to an increasingly structured approach and attempt to draw conclusions about possibilities for optimization with regard to creation and dissemination.
    Conclusion and outlook: In situations of high uncertainty combined with an acute need for action and decision-making, there is a considerable need for information that is as reliable as possible. Science transfer must take place in such a way that information can be implemented quickly. As always, dissemination plays an essential role. Gaps have to be accepted. A structured quality assurance process must be established in parallel. Funds and resources for knowledge transfer should be provided for in future pandemic plans.
    Keywords COVID-19 ; point-of-care tool ; knowledge-to-action framework ; primary health care ; knowledge transfer
    Language German
    Publishing date 2024-03-11
    Publisher Georg Thieme Verlag KG
    Publishing place Stuttgart ; New York
    Document type Article
    ZDB-ID 1101426-x
    ISSN (online) 1439-4421
    ISSN 0941-3790 ; 0949-7013
    DOI 10.1055/a-2173-8232
    Database Thieme publisher's database


  4. Book ; Online: The least-control principle for local learning at equilibrium

    Meulemans, Alexander / Zucchet, Nicolas / Kobayashi, Seijin / von Oswald, Johannes / Sacramento, João

    2022  

    Abstract Equilibrium systems are a powerful way to express neural computations. As special cases, they include models of great current interest in both neuroscience and machine learning, such as deep neural networks, equilibrium recurrent neural networks, deep equilibrium models, or meta-learning. Here, we present a new principle for learning such systems with a temporally- and spatially-local rule. Our principle casts learning as a least-control problem, where we first introduce an optimal controller to lead the system towards a solution state, and then define learning as reducing the amount of control needed to reach such a state. We show that incorporating learning signals within a dynamics as an optimal control enables transmitting activity-dependent credit assignment information, avoids storing intermediate states in memory, and does not rely on infinitesimal learning signals. In practice, our principle leads to strong performance matching that of leading gradient-based learning methods when applied to an array of problems involving recurrent neural networks and meta-learning. Our results shed light on how the brain might learn and offer new ways of approaching a broad class of machine learning problems.

    Comment: Published at NeurIPS 2022. 56 pages
    Keywords Computer Science - Machine Learning ; Computer Science - Neural and Evolutionary Computing ; 68T07 ; I.2.6
    Subject code 006
    Publishing date 2022-07-04
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
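
    The abstract describes the least-control idea only in words. As a rough schematic (reconstructed from the abstract alone; the symbols $\phi$ for the neural state, $\theta$ for the parameters and $u$ for the control signal are assumptions, and the paper's exact formulation may differ), the controlled dynamics can be pictured as

    \[
      \dot{\phi} \;=\; f(\phi, \theta) \;+\; u ,
    \]

    where the optimal controller $u_*(\theta)$ steers the system to an equilibrium that minimises the task loss, and learning is then defined as

    \[
      \min_{\theta} \; \tfrac{1}{2}\,\lVert u_*(\theta) \rVert^{2} ,
    \]

    i.e. the parameters are adapted so that ever less control is needed to reach a solution state.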


  5. Book ; Online: A contrastive rule for meta-learning

    Zucchet, Nicolas / Schug, Simon / von Oswald, Johannes / Zhao, Dominic / Sacramento, João

    2021  

    Abstract Meta-learning algorithms leverage regularities that are present on a set of tasks to speed up and improve the performance of a subsidiary learning process. Recent work on deep neural networks has shown that prior gradient-based learning of meta-parameters can greatly improve the efficiency of subsequent learning. Here, we present a gradient-based meta-learning algorithm based on equilibrium propagation. Instead of explicitly differentiating the learning process, our contrastive meta-learning rule estimates meta-parameter gradients by executing the subsidiary process more than once. This avoids reversing the learning dynamics in time and computing second-order derivatives. In spite of this, and unlike previous first-order methods, our rule recovers an arbitrarily accurate meta-parameter update given enough compute. As such, contrastive meta-learning is a candidate rule for biologically-plausible meta-learning. We establish theoretical bounds on its performance and present experiments on a set of standard benchmarks and neural network architectures.

    Comment: 28 pages, 9 figures
    Keywords Computer Science - Machine Learning ; Computer Science - Neural and Evolutionary Computing ; Quantitative Biology - Neurons and Cognition
    Subject code 006
    Publishing date 2021-04-04
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
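
    The abstract's "executing the subsidiary process more than once" admits a compact equilibrium-propagation-style reading. As a hedged sketch (standard notation, not necessarily the paper's: $\theta$ are meta-parameters, $L_{\mathrm{in}}$ and $L_{\mathrm{out}}$ the inner and outer losses, $\beta$ a small nudging strength), let

    \[
      \phi_\beta(\theta) \;=\; \arg\min_{\phi}\;\big[\, L_{\mathrm{in}}(\phi, \theta) \;+\; \beta\, L_{\mathrm{out}}(\phi) \,\big]
    \]

    denote the result of the inner learning process nudged towards the meta-objective. A contrastive estimate of the meta-gradient is then the finite difference

    \[
      \widehat{\nabla}_\theta L_{\mathrm{out}} \;\approx\; \frac{1}{\beta}\Big(\nabla_\theta L_{\mathrm{in}}\big(\phi_\beta(\theta), \theta\big) \;-\; \nabla_\theta L_{\mathrm{in}}\big(\phi_0(\theta), \theta\big)\Big),
    \]

    which avoids differentiating through the learning dynamics and, as the abstract notes, can be made arbitrarily accurate given enough compute (e.g. a small $\beta$ or a symmetric $\pm\beta$ variant).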


  6. Book ; Online: Transformers learn in-context by gradient descent

    von Oswald, Johannes / Niklasson, Eyvind / Randazzo, Ettore / Sacramento, João / Mordvintsev, Alexander / Zhmoginov, Andrey / Vladymyrov, Max

    2022  

    Abstract At present, the mechanisms of in-context learning in Transformers are not well understood and remain mostly an intuition. In this paper, we suggest that training Transformers on auto-regressive objectives is closely related to gradient-based meta-learning formulations. We start by providing a simple weight construction that shows the equivalence of data transformations induced by 1) a single linear self-attention layer and by 2) gradient-descent (GD) on a regression loss. Motivated by that construction, we show empirically that when training self-attention-only Transformers on simple regression tasks either the models learned by GD and Transformers show great similarity or, remarkably, the weights found by optimization match the construction. Thus we show how trained Transformers become mesa-optimizers i.e. learn models by gradient descent in their forward pass. This allows us, at least in the domain of regression problems, to mechanistically understand the inner workings of in-context learning in optimized Transformers. Building on this insight, we furthermore identify how Transformers surpass the performance of plain gradient descent by learning an iterative curvature correction and learn linear models on deep data representations to solve non-linear regression tasks. Finally, we discuss intriguing parallels to a mechanism identified to be crucial for in-context learning termed induction-head (Olsson et al., 2022) and show how it could be understood as a specific case of in-context learning by gradient descent learning within Transformers. Code to reproduce the experiments can be found at https://github.com/google-research/self-organising-systems/tree/master/transformers_learn_icl_by_gd .
    Keywords Computer Science - Machine Learning ; Computer Science - Artificial Intelligence ; Computer Science - Computation and Language
    Publishing date 2022-12-15
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
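
    The core claim of the abstract, that one step of gradient descent on a linear-regression loss can be written as a (linear) attention operation, can be checked numerically in a few lines. The snippet below is only an illustration of that equivalence in its simplest form, not the paper's weight construction (the linked repository contains that); names and shapes are chosen for readability.

        # One GD step on L(w) = 0.5 * sum_i (w.x_i - y_i)^2, evaluated at a query point,
        # equals a linear-attention readout whose scores are <x_query, x_i> and whose
        # values are the current residuals.
        import numpy as np

        rng = np.random.default_rng(0)
        d, n = 3, 8                        # input dimension, number of context examples
        X = rng.normal(size=(n, d))        # in-context inputs
        y = X @ rng.normal(size=d)         # in-context targets
        x_q = rng.normal(size=d)           # query (test) input
        w0, eta = np.zeros(d), 0.1         # initial weights, learning rate

        # (1) explicit gradient-descent step, then predict
        w1 = w0 - eta * X.T @ (X @ w0 - y)
        pred_gd = x_q @ w1

        # (2) the same prediction written as unnormalised linear attention over the context
        residuals = y - X @ w0             # "values"
        scores = X @ x_q                   # "attention scores" <x_q, x_i>
        pred_attn = x_q @ w0 + eta * scores @ residuals

        assert np.allclose(pred_gd, pred_attn)   # identical up to floating-point error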


  7. Book ; Online: Continual learning with hypernetworks

    von Oswald, Johannes / Henning, Christian / Grewe, Benjamin F. / Sacramento, João

    2019  

    Abstract Artificial neural networks suffer from catastrophic forgetting when they are sequentially trained on multiple tasks. To overcome this problem, we present a novel approach based on task-conditioned hypernetworks, i.e., networks that generate the weights of a target model based on task identity. Continual learning (CL) is less difficult for this class of models thanks to a simple key feature: instead of recalling the input-output relations of all previously seen data, task-conditioned hypernetworks only require rehearsing task-specific weight realizations, which can be maintained in memory using a simple regularizer. Besides achieving state-of-the-art performance on standard CL benchmarks, additional experiments on long task sequences reveal that task-conditioned hypernetworks display a very large capacity to retain previous memories. Notably, such long memory lifetimes are achieved in a compressive regime, when the number of trainable hypernetwork weights is comparable or smaller than target network size. We provide insight into the structure of low-dimensional task embedding spaces (the input space of the hypernetwork) and show that task-conditioned hypernetworks demonstrate transfer learning. Finally, forward information transfer is further supported by empirical results on a challenging CL benchmark based on the CIFAR-10/100 image datasets.

    Comment: Published at ICLR 2020
    Keywords Computer Science - Machine Learning ; Computer Science - Artificial Intelligence ; Statistics - Machine Learning ; 68T99
    Subject code 006
    Publishing date 2019-06-03
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
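
    To make the mechanism in the abstract concrete, here is a minimal sketch of a task-conditioned hypernetwork with the kind of output regulariser described there. All names, shapes and the tiny target model are hypothetical choices for illustration; this is not the authors' implementation.

        import torch
        import torch.nn as nn

        class HyperNet(nn.Module):
            """Maps a learned task embedding to the flattened weights of a target model."""
            def __init__(self, emb_dim=8, target_dim=10 * 20 + 10):  # one 20->10 linear layer
                super().__init__()
                self.net = nn.Sequential(nn.Linear(emb_dim, 64), nn.ReLU(),
                                         nn.Linear(64, target_dim))
            def forward(self, e):
                return self.net(e)

        def target_forward(x, w_flat):
            W, b = w_flat[:200].view(10, 20), w_flat[200:]
            return x @ W.t() + b                      # target model run with generated weights

        hnet = HyperNet()
        task_embs = nn.ParameterList([nn.Parameter(torch.randn(8)) for _ in range(3)])

        def cl_regulariser(hnet, prev_embs, snapshots, beta=0.01):
            """Keep the weights generated for previous tasks close to the snapshots stored
            before training on the current task: only weight realizations are rehearsed,
            not past data."""
            return beta * sum(((hnet(e) - w_old) ** 2).mean()
                              for e, w_old in zip(prev_embs, snapshots))

    Before each new task one would store snapshots = [hnet(e).detach() for e in task_embs] and add cl_regulariser(...) to the current task's loss, which is the "simple regularizer" the abstract refers to.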


  8. Book ; Online: Neural networks with late-phase weights

    von Oswald, Johannes / Kobayashi, Seijin / Meulemans, Alexander / Henning, Christian / Grewe, Benjamin F. / Sacramento, João

    2020  

    Abstract The largely successful method of training neural networks is to learn their weights using some variant of stochastic gradient descent (SGD). Here, we show that the solutions found by SGD can be further improved by ensembling a subset of the weights in late stages of learning. At the end of learning, we obtain back a single model by taking a spatial average in weight space. To avoid incurring increased computational costs, we investigate a family of low-dimensional late-phase weight models which interact multiplicatively with the remaining parameters. Our results show that augmenting standard models with late-phase weights improves generalization in established benchmarks such as CIFAR-10/100, ImageNet and enwik8. These findings are complemented with a theoretical analysis of a noisy quadratic problem which provides a simplified picture of the late phases of neural network learning.

    Comment: 25 pages, 6 figures
    Keywords Computer Science - Machine Learning ; Computer Science - Computer Vision and Pattern Recognition ; Statistics - Machine Learning
    Subject code 006
    Publishing date 2020-07-25
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
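
    As a toy picture of the procedure the abstract outlines (hypothetical shapes, not the authors' code): a small multiplicative weight component is replicated K times late in training, the copies are updated on different minibatches, and a single model is recovered at the end by averaging in weight space.

        import numpy as np

        rng = np.random.default_rng(0)
        d, K = 16, 5
        W_base = rng.normal(size=(d, d))            # large "base" weights, kept single
        gains = [np.ones(d) for _ in range(K)]      # K copies of low-dimensional late-phase weights

        def forward(x, gain):
            return (W_base * gain[:, None]) @ x     # late-phase weights act multiplicatively

        # ...late in training, each gains[k] would be updated by SGD on its own minibatches...

        gain_avg = np.mean(gains, axis=0)           # spatial average in weight space
        W_final = W_base * gain_avg[:, None]        # single model used at test time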


  9. Book ; Online: Continual Learning in Recurrent Neural Networks

    Ehret, Benjamin / Henning, Christian / Cervera, Maria R. / Meulemans, Alexander / von Oswald, Johannes / Grewe, Benjamin F.

    2020  

    Abstract While a diverse collection of continual learning (CL) methods has been proposed to prevent catastrophic forgetting, a thorough investigation of their effectiveness for processing sequential data with recurrent neural networks (RNNs) is lacking. Here, we provide the first comprehensive evaluation of established CL methods on a variety of sequential data benchmarks. Specifically, we shed light on the particularities that arise when applying weight-importance methods, such as elastic weight consolidation, to RNNs. In contrast to feedforward networks, RNNs iteratively reuse a shared set of weights and require working memory to process input samples. We show that the performance of weight-importance methods is not directly affected by the length of the processed sequences, but rather by high working memory requirements, which lead to an increased need for stability at the cost of decreased plasticity for learning subsequent tasks. We additionally provide theoretical arguments supporting this interpretation by studying linear RNNs. Our study shows that established CL methods can be successfully ported to the recurrent case, and that a recent regularization approach based on hypernetworks outperforms weight-importance methods, thus emerging as a promising candidate for CL in RNNs. Overall, we provide insights on the differences between CL in feedforward networks and RNNs, while guiding towards effective solutions to tackle CL on sequential data.

    Comment: Published at ICLR 2021
    Keywords Computer Science - Machine Learning ; Statistics - Machine Learning
    Subject code 006
    Publishing date 2020-06-22
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
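
    For orientation, elastic weight consolidation, the weight-importance method named in the abstract, penalises changes to parameters in proportion to their estimated importance on earlier tasks (standard formulation from the EWC literature, not specific to this paper):

    \[
      L(\theta) \;=\; L_{\text{new}}(\theta) \;+\; \frac{\lambda}{2} \sum_i F_i \,\big(\theta_i - \theta_i^{\text{old}}\big)^2 ,
    \]

    where $F_i$ is a Fisher-information-based importance estimate and $\theta^{\text{old}}$ are the weights after the previous tasks. The study reported here finds that in RNNs it is high working-memory demands, rather than sequence length itself, that push such methods towards stability at the cost of plasticity.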


  10. Book ; Online: Learning where to learn: Gradient sparsity in meta and continual learning

    von Oswald, Johannes / Zhao, Dominic / Kobayashi, Seijin / Schug, Simon / Caccia, Massimo / Zucchet, Nicolas / Sacramento, João

    2021  

    Abstract Finding neural network weights that generalize well from small datasets is difficult. A promising approach is to learn a weight initialization such that a small number of weight changes results in low generalization error. We show that this form of meta-learning can be improved by letting the learning algorithm decide which weights to change, i.e., by learning where to learn. We find that patterned sparsity emerges from this process, with the pattern of sparsity varying on a problem-by-problem basis. This selective sparsity results in better generalization and less interference in a range of few-shot and continual learning problems. Moreover, we find that sparse learning also emerges in a more expressive model where learning rates are meta-learned. Our results shed light on an ongoing debate on whether meta-learning can discover adaptable features and suggest that learning by sparse gradient descent is a powerful inductive bias for meta-learning systems.

    Comment: Published at NeurIPS 2021
    Keywords Computer Science - Machine Learning ; Computer Science - Neural and Evolutionary Computing
    Subject code 006
    Publishing date 2021-10-27
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
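
    A minimal sketch of the mechanism the abstract describes, i.e. gating the inner-loop update with a meta-learned per-parameter mask so that only selected weights adapt (names and the soft sigmoid gate are illustrative assumptions, not the authors' code):

        import torch

        w_init = torch.randn(50, requires_grad=True)          # meta-learned initialization
        mask_logits = torch.zeros(50, requires_grad=True)     # meta-learned "where to learn"

        def inner_adapt(w_init, mask_logits, loss_fn, lr=0.1, steps=3):
            mask = torch.sigmoid(mask_logits)                  # per-weight gate in [0, 1]
            w = w_init
            for _ in range(steps):
                grad, = torch.autograd.grad(loss_fn(w), w, create_graph=True)
                w = w - lr * mask * grad                       # gated (sparse) gradient step
            return w

        # Meta-training would backpropagate a held-out loss on inner_adapt(...) into both
        # w_init and mask_logits, so the mask itself is learned; the sparsity that emerges
        # in the learned mask is the phenomenon the abstract reports.
        w_adapted = inner_adapt(w_init, mask_logits, lambda w: (w ** 2).sum())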

