LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 10 of 12

  1. Book ; Online: Information Plane Analysis for Dropout Neural Networks

    Adilova, Linara / Geiger, Bernhard C. / Fischer, Asja

    2023  

    Abstract The information-theoretic framework promises to explain the predictive power of neural networks. In particular, the information plane analysis, which measures mutual information (MI) between input and representation as well as representation and output, should give rich insights into the training process. This approach, however, was shown to strongly depend on the choice of estimator of the MI. The problem is amplified for deterministic networks if the MI between input and representation is infinite. Thus, the estimated values are defined by the different approaches for estimation, but do not adequately represent the training process from an information-theoretic perspective. In this work, we show that dropout with continuously distributed noise ensures that MI is finite. We demonstrate in a range of experiments that this enables a meaningful information plane analysis for a class of dropout neural networks that is widely used in practice.

    Comment: Published as a conference paper at ICLR 2023
    Keywords Computer Science - Information Theory
    Publishing date 2023-03-01
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

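    The following is a minimal sketch (not the authors' code) of the kind of analysis this record describes: when the representation is perturbed by continuous multiplicative Gaussian dropout noise, the conditional distribution p(t|x) is Gaussian, so I(X;T) is finite and can be estimated by Monte Carlo against a mixture marginal. The toy encoder, the noise level sigma, and all names are hypothetical.

    import numpy as np
    from scipy.stats import multivariate_normal

    rng = np.random.default_rng(0)

    def encoder(x):
        # Hypothetical deterministic feature map to a 2-D representation.
        return np.tanh(x @ np.array([[1.0, -0.5], [0.3, 0.8]]))

    def mi_gaussian_dropout(X, sigma=0.5, n_noise=64):
        """Monte Carlo estimate of I(X;T) for T = h(X) * (1 + sigma*eps),
        eps ~ N(0, I), i.e. p(t|x) = N(h(x), diag(sigma^2 h(x)^2))."""
        H = encoder(X)                                  # (n, d) features
        n = len(H)
        covs = [np.diag(sigma ** 2 * h ** 2 + 1e-8) for h in H]
        mi = 0.0
        for i in range(n):
            # Sample t ~ p(T|x_i) and average log p(t|x_i) - log p_hat(t).
            ts = rng.multivariate_normal(H[i], covs[i], size=n_noise)
            log_cond = multivariate_normal.logpdf(ts, H[i], covs[i])
            # Marginal p_hat(t): mixture of the conditionals over the sample.
            log_marg = np.logaddexp.reduce(
                [multivariate_normal.logpdf(ts, H[j], covs[j])
                 for j in range(n)], axis=0) - np.log(n)
            mi += np.mean(log_cond - log_marg)
        return mi / n

    X = rng.normal(size=(100, 2))
    print(f"estimated I(X;T) = {mi_gaussian_dropout(X):.3f} nats")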

  2. Book ; Online: Layer-wise Linear Mode Connectivity

    Adilova, Linara / Andriushchenko, Maksym / Kamp, Michael / Fischer, Asja / Jaggi, Martin

    2023  

    Abstract Averaging neural network parameters is an intuitive method for fusing the knowledge of two independent models. It is most prominently used in federated learning. If models are averaged at the end of training, this can only lead to a well-performing model if the loss surface of interest is very particular, i.e., the loss in the midpoint between the two models needs to be sufficiently low. This is impossible to guarantee for the non-convex losses of state-of-the-art networks. For averaging models trained on vastly different datasets, it was proposed to average only the parameters of particular layers or combinations of layers, resulting in better-performing models. To get a better understanding of the effect of layer-wise averaging, we analyze the performance of the models that result from averaging single layers, or groups of layers. Based on our empirical and theoretical investigation, we introduce a novel notion of layer-wise linear connectivity, and show that deep networks do not have layer-wise barriers between them. In addition, we analyze layer-wise personalization averaging and conjecture that in a particular problem setup all partial aggregations result in approximately the same performance.

    Comment: preprint
    Keywords Computer Science - Machine Learning
    Subject code 006
    Publishing date 2023-07-13
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

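    A minimal sketch, under hypothetical names, of the layer-wise analysis described above: interpolate the parameters of a single named layer between two trained models of identical architecture (all other parameters kept from model_a) and report the barrier, i.e. the largest deviation of the loss along the path from the straight line between the endpoint losses. A barrier near zero for every single layer corresponds to the layer-wise linear connectivity the abstract describes.

    import copy
    import torch

    @torch.no_grad()
    def eval_loss(model, loader, loss_fn):
        model.eval()
        total, n = 0.0, 0
        for x, y in loader:
            total += loss_fn(model(x), y).item() * len(x)
            n += len(x)
        return total / n

    @torch.no_grad()
    def layer_barrier(model_a, model_b, layer_name, loader, loss_fn, steps=11):
        """Barrier for interpolating only `layer_name` between the models."""
        sd_a = model_a.state_dict()
        sd_b = model_b.state_dict()
        probe = copy.deepcopy(model_a)
        alphas = torch.linspace(0.0, 1.0, steps).tolist()
        losses = []
        for alpha in alphas:
            sd = {k: ((1 - alpha) * v + alpha * sd_b[k])
                  if k.startswith(layer_name) else v.clone()
                  for k, v in sd_a.items()}
            probe.load_state_dict(sd)
            losses.append(eval_loss(probe, loader, loss_fn))
        # Deviation from the straight line between the endpoint losses.
        return max(l - ((1 - a) * losses[0] + a * losses[-1])
                   for l, a in zip(losses, alphas))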

  3. Article ; Online: Visual Analytics for Human-Centered Machine Learning

    Andrienko, Natalia / Andrienko, Gennady / Adilova, Linara / Wrobel, Stefan

    2022  

    Abstract We introduce a new research area in visual analytics (VA) aiming to bridge existing gaps between methods of interactive machine learning (ML) and eXplainable Artificial Intelligence (XAI), on one side, and human minds, on the other side. The gaps are, first, a conceptual mismatch between ML/XAI outputs and human mental models and ways of reasoning, and second, a mismatch between the information quantity and level of detail and human capabilities to perceive and understand. A grand challenge is to adapt ML and XAI to human goals, concepts, values, and ways of thinking. Complementing the current efforts in XAI towards solving this challenge, VA can contribute by exploiting the potential of visualization as an effective way of communicating information to humans and a strong trigger of human abstractive perception and thinking. We propose a cross-disciplinary research framework and formulate research directions for VA.
    Keywords Computer science ; Human intelligence ; Machine learning ; Visual analytics
    Subject code 401
    Language English
    Publishing date 2022-01-25
    Publishing country de
    Document type Article ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  4. Article ; Online: Visual Analytics for Human-Centered Machine Learning.

    Andrienko, Natalia / Andrienko, Gennady / Adilova, Linara / Wrobel, Stefan / Rhyne, Theresa-Marie

    IEEE computer graphics and applications

    2022  Volume 42, Issue 1, Page(s) 123–133

    Abstract We introduce a new research area in visual analytics (VA) aiming to bridge existing gaps between methods of interactive machine learning (ML) and eXplainable Artificial Intelligence (XAI), on one side, and human minds, on the other side. The gaps are, first, a conceptual mismatch between ML/XAI outputs and human mental models and ways of reasoning, and second, a mismatch between the information quantity and level of detail and human capabilities to perceive and understand. A grand challenge is to adapt ML and XAI to human goals, concepts, values, and ways of thinking. Complementing the current efforts in XAI towards solving this challenge, VA can contribute by exploiting the potential of visualization as an effective way of communicating information to humans and a strong trigger of human abstractive perception and thinking. We propose a cross-disciplinary research framework and formulate research directions for VA.
    MeSH term(s) Artificial Intelligence ; Humans ; Machine Learning
    Language English
    Publishing date 2022-03-14
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ISSN 1558-1756
    ISSN (online) 1558-1756
    DOI 10.1109/MCG.2021.3130314
    Database MEDical Literature Analysis and Retrieval System OnLINE


  5. Book ; Online: Novelty Detection in Sequential Data by Informed Clustering and Modeling

    Adilova, Linara / Chen, Siming / Kamp, Michael

    2021  

    Abstract Novelty detection in discrete sequences is a challenging task, since deviations from the process generating the normal data are often small or intentionally hidden. Novelties can be detected by modeling normal sequences and measuring the deviations of a new sequence from the model predictions. However, in many applications data is generated by several distinct processes, so that models trained on all the data tend to over-generalize and novelties remain undetected. We propose to approach this challenge through decomposition: by clustering the data we break down the problem, obtaining a simpler modeling task in each cluster that can be solved more accurately. However, this comes with a trade-off, since the amount of training data per cluster is reduced. This is a particular problem for discrete sequences, where state-of-the-art models are data-hungry. The success of this approach thus depends on the quality of the clustering, i.e., on whether the individual learning problems are sufficiently simpler than the joint problem. While clustering discrete sequences automatically is a challenging and domain-specific task, it is often easy for human domain experts, given the right tools. In this paper, we adapt a state-of-the-art visual analytics tool for discrete sequence clustering to obtain informed clusters from domain experts and use LSTMs to model each cluster individually. Our extensive empirical evaluation indicates that this informed clustering outperforms automatic ones and that our approach outperforms state-of-the-art novelty detection methods for discrete sequences in three real-world application scenarios. In particular, decomposition outperforms a global model despite less training data on each individual cluster.

    Comment: AI&HCI Workshop at the 40th International Conference on Machine Learning (ICML), Honolulu, Hawaii, USA. 2023
    Keywords Computer Science - Machine Learning
    Subject code 006
    Publishing date 2021-03-05
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

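    As an illustration of the decomposition idea in the record above, here is a minimal sketch (hypothetical names and hyperparameters; the informed clusters themselves would come from the visual analytics tool and domain experts): one next-token LSTM per cluster, with a novelty score given by the best per-token negative log-likelihood across cluster models.

    import torch
    import torch.nn as nn

    class SeqModel(nn.Module):
        """Next-token LSTM language model for one cluster of sequences."""
        def __init__(self, vocab_size, dim=32):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, dim)
            self.lstm = nn.LSTM(dim, dim, batch_first=True)
            self.out = nn.Linear(dim, vocab_size)

        def nll(self, seq):
            # seq: (batch, T) token ids; predict each token from its prefix.
            h, _ = self.lstm(self.emb(seq[:, :-1]))
            logits = self.out(h)
            return nn.functional.cross_entropy(
                logits.reshape(-1, logits.size(-1)), seq[:, 1:].reshape(-1))

    def novelty_score(seq, cluster_models):
        """Per-token NLL under the best-fitting cluster model; high = novel."""
        with torch.no_grad():
            return min(m.nll(seq).item() for m in cluster_models)

    # Train one SeqModel per informed cluster, then flag a new sequence as
    # novel if novelty_score(seq, models) exceeds a threshold calibrated on
    # held-out normal data.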

  6. Book ; Online: FAM: Relative Flatness Aware Minimization

    Adilova, Linara / Abourayya, Amr / Li, Jianning / Dada, Amin / Petzka, Henning / Egger, Jan / Kleesiek, Jens / Kamp, Michael

    2023  

    Abstract Flatness of the loss curve around a model at hand has been shown to empirically correlate with its generalization ability. Optimizing for flatness has been proposed as early as 1994 by Hochreiter and Schmidhuber, and was followed by more recent successful sharpness-aware optimization techniques. Their widespread adoption in practice, though, is dubious because of the lack of a theoretically grounded connection between flatness and generalization, in particular in light of the reparameterization curse - certain reparameterizations of a neural network change most flatness measures but do not change generalization. Recent theoretical work suggests that a particular relative flatness measure can be connected to generalization and solves the reparameterization curse. In this paper, we derive a regularizer based on this relative flatness that is easy to compute, fast, efficient, and works with arbitrary loss functions. It requires computing the Hessian of only a single layer of the network, which makes it applicable to large neural networks and thereby avoids an expensive mapping of the loss surface in the vicinity of the model. In an extensive empirical evaluation we show that this relative flatness aware minimization (FAM) improves generalization in a multitude of applications and models, both in finetuning and standard training. We make the code available on GitHub.

    Comment: Proceedings of the 2nd Annual Workshop on Topology, Algebra, and Geometry in Machine Learning (TAG-ML) at the 40 th International Conference on Machine Learning, Honolulu, Hawaii, USA. 2023
    Keywords Computer Science - Machine Learning
    Subject code 006
    Publishing date 2023-07-05
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

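    A minimal sketch of a regularizer in the spirit of the record above, assuming a PyTorch training loop: the squared norm of one chosen layer's weights times a Hutchinson estimate of the trace of the loss Hessian with respect to that layer. This is an illustrative approximation, not the paper's exact relative-flatness formula; `model.fc` and `lam` are hypothetical.

    import torch

    def flatness_penalty(loss, weight, n_probes=1):
        """Squared weight norm times a Hutchinson trace estimate of the
        Hessian of `loss` w.r.t. one layer's `weight` tensor."""
        grad = torch.autograd.grad(loss, weight, create_graph=True)[0]
        trace_est = 0.0
        for _ in range(n_probes):
            v = torch.randint_like(weight, 2) * 2.0 - 1.0  # Rademacher probe
            hv = torch.autograd.grad(grad, weight, grad_outputs=v,
                                     create_graph=True)[0]
            trace_est = trace_est + (v * hv).sum()
        return weight.pow(2).sum() * trace_est / n_probes

    # Usage inside a training step (lam is a hypothetical regularization
    # weight, model.fc a hypothetical chosen layer):
    # loss = criterion(model(x), y)
    # loss = loss + lam * flatness_penalty(loss, model.fc.weight)
    # loss.backward()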

  7. Book ; Online: A Reparameterization-Invariant Flatness Measure for Deep Neural Networks

    Petzka, Henning / Adilova, Linara / Kamp, Michael / Sminchisescu, Cristian

    2019  

    Abstract The performance of deep neural networks is often attributed to their automated, task-related feature construction. It remains an open question, though, why this leads to solutions with good generalization, even in cases where the number of parameters is larger than the number of samples. Back in the 90s, Hochreiter and Schmidhuber observed that flatness of the loss surface around a local minimum correlates with low generalization error. For several flatness measures, this correlation has been empirically validated. However, it has recently been shown that existing measures of flatness cannot theoretically be related to generalization due to a lack of invariance with respect to reparameterizations. We propose a natural modification of existing flatness measures that results in invariance to reparameterization.

    Comment: 14 pages; accepted at Workshop "Science meets Engineering of Deep Learning", 33rd Conference on Neural Information Processing Systems (NeurIPS 2019)
    Keywords Computer Science - Machine Learning ; Statistics - Machine Learning
    Subject code 006
    Publishing date 2019-11-29
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

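    A minimal numerical illustration of the invariance idea on a hypothetical two-layer ReLU toy model, using a simple measure in this spirit (not necessarily the paper's exact one): rescaling the layers as W1 -> a*W1, W2 -> W2/a leaves the network function unchanged; the Hessian trace of the first layer scales by 1/a^2, so the rescaled quantity ||W1||^2 * Tr(H) stays constant.

    import torch

    torch.manual_seed(0)
    X = torch.randn(64, 3)
    y = torch.randn(64, 1)
    W1 = torch.randn(4, 3)
    W2 = torch.randn(1, 4)

    def loss_fn(w1_flat, w2):
        # Two-layer ReLU network with squared loss.
        h = torch.relu(X @ w1_flat.view(4, 3).T)
        return ((h @ w2.T - y) ** 2).mean()

    def trace_hessian(w1, w2):
        H = torch.autograd.functional.hessian(lambda w: loss_fn(w, w2),
                                              w1.flatten())
        return torch.diagonal(H).sum().item()

    for a in (1.0, 3.0):
        w1, w2 = a * W1, W2 / a   # function-preserving reparameterization
        tr = trace_hessian(w1, w2)
        print(f"a={a}: Tr(H)={tr:.4f}, "
              f"||W1||^2 * Tr(H)={w1.pow(2).sum().item() * tr:.4f}")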

  8. Book ; Online: Relative Flatness and Generalization

    Petzka, Henning / Kamp, Michael / Adilova, Linara / Sminchisescu, Cristian / Boley, Mario

    2020  

    Abstract Flatness of the loss curve is conjectured to be connected to the generalization ability of machine learning models, in particular neural networks. While it has been empirically observed that flatness measures consistently correlate strongly with generalization, it is still an open theoretical problem why and under which circumstances flatness is connected to generalization, in particular in light of reparameterizations that change certain flatness measures but leave generalization unchanged. We investigate the connection between flatness and generalization by relating it to the interpolation from representative data, deriving notions of representativeness and feature robustness. These notions allow us to rigorously connect flatness and generalization and to identify conditions under which the connection holds. Moreover, they give rise to a novel, but natural relative flatness measure that correlates strongly with generalization, simplifies to ridge regression for ordinary least squares, and solves the reparameterization issue.

    Comment: The first two authors made equal contribution; Accepted for publication at NeurIPS 2021; arXiv admin note: substantial text overlap with arXiv:1912.00058
    Keywords Computer Science - Machine Learning ; Statistics - Machine Learning
    Subject code 006
    Publishing date 2020-01-03
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  9. Book ; Online: System Misuse Detection via Informed Behavior Clustering and Modeling

    Adilova, Linara / Natious, Livin / Chen, Siming / Thonnard, Olivier / Kamp, Michael

    2019  

    Abstract One of the main tasks of cybersecurity is recognizing malicious interactions with an arbitrary system. Currently, the logging information from each interaction can be collected in almost unrestricted amounts, but identification of attacks requires a lot of effort and time from security experts. We propose an approach for identifying fraudulent activity by modeling normal behavior in interactions with a system via machine learning methods, in particular LSTM neural networks. In order to enrich the modeling with system-specific knowledge, we propose to use an interactive visual interface that allows security experts to identify semantically meaningful clusters of interactions. These clusters incorporate domain knowledge and lead to more precise behavior modeling via informed machine learning. We evaluate the proposed approach on a dataset containing logs of interactions with an administrative interface of a login and security server. Our empirical results indicate that the informed modeling is capable of capturing normal behavior, which can then be used to detect abnormal behavior.

    Comment: 9 pages including appendix, DSN Workshop on Data-Centric Dependability and Security (http://dcds.lasige.di.fc.ul.pt/)
    Keywords Computer Science - Cryptography and Security ; Computer Science - Machine Learning
    Subject code 006
    Publishing date 2019-07-01
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

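    A minimal sketch of how such a normal-behavior model could be turned into a misuse alarm (all names hypothetical; `score_session` is assumed to return e.g. the per-token negative log-likelihood of a session under the trained model, as in the sketch for record 5): calibrate a threshold on held-out normal sessions and flag sessions scoring above it.

    import numpy as np

    def calibrate_threshold(normal_scores, fp_rate=0.01):
        """Alarm threshold = (1 - fp_rate) quantile of normal scores."""
        return float(np.quantile(np.asarray(normal_scores), 1.0 - fp_rate))

    def is_misuse(session_score, threshold):
        return session_score > threshold

    # scores = [score_session(s) for s in held_out_normal_sessions]
    # thr = calibrate_threshold(scores, fp_rate=0.005)
    # alerts = [s for s in new_sessions if is_misuse(score_session(s), thr)]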

  10. Book ; Online: Resource-Constrained On-Device Learning by Dynamic Averaging

    Heppe, Lukas / Kamp, Michael / Adilova, Linara / Heinrich, Danny / Piatkowski, Nico / Morik, Katharina

    2020  

    Abstract The communication between data-generating devices is partially responsible for a growing portion of the world's power consumption. Thus, reducing communication is vital, both from an economic and an ecological perspective. For machine learning, on-device learning avoids sending raw data, which can reduce communication substantially. Furthermore, not centralizing the data protects privacy-sensitive data. However, most learning algorithms require hardware with high computation power and thus high energy consumption. In contrast, ultra-low-power processors, like FPGAs or microcontrollers, allow for energy-efficient learning of local models. Combined with communication-efficient distributed learning strategies, this reduces the overall energy consumption and enables applications that were yet impossible due to limited energy on local devices. The major challenge, then, is that low-power processors typically have only integer processing capabilities. This paper investigates an approach to communication-efficient on-device learning of integer exponential families that can be executed on low-power processors, is privacy-preserving, and effectively minimizes communication. The empirical evaluation shows that the approach can reach a model quality comparable to a centrally learned regular model with an order of magnitude less communication. Comparing the overall energy consumption, this reduces the required energy for solving the machine learning task by a significant amount.
    Keywords Computer Science - Machine Learning ; Statistics - Machine Learning
    Subject code 006
    Publishing date 2020-09-25
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

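    A minimal sketch of a dynamic-averaging round in the spirit of the record above (hypothetical simplifications: models as flat numpy vectors, a fixed drift threshold delta, and averaging only the violating devices, whereas the full protocol can pull in additional models to keep the average consistent): devices stay silent while their local model remains within delta of the last shared reference, so communication adapts to how much the models actually drift.

    import numpy as np

    def dynamic_averaging_round(local_models, reference, delta):
        """One communication round; returns models, new reference, #synced."""
        drifted = [i for i, w in enumerate(local_models)
                   if np.sum((w - reference) ** 2) > delta]
        if not drifted:
            return local_models, reference, 0      # no communication at all
        # Simplification: average only the violating models; the full
        # protocol can additionally pull in non-violating models to keep
        # the global average consistent.
        avg = np.mean([local_models[i] for i in drifted], axis=0)
        for i in drifted:
            local_models[i] = avg.copy()
        return local_models, avg, len(drifted)

    # Each device would run a local (integer/low-power) update between
    # rounds, e.g.: local_models[i] = local_update(local_models[i], data_i)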
