LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 10 of 17

  1. Book ; Online: Hamiltonian Neural Networks

    Greydanus, Sam / Dzamba, Misko / Yosinski, Jason

    2019  

    Abstract Even though neural networks enjoy widespread use, they still struggle to learn the basic laws of physics. How might we endow them with better inductive biases? In this paper, we draw inspiration from Hamiltonian mechanics to train models that learn and respect exact conservation laws in an unsupervised manner. We evaluate our models on problems where conservation of energy is important, including the two-body problem and pixel observations of a pendulum. Our model trains faster and generalizes better than a regular neural network. An interesting side effect is that our model is perfectly reversible in time.

    Comment: Conference paper at NeurIPS 2019. Main paper has 8 pages and 5 figures
    Keywords Computer Science - Neural and Evolutionary Computing
    Publishing date 2019-06-04
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

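    To make the mechanism concrete, here is a minimal sketch (ours, not the authors' released code) of the core idea: a network outputs a scalar energy H(q, p), Hamilton's equations turn its gradient into time derivatives, and training matches those derivatives against ones estimated from data. Architecture and dimensions are illustrative.

      import torch
      import torch.nn as nn

      # Scalar-valued network H_theta(q, p); the architecture is an illustrative choice.
      hnet = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 1))

      def time_derivatives(x):
          # x: batch of (q, p) pairs, shape (B, 2).
          x = x.requires_grad_(True)
          H = hnet(x).sum()
          dH = torch.autograd.grad(H, x, create_graph=True)[0]  # (dH/dq, dH/dp)
          dq_dt = dH[:, 1]    # Hamilton's equations: dq/dt =  dH/dp
          dp_dt = -dH[:, 0]   #                       dp/dt = -dH/dq
          return torch.stack([dq_dt, dp_dt], dim=1)

      def loss(x, dx_dt_observed):
          # Penalize deviation from derivatives estimated from observed trajectories.
          return ((time_derivatives(x) - dx_dt_observed) ** 2).mean()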

  2. Book ; Online: Deconstructing Lottery Tickets

    Zhou, Hattie / Lan, Janice / Liu, Rosanne / Yosinski, Jason

    Zeros, Signs, and the Supermask

    2019  

    Abstract The recent "Lottery Ticket Hypothesis" paper by Frankle & Carbin showed that a simple approach to creating sparse networks (keeping the large weights) results in models that are trainable from scratch, but only when starting from the same initial weights. The performance of these networks often exceeds the performance of the non-sparse base model, but for reasons that were not well understood. In this paper we study the three critical components of the Lottery Ticket (LT) algorithm, showing that each may be varied significantly without impacting the overall results. Ablating these factors leads to new insights for why LT networks perform as well as they do. We show why setting weights to zero is important, how signs are all you need to make the reinitialized network train, and why masking behaves like training. Finally, we discover the existence of Supermasks, masks that can be applied to an untrained, randomly initialized network to produce a model with performance far better than chance (86% on MNIST, 41% on CIFAR-10).

    Comment: NeurIPS 2019 camera ready version
    Keywords Computer Science - Machine Learning ; Computer Science - Computer Vision and Pattern Recognition ; Statistics - Machine Learning
    Subject code 006
    Publishing date 2019-05-03
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

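    A minimal sketch of the two ingredients the abstract refers to, in an illustrative setup of our own: the baseline "keep the large weights" criterion, and the application of a binary mask to an untrained, randomly initialized layer. In the paper, Supermasks are found by optimizing the mask itself; magnitude masking is only the simplest criterion studied.

      import torch

      def magnitude_mask(weight, keep_fraction=0.2):
          # The LT pruning criterion: keep the largest-magnitude weights, zero the rest.
          n = weight.numel()
          k = int(n * keep_fraction)
          threshold = weight.abs().flatten().kthvalue(n - k).values
          return (weight.abs() > threshold).float()

      # The "supermask" setting: mask an untrained, randomly initialized layer.
      layer = torch.nn.Linear(784, 10)
      with torch.no_grad():
          # Surviving weights keep their random initial values.
          layer.weight.mul_(magnitude_mask(layer.weight))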

  3. Book ; Online: LCA

    Lan, Janice / Liu, Rosanne / Zhou, Hattie / Yosinski, Jason

    Loss Change Allocation for Neural Network Training

    2019  

    Abstract Neural networks enjoy widespread use, but many aspects of their training, representation, and operation are poorly understood. In particular, our view into the training process is limited, with a single scalar loss being the most common viewport into this high-dimensional, dynamic process. We propose a new window into training called Loss Change Allocation (LCA), in which credit for changes to the network loss is conservatively partitioned to the parameters. This measurement is accomplished by decomposing the components of an approximate path integral along the training trajectory using a Runge-Kutta integrator. This rich view shows which parameters are responsible for decreasing or increasing the loss during training, or which parameters "help" or "hurt" the network's learning, respectively. LCA may be summed over training iterations and/or over neurons, channels, or layers for increasingly coarse views. This new measurement device produces several insights into training. (1) We find that barely over 50% of parameters help during any given iteration. (2) Some entire layers hurt overall, moving on average against the training gradient, a phenomenon we hypothesize may be due to phase lag in an oscillatory training process. (3) Finally, increments in learning proceed in a synchronized manner across layers, often peaking on identical iterations.

    Comment: NeurIPS 2019 camera ready version
    Keywords Computer Science - Machine Learning ; Statistics - Machine Learning
    Subject code 006
    Publishing date 2019-09-03
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

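    The allocation itself fits in a few lines. Below is a first-order simplification of our own, assuming a caller-supplied loss_fn over a parameter list; the paper refines the same path-integral idea with a Runge-Kutta integrator.

      import torch

      def lca_step(loss_fn, params_before, params_after, data):
          # First-order Loss Change Allocation: credit the loss change between two
          # training iterates to individual parameters via gradient . delta.
          theta = [p.detach().requires_grad_(True) for p in params_before]
          grads = torch.autograd.grad(loss_fn(theta, data), theta)
          # Negative entries "help" (push the loss down); summing all entries
          # approximates the total loss change over the step.
          return [g * (a.detach() - b.detach())
                  for g, a, b in zip(grads, params_after, params_before)]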

  4. Book ; Online: Language Models are Few-shot Multilingual Learners

    Winata, Genta Indra / Madotto, Andrea / Lin, Zhaojiang / Liu, Rosanne / Yosinski, Jason / Fung, Pascale

    2021  

    Abstract General-purpose language models have demonstrated impressive capabilities, performing on par with state-of-the-art approaches on a range of downstream natural language processing (NLP) tasks and benchmarks when inferring instructions from very few examples. Here, we evaluate the multilingual skills of the GPT and T5 models in conducting multi-class classification on non-English languages without any parameter updates. We show that, given a few English examples as context, pre-trained language models can predict not only English test samples but also non-English ones. Finally, we find the in-context few-shot cross-lingual prediction results of language models are significantly better than random prediction, and they are competitive compared to the existing state-of-the-art cross-lingual models.

    Comment: 14 pages
    Keywords Computer Science - Computation and Language ; Computer Science - Artificial Intelligence
    Publishing date 2021-09-15
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

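    The in-context setup described above amounts to prompt construction; the template below is our illustration, not necessarily the paper's exact format.

      def few_shot_prompt(english_examples, test_text):
          # English (text, label) pairs as demonstrations, then the non-English
          # test sentence; the pretrained LM completes the final label slot.
          blocks = [f"sentence: {t}\nlabel: {y}" for t, y in english_examples]
          blocks.append(f"sentence: {test_text}\nlabel:")
          return "\n\n".join(blocks)

      prompt = few_shot_prompt(
          [("the movie was wonderful", "positive"),
           ("a dull, tedious film", "negative")],
          "la película fue maravillosa",  # non-English test sample
      )
      # Prediction: the candidate label the LM scores highest as a continuation.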

  5. Book ; Online: RxRx1

    Sypetkowski, Maciej / Rezanejad, Morteza / Saberian, Saber / Kraus, Oren / Urbanik, John / Taylor, James / Mabey, Ben / Victors, Mason / Yosinski, Jason / Sereshkeh, Alborz Rezazadeh / Haque, Imran / Earnshaw, Berton

    A Dataset for Evaluating Experimental Batch Correction Methods

    2023  

    Abstract High-throughput screening techniques are commonly used to obtain large quantities of data in many fields of biology. It is well known that artifacts arising from variability in the technical execution of different experimental batches within such screens confound these observations and can lead to invalid biological conclusions. It is therefore necessary to account for these batch effects when analyzing outcomes. In this paper we describe RxRx1, a biological dataset designed specifically for the systematic study of batch effect correction methods. The dataset consists of 125,510 high-resolution fluorescence microscopy images of human cells under 1,138 genetic perturbations in 51 experimental batches across 4 cell types. Visual inspection of the images alone clearly demonstrates significant batch effects. We propose a classification task designed to evaluate the effectiveness of experimental batch correction methods on these images and examine the performance of a number of correction methods on this task. Our goal in releasing RxRx1 is to encourage the development of effective experimental batch correction methods that generalize well to unseen experimental batches. The dataset can be downloaded at https://rxrx.ai.
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject code 004
    Publishing date 2023-01-13
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

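    As a rough illustration of what such correction methods operate on, here is one of the simplest possible baselines, per-batch standardization of embedding features. This is our example, not a method proposed in the paper, and the feature and batch-id arrays are assumed inputs.

      import numpy as np

      def per_batch_standardize(features, batch_ids):
          # Standardize each feature dimension within each experimental batch,
          # removing per-batch shifts in mean and scale.
          corrected = np.empty_like(features, dtype=float)
          for b in np.unique(batch_ids):
              idx = batch_ids == b
              mu = features[idx].mean(axis=0)
              sigma = features[idx].std(axis=0) + 1e-8
              corrected[idx] = (features[idx] - mu) / sigma
          return corrected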

  6. Book ; Online: Supermasks in Superposition

    Wortsman, Mitchell / Ramanujan, Vivek / Liu, Rosanne / Kembhavi, Aniruddha / Rastegari, Mohammad / Yosinski, Jason / Farhadi, Ali

    2020  

    Abstract We present the Supermasks in Superposition (SupSup) model, capable of sequentially learning thousands of tasks without catastrophic forgetting. Our approach uses a randomly initialized, fixed base network and for each task finds a subnetwork (supermask) that achieves good performance. If task identity is given at test time, the correct subnetwork can be retrieved with minimal memory usage. If not provided, SupSup can infer the task using gradient-based optimization to find a linear superposition of learned supermasks which minimizes the output entropy. In practice we find that a single gradient step is often sufficient to identify the correct mask, even among 2500 tasks. We also showcase two promising extensions. First, SupSup models can be trained entirely without task identity information, as they may detect when they are uncertain about new data and allocate an additional supermask for the new training distribution. Finally, the entire, growing set of supermasks can be stored in a constant-sized reservoir by implicitly storing them as attractors in a fixed-size Hopfield network.
    Keywords Computer Science - Machine Learning ; Computer Science - Artificial Intelligence ; Statistics - Machine Learning
    Subject code 004
    Publishing date 2020-06-25
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

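    A simplified sketch of the task-inference step, assuming a differentiable forward_fn(x, weights), a single fixed weight tensor, and a list of learned binary masks (all names ours): gradient steps on the mixing coefficients minimize output entropy, and the dominant coefficient identifies the task.

      import torch
      import torch.nn.functional as F

      def infer_task(x, fixed_weight, masks, forward_fn, steps=1, lr=1.0):
          # Superpose the learned supermasks, run the fixed random network, and
          # descend the output entropy with respect to the mixing coefficients.
          alpha = torch.zeros(len(masks), requires_grad=True)
          for _ in range(steps):
              mix = F.softmax(alpha, dim=0)
              mixed_mask = sum(w * m for w, m in zip(mix, masks))
              p = F.softmax(forward_fn(x, fixed_weight * mixed_mask), dim=-1)
              entropy = -(p * p.clamp_min(1e-12).log()).sum(dim=-1).mean()
              entropy.backward()
              with torch.no_grad():
                  alpha -= lr * alpha.grad
                  alpha.grad = None
          return int(alpha.argmax())  # index of the inferred task's mask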

  7. Book ; Online: Multifaceted Feature Visualization

    Nguyen, Anh / Yosinski, Jason / Clune, Jeff

    Uncovering the Different Types of Features Learned By Each Neuron in Deep Neural Networks

    2016  

    Abstract We can better understand deep neural networks by identifying which features each of their neurons have learned to detect. To do so, researchers have created Deep Visualization techniques including activation maximization, which synthetically generates inputs (e.g. images) that maximally activate each neuron. A limitation of current techniques is that they assume each neuron detects only one type of feature, but we know that neurons can be multifaceted, in that they fire in response to many different types of features: for example, a grocery store class neuron must activate either for rows of produce or for a storefront. Previous activation maximization techniques constructed images without regard for the multiple different facets of a neuron, creating inappropriate mixes of colors, parts of objects, scales, orientations, etc. Here, we introduce an algorithm that explicitly uncovers the multiple facets of each neuron by producing a synthetic visualization of each of the types of images that activate a neuron. We also introduce regularization methods that produce state-of-the-art results in terms of the interpretability of images obtained by activation maximization. By separately synthesizing each type of image a neuron fires in response to, the visualizations have more appropriate colors and coherent global structure. Multifaceted feature visualization thus provides a clearer and more comprehensive description of the role of each neuron.

    Comment: 23 pages (including SI), 24 figures
    Keywords Computer Science - Neural and Evolutionary Computing ; Computer Science - Computer Vision and Pattern Recognition
    Subject code 612
    Publishing date 2016-02-11
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

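    The baseline the paper builds on, plain activation maximization, can be sketched directly; the paper's contribution adds per-facet initializations and stronger regularization on top of this. Input shape and optimizer settings are illustrative.

      import torch

      def activation_maximization(model, neuron_index, steps=200, lr=0.05):
          # Gradient ascent on the input image to maximally activate one neuron.
          img = torch.randn(1, 3, 224, 224, requires_grad=True)
          opt = torch.optim.Adam([img], lr=lr)
          for _ in range(steps):
              opt.zero_grad()
              (-model(img)[0, neuron_index]).backward()  # ascend the activation
              opt.step()
          return img.detach()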

  8. Book ; Online: First-Order Preconditioning via Hypergradient Descent

    Moskovitz, Ted / Wang, Rui / Lan, Janice / Kapoor, Sanyam / Miconi, Thomas / Yosinski, Jason / Rawal, Aditya

    2019  

    Abstract Standard gradient descent methods are susceptible to a range of issues that can impede training, such as high correlations and different scaling in parameter space. These difficulties can be addressed by second-order approaches that apply a pre-conditioning matrix to the gradient to improve convergence. Unfortunately, such algorithms typically struggle to scale to high-dimensional problems, in part because the calculation of specific preconditioners such as the inverse Hessian or Fisher information matrix is highly expensive. We introduce first-order preconditioning (FOP), a fast, scalable approach that generalizes previous work on hypergradient descent (Almeida et al., 1998; Maclaurin et al., 2015; Baydin et al., 2017) to learn a preconditioning matrix that only makes use of first-order information. Experiments show that FOP is able to improve the performance of standard deep learning optimizers on visual classification and reinforcement learning tasks with minimal computational overhead. We also investigate the properties of the learned preconditioning matrices and perform a preliminary theoretical analysis of the algorithm.
    Keywords Computer Science - Machine Learning ; Statistics - Machine Learning
    Subject code 004
    Publishing date 2019-10-18
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

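    A minimal sketch of the hypergradient idea, assuming a flat parameter vector theta and a square preconditioner P (both torch tensors with requires_grad=True, names ours): parameters are updated with a P-preconditioned gradient, and P itself follows the gradient of the subsequent loss.

      import torch

      def fop_step(loss_fn, theta, P, lr_theta=1e-2, lr_P=1e-4):
          # Update parameters with the learned preconditioner P, then adjust P by
          # hypergradient descent: the gradient of the *next* loss with respect to P.
          g = torch.autograd.grad(loss_fn(theta), theta)[0]
          theta_new = theta - lr_theta * (P @ g)  # preconditioned update
          dP = torch.autograd.grad(loss_fn(theta_new), P)[0]
          with torch.no_grad():
              P -= lr_P * dP
          return theta_new.detach().requires_grad_(True), P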

  9. Conference proceedings ; Online: Plug and Play Language Models

    Dathathri, Sumanth / Madotto, Andrea / Lan, Janice / Hung, Jane / Frank, Eric / Molino, Piero / Yosinski, Jason / Liu, Rosanne

    A Simple Approach to Controlled Text Generation

    2020  

    Abstract Large transformer-based language models (LMs) trained on huge text corpora have shown unparalleled generation capabilities. However, controlling attributes of the generated language (e.g. switching topic or sentiment) is difficult without modifying the model architecture or fine-tuning on attribute-specific data and entailing the significant cost of retraining. We propose a simple alternative: the Plug and Play Language Model (PPLM) for controllable language generation, which combines a pretrained LM with one or more simple attribute classifiers that guide text generation without any further training of the LM. In the canonical scenario we present, the attribute models are simple classifiers consisting of a user-specified bag of words or a single learned layer with 100,000 times fewer parameters than the LM. Sampling entails a forward and backward pass in which gradients from the attribute model push the LM’s hidden activations and thus guide the generation. Model samples demonstrate control over a range of topics and sentiment styles, and extensive automated and human annotated evaluations show attribute alignment and fluency. PPLMs are flexible in that any combination of differentiable attribute models may be used to steer text generation, which will allow for diverse and creative applications beyond the examples given in this paper.
    Keywords covid19
    Subject code 410
    Language English
    Publishing country hk
    Document type Conference proceedings ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

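    The sampling-time perturbation can be sketched as follows, assuming an attribute_model that maps hidden states to attribute logits (names ours). This is a simplification: the paper also uses KL and fusion terms to keep the LM fluent.

      import torch
      import torch.nn.functional as F

      def pplm_perturb(hidden, attribute_model, target, steps=3, step_size=0.02):
          # Nudge the LM's hidden activations along the gradient of a small
          # attribute classifier; the frozen LM then decodes from the shifted
          # activations, steering generation toward the target attribute.
          delta = torch.zeros_like(hidden, requires_grad=True)
          for _ in range(steps):
              loss = F.cross_entropy(attribute_model(hidden + delta), target)
              loss.backward()
              with torch.no_grad():
                  delta -= step_size * delta.grad / (delta.grad.norm() + 1e-10)
                  delta.grad = None
          return (hidden + delta).detach()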

  10. Article: Automated Identification of Northern Leaf Blight-Infected Maize Plants from Field Imagery Using Deep Learning.

    DeChant, Chad / Wiesner-Hanks, Tyr / Chen, Siyuan / Stewart, Ethan L / Yosinski, Jason / Gore, Michael A / Nelson, Rebecca J / Lipson, Hod

    Phytopathology

    2017  Volume 107, Issue 11, Page(s) 1426–1432

    Abstract Northern leaf blight (NLB) can cause severe yield loss in maize; however, scouting large areas to accurately diagnose the disease is time consuming and difficult. We demonstrate a system capable of automatically identifying NLB lesions in field-acquired images of maize plants with high reliability. This approach uses a computational pipeline of convolutional neural networks (CNNs) that addresses the challenges of limited data and the myriad irregularities that appear in images of field-grown plants. Several CNNs were trained to classify small regions of images as containing NLB lesions or not; their predictions were combined into separate heat maps, then fed into a final CNN trained to classify the entire image as containing diseased plants or not. The system achieved 96.7% accuracy on test set images not used in training. We suggest that such systems mounted on aerial- or ground-based vehicles can help in automated high-throughput plant phenotyping, precision breeding for disease resistance, and reduced pesticide use through targeted application across a variety of plant and disease categories.
    Language English
    Publishing date 2017
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 208889-7
    ISSN (online) 1943-7684
    ISSN (print) 0031-949X
    DOI 10.1094/PHYTO-11-16-0417-R
    Database MEDical Literature Analysis and Retrieval System OnLINE

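    The three-stage pipeline reads naturally as code. The sketch below assumes patch CNNs that return a single lesion logit and a final CNN over stacked heat maps; tiling parameters and model interfaces are illustrative.

      import torch

      def classify_field_image(image, patch_models, final_model, patch=224, stride=224):
          # Stage 1: several CNNs score image tiles for lesions, giving one heat
          # map per model. Stage 2: a final CNN classifies the stacked heat maps.
          _, H, W = image.shape
          heatmaps = []
          with torch.no_grad():
              for m in patch_models:
                  rows = []
                  for y in range(0, H - patch + 1, stride):
                      row = []
                      for x in range(0, W - patch + 1, stride):
                          tile = image[:, y:y + patch, x:x + patch].unsqueeze(0)
                          row.append(torch.sigmoid(m(tile)).item())  # P(lesion in tile)
                      rows.append(row)
                  heatmaps.append(torch.tensor(rows))
              stacked = torch.stack(heatmaps).unsqueeze(0)  # (1, n_models, h, w)
              return torch.sigmoid(final_model(stacked))    # P(image shows disease)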
