LIVIVO - The Search Portal for Life Sciences


Your recent searches

  1. AU="Hoffman, Judy"
  2. AU="Schubert, Dirk"
  3. AU=Jia Xiao-yu
  4. AU="Patra, Dhabaleswar"
  5. AU="Knill, Carly"
  6. AU=Jabbour Elias
  7. AU="Rodríguez-Maresca, Manuel Ángel"
  8. AU="Yang, Chang-Jung"
  9. AU="Atul Kaushik"
  10. AU="Peters, Jaime"
  11. AU="Dorothee von Laer"
  12. AU="Sreeja Attur"
  13. AU=Song Kyung Chul
  14. AU=Klimovich Pavel V.
  15. AU="Jingbo Chen"
  16. AU="Viazlo, Oleksander"
  17. AU="Toshiki Iwabuchi"
  18. AU="Dissanayake, Lakmali"
  19. AU="Michael Denkinger"
  20. AU="Abilio J. F. N. Sobral"
  21. AU="Geller, Alan"
  22. AU=Petrat Sören
  23. AU="Sterling, Shanique"
  24. AU="Qi, Zeqiang"
  25. AU="Thongstisubskul, A"
  26. AU="Daniel C. Schneider, PhD"
  27. AU="Völker, Christoph"
  28. AU="El Aoud, S"
  29. AU="Yi, Tongpei"
  30. AU="Anil K. Mantha"
  31. AU="Artzner, Christoph"
  32. AU=Diana Giovanni
  33. AU="Kinloch, Sabine"
  34. AU="Nuertey, David"
  35. AU="Ojubolamo, Olakunle"

Search results

Hits 1 - 10 of 35 in total

  1. Book ; Online: Token Merging for Fast Stable Diffusion

    Bolya, Daniel / Hoffman, Judy

    2023  

    Abstract The landscape of image generation has been forever changed by open vocabulary diffusion models. However, at their core these models use transformers, which makes generation slow. Better implementations to increase the throughput of these transformers have emerged, but they still evaluate the entire model. In this paper, we instead speed up diffusion models by exploiting natural redundancy in generated images by merging redundant tokens. After making some diffusion-specific improvements to Token Merging (ToMe), our ToMe for Stable Diffusion can reduce the number of tokens in an existing Stable Diffusion model by up to 60% while still producing high quality images without any extra training. In the process, we speed up image generation by up to 2x and reduce memory consumption by up to 5.6x. Furthermore, this speed-up stacks with efficient implementations such as xFormers, minimally impacting quality while being up to 5.4x faster for large images. Code is available at https://github.com/dbolya/tomesd.

    Comment: Check out the code at https://github.com/dbolya/tomesd
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject/Category (Code) 006
    Publication date 2023-03-30
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (Life Sciences Selection)
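
    The core token-merging step is easy to illustrate independently of the paper's diffusion-specific changes. Below is a minimal Python sketch of bipartite token merging, assuming cosine similarity and plain averaging of matched pairs; merge_tokens is a hypothetical illustration, not the authors' tomesd implementation.

        import torch

        def merge_tokens(x: torch.Tensor, r: int) -> torch.Tensor:
            """Reduce an (N, C) token matrix by r tokens.

            Tokens are split into two alternating sets; each token in set A
            is matched to its most similar token in set B, and the r most
            similar pairs are merged by averaging (duplicate matches are
            simply overwritten here, which a real implementation would
            reduce properly)."""
            a, b = x[::2], x[1::2]                    # alternating bipartite split
            a_n = a / a.norm(dim=-1, keepdim=True)
            b_n = b / b.norm(dim=-1, keepdim=True)
            scores = a_n @ b_n.T                      # cosine similarity, A x B
            best_val, best_idx = scores.max(dim=-1)   # best partner in B per A-token
            order = best_val.argsort(descending=True)
            src, keep = order[:r], order[r:]          # r most redundant A-tokens
            b = b.clone()
            b[best_idx[src]] = (b[best_idx[src]] + a[src]) / 2
            return torch.cat([a[keep], b], dim=0)

        tokens = torch.randn(64, 320)                 # e.g. one latent's tokens
        print(merge_tokens(tokens, r=16).shape)       # torch.Size([48, 320])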


  2. Book ; Online: Mitigating Bias in Visual Transformers via Targeted Alignment

    Sudhakar, Sruthi / Prabhu, Viraj / Krishnakumar, Arvindkumar / Hoffman, Judy

    2023  

    Abstract As transformer architectures become increasingly prevalent in computer vision, it is critical to understand their fairness implications. We perform the first study of the fairness of transformers applied to computer vision and benchmark several bias mitigation approaches from prior work. We visualize the feature space of the transformer self-attention modules and discover that a significant portion of the bias is encoded in the query matrix. With this knowledge, we propose TADeT, a targeted alignment strategy for debiasing transformers that aims to discover and remove bias primarily from query matrix features. We measure performance using Balanced Accuracy and Standard Accuracy, and fairness using Equalized Odds and Balanced Accuracy Difference. TADeT consistently leads to improved fairness over prior work on multiple attribute prediction tasks on the CelebA dataset, without compromising performance.
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject/Category (Code) 004
    Publication date 2023-02-08
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (Life Sciences Selection)
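
    The targeted-alignment idea can be pictured as an auxiliary penalty that pulls the query-feature statistics of two protected groups together. A minimal sketch, assuming a simple mean-difference penalty rather than the paper's exact TADeT objective:

        import torch

        def query_alignment_penalty(q_feats: torch.Tensor,
                                    group: torch.Tensor) -> torch.Tensor:
            """Squared distance between per-group means of (N, D)
            self-attention query features, added to the task loss with
            some weight; group holds binary protected-attribute labels."""
            mu0 = q_feats[group == 0].mean(dim=0)
            mu1 = q_feats[group == 1].mean(dim=0)
            return (mu0 - mu1).pow(2).sum()

        q = torch.randn(8, 64, requires_grad=True)    # toy query features
        g = torch.tensor([0, 1, 0, 1, 1, 0, 0, 1])
        loss = query_alignment_penalty(q, g)
        loss.backward()                               # gradients reach the queries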


  3. Book ; Online: Window Attention is Bugged

    Bolya, Daniel / Ryali, Chaitanya / Hoffman, Judy / Feichtenhofer, Christoph

    How not to Interpolate Position Embeddings

    2023  

    Abstract Window attention, position embeddings, and high resolution finetuning are core concepts in the modern transformer era of computer vision. However, we find that naively combining these near ubiquitous components can have a detrimental effect on performance. The issue is simple: interpolating position embeddings while using window attention is wrong. We study two state-of-the-art methods that have these three components, namely Hiera and ViTDet, and find that both do indeed suffer from this bug. To fix it, we introduce a simple absolute window position embedding strategy, which solves the bug outright in Hiera and allows us to increase both speed and performance of the model in ViTDet. We finally combine the two to obtain HieraDet, which achieves 61.7 box mAP on COCO, making it state-of-the-art for models that only use ImageNet-1k pretraining. This all stems from what is essentially a 3 line bug fix, which we name "absolute win".

    Comment: Preprint. Code release will be coming in the future
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Publication date 2023-11-09
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (Life Sciences Selection)
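
    The gist of the "absolute win" fix can be sketched in a few lines: keep the learned embedding at its native window size and tile it across windows, instead of interpolating one global embedding over a larger grid, which stretches the pattern across window boundaries. Names below are illustrative, not from the paper's code.

        import torch

        def absolute_window_pos_embed(embed_win: torch.Tensor,
                                      h_windows: int, w_windows: int) -> torch.Tensor:
            """Tile a (win, win, C) per-window position embedding over the
            full feature map, so every window sees the identical pattern
            that window attention was trained with."""
            return embed_win.repeat(h_windows, w_windows, 1)

        embed_win = torch.randn(8, 8, 96)             # learned 8x8 window embedding
        full = absolute_window_pos_embed(embed_win, h_windows=4, w_windows=4)
        print(full.shape)                             # torch.Size([32, 32, 96])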


  4. Book ; Online: LANCE

    Prabhu, Viraj / Yenamandra, Sriram / Chattopadhyay, Prithvijit / Hoffman, Judy

    Stress-testing Visual Models by Generating Language-guided Counterfactual Images

    2023  

    Abstract We propose an automated algorithm to stress-test a trained visual model by generating language-guided counterfactual test images (LANCE). Our method leverages recent progress in large language modeling and text-based image editing to augment an IID test set with a suite of diverse, realistic, and challenging test images without altering model weights. We benchmark the performance of a diverse set of pre-trained models on our generated data and observe significant and consistent performance drops. We further analyze model sensitivity across different types of edits, and demonstrate its applicability at surfacing previously unknown class-level model biases in ImageNet. Code is available at https://github.com/virajprabhu/lance.

    Comment: NeurIPS 2023 camera ready. Project webpage: https://virajprabhu.github.io/lance-web/
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Computation and Language ; Computer Science - Machine Learning
    Subject/Category (Code) 004
    Publication date 2023-05-30
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (Life Sciences Selection)
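
    The stress-testing loop itself is a small pipeline around three pluggable models. A schematic sketch with the captioner, language perturber, and image editor passed in as callables; all stand-ins here are hypothetical placeholders, not the LANCE code.

        from typing import Callable, List

        def stress_test(images: List[object],
                        model: Callable[[object], str],
                        caption: Callable[[object], str],
                        perturb: Callable[[str], str],
                        edit: Callable[[object, str], object]) -> float:
            """Fraction of images whose prediction flips under a
            language-guided counterfactual edit (model weights untouched)."""
            flips = 0
            for img in images:
                before = model(img)
                new_caption = perturb(caption(img))    # e.g. an LLM rewrites one attribute
                after = model(edit(img, new_caption))  # text-based image editor
                flips += before != after
            return flips / len(images)

        # Toy stand-ins so the sketch runs end to end.
        rate = stress_test(
            images=["img0", "img1"],
            model=lambda x: "dog" if "edited" not in x else "cat",
            caption=lambda x: f"a photo of {x}",
            perturb=lambda c: c + ", in snow",
            edit=lambda x, c: x + " edited",
        )
        print(rate)                                    # 1.0 with these toy stand-ins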


  5. Book ; Online: ICON$^2$

    Sudhakar, Sruthi / Prabhu, Viraj / Russakovsky, Olga / Hoffman, Judy

    Reliably Benchmarking Predictive Inequity in Object Detection

    2023  

    Abstract As computer vision systems are being increasingly deployed at scale in high-stakes applications like autonomous driving, concerns about social bias in these systems are rising. Analysis of fairness in real-world vision systems, such as object detection in driving scenes, has been limited to observing predictive inequity across attributes such as pedestrian skin tone, and lacks a consistent methodology to disentangle the role of confounding variables e.g. does my model perform worse for a certain skin tone, or are such scenes in my dataset more challenging due to occlusion and crowds? In this work, we introduce ICON$^2$, a framework for robustly answering this question. ICON$^2$ leverages prior knowledge on the deficiencies of object detection systems to identify performance discrepancies across sub-populations, compute correlations between these potential confounders and a given sensitive attribute, and control for the most likely confounders to obtain a more reliable estimate of model bias. Using our approach, we conduct an in-depth study on the performance of object detection with respect to income from the BDD100K driving dataset, revealing useful insights.

    Comment: Accepted to CVPR 2023 SSAD Workshop
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject/Category (Code) 004
    Publication date 2023-06-07
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (Life Sciences Selection)
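
    The confounder-control step can be illustrated with a stratified comparison on synthetic data: measure the group performance gap within strata of the most correlated confounder rather than over the pooled data. A hypothetical numpy sketch, not the ICON$^2$ code:

        import numpy as np

        rng = np.random.default_rng(0)
        n = 2000
        sensitive = rng.integers(0, 2, n)              # e.g. low/high-income region
        occlusion = rng.random(n) + 0.3 * sensitive    # confounder correlated with it
        correct = (rng.random(n) > 0.2 + 0.4 * occlusion).astype(float)

        print("corr(confounder, attribute):",
              round(np.corrcoef(occlusion, sensitive)[0, 1], 3))
        print("raw gap:", round(correct[sensitive == 1].mean()
                                - correct[sensitive == 0].mean(), 3))

        # Control for the confounder: compare groups within occlusion strata.
        bins = np.digitize(occlusion, np.quantile(occlusion, [0.25, 0.5, 0.75]))
        gaps = [correct[(bins == b) & (sensitive == 1)].mean()
                - correct[(bins == b) & (sensitive == 0)].mean()
                for b in range(4)]
        print("stratified gap:", round(float(np.mean(gaps)), 3))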


  6. Book ; Online: Benchmarking Low-Shot Robustness to Natural Distribution Shifts

    Singh, Aaditya / Sarangmath, Kartik / Chattopadhyay, Prithvijit / Hoffman, Judy

    2023  

    Abstract Robustness to natural distribution shifts has seen remarkable progress thanks to recent pre-training strategies combined with better fine-tuning methods. However, such fine-tuning assumes access to large amounts of labelled data, and the extent to which the observations hold when the amount of training data is not as high remains unknown. We address this gap by performing the first in-depth study of robustness to various natural distribution shifts in different low-shot regimes: spanning datasets, architectures, pre-trained initializations, and state-of-the-art robustness interventions. Most importantly, we find that there is no single model of choice that is often more robust than others, and existing interventions can fail to improve robustness on some datasets even if they do so in the full-shot regime. We hope that our work will motivate the community to focus on this problem of practical importance.

    Comment: 22 Pages, 18 Tables, 12 Figures
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Artificial Intelligence ; Computer Science - Machine Learning
    Subject/Category (Code) 006
    Publication date 2023-04-21
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (Life Sciences Selection)
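
    The protocol itself is simple to state: fine-tune every (architecture, intervention) pair on k labelled examples per class and evaluate under each distribution shift. A skeletal sketch with hypothetical names, not the paper's harness:

        from itertools import product

        def benchmark(models, interventions, shots, finetune, evaluate, shifts):
            """Grid over low-shot regimes; returns
            {(model, intervention, k): {shift_name: accuracy}}."""
            results = {}
            for m, iv, k in product(models, interventions, shots):
                ckpt = finetune(m, intervention=iv, examples_per_class=k)
                results[(m, iv, k)] = {s: evaluate(ckpt, s) for s in shifts}
            return results

        # Toy stand-ins so the skeleton runs.
        out = benchmark(
            models=["resnet50", "vit-b"],
            interventions=["none", "robust-ft"],
            shots=[5, 25, 100],
            finetune=lambda m, intervention, examples_per_class: (m, intervention),
            evaluate=lambda ckpt, shift: 0.5,
            shifts=["id", "imagenet-r", "imagenet-sketch"],
        )
        print(len(out))                                # 12 configurations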


  7. Book ; Online: FACTS

    Yenamandra, Sriram / Ramesh, Pratik / Prabhu, Viraj / Hoffman, Judy

    First Amplify Correlations and Then Slice to Discover Bias

    2023  

    Abstract Computer vision datasets frequently contain spurious correlations between task-relevant labels and (easy to learn) latent task-irrelevant attributes (e.g. context). Models trained on such datasets learn "shortcuts" and underperform on bias-conflicting slices of data where the correlation does not hold. In this work, we study the problem of identifying such slices to inform downstream bias mitigation strategies. We propose First Amplify Correlations and Then Slice to Discover Bias (FACTS), wherein we first amplify correlations to fit a simple bias-aligned hypothesis via strongly regularized empirical risk minimization. Next, we perform correlation-aware slicing via mixture modeling in bias-aligned feature space to discover underperforming data slices that capture distinct correlations. Despite its simplicity, our method considerably improves over prior work (by as much as 35% precision@10) in correlation bias identification across a range of diverse evaluation settings. Our code is available at: https://github.com/yvsriram/FACTS.

    Comment: Accepted to ICCV 2023
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Publication date 2023-09-29
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (Life Sciences Selection)
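
    The two stages map onto standard tools: fit a strongly regularized classifier so it leans on the shortcut, then fit a mixture model in its feature space (simplified here to 1-D logit scores) to surface underperforming slices. A toy scikit-learn sketch, not the released FACTS code:

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.mixture import GaussianMixture

        rng = np.random.default_rng(0)
        X = rng.normal(size=(1000, 16))
        spurious = X[:, 0] > 0                         # easy-to-learn latent attribute
        y = np.where(rng.random(1000) < 0.9, spurious, ~spurious).astype(int)

        # Stage 1: amplify correlations -- strongly regularized ERM
        # (small C = high L2) pushes the model onto the shortcut.
        clf = LogisticRegression(C=1e-3).fit(X, y)

        # Stage 2: slice discovery -- mixture modeling over the
        # bias-aligned feature space (the model's logit scores here).
        scores = clf.decision_function(X).reshape(-1, 1)
        slices = GaussianMixture(n_components=2, random_state=0).fit_predict(scores)
        for s in range(2):
            mask = slices == s
            acc = (clf.predict(X[mask]) == y[mask]).mean()
            print(f"slice {s}: n={mask.sum()}, accuracy={acc:.2f}")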


  8. Book ; Online: We're Not Using Videos Effectively

    Kareer, Simar / Vijaykumar, Vivek / Maheshwari, Harsh / Chattopadhyay, Prithvijit / Hoffman, Judy / Prabhu, Viraj

    An Updated Domain Adaptive Video Segmentation Baseline

    2024  

    Abstract There has been abundant work in unsupervised domain adaptation for semantic segmentation (DAS) seeking to adapt a model trained on images from a labeled source domain to an unlabeled target domain. While the vast majority of prior work has studied this as a frame-level Image-DAS problem, a few Video-DAS works have sought to additionally leverage the temporal signal present in adjacent frames. However, Video-DAS works have historically studied a distinct set of benchmarks from Image-DAS, with minimal cross-benchmarking. In this work, we address this gap. Surprisingly, we find that (1) even after carefully controlling for data and model architecture, state-of-the-art Image-DAS methods (HRDA and HRDA+MIC) outperform Video-DAS methods on established Video-DAS benchmarks (+14.5 mIoU on Viper$\rightarrow$CityscapesSeq, +19.0 mIoU on Synthia$\rightarrow$CityscapesSeq), and (2) naive combinations of Image-DAS and Video-DAS techniques only lead to marginal improvements across datasets. To avoid siloed progress between Image-DAS and Video-DAS, we open-source our codebase with support for a comprehensive set of Video-DAS and Image-DAS methods on a common benchmark. Code available at https://github.com/SimarKareer/UnifiedVideoDA

    Comment: TMLR 2024
    Keywords Computer Science - Computer Vision and Pattern Recognition
    Subject/Category (Code) 004
    Publication date 2024-02-01
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (Life Sciences Selection)
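
    The "temporal signal" that Video-DAS methods exploit is typically some form of cross-frame consistency. A minimal sketch of one common ingredient, keeping target-domain pseudo-labels only where adjacent frames agree; this is illustrative, not this paper's baseline, and assumes frame t+1 has already been warped into frame t's coordinates.

        import torch

        def consistent_pseudo_labels(logits_t: torch.Tensor,
                                     logits_t1: torch.Tensor,
                                     thresh: float = 0.9) -> torch.Tensor:
            """Per-pixel pseudo-labels from (C, H, W) segmentation logits
            of two adjacent frames; disagreeing or low-confidence pixels
            are marked -1 and ignored during self-training."""
            prob_t, lab_t = logits_t.softmax(0).max(0)
            prob_t1, lab_t1 = logits_t1.softmax(0).max(0)
            keep = (lab_t == lab_t1) & (prob_t > thresh) & (prob_t1 > thresh)
            return torch.where(keep, lab_t, torch.full_like(lab_t, -1))

        labels = consistent_pseudo_labels(torch.randn(19, 64, 64),
                                          torch.randn(19, 64, 64))
        print((labels >= 0).float().mean())            # fraction of retained pixels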


  9. Book ; Online: ZipIt! Merging Models from Different Tasks without Training

    Stoica, George / Bolya, Daniel / Bjorner, Jakob / Hearn, Taylor / Hoffman, Judy

    2023  

    Abstract Typical deep visual recognition models are capable of performing the one task they were trained on. In this paper, we tackle the extremely difficult problem of combining completely distinct models with different initializations, each solving a separate task, into one multi-task model without any additional training. Prior work in model merging permutes one model to the space of the other then adds them together. While this works for models trained on the same task, we find that this fails to account for the differences in models trained on disjoint tasks. Thus, we introduce "ZipIt!", a general method for merging two arbitrary models of the same architecture that incorporates two simple strategies. First, in order to account for features that aren't shared between models, we expand the model merging problem to additionally allow for merging features within each model by defining a general "zip" operation. Second, we add support for partially zipping the models up until a specified layer, naturally creating a multi-head model. We find that these two changes combined account for a staggering 20-60% improvement over prior work, making the merging of models trained on disjoint tasks feasible.
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Machine Learning
    Subject/Category (Code) 004
    Publication date 2023-05-04
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (Life Sciences Selection)
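
    The "zip" operation can be pictured on a single layer: concatenate both models' unit activations on shared inputs, then greedily merge the most correlated unit pairs, which may come from within one model as well as across models. A toy sketch of that matching step, not the released ZipIt! code:

        import torch

        def zip_features(feats_a: torch.Tensor, feats_b: torch.Tensor):
            """Greedily pair the 2D units of two same-architecture layers
            down to D merged units. feats_*: (N, D) activations on shared
            inputs. Unlike permutation-based merging, a pair may join two
            units of the same model."""
            f = torch.cat([feats_a, feats_b], dim=1)       # (N, 2D)
            f = (f - f.mean(0)) / (f.std(0) + 1e-6)
            corr = (f.T @ f) / f.shape[0]                  # (2D, 2D) correlations
            corr.fill_diagonal_(float("-inf"))
            pairs = []
            for _ in range(feats_a.shape[1]):              # D merges in total
                idx = int(corr.argmax())
                i, j = idx // corr.shape[1], idx % corr.shape[1]
                pairs.append((i, j))
                corr[[i, j], :] = float("-inf")            # units are consumed
                corr[:, [i, j]] = float("-inf")
            return pairs

        print(zip_features(torch.randn(256, 8), torch.randn(256, 8)))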


  10. Book ; Online: AUGCAL

    Chattopadhyay, Prithvijit / Goyal, Bharat / Ecsedi, Boglarka / Prabhu, Viraj / Hoffman, Judy

    Improving Sim2Real Adaptation by Uncertainty Calibration on Augmented Synthetic Images

    2023  

    Abstract Synthetic data (SIM) drawn from simulators have emerged as a popular alternative for training models where acquiring annotated real-world images is difficult. However, transferring models trained on synthetic images to real-world applications can be challenging due to appearance disparities. A commonly employed solution to counter this SIM2REAL gap is unsupervised domain adaptation, where models are trained using labeled SIM data and unlabeled REAL data. Mispredictions made by such SIM2REAL adapted models are often associated with miscalibration - stemming from overconfident predictions on real data. In this paper, we introduce AUGCAL, a simple training-time patch for unsupervised adaptation that improves SIM2REAL adapted models by - (1) reducing overall miscalibration, (2) reducing overconfidence in incorrect predictions and (3) improving confidence score reliability by better guiding misclassification detection - all while retaining or improving SIM2REAL performance. Given a base SIM2REAL adaptation algorithm, at training time, AUGCAL involves replacing vanilla SIM images with strongly augmented views (AUG intervention) and additionally optimizing for a training time calibration loss on augmented SIM predictions (CAL intervention). We motivate AUGCAL using a brief analytical justification of how to reduce miscalibration on unlabeled REAL data. Through our experiments, we empirically show the efficacy of AUGCAL across multiple adaptation methods, backbones, tasks and shifts.
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Machine Learning
    Subject/Category (Code) 006
    Publication date 2023-12-10
    Country of publication us
    Document type Book ; Online
    Data source BASE - Bielefeld Academic Search Engine (Life Sciences Selection)
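
    Both interventions slot into an ordinary adaptation training step on the labeled SIM batch. A schematic sketch that uses a negative-entropy confidence penalty as a stand-in for the paper's calibration loss:

        import torch
        import torch.nn.functional as F

        def augcal_step(model, sim_x, sim_y, strong_aug, lam: float = 0.1):
            """One AUGCAL-style step: (AUG) train on strongly augmented
            SIM views instead of vanilla ones, (CAL) add a calibration
            term that discourages overconfident predictions."""
            logits = model(strong_aug(sim_x))              # AUG intervention
            task_loss = F.cross_entropy(logits, sim_y)
            probs = logits.softmax(dim=1)
            # Negative entropy: minimizing it pushes predictions away
            # from overconfidence (a simple calibration surrogate).
            cal_loss = (probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()
            return task_loss + lam * cal_loss

        model = torch.nn.Linear(32, 10)                    # toy classifier
        x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
        loss = augcal_step(model, x, y,
                           strong_aug=lambda t: t + 0.5 * torch.randn_like(t))
        loss.backward()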

