LIVIVO - The Search Portal for Life Sciences

Search results

Results 1-5 of 5

  1. Article ; Online: Robust and data-efficient generalization of self-supervised machine learning for diagnostic imaging.

    Azizi, Shekoofeh / Culp, Laura / Freyberg, Jan / Mustafa, Basil / Baur, Sebastien / Kornblith, Simon / Chen, Ting / Tomasev, Nenad / Mitrović, Jovana / Strachan, Patricia / Mahdavi, S Sara / Wulczyn, Ellery / Babenko, Boris / Walker, Megan / Loh, Aaron / Chen, Po-Hsuan Cameron / Liu, Yuan / Bavishi, Pinal / McKinney, Scott Mayer / Winkens, Jim / Roy, Abhijit Guha / Beaver, Zach / Ryan, Fiona / Krogue, Justin / Etemadi, Mozziyar / Telang, Umesh / Liu, Yun / Peng, Lily / Corrado, Greg S / Webster, Dale R / Fleet, David / Hinton, Geoffrey / Houlsby, Neil / Karthikesalingam, Alan / Norouzi, Mohammad / Natarajan, Vivek

    Nature biomedical engineering

    2023  Volume 7, Issue 6, Page(s) 756–779

    Abstract Machine-learning models for medical tasks can match or surpass the performance of clinical experts. However, in settings differing from those of the training dataset, the performance of a model can deteriorate substantially. Here we report a representation-learning strategy for machine-learning models applied to medical-imaging tasks that mitigates such 'out of distribution' performance problem and that improves model robustness and training efficiency. The strategy, which we named REMEDIS (for 'Robust and Efficient Medical Imaging with Self-supervision'), combines large-scale supervised transfer learning on natural images and intermediate contrastive self-supervised learning on medical images and requires minimal task-specific customization. We show the utility of REMEDIS in a range of diagnostic-imaging tasks covering six imaging domains and 15 test datasets, and by simulating three realistic out-of-distribution scenarios. REMEDIS improved in-distribution diagnostic accuracies up to 11.5% with respect to strong supervised baseline models, and in out-of-distribution settings required only 1-33% of the data for retraining to match the performance of supervised models retrained using all available data. REMEDIS may accelerate the development lifecycle of machine-learning models for medical imaging.
    MeSH term(s) Supervised Machine Learning ; Machine Learning ; Diagnostic Imaging
    Language English
    Publishing date 2023-06-08
    Publishing country England
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ISSN 2157-846X
    ISSN (online) 2157-846X
    DOI 10.1038/s41551-023-01049-7
    Database MEDical Literature Analysis and Retrieval System OnLINE

  2. Book ; Online: Evaluating AI systems under uncertain ground truth: a case study in dermatology

    Stutz, David / Cemgil, Ali Taylan / Roy, Abhijit Guha / Matejovicova, Tatiana / Barsbey, Melih / Strachan, Patricia / Schaekermann, Mike / Freyberg, Jan / Rikhye, Rajeev / Freeman, Beverly / Matos, Javier Perez / Telang, Umesh / Webster, Dale R. / Liu, Yuan / Corrado, Greg S. / Matias, Yossi / Kohli, Pushmeet / Liu, Yun / Doucet, Arnaud / Karthikesalingam, Alan

    2023  

    Abstract For safety, AI systems in health undergo thorough evaluations before deployment, validating their predictions against a ground truth that is assumed certain. However, this is actually not the case and the ground truth may be uncertain. Unfortunately, this is largely ignored in standard evaluation of AI models but can have severe consequences such as overestimating the future performance. To avoid this, we measure the effects of ground truth uncertainty, which we assume decomposes into two main components: annotation uncertainty which stems from the lack of reliable annotations, and inherent uncertainty due to limited observational information. This ground truth uncertainty is ignored when estimating the ground truth by deterministically aggregating annotations, e.g., by majority voting or averaging. In contrast, we propose a framework where aggregation is done using a statistical model. Specifically, we frame aggregation of annotations as posterior inference of so-called plausibilities, representing distributions over classes in a classification setting, subject to a hyper-parameter encoding annotator reliability. Based on this model, we propose a metric for measuring annotation uncertainty and provide uncertainty-adjusted metrics for performance evaluation. We present a case study applying our framework to skin condition classification from images where annotations are provided in the form of differential diagnoses. The deterministic adjudication process called inverse rank normalization (IRN) from previous work ignores ground truth uncertainty in evaluation. Instead, we present two alternative statistical models: a probabilistic version of IRN and a Plackett-Luce-based model. We find that a large portion of the dataset exhibits significant ground truth uncertainty and standard IRN-based evaluation severely over-estimates performance without providing uncertainty estimates.
    Keywords Computer Science - Machine Learning ; Computer Science - Computer Vision and Pattern Recognition ; Statistics - Methodology ; Statistics - Machine Learning
    Subject code 006
    Publishing date 2023-07-05
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  3. Article ; Online: Does your dermatology classifier know what it doesn't know? Detecting the long-tail of unseen conditions.

    Guha Roy, Abhijit / Ren, Jie / Azizi, Shekoofeh / Loh, Aaron / Natarajan, Vivek / Mustafa, Basil / Pawlowski, Nick / Freyberg, Jan / Liu, Yuan / Beaver, Zach / Vo, Nam / Bui, Peggy / Winter, Samantha / MacWilliams, Patricia / Corrado, Greg S / Telang, Umesh / Liu, Yun / Cemgil, Taylan / Karthikesalingam, Alan / Lakshminarayanan, Balaji / Winkens, Jim

    Medical image analysis

    2021  Volume 75, Page(s) 102274

    Abstract Supervised deep learning models have proven to be highly effective in classification of dermatological conditions. These models rely on the availability of abundant labeled training examples. However, in the real-world, many dermatological conditions are individually too infrequent for per-condition classification with supervised learning. Although individually infrequent, these conditions may collectively be common and therefore are clinically significant in aggregate. To prevent models from generating erroneous outputs on such examples, there remains a considerable unmet need for deep learning systems that can better detect such infrequent conditions. These infrequent 'outlier' conditions are seen very rarely (or not at all) during training. In this paper, we frame this task as an out-of-distribution (OOD) detection problem. We set up a benchmark ensuring that outlier conditions are disjoint between the model training, validation, and test sets. Unlike traditional OOD detection benchmarks where the task is to detect dataset distribution shift, we aim at the more challenging task of detecting subtle differences resulting from a different pathology or condition. We propose a novel hierarchical outlier detection (HOD) loss, which assigns multiple abstention classes corresponding to each training outlier class and jointly performs a coarse classification of inliers vs. outliers, along with fine-grained classification of the individual classes. We demonstrate that the proposed HOD loss based approach outperforms leading methods that leverage outlier data during training. Further, performance is significantly boosted by using recent representation learning methods (BiT, SimCLR, MICLe). Further, we explore ensembling strategies for OOD detection and propose a diverse ensemble selection process for the best result. We also perform a subgroup analysis over conditions of varying risk levels and different skin types to investigate how OOD performance changes over each subgroup and demonstrate the gains of our framework in comparison to baseline. Furthermore, we go beyond traditional performance metrics and introduce a cost matrix for model trust analysis to approximate downstream clinical impact. We use this cost matrix to compare the proposed method against the baseline, thereby making a stronger case for its effectiveness in real-world scenarios.
    MeSH term(s) Benchmarking ; Dermatology ; Humans
    Language English
    Publishing date 2021-10-20
    Publishing country Netherlands
    Document type Journal Article
    ZDB-ID 1356436-5
    ISSN 1361-8415
    ISSN (online) 1361-8423 ; 1361-8431
    DOI 10.1016/j.media.2021.102274
    Database MEDical Literature Analysis and Retrieval System OnLINE

  4. Book ; Online: Does Your Dermatology Classifier Know What It Doesn't Know? Detecting the Long-Tail of Unseen Conditions

    Roy, Abhijit Guha / Ren, Jie / Azizi, Shekoofeh / Loh, Aaron / Natarajan, Vivek / Mustafa, Basil / Pawlowski, Nick / Freyberg, Jan / Liu, Yuan / Beaver, Zach / Vo, Nam / Bui, Peggy / Winter, Samantha / MacWilliams, Patricia / Corrado, Greg S. / Telang, Umesh / Liu, Yun / Cemgil, Taylan / Karthikesalingam, Alan / Lakshminarayanan, Balaji / Winkens, Jim

    2021  

    Abstract We develop and rigorously evaluate a deep learning based system that can accurately classify skin conditions while detecting rare conditions for which there is not enough data available for training a confident classifier. We frame this task as an out-of-distribution (OOD) detection problem. Our novel approach, hierarchical outlier detection (HOD) assigns multiple abstention classes for each training outlier class and jointly performs a coarse classification of inliers vs. outliers, along with fine-grained classification of the individual classes. We demonstrate the effectiveness of the HOD loss in conjunction with modern representation learning approaches (BiT, SimCLR, MICLe) and explore different ensembling strategies for further improving the results. We perform an extensive subgroup analysis over conditions of varying risk levels and different skin types to investigate how the OOD detection performance changes over each subgroup and demonstrate the gains of our framework in comparison to baselines. Finally, we introduce a cost metric to approximate downstream clinical impact. We use this cost metric to compare the proposed method against a baseline system, thereby making a stronger case for the overall system effectiveness in a real-world deployment scenario.

    Comment: Under Review, 19 Pages
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Machine Learning
    Subject code 006
    Publishing date 2021-04-08
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

  5. Book ; Online: Robust and Efficient Medical Imaging with Self-Supervision

    Azizi, Shekoofeh / Culp, Laura / Freyberg, Jan / Mustafa, Basil / Baur, Sebastien / Kornblith, Simon / Chen, Ting / MacWilliams, Patricia / Mahdavi, S. Sara / Wulczyn, Ellery / Babenko, Boris / Wilson, Megan / Loh, Aaron / Chen, Po-Hsuan Cameron / Liu, Yuan / Bavishi, Pinal / McKinney, Scott Mayer / Winkens, Jim / Roy, Abhijit Guha / Beaver, Zach / Ryan, Fiona / Krogue, Justin / Etemadi, Mozziyar / Telang, Umesh / Liu, Yun / Peng, Lily / Corrado, Greg S. / Webster, Dale R. / Fleet, David / Hinton, Geoffrey / Houlsby, Neil / Karthikesalingam, Alan / Norouzi, Mohammad / Natarajan, Vivek

    2022  

    Abstract Recent progress in Medical Artificial Intelligence (AI) has delivered systems that can reach clinical expert level performance. However, such systems tend to demonstrate sub-optimal "out-of-distribution" performance when evaluated in clinical settings different from the training environment. A common mitigation strategy is to develop separate systems for each clinical setting using site-specific data [1]. However, this quickly becomes impractical as medical data is time-consuming to acquire and expensive to annotate [2]. Thus, the problem of "data-efficient generalization" presents an ongoing difficulty for Medical AI development. Although progress in representation learning shows promise, their benefits have not been rigorously studied, specifically for out-of-distribution settings. To meet these challenges, we present REMEDIS, a unified representation learning strategy to improve robustness and data-efficiency of medical imaging AI. REMEDIS uses a generic combination of large-scale supervised transfer learning with self-supervised learning and requires little task-specific customization. We study a diverse range of medical imaging tasks and simulate three realistic application scenarios using retrospective data. REMEDIS exhibits significantly improved in-distribution performance with up to 11.5% relative improvement in diagnostic accuracy over a strong supervised baseline. More importantly, our strategy leads to strong data-efficient generalization of medical imaging AI, matching strong supervised baselines using between 1% to 33% of retraining data across tasks. These results suggest that REMEDIS can significantly accelerate the life-cycle of medical imaging AI development thereby presenting an important step forward for medical imaging AI to deliver broad impact.
    Keywords Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Artificial Intelligence ; Computer Science - Machine Learning
    Subject code 006
    Publishing date 2022-05-19
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)
