Article; Online: Automated Quality Evaluation of Large-Scale Benchmark Datasets for Vision-Language Tasks.
International journal of neural systems
2024, Volume 34, Issue 3, Page(s) 2450009
Abstract | Large-scale benchmark datasets are crucial in advancing research within the computer science communities. They enable the development of more sophisticated AI models and serve as "golden" benchmarks for evaluating their performance. Thus, ensuring the quality of these datasets is of utmost importance for academic research and the progress of AI systems. For the emerging vision-language tasks, some datasets have been created and frequently used, such as Flickr30k, COCO, and NoCaps, which typically contain a large number of images paired with their ground-truth textual descriptions. In this paper, an automatic method is proposed to assess the quality of large-scale benchmark datasets designed for vision-language tasks. In particular, a new cross-modal matching model is developed, which is capable of automatically scoring the textual descriptions of visual images. Subsequently, this model is employed to evaluate the quality of vision-language datasets by automatically assigning a score to each 'ground-truth' description for every image. With a good agreement between manual and automated scoring results on the datasets, our findings reveal significant disparities in the quality of the ground-truth descriptions included in the benchmark datasets. Even more surprising, it is evident that a small portion of the descriptions are unsuitable for serving as reliable ground-truth references. These discoveries emphasize the need for careful utilization of these publicly accessible benchmark databases. |
---|---|
MeSH term(s) | Benchmarking ; Databases, Factual |
Language | English |
Publication date | 2024-02-06 |
Country of publication | Singapore |
Document type | Journal Article |
ISSN | 1793-6462 |
ISSN (online) | 1793-6462 |
DOI | 10.1142/S0129065724500096 |
Data source | MEDical Literature Analysis and Retrieval System OnLINE (MEDLINE) |