Buch ; Online: PaLI-X
On Scaling up a Multilingual Vision and Language Model
2023
Abstract: We present the training recipe and results of scaling up PaLI-X, a multilingual vision and language model, both in terms of size of the components and the breadth of its training task mixture. Our model achieves new levels of performance on a wide-range ... ...
Abstract | We present the training recipe and results of scaling up PaLI-X, a multilingual vision and language model, both in terms of size of the components and the breadth of its training task mixture. Our model achieves new levels of performance on a wide-range of varied and complex tasks, including multiple image-based captioning and question-answering tasks, image-based document understanding and few-shot (in-context) learning, as well as object detection, video question answering, and video captioning. PaLI-X advances the state-of-the-art on most vision-and-language benchmarks considered (25+ of them). Finally, we observe emerging capabilities, such as complex counting and multilingual object detection, tasks that are not explicitly in the training mix. |
---|---|
Schlagwörter | Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Computation and Language ; Computer Science - Machine Learning |
Erscheinungsdatum | 2023-05-29 |
Erscheinungsland | us |
Dokumenttyp | Buch ; Online |
Datenquelle | BASE - Bielefeld Academic Search Engine (Lebenswissenschaftliche Auswahl) |
Volltext online
Zusatzmaterialien
Kategorien
Fernleihe an ZB MED
Sie können sich den gewünschten Titel als lokale Nutzerin oder lokaler Nutzer von ZB MED direkt an den Standort Köln schicken lassen.