LIVIVO - The Search Portal for Life Sciences


Book ; Online: PaLI-X

Chen, Xi / Djolonga, Josip / Padlewski, Piotr / Mustafa, Basil / Changpinyo, Soravit / Wu, Jialin / Ruiz, Carlos Riquelme / Goodman, Sebastian / Wang, Xiao / Tay, Yi / Shakeri, Siamak / Dehghani, Mostafa / Salz, Daniel / Lucic, Mario / Tschannen, Michael / Nagrani, Arsha / Hu, Hexiang / Joshi, Mandar / Pang, Bo / Montgomery, Ceslee / Pietrzyk, Paulina / Ritter, Marvin / Piergiovanni, AJ / Minderer, Matthias / Pavetic, Filip / Waters, Austin / Li, Gang / Alabdulmohsin, Ibrahim / Beyer, Lucas / Amelot, Julien / Lee, Kenton / Steiner, Andreas Peter / Li, Yang / Keysers, Daniel / Arnab, Anurag / Xu, Yuanzhong / Rong, Keran / Kolesnikov, Alexander / Seyedhosseini, Mojtaba / Angelova, Anelia / Zhai, Xiaohua / Houlsby, Neil / Soricut, Radu

On Scaling up a Multilingual Vision and Language Model

2023  

Abstract: We present the training recipe and results of scaling up PaLI-X, a multilingual vision and language model, both in terms of size of the components and the breadth of its training task mixture. Our model achieves new levels of performance on a wide range of varied and complex tasks, including multiple image-based captioning and question-answering tasks, image-based document understanding and few-shot (in-context) learning, as well as object detection, video question answering, and video captioning. PaLI-X advances the state of the art on most vision-and-language benchmarks considered (25+ of them). Finally, we observe emerging capabilities, such as complex counting and multilingual object detection, tasks that are not explicitly in the training mix.
Keywords: Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Computation and Language ; Computer Science - Machine Learning
Publishing date: 2023-05-29
Publishing country: us
Document type: Book ; Online
Database: BASE - Bielefeld Academic Search Engine (life sciences selection)
