LIVIVO - Search results

Results 1 - 7 of 7

Book ; Online: Controlling High-Dimensional Data With Sparse Input

Iliescu, Dan Andrei / Mohan, Devang Savita Ram / Teh, Tian Huey / Hodari, Zack

2023

Abstract	We address the problem of human-in-the-loop control for generating highly-structured data. This task is challenging because existing generative models lack an efficient interface through which users can modify the output. Users have the option to either manually explore a non-interpretable latent space, or to laboriously annotate the data with conditioning labels. To solve this, we introduce a novel framework whereby an encoder maps a sparse, human interpretable control space onto the latent space of a generative model. We apply this framework to the task of controlling prosody in text-to-speech synthesis. We propose a model, called Multiple-Instance CVAE (MICVAE), that is specifically designed to encode sparse prosodic features and output complete waveforms. We show empirically that MICVAE displays desirable qualities of a sparse human-in-the-loop control mechanism: efficiency, robustness, and faithfulness. With even a very small number of input values (~4), MICVAE enables users to improve the quality of the output significantly, in terms of listener preference (4:1).
Comment	11 pages
Keywords	Electrical Engineering and Systems Science - Audio and Speech Processing ; Computer Science - Artificial Intelligence ; Computer Science - Computation and Language ; Computer Science - Machine Learning
Subject code	004
Publishing date	2023-03-14
Publishing country	us
Document type	Book ; Online
Database	BASE - Bielefeld Academic Search Engine (life sciences selection)

Full text online

Full text

Inter-library loan at ZB MED

Your chosen title can be delivered directly to ZB MED Cologne location if you are registered as a user at ZB MED Cologne.

Book ; Online: Ensemble prosody prediction for expressive speech synthesis

Teh, Tian Huey / Hu, Vivian / Mohan, Devang S Ram / Hodari, Zack / Wallis, Christopher G. R. / Ibarrondo, Tomás Gomez / Torresquintero, Alexandra / Leoni, James / Gales, Mark / King, Simon

2023

Abstract	Generating expressive speech with rich and varied prosody continues to be a challenge for Text-to-Speech. Most efforts have focused on sophisticated neural architectures intended to better model the data distribution. Yet, in evaluations it is generally found that no single model is preferred for all input texts. This suggests an approach that has rarely been used before for Text-to-Speech: an ensemble of models. We apply ensemble learning to prosody prediction. We construct simple ensembles of prosody predictors by varying either model architecture or model parameter values. To automatically select amongst the models in the ensemble when performing Text-to-Speech, we propose a novel, and computationally trivial, variance-based criterion. We demonstrate that even a small ensemble of prosody predictors yields useful diversity, which, combined with the proposed selection criterion, outperforms any individual model from the ensemble.
Comment	ICASSP 2023
Keywords	Electrical Engineering and Systems Science - Audio and Speech Processing
Publishing date	2023-04-03
Publishing country	us
Document type	Book ; Online
Database	BASE - Bielefeld Academic Search Engine (life sciences selection)

Book ; Online: Incremental Text to Speech for Neural Sequence-to-Sequence Models using Reinforcement Learning

Mohan, Devang S Ram / Lenain, Raphael / Foglianti, Lorenzo / Teh, Tian Huey / Staib, Marlene / Torresquintero, Alexandra / Gao, Jiameng

2020

Abstract	Modern approaches to text to speech require the entire input character sequence to be processed before any audio is synthesised. This latency limits the suitability of such models for time-sensitive tasks like simultaneous interpretation. Interleaving the action of reading a character with that of synthesising audio reduces this latency. However, the order of this sequence of interleaved actions varies across sentences, which raises the question of how the actions should be chosen. We propose a reinforcement learning based framework to train an agent to make this decision. We compare our performance against that of deterministic, rule-based systems. Our results demonstrate that our agent successfully balances the trade-off between the latency of audio generation and the quality of synthesised audio. More broadly, we show that neural sequence-to-sequence models can be adapted to run in an incremental manner.
Comment	To be published in Interspeech 2020. 5 pages, 4 figures
Keywords	Electrical Engineering and Systems Science - Audio and Speech Processing ; Computer Science - Machine Learning ; Computer Science - Sound ; Statistics - Machine Learning
Subject code	006
Publishing date	2020-08-07
Publishing country	us
Document type	Book ; Online
Database	BASE - Bielefeld Academic Search Engine (life sciences selection)

Book ; Online: Phonological Features for 0-shot Multilingual Speech Synthesis

Staib, Marlene / Teh, Tian Huey / Torresquintero, Alexandra / Mohan, Devang S Ram / Foglianti, Lorenzo / Lenain, Raphael / Gao, Jiameng

2020

Abstract	Code-switching (the intra-utterance use of multiple languages) is prevalent across the world. Within text-to-speech (TTS), multilingual models have been found to enable code-switching. By modifying the linguistic input to sequence-to-sequence TTS, we show that code-switching is possible for languages unseen during training, even within monolingual models. We use a small set of phonological features derived from the International Phonetic Alphabet (IPA), such as vowel height and frontness, consonant place and manner. This allows the model topology to stay unchanged for different languages, and enables new, previously unseen feature combinations to be interpreted by the model. We show that this allows us to generate intelligible, code-switched speech in a new language at test time, including the approximation of sounds never seen in training.
Comment	5 pages, to be presented at INTERSPEECH 2020
Keywords	Electrical Engineering and Systems Science - Audio and Speech Processing ; Computer Science - Machine Learning ; Computer Science - Sound
Subject code	410
Publishing date	2020-08-06
Publishing country	us
Document type	Book ; Online
Database	BASE - Bielefeld Academic Search Engine (life sciences selection)

Book ; Online: Ctrl-P: Temporal Control of Prosodic Variation for Speech Synthesis

Mohan, Devang S Ram / Hu, Vivian / Teh, Tian Huey / Torresquintero, Alexandra / Wallis, Christopher G. R. / Staib, Marlene / Foglianti, Lorenzo / Gao, Jiameng / King, Simon

2021

Abstract	Text does not fully specify the spoken form, so text-to-speech models must be able to learn from speech data that vary in ways not explained by the corresponding text. One way to reduce the amount of unexplained variation in training data is to provide acoustic information as an additional learning signal. When generating speech, modifying this acoustic information enables multiple distinct renditions of a text to be produced. Since much of the unexplained variation is in the prosody, we propose a model that generates speech explicitly conditioned on the three primary acoustic correlates of prosody: $F_{0}$, energy and duration. The model is flexible about how the values of these features are specified: they can be externally provided, or predicted from text, or predicted then subsequently modified. Compared to a model that employs a variational auto-encoder to learn unsupervised latent features, our model provides more interpretable, temporally-precise, and disentangled control. When automatically predicting the acoustic features from text, it generates speech that is more natural than that from a Tacotron 2 model with reference encoder. Subsequent human-in-the-loop modification of the predicted acoustic features can significantly further increase naturalness.
Comment	To be published in Interspeech 2021. 5 pages, 4 figures
Keywords	Electrical Engineering and Systems Science - Audio and Speech Processing ; Computer Science - Machine Learning ; Computer Science - Sound
Subject code	400
Publishing date	2021-06-15
Publishing country	us
Document type	Book ; Online
Database	BASE - Bielefeld Academic Search Engine (life sciences selection)

Book ; Online: ADEPT: A Dataset for Evaluating Prosody Transfer

Torresquintero, Alexandra / Teh, Tian Huey / Wallis, Christopher G. R. / Staib, Marlene / Mohan, Devang S Ram / Hu, Vivian / Foglianti, Lorenzo / Gao, Jiameng / King, Simon

2021

Abstract	Text-to-speech is now able to achieve near-human naturalness and research focus has shifted to increasing expressivity. One popular method is to transfer the prosody from a reference speech sample. There have been considerable advances in using prosody transfer to generate more expressive speech, but the field lacks a clear definition of what successful prosody transfer means and a method for measuring it. We introduce a dataset of prosodically-varied reference natural speech samples for evaluating prosody transfer. The samples include global variations reflecting emotion and interpersonal attitude, and local variations reflecting topical emphasis, propositional attitude, syntactic phrasing and marked tonicity. The corpus only includes prosodic variations that listeners are able to distinguish with reasonable accuracy, and we report these figures as a benchmark against which text-to-speech prosody transfer can be compared. We conclude the paper with a demonstration of our proposed evaluation methodology, using the corpus to evaluate two text-to-speech models that perform prosody transfer.
Comment	5 pages, 1 figure, accepted to Interspeech 2021
Keywords	Electrical Engineering and Systems Science - Audio and Speech Processing
Subject code	400
Publishing date	2021-06-15
Publishing country	us
Document type	Book ; Online
Database	BASE - Bielefeld Academic Search Engine (life sciences selection)

Book ; Online: Gemini

Gemini Team / Anil, Rohan / Borgeaud, Sebastian / Wu, Yonghui / Alayrac, Jean-Baptiste / Yu, Jiahui / Soricut, Radu / Schalkwyk, Johan / Dai, Andrew M. / Hauth, Anja / Millican, Katie / Silver, David / Petrov, Slav / Johnson, Melvin / Antonoglou, Ioannis / Schrittwieser, Julian / Glaese, Amelia / Chen, Jilin / Pitler, Emily /

Lillicrap, Timothy / Lazaridou, Angeliki / Firat, Orhan / Molloy, James / Isard, Michael / Barham, Paul R. / Hennigan, Tom / Lee, Benjamin / Viola, Fabio / Reynolds, Malcolm / Xu, Yuanzhong / Doherty, Ryan / Collins, Eli / Meyer, Clemens / Rutherford, Eliza / Moreira, Erica / Ayoub, Kareem / Goel, Megha / Tucker, George / Piqueras, Enrique / Krikun, Maxim / Barr, Iain / Savinov, Nikolay / Danihelka, Ivo / Roelofs, Becca / White, Anaïs / Andreassen, Anders / von Glehn, Tamara / Yagati, Lakshman / Kazemi, Mehran / Gonzalez, Lucas / Khalman, Misha / Sygnowski, Jakub / Frechette, Alexandre / Smith, Charlotte / Culp, Laura / Proleev, Lev / Luan, Yi / Chen, Xi / Lottes, James / Schucher, Nathan / Lebron, Federico / Rrustemi, Alban / Clay, Natalie / Crone, Phil / Kocisky, Tomas / Zhao, Jeffrey / Perz, Bartek / Yu, Dian / Howard, Heidi / Bloniarz, Adam / Rae, Jack W. / Lu, Han / Sifre, Laurent / Maggioni, Marcello / Alcober, Fred / Garrette, Dan / Barnes, Megan / Thakoor, Shantanu / Austin, Jacob / Barth-Maron, Gabriel / Wong, William / Joshi, Rishabh / Chaabouni, Rahma / Fatiha, Deeni / Ahuja, Arun / Liu, Ruibo / Li, Yunxuan / Cogan, Sarah / Chen, Jeremy / Jia, Chao / Gu, Chenjie / Zhang, Qiao / Grimstad, Jordan / Hartman, Ale Jakse / Chadwick, Martin / Tomar, Gaurav Singh / Garcia, Xavier / Senter, Evan / Taropa, Emanuel / Pillai, Thanumalayan Sankaranarayana / Devlin, Jacob / Laskin, Michael / Casas, Diego de Las / Valter, Dasha / Tao, Connie / Blanco, Lorenzo / Badia, Adrià Puigdomènech / Reitter, David / Chen, Mianna / Brennan, Jenny / Rivera, Clara / Brin, Sergey / Iqbal, Shariq / Surita, Gabriela / Labanowski, Jane / Rao, Abhi / Winkler, Stephanie / Parisotto, Emilio / Gu, Yiming / Olszewska, Kate / Zhang, Yujing / Addanki, Ravi / Miech, Antoine / Louis, Annie / Shafey, Laurent El / Teplyashin, Denis / Brown, Geoff / Catt, Elliot / Attaluri, Nithya / Balaguer, Jan / Xiang, Jackie / Wang, Pidong / Ashwood, Zoe / Briukhov, Anton / Webson, Albert / Ganapathy, Sanjay / 
Sanghavi, Smit / Kannan, Ajay / Chang, Ming-Wei / Stjerngren, Axel / Djolonga, Josip / Sun, Yuting / Bapna, Ankur / Aitchison, Matthew / Pejman, Pedram / Michalewski, Henryk / Yu, Tianhe / Wang, Cindy / Love, Juliette / Ahn, Junwhan / Bloxwich, Dawn / Han, Kehang / Humphreys, Peter / Sellam, Thibault / Bradbury, James / Godbole, Varun / Samangooei, Sina / Damoc, Bogdan / Kaskasoli, Alex / Arnold, Sébastien M. R. / Vasudevan, Vijay / Agrawal, Shubham / Riesa, Jason / Lepikhin, Dmitry / Tanburn, Richard / Srinivasan, Srivatsan / Lim, Hyeontaek / Hodkinson, Sarah / Shyam, Pranav / Ferret, Johan / Hand, Steven / Garg, Ankush / Paine, Tom Le / Li, Jian / Li, Yujia / Giang, Minh / Neitz, Alexander / Abbas, Zaheer / York, Sarah / Reid, Machel / Cole, Elizabeth / Chowdhery, Aakanksha / Das, Dipanjan / Rogozińska, Dominika / Nikolaev, Vitaly / Sprechmann, Pablo / Nado, Zachary / Zilka, Lukas / Prost, Flavien / He, Luheng / Monteiro, Marianne / Mishra, Gaurav / Welty, Chris / Newlan, Josh / Jia, Dawei / Allamanis, Miltiadis / Hu, Clara Huiyi / de Liedekerke, Raoul / Gilmer, Justin / Saroufim, Carl / Rijhwani, Shruti / Hou, Shaobo / Shrivastava, Disha / Baddepudi, Anirudh / Goldin, Alex / Ozturel, Adnan / Cassirer, Albin / Xu, Yunhan / Sohn, Daniel / Sachan, Devendra / Amplayo, Reinald Kim / Swanson, Craig / Petrova, Dessie / Narayan, Shashi / Guez, Arthur / Brahma, Siddhartha / Landon, Jessica / Patel, Miteyan / Zhao, Ruizhe / Villela, Kevin / Wang, Luyu / Jia, Wenhao / Rahtz, Matthew / Giménez, Mai / Yeung, Legg / Lin, Hanzhao / Keeling, James / Georgiev, Petko / Mincu, Diana / Wu, Boxi / Haykal, Salem / Saputro, Rachel / Vodrahalli, Kiran / Qin, James / Cankara, Zeynep / Sharma, Abhanshu / Fernando, Nick / Hawkins, Will / Neyshabur, Behnam / Kim, Solomon / Hutter, Adrian / Agrawal, Priyanka / Castro-Ros, Alex / Driessche, George van den / Wang, Tao / Yang, Fan / Chang, Shuo-yiin / Komarek, Paul / McIlroy, Ross / Lučić, Mario / Zhang, Guodong / Farhan, Wael / Sharman, 
Michael / Natsev, Paul / Michel, Paul / Cheng, Yong / Bansal, Yamini / Qiao, Siyuan / Cao, Kris / Shakeri, Siamak / Butterfield, Christina / Chung, Justin / Rubenstein, Paul Kishan / Agrawal, Shivani / Mensch, Arthur / Soparkar, Kedar / Lenc, Karel / Chung, Timothy / Pope, Aedan / Maggiore, Loren / Kay, Jackie / Jhakra, Priya / Wang, Shibo / Maynez, Joshua / Phuong, Mary / Tobin, Taylor / Tacchetti, Andrea / Trebacz, Maja / Robinson, Kevin / Katariya, Yash / Riedel, Sebastian / Bailey, Paige / Xiao, Kefan / Ghelani, Nimesh / Aroyo, Lora / Slone, Ambrose / Houlsby, Neil / Xiong, Xuehan / Yang, Zhen / Gribovskaya, Elena / Adler, Jonas / Wirth, Mateo / Lee, Lisa / Li, Music / Kagohara, Thais / Pavagadhi, Jay / Bridgers, Sophie / Bortsova, Anna / Ghemawat, Sanjay / Ahmed, Zafarali / Liu, Tianqi / Powell, Richard / Bolina, Vijay / Iinuma, Mariko / Zablotskaia, Polina / Besley, James / Chung, Da-Woon / Dozat, Timothy / Comanescu, Ramona / Si, Xiance / Greer, Jeremy / Su, Guolong / Polacek, Martin / Kaufman, Raphaël Lopez / Tokumine, Simon / Hu, Hexiang / Buchatskaya, Elena / Miao, Yingjie / Elhawaty, Mohamed / Siddhant, Aditya / Tomasev, Nenad / Xing, Jinwei / Greer, Christina / Miller, Helen / Ashraf, Shereen / Roy, Aurko / Zhang, Zizhao / Ma, Ada / Filos, Angelos / Besta, Milos / Blevins, Rory / Klimenko, Ted / Yeh, Chih-Kuan / Changpinyo, Soravit / Mu, Jiaqi / Chang, Oscar / Pajarskas, Mantas / Muir, Carrie / Cohen, Vered / Lan, Charline Le / Haridasan, Krishna / Marathe, Amit / Hansen, Steven / Douglas, Sholto / Samuel, Rajkumar / Wang, Mingqiu / Austin, Sophia / Lan, Chang / Jiang, Jiepu / Chiu, Justin / Lorenzo, Jaime Alonso / Sjösund, Lars Lowe / Cevey, Sébastien / Gleicher, Zach / Avrahami, Thi / Boral, Anudhyan / Srinivasan, Hansa / Selo, Vittorio / May, Rhys / Aisopos, Konstantinos / Hussenot, Léonard / Soares, Livio Baldini / Baumli, Kate / Chang, Michael B. 
/ Recasens, Adrià / Caine, Ben / Pritzel, Alexander / Pavetic, Filip / Pardo, Fabio / Gergely, Anita / Frye, Justin / Ramasesh, Vinay / Horgan, Dan / Badola, Kartikeya / Kassner, Nora / Roy, Subhrajit / Dyer, Ethan / Campos, Víctor / Tomala, Alex / Tang, Yunhao / Badawy, Dalia El / White, Elspeth / Mustafa, Basil / Lang, Oran / Jindal, Abhishek / Vikram, Sharad / Gong, Zhitao / Caelles, Sergi / Hemsley, Ross / Thornton, Gregory / Feng, Fangxiaoyu / Stokowiec, Wojciech / Zheng, Ce / Thacker, Phoebe / Ünlü, Çağlar / Zhang, Zhishuai / Saleh, Mohammad / Svensson, James / Bileschi, Max / Patil, Piyush / Anand, Ankesh / Ring, Roman / Tsihlas, Katerina / Vezer, Arpi / Selvi, Marco / Shevlane, Toby / Rodriguez, Mikel / Kwiatkowski, Tom / Daruki, Samira / Rong, Keran / Dafoe, Allan / FitzGerald, Nicholas / Gu-Lemberg, Keren / Khan, Mina / Hendricks, Lisa Anne / Pellat, Marie / Feinberg, Vladimir / Cobon-Kerr, James / Sainath, Tara / Rauh, Maribeth / Hashemi, Sayed Hadi / Ives, Richard / Hasson, Yana / Li, YaGuang / Noland, Eric / Cao, Yuan / Byrd, Nathan / Hou, Le / Wang, Qingze / Sottiaux, Thibault / Paganini, Michela / Lespiau, Jean-Baptiste / Moufarek, Alexandre / Hassan, Samer / Shivakumar, Kaushik / van Amersfoort, Joost / Mandhane, Amol / Joshi, Pratik / Goyal, Anirudh / Tung, Matthew / Brock, Andrew / Sheahan, Hannah / Misra, Vedant / Li, Cheng / Rakićević, Nemanja / Dehghani, Mostafa / Liu, Fangyu / Mittal, Sid / Oh, Junhyuk / Noury, Seb / Sezener, Eren / Huot, Fantine / Lamm, Matthew / De Cao, Nicola / Chen, Charlie / Elsayed, Gamaleldin / Chi, Ed / Mahdieh, Mahdis / Tenney, Ian / Hua, Nan / Petrychenko, Ivan / Kane, Patrick / Scandinaro, Dylan / Jain, Rishub / Uesato, Jonathan / Datta, Romina / Sadovsky, Adam / Bunyan, Oskar / Rabiej, Dominik / Wu, Shimu / Zhang, John / Vasudevan, Gautam / Leurent, Edouard / Alnahlawi, Mahmoud / Georgescu, Ionut / Wei, Nan / Zheng, Ivy / Chan, Betty / Rabinovitch, Pam G / Stanczyk, Piotr / Zhang, Ye / Steiner, David / Naskar, 
Subhajit / Azzam, Michael / Johnson, Matthew / Paszke, Adam / Chiu, Chung-Cheng / Elias, Jaume Sanchez / Mohiuddin, Afroz / Muhammad, Faizan / Miao, Jin / Lee, Andrew / Vieillard, Nino / Potluri, Sahitya / Park, Jane / Davoodi, Elnaz / Zhang, Jiageng / Stanway, Jeff / Garmon, Drew / Karmarkar, Abhijit / Dong, Zhe / Lee, Jong / Kumar, Aviral / Zhou, Luowei / Evens, Jonathan / Isaac, William / Chen, Zhe / Jia, Johnson / Levskaya, Anselm / Zhu, Zhenkai / Gorgolewski, Chris / Grabowski, Peter / Mao, Yu / Magni, Alberto / Yao, Kaisheng / Snaider, Javier / Casagrande, Norman / Suganthan, Paul / Palmer, Evan / Irving, Geoffrey / Loper, Edward / Faruqui, Manaal / Arkatkar, Isha / Chen, Nanxin / Shafran, Izhak / Fink, Michael / Castaño, Alfonso / Giannoumis, Irene / Kim, Wooyeol / Rybiński, Mikołaj / Sreevatsa, Ashwin / Prendki, Jennifer / Soergel, David / Goedeckemeyer, Adrian / Gierke, Willi / Jafari, Mohsen / Gaba, Meenu / Wiesner, Jeremy / Wright, Diana Gage / Wei, Yawen / Vashisht, Harsha / Kulizhskaya, Yana / Hoover, Jay / Le, Maigo / Li, Lu / Iwuanyanwu, Chimezie / Liu, Lu / Ramirez, Kevin / Khorlin, Andrey / Cui, Albert / LIN, Tian / Georgiev, Marin / Wu, Marcus / Aguilar, Ricardo / Pallo, Keith / Chakladar, Abhishek / Repina, Alena / Wu, Xihui / van der Weide, Tom / Ponnapalli, Priya / Kaplan, Caroline / Simsa, Jiri / Li, Shuangfeng / Dousse, Olivier / Piper, Jeff / Ie, Nathan / Lui, Minnie / Pasumarthi, Rama / Lintz, Nathan / Vijayakumar, Anitha / Thiet, Lam Nguyen / Andor, Daniel / Valenzuela, Pedro / Paduraru, Cosmin / Peng, Daiyi / Lee, Katherine / Zhang, Shuyuan / Greene, Somer / Nguyen, Duc Dung / Kurylowicz, Paula / Velury, Sarmishta / Krause, Sebastian / Hardin, Cassidy / Dixon, Lucas / Janzer, Lili / Choo, Kiam / Feng, Ziqiang / Zhang, Biao / Singhal, Achintya / Latkar, Tejasi / Zhang, Mingyang / Le, Quoc / Abellan, Elena Allica / Du, Dayou / McKinnon, Dan / Antropova, Natasha / Bolukbasi, Tolga / Keller, Orgad / Reid, David / Finchelstein, Daniel / Raad, 
Maria Abi / Crocker, Remi / Hawkins, Peter / Dadashi, Robert / Gaffney, Colin / Lall, Sid / Franko, Ken / Filonov, Egor / Bulanova, Anna / Leblond, Rémi / Yadav, Vikas / Chung, Shirley / Askham, Harry / Cobo, Luis C. / Xu, Kelvin / Fischer, Felix / Xu, Jun / Sorokin, Christina / Alberti, Chris / Lin, Chu-Cheng / Evans, Colin / Zhou, Hao / Dimitriev, Alek / Forbes, Hannah / Banarse, Dylan / Tung, Zora / Liu, Jeremiah / Omernick, Mark / Bishop, Colton / Kumar, Chintu / Sterneck, Rachel / Foley, Ryan / Jain, Rohan / Mishra, Swaroop / Xia, Jiawei / Bos, Taylor / Cideron, Geoffrey / Amid, Ehsan / Piccinno, Francesco / Wang, Xingyu / Banzal, Praseem / Gurita, Petru / Noga, Hila / Shah, Premal / Mankowitz, Daniel J. / Polozov, Alex / Kushman, Nate / Krakovna, Victoria / Brown, Sasha / Bateni, MohammadHossein / Duan, Dennis / Firoiu, Vlad / Thotakuri, Meghana / Natan, Tom / Mohananey, Anhad / Geist, Matthieu / Mudgal, Sidharth / Girgin, Sertan / Li, Hui / Ye, Jiayu / Roval, Ofir / Tojo, Reiko / Kwong, Michael / Lee-Thorp, James / Yew, Christopher / Yuan, Quan / Bagri, Sumit / Sinopalnikov, Danila / Ramos, Sabela / Mellor, John / Sharma, Abhishek / Severyn, Aliaksei / Lai, Jonathan / Wu, Kathy / Cheng, Heng-Tze / Miller, David / Sonnerat, Nicolas / Vnukov, Denis / Greig, Rory / Beattie, Jennifer / Caveness, Emily / Bai, Libin / Eisenschlos, Julian / Korchemniy, Alex / Tsai, Tomy / Jasarevic, Mimi / Kong, Weize / Dao, Phuong / Zheng, Zeyu / Liu, Frederick / Zhu, Rui / Geller, Mark / Teh, Tian Huey / Sanmiya, Jason / Gladchenko, Evgeny / Trdin, Nejc / Sozanschi, Andrei / Toyama, Daniel / Rosen, Evan / Tavakkol, Sasan / Xue, Linting / Elkind, Chen / Woodman, Oliver / Carpenter, John / Papamakarios, George / Kemp, Rupert / Kafle, Sushant / Grunina, Tanya / Sinha, Rishika / Talbert, Alice / Goyal, Abhimanyu / Wu, Diane / Owusu-Afriyie, Denese / Du, Cosmo / Thornton, Chloe / Pont-Tuset, Jordi / Narayana, Pradyumna / Li, Jing / Fatehi, Sabaer / Wieting, John / Ajmeri, Omar / Uria, 
Benigno / Zhu, Tao / Ko, Yeongil / Knight, Laura / Héliou, Amélie / Niu, Ning / Gu, Shane / Pang, Chenxi / Tran, Dustin / Li, Yeqing / Levine, Nir / Stolovich, Ariel / Kalb, Norbert / Santamaria-Fernandez, Rebeca / Goenka, Sonam / Yustalim, Wenny / Strudel, Robin / Elqursh, Ali / Lakshminarayanan, Balaji / Deck, Charlie / Upadhyay, Shyam / Lee, Hyo / Dusenberry, Mike / Li, Zonglin / Wang, Xuezhi / Levin, Kyle / Hoffmann, Raphael / Holtmann-Rice, Dan / Bachem, Olivier / Yue, Summer / Arora, Sho / Malmi, Eric / Mirylenka, Daniil / Tan, Qijun / Koh, Christy / Yeganeh, Soheil Hassas / Põder, Siim / Zheng, Steven / Pongetti, Francesco / Tariq, Mukarram / Sun, Yanhua / Ionita, Lucian / Seyedhosseini, Mojtaba / Tafti, Pouya / Kotikalapudi, Ragha / Liu, Zhiyu / Gulati, Anmol / Liu, Jasmine / Ye, Xinyu / Chrzaszcz, Bart / Wang, Lily / Sethi, Nikhil / Li, Tianrun / Brown, Ben / Singh, Shreya / Fan, Wei / Parisi, Aaron / Stanton, Joe / Kuang, Chenkai / Koverkathu, Vinod / Choquette-Choo, Christopher A. 
/ Li, Yunjie / Lu, TJ / Ittycheriah, Abe / Shroff, Prakash / Sun, Pei / Varadarajan, Mani / Bahargam, Sanaz / Willoughby, Rob / Gaddy, David / Dasgupta, Ishita / Desjardins, Guillaume / Cornero, Marco / Robenek, Brona / Mittal, Bhavishya / Albrecht, Ben / Shenoy, Ashish / Moiseev, Fedor / Jacobsson, Henrik / Ghaffarkhah, Alireza / Rivière, Morgane / Walton, Alanna / Crepy, Clément / Parrish, Alicia / Liu, Yuan / Zhou, Zongwei / Farabet, Clement / Radebaugh, Carey / Srinivasan, Praveen / van der Salm, Claudia / Fidjeland, Andreas / Scellato, Salvatore / Latorre-Chimoto, Eri / Klimczak-Plucińska, Hanna / Bridson, David / de Cesare, Dario / Hudson, Tom / Mendolicchio, Piermaria / Walker, Lexi / Morris, Alex / Penchev, Ivo / Mauger, Matthew / Guseynov, Alexey / Reid, Alison / Odoom, Seth / Loher, Lucia / Cotruta, Victor / Yenugula, Madhavi / Grewe, Dominik / Petrushkina, Anastasia / Duerig, Tom / Sanchez, Antonio / Yadlowsky, Steve / Shen, Amy / Globerson, Amir / Kurzrok, Adam / Webb, Lynette / Dua, Sahil / Li, Dong / Lahoti, Preethi / Bhupatiraju, Surya / Hurt, Dan / Qureshi, Haroon / Agarwal, Ananth / Shani, Tomer / Eyal, Matan / Khare, Anuj / Belle, Shreyas Rammohan / Wang, Lei / Tekur, Chetan / Kale, Mihir Sanjay / Wei, Jinliang / Sang, Ruoxin / Saeta, Brennan / Liechty, Tyler / Sun, Yi / Zhao, Yao / Lee, Stephan / Nayak, Pandu / Fritz, Doug / Vuyyuru, Manish Reddy / Aslanides, John / Vyas, Nidhi / Wicke, Martin / Ma, Xiao / Bilal, Taylan / Eltyshev, Evgenii / Balle, Daniel / Martin, Nina / Cate, Hardie / Manyika, James / Amiri, Keyvan / Kim, Yelin / Xiong, Xi / Kang, Kai / Luisier, Florian / Tripuraneni, Nilesh / Madras, David / Guo, Mandy / Waters, Austin / Wang, Oliver / Ainslie, Joshua / Baldridge, Jason / Zhang, Han / Pruthi, Garima / Bauer, Jakob / Yang, Feng / Mansour, Riham / Gelman, Jason / Xu, Yang / Polovets, George / Liu, Ji / Cai, Honglong / Chen, Warren / Sheng, XiangHai / Xue, Emily / Ozair, Sherjil / Yu, Adams / Angermueller, Christof / Li, Xiaowei 
/ Wang, Weiren / Wiesinger, Julia / Koukoumidis, Emmanouil / Tian, Yuan / Iyer, Anand / Gurumurthy, Madhu / Goldenson, Mark / Shah, Parashar / Blake, MK / Yu, Hongkun / Urbanowicz, Anthony / Palomaki, Jennimaria / Fernando, Chrisantha / Brooks, Kevin / Durden, Ken / Mehta, Harsh / Momchev, Nikola / Rahimtoroghi, Elahe / Georgaki, Maria / Raul, Amit / Ruder, Sebastian / Redshaw, Morgan / Lee, Jinhyuk / Jalan, Komal / Li, Dinghua / Perng, Ginger / Hechtman, Blake / Schuh, Parker / Nasr, Milad / Chen, Mia / Milan, Kieran / Mikulik, Vladimir / Strohman, Trevor / Franco, Juliana / Green, Tim / Hassabis, Demis / Kavukcuoglu, Koray / Dean, Jeffrey / Vinyals, Oriol

A Family of Highly Capable Multimodal Models

2023

Abstract	This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of Gemini models in cross-modal reasoning and language understanding will enable a wide variety of use cases and we discuss our approach toward deploying them responsibly to users.
Keywords	Computer Science - Computation and Language ; Computer Science - Artificial Intelligence ; Computer Science - Computer Vision and Pattern Recognition
Subject code	004
Publishing date	2023-12-18
Publishing country	us
Document type	Book ; Online
Database	BASE - Bielefeld Academic Search Engine (life sciences selection)
