LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 10 of 13

  1. Book ; Online: Dysfluencies Seldom Come Alone -- Detection as a Multi-Label Problem

    Bayerl, Sebastian P. / Wagner, Dominik / Hönig, Florian / Bocklet, Tobias / Nöth, Elmar / Riedhammer, Korbinian

    2022  

    Abstract Specially adapted speech recognition models are necessary to handle stuttered speech. For these to be used in a targeted manner, stuttered speech must be reliably detected. Recent works have treated stuttering as a multi-class classification problem or viewed detecting each dysfluency type as an isolated task; that does not capture the nature of stuttering, where one dysfluency seldom comes alone, i.e., co-occurs with others. This work explores an approach based on a modified wav2vec 2.0 system for end-to-end stuttering detection and classification as a multi-label problem. The method is evaluated on combinations of three datasets containing English and German stuttered speech, yielding state-of-the-art results for stuttering detection on the SEP-28k-Extended dataset. Experimental results provide evidence for the transferability of features and the generalizability of the method across datasets and languages.

    Comment: Submitted to ICASSP 2023
    Keywords Electrical Engineering and Systems Science - Audio and Speech Processing ; Computer Science - Sound
    Subject code 006
    Publishing date 2022-10-28
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

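A note on the approach: the abstract above treats dysfluency types as labels that can co-occur, i.e. as a multi-label problem on top of wav2vec 2.0-style embeddings. The sketch below only illustrates that framing, not the authors' model; the label set, embedding size, layer sizes, and mean pooling are assumptions.

```python
# Minimal PyTorch sketch of multi-label dysfluency classification on top of
# pooled speech embeddings (e.g. from a wav2vec 2.0-style encoder).
# Hypothetical sizes and labels; not the model from the paper.
import torch
import torch.nn as nn

LABELS = ["block", "prolongation", "sound_rep", "word_rep", "interjection"]  # assumed label set

class MultiLabelHead(nn.Module):
    def __init__(self, embed_dim: int = 768, num_labels: int = len(LABELS)):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(embed_dim, 256), nn.ReLU(), nn.Linear(256, num_labels)
        )

    def forward(self, frame_embeddings: torch.Tensor) -> torch.Tensor:
        # frame_embeddings: (batch, time, embed_dim) from the upstream encoder
        pooled = frame_embeddings.mean(dim=1)   # simple mean pooling over time
        return self.classifier(pooled)          # one logit per dysfluency type

head = MultiLabelHead()
loss_fn = nn.BCEWithLogitsLoss()                # independent sigmoid per label

features = torch.randn(4, 149, 768)             # placeholder encoder output for 4 clips
targets = torch.randint(0, 2, (4, len(LABELS))).float()  # multi-hot labels (co-occurrence allowed)
logits = head(features)
loss = loss_fn(logits, targets)
probs = torch.sigmoid(logits)                   # per-type probabilities at inference time
```

The key difference from a multi-class setup is the independent sigmoid per label, so several dysfluency types can be active in the same clip.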

  2. Book ; Online: KSoF

    Bayerl, Sebastian P. / von Gudenberg, Alexander Wolff / Hönig, Florian / Nöth, Elmar / Riedhammer, Korbinian

    The Kassel State of Fluency Dataset -- A Therapy Centered Dataset of Stuttering

    2022  

    Abstract Stuttering is a complex speech disorder that negatively affects an individual's ability to communicate effectively. Persons who stutter (PWS) often suffer considerably under the condition and seek help through therapy. Fluency shaping is a therapy approach in which PWS learn to modify their speech to help them overcome their stutter. Mastering such speech techniques takes time and practice, even after therapy. Success is rated highly shortly after therapy, but relapse rates are high. To be able to monitor speech behavior over a long time, the ability to detect stuttering events and speech modifications could help PWS and speech pathologists track the level of fluency. Monitoring could create the ability to intervene early by detecting lapses in fluency. To the best of our knowledge, no public dataset is available that contains speech from people who underwent stuttering therapy that changed their style of speaking. This work introduces the Kassel State of Fluency (KSoF), a therapy-based dataset containing over 5500 clips of PWS. The clips were labeled with six stuttering-related event types: blocks, prolongations, sound repetitions, word repetitions, interjections, and - specific to therapy - speech modifications. The audio was recorded during therapy sessions at the Institut der Kasseler Stottertherapie. The data will be made available for research purposes upon request.

    Comment: Accepted at LREC 2022 Conference on Language Resources and Evaluation
    Keywords Electrical Engineering and Systems Science - Audio and Speech Processing ; Computer Science - Computation and Language
    Subject code 400
    Publishing date 2022-03-10
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  3. Book ; Online: A Stutter Seldom Comes Alone -- Cross-Corpus Stuttering Detection as a Multi-label Problem

    Bayerl, Sebastian P. / Wagner, Dominik / Baumann, Ilja / Hönig, Florian / Bocklet, Tobias / Nöth, Elmar / Riedhammer, Korbinian

    2023  

    Abstract Most stuttering detection and classification research has viewed stuttering as a multi-class classification problem or a binary detection task for each dysfluency type; however, this does not match the nature of stuttering, in which one dysfluency seldom comes alone but rather co-occurs with others. This paper explores multi-language and cross-corpus end-to-end stuttering detection as a multi-label problem using a modified wav2vec 2.0 system with an attention-based classification head and multi-task learning. We evaluate the method using combinations of three datasets containing English and German stuttered speech, one of which contains speech modified by fluency shaping. The experimental results and an error analysis show that multi-label stuttering detection systems trained on cross-corpus and multi-language data achieve competitive results, but performance on samples with multiple labels remains below the overall detection results.

    Comment: Accepted for presentation at Interspeech 2023. arXiv admin note: substantial text overlap with arXiv:2210.15982
    Keywords Electrical Engineering and Systems Science - Audio and Speech Processing ; Computer Science - Computation and Language ; Computer Science - Sound
    Subject code 006
    Publishing date 2023-05-30
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

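The abstract above additionally mentions an attention-based classification head. As a hedged illustration of what such a head can look like (dimensions and details are assumptions, not taken from the paper), the following sketch replaces mean pooling with learned attention weights over the frame embeddings.

```python
# Sketch of an attention-pooling classification head, as a stand-in for the
# "attention-based classification head" the abstract mentions (details assumed).
import torch
import torch.nn as nn

class AttentionPoolingHead(nn.Module):
    def __init__(self, embed_dim: int = 768, num_labels: int = 5):
        super().__init__()
        self.scorer = nn.Linear(embed_dim, 1)        # one attention score per frame
        self.classifier = nn.Linear(embed_dim, num_labels)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, embed_dim)
        weights = torch.softmax(self.scorer(frames), dim=1)  # (batch, time, 1)
        pooled = (weights * frames).sum(dim=1)               # weighted sum over time
        return self.classifier(pooled)                       # multi-label logits

logits = AttentionPoolingHead()(torch.randn(2, 149, 768))
```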

  4. Book ; Online: Towards Automated Assessment of Stuttering and Stuttering Therapy

    Bayerl, Sebastian P. / Hönig, Florian / Reister, Joelle / Riedhammer, Korbinian

    2020  

    Abstract Stuttering is a complex speech disorder that can be identified by repetitions, prolongations of sounds, syllables or words, and blocks while speaking. Severity assessment is usually done by a speech therapist. While attempts at automated assessment have been made, it is rarely used in therapy. Common methods for the assessment of stuttering severity include percent stuttered syllables (% SS), the average duration of the three longest stuttering symptoms during a speech task, or the recently introduced Speech Efficiency Score (SES). This paper introduces the Speech Control Index (SCI), a new method to evaluate the severity of stuttering. Unlike SES, it can also be used to assess therapy success for fluency shaping. We evaluate both SES and SCI on a new comprehensively labeled dataset containing stuttered German speech of clients prior to, during, and after undergoing stuttering therapy. Phone alignments of an automatic speech recognition system are statistically evaluated in relation to their relative position to labeled stuttering events. The results indicate that phone length distributions differ with respect to their position in and around labeled stuttering events.

    Comment: 10 pages, 3 figures, 1 table Accepted at TSD 2020, 23rd International Conference on Text, Speech and Dialogue
    Keywords Quantitative Biology - Quantitative Methods ; Computer Science - Computation and Language ; Computer Science - Machine Learning ; Computer Science - Sound ; Electrical Engineering and Systems Science - Audio and Speech Processing
    Subject code 410
    Publishing date 2020-06-16
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

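For readers unfamiliar with the conventional severity measures named in the abstract, the short sketch below works through percent stuttered syllables (%SS) and the mean duration of the three longest stuttering events on made-up numbers; the paper's new Speech Control Index is not reproduced here.

```python
# Worked example of two conventional severity measures named in the abstract:
# percent stuttered syllables (%SS) and the mean duration of the three longest
# stuttering events. Numbers are invented for illustration.
def percent_stuttered_syllables(stuttered: int, total: int) -> float:
    return 100.0 * stuttered / total

def mean_of_three_longest(durations_s: list[float]) -> float:
    return sum(sorted(durations_s, reverse=True)[:3]) / 3.0

print(percent_stuttered_syllables(stuttered=18, total=300))   # 6.0 %SS
print(mean_of_three_longest([0.4, 1.2, 0.9, 2.1, 0.6]))       # (2.1 + 1.2 + 0.9) / 3 = 1.4 s
```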

  5. Article ; Online: Smart Annotation of Cyclic Data Using Hierarchical Hidden Markov Models.

    Martindale, Christine F / Hoenig, Florian / Strohrmann, Christina / Eskofier, Bjoern M

    Sensors (Basel, Switzerland)

    2017  Volume 17, Issue 10

    Abstract Cyclic signals are an intrinsic part of daily life, such as human motion and heart activity. Their detailed analysis is important for clinical applications such as pathological gait analysis and for sports applications such as performance analysis. Labeled training data for algorithms that analyze these cyclic data come at a high annotation cost: annotations are either only available for data collected under laboratory conditions or require manual segmentation of data collected under less restricted conditions. This paper presents a smart annotation method that reduces this labeling cost for sensor-based data and is applicable to data collected outside of strict laboratory conditions. The method uses semi-supervised learning on sections of cyclic data with a known cycle number. A hierarchical hidden Markov model (hHMM) is used, achieving a mean absolute error of 0.041 ± 0.020 s relative to a manually annotated reference. The resulting model was also used to simultaneously segment and classify continuous, 'in the wild' data, demonstrating the applicability of an hHMM trained on limited data sections to label a complete dataset. This technique achieved results comparable to its fully supervised equivalent. Our semi-supervised method has the significant advantage of reduced annotation cost. Furthermore, it reduces the opportunity for human error in the labeling process normally required for training segmentation algorithms. It also lowers the annotation cost of training a model capable of continuous monitoring of cycle characteristics, such as those employed to analyze the progress of movement disorders or running technique.
    Language English
    Publishing date 2017-10-13
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2052857-7
    ISSN (online) 1424-8220
    DOI 10.3390/s17102328
    Database MEDical Literature Analysis and Retrieval System OnLINE

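As a rough illustration of HMM-based labeling of cyclic signals (not the paper's hierarchical HMM, and on synthetic rather than IMU data), the sketch below fits a flat Gaussian HMM with hmmlearn and reads cycle boundaries off the decoded state sequence.

```python
# Illustrative only: a flat Gaussian HMM segmenting a noisy cyclic signal into
# states, to show the general HMM-based labeling idea. The paper uses a
# hierarchical HMM and real sensor data; this toy example does not.
import numpy as np
from hmmlearn import hmm

rng = np.random.default_rng(0)
t = np.linspace(0, 20 * np.pi, 2000)
signal = np.sin(t) + 0.1 * rng.standard_normal(t.size)   # synthetic cyclic data

model = hmm.GaussianHMM(n_components=4, covariance_type="diag", n_iter=50, random_state=0)
model.fit(signal.reshape(-1, 1))                          # unsupervised fit
states = model.predict(signal.reshape(-1, 1))             # per-sample state labels

# Cycle boundaries can then be read off as state transitions:
boundaries = np.flatnonzero(np.diff(states) != 0)
print(states[:20], boundaries[:5])
```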

  6. Book ; Online: Computerunterstütztes Aussprache- und Dialogtraining

    Batliner, Anton / Hönig, Florian / Weilhammer, Karl

    C-AuDiT ; Schlussbericht

    2010  

    Title variant C-AuDiT ; Computer aided pronunciation and dialog training ; Aussprachetraining
    Author's details [Authors: Batliner, Anton; Hönig, Florian; Weilhammer, Karl]. Project lead: Karl Weilhammer
    Language German
    Size Online resource (29 pages, 886 KB), with diagrams
    Publisher Technische Informationsbibliothek u. Universitätsbibliothek ; Digital Publ.u.a.
    Publishing place Hannover ; Munich et al.
    Document type Book ; Online
    Note Funding code BMBF 01IS07014A - 01IS07014B. - Joint project no. 01060794. - English report sheet with the title: Computer aided pronunciation and dialog training. - Differences between the printed document and the electronic resource cannot be ruled out. - Also available as a printed edition
    Database Library catalogue of the German National Library of Science and Technology (TIB), Hannover


  7. Book: Computerunterstütztes Aussprache- und Dialogtraining

    Batliner, Anton / Hönig, Florian / Weilhammer, Karl

    C-AuDiT ; Schlussbericht

    2010  

    Title variant C-AuDiT ; Computer aided pronunciation and dialog training ; Aussprachetraining
    Author's details [Authors: Batliner, Anton; Hönig, Florian; Weilhammer, Karl]. Project lead: Karl Weilhammer
    Language German
    Size 13 leaves, with diagrams
    Publisher Digital Publ. u.a.
    Publishing place Munich et al.
    Document type Book
    Note Also available as an electronic resource. - Funding code BMBF 01IS07014A - 01IS07014B. - Joint project no. 01060794. - English report sheet with the title: Computer aided pronunciation and dialog training. - Differences between the printed document and the electronic resource cannot be ruled out
    Database Library catalogue of the German National Library of Science and Technology (TIB), Hannover


  8. Article: Automatic modelling of depressed speech

    Hönig, Florian / Batliner, Anton / Nöth, Elmar / Schnieder, Sebastian / Krajewski, Jarek

    Relevant features and relevance of gender

    2014  

    Title translation Automatische Modellierung depressiver Sprache: Relevante Merkmale und Relevanz des Geschlechts (DeepL)
    Series title In: Li, H.; Ching, P. (Ed.), INTERSPEECH 2014. 15th Annual Conference of the International Speech Communication Association, Singapore September 14-18 (S. 1248-1252). Baixas: International Speech Communication Association (ISCA)
    Abstract Depression is an affective disorder characterised by psychomotor retardation; in speech, this shows up in reduction of pitch (variation, range), loudness, and tempo, and in voice qualities different from those of typical modal speech. A similar reduction can be observed in sleepy speech (relaxation). In this paper, we employ a small group of acoustic features modelling prosody and spectrum that have been proven successful in the modelling of sleepy speech, enriched with voice quality features, for the modelling of depressed speech within a regression approach. This knowledge-based approach is complemented by and compared with brute-forcing and automatic feature selection. We further discuss gender differences and the contributions of (groups of) features both for the modelling of depression and across depression and sleepiness.
    Keywords Automated Information Processing ; Automatisierte Informationsverarbeitung ; Computational Modeling ; Computermodell ; Major Depression ; Mustererkennung (Computerwissenschaft) ; Mündliche Kommunikation ; Natural Language Processing ; Natürliche Sprachverarbeitung ; Oral Communication ; Pattern Recognition (Computer Science) ; Prosodie ; Prosody ; Schläfrigkeit ; Sleepiness ; Speech Characteristics ; Sprechcharakteristika ; Stimme ; Voice
    Language English
    Document type Article
    Database PSYNDEX

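The abstract above describes a regression approach over acoustic features, complemented by automatic feature selection. The sketch below shows that generic recipe with scikit-learn on placeholder data; it uses none of the paper's actual features, data, or model choices.

```python
# Generic sketch of the "regression + automatic feature selection" idea from the
# abstract, using scikit-learn on placeholder data (not the paper's features).
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.standard_normal((120, 300))      # 120 speakers x 300 acoustic features (hypothetical)
y = X[:, :5] @ rng.standard_normal(5) + 0.5 * rng.standard_normal(120)  # synthetic target score

model = make_pipeline(StandardScaler(),
                      SelectKBest(f_regression, k=20),   # automatic feature selection
                      SVR(kernel="linear"))              # regression on the selected features
print(cross_val_score(model, X, y, cv=5, scoring="r2").mean())
```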

  9. Conference proceedings: Robustes Echtzeit-Feedback für die gebundene, weiche Sprechtechnik in der Stottertherapie

    Haderlein, Tino / Hönig, Florian / Jassens, Frank / Mahlberg, Lea / Nöth, Elmar / Wolff von Gudenberg, Alexander

    2015  Page(s) 45

    Event/congress 32. Wissenschaftliche Jahrestagung der Deutschen Gesellschaft für Phoniatrie und Pädaudiologie (DGPP); Oldenburg; Deutsche Gesellschaft für Phoniatrie und Pädaudiologie; 2015
    Abstract Background: The Kasseler Stottertherapie (Kassel stuttering therapy) teaches a speaking technique with a soft voice onset. Further characteristics are sound prolongations, smooth transitions between voiced sounds, reduced voiceless consonants, and sustained phonation within phrases. The software used so far in speech exercises detects the voice onset from loudness alone and is therefore error-prone in the presence of background noise. In addition, several program parameters have to be calibrated by the user.
    Materials and methods: For the first evaluations, 66 recordings with a soft voice onset from eleven speakers were compared with 66 recordings with a too rapid rise in loudness from eight speakers. The new method uses a procedure that automatically segments the recording into voiced and unvoiced regions in order to decide when the voice onset occurs. The loudness measured there is converted to a logarithmic scale and displayed graphically over time. At the same time, the maximum permitted rise in loudness per unit of time is visualized, and any violation is reported back immediately. Based on findings from around 1000 recordings of speech exercises by 20 patients, the permitted rise was set to 15 dB per second. The only parameter that has to be determined is the level of normal speaking loudness; automating this calibration was part of the present study. To determine the voiced sections, an existing algorithm for F0 detection (RAPT) was modified. The initial loudness value was determined from the initial voiced segments, with 3 dB subtracted from the level of the voice signal.
    Results: 46 of 66 "good" voice onsets (69.7%) were rated as good, and 56 of 66 "too rapidly loud" voice onsets (84.8%) were also displayed correctly, giving an overall recognition rate of 77.2%. The results should be judged against the fact that for 40.5% of all available pairwise annotations, no unanimous decision between "normal" and "too rapidly loud" was reached.
    Discussion: With the fundamental-frequency-based approach, the voice onset in noisy data can be detected better than with loudness-based methods, and no parameters have to be calibrated by hand before use.
    Conclusion: The F0-based method allows more accurate real-time feedback to the user.
    Keywords Medicine, Health
    Publishing date 2015-09-07
    Publisher German Medical Science GMS Publishing House; Düsseldorf
    Document type Conference proceedings
    DOI 10.3205/15dgpp29
    Database German Medical Science

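The abstract above (translated) describes segmenting a recording into voiced and unvoiced regions via F0 and flagging voice onsets whose loudness rises faster than 15 dB per second. The numpy-only sketch below illustrates that threshold check on precomputed frame-level F0 and RMS contours; the frame rate, analysis window, and contour extraction are assumptions, and the study's RAPT-based implementation is not reproduced.

```python
# Sketch of the 15 dB/s onset check described in the abstract, on precomputed
# frame-level contours. f0 (Hz, 0 = unvoiced) and rms are assumed to come from
# some upstream tracker/feature extractor; frame rate and window are assumptions.
import numpy as np

def onset_too_fast(f0: np.ndarray, rms: np.ndarray,
                   frame_rate_hz: float = 100.0,
                   window_s: float = 0.2,
                   max_rise_db_per_s: float = 15.0) -> bool:
    """Flag a voice onset whose loudness rises faster than the permitted rate."""
    voiced = np.flatnonzero(f0 > 0)                     # frames marked voiced by the F0 tracker
    if voiced.size == 0:
        return False                                    # no voice onset in this recording
    start = voiced[0]                                   # first voiced frame = voice onset
    end = min(start + int(window_s * frame_rate_hz), rms.size - 1)
    if end <= start:
        return False
    level_db = 20.0 * np.log10(np.maximum(rms, 1e-8))   # log loudness per frame
    rise = (level_db[end] - level_db[start]) / ((end - start) / frame_rate_hz)
    return rise > max_rise_db_per_s                     # > 15 dB/s counts as "too fast"

# Toy contours: 100 frames/s, onset at frame 50, loudness ramping up gently.
f0 = np.r_[np.zeros(50), np.full(50, 120.0)]
rms_soft = np.r_[np.full(50, 1e-4), np.linspace(1e-3, 1.4e-3, 50)]
print(onset_too_fast(f0, rms_soft))   # gentle ramp, roughly 7 dB/s -> False
```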

  10. Article ; Online: Automatic detection of articulation disorders in children with cleft lip and palate.

    Maier, Andreas / Hönig, Florian / Bocklet, Tobias / Nöth, Elmar / Stelzle, Florian / Nkenke, Emeka / Schuster, Maria

    The Journal of the Acoustical Society of America

    2009  Volume 126, Issue 5, Page(s) 2589–2602

    Abstract Speech of children with cleft lip and palate (CLP) is sometimes still disordered even after adequate surgical and nonsurgical therapies. Such speech shows complex articulation disorders, which are usually assessed perceptually, consuming time and manpower. Hence, there is a need for an easy-to-apply and reliable automatic method. To create a reference for an automatic system, speech data of 58 children with CLP were assessed perceptually by experienced speech therapists for characteristic phonetic disorders at the phoneme level. The first part of the article aims to detect such characteristics by a semiautomatic procedure and the second to evaluate a fully automatic, thus simpler, procedure. The methods are based on a combination of speech processing algorithms. The semiautomatic method achieves moderate to good agreement (kappa approximately 0.6) for the detection of all phonetic disorders. On the speaker level, significant correlations of 0.89 between the perceptual evaluation and the automatic system are obtained. The fully automatic system yields a correlation of 0.81 to the perceptual evaluation on the speaker level. This correlation is in the range of the inter-rater correlation of the listeners. The automatic speech evaluation is able to detect phonetic disorders at an expert's level without any additional human postprocessing.
    MeSH term(s) Algorithms ; Articulation Disorders/diagnosis ; Articulation Disorders/etiology ; Child ; Cleft Lip/complications ; Cleft Palate/complications ; Humans ; Models, Biological ; Phonation ; Phonetics ; Psycholinguistics ; Speech Therapy
    Language English
    Publishing date 2009-11
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 219231-7
    ISSN (online) 1520-8524
    ISSN (print) 0001-4966
    DOI 10.1121/1.3216913
    Database MEDical Literature Analysis and Retrieval System OnLINE

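The agreement figures quoted in the abstract (kappa of roughly 0.6, speaker-level correlations of 0.89 and 0.81) are standard measures; the sketch below shows how such values are typically computed with scikit-learn and SciPy on invented ratings, not the study's data.

```python
# How agreement figures like those in the abstract are typically computed
# (Cohen's kappa for per-phoneme decisions, Pearson r for per-speaker scores).
# The ratings below are invented for illustration.
from scipy.stats import pearsonr
from sklearn.metrics import cohen_kappa_score

therapist = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]     # perceptual decisions per phoneme
automatic = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]     # automatic decisions per phoneme
print("kappa:", cohen_kappa_score(therapist, automatic))

perceptual_scores = [2.1, 3.4, 1.0, 4.2, 2.8]  # per-speaker perceptual severity
system_scores     = [2.0, 3.1, 1.3, 4.0, 3.0]  # per-speaker automatic score
r, p = pearsonr(perceptual_scores, system_scores)
print("Pearson r:", r)
```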
