LIVIVO - The Search Portal for Life Sciences

Search results

Results 1 - 10 of 13

  1. Book ; Online: Dysfluencies Seldom Come Alone -- Detection as a Multi-Label Problem

    Bayerl, Sebastian P. / Wagner, Dominik / Hönig, Florian / Bocklet, Tobias / Nöth, Elmar / Riedhammer, Korbinian

    2022  

    Abstract Specially adapted speech recognition models are necessary to handle stuttered speech. For these to be used in a targeted manner, stuttered speech must be reliably detected. Recent works have treated stuttering as a multi-class classification problem or viewed detecting each dysfluency type as an isolated task; that does not capture the nature of stuttering, where one dysfluency seldom comes alone, i.e., co-occurs with others. This work explores an approach based on a modified wav2vec 2.0 system for end-to-end stuttering detection and classification as a multi-label problem. The method is evaluated on combinations of three datasets containing English and German stuttered speech, yielding state-of-the-art results for stuttering detection on the SEP-28k-Extended dataset. Experimental results provide evidence for the transferability of features and the generalizability of the method across datasets and languages.

    Comment: Submitted to ICASSP 2023
    Keywords Electrical Engineering and Systems Science - Audio and Speech Processing ; Computer Science - Sound
    Subject code 006
    Publishing date 2022-10-28
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

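A note on the approach: the abstract above treats dysfluency types as labels that can co-occur, i.e. as a multi-label problem on top of wav2vec 2.0-style embeddings. The sketch below only illustrates that framing, not the authors' model; the label set, embedding size, layer sizes, and mean pooling are assumptions.

```python
# Minimal PyTorch sketch of multi-label dysfluency classification on top of
# pooled speech embeddings (e.g. from a wav2vec 2.0-style encoder).
# Hypothetical sizes and labels; not the model from the paper.
import torch
import torch.nn as nn

LABELS = ["block", "prolongation", "sound_rep", "word_rep", "interjection"]  # assumed label set

class MultiLabelHead(nn.Module):
    def __init__(self, embed_dim: int = 768, num_labels: int = len(LABELS)):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(embed_dim, 256), nn.ReLU(), nn.Linear(256, num_labels)
        )

    def forward(self, frame_embeddings: torch.Tensor) -> torch.Tensor:
        # frame_embeddings: (batch, time, embed_dim) from the upstream encoder
        pooled = frame_embeddings.mean(dim=1)   # simple mean pooling over time
        return self.classifier(pooled)          # one logit per dysfluency type

head = MultiLabelHead()
loss_fn = nn.BCEWithLogitsLoss()                # independent sigmoid per label

features = torch.randn(4, 149, 768)             # placeholder encoder output for 4 clips
targets = torch.randint(0, 2, (4, len(LABELS))).float()  # multi-hot labels (co-occurrence allowed)
logits = head(features)
loss = loss_fn(logits, targets)
probs = torch.sigmoid(logits)                   # per-type probabilities at inference time
```

The key difference from a multi-class setup is the independent sigmoid per label, so several dysfluency types can be active in the same clip.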

  2. Book ; Online: KSoF

    Bayerl, Sebastian P. / von Gudenberg, Alexander Wolff / Hönig, Florian / Nöth, Elmar / Riedhammer, Korbinian

    The Kassel State of Fluency Dataset -- A Therapy Centered Dataset of Stuttering

    2022  

    Abstract Stuttering is a complex speech disorder that negatively affects an individual's ability to communicate effectively. Persons who stutter (PWS) often suffer considerably under the condition and seek help through therapy. Fluency shaping is a therapy approach in which PWS learn to modify their speech to help them overcome their stutter. Mastering such speech techniques takes time and practice, even after therapy. Success is rated highly shortly after therapy, but relapse rates are high. To be able to monitor speech behavior over a long time, the ability to detect stuttering events and speech modifications could help PWS and speech pathologists track the level of fluency. Monitoring could create the ability to intervene early by detecting lapses in fluency. To the best of our knowledge, no public dataset is available that contains speech from people who underwent stuttering therapy that changed their style of speaking. This work introduces the Kassel State of Fluency (KSoF), a therapy-based dataset containing over 5500 clips of PWS. The clips were labeled with six stuttering-related event types: blocks, prolongations, sound repetitions, word repetitions, interjections, and - specific to therapy - speech modifications. The audio was recorded during therapy sessions at the Institut der Kasseler Stottertherapie. The data will be made available for research purposes upon request.

    Comment: Accepted at LREC 2022 Conference on Language Resources and Evaluation
    Keywords Electrical Engineering and Systems Science - Audio and Speech Processing ; Computer Science - Computation and Language
    Subject code 400
    Publishing date 2022-03-10
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)


  3. Book ; Online: A Stutter Seldom Comes Alone -- Cross-Corpus Stuttering Detection as a Multi-label Problem

    Bayerl, Sebastian P. / Wagner, Dominik / Baumann, Ilja / Hönig, Florian / Bocklet, Tobias / Nöth, Elmar / Riedhammer, Korbinian

    2023  

    Abstract Most stuttering detection and classification research has viewed stuttering as a multi-class classification problem or a binary detection task for each dysfluency type; however, this does not match the nature of stuttering, in which one dysfluency seldom comes alone but rather co-occurs with others. This paper explores multi-language and cross-corpus end-to-end stuttering detection as a multi-label problem using a modified wav2vec 2.0 system with an attention-based classification head and multi-task learning. We evaluate the method using combinations of three datasets containing English and German stuttered speech, one of which contains speech modified by fluency shaping. The experimental results and an error analysis show that multi-label stuttering detection systems trained on cross-corpus and multi-language data achieve competitive results, but performance on samples with multiple labels remains below the overall detection results.

    Comment: Accepted for presentation at Interspeech 2023. arXiv admin note: substantial text overlap with arXiv:2210.15982
    Keywords Electrical Engineering and Systems Science - Audio and Speech Processing ; Computer Science - Computation and Language ; Computer Science - Sound
    Subject code 006
    Publishing date 2023-05-30
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

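The abstract above additionally mentions an attention-based classification head. As a hedged illustration of what such a head can look like (dimensions and details are assumptions, not taken from the paper), the following sketch replaces mean pooling with learned attention weights over the frame embeddings.

```python
# Sketch of an attention-pooling classification head, as a stand-in for the
# "attention-based classification head" the abstract mentions (details assumed).
import torch
import torch.nn as nn

class AttentionPoolingHead(nn.Module):
    def __init__(self, embed_dim: int = 768, num_labels: int = 5):
        super().__init__()
        self.scorer = nn.Linear(embed_dim, 1)        # one attention score per frame
        self.classifier = nn.Linear(embed_dim, num_labels)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, embed_dim)
        weights = torch.softmax(self.scorer(frames), dim=1)  # (batch, time, 1)
        pooled = (weights * frames).sum(dim=1)               # weighted sum over time
        return self.classifier(pooled)                       # multi-label logits

logits = AttentionPoolingHead()(torch.randn(2, 149, 768))
```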

  4. Book ; Online: Towards Automated Assessment of Stuttering and Stuttering Therapy

    Bayerl, Sebastian P. / Hönig, Florian / Reister, Joelle / Riedhammer, Korbinian

    2020  

    Abstract Stuttering is a complex speech disorder that can be identified by repetitions, prolongations of sounds, syllables or words, and blocks while speaking. Severity assessment is usually done by a speech therapist. While attempts at automated assessment have been made, it is rarely used in therapy. Common methods for the assessment of stuttering severity include percent stuttered syllables (% SS), the average duration of the three longest stuttering symptoms during a speech task, or the recently introduced Speech Efficiency Score (SES). This paper introduces the Speech Control Index (SCI), a new method to evaluate the severity of stuttering. Unlike SES, it can also be used to assess therapy success for fluency shaping. We evaluate both SES and SCI on a new comprehensively labeled dataset containing stuttered German speech of clients prior to, during, and after undergoing stuttering therapy. Phone alignments of an automatic speech recognition system are statistically evaluated in relation to their relative position to labeled stuttering events. The results indicate that phone length distributions differ with respect to their position in and around labeled stuttering events.

    Comment: 10 pages, 3 figures, 1 table Accepted at TSD 2020, 23rd International Conference on Text, Speech and Dialogue
    Keywords Quantitative Biology - Quantitative Methods ; Computer Science - Computation and Language ; Computer Science - Machine Learning ; Computer Science - Sound ; Electrical Engineering and Systems Science - Audio and Speech Processing
    Subject code 410
    Publishing date 2020-06-16
    Publishing country us
    Document type Book ; Online
    Database BASE - Bielefeld Academic Search Engine (life sciences selection)

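For readers unfamiliar with the conventional severity measures named in the abstract, the short sketch below works through percent stuttered syllables (%SS) and the mean duration of the three longest stuttering events on made-up numbers; the paper's new Speech Control Index is not reproduced here.

```python
# Worked example of two conventional severity measures named in the abstract:
# percent stuttered syllables (%SS) and the mean duration of the three longest
# stuttering events. Numbers are invented for illustration.
def percent_stuttered_syllables(stuttered: int, total: int) -> float:
    return 100.0 * stuttered / total

def mean_of_three_longest(durations_s: list[float]) -> float:
    return sum(sorted(durations_s, reverse=True)[:3]) / 3.0

print(percent_stuttered_syllables(stuttered=18, total=300))   # 6.0 %SS
print(mean_of_three_longest([0.4, 1.2, 0.9, 2.1, 0.6]))       # (2.1 + 1.2 + 0.9) / 3 = 1.4 s
```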

  5. Article ; Online: Smart Annotation of Cyclic Data Using Hierarchical Hidden Markov Models.

    Martindale, Christine F / Hoenig, Florian / Strohrmann, Christina / Eskofier, Bjoern M

    Sensors (Basel, Switzerland)

    2017  Volume 17, Issue 10

    Abstract Cyclic signals are an intrinsic part of daily life, such as human motion and heart activity. Their detailed analysis is important for clinical applications such as pathological gait analysis and for sports applications such as performance analysis. Labeled training data for algorithms that analyze these cyclic data come at a high annotation cost: annotations are either only available for data collected under laboratory conditions or require manual segmentation of data collected under less restricted conditions. This paper presents a smart annotation method that reduces this labeling cost for sensor-based data and is applicable to data collected outside of strict laboratory conditions. The method uses semi-supervised learning on sections of cyclic data with a known cycle number. A hierarchical hidden Markov model (hHMM) is used, achieving a mean absolute error of 0.041 ± 0.020 s relative to a manually annotated reference. The resulting model was also used to simultaneously segment and classify continuous, 'in the wild' data, demonstrating the applicability of an hHMM trained on limited data sections to label a complete dataset. This technique achieved results comparable to its fully supervised equivalent. Our semi-supervised method has the significant advantage of reduced annotation cost. Furthermore, it reduces the opportunity for human error in the labeling process normally required for training segmentation algorithms. It also lowers the annotation cost of training a model capable of continuous monitoring of cycle characteristics, such as those employed to analyze the progress of movement disorders or running technique.
    Language English
    Publishing date 2017-10-13
    Publishing country Switzerland
    Document type Journal Article
    ZDB-ID 2052857-7
    ISSN (online) 1424-8220
    DOI 10.3390/s17102328
    Database MEDical Literature Analysis and Retrieval System OnLINE

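As a rough illustration of HMM-based labeling of cyclic signals (not the paper's hierarchical HMM, and on synthetic rather than IMU data), the sketch below fits a flat Gaussian HMM with hmmlearn and reads cycle boundaries off the decoded state sequence.

```python
# Illustrative only: a flat Gaussian HMM segmenting a noisy cyclic signal into
# states, to show the general HMM-based labeling idea. The paper uses a
# hierarchical HMM and real sensor data; this toy example does not.
import numpy as np
from hmmlearn import hmm

rng = np.random.default_rng(0)
t = np.linspace(0, 20 * np.pi, 2000)
signal = np.sin(t) + 0.1 * rng.standard_normal(t.size)   # synthetic cyclic data

model = hmm.GaussianHMM(n_components=4, covariance_type="diag", n_iter=50, random_state=0)
model.fit(signal.reshape(-1, 1))                          # unsupervised fit
states = model.predict(signal.reshape(-1, 1))             # per-sample state labels

# Cycle boundaries can then be read off as state transitions:
boundaries = np.flatnonzero(np.diff(states) != 0)
print(states[:20], boundaries[:5])
```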

  6. Book ; Online: Computerunterstütztes Aussprache- und Dialogtraining

    Batliner, Anton / Hönig, Florian / Weilhammer, Karl

    C-AuDiT ; Schlussbericht

    2010  

    Title variant C-AuDiT ; Computer aided pronunciation and dialog training ; Aussprachetraining
    Author's details [Authors: Batliner, Anton; Hönig, Florian; Weilhammer, Karl]. Project lead: Karl Weilhammer
    Language German
    Size Online resource (29 pages, 886 KB), with diagrams
    Publisher Technische Informationsbibliothek u. Universitätsbibliothek ; Digital Publ.u.a.
    Publishing place Hannover ; Munich et al.
    Document type Book ; Online
    Note Funding code BMBF 01IS07014A - 01IS07014B. - Joint project no. 01060794. - English report sheet with the title: Computer aided pronunciation and dialog training. - Differences between the printed document and the electronic resource cannot be ruled out. - Also available as a printed edition
    Database Library catalogue of the German National Library of Science and Technology (TIB), Hannover


  7. Book: Computerunterstütztes Aussprache- und Dialogtraining

    Batliner, Anton / Hönig, Florian / Weilhammer, Karl

    C-AuDiT ; Schlussbericht

    2010  

    Title variant C-AuDiT ; Computer aided pronunciation and dialog training ; Aussprachetraining
    Author's details [Authors: Batliner, Anton; Hönig, Florian; Weilhammer, Karl]. Project lead: Karl Weilhammer
    Language German
    Size 13 leaves, with diagrams
    Publisher Digital Publ. u.a.
    Publishing place Munich et al.
    Document type Book
    Note Also available as an electronic resource. - Funding code BMBF 01IS07014A - 01IS07014B. - Joint project no. 01060794. - English report sheet with the title: Computer aided pronunciation and dialog training. - Differences between the printed document and the electronic resource cannot be ruled out
    Database Library catalogue of the German National Library of Science and Technology (TIB), Hannover


  8. Article: Automatic modelling of depressed speech

    Hönig, Florian / Batliner, Anton / Nöth, Elmar / Schnieder, Sebastian / Krajewski, Jarek

    Relevant features and relevance of gender

    2014  

    Title translation Automatische Modellierung depressiver Sprache: Relevante Merkmale und Relevanz des Geschlechts (DeepL)
    Series title In: Li, H.; Ching, P. (Ed.), INTERSPEECH 2014. 15th Annual Conference of the International Speech Communication Association, Singapore September 14-18 (S. 1248-1252). Baixas: International Speech Communication Association (ISCA)
    Abstract Depression is an affective disorder characterised by psychomotor retardation; in speech, this shows up in reduction of pitch (variation, range), loudness, and tempo, and in voice qualities different from those of typical modal speech. A similar reduction can be observed in sleepy speech (relaxation). In this paper, we employ a small group of acoustic features modelling prosody and spectrum that have been proven successful in the modelling of sleepy speech, enriched with voice quality features, for the modelling of depressed speech within a regression approach. This knowledge-based approach is complemented by and compared with brute-forcing and automatic feature selection. We further discuss gender differences and the contributions of (groups of) features both for the modelling of depression and across depression and sleepiness.
    Keywords Automated Information Processing ; Automatisierte Informationsverarbeitung ; Computational Modeling ; Computermodell ; Major Depression ; Mustererkennung (Computerwissenschaft) ; Mündliche Kommunikation ; Natural Language Processing ; Natürliche Sprachverarbeitung ; Oral Communication ; Pattern Recognition (Computer Science) ; Prosodie ; Prosody ; Schläfrigkeit ; Sleepiness ; Speech Characteristics ; Sprechcharakteristika ; Stimme ; Voice
    Language English
    Document type Article
    Database PSYNDEX

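The abstract above describes a regression approach over acoustic features, complemented by automatic feature selection. The sketch below shows that generic recipe with scikit-learn on placeholder data; it uses none of the paper's actual features, data, or model choices.

```python
# Generic sketch of the "regression + automatic feature selection" idea from the
# abstract, using scikit-learn on placeholder data (not the paper's features).
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.standard_normal((120, 300))      # 120 speakers x 300 acoustic features (hypothetical)
y = X[:, :5] @ rng.standard_normal(5) + 0.5 * rng.standard_normal(120)  # synthetic target score

model = make_pipeline(StandardScaler(),
                      SelectKBest(f_regression, k=20),   # automatic feature selection
                      SVR(kernel="linear"))              # regression on the selected features
print(cross_val_score(model, X, y, cv=5, scoring="r2").mean())
```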

  9. Conference proceedings: Robustes Echtzeit-Feedback für die gebundene, weiche Sprechtechnik in der Stottertherapie

    Haderlein, Tino / Hönig, Florian / Jassens, Frank / Mahlberg, Lea / Nöth, Elmar / Wolff von Gudenberg, Alexander

    2015  Page(s) 45

    Event/congress 32. Wissenschaftliche Jahrestagung der Deutschen Gesellschaft für Phoniatrie und Pädaudiologie (DGPP); Oldenburg; Deutsche Gesellschaft für Phoniatrie und Pädaudiologie; 2015
    Abstract Background: The Kasseler Stottertherapie (Kassel stuttering therapy) teaches a speaking technique with a soft voice onset. Further characteristics are sound prolongations, smooth transitions between voiced sounds, reduced voiceless consonants, and sustained phonation within phrases. The software used so far in speech exercises detects the voice onset from loudness alone and is therefore error-prone in the presence of background noise. In addition, several program parameters have to be calibrated by the user.
    Materials and methods: For the first evaluations, 66 recordings with a soft voice onset from eleven speakers were compared with 66 recordings with a too rapid rise in loudness from eight speakers. The new method uses a procedure that automatically segments the recording into voiced and unvoiced regions in order to decide when the voice onset occurs. The loudness measured there is converted to a logarithmic scale and displayed graphically over time. At the same time, the maximum permitted rise in loudness per unit of time is visualized, and any violation is reported back immediately. Based on findings from around 1000 recordings of speech exercises by 20 patients, the permitted rise was set to 15 dB per second. The only parameter that has to be determined is the level of normal speaking loudness; automating this calibration was part of the present study. To determine the voiced sections, an existing algorithm for F0 detection (RAPT) was modified. The initial loudness value was determined from the initial voiced segments, with 3 dB subtracted from the level of the voice signal.
    Results: 46 of 66 "good" voice onsets (69.7%) were rated as good, and 56 of 66 "too rapidly loud" voice onsets (84.8%) were also displayed correctly, giving an overall recognition rate of 77.2%. The results should be judged against the fact that for 40.5% of all available pairwise annotations, no unanimous decision between "normal" and "too rapidly loud" was reached.
    Discussion: With the fundamental-frequency-based approach, the voice onset in noisy data can be detected better than with loudness-based methods, and no parameters have to be calibrated by hand before use.
    Conclusion: The F0-based method allows more accurate real-time feedback to the user.
    Keywords Medicine, Health
    Publishing date 2015-09-07
    Publisher German Medical Science GMS Publishing House; Düsseldorf
    Document type Conference proceedings
    DOI 10.3205/15dgpp29
    Database German Medical Science

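The abstract above (translated) describes segmenting a recording into voiced and unvoiced regions via F0 and flagging voice onsets whose loudness rises faster than 15 dB per second. The numpy-only sketch below illustrates that threshold check on precomputed frame-level F0 and RMS contours; the frame rate, analysis window, and contour extraction are assumptions, and the study's RAPT-based implementation is not reproduced.

```python
# Sketch of the 15 dB/s onset check described in the abstract, on precomputed
# frame-level contours. f0 (Hz, 0 = unvoiced) and rms are assumed to come from
# some upstream tracker/feature extractor; frame rate and window are assumptions.
import numpy as np

def onset_too_fast(f0: np.ndarray, rms: np.ndarray,
                   frame_rate_hz: float = 100.0,
                   window_s: float = 0.2,
                   max_rise_db_per_s: float = 15.0) -> bool:
    """Flag a voice onset whose loudness rises faster than the permitted rate."""
    voiced = np.flatnonzero(f0 > 0)                     # frames marked voiced by the F0 tracker
    if voiced.size == 0:
        return False                                    # no voice onset in this recording
    start = voiced[0]                                   # first voiced frame = voice onset
    end = min(start + int(window_s * frame_rate_hz), rms.size - 1)
    if end <= start:
        return False
    level_db = 20.0 * np.log10(np.maximum(rms, 1e-8))   # log loudness per frame
    rise = (level_db[end] - level_db[start]) / ((end - start) / frame_rate_hz)
    return rise > max_rise_db_per_s                     # > 15 dB/s counts as "too fast"

# Toy contours: 100 frames/s, onset at frame 50, loudness ramping up gently.
f0 = np.r_[np.zeros(50), np.full(50, 120.0)]
rms_soft = np.r_[np.full(50, 1e-4), np.linspace(1e-3, 1.4e-3, 50)]
print(onset_too_fast(f0, rms_soft))   # gentle ramp, roughly 7 dB/s -> False
```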

  10. Article ; Online: Automatic detection of articulation disorders in children with cleft lip and palate.

    Maier, Andreas / Hönig, Florian / Bocklet, Tobias / Nöth, Elmar / Stelzle, Florian / Nkenke, Emeka / Schuster, Maria

    The Journal of the Acoustical Society of America

    2009  Volume 126, Issue 5, Page(s) 2589–2602

    Abstract Speech of children with cleft lip and palate (CLP) is sometimes still disordered even after adequate surgical and nonsurgical therapies. Such speech shows complex articulation disorders, which are usually assessed perceptually, consuming time and manpower. Hence, there is a need for an easy-to-apply and reliable automatic method. To create a reference for an automatic system, speech data of 58 children with CLP were assessed perceptually by experienced speech therapists for characteristic phonetic disorders at the phoneme level. The first part of the article aims to detect such characteristics by a semiautomatic procedure and the second to evaluate a fully automatic, thus simpler, procedure. The methods are based on a combination of speech processing algorithms. The semiautomatic method achieves moderate to good agreement (kappa approximately 0.6) for the detection of all phonetic disorders. On the speaker level, significant correlations of 0.89 between the perceptual evaluation and the automatic system are obtained. The fully automatic system yields a correlation of 0.81 to the perceptual evaluation on the speaker level. This correlation is in the range of the inter-rater correlation of the listeners. The automatic speech evaluation is able to detect phonetic disorders at an expert's level without any additional human postprocessing.
    MeSH term(s) Algorithms ; Articulation Disorders/diagnosis ; Articulation Disorders/etiology ; Child ; Cleft Lip/complications ; Cleft Palate/complications ; Humans ; Models, Biological ; Phonation ; Phonetics ; Psycholinguistics ; Speech Therapy
    Language English
    Publishing date 2009-11
    Publishing country United States
    Document type Journal Article ; Research Support, Non-U.S. Gov't
    ZDB-ID 219231-7
    ISSN (online) 1520-8524
    ISSN (print) 0001-4966
    DOI 10.1121/1.3216913
    Database MEDical Literature Analysis and Retrieval System OnLINE

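The agreement figures quoted in the abstract (kappa of roughly 0.6, speaker-level correlations of 0.89 and 0.81) are standard measures; the sketch below shows how such values are typically computed with scikit-learn and SciPy on invented ratings, not the study's data.

```python
# How agreement figures like those in the abstract are typically computed
# (Cohen's kappa for per-phoneme decisions, Pearson r for per-speaker scores).
# The ratings below are invented for illustration.
from scipy.stats import pearsonr
from sklearn.metrics import cohen_kappa_score

therapist = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]     # perceptual decisions per phoneme
automatic = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]     # automatic decisions per phoneme
print("kappa:", cohen_kappa_score(therapist, automatic))

perceptual_scores = [2.1, 3.4, 1.0, 4.2, 2.8]  # per-speaker perceptual severity
system_scores     = [2.0, 3.1, 1.3, 4.0, 3.0]  # per-speaker automatic score
r, p = pearsonr(perceptual_scores, system_scores)
print("Pearson r:", r)
```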
