Speech and Audio Signal Processing

In applications such as automotive hands-free telephony or speech dialogue systems, the desired speech signal is disturbed by background noise (engine noise, wind noise, etc.) and by echoes (caused by multipath propagation from a loudspeaker to a microphone). To reduce these disturbing components while keeping the speech signal as natural as possible, multi-channel adaptive signal enhancement algorithms are used.

In the field of speech enhancement we focus mainly on three applications:

  • high-quality conference systems (supporting audio and video communication),
  • speech enhancement schemes (hands-free and in-car communication) for cars, and
  • signal enhancement for breathing protection masks.

The picture on the right shows one of our test labs. A system with about 20 microphones (4 of them are visible in the picture as black dots on the table) and about 10 loudspeakers is installed in that room. We can make conference calls either to other, similarly equipped rooms or to one of our test cars.

Another application that we are focussing on, as mentioned above, is in-car communication. Here, we address the following problem: since the conversational partners in a vehicle do not face each other directly, the speech signal is attenuated considerably on its way to the listener, and driving noise further decreases the signal-to-noise ratio at the listener’s ears. An in-car communication (ICC) system resolves this problem by recording the talker’s speech with microphones and playing it back over loudspeakers located close to the listening passengers. Because the listener always perceives a mixture of the direct sound and the loudspeaker signal, the system has to be designed to operate with a small delay of less than 15 ms. In addition, such a system operates in a closed electroacoustic loop, so system stability becomes a critical issue.
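As a rough illustration of these two constraints (the delay bound and the stability of the closed electroacoustic loop), the following Python sketch checks a buffering delay against the 15 ms limit and applies a simple sufficient stability test: the magnitude of the open-loop gain must stay below one at all frequencies. The function names, the toy feedback path, and the gain values are assumptions made for illustration only; they are not taken from the systems described here, and a real ICC system would rely on adaptive feedback cancellation rather than a fixed gain check.

```python
import numpy as np

MAX_DELAY_MS = 15.0  # delay bound quoted in the text above


def processing_delay_ms(buffer_samples: int, sample_rate_hz: float) -> float:
    """Delay (in ms) caused by buffering `buffer_samples` samples."""
    return 1000.0 * buffer_samples / sample_rate_hz


def within_delay_budget(buffer_samples: int, sample_rate_hz: float) -> bool:
    """True if the buffering delay alone stays below the 15 ms bound."""
    return processing_delay_ms(buffer_samples, sample_rate_hz) < MAX_DELAY_MS


def max_loop_gain(feedback_ir: np.ndarray, system_gain: float,
                  n_fft: int = 4096) -> float:
    """Largest open-loop magnitude |system_gain * H(f)| on an FFT grid.

    `feedback_ir` is a (hypothetical) loudspeaker-to-microphone impulse
    response. Sampling the spectrum on a grid can slightly underestimate
    the true maximum, so this is only an approximate check.
    """
    spectrum = np.fft.rfft(feedback_ir, n=n_fft)
    return float(abs(system_gain) * np.max(np.abs(spectrum)))


def is_loop_stable(feedback_ir: np.ndarray, system_gain: float) -> bool:
    """Sufficient (not necessary) condition: open-loop magnitude below 1."""
    return max_loop_gain(feedback_ir, system_gain) < 1.0


# Toy feedback path (assumption): a single reflection, attenuated by 0.5.
toy_ir = np.zeros(64)
toy_ir[10] = 0.5

print("delay for 128 samples at 16 kHz:", processing_delay_ms(128, 16000), "ms")
print("stable with gain 1.5:", is_loop_stable(toy_ir, 1.5))
print("stable with gain 2.5:", is_loop_stable(toy_ir, 2.5))
```

The magnitude test is deliberately conservative: a loop can be stable even if the open-loop magnitude exceeds one at some frequency, provided the phase condition for feedback is not met there. The sketch ignores that case.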

For our research on automotive hands-free and in-car communication systems we run simulations (both offline and in real time), and we also investigate the behavior of our algorithms in real environments. For that reason we have several systems installed in different kinds of cars. The picture on the left shows one of them. Our cars are equipped with several conventional and some "non-conventional" microphones as well as with several loudspeakers. Thus, we can investigate a wide range of system configurations.

In addition to the design of speech enhancement algorithms such as localization and beamforming, echo and feedback cancellation, noise reduction, or bandwidth extension, we also investigate the automatic evaluation of the quality of such systems. For that purpose, several subjective and objective tests are investigated. Since such tests require a realistic environment, we also do research on realistic environment simulations.

In the broader context of automatic system evaluation, we also investigate the quality of transmitted speech in general. Here, the focus lies on the quality of speech after an entire signal processing chain. For telecommunication scenarios, such a chain may include speech enhancement, source coding, and network transmission. While the overall quality of transmitted speech is, of course, of interest, the main goal of our research is to additionally identify the technical causes of sub-optimal quality within the processing chain.

 

Corresponding Publications:

C. Marquard, C. Baasch, M. Brodersen, O. Niebuhr, and G. Schmidt: Speech, Think, Act: A Phonetic Analysis of the Combinatorial Effects of Respiratory Mask, Physical and Cognitive Stress on Phonation and Articulation, Proc. DAGA, Kiel, Germany, 2017

G. Schmidt and A. Theiß: Automatic Evaluation of In-Car Communication Systems, Proc. DAGA, Kiel, Germany, open access, 2017

P. Bulling, K. Linhard, A. Wolf, and G. Schmidt: Approximation of the Optimum Stepsize for Acoustic Feedback Cancellation Based on the Detection of Reverberant Signal Periods, Proc. DAGA, Kiel, Germany, open access, 2017

S. Graf, N. Zaidi, M. Buck, and G. Schmidt: Detection of Voiced Speech and Pitch Estimation for Applications with a Low Spectral Resolution, Proc. DAGA, Kiel, Germany, open access, 2017

M. Gimm, K. Rebbe, and G. Schmidt: Echtzeitsystem zur mehrkanaligen Breitbandtelefonie, Proc. DAGA, Kiel, Germany, open access, 2017

T. Maschmann, M. Gimm, V. Kandade Rajan, and G. Schmidt: Implementation of a new Method for Noise Suppression in Automotive Environments, Proc. DAGA, Kiel, Germany, open access, 2017

M. Brodersen, B. Gröger, and G. Schmidt: Verbesserung der Sprachverständlichkeit für Atemschutzmasken mittels Signalbearbeitung mit nichtlinearen Kennlinien, Proc. DAGA, Kiel, Germany, 2017

R. Landgraf, G. Schmidt, J. Köhler-Kaeß, O. Niebuhr, and T. John: More Noise, Less Talk - The Impact of Driving Noise and ICC Systems on Acoustic-prosodic Parameters in Dialogue, Proc. DAGA, Kiel, Germany, open access, 2017

P. Bulling, K. Linhard, A. Wolf, G. Schmidt: Acoustic Feedback Compensation with Reverb-based Stepsize Control for In-car Communication Systems, ITG Speech, October 2016

S. Graf, T. Herbig, M. Buck, G. Schmidt: Kurtosis-Controlled Babble Noise Suppression, ITG Speech, October 2016

S. Graf, T. Herbig, M. Buck, G. Schmidt: Voice Activity Detection Based on Modulation-Phase Differences, ITG Speech, October 2016

M. Brodersen, T. M. Jüngling, G. Schmidt: Evaluation of Communication Systems for Full-Face Firefighter Masks, ITG Speech, October 2016

C. Baasch, G. Schmidt, U. Heute, A. Nebel, G. Deuschl: Parkinson-Speech Analysis: Methods and Aims, ITG Speech, October 2016

R. Landgraf, J. Köhler-Kaeß, C. Lüke, O. Niebuhr, G. Schmidt: Can You Hear Me Now? Reducing the Lombard Effect in a Driving Car Using an In-Car Communication System, Proc. Speech Prosody, pp. 479 - 483, 2016

C. Baasch, W. Schmidt, G. Schmidt, U. Heute, A. Baumann, A. Nebel, G. Deuschl, T. von Eimeren: Stimmtherapie für Parkinsonsprache: Akustische Analyse der Wirksamkeit, ESSV 2016, Leipzig, Germany

L. Jaschke, C. Baasch, G. Schmidt, A. Nebel, G. Deuschl: Level-correct Speech Recordings for the Analysis of Parkinson Speech, DAGA 2016

P. Bulling, K. Linhard, A. Wolf, G. Schmidt, A. Theiß, M. Gimm: Nichtlineare Kennlinien zur Verbesserung der Sprachverständlichkeit in geräuschbehafteter Umgebung, DAGA 2016

V. K. Rajan, M. Krini, K. Rodemer, G. Schmidt: Signal Processing Techniques for Seat belt Microphone Arrays, EURASIP Journal on Advances in Signal Processing, vol. 92, 2016

S. Graf, T. Herbig, M. Buck, G. Schmidt: Features for Voice Activity Detection: A Comparative Analysis, EURASIP Journal on Advances in Signal Processing, vol. 91, open access, 2015

A. Theiß, G. Schmidt: Spectral Distance Analysis for Quality Estimation of In-Car Communication Systems, 7th Biennial Workshop on DSP for In-Vehicle Systems and Safety 2015, Berkeley, CA, USA

A. Warhadpande, C. Lüke, A. Theiß, G. Schmidt: Improvement by Adding Video Feature in an Acoustic Ambiance Simulation for Automobiles, 7th Biennial Workshop on DSP for In-Vehicle Systems and Safety 2015, Berkeley, CA, USA

O. Niebuhr, B. Peters, R. Landgraf, G. Schmidt: The Kiel Corpora of "Speech & Emotion" - A Summary, Proc. DAGA 2015, March 16-19, 2015, Nürnberg, Germany

M. Krini, V. K. Rajan, K. Rodemer, G. Schmidt: Adaptive Beamforming for Microphone Arrays on Seat Belts, Proc. DAGA 2015, March 16-19, 2015, Nürnberg, Germany

J. Friedrich, A. Wolf, K. Linhard, S. Senkbeil, G. Schmidt, H. Schnepp: Subjektive Präferenzen eines Stereo-Vollband-Freisprechsystems, DAGA 2015, Nürnberg, Germany

R. Landgraf, O. Niebuhr, G. Schmidt, T. John, C. Lüke, A. Theiß: Von der Straße ins Labor: Die Modifikation der Sprachproduktion bei lauten Fahrgeräuschen, DAGA 2015, Nürnberg, Germany

S. Graf, A. Theiß, T. Herbig, G. Schmidt: Listening Test to Determine the Mismatch Between Signal-To-Noise Ratio and Human Perception, DAGA 2015, Nürnberg, Germany

C. Lüke, A. Wolf, M. Brodersen, G. Schmidt: Digitale Simulation der Fahrzeuginnenraumakustik zur Unterstützung der Entwicklung und Evaluierung von Innenraum-Kommunikationssystemen, DAGA 2015, Nürnberg, Germany

M. Brodersen, A. Volmer, M. Romba, G. Schmidt: Sprachaktivitätserkennung mittels eines Mustererkenners für Atemschutzmasken, Proc. DAGA 2015, March 16-19, 2015, Nürnberg, Germany

C. R. Norrenbrock, F. Hinterleitner, U. Heute, S. Möller: Quality prediction of synthesized speech based on perceptual quality dimensions, Speech Communication, Volume 66, Pages 17-35, 2015

T. John, R. Landgraf, C. Lüke, S. Rohde, G. Schmidt, A. Theiß, J. Withopf: Über die Verbesserung der Sprachkommunikation in geräuschbehafteten Umgebungen, in O. Niebuhr (ed.): Formen des Nichtverstehens, Peter Lang, December 2014 (in German)

S. Graf, T. Herbig, M. Buck, G. Schmidt: Improved Performance Measures for Voice Activity Detection, Proc. ITG 2014, September 24-26, 2014, Erlangen, Germany

V. K. Rajan, C. Baasch, G. Schmidt, M. Krini: Improvement in Listener Comfort Through Noise Shaping Using a Modified Wiener Filter Approach, Proc. ITG 2014, September 24-26, 2014, Erlangen, Germany

C. Baasch, V. K. Rajan, G. Schmidt, M. Krini: Low-Complexity Noise Power Spectral Density Estimation For Harsh Automobile Environments, Proc. International Workshop on Acoustic Signal Enhancement IWAENC 2014, Antibes, France

J. Withopf, S. Rohde, G. Schmidt: Application of Frequency Shifting in In-Car Communication Systems, Proc. ITG Fachtagung Sprachkommunikation 2014, Erlangen, Germany

A. Theiß, G. Schmidt, J. Withopf, C. Lüke: Instrumental Evaluation of In-Car Communication Systems, Proc. ITG Fachtagung Sprachkommunikation 2014, Erlangen, Germany

J. Withopf, G. Schmidt: Estimation of Time-variant Acoustic Feedback Paths in In-Car Communication Systems, Proc. International Workshop on Acoustic Signal Enhancement IWAENC 2014, Antibes, France

A. Theiß, G. Schmidt: Investigation of Self-Masking Effects for the Evaluation of In-Car Communication Systems, Proc. International Workshop on Acoustic Signal Enhancement IWAENC 2014, Antibes, France

S. Stenzel: Multichannel Signal Processing for Spatially Distributed Microphones, Shaker, September 2014

R. Landgraf: Are you serious? Irony and the perception of emphatic intensification, Proc. of the 4th International Symposium on Tonal Aspects of Languages (TAL) 2014, Nijmegen, Netherlands

C. Lüke, G. Schmidt, A. Theiß, J. Withopf: In-Car Communication, in G. Schmidt, H. Abut, K. Takeda, J. Hansen (eds.), Smart Mobile In-Vehicle Systems, Springer, January 2014

M. Krini, G. Schmidt: Refinement and Temporal Interpolation of Short-Term Spectra: Theory and Applications, in G. Schmidt, H. Abut, K. Takeda, J. Hansen (eds.), Smart Mobile In-Vehicle Systems, Springer, January 2014

M. Christoph: Untersuchung verschiedener Verfahren zur messtechnischen Bestimmung und Nachbildung der Akustik insbesondere von Fahrzeugaudiosystemen, Shaker, November 2013

F. Hinterleitner, C. R. Norrenbrock, S. Möller: Is Intelligibility Still the Main Problem? A Review of Perceptual Quality Dimensions of Synthetic Speech, Proc. 8th ISCA Speech Synthesis Workshop, Barcelona, Spain, 2013

K. G. Mideksa, A. Khan, G. Deuschl, U. Heute, M. Muthuraman: Dipole Source Analysis for Identifying the Location of Deep Brain Stimulation Electrodes in Parkinson's Patients, Proc. EMBC 2013, JSMBE-EMBS, Osaka, Japan, 2013

T. John, O. Niebuhr, G. Schmidt, and A. Theiß: Phonetic Analyses vs. Dirty Signals: Fixing the Paradox, Proc. ESSV, Bielefeld, Germany, 2013

V. K. Rajan, S. Rohde, G. Schmidt, J. Withopf: Signal Processing for Microphone Arrays on Seat Belts, 6th Biennial DSP Workshop for In-Vehicle Systems 2013, Seoul, Korea

C. Lüke, A. Theiß, G. Schmidt, O. Niebuhr, T. John: Creation of a Lombard Speech Database using an Acoustic Ambiance Simulation with Loudspeakers, 6th Biennial DSP Workshop for In-Vehicle Systems 2013, Seoul, Korea

F. Hinterleitner, C. R. Norrenbrock, S. Möller, U. Heute: What Makes this Voice Sound so Bad? A Multidimensional Analysis of state-of-the-art Text-to-Speech Systems, Proc. IEEE SLT Workshop, Miami, Florida, USA, 2012 (Best Paper Award)

C. R. Norrenbrock, F. Hinterleitner, U. Heute, S. Möller: Towards Perceptual Quality Modeling of Synthesized Audiobooks-Blizzard Challenge 2012, Blizzard Challenge Workshop, Portland, OR, USA, 2012

C. R. Norrenbrock, F. Hinterleitner, U. Heute, S. Möller: Quality Analysis of Macroprosodic F0 Dynamics in Text-to-Speech Signals, Proceedings Interspeech, Portland, OR, USA, 2012

G. Schmidt, A. Theiß, J. Withopf, A. Wolf: Evaluation of In-Car Communication Systems, in J. Hansen, P. Boyraz, K. Takeda, H. Abut (eds.), Digital Signal Processing for In-Vehicle Systems and Safety, Springer, January 2012

C. R. Norrenbrock, F. Hinterleitner, U. Heute: On the Use of Vocal-Tract Approximations for Instrumental Quality Assessment, ITG Fachtagung Sprachkommunikation, Braunschweig, Germany, 2012

S. Möller, U. Heute: Dimension-based Diagnostic Prediction of Speech Quality, ITG 2012, Braunschweig, Germany, 2012

F. Hinterleitner, C. R. Norrenbrock, S. Möller: On the Use of Fujisaki Parameters for the Quality Prediction of Synthetic Speech, Proc. ESSV, Cottbus, Germany, 2012

M. Krini, G. Schmidt: Method for Temporal Interpolation of Short-term Spectra and its Application to Adaptive System Identification, Proc. ICASSP, Kyoto, Japan, 2012

C. R. Norrenbrock, F. Hinterleitner, U. Heute: On Prosodic Quality of Text-to-Speech Signals, DAGA 2012, Darmstadt, Germany, 2012

C. R. Norrenbrock, F. Hinterleitner, U. Heute, S. Möller: Instrumental Assessment of Prosodic Quality for Text-to-Speech Signals, IEEE Signal Processing Letters, vol. 19, no. 5, 2012

C. R. Norrenbrock, F. Hinterleitner, U. Heute, S. Möller: Towards a better Understanding of TTS Synthesis: Subjective Quality and its Instrumental Assessment, Proc. ESSV 2011, Aachen, Germany, 2011

F. Hinterleitner, S. Möller, C. R. Norrenbrock: An Evaluation Protocol for the Subjective Assessment of Text-to-Speech in Audiobook Reading Tasks, Proc. Blizzard Challenge 2011 workshop, Turin, Italy, 2011

C. R. Norrenbrock, U. Heute, F. Hinterleitner, S. Möller: Aperiodicity Analysis for Quality Estimation of Text-to-Speech Signals, Proc. Interspeech 2011, Florence, Italy, 2011

F. Hinterleitner, S. Möller, C. R. Norrenbrock, U. Heute: Perceptual Quality Dimensions of Text-to-Speech Systems, Proc. Interspeech 2011, Florence, Italy, 2011

C. R. Norrenbrock, U. Heute, F. Hinterleitner, S. Möller: Quality Estimation of Text-To-Speech Signals, Proc. DAGA 2011, Düsseldorf, Germany, 2011

F. Hinterleitner, S. Möller, C. R. Norrenbrock, U. Heute: Comparison of Approaches for Instrumentally Predicting the Quality of Text-to-Speech Systems: Data from the Blizzard Challenge 2010, Proc. DAGA 2011, Düsseldorf, Germany, 2011

M. Buck, E. Hänsler, M. Krini, G. Schmidt, T. Wolff: Acoustic Array Processing for Speech Enhancement, in S. Haykin, K. J. R. Liu (eds.), Handbook on Array Processing and Sensor Networks, Wiley-IEEE Press, January 2010.

P. Hannon, G. Schmidt, M. Krini, A. Wolf: Reducing the Complexity or the Delay of Adaptive Subband Filtering, Proc. ESSV 2010, pp. 158 - 165, Berlin, Germany, 2010

A. Wolf, B. Iser, G. Schmidt: Laufzeitoptimierte Geräuschreduktionsverfahren basierend auf overlap-save-Strukturen mit Projektionsfilternäherungen, Proc. ESSV 2010, pp. 134 - 141, Berlin, Germany, 2010

J. Withopf, P. Hannon, M. Krini, G. Schmidt: Phoneme-Dependent Speech Enhancement, Proc. ITG-Fachtagung 2010, Bochum, Germany, 2010

Contact

Prof. Dr.-Ing. Gerhard Schmidt

E-Mail: gus@tf.uni-kiel.de

Christian-Albrechts-Universität zu Kiel
Faculty of Engineering
Institute for Electrical Engineering and Information Engineering
Digital Signal Processing and System Theory

Kaiserstr. 2
24143 Kiel, Germany
