Signal processing techniques for robust sound event recognition

dc.contributor.advisor Ferri Rabasa, Francesc Josep
dc.contributor.advisor Cobos Serrano, Máximo
dc.contributor.author Martín Morató, Irene
dc.contributor.other Departament d'Informàtica es_ES
dc.date.accessioned 2019-11-25T12:26:17Z
dc.date.available 2019-11-26T05:45:05Z
dc.date.issued 2019 es_ES
dc.date.submitted 25-11-2019 es_ES
dc.identifier.uri https://hdl.handle.net/10550/72345
dc.description.abstract The computational analysis of acoustic scenes is today a topic of major interest, with a growing community focused on designing machines capable of identifying and understanding the sounds produced in our environment, much as humans do. Although this field has not reached the industrial popularity of related audio domains such as speech recognition or music analysis, the number of applications designed to identify the occurrence of sounds in a given scenario is rapidly increasing. These applications are usually limited to a set of sound classes, which must be defined beforehand. To train sound classification models, representative sets of sound events are recorded and used as training data. However, the acoustic conditions present during the collection of training examples may not coincide with the conditions during application testing. Background noise, overlapping sound events or weakly segmented data, among other factors, may substantially affect the audio data and lower the actual performance of the learned models. To avoid such situations, machine learning systems have to be designed with the ability to generalize to data collected under conditions different from those seen during training. Traditionally, the techniques used for tasks related to the computational understanding of sound events have been inspired by similar domains such as music or speech, so the features selected to represent acoustic events come from those specific domains. Most of the contributions of this thesis concern how such features can be suitably applied to sound event recognition, proposing specific methods to adapt the extracted features both within classical recognition approaches and within modern end-to-end convolutional neural networks.
The objective of this thesis is therefore to develop novel signal processing techniques aimed at making the features that represent acoustic events more robust to adverse conditions that cause a mismatch between training and test conditions in model learning. To achieve this objective, we first analyze the importance of classical feature sets such as Mel-frequency cepstral coefficients (MFCCs) or the energies extracted from log-mel filterbanks, as well as the impact of noise, reverberation or segmentation errors in diverse scenarios. We show that the performance of both classical and deep learning-based approaches is severely affected by these factors, and we propose novel signal processing techniques designed to improve their robustness by means of a non-linear transformation of feature vectors along the temporal axis. This transformation is based on the so-called event trace, which can be interpreted as an indicator of the temporal activity of the event within the feature space. Finally, we propose the use of the energy envelope as a target for event detection, which implies a change from a classification-based approach to a regression-oriented one. es_ES
dc.format.extent 141 p. es_ES
dc.language.iso en es_ES
dc.subject audio classification es_ES
dc.subject support vector machines es_ES
dc.subject deep learning es_ES
dc.subject feature selection es_ES
dc.subject sound event recognition es_ES
dc.title Signal processing techniques for robust sound event recognition es_ES
dc.type info:eu-repo/semantics/doctoralThesis es_ES
dc.subject.unesco UNESCO::CIENCIAS TECNOLÓGICAS es_ES
dc.description.abstractenglish es_ES
dc.embargo.terms 0 days es_ES
