Konstantinos Drossos Archives

Design Recommendations for a Collaborative Game of Bird Call Recognition Based on Internet of Sound Practices

Post published: December 16, 2021
Post category: Journal / Magazine Publications

Detailed info: Design Recommendations for a Collaborative Game of Bird Call Recognition Based on Internet of Sound Practices. Authors: Rovithis, Emmanouel; Moustakas, Nikolaos; Vogklis, Konstantinos; Drossos, Konstantinos; Floros, Andreas. Title: Design Recommendations for…

Enriched Music Representations With Multiple Cross-Modal Contrastive Learning

Post published: November 19, 2021
Post category: Journal / Magazine Publications

Modeling various aspects that make a music piece unique is a challenging task, requiring the combination of multiple sources of information. Deep learning is commonly used to obtain representations using various sources of information, such as the audio, interactions between users and songs, or associated genre metadata.

WaveTransformer: An Architecture for Audio Captioning Based on Learning Temporal and Time-Frequency Information

Post published: November 18, 2021
Post category: Publications

Automated audio captioning (AAC) is a novel task in which a method takes an audio sample as input and outputs a textual description (i.e., a caption) of its contents.

Continual Learning for Automated Audio Captioning Using The Learning Without Forgetting Approach

Post published: November 18, 2021
Post category: Publications

Automated audio captioning (AAC) is the task of automatically creating textual descriptions (i.e., captions) for the contents of a general audio signal. Most AAC methods use existing datasets for optimization and/or evaluation.

Assessment of Self-Attention on Learned Features For Sound Event Localization and Detection

Post published: November 18, 2021
Post category: Publications

Joint sound event localization and detection (SELD) is an emerging audio signal processing task that adds spatial dimensions to acoustic scene analysis and sound event detection.

Fairness and underspecification in acoustic scene classification: The case for disaggregated evaluations

Post published: October 5, 2021
Post category: Publications

Underspecification and fairness in machine learning (ML) applications have recently become two prominent issues in the ML community. Acoustic scene classification (ASC) applications have so far remained unaffected by this discussion, but are now increasingly used in real-world systems where fairness and reliability are critical.

Automatic Analysis of the Emotional Content of Speech in Daylong Child-Centered Recordings from a Neonatal Intensive Care Unit

Post published: September 22, 2021
Post category: Publications

Researchers have recently started to study how the emotional speech heard by young infants can affect their developmental outcomes. As part of this research, hundreds of hours of daylong recordings from preterm infants’ audio environments were collected from two hospitals in Finland and Estonia in the context of the so-called APPLE study.