Cog Av Hearing - Research Outputs - Datasets


ASPIRE is a first-of-its-kind audiovisual speech corpus recorded in real noisy environments (such as cafes and restaurants), which can be used to support reliable evaluation of multi-modal speech filtering technologies. The corpus is available to download at:

Audiovisual GRID + ChiME3 Corpus

A new AV-ChiME3 corpus has been developed by combining clean Grid Corpus audio/video with ChiME3 noises over a wide range of SNRs, from -12 dB to 12 dB. The corpus is available to download at:
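Mixing clean speech with noise at a chosen SNR, as described above, can be illustrated with a minimal sketch. This is not the procedure actually used to build the corpus, which this page does not specify. The function names `mean_power`, `mix_at_snr` and `measure_snr_db`, the 16 kHz sample rate, and the synthetic sine and Gaussian signals are all assumptions made for illustration. Real use would load Grid utterances and ChiME3 noise recordings in their place.

```python
import math
import random


def mean_power(signal):
    """Average power (mean of squared samples) of a sequence of floats."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, target_snr_db):
    """Return (mixture, scaled_noise) so that speech/noise power matches target_snr_db.

    Assumes both signals share a sample rate and that `noise` is at least
    as long as `speech`; surplus noise samples are dropped.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("signals must not be silent")
    # SNR_dB = 10 * log10(p_speech / (gain**2 * p_noise)), solved for gain.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (target_snr_db / 10)))
    scaled_noise = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled_noise)]
    return mixture, scaled_noise


def measure_snr_db(speech, noise):
    """SNR in dB between a speech signal and the noise added to it."""
    return 10 * math.log10(mean_power(speech) / mean_power(noise))


# Synthetic stand-ins (hypothetical): one second at an assumed 16 kHz rate.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

# Check the two ends and the midpoint of the -12 dB to 12 dB range.
for target in (-12, 0, 12):
    mixture, scaled = mix_at_snr(speech, noise, target)
    print(target, round(measure_snr_db(speech, scaled), 6))
```

The speech is left unscaled and only the noise gain changes, so the same clean reference can be reused at every SNR. Whether the corpus scales the noise or the speech is not stated here.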

Primary Audiovisual Dataset

The main development of all strands of the project is taking place using the large audiovisual Grid Corpus, which was developed by project members from the University of Sheffield and has been very widely used. This corpus consists of a very large collection of audio waveform files and associated videos, with transcriptions and alignment data available.

Grid Corpus Homepage

Speech Performance Evaluation

To evaluate the final approaches developed in this project, we make use of the data provided by the CHiME speech-in-noise challenge. This challenge, held a number of years ago, used the Grid corpus mixed with noise. Many audio-only speech processing systems were evaluated on it, so their results provide a benchmark. The data suit our purposes because the Grid sentences also have associated visual data, which lets us compare audiovisual approaches against those audio-only results.

CHiME Challenge Homepage