CogAvHearing

Latest Updates

Online Prototype Demo available at: https://cogbid.github.io/cogavhearingdemo/

Ongoing deep-learning-based lip reading and speech enhancement work

Interspeech 2017, satellite workshop on hearing aid research: Challenges for Hearing Assistive Technology (CHAT-Aug, 2017, Stockholm, Sweden)

MRC network meeting, Stirling, Feb 2, 2017

Audiovisual Dataset for Audiovisual Speech Mapping: http://hdl.handle.net/11667/81

Enhanced visually derived Wiener filters for speech enhancement

Last Updated: March 2019

Cog Av Hearing

Welcome to the Cog AV Hearing project page. This is a follow-on to the EPSRC-funded project that commenced on 1 October 2015 and ended in March 2019. All project outputs, including datasets, demos, reports and papers, and related activities and events of interest, are publicly available. Please check back for regular updates. For any project-related queries, please e-mail the project lead, Professor Amir Hussain, at a.hussain@napier.ac.uk.

Project Aims

This ambitious collaborative project aims to address the EPSRC research challenge "Speech-in-noise performance in hearing aid devices" and the long-standing challenge of developing disruptive assistive listening technology that can help improve the quality of life of the 10 million people in the UK suffering from some form of hearing loss. We aim to develop devices that mimic the unique human ability to focus hearing on a single talker, effectively ignoring background distractor sounds regardless of their number and nature.

This research is a first attempt at developing a cognitively inspired, adaptive and context-aware audio-visual (AV) processing approach that combines audio and visual cues (e.g. from lip movement) to enhance speech intelligibility. A publicly available, multi-modal speech enhancement framework, pioneered by Prof Hussain, is being significantly extended to incorporate models of auditory and AV scene analysis developed by Dr Barker's group at Sheffield. Further, novel computational models and theories of human vision are being explored to enable real-time tracking of facial features. Novel multi-modality selection mechanisms are being developed, and ongoing collaborations with Phonak and MRC IHR (Glasgow) will facilitate delivery of a clinically tested software prototype.