Google has developed an AI that can pick out someone’s voice in a noisy room.
By looking at people’s faces when they’re speaking, the firm has trained a machine to spot individual people as they talk and isolate each voice from the background noise.
Technology such as this could easily be used to spy on secret conversations.
However, Google claims its applications are likely to be in clearing up speech in video calls and boosting the power of hearing aids.
To learn how to isolate a voice, the AI was taught to recognise the speech of individuals in a quiet environment.
It did this by analysing how their faces changed as they spoke different words at different frequencies.
Google then created virtual ‘cocktail parties’ that included significant background noise and several people speaking at once.
The system recorded which frequencies were most likely to belong to a given speaker and created an audio track of that person’s speech.
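The idea of assigning frequencies to a speaker can be illustrated with a toy example. The sketch below is not Google's system: it assumes two pure tones as stand-ins for two speakers, mixes them into a miniature 'cocktail party', and then uses a frequency mask to pull one of them back out. The mask here is computed from the known clean signals (an 'oracle'), whereas the real system would have to predict it from the audio and the speaker's face. All function and variable names are hypothetical.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform (fine for short toy signals)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse transform, returning the real part of each sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def tone(freq_bin, n, amplitude=1.0):
    """A sine wave that completes freq_bin cycles over n samples."""
    return [amplitude * math.sin(2 * math.pi * freq_bin * t / n) for t in range(n)]


N = 64
speaker_a = tone(3, N)           # stand-in for the voice we want to hear
speaker_b = tone(10, N, 0.8)     # stand-in for a competing voice
mixture = [a + b for a, b in zip(speaker_a, speaker_b)]  # the 'cocktail party'

spec_a = dft(speaker_a)
spec_b = dft(speaker_b)
spec_mix = dft(mixture)

# Oracle ratio mask: for each frequency, the share of energy belonging to speaker A.
# ASSUMPTION: we cheat by using the clean signals; a trained model would estimate this.
EPS = 1e-12
mask = [abs(a) / (abs(a) + abs(b) + EPS) for a, b in zip(spec_a, spec_b)]

# Keep only the frequencies attributed to speaker A and turn them back into audio.
recovered = idft([m * x for m, x in zip(mask, spec_mix)])

max_error = max(abs(r - a) for r, a in zip(recovered, speaker_a))
mixture_error = max(abs(m - a) for m, a in zip(mixture, speaker_a))
```

In this toy case the two 'voices' occupy separate frequencies, so the mask recovers speaker A almost perfectly (`max_error` is close to zero, while `mixture_error` is not). Real voices overlap in frequency and change over time, which is why a practical system needs masks that vary from moment to moment and extra cues such as lip movement.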
The AI worked even when several people competed to speak at one time.
It also worked if the person partially obscured their face with hand gestures.
In a blog post, Google said: ‘All that is required from the user is to select the face of the person in the video they want to hear, or to have such a person be selected algorithmically based on context.’
In the research published by Google, the scientists explained how this is done: ‘The visual features are used to “focus” the audio on desired speakers in a scene and to improve the speech separation quality.’
‘We believe this capability can have a wide range of applications, from speech enhancement and recognition in videos, through video conferencing, to improved hearing aids, especially in situations where there are multiple people speaking.’