Google says its voice search system is now more accurate, especially in noisy places

Jordan Novet @jordannovet September 24, 2015 10:53 AM

Google voice search on the web.

Image Credit: Screenshot

If you’ve noticed Google’s speech recognition doing a better job of understanding what you say on your smartphone lately, you’re not crazy. Google’s voice search has indeed become more accurate, thanks to advances in artificial intelligence, the tech company announced today.

“Today, we’re happy to announce we built even better neural network acoustic models using Connectionist Temporal Classification (CTC) and sequence discriminative training techniques,” Google Speech Team members Haşim Sak, Andrew Senior, Kanishka Rao, Françoise Beaufays and Johan Schalkwyk wrote in a blog post today. “These models are a special extension of recurrent neural networks (RNNs) that are more accurate, especially in noisy environments, and they are blazingly fast!”

The new models are now in use in the Google app for iOS and Android, as well as in dictation on Android, which works inside some third-party apps, the team members wrote.

Google has reported improvements in voice search not once but twice this year. Clearly the company has been investing in the underlying technology. RNNs are one increasingly popular approach to deep learning, a type of artificial intelligence, and Google is widely thought to have a deep bench in the field.

But Apple and Microsoft, among others, have also been working to improve their voice recognition capabilities. Facebook is doing more in the area as well, having acquired a speech recognition company, Wit.ai, earlier this year.

Speech could become more important as an input to searching the Web in the years to come. Baidu’s Andrew Ng, who is known for his work on the so-called Google Brain, last year predicted that within five years “50 percent of queries will be on speech or images.”

“In addition to requiring much lower computational resources, the new models are more accurate, robust to noise, and faster to respond to voice search queries — so give it a try, and happy (voice) searching!” wrote Sak, Senior, Rao, Beaufays, and Schalkwyk.

Read the full blog post for more detail on how the team managed to get the new performance gains.
