Facebook builds a mobile app that answers spoken questions about photos with AI

Jordan Novet @jordannovet November 3, 2015 4:00 AM

The Visual Q&A demonstration app from Facebook.

Image Credit: Screenshot

Facebook today showed off the latest progress in its artificial intelligence research. The most impressive achievement is a new system that lets people ask questions about photos on Facebook using voice input and then receive audible answers about the photos in response.

The mobile-based system, called Visual Q&A, shows how Facebook can use multiple approaches to a type of artificial intelligence called “deep learning”—specifically, convolutional neural networks and more modern end-to-end memory networks—to turn its existing data into something that can be consumed by an audience that normally wouldn’t be able to access it.

“Think of what this might mean to the 285 million people globally who have low vision capabilities or the 40 million who are blind. Instead of being left out of the experience when friends share photo content, they’ll be able to participate,” Facebook chief technology officer Mike Schroepfer wrote in a blog post on the news.

Facebook isn’t releasing this app to the public, but the company has allowed some people to try it out, and a video documenting the research is pretty moving.

This work builds on Facebook’s ability to answer questions about textual information using AI, which Schroepfer demonstrated on stage at the company’s F8 developer conference in March.

Google and Baidu, among other companies, are also moving quickly to smarten up their products with deep learning.

In addition to unveiling the Visual Q&A system, Facebook today announced improvements in computer vision.

“Our team has created not only a system that has taught machines this skill, but also a state-of-the-art research system that can segment images 30 percent faster than most other systems, using 10x less training data across industry benchmarks,” Schroepfer wrote.

https://venturebeat-com-develop.go-vip.co/wp-content/uploads/2015/11/VQA3.mp4?_=1

Facebook’s computers can now also make inferences about whether virtual blocks stacked unevenly will topple over. The system is 90 percent accurate, according to Schroepfer’s blog post.

Facebook researchers have also been teaching a computer to play the Chinese board game Go at a high level.

“The Go player we’ve built is getting close to being able to compete with the best humans. We’ve only been working on it for a few months, but it’s already on par with the other AI-powered systems that have been published and is as good as a very strong amateur human player,” Schroepfer wrote.

This might seem like little more than a fun application, much like IBM’s Deep Blue supercomputer taking on chess grandmaster Garry Kasparov in 1997. But Facebook now has nearly a billion and a half users. In the future, a whole lot of people may be able to take advantage of this technology whenever they want.
