Audience, a maker of voice processing chips for smartphones, is branching out today with its first processor that combines voice and motion sensing.
The Audience N100 is a multisensory processor that combines the company’s existing voice processing and recognition technology with the motion-sensing technology it gained through its acquisition of Sensor Platforms last year. By combining these two separate functions of voice and motion detection, Audience is creating what it calls a Natural User Experience (NUE) that will bring new functions to smartphones. The company is revealing the technology at the Mobile World Congress event in Barcelona, Spain.
[aditude-amp id="flyingcarpet" targeting='{"env":"staging","page_type":"article","post_id":1665964,"post_type":"story","post_chan":"none","tags":null,"ai":false,"category":"none","all_categories":"business,mobile,","session":"A"}']For instance, your phone can be in an “always listening” mode for your voice. You can answer it with your hands, or you can answer it with your voice. If it is sitting flat on a table, it will be aware of that and will answer with a speaker phone in response to your voice command. If it is in your hand, it will answer without the speaker phone.
The phone with the multiple sensors has “contextual awareness,” said Peter Santos, chief executive of Audience in Mountain View, California. It delivers an experience inspired by neuroscience that fuses context from different sensory inputs. This NUE is an extension of Audience’s core business of making voice processing chips.
“The N100 will be best-in-class voice and motion sensing in one chip,” Santos said. “It has pure signal processing, but also lower power than other solutions that consist of multiple chips. We are listening for sound and sensing for motion, and we do it with lower power than multiple solutions.”
NUE provides a richer, more seamless, more powerful way of interpreting large amounts of sensor data and making it useful in a way that is non-intrusive and invisible, Santos said.
It’s not easy to do this, since always-on listening consumes a lot of power. Without a dedicated low-power chip, the applications processor in the smartphone has to wake up constantly, processing voice input to figure out whether it is a real command or a false alarm.
The N100’s VoiceQ and MotionQ technologies use several techniques to save power. Only the microphone and the N100 chip need to be awake for VoiceQ to remain in a low-power, always-on mode, and VoiceQ can also screen out false alarms.
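Audience hasn’t published the N100’s firmware interface, but the gating idea is straightforward to sketch. In the hypothetical Python below, a cheap always-on check screens each audio frame, a stricter second pass rejects false alarms, and only a confirmed trigger wakes the applications processor; every name and threshold here is invented for illustration.

```python
# A minimal sketch of a two-stage gating pattern like the one VoiceQ
# describes. Audience's actual firmware API is not public; all names
# and thresholds below are hypothetical.

WAKE_THRESHOLD = 0.85  # assumed confidence cutoff for a likely trigger


def keyword_score(frame):
    """Stand-in for the on-chip keyword detector; here, just average energy."""
    return min(1.0, sum(abs(s) for s in frame) / max(len(frame), 1))


def confirm_trigger(frame):
    """Stand-in for a stricter second pass that rejects false alarms."""
    return keyword_score(frame) > 0.95


def always_on_loop(mic_frames):
    for frame in mic_frames:
        # Cheap first pass runs constantly; the host CPU stays asleep.
        if keyword_score(frame) < WAKE_THRESHOLD:
            continue
        # Only a confirmed trigger pays the cost of waking the host processor.
        if confirm_trigger(frame):
            print("waking applications processor")
            break


always_on_loop([[0.01] * 160, [0.99] * 160])  # a quiet frame, then a loud one
```

The design point is that the inexpensive check runs continuously on the low-power chip, and the expensive wake-up happens only on a confirmed trigger.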
Audience came up with the voice tech last year, and a half-dozen mobile phones and a couple of tablets are shipping with it. Now the company has combined it with the motion tech. MotionQ is designed for low power consumption, and by putting both functions into the same chip, Audience saves even more power, Santos said.
[aditude-amp id="medium1" targeting='{"env":"staging","page_type":"article","post_id":1665964,"post_type":"story","post_chan":"none","tags":null,"ai":false,"category":"none","all_categories":"business,mobile,","session":"A"}']
“There are good reasons why we did this acquisition,” Santos said. “It’s a natural extension of what we do, and we have a lot of headroom to grow.”
The voice and motion sensors are part of a larger framework, dubbed the Open Sensor Platform. Any sensor, even one made by another manufacturer, can plug into the platform, creating new combinations of smart sensors capable of even greater contextual awareness.
Sensor fusion developers will be able to use open-source algorithms, modify them or create their own, or draw on the MotionQ library. That means manufacturers will be able to get to market faster with better products.
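The article doesn’t detail the Open Sensor Platform’s interfaces, so the following sketch is only a guess at the general shape of such a plug-in design: a common sensor interface that third-party hardware can implement, and a hub that fuses whatever happens to be registered. The class and method names are illustrative, not the platform’s real API.

```python
# Hypothetical plug-in sensor design, illustrating the idea of an open
# platform where any sensor can register and contribute to context.

from abc import ABC, abstractmethod


class Sensor(ABC):
    """Anything that can report a reading plugs in through this interface."""

    @abstractmethod
    def read(self):
        """Return this sensor's latest reading as a dict."""


class Accelerometer(Sensor):
    def read(self):
        return {"motion": 0.2}  # dummy reading


class AmbientLight(Sensor):
    def read(self):
        return {"lux": 3.0}  # dummy reading


class FusionHub:
    """Combines readings from every registered sensor into one snapshot."""

    def __init__(self):
        self.sensors = []

    def register(self, sensor):
        self.sensors.append(sensor)

    def snapshot(self):
        combined = {}
        for sensor in self.sensors:
            combined.update(sensor.read())
        return combined


hub = FusionHub()
hub.register(Accelerometer())
hub.register(AmbientLight())  # a third-party sensor registers the same way
print(hub.snapshot())         # {'motion': 0.2, 'lux': 3.0}
```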
The NUE N100 could be part of everything from smartphones to wearable fitness devices and Internet of Things devices, or smart and connected everyday objects.
[aditude-amp id="medium2" targeting='{"env":"staging","page_type":"article","post_id":1665964,"post_type":"story","post_chan":"none","tags":null,"ai":false,"category":"none","all_categories":"business,mobile,","session":"A"}']
One of the keys to making devices easier to operate is voice control. VoiceQ keeps a mobile device in a low-power state while it waits for a “wake-up” trigger, and it offers a variety of triggers: the device wakes whenever the user says the word. You can do this now with a Google Android TV or an Amazon Fire TV set-top box, but the N100 is built for mobile use. The voice recognition is speaker independent and does not need to be trained on the user’s voice.
The user can also personalize a device to wake up with a personal keyword, and the system will then recognize and respond to that user’s voice. You can record your own keywords to make the system do things like open your email, and it works in noisy environments.
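As a rough illustration of how a personalized keyword might drive an action (the names and structure here are assumptions, not Audience’s API), the feature can be reduced to a lookup from a recognized phrase to a handler:

```python
# Illustrative keyword-to-action mapping; the dictionary, handlers, and
# on_keyword function are all invented for this sketch.

actions = {
    "open email": lambda: print("opening email client"),
    "take a note": lambda: print("starting voice memo"),
}


def on_keyword(recognized):
    # In a real system, speaker verification would run first so that only
    # the enrolled user's voice triggers the personalized keywords.
    handler = actions.get(recognized.lower())
    if handler:
        handler()


on_keyword("Open Email")  # -> opening email client
```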
Audience will deliver all of this capability over time, but the first chip will have just the basic features. I’ve seen a demo of those capabilities, and they work pretty well: you can do a Google search with your voice, and the phone gives you a quick answer.
The first combined motion-voice chip will be out in sample quantities in the second quarter. In large volumes, the chips may cost about a dollar each.
[aditude-amp id="medium3" targeting='{"env":"staging","page_type":"article","post_id":1665964,"post_type":"story","post_chan":"none","tags":null,"ai":false,"category":"none","all_categories":"business,mobile,","session":"A"}']
In the future, Santos sees bigger possibilities. If you’re in a noisy restaurant, your phone may ring louder because it knows you might not hear it, or the vibration motor may go off as well so that you can feel the buzzing. If you put your phone in your pocket, it should shut down right away; it can use the ambient light sensor to tell whether it is in your pocket or not.
If you take your ringing phone out of your pocket, it should answer automatically, without your having to push a button. And if your breathing is steady, the phone will know that you are sleeping.
The context awareness will tell the phone what to do.
“These things may sound small, but it is a much better user experience,” Santos said. “Our vision of multisensory processing is for more sophisticated context. Any one of the sensors has a certain amount of accuracy. But put them together and use machine learning, and you can be much more accurate.”
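Santos’s point about combining sensors has a simple numerical face. The Python sketch below is entirely a toy (the sensor names, probabilities, and threshold are assumptions, not anything Audience has published): three individually unreliable estimates of the same condition, fused by summing their log-odds, produce a more confident answer than any single sensor alone.

```python
# Toy sensor fusion: combine independent probability estimates with a
# naive-Bayes-style log-odds sum. All inputs here are made up.

import math


def fuse(probabilities):
    """Combine independent probability estimates by summing their log-odds."""
    log_odds = sum(math.log(p / (1.0 - p)) for p in probabilities)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Ambient light, proximity, and motion each weakly suggest "in pocket" ...
readings = [0.70, 0.65, 0.75]
confidence = fuse(readings)
print(f"fused confidence: {confidence:.2f}")  # ~0.93, above any single input

# ... and the phone acts on the fused context, as in the scenarios above.
if confidence > 0.8:
    print("screen off, ringer to vibrate")
```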
[aditude-amp id="medium4" targeting='{"env":"staging","page_type":"article","post_id":1665964,"post_type":"story","post_chan":"none","tags":null,"ai":false,"category":"none","all_categories":"business,mobile,","session":"A"}']
Now that would be a pretty cool future.