Will smart microphone voice capture see off chatbots in the IoT homes of the future?

Huw Geddes
Director of Marketing
High quality voice recognition interfaces will be key to enabling the success of tomorrow’s smart connected homes

The recent chatbot announcements by Microsoft and Facebook show how two of the big technology companies are approaching the much-hyped voice market. Facebook intends to create an infrastructure of services delivered by a personal digital assistant inside the Facebook Messenger app, while Microsoft wants to create an ecosystem around its cross-platform personal assistant, Cortana. It will be intriguing to see how this rolls out, since neither addresses two key challenges: the practicality of conversational interfaces and the development of the Internet of Things.

Chatbots let you ask a simple or complex question as part of a conversation with a device. You don't have to learn the hierarchical menus intrinsic to many apps; you just ask a question. The bot connects to a natural language interface or speech recognition engine running in the Cloud and then sends back an answer, or a different question if it needs clarification. There are many different types of chatbot: automated search assistants such as Siri or Google Now, transactional chatbots that allow you to order and purchase goods, service centre bots that direct you through support enquiries, right up to the Amazon Echo, which allows you to engage with a full ecosystem of interacting web services.

A key feature of successful chatbots and conversational interfaces will be the quality of both the voice capture and the response to a question or command. In both cases they must be at least 95% successful for users to be confident in using them regularly; anything less and people will not adopt them into everyday life. Speech recognition and natural language engines have improved considerably in recent years, but the old adage "garbage in, garbage out" still applies when it comes to the actual voice capture. The way to provide high quality voice capture is to use a smart microphone: a steerable microphone array with signal processing that removes noise, echo and reverberation before the voice signal is passed to the speech recognition engine.
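To make the idea of a steerable array concrete, here is a minimal sketch of delay-and-sum beamforming, one classical way of steering a microphone array. It is an illustration only, not a description of any particular product: the four-microphone linear array, the 5 cm spacing, the 16 kHz sample rate, the 2 kHz test tone and all function names are assumptions chosen for the example, and a real smart microphone would add noise suppression, echo cancellation and de-reverberation on top.

```python
import math

def steering_delays(num_mics, spacing_m, angle_deg, sample_rate_hz, speed_of_sound=343.0):
    """Arrival delay at each mic, in whole samples, for a plane wave
    coming from angle_deg (0 = straight ahead of a linear array)."""
    raw = [
        m * spacing_m * math.sin(math.radians(angle_deg)) / speed_of_sound * sample_rate_hz
        for m in range(num_mics)
    ]
    earliest = min(raw)  # shift so that every delay is non-negative
    return [round(r - earliest) for r in raw]

def delay_and_sum(channels, delays):
    """Advance each channel by its delay, then average the aligned samples."""
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(usable)
    ]

def simulate_mic(source, delay):
    """Hypothetical helper: the source as heard by a mic that receives it `delay` samples late."""
    return [0.0] * delay + source[: len(source) - delay]

def power(signal):
    """Mean squared amplitude of a signal."""
    return sum(v * v for v in signal) / len(signal)

# Assumed set-up: 4 mics, 5 cm apart, 16 kHz sampling, talker at +30 degrees.
fs = 16000
source = [math.sin(2 * math.pi * 2000 * n / fs) for n in range(320)]
true_delays = steering_delays(4, 0.05, 30.0, fs)
channels = [simulate_mic(source, d) for d in true_delays]

# Steering towards the talker aligns the channels; steering away misaligns them.
on_target = delay_and_sum(channels, true_delays)
off_target = delay_and_sum(channels, steering_delays(4, 0.05, -30.0, fs))

print("power when steered at the talker:", power(on_target))
print("power when steered away:         ", power(off_target))
```

Steering towards the talker adds the channels coherently, while sound from other directions adds out of step and is attenuated. The effect depends on the physical spacing between microphones, which is one reason the size of the device matters.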

Now consider Microsoft. It has enterprise solutions such as customer support bots, and games hardware such as Xbox and Kinect that is capable of supporting smart microphones. But if it intends to focus on mobile phones and devices, which its announcement of a BotStore implies, there are considerable challenges to overcome. Facebook has a huge installed base for Messenger, but most of those installations are on our mobile phones and mobile devices.

The first problem with mobile phones is that the physical characteristics of the device make it very difficult to integrate high quality smart microphones. Currently manufacturers use omnidirectional analogue and digital microphones that fit into small devices. These provide very good quality near-field voice capture, but users have to be close to their phones when talking and will continue to suffer background noise and echo, depending on where the phone is used. So while there are billions of devices already available and capable of running chatbots, very few can meet the key target of a 95% success rate for voice capture. Mobile phones are simply not big enough to accommodate an effective smart microphone.

Secondly, there’s user behaviour. Will chatbots solve the underlying problem that people are starting to suffer from app fatigue? They download lots of apps but only use three or four with any regularity. At least one of those will be the default chatbot installed by the device manufacturer, so Facebook scores heavily with Messenger. Microsoft, however, will have to work hard to encourage developers to create Cortana-aware chatbots that they can sell through the newly announced BotStore, and then persuade consumers to download them.

The third issue is that people carry their phones with them when they are out, but at home the device is often in a different part of the house. For chatbots on mobile phones to be of value, the phone must be in the user’s hand, or on their wrist in the case of an Android or Apple Watch, and this isn’t the case most of the time. If you have to walk between rooms to find your mobile and then talk to it, you’ve already lost the convenience of having a chatbot on the device in the first place.

The biggest challenge, however, will be the development of new categories of product that connect to the IoT. In future, many of these devices will not have what we currently consider to be a user interface: buttons, screens and so on. These features are all additional BOM costs that can be dispensed with by using a voice interface. It’s also doubtful that manufacturers will want to integrate third-party software like Cortana and Messenger if it incurs licensing costs. Crucially, these new categories of electronic device will have footprints better suited to high quality voice recognition interfaces. Amazon made a significant step forward with the Amazon Echo, but it is just the start of what will be a huge shake-up for mobile device manufacturers and service providers.

Chatbots may provide an interface to communicate with and control these new IoT devices, but mobile phones will be a much smaller part of the way we use the Internet and communicate with each other. Chatbots that depend on a third-party platform look like unnecessary complexity. If this really is the case, it’ll be interesting to see how the Microsoft and Facebook initiatives develop, as well as those from Apple, Google and Amazon.
