
Building a Voice-enabled Smart Home TinyML Solution with Cloud Connectivity

10 Dec 2020 · CPOL · 7 min read
In this article we show, at a high level, that it is possible to create sophisticated AI-enabled applications that run on memory-constrained, ultra-low power endpoint devices.
Here we explain how to build a voice-enabled smart assistant that can control common room functions, such as lights and heat. Our example uses TensorFlow Lite for Microcontrollers on an Arm Cortex-M device running Arm Mbed OS.

This article is in the Product Showcase section for our sponsors at CodeProject. These articles are intended to provide you with information on products and services that we consider useful and of value to developers.

Embedded and IoT devices are increasingly able to perform complex processing and analytics.

Artificial intelligence is no longer exclusive to heavyweight hardware. Advances in the processing power of microcontrollers and development of new machine learning models that run on the endpoint have made it feasible to perform deep learning inference directly on-device. You can now run optimized machine learning models on Arm Cortex-M devices consuming tens of milliwatts. Welcome to tiny machine learning: tinyML!

TinyML arose from a collaboration between the communities developing ultra-low power embedded systems and those focusing on machine learning and data science. It provides the means to build new and exciting applications enabled by on-device machine learning.

This new paradigm extends the capabilities of inexpensive and energy efficient microcontrollers, such as Arm Cortex-M microcontrollers, to make them "smarter," whilst also enabling greater responsiveness, reliability, and privacy. Such endpoint devices can combine data collected from a variety of ultra-low power sensors (such as those for voice or sound, vision, environment, motion, and health monitoring) with machine intelligence, provided by technology such as TensorFlow Lite for Microcontrollers or µTVM.

In this article we describe the tinyML elements of a smart home device, which uses voice recognition and cloud connectivity. We’ll describe a microcontroller that:

  • Listens to its surroundings with an on-board microphone
  • Infers commands using an on-board machine-learning model
  • Acts upon these instructions accordingly

In addition to responding to voice instructions on-device, the microcontroller can use its cloud connectivity to report apparent inconsistencies in the incoming data so that the user can act on them remotely. This could occur when the heating or lighting is left on for an extended period of time, or at a time when the system knows the user is typically not in the house.

The Building Blocks of a TinyML Smart Assistant

For the tiny hardware, our example uses a microcontroller running Arm Mbed OS, which is supported across a wide range of Arm Cortex-M devices. Mbed OS is an open-source IoT operating system which is small yet full-featured in terms of security, ML, connectivity, and drivers for sensors and I/O devices.

Mbed OS provides an abstraction layer, allowing you to write C/C++ applications and run them on any Mbed-enabled device. There are detailed instructions on os.mbed.com to help you get set up to use the Mbed OS full profile.

The beauty is that you can start on any suitable low-cost development board and later deploy to production hardware without rewriting your code. For this example, we need an Arm Mbed OS-capable device that also supports TensorFlow Lite for Microcontrollers.

A good example is the NXP FRDM-K66F, which has a built-in microphone for voice detection (plus additional sound-input expansion). It also supports internet connectivity through Ethernet, allowing the microcontroller to send messages to the cloud.

You can use Arm Mbed CLI as the development software on your local machine to create, import, and build your project with Arm Mbed OS 6. Once installed, the tool takes the Mbed OS source code, along with your voice-assistant project code and its dependencies, and compiles them using either the Arm Compiler or the GNU Arm Embedded Compiler (GCC). You can then flash the resulting binary to the board by using a serial connection between your board and development machine.

Tiny Machine Learning for Embedded Devices

For the ML component we use TensorFlow Lite for Microcontrollers, which enables you to run machine learning models on microcontrollers and other devices with only a few kilobytes of memory. It is open source, written in C++11, and designed for memory-constrained, low-power devices such as those based on Arm Cortex-M processors; the core runtime stays small because it is built specifically to fit on resource-constrained devices.

TensorFlow Lite for Microcontrollers is currently supported by a diverse set of devices and provides a subset of all TensorFlow operations. There is no on-device training: you build and train the model on a desktop or in the cloud, then convert it with the TensorFlow Lite converter to reduce its size before transferring it to the device.

The TensorFlow Lite machine learning model used by this example is based on a pre-trained model that can recognise two keywords ("yes" and "no") from speech data. However, you can retrain the model to recognise other words from Google’s Speech Commands dataset.

The model is a small convolutional neural network consisting of a 2D convolutional layer, a fully-connected (MatMul) layer that outputs logits, and a softmax layer that converts those logits into probabilities.
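To make the final layer concrete, here is the softmax step in plain Python: it turns the raw logits from the fully-connected layer into one probability per keyword. The four-category layout (silence, unknown, "yes", "no") follows the micro_speech example; the logit values below are made up for illustration.

```python
import math

LABELS = ["silence", "unknown", "yes", "no"]

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)                        # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up logits: the network is most confident about "yes"
probs = softmax([0.1, 0.5, 3.2, 0.8])
print(LABELS[probs.index(max(probs))])  # yes
```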

Take a look at the Notebook file for details of how to set up and train a model in simple speech recognition, how to freeze it, and how to convert it to a TensorFlow Lite model. You’ll need to further convert it to a C byte array that can be stored in read-only program memory on the device, then loaded and executed by TensorFlow Lite for Microcontrollers.
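The final conversion step, from a `.tflite` file to a C byte array, is what tools like `xxd -i` do; the same transformation can be sketched in a few lines of Python (the file and variable names here are illustrative):

```python
def to_c_array(data, name="g_model"):
    """Render binary model data as a C source snippet, xxd -i style."""
    lines = []
    for i in range(0, len(data), 12):                 # 12 bytes per line
        chunk = data[i:i + 12]
        lines.append("  " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    body = "\n".join(lines)
    return (
        f"const unsigned char {name}[] = {{\n{body}\n}};\n"
        f"const unsigned int {name}_len = {len(data)};\n"
    )

# Usage with a hypothetical model file:
# with open("micro_speech.tflite", "rb") as f:
#     print(to_c_array(f.read()))
print(to_c_array(b"\x1c\x00\x00\x00TFL3"))
```

The resulting array can live in read-only program memory and be passed directly to the TensorFlow Lite for Microcontrollers interpreter.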

Further instructions on porting the TensorFlow Lite for Microcontrollers example to the latest Mbed OS 6 can be found on this Mbed guide for deploying the MicroSpeech example to the NXP K66 device. You can use this project as a starting point and then integrate additional code to respond to voice instructions or to send information onwards to the cloud.
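One detail worth knowing before you integrate your own response code: the micro_speech example does not act on a single inference, but averages recent per-category scores over a short window and only reports a command when the averaged score clears a threshold. A simplified Python model of that idea (the window size and threshold here are illustrative, not the values used by the real example):

```python
from collections import deque

class CommandSmoother:
    """Average recent scores per label and fire above a threshold."""

    def __init__(self, window=3, threshold=0.8):
        self.window = window
        self.threshold = threshold
        self.history = deque(maxlen=window)

    def update(self, scores):
        """scores: dict of label -> probability for one inference."""
        self.history.append(scores)
        if len(self.history) < self.window:
            return None                    # not enough context yet
        labels = self.history[0].keys()
        avg = {l: sum(h[l] for h in self.history) / self.window
               for l in labels}
        best = max(avg, key=avg.get)
        return best if avg[best] >= self.threshold else None

s = CommandSmoother()
s.update({"yes": 0.9, "no": 0.1})
s.update({"yes": 0.85, "no": 0.15})
print(s.update({"yes": 0.95, "no": 0.05}))  # yes
```

Smoothing like this keeps a single noisy inference from triggering a spurious action.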

Connecting a Smart Device to the Cloud

The voice assistant uses the board's network connectivity (Ethernet on the FRDM-K66F) so that the microcontroller can send data to the cloud; for example, it can log instructions, actions, and outcomes. Additionally, it can send information to the cloud when the incoming data is inconsistent, for example, when the heating or lighting is left on for an extended period of time, or at a time when the system knows the user is typically not in the house.
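The inconsistency check can be as simple as a timer: if a light has been on longer than expected, raise a flag for the cloud. A minimal Python sketch of that rule (the four-hour limit is an arbitrary illustration):

```python
def check_for_anomaly(light_on, on_since_s, now_s, max_on_s=4 * 3600):
    """Return an alert dict if the light has been on too long, else None."""
    if light_on and (now_s - on_since_s) > max_on_s:
        return {
            "alert": "light_left_on",
            "duration_s": now_s - on_since_s,
        }
    return None

# Light switched on 5 hours ago and still on -> raises an alert
print(check_for_anomaly(True, on_since_s=0, now_s=5 * 3600))
```

A richer version could also consult an occupancy schedule before alerting.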

Mbed OS can interface with AWS IoT to manage the device gateway and the message broker, which connects and processes messages between a microcontroller and the cloud.

The assistant interfaces with an AWS MQTT broker to publish and subscribe to messages in the cloud. In your prototype code, you’ll first need to log in to an AWS account and set up device credentials and a policy using the AWS IoT console. Then tell Mbed OS your custom endpoint name, name your device, and choose a topic to which both AWS and your device can publish messages.
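The messages themselves are typically small JSON documents published to the agreed topic. A sketch of building such a payload with Python's standard library (the topic layout and field names are made up for illustration; on the device this would be done in C++ using the Mbed OS AWS client):

```python
import json

def build_status_message(device_id, command, lights_on):
    """Serialise a device status update for an MQTT publish."""
    payload = {
        "device": device_id,
        "last_command": command,
        "lights_on": lights_on,
    }
    # Hypothetical topic scheme: smarthome/<device-id>/status
    return "smarthome/" + device_id + "/status", json.dumps(payload)

topic, body = build_status_message("living-room-01", "yes", True)
print(topic)
print(body)
```

Keeping the payload small matters on a microcontroller: less RAM for buffers and less time on the network.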

Take a look at the Mbed OS example for AWS cloud for details of how to do this and use the example as the starting point to configure a connectivity solution for the voice assistant. Finally, you can add the necessary AWS configuration and connection support described in the example project’s documentation into the Mbed project you created for deploying the TensorFlow Lite model to the device.

Wrapping Up

As we have shown at a high level, it is possible to create sophisticated AI-enabled applications that run on memory-constrained, ultra-low power endpoint devices. We have walked you through the puzzle pieces needed to build a low-power voice assistant that is "always on." The assistant differentiates commands such as "Lights on" from ambient background sound and other speech, infers the correct request, and acts upon it or sends a message to the cloud when it needs to alert the user.

Our tinyML example used an Arm Cortex-M microcontroller running Mbed OS, prototyping on a basic board with code that can be moved into a production setting without the need to rewrite it. Mbed OS provides an abstraction layer for development and deployment and a full-featured embedded RTOS, and it takes care of much of the plumbing in this example, such as sound capture and network connectivity.

The job of inferring voice commands is performed by a pre-trained machine learning model. This specialist know-how is provided by TensorFlow Lite for Microcontrollers, and allows you to build upon an established machine-learning powerhouse running on-device.

Find Out More

There are a number of resources available to help you get started with tinyML voice applications with Mbed OS.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Jo Stichbury
United States
Jo Stichbury is a freelance technical writer with over 20 years’ experience in the software industry. She works with a range of clients, such as QuantumBlack, Improbable Worlds, and SAP, and writes about data science, AI, and games development.

Jo also covers emerging technology and the future of transport (electric & autonomous vehicles). She holds an MA and a PhD in Natural Sciences from the University of Cambridge.
