
How To Use Kinect v2 Face Basics

21 Dec 2014 · CPOL · 6 min read

What if your computer could detect your eyes, nose, and mouth? What if an application could understand your facial expressions? What if you could build such applications with minimal effort? Until today, if you wanted an accurate mechanism for detecting facial characteristics in real time, you had to play with OpenCV and spend a ton of time experimenting with various algorithms and advanced machine-vision concepts.

Luckily for us, here comes Kinect for Windows version 2 to save the day.

One of the most exciting features of Kinect 2 is the new and drastically improved Face API. Using this new API, we’ll create a simple Windows application that will understand people’s expressions. Watch the following video to see what we’ll develop:

Watch the video on YouTube

Read on for the tutorial.

Note: Kinect provides two ways to access facial characteristics: Face Basics API and HD Face API. The first one lets us access the most common features, such as the position of the eyes, nose, and mouth, as well as the facial expressions. HD Face, on the other hand, lets us access a richer and more complex collection of facial points. We’ll examine Face Basics in this article and HD Face on the next blog post.

Face Basics Features

Here are the main features of the Face Basics API:

  1. Detection of facial points in the 2D space
    • Left & right eyes
    • Nose
    • Mouth
    • Head rectangle
  2. Detection of expressions
    • Happy
    • Left/right eye open
    • Left/right eye closed
    • Engagement
    • Looking away
  3. Detection of accessories
    • Glasses
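
For reference, the capabilities above roughly correspond to the flags of the FaceFrameFeatures enumeration in the Microsoft.Kinect.Face namespace. The snippet below is only an illustrative sketch of that mapping; we'll build a similar set of flags for real in Step 4.

C#
// A hedged mapping of the feature list to FaceFrameFeatures flags.
FaceFrameFeatures features =
    FaceFrameFeatures.PointsInColorSpace |      // 2D points: eyes, nose, mouth corners
    FaceFrameFeatures.BoundingBoxInColorSpace | // head rectangle
    FaceFrameFeatures.Happy |
    FaceFrameFeatures.LeftEyeClosed |
    FaceFrameFeatures.RightEyeClosed |
    FaceFrameFeatures.FaceEngagement |
    FaceFrameFeatures.LookingAway |
    FaceFrameFeatures.Glasses;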

Prerequisites

To create, build, and run this app, you'll need the following:

  • A Kinect for Windows v2 sensor (or an Xbox One Kinect with the Kinect Adapter for Windows)
  • The Kinect for Windows SDK 2.0
  • Visual Studio 2013 or later

Creating a Basic Face App in 7 Steps

The application we'll develop in this short tutorial will highlight the face points and detect whether the user's eyes and mouth are open or closed. The code applies to both WPF and WinRT.

Step 1: XAML

The XAML code is fairly simple. We define a canvas and a few ellipses that will represent the face points. So, launch Visual Studio, create a new project, and modify your XAML like this:

XML
<Viewbox>
    <Grid Width="1920" Height="1080">
        <Image Name="camera" />
        <Canvas Name="canvas">
            <Ellipse Name="ellipseEyeLeft" Style="{StaticResource EyeStyle}" />
            <Ellipse Name="ellipseEyeRight" Style="{StaticResource EyeStyle}" />
            <Ellipse Name="ellipseNose" Style="{StaticResource NoseStyle}" />
            <Ellipse Name="ellipseMouth" Style="{StaticResource MouthStyle}" />
        </Canvas>
    </Grid>
</Viewbox>

The styles of the ellipses are included in the App.xaml file.

Step 2: Add the Required References

So far, so good! Now, navigate to the Solution Explorer and right-click the References icon. Select "Add Reference" and, under "Extensions", check the Microsoft.Kinect and Microsoft.Kinect.Face assemblies (in a WinRT project they are listed under "Windows 8.1" → "Extensions", and the Microsoft.Kinect assembly is called WindowsPreview.Kinect).

Step 3: Declare the Kinect Face Objects

After typing the XAML code, open the corresponding C# file (MainWindow.xaml.cs or MainPage.xaml.cs). First, you’ll need to import the Kinect namespaces.

For .NET/WPF, it is:

C#
using Microsoft.Kinect;

For WinRT, it is:

C#
using WindowsPreview.Kinect;

This will provide us with the core Kinect functionality, but no face capabilities. In both WPF and WinRT, the face features are included in the namespace:

C#
using Microsoft.Kinect.Face;

We can now declare the necessary objects. Similarly to the color, depth, infrared and body stream, Kinect also includes a frame source and a face reader class:

C#
// The sensor objects.
KinectSensor _sensor = null;

// The color frame reader is used to display the RGB stream
ColorFrameReader _colorReader = null;

// The body frame reader is used to identify the bodies
BodyFrameReader _bodyReader = null;

// The list of bodies identified by the sensor
IList<Body> _bodies = null;

// The face frame source
FaceFrameSource _faceSource = null;

// The face frame reader
FaceFrameReader _faceReader = null;
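
If you also want to render the RGB stream in the camera Image element (as in the video), you could add two more fields for a WriteableBitmap and its pixel buffer. These fields (_bitmap, _pixels) are my own additions used in the hedged sketch after Step 4; they are not part of the original code.

C#
// WPF only: hypothetical buffers for displaying the color stream
WriteableBitmap _bitmap = null;
byte[] _pixels = null;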

Step 4: Initialize Kinect and Handle the Events

Navigate to the .xaml.cs file and place the following code in the constructor, just below the InitializeComponent method. The _colorReader is used to display the RGB stream (refer to my previous article about the different streams). The body reader is used to acquire the body data. We need the body data, since each face corresponds to a specific body instance.

Initializing the color and body readers is straightforward. Initializing the face reader is a little trickier, though: you need to state explicitly which face features you expect. Unless you feed the source with a set of features, it will give you nothing! For our example, I have specified eight features (bounding box in color space, glasses, closed eyes, open mouth, and so on). You can add or remove features from the FaceFrameFeatures enumeration.

Finally, just like the color and body readers, remember to handle the FrameArrived event!

C#
_sensor = KinectSensor.GetDefault();

if (_sensor != null)
{
    _sensor.Open();

    _bodies = new Body[_sensor.BodyFrameSource.BodyCount];

    _colorReader = _sensor.ColorFrameSource.OpenReader();
    _colorReader.FrameArrived += ColorReader_FrameArrived;
    _bodyReader = _sensor.BodyFrameSource.OpenReader();
    _bodyReader.FrameArrived += BodyReader_FrameArrived;

    // Initialize the face source with the desired features
    _faceSource = new FaceFrameSource(_sensor, 0, FaceFrameFeatures.BoundingBoxInColorSpace |
                                                  FaceFrameFeatures.FaceEngagement |
                                                  FaceFrameFeatures.Glasses |
                                                  FaceFrameFeatures.Happy |
                                                  FaceFrameFeatures.LeftEyeClosed |
                                                  FaceFrameFeatures.MouthOpen |
                                                  FaceFrameFeatures.PointsInColorSpace |
                                                  FaceFrameFeatures.RightEyeClosed);
    _faceReader = _faceSource.OpenReader();
    _faceReader.FrameArrived += FaceReader_FrameArrived;
}
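
The ColorReader_FrameArrived handler is not part of this tutorial (the streams were covered in the previous article), but for completeness here is a minimal WPF sketch, assuming the hypothetical _bitmap and _pixels fields declared earlier. It converts each color frame to BGRA and pushes it into the Image element named camera.

C#
// A minimal sketch, assuming WPF and the hypothetical _bitmap/_pixels fields.
// Requires: using System.Windows; using System.Windows.Media; using System.Windows.Media.Imaging;
void ColorReader_FrameArrived(object sender, ColorFrameArrivedEventArgs e)
{
    using (var frame = e.FrameReference.AcquireFrame())
    {
        if (frame == null) return;

        int width = frame.FrameDescription.Width;
        int height = frame.FrameDescription.Height;

        if (_bitmap == null)
        {
            // Create the bitmap and the pixel buffer once.
            _pixels = new byte[width * height * 4];
            _bitmap = new WriteableBitmap(width, height, 96.0, 96.0, PixelFormats.Bgra32, null);
            camera.Source = _bitmap;
        }

        // Convert the frame data to BGRA and copy it into the bitmap.
        frame.CopyConvertedFrameDataToArray(_pixels, ColorImageFormat.Bgra);
        _bitmap.WritePixels(new Int32Rect(0, 0, width, height), _pixels, width * 4, 0);
    }
}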

Step 5: Connect the Face with the Body

A face object is related to a corresponding body object (obviously). So, the face source should be updated with the tracking ID of the body. The following code detects the default body (if any) and assigns its unique tracking identifier to the face frame source. Note that the Where/FirstOrDefault query requires a using System.Linq; directive.

C#
void BodyReader_FrameArrived(object sender, BodyFrameArrivedEventArgs e)
{
    using (var frame = e.FrameReference.AcquireFrame())
    {
        if (frame != null)
        {
            frame.GetAndRefreshBodyData(_bodies);

            Body body = _bodies.Where(b => b.IsTracked).FirstOrDefault();

            if (!_faceSource.IsTrackingIdValid)
            {
                if (body != null)
                {
                    // Assign a tracking ID to the face source
                    _faceSource.TrackingId = body.TrackingId;
                }
            }
        }
    }
}

You can add additional functionality (such as gesture tracking) if necessary.
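
One more detail worth knowing: FaceFrameSource also exposes a TrackingIdLost event, raised when the tracked body disappears. The snippet below is a hedged sketch (not part of the original tutorial) that hides the face overlay until a new body is picked up.

C#
// Subscribe once, e.g. right after creating _faceSource in the constructor.
_faceSource.TrackingIdLost += (s, args) =>
{
    // Hide the overlay until BodyReader_FrameArrived assigns a new tracking ID.
    canvas.Visibility = Visibility.Collapsed;
};

If you use this, remember to set canvas.Visibility back to Visibility.Visible at the point in Step 5 where the new TrackingId is assigned.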

Step 6: Inside the Face FrameArrived Event Handler

Now is the time for the best part. Since the face source is connected with the body, we can specify what happens when a face frame is available. Face frames work exactly like the color, depth, infrared, and body frames: firstly, you get a reference to the frame, then you acquire the frame, and, if the frame is not empty, you can grab the FaceFrameResult object. The FaceFrameResult object encapsulates all of the available facial information.

C#
void FaceReader_FrameArrived(object sender, FaceFrameArrivedEventArgs e)
{
    using (var frame = e.FrameReference.AcquireFrame())
    {
        if (frame != null)
        {
            // Get the face frame result
            FaceFrameResult result = frame.FaceFrameResult;

            if (result != null)
            {
                // Get the face points, mapped in the color space
                var eyeLeft = result.FacePointsInColorSpace[FacePointType.EyeLeft];
                var eyeRight = result.FacePointsInColorSpace[FacePointType.EyeRight];
                var nose = result.FacePointsInColorSpace[FacePointType.Nose];
                var mouthLeft = result.FacePointsInColorSpace[FacePointType.MouthCornerLeft];
                var mouthRight = result.FacePointsInColorSpace[FacePointType.MouthCornerRight];

                // Get the face characteristics
                var eyeLeftClosed = result.FaceProperties[FaceProperty.LeftEyeClosed];
                var eyeRightClosed = result.FaceProperties[FaceProperty.RightEyeClosed];
                var mouthOpen = result.FaceProperties[FaceProperty.MouthOpen];
            }
        }
    }
}

Although the above code is self-explanatory, there are two points of interest:

  1. FacePointsInColorSpace is a collection of facial points (X, Y values), projected to the 2D color space. Face Basics API does not provide depth (Z) values for the eyes or nose (more on this on the next blog post).
  2. FaceProperties is a collection of the detected expressions. Each property has a value from the DetectionResult enumeration (Yes, No, Maybe, or Unknown); a short sketch of reading a few more of these properties follows this list.
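
As a hedged illustration (not taken from the original code), this is how the remaining properties requested in Step 4, such as Happy, FaceEngagement, and Glasses, can be consumed inside the same result != null block:

C#
// A hedged sketch: reading a few more properties inside the "result != null" block.
var happy = result.FaceProperties[FaceProperty.Happy];
var engaged = result.FaceProperties[FaceProperty.Engaged];
var wearingGlasses = result.FaceProperties[FaceProperty.WearingGlasses];

if (happy == DetectionResult.Yes)
{
    // The user is smiling.
}

if (wearingGlasses == DetectionResult.Maybe || wearingGlasses == DetectionResult.Unknown)
{
    // The detector is not confident about this property; treat it accordingly.
}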

Step 7: Drawing the UI

The last step is simple UI drawing. Place the following code inside the result != null block of the face FrameArrived handler, right after acquiring the face points and properties. We simply position the ellipses at the X, Y coordinates of the eyes and nose, and the size of the mouth ellipse changes according to the user's expressions.

C#
// Position the canvas UI elements
Canvas.SetLeft(ellipseEyeLeft, eyeLeft.X - ellipseEyeLeft.Width / 2.0);
Canvas.SetTop(ellipseEyeLeft, eyeLeft.Y - ellipseEyeLeft.Height / 2.0);

Canvas.SetLeft(ellipseEyeRight, eyeRight.X - ellipseEyeRight.Width / 2.0);
Canvas.SetTop(ellipseEyeRight, eyeRight.Y - ellipseEyeRight.Height / 2.0);

Canvas.SetLeft(ellipseNose, nose.X - ellipseNose.Width / 2.0);
Canvas.SetTop(ellipseNose, nose.Y - ellipseNose.Height / 2.0);

Canvas.SetLeft(ellipseMouth, ((mouthRight.X + mouthLeft.X) / 2.0) - ellipseMouth.Width / 2.0);
Canvas.SetTop(ellipseMouth, ((mouthRight.Y + mouthLeft.Y) / 2.0) - ellipseMouth.Height / 2.0);
ellipseMouth.Width = Math.Abs(mouthRight.X - mouthLeft.X);

To make our project more engaging, you can hide an eye ellipse if the eye is closed. Moreover, you can increase the size of the mouth ellipse if the mouth is open, and decrease its height if the mouth is closed.

C#
// Display or hide the ellipses
if (eyeLeftClosed == DetectionResult.Yes || eyeLeftClosed == DetectionResult.Maybe)
{
    ellipseEyeLeft.Visibility = Visibility.Collapsed;
}
else
{
    ellipseEyeLeft.Visibility = Visibility.Visible;
}

if (eyeRightClosed == DetectionResult.Yes || eyeRightClosed == DetectionResult.Maybe)
{
    ellipseEyeRight.Visibility = Visibility.Collapsed;
}
else
{
    ellipseEyeRight.Visibility = Visibility.Visible;
}

if (mouthOpen == DetectionResult.Yes || mouthOpen == DetectionResult.Maybe)
{
    ellipseMouth.Height = 50.0;
}
else
{
    ellipseMouth.Height = 20.0;
}

Quite simple, right? You can now create awesome Kinect applications using a powerful and accurate API in a few lines of code!

Watch the video (and subscribe!)

Notes

If you are using .NET/WPF for your project, you'll also need to add the following line under Project → Properties → Build Events → Post-build event command line. This command copies the NuiDatabase folder, which contains the configuration data the Face API needs at runtime, next to your executable.

BAT
xcopy "C:\Program Files (x86)\Microsoft SDKs\Windows\v8.0\ExtensionSDKs\Microsoft.Kinect.Face\2.0\Redist\CommonConfiguration\x64\NuiDatabase" "NuiDatabase" /e /y /i /r

If you are using WinRT, remember to open the Package.appxmanifest file, click Capabilities, and check the Microphone and Webcam capabilities. This will give your app permissions to use Kinect for Windows.

Next week: we’ll dive deeper into the Kinect Face API with Kinect Face HD. Stay tuned!


This article was originally posted at http://pterneas.com/2014/12/21/kinect-2-face-basics

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


