In this series, we’ll learn how to use Python, OpenCV (an open source computer vision library), and ImageAI (a deep learning library for vision) to train AI to detect whether workers are wearing hardhats. In the process, we’ll create an end-to-end solution you can use in real life—this isn’t just an academic exercise!

This is an important use case because many companies must ensure workers have the proper safety equipment. But what we’ll learn is useful beyond just detecting hardhats. By the end of the series, you’ll be able to use AI to detect nearly any kind of object in an image or video stream.

You’re currently on article 4 of 6:

Installing OpenCV and ImageAI for Object Detection
Finding Training Data for OpenCV and ImageAI Object Detection
Using Pre-trained Models to Detect Objects With OpenCV and ImageAI
Preparing Images for Object Detection With OpenCV and ImageAI
Training a Custom Model With OpenCV and ImageAI
Detecting Custom Model Objects with OpenCV and ImageAI

Now that we have some images and a detector set up, let's train our own custom model to detect whether people are wearing hardhats. To get the best results from our model, we need to ensure that the data we’re training it with is accurate. We also need to annotate our data and set some of it aside for validating our model after it’s trained.

Cleaning Data

It’s always a good idea to go through your dataset manually, even if just briefly, to ensure the data you’re using is clean. However, because we have a detector set up and working for us pretty reliably, we’ll let the detector clean our data. Clear the code block that contains the code for showing random images and replace it with the following:

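# List the downloaded images and restrict the detector to the "person" class
hardhatImages = os.listdir("hardhat")
peopleOnly = detector.CustomObjects(person=True)

for i in hardhatImages:
    imageFile = "hardhat/{0}".format(i)
    # Look for people, keeping detections with at least 30% probability
    detectedImage, detections = detector.detectCustomObjectsFromImage(custom_objects=peopleOnly, output_type="array", input_image=imageFile, minimum_percentage_probability=30)
    # No person found, so delete the image from the dataset
    if len(detections) < 1:
        os.remove(imageFile)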

This code block gets the list of hardhat images we downloaded and configures the detector to detect only people. Because we know all the images involve hardhats, and because we only want to train our model on people wearing hardhats, we can use this detector to ensure our data is mostly correct. The code then iterates through each image and tries to detect people in it, keeping only detections with a probability of at least 30 percent. If no person is detected, it deletes the image.

This process happens pretty quickly, and you should end up with about 560 images.
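Because os.remove deletes files permanently, a more cautious variant is to move rejected images into a separate folder so they can be reviewed later. The following is only a sketch of that idea: quarantine_images, keep_image, and the "hardhat_rejected" folder are hypothetical names that don't appear elsewhere in this series.

```python
import os
import shutil


def quarantine_images(source_dir, rejected_dir, keep_image):
    """Move every file in source_dir that fails keep_image into rejected_dir.

    keep_image is a hypothetical callback: it receives a file path and
    returns True when the image should stay in the dataset.
    """
    os.makedirs(rejected_dir, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(source_dir)):
        path = os.path.join(source_dir, name)
        if not os.path.isfile(path):
            continue  # skip subfolders
        if not keep_image(path):
            shutil.move(path, os.path.join(rejected_dir, name))
            moved.append(name)
    return moved


# Assumed usage in the notebook, reusing detector and peopleOnly from above:
# def has_person(path):
#     _, detections = detector.detectCustomObjectsFromImage(
#         custom_objects=peopleOnly, output_type="array",
#         input_image=path, minimum_percentage_probability=30)
#     return len(detections) > 0
#
# quarantine_images("hardhat", "hardhat_rejected", has_person)
```

Keeping the rejected folder outside "hardhat" means the splitting code in the next section never sees it.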

Splitting Data

The next step is to split our dataset in two. One set will be for training the model, the other for validating our trained model. We can either manually split the data or write some code to split it. To try and automate these processes as much as possible, let’s programmatically split the data. Start a new code block and add the following:

if not os.path.exists('hardhat/train/images'):
    os.makedirs('hardhat/train/images')
if not os.path.exists('hardhat/validation/images'):
    os.makedirs('hardhat/validation/images')

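# Count only files so the train and validation folders don't skew the 90% split
hardhatImages = [f for f in os.listdir("hardhat") if os.path.isfile("hardhat/" + f)]
hardhatTrainNums = round(len(hardhatImages) * 0.90)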

for i in range(0, hardhatTrainNums):
    file = "hardhat/" + hardhatImages[i]
    if os.path.isfile(file):
        os.rename(file, "hardhat/train/images/" + hardhatImages[i])
    
hardhatImages = os.listdir("hardhat")

for i in hardhatImages:
    file = "hardhat/" + i
    # os.listdir returns plain strings, so check the path with os.path.isfile
    if os.path.isfile(file):
        os.rename(file, "hardhat/validation/images/" + i)

This code block uses a number of functions in Python's os module to split the files into training data and validation data. The general flow is:

Create train and validation folders (os.makedirs)
List the images in the source and flag 90% of them (os.listdir)
Move 90% of the files to the train/images folder (os.rename)
Flag the rest of the images in the source (os.listdir again)
Move the rest of the images to the validation/images folder (os.rename again)

When this is complete, we’ll have two sets of sorted, cleaned data we can use to train a model to detect people wearing hardhats, and then validate how effective that model is.
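One thing to keep in mind is that os.listdir returns files in arbitrary order, so the split above isn't guaranteed to be the same from one run or machine to the next. If you want a repeatable split, one option is to shuffle the file list with a fixed seed first. The sketch below assumes the same folder layout as above; split_dataset, train_fraction, and seed are hypothetical names chosen for illustration.

```python
import os
import random
import shutil


def split_dataset(source_dir, train_fraction=0.9, seed=42):
    """Move files from source_dir into train/images and validation/images."""
    train_dir = os.path.join(source_dir, "train", "images")
    val_dir = os.path.join(source_dir, "validation", "images")
    os.makedirs(train_dir, exist_ok=True)
    os.makedirs(val_dir, exist_ok=True)

    # Only regular files count; the train and validation folders are skipped.
    names = sorted(
        n for n in os.listdir(source_dir)
        if os.path.isfile(os.path.join(source_dir, n))
    )
    # A seeded shuffle gives the same split every time for the same files.
    random.Random(seed).shuffle(names)

    train_count = round(len(names) * train_fraction)
    for index, name in enumerate(names):
        target = train_dir if index < train_count else val_dir
        shutil.move(os.path.join(source_dir, name), os.path.join(target, name))
    return train_count, len(names) - train_count
```

Calling split_dataset("hardhat") would then return the number of images placed in each set.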

Annotating Images

Once we have our training data, we need to annotate the images. Sometimes you’ll be able to find images already annotated, but mostly you’ll need to do this yourself by defining bounding boxes around the objects you want to detect. The annotations are stored in PASCAL VOC, an XML format that standardizes image datasets for object recognition. Luckily, there’s a handy utility called LabelImg that makes this process a little easier.
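To give a feel for what a PASCAL VOC annotation contains, here is a small sketch that reads the bounding boxes out of one using Python's standard xml.etree.ElementTree module. The sample XML is trimmed to the essential elements, and its filename, image size, and coordinates are made up for illustration; read_boxes is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Illustrative annotation; the filename, size, and coordinates are made up.
SAMPLE_XML = """<annotation>
    <filename>testimage.jpg</filename>
    <size><width>640</width><height>480</height><depth>3</depth></size>
    <object>
        <name>person hardhat</name>
        <bndbox>
            <xmin>120</xmin><ymin>45</ymin><xmax>310</xmax><ymax>400</ymax>
        </bndbox>
    </object>
</annotation>"""


def read_boxes(xml_text):
    """Return (label, xmin, ymin, xmax, ymax) for each object in the XML."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = [int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.findtext("name"), *coords))
    return boxes


print(read_boxes(SAMPLE_XML))
# [('person hardhat', 120, 45, 310, 400)]
```

Each object element holds one label and one box, so an image with several people wearing hardhats would have several object elements.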

Download and run the LabelImg utility. From the menu on the left, select Open Dir and specify the directory "hardhat/train/images". The first image in this directory will appear. Next, select Change Save Dir, go into the "hardhat/train" folder, create a new folder called "annotations", and then select this folder. Now we’re ready to annotate. The four hotkeys we’ll need are:

w – starts a new bounding box ready to select our object. Left-click to start a box, then left-click again to finish. Then type in a label (person hardhat).
space – saves the image with annotations to our save directory using the same filename as the image. For example, an image called testimage.jpg will save one called testimage.xml.
d – moves to the next image, but prompts if you haven't yet saved.
a – moves to the previous image, but prompts if you haven't yet saved.

The more images you annotate, the better your model will be trained. This doesn't make the process any less arduous, but it's definitely worth your while! Once you’re finished with the train folder, you’ll need to annotate the "validation" folder as well. In case you really don't feel like doing that, you can download my annotated images here.

When you’re done, you’ll have a folder with a whole heap of XML files. If you download mine and look in Labeling, you’ll notice two things. First, you don't need to annotate everything, just the better images. Second, some images I didn't annotate because there was no appropriate person wearing a hardhat. This is a subjective decision. Use your best judgement; there really isn't a right or wrong answer.
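Since LabelImg saves each annotation under the same base filename as its image, it's easy to list the images that don't have an XML file yet. The sketch below is one way to do that; find_unannotated is a hypothetical helper, and the folder names in the usage comment assume the layout described above.

```python
import os


def find_unannotated(images_dir, annotations_dir):
    """Return image filenames that have no XML file with the same base name."""
    annotated = {
        os.path.splitext(name)[0]
        for name in os.listdir(annotations_dir)
        if name.lower().endswith(".xml")
    }
    return sorted(
        name for name in os.listdir(images_dir)
        if os.path.splitext(name)[0] not in annotated
    )


# Assumed usage with the folders created earlier:
# find_unannotated("hardhat/train/images", "hardhat/train/annotations")
```

The resulting list is a convenient place to review the images you skipped and decide whether to annotate them or leave them out.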

Up Next

We’ve learned what it takes to clean and prepare an image dataset to get it ready to train an AI model.

This step might have seemed like a bit of a letdown. After all, we’re here to train and use AI, not to sift through data. But when working with AI, you run into the same rule you encounter in many areas of computing: garbage in, garbage out. If you want your model to generate good results, you need to train it on good data.

Now that we have a clean dataset to work with, we’ll learn how to train a custom model on it.