Published on 03/19/2021
Last updated on 03/21/2024

PC Webcam Object detection with TensorFlow2 GPU Training


Introduction

During a recent hackathon, my task was to use a PC webcam to scan and detect networking ports. I first searched for open-source projects that already did this, but I could not find one. Fortunately, TensorFlow is one of the most popular open-source libraries for building custom models, and TensorFlow2 supports Nvidia GPUs, which shortens training time on large datasets. With it, I was able to train a custom model that detects electronic device ports from a PC webcam, as shown in Figure 1.

Figure 1: My application detected an ethernet port

Prepare Dataset

The first step toward successful model training is to prepare and label the dataset. My dataset contains many images of networking devices taken from different angles, against different backgrounds, and under different lighting conditions. Because object detection is a supervised machine learning method, I had to label each image and save the annotations to an XML file before I could train my custom model. Many GUI image annotation tools are available to help with labelling, and you can choose one based on your preference and platform. I used Microsoft's VoTT tool (https://github.com/microsoft/VoTT) and saved the annotations in Pascal VOC XML format, as shown in Figure 2.

Figure 2: VoTT image annotation tool
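To make the annotation format concrete, here is a minimal sketch of reading one Pascal VOC XML annotation with the Python standard library. The sample file name, the label `ethernet_port`, and the coordinates are made up for illustration; they are not taken from my dataset.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample annotation: file name, label, and coordinates are invented.
SAMPLE_XML = """<annotation>
  <filename>router_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>ethernet_port</name>
    <bndbox><xmin>120</xmin><ymin>200</ymin><xmax>220</xmax><ymax>280</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc(xml_text):
    """Return the file name, image size, and labeled boxes of one annotation."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        boxes.append({
            "label": obj.findtext("name"),
            # Parse as float so fractional pixel coordinates are also accepted.
            "xmin": float(bndbox.findtext("xmin")),
            "ymin": float(bndbox.findtext("ymin")),
            "xmax": float(bndbox.findtext("xmax")),
            "ymax": float(bndbox.findtext("ymax")),
        })
    return {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "boxes": boxes,
    }


annotation = parse_voc(SAMPLE_XML)
print(annotation["filename"], annotation["boxes"])
```

Each `object` element carries one label and one bounding box, so an image with several ports simply contains several `object` elements.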

Model Training Without GPU

The next step was to convert my dataset to TensorFlow's binary format so that the TensorFlow Python scripts could process it. I followed the TensorFlow2 tutorial, converted my dataset, and started model training on my PC. On my PC's i7 CPU, training was very slow: it took days to run the 5000 steps needed to train my model. Re-training or fine-tuning the model at that pace is too time-consuming on a regular PC.
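The conversion itself is done by the tutorial's scripts, but two small preparation steps can be sketched in plain Python. This sketch assumes the tutorial in question is the TensorFlow Object Detection API one, which expects class ids that start at 1 (listed in a label map file) and box coordinates scaled to the 0 to 1 range. The helper names and the labels below are hypothetical, and writing the actual binary records is not shown.

```python
def build_label_map(labels):
    """Assign integer ids starting at 1 (0 is reserved for background)."""
    return {name: i for i, name in enumerate(sorted(set(labels)), start=1)}


def label_map_pbtxt(label_map):
    """Render the label map in the text format used for label map files."""
    items = []
    for name, idx in sorted(label_map.items(), key=lambda kv: kv[1]):
        items.append("item {\n  id: %d\n  name: '%s'\n}\n" % (idx, name))
    return "\n".join(items)


def normalize_box(box, width, height):
    """Scale pixel coordinates to the [0, 1] range using the image size."""
    return {
        "xmin": box["xmin"] / width,
        "ymin": box["ymin"] / height,
        "xmax": box["xmax"] / width,
        "ymax": box["ymax"] / height,
    }


# Hypothetical labels and box, for illustration only.
label_map = build_label_map(["usb_port", "ethernet_port", "usb_port"])
print(label_map_pbtxt(label_map))
print(normalize_box({"xmin": 160, "ymin": 120, "xmax": 320, "ymax": 240}, 640, 480))
```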

Model Training With GPU

So I moved my model training to Google Colab. Colab provides a Python virtual machine in the cloud, and I could allocate an Nvidia GPU to speed up training. After setting up my Colab account and configuring a GPU runtime, I was able to run the same 5000 steps in one hour. This shows how well GPU hardware suits highly parallel computation such as model training. Because Colab is a free service that shares GPU resources among users, jobs have a 12-hour time limit. Save your data as soon as a job finishes, or you will lose your progress when the Colab session times out or sits idle for too long.
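Two habits help here: confirming that the runtime really sees a GPU, and copying training output somewhere persistent before the session ends. The following is a minimal sketch, not the exact code I ran. The helper names are hypothetical, and the backup location is an assumption (for example, a folder on a mounted Google Drive; mounting is not shown).

```python
import shutil
import time
from pathlib import Path


def available_gpus():
    """List GPU device names visible to TensorFlow, or [] if it is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


def backup_training_dir(train_dir, backup_dir):
    """Zip the training directory (checkpoints, logs) into backup_dir.

    Returns the path of the archive that was written.
    """
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    base_name = backup_dir / ("training-" + stamp)
    return shutil.make_archive(str(base_name), "zip", root_dir=str(train_dir))
```

Calling `available_gpus()` right after the runtime starts gives an early warning if the GPU was not allocated, and calling `backup_training_dir(...)` at the end of a job keeps the checkpoints outside the session's temporary disk.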

Putting It All Together

Finally, I downloaded my trained model to my PC and tested it. The test was a success: my application can recognize ethernet ports through a webcam, as shown in Figure 3.

Figure 3: Router’s RJ45 ethernet ports correctly identified

Many modern applications use augmented reality and machine learning techniques to detect objects and provide a better user experience. To achieve this, developers create custom models and spend a lot of time training and testing them. This article describes my experience setting up a TensorFlow2 GPU development environment, preparing the dataset, and running model training. With the right tools, the process goes much faster, and I hope this helps developers working on similar tasks.
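As a sketch of the webcam test step described above, the function below turns raw detections into pixel rectangles that can be drawn on a video frame. It assumes the model returns boxes as normalized `[ymin, xmin, ymax, xmax]` values together with scores and class ids, which is the output layout of TensorFlow Object Detection API models. The function name, the 0.5 score threshold, and the sample values are illustrative choices, not part of my application.

```python
def to_pixel_boxes(boxes, scores, classes, frame_width, frame_height, min_score=0.5):
    """Keep confident detections and convert normalized boxes to pixel coordinates.

    Each box is assumed to be [ymin, xmin, ymax, xmax] in the 0..1 range.
    """
    results = []
    for box, score, class_id in zip(boxes, scores, classes):
        if score < min_score:
            continue  # Drop low-confidence detections.
        ymin, xmin, ymax, xmax = box
        results.append({
            "class_id": int(class_id),
            "score": float(score),
            "left": int(xmin * frame_width),
            "top": int(ymin * frame_height),
            "right": int(xmax * frame_width),
            "bottom": int(ymax * frame_height),
        })
    return results


# Made-up detections for a 640x480 frame: one confident, one not.
detections = to_pixel_boxes(
    boxes=[[0.25, 0.5, 0.75, 1.0], [0.0, 0.0, 1.0, 1.0]],
    scores=[0.9, 0.2],
    classes=[1, 2],
    frame_width=640,
    frame_height=480,
)
print(detections)
```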