You only look once (YOLO) is a state-of-the-art, real-time object detection system.
As per official Yolo site:
“Our model has several advantages over classifier-based systems. It looks at the whole image at test time so its predictions are informed by global context in the image. It also makes predictions with a single network evaluation unlike systems like R-CNN which require thousands for a single image. This makes it extremely fast, more than 1000x faster than R-CNN and 100x faster than Fast R-CNN. See our paper for more details on the full system.”
Object detector is a combination of object locator and object recognizer and Yolo does this by runing forward the whole image only once through the deep neural network (DNN).
Realtime Object Detection
We will use OpenCV for this, especially DNN, Yolov3, Darknet. And since we are going to use Python, OpenCV provides a good port for Darknet.
I am using Python3 though have earlier have used OpenCV with Python2 versions as well and it just works fine.
Preparation/Setup
Install OpenCV
Good intro to OpenCV there on https://docs.opencv.org/master/d9/df8/tutorial_root.html
Adrian Rosebrock have published a detailed step by step guide on his blog pip install opencv
Once you have installed Python and OpenCV, you should see below;
$ workon
cv
dl4cv
$ workon cv
(cv) $ python
Python 3.7.4 (default, Mar 24 2019, 18:13:23)
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> cv2.__version__
'4.1.0'
>>>
I’m also using a great utility by Adrian Rosebrock called imutils that has many very useful OpenCV utilities.
Download Pre-trained Models
Create a project directory e.g. yolo-realtime-object-detection and download the following files to yolo-realtime-object-detection/model directory (or whatever name you prefer).
Pre-trained network’s weights (this is ~250mb in size). yolov3.weights - https://pjreddie.com/media/files/yolov3.weights
Network configuration. yolov3.cfg - https://github.com/pjreddie/darknet/blob/master/cfg/yolov3.cfg?raw=true
Names from COCO dataset. coco.names - https://github.com/pjreddie/darknet/blob/master/data/coco.names?raw=true
Object Detection
Initialise OpenCV/DNN with Darknet with COCO dataset:
net = cv.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
net.setPreferableBackend(cv.dnn.DNN_BACKEND_OPENCV)
net.setPreferableTarget(cv.dnn.DNN_TARGET_CPU)
Initialise video stream:
# create window and start video stream to capture images
cv.namedWindow(windowName, cv.WINDOW_NORMAL)
vs = VideoStream(src=0).start()
cv.resizeWindow(windowName, 800, 600)
Capture and process video stream (images), pass through DNN and filter objects based on confident level:
# grab the frame from video stream
frame = vs.read()
# create blob from frame.
blob = cv.dnn.blobFromImage(frame, 1/255, (inputWidth, inputHeight), [0,0,0], 1, crop=False)
# set input to the network
net.setInput(blob)
# forward pass to get output of the output layers
outs = net.forward(getLayerNames(net))
# process output
processOutput(frame, outs)
# show output
cv.imshow(windowName, frame)
Recently I’ve added code on Git, here is my python code for yolo-realtime-object-detection https://github.com/manmohanp/machineintelligence/tree/master/yolo-realtime-object-detection
(cv) $ python detect_object.py
This should initiate your native camera and detect objects.
Here is an output;
Thats it!! This can detect 80 objects that are in COCO dataset.
Next is to add custom objects for detection.