Ever wonder how self-driving cars can “see” the road? That is computer vision at work! It's becoming super critical as every day we generate more and more images. Computer vision is used across various sectors, including healthcare and security.
Computer vision enables computers to “see” and understand pictures, just like we do. For this, Python is a good choice because it is simple to use and has lots of useful tools. So, in this blog, we will learn about computer vision with Python.
Configuring Your Computer Vision Development Environment
All right, so before you get started building cool stuff, you need some good tools. Fortunately for us, Python makes this straightforward. Now let’s configure your computer vision environment.
Installing Python and pip
First, you should have Python installed. Download the newest version from the Python website and install it. During installation, make sure to check “Add Python to PATH”. Pip is bundled with Python (but you can install it separately if it is missing). Now we can open our command prompt or terminal and run:
python -m ensurepip --default-pip
This leaves you with pip available to install other libraries. So, now you are prepared to move on.
Key Libraries Installation: OpenCV, TensorFlow, and PyTorch
These are the big three! OpenCV takes care of basic image tasks. TensorFlow and PyTorch are for more advanced things such as deep learning.
OpenCV:
pip install opencv-python
TensorFlow:
pip install tensorflow
PyTorch: Go to the PyTorch website and pick the installation command that matches your operating system and CUDA version.
If you get errors during installation, check that your pip is up to date:
pip install --upgrade pip
Make sure you are on the right versions of Python and other dependencies as well.
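To confirm the installs worked, a quick check like the one below can help. It is only a minimal sketch: the helper name installed_packages is made up for this post, and it only tells you whether Python can find each module, not whether the library is set up correctly for your GPU.

```python
import importlib.util

def installed_packages(module_names):
    """Map each module name to True if Python can find it, else False."""
    return {name: importlib.util.find_spec(name) is not None
            for name in module_names}

# cv2, tensorflow and torch are the import names of the three libraries above
status = installed_packages(["cv2", "tensorflow", "torch"])
for name, found in status.items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

If a library shows up as missing, rerun its pip command from above inside the same environment.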
Setting Up a Virtual Environment
Virtual environments keep your projects separate. This makes sure there will be no conflicts between different projects. To create one, use this command:
python -m venv myenv
This creates a new virtual environment called “myenv”. Activate it with:
Windows: myenv\Scripts\activate
macOS/Linux: source myenv/bin/activate
When finished, turn it off with the command deactivate. If you repeat this for every project, each one gets its own clean and orderly workspace.
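If you are not sure whether the environment is active, you can ask Python itself. The snippet below is a small sketch that relies on how the standard venv module behaves; the function name in_virtualenv is our own.

```python
import sys

def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix

print("Virtual environment active:", in_virtualenv())
print("Interpreter:", sys.executable)
```

Run it once before and once after activating myenv and compare the two interpreter paths.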
From Zero to Hero: Computer Vision with Python
With the environment set, let us discover some core techniques. These techniques allow you to process images. You can refine them and get key insights from them.
Image Processing Basics: How to Read, Display and Save Images
OpenCV makes it easy to handle images. This is how to load, show and save an image:
import cv2
# Read an image (imread returns None if the file cannot be loaded)
img = cv2.imread('myimage.jpg')
# Display the image until a key is pressed
cv2.imshow('Image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
# Save the image
cv2.imwrite('new_image.jpg', img)
BGR (default in OpenCV) and RGB are examples of color spaces for images. You can convert between them:
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
Color spaces are foundational for many image processing tasks.
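To see what that conversion actually does, here is a small NumPy sketch. For a regular 3-channel image, going from BGR to RGB just reverses the order of the channels. The two-pixel image is made up for illustration, and the sketch uses plain NumPy slicing rather than OpenCV.

```python
import numpy as np

# A tiny 1x2 image in BGR order: one pure-blue pixel and one pure-red pixel
img_bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the B and R channels (BGR -> RGB)
img_rgb = img_bgr[..., ::-1]

print(img_rgb[0, 0])  # the blue pixel, now stored as (R, G, B)
print(img_rgb[0, 1])  # the red pixel, now stored as (R, G, B)
```

This matters when you pass OpenCV images to libraries that expect RGB, such as Matplotlib. Without the conversion, reds and blues appear swapped.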
Image Filtering: Blurring, Sharpening and Edge Detection
Filtering can blur an image, sharpen it, or find its edges. Here are a few examples:
import cv2
import numpy as np
# Blurring with a 5x5 Gaussian kernel
blurred = cv2.GaussianBlur(img, (5, 5), 0)
# Sharpening with a 3x3 kernel
kernel = np.array([[-1, -1, -1],
                   [-1,  9, -1],
                   [-1, -1, -1]])
sharpened = cv2.filter2D(img, -1, kernel)
# Edge detection (lower and upper thresholds)
edges = cv2.Canny(img, 100, 200)
Try different filter types and settings for unique effects.
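If you are curious what a kernel really does, the sketch below applies the same sharpening kernel by hand with NumPy. It is a simplified illustration, not how OpenCV implements filter2D: it skips the border pixels and does not clip values to the 0 to 255 range. The function name apply_kernel and the tiny test images are our own.

```python
import numpy as np

def apply_kernel(gray, kernel):
    """Slide a kernel over a 2-D image (borders are skipped for brevity)."""
    kh, kw = kernel.shape
    out_h = gray.shape[0] - kh + 1
    out_w = gray.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float64)
    for y in range(out_h):
        for x in range(out_w):
            window = gray[y:y + kh, x:x + kw]
            out[y, x] = np.sum(window * kernel)  # weighted sum of the window
    return out

sharpen = np.array([[-1, -1, -1],
                    [-1,  9, -1],
                    [-1, -1, -1]], dtype=np.float64)

flat = np.full((5, 5), 10.0)   # a uniform region
spot = flat.copy()
spot[2, 2] = 20.0              # one brighter pixel in the middle

print(apply_kernel(flat, sharpen))         # unchanged: the weights sum to 1
print(apply_kernel(spot, sharpen)[1, 1])   # the bright pixel gets exaggerated
```

Because the kernel weights add up to 1, flat areas stay as they are while differences between neighboring pixels get amplified. That is why the result looks sharper.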
Feature Detection
Feature detection finds interesting points in an image. These points aid in recognizing objects. Common techniques are Harris corners, SIFT, and SURF.
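import cv2
import numpy as np
# Harris corner detection works on a float32 grayscale image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = np.float32(gray)
# Arguments: blockSize=2, ksize=3, k=0.04
dst = cv2.cornerHarris(gray, 2, 3, 0.04)
# Dilate the response so the marked corners are easier to see
dst = cv2.dilate(dst, None)
# Mark strong corners in red (BGR order)
img[dst > 0.01 * dst.max()] = [0, 0, 255]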
These features usually also serve as the basis for more advanced object recognition systems.
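The 0.04 passed to cornerHarris is the k in the Harris score, R = det(M) - k * trace(M)^2, where M is built from image gradients around each pixel. The sketch below computes that score for three made-up patches. The gradient sums are hypothetical numbers picked for illustration, and the helper harris_response is not part of OpenCV.

```python
def harris_response(sxx, syy, sxy, k=0.04):
    """Harris score R = det(M) - k * trace(M)^2 for M = [[sxx, sxy], [sxy, syy]]."""
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace

# Hypothetical gradient sums for three kinds of image patch
print(harris_response(0.0, 0.0, 0.0))      # flat patch: no gradient at all
print(harris_response(100.0, 0.0, 0.0))    # edge: gradient in one direction only
print(harris_response(100.0, 100.0, 0.0))  # corner: strong gradient in both directions
```

Only the corner patch gets a positive score, which is why thresholding the response picks out corners and ignores edges and flat areas.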
Detection and Recognition of Objects
Now let's teach our computer to identify objects! We will employ a range of both traditional and contemporary techniques. Let’s begin.
Object Detection Without Deep Learning: Haar Cascades
Haar feature-based cascade classifiers are a classic method for object detection. They are quick but not always accurate.
import cv2
# Load the cascade
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Detect faces (scaleFactor=1.1, minNeighbors=4)
faces = face_cascade.detectMultiScale(gray, 1.1, 4)
# Draw rectangles around faces
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)
In highly complex scenes, Haar cascades are less reliable.
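Each detection comes back as an (x, y, w, h) box, so once you have the list you can work with it in plain Python. As a small example, the sketch below picks the largest box, which is often the face closest to the camera. The helper largest_box and the sample boxes are hypothetical and only here for illustration.

```python
def largest_box(boxes):
    """Return the (x, y, w, h) box with the biggest area, or None if empty."""
    boxes = [tuple(int(v) for v in box) for box in boxes]
    if not boxes:
        return None
    return max(boxes, key=lambda box: box[2] * box[3])

# Hypothetical detections in the (x, y, w, h) layout used above
faces = [(10, 20, 40, 40), (100, 50, 80, 80), (200, 10, 30, 30)]
x, y, w, h = largest_box(faces)
print("Largest face:", (x, y, w, h))
print("Bottom-right corner:", (x + w, y + h))
```

The None check matters because a detector can easily return no boxes at all for a given image.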
Deep Learning for Object Detection: YOLO and SSD
YOLO (You Only Look Once) and SSD (Single Shot Detector) use deep learning to improve both speed and accuracy. They outperform Haar cascades.
import cv2
import numpy as np
# Load YOLO
net = cv2.dnn.readNet('yolov3.weights', 'yolov3.cfg')
# Load the class names, one per line
with open('coco.names', 'r') as f:
    classes = [line.strip() for line in f]
# Look up the names of the output layers (indices are 1-based);
# flatten() handles both the older nested and the newer flat return format
layer_names = net.getLayerNames()
output_layers = [layer_names[i - 1] for i in np.array(net.getUnconnectedOutLayers()).flatten()]
YOLO is a very fast model, while SSD achieves a good trade-off between speed and accuracy.
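Detectors like these usually propose several overlapping boxes for the same object, so a common follow-up step is non-maximum suppression (NMS). It is not part of the loading code above. The sketch below is a plain-Python illustration of the idea with made-up boxes and scores. In practice OpenCV's cv2.dnn.NMSBoxes can do this for you.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    inter_w = max(0, min(ax2, bx2) - max(a[0], b[0]))
    inter_h = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes that overlap it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

# Hypothetical detections: the first two boxes cover the same object
boxes = [(10, 10, 100, 100), (15, 15, 100, 100), (300, 300, 50, 50)]
scores = [0.9, 0.8, 0.7]
print(non_max_suppression(boxes, scores))  # indices of the boxes that survive
```

The 0.5 threshold is an assumption. Lower values remove more overlapping boxes, higher values keep more of them.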
How Image Classification Works with Convolutional Neural Networks (CNN)
CNNs are the workhorse of image classification. They learn features automatically from images.
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
# Build the model
model = Sequential([
    # Start with a convolutional layer
    Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(10, activation='softmax')
])
# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
With transfer learning, you start from a pre-trained model, which lets you attain high accuracy with less training data.
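It helps to follow the shapes through the small model above by hand. The sketch below does the arithmetic in plain Python, assuming the Keras defaults of no padding and a stride of 1 for the convolution, and a stride equal to the pool size for pooling. The helper conv_output is our own.

```python
def conv_output(size, kernel, stride=1):
    """Output width/height of an unpadded ('valid') convolution or pooling."""
    return (size - kernel) // stride + 1

# Conv2D(32, (3, 3)) on a 28x28x1 input
conv_size = conv_output(28, 3)
conv_params = (3 * 3 * 1 + 1) * 32     # 9 weights + 1 bias per filter

# MaxPooling2D((2, 2)) moves in steps of 2 and has no parameters
pool_size = conv_output(conv_size, 2, stride=2)

# Flatten() turns the pooled volume into one long vector
flat_units = pool_size * pool_size * 32

# Dense(10): one weight per input per class, plus one bias per class
dense_params = (flat_units + 1) * 10

print("Conv output:", (conv_size, conv_size, 32))
print("Pool output:", (pool_size, pool_size, 32))
print("Flattened units:", flat_units)
print("Total parameters:", conv_params + dense_params)
```

You can compare these numbers with what model.summary() prints for the model above.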
Guide to Computer Vision with Python: Real-world Applications
All of these advantages make computer vision a game changer for many sectors. It offers solutions that are novel. Here are some examples.
Medical Image Analysis
Computer vision helps with medical image analysis. For example, it can help identify tumors in X-rays, which can lead to faster and more precise diagnoses.
Autonomous Vehicles
Computer vision is what self-driving cars use to see the world around them. They see lanes, traffic signs and other vehicles. This is how they stay on the road and avoid obstacles.
Security and Surveillance
Security is one of the domains where facial recognition systems play a key role. They are used to identify people in real time, which can assist in preventing crime and increasing safety.
Advanced Computer Vision Topics
Want to go deeper? Explore advanced topics here.
Image Segmentation
Image segmentation breaks down an image into meaningful parts. It is helpful in medical imaging and autonomous driving.
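The simplest form of segmentation is thresholding: every pixel brighter than a cut-off belongs to the object, everything else is background. The sketch below shows the idea on a made-up 4x4 grayscale image. The threshold of 128 is an assumption, and real medical or driving systems typically rely on far more capable, learned models.

```python
import numpy as np

# Hypothetical 4x4 grayscale image: a bright object on a dark background
gray = np.array([[10,  12,  11,  9],
                 [13, 200, 210, 10],
                 [11, 205, 220, 12],
                 [ 9,  10,  13, 11]], dtype=np.uint8)

threshold = 128          # assumed cut-off between dark and bright
mask = gray > threshold  # True where the pixel belongs to the object

print(mask.astype(np.uint8))
print("Object pixels:", int(mask.sum()))
```

The resulting mask has the same size as the image and marks which part each pixel belongs to, which is exactly what a segmentation result is.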
Optical Character Recognition (OCR)
OCR stands for Optical Character Recognition, which means converting images of text into machine-readable text. It is used to scan documents and for data entry.
3D Computer Vision
3D computer vision reconstructs 3D models from images. This is very useful for robotics and AR/VR applications.
Conclusion: The Future Is Looking Bright
That wraps up our tour of computer vision, a useful field with lots of applications. The good news is that Python is relatively easy to learn. So far we have covered the basics and looked at a few cool applications.
The scope of computer vision is enormous. It will keep transforming the way we interact with technology. Explore, play, and create something awesome!