Activity goal
The aim is to develop a computer vision algorithm that detects boats in an image. Specifically, it should draw a rectangular bounding box around each detected boat, and it should behave correctly even when the image contains no boats. Boats can appear from any perspective and vary widely, from small dinghies to cruise ships. The algorithm’s performance is measured on the provided test images using the intersection over union (IoU) metric.
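For reference, the IoU of two boxes is the area of their intersection divided by the area of their union, so it ranges from 0 (no overlap) to 1 (identical boxes). The following is a minimal sketch of that computation, assuming axis-aligned boxes stored as (x, y, width, height) with (x, y) the top-left corner; the function name `iou` and this box layout are illustrative assumptions, not taken from the project code.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Assumed (hypothetical) layout: (x, y, w, h), with (x, y) the top-left corner.
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b

    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h

    # Union = sum of the two areas minus the part counted twice.
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0
```

For example, `iou(predicted_box, ground_truth_box)` would score a single detection against its annotation. How predictions are matched to annotations, and how images without boats are scored, is not specified here and is left out of the sketch.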
Adopted approach
Given the great variety of boats and the availability of a large data set of images, a machine learning approach seems promising, so the proposed solution uses a neural network. In particular, the developed algorithm first extracts features (key points and descriptors) from the image, then classifies each descriptor as boat/non-boat, and lastly builds the bounding boxes. This last step is, in turn, composed of several sub-steps. The algorithm first generates a probability density function (PDF) of the potential boat locations. The PDF modes are then clustered using the mean shift algorithm, and finally the bounding boxes are generated with a simple iterative procedure. The complete algorithm is summarized here:
- Extract features (key points and descriptors) from the input image.
- Classify each descriptor as boat/non-boat using the trained classifier.
- Generate the probability map of the possible positions of the boats.
- Find and cluster the modes of the above probability density using the mean shift algorithm.
- Prune and suppress “weak” clusters.
- For each cluster, generate the bounding box.
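The last three steps can be illustrated with a small NumPy sketch. It is not the project's implementation: it assumes the boat-classified key points are already available as (x, y) coordinates, uses a Gaussian kernel density over those points as a stand-in for the probability map, prunes clusters by a simple member count, and takes the tight box around each cluster instead of the iterative box-growing procedure mentioned above. All function names and the `bandwidth` and `min_points` parameters are hypothetical.

```python
import numpy as np

def mean_shift_modes(points, bandwidth, n_iter=50, tol=1e-3):
    """Move every point uphill on a Gaussian kernel density estimate."""
    pts = np.asarray(points, dtype=float)
    modes = pts.copy()
    for _ in range(n_iter):
        # Squared distances between current mode estimates and all samples.
        d2 = ((modes[:, None, :] - pts[None, :, :]) ** 2).sum(axis=2)
        w = np.exp(-0.5 * d2 / bandwidth ** 2)
        # Each estimate moves to the kernel-weighted mean of the samples.
        shifted = (w @ pts) / w.sum(axis=1, keepdims=True)
        converged = np.abs(shifted - modes).max() < tol
        modes = shifted
        if converged:
            break
    return modes

def cluster_modes(modes, merge_radius):
    """Give the same label to points whose modes ended up close together."""
    labels = np.full(len(modes), -1, dtype=int)
    centers = []
    for i, m in enumerate(modes):
        for k, c in enumerate(centers):
            if np.linalg.norm(m - c) < merge_radius:
                labels[i] = k
                break
        else:
            centers.append(m)
            labels[i] = len(centers) - 1
    return labels

def detect_boxes(boat_keypoints, bandwidth=30.0, min_points=5):
    """Return one (x, y, w, h) box per sufficiently populated cluster."""
    pts = np.asarray(boat_keypoints, dtype=float).reshape(-1, 2)
    if len(pts) == 0:
        return []  # no boat-classified key points: no boxes
    modes = mean_shift_modes(pts, bandwidth)
    labels = cluster_modes(modes, merge_radius=bandwidth / 2)
    boxes = []
    for k in range(labels.max() + 1):
        members = pts[labels == k]
        if len(members) < min_points:
            continue  # "weak" cluster: too few supporting key points
        x0, y0 = members.min(axis=0)
        x1, y1 = members.max(axis=0)
        boxes.append((float(x0), float(y0), float(x1 - x0), float(y1 - y0)))
    return boxes
```

Calling `detect_boxes(points)` on an array of boat-classified key point coordinates returns a list of boxes, and an empty list when no key point was classified as boat, which covers the no-boat case. The bandwidth controls how far apart two groups of key points must be to count as separate boats, so it would need tuning against the image resolution.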