Comparison study of various color-based and contour-based pumpkin counting methods for aerial farm monitoring.

Applied various methods and compared their efficiency and accuracy on a specific, predetermined task.

tags: data analytics, dataviz, python, openCV

Problem Solution Statement

  • Pumpkin farming can be a challenging business, and one of the biggest obstacles faced by farmers is managing supply and demand. With thousands of hectares of land to cultivate, it can be difficult to accurately estimate the number of pumpkins that will be produced each season, leading to potential shortages or surpluses.
  • By carefully monitoring and counting the pumpkins, farmers can gain a better understanding of their supply and adjust their production accordingly. This can help ensure that they are able to meet market demand and avoid any potential losses due to overproduction or waste.
  • By accurately estimating their supply, farmers can also plan for the future and make strategic decisions about when to plant, harvest, and sell their pumpkins. This can help them maximize profits and achieve long-term success in their business.

The Work

  • FYI: I wrote a tutorial on my blog about the process of creating the program in Python; you can check it out on my Substack.
  • We go beyond a single algorithm: the methods evaluated for accuracy are Color Segmentation, Sobel Edge Detection, Simple Binary Thresholding, Distance Transform, Canny Edge Detector, Circle Hough Transform, and Watershed Transform.
    • Color Segmentation
      • Every image can be represented by a model that consists of tuples of numbers. This model can be interpreted using a specific set of rules, so that it can be perceived by optical sensors such as the eye. Color Segmentation is an image processing technique that applies thresholding to extract a certain color range from the image. This can be used to create a binary image that contains only the pumpkins, with pumpkins represented as white and everything else as black.
    • Sobel Edge Detection
      • Sobel Edge Detection is a method used to detect edges using the Sobel Operator, an isotropic 3x3 image gradient operator. The calculation uses two 3x3 kernels, which are convolved with the original image to calculate the approximated gradients, one each for the horizontal and vertical directions. If we define S as the source image, and Gx and Gy as the gradient approximation images for the x and y directions respectively, then Gx and Gy can be approximated using the standard Sobel kernels
        $$G_x = \begin{bmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{bmatrix} * S, \qquad G_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ +1 & +2 & +1 \end{bmatrix} * S$$
      • with * representing the two-dimensional signal processing convolution operation. The resulting gradient approximation can be calculated with
        $$G = \sqrt{G_x^2 + G_y^2}$$
        with G representing the gradient image.
    • Simple Binary Thresholding
      • Binary thresholding is similar to color segmentation, except that binary thresholding can only be applied to grayscale images, and each resulting pixel value will be either the maximum or the minimum value. The function that represents the operation on each pixel of the source image can be written as
        $$\text{dst}(x, y) = \begin{cases} \text{maxval} & \text{if } \text{src}(x, y) > \text{thresh} \\ 0 & \text{otherwise} \end{cases}$$
    • Distance Transformation
      • Distance Transformation is the process of replacing each pixel value in a binary image with the distance from that pixel to the closest boundary.
    • Canny Edge Detector
      • Canny Edge Detector is a multi-step algorithm used to detect edges in images. Its process is as follows:
        • Apply Gaussian blurring to reduce noise. Edge detection is susceptible to noise, so noise is first reduced using a 5x5 Gaussian filter.
        • Calculate the intensity gradients of the image. This is done by calculating the gradients in the x and y directions, Gx and Gy, and then using the equations
          $$G = \sqrt{G_x^2 + G_y^2}, \qquad \theta = \tan^{-1}\left(\frac{G_y}{G_x}\right)$$
        • Apply non-maximum suppression to remove spurious results. Every pixel is checked to see whether it is a local maximum among its neighbors along the gradient direction calculated in step two. Pixels that are not local maxima have their values set to zero.
        • Apply hysteresis thresholding using two threshold values. Pixels with a gradient above the maximum threshold are accepted as edges, and pixels below the minimum threshold are discarded. Pixels in between are kept only if they are connected to pixels that are already part of an edge.
    • Circle Hough Transform
      • Circle Hough Transform is an algorithm used to extract circles from images. Let’s say we want to find a circle with radius R. A circle of radius R is defined around each edge point, with its center on the edge, and an accumulator matrix stores the points where these circles intersect. Each additional circle passing through a point increases that point's vote count by one, and every point with at least a predetermined number of votes is considered the center of a circle.
    • Watershed Transform
      • Watershed Transform is a transformation algorithm used to create borders between objects. It can be illustrated as drawing a drainage divide, which separates adjacent drainage basins.
        An intuitive way to explain Watershed Transform is to imagine a relief with multiple regional minima. Each minimum is filled with water until the relief is flooded, and walls are placed where water from different minima meets.
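
To make the two simplest color-based methods concrete, here is a minimal pure-Python sketch of Simple Binary Thresholding and Color Segmentation, followed by a blob count on the resulting binary image. It is not the program used in this work: the tiny synthetic field, the RGB values, the threshold numbers, and the helper names (`binary_threshold`, `color_segment`, `count_blobs`) are all assumptions made for illustration. In OpenCV the same two operations correspond to `cv2.threshold` and `cv2.inRange`.

```python
from collections import deque

def binary_threshold(gray, thresh, max_val=255):
    """One-sided threshold: max_val where the pixel exceeds thresh, else 0."""
    return [[max_val if px > thresh else 0 for px in row] for row in gray]

def color_segment(img, lower, upper, max_val=255):
    """Two-sided threshold: max_val where every channel lies in [lower, upper]."""
    return [
        [max_val if all(lo <= c <= hi for c, lo, hi in zip(px, lower, upper)) else 0
         for px in row]
        for row in img
    ]

def count_blobs(mask):
    """Count 4-connected regions of non-zero pixels with a breadth-first search."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    blobs = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] == 0 or seen[y][x]:
                continue
            blobs += 1                      # found an unvisited white pixel
            seen[y][x] = True
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return blobs

# Hypothetical RGB colors and a made-up 5x6 field with three separate pumpkins.
SOIL, PUMPKIN = (60, 40, 20), (240, 140, 20)
layout = [
    "......",
    ".PP...",
    ".PP..P",
    ".....P",
    "PP....",
]
field = [[PUMPKIN if ch == "P" else SOIL for ch in row] for row in layout]
gray = [[sum(px) // 3 for px in row] for row in field]   # crude grayscale

by_threshold = count_blobs(binary_threshold(gray, 100))
by_color = count_blobs(color_segment(field, (200, 100, 0), (255, 180, 60)))
print(by_threshold, by_color)
```

On this clean synthetic field both methods find the same three blobs. Pumpkins that touch each other would merge into a single blob, which is the situation Watershed Transform is meant to handle.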
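
The Sobel gradient computation described earlier can be sketched in the same way. This is a pure-Python illustration rather than the code used in this work, and it assumes the common form of the two 3x3 Sobel kernels. The kernels are applied as a sliding-window correlation, as OpenCV's `filter2D` does; flipping them for a true convolution only changes the sign of Gx and Gy, not the magnitude G. The helper names and the synthetic step-edge image are hypothetical.

```python
import math

# Assumed standard Sobel kernels for the x and y directions.
KX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
KY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def apply_kernel(img, kernel):
    """Slide a 3x3 kernel over the interior pixels; border pixels stay 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(
                kernel[j][i] * img[y + j - 1][x + i - 1]
                for j in range(3)
                for i in range(3)
            )
    return out

def sobel_magnitude(img):
    """Return Gx, Gy and the gradient magnitude G = sqrt(Gx^2 + Gy^2)."""
    gx = apply_kernel(img, KX)
    gy = apply_kernel(img, KY)
    mag = [[math.hypot(a, b) for a, b in zip(row_x, row_y)]
           for row_x, row_y in zip(gx, gy)]
    return gx, gy, mag

# Synthetic grayscale image: dark left half, bright right half (a vertical edge).
step = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
gx, gy, mag = sobel_magnitude(step)
print(gx[2])   # strong response only in the two columns next to the edge
print(gy[2])   # no vertical change, so Gy is zero everywhere
```

The response is large at every intensity step regardless of what object caused it, which is why an edge map alone cannot tell a pumpkin boundary from any other boundary.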

Comparison Analysis

  • The performance parameters used for comparison are accuracy, precision, and sensitivity. These data are listed in tables and normalized, so that each type of performance data carries the same weight in deciding each method's performance.
  • Then, for each method, the values of each performance parameter are summed over all samples. This produces a different total for each method, and the method with the highest total can be considered the best, as accuracy, precision, and sensitivity all correlate positively with performance.
  • Accuracy is defined as the degree of closeness of measurements of a quantity to that quantity's true value. Mathematically, accuracy in binary classification is defined as
    $$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
    where TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false negatives. But there is no way to count true negatives, because there are infinitely many of them, so the equation becomes
    $$\text{Accuracy} = \frac{TP}{TP + FP + FN}$$
  • Precision is defined as the degree to which repeated measurements under unchanged conditions show the same result. Mathematically, precision in binary classification is defined as
    $$\text{Precision} = \frac{TP}{TP + FP}$$
  • Sensitivity is defined as the ratio of correctly identified positives to the true number of positives. Mathematically, sensitivity in binary classification is defined as
    $$\text{Sensitivity} = \frac{TP}{TP + FN}$$
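
The scoring procedure described above can be sketched as follows. This is only an illustration under stated assumptions: accuracy is computed without true negatives as TP / (TP + FP + FN), and each metric is normalized by dividing by its largest value across all methods and samples, since the exact normalization used in this work is not specified here. The method names and the TP/FP/FN counts are made up and are not results from this study.

```python
METRICS = ("accuracy", "precision", "sensitivity")

def detection_metrics(tp, fp, fn):
    """Accuracy (without true negatives), precision and sensitivity from counts."""
    return {
        "accuracy": tp / (tp + fp + fn) if tp + fp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
    }

def score_methods(counts):
    """counts maps method -> list of (tp, fp, fn) per sample image.

    Each metric is divided by its maximum over all methods and samples
    (assumed normalization), then summed per method.
    """
    table = {m: [detection_metrics(*c) for c in samples]
             for m, samples in counts.items()}
    scores = dict.fromkeys(table, 0.0)
    for key in METRICS:
        peak = max(row[key] for rows in table.values() for row in rows) or 1.0
        for method, rows in table.items():
            scores[method] += sum(row[key] / peak for row in rows)
    return scores

# Hypothetical counts for two methods on two sample images.
counts = {
    "method_a": [(18, 2, 2), (9, 1, 0)],
    "method_b": [(10, 10, 10), (5, 5, 5)],
}
scores = score_methods(counts)
best = max(scores, key=scores.get)
print(best, scores)
```

Because all three metrics are scaled to the same range before summing, none of them dominates the total, which matches the stated goal of giving each type of performance data equal weight.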

Result and Discussion

  • The performance data show that Simple Binary Threshold holds the highest performance value. In general, color-based methods are also shown to be better than contour-based methods, as every color-based method holds a higher performance value than any of the contour-based ones.
  • Simple Binary Threshold works according to the equation displayed in its section. This might seem to be a crude method, but in most of the samples used in this work the pumpkins have the brightest color, as seen in Figure 2. In that case, this method performs well across various images. However, it can give bad results if the threshold value is not set appropriately.
  • Color Segmentation works similarly to Simple Binary Threshold, yet the two produced quite different results. This is caused by their thresholding methods: Simple Binary Thresholding ‘disposes’ of the pixels below the threshold value, while Color Segmentation ‘disposes’ of the pixels below the lower threshold value or above the upper threshold value; this can be seen by comparing their equations. This could be fixed by adjusting both threshold values in Color Segmentation, but the problem is that pumpkin color varies from region to region due to multiple factors such as lighting, shade, noise, and grain.
  • Distance Transformation weights each pixel value according to its distance to the nearest border. This highlights the center of each object and can increase pumpkin counting performance. It doesn’t work as well as the other color-based methods, as pumpkins on a farm usually contrast strongly with their surroundings and the other methods are more sensitive to color differences.
  • Watershed Transform is intended for segmenting objects that are in very close proximity to each other, so it is not surprising that it produced bad results: very few such objects are found in the samples used in this work. This method is usually used together with distance transformation, but that is out of the scope of this work.
  • Sobel Edge Detection and Canny Edge Detector also produced bad results, as they don’t differentiate between the gradient at the boundary between two pumpkins and the gradient at the boundary between a pumpkin and the surrounding soil or plants. However, Sobelx yields better results than Sobelxy for an unknown reason, as calculating gradients from both the x and y axes is usually more reliable than from only one axis. Further study is needed to explain this phenomenon.
  • Circle Hough Transform doesn’t produce good results, as not all pumpkins are circular. Moreover, some of the pumpkins are hidden beneath leaves, which makes their round edges less distinguishable.


    One of the uses of UAVs in the civil sector is aerial monitoring of pumpkin farms to estimate their yield. To do this, one must count the pumpkins in the field, and one way to achieve that is to map out the farm with a camera. The pumpkins can then be counted using various methods; the ones studied in this work are color-based and contour-based. The performance data bar chart showed that, in general, color-based methods are better than contour-based methods for counting pumpkins. This is because contour-based methods don’t differentiate between the edges of different objects in the image, whereas color-based methods can correctly identify pumpkins by their color, which contrasts with their surroundings.

More Information