Books and Sunsets Image Feature Extraction
By Landon Flom
CS510



Introduction


The main features I chose to work with for my two categories, books and sunsets, were based on edges. Using various combinations of the Sobel edge detector, in conjunction with some minor image preparation, I was able to classify up to about 90% of the images in the two data sets correctly. At a very high level, my features were:

  • Feature 1: Compute the ratio of the edges in the top half of a section of the image to the edges in its bottom half.
  • Feature 2: Compute the number of edges, number of horizontal edges, and number of vertical edges for various parts of the image.
  • Feature 3: Compute the number of edges in the image at various levels of blurring.



    Observations About Image Categories

    One of my first thoughts about my two image categories was that sunsets tend to have many of the same colors: red, orange, yellow, and blue. As it turns out, books can also be any of these colors, and book images can in fact resemble sunsets in the location and intensity of these colors:

    Something like this:
    Sunset-like books
    Isn't so far off from something like this:
    Book-like sunset


    It seemed, however, that images with books had more edges than images with sunsets, at least in certain areas. Most images of sunsets are taken from an upright position, meaning that the sky is generally in the top half of the picture, whereas with books, the top half of the picture generally contains more books.


    Features

    For each image category and each feature, a file was created containing the feature values for each image in that category. Each row contains the values for a single image, starting with the name of the feature, the name of the image without the extension, the name of the category, and the number of the image. The values generated for the feature are appended, comma delimited, to the end of the row. Therefore, a given entry in a file will look similar to:
    flom1, book00, book, 0, X1, X2, X3, ...
    
    Where the X's represent values for the features described below.
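The row layout above can be illustrated with a short sketch. The function name `format_feature_row` and the `.jpg` extension are hypothetical and chosen only for this example; the original code that wrote these files is not shown in this report.

```python
import os


def format_feature_row(feature_name, image_file, category, image_number, values):
    """Build one comma-delimited row: feature, image name, category, number, values."""
    # Drop any directory part and the file extension from the image name.
    image_name = os.path.splitext(os.path.basename(image_file))[0]
    fields = [feature_name, image_name, category, str(image_number)]
    fields += [str(v) for v in values]
    return ", ".join(fields)


# Hypothetical file name and a couple of sample values.
print(format_feature_row("flom1", "book00.jpg", "book", 0, [0.6183193, 0.7421267]))
```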


    Feature 1

    Feature 1 attempts to take advantage of the assumption that book images will have a more uniform distribution of edges than sunset images. To do this without worrying about exactly how many edges there are, it takes the number of edges in the top of a region and divides it by the number of edges in the bottom of that region. The image below shows the different regions. The first region is the entire image: the edges are summed in the area above the horizontal red line and divided by the sum of the edges in the area below it. The next regions are the left and right sides of the vertical red line: the edges in the upper left quadrant are summed and divided by the edges in the lower left quadrant, as defined by the two red lines, and the same is done for the two quadrants to the right of the vertical red line. Lastly, the same is done for the areas above and below the horizontal red line (note that the entire half on each side of the line is taken, not split by the vertical red line), with their top and bottom halves defined by the purple lines.


    For this image, the output looks like:
    0.6183193 0.7421267 0.4851653 0.2500768 0.8563609
    
    Where the numbers represent the whole image, left half, right half, top half, and bottom half.
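A minimal sketch of this top-to-bottom ratio is given below. It assumes that "number of edges" means the sum of the Sobel gradient magnitude over a region, that regions are split at the integer midpoint, and that a ratio with an empty bottom half is reported as infinity. All function names are hypothetical and the Sobel filter is written directly in NumPy with edge-replicated borders, which may differ from the original implementation.

```python
import numpy as np


def sobel_magnitude(gray):
    """Sobel gradient magnitude of a 2-D grayscale array (borders replicated)."""
    p = np.pad(gray.astype(float), 1, mode="edge")
    # Left-to-right intensity change (responds to vertical edges).
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) - (
        p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2]
    )
    # Top-to-bottom intensity change (responds to horizontal edges).
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) - (
        p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:]
    )
    return np.hypot(gx, gy)


def top_bottom_ratio(edges):
    """Edge strength in the top half of a region divided by the bottom half."""
    mid = edges.shape[0] // 2
    top, bottom = edges[:mid].sum(), edges[mid:].sum()
    # Assumption: an edge-free bottom half yields infinity rather than an error.
    return float(top / bottom) if bottom > 0 else float("inf")


def feature1(gray):
    """Ratios for the whole image, left half, right half, top half, bottom half."""
    edges = sobel_magnitude(gray)
    h, w = edges.shape
    regions = [
        edges,               # whole image
        edges[:, : w // 2],  # left half (upper left / lower left quadrants)
        edges[:, w // 2 :],  # right half (upper right / lower right quadrants)
        edges[: h // 2],     # top half, split again at its own midpoint
        edges[h // 2 :],     # bottom half, split again at its own midpoint
    ]
    return [top_bottom_ratio(r) for r in regions]


# Toy image: a flat "sky" on top and a striped texture on the bottom.
sample = np.zeros((8, 8))
sample[4:, ::2] = 1.0
print(feature1(sample))
```

For the toy image the whole-image ratio comes out below 1, since nearly all of the edge response sits in the bottom half, which is the behavior this feature expects from a sunset.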


    Feature 2

    Feature 2 looks at the total edges, horizontal edges, and vertical edges separately for various regions in the image. No relative comparisons are done; each computed value is just the sum of the edges for that region. The regions are once again the entire image, the left half, the right half, and the top and bottom halves. Using the same sample image as above, here are the results of the horizontal and vertical edge detectors.

    Horizontal Edges


    Vertical Edges

    The output is three numbers for each region, such as:
    24891.07 14397.39 16538.85 13502.66 7750.529 9029.754 11504.45 7694.643 7482.293 9747.041 6299.21 5833.93 15763.77 15763.77 15763.77
    
    So the first three numbers are the total edges, horizontal edges, and vertical edges for the entire image, the next three are the total, horizontal, and vertical edges for the left half, and so on. Notice that the value for the sum of all edges is not the sum of the horizontal and vertical edge sums. This is because some of the edges found by the two detectors overlap each other and would be counted twice if the horizontal and vertical edges were simply added together.
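The layout of these 15 values can be sketched as follows. This is an illustration under assumptions, not the original code: the total is taken to be the Sobel gradient magnitude, the horizontal and vertical values are the absolute responses of the two directional Sobel kernels, and the function names are hypothetical.

```python
import numpy as np


def sobel(gray):
    """Return (gx, gy) Sobel responses of a 2-D array, borders replicated."""
    p = np.pad(gray.astype(float), 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) - (
        p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2]
    )
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) - (
        p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:]
    )
    return gx, gy


def feature2(gray):
    """Total, horizontal, and vertical edge sums for five regions (15 values)."""
    gx, gy = sobel(gray)
    total = np.hypot(gx, gy)  # combined edge strength (assumed definition)
    horizontal = np.abs(gy)   # horizontal edges: intensity changes top to bottom
    vertical = np.abs(gx)     # vertical edges: intensity changes left to right
    h, w = gray.shape
    regions = [
        (slice(None), slice(None)),         # whole image
        (slice(None), slice(0, w // 2)),    # left half
        (slice(None), slice(w // 2, None)), # right half
        (slice(0, h // 2), slice(None)),    # top half
        (slice(h // 2, None), slice(None)), # bottom half
    ]
    values = []
    for rows, cols in regions:
        for layer in (total, horizontal, vertical):
            values.append(float(layer[rows, cols].sum()))
    return values


# Toy image with a diagonal edge, which triggers both directional detectors.
sample = np.tri(8)
print(feature2(sample))
```

With the total defined as a gradient magnitude, a pixel on a diagonal edge contributes less to the total than to the horizontal and vertical sums combined, so the first value in each triple stays below the sum of the other two. That is consistent with the observation above, though the original detector may have combined the two directions differently.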


    Feature 3

    Feature 3 uses a low pass filter to blur the image before running edge detection. This is done at several levels of blurring applied to the original grayscale version of the image. The images below show what two of the images look like after edge detection has been run at a few levels of blurring.
    Original Images

    Original Grayscale After Low Pass Filter

    Sobel on Original Images

    Sobel on Various Levels of Low Pass Filtering




    Once one of these images has been generated, the same process from feature 2 is used, where the image is divided into various regions and the edges are summed for the top and bottom halves of those regions. This means that this feature generates 105 values for each image.
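The blur-then-detect loop can be sketched as below. Several things here are assumptions: a simple box filter stands in for the low pass filter, the blur radii are hypothetical (the number of levels actually used is not stated here), and each level reuses a feature 2 style layout of 15 values. The sketch does not try to reproduce the 105 values of the original feature.

```python
import numpy as np


def sobel(gray):
    """Return (gx, gy) Sobel responses of a 2-D array, borders replicated."""
    p = np.pad(gray.astype(float), 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) - (
        p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2]
    )
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) - (
        p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:]
    )
    return gx, gy


def box_blur(gray, radius):
    """Low pass filter: mean over a (2*radius+1)-wide square window."""
    if radius == 0:
        return gray.astype(float)
    size = 2 * radius + 1
    h, w = gray.shape
    p = np.pad(gray.astype(float), radius, mode="edge")
    out = np.zeros((h, w))
    for dy in range(size):
        for dx in range(size):
            out += p[dy : dy + h, dx : dx + w]
    return out / (size * size)


def region_sums(gray):
    """Total, horizontal, and vertical edge sums for five regions."""
    gx, gy = sobel(gray)
    layers = (np.hypot(gx, gy), np.abs(gy), np.abs(gx))
    h, w = gray.shape
    regions = [
        (slice(None), slice(None)),
        (slice(None), slice(0, w // 2)),
        (slice(None), slice(w // 2, None)),
        (slice(0, h // 2), slice(None)),
        (slice(h // 2, None), slice(None)),
    ]
    return [float(layer[r, c].sum()) for r, c in regions for layer in layers]


def feature3(gray, radii=(0, 1, 2, 4)):
    """Concatenate the region sums computed at each blur level."""
    values = []
    for radius in radii:  # hypothetical blur levels
        values.extend(region_sums(box_blur(gray, radius)))
    return values


# Toy image: vertical stripes two pixels wide.
sample = np.tile([0, 0, 1, 1], (8, 2)).astype(float)
print(len(feature3(sample)))
```

On fine textures such as the striped toy image, blurring lowers the summed edge strength, so how quickly the values fall off across levels says something about how much of an image's edge content is fine detail.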

    Results

    Once the features were generated, the images were classified by simply projecting each point onto a line that passes through the average point of each class. The halfway point between the two averages is where I cut off one class from the other. To measure how well each feature classifies, the original data is randomly separated into training and testing data, with 80% used for training and the remaining 20% for testing. This is done 1000 times, and the results are averaged to get the percentage of how well each feature performs in general. The graphs displayed, however, show all the images in each category projected onto the line.
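The classifier and the evaluation loop described above can be sketched as follows. The function names are hypothetical, and two details are assumptions: the 80/20 split is drawn separately within each class, and a point that projects exactly onto the midpoint is assigned to the first class.

```python
import numpy as np


def fit_projection(train_a, train_b):
    """Line through the two class means, with a cutoff at their midpoint."""
    mean_a, mean_b = train_a.mean(axis=0), train_b.mean(axis=0)
    direction = mean_b - mean_a
    threshold = direction @ (mean_a + mean_b) / 2  # projection of the midpoint
    return direction, threshold


def predict_is_b(direction, threshold, points):
    """True where a point projects past the midpoint toward class b."""
    return points @ direction > threshold


def average_accuracy(class_a, class_b, trials=1000, train_fraction=0.8, seed=0):
    """Mean test accuracy over repeated random train/test splits."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(trials):
        a = rng.permutation(class_a)  # shuffles rows, returns a copy
        b = rng.permutation(class_b)
        na = int(train_fraction * len(a))
        nb = int(train_fraction * len(b))
        direction, threshold = fit_projection(a[:na], b[:nb])
        correct = (~predict_is_b(direction, threshold, a[na:])).sum()
        correct += predict_is_b(direction, threshold, b[nb:]).sum()
        scores.append(correct / ((len(a) - na) + (len(b) - nb)))
    return float(np.mean(scores))


# Synthetic stand-ins for two classes of feature vectors (one row per image).
rng = np.random.default_rng(1)
books = rng.normal(0.0, 1.0, size=(30, 5))
sunsets = rng.normal(1.5, 1.0, size=(30, 5))
print(average_accuracy(books, sunsets, trials=100))
```

Because the cutoff depends only on the two class means, this classifier ignores how spread out each class is along the line, which may partly explain why one class can be classified perfectly while the other is split.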


    Feature 1

    This feature performed the worst, with an average of only 72.9% of the images correctly classified. As shown in the plot, it was able to classify all of the books correctly, but over half of the sunset images were classified as books.


    According to this feature, the most sunset-like book image, or more accurately, the least book-like book image, was book11, but just barely:



    Feature 2

    This feature was able to classify 85.9% of the images correctly. Here we can see that the two categories are much more evenly distributed relative to each other, and the separation is much better.


    The two outliers with this feature were book10 and sunset20:





    Feature 3

    This feature was able to classify 89.2% of the images correctly. The distribution for this feature is very similar to that of feature 2; however, this feature consistently classified better than the other two features.


    The outliers with this feature were book04 and sunset00:





    Conclusion

    Using edge detection in various ways proved to work reasonably well for these two categories, especially given such a simple classifier. It would be interesting to see how well these features extend to other classes, though I suspect that without the aid of other features, possibly ones that use color, the results would be less than optimal. Interestingly, the images I picked at the beginning as being similar were not a problem to classify using any of the features. This may be because I had those particular images in mind when deciding on what features to use, so the features could be slightly biased toward them. I also didn't test enough variations on feature 3, so I am unsure if the number of levels of blur used is more than necessary, or if adding even more would have improved performance by a noticeable amount. However, any one of these features has so many parameters and variations that testing them all by hand would be impractical.