
Using image processing to automatically detect labels (part 4)

This is the last blog post (at least for now) about using image processing to automatically detect labels of 7" singles. I advise reading part 1, part 2 and part 3 first.

In my previous blog post I looked at histogram comparisons and for my test image that worked really well. In this post I am going to look at how well it worked on some other images.

The first label image I tried was a Wham! promo single with an (almost) white label, surrounded by a perfect white circle (either a very good scanner, or the uploader cut the label out of the scan), so I was not expecting much. The masked versions look like this:




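For reference, this is roughly how such masks can be built. A minimal NumPy sketch (the image size, label radius and hole radius below are made-up example values, not measurements from the actual scans):

```python
import numpy as np

def circular_mask(shape, center, radius):
    """Boolean mask that is True inside a circle with the given
    center (row, col) and radius."""
    yy, xx = np.ogrid[:shape[0], :shape[1]]
    return (yy - center[0]) ** 2 + (xx - center[1]) ** 2 <= radius ** 2

# Hypothetical 400x400 scan: the label area is the label disc
# minus the center hole; the "rest" is everything outside the label.
h = w = 400
label = circular_mask((h, w), (h // 2, w // 2), 150)
hole = circular_mask((h, w), (h // 2, w // 2), 15)
label_only = label & ~hole
rest = ~label
```

Comparing the histograms of the label part and the rest of the image is then just a matter of selecting the pixels under each mask; shrinking the radius or shifting the center (as I do later in this post) only changes the two `circular_mask` calls.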
I was expecting quite a bad result here, but the comparison for the full image is:
  1. correlation: 0.07351815513756556
  2. chi squared: 15.131605789349916
  3. intersection: 0.08877012991160882
  4. Bhattacharyya distance: 0.8772594089938355
That is even better than with my test image, which puzzles me to be honest.

The next image I tested with is the Dutch pressing of Queen's Tie Your Mother Down. The masked versions look like this:



You can see the rim of the vinyl record and a tiny bit of the label, and also that in this case the center hole mask does not cover the center hole perfectly. The histogram comparison looks like this:
  1. correlation: 0.0003169877412830062
  2. chi squared: 704.6096364590801
  3. intersection: 0.09703724197242991
  4. Bhattacharyya distance: 0.9591737531004111
Looking at the value of the correlation (remember: lower means more different), that is what I would call a success!

Next test case: the A-side of a Spanish single where the image is not perfectly cropped and not perfectly centered.

The masked versions look like this:



and the histogram comparison results:
  1. correlation: 0.3220106911431574
  2. chi squared: 23.676490935171586
  3. intersection: 1.2247211879621318
  4. Bhattacharyya distance: 0.8174380098878803
which is not as good as expected, especially when looking at the per-channel correlations: 0.00172 for the red channel, -0.00046 for the green channel, but 0.35150 for the blue channel. The only explanation I can think of is that because the label is a bit smaller than in the other images (due to different cropping and scaling), the label mask also includes some black vinyl pixels in the label part of the image, and this has a huge influence. The center hole is also not completely centered.
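Per-channel numbers like these can be obtained by building one histogram per colour channel and comparing those separately. A hedged sketch (`channel_correlation` is a hypothetical helper written for this post, using plain NumPy instead of `cv2.calcHist`):

```python
import numpy as np

def channel_correlation(img1, img2, channel, bins=256):
    """Correlation of one colour channel's histogram between two
    (masked) image regions; img1/img2 are H x W x 3 uint8 arrays."""
    h1, _ = np.histogram(img1[..., channel], bins=bins, range=(0, 256))
    h2, _ = np.histogram(img2[..., channel], bins=bins, range=(0, 256))
    d1 = h1 - h1.mean()
    d2 = h2 - h2.mean()
    return (d1 * d2).sum() / np.sqrt((d1 ** 2).sum() * (d2 ** 2).sum())
```

A channel dominated by mask spill-over (like the black vinyl pixels here) will drag its correlation up even when the other channels look fine, which is exactly the pattern above.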

I tried a slightly smaller radius for the label mask and shifting the center hole a bit. The masks now look like this:



and as you can see very few of the black vinyl pixels are now included in the label image. The histogram comparisons are drastically different:

  1. correlation: 0.0006212548064740618
  2. chi squared: 8255.232299202899
  3. intersection: 0.07737282696507464
  4. Bhattacharyya distance: 0.9615555333658465
This means that for optimal results I need to know:
  • what the diameter of the label is
  • where the center of the center hole is
It also means that instead of being able to flag a picture as "label" or "not label", the best I can do without extra preprocessing (for example: edge detection) is to flag it as "label" or "don't know". But that's still a good enough result for me. And it means I will need to dive deeper into this subject, which is not a bad thing either.
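A full edge detection pass is one option, but even a crude estimate can go a long way. The sketch below guesses the center and the outer radius of the record by throwing away near-white background pixels and fitting the rest; the function name and the threshold of 240 are assumptions for illustration, not something from my actual pipeline. The label radius could then be derived from the outer radius, since labels on 7" singles have a fairly standard diameter relative to the record:

```python
import numpy as np

def estimate_label_circle(img, white_threshold=240):
    """Rough estimate of the record's center and outer radius: treat
    near-white pixels as scanner background, then take the centroid
    and the maximum distance of the remaining pixels."""
    gray = img.mean(axis=2)
    ys, xs = np.nonzero(gray < white_threshold)
    cy, cx = ys.mean(), xs.mean()
    radius = np.sqrt((ys - cy) ** 2 + (xs - cx) ** 2).max()
    return (cy, cx), radius
```

This only works on reasonably clean scans with a light background; dark clutter near the edges would pull the estimate off, which is where proper edge detection would earn its keep.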

I ran some more tests, for example with the A-side label of this Bette Midler release, with bad results because black vinyl pixels end up in the masked label version (you can actually see the dust on the record):



  1. correlation: 0.7543407548880438
  2. chi squared: 4.216618476033613
  3. intersection: 0.9175061059013387
  4. Bhattacharyya distance: 0.7430200503113149
Playing a bit with the radius of the mask gives much better results, confirming my suspicions:



and the histogram comparison values:

  1. correlation: 0.047177802369014636
  2. chi squared: 26.450314478943397
  3. intersection: 0.13217700347286154
  4. Bhattacharyya distance: 0.9329150658094154
A Fleetwood Mac promo single:



with very good results:
  1. correlation: 0.013033629117424022
  2. chi squared: 633.6883680148159
  3. intersection: 0.08043640897267323
  4. Bhattacharyya distance: 0.9225785722847764
and I could go on and on. I am very happy with these results.

So, to wrap up: using histograms as I described is a very effective way to find which images from Discogs potentially contain labels, but only if the images are properly cropped and centered. As soon as this is not the case the quality of the results rapidly drops. There are methods to prevent this from happening, but those can possibly also be used to detect labels on their own. Definitely to be continued in a few months!
