
Using image processing to automatically detect labels (part 4)

This is the last blog post (at least for now) about using image processing to automatically detect labels of 7" singles. I advise reading part 1, part 2 and part 3 first.

In my previous blog post I looked at histogram comparisons and for my test image that worked really well. In this post I am going to look at how well it worked on some other images.

The first label image I tried was a Wham! promo single with an (almost) white label, surrounded by a perfect white circle (either a very good scanner, or the uploader cut the label out of the scan), so I was not expecting much. The masked versions look like this:




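For reference, this is roughly how such masks can be built. A minimal NumPy sketch (the image size, label radius and hole radius below are made-up example values, not measurements from the actual scans):

```python
import numpy as np

def circular_mask(shape, center, radius):
    """Boolean mask that is True inside a circle with the given
    center (row, col) and radius."""
    yy, xx = np.ogrid[:shape[0], :shape[1]]
    return (yy - center[0]) ** 2 + (xx - center[1]) ** 2 <= radius ** 2

# Hypothetical 400x400 scan: the label area is the label disc
# minus the center hole; the "rest" is everything outside the label.
h = w = 400
label = circular_mask((h, w), (h // 2, w // 2), 150)
hole = circular_mask((h, w), (h // 2, w // 2), 15)
label_only = label & ~hole
rest = ~label
```

Comparing the histograms of the label part and the rest of the image is then just a matter of selecting the pixels under each mask; shrinking the radius or shifting the center (as I do later in this post) only changes the two `circular_mask` calls.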
I was expecting quite a bad result here, but the comparison for the full image is:
  1. correlation: 0.07351815513756556
  2. chi squared: 15.131605789349916
  3. intersection: 0.08877012991160882
  4. Bhattacharyya distance: 0.8772594089938355
That is even better than with my test image, which puzzles me to be honest.

The next image I tested with is the Dutch pressing of Queen's Tie Your Mother Down. The masked versions look like this:



You can see the rim of the vinyl record and a tiny bit of the label, and also that in this case the center hole mask does not cover the center hole perfectly. The histogram comparison looks like this:
  1. correlation: 0.0003169877412830062
  2. chi squared: 704.6096364590801
  3. intersection: 0.09703724197242991
  4. Bhattacharyya distance: 0.9591737531004111
Looking at the value of the correlation (remember: lower means more different), that is what I would call a success!

Next test case: the A-side of a Spanish single where the image is not perfectly cropped and not perfectly centered.

The masked versions look like this:



and the histogram comparison results:
  1. correlation: 0.3220106911431574
  2. chi squared: 23.676490935171586
  3. intersection: 1.2247211879621318
  4. Bhattacharyya distance: 0.8174380098878803
which is not as good as expected, especially when looking at the per-channel correlations: 0.00172 for the red channel, -0.00046 for the green channel, but 0.35150 for the blue channel. The only explanation I can think of is that because the label is a bit smaller than in the other images (due to different cropping and scaling), the label mask also includes some black vinyl pixels in the label part of the image, and this has a huge influence. The center hole is also not completely centered.
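Per-channel numbers like these can be obtained by building one histogram per colour channel and comparing those separately. A hedged sketch (`channel_correlation` is a hypothetical helper written for this post, using plain NumPy instead of `cv2.calcHist`):

```python
import numpy as np

def channel_correlation(img1, img2, channel, bins=256):
    """Correlation of one colour channel's histogram between two
    (masked) image regions; img1/img2 are H x W x 3 uint8 arrays."""
    h1, _ = np.histogram(img1[..., channel], bins=bins, range=(0, 256))
    h2, _ = np.histogram(img2[..., channel], bins=bins, range=(0, 256))
    d1 = h1 - h1.mean()
    d2 = h2 - h2.mean()
    return (d1 * d2).sum() / np.sqrt((d1 ** 2).sum() * (d2 ** 2).sum())
```

A channel dominated by mask spill-over (like the black vinyl pixels here) will drag its correlation up even when the other channels look fine, which is exactly the pattern above.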

I tried a slightly smaller radius for the label mask and shifting the center hole a bit. The masks now look like this:



and as you can see very few of the black vinyl pixels are now included in the label image. The histogram comparisons are drastically different:

  1. correlation: 0.0006212548064740618
  2. chi squared: 8255.232299202899
  3. intersection: 0.07737282696507464
  4. Bhattacharyya distance: 0.9615555333658465
This means that for optimal results I need to know:
  • what the diameter of the label is
  • where the center of the center hole is
It also means that instead of being able to flag a picture as "label" or "not label", the best I can do without extra preprocessing (for example: edge detection) is to flag it as "label" or "don't know". But that's still a good enough result for me. And it means I will need to dive deeper into this subject, which is not a bad thing either.
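A full edge detection pass is one option, but even a crude estimate can go a long way. The sketch below guesses the center and the outer radius of the record by throwing away near-white background pixels and fitting the rest; the function name and the threshold of 240 are assumptions for illustration, not something from my actual pipeline. The label radius could then be derived from the outer radius, since labels on 7" singles have a fairly standard diameter relative to the record:

```python
import numpy as np

def estimate_label_circle(img, white_threshold=240):
    """Rough estimate of the record's center and outer radius: treat
    near-white pixels as scanner background, then take the centroid
    and the maximum distance of the remaining pixels."""
    gray = img.mean(axis=2)
    ys, xs = np.nonzero(gray < white_threshold)
    cy, cx = ys.mean(), xs.mean()
    radius = np.sqrt((ys - cy) ** 2 + (xs - cx) ** 2).max()
    return (cy, cx), radius
```

This only works on reasonably clean scans with a light background; dark clutter near the edges would pull the estimate off, which is where proper edge detection would earn its keep.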

I ran some more tests, for example with the A-side label of this Bette Midler release, with bad results because black vinyl pixels end up in the masked label version (you can actually see the dust on the record):



  1. correlation: 0.7543407548880438
  2. chi squared: 4.216618476033613
  3. intersection: 0.9175061059013387
  4. Bhattacharyya distance: 0.7430200503113149
Playing a bit with the radius of the mask gives much better results, confirming my suspicions:



and the histogram comparison values:

  1. correlation: 0.047177802369014636
  2. chi squared: 26.450314478943397
  3. intersection: 0.13217700347286154
  4. Bhattacharyya distance: 0.9329150658094154
A Fleetwood Mac promo single:



with very good results:
  1. correlation: 0.013033629117424022
  2. chi squared: 633.6883680148159
  3. intersection: 0.08043640897267323
  4. Bhattacharyya distance: 0.9225785722847764
and I could go on and on. I am very happy with these results.

So, to wrap up: using histograms as I described is a very effective way to find which images from Discogs potentially contain labels, but only if the images are properly cropped and centered. As soon as this is not the case the quality of the results rapidly drops. There are methods to prevent this from happening, but those can possibly also be used to detect labels on their own. Definitely to be continued in a few months!
