When looking at a music release in general (not just in Discogs), it can be quite confusing to work out which company did what, as quite a few can be listed on a single release. There is the label, the production company, the rights owner, and perhaps others such as a printing company.
Then there are companies that use very similar names, which at different points in time might have meant something else (as companies are renamed, sold, repurposed, and so on). This is guaranteed to lead to errors. Recently I bumped into one example, and that triggered me to look into it a bit more to see how many errors I could find.
The company in question is London, which is associated with the label called London Records. The Discogs page for London (the company) says: "For the record label, please see London Records.", which amounts to "don't use this as the label".
So how often does this happen in the data? I grabbed the latest data dump (released January 2018), adapted my scripts to look for this particular smell, and then processed the data dump. The answer: 62 times. It is interesting to see how these are distributed over the data:
The error appears most often in new entries, which seems to suggest that it gets detected and fixed over time in older ones. In the near future I will look at a few more labels and see if I can find similar patterns.
Figure: Distribution of wrong use of "London" as a label in the Discogs data
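For readers who want to try a similar check, here is a minimal sketch of how such a scan could look. It is not the script used for the numbers above. It assumes the releases dump is XML with `release` elements carrying an `id` attribute and nested `label` elements carrying a `name` attribute; the function names, the bucket size, and the idea of grouping hits by release id are my own choices for illustration.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def find_label_uses(xml_stream, label_name="London"):
    """Return the ids of releases that list exactly `label_name` as a label.

    Assumes <release id="..."> elements containing <label name="..."/>.
    """
    hits = []
    # iterparse streams the file, so the whole dump never sits in memory.
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if elem.tag != "release":
            continue
        names = {label.get("name") for label in elem.iter("label")}
        if label_name in names:
            hits.append(elem.get("id"))
        elem.clear()  # drop the children of a release once it is processed
    return hits


def bucket_counts(release_ids, size=1_000_000):
    """Count hits per block of `size` consecutive release ids."""
    return Counter(int(rid) // size for rid in release_ids)


# Tiny made-up sample, only to show the expected input shape.
SAMPLE = b"""<releases>
  <release id="1"><labels><label name="London" catno="A1"/></labels></release>
  <release id="2"><labels><label name="London Records" catno="B2"/></labels></release>
  <release id="1500000"><labels><label name="London" catno="C3"/></labels></release>
</releases>"""

hits = find_label_uses(io.BytesIO(SAMPLE))
print(len(hits), dict(bucket_counts(hits)))
```

For a real dump, the stream could be opened with `gzip.open(path, "rb")`, where `path` points at the downloaded releases file. Grouping by release id is only a rough proxy for the age of an entry, since higher ids were added to the database later.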