One thing I noticed when looking at older releases is that new errors are sometimes introduced into them. My scripts detect those errors just fine, but what they do not tell me is how often this happens, so I decided to find out.
So I grabbed the dumps released in May and June and simply counted how many errors in old releases were in the later dump but not in the earlier one. As it turned out, quite a few: 1692 errors in 1046 unique releases, with only about two weeks between the two dump files. Extrapolating a bit, that means that each month probably around 3000 new errors are introduced in about 2000 older releases (and most of these errors are preventable).
Even when ignoring Artist and Tracklisting errors, there are still about 630 errors left in 474 releases.
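The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the actual scripts: it assumes each dump has already been scanned into a set of (release id, error type) pairs, and the function names, field names and sample data are all made up for the example.

```python
def new_errors(earlier_errors, later_errors, earlier_release_ids,
               ignore_fields=frozenset()):
    """Return the (release_id, field) pairs flagged only in the later scan.

    Only releases that already existed in the earlier dump count as "old";
    errors in the fields listed in ignore_fields are skipped.
    """
    return {
        (release_id, field)
        for release_id, field in later_errors - earlier_errors
        if release_id in earlier_release_ids and field not in ignore_fields
    }


def summarize(errors):
    """Return (number of errors, number of unique releases)."""
    return len(errors), len({release_id for release_id, _ in errors})


# Hypothetical scan results for two dumps (not real Discogs data).
earlier_ids = {1, 2, 3}
earlier = {(1, "Artist")}
later = {(1, "Artist"), (2, "Tracklisting"), (2, "Label"),
         (3, "Format"), (4, "Label")}  # release 4 is new, so it is skipped

all_new = new_errors(earlier, later, earlier_ids)
filtered = new_errors(earlier, later, earlier_ids,
                      ignore_fields={"Artist", "Tracklisting"})

print(summarize(all_new))
print(summarize(filtered))
```

Using plain sets keeps the comparison to a single set difference, at the cost of holding both scan results in memory at once.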
What is clear is that this time it is mostly the more recent releases that were changed. It also means that cleanups cannot be done in a "one-off" fashion: the data will need constant monitoring and fixing. As long as Discogs makes it easy to introduce errors, people will take advantage of that. The only fix, as I have argued before, is for Discogs to make it more difficult for people to make those errors in the first place.
Old releases in which errors were introduced in the second half of May 2018