A new month means a new dump file is available for me to analyse.
If this is the first time you are seeing one of these posts, I highly
recommend reading the posts from previous months first.
What I hadn't yet looked at is where the edits are happening: is it mostly older releases that get fixed, or newer ones? I looked at the release numbers of the releases that were changed and plotted them in a bar chart.
The chart shows that the changes are spread fairly evenly across the data, although significantly more older releases are being changed. I had expected those to be more stable. On the other hand, quite a few new fields were introduced in the last year or so, so it could be that the older releases are "catching up" with the new standards.
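To illustrate how the data for such a chart can be prepared, here is a minimal sketch that buckets the numbers of the changed releases into fixed-width bins. The function name `bucket_release_ids`, the bin width of one million and the sample ids are assumptions made for illustration; this is not the script used for this post.

```python
from collections import Counter


def bucket_release_ids(release_ids, bin_width=1_000_000):
    """Count release ids per bin of `bin_width` consecutive numbers.

    Returns a dict that maps the lower bound of each bin to the number
    of ids falling into that bin, ordered by lower bound.
    """
    counts = Counter((rid // bin_width) * bin_width for rid in release_ids)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Hypothetical ids of changed releases, for illustration only.
    changed = [12, 450_000, 1_200_345, 1_999_999, 9_500_000]
    width = 1_000_000
    for lower, count in bucket_release_ids(changed, width).items():
        print(f"{lower:>10,} to {lower + width - 1:>10,}: {count}")
```

The resulting counts can be handed to any plotting library to draw the bars; lower release numbers correspond to older database entries.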
Release statistics
The latest dump ("the April dump") covers the period from March 1 to March 31 (inclusive). The previous dump had 9,554,069 releases, the new dump has 9,680,263 releases. That means 126,194 more releases in the database.
- 8,962,136 releases stayed the same
- 589,131 releases were changed
- 128,996 releases were added
- 2,802 releases were removed from the database
- 247 releases had status Draft, Deleted or Rejected
- 11 releases that were not Accepted were in both dumps
- 0 releases were moved from Draft to Accepted
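The first four categories above boil down to set arithmetic on release ids. A minimal sketch, assuming each dump has already been reduced to a mapping from release id to some fingerprint of the release data (a hash, for example). The function `compare_dumps` and that representation are hypothetical and not the scripts used for this post; the status-based categories (Draft, Deleted, Rejected) are left out.

```python
def compare_dumps(old, new):
    """Classify releases by comparing two {release_id: fingerprint} mappings."""
    old_ids, new_ids = set(old), set(new)
    common = old_ids & new_ids
    # A release counts as changed when it is in both dumps but its
    # fingerprint differs.
    changed = {rid for rid in common if old[rid] != new[rid]}
    return {
        "unchanged": len(common) - len(changed),
        "changed": len(changed),
        "added": len(new_ids - old_ids),
        "removed": len(old_ids - new_ids),
    }


# Sanity check with the figures quoted in this post: unchanged and changed
# releases plus the removed ones give the previous total, and plus the
# added ones give the new total.
assert 8_962_136 + 589_131 + 2_802 == 9_554_069
assert 8_962_136 + 589_131 + 128_996 == 9_680_263
assert 128_996 - 2_802 == 126_194
```

Called as `compare_dumps({1: "a", 2: "b", 3: "c"}, {1: "a", 2: "x", 4: "d"})`, it reports one release in each of the four categories.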
Bar chart: releases in Discogs that were changed in March 2018
Smells
The number of releases with smells that were added in March 2018: 1,862 releases, with more than 8,000 errors. This is a bit more than in previous months, but my scripts are also catching more errors. It should be noted that tracklisting errors were ignored.
For the rest it is very dull: around 90 depósito legal errors, a bit more than 100 SPARS code issues, more than 1,200 label code errors, some 200 rights society errors, and so on.
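To give an idea of what a check for such smells can look like, here is a minimal sketch for two of the categories. Everything in it is an assumption made for illustration: the identifier type names, the function `find_smells`, and both patterns (a SPARS code taken to be three letters A or D ending in D, a label code taken to be `LC` followed by 4 or 5 digits). It is not the set of checks my scripts actually run.

```python
import re

# Assumption: a SPARS code is three letters, each A (analogue) or D
# (digital), with D as the last letter for a digital carrier.
SPARS_RE = re.compile(r"[AD][AD]D")

# Assumption: "LC" followed by 4 or 5 digits, with an optional space
# or hyphen in between. Adjust if this does not match the real format.
LABEL_CODE_RE = re.compile(r"LC[ -]?\d{4,5}")

# Hypothetical identifier type names mapped to their patterns.
PATTERNS = {"SPARS Code": SPARS_RE, "Label Code": LABEL_CODE_RE}


def find_smells(identifiers):
    """Return the (type, value) pairs whose value fails its format check.

    `identifiers` is an iterable of (type, value) pairs. Types without
    a known pattern are skipped rather than reported.
    """
    smells = []
    for id_type, value in identifiers:
        pattern = PATTERNS.get(id_type)
        if pattern is not None and not pattern.fullmatch(value.strip()):
            smells.append((id_type, value))
    return smells


if __name__ == "__main__":
    sample = [
        ("SPARS Code", "DDD"),
        ("SPARS Code", "DDA"),
        ("Label Code", "LC 0149"),
        ("Label Code", "LC 149"),
    ]
    for id_type, value in find_smells(sample):
        print(f"{id_type}: suspicious value {value!r}")
```

Checks of this kind are cheap to run at submission time, which is why errors of this sort look preventable.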
It all looks very normal, and that is what frustrates me, as most of these errors are completely preventable. It makes me wonder whether Discogs is really interested in having these errors fixed in a structural way, or whether "outsourcing" the fixes to contributors is its preferred approach.