A new dumpfile was uploaded by Discogs, so I launched my scripts again to see what happened in Discogs in October 2017.
Release statistics
The new dumpfile ("the November dump") was released on November 4 2017 and has 9,107,428 releases. The previous dump ("the October dump") was released on October 4 2017 and had 8,996,419 releases. That means 111,009 more releases in the database. Of these:
- 8,455,978 releases stayed the same
- 537,679 releases were changed
- 113,771 releases were added
- 2,762 releases were removed
- 222 releases had the status Draft, Deleted or Rejected set
- 11 releases that were not Accepted were present in both the October and November data dumps
- 1 release moved from Draft to Accepted
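A breakdown like the one above can be sketched as a set comparison between two dumps. This is only a minimal illustration, not the actual scripts: it assumes each dump has already been reduced to a mapping from release id to some fingerprint of the release content, and the name `classify_releases` and the toy data are hypothetical. The last lines check that the figures reported in this post add up.

```python
def classify_releases(old, new):
    """Classify release ids by comparing two {release_id: fingerprint} mappings."""
    old_ids, new_ids = set(old), set(new)
    common = old_ids & new_ids
    # A release counts as changed when its fingerprint differs between dumps.
    changed = {rid for rid in common if old[rid] != new[rid]}
    return {
        "same": len(common) - len(changed),
        "changed": len(changed),
        "added": len(new_ids - old_ids),
        "removed": len(old_ids - new_ids),
    }

# Toy data (hypothetical): release 2 was edited, 3 was removed, 4 was added.
october = {1: "a", 2: "b", 3: "c"}
november = {1: "a", 2: "B", 4: "d"}
counts = classify_releases(october, november)

# Consistency check on the numbers reported above.
same, changed, added, removed = 8_455_978, 537_679, 113_771, 2_762
assert same + changed + added == 9_107_428    # size of the November dump
assert same + changed + removed == 8_996_419  # size of the October dump
```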
October 2017 saw fewer new releases added than September 2017, but significantly more edits of existing releases (an 11% increase). In total, 651,450 releases were edited (537,679 changed plus 113,771 added). There were 44,640 minutes in October 2017, so there were a bit over 14.5 edits (new or changed releases) per minute, compared to 12.5 edits per minute in September 2017.
This means that there was one edit every 4 seconds, compared to one edit every 5 seconds in September 2017.
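The arithmetic behind these rates, written out as a short sketch using only the numbers from this post (the variable names are just for illustration):

```python
changed = 537_679   # releases changed between the two dumps
added = 113_771     # releases added between the two dumps

edits = changed + added            # every new or changed release counts as one edit
minutes = 31 * 24 * 60             # minutes in October (31 days)
per_minute = edits / minutes       # a bit over 14.5
seconds_per_edit = 60 / per_minute # roughly 4 seconds

print(f"{edits:,} edits in {minutes:,} minutes")
print(f"{per_minute:.1f} edits per minute, one every {seconds_per_edit:.1f} seconds")
```

Note that this counts each release at most once, so a release edited several times during the month still counts as a single edit.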
In October 2017 there was one edit in the Discogs catalogue every four seconds
Releases with known smells
To make a fair comparison I first ran the scripts on the previous data dump, as I had updated some of the scripts and more smells were found.
In October 212,772 smells were found, compared to 220,387 in September 2017, a drop of 7,615. These numbers are not exact though: there are some known issues with finding the Depósito Legal field in the Notes section, because candidate values are processed and reported before the scripts check whether the notes actually contain a Depósito Legal section. This will be improved in a future version of the scripts.
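One way the ordering problem could be avoided is to check for the section first and only then look for values. The sketch below is not the code from my scripts: the function name and both regular expressions are hypothetical and deliberately loose, and real Depósito Legal values come in more shapes than this pattern covers.

```python
import re

# Hypothetical marker: "Depósito Legal", "Deposito Legal" or "D.L."
MARKER = re.compile(r"dep[oó]sito\s+legal|\bd\.\s?l\.", re.IGNORECASE)
# Hypothetical, loose value pattern such as "M-12345-1987"
VALUE = re.compile(r"\b[A-Z]{1,2}[-. ]\s?\d[\d.]*-\d{2,4}\b")

def deposito_legal_in_notes(notes):
    """Return candidate values, but only if the notes mention a Depósito Legal."""
    if not MARKER.search(notes):
        # No Depósito Legal section at all: report nothing.
        return []
    return VALUE.findall(notes)

with_section = deposito_legal_in_notes("Depósito Legal: M-12345-1987")
without_section = deposito_legal_in_notes("Printed code M-12345-1987 on spine")
```

With the marker check first, a value-like string in notes that never mention a Depósito Legal is no longer reported.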
In October, 1,022 entries were added with a known smell (where my scripts found something), which is not too bad. Most of these seem to come from people not understanding what a Label Code is, people copying old entries as templates and changing fields as needed without fixing the wrong ones, or people following old habits.
So to answer last month's question: the quality of releases is increasing, but we still have a very long way to go. Let's see in a month what will have happened in November 2017!