What I like about working with the Discogs data is making visible what is otherwise hidden. In an earlier post I mentioned my suspicion that the Discogs contributor ranking follows the 80/20 rule, but I didn't have enough data yet to confirm it.
I crawled more data from Discogs (very slowly: Discogs doesn't make it easy with its anti-crawling measures, so I crawled from multiple locations over quite a few hours) and reran the scripts I wrote to crunch the numbers and see what share of the top contributors holds 80% of the accumulated points in Discogs. Looking at only the top 1,000 contributors, 60% of them accounted for about 80% of the points.
The more data I got, the more this moved towards 20%, and it quickly became clear that Discogs indeed seems to follow the 80/20 rule: looking at the top 36,000 contributors, 80% of the accumulated points belong to the top 21.3% of contributors. For the top 38,000 this was down to 20.6%. For the top 45,000 it dropped further to just 18.7%, and the top 20% account for a bit over 81% of the points.
I stopped after processing the top 50,000, where 80% of the points were owned by the top 17.6% of contributors and the top 20% collectively held a bit over 82% of the points.
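For readers who want to try this on their own data, here is a minimal sketch of the two calculations described above. It is not the script I actually ran; the function names and the plain list of point totals are assumptions made for illustration.

```python
def contributor_share_for(points, target=0.80):
    """Fraction of contributors (ranked by points, highest first)
    needed to accumulate `target` of all points."""
    ranked = sorted(points, reverse=True)
    total = sum(ranked)
    if total <= 0:
        raise ValueError("points must sum to a positive number")
    running = 0
    for count, value in enumerate(ranked, start=1):
        running += value
        if running >= target * total:
            return count / len(ranked)
    return 1.0


def point_share_of_top(points, fraction=0.20):
    """Fraction of all points held by the top `fraction` of contributors."""
    ranked = sorted(points, reverse=True)
    total = sum(ranked)
    if total <= 0:
        raise ValueError("points must sum to a positive number")
    top_count = int(len(ranked) * fraction)
    return sum(ranked[:top_count]) / total


# Hypothetical point totals for five contributors (not real Discogs data).
sample = [60, 25, 10, 3, 2]
share_needed = contributor_share_for(sample)  # top 2 of 5 hold 85 of 100 points
top_fifth = point_share_of_top(sample)        # the top 1 of 5 holds 60 of 100 points
```

Running both functions on ever larger slices of the ranking (top 1,000, top 36,000 and so on) gives the kind of progression described above.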
I then also calculated the percentages per quintile:
- First 20%: 82.036%
- Second 20%: 9.246%
- Third 20%: 4.328%
- Fourth 20%: 2.610%
- Fifth 20%: 1.780%
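The quintile split can be sketched in the same way. Again, this is an illustration under assumptions (a hypothetical `quintile_shares` helper working on a plain list of point totals), not the original script.

```python
def quintile_shares(points):
    """Percentage of all points held by each fifth of the contributors,
    ranked from the highest to the lowest point totals."""
    ranked = sorted(points, reverse=True)
    total = sum(ranked)
    if total <= 0:
        raise ValueError("points must sum to a positive number")
    n = len(ranked)
    shares = []
    for q in range(5):
        # Integer boundaries so that every contributor lands in exactly one quintile.
        start = q * n // 5
        end = (q + 1) * n // 5
        shares.append(100 * sum(ranked[start:end]) / total)
    return shares


# Hypothetical point totals for ten contributors (not real Discogs data).
sample = [50, 20, 10, 8, 5, 3, 2, 1, 1, 0]
shares = quintile_shares(sample)
```

The five percentages always add up to 100, which is a quick sanity check on the numbers listed above.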
So it is very clear: the contributor points in Discogs are distributed according to the 80/20 rule.
Looking at the profiles of the people at the top, it becomes clear that they are either people who have been around on Discogs for a long time and are very familiar with how it works (quirks and all), or people who are newer but very dedicated.
If I were Discogs, I would be wondering how I could lift the other 80% out of "point poverty" and enable them to engage more with the website.