Exploring algorithmic reputation and governance. replabs.xyz

Subscribe to Oliver Klingefjord

I recently read a thread about how unevenly consumption is distributed on platforms like Netflix and Spotify. An overwhelming share of it comes from a few artists and producers:
On Netflix, the most watched show is sometimes 50x as popular as number 10 on the list. 0.7% of the artists on Spotify receive 90% of the payouts. 98% of games on Steam are from independent studios, while 78% of games that sold over 500 units are from top major publishers.
I think this is interesting since Netflix has been a leader in personalization research for years. Still, somehow, everyone ends up watching the same garbage.
These numbers also match my anecdotal experience. Even though excellent new art, music and movies are created, I'm appalled by the mediocrity of what is presented to me every time I log onto Netflix. How can this be?
These platforms often boast about how many views and listens stem from their shiny recommendation systems. These systems are evidently imperfect and suffer from many problems. One that my friend Joe repeatedly talks about is how the data we model from doesn't properly capture what's important to us. In this essay I'd like to explore another reason why I think new, interesting, experimental content has a hard time on these platforms: the cold start problem.
Most recommender systems, such as the one suggesting new songs for you on Spotify, use variants of a technique called collaborative filtering. In crude terms, this method consists of the following steps:
Create a profile based on my behavior.
Compare my profile with other similar profiles.
Recommend content that similar profiles like but I haven’t discovered yet.
The hypothesis is that if I like artists such as Max Cooper, Caterina Barbieri and Nicolas Jaar, and another person likes Max Cooper, Caterina Barbieri and Tim Hecker, I probably also like Tim Hecker.
All good – in this case it is true.
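The three steps above can be sketched in a few lines. This is a minimal illustration, not how Spotify or any other platform actually implements it: the `likes` data, the user names and the choice of Jaccard overlap as the similarity measure are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical listening data: user -> set of liked artists.
likes = {
    "me": {"Max Cooper", "Caterina Barbieri", "Nicolas Jaar"},
    "other": {"Max Cooper", "Caterina Barbieri", "Tim Hecker"},
}

def jaccard(a, b):
    """Overlap between two sets of liked artists, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(user, likes):
    """Score artists the user has not liked yet by how similar their fans are to the user."""
    scores = defaultdict(float)
    for other, artists in likes.items():
        if other == user:
            continue
        similarity = jaccard(likes[user], artists)  # step 2: compare profiles
        for artist in artists - likes[user]:        # step 3: only undiscovered artists
            scores[artist] += similarity
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(recommend("me", likes))  # [('Tim Hecker', 0.5)]
```

Two of the four distinct artists are shared between the profiles, so the similarity is 0.5 and Tim Hecker is the only candidate.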
However – if a new artist enters the platform with music similar to the above, why would the system choose to recommend that artist over Tim Hecker? Thousands of other users already point towards me liking Tim Hecker. Since no users listen to this new artist yet, there is no data to work from.
This is what's called the cold start problem – it's difficult to draw inferences from users or items with few interactions.
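To make this concrete, here is a small sketch under the same kind of assumptions: a similarity-weighted score, a thousand made-up users who all like Tim Hecker, and a hypothetical "New Artist" nobody has listened to yet. The names and numbers are illustrative only.

```python
def jaccard(a, b):
    """Overlap between two sets of liked artists, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def score(artist, user, likes):
    """Sum the similarity of every other user who likes `artist`."""
    return sum(
        jaccard(likes[user], liked)
        for other, liked in likes.items()
        if other != user and artist in liked
    )

# Hypothetical data: one listener plus 1000 similar users who like Tim Hecker.
likes = {"me": {"Max Cooper", "Caterina Barbieri", "Nicolas Jaar"}}
for i in range(1000):
    likes[f"user_{i}"] = {"Max Cooper", "Caterina Barbieri", "Tim Hecker"}

print(score("Tim Hecker", "me", likes))  # 500.0
print(score("New Artist", "me", likes))  # 0
```

However well the new artist's music would fit, a score built purely from interactions cannot tell it apart from anything else with no listeners.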
There are many ways to combat this. Spotify, for instance, also compares the raw audio data of new songs with songs that I like. But the distribution of content on these platforms points towards it not being enough – the algorithms tend to favor "safe bets" with lots of data to draw inferences from rather than risk disappointment with newcomers.
Traditionally, new artists could enter the world bottom-up and dynamically grow from there. By first making a name for themselves in a small scene, new artists were not crushed by the cold start problems of centralized recommenders. It seems to me that these scenes are now disappearing.
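A content-based fallback of this kind might look roughly like the sketch below. It is not Spotify's method: the three-number feature vectors, what they stand for, and the track names are invented for illustration, and cosine similarity against an averaged taste profile is just one common choice.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def taste_profile(liked_features):
    """Average the feature vectors of everything the listener already likes."""
    vectors = list(liked_features.values())
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def rank_new_tracks(liked_features, new_features):
    """Rank tracks with no listening history by similarity to the taste profile."""
    profile = taste_profile(liked_features)
    scored = [(name, cosine(profile, vec)) for name, vec in new_features.items()]
    return sorted(scored, key=lambda kv: kv[1], reverse=True)

# Made-up audio features (say tempo, brightness, noisiness), scaled to 0..1.
liked = {"Max Cooper": [0.8, 0.6, 0.3], "Nicolas Jaar": [0.6, 0.5, 0.4]}
new = {"new ambient track": [0.7, 0.55, 0.35], "new pop track": [0.1, 0.9, 0.0]}

for name, similarity in rank_new_tracks(liked, new):
    print(f"{name}: {similarity:.2f}")
```

This gives a newcomer a non-zero score without any listeners, but it only rewards music that sounds like what I already play.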
Growing up, I was part of several small niche communities (both online and offline) wherein new music was continually circulated. Recommendations did not happen algorithmically but socially.
These communities, or scenes, act as curation mechanisms. As a new artist, you did not have to amass a lot of listens before being worth something. All that was required was to make something that a few people within a scene really liked – network effects took care of the rest (if you were good).
In the platform age, scenes have disappeared behind the veil of centralized recommender systems. Behind that veil, scenes are proxied by clusters of users with similar preferences. At first glance this might make sense, but reducing scenes to "people who like the same things" is a massive error:
Scenes could be defined as containers for experimentation. When people within a scene stop experimenting, members leave for sub-scenes instead. The scene has become boring and mainstream – it's dead.
User similarity, on the other hand, is defined by the absence of experimentation. When users within a similarity cluster start experimenting, they stop being similar.
To be fair, many recommendation systems do include both exploitation (using what is already known about your preferences) and exploration (trying out new content to see if you like it). On Spotify, a wildcard song might get thrown into the recommendation mix every now and then to verify that the assumptions about your preferences are correct. But you still have to put in deliberate effort to escape mediocrity. You have to fight the algorithm, screaming "No, enough! I want something different" when bored by its suggestions. The longer the history of your past preferences is, the harder you have to fight.
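One textbook way to mix the two is an epsilon-greedy rule: exploit most of the time, and explore with a small probability. The sketch below is an assumption-laden illustration of that idea, not a description of any platform's system; `pick_next`, the track names and the 10% exploration rate are all hypothetical.

```python
import random

def pick_next(safe_bets, wildcards, epsilon, rng):
    """With probability epsilon play a wildcard (explore), otherwise the top safe bet (exploit)."""
    if wildcards and rng.random() < epsilon:
        return rng.choice(wildcards)
    return safe_bets[0]

safe_bets = ["Tim Hecker"]                    # backed by lots of interaction data
wildcards = ["New Artist A", "New Artist B"]  # little or no data yet

rng = random.Random(0)  # seeded only so repeated runs give the same sequence
plays = [pick_next(safe_bets, wildcards, epsilon=0.1, rng=rng) for _ in range(1000)]
print(sum(track in wildcards for track in plays), "wildcards out of", len(plays))
```

With a small epsilon the overwhelming majority of plays still go to the safe bet, which is why the wildcard alone does little to change what gets heard.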
Scenes, on the other hand, fall apart naturally when they become too mediocre and mainstream.
Of course, scenes still exist in the digital world. On less centralized platforms such as Reddit, new sub-communities continuously form and fall apart.
Before continuing, it might be worthwhile to take a step back and ask – what exactly is a scene?
I'd argue scenes exist only insofar as the people within them deem them to exist. Someone is part of a scene if other people within that scene deem them to be a part of it. They are mutually verified clusters of people who share some value, idea or mode of expression. Scenes can thus be modeled as a sort of reputation graph.
Digital platforms have various ways of measuring reputation. Reddit, for instance, has Karma, which "...is a reflection of how much your contributions mean to the community". On Twitter, reputation is expressed through likes and follows.
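As a rough sketch of what "mutually verified clusters" could mean in a graph, the code below keeps only the edges where two people vouch for each other and treats each connected group as a scene. This is one possible reading, not a worked-out model: the `vouches` data, the names and the connected-component rule are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical data: who vouches for whom as being part of a scene.
vouches = {
    "ana": {"ben"},
    "ben": {"ana", "cy"},
    "cy": {"ben"},
    "dee": {"ana"},  # one-way: nobody vouches back for dee
    "eli": {"fay"},
    "fay": {"eli"},
}

def mutual_neighbours(vouches):
    """Keep only the links where both people vouch for each other."""
    neighbours = defaultdict(set)
    for person, targets in vouches.items():
        for target in targets:
            if target != person and person in vouches.get(target, set()):
                neighbours[person].add(target)
                neighbours[target].add(person)
    return neighbours

def find_scenes(vouches):
    """Group people into connected clusters of mutual vouching."""
    neighbours = mutual_neighbours(vouches)
    seen, scenes = set(), []
    for start in neighbours:
        if start in seen:
            continue
        scene, stack = set(), [start]
        while stack:
            person = stack.pop()
            if person in scene:
                continue
            scene.add(person)
            stack.extend(neighbours[person] - scene)
        seen |= scene
        scenes.append(scene)
    return scenes

print(find_scenes(vouches))
```

Here ana, ben and cy form one scene, eli and fay another, and dee belongs to neither until someone inside vouches back.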
However – without the context of a scene, these crude indicators mean very little. There are friends whose judgement I trust when it comes to new electronic music, public intellectuals whose work about Russian geopolitics I find interesting and comedians who I think are funny. All of this gets collapsed into likes and retweets.
Of course, an excellent chef is not automatically an excellent musician, and a popular actor is not automatically an expert in climate change. Reputation is not directly portable between domains and scenes – but our current way of modeling information assumes so, and I think many ills are caused by it.
People who are already adept at navigating our new information environment get this. They carefully curate the people they follow and the substacks they read, and they actively seek out new artists and music on more niche platforms such as NTS and Filmaffinity. They are aware of the scenes they partake in even when their recommenders are not, and they consciously leave them when they cease to be interesting.
I don't think we can, or would want to, go back to a world without recommendation algorithms. There's simply too much information uploaded to the internet to sift through otherwise.
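The difference between collapsed and scene-aware reputation can be shown with a toy data structure. Everything here is hypothetical: the endorsement tuples, the names and the domain labels are invented, and counting endorsements is the crudest possible stand-in for reputation.

```python
from collections import Counter

# Hypothetical endorsements: (endorser, endorsee, domain).
endorsements = [
    ("fan_1", "popular actor", "acting"),
    ("fan_2", "popular actor", "acting"),
    ("fan_3", "popular actor", "acting"),
    ("me", "climate scientist", "climate change"),
    ("me", "friend", "electronic music"),
]

def global_reputation(endorsements):
    """Collapse everything into one number per person, like a raw like count."""
    return Counter(endorsee for _, endorsee, _ in endorsements)

def scoped_reputation(endorsements, domain):
    """Count only the endorsements given within a single domain."""
    return Counter(endorsee for _, endorsee, d in endorsements if d == domain)

print(global_reputation(endorsements).most_common(1))
print(scoped_reputation(endorsements, "climate change").most_common(1))
```

The global count puts the actor on top for every topic, while the scoped count gives the actor no standing on climate change at all.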
I do think, however, that we could drastically change how these systems work, hopefully resulting in a cultural landscape that is far more interesting. A step towards that would be a more explicit and granular modeling of reputation and scenes.
Also, I think the issue at hand is larger than just boringness. Recommendation engines don't just shape our taste and preferences but also our broader understanding of the world.
But that is a topic for another day…