Twitter reveals part of the source code, including the recommendation algorithm

by Ana Lopez

As Twitter CEO Elon Musk repeatedly promised, Twitter has opened part of its source code to public scrutiny, including the algorithm used to recommend tweets in users’ timelines.

Twitter has published two repositories on GitHub containing code for many of the parts that make the social network work, including the mechanism Twitter uses to manage the tweets users see on the For You timeline. In a blog post, Twitter characterized the move as a “first step to be[ing] more transparent” while at the same time “[preventing] risk to Twitter itself and people on the platform.”

During a Twitter Spaces session today, Musk clarified:

“Our first release of the so-called algorithm will be quite embarrassing, and people will find a lot of mistakes, but we’re going to fix them very soon,” Musk said. “Even if you don’t agree with something, at least you know why it’s there, and that you’re not being secretly manipulated… The analog here we’re aiming for is the great example of Linux as an open source operating system… One can discover many exploits for Linux in theory. What actually happens is that the community identifies and fixes those exploits.”

On that second point about avoiding risk, the open source releases don’t include the code that powers Twitter’s ad recommendations or the data used to train Twitter’s recommendation algorithm. In addition, they contain few instructions on how to inspect or actually use the code, reinforcing the idea that the releases are aimed strictly at developers.

“[We excluded] any code that would compromise user security and privacy or the ability to protect our platform from bad actors, including undermining our efforts to combat child sexual exploitation and manipulation,” Twitter wrote. That is a somewhat mixed message, coming weeks after Twitter fired many of its ethical AI and trust and safety staff, who were responsible for moderating content in addition to other duties related to user security. But the company nevertheless insists that it “[took] steps to ensure user security and privacy are protected” with the code release.

A diagram showing how Twitter’s recommendation pipeline works. Image credit: Twitter

Twitter says it is working on tools to manage code suggestions from the community and sync changes to its internal repository. Presumably those will be made available at a future date; there’s no sign of them at the moment.

“We’re looking for suggestions, not just about bugs, but about how the algorithm should work,” Musk said during the Spaces session. “It will be an evolving process. I wouldn’t expect it to be a non-stop upward movement… but we are very open to what could improve the user experience.”

At first glance, the algorithm is quite complex, but not necessarily surprising from a technical point of view. It consists of multiple models, including models for detecting “not safe for work” or abusive content, estimating the likelihood that one Twitter user will interact with another, and calculating a Twitter user’s “reputation.” (It’s unclear exactly what “reputation” refers to; the high-level documentation isn’t clear on that.) Several neural networks are responsible for ranking the tweets and recommending accounts to follow, while a filtering component hides tweets to (forgive the jargon) “support legal compliance, improve product quality, increase user trust, protect revenue through the use of hard-filtering, visible product treatments, and coarse-grained downranking.”

In a technical blog post, Twitter reveals more about its recommendation pipeline, which it says runs about five billion times a day:

“We try to pull the best 1,500 tweets out of a pool of hundreds of millions… Today, the For You timeline is 50% [tweets from people you don’t follow] and 50% [tweets from people you follow] on average, although this may vary from user to user,” Twitter wrote. “Ranking [tweets] is achieved with a neural network of ~48 million parameters that is continuously trained on tweet interactions to optimize for positive engagement (e.g. likes, retweets and replies).”

Of course, Twitter users don’t see the full 1,500 tweets. They are filtered based on content restrictions and on other criteria and factors considered by the models.
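To make the stages described above concrete, here is a minimal Python sketch of a pipeline with the same shape: pick up to 1,500 candidates split evenly between followed and unfollowed authors, rank them by predicted engagement, then filter. It is not Twitter’s code. The `Tweet` fields, the function names and the page size are all hypothetical, and a stored score stands in for the ~48-million-parameter neural network.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    tweet_id: int
    in_network: bool          # True if the viewer follows the author
    engagement_score: float   # hypothetical stand-in for the ranking model's output
    flagged: bool = False     # hypothetical stand-in for content restrictions


def source_candidates(pool, limit=1500, in_network_share=0.5):
    """Take up to `limit` candidates, split between followed and unfollowed authors.

    Assumption: the first tweets found in each group are taken; the real
    system uses far more elaborate candidate sources.
    """
    in_net = [t for t in pool if t.in_network]
    out_net = [t for t in pool if not t.in_network]
    n_in = min(len(in_net), int(limit * in_network_share))
    n_out = min(len(out_net), limit - n_in)
    return in_net[:n_in] + out_net[:n_out]


def rank(candidates):
    """Order candidates by predicted engagement, highest first."""
    return sorted(candidates, key=lambda t: t.engagement_score, reverse=True)


def apply_filters(ranked):
    """Drop tweets that hit a content restriction, preserving rank order."""
    return [t for t in ranked if not t.flagged]


def build_timeline(pool, page_size=20):
    """Run sourcing, ranking and filtering, then return one page of tweets."""
    return apply_filters(rank(source_candidates(pool)))[:page_size]
```

Calling `build_timeline(pool)` on a list of `Tweet` objects returns at most `page_size` unflagged tweets in descending score order, which mirrors why users see far fewer than the 1,500 candidates.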

Gizmodo notes that the one thing that doesn’t seem to have been made public is the list of VIPs Twitter pushes to users. This week, Platformer reported that Twitter has a changing list of notable users, including YouTuber Mr. Beast and Daily Wire founder Ben Shapiro, that it uses to monitor changes in the recommendation algorithm by increasing the visibility of these “power users” seemingly at will.

The release of the source code comes after several controversies over tweaks to Twitter’s recommendation algorithm in recent months. According to Platformer, in February, Musk called on Twitter engineers to reconfigure the algorithm so that his tweets would be viewed more widely. (Twitter later reversed this change, at least somewhat.) In November, Twitter began showing users more tweets from people they don’t follow, a move the platform attempted prior to the Musk acquisition but later reversed after a backlash from users.

