When Elon Musk first proposed acquiring Twitter, he said that one of the first changes he would make was to open Twitter’s algorithm to the public. Twitter has now made good on that promise, publishing the code for the site’s “For You” recommendations on GitHub.
Twitter sleuths quickly began digging through the code for anything of note. One discovery surfaced almost immediately: Musk’s tweets are assigned their own category, alongside Democrats, Republicans, and “power users.” Twitter’s engineers were quick to explain that this was for tracking statistics, an explanation that later analyses have supported. Although that section of code was promptly removed from Twitter’s GitHub repository, rumors persist that the company’s employees pay special attention to their boss’s engagement on the platform and have taken steps to artificially boost the reach of his tweets.
Since then, little else about the code’s contents or how Twitter’s algorithm works has come to light. Anyone hoping that the published code would offer fresh insight into Twitter’s inner workings is likely to be disappointed. Engineers who have examined it closely have concluded that the released code says little of substance about how the algorithm actually functions.
Sol Messing, a former Twitter employee and current associate professor at NYU’s Center for Social Media and Politics, said that what Twitter released was a heavily redacted version of the algorithm. For one thing, it does not include every system that plays a role in Twitter’s recommendations.
Twitter said it was withholding code related to its advertising, trust, and safety systems to prevent malicious users from exploiting the platform. The company also chose not to disclose the models used to train its algorithm, explaining in a recent blog post that this was to protect user safety and privacy. According to Messing, that decision is the more significant one. He told me that the most important part of the algorithm, the model that drives it, has not been made public and so remains inaccessible.
Musk’s original wish to open-source the algorithm appears to have stemmed from his belief that Twitter had used it to suppress free speech. Last April, speaking at TED shortly after announcing his takeover bid, Musk said that Twitter should publish the algorithm and clearly indicate any changes made to people’s tweets, including whether they were emphasized or de-emphasized. That way, he argued, it would be apparent to anyone that such an action had been taken, leaving no room for behind-the-scenes manipulation, whether algorithmic or manual.
The code Twitter released reveals little about potential bias or about the kind of covert manipulation Musk said he wanted to expose. Messing says the release has the appearance of transparency but offers no real insight into how the algorithm works. It does not explain why some tweets are down-ranked and others are up-ranked.
According to Messing, Twitter’s recent API changes have cut most researchers off from a large amount of useful data on the platform. Without adequate API access, researchers cannot run their own studies, which might otherwise shed new light on how the algorithm works. He pointed out that Twitter has made it extremely difficult for researchers to scrutinize the code at the very moment the company is releasing it.
When we spoke last year about Musk’s plan to “open source” Twitter’s algorithm, Alex Hanna, the research director at DAIR, stressed the importance of audits. Like Messing, she doubted that simply putting code on GitHub would do much to make Twitter’s workings more transparent.
According to Hanna, meaningful public oversight of a Twitter algorithm would require several different approaches working together.
The GitHub code does shed new light on one aspect of Twitter’s algorithm. Messing points to a document, spotted by data scientist Jeff Allen, that lays out a kind of “recipe” for how the algorithm weights different forms of engagement. According to Messing, if we take it at face value, a favorite is worth half a retweet. A reply is worth 27 retweets, and when the original poster of the tweet responds, the value rises to 75 retweets.
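To make those ratios concrete, here is a minimal sketch in Python that converts a handful of engagement counts into "retweet equivalents" using only the figures quoted above. The names `RETWEET_EQUIVALENTS` and `engagement_score` are hypothetical, and the simple weighted sum is an assumption made for illustration. It is not Twitter's ranking code, and the model that actually drives the algorithm has not been published.

```python
# Relative weights in "retweet equivalents", taken from the figures quoted
# in the article. The dictionary keys and the linear-sum form are
# illustrative assumptions, not Twitter's actual implementation.
RETWEET_EQUIVALENTS = {
    "favorite": 0.5,                     # half the value of a retweet
    "retweet": 1.0,                      # the baseline unit
    "reply": 27.0,                       # one reply ~ 27 retweets
    "reply_with_author_response": 75.0,  # original poster responds ~ 75 retweets
}


def engagement_score(counts):
    """Sum engagement counts weighted by their retweet-equivalent value."""
    unknown = set(counts) - set(RETWEET_EQUIVALENTS)
    if unknown:
        raise ValueError(f"unknown engagement types: {sorted(unknown)}")
    return sum(RETWEET_EQUIVALENTS[kind] * n for kind, n in counts.items())


# A tweet with many favorites versus one with only a few replies.
many_favorites = engagement_score({"favorite": 100, "retweet": 10})
few_replies = engagement_score({"reply": 2, "reply_with_author_response": 1})
print(many_favorites, few_replies)  # prints: 60.0 129.0
```

Under these assumed weights, three replies (one of which the author answers) outweigh a hundred favorites and ten retweets combined, which is the kind of imbalance the quoted figures imply.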
The recipe is partially informative, but it still does not give the whole picture. Messing says the information is not very meaningful without the underlying data. Musk has sharply raised the price academics must pay for that data. Carrying out a comprehensive study today would require substantial funding, on the order of half a million dollars a year in grants, just to obtain a meaningful amount of data for analysis.