12 May 2022

Meta ruffles OpenAI feathers with carbon-crunching language models

A research paper hot off the press from Meta AI has made quite a splash in the academic community. In layman’s terms, Meta has effectively open sourced a set of very large language models, a move that arguably makes it more open than OpenAI, the research lab co-founded by Elon Musk, among others.

While it sounds like Meta is ticking all the right boxes – including sustainability, diversity and openness within AI language models – one has to ask whether this is more than a muscle-flexing PR stunt. Besides, what’s stopping people from using Meta’s new open models for harmful purposes, like spreading fake news or developing more convincing deepfake videos?

Without getting too deep into the methodology (you’ll thank us later), Meta’s Open Pre-trained Transformer (OPT) large language models (LLMs) have the potential to advance work on dialog models, bias and toxicity evaluation, and hate speech detection.

With social media having become a significant political weapon, and with grave concerns about how young people will interact with future metaverse experiences, Meta is making these LLMs available to the wider research community with the aim of benefiting AI and natural language processing development. However, we doubt the Facebook research wing will receive the plaudits it deserves for this huge open source effort.

But why bother trying to get one over on OpenAI in the first place? The reason Meta’s OPT has caused such a stir is that the suite of decoder-only pre-trained transformers ranges from 125 million to 175 billion parameters, with the largest matching the 175 billion parameters of OpenAI’s GPT-3, but developed at 1/7th of the carbon footprint.

Meta achieved this with its largest model, called OPT-175B, using the latest generation of Nvidia hardware.

Models of this size require massive computational resources and are therefore costly, meaning any copycat attempts need serious capital and come at a sustained cost to the environment. Meta warns that repeated efforts to replicate a model of this size will only amplify the growing compute footprint of these LLMs, which sounds a little hypocritical, although at 1/7th of the carbon footprint, it’s hard to argue.

In other words, “Don’t even bother trying to beat us because you’ll fail, and we’ll guilt trip you for impacting the environment even if you do produce better results than us.” Meta has released its logbook detailing the infrastructure challenges faced, as well as code for experimenting with all released models.

Given the centrality of LLMs in many downstream language applications, Meta also hopes to increase the diversity of voices defining the ethical considerations of such technologies. These voices across the AI community span academic researchers, civil society, policymakers, and private sector industries.

For the full OPT-175B model, research access is provided upon request, while the smaller models, from 125M to 30B parameters, will be released without the need for a request.
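For a rough sense of what these sizes mean in practice, the sketch below estimates the storage needed for the model weights alone. It assumes each parameter is stored in 16 bits (2 bytes), which is our assumption rather than a figure from Meta, and it ignores everything else a running model needs, such as activations and optimizer state. The function name and the model labels are illustrative.

```python
# Back-of-the-envelope estimate of the storage needed for model weights alone.
# Assumption (not from Meta): each parameter is stored in 16 bits (2 bytes).
BYTES_PER_PARAM = 2

# Parameter counts mentioned in the article; the labels are illustrative.
MODEL_SIZES = {
    "125M": 125_000_000,
    "30B": 30_000_000_000,
    "175B": 175_000_000_000,
}


def weight_gigabytes(num_params: int, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Return the weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for label, params in MODEL_SIZES.items():
        print(f"{label} parameters: ~{weight_gigabytes(params):,.2f} GB of weights")
```

Under that 2-byte assumption, the largest model’s weights alone come to about 350 GB, far more than a single consumer graphics card can hold, while the smallest fits in a quarter of a gigabyte.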

As for limitations, there are several. Of particular concern is that OPT-175B can produce factually incorrect statements, which can be harmful in applications where information accuracy is critical. OPT-175B was also found to have a high propensity to generate toxic language and reinforce harmful stereotypes, as well as a tendency to be repetitive and easily get stuck in a loop. It is precisely these limitations that make the case for open sourcing the LLMs, and they are arguably where most research effort is required.