OpenAI is one of the most discussed organizations in the tech world, known for research and products that put artificial intelligence (AI) to practical use. Recently, under CEO Sam Altman, the company released GPT-4, the long-awaited upgrade to its large language models (LLMs). The model has been so successful that a group of prominent experts and tech company leaders, including Elon Musk, signed an open letter urging a moratorium on training AI systems more powerful than GPT-4.
Given the success of GPT-4 and OpenAI’s research, one might expect the company to keep building ever-larger models in pursuit of more impressive results. However, Altman has surprised many in the tech world by saying that the practice of improving models simply by increasing their size has come to an end. During an MIT event, Altman stated that “[we] are at the end of the era where it’s going to be these, like, giant, giant models. We’ll make them better in other ways.”
OpenAI has a long history of working on large language models. Its first landmark model, GPT-2, was released in 2019 with 1.5 billion parameters, the adjustable variables that let a model “learn” from its training data. The following year brought an enormous leap in size and power with GPT-3, which has 175 billion parameters. GPT-4 is widely assumed to be larger still, with some estimates putting it at around one trillion parameters. OpenAI has not revealed the true size of GPT-4, but the model’s capabilities and its impact on the AI industry point to a substantial increase in power.
Nonetheless, OpenAI’s own technical report suggests that further increases in model size may yield diminishing returns. As Altman noted, the situation resembles the “gigahertz race in chips in the 1990s and 2000s, where everybody was trying to point to a big number.” Altman conceded that parameter counts may still increase, but his primary focus has shifted away from simply scaling up models. Rather, he said that OpenAI’s goal is “to deliver the most capable, useful and safe models.”