AI Alignment: A Complex Problem That Cannot Be Solved in Four Years, Experts Say
OpenAI, the renowned artificial intelligence (AI) research organization, recently made waves with an ambitious announcement about tackling AI alignment. The company revealed a plan to dedicate 20% of its compute over the next four years to the problem, forming a new Superalignment team co-led by co-founder and Chief Scientist Ilya Sutskever and Jan Leike, its Head of Alignment. While OpenAI aims to solve the core technical challenges of superintelligence alignment within that window, experts from the AI community have expressed doubts about the feasibility of such a goal.
The need for AI alignment has grown more urgent as AI advances by leaps and bounds. With the arrival of models like ChatGPT and DALL·E, it is crucial to ensure that these systems behave in accordance with human intent and do not go astray. Aligning AI models with human values, however, is a complex task, not nearly as straightforward as controlling a car.
One of the central challenges lies in identifying which values to align AI models with, and in handling conflicts and shifts in those values over time. Humans hold diverse values, which makes alignment a moving target: norms around free expression in one community, for example, may clash with expectations around harm prevention in another. Differing interpretations, applications, and contexts complicate matters further. Achieving universal consensus on the right values for AI models seems unlikely, given the dynamic nature of society and the myriad cultural and philosophical perspectives within it.
Although the alignment problem is widely acknowledged as vital, scientists from the AI community have disputed OpenAI's approach. Yann LeCun, Meta's Chief AI Scientist, argued that alignment is not a problem that can be solved once and for all, let alone within a four-year timeframe. The French scientist compared it to safety challenges in other fields such as transportation, where reliability is achieved through continuous refinement rather than a one-time solution.
Giada Pistilli, Principal Ethicist at Hugging Face, shares a similar view, emphasizing that the complexity of human values cannot be captured or summarized in AI models. Attempts to engineer solutions to social problems have historically proved unsuccessful. While Pistilli acknowledges OpenAI's efforts, she believes the problem may require a more mundane, less ambitious approach.
Despite the skepticism surrounding OpenAI’s timeline and methodology, it is evident that addressing the alignment challenge is critical for preventing potential AI risks. Without proper alignment, AI systems could pose dangers such as accidents in self-driving cars, biased decisions in hiring processes, or the propagation of false information by chatbots.
To start addressing the alignment problem, Pistilli suggests prioritizing a clear understanding of the goals of AI models; defining these goals lays the foundation for effective alignment. Even so, alignment remains a complex task, and a foolproof engineering solution may prove elusive.
While OpenAI’s commitment to allocating significant resources to tackle alignment is commendable, the experts caution against expecting a definitive solution within a fixed timeframe. The alignment problem demands continuous attention, refinement, and a deep understanding of human values. As the AI community grapples with this challenge, it is crucial to strike a balance between ambition and realism to ensure the safe and ethical development of AI technology.