Asked whether you’d be more likely to use a paper map or a stone to fan life into coals for a barbecue, the artificial intelligence language system GPT-3 preferred the stone. Likewise, it suggested a hairpin for smoothing a wrinkled skirt and a hamburger bun for covering your hair at work. But why?
Arthur Glenberg of Arizona State University and Cameron Robert Jones of the University of California, San Diego, set out to answer exactly this. They compared GPT-3’s understanding with that of a computer system from around 20 years ago. GPT-3 did much better than the older model, yet still significantly worse than humans at judging which object would work in each scenario.
The answer lies in how language is tied to our bodies and in what the computer is taught. The meaning of a word or sentence is closely linked to being in a physical and social environment and knowing how to use it there. People’s understanding of a term like “paper sandwich wrapper,” for example, includes its properties, such as its look and weight, and the knowledge that it can be used to wrap a sandwich.
GPT-3, however, learns language from a trillion instances, noting which words tend to follow which other words. Its knowledge of word sequences may not be enough to work out what one can actually do with the things language describes. Artificial intelligence systems such as GPT-3 and ChatGPT do not have bodies and do not experience emotions or perceptions, which is why they can’t develop an understanding of a word or phrase the way humans do.
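To make "noting which word follows which other word" concrete, here is a toy sketch in Python. It is only an illustration under loud assumptions: GPT-3 is a large neural network, not a lookup table of word pairs, and the tiny corpus and the function names `train_bigrams` and `predict_next` are invented for this example.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word in the text, which words follow it."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical miniature corpus, made up for illustration only.
corpus = (
    "fan the coals with the map . "
    "wrap the sandwich with the wrapper . "
    "fan the coals again"
)
model = train_bigrams(corpus)

print(predict_next(model, "fan"))    # word seen most often after "fan"
print(predict_next(model, "stone"))  # "stone" never appears, so None
```

A model like this can continue a sentence plausibly without knowing anything about what a map or a stone is like to hold, which is the gap the researchers point to.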
Its successor, GPT-4, was trained on images as well as text, which suggests it can handle such sentences more accurately, at least for the three questions discussed above. Because of its larger size, it might also hold up better across different scenarios. But it still lacks what humans bring to language: a body with hands and arms, diverse needs, and feelings. Without these, it cannot understand the affordances of objects, that is, what they can be used for, the way people do.
Research is ongoing. Projects that train language models to generate physics simulations, interact with physical environments, and even produce robotic action plans will help us figure out whether artificial intelligence could ever understand language the way humans do, should it ever have access to a human-like body, senses, and real-world needs.
ChatGPT is a fascinating tool and will no doubt be used for both good and ill. Still, never mistake its fluent language production and prediction accuracy for a human level of understanding.