Eggs “over easy” and reflections on the authenticity of ChatGPT’s language

Eggs cooked over easy

I was in Vietnam last fall on assignment from the US State Department (Public Diplomacy Division) to serve as an “English Language Specialist,” giving workshops to foreign language teachers (mostly of English, some of Korean and Mandarin) on integrating technology into language learning. The main topic: generative AI like ChatGPT. There was a lot of interest, as well as plenty of concern (students cheating, teachers losing their jobs). AI offers a variety of opportunities for language teachers and learners, but it also brings with it many issues, both in education and for society as a whole. I just finished a column on that topic, forthcoming in Language Learning & Technology.

One of the topics we talked about in the workshops was the quality of the language output from AI systems. While the language generated is grammatically correct, fluent, and coherent (although sometimes inaccurate), I pointed out to the teachers that AI systems have a quite different orientation to language than do humans, with the likely result that the social and cultural aspects of AI output might be questionable. I decided to try an experiment with ChatGPT based on an incident I witnessed at breakfast at the hotel where I was staying in Hanoi. I entered the prompt below:

Me

An American tourist is staying at a hotel in Vietnam. At breakfast, he goes to the grill station and requests, in English, fried eggs “over easy” (i.e. with runny yokes). The cook has trouble understanding what he wants. The tourist repeats his request, but the cook still does not understand. The tourist says, “Ok, I’ll come back later. Thank you.” The cook does not respond. The tourist starts to walk away, then turns back to the cook and says, “I SAID thank you.” Why did the tourist make that last statement?

ChatGPT

The tourist likely made the last statement, “I said thank you,” as a polite way of expressing gratitude and acknowledging the cook, even though there was no verbal response from the cook. In some cultures, including American culture, expressing gratitude is considered polite and is a social norm. The tourist may have wanted to convey that he appreciated the cook’s efforts or wanted to ensure that his gratitude was acknowledged, even if the cook did not respond verbally. It’s a common social convention to say “thank you” after making a request or receiving assistance.

Of course, ChatGPT was wrong: it took literally what was said instead of understanding the real meaning of “I SAID thank you.” The American tourist expected (unreasonably) an “Ok” or “You’re welcome” or some kind of verbal response and was repeating his “thank you” out of irritation over the lack of response (and over not getting the eggs he wanted). From the perspective of language pragmatics, the Vietnamese cook failed to complete an “adjacency pair,” a conversational turn-taking pattern that stipulates a standard reply to an utterance. Adjacency pairs are formulaic sequences without real semantic meaning; they serve rather as a form of socialization, connecting speakers together. The American English greeting “How are you?” is not asking for a health report, but just offering a greeting, with an expected reply of “Fine, thanks.” Unfortunately, the cook didn’t know enough English to play his role as expected.

Below is an abstract for a talk I am giving (virtually) in Portugal at an upcoming conference on “Digital Learning Environments and Authenticity in English Language Teaching.” My presentation deals with the question of social and cultural authenticity in AI language production:

The ability of ChatGPT and other AI systems to generate language that closely resembles human-produced speech has led to claims that AI chatbots can “facilitate an authentic, interactional language learning environment” (Chiu et al., 2023), that AI use is “essential for promoting cultural sensitivity, intercultural competency, and global awareness” (Anis, 2023, p. 64), and that AI-based VR supplies “the benefits of in-country immersion programs without the hassle” (Divekar, 2022, p. 2354). The suggestion in these studies is that AI output is linguistically and culturally “authentic” and could substitute in language learning settings for human interlocutors or could even provide similar benefits to a study abroad experience.

Such a view ignores the process used by AI systems to reproduce language and the limitations of that process for the linguistic features and cultural content of the resulting output. AI systems break down language (their training data) into mathematical symbols and use machine learning to find patterns and regularities to form a “large language model” that enables next-word prediction in a text string, which is then used to construct, very effectively, sentences, paragraphs, even complete discourses. Humans, in contrast, are socialized into their language abilities, learning gradually how to use language appropriately within an ever-expanding circle of social contexts. Through interactions with others, we acquire the social and cultural norms of language use, including the contextually appropriate use of nonverbal communication, i.e., facial expressions, body language, and paralanguage. The statistical model of language in AI lacks the sociocultural grounding humans have through sensorimotor interactions and from simply living in the real world.

Studies of AI’s capabilities to engage in pragmatically effective language use have shown significant limitations (Lee & Wang, 2022; Su & Goslar, 2023). While AI systems can gain pragmalinguistic knowledge and learn appropriate formulaic sequences (politeness conventions, for example) through the verbal exchanges in their training data, they have proven to be much less effective in sociopragmatic engagement, that is, in generating contextually acceptable speech reflecting an interlocutor’s state of mind, intentions, and emotional status. AI systems are likely to improve through user interactions added to their language models, through enlarging their datasets, and through multimodal incorporation (adding video and image training). However, those measures still will not supply the lived experience humans go through in negotiating common ground linguistically and culturally in social interactions, and therefore will not supply the ability to deal with nuanced pragmatic scenarios. AI-generated language, while valuable as a resource in language learning, will remain artificial and inauthentic, and cannot serve as an acceptable substitute for actual learner engagement in the L2 with peers and expert speakers.