Asian language in US movies, now becoming more authentic

Screenshot from interactive Web page on Asian movies in NY Times

The New York Times recently announced a new series that examines the output of Asian Americans and Pacific Islanders (AAPIs) in popular culture. The first piece in the series, “Found in Translation: Asian Languages Onscreen,” focuses on the use of Asian languages in American movies and TV. The article itself is quite interesting, but particularly effective is its online presentation, which features an innovative visual design by Alice Fang. The article points out that US viewers have traditionally been averse to reading subtitles, but that this has changed recently with respect to Hollywood productions with Asian characters. The recent hits Shogun and Everything Everywhere All at Once feature dialog in Japanese and Chinese, respectively, mixed in with English.

The Web version of the article shows in interactive format how, in a scene from the recent HBO mini-series The Sympathizer that depicts the making of a movie, there is a kind of hidden dialog going on through the subtitles that illuminates what is happening on screen. The movie being shot is set in Vietnam, but in that scene an actress who is playing a Vietnamese peasant speaks Cantonese. According to the article:

A crew member sheepishly explains to the director that they didn’t bother casting a Vietnamese speaker since “there’s no line in the script.” Eventually, another actress who does speak Vietnamese is brought in, and they reshoot the scene. Instead of the line the director suggests — “Don’t shoot me, I’m only a peasant” — the actress shouts one fed to her by the Captain, the on-set cultural consultant who is actually a Communist spy: “Our hands will close around the throat of American imperialism!” The swap goes over the director’s head, but for the viewer, who can read the subtitles (and for anyone who speaks Vietnamese), the layers of language become a narrative tool for political satire.

The article points out that this trend is likely to continue, as it seems to play well with audiences, adding a welcome tone of realism to movies about Asian characters that in the past were highly Americanized. For audience members who actually speak the language used in the film, the meaning is deepened, giving those viewers a special, inside track on what’s going on. The article also notes that Internet users, through exposure to multilingual videos on YouTube, TikTok, etc., have grown more accustomed to and accepting of subtitles. In movies, adding authentic language enhances the story line: “Multiplicity of language is most interesting when it’s used to progress these stories — to ratchet up tension, to encase or reveal secrets, to create emotional resonance, to reflect or deflect identity.” That’s the case, for example, in the rich code-switching in Everything Everywhere All at Once, which often features characters mixing Chinese and English in the same sentence. Subtitles not only enrich movies, they are also a wonderful (and entertaining) way to learn or maintain a foreign language. The fact that English-language TV programs have long been shown in Scandinavia in English with subtitles has been shown to be a major factor in the citizens of those countries speaking better English than those of countries such as Germany, which has long dubbed foreign video into German.

Also interesting in terms of language in movies: a recent trailer has stirred a good deal of comment on the accents used. The upcoming movie is the sequel Gladiator II, which features Denzel Washington. The issue for many viewers of the trailer is that Washington does not adhere to the cinematic tradition of actors in movies about ancient Rome using British English but instead speaks in his regular New York-accented American English. It’s a rather strange criticism in that ancient Romans certainly did not speak English of any kind. As many have pointed out, the Roman Empire was multicultural and therefore likely had many speakers with different accents speaking Latin. So criticizing the movie for mixed English accents is way off base.

If a language is not on the Internet, does it exist?

The quick answer to the question in the title is yes, but maybe not for long. This week there was an interesting podcast on US public radio (KERA’s Think), AI could drive most languages to extinction, which features a conversation with Matteo Wong, a staff writer for The Atlantic, who published this spring a piece on the same topic, The AI Revolution Is Crushing Thousands of Languages. Both the podcast and the article deal with the troublesome issue of the poor representation in AI of low-resource languages, i.e. languages which do not have a large written record, and which may be under-represented online. Those languages, in contrast to high-resource languages like English, Chinese, Spanish, French, or Japanese (and other European and Western languages), make up little of the training data for generative AI systems like ChatGPT. As a result, AI systems have little knowledge of those languages and therefore are likely to perform poorly in areas such as translation, providing accurate information, or even generating coherent texts.

Wong gives the example of a linguist from Benin asking ChatGPT to provide a response in Fon, a language of the Atlantic-Congo family spoken by millions in Benin and neighboring countries. The AI response: it was not able to comply, as Fon was “a fictional language”. An additional problem for low-resource languages is that the texts that do appear online may not be genuinely produced by speakers of the language, but be machine-translated, and therefore potentially of questionable quality.

This means that AI, an increasingly important source of information in today’s world, will be unavailable to those who do not know English or another high-resource language. Wong cites David Adelani, a DeepMind research fellow at University College London, who points out that “even when AI models are able to process low-resource languages, the programs require more memory and computational power to do so, and thus become significantly more expensive to run—meaning worse results at higher costs”. That means there is little incentive for AI companies like OpenAI, Meta, or Google to develop capabilities in languages like Fon.

The information about low-resource languages is not just linguistically deficient, but culturally problematic as well:

AI models might also be void of cultural nuance and context, no matter how grammatically adept they become. Such programs long translated “good morning” to a variation of “someone has died” in Yoruba, Adelani said, because the same Yoruba phrase can convey either meaning. Text translated from English has been used to generate training data for Indonesian, Vietnamese, and other languages spoken by hundreds of millions of people in Southeast Asia. As Holy Lovenia, a researcher at AI Singapore, the country’s program for AI research, told me, the resulting models know much more about hamburgers and Big Ben than local cuisines and landmarks.

The lack of support for most of the world’s 7,000 languages is evident from the fact that Google’s Gemini supports 35 languages and ChatGPT 50. As Wong notes, this is not just a practical problem for speakers of low-resource languages: the lack of support also sends the message that those speakers’ languages are not valued. There is of course also the danger that languages lacking AI support will become less widely spoken, as they are perceived as not offering the personal and professional benefits of high-resource languages. Losing languages means losing the human values associated with those languages, for example, knowledge of the natural world tied to Indigenous languages, unique cultural values, and traditional stories.

Wong points out that there are efforts to remedy this situation, such as Meta’s No Language Left Behind project. That initiative is developing open-source models for “high-quality translations directly between 200 languages—including low-resource languages like Asturian, Luganda, Urdu and more”. The Aya project is a global initiative led by the non-profit Cohere For AI, involving researchers in 119 countries who seek to develop a multilingual AI for 101 languages as an open resource. That system features human-curated annotations from fluent speakers in many languages. Masakhane is a grassroots organization whose mission is to strengthen and spur NLP research “in African languages, for Africans, by Africans”.

Let us hope that such initiatives can help bring AI to more languages and cultures. However, the power of the big AI systems is such that only if they commit to adding more diverse language data to their training corpus will AI truly become more fully multilingual and multicultural.

Eggs “over easy” and reflections on the authenticity of ChatGPT’s language

Eggs cooked over easy

I was in Vietnam last fall on assignment from the US State Department (Public Diplomacy Division) to serve as an “English Language Specialist,” giving workshops to foreign language teachers (mostly of English, some of Korean and Mandarin) on integrating technology into language learning. The main topic: generative AI like ChatGPT. There was a lot of interest, as well as plenty of concern (students cheating, teachers losing their jobs). AI offers a variety of opportunities for language teachers and learners, as well as bringing with it lots of issues, both in education and for society as a whole. I just finished an upcoming column on that topic for Language Learning & Technology.

One of the topics we talked about in the workshops was the quality of the language output from AI systems. While the language generated is grammatically correct, fluent, and coherent (although sometimes inaccurate), I pointed out to the teachers that AI systems have a quite different orientation to language than do humans, with a likely result that the social and cultural aspects of AI output might be questionable. I decided to try out an experiment with ChatGPT based on an incident I witnessed at breakfast at the hotel where I was staying in Hanoi. I entered the prompt below:

Me

An American tourist is staying at a hotel in Vietnam. At breakfast, he goes to the grill station and requests, in English, fried eggs “over easy” (i.e. with runny yokes). The cook has trouble understanding what he wants. The tourist repeats his request, but the cook still does not understand. The tourist says, “Ok, I’ll come back later. Thank you.” The cook does not respond. The tourist starts to walk away, then turns back to the cook and says, “I SAID thank you.” Why did the tourist make that last statement?

ChatGPT

The tourist likely made the last statement, “I said thank you,” as a polite way of expressing gratitude and acknowledging the cook, even though there was no verbal response from the cook. In some cultures, including American culture, expressing gratitude is considered polite and is a social norm. The tourist may have wanted to convey that he appreciated the cook’s efforts or wanted to ensure that his gratitude was acknowledged, even if the cook did not respond verbally. It’s a common social convention to say “thank you” after making a request or receiving assistance.

Of course, ChatGPT was wrong and took literally what was said instead of understanding the real meaning of “I SAID thank you.” The American tourist expected (unreasonably) an “Ok” or “You’re welcome” or some kind of verbal response and was repeating his “thank you” from irritation over the lack of response (and over not getting the eggs he wanted). From the perspective of language pragmatics, the Vietnamese cook failed to complete an “adjacency pair”, a conversational turn-taking pattern that stipulates a standard reply to an utterance. Unfortunately, the cook didn’t know enough English to play his role as expected. Adjacency pairs are formulaic sequences without real semantic meaning; they serve rather as a form of socialization, connecting speakers together. The American English greeting “How are you?” is not asking for a health report, but just offering a greeting, with an expected reply of “Fine, thanks”.

Below is an abstract for a talk I am giving (virtually) in Portugal at an upcoming conference on “Digital Learning Environments and Authenticity in English Language Teaching.” My presentation deals with the question of social and cultural authenticity in AI language production:

The ability of ChatGPT and other AI systems to generate language that resembles closely human-produced speech has led to claims that AI chatbots can “facilitate an authentic, interactional language learning environment” (Chiu et al., 2023), that AI use is “essential for promoting cultural sensitivity, intercultural competency, and global awareness” (Anis, 2023, p. 64), and that AI-based VR supplies “the benefits of in-country immersion programs without the hassle” (Divekar, 2022, p. 2354). The suggestion in these studies is that AI output is linguistically and culturally “authentic” and could substitute in language learning settings for human interlocutors or could even provide similar benefits to a study abroad experience.

Such a view ignores the process used by AI systems to reproduce language and the limitations of that process for the linguistic features and cultural content of the resulting output. AI systems break down language (their training data) into mathematical symbols and use machine learning to find patterns and regularities to form a “large language model” that enables next word prediction in a text string, which is then used to very effectively construct sentences, paragraphs, even complete discourses. Humans, in contrast, are socialized into their language abilities, learning gradually how to use language appropriately within an ever-expanding circle of social contexts. Through interactions with others, we acquire the social and cultural norms of language use, including the contextually appropriate use of nonverbal communication, i.e., facial expressions, body language, and paralanguage. The statistical model of language in AI lacks the sociocultural grounding humans have through sensorimotor interactions and from simply living in the real world.

Studies of AI’s capabilities to engage in pragmatically effective language use have shown significant limitations (Lee & Wang, 2022; Su & Goslar, 2023). While AI systems can gain pragmalinguistic knowledge and learn appropriate formulaic sequences (politeness conventions, for example) through the verbal exchanges in their training data, they have proven to be much less effective in sociopragmatic engagement, that is, in generating contextually acceptable speech reflecting an interlocutor’s state of mind, intentions, and emotional status. AI systems are likely to improve through user interactions added to their language models, through enlarging their datasets, and through multimodal incorporation (adding video and image training). However, those measures still will not supply the lived experience humans go through in negotiating common ground linguistically and culturally in social interactions and therefore the ability to deal with nuanced pragmatic scenarios. AI generated language–while valuable as a resource in language learning–will remain artificial and inauthentic in ways that cannot serve as an acceptable substitute for actual learner engagement in the L2 with peers and expert speakers.

The Beatles, AI, and authenticity

I’m in Vietnam currently, giving workshops on using AI tools in teaching English. Yesterday, we looked at what ChatGPT, Bard, and Bing might suggest as “best practices in using AI in language learning and teaching”. One of the suggestions was using AI chats as “authentic” language practice. That set me wondering what that word means in the context of AI text generation. The topic has been raised this month with the release of a new song by the Beatles, a musical group that disbanded over 50 years ago, with only 2 of the 4 members still living. A recent article in the New York Times discussed the issues related to that release:

Does it really make sense to use a song originally written by [John] Lennon alone, with no known intention of ever bringing it to his former bandmates, as the basis for a “Beatles” song? Is Lennon’s vocal, plucked and scrubbed by artificial intelligence and taking on a faintly unnatural air, something he would have embraced or been repulsed by? “Is this something we shouldn’t do?” McCartney asks in a voice-over, but neither he nor anyone else ever articulates exactly what the problem might be. “We’ve all played on it,” McCartney says. “So it is a genuine Beatle recording.” On one hand, who is more qualified than McCartney to issue this edict of authenticity? On the other: Why did he feel the need?

The author makes the point that this is quite different from what we all have been worrying about with AI, namely brand new “fakes”. In this case it is an example of using tech advances to, in the author’s opinion, make money from recycling old material:

The worry is that, for the companies that shape so much of our cultural life, A.I. will function first and foremost as a way to keep pushing out recycled goods rather than investing in innovations and experiments from people who don’t yet have a well-known back catalog to capitalize on. I hope I am wrong. Maybe “Now and Then” is just a blip, a one-off — less a harbinger of things to come than the marking of a limit. But I suspect that, in this late project, the always-innovative Beatles are once again ahead of their time.

The question of authenticity is one that is at the core of the communicative approach to language learning, with the idea that learners should not be working with made-up, simplified language materials, but be provided with real-world materials that native speakers might themselves be accessing. For the materials to be comprehensible, learners are supplied with “scaffolding” (annotations, notes, glossaries, etc.). Online materials have been a boon in that respect, in contrast to most materials in textbooks. Now, AI is making the question of authenticity and attribution much trickier. AI-generated materials are not products of native speakers, so should we treat them, as we do manufactured texts, as lacking in authenticity? Certainly, the cultural perspective is missing, which is one of the principal benefits of using “authentic” materials. Stay tuned, as AI and attitudes towards its output are evolving rapidly.

ChatGPT and the human-machine relationship

There has been an eruption of interest recently in generative AI, due to the public release of ChatGPT from OpenAI, a tool which, given a brief prompt, can generate in seconds texts of all kinds that are coherent, substantive, and eerily human-like. The availability of such a tool has led educators, especially in fields relying on essay writing, to wring their hands over students simply turning in assignments written by ChatGPT. Some have embraced GPTZero, a tool designed to determine whether a text is written by an AI system (in my testing, it was hit and miss in its accuracy). Some school systems have banned the use of ChatGPT.

I believe that is the wrong approach; I believe we need instead to help students use AI tools appropriately, adjust writing assignments accordingly, and lead students to understand the limits of what such tools can do (there are many). ChatGPT will soon be joined by similar tools and their abilities are sure to grow exponentially. That means they will see wide use in all domains of human activity. In their real lives after graduation, students will be expected to use such tools; let’s prepare them for that future. I argued last year for that position in a column in Language Learning & Technology (“Partnering with AI”).

In a forthcoming LLT column (“Expanded spaces for language learning,” available in February), I look at another aspect of the presence of such tools in our lives, namely what it means in terms of the human-machine relationship and in understanding the nature (and limits) of human agency. A spatial orientation to human speech, which emphasizes the primacy of context (physical, virtual, emotional, etc.), has gained currency in applied linguistics in recent years. Rather than viewing language as something set apart from spatio-temporal contexts (as was the case in structuralism or Chomskian linguistics), scholars such as Pennycook, Blommaert, and Canagarajah show how the spatial context is central to meaning-making. This perspective is bolstered by theories in psychology and neuroscience that cognition (and therefore speech) is not exclusive to the brain, but rather is embodied, embedded, enacted, or extended (4E cognition theory). That places greater meaning-making emphasis on physicality (gestures, body language) as well as on the environment and potential semiotic objects in it (such as AI tools!). I argue that an approach helpful in understanding the dynamics at play is sociomaterialism (also labeled “new materialism”). This is an approach used widely in the social sciences and, more recently, in studies in applied linguistics. It offers a different perspective on the relationship of humans to the material world. Reflecting theories in the biological sciences, sociomaterialism posits a more complex and intertwined relationship between an organism and its surroundings; for us bipeds, that translates into a distributed agency shared by humans and non-humans (including machines).

Here is an excerpt from the conclusion:

A spatial orientation to language use and language learning illuminates the complex intertwining of people and artifacts physically present with those digitally available. The wide use of videoconferencing in education, for example, complicates concepts of local and remote as well as online versus offline. Neat divisions are not tenable. Mobile devices as well represent the intersection of the local and the remote, of the personal and the social; they are equipped to support localized use, while making available all the resources of a global network. From a sociomaterial viewpoint, the phone and user form an entanglement of shared agency; smartphones supply “extensions of human cognition, senses, and memory” (Moreno & Traxler, 2016, p. 78). The sensors, proximity alerts, and camera feeds function as stimuli, extending cognition while acting as an intermediary between ourselves and the environment. For many users, smartphones have become part of their Umwelt, an indispensable “digital appendage” (Godwin-Jones, 2017, p. 4) with which they reach out to and interact with the outside world.

A sociomaterial perspective and 4E cognition theory problematize distinctions of mind versus body, as they also qualify the nature of human agency. The increasing role that AI plays in our lives (and in education) adds a further dimension to the complex human-material dynamic. AI systems built on large language models produce language that mimics closely human-created texts in style and content. A radical development in writing-related technologies is the AI-enabled incorporation of auto-completion of phrases into text editors and online writing venues, as well as suggestions for alternative wording. Auto-completion features in tools such as Google Docs or Grammarly raise questions of originality and credit. That is all the more the case with tools such as ChatGPT which are capable of generating texts on virtually any topic and in a variety of languages. O’Gieblyn in God, human, animal, machine: Technology, metaphor, and the search for meaning (2021) argues that due to the powerful advances in language technologies, we need new definitions of intelligence and consciousness, an argument bolstered by 4E cognition theory. In consideration of the language capabilities of AI tools today, particularly the text generation capabilities of services such as ChatGPT, we also need new understandings of authenticity and authorship.

O’Gieblyn points out that AI is able to replicate many functional processes of human cognition such as pattern recognition and predicting. That derives from the fact that language generation in such systems is based on statistical analysis of syntactic structures in immense collections of human-generated texts. That probabilistic approach to chaining together phrases, sentences, and paragraphs is capable of producing mostly cohesive and logically consistent texts. Yet these systems can also betray a surprising lack of knowledge about how objects and humans relate to one another. This results in statements that are occasionally incoherent from a social perspective. This is due to the fact that AI systems have no first-hand knowledge of real life. Unlike human brains, AI has no referential or relating experiences to draw on. Since the bots have no real understanding of human social relationships, they assume universal cultural contexts apply to all situations, not making appropriate distinctions based on context. This can lead to unfortunate and unacceptable language production including the use of pejorative or racist language.

The deep machine learning processes behind LLM-based chatbots do not allow for fine tuning or tweaking the algorithms. Today we have better insight into human neural networks through neuroimaging than we do into the black box of artificial neural networks used in AI. That fact should make us cautious in using AI-based language technologies in an unreflective manner. At the same time, advanced AI tools offer considerable potential benefits for language learning, and their informed, judicious use—alongside additional semantic resources that are contextually appropriate—seems to lie ahead for both learners and teachers.

Loneliness: Can AI and robots help?

OriHome robot, a social companion from Japan

Japan has a Ministry of Loneliness with a Minister whose mission is to combat social isolation in the country. This is a widely recognized problem in Japan, with a special term, hikikomori, created to refer to shut-ins who have virtually no contact with other people. Loneliness is common among older Japanese, but a study showed that it is prevalent among younger age groups as well. Prolonged social isolation can lead to depression and increases the risk of suicide, a perennial issue in Japan. Now the Japanese government, through its Loneliness Minister, is enlisting AI tech to help alleviate the problem. There is an experimental program which lends out robots to those who are isolated or who struggle with anxiety about social interactions. The robots, called OriHome, are tiny, just 9 inches tall, white, with green eyes. They are controlled by an app on mobile devices. According to the report, the robot has two arms, which it moves about expressively while talking, allowing for integration of nonverbal communication.

OriHome is not the first such device in Japan designed to provide companionship to humans. Paro is a robotic seal deployed in a retirement facility. Sony Aibo robot dogs have been available for some time. A piece in the Huffington Post traces the evolution of robotic companions in Japan. While the Japanese tend to embrace new tech early and often enthusiastically, the trend towards socially competent AI is something we are seeing world-wide, particularly through smart voice assistants such as Siri, Alexa, and Google Assistant. Advances in voice recognition and in more natural-sounding synthetic voices have come at a fast pace in recent years. Tech companies are also trying to make the assistants more social, by providing more human-like politeness formulas and small talk. The assistants are also beginning to build persistent models of user interactions, which enable multi-turn conversations and some degree of follow-up from previous conversations. Recently a Google engineer claimed that Google’s most advanced AI conversation partner, LaMDA, had in fact become sentient. Most AI experts have expressed extreme skepticism. But the claim alone demonstrates how sophisticated such AI systems have become in being able to mimic humans engaged in conversation.

In the current issue of Language Learning & Technology, I have a column which examines how partnering with AI has benefited language learners, particularly in terms of writing in a second language.

A Facebook metaverse? No thanks.

I recently wrote a column in the LLT journal (Language Learning & Technology) looking back at 25 years of the use of technology in language learning. One of the principal developments discussed in the column is the transformation of the World Wide Web, from the heady early days in the 1990s when the web seemed to promise universal access to information and knowledge to the situation we have today in which the online world has become a rich source of misinformation and divisiveness. Social media has been a major contributor to that transformation. Donald Trump demonstrated how Twitter can be a vehicle for spreading disinformation and for engaging in nasty personal vendettas. The pernicious social role played by Facebook and Instagram has become clear in recent media reports, based on leaked internal company documents.

Now Facebook has announced – not that it has seen the light and will adjust its algorithm so as not to encourage acrimony, animosity, and conflict – but that it will become the creator of the metaverse. Facebook is now “Meta”. In my LLT column, I wrote about the metaverse, i.e., the ubiquitous intertwining of physical and online worlds first depicted in Stephenson’s novel Snow Crash (1992). Some commentators have recently pointed to the expansion of gaming platforms such as Fortnite or Roblox as moving in the direction of a burgeoning metaverse. The availability of those platforms on a great variety of devices and systems, from phones to gaming consoles, points to the ubiquity of access needed for this vision. Also needed are other elements already available within Fortnite or Roblox: the ability to have both planned and spontaneous events, to offer a variety of gaming and communication options, to have its own economic system (such as gaming currency) and, importantly, to enable users to carve out within that environment their own space, offering, for example, user-created games.

Apparently, Facebook’s interest in creating a metaverse stems in part from its commitment to the growth of virtual reality (through the Oculus system it acquired), as VR is a likely entry point to a metaverse. But it’s likely as well that the company hopes the announcement will distract from the many issues raised in the media about Facebook and its associated platforms. Additionally, Facebook’s move seems to be an attempt to attract younger users, who have flocked to Roblox and Fortnite. Ironically, it is that group, especially young girls, who have been shown in Facebook’s own studies to suffer potentially harmful effects from social media such as Instagram.

Do we want to have a new virtual world built by a company which ignores its own findings about its negative impact, a company that seems to have a focus only on market share and profit? For me, Fortnite and Roblox seem to offer better alternatives, if indeed we are moving towards a metaverse.

Big data and language learning

The big news in artificial intelligence (AI) this past year was the arrival of GPT-3, a substantially improved version of the “Generative Pre-trained Transformer” from OpenAI, an advanced AI system built on a web of artificial neural networks, deep machine learning, and a massive collection of data on human language. The system has been described as a giant step towards the realization of AGI, “artificial general intelligence”, the ability of a system to use language in virtually any domain of human activity. I wrote about this development in the latest issue of Language Learning & Technology, a special journal issue on big data and language learning. I discuss the breakthrough represented by AGI:

Normally, an AI system will be able to deal effectively only within a narrowly defined domain, for which the system has been trained, so as to expect specific language patterns typically used in that context. Google Duplex, for example, does a remarkable job in conversing over the phone with human operators in making dinner reservations or reserving a ride on Uber. GPT-3, in contrast, has been shown to interact through language in a wide variety of genres and content areas: creative writing, journalism, essays, poetry, text-based gaming, and even writing software code. The Guardian newspaper ran an article written by the program, while the New York Times asked it to write about love. A blogger used GPT-3 to write multiple blog posts, subsequently receiving numerous subscribers and notice on tech websites. The fact that many readers were not able to tell that the GPT-3 generated texts were written by an AI system raises questions of trust and authenticity, mirroring the concerns raised about audio and video “deepfakes”, based on training an artificial neural network on many hours of real audio or video footage of the targeted individual.

The system represents a remarkable achievement in its ability to write in natural sounding language (idiomaticity, flow, cohesion). That ability is based on the collection and analysis of huge volumes of text data collected by crawling the internet, including all of Wikipedia. GPT-3 translates that data into a very large (175 billion!) set of connections or “parameters”, i.e. mathematical representations of patterns. These parameters provide a model of language, based not on rules, but on actual language usage. That allows the system to predict speech sequencing, based on regularly occurring constructions of words and phrases, thereby enabling the machine production of natural-sounding language utterances. One can imagine how powerful GPT-3 could be if integrated into a smart personal assistant such as Siri. We are already seeing interesting uses of chatbots and intelligent assistants in language learning. A company called LearnFromAnyone is building on top of GPT-3 a kind of automated tutor, which can take on the identity of famous scientists or writers.
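
To make the idea of a model based on usage rather than rules a bit more concrete, here is a toy sketch in Python. It is not how GPT-3 works internally (GPT-3 is a very large neural network, not a table of counts); it only illustrates the general principle of predicting the next word from patterns observed in text. The tiny corpus and the function names are my own assumptions, invented for illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus, purely illustrative (an assumption, not real training data).
CORPUS = [
    "i would like to book a table",
    "i would like to book a ride",
    "i would like a table for two",
]

def build_bigram_counts(sentences):
    """Count how often each word is directly followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if the word was never seen with a follower."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

counts = build_bigram_counts(CORPUS)
print(predict_next(counts, "would"))
print(predict_next(counts, "like"))
```

No grammar rule is written anywhere in this sketch; the prediction comes entirely from how often word pairs occur in the sample sentences. A system like GPT-3 does something far more sophisticated, taking long stretches of context into account, but the reliance on observed usage is the shared idea.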

While GPT-3 and other advanced AI systems represent a significant technical achievement, there are, as I discuss in the article, plenty of reasons to be cautious and thoughtful in their use, as is the case generally with big data in both social and educational contexts. While the language generated by GPT-3 mimics what a human might write in terms of language use, compositional structure, and idea development, the texts don’t always make sense in terms of lived human experience, i.e. demonstrating an understanding of social norms and cultural practices. Human beings have the advantage in communicative effectiveness of having lived in the real world and having developed the pragmatic abilities to generate language that is contingent on human interactions and appropriate to the context. We also can use crucial non-verbal cues, unavailable to a machine: gesture, gaze, posture, intonation, etc.

I argue in the article that a human factor is a crucial mediating factor in implementations of AI systems built on top of big data, particularly in education. Learning analytics (collection of data about student academic performance) tends to treat students as data, not as human beings with complicated lives (especially these days). I discuss these and other ethical and practical issues with data collection and use in the context of D’Ignazio and Klein’s Data feminism (2020). The book explores many examples of inequities in data science, as well as providing useful suggestions for overcoming disparities in data collection (favoring standard language use, for example) and for recognizing and compensating for algorithmic bias.

Cancel culture and shifting power

D. Trump Jr. at the Republican Convention

One of the expressions current in the media is cancel culture, a term heard many times at the Republican Convention in the US last week. At that event it was used as a political weapon against the Democrats; according to BuzzFeed:

A few weeks ago, most Americans either hadn’t heard of “cancel culture” or were quite unfamiliar with the term. And then President Donald Trump’s Republican National Convention began. Since Monday night, primetime convention speakers repeatedly have warned of a future where conservative patriots are silenced and vilified as a nation led by Joe Biden descends into lawlessness. Democrats and the media, they’ve argued, are canceling your beloved founding fathers and will cancel you next if you don’t adhere to their politically correct point of view.

In fact, President Trump’s administration has been active in suppressing speech from opponents, labeling as “fake news” not false reporting, but any news item not supporting the President’s views or actions.

The term cancel culture has been around for a while and has little to do with any conventional understanding of what a “culture” is. Instead it references a social practice, principally on social media, involving ostracizing or shaming someone for their behavior, thereby “cancelling” their participation in human society, making them social outsiders. There have been famous cases in which social media attacks, for perceived or real transgressions, such as offensive tweets in the past (the film director, James Gunn) or calling the police on a black bird watcher (Amy Cooper), have resulted in the targets not only being “cancelled” in the media, but actually losing their jobs.

The phenomenon has been interpreted as indicating a shift of power in society (at least in the US), giving more weight to social media over official government authorities such as the courts or police. As reports of incidents or transgressions turn viral online, immense pressure is placed on those connected to the “cancelled” (employers, landlords, associates) to disassociate themselves from those individuals. The NY Times has run a number of stories on cancel culture, including several by Jonah Engel Bromwich. In one recent piece he commented:

People tend to see cancellation as either wholly good — there are new consequences for saying or doing racist, bigoted or otherwise untenable things — or wholly bad, in that people can lose their reputations and in some cases their jobs, all because a mob has taken undue offense to a clumsy or out-of-context remark. Personally, I think it’s best viewed not as either positive or negative, but as something else: a new development in the way that power works — a development brought about by social media.

The views on whether this is a good development vary. Harper’s Magazine published an open letter, signed by a number of influential public figures, “A Letter on Justice and Open Debate,” which decried the development. The letter received some negative feedback, with the signers being accused of fearing that their own power and influence would be lost. However one might judge cancel culture, it seems undeniable that the power of social media it demonstrates is unlikely to go away any time soon.

Advanced tech: No need to learn a language?

From Ciklopea (Juraj Močilac)

I’m currently in Belgium, attending a conference on language learning and technology (EuroCALL 2019). Many topics are presented and discussed at such conferences, but one which came up repeatedly at this one is the use of smart digital services and devices which incorporate voice recognition and voice synthesis, available in multiple languages. Those include Apple’s Siri, Amazon’s Alexa, and Google Assistant, available on mobile phones/watches, dedicated devices, and smart speakers. In addition, machine translation such as Google Translate is constantly improving, as artificial intelligence advances (especially through neural networks) and large collections of language data (corpora) are collected and tagged. There are also dedicated translation devices being marketed, such as Pocketalk and Illi.

I presented a paper on this topic at a previous conference this summer in Taiwan (PPTell 2019). I summarized current developments in this way:

All these projects and devices have been expanding continuously the number of languages supported, with as well language variations included, such as Australian English, alongside British and North American varieties. Amazon has begun an intriguing project to add additional languages to Alexa. An Alexa skill, Cleo, uses crowdsourcing, inviting users to contribute data to support incorporation of additional languages. Speech recognition and synthesis continue to show significant advancements from year to year. Synthesized voices in particular, have improved tremendously, sounding much less robotic. Google Duplex, for example, has rolled out a service which is now available on both Android and iOS devices to allow users to ask Google Assistant to book a dinner reservation at a restaurant. The user specifies the restaurant, date and time, and the number of the party. Google Assistant places a call to the restaurant and engages in an interaction with the restaurant reservation desk. Google has released audio recordings of such calls, in which the artificial voice sounds remarkably human.

Advances in natural language processing (NLP) will impact all digital language services – making the quality of machine translations more reliable, improving the accuracy of speech recognition, enhancing the quality of speech synthesis, and, finally, rendering conversational abilities more human-like. At the same time, advances in chip design, miniaturization, and batteries, will allow sophisticated language services to be made available on mobile, wearable, and implantable devices. We are already seeing devices on the market which move in this direction. Those include Google Pixel earbuds which recognize and translate user speech into a target language and translate back the partner’s speech into the user’s language.

Conference participant Mark Pegrum kindly summarized some of the other information presented in his blog.

The question I addressed at the conference was whether, given this scenario, there will still be a need for language learning in the future. Can’t we all just use smart devices instead? My conclusion was no:

Even as language assistants become more sophisticated and capable, few would argue that they represent a satisfactory communication scenario. Holding a phone or device, or using earbuds, creates an awkward barrier, an electronic intermediary. That might work satisfactorily for quick information seeking questions but is hardly inviting for an extended conversation, that is, even if the battery held out long enough. Furthermore, in order to have socially and emotionally fulfilling conversations with a fellow human, a device would need support far beyond transactional language situations. Real language use is not primarily transactional, but social, more about building relationships than achieving a goal. Although language consists of repeating patterns, the direction in which a conversation evolves is infinitely variable. Therefore, language support needs to be very robust, to support all the twists and turns of conversational exchanges. Real language use is varied, colorful, and creative and therefore difficult to anticipate. Conversations also don’t develop logically — they progress by stops and starts, including pauses and silences. The verbal language is richly supplemented semantically by paralanguage, facial expressions, and body language. This reality makes NLP all the more difficult. Humans can hear irony and sarcasm in the tone of voice and receive messages accordingly. We understand the clues that nonverbals and the context of the conversation provide for interpreting meaning.

It remains to be seen how technology will evolve to offer language support and instant translation, but despite advances it is hard to imagine a future in which learning a second language is not needed, if only for the insights it provides into other cultures. Smart technology will continue to improve and offer growing convenience and efficiency in providing language services but is not likely to replace the human process of person-to-person communication and the essentially social nature of language learning.

Best time ever to learn languages?

MinorMynas app for language learning

The BBC recently ran a story, “Is this the best time in history to learn a foreign language?”, with the subtitle, “Today’s youngest generation is more multilingual and wired than ever. Could the tech they’re using breed a global army of polyglots?”. I would say that depends. Certainly, the opportunities are there now for language learning through resources on the Internet, most available on mobile phones as well. The article illustrates this combination of collaboration and smartphones with a profile of Hillary Yip, a 13-year-old student from Hong Kong, who created a smartphone app (MinorMynas) that enables young people to connect with one another for the explicit goal of language learning. The article cites increased migration worldwide, which adds to the multicultural make-up of the population in many countries, as one of the developments leading to greater interest in language learning: “This increased migration, especially in cities, brings people with a wide variety of cultural and linguistic backgrounds into close contact. Could a more multicultural world lead to a more multilingual generation?”. Yes, that could be the case, certainly among the younger generation, as the article points out. Unfortunately, in older populations, the influx of newcomers from different cultures may lead to discrimination and nativist political views, as we’ve seen in the US, UK, and elsewhere.

The article references the idea of “translanguaging”, the informal mixing of languages common today on the Internet, as well as in many multicultural classrooms. The opportunities for encountering other languages online are increasing, as social media enables contact with people from around the globe. One of the new options is streaming video in the target language, especially English (with native language subtitles), now available in many countries through Netflix and other services. In many countries, TV shows and movies, shown on TV networks, are dubbed. However, in some countries, such as the Netherlands and Scandinavian countries, videos are shown with the original audio soundtrack. It’s no coincidence that citizens of those countries have typically had better English language skills. The availability of target language videos on commercial services as well as on YouTube offers the possibility of learning or refreshing a language through entertainment. This is, as the article discusses, a way to learn without having the goal to learn, or even without the realization that one is learning.

Several recent studies document this process for learners in Europe and in Brazil. The study from Brazil found that “fully autonomous self-instructed learners” of English gained a high level of proficiency without formal instruction, revealing “how the new affordances for naturalistic learning through the Internet have transformed informal language learning, enabling significant numbers of independent, informal learners in foreign language contexts to achieve very high levels of proficiency” (Cole & Vanderplank, 2016, p. 31). In fact, the study showed that the autonomous learners studied had fewer “fossilized errors” than classroom-based learners at a similar proficiency level, that is, fewer persistent, baked-in errors in grammar or word usage. This may be a wake-up call for instructed language learning, to look at more ways to encourage students to make use of online language resources.

Cole, Jason, and Robert Vanderplank. 2016. “Comparing autonomous and class-based learners in Brazil: Evidence for the present-day advantages of informal, out-of-class learning.” System 61: 31-42.

Tech addictions

The Yondr device for locking up smartphones

The current episode of “On the media” discusses the recent breach of privacy at Facebook and the backlash against the company, which has led to calls to delete the application altogether (#deletefacebook). This is difficult for a variety of reasons. For many people, particular Facebook-based communities have become centrally important in their lives. An article in today’s Wall Street Journal describes leaving Facebook as “socially messy and psychologically fraught”. Many users have gotten so habituated to use of Facebook as a way to connect with friends and families that the idea of leaving would likely mean a loss of meaningful human contact. In some cases, I imagine it may lead to withdrawal-like reactions. There is also a practical problem leaving: People have used their Facebook login as a way to authenticate themselves to different online sites and services and will have to do a lot of new registrations and password resettings.

The problem for Facebook to solve is not an easy one: how to protect users’ privacy when their entire business model is based on being able to supply personal data to advertisers. In other words, how Facebook does business flies in the face of the company’s claim that they “put privacy first.” Somehow, Facebook needs to figure out how to be a viable company (i.e. to make money) while still keeping user data private. Facebook has become a public utility – in that role, being at the same time a profit machine is problematic.

If Facebook is addictive, so are smartphones. Giving up or restricting smartphone use is a topic that’s been trending lately, as in the recent piece in The Atlantic, The Case for Locking Up Your Smartphone. It highlights the use of Yondr devices, small fabric pouches, which close with a proprietary lock that can be opened only with a Yondr-supplied gadget. They have been in use at concerts, as well as in hospitals and law courts, to prevent use of smartphones in those environments. The article also discusses education-related initiatives, such as the proposal by French President Macron to ban student phone use in public schools.

Banning phones from schools, to my mind, is misguided. I made that argument recently in a white paper for Cambridge University Press: Using mobile devices in the language classroom. I argue that for language learning in particular, having access to smartphones in class can bring significant benefits, both in terms of language learning resources and communicative/collaborative opportunities. Banning mobile devices can have undesirable results:

  • A good number of students will likely continue to use their phones, but surreptitiously, possibly resulting in classroom conflict.
  • Prohibiting phones leads students to view what happens in their language classroom as separate from their ‘real’ lives.
  • Students don’t see their devices as potential learning tools, in particular for language study.
  • Classroom instruction does not take advantage of the wealth of tools and resources available for language practice on mobile devices.

The white paper outlines a number of specific activities using smartphones that can enhance student learning in the classroom. I also point out that having a class full of distracted students may tell us something about the dynamics in the classroom:

In a classroom in which students are fully engaged in the learning at hand, there are likely to be fewer bored or distracted students. In communicative language learning, we expect students to be using the language actively as much as possible, collaborating and communicating among themselves, not listening to the teacher lecture. In that sense, introducing mobile devices as a new teaching and learning tool follows this instructional pattern, with the teacher as guide and facilitator.

In the paper I suggest steps for creating a productive environment for mobile use in the classroom: Discussing the issue with all stakeholders – teachers, students, administrators, families – is a recommended first step. Studies have shown that students are very much aware of the distraction factor and are amenable to finding and adhering to a workable set of guidelines.

Leaders of state and language

A recent article in the NY Times discussed the influence that Kazakhstan’s first and only president, Nursultan A. Nazarbayev, has had on the development of a new alphabet for the Kazakh language, which is currently written using a modified version of Cyrillic. Nazarbayev announced in May that the Russian alphabet would be replaced by a new script based on the Latin alphabet. However, there is a problem with representing some sounds in Kazakh. The President has decided, according to the article, “to ignore the advice of specialists and announce a system that uses apostrophes to designate Kazakh sounds that don’t exist in other languages written in the standard Latin script. The Republic of Kazakhstan, for example, will be written in Kazakh as Qazaqstan Respy’bli’kasy.” Apparently, linguists had recommended that the new system follow Turkish, which uses umlauts and other phonetic markers, not apostrophes, but the President has insisted on his system. According to the article, this has led to intense debate in the country, with the President’s views being widely mocked:

“Nobody knows where he got this terrible idea from,” said Timur Kocaoglu, a professor of international relations and Turkish studies at Michigan State, who visited Kazakhstan last year. “Kazakh intellectuals are all laughing and asking: How can you read anything written like this?” The proposed script, he said, “makes your eyes hurt.”

Another President, Donald Trump, has made waves with his use of language as well, as in his re-interpretation of “fake news” to mean any reporting critical of him or his actions, his use of vulgar language to describe African countries, or his habit of using insults and abusive language in his tweets. In a recent interview with Deutsche Welle, the well-known linguist George Lakoff commented on Trump’s use of language. Lakoff laments how the wide-spread reporting of the President’s tweets tends to cite the texts. According to Lakoff, just having that language repeated – even if within an article attacking what was said in the tweet – tends, through repetition, to plant Trump’s ideas in our heads. He points out that this is the strategy used by Russian propagandists and the Islamic State in their online messaging. The Twitter bots used by Russian hackers repeat tweets over and over again, with slightly different texts, but always using hashtags that support divisiveness in the US population and electorate. This was done in the 2016 election, and is continuing, as in the recent bots’ activity in spreading the #releasethememo hashtag, in reference to the controversial classified memo that some Republicans say shows bias in the FBI’s Trump-Russia probe. The aim is to discredit both the FBI and the investigation into Russian electoral interference, to undermine the US people’s faith in their government, and thus to weaken US democracy.

Lakoff’s advice to the media on reporting on Trump tweets:

Don’t retweet him and don’t use the language he uses. Use the language that conveys the truth. Truths are complicated. And seasoned reporters in every news outlet know that truths have the following structure: They have a history, a certain structure and if it is an important truth, there is a moral reason why it is important. And you need to tell what that moral reason is, with all its moral consequences. That is what a truth is.

He’s not advocating stopping news reports on the tweets, but rather putting them into a proper context and pointing to factual discrepancies, when they exist – and, in the reporting, avoiding the inadvertent spread of messaging through textual repetitions. The way that politicians use language can make a big difference in how policies and actions are viewed by the public, especially if a term is repeated frequently. We are seeing that currently in the US in relation to immigration. Trump and Republicans use the term “chain migration”, which has negative connotations, rather than the term preferred by Democrats, “family reunification”, which makes the process sound much more positive.

“Too much technology”

The disabled USS John S. McCain being tugged to shore

That’s the quote from an article in the Navy Times, expressing the author’s take on the root cause for the surprising series of incidents recently involving the U.S. Navy. There were two fatal collisions of Navy warships with commercial vessels, resulting in the deaths of 17 sailors. The culprit, according to the article, and other reports, is the transition from face-to-face training in ship navigation at SWOS (the Surface Warfare Officers School Command) to the use of computer-based training. The article describes the process:

After 2003, each young officer was issued a set of 21 CD-ROMs for computer-based training — jokingly called “SWOS in a Box” — to take with them to sea and learn. Young officers were required to complete this instructor-less course in between earning their shipboard qualifications, management of their divisions and collateral duties.

The article asserts that those navigating the ships which suffered the collisions were trained under this system and concludes that “the Navy’s growing reliance on technology has eroded basic seamanship skills”. A story on National Public Radio last week echoes this view, with the commentator describing the change in the approach to training in this way: “They’ve been given a load of CDs. That’s right – online learning”.

The dismissive tone used here is something one hears often in connection with online learning, with the implication being that of course it cannot be as effective as face-to-face instruction. The problem here, as is often the case, is that this assessment does not consider the nature of the computer-based learning environment. It’s entirely possible to have such instruction be ineffective. Self-paced learning materials which center around static content delivered through slides and documents are not likely to foster effective learning. But a slew of studies have shown that computer-based learning can be effective – if done right, with dynamic, media-rich, and interactive content. It’s particularly effective if incorporated into a socially connected and collaborative online learning environment. I have no idea how well the Navy’s training materials were designed, but neither did the commentators cited above. This kind of undifferentiated assessment of the use of technology in education and training gives a distorted picture of the reality of online learning.

Smartphones & language learning

The Apple Developer Conference is coming up next week and there is a lot of speculation about what Apple will announce, especially in the area of hardware. Most likely there will be new iPad models and new laptops, but probably not the new iPhone model, which is said to be coming later this year. This is the 10th anniversary of the iPhone, which represented a radical re-invention of the mobile phone, soon followed by similar Android phones. I just published a piece in Language Learning & Technology on what the smartphone has meant for literacy training and language learning.

Here are some quotes from the column, slightly edited (references removed):

The most successful mobile apps and services feature contextualized learning through an ecological approach. Apps can place language and culture learning into a localized setting, while also leveraging the resources of the global network. That makes available both social connectivity and worldwide information sources. In the process, learning content is customized and personalized, allowing the user to integrate new knowledge and skills into a real-world setting. This approach brings into play three major affordances of the mobile complex, which will be discussed here: situated learning, local and global integration, and personal empowerment.

Situated learning
One of the most powerful affordances of smartphones is situated learning.  Embedding activities and language in real-world environments holds the potential to make learning more meaningful and memorable. The built-in GPS, mapping, and touch control graphics of smartphones allow mobile apps and services to be location-aware and to provide continuous updates as a user moves from place to place.

[Many examples provided, such as a mobile app helping immigrants to Great Britain and mobile games for language learning]

Incidental language learning [is] more easily implemented through the smartphone environment. As users go about their daily lives, a learning companion is always available—a kind of personal tutor, available for consultations on demand, somewhat like the companion in Rousseau’s Émile (1762). Rousseau postulated that Emile’s curiosity would not only prompt the child to ask questions, but that information provided in response, since given in a particular context and location, would be retained longer. A series of contextualized learning experiences is the kind of “cognitive apprenticeship” smartphones may help develop. Learning through concrete, lived experiences, integrated into everyday life, can provide a powerful instrument for more effective language acquisition.

Local Agency and Global Reach
Smartphones are uniquely equipped to support localized use, while making available all the resources of a global network. One of the seemingly minor but, for linguistic purposes, highly useful innovations of the iPhone was the elimination of the physical keyboard, using instead an on-screen keyboard. While there were initial complaints over the difficulty of typing accurately with fingers, now that screen sizes have grown, and auto-correct algorithms have improved, users have gotten used to this form of text entry. The major benefit has been to make it much easier to support different writing systems.

Having mobile devices support a local language makes them into powerful tools for teaching literacy. In many developing economies, where there is not a fully developed landline phone and Internet system, mobile phones provide voice telephony, text messaging, and Internet access. This is particularly the case in isolated and rural areas, where solar and other alternative power sources can be used. In such environments, or among scattered urban groups, there may be limited access to schools or libraries, so that mobile devices offer a unique opportunity for the delivery of education.

[Examples discussed include support for endangered languages and for literacy projects worldwide]

Smartphone users, from virtually any connected location, have the power to connect interactively to a wide array of educational opportunities. This is an invaluable tool for enabling educational services in far-flung locations and supporting distance learning, but it also offers face-to-face instruction a means for students to learn on the go wherever they may be.

Personal Empowerment
Every smartphone is configured differently, customized as to language and locale, and loaded with apps of the user’s choosing. One of the difficulties in being able to measure the efficacy of MALL projects [mobile assisted language learning] is that the typical student will have access to and be using daily a variety of online tools and services. Some may be in the target language (foreign newspapers or TV stations) or be designed for language study, ranging from basic tools such as dual-language dictionaries to sophisticated services such as Babbel or Duolingo. The extent to which language learners take advantage of such resources will vary with the individual and the context of learning.

While language learning may not be an issue of central importance in the lives of many of our students, learning a second language, along with the cultural framework that comes with it, is a matter of crucial importance to one population: migrants and refugees. For these groups, mobile phones are a powerful instrument in potentially life-changing (or life-threatening) situations.

[Examples given of NGOs and government agencies that have created apps to help immigrants with areas such as language learning and enculturation into the new country]

The column concludes:
Smartphones do not seem likely to be going away anytime in the foreseeable future. While the pace of innovation has slowed, new features will continue to be added as the devices become thinner but more powerful. As inexpensive smartphone models proliferate, feature phones have been forced to add features formerly found only on expensive smartphones. This should enable the spread of smartphone-like capabilities to more communities. This, in turn, will encourage further development of mobile-enabled literacy projects and language learning applications. Language learners will continue to use regular commercial apps for socially based or incidental language learning, while taking advantage of utility apps for translation and dictionary look-ups.

While smartphones have clearly moved from the category of fun toys to that of powerful pocket computers, it is no easy task to harness the computing, communication, and collaboration capabilities for the purpose of serious learning. For instructed language learning, the mobile complex, developed around the smartphone, provides both challenges and opportunities. The main challenge is to provide to students the skills and knowledge to be informed and engaged online learners. Important in that process is presenting persuasive illustrations of learning connected to students’ lives (present and future) and bringing those experiences into the classroom. The most effective way to do that may be through the smartphone they likely all own. The opportunity is to leverage those digital devices and online experiences to enable and encourage in our students life-long learning, learner autonomy, and critical digital literacy.