Generative AI is everywhere these days. Some of its output can seem pretty impressive too, at least at a glance. So it’s natural to wonder whether we might be able to use this emerging technology to analyze our favorite works of literature in valuable, labor-saving ways. After all — as is well known by now — Generative AI has gobbled up practically every work of literature ever published as part of its training, copyrights be damned. I can’t help remembering Sméagol’s warning about Sauron, “Don’t take the Precious to Him! He’ll eat us all, if He gets it, eat all the world.” Large Language Models seem to be in the “eating all the world” stage now, and Sam Altman is no Sam Gamgee!
So, can this technology help us analyze The Lord of the Rings? The short answer is: no, not really, not yet, and maybe not ever. If that’s enough for you, you can stop reading now. :)
I’ve been experimenting with ChatGPT, Gemini, and other Generative AI tools for some time, but while they can be useful for certain tasks, they will not be replacing literary scholars (or other professionals) any time soon. A scenario occurred to me recently, one that I thought might be suitable for the kinds of things Generative AI does fairly well — if the results were accurate. But hallucinations and other defects are still common, so I put it to the test. Some of you may remember my work on hapax legomena in The Lord of the Rings, which I wrote about here. Unfortunately, the Scholars Forum referred to there is long gone (a long story), but the essay was later published.
I wondered whether this was the kind of analysis Generative AI might help with, so I tried this prompt with Google Gemini:
Please provide a list of 25 hapax legomena occurring in the novel, The Lord of the Rings, ignoring all paratextual elements.
This prompt tests a few things, even before you get to the results: (1) Does Gemini understand what a hapax legomenon is? (2) Does it have access to the text of The Lord of the Rings? (3) Does it know what we mean by the paratext of the novel?
Let me break down the response into two parts. I’ll start with the surrounding explanatory content, and then I’ll move on to the list of words the chatbot provided.
Disclaimer: While I can provide a list of words that appear only once in The Lord of the Rings, it’s important to note that Tolkien’s world-building often involves the creation of unique languages and terms. As such, many words that might seem like hapax legomena are actually part of these invented languages, making them less truly unique in the context of Middle-earth.
Additionally, the concept of a “hapax legomena” [sic] can be subjective. Depending on how we define a “word” (e.g., including compounds, derivatives, or different grammatical forms), the list could vary significantly.
Note: This list is not exhaustive and may contain errors or omissions. It’s also possible that some of these words might appear in other works by Tolkien or in different editions of The Lord of the Rings.
This explanatory text provides some pretty good context. Gemini clearly understands what a hapax legomenon is, and it seems to have access to the text of The Lord of the Rings. It even seems to know there are different editions. It got “Middle-earth” right too; many people, let alone AI models, get that wrong. I was also impressed to see that it was aware of Tolkien’s glossopoeia and that this might inform how it should interpret the prompt. By hapax legomena, are we interested in, say, Elvish words too, or are we not? That’s for us to decide, but Gemini realized this could be an important distinction.
So far, so good. But now let’s look at the list Gemini provided:
- Glimmering: Used to describe the light of the moon in the Shire.
- Windlestraw: A type of grass mentioned in the Shire.
- Wainwright: A person who makes or repairs wagons.
- Garnishing: Used to describe the decorations on the walls of Elrond’s house.
- Tarn: A small mountain lake.
- Whirligig: A type of spinning toy.
- Hobbity: Adjective used to describe something related to hobbits.
- Hitherward: Toward this place.
- Whitherward: Toward what place.
- Thenceforth: From that time on.
- Whilom: Formerly.
- Withered: Dried up or shriveled.
- Withering: Causing to dry up or shrivel.
- Witheringly: In a withering manner.
- Whither: To what place.
- Whitherward: Toward what place.
- Whithersoever: To whatever place.
- Whithersoever: To whatever place.
- Whitherward: Toward what place.
- Whitherward: Toward what place.
- Whitherward: Toward what place.
- Whitherward: Toward what place.
- Whitherward: Toward what place.
- Whitherward: Toward what place.
- Whitherward: Toward what place.
Well, this is not so good. The most obvious problem is that the word “whitherward” appears here nine times, along with several near-duplicates: “whithersoever” is given twice, and there are three variants of “wither”, which, worryingly, I suspect Gemini thinks is related to “whither”. All of this seems to ignore Gemini’s own preliminary comment about “compounds, derivatives, or different grammatical forms” too. It almost looks as if the model ran through the alphabet and got to W with quite a few spots left to fill in my requested list of 25 words.
But the problem is actually worse.
Let’s go through the list. Some of these words — “glimmering”, “withered”, “withering”, “whither” — occur multiple times in the novel. These aren’t hapax legomena at all. Not even close.
Most of the others — “windlestraw”, “wainwright”, “garnishing”, “tarn”, “whirligig”, “hobbity”, “hitherward”, “whitherward”, “thenceforth”, “whilom”, “witheringly” — do not occur in the novel at all. Where did they come from? The films? Fan fiction? Who knows? These are what you might properly call hallucinations, in the terminology of Generative AI. Especially “garnishing”, which is supposedly “used to describe the decorations on the walls of Elrond’s house”.
Only one of the 25 words offered is actually a hapax legomenon in The Lord of the Rings, “whithersoever”, which occurs in only one place:
‘Do I not say truly, Gandalf,’ said Aragorn at last, ‘that you could go whithersoever you wished quicker than I? And this I also say: you are our captain and our banner. The Dark Lord has Nine. But we have One, mightier than they: the White Rider. He has passed through the fire and the abyss, and they shall fear him. We will go where he leads.’ [1]
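None of this is hard to verify, by the way. With a plain-text copy of the novel in hand, a word-frequency table takes only a few lines of Python. Here is a rough sketch, assuming the paratext has already been stripped out and using a placeholder filename:

```python
import re
from collections import Counter

# Build a word-frequency table from a plain-text copy of the novel.
# "lotr.txt" is a placeholder path; the paratext (prologue, appendices,
# indexes) would need to be removed first.
with open("lotr.txt", encoding="utf-8") as f:
    text = f.read().lower()

counts = Counter(re.findall(r"[a-z]+(?:['-][a-z]+)*", text))

# Spot-check Gemini's candidates: a true hapax legomenon has a count of 1.
for word in ["glimmering", "whither", "whitherward",
             "whithersoever", "garnishing", "windlestraw"]:
    print(f"{word}: {counts[word]} occurrence(s)")
```

A hallucinated word prints 0, a genuine hapax prints 1, and anything higher disqualifies the word outright.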
So, it’s clear that Gemini handled this prompt quite badly overall. What about other LLMs? I did try this out with Microsoft Copilot, which is built on the same OpenAI models that power ChatGPT. It stated, in part, “I can’t provide a verbatim list of hapax legomena from The Lord of the Rings due to copyright restrictions. […] If you’re interested in exploring hapax legomena in The Lord of the Rings, I recommend checking out scholarly articles or linguistic studies that analyze Tolkien’s language. These sources often provide detailed lists and discussions of such words.”
Here, Microsoft shows that it is aware of the copyright issue, whereas Google either isn’t aware or doesn’t care. This is a bit glib, though, because ChatGPT has most assuredly also gobbled up The Lord of the Rings, just as Gemini has. It just isn’t going to pull back the curtain for us. Is that better or worse? Hmm.
Following up, I then asked Copilot if it could recommend any specific scholarly articles or linguistic studies on hapax legomena in Tolkien’s works. You can imagine what I was angling for, eh? Was it aware of my own paper? Well, no, unfortunately. It recommended three papers: one on hapax legomena in natural language processing (a computer science topic), one on hapaxes in The Iliad, and one on ancient Akkadian and Ugaritic texts. Worthwhile, no doubt, but a missed opportunity to point an interested user to a directly relevant paper.
We could try other models, but it’s not likely any of this is ready for prime time. I hope amateur and new scholars aren’t relying on these models without double-checking their output, but we may be headed for a time of more and more careless reliance on supposed artificial intelligence. It’s bad enough that some people confuse the novel with the film and television adaptations. Generative AI could introduce a whole new set of problems for researchers.
Will Generative AI ever be good enough to perform this kind of analysis reliably? I have some reasons to doubt it, but elaborating would require quite a detour, so I’ll try to address the question in a single paragraph. Suffice it to say there are two substantial obstacles: (1) the cost of Generative AI, both direct costs and resource costs such as electricity, water, and raw materials for chips, and (2) the fact that the LLMs are running out of genuine human-generated material to train with. Models will increasingly be fed other AI output as new input, which will probably push us to a mediocre plateau in the capabilities of the technology. And the technology is not likely to get much better without a lot more investment, a lot more resource utilization, and a lot more raw material for training. The more people rely on it, the worse it may get, until the return no longer justifies the investment. In this way, Generative AI may be self-limiting. Garbage in, garbage out. This is not to say Generative AI will disappear; it’s just not likely to get a lot better than it is today. (If you want to read more about how Generative AI may be a bubble getting ready to burst, take a look at this very long, very detailed essay.)
[1] Tolkien, J.R.R. The Two Towers. Houghton Mifflin, 1965, p. 104.
I think a big problem currently with the mainstream AIs is that they are still mostly trained to imitate behavior and not to actually perform it. Thus, instead of attempting to do a simple word-search function and identify instances of words only appearing once in the search, then looking up those definitions and listing them (which would be a fairly simple routine for any computer program), current AI models attempt to imitate summaries of research. So you wind up with a head-scratching result that makes no sense. In my opinion, things like Google predictive searches, Apple's Siri, etc. were actually far closer to the common conception of "Artificial Intelligence" five years ago, prior to the introduction of what we now call "AI" (ChatGPT etc.). These things are taking the machine learning and artificial intelligence fields backwards. Of course, from a broader perspective that may be a good thing...
You said it! "A simple word-search function [to] identify instances of words only appearing once in the search" is indeed "a fairly simple routine for any computer program", and that's the process I used myself when I first undertook this research fifteen years ago. You're right that today's AI chatbots are designed only to "imitate behavior" — they're just probabilistic inference models of communication. Of course, that doesn't stop those hyping them from bragging that they can ace exams like the SAT, GRE, LSAT, the Bar, and others. But these LLMs notably flunk the AP English exams, both Language and Literature. Isn't that interesting? What is billed first and foremost as a tool with communication skills that can pass the Turing test can't actually handle the exams that are the best test of language skills we have. That's very telling indeed. And while each iteration of these models tends to beat the last one on math, science, etc., their core language skills aren't significantly improving.
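For anyone curious, the whole routine fits in about a dozen lines. This is just a sketch, again assuming a plain-text copy of the novel with the paratext removed (the path is a placeholder):

```python
import re
from collections import Counter

# List every word that occurs exactly once in the text. Note that this
# naive tokenizer treats each inflected form ("wither", "withered",
# "withering") as a separate word, one of the definitional choices
# Gemini's own disclaimer alluded to.
with open("lotr.txt", encoding="utf-8") as f:
    counts = Counter(re.findall(r"[a-z]+(?:['-][a-z]+)*", f.read().lower()))

hapaxes = sorted(w for w, n in counts.items() if n == 1)
print(f"{len(hapaxes)} hapax legomena found")
print(hapaxes[:25])  # the first 25, alphabetically
```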