Friday, September 27, 2024

Can Generative AI help us analyze The Lord of the Rings?

Generative AI is everywhere these days. Some of its output can seem pretty impressive too, at least at a glance. So it’s natural to wonder whether we might be able to use this emerging technology to analyze our favorite works of literature in valuable, labor-saving ways. After all — as is well known by now — Generative AI has gobbled up practically every work of literature ever published as part of its training, copyrights be damned. I can’t help remembering Sméagol’s warning about Sauron, “Don’t take the Precious to Him! He’ll eat us all, if He gets it, eat all the world.” Large Language Models seem to be in the “eating all the world” stage now, and Sam Altman is no Sam Gamgee!

So, can this technology help us analyze The Lord of the Rings? The short answer is: no, not really, not yet, and maybe not ever. If that’s enough for you, you can stop reading now. :)

I’ve been experimenting with ChatGPT, Gemini, and other Generative AI tools for some time, but while they can be useful for certain tasks, they will not be replacing literary scholars (or other professions) any time soon. A scenario occurred to me recently, one that I thought might be suitable for the kinds of things Generative AI does fairly well — if the results were accurate. But hallucinations and other defects are still common, so I put it to the test. Some of you may remember my work on hapax legomena in The Lord of the Rings, which I wrote about here. Unfortunately, the Scholars Forum referred to is long gone (a long story), but the essay was later published.

I wondered whether this was the kind of analysis Generative AI might help with, so I tried this prompt with Google Gemini:

Please provide a list of 25 hapax legomena occurring in the novel, The Lord of the Rings, ignoring all paratextual elements.

This prompt tests a few things, even before you get to the results: (1) does Gemini understand what a hapax legomenon is? (2) Does it have access to the text of The Lord of the Rings? (3) Does it know what we mean by the paratext of the novel?

Let me break down the response into two parts. I’ll start with the surrounding explanatory content, and then I’ll move on to the list of words the chatbot provided.

Disclaimer: While I can provide a list of words that appear only once in The Lord of the Rings, it’s important to note that Tolkien’s world-building often involves the creation of unique languages and terms. As such, many words that might seem like hapax legomena are actually part of these invented languages, making them less truly unique in the context of Middle-earth.

Additionally, the concept of a “hapax legomena” [sic] can be subjective. Depending on how we define a “word” (e.g., including compounds, derivatives, or different grammatical forms), the list could vary significantly.

Note: This list is not exhaustive and may contain errors or omissions. It’s also possible that some of these words might appear in other works by Tolkien or in different editions of The Lord of the Rings.

This explanatory text provides some pretty good context. Gemini clearly understands what a hapax legomenon is, and it seems to have access to the text of The Lord of the Rings. It even seems to know there are different editions. It got “Middle-earth” right too; many people, let alone AI models, get that wrong. I was also impressed to see that it was aware of Tolkien’s glossopoeia and that this might inform how it should interpret the prompt. By hapax legomena, are we interested in, say, Elvish words too, or are we not? That’s for us to decide, but Gemini realized this could be an important distinction.

So far, so good. But now, let’s look at the list Gemini provided:

  1. Glimmering: Used to describe the light of the moon in the Shire.
  2. Windlestraw: A type of grass mentioned in the Shire.
  3. Wainwright: A person who makes or repairs wagons.
  4. Garnishing: Used to describe the decorations on the walls of Elrond’s house.
  5. Tarn: A small mountain lake.
  6. Whirligig: A type of spinning toy.
  7. Hobbity: Adjective used to describe something related to hobbits.
  8. Hitherward: Toward this place.
  9. Whitherward: Toward what place.
  10. Thenceforth: From that time on.
  11. Whilom: Formerly.
  12. Withered: Dried up or shriveled.
  13. Withering: Causing to dry up or shrivel.
  14. Witheringly: In a withering manner.
  15. Whither: To what place.
  16. Whitherward: Toward what place.
  17. Whithersoever: To whatever place.
  18. Whithersoever: To whatever place.
  19. Whitherward: Toward what place.
  20. Whitherward: Toward what place.
  21. Whitherward: Toward what place.
  22. Whitherward: Toward what place.
  23. Whitherward: Toward what place.
  24. Whitherward: Toward what place.
  25. Whitherward: Toward what place.

Well, this is not so good. The most obvious problem is that the word “whitherward” is given here nine times, along with several variants, including “whithersoever” given twice, and three variants on “wither”, which, worryingly, I suspect Gemini thinks is related to “whither”. All of this seems to ignore Gemini’s own preliminary comment about “compounds, derivatives, or different grammatical forms” too. It almost looks as if the model ran through the alphabet and got to W with quite a few spots left to fill in my requested list of 25 words.

But the problem is actually worse.

Let’s go through the list. Some of these words — “glimmering”, “withered”, “withering”, “whither” — occur multiple times in the novel. These aren’t hapax legomena at all. Not even close.

Most of the others — “windlestraw”, “wainwright”, “garnishing”, “tarn”, “whirligig”, “hobbity”, “hitherward”, “whitherward”, “thenceforth”, “whilom”, “witheringly”, “whitherward” — do not occur in the novel at all. Where did they come from? The films? Fan fiction? Who knows? These are what you might properly call hallucinations, in the terminology of Generative AI. Especially “garnishing”, which is supposedly “used to describe the decorations on the walls of Elrond’s house”.

Only one of the 25 words offered is actually a hapax legomenon in The Lord of the Rings, “whithersoever”, which occurs in only one place:

‘Do I not say truly, Gandalf,’ said Aragorn at last, ‘that you could go whithersoever you wished quicker than I? And this I also say: you are our captain and our banner. The Dark Lord has Nine. But we have One, mightier than they: the White Rider. He has passed through the fire and the abyss, and they shall fear him. We will go where he leads.’ [1]
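For anyone who wants to run this kind of check mechanically rather than trust a chatbot, here is a minimal Python sketch. It is not the method behind my earlier essay, just an illustration. It assumes you have a plain-text copy of the novel with the paratext already stripped out, and the function names (`word_counts`, `hapax_legomena`, `check_candidates`) are my own inventions. It counts surface forms after case-folding, so “withered” and “withering” are treated as different words; whether to merge such forms is exactly the definitional choice Gemini’s disclaimer alluded to.

```python
import re
from collections import Counter

# A "word" here is a run of letters, optionally joined by apostrophes or
# hyphens (so "Middle-earth" counts as one word). This is an assumption;
# a different definition of "word" will give a different list.
WORD_RE = re.compile(r"[^\W\d_]+(?:['’-][^\W\d_]+)*")


def word_counts(text):
    """Count case-folded word forms in the text."""
    return Counter(m.group(0).casefold() for m in WORD_RE.finditer(text))


def hapax_legomena(text):
    """Return, in alphabetical order, every word form occurring exactly once."""
    counts = word_counts(text)
    return sorted(word for word, n in counts.items() if n == 1)


def check_candidates(candidates, text):
    """Map each candidate word to the number of times it occurs in the text."""
    counts = word_counts(text)
    return {word: counts[word.casefold()] for word in candidates}


# A toy sample, NOT the text of the novel.
sample = (
    "The leaves withered. Whither do you go? "
    "Withered leaves fall whithersoever the wind blows."
)

print(check_candidates(["withered", "whithersoever", "whitherward"], sample))
print(hapax_legomena(sample))
```

In the toy sample, a count of 1 marks a hapax legomenon, a count of 0 means the word does not appear at all, and anything higher rules the word out. Run against a real text, `check_candidates` would sort a list like Gemini’s into those same three bins.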

So, it’s clear that Gemini handled this prompt quite badly overall. What about other LLMs? I did try this out with Microsoft Copilot, which uses the ChatGPT LLM. It stated, in part, “I can’t provide a verbatim list of hapax legomena from The Lord of the Rings due to copyright restrictions. […] If you’re interested in exploring hapax legomena in The Lord of the Rings, I recommend checking out scholarly articles or linguistic studies that analyze Tolkien’s language. These sources often provide detailed lists and discussions of such words.”

Here, Microsoft shows that it is aware of the copyright issue, where Google either isn’t aware or doesn’t care. This is a bit glib, though, because ChatGPT has most assuredly also gobbled up The Lord of the Rings, just as Gemini has. It just isn’t going to pull back the curtain for us. Is that better or worse? Hmm.

Following up, I then asked Copilot if it could recommend any specific scholarly articles or linguistic studies on hapax legomena in Tolkien’s works. You can imagine what I was angling for, eh? Was it aware of my own paper? Well, no, unfortunately. It recommended three papers, one on hapax legomena in natural language processing (a computer science topic), one on hapaxes in The Iliad, and one on ancient Akkadian and Ugaritic texts. Worthwhile, no doubt, but a missed opportunity to point an interested user to a directly relevant paper.

We could try other models, but it’s not likely any of this is ready for prime time. I hope amateur and new scholars aren’t relying on these models without double-checking their output, but we may be headed for a time of more and more careless reliance on supposed artificial intelligence. It’s bad enough that some people confuse the novel with the film and television adaptations. Generative AI could introduce a whole new set of problems for researchers.

Will Generative AI ever be good enough to perform this kind of analysis reliably? I have some reasons to doubt it, but elaborating would require quite a detour. I’ll try to address the question in a single paragraph for now. Suffice it to say there are two substantial obstacles: (1) the cost of Generative AI, both direct costs and resource costs such as electricity, water, raw materials for chips, etc., and (2) the fact that the LLMs are running out of genuine human-generated material to train with. Models will increasingly be fed other AI output as new input, which will probably push us to a mediocre plateau in the capabilities of the technology. And the technology is not likely to get much better without a lot more investment, a lot more resource utilization, and a lot more raw material for training. The more people rely on it, the worse it may get, until the return no longer justifies the investment. In this way, Generative AI may be self-limiting. Garbage in, garbage out. This is not to say Generative AI will disappear; it’s just not likely to get a lot better than it is today. (If you want to read more about how Generative AI is a bubble that may be getting ready to burst, take a look at this very long, very detailed essay.)

[1] Tolkien, J.R.R. The Two Towers. Houghton Mifflin, 1965, p. 104.

Friday, September 20, 2024

Tolkien, Goethe, and the long long ago

In a recent Facebook conversation, someone asked about The Making of Middle-earth: The Worlds of Tolkien and The Lord of the Rings by Christopher Snyder. I didn’t recognize it from its cover, but I usually try to help with queries like this. I said, “I haven’t seen it. David Bratman reviewed the first edition in Tolkien Studies Vol. 11 [1]. Fairly mixed review, but overall, sounds like it’s not terrible.” David also covered it more briefly a couple of years later in The Year’s Work in Tolkien Studies Vol. 13 [2]. So, I’d done my duty … but this snagged in the back of my mind for some reason.

I had said I hadn’t seen this book, but funnily enough, I actually have. I’ve got a copy on my own bookshelves! I didn’t recognize it because the one shared on Facebook is a 2022 reissue with a very different cover — a beautiful one, to be fair, though the original one is also nice. Apparently I received a review copy of the original 2013 edition, probably intended for review in Mythprint. But I had never done anything more than flip through it. It’s pretty, with lots of illustrations, but I couldn’t have said more than that.

Reading David Bratman’s review now — which I hadn’t when it first appeared; sorry, David! — I see that Snyder made a passing comparison between Tolkien and Goethe, something I myself explored in much greater detail in a conference paper I gave in Spring 2014, and then published in 2015. This means that Snyder actually beat me to getting the basic idea into print — although his comments are only two paragraphs and a short quotation from the poem. I did a survey of the literature when I wrote my paper, as I always do, but I didn’t discover Snyder’s comments at the time, as his book had only just been published.

All this got me to thinking back. Although I didn’t write this paper until 2014, I originally had the idea much earlier. In June, 2010 — four years earlier! — I pitched it to Thomas Honegger and Fanfan Chen for their (at the time) new journal, Fastitocalon. The idea itself predates that abstract, but the first germ is lost in the fog of memory. At least a year before, I expect, maybe longer. Maybe much longer. I might have been reading Goethe. I might have been listening to Schubert. Who can say?

A couple of weeks after I sent in the proposal, I got some feedback from Thomas (hoping he doesn’t mind my quoting it here):

I’ve received the feedback from several members of our board of advisors and the overall response to your proposal has been positive, though with some critical remarks concerning the ‘scope’ or ‘focus’ (well, that’s what peer-review is good for). I enclose an excerpt from one of the reports below to give you an idea.

What I’d like to suggest is that you slightly shifted the focus from looking for Tolkien’s sources for his Ringwraiths to considering Tolkien’s black riders as one (by now very popular) ‘incarnation’ of an ancient Germanic tradition (regardless of Tolkien’s actual acquaintance with the texts you mention). This would allow you to investigate the ‘Erlkönig’ background, which seems to me highly relevant for the Ringwraiths, yet at the same time it does not limit your argument to ‘sources’ only. What do you think?

To be sure! I was obsessed with digging out sources in those days, but this feedback sounds like the advice I myself would give blundering source-hunters today. Thomas encouraged me to finish and send in the paper. This was right around the time I was co-chairing the annual Mythopoeic Conference in Dallas with Randy Hoyt, and we were both already buried. Not only that, but I was in the middle of working on my book, J.R.R. Tolkien and the Study of His Sources. I had also taken on the job of editor of Mythprint in the Spring of 2010. And I was co-editing C.S. Lewis and the Inklings: Discovering Hidden Truth with Salwa Khoddam and Mark Hall, plus writing other essays for books and journals. Gaaaah!

I had a really self-destructive tendency in those days to overcommit to any and every opportunity that came my way. I still have a bit of a problem with this, but I’ve gotten much better about it. Looking back at that summer, I can’t believe how much I was actually juggling — including my full-time job! It’s a great lesson in what not to do. It’s a miracle I managed to be as effective as I was in all those projects. In any case, I told Thomas I’d finish the paper after Mythcon, intending to get it to him before a November 30 deadline for the issue. Ha! Ha! Ha!

In July, I got another piece of feedback from Thomas:

I’ve received another feedback from one of our readers. Since he makes a rather specific point in a), I forward it to you below (some of his more general points such as under b) have been kind of taken care of by what we discussed earlier on).

The Erlkönig theme itself - and certainly Goethe’s ballad — is fascinating, but the connection to Tolkien’s Old Man Willow episode and the rest needs some strengthening. What I miss is a) a consideration of the “Alder Maiden” episode in MacDonald’s Phantastes (Ch. 6), which is relevant for the “tree spirit” interpretation and which Tolkien, via Lewis’s recommendation, is likely to have known, and, perhaps even more important, b) some reflections on what is gained by such a (rather speculative) linking of different texts. ([… which] perhaps could be used to gain or support some insights into the intertextual quality of fantasy fiction in general.)

I hadn’t (and still haven’t) read Phantastes, though I’ve meant to and have heard many conference papers on it. So it goes.

Mythcon came and went, and by October I had landed a publisher for my book and started the final push to complete the opus magnum (Thomas’s description) — a very big job requiring my full attention. I wrote to Thomas, who was incidentally also a contributor to my book, that “the sudden acceleration in this project means that I have had to put the Erlkönig / Black Riders comparative piece for Fastitocalon on hold. I’m not going to be able to have that for you next month; it will have to come next spring, I imagine, once the book has moved further along and no longer demands the kind of time it has heretofore.” Thomas nodded back, indicating we could shift our plans for the essay to the second issue of Fastitocalon, planned for 2011.

My book arrived in the summer of 2011, and the Khoddam/Hall/Fisher collection arrived some months after that, but I didn’t manage to return to the essay. I ended up interviewing at Microsoft near the beginning of 2012 and moving across the country in a major career and life pivot in the spring of that year. The essay was never completed for Fastitocalon — nor did Thomas and I end up talking about it any further, though we continued to correspond about lots of other things, as one does. It had effectively slipped my mind — and perhaps his as well. Then, in January 2014, the Call for Papers for the C.S. Lewis and Inklings Society’s annual conference hit my inbox, and as I cast about for an idea — remember, I couldn’t say no to anything back then — I remembered the planned paper.

I finished the essay in February 2014, still thinking that I alone in all the universe — my crest has now fallen — had made this connection between Goethe’s poem and the episode in The Lord of the Rings.

David Bratman’s review was announced in July 2014, but the issue didn’t reach people until at least November. David mentioned the Tolkien/Goethe/Schubert connection explicitly, but I had finished my paper nine months before I could have seen his review, even if I read it when it was published, which I’m sure I didn’t.

Of note in David’s review:

A comparison of the wounded Frodo’s journey from Weathertop to Rivendell to the Goethe-Schubert song “Der Erlkönig” works better in relation to the more frantic pace of the movie than of the wearying slog of the book (132–33). [3]

At the Mythcon in Colorado Springs in 2015, I read the paper again (about a year and a half after I had read it for the CSLIS). There, I did something I hadn’t done before but had always envisioned as part of making the case for the paper. I played the scene from the Peter Jackson film adaptation of The Fellowship of the Ring, but I substituted the Franz Schubert Lied, “Erlkönig”, for the film’s audio. That’s something you just can’t do in print, and it was something David had almost seemed to be asking for in his review (again, if I had read it; which, again, I hadn’t).

On his blog, David recounted highlights from Mythcon, in which he wrote:

Better music made an appearance in today’s program when Jason Fisher delivered an erudite paper on resemblance and possible influence of Goethe’s “Erlkönig” on Tolkien’s Black Riders and Old Man Willow. He even went so far as to play a clip of the Ford of Bruinen scene from Jackson’s “Fellowship” with Schubert’s setting of Goethe substituted for whatever crap Howard Shore wrote for that scene. It was so much better this way it wasn’t funny. I suggest we make a full Schubert lieder recital as a substitute soundtrack for the entire movie. Except then I’d have to watch the thing again.

So that was incredibly gratifying. And perhaps as a tease, I might say that I have another paper that has been years in the works in which I might be pulling the same trick!

So, what is the point of this meandering recollection?

I suppose it may be to ruminate on just how much you can forget, even of your own work; how easy it is to overlook the work others are doing (and there is a lot of it); and the delights of remembering and retracing your steps later on.

It may also be a cautionary tale on delaying the pursuit of an idea too long. Like the child in the poem, an idea may die if you can’t get it home in time. Even now, I still have old notes and abstracts I’ve never returned to. With such delays, this means that occasionally someone will beat me to getting a connection or discovery into print. I used to lose sleep over being “scooped”, but this is much less important to me now. Now, I’m more likely to greet such an essay with a sigh of relief. So long as the idea gets out there, that’s a mountain of work I don’t have to do!

And one final note. It’s been ten years, but perhaps someone wants to take Thomas Honegger’s anonymous reviewer’s idea to heart and bring MacDonald’s Phantastes back into this discussion. Anyone?

[1] Bratman, David. Rev. of The Making of Middle-earth, by Christopher Snyder; The Essential Tolkien Trivia and Quiz Book, by William MacKay. Tolkien Studies 11 (2014): 254–57.

[2] Bratman, David, Edith L. Crowe, Jason Fisher, John Wm. Houghton, John Magoun, and Robin Anne Reid. “The Year’s Work in Tolkien Studies 2013.” Tolkien Studies 13 (2016): 223–300, p. 225.

[3] See [1], p. 256.