The Mystique Of Language Generation

I was thinking about a recent comment on something I posted (as is my wont), and I realised that there may be a slight bias, even (gasp) hypocrisy, in my views about this whole AI thing. As anybody who has read any of my recent pieces about Synthography or the AI Revolution might have picked up (go on, I’ll wait, it’s only a few thousand words), for various reasons I’m a big fan of AI image generation. However, I’ve come to notice that I’m a great deal more ambivalent toward Large Language Models (LLMs) and AI text generation.

Part of this may be to do with the fact that I am in some senses, (and always will be), a writer. So I may feel a certain level of professional competitiveness. I am luckily not the kind of writer whose job is likely to be threatened by AI, but I understand and sympathise with the threat faced, and the concern felt, by those who are.

Largely though, I think my ambivalence springs from my experience with its “proficiency,” and perhaps a certain familiarity with its process. Language generation itself holds no mystique for me. I can generate thousands of words on almost any topic you care to mention with a bit of preparation. :D

Art and imagery however, are for me a strange and fascinating country that I navigate in semi-awe. But words? Words are all I have.

The Meaning Of Words

For my entire life, I’ve been surrounded by words: from the bedtime stories my parents read me every night until my mother (a teacher, desperate to get me to stop talking) taught me to read at a young age, to the books which line the walls of my study right now, and even the words appearing on this screen as I tap at the keys. As an aphantasic, words are, and have always been, the fundamental way in which I interact with the world.

I’ve spent almost all of my adult life making a living from writing things for people, or editing things that other people have written. Even now that it’s not actually technically my job any more, it’s still something I do a lot of, because I enjoy it, and there are people who think I’m quite good at it. And as somebody who has spent their life working with words in one way or another, and whose brain has no other means than words to parse and comprehend things, I am rarely, very rarely, at a loss for them.

The Magic Box Of Words

As a result, I find no need to ask a magic box to provide me with a plausible written description or explanation of something, and nor do I find its ability to do so surprising. For all the discussion about plagiarism machines, and scraping people’s content from the web, the fact is that almost every word in existence has been used before by somebody else.

All we really do is rearrange them in a way that’s pleasing to the ear and mind, and that suits whatever information we want to convey. Except for the cases where the literal words, in the same order and with the same intent, are spat out verbatim (or near enough to be distinctive) with no attribution, which does sometimes happen, probably as a result of over-training or of limited training data on certain topics, I’m not particularly predisposed to using the term “plagiarism.”

The Real Problem

No, the problem with LLMs is not that they learned to create word patterns based on the analysis of millions of previous word patterns, nor that they produce largely quite plausible word patterns when prompted. The real problem with LLMs is the way that people are using them.

Primed for so long (it feels like) for the oncoming machine intelligence, humanity has grasped at the very first glimpse of it like it was some holy grail. The eyes of the companies who create and supply it lit up. So did the eyes of the companies who make things that could possibly integrate with it. And we very humanly rushed into embracing it as an answer to so many problems.

The Practicality Of Realism

And as I’ve mentioned before, it can and does help with many problems. It’s folding proteins and decoding ancient manuscripts. It’s writing code and designing chips and diagnosing illnesses. And all those things are great. But it’s also being pushed into replacing critical thought. It’s being touted as being able to teach students. To recognise criminals and crimes and weapons. To make target selection decisions faster than people. They’re testing how AI will perform in war games. And of course, to analyse the vast swathes of data that are collected about us.

Recent studies have shown a rise in school and university students using it amidst declining academic performance and, even more interestingly, that it has become the new favourite method of so-called “cognitive offloading,” with an associated decline in critical thinking.

Now, cognitive offloading is nothing new. We started when the first person scratched a rudimentary map onto a stone to help them get to the new hunting grounds. Keep an appointment book? Use a calculator? Store people’s phone numbers in your phone? Cognitive offloading. And the reason we do this is because it reduces the load our own brains have to carry. And we have evolved to conserve as much energy as we can. That’s why short-cuts are a thing. (Any kind of short-cut. Keyboard short-cuts exist because we wanted to achieve the same result with less effort. (Remote controls too.))

The reality is that if something offers us a quicker, easier, less energy intensive way of achieving the same (or substantially similar and “good enough”) result, we as a species are going to leap at the opportunity. It’s the sensible thing to do on an individual short-term level.

But The Text…

LLMs are very good at producing believable word patterns. That is exactly what they were designed for: to use complex mathematical modelling to determine the next word in a sequence of words, based on the probability of a given word appearing next amongst the countless millions of word patterns they were fed.
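To make the trick concrete, here’s a deliberately tiny sketch in Python. Everything in it is invented for illustration (the probability table most of all; a real model derives its probabilities from billions of learned parameters across a vocabulary of tens of thousands of tokens), but the principle is the one that matters: pick a plausible next word, not a correct one.

```python
# Toy next-word prediction. The probabilities below are invented for
# illustration; a real LLM computes them from billions of learned
# parameters, not a hand-written dictionary.
import random

# Hypothetical distribution over the word following "once upon a"
next_word_probs = {
    "time": 0.90,
    "mattress": 0.06,
    "hill": 0.04,
}

def sample_next_word(probs):
    """Pick the next word in proportion to its probability."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return random.choices(words, weights=weights, k=1)[0]

print("once upon a", sample_next_word(next_word_probs))
# ~90% of the time this prints "once upon a time": plausible, not "correct".
```

Run that in a loop, feeding each chosen word back in as context, and you have the whole business in caricature.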

The coders in my life use it for code. The marketers use it for ad copy and campaign ideas. The content writers produce web content with it. The analysts use it to write reports. And then I check it. And edit it. (Except for the code of course. :D )

And it’s…fine. Mostly. It’s often repetitive, stilted, or awkward. It’s almost always either bland or injected with frivolous “buzz words.” It sometimes struggles to maintain a logical flow, but it’s mostly coherent, which is more than I can say for some of the human stuff I’ve worked on over the years.

For all the hype, hope and investment, the key thing to remember is that it was effectively built with the Turing test in mind.

In the test, a human evaluator judges a text transcript of a natural-language conversation between a human and a machine. The evaluator tries to identify the machine, and the machine passes if the evaluator cannot reliably tell them apart. The results would not depend on the machine’s ability to answer questions correctly, only on how closely its answers resembled those of a human. (Wikipedia:Turing Test)

The results do not depend on the ability to answer correctly, only on how closely its answers resemble those of a human.

Talking Without Thinking

LLMs have smashed the Turing test. Even the most rigorous modern versions of the test have been passed by current LLMs. But humans couldn’t leave it at that. Even Turing didn’t seem to focus on the fact that the ability to create plausible word patterns did not mean that cognition was taking place, and the average lay-person even less so. Why search the web and evaluate possible results when you can ask the magic word box to explain something to you in a short paragraph…and watch the words appear on your screen as if somebody were talking to you?

And so, we have rushed into the next stage of cognitive offloading…a device that has every appearance of thinking for us. And because we have been conditioned to interpret text as the output of thought, we struggle not to see the so-called AI as thinking, rather than as assembling a pattern from whatever is probably the most plausible, most human-sounding next word in the sentence. Not the next correct word. The next word that fits a pattern that most resembles human speech.

The Mystique…

And because, let’s be honest, the majority of people struggle to effectively express themselves verbally, or to communicate clearly, especially in writing, it seems almost magical how the word box can convert your half-coherent mumblings into prose that looks good. I see clients using it for emails, suppliers using it for proposals. And it looks good enough. It’s not, of course. But for the average person, it’s a big step up in readability, and an even bigger energy saving.

As a lover of language and the written word though, what enchants me is a writer’s turn of phrase. The elegant or clever combination of words everybody knows into a description or explanation or sensibility that is unique. That captures more than basic meaning. That is evocative of something.

And although anything I write is ostensibly for the purposes of communicating some sort of information, (even if it is only what I happen to think about something), its underlying intent is the attempt to capture the nuance, the subtlety of thought and emotion, that only a well-crafted line can convey.

The Root Cause

My ambivalence toward LLMs, then, is I think rooted in a sense of disdain: a reasonable understanding of the fundamental process by which the text is produced, and the absence of that admiration for wit, insight, or an evocative phrase which characterises an appreciation of good writing.

In that sense, the AI opponents are not wrong. And nor are they wrong to call a great majority of AI output “slop.” It is indeed bland, regurgitated phrasing piling up abundantly. I’d say something about it ruining the internet, but the truth is we already did that once for SEO…

(I still remember “article spinning” where people used specialised (if fairly basic) software to convert an article about [keyword] into 20 or 50 or 100 articles about [keyword] with the automatic replacement of words with synonyms or variations, to “generate” “unique” content in an attempt to rank for [keyword]. Some of them even automatically published your “articles” to content farms or directory sites or whatever. (It was just as awful as it sounds. :D ))
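For the curious, the mechanics were roughly as sophisticated as this little sketch suggests (the synonym table and the “article” are invented here, but the blind word-swapping is true to how those tools worked):

```python
# Toy "article spinner" of the kind described above: blind synonym
# swaps with zero understanding of context. Table invented for the example.
import random

SYNONYMS = {
    "buy": ["purchase", "acquire", "grab"],
    "great": ["excellent", "fantastic", "superb"],
    "cheap": ["affordable", "budget", "bargain"],
}

def spin(text):
    """Replace each known word with a randomly chosen synonym."""
    return " ".join(random.choice(SYNONYMS.get(w, [w])) for w in text.split())

article = "buy great widgets at cheap prices"
for _ in range(3):
    print(spin(article))  # three "unique" articles, all equally awful
```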

AI text is good enough for what it was intended for, which was to generate human readable text that’s ostensibly indistinguishable from human generated text. Of course, we’re using it for a lot more now than it was intended for, and as a result, a lot of noise is being added to the signal. But systems will learn to filter it out. They’ll have to.*

*A Side Note On Model Collapse
One of the big reasons they'll have to is an interesting phenomenon known as "model collapse." One of the recent claims about a bottleneck in AI progress (denied by some companies with a financial interest in constantly churning out better models) is that the labs have effectively already scraped everything human-written on the 'net to build their models in the first place.

And it turns out that for all their strengths, you can't train AI models on AI generated text. Although it's (kinda) indistinguishable to most human readers, the lack of authentic human-generated patterns in text has an impact on the LLMs themselves, (even if they can't reliably identify it yet), and training models on AI generated text steadily degrades the performance of the models as they lose track of what things mean. :D The patterns start to collapse, and as the paper linked to up ^there^ says, within a few generations they become incomprehensible.

(This appears less true with image models...there are plenty of image models trained on AI images. It can have negative effects, but they're mitigated by good training and tagging etc.)
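If you want a feel for why the collapse happens, here’s a toy statistical analogy in Python. It has nothing to do with actual LLM training; it just runs the same feedback loop in miniature, with a simple Gaussian standing in for the “model”:

```python
# Toy analogy for model collapse: each "generation" is trained only on
# the previous generation's output. Not an LLM experiment, just the same
# feedback loop in miniature, with a Gaussian as the "model."
import random
import statistics

data = [random.gauss(0.0, 1.0) for _ in range(20)]  # gen 0: "human" data

for generation in range(1, 31):
    mu = statistics.fmean(data)     # "train" on whatever data we have
    sigma = statistics.stdev(data)
    data = [random.gauss(mu, sigma) for _ in range(20)]  # next gen's corpus
    print(f"gen {generation:2d}: mean={mu:+.3f} stdev={sigma:.3f}")

# Run it a few times: the fitted spread typically withers away over the
# generations as sampling noise compounds, and the tails (the rare,
# interesting stuff) are the first thing to go.
```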

The Other Side Of The Coin

Image generation on the other hand…well, you can read those pieces I linked to in the first couple of paragraphs for a more detailed explanation, but to cut a long story short, as an aphantasic, I can’t mentally visualise anything. As such, I tend to view image generation as a sort of “prosthetic” mind’s eye, and I’m constantly amazed by the ability it gives me to see a visual representation of something I thought about.

I’m aware of the debates and the questions around training models, and the legitimate copyright issues involved, and of course the often vehement discussions around whether AI art is “art” at all. (My view is that it is indeed art…that anything which allows a person to express their feelings and emotions is art, regardless of the tools by which they achieve that expression, but that’s a discussion for another day perhaps.)

All these points notwithstanding though, it is probably my own inability to create visual art, and my unfamiliarity with the process by which it’s created, that leaves me somewhat in awe of the facility with which it can do so. Perhaps if my medium were the visual arts I would be more sensitive to its weaknesses, or more threatened by its strengths in that field. Maybe if I could see pictures in my head I’d be less fascinated by the creation of images. Words on the other hand, well, like I said…the creation of words is hardly a mystery to me.

The Hypocrisy Of Inconsistency

So? Am I a hypocrite? Is it inconsistent of me to denigrate LLMs while using and enjoying image generators? Am I negative about something that (sorta) affects me and my skill-set, while embracing something which affects others more negatively?

I like to think not. Or not very much anyway. If nothing else, I can at least see why I treat those two imposters a little differently. But for those who believe I am deluding myself, I leave you with the words of that great American poet, Walt Whitman:

Do I contradict myself?
Very well then I contradict myself,
(I am large, I contain multitudes.)

– Walt Whitman – Song of Myself 51