This one slipped past me when it happened, but one morning recently, the better half asked me if I was familiar with the term, and I was forced to admit ignorance.
In an article published last year (2023) in the journal Philosophy of Photography, Dr. Elke Reinhuber proposed a new term for AI-generated photo-realistic images, one that distinguishes them from images created by optical means. That term was Synthography.
“Synth,” from the Greek “synthetos,” meaning “put together” or “combined,” and “graphy,” from “graphia,” meaning “writing” or “drawing.” As opposed to “photography” which essentially means “drawing with light.”
Not As New As We Think
Of course, design aided by technology is by no means new. (CAD anybody?) Even before the digital age, photographers were using technology like shutter speed and aperture controls to help them automate the process of capturing images. The fractals I was so happily generating in the ’90s (shout out to FracTint) were a much later example, and so were CGI effects in movies. Digital cameras had “AI”-powered features like focus detection, low-light modes, etc., and smartphone cameras had image filters long before AI image generation became mainstream.
The author suggests that all these advances, demonstrating a continuous evolution in image creation toward greater automation and image control, paved the way to the current “full image” generation we have today.
In Favour Of Definition
I’m a big fan of this. I like the term, and I like having definitions for things (definitions are crucial in both philosophy and debate). Once it had been brought to my attention, I started to notice not only that it was being used, but that common usage has already broadened it to include (it seems) all forms of AI-generated imagery. It’s good to have a term for it, both for providing a common basis and for easily distinguishing between these and other types of imagery.
In From The Cold…
This piece was actually begun for quite a different purpose, and I originally intended a far different (and shorter) direction for it, but… After an absence of nearly 10 years from all forms of direct social media, I made the random decision to return to it, after I chanced across an article about the burgeoning new(ish) social network BlueSky and thought I would check it out. I created an account, quickly ran into some folks from back in the G+ days, and started participating a bit. This coincided with, but was unrelated to, the resurrection of this old blog.
As I mentioned in the re-inaugural post of the blog, the reason I’d done that had very much to do with Synthography.
A Prosthetic Mind’s Eye
I am, as I’ve had occasion to mention once or twice, aphantasic. My brain physiologically lacks the ability to visualise anything. My mind’s eye is blind, with various consequences. But what it especially means is that I have never been able to see a visual representation of the things I think of. And I think of a lot of stuff. I’ve also never been able to draw, no matter how hard I tried to learn, which is probably related.
The advent of publicly accessible image generation has been nothing short of amazing to me. For the first time in my life, I can think of something, type some instructions into a computer, and see many different variations of the image my words conjure from the machine. :D It’s literally the closest I can ever get to what Aristotle described as people’s “richly visualised internal lives.”
So…I got very into AI image generation, and quickly came to love this democratisation of art that allows anybody to create images of whatever they can imagine. I set up a local Stable Diffusion instance on my own PC, and was quickly generating hundreds, if not thousands, of images, and almost instantly deleting them, experimenting with prompting and different models and settings etc. etc.
But because I’m a digital hoarder, I wanted somewhere to keep at least some of them, and I decided to fix up the old blog so I had somewhere to do so. (You can check them out yourself if you want in the AI Image Galleries. They’re all “raw prompts” from poetry, literature, music, etc. generated with no style / content instructions.) (I add more from time to time, so Check Back Soon!TM ) :D
The Great Divide
So, anyway, I was pleased to see a lot of these Synthographers posting their work on BlueSky, but I quickly noticed a very stark divide between the people who loved it, and the people who (sometimes vituperatively) condemned it. And while I was a little disappointed, I’m honestly not that surprised either.
We live in a time of great technological upheaval, and with great upheaval, often come great casualties.
It seems many of the more traditional artists (a category which used to mean people who paint or use other physical mediums to produce their art, but which has now, with the advent of a common enemy ;), grown to include digital artists) are demonising, opposing, and belittling AI art, usually claiming that it is not art at all, and pointing out (albeit angrily) the ethical issues involved in the training of the image generation models.
Now, the first part of that argument can clearly be recognised as the “no true Scotsman” fallacy. Remember, it wasn’t long ago that the paintbrushes-and-pencils folks were saying that digital art wasn’t “real” art either, because “any idiot can use the fill tool” or whatever. Digital artists, of course, knew it wasn’t that easy, and that there were skills and techniques and methodologies involved that were not being appreciated by their nay-sayers. (And believe me, the same is true of producing good AI generated art.) (Not that I do, but that’s not the point. :D )
But I don’t think anybody can honestly deny that the second part is a perfectly valid and legitimate concern. And there are, I think, other issues that underlie this kind of response as well, which are equally valid, which I want to touch on first.
A Caveat On Bias
I think it’s important to note here that I do not have skin in this game. I do a lot of writing (and reviewing and editing) as part of my job, but I am not paid to write (technically), nor does my livelihood depend on getting people to buy or pay for my writing. I am definitely not an artist, and my livelihood does not depend on getting people to pay for or commission works of art created by the skill of my hands. I also have only the most basic comprehension of the technicalities behind generative AI. I am, at best, an interested lay-person. :D
Tools Of The Trade
Whenever we enter a time of technological upheaval, particularly when that technology can be used to produce things that are already being produced via other means, those involved in that production come automatically and immediately under threat.
The most famous example, of course, is the Luddites, although they were not the first. Despite the common usage of the term (which I often see levelled against those who oppose the use of AI in art, or anything else), the Luddites were not actually “anti-technology.” What they were really (often violently) protesting was the automation of the textile industry, particularly the introduction of power looms and knitting frames. (Hell, I call myself a “neo-Luddite” because I’m opposed to smartphones, but that’s another story. :D)
Now one of the reasons they gave was the fear that standards of quality would fall, but unsurprisingly, the biggest issue was that they (rightly) feared losing their livelihood. And when people are afraid, it is natural for them to lash out.
Every technological advance of this sort has had a similar trajectory. The introduction of the printing press threatened the livelihoods of engravers and copyists. The camera threatened the livelihood of portrait and landscape painters. After all, who would pay to have a picture painted if somebody could just buy a camera, point it at whatever they wanted, and press a button? The advent of digital art, computerised drawing, etc. was likewise opposed and denigrated by so-called “real” artists on the same grounds.
The Persistence Of Art
Now, these historical examples share some things in common. For one thing, none of the opposition or disdain actually succeeded in preventing the eventually ubiquitous use of the technology they were disdaining.
For another, all of those professions are still around to some degree. People still write or copy out documents for others by hand, still engrave, still weave by hand, still create portraits and landscapes, whether on canvas or with pixels.
It would be extremely disingenuous to claim that nobody was harmed by these things, though. There were (and will be) a multitude of casualties. But the world did not end for artists when cameras became commonplace, or when Photoshop did, and it will not end for traditional artists now that Synthography is battling for its place in the artistic lexicon.
But it will end for some of them. The fear that their skills will become devalued, that people will be less likely to pay them for work which can be done by so-called “artificial intelligence” is not only real, but already a proven outcome. So we should neither be surprised, nor too critical, when people respond to these literally existential fears with anger and resentment.
The Ethical Dilemma
But that is only part of the story. There is another part that is a legitimate concern with both ethical and potentially legal ramifications. As everybody knows by now, the models that both generative language and generative image systems use were trained on the vast corpus of largely publicly available data from the internet: the huge repository of human knowledge, of the output of human skill, that billions of people have been tapping into for the last 30+ years.
This makes perfect sense from the developers’ point of view. All of that information was there, just sort of lying around, and available. Where else would they turn when they wanted to teach large language models how to recognise, analyse and reproduce the patterns inherent in human writing and speech, but the largest collection of human writing known to man? When they wanted to show AI models that “chairs” could look like 10,000 different things, and still be, fundamentally, chairs?
Of course, by virtue of the almost globally recognised legal principle of copyright, a lot of that data belonged to somebody else.
Hanlon’s Razor
According to Hanlon’s Razor, we should not attribute to malice that which can be adequately explained by stupidity. And while I don’t know if I’d strictly call it stupidity, or insensitivity, or lack of care, it seems equally unlikely to me that these computer scientists and developers sat down one day and said, “you know what, let’s just steal all the data we need to train our models and screw the people who created (generated) it.”
As much as we might like to think that, (and the regular cries of “thief!” which dog the generative AI industry show we do), the reality is almost certainly more prosaic. I suspect it just never occurred to them until it was already too late.
Because that’s how humans operate. Fools rush in, and all that kind of thing. There’s barely a technology we’ve created that didn’t immediately outstrip society’s ability to cope with it. Hell, the legal system has still not fully come to terms with the digital age, and now it appears, at the very least, likely that we’re hitting a significant new inflection point altogether.
Of course, other factors may well have contributed: financial constraints, a lack of consideration of artists’ rights, and so on. But outright malice seems an unlikely one.
Selective Outrage?
I did wonder for a while why people don’t seem to be complaining about the processing of hundreds of thousands of x-rays, scans, results etc. to create models that improve medical diagnosis, or predict the structure of proteins, or identify text on burned papyrus scrolls from Herculaneum. The outrage, it seems, is reserved largely for the generation of images and text.
But it occurred to me that it’s probably largely a question of need and use case, and perhaps to an extent of scale. There are far more people trying to make a living through their art (and I include writing in this) than there are doing medical research and diagnosis, or folding proteins, or trying to conserve 2,000-year-old scrolls that have been through a volcanic eruption. And those who are seem far more likely to see it as an incredibly useful research and study tool, with attendant savings in time, effort and attention.
No, by far the most hate is displayed against generative models that are publicly accessible or freely available.
The Copyright Case
Copyright has always been a contentious issue. Views range from “there should be no copyright” to “everything belongs to somebody and nobody may use it for anything without paying.”
Unsurprisingly, I find myself somewhere in the middle…I respect the right of any author or artist or creator or developer to be recognised as such for their work, and I believe it is only fair that they should be able to expect fair compensation for the use of such work. I also believe that, insofar as possible, information should be freely available at need.
The GF, who as I have mentioned before, is an artist (the paint-and-brushes type), thinks AI is amazing. She doesn’t use it in her work at all, but she loves the potential it represents for more art. She also believes that once you “release” any art (in any form) into the wild, it effectively stops belonging to you, and can be re-imagined or repurposed by anyone. (This means she’s always up for a re-make, a re-imagining, a reboot, etc., which as a bit of a canon purist I struggle with myself.)
Regardless, not even the most die-hard anti-copyright campaigner would (I think) deny that artists (authors, etc.) deserve value for their work, their time, their effort, their perspective, and the years they have spent mastering their craft. The fact that a huge proportion of the scraped data is likely to be the output of people who do not have wide recognition or stable incomes from their creative output makes this even worse.
There are countless arguments that could be made here regarding the public domain, free accessibility, definitions of fair use, etc. but the fact remains that if you are profiting from the work of an artist or writer, simple decency at the very least (if not legal requirements) should include the artist themselves being able to share in that profit in some way as well.
On Plagiarism
Apart from the copyright issue, there are two other claims I regularly see levelled at AI generated…content. The first is that there can be nothing original produced by it, and that it simply regurgitates composites of other people’s existing work.
I feel that this is generally not strictly speaking accurate. Although there have indeed been cases where LLMs will spit out scraped content verbatim, this is not what they are designed to do. These cases seem likely to be either very narrowly trained models, highly specific prompts, or specific use cases where the body of training data is sparse for the particular topic or query.
However, I do not disagree that LLMs in particular have a peculiar limitation in this regard. Although their output may technically be “original,” in the sense that it is (usually) a specific combination of words on a subject that has not been (notably) seen before, a major stumbling block for people who use text generation for various purposes (including creative ones) is that it has a tendency to devolve into repetition. Creatives in the ad industry, for example, will find that a given model eventually starts regurgitating the same suggestions when asked for campaign ideas. Prompt engineering can alleviate this a little, but ultimately, I think these are limitations of the model training. Because an LLM cannot (yet) be truly creative. (But humans can.)
In Art…
When it comes to image-based art, however, I think the story is a little different. No matter how traditional artists feel about it, and despite the etymology of the new term, the process does not involve spitting out copies of work done by other people and somehow mashing them together. Again, model training plays a role here, but although you could train a model or LoRA (Low-Rank Adaptation) specifically on paintings by van Gogh, to produce works in the clear style of van Gogh, they are still not van Gogh paintings (and will not be mistaken for such), any more than a human who studied every van Gogh in order to mimic the style is producing van Goghs. Any more than Picasso, who was trained by his father, was somehow stealing his father’s art every time he painted a picture in his father’s style.
Again, a lack of understanding of the process of tokenisation in the training of image models probably contributes to this impression. It’s not a matter of scanning in a picture of a chair so that the model can then “draw” a chair like the source image. Honestly, the mathematics of this are way beyond my comprehension, but more than a year of experimenting with image generation has taught me that it’s not nearly so simple as saying “make me a picture of XYZ.” As I implied earlier, multiple techniques and methodologies are required, and there is no doubt in my mind that the skill and experience of the artist is a critical factor.
On Quality
The other claim, more frequently aimed at text-generating models, is that they’re just rubbish: that they’re useless at whatever they’re being asked for, that they just “hallucinate” things, that they’re incapable of providing a useful response, and similar related ideas.
In certain senses, this is true. (It is worth noting, however, that the term “hallucinations” is frowned upon by many researchers, due to its implications of neurological functioning.) I think a large part of this, too, is a misunderstanding of how LLMs work. Remember, they were not designed to be factual, or to “care” about the correctness of their output. They were designed to mimic human speech and text based on a probabilistic mathematical model of naturally occurring patterns in that speech (or text; for our purposes, the two are interchangeable).
That appears to be changing now, but even if they’re improving in specific directions, they were built as mathematical models of language, which predicted the next word in a sequence of words based on the patterns identified in their training data. That’s all.
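To make that “predict the next word” idea concrete, here is a deliberately tiny toy: a bigram model that simply counts which word most often follows which in a small corpus, then always emits the most frequent continuation. Real LLMs use neural networks over token embeddings rather than raw word counts, so this is a sketch of the statistical principle only, not of any actual model’s implementation; the corpus and function names are invented for the example.

```python
from collections import Counter, defaultdict

# A toy training corpus. A bigram model learns nothing except which
# word tends to follow which -- it has no concept of mass, weight,
# or meaning, only of statistical patterns in the text.
corpus = (
    "which is heavier a kilogram of feathers or a kilogram of steel "
    "they weigh the same "
    "which is heavier a kilogram of feathers or a kilogram of steel "
    "they weigh the same"
).split()

# Count how often each word follows each other word.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the most frequent continuation of `word` in the corpus."""
    return follows[word].most_common(1)[0][0]

print(predict_next("kilogram"))  # → "of"
print(predict_next("weigh"))    # → "the"
```

Ask such a model what follows “kilogram” and it will answer “of” every time, because that is the statistically dominant continuation in its training data, and for no other reason. That indifference to meaning is exactly why a slightly altered trick question can slip straight past this kind of pattern matching.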
Lies, Damn Lies, And Statistics
Based on my own informal testing, it took nearly two years for both ChatGPT and Gemini (née Bard) to stop answering “they weigh the same” when asked “which is heavier, 1 kg of feathers, or 2 kg of steel?”
That’s because in the corpus of training data, that question would almost certainly always appear in relation to the old trick question or brain teaser, in which the mass of both items was the same, and people were predisposed to answer based on the properties of the items rather than their stated mass.
The fact that a single character of my question was different was well within the margin of error for LLM pattern detection, and so the most crucial part of it was disregarded, in order to produce the statistically most probable sequence of tokens to follow the ones I’d input.
They weren’t designed as search engines, fact checkers, report generators, or anything like that. They were designed to credibly mimic human language use. And it’s impossible to tell whether the fact that it answers that specific question correctly now is because of a manual intervention on that specific issue, or because of improvements to the process / model that make it possible to avoid that kind of error.
But believe me, they’re working on changing it into a better search engine, a better fact checker, a better writer, etc. Because that’s what you want it to be.
The User Dilemma
People’s propensity to use generative text AI for purposes for which it was never really intended is definitely part of the problem with its accuracy. The AI hype cycle that I’ve referred to in the past is another part, and it’s also the reason, as I alluded to above, that they’re changing it. User expectation is high on the list of drivers for improvement, because meeting user expectations makes monetisation easier and more effective.
Another part of the reason is our very human tendency to anthropomorphise. In casual conversation with a wide variety of people, I’ve noticed a very strong tendency to do just that. As far as your average non-technical user is concerned, LLMs already think, understand, reason, and communicate just like a person does. This exacerbates feelings of annoyance and frustration when it “fails” users, while those who understand its processes a bit better are frustrated and annoyed at being threatened by it.
Solutions
I don’t see any easy solutions here. (Sorry, all I ever guarantee is more questions. :D ) The genie is out of the bottle. The data has been scraped, the (base) models trained, and the tools have been released into the gleeful hands of the public. Barring really hitting a “scaling wall” (with developers allegedly already seeing diminishing returns), or lack of new, original content leading to widespread “model collapse,” I don’t see us going backward on this.
For the copyright issue, artists’ best bet is probably legal remedy. It’s hard to challenge companies like OpenAI or Google or Meta on this alone, but maybe some sort of class-action suit could be used to leverage compensation for those whose work has been scraped without their knowledge and/or permission, and pave the way for fair licensing agreements that benefit artists as well.
For everything else, my best suggestion is to embrace it. It’s too late to prevent the use of these kinds of tools, and it looks like the smart money is on them becoming even better and more widespread over time. An AI-generated portrait of Alan Turing sold for over 1 million dollars at Sotheby’s last month… Now is when we need to start learning how to use them, and leverage the advantages they may confer for our own benefit. As Robert Anton Wilson said, “In an evolving world, he who stands still moves backwards.“
Maybe this piece will give somebody something to think about the next time they want to tell somebody their painstakingly created synthographic image is not art, or the next time they want to accuse a traditional artist of being a tech-phobic Luddite. (Probably not, but we can hope, right? :D )
On The Nature Of Art
And finally, (finally) to return somewhat to my original point…I submit to you that the true meaning and value of art lies not in the medium, the method, or the techniques of creating it, but in the sense it evokes in both the creator, and the observer.
I have known men and women who have elevated the most mundane appearing things to the level of art. Not because of money or fame or influence, but because doing that thing is a true expression of their selves.
And what could be more artistic than that?
Note:
I have foregone my usual habit of a poorly modelled and rendered featured image for this piece, choosing instead to generate one using only the prompt “Synthography.”