Making art using artificial intelligence isn’t new. It’s as old as AI itself.

What’s new is that a wave of tools now lets most people generate images by entering a text prompt. All you need to do is type “a landscape in the style of van Gogh” into a text box, and the AI can create a beautiful image as instructed.

The power of this technology lies in its capacity to use human language to control art generation. But do these systems accurately translate an artist’s vision? Can bringing language into art-making truly lead to artistic breakthroughs?

Engineering outputs

I’ve worked with generative AI as an artist and computer scientist for years, and I would argue that this new type of tool constrains the creative process.

When you write a text prompt to generate an image with AI, there are infinite possibilities. If you’re a casual user, you might be happy with what AI generates for you. And startups and investors have poured billions into this technology, seeing it as an easy way to generate graphics for articles, video game characters and advertisements.

Generative AI is seen as a promising tool for coming up with video game characters.
Benlisquare/Wikimedia Commons, CC BY-SA

In contrast, an artist might need to write an essay-like prompt to generate a high-quality image that reflects their vision, with the right composition, the right lighting and the right shading. That long prompt is not necessarily descriptive of the image but typically uses many keywords to steer the system toward what’s in the artist’s mind. There’s a relatively new term for this: prompt engineering.

Basically, the role of an artist using these tools is reduced to reverse-engineering the system to find the right keywords to compel it to generate the desired output. It takes a lot of effort, and much trial and error, to find the right words.

AI isn’t as intelligent as it seems

To learn how to better control the outputs, it’s important to recognize that most of these systems are trained on images and captions from the internet.

Think about what a typical image caption tells you about an image. Captions are typically written to complement the visual experience in web browsing.

For example, the caption might give the name of the photographer and the copyright holder. On some websites, like Flickr, a caption typically describes the type of camera and the lens used. On other sites, the caption describes the graphics engine and hardware used to render an image.

So to write a useful text prompt, users need to insert many nondescriptive keywords for the AI system to create a corresponding image.

Today’s AI systems are not as intelligent as they seem; they are essentially smart retrieval systems that have a huge memory and work by association.

Artists frustrated by a lack of control

Is this really the sort of tool that can help artists create great work?

At Playform AI, a generative AI art platform that I founded, we conducted a survey to better understand artists’ experiences with generative AI. We collected responses from over 500 digital artists, traditional painters, photographers, illustrators and graphic designers who had used platforms such as DALL-E, Stable Diffusion and Midjourney, among others.

Only 46% of the respondents found such tools to be “very useful,” while 32% found them somewhat useful but couldn’t integrate them into their workflow. The rest of the users, 22%, didn’t find them useful at all.

The main limitation artists and designers highlighted was a lack of control. On a scale of 0 to 10, with 10 being the most control, respondents rated their ability to control the outcome between 4 and 5. Half the respondents found the outputs interesting, but not of a high enough quality to be used in their practice.

When it came to beliefs about whether generative AI would influence their practice, 90% of the artists surveyed thought that it would; 46% believed that the effect would be a positive one, with 7% predicting that it would have a negative effect. And 37% thought their practice would be affected but weren’t sure in what way.

The best visual art transcends language

Are these limitations fundamental, or will they just go away as the technology improves?

Of course, newer versions of generative AI will give users more control over outputs, along with higher resolutions and better image quality.

But to me, the main limitation, as far as art is concerned, is foundational: it’s the process of using language as the main driver in generating the image.

Visual artists, by definition, are visual thinkers. When they imagine their work, they sometimes draw from visual references, not words – a memory, a set of photographs or other art they’ve encountered.

When language is in the driver’s seat of image generation, I see an extra barrier between the artist and the digital canvas. Pixels will be rendered only through the lens of language. Artists lose the freedom of manipulating pixels outside the boundaries of semantics.

Grid of different cartoon images of an animal with wings.
The same input can result in a variety of random outputs.
OpenAI/Wikimedia Commons

There’s another fundamental limitation in text-to-image technology.

If two artists enter the exact same prompt, it’s very unlikely that the system will generate the same image. That’s not due to anything the artists did; the different outcomes are simply due to the AI’s starting from different random initial images.

In other words, the artist’s output is boiled down to chance.
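As a rough illustration of that point, here is a toy sketch in Python. It is not the code of any real image generator: the function name `initial_noise` is hypothetical, and a short list of random numbers stands in for the random initial image. It assumes only that the starting point is drawn from a random number generator, independently of the prompt.

```python
import random


def initial_noise(seed, size=8):
    """Toy stand-in for a generator's random starting image (hypothetical helper)."""
    rng = random.Random(seed)  # independent generator, so runs don't affect each other
    return [round(rng.gauss(0.0, 1.0), 3) for _ in range(size)]


# The prompt is identical for both artists; it plays no part in the starting noise.
prompt = "a landscape in the style of van Gogh"

artist_a = initial_noise(seed=1)
artist_b = initial_noise(seed=2)
artist_a_again = initial_noise(seed=1)

# Compare the starting points: different seeds versus a repeated seed.
print("Same prompt, different seeds, same start?", artist_a == artist_b)
print("Same prompt, same seed, same start?", artist_a == artist_a_again)
```

In this sketch the only thing that separates the two artists’ starting points is a number neither of them chose, which is the sense in which the outcome comes down to chance.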

Nearly two-thirds of the artists we surveyed had concerns that their AI generations might be similar to other artists’ works and that the technology does not reflect their identity, or even replaces it altogether.

The issue of artist identity is crucial when it comes to making and recognizing art. In the 19th century, when photography started to become popular, there was a debate about whether photography was a form of art. It came down to a court case in France in 1861 to decide whether photography could be copyrighted as an art form. The decision hinged on whether an artist’s unique identity could be expressed through photographs.

Those same questions emerge when considering AI systems that are trained on the internet’s existing images.

Before the emergence of text-to-image prompting, creating art with AI was a more elaborate process: Artists often trained their own AI models based on their own images. That allowed them to use their own work as visual references and retain more control over the outputs, which better reflected their unique style.

Text-to-image tools might be useful for certain creators and casual everyday users who want to create graphics for a work presentation or a social media post.

But when it comes to art, I can’t see how text-to-image software can adequately reflect the artist’s true intentions or capture the beauty and emotional resonance of works that grip viewers and make them see the world anew.
