Google apologized (or came close to apologizing) this week for yet another embarrassing AI mistake: an image-generating model that injected diversity into pictures without regard for historical context. While the underlying problem is perfectly understandable, Google blames the model for being "overly sensitive." But the model didn't make itself, folks.

The AI system in question is Gemini, the company's flagship conversational AI platform, which calls on a version of the Imagen 2 model to create images on demand.

Recently, however, people noticed that asking it to create images of certain historical circumstances or figures produced ridiculous results. The Founding Fathers, for instance, who we know were white slave owners, were rendered as a multicultural group that included people of color.

This embarrassing and easily reproduced problem was quickly mocked by online commenters. Predictably, it was also folded into the ongoing debate about diversity, equity, and inclusion (currently at something of a reputational low), and pointed to by pundits as evidence that the woke mind virus is penetrating further into the already liberal tech sector.

Photo credit: An image generated by Twitter user Patrick Ganley.

"DEI has gone crazy," conspicuously concerned citizens shouted. This is Biden's America! Google is an "ideological echo chamber," a stalking horse for the left! (It must be said that the left was also suitably disturbed by this strange phenomenon.)

But as anyone familiar with the technology could tell you, and as Google explains in its somewhat silly little apology post today, this problem was the result of a perfectly reasonable workaround for systemic bias in training data.

Say you want to use Gemini for a marketing campaign and you ask it to create 10 images of "a person walking a dog in a park." Since you don't specify the type of person, dog, or park, it's dealer's choice: the generative model outputs what it knows best. And in many cases that is a product not of reality but of the training data, in which all kinds of biases can be embedded.

What kinds of people (and, for that matter, dogs and parks) appear most frequently in the thousands of relevant images the model has ingested? The fact is that white people are overrepresented in many of these image collections (stock imagery, royalty-free photography, and so on), and so in many cases the model will default to white people if you don't specify otherwise.

That's just an artifact of the training data, but as Google points out: "Because our users come from all over the world, we want it to work well for everyone. If you ask for a picture of football players, or someone walking a dog, you may want to receive a range of people. You probably don't want to only receive images of people of just one ethnicity (or any other characteristic)."

Illustration of a group of recently laid-off people holding boxes.

Imagine asking for an image like this: what if they were all the same type of person? Bad result! Photo credit: Getty Images / victorikart

There's nothing wrong with getting a picture of a white guy walking a golden retriever in a suburban park. But what if you ask for 10 and they're all white guys walking goldens in suburban parks? And you live in Morocco, where the people, dogs, and parks all look different? That's simply not a desirable outcome. If someone doesn't specify a characteristic, the model should opt for variety over homogeneity, no matter how its training data might bias it.

This is a common problem across all kinds of generative media, and there's no easy solution. But in cases that are especially common, sensitive, or both, companies like Google, OpenAI, and Anthropic invisibly include extra instructions for the model.

I can't emphasize enough how commonplace this kind of implicit instruction is. The entire LLM ecosystem is built on implicit instructions: system prompts, as they're sometimes called, in which guidelines like "be concise" and "don't swear" are given to the model before every conversation. If you ask for a joke, you won't get a racist one, because even though the model has ingested thousands of them, it has also, like most of us, been trained not to tell them. This isn't a secret agenda (though it could use more transparency); it's infrastructure.
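To make the mechanism concrete, here is a minimal sketch of how a system prompt rides along invisibly with every request. It uses the OpenAI chat completions API purely as an example; the model name and the instruction text are placeholders I made up, not any company's actual hidden prompt.

```python
from openai import OpenAI

client = OpenAI()

# Hidden "implicit instructions" the user never sees. Wording is invented
# for illustration; real production system prompts are far longer.
SYSTEM_PROMPT = (
    "Be concise. Don't swear. Decline jokes that demean any group of people. "
    "If asked to depict people without specified attributes, show a diverse "
    "range of people."
)

def ask(user_message: str) -> str:
    """Send the user's message with the invisible system prompt prepended."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},  # never shown to the user
            {"role": "user", "content": user_message},
        ],
    )
    return response.choices[0].message.content

print(ask("Tell me a joke about my coworkers."))
```

The point is that the system message is attached to every single conversation before the user's words arrive, which is exactly why it reads as infrastructure rather than agenda.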

The flaw in Google's setup was that there were no implicit instructions for situations where historical context mattered. So while a prompt like "a person walking a dog in a park" is improved by the silent addition of "the person is of a random gender and ethnicity," or however they phrase it, a prompt asking for the Founding Fathers is definitely not improved by the same treatment, as the sketch below illustrates.
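Here is a hypothetical sketch of that kind of prompt augmentation, not Google's actual pipeline: the helper names, the regex heuristics, and the suffix wording are all invented. Always appending the suffix reproduces the failure the article describes; the historical-context check is the guard that was evidently missing.

```python
import re

# Invented wording; real systems phrase their hidden additions differently.
DIVERSITY_SUFFIX = " Depict people of randomly varied genders and ethnicities."

# Crude stand-in for detecting prompts where historical accuracy matters.
HISTORICAL_HINTS = re.compile(
    r"\b(founding fathers|medieval|viking|roman|victorian|wwii|1800s)\b",
    re.IGNORECASE,
)
PEOPLE_HINTS = re.compile(r"\b(person|people|man|woman|player|crowd)\b", re.IGNORECASE)

def augment_prompt(user_prompt: str) -> str:
    """Silently rewrite an image prompt before it reaches the generator."""
    if HISTORICAL_HINTS.search(user_prompt):
        return user_prompt  # historical context matters: leave the prompt alone
    if PEOPLE_HINTS.search(user_prompt):
        return user_prompt + DIVERSITY_SUFFIX
    return user_prompt

print(augment_prompt("a person walking a dog in a park"))
print(augment_prompt("the Founding Fathers signing the Constitution"))
```

Whatever Google's real rewriting logic looks like, the article's argument is that the equivalent of that first guard was missing, and that someone wrote it that way.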

As Google SVP Prabhakar Raghavan put it:

First, our tuning to ensure that Gemini showed a range of people failed to account for cases that should clearly not show a range. And second, over time, the model became way more cautious than we intended and refused to answer certain prompts entirely, wrongly interpreting some very anodyne prompts as sensitive.

These two things caused the model to overcompensate in some cases and be too conservative in others, leading to embarrassing and incorrect images.

I know how hard it is to say "sorry" sometimes, so I forgive Raghavan for stopping just short of it. More important is some interesting language in there: "The model became way more cautious than we intended."

How would a model "become" anything? It's software. Someone, or rather thousands of Google engineers, built it, tested it, and iterated on it. Someone wrote the implicit instructions that improved some answers and caused others to fail comically. When this one failed, anyone who could have inspected the full prompt would likely have found what Google's team did wrong.

Google blames the model for "becoming" something it wasn't "intended" to be. But they made the model! It's as if they broke a glass and, instead of saying "we dropped it," said "it fell." (I've done this.)

Mistakes by these models are certainly inevitable. They hallucinate, they reflect biases, they behave in unexpected ways. But the responsibility for those mistakes lies not with the models but with the people who made them. Today it's Google. Tomorrow it'll be OpenAI. The next day, and probably for a few months straight, it'll be X.AI.

These companies have a vested interest in convincing you that AI makes its own mistakes. Don't let them.

This article was originally published at techcrunch.com