As increasingly capable artificial intelligence (AI) systems become widespread, the question of the risks they may pose has taken on new urgency. Governments, researchers and developers have highlighted AI safety.

The EU is moving on AI regulation, the UK is convening an AI safety summit, and Australia is seeking input on supporting safe and responsible AI.

The current wave of interest is an opportunity to address concrete AI safety issues such as bias, misuse and labour exploitation. But many in Silicon Valley view safety through the speculative lens of “AI alignment”, which misses out on the very real harms current AI systems can do to society – and the pragmatic ways we can address them.

What is ‘AI alignment’?

“AI alignment” is about trying to make sure the behaviour of AI systems matches what we want and what we expect. Alignment research tends to focus on hypothetical future AI systems, more advanced than today’s technology.

It’s a challenging problem because it’s hard to predict how technology will develop, and also because humans aren’t very good at knowing what we want – or agreeing about it.

Nevertheless, there is no shortage of alignment research. There is a range of technical and philosophical proposals with esoteric names such as “Cooperative Inverse Reinforcement Learning” and “Iterated Amplification”.

There are two broad schools of thought. In “top-down” alignment, designers explicitly specify the values and ethical principles for AI to follow (think of Asimov’s three laws of robotics), while “bottom-up” efforts try to reverse-engineer human values from data, then build AI systems aligned with those values. There are, of course, difficulties in defining “human values”, deciding who chooses which values matter, and determining what happens when humans disagree.

OpenAI, the company behind the ChatGPT chatbot and the DALL-E image generator among other products, recently outlined its plans for “superalignment”. This plan aims to sidestep tricky questions and align a future superintelligent AI by first building a merely human-level AI to help out with alignment research.

But to do that, they first need to align the alignment-research AI…

Why is alignment supposed to be so important?

Advocates of the alignment approach to AI safety say failing to “solve” AI alignment could lead to huge risks, up to and including the extinction of humanity.

Belief in these risks largely springs from the idea that “Artificial General Intelligence” (AGI) – roughly speaking, an AI system that can do anything a human can – could be developed in the near future, and could then keep improving itself without human input. In this narrative, the super-intelligent AI might then annihilate the human race, either intentionally or as a side-effect of some other project.

In much the same way the mere possibility of heaven and hell was enough to convince the philosopher Blaise Pascal to believe in God, the possibility of future super-AGI is enough to convince some groups that we should devote all our efforts to “solving” AI alignment.

There are many philosophical pitfalls with this kind of reasoning. It is also very difficult to make predictions about technology.

Even leaving those concerns aside, alignment (let alone “superalignment”) is a limited and inadequate way to think about safety and AI systems.

Three problems with AI alignment

First, the concept of “alignment” is not well defined. Alignment research typically aims at vague objectives like building “provably beneficial” systems, or “preventing human extinction”.

But these goals are quite narrow. A super-intelligent AI could meet them and still do immense harm.

More importantly, AI safety is about more than just machines and software. Like all technology, AI is both technical and social.

Making safe AI will involve addressing a whole range of issues, including the political economy of AI development, exploitative labour practices, problems with misappropriated data, and ecological impacts. We also need to be honest about the likely uses of advanced AI (such as pervasive authoritarian surveillance and social manipulation) and who stands to benefit along the way (entrenched technology companies).

Finally, treating AI alignment as a technical problem puts power in the wrong place. Technologists shouldn’t be the ones deciding which risks and which values count.

The rules governing AI systems should be determined by public debate and democratic institutions.

OpenAI is making some efforts in this regard, such as consulting with users in different fields of work during the design of ChatGPT. However, we should be wary of efforts to “solve” AI safety by merely gathering feedback from a broader pool of people, without allowing space to address bigger questions.

Another problem is a lack of diversity – ideological and demographic – among alignment researchers. Many have ties to Silicon Valley groups such as effective altruists and rationalists, and there is a lack of representation from women and other marginalised groups who have historically been the drivers of progress in understanding the harm technology can do.

If not alignment, then what?

The impacts of technology on society can’t be addressed using technology alone.

The idea of “AI alignment” positions AI companies as guardians protecting users from rogue AI, rather than as the developers of AI systems that may well perpetrate harms. While safe AI is certainly a good objective, approaching it by narrowly focusing on “alignment” ignores too many pressing and potential harms.

So what is a better way to think about AI safety? As a social and technical problem, to be addressed first and foremost by acknowledging and addressing existing harms.

This isn’t to say alignment research won’t be useful, but the framing isn’t helpful. And hare-brained schemes like OpenAI’s “superalignment” amount to kicking the meta-ethical can one block down the road, and hoping we don’t trip over it later.
