In December 2023, Google announced the launch of its new large language model (LLM), Gemini. Gemini now provides the artificial intelligence (AI) foundations for Google products; it is also a direct rival to OpenAI's GPT-4.

But why does Google consider Gemini such an important milestone, and what does it mean for users of Google services? And what does it mean in the context of the current hyper-fast developments in AI?

AI everywhere

Google is betting that Gemini will transform most of its products by improving current functionality and creating new features for services such as search, Gmail, YouTube and its office productivity suite. It would also enable improvements to its online advertising business, which is its main source of revenue, as well as to its software for Android phones, with stripped-down versions of Gemini running on hardware with limited capability.

A video from Google highlights Gemini’s capabilities.

For users, Gemini means new features and improved capabilities that could make Google services harder to avoid, strengthening an already dominant position in areas such as search engines. The potential and opportunity for Google are considerable, given that most of its software consists of easily upgradable cloud services.

But the huge and unexpected success of ChatGPT attracted a lot of attention and increased the credibility of OpenAI. Gemini will allow Google to re-establish itself in the public eye as a major player in AI. Google is a powerhouse in AI, with large and strong research teams that have been behind many of the major advances of the last decade.

There is a public debate about these new technologies, both about the benefits they provide and about the disruption they cause in fields such as education, design and healthcare.

Strengthening AI

At its core, Gemini relies on transformer networks. This technology was originally developed by a research team at Google and is also used for other LLMs such as GPT-4.

A distinctive feature of Gemini is its ability to handle different data modalities: text, audio, image and video. This gives the AI model the ability to perform tasks across multiple modalities, such as answering questions about the content of an image or running keyword searches for specific types of content discussed in podcasts.

But more importantly, the fact that the models can handle different modalities makes it possible to train AI models that are better overall than separate models trained independently for each modality. Indeed, such multimodal models are considered stronger because they are exposed to different perspectives on the same concepts.

For example, the concept of birds may be better understood by learning from a mix of text descriptions, vocalizations, images and videos of birds. This idea of multimodal transformer models was explored in previous research at Google, and Gemini is the first full-fledged commercial implementation of the approach.

An AI would better understand the concept of birds using a mix of textual descriptions, vocalizations, images and videos of birds.

Such a model is seen as a step towards stronger generalist AI models, also known as artificial general intelligence (AGI).

Risks of AGI

Given the speed at which AI is advancing, the expectation that AGI with superhuman capabilities will be developed in the near future is sparking debate in the research community and in society at large.

On the one hand, some anticipate the risk of catastrophic events if a powerful AGI falls into the hands of malicious groups, and are calling for developments to be slowed down.

Others claim that we are still very far from such an actionable AGI, that current approaches allow only a shallow modeling of intelligence that mimics the data they are trained on, and that they lack an effective, in-depth understanding of actual reality, which is necessary to achieve human-level intelligence.

More technological breakthroughs are required to create artificial general intelligence.

On the other hand, one could argue that focusing the conversation on existential risk distracts attention from the more immediate impacts of recent advances in AI, including perpetuating biases, producing false and misleading content (which prompted Google to pause its Gemini image generator), increasing environmental impact and reinforcing the dominance of Big Tech.

The line to follow lies somewhere between all these considerations. We are still a long way from the advent of actionable AGI. Further breakthroughs are needed, including stronger capabilities for symbolic modeling and reasoning.

In the meantime, we should not be distracted from the important ethical and societal impacts of modern AI. These considerations matter and should be addressed by people with diverse expertise spanning technological and social science fields.

While not a short-term threat, the achievement of AI with superhuman abilities is a matter of concern. It is important that, collectively, we are prepared to responsibly manage the emergence of AGI when this significant milestone is reached.

This article was originally published at