Researchers at the University of California, Berkeley, have developed an AI system that predicts future events with accuracy approaching that of aggregated human forecasts.

Since LLMs weren’t specifically designed for event prediction, the team built a forecasting system on top of GPT-4 using a novel approach called Retrieval-Augmented Reasoning.

This multi-step process involves GPT-4 searching for relevant information, evaluating it, and integrating it into its reasoning before making a prediction.

Here is how it works:

  1. Retrieval: The system uses GPT-4 to generate search queries from the forecasting question and its sub-questions, then retrieves a broad selection of potentially relevant news articles.
  2. Relevance assessment: GPT-4 evaluates the relevance of each retrieved article and discards low-scoring articles to narrow the knowledge pool.
  3. Summarization: GPT-4 condenses each article to its core points, focusing on details relevant to the forecasting question.
  4. Reasoning: Using "scratchpad" prompts, GPT-4 analyzes the summarized articles and produces a detailed forecast with an explanatory rationale. These prompts guide the model's thought process and encourage a systematic approach to reasoning.
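The four stages above can be sketched as a simple pipeline. This is a minimal illustration, not the paper's code: `ask_llm` and `search_news` are hypothetical stand-ins for a GPT-4 call and a news-search API, stubbed here with canned responses so the control flow can run end to end.

```python
def ask_llm(prompt: str) -> str:
    # Placeholder for a real GPT-4 call; returns canned responses
    # so the pipeline's control flow can be exercised end to end.
    if prompt.startswith("Search queries"):
        return "election polls\nelection forecast"
    if prompt.startswith("Rate relevance"):
        return "5" if "election" in prompt else "1"
    if prompt.startswith("Summarize"):
        return prompt.split(": ", 1)[1][:60]
    return "0.62"  # stand-in for the final probability estimate

def search_news(query: str) -> list[str]:
    # Placeholder for a news-retrieval API.
    return [f"Article about {query}", "Unrelated sports story"]

def forecast(question: str) -> float:
    # 1. Retrieval: generate search queries, pull candidate articles.
    queries = ask_llm(f"Search queries for: {question}").splitlines()
    articles = [a for q in queries for a in search_news(q)]
    # 2. Relevance assessment: score each article, keep high scorers.
    relevant = [a for a in articles
                if int(ask_llm(f"Rate relevance 1-5: {a}")) >= 4]
    # 3. Summarization: condense each kept article to its core points.
    summaries = [ask_llm(f"Summarize: {a}") for a in relevant]
    # 4. Reasoning: a scratchpad-style prompt yields a probability.
    return float(ask_llm("Scratchpad forecast:\n" + "\n".join(summaries)))

prob = forecast("Will the incumbent win the election?")
```

With a real model behind `ask_llm`, each stage would be a separate prompt template; the structure of filter-then-summarize-then-reason is the point, not the stub logic.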

The Berkeley team then went a step further with self-supervised fine-tuning of the system.

They generated many AI predictions on past questions with known answers and selected the examples where the AI had outperformed the "wisdom of the crowd," defined as the aggregate predictions of human forecasters.

By fine-tuning GPT-4 on these examples, the researchers taught the model to emulate the reasoning patterns that produced the best predictions.
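The selection step can be sketched as a filter over past questions: keep only the rationales where the model's Brier score beat the crowd's, and fine-tune on those. This is an illustrative sketch, not the paper's implementation; the record fields and data are invented for the example.

```python
def brier(prob: float, outcome: int) -> float:
    # Squared error between a predicted probability and the 0/1 outcome.
    return (prob - outcome) ** 2

# Invented records: (model prob, crowd prob, actual outcome, model rationale).
records = [
    (0.9, 0.6, 1, "strong polling trend"),
    (0.4, 0.2, 0, "mixed signals"),
    (0.7, 0.8, 1, "weak evidence"),
]

# Keep only rationales where the model beat the crowd (lower Brier score).
finetune_set = [
    rationale
    for model_p, crowd_p, outcome, rationale in records
    if brier(model_p, outcome) < brier(crowd_p, outcome)
]
```

Only the first record survives the filter here: the model's score of 0.01 beats the crowd's 0.16, while the crowd wins on the other two.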


When tested on forecasting questions from June 2023, the AI achieved a Brier score of 0.179, compared with the human crowd's score of 0.149 (lower is better).
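The Brier score mentioned above is simply the mean squared error between predicted probabilities and binary outcomes, which is why lower values are better. A quick illustration:

```python
def brier_score(probs, outcomes):
    # Mean squared error between probabilities and 0/1 outcomes.
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# A perfectly confident, correct forecaster scores 0.0;
# always guessing 0.5 scores 0.25 regardless of the outcomes.
perfect = brier_score([1.0, 0.0], [1, 0])
coin = brier_score([0.5, 0.5], [1, 0])
```

On this scale, the gap between the AI's 0.179 and the crowd's 0.149 is real but modest: both sit well below the 0.25 of pure guessing.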

The AI performed particularly well on questions where human uncertainty was high early in the forecasting process and where it had access to enough relevant articles on the topic.

(a) The system outperforms the human crowd when it has retrieved between 0 and 10 relevant articles.
(b) When humans are uncertain about their predictions (confidence levels between 0.3 and 0.7), the system performs better, with a Brier score of 0.199 versus 0.246. However, when humans are very confident (predictions below 0.05), they outperform the system.
(c) The system's accuracy is higher early in the data-collection window. Source: ArXiv (open access).

The authors write in the study: "To our knowledge, this is the first automated system with forecasting capabilities approaching human crowd levels, which are generally stronger than individual human forecasters."

One small peculiarity: the system seemed to perform worse as more articles became available and forecast reliability increased. This may be because the model "hedges" its predictions.

The researchers describe it as follows: "We suspect this is due to our model's tendency toward underconfident predictions as a result of its safety training."


Policymakers, businesses and public health officials could all benefit from this kind of AI-driven forecasting, the researchers say.

"In the longer term, policymakers could consult AIs about which actions would be most likely to produce desired outcomes," explains Dan Hendrycks of the Center for AI Safety in California.

He suggests that predictive models could address looming threats posed by AI itself. "Prediction bots would help us predict and avoid these risks," Hendrycks told New Scientist.

There have been other attempts to use AI to predict complex life events, including a model trained by Danish researchers to predict the risk of premature death.

The use of AI for predictive applications that affect people's lives raises ethical questions, such as whether these systems are transparent, unbiased and ethically sound.

This recent Berkeley study shows that AI can make effective predictions, but we cannot fully assess how it reaches its decisions.

Using AI to predict important societal and individual events may seem like a dystopian concept, but it is already widespread practice in many parts of the world.

AI is already being used for policing, surveillance and social decision-making in several democratic countries, including the US, UK, Brazil, Australia and the Netherlands.

Could an AI be predicting elements of your future right now? It is certainly possible.
