OpenAI aims to battle AI 'hallucinations' with new training method

OpenAI announced it is tackling the issue of AI "hallucinations" through a novel approach to training artificial intelligence models.

The research comes at a critical juncture, as the spread of misinformation generated by AI systems has become a topic of intense debate, particularly in light of the upcoming 2024 US presidential election and the ongoing generative AI boom.

OpenAI made waves in the industry last year with the release of ChatGPT, its chatbot powered by GPT-3.5 and GPT-4, which quickly garnered over 100 million monthly users, setting a record as the fastest-growing app. Microsoft has demonstrated its confidence in OpenAI's potential by investing over $13 billion in the startup, which has been valued at approximately $29 billion.

AI hallucinations occur when models, such as OpenAI's ChatGPT or Google's Bard, fabricate information and present it as factual. For instance, Google's Bard made an inaccurate claim about the James Webb Space Telescope in a promotional video. More recently, a New York federal court filing cited nonexistent cases generated by ChatGPT, and the attorneys involved may face sanctions.

In their report, the OpenAI researchers acknowledged that even state-of-the-art models are prone to producing falsehoods and exhibit a tendency to invent facts when faced with uncertainty. Such hallucinations pose significant challenges in domains that require multi-step reasoning, as a single logical error can derail an entire solution.

To combat these fabrications, OpenAI's potential solution involves training AI models so that each correct step of reasoning toward an answer is rewarded, rather than rewarding only a correct final conclusion. This approach, known as "process supervision," as opposed to "outcome supervision," aims to promote more explainable AI. By encouraging models to follow a more human-like chain of thought, OpenAI hopes to mitigate logical errors and enhance the overall capabilities of AI systems.
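
The difference between the two kinds of feedback can be illustrated with a small Python sketch. It is only a toy illustration, not OpenAI's training code: the Step class, the function names, the algebra problem and the per-step labels are all hypothetical, and the labels stand in for the kind of human step-by-step judgments the article describes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    """One line of a worked solution plus a hypothetical human label."""
    text: str
    correct: bool


def outcome_reward(final_answer: str, expected: str) -> float:
    """Outcome supervision: a single reward based only on the final answer."""
    return 1.0 if final_answer == expected else 0.0


def process_rewards(steps: List[Step]) -> List[float]:
    """Process supervision: one reward per reasoning step."""
    return [1.0 if step.correct else 0.0 for step in steps]


def first_error(steps: List[Step]) -> Optional[int]:
    """Index of the first step labeled incorrect, or None if all are correct."""
    for index, step in enumerate(steps):
        if not step.correct:
            return index
    return None


# Hypothetical model output for the problem "solve 2x + 6 = 14" (answer: x = 4).
solution = [
    Step("Subtract 6 from both sides: 2x = 8", correct=True),
    Step("Divide both sides by 2: x = 3", correct=False),  # arithmetic slip
    Step("Answer: x = 3", correct=False),  # repeats the wrong value
]

print(outcome_reward("x = 3", "x = 4"))  # 0.0
print(process_rewards(solution))         # [1.0, 0.0, 0.0]
print(first_error(solution))             # 1
```

In this toy case the outcome signal only says that the answer is wrong, while the per-step signal also shows that the reasoning first went wrong at the second step.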

Karl Cobbe, a mathgen researcher at OpenAI, explained that detecting and addressing logical mistakes or hallucinations is a crucial step toward building artificial general intelligence (AGI). While OpenAI did not originate the process-supervision approach, the company is actively contributing to its advancement. Cobbe emphasized that the research aims to address hallucinations and improve models' problem-solving abilities.

OpenAI has released an accompanying dataset of 800,000 human labels used to train the model mentioned in the research paper, according to Cobbe.
