
Last updated: May 29, 2024
Artificial Intelligence

What are AI Hallucinations and how to prevent them?

While some people find them amusing, AI hallucinations can be dangerous. Here’s why.

Noa Azaria
9 min read Jan 15, 2024

While some people find them amusing, AI hallucinations can be dangerous. This is a big reason why prevention should be on your agenda. Imagine asking an AI chatbot for a recipe and it suggests chlorine gas (yes, this actually happened). Not great, right?

In this article, we’re going to cover everything you need to know about AI hallucinations, from causes and types to mitigation techniques.  

AI Hallucination Explained

AI hallucination is when AI systems, such as chatbots, generate responses that are inaccurate or completely fabricated. This happens because AI tools like ChatGPT learn to predict the words that best fit your prompt, but they don’t reason logically or critically about whether those words are true. The result is often inaccurate responses, confusion, and misinformation. Essentially, hallucinations are a persistent bug in generative AI.

What are the causes of AI hallucinations?

Hallucinations are an inherent risk in large language models (LLMs), stemming from the foundational models developed by OpenAI, Google, Meta, and others. This risk lies largely beyond user control and comes with the territory in GenAI. Yann LeCun, VP & Chief AI Scientist at Meta, put it best in a 2023 tweet about AI hallucinations: “They will still hallucinate, they will still be difficult to control, and they will still merely regurgitate stuff they’ve been trained on.”

Here, we’ll focus on the more practical side: the RAG-based setup that powers most generative AI tools and chatbots.

RAG (retrieval-augmented generation) LLMs

What is retrieval-augmented generation? In short, it’s a technique that enhances LLMs by retrieving information from an external knowledge base and incorporating it into the model’s generation process. RAG is a favorite architecture for chatbot engines. In the image below you can see the basic architecture for building a RAG chatbot.

RAG LLMs Basic Architecture
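The retrieve-then-generate flow in the diagram above can be sketched in a few lines of Python. Everything here is a toy illustration, not a production pattern: the bag-of-words `embed` function stands in for a real dense embedding model, the two-document list stands in for a knowledge base, and the final LLM call is omitted.

```python
import math
import re

def embed(text: str) -> dict[str, int]:
    """Toy bag-of-words 'embedding'; real RAG systems use dense vector models."""
    counts: dict[str, int] = {}
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        counts[word] = counts.get(word, 0) + 1
    return counts

def cosine(a: dict[str, int], b: dict[str, int]) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(v * b.get(w, 0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str]) -> str:
    """Step 1: fetch the knowledge-base document closest to the query."""
    return max(docs, key=lambda d: cosine(embed(query), embed(d)))

def build_prompt(query: str, context: str) -> str:
    """Step 2: inject the retrieved context so the model answers from it."""
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Refunds are available within 30 days of purchase.",
    "Support is open Monday to Friday, 9am to 5pm.",
]
query = "When are refunds available?"
prompt = build_prompt(query, retrieve(query, docs))
# Step 3 (not shown): send `prompt` to the LLM and return its response.
print(prompt)
```

The key design point: the model is asked to answer from the retrieved context rather than from its parametric memory alone, which is why retrieval quality directly shapes output quality.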

While many claim that using RAG can reduce the hallucination problem, it’s not that simple. RAG doesn’t solve hallucinations. On top of the inherent risk of hallucination, ineffective knowledge base retrieval can add to the issue. This section outlines the causes of hallucinations specifically associated with retrieval-augmented generation LLMs:

  • Inaccurate context retrieval: When the retrieval mechanism fetches irrelevant or low-quality information, it directly impacts the quality of the generated output, leading to hallucinations or misleading responses.
  • Ineffective queries: Poorly formulated prompts from the user can mislead the retrieval process. Additionally, if the GenAI app’s prompt is ineffective, then responses may be based on incorrect or inappropriate context.
  • Complex language challenges: Challenges in understanding idioms, slang, or accurately processing non-English languages can cause the system to generate incorrect or nonsensical responses. This is compounded when the retrieval component struggles to find or interpret contextually relevant information in these languages.
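One practical defense against the first two causes above is to score the retrieved context against the query and refuse to answer when the match is too weak, rather than letting the model improvise from poor context. The sketch below is illustrative only: the overlap-coefficient scoring and the 0.3 threshold are assumptions for the demo, and real systems would use embedding similarity from the retriever itself.

```python
import re

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def overlap(query: str, context: str) -> float:
    """Overlap coefficient: shared tokens / size of the smaller token set."""
    tq, tc = tokens(query), tokens(context)
    if not tq or not tc:
        return 0.0
    return len(tq & tc) / min(len(tq), len(tc))

def answer_or_refuse(query: str, retrieved: str, threshold: float = 0.3) -> str:
    """Gate generation on retrieval quality instead of answering regardless."""
    if overlap(query, retrieved) < threshold:
        return "I don't have enough information to answer that."
    # A real system would call the LLM here with the retrieved context.
    return f"[answer generated from context: {retrieved}]"

# Relevant context clears the threshold; irrelevant context triggers a refusal.
print(answer_or_refuse("reset my password",
                       "To reset your password, click 'Forgot password' on the login page."))
print(answer_or_refuse("reset my password",
                       "Our offices are closed on public holidays."))
```

Refusing is not free (it hurts helpfulness), but a bounded "I don't know" is usually cheaper than a confident fabrication.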

AI Hallucinations vs AI Biases: What’s the Difference?

It’s important to distinguish between AI hallucinations and biases. Biases in AI result from training that leads to consistent error patterns: for example, an AI that frequently misidentifies wildlife photos because it was mostly trained on city images. Hallucinations, on the other hand, are when AI makes up incorrect information out of thin air. Both are issues that need addressing, but they stem from different root causes.

Machine Learning Pipeline
An ML pipeline showing where human bias infiltrates.

Researchers at USC have identified bias in a substantial 38.6% of the ‘facts’ and data employed by AI.

Generative AI Hallucination Examples

AI hallucinations can range from funny to extremely dangerous when impacting sensitive areas of society. Let’s look at the different types threatening GenAI performance.

1. Fabricated content

From deepfakes to fabricated sources, generative agents can make false information up out of thin air. In this case, AI creates entirely false data, like making up a news story or a historical fact. This can be extremely dangerous in sensitive industries like healthcare or finance.

Example of ChatGPT hallucination:

ChatGPT has been known for some memorable hallucinations. In 2023, it was reported that a New York lawyer unintentionally used the GenAI tool to generate fake legal citations, which got him sanctioned and later fined.

Another notable example comes from Air Canada, whose support chatbot hallucinated a bereavement refund policy that didn’t exist. The passenger relied on it, took the airline to a tribunal, and won.

2. Inaccurate facts

One of the most common forms of AI hallucination is inaccurate facts. Here, AI generates a convincing response built on totally untrue facts, such as made-up achievements.

Example of Google Bard hallucination

Google’s Bard AI (now Gemini) made a factual hallucination about the James Webb Space Telescope, causing a significant drop in Alphabet Inc.’s market value.

3. Weird and off-topic outputs

Sometimes AI hallucinations are just weird or out of context. Here, AI gives answers that are unrelated to the question. This leads to bizarre or confusing responses.

Example of Microsoft Copilot hallucination

Reddit user “riap0526” posted his weird experience with Copilot, when the chatbot became unhinged.


4. Harmful misinformation

Without being prompted to, AI might produce offensive or harmful content. This can be detrimental to the mass adoption of generative AI. Just imagine a chatbot mouthing off, or purposely producing harmful content and misinformation.

Example of Gemini hallucination

Google’s Gemini chatbot generated completely fabricated images of racially diverse Nazis. Enough said here. You know why this is wrong.


5. Invalid LLM-generated code

When tasked with generating code, AI might produce flawed or completely wrong code. This opens software up to vulnerabilities, latency issues, and poor performance.

Engineers experience this lack of reliability on a daily basis. While Copilot, ChatGPT, and others have significantly cut development time, they have also added to the risks of blindly trusting LLM-generated code.
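A common mitigation is a "don't trust, verify" step: check that a generated snippet at least parses, then run it against known test cases before accepting it. The sketch below shows the idea under stated assumptions: the `validate_snippet` helper and the passing/failing `add` snippets are invented for illustration, and real pipelines add sandboxing and timeouts rather than calling `exec` on untrusted output.

```python
import ast

def validate_snippet(code: str, func_name: str, cases: list[tuple]) -> bool:
    """Accept code only if it parses and passes every (args, expected) case."""
    try:
        ast.parse(code)            # reject syntactically invalid output
    except SyntaxError:
        return False
    namespace: dict = {}
    try:
        exec(code, namespace)      # NOTE: only acceptable for trusted demo input
        fn = namespace[func_name]
        return all(fn(*args) == expected for args, expected in cases)
    except Exception:
        return False

good = "def add(a, b):\n    return a + b\n"
bad = "def add(a, b):\n    return a - b\n"   # plausible-looking but wrong

cases = [((2, 3), 5), ((0, 0), 0)]
print(validate_snippet(good, "add", cases))  # True
print(validate_snippet(bad, "add", cases))   # False
```

The second snippet looks just as fluent as the first, which is exactly why behavioral tests, not readability, should decide whether LLM-generated code ships.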

The impact of AI hallucinations on the enterprise

When AI gets things wrong, it’s not just a small mistake; it can lead to ethical problems. This is a big issue because it makes us question our trust in AI. It’s especially tricky in key industries like healthcare or finance, where inaccurate information can cause real harm. Hallucinations can make or break new companies emerging in the space and can determine how effectively established brands transition to AI applications.

Here’s why AI hallucinations matter… a lot:

  • Misinformation spread: This can mislead users and perpetuate discrimination and fake news. 
  • Trust erosion: Frequent hallucinations erode trust in AI. This leads to skepticism about its reliability.
  • Reliability concerns: This raises doubts about the AI’s capability to consistently provide accurate and reliable outputs.
  • Ethical implications: They may amplify biases or lead to questionable ethical outcomes.

For enterprises, AI hallucinations present even more risks:

  • Brand reputation: AI hallucinations can harm a company’s reputation, reducing customer trust and loyalty.
  • Product liability: Inaccuracies in critical industries could lead to serious legal issues.
  • User experience degradation: Unreliable AI outputs frustrate users, affecting engagement and adoption.
  • Competitive disadvantage: Companies with more reliable AI solutions have a market advantage over those with hallucination-prone products.
  • Increased costs: Addressing AI hallucinations involves additional expenses, from technical fixes to customer service.

How to Mitigate AI Hallucinations for Enterprise Success

Reducing AI hallucinations involves several strategies:

Implement AI Guardrails: Proactive safety measures that detect and mitigate AI hallucinations and prevent other AI risks in real time. Aporia Guardrails ensures real-time reliability in every AI-generated response, safeguarding brand reputation and user trust.

Enhance AI knowledge: Expanding the AI’s knowledge base offers the RAG LLM more context to retrieve from. While this does make the generation more accurate, hallucinations are still a threat.

Robust Testing: Regularly testing AI models against new and diverse scenarios ensures models are fine-tuned and remain accurate and up-to-date.

Encourage proof: Users should be encouraged to verify AI-generated information, fostering a healthy skepticism towards AI responses.
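The guardrail idea above can be illustrated with a stripped-down groundedness check: before returning an answer, measure how much of it is actually supported by the retrieved context, and block responses that introduce too many unsupported claims. This is a minimal sketch, not how any particular guardrail product works; the word-overlap metric, the stopword list, and the 0.5 threshold are all assumptions chosen for the demo.

```python
import re

STOPWORDS = {"the", "a", "an", "is", "are", "was", "were", "in", "on", "of", "to", "and"}

def content_words(text: str) -> set[str]:
    """Lowercased words with common stopwords removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def grounded_fraction(answer: str, context: str) -> float:
    """Fraction of the answer's content words that appear in the context."""
    aw = content_words(answer)
    if not aw:
        return 1.0
    return len(aw & content_words(context)) / len(aw)

def guardrail(answer: str, context: str, threshold: float = 0.5) -> str:
    """Block answers that are mostly unsupported by the retrieved context."""
    if grounded_fraction(answer, context) < threshold:
        return "Response blocked: answer not supported by retrieved context."
    return answer

context = "Refunds are available within 30 days of purchase."
print(guardrail("Refunds are available within 30 days.", context))   # passes
print(guardrail("We offer lifetime refunds and free flights.", context))  # blocked
```

Production guardrails use far more sophisticated checks (entailment models, policy classifiers, PII detectors), but the principle is the same: validate every response against its evidence before it reaches the user.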

Final thoughts

AI hallucinations present a significant challenge, not just for casual users but for technology leaders striving to make generative AI reliable and trustworthy. Solutions like Aporia Guardrails are key to ensuring AI applications remain accurate, enhancing both user trust and the overall AI experience. By understanding and addressing the causes of AI hallucinations, we can pave the way for more dependable and ethical AI applications.

Get a live demo and see Aporia Guardrails in action. 


FAQ

What are AI Hallucinations?

AI hallucinations are when AI systems, such as chatbots, generate responses that are inaccurate or completely fabricated. This happens because AI tools like ChatGPT learn to predict the words that best fit your prompt, but they don’t really reason logically or critically about whether those words are true. The result is often inaccurate responses, confusion, and misinformation. Essentially, hallucinations are a persistent bug in generative AI.

Why are AI hallucinations a problem?

They spread false info, causing confusion or misinformation. In critical areas like healthcare, this can lead to serious issues, reducing AI reliability and trust.

How can you mitigate AI hallucinations?

1) Add proactive guardrails to ensure safety and trust.
2) Improve training data quality and diversity.
3) Use strong validation and error checks.
4) Update AI models with new, correct information.
5) Use user feedback for improvements.

What is an example of generative AI hallucinations?

Generative AI hallucination can occur when an AI model asserts something its data doesn’t support. For instance, consider an AI weather assistant confidently forecasting rain for tomorrow even though none of its underlying forecast data indicates rain.

 
