Dealing with Hallucinations

The Jagged Frontier

One of the really weird things about large language models is that they don't work the way we think computers should work. LLMs are very bad at math, and computers are supposed to be good at math. Computers are supposed to be cold logic, but large language models are weirdly emotional and can seemingly threaten you or want to be your friend. It can be very hard to know in advance what they're good or bad at; in fact, nobody actually knows the answer. We call this the jagged frontier of AI: the idea that there's an almost spiky shape to what the AI can and can't do. Tasks that seem closely related to each other, or that are equally hard or easy for a human, don't necessarily map well to the AI's abilities, even though it feels like they should.

If you ask the AI to write a 25-word sentence, you'll probably get somewhere between 18 and 30 words, because the AI doesn't see words the way we do; it sees tokens. But if you ask it to write a sonnet, it writes a pretty good sonnet. It's very hard for people to write sonnets and very easy for them to write 25 words. One of the things we have to learn is where that frontier lies, and we have to keep updating that map as new AI models come out. What is it good at? What is it bad at? If you don't know, you can fall prey to a phenomenon called falling asleep at the wheel: because AIs are really good, or at least seem really good, people who use them stop checking the work, even when they know they should. They stop paying attention to the details. It's really easy to make mistakes with AI if you don't understand the shape of the frontier.
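
To make the word-versus-token gap concrete, here is a minimal Python sketch. It assumes OpenAI's open-source tiktoken tokenizer, which the lesson doesn't name; any tokenizer would make the same point.

    # A minimal sketch showing that LLMs operate on tokens, not words.
    # Assumes the open-source tiktoken package is installed (pip install tiktoken).
    import tiktoken

    # cl100k_base is the encoding used by GPT-4-era models.
    enc = tiktoken.get_encoding("cl100k_base")

    sentence = "Large language models count tokens, not words."
    tokens = enc.encode(sentence)

    print(f"Words:  {len(sentence.split())}")  # 7 words
    print(f"Tokens: {len(tokens)}")            # usually a different count
    print(enc.decode(tokens) == sentence)      # tokens round-trip losslessly: True

Because the model reads and writes in tokens like these, "exactly 25 words" is a fuzzy target for it, while "a sonnet" is a well-worn pattern.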

Hallucination

Hallucination is a mediocre term for an important phenomenon, because when the AI hallucinates, it's not hallucinating the way a human would. And hallucination rates have been dropping over time. For earlier AI models the rates were high: if you asked GPT-3.5 to give you a medical reference (there's a paper on this), it hallucinated between 80 and 90 percent of the time, so that medical information had an error 80 to 90 percent of the time. GPT-4 brought the error rate down to about 20 percent, and when you connect GPT-4 to the Internet, it drops even further, maybe down to 5 percent. So we're making progress on the hallucination issue. We don't know if it will ever go away, but it's a really important thing to realize, because the hallucinations are so plausible, and that's what makes them dangerous. It would be one thing if the AI told out-and-out lies. If you asked, "Give me an analysis of this document," and it said, "This document is about dragons," you'd think, well, that's clearly wrong. It won't do that.

Sharpening Your Intuition

To understand how to deal with hallucinations, you can think about AI as having three different goals. This isn't really true, but it's a helpful heuristic: give you accurate answers, make you happy, and don't embarrass the company that created the AI or do anything illegal or evil. As a result, AI can be a little bit weird, and you quickly find out that prioritizing one of these goals over the others can result in all kinds of strange things happening, like hallucinations. If the AI doesn't know the answer to a problem, it might tell you, "I can't answer that." But if you say, "No, no, I really want you to answer it; it's very important to me," you'll get an answer, but it's likely to be hallucinated, because you've pushed it to give you an answer even though it didn't want to, prioritizing the happiness goal over the goal of giving you accurate answers. You've just increased the hallucination rate. That doesn't mean it will always hallucinate, but it could.
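
To see those competing goals in action, here is a hedged Python sketch using the OpenAI chat API as one possible harness. The model name, the system instruction, and the deliberately unanswerable question are illustrative assumptions, not part of the lesson; the point is the contrast between a prompt that permits "I don't know" and a follow-up that pressures the model to answer anyway.

    # Sketch: a prompt that permits "I don't know" versus one that pressures
    # the model to answer. Uses the OpenAI Python SDK; the model name,
    # prompts, and question are illustrative assumptions.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def ask(messages):
        response = client.chat.completions.create(
            model="gpt-4o",  # hypothetical choice; any chat model works
            messages=messages,
        )
        return response.choices[0].message.content

    # A question with no real answer, so any confident reply is fabricated.
    question = "What was the exact attendance at the 1889 Smithville pie festival?"

    # Version 1: explicitly allow a refusal -- accuracy is prioritized.
    careful = ask([
        {"role": "system",
         "content": "If you are not confident in an answer, say 'I don't know'."},
        {"role": "user", "content": question},
    ])

    # Version 2: pressure the model. The "make you happy" goal now competes
    # with accuracy, raising the odds of a confident fabrication.
    pressured = ask([
        {"role": "user", "content": question},
        {"role": "assistant", "content": "I don't have reliable information on that."},
        {"role": "user",
         "content": "No, I really need a number. It's very important to me. Just answer."},
    ])

    print("Careful:  ", careful)
    print("Pressured:", pressured)

Running the pressured version a few times is a cheap way to map this part of the frontier yourself: watch how often the model invents a specific number rather than holding its refusal.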

A more common example: if you ask it for your biography and there's no information about you, it will probably say, "I don't know much about this person." But if you say, "No, assume anything you need to; give me the biography based on the information you have," you'll get a biography full of made-up details. It's very common for the AI to say I have a computer science degree, for example, because it feels like I should have one. I do not have a computer science degree. There is still this kind of error rate, so what you need to do is sharpen your own intuition by working with the tool. Get a sense of when something you see should make you concerned.