LLMs Will Always Generate Plausible yet Incorrect Output and We Have to Make Peace With it: A Paper Review
Large language models have grown tremendously across all domains over the past few years. Researchers are actively fine-tuning these models, scaling up parameters, context and token lengths, and developing new architectures to improve their performance. But, unfortunately, as we make advancements in LLMs, we also run into various issues and limitations of these models. One of the biggest limitations of LLMs is generating plausible yet incorrect output: hallucination. A model hallucinates when its output is not 100% based on facts, not 100% correct, and not even 100% aligned with its training data or with information retrieved through retrieval-augmented generation (RAG). Various techniques have been applied over time, but hallucination has not been fully overcome. The paper I am going to cover, "LLMs Will Always Hallucinate, and We Need to Live With This", makes some interesting claims and proves them mathematically.
The paper states that hallucinations in language models are not just occasional errors but an inherent feature of these models. It also explains that hallucination originates from the fundamental mathematical and logical structure of LLMs. It is, therefore, impossible to eliminate hallucination through improved architectures, dataset enhancements, or fact-checking mechanisms. The paper draws its analysis from one of the remarkable theorems of mathematics, Gödel's First Incompleteness Theorem, and from the undecidability of problems such as the halting, emptiness, and acceptance problems. I will demonstrate this later in this blog.
First, the paper explains the fundamental concept of how LLMs generate output based on conditional probability: the model learns linguistic patterns and predicts the conditional probability of the next token given the previous tokens.
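To make this concrete, here is a minimal sketch of next-token prediction (not code from the paper), assuming a toy four-word vocabulary and made-up logits: softmax turns the model's raw scores into a conditional distribution over the next token, from which we can decode greedily or sample.

```python
import numpy as np

# Toy vocabulary and hypothetical logits for the next token given the prompt
# "The cat sat on the". These numbers are made up purely for illustration.
vocab = ["mat", "roof", "moon", "sofa"]
logits = np.array([3.1, 1.4, 0.2, 2.0])

# Softmax turns raw scores into a conditional distribution P(next token | previous tokens).
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# Greedy decoding picks the most probable token; sampling draws from the distribution.
print(dict(zip(vocab, probs.round(3))))
print("greedy next token:", vocab[int(np.argmax(probs))])
print("sampled next token:", np.random.choice(vocab, p=probs))
```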
The paper also reviews various strategies for improving LLM performance, such as rotary positional encodings and relative position encodings in the embedding stage, since absolute positional encoding does not generalize well to longer input sequences. Alongside these, it discusses architectures such as linear RNNs, Transformers, and Mamba.
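As a hedged illustration of the rotary idea (again, not code from the paper), the sketch below rotates pairs of dimensions of a query and a key vector by position-dependent angles; after the rotation, their dot product depends only on the relative offset between positions, which is why RoPE behaves like a relative position encoding.

```python
import numpy as np

def rope(x, position, base=10000.0):
    """Minimal rotary positional encoding for one vector at one position.

    x: 1-D array with even dimension d; each pair (x[2i], x[2i+1]) is rotated
    by an angle position * theta_i, where theta_i = base**(-2i/d).
    """
    d = x.shape[0]
    half = d // 2
    theta = base ** (-np.arange(half) * 2.0 / d)   # one frequency per pair
    angles = position * theta
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[0::2], x[1::2]
    out = np.empty_like(x)
    out[0::2] = x1 * cos - x2 * sin                # 2-D rotation of each pair
    out[1::2] = x1 * sin + x2 * cos
    return out

q = np.random.randn(8)
k = np.random.randn(8)
# After rotation, the dot product depends only on the relative offset (5 - 2),
# which is the sense in which RoPE encodes relative position.
print(rope(q, position=5) @ rope(k, position=2))
```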
The figure above compares the performance of Mamba and other language models on The Pile benchmark dataset; Mamba shows comparable or slightly superior performance to the other models across various metrics.
The paper also discusses parameter-efficient fine-tuning. Traditional fine-tuning updates all model parameters, which is costly for large models. Parameter-Efficient Fine-Tuning (PEFT) reduces computational expense by updating only a small number of parameters, helping models adapt to specific tasks. Adapters are a key PEFT method, introducing small trainable modules that are fine-tuned instead of the entire model. This approach preserves accuracy while lowering costs.
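Below is a minimal sketch of the adapter idea, assuming PyTorch and a hypothetical frozen linear layer standing in for a pretrained transformer block (this is not the paper's code): only the small bottleneck adapter is trained while the base parameters stay frozen.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project, residual add."""
    def __init__(self, d_model: int, bottleneck: int = 16):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)
        self.act = nn.ReLU()

    def forward(self, hidden):
        return hidden + self.up(self.act(self.down(hidden)))

# Hypothetical frozen base layer standing in for a pretrained transformer block.
base_layer = nn.Linear(512, 512)
for p in base_layer.parameters():
    p.requires_grad = False          # the base model stays frozen

adapter = Adapter(d_model=512)       # only these few parameters are trained
x = torch.randn(4, 512)
out = adapter(base_layer(x))
trainable = sum(p.numel() for p in adapter.parameters())
total = trainable + sum(p.numel() for p in base_layer.parameters())
print(f"trainable params: {trainable} of {total}")
```

With these preliminaries in place, the paper turns to the ways LLMs hallucinate and the techniques used to detect or mitigate hallucination: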
- Factual incorrectness: occurs when LLMs provide incorrect information based on existing data, but without inventing new, non-existent details.
- Misinterpretation: occurs when LLMs fail to correctly understand input or context, leading to inaccurate responses.
- Needle in a Haystack: refers to the challenge LLMs face in retrieving a specific, correct piece of information from a vast corpus.
- Fabrications: involve the creation of entirely false statements that have no basis in the model’s training data.
- Chain-of-Thought (CoT) prompting: encourages LLMs to make their reasoning process explicit, which can reduce logical inconsistencies and hallucinations. According to the paper, however, even though CoT makes reasoning explicit and helps reduce logical errors, it still does not eliminate hallucination entirely.
- Self-consistency: generates multiple reasoning paths using CoT and then selects the most consistent answer. The model is prompted with CoT and produces several outputs; each reasoning path is paired with its final answer, and the most consistent (most frequent) answer is chosen (a toy majority-vote sketch follows after this list).
- Uncertainty quantification: how uncertainty is quantified depends on the model used; the goal is to identify instances where the model might be hallucinating (see the entropy/ensemble sketch after this list).
- Softmax neural networks: classifiers that categorize data and predict outputs via the softmax function, which produces a probability distribution over possible outputs. This distribution indicates how confident the model is in its prediction and therefore supports uncertainty quantification.
- Bayesian neural networks: treat weights as random variables that follow a probability distribution. A posterior distribution over the weights is inferred given the data. To quantify uncertainty, the network draws samples of weights from this posterior to create multiple models; the final output is the average of their predictions, and the spread across those predictions quantifies the uncertainty.
- Ensemble neural networks: similar in spirit to Bayesian neural networks in that they also rely on a set of models; however, the members are independent of each other and make their predictions separately from the rest of the set.
- Faithfulness: refers to the extent to which an explanation accurately reflects a model's reasoning process. It can be measured with Shapley values, which quantify how much each feature contributed to the model's prediction; features with high influence on the output receive high Shapley values. This helps us understand why the model reaches certain specific outputs.
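As a rough illustration of self-consistency (not the paper's code), the sketch below uses a hypothetical sample_cot_answer function as a stand-in for drawing one chain-of-thought sample from an LLM, then takes a majority vote over the final answers:

```python
from collections import Counter
import random

def sample_cot_answer(prompt: str) -> str:
    """Hypothetical stand-in for one CoT sample from an LLM.

    In practice this would call a model with a chain-of-thought prompt at a
    non-zero temperature and parse out the final answer; here it just returns
    a made-up answer so the voting logic is runnable.
    """
    return random.choice(["42", "42", "42", "41"])  # fabricated toy answers

def self_consistency(prompt: str, n_samples: int = 10) -> str:
    # Generate several independent reasoning paths and keep only the final answers.
    answers = [sample_cot_answer(prompt) for _ in range(n_samples)]
    # The most frequent answer across paths is taken as the "most consistent" one.
    return Counter(answers).most_common(1)[0][0]

print(self_consistency("What is 6 * 7? Think step by step."))
```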
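And here is a minimal, hedged sketch of uncertainty quantification via the softmax distribution and an ensemble, using made-up logits from three hypothetical ensemble members: high entropy of the averaged prediction, or disagreement between members, flags inputs where the model may be hallucinating.

```python
import numpy as np

def softmax(logits):
    z = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

def predictive_entropy(probs):
    """Entropy of a probability distribution: higher entropy = less certain model."""
    return float(-(probs * np.log(probs + 1e-12)).sum())

# Hypothetical logits from three ensemble members for the same input.
member_logits = np.array([[2.0, 0.1, 0.3],
                          [1.8, 0.2, 0.5],
                          [0.4, 0.3, 1.9]])   # this member disagrees

member_probs = softmax(member_logits)
mean_probs = member_probs.mean(axis=0)        # averaged prediction of the ensemble

print("entropy of averaged prediction:", predictive_entropy(mean_probs))
print("per-member argmax:", member_probs.argmax(axis=1))  # disagreement signals uncertainty
```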
The paper shows that no matter which of the above techniques is applied, the fact remains the same: LLMs will hallucinate, and hallucinations can never be fully eliminated. At every one of these stages, LLMs are susceptible to hallucination: training can never be 100% complete; intent classification and information retrieval are undecidable; output generation is necessarily susceptible to hallucination; and post-generation fact checking can never be 100% accurate. Irrespective of how advanced our architectures, training datasets, or fact-checking techniques may be, hallucinations are still there.
NOTE: Please check the proofs in the paper; they are easy to follow.




