Posts

How to Correctly Initialize a Neural Network: Mechanistic Interpretability Part 1

Research in artificial intelligence is evolving very rapidly, with new models and more advanced architectures appearing all the time. But at the same time, research dynamics are shifting toward analyzing and understanding the hidden bugs in training neural networks. This line of research tries to understand neural networks by breaking them down into smaller, understandable parts. The goal is to understand each smaller part and how these parts interact with each other to make up the entire behavior of a neural network. It also identifies hidden bugs during initialization, training, and optimization; bugs which we might unconsciously ignore and which can cause problems in network training and pattern learning. This whole field is called "Mechanistic Interpretability". So, basically, to get better output now, you don't just have to change your input, the number of parameters, hyperparameters, embedding dimensions, activation functions, or anything else. You can al...
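As a small taste of the kind of fix the post builds toward, here is a minimal sketch in NumPy of one standard initialization remedy, Kaiming (He) scaling. The layer sizes and the comparison against naive unit-variance weights are purely illustrative assumptions, not necessarily the post's exact setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def kaiming_init(fan_in: int, fan_out: int) -> np.ndarray:
    """Kaiming (He) initialization for ReLU layers: scale weights by
    sqrt(2 / fan_in) so activation variance stays roughly constant
    from layer to layer."""
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

# Hypothetical layer sizes, purely illustrative.
W1 = kaiming_init(784, 256)
W2 = kaiming_init(256, 10)

# Naive init for comparison: unit-variance weights blow up activations.
x = rng.normal(size=(32, 784))
h_good = np.maximum(x @ W1, 0.0)
h_bad = np.maximum(x @ rng.normal(size=(784, 256)), 0.0)
print(h_good.std(), h_bad.std())
```

Running this shows the naively initialized activations coming out roughly twenty times larger, which is exactly the sort of silent bug that compounds layer by layer during training.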

LLMs Will Always Generate Plausible yet Incorrect Output and We Have to Make Peace With It: A Paper Review

Large language models have seen tremendous growth across all domains over the past few years. Researchers are actively engaged in fine-tuning these models, increasing parameter counts and context and token lengths, as well as developing new architectures for better performance. But, unfortunately, as we make advancements in LLMs, we also run into various issues and limitations of these large language models. One of the biggest limitations of LLMs is generating plausible yet incorrect output: hallucination. It means that a language model gives an output that is not 100% based on facts, not 100% correct, and not even 100% aligned with its training data or with retrieved information (as in Retrieval-Augmented Generation, RAG). Over the course of time, various techniques have been applied, but hallucination hasn't been dealt with completely. And the paper I am going to cover, "LLMs Will Always Hallucinate, and We Need to Live With This", makes some interesting claims and proves them...

Training a Basic Language Model (Trigram Language Model)

Large Language Models (LLMs) have, in my opinion, revolutionized artificial intelligence. When I first explored OpenAI's GPT-3.5 Turbo two years back, I was stunned by how the model works. Then, with the passage of time, other LLMs like Google's Gemini, Anthropic's Claude, Microsoft's Copilot, and many more amazed us with their increased context lengths, increased parameters, and accuracy, as well as RAG in these models. This advancement sparked a curiosity in me: how do these giant language models work in the backend? How do they predict things? What's inside there? What kind of mathematics and statistics is involved? So, to quench my thirst, I decided to go deep down to the very basics and understand from there how language models are built, how the mathematics is involved, how to train them, and how to increase their accuracy so they predict accurately. In this blog post, I will train a trigram language model. Trigram means three ...
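Before the full walkthrough, here is a minimal, self-contained sketch of the idea: count trigrams, turn the counts into conditional probabilities P(w3 | w1, w2), and sample from them. The toy corpus is an illustrative stand-in for real training text, not the data used in the post:

```python
from collections import Counter, defaultdict
import random

# Toy corpus standing in for real training text (purely illustrative).
corpus = "the cat sat on the mat the cat ate the rat".split()

# Count how often each word follows each pair of preceding words.
counts = defaultdict(Counter)
for w1, w2, w3 in zip(corpus, corpus[1:], corpus[2:]):
    counts[(w1, w2)][w3] += 1

def next_word(w1: str, w2: str) -> str:
    """Sample the next word from the maximum-likelihood trigram estimate
    P(w3 | w1, w2) = count(w1, w2, w3) / count(w1, w2)."""
    followers = counts[(w1, w2)]
    if not followers:  # dead end: this bigram never appears mid-corpus
        return random.choice(corpus)
    words, freqs = zip(*followers.items())
    return random.choices(words, weights=freqs)[0]

# Generate a short continuation from a seed bigram.
w1, w2 = "the", "cat"
out = [w1, w2]
for _ in range(5):
    w3 = next_word(w1, w2)
    out.append(w3)
    w1, w2 = w2, w3
print(" ".join(out))
```

The whole "model" is just a table of counts; everything else in the post is about estimating those conditional probabilities well.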

Analysis of the Paper: RAGE Against the Machine: Retrieval-Augmented LLM Explanations. A Personal Take.

In recent years, we've witnessed remarkable advancements, particularly with large language models like GPT, Gemini, Claude, and others. Research scientists are now training increasingly complex models, extending their context lengths, and equipping them with billions or even trillions of parameters, boosting their computational capabilities. However, as these models improve rapidly, issues arise regarding their outputs. Researchers are now focusing on quantifying the accuracy, timeliness, and reasoning behind these outputs, as well as troubleshooting their limitations. In my last blog post, I discussed a paper exploring whether we can generalize LLMs, rely on their capabilities, and avoid falling into the trap of fixed effects. One of the key advancements that sets modern Large Language Models (LLMs) apart from previous iterations is the introduction of Retrieval-Augmented Generat...
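For readers new to the idea, here is a minimal sketch of the retrieve-then-generate pattern behind Retrieval-Augmented Generation. The documents, the word-overlap scoring, and the prompt template are all illustrative assumptions; production systems use embedding similarity and a real model call:

```python
# Minimal sketch of the retrieve-then-generate pattern behind RAG.
# Corpus, scoring, and prompt template are illustrative stand-ins.

documents = [
    "RAG augments a language model with retrieved passages.",
    "Trigram models predict a word from the two preceding words.",
    "Kaiming initialization scales weights by sqrt(2 / fan_in).",
]

def score(query: str, doc: str) -> int:
    """Crude relevance score: count of shared lowercase words."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents with the highest overlap score."""
    return sorted(documents, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str) -> str:
    """Prepend retrieved context so the model answers from evidence."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

print(build_prompt("How does RAG help a language model?"))
```

The point of the pattern is that the model is asked to answer from retrieved evidence rather than from its parameters alone, which is exactly what the paper's explanation method builds on.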