Fine-tune vs Few-Shot

Vibudh Singh
1 min read · Sep 18, 2023


I found this research post from fast.ai on fine-tuning LLMs very interesting. Essentially, it shows that LLMs learn very fast compared to traditional neural networks: in its experiments, learning largely stops after the first or second epoch. One possible reason is that the loss surface we navigate with stochastic gradient descent is much smoother for LLMs than for traditional neural networks.
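To make the "learning stops after an epoch or two" observation concrete, here is a minimal sketch of the kind of experiment you could run yourself: fine-tune a small causal LM for a few epochs and log the validation loss at each epoch boundary. Everything in it is an assumption of mine, not the setup from the fast.ai post: the gpt2 checkpoint, the train.txt/val.txt files, and the Hugging Face Trainer configuration are all illustrative choices.

```python
# Minimal fine-tuning sketch (assumed setup): train a small causal LM for a few
# epochs and watch how quickly the validation loss stops improving.
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from datasets import load_dataset

model_name = "gpt2"  # stand-in checkpoint; any causal LM would do
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical local text files with your fine-tuning examples
dataset = load_dataset("text", data_files={"train": "train.txt",
                                           "validation": "val.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

args = TrainingArguments(
    output_dir="out",
    num_train_epochs=3,            # more than enough if gains stop after epoch 1-2
    evaluation_strategy="epoch",   # report validation loss at every epoch boundary
    logging_strategy="epoch",
    per_device_train_batch_size=4,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)

trainer.train()  # watch eval_loss: it typically drops sharply in epoch 1, then flattens
```

If the fast.ai observation holds for your data, the per-epoch eval_loss printed here should improve sharply once and then barely move.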

Based on my own experiments with few-shot examples and with fine-tuning on complex sentence patterns, I have seen very similar results: the performance metrics change quite a bit after introducing only a few examples.
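For contrast, here is the few-shot side of that comparison: rather than updating weights, a handful of demonstrations are placed directly in the prompt. The sentence-complexity task and the example sentences below are hypothetical, just to show how the quality and diversity of those few examples end up baked into every request.

```python
# Few-shot prompting sketch (hypothetical task and examples): the demonstrations
# are prepended to the prompt instead of being used for fine-tuning.
few_shot_examples = [
    ("The report, which was late, angered the board.", "complex"),
    ("The dog barked.", "simple"),
    ("Although it rained, the match, which had been delayed twice, went ahead.", "complex"),
]

def build_prompt(sentence: str) -> str:
    """Assemble a few-shot classification prompt from the demonstrations above."""
    lines = ["Classify each sentence as 'simple' or 'complex'.\n"]
    for text, label in few_shot_examples:
        lines.append(f"Sentence: {text}\nLabel: {label}\n")
    lines.append(f"Sentence: {sentence}\nLabel:")
    return "\n".join(lines)

print(build_prompt("The engineer who designed the bridge retired last year."))
```

In both setups, the same few examples dominate the model's behaviour, which is exactly why their quality matters so much.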

Takeaway:
Be very careful about the examples you expose your LLM to: they should be high quality and diverse.

Reference: https://www.fast.ai/posts/2023-09-04-learning-jumps/

My Medium blog features short, concise, yet insightful articles exploring the latest topics in Artificial Intelligence (AI), Large Language Models (LLMs), Generative AI, and Natural Language Processing (NLP). Stay updated by subscribing for a regular dose of cutting-edge knowledge.

Written by Vibudh Singh

Lead Machine Learning Engineer at S&P Global
