Does Fine Tuning Embedding Models Improve RAG?

248 likes, 5,119 views

Can fine-tuning embedding models improve your RAG application? Yes, and it doesn't have to be complicated. In this video we show how to train a query-only linear adapter on your own RAG data to improve document retrieval accuracy. It is a lightweight approach that can be applied to any embedding model without fully fine-tuning the model itself or re-embedding your knowledge base.
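
To make the "query-only" idea concrete, here is a minimal sketch in plain Python. It assumes the adapter is a single square weight matrix multiplied into the query embedding at search time, while the stored document embeddings stay untouched. The helper names (`apply_adapter`, `retrieve`) and the toy 2-d vectors are hypothetical, and the `swap` matrix only stands in for trained weights; none of this is taken from the linked repo.

```python
import math

def apply_adapter(weight, query_vec):
    """Multiply a weight matrix (list of rows) by a query embedding."""
    return [sum(w * x for w, x in zip(row, query_vec)) for row in weight]

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, doc_vecs, weight, k=1):
    """Adapt the query only, then rank the unchanged document vectors."""
    adapted = apply_adapter(weight, query_vec)
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine(adapted, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]

# Toy data: two document embeddings that are never re-embedded.
doc_vecs = [[1.0, 0.0], [0.0, 1.0]]
query = [0.9, 0.4]

identity = [[1.0, 0.0], [0.0, 1.0]]  # no adapter: baseline behavior
swap = [[0.0, 1.0], [1.0, 0.0]]      # hypothetical stand-in for trained weights

baseline_top = retrieve(query, doc_vecs, identity)
adapted_top = retrieve(query, doc_vecs, swap)
```

Because only the query side changes, the same document index can serve both the baseline and the adapted retriever, which is what makes the approach cheap to try.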

Resources:
GitHub Repo - github.com/ALucek/linear-adapter-embedding
Trained Adapters - huggingface.co/AdamLucek/all-MiniLM-L6-v2-query-on…
Dataset - huggingface.co/datasets/AdamLucek/apple-environmen…
ChromaDB Research - research.trychroma.com/embedding-adapters
Efficient Domain Adaptation of Sentence Embeddings Using Adapters - arxiv.org/pdf/2307.03104
Improving Text Embeddings with Large Language Models - arxiv.org/pdf/2401.00368

Chapters:
00:00 - Introduction
00:39 - What is an Embedding Adapter?
03:04 - Defining our RAG Application
04:30 - Creating a Synthetic Dataset
09:03 - Setting Up Vector Database
11:23 - Evaluating our Model Baseline
14:16 - Training: Context
14:40 - Training: Triplet Margin Loss
16:01 - Training: Random Negative Sampling
17:01 - Training: Linear Layer Explanation
18:59 - Training: Triplet Data Loader
19:44 - Training: Training Script
20:17 - Training: Execution & Hyperparameters
21:22 - Assessment: New Embedding Function
22:04 - Assessment: Evaluating the Adapter
22:40 - Assessment: Metric Interpretation
23:28 - Assessment: Visualization
24:09 - Assessment: Training Data Fitting
25:35 - Closing Thoughts
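
For readers skimming the training chapters, the sketch below shows the standard form of a triplet margin loss paired with random negative sampling, in plain Python. It assumes Euclidean distance and a margin of 1.0; the video's actual distance, margin, and sampling details are not stated in this description, and the function names and toy vectors are hypothetical.

```python
import math
import random

def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0)."""
    return max(
        euclidean(anchor, positive) - euclidean(anchor, negative) + margin,
        0.0,
    )

def sample_negative(positive_idx, num_docs, rng):
    """Pick a random document index other than the known relevant one."""
    candidates = [i for i in range(num_docs) if i != positive_idx]
    return rng.choice(candidates)

# Toy triplet: the anchor would be the adapted query embedding,
# the positive its relevant chunk, the negative a random other chunk.
docs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
query = [0.8, 0.1]
positive_idx = 0

rng = random.Random(0)  # seeded so the sampled index is repeatable
negative_idx = sample_negative(positive_idx, len(docs), rng)
loss = triplet_margin_loss(query, docs[positive_idx], docs[negative_idx])
```

The loss reaches zero once the positive is closer to the anchor than the negative by at least the margin, so training pushes the adapter weights until relevant chunks sit nearer to their queries than randomly drawn ones.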

#ai #datascience #programming