Optimal Text Chunking for RAG

Learn the best method to chunk text for RAG. Try Brilliant free for 30 days and get 20% off an annual premium subscription!

Adam Lucek
42.3K views • Dec 9, 2024

About this video

To try everything Brilliant has to offer, free, for a full 30 days, visit https://brilliant.org/AdamLucek/ You'll also get 20% off an annual premium subscription!

Resources:
Chunking Notebook: https://github.com/ALucek/chunking-strategies
ChromaDB Technical Report: https://research.trychroma.com/evaluating-chunking
ChromaDB Report Repo: https://github.com/brandonstarxel/chunking_evaluation
OpenAI Token Visualizer: https://platform.openai.com/tokenizer
Greg Kamradt 5 Levels of Text Splitting: https://github.com/FullStackRetrieval-com/RetrievalTutorials/blob/main/tutorials/LevelsOfTextSplitting/5_Levels_Of_Text_Splitting.ipynb
Jaccard Index: https://en.wikipedia.org/wiki/Jaccard_index

Chapters:
00:00 - Background on Text Chunking
02:28 - Brilliant!
03:47 - Character Text Splitting
06:28 - Token Text Splitting
10:26 - Recursive Character/Token Splitting
16:07 - Kamradt & Modified Semantic Chunking
20:43 - Cluster Semantic Chunking
24:46 - LLM Semantic Chunking
27:56 - Chunking Metrics & Comparison
30:00 - Overall Findings

#ai #programming #datascience

This video is sponsored by Brilliant

Video Information

Views: 42.3K
Likes: 1.7K
Duration: 33:17
Published: Dec 9, 2024
