Tokenization & Embeddings in Transformers 🧠

Learn how tokenization and embeddings prepare data for self-attention in transformer models, starting with simple sentences like 'the quick brown fox.'

Stephen Blum
2.0K views • Jul 25, 2024

About this video

Before self-attention in the transformer model, there's a phase called data preparation. Let's say we have a simple sentence like "the quick brown fox." First, we tokenize the sentence, splitting it into individual words; each word becomes a token.
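As a minimal sketch, here is that first step in Python. Whitespace splitting matches the video's one-word-per-token example; production transformers usually use subword tokenizers such as BPE or WordPiece instead.

```python
# Minimal word-level tokenization, as in the video's example.
# Real models typically use subword tokenizers (BPE, WordPiece),
# but whitespace splitting illustrates the idea.

sentence = "the quick brown fox"
tokens = sentence.split()  # split on whitespace: one token per word

print(tokens)  # ['the', 'quick', 'brown', 'fox']
```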

Each token is then mapped to an integer ID via a vocabulary lookup. For example, "the" might be ID 200. That ID is then used to index into a word-embedding model (word2vec is a classic example) that captures the relationships between words.
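Below is a sketch of the vocabulary lookup, assuming a toy hand-built vocabulary; aside from "the" → 200, which comes from the video's example, the IDs here are invented for illustration.

```python
# A toy vocabulary mapping tokens to integer IDs. The numbers are
# made up for illustration; "the" -> 200 follows the video's example.
vocab = {"the": 200, "quick": 315, "brown": 407, "fox": 512}

tokens = ["the", "quick", "brown", "fox"]
token_ids = [vocab[t] for t in tokens]  # look each token up in the vocabulary

print(token_ids)  # [200, 315, 407, 512]
```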

The word embedding turns each token into an array of floating-point numbers, often 512 of them, describing its attributes. These arrays are then fed, conceptually as three copies, into the query, key, and value paths of the self-attention mechanism; in practice each copy passes through its own learned linear projection.
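Here is a minimal NumPy sketch of the embedding lookup and the query/key/value step, assuming random (untrained) weights; in a real model the embedding table and the projection matrices are learned. The dimension 512 matches the video's example.

```python
import numpy as np

# Assumption: random weights stand in for a trained embedding table.
vocab_size, d_model = 1000, 512
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(vocab_size, d_model))

token_ids = [200, 315, 407, 512]   # IDs from the vocabulary lookup
x = embedding_table[token_ids]     # shape: (4 tokens, 512 dims)

# The same embeddings feed the query, key, and value paths.
# Conceptually they start as copies; each copy is then multiplied
# by its own learned projection matrix (W_q, W_k, W_v).
W_q = rng.normal(size=(d_model, d_model))
W_k = rng.normal(size=(d_model, d_model))
W_v = rng.normal(size=(d_model, d_model))

Q, K, V = x @ W_q, x @ W_k, x @ W_v  # each has shape (4, 512)
print(Q.shape, K.shape, V.shape)
```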

Video Information

Views: 2.0K
Likes: 47
Duration: 1:00
Published: Jul 25, 2024
