StatQuest: Mastering t-SNE for Visualizing Complex Data 📊

Learn how t-SNE turns a complex, high-dimensional dataset into a graph that is easy to read. This easy-to-understand guide explains how this popular dimensionality reduction technique works.

StatQuest with Josh Starmer
534.7K views • Sep 18, 2017

About this video

t-SNE is a popular method for making an easy-to-read graph from a complex dataset, but not many people know how it works. Here's the inside scoop.

Here's how to create a t-SNE graph in R (this is copied from the help file for Rtsne)...

library("Rtsne")
iris_unique <- unique(iris) # Remove duplicates
iris_matrix <- as.matrix(iris_unique[,1:4])
set.seed(42) # Set a seed if you want reproducible results
tsne_out <- Rtsne(iris_matrix) # Run TSNE

# Show the objects in the 2D tsne representation
plot(tsne_out$Y,col=iris_unique$Species)

This StatQuest is based on the original t-SNE manuscript, and it's not super hard to read (especially if you understand the general idea of how it works): https://lvdmaaten.github.io/publications/papers/JMLR_2008.pdf

For a complete index of all the StatQuest videos, check out:
https://statquest.org/video-index/

If you'd like to support StatQuest, please consider...

Patreon: https://www.patreon.com/statquest
...or...
YouTube Membership: https://www.youtube.com/channel/UCtYLUTtgS3k1Fg4y5tAhLbw/join

...buying one of my books, a study guide, a t-shirt or hoodie, or a song from the StatQuest store...
https://statquest.org/statquest-store/

...or just donating to StatQuest!
https://www.paypal.me/statquest

Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on Twitter:
https://twitter.com/joshuastarmer

0:00 Awesome song and introduction
1:19 Overview of what t-SNE does
2:24 Overview of how t-SNE works
4:12 Step 1: Determine high-dimensional similarities
9:26 Step 2: Determine low-dimensional similarities
10:33 Step 3: Move points in low-d
11:05 Why the t-distribution is used instead of the normal distribution
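
The three steps listed above can be sketched in a few lines of base R. This is only a rough illustration of the general idea, not the code that Rtsne runs. It assumes a single fixed `sigma` for every point (the real method tunes the width of the normal curve for each point), and it uses plain gradient descent with a made-up `step_size` and iteration count. The names `low_d_similarities`, `kl`, `step_size` and `tail_ratio` are hypothetical and exist only for this example.

```r
# Simplified t-SNE sketch in base R (illustrative only; not the Rtsne implementation).
set.seed(42)
X <- as.matrix(unique(iris)[, 1:4])   # same data as the Rtsne example
n <- nrow(X)

# Step 1: high-dimensional similarities from a normal curve.
# Assumption: one fixed sigma for all points; real t-SNE picks one per point.
sigma <- 1
D2 <- as.matrix(dist(X))^2            # squared Euclidean distances
P <- exp(-D2 / (2 * sigma^2))
diag(P) <- 0                          # a point is not its own neighbor
P <- P / rowSums(P)                   # scale each row so it sums to 1
P <- (P + t(P)) / (2 * n)             # symmetrize; the whole matrix sums to 1

# Step 2: low-dimensional similarities from a t-distribution (1 degree of freedom).
low_d_similarities <- function(Y) {
  W <- 1 / (1 + as.matrix(dist(Y))^2) # unscaled t-distribution similarities
  diag(W) <- 0
  list(W = W, Q = W / sum(W))         # Q is scaled so the whole matrix sums to 1
}

# How different the two similarity matrices are (Kullback-Leibler divergence).
kl <- function(P, Q) sum(P[P > 0] * log(P[P > 0] / Q[P > 0]))

# Step 3: start from random 2D positions and move the points a little at a
# time so that the low-d similarities look more like the high-d ones.
Y <- matrix(rnorm(n * 2, sd = 1e-2), ncol = 2)
kl_start <- kl(P, low_d_similarities(Y)$Q)
step_size <- 5                        # hypothetical value, chosen to be small
for (iter in 1:500) {
  s <- low_d_similarities(Y)
  A <- (P - s$Q) * s$W
  grad <- 4 * (rowSums(A) * Y - A %*% Y)  # gradient of the KL divergence
  Y <- Y - step_size * grad
}
kl_end <- kl(P, low_d_similarities(Y)$Q)

# Why the t-distribution: far from the center it is much taller than the
# normal curve, so points that are not similar can sit far apart in low-d.
tail_ratio <- dt(3, df = 1) / dnorm(3)
```

To look at the result, `plot(Y, col = unique(iris)$Species)` draws the 2D layout, and comparing `kl_start` with `kl_end` shows how much closer the low-dimensional similarities got to the high-dimensional ones. The real method adds per-point widths and optimization tricks described in the manuscript, so expect Rtsne to give a cleaner graph than this sketch.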

Corrections:
6:17 I should have said that the blue points have twice the density of the purple points.
7:08 There should be a 0.05 in the denominator, not a 0.5.

#statquest #tsne

Video Information

Views: 534.7K
Likes: 12.6K
Duration: 11:48
Published: Sep 18, 2017
