Real-World Python Machine Learning Tutorial w/ Scikit Learn (sklearn basics, NLP, classifiers, etc)
About this video
Practice your Python Pandas data science skills with problems on StrataScratch!
https://stratascratch.com/?via=keith
In this video we walk through a real-world Python machine learning project using the scikit-learn library. We work our way up to building a model that automatically classifies text as having either a positive or a negative sentiment, using Amazon reviews as our training data. Full video timeline in the comments!
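As a rough illustration of the idea described above (not the code from the video), here is a minimal sketch of a bag-of-words sentiment classifier. The review texts are made up for this example, and MultinomialNB is an assumption chosen for simplicity; the video compares several model families.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Made-up reviews standing in for the Amazon data (assumption, not from the video).
train_texts = [
    "great book, loved it",
    "excellent read, highly recommend",
    "wonderful story and great characters",
    "terrible book, hated it",
    "awful read, waste of money",
    "boring story and bad characters",
]
train_labels = ["POSITIVE", "POSITIVE", "POSITIVE", "NEGATIVE", "NEGATIVE", "NEGATIVE"]

# Bag of words: learn the vocabulary and turn each text into a word-count vector.
vectorizer = CountVectorizer()
train_x = vectorizer.fit_transform(train_texts)

# Fit a simple Naive Bayes classifier on the count vectors.
clf = MultinomialNB()
clf.fit(train_x, train_labels)

# New text must go through transform (not fit_transform) so the vocabulary stays fixed.
test_texts = ["loved it, great read", "hated it, waste of money"]
test_x = vectorizer.transform(test_texts)
predictions = clf.predict(test_x)
print(list(predictions))  # ['POSITIVE', 'NEGATIVE']
```

On real review data the same three steps apply (vectorize, fit, predict), but results depend heavily on how much data is loaded and how balanced the classes are.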
Link to Code & Data:
https://github.com/keithgalli/sklearn
Raw Data download:
http://jmcauley.ucsd.edu/data/amazon/
scikit-learn documentation:
https://scikit-learn.org/stable/documentation.html
Make sure you have scikit-learn installed! To do this, either run "pip install scikit-learn" or use Python through Anaconda.
Join the Python Army to get access to perks!
YouTube - https://www.youtube.com/channel/UCq6XkhO5SZ66N04IcPbqNcw/join
Patreon - https://www.patreon.com/keithgalli
---------------------------
Follow me on social media!
Instagram: https://www.instagram.com/keithgalli/
Twitter: https://twitter.com/keithgalli
To get one of the cool shirts I was wearing:
https://www.instagram.com/pagandvls/
---------------------------
Video outline!
0:00 - What we will be doing!
3:40 - scikit-learn Overview
6:38 - How do we find training data?
9:33 - Download data
11:45 - Load our data into Jupyter Notebook
16:38 - Cleaning our code a bit (building data class)
20:13 - Using Enums
22:50 - Converting text to numerical vectors, bag of words (BOW) explanation
25:45 - Training/Test Split (make sure to "pip install scikit-learn"!)
33:45 - Bag of words in sklearn (CountVectorizer)
40:05 - fit_transform, fit, transform methods
42:05 - Model Selection (SVM, Decision Tree, Naive Bayes, Logistic Regression) & Classification
47:50 - predict method
53:35 - Analysis & Evaluation (using clf.score() method)
56:58 - F1 score
1:01:01 - Improving our model (evenly distributing positive & negative examples and loading in more data)
1:20:36 - Let's see our model in action! (qualitative testing)
1:22:24 - TfidfVectorizer
1:25:40 - GridSearchCV to automatically find the best parameters
1:31:30 - Further NLP improvement opportunities
1:32:50 - Saving our model (Pickle) and reloading it later
1:36:37 - Category Classifier
1:39:14 - Confusion Matrix
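
The later steps in the outline (train/test split, TfidfVectorizer, GridSearchCV, F1 score, confusion matrix, pickling) can be sketched roughly as follows. This is an illustrative sketch under assumptions, not the notebook from the video: the review texts, the parameter grid, and the variable names are all made up here, and the model is pickled to an in-memory bytes object instead of a file to keep the example self-contained.

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import confusion_matrix, f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Made-up, evenly balanced reviews (assumption, not the Amazon data set).
positive = [
    "great book, loved it", "excellent read", "wonderful story",
    "highly recommend this", "amazing characters", "really enjoyed it",
    "fantastic writing", "best purchase this year",
]
negative = [
    "terrible book, hated it", "awful read", "boring story",
    "do not recommend this", "flat characters", "really disliked it",
    "horrible writing", "worst purchase this year",
]
texts = positive + negative
labels = ["POSITIVE"] * len(positive) + ["NEGATIVE"] * len(negative)

# Hold out a quarter of the data; stratify keeps both classes evenly represented.
train_texts, test_texts, train_y, test_y = train_test_split(
    texts, labels, test_size=0.25, random_state=42, stratify=labels
)

# TF-IDF weights word counts so that very common words count for less.
vectorizer = TfidfVectorizer()
train_x = vectorizer.fit_transform(train_texts)
test_x = vectorizer.transform(test_texts)

# Try every combination in the grid with 3-fold cross-validation.
parameters = {"kernel": ("linear", "rbf"), "C": (1, 4, 8, 16, 32)}
search = GridSearchCV(SVC(), parameters, cv=3)
search.fit(train_x, train_y)

# Evaluate the best model: per-class F1 scores and a confusion matrix.
label_order = ["POSITIVE", "NEGATIVE"]
predictions = search.predict(test_x)
scores = f1_score(test_y, predictions, average=None, labels=label_order)
matrix = confusion_matrix(test_y, predictions, labels=label_order)
print(search.best_params_, scores, matrix, sep="\n")

# Save and reload the best model; pickle.dump / pickle.load with a file work the same way.
restored = pickle.loads(pickle.dumps(search.best_estimator_))
restored_predictions = restored.predict(test_x)
```

With a data set this small the scores are not meaningful; the point is only the order of the steps. When saving a model for later use, the fitted vectorizer needs to be saved as well, since new text must be transformed with the same vocabulary.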
---------------------
If you are curious to learn how I make my tutorials, check out this video: https://youtu.be/LEO4igyXbLs
*I use affiliate links on the products that I recommend. I may earn a purchase commission or a referral bonus from the usage of these links.
Video Information
Views: 260.3K
Likes: 7.1K
Duration: 01:40:49
Published: Sep 30, 2019
Quality: HD
Tags and Topics
#Keith Galli #MIT #sklearn #python machine learning #nlp #machine learning project #artificial intelligence #sci kit learn #sci-kit learn #AI #python 3 #jupyter notebook #data science #ML #python data science #model selection #classification #regression #algorithms #sklearn overview #machine learning in python #python programming #programming #advanced #simple #complete #save model #confusion matrix #python plotting #sentiment #natural language processing #project #machine learning