Real-World Python Machine Learning Tutorial w/ Scikit Learn (sklearn basics, NLP, classifiers, etc)

Keith Galli | 260.3K views | 01:40:49

About this video

Practice your Python Pandas data science skills with problems on StrataScratch! https://stratascratch.com/?via=keith

In this video we walk through a real-world Python machine learning project using the scikit-learn library. We work our way to building a model that automatically classifies text as having either a positive or a negative sentiment, using Amazon reviews as our training data. Full video timeline in the comments!

Link to Code & Data: https://github.com/keithgalli/sklearn
Raw Data download: http://jmcauley.ucsd.edu/data/amazon/
scikit-learn documentation: https://scikit-learn.org/stable/documentation.html

Make sure you have scikit-learn installed! To do this, either run "pip install scikit-learn" or use Python through Anaconda.

Join the Python Army to get access to perks!
YouTube - https://www.youtube.com/channel/UCq6XkhO5SZ66N04IcPbqNcw/join
Patreon - https://www.patreon.com/keithgalli

---------------------------
Follow me on social media!
Instagram: https://www.instagram.com/keithgalli/
Twitter: https://twitter.com/keithgalli
To get one of the cool shirts I was wearing: https://www.instagram.com/pagandvls/

---------------------------
Video outline!
0:00 - What we will be doing!
3:40 - Sci-Kit Learn Overview
6:38 - How do we find training data?
9:33 - Download data
11:45 - Load our data into Jupyter Notebook
16:38 - Cleaning our code a bit (building data class)
20:13 - Using Enums
22:50 - Converting text to numerical vectors, bag of words (BOW) explanation
25:45 - Training/Test Split (make sure to "pip install scikit-learn"!)
33:45 - Bag of words in sklearn (CountVectorizer)
40:05 - fit_transform, fit, transform methods
42:05 - Model Selection (SVM, Decision Tree, Naive Bayes, Logistic Regression) & Classification
47:50 - predict method
53:35 - Analysis & Evaluation (using clf.score() method)
56:58 - F1 score
1:01:01 - Improving our model (evenly distributing positive & negative examples and loading in more data)
1:20:36 - Let's see our model in action! (qualitative testing)
1:22:24 - TfidfVectorizer
1:25:40 - GridSearchCV to automatically find the best parameters
1:31:30 - Further NLP improvement opportunities
1:32:50 - Saving our model (Pickle) and reloading it later
1:36:37 - Category Classifier
1:39:14 - Confusion Matrix

---------------------
If you are curious to learn how I make my tutorials, check out this video: https://youtu.be/LEO4igyXbLs

*I use affiliate links on the products that I recommend. I may earn a purchase commission or a referral bonus from the usage of these links.
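For readers who want a feel for the core steps named in the outline (train/test split, bag of words with CountVectorizer, fitting a classifier, then score, F1, and predict), here is a minimal sketch. It is not the code from the video or its repository: the review texts and labels are made up for illustration, and the linear SVM, the 0.33 test split, and the random seed are assumptions chosen only to keep the example small.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Made-up toy reviews standing in for the Amazon review data (assumption).
texts = [
    "great book, loved every page",
    "an excellent and fun read",
    "really good story, highly recommend",
    "wonderful characters, great ending",
    "loved it, excellent writing",
    "good value and a great gift",
    "terrible book, waste of money",
    "boring and badly written",
    "awful plot, do not buy",
    "bad ending, very disappointing",
    "horrible, a boring waste of time",
    "disappointing and badly edited",
]
labels = ["POSITIVE"] * 6 + ["NEGATIVE"] * 6

# Hold out part of the data for evaluation; stratify keeps both classes
# represented in the training and test sets.
train_x, test_x, train_y, test_y = train_test_split(
    texts, labels, test_size=0.33, random_state=42, stratify=labels
)

# Bag of words: learn the vocabulary on the training text only (fit_transform),
# then reuse that same vocabulary for the test text (transform).
vectorizer = CountVectorizer()
train_vectors = vectorizer.fit_transform(train_x)
test_vectors = vectorizer.transform(test_x)

# A linear SVM is one of the classifier families listed in the outline.
clf = SVC(kernel="linear")
clf.fit(train_vectors, train_y)

# Mean accuracy on the held-out set, plus one F1 score per class.
accuracy = clf.score(test_vectors, test_y)
f1_per_class = f1_score(
    test_y,
    clf.predict(test_vectors),
    average=None,
    labels=["POSITIVE", "NEGATIVE"],
)

# Qualitative check on unseen text: transform with the fitted vectorizer first.
new_reviews = ["a great read", "a boring waste"]
predictions = clf.predict(vectorizer.transform(new_reviews))

print("accuracy:", accuracy)
print("F1 (POSITIVE, NEGATIVE):", f1_per_class)
print("predictions:", list(predictions))
```

With only twelve toy sentences the scores are not meaningful; the point is the order of the calls. Swapping CountVectorizer for TfidfVectorizer changes only the vectorizer line, since both expose the same fit_transform and transform methods.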

Video Information

Views: 260.3K (total views since publication)
Likes: 7.1K (user likes and reactions)
Duration: 01:40:49
Published: Sep 30, 2019
Quality: HD