Using Kaggle Datasets for Data Science & ML
Learn how to find and utilize free Kaggle datasets for your data science and machine learning projects effectively.

ProgrammingKnowledge
58.5K views • Mar 8, 2025

About this video
**Want to use real-world datasets for your data science and machine learning projects?** Kaggle is the perfect place to find free datasets for analysis, visualization, and model building!
In this tutorial, I’ll show you **how to find, download, and use Kaggle datasets** in your Python projects, whether you're working in **Jupyter Notebook, VS Code, or Google Colab**.
---
### **What You’ll Learn:**
- How to **find and explore datasets** on Kaggle
- How to **download Kaggle datasets** manually and using the Kaggle API
- How to **load and use Kaggle datasets in Python**
- How to **use Kaggle datasets directly in Google Colab and Jupyter Notebook**
- How to **analyze, clean, and visualize data**
---
### **Prerequisites:**
- **Basic Python knowledge** (recommended)
- A **Kaggle Account** ([Sign up here](https://www.kaggle.com/))
- **pandas, numpy, matplotlib, and seaborn** installed (`pip install pandas numpy matplotlib seaborn`)
---
## **Step 1: Find a Dataset on Kaggle**
1. Go to **[Kaggle Datasets](https://www.kaggle.com/datasets)**
2. Search for a dataset (e.g., **Netflix Movies, COVID-19, Titanic, Stock Market Data**)
3. Click on a dataset to explore:
   - Description
   - Data files (CSV, JSON, Excel, etc.)
   - Sample visualizations
   - Popular notebooks
---
## **Step 2: Download a Kaggle Dataset (Manually)**
1. Click **Download** on the dataset page
2. Extract the ZIP file
3. Load the dataset into Python using **pandas**
Example for a CSV file:
```python
import pandas as pd
# Load dataset
df = pd.read_csv('dataset.csv')
# Display first 5 rows
print(df.head())
```
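Dataset pages list files in several formats (CSV, JSON, Excel), so it can help to pick the pandas reader from the file extension. The following is a minimal sketch under that assumption; `load_table` and `READERS` are hypothetical names chosen for illustration, not part of pandas or Kaggle, and reading `.xlsx` files additionally needs the optional `openpyxl` package.

```python
from pathlib import Path

import pandas as pd

# Map file extensions to pandas readers (illustrative helper, not a Kaggle API).
READERS = {
    ".csv": pd.read_csv,
    ".json": pd.read_json,
    ".xlsx": pd.read_excel,  # requires the optional openpyxl package
}


def load_table(path):
    """Load a tabular file into a DataFrame based on its extension."""
    suffix = Path(path).suffix.lower()
    try:
        reader = READERS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file type: {suffix!r}") from None
    return reader(path)
```

For example, `df = load_table("dataset.csv")` behaves like the `pd.read_csv` call above, while an unknown extension raises a `ValueError` instead of failing later.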
---
## **Step 3: Download a Kaggle Dataset Using Kaggle API**
Kaggle provides an API for easy dataset access.
### **Set Up the Kaggle API:**
1. Go to **[Kaggle Account Settings](https://www.kaggle.com/account)**
2. Scroll to **API** and click **Create New API Token**
3. Download the `kaggle.json` file
4. Move it to `~/.kaggle/` (Linux/Mac) or `C:\Users\YourUser\.kaggle\` (Windows)
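The last setup step (placing `kaggle.json` in the `.kaggle` folder) can also be done from Python. This is a rough sketch, not an official Kaggle utility: `install_kaggle_token` is a hypothetical helper, and its `kaggle_dir` parameter exists only so the target folder can be overridden. It also restricts the file to owner-only access, since the file holds a secret key.

```python
import shutil
import stat
from pathlib import Path


def install_kaggle_token(downloaded, kaggle_dir=None):
    """Copy a downloaded kaggle.json into the Kaggle config directory."""
    # Default to ~/.kaggle; kaggle_dir is an illustrative override.
    target_dir = Path(kaggle_dir) if kaggle_dir else Path.home() / ".kaggle"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "kaggle.json"
    shutil.copyfile(downloaded, target)
    # Owner read/write only (0o600). On Windows, chmod has very limited effect.
    target.chmod(stat.S_IRUSR | stat.S_IWUSR)
    return target
```

A call such as `install_kaggle_token(Path.home() / "Downloads" / "kaggle.json")` assumes the browser saved the token to a `Downloads` folder; adjust the path to wherever the file actually landed.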
### **Install and Use the Kaggle API:**
```bash
pip install kaggle
```
```bash
kaggle datasets download -d dataset-owner/dataset-name
```
Example:
```bash
kaggle datasets download -d zynicide/wine-reviews
```
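If you prefer to trigger the download from a Python script rather than a terminal, one option is to build the same CLI command and run it with `subprocess`. The sketch below assumes the `kaggle` command is installed and authenticated; `build_download_command`, `download_dataset`, and the `SLUG_PATTERN` check are illustrative names, and the pattern is only a loose guess at what a valid `owner/dataset-name` reference looks like. The `-p` (target folder) and `--unzip` options are flags of the `kaggle datasets download` command.

```python
import re
import subprocess

# Loose sanity check for "owner/dataset-name" (an assumption, not Kaggle's own rule).
SLUG_PATTERN = re.compile(r"^[\w.-]+/[\w.-]+$")


def build_download_command(slug, dest="data", unzip=True):
    """Return the kaggle CLI argument list for downloading one dataset."""
    if not SLUG_PATTERN.match(slug):
        raise ValueError(f"Expected 'owner/dataset-name', got {slug!r}")
    command = ["kaggle", "datasets", "download", "-d", slug, "-p", dest]
    if unzip:
        command.append("--unzip")  # extract the archive after downloading
    return command


def download_dataset(slug, dest="data"):
    """Run the download; needs the kaggle package and a valid API token."""
    subprocess.run(build_download_command(slug, dest), check=True)
```

Calling `download_dataset("zynicide/wine-reviews")` would then place the extracted files in a `data` folder. Note that with `--unzip` there is no ZIP file left to extract by hand.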
Extract and use it in Python:
```python
import pandas as pd
import zipfile
# Extract ZIP
with zipfile.ZipFile("wine-reviews.zip", "r") as z:
    z.extractall("data")
# Load CSV
df = pd.read_csv("data/winemag-data-130k-v2.csv")
print(df.head())
```
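The example above hardcodes the CSV file name, which differs from dataset to dataset. As a small sketch, the archive itself can report which CSV files it contains; `extract_csvs` is a hypothetical helper built only on the standard-library `zipfile` module.

```python
import zipfile
from pathlib import Path


def extract_csvs(zip_path, out_dir="data"):
    """Extract a ZIP archive and return the paths of the CSV files inside it."""
    out_dir = Path(out_dir)
    with zipfile.ZipFile(zip_path, "r") as archive:
        # Collect CSV member names before extracting everything.
        csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
        archive.extractall(out_dir)
    return sorted(out_dir / name for name in csv_names)
```

You could then pass any of the returned paths to `pd.read_csv`, for instance `pd.read_csv(extract_csvs("wine-reviews.zip")[0])`.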
---
## **Step 4: Load a Kaggle Dataset in Google Colab**
1. Open **Google Colab** ([colab.research.google.com](https://colab.research.google.com))
2. Run the following command to install the Kaggle API client in Colab:
```python
!pip install kaggle
```
3. Upload the `kaggle.json` API key (the Kaggle client looks for it in `~/.kaggle/`, so move it there after uploading):
```python
from google.colab import files
files.upload()
```
4. Download the dataset:
```python
!kaggle datasets download -d zynicide/wine-reviews
```
5. Extract and use in Colab:
```python
import zipfile
with zipfile.ZipFile("wine-reviews.zip", "r") as z:
    z.extractall("data")
```
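One detail the Colab steps leave implicit: `files.upload()` returns a dictionary of file names mapped to their bytes and saves the file in the working directory, while the Kaggle client reads its token from `~/.kaggle/kaggle.json`. The sketch below bridges that gap before the download step. `save_uploaded_token` is a hypothetical helper, and the check for the `username` and `key` fields reflects the usual layout of a Kaggle token file.

```python
import json
import os
from pathlib import Path


def save_uploaded_token(uploaded, kaggle_dir=None):
    """Write an uploaded kaggle.json to the directory the Kaggle client reads."""
    # `uploaded` is the dict returned by files.upload(): {filename: bytes}.
    if "kaggle.json" not in uploaded:
        raise KeyError("Upload a file named kaggle.json")
    token = json.loads(uploaded["kaggle.json"])
    missing = {"username", "key"} - set(token)
    if missing:
        raise ValueError(f"kaggle.json is missing fields: {sorted(missing)}")
    target_dir = Path(kaggle_dir) if kaggle_dir else Path.home() / ".kaggle"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "kaggle.json"
    target.write_bytes(uploaded["kaggle.json"])
    os.chmod(target, 0o600)  # owner read/write only
    return target
```

In a Colab cell this would be used as `save_uploaded_token(files.upload())`, run once before the `!kaggle datasets download` command.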
---
## **Step 5: Analyze and Visualize Kaggle Datasets**
Once the dataset is loaded, you can clean and visualize the data!
**Check for missing values:**
```python
print(df.isnull().sum())
```
**Basic statistics:**
```python
print(df.describe())
```
**Visualize data with Matplotlib & Seaborn:**
```python
import matplotlib.pyplot as plt
import seaborn as sns
# Histogram
sns.histplot(df['price'], bins=30)
plt.show()
```
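The missing-value check above reports gaps but does not fix them. Here is a minimal cleaning sketch, assuming the wine reviews example with `price` and `country` columns; `clean_reviews` is a hypothetical helper, and the choices made (drop rows without a price, label missing countries as "Unknown") are just one reasonable option, not the only one.

```python
import pandas as pd


def clean_reviews(df):
    """Return a cleaned copy: no duplicate rows, no missing prices, filled countries."""
    cleaned = df.drop_duplicates()
    # Rows without a price cannot contribute to the price histogram.
    cleaned = cleaned.dropna(subset=["price"])
    # Keep rows with an unknown country, but label them explicitly.
    cleaned = cleaned.fillna({"country": "Unknown"})
    return cleaned.reset_index(drop=True)
```

Running `df = clean_reviews(df)` before plotting keeps the original DataFrame untouched, because each pandas call here returns a new object.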
---
## **Step 6: Use Kaggle Datasets in Jupyter Notebook or VS Code**
If you're using **Jupyter Notebook or VS Code**, follow the **manual download** or **Kaggle API** method to get the dataset, then load it using `pandas`.
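If the same notebook is meant to run both locally and in Colab, a common idiom is to check whether the `google.colab` module is loaded and pick the data folder accordingly. This is only a sketch: `running_in_colab` and `default_data_dir` are hypothetical helpers, and the `/content/data` and `./data` locations are assumed layouts rather than anything Kaggle requires.

```python
import sys
from pathlib import Path


def running_in_colab():
    """Return True when the code appears to run inside a Google Colab kernel."""
    # Colab kernels have the google.colab package loaded.
    return "google.colab" in sys.modules


def default_data_dir():
    """Pick a data folder: /content/data in Colab, ./data elsewhere (assumed layout)."""
    return Path("/content/data") if running_in_colab() else Path.cwd() / "data"
```

With this in place, `pd.read_csv(default_data_dir() / "winemag-data-130k-v2.csv")` can stay the same across environments.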
---
## **Next Steps:**
- **How to Analyze Data with Pandas**: [Watch Now]
- **Best Kaggle Tips for Beginners**: [Watch Now]
- **How to Build a Machine Learning Model with Kaggle Data**: [Watch Now]
---
### **Like, Share & Subscribe!**
If this tutorial helped you, **LIKE**, **SHARE**, and **SUBSCRIBE** for more Kaggle & Data Science content!
Have questions? Drop them in the **comments** below!
---
### **Hashtags:**
#Kaggle #DataScience #MachineLearning #Python #KaggleDatasets #AI #BigData #DeepLearning #DataAnalytics #KaggleAPI
Video information: 58.5K views, 717 likes, duration 9:57, published Mar 8, 2025.