Using Kaggle Datasets for Data Science & ML
Learn how to find and utilize free Kaggle datasets for your data science and machine learning projects effectively.

ProgrammingKnowledge
58.5K views • Mar 8, 2025

About this video
**Want to use real-world datasets for your data science and machine learning projects?** Kaggle is the perfect place to find free datasets for analysis, visualization, and model building!
In this tutorial, I'll show you **how to find, download, and use Kaggle datasets** in your Python projects, whether you're working in **Jupyter Notebook, VS Code, or Google Colab**.
---
### **🔹 What You'll Learn:**
✅ How to **find and explore datasets** on Kaggle
✅ How to **download Kaggle datasets** manually and using the Kaggle API
✅ How to **load and use Kaggle datasets in Python**
✅ How to **use Kaggle datasets directly in Google Colab and Jupyter Notebook**
✅ How to **analyze, clean, and visualize data**
---
### **🔹 Prerequisites:**
✔️ **Basic Python knowledge** (recommended)
✔️ A **Kaggle Account** ([Sign up here](https://www.kaggle.com/))
✔️ Installed **pandas, numpy, matplotlib, and seaborn** (`pip install pandas numpy matplotlib seaborn`)
---
## **Step 1: Find a Dataset on Kaggle**
1️⃣ Go to **[Kaggle Datasets](https://www.kaggle.com/datasets)**
2️⃣ Search for a dataset (e.g., **Netflix Movies, COVID-19, Titanic, Stock Market Data**)
3️⃣ Click on a dataset to explore:
🔹 Description
🔹 Data files (CSV, JSON, Excel, etc.)
🔹 Sample visualizations
🔹 Popular Notebooks
---
## **Step 2: Download a Kaggle Dataset (Manually)**
1️⃣ Click **Download** on the dataset page
2️⃣ Extract the ZIP file
3️⃣ Load the dataset into Python using **pandas**
Example for a CSV file:
```python
import pandas as pd
# Load dataset
df = pd.read_csv('dataset.csv')
# Display first 5 rows
print(df.head())
```
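Kaggle data files are not always CSV. As a minimal sketch, assuming you want one helper that picks the pandas reader from the file extension, something like the following could work. The `READERS` mapping and the `load_table` name are illustrative, not part of pandas or Kaggle.

```python
from pathlib import Path

import pandas as pd

# Hypothetical mapping from file extension to the pandas reader that handles it.
READERS = {
    ".csv": pd.read_csv,
    ".json": pd.read_json,
    ".xlsx": pd.read_excel,  # needs an Excel engine such as openpyxl installed
}


def load_table(path):
    """Load a data file into a DataFrame, choosing the reader by extension."""
    path = Path(path)
    reader = READERS.get(path.suffix.lower())
    if reader is None:
        raise ValueError(f"Unsupported file type: {path.suffix!r}")
    return reader(path)
```

For example, `load_table('dataset.csv')` behaves like the `pd.read_csv` call above, while an unknown extension raises a `ValueError` instead of failing later with a confusing parse error.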
---
## **Step 3: Download a Kaggle Dataset Using Kaggle API**
Kaggle provides an API for easy dataset access.
### **🔹 Setup Kaggle API:**
1️⃣ Go to **[Kaggle Account Settings](https://www.kaggle.com/account)**
2️⃣ Scroll to **API** and click **Create New API Token**
3️⃣ Download the `kaggle.json` file
4️⃣ Move it to `~/.kaggle/` (Linux/Mac) or `C:\Users\YourUser\.kaggle\` (Windows)
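Before running any download, it can help to confirm the token landed in the right place. The sketch below assumes the classic `kaggle.json` layout with `username` and `key` fields; the `check_kaggle_token` helper is hypothetical and only reads the file, it does not contact Kaggle.

```python
import json
from pathlib import Path


def check_kaggle_token(config_dir=None):
    """Return a list of problems found with kaggle.json (empty list means OK)."""
    # Default to ~/.kaggle, the folder named in the setup steps above.
    config_dir = Path(config_dir) if config_dir else Path.home() / ".kaggle"
    token_path = config_dir / "kaggle.json"
    if not token_path.is_file():
        return [f"{token_path} not found"]
    try:
        token = json.loads(token_path.read_text())
    except json.JSONDecodeError:
        return [f"{token_path} is not valid JSON"]
    if not isinstance(token, dict):
        return [f"{token_path} does not contain a JSON object"]
    # Assumed layout: {"username": "...", "key": "..."}
    return [f"missing field: {field}" for field in ("username", "key") if not token.get(field)]
```

Calling `check_kaggle_token()` with no arguments inspects the default location and returns an empty list when nothing looks wrong.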
### **🔹 Install and Use Kaggle API:**
```bash
pip install kaggle
```
```bash
kaggle datasets download -d dataset-owner/dataset-name
```
Example:
```bash
kaggle datasets download -d zynicide/wine-reviews
```
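If you would rather trigger the same download from a Python script, one option is to call the CLI through `subprocess`. This is a rough sketch, not an official Kaggle helper: the function names and the `SLUG_PATTERN` check are assumptions for illustration, and `-p` is the CLI flag for the folder the ZIP is saved to.

```python
import re
import subprocess

# Assumed shape of a dataset reference: "owner/dataset-name".
SLUG_PATTERN = re.compile(r"^[\w.-]+/[\w.-]+$")


def build_download_command(slug, dest="."):
    """Build the argument list for `kaggle datasets download`."""
    if not SLUG_PATTERN.match(slug):
        raise ValueError(f"Expected 'owner/dataset-name', got {slug!r}")
    # -d names the dataset; -p sets the directory the ZIP is saved to.
    return ["kaggle", "datasets", "download", "-d", slug, "-p", dest]


def download_dataset(slug, dest="."):
    """Run the Kaggle CLI; raises CalledProcessError if the download fails."""
    subprocess.run(build_download_command(slug, dest), check=True)
```

`download_dataset("zynicide/wine-reviews")` runs the same command as the example above and saves `wine-reviews.zip` in the current folder.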
Extract and use it in Python:
```python
import pandas as pd
import zipfile
# Extract ZIP
with zipfile.ZipFile("wine-reviews.zip", "r") as z:
    z.extractall("data")
# Load CSV
df = pd.read_csv("data/winemag-data-130k-v2.csv")
print(df.head())
```
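The CSV name inside the archive differs from dataset to dataset, so hardcoding it only works for this example. A small sketch, assuming you want to extract the ZIP and then see which CSV files are available (the `extract_and_list_csvs` name is made up for illustration):

```python
import zipfile
from pathlib import Path


def extract_and_list_csvs(zip_path, dest="data"):
    """Extract a ZIP archive into dest and return the CSV files found there, sorted."""
    dest = Path(dest)
    with zipfile.ZipFile(zip_path, "r") as z:
        z.extractall(dest)
    # rglob also finds CSVs inside subfolders of the extracted archive.
    return sorted(dest.rglob("*.csv"))
```

You can then pass any of the returned paths to `pd.read_csv`, for example `pd.read_csv(extract_and_list_csvs("wine-reviews.zip")[0])`.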
---
## **Step 4: Load a Kaggle Dataset in Google Colab**
1️⃣ Open **Google Colab** ([colab.research.google.com](https://colab.research.google.com))
2️⃣ Run the following command to enable Kaggle API in Colab:
```python
!pip install kaggle
```
3️⃣ Upload the `kaggle.json` API key:
```python
from google.colab import files
files.upload()
```
4️⃣ Download the dataset:
```python
!kaggle datasets download -d zynicide/wine-reviews
```
5️⃣ Extract and use in Colab:
```python
import zipfile
with zipfile.ZipFile("wine-reviews.zip", "r") as z:
    z.extractall("data")
```
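One detail the Colab steps leave implicit: the Kaggle CLI looks for the token in `~/.kaggle/kaggle.json`, so the uploaded file has to end up there before the download command can authenticate. A minimal sketch, assuming you pass in the dictionary returned by `files.upload()` (file names mapped to bytes); the `install_kaggle_token` helper and its `home` parameter are hypothetical.

```python
import os
from pathlib import Path


def install_kaggle_token(uploaded, home=None):
    """Write an uploaded kaggle.json to <home>/.kaggle/ with owner-only permissions.

    `uploaded` maps file names to bytes, the shape returned by
    google.colab's files.upload().
    """
    if "kaggle.json" not in uploaded:
        raise KeyError("kaggle.json was not among the uploaded files")
    config_dir = (Path(home) if home else Path.home()) / ".kaggle"
    config_dir.mkdir(parents=True, exist_ok=True)
    token_path = config_dir / "kaggle.json"
    token_path.write_bytes(uploaded["kaggle.json"])
    # The Kaggle client warns when the token is readable by other users.
    os.chmod(token_path, 0o600)
    return token_path
```

In Colab this would be called as `install_kaggle_token(files.upload())`, run before the `!kaggle datasets download` cell.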
---
## **Step 5: Analyze and Visualize Kaggle Datasets**
Once the dataset is loaded, you can clean and visualize the data!
🔹 **Check for missing values:**
```python
print(df.isnull().sum())
```
🔹 **Basic statistics:**
```python
print(df.describe())
```
🔹 **Visualize data with Matplotlib & Seaborn:**
```python
import matplotlib.pyplot as plt
import seaborn as sns
# Histogram
sns.histplot(df['price'], bins=30)
plt.show()
```
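The missing-value counts above tell you where the gaps are; a common next step is to remove duplicate rows and fill those gaps. The sketch below uses a tiny made-up DataFrame rather than a real Kaggle file, and the `basic_clean` helper is only one possible approach (median for numbers, a placeholder label for text), not a rule that fits every dataset.

```python
import pandas as pd

# Toy data standing in for a real Kaggle dataset; the values are made up.
df_demo = pd.DataFrame(
    {
        "country": ["Italy", "Italy", None, "France"],
        "price": [15.0, 15.0, 20.0, None],
    }
)


def basic_clean(df, numeric_col, text_col):
    """Drop exact duplicate rows, then fill gaps in one numeric and one text column."""
    cleaned = df.drop_duplicates().copy()
    # The median is less sensitive to outliers than the mean.
    cleaned[numeric_col] = cleaned[numeric_col].fillna(cleaned[numeric_col].median())
    cleaned[text_col] = cleaned[text_col].fillna("Unknown")
    return cleaned


cleaned = basic_clean(df_demo, numeric_col="price", text_col="country")
print(cleaned.isnull().sum())
```

Whether filling or dropping rows is the better choice depends on the dataset, so check how many values are missing before deciding.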
---
## **Step 6: Use Kaggle Datasets in Jupyter Notebook or VS Code**
If you're using **Jupyter Notebook or VS Code**, follow the **manual download** or **Kaggle API** method to get the dataset, then load it using `pandas`.
---
## **Next Steps:**
**How to Analyze Data with Pandas** → [Watch Now]
**Best Kaggle Tips for Beginners** → [Watch Now]
**How to Build a Machine Learning Model with Kaggle Data** → [Watch Now]
---
### **Like, Share & Subscribe!**
If this tutorial helped you, **LIKE**, **SHARE**, and **SUBSCRIBE** for more Kaggle & Data Science content!
💬 Have questions? Drop them in the **comments** below!
---
### **🔹 Hashtags:**
#Kaggle #DataScience #MachineLearning #Python #KaggleDatasets #AI #BigData #DeepLearning #DataAnalytics #KaggleAPI
Video Information
Views: 58.5K
Likes: 717
Duration: 9:57
Published: Mar 8, 2025