Python Pandas Dataset Analysis: Sorting & More 📊

Learn how to load, sort, subset, find unique elements, and count values in datasets using Python's Pandas in Jupyter notebooks.

Saniya Khullar
177 views • Dec 24, 2021

About this video

Using Python's Pandas package (free!) to better understand a dataset. Saniya covers how to import pandas, read a dataset into a Jupyter notebook, and perform other key pandas operations, such as sorting (by one or more columns), subsetting (selecting rows that meet certain criteria), finding the unique elements in a column, computing value counts, and beyond! This is an initial exploration of working with pandas to better understand data. Saniya works with a deforestation dataset, "annual-change-forest-area.csv", from Kaggle (https://www.kaggle.com/chiticariucristian/deforestation-and-forest-loss), in support of efforts to improve climate change outcomes and reduce deforestation (so the real pandas and other critters can have their forest homes preserved!).
Saniya also talks a little about how to get datasets to practice on (for learning or for competitions on crowdsourcing sites like kaggle.com).
Please reach out to Saniya with any questions you have, and please subscribe to Saniya's YouTube channel for more updates :)

In short, we will learn how to use Python's Pandas to better understand deforestation datasets (so we can eventually help protect panda homes). Please note this is for Python 3.

Saniya hopes to make more Python videos. Here, we learn Python Pandas tools like:
* import pandas as pd (pd is the usual nickname for pandas)
* import numpy as np
* read a dataset (csv file) into pandas
* get the # of rows and columns in a dataset
* view the first 5 rows (head of the dataset)
* sort dataframes based on columns
* extract columns from a dataframe
* look for and retrieve all unique elements in columns
* find value counts (# of times each item appears in a pandas column)
* subset a dataframe based on criteria (relational operators and .isin())
* and beyond!
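
The tools listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook from the video: the rows are made up, the column names ("Entity", "Year", "Net forest conversion") are assumptions about how the Kaggle file is laid out, and the CSV is read from an in-memory string so the example runs without downloading anything.

```python
import io

import numpy as np
import pandas as pd

# Made-up sample rows standing in for "annual-change-forest-area.csv".
# Column names and values are illustrative assumptions, not the real data.
csv_text = """Entity,Year,Net forest conversion
Brazil,2010,-1500
Brazil,2015,-1400
China,2010,1900
China,2015,1800
India,2015,260
"""

# With the real file this would be pd.read_csv("annual-change-forest-area.csv").
df = pd.read_csv(io.StringIO(csv_text))

n_rows, n_cols = df.shape   # (number of rows, number of columns)
first_rows = df.head()      # first 5 rows by default

# Sort by one column, then by several columns with mixed directions.
by_year = df.sort_values("Year")
by_year_then_change = df.sort_values(
    ["Year", "Net forest conversion"], ascending=[True, False]
)

years = df["Year"]                            # extract one column (a Series)
unique_entities = df["Entity"].unique()       # distinct values, in order of appearance
sorted_years = np.sort(df["Year"].unique())   # distinct years, ascending
entity_counts = df["Entity"].value_counts()   # how often each value appears

print(n_rows, n_cols)
print(first_rows)
print(by_year_then_change)
print(unique_entities)
print(entity_counts)
```

Passing a list to sort_values sorts by the first column and breaks ties with the next one, which is one way to handle the "1 or more columns" case mentioned in the description.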

TIME STAMPS
00:00 Python Pandas Dataset Analysis: Sorting, Subsetting, Unique Elements, Value Counts, and beyond!
01:14 What is deforestation? (explaining the context for the dataset)
02:05 What is the Pandas package in Python?
02:44 Kaggle Dataset on Deforestation and Forest Loss (used for dataframe in this video)
03:11 Loading the dataset into Excel (dataframes are standard rows and columns of data)
03:49 Only 1,048,576 rows can be loaded into Excel (Pandas can load in millions of rows!)
04:38 Explaining more about the dataset
05:20 Public Service Announcement: Why Deforestation and Forest Loss are Big Concerns
06:52 Loading up a Python Jupyter notebook and data files (importing pandas and numpy)
08:18 Converting a pandas dataframe column (e.g. Year column) to a list
09:14 Finding the dimensions (shape) of a dataframe (# rows, # columns)
10:03 Finding the unique elements in a pandas column (e.g. Year column, Entity column) using sets
13:37 Showing the first 5 rows of a dataframe (default) using the head function
13:58 Sorting a dataframe by a column (e.g. Year)
17:03 Sorting a list using the sorted function
17:47 Subsetting/filtering a dataframe for rows matching 1 value (e.g. a specific country)
19:05 Using value counts to find how many times each unique value appears in a given column
19:53 Subsetting/filtering a dataframe for rows matching many values (e.g. a list of countries)
22:16 Recap on index numbers for a Python list (positive and negative)
24:19 Explaining the .isin() function
29:00 Subsetting/filtering on net forest conversion and year using greater-than or less-than operators
30:43 Selecting only certain columns from a pandas dataframe
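
The list, set, and subsetting steps named in the time stamps can be sketched as below. Again, this is only a hedged illustration under stated assumptions: the rows are invented, and the column names ("Entity", "Year", "Net forest conversion") are guesses at the dataset's layout rather than something confirmed by the video.

```python
import pandas as pd

# Made-up rows; column names and values are illustrative assumptions.
df = pd.DataFrame({
    "Entity": ["Brazil", "China", "India", "Brazil", "China", "India"],
    "Year": [2010, 2010, 2010, 2015, 2015, 2015],
    "Net forest conversion": [-1500, 1900, 300, -1400, 1800, 260],
})

year_list = df["Year"].tolist()       # column -> plain Python list
unique_years = set(year_list)         # a set keeps one copy of each value
sorted_years = sorted(unique_years)   # sorted() returns a new ascending list
earliest, latest = sorted_years[0], sorted_years[-1]  # index -1 is the last item

# One value: a boolean mask built with ==.
brazil = df[df["Entity"] == "Brazil"]

# Many values: .isin() is True where the entry appears in the given list.
wanted = ["Brazil", "India"]
subset = df[df["Entity"].isin(wanted)]

# Relational operators; & combines masks, and each condition needs parentheses.
recent_losses = df[(df["Net forest conversion"] < 0) & (df["Year"] > 2010)]

# Keep only certain columns by passing a list of column names.
slim = df[["Entity", "Net forest conversion"]]

print(subset)
print(recent_losses)
print(slim.head())
```

Each filter builds a True/False mask with one entry per row, and indexing the dataframe with that mask keeps only the rows marked True.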

Video Information

Views: 177
Likes: 3
Duration: 32:46
Published: Dec 24, 2021
