Master data analysis with Pandas: from basics to advanced, with real code, hands-on labs, and expert guidance by Ram Sir.
Ram Sir, a specialist in Python data science, has 2+ years of experience in analytics, machine learning, and teaching Pandas to thousands of students and professionals.
Pandas provides Series (1D) and DataFrame (2D) as primary structures.
import pandas as pd
s = pd.Series([1, 2, 3], index=['a','b','c'])
df = pd.DataFrame({
'name': ['Alice','Bob'],
'age': [23, 34]
})
# From numpy array
import numpy as np
df2 = pd.DataFrame(np.arange(6).reshape(2,3), columns=['A','B','C'])
df = pd.read_csv("data.csv")    # Load a CSV file
df = pd.read_excel("data.xlsx") # Load Excel (requires openpyxl)
df = pd.read_json("data.json")  # Load JSON
For advanced construction, use pd.DataFrame.from_dict() or pd.DataFrame.from_records(). Each DataFrame column can have its own dtype.
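A minimal sketch of these constructors (the column and row names here are illustrative, not from the course data):

```python
import pandas as pd

# from_dict with the default orient='columns': dict keys become columns
d = {'name': ['Alice', 'Bob'], 'age': [23, 34]}
df_cols = pd.DataFrame.from_dict(d)

# orient='index': dict keys become row labels instead
df_rows = pd.DataFrame.from_dict(
    {'row1': [1, 2], 'row2': [3, 4]},
    orient='index', columns=['A', 'B']
)

# from_records: build from a list of tuples (or a structured array)
records = [('Alice', 23), ('Bob', 34)]
df_rec = pd.DataFrame.from_records(records, columns=['name', 'age'])
```

orient='index' is handy when each dict entry describes one row rather than one column.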
Access and manipulate data using powerful indexing features:
df['age'] # Series
df[['name','age']] # DataFrame
df.loc[0] # Row by label/index
df.iloc[0,1] # Row/col by integer
df.at[0,'age'] # Fast scalar access
df.iat[0,1] # Fast scalar by position
mask = df['age'] > 25
df[mask] # Filter rows
df.loc[df['age']>30, 'name'] # Select name where age>30
df.iloc[[0,2], [1,2]] # Fancy row/col selection
df.loc[1, 'age'] = 40
df['new_col'] = df['age'] * 2
Use loc for label-based and iloc for integer-based indexing. Boolean indexing is extremely powerful!
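To make boolean indexing concrete, a small worked sketch (the sample data is invented for illustration):

```python
import pandas as pd

df = pd.DataFrame({'name': ['Alice', 'Bob', 'Cara'],
                   'age': [23, 34, 29]})

# An element-wise comparison yields a boolean Series (the mask)
mask = df['age'] > 25

# Combine conditions with & and |; the parentheses are required
both = df[(df['age'] > 25) & (df['name'].str.startswith('B'))]

# loc accepts a mask plus a column selection in one step
names = df.loc[mask, 'name']
```

Note that & and | are used instead of Python's `and`/`or`, because the comparison operates on whole Series at once.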
Manipulate and analyze data with vectorized ops, stats, apply/map, and handle missing data:
df.sum()
df.mean()
df.describe()
df['age'].min()
df.count()
df['age_plus_10'] = df['age'].apply(lambda x: x+10)  # Element-wise via apply
df['name_len'] = df['name'].map(len)  # map applies a function to each Series element
df.transform({'age': np.sqrt})  # Per-column transform; preserves shape
df.isnull()  # Boolean mask of missing values
df.dropna()  # Drop rows containing NaN
df.fillna(0)  # Replace NaN with 0
df['col'].fillna(df['col'].mean())  # Fill NaN with the column mean
apply() works row- or column-wise on a DataFrame; map() applies a function element-wise to a Series. Use fillna() to replace missing values and dropna() to remove them.
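The fillna-with-mean pattern above in a runnable sketch (the ages are made up; mean() skips NaN by default, so the fill value comes only from the observed rows):

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'age': [23.0, np.nan, 29.0]})

# Replace the missing age with the mean of the non-missing ages
df['age'] = df['age'].fillna(df['age'].mean())
```

Mean imputation keeps the column mean unchanged, which is why it is a common default for numeric gaps.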
Combine, reshape, and summarize data:
df.groupby('dept')['age'].mean()
df.pivot_table(index='dept', columns='gender', values='salary', aggfunc='sum')
pd.melt(df, id_vars=['name'], value_vars=['age','salary'])
pd.concat([df1, df2], axis=0) # Stack rows
pd.concat([df1, df2], axis=1) # Stack columns
pd.merge(df1, df2, on='id', how='inner')
df1.join(df2, rsuffix='_other')  # Join on index; suffix avoids column-name clashes
df.stack()  # Pivot columns into the inner row index (wide -> long)
df.unstack()  # Pivot the inner row index into columns (long -> wide)
df.T  # Transpose
df.reset_index()  # Move the index back into a regular column
groupby computes summary statistics per group; pivot_table and melt reshape between wide and long form; merge works like a SQL join.
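A small sketch of groupby and merge together (the departments, ids, and salaries are invented):

```python
import pandas as pd

df = pd.DataFrame({'dept': ['IT', 'IT', 'HR'],
                   'age': [30, 40, 50]})

# Mean age per department; the result is a Series indexed by dept
mean_age = df.groupby('dept')['age'].mean()

# merge: SQL-style inner join on a shared key; only id=2 exists in both
left = pd.DataFrame({'id': [1, 2], 'name': ['Alice', 'Bob']})
right = pd.DataFrame({'id': [2, 3], 'salary': [50000, 60000]})
joined = pd.merge(left, right, on='id', how='inner')
```

Switching how='inner' to 'left', 'right', or 'outer' controls which unmatched keys survive, exactly as in SQL.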
Handle dates, rolling windows, categorical data, IO, and advanced tricks:
dt = pd.date_range("2023-01-01", periods=5, freq='D')  # 5 daily timestamps
df['date'] = pd.to_datetime(df['date'])  # Parse strings into datetimes
df.set_index('date', inplace=True)  # A DatetimeIndex enables resampling
df.resample('M').mean()  # Monthly means (alias is 'ME' in pandas >= 2.2)
df.rolling(window=3).mean()  # 3-row rolling average
df['cat'] = df['col'].astype('category')  # Memory-efficient categorical dtype
df['name'].str.upper()  # Vectorized string methods via .str
df['email'].str.contains('gmail')  # Boolean Series, usable for filtering
df.to_csv("out.csv")
df.to_excel("out.xlsx")
df.to_json("out.json")
df.to_sql("table", conn)  # Requires a SQLAlchemy/DB-API connection
Pandas excels at time-series, categorical, and text data. Use the read_* and to_* families for file I/O.
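A self-contained time-series sketch (the dates and values are invented; weekly resampling is used here because the 'W' alias behaves the same across pandas versions):

```python
import pandas as pd

# A daily series over 10 days, with values 0..9
idx = pd.date_range("2023-01-01", periods=10, freq="D")
s = pd.Series(range(10), index=idx)

# 3-day rolling mean; the first two entries are NaN (window not yet full)
roll = s.rolling(window=3).mean()

# Downsample to weekly totals; bins end on Sundays by default ('W-SUN')
weekly = s.resample("W").sum()
```

rolling() keeps the original index and slides a window over it, while resample() replaces the index with one row per time bucket.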
Most-used Pandas functions with quick examples:
df = pd.read_csv("data.csv")  # Load a CSV
df = pd.DataFrame([[1,2],[3,4]], columns=["A","B"])  # From a nested list
df.loc[0, 'A']  # Scalar by label
df[df['A'] > 1]  # Boolean filter
df.mean()  # Column means
df['A'].apply(np.sqrt)  # Element-wise function
df.groupby('cat').sum()  # Group aggregation
pd.merge(df1, df2, on='id')  # SQL-style join
df['date'] = pd.to_datetime(df['date'])  # Parse dates
df.resample('M').sum()  # Monthly totals (needs a DatetimeIndex; 'ME' in pandas >= 2.2)
df.to_csv("out.csv")  # Save as CSV
df.to_sql("table", conn)  # Write to a database
df.dropna()  # Drop rows with NaN
df.fillna(0)  # Replace NaN with 0
df['name'].str.lower()  # Vectorized lowercase
df['type'] = df['type'].astype('category')  # Categorical dtype