Adding and Removing Columns

Learn to add new columns and remove existing ones from DataFrames

Adding Single Column

code.py
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike'],
    'Salary': [50000, 60000, 55000]
})

df['Department'] = 'Sales'
print(df)

Output:

    Name  Salary Department
0   John   50000      Sales
1  Sarah   60000      Sales
2   Mike   55000      Sales

A single value is broadcast, so every row gets the same value.

Adding from List

code.py
df['Age'] = [25, 30, 28]
print(df)

The list length must match the number of rows, otherwise pandas raises a ValueError.

Adding from Calculation

code.py
df['Bonus'] = df['Salary'] * 0.1
print(df)

This creates a new column computed from an existing one.

Adding from Multiple Columns

code.py
df['Total'] = df['Salary'] + df['Bonus']
print(df)

Adding with Conditions

code.py
import numpy as np

df['Level'] = np.where(df['Salary'] > 55000, 'Senior', 'Junior')
print(df)
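
As a rough illustration of how np.where fills the new column, the sketch below rebuilds the same three-row df so it runs on its own. The condition is checked row by row, and the comparison is strict, so a salary of exactly 55000 is labeled 'Junior'.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike'],
    'Salary': [50000, 60000, 55000]
})

# np.where(condition, value_if_true, value_if_false) works element by element
df['Level'] = np.where(df['Salary'] > 55000, 'Senior', 'Junior')

# Only Sarah (60000) is strictly above 55000; Mike (55000) is not
print(df['Level'].tolist())  # ['Junior', 'Senior', 'Junior']
```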

Adding with apply()

code.py
def calculate_tax(salary):
    return salary * 0.2

df['Tax'] = df['Salary'].apply(calculate_tax)
print(df)

Using lambda:

code.py
df['Tax'] = df['Salary'].apply(lambda x: x * 0.2)
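
For simple arithmetic like this, apply() is optional. The sketch below, which assumes the same Salary values as above, compares apply() with plain column arithmetic. The variable names tax_apply and tax_vectorized are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Salary': [50000, 60000, 55000]})

# Row-by-row calculation through a Python function
tax_apply = df['Salary'].apply(lambda x: x * 0.2)

# Same calculation on the whole column at once
tax_vectorized = df['Salary'] * 0.2

# Both approaches produce the same values
print(tax_apply.tolist() == tax_vectorized.tolist())  # True
```

Column arithmetic is usually the shorter and faster choice; apply() is more useful when the logic cannot be written as a simple expression.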

Adding Multiple Columns

code.py
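# A tuple of Series is read as rows, not columns, so build the two columns explicitly
df[['Bonus', 'Tax']] = pd.DataFrame({'Bonus': df['Salary'] * 0.1, 'Tax': df['Salary'] * 0.2})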
print(df)

Or separately:

code.py
df['Bonus'] = df['Salary'] * 0.1
df['Tax'] = df['Salary'] * 0.2

Insert at Specific Position

code.py
df.insert(1, 'ID', [101, 102, 103])
print(df)

Inserts the ID column at position 1 (right after the first column). Positions start at 0.
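
A minimal sketch of insert() on a fresh copy of the example data, showing the resulting column order. It also shows that calling insert() again with the same name raises a ValueError, because the column already exists.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike'],
    'Salary': [50000, 60000, 55000]
})

# Position 1 places the new column between 'Name' and 'Salary'
df.insert(1, 'ID', [101, 102, 103])
print(df.columns.tolist())  # ['Name', 'ID', 'Salary']

# Inserting a column name that already exists fails
try:
    df.insert(1, 'ID', [101, 102, 103])
except ValueError as exc:
    print("Insert failed:", exc)
```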

Removing Single Column

code.py
df_new = df.drop('Age', axis=1)
print(df_new)

drop() returns a new DataFrame. The original df is unchanged.

To remove the column from df itself:

code.py
df.drop('Age', axis=1, inplace=True)
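
drop() also accepts a columns= keyword, which does the same job as axis=1 and can be easier to read. A small sketch, assuming a df that already has an Age column:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike'],
    'Salary': [50000, 60000, 55000],
    'Age': [25, 30, 28]
})

# columns='Age' is equivalent to drop('Age', axis=1)
df_new = df.drop(columns='Age')

print(df_new.columns.tolist())  # ['Name', 'Salary']
print(df.columns.tolist())      # ['Name', 'Salary', 'Age'] (original untouched)
```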

Removing Multiple Columns

code.py
df_new = df.drop(['Age', 'Bonus'], axis=1)
print(df_new)

Delete with del

code.py
del df['Tax']
print(df)

del modifies the DataFrame in place and returns nothing.

Pop Column

Remove and return column.

code.py
bonus_column = df.pop('Bonus')
print("Bonus column:", bonus_column)
print("DataFrame now:", df)

Column removed from df.
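
A self-contained sketch of pop(), assuming a df with a Bonus column. The popped column comes back as a Series, and df no longer contains it.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike'],
    'Salary': [50000, 60000, 55000],
    'Bonus': [5000, 6000, 5500]
})

# pop() removes the column from df and hands it back
bonus_column = df.pop('Bonus')

print(type(bonus_column).__name__)  # Series
print(bonus_column.tolist())        # [5000, 6000, 5500]
print(df.columns.tolist())          # ['Name', 'Salary']
```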

Practice Example

The scenario: build an employee database with calculated columns.

code.py
import pandas as pd
import numpy as np

employees = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike', 'Emma', 'David'],
    'Base_Salary': [50000, 65000, 55000, 70000, 60000],
    'Years': [3, 7, 4, 9, 5]
})

print("Initial data:")
print(employees)
print()

print("1. Add Department:")
employees['Department'] = ['Sales', 'IT', 'Sales', 'HR', 'IT']
print(employees)
print()

print("2. Add Employee ID at start:")
employees.insert(0, 'ID', range(101, 106))
print(employees)
print()

print("3. Calculate bonus (10% of base):")
employees['Bonus'] = employees['Base_Salary'] * 0.1
print(employees)
print()

print("4. Calculate tax (20% of base):")
employees['Tax'] = employees['Base_Salary'] * 0.2
print(employees)
print()

print("5. Add experience level:")
employees['Level'] = np.where(
    employees['Years'] >= 7, 'Senior',
    np.where(employees['Years'] >= 4, 'Mid', 'Junior')
)
print(employees)
print()

print("6. Calculate total compensation:")
employees['Total_Comp'] = employees['Base_Salary'] + employees['Bonus']
print(employees)
print()

print("7. Add performance multiplier:")
def get_multiplier(row):
    if row['Level'] == 'Senior':
        return 1.5
    elif row['Level'] == 'Mid':
        return 1.2
    else:
        return 1.0

employees['Multiplier'] = employees.apply(get_multiplier, axis=1)
print(employees)
print()

print("8. Remove Tax column:")
employees = employees.drop('Tax', axis=1)
print(employees)
print()

print("Final summary:")
print("Columns:", employees.columns.tolist())
print("Shape:", employees.shape)
print("Total compensation:", employees['Total_Comp'].sum())

Adding Empty Column

code.py
df['Notes'] = None
print(df)

Or with NaN:

code.py
df['Comments'] = np.nan

Adding from Series

code.py
new_data = pd.Series([100, 200, 300])
df['Values'] = new_data
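
One thing to watch when adding a Series: pandas matches values by index label, not by position. The sketch below uses a df with a custom index (10, 20, 30, chosen only for illustration) to show what happens when the labels do not match.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Sarah', 'Mike']}, index=[10, 20, 30])
new_data = pd.Series([100, 200, 300])  # default index: 0, 1, 2

# No index labels match, so every row gets NaN
df['Values'] = new_data
print(df['Values'].isna().all())  # True

# Converting to a NumPy array assigns by position instead
df['Values'] = new_data.to_numpy()
print(df['Values'].tolist())  # [100, 200, 300]
```

When df has the default 0, 1, 2 index, as in the earlier examples, the labels match and the Series is added as expected.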

Conditional Column Addition

code.py
if 'Bonus' not in df.columns:
    df['Bonus'] = 0

Adding with assign()

assign() returns a copy of the DataFrame with the new columns added.

code.py
df_new = df.assign(
    Bonus=df['Salary'] * 0.1,
    Tax=df['Salary'] * 0.2
)
print(df_new)

Original df unchanged.
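
A quick way to confirm this is to compare the column lists before and after. The sketch rebuilds the example df so it can run on its own.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike'],
    'Salary': [50000, 60000, 55000]
})

df_new = df.assign(
    Bonus=df['Salary'] * 0.1,
    Tax=df['Salary'] * 0.2
)

print(df.columns.tolist())      # ['Name', 'Salary']
print(df_new.columns.tolist())  # ['Name', 'Salary', 'Bonus', 'Tax']
```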

Chain Multiple Operations

code.py
df_result = (df
    .assign(Bonus=df['Salary'] * 0.1)
    .assign(Tax=df['Salary'] * 0.2)
    .assign(Net=lambda x: x['Salary'] - x['Tax'])
)

Removing Columns by Pattern

code.py
cols_to_drop = [col for col in df.columns if 'temp' in col]
df = df.drop(cols_to_drop, axis=1)
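
A runnable sketch of the same pattern. The column names temp_score and temp_flag are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah'],
    'temp_score': [1, 2],
    'Salary': [50000, 60000],
    'temp_flag': [True, False]
})

# Collect every column whose name contains 'temp'
cols_to_drop = [col for col in df.columns if 'temp' in col]
print(cols_to_drop)  # ['temp_score', 'temp_flag']

df = df.drop(columns=cols_to_drop)
print(df.columns.tolist())  # ['Name', 'Salary']
```

If no column matches, cols_to_drop is an empty list and drop() returns the DataFrame with all its columns.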

Keep Only Specific Columns

code.py
df = df[['Name', 'Salary', 'Age']]

Drops all other columns.

Reorder Columns

code.py
df = df[['ID', 'Name', 'Age', 'Salary']]

Add Prefix to Columns

code.py
df = df.add_prefix('emp_')
print(df.columns.tolist())

Output (for a df with columns Name, Salary and Age): ['emp_Name', 'emp_Salary', 'emp_Age']

Add Suffix to Columns

code.py
df = df.add_suffix('_2024')
print(df.columns.tolist())
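
Both methods return a new DataFrame, so the result depends on which DataFrame you call them on. A small sketch with a two-column df:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John'], 'Salary': [50000]})

print(df.add_prefix('emp_').columns.tolist())   # ['emp_Name', 'emp_Salary']
print(df.add_suffix('_2024').columns.tolist())  # ['Name_2024', 'Salary_2024']

# Applying both in sequence stacks the changes
both = df.add_prefix('emp_').add_suffix('_2024')
print(both.columns.tolist())  # ['emp_Name_2024', 'emp_Salary_2024']
```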

Copy Column

code.py
df['Salary_Backup'] = df['Salary']

Replace Column

code.py
df['Salary'] = df['Salary'] * 1.1

Overwrites existing column.

Key Points to Remember

Add column with df['NewCol'] = values. Simple and direct.

Remove column with drop('Col', axis=1). It returns a new DataFrame, so assign the result back or use inplace=True to modify the original.

del df['Col'] removes immediately without creating copy.

insert(position, name, values) adds column at specific position.

assign() creates new DataFrame with added columns. Original unchanged.

List length must match number of rows when adding from list.

Common Mistakes

Mistake 1: Wrong list length

code.py
df['Age'] = [25, 30]  # ValueError if df has 3 rows!
# Check first: len(df) must equal the length of the list
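
A minimal sketch of what the failure looks like and one way to guard against it. The variable name ages is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Sarah', 'Mike']})
ages = [25, 30]  # one value short

try:
    df['Age'] = ages
except ValueError as exc:
    print("Could not add column:", exc)

# The failed assignment leaves df without the new column
print('Age' in df.columns)  # False

# Guard: compare lengths before assigning
if len(ages) == len(df):
    df['Age'] = ages
```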

Mistake 2: Forgetting axis

code.py
df.drop('Age')  # KeyError: without axis=1, pandas looks for a row labeled 'Age'
df.drop('Age', axis=1)  # Correct
df.drop(columns='Age')  # Also correct

Mistake 3: Not assigning result

code.py
df.drop('Age', axis=1)  # Doesn't change df!
df = df.drop('Age', axis=1)  # Correct
# OR
df.drop('Age', axis=1, inplace=True)

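Mistake 4: Using del on a filtered DataFrame without copy()

code.py
subset = df[df['Age'] > 25]
del subset['Name']  # df keeps 'Name', but later edits to subset may raise SettingWithCopyWarning

subset = df[df['Age'] > 25].copy()  # Safe: an explicit, independent copy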

Mistake 5: Column name typo

code.py
df.drop('Sallary', axis=1)  # Error if column is 'Salary'
print(df.columns.tolist())  # Check names first
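
A sketch of the typo case: the misspelled name raises a KeyError. If a missing column should simply be skipped, drop() accepts errors='ignore'. The name df_same is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Sarah', 'Mike'],
    'Salary': [50000, 60000, 55000]
})

try:
    df.drop('Sallary', axis=1)
except KeyError as exc:
    print("KeyError:", exc)

# errors='ignore' skips labels that do not exist instead of raising
df_same = df.drop(columns='Sallary', errors='ignore')
print(df_same.columns.tolist())  # ['Name', 'Salary']
```

Use errors='ignore' with care: it also hides genuine typos, so checking df.columns first is often the safer habit.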

What's Next?

You now know how to add and remove columns. Next, you'll learn how to rename columns to give them clearer names.