The Data Cleaner
Welcome to the Real World.
In the Digital Forest, you used magic to transform elements. In the software industry, we do the same thing, but we call it Data Cleaning (or Data Munging).
The Problem: Human input is messy. Users type stray spaces, forget to capitalize, or scream in ALL CAPS.
The Goal: Clean the data so it looks professional and uniform.
In this lab, you aren't casting spells. You are writing a Transformation Pipeline.
You will receive raw, dirty data, and your script must output clean, structured data.
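To see why messy input matters, here is a minimal sketch using made-up form submissions (the `signups` list is hypothetical, not part of the lab data). To Python, the same name typed three different ways is three different strings:

```python
# Hypothetical form submissions: one person, typed three different ways.
signups = ["Alice", " alice ", "ALICE"]

# String comparison is exact: stray spaces and casing both count.
same_person = signups[0] == signups[1]
print(same_person)  # False

# A set keeps one copy of each distinct string, so nothing is merged here.
unique_signups = set(signups)
print(len(unique_signups))  # 3
```

Until the data is cleaned, anything that counts or looks up users would treat these as three separate people.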
The Tools
Python keeps two powerful tools in its belt for text cleaning:
name = " aLIce "

# 1. .strip() - Removes leading/trailing whitespace
clean_spaces = name.strip()  # "aLIce"

# 2. .title() - Capitalizes the first letter of each word, lowercases the rest
proper_case = clean_spaces.title()  # "Alice"

# You can even chain them!
perfect = name.strip().title()  # "Alice"
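It can help to try both tools on a few awkward inputs before relying on them. The sketch below uses made-up sample strings (none of them come from the lab data) to show two details: `.strip()` only trims the ends of a string, and `.title()` restarts capitalization after any non-letter character, including an apostrophe.

```python
# Hypothetical sample strings, chosen only to illustrate edge cases.
padded = "\t bob \n"
inner = "  mary   ann  "
tricky = "they're here"

# .strip() removes tabs and newlines as well as spaces...
print(repr(padded.strip()))  # 'bob'

# ...but leaves whitespace in the middle of the string alone.
print(repr(inner.strip()))  # 'mary   ann'

# .title() capitalizes every word, not just the first one.
print("mary ann".title())  # Mary Ann

# It also capitalizes the letter after an apostrophe.
print(tricky.title())  # They'Re Here
```

For single-word first names like the ones in this lab, neither detail causes trouble, but both are worth remembering when the data gets messier.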
Your Task
You have received a list of new user names from a web form. It's a disaster.
raw_users = [" aLIce ", "BOB", " cindy", "dave "]
Create an empty list called clean_users.
Write a for loop to go through each name in raw_users.
Inside the loop, clean the name (remove spaces, fix casing).
Append the cleaned name to your clean_users list.
Print the clean_users list to see your handiwork.
Suggested Solution
This pattern (Initialize -> Loop -> Process -> Append) is the backbone of many data processing scripts.
raw_users = [" aLIce ", "BOB", " cindy", "dave "]
clean_users = []
for name in raw_users:
    # Chain the methods: strip the whitespace first, then fix the casing
    cleaned = name.strip().title()
    clean_users.append(cleaned)
print("Original:", raw_users)
print("Cleaned: ", clean_users)
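Once the loop version feels comfortable, the same pipeline can be written more compactly. The following is one possible variation, not the required answer: it moves the cleaning step into a small helper function (the name `clean_name` is an assumption made for this example) and builds the list with a list comprehension.

```python
def clean_name(raw_name):
    """Trim surrounding whitespace and apply title casing."""
    return raw_name.strip().title()


raw_users = [" aLIce ", "BOB", " cindy", "dave "]

# Same Initialize -> Loop -> Process -> Append idea, in a single expression.
clean_users = [clean_name(name) for name in raw_users]

print(clean_users)  # ['Alice', 'Bob', 'Cindy', 'Dave']
```

Keeping the cleaning logic in one function means that if the rules change later, there is only one place to update.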