The Broken Cabinet

The Data Tinkerer
Reward: 180 XP

When Hoppy enters the third room of the gauntlet, he finds a small tool that somebody else already wrote. It reads cabinet records. It prints a summary. At first glance, everything looks fine. But the archivist shakes his head: the most troublesome script is often not the one that crashes right away — it is the one that runs, produces output, and still gives quietly wrong results.

So the point of this lesson is not inventing a brand-new flow. It is practicing a real working skill: read existing code, notice where variable names and actual logic stop matching, and repair the tool until the result becomes trustworthy again.

First, practice one important habit: does the code really do what the variable name says?

When you repair old code, the first step is often not staring at an error message. It is following the shape of the data from one step to the next. If the variable name, the split position, or the loop target is off, the result starts drifting quietly.

row = "name=fern seal | room=north | count=2"
parts = row.split(" | ")

room_name = parts[0].split("=")[1]
print(room_name)

This code runs, but room_name actually becomes "fern seal", not "north". So a very useful repair question is: does this variable name really match the piece of data the code is extracting? Today’s starter has a few places exactly like that — small mismatches that quietly bend the whole result.
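One way to answer that question, sketched here with the same sample row, is to print each piece next to its index before choosing one. The loop and the field_map dictionary are illustrative additions and are not part of the starter.

```python
row = "name=fern seal | room=north | count=2"
parts = row.split(" | ")

# Show every piece with its index so the position of each field is visible.
for index, part in enumerate(parts):
    print(index, part)

# The room sits at index 1, so the variable name and the data agree again.
room_name = parts[1].split("=")[1]
print(room_name)

# Alternative: look fields up by key instead of by position.
field_map = {}
for part in parts:
    key, value = part.split("=", 1)
    field_map[key] = value
print(field_map["room"])
```

Looking fields up by key makes the name on the left and the data on the right harder to separate by accident, at the cost of a few extra lines.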

Today’s task: repair a tool that already runs, but cannot be trusted yet

The starter already reads broken_cabinet.txt and cabinet_index.json for you. Most of the flow is already written: clean the text, split the fields, enrich them with JSON records, deduplicate, count, and build a summary. Your job is not to rewrite it. Your job is to bring the wrong logic back into alignment.

1. Check whether clean_line(raw_line) really cleans the noisy row

This step should remove the "## " prefix, turn "~" back into spaces, and remove the trailing "??". If even one part is missing here, the names and field values stay dirty all the way down the flow.

2. Check whether build_record(cleaned_line) reads the correct fields

Each row contains drawer, cabinet, status, and dust. The real task here is not more syntax. It is confirming which variable should come from which piece, then using cabinet_index[cabinet_code] to add room_name and keeper_name.

3. Make deduplication and counting work on the right thing

The tool is supposed to keep the first record for each drawer, so seen_drawers should remember drawer_name. Then room_counts should count the repaired unique_records, not every raw record.

4. Finally, check whether the summary words are actually true

raw_row_count should really mean the raw row total, unique_drawer_count should really mean the unique drawer total, and ready_unique_count should be based on the repaired status. In repair work, this last pass matters a lot: a label that sounds right does not guarantee a correct result.
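That last pass can be turned into a few simple consistency checks. The sketch below assumes a summary dictionary with the same keys as cabinet_summary in the suggested solution; check_summary is a hypothetical helper and the numbers are made up for illustration.

```python
def check_summary(summary):
    """Return a list of problems found in a cabinet summary (empty if none)."""
    problems = []
    raw = summary["raw_row_count"]
    unique = summary["unique_drawer_count"]
    duplicates = summary["duplicate_row_count"]
    ready = summary["ready_unique_count"]
    room_total = sum(summary["room_counts"].values())

    if unique + duplicates != raw:
        problems.append("unique + duplicate rows do not add up to the raw rows")
    if room_total != unique:
        problems.append("room_counts was not built from the unique records")
    if ready > unique:
        problems.append("more ready drawers than unique drawers")
    return problems


# Made-up numbers: room_counts adds up to 5, which matches the raw rows
# rather than the unique drawers, so the second check reports a problem.
suspicious_summary = {
    "raw_row_count": 5,
    "unique_drawer_count": 4,
    "duplicate_row_count": 1,
    "ready_unique_count": 2,
    "room_counts": {"north": 3, "south": 2},
}
problems = check_summary(suspicious_summary)
print(problems)
```

Checks like these cannot prove a summary is right, but they catch the case where a label and the collection behind it have drifted apart.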

This is not a giant debugging tutorial

You are not learning exception handling here, and you are not meant to wander through a puzzle maze. The starter contains only a few clear mistakes in skills you already have: incomplete cleanup, one wrong field read, the wrong deduplication key, and counting against the wrong collection. Your task is to read the script calmly and repair it.
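To see how quietly one of these mistakes bends a result, here is a small sketch of the wrong deduplication key. The three records are invented, keep_first is a hypothetical helper, and using cabinet_code as the wrong key is an assumption; the starter's actual mistake may look different.

```python
# Invented records: two different drawers share one cabinet, and one drawer repeats.
records = [
    {"drawer_name": "fern seal", "cabinet_code": "C1"},
    {"drawer_name": "moss key", "cabinet_code": "C1"},
    {"drawer_name": "fern seal", "cabinet_code": "C1"},
]


def keep_first(records, key):
    """Keep the first record for each distinct value of record[key]."""
    seen = set()
    kept = []
    for record in records:
        value = record[key]
        if value not in seen:
            seen.add(value)
            kept.append(record)
    return kept


by_cabinet = keep_first(records, "cabinet_code")  # wrong key: drops "moss key"
by_drawer = keep_first(records, "drawer_name")    # intended key: one per drawer
print(len(by_cabinet), len(by_drawer))
```

Both calls run without any error, which is exactly why this kind of mistake has to be found by reading rather than by waiting for a crash.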

Suggested Solution
Solution:
import json

# Load the raw cabinet rows and the JSON index used to enrich them.
with open("broken_cabinet.txt", "r", encoding="utf-8") as file:
    cabinet_text = file.read().strip()

with open("cabinet_index.json", "r", encoding="utf-8") as file:
    cabinet_index = json.load(file)

print("Broken cabinet text:")
print(cabinet_text)
print("Cabinet index:", cabinet_index)

cabinet_lines = cabinet_text.splitlines()
print("Cabinet lines:", cabinet_lines)


def clean_line(raw_line):
    # Remove the "## " prefix, turn "~" back into spaces, drop the trailing "??".
    cleaned = raw_line.strip().replace("## ", "")
    cleaned = cleaned.replace("~", " ")
    cleaned = cleaned.replace("??", "")
    return cleaned


def build_record(cleaned_line):
    # Field order in each row: drawer, cabinet, status, dust.
    parts = cleaned_line.split(" | ")
    drawer_name = parts[0].split("=")[1]
    cabinet_code = parts[1].split("=")[1]
    status = parts[2].split("=")[1]
    dust_level = parts[3].split("=")[1]
    # Enrich the row with room and keeper details from the JSON index.
    cabinet_record = cabinet_index[cabinet_code]

    return {
        "drawer_name": drawer_name,
        "cabinet_code": cabinet_code,
        "status": status,
        "dust_level": dust_level,
        "room_name": cabinet_record["room_name"],
        "keeper_name": cabinet_record["keeper_name"],
    }


cleaned_lines = []
for raw_line in cabinet_lines:
    cleaned_lines.append(clean_line(raw_line))

all_records = []
for cleaned_line in cleaned_lines:
    all_records.append(build_record(cleaned_line))

# Keep only the first record for each drawer name.
seen_drawers = set()
unique_records = []
for record in all_records:
    drawer_name = record["drawer_name"]
    if drawer_name not in seen_drawers:
        seen_drawers.add(drawer_name)
        unique_records.append(record)

# Count rooms over the deduplicated records, not over every raw record.
room_counts = {}
for record in unique_records:
    room_name = record["room_name"]
    if room_name not in room_counts:
        room_counts[room_name] = 0
    room_counts[room_name] += 1

ready_unique_count = 0
for record in unique_records:
    if record["status"] == "ready":
        ready_unique_count += 1

cabinet_summary = {
    "raw_row_count": len(all_records),
    "unique_drawer_count": len(unique_records),
    "duplicate_row_count": len(all_records) - len(unique_records),
    "ready_unique_count": ready_unique_count,
    "room_counts": room_counts,
}

print("Cleaned lines:", cleaned_lines)
print("All records:", all_records)
print("Seen drawers:", seen_drawers)
print("Unique records:", unique_records)
print("Room counts:", room_counts)
print("Cabinet summary:", cabinet_summary)
Advanced Tips

The most useful thing to carry forward from this lesson is not only “I fixed some bugs.” It is that you started checking whether a whole data flow stays aligned: what the cleaned text becomes, whether each extracted field is really that field, and what collection the deduplication and counting logic is actually using.
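A small tracing helper is one way to watch that alignment stage by stage. This is only a sketch: trace is a hypothetical helper, and the noisy row is an invented example in the format the lesson describes, so the real file may differ.

```python
def trace(value, stages):
    """Run value through each (label, function) stage, printing every step."""
    print("start:", repr(value))
    for label, stage in stages:
        value = stage(value)
        print(label + ":", repr(value))
    return value


# Invented row with the "## " prefix, "~" placeholders, and trailing "??".
noisy_row = "## drawer=fern~seal | cabinet=C1 | status=ready | dust=2??"

stages = [
    ("strip prefix", lambda text: text.replace("## ", "")),
    ("restore spaces", lambda text: text.replace("~", " ")),
    ("strip suffix", lambda text: text.replace("??", "")),
    ("split fields", lambda text: text.split(" | ")),
]
fields = trace(noisy_row, stages)
```

Reading the printed steps in order shows where a value first stops matching its label, which is usually easier than inspecting only the final summary.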

The next lesson carries that end-to-end habit to the finish line. Instead of repairing a single script, you will complete the final world-based trial for the whole chapter. That is where it becomes even clearer: someone who has really learned this series does not only follow steps; they can judge, repair, and close the whole flow.
