The Archivist's Trial

The Data Tinkerer
Reward: 180 XP

Hoppy has finally reached the deepest chamber of the archive. Here, tower scrolls, cabinet records, and noisy text fragments all come together in one last real trial. The archivist slides a stack of relic rows onto the table: the lines are dirty, some entries repeat, and some archive details are missing unless you fill them in from the vault register. Only a steady end-to-end flow will open the gate.

This lesson does not add a new trick. It closes the whole series by connecting the core actions you have already learned: clean text, split fields, read JSON, choose the right data structures, keep the first true record, count the results, and make one clear final decision. By the end, you should feel something simple and strong: I can really do this now.

Start with one smaller motion: split a row, then read the clue that actually matters

In the final trial, you will keep making the same kind of judgment: what field is this piece of text, and what should I check after the row is cleaned? Here is a tiny toy example first, just to practice that feeling.

row = "relic=moon key | seal=amber spark"
parts = row.split(" | ")
seal_mark = parts[1].split("=")[1]

if seal_mark.find("amber") != -1:
  print("amber found")

This is not today’s full answer. It only demonstrates one key move: split one row, then keep reading inside a field. In the real starter, you still need to clean noisy text, enrich the rows with JSON archive data, deduplicate, count, and build the final gate decision.

Today’s task: complete the whole archive trial and deliver the final access result

The starter already reads archivist_trial.txt and vault_index.json for you. Your job is to finish one complete data flow:

1. Clean the noisy trial rows first

Finish clean_line(raw_line). This is your familiar cleanup work: remove the "## " prefix, turn "~" back into spaces, and remove the trailing "??". Once this step is solid, the later fields become readable.
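As a warm-up for this step, here is a minimal sketch of the three cleanup moves on a single row. The sample row, its field labels, and the helper name clean_sample are assumptions made up for illustration; the real rows in archivist_trial.txt may look different.

```python
# A made-up noisy row; the real file contents are not shown in this lesson.
raw_line = "## relic=moon~key | vault=V1 | status=ready | seal=amber~spark??\n"


def clean_sample(raw_line):
    """Hypothetical helper: drop the '## ' prefix, restore spaces, drop '??'."""
    text = raw_line.strip()         # remove surrounding whitespace and the newline
    text = text.replace("## ", "")  # drop the noisy prefix
    text = text.replace("~", " ")   # "~" stands in for a space
    text = text.replace("??", "")   # drop the trailing marker
    return text


cleaned = clean_sample(raw_line)
print(cleaned)
```

Keep in mind that replace() removes every occurrence of a marker, not only the one at the start or end. That is fine as long as the markers never appear in the middle of a row.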

2. Turn each row into a structured record, then enrich it with the vault register

Inside build_record(cleaned_line), split one row into relic_name, vault_code, status, and seal_mark. Then use vault_index[vault_code] to add hall_name and keeper_name.
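The split-then-look-up move can be tried on one row before you write the real function. This is only a sketch: sample_index, its hall and keeper names, and the field labels in the row are invented here, while the real register comes from vault_index.json.

```python
# Hypothetical register; the real one is loaded from vault_index.json by the starter.
sample_index = {
    "V1": {"hall_name": "Moon Hall", "keeper_name": "Orla"},
}

# A made-up row that has already been cleaned.
cleaned_line = "relic=moon key | vault=V1 | status=ready | seal=amber spark"

# Split the row into "label=value" pieces, then keep only the value of each piece.
values = [part.split("=")[1] for part in cleaned_line.split(" | ")]
relic_name, vault_code, status, seal_mark = values

# Look up the vault code and copy the two archive details into the record.
vault_record = sample_index[vault_code]
record = {
    "relic_name": relic_name,
    "vault_code": vault_code,
    "status": status,
    "seal_mark": seal_mark,
    "hall_name": vault_record["hall_name"],
    "keeper_name": vault_record["keeper_name"],
}
print(record)
```

The row itself only knows the vault code. Everything about the hall and the keeper comes from the register, which is why the lookup happens while the record is being built.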

3. Keep only the true unique relics and organize the counts

Use the set seen_relics to remember whether a relic_name has already appeared, and keep only the first-seen record in unique_records. After that, build hall_counts for the unique halls, and build a final clue list such as amber_ready_relics, which holds the ready relics whose seal mark contains "amber".
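Here is a small sketch of this step on four records, one of which repeats a relic name. The records and their values are made up for illustration; only the pattern (a set for "seen before?", a dict for counts, a list for the clue) is the point.

```python
# Made-up records, already cleaned and enriched.
records = [
    {"relic_name": "moon key", "hall_name": "Moon Hall", "status": "ready", "seal_mark": "amber spark"},
    {"relic_name": "sun dial", "hall_name": "Sun Hall", "status": "sealed", "seal_mark": "amber glow"},
    {"relic_name": "moon key", "hall_name": "Sun Hall", "status": "sealed", "seal_mark": "jade ring"},
    {"relic_name": "star map", "hall_name": "Moon Hall", "status": "ready", "seal_mark": "jade ring"},
]

# Keep the first record seen for each relic name; later repeats are skipped.
seen_relics = set()
unique_records = []
for record in records:
    if record["relic_name"] not in seen_relics:
        seen_relics.add(record["relic_name"])
        unique_records.append(record)

# Count unique relics per hall.
hall_counts = {}
for record in unique_records:
    hall_name = record["hall_name"]
    if hall_name not in hall_counts:
        hall_counts[hall_name] = 0
    hall_counts[hall_name] += 1

# Collect ready relics whose seal mark contains "amber".
amber_ready_relics = []
for record in unique_records:
    if record["status"] == "ready" and "amber" in record["seal_mark"]:
        amber_ready_relics.append(record["relic_name"])

print(unique_records)
print(hall_counts)
print(amber_ready_relics)
```

The second "moon key" row is dropped even though its other fields differ, because the first-seen record is treated as the true one. The counts are built from unique_records, so the duplicate never inflates a hall.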

4. Close everything into a final summary and access decision

The most important ending here is not printing lots of intermediate values. It is closing the whole flow into two clear results: trial_summary and access_decision. One explains what happened in the trial. The other gives the final archive verdict.
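To see how the two closing results relate, here is a minimal sketch with stand-in numbers. The counts and the pass rule below are assumptions chosen for this example only; the real thresholds belong to the trial itself.

```python
# Made-up intermediate results standing in for the real ones.
unique_relic_count = 3
ready_unique_count = 2
amber_ready_relics = ["moon key"]

# The summary explains what happened.
trial_summary = {
    "unique_relic_count": unique_relic_count,
    "ready_unique_count": ready_unique_count,
    "amber_ready_relics": amber_ready_relics,
}

# Hypothetical pass rule for this sketch; every condition must hold.
trial_passed = (
    trial_summary["unique_relic_count"] == 3
    and trial_summary["ready_unique_count"] >= 2
    and len(trial_summary["amber_ready_relics"]) >= 1
)

# The decision gives the verdict, derived from the summary alone.
access_decision = {
    "verdict": "pass" if trial_passed else "retry",
    "final_message": "The archive opens." if trial_passed else "The archive asks for another pass.",
}

print(trial_summary)
print(access_decision)
```

Reading the pass rule from trial_summary instead of from scattered variables keeps the verdict tied to the same numbers you report.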

This is a closing mastery lesson

There is no new concept here, and this is not meant to become an open-ended project. You are taking the skills you already built and connecting them into one trustworthy archive script with a clear finish.

Suggested Solution
Solution:
import json

with open("archivist_trial.txt", "r", encoding="utf-8") as file:
  trial_text = file.read().strip()

with open("vault_index.json", "r", encoding="utf-8") as file:
  vault_index = json.load(file)

print("Trial text:")
print(trial_text)
print("Vault index:", vault_index)

trial_lines = trial_text.splitlines()
print("Trial lines:", trial_lines)


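def clean_line(raw_line):
  # replace() removes every occurrence of a marker, not only the one at the
  # start or end of the line; this assumes the markers never appear mid-row.
  return raw_line.strip().replace("## ", "").replace("~", " ").replace("??", "")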


def build_record(cleaned_line):
  parts = cleaned_line.split(" | ")
  relic_name = parts[0].split("=")[1]
  vault_code = parts[1].split("=")[1]
  status = parts[2].split("=")[1]
  seal_mark = parts[3].split("=")[1]
  vault_record = vault_index[vault_code]

  return {
    "relic_name": relic_name,
    "vault_code": vault_code,
    "status": status,
    "seal_mark": seal_mark,
    "hall_name": vault_record["hall_name"],
    "keeper_name": vault_record["keeper_name"],
  }


cleaned_lines = []
for raw_line in trial_lines:
  cleaned_lines.append(clean_line(raw_line))

all_records = []
for cleaned_line in cleaned_lines:
  all_records.append(build_record(cleaned_line))

seen_relics = set()
unique_records = []
for record in all_records:
  relic_name = record["relic_name"]
  # Keep only the first record seen for each relic name.
  if relic_name not in seen_relics:
    seen_relics.add(relic_name)
    unique_records.append(record)

# Count how many unique relics belong to each hall.
hall_counts = {}
for record in unique_records:
  hall_name = record["hall_name"]
  if hall_name not in hall_counts:
    hall_counts[hall_name] = 0
  hall_counts[hall_name] += 1

# Collect ready relics whose seal mark contains "amber".
amber_ready_relics = []
for record in unique_records:
  if record["status"] == "ready" and record["seal_mark"].find("amber") != -1:
    amber_ready_relics.append(record["relic_name"])

# Collect keeper names in first-seen order, without repeats.
keeper_names = []
for record in unique_records:
  keeper_name = record["keeper_name"]
  if keeper_name not in keeper_names:
    keeper_names.append(keeper_name)

trial_summary = {
  "raw_row_count": len(all_records),
  "unique_relic_count": len(unique_records),
  "duplicate_row_count": len(all_records) - len(unique_records),
  "ready_unique_count": len([record for record in unique_records if record["status"] == "ready"]),
  "amber_ready_relics": amber_ready_relics,
  "hall_counts": hall_counts,
}

trial_passed = (
  trial_summary["unique_relic_count"] == 5
  and trial_summary["ready_unique_count"] >= 4
  and len(trial_summary["amber_ready_relics"]) >= 3
  and len(trial_summary["hall_counts"]) == len(vault_index)
)

access_decision = {
  "verdict": "pass" if trial_passed else "retry",
  "keeper_roll_call": ", ".join(keeper_names),
  "final_message": "The archive opens." if trial_passed else "The archive asks for another pass.",
}

print("Cleaned lines:", cleaned_lines)
print("All records:", all_records)
print("Seen relics:", seen_relics)
print("Unique records:", unique_records)
print("Hall counts:", hall_counts)
print("Amber ready relics:", amber_ready_relics)
print("Keeper names:", keeper_names)
print("Trial summary:", trial_summary)
print("Access decision:", access_decision)
Advanced Tips

If you can finish this lesson steadily, what you carry forward is not just a few methods. It is a real data-handling path: start with messy text, clean it, split it, enrich it, deduplicate it, count it, judge it, and hand back a clear result.

Chapter 6 closes here. The next chapter takes these skills out of Hoppy's world and into real-world tasks. But before that step, you have already completed the final main trial inside the archive.
