Cartographer

Autonomous market-tracking agent — parallel scraping, diff detection, and lifecycle state across hundreds of sources.

The Problem

Manually tracking a marketplace (job listings, competitor pricing, regulatory filings, or any other high-cardinality web data) is slow and error-prone. Changes go undetected. Duplicates pile up. Closed items linger. A system that autonomously refreshes sources in parallel, diffs against previous runs, and tracks item lifecycle solves all three. I built it first for personal job-listings tracking; it generalizes to any marketplace.

What I Built

Two complementary Claude Code skills automate web scraping at scale. /find-openings navigates a single source via headless browser and extracts the items that match a profile. /refresh-openings runs that single-source search across dozens of sources in parallel (10–12 per wave), then works through four steps: it diffs the fresh results against an existing markdown listings file using fuzzy title matching, validates links via the Greenhouse/Ashby APIs (or custom validators per source), verifies possibly-closed items by checking link liveness, and updates the file with status transitions (New → Active → Possibly Closed → Closed). All intermediate state persists to disk, so a run survives context compaction.
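
The wave batching and the fuzzy diff can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code in the repository: the function names, the 0.85 similarity threshold, and the use of the standard library's difflib as the matcher are all hypothetical choices, since the actual matching method is not described here.

```python
from difflib import SequenceMatcher


def waves(sources, size=10):
    """Split sources into consecutive waves of at most `size` items."""
    return [sources[i:i + size] for i in range(0, len(sources), size)]


def normalize(title):
    """Lowercase and collapse whitespace so cosmetic edits do not count."""
    return " ".join(title.lower().split())


def diff_titles(previous, fresh, threshold=0.85):
    """Pair each fresh title with its best unused previous title.

    Returns (matched, new, missing). The threshold is an assumed value.
    """
    unmatched = list(previous)
    matched, new = [], []
    for title in fresh:
        best, best_score = None, 0.0
        for candidate in unmatched:
            score = SequenceMatcher(
                None, normalize(title), normalize(candidate)
            ).ratio()
            if score > best_score:
                best, best_score = candidate, score
        if best is not None and best_score >= threshold:
            matched.append((best, title))
            unmatched.remove(best)  # each previous title matches at most once
        else:
            new.append(title)
    # Whatever is left was not seen this run: candidates for Possibly Closed.
    return matched, new, unmatched
```

Titles left in the third return value are the ones a refresh would go on to check for link liveness; titles in the second would enter the listings file as New.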

Notable

Two-phase closure (Possibly Closed → Closed) prevents false positives from temporary site issues. An item moves to Closed only if it is still unconfirmed on the second run AND its link is dead, which avoids phantom closures caused by upstream lag.
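
As a minimal sketch, the closure rule reads as a small transition function. The names Status and next_status are hypothetical, and two behaviours are assumptions not stated above: a Closed item stays Closed, and any item seen again in a run returns to Active.

```python
from enum import Enum


class Status(Enum):
    NEW = "New"
    ACTIVE = "Active"
    POSSIBLY_CLOSED = "Possibly Closed"
    CLOSED = "Closed"


def next_status(current, seen_in_run, link_alive):
    """Return the status of an item after one refresh run."""
    if current is Status.CLOSED:
        return Status.CLOSED  # assumption: Closed is terminal
    if seen_in_run:
        return Status.ACTIVE  # assumption: a sighting always (re)activates
    if current is Status.POSSIBLY_CLOSED and not link_alive:
        # Second consecutive miss AND a dead link: safe to close.
        return Status.CLOSED
    # First miss, or a repeat miss with a live link: keep waiting.
    return Status.POSSIBLY_CLOSED
```

Under this sketch a single missed run can never close an item, and a dead link alone is not enough either; both conditions have to hold on the run after the item was first flagged.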

Stack

Claude Code, Claude Opus agents, browse-multi, Greenhouse / Ashby APIs, YAML + Markdown + JSON

Status

Open source at github.com/blaizew/job-search.