Colleague-facing explainer at docs/sources/pro_crawl_vs_export.md.
Addresses the "I see 278 .epro2 files but my browser only downloaded
one" confusion: the web download is a ZIP container (the extension is a
UX choice, not a format), whereas our crawl produces per-doc message
streams. Both carry equivalent EPRO2 data; the only real gap is the
IMAGE/ binary previews, which we don't fetch yet.
Why per-doc and not ZIP: the ZIP path has no public endpoint. Three
HARs confirm that the export button fires zero HTTP requests; it is
pure client-side JSZip running on data the editor has already loaded.
Our crawler hits the same chain endpoints the editor uses internally,
which deliver per-doc streams.
The log entry references the 278 vs 266 doc-count delta for ESP-VoCat
(we walk the full history chain; the web export is a current snapshot).
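
A minimal sketch of the "extension is not a format" point above: checking
whether a downloaded blob is really a ZIP container, using only the standard
library. The helper name `describe_download` and the member names in the demo
archive are made up for illustration; the real archive layout is not shown in
this commit.

```python
import io
import zipfile


def describe_download(data: bytes) -> list[str]:
    """Return member names if `data` is a ZIP container, else an empty list."""
    buf = io.BytesIO(data)
    if not zipfile.is_zipfile(buf):
        return []
    buf.seek(0)
    with zipfile.ZipFile(buf) as zf:
        return zf.namelist()


# Build a stand-in archive in memory; both member names are hypothetical.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("member_1.txt", "placeholder")
    zf.writestr("IMAGE/preview.bin", b"\x00\x01")

members = describe_download(demo.getvalue())
not_a_zip = describe_download(b"plain bytes, not an archive")
```

The check looks at the container signature rather than the file name, which is
why a single `.epro2` download can still hold many documents.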
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three coupled changes so kicad-cli sch erc runs at the project level
(across all sheets of one schematic) instead of single-sheet:
1. (label) → (global_label (shape passive)). EPRO2 nets are
project-global by construction (named rails span every page in the
SCH and physically wire across PCBs); KiCad's local label is sheet-
scoped and triggers `label_dangling` for any name not duplicated on
the same page.
2. New root_sch_writer that groups SCH_PAGE docs by their parent SCH
(META.schematic), emits one root .kicad_sch per group with one
(sheet ...) entry per child, and threads the root-assigned uuid back
into each child's (sheet_instances) so KiCad can bind them.
--all-sch now defaults to this; --flat falls back to one-file-per-page.
3. EPRO2's "5-Voltage" placeholder COMPONENT (partId
pid8a0e77bacb214e, 365 instances on ESP-VoCat) is the editor's power
port. The rail name lives in the placement's `Global Net Name` ATTR,
not in the PART. We now emit a (global_label "<rail>") at the
placement coords whenever that attr is set (101/365 of them on
ESP-VoCat — the rest are unconfigured drafts).
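
The grouping step in item 2 can be sketched roughly as follows. This is an
illustration under assumptions, not the actual root_sch_writer: the dict shape
(`docType`, `id`, `META["schematic"]`) and both function names are
hypothetical stand-ins for whatever the crawler really stores.

```python
import uuid
from collections import defaultdict


def group_pages_by_schematic(docs):
    """Bucket SCH_PAGE docs by their parent schematic id (assumed field names)."""
    groups = defaultdict(list)
    for doc in docs:
        if doc.get("docType") != "SCH_PAGE":
            continue  # only pages are grouped; other doc types are skipped
        groups[doc["META"]["schematic"]].append(doc)
    return dict(groups)


def assign_sheet_uuids(pages):
    """Give each child page one uuid that both root and child would reference."""
    return {page["id"]: str(uuid.uuid4()) for page in pages}


docs = [
    {"docType": "SCH_PAGE", "id": "p1", "META": {"schematic": "core"}},
    {"docType": "SCH_PAGE", "id": "p2", "META": {"schematic": "core"}},
    {"docType": "SCH_PAGE", "id": "p3", "META": {"schematic": "led"}},
    {"docType": "PCB", "id": "b1", "META": {}},
]
groups = group_pages_by_schematic(docs)
core_uuids = assign_sheet_uuids(groups["core"])
```

Generating the uuid once per child and handing the same value to both writers
is what lets the root's sheet entry and the child's instance data agree.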
ESP-VoCat 5 hierarchical roots: 2325 → 2265 violations. Modest because
5 of 6 SCHs are single-page (no cross-sheet nets to resolve), and the
one 4-page schematic (CoreBoard) shares only a handful of names across
sheets — most net names are de-facto sheet-local. The remaining ~190
pin_not_connected are dominated by 0402-style passives whose pin tip
lies on a wire's interior, not at an endpoint; KiCad needs an explicit
(junction) at those points and we don't yet emit one. Marked as the
next follow-up in log.md.
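
For the follow-up, the "pin tip on a wire's interior" case could be detected
with a plain point-on-segment test. A minimal sketch, assuming integer
coordinates in a shared unit; `lies_on_interior` is a hypothetical helper and
not part of the current writers.

```python
def lies_on_interior(p, a, b):
    """True if point p lies on segment a-b but is not one of its endpoints."""
    if p == a or p == b:
        return False
    (px, py), (ax, ay), (bx, by) = p, a, b
    # Zero cross product means p is collinear with a and b.
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    if cross != 0:
        return False
    # Collinear: p is on the segment only if it falls inside the bounding box.
    return min(ax, bx) <= px <= max(ax, bx) and min(ay, by) <= py <= max(ay, by)


mid_wire = lies_on_interior((5, 0), (0, 0), (10, 0))
at_endpoint = lies_on_interior((10, 0), (0, 0), (10, 0))
off_wire = lies_on_interior((5, 1), (0, 0), (10, 0))
beyond_end = lies_on_interior((15, 0), (0, 0), (10, 0))
```

Every pin tip for which this returns True would be a candidate location for an
explicit (junction).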
47 → 52 unit tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bisect found two semantics mismatches between EPRO2 and KiCad that cause
the 850 real-connectivity ERC violations on the ESP-VoCat ref project:
1. sym_writer was emitting lib coords without negating Y, but KiCad lib
uses Y-up and re-flips Y on placement (Y-down schematic). So vertically
arranged pins ended up at Y-mirrored absolute positions and wires that
reach the geometric pin tip in EPRO2 missed the rendered pin tip in
KiCad. Fix: lib_y = -epro2_y, lib_rot = (360 - rot) % 360 for pin/text.
2. sch_writer was treating each LINE as an isolated wire — but EPRO2
binds segments into nets by NAME (WIRE.NET attr), not just geometry.
Multi-segment nets like GND/VBUS show up as N disconnected stubs to
KiCad. Fix: per-LINE, look up lineGroup → WIRE → NET attr and emit a
`(label "<NET>")` at the LINE's start. Same-named labels on distinct
physical wires are how KiCad's ERC recognizes a multi-segment net.
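
The coordinate fix in item 1 amounts to the following transform. This is a
sketch of the stated formulas only; the function name `to_lib_coords` is
hypothetical and the real sym_writer may structure this differently.

```python
def to_lib_coords(x, y, rot):
    """Map an EPRO2 pin/text position to KiCad library space.

    X is kept, Y is negated (lib_y = -epro2_y), and the rotation is
    mirrored as (360 - rot) % 360.
    """
    return x, -y, (360 - rot) % 360


pin_a = to_lib_coords(10, 20, 90)
pin_b = to_lib_coords(-5, -30, 0)
```

The `% 360` keeps a rotation of 0 at 0 instead of turning it into 360.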
ESP-VoCat 9 sheets:
wire_dangling 444 → 52 (-88%)
pin_not_connected 406 → 196 (-52%)
real connectivity total 850 → 248 (-71%)
Why we did NOT round to grid (the obvious-looking fix): EPRO2 places
some pins on a 10-mil pitch (e.g. magnetic socket); rounding to KiCad's
default 50-mil ERC grid would collapse those pins. The 248 residual is
fundamentally cross-sheet — single-sheet ERC can't see a net's other
endpoints on sibling sheets — and is a Phase-3 (hierarchical sheet)
problem, not a per-sheet one.
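
To make the grid argument concrete, here is a small sketch of what snapping
would do to pins on a 10-mil pitch. The pin offsets are invented for the
example and `snap` is a hypothetical helper; only the 10-mil and 50-mil
figures come from the text above.

```python
def snap(value_mil, grid_mil=50):
    """Round a coordinate in mils to the nearest multiple of grid_mil."""
    return round(value_mil / grid_mil) * grid_mil


# Five pins spaced 10 mil apart (illustrative offsets).
pins = [0, 10, 20, 30, 40]
snapped = [snap(v) for v in pins]
distinct_positions = len(set(snapped))
```

Five distinct pins end up on only two grid points, so rounding would merge
pins that are electrically separate.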
41 → 46 unit tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Probed the listing API and learned: the total field is exposed (Pro=21,202 /
Std=12,493), pageSize values of 1000 and above are accepted (full corpus = 35
requests / 71s), and the sort param is silently ignored. Dump all listings via
scripts/dump_listing_index.py to local jsonl so downstream batch-selection no
longer hits the API.
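
The 35-request figure follows from the two totals at a page size of 1000. A
quick arithmetic check, assuming each origin is paged separately:

```python
import math

TOTALS = {"Pro": 21202, "Std": 12493}  # totals reported by the listing API
PAGE_SIZE = 1000

# One request per page, rounded up for the final partial page.
requests_per_origin = {
    origin: math.ceil(total / PAGE_SIZE) for origin, total in TOTALS.items()
}
total_requests = sum(requests_per_origin.values())
```

That gives 22 pages for Pro and 13 for Std.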
Why: we needed quantitative anchors before scaling the Pro batch beyond
top-5. License is exposed only on the detail page (~19h serial scan), so we
want to filter on grade/like *locally* first and shortlist before paying that
cost. Quality-tier counts are now known: A-tier (grade>=3 & like>=10) = 2,806
across both origins.
- scripts/dump_listing_index.py: one-shot scraper, polite QPS, streams to jsonl
- docs/sources/oshwhub_listing_full.md: human-readable report with growth
trends, quality tiers, owner concentration, and storage-budget anchors
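
The local A-tier cut (grade>=3 & like>=10) could look roughly like this when
run over the dumped jsonl. The field names `grade` and `like` are assumed to
match the listing payload, the sample rows are invented, and `is_a_tier` /
`shortlist` are hypothetical helpers rather than code from this commit.

```python
import json


def is_a_tier(row):
    """A-tier as defined above: grade >= 3 and like >= 10 (assumed field names)."""
    return row.get("grade", 0) >= 3 and row.get("like", 0) >= 10


def shortlist(lines):
    """Parse jsonl lines and keep only A-tier rows."""
    return [row for row in map(json.loads, lines) if is_a_tier(row)]


# Invented sample rows standing in for lines of the dumped index.
sample = [
    '{"id": "a", "grade": 3, "like": 12}',
    '{"id": "b", "grade": 4, "like": 2}',
    '{"id": "c", "grade": 1, "like": 50}',
    '{"id": "d", "grade": 5, "like": 10}',
]
picked = [row["id"] for row in shortlist(sample)]
```

Only the rows that survive this filter would need the expensive detail-page
license lookup.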
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>