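Add a Relations layer on top of replay's flat objects[id] -> payload map. It builds
indices and back-references that link isolated objects into a traversable graph, a
prerequisite for the intermediate representation of the upcoming EPRO2 → KiCad converter.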
Relations.build(doc) walks all objects in two passes and produces:
  Primary collections (bucketed by type):
    parts / components / pins / pads / wires / nets / layers / rules
  Composite ID parsing (the key part):
    '["LAYER",1]'                          → layers[1]
    '["NET","GND"]'                        → nets["GND"]
    '["PAD_NET","e0","1","e7"]'            → pad_nets_by_pad / pad_nets_by_net
    '["RULE","SAFE","copperThickness1oz"]' → rules[("RULE","SAFE",...)]
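The composite ID mapping above can be sketched as follows. This is a minimal illustration, not the module's code: `decode_composite_id` and `route` are hypothetical names, and the returned `(index name, key)` pairs only mirror the table above.

```python
import json


def decode_composite_id(raw):
    """Return the id as a tuple if it is a JSON-serialized array, else None."""
    if not isinstance(raw, str) or not raw.startswith("["):
        return None
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        return None
    return tuple(value) if isinstance(value, list) else None


def route(raw):
    """Map a composite id to (index name, key); hypothetical helper."""
    parts = decode_composite_id(raw)
    if not parts:
        return None
    kind = parts[0]
    if kind == "LAYER" and len(parts) >= 2:
        return ("layers", int(parts[1]))
    if kind == "NET" and len(parts) >= 2:
        return ("nets", str(parts[1]))
    if kind == "RULE":
        return ("rules", parts)  # the whole tuple is the key
    if kind == "PAD_NET" and len(parts) >= 4:
        return ("pad_nets", parts[1:4])  # (comp, pin, pad)
    return None
```

Plain ids such as `e12` are not JSON arrays, so they yield None and stay keyed by their raw string.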
  Back-references:
    obj_ids_by_part     partId → ids of referencing objects (RECT/TEXT/PIN inside a lib all carry partId)
    components_by_part  partId → component ids
    attrs_by_parent     parentId → ATTR ids
    lines_by_wire       WIRE.id → LINE ids (a wire is made of several LINE segments)
    pad_nets_by_pad     PAD.id → PAD_NET records
    pad_nets_by_net     net name → PAD_NET records
    objects_on_layer / objects_in_net  reverse lookup by field value
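Each back-reference above is the same pattern: group object ids by the value of one payload field. A simplified sketch follows. `build_reverse_index` is a hypothetical helper and the sample payloads are made up. The real build() also filters by `_type` (for example, only LINE objects feed lines_by_wire).

```python
from collections import defaultdict


def build_reverse_index(objects, field_name):
    """Group object ids by the (stringified) value found at field_name."""
    index = defaultdict(list)
    for obj_id, payload in objects.items():
        ref = payload.get(field_name)
        if ref is not None:
            index[str(ref)].append(obj_id)
    return index


# Made-up sample data for illustration only.
objects = {
    "e1": {"_type": "WIRE"},
    "e2": {"_type": "LINE", "lineGroup": "e1"},
    "e3": {"_type": "LINE", "lineGroup": "e1"},
    "e4": {"_type": "ATTR", "parentId": "e1", "key": "NET", "value": "GND"},
}
lines_by_wire = build_reverse_index(objects, "lineGroup")
attrs_by_parent = build_reverse_index(objects, "parentId")
```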
Convenience accessor:
  attrs_dict(parent_id)  folds all ATTR ops into a {key: value} dict (last
                         write wins); the usual entry point for fetching
                         Designator/Value/Footprint per component during KiCad conversion
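The "last write wins" fold can be illustrated in isolation. This sketch assumes ATTR payloads carry `key` and `value` fields, as attrs_dict() reads them; `fold_attrs` and the sample values are hypothetical.

```python
def fold_attrs(attr_payloads):
    """Collapse ATTR payloads into {key: value}; later entries overwrite earlier ones."""
    out = {}
    for payload in attr_payloads:
        key = payload.get("key")
        if key is not None:
            out[str(key)] = payload.get("value")
    return out


# Made-up ATTR ops in replay order; the second "Value" replaces the first.
attrs = [
    {"key": "Designator", "value": "R1"},
    {"key": "Value", "value": "10k"},
    {"key": "Value", "value": "4.7k"},
]
folded = fold_attrs(attrs)
```

The result depends on iteration order, so the fold is only meaningful if the ATTR ids are stored in replay order.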
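ATTR.parentId resolution (two pitfalls found in real data):
  1. It points not only at COMPONENT/PART but also, in large numbers, at WIRE
     (net labels / net attributes on the schematic). The original check missed
     these and reported 636 false-positive unresolved parents; it now treats
     any hit in doc.objects[parentId] as resolved.
  2. The compound form `<comp_id>-<pin_id>` attaches an ATTR to a specific pin
     of a component (e.g. PinName). `_resolve_parent()` falls back to split("-", 1).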
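A minimal sketch of the two resolution rules above, written against a plain dict instead of a Document. The function name and sample ids are hypothetical, and `str.partition` is used here as an equivalent of taking `split("-", 1)[0]`.

```python
def resolve_parent(parent_id, objects):
    """True if parent_id names a known object directly or via its '<comp>-<pin>' head."""
    if parent_id in objects:
        return True
    head, sep, _tail = parent_id.partition("-")
    # Only fall back to the head when a '-' was actually present.
    return bool(sep) and head in objects


# Made-up objects: one component and one wire.
objects = {
    "e10": {"_type": "COMPONENT"},
    "e20": {"_type": "WIRE"},
}
```

Only the head of the compound id is checked, so `e10-e3` resolves whenever `e10` exists, whether or not a pin `e3` does.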
The CLI gains --relations, which aggregates stats by docType:
  uv run python -m tools.epro2 data/raw/oshwhub/<uuid> --relations
Validation on ESP-VoCat:
  SCH_PAGE    9 docs: 572 components, 563 wires, 934 lines_grouped,
                      4111 attrs_attached, 0 unresolved_parents
  PCB         6 docs: 206 components, 807 pad_nets, 173 nets, 544 layers
  SYMBOL    105 docs: 106 parts, 560 pins, 1680 attrs_attached
  FOOTPRINT  55 docs: 496 pads, 9 nets, 1771 layers, 140 rules
Note: pads=6 vs pad_nets=807 within PCB is not a contradiction. PAD instances live
in FOOTPRINT documents, and the PCB stream references them across documents via the
composite id ["PAD_NET",comp,pin,pad]. Resolving "which pad of which footprint a
component's pin goes through" needs project-level Relations aggregation (next task).
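Within a single PCB document, the PAD_NET composite id can still be unpacked and grouped by net, as in this sketch. It assumes the net name is stored in `padNet`, as build() reads it; `group_pad_nets` and the sample records are hypothetical. It does not attempt the cross-document pad lookup described in the note.

```python
import json
from collections import defaultdict


def group_pad_nets(objects):
    """Return net name -> [(comp, pin, pad), ...] from PAD_NET objects."""
    by_net = defaultdict(list)
    for obj_id, payload in objects.items():
        if payload.get("_type") != "PAD_NET":
            continue
        try:
            parts = json.loads(obj_id)
        except json.JSONDecodeError:
            continue  # not a composite id
        if not isinstance(parts, list) or len(parts) < 4 or parts[0] != "PAD_NET":
            continue
        comp, pin, pad = (str(p) for p in parts[1:4])
        net = payload.get("padNet")
        if net:
            by_net[str(net)].append((comp, pin, pad))
    return by_net


# Made-up PCB-stream records for illustration only.
objects = {
    '["PAD_NET","e0","1","e7"]': {"_type": "PAD_NET", "padNet": "GND"},
    '["PAD_NET","e0","2","e8"]': {"_type": "PAD_NET", "padNet": "VCC"},
    '["PAD_NET","e5","1","e7"]': {"_type": "PAD_NET", "padNet": "GND"},
}
by_net = group_pad_nets(objects)
```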
Tests: tools/epro2/tests/test_relations.py has 9 unit tests covering composite id
parsing, lineGroup linking, direct and compound parentId resolution, partId reverse
lookup, and attrs folding. parser + relations: 15/15 passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
272 lines
12 KiB
Python
"""Build cross-object relationship indices from a replayed Document.

After ``replay.Document`` flattens the EPRO2 stream into ``objects[id] -> payload``,
this module walks those payloads to build the secondary indices needed for
downstream translation (KiCad export, graph extraction, etc).

Relationships modeled (empirically; see docs/sources/easyeda_pro_source.md §3
+ probe results 2026-04-28 on ESP-VoCat):

  PART      --(id, dotted name)--> primitives via primitive.partId  (lib/parts)
  COMPONENT --(.partId)--> PART (sch) or footprint via ATTR  (pcb)
  ATTR      --(.parentId)--> any doc object (COMPONENT/PART/WIRE/PAD/PIN)
                             or compound ``<comp_id>-<pin_id>``
  LINE      --(.lineGroup)--> WIRE  (sch wire segments)
  PAD_NET[id=["PAD_NET",comp,pin,pad]] --(.padNet)--> NET[id=["NET",name]]
  any obj   --(.layerId)--> LAYER[id=["LAYER",N]]  (pcb)
  any obj   --(.netName)--> NET  (pcb)

Composite IDs (e.g. ``'["LAYER",1]'``) are emitted by the editor as JSON
serialized arrays. We parse them lazily; see ``parse_composite_id``.
"""

from __future__ import annotations

import json
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any

from .replay import Document


def parse_composite_id(s: str) -> list | None:
    """Best-effort decode an id field that's a serialized JSON array.

    Returns the list if the string looks like a JSON array, else None.
    """
    if not isinstance(s, str) or not s.startswith("["):
        return None
    try:
        v = json.loads(s)
    except json.JSONDecodeError:
        return None
    return v if isinstance(v, list) else None


def _resolve_parent(parent_id: str, doc: Document) -> bool:
    """Check whether ``parent_id`` references something we know about.

    Accepts:
      - direct hit on ``doc.objects`` (any _type: COMPONENT/WIRE/PART/PAD/PIN/...)
      - compound ``<a>-<b>`` where ``<a>`` resolves to a doc object
        (used for "component+pin" addressing in schematic ATTR ops)
    """
    if parent_id in doc.objects:
        return True
    if "-" in parent_id:
        head = parent_id.split("-", 1)[0]
        if head in doc.objects:
            return True
    return False


@dataclass
class Relations:
    """Indices built from one ``Document``. Cheap to (re)build.

    Lookup conventions:
      - "by_id" maps a primitive's id to its payload.
      - "by_<key>" maps the value at <key> to a list of object ids referencing it.
      - composite-keyed maps use the parsed tuple as key (e.g. layer int).
    """

    doc: Document

    # Primitive collections by type ----------------------------------------
    parts: dict[str, dict] = field(default_factory=dict)       # PART.id (dotted) → payload
    components: dict[str, dict] = field(default_factory=dict)  # COMPONENT.id → payload
    pins: dict[str, dict] = field(default_factory=dict)        # PIN.id → payload
    pads: dict[str, dict] = field(default_factory=dict)        # PAD.id → payload
    wires: dict[str, dict] = field(default_factory=dict)       # WIRE.id → payload
    nets: dict[str, dict] = field(default_factory=dict)        # NET name → payload
    layers: dict[int, dict] = field(default_factory=dict)      # LAYER int → payload
    rules: dict[tuple, dict] = field(default_factory=dict)     # ("RULE", ...) tuple → payload

    # Cross-references -----------------------------------------------------
    obj_ids_by_part: dict[str, list[str]] = field(default_factory=lambda: defaultdict(list))
    """partId (dotted name OR `pid...` prefix) → ids of non-COMPONENT objects referencing it."""

    components_by_part: dict[str, list[str]] = field(default_factory=lambda: defaultdict(list))
    """partId → component ids whose COMPONENT.partId == this."""

    attrs_by_parent: dict[str, list[str]] = field(default_factory=lambda: defaultdict(list))
    """parentId → ATTR ids attached."""

    lines_by_wire: dict[str, list[str]] = field(default_factory=lambda: defaultdict(list))
    """WIRE.id → LINE ids whose lineGroup == this."""

    pad_nets_by_pad: dict[str, list[dict]] = field(default_factory=lambda: defaultdict(list))
    """PAD.id → [{comp, pin, pad, net_name, payload}, ...]."""

    pad_nets_by_net: dict[str, list[dict]] = field(default_factory=lambda: defaultdict(list))
    """net_name (from PAD_NET.padNet) → [{comp, pin, pad, net_name, payload}, ...]."""

    objects_on_layer: dict[int, list[str]] = field(default_factory=lambda: defaultdict(list))
    """layer int → object ids whose payload.layerId == this."""

    objects_in_net: dict[str, list[str]] = field(default_factory=lambda: defaultdict(list))
    """net name (payload.netName) → object ids."""

    # Diagnostics ----------------------------------------------------------
    unresolved_parents: int = 0  # ATTR.parentId matches no doc object, directly or via compound head
    unresolved_wires: int = 0    # LINE.lineGroup points to nothing in wires
    unresolved_layers: int = 0   # payload.layerId points to nothing in layers (pcb only)
    bad_composite_ids: int = 0   # NET/LAYER/RULE/PAD_NET ids that failed to parse as expected

    # ----------------------------------------------------------------------

    @classmethod
    def build(cls, doc: Document) -> "Relations":
        rel = cls(doc=doc)

        # First pass: bucket primitives by type, parse composite ids.
        for obj_id, payload in doc.objects.items():
            t = payload.get("_type")

            if t == "PART":
                # PART payload uses head.id as its key (e.g. "0.96_inch_lcd.1").
                # In our replay, doc.objects[obj_id] has _type=PART; obj_id IS the part id.
                rel.parts[obj_id] = payload
            elif t == "COMPONENT":
                rel.components[obj_id] = payload
                if part_ref := payload.get("partId"):
                    rel.components_by_part[str(part_ref)].append(obj_id)
            elif t == "PIN":
                rel.pins[obj_id] = payload
            elif t == "PAD":
                rel.pads[obj_id] = payload
            elif t == "WIRE":
                rel.wires[obj_id] = payload
            elif t == "NET":
                # NET id is `["NET", "<name>"]`
                comp = parse_composite_id(obj_id)
                if comp and len(comp) >= 2 and comp[0] == "NET":
                    rel.nets[str(comp[1])] = payload
                else:
                    rel.bad_composite_ids += 1
            elif t == "LAYER":
                # LAYER id is `["LAYER", <int>]`
                comp = parse_composite_id(obj_id)
                if comp and len(comp) >= 2 and comp[0] == "LAYER":
                    try:
                        rel.layers[int(comp[1])] = payload
                    except (TypeError, ValueError):
                        rel.bad_composite_ids += 1
                else:
                    rel.bad_composite_ids += 1
            elif t == "RULE":
                # The whole parsed array, as a tuple, is the key.
                comp = parse_composite_id(obj_id)
                if comp and comp[0] == "RULE":
                    rel.rules[tuple(comp)] = payload
                else:
                    rel.bad_composite_ids += 1
            elif t == "PAD_NET":
                # id is `["PAD_NET", <comp_id>, <pin_num>, <pad_id>]`
                # payload.padNet = "<net name>"
                comp = parse_composite_id(obj_id)
                if comp and len(comp) >= 4 and comp[0] == "PAD_NET":
                    c_id, pin_num, pad_id = (str(x) for x in comp[1:4])
                    net_name = payload.get("padNet")
                    record = {
                        "comp": c_id,
                        "pin": pin_num,
                        "pad": pad_id,
                        "net_name": net_name,
                        "payload": payload,
                    }
                    rel.pad_nets_by_pad[pad_id].append(record)
                    if net_name:
                        rel.pad_nets_by_net[str(net_name)].append(record)
                else:
                    rel.bad_composite_ids += 1

        # Second pass: cross-references that need full primitive maps available.
        for obj_id, payload in doc.objects.items():
            t = payload.get("_type")

            # partId fan-in for non-COMPONENT objects: RECT/TEXT/PIN inside
            # SYMBOL/FOOTPRINT all carry partId pointing at their containing PART.
            # COMPONENTs are already indexed in components_by_part (first pass).
            if (part_ref := payload.get("partId")) and t != "COMPONENT":
                rel.obj_ids_by_part[str(part_ref)].append(obj_id)

            # ATTR → parent. parentId may target any addressable object in the doc
            # (COMPONENT / WIRE / PART / PAD / PIN), or a compound `<a>-<b>` form
            # where <a> is a component and <b> is its pin/sub-ref.
            if t == "ATTR":
                if parent := payload.get("parentId"):
                    parent_str = str(parent)
                    rel.attrs_by_parent[parent_str].append(obj_id)
                    if not _resolve_parent(parent_str, doc):
                        rel.unresolved_parents += 1

            # LINE → wire
            if t == "LINE":
                if wire_ref := payload.get("lineGroup"):
                    # Use the same str() key for both the index and the lookup.
                    wire_key = str(wire_ref)
                    rel.lines_by_wire[wire_key].append(obj_id)
                    if wire_key not in rel.wires:
                        rel.unresolved_wires += 1

            # any obj on layer
            if (lid := payload.get("layerId")) is not None:
                try:
                    lid_int = int(lid)
                except (TypeError, ValueError):
                    lid_int = None  # non-numeric layerId: skip silently
                if lid_int is not None:
                    rel.objects_on_layer[lid_int].append(obj_id)
                    if lid_int not in rel.layers:
                        rel.unresolved_layers += 1

            # any obj in net
            if net_name := payload.get("netName"):
                rel.objects_in_net[str(net_name)].append(obj_id)

        return rel

    # Accessor helpers -----------------------------------------------------

    def part_for_component(self, comp_id: str) -> dict | None:
        """Return the PART payload for a COMPONENT, if resolvable.

        In schematic context, COMPONENT.partId is a `pid...` prefix string that
        does NOT match PART.id directly; the editor resolves it via library
        cache. We try a best-effort match on the raw partId; callers handle None.
        """
        comp = self.components.get(comp_id)
        if not comp:
            return None
        return self.parts.get(str(comp.get("partId", "")))

    def attrs_dict(self, parent_id: str) -> dict[str, Any]:
        """Convenience: collapse all ATTR ops with parentId == ``parent_id`` into a
        flat ``{key: value}`` dict. Last write wins on duplicate keys.
        """
        out: dict[str, Any] = {}
        for attr_id in self.attrs_by_parent.get(parent_id, []):
            payload = self.doc.objects.get(attr_id) or {}
            k = payload.get("key")
            if k is not None:
                out[str(k)] = payload.get("value")
        return out

    def summary(self) -> dict[str, int]:
        """Stats for CLI / tests / sanity checks."""
        return {
            "parts": len(self.parts),
            "components": len(self.components),
            "pins": len(self.pins),
            "pads": len(self.pads),
            "wires": len(self.wires),
            "nets": len(self.nets),
            "layers": len(self.layers),
            "rules": len(self.rules),
            "lines_grouped": sum(len(v) for v in self.lines_by_wire.values()),
            "attrs_attached": sum(len(v) for v in self.attrs_by_parent.values()),
            "pad_nets": sum(len(v) for v in self.pad_nets_by_pad.values()),
            "objects_on_layer": sum(len(v) for v in self.objects_on_layer.values()),
            "objects_in_net": sum(len(v) for v in self.objects_in_net.values()),
            "unresolved_parents": self.unresolved_parents,
            "unresolved_wires": self.unresolved_wires,
            "unresolved_layers": self.unresolved_layers,
            "bad_composite_ids": self.bad_composite_ids,
        }