Commit Graph

4 Commits

Author SHA1 Message Date
d11ca1d3be tools/epro2/std: fetch + decrypt Pro 2.x encrypted-external blobs
Pro 2.x stores some doc payloads (notably Taishan's PCB) externally at
modules.lceda.cn keyed by dataStrId, AES-256-GCM encrypted with the
iv/key fields stored alongside. Same crypto pattern as Pro 3.x EPRO2:
last 16 bytes are the GCM auth tag, rest is gzip(plaintext-op-stream).
The CDN doesn't require auth.

  - pro2_writer.fetch_encrypted_plaintext(): fetch + decrypt + gunzip,
    cache result at source/<uuid>.decrypted.txt so re-runs skip the
    network round-trip. Heavy imports (httpx, pycryptodome) are
    deferred to call-time so the pure-replay path doesn't pay for them.
  - pro2_writer.split_plaintext_by_doctype(): walk the multi-doc
    plaintext (Pro 2.x bundles N FOOTPRINTs + 1 PCB into one blob), yield
    (label, sub_text) per inner doc. Label = HEAD.uuid if present, else
    fallback `<kind>_<idx>`.
  - __main__._convert_pro2_encrypted(): for each sub-doc, write a
    synthetic inline-Pro-2.x JSON next to the original and re-route
    through write_pro2_doc — re-uses BBox / layers / objects-extraction
    instead of duplicating the logic. Output filename
    `<parent_uuid>__<sub_label>.json` makes the parent association
    visible.

Smoke (Taishan): 28 inline SCHs → 55 total. Decrypts:
  - one PCB blob (3.4 MB plaintext, 20267-object PCB + 25 FOOTPRINT
    sub-docs of 130-580 objects each)
  - one SCH-typed encrypted doc (1 sub-SCH of 891 objects)

86 unit tests still pass; new fetch/decrypt path is covered manually
via the smoke test rather than mocking httpx + AES.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 02:07:40 +08:00
3720cd176a tools/epro2/std: add Pro 2.x JSON path — Liangshan + Taishan SCH now exportable
The downstream colleague's "encrypted_external" / "string old format"
projects were Pro 2.x, not Pro 3.x EPRO2. Pro 2.x ships each doc as a
JSON file whose `dataStr` is a plaintext op-stream — one JSON array per
line, e.g. `["COMPONENT","e1","",0,0,0,0,{},0]`. Different wire format
from EPRO2's binary tilde/pipe streams; same Std envelope works for
output.

  - tools/epro2/std/pro2_writer.py: parses dataStr line-by-line, keys
    objects by id (position 1 for most ops, OPTYPE for singletons),
    extracts BBox by walking known coord positions per OPTYPE, derives
    layers from LAYER ops directly (Pro 2.x almost matches Std layer
    string format already). PCB blobs that are encrypted-external
    (`dataStrId` URL + `iv` + `key`, no inline dataStr — Taishan PCB)
    return None so the CLI skips with a message instead of stubbing.

  - tools/epro2/std/__main__.py: auto-detect via manifest's
    editor_version. "2.x" → Pro 2.x writer; otherwise the existing
    EPRO2 replay path. CLI surface and output layout unchanged.

  - docs/sources/epro2_to_std_mapping.md: adds a Pro 2.x section.
    Adapter dispatches on `head.epro_format`: absent / "epro2" gets
    dict-shaped objects values, "pro2" gets array-shaped values
    (`[OPTYPE, arg1, ...]`). Lists the Pro 2.x-specific OPTYPEs
    (FONTSTYLE / LINESTYLE / CONNECT / OBJ / REGION / DIMENSION /
    STRING / TEARDROP) the EPRO2 vocabulary doesn't have.

Smoke (re-running --all on all 5 Pro projects): 191 → 222 JSON files.
Liangshan adds 3 (2 SCH + inline 5357-object PCB). Taishan adds 28
(SCH only — PCB skipped, encrypted-external; source/<uuid>.json still
keeps the dataStrId/iv/key for a later fetch+decrypt pass).

84 → 86 unit tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 02:00:37 +08:00
3866e24189 tools/epro2/std: rewrite to Option 2 (objects dump) per downstream spec
Downstream came back with concrete requirements: don't pre-compute Std
shape[] tilde strings, just dump the raw EPRO2 `objects: {id: payload}`
dict and they'll write a ~100-LoC adapter on their side. Pulling the
tilde-mapping work back saves us from second-guessing positional fields
without their parser to verify against, and shortens our pcb_writer
from ~500 lines to ~40.

Output shape (Std envelope intact, just no `shape[]`):

    {
      "success": true, "code": 0,
      "result": {
        "uuid", "puuid", "title",
        "docType": 3 | 1,
        "components": {},
        "dataStr": {
          "head": {
            "docType": "3" | "1",
            "editorVersion": "facere-epro2/0.1 (epro2 <X.Y.Z>)",
            "units": "mil",
            "epro2_doc_uuid": ...,
            "epro2_editor_version": ...,
          },
          "BBox": {x, y, width, height},   # mil
          "layers": [...],                  # Std layer-string array
          "objects": dict(doc.objects),     # raw EPRO2, 1:1
          "preference": {}, "netColors": [], "DRCRULE": {},
        }
      }
    }

Per-doc spec downstream gave us:
  - shape[] dropped (empty placeholder misleads adapter)
  - all units mil (no mm conversion — Std canvas already declares mil)
  - head.units="mil" so adapter doesn't have to guess
  - BBox min/max across known x/y/startX/endX/centerX fields; adapter
    can refine by walking path arrays itself
  - layers[] keeps Std's 17-line default + inner SIGNAL layers actually
    used (21~Inner1.., 22~Inner2..)
  - empty stubs preference/netColors/DRCRULE for grep-based triage

New: docs/sources/epro2_to_std_mapping.md with the full EPRO2 OPTYPE →
Std verb table that downstream's adapter authors will copy from. Tables
include the layer-id remapping (the 5↔7 paste/mask flip, 11→10 outline,
12→11 multi, SIGNAL 15+→21+), PCB op mappings, SCH op mappings (marked
best-effort: no Std SCH samples in our corpus), and the 5-Voltage
placeholder COMPONENT → extra net flag trick. Extracted from the
previous Option-3 writer (commit fe6971f) so adapter writers don't
have to reverse-engineer it from source.

ESP-VoCat smoke: 6 PCB + 9 SCH = 15 JSON files, head.units=mil
preserved, no shape[] field present. 82 → 84 unit tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 01:41:12 +08:00
fe6971f3f9 tools/epro2: add std/ writer — EPRO2 → EasyEDA Std-format JSON for downstream
The downstream colleague consumes oshwhub Std (lceda) dict-format JSON,
not KiCad. The EPRO2 decryption part (per-doc plaintext .epro2 streams
in data/raw/<uuid>/source/) is what we already provide; the missing
piece is converting EPRO2 op-streams into the same `dataStr.shape`
tilde-delimited format their parser already speaks.

New tools/epro2/std/ module, peer of tools/epro2/kicad/, kept
deliberately separate so the KiCad path stays untouched:

  - pcb_writer.write_pcb_std() — high-fidelity, validated against a Std
    PCB sample at data/raw/oshwhub/3e2f893d.../25931ddab8.json. Maps
    LINE→TRACK, VIA→VIA, POUR→COPPERAREA (with SVG `M..L..Z` path),
    POLY→CIRCLE/SOLIDREGION, COMPONENT+FOOTPRINT→LIB nested with
    #@$-separated PADs (placement rotation + translate applied so pad
    coords land at PCB-absolute positions). Layer-id mapping (EPRO2 5↔7
    flipped vs Std solder/paste, 11→10 outline, 12→11 multi, SIGNAL
    inner 15+ → Std 21+) noted inline.

  - sch_writer.write_sch_std() — best-effort. Our corpus has zero Std
    schematic samples (docType=1) so verb field orders follow the
    EasyEDA Std public spec, not direct observation. Emits W (wire),
    N (net flag, including the 5-Voltage Global Net Name power-port
    pattern), T (text), LIB (placement with #@$-nested PIN/T). If
    downstream's parser bails the fix is almost certainly a positional
    field tweak, not a re-architecture.

  - __main__.py — flat output `<doc_uuid>.json` per doc directly under
    --out (mirrors Std's own data layout); --all-pcb / --all-sch / --all.

Smoke test on ESP-VoCat: 6 PCB + 9 SCH = 15 JSON files, libs_unresolved=0
across the board. Compact JSON (separators=(",",":")) matches Std's
single-line format. Numbers use _num() — integers without trailing .0,
floats trimmed.

71 → 82 unit tests pass.

Open questions for downstream: (1) confirm SCH verb field orders, (2)
do they want any of the upstream metadata fields we drop (master,
owner, created_at, etc — those live on the crawler side, not the
schematic itself)?

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 01:16:39 +08:00