FacereDataset/docs/sources/epro2_to_std_mapping.md

# EPRO2 / Pro 2.x OPTYPE → EasyEDA Std shape verb mapping

This document is for downstream adapters that consume the Option-2 output
of `tools/epro2/std/` (a raw `objects: {id: payload}` dict in the `dataStr`
field) and need to produce real Std `shape[]` tilde strings.

This table is the same mapping our previous Phase-3 writer encoded
inline (`fe6971f:tools/epro2/std/pcb_writer.py`); we extracted it here
so adapter authors don't have to reverse-engineer it from the writer
source.

All EPRO2 coordinate fields are in **mil**; Std `dataStr.canvas` declares
`mil` as its unit, so the adapter copies coords through unchanged.

## Layer id remapping

EPRO2 and Std agree on the outer copper and silk layer ids (1..4), but the
solder-mask and paste-mask ids are swapped (5↔7, 6↔8), and the assembly,
OUTLINE, MULTI, DOCUMENT, MECHANICAL and inner SIGNAL layers are numbered
differently.

| EPRO2 id | EPRO2 type             | Std id | Std name              |
|---------:|------------------------|-------:|-----------------------|
| 1        | TOP                    | 1      | TopLayer              |
| 2        | BOTTOM                 | 2      | BottomLayer           |
| 3        | TOP_SILK               | 3      | TopSilkLayer          |
| 4        | BOT_SILK               | 4      | BottomSilkLayer       |
| 5        | TOP_SOLDER_MASK        | **7**  | TopSolderMaskLayer    |
| 6        | BOT_SOLDER_MASK        | **8**  | BottomSolderMaskLayer |
| 7        | TOP_PASTE_MASK         | **5**  | TopPasteMaskLayer     |
| 8        | BOT_PASTE_MASK         | **6**  | BottomPasteMaskLayer  |
| 9        | TOP_ASSEMBLY           | 13     | TopAssembly           |
| 10       | BOT_ASSEMBLY           | 14     | BottomAssembly        |
| 11       | OUTLINE                | 10     | BoardOutLine          |
| 12       | MULTI (THT pads)       | 11     | Multi-Layer           |
| 13       | DOCUMENT               | 12     | Document              |
| 14       | MECHANICAL             | 15     | Mechanical            |
| 15..46   | SIGNAL inner (in use)  | 21..50 | Inner1..InnerN        |

The 21..50 inner mapping is dense — assign Std `21` to the lowest-numbered
EPRO2 SIGNAL id actually carrying geometry on this board, `22` to the
next, etc. EPRO2 SIGNAL layers declared in LAYER ops but unused don't
need a Std slot.
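
The table and the dense inner-layer rule can be sketched as a small helper. This is a minimal illustration, not the writer's actual `_layer` implementation: `build_layer_map` and its `used_layer_ids` argument are hypothetical names, and the caller is assumed to have already collected the `layerId` values that carry geometry on the board.

```python
# Fixed EPRO2 -> Std ids, copied from the table above.
FIXED_LAYER_MAP = {
    1: 1, 2: 2, 3: 3, 4: 4,
    5: 7, 6: 8, 7: 5, 8: 6,
    9: 13, 10: 14, 11: 10, 12: 11, 13: 12, 14: 15,
}
FIRST_INNER_EPRO2, LAST_INNER_EPRO2 = 15, 46
FIRST_INNER_STD, LAST_INNER_STD = 21, 50


def build_layer_map(used_layer_ids):
    """Return an EPRO2 -> Std layer map for one board (hypothetical helper).

    used_layer_ids: iterable of EPRO2 layer ids that actually carry geometry.
    """
    layer_map = dict(FIXED_LAYER_MAP)
    inner_in_use = sorted(
        lid for lid in set(used_layer_ids)
        if FIRST_INNER_EPRO2 <= lid <= LAST_INNER_EPRO2
    )
    # Dense assignment: lowest used SIGNAL id gets Std 21, the next 22, ...
    for offset, epro2_id in enumerate(inner_in_use):
        std_id = FIRST_INNER_STD + offset
        if std_id > LAST_INNER_STD:
            raise ValueError("more inner layers in use than Std slots 21..50")
        layer_map[epro2_id] = std_id
    return layer_map


layer_map = build_layer_map([1, 2, 5, 12, 17, 20])
```

Unused SIGNAL ids (15 in this example) simply have no entry, so a lookup on them fails loudly instead of silently landing on a wrong Std layer.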

## PCB OPTYPE → Std shape verb (docType=3)

### LINE (copper trace, silk line, ...) → `TRACK`

```
TRACK~width~layer~net~points~uuid~locked
```
- `width` ← `LINE.width` (mil)
- `layer` ← `_layer(LINE.layerId)` via the table above
- `net` ← `LINE.netName` (string, may be empty for non-net graphics)
- `points` ← `"<startX> <startY> <endX> <endY>"` (mil, space-separated)
- `uuid` ← any unique `gge<8 hex>` id; downstream usually mints fresh
- `locked` ← `0`

EPRO2 doesn't distinguish copper trace from silk line at the op level —
both are LINE with a different `layerId`. Std uses `TRACK` for both;
the layer id is what disambiguates.
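
As a rough sketch of the field mapping above, the following builds one `TRACK` string from a LINE payload. It assumes the payload is a plain dict with `startX`/`startY`/`endX`/`endY` keys and that a layer map (EPRO2 id to Std id) is already available; `line_to_track` and `mint_gge_id` are illustrative names, not part of the writer.

```python
import uuid


def mint_gge_id():
    """Mint a fresh `gge<8 hex>` id (hypothetical helper)."""
    return "gge" + uuid.uuid4().hex[:8]


def line_to_track(line, layer_map, shape_id=None):
    """Build TRACK~width~layer~net~points~uuid~locked from a LINE payload."""
    points = "{} {} {} {}".format(
        line["startX"], line["startY"], line["endX"], line["endY"]
    )
    fields = [
        "TRACK",
        str(line["width"]),
        str(layer_map[line["layerId"]]),
        line.get("netName") or "",  # empty for non-net graphics
        points,
        shape_id or mint_gge_id(),
        "0",                        # locked
    ]
    return "~".join(fields)


track = line_to_track(
    {"width": 10, "layerId": 1, "netName": "GND",
     "startX": 100, "startY": 200, "endX": 300, "endY": 200},
    layer_map={1: 1},
    shape_id="gge0000abcd",
)
```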

### VIA → `VIA`

```
VIA~x~y~outerD~net~innerD~uuid~locked
```
- `x` `y` ← `VIA.centerX/centerY`
- `outerD` ← `VIA.viaDiameter`
- `net` ← `VIA.netName`
- `innerD` ← `VIA.holeDiameter`
- `uuid`, `locked` ← as for `TRACK`

### POUR → `COPPERAREA`

```
COPPERAREA~1~layer~net~svgPath~strokeWidth~~~~~~~uuid~locked
```
- `1` is the `id` slot Std uses; any int works
- `svgPath` ← convert `POUR.path` to SVG `M..L..Z` string. Three
  EPRO2 path encodings:
  - rectangle `[['R', x, y, w, h, ...]]` → 4-corner closed polygon
  - circle `[['CIRCLE', cx, cy, r]]` → 24-segment polygon approximation
  - polyline `[[x1, y1, 'L', x2, y2, ..., 'ARC', radius, endX, endY, ...]]`
    → walk numeric pairs as `M x y` (first) / `L x y` (rest); each ARC
    verb is chord-approximated as `L endX endY`. This is good enough for
    fill connectivity, so Phase-2 sticks with it; precise arc recovery is
    a follow-up.
- `strokeWidth` ← `POUR.width`
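
The three path encodings can be handled by one converter along these lines. This is a sketch under stated assumptions rather than the writer's code: the rectangle is assumed to have `(x, y)` as one corner with `w`/`h` extending in the positive direction, each arc is assumed to restate the `'ARC'` verb followed by a single radius operand, and `path_to_svg` is a hypothetical name.

```python
import math


def _fmt(value):
    """Format a coordinate with at most 4 decimals and no trailing zeros."""
    text = "%.4f" % (round(value, 4) + 0.0)  # "+ 0.0" turns -0.0 into 0.0
    return text.rstrip("0").rstrip(".")


def _polyline_points(tokens):
    """Collect (x, y) pairs; ARC segments collapse to their end point."""
    points = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token == "ARC":
            i += 2  # skip the verb and its radius; the end point follows
        elif isinstance(token, str):
            i += 1  # other verbs such as 'L' carry no extra operand
        elif i + 1 < len(tokens):
            points.append((token, tokens[i + 1]))
            i += 2
        else:
            break   # dangling single coordinate: ignore it
    return points


def path_to_svg(path, circle_segments=24):
    """Convert an EPRO2 path (list of sub-paths) to an SVG M..L..Z string."""
    parts = []
    for sub in path:
        if not sub:
            continue
        if sub[0] == "R":
            x, y, w, h = sub[1:5]
            points = [(x, y), (x + w, y), (x + w, y + h), (x, y + h)]
        elif sub[0] == "CIRCLE":
            cx, cy, r = sub[1:4]
            step = 2 * math.pi / circle_segments
            points = [(cx + r * math.cos(k * step), cy + r * math.sin(k * step))
                      for k in range(circle_segments)]
        else:
            points = _polyline_points(sub)
        if not points:
            continue
        cmds = ["M {} {}".format(_fmt(points[0][0]), _fmt(points[0][1]))]
        cmds += ["L {} {}".format(_fmt(px), _fmt(py)) for px, py in points[1:]]
        cmds.append("Z")
        parts.append(" ".join(cmds))
    return " ".join(parts)


rect_svg = path_to_svg([["R", 0, 0, 10, 5]])
poly_svg = path_to_svg([[0, 0, "L", 10, 0, "ARC", 5, 10, 10, "L", 0, 10]])
circle_svg = path_to_svg([["CIRCLE", 0, 0, 10]])
```

If the rectangle convention on a given board turns out to be y-down from the top-left corner, only the four-corner line needs to change.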

### FILL (manual filled region) → `SOLIDREGION`

```
SOLIDREGION~99~~svgPath~solid~uuid~~~~locked
```
- Same SVG-path encoding as COPPERAREA
- `99` is the `id` slot; the `~~` after it is an empty layer field
  (FILL on EPRO2 carries `layerId` but Std SOLIDREGION leaves it blank
  for "uses the path's natural color"; this is fine for downstream)

### POLY with `path[0] == 'CIRCLE'` → `CIRCLE`

```
CIRCLE~cx~cy~radius~strokeWidth~layer~uuid~locked~~
```

### POLY with polyline path → `SOLIDREGION` (graphic polygon)

Same as FILL.

### COMPONENT (+ its FOOTPRINT.PADs) → `LIB...#@$PAD...#@$TEXT...`

The Std `LIB` shape is one outer string plus N inner shapes joined by
the literal three-byte separator `#@$`. The outer carries placement; each
inner is a real PAD / TEXT shape with the **PCB-absolute coords** that
result from rotating + translating the FOOTPRINT-local pad positions.

Outer:
```
LIB~x~y~package_name`~rotation~~uuid~display~~~locked~~yes~~
```
- `x` `y` ← `COMPONENT.x/y` (mil)
- `package_name` ← FOOTPRINT META.title (then a literal trailing backtick)
- `rotation` ← `COMPONENT.angle` (degrees)
- `display` ← `1`, `locked` ← `0`

Inner PAD (one per FOOTPRINT.PAD owned by this COMPONENT):
```
PAD~shape~x~y~width~height~layer~net~num~drillSize~~rotation~uuid~0~~Y~0~0~0.2~x,y
```
- `shape` ← `defaultPad.padType` ∈ {`RECT`, `ELLIPSE`, `OVAL`, `POLYGON`}
- `x` `y` ← absolute coords:
  ```
  theta = radians(comp.angle)   # comp.angle is in degrees
  abs_x = comp.x + pad.centerX * cos(theta) - pad.centerY * sin(theta)
  abs_y = comp.y + pad.centerX * sin(theta) + pad.centerY * cos(theta)
  ```
- `width` `height` ← `defaultPad.width/height`
- `layer` ← `_layer(pad.layerId)` (typically 1=TOP, 2=BOTTOM, 11=Multi for THT)
- `net` ← resolve via the PCB-level `PAD_NET` ops. The PCB doc holds ops
  with composite ids `["PAD_NET", <component_id>, <pin_num>, <pad_id>]`
  whose `padNet` payload is the net name. This is a cross-doc lookup: the
  FOOTPRINT itself doesn't know the net of any specific instance.
- `num` ← `pad.num` (pin number, string)
- `drillSize` ← `pad.hole.width` if hole present, else `0`
- `rotation` ← `(pad.padAngle + comp.angle) % 360`
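
The rotate-then-translate step for `x`, `y` and `rotation` can be sketched as follows. It assumes `comp` and `pad` are plain dicts keyed by the field names used above, and that `comp.angle` is in degrees (so it is converted to radians before calling the trig functions); `pad_placement` is a hypothetical name.

```python
import math


def pad_placement(comp, pad):
    """Return (abs_x, abs_y, rotation) for one FOOTPRINT pad instance."""
    theta = math.radians(comp["angle"])  # degrees -> radians
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    abs_x = comp["x"] + pad["centerX"] * cos_t - pad["centerY"] * sin_t
    abs_y = comp["y"] + pad["centerX"] * sin_t + pad["centerY"] * cos_t
    rotation = (pad["padAngle"] + comp["angle"]) % 360
    # Round to hide floating-point noise such as 6e-16 from cos(90 deg).
    return round(abs_x, 4), round(abs_y, 4), rotation


placement = pad_placement(
    {"x": 100, "y": 200, "angle": 90},
    {"centerX": 10, "centerY": 0, "padAngle": 270},
)
```

A pad 10 mil to the right of the footprint origin ends up 10 mil above the component origin after a 90 degree rotation, and its own 270 degree angle wraps around to 0.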

Inner TEXT (designator + value, one each if attrs present):
```
TEXT~P~x~y~strokeWidth~rotation~mirror~layer~font~size~content~svgPath~visible
```
- `P` flag = property text (vs `L` for label)
- `content` ← attrs.Designator / attrs.Value pulled from ATTR ops with
  `parentId = component_id`

The downstream adapter doesn't need a separate ATTR walk: by the time
it has the COMPONENT's ATTR-derived attrs (Designator, Value, Footprint,
...), those are typically already collapsed into an `attrs_dict` map
(`tools.epro2.relations.Relations.attrs_dict(parent_id)` does this).
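
Putting the outer `LIB` template and the `#@$` separator from this section together, assembly might look like the sketch below. `lib_outer` and `assemble_lib` are hypothetical helpers, `comp` is assumed to be a plain dict, and the inner PAD / TEXT strings are assumed to have been built already.

```python
LIB_SEPARATOR = "#@$"  # literal three-byte separator between outer and inner shapes


def lib_outer(comp, package_name, shape_id):
    """Build LIB~x~y~package_name`~rotation~~uuid~display~~~locked~~yes~~."""
    fields = [
        "LIB",
        str(comp["x"]),
        str(comp["y"]),
        package_name + "`",  # literal trailing backtick
        str(comp["angle"]),
        "",
        shape_id,
        "1",                 # display
        "", "",
        "0",                 # locked
        "",
        "yes",
        "", "",
    ]
    return "~".join(fields)


def assemble_lib(outer, inner_shapes):
    """Join the outer LIB string and its inner shapes with '#@$'."""
    return LIB_SEPARATOR.join([outer] + list(inner_shapes))


outer = lib_outer({"x": 100, "y": 200, "angle": 90}, "R0603", "gge00000001")
lib = assemble_lib(outer, ["PAD~...", "TEXT~..."])  # placeholder inner shapes
```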

## Schematic OPTYPE → Std verb (docType=1, **best-effort**)

We have zero Std schematic samples in `data/raw/oshwhub/*/source/` (all
the projects we crawled are PCB-only Std exports), so the field orders
below follow the **EasyEDA Std public schematic spec**, not direct
observation. Adapter authors should expect to tweak field positions if
their parser rejects a verb.

### LINE → `W` (wire segment)

```
W~strokeColor~strokeWidth~strokeStyle~points~uuid~locked
```
- `points` ← same `<x1> <y1> <x2> <y2>` form as TRACK

### LINE.lineGroup with parent WIRE.NET attr → also emit `N` (net flag)

```
N~x~y~rotation~text~uuid~locked
```
EPRO2 binds wire segments by NET name, not just geometry. Place one N
flag at each LINE's start endpoint, with `text` set to the parent
WIRE op's `NET` ATTR value. Same-named flags on physically distinct
wire segments are how Std unifies a multi-segment named net.

### COMPONENT (+ its SYMBOL primitives) → `LIB...#@$P...`

Outer:
```
LIB~x~y~package`<symbol_title>`~rotation~~uuid~display~~~locked~~yes~~
```

Inner per SYMBOL.PIN:
```
P~show~0~~x~y~rotation~uuid^^pin_number^^pin_name^^length
```

(Note: PIN field separator inside the inner string uses `^^` not `~`,
per spec — but this varies by editor version. If downstream's parser
rejects PIN, this is the most likely culprit.)

### Power-port placeholder → `LIB` + extra `N`

EPRO2 represents power rails (VBUS / GND / VCC / VBAT_IN / ...) as a
generic placeholder COMPONENT with `partId = "pid8a0e77bacb214e"` whose
**Global Net Name** ATTR carries the rail name. For each such instance,
emit the regular `LIB` placement *plus* an `N` flag at the placement
coords with the Global Net Name as `text` — that's how the symbol's pin
binds to the global rail. (This mirrors the same fix our KiCad path uses
to emit a `(global_label)` for these.)
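
A minimal sketch of the extra `N` flag for power-port placeholders follows. It assumes the COMPONENT payload is a dict with `partId`, `x` and `y`, that its attrs are already collapsed into a dict, and that a rotation of `0` is acceptable for the flag; `power_port_net_flag` is a hypothetical name, and the `N` field order is the best-effort one given above.

```python
POWER_PORT_PART_ID = "pid8a0e77bacb214e"


def power_port_net_flag(comp, attrs, shape_id):
    """Return an N~x~y~rotation~text~uuid~locked string, or None.

    None means the component is not a power-port placeholder (or has no
    Global Net Name), so only the regular LIB placement is emitted.
    """
    if comp.get("partId") != POWER_PORT_PART_ID:
        return None
    rail = attrs.get("Global Net Name")
    if not rail:
        return None
    fields = ["N", str(comp["x"]), str(comp["y"]), "0", rail, shape_id, "0"]
    return "~".join(fields)


flag = power_port_net_flag(
    {"partId": "pid8a0e77bacb214e", "x": 400, "y": 250},
    {"Global Net Name": "GND"},
    "gge0000beef",
)
```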

### TEXT → `T`

```
T~x~y~rotation~text~uuid~locked
```

## Skipped / "not yet supported"

These exist in EPRO2 but our writer doesn't address them — adapters can
choose to skip silently or emit best-effort placeholders:

| EPRO2 op       | Std target   | Notes                                                       |
|----------------|--------------|-------------------------------------------------------------|
| TEARDROP       | (drop)       | Cosmetic fillets at via/pad-trace junctions                 |
| ARC (PCB)      | `ARC`        | Std verb exists; we emit only chord-approximated ones       |
| IMAGE          | `SVGNODE`    | Bitmap logos; Std stores as embedded SVG JSON               |
| STRING (PCB)   | `TEXT`       | Board-level text; field order distinct from PCB TEXT-in-LIB |
| BUS / BE (SCH) | `BUS` / `BE` | Bus + bus entry; no EPRO2 sample in our corpus              |

## Pro 2.x source format

Pro 2.x projects (lceda Pro editor 2.x — Liangshan Pi, Taishan Pi RK3566
in our corpus) use a **different on-disk format** than Pro 3.x EPRO2,
even though both come out of the same crawler. Detection: the
`source/manifest.json` file has `"editor_version": "2.x.x"`. Our
exporter auto-detects this and emits the same Std envelope, but with two
key differences the adapter must branch on:

- `result.dataStr.head.epro_format = "pro2"` (vs absent / `"epro2"` for
  Pro 3.x). This is the canonical dispatch field.
- `result.dataStr.objects` values are **JSON arrays**, not the
  `{"_type": ..., **fields}` dicts EPRO2 produces. The first array
  element is the OPTYPE (`["COMPONENT", "e1", "", 0, 0, 90, ...]`).
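
An adapter could branch on these two differences roughly as shown here. The sketch assumes the envelope has already been parsed so that `result.dataStr` is a dict rather than a JSON string; `iter_ops` is a hypothetical name.

```python
def iter_ops(envelope):
    """Yield (object_id, optype, payload) for both source formats."""
    data = envelope["result"]["dataStr"]
    # Absent epro_format means Pro 3.x EPRO2.
    epro_format = data.get("head", {}).get("epro_format") or "epro2"
    for object_id, payload in data["objects"].items():
        if epro_format == "pro2":
            optype = payload[0]        # JSON array, OPTYPE first
        else:
            optype = payload["_type"]  # dict with a "_type" key
        yield object_id, optype, payload


pro2_env = {"result": {"dataStr": {
    "head": {"epro_format": "pro2"},
    "objects": {"e1": ["COMPONENT", "e1", "", 0, 0, 90]},
}}}
epro2_env = {"result": {"dataStr": {
    "head": {},
    "objects": {"e7": {"_type": "VIA", "centerX": 10, "centerY": 20}},
}}}

pro2_ops = [(oid, op) for oid, op, _ in iter_ops(pro2_env)]
epro2_ops = [(oid, op) for oid, op, _ in iter_ops(epro2_env)]
```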

Pro 2.x op vocabulary overlaps EPRO2 but adds editor-specific helpers:
`FONTSTYLE` / `LINESTYLE` (referenced by id from text/stroke ops),
`CONNECT` (sch wire-end to pin binding), `OBJ` (group container),
`REGION` (sch background fills), `DIMENSION` (sch annotation),
`STRING` (PCB board-level text — distinct from PCB `TEXT`),
`TEARDROP` (cosmetic fillets at via/pad).

Field positions per OPTYPE follow the public EasyEDA Pro 2.x spec
(versioned via the leading `["DOCTYPE","SCH","1.1"]` / `["DOCTYPE",
"PCB","1.4"]` op). Our writer doesn't translate them; the adapter
dispatches by `arr[0]` (OPTYPE) and walks the rest by index.

### Encrypted-external PCB blobs

Some Pro 2.x PCB docs (and a handful of resource docs) replace the
inline `dataStr` field with `{"dataStrId": "https://modules.lceda.cn/...",
"iv": "...", "key": "..."}`: the actual op stream lives at the URL in
AES-encrypted form and must be decrypted with the iv + key. **Our
exporter skips these**; the `source/<uuid>.json` files still hold the
dataStrId/iv/key so a future fetch+decrypt pass can recover them. Taishan
PCB is the example in our corpus.

## Provenance fields the adapter can rely on

In addition to `objects`, our writer always emits:
- `result.dataStr.head.docType` = `"3"` (PCB) or `"1"` (SCH), the same
  string encoding Std uses
- `result.dataStr.head.units` = `"mil"`, an explicit unit hint so the
  adapter doesn't have to guess
- `result.dataStr.head.editorVersion` = `"facere-epro2/0.1 (epro2 X.Y.Z)"`
  where X.Y.Z is the EPRO2 doc's `editVersion`. Useful for triage when
  a board exhibits version-specific quirks.
- `result.dataStr.BBox` = `{x, y, width, height}`, the gross outer rectangle
  from min/max of every numeric `x/y/startX/startY/endX/endY/centerX/centerY`
  field across `objects`. Adapters that want a tighter BBox can refine
  by walking `path` arrays themselves.
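
For reference, the gross BBox rule can be reproduced along these lines. This is an illustrative sketch, not the writer's code: it handles only EPRO2 dict payloads, assumes `x`/`y` in the result denote the minimum corner, and `gross_bbox` is a hypothetical name.

```python
X_KEYS = ("x", "startX", "endX", "centerX")
Y_KEYS = ("y", "startY", "endY", "centerY")


def _numeric(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def gross_bbox(objects):
    """Min/max rectangle over the coordinate fields of every payload."""
    xs, ys = [], []
    for payload in objects.values():
        xs += [payload[k] for k in X_KEYS if _numeric(payload.get(k))]
        ys += [payload[k] for k in Y_KEYS if _numeric(payload.get(k))]
    if not xs or not ys:
        return None  # nothing with coordinates on this doc
    return {
        "x": min(xs),
        "y": min(ys),
        "width": max(xs) - min(xs),
        "height": max(ys) - min(ys),
    }


bbox = gross_bbox({
    "a": {"_type": "LINE", "startX": 0, "startY": 10, "endX": 100, "endY": 10},
    "b": {"_type": "VIA", "centerX": 50, "centerY": -20},
})
```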