FacereDataset/schemas/project.schema.json
Knowit d874278bc5 Add EasyEDA Std project source ingestion (10 boards backfilled)
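Get the project-source fetch pipeline (schematic + PCB dataStr) working end to end
for oshwhub origin=std projects. plan.md §1.6 originally assumed a login was required;
testing shows that lceda.cn/api/documents/<doc>?uuid=<doc>&path=<doc> is anonymously
accessible for public projects: no cookie needed, no risk of account bans.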

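Investigation: traces from 4 rounds of probing are kept in data/state/std_probe[1-5]/
(gitignored); the ajaxDetail endpoint was found by digging through the main.min.js
bundle of Std editor v6.5.51; docType distinguishes two response shapes
(schematic project view vs PCB document view).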

Crawler:
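  - make_source_client() uses a browser UA + lceda.cn/editor Referer, because the
    oshwhub /api/project/<uuid> endpoint rejects the FacereDataset/0.1 UA (CLAUDE.md
    UA exception clause: target site actively blocks custom UAs + public static resources)
  - fetch_std_source(): project metadata → version_documents → per-document dataStr →
    written to source/<doc>.json + source/manifest.json
  - --with-source (also fetch source when crawling new projects) / --backfill-source
    (only scan existing projects)
  - Self-imposed rate limit: QPS ≤ 0.2 (SLEEP_SOURCE = 5s)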
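
The anonymous document endpoint and the self-imposed rate limit listed above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual crawler: `build_document_url` and `Throttle` are hypothetical names, the `https://` scheme is an assumption (the message gives only host and path), and only the value `SLEEP_SOURCE = 5` comes from the commit message.

```python
import time

SLEEP_SOURCE = 5.0  # seconds between source requests, i.e. QPS <= 0.2


def build_document_url(doc_uuid: str) -> str:
    """Build the per-document URL; the https scheme is an assumption."""
    return (
        f"https://lceda.cn/api/documents/{doc_uuid}"
        f"?uuid={doc_uuid}&path={doc_uuid}"
    )


class Throttle:
    """Hypothetical helper: enforce a minimum interval between calls."""

    def __init__(self, interval=SLEEP_SOURCE, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self) -> float:
        """Sleep just long enough to respect the interval; return the delay."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay
```

Calling `wait()` before each document request keeps the crawler at or below one request per `SLEEP_SOURCE` seconds; the injectable `clock` and `sleep` arguments exist only to make the helper testable without real delays.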

Schema: add source_format / source_path / source_documents / editor_version
(the first 3 are locked in via enum, to ease later alignment with Pro / KiCad sources).
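
As an illustration of the new fields, one `source_documents` entry could be assembled as below. This is a sketch under assumptions: `make_source_document_entry` is a hypothetical helper, and serializing the document with `json.dumps` before hashing is an assumed choice; only the field names come from the schema.

```python
import hashlib
import json


def make_source_document_entry(doc_uuid, doc_type, payload, master=None):
    """Hypothetical helper: describe one saved source document.

    `payload` is the parsed JSON document to be stored at
    source/<doc_uuid>.json; size and sha256 are computed over the
    UTF-8 bytes that would be written.
    """
    raw = json.dumps(payload, ensure_ascii=False, sort_keys=True).encode("utf-8")
    entry = {
        "doc_uuid": doc_uuid,
        "docType": doc_type,  # EasyEDA Std: 1=schematic, 3=pcb
        "path": f"source/{doc_uuid}.json",
        "size": len(raw),
        "sha256": hashlib.sha256(raw).hexdigest(),
    }
    if master is not None:
        entry["master"] = master  # current head history hash
    return entry
```

Only `doc_uuid` and `path` are required by the schema, so `master` is left out when it is not known.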

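Backfill result: 10/10 succeeded, 45 documents, 33.2 MB; schema validation passes for all.
docTypes are mainly 1 (schematic) and 3 (pcb); the USB voltage/current meter has only PCB
documents (4: main board + cover + bottom plate + front panel; the author did not upload
schematic source).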

Full investigation: docs/sources/easyeda_std_source.md.
2026-04-28 20:07:40 +08:00

{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://git.deepknow.site/Facere/FacereDataset/schemas/project.schema.json",
"title": "FacereDataset Project",
"description": "Unified project record. Shared across sources (oshwhub / hackaday / github / ...).",
"type": "object",
"required": [
"source",
"source_url",
"project_id",
"title",
"author",
"license",
"crawled_at",
"files"
],
"properties": {
"source": {
"type": "string",
"description": "Data source identifier, e.g. 'oshwhub', 'hackaday', 'github'",
"enum": ["oshwhub", "hackaday", "github", "cern_ohr", "wikifactory", "other"]
},
"source_url": { "type": "string", "format": "uri" },
"project_id": {
"type": "string",
"description": "Source site's internal ID (oshwhub uses a uuid)"
},
"title": { "type": "string" },
"description_short": {
"type": "string",
"description": "Short summary (< 200 chars)"
},
"description_path": {
"type": "string",
"description": "Path to the long-description markdown, relative to this project's directory, e.g. 'description.md'"
},
"author": {
"type": "object",
"required": ["username"],
"properties": {
"username": { "type": "string" },
"display_name": { "type": "string" },
"user_id": { "type": "string" }
}
},
"license": {
"type": "string",
"description": "Original license string; normalization mapping is done downstream. Use 'unknown' when not known"
},
"tags": {
"type": "array",
"items": { "type": "string" }
},
"created_at": { "type": "string", "format": "date-time" },
"updated_at": { "type": "string", "format": "date-time" },
"published_at": { "type": "string", "format": "date-time" },
"crawled_at": { "type": "string", "format": "date-time" },
"metrics": {
"type": "object",
"additionalProperties": true,
"description": "Arbitrary source-site statistics (likes/stars/views/forks, etc.)"
},
"cover": {
"type": "object",
"properties": {
"url": { "type": "string", "format": "uri" },
"path": { "type": "string", "description": "Local relative path, e.g. 'cover.png'" }
}
},
"files": {
"type": "array",
"items": {
"type": "object",
"required": ["name", "url"],
"properties": {
"name": { "type": "string" },
"url": { "type": "string", "format": "uri" },
"path": { "type": "string", "description": "Local relative path, e.g. 'files/xxx.pdf'. If omitted, only the URL is kept" },
"size": { "type": "integer" },
"md5": { "type": "string" },
"sha256": { "type": "string" },
"ext": { "type": "string" },
"mime": { "type": "string" },
"original_id": { "type": "string", "description": "Source site's internal file ID" }
}
}
},
"raw_fields": {
"type": "object",
"description": "Raw source-site fields that are hard to normalize but worth keeping (grade, download_count, etc.)",
"additionalProperties": true
},
"source_format": {
"type": "string",
"description": "EDA project source format tag, e.g. 'easyeda-std' (u.lceda.cn) / 'easyeda-pro' (pro.lceda.cn EPRO2) / 'kicad'.",
"enum": ["easyeda-std", "easyeda-pro", "kicad", "altium", "eagle", "other"]
},
"source_path": {
"type": "string",
"description": "Directory of the project source files, relative to this project's directory, e.g. 'source/'."
},
"source_documents": {
"type": "array",
"description": "List of project source documents. Each entry corresponds to one schematic / pcb / sheet document.",
"items": {
"type": "object",
"required": ["doc_uuid", "path"],
"properties": {
"doc_uuid": { "type": "string" },
"docType": { "type": "integer", "description": "EasyEDA Std: 1=schematic, 3=pcb (others still to be observed)" },
"master": { "type": "string", "description": "Current head history hash" },
"path": { "type": "string", "description": "Local relative path, e.g. 'source/<doc_uuid>.json'" },
"size": { "type": "integer" },
"sha256": { "type": "string" }
}
}
},
"editor_version": {
"type": "string",
"description": "EasyEDA / KiCad editor version (extracted from dataStr.head.editorVersion)."
}
},
"additionalProperties": false
}
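
A record can be spot-checked against the top-level `required` list and the two `enum` fields using the standard library alone. This is only a sketch and not a JSON Schema validator: `check_project_record` is a hypothetical helper, the constants are copied by hand from the schema above, the sample record uses placeholder values, and a real pipeline would more likely load the schema file into a full Draft 2020-12 validator.

```python
REQUIRED = ["source", "source_url", "project_id", "title",
            "author", "license", "crawled_at", "files"]
SOURCE_ENUM = ["oshwhub", "hackaday", "github", "cern_ohr", "wikifactory", "other"]
SOURCE_FORMAT_ENUM = ["easyeda-std", "easyeda-pro", "kicad", "altium", "eagle", "other"]


def check_project_record(record: dict) -> list:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = [f"missing required field: {k}" for k in REQUIRED if k not in record]
    if "source" in record and record["source"] not in SOURCE_ENUM:
        problems.append(f"source not in enum: {record['source']!r}")
    fmt = record.get("source_format")
    if fmt is not None and fmt not in SOURCE_FORMAT_ENUM:
        problems.append(f"source_format not in enum: {fmt!r}")
    # The author object requires a username when present.
    author = record.get("author")
    if isinstance(author, dict) and "username" not in author:
        problems.append("author.username is required")
    return problems


# Placeholder record for illustration only; none of these values are real data.
record = {
    "source": "oshwhub",
    "source_url": "https://example.invalid/project",
    "project_id": "00000000-0000-0000-0000-000000000000",
    "title": "Example board",
    "author": {"username": "example"},
    "license": "unknown",
    "crawled_at": "2026-01-01T00:00:00+00:00",
    "files": [],
    "source_format": "easyeda-std",
    "source_path": "source/",
}
```

Because the schema sets `additionalProperties` to false at the top level, a full validator would also reject unknown keys, which this sketch does not check.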