Phase 1 MVP: crawl 10 high-quality oshwhub projects into LFS

Why:
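- Charles's directive: first crawl 10 high-quality projects into Gitea LFS,
  one folder per project, keeping the original files and URLs. Validate the
  schema + LFS pipeline on a small batch first, and settle the storage scale
  before ramping up.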

What:
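- crawlers/oshwhub: list API (`/api/project?sort=hot`) + SSR HTML parsing,
  producing metadata / description / cover / files / _urls in a single pass
- schemas/project.schema.json: unified cross-source schema
- docs/sources/oshwhub.md: research notes on API entry points / field mapping / pitfalls
- pyproject.toml: httpx[http2] as the only dependency
- .gitattributes: everything under data/raw/**/files/** goes through LFS (rule kept narrow so it does not catch schemas/*.json and similar files)
- .gitignore: remove the data/raw/* exclusion (these files are now committed via LFS)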

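The 10 projects cover: debugger / hot plate / Geiger counter / digitally
controlled power supply / soldering station / smartwatch / USB current meter /
ZVS induction heater / AI dev board / infrared thermal imager.
52 attachments in total ≈ 524 MB stored in LFS; selection criteria:
grade=4 & likes>=100 & diversity.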
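
The selection criteria could be sketched roughly as below. This is an illustration only, not the crawler's actual code: the record layout, the `category` field, and the reading of "diversity" as "at most one project per category" are all assumptions; only `grade`, `likes`, and the batch size of 10 come from this message.

```python
def select_projects(candidates, limit=10, min_likes=100, required_grade=4):
    """Pick up to `limit` projects with grade == required_grade and
    likes >= min_likes, keeping at most one project per category.

    `candidates` is a list of dicts. The "category" key is hypothetical
    and stands in for whatever diversity signal the real crawler uses.
    """
    eligible = [
        p for p in candidates
        if p.get("grade") == required_grade and p.get("likes", 0) >= min_likes
    ]
    # Prefer the most-liked project within each category.
    eligible.sort(key=lambda p: p["likes"], reverse=True)

    picked = []
    seen_categories = set()
    for project in eligible:
        category = project.get("category")
        if category in seen_categories:
            continue  # diversity: skip a second project of the same kind
        seen_categories.add(category)
        picked.append(project)
        if len(picked) == limit:
            break
    return picked
```

Sorting by likes before deduplicating means the one project kept per category is the most-liked one, which is one plausible way to combine the three criteria.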

Known gaps (see plan.md § Phase 1.4):
- EasyEDA source JSON requires login (u.lceda.cn); skipped in v0.1
- Project source download from fs-web-stream.jlc.com is untested
- Automatic schema validation in scripts/validate.py is not implemented

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Zhang Jiahao
2026-04-23 19:34:09 +08:00
parent bf2370f83b
commit 5ffa10f256
103 changed files with 2279 additions and 28 deletions

@@ -0,0 +1,36 @@
{
"detail_url": "https://oshwhub.com/dhx233/esp32_s3_watch",
"cover_url": "https://image.lceda.cn/oshwhub/e65c8dd8b85f426ebded1acf94084b55.jpg",
"attachments": [
{
"name": "演示视频V1_0_2.mp4",
"url": "https://image.lceda.cn/attachments/2023/5/rjPzo9rb32stS2uMtOYyQHB1Mzsku2jIpfkxS0u6.mp4",
"original_id": "9e3b1b0aab4e4077bba58e9b99072752"
},
{
"name": "QF_ZERO_V2_V1_0_3.zip",
"url": "https://image.lceda.cn/attachments/2023/5/a07O0dN3bNwNohLe4TFmCeNc9OPFYHyo3p4GWagW.zip",
"original_id": "ec260479deae4829992fe4e87f0e461a"
},
{
"name": "lvgl_demo_watch_code_blocks.zip",
"url": "https://image.lceda.cn/attachments/2023/6/vEqACjU5pj8DZUYTFlMSquYiTrT05KqWEXKtwtle.zip",
"original_id": "511fcfd7fd0743c98b5d05bb2919c778"
},
{
"name": "V1.2外壳打螺丝版本.zip",
"url": "https://image.lceda.cn/attachments/2023/6/Y4R4DRQgYSDp7qdF3Jti2NYUpFlRpGXvEr1EcDpW.zip",
"original_id": "27d5835a26824d0ab7666ebe7a1d679b"
},
{
"name": "wx_camera_1697969384939.mp4",
"url": "https://image.lceda.cn/attachments/2023/10/FiNBQGrRzqux6SvmFD2ZRxvEV8hgRpW2bpF08zKj.mp4",
"original_id": "bd456ad1f21a40b2a84492eab943fb08"
},
{
"name": "qf_zero_v2_firmeware_V1.0.9_app.bin",
"url": "https://image.lceda.cn/oshwhub/project/attachments/4d91cd61c2374105bb43e92a619b4efa.bin",
"original_id": "f4b04d62ddc5451ca1fd77675d75cf53"
}
]
}
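
Since automatic schema validation is listed as a known gap, a URL manifest shaped like the one above could be sanity-checked with a small standalone function. The sketch below is not the planned scripts/validate.py and does not consult schemas/project.schema.json. It assumes the three top-level keys and three per-attachment keys seen in this file are all required; `check_manifest` and its messages are hypothetical names.

```python
from urllib.parse import urlparse

# Keys observed in the manifest above; treating them as required is an assumption.
REQUIRED_TOP = ("detail_url", "cover_url", "attachments")
REQUIRED_ATTACHMENT = ("name", "url", "original_id")


def _is_http_url(value):
    """True if value is a string with an http(s) scheme and a host."""
    if not isinstance(value, str):
        return False
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def check_manifest(manifest):
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    for key in REQUIRED_TOP:
        if key not in manifest:
            problems.append(f"missing top-level key: {key}")
    for key in ("detail_url", "cover_url"):
        if key in manifest and not _is_http_url(manifest[key]):
            problems.append(f"{key} is not an http(s) URL")

    attachments = manifest.get("attachments", [])
    if not isinstance(attachments, list):
        problems.append("attachments is not a list")
        return problems

    seen_ids = set()
    for i, att in enumerate(attachments):
        if not isinstance(att, dict):
            problems.append(f"attachments[{i}] is not an object")
            continue
        for key in REQUIRED_ATTACHMENT:
            if not att.get(key):
                problems.append(f"attachments[{i}] missing {key}")
        if att.get("url") and not _is_http_url(att["url"]):
            problems.append(f"attachments[{i}].url is not an http(s) URL")
        original_id = att.get("original_id")
        if original_id:
            if original_id in seen_ids:
                problems.append(f"attachments[{i}] duplicate original_id")
            seen_ids.add(original_id)
    return problems
```

Returning a list of problems instead of raising on the first one lets a crawl report every issue in a project folder in one pass.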