Phase 1 MVP: crawl 10 high-quality oshwhub projects into LFS

Why:
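- Charles's directive: first crawl 10 high-quality projects into Gitea LFS,
  one folder per project, keeping the original files and URLs. Validate
  the schema + LFS pipeline on a small batch first, and settle the storage
  scale before ramping up.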

What:
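- crawlers/oshwhub: list API (`/api/project?sort=hot`) + SSR HTML parsing,
  producing metadata / description / cover / files / _urls in a single pass
- schemas/project.schema.json: unified cross-source schema
- docs/sources/oshwhub.md: API entry points / field mapping / pitfall notes
- pyproject.toml: httpx[http2] as the sole dependency
- .gitattributes: everything under data/raw/**/files/** goes through LFS
  (rule kept narrow so it does not catch schemas/*.json and the like)
- .gitignore: remove the data/raw/* exclusion (now committed via LFS)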

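The 10 projects cover: debugger / hot plate / Geiger counter /
digitally controlled power supply / soldering station / smartwatch /
USB current meter / ZVS induction heater / AI dev board /
infrared thermal imager.
52 attachments in total, ≈ 524 MB in LFS. Selection criteria:
grade=4 & likes>=100 & diversity.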
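
A minimal sketch of how the selection rule (grade=4 & likes>=100 & diversity)
could be applied to list-API results. This is an illustration, not the
crawler's actual code: the `category` field and the "one project per
category" reading of diversity are assumptions, and `select_projects` is a
hypothetical helper name.

```python
from typing import Iterable


def select_projects(projects: Iterable[dict], limit: int = 10) -> list[dict]:
    """Pick up to `limit` projects with grade == 4 and likes >= 100.

    Diversity is approximated by taking at most one project per
    `category` (a hypothetical field, not confirmed by the source).
    """
    picked: list[dict] = []
    seen_categories: set[str] = set()
    # Visit the most-liked projects first so that each category is
    # represented by its strongest candidate.
    ranked = sorted(projects, key=lambda p: p.get("likes", 0), reverse=True)
    for project in ranked:
        if project.get("grade") != 4 or project.get("likes", 0) < 100:
            continue
        category = project.get("category", "")
        if category in seen_categories:
            continue
        seen_categories.add(category)
        picked.append(project)
        if len(picked) == limit:
            break
    return picked
```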

Known gaps (see plan.md § Phase 1.4):
- EasyEDA source JSON requires login (u.lceda.cn); skipped in v0.1
- Project source download from fs-web-stream.jlc.com is untested
- Automated schema validation in scripts/validate.py is not implemented

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Zhang Jiahao
2026-04-23 19:34:09 +08:00
parent bf2370f83b
commit 5ffa10f256
103 changed files with 2279 additions and 28 deletions


@@ -0,0 +1,71 @@
{
"detail_url": "https://oshwhub.com/qaxslk/dai-PD-QCyou-pian-jian-ce-yi-ji-",
"cover_url": "https://image.lceda.cn/pullimage/Xl5EY8fMBTiXzXkTPdbza8bTdtaqEHzKjVZyI5rF.jpeg",
"attachments": [
{
"name": "pd诱骗、检测&emarker读取演示.mp4",
"url": "https://image.lceda.cn/attachments/2022/7/uTn5lxDhfdtLIUHzLAJQlfdEqtf59MNG1ct1xOwv.mp4",
"original_id": "8227d8f1af9942ada85c34c0bae46520"
},
{
"name": "电流监测及功能演示.mp4",
"url": "https://image.lceda.cn/attachments/2022/7/Hrxaxo9kD6biCRjTHRPz9lihZPDspQxTyUOmj0ZP.mp4",
"original_id": "5a38d4a5c90c4c929ee6afd90685c54a"
},
{
"name": "qc诱骗演示.mp4",
"url": "https://image.lceda.cn/attachments/2022/7/TPVFa8A5CZUYDDcjwqLqGh2CasxOhQuwJwCrTUoV.mp4",
"original_id": "8ec5c15c8a884581b93ea7c92e0fd537"
},
{
"name": "flash_download_tool_3.9.2_0.zip",
"url": "https://image.lceda.cn/attachments/2022/8/zxKZnl7dstHkJzcCJsSCZH8Z3h4xv3r0dJ9pz4OR.zip",
"original_id": "4ebd8bd9cdc04174a33eabde9ad178b6"
},
{
"name": "iic测试.bin",
"url": "https://image.lceda.cn/attachments/2022/8/vry47jDECDC6Oi580rjR3A8kPBqxUEqrua6CyTHJ.bin",
"original_id": "62f50e21b4794a849019abf339cb0a87"
},
{
"name": "BOM清单.csv",
"url": "https://image.lceda.cn/attachments/2022/8/bgFVrW1VricnDvTBz3teVuMEZHn5p01utREdg1fB.txt",
"original_id": "1bf37839d71245baa542e5edd2c87c51"
},
{
"name": "IBOM焊接图.zip",
"url": "https://image.lceda.cn/attachments/2022/9/AMssLlo7nElKEtewGB5MR9CCltg0o1ANYHcLHIlc.zip",
"original_id": "9510dc61265243f691828f76ed24eb0c"
},
{
"name": "新版本直通监测,主界面,PD监测抓包,Emarker读取演示.mp4",
"url": "https://image.lceda.cn/attachments/2023/3/ml3J4ndhkWtv85i37YK1bXUyl8ss2Me00izGjSPv.mp4",
"original_id": "aacf4846545f4b26a2ca90355e51ead3"
},
{
"name": "新版本PD,PPS诱骗,PD抓包Emarker读取演示.mp4",
"url": "https://image.lceda.cn/attachments/2023/3/UoNgPZVjkgceMl8H1AlPv0HGYMX0cGpSjz71rCzO.mp4",
"original_id": "323305b48e6a4fd99de89d4892f2b8ca"
},
{
"name": "新版本QC,QC3诱骗演示.mp4",
"url": "https://image.lceda.cn/attachments/2023/3/GVI4oTGtFn7v7M97a0wCk713lRsC0FI9gtSBtemK.mp4",
"original_id": "c4a07773431643e28aa45a3d523e7869"
},
{
"name": "新版本设置项等其它功能演示.mp4",
"url": "https://image.lceda.cn/attachments/2023/3/okBYwyFpliRTbS0WjOq3nDOd8VArZNrienxnYOd5.mp4",
"original_id": "cfe67e3dd4bb44bea05a6b07fa396618"
},
{
"name": "TTL1.2.3 免注册.bin",
"url": "https://image.lceda.cn/oshwhub/project/attachments/3c8beccc8bb645d7900f78ff8b5bd511.bin",
"original_id": "44872f72038747a8b42ad88b812a3443"
},
{
"name": "OTA1.2.3 免注册.bin",
"url": "https://image.lceda.cn/oshwhub/project/attachments/72261283c1a44d9d9c48e1a3a7c332b4.bin",
"original_id": "2172e76f1d4d42ccb4d89d2a27d7be5f"
}
]
}
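
The manifest above has a regular shape (`detail_url`, `cover_url`, and an `attachments` list of `name` / `url` / `original_id` records), so a structural check is easy to sketch with the standard library. The following is only an illustration under that assumption, not the unimplemented scripts/validate.py; `check_manifest` is a hypothetical helper. It also reports attachments whose display name and URL disagree on file extension, as with `BOM清单.csv`, which is served from a `.txt` URL.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

REQUIRED_TOP_LEVEL_KEYS = ("detail_url", "cover_url", "attachments")
REQUIRED_ATTACHMENT_KEYS = {"name", "url", "original_id"}


def check_manifest(manifest: dict) -> list[str]:
    """Return human-readable findings for a URL manifest (empty if clean)."""
    findings: list[str] = []
    for key in REQUIRED_TOP_LEVEL_KEYS:
        if key not in manifest:
            findings.append(f"missing top-level key: {key}")
    for i, attachment in enumerate(manifest.get("attachments", [])):
        missing = REQUIRED_ATTACHMENT_KEYS - attachment.keys()
        if missing:
            findings.append(f"attachments[{i}] missing: {sorted(missing)}")
            continue
        # Compare the extension of the display name with the extension
        # of the URL path; a mismatch is reported, not treated as fatal.
        name_ext = PurePosixPath(attachment["name"]).suffix.lower()
        url_path = urlparse(attachment["url"]).path
        url_ext = PurePosixPath(url_path).suffix.lower()
        if name_ext != url_ext:
            findings.append(
                f"attachments[{i}] extension mismatch: {name_ext} vs {url_ext}"
            )
    return findings
```

Because the commit keeps both the original file name and the source URL, an extension mismatch is better treated as a warning to record than as an error to correct.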