Phase 1 MVP: crawl 10 high-quality oshwhub projects into LFS

Why:
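- Charles's directive: first crawl 10 high-quality projects into Gitea LFS,
  one folder per project, keeping the original files and their URLs.
  Validate the schema + LFS pipeline on a small batch first, and settle on
  storage scale before ramping up.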

What:
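- crawlers/oshwhub: list API (`/api/project?sort=hot`) + SSR HTML parsing,
  producing metadata / description / cover / files / _urls in a single pass
- schemas/project.schema.json: unified cross-source schema
- docs/sources/oshwhub.md: API entry points / field mapping / notes on pitfalls
- pyproject.toml: single dependency, httpx[http2]
- .gitattributes: everything under data/raw/**/files/** goes through LFS
  (rule kept narrow so it does not catch schemas/*.json and the like)
- .gitignore: remove the data/raw/* exclusion (data is now committed via LFS)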

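The 10 projects cover: debugger / hot plate / Geiger counter /
digitally controlled power supply / soldering station / smartwatch /
USB current meter / ZVS induction heater / AI dev board / infrared thermal imager.
52 attachments in total, ≈ 524 MB in LFS. Selection criteria:
grade=4 & likes>=100 & diversity.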
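
The selection criteria above could be sketched roughly as follows. This is a
hypothetical illustration, not the crawler's actual code: the function name
`select_projects`, the `category` field, and the one-project-per-category
reading of "diversity" are all assumptions, since the commit does not say how
diversity was judged.

```python
from typing import Iterable


def select_projects(candidates: Iterable[dict], limit: int = 10,
                    grade: int = 4, min_likes: int = 100) -> list[dict]:
    """Pick up to `limit` projects matching grade and likes, one per category."""
    # Keep only candidates that satisfy grade=4 & likes>=100.
    eligible = [
        c for c in candidates
        if c.get("grade") == grade and c.get("likes", 0) >= min_likes
    ]
    # Prefer the most-liked project within each category.
    eligible.sort(key=lambda c: c["likes"], reverse=True)

    picked: list[dict] = []
    seen_categories: set = set()
    for candidate in eligible:
        category = candidate.get("category")  # assumed field, see lead-in
        if category in seen_categories:
            continue
        seen_categories.add(category)
        picked.append(candidate)
        if len(picked) == limit:
            break
    return picked
```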

Known gaps (see plan.md § Phase 1.4):
- EasyEDA source JSON requires login (u.lceda.cn); skipped in v0.1
- Project source downloads from fs-web-stream.jlc.com are untested
- Automated schema validation in scripts/validate.py is not implemented
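
As a stopgap for the missing validation, a stdlib-only structural check might
look like the sketch below. The required keys and types are inferred from one
sample metadata file in this commit, not from schemas/project.schema.json
(whose contents are not shown here), and `check_metadata` is a hypothetical
name.

```python
# Keys and types assumed from a sample metadata file, not from the real schema.
REQUIRED_KEYS = {
    "source": str,
    "source_url": str,
    "project_id": str,
    "title": str,
    "files": list,
}
FILE_KEYS = {"name": str, "url": str, "path": str, "size": int, "sha256": str}


def check_metadata(meta: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: list[str] = []
    for key, expected in REQUIRED_KEYS.items():
        if key not in meta:
            problems.append(f"missing key: {key}")
        elif not isinstance(meta[key], expected):
            problems.append(f"{key}: expected {expected.__name__}")

    files = meta.get("files")
    for index, entry in enumerate(files if isinstance(files, list) else []):
        if not isinstance(entry, dict):
            problems.append(f"files[{index}]: expected object")
            continue
        for key, expected in FILE_KEYS.items():
            if key not in entry:
                problems.append(f"files[{index}].{key}: missing")
            elif not isinstance(entry[key], expected):
                problems.append(f"files[{index}].{key}: expected {expected.__name__}")
    return problems
```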

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Zhang Jiahao
2026-04-23 19:34:09 +08:00
parent bf2370f83b
commit 5ffa10f256
103 changed files with 2279 additions and 28 deletions


@@ -0,0 +1,118 @@
{
"source": "oshwhub",
"source_url": "https://oshwhub.com/mojinyinhu/t12858-tong-yong-han-tai",
"project_id": "3e2f893d74664e01b755ccf2582792de",
"title": "t12-858d烙铁热风枪通用焊台二合一",
"description_short": "基于开源t12 858设计重绘电路板到88*38*120铝外壳最小体积预留1.4寸TFT SPI接口和iic接口上到彩屏,也可修改到jbc245来使用预留了丰富的自定义空间",
"description_path": "description.md",
"author": {
"username": "mojinyinhu",
"display_name": "mojinyinhu",
"user_id": "2158788da2584d3893fcb09344aa7085"
},
"license": "GPL 3.0",
"tags": [],
"created_at": "2021-07-08T03:05:48.000Z",
"updated_at": "2026-04-20T02:02:25.000Z",
"published_at": "2024-07-26T08:17:03.000Z",
"crawled_at": "2026-04-23T11:27:21.695580+00:00",
"metrics": {
"likes": 483,
"stars": 1013,
"forks": 395,
"views": 133220,
"watch": 0,
"comments": 293
},
"cover": {
"url": "https://image.lceda.cn/pullimage/X7qQwpTwtIeTBm2FkJfDZCo0K0tv4ZyyQCrgYJGQ.jpeg",
"path": "cover.jpeg"
},
"files": [
{
"name": "F1-T12-858D-master4.07.zip",
"url": "https://image.lceda.cn/attachments/2022/4/jxFVtEW6XG1wB6AjK8ykfSwqUVDV9OpKiaHqqb51.zip",
"original_id": "456c6c6a13b64483be969f93637e7255",
"ext": "zip",
"mime": "application/x-zip-compressed",
"size": 33224795,
"md5": "2f69640add7a8b42c9b835fe682de26c",
"path": "files/F1-T12-858D-master4.07.zip",
"sha256": "b01b044b3f370ff17c66a2fe455312876a2b6d61ff818283c2ea747127786c95"
},
{
"name": "固件在解压缩后在BINARY文件夹选取适合自己屏幕固件烧录.txt",
"url": "https://image.lceda.cn/attachments/2022/4/GREMfysm2sIMw6hOBfAwTvzjMdCPXCTAz5oq1jpd.txt",
"original_id": "f25fdd81e5134335a2f56b39da71c501",
"ext": "txt",
"mime": "text/plain",
"size": 84,
"md5": "aa10d20f9601588d99dc697c4e573e49",
"path": "files/固件在解压缩后在BINARY文件夹选取适合自己屏幕固件烧录.txt",
"sha256": "abe44c2443bdc11876e750df1a927d2365318e20a978ed80ee6461e8ecbab05b"
},
{
"name": "校准.txt",
"url": "https://image.lceda.cn/attachments/2022/4/TBGObPjIQG0Z7h9bBrfZdZRKqdfgC9hAiCtaU6G7.txt",
"original_id": "4d9c148dafc64ad6b27dbefed3e8ec2e",
"ext": "txt",
"mime": "text/plain",
"size": 1058,
"md5": "167921cdc21b13a6c85abdb42f2ffba2",
"path": "files/校准.txt",
"sha256": "9fd2af841a84b6a27dd6c13d81c208d45a02e959a73094efb0b1fa4586f71215"
},
{
"name": "F1_T12+858D编译视频教程.rar",
"url": "https://image.lceda.cn/attachments/2022/4/oIV5aPj2dyXrwODr0PpF9VH16w98TuJmggWB3lJ4.rar",
"original_id": "70d67505c93242a4a3bc7fb91fa7c80d",
"ext": "rar",
"mime": "application/octet-stream",
"size": 10367841,
"md5": "0bbdb68ff491bba65004cbc507a0bb36",
"path": "files/F1_T12+858D编译视频教程.rar",
"sha256": "9da0b74d3757d08493811d888baebd466383126ce241dbe3b6d5f48d9d8b6c08"
},
{
"name": "T12+858焊台BOM表.xlsx",
"url": "https://image.lceda.cn/attachments/2022/6/y6f0JXE1jHDbcoMVx9prMqU1VQODyNyCggeZFRY3.xlsx",
"original_id": "be2b2cce0ee24fe8838805cacd53ccd0",
"ext": "xlsx",
"mime": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
"size": 15738,
"md5": "6f60ccf1daf9f6441f0910827f3476ce",
"path": "files/T12+858焊台BOM表.xlsx",
"sha256": "80b8d57ef1a0615e1b01ba237f906e7063f870e67ccaa966e2ced3bcf3524a10"
},
{
"name": "PCB_小板合一拼版二层板自行导入力创导出gerber.json",
"url": "https://image.lceda.cn/attachments/2022/6/BMkqLsDhiJo2t8op8Ejvwwc8TaKqaooxSzg8a4rQ.txt",
"original_id": "eaba63442d404070b930639390eef369",
"ext": "txt",
"mime": "application/json",
"size": 780139,
"md5": "0298cf1a0e505b3cebcf9f8c0c18adc9",
"path": "files/PCB_小板合一拼版二层板自行导入力创导出gerber.json",
"sha256": "f42a140b9183d242da25bdec5b861c4ade79033ddb1a8b39a71690fcd9c09646"
},
{
"name": "PCB_功率板+控制板四层板自行导入力创导出_2022-08-21.json",
"url": "https://image.lceda.cn/attachments/2022/8/4LRvXx6WH6DD8V8GPZQEegxSCThDMkr1w4qbe1FT.txt",
"original_id": "ae9d73994d534fd499a456fd9d2f7b40",
"ext": "txt",
"mime": "application/json",
"size": 1825835,
"md5": "9e4a2e722f2d57539add8750f9e27ff3",
"path": "files/PCB_功率板+控制板四层板自行导入力创导出_2022-08-21.json",
"sha256": "e7238e5752f2aa763ba4900bac2cff07c8bcfb6b3a316c3809fc5da79ef04ad9"
}
],
"raw_fields": {
"path": "mojinyinhu/t12858-tong-yong-han-tai",
"grade": 4,
"origin": "std",
"public": true,
"publish": true,
"skipped_files": []
}
}
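
Each entry in `files` records `size`, `md5`, and `sha256` next to a relative
`path`, so a downloaded project folder can be checked against its metadata. The
following is a minimal sketch under assumptions: `file_digests` and
`verify_files` are hypothetical helpers, not part of the crawler, and the
project directory is assumed to contain the real attachments rather than LFS
pointer files.

```python
import hashlib
from pathlib import Path


def file_digests(path: Path, chunk_size: int = 1 << 20) -> tuple[int, str, str]:
    """Stream a file once and return (size, md5 hex, sha256 hex)."""
    md5, sha256, size = hashlib.md5(), hashlib.sha256(), 0
    with path.open("rb") as fh:
        while chunk := fh.read(chunk_size):
            size += len(chunk)
            md5.update(chunk)
            sha256.update(chunk)
    return size, md5.hexdigest(), sha256.hexdigest()


def verify_files(project_dir: Path, meta: dict) -> list[str]:
    """Compare files on disk with the metadata; return a list of problems."""
    problems: list[str] = []
    for entry in meta.get("files", []):
        path = project_dir / entry["path"]
        if not path.is_file():
            problems.append(f"{entry['path']}: missing")
            continue
        size, md5, sha256 = file_digests(path)
        for field, actual in (("size", size), ("md5", md5), ("sha256", sha256)):
            if entry.get(field) != actual:
                problems.append(f"{entry['path']}: {field} mismatch")
    return problems
```

A caller would load the metadata JSON with `json.loads(...)` and pass the
project folder, for example `verify_files(Path("<project folder>"), meta)`. An
empty list means every listed attachment matched. If the working tree still
holds LFS pointer files, every entry would be reported as a mismatch.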