Phase 1 MVP: crawl 10 high-quality oshwhub projects into LFS
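Why:
- Charles specified: first crawl 10 high-quality projects and store them in Gitea LFS, one folder per project, preserving the original files and URLs. Validate the schema + LFS pipeline on a small batch first, then decide on storage scale before ramping up.

What:
- crawlers/oshwhub: list API (`/api/project?sort=hot`) + SSR HTML parsing, producing metadata / description / cover / files / _urls in a single pass
- schemas/project.schema.json: unified cross-source schema
- docs/sources/oshwhub.md: API entry points / field mapping / pitfalls found during research
- pyproject.toml: httpx[http2] as the sole dependency
- .gitattributes: everything under data/raw/**/files/** goes through LFS (rule kept narrow to avoid catching schemas/*.json and the like)
- .gitignore: removed the data/raw/* exclusion (data is now committed via LFS)

The 10 projects cover: debugger / hot plate / Geiger counter / digitally controlled power supply / soldering station / smartwatch / USB current meter / ZVS induction heater / AI dev board / infrared thermal imager.
52 attachments in total ≈ 524 MB in LFS; selection criteria: grade=4 & likes>=100 & diversity.

Known gaps (see plan.md § Phase 1.4):
- EasyEDA source JSON requires login (u.lceda.cn); skipped in v0.1
- Project source download from fs-web-stream.jlc.com is untested
- scripts/validate.py automatic schema validation is not implemented

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>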
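The selection criteria named in the message (grade=4 & likes>=100) could be applied roughly as follows. This is a minimal sketch, not the crawler's actual code: the record shape, the `name` key, and the function `select_candidates` are assumptions made for illustration, and the "diversity" criterion is not modeled because the message does not say how it was judged.

```python
from typing import Iterable


def select_candidates(
    projects: Iterable[dict], grade: int = 4, min_likes: int = 100
) -> list[dict]:
    """Keep projects that match the grade and reach the likes threshold.

    The `grade` and `likes` keys are assumed field names taken from the
    commit message; the real list API may name them differently.
    """
    kept = [
        p for p in projects
        if p.get("grade") == grade and p.get("likes", 0) >= min_likes
    ]
    # Most-liked first, so a manual diversity pass can start from the top.
    return sorted(kept, key=lambda p: p["likes"], reverse=True)


# Hypothetical records, only to exercise the filter.
sample = [
    {"name": "a", "grade": 4, "likes": 250},
    {"name": "b", "grade": 3, "likes": 900},
    {"name": "c", "grade": 4, "likes": 99},
    {"name": "d", "grade": 4, "likes": 100},
]
selected = select_candidates(sample)
print([p["name"] for p in selected])
```

Picking the final 10 for category diversity would still be a separate, likely manual, step on top of this filter.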
data/raw/oshwhub/3e2f893d74664e01b755ccf2582792de/_urls.json
@@ -0,0 +1,41 @@
{
  "detail_url": "https://oshwhub.com/mojinyinhu/t12858-tong-yong-han-tai",
  "cover_url": "https://image.lceda.cn/pullimage/X7qQwpTwtIeTBm2FkJfDZCo0K0tv4ZyyQCrgYJGQ.jpeg",
  "attachments": [
    {
      "name": "F1-T12-858D-master4.07.zip",
      "url": "https://image.lceda.cn/attachments/2022/4/jxFVtEW6XG1wB6AjK8ykfSwqUVDV9OpKiaHqqb51.zip",
      "original_id": "456c6c6a13b64483be969f93637e7255"
    },
    {
      "name": "固件在解压缩后在BINARY文件夹,选取适合自己屏幕固件烧录.txt",
      "url": "https://image.lceda.cn/attachments/2022/4/GREMfysm2sIMw6hOBfAwTvzjMdCPXCTAz5oq1jpd.txt",
      "original_id": "f25fdd81e5134335a2f56b39da71c501"
    },
    {
      "name": "校准.txt",
      "url": "https://image.lceda.cn/attachments/2022/4/TBGObPjIQG0Z7h9bBrfZdZRKqdfgC9hAiCtaU6G7.txt",
      "original_id": "4d9c148dafc64ad6b27dbefed3e8ec2e"
    },
    {
      "name": "F1_T12+858D编译视频教程.rar",
      "url": "https://image.lceda.cn/attachments/2022/4/oIV5aPj2dyXrwODr0PpF9VH16w98TuJmggWB3lJ4.rar",
      "original_id": "70d67505c93242a4a3bc7fb91fa7c80d"
    },
    {
      "name": "T12+858焊台BOM表.xlsx",
      "url": "https://image.lceda.cn/attachments/2022/6/y6f0JXE1jHDbcoMVx9prMqU1VQODyNyCggeZFRY3.xlsx",
      "original_id": "be2b2cce0ee24fe8838805cacd53ccd0"
    },
    {
      "name": "PCB_小板合一拼版(二层板)自行导入力创导出gerber.json",
      "url": "https://image.lceda.cn/attachments/2022/6/BMkqLsDhiJo2t8op8Ejvwwc8TaKqaooxSzg8a4rQ.txt",
      "original_id": "eaba63442d404070b930639390eef369"
    },
    {
      "name": "PCB_功率板+控制板(四层板)自行导入力创导出_2022-08-21.json",
      "url": "https://image.lceda.cn/attachments/2022/8/4LRvXx6WH6DD8V8GPZQEegxSCThDMkr1w4qbe1FT.txt",
      "original_id": "ae9d73994d534fd499a456fd9d2f7b40"
    }
  ]
}
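In the file above, two attachments carry a `.json` display name while their stored URL ends in `.txt`. A downloader that derives the on-disk filename from the URL rather than from `name` would silently change the extension. The following is a minimal sketch of a check for that, assuming only the `attachments` / `name` / `url` keys visible in this file; the function `extension_mismatches` is a hypothetical helper, not part of the committed crawler.

```python
import json
from pathlib import PurePosixPath
from urllib.parse import urlparse


def extension_mismatches(urls_doc: dict) -> list[tuple[str, str, str]]:
    """Return (name, name_ext, url_ext) for attachments whose extensions differ."""
    mismatches = []
    for att in urls_doc.get("attachments", []):
        name_ext = PurePosixPath(att["name"]).suffix.lower()
        # Use only the URL path so query strings cannot leak into the suffix.
        url_ext = PurePosixPath(urlparse(att["url"]).path).suffix.lower()
        if name_ext != url_ext:
            mismatches.append((att["name"], name_ext, url_ext))
    return mismatches


# Trimmed copy of two entries from the _urls.json in this commit.
doc = json.loads("""
{
  "attachments": [
    {
      "name": "F1-T12-858D-master4.07.zip",
      "url": "https://image.lceda.cn/attachments/2022/4/jxFVtEW6XG1wB6AjK8ykfSwqUVDV9OpKiaHqqb51.zip"
    },
    {
      "name": "PCB_小板合一拼版(二层板)自行导入力创导出gerber.json",
      "url": "https://image.lceda.cn/attachments/2022/6/BMkqLsDhiJo2t8op8Ejvwwc8TaKqaooxSzg8a4rQ.txt"
    }
  ]
}
""")

for name, name_ext, url_ext in extension_mismatches(doc):
    print(f"{name}: name says {name_ext}, url says {url_ext}")
```

Saving files under the `name` field (with the URL kept alongside, as `_urls.json` already does) would preserve the extension the uploader intended.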