Add projects.md index (stars-sorted) + build_index.py generator

Why:
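- Charles wants an index page listing the ingested projects and their stars.
  A hand-maintained page would drift, so scripts/build_index.py regenerates
  it directly from metadata.json, which keeps projects.md a mirror of
  data/raw/.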

What:
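- projects.md: 10 projects sorted by Stars in descending order (from
  加热台量产计划 at 3,293 down to 柚子爱 AI 相机 at 236), with
  stars/likes/forks/views/comments/files/size, plus the License and
  data-source distributions
- scripts/build_index.py: scans metadata.json and renders markdown; supports
  multiple data sources in the future (distinguished by the source field),
  so after adding oshwhub / github / hackaday projects it only needs a rerun
- README.md: add a link to projects.md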

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Zhang Jiahao
2026-04-23 19:48:21 +08:00
parent e222b08f27
commit ce22717288
3 changed files with 213 additions and 1 deletions


@@ -21,7 +21,7 @@
 | CERN OHR | ohwr.org | High-quality, industrial-grade | CERN-OHL | Low |
 | Wikifactory | wikifactory.com | Community projects | Author-defined | Medium |
-See [`plan.md`](./plan.md) for the detailed crawl plan.
+See [`plan.md`](./plan.md) for the detailed crawl plan; see [`projects.md`](./projects.md) for the list of projects ingested so far.
 ## Repository structure

57
projects.md Normal file

@@ -0,0 +1,57 @@
# Crawled Projects Index

_Auto-generated, last updated 2026-04-23 11:48 UTC_

**Current:** 10 projects · 52 attachments · 510.8 MB

> Sorted by **Stars, descending**. Title → source site; UUID → the corresponding directory in this repo.

| # | Title | Author | License | ⭐ Stars | ❤️ Likes | 🍴 Forks | 👁 Views | 💬 Comments | Files | Size (MB) |
|---|-------|--------|---------|--------:|---------:|---------:|---------:|------------:|------:|----------:|
| 1 | [加热台量产计划](https://oshwhub.com/sheep_finder/pcb-heng-wen-jia-re-tai)<br>[`7b6a3988…`](./data/raw/oshwhub/7b6a398811f14eba9a952b8d2ddd7ace/) | [sheep_finder](https://oshwhub.com/sheep_finder) | Public Domain | 3,293 | 1,447 | 3,939 | 347,329 | 383 | 4 | 23.0 |
| 2 | [支持PD3.1/米PPS与Emarker读取的USB电压电流表](https://oshwhub.com/qaxslk/dai-PD-QCyou-pian-jian-ce-yi-ji-)<br>[`1a1e8655…`](./data/raw/oshwhub/1a1e865568d04db59a5a140dd3f13581/) | [qaxslk](https://oshwhub.com/qaxslk) | CC BY-NC-SA 4.0 | 2,695 | 1,215 | 1,146 | 306,681 | 448 | 13 | 204.5 |
| 3 | [自制ST-LINK V2-1开源版本](https://oshwhub.com/CYIIOT/ST_LINK-V2_1)<br>[`298873b7…`](./data/raw/oshwhub/298873b7fdbe44f8ba0e7351e023bc2c/) | [攻城狮神木](https://oshwhub.com/CYIIOT) | GPL 3.0 | 1,947 | 863 | 996 | 239,671 | 369 | 7 | 20.9 |
| 4 | [QF ZERO V2 智能手表终端V1.0.9-24-6-29](https://oshwhub.com/dhx233/esp32_s3_watch)<br>[`892dbc4e…`](./data/raw/oshwhub/892dbc4ebca74227ac6269a1693380d8/) | [启凡科创](https://oshwhub.com/dhx233) | Public Domain | 1,737 | 774 | 643 | 175,969 | 164 | 6 | 113.0 |
| 5 | [RT300-MKV 250W 数控升降压桌面可调电源](https://oshwhub.com/XACT/rt300-mkv)<br>[`91206ca7…`](./data/raw/oshwhub/91206ca73e96455f946bfcdd73e814fd/) | [XACT](https://oshwhub.com/XACT) | CC BY-NC-SA 4.0 | 1,735 | 867 | 782 | 185,523 | 231 | 2 | 80.9 |
| 6 | [t12-858d烙铁热风枪通用焊台二合一](https://oshwhub.com/mojinyinhu/t12858-tong-yong-han-tai)<br>[`3e2f893d…`](./data/raw/oshwhub/3e2f893d74664e01b755ccf2582792de/) | [mojinyinhu](https://oshwhub.com/mojinyinhu) | GPL 3.0 | 1,013 | 483 | 395 | 133,220 | 293 | 7 | 44.1 |
| 7 | [大功率感应加热2500W 增强型ZVS](https://oshwhub.com/diy17102800/tu-teng-zvs)<br>[`f974b06d…`](./data/raw/oshwhub/f974b06d9c01470bb319e7df6d4512c9/) | [金石之声](https://oshwhub.com/diy17102800) | TAPR Open Hardware License | 708 | 355 | 378 | 61,550 | 265 | 2 | 8.8 |
| 8 | [手持红外热成像](https://oshwhub.com/wesd/h7b0-re-cheng-xiang)<br>[`1b09581d…`](./data/raw/oshwhub/1b09581d66d34438a1e6513e457e0532/) | [wesd](https://oshwhub.com/wesd) | CERN Open Hardware License | 646 | 247 | 175 | 73,081 | 266 | 2 | 3.3 |
| 9 | [小汐 & 阿曈 -> 盖革计数器MWGC-2T](https://oshwhub.com/yanranxiaoxi/Multi-adaptation-Wi-Fi-Geiger-Counter-Double-Tube-Type)<br>[`b077573d…`](./data/raw/oshwhub/b077573dfb764e95b1d27faba49cca65/) | [久治明千树汐](https://oshwhub.com/yanranxiaoxi) | CC BY-SA 4.0 | 365 | 212 | 189 | 49,755 | 168 | 2 | 4.0 |
| 10 | [柚子爱AI相机-YuzuAI-YuzuMaix-AIoT-V831开发板](https://oshwhub.com/armbian-pythoniot/yuzumaix-v831)<br>[`922c1f3a…`](./data/raw/oshwhub/922c1f3a9b9a43ff98998f476e7946ca/) | [Armbian-PythonIot](https://oshwhub.com/armbian-pythoniot) | CC BY-NC-SA 3.0 | 236 | 129 | 96 | 45,128 | 93 | 7 | 8.3 |

## Summary

- Total Stars: **14,375** (average 1,437 per project)
- Total Likes: **6,592**
- Total Views: **1,617,907**
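
As a quick sanity check, the Stars total and average can be recomputed from the table. This is only an illustrative sketch: the list is copied by hand from the Stars column, and the variable names are not part of `build_index.py`. The floor division mirrors how the generator computes the average, so the value is truncated rather than rounded.

```python
# Stars per project, copied by hand from the table (rows 1-10).
stars = [3293, 2695, 1947, 1737, 1735, 1013, 708, 646, 365, 236]

total_stars = sum(stars)
# Floor division: the average is truncated, not rounded.
avg_stars = total_stars // max(len(stars), 1)

print(f"{total_stars:,} total, {avg_stars:,} per project")
# prints: 14,375 total, 1,437 per project
```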

### License distribution

- `Public Domain`: 2 projects
- `CC BY-NC-SA 4.0`: 2 projects
- `GPL 3.0`: 2 projects
- `TAPR Open Hardware License`: 1 project
- `CERN Open Hardware License`: 1 project
- `CC BY-SA 4.0`: 1 project
- `CC BY-NC-SA 3.0`: 1 project

### Data source distribution

- `oshwhub`: 10 projects

## Directory layout (per project)

```
data/raw/<source>/<uuid>/
├── metadata.json # unified schema, see schemas/project.schema.json
├── description.md # title + summary + license
├── cover.{jpg,png} # cover image
├── _urls.json # all original URLs
└── files/* # original attachments (Git LFS)
```
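
For orientation, here is a hedged sketch of how one `metadata.json` record could turn into an index row. The field names (`project_id`, `title`, `source`, `source_url`, `author`, `license`, `metrics`, `files`) are the ones `scripts/build_index.py` reads. Every value below is a made-up placeholder, and the authoritative schema is `schemas/project.schema.json`, which may define more fields than shown here.

```python
import json

# Hypothetical record: field names follow what build_index.py reads,
# but all values are placeholders, not real crawled data.
raw = """
{
  "project_id": "0123456789abcdef0123456789abcdef",
  "title": "Example project",
  "source": "oshwhub",
  "source_url": "https://oshwhub.com/example_user/example-project",
  "author": {"username": "example_user", "display_name": null},
  "license": null,
  "metrics": {"stars": 12, "likes": null},
  "files": [{"size": 1048576}, {"size": null}]
}
"""
m = json.loads(raw)

# Missing or null values fall back to defaults via `or`.
row = {
    "uuid_short": m["project_id"][:8],
    "author_display": m["author"].get("display_name") or m["author"]["username"],
    "license": m.get("license") or "unknown",
    "stars": m["metrics"].get("stars") or 0,
    "likes": m["metrics"].get("likes") or 0,
    "size_mb": f"{sum(f.get('size') or 0 for f in m['files']) / 1024 / 1024:.1f}",
}
print(row)
```

The `or` fallbacks matter because a key can be present with a `null` value, in which case `dict.get(key, default)` returns `None` rather than the default.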

## Regenerate

```bash
uv run python scripts/build_index.py
```
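
Because the header embeds a timestamp, runs at different times produce different files even when the data is unchanged, so a drift check has to ignore that line. The following is a minimal sketch of such a check, not something this commit ships: `strip_timestamp` and `is_stale` are hypothetical helpers, and the regex assumes the `YYYY-MM-DD HH:MM UTC` format shown at the top of this file.

```python
import re

# Matches the header timestamp format, e.g. "2026-04-23 11:48 UTC".
_TS = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2} UTC")


def strip_timestamp(md: str) -> str:
    """Replace the first timestamp with a fixed token (hypothetical helper)."""
    return _TS.sub("<timestamp>", md, count=1)


def is_stale(committed: str, regenerated: str) -> bool:
    """True if the two renderings differ in anything other than the timestamp."""
    return strip_timestamp(committed) != strip_timestamp(regenerated)


old = "# Crawled Projects Index\n_updated 2026-04-23 11:48 UTC_\n10 projects\n"
new_same = old.replace("11:48", "12:05")
new_diff = new_same.replace("10 projects", "11 projects")

print(is_stale(old, new_same))  # False
print(is_stale(old, new_diff))  # True
```

In practice `committed` would be the contents of `projects.md` and `regenerated` the string returned by `render(collect())`.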

155
scripts/build_index.py Normal file

@@ -0,0 +1,155 @@
"""Scan data/raw/*/*/metadata.json and build projects.md (index, sorted by stars desc).

Usage:
    uv run python scripts/build_index.py
    uv run python scripts/build_index.py --out projects.md
"""
from __future__ import annotations

import argparse
import json
from datetime import datetime, timezone
from pathlib import Path

REPO = Path(__file__).resolve().parent.parent


def fmt_mb(b: int) -> str:
    return f"{b / 1024 / 1024:.1f}"


def collect() -> list[dict]:
    rows: list[dict] = []
    for meta in (REPO / "data" / "raw").rglob("metadata.json"):
        m = json.loads(meta.read_text(encoding="utf-8"))
        files = m.get("files", [])
        bytes_total = sum(f.get("size") or 0 for f in files)
        rows.append(
            {
                "uuid": m["project_id"],
                "title": m["title"],
                "source": m["source"],
                "source_url": m["source_url"],
                "author_display": m["author"].get("display_name") or m["author"]["username"],
                "author_username": m["author"]["username"],
                "license": m.get("license") or "unknown",
                "metrics": m.get("metrics") or {},
                "files_count": len(files),
                "files_bytes": bytes_total,
                "local_dir": str(meta.parent.relative_to(REPO)),
            }
        )
    # sort by stars desc, tie-break by likes
    rows.sort(
        key=lambda r: (
            -(r["metrics"].get("stars") or 0),
            -(r["metrics"].get("likes") or 0),
        )
    )
    return rows


def render(rows: list[dict]) -> str:
    out: list[str] = []
    w = out.append
    total_files = sum(r["files_count"] for r in rows)
    total_bytes = sum(r["files_bytes"] for r in rows)
    total_stars = sum((r["metrics"].get("stars") or 0) for r in rows)
    total_likes = sum((r["metrics"].get("likes") or 0) for r in rows)
    total_views = sum((r["metrics"].get("views") or 0) for r in rows)
    w("# Crawled Projects Index")
    w("")
    w(f"_Auto-generated, last updated {datetime.now(timezone.utc).strftime('%Y-%m-%d %H:%M UTC')}_")
    w("")
    w(
        f"**Current:** {len(rows)} projects · {total_files} attachments · {fmt_mb(total_bytes)} MB"
    )
    w("")
    w("> Sorted by **Stars, descending**. Title → source site; UUID → the corresponding directory in this repo.")
    w("")
    w(
        "| # | Title | Author | License | "
        "⭐ Stars | ❤️ Likes | 🍴 Forks | 👁 Views | 💬 Comments | Files | Size (MB) |"
    )
    w(
        "|---|-------|--------|---------|"
        "--------:|---------:|---------:|---------:|------------:|------:|----------:|"
    )
    for i, r in enumerate(rows, 1):
        m = r["metrics"]
        title_link = f"[{r['title']}]({r['source_url']})"
        # Author link inference: oshwhub profile URLs look like `https://oshwhub.com/<username>`
        if r["source"] == "oshwhub":
            author_url = f"https://oshwhub.com/{r['author_username']}"
        else:
            author_url = r["source_url"]  # fallback
        author_link = f"[{r['author_display']}]({author_url})"
        uuid_short = r["uuid"][:8]
        dir_link = f"[`{uuid_short}…`](./{r['local_dir']}/)"
        # Use `or 0` (as in the sort key) so a null metric still formats with `:,`.
        w(
            f"| {i} | {title_link}<br>{dir_link} | {author_link} | {r['license']} | "
            f"{(m.get('stars') or 0):,} | {(m.get('likes') or 0):,} | {(m.get('forks') or 0):,} | "
            f"{(m.get('views') or 0):,} | {(m.get('comments') or 0):,} | "
            f"{r['files_count']} | {fmt_mb(r['files_bytes'])} |"
        )
    w("")
    w("## Summary")
    w("")
    avg_stars = total_stars // max(len(rows), 1)
    w(f"- Total Stars: **{total_stars:,}** (average {avg_stars:,} per project)")
    w(f"- Total Likes: **{total_likes:,}**")
    w(f"- Total Views: **{total_views:,}**")
    w("")
    w("### License distribution")
    w("")
    lic_count: dict[str, int] = {}
    for r in rows:
        lic_count[r["license"]] = lic_count.get(r["license"], 0) + 1
    for lic, c in sorted(lic_count.items(), key=lambda x: -x[1]):
        w(f"- `{lic}`: {c} project{'s' if c != 1 else ''}")
    w("")
    w("### Data source distribution")
    w("")
    src_count: dict[str, int] = {}
    for r in rows:
        src_count[r["source"]] = src_count.get(r["source"], 0) + 1
    for src, c in sorted(src_count.items(), key=lambda x: -x[1]):
        w(f"- `{src}`: {c} project{'s' if c != 1 else ''}")
    w("")
    w("## Directory layout (per project)")
    w("")
    w("```")
    w("data/raw/<source>/<uuid>/")
    w("├── metadata.json # unified schema, see schemas/project.schema.json")
    w("├── description.md # title + summary + license")
    w("├── cover.{jpg,png} # cover image")
    w("├── _urls.json # all original URLs")
    w("└── files/* # original attachments (Git LFS)")
    w("```")
    w("")
    w("## Regenerate")
    w("")
    w("```bash")
    w("uv run python scripts/build_index.py")
    w("```")
    w("")
    return "\n".join(out)


def main(argv: list[str] | None = None) -> int:
    ap = argparse.ArgumentParser()
    ap.add_argument("--out", type=Path, default=REPO / "projects.md")
    args = ap.parse_args(argv)
    rows = collect()
    md = render(rows)
    args.out.write_text(md, encoding="utf-8")
    print(f"wrote {args.out} ({len(rows)} projects)")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())