From ce22717288c218dcf53c913de29270e2c09dbad2 Mon Sep 17 00:00:00 2001
From: Zhang Jiahao
Date: Thu, 23 Apr 2026 19:48:21 +0800
Subject: [PATCH] Add projects.md index (stars-sorted) + build_index.py generator
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Why:
- Charles wants an index page listing the ingested projects and their
  stars. A hand-maintained page would drift, so scripts/build_index.py
  regenerates it directly from metadata.json, which keeps projects.md
  a mirror of data/raw/.

What:
- projects.md: 10 projects sorted by stars, descending (from 加热台量产计划
  at 3,293 down to 柚子爱 AI 相机 at 236), with stars/likes/forks/views/
  comments/files/size, plus the license and data-source distributions
- scripts/build_index.py: scans metadata.json and renders markdown;
  supports multiple data sources in the future (distinguished by the
  source field), so after adding oshwhub / github / hackaday projects
  it only needs to be rerun
- README.md: add a link to projects.md

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 README.md              |   2 +-
 projects.md            |  57 +++++++++++++++
 scripts/build_index.py | 155 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 213 insertions(+), 1 deletion(-)
 create mode 100644 projects.md
 create mode 100644 scripts/build_index.py

diff --git a/README.md b/README.md
index 9d3869b..7bc3718 100644
--- a/README.md
+++ b/README.md
@@ -21,7 +21,7 @@
 | CERN OHR | ohwr.org | 高质量工业级 | CERN-OHL | 低 |
 | Wikifactory | wikifactory.com | 社区项目 | 作者自定 | 中 |
 
-详细爬取计划见 [`plan.md`](./plan.md)。
+详细爬取计划见 [`plan.md`](./plan.md);当前已入库项目清单见 [`projects.md`](./projects.md)。
 
 ## 仓库结构
 
diff --git a/projects.md b/projects.md
new file mode 100644
index 0000000..83f7b78
--- /dev/null
+++ b/projects.md
@@ -0,0 +1,57 @@
+# Crawled Projects Index
+
+_自动生成,最近更新 2026-04-23 11:48 UTC_
+
+**当前**:10 个项目 · 52 个附件 · 510.8 MB
+
+> 按 **Stars 倒序**。Title → 源站;UUID → 本仓库对应目录。
+
+| # | Title | Author | License | ⭐ Stars | ❤️ Likes | 🍴 Forks | 👁 Views | 💬 Comments | Files | Size (MB) |
+|---|-------|--------|---------|--------:|---------:|---------:|---------:|------------:|------:|----------:|
+| 1 | [加热台量产计划](https://oshwhub.com/sheep_finder/pcb-heng-wen-jia-re-tai)<br>[`7b6a3988…`](./data/raw/oshwhub/7b6a398811f14eba9a952b8d2ddd7ace/) | [sheep_finder](https://oshwhub.com/sheep_finder) | Public Domain | 3,293 | 1,447 | 3,939 | 347,329 | 383 | 4 | 23.0 |
+| 2 | [支持PD3.1/米PPS与Emarker读取的USB电压电流表](https://oshwhub.com/qaxslk/dai-PD-QCyou-pian-jian-ce-yi-ji-)<br>[`1a1e8655…`](./data/raw/oshwhub/1a1e865568d04db59a5a140dd3f13581/) | [qaxslk](https://oshwhub.com/qaxslk) | CC BY-NC-SA 4.0 | 2,695 | 1,215 | 1,146 | 306,681 | 448 | 13 | 204.5 |
+| 3 | [自制ST-LINK V2-1(开源版本)](https://oshwhub.com/CYIIOT/ST_LINK-V2_1)<br>[`298873b7…`](./data/raw/oshwhub/298873b7fdbe44f8ba0e7351e023bc2c/) | [攻城狮神木](https://oshwhub.com/CYIIOT) | GPL 3.0 | 1,947 | 863 | 996 | 239,671 | 369 | 7 | 20.9 |
+| 4 | [QF ZERO V2 智能手表终端V1.0.9-24-6-29](https://oshwhub.com/dhx233/esp32_s3_watch)<br>[`892dbc4e…`](./data/raw/oshwhub/892dbc4ebca74227ac6269a1693380d8/) | [启凡科创](https://oshwhub.com/dhx233) | Public Domain | 1,737 | 774 | 643 | 175,969 | 164 | 6 | 113.0 |
+| 5 | [RT300-MKV 250W 数控升降压桌面可调电源](https://oshwhub.com/XACT/rt300-mkv)<br>[`91206ca7…`](./data/raw/oshwhub/91206ca73e96455f946bfcdd73e814fd/) | [XACT](https://oshwhub.com/XACT) | CC BY-NC-SA 4.0 | 1,735 | 867 | 782 | 185,523 | 231 | 2 | 80.9 |
+| 6 | [t12-858d烙铁热风枪通用焊台二合一](https://oshwhub.com/mojinyinhu/t12858-tong-yong-han-tai)<br>[`3e2f893d…`](./data/raw/oshwhub/3e2f893d74664e01b755ccf2582792de/) | [mojinyinhu](https://oshwhub.com/mojinyinhu) | GPL 3.0 | 1,013 | 483 | 395 | 133,220 | 293 | 7 | 44.1 |
+| 7 | [大功率感应加热2500W 增强型ZVS](https://oshwhub.com/diy17102800/tu-teng-zvs)<br>[`f974b06d…`](./data/raw/oshwhub/f974b06d9c01470bb319e7df6d4512c9/) | [金石之声](https://oshwhub.com/diy17102800) | TAPR Open Hardware License | 708 | 355 | 378 | 61,550 | 265 | 2 | 8.8 |
+| 8 | [手持红外热成像](https://oshwhub.com/wesd/h7b0-re-cheng-xiang)<br>[`1b09581d…`](./data/raw/oshwhub/1b09581d66d34438a1e6513e457e0532/) | [wesd](https://oshwhub.com/wesd) | CERN Open Hardware License | 646 | 247 | 175 | 73,081 | 266 | 2 | 3.3 |
+| 9 | [小汐 & 阿曈 -> 盖革计数器(MWGC-2T)](https://oshwhub.com/yanranxiaoxi/Multi-adaptation-Wi-Fi-Geiger-Counter-Double-Tube-Type)<br>[`b077573d…`](./data/raw/oshwhub/b077573dfb764e95b1d27faba49cca65/) | [久治明千树汐](https://oshwhub.com/yanranxiaoxi) | CC BY-SA 4.0 | 365 | 212 | 189 | 49,755 | 168 | 2 | 4.0 |
+| 10 | [柚子爱AI相机-YuzuAI-YuzuMaix-AIoT-V831开发板](https://oshwhub.com/armbian-pythoniot/yuzumaix-v831)<br>[`922c1f3a…`](./data/raw/oshwhub/922c1f3a9b9a43ff98998f476e7946ca/) | [Armbian-PythonIot](https://oshwhub.com/armbian-pythoniot) | CC BY-NC-SA 3.0 | 236 | 129 | 96 | 45,128 | 93 | 7 | 8.3 |
+
+## 汇总
+
+- Stars 合计 **14,375**(平均 1,437/项目)
+- Likes 合计 **6,592**
+- Views 合计 **1,617,907**
+
+### License 分布
+
+- `Public Domain` — 2 项目
+- `CC BY-NC-SA 4.0` — 2 项目
+- `GPL 3.0` — 2 项目
+- `TAPR Open Hardware License` — 1 项目
+- `CERN Open Hardware License` — 1 项目
+- `CC BY-SA 4.0` — 1 项目
+- `CC BY-NC-SA 3.0` — 1 项目
+
+### 数据源分布
+
+- `oshwhub` — 10 项目
+
+## 目录结构(每个项目)
+
+```
+data/raw///
+├── metadata.json # 统一 schema,见 schemas/project.schema.json
+├── description.md # 标题 + 简介 + 许可证
+├── cover.{jpg,png} # 封面
+├── _urls.json # 所有原始 URL
+└── files/* # 原始附件(Git LFS)
+```
+
+## 重新生成
+
+```bash
+uv run python scripts/build_index.py
+```
diff --git a/scripts/build_index.py b/scripts/build_index.py
new file mode 100644
index 0000000..d27ea8c
--- /dev/null
+++ b/scripts/build_index.py
@@ -0,0 +1,155 @@
+"""Scan data/raw/*/*/metadata.json and build projects.md (index, sorted by stars desc).
+
+Usage:
+    uv run python scripts/build_index.py
+    uv run python scripts/build_index.py --out projects.md
+"""
+
+from __future__ import annotations
+
+import argparse
+import json
+from datetime import datetime, timezone
+from pathlib import Path
+
+REPO = Path(__file__).resolve().parent.parent
+
+
+def fmt_mb(b: int) -> str:
+    return f"{b / 1024 / 1024:.1f}"
+
+
+def collect() -> list[dict]:
+    rows: list[dict] = []
+    for meta in (REPO / "data" / "raw").rglob("metadata.json"):
+        m = json.loads(meta.read_text(encoding="utf-8"))
+        files = m.get("files", [])
+        bytes_total = sum(f.get("size") or 0 for f in files)
+        rows.append(
+            {
+                "uuid": m["project_id"],
+                "title": m["title"],
+                "source": m["source"],
+                "source_url": m["source_url"],
+                "author_display": m["author"].get("display_name") or m["author"]["username"],
+                "author_username": m["author"]["username"],
+                "license": m.get("license") or "unknown",
+                "metrics": m.get("metrics") or {},
+                "files_count": len(files),
+                "files_bytes": bytes_total,
+                "local_dir": str(meta.parent.relative_to(REPO)),
+            }
+        )
+    # sort by stars desc, tie-break by likes
+    rows.sort(
+        key=lambda r: (
+            -(r["metrics"].get("stars") or 0),
+            -(r["metrics"].get("likes") or 0),
+        )
+    )
+    return rows
+
+
+def render(rows: list[dict]) -> str:
+    out: list[str] = []
+    w = out.append
+
+    total_files = sum(r["files_count"] for r in rows)
+    total_bytes = sum(r["files_bytes"] for r in rows)
+    total_stars = sum((r["metrics"].get("stars") or 0) for r in rows)
+    total_likes = sum((r["metrics"].get("likes") or 0) for r in rows)
+    total_views = sum((r["metrics"].get("views") or 0) for r in rows)
+
+    w("# Crawled Projects Index")
+    w("")
+    w(f"_自动生成,最近更新 {datetime.now(timezone.utc).strftime('%Y-%m-%d %H:%M UTC')}_")
+    w("")
+    w(
+        f"**当前**:{len(rows)} 个项目 · {total_files} 个附件 · {fmt_mb(total_bytes)} MB"
+    )
+    w("")
+    w("> 按 **Stars 倒序**。Title → 源站;UUID → 本仓库对应目录。")
+    w("")
+    w(
+        "| # | Title | Author | License | "
+        "⭐ Stars | ❤️ Likes | 🍴 Forks | 👁 Views | 💬 Comments | Files | Size (MB) |"
+    )
+    w(
+        "|---|-------|--------|---------|"
+        "--------:|---------:|---------:|---------:|------------:|------:|----------:|"
+    )
+    for i, r in enumerate(rows, 1):
+        m = r["metrics"]
+        title_link = f"[{r['title']}]({r['source_url']})"
+        # author link inference: oshwhub format is `https://oshwhub.com/`
+        if r["source"] == "oshwhub":
+            author_url = f"https://oshwhub.com/{r['author_username']}"
+        else:
+            author_url = r["source_url"]  # fallback
+        author_link = f"[{r['author_display']}]({author_url})"
+        uuid_short = r["uuid"][:8]
+        dir_link = f"[`{uuid_short}…`](./{r['local_dir']}/)"
+        w(
+            f"| {i} | {title_link}<br>{dir_link} | {author_link} | {r['license']} | "
+            f"{(m.get('stars') or 0):,} | {(m.get('likes') or 0):,} | {(m.get('forks') or 0):,} | "
+            f"{(m.get('views') or 0):,} | {(m.get('comments') or 0):,} | "
+            f"{r['files_count']} | {fmt_mb(r['files_bytes'])} |"
+        )
+    w("")
+    w("## 汇总")
+    w("")
+    avg_stars = total_stars // max(len(rows), 1)
+    w(f"- Stars 合计 **{total_stars:,}**(平均 {avg_stars:,}/项目)")
+    w(f"- Likes 合计 **{total_likes:,}**")
+    w(f"- Views 合计 **{total_views:,}**")
+    w("")
+    w("### License 分布")
+    w("")
+    lic_count: dict[str, int] = {}
+    for r in rows:
+        lic_count[r["license"]] = lic_count.get(r["license"], 0) + 1
+    for lic, c in sorted(lic_count.items(), key=lambda x: -x[1]):
+        w(f"- `{lic}` — {c} 项目")
+    w("")
+    w("### 数据源分布")
+    w("")
+    src_count: dict[str, int] = {}
+    for r in rows:
+        src_count[r["source"]] = src_count.get(r["source"], 0) + 1
+    for src, c in sorted(src_count.items(), key=lambda x: -x[1]):
+        w(f"- `{src}` — {c} 项目")
+    w("")
+    w("## 目录结构(每个项目)")
+    w("")
+    w("```")
+    w("data/raw///")
+    w("├── metadata.json # 统一 schema,见 schemas/project.schema.json")
+    w("├── description.md # 标题 + 简介 + 许可证")
+    w("├── cover.{jpg,png} # 封面")
+    w("├── _urls.json # 所有原始 URL")
+    w("└── files/* # 原始附件(Git LFS)")
+    w("```")
+    w("")
+    w("## 重新生成")
+    w("")
+    w("```bash")
+    w("uv run python scripts/build_index.py")
+    w("```")
+    w("")
+    return "\n".join(out)
+
+
+def main(argv: list[str] | None = None) -> int:
+    ap = argparse.ArgumentParser()
+    ap.add_argument("--out", type=Path, default=REPO / "projects.md")
+    args = ap.parse_args(argv)
+
+    rows = collect()
+    md = render(rows)
+    args.out.write_text(md, encoding="utf-8")
+    print(f"wrote {args.out} ({len(rows)} projects)")
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
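
Note on ordering: `collect()` sorts by stars descending and breaks ties by likes, treating a missing or null metric as 0. The standalone sketch below checks that behavior on made-up rows. The `sort_key` helper and the sample data are hypothetical and not part of the patch; only the key expression mirrors the lambda in `collect()`.

```python
from __future__ import annotations


def sort_key(row: dict) -> tuple[int, int]:
    """Hypothetical helper: same ordering as the lambda in collect()."""
    metrics = row.get("metrics") or {}  # tolerate a null metrics object
    return (
        -(metrics.get("stars") or 0),  # primary: stars, descending
        -(metrics.get("likes") or 0),  # tie-break: likes, descending
    )


# Made-up sample rows, including null and missing metrics.
rows = [
    {"title": "a", "metrics": {"stars": 10, "likes": 1}},
    {"title": "b", "metrics": {"stars": 10, "likes": 5}},
    {"title": "c", "metrics": {"stars": None}},
    {"title": "d", "metrics": None},
    {"title": "e", "metrics": {"stars": 42}},
]

# sorted() is stable, so rows with identical keys keep their input order.
ordered = [r["title"] for r in sorted(rows, key=sort_key)]
print(ordered)
```

Rows whose stars and likes both tie keep their input order, which for `rglob` is filesystem-dependent. If a fully deterministic index is wanted, one option would be to add the project id as a final key component.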