Calibrated against ladder probes on 2026-04-29. Findings in docs/sources/probe_rate_limit_results.md.

SLEEP_PRO 5.0 -> 0.5 (pro.lceda.cn API)
SLEEP_BETWEEN 2.0 -> 1.0 (oshwhub detail/listing)
SLEEP_SOURCE 5.0 unchanged (lceda.cn Std endpoints, not yet probed)
SLEEP_PRO_CDN 0.2 unchanged (modules.lceda.cn, already optimized)

The original 5s rate for Pro API was set out of caution because Pro requires a logged-in cookie. Empirical sustained-burst probe (25 distinct UUIDs at 0.5s sleep, no recovery): 0/25 errors, median latency 410ms, p90 932ms. The "Pro is rate-sensitive" assumption was wrong: server tolerates QPS=2 cleanly.

oshwhub detail HTML pages slowed from p90 6.4s at 1.0s sleep to p90 15s at 0.5s (server queue backs up). 1.0s is the headroom-safe water mark.

Net effect on batch-50 estimate: ~1.5h -> ~30min.

scripts/probe_rate_limit.py: rate-limit ladder probe tool. Reusable for new endpoints (Std source still owes a probe). Designed for safety: 30s tier recovery, low rep counts on auth hosts, bail on first non-200.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# Rate-limit probe results

**Probe date**: 2026-04-29
**Script**: `scripts/probe_rate_limit.py`
**Method**: Ladder test. Each tier sends N requests at a fixed inter-request
sleep, and the sleep decreases tier by tier, with 30s recovery between tiers.
Watch for status != 200, body shrinkage, or latency degradation.

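The ladder method described above can be sketched roughly as follows. This is a minimal illustration, not the contents of `scripts/probe_rate_limit.py`: the names `p90` and `run_ladder`, the injected `fetch` callable, and the body-shrinkage threshold are all hypothetical. The fetcher and sleep function are passed in so the sketch can be exercised without network access.

```python
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical fetcher signature: url -> (status_code, body_length_in_bytes).
Fetch = Callable[[str], Tuple[int, int]]


def p90(samples_ms: List[float]) -> float:
    """Nearest-rank 90th percentile of a non-empty list of latencies (ms)."""
    ordered = sorted(samples_ms)
    rank = (9 * len(ordered) + 9) // 10  # integer ceil(0.9 * n), 1-based
    return ordered[rank - 1]


def run_ladder(
    fetch: Fetch,
    urls: List[str],
    sleeps: List[float],
    recovery_s: float = 30.0,
    min_body_ratio: float = 0.5,  # assumed threshold for "body shrinkage"
    sleep_fn: Callable[[float], None] = time.sleep,
) -> List[Dict[str, float]]:
    """Probe each sleep tier in order and stop at the first bad response."""
    results: List[Dict[str, float]] = []
    baseline_len = None
    for i, sleep_s in enumerate(sleeps):
        latencies, bad = [], 0
        for url in urls:
            start = time.perf_counter()
            status, body_len = fetch(url)
            latencies.append((time.perf_counter() - start) * 1000.0)
            if baseline_len is None:
                baseline_len = body_len  # first response is the size baseline
            if status != 200 or body_len < min_body_ratio * baseline_len:
                bad += 1
                break  # bail on the first bad response
            sleep_fn(sleep_s)
        results.append({"sleep": sleep_s, "bad": bad, "p90_ms": p90(latencies)})
        if bad:
            break  # do not descend to faster tiers
        if i < len(sleeps) - 1:
            sleep_fn(recovery_s)  # let the server recover between tiers
    return results
```

Watching for latency degradation is left to whoever reads the per-tier `p90_ms` values; the sketch only automates the hard stop conditions.
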
## oshwhub.com listing API (`/api/project`)

No auth. 6 tiers × 10 reps = 60 reqs total.

| sleep | status | bad | latency p90 |
|---|---|---:|---:|
| 2.0s | all 200 | 0 | 1187ms |
| 1.0s | all 200 | 0 | 1237ms |
| 0.5s | all 200 | 0 | 567ms |
| 0.25s | all 200 | 0 | 1180ms |
| 0.1s | all 200 | 0 | 2194ms |
| 0.0s | all 200 | 0 | 5362ms ← server soft-limits via latency |

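**Verdict**: 0.5s is the safe water mark. Going faster does not fail, but the
server adds queueing latency, so the speed-up yields no net gain. The listing
API shares `SLEEP_BETWEEN` with the detail pages, so the applied value is the
stricter 1.0s.
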
## oshwhub.com detail HTML (`/<owner>/<path>`)

No auth. 6 tiers × 10 distinct paths from batch-50 candidates.

| sleep | status | bad | latency p90 |
|---|---|---:|---:|
| 2.0s | all 200 | 0 | 4767ms |
| 1.0s | all 200 | 0 | 6350ms |
| 0.5s | all 200 | 0 | **15364ms** ← queue building |
| 0.25s | all 200 | 0 | 3755ms |
| 0.1s | all 200 | 0 | 8179ms |
| 0.0s | all 200 | 0 | 3856ms |

**Verdict**: 1.0s is the safe water mark. Each detail page is about 0.5 MB of
server-side-rendered HTML, so the server slows down earlier than the listing
API does. At 0.5s the server queue already starts to build (one outlier 15s
response), which risks timeout cascades on real bulk runs.

## pro.lceda.cn API (`/api/v4/projects/<P>`)

**Auth required** (logged-in cookie). Conservative ladder, with reps capped at 8
to limit fingerprint exposure. 5 tiers × 8 reps = 40 reqs total.

| sleep | status | bad | latency p90 |
|---|---|---:|---:|
| 5.0s | all 200 | 0 | 7299ms |
| 2.0s | all 200 | 0 | 5518ms |
| 1.0s | all 200 | 0 | 1409ms |
| 0.5s | all 200 | 0 | 2995ms |
| 0.25s | all 200 | 0 | 1552ms |

Then a **sustained burst test** at the chosen water mark:
**25 distinct Pro UUIDs at 0.5s sleep, no recovery**.

- 25/25 success (all status 200, all `success: true`)
- median latency 410ms, p90 932ms, max 1853ms (first call only, which includes the TLS handshake)
- effective QPS 1.0
- wall time 24.9s (vs ~140s at the old 5s/req, a 5.6× speedup)

**Verdict**: 0.5s is the safe water mark. Empirically the Pro API tolerates a
0.5s sleep cleanly, even sustained (nominally QPS=2; about 1 QPS effective once
response latency is included). The sleep was originally set high (5s) out of
caution because Pro requires a logged-in account; that caution turned out to be
unnecessary.

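The burst figures above are related by simple arithmetic, sketched here. The helper names are hypothetical; the inputs (25 requests, 24.9s wall time, ~140s at the old rate, 0.5s sleep) come from the burst results. The sketch also shows why a 0.5s sleep corresponds to a nominal 2 requests/s while the measured rate is about 1 QPS: each request also waits for its response before the sleep starts.

```python
def nominal_qps(sleep_s: float) -> float:
    """Request rate if responses were instantaneous (sleep-only pacing)."""
    return 1.0 / sleep_s


def effective_qps(n_requests: int, wall_s: float) -> float:
    """Measured request rate over the whole run, latency included."""
    return n_requests / wall_s


def speedup(old_wall_s: float, new_wall_s: float) -> float:
    """How many times faster the new run is than the old one."""
    return old_wall_s / new_wall_s


print(nominal_qps(0.5))                   # 2.0
print(round(effective_qps(25, 24.9), 1))  # 1.0
print(round(speedup(140.0, 24.9), 1))     # 5.6
```
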
## lceda.cn Std source endpoints: NOT YET PROBED

Currently `SLEEP_SOURCE = 5.0`. These endpoints should be probed before the
value is lowered. The Std crawler isn't on the critical path for batch-50
(~12 min vs Pro's ~10 min savings), so the probe can wait.

## modules.lceda.cn CDN: already at 0.2s

CDN host serving AES-encrypted EPRO2 history blobs. The pre-existing
`SLEEP_PRO_CDN = 0.2` was validated against an editor HAR capture, in which
the editor fetches blobs back-to-back without throttling. No further probing
needed.

## Settings applied

```python
SLEEP_BETWEEN = 1.0   # was 2.0 (oshwhub detail/listing)
SLEEP_SOURCE  = 5.0   # unchanged (Std source, not yet probed)
SLEEP_PRO     = 0.5   # was 5.0 (Pro API host, 10× speedup)
SLEEP_PRO_CDN = 0.2   # unchanged (CDN, already optimized)
```

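One way these constants could be consumed is a per-host lookup, sketched below. The `HOST_SLEEP` table and the `sleep_for` helper are hypothetical and are not taken from the crawler. The host names are the ones probed above, and the lookup assumes exact host matches (a `www.` prefix, for example, would fall through to the default).

```python
from urllib.parse import urlsplit

SLEEP_BETWEEN = 1.0
SLEEP_SOURCE = 5.0
SLEEP_PRO = 0.5
SLEEP_PRO_CDN = 0.2

# Hypothetical host -> sleep table; not part of the original crawler.
HOST_SLEEP = {
    "oshwhub.com": SLEEP_BETWEEN,
    "lceda.cn": SLEEP_SOURCE,
    "pro.lceda.cn": SLEEP_PRO,
    "modules.lceda.cn": SLEEP_PRO_CDN,
}


def sleep_for(url: str) -> float:
    """Return the inter-request sleep (seconds) for the URL's host."""
    host = urlsplit(url).hostname or ""
    # Unknown hosts fall back to the most conservative configured value.
    return HOST_SLEEP.get(host, max(HOST_SLEEP.values()))
```

Falling back to the largest value keeps an unprobed host on the cautious side, which matches the approach taken for the Std endpoints.
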
## Net impact on batch-50 plan

- Pro, 25 projects × ~5 API calls each: 5 calls × 5s = 25s/project × 25 = ~10 min → 5 calls × 0.5s = 2.5s/project × 25 = ~1 min
- Detail page scan, 50 projects: 50 × 2s = 100s → 50 × 1s = 50s
- Combined batch-50 walltime estimate: **~1.5h → ~30 min**
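
The sleep-time arithmetic in the first two bullets can be checked with a few lines. The function name is hypothetical, and the sketch counts inter-request sleep only, not response latency, which is how the bullets compute it. The combined ~1.5h → ~30 min figure is taken as stated and is not derived here.

```python
def sleep_budget_s(items: int, calls_per_item: int, sleep_s: float) -> float:
    """Total inter-request sleep in seconds (response latency not included)."""
    return items * calls_per_item * sleep_s


pro_before = sleep_budget_s(25, 5, 5.0)     # 625.0 s, about 10 min
pro_after = sleep_budget_s(25, 5, 0.5)      # 62.5 s, about 1 min
detail_before = sleep_budget_s(50, 1, 2.0)  # 100.0 s
detail_after = sleep_budget_s(50, 1, 1.0)   # 50.0 s

saved_min = (pro_before - pro_after + detail_before - detail_after) / 60
print(f"sleep time saved: {saved_min:.1f} min")  # sleep time saved: 10.2 min
```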