feat: expandable previews, KaTeX rendering, variant speedup, batch import
- Analytics/Similar: expandable question preview with KaTeX rendering
- KaTeXRenderer: auto markdown-to-HTML (code blocks, tables, bold), auto Unicode→LaTeX
- ErrorBook: full question text rendering instead of truncated preview
- Variant: remove hint/solution from generation (faster), async, fix null crash
- Grading: add max_tokens limit
- JSON parser: robust multi-layer repair + JSONDecodeError retry
- Extraction prompt: enforce LaTeX notation for math
- Upload: redirect to home instead of blank paper page
- ProcessingBanner: add ETA time estimate + percentage
- Batch import script + handoff guide for team

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
220
backend/BATCH_IMPORT_GUIDE.md
Normal file
@@ -0,0 +1,220 @@
# Batch Import Guide for Exam Papers

## Overview

`batch_import.py` bulk-loads exam papers into the PastPaper Master database. For each paper it automatically:

1. Creates the DB record
2. Uploads the PDF to Supabase Storage
3. Extracts the question structure with Gemini Vision
4. Generates the AI solution trio (knowledge reminder + hint + solution) with DeepSeek

## Environment Setup

### 1. Server information

| Item | Value |
|------|-----|
| Production server | `129.226.210.66` |
| SSH | `ssh -i ~/.ssh/id_ed25519 root@129.226.210.66` |
| Backend container | `pastpaper-backend-1` |
| Project path | `/opt/pastpaper/` |
| Frontend static files | `/opt/1panel/www/pastpaper/` |

### 2. Run locally (recommended)

```bash
cd /path/to/PastPaper\ Master/backend

# Make sure .env is in the project root (../.env)
# Required keys: SUPABASE_URL, SUPABASE_SERVICE_ROLE_KEY, GOOGLE_GEMINI_API_KEY, DEEPSEEK_API_KEY

# Activate the virtual environment
source .venv/bin/activate

# Or call the venv's python directly
.venv/bin/python batch_import.py ...
```

### 3. Run inside the Docker container on the server

```bash
# First copy the script and the paper files to the server
scp -i ~/.ssh/id_ed25519 batch_import.py root@129.226.210.66:/opt/pastpaper/backend/
scp -i ~/.ssh/id_ed25519 -r /path/to/papers root@129.226.210.66:/opt/pastpaper/papers_to_import/

# Enter the container and run
ssh -i ~/.ssh/id_ed25519 root@129.226.210.66
docker exec -it pastpaper-backend-1 bash
cd /app
python batch_import.py /path/to/papers --batch
```

## Usage

### Single import

```bash
python batch_import.py paper.pdf \
  --course COMP2211 \
  --year 2024 \
  --term spring \
  --exam midterm

# With an answer file
python batch_import.py paper.pdf \
  --answer answer.pdf \
  --course COMP2211 \
  --year 2024 \
  --term spring \
  --exam midterm
```

### Batch import

#### Required directory structure

```
papers_to_import/
├── COMP2211/
│   ├── 2024_spring_midterm.pdf
│   ├── 2024_spring_midterm_answer.pdf   <- matched automatically
│   ├── 2024_fall_final.pdf
│   └── 2023_spring_midterm.pdf
├── COMP2011/
│   ├── 2024_spring_midterm.pdf
│   └── 2024_fall_final.pdf
├── MATH1014/
│   └── 2024_spring_midterm.pdf
└── FINA2303/
    └── 2023_fall_midterm.pdf
```

- First-level directory name = course code (upper-cased automatically)
- File name format: `{year}_{term}_{examtype}.pdf`
- Answer file: `{year}_{term}_{examtype}_answer.pdf` (optional; place it in the same directory and it is matched automatically)
- term: `spring` / `fall` / `summer`
- examtype: `midterm` / `final` / `quiz`
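
The naming rules above can be checked before an import with a small pre-flight script. The sketch below is a stricter validator than the importer itself needs; the helper `check_name` and the `PaperName` tuple are hypothetical names introduced for illustration and are not part of `batch_import.py`.

```python
import re
from typing import NamedTuple, Optional

# Strict pattern for the documented convention:
#   {year}_{term}_{examtype}.pdf  or  {year}_{term}_{examtype}_answer.pdf
NAME_RE = re.compile(
    r"^(?P<year>\d{4})_(?P<term>spring|fall|summer)_"
    r"(?P<exam>midterm|final|quiz)(?P<answer>_answer)?\.pdf$"
)


class PaperName(NamedTuple):
    year: int
    term: str
    exam_type: str
    is_answer: bool


def check_name(filename: str) -> Optional[PaperName]:
    """Return the parsed fields, or None if the name breaks the convention."""
    m = NAME_RE.match(filename.lower())
    if m is None:
        return None
    return PaperName(int(m["year"]), m["term"], m["exam"], m["answer"] is not None)


if __name__ == "__main__":
    for name in ["2024_spring_midterm.pdf", "2024_spring_midterm_answer.pdf", "midterm2024.pdf"]:
        print(name, "->", check_name(name))
```

Running a check like this over a course directory before `--dry-run` makes it easier to spot files that the importer would silently skip.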

#### Commands

```bash
# Dry run first to see what would be imported
python batch_import.py papers_to_import/ --batch --dry-run

# Real import (serial, safest)
python batch_import.py papers_to_import/ --batch

# Concurrent import (2 papers at a time; faster, but the APIs may rate-limit)
python batch_import.py papers_to_import/ --batch --concurrency 2
```

### Automatic duplicate check

The script automatically skips papers that already exist (same course_code + year + term + exam_type, with status `ready` or `processing`).
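
The duplicate rule can be illustrated without a database. The sketch below applies the same four-field key and status filter to an in-memory list of rows; `is_duplicate` and the sample rows are assumptions for illustration, while the real script queries the `papers` table.

```python
ACTIVE_STATUSES = {"ready", "processing"}


def is_duplicate(existing: list[dict], course_code: str, year: int,
                 term: str, exam_type: str) -> bool:
    """True if an active row with the same four-field key already exists."""
    key = (course_code, year, term, exam_type)
    return any(
        (row["course_code"], row["year"], row["term"], row["exam_type"]) == key
        and row["status"] in ACTIVE_STATUSES
        for row in existing
    )


rows = [
    {"course_code": "COMP2211", "year": 2024, "term": "spring",
     "exam_type": "midterm", "status": "ready"},
    {"course_code": "COMP2211", "year": 2024, "term": "fall",
     "exam_type": "final", "status": "error"},
]

print(is_duplicate(rows, "COMP2211", 2024, "spring", "midterm"))
print(is_duplicate(rows, "COMP2211", 2024, "fall", "final"))
```

Under this rule a row with `status=error` does not count as a duplicate, so a failed paper is not blocked from being imported again.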
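
## Estimated processing time

Processing time for a single paper depends on the page count and the number of questions:

| Stage | Time |
|------|------|
| PDF rendering | 2-5s |
| Vision extraction (batches of 8 pages) | 30-60s/batch |
| Answer matching | 20-40s |
| AI trio generation (batches of 3 questions) | 15-25s/batch |
| **Total (30-question paper)** | **~3-5 min** |
| **Total (40+ question paper)** | **~5-8 min** |

Recommendation: keep concurrency at 2 or below, otherwise the Gemini API may rate-limit (429 errors; the script retries automatically, but the import gets slower).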
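
The per-stage figures in the table can be combined into a rough estimate for a specific paper. The sketch below assumes every batch runs strictly one after another, so its upper bound is more pessimistic than the totals quoted in the table; the function name and the example page count are assumptions for illustration.

```python
import math

# (low, high) seconds per stage, copied from the table above
RENDER = (2, 5)
VISION_PER_BATCH = (30, 60)   # one batch = 8 pages
ANSWER_MATCH = (20, 40)
TRIO_PER_BATCH = (15, 25)     # one batch = 3 questions


def estimate_seconds(pages: int, questions: int, has_answer: bool = False) -> tuple[int, int]:
    """Return a (low, high) estimate in seconds, assuming serial batches."""
    vision_batches = math.ceil(pages / 8)
    trio_batches = math.ceil(questions / 3)
    low = RENDER[0] + vision_batches * VISION_PER_BATCH[0] + trio_batches * TRIO_PER_BATCH[0]
    high = RENDER[1] + vision_batches * VISION_PER_BATCH[1] + trio_batches * TRIO_PER_BATCH[1]
    if has_answer:
        low += ANSWER_MATCH[0]
        high += ANSWER_MATCH[1]
    return low, high


# A hypothetical 12-page paper with 30 questions and no answer file
print(estimate_seconds(12, 30))  # (212, 375)
```

For that example the range is roughly 3.5 to 6 minutes, in the same region as the 30-question row of the table.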

## API costs

| Model | Purpose | Cost |
|------|------|------|
| Gemini 2.5 Flash | Vision extraction + answer matching | Usually covered by the free tier |
| DeepSeek V3 | AI trio generation | ~$0.5-1.5 per paper |

Monitoring costs:
- Gemini: https://aistudio.google.com (usage is shown on the API keys page)
- DeepSeek: https://platform.deepseek.com (Usage page)

## FAQ

### Q: What if processing fails?

The paper is marked `status=error`. You can delete it and start over:

```bash
# Run from the backend/ directory
.venv/bin/python -c "
import sys; sys.path.insert(0, '.')
from dotenv import load_dotenv; load_dotenv('../.env')
from app.services.supabase_client import get_supabase
sb = get_supabase()
errors = sb.table('papers').select('id, course_code').eq('status', 'error').execute().data
for p in errors:
    # Delete the questions first, then the paper row itself
    sb.table('paper_questions').delete().eq('paper_id', p['id']).execute()
    sb.table('papers').delete().eq('id', p['id']).execute()
    print('Deleted', p['course_code'])
"
```

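### Q: JSON parsing errors?

Multi-layer JSON repair and automatic retry (up to 6 attempts) are built in. If parsing still fails, the paper content is usually too complex (heavy LaTeX plus code). Things to try:

1. Delete the error record and import again
2. If it keeps failing, the paper PDF may need to be split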
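
To see why LaTeX-heavy papers are the usual culprit, the sketch below isolates one layer of the repair: the backslash-doubling pattern from `parse_json_response` in this commit. The wrapper `repair_escapes` and the sample strings are assumptions for illustration. It also shows a case the pattern does not catch, which is an observation about this regex rather than a documented behavior of the pipeline.

```python
import json
import re

# Same pattern as step 4 of parse_json_response: double a lone backslash unless
# it starts something JSON already accepts (\" \\ \/ \b \f \n \r \t \u).
LONE_BACKSLASH = re.compile(r'(?<!\\)((?:\\\\)*)\\([^"\\/bfnrtu\n])')


def repair_escapes(text: str) -> str:
    return LONE_BACKSLASH.sub(r'\1\\\\\2', text)


raw = r'{"q": "Find $\sigma$ and $\sqrt{x}$"}'

try:
    json.loads(raw)
    parsed_without_repair = True
except json.JSONDecodeError:
    parsed_without_repair = False  # \s is not a valid JSON escape

fixed = json.loads(repair_escapes(raw))["q"]

# \frac starts with \f, which is a valid JSON escape (form feed),
# so the pattern leaves it alone and the command is silently corrupted.
corrupted = json.loads(repair_escapes(r'{"q": "$\frac{a}{b}$"}'))["q"]

print(parsed_without_repair, fixed, repr(corrupted))
```

Commands such as `\frac`, `\theta`, or `\neq` fall into that second category, so text containing them can parse successfully and still come out damaged; re-importing or splitting the PDF does not necessarily change that.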

### Q: How do I regenerate only the AI trio (questions already extracted)?

```bash
# Clear the solution fields; the backend resumes generation automatically after a restart
.venv/bin/python -c "
import sys; sys.path.insert(0, '.')
from dotenv import load_dotenv; load_dotenv('../.env')
from app.services.supabase_client import get_supabase
sb = get_supabase()
PAPER_ID = 'xxxxxxxx-xxxx-...'  # replace with the real paper id
qs = sb.table('paper_questions').select('id').eq('paper_id', PAPER_ID).execute().data
for q in qs:
    sb.table('paper_questions').update({'solution': None, 'ai_hint': None, 'knowledge_reminder': None}).eq('id', q['id']).execute()
sb.table('papers').update({'status': 'processing'}).eq('id', PAPER_ID).execute()
print(f'Reset {len(qs)} questions, restart backend to regenerate')
"

# Then restart the backend
ssh -i ~/.ssh/id_ed25519 root@129.226.210.66 "sudo docker restart pastpaper-backend-1"
```

### Q: How do I deploy backend code changes?

```bash
# Upload the changed file
scp -i ~/.ssh/id_ed25519 app/services/paper_processor.py root@129.226.210.66:/opt/pastpaper/backend/app/services/

# Rebuild the container
ssh -i ~/.ssh/id_ed25519 root@129.226.210.66 "cd /opt/pastpaper && sudo docker compose up -d --build backend"
```

### Q: How do I deploy frontend changes?

```bash
cd frontend
npm run build
cp public/favicon.jpg dist/
ssh -i ~/.ssh/id_ed25519 root@129.226.210.66 "rm -rf /opt/1panel/www/pastpaper/assets"
scp -i ~/.ssh/id_ed25519 dist/index.html dist/favicon.jpg root@129.226.210.66:/opt/1panel/www/pastpaper/
scp -i ~/.ssh/id_ed25519 -r dist/assets root@129.226.210.66:/opt/1panel/www/pastpaper/
```

## Paper sources

The `pastpaper-scraper/papers/` directory holds past exam PDFs scraped from HKUST, one directory per course. Pick popular courses from there to import.

Courses to import first (large user base):
- COMP2011, COMP2211, COMP2711H
- MATH1013, MATH1014, MATH2023
- PHYS1112
- ELEC2100
- FINA2303

Organize the files into the directory structure described above, then run with `--batch`.
@@ -230,6 +230,7 @@ async def get_course_analytics(course_code: str):
|
||||
"source": source_label,
|
||||
"question_number": question["question_number"],
|
||||
"preview": question["question_text"][:220],
|
||||
"full_text": question["question_text"],
|
||||
"difficulty": question.get("difficulty"),
|
||||
"question_type": question_type,
|
||||
"year": paper.get("year"),
|
||||
|
||||
@@ -133,7 +133,7 @@ async def create_variant(question_id: str, user_id: str = Depends(get_current_us
|
||||
raise HTTPException(status_code=404, detail="Question not found")
|
||||
|
||||
question = result.data[0]
|
||||
variant_data = await asyncio.to_thread(generate_variant, question)
|
||||
variant_data = await generate_variant(question)
|
||||
variant_data["knowledge_reminder"] = question.get("knowledge_reminder", "")
|
||||
|
||||
saved = sb.table("question_variants").insert({
|
||||
|
||||
@@ -68,9 +68,7 @@ Return JSON:
|
||||
"question_text": "HTML formatted variant question",
|
||||
"question_type": "{question_type}",
|
||||
"options": [MC only, format {{"label":"A","text":"..."}}, ...] or null,
|
||||
"correct_answer": "Correct answer (plain text)",
|
||||
"ai_hint": "HTML formatted hint that guides thinking WITHOUT giving the answer",
|
||||
"solution": "HTML formatted complete step-by-step solution"
|
||||
"correct_answer": "Correct answer (plain text)"
|
||||
}}"""
|
||||
|
||||
|
||||
@@ -90,7 +88,7 @@ def ocr_photo(photo_bytes: bytes) -> str:
|
||||
]},
|
||||
],
|
||||
temperature=0,
|
||||
max_tokens=2000,
|
||||
max_tokens=1500,
|
||||
)
|
||||
return resp.choices[0].message.content or ""
|
||||
|
||||
@@ -114,13 +112,15 @@ def grade_answer(question: dict, student_answer: str) -> dict:
|
||||
)},
|
||||
],
|
||||
temperature=0.2,
|
||||
max_tokens=2048,
|
||||
response_format={"type": "json_object"},
|
||||
)
|
||||
return json.loads(resp.choices[0].message.content)
|
||||
|
||||
|
||||
def generate_variant(question: dict) -> dict:
|
||||
"""Gemini generates a variant question"""
|
||||
async def generate_variant(question: dict) -> dict:
|
||||
"""DeepSeek generates a variant question (async)"""
|
||||
import asyncio
|
||||
answer = (
|
||||
question.get("correct_option")
|
||||
or question.get("correct_answer")
|
||||
@@ -129,18 +129,20 @@ def generate_variant(question: dict) -> dict:
|
||||
)
|
||||
|
||||
ds = get_deepseek_client()
|
||||
resp = ds.chat.completions.create(
|
||||
prompt = VARIANT_PROMPT.format(
|
||||
question_type=question["question_type"],
|
||||
question_text=question["question_text"],
|
||||
topics=", ".join(question.get("topics", [])),
|
||||
difficulty=question.get("difficulty", "medium"),
|
||||
answer=answer,
|
||||
)
|
||||
|
||||
resp = await asyncio.to_thread(
|
||||
ds.chat.completions.create,
|
||||
model="deepseek-chat",
|
||||
messages=[
|
||||
{"role": "system", "content": VARIANT_PROMPT.format(
|
||||
question_type=question["question_type"],
|
||||
question_text=question["question_text"],
|
||||
topics=", ".join(question.get("topics", [])),
|
||||
difficulty=question.get("difficulty", "medium"),
|
||||
answer=answer,
|
||||
)},
|
||||
],
|
||||
messages=[{"role": "system", "content": prompt}],
|
||||
temperature=0.5,
|
||||
max_tokens=2048,
|
||||
response_format={"type": "json_object"},
|
||||
)
|
||||
return json.loads(resp.choices[0].message.content)
|
||||
|
||||
@@ -35,6 +35,8 @@ CRITICAL RULES for question_text:
|
||||
- For sub-questions (e.g. (a)(i)), copy the ENTIRE parent question setup (variable definitions, code blocks, problem description) into the question_text, then append the specific sub-question.
|
||||
- For Python/code questions: include ALL variable definitions and import statements verbatim, exactly as they appear in the exam, preserving multi-line arrays and data structures completely.
|
||||
- Never truncate code. If a variable is defined across multiple lines (e.g. a numpy array), include every line.
|
||||
- CRITICAL: ALL mathematical expressions, formulas, variables, and symbols MUST use LaTeX notation. Wrap inline math with $...$ and display math with $$...$$. NEVER use Unicode symbols like σ, μ, π, ², ≥, ≤, √, ∑, etc. Use $\sigma$, $\mu$, $\pi$, $^2$, $\geq$, $\leq$, $\sqrt{}$, $\sum$, etc. Every fraction should be $\frac{a}{b}$, every subscript $x_i$, every superscript $x^n$.
|
||||
- Code blocks must use markdown fenced code blocks (```python ... ```).
|
||||
|
||||
Output JSON format (strictly follow):
|
||||
{
|
||||
@@ -203,6 +205,8 @@ RETRYABLE_ERROR_MARKERS = (
|
||||
|
||||
|
||||
def is_retryable_error(exc: Exception) -> bool:
|
||||
if isinstance(exc, json.JSONDecodeError):
|
||||
return True # LLM returned bad JSON, retry may fix it
|
||||
message = str(exc).lower()
|
||||
return any(marker in message for marker in RETRYABLE_ERROR_MARKERS)
|
||||
|
||||
@@ -221,17 +225,51 @@ def pdf_to_images(pdf_bytes: bytes, dpi: int = 96) -> list[str]:
|
||||
|
||||
|
||||
 def parse_json_response(text: str) -> dict:
-    """Parse the JSON returned by the model; tolerates markdown code-fence wrapping"""
+    """Parse the JSON returned by the model; tolerates a range of formatting problems"""
     text = text.strip()
-    # Strip the ```json ... ``` wrapper
+
+    # 1. Strip the ```json ... ``` wrapper
     if text.startswith("```"):
         lines = text.splitlines()
         text = "\n".join(lines[1:-1] if lines[-1].strip() == "```" else lines[1:])
-    # Remove illegal control characters inside JSON strings (0x00-0x1F except \t \n \r)
+
+    # 2. If the text does not start with {, skip ahead to the first {
+    idx = text.find("{")
+    if idx > 0:
+        text = text[idx:]
+    # Cut trailing junk after the last }
+    ridx = text.rfind("}")
+    if ridx > 0:
+        text = text[:ridx + 1]
+
+    # 3. Remove all illegal control characters (0x00-0x1F except \t \n \r)
     text = re.sub(r'[\x00-\x08\x0b\x0c\x0e-\x1f]', '', text)
-    # Fix invalid JSON escape sequences from the model: only an illegal character after an odd number of backslashes
-    text = re.sub(r'(?<!\\)((?:\\\\)*)\\([^"\\/bfnrtu])', r'\1\\\\\2', text)
-    return json.loads(text)
+
+    # 4. Fix invalid JSON escapes: LaTeX such as \sqrt, \sigma, etc.
+    text = re.sub(r'(?<!\\)((?:\\\\)*)\\([^"\\/bfnrtu\n])', r'\1\\\\\2', text)
+
+    # 5. Try to parse
+    try:
+        return json.loads(text)
+    except json.JSONDecodeError:
+        pass
+
+    # 6. More aggressive repair: turn \t \n \r into spaces, drop every other control character
+    text = re.sub(r'[\x00-\x1f]', lambda m: ' ' if m.group() in '\t\n\r' else '', text)
+
+    # 7. Repair unterminated strings: add a closing quote at end of line
+    text = re.sub(r'(?<!\\)"([^"]*)\n', r'"\1\\n"\n', text)
+
+    try:
+        return json.loads(text)
+    except json.JSONDecodeError:
+        pass
+
+    # 8. Last resort: parse with strict=False
+    try:
+        return json.loads(text, strict=False)
+    except json.JSONDecodeError:
+        raise
|
||||
|
||||
|
||||
async def gemini_vision_json(
|
||||
|
||||
323
backend/batch_import.py
Normal file
@@ -0,0 +1,323 @@
"""
Batch-import exam papers into PastPaper Master
==============================================

Usage:
    # Import a single paper
    python batch_import.py /path/to/paper.pdf --course COMP2211 --year 2024 --term spring --exam midterm

    # Import a single paper plus its answer file
    python batch_import.py /path/to/paper.pdf --answer /path/to/answer.pdf --course COMP2211 --year 2024 --term spring --exam midterm

    # Batch-import a whole directory (metadata is parsed from the file names)
    python batch_import.py /path/to/papers_dir/ --batch

    # Batch import with a concurrency limit (default 1, to avoid API rate limits)
    python batch_import.py /path/to/papers_dir/ --batch --concurrency 2

    # Dry run (only print what would be imported, do nothing)
    python batch_import.py /path/to/papers_dir/ --batch --dry-run

Directory layout convention (--batch mode):
    papers_dir/
    ├── COMP2211/
    │   ├── 2024_spring_midterm.pdf
    │   ├── 2024_spring_midterm_answer.pdf  (optional, matched automatically)
    │   ├── 2024_fall_final.pdf
    │   └── 2023_spring_midterm.pdf
    ├── MATH1014/
    │   ├── 2024_spring_midterm.pdf
    │   └── ...
    └── ...

    File name format:   {year}_{term}_{exam_type}.pdf
    Answer file name:   {year}_{term}_{exam_type}_answer.pdf  (matched automatically)

Environment:
    Requires the .env file in the project root (Supabase and LLM API keys)
    Run from the backend/ directory: python batch_import.py ...
"""

import argparse
import asyncio
import os
import re
import sys
import time
from pathlib import Path

sys.path.insert(0, os.path.dirname(__file__))

from dotenv import load_dotenv
load_dotenv(os.path.join(os.path.dirname(__file__), "..", ".env"))

from app.services.supabase_client import get_supabase
from app.services.paper_processor import process_paper

# ── Service-account user_id (used for batch imports, not tied to a real user) ──
BATCH_USER_ID = "00000000-0000-0000-0000-000000000000"


def parse_filename(filename: str) -> dict | None:
    """
    Parse metadata from a file name. Intended formats:
    - 2024_spring_midterm.pdf
    - 2024-fall-final.pdf
    - 2024s_mid.pdf
    - (COMP2211)[2024](s)midterm~xxx.pdf  (scraper format)

    Note: the term patterns below only recognise a single-letter term when it
    is written as (s)/(f) or _s_/_f_, so 2024s_mid.pdf yields no term and the
    function returns None for it.
    """
    base = Path(filename).stem.lower()

    # Answer files are not imported on their own; find_answer_file() pairs them with their paper
    if base.endswith("_answer") or base.endswith("_ans") or base.endswith("_solution"):
        return None

    result = {}

    # Year: four digits, 2010-2029
    year_match = re.search(r'(20[1-2]\d)', base)
    if year_match:
        result["year"] = int(year_match.group(1))

    # Term
    if re.search(r'spring|spr|\(s\)|_s_', base):
        result["term"] = "spring"
    elif re.search(r'fall|aut|\(f\)|_f_', base):
        result["term"] = "fall"
    elif re.search(r'summer|sum', base):
        result["term"] = "summer"

    # Exam type
    if re.search(r'mid', base):
        result["exam_type"] = "midterm"
    elif re.search(r'final|fin', base):
        result["exam_type"] = "final"
    elif re.search(r'quiz', base):
        result["exam_type"] = "quiz"

    # All three fields are required
    if "year" in result and "term" in result and "exam_type" in result:
        return result
    return None


def find_answer_file(paper_path: Path) -> Path | None:
    """Find the answer file that belongs to a paper, if any"""
    stem = paper_path.stem
    parent = paper_path.parent
    for suffix in ["_answer", "_ans", "_solution"]:
        candidate = parent / f"{stem}{suffix}.pdf"
        if candidate.exists():
            return candidate
    return None


def scan_directory(dir_path: Path) -> list[dict]:
    """
    Scan a directory and return the list of papers to import.
    Expected structure: dir_path/COURSE_CODE/year_term_examtype.pdf
    """
    items = []
    for course_dir in sorted(dir_path.iterdir()):
        if not course_dir.is_dir():
            continue
        course_code = course_dir.name.upper()

        for pdf in sorted(course_dir.glob("*.pdf")):
            meta = parse_filename(pdf.name)
            if meta is None:
                # Answer files and unparseable names are skipped
                continue

            answer_file = find_answer_file(pdf)
            items.append({
                "paper_path": pdf,
                "answer_path": answer_file,
                "course_code": course_code,
                **meta,
            })
    return items


def check_duplicate(sb, course_code: str, year: int, term: str, exam_type: str) -> bool:
    """Check whether the same paper already exists (status ready or processing)"""
    existing = (
        sb.table("papers")
        .select("id")
        .eq("course_code", course_code)
        .eq("year", year)
        .eq("term", term)
        .eq("exam_type", exam_type)
        .in_("status", ["ready", "processing"])
        .execute()
        .data
    )
    return len(existing) > 0


async def import_single(
    paper_path: Path,
    answer_path: Path | None,
    course_code: str,
    year: int,
    term: str,
    exam_type: str,
    skip_duplicates: bool = True,
) -> str | None:
    """Import a single paper; returns the paper_id, or None if it was skipped"""
    sb = get_supabase()

    # Duplicate check
    if skip_duplicates and check_duplicate(sb, course_code, year, term, exam_type):
        print(f"  SKIP (duplicate): {course_code} {year} {term} {exam_type}")
        return None

    # Read the files
    paper_bytes = paper_path.read_bytes()
    answer_bytes = answer_path.read_bytes() if answer_path else None

    # Create the DB record
    record = sb.table("papers").insert({
        "user_id": BATCH_USER_ID,
        "course_code": course_code,
        "year": year,
        "term": term,
        "exam_type": exam_type,
        "paper_file_url": "",
        "answer_file_url": None,
        "status": "processing",
    }).execute()
    paper_id = record.data[0]["id"]

    # Upload to Supabase Storage
    storage_path = f"{course_code}/{year}_{term}_{exam_type}"
    try:
        sb.storage.from_("papers").upload(
            f"{storage_path}/paper.pdf", paper_bytes,
            file_options={"content-type": "application/pdf", "upsert": "true"},
        )
        paper_url = sb.storage.from_("papers").get_public_url(f"{storage_path}/paper.pdf")
        update = {"paper_file_url": paper_url}

        if answer_bytes:
            sb.storage.from_("papers").upload(
                f"{storage_path}/answer.pdf", answer_bytes,
                file_options={"content-type": "application/pdf", "upsert": "true"},
            )
            update["answer_file_url"] = sb.storage.from_("papers").get_public_url(f"{storage_path}/answer.pdf")

        sb.table("papers").update(update).eq("id", paper_id).execute()
    except Exception as e:
        # A failed upload is only a warning; processing still runs on the local bytes
        print(f"  WARNING: Storage upload failed: {e}")

    # Process the paper (Vision extraction + AI trio)
    print(f"  Processing {course_code} {year} {term} {exam_type} ...")
    t0 = time.time()
    try:
        await process_paper(paper_id, paper_bytes, answer_bytes)
        elapsed = time.time() - t0
        print(f"  DONE in {elapsed:.0f}s -> {paper_id[:8]}")
    except Exception as e:
        elapsed = time.time() - t0
        print(f"  ERROR after {elapsed:.0f}s: {e}")
        sb.table("papers").update({"status": "error", "processing_step": str(e)[:200]}).eq("id", paper_id).execute()

    return paper_id


async def batch_import(dir_path: Path, concurrency: int = 1, dry_run: bool = False):
    """Batch-import every paper found under a directory"""
    items = scan_directory(dir_path)

    if not items:
        print(f"No papers found in {dir_path}")
        print("Expected structure: dir/COURSE_CODE/year_term_examtype.pdf")
        return

    print(f"Found {len(items)} papers to import:\n")
    for item in items:
        ans_label = " + answer" if item["answer_path"] else ""
        print(f"  {item['course_code']} {item['year']} {item['term']} {item['exam_type']}{ans_label}")
        print(f"    <- {item['paper_path']}")

    if dry_run:
        print(f"\n[DRY RUN] Would import {len(items)} papers. Exiting.")
        return

    print(f"\nStarting import (concurrency={concurrency})...\n")

    semaphore = asyncio.Semaphore(concurrency)
    results = {"ok": 0, "skip": 0, "error": 0}

    async def process_one(item):
        async with semaphore:
            try:
                pid = await import_single(
                    paper_path=item["paper_path"],
                    answer_path=item["answer_path"],
                    course_code=item["course_code"],
                    year=item["year"],
                    term=item["term"],
                    exam_type=item["exam_type"],
                )
                if pid:
                    results["ok"] += 1
                else:
                    results["skip"] += 1
            except Exception as e:
                results["error"] += 1
                print(f"  FATAL: {item['course_code']} {item['year']} - {e}")

    # Serial or concurrent processing
    if concurrency == 1:
        for item in items:
            await process_one(item)
    else:
        await asyncio.gather(*[process_one(item) for item in items])

    print(f"\n{'='*50}")
    print(f"Import complete: {results['ok']} success, {results['skip']} skipped, {results['error']} errors")


def main():
    parser = argparse.ArgumentParser(description="Batch import papers to PastPaper Master")
    parser.add_argument("path", help="Path to PDF file or directory (with --batch)")
    parser.add_argument("--answer", help="Path to answer PDF (single file mode)")
    parser.add_argument("--course", help="Course code (e.g. COMP2211)")
    parser.add_argument("--year", type=int, help="Year (e.g. 2024)")
    parser.add_argument("--term", choices=["spring", "summer", "fall"], help="Term")
    parser.add_argument("--exam", choices=["midterm", "final", "quiz"], help="Exam type")
    parser.add_argument("--batch", action="store_true", help="Batch import from directory")
    parser.add_argument("--concurrency", type=int, default=1, help="Max concurrent imports (default: 1)")
    parser.add_argument("--dry-run", action="store_true", help="Print what would be imported without doing it")

    args = parser.parse_args()
    path = Path(args.path)

    if args.batch:
        if not path.is_dir():
            print(f"Error: {path} is not a directory")
            sys.exit(1)
        asyncio.run(batch_import(path, concurrency=args.concurrency, dry_run=args.dry_run))
    else:
        # Single file mode
        if not path.is_file():
            print(f"Error: {path} is not a file")
            sys.exit(1)
        if not all([args.course, args.year, args.term, args.exam]):
            print("Error: --course, --year, --term, --exam are required for single file import")
            sys.exit(1)

        answer_path = Path(args.answer) if args.answer else None
        result = asyncio.run(import_single(
            paper_path=path,
            answer_path=answer_path,
            course_code=args.course.upper(),
            year=args.year,
            term=args.term,
            exam_type=args.exam,
        ))
        if result:
            print(f"\nPaper ID: {result}")


if __name__ == "__main__":
    main()
@@ -121,10 +121,34 @@ export default function ProcessingBanner() {
|
||||
{expanded && (
|
||||
<div className="mt-1.5 flex flex-col gap-1.5" style={{ minWidth: 240 }}>
|
||||
{processing.map((p) => {
|
||||
const step = p.processing_step;
|
||||
const step = p.processing_step || "";
|
||||
const progress = p.processing_progress || 0;
|
||||
const total = p.processing_total || 0;
|
||||
const pct = total > 0 ? Math.round((progress / total) * 100) : 0;
|
||||
const totalSteps = p.processing_total || 0;
|
||||
const pct = totalSteps > 0 ? Math.round((progress / totalSteps) * 100) : 0;
|
||||
|
||||
// Estimate remaining time based on step
|
||||
let eta = "";
|
||||
if (step.includes("Rendering")) {
|
||||
eta = "~2-3 min";
|
||||
} else if (step.includes("Reading") || step.includes("Extracting")) {
|
||||
eta = "~3-5 min";
|
||||
} else if (step.includes("Matching answer")) {
|
||||
eta = "~1-2 min";
|
||||
} else if (step.includes("Generating solution") || step.includes("Generating AI")) {
|
||||
if (totalSteps > 0 && progress > 0) {
|
||||
const remaining = totalSteps - progress;
|
||||
const secsPerBatch = 25;
|
||||
const batchSize = 3;
|
||||
const totalSecs = Math.ceil(remaining / batchSize) * secsPerBatch;
|
||||
if (totalSecs < 60) eta = `~${totalSecs}s`;
|
||||
else eta = `~${Math.ceil(totalSecs / 60)} min`;
|
||||
} else {
|
||||
eta = "~5-8 min";
|
||||
}
|
||||
} else if (step) {
|
||||
eta = "~5-10 min";
|
||||
}
|
||||
|
||||
return (
|
||||
<div
|
||||
key={p.id}
|
||||
@@ -132,18 +156,24 @@ export default function ProcessingBanner() {
|
||||
>
|
||||
<div className="flex items-center gap-2.5 mb-1.5">
|
||||
<span className="w-3 h-3 border-2 border-white border-t-transparent rounded-full animate-spin shrink-0" />
|
||||
<span className="truncate">
|
||||
<span className="truncate flex-1">
|
||||
<span className="font-semibold">{p.course_code}</span>{" "}
|
||||
{p.year} {p.term} {p.exam_type}
|
||||
</span>
|
||||
{eta && (
|
||||
<span className="text-[10px] text-blue-300 shrink-0">{eta}</span>
|
||||
)}
|
||||
</div>
|
||||
{step && (
|
||||
<>
|
||||
<div className="text-[10px] text-gray-400 mb-1 truncate">{step}</div>
|
||||
{total > 0 && (
|
||||
<div className="h-1.5 bg-gray-700 rounded-full overflow-hidden">
|
||||
<div className="h-full bg-blue-400 rounded-full transition-all duration-500" style={{ width: `${pct}%` }} />
|
||||
</div>
|
||||
{totalSteps > 0 && (
|
||||
<>
|
||||
<div className="h-1.5 bg-gray-700 rounded-full overflow-hidden">
|
||||
<div className="h-full bg-blue-400 rounded-full transition-all duration-500" style={{ width: `${pct}%` }} />
|
||||
</div>
|
||||
<div className="text-[10px] text-gray-500 mt-0.5 text-right">{pct}%</div>
|
||||
</>
|
||||
)}
|
||||
</>
|
||||
)}
|
||||
|
||||
@@ -68,6 +68,122 @@ function renderTex(tex: string, displayMode: boolean): string {
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Light markdown-to-HTML for raw question text that isn't already HTML.
|
||||
* Handles fenced code blocks, inline code, markdown tables, and newlines.
|
||||
*/
|
||||
function markdownToHtml(text: string): string {
|
||||
// Split into blocks to handle code fences and tables separately
|
||||
const blocks: string[] = [];
|
||||
let remaining = text;
|
||||
|
||||
// 1. Extract fenced code blocks first
|
||||
remaining = remaining.replace(/```(\w*)\n([\s\S]*?)```/g, (_m, lang: string, code: string) => {
|
||||
const escaped = code.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
|
||||
const placeholder = `\x00CODE${blocks.length}\x00`;
|
||||
blocks.push(`<pre class="bg-gray-900 text-green-300 rounded-lg p-3 text-xs overflow-x-auto my-2 font-mono"><code${lang ? ` class="language-${lang}"` : ""}>${escaped.trimEnd()}</code></pre>`);
|
||||
return placeholder;
|
||||
});
|
||||
|
||||
// 2. Convert markdown tables
|
||||
remaining = remaining.replace(
|
||||
/(?:^|\n)((?:\|[^\n]+\|\n)+\|[-| :]+\|\n(?:\|[^\n]+\|\n?)*)/g,
|
||||
(_m, table: string) => {
|
||||
const rows = table.trim().split("\n").filter((r) => r.trim());
|
||||
if (rows.length < 2) return _m;
|
||||
const parseRow = (row: string) =>
|
||||
row.split("|").slice(1, -1).map((c) => c.trim());
|
||||
const headers = parseRow(rows[0]);
|
||||
// rows[1] is the separator
|
||||
const bodyRows = rows.slice(2).map(parseRow);
|
||||
let html = '<table class="border-collapse text-xs my-2 w-full"><thead><tr>';
|
||||
for (const h of headers) {
|
||||
html += `<th class="border border-gray-300 bg-gray-100 px-2 py-1 text-left font-semibold">${h}</th>`;
|
||||
}
|
||||
html += "</tr></thead><tbody>";
|
||||
for (const row of bodyRows) {
|
||||
html += "<tr>";
|
||||
for (const cell of row) {
|
||||
html += `<td class="border border-gray-300 px-2 py-1">${cell}</td>`;
|
||||
}
|
||||
html += "</tr>";
|
||||
}
|
||||
html += "</tbody></table>";
|
||||
return html;
|
||||
}
|
||||
);
|
||||
|
||||
// 3. Inline code: `...` → <code>
|
||||
remaining = remaining.replace(/`([^`]+)`/g, '<code class="bg-gray-100 text-pink-600 px-1 py-0.5 rounded text-xs font-mono">$1</code>');
|
||||
|
||||
// 4. Bold: **...** (the __...__ form is not handled)
|
||||
remaining = remaining.replace(/\*\*([^*]+)\*\*/g, "<strong>$1</strong>");
|
||||
|
||||
// 5. Auto-wrap Unicode math symbols in $ if not already wrapped
|
||||
// Greek letters
|
||||
remaining = remaining.replace(/(?<!\$)([\u03B1-\u03C9\u0391-\u03A9])(?!\$)/g, (_, ch) => {
|
||||
const greekMap: Record<string, string> = {
|
||||
"α": "\\alpha", "β": "\\beta", "γ": "\\gamma", "δ": "\\delta",
|
||||
"ε": "\\epsilon", "ζ": "\\zeta", "η": "\\eta", "θ": "\\theta",
|
||||
"λ": "\\lambda", "μ": "\\mu", "ν": "\\nu", "π": "\\pi",
|
||||
"ρ": "\\rho", "σ": "\\sigma", "τ": "\\tau", "φ": "\\phi",
|
||||
"χ": "\\chi", "ψ": "\\psi", "ω": "\\omega",
|
||||
"Σ": "\\Sigma", "Π": "\\Pi", "Δ": "\\Delta", "Ω": "\\Omega",
|
||||
"Φ": "\\Phi", "Γ": "\\Gamma", "Λ": "\\Lambda", "Θ": "\\Theta",
|
||||
};
|
||||
return `$${greekMap[ch] || ch}$`;
|
||||
});
|
||||
// Unicode math operators: ≥ ≤ ≠ × ÷ ± ∈ ⊆ ∪ ∩ √ ∞
|
||||
const mathSymbols: [RegExp, string][] = [
|
||||
[/≥/g, "$\\geq$"], [/≤/g, "$\\leq$"], [/≠/g, "$\\neq$"],
|
||||
[/×/g, "$\\times$"], [/÷/g, "$\\div$"], [/±/g, "$\\pm$"],
|
||||
[/∈/g, "$\\in$"], [/∉/g, "$\\notin$"], [/⊆/g, "$\\subseteq$"],
|
||||
[/∪/g, "$\\cup$"], [/∩/g, "$\\cap$"], [/∅/g, "$\\emptyset$"],
|
||||
[/√/g, "$\\sqrt{}$"], [/∞/g, "$\\infty$"], [/∑/g, "$\\sum$"],
|
||||
[/∧/g, "$\\wedge$"], [/∨/g, "$\\vee$"],
|
||||
];
|
||||
for (const [re, repl] of mathSymbols) {
|
||||
remaining = remaining.replace(re, repl);
|
||||
}
|
||||
// Unicode superscripts/subscripts
|
||||
remaining = remaining.replace(/([⁰¹²³⁴⁵⁶⁷⁸⁹ⁿ⁻]+)/g, (_, sups) => {
|
||||
const supMap: Record<string, string> = {
|
||||
"⁰": "0", "¹": "1", "²": "2", "³": "3", "⁴": "4",
|
||||
"⁵": "5", "⁶": "6", "⁷": "7", "⁸": "8", "⁹": "9",
|
||||
"ⁿ": "n", "⁻": "-",
|
||||
};
|
||||
const converted = [...sups].map((c) => supMap[c] || c).join("");
|
||||
return `$^{${converted}}$`;
|
||||
});
|
||||
remaining = remaining.replace(/([₀₁₂₃₄₅₆₇₈₉]+)/g, (_, subs) => {
|
||||
const subMap: Record<string, string> = {
|
||||
"₀": "0", "₁": "1", "₂": "2", "₃": "3", "₄": "4",
|
||||
"₅": "5", "₆": "6", "₇": "7", "₈": "8", "₉": "9",
|
||||
};
|
||||
const converted = [...subs].map((c) => subMap[c] || c).join("");
|
||||
return `$_{${converted}}$`;
|
||||
});
|
||||
// Merge adjacent $...$ $...$ → $... ...$ (caveat: this also collapses a literal "$$" display-math delimiter)
|
||||
remaining = remaining.replace(/\$\s*\$/g, " ");
|
||||
|
||||
// 6. Newlines → <br>
|
||||
remaining = remaining.replace(/\n/g, "<br>");
|
||||
|
||||
// 7. Restore code blocks
|
||||
for (let i = 0; i < blocks.length; i++) {
|
||||
remaining = remaining.replace(`\x00CODE${i}\x00`, () => blocks[i]); // function replacer: "$&" or "$1" inside a code block is inserted literally
|
||||
}
|
||||
|
||||
return remaining;
|
||||
}
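
The placeholder round trip at the end of `markdownToHtml` can be illustrated in isolation. The following is a minimal sketch, not the commit's code: the helper names `protectCodeBlocks` and `restoreCodeBlocks` are hypothetical, and the extraction step is an assumption, since only the restore loop is visible in this hunk. It shows why a function replacer is the safer choice when the restored block may contain `$` sequences.

```typescript
// Hypothetical helpers for illustration; only the restore loop mirrors the diff.
function protectCodeBlocks(text: string): { masked: string; blocks: string[] } {
  const blocks: string[] = [];
  // Three backticks, any content (lazy), three backticks.
  const fenceRe = new RegExp("`{3}([\\s\\S]*?)`{3}", "g");
  const masked = text.replace(fenceRe, (_match, body: string) => {
    const idx = blocks.length;
    blocks.push(`<pre><code>${body}</code></pre>`);
    // NUL-delimited placeholder that later regex passes will not touch.
    return `\x00CODE${idx}\x00`;
  });
  return { masked, blocks };
}

function restoreCodeBlocks(masked: string, blocks: string[]): string {
  let out = masked;
  for (let i = 0; i < blocks.length; i++) {
    // A function replacer inserts the block verbatim. A plain string
    // replacement would interpret "$&", "$1", "$$" inside the block.
    out = out.replace(`\x00CODE${i}\x00`, () => blocks[i]);
  }
  return out;
}
```

With a string as the second argument, a code block containing `$&` would have that sequence expanded to the matched placeholder, which is the failure mode the function form avoids.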
|
||||
|
||||
/**
|
||||
* Detect if a string is already HTML (has tags) or is raw text.
|
||||
*/
|
||||
function isHtml(text: string): boolean {
|
||||
return /<[a-z][\s\S]*>/i.test(text);
|
||||
}
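
The pattern in `isHtml` accepts any `<` followed by a letter and a later `>`, so plain text containing inequalities can be classified as HTML and skip the markdown pass. The sketch below contrasts it with a stricter check. `looksLikeHtml` and the `KNOWN_TAGS` allowlist are hypothetical and not part of the commit; the tag list is an assumption about what the backend might emit.

```typescript
// Hypothetical allowlist; adjust to the tags the backend actually emits.
const KNOWN_TAGS = [
  "p", "br", "div", "span", "strong", "em", "code", "pre",
  "ul", "ol", "li", "table", "tr", "td", "th", "sub", "sup",
];
const KNOWN_TAG_RE = new RegExp(`<\\/?(?:${KNOWN_TAGS.join("|")})\\b[^>]*>`, "i");

// Stricter heuristic: only opening or closing tags from the allowlist count.
function looksLikeHtml(text: string): boolean {
  return KNOWN_TAG_RE.test(text);
}

// The broad pattern from the diff, reproduced for comparison.
function isHtmlBroad(text: string): boolean {
  return /<[a-z][\s\S]*>/i.test(text);
}
```

An allowlist still has false positives (text such as `a <p and q> b` would match), so this narrows the problem rather than removing it.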
|
||||
|
||||
export default function KaTeXRenderer({
|
||||
html,
|
||||
className,
|
||||
@@ -75,7 +191,10 @@ export default function KaTeXRenderer({
|
||||
html: string;
|
||||
className?: string;
|
||||
}) {
|
||||
const rendered = useMemo(() => renderLatexInString(html), [html]);
|
||||
const rendered = useMemo(() => {
|
||||
const processed = isHtml(html) ? html : markdownToHtml(html);
|
||||
return renderLatexInString(processed);
|
||||
}, [html]);
|
||||
|
||||
return (
|
||||
<div
|
||||
|
||||
@@ -90,8 +90,8 @@ export default function UploadForm() {
|
||||
fd.append("term", term);
|
||||
fd.append("exam_type", examType);
|
||||
|
||||
const result = await uploadPaper(fd);
|
||||
navigate(`/paper/${result.paper_id}`);
|
||||
await uploadPaper(fd);
|
||||
navigate("/");
|
||||
} catch (err) {
|
||||
setError(err instanceof Error ? err.message : "Upload failed");
|
||||
setSubmitting(false);
|
||||
|
||||
@@ -2,6 +2,7 @@ import { useEffect, useState } from "react";
|
||||
import { Link } from "react-router-dom";
|
||||
|
||||
import { getSimilarQuestions } from "@/lib/api";
|
||||
import KaTeXRenderer from "@/components/shared/KaTeXRenderer";
|
||||
import type { Question, SimilarQuestion } from "@/types/api";
|
||||
|
||||
const typeLabel: Record<string, string> = {
|
||||
@@ -21,7 +22,6 @@ function matchColor(percent: number): string {
|
||||
}
|
||||
|
||||
function cleanReason(reason: string): string {
|
||||
// "Shared topic: foo_bar, baz_qux" → "Shared topic: Foo Bar, Baz Qux"
|
||||
return reason.replace(/[_]/g, " ").replace(/:\s*(.+)$/, (_, rest) =>
|
||||
": " + rest.split(",").map((s: string) =>
|
||||
s.trim().replace(/\b\w/g, (c: string) => c.toUpperCase())
|
||||
@@ -29,6 +29,108 @@ function cleanReason(reason: string): string {
|
||||
);
|
||||
}
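
The hunk boundary hides part of `cleanReason`, so the transformation described in its comment is reproduced here as a self-contained sketch. The name `cleanReasonSketch` is hypothetical, and the `join(", ")` separator is an assumption inferred from the example in the comment, since that line is not visible in the diff.

```typescript
// Sketch of the reason formatter; the join separator is assumed.
function cleanReasonSketch(reason: string): string {
  return reason
    // Underscores in topic slugs become spaces.
    .replace(/_/g, " ")
    // Everything after the first colon is split on commas and title-cased.
    .replace(/:\s*(.+)$/, (_match, rest: string) =>
      ": " +
      rest
        .split(",")
        .map((s) => s.trim().replace(/\b\w/g, (c) => c.toUpperCase()))
        .join(", ")
    );
}
```

A reason without a colon only has its underscores replaced, because the second pattern does not match.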
|
||||
|
||||
function SimilarCard({ item }: { item: SimilarQuestion }) {
|
||||
const [expanded, setExpanded] = useState(false);
|
||||
|
||||
return (
|
||||
<div
|
||||
className={`rounded-lg border overflow-hidden transition-all duration-200 ${
|
||||
expanded ? "border-blue-300 bg-white shadow-sm" : "border-gray-100 hover:border-blue-200 hover:bg-blue-50/40"
|
||||
}`}
|
||||
>
|
||||
{/* Header — click to expand */}
|
||||
<button
|
||||
onClick={() => setExpanded((v) => !v)}
|
||||
className="w-full flex items-center gap-2 px-2.5 py-2 text-left"
|
||||
>
|
||||
{/* Match % badge */}
|
||||
<span className={`shrink-0 text-[11px] font-bold px-1.5 py-0.5 rounded ${matchColor(item.match_percent)}`}>
|
||||
{item.match_percent}%
|
||||
</span>
|
||||
|
||||
{/* Main info */}
|
||||
<div className="flex-1 min-w-0">
|
||||
<div className="flex items-center gap-1.5 flex-wrap">
|
||||
<span className="text-xs font-semibold text-gray-700">{item.source}</span>
|
||||
<span className="text-xs text-gray-400">·</span>
|
||||
<span className="text-xs text-gray-500">Q{item.question_number}</span>
|
||||
{item.question_type && (
|
||||
<>
|
||||
<span className="text-xs text-gray-400">·</span>
|
||||
<span className="text-xs text-gray-500">{typeLabel[item.question_type] ?? item.question_type}</span>
|
||||
</>
|
||||
)}
|
||||
</div>
|
||||
|
||||
{/* Topics + reasons */}
|
||||
<div className="flex gap-1 flex-wrap mt-1">
|
||||
{item.topics.slice(0, 2).map((topic) => (
|
||||
<span key={topic} className="text-[10px] px-1.5 py-0.5 rounded bg-gray-100 text-gray-500">
|
||||
{topic}
|
||||
</span>
|
||||
))}
|
||||
{item.match_reasons
|
||||
?.filter((r) => !r.startsWith("Same format") && !r.startsWith("Same difficulty"))
|
||||
.slice(0, 2)
|
||||
.map((reason) => (
|
||||
<span key={reason} className="text-[10px] px-1.5 py-0.5 rounded bg-blue-50 text-blue-500">
|
||||
{cleanReason(reason)}
|
||||
</span>
|
||||
))}
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<span
|
||||
className={`text-gray-300 text-xs shrink-0 transition-transform duration-200 ${
|
||||
expanded ? "rotate-90" : ""
|
||||
}`}
|
||||
>
|
||||
›
|
||||
</span>
|
||||
</button>
|
||||
|
||||
{/* Expanded preview */}
|
||||
{expanded && (
|
||||
<div className="px-3 pb-3 border-t border-gray-100">
|
||||
<div className="mt-2.5">
|
||||
<KaTeXRenderer html={item.question_text || ""} className="text-sm text-gray-700 leading-relaxed" />
|
||||
</div>
|
||||
|
||||
{/* All topics */}
|
||||
{item.topics.length > 0 && (
|
||||
<div className="flex gap-1 mt-2.5 flex-wrap">
|
||||
{item.topics.map((t) => (
|
||||
<span key={t} className="text-[10px] px-2 py-0.5 rounded-full bg-blue-50 text-blue-600 border border-blue-100">
|
||||
{t}
|
||||
</span>
|
||||
))}
|
||||
</div>
|
||||
)}
|
||||
|
||||
{/* Actions */}
|
||||
<div className="flex items-center gap-3 mt-2.5 pt-2.5 border-t border-gray-100">
|
||||
<Link
|
||||
to={`/paper/${item.paper_id}`}
|
||||
className="inline-flex items-center gap-1.5 text-xs font-semibold text-white bg-blue-600 hover:bg-blue-700 px-3 py-1.5 rounded-lg transition-colors"
|
||||
>
|
||||
<svg className="w-3.5 h-3.5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
|
||||
<path strokeLinecap="round" strokeLinejoin="round" d="M10 6H6a2 2 0 00-2 2v10a2 2 0 002 2h10a2 2 0 002-2v-4M14 4h6m0 0v6m0-6L10 14" />
|
||||
</svg>
|
||||
Open in Exam
|
||||
</Link>
|
||||
<button
|
||||
onClick={() => setExpanded(false)}
|
||||
className="text-xs text-gray-400 hover:text-gray-600 transition-colors"
|
||||
>
|
||||
Collapse
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
export default function SimilarHistoryPanel({ question }: { question: Question }) {
|
||||
const [items, setItems] = useState<SimilarQuestion[]>([]);
|
||||
const [loading, setLoading] = useState(true);
|
||||
@@ -78,50 +180,7 @@ export default function SimilarHistoryPanel({ question }: { question: Question }
|
||||
)}
|
||||
|
||||
{items.map((item) => (
|
||||
<Link
|
||||
key={item.id}
|
||||
to={`/paper/${item.paper_id}`}
|
||||
className="flex items-center gap-2 px-2.5 py-2 rounded-lg border border-gray-100 hover:border-blue-200 hover:bg-blue-50/40 transition-colors"
|
||||
>
|
||||
{/* Match % badge */}
|
||||
<span className={`shrink-0 text-[11px] font-bold px-1.5 py-0.5 rounded ${matchColor(item.match_percent)}`}>
|
||||
{item.match_percent}%
|
||||
</span>
|
||||
|
||||
{/* Main info */}
|
||||
<div className="flex-1 min-w-0">
|
||||
<div className="flex items-center gap-1.5 flex-wrap">
|
||||
<span className="text-xs font-semibold text-gray-700">{item.source}</span>
|
||||
<span className="text-xs text-gray-400">·</span>
|
||||
<span className="text-xs text-gray-500">Q{item.question_number}</span>
|
||||
{item.question_type && (
|
||||
<>
|
||||
<span className="text-xs text-gray-400">·</span>
|
||||
<span className="text-xs text-gray-500">{typeLabel[item.question_type] ?? item.question_type}</span>
|
||||
</>
|
||||
)}
|
||||
</div>
|
||||
|
||||
{/* Topics + reasons in one row */}
|
||||
<div className="flex gap-1 flex-wrap mt-1">
|
||||
{item.topics.slice(0, 2).map((topic) => (
|
||||
<span key={topic} className="text-[10px] px-1.5 py-0.5 rounded bg-gray-100 text-gray-500">
|
||||
{topic}
|
||||
</span>
|
||||
))}
|
||||
{item.match_reasons
|
||||
?.filter((r) => !r.startsWith("Same format") && !r.startsWith("Same difficulty"))
|
||||
.slice(0, 2)
|
||||
.map((reason) => (
|
||||
<span key={reason} className="text-[10px] px-1.5 py-0.5 rounded bg-blue-50 text-blue-500">
|
||||
{cleanReason(reason)}
|
||||
</span>
|
||||
))}
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<span className="text-gray-300 text-xs shrink-0">›</span>
|
||||
</Link>
|
||||
<SimilarCard key={item.id} item={item} />
|
||||
))}
|
||||
</div>
|
||||
)}
|
||||
|
||||
@@ -139,9 +139,11 @@ export default function VariantDetail({
|
||||
<KaTeXRenderer html={variant.ai_hint} />
|
||||
</CollapsibleSection>
|
||||
)}
|
||||
<CollapsibleSection title="Solution" colorScheme="green">
|
||||
<KaTeXRenderer html={variant.solution} />
|
||||
</CollapsibleSection>
|
||||
{variant.solution && (
|
||||
<CollapsibleSection title="Solution" colorScheme="green">
|
||||
<KaTeXRenderer html={variant.solution} />
|
||||
</CollapsibleSection>
|
||||
)}
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
|
||||
@@ -161,19 +161,21 @@ export default function VariantModal({
|
||||
)}
|
||||
</div>
|
||||
)}
|
||||
<div>
|
||||
<button
|
||||
onClick={() => setShowSolution(!showSolution)}
|
||||
className="text-sm text-green-600 hover:text-green-800 font-medium"
|
||||
>
|
||||
{showSolution ? "▾ Hide Solution" : "▸ Solution"}
|
||||
</button>
|
||||
{showSolution && (
|
||||
<div className="mt-2 bg-green-50 rounded-lg p-3 text-sm border border-green-200">
|
||||
<KaTeXRenderer html={variant.solution} />
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
{variant.solution && (
|
||||
<div>
|
||||
<button
|
||||
onClick={() => setShowSolution(!showSolution)}
|
||||
className="text-sm text-green-600 hover:text-green-800 font-medium"
|
||||
>
|
||||
{showSolution ? "▾ Hide Solution" : "▸ Solution"}
|
||||
</button>
|
||||
{showSolution && (
|
||||
<div className="mt-2 bg-green-50 rounded-lg p-3 text-sm border border-green-200">
|
||||
<KaTeXRenderer html={variant.solution} />
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
|
||||
<button
|
||||
|
||||
@@ -2,6 +2,7 @@ import { useEffect, useMemo, useState } from "react";
|
||||
import { Link, useNavigate, useParams } from "react-router-dom";
|
||||
|
||||
import Header from "@/components/layout/Header";
|
||||
import KaTeXRenderer from "@/components/shared/KaTeXRenderer";
|
||||
import { getCourseAnalytics, listCourses } from "@/lib/api";
|
||||
import type { CourseAnalytics, AnalyticsTopicQuestion } from "@/types/api";
|
||||
|
||||
@@ -421,6 +422,7 @@ function InteractiveChart({ topicData, typeData, diffData }: {
|
||||
|
||||
// ── Shared components ──
|
||||
function QuestionCard({ question: q }: { question: QItem }) {
|
||||
const [expanded, setExpanded] = useState(false);
|
||||
const typeColor = TYPE_COLORS[q.question_type] ?? "bg-gray-50 text-gray-600 border-gray-200";
|
||||
const cleanPreview = (q.preview || "")
|
||||
.replace(/^Problem\s+\d+\s*\[.*?\]\s*/i, "")
|
||||
@@ -430,35 +432,95 @@ function QuestionCard({ question: q }: { question: QItem }) {
|
||||
.replace(/\s+/g, " ")
|
||||
.trim();
|
||||
|
||||
const fullText = (q.full_text || q.preview || "")
|
||||
.replace(/^Problem\s+\d+\s*\[.*?\]\s*/i, "")
|
||||
.trim();
|
||||
|
||||
return (
|
||||
<Link to={`/paper/${q.paper_id}`}
|
||||
className="flex items-start gap-3 bg-gray-50 border border-gray-200 border-l-2 border-l-transparent rounded-xl px-3.5 py-2.5 hover:border-blue-300 hover:border-l-blue-500 hover:bg-white hover:shadow-md hover:-translate-y-0.5 transition-all duration-200 group">
|
||||
<span className="shrink-0 inline-flex items-center justify-center w-8 h-8 rounded-lg bg-blue-600 text-white text-xs font-bold mt-0.5">
|
||||
{q.question_number}
|
||||
</span>
|
||||
<div className="flex-1 min-w-0">
|
||||
<div className="flex items-center gap-1.5 mb-1 flex-wrap">
|
||||
<span className="text-xs font-medium text-blue-600">{q.source}</span>
|
||||
<span className="text-gray-300">·</span>
|
||||
<span className={`text-[10px] px-1.5 py-0.5 rounded border font-medium ${typeColor}`}>
|
||||
{typeLabel[q.question_type] ?? q.question_type}
|
||||
</span>
|
||||
{q.difficulty && (
|
||||
<>
|
||||
<span className="text-gray-300">·</span>
|
||||
<span className={`text-[10px] px-1.5 py-0.5 rounded border font-medium ${DIFF_COLORS[q.difficulty] ?? ""}`}>
|
||||
{q.difficulty}
|
||||
</span>
|
||||
</>
|
||||
<div
|
||||
className={`bg-gray-50 border border-gray-200 border-l-2 rounded-xl overflow-hidden transition-all duration-200 ${
|
||||
expanded ? "border-l-blue-500 border-blue-200 bg-white shadow-md" : "border-l-transparent hover:border-blue-300 hover:border-l-blue-500 hover:bg-white hover:shadow-md hover:-translate-y-0.5"
|
||||
}`}
|
||||
>
|
||||
{/* Header — click to expand */}
|
||||
<button
|
||||
onClick={() => setExpanded((v) => !v)}
|
||||
className="w-full flex items-start gap-3 px-3.5 py-2.5 text-left group"
|
||||
>
|
||||
<span className="shrink-0 inline-flex items-center justify-center w-8 h-8 rounded-lg bg-blue-600 text-white text-xs font-bold mt-0.5">
|
||||
{q.question_number}
|
||||
</span>
|
||||
<div className="flex-1 min-w-0">
|
||||
<div className="flex items-center gap-1.5 mb-1 flex-wrap">
|
||||
<span className="text-xs font-medium text-blue-600">{q.source}</span>
|
||||
<span className="text-gray-300">·</span>
|
||||
<span className={`text-[10px] px-1.5 py-0.5 rounded border font-medium ${typeColor}`}>
|
||||
{typeLabel[q.question_type] ?? q.question_type}
|
||||
</span>
|
||||
{q.difficulty && (
|
||||
<>
|
||||
<span className="text-gray-300">·</span>
|
||||
<span className={`text-[10px] px-1.5 py-0.5 rounded border font-medium ${DIFF_COLORS[q.difficulty] ?? ""}`}>
|
||||
{q.difficulty}
|
||||
</span>
|
||||
</>
|
||||
)}
|
||||
{q.topics?.slice(0, 2).map((t) => (
|
||||
<span key={t} className="text-[10px] px-1.5 py-0.5 rounded bg-gray-100 text-gray-500 border border-gray-200">{t}</span>
|
||||
))}
|
||||
</div>
|
||||
{!expanded && (
|
||||
<p className="text-xs text-gray-600 line-clamp-2 leading-relaxed">{cleanPreview || q.preview}</p>
|
||||
)}
|
||||
{q.topics?.slice(0, 2).map((t) => (
|
||||
<span key={t} className="text-[10px] px-1.5 py-0.5 rounded bg-gray-100 text-gray-500 border border-gray-200">{t}</span>
|
||||
))}
|
||||
</div>
|
||||
<p className="text-xs text-gray-600 line-clamp-2 leading-relaxed">{cleanPreview || q.preview}</p>
|
||||
</div>
|
||||
<span className="shrink-0 text-gray-300 group-hover:text-blue-500 text-sm pt-1">→</span>
|
||||
</Link>
|
||||
<span
|
||||
className={`shrink-0 text-gray-300 group-hover:text-blue-500 text-sm pt-1 transition-transform duration-200 ${
|
||||
expanded ? "rotate-90" : ""
|
||||
}`}
|
||||
>
|
||||
→
|
||||
</span>
|
||||
</button>
|
||||
|
||||
{/* Expanded preview */}
|
||||
{expanded && (
|
||||
<div className="px-3.5 pb-3 animate-[fadeIn_0.2s_ease-out]">
|
||||
<div className="ml-11 border-t border-gray-100 pt-3">
|
||||
<KaTeXRenderer html={fullText} className="text-sm text-gray-700 leading-relaxed" />
|
||||
|
||||
{/* All topics */}
|
||||
{q.topics && q.topics.length > 0 && (
|
||||
<div className="flex gap-1 mt-3 flex-wrap">
|
||||
{q.topics.map((t) => (
|
||||
<span key={t} className="text-[10px] px-2 py-0.5 rounded-full bg-blue-50 text-blue-600 border border-blue-100">
|
||||
{t}
|
||||
</span>
|
||||
))}
|
||||
</div>
|
||||
)}
|
||||
|
||||
{/* Actions */}
|
||||
<div className="flex items-center gap-3 mt-3 pt-3 border-t border-gray-100">
|
||||
<Link
|
||||
to={`/paper/${q.paper_id}`}
|
||||
className="inline-flex items-center gap-1.5 text-xs font-semibold text-white bg-blue-600 hover:bg-blue-700 px-3.5 py-1.5 rounded-lg transition-colors"
|
||||
>
|
||||
<svg className="w-3.5 h-3.5" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
|
||||
<path strokeLinecap="round" strokeLinejoin="round" d="M10 6H6a2 2 0 00-2 2v10a2 2 0 002 2h10a2 2 0 002-2v-4M14 4h6m0 0v6m0-6L10 14" />
|
||||
</svg>
|
||||
Open in Exam
|
||||
</Link>
|
||||
<button
|
||||
onClick={() => setExpanded(false)}
|
||||
className="text-xs text-gray-400 hover:text-gray-600 transition-colors"
|
||||
>
|
||||
Collapse
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
|
||||
@@ -172,7 +172,9 @@ export default function ErrorBookPage() {
|
||||
<span className="text-yellow-400">★</span>
|
||||
<div className="flex-1 min-w-0">
|
||||
<span className="text-sm font-medium text-gray-700">Variant of Q{v.source_question_number}</span>
|
||||
<p className="text-xs text-gray-500 truncate">{v.variant_data.question_text?.replace(/<[^>]*>/g, "").slice(0, 100)}</p>
|
||||
<div className="text-xs text-gray-500 line-clamp-2">
|
||||
<KaTeXRenderer html={v.variant_data.question_text || ""} className="text-xs" />
|
||||
</div>
|
||||
</div>
|
||||
<button onClick={() => void handleUnfavoriteVariant(v.id)} className="text-xs text-gray-400 hover:text-red-500">Remove</button>
|
||||
</div>
|
||||
@@ -264,8 +266,10 @@ function ErrorCard({ entry, onMastered, onRemove }: { entry: UserAttempt; onMast
|
||||
)}
|
||||
</div>
|
||||
|
||||
{/* Question preview */}
|
||||
<p className="text-sm text-gray-600 mt-3 line-clamp-2">{preview}</p>
|
||||
{/* Question text */}
|
||||
<div className="mt-3">
|
||||
<KaTeXRenderer html={question.question_text || ""} className="text-sm text-gray-600 leading-relaxed" />
|
||||
</div>
|
||||
|
||||
{/* Topics */}
|
||||
{question.topics && question.topics.length > 0 && (
|
||||
|
||||
@@ -461,9 +461,9 @@ export default function WorkbenchPage() {
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
<p className="text-xs text-gray-600 line-clamp-2 mb-3">
|
||||
{v.variant_data.question_text?.replace(/<[^>]*>/g, "").slice(0, 140)}
|
||||
</p>
|
||||
<div className="text-xs text-gray-600 line-clamp-3 mb-3">
|
||||
<KaTeXRenderer html={v.variant_data.question_text || ""} className="text-xs" />
|
||||
</div>
|
||||
<button
|
||||
onClick={() => setActiveVariantId(v.id)}
|
||||
className="px-3 py-1.5 bg-blue-600 text-white text-xs font-medium rounded-lg hover:bg-blue-700"
|
||||
|
||||
@@ -130,6 +130,7 @@ export interface AnalyticsTopicQuestion {
|
||||
source: string;
|
||||
question_number: string;
|
||||
preview: string;
|
||||
full_text?: string;
|
||||
difficulty: string | null;
|
||||
question_type: string;
|
||||
year?: number | null;
|
||||
|
||||