Initial commit: PastPaper Master full stack
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
12
.gitignore
vendored
Normal file
@@ -0,0 +1,12 @@
.env
.env.*
node_modules/
__pycache__/
*.pyc
.DS_Store
dist/
.claude/
.venv/
pastpaper-scraper/
pastpaper/
*.pdf
BIN
080c1b16be5aa2e1ea87d5175894fb3c.jpg
Normal file
Binary file not shown. Size: 103 KiB
328
HANDOFF_COMP2211.md
Normal file
@@ -0,0 +1,328 @@
# COMP2211 Handoff

## Current Status

`COMP2211` course-library papers are now fully loaded into Supabase and normalized to subquestion-level granularity.

Canonical papers currently in DB:

- `COMP2211-2022-fall-midterm`
- `COMP2211-2022-spring-midterm`
- `COMP2211-2022-spring-final-part-a`
- `COMP2211-2022-spring-final-part-b`
- `COMP2211-2023-spring-midterm`
- `COMP2211-2024-spring-midterm`
- `COMP2211-2024-spring-final`

All seven papers are:

- `status = ready`
- split to subquestion level
- tagged with `analytics_topic`, `topic_primary`, `topic_tags`, `skill_tags`

Question counts:

- 2022 fall midterm: `43`
- 2022 spring midterm: `38`
- 2022 spring final part A: `24`
- 2022 spring final part B: `19`
- 2023 spring midterm: `36`
- 2024 spring midterm: `42`
- 2024 spring final: `48`

## Key Files

Schema / SQL:

- [001_init_schema.sql](/Users/soda/Desktop/PastPaper%20Master/supabase/migrations/001_init_schema.sql)
- [002_course_library_fields.sql](/Users/soda/Desktop/PastPaper%20Master/supabase/migrations/002_course_library_fields.sql)
- [003_question_taxonomy_fields.sql](/Users/soda/Desktop/PastPaper%20Master/supabase/migrations/003_question_taxonomy_fields.sql)
- [004_decouple_course_library_from_auth.sql](/Users/soda/Desktop/PastPaper%20Master/supabase/migrations/004_decouple_course_library_from_auth.sql)
- [005_allow_long_question_format_alias.sql](/Users/soda/Desktop/PastPaper%20Master/supabase/migrations/005_allow_long_question_format_alias.sql)
- [006_make_scores_numeric.sql](/Users/soda/Desktop/PastPaper%20Master/supabase/migrations/006_make_scores_numeric.sql)

Course-library seeds:

- [comp2211_course_library_papers.sql](/Users/soda/Desktop/PastPaper%20Master/supabase/seeds/comp2211_course_library_papers.sql)
- [comp2211_problem_taxonomy_backfill.sql](/Users/soda/Desktop/PastPaper%20Master/supabase/seeds/comp2211_problem_taxonomy_backfill.sql)
- [comp2211_problem_level_questions.sql](/Users/soda/Desktop/PastPaper%20Master/supabase/seeds/comp2211_problem_level_questions.sql)

Manual splitters used for final subquestion rebuild:

- [split_comp2211_2022_spring_midterm.py](/Users/soda/Desktop/PastPaper%20Master/backend/split_comp2211_2022_spring_midterm.py)
- [split_comp2211_2022_spring_final_part_a.py](/Users/soda/Desktop/PastPaper%20Master/backend/split_comp2211_2022_spring_final_part_a.py)
- [split_comp2211_2022_spring_final_part_b.py](/Users/soda/Desktop/PastPaper%20Master/backend/split_comp2211_2022_spring_final_part_b.py)
- [split_comp2211_2023_spring_midterm.py](/Users/soda/Desktop/PastPaper%20Master/backend/split_comp2211_2023_spring_midterm.py)
- [split_comp2211_2024_spring_midterm.py](/Users/soda/Desktop/PastPaper%20Master/backend/split_comp2211_2024_spring_midterm.py)
- [split_comp2211_2024_spring_final.py](/Users/soda/Desktop/PastPaper%20Master/backend/split_comp2211_2024_spring_final.py)

Deprecated filler script:

- [fill_manual_study_aids.py](/Users/soda/Desktop/PastPaper%20Master/backend/fill_manual_study_aids.py)

Audit / taxonomy references:

- [COMP2211.json](/Users/soda/Desktop/PastPaper%20Master/pastpaper-scraper/manifests/COMP2211.json)
- [COMP2211_taxonomy.json](/Users/soda/Desktop/PastPaper%20Master/pastpaper-scraper/manifests/COMP2211_taxonomy.json)
- [summary.json](/Users/soda/Desktop/PastPaper%20Master/pastpaper-scraper/reviews/COMP2211/summary.json)
- [problem_topics.json](/Users/soda/Desktop/PastPaper%20Master/pastpaper-scraper/reviews/COMP2211/problem_topics.json)
- [problem_seed.json](/Users/soda/Desktop/PastPaper%20Master/pastpaper-scraper/reviews/COMP2211/problem_seed.json)

Frontend / backend areas already adapted to real taxonomy:

- [frontend/src/pages/HomePage.tsx](/Users/soda/Desktop/PastPaper%20Master/frontend/src/pages/HomePage.tsx)
- [frontend/src/pages/AnalyticsPage.tsx](/Users/soda/Desktop/PastPaper%20Master/frontend/src/pages/ErrorBookPage.tsx)
- [frontend/src/components/workbench/SimilarHistoryPanel.tsx](/Users/soda/Desktop/PastPaper%20Master/frontend/src/components/workbench/SimilarHistoryPanel.tsx)
- [backend/app/routers/analytics.py](/Users/soda/Desktop/PastPaper%20Master/backend/app/routers/analytics.py)
- [backend/app/routers/questions.py](/Users/soda/Desktop/PastPaper%20Master/backend/app/routers/questions.py)
- [backend/app/routers/attempts.py](/Users/soda/Desktop/PastPaper%20Master/backend/app/routers/attempts.py)

## Important Product / Data Decisions Already Made

### Course library vs user upload

This is now separated semantically inside `papers`:

- `source_kind = 'course_library'` for platform-owned papers
- `source_kind = 'user_upload'` for user-contributed papers

Course-library papers no longer require `user_id`.

### Taxonomy model

`question_type` is not the main analytics dimension.

Current intended usage:

- `question_type` / `question_format`: rendering and answer interaction
- `analytics_topic`: normalized analytics bucket
- `topic_tags`: multi-tag topical indexing
- `skill_tags`: finer-grained retrieval / grading / similarity support

### Score field

Scores are `NUMERIC`, not integer, because many subquestions use fractional marks like `1.5`.
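
As a small illustration of why an integer column would lose information, the sketch below uses Python's `decimal.Decimal` as a stand-in for a PostgreSQL `NUMERIC` value. The mark values are made up for the example and do not come from any real paper.

```python
from decimal import Decimal

# Hypothetical subquestion marks; only the fractional style (e.g. 1.5) is from the handoff.
marks = [Decimal("1.5"), Decimal("2"), Decimal("0.5"), Decimal("1.5")]

# Exact total, as a NUMERIC column would preserve it.
exact_total = sum(marks, Decimal("0"))

# What the total would look like if each mark were truncated to an integer first.
truncated_total = sum(int(m) for m in marks)

print(exact_total, truncated_total)
```

Truncating each mark before summing silently drops the fractional parts, which is the failure mode the `NUMERIC` migration avoids.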

## Known Issues

### 1. Similar question retrieval is still not truly production-ready

Current state:

- backend route exists
- frontend panel exists
- demo fallback still exists in the UI when retrieval returns empty / fails

What needs to be done:

- remove demo fallback behavior once real retrieval is stable
- improve ranking beyond current basic topic/type matching
- ideally add indexed text retrieval, then embeddings if needed

Recommended order:

1. build deterministic same-course retrieval first
2. rank by `analytics_topic`, `topic_tags`, `skill_tags`, `question_format`, text similarity
3. only then consider vector search

### 2. Analytics is real, but still not the final version

Current state:

- analytics already reads real DB data
- taxonomy fields are being used

Still missing:

- better topic normalization for edge cases
- per-paper and per-subtopic drill-down
- cleaner stats for mixed-format questions
- confidence around aggregated counts across all courses, not only `COMP2211`

### 3. LaTeX / math rendering is still fragile

Known symptoms:

- OCR / extracted math strings are noisy
- some generated HTML contains malformed or hard-to-read math fragments
- not all backend feedback is rendered with the same quality

What needs work:

- normalize math strings before rendering
- improve KaTeX preprocessing
- avoid dumping broken extracted formulas directly into UI
- ensure solution / feedback content is consistently rendered through the same component path

### 4. Presentation quality is still uneven

Data is now real, but UI still needs polish:

- question nav is still too weak for long real papers
- status / difficulty / topic chips can be clearer
- workbench hierarchy is inconsistent across question types
- some pages still read like an internal demo rather than a finished study product

### 5. User upload flow still lacks dedup / library filtering

This is the next big backend product task.

Desired logic:

- when user uploads a paper, compare against existing course-library papers
- if it is already covered, do not create a duplicate paper
- if it is new, ingest it as `user_upload`
- if high quality and non-duplicate, optionally promote into library workflow later

### 6. Most non-Spring-2024 study aids are contaminated by template filler content

Current state:

- `COMP2211-2022-fall-midterm` has question-level LLM-authored study aids
- `COMP2211-2024-spring-midterm` is the intended quality bar
- the remaining papers were backfilled with a deprecated template script and should not be treated as production-quality AI content

Impact:

- `knowledge_reminder` is often generic topic boilerplate
- `ai_hint` often points to a parent problem header instead of the actual subquestion
- `solution` is often just wrapped reference text, not a true worked solution

Required action:

1. detect and clear templated study aids from affected papers
2. regenerate them through the real LLM path in [paper_processor.py](/Users/soda/Desktop/PastPaper%20Master/backend/app/services/paper_processor.py)
3. review output quality before marking the papers as complete

## Next Major Workstreams

### A. Real similar-question retrieval

Goal:

- no demo fallback
- same-course retrieval that feels trustworthy

Suggested implementation:

1. add a richer retrieval score in [questions.py](/Users/soda/Desktop/PastPaper%20Master/backend/app/routers/questions.py)
2. use:
   - same `course_code`
   - same `analytics_topic`
   - overlapping `topic_tags`
   - overlapping `skill_tags`
   - same or compatible `question_format`
   - lexical similarity on `question_text`
3. expose match reasons in response if useful
4. update UI to show why a question was retrieved
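
The scoring signals listed above could be combined roughly as follows. This is only a sketch under stated assumptions: the `Question` dataclass, the `score_match` function, the weights, and the reason strings are all hypothetical and are not taken from `questions.py`; lexical similarity is approximated with a simple word-set Jaccard overlap.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Question:
    # Hypothetical in-memory shape; field names mirror the columns named in the handoff.
    course_code: str
    analytics_topic: str
    question_format: str
    question_text: str
    topic_tags: Set[str] = field(default_factory=set)
    skill_tags: Set[str] = field(default_factory=set)


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score_match(source: Question, candidate: Question) -> Tuple[float, List[str]]:
    """Return (score, match reasons). Weights are illustrative assumptions."""
    if candidate.course_code != source.course_code:
        return 0.0, []  # same-course retrieval only
    score = 0.0
    reasons: List[str] = []
    if candidate.analytics_topic == source.analytics_topic:
        score += 3.0
        reasons.append("same analytics_topic")
    for name, weight in (("topic_tags", 2.0), ("skill_tags", 1.5)):
        overlap = jaccard(getattr(source, name), getattr(candidate, name))
        if overlap > 0:
            score += weight * overlap
            reasons.append(f"overlapping {name}")
    if candidate.question_format == source.question_format:
        score += 1.0
        reasons.append("same question_format")
    lexical = jaccard(
        set(source.question_text.lower().split()),
        set(candidate.question_text.lower().split()),
    )
    if lexical > 0:
        score += lexical
        reasons.append("similar question_text")
    return score, reasons
```

Returning the reasons alongside the score keeps the ranking deterministic and gives the UI something concrete to display for each retrieved question.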

Potential DB improvement:

- add `search_text` / `tsvector` on `paper_questions`
- later optionally add `embedding`

### B. Real paper / topic statistics

Goal:

- analytics should be fully trustworthy at subquestion level

Suggested improvements:

- topic frequency by `analytics_topic`
- question-format distribution by subquestion, not by top-level problem
- per-paper breakdown
- high-yield topic trend across years
- topic-to-question index page for drill mode
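
A minimal sketch of the first three aggregations, assuming subquestion rows are available as plain dictionaries. The `paper_slug` key and both function names are hypothetical; only `analytics_topic` and `question_format` are field names from this document.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Mapping


def topic_frequency(questions: Iterable[Mapping[str, str]]) -> Counter:
    """Count subquestions per analytics_topic."""
    return Counter(q["analytics_topic"] for q in questions)


def format_distribution_by_paper(
    questions: Iterable[Mapping[str, str]],
) -> Dict[str, Counter]:
    """Count subquestions per question_format, grouped by paper."""
    per_paper: Dict[str, Counter] = defaultdict(Counter)
    for q in questions:
        # Each row is one subquestion, so counts are at subquestion level.
        per_paper[q["paper_slug"]][q["question_format"]] += 1
    return dict(per_paper)
```

Because every input row is a subquestion, both counts stay at subquestion level rather than top-level problem level.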

### C. LaTeX and content rendering cleanup

Goal:

- all math-heavy content should render legibly

Suggested work:

- centralize HTML + KaTeX normalization
- strip broken OCR artifacts before render
- make study-aid content generation avoid malformed formula formatting
- ensure grading feedback and solutions share the same renderer pipeline

### D. User upload deduplication and library filtering

Goal:

- new uploads should not pollute the DB with duplicates

Suggested logic:

1. normalize upload metadata
2. compare against existing papers in same course:
   - year / term / exam_type / part_label
   - title similarity
   - extracted first-page markers
   - optional text fingerprint
3. if duplicate:
   - attach to existing paper or reject with explanation
4. if not duplicate:
   - create `user_upload`
   - process normally
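
The duplicate check in steps 2 to 4 might look something like the sketch below. It is an illustration under assumptions, not the project's implementation: `text_fingerprint`, `find_duplicate`, and the `fingerprint` dictionary key are hypothetical names, and title similarity and first-page markers are left out for brevity.

```python
import hashlib
import re
from typing import Dict, List, Optional

# Metadata fields compared in step 2 (names taken from the list above).
META_KEYS = ("course_code", "year", "term", "exam_type", "part_label")


def text_fingerprint(text: str) -> str:
    """SHA-256 of the text after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicate(upload: Dict, existing: List[Dict]) -> Optional[Dict]:
    """Return the first same-course paper that matches on metadata or fingerprint."""
    for paper in existing:
        if paper.get("course_code") != upload.get("course_code"):
            continue
        same_meta = all(paper.get(k) == upload.get(k) for k in META_KEYS)
        same_text = (
            paper.get("fingerprint") is not None
            and paper.get("fingerprint") == upload.get("fingerprint")
        )
        if same_meta or same_text:
            return paper
    return None
```

A caller would attach to or reject against the returned paper when one is found, and otherwise create the `user_upload` record and process it normally.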

Likely schema additions later:

- content fingerprint field on `papers`
- upload provenance fields
- moderation / promotion state for community uploads

### E. UI / UX pass

Priority items:

- stronger question navigation for real papers
- clearer ready / processing / failed states
- better paper list and filtering UX
- richer workbench metadata:
  - topic
  - difficulty
  - format
  - score
  - answered / wrong / mastered state
- unify visual style across analytics, error book, workbench

## Suggested Development Order

1. Remove similar-question demo fallback and ship real retrieval
2. Improve analytics and topic drill views using subquestion-level data
3. Fix LaTeX / rendering quality
4. Build upload dedup / filtering against existing library papers
5. Do a focused UI / UX pass after the real data flows are stable

## Operational Notes

### Frontend entry issue that was fixed

Homepage was previously still using mock papers and an old hardcoded `COMP2211` id.
It now reads real papers from `listPapers()`.

### Manual content generation

The current `COMP2211` three-piece study aids were filled manually through local scripts and deterministic templates, not through external LLM batch processing. This is deliberate and keeps the current dataset stable.

### If rebuilding papers again

For `COMP2211`, use the manual splitters rather than rerunning generic extraction blindly. `2024-spring-midterm` especially required reconstruction from PDF page spans because the earlier top-level extraction had already truncated `Problem 5` and `Problem 7`.

## Ready-to-Verify Checklist

If you want to sanity-check the current product quickly:

1. Open home page and filter `COMP2211`
2. Open each paper and confirm `status = ready`
3. Check question count matches:
   - `43 / 38 / 24 / 19 / 36 / 42 / 48`
4. Open analytics page for `COMP2211`
5. Open several papers and verify:
   - question nav loads
   - AI trio exists
   - topics render
   - similar-question panel does not block the page
516
TECHNICAL.md
Normal file
@@ -0,0 +1,516 @@
# PastPaper Master: Technical Documentation

## System Architecture Overview

```
┌─────────────────────────────────────────────────────────────────┐
│ Frontend (React 19 + Vite 7)                                    │
│ Pages: Home / Upload / Workbench / ErrorBook                    │
│ PDF: react-pdf v10 | Math: KaTeX 0.16 | Style: Tailwind v4      │
└────────────────────────────┬────────────────────────────────────┘
                             │ /api (Vite proxy → :8000)
┌────────────────────────────▼────────────────────────────────────┐
│ Backend (FastAPI + Python)                                      │
│ Routers: papers / attempts / questions                          │
│ Services: paper_processor / grader /                            │
│           llm_clients / text_extractor                          │
└────────┬───────────────────┬──────────────────┬─────────────────┘
         │                   │                  │
   ┌─────▼─────┐    ┌────────▼───────┐  ┌───────▼──────┐
   │ Supabase  │    │ GPT-4o         │  │ Qwen-plus    │
   │ PostgreSQL│    │ (laozhang API) │  │ (DashScope)  │
   │ + Storage │    │ structuring,   │  │ AI trio,     │
   │           │    │ OCR, variants  │  │ grading      │
   └───────────┘    └────────────────┘  └──────────────┘
```

**Tech stack at a glance:**
- **Frontend**: React 19, TypeScript, Vite 7, Tailwind CSS v4, react-pdf v10, KaTeX 0.16
- **Backend**: FastAPI, Python 3.12, uv (package management)
- **Database**: Supabase (PostgreSQL + Row Level Security)
- **Storage**: Supabase Storage (buckets: `papers`, `attempt-photos`)
- **LLM**: GPT-4o (via the laozhang API proxy), Qwen-plus (Alibaba DashScope)

---

## Database Schema

> File: `supabase/migrations/001_init_schema.sql`

### Table: `papers` (exam papers)

| Field | Type | Description |
|------|------|------|
| id | UUID PK | Auto-generated |
| user_id | UUID FK → auth.users | Uploader |
| course_code | TEXT | Course code, e.g. "COMP2011" |
| year / term / exam_type | INT/TEXT/TEXT | Metadata |
| paper_file_url | TEXT | Paper PDF (Supabase Storage) |
| answer_file_url | TEXT? | Answer PDF (optional) |
| status | TEXT | `uploaded` → `processing` → `ready` / `error` |
| paper_extracted_text | TEXT | Raw text extracted by PyMuPDF (cached) |
| total_score / question_count | INT | Paper-level overview extracted by AI |
| topics_summary | JSONB | `{"Linked List": 40, "Recursion": 30}` |
| difficulty_level | TEXT | easy / medium / hard |

### Table: `paper_questions` (per-question data)

| Field | Type | Description |
|------|------|------|
| id | UUID PK | |
| paper_id | UUID FK → papers | |
| question_number | TEXT | "1", "1a", "2b" |
| parent_question | TEXT? | Parent question number for subquestions: "1a" → "1" |
| display_order | INT | Sort order |
| question_type | TEXT | `mc` / `true_false` / `fill_blank` / `long_question` |
| question_text | TEXT | Original question text |
| score / page_number | INT | Marks, PDF page number (used for PDF-question linking) |
| options | JSONB | MC options: `[{"label":"A","text":"..."}]` |
| correct_option | TEXT | Correct option for MC |
| correct_answer | TEXT | Correct answer for fill-in-the-blank |
| raw_answer_text | TEXT | Raw solution text from the answer PDF |
| topics | TEXT[] | Topic tags |
| difficulty | TEXT | easy / medium / hard |
| knowledge_reminder | TEXT | AI knowledge reminder (HTML+KaTeX) |
| ai_hint | TEXT | AI approach hint (HTML+KaTeX) |
| solution | TEXT | AI full worked solution (HTML+KaTeX) |

### Table: `user_attempts` (user attempt records)

| Field | Type | Description |
|------|------|------|
| id | UUID PK | |
| user_id / question_id | UUID FK | |
| attempt_type | TEXT | `select` / `input` / `photo` |
| user_answer | TEXT | The user's selected option or typed input |
| photo_url / photo_ocr_text | TEXT | Uploaded photo and its OCR result |
| is_correct | BOOL | Judged by AI |
| feedback | TEXT | Step-by-step error analysis (HTML) |
| error_at_step | INT | Step at which the error occurred |
| in_error_book / mastered | BOOL | Error-book state |

---

## Core Feature 1: Paper Analysis Pipeline

### Flow overview

```
User uploads PDF → BackgroundTask → 5-step pipeline → status becomes ready
```

### Files

| File | Purpose |
|------|------|
| `backend/app/routers/papers.py` | Upload endpoint, triggers background processing |
| `backend/app/services/paper_processor.py` | **Core pipeline**, 5-step processing logic |
| `backend/app/services/text_extractor.py` | PDF → text extraction (PyMuPDF) |
| `backend/app/services/llm_clients.py` | GPT-4o / Qwen client singletons |

### The 5 pipeline steps (`paper_processor.py: process_paper()`)

**Step 1: PDF text extraction**
- Extract text page by page with PyMuPDF (`fitz`)
- If a page yields fewer than 50 characters of text (likely a scan), additionally save that page as a base64 image for fallback use
- Cache the extracted text in `papers.paper_extracted_text`

```python
# text_extractor.py (call signatures, shown as comments)
# extract_pdf(file_bytes) -> ExtractedContent(pages_text, page_images, total_pages, has_images)
# get_full_text(extracted) -> "--- Page 1 ---\n{text}\n\n--- Page 2 ---\n..."
```
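
To make the Step 1 rules concrete, here is a minimal sketch that works on already-extracted page strings, so it does not depend on PyMuPDF. The function names `low_text_pages` and `join_pages` are hypothetical, and stripping whitespace before measuring length is an assumption; only the 50-character threshold and the `--- Page N ---` marker format come from this document.

```python
from typing import List

MIN_TEXT_CHARS = 50  # threshold quoted in Step 1


def low_text_pages(pages_text: List[str], min_chars: int = MIN_TEXT_CHARS) -> List[int]:
    """Return 1-based page numbers whose text is too short (likely scanned pages)."""
    return [
        number
        for number, text in enumerate(pages_text, start=1)
        if len(text.strip()) < min_chars  # assumption: whitespace is ignored
    ]


def join_pages(pages_text: List[str]) -> str:
    """Join page texts using the '--- Page N ---' markers shown above."""
    return "\n\n".join(
        f"--- Page {number} ---\n{text}"
        for number, text in enumerate(pages_text, start=1)
    )
```

Pages reported by `low_text_pages` would be the ones worth rendering to images as a fallback for the later LLM steps.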

**Step 2: Structured question splitting with GPT-4o**
- Model: `gpt-4o`, temperature=0, response_format=json_object
- Input: full paper text
- Output: JSON containing total_score, difficulty_level, topics_summary, questions[]
- Extracted per question: question_number, parent_question, question_type, question_text, score, page_number, options, topics, difficulty
- Updates the overview fields on the `papers` table (total_score, question_count, topics_summary, difficulty_level)

**Step 3: Answer matching (if an answer PDF is provided)**
- Model: `gpt-4o`, temperature=0
- Input: question-structure JSON + answer text
- Output: per-question matches: correct_option / correct_answer / raw_answer_text
- MC → correct_option, fill-in-the-blank → correct_answer, long question → raw_answer_text

**Step 4: Qwen generates the AI trio (per question)**
- Model: `qwen-plus`, temperature=0.3
- One call per question; the input is the question info + reference answer
- Outputs a three-part JSON:
  - `knowledge_reminder`: prerequisite knowledge points (HTML+KaTeX)
  - `ai_hint`: guidance on the approach without giving the answer (HTML+KaTeX)
  - `solution`: complete step-by-step worked solution (HTML+KaTeX)
- Written to the `paper_questions` table

**Step 5: Mark as complete**
- `papers.status` is updated to `ready`
- If any step raises an exception, status is set to `error` and the error message is written to `error_message`
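
The status handling described in Step 5 can be sketched as a small runner. This is an illustration only: `run_pipeline` and the dictionary-based paper record are hypothetical stand-ins for `process_paper()` and the `papers` row, and the real function presumably writes to Supabase instead of mutating a dict.

```python
from typing import Callable, Dict, List

Step = Callable[[Dict[str, object]], None]


def run_pipeline(paper: Dict[str, object], steps: List[Step]) -> Dict[str, object]:
    """Run the steps in order and record the resulting status on the paper."""
    paper["status"] = "processing"
    try:
        for step in steps:
            step(paper)
    except Exception as exc:  # any failing step aborts the whole run
        paper["status"] = "error"
        paper["error_message"] = str(exc)
    else:
        paper["status"] = "ready"
    return paper
```

Using `try / except / else` keeps the `ready` transition on the success path only, so a paper can never be marked ready after a partially failed run.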

### Key prompt design

**STRUCTURE_PROMPT**: structured question splitting
- Restricts question_type to mc / true_false / fill_blank / long_question
- True/False questions use the `true_false` type, with options `[{label:"True",text:"True"},{label:"False",text:"False"}]`
- MC questions must include the extracted options array
- Subquestions are linked through parent_question (e.g. the parent of "1a" is "1")
- Requires the model to infer page_number, topics, difficulty

**ANSWER_MATCH_PROMPT**: answer matching
- Input contains questions_json (question number + type) and answer_text
- Output fields vary by question type: MC → correct_option, fill → correct_answer, long question → raw_answer_text

**ANALYSIS_PROMPT**: AI trio
- The solution must show the full process (Step 1, 2, 3...), not just the answer
- For MC, explain why the correct option is right and why the other options are wrong
- Mark common mistakes: `<div class="common-error">...</div>`
- KaTeX rules: block-level `$$...$$`, inline `$...$`
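
As an illustration of the delimiter rules, the sketch below splits a string into text, inline-math, and block-math segments. It is a hypothetical helper (`split_math` is not a function from this codebase) and assumes that inline math never spans a line break and that dollar signs are not escaped.

```python
import re
from typing import List, Tuple

# Block math ($$...$$) is tried first so it is not mistaken for two inline spans.
_MATH = re.compile(r"\$\$.+?\$\$|\$[^$\n]+?\$", re.DOTALL)


def split_math(content: str) -> List[Tuple[str, str]]:
    """Split content into ("text" | "inline" | "block", value) segments."""
    segments: List[Tuple[str, str]] = []
    pos = 0
    for match in _MATH.finditer(content):
        if match.start() > pos:
            segments.append(("text", content[pos:match.start()]))
        token = match.group(0)
        if token.startswith("$$"):
            segments.append(("block", token[2:-2].strip()))
        else:
            segments.append(("inline", token[1:-1].strip()))
        pos = match.end()
    if pos < len(content):
        segments.append(("text", content[pos:]))
    return segments
```

A pre-render step could use such segments to validate or normalize each formula separately before handing it to KaTeX.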

---

## Core Feature 2: PDF Scrolling + Question Linking

### Files

| File | Purpose |
|------|------|
| `frontend/src/components/workbench/PdfViewer.tsx` | Continuous-scroll PDF rendering + visible-page detection |
| `frontend/src/components/workbench/QuestionNav.tsx` | Horizontal question navigation bar |
| `frontend/src/pages/WorkbenchPage.tsx` | Coordinator for the two-way linking |

### Implementation

**Layout**: PDF on the left (60%), question panel on the right (40%)

**Continuous PDF scrolling (`PdfViewer.tsx`)**
- Uses the `<Document>` + `<Page>` components from `react-pdf`
- All pages are stacked vertically in a scrollable container (no single-page switching)
- A `ResizeObserver` watches the container width and sets the Page width dynamically
- Manual jump: enter a page number → `scrollIntoView`

**Two-way linking:**

1. **Question → PDF (clicking a question scrolls the PDF to its page)**
   - QuestionNav click → `handleQuestionSelect(index)` → records `lastUserSelectTime = Date.now()` + `setCurrentIndex`
   - PdfViewer sees the `currentPage` prop change → a `useEffect` triggers `el.scrollIntoView({ behavior: "smooth" })`
   - Sets `programmaticScroll.current = true`, reset after 2s

2. **PDF → Question (scrolling the PDF switches the right panel to the current question)**
   - An `IntersectionObserver` watches all `<Page>` elements, threshold: `[0, 0.25, 0.5, 0.75, 1]`
   - Tracks each page's `intersectionRatio` and picks the page with the highest visible ratio
   - If `programmaticScroll.current === true`, the callback is skipped
   - Fires `onPageChange(bestPage)` → WorkbenchPage `handlePdfPageChange`
   - `handlePdfPageChange`: finds the last question with `page_number <= currentPage` and updates `currentIndex`

**Preventing jump hijacking (two layers of protection):**
- **WorkbenchPage layer (primary)**: the `lastUserSelectTime` ref. For 2 seconds after the user clicks a question, `handlePdfPageChange` returns immediately and ignores all Observer callbacks. This fixes the case where, in long documents, the smooth scroll passes over intermediate pages, fires the Observer, and switches the question away.
- **PdfViewer layer (secondary)**: the `programmaticScroll` ref. Observer callbacks are skipped during scrollIntoView, and the flag is reset after 2s.
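
The page-to-question mapping and the 2-second guard can be modeled independently of React. The sketch below is a language-neutral illustration written in Python; the real logic lives in the TypeScript components named above, and the function names and explicit time parameters here are hypothetical.

```python
from typing import List, Optional

GUARD_SECONDS = 2.0  # the 2-second window described above


def question_for_page(page_numbers: List[int], current_page: int) -> Optional[int]:
    """Index of the last question whose page_number <= current_page, or None."""
    best: Optional[int] = None
    for index, page in enumerate(page_numbers):
        if page <= current_page:
            best = index
    return best


def on_pdf_page_change(
    page_numbers: List[int],
    current_page: int,
    current_index: int,
    last_select_time: float,
    now: float,
) -> int:
    """Return the question index to show, ignoring scroll events inside the guard window."""
    if now - last_select_time < GUARD_SECONDS:
        return current_index  # user just clicked a question; keep their selection
    found = question_for_page(page_numbers, current_page)
    return current_index if found is None else found
```

Passing the timestamps in as arguments keeps the guard logic easy to test without real timers.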

---

## Core Feature 3: Answering Interaction (MC / Fill-in-the-Blank)

### Files

| File | Purpose |
|------|------|
| `frontend/src/components/workbench/QuestionDetail.tsx` | Question display + answering interaction |
| `frontend/src/components/workbench/AiTrioPanel.tsx` | Collapsible panels for knowledge points / hint / solution |
| `frontend/src/components/shared/CollapsibleSection.tsx` | Collapsible section component |
| `frontend/src/components/shared/KaTeXRenderer.tsx` | HTML+KaTeX renderer |

### QuestionDetail interaction logic

**Multiple choice (MC):**
- State: `selectedOption`, `checked`
- Clicking an option highlights it in blue (before checking)
- Clicking "Check Answer" → `checked=true`
- Correct: the option turns green + "Correct!" / Wrong: the selected option turns red, the correct option turns green, and the correct answer is shown
- State resets automatically when switching questions (`useEffect` on `question.id`)

**True/False:**
- State: `tfAnswers: Record<string, "True" | "False">`, `tfChecked`
- Each statement has T / F buttons on its right, toggled independently
- The selection is highlighted in blue; once every statement is answered, "Submit Answers" becomes available
- After submission, the user is prompted to compare against the solution (per-statement correct answers are not yet stored separately)

**Fill-in-the-blank:**
- Text input + "Check" button
- Pressing Enter also checks the answer
- Case-insensitive comparison (`toLowerCase()`)
- After checking, the input changes color: green (correct) / red (wrong)

**Callback**: `onAnswerResult(isCorrect, userAnswer)` → WorkbenchPage → `recordAttempt` API

### AiTrioPanel

- Three `CollapsibleSection`s: Knowledge Reminder (blue, expanded by default), AI Hint (amber), Solution (green)
- `CollapsibleSection` animates expand/collapse smoothly with CSS `grid-template-rows: 0fr → 1fr`
- Content is rendered through `KaTeXRenderer` (HTML + KaTeX formulas)

---

## Core Feature 4: Variant Question Generation (Similar Question)

### Files

| File | Purpose |
|------|------|
| `backend/app/routers/questions.py` | `POST /{question_id}/variant` endpoint |
| `backend/app/services/grader.py` | `generate_variant()`: GPT-4o generates the variant |
| `frontend/src/components/workbench/ActionBar.tsx` | "Similar Question" button, triggered asynchronously |
| `frontend/src/pages/WorkbenchPage.tsx` | Variants Tab state management |
| `frontend/src/components/workbench/VariantDetail.tsx` | Variant answering UI |

### Backend

- `POST /api/questions/{question_id}/variant`
- Loads the source question from the DB → calls `generate_variant(question)` → attaches the source question's `knowledge_reminder` → returns
- Model: `gpt-4o`, temperature=0.5, response_format=json_object
- VARIANT_PROMPT requirements: same topic, similar difficulty, different data/scenario, output in HTML (not markdown)
- Output fields: question_text, question_type, options (if MC), correct_answer, ai_hint, solution

### Frontend interaction (tab-based async flow)

**State management (`WorkbenchPage.tsx`):**
```typescript
interface StoredVariant {
  id: string;                    // placeholder ID, e.g. "variant-1"
  sourceQuestionNumber: string;  // number of the source question
  variant: VariantQuestion;      // generation result
  status: "generating" | "ready";
}
```

**Flow:**
1. The user clicks "Similar Question" → `ActionBar` calls `onVariantStart(placeholderId, questionNumber)`
2. WorkbenchPage creates a placeholder entry with `status: "generating"`; the user can keep working without being blocked
3. When the API returns → `onVariantReady(placeholderId, variant)` → the status is updated to `ready`
4. On failure → `onVariantFailed(placeholderId)` → the placeholder entry is removed

**Three views in the right panel:**
- **Questions Tab**: question nav + QuestionDetail + AiTrioPanel + ActionBar
- **Variants Tab**: variant list (Generating.../Ready); each entry shows the question number and preview text
- **Variant Detail**: after clicking "Start", the whole right panel is replaced by the VariantDetail component + a "Back" button

**VariantDetail component**: purple theme, includes the full MC / fill-in-the-blank interaction + the AI trio (CollapsibleSection)

---

## Core Feature 5: Photo Grading

### Files

| File | Purpose |
|------|------|
| `backend/app/routers/attempts.py` | `POST /photo`: upload + OCR + grading |
| `backend/app/services/grader.py` | `ocr_photo()` + `grade_answer()` |
| `frontend/src/components/workbench/PhotoUpload.tsx` | Photo upload modal |
| `frontend/src/components/workbench/ActionBar.tsx` | "Upload handwritten answer" button |

### Backend flow

1. Receive the image → upload it to the Supabase Storage `attempt-photos` bucket
2. `ocr_photo(photo_bytes)`: GPT-4o Vision recognizes the handwriting
   - Input: base64 image
   - Output: student answer text (including LaTeX formulas)
3. `grade_answer(question, student_answer)`: Qwen-plus grades the answer
   - Input: question info + reference answer + student answer
   - Output: `{ is_correct, score_given, feedback (HTML), error_at_step }`
4. Write to the `user_attempts` table (including photo_url, photo_ocr_text, feedback, is_correct)
5. Wrong answers automatically get `in_error_book = true`

### Frontend

- PhotoUpload: modal dialog, supports drag-and-drop or click to choose an image
- Preview → submit → shows the OCR result + AI grading feedback
- Available for all question types (MC / fill-in-the-blank / long question)

---

## Core Feature 6: Error Book

### Files

| File | Purpose |
|------|------|
| `backend/app/routers/attempts.py` | `GET /error-book` + `PATCH /{attempt_id}` |
| `frontend/src/pages/ErrorBookPage.tsx` | Error book page |
| `frontend/src/lib/api.ts` | `getErrorBook()` + `updateAttempt()` |

### Backend

- `GET /api/attempts/error-book?user_id=xxx`
  - Queries `in_error_book=true AND mastered=false`
  - JOINs `paper_questions` to return full question info
- `PATCH /api/attempts/{attempt_id}`
  - Updates the `in_error_book` or `mastered` flag

### Frontend

- List view: question info + user answer + AI feedback
- Actions: "Review in Workbench" (navigate) / "Mastered" (mark as mastered) / "Remove" (remove from the error book)

---

## Core Feature 7: Attempt Records

### Files

| File | Purpose |
|------|------|
| `backend/app/routers/attempts.py` | `POST /`: records an attempt |
| `frontend/src/components/workbench/ActionBar.tsx` | "Got it right" / "Got it wrong" buttons |

### Flow

- "Got it right" → `POST /api/attempts/` with `attempt_type: "select", is_correct: true`
- "Got it wrong" → `POST /api/attempts/` with `attempt_type: "select", is_correct: false`
  - The backend automatically sets `in_error_book = true`
- A toast reports the result of the action

---

## API Summary

### Papers Router (`/api/papers`)

| Method | Path | Description |
|--------|------|------|
| GET | `/` | List all papers (optionally filtered by user_id) |
| POST | `/upload` | Upload a paper PDF + optional answer PDF |
| GET | `/{paper_id}` | Get a single paper's info |
| GET | `/{paper_id}/questions` | Get all questions of a paper |

### Attempts Router (`/api/attempts`)

| Method | Path | Description |
|--------|------|------|
| POST | `/` | Record an attempt |
| POST | `/photo` | Photo upload + OCR + AI grading |
| GET | `/error-book?user_id=` | Get the error book |
| PATCH | `/{attempt_id}` | Update error-book / mastered state |

### Questions Router (`/api/questions`)

| Method | Path | Description |
|--------|------|------|
| POST | `/{question_id}/variant` | Generate a variant question |

---

## Frontend Routes

| Path | Page | File |
|------|------|------|
| `/` | Home: paper list | `src/pages/HomePage.tsx` |
| `/upload` | Upload paper | `src/pages/UploadPage.tsx` |
| `/paper/:id` | Workbench | `src/pages/WorkbenchPage.tsx` |
| `/error-book` | Error book | `src/pages/ErrorBookPage.tsx` |

---

## Frontend Component Tree (Workbench)

```
WorkbenchPage
├── Header                       # Top navigation (course + paper title)
├── PdfViewer                    # Left 60%: continuous PDF scroll
└── Right Panel (40%)
    ├── [Questions Tab]
    │   ├── QuestionNav          # Horizontal question nav: Q1 Q2 Q3...
    │   ├── QuestionDetail       # Question display + MC / fill-in-the-blank interaction
    │   ├── AiTrioPanel          # Knowledge / hint / solution (3x CollapsibleSection)
    │   └── ActionBar            # Bottom buttons (right / wrong / variant / photo)
    ├── [Variants Tab]
    │   └── Variant Cards        # Variant list (Generating.../Ready)
    └── [Variant Detail View]    # Replaces the entire right panel
        ├── Back Button
        └── VariantDetail        # Variant answering + AI trio
```

---

## LLM Model Assignment

| Task | Model | Provider | File |
|------|------|----------|------|
| Structured question splitting | gpt-4o | laozhang API | paper_processor.py |
| Answer matching | gpt-4o | laozhang API | paper_processor.py |
| AI trio (knowledge/hint/solution) | qwen-plus | DashScope | paper_processor.py |
| Handwriting OCR | gpt-4o (Vision) | laozhang API | grader.py |
| Answer grading | qwen-plus | DashScope | grader.py |
| Variant question generation | gpt-4o | laozhang API | grader.py |

---

## Configuration and Environment Variables

> Files: `backend/app/config.py`, `.env`

| Variable | Description |
|------|------|
| SUPABASE_URL | Supabase project URL |
| SUPABASE_ANON_KEY | Anonymous key for the frontend |
| SUPABASE_SERVICE_ROLE_KEY | Service role key for the backend (bypasses RLS) |
| LAOZHANG_BASE_URL | Base URL of the GPT-4o proxy API |
| LAOZHANG_API_KEY | API key for the GPT-4o proxy |
| DASHSCOPE_BASE_URL | Alibaba DashScope API base URL |
| DASHSCOPE_API_KEY | DashScope API key |
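
The table above could be loaded and validated at startup along these lines. The project's `config.py` is described elsewhere in this document as using Pydantic Settings; the sketch below deliberately swaps in a standard-library stand-in so it runs without extra dependencies. The `Settings` dataclass and `load_settings` function are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Variable names taken from the table above.
REQUIRED_VARS = (
    "SUPABASE_URL",
    "SUPABASE_ANON_KEY",
    "SUPABASE_SERVICE_ROLE_KEY",
    "LAOZHANG_BASE_URL",
    "LAOZHANG_API_KEY",
    "DASHSCOPE_BASE_URL",
    "DASHSCOPE_API_KEY",
)


@dataclass(frozen=True)
class Settings:
    supabase_url: str
    supabase_anon_key: str
    supabase_service_role_key: str
    laozhang_base_url: str
    laozhang_api_key: str
    dashscope_base_url: str
    dashscope_api_key: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from env, failing fast if any variable is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(**{name.lower(): env[name] for name in REQUIRED_VARS})
```

Failing fast on missing variables surfaces configuration problems at startup rather than at the first LLM or Supabase call.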
|
||||
|
||||
---

## Complete File Index

### Backend (`backend/app/`)

```
main.py                    # FastAPI entry point, CORS, router registration
config.py                  # Pydantic Settings, environment variables
routers/
  papers.py                # Paper CRUD + upload-triggered processing
  attempts.py              # Attempt records + photo OCR grading + error book
  questions.py             # Variant generation
services/
  paper_processor.py       # Core 5-step pipeline: PDF → structuring → answer matching → AI trio
  text_extractor.py        # PyMuPDF text extraction
  grader.py                # OCR + grading + variant generation (prompts + LLM calls)
  llm_clients.py           # GPT-4o / Qwen client singletons
  supabase_client.py       # Supabase client
```

### Frontend (`frontend/src/`)

```
App.tsx                        # React Router route definitions
main.tsx                       # ReactDOM entry point
lib/
  api.ts                       # Wrappers for all API calls (9 functions)
types/
  api.ts                       # TypeScript type definitions
hooks/
  usePaper.ts                  # Polls paper status (3s interval)
  useQuestions.ts              # Fetches the question list
pages/
  HomePage.tsx                 # Home: paper list
  UploadPage.tsx               # Upload page
  WorkbenchPage.tsx            # Practice workbench: core orchestrating component
  ErrorBookPage.tsx            # Error book
components/
  layout/
    Header.tsx                 # Top navigation bar
  shared/
    KaTeXRenderer.tsx          # HTML + KaTeX formula rendering
    CollapsibleSection.tsx     # Collapsible panel (grid animation)
    StatusBadge.tsx            # Status badge
  upload/
    UploadForm.tsx             # Upload form
    FilePickerField.tsx        # File picker
  workbench/
    PdfViewer.tsx              # Continuous-scroll PDF + IntersectionObserver
    QuestionNav.tsx            # Question navigation bar
    QuestionDetail.tsx         # Question display + MC / fill-in interaction
    AiTrioPanel.tsx            # AI trio panel
    ActionBar.tsx              # Bottom action buttons
    PhotoUpload.tsx            # Photo upload modal
    VariantDetail.tsx          # Inline variant answering
    VariantModal.tsx           # (Deprecated; replaced by VariantDetail)
```
16
backend/Dockerfile
Normal file
@@ -0,0 +1,16 @@
FROM python:3.12-slim

WORKDIR /app

# System deps for PyMuPDF
RUN apt-get update && apt-get install -y --no-install-recommends \
    libmupdf-dev gcc g++ && \
    rm -rf /var/lib/apt/lists/*

COPY pyproject.toml .
RUN pip install --no-cache-dir .

COPY app/ app/

EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
4
backend/add_progress_columns.sql
Normal file
@@ -0,0 +1,4 @@
ALTER TABLE papers
  ADD COLUMN IF NOT EXISTS processing_step text DEFAULT NULL,
  ADD COLUMN IF NOT EXISTS processing_progress integer DEFAULT 0,
  ADD COLUMN IF NOT EXISTS processing_total integer DEFAULT 0;
0
backend/app/__init__.py
Normal file
36
backend/app/config.py
Normal file
@@ -0,0 +1,36 @@
from pydantic_settings import BaseSettings
from functools import lru_cache
import os


class Settings(BaseSettings):
    # Supabase
    supabase_url: str
    supabase_anon_key: str
    supabase_service_role_key: str

    # LLM - laozhang (gpt-4o, gpt-4o-mini)
    laozhang_base_url: str = "https://api.laozhang.ai/v1"
    laozhang_api_key: str = ""

    # LLM - DashScope (qwen-plus)
    dashscope_base_url: str = "https://dashscope.aliyuncs.com/compatible-mode/v1"
    dashscope_api_key: str = ""

    # LLM - DeepSeek
    deepseek_base_url: str = "https://api.deepseek.com/v1"
    deepseek_api_key: str = ""

    # Google Gemini (official)
    google_gemini_api_key: str = ""

    model_config = {
        "env_file": os.path.join(os.path.dirname(__file__), "../../.env"),
        "env_file_encoding": "utf-8",
        "extra": "ignore",
    }


@lru_cache
def get_settings() -> Settings:
    return Settings()
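`get_settings()` relies on `functools.lru_cache` to build the settings object once and hand back the same instance afterwards. The sketch below shows that behavior with a stand-in class, since constructing the real `Settings` needs a populated `.env`; `DemoSettings` and `get_demo_settings` are hypothetical names used only here.

```python
from functools import lru_cache


class DemoSettings:
    """Stand-in for Settings that counts how often it is constructed."""
    instances = 0

    def __init__(self) -> None:
        DemoSettings.instances += 1


@lru_cache
def get_demo_settings() -> DemoSettings:
    # With no arguments there is a single cache entry, so this body runs once.
    return DemoSettings()


first = get_demo_settings()
second = get_demo_settings()
print(first is second)         # same object on every call
print(DemoSettings.instances)  # constructed only once

# Tests that need fresh settings can drop the cached instance explicitly.
get_demo_settings.cache_clear()
```

One consequence worth keeping in mind: changes to environment variables after the first call are not picked up until the cache is cleared.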
0
backend/app/dependencies/__init__.py
Normal file
34
backend/app/dependencies/auth.py
Normal file
@@ -0,0 +1,34 @@
"""Auth dependency: validate Supabase JWT and return user_id"""

from fastapi import Depends, HTTPException, status
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
from app.services.supabase_client import get_supabase

bearer_scheme = HTTPBearer(auto_error=False)


async def get_current_user_id(
    credentials: HTTPAuthorizationCredentials | None = Depends(bearer_scheme),
) -> str:
    """Extract and validate Bearer token, return user_id."""
    if not credentials:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Not authenticated",
        )
    token = credentials.credentials
    sb = get_supabase()
    try:
        result = sb.auth.get_user(token)
        user = result.user
        if not user:
            raise HTTPException(
                status_code=status.HTTP_401_UNAUTHORIZED,
                detail="Invalid token",
            )
        return user.id
    except HTTPException:
        # Re-raise our own 401 unchanged so its detail is not overwritten below.
        raise
    except Exception:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid or expired token",
        )
59
backend/app/main.py
Normal file
@@ -0,0 +1,59 @@
import asyncio
import threading
from contextlib import asynccontextmanager

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from app.routers import analytics, papers, attempts, questions


def _resume_stale_papers():
    """On startup, find papers stuck in `processing` and automatically resume the AI trio step."""
    try:
        from app.services.supabase_client import get_supabase
        from app.services.paper_processor import process_paper

        sb = get_supabase()
        stale = sb.table("papers").select("id").eq("status", "processing").execute().data
        if not stale:
            return

        for p in stale:
            paper_id = p["id"]
            print(f"[STARTUP] Resuming processing for paper {paper_id[:8]}...")

            def run(pid=paper_id):
                asyncio.run(process_paper(pid, b"", None))

            threading.Thread(target=run, daemon=True).start()
    except Exception as e:
        print(f"[STARTUP] Resume skipped: {e}")


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup
    _resume_stale_papers()
    yield
    # Shutdown (nothing to do)


app = FastAPI(title="PastPaper Master API", version="0.1.0", lifespan=lifespan)

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # Open during development; tighten before launch
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

app.include_router(papers.router, prefix="/api/papers", tags=["papers"])
app.include_router(attempts.router, prefix="/api/attempts", tags=["attempts"])
app.include_router(questions.router, prefix="/api/questions", tags=["questions"])
app.include_router(analytics.router, prefix="/api/analytics", tags=["analytics"])


@app.get("/health")
def health():
    return {"status": "ok"}
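The resume logic starts one daemon thread per stale paper and gives each thread its own event loop through `asyncio.run`. The sketch below isolates that pattern with a stub coroutine; `fake_process` and `resume_all` are hypothetical stand-ins, not the real `process_paper`. The `pid=paper_id` default argument matters: it binds the current loop value, whereas a plain closure would see whatever `paper_id` holds when the thread actually runs.

```python
import asyncio
import threading


async def fake_process(paper_id: str, results: list[str]) -> None:
    """Stub for an async processing job; records which paper it handled."""
    await asyncio.sleep(0)
    results.append(paper_id)


def resume_all(paper_ids: list[str], results: list[str]) -> list[threading.Thread]:
    """Start one daemon thread per paper, each with a fresh event loop."""
    threads = []
    for paper_id in paper_ids:
        def run(pid: str = paper_id) -> None:  # bind the current value
            asyncio.run(fake_process(pid, results))

        thread = threading.Thread(target=run, daemon=True)
        thread.start()
        threads.append(thread)
    return threads


if __name__ == "__main__":
    done: list[str] = []
    for t in resume_all(["p1", "p2", "p3"], done):
        t.join()
    print(sorted(done))
```

Unlike this sketch, the startup code never joins its threads, and because they are daemon threads any job still running when the server exits is abandoned mid-way.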
0
backend/app/routers/__init__.py
Normal file
285
backend/app/routers/analytics.py
Normal file
@@ -0,0 +1,285 @@
"""Course-level analytics endpoints."""

from __future__ import annotations

from collections import Counter, defaultdict

from fastapi import APIRouter

from app.services.supabase_client import get_supabase

router = APIRouter()


DIFFICULTY_SCORE = {"easy": 1, "medium": 2, "hard": 3}
DIFFICULTY_LABEL = {1: "Easy", 2: "Medium", 3: "Hard"}

# ── Topic normalization ──────────────────────────────────────
# Map variant spellings to canonical label
_TOPIC_ALIASES: dict[str, str] = {
    "numpy": "NumPy",
    "naïve bayes": "Naive Bayes",
    "naïve bayes classifier": "Naive Bayes",
    "naive bayes classifier": "Naive Bayes",
    "bayes classifier": "Naive Bayes",
    "bayes model": "Naive Bayes",
    "bayes' theorem": "Naive Bayes",
    "bayes' rule": "Naive Bayes",
    "k-nearest neighbors": "K-Nearest Neighbors (KNN)",
    "knn": "K-Nearest Neighbors (KNN)",
    "k-means clustering": "K-Means Clustering",
    "k-means": "K-Means Clustering",
    "k means": "K-Means Clustering",
    "multilayer perceptron": "Multilayer Perceptron (MLP)",
    "multi-layer perceptron": "Multilayer Perceptron (MLP)",
    "multi-layer perceptron (mlp)": "Multilayer Perceptron (MLP)",
    "mlp": "Multilayer Perceptron (MLP)",
    "single layer perceptron": "Perceptron",
    "convolutional neural network": "CNN",
    "convolutional neural network (cnn)": "CNN",
    "convolutional neural networks": "CNN",
    "cnn architecture": "CNN",
    "cnn properties": "CNN",
    "python fundamentals": "Python",
    "python programming": "Python",
    "python implementation": "Python",
    "advanced python programming": "Python",
    "python programming: convolutional neural network": "CNN",
    "cross-validation": "Cross Validation",
    "model evaluation implementation": "Model Evaluation",
    "digital image processing": "Image Processing",
    "computer vision": "Image Processing",
    "array slicing": "Array Slicing",
    "slicing": "Array Slicing",
    "array indexing": "Array Slicing",
    "array reshaping": "Reshape",
    "array views": "Array Slicing",
    "view vs copy": "Array Slicing",
    "boolean indexing": "Array Slicing",
    "arange": "NumPy",
    "newaxis": "NumPy",
    "expand dims": "NumPy",
    "transpose": "NumPy",
    "type casting": "NumPy",
    "element-wise operation": "NumPy",
    "array reduction": "NumPy",
    "multi-dimensional array": "NumPy",
    "dot product": "NumPy",
    "vectorization": "NumPy",
    "activation functions": "Activation Function",
    "linear activation function": "Activation Function",
    "neural network architecture": "Neural Networks",
    "hidden layer": "Neural Networks",
    "deep learning": "Neural Networks",
    "deep learning frameworks": "Neural Networks",
    "alpha-beta pruning": "Alpha-Beta Pruning",
    "minimax algorithm": "Minimax",
    "ethics of ai": "AI Ethics",
    "ethics": "AI Ethics",
    "cosine distance": "Cosine Similarity",
    "distance calculation": "Distance Metrics",
    "euclidean distance": "Distance Metrics",
    "manhattan distance": "Distance Metrics",
    "hamming distance": "Distance Metrics",
    "precision": "Model Evaluation",
    "recall": "Model Evaluation",
    "f1 score": "Model Evaluation",
    "macro f1 score": "Model Evaluation",
    "accuracy": "Model Evaluation",
    "classification accuracy": "Model Evaluation",
    "confusion matrix": "Model Evaluation",
    "convolution operation": "Convolution",
    "dilated convolution": "Convolution",
    "3d convolution": "Convolution",
    "gaussian likelihood": "Probability",
    "gaussian distribution": "Probability",
    "categorical likelihood": "Probability",
    "conditional probability": "Probability",
    "total probability theorem": "Probability",
    "probability assumptions": "Probability",
    "tensorflow": "Keras",
    "model summary": "Keras",
    "model construction": "Keras",
    "trainable parameters": "Parameter Calculation",
    "parameter reduction": "Parameter Calculation",
    "output shape calculation": "Parameter Calculation",
    "shape calculation": "Parameter Calculation",
}


def normalize_topic(label: str) -> str:
    return _TOPIC_ALIASES.get(label.lower().strip(), label)


def extract_topic_labels(question: dict) -> list[str]:
    labels: list[str] = []
    raw_labels: list[str] = []

    analytics_topic = question.get("analytics_topic")
    if analytics_topic:
        raw_labels.append(analytics_topic)

    for tag in question.get("topic_tags") or []:
        if tag and tag not in raw_labels:
            raw_labels.append(tag)

    if not raw_labels:
        for tag in question.get("topics") or []:
            if tag and tag not in raw_labels:
                raw_labels.append(tag)

    # Normalize and deduplicate
    seen: set[str] = set()
    for raw in raw_labels:
        norm = normalize_topic(raw)
        if norm not in seen:
            seen.add(norm)
            labels.append(norm)

    return labels


def extract_question_family(question: dict) -> str:
    return (
        question.get("question_format")
        or question.get("question_type")
        or "unknown"
    )


@router.get("/courses")
async def list_courses():
    """Return the list of courses that have at least one paper with status `ready`."""
    sb = get_supabase()
    rows = (
        sb.table("papers")
        .select("course_code")
        .eq("status", "ready")
        .execute()
        .data
    )
    codes = sorted({row["course_code"] for row in rows if row.get("course_code")})
    return codes


@router.get("/course/{course_code}")
async def get_course_analytics(course_code: str):
    sb = get_supabase()

    papers = (
        sb.table("papers")
        .select("id, course_code, year, term, exam_type, part_label, status")
        .eq("course_code", course_code.upper())
        .eq("status", "ready")
        .order("year", desc=True)
        .execute()
        .data
    )
    if not papers:
        return {
            "course_code": course_code.upper(),
            "kpi": {"papers": 0, "questions": 0, "topics": 0, "difficulty": "N/A"},
            "topic_frequency": [],
            "question_types": [],
            # Keep the empty response shape identical to the populated one.
            "all_questions": [],
            "difficulty_distribution": {"easy": 0, "medium": 0, "hard": 0},
            "high_yield_topics": [],
        }

    paper_ids = [paper["id"] for paper in papers]
    questions = (
        sb.table("paper_questions")
        .select(
            "id, paper_id, question_number, question_type, question_format, "
            "question_text, score, topics, analytics_topic, topic_tags, difficulty"
        )
        .in_("paper_id", paper_ids)
        .order("display_order")
        .execute()
        .data
    )

    papers_by_id = {paper["id"]: paper for paper in papers}
    total_questions = len(questions)
    topic_counter: Counter[str] = Counter()
    type_counter: Counter[str] = Counter()
    difficulty_counter: Counter[str] = Counter()
    topic_examples: dict[str, list[dict]] = defaultdict(list)
    difficulty_scores: list[int] = []
    all_question_items: list[dict] = []

    for question in questions:
        question_type = extract_question_family(question)
        type_counter[question_type] += 1

        difficulty = question.get("difficulty")
        if difficulty in DIFFICULTY_SCORE:
            difficulty_counter[difficulty] += 1
            difficulty_scores.append(DIFFICULTY_SCORE[difficulty])

        paper = papers_by_id.get(question["paper_id"], {})
        # `or ''` guards against columns that exist but hold NULL.
        source_label = (
            f"{paper.get('year') or ''} {(paper.get('term') or '').title()} "
            f"{(paper.get('exam_type') or '').title()}"
        ).strip()
        if paper.get("part_label"):
            source_label = f"{source_label} Part {paper['part_label']}"

        topics = extract_topic_labels(question)
        q_item = {
            "paper_id": paper.get("id"),
            "source": source_label,
            "question_number": question["question_number"],
            "preview": (question.get("question_text") or "")[:220],
            "difficulty": question.get("difficulty"),
            "question_type": question_type,
            "year": paper.get("year"),
            "term": paper.get("term"),
            "exam_type": paper.get("exam_type"),
            "topics": topics,
        }
        all_question_items.append(q_item)

        for topic in topics:
            topic_counter[topic] += 1
            topic_examples[topic].append(q_item)

    avg_difficulty = "N/A"
    if difficulty_scores:
        rounded = round(sum(difficulty_scores) / len(difficulty_scores))
        avg_difficulty = DIFFICULTY_LABEL.get(rounded, "Medium")

    topic_frequency = []
    for topic, count in topic_counter.most_common():
        pct = round((count / total_questions) * 100) if total_questions else 0
        topic_frequency.append(
            {
                "label": topic,
                "count": count,
                "pct": pct,
                "questions": topic_examples[topic],
            }
        )

    question_types = []
    for label, count in type_counter.most_common():
        pct = round((count / total_questions) * 100) if total_questions else 0
        question_types.append({"label": label, "count": count, "pct": pct})

    return {
        "course_code": course_code.upper(),
        "kpi": {
            "papers": len(papers),
            "questions": total_questions,
            "topics": len(topic_counter),
            "difficulty": avg_difficulty,
        },
        "topic_frequency": topic_frequency,
        "question_types": question_types,
        "all_questions": all_question_items,
        "difficulty_distribution": {
            "easy": difficulty_counter.get("easy", 0),
            "medium": difficulty_counter.get("medium", 0),
            "hard": difficulty_counter.get("hard", 0),
        },
        "high_yield_topics": [topic for topic, _ in topic_counter.most_common(5)],
    }
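Two details of the analytics code are easy to miss: topic labels are normalized before deduplication, and the average difficulty is rounded with Python's built-in `round`, which rounds exact halves to the nearest even integer. The sketch below reproduces both with a three-entry alias table; `ALIASES`, `dedupe_normalized`, and `average_label` are illustrative names, not functions from `analytics.py`.

```python
# Small excerpt of the alias table, for illustration only.
ALIASES = {
    "knn": "K-Nearest Neighbors (KNN)",
    "k-means": "K-Means Clustering",
    "numpy": "NumPy",
}
DIFFICULTY_LABEL = {1: "Easy", 2: "Medium", 3: "Hard"}


def normalize(label: str) -> str:
    """Map a raw label to its canonical form; unknown labels pass through."""
    return ALIASES.get(label.lower().strip(), label)


def dedupe_normalized(raw_labels: list[str]) -> list[str]:
    """Normalize labels, keeping first-seen order and dropping duplicates."""
    seen: set[str] = set()
    result: list[str] = []
    for raw in raw_labels:
        norm = normalize(raw)
        if norm not in seen:
            seen.add(norm)
            result.append(norm)
    return result


def average_label(scores: list[int]) -> str:
    """Label the mean difficulty; round() sends exact halves to the even neighbor."""
    if not scores:
        return "N/A"
    return DIFFICULTY_LABEL.get(round(sum(scores) / len(scores)), "Medium")


print(dedupe_normalized(["KNN", " knn ", "k-means", "Regression"]))
print(average_label([2, 3]))  # mean 2.5 rounds to 2
print(average_label([1, 2]))  # mean 1.5 rounds to 2
```

Because both 1.5 and 2.5 round to 2, borderline courses drift toward "Medium"; if that is not intended, `math.floor(x + 0.5)` would round halves up instead.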
208
backend/app/routers/attempts.py
Normal file
@@ -0,0 +1,208 @@
"""User attempt records + photo grading + error book"""

import asyncio
from fastapi import APIRouter, UploadFile, File, Form, HTTPException, Depends
from pydantic import BaseModel
from app.services.supabase_client import get_supabase
from app.services.grader import ocr_photo, grade_answer
from app.dependencies.auth import get_current_user_id

router = APIRouter()


class AttemptCreate(BaseModel):
    question_id: str
    attempt_type: str  # "select" | "input" | "photo"
    user_answer: str | None = None
    is_correct: bool | None = None


class AttemptUpdate(BaseModel):
    in_error_book: bool | None = None
    mastered: bool | None = None


@router.post("/")
async def create_attempt(data: AttemptCreate, user_id: str = Depends(get_current_user_id)):
    """Record one attempt."""
    sb = get_supabase()
    record = {
        "user_id": user_id,
        "question_id": data.question_id,
        "attempt_type": data.attempt_type,
        "user_answer": data.user_answer,
        "is_correct": data.is_correct,
    }
    # Auto add to error book if wrong
    if data.is_correct is False:
        record["in_error_book"] = True

    result = sb.table("user_attempts").insert(record).execute()
    return result.data[0]


@router.post("/photo")
async def photo_attempt(
    question_id: str = Form(...),
    photo: UploadFile = File(...),
    user_id: str = Depends(get_current_user_id),
):
    """Photo upload → OCR → AI grading."""
    sb = get_supabase()

    # 1. Read photo
    photo_bytes = await photo.read()

    # 2. Upload to storage
    storage_path = f"attempts/{user_id}/{question_id}/{photo.filename}"
    sb.storage.from_("attempt-photos").upload(
        storage_path, photo_bytes,
        file_options={"content-type": photo.content_type or "image/jpeg", "upsert": "true"},
    )
    photo_url = sb.storage.from_("attempt-photos").get_public_url(storage_path)

    # 3. OCR (run in thread pool to avoid blocking event loop)
    ocr_text = await asyncio.to_thread(ocr_photo, photo_bytes)

    # 4. Fetch question for grading context
    q_result = sb.table("paper_questions").select("*").eq("id", question_id).execute()
    if not q_result.data:
        raise HTTPException(status_code=404, detail="Question not found")
    question = q_result.data[0]

    # 5. AI grading (run in thread pool)
    grade_result = await asyncio.to_thread(grade_answer, question, ocr_text)

    # 6. Save attempt
    record = {
        "user_id": user_id,
        "question_id": question_id,
        "attempt_type": "photo",
        "photo_url": photo_url,
        "photo_ocr_text": ocr_text,
        "is_correct": grade_result.get("is_correct", False),
        "feedback": grade_result.get("feedback", ""),
        "error_at_step": grade_result.get("error_at_step"),
        "in_error_book": not grade_result.get("is_correct", False),
    }
    result = sb.table("user_attempts").insert(record).execute()

    return {
        "attempt": result.data[0],
        "ocr_text": ocr_text,
        "grade": grade_result,
    }


@router.get("/error-book")
async def get_error_book(
    course_code: str | None = None,
    user_id: str = Depends(get_current_user_id),
):
    """Get the error book."""
    sb = get_supabase()
    attempts = (
        sb.table("user_attempts")
        .select("*")
        .eq("user_id", user_id)
        .eq("in_error_book", True)
        .eq("mastered", False)
        .order("created_at", desc=True)
        .execute()
        .data
    )
    if not attempts:
        return []

    question_ids = list({attempt["question_id"] for attempt in attempts})
    questions = (
        sb.table("paper_questions")
        .select("*")
        .in_("id", question_ids)
        .execute()
        .data
    )
    questions_by_id = {question["id"]: question for question in questions}

    paper_ids = list({question["paper_id"] for question in questions})
    papers = (
        sb.table("papers")
        .select("id, course_code, year, term, exam_type, part_label")
        .in_("id", paper_ids)
        .execute()
        .data
    )
    papers_by_id = {paper["id"]: paper for paper in papers}

    enriched = []
    for attempt in attempts:
        question = questions_by_id.get(attempt["question_id"])
        if not question:
            continue
        paper = papers_by_id.get(question["paper_id"])
        if course_code and paper and paper.get("course_code") != course_code.upper():
            continue

        enriched.append(
            {
                **attempt,
                "paper_questions": {
                    **question,
                    "paper": paper,
                },
            }
        )
    return enriched


@router.get("/by-paper/{paper_id}")
async def get_paper_attempts(paper_id: str, user_id: str = Depends(get_current_user_id)):
    """Get the latest grading record for every question in a paper."""
    sb = get_supabase()
    attempts = (
        sb.table("user_attempts")
        .select("question_id, is_correct, feedback, photo_ocr_text, attempt_type, created_at")
        .eq("user_id", user_id)
        .order("created_at", desc=True)
        .execute()
        .data
    )
    # Keep only photo attempts, and only the newest one per question
    question_ids = (
        sb.table("paper_questions")
        .select("id")
        .eq("paper_id", paper_id)
        .execute()
        .data
    )
    qid_set = {q["id"] for q in question_ids}
    seen: set[str] = set()
    result = []
    for a in attempts:
        if a["question_id"] not in qid_set:
            continue
        if a["question_id"] in seen:
            continue
        if a["attempt_type"] != "photo":
            continue
        seen.add(a["question_id"])
        result.append(a)
    return result


@router.patch("/{attempt_id}")
async def update_attempt(
    attempt_id: str,
    data: AttemptUpdate,
    user_id: str = Depends(get_current_user_id),
):
    """Update error-book status (e.g. mark as mastered)."""
    sb = get_supabase()
    update = {}
    if data.in_error_book is not None:
        update["in_error_book"] = data.in_error_book
    if data.mastered is not None:
        update["mastered"] = data.mastered
    if not update:
        raise HTTPException(status_code=400, detail="Nothing to update")

    # Scope the update to the caller's own attempts; the service-role client bypasses RLS.
    result = (
        sb.table("user_attempts")
        .update(update)
        .eq("id", attempt_id)
        .eq("user_id", user_id)
        .execute()
    )
    if not result.data:
        raise HTTPException(status_code=404, detail="Attempt not found")
    return result.data[0]
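The `/by-paper/{paper_id}` handler depends on the attempts arriving newest first: the first photo attempt it sees for a question is therefore the latest one. A minimal sketch of that filter on plain dictionaries, with `latest_photo_attempts` as a hypothetical helper name:

```python
def latest_photo_attempts(attempts: list[dict], question_ids: set[str]) -> list[dict]:
    """Keep the newest photo attempt per question.

    Assumes `attempts` is already sorted newest first, as the query's
    `order("created_at", desc=True)` provides.
    """
    seen: set[str] = set()
    result: list[dict] = []
    for attempt in attempts:
        qid = attempt["question_id"]
        if qid not in question_ids or qid in seen:
            continue
        if attempt["attempt_type"] != "photo":
            # Non-photo attempts are skipped without marking the question as seen.
            continue
        seen.add(qid)
        result.append(attempt)
    return result


sample = [
    {"question_id": "q1", "attempt_type": "select", "created_at": "2024-03-03"},
    {"question_id": "q1", "attempt_type": "photo", "created_at": "2024-03-02"},
    {"question_id": "q1", "attempt_type": "photo", "created_at": "2024-03-01"},
    {"question_id": "q9", "attempt_type": "photo", "created_at": "2024-03-01"},
]
print(latest_photo_attempts(sample, {"q1", "q2"}))
```

Note that the endpoint loads every attempt the user has made and filters in Python; restricting the query to the paper's question IDs would be a possible optimization, not something the current code does.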
142
backend/app/routers/papers.py
Normal file
@@ -0,0 +1,142 @@
"""Paper upload + processing pipeline"""

import asyncio
import threading
from fastapi import APIRouter, UploadFile, File, Form, HTTPException, Depends
from app.services.supabase_client import get_supabase
from app.services.text_extractor import extract_pdf, get_full_text
from app.services.paper_processor import process_paper
from app.dependencies.auth import get_current_user_id

router = APIRouter()


def _upload_and_process_sync(
    paper_id: str,
    storage_path: str,
    paper_bytes: bytes,
    answer_bytes: bytes | None,
):
    """Runs in a separate thread: Storage upload + AI processing."""
    sb = get_supabase()
    try:
        paper_storage_path = f"{storage_path}/paper.pdf"
        sb.storage.from_("papers").upload(
            paper_storage_path, paper_bytes,
            file_options={"content-type": "application/pdf", "upsert": "true"},
        )
        paper_url = sb.storage.from_("papers").get_public_url(paper_storage_path)

        update_data: dict = {"paper_file_url": paper_url}

        if answer_bytes:
            answer_storage_path = f"{storage_path}/answer.pdf"
            sb.storage.from_("papers").upload(
                answer_storage_path, answer_bytes,
                file_options={"content-type": "application/pdf", "upsert": "true"},
            )
            update_data["answer_file_url"] = sb.storage.from_("papers").get_public_url(answer_storage_path)

        sb.table("papers").update(update_data).eq("id", paper_id).execute()
    except Exception as e:
        # The upload is best-effort: log the failure instead of swallowing it, then keep processing.
        print(f"[UPLOAD] Storage upload failed for paper {paper_id[:8]}: {e}")

    # process_paper is async, so run it in a new event loop
    asyncio.run(process_paper(paper_id, paper_bytes, answer_bytes))


@router.get("/")
async def list_papers():
    """List papers (public assets shared by all users)."""
    sb = get_supabase()
    return (
        sb.table("papers")
        .select("id, course_code, year, term, exam_type, status, question_count, total_score, difficulty_level, processing_step, processing_progress, processing_total, created_at")
        .order("created_at", desc=True)
        .execute()
        .data
    )


@router.get("/mine")
async def my_papers(user_id: str = Depends(get_current_user_id)):
    """Papers uploaded by the current user (including those still processing)."""
    sb = get_supabase()
    return (
        sb.table("papers")
        .select("id, course_code, year, term, exam_type, part_label, status, question_count, processing_step, processing_progress, processing_total, created_at")
        .eq("user_id", user_id)
        .order("created_at", desc=True)
        .execute()
        .data
    )


@router.post("/upload")
async def upload_paper(
    paper_file: UploadFile = File(...),
    answer_file: UploadFile | None = File(None),
    course_code: str = Form(...),
    year: int = Form(...),
    term: str = Form(...),
    exam_type: str = Form(...),
    user_id: str = Depends(get_current_user_id),
):
    """Upload a paper PDF (answer PDF optional) and trigger background processing."""
    sb = get_supabase()

    # 1. Read the file contents (already in memory, so this is fast)
    paper_bytes = await paper_file.read()
    answer_bytes = await answer_file.read() if answer_file else None

    # 2. Create the record immediately (status=processing) so the request can return right away
    storage_path = f"{course_code.upper()}/{year}_{term}_{exam_type}"
    paper_record = sb.table("papers").insert({
        "user_id": user_id,
        "course_code": course_code.upper(),
        "year": year,
        "term": term,
        "exam_type": exam_type,
        "paper_file_url": "",  # Updated after the background upload
        "answer_file_url": None,
        "status": "processing",
    }).execute()

    paper_id = paper_record.data[0]["id"]

    # 3. Run in a separate thread so the event loop is never blocked
    threading.Thread(
        target=_upload_and_process_sync,
        args=(paper_id, storage_path, paper_bytes, answer_bytes),
        daemon=True,
    ).start()

    return {
        "paper_id": paper_id,
        "status": "processing",
        "message": "试卷已上传,正在处理中...",
    }


@router.get("/{paper_id}")
async def get_paper(paper_id: str):
    """Get paper info + processing status."""
    sb = get_supabase()
    result = sb.table("papers").select("*").eq("id", paper_id).execute()
    if not result.data:
        raise HTTPException(status_code=404, detail="Paper not found")
    return result.data[0]


@router.get("/{paper_id}/questions")
async def get_questions(paper_id: str):
    """Get all questions for a paper (including the AI trio)."""
    sb = get_supabase()
    result = (
        sb.table("paper_questions")
        .select("*")
        .eq("paper_id", paper_id)
        .order("display_order")
        .execute()
    )
    return result.data
325
backend/app/routers/questions.py
Normal file
@@ -0,0 +1,325 @@
"""Question endpoints: variant generation + similar-question retrieval"""

from __future__ import annotations

import asyncio
import time
from fastapi import APIRouter, HTTPException, Depends
from pydantic import BaseModel
from app.services.supabase_client import get_supabase
from app.services.grader import generate_variant
from app.dependencies.auth import get_current_user_id

# Simple in-memory cache: question_id → (timestamp, result)
_similar_cache: dict[str, tuple[float, list]] = {}
_CACHE_TTL = 300  # 5 minutes


class VariantUpdate(BaseModel):
    favorited: bool | None = None

router = APIRouter()


def normalized_labels(values: list[str] | None) -> dict[str, str]:
    labels: dict[str, str] = {}
    for value in values or []:
        if value:
            labels[value.lower()] = value
    return labels


def question_family(question: dict) -> str:
    return question.get("question_format") or question.get("question_type") or "unknown"


def display_topics(question: dict) -> list[str]:
    labels: list[str] = []
    analytics_topic = question.get("analytics_topic")
    if analytics_topic:
        labels.append(analytics_topic)
    for topic in question.get("topic_tags") or []:
        if topic and topic not in labels:
            labels.append(topic)
    if labels:
        return labels
    for topic in question.get("topics") or []:
        if topic and topic not in labels:
            labels.append(topic)
    return labels


def similarity_score(
    target: dict,
    candidate: dict,
    text_score: float = 0.0,
) -> tuple[int, list[str]]:
    score = 0
    reasons: list[str] = []

    # Primary topic bucket: 40 pts
    target_topic = target.get("analytics_topic")
    candidate_topic = candidate.get("analytics_topic")
    if target_topic and target_topic == candidate_topic:
        score += 40
        reasons.append(f"Same topic: {target_topic}")

    # Concept overlap: up to 20 pts
    target_topics = normalized_labels(target.get("topic_tags"))
    candidate_topics = normalized_labels(candidate.get("topic_tags"))
    shared_topics = sorted(set(target_topics) & set(candidate_topics))
    if shared_topics:
        score += min(len(shared_topics) * 10, 20)
        # Only show concept reason if analytics_topic didn't already match (avoid redundancy)
        if not (target_topic and target_topic == candidate_topic):
            reasons.append(
                "Shared concept: "
                + ", ".join(target_topics[key] for key in shared_topics[:2])
            )

    # Skill overlap: up to 20 pts
    target_skills = normalized_labels(target.get("skill_tags"))
    candidate_skills = normalized_labels(candidate.get("skill_tags"))
    shared_skills = sorted(set(target_skills) & set(candidate_skills))
    if shared_skills:
        score += min(len(shared_skills) * 10, 20)
        reasons.append(
            "Shared skill: "
            + ", ".join(target_skills[key] for key in shared_skills[:2])
        )

    # Same question format: 10 pts
    if question_family(candidate) == question_family(target):
        score += 10
        reasons.append("Same format")

    # Same difficulty: 5 pts
    if candidate.get("difficulty") and candidate.get("difficulty") == target.get("difficulty"):
        score += 5
        reasons.append("Same difficulty")

    # Full-text similarity from PostgreSQL ts_rank_cd: up to 20 pts
    if text_score > 0:
        text_pts = min(round(text_score * 60), 20)
        score += text_pts
        if text_pts >= 4:
            reasons.append("Similar wording")

    return min(score, 99), reasons


@router.get("/variants/favorited")
|
||||
async def get_favorited_variants(user_id: str = Depends(get_current_user_id)):
|
||||
"""Return all variants the user has favorited (used by the Error Book)."""
|
||||
sb = get_supabase()
|
||||
rows = (
|
||||
sb.table("question_variants")
|
||||
.select("*, paper_questions(question_number, paper_id, papers(id, course_code, year, term, exam_type, part_label))")
|
||||
.eq("user_id", user_id)
|
||||
.eq("favorited", True)
|
||||
.order("created_at", desc=True)
|
||||
.execute()
|
||||
.data
|
||||
)
|
||||
return rows
|
||||
|
||||
|
||||
@router.post("/{question_id}/variant")
|
||||
async def create_variant(question_id: str, user_id: str = Depends(get_current_user_id)):
|
||||
"""Generate a variant question and store it in the database."""
|
||||
sb = get_supabase()
|
||||
result = sb.table("paper_questions").select("*").eq("id", question_id).execute()
|
||||
if not result.data:
|
||||
raise HTTPException(status_code=404, detail="Question not found")
|
||||
|
||||
question = result.data[0]
|
||||
variant_data = await asyncio.to_thread(generate_variant, question)
|
||||
variant_data["knowledge_reminder"] = question.get("knowledge_reminder", "")
|
||||
|
||||
saved = sb.table("question_variants").insert({
|
||||
"user_id": user_id,
|
||||
"source_question_id": question_id,
|
||||
"variant_data": variant_data,
|
||||
"favorited": False,
|
||||
}).execute()
|
||||
|
||||
row = saved.data[0]
|
||||
row["source_question_number"] = question["question_number"]
|
||||
return row
|
||||
|
||||
|
||||
@router.get("/{question_id}/variants")
|
||||
async def list_variants(question_id: str, user_id: str = Depends(get_current_user_id)):
|
||||
"""Return all of the user's variants for a given question."""
|
||||
sb = get_supabase()
|
||||
q_result = sb.table("paper_questions").select("question_number").eq("id", question_id).execute()
|
||||
question_number = q_result.data[0]["question_number"] if q_result.data else ""
|
||||
|
||||
rows = (
|
||||
sb.table("question_variants")
|
||||
.select("*")
|
||||
.eq("user_id", user_id)
|
||||
.eq("source_question_id", question_id)
|
||||
.order("created_at", desc=True)
|
||||
.execute()
|
||||
.data
|
||||
)
|
||||
for row in rows:
|
||||
row["source_question_number"] = question_number
|
||||
return rows
|
||||
|
||||
|
||||
@router.patch("/variant/{variant_id}")
|
||||
async def update_variant(variant_id: str, data: VariantUpdate, user_id: str = Depends(get_current_user_id)):
|
||||
"""Update a variant (favorite / unfavorite)."""
|
||||
sb = get_supabase()
|
||||
update: dict = {}
|
||||
if data.favorited is not None:
|
||||
update["favorited"] = data.favorited
|
||||
if not update:
|
||||
raise HTTPException(status_code=400, detail="Nothing to update")
|
||||
|
||||
result = (
|
||||
sb.table("question_variants")
|
||||
.update(update)
|
||||
.eq("id", variant_id)
|
||||
.eq("user_id", user_id)
|
||||
.execute()
|
||||
)
|
||||
if not result.data:
|
||||
raise HTTPException(status_code=404, detail="Variant not found")
|
||||
return result.data[0]
|
||||
|
||||
|
||||
@router.delete("/variant/{variant_id}", status_code=204)
|
||||
async def delete_variant(variant_id: str, user_id: str = Depends(get_current_user_id)):
|
||||
"""Delete a variant."""
|
||||
sb = get_supabase()
|
||||
sb.table("question_variants").delete().eq("id", variant_id).eq("user_id", user_id).execute()
|
||||
|
||||
|
||||
@router.get("/{question_id}/similar")
|
||||
async def get_similar_questions(question_id: str, limit: int = 6):
|
||||
"""Retrieve similar questions from the same course."""
|
||||
# Cache hit
|
||||
cached = _similar_cache.get(question_id)
|
||||
if cached and (time.time() - cached[0]) < _CACHE_TTL:
|
||||
return cached[1][:max(1, min(limit, 12))]
|
||||
|
||||
sb = get_supabase()
|
||||
result = sb.table("paper_questions").select("*, similar_questions").eq("id", question_id).execute()
|
||||
if not result.data:
|
||||
raise HTTPException(status_code=404, detail="Question not found")
|
||||
|
||||
target = result.data[0]
|
||||
|
||||
# Return pre-computed results immediately and warm the in-memory cache
|
||||
if target.get("similar_questions"):
|
||||
precomputed = target["similar_questions"]
|
||||
_similar_cache[question_id] = (time.time(), precomputed)
|
||||
return precomputed[:max(1, min(limit, 12))]
|
||||
|
||||
# Fallback: compute on the fly for questions not yet backfilled
|
||||
paper_result = sb.table("papers").select("id, course_code").eq("id", target["paper_id"]).execute()
|
||||
if not paper_result.data:
|
||||
raise HTTPException(status_code=404, detail="Paper not found")
|
||||
|
||||
course_code = paper_result.data[0]["course_code"]
|
||||
papers = (
|
||||
sb.table("papers")
|
||||
.select("id, course_code, year, term, exam_type, part_label")
|
||||
.eq("course_code", course_code)
|
||||
.eq("status", "ready")
|
||||
.execute()
|
||||
.data
|
||||
)
|
||||
paper_ids = [paper["id"] for paper in papers if paper["id"] != target["paper_id"]]
|
||||
if not paper_ids:
|
||||
return []
|
||||
|
||||
papers_by_id = {paper["id"]: paper for paper in papers}
|
||||
|
||||
# Pre-filter by analytics_topic in DB when possible (cuts candidates from ~250 to ~30)
|
||||
candidates_query = (
|
||||
sb.table("paper_questions")
|
||||
.select(
|
||||
"id, paper_id, question_number, question_type, question_format, "
|
||||
"question_text, score, topics, analytics_topic, topic_tags, skill_tags, "
|
||||
"difficulty, knowledge_reminder, ai_hint, solution"
|
||||
)
|
||||
.in_("paper_id", paper_ids)
|
||||
)
|
||||
target_topic = target.get("analytics_topic")
|
||||
if target_topic:
|
||||
candidates_query = candidates_query.eq("analytics_topic", target_topic)
|
||||
|
||||
candidates = candidates_query.execute().data
|
||||
if not candidates:
|
||||
return []
|
||||
|
||||
# Batch full-text scores from PostgreSQL (skip if too many candidates — slow)
|
||||
text_scores: dict[str, float] = {}
|
||||
if len(candidates) <= 50:
|
||||
try:
|
||||
rpc_result = sb.rpc(
|
||||
"text_similarity_scores",
|
||||
{
|
||||
"query_text": target.get("question_text") or "",
|
||||
"candidate_ids": [c["id"] for c in candidates],
|
||||
},
|
||||
).execute()
|
||||
for row in rpc_result.data or []:
|
||||
text_scores[row["question_id"]] = float(row["text_score"] or 0)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
ranked = []
|
||||
for candidate in candidates:
|
||||
text_score = text_scores.get(candidate["id"], 0.0)
|
||||
match_percent, reasons = similarity_score(target, candidate, text_score)
|
||||
if match_percent < 20:
|
||||
continue
|
||||
paper = papers_by_id.get(candidate["paper_id"], {})
|
||||
source = (
|
||||
f"{paper.get('year', '')} {paper.get('term', '').title()} "
|
||||
f"{paper.get('exam_type', '').title()}"
|
||||
).strip()
|
||||
if paper.get("part_label"):
|
||||
source = f"{source} Part {paper['part_label']}"
|
||||
ranked.append(
|
||||
{
|
||||
"id": candidate["id"],
|
||||
"paper_id": candidate["paper_id"],
|
||||
"source": source,
|
||||
"question_number": candidate["question_number"],
|
||||
"match_percent": match_percent,
|
||||
"match_reasons": reasons,
|
||||
"question_type": question_family(candidate),
|
||||
"question_text": candidate["question_text"],
|
||||
"topics": display_topics(candidate),
|
||||
"difficulty": candidate.get("difficulty"),
|
||||
"knowledge_reminder": candidate.get("knowledge_reminder", ""),
|
||||
"ai_hint": candidate.get("ai_hint", ""),
|
||||
"solution": candidate.get("solution", ""),
|
||||
}
|
||||
)
|
||||
|
||||
ranked.sort(key=lambda item: (-item["match_percent"], item["source"], item["question_number"]))
|
||||
|
||||
# Keep only the best-scoring question per paper
|
||||
seen_papers: set[str] = set()
|
||||
deduped = []
|
||||
for item in ranked:
|
||||
if item["paper_id"] not in seen_papers:
|
||||
seen_papers.add(item["paper_id"])
|
||||
deduped.append(item)
|
||||
|
||||
_similar_cache[question_id] = (time.time(), deduped)
|
||||
|
||||
# Persist to DB so future requests are instant
|
||||
try:
|
||||
sb.table("paper_questions").update({"similar_questions": deduped}).eq("id", question_id).execute()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
return deduped[:max(1, min(limit, 12))]
|
||||
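The point arithmetic at the end of `similarity_score` and the `limit` clamp in `get_similar_questions` can be checked in isolation. The following is a minimal sketch, not code from the repository: `text_points` and `clamp_limit` are hypothetical helper names that only restate the two expressions used above.

```python
def text_points(text_score: float) -> int:
    """Map a full-text rank to 0..20 points (same formula as similarity_score)."""
    if text_score <= 0:
        return 0
    return min(round(text_score * 60), 20)


def clamp_limit(limit: int) -> int:
    """Clamp a requested result count to the 1..12 range used by the endpoint."""
    return max(1, min(limit, 12))


if __name__ == "__main__":
    for raw in (0.0, 0.07, 0.1, 0.5):
        print(raw, text_points(raw))
    print(clamp_limit(0), clamp_limit(6), clamp_limit(50))
```

Under this formula a text score of 0.07 already reaches the 4-point threshold that adds the "Similar wording" reason, and scores from roughly 0.33 upward saturate at 20 points.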
0
backend/app/services/__init__.py
Normal file
146
backend/app/services/grader.py
Normal file
@@ -0,0 +1,146 @@
|
||||
"""OCR, grading, and variant generation prompts"""
|
||||
|
||||
import json
|
||||
import base64
|
||||
from app.services.llm_clients import get_vision_client, get_deepseek_client
|
||||
|
||||
OCR_PROMPT = """You are an expert at recognizing handwritten answers. Analyze this photo of a student's handwritten answer and extract the text and mathematical formulas.
|
||||
|
||||
Requirements:
|
||||
- Faithfully extract what the student wrote, do not modify or correct
|
||||
- Use LaTeX format for math formulas (e.g. $x^2 + 1$)
|
||||
- If there are multiple steps, list them in original order
|
||||
- If some handwriting is unclear, mark with [unclear]
|
||||
|
||||
Return only the extracted text, no additional explanation."""
|
||||
|
||||
GRADING_PROMPT = """You are an expert academic grader. Grade the following student answer. ALL output must be in English.
|
||||
|
||||
Question info:
|
||||
- Number: {question_number}
|
||||
- Type: {question_type}
|
||||
- Question: {question_text}
|
||||
- Score: {score}
|
||||
|
||||
Reference answer / solution:
|
||||
{reference_answer}
|
||||
|
||||
Student answer:
|
||||
{student_answer}
|
||||
|
||||
Grade and return JSON:
|
||||
{{
|
||||
"is_correct": true/false,
|
||||
"score_given": 0-{score},
|
||||
"feedback": "<HTML> Step-by-step analysis of the student's answer, pointing out correct parts and errors, using KaTeX formulas </HTML>",
|
||||
"error_at_step": null or the step number where errors begin (integer)
|
||||
}}
|
||||
|
||||
Grading rules:
|
||||
- MC / fill-blank: only correct if answer matches exactly
|
||||
- Long questions: give partial credit for correct steps even if the final answer is wrong
|
||||
- feedback in HTML format, supports KaTeX ($...$ inline, $$...$$ block)
|
||||
- Mark errors with <div class="common-error">...</div>
|
||||
- Identify exactly the step at which the error starts"""
|
||||
|
||||
VARIANT_PROMPT = """You are an expert exam question creator. Generate a similar but different variant question based on the original below. ALL output must be in English.
|
||||
|
||||
Original question info:
|
||||
- Type: {question_type}
|
||||
- Question: {question_text}
|
||||
- Topics: {topics}
|
||||
- Difficulty: {difficulty}
|
||||
- Reference answer: {answer}
|
||||
|
||||
Requirements:
|
||||
- Variant must test the same knowledge points at similar difficulty
|
||||
- Data/scenario/wording must differ — don't just change numbers
|
||||
- Must provide a complete correct answer
|
||||
|
||||
Format requirements (CRITICAL):
|
||||
- All text in HTML format, absolutely NO markdown syntax
|
||||
- Code: <pre><code class="language-xxx">...</code></pre>, NOT ```
|
||||
- Math: $...$ (inline) or $$...$$ (block), KaTeX compatible
|
||||
- Line breaks: <br>, paragraphs: <p>
|
||||
|
||||
Return JSON:
|
||||
{{
|
||||
"question_text": "HTML formatted variant question",
|
||||
"question_type": "{question_type}",
|
||||
"options": [MC only, format {{"label":"A","text":"..."}}, ...] or null,
|
||||
"correct_answer": "Correct answer (plain text)",
|
||||
"ai_hint": "HTML formatted hint that guides thinking WITHOUT giving the answer",
|
||||
"solution": "HTML formatted complete step-by-step solution"
|
||||
}}"""
|
||||
|
||||
|
||||
def ocr_photo(photo_bytes: bytes) -> str:
|
||||
"""Gemini Vision OCR for handwritten answers"""
|
||||
client = get_vision_client()
|
||||
b64 = base64.b64encode(photo_bytes).decode("utf-8")
|
||||
|
||||
resp = client.chat.completions.create(
|
||||
model="gemini-2.5-flash",
|
||||
messages=[
|
||||
{"role": "system", "content": OCR_PROMPT},
|
||||
{"role": "user", "content": [
|
||||
{"type": "image_url", "image_url": {
|
||||
"url": f"data:image/jpeg;base64,{b64}",
|
||||
}},
|
||||
]},
|
||||
],
|
||||
temperature=0,
|
||||
max_tokens=2000,
|
||||
)
|
||||
return resp.choices[0].message.content or ""
|
||||
|
||||
|
||||
def grade_answer(question: dict, student_answer: str) -> dict:
|
||||
"""Grade a student answer with DeepSeek (deepseek-chat)."""
|
||||
reference = question.get("raw_answer_text") or question.get("solution") or "No reference answer"
|
||||
score = question.get("score") or "unknown"
|
||||
|
||||
ds = get_deepseek_client()
|
||||
resp = ds.chat.completions.create(
|
||||
model="deepseek-chat",
|
||||
messages=[
|
||||
{"role": "system", "content": GRADING_PROMPT.format(
|
||||
question_number=question["question_number"],
|
||||
question_type=question["question_type"],
|
||||
question_text=question["question_text"],
|
||||
score=score,
|
||||
reference_answer=reference,
|
||||
student_answer=student_answer,
|
||||
)},
|
||||
],
|
||||
temperature=0.2,
|
||||
response_format={"type": "json_object"},
|
||||
)
|
||||
return json.loads(resp.choices[0].message.content)
|
||||
|
||||
|
||||
def generate_variant(question: dict) -> dict:
|
||||
"""Generate a variant question with DeepSeek (deepseek-chat)."""
|
||||
answer = (
|
||||
question.get("correct_option")
|
||||
or question.get("correct_answer")
|
||||
or question.get("raw_answer_text")
|
||||
or "N/A"
|
||||
)
|
||||
|
||||
ds = get_deepseek_client()
|
||||
resp = ds.chat.completions.create(
|
||||
model="deepseek-chat",
|
||||
messages=[
|
||||
{"role": "system", "content": VARIANT_PROMPT.format(
|
||||
question_type=question["question_type"],
|
||||
question_text=question["question_text"],
|
||||
topics=", ".join(question.get("topics") or []),
|
||||
difficulty=question.get("difficulty", "medium"),
|
||||
answer=answer,
|
||||
)},
|
||||
],
|
||||
temperature=0.5,
|
||||
response_format={"type": "json_object"},
|
||||
)
|
||||
return json.loads(resp.choices[0].message.content)
|
||||
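The prompt templates above are filled with `str.format`, which is why literal JSON braces are doubled (`{{` / `}}`) while placeholders such as `{score}` use single braces. A minimal sketch of that behavior follows; `DEMO_PROMPT` and `render_demo` are hypothetical names used only for illustration and do not exist in the repository.

```python
import json

# Hypothetical miniature of GRADING_PROMPT: doubled braces come out of
# .format() as literal braces, single-brace names are substituted.
DEMO_PROMPT = (
    "Student answer: {student_answer}\n"
    'Return JSON: {{"score_given": 0, "max": {score}}}'
)


def render_demo(score: int, student_answer: str) -> str:
    """Fill the demo template the same way the real prompts are filled."""
    return DEMO_PROMPT.format(score=score, student_answer=student_answer)


if __name__ == "__main__":
    rendered = render_demo(5, "f(x) = {x}")
    print(rendered)
    # The JSON skeleton in the rendered prompt is valid JSON again.
    print(json.loads(rendered.split("Return JSON: ", 1)[1]))
```

Substituted values are not re-parsed by `str.format`, so braces inside a student answer or question text pass through unchanged.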
74
backend/app/services/llm_clients.py
Normal file
@@ -0,0 +1,74 @@
|
||||
import httpx
|
||||
from openai import OpenAI
|
||||
from app.config import get_settings
|
||||
|
||||
_TIMEOUT = httpx.Timeout(connect=10, read=300, write=60, pool=10)
|
||||
|
||||
_gpt_client: OpenAI | None = None
|
||||
_qwen_client: OpenAI | None = None
|
||||
_gemini_flash_client: OpenAI | None = None
|
||||
_gemini_lite_client: OpenAI | None = None
|
||||
_deepseek_client: OpenAI | None = None
|
||||
|
||||
|
||||
def get_gpt_client() -> OpenAI:
|
||||
"""laozhang API — gpt-4o / gpt-4o-mini"""
|
||||
global _gpt_client
|
||||
if _gpt_client is None:
|
||||
s = get_settings()
|
||||
_gpt_client = OpenAI(
|
||||
base_url=s.laozhang_base_url,
|
||||
api_key=s.laozhang_api_key,
|
||||
)
|
||||
return _gpt_client
|
||||
|
||||
|
||||
def get_qwen_client() -> OpenAI:
|
||||
"""DashScope — qwen-plus"""
|
||||
global _qwen_client
|
||||
if _qwen_client is None:
|
||||
s = get_settings()
|
||||
_qwen_client = OpenAI(
|
||||
base_url=s.dashscope_base_url,
|
||||
api_key=s.dashscope_api_key,
|
||||
)
|
||||
return _qwen_client
|
||||
|
||||
|
||||
def get_vision_client() -> OpenAI:
|
||||
"""Official Google Gemini API (vision; used for question splitting and OCR). Reachable when deployed in Singapore."""
|
||||
global _gemini_flash_client
|
||||
if _gemini_flash_client is None:
|
||||
s = get_settings()
|
||||
_gemini_flash_client = OpenAI(
|
||||
base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
|
||||
api_key=s.google_gemini_api_key,
|
||||
timeout=_TIMEOUT,
|
||||
)
|
||||
return _gemini_flash_client
|
||||
|
||||
|
||||
def get_gemini_lite_client() -> OpenAI:
|
||||
"""laozhang: gemini-3.1-flash-lite-preview (lightweight; used for the AI trio)."""
|
||||
global _gemini_lite_client
|
||||
if _gemini_lite_client is None:
|
||||
s = get_settings()
|
||||
_gemini_lite_client = OpenAI(
|
||||
base_url=s.laozhang_base_url,
|
||||
api_key=s.laozhang_api_key,
|
||||
timeout=_TIMEOUT,
|
||||
)
|
||||
return _gemini_lite_client
|
||||
|
||||
|
||||
def get_deepseek_client() -> OpenAI:
|
||||
"""DeepSeek: deepseek-chat (used for the AI trio)."""
|
||||
global _deepseek_client
|
||||
if _deepseek_client is None:
|
||||
s = get_settings()
|
||||
_deepseek_client = OpenAI(
|
||||
base_url=s.deepseek_base_url,
|
||||
api_key=s.deepseek_api_key,
|
||||
timeout=_TIMEOUT,
|
||||
)
|
||||
return _deepseek_client
|
||||
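Each getter above follows the same lazy-singleton pattern: a module-level variable starts as `None` and the client is built on first use. As a hedged alternative sketch (not the repository's approach), `functools.lru_cache` gives the same build-once behavior without `global`. `FakeClient` and `get_fake_client` are hypothetical stand-ins so the example runs without credentials or the `openai` package.

```python
from functools import lru_cache


class FakeClient:
    """Stand-in for openai.OpenAI so the sketch runs without credentials."""

    def __init__(self, base_url: str, api_key: str) -> None:
        self.base_url = base_url
        self.api_key = api_key


@lru_cache(maxsize=None)
def get_fake_client() -> FakeClient:
    # Built on the first call only; later calls return the cached instance,
    # mirroring the `if _client is None:` check in the getters above.
    return FakeClient(base_url="https://example.invalid/v1", api_key="placeholder")


if __name__ == "__main__":
    print(get_fake_client() is get_fake_client())
```

The explicit-global version keeps each client's settings lookup visible at the call site; the cached version is shorter but hides the singleton behind the decorator.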
576
backend/app/services/paper_processor.py
Normal file
@@ -0,0 +1,576 @@
|
||||
"""Paper processing pipeline: PDF → structured questions → AI trio (Vision mode)"""
|
||||
|
||||
import asyncio
|
||||
import base64
|
||||
import io
|
||||
import json
|
||||
import re
|
||||
import traceback
|
||||
from contextlib import redirect_stdout
|
||||
import fitz # pymupdf
|
||||
from app.services.supabase_client import get_supabase
|
||||
from app.services.llm_clients import get_vision_client, get_deepseek_client
|
||||
|
||||
|
||||
def strip_nulls(obj):
|
||||
"""Recursively remove \\u0000 null bytes from strings (PostgreSQL rejects them)."""
|
||||
if isinstance(obj, str):
|
||||
return obj.replace("\u0000", "")
|
||||
if isinstance(obj, dict):
|
||||
return {k: strip_nulls(v) for k, v in obj.items()}
|
||||
if isinstance(obj, list):
|
||||
return [strip_nulls(i) for i in obj]
|
||||
return obj
|
||||
|
||||
|
||||
# ============================================
|
||||
# Prompts
|
||||
# ============================================
|
||||
|
||||
STRUCTURE_PROMPT = """You are an expert exam paper structure analyst. You are given images of a past exam paper. Analyze every page carefully and extract all questions into structured JSON.
|
||||
All generated values must be in English. Do not output Chinese.
|
||||
|
||||
CRITICAL RULES for question_text:
|
||||
- Each question's question_text must be FULLY SELF-CONTAINED. Include ALL context needed to solve it.
|
||||
- For sub-questions (e.g. (a)(i)), copy the ENTIRE parent question setup (variable definitions, code blocks, problem description) into the question_text, then append the specific sub-question.
|
||||
- For Python/code questions: include ALL variable definitions and import statements verbatim, exactly as they appear in the exam, preserving multi-line arrays and data structures completely.
|
||||
- Never truncate code. If a variable is defined across multiple lines (e.g. a numpy array), include every line.
|
||||
|
||||
Output JSON format (strictly follow):
|
||||
{
|
||||
"total_score": 100,
|
||||
"difficulty_level": "medium",
|
||||
"topics_summary": {"Topic A": 40, "Topic B": 30, "Topic C": 30},
|
||||
"questions": [
|
||||
{
|
||||
"question_number": "1a",
|
||||
"parent_question": "1",
|
||||
"question_type": "mc",
|
||||
"question_text": "Original question text...",
|
||||
"score": 5,
|
||||
"page_number": 1,
|
||||
"options": [{"label": "A", "text": "Option content"}, {"label": "B", "text": "..."}],
|
||||
"topics": ["Linked List", "Pointer"],
|
||||
"difficulty": "easy"
|
||||
},
|
||||
{
|
||||
"question_number": "2",
|
||||
"parent_question": null,
|
||||
"question_type": "long_question",
|
||||
"question_text": "Original question text...",
|
||||
"score": 15,
|
||||
"page_number": 2,
|
||||
"options": null,
|
||||
"topics": ["Recursion"],
|
||||
"difficulty": "hard"
|
||||
}
|
||||
]
|
||||
}
|
||||
|
||||
Rules:
|
||||
- question_type must be one of: "mc" (multiple choice), "true_false" (true/false), "fill_blank" (fill in blank), "long_question" (long question)
|
||||
- True/False questions MUST use "true_false" type, with options set to [{"label":"True","text":"True"},{"label":"False","text":"False"}], correct_option as "True" or "False"
|
||||
- Multiple choice must extract the options array
|
||||
- Sub-questions use parent_question to link to parent: "1a" parent is "1"
|
||||
- Independent questions without sub-questions set parent_question to null
|
||||
- page_number inferred from where the question appears
|
||||
- topics inferred from the question content
|
||||
- difficulty: "easy" | "medium" | "hard"
|
||||
- Extract ALL questions, do not miss any
|
||||
- Keep topic labels in English only
|
||||
"""
|
||||
|
||||
ANSWER_MATCH_PROMPT = """You are an expert exam answer matching specialist. Below is the answer text for an exam paper. Extract and match answers to their corresponding question numbers.
|
||||
All generated values must be in English. Do not output Chinese.
|
||||
|
||||
Question structure:
|
||||
{questions_json}
|
||||
|
||||
Answer text:
|
||||
{answer_text}
|
||||
|
||||
Output JSON format:
|
||||
{{
|
||||
"answers": [
|
||||
{{
|
||||
"question_number": "1a",
|
||||
"correct_option": "B",
|
||||
"correct_answer": null,
|
||||
"raw_answer_text": "Original answer text..."
|
||||
}},
|
||||
{{
|
||||
"question_number": "2",
|
||||
"correct_option": null,
|
||||
"correct_answer": null,
|
||||
"raw_answer_text": "Complete solution process and answer..."
|
||||
}}
|
||||
]
|
||||
}}
|
||||
|
||||
Rules:
|
||||
- For MC questions, fill correct_option (e.g. "B")
|
||||
- For fill-blank questions, fill correct_answer (e.g. "O(n log n)")
|
||||
- For long questions, only fill raw_answer_text (complete solution process)
|
||||
- Match all questions where answers can be found
|
||||
- Keep raw_answer_text faithful to the source answer, but do not add Chinese commentary
|
||||
"""
|
||||
|
||||
ANALYSIS_PROMPT = """You are an expert academic answer analyst. Generate three sections for the following exam question. ALL output must be in English.
|
||||
|
||||
Question info:
|
||||
- Number: {question_number}
|
||||
- Type: {question_type}
|
||||
- Score: {score}
|
||||
- Question: {question_text}
|
||||
- Topics: {topics}
|
||||
{answer_section}
|
||||
|
||||
Generate THREE sections in HTML format (supports KaTeX: block $$ ... $$ inline $ ... $):
|
||||
|
||||
Output JSON:
|
||||
{{
|
||||
"knowledge_reminder": "<HTML> Prerequisite knowledge points needed for this question, as a concise bullet list </HTML>",
|
||||
"ai_hint": "<HTML> A hint that guides thinking direction WITHOUT giving away the answer </HTML>",
|
||||
"solution": "<HTML> Complete step-by-step solution (Step 1, Step 2, ...) with derivations, formulas, and common mistake warnings </HTML>"
|
||||
}}
|
||||
|
||||
Solution requirements:
|
||||
- Must include complete working process, not just the answer
|
||||
- Each step must have an explanation
|
||||
- If a reference answer is provided, derive the solution based on it
|
||||
- If no reference answer, work out the complete solution independently
|
||||
- For MC questions, explain why the correct option is right AND why others are wrong
|
||||
- Use <ol> or numbered steps
|
||||
- Mark common mistakes with <div class="common-error">...</div>
|
||||
|
||||
KaTeX formula rules:
|
||||
- Block formula: $$ on its own line, with blank lines before and after
|
||||
- Inline formula: $x^2$ no line break
|
||||
- Matrix: \\begin{{bmatrix}} ... \\end{{bmatrix}}
|
||||
- Fraction: \\frac{{a}}{{b}}
|
||||
"""
|
||||
|
||||
BATCH_ANALYSIS_PROMPT = """You are an expert academic answer analyst. Generate three study sections for each question below. ALL output must be in English.
|
||||
|
||||
For every question, return:
|
||||
- knowledge_reminder: concise prerequisite bullets in HTML
|
||||
- ai_hint: a helpful hint in HTML without revealing the final answer
|
||||
- solution: a complete step-by-step solution in HTML
|
||||
|
||||
Return JSON in this exact format:
|
||||
{{
|
||||
"analyses": [
|
||||
{{
|
||||
"question_number": "1a",
|
||||
"knowledge_reminder": "<HTML>...</HTML>",
|
||||
"ai_hint": "<HTML>...</HTML>",
|
||||
"solution": "<HTML>...</HTML>"
|
||||
}}
|
||||
]
|
||||
}}
|
||||
|
||||
Rules:
|
||||
- Return one item for every provided question_number
|
||||
- Keep each item matched to the same question_number
|
||||
- All text must be in English
|
||||
- HTML only, KaTeX compatible
|
||||
- For MC questions, explain why the correct option is right and why the others are wrong
|
||||
- For long questions, show a complete derivation or reasoning chain
|
||||
- Use <ol> or numbered steps in solution when appropriate
|
||||
- Mark common mistakes with <div class="common-error">...</div>
|
||||
- CRITICAL: When a question_text contains "[Context from parent question X]" followed by "[Sub-question Y]", the parent section is background context only. You MUST solve ONLY the specific sub-question labeled [Sub-question Y]. Do NOT solve other sub-questions listed in the parent context. Give one precise answer for that single sub-question only.
|
||||
|
||||
Questions:
|
||||
{questions_payload}
|
||||
"""
|
||||
|
||||
|
||||
# ============================================
|
||||
# Processing pipeline
|
||||
# ============================================
|
||||
|
||||
RETRYABLE_ERROR_MARKERS = (
|
||||
"429",
|
||||
"rate limit",
|
||||
"rate_limit",
|
||||
"too many requests",
|
||||
"timeout",
|
||||
"timed out",
|
||||
"connection",
|
||||
)
|
||||
|
||||
|
||||
def is_retryable_error(exc: Exception) -> bool:
|
||||
message = str(exc).lower()
|
||||
return any(marker in message for marker in RETRYABLE_ERROR_MARKERS)
|
||||
|
||||
|
||||
def pdf_to_images(pdf_bytes: bytes, dpi: int = 96) -> list[str]:
|
||||
"""Render each PDF page to a base64-encoded PNG and return the list (96 dpi balances legibility and cost)."""
|
||||
doc = fitz.open(stream=pdf_bytes, filetype="pdf")
|
||||
images = []
|
||||
mat = fitz.Matrix(dpi / 72, dpi / 72)
|
||||
for page in doc:
|
||||
pix = page.get_pixmap(matrix=mat, colorspace=fitz.csRGB)
|
||||
img_bytes = pix.tobytes("png")
|
||||
images.append(base64.b64encode(img_bytes).decode())
|
||||
doc.close()
|
||||
return images
|
||||
|
||||
|
||||
def parse_json_response(text: str) -> dict:
|
||||
"""Parse JSON returned by the model, tolerating a markdown code-fence wrapper."""
|
||||
text = text.strip()
|
||||
# Strip a ```json ... ``` wrapper
|
||||
if text.startswith("```"):
|
||||
lines = text.splitlines()
|
||||
text = "\n".join(lines[1:-1] if lines[-1].strip() == "```" else lines[1:])
|
||||
# Remove illegal control characters from the JSON text (0x00-0x1F except \t, \n, \r)
|
||||
text = re.sub(r'[\x00-\x08\x0b\x0c\x0e-\x1f]', '', text)
|
||||
# Repair invalid JSON escape sequences from the model: only fix an illegal character that follows an odd number of backslashes
|
||||
text = re.sub(r'(?<!\\)((?:\\\\)*)\\([^"\\/bfnrtu])', r'\1\\\\\2', text)
|
||||
return json.loads(text)
|
||||
|
||||
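The escape-repair substitution in `parse_json_response` is easiest to follow on a concrete string. The sketch below isolates that one regex under the hypothetical names `_BAD_ESCAPE` and `repair_escapes`; the pattern and replacement are copied from the function above, everything else is illustrative.

```python
import json
import re

# A lone backslash (odd run) followed by a character that is not a legal
# JSON escape gets doubled; already-doubled backslashes are left alone.
_BAD_ESCAPE = re.compile(r'(?<!\\)((?:\\\\)*)\\([^"\\/bfnrtu])')


def repair_escapes(text: str) -> str:
    """Double the backslash of any invalid JSON escape sequence."""
    return _BAD_ESCAPE.sub(r'\1\\\\\2', text)


if __name__ == "__main__":
    raw = r'{"s": "\alpha + \\gamma"}'  # \a is not a valid JSON escape
    fixed = repair_escapes(raw)
    print(fixed)
    print(json.loads(fixed)["s"])
```

One limitation worth keeping in mind: commands whose first letter is itself a legal JSON escape (`b`, `f`, `n`, `r`, `t`, `u`, as in `\frac` or `\beta`) are not touched by this pattern, so a single-backslash `\frac` would still decode as a form feed followed by `rac`.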
|
||||
async def gemini_vision_json(
|
||||
*,
|
||||
system_prompt: str,
|
||||
images: list[str],
|
||||
user_text: str = "",
|
||||
temperature: float = 0,
|
||||
max_attempts: int = 6,
|
||||
) -> dict:
|
||||
"""Send images plus a prompt to the Gemini vision model and return the parsed JSON."""
|
||||
client = get_vision_client()
|
||||
delay_seconds = 2
|
||||
|
||||
content: list = []
|
||||
for b64 in images:
|
||||
content.append({"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}})
|
||||
if user_text:
|
||||
content.append({"type": "text", "text": user_text})
|
||||
|
||||
for attempt in range(1, max_attempts + 1):
|
||||
try:
|
||||
response = client.chat.completions.create(
|
||||
model="gemini-2.5-flash",
|
||||
messages=[
|
||||
{"role": "system", "content": system_prompt + "\n\nIMPORTANT: Your entire response must be valid JSON only. No markdown, no code fences, no extra text."},
|
||||
{"role": "user", "content": content},
|
||||
],
|
||||
temperature=temperature,
|
||||
max_tokens=16384,
|
||||
)
|
||||
return parse_json_response(response.choices[0].message.content)
|
||||
except Exception as exc:
|
||||
if attempt == max_attempts or not is_retryable_error(exc):
|
||||
raise
|
||||
await asyncio.sleep(delay_seconds)
|
||||
delay_seconds = min(delay_seconds * 2, 30)
|
||||
|
||||
|
||||
async def deepseek_json_completion(
|
||||
*,
|
||||
system_prompt: str,
|
||||
user_prompt: str | None = None,
|
||||
temperature: float = 0,
|
||||
max_attempts: int = 6,
|
||||
) -> dict:
|
||||
"""DeepSeek text-only JSON completion (used to generate the AI trio)."""
|
||||
client = get_deepseek_client()
|
||||
delay_seconds = 2
|
||||
|
||||
for attempt in range(1, max_attempts + 1):
|
||||
try:
|
||||
messages = [{"role": "system", "content": system_prompt}]
|
||||
if user_prompt:
|
||||
messages.append({"role": "user", "content": user_prompt})
|
||||
|
||||
response = client.chat.completions.create(
|
||||
model="deepseek-chat",
|
||||
messages=messages,
|
||||
temperature=temperature,
|
||||
max_tokens=8192,
|
||||
response_format={"type": "json_object"},
|
||||
)
|
||||
raw = response.choices[0].message.content
|
||||
raw = re.sub(r'[\x00-\x08\x0b\x0c\x0e-\x1f]', '', raw)
|
||||
raw = re.sub(r'(?<!\\)((?:\\\\)*)\\([^"\\/bfnrtu])', r'\1\\\\\2', raw)
|
||||
return json.loads(raw)
|
||||
except Exception as exc:
|
||||
if attempt == max_attempts or not is_retryable_error(exc):
|
||||
raise
|
||||
await asyncio.sleep(delay_seconds)
|
||||
delay_seconds = min(delay_seconds * 2, 30)
|
||||
|
||||
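Both completion helpers above retry with the same capped exponential backoff: start at 2 seconds, double after each retryable failure, never exceed 30 seconds. The sketch below only reproduces that schedule; `backoff_delays` is a hypothetical helper, not part of the module.

```python
def backoff_delays(max_attempts: int = 6, start: int = 2, cap: int = 30) -> list[int]:
    """Return the sleep durations between attempts for the retry loop above."""
    delays: list[int] = []
    delay = start
    # A sleep happens after every failed attempt except the last one,
    # because the final failure is re-raised instead of retried.
    for _ in range(max_attempts - 1):
        delays.append(delay)
        delay = min(delay * 2, cap)
    return delays


if __name__ == "__main__":
    schedule = backoff_delays()
    print(schedule, sum(schedule))
```

With the defaults used in the source (`max_attempts=6`), a call that keeps hitting retryable errors waits a total of 60 seconds across five sleeps before the last exception propagates.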
|
||||
def chunked(items: list[dict], size: int) -> list[list[dict]]:
|
||||
return [items[i:i + size] for i in range(0, len(items), size)]
|
||||
|
||||
|
||||
def _question_sort_key(qnum: str) -> tuple:
|
||||
"""Natural sort key for question numbers: 1a < 1b < ... < 1i < 1j < 2ai < 2aii < 10a"""
|
||||
parts = re.findall(r'(\d+|[a-zA-Z]+|[()])', qnum)
|
||||
key = []
|
||||
for idx, p in enumerate(parts):
|
||||
if p.isdigit():
|
||||
key.append((0, int(p), ''))
|
||||
elif p in ('(', ')'):
|
||||
continue
|
||||
else:
|
||||
# Single letter (a-z): always sort alphabetically (a=1, b=2, ..., j=10)
|
||||
if len(p) == 1 and p.isalpha():
|
||||
key.append((1, ord(p.lower()) - ord('a') + 1, p))
|
||||
else:
|
||||
# Multi-letter: roman numerals for sub-sub-questions (i=1, ii=2, iii=3, ...)
|
||||
romans = {'i':1,'ii':2,'iii':3,'iv':4,'v':5,'vi':6,'vii':7,'viii':8,'ix':9,'x':10,'xi':11,'xii':12,'xiii':13}
|
||||
if p.lower() in romans:
|
||||
key.append((2, romans[p.lower()], p))
|
||||
else:
|
||||
key.append((1, 0, p))
|
||||
return tuple(key)
|
||||
|
||||
|
||||
def sort_questions(questions: list[dict]) -> list[dict]:
|
||||
"""Sort questions naturally by question number."""
|
||||
return sorted(questions, key=lambda q: _question_sort_key(q.get("question_number", "")))
|
||||
|
||||
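The reason for a dedicated sort key is that plain string sorting puts "10a" before "1a" and "2ai". The sketch below is a deliberately simplified version of the idea (digits compare numerically, letter runs compare as text). `natural_key` is a hypothetical name, and it omits the roman-numeral and parenthesis handling of `_question_sort_key`.

```python
import re


def natural_key(qnum: str) -> tuple:
    """Simplified natural sort key: numeric runs as ints, letter runs as text."""
    return tuple(
        (0, int(tok), "") if tok.isdigit() else (1, 0, tok.lower())
        for tok in re.findall(r"\d+|[a-zA-Z]+", qnum)
    )


if __name__ == "__main__":
    numbers = ["10a", "2aii", "1b", "1a", "2ai", "1j", "1i"]
    print(sorted(numbers))                   # lexicographic order
    print(sorted(numbers, key=natural_key))  # natural order
```

Each token becomes a tuple with a type tag first, so numeric and alphabetic tokens never get compared to each other directly.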
|
||||
def extract_code_block(text: str) -> str:
|
||||
"""
|
||||
Extract a Python code block from the question text.
|
||||
Strategy: find the first clear code-start line (import / assignment / print),
|
||||
then collect the continuation lines that follow, up to a clearly non-code paragraph.
|
||||
"""
|
||||
lines = text.splitlines()
|
||||
result = []
|
||||
in_code = False
|
||||
open_brackets = 0
|
||||
|
||||
CODE_START = re.compile(r"^\s*(import |from \w|[A-Za-z_]\w*\s*=|print\()")
|
||||
|
||||
for line in lines:
|
||||
stripped = line.strip()
|
||||
|
||||
# Already inside a code block: keep collecting while brackets remain unclosed
|
||||
if in_code and open_brackets > 0:
|
||||
result.append(stripped)
|
||||
open_brackets += stripped.count("(") + stripped.count("[") + stripped.count("{")
|
||||
open_brackets -= stripped.count(")") + stripped.count("]") + stripped.count("}")
|
||||
continue
|
||||
|
||||
# Detect a new code-start line
|
||||
if CODE_START.match(line):
|
||||
in_code = True
|
||||
result.append(stripped)
|
||||
open_brackets += stripped.count("(") + stripped.count("[") + stripped.count("{")
|
||||
open_brackets -= stripped.count(")") + stripped.count("]") + stripped.count("}")
|
||||
continue
|
||||
|
||||
# Non-code line: reset in_code; a later code-start line switches it back on
|
||||
in_code = False
|
||||
|
||||
return "\n".join(result)
|
||||
|
||||
|
||||
# Kept for backward compatibility
|
||||
extract_code_lines = extract_code_block
|
||||
|
||||
|
||||
def try_exec_python(code: str, shared_ns: dict) -> str | None:
|
||||
"""
|
||||
Execute code in the shared_ns namespace and capture stdout.
|
||||
Return the output string, or None on failure or empty output.
|
||||
"""
|
||||
buf = io.StringIO()
|
||||
try:
|
||||
with redirect_stdout(buf):
|
||||
exec(code, shared_ns) # noqa: S102
|
||||
output = buf.getvalue().strip()
|
||||
return output if output else None
|
||||
except Exception:
|
||||
return None
|
||||
|
||||
async def _resume_ai_trio(sb, paper_id: str, questions: list[dict]):
|
||||
"""为缺 solution 的题目生成 AI trio,逐条写回 DB。支持断点续传。"""
|
||||
need = [q for q in questions if not q.get("solution")]
|
||||
if not need:
|
||||
# Every question already has a solution, so mark the paper as done
|
||||
sb.table("papers").update({"status": "ready", "processing_step": None}).eq("id", paper_id).execute()
|
||||
return
|
||||
|
||||
total_q = len(questions)
|
||||
done_q = total_q - len(need)
|
||||
|
||||
# Build payloads
|
||||
id_map = {q["question_number"]: q["id"] for q in need}
|
||||
# The full question_text is needed to generate the AI trio
|
||||
full_data = sb.table("paper_questions").select(
|
||||
"id, question_number, question_type, question_text, score, correct_option, correct_answer, raw_answer_text"
|
||||
).eq("paper_id", paper_id).in_("id", [q["id"] for q in need]).execute().data
|
||||
|
||||
payloads = []
|
||||
for q in full_data:
|
||||
answer_section = q.get("raw_answer_text") or ""
|
||||
if not answer_section and q.get("correct_option"):
|
||||
answer_section = f"Correct option: {q['correct_option']}"
|
||||
elif not answer_section and q.get("correct_answer"):
|
||||
answer_section = f"Correct answer: {q['correct_answer']}"
|
||||
payloads.append({
|
||||
"question_number": q["question_number"],
|
||||
"question_type": q["question_type"] or "long_question",
|
||||
"score": q.get("score") or "unknown",
|
||||
"question_text": q["question_text"] or "",
|
||||
"reference_answer": answer_section,
|
||||
})
|
||||
|
||||
batches = chunked(payloads, 3)
|
||||
for batch_idx, batch in enumerate(batches, 1):
|
||||
current = done_q + batch_idx * 3
|
||||
_update_progress(sb, paper_id, f"Generating solutions ({min(current, total_q)}/{total_q} questions)", batch_idx, len(batches))
|
||||
try:
|
||||
result = await deepseek_json_completion(
|
||||
system_prompt=BATCH_ANALYSIS_PROMPT.format(
|
||||
questions_payload=json.dumps(batch, ensure_ascii=False),
|
||||
),
|
||||
temperature=0.3,
|
||||
)
|
||||
for item in result.get("analyses", []):
|
||||
qnum = item.get("question_number")
|
||||
qid = id_map.get(qnum)
|
||||
if qid:
|
||||
sb.table("paper_questions").update({
|
||||
"knowledge_reminder": item.get("knowledge_reminder", ""),
|
||||
"ai_hint": item.get("ai_hint", ""),
|
||||
"solution": item.get("solution", ""),
|
||||
}).eq("id", qid).execute()
|
||||
except Exception:
|
||||
pass # a failed batch must not affect the other batches
|
||||
await asyncio.sleep(1)
|
||||
|
||||
# Mark as done
|
||||
sb.table("papers").update({"status": "ready", "processing_step": None}).eq("id", paper_id).execute()
|
||||
|
||||
|
||||
def _update_progress(sb, paper_id: str, step: str, progress: int = 0, total: int = 0):
|
||||
"""更新处理进度到 DB"""
|
||||
sb.table("papers").update({
|
||||
"processing_step": step,
|
||||
"processing_progress": progress,
|
||||
"processing_total": total,
|
||||
}).eq("id", paper_id).execute()
|
||||
|
||||
|
||||
async def process_paper(paper_id: str, paper_bytes: bytes, answer_bytes: bytes | None):
|
||||
"""后台处理管线: PDF pages → Vision 结构化 → AI 三件套
|
||||
|
||||
设计原则:每个步骤完成后立即持久化到 DB,支持断点续传。
|
||||
"""
|
||||
sb = get_supabase()
|
||||
|
||||
try:
|
||||
# Check whether questions already exist (resume scenario)
|
||||
existing = sb.table("paper_questions").select("id, question_number, solution").eq("paper_id", paper_id).execute().data
|
||||
|
||||
if existing:
|
||||
# Questions exist → skip extraction and only fill in the AI trio
|
||||
await _resume_ai_trio(sb, paper_id, existing)
|
||||
return
|
||||
|
||||
# ── Step 1: PDF → images ──
|
||||
_update_progress(sb, paper_id, "Rendering PDF pages...")
|
||||
paper_images = pdf_to_images(paper_bytes)
|
||||
|
||||
# ── Step 2: Vision structured question splitting ──
|
||||
PAGE_BATCH = 8
|
||||
all_questions: list = []
|
||||
meta: dict = {}
|
||||
num_page_batches = -(-len(paper_images) // PAGE_BATCH)
|
||||
for i in range(0, len(paper_images), PAGE_BATCH):
|
||||
batch_imgs = paper_images[i:i + PAGE_BATCH]
|
||||
batch_idx = i // PAGE_BATCH + 1
|
||||
_update_progress(sb, paper_id, f"Reading pages {i+1}-{i+len(batch_imgs)}...", batch_idx, num_page_batches)
|
||||
batch_result = await gemini_vision_json(
|
||||
system_prompt=STRUCTURE_PROMPT,
|
||||
images=batch_imgs,
|
||||
user_text=f"Pages {i+1}-{i+len(batch_imgs)} of the exam paper. Extract all questions visible on these pages.",
|
||||
temperature=0,
|
||||
)
|
||||
if not meta:
|
||||
meta = {k: batch_result.get(k) for k in ("total_score", "difficulty_level", "topics_summary")}
|
||||
all_questions.extend(batch_result.get("questions", []))
|
||||
|
||||
all_questions = sort_questions(all_questions)
|
||||
questions = all_questions
|
||||
|
||||
# Update the paper overview
|
||||
sb.table("papers").update({
|
||||
"total_score": meta.get("total_score"),
|
||||
"question_count": len(questions),
|
||||
"topics_summary": meta.get("topics_summary"),
|
||||
"difficulty_level": meta.get("difficulty_level"),
|
||||
}).eq("id", paper_id).execute()
|
||||
|
||||
# ── Step 3: answer matching (batched; failures are skipped) ──
|
||||
answers_map = {}
|
||||
if answer_bytes:
|
||||
_update_progress(sb, paper_id, "Matching answers...")
|
||||
try:
|
||||
answer_images = pdf_to_images(answer_bytes)
|
||||
questions_json = json.dumps(
|
||||
[{"question_number": q["question_number"], "question_type": q["question_type"]}
|
||||
for q in questions], ensure_ascii=False,
|
||||
)
|
||||
all_answers: list = []
|
||||
for ai in range(0, len(answer_images), 8):
|
||||
batch_ans_imgs = answer_images[ai:ai + 8]
|
||||
try:
|
||||
match_result = await gemini_vision_json(
|
||||
system_prompt=ANSWER_MATCH_PROMPT.format(
|
||||
questions_json=questions_json, answer_text="(See images)",
|
||||
),
|
||||
images=batch_ans_imgs,
|
||||
user_text=f"Match answers to these questions: {questions_json}",
|
||||
temperature=0,
|
||||
)
|
||||
all_answers.extend(match_result.get("answers", []))
|
||||
except Exception:
|
||||
pass
|
||||
answers_map = {a["question_number"]: a for a in all_answers}
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# ── Step 4: write questions to the DB immediately (without the AI trio yet) ──
|
||||
_update_progress(sb, paper_id, "Saving questions...")
|
||||
for i, q in enumerate(questions):
|
||||
qnum = q["question_number"]
|
||||
answer = answers_map.get(qnum, {})
|
||||
sb.table("paper_questions").insert(strip_nulls({
|
||||
"paper_id": paper_id,
|
||||
"question_number": qnum,
|
||||
"parent_question": q.get("parent_question"),
|
||||
"display_order": i,
|
||||
"question_type": q["question_type"],
|
||||
"question_text": q["question_text"],
|
||||
"score": q.get("score"),
|
||||
"page_number": q.get("page_number"),
|
||||
"options": q.get("options"),
|
||||
"correct_option": answer.get("correct_option"),
|
||||
"correct_answer": answer.get("correct_answer"),
|
||||
"raw_answer_text": answer.get("raw_answer_text"),
|
||||
"topics": q.get("topics", []),
|
||||
"analytics_topic": q.get("topics", [None])[0],
|
||||
"topic_tags": q.get("topics", []),
|
||||
"difficulty": q.get("difficulty"),
|
||||
})).execute()
|
||||
|
||||
# ── Step 5: AI trio (updated per question; supports resuming) ──
|
||||
saved = sb.table("paper_questions").select("id, question_number, solution").eq("paper_id", paper_id).execute().data
|
||||
await _resume_ai_trio(sb, paper_id, saved)
|
||||
|
||||
except Exception as e:
|
||||
sb.table("papers").update({
|
||||
"status": "error",
|
||||
"error_message": f"{type(e).__name__}: {str(e)}\n{traceback.format_exc()[-500:]}",
|
||||
}).eq("id", paper_id).execute()
|
||||
raise
|
||||
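The natural ordering that `_question_sort_key` aims for can be illustrated with a smaller, standalone key. This is a minimal sketch, not the function from the file: `simple_question_key` is a hypothetical helper that skips the roman-numeral handling and only separates digit runs from letter runs, which is enough to show why a plain string sort is not suitable for question numbers.

```python
import re


def simple_question_key(qnum: str) -> tuple:
    """Hypothetical simplified key: digit runs compare numerically, letter runs alphabetically."""
    key = []
    for part in re.findall(r"\d+|[a-zA-Z]+", qnum):
        if part.isdigit():
            key.append((0, int(part), ""))
        else:
            key.append((1, 0, part.lower()))
    return tuple(key)


numbers = ["10a", "2b", "1j", "1a", "2a", "1i"]
naive = sorted(numbers)                            # plain string comparison
natural = sorted(numbers, key=simple_question_key)

print(naive)    # ['10a', '1a', '1i', '1j', '2a', '2b']
print(natural)  # ['1a', '1i', '1j', '2a', '2b', '10a']
```

The plain sort places `10a` first because it compares character by character, while the tuple key compares the leading number as an integer.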
13
backend/app/services/supabase_client.py
Normal file
@@ -0,0 +1,13 @@
|
||||
from supabase import create_client, Client
|
||||
from app.config import get_settings
|
||||
|
||||
_client: Client | None = None
|
||||
|
||||
|
||||
def get_supabase() -> Client:
|
||||
"""获取 Supabase client (service_role,绕过 RLS)"""
|
||||
global _client
|
||||
if _client is None:
|
||||
s = get_settings()
|
||||
_client = create_client(s.supabase_url, s.supabase_service_role_key)
|
||||
return _client
|
||||
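The lazy module-level singleton in `get_supabase` could also be expressed with `functools.lru_cache`. The following is only a sketch of that alternative, assuming a stand-in class: `FakeClient` and `get_client` are hypothetical names used for illustration and do not call the Supabase library.

```python
from functools import lru_cache


class FakeClient:
    """Hypothetical stand-in for the real client; counts how often it is built."""

    instances = 0

    def __init__(self) -> None:
        FakeClient.instances += 1


@lru_cache(maxsize=1)
def get_client() -> FakeClient:
    # The body runs once; later calls return the cached instance.
    return FakeClient()


first = get_client()
second = get_client()
print(first is second, FakeClient.instances)
```

Either form creates the client on first use rather than at import time, so importing the module does not require the settings to be loaded.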
48
backend/app/services/text_extractor.py
Normal file
@@ -0,0 +1,48 @@
|
||||
"""PDF 文本提取 — 复用 SOS 的 text_extractor 逻辑"""
|
||||
|
||||
import base64
|
||||
import fitz # PyMuPDF
|
||||
from dataclasses import dataclass
|
||||
|
||||
|
||||
@dataclass
|
||||
class ExtractedContent:
|
||||
pages_text: list[str] # text of each page
|
||||
page_images: dict[int, str] # page index (0-based) → base64 image (image-heavy pages)
|
||||
total_pages: int
|
||||
has_images: bool
|
||||
|
||||
|
||||
def extract_pdf(file_bytes: bytes) -> ExtractedContent:
|
||||
"""从 PDF 提取文本和图片"""
|
||||
doc = fitz.open(stream=file_bytes, filetype="pdf")
|
||||
pages_text = []
|
||||
page_images = {}
|
||||
|
||||
for i, page in enumerate(doc):
|
||||
text = page.get_text("text")
|
||||
pages_text.append(text)
|
||||
|
||||
# A page with very little text may be a scan → render it as an image for Vision OCR
|
||||
if len(text.strip()) < 50:
|
||||
pix = page.get_pixmap(dpi=200)
|
||||
img_bytes = pix.tobytes("png")
|
||||
page_images[i] = base64.b64encode(img_bytes).decode("utf-8")
|
||||
|
||||
doc.close()
|
||||
|
||||
return ExtractedContent(
|
||||
pages_text=pages_text,
|
||||
page_images=page_images,
|
||||
total_pages=len(pages_text),
|
||||
has_images=len(page_images) > 0,
|
||||
)
|
||||
|
||||
|
||||
def get_full_text(extracted: ExtractedContent) -> str:
|
||||
"""合并所有页面文本"""
|
||||
return "\n\n".join(
|
||||
f"--- Page {i+1} ---\n{text}"
|
||||
for i, text in enumerate(extracted.pages_text)
|
||||
if text.strip()
|
||||
)
|
||||
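The joining rule in `get_full_text` can be tried without PyMuPDF. The sketch below assumes a reduced stand-in for `ExtractedContent`: `PagesOnly` and `join_pages` are hypothetical names, and only the `pages_text` field is modelled. It shows that blank pages are dropped while the remaining pages keep their original 1-based page labels.

```python
from dataclasses import dataclass, field


@dataclass
class PagesOnly:
    """Hypothetical stand-in for ExtractedContent that holds only the page texts."""

    pages_text: list[str] = field(default_factory=list)


def join_pages(extracted: PagesOnly) -> str:
    # Same rule as get_full_text: skip whitespace-only pages, label the rest.
    return "\n\n".join(
        f"--- Page {i + 1} ---\n{text}"
        for i, text in enumerate(extracted.pages_text)
        if text.strip()
    )


merged = join_pages(PagesOnly(["Q1 text", "   ", "Q2 text"]))
print(merged)
```

Because the label comes from `enumerate` before the filter is applied, the second surviving page is labelled `Page 3`, not `Page 2`.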
252
backend/backfill_ai_trio_with_context.py
Normal file
@@ -0,0 +1,252 @@
|
||||
"""
|
||||
Regenerate the AI trio for all questions; sub-questions carry their parent question's context.
|
||||
Usage: python backfill_ai_trio_with_context.py [--paper-id <id>] [--course <code>] [--missing-only]
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import io
|
||||
import json
|
||||
import re
|
||||
import sys
|
||||
import time
|
||||
import argparse
|
||||
from contextlib import redirect_stdout
|
||||
from app.services.supabase_client import get_supabase
|
||||
from app.services.llm_clients import get_deepseek_client
|
||||
|
||||
|
||||
def extract_code_lines(text: str) -> str:
|
||||
lines = (text or "").splitlines()
|
||||
result = []
|
||||
in_code = False
|
||||
open_brackets = 0
|
||||
CODE_START = re.compile(r"^\s*(import |from \w|[A-Za-z_]\w*\s*=|print\()")
|
||||
for line in lines:
|
||||
stripped = line.strip()
|
||||
if in_code and open_brackets > 0:
|
||||
result.append(stripped)
|
||||
open_brackets += stripped.count("(") + stripped.count("[") + stripped.count("{")
|
||||
open_brackets -= stripped.count(")") + stripped.count("]") + stripped.count("}")
|
||||
continue
|
||||
if CODE_START.match(line):
|
||||
in_code = True
|
||||
result.append(stripped)
|
||||
open_brackets += stripped.count("(") + stripped.count("[") + stripped.count("{")
|
||||
open_brackets -= stripped.count(")") + stripped.count("]") + stripped.count("}")
|
||||
continue
|
||||
in_code = False
|
||||
return "\n".join(result)
|
||||
|
||||
|
||||
def try_exec_python(code: str, shared_ns: dict) -> str | None:
|
||||
buf = io.StringIO()
|
||||
try:
|
||||
with redirect_stdout(buf):
|
||||
exec(code, shared_ns) # noqa: S102
|
||||
output = buf.getvalue().strip()
|
||||
return output if output else None
|
||||
except Exception:
|
||||
return None
|
||||
|
||||
BATCH_ANALYSIS_PROMPT = """You are an expert academic answer analyst. Generate three study sections for each question below. ALL output must be in English.
|
||||
|
||||
For every question, return:
|
||||
- knowledge_reminder: concise prerequisite bullets in HTML
|
||||
- ai_hint: a helpful hint in HTML without revealing the final answer
|
||||
- solution: a complete step-by-step solution in HTML
|
||||
|
||||
Return JSON in this exact format:
|
||||
{{
|
||||
"analyses": [
|
||||
{{
|
||||
"question_number": "1a",
|
||||
"knowledge_reminder": "<HTML>...</HTML>",
|
||||
"ai_hint": "<HTML>...</HTML>",
|
||||
"solution": "<HTML>...</HTML>"
|
||||
}}
|
||||
]
|
||||
}}
|
||||
|
||||
Rules:
|
||||
- Return one item for every provided question_number
|
||||
- All text must be in English
|
||||
- HTML only, KaTeX compatible (block $$ ... $$ inline $ ... $)
|
||||
- For MC questions, explain why the correct option is right and why others are wrong
|
||||
- For long questions, show a complete derivation or reasoning chain
|
||||
- Use <ol> or numbered steps in solution when appropriate
|
||||
- Mark common mistakes with <div class="common-error">...</div>
|
||||
- CRITICAL: When a question_text contains "[Context from parent question X]" followed by "[Sub-question Y]", the parent section is background context only. You MUST solve ONLY the specific sub-question labeled [Sub-question Y]. Do NOT solve other sub-questions listed in the parent context. Give one precise answer for that single sub-question only.
|
||||
|
||||
Questions:
|
||||
{questions_payload}
|
||||
"""
|
||||
|
||||
|
||||
def chunked(lst, size):
|
||||
return [lst[i:i+size] for i in range(0, len(lst), size)]
|
||||
|
||||
|
||||
async def deepseek_batch(batch: list[dict]) -> list[dict]:
|
||||
client = get_deepseek_client()
|
||||
for attempt in range(5):
|
||||
try:
|
||||
resp = client.chat.completions.create(
|
||||
model="deepseek-chat",
|
||||
messages=[{
|
||||
"role": "system",
|
||||
"content": BATCH_ANALYSIS_PROMPT.format(
|
||||
questions_payload=json.dumps(batch, ensure_ascii=False)
|
||||
)
|
||||
}],
|
||||
temperature=0.3,
|
||||
max_tokens=8192,
|
||||
response_format={"type": "json_object"},
|
||||
)
|
||||
raw = re.sub(r'[\x00-\x08\x0b\x0c\x0e-\x1f]', '', resp.choices[0].message.content)
|
||||
raw = re.sub(r'(?<!\\)((?:\\\\)*)\\([^"\\/bfnrtu])', r'\1\\\\\2', raw)
|
||||
data = json.loads(raw)
|
||||
return data.get("analyses", [])
|
||||
except Exception as e:
|
||||
print(f" attempt {attempt+1} failed: {e}")
|
||||
if attempt < 4:
|
||||
await asyncio.sleep(2 ** attempt * 2)
|
||||
return []
|
||||
|
||||
|
||||
async def main():
|
||||
parser = argparse.ArgumentParser()
|
||||
parser.add_argument("--paper-id", help="Only process this paper")
|
||||
parser.add_argument("--course", help="Only process papers with this course code")
|
||||
parser.add_argument("--missing-only", action="store_true", help="Only process questions missing solution")
|
||||
args = parser.parse_args()
|
||||
|
||||
sb = get_supabase()
|
||||
|
||||
# Fetch all questions (with paper info for filtering)
|
||||
query = sb.table("paper_questions").select(
|
||||
"id, paper_id, question_number, question_type, question_text, "
|
||||
"parent_question, score, correct_option, correct_answer, raw_answer_text, "
|
||||
"analytics_topic, topic_tags, solution"
|
||||
)
|
||||
if args.paper_id:
|
||||
query = query.eq("paper_id", args.paper_id)
|
||||
result = query.order("paper_id").order("display_order").execute()
|
||||
all_questions = result.data
|
||||
|
||||
if args.course:
|
||||
# Filter by course via papers table
|
||||
papers_res = sb.table("papers").select("id").eq("course_code", args.course.upper()).execute()
|
||||
paper_ids = {p["id"] for p in papers_res.data}
|
||||
all_questions = [q for q in all_questions if q["paper_id"] in paper_ids]
|
||||
|
||||
if args.missing_only:
|
||||
all_questions = [q for q in all_questions if not q.get("solution")]
|
||||
print(f"Questions missing solution: {len(all_questions)}")
|
||||
else:
|
||||
print(f"Total questions to process: {len(all_questions)}")
|
||||
|
||||
# Group by paper_id
|
||||
from collections import defaultdict
|
||||
by_paper: dict[str, list] = defaultdict(list)
|
||||
for q in all_questions:
|
||||
by_paper[q["paper_id"]].append(q)
|
||||
|
||||
total_updated = 0
|
||||
|
||||
for paper_id, questions in by_paper.items():
|
||||
print(f"\nPaper {paper_id} — {len(questions)} questions")
|
||||
|
||||
# Any question may be the parent of another question
|
||||
parent_text_map: dict[str, str] = {
|
||||
q["question_number"]: q["question_text"] or ""
|
||||
for q in questions
|
||||
}
|
||||
|
||||
# Build payloads with context + Python exec
|
||||
payloads = []
|
||||
exec_namespaces: dict[str, dict] = {}
|
||||
|
||||
for q in questions:
|
||||
parent_q = q.get("parent_question")
|
||||
if parent_q and parent_q in parent_text_map:
|
||||
full_text = (
|
||||
f"[Context from parent question {parent_q}]\n"
|
||||
f"{parent_text_map[parent_q]}\n\n"
|
||||
f"[Sub-question {q['question_number']}]\n"
|
||||
f"{q['question_text'] or ''}"
|
||||
)
|
||||
else:
|
||||
full_text = q["question_text"] or ""
|
||||
|
||||
answer_section = ""
|
||||
if q.get("raw_answer_text"):
|
||||
answer_section = q["raw_answer_text"]
|
||||
elif q.get("correct_option"):
|
||||
answer_section = f"Correct option: {q['correct_option']}"
|
||||
elif q.get("correct_answer"):
|
||||
answer_section = f"Correct answer: {q['correct_answer']}"
|
||||
|
||||
# Try executing the Python code to obtain the real output
|
||||
if not answer_section:
|
||||
group_key = parent_q or q["question_number"]
|
||||
if group_key not in exec_namespaces:
|
||||
ns: dict = {}
|
||||
try:
|
||||
import numpy as np
|
||||
ns["np"] = np
|
||||
except ImportError:
|
||||
pass
|
||||
# Run the parent question's setup code first
|
||||
if parent_q and parent_q in parent_text_map:
|
||||
setup = extract_code_lines(parent_text_map[parent_q])
|
||||
try_exec_python(setup, ns)
|
||||
exec_namespaces[group_key] = ns
|
||||
|
||||
ns = exec_namespaces[group_key]
|
||||
sub_code = extract_code_lines(q["question_text"] or "")
|
||||
if sub_code:
|
||||
exec_out = try_exec_python(sub_code, ns)
|
||||
if exec_out is not None:
|
||||
answer_section = f"Executed output: {exec_out}"
|
||||
print(f" [exec] {q['question_number']}: {exec_out[:60]}")
|
||||
|
||||
payloads.append({
|
||||
"_id": q["id"],
|
||||
"question_number": q["question_number"],
|
||||
"question_type": q["question_type"] or "long_question",
|
||||
"score": q.get("score") or "unknown",
|
||||
"question_text": full_text,
|
||||
"reference_answer": answer_section,
|
||||
})
|
||||
|
||||
# Process in batches of 3
|
||||
id_map = {q["question_number"]: q["id"] for q in questions}
|
||||
|
||||
for batch in chunked(payloads, 3):
|
||||
# Strip internal _id before sending to model
|
||||
model_batch = [{k: v for k, v in p.items() if k != "_id"} for p in batch]
|
||||
nums = [p["question_number"] for p in batch]
|
||||
print(f" Batch {nums} ...", end=" ", flush=True)
|
||||
|
||||
analyses = await deepseek_batch(model_batch)
|
||||
|
||||
for item in analyses:
|
||||
qnum = item.get("question_number")
|
||||
qid = id_map.get(qnum)
|
||||
if not qid:
|
||||
continue
|
||||
sb.table("paper_questions").update({
|
||||
"knowledge_reminder": item.get("knowledge_reminder"),
|
||||
"ai_hint": item.get("ai_hint"),
|
||||
"solution": item.get("solution"),
|
||||
}).eq("id", qid).execute()
|
||||
total_updated += 1
|
||||
|
||||
print(f"done ({len(analyses)} updated)")
|
||||
await asyncio.sleep(1)
|
||||
|
||||
print(f"\nDone. Total updated: {total_updated}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
asyncio.run(main())
|
||||
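The second `re.sub` in `deepseek_batch` repairs backslashes that are not valid JSON escapes, which typically appear when the model emits LaTeX inside a JSON string. The sketch below isolates that step using the same pattern; `repair_json_escapes` is a hypothetical wrapper name, and the sample string is made up for illustration.

```python
import json
import re

# Same pattern as in deepseek_batch: a backslash that is not itself escaped
# and is followed by a character that is not a valid JSON escape character.
INVALID_ESCAPE = re.compile(r'(?<!\\)((?:\\\\)*)\\([^"\\/bfnrtu])')


def repair_json_escapes(raw: str) -> str:
    """Hypothetical helper: double each invalid backslash so json.loads accepts it."""
    return INVALID_ESCAPE.sub(r'\1\\\\\2', raw)


raw = r'{"solution": "Use $\alpha = 0.1$"}'

try:
    json.loads(raw)
    parsed_without_repair = True
except json.JSONDecodeError:
    parsed_without_repair = False

parsed = json.loads(repair_json_escapes(raw))
print(parsed_without_repair)  # False
print(parsed["solution"])     # Use $\alpha = 0.1$
```

One limitation to keep in mind: commands such as `\frac`, `\beta`, `\theta`, `\nu`, or `\rho` start with a character that is a valid JSON escape (`\f`, `\b`, `\t`, `\n`, `\r`), so the pattern leaves them alone and `json.loads` decodes them as control characters instead of raising.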
160
backend/backfill_comp2211_page_y.py
Normal file
@@ -0,0 +1,160 @@
|
||||
"""Backfill page_y_ratio for COMP2211 subquestions."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import re
|
||||
import time
|
||||
from pathlib import Path
|
||||
from concurrent.futures import ThreadPoolExecutor, as_completed
|
||||
|
||||
import fitz
|
||||
import httpx
|
||||
|
||||
from app.services.supabase_client import get_supabase
|
||||
|
||||
|
||||
ROOT = Path(__file__).resolve().parent.parent
|
||||
PAPERS_DIR = ROOT / "pastpaper-scraper" / "papers" / "COMP2211"
|
||||
|
||||
PDF_BY_EXAM_KEY = {
|
||||
"COMP2211-2022-fall-midterm": "(COMP2211)[2022](f)midterm~=yjz8dxdd^_27002.pdf",
|
||||
"COMP2211-2022-spring-midterm": "(COMP2211)[2022](s)midterm~=b8bidkgs^_14629.pdf",
|
||||
"COMP2211-2022-spring-final-part-a": "(COMP2211)[2022](s)final~=b8bidkgs^_33018.pdf",
|
||||
"COMP2211-2022-spring-final-part-b": "(COMP2211)[2022](s)final~=b8bidkgs^_40627.pdf",
|
||||
"COMP2211-2023-spring-midterm": "(COMP2211)[2023](s)midterm~=bxbidkmj^_26587.pdf",
|
||||
"COMP2211-2024-spring-midterm": "(COMP2211)[2024](s)midterm~=rcidkjgf^_82003.pdf",
|
||||
"COMP2211-2024-spring-final": "(COMP2211)[2024](s)final~=igk5mmg^_90365.pdf",
|
||||
}
|
||||
|
||||
|
||||
def marker_candidates(question_number: str) -> list[str]:
|
||||
if "_" in question_number:
|
||||
left, right = question_number.split("_", 1)
|
||||
tokens: list[str] = []
|
||||
m = re.fullmatch(r"(\d+)([a-z])", left)
|
||||
if m:
|
||||
tokens.append(f"({m.group(2)})")
|
||||
elif re.fullmatch(r"\d+[a-z]+", left):
|
||||
tokens.append("(" + re.sub(r"^\d+", "", left) + ")")
|
||||
tokens.append(f"({right})")
|
||||
return tokens[::-1]
|
||||
|
||||
m = re.fullmatch(r"(\d+)([a-z])", question_number)
|
||||
if m:
|
||||
return [f"({m.group(2)})", f"Problem {m.group(1)}"]
|
||||
|
||||
if question_number.isdigit():
|
||||
return [f"Problem {question_number}"]
|
||||
|
||||
return [question_number]
|
||||
|
||||
|
||||
def line_matches(line_text: str, marker: str) -> bool:
|
||||
text = re.sub(r"\s+", " ", line_text.strip())
|
||||
if not text:
|
||||
return False
|
||||
if marker.startswith("("):
|
||||
return text.startswith(marker)
|
||||
return marker.lower() in text.lower()
|
||||
|
||||
|
||||
def line_y_ratio(page: fitz.Page, marker: str) -> float | None:
|
||||
data = page.get_text("dict")
|
||||
hits: list[float] = []
|
||||
for block in data.get("blocks", []):
|
||||
if block.get("type") != 0:
|
||||
continue
|
||||
for line in block.get("lines", []):
|
||||
line_text = "".join(
|
||||
span.get("text", "")
|
||||
for span in line.get("spans", [])
|
||||
)
|
||||
if line_matches(line_text, marker):
|
||||
bbox = line.get("bbox")
|
||||
if bbox:
|
||||
hits.append(float(bbox[1]))
|
||||
if not hits:
|
||||
return None
|
||||
y = min(hits)
|
||||
return max(0.0, min((y - page.rect.y0) / page.rect.height, 0.98))
|
||||
|
||||
|
||||
def search_y_ratio(page: fitz.Page, marker: str) -> float | None:
|
||||
ratios: list[float] = []
|
||||
for rect in page.search_for(marker):
|
||||
ratios.append(max(0.0, min((rect.y0 - page.rect.y0) / page.rect.height, 0.98)))
|
||||
return min(ratios) if ratios else None
|
||||
|
||||
|
||||
def infer_y_ratio(page: fitz.Page, question_number: str) -> float:
|
||||
for marker in marker_candidates(question_number):
|
||||
ratio = line_y_ratio(page, marker)
|
||||
if ratio is not None:
|
||||
return ratio
|
||||
ratio = search_y_ratio(page, marker)
|
||||
if ratio is not None:
|
||||
return ratio
|
||||
return 0.05
|
||||
|
||||
|
||||
def main() -> None:
|
||||
sb = get_supabase()
|
||||
papers = (
|
||||
sb.table("papers")
|
||||
.select("id, source_exam_key")
|
||||
.eq("course_code", "COMP2211")
|
||||
.eq("source_kind", "course_library")
|
||||
.execute()
|
||||
.data
|
||||
or []
|
||||
)
|
||||
|
||||
updates: list[tuple[str, float]] = []
|
||||
for paper in papers:
|
||||
exam_key = paper["source_exam_key"]
|
||||
pdf_name = PDF_BY_EXAM_KEY.get(exam_key)
|
||||
if not pdf_name:
|
||||
continue
|
||||
pdf_path = PAPERS_DIR / pdf_name
|
||||
doc = fitz.open(pdf_path)
|
||||
try:
|
||||
questions = (
|
||||
sb.table("paper_questions")
|
||||
.select("id, question_number, page_number")
|
||||
.eq("paper_id", paper["id"])
|
||||
.order("display_order")
|
||||
.execute()
|
||||
.data
|
||||
or []
|
||||
)
|
||||
for question in questions:
|
||||
page_number = question.get("page_number") or 1
|
||||
page = doc[page_number - 1]
|
||||
ratio = infer_y_ratio(page, question["question_number"])
|
||||
updates.append((question["id"], round(ratio, 4)))
|
||||
finally:
|
||||
doc.close()
|
||||
|
||||
def apply_update(payload: tuple[str, float]) -> None:
|
||||
question_id, ratio = payload
|
||||
attempts = 0
|
||||
while True:
|
||||
try:
|
||||
sb.table("paper_questions").update({"page_y_ratio": ratio}).eq("id", question_id).execute()
|
||||
return
|
||||
except httpx.HTTPError:
|
||||
attempts += 1
|
||||
if attempts >= 5:
|
||||
raise
|
||||
time.sleep(0.4 * attempts)
|
||||
|
||||
with ThreadPoolExecutor(max_workers=3) as executor:
|
||||
futures = [executor.submit(apply_update, payload) for payload in updates]
|
||||
for future in as_completed(futures):
|
||||
future.result()
|
||||
|
||||
print(f"Backfilled page_y_ratio for {len(updates)} COMP2211 questions.")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
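Both `line_y_ratio` and `search_y_ratio` convert an absolute y coordinate into a fraction of the page height and clamp it to the range 0.0 to 0.98. The arithmetic can be checked in isolation; `clamp_y_ratio` below is a hypothetical helper that takes plain floats instead of a `fitz.Page`, and the page height of 792 points is just an example value.

```python
def clamp_y_ratio(y: float, page_top: float, page_height: float, upper: float = 0.98) -> float:
    """Hypothetical helper mirroring the clamp used in line_y_ratio and search_y_ratio."""
    return max(0.0, min((y - page_top) / page_height, upper))


middle = clamp_y_ratio(396.0, 0.0, 792.0)      # halfway down the page
below = clamp_y_ratio(800.0, 0.0, 792.0)       # past the bottom edge
above = clamp_y_ratio(-10.0, 0.0, 792.0)       # above the top edge

print(middle, below, above)  # 0.5 0.98 0.0
```

The upper bound of 0.98 rather than 1.0 keeps a marker found at the very bottom of a page from mapping to the page edge itself.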
365
backend/backfill_comp2211_tags.py
Normal file
@@ -0,0 +1,365 @@
|
||||
"""Backfill COMP2211 tags to the revised retrieval schema."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import re
|
||||
from collections import OrderedDict
|
||||
|
||||
from app.services.supabase_client import get_supabase
|
||||
|
||||
|
||||
SKILL_LABELS = {
|
||||
"concept_check": "Concept Check",
|
||||
"code_tracing": "Code Tracing",
|
||||
"algorithm_tracing": "Algorithm Tracing",
|
||||
"distance_calculation": "Distance Calculation",
|
||||
"centroid_update": "Centroid Update",
|
||||
"weight_update": "Weight Update",
|
||||
"decision_boundary": "Decision Boundary",
|
||||
"implementation": "Implementation",
|
||||
"debugging": "Debugging",
|
||||
"model_selection": "Model Selection",
|
||||
"concept_explanation": "Concept Explanation",
|
||||
"architecture_reasoning": "Architecture Reasoning",
|
||||
"convergence_reasoning": "Convergence Reasoning",
|
||||
"generalization_reasoning": "Generalization Reasoning",
|
||||
"classification_decision": "Classification Decision",
|
||||
}
|
||||
|
||||
ACRONYMS = {
|
||||
"ai": "AI",
|
||||
"cnn": "CNN",
|
||||
"knn": "KNN",
|
||||
"mlp": "MLP",
|
||||
"nb": "NB",
|
||||
"numpy": "NumPy",
|
||||
}
|
||||
|
||||
|
||||
def title_case_with_acronyms(value: str) -> str:
|
||||
words = re.split(r"[\s_]+", value.strip())
|
||||
parts: list[str] = []
|
||||
for word in words:
|
||||
if not word:
|
||||
continue
|
||||
lowered = word.lower()
|
||||
parts.append(ACRONYMS.get(lowered, lowered.capitalize()))
|
||||
return " ".join(parts)
|
||||
|
||||
|
||||
def normalize_skill_tag(tag: str) -> str:
|
||||
if tag in SKILL_LABELS:
|
||||
return SKILL_LABELS[tag]
|
||||
return title_case_with_acronyms(tag)
|
||||
|
||||
|
||||
def text_blob(question: dict) -> str:
|
||||
parts = [
|
||||
question.get("question_text") or "",
|
||||
question.get("raw_answer_text") or "",
|
||||
" ".join(question.get("topic_tags") or []),
|
||||
" ".join(question.get("skill_tags") or []),
|
||||
question.get("analytics_topic") or "",
|
||||
]
|
||||
return " ".join(parts).lower()
|
||||
|
||||
|
||||
def has_any(text: str, phrases: list[str]) -> bool:
|
||||
return any(phrase in text for phrase in phrases)
|
||||
|
||||
|
||||
def infer_analytics_topic(question: dict) -> str:
|
||||
text = text_blob(question)
|
||||
broad = question.get("analytics_topic") or ""
|
||||
skills = {normalize_skill_tag(tag) for tag in (question.get("skill_tags") or [])}
|
||||
|
||||
if has_any(text, ["ethics", "bias", "privacy", "autonomous vehicle", "informed consent", "human participants", "ethically"]):
|
||||
return "Ethics of AI"
|
||||
if has_any(text, ["minimax", "alpha-beta", "alpha beta", "game tree", "tic-tac-toe", "tic tac toe"]):
|
||||
return "Game Trees"
|
||||
if has_any(text, ["search algorithm", "best-first", "breadth-first", "depth-first", "a* search", "a star"]):
|
||||
return "Search Algorithms"
|
||||
if has_any(text, ["cross validation", "d-fold", "k-fold", "train/val", "validation set", "fold "]) or broad == "Cross Validation":
|
||||
return "Cross Validation"
|
||||
if has_any(text, ["confusion matrix", "precision", "recall", "macro f1", "f1 score", "accuracy score", "evaluation metric"]):
|
||||
return "Evaluation Metrics"
|
||||
if has_any(text, ["naive bayes", "gaussian distribution", "laplace smoothing", "likelihood", "posterior probability"]) or broad == "Naive Bayes":
|
||||
return "Naive Bayes"
|
||||
if has_any(text, ["bayes classifier", "conditional probability", "bayesian inference", "prior probability", "posterior"]) or broad == "Bayesian Inference":
|
||||
return "Bayesian Inference"
|
||||
if has_any(text, ["leader clustering", "k-means", "k means", "centroid", "elbow method", "silhouette", "cluster assignments", "closest centroid", "new cluster"]):
|
||||
return "K-Means"
|
||||
if has_any(text, ["k-nearest", "nearest neighbors", "weighted knn", "cosine distance", "euclidean distance", "manhattan distance", "6-cross-validation error for k", "class for cosine distance"]):
|
||||
return "KNN"
|
||||
if has_any(text, ["multilayer perceptron", "mlp", "back propagation", "backpropagation", "hidden layer", "output layer", "dropout", "softmax", "sigmoid function", "relu as the activation"]) or broad == "MLP":
|
||||
return "MLP"
|
||||
if has_any(text, ["perceptron", "decision boundary", "single neuron", "weight update", "activation function f(z)", "linearly separable"]) or broad == "Perceptron":
|
||||
return "Perceptron"
|
||||
if has_any(text, ["convolutional neural network", "cnn", "kernel", "padding", "stride", "pooling", "dilated convolution", "3d convolution", "otsu", "histogram", "image processing", "grayscale image"]):
|
||||
return "CNN"
|
||||
if has_any(text, ["numpy", "python", "np.", "broadcasting", "reshape", "transpose", "mask", "vectorized", "np.arange", "np.mean", "np.dot", "np.convolve"]):
|
||||
return "Python and NumPy"
|
||||
|
||||
if broad == "KNN and Clustering":
|
||||
if (
|
||||
has_any(text, ["k-means", "k means", "centroid", "leader clustering", "elbow", "silhouette"])
|
||||
or "Centroid Update" in skills
|
||||
or "Convergence Reasoning" in skills
|
||||
or "Algorithm Tracing" in skills
|
||||
or "Model Selection" in skills
|
||||
):
|
||||
return "K-Means"
|
||||
return "KNN"
|
||||
|
||||
if broad == "Perceptron and MLP":
|
||||
if (
|
||||
has_any(text, ["hidden layer", "backprop", "activation function", "softmax", "relu", "sigmoid", "multilayer perceptron", "mlp"])
|
||||
or "Architecture Reasoning" in skills
|
||||
):
|
||||
return "MLP"
|
||||
return "Perceptron"
|
||||
|
||||
if broad == "Probabilistic Models":
|
||||
if has_any(text, ["naive bayes", "gaussian", "laplace", "likelihood"]):
|
||||
return "Naive Bayes"
|
||||
return "Bayesian Inference"
|
||||
|
||||
if broad == "Evaluation and Validation":
|
||||
if has_any(text, ["cross validation", "cross-validation", "k-fold", "d-fold", "validation set", "train/val"]):
|
||||
return "Cross Validation"
|
||||
return "Evaluation Metrics"
|
||||
|
||||
if broad == "Search and Games":
|
||||
if has_any(text, ["minimax", "alpha-beta", "alpha beta", "game tree"]):
|
||||
return "Game Trees"
|
||||
return "Search Algorithms"
|
||||
|
||||
broad_map = {
|
||||
"Vision and CNN": "CNN",
|
||||
"Python Fundamentals": "Python and NumPy",
|
||||
"Ethics of AI": "Ethics of AI",
|
||||
}
|
||||
return broad_map.get(broad, "Python and NumPy")
|
||||
|
||||
|
||||
TOPIC_CONCEPTS = {
|
||||
"Naive Bayes": [
|
||||
("Naive Bayes", ["naive bayes"]),
|
||||
("Prior", ["prior"]),
|
||||
("Likelihood", ["likelihood"]),
|
||||
("Posterior", ["posterior"]),
|
||||
("Gaussian", ["gaussian"]),
|
||||
("Laplace Smoothing", ["laplace"]),
|
||||
("Missing Data", ["missing data", "missing value"]),
|
||||
],
|
||||
"Bayesian Inference": [
|
||||
("Bayesian Inference", ["bayes", "conditional probability", "posterior"]),
|
||||
("Conditional Probability", ["conditional probability"]),
|
||||
("Bayes Rule", ["bayes rule", "posterior"]),
|
||||
("Prior", ["prior"]),
|
||||
("Posterior", ["posterior"]),
|
||||
],
|
||||
"KNN": [
|
||||
("KNN", ["k-nearest", "nearest neighbors", "knn"]),
|
||||
("Euclidean Distance", ["euclidean distance"]),
|
||||
("Manhattan Distance", ["manhattan distance"]),
|
||||
("Cosine Distance", ["cosine distance"]),
|
||||
("Weighted KNN", ["weighted k-nearest", "weighted knn", "inverse of the distance"]),
|
||||
("Classification", ["class label", "predict", "classification"]),
|
||||
("Cross Validation", ["cross-validation", "cross validation"]),
|
||||
("Test Error", ["test error"]),
|
||||
],
|
||||
"K-Means": [
|
||||
("K-Means", ["k-means", "k means"]),
|
||||
("Centroid Update", ["centroid"]),
|
||||
("Convergence", ["converged", "convergence"]),
|
||||
("Leader Clustering", ["leader clustering"]),
|
||||
("Outliers", ["outlier"]),
|
||||
("Model Selection", ["elbow method", "silhouette", "suitable k"]),
|
||||
],
|
||||
"Perceptron": [
|
||||
("Perceptron", ["perceptron"]),
|
||||
("Decision Boundary", ["decision boundary", "linearly separable"]),
|
||||
("Weight Update", ["weight update", "∆w", "deltaw", "backward propagation"]),
|
||||
("Convergence", ["converged", "convergence"]),
|
||||
("Activation Function", ["activation function"]),
|
||||
],
|
||||
"MLP": [
|
||||
("MLP", ["mlp", "multilayer perceptron"]),
|
||||
("Backpropagation", ["back propagation", "backpropagation", "backward propagation"]),
|
||||
("Activation Function", ["activation function", "relu", "sigmoid", "softmax"]),
|
||||
("Hidden Layer", ["hidden layer"]),
|
||||
("Output Layer", ["output layer"]),
|
||||
("Parameter Count", ["number of parameters", "parameter"]),
|
||||
("Overfitting", ["overfitting", "dropout"]),
|
||||
],
|
||||
"CNN": [
|
||||
("CNN", ["cnn", "convolutional neural network"]),
|
||||
("Convolution", ["convolution", "kernel"]),
|
||||
("Padding", ["padding", "reflection padding", "zero padding"]),
|
||||
("Stride", ["stride"]),
|
||||
("Pooling", ["pooling", "max pooling", "average pooling"]),
|
||||
("Image Processing", ["image processing", "grayscale image"]),
|
||||
("Histogram", ["histogram"]),
|
||||
("Otsu Thresholding", ["otsu"]),
|
||||
("Dilated Convolution", ["dilated convolution"]),
|
||||
("3D Convolution", ["3d convolution"]),
|
||||
("Dropout", ["dropout"]),
|
||||
],
|
||||
"Evaluation Metrics": [
|
||||
("Evaluation Metrics", ["evaluation", "metric"]),
|
||||
("Confusion Matrix", ["confusion matrix"]),
|
||||
("Accuracy", ["accuracy"]),
|
||||
("Precision", ["precision"]),
|
||||
("Recall", ["recall"]),
|
||||
("F1 Score", ["f1"]),
|
||||
("Macro F1", ["macro f1"]),
|
||||
],
|
||||
"Cross Validation": [
|
||||
("Cross Validation", ["cross validation", "cross-validation", "d-fold", "k-fold"]),
|
||||
("Train Validation Split", ["validation set", "train", "test fold"]),
|
||||
("Model Selection", ["choose k", "which k", "fold"]),
|
||||
("Data Shuffling", ["shuffle", "shuffling"]),
|
||||
],
|
||||
"Python and NumPy": [
|
||||
("Python and NumPy", ["numpy", "python"]),
|
||||
("NumPy", ["numpy", "np."]),
|
||||
("Broadcasting", ["broadcast"]),
|
||||
("Array Indexing", ["index", "slice"]),
|
||||
("Vectorization", ["no explicit loops", "vectorized"]),
|
||||
("Matrix Multiplication", ["matmul", "matrix multiplication", "@"]),
|
||||
("Reshape", ["reshape"]),
|
||||
("Transpose", ["transpose"]),
|
||||
("Masking", ["mask"]),
|
||||
("Convolution", ["convolve"]),
|
||||
],
|
||||
"Search Algorithms": [
|
||||
("Search Algorithms", ["search"]),
|
||||
("Breadth-First Search", ["breadth-first", "breadth first", "bfs"]),
|
||||
("Depth-First Search", ["depth-first", "depth first", "dfs"]),
|
||||
("Best-First Search", ["best-first", "best first"]),
|
||||
("A* Search", ["a* search", "a star", "astar"]),
|
||||
("Heuristic", ["heuristic"]),
|
||||
],
|
||||
"Game Trees": [
|
||||
("Game Trees", ["game tree", "minimax", "alpha-beta", "alpha beta"]),
|
||||
("Minimax", ["minimax"]),
|
||||
("Alpha-Beta Pruning", ["alpha-beta", "alpha beta", "pruned"]),
|
||||
("Utility", ["utility"]),
|
||||
],
|
||||
"Ethics of AI": [
|
||||
("Ethics of AI", ["ethics", "ethical"]),
|
||||
("Bias", ["bias"]),
|
||||
("Privacy", ["privacy"]),
|
||||
("Fairness", ["fair"]),
|
||||
("Research Ethics", ["informed consent", "human participants"]),
|
||||
("Governance", ["monitoring", "production", "organizations"]),
|
||||
("Autonomous Vehicles", ["autonomous vehicle"]),
|
||||
],
|
||||
}
|
||||
|
||||
|
||||
TOPIC_DEFAULTS = {
|
||||
"Naive Bayes": ["Likelihood", "Posterior"],
|
||||
"Bayesian Inference": ["Conditional Probability", "Bayes Rule"],
|
||||
"KNN": ["Classification", "Distance Calculation"],
|
||||
"K-Means": ["Centroid Update", "Convergence"],
|
||||
"Perceptron": ["Decision Boundary", "Weight Update"],
|
||||
"MLP": ["Activation Function", "Hidden Layer"],
|
||||
"CNN": ["Convolution", "Padding"],
|
||||
"Evaluation Metrics": ["Confusion Matrix", "F1 Score"],
|
||||
"Cross Validation": ["Train Validation Split", "Model Selection"],
|
||||
"Python and NumPy": ["NumPy", "Vectorization"],
|
||||
"Search Algorithms": ["Breadth-First Search", "Heuristic"],
|
||||
"Game Trees": ["Minimax", "Alpha-Beta Pruning"],
|
||||
"Ethics of AI": ["Bias", "Fairness"],
|
||||
}
|
||||
|
||||
DEFAULT_SKILLS = {
|
||||
"Naive Bayes": ["Probability Reasoning"],
|
||||
"Bayesian Inference": ["Probability Reasoning"],
|
||||
"KNN": ["Classification Decision"],
|
||||
"K-Means": ["Centroid Update"],
|
||||
"Perceptron": ["Decision Boundary"],
|
||||
"MLP": ["Concept Explanation"],
|
||||
"CNN": ["Concept Explanation"],
|
||||
"Evaluation Metrics": ["Metric Reasoning"],
|
||||
"Cross Validation": ["Model Selection"],
|
||||
"Python and NumPy": ["Code Tracing"],
|
||||
"Search Algorithms": ["Algorithm Tracing"],
|
||||
"Game Trees": ["Game Reasoning"],
|
||||
"Ethics of AI": ["Ethical Reasoning"],
|
||||
}
|
||||
|
||||
|
||||
def unique_keep_order(values: list[str]) -> list[str]:
    # Drop empty values and duplicates while keeping first-seen order.
    return list(OrderedDict((value, None) for value in values if value).keys())


def build_topic_tags(question: dict, analytics_topic: str) -> list[str]:
    text = text_blob(question)
    tags: list[str] = [analytics_topic]
    # Add every concept label whose keywords appear in the question text.
    for label, keywords in TOPIC_CONCEPTS.get(analytics_topic, []):
        if label == analytics_topic:
            continue
        if has_any(text, keywords):
            tags.append(label)
    # Pad with topic defaults until there are at least two distinct tags.
    for default in TOPIC_DEFAULTS.get(analytics_topic, []):
        if len(unique_keep_order(tags)) >= 2:
            break
        tags.append(default)
    tags = unique_keep_order(tags)
    return tags[:5]


def build_skill_tags(question: dict, analytics_topic: str) -> list[str]:
    raw = question.get("skill_tags") or []
    converted = unique_keep_order([normalize_skill_tag(tag) for tag in raw])
    if not converted:
        converted = DEFAULT_SKILLS.get(analytics_topic, ["Concept Check"])
    return converted[:3]


def main() -> None:
    sb = get_supabase()
    papers = (
        sb.table("papers")
        .select("id")
        .eq("course_code", "COMP2211")
        .eq("source_kind", "course_library")
        .execute()
        .data
    )
    paper_ids = [paper["id"] for paper in papers]
    if not paper_ids:
        print("No COMP2211 course-library papers found.")
        return

    questions = (
        sb.table("paper_questions")
        .select("id, paper_id, question_number, question_text, raw_answer_text, analytics_topic, topic_tags, skill_tags, topics")
        .in_("paper_id", paper_ids)
        .order("paper_id")
        .order("display_order")
        .execute()
        .data
    )

    # Recompute the topic and skill fields for every question, one row at a time.
    for question in questions:
        analytics_topic = infer_analytics_topic(question)
        topic_tags = build_topic_tags(question, analytics_topic)
        skill_tags = build_skill_tags(question, analytics_topic)
        payload = {
            "analytics_topic": analytics_topic,
            "topic_primary": analytics_topic,
            "topic_tags": topic_tags,
            "topics": topic_tags,
            "skill_tags": skill_tags,
        }
        sb.table("paper_questions").update(payload).eq("id", question["id"]).execute()

    print(f"Backfilled {len(questions)} COMP2211 questions.")


if __name__ == "__main__":
    main()
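The tag-building step of this script can be tried in isolation. The following is a minimal sketch rather than the script's actual code: `has_any` is defined outside this excerpt, so the version below assumes it is a case-insensitive substring check, and `CONCEPTS` / `DEFAULTS` are cut-down, hypothetical stand-ins for `TOPIC_CONCEPTS` and `TOPIC_DEFAULTS`.

```python
from collections import OrderedDict

# Cut-down stand-ins for TOPIC_CONCEPTS / TOPIC_DEFAULTS (illustrative only).
CONCEPTS = {
    "K-Means": [
        ("K-Means", ["k-means", "k means"]),
        ("Centroid Update", ["centroid"]),
        ("Convergence", ["converged", "convergence"]),
    ],
}
DEFAULTS = {"K-Means": ["Centroid Update", "Convergence"]}


def has_any(text: str, keywords: list[str]) -> bool:
    # Assumption: the real helper is a case-insensitive substring match.
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


def unique_keep_order(values: list[str]) -> list[str]:
    # Drop empty values and duplicates while keeping first-seen order.
    return list(OrderedDict((value, None) for value in values if value).keys())


def build_tags(text: str, topic: str, limit: int = 5) -> list[str]:
    tags = [topic]
    # Keyword hits come first, in the order the concept table lists them.
    for label, keywords in CONCEPTS.get(topic, []):
        if label != topic and has_any(text, keywords):
            tags.append(label)
    # Pad with defaults until there are at least two distinct tags.
    for default in DEFAULTS.get(topic, []):
        if len(unique_keep_order(tags)) >= 2:
            break
        tags.append(default)
    return unique_keep_order(tags)[:limit]
```

A question with no keyword hits still gets two tags (the topic plus its first default), which is presumably why the defaults table exists.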
|
||||
169
backend/backfill_null_ai_trio.py
Normal file
@@ -0,0 +1,169 @@
|
||||
"""Backfill AI trio for questions where knowledge_reminder IS NULL.
|
||||
|
||||
For each question, generates fields in two separate LLM calls to avoid token truncation:
|
||||
Call 1 → knowledge_reminder + ai_hint (short, ~500 tokens output)
|
||||
Call 2 → solution (long, up to 4096 tokens output)
|
||||
|
||||
Run from the backend directory:
|
||||
uv run python backfill_null_ai_trio.py [--dry-run]
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import json
|
||||
import sys
|
||||
from app.services.supabase_client import get_supabase
|
||||
from app.services.paper_processor import qwen_json_completion
|
||||
|
||||
|
||||
KNOWLEDGE_HINT_PROMPT = """\
|
||||
You are an expert tutor. Given a past-paper question, produce two short study aids in English.
|
||||
|
||||
Return JSON exactly:
|
||||
{{
|
||||
"knowledge_reminder": "2-4 sentences summarising the key concept or formula the student must recall.",
|
||||
"ai_hint": "1-3 sentence nudge that guides WITHOUT giving the answer away."
|
||||
}}
|
||||
|
||||
Question:
|
||||
{payload}
|
||||
"""
|
||||
|
||||
SOLUTION_PROMPT = """\
|
||||
You are an expert tutor. Given a past-paper question and its reference answer, write a clear, \
|
||||
step-by-step model solution in English. Show all working. Be thorough but stop when the answer \
|
||||
is complete — do not pad.
|
||||
|
||||
Return JSON exactly:
|
||||
{{
|
||||
"solution": "<full step-by-step solution as a single string, use \\n for line breaks>"
|
||||
}}
|
||||
|
||||
Question:
|
||||
{payload}
|
||||
"""
|
||||
|
||||
|
||||
def build_payload(q: dict) -> dict:
    # Pick the best available reference answer: raw text, then option, then short answer.
    ref = ""
    if q.get("raw_answer_text"):
        ref = q["raw_answer_text"]
    elif q.get("correct_option"):
        ref = f"Correct option: {q['correct_option']}"
    elif q.get("correct_answer"):
        ref = f"Correct answer: {q['correct_answer']}"

    return {
        "question_number": q["question_number"],
        "question_type": q["question_type"] or "long_question",
        "score": q.get("score") or "unknown",
        "question_text": q.get("question_text") or "",
        "topics": q.get("topics") or [],
        "reference_answer": ref,
    }
|
||||
|
||||
|
||||
async def process_one(sb, q: dict, dry_run: bool) -> bool:
|
||||
payload_str = json.dumps(build_payload(q), ensure_ascii=False)
|
||||
row_id = q["id"]
|
||||
qnum = q["question_number"]
|
||||
|
||||
if dry_run:
|
||||
print(f" [dry-run] would process {qnum}")
|
||||
return True
|
||||
|
||||
update: dict = {}
|
||||
|
||||
# ── Call 1: knowledge_reminder + ai_hint ─────────────────────────
|
||||
try:
|
||||
r1 = await qwen_json_completion(
|
||||
system_prompt=KNOWLEDGE_HINT_PROMPT.format(payload=payload_str),
|
||||
temperature=0.3,
|
||||
max_tokens=1024,
|
||||
)
|
||||
if r1.get("knowledge_reminder"):
|
||||
update["knowledge_reminder"] = r1["knowledge_reminder"]
|
||||
if r1.get("ai_hint"):
|
||||
update["ai_hint"] = r1["ai_hint"]
|
||||
except Exception as e:
|
||||
print(f" WARN call-1 failed for {qnum}: {e}")
|
||||
|
||||
await asyncio.sleep(1)
|
||||
|
||||
# ── Call 2: solution ──────────────────────────────────────────────
|
||||
try:
|
||||
r2 = await qwen_json_completion(
|
||||
system_prompt=SOLUTION_PROMPT.format(payload=payload_str),
|
||||
temperature=0.3,
|
||||
max_tokens=4096,
|
||||
)
|
||||
if r2.get("solution"):
|
||||
update["solution"] = r2["solution"]
|
||||
except Exception as e:
|
||||
print(f" WARN call-2 failed for {qnum}: {e}")
|
||||
|
||||
if not update:
|
||||
print(f" SKIP {qnum}: both calls returned nothing")
|
||||
return False
|
||||
|
||||
sb.table("paper_questions").update(update).eq("id", row_id).execute()
|
||||
return True
|
||||
|
||||
|
||||
async def backfill(dry_run: bool = False) -> None:
|
||||
sb = get_supabase()
|
||||
|
||||
papers = (
|
||||
sb.table("papers")
|
||||
.select("id")
|
||||
.eq("course_code", "COMP2211")
|
||||
.eq("source_kind", "course_library")
|
||||
.execute()
|
||||
.data
|
||||
)
|
||||
paper_ids = [p["id"] for p in papers]
|
||||
if not paper_ids:
|
||||
print("No COMP2211 course-library papers found.")
|
||||
return
|
||||
|
||||
questions = (
|
||||
sb.table("paper_questions")
|
||||
.select("id, paper_id, question_number, question_type, score, question_text, topics, raw_answer_text, correct_option, correct_answer")
|
||||
.in_("paper_id", paper_ids)
|
||||
.is_("knowledge_reminder", "null")
|
||||
.order("paper_id")
|
||||
.order("display_order")
|
||||
.execute()
|
||||
.data
|
||||
)
|
||||
|
||||
if not questions:
|
||||
print("No NULL questions found — all done!")
|
||||
return
|
||||
|
||||
print(f"Found {len(questions)} questions with NULL knowledge_reminder.")
|
||||
|
||||
# Group by paper for cleaner output
|
||||
from collections import defaultdict
|
||||
by_paper: dict[str, list] = defaultdict(list)
|
||||
for q in questions:
|
||||
by_paper[q["paper_id"]].append(q)
|
||||
|
||||
total_updated = 0
|
||||
for paper_idx, (paper_id, qs) in enumerate(by_paper.items(), 1):
|
||||
print(f"\n[{paper_idx}/{len(by_paper)}] paper_id={paper_id} — {len(qs)} NULL questions")
|
||||
for q in qs:
|
||||
print(f" Processing {q['question_number']}...", end=" ", flush=True)
|
||||
ok = await process_one(sb, q, dry_run)
|
||||
if ok:
|
||||
total_updated += 1
|
||||
print("done")
|
||||
await asyncio.sleep(1.5)
|
||||
|
||||
print(f"\nDone. {total_updated}/{len(questions)} questions updated.")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
dry_run = "--dry-run" in sys.argv
|
||||
asyncio.run(backfill(dry_run=dry_run))
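The two-call split described in this script's docstring can be sketched without a real model. This is an illustration under stated assumptions: `call_llm` and `fake_llm` are hypothetical stand-ins for `qwen_json_completion`, whose real signature lives in `app/services/paper_processor.py` and is not reproduced here.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical stand-in for an LLM client: takes a prompt, returns parsed JSON.
LLMCall = Callable[[str], Awaitable[dict]]


async def generate_study_aids(call_llm: LLMCall, question: str) -> dict:
    update: dict = {}

    # Call 1: the two short fields share one request.
    try:
        short = await call_llm(f"reminder+hint: {question}")
        for key in ("knowledge_reminder", "ai_hint"):
            if short.get(key):
                update[key] = short[key]
    except Exception as exc:
        print(f"WARN call-1 failed: {exc}")

    # Call 2: the long solution gets its own request and token budget.
    try:
        detailed = await call_llm(f"solution: {question}")
        if detailed.get("solution"):
            update["solution"] = detailed["solution"]
    except Exception as exc:
        print(f"WARN call-2 failed: {exc}")

    return update


async def fake_llm(prompt: str) -> dict:
    # Simulates a failed solution call while the short call succeeds.
    if prompt.startswith("solution"):
        raise RuntimeError("output truncated")
    return {"knowledge_reminder": "Bayes rule", "ai_hint": "Start from the prior."}
```

Because each call has its own `try` block, a failure in one still lets the other field set be written, at the cost of a possibly partial row. Note that the backfill query selects on `knowledge_reminder IS NULL`, so a row whose solution call failed would not be picked up again by a later run.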
|
||||
135
backend/backfill_similar_questions.py
Normal file
@@ -0,0 +1,135 @@
|
||||
"""Pre-compute similar_questions for all COMP2211 course-library questions.
|
||||
|
||||
For each question, runs the same similarity logic as the API and writes the result
|
||||
into paper_questions.similar_questions (JSONB). The API will then return this
|
||||
pre-computed value directly with no computation overhead.
|
||||
|
||||
Run from the backend directory:
|
||||
uv run python backfill_similar_questions.py [--dry-run]
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import sys
|
||||
from collections import Counter
|
||||
from app.services.supabase_client import get_supabase
|
||||
from app.routers.questions import (
|
||||
similarity_score,
|
||||
question_family,
|
||||
display_topics,
|
||||
)
|
||||
|
||||
|
||||
def run(dry_run: bool = False) -> None:
|
||||
sb = get_supabase()
|
||||
|
||||
# Fetch all ready COMP2211 papers
|
||||
papers = (
|
||||
sb.table("papers")
|
||||
.select("id, year, term, exam_type, part_label")
|
||||
.eq("course_code", "COMP2211")
|
||||
.eq("status", "ready")
|
||||
.execute()
|
||||
.data
|
||||
)
|
||||
if not papers:
|
||||
print("No ready COMP2211 papers found.")
|
||||
return
|
||||
|
||||
papers_by_id = {p["id"]: p for p in papers}
|
||||
paper_ids = list(papers_by_id.keys())
|
||||
|
||||
# Fetch all questions for these papers
|
||||
all_questions = (
|
||||
sb.table("paper_questions")
|
||||
.select(
|
||||
"id, paper_id, question_number, question_type, question_format, "
|
||||
"question_text, score, topics, analytics_topic, topic_tags, skill_tags, "
|
||||
"difficulty, knowledge_reminder, ai_hint, solution"
|
||||
)
|
||||
.in_("paper_id", paper_ids)
|
||||
.execute()
|
||||
.data
|
||||
)
|
||||
print(f"Found {len(all_questions)} questions across {len(papers)} papers.")
|
||||
|
||||
# Batch full-text scores not practical here; skip RPC, rely on tag/topic scoring
|
||||
# (text_score = 0 for all, still produces good tag-based results)
|
||||
|
||||
updated = 0
|
||||
skipped = 0
|
||||
|
||||
for i, target in enumerate(all_questions, 1):
|
||||
target_paper_id = target["paper_id"]
|
||||
target_topic = target.get("analytics_topic")
|
||||
|
||||
# Candidates: same course, different paper
|
||||
candidates = [
|
||||
q for q in all_questions
|
||||
if q["paper_id"] != target_paper_id
|
||||
]
|
||||
|
||||
# Pre-filter by analytics_topic if available
|
||||
if target_topic:
|
||||
candidates = [c for c in candidates if c.get("analytics_topic") == target_topic]
|
||||
|
||||
if not candidates:
|
||||
skipped += 1
|
||||
print(f" [{i}/{len(all_questions)}] {target['question_number']} — no candidates, skip")
|
||||
continue
|
||||
|
||||
ranked = []
|
||||
for candidate in candidates:
|
||||
match_percent, reasons = similarity_score(target, candidate, text_score=0.0)
|
||||
if match_percent < 20:
|
||||
continue
|
||||
paper = papers_by_id.get(candidate["paper_id"], {})
|
||||
source = (
|
||||
f"{paper.get('year', '')} {paper.get('term', '').title()} "
|
||||
f"{paper.get('exam_type', '').title()}"
|
||||
).strip()
|
||||
if paper.get("part_label"):
|
||||
source = f"{source} Part {paper['part_label']}"
|
||||
ranked.append({
|
||||
"id": candidate["id"],
|
||||
"paper_id": candidate["paper_id"],
|
||||
"source": source,
|
||||
"question_number": candidate["question_number"],
|
||||
"match_percent": match_percent,
|
||||
"match_reasons": reasons,
|
||||
"question_type": question_family(candidate),
|
||||
"question_text": candidate["question_text"],
|
||||
"topics": display_topics(candidate),
|
||||
"difficulty": candidate.get("difficulty"),
|
||||
"knowledge_reminder": candidate.get("knowledge_reminder", ""),
|
||||
"ai_hint": candidate.get("ai_hint", ""),
|
||||
"solution": candidate.get("solution", ""),
|
||||
})
|
||||
|
||||
ranked.sort(key=lambda item: (-item["match_percent"], item["source"], item["question_number"]))
|
||||
|
||||
# Deduplicate: best per paper
|
||||
seen_papers: set[str] = set()
|
||||
deduped = []
|
||||
for item in ranked:
|
||||
if item["paper_id"] not in seen_papers:
|
||||
seen_papers.add(item["paper_id"])
|
||||
deduped.append(item)
|
||||
deduped = deduped[:12]
|
||||
|
||||
print(f" [{i}/{len(all_questions)}] {target['question_number']} → {len(deduped)} similar", end="")
|
||||
|
||||
if dry_run:
|
||||
print(" [dry-run]")
|
||||
continue
|
||||
|
||||
sb.table("paper_questions").update({"similar_questions": deduped}).eq("id", target["id"]).execute()
|
||||
updated += 1
|
||||
print()
|
||||
|
||||
print(f"\nDone. {updated} updated, {skipped} skipped (no candidates).")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
dry_run = "--dry-run" in sys.argv
|
||||
run(dry_run=dry_run)
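The ranking and "best per paper" deduplication in `run()` can be isolated as a small pure function. This is a sketch, not the API's code: `best_per_paper` is a hypothetical name, and the input is assumed to be already-scored dicts carrying the same `match_percent`, `source`, `question_number`, and `paper_id` keys that the script builds.

```python
def best_per_paper(scored: list[dict], min_percent: int = 20, limit: int = 12) -> list[dict]:
    # Drop weak matches, then sort by descending score with stable tie-breakers.
    ranked = [item for item in scored if item["match_percent"] >= min_percent]
    ranked.sort(key=lambda item: (-item["match_percent"], item["source"], item["question_number"]))

    # After sorting, the first item seen for each paper is that paper's best match.
    seen_papers: set[str] = set()
    deduped = []
    for item in ranked:
        if item["paper_id"] not in seen_papers:
            seen_papers.add(item["paper_id"])
            deduped.append(item)
    return deduped[:limit]


example = [
    {"paper_id": "p1", "source": "2022 Fall Midterm", "question_number": "1a", "match_percent": 40},
    {"paper_id": "p1", "source": "2022 Fall Midterm", "question_number": "2b", "match_percent": 75},
    {"paper_id": "p2", "source": "2023 Spring Midterm", "question_number": "3", "match_percent": 60},
    {"paper_id": "p3", "source": "2024 Spring Final", "question_number": "4", "match_percent": 10},
]
top = best_per_paper(example)
```

Sorting before deduplicating is what makes the single pass sufficient: each paper contributes at most one entry, and it is always the highest-scoring one.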
|
||||
238
backend/backfill_vision.py
Normal file
@@ -0,0 +1,238 @@
|
||||
"""
|
||||
Reprocess every paper that is already `ready`, using Vision mode:
- Fetch the PDF from Supabase Storage → images → Vision question extraction → exec → AI trio → update DB

Usage:
    python backfill_vision.py --course COMP2211
    python backfill_vision.py --paper-id <uuid>
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import argparse
|
||||
import requests
|
||||
from app.services.supabase_client import get_supabase
|
||||
from app.services.paper_processor import (
|
||||
process_paper,
|
||||
strip_nulls,
|
||||
pdf_to_images,
|
||||
gemini_vision_json,
|
||||
deepseek_json_completion,
|
||||
parse_json_response,
|
||||
extract_code_lines,
|
||||
try_exec_python,
|
||||
chunked,
|
||||
sort_questions,
|
||||
STRUCTURE_PROMPT,
|
||||
ANSWER_MATCH_PROMPT,
|
||||
BATCH_ANALYSIS_PROMPT,
|
||||
)
|
||||
import json
|
||||
import traceback
|
||||
|
||||
|
||||
async def reprocess_paper(paper: dict):
    """Reprocess a single paper (Vision mode)."""
    sb = get_supabase()
    paper_id = paper["id"]
    label = f"{paper['course_code']} {paper['year']} {paper['term']} {paper['exam_type']}"
    print(f"\n=== {label} ({paper_id[:8]}) ===")

    # 1. Fetch the PDF
    try:
        response = requests.get(paper["paper_file_url"], timeout=60)
        response.raise_for_status()  # an HTTP error page must not be parsed as a PDF
        pdf_bytes = response.content
    except Exception as e:
        print(f" SKIP: failed to fetch PDF: {e}")
        return

    answer_bytes = None
    if paper.get("answer_file_url"):
        try:
            answer_response = requests.get(paper["answer_file_url"], timeout=60)
            answer_response.raise_for_status()
            answer_bytes = answer_response.content
        except Exception as e:
            # The answer file is optional, so continue without it.
            print(f" WARN: failed to fetch answer PDF: {e}")

    # 2. PDF → images
    # Render once and reuse the images for both the page count and extraction.
    paper_images = pdf_to_images(pdf_bytes)
    print(f" Rendered {len(paper_images)} pages")

    # 3. Vision question extraction (in batches of 8 pages)
    PAGE_BATCH = 8
    all_questions: list = []
    meta: dict = {}
    print(f" Vision extraction ({len(paper_images)} pages, {-(-len(paper_images)//PAGE_BATCH)} batches)...")
    for i in range(0, len(paper_images), PAGE_BATCH):
        batch_imgs = paper_images[i:i + PAGE_BATCH]
        print(f" Pages {i+1}-{i+len(batch_imgs)}...", end=" ", flush=True)
        try:
            batch_result = await gemini_vision_json(
                system_prompt=STRUCTURE_PROMPT,
                images=batch_imgs,
                user_text=f"Pages {i+1}-{i+len(batch_imgs)} of the exam paper. Extract all questions visible on these pages.",
                temperature=0,
            )
            # Paper-level metadata is taken from the first batch that succeeds.
            if not meta:
                meta = {k: batch_result.get(k) for k in ("total_score", "difficulty_level", "topics_summary")}
            qs = batch_result.get("questions", [])
            all_questions.extend(qs)
            print(f"done ({len(qs)} questions)")
        except Exception as e:
            print(f"FAILED: {e}")
    structure = {**meta, "questions": all_questions}
    questions = sort_questions(all_questions)
    print(f" Total: {len(questions)} questions extracted")

    # Step 7 deletes the existing rows, so stop here if extraction produced nothing.
    if not questions:
        print(" SKIP: no questions extracted; existing rows left untouched")
        return

    # 4. Answer matching
    answers_map = {}
    if answer_bytes:
        print(" Vision answer matching...", end=" ", flush=True)
        answer_images = pdf_to_images(answer_bytes)
        questions_json = json.dumps(
            [{"question_number": q["question_number"], "question_type": q["question_type"]}
             for q in questions], ensure_ascii=False
        )
        try:
            match_result = await gemini_vision_json(
                system_prompt=ANSWER_MATCH_PROMPT.format(
                    questions_json=questions_json, answer_text="(See images)"
                ),
                images=answer_images,
                user_text=f"Match answers to these questions: {questions_json}",
                temperature=0,
            )
            answers_map = {a["question_number"]: a for a in match_result.get("answers", [])}
            print(f"done ({len(answers_map)} matched)")
        except Exception as e:
            print(f"FAILED: {e}")

    # 5. Build payloads (exec Python)
    import numpy as np
    exec_namespaces: dict = {}
    batched_payloads = []

    for q in questions:
        qnum = q["question_number"]
        answer = answers_map.get(qnum, {})
        full_text = q["question_text"] or ""

        answer_section = ""
        if answer.get("raw_answer_text"):
            answer_section = answer["raw_answer_text"]
        elif answer.get("correct_option"):
            answer_section = f"Correct option: {answer['correct_option']}"
        elif answer.get("correct_answer"):
            answer_section = f"Correct answer: {answer['correct_answer']}"

        # No reference answer: try executing the question's own code to get one.
        # Subquestions of the same parent share one namespace.
        if not answer_section:
            parent_q = q.get("parent_question")
            group_key = parent_q or qnum
            if group_key not in exec_namespaces:
                ns: dict = {"np": np}
                setup = extract_code_lines(full_text)
                try_exec_python(setup, ns)
                exec_namespaces[group_key] = ns
            ns = exec_namespaces[group_key]
            print_lines = [l.strip() for l in full_text.splitlines() if l.strip().startswith("print(")]
            if print_lines:
                out = try_exec_python(print_lines[-1], ns)
                if out is not None:
                    answer_section = f"Executed output: {out}"
                    print(f" [exec] {qnum}: {out[:60]}")

        batched_payloads.append({
            "question_number": qnum,
            "question_type": q["question_type"],
            "score": q.get("score", "unknown"),
            "question_text": full_text,
            "topics": q.get("topics", []),
            "reference_answer": answer_section,
        })

    # 6. AI trio
    print(f" Generating AI trio ({len(batched_payloads)} questions, {len(list(chunked(batched_payloads, 3)))} batches)...")
    analyses: dict = {}
    for batch in chunked(batched_payloads, 3):
        nums = [p["question_number"] for p in batch]
        print(f" Batch {nums}...", end=" ", flush=True)
        try:
            result = await deepseek_json_completion(
                system_prompt=BATCH_ANALYSIS_PROMPT.format(
                    questions_payload=json.dumps(batch, ensure_ascii=False)
                ),
                temperature=0.3,
            )
            for item in result.get("analyses", []):
                if item.get("question_number"):
                    analyses[item["question_number"]] = item
            print(f"done ({len(result.get('analyses', []))})")
        except Exception as e:
            print(f"FAILED: {e}")
        await asyncio.sleep(1)

    # 7. Delete the old questions and write the new ones
    print(" Writing to DB...", end=" ", flush=True)
    sb.table("paper_questions").delete().eq("paper_id", paper_id).execute()

    for i, q in enumerate(questions):
        qnum = q["question_number"]
        answer = answers_map.get(qnum, {})
        analysis = analyses.get(qnum, {})
        sb.table("paper_questions").insert(strip_nulls({
            "paper_id": paper_id,
            "question_number": qnum,
            "parent_question": q.get("parent_question"),
            "display_order": i,
            "question_type": q["question_type"],
            "question_text": q["question_text"],
            "score": q.get("score"),
            "page_number": q.get("page_number"),
            "options": q.get("options"),
            "correct_option": answer.get("correct_option"),
            "correct_answer": answer.get("correct_answer"),
            "raw_answer_text": answer.get("raw_answer_text"),
            "topics": q.get("topics", []),
            # `topics` may be missing, None, or empty; fall back to None in all three cases.
            "analytics_topic": (q.get("topics") or [None])[0],
            "topic_tags": q.get("topics", []),
            "difficulty": q.get("difficulty"),
            "knowledge_reminder": analysis.get("knowledge_reminder", ""),
            "ai_hint": analysis.get("ai_hint", ""),
            "solution": analysis.get("solution", ""),
        })).execute()

    sb.table("papers").update({
        "question_count": len(questions),
        "total_score": structure.get("total_score"),
        "topics_summary": structure.get("topics_summary"),
        "difficulty_level": structure.get("difficulty_level"),
    }).eq("id", paper_id).execute()

    print(f"done ({len(questions)} questions written)")
|
||||
|
||||
|
||||
async def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--course", help="Course code")
    parser.add_argument("--paper-id", help="Single paper ID")
    args = parser.parse_args()

    sb = get_supabase()
    query = sb.table("papers").select("*").eq("status", "ready")
    if args.paper_id:
        query = query.eq("id", args.paper_id)
    elif args.course:
        query = query.eq("course_code", args.course.upper())
    papers = query.order("created_at").execute().data

    print(f"Papers to reprocess: {len(papers)}")
    for paper in papers:
        try:
            await reprocess_paper(paper)
        except Exception as e:
            print(f" ERROR: {e}")
            traceback.print_exc()

    print("\nAll done.")


if __name__ == "__main__":
    asyncio.run(main())
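The page batching in `reprocess_paper` relies on a slice-per-step loop and on `-(-n // b)` for the batch count. The sketch below restates both as standalone helpers; `batch_ranges` and `batch_count` are hypothetical names introduced only for illustration.

```python
def batch_ranges(page_count: int, batch_size: int = 8) -> list[tuple[int, int]]:
    """Return 1-based, inclusive (first_page, last_page) pairs for each batch."""
    ranges = []
    for start in range(0, page_count, batch_size):
        end = min(start + batch_size, page_count)
        ranges.append((start + 1, end))
    return ranges


def batch_count(page_count: int, batch_size: int = 8) -> int:
    # -(-a // b) is ceiling division using only integer arithmetic.
    return -(-page_count // batch_size)
```

For a 19-page paper this gives three batches, with the last one covering only pages 17 to 19, which matches the `Pages {i+1}-{i+len(batch_imgs)}` labels the script prints.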
|
||||
29
backend/fill_manual_study_aids.py
Normal file
@@ -0,0 +1,29 @@
|
||||
"""Deprecated: study aids must come from LLM output, not template fillers."""
|
||||
|
||||
from __future__ import annotations

import sys


MESSAGE = """
fill_manual_study_aids.py is intentionally disabled.

Reason:
- knowledge_reminder / ai_hint / solution must be generated by LLM
- template-based filler content polluted the COMP2211 course library

Use one of these paths instead:
1. Regenerate study aids through the real LLM pipeline in app/services/paper_processor.py
2. Rebuild paper_questions from a reviewed source and then run LLM generation

This script must not be used to backfill production study aids.
""".strip()


def main() -> None:
    print(MESSAGE, file=sys.stderr)
    raise SystemExit(1)


if __name__ == "__main__":
    main()
|
||||
240
backend/import_course_manifest.py
Normal file
@@ -0,0 +1,240 @@
|
||||
"""Import a canonical course manifest into Supabase-backed papers."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import asyncio
|
||||
import json
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
|
||||
from app.services.paper_processor import process_paper
|
||||
from app.services.supabase_client import get_supabase
|
||||
|
||||
|
||||
def parse_args() -> argparse.Namespace:
|
||||
parser = argparse.ArgumentParser(
|
||||
description="Import a canonical course paper manifest into Supabase."
|
||||
)
|
||||
parser.add_argument(
|
||||
"--manifest",
|
||||
type=Path,
|
||||
required=True,
|
||||
help="Path to the manifest JSON file.",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--papers-root",
|
||||
type=Path,
|
||||
required=True,
|
||||
help="Root folder that contains the course PDF files referenced by the manifest.",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--user-id",
|
||||
required=False,
|
||||
help="Existing auth.users UUID used as the owner of imported course-library rows.",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--course-code",
|
||||
help="Optional filter to only import entries from one course.",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--exam-key",
|
||||
action="append",
|
||||
dest="exam_keys",
|
||||
default=[],
|
||||
help="Optional exam_key filter. Repeat the flag to import multiple entries.",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--process",
|
||||
action="store_true",
|
||||
help="Run the full paper processing pipeline after the files are uploaded.",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--dry-run",
|
||||
action="store_true",
|
||||
help="Print what would be imported without uploading or writing database rows.",
|
||||
)
|
||||
return parser.parse_args()
|
||||
|
||||
|
||||
def load_manifest(path: Path) -> list[dict[str, Any]]:
    with path.open("r", encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError("Manifest must be a JSON array.")
    return data


def should_import(entry: dict[str, Any], args: argparse.Namespace) -> bool:
    if args.course_code and entry.get("course_code") != args.course_code:
        return False
    if args.exam_keys and entry.get("exam_key") not in set(args.exam_keys):
        return False
    return bool(entry.get("importable"))


def resolve_file_path(root: Path, filename: str | None) -> Path | None:
    if not filename:
        return None

    # 1. Exact match.
    direct = root / filename
    if direct.exists():
        return direct

    all_files = [candidate for candidate in root.iterdir() if candidate.is_file()]

    def normalize(name: str) -> str:
        # Ignore the " (1)" suffix that duplicate downloads pick up.
        return name.replace(" (1)", "")

    # 2. Unique match on the normalized file name.
    target_name = normalize(filename)
    normalized = [candidate for candidate in all_files if normalize(candidate.name) == target_name]
    if len(normalized) == 1:
        return normalized[0]

    # 3. Unique match on the normalized stem with the same suffix.
    path = Path(filename)
    normalized_stem = normalize(path.stem)
    suffix = path.suffix
    stem_matches = [
        candidate
        for candidate in all_files
        if candidate.suffix == suffix and normalize(candidate.stem) == normalized_stem
    ]
    if len(stem_matches) == 1:
        return stem_matches[0]

    return None


def read_file_bytes(root: Path, filename: str | None) -> bytes | None:
    if not filename:
        return None
    path = resolve_file_path(root, filename)
    if path is None or not path.exists():
        raise FileNotFoundError(f"Referenced file does not exist under {root}: {filename}")
    return path.read_bytes()


def build_storage_path(entry: dict[str, Any], kind: str) -> str:
    exam_key = entry["exam_key"]
    return f"course-library/{entry['course_code']}/{exam_key}/{kind}.pdf"
|
||||
|
||||
|
||||
def upsert_paper_record(
    entry: dict[str, Any],
    user_id: str | None,
    paper_url: str,
    answer_url: str | None,
) -> str:
    sb = get_supabase()
    payload = {
        "user_id": user_id,
        "course_code": entry["course_code"],
        "year": entry["year"],
        "term": entry["term"],
        "exam_type": entry["exam_type"],
        "part_label": entry.get("part_label"),
        "paper_file_url": paper_url,
        "answer_file_url": answer_url,
        "status": "processing",
        "source_kind": "course_library",
        "source_exam_key": entry["exam_key"],
        "source_question_filename": entry.get("question_pdf"),
        "source_answer_filename": entry.get("primary_answer_pdf"),
    }

    existing = (
        sb.table("papers")
        .select("id")
        .eq("source_kind", "course_library")
        .eq("source_exam_key", entry["exam_key"])
        .limit(1)
        .execute()
        .data
    )
    if existing:
        paper_id = existing[0]["id"]
        sb.table("papers").update(payload).eq("id", paper_id).execute()
        return paper_id

    created = sb.table("papers").insert(payload).execute().data
    return created[0]["id"]


def reset_existing_processed_data(paper_id: str) -> None:
    sb = get_supabase()
    sb.table("paper_questions").delete().eq("paper_id", paper_id).execute()
    sb.table("papers").update(
        {
            "status": "processing",
            "error_message": None,
            "paper_extracted_text": None,
            "answer_extracted_text": None,
            "total_score": None,
            "question_count": None,
            "topics_summary": None,
            "difficulty_level": None,
        }
    ).eq("id", paper_id).execute()


async def import_entry(
    entry: dict[str, Any],
    args: argparse.Namespace,
) -> None:
    paper_bytes = read_file_bytes(args.papers_root, entry.get("question_pdf"))
    answer_bytes = read_file_bytes(args.papers_root, entry.get("primary_answer_pdf"))

    if paper_bytes is None:
        raise ValueError(f"Importable entry is missing question PDF: {entry['exam_key']}")

    if args.dry_run:
        print(
            f"[dry-run] {entry['exam_key']}: "
            f"question={entry.get('question_pdf')} answer={entry.get('primary_answer_pdf')}"
        )
        return

    sb = get_supabase()
    paper_path = build_storage_path(entry, "paper")
    sb.storage.from_("papers").upload(
        paper_path,
        paper_bytes,
        file_options={"content-type": "application/pdf", "upsert": "true"},
    )
    paper_url = sb.storage.from_("papers").get_public_url(paper_path)

    answer_url = None
    if answer_bytes:
        answer_path = build_storage_path(entry, "answer")
        sb.storage.from_("papers").upload(
            answer_path,
            answer_bytes,
            file_options={"content-type": "application/pdf", "upsert": "true"},
        )
        answer_url = sb.storage.from_("papers").get_public_url(answer_path)

    paper_id = upsert_paper_record(entry, args.user_id, paper_url, answer_url)
    print(f"Imported metadata for {entry['exam_key']} -> paper_id={paper_id}")

    if args.process:
        reset_existing_processed_data(paper_id)
        await process_paper(paper_id, paper_bytes, answer_bytes)
        print(f"Processed {entry['exam_key']}")


async def main() -> None:
    args = parse_args()
    manifest = load_manifest(args.manifest)
    entries = [entry for entry in manifest if should_import(entry, args)]

    if not entries:
        print("No manifest entries matched the provided filters.")
        return

    print(f"Preparing to import {len(entries)} manifest entries.")
    for entry in entries:
        await import_entry(entry, args)


if __name__ == "__main__":
    asyncio.run(main())
|
||||
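The filename fallback in `resolve_file_path` is easier to follow on a concrete directory. The sketch below is a simplified stand-in rather than the script's own function: `resolve_with_duplicate_suffix`, `demo`, and the file names are hypothetical, and only the first two steps are mirrored (exact match, then a unique match after dropping a browser-style ` (1)` suffix).

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def resolve_with_duplicate_suffix(root: Path, filename: str) -> Path | None:
    """Hypothetical helper: exact match first, then a unique ' (1)'-insensitive match."""
    direct = root / filename
    if direct.exists():
        return direct

    def normalize(name: str) -> str:
        return name.replace(" (1)", "")

    target = normalize(filename)
    matches = [p for p in root.iterdir() if p.is_file() and normalize(p.name) == target]
    # Accept only an unambiguous hit; zero or several candidates resolve to None.
    return matches[0] if len(matches) == 1 else None


def demo() -> tuple[str | None, str | None]:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Simulate a re-downloaded file that the browser renamed.
        (root / "midterm (1).pdf").write_bytes(b"%PDF-")
        found = resolve_with_duplicate_suffix(root, "midterm.pdf")
        missing = resolve_with_duplicate_suffix(root, "final.pdf")
        return (found.name if found else None, missing.name if missing else None)


print(demo())
```

Returning `None` on an ambiguous match leaves the decision to the caller; in the import script that surfaces as a `FileNotFoundError` from `read_file_bytes` instead of a silently wrong upload.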
17
backend/pyproject.toml
Normal file
@@ -0,0 +1,17 @@
[project]
name = "pastpaper-master-backend"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
    "fastapi>=0.115.0",
    "uvicorn[standard]>=0.30.0",
    "python-dotenv>=1.0.0",
    "python-multipart>=0.0.9",
    "supabase>=2.0.0",
    "openai>=1.50.0",
    "PyMuPDF>=1.24.0",
    "pydantic>=2.0.0",
    "pydantic-settings>=2.0.0",
    "httpx>=0.27.0",
    "numpy>=2.4.4",
]
|
||||
174
backend/regen_ai_trio_comp2211.py
Normal file
@@ -0,0 +1,174 @@
"""Regenerate AI trio (knowledge_reminder, ai_hint, solution) for all COMP2211 course-library questions.

Reads existing paper_questions rows and runs the same BATCH_ANALYSIS_PROMPT used by
paper_processor.py, but does UPDATE instead of INSERT, so question structure is untouched.

Run from the backend directory:
    uv run python regen_ai_trio_comp2211.py

Pass --dry-run to print batches without calling the LLM or writing to the database.
"""

from __future__ import annotations

import asyncio
import json
import sys
from app.services.supabase_client import get_supabase
from app.services.paper_processor import BATCH_ANALYSIS_PROMPT, qwen_json_completion, chunked


def build_reference_answer(q: dict) -> str:
    if q.get("raw_answer_text"):
        return q["raw_answer_text"]
    if q.get("correct_option"):
        return f"Correct option: {q['correct_option']}"
    if q.get("correct_answer"):
        return f"Correct answer: {q['correct_answer']}"
    return ""


async def regen(dry_run: bool = False) -> None:
    sb = get_supabase()

    papers = (
        sb.table("papers")
        .select("id")
        .eq("course_code", "COMP2211")
        .eq("source_kind", "course_library")
        .execute()
        .data
    )
    paper_ids = [p["id"] for p in papers]
    if not paper_ids:
        print("No COMP2211 course-library papers found.")
        return

    questions = (
        sb.table("paper_questions")
        .select("id, paper_id, question_number, question_type, score, question_text, topics, raw_answer_text, correct_option, correct_answer")
        .in_("paper_id", paper_ids)
        .order("paper_id")
        .order("display_order")
        .execute()
        .data
    )
    print(f"Found {len(questions)} questions across {len(paper_ids)} papers.")

    payloads = [
        {
            "question_number": q["question_number"],
            "question_type": q["question_type"] or "long_question",
            "score": q.get("score") or "unknown",
            "question_text": q.get("question_text") or "",
            "topics": q.get("topics") or [],
            "reference_answer": build_reference_answer(q),
        }
        for q in questions
    ]

    # Group (row id, payload) pairs by paper so batches don't mix papers (cleaner context for LLM).
    # Question numbers repeat across papers, so rows are tracked by id, never by number alone.
    from collections import defaultdict
    payloads_by_paper: dict[str, list[tuple[str, dict]]] = defaultdict(list)
    for q, payload in zip(questions, payloads):
        payloads_by_paper[q["paper_id"]].append((q["id"], payload))

    total_updated = 0
    total_papers = len(payloads_by_paper)

    for paper_idx, (paper_id, items) in enumerate(payloads_by_paper.items(), 1):
        ids = [item[0] for item in items]
        batch_payloads = [item[1] for item in items]

        print(f"\n[{paper_idx}/{total_papers}] paper_id={paper_id}: {len(batch_payloads)} questions")

        for batch_idx, batch in enumerate(chunked(batch_payloads, 3), 1):
            print(f" Batch {batch_idx}: questions {[b['question_number'] for b in batch]}", end="", flush=True)

            if dry_run:
                print(" [dry-run, skipped]")
                continue

            # Row ids for this batch, aligned with the chunk size of 3 used above.
            batch_start = (batch_idx - 1) * 3
            batch_ids = ids[batch_start: batch_start + 3]

            async def run_single(row_id: str, payload: dict) -> bool:
                try:
                    r = await qwen_json_completion(
                        system_prompt=BATCH_ANALYSIS_PROMPT.format(
                            questions_payload=json.dumps([payload], ensure_ascii=False),
                        ),
                        temperature=0.3,
                        max_tokens=8192,
                    )
                    items = r.get("analyses", [])
                    if not items:
                        return False
                    analysis = items[0]
                    sb.table("paper_questions").update({
                        "knowledge_reminder": analysis.get("knowledge_reminder", ""),
                        "ai_hint": analysis.get("ai_hint", ""),
                        "solution": analysis.get("solution", ""),
                    }).eq("id", row_id).execute()
                    return True
                except Exception:
                    return False

            try:
                result = await qwen_json_completion(
                    system_prompt=BATCH_ANALYSIS_PROMPT.format(
                        questions_payload=json.dumps(batch, ensure_ascii=False),
                    ),
                    temperature=0.3,
                    max_tokens=8192,
                )
                analyses = {item["question_number"]: item for item in result.get("analyses", [])}
                written = 0
                for row_id, payload in zip(batch_ids, batch):
                    qnum = payload["question_number"]
                    analysis = analyses.get(qnum)
                    if not analysis:
                        # Fallback: retry this single question alone.
                        ok = await run_single(row_id, payload)
                        if ok:
                            written += 1
                            total_updated += 1
                        else:
                            print(f"\n SKIP: {qnum}")
                    else:
                        sb.table("paper_questions").update({
                            "knowledge_reminder": analysis.get("knowledge_reminder", ""),
                            "ai_hint": analysis.get("ai_hint", ""),
                            "solution": analysis.get("solution", ""),
                        }).eq("id", row_id).execute()
                        written += 1
                        total_updated += 1
                print(f" → {written} written")
            except Exception as exc:
                # Batch failed entirely: report why, then retry each question individually.
                print(f" [batch error: {exc!r}, retrying 1-by-1]")
                written = 0
                for row_id, payload in zip(batch_ids, batch):
                    ok = await run_single(row_id, payload)
                    if ok:
                        written += 1
                        total_updated += 1
                    await asyncio.sleep(1)
                print(f" → {written}/{len(batch)} written")

            await asyncio.sleep(2.5)

    print(f"\nDone. {total_updated} questions updated.")


if __name__ == "__main__":
    dry_run = "--dry-run" in sys.argv
    asyncio.run(regen(dry_run=dry_run))
|
||||
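The batch-then-retry flow in `regen` can be tried without a database or an LLM. The sketch below is illustrative only: `fake_completion` stands in for `qwen_json_completion`, the local `chunked` is an assumed reimplementation of the imported helper, and `analyse` is a hypothetical name. It chunks `(row_id, payload)` pairs together, which is one way to avoid recomputing slice offsets per batch.

```python
from __future__ import annotations

import asyncio


def chunked(items: list, size: int) -> list[list]:
    """Assumed behaviour of the imported helper: consecutive slices of `size` items."""
    return [items[i : i + size] for i in range(0, len(items), size)]


async def fake_completion(batch: list[dict]) -> dict:
    """Stand-in for the LLM call: omits question '2' whenever it shares a batch."""
    kept = [q for q in batch if len(batch) == 1 or q["question_number"] != "2"]
    return {"analyses": [{"question_number": q["question_number"], "solution": "..."} for q in kept]}


async def analyse(rows: list[tuple[str, dict]]) -> dict[str, dict]:
    results: dict[str, dict] = {}
    for batch in chunked(rows, 3):
        reply = await fake_completion([payload for _, payload in batch])
        by_number = {a["question_number"]: a for a in reply.get("analyses", [])}
        for row_id, payload in batch:
            analysis = by_number.get(payload["question_number"])
            if analysis is None:
                # Fallback: retry the missing question on its own.
                single = await fake_completion([payload])
                analysis = (single.get("analyses") or [None])[0]
            if analysis is not None:
                results[row_id] = analysis
    return results


rows = [(f"row-{n}", {"question_number": str(n)}) for n in range(1, 6)]
updated = asyncio.run(analyse(rows))
print(sorted(updated))
```

Keying results by row id rather than by question number matters once several papers are processed in one run, because numbers such as `1a` repeat across papers.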
69
backend/regenerate_analysis.py
Normal file
@@ -0,0 +1,69 @@
"""Re-generate AI trio (knowledge_reminder, ai_hint, solution) in English for existing questions."""

import json
import asyncio
from app.services.supabase_client import get_supabase
from app.services.llm_clients import get_qwen_client
from app.services.paper_processor import ANALYSIS_PROMPT


async def regenerate_for_paper(paper_id: str):
    sb = get_supabase()
    qwen = get_qwen_client()

    questions = sb.table("paper_questions").select("*").eq("paper_id", paper_id).order("display_order").execute().data
    print(f"Found {len(questions)} questions for paper {paper_id[:8]}")

    for q in questions:
        qnum = q["question_number"]
        print(f" Regenerating Q{qnum}...", end=" ", flush=True)

        answer_section = ""
        if q.get("raw_answer_text"):
            answer_section = f"- Reference answer: {q['raw_answer_text']}"
        elif q.get("correct_option"):
            answer_section = f"- Correct option: {q['correct_option']}"
        elif q.get("correct_answer"):
            answer_section = f"- Correct answer: {q['correct_answer']}"

        resp = qwen.chat.completions.create(
            model="qwen-plus",
            messages=[
                {"role": "system", "content": ANALYSIS_PROMPT.format(
                    question_number=qnum,
                    question_type=q["question_type"],
                    # select("*") returns NULL columns as None, so a .get() default is never used;
                    # fall back with `or` instead.
                    score=q.get("score") or "unknown",
                    question_text=q["question_text"],
                    topics=", ".join(q.get("topics") or []),
                    answer_section=answer_section,
                )},
            ],
            temperature=0.3,
            response_format={"type": "json_object"},
        )
        analysis = json.loads(resp.choices[0].message.content)

        sb.table("paper_questions").update({
            "knowledge_reminder": analysis.get("knowledge_reminder", ""),
            "ai_hint": analysis.get("ai_hint", ""),
            "solution": analysis.get("solution", ""),
        }).eq("id", q["id"]).execute()

        print("done")

    print(f"All questions regenerated for paper {paper_id[:8]}")


async def main():
    sb = get_supabase()
    papers = sb.table("papers").select("id,course_code,year,term").eq("status", "ready").order("created_at", desc=True).execute().data

    for p in papers:
        print(f"\n=== {p['course_code']} {p['year']} {p['term']} ===")
        await regenerate_for_paper(p["id"])

    print("\nAll done!")


if __name__ == "__main__":
    asyncio.run(main())
|
||||
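One detail worth checking when building prompt fields from database rows: `dict.get(key, default)` only falls back when the key is absent, and a row fetched with `select("*")` carries NULL columns as keys whose value is `None`. The sketch below is a hedged illustration; `prompt_fields` and the sample row are hypothetical and not part of the script.

```python
from __future__ import annotations


def prompt_fields(q: dict) -> dict[str, str]:
    """Hypothetical helper: build prompt values from a row whose columns may be NULL."""
    if q.get("raw_answer_text"):
        answer_section = f"- Reference answer: {q['raw_answer_text']}"
    elif q.get("correct_option"):
        answer_section = f"- Correct option: {q['correct_option']}"
    elif q.get("correct_answer"):
        answer_section = f"- Correct answer: {q['correct_answer']}"
    else:
        answer_section = ""
    return {
        # `or` covers both a missing key and an explicit None.
        "score": str(q.get("score") or "unknown"),
        "topics": ", ".join(q.get("topics") or []),
        "answer_section": answer_section,
    }


# Made-up row: the key exists, but the column was NULL in the database.
row = {"question_number": "1a", "score": None, "topics": None, "correct_option": "T"}
fields = prompt_fields(row)
print(fields)
print(row.get("topics", []))  # the default is ignored because the key is present
```

Note that `or` also maps a score of `0` to `"unknown"`; an explicit `is None` check is the alternative if zero-mark questions need to be preserved.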
224
backend/split_comp2211_2022_spring_final_part_a.py
Normal file
@@ -0,0 +1,224 @@
"""Split COMP2211 Spring 2022 final part A into subquestions."""

from __future__ import annotations

import json
import re
from dataclasses import dataclass
from pathlib import Path

from app.services.supabase_client import get_supabase


EXAM_KEY = "COMP2211-2022-spring-final-part-a"
TRUE_FALSE_OPTIONS = [{"label": "True", "text": "True"}, {"label": "False", "text": "False"}]
PROBLEM_SEED_PATH = (
    Path(__file__).resolve().parent.parent
    / "pastpaper-scraper"
    / "reviews"
    / "COMP2211"
    / "problem_seed.json"
)


@dataclass(frozen=True)
class ChildSpec:
    question_number: str
    parent_question: str
    top_level_number: str
    path: tuple[str, ...]
    score: float
    question_type: str
    question_format: str | None = None
    analytics_topic: str | None = None
    topic_primary: str | None = None
    topic_tags: tuple[str, ...] | None = None
    skill_tags: tuple[str, ...] | None = None
    page_number: int = 1


def short_answer(
    question_number: str,
    parent_question: str,
    top_level_number: str,
    path: tuple[str, ...],
    score: float,
    *,
    analytics_topic: str | None = None,
    topic_primary: str | None = None,
    topic_tags: tuple[str, ...] | None = None,
    skill_tags: tuple[str, ...] | None = None,
    page_number: int,
) -> ChildSpec:
    return ChildSpec(
        question_number=question_number,
        parent_question=parent_question,
        top_level_number=top_level_number,
        path=path,
        score=score,
        question_type="long_question",
        question_format="short_answer",
        analytics_topic=analytics_topic,
        topic_primary=topic_primary,
        topic_tags=topic_tags,
        skill_tags=skill_tags,
        page_number=page_number,
    )


CHILDREN: list[ChildSpec] = [
    ChildSpec("1a", "1", "1", ("a",), 1, "true_false", "true_false", "KNN and Clustering", "KNN and Clustering", ("KNN and Clustering",), ("concept_check", "algorithm_property"), page_number=2),
    ChildSpec("1b", "1", "1", ("b",), 1, "true_false", "true_false", "Perceptron and MLP", "Perceptron and MLP", ("Perceptron and MLP",), ("concept_check", "architecture_reasoning"), page_number=2),
    ChildSpec("1c", "1", "1", ("c",), 1, "true_false", "true_false", "Perceptron and MLP", "Perceptron and MLP", ("Perceptron and MLP",), ("concept_check", "activation_selection"), page_number=2),
    ChildSpec("1d", "1", "1", ("d",), 1, "true_false", "true_false", "Evaluation and Validation", "Evaluation and Validation", ("Evaluation and Validation",), ("concept_check", "metric_reasoning"), page_number=2),
    ChildSpec("1e", "1", "1", ("e",), 1, "true_false", "true_false", "Vision and CNN", "Vision and CNN", ("Vision and CNN",), ("concept_check", "hardware_reasoning"), page_number=2),
    ChildSpec("1f", "1", "1", ("f",), 1, "true_false", "true_false", "Vision and CNN", "Vision and CNN", ("Vision and CNN",), ("concept_check", "image_processing"), page_number=2),
    ChildSpec("1g", "1", "1", ("g",), 1, "true_false", "true_false", "Vision and CNN", "Vision and CNN", ("Vision and CNN",), ("concept_check", "cnn_architecture"), page_number=2),
    ChildSpec("1h", "1", "1", ("h",), 1, "true_false", "true_false", "Perceptron and MLP", "Perceptron and MLP", ("Perceptron and MLP",), ("concept_check", "regularization"), page_number=2),
    ChildSpec("1i", "1", "1", ("i",), 1, "true_false", "true_false", "Search and Games", "Search and Games", ("Search and Games",), ("concept_check", "game_reasoning"), page_number=2),
    ChildSpec("1j", "1", "1", ("j",), 1, "true_false", "true_false", "Search and Games", "Search and Games", ("Search and Games",), ("concept_check", "pruning_reasoning"), page_number=2),
    ChildSpec("2a", "2", "2", ("a",), 6.5, "long_question", "long_answer", "Probabilistic Models", "Probabilistic Models", ("Probabilistic Models",), ("manual_computation", "probability_reasoning", "classification_decision"), page_number=4),
    ChildSpec("2b", "2", "2", ("b",), 7.5, "long_question", "long_answer", "KNN and Clustering", "KNN and Clustering", ("KNN and Clustering",), ("distance_calculation", "algorithm_tracing", "classification_decision"), page_number=4),
    short_answer("3a", "3", "3", ("a",), 3, analytics_topic="Evaluation and Validation", topic_primary="Evaluation and Validation", topic_tags=("Evaluation and Validation",), skill_tags=("concept_explanation", "metric_reasoning"), page_number=6),
    short_answer("3b", "3", "3", ("b",), 2, analytics_topic="Perceptron and MLP", topic_primary="Perceptron and MLP", topic_tags=("Perceptron and MLP",), skill_tags=("concept_explanation", "activation_selection"), page_number=6),
    short_answer("3c", "3", "3", ("c",), 2, analytics_topic="Perceptron and MLP", topic_primary="Perceptron and MLP", topic_tags=("Perceptron and MLP",), skill_tags=("architecture_reasoning", "output_layer_design"), page_number=6),
    short_answer("3d", "3", "3", ("d",), 3, analytics_topic="Perceptron and MLP", topic_primary="Perceptron and MLP", topic_tags=("Perceptron and MLP",), skill_tags=("concept_explanation", "optimization_reasoning"), page_number=6),
    short_answer("3e_i", "3e", "3", ("e", "i"), 1, analytics_topic="Perceptron and MLP", topic_primary="Perceptron and MLP", topic_tags=("Perceptron and MLP",), skill_tags=("optimization_reasoning",), page_number=6),
    short_answer("3e_ii", "3e", "3", ("e", "ii"), 1, analytics_topic="Perceptron and MLP", topic_primary="Perceptron and MLP", topic_tags=("Perceptron and MLP",), skill_tags=("optimization_reasoning",), page_number=6),
    short_answer("3f", "3", "3", ("f",), 2, analytics_topic="Perceptron and MLP", topic_primary="Perceptron and MLP", topic_tags=("Perceptron and MLP",), skill_tags=("regularization", "concept_explanation"), page_number=6),
    ChildSpec("4a_i", "4a", "4", ("a", "i"), 2, "fill_blank", "fill_blank", page_number=7),
    ChildSpec("4a_ii", "4a", "4", ("a", "ii"), 2, "long_question", "long_answer", page_number=7),
    ChildSpec("4b_i", "4b", "4", ("b", "i"), 3, "fill_blank", "fill_blank", page_number=7),
    ChildSpec("4b_ii", "4b", "4", ("b", "ii"), 4, "fill_blank", "fill_blank", page_number=7),
    ChildSpec("4b_iii", "4b", "4", ("b", "iii"), 4, "long_question", "long_answer", page_number=7),
]


MARKER_RE = re.compile(r"(?m)^\(([a-z]+|[ivx]+)\)\s*")


def split_sections(text: str) -> tuple[str, dict[str, str]]:
    matches = list(MARKER_RE.finditer(text))
    if not matches:
        return text.strip(), {}
    intro = text[: matches[0].start()].strip()
    sections: dict[str, str] = {}
    for idx, match in enumerate(matches):
        marker = match.group(1)
        end = matches[idx + 1].start() if idx + 1 < len(matches) else len(text)
        sections[marker] = text[match.start() : end].strip()
    return intro, sections


def extract_segment(text: str, path: tuple[str, ...]) -> str:
    current = text.strip()
    carried_intro: list[str] = []
    for depth, marker in enumerate(path):
        intro, sections = split_sections(current)
        if depth == 0 and intro:
            carried_intro.append(intro)
        current = sections.get(marker, current)
    return "\n".join(part for part in [*carried_intro, current] if part).strip()


def extract_true_false_answers(answer_text: str) -> dict[str, str]:
    answers: dict[str, str] = {}
    matches = list(re.finditer(r"(?m)^\(([a-j])\)\s*\n?([TF])\b", answer_text))
    for match in matches:
        answers[match.group(1)] = match.group(2)
    return answers


def derive_correct_answer(answer_text: str) -> str | None:
    if not answer_text:
        return None
    tail = answer_text.split("Answer:", 1)[1] if "Answer:" in answer_text else answer_text
    lines = [line.strip() for line in tail.splitlines() if line.strip()]
    if not lines:
        return None
    first = lines[0]
    if first.lower().startswith("marking scheme"):
        return None
    if len(first) <= 240:
        return first
    return None


def load_seed_rows() -> dict[str, dict]:
    # Read as UTF-8 explicitly so the result does not depend on the platform default encoding.
    data = json.loads(PROBLEM_SEED_PATH.read_text(encoding="utf-8"))
    return {
        row["question_number"]: row
        for row in data
        if row["source_exam_key"] == EXAM_KEY
    }


def main() -> None:
    sb = get_supabase()
    paper = sb.table("papers").select("id").eq("source_exam_key", EXAM_KEY).execute().data[0]
    paper_id = paper["id"]

    current_rows = (
        sb.table("paper_questions")
        .select("*")
        .eq("paper_id", paper_id)
        .order("display_order")
        .execute()
        .data
    )
    existing_by_number = {row["question_number"]: row for row in current_rows}
    parent_rows = load_seed_rows()
    tf_answers = extract_true_false_answers(parent_rows["1"]["raw_answer_text"] or "")

    inserts = []
    for display_order, child in enumerate(CHILDREN, start=1):
        parent = parent_rows[child.top_level_number]
        existing = existing_by_number.get(child.question_number, {})
        question_text = extract_segment(parent["question_text"] or "", child.path)
        raw_answer_text = extract_segment(parent["raw_answer_text"] or "", child.path)

        correct_option = None
        correct_answer = None
        options = None
        if child.question_type == "true_false":
            correct_option = tf_answers.get(child.path[0])
            options = TRUE_FALSE_OPTIONS
        elif child.question_type == "fill_blank":
            correct_answer = derive_correct_answer(raw_answer_text)

        inserts.append(
            {
                "paper_id": paper_id,
                "question_number": child.question_number,
                "parent_question": child.parent_question,
                "display_order": display_order,
                "question_type": child.question_type,
                "question_format": child.question_format,
                "question_text": question_text,
                "score": child.score,
                "page_number": child.page_number,
                "page_y_ratio": existing.get("page_y_ratio"),
                "options": options,
                "correct_option": correct_option,
                "correct_answer": correct_answer,
                "raw_answer_text": raw_answer_text,
                "topics": existing.get("topics") or (list(child.topic_tags) if child.topic_tags else parent.get("topics")),
                "topic_primary": existing.get("topic_primary") or child.topic_primary or parent.get("topic_primary"),
                "analytics_topic": existing.get("analytics_topic") or child.analytics_topic or parent.get("analytics_topic"),
                "topic_tags": existing.get("topic_tags") or (list(child.topic_tags) if child.topic_tags else parent.get("topic_tags")),
                "skill_tags": existing.get("skill_tags") or (list(child.skill_tags) if child.skill_tags else parent.get("skill_tags")),
                "difficulty": existing.get("difficulty") or parent.get("difficulty"),
                "knowledge_reminder": existing.get("knowledge_reminder", ""),
                "ai_hint": existing.get("ai_hint", ""),
                "solution": existing.get("solution", ""),
            }
        )

    sb.table("paper_questions").delete().eq("paper_id", paper_id).execute()
    sb.table("paper_questions").insert(inserts).execute()
    sb.table("papers").update({"question_count": len(inserts), "status": "processing"}).eq("id", paper_id).execute()
    print(f"Inserted {len(inserts)} rows for {EXAM_KEY}.")


if __name__ == "__main__":
    main()
|
||||
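The marker-based splitting used by the split scripts can be exercised on a small string. The sketch below copies `MARKER_RE`, `split_sections`, and `extract_segment` as they appear in the script; the `SAMPLE` and `NESTED` strings are made up for illustration and say nothing about the real seed data.

```python
from __future__ import annotations

import re

# Same pattern as the script; "[a-z]+" already covers roman numerals such as "ii".
MARKER_RE = re.compile(r"(?m)^\(([a-z]+|[ivx]+)\)\s*")


def split_sections(text: str) -> tuple[str, dict[str, str]]:
    matches = list(MARKER_RE.finditer(text))
    if not matches:
        return text.strip(), {}
    intro = text[: matches[0].start()].strip()
    sections: dict[str, str] = {}
    for idx, match in enumerate(matches):
        end = matches[idx + 1].start() if idx + 1 < len(matches) else len(text)
        sections[match.group(1)] = text[match.start() : end].strip()
    return intro, sections


def extract_segment(text: str, path: tuple[str, ...]) -> str:
    current = text.strip()
    carried_intro: list[str] = []
    for depth, marker in enumerate(path):
        intro, sections = split_sections(current)
        if depth == 0 and intro:
            carried_intro.append(intro)
        current = sections.get(marker, current)
    return "\n".join(part for part in [*carried_intro, current] if part).strip()


# Made-up question text with one level of parts: the stem is carried into the part.
SAMPLE = "Consider a toy dataset.\n(a) Compute the prior.\n(b) Classify the point."
flat = extract_segment(SAMPLE, ("b",))
print(flat)

# Made-up nested text: markers are matched flat, so "(i)" closes section "a".
NESTED = "(a) Shared setup.\n(i) First part.\n(ii) Second part."
nested = extract_segment(NESTED, ("a", "i"))
print(nested)
```

On input shaped like `NESTED`, the two-level path `("a", "i")` resolves to the text of part (a) only, because every marker ends the previous section regardless of depth. Whether this affects rows such as `3e_i` or `4b_ii` depends on how the seed text is laid out, so it is worth spot-checking a few inserted rows after a run.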
232
backend/split_comp2211_2022_spring_final_part_b.py
Normal file
@@ -0,0 +1,232 @@
"""Split COMP2211 Spring 2022 final part B into subquestions."""

from __future__ import annotations

import json
import re
from dataclasses import dataclass
from pathlib import Path

from app.services.supabase_client import get_supabase


EXAM_KEY = "COMP2211-2022-spring-final-part-b"
PROBLEM_SEED_PATH = (
    Path(__file__).resolve().parent.parent
    / "pastpaper-scraper"
    / "reviews"
    / "COMP2211"
    / "problem_seed.json"
)


@dataclass(frozen=True)
class ChildSpec:
    question_number: str
    parent_question: str
    top_level_number: str
    path: tuple[str, ...]
    score: float
    question_type: str
    question_format: str | None = None
    analytics_topic: str | None = None
    topic_primary: str | None = None
    topic_tags: tuple[str, ...] | None = None
    skill_tags: tuple[str, ...] | None = None
    options: tuple[tuple[str, str], ...] | None = None
    correct_option: str | None = None
    correct_answer: str | None = None
    page_number: int = 1


def short_answer(
    question_number: str,
    parent_question: str,
    top_level_number: str,
    path: tuple[str, ...],
    score: float,
    *,
    analytics_topic: str | None = None,
    topic_primary: str | None = None,
    topic_tags: tuple[str, ...] | None = None,
    skill_tags: tuple[str, ...] | None = None,
    correct_answer: str | None = None,
    page_number: int,
) -> ChildSpec:
    return ChildSpec(
        question_number=question_number,
        parent_question=parent_question,
        top_level_number=top_level_number,
        path=path,
        score=score,
        question_type="long_question",
        question_format="short_answer",
        analytics_topic=analytics_topic,
        topic_primary=topic_primary,
        topic_tags=topic_tags,
        skill_tags=skill_tags,
        correct_answer=correct_answer,
        page_number=page_number,
    )


def mc(
    question_number: str,
    parent_question: str,
    top_level_number: str,
    path: tuple[str, ...],
    score: float,
    *,
    options: tuple[tuple[str, str], ...],
    correct_option: str,
    analytics_topic: str,
    skill_tags: tuple[str, ...],
    page_number: int,
) -> ChildSpec:
    return ChildSpec(
        question_number=question_number,
        parent_question=parent_question,
        top_level_number=top_level_number,
        path=path,
        score=score,
        question_type="mc",
        question_format="mc",
        analytics_topic=analytics_topic,
        topic_primary=analytics_topic,
        topic_tags=(analytics_topic,),
        skill_tags=skill_tags,
        options=options,
        correct_option=correct_option,
        page_number=page_number,
    )


ETHICS_ABCD = (
    ("A", "A"),
    ("B", "B"),
    ("C", "C"),
    ("D", "D"),
)


CHILDREN: list[ChildSpec] = [
    ChildSpec("1a", "1", "1", ("a",), 1.5, "long_question", "long_answer", page_number=2),
    short_answer("1b", "1", "1", ("b",), 1.5, analytics_topic="Vision and CNN", topic_primary="Vision and CNN", topic_tags=("Vision and CNN",), skill_tags=("concept_explanation", "data_augmentation"), page_number=2),
    ChildSpec("1c", "1", "1", ("c",), 4.5, "long_question", "long_answer", page_number=2),
    short_answer("1d", "1", "1", ("d",), 2, analytics_topic="Vision and CNN", topic_primary="Vision and CNN", topic_tags=("Vision and CNN",), skill_tags=("architecture_reasoning", "parameter_reduction"), page_number=3),
    ChildSpec("1e", "1", "1", ("e",), 2.5, "fill_blank", "fill_blank", correct_answer="1558656", page_number=3),
    ChildSpec("1f_i", "1f", "1", ("f", "i"), 2.5, "fill_blank", "fill_blank", correct_answer="2071656", page_number=3),
    ChildSpec("1f_ii", "1f", "1", ("f", "ii"), 2.5, "fill_blank", "fill_blank", correct_answer="150529000", page_number=4),
    short_answer("1g", "1", "1", ("g",), 2, analytics_topic="Vision and CNN", topic_primary="Vision and CNN", topic_tags=("Vision and CNN",), skill_tags=("architecture_reasoning", "comparison"), page_number=4),
    ChildSpec("2a", "2", "2", ("a",), 9, "long_question", "coding", page_number=5),
    short_answer("2b", "2", "2", ("b",), 4, analytics_topic="Vision and CNN", topic_primary="Vision and CNN", topic_tags=("Vision and CNN",), skill_tags=("architecture_reasoning", "regression_reasoning"), page_number=6),
    ChildSpec("3a", "3", "3", ("a",), 3.5, "long_question", "long_answer", page_number=9),
    short_answer("3b", "3", "3", ("b",), 0.5, analytics_topic="Search and Games", topic_primary="Search and Games", topic_tags=("Search and Games",), skill_tags=("game_reasoning",), correct_answer="E-a", page_number=9),
    short_answer("3c", "3", "3", ("c",), 1.5, analytics_topic="Search and Games", topic_primary="Search and Games", topic_tags=("Search and Games",), skill_tags=("concept_explanation", "game_reasoning"), page_number=9),
    short_answer("3d", "3", "3", ("d",), 2.5, analytics_topic="Search and Games", topic_primary="Search and Games", topic_tags=("Search and Games",), skill_tags=("pruning_reasoning",), correct_answer="E-j and E-f", page_number=9),
    mc("4a", "4", "4", ("a",), 1, options=ETHICS_ABCD, correct_option="C", analytics_topic="Ethics of AI", skill_tags=("concept_check", "ethical_reasoning"), page_number=10),
    mc("4b", "4", "4", ("b",), 1, options=ETHICS_ABCD, correct_option="A", analytics_topic="Ethics of AI", skill_tags=("concept_check", "bias_reasoning"), page_number=10),
    mc("4c", "4", "4", ("c",), 1, options=ETHICS_ABCD, correct_option="C", analytics_topic="Ethics of AI", skill_tags=("concept_check", "ethical_reasoning"), page_number=10),
    mc("4d", "4", "4", ("d",), 1, options=ETHICS_ABCD, correct_option="B", analytics_topic="Ethics of AI", skill_tags=("concept_check", "bias_reasoning"), page_number=10),
    short_answer("4e", "4", "4", ("e",), 3, analytics_topic="Ethics of AI", topic_primary="Ethics of AI", topic_tags=("Ethics of AI",), skill_tags=("argumentation", "concept_explanation"), page_number=11),
]


# Matches "(a)", "(b)", "(ii)", ... at the start of a line.
# The [ivx]+ branch is already covered by [a-z]+ and is kept only for readability.
MARKER_RE = re.compile(r"(?m)^\(([a-z]+|[ivx]+)\)\s*")


def split_sections(text: str) -> tuple[str, dict[str, str]]:
    """Split text at line-start markers; return (intro, {marker: section text})."""
    matches = list(MARKER_RE.finditer(text))
    if not matches:
        return text.strip(), {}
    intro = text[: matches[0].start()].strip()
    sections: dict[str, str] = {}
    for idx, match in enumerate(matches):
        marker = match.group(1)
        # Each section runs up to the next marker (or the end of the text).
        end = matches[idx + 1].start() if idx + 1 < len(matches) else len(text)
        sections[marker] = text[match.start() : end].strip()
    return intro, sections


def extract_segment(text: str, path: tuple[str, ...]) -> str:
    """Follow path one marker per level; prefix the result with the top-level intro.

    If a marker is not found, the current text is kept unchanged for that level.
    """
    current = text.strip()
    carried_intro: list[str] = []
    for depth, marker in enumerate(path):
        intro, sections = split_sections(current)
        if depth == 0 and intro:
            carried_intro.append(intro)
        current = sections.get(marker, current)
    return "\n".join(part for part in [*carried_intro, current] if part).strip()


def load_seed_rows() -> dict[str, dict]:
    data = json.loads(PROBLEM_SEED_PATH.read_text())
    return {
        row["question_number"]: row
        for row in data
        if row["source_exam_key"] == EXAM_KEY
    }


def main() -> None:
    sb = get_supabase()
    paper = sb.table("papers").select("id").eq("source_exam_key", EXAM_KEY).execute().data[0]
    paper_id = paper["id"]

    current_rows = (
        sb.table("paper_questions")
        .select("*")
        .eq("paper_id", paper_id)
        .order("display_order")
        .execute()
        .data
    )
    existing_by_number = {row["question_number"]: row for row in current_rows}
    parent_rows = load_seed_rows()

    inserts = []
    for display_order, child in enumerate(CHILDREN, start=1):
        parent = parent_rows[child.top_level_number]
        existing = existing_by_number.get(child.question_number, {})
        question_text = extract_segment(parent["question_text"] or "", child.path)
        raw_answer_text = extract_segment(parent["raw_answer_text"] or "", child.path)
        options = None
        if child.options:
            options = [{"label": label, "text": text} for label, text in child.options]

        inserts.append(
            {
                "paper_id": paper_id,
                "question_number": child.question_number,
                "parent_question": child.parent_question,
                "display_order": display_order,
                "question_type": child.question_type,
                "question_format": child.question_format,
                "question_text": question_text,
                "score": child.score,
                "page_number": child.page_number,
                "page_y_ratio": existing.get("page_y_ratio"),
                "options": options,
                "correct_option": child.correct_option,
                "correct_answer": child.correct_answer,
                "raw_answer_text": raw_answer_text,
                "topics": existing.get("topics") or (list(child.topic_tags) if child.topic_tags else parent.get("topics")),
                "topic_primary": existing.get("topic_primary") or child.topic_primary or parent.get("topic_primary"),
                "analytics_topic": existing.get("analytics_topic") or child.analytics_topic or parent.get("analytics_topic"),
                "topic_tags": existing.get("topic_tags") or (list(child.topic_tags) if child.topic_tags else parent.get("topic_tags")),
                "skill_tags": existing.get("skill_tags") or (list(child.skill_tags) if child.skill_tags else parent.get("skill_tags")),
                "difficulty": existing.get("difficulty") or parent.get("difficulty"),
                "knowledge_reminder": existing.get("knowledge_reminder", ""),
                "ai_hint": existing.get("ai_hint", ""),
                "solution": existing.get("solution", ""),
            }
        )

    sb.table("paper_questions").delete().eq("paper_id", paper_id).execute()
    sb.table("paper_questions").insert(inserts).execute()
    sb.table("papers").update({"question_count": len(inserts), "status": "processing"}).eq("id", paper_id).execute()
    print(f"Inserted {len(inserts)} rows for {EXAM_KEY}.")


if __name__ == "__main__":
    main()
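
Note on the marker splitting used by this script: the sketch below re-declares split_sections and extract_segment exactly as they appear above and traces them on a made-up question text. SAMPLE is hypothetical and not taken from problem_seed.json, so the trace only illustrates the mechanics. It also shows one edge case: when the requested marker is absent, the whole text is returned with the intro repeated in front of it.

```python
import re

# Same pattern as MARKER_RE in the script above.
MARKER_RE = re.compile(r"(?m)^\(([a-z]+|[ivx]+)\)\s*")


def split_sections(text):
    matches = list(MARKER_RE.finditer(text))
    if not matches:
        return text.strip(), {}
    intro = text[: matches[0].start()].strip()
    sections = {}
    for idx, match in enumerate(matches):
        end = matches[idx + 1].start() if idx + 1 < len(matches) else len(text)
        sections[match.group(1)] = text[match.start() : end].strip()
    return intro, sections


def extract_segment(text, path):
    current = text.strip()
    carried_intro = []
    for depth, marker in enumerate(path):
        intro, sections = split_sections(current)
        if depth == 0 and intro:
            carried_intro.append(intro)
        current = sections.get(marker, current)
    return "\n".join(part for part in [*carried_intro, current] if part).strip()


# Hypothetical question text, invented for illustration only.
SAMPLE = (
    "Consider a 3x3 convolution filter.\n"
    "(a) State the output size.\n"
    "(b) Count the parameters.\n"
)

intro, sections = split_sections(SAMPLE)
single_level = extract_segment(SAMPLE, ("b",))
# "(z)" does not exist, so the function falls back to the full text.
missing_marker = extract_segment(SAMPLE, ("z",))

print(single_level)
```

If the duplicated intro in the fallback case is unwanted, one option would be to return the stripped text as soon as a marker lookup fails; whether that matters depends on how clean the seed data is.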
233
backend/split_comp2211_2022_spring_midterm.py
Normal file
@@ -0,0 +1,233 @@
"""Split COMP2211 Spring 2022 midterm top-level problems into subquestions."""

from __future__ import annotations

import json
import re
from dataclasses import dataclass
from pathlib import Path

from app.services.supabase_client import get_supabase


EXAM_KEY = "COMP2211-2022-spring-midterm"
TRUE_FALSE_OPTIONS = [{"label": "True", "text": "True"}, {"label": "False", "text": "False"}]


@dataclass(frozen=True)
class ChildSpec:
    question_number: str
    parent_question: str
    top_level_number: str
    path: tuple[str, ...]
    score: float
    question_type: str
    question_format: str | None = None
    page_number: int = 1


def short_answer(
    question_number: str,
    parent_question: str,
    top_level_number: str,
    path: tuple[str, ...],
    score: float,
    *,
    page_number: int,
) -> ChildSpec:
    return ChildSpec(
        question_number=question_number,
        parent_question=parent_question,
        top_level_number=top_level_number,
        path=path,
        score=score,
        question_type="long_question",
        question_format="short_answer",
        page_number=page_number,
    )


CHILDREN: list[ChildSpec] = [
    *[
        ChildSpec(f"1{letter}", "1", "1", (letter,), 1.5, "true_false", page_number=2)
        for letter in "abcdefghij"
    ],
    ChildSpec("2a_i", "2a", "2", ("a", "i"), 1, "fill_blank", page_number=4),
    ChildSpec("2a_ii", "2a", "2", ("a", "ii"), 1, "fill_blank", page_number=4),
    ChildSpec("2a_iii", "2a", "2", ("a", "iii"), 1, "fill_blank", page_number=4),
    ChildSpec("2a_iv", "2a", "2", ("a", "iv"), 1, "fill_blank", page_number=4),
    ChildSpec("2a_v", "2a", "2", ("a", "v"), 1, "fill_blank", page_number=4),
    ChildSpec("2b", "2", "2", ("b",), 2, "fill_blank", page_number=4),
    ChildSpec("2c", "2", "2", ("c",), 9, "long_question", "coding", page_number=5),
    ChildSpec("3a", "3", "3", ("a",), 2, "fill_blank", page_number=7),
    ChildSpec("3b_i", "3b", "3", ("b", "i"), 1.75, "fill_blank", page_number=7),
    ChildSpec("3b_ii", "3b", "3", ("b", "ii"), 1.75, "fill_blank", page_number=7),
    ChildSpec("3b_iii", "3b", "3", ("b", "iii"), 1.75, "fill_blank", page_number=7),
    ChildSpec("3b_iv", "3b", "3", ("b", "iv"), 1.75, "fill_blank", page_number=7),
    short_answer("3c", "3", "3", ("c",), 2, page_number=8),
    ChildSpec("4a", "4", "4", ("a",), 3, "long_question", "long_answer", page_number=9),
    short_answer("4b_i", "4b", "4", ("b", "i"), 3, page_number=9),
    short_answer("4b_ii", "4b", "4", ("b", "ii"), 3, page_number=9),
    ChildSpec("4c_i", "4c", "4", ("c", "i"), 2, "long_question", "long_answer", page_number=10),
    ChildSpec("4c_ii", "4c", "4", ("c", "ii"), 3, "long_question", "long_answer", page_number=10),
    ChildSpec("5a", "5", "5", ("a",), 4.5, "long_question", "long_answer", page_number=11),
    ChildSpec("5b", "5", "5", ("b",), 1.5, "fill_blank", page_number=11),
    ChildSpec("5c", "5", "5", ("c",), 4.5, "long_question", "long_answer", page_number=11),
    short_answer("5d", "5", "5", ("d",), 1.5, page_number=11),
    ChildSpec("6a", "6", "6", ("a",), 8, "long_question", "long_answer", page_number=12),
    short_answer("6b", "6", "6", ("b",), 2, page_number=13),
    ChildSpec("6c", "6", "6", ("c",), 10, "long_question", "coding", page_number=13),
    short_answer("7a", "7", "7", ("a",), 4, page_number=14),
    short_answer("7b", "7", "7", ("b",), 6, page_number=14),
    ChildSpec("7c", "7", "7", ("c",), 2, "fill_blank", page_number=15),
]


MARKER_RE = re.compile(r"(?m)^\(([a-z]+)\)\s*")
PROBLEM_SEED_PATH = (
    Path(__file__).resolve().parent.parent
    / "pastpaper-scraper"
    / "reviews"
    / "COMP2211"
    / "problem_seed.json"
)


def split_sections(text: str) -> tuple[str, dict[str, str]]:
    matches = list(MARKER_RE.finditer(text))
    if not matches:
        return text.strip(), {}
    intro = text[: matches[0].start()].strip()
    sections: dict[str, str] = {}
    for idx, match in enumerate(matches):
        marker = match.group(1)
        end = matches[idx + 1].start() if idx + 1 < len(matches) else len(text)
        sections[marker] = text[match.start() : end].strip()
    return intro, sections


def extract_segment(text: str, path: tuple[str, ...]) -> str:
    intro, sections = split_sections(text)
    if not path:
        return text.strip()
    first = sections.get(path[0], "")
    if not first:
        return text.strip()
    if len(path) == 1:
        return "\n".join(part for part in [intro, first] if part).strip()
    child_intro, child_sections = split_sections(first)
    second = child_sections.get(path[1], "")
    return "\n".join(part for part in [intro, child_intro, second] if part).strip()


def extract_true_false_answers(answer_text: str) -> dict[str, str]:
    """Map each "(a)".."(j)" marker to the "T" or "F" letter that follows it."""
    answers: dict[str, str] = {}
    matches = list(re.finditer(r"(?m)^\(([a-j])\)\s*\n?([TF])\b", answer_text))
    for match in matches:
        answers[match.group(1)] = match.group(2)
    return answers


def derive_correct_answer(answer_text: str) -> str | None:
    """Return the first non-empty line after "Answer:" (or of the whole text).

    Returns None when there is no such line, when it starts a marking scheme,
    or when it is longer than 240 characters.
    """
    if not answer_text:
        return None
    if "Answer:" in answer_text:
        tail = answer_text.split("Answer:", 1)[1]
    else:
        tail = answer_text
    lines = [line.strip() for line in tail.splitlines() if line.strip()]
    if not lines:
        return None
    first = lines[0]
    if first.lower().startswith("marking scheme"):
        return None
    if len(first) <= 240:
        return first
    return None


def load_seed_rows() -> dict[str, dict]:
    data = json.loads(PROBLEM_SEED_PATH.read_text())
    return {
        row["question_number"]: row
        for row in data
        if row["source_exam_key"] == EXAM_KEY
    }


def main() -> None:
    sb = get_supabase()
    paper = (
        sb.table("papers")
        .select("id")
        .eq("source_exam_key", EXAM_KEY)
        .execute()
        .data[0]
    )
    paper_id = paper["id"]

    current_rows = (
        sb.table("paper_questions")
        .select("*")
        .eq("paper_id", paper_id)
        .order("display_order")
        .execute()
        .data
    )
    existing_by_number = {row["question_number"]: row for row in current_rows}
    parent_rows = load_seed_rows()
    tf_answers = extract_true_false_answers(parent_rows["1"]["raw_answer_text"] or "")

    inserts = []
    for display_order, child in enumerate(CHILDREN, start=1):
        parent = parent_rows[child.top_level_number]
        existing = existing_by_number.get(child.question_number, {})
        question_text = extract_segment(parent["question_text"] or "", child.path)
        raw_answer_text = extract_segment(parent["raw_answer_text"] or "", child.path)

        correct_option = None
        correct_answer = None
        options = None
        if child.question_type == "true_false":
            marker = child.path[0]
            correct_option = tf_answers.get(marker)
            options = TRUE_FALSE_OPTIONS
        elif child.question_type == "fill_blank":
            correct_answer = derive_correct_answer(raw_answer_text)

        inserts.append(
            {
                "paper_id": paper_id,
                "question_number": child.question_number,
                "parent_question": child.parent_question,
                "display_order": display_order,
                "question_type": child.question_type,
                "question_format": child.question_format,
                "question_text": question_text,
                "score": child.score,
                "page_number": child.page_number,
                "page_y_ratio": existing.get("page_y_ratio"),
                "options": options,
                "correct_option": correct_option,
                "correct_answer": correct_answer,
                "raw_answer_text": raw_answer_text,
                "topics": existing.get("topics") or parent.get("topics"),
                "topic_primary": existing.get("topic_primary") or parent.get("topic_primary"),
                "analytics_topic": existing.get("analytics_topic") or parent.get("analytics_topic"),
                "topic_tags": existing.get("topic_tags") or parent.get("topic_tags"),
                "skill_tags": existing.get("skill_tags") or parent.get("skill_tags"),
                "difficulty": existing.get("difficulty") or parent.get("difficulty"),
                "knowledge_reminder": existing.get("knowledge_reminder", ""),
                "ai_hint": existing.get("ai_hint", ""),
                "solution": existing.get("solution", ""),
            }
        )

    sb.table("paper_questions").delete().eq("paper_id", paper_id).execute()
    sb.table("paper_questions").insert(inserts).execute()
    sb.table("papers").update({"question_count": len(inserts), "status": "processing"}).eq("id", paper_id).execute()
    print(f"Inserted {len(inserts)} rows for {EXAM_KEY}.")


if __name__ == "__main__":
    main()
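
Note on the true/false handling in this script: extract_true_false_answers yields the letters "T" and "F", while TRUE_FALSE_OPTIONS labels its options "True" and "False". Whether downstream code expects letters or labels in correct_option is not visible in this diff. The sketch below assumes labels are wanted and shows one possible normalisation; LABEL_BY_LETTER and to_option_label are hypothetical names that do not exist in the repository, and SAMPLE_KEY is an invented answer key.

```python
import re

TRUE_FALSE_OPTIONS = [{"label": "True", "text": "True"}, {"label": "False", "text": "False"}]

# Hypothetical mapping (not in the script): expand key letters to option labels.
LABEL_BY_LETTER = {"T": "True", "F": "False"}


def extract_true_false_answers(answer_text):
    # Same regular expression as in the script above.
    answers = {}
    for match in re.finditer(r"(?m)^\(([a-j])\)\s*\n?([TF])\b", answer_text):
        answers[match.group(1)] = match.group(2)
    return answers


def to_option_label(letter):
    # Returns None for a missing or unrecognised letter.
    return LABEL_BY_LETTER.get(letter) if letter else None


# Invented answer key: "(c) True" is skipped because \b rejects "T" followed by "r".
SAMPLE_KEY = "(a) T\n(b)\nF\n(c) True\n"

raw = extract_true_false_answers(SAMPLE_KEY)
labels = {marker: to_option_label(letter) for marker, letter in raw.items()}
print(raw, labels)
```

The third sample line also shows that a key which spells out "True" or "False" in full is not picked up by the pattern as written.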
268
backend/split_comp2211_2023_spring_midterm.py
Normal file
@@ -0,0 +1,268 @@
"""Split COMP2211 Spring 2023 midterm into subquestions."""

from __future__ import annotations

import json
import re
from dataclasses import dataclass
from pathlib import Path

from app.services.supabase_client import get_supabase


EXAM_KEY = "COMP2211-2023-spring-midterm"
PROBLEM_SEED_PATH = (
    Path(__file__).resolve().parent.parent
    / "pastpaper-scraper"
    / "reviews"
    / "COMP2211"
    / "problem_seed.json"
)
TRUE_FALSE_OPTIONS = [{"label": "True", "text": "True"}, {"label": "False", "text": "False"}]


@dataclass(frozen=True)
class ChildSpec:
    question_number: str
    parent_question: str
    top_level_number: str
    path: tuple[str, ...]
    score: float
    question_type: str
    question_format: str | None = None
    analytics_topic: str | None = None
    topic_primary: str | None = None
    topic_tags: tuple[str, ...] | None = None
    skill_tags: tuple[str, ...] | None = None
    options: tuple[tuple[str, str], ...] | None = None
    correct_option: str | None = None
    correct_answer: str | None = None
    page_number: int = 1


def short_answer(
    question_number: str,
    parent_question: str,
    top_level_number: str,
    path: tuple[str, ...],
    score: float,
    *,
    analytics_topic: str | None = None,
    topic_primary: str | None = None,
    topic_tags: tuple[str, ...] | None = None,
    skill_tags: tuple[str, ...] | None = None,
    correct_answer: str | None = None,
    page_number: int,
) -> ChildSpec:
    return ChildSpec(
        question_number=question_number,
        parent_question=parent_question,
        top_level_number=top_level_number,
        path=path,
        score=score,
        question_type="long_question",
        question_format="short_answer",
        analytics_topic=analytics_topic,
        topic_primary=topic_primary,
        topic_tags=topic_tags,
        skill_tags=skill_tags,
        correct_answer=correct_answer,
        page_number=page_number,
    )


def mc(
    question_number: str,
    parent_question: str,
    top_level_number: str,
    path: tuple[str, ...],
    score: float,
    *,
    options: tuple[tuple[str, str], ...],
    correct_option: str,
    analytics_topic: str,
    skill_tags: tuple[str, ...],
    page_number: int,
) -> ChildSpec:
    return ChildSpec(
        question_number=question_number,
        parent_question=parent_question,
        top_level_number=top_level_number,
        path=path,
        score=score,
        question_type="mc",
        question_format="mc",
        analytics_topic=analytics_topic,
        topic_primary=analytics_topic,
        topic_tags=(analytics_topic,),
        skill_tags=skill_tags,
        options=options,
        correct_option=correct_option,
        page_number=page_number,
    )


ABCDE = (("A", "A"), ("B", "B"), ("C", "C"), ("D", "D"), ("E", "E"))


CHILDREN: list[ChildSpec] = [
    ChildSpec("1a", "1", "1", ("a",), 1, "true_false", "true_false", "Probabilistic Models", "Probabilistic Models", ("Probabilistic Models",), ("concept_check", "classification_decision"), page_number=3),
    ChildSpec("1b", "1", "1", ("b",), 1, "true_false", "true_false", "Probabilistic Models", "Probabilistic Models", ("Probabilistic Models",), ("concept_check", "classification_decision"), page_number=3),
    ChildSpec("1c", "1", "1", ("c",), 1, "true_false", "true_false", "KNN and Clustering", "KNN and Clustering", ("KNN and Clustering",), ("concept_check", "algorithm_property"), page_number=3),
    ChildSpec("1d", "1", "1", ("d",), 1, "true_false", "true_false", "KNN and Clustering", "KNN and Clustering", ("KNN and Clustering",), ("concept_check", "distance_reasoning"), page_number=3),
    ChildSpec("1e", "1", "1", ("e",), 1, "true_false", "true_false", "Evaluation and Validation", "Evaluation and Validation", ("Evaluation and Validation",), ("concept_check", "validation_reasoning"), page_number=3),
    ChildSpec("1f", "1", "1", ("f",), 1, "true_false", "true_false", "KNN and Clustering", "KNN and Clustering", ("KNN and Clustering",), ("concept_check", "algorithm_property"), page_number=3),
    ChildSpec("1g", "1", "1", ("g",), 1, "true_false", "true_false", "KNN and Clustering", "KNN and Clustering", ("KNN and Clustering",), ("concept_check", "robustness_reasoning"), page_number=3),
    ChildSpec("1h", "1", "1", ("h",), 1, "true_false", "true_false", "Perceptron and MLP", "Perceptron and MLP", ("Perceptron and MLP",), ("concept_check", "decision_boundary"), page_number=3),
    ChildSpec("1i", "1", "1", ("i",), 1, "true_false", "true_false", "Perceptron and MLP", "Perceptron and MLP", ("Perceptron and MLP",), ("concept_check", "optimization_reasoning"), page_number=3),
    ChildSpec("1j", "1", "1", ("j",), 1, "true_false", "true_false", "Perceptron and MLP", "Perceptron and MLP", ("Perceptron and MLP",), ("concept_check", "expressiveness_reasoning"), page_number=3),
    short_answer("2a_i", "2a", "2", ("a", "i"), 1, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("code_tracing",), page_number=4),
    short_answer("2a_ii", "2a", "2", ("a", "ii"), 1, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("code_tracing",), page_number=4),
    short_answer("2a_iii", "2a", "2", ("a", "iii"), 1, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("code_tracing",), page_number=4),
    short_answer("2a_iv", "2a", "2", ("a", "iv"), 1, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("code_tracing",), page_number=4),
    short_answer("2a_v", "2a", "2", ("a", "v"), 1, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("indexing", "code_tracing"), page_number=4),
    short_answer("2a_vi", "2a", "2", ("a", "vi"), 1, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("indexing", "error_reasoning"), page_number=5),
    short_answer("2a_vii", "2a", "2", ("a", "vii"), 1, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("masking", "code_tracing"), page_number=5),
    short_answer("2a_viii", "2a", "2", ("a", "viii"), 1, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("aggregation", "code_tracing"), page_number=5),
    short_answer("2a_ix", "2a", "2", ("a", "ix"), 1, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("transpose", "code_tracing"), page_number=5),
    short_answer("2b_i", "2b", "2", ("b", "i"), 2, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("broadcasting", "code_tracing"), page_number=6),
    short_answer("2b_ii", "2b", "2", ("b", "ii"), 2, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("broadcasting", "error_reasoning"), page_number=6),
    short_answer("2b_iii", "2b", "2", ("b", "iii"), 2, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("broadcasting", "code_tracing"), page_number=6),
    ChildSpec("2c", "2", "2", ("c",), 6, "long_question", "coding", "Python Fundamentals", "Python Fundamentals", ("Python Fundamentals",), ("implementation", "vectorization", "geometry_reasoning"), page_number=7),
    short_answer("3", "3", "3", (), 8, analytics_topic="Probabilistic Models", topic_primary="Probabilistic Models", topic_tags=("Probabilistic Models",), skill_tags=("concept_explanation", "missing_data_reasoning"), page_number=9),
    ChildSpec("4a", "4", "4", ("a",), 8, "long_question", "long_answer", "KNN and Clustering", "KNN and Clustering", ("KNN and Clustering",), ("distance_calculation", "classification_decision"), page_number=10),
    short_answer("4b", "4", "4", ("b",), 6, analytics_topic="KNN and Clustering", topic_primary="KNN and Clustering", topic_tags=("KNN and Clustering",), skill_tags=("distance_reasoning", "comparison"), page_number=11),
    ChildSpec("5a", "5", "5", ("a",), 7, "long_question", "long_answer", "KNN and Clustering", "KNN and Clustering", ("KNN and Clustering",), ("distance_calculation", "algorithm_tracing"), page_number=12),
    ChildSpec("5b", "5", "5", ("b",), 7, "long_question", "long_answer", "KNN and Clustering", "KNN and Clustering", ("KNN and Clustering",), ("centroid_update", "algorithm_tracing"), page_number=12),
    short_answer("5c", "5", "5", ("c",), 5, analytics_topic="KNN and Clustering", topic_primary="KNN and Clustering", topic_tags=("KNN and Clustering",), skill_tags=("concept_explanation", "model_selection"), page_number=14),
    short_answer("6a", "6", "6", ("a",), 2, analytics_topic="Perceptron and MLP", topic_primary="Perceptron and MLP", topic_tags=("Perceptron and MLP",), skill_tags=("convergence_reasoning",), page_number=15),
    mc("6b", "6", "6", ("b",), 2, options=ABCDE, correct_option="D", analytics_topic="Perceptron and MLP", skill_tags=("generalization_reasoning",), page_number=15),
    short_answer("6c", "6", "6", ("c",), 2, analytics_topic="Perceptron and MLP", topic_primary="Perceptron and MLP", topic_tags=("Perceptron and MLP",), skill_tags=("activation_reasoning",), page_number=16),
    ChildSpec("6d", "6", "6", ("d",), 6, "long_question", "coding", "Perceptron and MLP", "Perceptron and MLP", ("Perceptron and MLP",), ("debugging", "implementation", "weight_update"), page_number=16),
    short_answer("7a", "7", "7", ("a",), 4, analytics_topic="Perceptron and MLP", topic_primary="Perceptron and MLP", topic_tags=("Perceptron and MLP",), skill_tags=("decision_boundary", "linearity_reasoning"), page_number=18),
    short_answer("7b", "7", "7", ("b",), 2, analytics_topic="Perceptron and MLP", topic_primary="Perceptron and MLP", topic_tags=("Perceptron and MLP",), skill_tags=("decision_boundary", "linearity_reasoning"), page_number=18),
    ChildSpec("7c", "7", "7", ("c",), 10, "long_question", "long_answer", "Perceptron and MLP", "Perceptron and MLP", ("Perceptron and MLP",), ("architecture_reasoning", "parameter_design"), page_number=19),
]


MARKER_RE = re.compile(r"(?m)^\(([a-z]+|[ivx]+)\)\s*")


def split_sections(text: str) -> tuple[str, dict[str, str]]:
    matches = list(MARKER_RE.finditer(text))
    if not matches:
        return text.strip(), {}
    intro = text[: matches[0].start()].strip()
    sections: dict[str, str] = {}
    for idx, match in enumerate(matches):
        marker = match.group(1)
        end = matches[idx + 1].start() if idx + 1 < len(matches) else len(text)
        sections[marker] = text[match.start() : end].strip()
    return intro, sections


def extract_segment(text: str, path: tuple[str, ...]) -> str:
    current = text.strip()
    carried_intro: list[str] = []
    for depth, marker in enumerate(path):
        intro, sections = split_sections(current)
        if depth == 0 and intro:
            carried_intro.append(intro)
        current = sections.get(marker, current)
    return "\n".join(part for part in [*carried_intro, current] if part).strip()


def extract_true_false_answers(answer_text: str) -> dict[str, str]:
    """Map "(a)".."(j)" markers to "T" or "F", trying progressively looser formats."""
    answers: dict[str, str] = {}
    # A marker followed by a bare "T F" pair yields no answers at all
    # (presumably an unmarked template in which no choice was recorded).
    matches = list(re.finditer(r"(?m)^\(([a-j])\)\s*\n?T\s*F", answer_text))
    if matches:
        return answers
    # Common case: the letter follows the marker on the same or the next line.
    for match in re.finditer(r"(?m)^\(([a-j])\)\s*\n?([TF])\b", answer_text):
        answers[match.group(1)] = match.group(2)
    if answers:
        return answers
    # Fallback: scan stripped lines for a lone marker followed by a lone "T" or "F".
    lines = [line.strip() for line in answer_text.splitlines() if line.strip()]
    current = None
    for line in lines:
        m = re.fullmatch(r"\(([a-j])\)", line)
        if m:
            current = m.group(1)
            continue
        if current and line in {"T", "F"}:
            answers[current] = line
            current = None
    return answers


def load_seed_rows() -> dict[str, dict]:
    data = json.loads(PROBLEM_SEED_PATH.read_text())
    return {row["question_number"]: row for row in data if row["source_exam_key"] == EXAM_KEY}


def main() -> None:
    sb = get_supabase()
    paper = sb.table("papers").select("id").eq("source_exam_key", EXAM_KEY).execute().data[0]
    paper_id = paper["id"]
    current_rows = (
        sb.table("paper_questions")
        .select("*")
        .eq("paper_id", paper_id)
        .order("display_order")
        .execute()
        .data
    )
    existing_by_number = {row["question_number"]: row for row in current_rows}
    parent_rows = load_seed_rows()
    tf_answers = extract_true_false_answers(parent_rows["1"]["raw_answer_text"] or "")

    inserts = []
    for display_order, child in enumerate(CHILDREN, start=1):
        parent = parent_rows[child.top_level_number]
        existing = existing_by_number.get(child.question_number, {})
        question_text = extract_segment(parent["question_text"] or "", child.path)
        raw_answer_text = extract_segment(parent["raw_answer_text"] or "", child.path) if child.path else (parent["raw_answer_text"] or "")

        options = None
        correct_option = child.correct_option
        if child.options:
            options = [{"label": label, "text": text} for label, text in child.options]
        if child.question_type == "true_false":
            options = TRUE_FALSE_OPTIONS
            correct_option = tf_answers.get(child.path[0])

        inserts.append(
            {
                "paper_id": paper_id,
                "question_number": child.question_number,
                "parent_question": child.parent_question,
                "display_order": display_order,
                "question_type": child.question_type,
                "question_format": child.question_format,
                "question_text": question_text,
                "score": child.score,
                "page_number": child.page_number,
                "page_y_ratio": existing.get("page_y_ratio"),
                "options": options,
                "correct_option": correct_option,
                "correct_answer": child.correct_answer,
                "raw_answer_text": raw_answer_text,
                "topics": existing.get("topics") or (list(child.topic_tags) if child.topic_tags else parent.get("topics")),
                "topic_primary": existing.get("topic_primary") or child.topic_primary or parent.get("topic_primary"),
                "analytics_topic": existing.get("analytics_topic") or child.analytics_topic or parent.get("analytics_topic"),
                "topic_tags": existing.get("topic_tags") or (list(child.topic_tags) if child.topic_tags else parent.get("topic_tags")),
                "skill_tags": existing.get("skill_tags") or (list(child.skill_tags) if child.skill_tags else parent.get("skill_tags")),
                "difficulty": existing.get("difficulty") or parent.get("difficulty"),
                "knowledge_reminder": existing.get("knowledge_reminder", ""),
                "ai_hint": existing.get("ai_hint", ""),
                "solution": existing.get("solution", ""),
            }
        )

    sb.table("paper_questions").delete().eq("paper_id", paper_id).execute()
    sb.table("paper_questions").insert(inserts).execute()
    sb.table("papers").update({"question_count": len(inserts), "status": "processing"}).eq("id", paper_id).execute()
    print(f"Inserted {len(inserts)} rows for {EXAM_KEY}.")


if __name__ == "__main__":
    main()
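
Note on the metadata precedence in main(): for the tag fields, a non-empty value already stored in paper_questions wins, then the value declared on the ChildSpec, then the value inherited from the top-level seed row. The sketch below isolates that rule in a helper so it can be exercised without Supabase. pick_tags is a hypothetical name introduced here for illustration, and the sample values are invented.

```python
def pick_tags(existing, child_tags, parent, key):
    """Hypothetical helper mirroring the precedence used for tag fields in main()."""
    # 1. A non-empty value already stored in the database wins.
    # 2. Otherwise use the tags declared on the ChildSpec.
    # 3. Otherwise inherit from the top-level seed row.
    return existing.get(key) or (list(child_tags) if child_tags else parent.get(key))


# Invented rows, for illustration only.
parent_row = {"topic_tags": ["Perceptron and MLP"]}

kept = pick_tags({"topic_tags": ["Hand-edited"]}, ("Spec",), parent_row, "topic_tags")
from_spec = pick_tags({}, ("Spec",), parent_row, "topic_tags")
# An empty list counts as missing because "or" tests truthiness, not presence.
inherited = pick_tags({"topic_tags": []}, None, parent_row, "topic_tags")

print(kept, from_spec, inherited)
```

One consequence of this ordering is that re-running the script does not overwrite tags that were edited in the database; changing a tag in CHILDREN only takes effect for rows whose stored value is empty.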
242
backend/split_comp2211_2024_spring_final.py
Normal file
@@ -0,0 +1,242 @@
"""Split COMP2211 Spring 2024 final into subquestions."""

from __future__ import annotations

import json
import re
from dataclasses import dataclass
from pathlib import Path

from app.services.supabase_client import get_supabase


EXAM_KEY = "COMP2211-2024-spring-final"
PROBLEM_SEED_PATH = (
    Path(__file__).resolve().parent.parent
    / "pastpaper-scraper"
    / "reviews"
    / "COMP2211"
    / "problem_seed.json"
)
TRUE_FALSE_OPTIONS = [{"label": "True", "text": "True"}, {"label": "False", "text": "False"}]


@dataclass(frozen=True)
class ChildSpec:
    question_number: str
    parent_question: str
    top_level_number: str
    path: tuple[str, ...]
    score: float
    question_type: str
    question_format: str | None = None
    analytics_topic: str | None = None
    topic_primary: str | None = None
    topic_tags: tuple[str, ...] | None = None
    skill_tags: tuple[str, ...] | None = None
    options: tuple[tuple[str, str], ...] | None = None
    correct_option: str | None = None
    correct_answer: str | None = None
    page_number: int = 1


def short_answer(
    question_number: str,
    parent_question: str,
    top_level_number: str,
    path: tuple[str, ...],
    score: float,
    *,
    analytics_topic: str | None = None,
    topic_primary: str | None = None,
    topic_tags: tuple[str, ...] | None = None,
    skill_tags: tuple[str, ...] | None = None,
    correct_answer: str | None = None,
    page_number: int,
) -> ChildSpec:
    return ChildSpec(
        question_number=question_number,
        parent_question=parent_question,
        top_level_number=top_level_number,
        path=path,
        score=score,
        question_type="long_question",
        question_format="short_answer",
        analytics_topic=analytics_topic,
        topic_primary=topic_primary,
        topic_tags=topic_tags,
        skill_tags=skill_tags,
        correct_answer=correct_answer,
        page_number=page_number,
    )


CHILDREN: list[ChildSpec] = [
    ChildSpec("1a", "1", "1", ("a",), 1, "true_false", "true_false", "Python Fundamentals", "Python Fundamentals", ("Python Fundamentals",), ("concept_check", "code_tracing"), page_number=2),
    ChildSpec("1b", "1", "1", ("b",), 1, "true_false", "true_false", "Probabilistic Models", "Probabilistic Models", ("Probabilistic Models",), ("concept_check", "classification_decision"), page_number=2),
    ChildSpec("1c", "1", "1", ("c",), 1, "true_false", "true_false", "KNN and Clustering", "KNN and Clustering", ("KNN and Clustering",), ("concept_check", "algorithm_property"), page_number=2),
    ChildSpec("1d", "1", "1", ("d",), 1, "true_false", "true_false", "KNN and Clustering", "KNN and Clustering", ("KNN and Clustering",), ("concept_check", "algorithm_property"), page_number=2),
    ChildSpec("1e", "1", "1", ("e",), 1, "true_false", "true_false", "Perceptron and MLP", "Perceptron and MLP", ("Perceptron and MLP",), ("concept_check", "activation_reasoning"), page_number=2),
    ChildSpec("1f", "1", "1", ("f",), 1, "true_false", "true_false", "Vision and CNN", "Vision and CNN", ("Vision and CNN",), ("concept_check", "image_processing"), page_number=2),
    ChildSpec("1g", "1", "1", ("g",), 1, "true_false", "true_false", "Vision and CNN", "Vision and CNN", ("Vision and CNN",), ("concept_check", "cnn_complexity"), page_number=2),
    ChildSpec("1h", "1", "1", ("h",), 1, "true_false", "true_false", "Vision and CNN", "Vision and CNN", ("Vision and CNN",), ("concept_check", "regularization"), page_number=2),
    ChildSpec("1i", "1", "1", ("i",), 1, "true_false", "true_false", "Search and Games", "Search and Games", ("Search and Games",), ("concept_check", "pruning_reasoning"), page_number=2),
    ChildSpec("1j", "1", "1", ("j",), 1, "true_false", "true_false", "Ethics of AI", "Ethics of AI", ("Ethics of AI",), ("concept_check", "research_ethics"), page_number=2),
    ChildSpec("2a", "2", "2", ("a",), 4, "long_question", "coding", "Python Fundamentals", "Python Fundamentals", ("Python Fundamentals",), ("implementation", "vectorization", "masking"), page_number=3),
    ChildSpec("2b", "2", "2", ("b",), 6, "long_question", "coding", "Python Fundamentals", "Python Fundamentals", ("Python Fundamentals",), ("implementation", "convolution", "array_manipulation"), page_number=4),
    short_answer("3a_i", "3a", "3", ("a", "i"), 1.5, analytics_topic="Probabilistic Models", topic_primary="Probabilistic Models", topic_tags=("Probabilistic Models",), skill_tags=("manual_computation", "probability_reasoning"), page_number=6),
    short_answer("3a_ii", "3a", "3", ("a", "ii"), 1.5, analytics_topic="Probabilistic Models", topic_primary="Probabilistic Models", topic_tags=("Probabilistic Models",), skill_tags=("manual_computation", "probability_reasoning"), page_number=6),
    short_answer("3a_iii", "3a", "3", ("a", "iii"), 1.5, analytics_topic="Probabilistic Models", topic_primary="Probabilistic Models", topic_tags=("Probabilistic Models",), skill_tags=("manual_computation", "probability_reasoning"), page_number=6),
    short_answer("3a_iv", "3a", "3", ("a", "iv"), 1.5, analytics_topic="Probabilistic Models", topic_primary="Probabilistic Models", topic_tags=("Probabilistic Models",), skill_tags=("manual_computation", "probability_reasoning"), page_number=6),
    short_answer("3b_i", "3b", "3", ("b", "i"), 1.5, analytics_topic="Evaluation and Validation", topic_primary="Evaluation and Validation", topic_tags=("Evaluation and Validation",), skill_tags=("validation_reasoning",), page_number=6),
    short_answer("3b_ii", "3b", "3", ("b", "ii"), 1.5, analytics_topic="Evaluation and Validation", topic_primary="Evaluation and Validation", topic_tags=("Evaluation and Validation",), skill_tags=("validation_reasoning",), page_number=6),
    short_answer("3b_iii", "3b", "3", ("b", "iii"), 1.5, analytics_topic="Evaluation and Validation", topic_primary="Evaluation and Validation", topic_tags=("Evaluation and Validation",), skill_tags=("validation_reasoning",), page_number=6),
    short_answer("3c", "3", "3", ("c",), 1.5, analytics_topic="Perceptron and MLP", topic_primary="Perceptron and MLP", topic_tags=("Perceptron and MLP",), skill_tags=("linearity_reasoning", "classification_decision"), page_number=6),
    short_answer("4a_i", "4a", "4", ("a", "i"), 2.5, analytics_topic="Perceptron and MLP", topic_primary="Perceptron and MLP", topic_tags=("Perceptron and MLP",), skill_tags=("parameter_counting",), page_number=7),
    short_answer("4a_ii", "4a", "4", ("a", "ii"), 2.5, analytics_topic="Perceptron and MLP", topic_primary="Perceptron and MLP", topic_tags=("Perceptron and MLP",), skill_tags=("model_selection",), page_number=7),
    short_answer("4b", "4", "4", ("b",), 1, analytics_topic="Perceptron and MLP", topic_primary="Perceptron and MLP", topic_tags=("Perceptron and MLP",), skill_tags=("concept_explanation",), page_number=7),
    short_answer("4c", "4", "4", ("c",), 2, analytics_topic="Perceptron and MLP", topic_primary="Perceptron and MLP", topic_tags=("Perceptron and MLP",), skill_tags=("activation_reasoning", "optimization_reasoning"), page_number=7),
    ChildSpec("4d_i", "4d", "4", ("d", "i"), 1.5, "long_question", "long_answer", "Perceptron and MLP", "Perceptron and MLP", ("Perceptron and MLP",), ("forward_pass", "activation_reasoning"), page_number=8),
    ChildSpec("4d_ii", "4d", "4", ("d", "ii"), 1.5, "long_question", "long_answer", "Perceptron and MLP", "Perceptron and MLP", ("Perceptron and MLP",), ("backpropagation", "weight_update"), page_number=8),
    ChildSpec("5a", "5", "5", ("a",), 4.5, "long_question", "long_answer", "Vision and CNN", "Vision and CNN", ("Vision and CNN",), ("histogram_reasoning", "image_transform"), page_number=9),
    ChildSpec("5b", "5", "5", ("b",), 3, "long_question", "long_answer", "Vision and CNN", "Vision and CNN", ("Vision and CNN",), ("thresholding", "manual_computation"), page_number=10),
    ChildSpec("5c", "5", "5", ("c",), 2, "long_question", "long_answer", "Vision and CNN", "Vision and CNN", ("Vision and CNN",), ("padding", "manual_construction"), page_number=10),
    short_answer("5d_i", "5d", "5", ("d", "i"), 0.5, analytics_topic="Vision and CNN", topic_primary="Vision and CNN", topic_tags=("Vision and CNN",), skill_tags=("filter_effect_reasoning",), page_number=11),
    short_answer("5d_ii", "5d", "5", ("d", "ii"), 0.5, analytics_topic="Vision and CNN", topic_primary="Vision and CNN", topic_tags=("Vision and CNN",), skill_tags=("filter_effect_reasoning",), page_number=11),
    short_answer("5d_iii", "5d", "5", ("d", "iii"), 0.5, analytics_topic="Vision and CNN", topic_primary="Vision and CNN", topic_tags=("Vision and CNN",), skill_tags=("filter_effect_reasoning",), page_number=11),
    short_answer("5e", "5", "5", ("e",), 2, analytics_topic="Vision and CNN", topic_primary="Vision and CNN", topic_tags=("Vision and CNN",), skill_tags=("concept_explanation", "local_vs_global"), page_number=11),
    ChildSpec("6a", "6", "6", ("a",), 10, "long_question", "coding", "Vision and CNN", "Vision and CNN", ("Vision and CNN",), ("implementation", "convolution", "debugging"), page_number=12),
    ChildSpec("6b", "6", "6", ("b",), 3, "long_question", "coding", "Vision and CNN", "Vision and CNN", ("Vision and CNN",), ("implementation", "regularization"), page_number=15),
    short_answer("7a_i", "7a", "7", ("a", "i"), 1, analytics_topic="Vision and CNN", topic_primary="Vision and CNN", topic_tags=("Vision and CNN",), skill_tags=("cnn_architecture",), page_number=16),
    short_answer("7a_ii", "7a", "7", ("a", "ii"), 4, analytics_topic="Vision and CNN", topic_primary="Vision and CNN", topic_tags=("Vision and CNN",), skill_tags=("shape_reasoning", "parameter_counting"), page_number=16),
    short_answer("7a_iii", "7a", "7", ("a", "iii"), 3, analytics_topic="Vision and CNN", topic_primary="Vision and CNN", topic_tags=("Vision and CNN",), skill_tags=("overfitting", "regularization"), page_number=16),
    ChildSpec("7b", "7", "7", ("b",), 5, "long_question", "long_answer", "Vision and CNN", "Vision and CNN", ("Vision and CNN",), ("manual_computation", "cnn_forward_pass"), page_number=17),
    short_answer("7c_i", "7c", "7", ("c", "i"), 2, analytics_topic="Vision and CNN", topic_primary="Vision and CNN", topic_tags=("Vision and CNN",), skill_tags=("shape_reasoning", "3d_convolution"), page_number=17),
    short_answer("7c_ii", "7c", "7", ("c", "ii"), 1.5, analytics_topic="Vision and CNN", topic_primary="Vision and CNN", topic_tags=("Vision and CNN",), skill_tags=("parameter_counting", "3d_convolution"), page_number=17),
    short_answer("7c_iii", "7c", "7", ("c", "iii"), 1.5, analytics_topic="Vision and CNN", topic_primary="Vision and CNN", topic_tags=("Vision and CNN",), skill_tags=("parameter_counting", "3d_convolution"), page_number=17),
    short_answer("8a_i", "8a", "8", ("a", "i"), 1, analytics_topic="Search and Games", topic_primary="Search and Games", topic_tags=("Search and Games",), skill_tags=("tree_search", "manual_tracing"), page_number=18),
    short_answer("8a_ii", "8a", "8", ("a", "ii"), 3, analytics_topic="Search and Games", topic_primary="Search and Games", topic_tags=("Search and Games",), skill_tags=("pruning", "manual_tracing"), page_number=18),
    short_answer("8a_iii", "8a", "8", ("a", "iii"), 1, analytics_topic="Search and Games", topic_primary="Search and Games", topic_tags=("Search and Games",), skill_tags=("game_reasoning",), page_number=18),
    short_answer("8b_i", "8b", "8", ("b", "i"), 2.5, analytics_topic="Search and Games", topic_primary="Search and Games", topic_tags=("Search and Games",), skill_tags=("utility_reasoning",), page_number=18),
    short_answer("8b_ii", "8b", "8", ("b", "ii"), 2.5, analytics_topic="Search and Games", topic_primary="Search and Games", topic_tags=("Search and Games",), skill_tags=("pruning_reasoning", "concept_explanation"), page_number=18),
    short_answer("9", "9", "9", (), 3, analytics_topic="Ethics of AI", topic_primary="Ethics of AI", topic_tags=("Ethics of AI",), skill_tags=("concept_explanation", "governance"), page_number=19),
]


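MARKER_RE = re.compile(r"(?m)^\(([a-z]+)\)\s*")
# Expected marker order per nesting level: letters for top-level parts, roman
# numerals for nested parts. This ordering is an assumption about the papers.
LETTER_LABELS = tuple("abcdefghijklmnopqrstuvwxyz")
ROMAN_LABELS = ("i", "ii", "iii", "iv", "v", "vi", "vii", "viii", "ix", "x", "xi", "xii", "xiii", "xiv", "xv")


def split_sections(text: str, labels: tuple[str, ...]) -> tuple[str, dict[str, str]]:
    """Split ``text`` on line-start markers that follow ``labels`` in order.

    Only the next expected label counts as a boundary, so a nested "(i)" under
    "(a)" is not mistaken for a top-level part. Matching every marker at once
    cut "(a)" off at its first "(i)", so nested lookups always fell back to
    the parent text.
    """
    matches = []
    for match in MARKER_RE.finditer(text):
        if len(matches) < len(labels) and match.group(1) == labels[len(matches)]:
            matches.append(match)
    if not matches:
        return text.strip(), {}
    intro = text[: matches[0].start()].strip()
    sections: dict[str, str] = {}
    for idx, match in enumerate(matches):
        marker = match.group(1)
        end = matches[idx + 1].start() if idx + 1 < len(matches) else len(text)
        sections[marker] = text[match.start() : end].strip()
    return intro, sections


def extract_segment(text: str, path: tuple[str, ...]) -> str:
    """Return the text for ``path`` (e.g. ("a", "ii")), prefixed by the question intro."""
    if not path:
        return text.strip()
    current = text.strip()
    carried_intro: list[str] = []
    for depth, marker in enumerate(path):
        labels = LETTER_LABELS if depth == 0 else ROMAN_LABELS
        intro, sections = split_sections(current, labels)
        if depth == 0 and intro:
            carried_intro.append(intro)
        # Fall back to the enclosing text when the marker is not found.
        current = sections.get(marker, current)
    return "\n".join(part for part in [*carried_intro, current] if part).strip()

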
def extract_true_false_answers(answer_text: str) -> dict[str, str]:
    """Map parts "a".."j" of question 1 to "T" or "F".

    The first pattern pins the known answer-table sequence for this paper;
    otherwise the first ten standalone T/F tokens are used.
    """
    answers: dict[str, str] = {}
    table_match = re.search(r"Answer\s+(T\s+F\s+T\s+F\s+F\s+T\s+F\s+F\s+F\s+T)", answer_text, re.S)
    if table_match:
        seq = re.findall(r"[TF]", table_match.group(1))
        if len(seq) == 10:
            for idx, val in enumerate(seq):
                answers[chr(ord("a") + idx)] = val
            return answers
    seq = re.findall(r"\b([TF])\b", answer_text)
    if len(seq) >= 10:
        for idx, val in enumerate(seq[:10]):
            answers[chr(ord("a") + idx)] = val
    return answers


def load_seed_rows() -> dict[str, dict]:
    data = json.loads(PROBLEM_SEED_PATH.read_text())
    return {row["question_number"]: row for row in data if row["source_exam_key"] == EXAM_KEY}


def main() -> None:
    sb = get_supabase()
    paper = sb.table("papers").select("id").eq("source_exam_key", EXAM_KEY).execute().data[0]
    paper_id = paper["id"]
    current_rows = (
        sb.table("paper_questions")
        .select("*")
        .eq("paper_id", paper_id)
        .order("display_order")
        .execute()
        .data
    )
    existing_by_number = {row["question_number"]: row for row in current_rows}
    parent_rows = load_seed_rows()
    tf_answers = extract_true_false_answers(parent_rows["1"]["raw_answer_text"] or "")

    inserts = []
    for display_order, child in enumerate(CHILDREN, start=1):
        parent = parent_rows[child.top_level_number]
        existing = existing_by_number.get(child.question_number, {})
        question_text = extract_segment(parent["question_text"] or "", child.path)
        raw_answer_text = extract_segment(parent["raw_answer_text"] or "", child.path) if child.path else (parent["raw_answer_text"] or "")

        options = None
        correct_option = child.correct_option
        if child.question_type == "true_false":
            options = TRUE_FALSE_OPTIONS
            correct_option = tf_answers.get(child.path[0])
        elif child.options:
            options = [{"label": label, "text": text} for label, text in child.options]

        inserts.append(
            {
                "paper_id": paper_id,
                "question_number": child.question_number,
                "parent_question": child.parent_question,
                "display_order": display_order,
                "question_type": child.question_type,
                "question_format": child.question_format,
                "question_text": question_text,
                "score": child.score,
                "page_number": child.page_number,
                "page_y_ratio": existing.get("page_y_ratio"),
                "options": options,
                "correct_option": correct_option,
                "correct_answer": child.correct_answer,
                "raw_answer_text": raw_answer_text,
                "topics": existing.get("topics") or (list(child.topic_tags) if child.topic_tags else parent.get("topics")),
                "topic_primary": existing.get("topic_primary") or child.topic_primary or parent.get("topic_primary"),
                "analytics_topic": existing.get("analytics_topic") or child.analytics_topic or parent.get("analytics_topic"),
                "topic_tags": existing.get("topic_tags") or (list(child.topic_tags) if child.topic_tags else parent.get("topic_tags")),
                "skill_tags": existing.get("skill_tags") or (list(child.skill_tags) if child.skill_tags else parent.get("skill_tags")),
                "difficulty": existing.get("difficulty") or parent.get("difficulty"),
                "knowledge_reminder": existing.get("knowledge_reminder", ""),
                "ai_hint": existing.get("ai_hint", ""),
                "solution": existing.get("solution", ""),
            }
        )

    sb.table("paper_questions").delete().eq("paper_id", paper_id).execute()
    sb.table("paper_questions").insert(inserts).execute()
    sb.table("papers").update({"question_count": len(inserts), "status": "processing"}).eq("id", paper_id).execute()
    print(f"Inserted {len(inserts)} rows for {EXAM_KEY}.")


if __name__ == "__main__":
    main()
291
backend/split_comp2211_2024_spring_midterm.py
Normal file
@@ -0,0 +1,291 @@
"""Rebuild COMP2211 Spring 2024 midterm into subquestions."""

from __future__ import annotations

import json
import re
from dataclasses import dataclass
from pathlib import Path

import fitz

from app.services.supabase_client import get_supabase


EXAM_KEY = "COMP2211-2024-spring-midterm"
ROOT = Path(__file__).resolve().parent.parent
QUESTION_PDF = ROOT / "pastpaper-scraper" / "papers" / "COMP2211" / "(COMP2211)[2024](s)midterm~=rcidkjgf^_82003.pdf"
ANSWER_PDF = ROOT / "pastpaper-scraper" / "papers" / "COMP2211" / "(COMP2211)[2024](s)midterm~=ubrzkjmz^_90406.pdf"
PROBLEM_SEED_PATH = ROOT / "pastpaper-scraper" / "reviews" / "COMP2211" / "problem_seed.json"
TRUE_FALSE_OPTIONS = [{"label": "True", "text": "True"}, {"label": "False", "text": "False"}]


@dataclass(frozen=True)
class ChildSpec:
    question_number: str
    parent_question: str
    top_level_number: str
    path: tuple[str, ...]
    score: float
    question_type: str
    question_format: str | None = None
    analytics_topic: str | None = None
    topic_primary: str | None = None
    topic_tags: tuple[str, ...] | None = None
    skill_tags: tuple[str, ...] | None = None
    page_number: int = 1


def short_answer(
    question_number: str,
    parent_question: str,
    top_level_number: str,
    path: tuple[str, ...],
    score: float,
    *,
    analytics_topic: str | None = None,
    topic_primary: str | None = None,
    topic_tags: tuple[str, ...] | None = None,
    skill_tags: tuple[str, ...] | None = None,
    page_number: int,
) -> ChildSpec:
    return ChildSpec(
        question_number=question_number,
        parent_question=parent_question,
        top_level_number=top_level_number,
        path=path,
        score=score,
        question_type="long_question",
        question_format="short_answer",
        analytics_topic=analytics_topic,
        topic_primary=topic_primary,
        topic_tags=topic_tags,
        skill_tags=skill_tags,
        page_number=page_number,
    )


CHILDREN: list[ChildSpec] = [
    ChildSpec("1a", "1", "1", ("a",), 0.5, "true_false", "true_false", "Python Fundamentals", "Python Fundamentals", ("Python Fundamentals",), ("concept_check", "code_tracing"), page_number=3),
    ChildSpec("1b", "1", "1", ("b",), 0.5, "true_false", "true_false", "Python Fundamentals", "Python Fundamentals", ("Python Fundamentals",), ("concept_check", "broadcasting"), page_number=3),
    ChildSpec("1c", "1", "1", ("c",), 0.5, "true_false", "true_false", "KNN and Clustering", "KNN and Clustering", ("KNN and Clustering",), ("concept_check", "algorithm_property"), page_number=3),
    ChildSpec("1d", "1", "1", ("d",), 0.5, "true_false", "true_false", "KNN and Clustering", "KNN and Clustering", ("KNN and Clustering",), ("concept_check", "tie_reasoning"), page_number=3),
    ChildSpec("1e", "1", "1", ("e",), 0.5, "true_false", "true_false", "Evaluation and Validation", "Evaluation and Validation", ("Evaluation and Validation",), ("concept_check", "cross_validation"), page_number=3),
    ChildSpec("1f", "1", "1", ("f",), 0.5, "true_false", "true_false", "KNN and Clustering", "KNN and Clustering", ("KNN and Clustering",), ("concept_check", "clustering_property"), page_number=3),
    ChildSpec("1g", "1", "1", ("g",), 0.5, "true_false", "true_false", "KNN and Clustering", "KNN and Clustering", ("KNN and Clustering",), ("concept_check", "robustness_reasoning"), page_number=3),
    ChildSpec("1h", "1", "1", ("h",), 0.5, "true_false", "true_false", "Perceptron and MLP", "Perceptron and MLP", ("Perceptron and MLP",), ("concept_check", "decision_boundary"), page_number=3),
    ChildSpec("1i", "1", "1", ("i",), 0.5, "true_false", "true_false", "Perceptron and MLP", "Perceptron and MLP", ("Perceptron and MLP",), ("concept_check", "optimization_reasoning"), page_number=3),
    ChildSpec("1j", "1", "1", ("j",), 0.5, "true_false", "true_false", "KNN and Clustering", "KNN and Clustering", ("KNN and Clustering",), ("concept_check", "clustering_property"), page_number=3),
    short_answer("2a_i", "2a", "2", ("a", "i"), 1, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("code_tracing",), page_number=4),
    short_answer("2a_ii", "2a", "2", ("a", "ii"), 1, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("code_tracing",), page_number=4),
    short_answer("2a_iii", "2a", "2", ("a", "iii"), 1, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("array_manipulation",), page_number=5),
    short_answer("2a_iv", "2a", "2", ("a", "iv"), 1, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("array_construction",), page_number=5),
    short_answer("2a_v", "2a", "2", ("a", "v"), 1, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("aggregation",), page_number=5),
    short_answer("2a_vi", "2a", "2", ("a", "vi"), 1, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("transpose",), page_number=6),
    short_answer("2a_vii", "2a", "2", ("a", "vii"), 1, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("matrix_multiplication",), page_number=6),
    short_answer("2a_viii", "2a", "2", ("a", "viii"), 1, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("dot_product",), page_number=6),
    short_answer("2a_ix", "2a", "2", ("a", "ix"), 1, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("broadcasting",), page_number=6),
    short_answer("2a_x", "2a", "2", ("a", "x"), 1, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("error_reasoning",), page_number=7),
    short_answer("2a_xi", "2a", "2", ("a", "xi"), 1, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("broadcasting",), page_number=7),
    short_answer("2a_xii", "2a", "2", ("a", "xii"), 1, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("slicing",), page_number=7),
    short_answer("2a_xiii", "2a", "2", ("a", "xiii"), 1, analytics_topic="Python Fundamentals", topic_primary="Python Fundamentals", topic_tags=("Python Fundamentals",), skill_tags=("views_vs_copies",), page_number=7),
    ChildSpec("2b", "2", "2", ("b",), 6, "long_question", "coding", "Python Fundamentals", "Python Fundamentals", ("Python Fundamentals",), ("implementation", "vectorization", "similarity_computation"), page_number=8),
    ChildSpec("3a", "3", "3", ("a",), 5.5, "long_question", "long_answer", "Evaluation and Validation", "Evaluation and Validation", ("Evaluation and Validation",), ("manual_computation", "metric_reasoning"), page_number=10),
    short_answer("3b", "3", "3", ("b",), 1, analytics_topic="Evaluation and Validation", topic_primary="Evaluation and Validation", topic_tags=("Evaluation and Validation",), skill_tags=("metric_reasoning",), page_number=11),
    ChildSpec("3c", "3", "3", ("c",), 2.5, "long_question", "long_answer", "Evaluation and Validation", "Evaluation and Validation", ("Evaluation and Validation",), ("manual_computation", "metric_reasoning"), page_number=11),
    short_answer("3d", "3", "3", ("d",), 1, analytics_topic="Evaluation and Validation", topic_primary="Evaluation and Validation", topic_tags=("Evaluation and Validation",), skill_tags=("metric_reasoning",), page_number=12),
    ChildSpec("3e", "3", "3", ("e",), 6, "long_question", "coding", "Evaluation and Validation", "Evaluation and Validation", ("Evaluation and Validation",), ("implementation", "metrics", "vectorization"), page_number=12),
    ChildSpec("4a", "4", "4", ("a",), 4, "long_question", "long_answer", "Probabilistic Models", "Probabilistic Models", ("Probabilistic Models",), ("manual_computation", "gaussian_nb"), page_number=15),
    ChildSpec("4b", "4", "4", ("b",), 3, "long_question", "long_answer", "Probabilistic Models", "Probabilistic Models", ("Probabilistic Models",), ("manual_computation", "likelihood_reasoning"), page_number=15),
    ChildSpec("4c", "4", "4", ("c",), 4, "long_question", "long_answer", "Probabilistic Models", "Probabilistic Models", ("Probabilistic Models",), ("laplace_smoothing", "likelihood_reasoning"), page_number=16),
    short_answer("4d", "4", "4", ("d",), 2, analytics_topic="Probabilistic Models", topic_primary="Probabilistic Models", topic_tags=("Probabilistic Models",), skill_tags=("prior_reasoning",), page_number=17),
    ChildSpec("4e", "4", "4", ("e",), 3, "long_question", "long_answer", "Probabilistic Models", "Probabilistic Models", ("Probabilistic Models",), ("posterior_reasoning", "classification_decision"), page_number=17),
    ChildSpec("5a", "5", "5", ("a",), 3, "long_question", "long_answer", "KNN and Clustering", "KNN and Clustering", ("KNN and Clustering",), ("distance_calculation", "weighted_knn"), page_number=18),
    ChildSpec("5b", "5", "5", ("b",), 13, "long_question", "long_answer", "KNN and Clustering", "KNN and Clustering", ("KNN and Clustering",), ("cross_validation", "manual_tracing", "model_selection"), page_number=18),
    short_answer("5c", "5", "5", ("c",), 2, analytics_topic="KNN and Clustering", topic_primary="KNN and Clustering", topic_tags=("KNN and Clustering",), skill_tags=("test_error", "model_selection"), page_number=20),
    ChildSpec("6a", "6", "6", ("a",), 6, "long_question", "long_answer", "KNN and Clustering", "KNN and Clustering", ("KNN and Clustering",), ("manual_computation", "clustering"), page_number=21),
    ChildSpec("6b", "6", "6", ("b",), 6, "long_question", "long_answer", "KNN and Clustering", "KNN and Clustering", ("KNN and Clustering",), ("manual_computation", "clustering"), page_number=22),
    short_answer("6c", "6", "6", ("c",), 2, analytics_topic="KNN and Clustering", topic_primary="KNN and Clustering", topic_tags=("KNN and Clustering",), skill_tags=("outlier_reasoning",), page_number=22),
    short_answer("6d", "6", "6", ("d",), 2, analytics_topic="KNN and Clustering", topic_primary="KNN and Clustering", topic_tags=("KNN and Clustering",), skill_tags=("model_selection", "threshold_reasoning"), page_number=22),
    ChildSpec("7", "7", "7", (), 10, "long_question", "long_answer", "Evaluation and Validation", "Evaluation and Validation", ("Evaluation and Validation",), ("cross_validation", "data_leakage_reasoning"), page_number=23),
]


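MARKER_RE = re.compile(r"(?m)^\(([a-z]+)\)\s*")
# Expected marker order per nesting level: letters for top-level parts, roman
# numerals for nested parts. This ordering is an assumption about the papers.
LETTER_LABELS = tuple("abcdefghijklmnopqrstuvwxyz")
ROMAN_LABELS = ("i", "ii", "iii", "iv", "v", "vi", "vii", "viii", "ix", "x", "xi", "xii", "xiii", "xiv", "xv")


def split_sections(text: str, labels: tuple[str, ...]) -> tuple[str, dict[str, str]]:
    """Split ``text`` on line-start markers that follow ``labels`` in order.

    Only the next expected label counts as a boundary, so a nested "(i)" under
    "(a)" is not mistaken for a top-level part. Matching every marker at once
    cut "(a)" off at its first "(i)", so nested lookups always fell back to
    the parent text.
    """
    matches = []
    for match in MARKER_RE.finditer(text):
        if len(matches) < len(labels) and match.group(1) == labels[len(matches)]:
            matches.append(match)
    if not matches:
        return text.strip(), {}
    intro = text[: matches[0].start()].strip()
    sections: dict[str, str] = {}
    for idx, match in enumerate(matches):
        marker = match.group(1)
        end = matches[idx + 1].start() if idx + 1 < len(matches) else len(text)
        sections[marker] = text[match.start() : end].strip()
    return intro, sections


def extract_segment(text: str, path: tuple[str, ...]) -> str:
    """Return the text for ``path`` (e.g. ("a", "ii")), prefixed by the question intro."""
    if not path:
        return text.strip()
    current = text.strip()
    carried_intro: list[str] = []
    for depth, marker in enumerate(path):
        labels = LETTER_LABELS if depth == 0 else ROMAN_LABELS
        intro, sections = split_sections(current, labels)
        if depth == 0 and intro:
            carried_intro.append(intro)
        # Fall back to the enclosing text when the marker is not found.
        current = sections.get(marker, current)
    return "\n".join(part for part in [*carried_intro, current] if part).strip()

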
def extract_pages(pdf_path: Path, start: int, end: int) -> str:
    doc = fitz.open(pdf_path)
    try:
        return "\n".join(doc[i].get_text("text") for i in range(start - 1, end))
    finally:
        doc.close()


def load_seed_rows() -> dict[str, dict]:
    data = json.loads(PROBLEM_SEED_PATH.read_text())
    return {row["question_number"]: row for row in data if row["source_exam_key"] == EXAM_KEY}


def build_source_rows(existing_rows: dict[str, dict]) -> dict[str, dict]:
    """Return seed rows with questions 5 and 7 re-extracted from the PDFs."""
    seed_rows = load_seed_rows()
    rows = dict(seed_rows)
    if "5" in rows:
        rows["5"] = {
            **rows["5"],
            "question_text": extract_pages(QUESTION_PDF, 18, 20),
            "raw_answer_text": extract_pages(ANSWER_PDF, 21, 25),
            "page_number": 18,
            "analytics_topic": "KNN and Clustering",
            "topic_primary": "KNN and Clustering",
            "topic_tags": ["KNN and Clustering"],
            "skill_tags": ["manual_computation", "distance_calculation", "algorithm_tracing"],
            "difficulty": "medium",
        }
    else:
        # No seed row for question 5: spreading seed_rows["5"] here would raise
        # KeyError. Falling back to the existing DB row is an assumption.
        rows["5"] = {
            **existing_rows.get("5", {}),
            "question_text": extract_pages(QUESTION_PDF, 18, 20),
            "raw_answer_text": extract_pages(ANSWER_PDF, 21, 25),
            "page_number": 18,
        }
    if "7" in rows:
        rows["7"] = {
            **rows["7"],
            "question_text": extract_pages(QUESTION_PDF, 23, 24),
            "raw_answer_text": extract_pages(ANSWER_PDF, 31, 34),
            "page_number": 23,
            "analytics_topic": "Evaluation and Validation",
            "topic_primary": "Evaluation and Validation",
            "topic_tags": ["Evaluation and Validation"],
            "skill_tags": ["cross_validation", "data_leakage_reasoning"],
            "difficulty": "medium",
        }
    else:
        # Same fallback as for question 5 (assumption).
        rows["7"] = {
            **existing_rows.get("7", {}),
            "question_text": extract_pages(QUESTION_PDF, 23, 24),
            "raw_answer_text": extract_pages(ANSWER_PDF, 31, 34),
            "page_number": 23,
        }
    return rows


def extract_true_false_answers(answer_text: str) -> dict[str, str]:
    """Map parts "a".."j" of question 1 to "T" or "F".

    Tries, in order: a run of T/F values after "Answer", "(a)"-style marker
    lines each followed by a T or F line, then the first ten standalone T/F
    tokens.
    """
    answers: dict[str, str] = {}
    table_match = re.search(r"Answer\s+([TF\s]+)", answer_text, re.S)
    if table_match:
        seq = re.findall(r"[TF]", table_match.group(1))
        if len(seq) >= 10:
            for idx, val in enumerate(seq[:10]):
                answers[chr(ord("a") + idx)] = val
            return answers
    lines = [line.strip() for line in answer_text.splitlines() if line.strip()]
    current_letter: str | None = None
    for line in lines:
        m = re.fullmatch(r"\(([a-j])\)", line)
        if m:
            current_letter = m.group(1)
            continue
        if current_letter and line in {"T", "F"}:
            answers[current_letter] = line
            current_letter = None
    if answers:
        return answers
    seq = re.findall(r"\b([TF])\b", answer_text)
    if len(seq) >= 10:
        for idx, val in enumerate(seq[:10]):
            answers[chr(ord("a") + idx)] = val
    return answers


def main() -> None:
    sb = get_supabase()
    paper = sb.table("papers").select("id").eq("source_exam_key", EXAM_KEY).execute().data[0]
    paper_id = paper["id"]
    current_rows = (
        sb.table("paper_questions")
        .select("*")
        .eq("paper_id", paper_id)
        .order("display_order")
        .execute()
        .data
    )
    existing_by_number = {row["question_number"]: row for row in current_rows}
    parent_rows = build_source_rows(existing_by_number)
    tf_answers = extract_true_false_answers(parent_rows["1"]["raw_answer_text"] or "")

    inserts = []
    for display_order, child in enumerate(CHILDREN, start=1):
        parent = parent_rows[child.top_level_number]
        existing = existing_by_number.get(child.question_number, {})
        question_text = extract_segment(parent["question_text"] or "", child.path)
        raw_answer_text = extract_segment(parent["raw_answer_text"] or "", child.path) if child.path else (parent["raw_answer_text"] or "")
        options = None
        correct_option = None
        if child.question_type == "true_false":
            options = TRUE_FALSE_OPTIONS
            correct_option = tf_answers.get(child.path[0])

        inserts.append(
            {
                "paper_id": paper_id,
                "question_number": child.question_number,
                "parent_question": child.parent_question,
                "display_order": display_order,
                "question_type": child.question_type,
                "question_format": child.question_format,
                "question_text": question_text,
                "score": child.score,
                "page_number": child.page_number,
                "page_y_ratio": existing.get("page_y_ratio"),
                "options": options,
                "correct_option": correct_option,
                "correct_answer": None,
                "raw_answer_text": raw_answer_text,
                "topics": existing.get("topics") or (list(child.topic_tags) if child.topic_tags else parent.get("topics")),
                "topic_primary": existing.get("topic_primary") or child.topic_primary or parent.get("topic_primary"),
                "analytics_topic": existing.get("analytics_topic") or child.analytics_topic or parent.get("analytics_topic"),
                "topic_tags": existing.get("topic_tags") or (list(child.topic_tags) if child.topic_tags else parent.get("topic_tags")),
                "skill_tags": existing.get("skill_tags") or (list(child.skill_tags) if child.skill_tags else parent.get("skill_tags")),
                "difficulty": existing.get("difficulty") or parent.get("difficulty"),
                "knowledge_reminder": existing.get("knowledge_reminder", ""),
                "ai_hint": existing.get("ai_hint", ""),
                "solution": existing.get("solution", ""),
            }
        )

    sb.table("paper_questions").delete().eq("paper_id", paper_id).execute()
    sb.table("paper_questions").insert(inserts).execute()
    sb.table("papers").update({"question_count": len(inserts), "status": "processing"}).eq("id", paper_id).execute()
    print(f"Inserted {len(inserts)} rows for {EXAM_KEY}.")


if __name__ == "__main__":
    main()
|
||||
121
backend/upload_course_library_pdfs.py
Normal file
@@ -0,0 +1,121 @@
|
||||
"""Upload COMP2211 course-library PDFs to Supabase Storage.

Run from the backend directory:
    uv run python upload_course_library_pdfs.py

Each entry maps a storage path (inside the `papers` bucket) to the local
source file under pastpaper-scraper/papers/COMP2211/.
"""

from __future__ import annotations

import sys
from pathlib import Path

# ---------------------------------------------------------------------------
# Manifest: (storage_path, local_filename)
# storage_path is relative inside the `papers` bucket.
# local_filename is relative to PAPERS_DIR below.
# ---------------------------------------------------------------------------
MANIFEST: list[tuple[str, str]] = [
    (
        "course-library/COMP2211/COMP2211-2022-fall-midterm/paper.pdf",
        "(COMP2211)[2022](f)midterm~=yjz8dxdd^_27002.pdf",
    ),
    (
        "course-library/COMP2211/COMP2211-2022-fall-midterm/answer.pdf",
        "(COMP2211)[2022](f)midterm~=yjz8dxdd^_18747.pdf",
    ),
    (
        "course-library/COMP2211/COMP2211-2022-spring-midterm/paper.pdf",
        "(COMP2211)[2022](s)midterm~=b8bidkgs^_14629.pdf",
    ),
    (
        "course-library/COMP2211/COMP2211-2022-spring-midterm/answer.pdf",
        "(COMP2211)[2022](s)midterm~=6ma030^_89587.pdf",
    ),
    (
        "course-library/COMP2211/COMP2211-2022-spring-final-part-a/paper.pdf",
        "(COMP2211)[2022](s)final~=b8bidkgs^_33018.pdf",
    ),
    (
        "course-library/COMP2211/COMP2211-2022-spring-final-part-a/answer.pdf",
        "(COMP2211)[2022](s)final~=ajou6^_82011.pdf",
    ),
    (
        "course-library/COMP2211/COMP2211-2022-spring-final-part-b/paper.pdf",
        "(COMP2211)[2022](s)final~=b8bidkgs^_40627.pdf",
    ),
    (
        "course-library/COMP2211/COMP2211-2022-spring-final-part-b/answer.pdf",
        "(COMP2211)[2022](s)final~=ajou6^_51199.pdf",
    ),
    (
        "course-library/COMP2211/COMP2211-2023-spring-midterm/paper.pdf",
        "(COMP2211)[2023](s)midterm~=bxbidkmj^_26587.pdf",
    ),
    (
        "course-library/COMP2211/COMP2211-2023-spring-midterm/answer.pdf",
        "(COMP2211)[2023](s)midterm~clchanbg^_17297.pdf",
    ),
    (
        "course-library/COMP2211/COMP2211-2024-spring-midterm/paper.pdf",
        "(COMP2211)[2024](s)midterm~=rcidkjgf^_82003.pdf",
    ),
    (
        "course-library/COMP2211/COMP2211-2024-spring-midterm/answer.pdf",
        "(COMP2211)[2024](s)midterm~=ubrzkjmz^_90406.pdf",
    ),
    (
        "course-library/COMP2211/COMP2211-2024-spring-final/paper.pdf",
        "(COMP2211)[2024](s)final~=igk5mmg^_90365.pdf",
    ),
    (
        "course-library/COMP2211/COMP2211-2024-spring-final/answer.pdf",
        "(COMP2211)[2024](s)final~=igk5mmg^_58857.pdf",
    ),
]

PAPERS_DIR = (
    Path(__file__).parent.parent
    / "pastpaper-scraper"
    / "papers"
    / "COMP2211"
)


def main() -> None:
    from app.services.supabase_client import get_supabase

    sb = get_supabase()
    bucket = sb.storage.from_("papers")

    ok = 0
    skipped = 0
    failed = 0

    for storage_path, local_name in MANIFEST:
        local_file = PAPERS_DIR / local_name
        if not local_file.exists():
            print(f"  MISSING local file: {local_name}")
            failed += 1
            continue

        data = local_file.read_bytes()
        try:
            bucket.upload(
                storage_path,
                data,
                file_options={"content-type": "application/pdf", "upsert": "true"},
            )
            print(f"  OK  {storage_path}")
            ok += 1
        except Exception as exc:
            print(f"  ERR {storage_path}: {exc}")
            failed += 1

    print(f"\nDone: {ok} uploaded, {skipped} skipped, {failed} failed.")


if __name__ == "__main__":
    main()
|
||||
1969
backend/uv.lock
generated
Normal file
File diff suppressed because it is too large
92
deploy.md
Normal file
@@ -0,0 +1,92 @@
|
||||
# Deploying to Tencent Cloud

## 1. Prepare the server

```bash
# After logging in over SSH, install Docker
curl -fsSL https://get.docker.com | sh
sudo systemctl enable docker && sudo systemctl start docker

# Install docker-compose
sudo curl -L "https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
```
|
||||
|
||||
## 2. Upload the code

```bash
# Package locally (excluding node_modules and .venv)
cd "/Users/soda/Desktop/PastPaper Master"
tar --exclude='node_modules' --exclude='.venv' --exclude='__pycache__' --exclude='.git' \
  -czf pastpaper.tar.gz .

# Upload to the server
scp pastpaper.tar.gz root@<SERVER_IP>:/opt/pastpaper/

# Unpack on the server
ssh root@<SERVER_IP>
cd /opt/pastpaper && tar xzf pastpaper.tar.gz
```
|
||||
|
||||
## 3. Configure environment variables

```bash
# Edit .env and confirm that every key is correct
vi /opt/pastpaper/.env
```

Required variables:
- `SUPABASE_URL`, `SUPABASE_ANON_KEY`, `SUPABASE_SERVICE_ROLE_KEY`
- `DASHSCOPE_BASE_URL`, `DASHSCOPE_API_KEY`
- `DEEPSEEK_BASE_URL`, `DEEPSEEK_API_KEY`
- `LAOZHANG_BASE_URL`, `LAOZHANG_API_KEY` (fallback)
- `GOOGLE_GEMINI_API_KEY` (if supported in the server's region)
|
||||
|
||||
## 4. Build and start

```bash
cd /opt/pastpaper
docker-compose up -d --build
```
|
||||
|
||||
## 5. Verify

```bash
# Check container status
docker-compose ps

# Check backend health
curl http://localhost/health

# View logs
docker-compose logs -f backend
docker-compose logs -f frontend
```
|
||||
|
||||
## 6. Domain + HTTPS (optional)

If you have a domain, point its DNS record at the server IP in the Tencent Cloud console, then:

```bash
# Install certbot
apt install -y certbot python3-certbot-nginx

# Obtain a certificate (first change server_name in nginx.conf to your domain)
certbot --nginx -d your-domain.com
```
|
||||
|
||||
## Common operations commands

```bash
# Restart
docker-compose restart

# Rebuild after updating the code
docker-compose up -d --build

# View backend logs
docker-compose logs -f backend

# Open a shell in the backend container
docker-compose exec backend bash
```
|
||||
10
docker-compose.yml
Normal file
@@ -0,0 +1,10 @@
|
||||
services:
  backend:
    build: ./backend
    env_file: .env
    ports:
      - "8001:8000"
    restart: unless-stopped
    dns:
      - 8.8.8.8
      - 1.1.1.1
|
||||
152
docs/PAGE_NUMBER_BACKFILL.md
Normal file
@@ -0,0 +1,152 @@
|
||||
# Sub-question Page Number Backfill — Requirements
|
||||
|
||||
## Problem
|
||||
|
||||
All six `split_comp2211_*.py` scripts create sub-questions by inheriting `page_number`
|
||||
from their parent question:
|
||||
|
||||
```python
|
||||
"page_number": parent.get("page_number"),
|
||||
```
|
||||
|
||||
This is wrong whenever a parent question spans more than one page. For example, Q1 True/False
has 10 statements (a–j); if (a)–(f) are on page 1 and (g)–(j) are on page 2, all ten
inherit page 1 from the parent, so clicking Q1h in the UI scrolls to page 1 instead of page 2.
|
||||
|
||||
## Goal
|
||||
|
||||
Every `ChildSpec` in every split script should carry its own correct `page_number`.
|
||||
When the script runs, it writes that page number to the database instead of inheriting
|
||||
from the parent.
|
||||
|
||||
## Files to modify
|
||||
|
||||
```
|
||||
backend/split_comp2211_2022_fall_midterm.py ← does not exist yet; parent is seed SQL
|
||||
backend/split_comp2211_2022_spring_midterm.py
|
||||
backend/split_comp2211_2022_spring_final_part_a.py
|
||||
backend/split_comp2211_2022_spring_final_part_b.py
|
||||
backend/split_comp2211_2023_spring_midterm.py
|
||||
backend/split_comp2211_2024_spring_midterm.py
|
||||
backend/split_comp2211_2024_spring_final.py
|
||||
```
|
||||
|
||||
Note: `2022-fall-midterm` sub-questions were inserted directly via the seed SQL
|
||||
(`supabase/seeds/comp2211_problem_level_questions.sql`), not via a split script.
|
||||
Their page numbers must be fixed directly in that SQL file or via a separate UPDATE.
|
||||
|
||||
## How to determine page numbers
|
||||
|
||||
Use PyMuPDF (`import pymupdf` — already in the venv) to search for question markers
|
||||
in the local PDF files. The PDFs are at:
|
||||
|
||||
```
|
||||
../pastpaper-scraper/papers/COMP2211/<filename>
|
||||
```
|
||||
|
||||
Filename mapping (from `upload_course_library_pdfs.py`):
|
||||
|
||||
| Exam key | Local paper PDF |
|----------|-----------------|
| COMP2211-2022-fall-midterm | `(COMP2211)[2022](f)midterm~=yjz8dxdd^_27002.pdf` |
| COMP2211-2022-spring-midterm | `(COMP2211)[2022](s)midterm~=b8bidkgs^_14629.pdf` |
| COMP2211-2022-spring-final-part-a | `(COMP2211)[2022](s)final~=b8bidkgs^_33018.pdf` |
| COMP2211-2022-spring-final-part-b | `(COMP2211)[2022](s)final~=b8bidkgs^_40627.pdf` |
| COMP2211-2023-spring-midterm | `(COMP2211)[2023](s)midterm~=bxbidkmj^_26587.pdf` |
| COMP2211-2024-spring-midterm | `(COMP2211)[2024](s)midterm~=rcidkjgf^_82003.pdf` |
| COMP2211-2024-spring-final | `(COMP2211)[2024](s)final~=igk5mmg^_90365.pdf` |
|
||||
|
||||
### Suggested search strategy
|
||||
|
||||
```python
import pymupdf

doc = pymupdf.open("path/to/paper.pdf")
for page_num, page in enumerate(doc, start=1):
    text = page.get_text()
    print(f"--- Page {page_num} ---")
    print(text[:500])
```
|
||||
|
||||
Search for markers like:
|
||||
- `"(a)"`, `"(b)"`, ... for True/False sub-statements
|
||||
- `"Q2(a)"`, `"2(a)"`, `"Question 2"` for major sub-questions
|
||||
- `"(i)"`, `"(ii)"` for nested sub-questions
|
||||
|
||||
Page numbers are 1-indexed (matching the `page_number` field in the database).
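
As a rough illustration of the search strategy above, the sketch below returns the first page whose extracted text contains a marker. It is a minimal sketch, not part of the existing scripts: `first_page_containing` and its `start_page` parameter are hypothetical names, and `page_texts` is assumed to be the list of per-page strings collected from `page.get_text()`.

```python
from __future__ import annotations


def first_page_containing(
    page_texts: list[str], marker: str, start_page: int = 1
) -> int | None:
    """Return the 1-indexed page whose text first contains `marker`.

    `start_page` (hypothetical parameter) lets the caller skip earlier pages,
    e.g. to look for "(a)" only from the page where the parent question starts.
    """
    for page_num in range(start_page, len(page_texts) + 1):
        if marker in page_texts[page_num - 1]:
            return page_num
    return None


# Made-up page texts standing in for [page.get_text() for page in doc].
pages = [
    "Question 1 (a) statement ... (f) statement",
    "(g) statement ... (j) statement",
    "Question 2 (a) code snippet",
]
print(first_page_containing(pages, "(g)"))
print(first_page_containing(pages, "(a)", start_page=3))
```

Plain substring matching is crude: markers such as `(a)` recur under every question, so any result should still be confirmed against the PDF by eye.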
|
||||
|
||||
## Code changes per split script
|
||||
|
||||
### Step 1 — Add `page_number` field to `ChildSpec`
|
||||
|
||||
Each script has its own `ChildSpec` dataclass. Add the field with a default so
|
||||
existing call sites don't break immediately:
|
||||
|
||||
```python
@dataclass(frozen=True)
class ChildSpec:
    ...
    page_number: int = 1  # add this field
```
|
||||
|
||||
### Step 2 — Set correct page numbers in each `ChildSpec` instance
|
||||
|
||||
Fill in the actual page after inspecting the PDF:
|
||||
|
||||
```python
|
||||
ChildSpec("1a", "1", "1", ("a",), 1.5, "true_false", page_number=1),
|
||||
ChildSpec("1b", "1", "1", ("b",), 1.5, "true_false", page_number=1),
|
||||
...
|
||||
ChildSpec("1h", "1", "1", ("h",), 1.5, "true_false", page_number=2),
|
||||
```
|
||||
|
||||
### Step 3 — Write `page_number` in the upsert payload
|
||||
|
||||
Find where the script builds the INSERT/upsert dict and replace the inherited value:
|
||||
|
||||
```python
|
||||
# Before:
|
||||
"page_number": parent.get("page_number"),
|
||||
|
||||
# After:
|
||||
"page_number": child.page_number,
|
||||
```
|
||||
|
||||
### Step 4 — Update existing rows in the database
|
||||
|
||||
After modifying the scripts, run each script once — they already use upsert/update
|
||||
semantics, so re-running overwrites the old (inherited) page numbers with the correct ones.
|
||||
|
||||
If a script does INSERT-only (not upsert), add a separate UPDATE pass:
|
||||
|
||||
```python
sb.table("paper_questions").update({"page_number": child.page_number}) \
    .eq("paper_id", paper_id) \
    .eq("question_number", child.question_number) \
    .execute()
```
|
||||
|
||||
## 2022-fall-midterm (seed SQL)
|
||||
|
||||
Sub-questions for this paper are in:
|
||||
`supabase/seeds/comp2211_problem_level_questions.sql`
|
||||
|
||||
The seed has a `page_number` column in the VALUES rows. Find all rows for
|
||||
`COMP2211-2022-fall-midterm` and correct the values. Then run a direct UPDATE
|
||||
against the live database:
|
||||
|
||||
```sql
-- Example: adjust actual page numbers after inspecting the PDF
UPDATE paper_questions
SET page_number = 2
WHERE paper_id = (SELECT id FROM papers WHERE source_exam_key = 'COMP2211-2022-fall-midterm')
  AND question_number IN ('1g', '1h', '1i', '1j');
```
|
||||
|
||||
## Definition of Done
|
||||
|
||||
- [ ] Every `ChildSpec` in every split script has an explicit `page_number`
|
||||
- [ ] No script uses `parent.get("page_number")` for the upsert payload
|
||||
- [ ] All six scripts have been re-run against the live database
|
||||
- [ ] 2022-fall-midterm sub-questions updated via SQL
|
||||
- [ ] Spot-check: clicking Q1h in a paper where Q1 spans 2 pages scrolls to page 2 in the UI
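
One cheap sanity check before re-running the scripts, offered only as a sketch: if the child specs are listed in document order, their page numbers should never go backwards. The helper below is hypothetical (`find_page_regressions` is not in the repository) and takes plain `(question_number, page_number)` pairs so that it makes no assumption about the full `ChildSpec` field list.

```python
from __future__ import annotations


def find_page_regressions(children: list[tuple[str, int]]) -> list[str]:
    """Return question numbers whose page is lower than the highest page seen so far.

    Assumes `children` is in document order, so a lower page number than an
    earlier sibling most likely means a typo or a stale inherited value.
    """
    regressions: list[str] = []
    highest_page = 0
    for question_number, page_number in children:
        if page_number < highest_page:
            regressions.append(question_number)
        highest_page = max(highest_page, page_number)
    return regressions


# Made-up example: "1h" was left on page 1 although "1g" is already on page 2.
example = [("1a", 1), ("1g", 2), ("1h", 1), ("2a", 3)]
print(find_page_regressions(example))
```

An empty result does not prove the page numbers are right; it only rules out one class of mistake, so the UI spot-check is still needed.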
|
||||
243
docs/TAGGING_REQUIREMENTS.md
Normal file
@@ -0,0 +1,243 @@
|
||||
# Tag Schema & Similar Question Retrieval — Requirements
|
||||
|
||||
## Background
|
||||
|
||||
Current state of `paper_questions` tagging for COMP2211:
|
||||
|
||||
- `analytics_topic`: 8 coarse buckets (e.g. "KNN and Clustering" covers both KNN and K-Means)
|
||||
- `topic_tags`: redundant copy of `analytics_topic`, adds no information
|
||||
- `skill_tags`: fine-grained snake_case labels (e.g. `centroid_update`, `distance_calculation`), not shown to users
|
||||
- `question_text`: at subquestion level, but currently stores **parent problem header text**, not the actual subquestion statement
|
||||
|
||||
The result is that similar question retrieval conflates KNN and K-Means, cannot distinguish "write code" from "trace algorithm", and produces low-precision recommendations.
|
||||
|
||||
---
|
||||
|
||||
## Goal
|
||||
|
||||
Every subquestion should carry enough structured metadata that the retrieval system can return **topically and skill-wise identical questions across different exam years**, rather than just questions from the same broad topic bucket.
|
||||
|
||||
Precision target: a question on K-Means centroid update should retrieve other K-Means centroid update questions, not KNN distance questions.
|
||||
|
||||
---
|
||||
|
||||
## Field Definitions (revised)
|
||||
|
||||
### `analytics_topic` — single string, primary retrieval bucket
|
||||
|
||||
Granularity: **algorithm or concept level**, not course-section level.
|
||||
|
||||
Allowed values for COMP2211 (replace current 8-bucket system):
|
||||
|
||||
| New value | Replaces / splits |
|-----------|-------------------|
| `Naive Bayes` | Probabilistic Models (partial) |
| `Bayesian Inference` | Probabilistic Models (partial) |
| `KNN` | KNN and Clustering (partial) |
| `K-Means` | KNN and Clustering (partial) |
| `Perceptron` | Perceptron and MLP (partial) |
| `MLP` | Perceptron and MLP (partial) |
| `CNN` | Vision and CNN |
| `Evaluation Metrics` | Evaluation and Validation (partial) |
| `Cross Validation` | Evaluation and Validation (partial) |
| `Python and NumPy` | Python Fundamentals |
| `Search Algorithms` | Search and Games (partial) |
| `Game Trees` | Search and Games (partial) |
| `Ethics of AI` | Ethics of AI (unchanged) |
|
||||
|
||||
Rules:
|
||||
- One value per question — pick the **most specific** algorithm being tested
|
||||
- If a subquestion genuinely spans two algorithms, pick the one being asked to compute/demonstrate
|
||||
- `True/False` is **not** a valid analytics_topic (it is a format, not a topic)
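
The allowed-value list lends itself to a simple membership check. The snippet below is a sketch under that assumption: the constant and function names are hypothetical, and only the thirteen values come from the table above.

```python
from __future__ import annotations

# The thirteen allowed values, copied from the table above.
ALLOWED_ANALYTICS_TOPICS = frozenset(
    {
        "Naive Bayes",
        "Bayesian Inference",
        "KNN",
        "K-Means",
        "Perceptron",
        "MLP",
        "CNN",
        "Evaluation Metrics",
        "Cross Validation",
        "Python and NumPy",
        "Search Algorithms",
        "Game Trees",
        "Ethics of AI",
    }
)


def invalid_analytics_topics(values: list[str]) -> list[str]:
    """Return the distinct values that are not in the allowed list, sorted."""
    return sorted({value for value in values if value not in ALLOWED_ANALYTICS_TOPICS})


print(invalid_analytics_topics(["KNN", "True/False", "KNN and Clustering"]))
```

Running such a check over every `analytics_topic` in the course library would surface both leftover old buckets and format names such as `True/False`.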
|
||||
|
||||
---
|
||||
|
||||
### `topic_tags` — string array, secondary topic labels
|
||||
|
||||
Granularity: **concept and variant level** within the algorithm.
|
||||
|
||||
Purpose: catch cross-topic overlaps and concept aliases.
|
||||
|
||||
Examples:
|
||||
|
||||
```
|
||||
analytics_topic = "K-Means"
|
||||
topic_tags = ["K-Means", "Centroid Update", "Convergence"]
|
||||
|
||||
analytics_topic = "KNN"
|
||||
topic_tags = ["KNN", "Euclidean Distance", "Classification"]
|
||||
|
||||
analytics_topic = "Naive Bayes"
|
||||
topic_tags = ["Naive Bayes", "Prior", "Likelihood", "Posterior"]
|
||||
|
||||
analytics_topic = "Evaluation Metrics"
|
||||
topic_tags = ["Evaluation Metrics", "Precision", "Recall", "F1 Score"]
|
||||
|
||||
analytics_topic = "MLP"
|
||||
topic_tags = ["MLP", "Backpropagation", "Activation Function", "Hidden Layer"]
|
||||
|
||||
analytics_topic = "Python and NumPy"
|
||||
topic_tags = ["NumPy", "Broadcasting", "Array Indexing", "Vectorization"]
|
||||
```
|
||||
|
||||
Rules:
|
||||
- First element should match or alias `analytics_topic`
|
||||
- Include concept names a student would search for ("F1 Score", not "metric_reasoning")
|
||||
- 2–5 tags per question; avoid over-tagging
|
||||
- Human-readable, title-case, no underscores
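
The rules above can be partly checked mechanically. The following is a minimal sketch with a hypothetical helper name; in particular, the "match or alias" rule is approximated by a case-insensitive substring test in either direction, which is an assumption and will miss true aliases.

```python
from __future__ import annotations


def topic_tag_problems(analytics_topic: str, topic_tags: list[str]) -> list[str]:
    """Return human-readable rule violations for one question's topic_tags."""
    problems: list[str] = []
    if not 2 <= len(topic_tags) <= 5:
        problems.append(f"expected 2-5 tags, got {len(topic_tags)}")
    for tag in topic_tags:
        if "_" in tag:
            problems.append(f"tag contains an underscore: {tag}")
    if topic_tags:
        first = topic_tags[0].lower()
        topic = analytics_topic.lower()
        # Crude alias test (assumption): one string contains the other.
        if first not in topic and topic not in first:
            problems.append(f"first tag {topic_tags[0]!r} does not match {analytics_topic!r}")
    return problems


print(topic_tag_problems("K-Means", ["K-Means", "Centroid Update", "Convergence"]))
print(topic_tag_problems("KNN", ["metric_reasoning"]))
```

The substring test accepts the `"NumPy"` tag under `"Python and NumPy"` from the examples, but a maintained alias table would be more reliable if aliases diverge further.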
|
||||
|
||||
---
|
||||
|
||||
### `skill_tags` — string array, task type labels
|
||||
|
||||
Granularity: **what the student must do**, not what the topic is.
|
||||
|
||||
Current values are acceptable in meaning but must be converted to human-readable form.
|
||||
|
||||
Rename convention: `snake_case` → `Title Case with spaces`
|
||||
|
||||
| Old | New |
|-----|-----|
| `concept_check` | `Concept Check` |
| `code_tracing` | `Code Tracing` |
| `algorithm_tracing` | `Algorithm Tracing` |
| `distance_calculation` | `Distance Calculation` |
| `centroid_update` | `Centroid Update` |
| `weight_update` | `Weight Update` |
| `decision_boundary` | `Decision Boundary` |
| `implementation` | `Implementation` |
| `debugging` | `Debugging` |
| `model_selection` | `Model Selection` |
| `concept_explanation` | `Concept Explanation` |
| `architecture_reasoning` | `Architecture Reasoning` |
| `convergence_reasoning` | `Convergence Reasoning` |
| `generalization_reasoning` | `Generalization Reasoning` |
| `classification_decision` | `Classification Decision` |
|
||||
|
||||
Rules:
|
||||
- 1–3 tags per question
|
||||
- Describes the **task type**, not the subject matter
|
||||
- These are used for retrieval ranking, not primary display
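
The rename convention is regular enough to be expressed as a one-line conversion. This is a sketch with a hypothetical function name, not the backfill script itself.

```python
def humanize_skill_tag(tag: str) -> str:
    """Convert a snake_case skill tag to Title Case with spaces."""
    # Collapse underscores and repeated whitespace, then title-case each word.
    return " ".join(tag.replace("_", " ").split()).title()


print(humanize_skill_tag("centroid_update"))
print(humanize_skill_tag("generalization_reasoning"))
```

The conversion is idempotent, so tags that were already renamed pass through unchanged. One caveat: `str.title()` would turn an acronym-bearing tag such as a hypothetical `knn_voting` into `Knn Voting`, so any such tags would need an explicit override.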
|
||||
|
||||
---
|
||||
|
||||
### `question_text` — the actual subquestion statement
|
||||
|
||||
Current problem: subquestions store the **parent problem header** as `question_text`, not the individual statement.
|
||||
|
||||
Required fix per subquestion type:
|
||||
|
||||
| Type | What `question_text` should contain |
|------|-------------------------------------|
| True/False subquestion (Q1a–Q1j) | The specific T/F statement being judged |
| Code output (Q2a_i–Q2a_v) | The specific code snippet + "What is the output?" |
| Calculation subquestion (Q4a, Q5a) | The specific sub-task, e.g. "Compute the Euclidean distance between..." |
| Written explanation (Q3, Q5c) | The full question prompt for that part |
|
||||
|
||||
This is a **data extraction quality issue**. The backfill script must extract the correct per-subquestion text from the source PDF or from `raw_answer_text`.
|
||||
|
||||
---
|
||||
|
||||
## Backfill Requirements
|
||||
|
||||
### Script: `backfill_comp2211_tags.py`
|
||||
|
||||
Target: all `paper_questions` where `paper_id` in the COMP2211 course library.
|
||||
|
||||
For each question:
|
||||
|
||||
1. **Re-classify `analytics_topic`** using the new value list above
   - Use `question_text` + existing `topic_tags` + `skill_tags` as signals
   - If `analytics_topic` is currently `"KNN and Clustering"`:
     - Look at `skill_tags` and `question_text`
     - If `centroid_update`, `algorithm_tracing`, or text contains "K-Means" / "centroid" → set `"K-Means"`
     - Otherwise → set `"KNN"`
   - If `analytics_topic` is currently `"Perceptron and MLP"`:
     - If `question_text` or `skill_tags` references hidden layer, backprop, activation function → `"MLP"`
     - Otherwise → `"Perceptron"`
   - If `analytics_topic` is currently `"Probabilistic Models"`:
     - If Naive Bayes in text → `"Naive Bayes"`
     - Otherwise → `"Bayesian Inference"`
   - If `analytics_topic` is currently `"Evaluation and Validation"`:
     - If cross-validation, train/val split in text → `"Cross Validation"`
     - Otherwise → `"Evaluation Metrics"`
   - If `analytics_topic` is currently `"Search and Games"`:
     - If minimax, alpha-beta, game tree in text → `"Game Trees"`
     - Otherwise → `"Search Algorithms"`
|
||||
|
||||
2. **Rebuild `topic_tags`** — do not copy `analytics_topic`; derive from question content
|
||||
|
||||
3. **Rename `skill_tags`** — convert all snake_case values to Title Case per the mapping table above
|
||||
|
||||
4. **Do not overwrite `question_text`** in this pass (separate task)
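
The re-classification rules in step 1 could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual `backfill_comp2211_tags.py`: the function and helper names are hypothetical, and the keyword lists are one possible reading of phrases like "references hidden layer" or "train/val split in text".

```python
from __future__ import annotations

# Old buckets that map one-to-one onto a new value (from the allowed-values table).
DIRECT_RENAMES = {
    "Vision and CNN": "CNN",
    "Python Fundamentals": "Python and NumPy",
}


def _mentions(text: str, keywords: tuple[str, ...]) -> bool:
    """True if any keyword occurs in the (already lower-cased) text."""
    return any(keyword in text for keyword in keywords)


def reclassify_topic(old_topic: str, question_text: str, skill_tags: list[str]) -> str:
    """Map an old 8-bucket analytics_topic to the new algorithm-level value."""
    text = question_text.lower()
    skills = {tag.lower() for tag in skill_tags}
    # Skill tags rendered as words so one keyword test covers both signals.
    skill_words = " ".join(sorted(skills)).replace("_", " ")

    if old_topic == "KNN and Clustering":
        if skills & {"centroid_update", "algorithm_tracing"} or _mentions(text, ("k-means", "centroid")):
            return "K-Means"
        return "KNN"
    if old_topic == "Perceptron and MLP":
        if _mentions(f"{text} {skill_words}", ("hidden layer", "backprop", "activation function")):
            return "MLP"
        return "Perceptron"
    if old_topic == "Probabilistic Models":
        return "Naive Bayes" if _mentions(text, ("naive bayes",)) else "Bayesian Inference"
    if old_topic == "Evaluation and Validation":
        if _mentions(text, ("cross-validation", "cross validation", "train/val")):
            return "Cross Validation"
        return "Evaluation Metrics"
    if old_topic == "Search and Games":
        if _mentions(text, ("minimax", "alpha-beta", "game tree")):
            return "Game Trees"
        return "Search Algorithms"
    # One-to-one renames; anything else (e.g. "Ethics of AI") is left unchanged.
    return DIRECT_RENAMES.get(old_topic, old_topic)


print(reclassify_topic("KNN and Clustering", "Update the centroid of each cluster.", []))
```

Keyword matching of this kind is brittle (a spelling such as "Naïve Bayes" would not match), so the output is best treated as a first pass to be reviewed by hand.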
|
||||
|
||||
---
|
||||
|
||||
## Retrieval Algorithm Changes (backend `questions.py`)
|
||||
|
||||
### Separate topic and skill contributions
|
||||
|
||||
Current `similarity_score()` merges `analytics_topic`, `topic_tags`, and `skill_tags` into one set. This causes skill tags like `centroid_update` to appear as "Shared topic: centroid_update" in the UI.
|
||||
|
||||
Required split:
|
||||
|
||||
```python
def similarity_score(target, candidate):
    score = 0
    reasons = []

    # 1. analytics_topic exact match: 40 pts
    if target.get("analytics_topic") and target["analytics_topic"] == candidate.get("analytics_topic"):
        score += 40
        reasons.append(f"Same topic: {target['analytics_topic']}")

    # 2. topic_tags overlap: up to 20 pts (10 per shared tag, max 2)
    target_tt = set(t.lower() for t in (target.get("topic_tags") or []))
    candidate_tt = set(t.lower() for t in (candidate.get("topic_tags") or []))
    shared_tt = target_tt & candidate_tt
    tt_pts = min(len(shared_tt) * 10, 20)
    if tt_pts:
        score += tt_pts
        reasons.append(f"Shared concept: {', '.join(sorted(shared_tt)[:2])}")

    # 3. skill_tags overlap: up to 20 pts (10 per shared tag, max 2)
    target_st = set(t.lower() for t in (target.get("skill_tags") or []))
    candidate_st = set(t.lower() for t in (candidate.get("skill_tags") or []))
    shared_st = target_st & candidate_st
    st_pts = min(len(shared_st) * 10, 20)
    if st_pts:
        score += st_pts
        reasons.append(f"Shared skill: {', '.join(sorted(shared_st)[:2])}")

    # 4. Same question format: 10 pts
    if question_family(candidate) == question_family(target):
        score += 10
        reasons.append("Same format")

    # 5. Same difficulty: 5 pts
    if candidate.get("difficulty") and candidate["difficulty"] == target.get("difficulty"):
        score += 5
        reasons.append("Same difficulty")

    # 6. Full-text similarity: up to 20 pts (from tsvector RPC)
    # (injected externally, not computed here)

    return min(score, 99), reasons
```
|
||||
|
||||
### Threshold and display
|
||||
|
||||
- Filter: discard candidates with `match_percent < 20` (threshold raised from 10; intended to ensure `analytics_topic` at least partially matches)
|
||||
- UI display: show `match_reasons` chips, but replace snake_case with Title Case before display
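
How the externally injected full-text score and the threshold fit together is not spelled out above. The sketch below shows one possible arrangement, with loudly labeled assumptions: `rank_candidates` is a hypothetical name, and the text score is assumed to arrive normalized to the range 0 to 1 before being scaled to at most 20 points.

```python
from __future__ import annotations

MIN_MATCH_PERCENT = 20  # threshold from the bullet above
MAX_TEXT_POINTS = 20    # "up to 20 pts" for full-text similarity
MAX_MATCH_PERCENT = 99  # same cap as similarity_score()


def rank_candidates(scored: list[tuple[str, int, float]]) -> list[tuple[str, int]]:
    """Combine tag score and text score, drop weak matches, sort best first.

    Each input item is (question_id, tag_score, text_score); text_score is
    assumed (not confirmed by the source) to be normalized to [0, 1].
    """
    ranked: list[tuple[str, int]] = []
    for question_id, tag_score, text_score in scored:
        clamped = max(0.0, min(text_score, 1.0))
        text_points = round(clamped * MAX_TEXT_POINTS)
        match_percent = min(tag_score + text_points, MAX_MATCH_PERCENT)
        if match_percent >= MIN_MATCH_PERCENT:
            ranked.append((question_id, match_percent))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Made-up candidates: "b" has only format + difficulty (15 pts) and is dropped.
candidates = [("a", 95, 1.0), ("b", 15, 0.0), ("c", 15, 0.5), ("d", 40, 0.0)]
print(rank_candidates(candidates))
```

Note that under this arrangement a candidate with no `analytics_topic` match can still clear the threshold through shared tags or text similarity, which matters for the "hard or near-hard requirement" item in the Definition of Done.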
|
||||
|
||||
---
|
||||
|
||||
## Definition of Done
|
||||
|
||||
- [ ] All COMP2211 questions have `analytics_topic` from the new value list
|
||||
- [ ] No `analytics_topic` value of `"KNN and Clustering"`, `"Perceptron and MLP"`, `"Probabilistic Models"`, `"Evaluation and Validation"`, `"Search and Games"` remains
|
||||
- [ ] `topic_tags` contains 2–5 human-readable concept names, not a copy of `analytics_topic`
|
||||
- [ ] `skill_tags` values are Title Case with spaces
|
||||
- [ ] Similar question retrieval returns 0 cross-algorithm false positives between KNN and K-Means
|
||||
- [ ] `match_reasons` chips in the UI show no underscores
|
||||
- [ ] Retrieval threshold enforces `analytics_topic` match as a hard or near-hard requirement
|
||||
12
frontend/Dockerfile
Normal file
@@ -0,0 +1,12 @@
|
||||
FROM node:20-alpine AS build
|
||||
|
||||
WORKDIR /app
|
||||
COPY package.json package-lock.json ./
|
||||
RUN npm ci
|
||||
COPY . .
|
||||
RUN npm run build
|
||||
|
||||
FROM nginx:alpine
|
||||
COPY --from=build /app/dist /usr/share/nginx/html
|
||||
COPY nginx.conf /etc/nginx/conf.d/default.conf
|
||||
EXPOSE 80
|
||||
13
frontend/index.html
Normal file
@@ -0,0 +1,13 @@
|
||||
<!DOCTYPE html>
<html lang="zh-CN">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <link rel="icon" type="image/jpeg" href="/favicon.jpg" />
    <title>PastPaper Master</title>
  </head>
  <body>
    <div id="root"></div>
    <script type="module" src="/src/main.tsx"></script>
  </body>
</html>
|
||||
27
frontend/nginx.conf
Normal file
@@ -0,0 +1,27 @@
|
||||
server {
    listen 80;
    server_name pastpaper.knowit.top;

    root /usr/share/nginx/html;
    index index.html;

    # SPA fallback
    location / {
        try_files $uri $uri/ /index.html;
    }

    # API proxy to backend
    location /api/ {
        proxy_pass http://backend:8000/api/;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_read_timeout 300s;
        client_max_body_size 50M;
    }

    # Health check proxy
    location /health {
        proxy_pass http://backend:8000/health;
    }
}
|
||||
3058
frontend/package-lock.json
generated
Normal file
File diff suppressed because it is too large
30
frontend/package.json
Normal file
@@ -0,0 +1,30 @@
|
||||
{
  "name": "frontend",
  "version": "1.0.0",
  "description": "",
  "type": "module",
  "scripts": {
    "dev": "vite",
    "build": "tsc && vite build",
    "preview": "vite preview"
  },
  "dependencies": {
    "@supabase/supabase-js": "^2.103.0",
    "katex": "^0.16.38",
    "pdfjs-dist": "^5.5.207",
    "react": "^19.2.4",
    "react-dom": "^19.2.4",
    "react-pdf": "^10.4.1",
    "react-router-dom": "^7.13.1"
  },
  "devDependencies": {
    "@tailwindcss/vite": "^4.2.1",
    "@types/katex": "^0.16.8",
    "@types/react": "^19.2.14",
    "@types/react-dom": "^19.2.3",
    "@vitejs/plugin-react": "^4.7.0",
    "tailwindcss": "^4.2.1",
    "typescript": "^5.9.3",
    "vite": "^7.3.1"
  }
}
|
||||
BIN
frontend/public/favicon.jpg
Normal file
Binary file not shown.
Size: 103 KiB
30
frontend/src/App.tsx
Normal file
@@ -0,0 +1,30 @@
|
||||
import { Navigate, Routes, Route } from "react-router-dom";
import { useAuth } from "./contexts/AuthContext";
import ProcessingBanner from "./components/layout/ProcessingBanner";
import LoginPage from "./pages/LoginPage";
import HomePage from "./pages/HomePage";
import UploadPage from "./pages/UploadPage";
import WorkbenchPage from "./pages/WorkbenchPage";
import ErrorBookPage from "./pages/ErrorBookPage";
import AnalyticsPage from "./pages/AnalyticsPage";

export default function App() {
  const { session, loading } = useAuth();

  if (loading) return <div className="min-h-screen bg-gray-50 flex items-center justify-center"><div className="text-gray-400 text-sm">Loading...</div></div>;

  return (
    <>
      <ProcessingBanner />
      <Routes>
        <Route path="/login" element={session ? <Navigate to="/" replace /> : <LoginPage />} />
        <Route path="/" element={<HomePage />} />
        <Route path="/upload" element={<UploadPage />} />
        <Route path="/paper/:id" element={<WorkbenchPage />} />
        <Route path="/error-book" element={<ErrorBookPage />} />
        <Route path="/analytics" element={<AnalyticsPage />} />
        <Route path="/analytics/:courseCode" element={<AnalyticsPage />} />
      </Routes>
    </>
  );
}
|
||||
69
frontend/src/components/layout/Header.tsx
Normal file
@@ -0,0 +1,69 @@
import { Link } from "react-router-dom";
import { useAuth } from "@/contexts/AuthContext";

export default function Header({
  courseCode,
  paperTitle,
}: {
  courseCode?: string;
  paperTitle?: string;
}) {
  const { user, signOut } = useAuth();

  return (
    <header className="h-14 border-b border-gray-200 bg-white flex items-center px-6 shrink-0">
      <Link to="/" className="text-lg font-bold text-blue-600 mr-6">
        PastPaper Master
      </Link>
      {courseCode && (
        <div className="flex items-center gap-2 text-sm text-gray-600">
          <span className="bg-blue-50 text-blue-700 px-2 py-0.5 rounded font-medium">
            {courseCode}
          </span>
          {paperTitle && <span>{paperTitle}</span>}
          <Link
            to={`/analytics/${courseCode}`}
            className="ml-2 flex items-center gap-1 px-2.5 py-1 text-xs font-medium text-indigo-600 bg-indigo-50 rounded hover:bg-indigo-100 transition-colors"
          >
            <svg className="w-3 h-3" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
              <path strokeLinecap="round" strokeLinejoin="round" d="M3 13.125C3 12.504 3.504 12 4.125 12h2.25c.621 0 1.125.504 1.125 1.125v6.75C7.5 20.496 6.996 21 6.375 21h-2.25A1.125 1.125 0 013 19.875v-6.75zM9.75 8.625c0-.621.504-1.125 1.125-1.125h2.25c.621 0 1.125.504 1.125 1.125v11.25c0 .621-.504 1.125-1.125 1.125h-2.25a1.125 1.125 0 01-1.125-1.125V8.625zM16.5 4.125c0-.621.504-1.125 1.125-1.125h2.25C20.496 3 21 3.504 21 4.125v15.75c0 .621-.504 1.125-1.125 1.125h-2.25a1.125 1.125 0 01-1.125-1.125V4.125z" />
            </svg>
            AI Analytics
          </Link>
        </div>
      )}
      <div className="ml-auto flex items-center gap-4 text-sm">
        <Link to="/" className="text-gray-500 hover:text-gray-800">
          My Papers
        </Link>
        <Link to="/error-book" className="text-gray-500 hover:text-gray-800">
          Error Book
        </Link>
        <Link to="/analytics" className="text-gray-500 hover:text-gray-800">
          Analytics
        </Link>
        <Link to="/upload" className="text-blue-600 hover:text-blue-800 font-medium">
          Upload
        </Link>
        {user ? (
          <div className="flex items-center gap-3 pl-4 border-l border-gray-200">
            <span className="text-xs text-gray-400">{user.email}</span>
            <button
              onClick={signOut}
              className="text-xs text-gray-500 hover:text-gray-800 px-2 py-1 rounded hover:bg-gray-100"
            >
              Sign out
            </button>
          </div>
        ) : (
          <Link
            to="/login"
            className="text-sm text-blue-600 hover:text-blue-800 font-medium pl-4 border-l border-gray-200"
          >
            Sign in
          </Link>
        )}
      </div>
    </header>
  );
}
183
frontend/src/components/layout/ProcessingBanner.tsx
Normal file
@@ -0,0 +1,183 @@
import { useEffect, useRef, useState } from "react";
import { Link } from "react-router-dom";
import { myPapers } from "@/lib/api";
import { useAuth } from "@/contexts/AuthContext";
import type { Paper } from "@/types/api";

interface Notification {
  paperId: string;
  label: string;
}

const POLL_MS = 4000;

export default function ProcessingBanner() {
  const { user } = useAuth();
  const [processing, setProcessing] = useState<Paper[]>([]);
  const [doneNotifs, setDoneNotifs] = useState<Notification[]>([]);
  const [expanded, setExpanded] = useState(false);
  const knownIds = useRef<Set<string>>(new Set());

  // Drag state
  const [pos, setPos] = useState({ x: window.innerWidth - 220, y: 24 });
  const dragging = useRef(false);
  const dragOffset = useRef({ x: 0, y: 0 });
  const widgetRef = useRef<HTMLDivElement>(null);

  useEffect(() => {
    if (!user) return;
    let cancelled = false;

    const poll = async () => {
      try {
        const papers = await myPapers();
        if (cancelled) return;

        const inProgress = papers.filter((p) => p.status === "processing" || p.status === "uploaded");
        setProcessing(inProgress);

        papers
          .filter((p) => p.status === "ready" && knownIds.current.has(p.id))
          .forEach((p) => {
            knownIds.current.delete(p.id);
            const label = `${p.course_code} ${p.year} ${p.term} ${p.exam_type}`;
            setDoneNotifs((prev) => [...prev, { paperId: p.id, label }]);
            setTimeout(() => {
              setDoneNotifs((prev) => prev.filter((n) => n.paperId !== p.id));
            }, 8000);
          });

        inProgress.forEach((p) => knownIds.current.add(p.id));
      } catch {
        // silent
      }
    };

    poll();
    const interval = setInterval(poll, POLL_MS);
    return () => { cancelled = true; clearInterval(interval); };
  }, [user]);

  // Drag handlers
  const onMouseDown = (e: React.MouseEvent) => {
    // Only drag on the header bar
    dragging.current = true;
    dragOffset.current = {
      x: e.clientX - pos.x,
      y: e.clientY - pos.y,
    };
    e.preventDefault();
  };

  useEffect(() => {
    const onMouseMove = (e: MouseEvent) => {
      if (!dragging.current) return;
      setPos({
        x: Math.max(0, Math.min(window.innerWidth - 200, e.clientX - dragOffset.current.x)),
        y: Math.max(0, Math.min(window.innerHeight - 60, e.clientY - dragOffset.current.y)),
      });
    };
    const onMouseUp = () => { dragging.current = false; };
    window.addEventListener("mousemove", onMouseMove);
    window.addEventListener("mouseup", onMouseUp);
    return () => {
      window.removeEventListener("mousemove", onMouseMove);
      window.removeEventListener("mouseup", onMouseUp);
    };
  }, []);

  if (!user || (processing.length === 0 && doneNotifs.length === 0)) return null;

  const total = processing.length + doneNotifs.length;

  return (
    <div
      ref={widgetRef}
      className="fixed z-50 select-none"
      style={{ left: pos.x, top: pos.y }}
    >
      {/* ── Header / collapsed pill ── */}
      <div
        onMouseDown={onMouseDown}
        onClick={() => setExpanded((v) => !v)}
        className="flex items-center gap-2 bg-gray-900 text-white text-xs px-3.5 py-2.5 rounded-xl shadow-lg cursor-grab active:cursor-grabbing"
        style={{ minWidth: 180 }}
      >
        <span className="w-3 h-3 border-2 border-white border-t-transparent rounded-full animate-spin shrink-0" />
        <span className="flex-1 font-medium">
          {processing.length > 0
            ? `${processing.length} processing…`
            : `${doneNotifs.length} ready`}
        </span>
        {doneNotifs.length > 0 && (
          <span className="w-4 h-4 flex items-center justify-center bg-green-500 rounded-full text-[10px] font-bold shrink-0">
            {doneNotifs.length}
          </span>
        )}
        <span className="text-gray-400 text-[10px] shrink-0">{expanded ? "▲" : "▼"}</span>
      </div>

      {/* ── Expanded panel ── */}
      {expanded && (
        <div className="mt-1.5 flex flex-col gap-1.5" style={{ minWidth: 240 }}>
          {processing.map((p) => {
            const step = p.processing_step;
            const progress = p.processing_progress || 0;
            const total = p.processing_total || 0;
            const pct = total > 0 ? Math.round((progress / total) * 100) : 0;
            return (
              <div
                key={p.id}
                className="bg-gray-900 text-white text-xs px-3.5 py-2.5 rounded-xl shadow-lg"
              >
                <div className="flex items-center gap-2.5 mb-1.5">
                  <span className="w-3 h-3 border-2 border-white border-t-transparent rounded-full animate-spin shrink-0" />
                  <span className="truncate">
                    <span className="font-semibold">{p.course_code}</span>{" "}
                    {p.year} {p.term} {p.exam_type}
                  </span>
                </div>
                {step && (
                  <>
                    <div className="text-[10px] text-gray-400 mb-1 truncate">{step}</div>
                    {total > 0 && (
                      <div className="h-1.5 bg-gray-700 rounded-full overflow-hidden">
                        <div className="h-full bg-blue-400 rounded-full transition-all duration-500" style={{ width: `${pct}%` }} />
                      </div>
                    )}
                  </>
                )}
              </div>
            );
          })}

          {doneNotifs.map((n) => (
            <div
              key={n.paperId}
              className="flex items-center gap-2.5 bg-green-600 text-white text-xs px-3.5 py-2.5 rounded-xl shadow-lg"
            >
              <span className="text-sm leading-none">✓</span>
              <span className="flex-1 truncate font-semibold">{n.label}</span>
              <Link
                to={`/paper/${n.paperId}`}
                className="shrink-0 underline font-semibold hover:text-green-100"
                onClick={(e) => e.stopPropagation()}
              >
                Open →
              </Link>
              <button
                onClick={(e) => {
                  e.stopPropagation();
                  setDoneNotifs((prev) => prev.filter((x) => x.paperId !== n.paperId));
                }}
                className="shrink-0 text-green-200 hover:text-white"
              >
                ×
              </button>
            </div>
          ))}
        </div>
      )}
    </div>
  );
}
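Note on ProcessingBanner.tsx: the polling effect reports a paper as "done" only if an earlier poll saw it in progress. The sketch below isolates that bookkeeping as a pure function so it can be read and tested without React. `PaperLike` and `diffPoll` are hypothetical names introduced for illustration; they are not part of this commit.

```typescript
// Hypothetical, simplified paper shape; the real `Paper` type lives in "@/types/api".
type PaperStatus = "uploaded" | "processing" | "ready" | "error";

interface PaperLike {
  id: string;
  status: PaperStatus;
}

/**
 * One polling step. Returns the papers still in progress and the papers that
 * became ready since a previous poll, and updates `known` in place the same
 * way the banner updates `knownIds.current`.
 */
function diffPoll(papers: PaperLike[], known: Set<string>) {
  const inProgress = papers.filter(
    (p) => p.status === "processing" || p.status === "uploaded",
  );
  // Only papers previously seen in progress count as newly ready.
  const newlyReady = papers.filter((p) => p.status === "ready" && known.has(p.id));
  newlyReady.forEach((p) => known.delete(p.id));
  inProgress.forEach((p) => known.add(p.id));
  return { inProgress, newlyReady };
}

const known = new Set<string>();
// Poll 1: "a" is processing, "b" was already ready before we started watching.
const first = diffPoll(
  [{ id: "a", status: "processing" }, { id: "b", status: "ready" }],
  known,
);
// Poll 2: "a" has finished, so it is reported exactly once.
const second = diffPoll(
  [{ id: "a", status: "ready" }, { id: "b", status: "ready" }],
  known,
);
// Poll 3: nothing changed, so nothing is reported again.
const third = diffPoll(
  [{ id: "a", status: "ready" }, { id: "b", status: "ready" }],
  known,
);
```

A consequence of this design is that papers which were already ready when the page loaded never trigger a notification, which appears to be the intent.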
65
frontend/src/components/shared/CollapsibleSection.tsx
Normal file
@@ -0,0 +1,65 @@
import { useState } from "react";

const schemes = {
  blue: {
    border: "border-blue-200",
    bg: "bg-blue-50",
    text: "text-blue-800",
    icon: "text-blue-500",
  },
  amber: {
    border: "border-amber-200",
    bg: "bg-amber-50",
    text: "text-amber-800",
    icon: "text-amber-500",
  },
  green: {
    border: "border-green-200",
    bg: "bg-green-50",
    text: "text-green-800",
    icon: "text-green-500",
  },
} as const;

export default function CollapsibleSection({
  title,
  colorScheme,
  defaultOpen = false,
  children,
}: {
  title: string;
  colorScheme: keyof typeof schemes;
  defaultOpen?: boolean;
  children: React.ReactNode;
}) {
  const [isOpen, setIsOpen] = useState(defaultOpen);
  const s = schemes[colorScheme];

  return (
    <div className={`rounded-lg border ${s.border} mb-3`}>
      <button
        onClick={() => setIsOpen(!isOpen)}
        className={`w-full flex items-center justify-between p-3 rounded-t-lg ${s.bg} cursor-pointer`}
      >
        <span className={`font-semibold text-sm ${s.text}`}>{title}</span>
        <svg
          className={`w-4 h-4 ${s.icon} transition-transform duration-200 ${isOpen ? "rotate-180" : ""}`}
          fill="none"
          viewBox="0 0 24 24"
          stroke="currentColor"
          strokeWidth={2}
        >
          <path strokeLinecap="round" strokeLinejoin="round" d="M19 9l-7 7-7-7" />
        </svg>
      </button>
      <div
        className="grid transition-[grid-template-rows] duration-300 ease-in-out"
        style={{ gridTemplateRows: isOpen ? "1fr" : "0fr" }}
      >
        <div className="overflow-hidden">
          <div className="p-3">{children}</div>
        </div>
      </div>
    </div>
  );
}
86
frontend/src/components/shared/KaTeXRenderer.tsx
Normal file
@@ -0,0 +1,86 @@
import { useMemo } from "react";
import katex from "katex";

/**
 * Pre-render all LaTeX in an HTML string at the string level,
 * then set innerHTML. This avoids DOM-based auto-render issues
 * where delimiters get split across text nodes or special chars
 * like # cause silent failures.
 */
function renderLatexInString(html: string): string {
  // Strip <code class="latex"> and <pre class="latex"> wrappers
  let s = html
    .replace(/<code[^>]*class="latex"[^>]*>(.*?)<\/code>/gs, "$1")
    .replace(/<pre[^>]*class="latex"[^>]*>(.*?)<\/pre>/gs, "$1");

  // 1) Render display math: $$...$$ and \[...\]
  s = s.replace(/\$\$([\s\S]+?)\$\$/g, (_match, tex: string) => {
    return renderTex(tex.trim(), true);
  });
  s = s.replace(/\\\[([\s\S]+?)\\\]/g, (_match, tex: string) => {
    return renderTex(tex.trim(), true);
  });

  // 2) Render inline math: $...$ and \(...\)
  // Negative lookbehind for \ to avoid matching \$ escapes
  // Also avoid matching $$ (already handled above)
  s = s.replace(/(?<![\\$])\$(?!\$)((?:[^$\\]|\\.)+?)\$/g, (_match, tex: string) => {
    return renderTex(tex, false);
  });
  s = s.replace(/\\\(([\s\S]+?)\\\)/g, (_match, tex: string) => {
    return renderTex(tex, false);
  });

  return s;
}

function decodeHtmlEntities(s: string): string {
  return s
    .replace(/&amp;/g, "&")
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&nbsp;/g, " ");
}

function renderTex(tex: string, displayMode: boolean): string {
  // Decode HTML entities that might appear in DB-sourced HTML
  let cleaned = decodeHtmlEntities(tex);
  // Sanitize common issues that cause KaTeX to fail:
  // 1) # and % inside \text{}: escape them
  cleaned = cleaned.replace(/\\text\{([^}]*)\}/g, (_m, inner: string) => {
    return "\\text{" + inner.replace(/#/g, "\\#").replace(/%/g, "\\%") + "}";
  });
  // 2) Standalone # outside \text{} in math: escape it
  cleaned = cleaned.replace(/(?<!\\)#(?!\\)/g, "\\#");

  try {
    return katex.renderToString(cleaned, {
      displayMode,
      throwOnError: false,
      trust: true,
      strict: false,
    });
  } catch {
    // Fallback: show the raw LaTeX in a styled span
    return `<span class="katex-error" style="color:#E11D48;font-size:0.85em">${tex}</span>`;
  }
}

export default function KaTeXRenderer({
  html,
  className,
}: {
  html: string;
  className?: string;
}) {
  const rendered = useMemo(() => renderLatexInString(html), [html]);

  return (
    <div
      className={`kb-html-content text-sm ${className ?? ""}`}
      dangerouslySetInnerHTML={{ __html: rendered }}
    />
  );
}
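Note on KaTeXRenderer.tsx: `decodeHtmlEntities` handles the ampersand entity first, which means doubly escaped input can be decoded twice. The sketch below shows one possible alternative ordering that decodes the ampersand entity last. `decodeEntitiesOnce` is a hypothetical name, and whether double decoding matters depends on how the stored HTML was escaped, which this commit does not show.

```typescript
/**
 * Hypothetical variant of the entity decoder: same replacements, but the
 * ampersand entity is handled last so each entity is decoded at most once.
 */
function decodeEntitiesOnce(s: string): string {
  return s
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&nbsp;/g, " ")
    // Last on purpose: "&amp;lt;" becomes the literal text "&lt;", not "<".
    .replace(/&amp;/g, "&");
}

const plain = decodeEntitiesOnce("a &lt; b &amp;&amp; c");
const doublyEscaped = decodeEntitiesOnce("&amp;lt;");
```

Here `plain` is `a < b && c`, and `doublyEscaped` stays as the five-character text `&lt;` instead of collapsing to `<`.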
15
frontend/src/components/shared/StatusBadge.tsx
Normal file
@@ -0,0 +1,15 @@
const statusConfig = {
  uploaded: { label: "Uploaded", bg: "bg-gray-100", text: "text-gray-600" },
  processing: { label: "Processing...", bg: "bg-blue-100", text: "text-blue-700" },
  ready: { label: "Ready", bg: "bg-green-100", text: "text-green-700" },
  error: { label: "Error", bg: "bg-red-100", text: "text-red-700" },
} as const;

export default function StatusBadge({ status }: { status: string }) {
  const config = statusConfig[status as keyof typeof statusConfig] ?? statusConfig.uploaded;
  return (
    <span className={`inline-block px-2 py-0.5 rounded-full text-xs font-medium ${config.bg} ${config.text}`}>
      {config.label}
    </span>
  );
}
63
frontend/src/components/upload/FilePickerField.tsx
Normal file
@@ -0,0 +1,63 @@
import { useRef, useState } from "react";

export default function FilePickerField({
  label,
  required,
  file,
  onFileChange,
}: {
  label: string;
  required?: boolean;
  file: File | null;
  onFileChange: (file: File | null) => void;
}) {
  const inputRef = useRef<HTMLInputElement>(null);
  const [isDragging, setIsDragging] = useState(false);

  const handleDrop = (e: React.DragEvent) => {
    e.preventDefault();
    setIsDragging(false);
    const f = e.dataTransfer.files[0];
    if (f?.type === "application/pdf") onFileChange(f);
  };

  return (
    <div>
      <label className="block text-sm font-medium text-gray-700 mb-1">
        {label} {required && <span className="text-red-500">*</span>}
      </label>
      <div
        className={`border-2 border-dashed rounded-lg p-6 text-center cursor-pointer transition-colors
          ${isDragging ? "border-blue-400 bg-blue-50" : "border-gray-300 hover:border-gray-400"}`}
        onClick={() => inputRef.current?.click()}
        onDragOver={(e) => { e.preventDefault(); setIsDragging(true); }}
        onDragLeave={() => setIsDragging(false)}
        onDrop={handleDrop}
      >
        <input
          ref={inputRef}
          type="file"
          accept=".pdf"
          className="hidden"
          onChange={(e) => onFileChange(e.target.files?.[0] ?? null)}
        />
        {file ? (
          <div className="flex items-center justify-center gap-2">
            <span className="text-blue-600 font-medium text-sm">{file.name}</span>
            <button
              type="button"
              onClick={(e) => { e.stopPropagation(); onFileChange(null); }}
              className="text-gray-400 hover:text-red-500 text-xs"
            >
              Remove
            </button>
          </div>
        ) : (
          <div className="text-gray-400 text-sm">
            Click or drag PDF file here
          </div>
        )}
      </div>
    </div>
  );
}
184
frontend/src/components/upload/UploadForm.tsx
Normal file
@@ -0,0 +1,184 @@
import { useState, useCallback } from "react";
import { useNavigate } from "react-router-dom";
import { uploadPaper } from "@/lib/api";
import FilePickerField from "./FilePickerField";

/** Try to extract course code, year, term, exam type from filename */
function parseFilename(name: string): {
  courseCode?: string;
  year?: number;
  term?: string;
  examType?: string;
} {
  const result: ReturnType<typeof parseFilename> = {};

  // Remove extension and treat underscores/hyphens as spaces
  const base = name.replace(/\.[^.]+$/, "").replace(/[_\-]+/g, " ");

  // Course code: 2-4 letters (any case) + 4 digits + optional letter (e.g. COMP2211, MATH1014H)
  const courseMatch = base.match(/([A-Za-z]{2,4}\s*\d{4}[A-Za-z]?)/i);
  if (courseMatch) {
    result.courseCode = courseMatch[1].replace(/\s/g, "").toUpperCase();
  }

  // Year: 4-digit (2010-2029) or 2-digit (15-29)
  const year4 = base.match(/\b(20[1-2]\d)\b/);
  if (year4) {
    result.year = Number(year4[1]);
  } else {
    const year2 = base.match(/\b(\d{2})\b/);
    if (year2) {
      const y = Number(year2[1]);
      if (y >= 15 && y <= 29) result.year = 2000 + y;
    }
  }

  // Term
  const lower = base.toLowerCase();
  if (/spring|spr/i.test(lower)) result.term = "spring";
  else if (/fall|aut/i.test(lower)) result.term = "fall";
  else if (/summer|sum/i.test(lower)) result.term = "summer";

  // Exam type
  if (/mid/i.test(lower)) result.examType = "midterm";
  else if (/final|fin/i.test(lower)) result.examType = "final";
  else if (/quiz/i.test(lower)) result.examType = "quiz";

  return result;
}

export default function UploadForm() {
  const navigate = useNavigate();
  const [paperFile, setPaperFile] = useState<File | null>(null);
  const [answerFile, setAnswerFile] = useState<File | null>(null);
  const [courseCode, setCourseCode] = useState("");
  const [year, setYear] = useState(new Date().getFullYear());
  const [term, setTerm] = useState("fall");
  const [examType, setExamType] = useState("midterm");
  const [submitting, setSubmitting] = useState(false);
  const [error, setError] = useState<string | null>(null);
  const [autoFilled, setAutoFilled] = useState(false);

  const handlePaperFile = useCallback((file: File | null) => {
    setPaperFile(file);
    if (!file) { setAutoFilled(false); return; }

    const parsed = parseFilename(file.name);
    const filled: string[] = [];

    if (parsed.courseCode) { setCourseCode(parsed.courseCode); filled.push("course"); }
    if (parsed.year) { setYear(parsed.year); filled.push("year"); }
    if (parsed.term) { setTerm(parsed.term); filled.push("term"); }
    if (parsed.examType) { setExamType(parsed.examType); filled.push("type"); }

    setAutoFilled(filled.length > 0);
  }, []);

  const handleSubmit = async (e: React.FormEvent) => {
    e.preventDefault();
    if (!paperFile || !courseCode) return;

    setSubmitting(true);
    setError(null);

    try {
      const fd = new FormData();
      fd.append("paper_file", paperFile);
      if (answerFile) fd.append("answer_file", answerFile);
      fd.append("course_code", courseCode);
      fd.append("year", String(year));
      fd.append("term", term);
      fd.append("exam_type", examType);

      const result = await uploadPaper(fd);
      navigate(`/paper/${result.paper_id}`);
    } catch (err) {
      setError(err instanceof Error ? err.message : "Upload failed");
      setSubmitting(false);
    }
  };

  return (
    <form onSubmit={handleSubmit} className="max-w-lg mx-auto space-y-5">
      <FilePickerField
        label="Paper PDF"
        required
        file={paperFile}
        onFileChange={handlePaperFile}
      />
      {autoFilled && (
        <div className="text-xs text-green-600 bg-green-50 px-3 py-1.5 rounded-lg -mt-3">
          Auto-filled from filename — please verify below
        </div>
      )}
      <FilePickerField
        label="Answer / Solution PDF (optional)"
        file={answerFile}
        onFileChange={setAnswerFile}
      />

      <div>
        <label className="block text-sm font-medium text-gray-700 mb-1">
          Course Code <span className="text-red-500">*</span>
        </label>
        <input
          type="text"
          value={courseCode}
          onChange={(e) => setCourseCode(e.target.value.toUpperCase())}
          placeholder="e.g. COMP2011"
          className="w-full border border-gray-300 rounded-lg px-3 py-2 text-sm focus:outline-none focus:ring-2 focus:ring-blue-500"
          required
        />
      </div>

      <div className="grid grid-cols-3 gap-3">
        <div>
          <label className="block text-sm font-medium text-gray-700 mb-1">Year</label>
          <input
            type="number"
            value={year}
            onChange={(e) => setYear(Number(e.target.value))}
            className="w-full border border-gray-300 rounded-lg px-3 py-2 text-sm focus:outline-none focus:ring-2 focus:ring-blue-500"
          />
        </div>
        <div>
          <label className="block text-sm font-medium text-gray-700 mb-1">Term</label>
          <select
            value={term}
            onChange={(e) => setTerm(e.target.value)}
            className="w-full border border-gray-300 rounded-lg px-3 py-2 text-sm focus:outline-none focus:ring-2 focus:ring-blue-500"
          >
            <option value="fall">Fall</option>
            <option value="spring">Spring</option>
            <option value="summer">Summer</option>
          </select>
        </div>
        <div>
          <label className="block text-sm font-medium text-gray-700 mb-1">Exam Type</label>
          <select
            value={examType}
            onChange={(e) => setExamType(e.target.value)}
            className="w-full border border-gray-300 rounded-lg px-3 py-2 text-sm focus:outline-none focus:ring-2 focus:ring-blue-500"
          >
            <option value="midterm">Midterm</option>
            <option value="final">Final</option>
            <option value="quiz">Quiz</option>
          </select>
        </div>
      </div>

      {error && (
        <div className="text-red-600 text-sm bg-red-50 p-3 rounded-lg">{error}</div>
      )}

      <button
        type="submit"
        disabled={!paperFile || !courseCode || submitting}
        className="w-full bg-blue-600 text-white py-2.5 rounded-lg font-medium text-sm
          hover:bg-blue-700 disabled:opacity-50 disabled:cursor-not-allowed transition-colors"
      >
        {submitting ? "Uploading..." : "Upload & Analyze"}
      </button>
    </form>
  );
}
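Note on UploadForm.tsx: in `parseFilename`, the 4-digit year pattern runs over the whole filename, so a name with a space-separated course number such as `COMP 2011 midterm` would be read as the year 2011. A minimal sketch of one way around this is to remove the matched course-code text before looking for a year. `extractYear` and its `courseText` parameter are hypothetical and assume the caller passes the raw substring matched by the course-code regex.

```typescript
/**
 * Hypothetical helper: find the exam year in `base` while ignoring the
 * substring already recognised as the course code.
 */
function extractYear(base: string, courseText?: string): number | undefined {
  // Blank out the first occurrence of the course code, if one was found.
  const rest = courseText ? base.replace(courseText, " ") : base;

  // Same two patterns as parseFilename: 4-digit 2010-2029, then 2-digit 15-29.
  const year4 = rest.match(/\b(20[1-2]\d)\b/);
  if (year4) return Number(year4[1]);

  const year2 = rest.match(/\b(\d{2})\b/);
  if (year2) {
    const y = Number(year2[1]);
    if (y >= 15 && y <= 29) return 2000 + y;
  }
  return undefined;
}

const withBoth = extractYear("COMP 2011 2023 fall midterm", "COMP 2011");
const twoDigit = extractYear("COMP2211 22 spring final", "COMP2211");
const noYear = extractYear("COMP 2011 midterm", "COMP 2011");
```

With these inputs `withBoth` is 2023, `twoDigit` is 2022, and `noYear` is `undefined`, so the form would keep its default year instead of auto-filling 2011.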
58
frontend/src/components/workbench/ActionBar.tsx
Normal file
@@ -0,0 +1,58 @@
import type { Question } from "@/types/api";

export default function ActionBar({
  question,
  onGenerateVariant,
  isGenerating,
  onPhotoOpen,
  answerState,
}: {
  question: Question | null;
  onGenerateVariant: () => void;
  isGenerating: boolean;
  onPhotoOpen: () => void;
  answerState?: "correct" | "wrong" | null;
}) {
  if (!question) return null;

  const isLong = question.question_type === "long_question" || question.question_type === "long_answer" || question.question_type === "coding";

  return (
    <div className="border-t border-gray-200 bg-white px-4 py-3 shrink-0 space-y-2">
      {/* Answer state feedback (for non-long questions, driven by QuestionDetail) */}
      {answerState && (
        <div className={`text-center text-sm font-medium py-1.5 rounded-lg ${
          answerState === "correct"
            ? "bg-green-50 text-green-600"
            : "bg-red-50 text-red-600"
        }`}>
          {answerState === "correct" ? "Correct!" : "Added to error book"}
        </div>
      )}

      {/* Long question: Upload handwritten answer */}
      {isLong && (
        <button
          onClick={onPhotoOpen}
          className="w-full py-2.5 rounded-lg text-sm font-medium bg-blue-600 text-white hover:bg-blue-700 transition-colors"
        >
          Upload handwritten answer
        </button>
      )}

      {/* Generate variant: always available */}
      <button
        onClick={onGenerateVariant}
        disabled={isGenerating}
        className="w-full py-2 rounded-lg text-sm font-medium bg-purple-50 text-purple-700 border border-purple-200 hover:bg-purple-100 disabled:opacity-50 transition-colors"
      >
        {isGenerating ? (
          <span className="flex items-center justify-center gap-2">
            <span className="w-3 h-3 border-2 border-purple-600 border-t-transparent rounded-full animate-spin" />
            Generating...
          </span>
        ) : "Generate Variant"}
      </button>
    </div>
  );
}
21
frontend/src/components/workbench/AiTrioPanel.tsx
Normal file
@@ -0,0 +1,21 @@
import type { Question } from "@/types/api";
import CollapsibleSection from "@/components/shared/CollapsibleSection";
import KaTeXRenderer from "@/components/shared/KaTeXRenderer";

export default function AiTrioPanel({ question }: { question: Question }) {
  return (
    <div>
      <CollapsibleSection title="Knowledge Reminder" colorScheme="blue" defaultOpen>
        <KaTeXRenderer html={question.knowledge_reminder} />
      </CollapsibleSection>

      <CollapsibleSection title="AI Hint" colorScheme="amber">
        <KaTeXRenderer html={question.ai_hint} />
      </CollapsibleSection>

      <CollapsibleSection title="Solution" colorScheme="green">
        <KaTeXRenderer html={question.solution} />
      </CollapsibleSection>
    </div>
  );
}
170
frontend/src/components/workbench/PdfViewer.tsx
Normal file
@@ -0,0 +1,170 @@
|
||||
import { useState, useRef, useEffect, useCallback } from "react";
|
||||
import { Document, Page, pdfjs } from "react-pdf";
|
||||
import "react-pdf/dist/Page/AnnotationLayer.css";
|
||||
import "react-pdf/dist/Page/TextLayer.css";
|
||||
|
||||
pdfjs.GlobalWorkerOptions.workerSrc = `https://unpkg.com/pdfjs-dist@${pdfjs.version}/build/pdf.worker.min.mjs`;
|
||||
|
||||
export default function PdfViewer({
|
||||
fileUrl,
|
||||
currentPage,
|
||||
onPageChange,
|
||||
}: {
|
||||
fileUrl: string;
|
||||
currentPage?: number;
|
||||
onPageChange?: (page: number) => void;
|
||||
}) {
|
||||
const [numPages, setNumPages] = useState(0);
|
||||
const [containerWidth, setContainerWidth] = useState(0);
|
||||
const containerRef = useRef<HTMLDivElement>(null);
|
||||
const scrollRef = useRef<HTMLDivElement>(null);
|
||||
const pageRefs = useRef<Map<number, HTMLDivElement>>(new Map());
|
||||
const [jumpPage, setJumpPage] = useState("");
|
||||
const programmaticScroll = useRef(false);
|
||||
|
||||
// Resize observer for container width
|
||||
useEffect(() => {
|
||||
if (!containerRef.current) return;
|
||||
const observer = new ResizeObserver((entries) => {
|
||||
setContainerWidth(entries[0].contentRect.width);
|
||||
});
|
||||
observer.observe(containerRef.current);
|
||||
return () => observer.disconnect();
|
||||
}, []);
|
||||
|
||||
// Scroll to page when currentPage changes (programmatic)
|
||||
useEffect(() => {
|
||||
if (!currentPage || currentPage < 1) return;
|
||||
const el = pageRefs.current.get(currentPage);
|
||||
if (el) {
|
||||
programmaticScroll.current = true;
|
||||
el.scrollIntoView({ behavior: "smooth", block: "start" });
|
||||
setTimeout(() => { programmaticScroll.current = false; }, 2000);
|
||||
}
|
||||
}, [currentPage]);
|
||||
|
||||
// IntersectionObserver to detect visible page on user scroll
|
||||
useEffect(() => {
|
||||
if (numPages === 0 || !scrollRef.current) return;
|
||||
|
||||
const visiblePages = new Map<number, number>();
|
||||
|
||||
const observer = new IntersectionObserver(
|
||||
(entries) => {
|
||||
for (const entry of entries) {
|
||||
const pageNum = Number(entry.target.getAttribute("data-page"));
|
||||
if (entry.isIntersecting) {
|
||||
visiblePages.set(pageNum, entry.intersectionRatio);
|
||||
} else {
|
||||
visiblePages.delete(pageNum);
|
||||
}
|
||||
}
|
||||
|
||||
// Don't fire callback during programmatic scroll
|
||||
if (programmaticScroll.current) return;
|
||||
|
||||
// Find the page with the highest visibility ratio
|
||||
let bestPage = 0;
|
||||
let bestRatio = 0;
|
||||
for (const [page, ratio] of visiblePages) {
|
||||
if (ratio > bestRatio) {
|
||||
bestRatio = ratio;
|
||||
bestPage = page;
|
||||
}
|
||||
}
|
||||
if (bestPage > 0) {
|
||||
onPageChange?.(bestPage);
|
||||
}
|
||||
},
|
||||
{
|
||||
root: scrollRef.current,
|
||||
threshold: [0, 0.25, 0.5, 0.75, 1],
|
||||
},
|
||||
);
|
||||
|
||||
for (const [, el] of pageRefs.current) {
|
||||
observer.observe(el);
|
||||
}
|
||||
|
||||
return () => observer.disconnect();
|
||||
}, [numPages, onPageChange]);
|
||||
|
||||
const setPageRef = useCallback((pageNum: number, el: HTMLDivElement | null) => {
|
||||
if (el) {
|
||||
el.setAttribute("data-page", String(pageNum));
|
||||
pageRefs.current.set(pageNum, el);
|
||||
} else {
|
||||
pageRefs.current.delete(pageNum);
|
||||
}
|
||||
}, []);
|
||||
|
||||
const handleJump = () => {
|
||||
const p = parseInt(jumpPage, 10);
|
||||
if (p >= 1 && p <= numPages) {
|
||||
const el = pageRefs.current.get(p);
|
||||
el?.scrollIntoView({ behavior: "smooth", block: "start" });
|
||||
}
|
||||
setJumpPage("");
|
||||
};
|
||||
|
||||
return (
|
||||
<div ref={containerRef} className="h-full flex flex-col bg-gray-100">
|
||||
{/* Page controls */}
|
||||
<div className="flex items-center justify-center gap-3 py-2 bg-white border-b border-gray-200 text-sm shrink-0">
|
||||
<span className="text-gray-600">
|
||||
{numPages} pages
|
||||
</span>
|
||||
<span className="text-gray-300">|</span>
|
||||
<span className="text-gray-600">
|
||||
Go to{" "}
|
||||
<input
|
||||
type="number"
|
||||
value={jumpPage}
|
||||
onChange={(e) => setJumpPage(e.target.value)}
|
||||
onKeyDown={(e) => { if (e.key === "Enter") handleJump(); }}
|
||||
placeholder="#"
|
||||
className="w-12 text-center border border-gray-300 rounded px-1 py-0.5 text-sm"
|
||||
min={1}
|
||||
max={numPages}
|
||||
/>
|
||||
</span>
|
||||
</div>
|
||||
|
||||
{/* All pages scrollable */}
|
||||
<div ref={scrollRef} className="flex-1 overflow-auto">
|
||||
<Document
|
||||
file={fileUrl}
|
||||
onLoadSuccess={({ numPages: n }) => setNumPages(n)}
|
||||
loading={
|
||||
<div className="flex items-center justify-center h-64 text-gray-400">
|
||||
Loading PDF...
|
||||
</div>
|
||||
}
|
||||
error={
|
||||
<div className="flex items-center justify-center h-64 text-red-400">
|
||||
Failed to load PDF
|
||||
</div>
|
||||
}
|
||||
>
|
||||
{numPages > 0 &&
|
||||
Array.from({ length: numPages }, (_, i) => i + 1).map((pageNum) => (
|
||||
<div
|
||||
key={pageNum}
|
||||
ref={(el) => setPageRef(pageNum, el)}
|
||||
className="flex justify-center mb-2"
|
||||
>
|
||||
<div className="bg-white shadow-sm">
|
||||
<Page
|
||||
pageNumber={pageNum}
|
||||
width={containerWidth > 0 ? containerWidth - 48 : undefined}
|
||||
renderAnnotationLayer
|
||||
renderTextLayer
|
||||
/>
|
||||
</div>
|
||||
</div>
|
||||
))}
|
||||
</Document>
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
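The observer callback above reports whichever page currently has the largest intersection ratio. Below is a minimal sketch of that selection step as a standalone function, assuming the same map of page number to ratio. `pickMostVisiblePage` is a hypothetical name used for illustration and does not appear in the component.

```typescript
// Hypothetical helper: mirrors the "highest visibility ratio" loop in the
// IntersectionObserver callback. Returns 0 when no page is visible.
function pickMostVisiblePage(visiblePages: Map<number, number>): number {
  let bestPage = 0;
  let bestRatio = 0;
  visiblePages.forEach((ratio, page) => {
    // Strict comparison: on a tie, the page inserted first wins,
    // and a page with ratio 0 is never selected.
    if (ratio > bestRatio) {
      bestRatio = ratio;
      bestPage = page;
    }
  });
  return bestPage;
}
```

Pulling the loop out like this would make the tie-breaking and the "nothing visible" case easy to check without a DOM. The component itself still has to skip the callback while a programmatic scroll is in progress.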
|
||||
90
frontend/src/components/workbench/PhotoUpload.tsx
Normal file
@@ -0,0 +1,90 @@
|
||||
import { useState, useRef } from "react";
|
||||
import { uploadPhoto } from "@/lib/api";
|
||||
import type { UserAttempt } from "@/types/api";
|
||||
|
||||
export default function PhotoUpload({
|
||||
questionId,
|
||||
onClose,
|
||||
onSubmitted,
|
||||
}: {
|
||||
questionId: string;
|
||||
onClose: () => void;
|
||||
onSubmitted: (promise: Promise<{ attempt: UserAttempt; ocr_text: string; grade: { is_correct: boolean; score_given?: number; feedback: string } }>) => void;
|
||||
}) {
|
||||
const [file, setFile] = useState<File | null>(null);
|
||||
const [preview, setPreview] = useState<string | null>(null);
|
||||
const [submitting, setSubmitting] = useState(false);
|
||||
const [error, setError] = useState<string | null>(null);
|
||||
const inputRef = useRef<HTMLInputElement>(null);
|
||||
|
||||
const handleFile = (f: File) => {
  // Release the previous preview's object URL, if any, so it is not leaked
  if (preview) URL.revokeObjectURL(preview);
  setFile(f);
  setPreview(URL.createObjectURL(f));
  setError(null);
};
|
||||
|
||||
const handleSubmit = () => {
|
||||
if (!file || submitting) return;
|
||||
setSubmitting(true);
|
||||
const promise = uploadPhoto(questionId, file);
|
||||
// Close modal immediately, let parent handle the async result
|
||||
onSubmitted(promise);
|
||||
onClose();
|
||||
};
|
||||
|
||||
return (
|
||||
<div className="fixed inset-0 bg-black/40 flex items-center justify-center z-50 p-4">
|
||||
<div className="bg-white rounded-xl shadow-xl max-w-lg w-full max-h-[90vh] overflow-y-auto">
|
||||
<div className="p-5">
|
||||
<div className="flex items-center justify-between mb-4">
|
||||
<h3 className="text-lg font-semibold text-gray-900">Upload Answer Photo</h3>
|
||||
<button onClick={onClose} className="text-gray-400 hover:text-gray-600 text-xl">×</button>
|
||||
</div>
|
||||
|
||||
{!preview ? (
|
||||
<div
|
||||
onClick={() => inputRef.current?.click()}
|
||||
className="border-2 border-dashed border-gray-300 rounded-lg p-8 text-center cursor-pointer hover:border-blue-400 transition-colors"
|
||||
>
|
||||
<div className="text-3xl mb-2">📷</div>
|
||||
<p className="text-sm text-gray-600">Click to take photo or select image</p>
|
||||
<input
|
||||
ref={inputRef}
|
||||
type="file"
|
||||
accept="image/*"
|
||||
capture="environment"
|
||||
className="hidden"
|
||||
onChange={(e) => {
|
||||
const f = e.target.files?.[0];
|
||||
if (f) handleFile(f);
|
||||
}}
|
||||
/>
|
||||
</div>
|
||||
) : (
|
||||
<div className="space-y-3">
|
||||
<img src={preview} alt="Preview" className="w-full rounded-lg border" />
|
||||
{error && (
|
||||
<div className="text-sm text-red-600 bg-red-50 rounded-lg p-3">{error}</div>
|
||||
)}
|
||||
<div className="flex gap-2">
|
||||
<button
|
||||
onClick={() => { if (preview) URL.revokeObjectURL(preview); setFile(null); setPreview(null); }}
|
||||
className="flex-1 py-2 rounded-lg text-sm border border-gray-200 text-gray-600 hover:bg-gray-50"
|
||||
>
|
||||
Retake
|
||||
</button>
|
||||
<button
|
||||
onClick={handleSubmit}
|
||||
disabled={submitting}
|
||||
className="flex-1 py-2 rounded-lg text-sm bg-blue-600 text-white font-medium hover:bg-blue-700 disabled:opacity-50"
|
||||
>
|
||||
Submit for Grading
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
260
frontend/src/components/workbench/QuestionDetail.tsx
Normal file
@@ -0,0 +1,260 @@
|
||||
import { useState, useEffect } from "react";
import type { Question } from "@/types/api";
import { subquestionLabel } from "@/lib/questionGroups";

const typeLabels: Record<string, string> = {
  mc: "Multiple Choice",
  true_false: "True / False",
  fill_blank: "Fill in Blank",
  long_question: "Long Question",
  long_answer: "Long Answer",
  short_answer: "Short Answer",
  coding: "Coding",
};

const difficultyColors: Record<string, string> = {
  easy: "bg-green-100 text-green-700",
  medium: "bg-yellow-100 text-yellow-700",
  hard: "bg-red-100 text-red-700",
};
|
||||
|
||||
export default function QuestionDetail({
|
||||
question,
|
||||
onAnswerResult,
|
||||
}: {
|
||||
question: Question;
|
||||
onAnswerResult?: (isCorrect: boolean, userAnswer: string) => void;
|
||||
}) {
|
||||
const [selectedOption, setSelectedOption] = useState<string | null>(null);
|
||||
const [checked, setChecked] = useState(false);
|
||||
const [fillAnswer, setFillAnswer] = useState("");
|
||||
const [fillChecked, setFillChecked] = useState(false);
|
||||
// True/False: the single selected answer ("True" or "False"), or null if nothing is selected yet
|
||||
const [tfAnswer, setTfAnswer] = useState<"True" | "False" | null>(null);
|
||||
const [tfChecked, setTfChecked] = useState(false);
|
||||
|
||||
// Reset state when question changes
|
||||
useEffect(() => {
|
||||
setSelectedOption(null);
|
||||
setChecked(false);
|
||||
setFillAnswer("");
|
||||
setFillChecked(false);
|
||||
setTfAnswer(null);
|
||||
setTfChecked(false);
|
||||
}, [question.id]);
|
||||
|
||||
const isCorrectMc = checked && selectedOption === question.correct_option;
|
||||
const isCorrectFill =
|
||||
fillChecked &&
|
||||
question.correct_answer != null &&
|
||||
fillAnswer.trim().toLowerCase() === question.correct_answer.trim().toLowerCase();
|
||||
|
||||
const handleMcCheck = () => {
|
||||
if (!selectedOption) return;
|
||||
setChecked(true);
|
||||
const correct = selectedOption === question.correct_option;
|
||||
onAnswerResult?.(correct, selectedOption);
|
||||
};
|
||||
|
||||
const handleFillCheck = () => {
|
||||
if (!fillAnswer.trim()) return;
|
||||
setFillChecked(true);
|
||||
const correct =
|
||||
question.correct_answer != null &&
|
||||
fillAnswer.trim().toLowerCase() === question.correct_answer.trim().toLowerCase();
|
||||
onAnswerResult?.(correct, fillAnswer.trim());
|
||||
};
|
||||
|
||||
const getOptionStyle = (label: string) => {
|
||||
if (!checked) {
|
||||
return label === selectedOption
|
||||
? "border-blue-400 bg-blue-50"
|
||||
: "border-gray-200 hover:bg-gray-50";
|
||||
}
|
||||
if (label === question.correct_option) return "border-green-400 bg-green-50";
|
||||
if (label === selectedOption) return "border-red-400 bg-red-50";
|
||||
return "border-gray-200 opacity-50";
|
||||
};
|
||||
|
||||
return (
|
||||
<div className="mb-4">
|
||||
{/* Header row */}
|
||||
<div className="flex items-center gap-2 mb-2 flex-wrap">
|
||||
<span className="text-base font-bold text-gray-900">
|
||||
Q{question.question_number.match(/^\d+/)?.[0] ?? question.question_number}
|
||||
</span>
|
||||
{question.question_number.replace(/^\d+/, "") && (
|
||||
<span className="text-xs px-2 py-0.5 rounded bg-gray-100 text-gray-600">
|
||||
{subquestionLabel(question)}
|
||||
</span>
|
||||
)}
|
||||
<span className="text-xs px-2 py-0.5 rounded bg-blue-100 text-blue-700">
|
||||
{typeLabels[question.question_type] ?? question.question_type}
|
||||
</span>
|
||||
{question.score != null && (
|
||||
<span className="text-xs text-gray-500">{question.score} pts</span>
|
||||
)}
|
||||
{question.difficulty && (
|
||||
<span
|
||||
className={`text-xs px-2 py-0.5 rounded ${difficultyColors[question.difficulty] ?? ""}`}
|
||||
>
|
||||
{question.difficulty}
|
||||
</span>
|
||||
)}
|
||||
</div>
|
||||
|
||||
{/* Topics */}
|
||||
{question.topics && question.topics.length > 0 && (
|
||||
<div className="flex gap-1 mb-3 flex-wrap">
|
||||
{question.topics.map((t) => (
|
||||
<span key={t} className="text-xs bg-gray-100 text-gray-600 px-2 py-0.5 rounded-full">
|
||||
{t}
|
||||
</span>
|
||||
))}
|
||||
</div>
|
||||
)}
|
||||
|
||||
{/* MC options */}
|
||||
{question.question_type === "mc" && question.options && (
|
||||
<>
|
||||
<div className="mt-3 space-y-1.5">
|
||||
{question.options.map((opt) => (
|
||||
<button
|
||||
key={opt.label}
|
||||
onClick={() => { if (!checked) setSelectedOption(opt.label); }}
|
||||
className={`w-full flex items-start gap-2 p-2 rounded-lg border text-sm text-left transition-colors ${getOptionStyle(opt.label)}`}
|
||||
disabled={checked}
|
||||
>
|
||||
<span className={`font-semibold shrink-0 w-6 ${
|
||||
checked && opt.label === question.correct_option ? "text-green-600" :
|
||||
checked && opt.label === selectedOption ? "text-red-600" :
|
||||
"text-blue-600"
|
||||
}`}>
|
||||
{opt.label}.
|
||||
</span>
|
||||
<span className="text-gray-700">{opt.text}</span>
|
||||
{checked && opt.label === question.correct_option && (
|
||||
<span className="ml-auto text-green-600 text-xs font-medium shrink-0">Correct</span>
|
||||
)}
|
||||
</button>
|
||||
))}
|
||||
</div>
|
||||
{!checked && selectedOption && (
|
||||
<button
|
||||
onClick={handleMcCheck}
|
||||
className="mt-2 px-4 py-1.5 bg-blue-600 text-white rounded-lg text-sm font-medium hover:bg-blue-700 transition-colors"
|
||||
>
|
||||
Check Answer
|
||||
</button>
|
||||
)}
|
||||
{checked && (
|
||||
<div className={`mt-2 text-sm font-medium ${isCorrectMc ? "text-green-600" : "text-red-600"}`}>
|
||||
{isCorrectMc ? "Correct!" : `Wrong — the answer is ${question.correct_option}`}
|
||||
</div>
|
||||
)}
|
||||
</>
|
||||
)}
|
||||
|
||||
{/* True/False */}
|
||||
{question.question_type === "true_false" && (() => {
|
||||
// Normalize T/F/True/False to "true"/"false"
|
||||
const normTF = (v: string | null | undefined): string => {
|
||||
if (!v) return "";
|
||||
const l = v.trim().toLowerCase();
|
||||
if (l === "t" || l === "true") return "true";
|
||||
if (l === "f" || l === "false") return "false";
|
||||
return l;
|
||||
};
|
||||
const correctNorm = normTF(question.correct_option ?? question.correct_answer);
|
||||
const correctDisplay = correctNorm === "true" ? "True" : correctNorm === "false" ? "False" : "N/A";
|
||||
return (
|
||||
<>
|
||||
<div className="mt-3 flex gap-2">
|
||||
{(["True", "False"] as const).map((val) => {
|
||||
const isSelected = tfAnswer === val;
|
||||
const isCorrectVal = tfChecked && normTF(val) === correctNorm;
|
||||
const isWrongVal = tfChecked && isSelected && !isCorrectVal;
|
||||
return (
|
||||
<button
|
||||
key={val}
|
||||
onClick={() => { if (!tfChecked) setTfAnswer(val); }}
|
||||
disabled={tfChecked}
|
||||
className={`flex-1 py-2 rounded-lg border text-sm font-semibold transition-colors ${
|
||||
isCorrectVal
|
||||
? "border-green-400 bg-green-50 text-green-700"
|
||||
: isWrongVal
|
||||
? "border-red-400 bg-red-50 text-red-700"
|
||||
: isSelected
|
||||
? "border-blue-400 bg-blue-50 text-blue-700"
|
||||
: "border-gray-200 text-gray-600 hover:bg-gray-50"
|
||||
}`}
|
||||
>
|
||||
{val === "True" ? "T — True" : "F — False"}
|
||||
</button>
|
||||
);
|
||||
})}
|
||||
</div>
|
||||
{!tfChecked && tfAnswer && (
|
||||
<button
|
||||
onClick={() => {
|
||||
setTfChecked(true);
|
||||
const isCorrect = normTF(tfAnswer) === correctNorm;
|
||||
onAnswerResult?.(isCorrect, tfAnswer);
|
||||
}}
|
||||
className="mt-2 px-4 py-1.5 bg-blue-600 text-white rounded-lg text-sm font-medium hover:bg-blue-700 transition-colors"
|
||||
>
|
||||
Check Answer
|
||||
</button>
|
||||
)}
|
||||
{tfChecked && (
|
||||
<div className={`mt-2 text-sm font-medium ${
|
||||
normTF(tfAnswer) === correctNorm ? "text-green-600" : "text-red-600"
|
||||
}`}>
|
||||
{normTF(tfAnswer) === correctNorm
|
||||
? "Correct!"
|
||||
: `Wrong — the answer is ${correctDisplay}`}
|
||||
</div>
|
||||
)}
|
||||
</>
|
||||
);
|
||||
})()}
|
||||
|
||||
{/* Fill-blank input */}
|
||||
{question.question_type === "fill_blank" && (
|
||||
<div className="mt-3">
|
||||
<div className="flex gap-2">
|
||||
<input
|
||||
type="text"
|
||||
value={fillAnswer}
|
||||
onChange={(e) => { if (!fillChecked) setFillAnswer(e.target.value); }}
|
||||
placeholder="Type your answer..."
|
||||
disabled={fillChecked}
|
||||
className={`flex-1 border rounded-lg px-3 py-2 text-sm focus:outline-none focus:ring-2 focus:ring-blue-500 ${
|
||||
fillChecked
|
||||
? isCorrectFill ? "border-green-400 bg-green-50" : "border-red-400 bg-red-50"
|
||||
: "border-gray-300"
|
||||
}`}
|
||||
onKeyDown={(e) => { if (e.key === "Enter") handleFillCheck(); }}
|
||||
/>
|
||||
{!fillChecked && (
|
||||
<button
|
||||
onClick={handleFillCheck}
|
||||
disabled={!fillAnswer.trim()}
|
||||
className="px-4 py-2 bg-blue-600 text-white rounded-lg text-sm font-medium hover:bg-blue-700 disabled:opacity-50 transition-colors"
|
||||
>
|
||||
Check
|
||||
</button>
|
||||
)}
|
||||
</div>
|
||||
{fillChecked && (
|
||||
<div className={`mt-2 text-sm font-medium ${isCorrectFill ? "text-green-600" : "text-red-600"}`}>
|
||||
{isCorrectFill
|
||||
? "Correct!"
|
||||
: `Wrong — the answer is: ${question.correct_answer ?? "N/A"}`}
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
);
|
||||
}
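The answer checks in `QuestionDetail` come down to two small string comparisons: normalizing T/F/True/False, and a case-insensitive, whitespace-trimmed match for fill-in-the-blank. Below is a minimal sketch of both as pure functions, which could be tested outside the component. The names `normalizeTrueFalse` and `isFillAnswerCorrect` are hypothetical; the component inlines this logic instead.

```typescript
// Hypothetical helper: same rules as the inline normTF in the component.
// "T"/"true" (any case, surrounding whitespace ignored) -> "true",
// "F"/"false" -> "false", missing value -> "", anything else -> lowercased input.
function normalizeTrueFalse(value: string | null | undefined): string {
  if (!value) return "";
  const lowered = value.trim().toLowerCase();
  if (lowered === "t" || lowered === "true") return "true";
  if (lowered === "f" || lowered === "false") return "false";
  return lowered;
}

// Hypothetical helper: fill-in-the-blank comparison as done in the component.
// A question without a stored answer can never be marked correct.
function isFillAnswerCorrect(
  given: string,
  expected: string | null | undefined,
): boolean {
  if (expected == null) return false;
  return given.trim().toLowerCase() === expected.trim().toLowerCase();
}
```

Exact string matching is strict for free-text answers: "0.50" and "0.5" would not match. That is a property of the comparison shown in the component, not something this sketch changes.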
|
||||
56
frontend/src/components/workbench/QuestionNav.tsx
Normal file
@@ -0,0 +1,56 @@
|
||||
import type { Question } from "@/types/api";
|
||||
import type { QuestionGroup } from "@/lib/questionGroups";
|
||||
import { subquestionLabel } from "@/lib/questionGroups";
|
||||
|
||||
export default function QuestionNav({
|
||||
groups,
|
||||
currentGroupKey,
|
||||
currentQuestionId,
|
||||
onSelectGroup,
|
||||
onSelectQuestion,
|
||||
}: {
|
||||
groups: QuestionGroup[];
|
||||
currentGroupKey: string | null;
|
||||
currentQuestionId: string | null;
|
||||
onSelectGroup: (groupKey: string) => void;
|
||||
onSelectQuestion: (questionId: string) => void;
|
||||
}) {
|
||||
const activeGroup = groups.find((group) => group.key === currentGroupKey) ?? null;
|
||||
|
||||
return (
|
||||
<div className="border-b border-gray-200 bg-white px-4 py-2 shrink-0">
|
||||
<div className="flex gap-1.5 overflow-x-auto hide-scrollbar">
|
||||
{groups.map((group) => (
|
||||
<button
|
||||
key={group.key}
|
||||
onClick={() => onSelectGroup(group.key)}
|
||||
className={`px-3 py-1.5 rounded-lg text-xs font-medium whitespace-nowrap transition-colors
|
||||
${group.key === currentGroupKey
|
||||
? "bg-blue-600 text-white"
|
||||
: "bg-gray-100 text-gray-600 hover:bg-gray-200"
|
||||
}`}
|
||||
>
|
||||
{group.label}
|
||||
</button>
|
||||
))}
|
||||
</div>
|
||||
{activeGroup && activeGroup.questions.length > 1 && (
|
||||
<div className="flex gap-1.5 overflow-x-auto hide-scrollbar mt-2">
|
||||
{activeGroup.questions.map((question) => (
|
||||
<button
|
||||
key={question.id}
|
||||
onClick={() => onSelectQuestion(question.id)}
|
||||
className={`px-2.5 py-1 rounded-md text-[11px] font-medium whitespace-nowrap transition-colors
|
||||
${question.id === currentQuestionId
|
||||
? "bg-blue-50 text-blue-700 border border-blue-200"
|
||||
: "bg-gray-50 text-gray-500 border border-gray-200 hover:bg-gray-100"
|
||||
}`}
|
||||
>
|
||||
{subquestionLabel(question)}
|
||||
</button>
|
||||
))}
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
);
|
||||
}
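`QuestionNav` receives ready-made `groups`, each with a `key`, a `label` and a list of `questions`; the grouping itself lives in `@/lib/questionGroups`, which is not shown here. Below is a rough sketch of one way such groups could be built, assuming subquestions share a leading numeric prefix (the same `^\d+` convention `QuestionDetail` uses for its header). The types `QuestionLike` and `GroupLike` and the function `groupByMainNumber` are hypothetical and may differ from the real module.

```typescript
// Hypothetical minimal shapes; the real Question / QuestionGroup types are
// defined elsewhere in the repository and are not reproduced here.
interface QuestionLike {
  id: string;
  question_number: string;
}

interface GroupLike {
  key: string;
  label: string;
  questions: QuestionLike[];
}

// Group questions by the leading digits of question_number ("3a", "3b" -> "3").
// Questions without a numeric prefix fall back to their full question_number.
function groupByMainNumber(questions: QuestionLike[]): GroupLike[] {
  const byKey = new Map<string, GroupLike>();
  const ordered: GroupLike[] = [];
  for (const q of questions) {
    const main = q.question_number.match(/^\d+/)?.[0] ?? q.question_number;
    let group = byKey.get(main);
    if (!group) {
      group = { key: main, label: `Q${main}`, questions: [] };
      byKey.set(main, group);
      ordered.push(group); // keep first-seen order for the nav bar
    }
    group.questions.push(q);
  }
  return ordered;
}
```

With groups shaped like this, the second row of subquestion buttons in `QuestionNav` would appear only for groups holding more than one question, matching the `activeGroup.questions.length > 1` check.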
|
||||
130
frontend/src/components/workbench/SimilarHistoryPanel.tsx
Normal file
@@ -0,0 +1,130 @@
|
||||
import { useEffect, useState } from "react";
import { Link } from "react-router-dom";

import { getSimilarQuestions } from "@/lib/api";
import type { Question, SimilarQuestion } from "@/types/api";

const typeLabel: Record<string, string> = {
  mc: "MC",
  true_false: "T/F",
  fill_blank: "Fill",
  long_question: "Long",
  long_answer: "Long",
  short_answer: "Short",
  coding: "Code",
};

function matchColor(percent: number): string {
  if (percent >= 80) return "bg-green-100 text-green-700";
  if (percent >= 60) return "bg-amber-100 text-amber-700";
  return "bg-gray-100 text-gray-600";
}

function cleanReason(reason: string): string {
  // "Shared topic: foo_bar, baz_qux" → "Shared topic: Foo Bar, Baz Qux"
  return reason.replace(/[_]/g, " ").replace(/:\s*(.+)$/, (_, rest) =>
    ": " + rest.split(",").map((s: string) =>
      s.trim().replace(/\b\w/g, (c: string) => c.toUpperCase())
    ).join(", ")
  );
}
|
||||
|
||||
export default function SimilarHistoryPanel({ question }: { question: Question }) {
|
||||
const [items, setItems] = useState<SimilarQuestion[]>([]);
|
||||
const [loading, setLoading] = useState(true);
|
||||
const [error, setError] = useState<string | null>(null);
|
||||
const [isOpen, setIsOpen] = useState(true);
|
||||
|
||||
useEffect(() => {
|
||||
let cancelled = false;
|
||||
setLoading(true);
|
||||
setError(null);
|
||||
setItems([]);
|
||||
getSimilarQuestions(question.id)
|
||||
.then((data) => {
|
||||
if (cancelled) return;
|
||||
setItems(data);
|
||||
setLoading(false);
|
||||
})
|
||||
.catch((err: unknown) => {
|
||||
if (cancelled) return;
|
||||
setError(err instanceof Error ? err.message : "Failed to load.");
|
||||
setLoading(false);
|
||||
});
|
||||
return () => { cancelled = true; };
|
||||
}, [question.id]);
|
||||
|
||||
return (
|
||||
<div className="rounded-lg border border-blue-200 mb-3 overflow-hidden">
|
||||
<button
|
||||
onClick={() => setIsOpen((open) => !open)}
|
||||
className="w-full flex items-center justify-between p-3 bg-blue-50"
|
||||
>
|
||||
<div className="flex items-center gap-2">
|
||||
<span className="w-5 h-5 flex items-center justify-center rounded bg-blue-600 text-white text-xs font-bold">S</span>
|
||||
<span className="font-semibold text-sm text-blue-800">Similar Questions</span>
|
||||
</div>
|
||||
<span className="text-xs text-blue-600">{loading ? "…" : items.length}</span>
|
||||
</button>
|
||||
|
||||
{isOpen && (
|
||||
<div className="p-2 space-y-1.5 bg-white">
|
||||
{loading && <div className="text-xs text-gray-400 px-1 py-2">Loading…</div>}
|
||||
{!loading && error && (
|
||||
<div className="text-xs text-red-600 bg-red-50 border border-red-200 rounded px-3 py-2">{error}</div>
|
||||
)}
|
||||
{!loading && !error && items.length === 0 && (
|
||||
<div className="text-xs text-gray-400 px-1 py-2">No similar questions found.</div>
|
||||
)}
|
||||
|
||||
{items.map((item) => (
|
||||
<Link
|
||||
key={item.id}
|
||||
to={`/paper/${item.paper_id}`}
|
||||
className="flex items-center gap-2 px-2.5 py-2 rounded-lg border border-gray-100 hover:border-blue-200 hover:bg-blue-50/40 transition-colors"
|
||||
>
|
||||
{/* Match % badge */}
|
||||
<span className={`shrink-0 text-[11px] font-bold px-1.5 py-0.5 rounded ${matchColor(item.match_percent)}`}>
|
||||
{item.match_percent}%
|
||||
</span>
|
||||
|
||||
{/* Main info */}
|
||||
<div className="flex-1 min-w-0">
|
||||
<div className="flex items-center gap-1.5 flex-wrap">
|
||||
<span className="text-xs font-semibold text-gray-700">{item.source}</span>
|
||||
<span className="text-xs text-gray-400">·</span>
|
||||
<span className="text-xs text-gray-500">Q{item.question_number}</span>
|
||||
{item.question_type && (
|
||||
<>
|
||||
<span className="text-xs text-gray-400">·</span>
|
||||
<span className="text-xs text-gray-500">{typeLabel[item.question_type] ?? item.question_type}</span>
|
||||
</>
|
||||
)}
|
||||
</div>
|
||||
|
||||
{/* Topics + reasons in one row */}
|
||||
<div className="flex gap-1 flex-wrap mt-1">
|
||||
{item.topics.slice(0, 2).map((topic) => (
|
||||
<span key={topic} className="text-[10px] px-1.5 py-0.5 rounded bg-gray-100 text-gray-500">
|
||||
{topic}
|
||||
</span>
|
||||
))}
|
||||
{item.match_reasons
|
||||
?.filter((r) => !r.startsWith("Same format") && !r.startsWith("Same difficulty"))
|
||||
.slice(0, 2)
|
||||
.map((reason) => (
|
||||
<span key={reason} className="text-[10px] px-1.5 py-0.5 rounded bg-blue-50 text-blue-500">
|
||||
{cleanReason(reason)}
|
||||
</span>
|
||||
))}
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<span className="text-gray-300 text-xs shrink-0">›</span>
|
||||
</Link>
|
||||
))}
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
);
|
||||
}
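The effect in `SimilarHistoryPanel` uses a per-effect `cancelled` flag so that a response for a previous `question.id` cannot overwrite state for the current one. Below is a sketch of a related but different technique with the same goal: a token counter where only the most recently started request counts as current. `createLatestGuard` is a hypothetical helper and is not part of the component.

```typescript
// Hypothetical "latest request wins" guard. Each call to begin() invalidates
// every earlier token, which gives the same protection against stale
// responses as the cancelled flag in the effect cleanup.
function createLatestGuard(): { begin: () => () => boolean } {
  let latest = 0;
  return {
    begin() {
      const token = ++latest;
      // The returned check is true only while no newer request has started.
      return () => token === latest;
    },
  };
}
```

A caller would take `const isCurrent = guard.begin()` before starting a fetch and return early in the `.then` / `.catch` handlers when `isCurrent()` is false. The flag-in-cleanup approach in the component is the more common React idiom; the token version is mainly useful outside of effects.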
|
||||
148
frontend/src/components/workbench/VariantDetail.tsx
Normal file
@@ -0,0 +1,148 @@
|
||||
import { useState } from "react";
|
||||
import type { VariantQuestion } from "@/types/api";
|
||||
import KaTeXRenderer from "@/components/shared/KaTeXRenderer";
|
||||
import CollapsibleSection from "@/components/shared/CollapsibleSection";
|
||||
|
||||
export default function VariantDetail({
|
||||
variant,
|
||||
}: {
|
||||
variant: VariantQuestion;
|
||||
}) {
|
||||
const [selectedOption, setSelectedOption] = useState<string | null>(null);
|
||||
const [checked, setChecked] = useState(false);
|
||||
const [fillAnswer, setFillAnswer] = useState("");
|
||||
const [fillChecked, setFillChecked] = useState(false);
|
||||
|
||||
const isMc = (variant.question_type === "mc" || variant.question_type === "true_false") && variant.options;
|
||||
|
||||
const handleMcCheck = () => {
|
||||
if (!selectedOption) return;
|
||||
setChecked(true);
|
||||
};
|
||||
|
||||
const handleFillCheck = () => {
|
||||
if (!fillAnswer.trim()) return;
|
||||
setFillChecked(true);
|
||||
};
|
||||
|
||||
const isCorrectMc = checked && selectedOption === variant.correct_answer;
|
||||
const isCorrectFill =
|
||||
fillChecked &&
|
||||
fillAnswer.trim().toLowerCase() === variant.correct_answer.trim().toLowerCase();
|
||||
|
||||
const getOptionStyle = (label: string) => {
|
||||
if (!checked) {
|
||||
return label === selectedOption
|
||||
? "border-blue-400 bg-blue-50"
|
||||
: "border-gray-200 hover:bg-gray-50";
|
||||
}
|
||||
if (label === variant.correct_answer) return "border-green-400 bg-green-50";
|
||||
if (label === selectedOption) return "border-red-400 bg-red-50";
|
||||
return "border-gray-200 opacity-50";
|
||||
};
|
||||
|
||||
return (
|
||||
<div>
|
||||
{/* Header */}
|
||||
<div className="flex items-center gap-2 mb-3">
|
||||
<span className="w-5 h-5 flex items-center justify-center bg-purple-600 text-white text-xs font-bold rounded-full">V</span>
|
||||
<span className="text-sm font-semibold text-gray-900">Similar Question</span>
|
||||
<span className="text-xs px-2 py-0.5 rounded bg-purple-100 text-purple-700">
|
||||
{variant.question_type}
|
||||
</span>
|
||||
</div>
|
||||
|
||||
{/* Question text */}
|
||||
<div className="text-sm text-gray-800 leading-relaxed bg-purple-50 rounded-lg p-3 border border-purple-200 mb-4">
|
||||
<KaTeXRenderer html={variant.question_text} />
|
||||
</div>
|
||||
|
||||
{/* MC options */}
|
||||
{isMc && variant.options && (
|
||||
<>
|
||||
<div className="space-y-1.5">
|
||||
{variant.options.map((opt) => (
|
||||
<button
|
||||
key={opt.label}
|
||||
onClick={() => { if (!checked) setSelectedOption(opt.label); }}
|
||||
disabled={checked}
|
||||
className={`w-full flex items-start gap-2 p-2 rounded-lg border text-sm text-left transition-colors ${getOptionStyle(opt.label)}`}
|
||||
>
|
||||
<span className="font-semibold shrink-0 w-6 text-blue-600">{opt.label}.</span>
|
||||
<span className="text-gray-700">{opt.text}</span>
|
||||
{checked && opt.label === variant.correct_answer && (
|
||||
<span className="ml-auto text-green-600 text-xs font-medium shrink-0">Correct</span>
|
||||
)}
|
||||
</button>
|
||||
))}
|
||||
</div>
|
||||
{!checked && selectedOption && (
|
||||
<button
|
||||
onClick={handleMcCheck}
|
||||
className="mt-2 px-4 py-1.5 bg-blue-600 text-white rounded-lg text-sm font-medium hover:bg-blue-700"
|
||||
>
|
||||
Check Answer
|
||||
</button>
|
||||
)}
|
||||
{checked && (
|
||||
<div className={`mt-2 text-sm font-medium ${isCorrectMc ? "text-green-600" : "text-red-600"}`}>
|
||||
{isCorrectMc ? "Correct!" : `Wrong — the answer is ${variant.correct_answer}`}
|
||||
</div>
|
||||
)}
|
||||
</>
|
||||
)}
|
||||
|
||||
{/* Non-MC input */}
|
||||
{!isMc && (
|
||||
<div className="mb-3">
|
||||
<div className="flex gap-2">
|
||||
<input
|
||||
type="text"
|
||||
value={fillAnswer}
|
||||
onChange={(e) => { if (!fillChecked) setFillAnswer(e.target.value); }}
|
||||
placeholder="Type your answer..."
|
||||
disabled={fillChecked}
|
||||
className={`flex-1 border rounded-lg px-3 py-2 text-sm focus:outline-none focus:ring-2 focus:ring-blue-500 ${
|
||||
fillChecked
|
||||
? isCorrectFill ? "border-green-400 bg-green-50" : "border-red-400 bg-red-50"
|
||||
: "border-gray-300"
|
||||
}`}
|
||||
onKeyDown={(e) => { if (e.key === "Enter") handleFillCheck(); }}
|
||||
/>
|
||||
{!fillChecked && (
|
||||
<button
|
||||
onClick={handleFillCheck}
|
||||
disabled={!fillAnswer.trim()}
|
||||
className="px-4 py-2 bg-blue-600 text-white rounded-lg text-sm font-medium hover:bg-blue-700 disabled:opacity-50"
|
||||
>
|
||||
Check
|
||||
</button>
|
||||
)}
|
||||
</div>
|
||||
{fillChecked && (
|
||||
<div className={`mt-2 text-sm font-medium ${isCorrectFill ? "text-green-600" : "text-red-600"}`}>
|
||||
{isCorrectFill ? "Correct!" : `Answer: ${variant.correct_answer}`}
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
)}
|
||||
|
||||
{/* AI Trio */}
|
||||
<div className="mt-4 space-y-2">
|
||||
{variant.knowledge_reminder && (
|
||||
<CollapsibleSection title="Knowledge Reminder" colorScheme="blue">
|
||||
<KaTeXRenderer html={variant.knowledge_reminder} />
|
||||
</CollapsibleSection>
|
||||
)}
|
||||
{variant.ai_hint && (
|
||||
<CollapsibleSection title="AI Hint" colorScheme="amber">
|
||||
<KaTeXRenderer html={variant.ai_hint} />
|
||||
</CollapsibleSection>
|
||||
)}
|
||||
<CollapsibleSection title="Solution" colorScheme="green">
|
||||
<KaTeXRenderer html={variant.solution} />
|
||||
</CollapsibleSection>
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
189
frontend/src/components/workbench/VariantModal.tsx
Normal file
@@ -0,0 +1,189 @@
|
||||
import { useState } from "react";
|
||||
import type { VariantQuestion } from "@/types/api";
|
||||
import KaTeXRenderer from "@/components/shared/KaTeXRenderer";
|
||||
|
||||
export default function VariantModal({
|
||||
variant,
|
||||
onClose,
|
||||
}: {
|
||||
variant: VariantQuestion;
|
||||
onClose: () => void;
|
||||
}) {
|
||||
const [selectedOption, setSelectedOption] = useState<string | null>(null);
|
||||
const [checked, setChecked] = useState(false);
|
||||
const [fillAnswer, setFillAnswer] = useState("");
|
||||
const [fillChecked, setFillChecked] = useState(false);
|
||||
const [showKnowledge, setShowKnowledge] = useState(false);
|
||||
const [showHint, setShowHint] = useState(false);
|
||||
const [showSolution, setShowSolution] = useState(false);
|
||||
|
||||
const isMc = (variant.question_type === "mc" || variant.question_type === "true_false") && variant.options;
|
||||
|
||||
const handleMcCheck = () => {
|
||||
if (!selectedOption) return;
|
||||
setChecked(true);
|
||||
};
|
||||
|
||||
const handleFillCheck = () => {
|
||||
if (!fillAnswer.trim()) return;
|
||||
setFillChecked(true);
|
||||
};
|
||||
|
||||
const isCorrectMc = checked && selectedOption === variant.correct_answer;
|
||||
const isCorrectFill =
|
||||
fillChecked &&
|
||||
fillAnswer.trim().toLowerCase() === variant.correct_answer.trim().toLowerCase();
|
||||
|
||||
const getOptionStyle = (label: string) => {
|
||||
if (!checked) {
|
||||
return label === selectedOption
|
||||
? "border-blue-400 bg-blue-50"
|
||||
: "border-gray-200 hover:bg-gray-50";
|
||||
}
|
||||
if (label === variant.correct_answer) return "border-green-400 bg-green-50";
|
||||
if (label === selectedOption) return "border-red-400 bg-red-50";
|
||||
return "border-gray-200 opacity-50";
|
||||
};
|
||||
|
||||
return (
|
||||
<div className="fixed inset-0 bg-black/40 flex items-center justify-center z-50 p-4">
|
||||
<div className="bg-white rounded-xl shadow-xl max-w-lg w-full max-h-[90vh] overflow-y-auto">
|
||||
<div className="p-5">
|
||||
<div className="flex items-center justify-between mb-4">
|
||||
<h3 className="text-lg font-semibold text-gray-900">Similar Question</h3>
|
||||
<button onClick={onClose} className="text-gray-400 hover:text-gray-600 text-xl">×</button>
|
||||
</div>
|
||||
|
||||
{/* Question text */}
|
||||
<div className="text-sm text-gray-800 leading-relaxed bg-gray-50 rounded-lg p-3 border border-gray-200 mb-3">
|
||||
<KaTeXRenderer html={variant.question_text} />
|
||||
</div>
|
||||
|
||||
{/* MC options */}
|
||||
{isMc && variant.options && (
|
||||
<>
|
||||
<div className="space-y-1.5">
|
||||
{variant.options.map((opt) => (
|
||||
<button
|
||||
key={opt.label}
|
||||
onClick={() => { if (!checked) setSelectedOption(opt.label); }}
|
||||
disabled={checked}
|
||||
                    className={`w-full flex items-start gap-2 p-2 rounded-lg border text-sm text-left transition-colors ${getOptionStyle(opt.label)}`}
                  >
                    <span className="font-semibold shrink-0 w-6 text-blue-600">{opt.label}.</span>
                    <span className="text-gray-700">{opt.text}</span>
                    {checked && opt.label === variant.correct_answer && (
                      <span className="ml-auto text-green-600 text-xs font-medium shrink-0">Correct</span>
                    )}
                  </button>
                ))}
              </div>
              {!checked && selectedOption && (
                <button
                  onClick={handleMcCheck}
                  className="mt-2 px-4 py-1.5 bg-blue-600 text-white rounded-lg text-sm font-medium hover:bg-blue-700"
                >
                  Check Answer
                </button>
              )}
              {checked && (
                <div className={`mt-2 text-sm font-medium ${isCorrectMc ? "text-green-600" : "text-red-600"}`}>
                  {isCorrectMc ? "Correct!" : `Wrong — the answer is ${variant.correct_answer}`}
                </div>
              )}
            </>
          )}

          {/* Non-MC input */}
          {!isMc && (
            <div className="mt-1">
              <div className="flex gap-2">
                <input
                  type="text"
                  value={fillAnswer}
                  onChange={(e) => { if (!fillChecked) setFillAnswer(e.target.value); }}
                  placeholder="Type your answer..."
                  disabled={fillChecked}
                  className={`flex-1 border rounded-lg px-3 py-2 text-sm focus:outline-none focus:ring-2 focus:ring-blue-500 ${
                    fillChecked
                      ? isCorrectFill ? "border-green-400 bg-green-50" : "border-red-400 bg-red-50"
                      : "border-gray-300"
                  }`}
                  onKeyDown={(e) => { if (e.key === "Enter") handleFillCheck(); }}
                />
                {!fillChecked && (
                  <button
                    onClick={handleFillCheck}
                    disabled={!fillAnswer.trim()}
                    className="px-4 py-2 bg-blue-600 text-white rounded-lg text-sm font-medium hover:bg-blue-700 disabled:opacity-50"
                  >
                    Check
                  </button>
                )}
              </div>
              {fillChecked && (
                <div className={`mt-2 text-sm font-medium ${isCorrectFill ? "text-green-600" : "text-red-600"}`}>
                  {isCorrectFill ? "Correct!" : `Answer: ${variant.correct_answer}`}
                </div>
              )}
            </div>
          )}

          {/* AI Trio: Knowledge / Hint / Solution */}
          <div className="mt-4 border-t pt-3 space-y-2">
            {variant.knowledge_reminder && (
              <div>
                <button
                  onClick={() => setShowKnowledge(!showKnowledge)}
                  className="text-sm text-blue-600 hover:text-blue-800 font-medium"
                >
                  {showKnowledge ? "▾ Hide Knowledge" : "▸ Knowledge Reminder"}
                </button>
                {showKnowledge && (
                  <div className="mt-2 bg-blue-50 rounded-lg p-3 text-sm border border-blue-200">
                    <KaTeXRenderer html={variant.knowledge_reminder} />
                  </div>
                )}
              </div>
            )}
            {variant.ai_hint && (
              <div>
                <button
                  onClick={() => setShowHint(!showHint)}
                  className="text-sm text-amber-600 hover:text-amber-800 font-medium"
                >
                  {showHint ? "▾ Hide Hint" : "▸ AI Hint"}
                </button>
                {showHint && (
                  <div className="mt-2 bg-amber-50 rounded-lg p-3 text-sm border border-amber-200">
                    <KaTeXRenderer html={variant.ai_hint} />
                  </div>
                )}
              </div>
            )}
            <div>
              <button
                onClick={() => setShowSolution(!showSolution)}
                className="text-sm text-green-600 hover:text-green-800 font-medium"
              >
                {showSolution ? "▾ Hide Solution" : "▸ Solution"}
              </button>
              {showSolution && (
                <div className="mt-2 bg-green-50 rounded-lg p-3 text-sm border border-green-200">
                  <KaTeXRenderer html={variant.solution} />
                </div>
              )}
            </div>
          </div>

          <button
            onClick={onClose}
            className="mt-4 w-full py-2 rounded-lg text-sm bg-gray-100 text-gray-700 font-medium hover:bg-gray-200"
          >
            Close
          </button>
        </div>
      </div>
    </div>
  );
}
49
frontend/src/contexts/AuthContext.tsx
Normal file
@@ -0,0 +1,49 @@
import { createContext, useContext, useEffect, useState } from "react";
import type { Session, User } from "@supabase/supabase-js";
import { supabase } from "@/lib/supabase";

interface AuthContextValue {
  session: Session | null;
  user: User | null;
  loading: boolean;
  signOut: () => Promise<void>;
}

const AuthContext = createContext<AuthContextValue>({
  session: null,
  user: null,
  loading: true,
  signOut: async () => {},
});

export function AuthProvider({ children }: { children: React.ReactNode }) {
  const [session, setSession] = useState<Session | null>(null);
  const [loading, setLoading] = useState(true);

  useEffect(() => {
    supabase.auth.getSession().then(({ data }) => {
      setSession(data.session);
      setLoading(false);
    });

    const { data: { subscription } } = supabase.auth.onAuthStateChange((_event, session) => {
      setSession(session);
    });

    return () => subscription.unsubscribe();
  }, []);

  const signOut = async () => {
    await supabase.auth.signOut();
  };

  return (
    <AuthContext.Provider value={{ session, user: session?.user ?? null, loading, signOut }}>
      {children}
    </AuthContext.Provider>
  );
}

export function useAuth() {
  return useContext(AuthContext);
}
43
frontend/src/hooks/usePaper.ts
Normal file
@@ -0,0 +1,43 @@
import { useEffect, useState } from "react";
import { getPaper } from "@/lib/api";
import type { Paper } from "@/types/api";

const POLL_INTERVAL = 3000;

export function usePaper(paperId: string) {
  const [paper, setPaper] = useState<Paper | null>(null);
  const [loading, setLoading] = useState(true);
  const [error, setError] = useState<string | null>(null);

  useEffect(() => {
    let intervalId: number | null = null;
    let cancelled = false;

    const fetchPaper = async () => {
      try {
        const data = await getPaper(paperId);
        if (cancelled) return;
        setPaper(data);
        setLoading(false);
        if (data.status === "ready" || data.status === "error") {
          if (intervalId !== null) clearInterval(intervalId);
        }
      } catch (err) {
        if (cancelled) return;
        setError(err instanceof Error ? err.message : "Unknown error");
        setLoading(false);
        if (intervalId !== null) clearInterval(intervalId);
      }
    };

    fetchPaper();
    intervalId = window.setInterval(fetchPaper, POLL_INTERVAL);

    return () => {
      cancelled = true;
      if (intervalId !== null) clearInterval(intervalId);
    };
  }, [paperId]);

  return { paper, loading, error };
}
33
frontend/src/hooks/useQuestions.ts
Normal file
@@ -0,0 +1,33 @@
import { useEffect, useState } from "react";
import { getQuestions } from "@/lib/api";
import type { Question } from "@/types/api";

export function useQuestions(paperId: string, enabled: boolean) {
  const [questions, setQuestions] = useState<Question[]>([]);
  const [loading, setLoading] = useState(false);
  const [error, setError] = useState<string | null>(null);

  useEffect(() => {
    if (!enabled) return;
    let cancelled = false;
    setLoading(true);

    getQuestions(paperId)
      .then((data) => {
        if (!cancelled) {
          setQuestions(data);
          setLoading(false);
        }
      })
      .catch((err) => {
        if (!cancelled) {
          setError(err instanceof Error ? err.message : "Unknown error");
          setLoading(false);
        }
      });

    return () => { cancelled = true; };
  }, [paperId, enabled]);

  return { questions, loading, error };
}
190
frontend/src/lib/api.ts
Normal file
@@ -0,0 +1,190 @@
import type {
  CourseAnalytics,
  Paper,
  Question,
  QuestionVariant,
  SimilarQuestion,
  UploadResponse,
  UserAttempt,
} from "@/types/api";
import { supabase } from "@/lib/supabase";

const API_BASE = "/api";

async function authHeaders(): Promise<Record<string, string>> {
  const { data } = await supabase.auth.getSession();
  const token = data.session?.access_token;
  if (!token) return {};
  return { Authorization: `Bearer ${token}` };
}

export async function uploadPaper(formData: FormData): Promise<UploadResponse> {
  const headers = await authHeaders();
  const res = await fetch(`${API_BASE}/papers/upload`, {
    method: "POST",
    headers,
    body: formData,
  });
  if (!res.ok) throw new Error(`Upload failed: ${res.status}`);
  return res.json();
}

export async function getPaper(paperId: string): Promise<Paper> {
  const res = await fetch(`${API_BASE}/papers/${paperId}`);
  if (!res.ok) throw new Error(`Paper not found: ${res.status}`);
  return res.json();
}

export async function getQuestions(paperId: string): Promise<Question[]> {
  const res = await fetch(`${API_BASE}/papers/${paperId}/questions`);
  if (!res.ok) throw new Error(`Questions fetch failed: ${res.status}`);
  return res.json();
}

export async function myPapers(): Promise<Paper[]> {
  const headers = await authHeaders();
  const res = await fetch(`${API_BASE}/papers/mine`, { headers });
  if (!res.ok) throw new Error(`My papers fetch failed: ${res.status}`);
  return res.json();
}

export async function listPapers(): Promise<Paper[]> {
  const res = await fetch(`${API_BASE}/papers/`);
  if (!res.ok) throw new Error(`List papers failed: ${res.status}`);
  return res.json();
}

export async function recordAttempt(
  questionId: string,
  attemptType: string,
  userAnswer: string | null,
  isCorrect: boolean | null,
): Promise<UserAttempt> {
  const headers = await authHeaders();
  const res = await fetch(`${API_BASE}/attempts/`, {
    method: "POST",
    headers: { "Content-Type": "application/json", ...headers },
    body: JSON.stringify({
      question_id: questionId,
      attempt_type: attemptType,
      user_answer: userAnswer,
      is_correct: isCorrect,
    }),
  });
  if (!res.ok) throw new Error(`Attempt save failed: ${res.status}`);
  return res.json();
}

export async function uploadPhoto(
  questionId: string,
  photo: File,
): Promise<{ attempt: UserAttempt; ocr_text: string; grade: { is_correct: boolean; score_given?: number; feedback: string; error_at_step: number | null } }> {
  const headers = await authHeaders();
  const fd = new FormData();
  fd.append("question_id", questionId);
  fd.append("photo", photo);
  const res = await fetch(`${API_BASE}/attempts/photo`, {
    method: "POST",
    headers,
    body: fd,
  });
  if (!res.ok) throw new Error(`Photo upload failed: ${res.status}`);
  return res.json();
}

export async function getPaperAttempts(paperId: string): Promise<{
  question_id: string;
  is_correct: boolean;
  feedback: string | null;
  photo_ocr_text: string | null;
}[]> {
  const headers = await authHeaders();
  const res = await fetch(`${API_BASE}/attempts/by-paper/${paperId}`, { headers });
  if (!res.ok) return [];
  return res.json();
}

export async function generateVariant(questionId: string): Promise<QuestionVariant> {
  const headers = await authHeaders();
  const res = await fetch(`${API_BASE}/questions/${questionId}/variant`, {
    method: "POST",
    headers,
  });
  if (!res.ok) throw new Error(`Variant generation failed: ${res.status}`);
  return res.json();
}

export async function getVariants(questionId: string): Promise<QuestionVariant[]> {
  const headers = await authHeaders();
  const res = await fetch(`${API_BASE}/questions/${questionId}/variants`, { headers });
  if (!res.ok) throw new Error(`Variants fetch failed: ${res.status}`);
  return res.json();
}

export async function updateVariant(variantId: string, data: { favorited?: boolean }): Promise<QuestionVariant> {
  const headers = await authHeaders();
  const res = await fetch(`${API_BASE}/questions/variant/${variantId}`, {
    method: "PATCH",
    headers: { "Content-Type": "application/json", ...headers },
    body: JSON.stringify(data),
  });
  if (!res.ok) throw new Error(`Variant update failed: ${res.status}`);
  return res.json();
}

export async function deleteVariant(variantId: string): Promise<void> {
  const headers = await authHeaders();
  const res = await fetch(`${API_BASE}/questions/variant/${variantId}`, { method: "DELETE", headers });
  // Surface failed deletes instead of silently ignoring them, as the other helpers do.
  if (!res.ok) throw new Error(`Variant delete failed: ${res.status}`);
}

export async function getFavoriteVariants(): Promise<QuestionVariant[]> {
  const headers = await authHeaders();
  const res = await fetch(`${API_BASE}/questions/variants/favorited`, { headers });
  if (!res.ok) throw new Error(`Favorited variants fetch failed: ${res.status}`);
  return res.json();
}

export async function getErrorBook(courseCode?: string): Promise<UserAttempt[]> {
  const headers = await authHeaders();
  const params = new URLSearchParams();
  if (courseCode) params.set("course_code", courseCode);
  const query = params.toString() ? `?${params.toString()}` : "";
  const res = await fetch(`${API_BASE}/attempts/error-book${query}`, { headers });
  if (!res.ok) throw new Error(`Error book fetch failed: ${res.status}`);
  return res.json();
}

export async function updateAttempt(
  attemptId: string,
  data: { in_error_book?: boolean; mastered?: boolean },
): Promise<UserAttempt> {
  const headers = await authHeaders();
  const res = await fetch(`${API_BASE}/attempts/${attemptId}`, {
    method: "PATCH",
    headers: { "Content-Type": "application/json", ...headers },
    body: JSON.stringify(data),
  });
  if (!res.ok) throw new Error(`Attempt update failed: ${res.status}`);
  return res.json();
}

export async function listCourses(): Promise<string[]> {
  const res = await fetch(`${API_BASE}/analytics/courses`);
  if (!res.ok) throw new Error(`Courses fetch failed: ${res.status}`);
  return res.json();
}

export async function getCourseAnalytics(courseCode: string): Promise<CourseAnalytics> {
  const res = await fetch(`${API_BASE}/analytics/course/${courseCode}`);
  if (!res.ok) throw new Error(`Analytics fetch failed: ${res.status}`);
  return res.json();
}

export async function getSimilarQuestions(
  questionId: string,
  limit = 6,
): Promise<SimilarQuestion[]> {
  const res = await fetch(`${API_BASE}/questions/${questionId}/similar?limit=${limit}`);
  if (!res.ok) throw new Error(`Similar question fetch failed: ${res.status}`);
  return res.json();
}
45
frontend/src/lib/questionGroups.ts
Normal file
@@ -0,0 +1,45 @@
import type { Question } from "@/types/api";

export interface QuestionGroup {
  key: string;
  label: string;
  questions: Question[];
  startPage: number;
}

// Leading digits of the question number identify the top-level question.
function topLevelKey(questionNumber: string): string {
  const match = questionNumber.match(/^\d+/);
  return match?.[0] ?? questionNumber;
}

export function groupQuestions(questions: Question[]): QuestionGroup[] {
  const groups = new Map<string, QuestionGroup>();

  for (const question of questions) {
    const key = topLevelKey(question.question_number);
    const existing = groups.get(key);
    if (existing) {
      existing.questions.push(question);
      existing.startPage = Math.min(existing.startPage, question.page_number ?? existing.startPage);
      continue;
    }
    groups.set(key, {
      key,
      label: `Q${key}`,
      questions: [question],
      startPage: question.page_number ?? 1,
    });
  }

  // Numeric-aware comparison keeps "2" before "10" and stays well-defined
  // when a key has no leading digits (Number() would yield NaN there).
  return Array.from(groups.values()).sort((a, b) => a.key.localeCompare(b.key, undefined, { numeric: true }));
}

export function subquestionLabel(question: Question): string {
  const remainder = question.question_number.replace(/^\d+/, "");
  if (!remainder) return "Main";
  return remainder
    .replace(/^_+/, "")
    .split("_")
    .filter(Boolean)
    .join(".");
}
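
The grouping above can be illustrated with a minimal, self-contained sketch. It assumes question numbers look like `3_a_ii` (leading digits, then underscore-separated parts); that format is inferred from the parsing code, and `MiniQuestion`, `topKey`, `subLabel`, and the sample rows are hypothetical stand-ins rather than names from the project. A plain object replaces the `Map` only to keep the sketch short.

```typescript
// Hypothetical stand-in for the real Question type; only the fields used here.
interface MiniQuestion {
  question_number: string;
  page_number: number | null;
}

// Leading digits form the top-level key, e.g. "3_a_ii" -> "3".
function topKey(questionNumber: string): string {
  const match = questionNumber.match(/^\d+/);
  return match ? match[0] : questionNumber;
}

// The rest becomes a dotted label, e.g. "3_a_ii" -> "a.ii"; no remainder -> "Main".
function subLabel(questionNumber: string): string {
  const remainder = questionNumber.replace(/^\d+/, "");
  if (!remainder) return "Main";
  return remainder.split("_").filter(Boolean).join(".");
}

// Made-up sample rows, deliberately out of order.
const sample: MiniQuestion[] = [
  { question_number: "2_a", page_number: 4 },
  { question_number: "1", page_number: 2 },
  { question_number: "2_b_i", page_number: 5 },
];

// Bucket subquestions under their top-level key.
const byKey: Record<string, MiniQuestion[]> = {};
for (const q of sample) {
  const key = topKey(q.question_number);
  (byKey[key] = byKey[key] || []).push(q);
}

const groupKeys = Object.keys(byKey).sort((a, b) => Number(a) - Number(b));
console.log(groupKeys); // ["1", "2"]
console.log(subLabel("2_b_i")); // "b.i"
```

Here `2_a` and `2_b_i` end up in the same group, and the UI would presumably label them `a` and `b.i` under a shared `Q2` heading.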
6
frontend/src/lib/supabase.ts
Normal file
@@ -0,0 +1,6 @@
import { createClient } from "@supabase/supabase-js";

const supabaseUrl = import.meta.env.VITE_SUPABASE_URL as string;
const supabaseAnonKey = import.meta.env.VITE_SUPABASE_ANON_KEY as string;

export const supabase = createClient(supabaseUrl, supabaseAnonKey);
16
frontend/src/main.tsx
Normal file
@@ -0,0 +1,16 @@
import { StrictMode } from "react";
import { createRoot } from "react-dom/client";
import { BrowserRouter } from "react-router-dom";
import App from "./App";
import { AuthProvider } from "./contexts/AuthContext";
import "./styles/globals.css";

createRoot(document.getElementById("root")!).render(
  <StrictMode>
    <BrowserRouter>
      <AuthProvider>
        <App />
      </AuthProvider>
    </BrowserRouter>
  </StrictMode>,
);
521
frontend/src/pages/AnalyticsPage.tsx
Normal file
@@ -0,0 +1,521 @@
import { useEffect, useMemo, useState } from "react";
import { Link, useNavigate, useParams } from "react-router-dom";

import Header from "@/components/layout/Header";
import { getCourseAnalytics, listCourses } from "@/lib/api";
import type { CourseAnalytics, AnalyticsTopicQuestion } from "@/types/api";

const typeLabel: Record<string, string> = {
  mc: "Multiple Choice",
  true_false: "True / False",
  fill_blank: "Fill in Blank",
  long_question: "Long Question",
  short_answer: "Short Answer",
  coding: "Coding",
};

const TYPE_COLORS: Record<string, string> = {
  mc: "bg-violet-50 text-violet-700 border-violet-200",
  true_false: "bg-amber-50 text-amber-700 border-amber-200",
  fill_blank: "bg-teal-50 text-teal-700 border-teal-200",
  long_question: "bg-sky-50 text-sky-700 border-sky-200",
  short_answer: "bg-rose-50 text-rose-700 border-rose-200",
  coding: "bg-emerald-50 text-emerald-700 border-emerald-200",
};

const DIFF_COLORS: Record<string, string> = {
  hard: "text-red-600 bg-red-50 border-red-200",
  medium: "text-amber-600 bg-amber-50 border-amber-200",
  easy: "text-green-600 bg-green-50 border-green-200",
};

type QItem = AnalyticsTopicQuestion;
type Analytics = CourseAnalytics;

const PAGE_SIZE = 8;

export default function AnalyticsPage() {
  const { courseCode } = useParams<{ courseCode?: string }>();
  const navigate = useNavigate();

  const [courses, setCourses] = useState<string[]>([]);
  const [search, setSearch] = useState("");

  useEffect(() => { listCourses().then(setCourses).catch(() => {}); }, []);
  const filtered = useMemo(() => {
    const q = search.trim().toUpperCase();
    return q ? courses.filter((c) => c.includes(q)) : courses;
  }, [courses, search]);

  const normalizedCourse = courseCode?.toUpperCase();
  const [analytics, setAnalytics] = useState<Analytics | null>(null);
  const [loading, setLoading] = useState(false);
  const [error, setError] = useState<string | null>(null);

  useEffect(() => {
    if (!normalizedCourse) return;
    let cancelled = false;
    setLoading(true);
    setAnalytics(null);
    setError(null);
    getCourseAnalytics(normalizedCourse)
      .then((data) => { if (!cancelled) { setAnalytics(data); setLoading(false); } })
      .catch((err) => { if (!cancelled) { setError(err instanceof Error ? err.message : "Failed"); setLoading(false); } });
    return () => { cancelled = true; };
  }, [normalizedCourse]);

  // ── Course picker ──
  if (!normalizedCourse) {
    return (
      <div className="min-h-screen bg-gray-50">
        <Header />
        <main className="max-w-2xl mx-auto px-6 py-12">
          <h1 className="text-2xl font-bold text-gray-900 mb-1">Analytics</h1>
          <p className="text-sm text-gray-500 mb-6">Select a course to view statistics.</p>
          <input
            type="text"
            placeholder="Search course code..."
            value={search}
            onChange={(e) => setSearch(e.target.value)}
            className="w-full px-4 py-2.5 border border-gray-300 rounded-xl text-sm focus:outline-none focus:ring-2 focus:ring-blue-500 mb-4"
          />
          {filtered.length === 0 ? (
            <p className="text-sm text-gray-400">No courses found.</p>
          ) : (
            <div className="grid grid-cols-2 gap-3">
              {filtered.map((code) => (
                <button key={code} onClick={() => navigate(`/analytics/${code}`)}
                  className="text-left px-4 py-3 bg-white border border-gray-200 rounded-xl hover:border-blue-400 hover:bg-blue-50 transition-colors">
                  <span className="font-semibold text-gray-900">{code}</span>
                </button>
              ))}
            </div>
          )}
        </main>
      </div>
    );
  }

  // ── Dashboard ──
  return (
    <div className="min-h-screen bg-gray-50">
      <Header />
      <main className="max-w-7xl mx-auto px-6 py-8">
        <div className="mb-6 flex items-center gap-3">
          <button onClick={() => navigate("/analytics")} className="text-sm text-gray-400 hover:text-gray-600">← All courses</button>
          <span className="text-gray-300">/</span>
          <h1 className="text-2xl font-bold text-gray-900">{normalizedCourse}</h1>
        </div>

        {loading && <div className="text-sm text-gray-400">Loading analytics...</div>}
        {error && <div className="text-sm text-red-600">{error}</div>}

        {!loading && !error && analytics && (
          <>
            {/* KPI row */}
            <section className="grid grid-cols-4 gap-4 mb-6">
              <KpiCard label="Papers" value={analytics.kpi.papers} />
              <KpiCard label="Questions" value={analytics.kpi.questions} />
              <KpiCard label="Topics" value={analytics.kpi.topics} />
              <KpiCard label="Avg Difficulty" value={analytics.kpi.difficulty} />
            </section>

            {/* Main area: left = search, right = charts */}
            <section className="grid grid-cols-[5fr_2fr] gap-6">
              {/* Left: Global search */}
              <GlobalSearch questions={analytics.all_questions} topics={analytics.topic_frequency.map((t) => t.label)} />

              {/* Right: Interactive charts + stats */}
              <div className="space-y-5">
                <InteractiveChart
                  topicData={analytics.topic_frequency.slice(0, 8).map((t) => ({ label: t.label, value: t.count }))}
                  typeData={analytics.question_types.map((t) => ({ label: typeLabel[t.label] ?? t.label, value: t.count }))}
                  diffData={[
                    { label: "Easy", value: analytics.difficulty_distribution.easy },
                    { label: "Medium", value: analytics.difficulty_distribution.medium },
                    { label: "Hard", value: analytics.difficulty_distribution.hard },
                  ].filter((d) => d.value > 0)}
                />

                <Panel title="High Yield Topics">
                  {analytics.high_yield_topics.length === 0 ? (
                    <div className="text-sm text-gray-400">No data yet.</div>
                  ) : (
                    <ul className="space-y-2">
                      {analytics.high_yield_topics.map((t, i) => (
                        <li key={t} className="flex items-center gap-3 text-sm text-gray-700">
                          <span className="w-6 h-6 rounded-full bg-red-50 text-red-600 flex items-center justify-center text-xs font-semibold">{i + 1}</span>
                          <span>{t}</span>
                        </li>
                      ))}
                    </ul>
                  )}
                </Panel>
              </div>
            </section>
          </>
        )}
      </main>
    </div>
  );
}

// ── Global Search Engine ──
function GlobalSearch({ questions, topics }: { questions: QItem[]; topics: string[] }) {
  const [search, setSearch] = useState("");
  const [topicFilter, setTopicFilter] = useState<string | null>(null);
  const [typeFilter, setTypeFilter] = useState<string | null>(null);
  const [yearFilter, setYearFilter] = useState<number | null>(null);
  const [termFilter, setTermFilter] = useState<string | null>(null);
  const [diffFilter, setDiffFilter] = useState<string | null>(null);
  const [visibleCount, setVisibleCount] = useState(PAGE_SIZE);

  const types = useMemo(() => [...new Set(questions.map((q) => q.question_type))].sort(), [questions]);
  const years = useMemo(() => [...new Set(questions.map((q) => q.year).filter(Boolean))].sort((a, b) => (b ?? 0) - (a ?? 0)) as number[], [questions]);
  const terms = useMemo(() => {
    const order = ["spring", "summer", "fall", "winter"];
    return [...new Set(questions.map((q) => q.term).filter(Boolean))].sort((a, b) => order.indexOf(a!) - order.indexOf(b!)) as string[];
  }, [questions]);
  const diffs = useMemo(() => [...new Set(questions.map((q) => q.difficulty).filter(Boolean))] as string[], [questions]);

  const filtered = useMemo(() => {
    const q = search.toLowerCase();
    return questions.filter((item) => {
      if (topicFilter && !item.topics?.includes(topicFilter)) return false;
      if (typeFilter && item.question_type !== typeFilter) return false;
      if (yearFilter && item.year !== yearFilter) return false;
      if (termFilter && item.term !== termFilter) return false;
      if (diffFilter && item.difficulty !== diffFilter) return false;
      if (q && !item.preview.toLowerCase().includes(q) && !item.source.toLowerCase().includes(q) && !item.question_number.toLowerCase().includes(q) && !item.topics?.some((t) => t.toLowerCase().includes(q))) return false;
      return true;
    });
  }, [questions, search, topicFilter, typeFilter, yearFilter, termFilter, diffFilter]);

  const activeCount = [topicFilter, typeFilter, yearFilter, termFilter, diffFilter].filter(Boolean).length;

  useEffect(() => setVisibleCount(PAGE_SIZE), [search, topicFilter, typeFilter, yearFilter, termFilter, diffFilter]);

  const visible = filtered.slice(0, visibleCount);
  const hasMore = visibleCount < filtered.length;

  return (
    <div className="bg-white border border-gray-200 rounded-2xl p-6">
      <h2 className="text-sm font-semibold text-gray-500 uppercase tracking-wide mb-4">Question Search</h2>

      {/* Search bar */}
      <div className="relative mb-3">
        <input
          type="text"
          value={search}
          onChange={(e) => setSearch(e.target.value)}
          placeholder="Search questions, topics, papers..."
          className="w-full pl-9 pr-3 py-2.5 text-sm border border-gray-200 rounded-xl bg-gray-50 focus:bg-white focus:outline-none focus:ring-2 focus:ring-blue-400"
        />
        <span className="absolute left-3 top-1/2 -translate-y-1/2 text-gray-400">🔍</span>
      </div>

      {/* Filter rows */}
      <div className="space-y-2 mb-3">
        {/* Topic */}
        <FilterRow label="Topic">
          <TopicCombobox topics={topics} value={topicFilter} onChange={setTopicFilter} />
        </FilterRow>

        {/* Type + Year + Term + Difficulty in one row */}
        <div className="flex items-center gap-3 flex-wrap">
          <FilterRow label="Type">
            <div className="flex gap-1 flex-wrap">
              {types.map((t) => (
                <Pill key={t} label={typeLabel[t] ?? t} active={typeFilter === t}
                  color={TYPE_COLORS[t]} onClick={() => setTypeFilter(typeFilter === t ? null : t)} />
              ))}
            </div>
          </FilterRow>

          <FilterRow label="Year">
            <div className="flex gap-1 flex-wrap">
              {years.map((y) => (
                <Pill key={y} label={String(y)} active={yearFilter === y}
                  onClick={() => setYearFilter(yearFilter === y ? null : y)} />
              ))}
            </div>
          </FilterRow>

          <FilterRow label="Term">
            <div className="flex gap-1 flex-wrap">
              {terms.map((t) => (
                <Pill key={t} label={t.charAt(0).toUpperCase() + t.slice(1)} active={termFilter === t}
                  onClick={() => setTermFilter(termFilter === t ? null : t)} />
              ))}
            </div>
          </FilterRow>

          <FilterRow label="Diff">
            <div className="flex gap-1">
              {(["easy", "medium", "hard"] as const).filter((d) => diffs.includes(d)).map((d) => (
                <Pill key={d} label={d.charAt(0).toUpperCase() + d.slice(1)} active={diffFilter === d}
                  color={DIFF_COLORS[d]} onClick={() => setDiffFilter(diffFilter === d ? null : d)} />
              ))}
            </div>
          </FilterRow>
        </div>
      </div>

      {/* Results count + clear */}
      <div className="flex items-center justify-between mb-3 pb-3 border-b border-gray-100">
        <span className="text-xs text-gray-400">
          {filtered.length} question{filtered.length !== 1 ? "s" : ""}
          {activeCount > 0 || search ? " matched" : ""}
        </span>
        {(activeCount > 0 || search) && (
          <button onClick={() => { setTopicFilter(null); setTypeFilter(null); setYearFilter(null); setTermFilter(null); setDiffFilter(null); setSearch(""); }}
            className="text-xs text-blue-500 hover:text-blue-700">Clear all</button>
        )}
      </div>

      {/* Results */}
      <div className="space-y-2">
        {visible.map((q, i) => (
          <QuestionCard key={`${q.paper_id}-${q.question_number}-${i}`} question={q} />
        ))}
      </div>

      {hasMore && (
        <button onClick={() => setVisibleCount((v) => v + PAGE_SIZE)}
          className="w-full mt-3 py-2 text-xs text-blue-600 hover:text-blue-700 bg-blue-50 rounded-xl font-medium">
          Show more ({filtered.length - visibleCount} remaining)
        </button>
      )}
      {filtered.length === 0 && (
        <div className="text-center py-6 text-sm text-gray-400">No questions match your search.</div>
      )}
    </div>
  );
}
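
The filtering logic in `GlobalSearch` combines exact-match facet filters with a case-insensitive free-text search. A minimal sketch of that idea as a pure function, which could be easier to unit test outside React, might look like the following. `SearchItem`, `SearchFilters`, `matchesFilters`, and the sample rows are hypothetical names and data introduced only for illustration, and only two of the five facets are modelled.

```typescript
// Hypothetical, trimmed-down shape of a searchable question row.
interface SearchItem {
  preview: string;
  source: string;
  question_number: string;
  question_type: string;
  topics?: string[];
}

// Hypothetical filter state; null means "no filter on this facet".
interface SearchFilters {
  text: string;
  topic: string | null;
  type: string | null;
}

function matchesFilters(item: SearchItem, f: SearchFilters): boolean {
  const topics = item.topics || [];
  // Facet filters are exact matches.
  if (f.topic && topics.indexOf(f.topic) === -1) return false;
  if (f.type && item.question_type !== f.type) return false;
  // Free text matches if any searchable field contains it, ignoring case.
  const q = f.text.trim().toLowerCase();
  if (!q) return true;
  const haystack = [item.preview, item.source, item.question_number].concat(topics);
  return haystack.some((s) => s.toLowerCase().indexOf(q) !== -1);
}

// Made-up sample rows.
const sampleItems: SearchItem[] = [
  { preview: "Compute the gradient of the loss", source: "2024 Spring Final", question_number: "3_a", question_type: "long_question", topics: ["Backpropagation"] },
  { preview: "Which kernel is valid?", source: "2023 Spring Midterm", question_number: "1", question_type: "mc", topics: ["SVM"] },
];

const hits = sampleItems.filter((item) =>
  matchesFilters(item, { text: "spring", topic: null, type: "mc" }),
);
console.log(hits.map((h) => h.question_number)); // ["1"]
```

Keeping the predicate separate from the component is one possible design choice; the component's `useMemo` could then call it inside `questions.filter(...)`.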

// ── Interactive Pie Chart ──
const PIE_PALETTE = [
  "#3B82F6", "#8B5CF6", "#F59E0B", "#10B981", "#EF4444",
  "#EC4899", "#06B6D4", "#F97316", "#6366F1", "#14B8A6",
];

function InteractiveChart({ topicData, typeData, diffData }: {
  topicData: { label: string; value: number }[];
  typeData: { label: string; value: number }[];
  diffData: { label: string; value: number }[];
}) {
  const [view, setView] = useState<"topic" | "type" | "difficulty">("topic");
  const [hovered, setHovered] = useState<number | null>(null);

  const data = view === "topic" ? topicData : view === "type" ? typeData : diffData;
  const colors = view === "difficulty"
    ? ["#10B981", "#F59E0B", "#EF4444"]
    : PIE_PALETTE;

  const total = data.reduce((s, d) => s + d.value, 0);

  // Build conic-gradient
  let cumPct = 0;
  const segments = data.map((d, i) => {
    const pct = total ? (d.value / total) * 100 : 0;
    const start = cumPct;
    cumPct += pct;
    return { ...d, pct, start, end: cumPct, color: colors[i % colors.length] };
  });

  const gradient = segments
    .map((s) => `${s.color} ${s.start}% ${s.end}%`)
    .join(", ");

  return (
    <section className="bg-white border border-gray-200 rounded-2xl p-5">
      {/* Tab switcher */}
      <div className="flex gap-1 mb-4">
        {(["topic", "type", "difficulty"] as const).map((t) => (
          <button key={t} onClick={() => { setView(t); setHovered(null); }}
            className={`text-xs px-3 py-1.5 rounded-lg font-medium transition-colors ${
              view === t ? "bg-gray-900 text-white" : "bg-gray-100 text-gray-500 hover:text-gray-700"
            }`}>
            {t === "topic" ? "Topics" : t === "type" ? "Types" : "Difficulty"}
          </button>
        ))}
      </div>

      {/* Pie */}
      <div className="flex items-center gap-4">
        <div className="relative w-36 h-36 shrink-0">
          <div
            className="w-full h-full rounded-full"
            style={{ background: `conic-gradient(${gradient})` }}
          />
          <div className="absolute inset-3 bg-white rounded-full flex items-center justify-center">
            {hovered !== null ? (
              <div className="text-center">
                <div className="text-lg font-bold text-gray-900">{segments[hovered].value}</div>
                <div className="text-[9px] text-gray-400">{segments[hovered].pct.toFixed(0)}%</div>
              </div>
            ) : (
              <div className="text-center">
                <div className="text-lg font-bold text-gray-900">{total}</div>
                <div className="text-[9px] text-gray-400">total</div>
              </div>
            )}
          </div>
        </div>

        {/* Legend */}
        <div className="flex-1 space-y-1 max-h-36 overflow-y-auto">
          {segments.map((s, i) => (
            <div
              key={s.label}
              onMouseEnter={() => setHovered(i)}
              onMouseLeave={() => setHovered(null)}
              className={`flex items-center gap-2 px-2 py-1 rounded-lg cursor-default transition-colors ${
                hovered === i ? "bg-gray-50" : ""
              }`}
            >
              <span className="w-2.5 h-2.5 rounded-full shrink-0" style={{ backgroundColor: s.color }} />
              <span className="text-xs text-gray-700 flex-1 truncate">{s.label}</span>
              <span className="text-xs text-gray-400 tabular-nums">{s.value}</span>
            </div>
          ))}
        </div>
      </div>
    </section>
  );
}
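Reviewer note: the segment math in the component above does not depend on React, so it could be pulled into pure helpers and checked in isolation. The following is a minimal sketch, not part of this commit. The names `buildSegments` and `toConicGradient` and both fallback colors are assumptions made for illustration.

```typescript
interface Slice { label: string; value: number }
interface Segment extends Slice { pct: number; start: number; end: number; color: string }

// Convert raw counts into cumulative percentage ranges, cycling through the palette.
function buildSegments(data: Slice[], colors: string[]): Segment[] {
  const total = data.reduce((sum, d) => sum + d.value, 0);
  let cumPct = 0;
  return data.map((d, i) => {
    const pct = total ? (d.value / total) * 100 : 0;
    const start = cumPct;
    cumPct += pct;
    // Fallback color (assumed) covers an empty palette.
    const color = colors[i % colors.length] ?? "#9CA3AF";
    return { ...d, pct, start, end: cumPct, color };
  });
}

// Build the CSS background value; zero-width slices are skipped.
function toConicGradient(segments: Segment[], fallback = "#E5E7EB"): string {
  const stops = segments
    .filter((s) => s.pct > 0)
    .map((s) => `${s.color} ${s.start}% ${s.end}%`);
  return stops.length ? `conic-gradient(${stops.join(", ")})` : fallback;
}
```

An empty stop list would produce `conic-gradient()`, which is not a valid CSS value, hence the plain-color fallback when there is nothing to draw.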
|
||||
|
||||
// ── Shared components ──
|
||||
function QuestionCard({ question: q }: { question: QItem }) {
|
||||
const typeColor = TYPE_COLORS[q.question_type] ?? "bg-gray-50 text-gray-600 border-gray-200";
|
||||
const cleanPreview = (q.preview || "")
|
||||
.replace(/^Problem\s+\d+\s*\[.*?\]\s*/i, "")
|
||||
.replace(/^(True\/False Questions?\s*)?Indicate whether.*?(answer\.\s*)/i, "")
|
||||
.trim();
|
||||
|
||||
return (
|
||||
<Link to={`/paper/${q.paper_id}`}
|
||||
className="flex items-start gap-3 bg-gray-50 border border-gray-200 rounded-xl px-3.5 py-2.5 hover:border-blue-300 hover:bg-white hover:shadow-sm transition-all group">
|
||||
<span className="shrink-0 inline-flex items-center justify-center w-8 h-8 rounded-lg bg-blue-600 text-white text-xs font-bold mt-0.5">
|
||||
{q.question_number}
|
||||
</span>
|
||||
<div className="flex-1 min-w-0">
|
||||
<div className="flex items-center gap-1.5 mb-1 flex-wrap">
|
||||
<span className="text-xs font-medium text-blue-600">{q.source}</span>
|
||||
<span className="text-gray-300">·</span>
|
||||
<span className={`text-[10px] px-1.5 py-0.5 rounded border font-medium ${typeColor}`}>
|
||||
{typeLabel[q.question_type] ?? q.question_type}
|
||||
</span>
|
||||
{q.difficulty && (
|
||||
<>
|
||||
<span className="text-gray-300">·</span>
|
||||
<span className={`text-[10px] px-1.5 py-0.5 rounded border font-medium ${DIFF_COLORS[q.difficulty] ?? ""}`}>
|
||||
{q.difficulty}
|
||||
</span>
|
||||
</>
|
||||
)}
|
||||
{q.topics?.slice(0, 2).map((t) => (
|
||||
<span key={t} className="text-[10px] px-1.5 py-0.5 rounded bg-gray-100 text-gray-500 border border-gray-200">{t}</span>
|
||||
))}
|
||||
</div>
|
||||
<p className="text-xs text-gray-600 line-clamp-2 leading-relaxed">{cleanPreview || q.preview}</p>
|
||||
</div>
|
||||
<span className="shrink-0 text-gray-300 group-hover:text-blue-500 text-sm pt-1">→</span>
|
||||
</Link>
|
||||
);
|
||||
}
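Reviewer note: the preview cleanup in `QuestionCard` is a pure string transform, so it could live in a shared helper and be covered by a small test. The sketch below reuses the two regular expressions from the component. The function name `cleanPreview` as a standalone export is an assumption, and the sample strings in the comments are made up.

```typescript
// Strip exam boilerplate from the start of a question preview.
function cleanPreview(preview: string | null | undefined): string {
  return (preview || "")
    // e.g. "Problem 3 [10 points] ..." loses the "Problem 3 [10 points]" prefix.
    .replace(/^Problem\s+\d+\s*\[.*?\]\s*/i, "")
    // e.g. "True/False Questions Indicate whether ... answer. ..." loses the instructions.
    .replace(/^(True\/False Questions?\s*)?Indicate whether.*?(answer\.\s*)/i, "")
    .trim();
}
```

`ErrorCard` in `ErrorBookPage.tsx` applies the first of these patterns on its own, so a shared helper would keep the two previews consistent.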
|
||||
|
||||
function FilterRow({ label, children }: { label: string; children: React.ReactNode }) {
|
||||
return (
|
||||
<div className="flex items-center gap-1.5">
|
||||
<span className="text-[10px] text-gray-400 w-10 shrink-0">{label}</span>
|
||||
{children}
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
function Pill({ label, active, color, onClick }: { label: string; active: boolean; color?: string; onClick: () => void }) {
|
||||
return (
|
||||
<button onClick={onClick}
|
||||
className={`text-[10px] px-2 py-1 rounded-full border font-medium transition-colors whitespace-nowrap ${
|
||||
active ? (color ?? "bg-blue-50 text-blue-700 border-blue-200") : "bg-white text-gray-400 border-gray-200 hover:text-gray-600"
|
||||
}`}>
|
||||
{label}
|
||||
</button>
|
||||
);
|
||||
}
|
||||
|
||||
function KpiCard({ label, value }: { label: string; value: string | number }) {
|
||||
return (
|
||||
<div className="bg-white border border-gray-200 rounded-2xl p-5">
|
||||
<div className="text-2xl font-semibold text-gray-900">{value}</div>
|
||||
<div className="text-xs uppercase tracking-wide text-gray-400 mt-2">{label}</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
function Panel({ title, children }: { title: string; children: React.ReactNode }) {
|
||||
return (
|
||||
<section className="bg-white border border-gray-200 rounded-2xl p-5">
|
||||
<h2 className="text-sm font-semibold text-gray-500 uppercase tracking-wide mb-4">{title}</h2>
|
||||
{children}
|
||||
</section>
|
||||
);
|
||||
}
|
||||
|
||||
function TopicCombobox({ topics, value, onChange }: { topics: string[]; value: string | null; onChange: (v: string | null) => void }) {
|
||||
const [input, setInput] = useState("");
|
||||
const [open, setOpen] = useState(false);
|
||||
|
||||
const filtered = useMemo(() => {
|
||||
const q = input.toLowerCase();
|
||||
return q ? topics.filter((t) => t.toLowerCase().includes(q)) : topics;
|
||||
}, [topics, input]);
|
||||
|
||||
const handleSelect = (t: string | null) => {
|
||||
onChange(t);
|
||||
setInput(t ?? "");
|
||||
setOpen(false);
|
||||
};
|
||||
|
||||
return (
|
||||
<div className="relative">
|
||||
<div className="flex items-center gap-1">
|
||||
<input
|
||||
type="text"
|
||||
value={value ? (input || value) : input}
|
||||
onChange={(e) => { setInput(e.target.value); setOpen(true); if (!e.target.value) onChange(null); }}
|
||||
onFocus={() => setOpen(true)}
|
||||
placeholder="All Topics"
|
||||
className="text-xs border border-gray-200 rounded-lg px-2 py-1.5 bg-white focus:outline-none focus:ring-1 focus:ring-blue-400 w-48"
|
||||
/>
|
||||
{value && (
|
||||
<button onClick={() => { onChange(null); setInput(""); }} className="text-gray-400 hover:text-gray-600 text-xs">✕</button>
|
||||
)}
|
||||
</div>
|
||||
{open && filtered.length > 0 && (
|
||||
<div className="absolute z-20 top-full mt-1 w-56 max-h-48 overflow-y-auto bg-white border border-gray-200 rounded-lg shadow-lg">
|
||||
{filtered.map((t) => (
|
||||
<button
|
||||
key={t}
|
||||
onClick={() => handleSelect(t)}
|
||||
className={`w-full text-left px-3 py-1.5 text-xs hover:bg-blue-50 transition-colors ${value === t ? "bg-blue-50 text-blue-700 font-medium" : "text-gray-700"}`}
|
||||
>
|
||||
{t}
|
||||
</button>
|
||||
))}
|
||||
</div>
|
||||
)}
|
||||
{open && <div className="fixed inset-0 z-10" onClick={() => setOpen(false)} />}
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
function DiffStat({ label, value }: { label: string; value: number }) {
|
||||
return (
|
||||
<div className="bg-gray-50 rounded-xl px-3 py-4">
|
||||
<div className="text-xl font-semibold text-gray-900">{value}</div>
|
||||
<div className="text-xs uppercase tracking-wide text-gray-400 mt-1">{label}</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
296
frontend/src/pages/ErrorBookPage.tsx
Normal file
@@ -0,0 +1,296 @@
|
||||
import { useEffect, useMemo, useState } from "react";
|
||||
import { Link } from "react-router-dom";
|
||||
|
||||
import Header from "@/components/layout/Header";
|
||||
import KaTeXRenderer from "@/components/shared/KaTeXRenderer";
|
||||
import { getErrorBook, updateAttempt, getFavoriteVariants, updateVariant } from "@/lib/api";
|
||||
import { useAuth } from "@/contexts/AuthContext";
|
||||
import type { UserAttempt, QuestionVariant } from "@/types/api";
|
||||
|
||||
const typeLabel: Record<string, string> = {
|
||||
mc: "Multiple Choice",
|
||||
true_false: "True / False",
|
||||
fill_blank: "Fill in Blank",
|
||||
long_question: "Long Question",
|
||||
short_answer: "Short Answer",
|
||||
coding: "Coding",
|
||||
};
|
||||
|
||||
const TYPE_COLORS: Record<string, string> = {
|
||||
mc: "bg-violet-50 text-violet-700",
|
||||
true_false: "bg-amber-50 text-amber-700",
|
||||
fill_blank: "bg-teal-50 text-teal-700",
|
||||
long_question: "bg-sky-50 text-sky-700",
|
||||
short_answer: "bg-rose-50 text-rose-700",
|
||||
coding: "bg-emerald-50 text-emerald-700",
|
||||
};
|
||||
|
||||
const DIFF_COLORS: Record<string, string> = {
|
||||
easy: "text-green-600",
|
||||
medium: "text-amber-600",
|
||||
hard: "text-red-600",
|
||||
};
|
||||
|
||||
export default function ErrorBookPage() {
|
||||
const { user } = useAuth();
|
||||
const [entries, setEntries] = useState<UserAttempt[]>([]);
|
||||
const [favoriteVariants, setFavoriteVariants] = useState<QuestionVariant[]>([]);
|
||||
const [loading, setLoading] = useState(true);
|
||||
const [error, setError] = useState<string | null>(null);
|
||||
const [courseFilter, setCourseFilter] = useState<string>("all");
|
||||
|
||||
useEffect(() => {
|
||||
if (!user) { setEntries([]); setFavoriteVariants([]); setLoading(false); return; }
|
||||
let cancelled = false;
|
||||
setLoading(true);
|
||||
Promise.all([getErrorBook(), getFavoriteVariants()])
|
||||
.then(([attempts, variants]) => {
|
||||
if (cancelled) return;
|
||||
setEntries(attempts);
|
||||
setFavoriteVariants(variants);
|
||||
setLoading(false);
|
||||
})
|
||||
.catch((err) => {
|
||||
if (cancelled) return;
|
||||
setError(err instanceof Error ? err.message : "Failed to load error book");
|
||||
setLoading(false);
|
||||
});
|
||||
return () => { cancelled = true; };
|
||||
}, [user]);
|
||||
|
||||
const courses = useMemo(
|
||||
() => Array.from(new Set(
|
||||
entries.map((e) => e.paper_questions?.paper?.course_code).filter((v): v is string => Boolean(v)),
|
||||
)).sort(),
|
||||
[entries],
|
||||
);
|
||||
|
||||
const filteredEntries = useMemo(() => {
|
||||
if (courseFilter === "all") return entries;
|
||||
return entries.filter((e) => e.paper_questions?.paper?.course_code === courseFilter);
|
||||
}, [courseFilter, entries]);
|
||||
|
||||
async function handleMarkMastered(attemptId: string) {
try {
await updateAttempt(attemptId, { mastered: true });
setEntries((prev) => prev.filter((e) => e.id !== attemptId));
} catch (err) {
setError(err instanceof Error ? err.message : "Failed to update entry");
}
}

async function handleRemove(attemptId: string) {
try {
await updateAttempt(attemptId, { in_error_book: false });
setEntries((prev) => prev.filter((e) => e.id !== attemptId));
} catch (err) {
setError(err instanceof Error ? err.message : "Failed to update entry");
}
}

async function handleUnfavoriteVariant(variantId: string) {
try {
await updateVariant(variantId, { favorited: false });
setFavoriteVariants((prev) => prev.filter((v) => v.id !== variantId));
} catch (err) {
setError(err instanceof Error ? err.message : "Failed to update variant");
}
}
|
||||
|
||||
return (
|
||||
<div className="min-h-screen bg-gray-50">
|
||||
<Header />
|
||||
<main className="max-w-4xl mx-auto px-6 py-8">
|
||||
{/* Header */}
|
||||
<div className="flex items-end justify-between gap-4 mb-6">
|
||||
<div>
|
||||
<h1 className="text-2xl font-bold text-gray-900">Error Book</h1>
|
||||
<p className="text-sm text-gray-500 mt-1">Review your mistakes and track progress.</p>
|
||||
</div>
|
||||
<div className="flex gap-3 text-sm">
|
||||
<StatCard label="To Review" value={filteredEntries.length} color="red" />
|
||||
<StatCard label="Courses" value={courses.length} color="blue" />
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{/* Course filter */}
|
||||
<div className="flex gap-2 mb-6 flex-wrap">
|
||||
<Pill active={courseFilter === "all"} onClick={() => setCourseFilter("all")} label="All" />
|
||||
{courses.map((c) => (
|
||||
<Pill key={c} active={courseFilter === c} onClick={() => setCourseFilter(c)} label={c} />
|
||||
))}
|
||||
</div>
|
||||
|
||||
{!user && (
|
||||
<div className="bg-white border border-gray-200 rounded-xl p-12 text-center">
|
||||
<div className="text-3xl mb-3">🔒</div>
|
||||
<p className="text-gray-500 mb-4">Sign in to unlock your Error Book</p>
|
||||
<Link to="/login" className="inline-block px-5 py-2 bg-indigo-600 text-white text-sm font-medium rounded-lg hover:bg-indigo-700 transition-colors">
|
||||
Sign in
|
||||
</Link>
|
||||
</div>
|
||||
)}
|
||||
{user && loading && <div className="text-sm text-gray-400">Loading...</div>}
|
||||
{user && error && <div className="text-sm text-red-600">{error}</div>}
|
||||
|
||||
{user && !loading && !error && filteredEntries.length === 0 && favoriteVariants.length === 0 && (
|
||||
<div className="bg-white border border-gray-200 rounded-xl p-12 text-center">
|
||||
<div className="text-3xl mb-3">🎉</div>
|
||||
<p className="text-gray-500">No mistakes yet. Keep practicing!</p>
|
||||
</div>
|
||||
)}
|
||||
|
||||
{/* Saved variants */}
|
||||
{favoriteVariants.length > 0 && (
|
||||
<div className="mb-8">
|
||||
<h2 className="text-xs font-semibold text-gray-400 uppercase tracking-wide mb-3">
|
||||
Saved Variants ({favoriteVariants.length})
|
||||
</h2>
|
||||
<div className="space-y-2">
|
||||
{favoriteVariants.map((v) => (
|
||||
<div key={v.id} className="flex items-center gap-3 bg-white border border-yellow-200 rounded-xl px-4 py-3">
|
||||
<span className="text-yellow-400">★</span>
|
||||
<div className="flex-1 min-w-0">
|
||||
<span className="text-sm font-medium text-gray-700">Variant of Q{v.source_question_number}</span>
|
||||
<p className="text-xs text-gray-500 truncate">{v.variant_data.question_text?.replace(/<[^>]*>/g, "").slice(0, 100)}</p>
|
||||
</div>
|
||||
<button onClick={() => void handleUnfavoriteVariant(v.id)} className="text-xs text-gray-400 hover:text-red-500">Remove</button>
|
||||
</div>
|
||||
))}
|
||||
</div>
|
||||
</div>
|
||||
)}
|
||||
|
||||
{/* Error entries */}
|
||||
<div className="space-y-4">
|
||||
{filteredEntries.map((entry) => (
|
||||
<ErrorCard
|
||||
key={entry.id}
|
||||
entry={entry}
|
||||
onMastered={() => void handleMarkMastered(entry.id)}
|
||||
onRemove={() => void handleRemove(entry.id)}
|
||||
/>
|
||||
))}
|
||||
</div>
|
||||
</main>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
function ErrorCard({ entry, onMastered, onRemove }: { entry: UserAttempt; onMastered: () => void; onRemove: () => void }) {
|
||||
const [showFeedback, setShowFeedback] = useState(true);
|
||||
const question = entry.paper_questions;
|
||||
if (!question) return null;
|
||||
|
||||
const courseCode = question.paper?.course_code;
|
||||
const paperId = question.paper?.id;
|
||||
const paper = question.paper;
|
||||
const paperInfo = paper ? `${paper.year} ${paper.term} ${paper.exam_type}` : "";
|
||||
const typeColor = TYPE_COLORS[question.question_type] ?? "bg-gray-100 text-gray-600";
|
||||
const diffColor = DIFF_COLORS[question.difficulty ?? ""] ?? "";
|
||||
|
||||
// Clean preview: strip boilerplate
|
||||
const preview = (question.question_text || "")
|
||||
.replace(/^Problem\s+\d+\s*\[.*?\]\s*/i, "")
|
||||
.slice(0, 200);
|
||||
|
||||
return (
|
||||
<article className="bg-white border border-gray-200 rounded-xl overflow-hidden">
|
||||
{/* Header */}
|
||||
<div className="px-5 pt-4 pb-3">
|
||||
<div className="flex items-start justify-between gap-3">
|
||||
<div className="flex items-center gap-2 flex-wrap">
|
||||
<span className="inline-flex items-center justify-center w-9 h-9 rounded-lg bg-red-600 text-white text-sm font-bold">
|
||||
{question.question_number}
|
||||
</span>
|
||||
<div>
|
||||
<div className="flex items-center gap-1.5">
|
||||
<span className={`text-[11px] px-2 py-0.5 rounded-full font-medium ${typeColor}`}>
|
||||
{typeLabel[question.question_type] ?? question.question_type}
|
||||
</span>
|
||||
{question.difficulty && (
|
||||
<span className={`text-[11px] font-medium ${diffColor}`}>{question.difficulty}</span>
|
||||
)}
|
||||
{courseCode && (
|
||||
<Link to={`/analytics/${courseCode}`} className="text-[11px] px-2 py-0.5 rounded-full bg-blue-50 text-blue-700 hover:bg-blue-100">
|
||||
{courseCode}
|
||||
</Link>
|
||||
)}
|
||||
</div>
|
||||
<div className="text-[11px] text-gray-400 mt-0.5">
|
||||
{paperId ? <Link to={`/paper/${paperId}`} className="hover:text-blue-600">{paperInfo}</Link> : paperInfo}
|
||||
{" · "}
|
||||
{new Date(entry.created_at).toLocaleDateString("en-CA")}
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{/* Score badge */}
|
||||
{entry.feedback && (
|
||||
<div className="flex items-center gap-1 bg-red-50 border border-red-200 rounded-lg px-2.5 py-1">
|
||||
<span className="text-red-600 text-sm font-bold">✗</span>
|
||||
<span className="text-xs text-red-600 font-medium">Incorrect</span>
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
|
||||
{/* Question preview */}
|
||||
<p className="text-sm text-gray-600 mt-3 line-clamp-2">{preview}</p>
|
||||
|
||||
{/* Topics */}
|
||||
{question.topics && question.topics.length > 0 && (
|
||||
<div className="flex gap-1 mt-2 flex-wrap">
|
||||
{question.topics.slice(0, 4).map((t) => (
|
||||
<span key={t} className="text-[10px] px-1.5 py-0.5 rounded bg-gray-100 text-gray-500">{t}</span>
|
||||
))}
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
|
||||
{/* AI Feedback section */}
|
||||
{entry.feedback && (
|
||||
<div className="border-t border-gray-100">
|
||||
<button
|
||||
onClick={() => setShowFeedback((v) => !v)}
|
||||
className="w-full flex items-center justify-between px-5 py-2.5 text-xs font-medium text-blue-700 bg-blue-50/50 hover:bg-blue-50"
|
||||
>
|
||||
<span>AI Feedback</span>
|
||||
<span>{showFeedback ? "▲" : "▼"}</span>
|
||||
</button>
|
||||
{showFeedback && (
|
||||
<div className="px-5 py-4 bg-white">
|
||||
<KaTeXRenderer html={entry.feedback} className="text-sm text-gray-700 leading-relaxed" />
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
)}
|
||||
|
||||
{/* Actions */}
|
||||
<div className="border-t border-gray-100 px-5 py-2.5 flex items-center gap-4 bg-gray-50/50">
|
||||
{paperId && (
|
||||
<Link to={`/paper/${paperId}`} className="text-xs font-medium text-blue-600 hover:text-blue-700">
|
||||
Open paper →
|
||||
</Link>
|
||||
)}
|
||||
<button onClick={onMastered} className="text-xs font-medium text-green-600 hover:text-green-700">
|
||||
Mark mastered
|
||||
</button>
|
||||
<button onClick={onRemove} className="text-xs font-medium text-gray-400 hover:text-gray-600">
|
||||
Remove
|
||||
</button>
|
||||
</div>
|
||||
</article>
|
||||
);
|
||||
}
|
||||
|
||||
function StatCard({ label, value, color }: { label: string; value: number; color: string }) {
|
||||
const bg = color === "red" ? "bg-red-50 border-red-200" : "bg-blue-50 border-blue-200";
|
||||
const text = color === "red" ? "text-red-700" : "text-blue-700";
|
||||
return (
|
||||
<div className={`border rounded-xl px-4 py-2.5 ${bg}`}>
|
||||
<div className={`text-xl font-bold ${text}`}>{value}</div>
|
||||
<div className="text-[10px] uppercase tracking-wide text-gray-400 mt-0.5">{label}</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
function Pill({ active, onClick, label }: { active: boolean; onClick: () => void; label: string }) {
|
||||
return (
|
||||
<button
|
||||
onClick={onClick}
|
||||
className={`px-3 py-1.5 text-xs font-medium rounded-full border transition-colors ${
|
||||
active ? "bg-gray-900 text-white border-gray-900" : "bg-white text-gray-600 border-gray-200 hover:border-gray-300"
|
||||
}`}
|
||||
>
|
||||
{label}
|
||||
</button>
|
||||
);
|
||||
}
|
||||
705
frontend/src/pages/HomePage.tsx
Normal file
@@ -0,0 +1,705 @@
|
||||
import { useEffect, useRef, useState } from "react";
|
||||
import { Link, useNavigate } from "react-router-dom";
|
||||
import { listPapers, myPapers } from "@/lib/api";
|
||||
import { useAuth } from "@/contexts/AuthContext";
|
||||
import type { Paper } from "@/types/api";
|
||||
|
||||
function getWorkedIds(userId: string): string[] {
|
||||
try {
|
||||
const raw = localStorage.getItem(`worked_papers_${userId}`);
|
||||
return raw ? JSON.parse(raw) : [];
|
||||
} catch { return []; }
|
||||
}
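Reviewer note: `JSON.parse` returns whatever happens to be stored under the key, so a more defensive variant could validate the shape before the result is treated as a list of IDs. This is a minimal sketch, assuming a hypothetical pure helper `parseWorkedIds` that receives the raw string (so it can be tested without `localStorage`).

```typescript
// Parse a stored JSON value into a list of paper IDs, tolerating bad input.
function parseWorkedIds(raw: string | null): string[] {
  if (!raw) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    // Accept only arrays, and keep only string entries.
    return Array.isArray(parsed)
      ? parsed.filter((id): id is string => typeof id === "string")
      : [];
  } catch {
    // Malformed JSON is treated the same as "nothing stored".
    return [];
  }
}
```

With this shape, `getWorkedIds` would reduce to reading the key and passing the raw value through the parser.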
|
||||
|
||||
const fontSora = { fontFamily: "'Sora', sans-serif" };
|
||||
const fontMono = { fontFamily: "'IBM Plex Mono', monospace" };
|
||||
|
||||
/* ── Feature cards data ── */
|
||||
const FEATURES = [
|
||||
{
|
||||
icon: (
|
||||
<svg className="w-6 h-6" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1.5}>
|
||||
<path strokeLinecap="round" strokeLinejoin="round" d="M9.813 15.904L9 18.75l-.813-2.846a4.5 4.5 0 00-3.09-3.09L2.25 12l2.846-.813a4.5 4.5 0 003.09-3.09L9 5.25l.813 2.846a4.5 4.5 0 003.09 3.09L15.75 12l-2.846.813a4.5 4.5 0 00-3.09 3.09zM18.259 8.715L18 9.75l-.259-1.035a3.375 3.375 0 00-2.455-2.456L14.25 6l1.036-.259a3.375 3.375 0 002.455-2.456L18 2.25l.259 1.035a3.375 3.375 0 002.455 2.456L21.75 6l-1.036.259a3.375 3.375 0 00-2.455 2.456z" />
|
||||
</svg>
|
||||
),
|
||||
title: "AI Analysis",
|
||||
desc: "Every question gets knowledge reminders, hints, and step-by-step solutions.",
|
||||
color: "#6366F1",
|
||||
},
|
||||
{
|
||||
icon: (
|
||||
<svg className="w-6 h-6" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1.5}>
|
||||
<path strokeLinecap="round" strokeLinejoin="round" d="M12 6.042A8.967 8.967 0 006 3.75c-1.052 0-2.062.18-3 .512v14.25A8.987 8.987 0 016 18c2.305 0 4.408.867 6 2.292m0-14.25a8.966 8.966 0 016-2.292c1.052 0 2.062.18 3 .512v14.25A8.987 8.987 0 0018 18a8.967 8.967 0 00-6 2.292m0-14.25v14.25" />
|
||||
</svg>
|
||||
),
|
||||
title: "Smart Error Book",
|
||||
desc: "Auto-collect mistakes with AI feedback. Review, understand, and master.",
|
||||
color: "#E11D48",
|
||||
},
|
||||
{
|
||||
icon: (
|
||||
<svg className="w-6 h-6" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1.5}>
|
||||
<path strokeLinecap="round" strokeLinejoin="round" d="M3 13.125C3 12.504 3.504 12 4.125 12h2.25c.621 0 1.125.504 1.125 1.125v6.75C7.5 20.496 6.996 21 6.375 21h-2.25A1.125 1.125 0 013 19.875v-6.75zM9.75 8.625c0-.621.504-1.125 1.125-1.125h2.25c.621 0 1.125.504 1.125 1.125v11.25c0 .621-.504 1.125-1.125 1.125h-2.25a1.125 1.125 0 01-1.125-1.125V8.625zM16.5 4.125c0-.621.504-1.125 1.125-1.125h2.25C20.496 3 21 3.504 21 4.125v15.75c0 .621-.504 1.125-1.125 1.125h-2.25a1.125 1.125 0 01-1.125-1.125V4.125z" />
|
||||
</svg>
|
||||
),
|
||||
title: "Course Analytics",
|
||||
desc: "Topic frequency, difficulty distribution, and high-yield focus areas.",
|
||||
color: "#0D9488",
|
||||
},
|
||||
{
|
||||
icon: (
|
||||
<svg className="w-6 h-6" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1.5}>
|
||||
<path strokeLinecap="round" strokeLinejoin="round" d="M19.5 12c0-1.232-.046-2.453-.138-3.662a4.006 4.006 0 00-3.7-3.7 48.678 48.678 0 00-7.324 0 4.006 4.006 0 00-3.7 3.7c-.017.22-.032.441-.046.662M19.5 12l3-3m-3 3l-3-3m-12 3c0 1.232.046 2.453.138 3.662a4.006 4.006 0 003.7 3.7 48.656 48.656 0 007.324 0 4.006 4.006 0 003.7-3.7c.017-.22.032-.441.046-.662M4.5 12l3 3m-3-3l-3 3" />
|
||||
</svg>
|
||||
),
|
||||
title: "Variant Generation",
|
||||
desc: "Generate unlimited similar questions for extra practice on weak topics.",
|
||||
color: "#7C3AED",
|
||||
},
|
||||
];
|
||||
|
||||
/* ── Filter options ── */
|
||||
const COURSE_OPTIONS = ["COMP2011", "COMP2211", "MATH1014", "PHYS1112", "MATH2023", "ELEC2100"];
|
||||
const TERM_OPTIONS = ["spring", "fall"];
|
||||
const TYPE_OPTIONS = ["midterm", "final"];
|
||||
|
||||
/* ── Chevron SVG ── */
|
||||
function ChevronDown({ className = "" }: { className?: string }) {
|
||||
return (
|
||||
<svg className={className} fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2.5}>
|
||||
<path strokeLinecap="round" strokeLinejoin="round" d="M19 9l-7 7-7-7" />
|
||||
</svg>
|
||||
);
|
||||
}
|
||||
|
||||
/* ── Dropdown select component ── */
|
||||
function Dropdown({
|
||||
label,
|
||||
value,
|
||||
options,
|
||||
onChange,
|
||||
}: {
|
||||
label: string;
|
||||
value: string | null;
|
||||
options: { value: string; label: string }[];
|
||||
onChange: (v: string | null) => void;
|
||||
}) {
|
||||
const [open, setOpen] = useState(false);
|
||||
const ref = useRef<HTMLDivElement>(null);
|
||||
|
||||
useEffect(() => {
|
||||
const handler = (e: MouseEvent) => {
|
||||
if (ref.current && !ref.current.contains(e.target as Node)) setOpen(false);
|
||||
};
|
||||
document.addEventListener("mousedown", handler);
|
||||
return () => document.removeEventListener("mousedown", handler);
|
||||
}, []);
|
||||
|
||||
const selected = options.find((o) => o.value === value);
|
||||
|
||||
return (
|
||||
<div ref={ref} className="relative" style={{ minWidth: 150 }}>
|
||||
<div className="text-[11px] font-semibold text-indigo-300 uppercase tracking-wider mb-1.5" style={fontSora}>
|
||||
{label}
|
||||
</div>
|
||||
<button
|
||||
onClick={() => setOpen(!open)}
|
||||
className="w-full flex items-center justify-between bg-white px-3.5 py-2.5 text-sm cursor-pointer whitespace-nowrap"
|
||||
style={{ borderRadius: 0, ...fontMono }}
|
||||
>
|
||||
<span className={`${selected ? "text-slate-800 font-semibold" : "text-slate-400"} mr-2`}>
|
||||
{selected ? selected.label : `All ${label}s`}
|
||||
</span>
|
||||
<ChevronDown className={`w-4 h-4 text-slate-400 transition-transform ${open ? "rotate-180" : ""}`} />
|
||||
</button>
|
||||
{open && (
|
||||
<div
|
||||
className="absolute top-full left-0 right-0 mt-1 bg-white shadow-lg z-50 overflow-hidden"
|
||||
style={{ borderRadius: 0, border: "1px solid #E2E8F0" }}
|
||||
>
|
||||
<button
|
||||
onClick={() => { onChange(null); setOpen(false); }}
|
||||
className={`w-full text-left px-3.5 py-2 text-sm hover:bg-indigo-50 transition-colors ${
|
||||
!value ? "text-indigo-600 font-semibold bg-indigo-50/50" : "text-slate-500"
|
||||
}`}
|
||||
style={fontMono}
|
||||
>
|
||||
All {label}s
|
||||
</button>
|
||||
{options.map((o) => (
|
||||
<button
|
||||
key={o.value}
|
||||
onClick={() => { onChange(o.value); setOpen(false); }}
|
||||
className={`w-full text-left px-3.5 py-2 text-sm hover:bg-indigo-50 transition-colors ${
|
||||
value === o.value ? "text-indigo-600 font-semibold bg-indigo-50/50" : "text-slate-600"
|
||||
}`}
|
||||
style={fontMono}
|
||||
>
|
||||
{o.label}
|
||||
</button>
|
||||
))}
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
export default function HomePage() {
|
||||
const navigate = useNavigate();
|
||||
const { user, signOut } = useAuth();
|
||||
const [papers, setPapers] = useState<Paper[]>([]);
|
||||
const [papersLoading, setPapersLoading] = useState(false);
|
||||
const [myUploadedPapers, setMyUploadedPapers] = useState<Paper[]>([]);
|
||||
const [workedPapers, setWorkedPapers] = useState<Paper[]>([]);
|
||||
const [courseInput, setCourseInput] = useState("");
|
||||
const [courseFilter, setCourseFilter] = useState<string | null>(null);
|
||||
const [showSuggestions, setShowSuggestions] = useState(false);
|
||||
const [termFilter, setTermFilter] = useState<string | null>(null);
|
||||
const [typeFilter, setTypeFilter] = useState<string | null>(null);
|
||||
const [analyzing, setAnalyzing] = useState(false);
|
||||
const inputRef = useRef<HTMLDivElement>(null);
|
||||
|
||||
// Autocomplete suggestions
|
||||
const suggestions = courseInput.trim()
|
||||
? COURSE_OPTIONS.filter((c) =>
|
||||
c.toLowerCase().includes(courseInput.trim().toLowerCase())
|
||||
)
|
||||
: [];
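Reviewer note: because suggestions are matched with `includes`, a match can start anywhere inside a course code, not only at the beginning. A helper that splits an option around the matched substring makes highlighting straightforward. The sketch below is an illustration only; `splitMatch` is a hypothetical name.

```typescript
// Split `option` into [before, match, after] around the first
// case-insensitive occurrence of `query`.
function splitMatch(option: string, query: string): [string, string, string] {
  const q = query.trim();
  const idx = q ? option.toLowerCase().indexOf(q.toLowerCase()) : -1;
  // No query or no match: the whole option is "before", nothing is highlighted.
  if (idx < 0) return [option, "", ""];
  return [
    option.slice(0, idx),
    option.slice(idx, idx + q.length),
    option.slice(idx + q.length),
  ];
}
```

The middle element is the part to render in bold; the other two are rendered as plain text.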
|
||||
|
||||
// Close suggestions on outside click
|
||||
useEffect(() => {
|
||||
const handler = (e: MouseEvent) => {
|
||||
if (inputRef.current && !inputRef.current.contains(e.target as Node)) setShowSuggestions(false);
|
||||
};
|
||||
document.addEventListener("mousedown", handler);
|
||||
return () => document.removeEventListener("mousedown", handler);
|
||||
}, []);
|
||||
|
||||
useEffect(() => {
|
||||
let cancelled = false;
|
||||
setPapersLoading(true);
|
||||
listPapers()
|
||||
.then((data) => {
|
||||
if (cancelled) return;
|
||||
setPapers(
|
||||
data.sort((a, b) => {
|
||||
if (a.course_code !== b.course_code) return a.course_code.localeCompare(b.course_code);
|
||||
if (a.year !== b.year) return b.year - a.year;
|
||||
if (a.term !== b.term) return a.term.localeCompare(b.term);
|
||||
return a.exam_type.localeCompare(b.exam_type);
|
||||
}),
|
||||
);
|
||||
})
|
||||
.catch(() => {
|
||||
if (!cancelled) setPapers([]);
|
||||
})
|
||||
.finally(() => {
|
||||
if (!cancelled) setPapersLoading(false);
|
||||
});
|
||||
|
||||
return () => {
|
||||
cancelled = true;
|
||||
};
|
||||
}, []);
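Reviewer note: the four-level ordering used in the effect above (course code ascending, year descending, then term and exam type ascending) can be expressed as a standalone comparator. This is a sketch under assumptions: `PaperKey`, `comparePapers`, and `sortPapers` are hypothetical names, and only the four fields used for ordering are modeled.

```typescript
interface PaperKey {
  course_code: string;
  year: number;
  term: string;
  exam_type: string;
}

// Each term returns 0 on a tie, so `||` falls through to the next key.
function comparePapers(a: PaperKey, b: PaperKey): number {
  return (
    a.course_code.localeCompare(b.course_code) ||
    b.year - a.year ||
    a.term.localeCompare(b.term) ||
    a.exam_type.localeCompare(b.exam_type)
  );
}

// Sort a copy so the caller's array is left untouched.
function sortPapers<T extends PaperKey>(papers: readonly T[]): T[] {
  return [...papers].sort(comparePapers);
}
```

Sorting a copy avoids mutating the array returned by the API call, which matters if that array is cached or shared.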
|
||||
|
||||
// My Papers
|
||||
useEffect(() => {
|
||||
if (!user) { setMyUploadedPapers([]); return; }
|
||||
let cancelled = false;
|
||||
myPapers().then((data) => {
|
||||
if (cancelled) return;
|
||||
setMyUploadedPapers(data.filter((p) => p.status !== "error"));
|
||||
}).catch(() => {});
|
||||
return () => { cancelled = true; };
|
||||
}, [user]);
|
||||
|
||||
useEffect(() => {
|
||||
if (!user || papers.length === 0) { setWorkedPapers([]); return; }
|
||||
const workedIds = new Set(getWorkedIds(user.id));
|
||||
setWorkedPapers(papers.filter((p) => workedIds.has(p.id)));
|
||||
}, [user, papers]);
|
||||
|
||||
// Filter papers
|
||||
const hasFilter = courseFilter || termFilter || typeFilter;
|
||||
const filteredPapers = papers.filter((p) => {
|
||||
if (courseFilter && p.course_code !== courseFilter) return false;
|
||||
if (termFilter && p.term !== termFilter) return false;
|
||||
if (typeFilter && p.exam_type !== typeFilter) return false;
|
||||
return true;
|
||||
});
|
||||
|
||||
const selectCourse = (code: string) => {
|
||||
setCourseInput(code);
|
||||
setCourseFilter(code);
|
||||
setShowSuggestions(false);
|
||||
};
|
||||
|
||||
return (
|
||||
<div className="min-h-screen" style={{ background: "#FAFAFA" }}>
|
||||
{/* ══════ Nav ══════ */}
|
||||
<nav className="bg-white border-b border-slate-200">
|
||||
<div className="max-w-[1200px] mx-auto px-6 h-14 flex items-center justify-between">
|
||||
<div className="flex items-center gap-2">
|
||||
<div
|
||||
className="w-8 h-8 flex items-center justify-center text-white text-sm font-bold"
|
||||
style={{ background: "#6366F1", borderRadius: 0 }}
|
||||
>
|
||||
PM
|
||||
</div>
|
||||
<span className="text-lg font-bold text-slate-800" style={fontSora}>
|
||||
PastPaper Master
|
||||
</span>
|
||||
</div>
|
||||
<div className="flex items-center gap-5 text-sm" style={fontSora}>
|
||||
<Link to="/" className="text-indigo-600 font-semibold">
|
||||
Home
|
||||
</Link>
|
||||
<Link to="/analytics" className="text-slate-500 hover:text-slate-800 transition-colors">
|
||||
Analytics
|
||||
</Link>
|
||||
<Link to="/error-book" className="text-slate-500 hover:text-slate-800 transition-colors">
|
||||
Error Book
|
||||
</Link>
|
||||
<Link
|
||||
to="/upload"
|
||||
className="px-4 py-1.5 text-white text-xs font-semibold"
|
||||
style={{ background: "#6366F1", borderRadius: 0 }}
|
||||
>
|
||||
Upload Paper
|
||||
</Link>
|
||||
{user ? (
|
||||
<div className="flex items-center gap-3 pl-3 border-l border-slate-200">
|
||||
<span className="text-xs text-slate-400 max-w-[140px] truncate" style={fontMono}>{user.email}</span>
|
||||
<button
|
||||
onClick={() => void signOut()}
|
||||
className="text-xs text-slate-400 hover:text-red-500 transition-colors"
|
||||
>
|
||||
Sign out
|
||||
</button>
|
||||
</div>
|
||||
) : (
|
||||
<Link
|
||||
to="/login"
|
||||
className="text-sm text-indigo-600 font-semibold pl-3 border-l border-slate-200 hover:text-indigo-800 transition-colors"
|
||||
>
|
||||
Sign in
|
||||
</Link>
|
||||
)}
|
||||
</div>
|
||||
</div>
|
||||
</nav>
|
||||
|
||||
{/* ══════ Hero + Filter ══════ */}
|
||||
<section
|
||||
className="relative overflow-hidden"
|
||||
style={{ background: "linear-gradient(135deg, #1E1B4B 0%, #312E81 50%, #4338CA 100%)" }}
|
||||
>
|
||||
<div className="max-w-[1200px] mx-auto px-6 pt-16 pb-10 text-center relative z-10">
|
||||
<h1
|
||||
className="text-4xl font-bold text-white mb-4 leading-tight"
|
||||
style={fontSora}
|
||||
>
|
||||
The Smartest Way to<br />
|
||||
<span style={{ color: "#A5B4FC" }}>Master Past Papers</span>
|
||||
</h1>
|
||||
<p className="text-indigo-200 text-base mb-10 max-w-xl mx-auto" style={fontSora}>
|
||||
Upload any HKUST past paper. AI breaks down every question with analysis,
|
||||
hints, and solutions — so you study smarter, not harder.
|
||||
</p>
|
||||
|
||||
{/* ── Filter row: Course input + Term dropdown + Type dropdown ── */}
|
||||
<div className="max-w-[680px] mx-auto">
|
||||
<div className="flex gap-3 items-end">
|
||||
{/* Course code input with autocomplete */}
|
||||
<div ref={inputRef} className="relative flex-1">
|
||||
<div className="text-[11px] font-semibold text-indigo-300 uppercase tracking-wider mb-1.5 text-left" style={fontSora}>
|
||||
Course Code
|
||||
</div>
|
||||
<div className="flex bg-white" style={{ borderRadius: 0 }}>
|
||||
<input
|
||||
type="text"
|
||||
value={courseInput}
|
||||
onChange={(e) => {
|
||||
const v = e.target.value.toUpperCase();
|
||||
setCourseInput(v);
|
||||
setCourseFilter(COURSE_OPTIONS.includes(v) ? v : null);
|
||||
setShowSuggestions(true);
|
||||
}}
|
||||
onFocus={() => setShowSuggestions(true)}
|
||||
placeholder="e.g. COMP2011"
|
||||
className="flex-1 px-3.5 py-2.5 text-sm text-slate-800 outline-none bg-transparent font-semibold"
|
||||
style={fontMono}
|
||||
/>
|
||||
{courseInput && (
|
||||
<button
|
||||
onClick={() => { setCourseInput(""); setCourseFilter(null); }}
|
||||
className="px-2 text-slate-300 hover:text-slate-500 transition-colors"
|
||||
>
|
||||
<svg className="w-4 h-4" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
|
||||
<path strokeLinecap="round" strokeLinejoin="round" d="M6 18L18 6M6 6l12 12" />
|
||||
</svg>
|
||||
</button>
|
||||
)}
|
||||
</div>
|
||||
{/* Autocomplete dropdown */}
|
||||
{showSuggestions && suggestions.length > 0 && !courseFilter && (
|
||||
<div
|
||||
className="absolute top-full left-0 right-0 mt-1 bg-white shadow-lg z-50 overflow-hidden"
|
||||
style={{ borderRadius: 0, border: "1px solid #E2E8F0" }}
|
||||
>
|
||||
{suggestions.map((c) => (
|
||||
<button
|
||||
key={c}
|
||||
onClick={() => selectCourse(c)}
|
||||
className="w-full text-left px-3.5 py-2.5 text-sm text-slate-700 hover:bg-indigo-50 hover:text-indigo-600 transition-colors"
|
||||
style={fontMono}
|
||||
>
|
||||
<span className="font-semibold">{c.slice(0, courseInput.length)}</span>
|
||||
{c.slice(courseInput.length)}
|
||||
</button>
|
||||
))}
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
|
||||
{/* Term dropdown */}
|
||||
<Dropdown
|
||||
label="Term"
|
||||
value={termFilter}
|
||||
options={[
|
||||
{ value: "spring", label: "Spring" },
|
||||
{ value: "fall", label: "Fall" },
|
||||
]}
|
||||
onChange={setTermFilter}
|
||||
/>
|
||||
|
||||
{/* Exam Type dropdown */}
|
||||
<Dropdown
|
||||
label="Exam Type"
|
||||
value={typeFilter}
|
||||
options={[
|
||||
{ value: "midterm", label: "Midterm" },
|
||||
{ value: "final", label: "Final" },
|
||||
]}
|
||||
onChange={setTypeFilter}
|
||||
/>
|
||||
|
||||
{/* Buttons */}
|
||||
<div className="flex gap-2 items-end">
|
||||
<div>
|
||||
<div className="mb-1.5" />
|
||||
<button
|
||||
className="px-6 py-2.5 text-white text-sm font-semibold shrink-0"
|
||||
style={{ background: "#6366F1", borderRadius: 0, ...fontSora }}
|
||||
>
|
||||
Search
|
||||
</button>
|
||||
</div>
|
||||
<div>
|
||||
<div className="mb-1.5" />
|
||||
<button
|
||||
onClick={() => {
|
||||
setAnalyzing(true);
|
||||
setTimeout(() => {
|
||||
if (courseFilter) navigate(`/analytics/${courseFilter}`);
|
||||
else navigate("/analytics");
|
||||
}, 1200);
|
||||
}}
|
||||
disabled={analyzing}
|
||||
className="px-5 py-2.5 text-sm font-semibold shrink-0 border transition-all flex items-center gap-2"
|
||||
style={{
|
||||
borderRadius: 0,
|
||||
background: analyzing ? "#BE123C" : courseFilter ? "#E11D48" : "transparent",
|
||||
color: courseFilter || analyzing ? "#fff" : "rgba(165,180,252,0.7)",
|
||||
borderColor: analyzing ? "#BE123C" : courseFilter ? "#E11D48" : "rgba(165,180,252,0.3)",
|
||||
...fontSora,
|
||||
}}
|
||||
>
|
||||
{analyzing && (
|
||||
<svg className="w-4 h-4 animate-spin" viewBox="0 0 24 24" fill="none">
|
||||
<circle className="opacity-25" cx="12" cy="12" r="10" stroke="currentColor" strokeWidth="3" />
|
||||
<path className="opacity-75" fill="currentColor" d="M4 12a8 8 0 018-8V0C5.373 0 0 5.373 0 12h4z" />
|
||||
</svg>
|
||||
)}
|
||||
{analyzing ? "Analyzing..." : "Analyze"}
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{/* ── Results panel ── */}
|
||||
{hasFilter && (
|
||||
<div
|
||||
className="mt-3 text-left max-h-[300px] overflow-y-auto"
|
||||
style={{ background: "rgba(255,255,255,0.06)", backdropFilter: "blur(8px)", border: "1px solid rgba(255,255,255,0.1)" }}
|
||||
>
|
||||
{papersLoading ? (
|
||||
<div className="p-6 text-center">
|
||||
<p className="text-indigo-300 text-sm" style={fontSora}>Loading papers...</p>
|
||||
</div>
|
||||
) : filteredPapers.length === 0 ? (
|
||||
<div className="p-6 text-center">
|
||||
<p className="text-indigo-300 text-sm" style={fontSora}>No papers match these filters</p>
|
||||
</div>
|
||||
) : (
|
||||
<>
|
||||
<div className="px-4 pt-3 pb-1 flex items-center justify-between">
|
||||
<span className="text-[11px] font-semibold text-indigo-400 uppercase tracking-wider" style={fontSora}>
|
||||
{filteredPapers.length} paper{filteredPapers.length > 1 ? "s" : ""} found
|
||||
</span>
|
||||
{courseFilter && (
|
||||
<Link
|
||||
to={`/analytics/${courseFilter}`}
|
||||
className="flex items-center gap-1.5 px-3 py-1 text-[11px] font-bold text-white hover:opacity-90 transition-opacity"
|
||||
style={{ background: "#6366F1", borderRadius: 0, ...fontMono }}
|
||||
>
|
||||
<svg className="w-3 h-3" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
|
||||
<path strokeLinecap="round" strokeLinejoin="round" d="M3 13.125C3 12.504 3.504 12 4.125 12h2.25c.621 0 1.125.504 1.125 1.125v6.75C7.5 20.496 6.996 21 6.375 21h-2.25A1.125 1.125 0 013 19.875v-6.75zM9.75 8.625c0-.621.504-1.125 1.125-1.125h2.25c.621 0 1.125.504 1.125 1.125v11.25c0 .621-.504 1.125-1.125 1.125h-2.25a1.125 1.125 0 01-1.125-1.125V8.625zM16.5 4.125c0-.621.504-1.125 1.125-1.125h2.25C20.496 3 21 3.504 21 4.125v15.75c0 .621-.504 1.125-1.125 1.125h-2.25a1.125 1.125 0 01-1.125-1.125V4.125z" />
|
||||
</svg>
|
||||
AI Analytics · {courseFilter}
|
||||
</Link>
|
||||
)}
|
||||
</div>
|
||||
{filteredPapers.map((p) => (
|
||||
<button
|
||||
key={p.id}
|
||||
onClick={() => { navigate(`/paper/${p.id}`); }}
|
||||
className="w-full flex items-center justify-between px-4 py-3 text-left transition-colors hover:bg-white/10 cursor-pointer"
|
||||
style={{ borderBottom: "1px solid rgba(255,255,255,0.06)" }}
|
||||
>
|
||||
<div className="flex items-center gap-3">
|
||||
<div className="w-8 h-8 flex items-center justify-center shrink-0" style={{ background: "rgba(255,255,255,0.1)" }}>
|
||||
<svg className="w-4 h-4 text-indigo-300" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={1.5}>
|
||||
<path strokeLinecap="round" strokeLinejoin="round" d="M19.5 14.25v-2.625a3.375 3.375 0 00-3.375-3.375h-1.5A1.125 1.125 0 0113.5 7.125v-1.5a3.375 3.375 0 00-3.375-3.375H8.25m2.25 0H5.625c-.621 0-1.125.504-1.125 1.125v17.25c0 .621.504 1.125 1.125 1.125h12.75c.621 0 1.125-.504 1.125-1.125V11.25a9 9 0 00-9-9z" />
|
||||
</svg>
|
||||
</div>
|
||||
<div>
|
||||
<span className="text-sm font-bold text-white" style={fontMono}>{p.course_code}</span>
|
||||
<span className="text-sm text-indigo-300 capitalize ml-2" style={fontSora}>
|
||||
{p.year} {p.term} {p.exam_type}
|
||||
</span>
|
||||
<div className="flex gap-3 mt-0.5">
|
||||
{p.question_count != null && (
|
||||
<span className="text-[11px] text-indigo-400" style={fontMono}>{p.question_count} Qs</span>
|
||||
)}
|
||||
{p.difficulty_level && (
|
||||
<span className="text-[11px] text-indigo-400 capitalize" style={fontMono}>{p.difficulty_level}</span>
|
||||
)}
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
<div className="flex items-center gap-2">
|
||||
<span
|
||||
className={`px-2 py-0.5 text-[10px] font-bold border ${
|
||||
p.status === "ready"
|
||||
? "text-emerald-400 border-emerald-400/40"
|
||||
: p.status === "processing"
|
||||
? "text-amber-300 border-amber-300/40"
|
||||
: "text-indigo-400/60 border-indigo-400/20"
|
||||
}`}
|
||||
style={{ borderRadius: 0, ...fontMono }}
|
||||
>
|
||||
{p.status.toUpperCase()}
|
||||
</span>
|
||||
<svg className="w-4 h-4 text-indigo-400" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
|
||||
<path strokeLinecap="round" strokeLinejoin="round" d="M8.25 4.5l7.5 7.5-7.5 7.5" />
|
||||
</svg>
|
||||
</div>
|
||||
</button>
|
||||
))}
|
||||
</>
|
||||
)}
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
|
||||
{/* Quick stats — real data */}
|
||||
<div className="flex justify-center gap-8 mt-10">
|
||||
{[
|
||||
[String(papers.filter(p => p.status === "ready").length), "Past Papers"],
|
||||
[String(papers.reduce((s, p) => s + (p.question_count || 0), 0)), "Questions Analyzed"],
|
||||
[String(new Set(papers.filter(p => p.status === "ready").map(p => p.course_code)).size), "Courses"],
|
||||
].map(([num, label]) => (
|
||||
<div key={label} className="text-center">
|
||||
<div className="text-2xl font-bold text-white" style={fontMono}>{num}</div>
|
||||
<div className="text-xs text-indigo-300" style={fontSora}>{label}</div>
|
||||
</div>
|
||||
))}
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{/* Decorative grid */}
|
||||
<div
|
||||
className="absolute inset-0 opacity-[0.04]"
|
||||
style={{
|
||||
backgroundImage: "linear-gradient(#fff 1px, transparent 1px), linear-gradient(90deg, #fff 1px, transparent 1px)",
|
||||
backgroundSize: "40px 40px",
|
||||
}}
|
||||
/>
|
||||
</section>
|
||||
|
||||
<main className="max-w-[1200px] mx-auto px-6">
|
||||
{/* ══════ Features ══════ */}
|
||||
<section className="py-12">
|
||||
<h2
|
||||
className="text-sm font-semibold text-slate-400 uppercase tracking-wider mb-6"
|
||||
style={fontSora}
|
||||
>
|
||||
Platform Features
|
||||
</h2>
|
||||
<div className="grid grid-cols-4 gap-4">
|
||||
{FEATURES.map((f) => (
|
||||
<div
|
||||
key={f.title}
|
||||
className="bg-white border border-slate-200 p-5 hover:border-slate-300 transition-colors group"
|
||||
style={{ borderRadius: 0 }}
|
||||
>
|
||||
<div
|
||||
className="w-10 h-10 flex items-center justify-center text-white mb-4"
|
||||
style={{ background: f.color, borderRadius: 0 }}
|
||||
>
|
||||
{f.icon}
|
||||
</div>
|
||||
<h3
|
||||
className="text-sm font-bold text-slate-800 mb-1.5"
|
||||
style={fontSora}
|
||||
>
|
||||
{f.title}
|
||||
</h3>
|
||||
<p className="text-xs text-slate-400 leading-relaxed" style={fontSora}>
|
||||
{f.desc}
|
||||
</p>
|
||||
</div>
|
||||
))}
|
||||
</div>
|
||||
</section>
|
||||
|
||||
{/* ══════ My Papers ══════ */}
|
||||
{user && (
|
||||
<section className="pb-12">
|
||||
<h2 className="text-sm font-semibold text-slate-400 uppercase tracking-wider mb-6" style={fontSora}>
|
||||
My Papers
|
||||
</h2>
|
||||
{myUploadedPapers.length === 0 && workedPapers.length === 0 ? (
|
||||
<div className="bg-white border border-slate-200 px-6 py-8 text-center" style={{ borderRadius: 0 }}>
|
||||
<p className="text-sm text-slate-400" style={fontSora}>No papers yet. Upload a past paper or open one to get started.</p>
|
||||
</div>
|
||||
) : (
|
||||
<div className="grid grid-cols-2 gap-6">
|
||||
{/* Uploaded */}
|
||||
{myUploadedPapers.length > 0 && (
|
||||
<div>
|
||||
<div className="text-xs font-semibold text-slate-500 uppercase tracking-wider mb-3" style={fontSora}>
|
||||
Uploaded
|
||||
</div>
|
||||
<div className="space-y-2">
|
||||
{myUploadedPapers.map((p) => (
|
||||
<Link
|
||||
key={p.id}
|
||||
to={p.status === "ready" ? `/paper/${p.id}` : "#"}
|
||||
className="flex items-center justify-between bg-white border border-slate-200 px-4 py-3 hover:border-indigo-300 transition-colors"
|
||||
style={{ borderRadius: 0 }}
|
||||
>
|
||||
<div>
|
||||
<span className="text-sm font-bold text-slate-800" style={fontMono}>{p.course_code}</span>
|
||||
<span className="text-sm text-slate-500 capitalize ml-2" style={fontSora}>{p.year} {p.term} {p.exam_type}</span>
|
||||
</div>
|
||||
<span className={`text-[10px] font-bold px-2 py-0.5 border ${
|
||||
p.status === "ready" ? "text-emerald-600 border-emerald-300 bg-emerald-50"
|
||||
: p.status === "processing" ? "text-amber-600 border-amber-300 bg-amber-50"
|
||||
: "text-slate-400 border-slate-200"
|
||||
}`} style={{ borderRadius: 0, ...fontMono }}>
|
||||
{p.status === "processing" ? (
|
||||
<span className="flex items-center gap-1">
|
||||
<span className="w-2 h-2 border border-amber-500 border-t-transparent rounded-full animate-spin inline-block" />
|
||||
PROCESSING
|
||||
</span>
|
||||
) : p.status.toUpperCase()}
|
||||
</span>
|
||||
</Link>
|
||||
))}
|
||||
</div>
|
||||
</div>
|
||||
)}
|
||||
|
||||
{/* Worked on */}
|
||||
{workedPapers.length > 0 && (
|
||||
<div>
|
||||
<div className="text-xs font-semibold text-slate-500 uppercase tracking-wider mb-3" style={fontSora}>
|
||||
Recently Worked
|
||||
</div>
|
||||
<div className="space-y-2">
|
||||
{workedPapers.map((p) => (
|
||||
<Link
|
||||
key={p.id}
|
||||
to={`/paper/${p.id}`}
|
||||
className="flex items-center justify-between bg-white border border-slate-200 px-4 py-3 hover:border-indigo-300 transition-colors"
|
||||
style={{ borderRadius: 0 }}
|
||||
>
|
||||
<div>
|
||||
<span className="text-sm font-bold text-slate-800" style={fontMono}>{p.course_code}</span>
|
||||
<span className="text-sm text-slate-500 capitalize ml-2" style={fontSora}>{p.year} {p.term} {p.exam_type}</span>
|
||||
</div>
|
||||
<svg className="w-4 h-4 text-slate-300" fill="none" viewBox="0 0 24 24" stroke="currentColor" strokeWidth={2}>
|
||||
<path strokeLinecap="round" strokeLinejoin="round" d="M8.25 4.5l7.5 7.5-7.5 7.5" />
|
||||
</svg>
|
||||
</Link>
|
||||
))}
|
||||
</div>
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
)}
|
||||
</section>
|
||||
)}
|
||||
|
||||
{/* ══════ CTA Banner ══════ */}
|
||||
<section className="pb-16">
|
||||
<div
|
||||
className="p-8 flex items-center justify-between"
|
||||
style={{ background: "linear-gradient(135deg, #1E1B4B, #312E81)", borderRadius: 0 }}
|
||||
>
|
||||
<div>
|
||||
<h3 className="text-lg font-bold text-white mb-1" style={fontSora}>
|
||||
Ready to ace your exams?
|
||||
</h3>
|
||||
<p className="text-sm text-indigo-300" style={fontSora}>
|
||||
Upload a past paper and let AI do the heavy lifting.
|
||||
</p>
|
||||
</div>
|
||||
<div className="flex gap-3">
|
||||
<Link
|
||||
to="/upload"
|
||||
className="px-5 py-2.5 text-sm font-semibold text-white"
|
||||
style={{ background: "#6366F1", borderRadius: 0, ...fontSora }}
|
||||
>
|
||||
Upload Paper
|
||||
</Link>
|
||||
<Link
|
||||
to="/analytics"
|
||||
className="px-5 py-2.5 text-sm font-semibold text-indigo-200 border border-indigo-400 hover:bg-indigo-900/30 transition-colors"
|
||||
style={{ borderRadius: 0, ...fontSora }}
|
||||
>
|
||||
View Analytics
|
||||
</Link>
|
||||
</div>
|
||||
</div>
|
||||
</section>
|
||||
</main>
|
||||
|
||||
{/* ══════ Footer ══════ */}
|
||||
<footer className="border-t border-slate-200 bg-white">
|
||||
<div className="max-w-[1200px] mx-auto px-6 py-6 flex items-center justify-between">
|
||||
<span className="text-xs text-slate-400" style={fontSora}>
|
||||
PastPaper Master · HKUST · 2025
|
||||
</span>
|
||||
<div className="flex gap-4 text-xs text-slate-400" style={fontSora}>
|
||||
<span>About</span>
|
||||
<span>Contact</span>
|
||||
<span>Privacy</span>
|
||||
</div>
|
||||
</div>
|
||||
</footer>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
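The quick-stats row in the page above derives its three numbers inline from the `papers` array. Below is a minimal sketch of the same aggregation as a pure helper, which is easier to unit-test. `PaperSummary`, `QuickStats`, and `computeQuickStats` are hypothetical names that do not appear in the commit, and the field names are assumed from how `p.status`, `p.question_count`, and `p.course_code` are read in the JSX. Unlike the inline version, this sketch counts questions only for `ready` papers so that all three figures describe the same set. That is a design assumption of the sketch, not a confirmed requirement.

```typescript
// Hypothetical shape, inferred from the fields the landing page reads.
interface PaperSummary {
  course_code: string;
  status: string;
  question_count?: number | null;
}

interface QuickStats {
  papers: number;
  questions: number;
  courses: number;
}

// Pure helper: aggregates over "ready" papers only (an assumption of this sketch).
function computeQuickStats(papers: PaperSummary[]): QuickStats {
  const ready = papers.filter((p) => p.status === "ready");
  return {
    papers: ready.length,
    // A missing or null question_count contributes 0 to the total.
    questions: ready.reduce((sum, p) => sum + (p.question_count ?? 0), 0),
    courses: new Set(ready.map((p) => p.course_code)).size,
  };
}

// Illustrative sample data only.
const samplePapers: PaperSummary[] = [
  { course_code: "COMP2211", status: "ready", question_count: 43 },
  { course_code: "COMP2211", status: "ready", question_count: 38 },
  { course_code: "COMP2011", status: "processing", question_count: 10 },
  { course_code: "COMP2012", status: "ready", question_count: null },
];

const quickStats = computeQuickStats(samplePapers);
console.log(quickStats);
```

In the component, the result could be wrapped in `useMemo` keyed on `papers` so the three passes over the array run only when the list changes.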
90
frontend/src/pages/LoginPage.tsx
Normal file
@@ -0,0 +1,90 @@
import { useState } from "react";
import { supabase } from "@/lib/supabase";

export default function LoginPage() {
  const [email, setEmail] = useState("");
  const [password, setPassword] = useState("");
  const [mode, setMode] = useState<"signin" | "signup">("signin");
  const [error, setError] = useState<string | null>(null);
  const [loading, setLoading] = useState(false);
  const handleSubmit = async (e: React.FormEvent) => {
    e.preventDefault();
    setError(null);
    setLoading(true);
    try {
      if (mode === "signin") {
        const { error } = await supabase.auth.signInWithPassword({ email, password });
        if (error) throw error;
      } else {
        const { error } = await supabase.auth.signUp({ email, password });
        if (error) throw error;
        // Sign in immediately after sign-up. This only succeeds when email
        // confirmation is disabled in the Supabase dashboard.
        const { error: signInError } = await supabase.auth.signInWithPassword({ email, password });
        if (signInError) throw signInError;
      }
    } catch (err: unknown) {
      setError(err instanceof Error ? err.message : "Something went wrong");
    } finally {
      setLoading(false);
    }
  };

  return (
    <div className="min-h-screen bg-gray-50 flex items-center justify-center">
      <div className="bg-white rounded-2xl shadow-sm border border-gray-200 p-8 w-full max-w-sm">
        <div className="mb-6">
          <h1 className="text-xl font-bold text-gray-900">PastPaper Master</h1>
          <p className="text-sm text-gray-500 mt-1">{mode === "signin" ? "Sign in to continue" : "Create your account"}</p>
        </div>

        <form onSubmit={handleSubmit} className="space-y-4">
          <div>
            <label className="block text-xs font-medium text-gray-700 mb-1">Email</label>
            <input
              type="email"
              value={email}
              onChange={(e) => setEmail(e.target.value)}
              required
              className="w-full px-3 py-2 border border-gray-300 rounded-lg text-sm focus:outline-none focus:ring-2 focus:ring-blue-500 focus:border-transparent"
              placeholder="you@example.com"
            />
          </div>
          <div>
            <label className="block text-xs font-medium text-gray-700 mb-1">Password</label>
            <input
              type="password"
              value={password}
              onChange={(e) => setPassword(e.target.value)}
              required
              minLength={6}
              className="w-full px-3 py-2 border border-gray-300 rounded-lg text-sm focus:outline-none focus:ring-2 focus:ring-blue-500 focus:border-transparent"
              placeholder="••••••"
            />
          </div>

          {error && (
            <p className="text-xs text-red-600 bg-red-50 border border-red-200 rounded-lg px-3 py-2">{error}</p>
          )}

          <button
            type="submit"
            disabled={loading}
            className="w-full py-2.5 bg-blue-600 text-white text-sm font-medium rounded-lg hover:bg-blue-700 disabled:opacity-50 transition-colors"
          >
            {loading ? "..." : mode === "signin" ? "Sign in" : "Create account"}
          </button>
        </form>

        <p className="text-center text-xs text-gray-500 mt-4">
          {mode === "signin" ? "No account? " : "Already have one? "}
          <button
            onClick={() => { setMode(mode === "signin" ? "signup" : "signin"); setError(null); }}
            className="text-blue-600 hover:underline font-medium"
          >
            {mode === "signin" ? "Sign up" : "Sign in"}
          </button>
        </p>
      </div>
    </div>
  );
}
|
||||
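The sign-up branch of `LoginPage` signs in right after `signUp`, which its own comment says depends on email confirmation being disabled. The following is a hedged sketch of one way to tell the two cases apart before attempting that second call. It assumes the `{ data: { session }, error }` response shape of supabase-js v2, where `session` is `null` while a confirmation email is pending. `SignUpResultLike`, `SignUpOutcome`, and `interpretSignUp` are hypothetical names introduced for illustration and are not part of the commit.

```typescript
// Minimal structural type for the part of the sign-up response this sketch reads.
// Assumption: mirrors the `{ data: { session }, error }` shape of supabase-js v2.
interface SignUpResultLike {
  data: { session: object | null };
  error: { message: string } | null;
}

type SignUpOutcome =
  | { kind: "signed-in" }
  | { kind: "confirmation-required"; message: string }
  | { kind: "failed"; message: string };

// Pure classifier: decides what the UI should do with a sign-up response.
function interpretSignUp(result: SignUpResultLike): SignUpOutcome {
  if (result.error) {
    return { kind: "failed", message: result.error.message };
  }
  if (result.data.session) {
    // A session is already present, so no separate sign-in call is needed.
    return { kind: "signed-in" };
  }
  return {
    kind: "confirmation-required",
    message: "Check your inbox to confirm your email, then sign in.",
  };
}

// Illustrative responses only; the session object is a stand-in.
const pendingOutcome = interpretSignUp({ data: { session: null }, error: null });
const activeOutcome = interpretSignUp({ data: { session: { access_token: "example" } }, error: null });
const failedOutcome = interpretSignUp({ data: { session: null }, error: { message: "User already registered" } });
console.log(pendingOutcome.kind, activeOutcome.kind, failedOutcome.kind);
```

With a helper like this, the form could show the confirmation message through `setError` (or a separate notice) instead of surfacing a failed sign-in when confirmation is turned on.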
16
frontend/src/pages/UploadPage.tsx
Normal file
@@ -0,0 +1,16 @@
import Header from "@/components/layout/Header";
import UploadForm from "@/components/upload/UploadForm";

export default function UploadPage() {
  return (
    <div className="min-h-screen bg-gray-50">
      <Header />
      <main className="py-10 px-6">
        <h1 className="text-xl font-bold text-center mb-8 text-gray-800">
          Upload Past Paper
        </h1>
        <UploadForm />
      </main>
    </div>
  );
}
524
frontend/src/pages/WorkbenchPage.tsx
Normal file
@@ -0,0 +1,524 @@
|
||||
import { useState, useEffect, useCallback, useRef } from "react";
|
||||
import { useParams } from "react-router-dom";
|
||||
import Header from "@/components/layout/Header";
|
||||
import PdfViewer from "@/components/workbench/PdfViewer";
|
||||
import QuestionNav from "@/components/workbench/QuestionNav";
|
||||
import QuestionDetail from "@/components/workbench/QuestionDetail";
|
||||
import AiTrioPanel from "@/components/workbench/AiTrioPanel";
|
||||
import SimilarHistoryPanel from "@/components/workbench/SimilarHistoryPanel";
|
||||
import ActionBar from "@/components/workbench/ActionBar";
|
||||
import PhotoUpload from "@/components/workbench/PhotoUpload";
|
||||
import VariantDetail from "@/components/workbench/VariantDetail";
|
||||
import KaTeXRenderer from "@/components/shared/KaTeXRenderer";
|
||||
import { usePaper } from "@/hooks/usePaper";
|
||||
import { useQuestions } from "@/hooks/useQuestions";
|
||||
import { generateVariant, getVariants, updateVariant, deleteVariant, recordAttempt, getPaperAttempts } from "@/lib/api";
|
||||
import { groupQuestions } from "@/lib/questionGroups";
|
||||
import { useAuth } from "@/contexts/AuthContext";
|
||||
import type { QuestionVariant } from "@/types/api";
|
||||
|
||||
const WORKED_KEY = (userId: string) => `worked_papers_${userId}`;
|
||||
const WORKED_THRESHOLD_MS = 3 * 60 * 1000; // 3 minutes
|
||||
|
||||
function markWorked(userId: string, paperId: string) {
|
||||
try {
|
||||
const raw = localStorage.getItem(WORKED_KEY(userId));
|
||||
const ids: string[] = raw ? JSON.parse(raw) : [];
|
||||
if (!ids.includes(paperId)) {
|
||||
localStorage.setItem(WORKED_KEY(userId), JSON.stringify([...ids, paperId]));
|
||||
}
|
||||
} catch { /* localStorage unavailable or stored JSON malformed: skip tracking */ }
|
||||
}
|
||||
|
||||
export default function WorkbenchPage() {
|
||||
const { id } = useParams<{ id: string }>();
|
||||
const { user } = useAuth();
|
||||
const { paper, loading: paperLoading, error: paperError } = usePaper(id!);
|
||||
const isReady = paper?.status === "ready";
|
||||
const { questions, loading: questionsLoading } = useQuestions(id!, isReady);
|
||||
const [currentQuestionId, setCurrentQuestionId] = useState<string | null>(null);
|
||||
const [showPhoto, setShowPhoto] = useState(false);
|
||||
// Grading result per question
|
||||
const [gradingResults, setGradingResults] = useState<Map<string, {
|
||||
isCorrect: boolean;
|
||||
feedback: string;
|
||||
ocrText: string;
|
||||
scoreGiven?: number;
|
||||
loading?: boolean;
|
||||
}>>(new Map());
|
||||
// Track which grading panels are expanded
|
||||
const [gradingExpanded, setGradingExpanded] = useState<Set<string>>(new Set());
|
||||
|
||||
// Tab state
|
||||
const [activeTab, setActiveTab] = useState<"questions" | "variants">("questions");
|
||||
// variants per question: questionId → QuestionVariant[]
|
||||
const [variantMap, setVariantMap] = useState<Map<string, QuestionVariant[]>>(new Map());
|
||||
// which question IDs have been fetched from server
|
||||
const loadedRef = useRef<Set<string>>(new Set());
|
||||
// generating state
|
||||
const [isGenerating, setIsGenerating] = useState(false);
|
||||
// Currently viewing variant (full detail view)
|
||||
const [activeVariantId, setActiveVariantId] = useState<string | null>(null);
|
||||
|
||||
// Cooldown: ignore scroll-based updates for 2s after user clicks a question
|
||||
const lastUserSelectTime = useRef(0);
|
||||
|
||||
const handleQuestionSelect = useCallback((questionId: string) => {
|
||||
lastUserSelectTime.current = Date.now();
|
||||
setCurrentQuestionId(questionId);
|
||||
}, []);
|
||||
|
||||
const groups = groupQuestions(questions);
|
||||
const currentQuestion =
|
||||
questions.find((question) => question.id === currentQuestionId)
|
||||
?? questions[0]
|
||||
?? null;
|
||||
const currentGroupKey = currentQuestion?.question_number.match(/^\d+/)?.[0] ?? null;
|
||||
const paperTitle = paper
|
||||
? `${paper.year} ${paper.term} ${paper.exam_type}`
|
||||
: undefined;
|
||||
|
||||
const currentVariants = variantMap.get(currentQuestion?.id ?? "") ?? [];
|
||||
const activeVariant = currentVariants.find((v) => v.id === activeVariantId) ?? null;
|
||||
|
||||
const handleGroupSelect = useCallback((groupKey: string) => {
|
||||
lastUserSelectTime.current = Date.now();
|
||||
const group = groups.find((item) => item.key === groupKey);
|
||||
if (group?.questions[0]) {
|
||||
setCurrentQuestionId(group.questions[0].id);
|
||||
}
|
||||
}, [groups]);
|
||||
|
||||
useEffect(() => {
|
||||
if (questions.length === 0) {
|
||||
setCurrentQuestionId(null);
|
||||
return;
|
||||
}
|
||||
setCurrentQuestionId((prev) =>
|
||||
prev && questions.some((question) => question.id === prev) ? prev : questions[0].id,
|
||||
);
|
||||
}, [questions]);
|
||||
|
||||
// Mark the paper as "worked" once it has stayed open for WORKED_THRESHOLD_MS (3 minutes)
|
||||
useEffect(() => {
|
||||
if (!id || !user) return;
|
||||
const timer = setTimeout(() => markWorked(user.id, id), WORKED_THRESHOLD_MS);
|
||||
return () => clearTimeout(timer);
|
||||
}, [id, user]);
|
||||
|
||||
// Load historical grading results
|
||||
useEffect(() => {
|
||||
if (!id || !user || !isReady) return;
|
||||
getPaperAttempts(id).then((attempts) => {
|
||||
const map = new Map<string, { isCorrect: boolean; feedback: string; ocrText: string; scoreGiven?: number }>();
|
||||
for (const a of attempts) {
|
||||
map.set(a.question_id, {
|
||||
isCorrect: a.is_correct,
|
||||
feedback: a.feedback || "",
|
||||
ocrText: a.photo_ocr_text || "",
|
||||
});
|
||||
}
|
||||
if (map.size > 0) {
|
||||
setGradingResults((prev) => {
|
||||
const next = new Map(prev);
|
||||
for (const [k, v] of map) {
|
||||
if (!next.has(k)) next.set(k, v); // don't overwrite current session
|
||||
}
|
||||
return next;
|
||||
});
|
||||
setGradingExpanded(new Set(map.keys()));
|
||||
}
|
||||
}).catch(() => {});
|
||||
}, [id, user, isReady]);
|
||||
|
||||
// Load variants for current question (once per question ID)
|
||||
useEffect(() => {
|
||||
if (!currentQuestionId || loadedRef.current.has(currentQuestionId)) return;
|
||||
loadedRef.current.add(currentQuestionId);
|
||||
getVariants(currentQuestionId)
|
||||
.then((data) => {
|
||||
setVariantMap((prev) => new Map(prev).set(currentQuestionId, data));
|
||||
})
|
||||
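.catch(() => { loadedRef.current.delete(currentQuestionId); }); // fetch failed: allow a retry the next time this question is selected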
|
||||
}, [currentQuestionId]);
|
||||
|
||||
// When the user scrolls the PDF, select the last question that starts on or before the visible page (assumes `questions` is ordered by page)
|
||||
// Ignore scroll updates for 2 s after the user clicks a question (cooldown)
|
||||
const handlePdfPageChange = useCallback(
|
||||
(page: number) => {
|
||||
if (questions.length === 0) return;
|
||||
if (Date.now() - lastUserSelectTime.current < 2000) return;
|
||||
let best = questions[0];
|
||||
for (let i = 0; i < questions.length; i++) {
|
||||
if ((questions[i].page_number ?? 1) <= page) best = questions[i];
|
||||
}
|
||||
setCurrentQuestionId(best.id);
|
||||
},
|
||||
[questions],
|
||||
);
|
||||
|
||||
// Track answer state per question for ActionBar feedback
|
||||
const [answerStates, setAnswerStates] = useState<Map<string, "correct" | "wrong">>(new Map());
|
||||
|
||||
const handleAnswerResult = async (isCorrect: boolean, userAnswer: string) => {
|
||||
if (!currentQuestion) return;
|
||||
const state = isCorrect ? "correct" : "wrong";
|
||||
setAnswerStates((prev) => new Map(prev).set(currentQuestion.id, state));
|
||||
try {
|
||||
const type = currentQuestion.question_type === "mc" ? "select" : "input";
|
||||
await recordAttempt(currentQuestion.id, type, userAnswer, isCorrect);
|
||||
// Wrong answer → auto generate variant
|
||||
if (!isCorrect) {
|
||||
void handleGenerateVariant(); // fire and forget: errors are handled inside handleGenerateVariant
|
||||
}
|
||||
} catch {
|
||||
// Best effort: a failed attempt record should not block the UI
|
||||
}
|
||||
};
|
||||
|
||||
const handleGenerateVariant = async () => {
|
||||
if (!currentQuestion || isGenerating) return;
|
||||
setIsGenerating(true);
|
||||
setActiveTab("variants");
|
||||
try {
|
||||
const saved = await generateVariant(currentQuestion.id);
|
||||
setVariantMap((prev) => {
|
||||
const existing = prev.get(currentQuestion.id) ?? [];
|
||||
return new Map(prev).set(currentQuestion.id, [saved, ...existing]);
|
||||
});
|
||||
} catch {
|
||||
// Generation failed: keep the existing variants; the spinner is cleared in `finally`
|
||||
} finally {
|
||||
setIsGenerating(false);
|
||||
}
|
||||
};
|
||||
|
||||
const handleToggleFavorite = async (v: QuestionVariant) => {
|
||||
const updated = await updateVariant(v.id, { favorited: !v.favorited });
|
||||
setVariantMap((prev) => {
|
||||
const existing = prev.get(v.source_question_id) ?? [];
|
||||
return new Map(prev).set(
|
||||
v.source_question_id,
|
||||
existing.map((item) => (item.id === v.id ? updated : item)),
|
||||
);
|
||||
});
|
||||
};
|
||||
|
||||
const handleDeleteVariant = async (v: QuestionVariant) => {
|
||||
await deleteVariant(v.id);
|
||||
if (activeVariantId === v.id) setActiveVariantId(null);
|
||||
setVariantMap((prev) => {
|
||||
const existing = prev.get(v.source_question_id) ?? [];
|
||||
return new Map(prev).set(
|
||||
v.source_question_id,
|
||||
existing.filter((item) => item.id !== v.id),
|
||||
);
|
||||
});
|
||||
};
|
||||
|
||||
if (paperLoading) {
|
||||
return (
|
||||
<div className="min-h-screen bg-gray-50 flex items-center justify-center">
|
||||
<div className="text-gray-400 text-sm">Loading...</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
if (paperError || !paper) {
|
||||
return (
|
||||
<div className="min-h-screen bg-gray-50 flex items-center justify-center">
|
||||
<div className="text-red-500 text-sm">{paperError ?? "Paper not found"}</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
return (
|
||||
<div className="h-screen flex flex-col">
|
||||
<Header courseCode={paper.course_code} paperTitle={paperTitle} />
|
||||
|
||||
{/* Processing overlay */}
|
||||
{paper.status === "processing" && (
|
||||
<div className="flex-1 flex items-center justify-center bg-gray-50">
|
||||
<div className="text-center">
|
||||
<div className="inline-block w-8 h-8 border-3 border-blue-600 border-t-transparent rounded-full animate-spin mb-4" />
|
||||
<p className="text-gray-600 text-sm">AI is analyzing the paper...</p>
|
||||
<p className="text-gray-400 text-xs mt-1">
|
||||
{paper.question_count
|
||||
? `${paper.question_count} questions found, generating analysis...`
|
||||
: "Extracting and structuring questions..."}
|
||||
</p>
|
||||
</div>
|
||||
</div>
|
||||
)}
|
||||
|
||||
{/* Error state */}
|
||||
{paper.status === "error" && (
|
||||
<div className="flex-1 flex items-center justify-center bg-gray-50">
|
||||
<div className="text-center max-w-md">
|
||||
<p className="text-red-600 font-medium mb-2">Processing Failed</p>
|
||||
<p className="text-gray-500 text-sm">{paper.error_message}</p>
|
||||
</div>
|
||||
</div>
|
||||
)}
|
||||
|
||||
{/* Ready — workbench */}
|
||||
{paper.status === "ready" && (
|
||||
<div className="flex-1 flex overflow-hidden">
|
||||
{/* Left: PDF viewer */}
|
||||
<div className="w-[60%] border-r border-gray-200">
|
||||
<PdfViewer
|
||||
fileUrl={paper.paper_file_url}
|
||||
currentPage={currentQuestion?.page_number ?? 1}
|
||||
onPageChange={handlePdfPageChange}
|
||||
/>
|
||||
</div>
|
||||
|
||||
{/* Right: analysis panel */}
|
||||
<div className="w-[40%] flex flex-col overflow-hidden">
|
||||
{questionsLoading ? (
|
||||
<div className="flex-1 flex items-center justify-center text-gray-400 text-sm">
|
||||
Loading questions...
|
||||
</div>
|
||||
) : activeVariantId && activeVariant ? (
|
||||
/* ===== Variant Detail View ===== */
|
||||
<>
|
||||
<button
|
||||
onClick={() => setActiveVariantId(null)}
|
||||
className="flex items-center gap-2 px-4 py-2.5 text-sm font-medium text-blue-600 bg-gray-50 border-b border-gray-200 hover:bg-gray-100 shrink-0"
|
||||
>
|
||||
<span>←</span>
|
||||
<span>Back to Questions</span>
|
||||
<span className="ml-2 px-2 py-0.5 bg-purple-100 text-purple-700 text-xs rounded-full font-medium">
|
||||
Variant Q{activeVariant.source_question_number}
|
||||
</span>
|
||||
</button>
|
||||
<div className="flex-1 overflow-y-auto p-4">
|
||||
<VariantDetail variant={activeVariant.variant_data} />
|
||||
</div>
|
||||
</>
|
||||
) : (
|
||||
/* ===== Normal Tab View ===== */
|
||||
<>
|
||||
{/* Tab bar */}
|
||||
<div className="flex border-b border-gray-200 shrink-0">
|
||||
<button
|
||||
onClick={() => setActiveTab("questions")}
|
||||
className={`flex-1 py-2.5 text-sm font-medium text-center transition-colors ${
|
||||
activeTab === "questions"
|
||||
? "text-gray-900 border-b-2 border-blue-600"
|
||||
: "text-gray-400 hover:text-gray-600"
|
||||
}`}
|
||||
>
|
||||
Questions
|
||||
</button>
|
||||
<button
|
||||
onClick={() => setActiveTab("variants")}
|
||||
className={`flex-1 py-2.5 text-sm font-medium text-center transition-colors flex items-center justify-center gap-1.5 ${
|
||||
activeTab === "variants"
|
||||
? "text-gray-900 border-b-2 border-blue-600"
|
||||
: "text-gray-400 hover:text-gray-600"
|
||||
}`}
|
||||
>
|
||||
Variants
|
||||
{currentVariants.length > 0 && (
|
||||
<span className="w-5 h-5 flex items-center justify-center bg-purple-500 text-white text-xs font-bold rounded-full">
|
||||
{currentVariants.length}
|
||||
</span>
|
||||
)}
|
||||
</button>
|
||||
</div>
|
||||
|
||||
{/* Question nav — always visible */}
|
||||
<QuestionNav
|
||||
groups={groups}
|
||||
currentGroupKey={currentGroupKey}
|
||||
currentQuestionId={currentQuestion?.id ?? null}
|
||||
onSelectGroup={handleGroupSelect}
|
||||
onSelectQuestion={handleQuestionSelect}
|
||||
/>
|
||||
|
||||
{/* Questions tab content */}
|
||||
{activeTab === "questions" && (
|
||||
<>
|
||||
<div className="flex-1 overflow-y-auto p-4">
|
||||
{currentQuestion && (
|
||||
<>
|
||||
<QuestionDetail
|
||||
question={currentQuestion}
|
||||
onAnswerResult={handleAnswerResult}
|
||||
/>
|
||||
{/* Grading result panel */}
|
||||
{gradingResults.has(currentQuestion.id) && (() => {
|
||||
const gr = gradingResults.get(currentQuestion.id)!;
|
||||
const expanded = gradingExpanded.has(currentQuestion.id);
|
||||
const toggleExpand = () => setGradingExpanded((prev) => {
|
||||
const next = new Set(prev);
|
||||
next.has(currentQuestion.id) ? next.delete(currentQuestion.id) : next.add(currentQuestion.id);
|
||||
return next;
|
||||
});
|
||||
|
||||
if (gr.loading) {
|
||||
return (
|
||||
<div className="mb-4 rounded-lg border border-blue-200 bg-blue-50 p-3">
|
||||
<div className="flex items-center gap-2">
|
||||
<span className="w-4 h-4 border-2 border-blue-600 border-t-transparent rounded-full animate-spin" />
|
||||
<span className="text-sm font-medium text-blue-700">Grading your answer...</span>
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
return (
|
||||
<div className={`mb-4 rounded-lg border ${gr.isCorrect ? "border-green-200" : "border-red-200"}`}>
|
||||
<button
|
||||
onClick={toggleExpand}
|
||||
className={`w-full flex items-center justify-between px-3 py-2.5 rounded-t-lg ${gr.isCorrect ? "bg-green-50" : "bg-red-50"}`}
|
||||
>
|
||||
<div className="flex items-center gap-2">
|
||||
<span className="text-lg">{gr.isCorrect ? "✓" : "✗"}</span>
|
||||
<span className={`font-semibold text-sm ${gr.isCorrect ? "text-green-700" : "text-red-700"}`}>
|
||||
AI Grading: {gr.isCorrect ? "Correct" : "Incorrect"}
|
||||
{gr.scoreGiven !== undefined && ` — ${gr.scoreGiven} pts`}
|
||||
</span>
|
||||
</div>
|
||||
<span className="text-gray-400 text-xs">{expanded ? "▲" : "▼"}</span>
|
||||
</button>
|
||||
{expanded && (
|
||||
<div className="p-3 border-t border-gray-100 bg-white rounded-b-lg">
|
||||
{gr.ocrText && (
|
||||
<details className="mb-3 bg-gray-50 rounded-lg border border-gray-200">
|
||||
<summary className="px-3 py-2 text-xs font-medium text-gray-500 cursor-pointer">Your Answer (OCR)</summary>
|
||||
<div className="px-3 pb-3">
|
||||
<KaTeXRenderer html={gr.ocrText.replace(/\n/g, "<br/>")} className="text-xs text-gray-700" />
|
||||
</div>
|
||||
</details>
|
||||
)}
|
||||
<KaTeXRenderer html={gr.feedback} className="text-gray-700 text-sm" />
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
);
|
||||
})()}
|
||||
<AiTrioPanel question={currentQuestion} />
|
||||
<SimilarHistoryPanel question={currentQuestion} />
|
||||
</>
|
||||
)}
|
||||
</div>
|
||||
<ActionBar
|
||||
question={currentQuestion}
|
||||
onGenerateVariant={handleGenerateVariant}
|
||||
isGenerating={isGenerating}
|
||||
onPhotoOpen={() => setShowPhoto(true)}
|
||||
answerState={currentQuestion ? answerStates.get(currentQuestion.id) ?? null : null}
|
||||
/>
|
||||
</>
|
||||
)}
|
||||
|
||||
{/* Variants tab content */}
|
||||
{activeTab === "variants" && (
|
||||
<div className="flex-1 overflow-y-auto p-4">
|
||||
<div className="mb-3">
|
||||
<button
|
||||
onClick={handleGenerateVariant}
|
||||
disabled={!currentQuestion || isGenerating}
|
||||
className="w-full py-2 rounded-lg text-sm font-medium bg-purple-50 text-purple-700 border border-purple-200 hover:bg-purple-100 disabled:opacity-50 transition-colors"
|
||||
>
|
||||
{isGenerating ? (
|
||||
<span className="flex items-center justify-center gap-2">
|
||||
<span className="w-3 h-3 border-2 border-purple-600 border-t-transparent rounded-full animate-spin" />
|
||||
Generating...
|
||||
</span>
|
||||
) : "+ Generate Variant"}
|
||||
</button>
|
||||
</div>
|
||||
|
||||
{currentVariants.length === 0 && !isGenerating ? (
|
||||
<div className="text-center py-12">
|
||||
<p className="text-gray-400 text-sm">No variants yet for this question.</p>
|
||||
</div>
|
||||
) : (
|
||||
<div className="space-y-3">
|
||||
{currentVariants.map((v) => (
|
||||
<div key={v.id} className="bg-gray-50 rounded-lg border border-gray-200 p-4">
|
||||
<div className="flex items-center justify-between mb-2">
|
||||
<span className="text-xs text-gray-400">
|
||||
{new Date(v.created_at).toLocaleDateString("en-CA")}
|
||||
</span>
|
||||
<div className="flex items-center gap-2">
|
||||
<button
|
||||
onClick={() => void handleToggleFavorite(v)}
|
||||
title={v.favorited ? "Unfavorite" : "Save to Error Book"}
|
||||
className={`text-lg leading-none ${v.favorited ? "text-yellow-400" : "text-gray-300 hover:text-yellow-400"}`}
|
||||
>
|
||||
★
|
||||
</button>
|
||||
<button
|
||||
onClick={() => void handleDeleteVariant(v)}
|
||||
className="text-gray-300 hover:text-red-400 text-sm leading-none"
|
||||
title="Delete"
|
||||
>
|
||||
×
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
<p className="text-xs text-gray-600 line-clamp-2 mb-3">
|
||||
{v.variant_data.question_text?.replace(/<[^>]*>/g, "").slice(0, 140)}
|
||||
</p>
|
||||
<button
|
||||
onClick={() => setActiveVariantId(v.id)}
|
||||
className="px-3 py-1.5 bg-blue-600 text-white text-xs font-medium rounded-lg hover:bg-blue-700"
|
||||
>
|
||||
Practice →
|
||||
</button>
|
||||
</div>
|
||||
))}
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
)}
|
||||
</>
|
||||
)}
|
||||
</div>
|
||||
</div>
|
||||
)}
|
||||
|
||||
{/* Photo upload modal */}
|
||||
{showPhoto && currentQuestion && (() => {
|
||||
const qid = currentQuestion.id;
|
||||
return (
|
||||
<PhotoUpload
|
||||
questionId={qid}
|
||||
onClose={() => setShowPhoto(false)}
|
||||
onSubmitted={async (promise) => {
|
||||
// Set loading state
|
||||
setGradingResults((prev) => new Map(prev).set(qid, { isCorrect: false, feedback: "", ocrText: "", loading: true }));
|
||||
setGradingExpanded((prev) => new Set(prev).add(qid));
|
||||
try {
|
||||
const res = await promise;
|
||||
const { is_correct, feedback, score_given } = res.grade;
|
||||
setGradingResults((prev) => new Map(prev).set(qid, {
|
||||
isCorrect: is_correct,
|
||||
feedback,
|
||||
ocrText: res.ocr_text,
|
||||
scoreGiven: score_given,
|
||||
loading: false,
|
||||
}));
|
||||
// Wrong → auto generate variant
|
||||
if (!is_correct) {
|
||||
handleGenerateVariant();
|
||||
}
|
||||
} catch {
|
||||
setGradingResults((prev) => new Map(prev).set(qid, {
|
||||
isCorrect: false,
|
||||
feedback: "Grading failed. Please try again.",
|
||||
ocrText: "",
|
||||
loading: false,
|
||||
}));
|
||||
}
|
||||
}}
|
||||
/>
|
||||
);
|
||||
})()}
|
||||
</div>
|
||||
);
|
||||
}
|
||||
79
frontend/src/styles/globals.css
Normal file
@@ -0,0 +1,79 @@
|
||||
@import "tailwindcss";
|
||||
@import "katex/dist/katex.min.css";
|
||||
|
||||
/* ── Google Fonts: Sora (headings) + IBM Plex Mono (data) ── */
|
||||
@import url("https://fonts.googleapis.com/css2?family=Sora:wght@400;500;600;700&family=IBM+Plex+Mono:wght@400;500;600&display=swap");
|
||||
|
||||
/* Hide scrollbar on horizontal tab rows */
|
||||
.hide-scrollbar { -ms-overflow-style: none; scrollbar-width: none; }
|
||||
.hide-scrollbar::-webkit-scrollbar { display: none; }
|
||||
|
||||
/* ── Knowledge Base HTML content styling (from SOS project) ── */
|
||||
.kb-html-content h1 { font-size: 1.25rem; font-weight: 700; margin: 0.75rem 0 0.5rem; line-height: 1.3; }
|
||||
.kb-html-content h2 { font-size: 1.1rem; font-weight: 600; margin: 0.75rem 0 0.4rem; color: #1e40af; border-bottom: 1px solid #e5e7eb; padding-bottom: 0.25rem; }
|
||||
.kb-html-content h3 { font-size: 0.95rem; font-weight: 600; margin: 0.6rem 0 0.3rem; color: #374151; }
|
||||
.kb-html-content h4 { font-size: 0.875rem; font-weight: 600; margin: 0.5rem 0 0.25rem; color: #6b7280; }
|
||||
.kb-html-content p { margin: 0.3rem 0; line-height: 1.6; }
|
||||
.kb-html-content p.summary { background: #eff6ff; border-left: 3px solid #3b82f6; padding: 0.5rem 0.75rem; border-radius: 0 0.25rem 0.25rem 0; color: #1e3a5f; margin-bottom: 0.75rem; }
|
||||
.kb-html-content ul, .kb-html-content ol { margin: 0.3rem 0 0.3rem 1.25rem; line-height: 1.6; }
|
||||
.kb-html-content ul { list-style: disc; }
|
||||
.kb-html-content ol { list-style: decimal; }
|
||||
.kb-html-content li { margin: 0.15rem 0; }
|
||||
.kb-html-content strong { font-weight: 600; color: #1e293b; }
|
||||
.kb-html-content blockquote { border-left: 3px solid #d1d5db; padding: 0.4rem 0.75rem; margin: 0.4rem 0; background: #f9fafb; color: #4b5563; font-style: italic; border-radius: 0 0.25rem 0.25rem 0; }
|
||||
.kb-html-content pre { background: #1e293b; color: #e2e8f0; padding: 0.75rem; border-radius: 0.375rem; overflow-x: auto; margin: 0.4rem 0; font-size: 0.8rem; }
|
||||
.kb-html-content code { font-family: ui-monospace, monospace; font-size: 0.85em; }
|
||||
.kb-html-content :not(pre) > code { background: #f1f5f9; padding: 0.1rem 0.3rem; border-radius: 0.2rem; color: #be185d; }
|
||||
.kb-html-content table { border-collapse: collapse; width: 100%; margin: 0.4rem 0; font-size: 0.8rem; }
|
||||
.kb-html-content th, .kb-html-content td { border: 1px solid #e5e7eb; padding: 0.35rem 0.5rem; text-align: left; }
|
||||
.kb-html-content th { background: #f3f4f6; font-weight: 600; }
|
||||
.kb-html-content section { margin: 0.5rem 0; }
|
||||
.kb-html-content .tag { display: inline-block; background: #dbeafe; color: #1e40af; padding: 0.1rem 0.5rem; border-radius: 9999px; font-size: 0.75rem; margin: 0.15rem 0.15rem; }
|
||||
.kb-html-content hr { border: none; border-top: 1px solid #e5e7eb; margin: 0.75rem 0; }
|
||||
|
||||
/* ── Example blocks ── */
|
||||
.kb-html-content .example { background: #fffbeb; border: 1px solid #fbbf24; border-radius: 0.375rem; padding: 0.75rem; margin: 0.6rem 0; }
|
||||
.kb-html-content .example-title { font-weight: 700; color: #92400e; margin-bottom: 0.4rem; font-size: 0.9rem; }
|
||||
.kb-html-content .example-solution { border-top: 1px dashed #d97706; padding-top: 0.4rem; }
|
||||
|
||||
/* ── LaTeX blocks ── */
|
||||
.kb-html-content pre.latex { background: #f8fafc; color: #1e293b; border: 1px solid #e2e8f0; text-align: center; font-size: 0.9rem; padding: 0.6rem; }
|
||||
.kb-html-content code.latex { background: #f1f5f9; padding: 0.1rem 0.3rem; border-radius: 0.2rem; color: #4338ca; font-size: 0.85em; }
|
||||
|
||||
/* ── Common error block (used in solution) ── */
|
||||
.kb-html-content .common-error {
|
||||
background: #fef2f2;
|
||||
border: 1px solid #fca5a5;
|
||||
border-left: 3px solid #ef4444;
|
||||
border-radius: 0.375rem;
|
||||
padding: 0.6rem 0.75rem;
|
||||
margin: 0.5rem 0;
|
||||
}
|
||||
.kb-html-content .common-error::before {
|
||||
content: "⚠ Common Mistake";
|
||||
font-weight: 700;
|
||||
color: #dc2626;
|
||||
display: block;
|
||||
margin-bottom: 0.3rem;
|
||||
font-size: 0.85rem;
|
||||
}
|
||||
|
||||
/* ── Figure description blocks ── */
|
||||
.kb-html-content .figure-desc {
|
||||
background: #faf5ff;
|
||||
border: 1px solid #d8b4fe;
|
||||
border-left: 3px solid #a855f7;
|
||||
border-radius: 0.375rem;
|
||||
padding: 0.6rem 0.75rem;
|
||||
margin: 0.5rem 0;
|
||||
}
|
||||
|
||||
/* ── AI Supplement blocks ── */
|
||||
.kb-html-content .ai-supplement {
|
||||
background: #f0fdf4;
|
||||
border: 1px solid #86efac;
|
||||
border-left: 3px solid #22c55e;
|
||||
border-radius: 0.375rem;
|
||||
padding: 0.6rem 0.75rem;
|
||||
margin: 0.5rem 0;
|
||||
}
|
||||
169
frontend/src/types/api.ts
Normal file
@@ -0,0 +1,169 @@
|
||||
export interface Paper {
|
||||
id: string;
|
||||
user_id: string | null;
|
||||
course_code: string;
|
||||
year: number;
|
||||
term: string;
|
||||
exam_type: string;
|
||||
paper_file_url: string;
|
||||
answer_file_url: string | null;
|
||||
status: "uploaded" | "processing" | "ready" | "error";
|
||||
error_message: string | null;
|
||||
total_score: number | null;
|
||||
question_count: number | null;
|
||||
topics_summary: Record<string, number> | null;
|
||||
difficulty_level: string | null;
|
||||
processing_step: string | null;
|
||||
processing_progress: number;
|
||||
processing_total: number;
|
||||
created_at: string;
|
||||
updated_at: string;
|
||||
}
|
||||
|
||||
export interface PaperSummary {
|
||||
id: string;
|
||||
course_code: string;
|
||||
year: number;
|
||||
term: string;
|
||||
exam_type: string;
|
||||
part_label: string | null;
|
||||
}
|
||||
|
||||
export interface Question {
|
||||
id: string;
|
||||
paper_id: string;
|
||||
question_number: string;
|
||||
parent_question: string | null;
|
||||
display_order: number;
|
||||
question_type: string;
|
||||
question_format?: string | null;
|
||||
question_text: string;
|
||||
score: number | null;
|
||||
page_number: number | null;
|
||||
page_y_ratio?: number | null;
|
||||
options: { label: string; text: string }[] | null;
|
||||
correct_option: string | null;
|
||||
correct_answer: string | null;
|
||||
raw_answer_text: string | null;
|
||||
topics: string[] | null;
|
||||
topic_primary?: string | null;
|
||||
analytics_topic?: string | null;
|
||||
topic_tags?: string[] | null;
|
||||
skill_tags?: string[] | null;
|
||||
difficulty: string | null;
|
||||
knowledge_reminder: string;
|
||||
ai_hint: string;
|
||||
solution: string;
|
||||
created_at: string;
|
||||
updated_at: string;
|
||||
paper?: PaperSummary;
|
||||
}
|
||||
|
||||
export interface UploadResponse {
|
||||
paper_id: string;
|
||||
status: string;
|
||||
message: string;
|
||||
}
|
||||
|
||||
export interface UserAttempt {
|
||||
id: string;
|
||||
user_id: string;
|
||||
question_id: string;
|
||||
attempt_type: string;
|
||||
user_answer: string | null;
|
||||
photo_url: string | null;
|
||||
photo_ocr_text: string | null;
|
||||
is_correct: boolean | null;
|
||||
feedback: string | null;
|
||||
error_at_step: number | null;
|
||||
in_error_book: boolean;
|
||||
mastered: boolean;
|
||||
created_at: string;
|
||||
paper_questions?: Question;
|
||||
score_given?: number | null;
|
||||
}
|
||||
|
||||
export interface VariantQuestion {
|
||||
question_text: string;
|
||||
question_type: string;
|
||||
options: { label: string; text: string }[] | null;
|
||||
correct_answer: string;
|
||||
ai_hint: string;
|
||||
knowledge_reminder: string;
|
||||
solution: string;
|
||||
}
|
||||
|
||||
export interface QuestionVariant {
|
||||
id: string;
|
||||
user_id: string;
|
||||
source_question_id: string;
|
||||
source_question_number: string;
|
||||
variant_data: VariantQuestion;
|
||||
favorited: boolean;
|
||||
created_at: string;
|
||||
}
|
||||
|
||||
export interface GradeResult {
|
||||
is_correct: boolean;
|
||||
feedback: string;
|
||||
error_at_step: number | null;
|
||||
}
|
||||
|
||||
export interface SimilarQuestion {
|
||||
id: string;
|
||||
paper_id: string;
|
||||
source: string;
|
||||
question_number: string;
|
||||
match_percent: number;
|
||||
match_reasons?: string[];
|
||||
question_type: Question["question_type"];
|
||||
question_text: string;
|
||||
topics: string[];
|
||||
difficulty: string | null;
|
||||
knowledge_reminder: string;
|
||||
ai_hint: string;
|
||||
solution: string;
|
||||
}
|
||||
|
||||
export interface AnalyticsTopicQuestion {
|
||||
paper_id: string;
|
||||
source: string;
|
||||
question_number: string;
|
||||
preview: string;
|
||||
difficulty: string | null;
|
||||
question_type: string;
|
||||
year?: number | null;
|
||||
term?: string | null;
|
||||
exam_type?: string | null;
|
||||
topics?: string[];
|
||||
}
|
||||
|
||||
export interface AnalyticsTopicEntry {
|
||||
label: string;
|
||||
count: number;
|
||||
pct: number;
|
||||
questions: AnalyticsTopicQuestion[];
|
||||
}
|
||||
|
||||
export interface CourseAnalytics {
|
||||
course_code: string;
|
||||
kpi: {
|
||||
papers: number;
|
||||
questions: number;
|
||||
topics: number;
|
||||
difficulty: string;
|
||||
};
|
||||
topic_frequency: AnalyticsTopicEntry[];
|
||||
question_types: Array<{
|
||||
label: string;
|
||||
count: number;
|
||||
pct: number;
|
||||
}>;
|
||||
difficulty_distribution: {
|
||||
easy: number;
|
||||
medium: number;
|
||||
hard: number;
|
||||
};
|
||||
high_yield_topics: string[];
|
||||
all_questions: AnalyticsTopicQuestion[];
|
||||
}
|
||||
1
frontend/src/vite-env.d.ts
vendored
Normal file
@@ -0,0 +1 @@
/// <reference types="vite/client" />
21
frontend/tsconfig.json
Normal file
@@ -0,0 +1,21 @@
{
  "compilerOptions": {
    "target": "ES2020",
    "useDefineForClassFields": true,
    "lib": ["ES2020", "DOM", "DOM.Iterable"],
    "module": "ESNext",
    "skipLibCheck": true,
    "moduleResolution": "bundler",
    "allowImportingTsExtensions": true,
    "isolatedModules": true,
    "moduleDetection": "force",
    "noEmit": true,
    "jsx": "react-jsx",
    "strict": true,
    "baseUrl": ".",
    "paths": {
      "@/*": ["src/*"]
    }
  },
  "include": ["src"]
}
22
frontend/vite.config.ts
Normal file
@@ -0,0 +1,22 @@
import { defineConfig } from "vite";
import react from "@vitejs/plugin-react";
import tailwindcss from "@tailwindcss/vite";
import { resolve } from "path";

export default defineConfig({
  plugins: [react(), tailwindcss()],
  resolve: {
    alias: {
      "@": resolve(__dirname, "src"),
    },
  },
  server: {
    port: 5173,
    proxy: {
      "/api": {
        target: "http://localhost:8000",
        changeOrigin: true,
      },
    },
  },
});
22
index 2.html
Normal file
File diff suppressed because one or more lines are too long
3
memory/MEMORY.md
Normal file
@@ -0,0 +1,3 @@
# Memory Index

- [project_pastpaper_master.md](project_pastpaper_master.md) — PastPaper Master project overview and current development progress
37
memory/project_pastpaper_master.md
Normal file
@@ -0,0 +1,37 @@
---
name: PastPaper Master project overview
description: Project tech stack, current development status, completed workstreams, and next-step priorities
type: project
---

An AI-assisted study platform that supports practice on COMP2211 exam papers. Core features: question workbench, AI trio (knowledge reminder, hint, solution), similar-question recommendations, error book, and variant-question generation.

## Tech stack
- Frontend: React 19 + TypeScript + Vite 7 + Tailwind v4
- Backend: FastAPI + Python 3.12 + uv
- DB: Supabase PostgreSQL (RLS is already provisioned; a temp user id is used for now)
- LLM: GPT-4o (laozhang proxy) + Qwen-plus fallback

## Current DB state (2026-04-10)
COMP2211 has 7 papers with status=ready and 250 subquestion-level questions, all with knowledge_reminder / ai_hint / solution / analytics_topic / topic_tags / skill_tags.

## Completed work (this session)
**Workstream A: similar-question retrieval + demo fallback removal**
- `backend/app/routers/questions.py`:
  - Added `skill_tags` to the SELECT and to the `question_topics()` computation
  - Fixed `isinstance(target_score, int)` → `(int, float)` to support NUMERIC fractional scores
  - `similarity_score()` now returns a `(score, reasons)` tuple
  - Filter threshold changed from `<= 0` to `< 10`
  - Response gains a `match_reasons` field
- `frontend/src/types/api.ts`: `SimilarQuestion` gains `match_reasons?: string[]`
- `frontend/src/components/workbench/SimilarHistoryPanel.tsx`: removed all demo fallbacks in favor of real empty/error states; shows match_reasons chips
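
The retrieval changes listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `questions.py`: the weights, the `MIN_SCORE` constant, the `rank_similar` helper, and the dict-shaped rows are all hypothetical. Only the `(score, reasons)` return shape, the `(int, float)` check, the `< 10` cutoff, and the `match_reasons` / `match_percent` response fields come from the project itself.

```python
MIN_SCORE = 10  # hypothetical name; candidates scoring below this are dropped ("< 10")


def similarity_score(target: dict, candidate: dict) -> tuple[float, list[str]]:
    """Return (score, reasons) for one candidate. All weights are illustrative."""
    score = 0.0
    reasons: list[str] = []

    # Overlap on topic and skill tags; a missing or None field counts as empty.
    for field, weight, label in (("topic_tags", 20, "topics"), ("skill_tags", 10, "skills")):
        shared = set(target.get(field) or []) & set(candidate.get(field) or [])
        if shared:
            score += weight * len(shared)
            reasons.append(f"shared {label}: {', '.join(sorted(shared))}")

    qtype = target.get("question_type")
    if qtype and qtype == candidate.get("question_type"):
        score += 5
        reasons.append("same question type")

    # Accept float as well as int so fractional (NUMERIC) marks are compared too.
    t_score, c_score = target.get("score"), candidate.get("score")
    if isinstance(t_score, (int, float)) and isinstance(c_score, (int, float)):
        if abs(t_score - c_score) <= 1:
            score += 5
            reasons.append("similar marks")

    return score, reasons


def rank_similar(target: dict, candidates: list[dict]) -> list[dict]:
    """Score every candidate, drop weak matches, and sort best-first."""
    results = []
    for cand in candidates:
        score, reasons = similarity_score(target, cand)
        if score < MIN_SCORE:
            continue  # too weak to show as a "similar question"
        results.append({
            **cand,
            "match_percent": min(100, round(score)),
            "match_reasons": reasons,
        })
    results.sort(key=lambda r: r["match_percent"], reverse=True)
    return results
```

Returning the reasons alongside the score lets the frontend render them as chips without re-deriving why a question matched, and the non-zero cutoff keeps candidates that share only a question type out of the panel.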

## Next-step priorities (from HANDOFF_COMP2211.md)
1. ✅ Workstream A: similar-question retrieval + demo fallback removal (done)
2. Workstream B: deeper analytics (per-paper drill-down, topic frequency over time, high-frequency topics)
3. Workstream C: LaTeX/KaTeX rendering quality (centralized normalization, OCR noise removal)
4. Workstream D: user-upload deduplication (compare against papers already in course_library)
5. Workstream E: UI/UX pass (QuestionNav, status badges, workbench hierarchy)

**Why:** This is the development order recommended in the HANDOFF document, which puts data stability first.
**How to apply:** Start the next session from Workstream B (deeper analytics).
1
pastpaper-scraper
Submodule
Submodule pastpaper-scraper added at 36d4a450cd
25
pitch_script.md
Normal file
@@ -0,0 +1,25 @@
# KnowIt Pitch — Product Demo (Pages 5-6, ~45s)

## Transition In

> Now let me show you the product.

## Page 5 — Product Demo

> This is PastPaper Master. Search any course, download past papers, and hit "AI Analyze" — our system reads every page, extracts each question, and generates knowledge reminders, hints, and full solutions automatically.
>
> It's powered by Gemini vision and DeepSeek, with a RAG pipeline connecting papers, recordings, and courseware.

## Page 6 — Workflow

> Here's the full student workflow.
>
> **Download** papers. **AI analysis** breaks down topics and difficulty. **Upload your answers** — AI grades them instantly with detailed feedback.
>
> Wrong answers go into your **mistake book**. AI generates **variant questions** on the same topic, plus retrieves **similar questions** from other exams.
>
> And **smart flashcards** auto-generated for quick revision — already live for pharmacology students.

## Transition Out

> One closed loop — find, practice, grade, review, master. Over to [name] on the market.
207
supabase/migrations/001_init_schema.sql
Normal file
@@ -0,0 +1,207 @@
-- ============================================
-- PastPaper Master — Initial database schema
-- Version: 001
-- Date: 2025-03-11
-- ============================================

-- Enable required extensions
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

-- ============================================
-- Table 1: papers — uploaded exam papers
-- ============================================
CREATE TABLE papers (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL REFERENCES auth.users(id) ON DELETE CASCADE,

    -- Metadata (filled in by the user at upload time)
    course_code TEXT NOT NULL,  -- "COMP2011"
    year INTEGER NOT NULL,  -- 2024
    term TEXT NOT NULL CHECK (term IN ('fall', 'spring', 'summer')),
    exam_type TEXT NOT NULL CHECK (exam_type IN ('midterm', 'final', 'quiz')),

    -- Files (Supabase Storage)
    paper_file_url TEXT NOT NULL,  -- exam paper PDF
    answer_file_url TEXT,  -- answer PDF (optional)

    -- Processing status
    status TEXT NOT NULL DEFAULT 'uploaded'
        CHECK (status IN ('uploaded', 'processing', 'ready', 'error')),
    error_message TEXT,  -- error message when processing fails

    -- Extracted raw text (cache)
    paper_extracted_text TEXT,
    answer_extracted_text TEXT,

    -- Whole-paper overview (AI-generated)
    total_score INTEGER,
    question_count INTEGER,
    topics_summary JSONB,  -- {"Linked List": 40, "Recursion": 30}
    difficulty_level TEXT CHECK (difficulty_level IN ('easy', 'medium', 'hard')),

    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- ============================================
-- Table 2: paper_questions — per-question data
-- ============================================
CREATE TABLE paper_questions (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    paper_id UUID NOT NULL REFERENCES papers(id) ON DELETE CASCADE,

    -- Question identifiers
    question_number TEXT NOT NULL,  -- "1", "1a", "2b"
    parent_question TEXT,  -- parent question number of a subquestion: "1a" → "1"
    display_order INTEGER NOT NULL,  -- display order

    -- Question content
    question_type TEXT NOT NULL
        CHECK (question_type IN ('mc', 'fill_blank', 'long_question')),
    question_text TEXT NOT NULL,  -- original question text
    score INTEGER,  -- marks
    page_number INTEGER,  -- PDF page number (keeps the PDF pane and the question pane in sync)

    -- Multiple-choice only
    options JSONB,  -- [{"label":"A","text":"..."},...]
    correct_option TEXT,  -- "B"

    -- Fill-in-the-blank only
    correct_answer TEXT,  -- correct answer
    accept_variants TEXT[],  -- equivalent forms ["O(nlogn)","O(n log n)"]

    -- Raw answer extracted from the answer PDF (all question types)
    raw_answer_text TEXT,

    -- Topic tags
    topics TEXT[],  -- ["Linked List","Pointer"]
    difficulty TEXT CHECK (difficulty IN ('easy', 'medium', 'hard')),

    -- AI trio (HTML + KaTeX)
    knowledge_reminder TEXT,  -- knowledge reminder
    ai_hint TEXT,  -- AI hint
    solution TEXT,  -- solution (step-by-step derivation)

    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- ============================================
-- Table 3: user_attempts — user answer records
-- Implemented in Phase 4; the table structure is created up front
-- ============================================
CREATE TABLE user_attempts (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL REFERENCES auth.users(id) ON DELETE CASCADE,
    question_id UUID NOT NULL REFERENCES paper_questions(id) ON DELETE CASCADE,

    -- The user's answer
    attempt_type TEXT NOT NULL
        CHECK (attempt_type IN ('select', 'input', 'photo')),
    user_answer TEXT,  -- selected option / typed answer
    photo_url TEXT,  -- uploaded photo
    photo_ocr_text TEXT,  -- OCR result

    -- AI verdict
    is_correct BOOLEAN,
    feedback TEXT,  -- HTML — step-by-step error analysis
    error_at_step INTEGER,  -- step at which the error starts

    -- Error book
    in_error_book BOOLEAN NOT NULL DEFAULT false,
    mastered BOOLEAN NOT NULL DEFAULT false,

    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- ============================================
-- Indexes
-- ============================================
CREATE INDEX idx_papers_user ON papers(user_id);
CREATE INDEX idx_papers_course ON papers(course_code);
CREATE INDEX idx_papers_status ON papers(status);

CREATE INDEX idx_questions_paper ON paper_questions(paper_id);
CREATE INDEX idx_questions_type ON paper_questions(question_type);
CREATE INDEX idx_questions_topics ON paper_questions USING GIN(topics);

CREATE INDEX idx_attempts_user ON user_attempts(user_id);
CREATE INDEX idx_attempts_question ON user_attempts(question_id);
CREATE INDEX idx_attempts_errorbook ON user_attempts(user_id)
    WHERE in_error_book = true;

-- ============================================
-- RLS policies
-- ============================================
ALTER TABLE papers ENABLE ROW LEVEL SECURITY;
ALTER TABLE paper_questions ENABLE ROW LEVEL SECURITY;
ALTER TABLE user_attempts ENABLE ROW LEVEL SECURITY;

-- papers: users can only see their own uploads (to be revisited when a public library is added)
CREATE POLICY "Users can view own papers"
    ON papers FOR SELECT
    USING (auth.uid() = user_id);

CREATE POLICY "Users can insert own papers"
    ON papers FOR INSERT
    WITH CHECK (auth.uid() = user_id);

CREATE POLICY "Users can update own papers"
    ON papers FOR UPDATE
    USING (auth.uid() = user_id);

CREATE POLICY "Users can delete own papers"
    ON papers FOR DELETE
    USING (auth.uid() = user_id);

-- paper_questions: follows the permissions of the parent paper
CREATE POLICY "Users can view questions of own papers"
    ON paper_questions FOR SELECT
    USING (
        EXISTS (
            SELECT 1 FROM papers
            WHERE papers.id = paper_questions.paper_id
            AND papers.user_id = auth.uid()
        )
    );

-- service_role is used by the backend to write questions (for the processing pipeline)
-- The frontend never writes questions directly; it triggers backend processing through the API

-- user_attempts: users can only read/write their own
CREATE POLICY "Users can view own attempts"
    ON user_attempts FOR SELECT
    USING (auth.uid() = user_id);

CREATE POLICY "Users can insert own attempts"
    ON user_attempts FOR INSERT
    WITH CHECK (auth.uid() = user_id);

CREATE POLICY "Users can update own attempts"
    ON user_attempts FOR UPDATE
    USING (auth.uid() = user_id);

-- ============================================
-- Trigger to auto-update updated_at
-- ============================================
CREATE OR REPLACE FUNCTION update_updated_at()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = now();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER papers_updated_at
    BEFORE UPDATE ON papers
    FOR EACH ROW EXECUTE FUNCTION update_updated_at();

CREATE TRIGGER questions_updated_at
    BEFORE UPDATE ON paper_questions
    FOR EACH ROW EXECUTE FUNCTION update_updated_at();

-- ============================================
-- Storage bucket
-- ============================================
-- Create the "papers" bucket manually in the Supabase Dashboard,
-- or create it through the API (handled during backend initialization)
38
supabase/migrations/002_course_library_fields.sql
Normal file
@@ -0,0 +1,38 @@
-- ============================================
-- PastPaper Master — Shared course library fields
-- Version: 002
-- Date: 2026-03-24
-- ============================================

-- Shared library / canonical import metadata on papers
ALTER TABLE papers
    ADD COLUMN IF NOT EXISTS source_kind TEXT NOT NULL DEFAULT 'user_upload'
        CHECK (source_kind IN ('user_upload', 'course_library')),
    ADD COLUMN IF NOT EXISTS source_exam_key TEXT,
    ADD COLUMN IF NOT EXISTS part_label TEXT
        CHECK (part_label IN ('A', 'B')),
    ADD COLUMN IF NOT EXISTS source_question_filename TEXT,
    ADD COLUMN IF NOT EXISTS source_answer_filename TEXT;

CREATE UNIQUE INDEX IF NOT EXISTS idx_papers_course_library_exam_key
    ON papers(source_exam_key)
    WHERE source_kind = 'course_library' AND source_exam_key IS NOT NULL;

CREATE INDEX IF NOT EXISTS idx_papers_course_lookup
    ON papers(course_code, year, term, exam_type, part_label);

-- Grading results should persist awarded score
ALTER TABLE user_attempts
    ADD COLUMN IF NOT EXISTS score_given INTEGER;

CREATE INDEX IF NOT EXISTS idx_attempts_errorbook_active
    ON user_attempts(user_id, created_at DESC)
    WHERE in_error_book = true AND mastered = false;

-- The backend and frontend already support true_false; schema must match.
ALTER TABLE paper_questions
    DROP CONSTRAINT IF EXISTS paper_questions_question_type_check;

ALTER TABLE paper_questions
    ADD CONSTRAINT paper_questions_question_type_check
    CHECK (question_type IN ('mc', 'true_false', 'fill_blank', 'long_question'));
41
supabase/migrations/003_question_taxonomy_fields.sql
Normal file
@@ -0,0 +1,41 @@
-- ============================================
-- PastPaper Master — Question taxonomy fields
-- Version: 003
-- Date: 2026-03-24
-- ============================================

-- A question needs multiple classification layers:
-- 1) question_format: how the student interacts with it
-- 2) topic_tags / topic_primary / analytics_topic: course knowledge taxonomy
-- 3) skill_tags: what kind of thinking task the question requires
ALTER TABLE paper_questions
  ADD COLUMN IF NOT EXISTS question_format TEXT
    CHECK (
      question_format IN (
        'mc',
        'true_false',
        'fill_blank',
        'short_answer',
        'long_answer',
        'coding'
      )
    ),
  ADD COLUMN IF NOT EXISTS topic_primary TEXT,
  ADD COLUMN IF NOT EXISTS analytics_topic TEXT,
  ADD COLUMN IF NOT EXISTS topic_tags TEXT[],
  ADD COLUMN IF NOT EXISTS skill_tags TEXT[];

-- Keep the legacy topics column for backward compatibility for now.
-- New analytics and retrieval code should gradually move to analytics_topic/topic_tags.

CREATE INDEX IF NOT EXISTS idx_questions_question_format
  ON paper_questions(question_format);

CREATE INDEX IF NOT EXISTS idx_questions_analytics_topic
  ON paper_questions(analytics_topic);

CREATE INDEX IF NOT EXISTS idx_questions_topic_tags
  ON paper_questions USING GIN(topic_tags);

CREATE INDEX IF NOT EXISTS idx_questions_skill_tags
  ON paper_questions USING GIN(skill_tags);
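The three classification layers from migration 003 can be mirrored on the application side roughly as follows. This is a minimal sketch, not the project's actual model: the `QuestionTaxonomy` class, the `has_all_topics` helper, and the example tag values are hypothetical. The helper imitates the array-containment test (`topic_tags @> ARRAY[...]`) that the GIN indexes are suited to.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Values accepted by the question_format CHECK in migration 003.
ALLOWED_FORMATS = {
    "mc", "true_false", "fill_blank", "short_answer", "long_answer", "coding",
}


@dataclass
class QuestionTaxonomy:
    """Hypothetical in-memory view of the taxonomy columns on paper_questions."""
    question_format: Optional[str] = None          # layer 1: interaction format
    topic_primary: Optional[str] = None            # layer 2: knowledge taxonomy
    analytics_topic: Optional[str] = None
    topic_tags: List[str] = field(default_factory=list)
    skill_tags: List[str] = field(default_factory=list)  # layer 3: thinking task

    def __post_init__(self):
        # Like the SQL CHECK, a missing (NULL) format is allowed.
        if self.question_format is not None and self.question_format not in ALLOWED_FORMATS:
            raise ValueError(f"unsupported question_format: {self.question_format!r}")


def has_all_topics(question: QuestionTaxonomy, required: List[str]) -> bool:
    """True when every required tag is present, like topic_tags @> required."""
    return set(required) <= set(question.topic_tags)


# Example tag values are made up for illustration.
q = QuestionTaxonomy(
    question_format="mc",
    topic_primary="knn",
    analytics_topic="knn",
    topic_tags=["knn", "distance_metrics"],
    skill_tags=["compute"],
)
matches_knn = has_all_topics(q, ["knn"])
matches_both = has_all_topics(q, ["knn", "perceptron"])
```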
@@ -0,0 +1,30 @@
-- ============================================
-- PastPaper Master — Decouple course library papers from auth users
-- Version: 004
-- Date: 2026-03-24
-- ============================================

-- Course-library papers should not depend on a concrete auth.users row.
-- User-uploaded papers still keep user_id populated.
ALTER TABLE papers
  ALTER COLUMN user_id DROP NOT NULL;

-- Keep existing FK so user-owned papers can still reference auth.users,
-- while course-library rows simply use NULL.

-- Tighten the intended invariant with a check constraint:
-- - user_upload rows must have user_id
-- - course_library rows must not have user_id
ALTER TABLE papers
  DROP CONSTRAINT IF EXISTS papers_source_kind_user_id_check;

ALTER TABLE papers
  ADD CONSTRAINT papers_source_kind_user_id_check
  CHECK (
    (source_kind = 'user_upload' AND user_id IS NOT NULL)
    OR
    (source_kind = 'course_library' AND user_id IS NULL)
  );

-- Existing RLS policies continue to apply to user-owned rows.
-- Course-library rows are accessed through the backend service role.
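The invariant enforced by `papers_source_kind_user_id_check` can also be checked before a row is sent to the database, which gives a clearer error than a constraint violation. The function below is a hypothetical helper that mirrors the SQL predicate; it is not taken from the project's backend.

```python
from typing import Optional


def satisfies_source_kind_invariant(source_kind: str, user_id: Optional[str]) -> bool:
    """Mirror of the CHECK constraint added in migration 004.

    - user_upload rows must carry a user_id
    - course_library rows must not carry a user_id
    Any other source_kind fails, as it would in the SQL predicate.
    """
    if source_kind == "user_upload":
        return user_id is not None
    if source_kind == "course_library":
        return user_id is None
    return False


# The user id below is a made-up value for illustration.
upload_ok = satisfies_source_kind_invariant("user_upload", "3f1c-example-user")
upload_missing_owner = satisfies_source_kind_invariant("user_upload", None)
library_ok = satisfies_source_kind_invariant("course_library", None)
library_with_owner = satisfies_source_kind_invariant("course_library", "3f1c-example-user")
```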
27
supabase/migrations/005_allow_long_question_format_alias.sql
Normal file
@@ -0,0 +1,27 @@
-- ============================================
-- PastPaper Master — Allow legacy long_question format alias
-- Version: 005
-- Date: 2026-03-24
-- ============================================
--
-- Some existing seeds and older generated SQL used `long_question` in the
-- `question_format` column, while the 003 taxonomy migration introduced
-- `long_answer` as the canonical value. Allow both temporarily so historical
-- inserts do not fail. New generators should continue emitting `long_answer`.

ALTER TABLE paper_questions
  DROP CONSTRAINT IF EXISTS paper_questions_question_format_check;

ALTER TABLE paper_questions
  ADD CONSTRAINT paper_questions_question_format_check
  CHECK (
    question_format IN (
      'mc',
      'true_false',
      'fill_blank',
      'short_answer',
      'long_answer',
      'long_question',
      'coding'
    )
  );
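Because migration 005 accepts `long_question` only as a temporary alias, code that generates or reads `question_format` may want to fold the alias into the canonical value. The following is a minimal sketch of such a normaliser; the function name and the alias table are assumptions, not part of the repository as shown.

```python
# Canonical values from migration 003.
CANONICAL_FORMATS = {
    "mc", "true_false", "fill_blank", "short_answer", "long_answer", "coding",
}

# Legacy alias tolerated by migration 005.
LEGACY_ALIASES = {"long_question": "long_answer"}


def canonical_question_format(value: str) -> str:
    """Hypothetical helper: map a stored or generated format to its canonical form."""
    value = LEGACY_ALIASES.get(value, value)
    if value not in CANONICAL_FORMATS:
        raise ValueError(f"unsupported question_format: {value!r}")
    return value


legacy = canonical_question_format("long_question")
current = canonical_question_format("long_answer")
unchanged = canonical_question_format("coding")
```

Normalising at the application boundary would let the alias be dropped from the constraint later without touching callers.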
17
supabase/migrations/006_make_scores_numeric.sql
Normal file
@@ -0,0 +1,17 @@
-- ============================================
-- PastPaper Master — Make score fields numeric
-- Version: 006
-- Date: 2026-04-10
-- ============================================

ALTER TABLE paper_questions
  ALTER COLUMN score TYPE NUMERIC
  USING score::NUMERIC;

ALTER TABLE papers
  ALTER COLUMN total_score TYPE NUMERIC
  USING total_score::NUMERIC;

ALTER TABLE user_attempts
  ALTER COLUMN score_given TYPE NUMERIC
  USING score_given::NUMERIC;
36
supabase/migrations/007_fulltext_search.sql
Normal file
@@ -0,0 +1,36 @@
-- 007: Full-text search on paper_questions.question_text
--
-- Adds a tsvector generated column (auto-maintained by PostgreSQL on every
-- INSERT/UPDATE), a GIN index for fast @@ queries, and a batch-scoring RPC
-- used by the similar-question retrieval endpoint.

ALTER TABLE paper_questions
  ADD COLUMN IF NOT EXISTS search_text tsvector
    GENERATED ALWAYS AS (
      to_tsvector('english', coalesce(question_text, ''))
    ) STORED;

CREATE INDEX IF NOT EXISTS idx_pq_search_text
  ON paper_questions USING gin(search_text);

-- text_similarity_scores(query_text, candidate_ids)
-- Returns one row per candidate ID with a ts_rank_cd score normalised by
-- document length (normalization flag = 1 divides the rank by 1 + the
-- logarithm of the document length; dividing by unique word count would be
-- flag 8). Questions that share no lexemes with the query still appear in
-- the result with score = 0, so the caller gets a score for every candidate
-- ID that exists in paper_questions.
CREATE OR REPLACE FUNCTION text_similarity_scores(
  query_text    text,
  candidate_ids uuid[]
)
RETURNS TABLE (question_id uuid, text_score float4)
LANGUAGE sql STABLE AS $$
  SELECT
    id,
    ts_rank_cd(
      search_text,
      plainto_tsquery('english', query_text),
      1  -- normalise by 1 + log(document length)
    )::float4
  FROM paper_questions
  WHERE id = ANY(candidate_ids);
$$;
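`text_similarity_scores` returns rows only for candidate IDs that exist in `paper_questions`, so a caller that needs a score for every requested ID may want to fill in defaults. The sketch below assumes the RPC result arrives as a list of dicts keyed by the function's output columns (`question_id`, `text_score`); the `build_score_map` helper and the sample IDs are hypothetical.

```python
from typing import Dict, Iterable, List, Mapping


def build_score_map(
    candidate_ids: Iterable[str],
    rpc_rows: List[Mapping[str, object]],
) -> Dict[str, float]:
    """Hypothetical helper: complete {question_id: text_score} map.

    Candidates missing from the RPC result (for example, deleted rows)
    default to 0.0; rows for IDs that were never requested are ignored.
    """
    scores = {cid: 0.0 for cid in candidate_ids}
    for row in rpc_rows:
        qid = str(row["question_id"])
        if qid in scores:
            scores[qid] = float(row["text_score"])
    return scores


# Sample data shaped like the function's RETURNS TABLE columns (made-up IDs).
candidates = ["q-1", "q-2", "q-3"]
rows = [
    {"question_id": "q-1", "text_score": 0.25},
    {"question_id": "q-2", "text_score": 0},
]
score_map = build_score_map(candidates, rows)
```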
2
supabase/migrations/008_add_page_y_ratio.sql
Normal file
@@ -0,0 +1,2 @@
ALTER TABLE paper_questions
  ADD COLUMN IF NOT EXISTS page_y_ratio NUMERIC;
27
supabase/migrations/008_fix_storage_url_placeholder.sql
Normal file
@@ -0,0 +1,27 @@
-- 008: Replace __SUPABASE_STORAGE_PUBLIC_BASE_URL__ placeholder in paper URLs
--
-- The course-library seed (comp2211_course_library_papers.sql) was inserted
-- without substituting the placeholder. This migration replaces it with the
-- real Supabase Storage public base URL for the `papers` bucket.

UPDATE papers
SET paper_file_url = REPLACE(
  paper_file_url,
  '__SUPABASE_STORAGE_PUBLIC_BASE_URL__',
  'https://pvcxipwovpwrurebouwg.supabase.co/storage/v1/object/public/papers'
)
WHERE paper_file_url LIKE '%__SUPABASE_STORAGE_PUBLIC_BASE_URL__%';

UPDATE papers
SET answer_file_url = REPLACE(
  answer_file_url,
  '__SUPABASE_STORAGE_PUBLIC_BASE_URL__',
  'https://pvcxipwovpwrurebouwg.supabase.co/storage/v1/object/public/papers'
)
WHERE answer_file_url LIKE '%__SUPABASE_STORAGE_PUBLIC_BASE_URL__%';

-- Verify: should return 0 rows
SELECT id, course_code, year, term, exam_type, paper_file_url, answer_file_url
FROM papers
WHERE paper_file_url LIKE '%__SUPABASE_STORAGE_PUBLIC_BASE_URL__%'
   OR answer_file_url LIKE '%__SUPABASE_STORAGE_PUBLIC_BASE_URL__%';
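The fix-up above is needed because the seed was loaded with the placeholder still in place. One way to avoid repeating that is to substitute the placeholder when the seed rows are generated and to refuse an unset base URL. The helper below is a sketch under that assumption; `resolve_storage_url`, the example project host, and the file path are hypothetical.

```python
from typing import Optional

PLACEHOLDER = "__SUPABASE_STORAGE_PUBLIC_BASE_URL__"


def resolve_storage_url(url: Optional[str], base_url: str) -> Optional[str]:
    """Hypothetical helper: swap the seed placeholder for the real base URL.

    None is passed through unchanged so optional answer files stay NULL.
    """
    if not base_url:
        raise ValueError("base_url must be set before generating seed rows")
    if url is None:
        return None
    return url.replace(PLACEHOLDER, base_url.rstrip("/"))


# Example values only; the host and object path are invented.
base = "https://example-project.supabase.co/storage/v1/object/public/papers/"
seeded = PLACEHOLDER + "/comp2211/2024-spring-final.pdf"
resolved = resolve_storage_url(seeded, base)
untouched = resolve_storage_url(None, base)
```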