From 6110ef4bc2db14dc5fb479ce3293551149eec3d4 Mon Sep 17 00:00:00 2001
From: Knowit <1604106@ce.buet.ac.bd>
Date: Mon, 20 Apr 2026 15:03:06 +0800
Subject: [PATCH] Initial commit: bookkeeping skill

Receipt-image to Google Sheets expense logger with HKD conversion.
Includes SKILL.md, categories/schema reference, config template,
and Python scripts for FX conversion (frankfurter.app) and Sheets append.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .gitignore            |  4 +++
 SKILL.md              | 71 ++++++++++++++++++++++++++++++++++++++
 categories.md         | 26 ++++++++++++++
 config.example.json   |  6 ++++
 schema.md             | 20 +++++++++++
 scripts/append_row.py | 80 +++++++++++++++++++++++++++++++++++++++++++
 scripts/fx_convert.py | 46 +++++++++++++++++++++++++
 scripts/setup.md      | 39 +++++++++++++++++++++
 8 files changed, 292 insertions(+)
 create mode 100644 .gitignore
 create mode 100644 SKILL.md
 create mode 100644 categories.md
 create mode 100644 config.example.json
 create mode 100644 schema.md
 create mode 100755 scripts/append_row.py
 create mode 100755 scripts/fx_convert.py
 create mode 100644 scripts/setup.md

diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..a4fb6d7
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,4 @@
+config.json
+*.pyc
+__pycache__/
+.DS_Store
diff --git a/SKILL.md b/SKILL.md
new file mode 100644
index 0000000..2562986
--- /dev/null
+++ b/SKILL.md
@@ -0,0 +1,71 @@
+---
+name: bookkeeping
+description: Extract expense data from a receipt/invoice image (plus optional caption) and append it to a Google Sheet with HKD conversion. Use whenever the user provides a receipt image and wants it logged, or forwards a WhatsApp-style message that contains a receipt.
+---
+
+# Bookkeeping — Receipt → Google Sheet
+
+Default working language: **English**. All written output (row values, replies) is English unless the user explicitly requests otherwise.
+
+## When to use
+- User provides a receipt / invoice / payment-screenshot image and wants it recorded.
+- User says "log this", "record this expense", "add to bookkeeping", "记一下" with an image.
+- Caption may be empty, or may add context (who paid, split %, category hint, payment method). Always incorporate caption if present.
+
+## Prerequisites (check once per session)
+1. `~/.claude/skills/bookkeeping/config.json` exists. If only `config.example.json` is present, **stop** and tell the user to copy it and fill in `sheet_id`, `worksheet`, `service_account_path`. Point them at `scripts/setup.md`.
+2. Python deps installed: `google-api-python-client`, `google-auth`. If `append_row.py` fails with ImportError, instruct the user to run `pip install google-api-python-client google-auth` and retry.
+
+## Workflow
+1. **Extract** fields from the image using vision. Caption is auxiliary — never let caption override a legible receipt, but use it to fill gaps (category hint, note, payment method).
+2. **Normalize** per the rules below.
+3. **Convert to HKD** by running:
+   ```
+   python ~/.claude/skills/bookkeeping/scripts/fx_convert.py <amount> <currency> --date <yyyy-mm-dd>
+   ```
+   Output is `<amount_hkd>\t<fx_rate>\t<fx_date>` (tab-separated). If currency is HKD, skip the call and set `amount_hkd=amount`, `fx_rate=1`, `fx_date=<date>`.
+4. **Append the row** by piping JSON into:
+   ```
+   echo '<json>' | python ~/.claude/skills/bookkeeping/scripts/append_row.py
+   ```
+   Keys must match `schema.md` (snake_case: `date`, `merchant`, `category`, `amount`, `currency`, `amount_hkd`, `fx_rate`, `fx_date`, `payment_method`, `line_items`, `raw_ocr`, `note`, `receipt`). Script adds `logged_at` automatically.
+5. **Report** to the user: the row you wrote and any field you had to guess.
+
+## Normalization rules
+- **Date** → `yyyy-mm-dd`. Use the receipt date. If no date visible, fall back to today and flag it.
+- **Amount** → grand total, numeric, no currency symbol. Tax/tip already included.
+- **Currency** → ISO-4217 code (USD, CNY, HKD, JPY, EUR, GBP, …). Infer from symbol and language. If symbol is just "$" and context is ambiguous, check merchant country; if still unclear AND amount > 500 units, **ask**.
+- **Merchant** → clean chain name. Strip store numbers, addresses, register IDs. "STARBUCKS #1234 SHENZHEN" → `Starbucks`.
+- **Category** → pick exactly one from `categories.md`. Default to `Other` rather than invent a new one.
+- **Payment method** → one of `cash`, `card`, `alipay`, `wechat`, `octopus`, `other`, or `""`. Only fill if visible on receipt or stated in caption.
+- **Line items** → `"name xQty price; name xQty price"`. Skip (empty string) if handwritten, illegible, or trivially one item.
+- **Raw OCR** → full receipt text, newlines as literal `\n`. For audit. Trim trailing whitespace only.
+- **Note** → user's caption verbatim. Empty string if none.
+- **Receipt** → if image arrived as a URL, use it. If it's a local path, write the local path and tell the user to wire Drive/S3 upload if they want hotlinkable receipts.
+
+## Caption handling
+- Caption overrides only when the receipt lacks that field, OR when caption is an explicit correction ("this is actually groceries, not food").
+- Split expenses: if caption says "split 50/50 with Alice", still log the **full** amount — splitting belongs in a separate sheet/column. Note the split in the Note field.
+
+## When to ask vs. guess
+- **Ask** when: currency is ambiguous and amount is non-trivial; date could be MM/DD vs DD/MM and merchant country is unclear; amount is unreadable.
+- **Guess + flag** when: category is multi-fit (pick most specific, mention it); merchant name is partially OCR'd (pick best guess, mention it).
+- **Never silently guess** the amount or the currency.
+
+## Reply format
+```
+Logged to <sheet> (row <n>):
+  Date:     2026-04-20
+  Merchant: Starbucks
+  Category: Food & Drink
+  Amount:   CNY 48.00 → HKD 51.74 (fx 1.0779 on 2026-04-20)
+  Payment:  wechat
+  Note:     "breakfast with Mark"
+Guessed: category (receipt didn't specify; "Food & Drink" based on items).
+```
+Omit the `Guessed:` line when nothing was guessed.
+
+## Reference files
+- `categories.md` — fixed category list.
+- `schema.md` — Google Sheet column order and formats.
+- `scripts/setup.md` — one-time setup (service account, sheet share, deps).
diff --git a/categories.md b/categories.md
new file mode 100644
index 0000000..a01fce0
--- /dev/null
+++ b/categories.md
@@ -0,0 +1,26 @@
+# Expense Categories
+
+Pick exactly one per entry. Default to `Other` rather than invent a new category.
+
+- Food & Drink
+- Groceries
+- Transport
+- Travel & Lodging
+- Entertainment
+- Shopping
+- Health & Medical
+- Utilities
+- Housing & Rent
+- Fees & Services
+- Gifts & Donations
+- Work & Office
+- Education
+- Other
+
+## Picking tips
+- **Food & Drink** = restaurants, cafes, takeaway, bars.
+- **Groceries** = supermarkets, wet markets, bulk food stores (even if also selling prepared food).
+- **Transport** = taxi, MTR, bus, fuel, parking, tolls.
+- **Travel & Lodging** = hotels, flights, trains (intercity), Airbnb.
+- **Fees & Services** = bank charges, subscriptions, laundry, repairs.
+- **Work & Office** = supplies, co-working, work-related software.
diff --git a/config.example.json b/config.example.json
new file mode 100644
index 0000000..bc80ca1
--- /dev/null
+++ b/config.example.json
@@ -0,0 +1,6 @@
+{
+  "sheet_id": "PUT_GOOGLE_SHEET_ID_HERE",
+  "worksheet": "Expenses",
+  "service_account_path": "~/.config/gcp/bookkeeping-sa.json",
+  "hkd_fx_provider": "frankfurter"
+}
diff --git a/schema.md b/schema.md
new file mode 100644
index 0000000..829c8d6
--- /dev/null
+++ b/schema.md
@@ -0,0 +1,20 @@
+# Google Sheet Column Schema
+
+`append_row.py` writes columns in this exact order. Create the header row manually once during setup.
+
+| Col | Header          | Format                                       | Source                          |
+|-----|-----------------|----------------------------------------------|---------------------------------|
+| A   | Date            | `yyyy-mm-dd`                                 | receipt date, else today        |
+| B   | Merchant        | string                                       | cleaned chain name              |
+| C   | Category        | one of `categories.md`                       | inferred                        |
+| D   | Amount          | decimal                                      | grand total, original currency  |
+| E   | Currency        | ISO-4217 (USD, CNY, HKD, …)                  | inferred                        |
+| F   | Amount (HKD)    | decimal                                      | `fx_convert.py` output          |
+| G   | FX Rate         | decimal, original→HKD                        | `fx_convert.py` output          |
+| H   | FX Date         | `yyyy-mm-dd`                                 | `fx_convert.py` output          |
+| I   | Payment Method  | `cash` / `card` / `alipay` / `wechat` / `octopus` / `other` / `""` | receipt or caption |
+| J   | Line Items      | `"name xQty price; name xQty price"` or `""` | receipt                         |
+| K   | Raw OCR         | full text, `\n` for newlines                 | receipt                         |
+| L   | Note            | verbatim caption                             | user                            |
+| M   | Receipt         | URL or local path                            | image ref                       |
+| N   | Logged At       | ISO-8601 UTC timestamp                       | auto (`append_row.py`)          |
diff --git a/scripts/append_row.py b/scripts/append_row.py
new file mode 100755
index 0000000..e3b2754
--- /dev/null
+++ b/scripts/append_row.py
@@ -0,0 +1,80 @@
+#!/usr/bin/env python3
+"""
+append_row.py — append one expense row to the configured Google Sheet.
+
+Reads config from ../config.json.
+Reads a single JSON object from stdin. Keys (all optional; missing -> ""):
+  date, merchant, category, amount, currency,
+  amount_hkd, fx_rate, fx_date, payment_method,
+  line_items, raw_ocr, note, receipt
+
+`logged_at` is set automatically to now (UTC, ISO-8601).
+
+Prints the updated range on success; exits non-zero on failure.
+"""
+from __future__ import annotations
+
+import json
+import os
+import sys
+from datetime import datetime, timezone
+from pathlib import Path
+
+from google.oauth2.service_account import Credentials
+from googleapiclient.discovery import build
+
+
+CONFIG_PATH = Path(__file__).resolve().parent.parent / "config.json"
+
+COLUMNS = [
+    "date", "merchant", "category", "amount", "currency",
+    "amount_hkd", "fx_rate", "fx_date", "payment_method",
+    "line_items", "raw_ocr", "note", "receipt", "logged_at",
+]
+
+
+def load_config() -> dict:
+    if not CONFIG_PATH.exists():
+        sys.exit(
+            f"config.json not found at {CONFIG_PATH}. "
+            f"Copy config.example.json to config.json and fill it in."
+        )
+    cfg = json.loads(CONFIG_PATH.read_text())
+    cfg["service_account_path"] = os.path.expanduser(cfg["service_account_path"])
+    return cfg
+
+
+def main() -> int:
+    cfg = load_config()
+    row = json.loads(sys.stdin.read())
+    row.setdefault(
+        "logged_at",
+        datetime.now(timezone.utc).isoformat(timespec="seconds"),
+    )
+
+    values = [str(row.get(col, "")) for col in COLUMNS]
+
+    creds = Credentials.from_service_account_file(
+        cfg["service_account_path"],
+        scopes=["https://www.googleapis.com/auth/spreadsheets"],
+    )
+    svc = build("sheets", "v4", credentials=creds, cache_discovery=False)
+    resp = (
+        svc.spreadsheets()
+        .values()
+        .append(
+            spreadsheetId=cfg["sheet_id"],
+            range=f"{cfg['worksheet']}!A1",
+            valueInputOption="USER_ENTERED",
+            insertDataOption="INSERT_ROWS",
+            body={"values": [values]},
+        )
+        .execute()
+    )
+    updated = resp.get("updates", {}).get("updatedRange", "?")
+    print(f"OK {updated}")
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
diff --git a/scripts/fx_convert.py b/scripts/fx_convert.py
new file mode 100755
index 0000000..6225e16
--- /dev/null
+++ b/scripts/fx_convert.py
@@ -0,0 +1,46 @@
+#!/usr/bin/env python3
+"""
+fx_convert.py <amount> <currency> [--date yyyy-mm-dd]
+
+Convert a given amount in <currency> to HKD.
+Prints one tab-separated line: <amount_hkd>\t<fx_rate>\t<fx_date>
+Non-zero exit on failure.
+
+Uses frankfurter.app by default: free, no API key, ECB reference rates,
+historical dates supported via /{date} path.
+"""
+from __future__ import annotations
+
+import argparse
+import json
+import sys
+import urllib.request
+from datetime import date
+
+
+def fetch_rate(currency: str, on_date: str) -> tuple[float, str]:
+    currency = currency.upper()
+    if currency == "HKD":
+        return 1.0, on_date
+    url = f"https://api.frankfurter.app/{on_date}?from={currency}&to=HKD"
+    with urllib.request.urlopen(url, timeout=10) as resp:
+        data = json.loads(resp.read())
+    rate = data["rates"]["HKD"]
+    return float(rate), data["date"]
+
+
+def main() -> int:
+    ap = argparse.ArgumentParser()
+    ap.add_argument("amount", type=float)
+    ap.add_argument("currency")
+    ap.add_argument("--date", default=date.today().isoformat())
+    args = ap.parse_args()
+
+    rate, fx_date = fetch_rate(args.currency, args.date)
+    hkd = round(args.amount * rate, 2)
+    print(f"{hkd}\t{rate}\t{fx_date}")
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
diff --git a/scripts/setup.md b/scripts/setup.md
new file mode 100644
index 0000000..3a46b57
--- /dev/null
+++ b/scripts/setup.md
@@ -0,0 +1,39 @@
+# One-time setup
+
+## 1. Python deps
+```
+pip install google-api-python-client google-auth
+```
+
+## 2. Google Cloud service account
+1. Create (or reuse) a GCP project.
+2. Enable the **Google Sheets API** for the project.
+3. Create a **service account**; skip the optional IAM steps.
+4. In the service account, create a **JSON key** and download it.
+5. Move the key to a safe path, e.g. `~/.config/gcp/bookkeeping-sa.json`, then:
+   ```
+   chmod 600 ~/.config/gcp/bookkeeping-sa.json
+   ```
+
+## 3. Prepare the Google Sheet
+1. Create a new Google Sheet (or open an existing one).
+2. Rename the first tab to `Expenses` (or update `worksheet` in config).
+3. In row 1 add headers matching `schema.md` columns A–N:
+   `Date | Merchant | Category | Amount | Currency | Amount (HKD) | FX Rate | FX Date | Payment Method | Line Items | Raw OCR | Note | Receipt | Logged At`
+4. Open the service account JSON and copy the `client_email` value (looks like `...@...iam.gserviceaccount.com`).
+5. Click **Share** on the sheet and add that email as **Editor**.
+6. Copy the sheet ID from the URL: `https://docs.google.com/spreadsheets/d/<SHEET_ID>/edit`.
+
+## 4. Skill config
+```
+cd ~/.claude/skills/bookkeeping
+cp config.example.json config.json
+# edit config.json: sheet_id, service_account_path
+```
+
+## 5. Sanity check
+```
+echo '{"date":"2026-04-20","merchant":"TEST","category":"Other","amount":1,"currency":"HKD","amount_hkd":1,"fx_rate":1,"fx_date":"2026-04-20"}' \
+  | python ~/.claude/skills/bookkeeping/scripts/append_row.py
+```
+You should see `OK Expenses!A2:N2` (or similar) and a new row in the sheet. Delete the TEST row when done.
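
Note on workflow steps 3 and 4 in SKILL.md (not part of this patch): the hand-off between `fx_convert.py` and `append_row.py` could be sketched roughly as below. This is a minimal illustration under stated assumptions, not code from the commit. The helper names `parse_fx_output` and `build_row` are hypothetical, the key names follow `schema.md`, and using the receipt date as `fx_date` for HKD receipts is an assumption based on the HKD branch of `fetch_rate`.

```python
from __future__ import annotations

import json

# Keys in schema.md order; logged_at is left out because append_row.py adds it.
ROW_KEYS = [
    "date", "merchant", "category", "amount", "currency",
    "amount_hkd", "fx_rate", "fx_date", "payment_method",
    "line_items", "raw_ocr", "note", "receipt",
]


def parse_fx_output(line: str) -> dict:
    """Split the tab-separated line printed by fx_convert.py into named fields."""
    amount_hkd, fx_rate, fx_date = line.strip().split("\t")
    return {
        "amount_hkd": float(amount_hkd),
        "fx_rate": float(fx_rate),
        "fx_date": fx_date,
    }


def build_row(extracted: dict, fx_line: str | None = None) -> str:
    """Hypothetical helper: merge extracted receipt fields with FX data.

    Returns the JSON string that would be piped into append_row.py.
    """
    # Missing keys become empty strings, mirroring append_row.py's defaults.
    row = {key: extracted.get(key, "") for key in ROW_KEYS}
    if str(row["currency"]).upper() == "HKD":
        # HKD needs no conversion: rate 1, and (assumption) fx_date = receipt date.
        row.update(amount_hkd=row["amount"], fx_rate=1, fx_date=row["date"])
    elif fx_line is None:
        raise ValueError("fx_line is required for non-HKD currencies")
    else:
        row.update(parse_fx_output(fx_line))
    return json.dumps(row, ensure_ascii=False)
```

The returned string is what step 4 would pipe into `append_row.py` on stdin; the network call itself stays in `fx_convert.py`, so this sketch only handles parsing and assembly.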