diff --git a/.claude/skills/code-context/SKILL.md b/.claude/skills/code-context/SKILL.md
deleted file mode 100644
index ea1f76f2..00000000
--- a/.claude/skills/code-context/SKILL.md
+++ /dev/null
@@ -1,250 +0,0 @@
----
-name: code-context
-description: Persistent code architecture documentation via a `.codecontext/` folder.
-user-invokable: false
-metadata:
-  category: architecture
----
-
-# Code Architecture Documentation Skill
-
-Use this skill to **read architecture docs before work** and **document architecture after work** using the `.codecontext/` folder.
-
-`.codecontext/` is a living architecture reference — it helps new developers onboard and coding agents pick up where previous sessions left off. It is **NOT** a session log, changelog, or task diary.
-
----
-
-## What Belongs in `.codecontext/`
-
-Only document **architecture-level knowledge** that would take significant effort to rediscover by reading code alone.
-
-### Include
-
-- **Code architecture** — how modules/components are structured, layered, and why
-- **Code flow** — request lifecycle, data flow between layers, event/cron pipelines
-- **Component relationships** — which modules depend on each other, call chains, shared state
-- **Edge cases and gotchas** — non-obvious behaviors, race conditions, ordering constraints
-- **Design decisions and rationale** — why a pattern was chosen over alternatives
-- **Integration points** — how external services, databases, queues connect
-- **Invariants and constraints** — rules that must hold (e.g., "timestamps are always UTC seconds", "all DB access goes through db singleton")
-- **Error handling patterns** — how errors propagate, retry logic, fallback behavior
-- **Key file map** — which files own which responsibilities (only when non-obvious)
-
-### Exclude
-
-- Session logs, changelogs, or diary-style entries
-- What files were changed in a specific task
-- Raw terminal output or build logs
-- Code snippets (reference file paths + line ranges instead)
-- Obvious facts that can be inferred from reading one file
-- Task status, TODO lists, or progress tracking
-- Anything already covered in README, AGENTS.md, or inline comments
-
----
-
-## Trigger Conditions
-
-Run this skill at the **start and end** of any coding task that touches architecture:
-
-- Feature implementations spanning multiple files/modules
-- Refactors that change module boundaries or data flow
-- Bug fixes that reveal non-obvious system behavior
-- New integrations or service connections
-- Discovery of undocumented edge cases or invariants
-- **After any agent run** — if the agent explored, read, or traced code to understand how part of the codebase works, that understanding must be captured (see Phase C)
-
-**Skip** for trivial changes (typo fixes, single-line edits, style-only changes).
-
----
-
-## Phase A — Read Architecture Docs (Before Acting)
-
-### A1) Discover docs
-
-```bash
-ls .codecontext/
-```
-
-If `.codecontext/` does not exist, continue the task and create it in Phase B.
-
-### A2) Find relevant docs
-
-```bash
-grep -ril "" .codecontext/
-```
-
-Use keywords from the feature area you are working on (e.g., "alerting", "auth", "monitors", "cron").
-
-### A3) Read and apply
-
-Read only relevant files. Extract:
-
-- Architecture constraints that affect your implementation
-- Code flow you need to hook into or extend
-- Edge cases to preserve or handle
-- Integration points to respect
-
-If existing docs conflict with current code, trust the code — update docs in Phase B.
-
----
-
-## Phase B — Document Architecture (Before Ending)
-
-Only write/update docs if the task revealed architecture knowledge worth preserving.
-
-### B1) Decide what to document
-
-Ask: _"Would a new developer or future agent need to re-discover this to work in this area?"_
-
-If yes, proceed. If no, skip Phase B entirely.
-
-Then apply this filter to **every sentence** before writing:
-
-> "Does this sentence describe how the code is structured, a design decision, or a constraint that would change how someone writes future code in this area?"
-
-If no → cut it. This is the line between architecture documentation and a session diary.
-
-### B2) Write architecture documentation
-
-Structure each doc as a **reference document**, not a session diary.
-
-Template (use only the sections that apply):
-
-```markdown
-# 
-
-## Overview
-
-Brief description of what this area does and its role in the system.
-
-## Architecture
-
-How the components are structured, key abstractions, layers.
-
-## Code Flow
-
-Step-by-step flow for the primary operations (e.g., "How a monitor check executes").
-
-## Key Files
-
-| File                 | Responsibility |
-| -------------------- | -------------- |
-| `src/lib/server/...` | Does X         |
-
-## Edge Cases and Gotchas
-
-- Non-obvious behavior 1
-- Constraint that must be preserved
-
-## Design Decisions
-
-- Why X was chosen over Y (if non-obvious)
-```
-
-Not all sections are required — include only what is relevant. Keep each doc under **300 lines**.
-
-### B3) Pick target file
-
-```bash
-ls .codecontext/
-grep -ril "" .codecontext/
-```
-
-| Condition                       | Action                           |
-| ------------------------------- | -------------------------------- |
-| Existing doc covers this domain | Update/rewrite relevant sections |
-| Different domain                | Create new file                  |
-| No match                        | Create new file                  |
-
-When updating, **replace outdated sections** rather than appending session entries. The doc should always read as a clean, current architecture reference.
-
-### B4) Persist
-
-```bash
-mkdir -p .codecontext
-```
-
-Create or overwrite the file so it reads as a standalone reference:
-
-```bash
-cat > .codecontext/.md
-```
-
----
-
-## Phase C — Capture Agent Understanding (After Any Agent Run)
-
-After completing any task (coding, debugging, research, exploration), review what you learned about the codebase during the session and persist anything not already documented.
-
-### C1) Identify new understanding
-
-Reflect on what you discovered during this session:
-
-- How does a feature/module actually work? (code flow, data transformations, call chains)
-- What patterns or conventions did you observe across multiple files?
-- What dependencies or relationships between modules did you trace?
-- What surprised you or was non-obvious? (hidden side effects, implicit ordering, shared state)
-- What constraints or invariants did you discover that aren't documented anywhere?
-
-### C2) Check if already documented
-
-```bash
-ls .codecontext/
-grep -ril "" .codecontext/
-```
-
-Read matching files. If the understanding is already captured accurately, skip. If partially captured, update the relevant sections.
-
-### C3) Write or update docs
-
-Apply the same quality filters from Phase B (B1 architecture filter). Then:
-
-- If the understanding maps to an existing `.codecontext/` file, update the relevant sections
-- If it covers a new domain area, create a new file following the B2 template and naming rules
-- Merge your new understanding with existing content — do not duplicate or contradict
-
-### C4) Scope
-
-This phase applies even when:
-
-- The task was **read-only** (research, exploration, answering questions about code)
-- The task was a **bug investigation** that didn't result in a fix
-- The agent **traced code flow** to understand behavior before making changes
-- The agent **read multiple files** to understand how a feature works
-
-This phase does **NOT** apply when:
-
-- The agent only touched a single file and learned nothing non-obvious
-- The understanding is already fully captured in existing `.codecontext/` docs
-- The session was trivial (formatting, typo fix, config change)
-
----
-
-## Naming Rules
-
-- Name by domain/feature area: `alerting.md`, `auth.md`, `monitor-execution.md`, `incident-lifecycle.md`
-- Use kebab-case for multi-word topics
-- Never use generic names: `notes.md`, `misc.md`, `context.md`, `session-1.md`
-- One file per bounded domain — split if a file exceeds ~300 lines
-
----
-
-## Fast Checklist
-
-Before coding:
-
-- [ ] Checked `.codecontext/` for relevant architecture docs
-- [ ] Applied constraints and patterns from existing docs
-
-Before finishing:
-
-- [ ] Every sentence passed the B1 architecture filter
-- [ ] Documented any new architecture knowledge discovered
-- [ ] Updated outdated docs if current code contradicts them
-- [ ] Doc reads as a clean architecture reference, not a session log
-
-After any agent run:
-
-- [ ] Reviewed what was learned about the codebase during this session
-- [ ] Checked if that understanding is already in `.codecontext/`
-- [ ] Persisted any new architectural knowledge (even from read-only/research sessions)
diff --git a/.codecontext/database-migrations.md b/.codecontext/database-migrations.md
deleted file mode 100644
index fecf8c6a..00000000
--- a/.codecontext/database-migrations.md
+++ /dev/null
@@ -1,46 +0,0 @@
-# Database & Migrations
-
-## Overview
-
-Kener uses **Knex.js** as a database abstraction layer supporting three engines: **SQLite** (default, via `better-sqlite3`), **PostgreSQL** (`pg`), and **MySQL** (`mysql2`). The engine is selected at runtime from the `DATABASE_URL` environment variable prefix (`sqlite://`, `postgresql://`, `mysql://`).
-
-## Architecture
-
-### Connection & Config
-
-| File                              | Responsibility                                             |
-| --------------------------------- | ---------------------------------------------------------- |
-| `knexfile.ts`                     | Parses `DATABASE_URL`, selects client, exports Knex config |
-| `src/lib/server/db/db.ts`         | Singleton Knex instance — all app code imports from here   |
-| `src/lib/server/db/dbimpl.js`     | High-level DB methods (wraps repositories)                 |
-| `src/lib/server/db/repositories/` | Per-domain query classes extending `BaseRepository`        |
-
-### Repository Pattern
-
-Each repository lives in `src/lib/server/db/repositories/.ts`, extends `BaseRepository` (which receives the Knex instance), and exposes typed async methods. Queries use Knex query builder — never raw SQL (except index creation wrapped in try/catch).
-
-### Migrations
-
-Migrations live in `migrations/` as TypeScript files with naming convention `YYYYMMDDHHMMSS_.ts`.
-
-Key patterns observed across all existing migrations:
-
-- **Idempotency guards**: `knex.schema.hasTable` / `knex.schema.hasColumn` before `createTable` / `alterTable`.
-- **Column types**: Knex abstractions only (`.string()`, `.integer()`, `.text()`, `.float()`). No raw DDL.
-- **Defaults**: `.defaultTo()` + `.notNullable()` for YES/NO string flags (e.g., `is_hidden`, `is_owner`).
-- **Data seeding in migrations**: Standard Knex query builder (`.orderBy().first()`, `.update()`) — works on all three engines.
-- **Index creation**: Wrapped in `try/catch` because `CREATE INDEX IF NOT EXISTS` isn't portable.
-- **PostgreSQL insert returning**: Some repos branch on `GetDbType() === "postgresql"` to use `.returning("*")`, with fallback to re-read by inserted ID for SQLite/MySQL.
-
-## Edge Cases and Gotchas
-
-- SQLite requires `useNullAsDefault: true` in Knex config.
-- PostgreSQL `INSERT ... RETURNING *` is not supported by SQLite/MySQL — branch on `GetDbType()`.
-- `knex.fn.now()` is the portable way to set timestamps; never use `NOW()` or `datetime('now')`.
-- String-based YES/NO flags (not booleans) are the project convention for flag columns.
-
-## Design Decisions
-
-- **YES/NO strings over booleans**: Consistent with existing `is_hidden`, `include_degraded_in_downtime`, etc. Avoids SQLite boolean quirks.
-- **hasColumn guard in migrations**: Allows re-running migrations safely without failure on already-applied columns.
-- **Owner flag (`is_owner`)**: Set during migration on the first user by `id ASC`. Only one user should be owner; enforced at application level, not DB constraint.
diff --git a/.codecontext/group-monitor-cache-flow.md b/.codecontext/group-monitor-cache-flow.md
deleted file mode 100644
index 7a7c246b..00000000
--- a/.codecontext/group-monitor-cache-flow.md
+++ /dev/null
@@ -1,39 +0,0 @@
-# Group Monitor Cache Flow
-
-## Overview
-
-Group monitors aggregate the status of child monitors using weighted scores. The execution involves a cache layer (Redis) that must stay in sync with the database for correct status propagation.
-
-## Execution Flow
-
-1. **Cron fires** → `appScheduler` schedules each monitor via BullMQ job scheduler (`monitorSchedulers.ts`)
-2. **Worker calls** `Minuter(monitor)` in `cron-minute.ts`
-3. For GROUP monitors, `Minuter` adds a BullMQ **delay** of `executionDelay` ms so child monitors finish first
-4. **`monitorExecuteQueue`** worker calls `Service.execute()` → `GroupCall.execute()`
-5. `GroupCall.execute()` reads each child's latest status from **Redis cache** via `GetLastMonitoringValue(tag, fetcher)`
-
-## Cache Behaviour
-
-- **Key format:** `kener:cache:{tag}:last_status`
-- **TTL:** 86400 seconds (1 day)
-- `getCache()` returns cached value if present — **fetcher is only called on cache miss**
-- Cache is updated in two places:
-  - `monitorResponseQueue.ts` — after cron-driven monitor execution completes
-  - API PATCH endpoints (`api/v4/monitors/[tag]/data/`) — after external status updates
-
-## Key Invariant
-
-Any code path that writes monitoring data to the DB **must** also update the `last_status` cache entry. Otherwise `GroupCall` (and other cache consumers like badge endpoints) will serve stale data.
-
-## `executionDelay` Purpose
-
-`executionDelay` is NOT used inside `groupCall.ts`. It is applied in `cron-minute.ts` as BullMQ's `delay` option, deferring the group job so child monitors have time to complete and update the cache before the group aggregates.
-
-## Key Files
-
-- `src/lib/server/services/groupCall.ts` — weighted status aggregation
-- `src/lib/server/cron-minute.ts` — applies `executionDelay` as BullMQ delay
-- `src/lib/server/cache/setGet.ts` — `SetLastMonitoringValue` / `GetLastMonitoringValue`
-- `src/lib/server/queues/monitorResponseQueue.ts` — cache update after cron execution
-- `src/routes/(api)/api/v4/monitors/[monitor_tag]/data/+server.ts` — range update API
-- `src/routes/(api)/api/v4/monitors/[monitor_tag]/data/[timestamp]/+server.ts` — single-point update API
diff --git a/AGENTS.md b/AGENTS.md
index f1105090..4229c74d 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -38,23 +38,3 @@ When the user asks to write or edit documentation, follow the skill file:
 - `.claude/skills/documentation-writer/SKILL.md`
 
 This is mandatory for docs-related tasks. Prioritize short, clear, action-oriented docs and avoid bloat.
-
-## Code architecture docs skill - Important for all tasks
-
-Always try to use the code-context skill at the start and end of coding sessions:
-
-- `.claude/skills/code-context/SKILL.md`
-
-## Code architecture enforcement (mandatory)
-
-The code-context skill is not optional. Agents MUST do both:
-
-1. **Before coding**: load relevant architecture docs from `.codecontext/`.
-2. **Before final response**: if the task revealed new architecture knowledge (code flow, edge cases, component relationships), update or create a `.codecontext/*.md` entry. Skip if the task was trivial.
-
-Required output evidence in the final response:
-
-- `Context loaded:` list of `.codecontext` files read (or `none found`).
-- `Context updated:` exact `.codecontext` file path written (or `skipped — no architecture changes`).
-
-The `.codecontext/` folder documents **code architecture only** — not session logs, changelogs, or task summaries.
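
Reviewer note: the removed `group-monitor-cache-flow.md` recorded an invariant (every DB write of monitoring data must also refresh the `last_status` cache entry) that no longer appears anywhere in the tree after this change. The sketch below illustrates that invariant only; it is not the project's code. It uses in-memory `Map`s as stand-ins for Redis and the database, and `recordStatus`, `getLastStatus`, and `lastStatusKey` are hypothetical names. Only the key format and the 86400-second TTL come from the removed doc.

```typescript
// Illustrative sketch only: in-memory stand-ins for Redis and the database.
// All function names here are hypothetical, not taken from the Kener codebase.

interface CacheEntry {
  value: string;
  expiresAt: number; // epoch seconds
}

// TTL stated in the removed doc: 86400 seconds (1 day).
const LAST_STATUS_TTL_SECONDS = 86400;

const cache = new Map<string, CacheEntry>();
const db = new Map<string, string>(); // tag -> latest status

// Key format stated in the removed doc: kener:cache:{tag}:last_status
function lastStatusKey(tag: string): string {
  return `kener:cache:${tag}:last_status`;
}

// Single write path: the DB write and the cache refresh always happen together,
// which is what the "Key Invariant" section asked of every writer.
function recordStatus(tag: string, status: string, nowSeconds: number): void {
  db.set(tag, status);
  cache.set(lastStatusKey(tag), {
    value: status,
    expiresAt: nowSeconds + LAST_STATUS_TTL_SECONDS,
  });
}

// Read path: the fetcher runs only on a cache miss or after the entry expires.
function getLastStatus(
  tag: string,
  nowSeconds: number,
  fetcher: (tag: string) => string | undefined,
): string | undefined {
  const hit = cache.get(lastStatusKey(tag));
  if (hit !== undefined && hit.expiresAt > nowSeconds) {
    return hit.value;
  }
  const fetched = fetcher(tag);
  if (fetched !== undefined) {
    cache.set(lastStatusKey(tag), {
      value: fetched,
      expiresAt: nowSeconds + LAST_STATUS_TTL_SECONDS,
    });
  }
  return fetched;
}
```

Because reads skip the fetcher while an entry is live, a writer that calls `db.set` directly leaves readers on the old value until the entry expires, which is the stale-data failure the removed doc warned about. Routing every write through one helper is one way to keep that from happening; the real code may enforce it differently.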