1288 Commits

Author SHA1 Message Date
Raj Nandan Sharma 6764b470a9 Merge pull request #766 from CrazyComputer2008/Fix-Chinese-Translation
Fix Chinese Translation
2026-06-21 19:02:53 +05:30
Crazy Computer 9fc16187ca Add Traditional Chinese (Macau) locale file 2026-06-21 12:22:34 +08:00
Crazy Computer 32c36bfc52 Add Traditional Chinese (Hong Kong) localization 2026-06-21 12:20:58 +08:00
Crazy Computer 9ad4ca50fb Update Traditional Chinese translations in zh-TW.json 2026-06-21 12:17:54 +08:00
Crazy Computer 2aa83b01c2 Update Chinese translations in zh-CN.json 2026-06-21 12:14:00 +08:00
github-actions[bot] a25d823db8 chore(release): bump version to 4.1.1 (tag: v4.1.1) 2026-06-19 17:54:04 +00:00
Raj Nandan Sharma c43b4ef863 chore(docs): include v4.1.1 changelog with new features and improvements 2026-06-19 23:23:22 +05:30
Raj Nandan Sharma a9e06ae9f7 Merge pull request #765 from rajnandan1/implement/665
Implement/665
2026-06-19 22:39:38 +05:30
Raj Nandan Sharma a925791671 refactor(layout): update main class for embed and status pages 2026-06-19 22:34:03 +05:30
Raj Nandan Sharma ba1d0079de address PR #765 review comments
- layoutController/site-configurations: use strict boolean check instead of
  Boolean() coercion for showInlineEvents so persisted "false" strings don't
  flip the toggle
- (kener)/+page.svelte: collapse confusing triple negation !!! to single !
- NotificationsList: add aria-label/title to the icon-only events button
  (+ "Open events page" en locale key)
- move NotificationEvent into shared $lib/types/notifications so client code
  no longer imports from the server dashboardController; controller re-exports
  it for backwards compatibility
- [page_path] and monitor pages: pass hideNotificationsPopover={showInlineEvents}
  to ThemePlus so inline and popover event surfaces stay mutually exclusive
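
The first bullet can be illustrated with a short sketch. `Boolean()` treats every non-empty string as true, so a persisted `"false"` string would turn the toggle on. The helper name `isInlineEventsEnabled` is hypothetical and not taken from the repository, and the real check may accept other inputs.

```typescript
// Hypothetical helper; the repository's actual check may differ.
function isInlineEventsEnabled(value: unknown): boolean {
  // Only the boolean literal true enables the toggle.
  return value === true;
}

// Boolean() coercion: any non-empty string is truthy, including "false".
console.log(Boolean("false"));

// Strict check: the persisted string "false" no longer flips the toggle.
console.log(isInlineEventsEnabled("false"));
console.log(isInlineEventsEnabled(true));
```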

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 22:28:22 +05:30
Raj Nandan Sharma 3503e73f56 ensure newline at end of file in notifications-client.ts 2026-06-19 11:24:09 +05:30
Raj Nandan Sharma 5021e01018 implement event display settings and notifications list component
implements #665
2026-06-19 11:23:57 +05:30
Raj Nandan Sharma 95c341fc35 Merge pull request #764 from rajnandan1/fix/db-pool-web-worker-split
fix(database): isolate web and worker connection pools
2026-06-18 17:00:15 +05:30
Raj Nandan Sharma e69fcdfd71 fix(database): isolate web and worker connection pools
GET / was throwing KnexTimeoutError ("Timeout acquiring a connection")
in production. Root cause was the connection pool, not the database:
the single process (SvelteKit + cron scheduler + BullMQ workers) shared
one pool capped at 10, while one GET / fans out ~6 queries. A couple of
concurrent page loads, or a per-minute monitor burst overlapping a load,
exceeded 10 and queued acquires blew past the 15s timeout. Postgres
itself had 97 free slots the whole time and no leak.

Split into two pools so background work can't starve page loads:
- web pool (DATABASE_POOL_MAX, default 10) serves HTTP requests
- worker pool (DATABASE_WORKER_POOL_MAX, default 5) serves background jobs

Routing is by execution context via AsyncLocalStorage: q.createWorker
(the single chokepoint all workers/schedulers flow through) runs each
processor inside a worker-pool context, and BaseRepository.knex resolves
the pool from that context, defaulting to the web pool. This keeps shared
controllers correct whether they run in a request or a job. SQLite has no
real pool and reuses a single connection, so the split is a no-op there.
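
The routing described above can be sketched with Node's `AsyncLocalStorage`. This is a minimal illustration under stated assumptions, not the project's code: `pools`, `runInWorkerContext`, and `currentPool` are hypothetical names standing in for the real Knex pools, `q.createWorker`, and `BaseRepository.knex`.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

type PoolName = "web" | "worker";

// Stand-ins for two real connection pools (hypothetical shape).
// The max values mirror the defaults named in the commit message.
const pools: Record<PoolName, { name: PoolName; max: number }> = {
  web: { name: "web", max: 10 },
  worker: { name: "worker", max: 5 },
};

// Carries the pool choice along the current execution context.
const poolContext = new AsyncLocalStorage<PoolName>();

// Run a job processor inside the worker-pool context.
function runInWorkerContext<T>(processor: () => T): T {
  return poolContext.run("worker", processor);
}

// Resolve the pool for the current context; default to the web pool.
function currentPool(): { name: PoolName; max: number } {
  return pools[poolContext.getStore() ?? "web"];
}

// Outside any worker context, the web pool is used.
console.log(currentPool().name);

// Inside a worker context, the same call resolves to the worker pool.
console.log(runInWorkerContext(() => currentPool().name));
```

Because the lookup happens at call time, shared code does not need a pool argument: the same function picks the web pool in a request and the worker pool in a job.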

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 11:39:10 +05:30
Raj Nandan Sharma acf11459ed Merge pull request #763 from kensac/perf/monitoring-data-covering-index
perf: add covering index for monitor-bars status aggregation
2026-06-17 20:53:42 +05:30
Kanishk Sachdev 9ad315c3c7 perf: add covering index for monitor-bars status aggregation 2026-06-17 03:15:57 -04:00
Raj Nandan Sharma caf7427b04 Merge pull request #761 from rajnandan1/do/incident-start
refactor: update incident table to include start date and improve dat…
2026-06-16 09:17:01 +05:30
Raj Nandan Sharma 8833b7e410 refactor: update incident table to include start date and improve date formatting 2026-06-16 09:16:21 +05:30
Raj Nandan Sharma e43186d121 Merge pull request #760 from rajnandan1/fix/759
Fix/759
2026-06-15 23:24:12 +05:30
Raj Nandan Sharma 4536bceef6 refactor: improve incident management page structure and enhance data fetching logic 2026-06-15 23:22:25 +05:30
Raj Nandan Sharma 734b062626 refactor: update heartbeat URL format to use path segments and implement legacy URL rewriting 2026-06-15 23:19:24 +05:30
Raj Nandan Sharma f257cdc2c4 create heartbeat route handler for GET and POST requests 2026-06-15 23:19:17 +05:30
Raj Nandan Sharma 1362d06b20 Merge pull request #731 from Kukks/feat/rss-feed
Add public RSS 2.0 feed for incidents and scheduled maintenance
2026-06-15 10:30:59 +05:30
Raj Nandan Sharma 663ce52c4e Merge pull request #711 from phatlet/fix/add-redis-reconnect
Fix Redis writes after replica failover by reconnecting on `READONLY`
2026-06-15 09:48:37 +05:30
Raj Nandan Sharma 3f4df5faa7 fix(redis): use named RedisOptions type so npm run check passes
ioredis v5 dropped the namespace merge on the default export, so
`Redis.RedisOptions` resolves to TS2702 (type used as a namespace).
Import the `RedisOptions` type by name instead.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 09:43:12 +05:30
Raj Nandan Sharma 2501875406 refactor: update link to include rel="external" for better SEO 2026-06-14 11:55:30 +05:30
Raj Nandan Sharma 10586108c5 changes v4.1.0 2026-06-13 17:03:50 +00:00
Raj Nandan Sharma 87c69201ab Merge pull request #758 from rajnandan1/changelog/4.1.0
chore(docs): include v4.1.0 changelog and update navigation
2026-06-13 22:18:33 +05:30
Raj Nandan Sharma a3ec81af20 chore(docs): include v4.1.0 changelog and update navigation 2026-06-13 22:18:07 +05:30
Raj Nandan Sharma 61acf53c10 Merge pull request #757 from rajnandan1/implement/712
feat: Confirmation Threshold (grace period) — write-time damping with backfill + edge-case hardening (#755, #756)
2026-06-13 22:08:19 +05:30
Raj Nandan Sharma 8bbafe4c8a fix(confirmation): make unhealthy backfill atomic via a transaction (#757 review)
The per-row status+note backfill is one logical confirmation flip; wrap the
read+updates in a knex transaction so a mid-loop failure can't leave the window
half-confirmed/half-held (coderabbit out-of-diff finding).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 21:00:16 +05:30
Raj Nandan Sharma 4589568405 fix(confirmation): address PR #757 review comments
- PATCH: confirmation_threshold:null resets to 1 (off); undefined keeps existing (Copilot)
- backfill note is per-row severity-aware: 'Down'/'Degraded confirmed after N…' (Copilot)
- enforce 1–60 at the data layer via clampConfirmationThreshold on insert/update,
  covering all app write paths incl. the manage API (coderabbit)
- anchor via dedicated getLastObservedStatus query so a long incident/maintenance
  window can no longer push the anchor out of the lookback and bypass damping (coderabbit)
- overlays fetched AFTER execute() and keyed by job ts, making the freeze gate
  timestamp-safe and catching mid-check overlays (coderabbit + greptile)
- use Array.includes over indexOf!==-1 (greptile); refresh pendingHold doc (coderabbit)
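
The threshold rules in the first and third bullets can be sketched as follows. Only the name `clampConfirmationThreshold`, the 1 to 60 range, and the null/undefined PATCH semantics come from the commit message; the function body, `resolvePatchedThreshold`, and the handling of non-finite or fractional input are assumptions made for illustration.

```typescript
const MIN_THRESHOLD = 1; // 1 means the grace period is off
const MAX_THRESHOLD = 60;

// Hypothetical body for the clamp named in the commit message.
// Assumption: non-finite input falls back to 1 and fractions are truncated.
function clampConfirmationThreshold(value: number): number {
  if (!Number.isFinite(value)) return MIN_THRESHOLD;
  return Math.min(MAX_THRESHOLD, Math.max(MIN_THRESHOLD, Math.trunc(value)));
}

// PATCH semantics: undefined keeps the stored value, null resets to 1.
function resolvePatchedThreshold(
  existing: number,
  patched: number | null | undefined,
): number {
  if (patched === undefined) return existing;
  if (patched === null) return MIN_THRESHOLD;
  return clampConfirmationThreshold(patched);
}

console.log(resolvePatchedThreshold(5, undefined)); // keeps 5
console.log(resolvePatchedThreshold(5, null)); // resets to 1
console.log(resolvePatchedThreshold(5, 120)); // clamped to 60
```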

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 20:57:18 +05:30
Raj Nandan Sharma e8fb4126a8 docs(v4): add Grace Period (Confirmation Threshold) monitor docs (#756)
New v4/monitors/grace-period.md covering behavior, config, API, interactions
(alerts/maintenance/NO_DATA/groups/heartbeat), and verification; linked from the
Monitors sidebar and the Monitors Overview related-docs.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 20:40:02 +05:30
Raj Nandan Sharma e9d3281067 chore: gitignore and untrack the whole docs/adr/ folder
Broaden the ignore from the single 0009 ADR to the entire docs/adr/ directory and
untrack the existing ADRs (0001-0008); files are kept on disk and remain in history.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 20:29:46 +05:30
Raj Nandan Sharma 7439632eff chore: stop tracking AI workflow docs; drop ADR references from code comments
gitignore + untrack CONTEXT.md, docs/adr/0009-*, and docs/superpowers/ (files kept
on disk). Remove the 'ADR 0009' citations from code comments; issue references and the
pre-existing ADR 0005 citations are retained.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 20:25:13 +05:30
Raj Nandan Sharma f0362fd919 feat(confirmation): preserve observed error on held rows; append confirmation note on backfill (#756)
Held (pending) rows now keep the real error text tagged '| Status held during
grace period' instead of dropping it, so no diagnostic info is lost. On confirmation
the backfill appends '| Down confirmed after N consecutive checks' to the existing
text (pipe-separated) rather than overwriting it; recovery clears the error. Append
is per-row for cross-DB safety and idempotent on replay.
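
A minimal sketch of an append that stays idempotent on replay, assuming the note is compared against the existing pipe-separated segments. The helper name `appendNote` and its exact matching rule are hypothetical; the commit only states that the append is pipe-separated, per-row, and idempotent.

```typescript
// Hypothetical helper: append a pipe-separated note unless it is already present.
function appendNote(existing: string | null, note: string): string {
  const current = (existing ?? "").trim();
  if (current === "") return note;
  const segments = current.split("|").map((segment) => segment.trim());
  if (segments.includes(note)) return current; // replay: nothing to add
  return `${current} | ${note}`;
}

const note = "Down confirmed after 3 consecutive checks";
const once = appendNote("connect ETIMEDOUT", note);
const twice = appendNote(once, note);

console.log(once);
console.log(once === twice); // replaying the backfill does not duplicate the note
```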

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 20:20:40 +05:30
Raj Nandan Sharma e61873164b fix(confirmation): preserve real latency on grace-pending rows instead of zeroing (#756)
A pending (held) row was written with latency 0, losing the measured latency and
denting the latency chart during every grace window (and discarding a recovering
check's real latency). Keep the observed latency; only drop the error text so a
held row never shows a status-contradicting failure message.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 20:14:30 +05:30
Raj Nandan Sharma 6d7b56a0ac feat(queues): freeze Confirmation Threshold during active overlays (#756)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 19:50:11 +05:30
Raj Nandan Sharma 0940c8d01e fix(confirmation): exclude NO_DATA from lookback to prevent stuck confirmation under neutral-dense history (#756)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 19:48:35 +05:30
Raj Nandan Sharma 17e3fa6d77 feat(services): NO_DATA-neutral counting and overlay freeze in resolver (#756)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 19:40:25 +05:30
Raj Nandan Sharma a363079695 feat(monitoring): widen confirmation lookback to include overlays + type (#756)
Replaces getRecentObservedSamples with getRecentSamplesForConfirmation, which adds
INCIDENT/MAINTENANCE overlay rows and the `type` column to the result set so the
Confirmation Threshold resolver can detect freeze boundaries. MANUAL and DEFAULT
rows remain excluded (transparent). Adds OVERLAY_TYPES constant alongside the
existing OBSERVED_CHECK_TYPES. Updates dbimpl.ts declaration and binding to match.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 19:37:27 +05:30
Raj Nandan Sharma 87fc3081df docs(confirmation-threshold): add #756 hardening implementation plan
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 19:32:59 +05:30
Raj Nandan Sharma 350e291db0 feat(ui): add Grace period input to monitor general settings (#755)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 19:04:13 +05:30
Raj Nandan Sharma 9a545dbf48 feat(api): accept confirmation_threshold on v4 monitor create/update (#755)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 18:58:45 +05:30
Raj Nandan Sharma ab527ff7d8 feat(queues): apply Confirmation Threshold in write path and persist raw_status (#755)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 18:53:11 +05:30
Raj Nandan Sharma 850ebae11a feat(services): add Confirmation Threshold resolver with retroactive backfill (#755)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 18:50:31 +05:30
Raj Nandan Sharma a6948f087c feat(monitoring): persist raw_status; add observed-sample lookback and confirmed-status backfill (#755)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 18:42:17 +05:30
Raj Nandan Sharma aaa7c2a46d feat(monitors): persist confirmation_threshold on create/update/duplicate/group (#755)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 18:33:00 +05:30
Raj Nandan Sharma 8edf92ea02 feat(types): add confirmation_threshold and raw_status to monitor/monitoring types (#755)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 18:30:34 +05:30
Raj Nandan Sharma e5e7e44471 feat(db): add confirmation_threshold and raw_status columns (#755)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 18:29:06 +05:30