mirror of
https://github.com/rajnandan1/kener.git
synced 2026-06-23 04:10:22 +00:00
Merge pull request #753 from rajnandan1/fix/717
refactor: unify overall status handling across components and improve…
@@ -84,6 +84,14 @@ _Avoid_: Default page, base page, root page
The number of days of per-day status shown for a monitor, per device class (desktop/mobile). Configurable at two levels with the same defaults and bounds: per Page (applies to all its monitors) and per Monitor (overrides the page level).
_Avoid_: History days, bar count

**Overall Status**:
The single status a set of monitors collapses to for display (the page banner, the all-monitors badge). The worst state wins: DOWN > DEGRADED > MAINTENANCE > UP. Maintenance is noteworthy but not a problem: it surfaces only when nothing is actively wrong, and never masks an outage or degradation. Distinct from Group Monitor scoring, which derives a monitor's own status from its members.
_Avoid_: Page status, summary status, aggregate status

**All-Monitors Badge**:
The badge addressed by the `_` token, showing the Overall Status of every ACTIVE, non-hidden monitor site-wide. Not scoped to any Page; on a multi-page install it need not match any single page's banner.
_Avoid_: Site badge, page badge

**Page Settings**:
A Page's display configuration: status-history window per device class, monitor layout style, per-page meta/social overrides, and event display preferences. The admin UI and the API expose the same settings, though each surface may name fields differently; a writer must never drop fields it does not understand.
_Avoid_: Display settings (ambiguous with site-wide event display settings)
@@ -0,0 +1,7 @@
# Overall Status is problem-first: DOWN > DEGRADED > MAINTENANCE > UP

The same worst-wins ordering applies everywhere a set of monitor statuses collapses into one display status: the page banner (`GetStatusSummary`/`GetStatusColor` in `src/lib/clientTools.ts`), the per-day bar summaries, and the all-monitors `_` badge and dot badge (`GetLatestStatusActiveAll` in `src/lib/server/controllers/monitorsController.ts`). The ordering is DOWN > DEGRADED > MAINTENANCE > UP, with NO_DATA only when no monitor has any data at all.

Issue #717 exposed that the codebase had three independent answers to "what does maintenance mean when aggregating". The frontend banner checked maintenance first, so one monitor in a planned window reported "Under Maintenance" even while another monitor was hard DOWN: a real outage masked by planned work. The badge loop had no MAINTENANCE branch at all, so maintenance samples were silently skipped (DEGRADED+UP+MAINTENANCE → "Degraded", disagreeing with the banner) and a fleet entirely under maintenance fell through to "No Status Available". Group Monitor scoring counts maintenance as UP. With the same monitors, the page and the badge told different stories, which breaks any automation treating either as the source of truth.

Problem-first won because maintenance is planned and acknowledged while DOWN/DEGRADED are active problems users are hitting now; a status page that says "Under Maintenance" during an unrelated outage is lying in the reassuring direction. The rejected alternatives: canonicalizing the frontend's maintenance-first order (makes the masking bug the contract), and treating maintenance as UP at the display level for uniformity with group scoring (erases planned windows from badges that already have an "Under Maintenance" label). Group Monitor scoring deliberately keeps maintenance≈UP because it answers a different question (a member's planned work should not tank the group's derived status), whereas Overall Status answers "what should a visitor see". Also affirmed here rather than changed: the `_` badge is site-wide (every ACTIVE, non-hidden monitor), not page-scoped, so on multi-page installs it legitimately need not match any single page's banner; this is a documentation fact, not a bug.

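
The masking bug described above can be reproduced in a few lines. This is only an illustrative sketch: `legacyBannerStatus` is a hypothetical stand-in for the old maintenance-first check, `collapseProblemFirst` mirrors the ordering this decision adopts, and the string literals are assumed placeholders for the `GC` constants.

```typescript
// Assumed placeholders for the GC status constants (not the project's actual values).
type Status = "UP" | "DOWN" | "DEGRADED" | "MAINTENANCE" | "NO_DATA";

interface Counts {
  countOfUp: number;
  countOfDown: number;
  countOfDegraded: number;
  countOfMaintenance: number;
}

function totalOf(c: Counts): number {
  return c.countOfUp + c.countOfDown + c.countOfDegraded + c.countOfMaintenance;
}

// Hypothetical maintenance-first ordering: any maintenance hides everything else.
function legacyBannerStatus(c: Counts): Status {
  if (totalOf(c) === 0) return "NO_DATA";
  if (c.countOfMaintenance > 0) return "MAINTENANCE";
  if (c.countOfDown > 0) return "DOWN";
  if (c.countOfDegraded > 0) return "DEGRADED";
  return "UP";
}

// Problem-first ordering: DOWN > DEGRADED > MAINTENANCE > UP.
function collapseProblemFirst(c: Counts): Status {
  if (totalOf(c) === 0) return "NO_DATA";
  if (c.countOfDown > 0) return "DOWN";
  if (c.countOfDegraded > 0) return "DEGRADED";
  if (c.countOfMaintenance > 0) return "MAINTENANCE";
  return "UP";
}

// One monitor is hard DOWN while another sits in a planned maintenance window.
const outageDuringMaintenance: Counts = {
  countOfUp: 3,
  countOfDown: 1,
  countOfDegraded: 0,
  countOfMaintenance: 1,
};

const legacyResult: Status = legacyBannerStatus(outageDuringMaintenance); // "MAINTENANCE"
const fixedResult: Status = collapseProblemFirst(outageDuringMaintenance); // "DOWN"
```

With the same counts, the maintenance-first order hides the outage and the problem-first order reports it, which is the disagreement the decision removes.
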
+42
-34
@@ -1,5 +1,5 @@
import type { TimestampStatusCount } from "$lib/server/types/db";
import { PAGE_STATUS_MESSAGES } from "$lib/global-constants";
import GC, { PAGE_STATUS_MESSAGES, type StatusType } from "$lib/global-constants";

function ParseLatency(latencyMs: number): string {
  if (!!!latencyMs) {
@@ -316,46 +316,53 @@ interface GameItem {
function GetGameFromId(list: GameItem[], id: string): GameItem | undefined {
  return list.find((game: GameItem) => game.id === id);
}
type StatusCounts = Pick<TimestampStatusCount, "countOfUp" | "countOfDown" | "countOfDegraded" | "countOfMaintenance">;

// Canonical Overall Status collapse: the worst state wins, and maintenance
// never masks an active problem. See docs/adr/0007-problem-first-overall-status.md.
function CollapseStatusCounts(counts: StatusCounts): StatusType {
  const total = counts.countOfUp + counts.countOfDown + counts.countOfDegraded + counts.countOfMaintenance;
  if (total === 0) return GC.NO_DATA;
  if (counts.countOfDown > 0) return GC.DOWN;
  if (counts.countOfDegraded > 0) return GC.DEGRADED;
  if (counts.countOfMaintenance > 0) return GC.MAINTENANCE;
  return GC.UP;
}

function GetStatusSummary(item: TimestampStatusCount): string {
  const total = item.countOfUp + item.countOfDown + item.countOfDegraded + item.countOfMaintenance;
  if (total === 0) return PAGE_STATUS_MESSAGES.NO_DATA;

  const maintenancePercent = (item.countOfMaintenance / total) * 100;
  const downPercent = (item.countOfDown / total) * 100;
  const degradedPercent = (item.countOfDegraded / total) * 100;

  if (maintenancePercent > 0) {
    return PAGE_STATUS_MESSAGES.UNDER_MAINTENANCE;
  } else if (downPercent >= 75) {
    return PAGE_STATUS_MESSAGES.MAJOR_OUTAGE;
  } else if (downPercent >= 50) {
    return PAGE_STATUS_MESSAGES.PARTIAL_OUTAGE;
  } else if (item.countOfDown > 0) {
    return PAGE_STATUS_MESSAGES.PARTIAL_OUTAGE;
  } else if (degradedPercent >= 75) {
    return PAGE_STATUS_MESSAGES.DEGRADED_PERFORMANCE;
  } else if (degradedPercent >= 50) {
    return PAGE_STATUS_MESSAGES.PARTIAL_DEGRADED;
  } else if (item.countOfDegraded > 0) {
    return PAGE_STATUS_MESSAGES.PARTIAL_DEGRADED;
  } else if (item.countOfUp === total) {
    return PAGE_STATUS_MESSAGES.ALL_OPERATIONAL;
  switch (CollapseStatusCounts(item)) {
    case GC.DOWN:
      return (item.countOfDown / total) * 100 >= 75
        ? PAGE_STATUS_MESSAGES.MAJOR_OUTAGE
        : PAGE_STATUS_MESSAGES.PARTIAL_OUTAGE;
    case GC.DEGRADED:
      return (item.countOfDegraded / total) * 100 >= 75
        ? PAGE_STATUS_MESSAGES.DEGRADED_PERFORMANCE
        : PAGE_STATUS_MESSAGES.PARTIAL_DEGRADED;
    case GC.MAINTENANCE:
      return PAGE_STATUS_MESSAGES.UNDER_MAINTENANCE;
    case GC.UP:
      return PAGE_STATUS_MESSAGES.ALL_OPERATIONAL;
    default:
      return PAGE_STATUS_MESSAGES.NO_DATA;
  }

  return PAGE_STATUS_MESSAGES.NO_DATA;
}

function GetStatusColor(item: TimestampStatusCount): string {
  const total = item.countOfUp + item.countOfDown + item.countOfDegraded + item.countOfMaintenance;
  if (total === 0) return "text-muted-foreground";

  const maintenancePercent = (item.countOfMaintenance / total) * 100;
  const downPercent = (item.countOfDown / total) * 100;

  if (maintenancePercent > 0) return "text-maintenance";
  if (downPercent > 0) return "text-down";
  if (item.countOfDegraded > 0) return "text-degraded";
  return "text-up";
  switch (CollapseStatusCounts(item)) {
    case GC.DOWN:
      return "text-down";
    case GC.DEGRADED:
      return "text-degraded";
    case GC.MAINTENANCE:
      return "text-maintenance";
    case GC.UP:
      return "text-up";
    default:
      return "text-muted-foreground";
  }
}

function GetStatusBgColor(item: TimestampStatusCount): string {
@@ -378,6 +385,7 @@ export {
  IsValidNameServer,
  IsValidURL,
  IsValidPort,
  CollapseStatusCounts,
  GetStatusSummary,
  GetStatusColor,
  GetStatusBgColor,
@@ -21,11 +21,11 @@ import type {
import type { MonitorFilter } from "../db/repositories/base.js";
import db from "../db/db.js";
import type { PaginationInput } from "../../types/common.js";
import type { DayWiseStatus, NumberWithChange } from "../../types/monitor.js";
import GC, { getBadgeStyle, type BadgeStyle } from "../../global-constants.js";
import { makeBadge } from "badge-maker";
import { ErrorSvg } from "../../anywhere.js";
import { GetLastMonitoringValue, SetLastHeartbeat, DeleteMonitorCaches } from "../cache/setGet.js";
import { CollapseStatusCounts } from "../../clientTools.js";
import { translate, isLocaleAvailable } from "../i18n.js";
import type { HeartbeatMonitor, GroupMonitorTypeData } from "../types/monitor.js";

@@ -290,7 +290,7 @@ export const GetLatestMonitoringData = async (monitor_tag: string): Promise<Moni
};
export const GetLatestStatusActiveAll = async (): Promise<{ status: string }> => {
  //get all the active not hidden monitor tags
  const monitors = await db.getMonitors({ status: "ACTIVE", is_hidden: "NO" });
  const monitors = await db.getMonitors({ status: GC.ACTIVE, is_hidden: GC.NO });
  const monitor_tags = monitors.map((m) => m.tag);

  const latestData: MonitoringData[] = [];
@@ -302,19 +302,20 @@ export const GetLatestStatusActiveAll = async (): Promise<{ status: string }> =>
    }
  }

  let status: string = GC.NO_DATA;
  for (let i = 0; i < latestData.length; i++) {
    //if any status is down then status = down, if any is degraded then status = degraded, down > degraded > up
    if (latestData[i].status === GC.DOWN) {
      status = GC.DOWN;
    } else if (latestData[i].status === GC.DEGRADED && status !== GC.DOWN) {
      status = GC.DEGRADED;
    } else if (latestData[i].status === GC.UP && status !== GC.DOWN && status !== GC.DEGRADED) {
      status = GC.UP;
  const counts = { countOfUp: 0, countOfDown: 0, countOfDegraded: 0, countOfMaintenance: 0 };
  for (const data of latestData) {
    if (data.status === GC.UP) {
      counts.countOfUp++;
    } else if (data.status === GC.DOWN) {
      counts.countOfDown++;
    } else if (data.status === GC.DEGRADED) {
      counts.countOfDegraded++;
    } else if (data.status === GC.MAINTENANCE) {
      counts.countOfMaintenance++;
    }
  }
  return {
    status: status,
    status: CollapseStatusCounts(counts),
  };
};

@@ -461,9 +462,6 @@ export const GetAllAlertsPaginated = async (
export const GetMonitoringData = async (tag: string, since: number, now: number): Promise<MonitoringData[]> => {
  return await db.getMonitoringData(tag, since, now);
};
export const GetMonitoringDataAll = async (tags: string[], since: number, now: number): Promise<MonitoringData[]> => {
  return await db.getMonitoringDataAll(tags, since, now);
};

export const InsertNewAlert = async (data: MonitorAlertInsert): Promise<MonitorAlert | undefined> => {
  if (await db.alertExists(data.monitor_tag, data.monitor_status, data.alert_status)) {
@@ -547,7 +545,7 @@ export const GetBadge = async (badgeType: BadgeType, params: BadgeParams): Promi
    lastObj = await GetLatestStatusActiveAll();
  } else {
    // Single monitor status
    const monitors = await GetMonitorsParsed({ tag, status: "ACTIVE", is_hidden: "NO" });
    const monitors = await GetMonitorsParsed({ tag, status: GC.ACTIVE, is_hidden: GC.NO });
    if (monitors.length === 0) {
      return new Response(ErrorSvg, {
        headers: { "Content-Type": "image/svg+xml" },
@@ -635,14 +633,14 @@ export const GetBadge = async (badgeType: BadgeType, params: BadgeParams): Promi
    const siteData = await db.getSiteDataByKey("siteName");
    const siteName = siteData?.value as string | undefined;
    name = siteName || "All Monitors";
    const goodMonitors = await GetMonitorsParsed({ status: "ACTIVE", is_hidden: "NO" });
    const goodMonitors = await GetMonitorsParsed({ status: GC.ACTIVE, is_hidden: GC.NO });
    const activeTags = goodMonitors.map((monitor) => monitor.tag);

    stats = await db.getStatusCountsByInterval(activeTags, since, now - since, 1);
    uptimeData = UptimeCalculator(stats);
  } else {
    // Single monitor badge
    const monitors = await GetMonitorsParsed({ tag });
    const monitors = await GetMonitorsParsed({ tag, status: GC.ACTIVE, is_hidden: GC.NO });
    if (monitors.length === 0) {
      return new Response(ErrorSvg, {
        headers: { "Content-Type": "image/svg+xml" },

@@ -48,7 +48,6 @@ class DbImpl {
  // ============ Monitoring Data ============
  insertMonitoringData!: MonitoringRepository["insertMonitoringData"];
  getMonitoringData!: MonitoringRepository["getMonitoringData"];
  getMonitoringDataAll!: MonitoringRepository["getMonitoringDataAll"];
  getLatestMonitoringData!: MonitoringRepository["getLatestMonitoringData"];
  getLatestMonitoringDataN!: MonitoringRepository["getLatestMonitoringDataN"];
  getMonitoringDataPaginated!: MonitoringRepository["getMonitoringDataPaginated"];
@@ -409,7 +408,6 @@ class DbImpl {
  private bindMonitoringMethods(): void {
    this.insertMonitoringData = this.monitoring.insertMonitoringData.bind(this.monitoring);
    this.getMonitoringData = this.monitoring.getMonitoringData.bind(this.monitoring);
    this.getMonitoringDataAll = this.monitoring.getMonitoringDataAll.bind(this.monitoring);
    this.getLatestMonitoringData = this.monitoring.getLatestMonitoringData.bind(this.monitoring);
    this.getLatestMonitoringDataN = this.monitoring.getLatestMonitoringDataN.bind(this.monitoring);
    this.getMonitoringDataPaginated = this.monitoring.getMonitoringDataPaginated.bind(this.monitoring);

@@ -49,27 +49,6 @@ export class MonitoringRepository extends BaseRepository {
      .orderBy("timestamp", "asc");
  }

  // Groups by timestamp and applies priority: DOWN > DEGRADED > UP
  async getMonitoringDataAll(monitor_tags: string[], start: number, end: number): Promise<MonitoringData[]> {
    return await this.knex("monitoring_data")
      .select(
        "timestamp",
        this.knex.raw(`
          CASE
            WHEN MAX(CASE WHEN status = 'DOWN' THEN 1 ELSE 0 END) = 1 THEN 'DOWN'
            WHEN MAX(CASE WHEN status = 'DEGRADED' THEN 1 ELSE 0 END) = 1 THEN 'DEGRADED'
            ELSE 'UP'
          END as status
        `),
      )
      .whereIn("monitor_tag", monitor_tags)
      .where("timestamp", ">=", start)
      .where("timestamp", "<=", end)
      .whereNotNull("status")
      .groupBy("timestamp")
      .orderBy("timestamp", "asc");
  }

  async getLatestMonitoringData(monitor_tag: string): Promise<MonitoringData | undefined> {
    return await this.knex("monitoring_data")
      .where("monitor_tag", monitor_tag)

@@ -18,6 +18,26 @@ From **Manage → Badges**, you can generate:

You can customize style, label, colors, and time range.

### All-monitors badge {#all-monitors-badge}

Status and dot badges accept `_` instead of a monitor tag:

```
/badge/_/status
/badge/_/dot
```

The `_` badge shows the overall status of **every active, non-hidden monitor site-wide**. It is not scoped to a page, so on a multi-page setup it may not match an individual page's banner.

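
As a rough illustration, the two paths above can be assembled into full URLs for embedding. The `badgeUrl` helper and the `status.example.com` host are hypothetical; only the `/badge/_/status` and `/badge/_/dot` paths come from the text above, and the per-monitor form is assumed to follow the same shape with a tag in place of `_`.

```typescript
type BadgeKind = "status" | "dot";

// Hypothetical helper: builds a badge URL for one monitor tag,
// or for all monitors when the tag is left at its "_" default.
function badgeUrl(baseUrl: string, kind: BadgeKind, tag: string = "_"): string {
  const trimmed = baseUrl.replace(/\/+$/, ""); // drop trailing slashes from the base
  return `${trimmed}/badge/${encodeURIComponent(tag)}/${kind}`;
}

// Placeholder host, not a real deployment.
const allStatus = badgeUrl("https://status.example.com/", "status");
const allDot = badgeUrl("https://status.example.com", "dot");
// allStatus === "https://status.example.com/badge/_/status"
// allDot === "https://status.example.com/badge/_/dot"
```
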
### Overall status priority {#overall-status-priority}

When multiple monitor statuses collapse into one overall status (the page banner and the `_` badge), the worst state wins:

`DOWN` > `DEGRADED` > `MAINTENANCE` > `UP`

> [!NOTE]
> Maintenance never masks an active problem: a page with one monitor under maintenance and another down or degraded reports the outage, not the maintenance. `NO_DATA` is shown only when no monitor has reported any data.

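
The priority rule can also be read as a severity ranking over each monitor's latest status. The following is a sketch under assumptions, not Kener's implementation: `overallStatus`, the `SEVERITY` table, and the string literals are illustrative names only.

```typescript
type MonitorStatus = "UP" | "DOWN" | "DEGRADED" | "MAINTENANCE";
type OverallStatus = MonitorStatus | "NO_DATA";

// Higher number means worse; the worst status present wins.
const SEVERITY: Record<MonitorStatus, number> = {
  UP: 0,
  MAINTENANCE: 1,
  DEGRADED: 2,
  DOWN: 3,
};

// `latest` holds one entry per monitor; null means that monitor has not reported yet.
function overallStatus(latest: Array<MonitorStatus | null>): OverallStatus {
  let worst: MonitorStatus | null = null;
  for (const status of latest) {
    if (status === null) continue; // monitors without data do not vote
    if (worst === null || SEVERITY[status] > SEVERITY[worst]) {
      worst = status;
    }
  }
  return worst ?? "NO_DATA"; // NO_DATA only when nothing reported at all
}

overallStatus(["UP", "MAINTENANCE", "DEGRADED"]); // "DEGRADED"
overallStatus(["UP", "MAINTENANCE", "DOWN"]); // "DOWN"
overallStatus(["MAINTENANCE", "MAINTENANCE"]); // "MAINTENANCE"
overallStatus([null, null]); // "NO_DATA"
```
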
### Embeds {#embeds}

From **Manage → Embed**, you can generate: