BMS/IMPROVEMENTS.md
2026-03-19 11:32:17 +00:00


BMS Improvement Plan — Singapore DC01

Read this file at the start of the next session to restore context. Generated from full page review (all 9 pages read and analysed).


Phased Execution Plan

Phase 1 — Frontend Quick Wins (no backend/simulator changes)

| # | Page | Improvement | Status |
|---|------|-------------|--------|
| 1.1 | Alarms | Escalation timer — colour-ramping counter for unacknowledged critical alarms | [x] |
| 1.2 | Alarms | MTTR stat card — derived from triggered_at → resolved_at | deferred to Phase 3 (needs resolved_at from backend) |
| 1.3 | Assets | Sortable inventory table columns | [x] |
| 1.4 | Environmental | Humidity overlay toggle on heatmap | [x] |
| 1.5 | Environmental | Dew point derived client-side (Magnus formula from temp + humidity) | [x] |
| 1.6 | Environmental | ASHRAE A1 compliance table per rack | [x] |
| 1.8 | Capacity | Stranded power total kW shown prominently | [x] |
| 1.9 | Environmental | Dew point vs. supply air temp chart (client-side derived) | [x] |
| 1.10 | Floor Map | Alarm badge overlay option | [x] |
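
Item 1.5 derives dew point from temperature and humidity with the Magnus formula. A minimal sketch of that calculation is shown below, in Python for brevity even though the shipped version runs client-side. The coefficient pair (17.62, 243.12 °C) is one common Magnus parameter set and is an assumption, as are the function names; the plan does not record which values the frontend uses.

```python
import math

# Magnus coefficients for water vapour over liquid water.
# Assumed values: the plan does not say which parameter set is in use.
MAGNUS_B = 17.62
MAGNUS_C = 243.12  # degrees Celsius


def dew_point_c(temp_c: float, rh_pct: float) -> float:
    """Approximate dew point (deg C) from dry-bulb temperature and relative humidity."""
    if not 0.0 < rh_pct <= 100.0:
        raise ValueError("relative humidity must be in (0, 100]")
    gamma = math.log(rh_pct / 100.0) + (MAGNUS_B * temp_c) / (MAGNUS_C + temp_c)
    return (MAGNUS_C * gamma) / (MAGNUS_B - gamma)


def condensation_margin_c(supply_air_c: float, room_temp_c: float, rh_pct: float) -> float:
    """Headroom (deg C) between supply air temperature and the room dew point.

    A margin near or below zero means supply air is at or under the dew point,
    which is the condensation-risk case the item 1.9 chart is meant to expose.
    """
    return supply_air_c - dew_point_c(room_temp_c, rh_pct)
```

At 100 % RH the function returns the dry-bulb temperature itself, which is a quick sanity check for any port of this formula.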

Phase 2 — Simulator Expansion (new bots + topology)

| # | Bot | Status |
|---|-----|--------|
| 2.1 | GeneratorBot — fuel_pct, load_kw, run_hours, state, scenarios: GENERATOR_FAILURE / LOW_FUEL | [x] |
| 2.2 | AtsBot — active_feed, transfer_count, last_transfer_ms, scenario: ATS_TRANSFER | [x] |
| 2.3 | ChillerBot — chw_supply/return_c, flow_gpm, cop, condenser_pressure_bar, scenario: CHILLER_FAULT | [x] |
| 2.4 | VesdaBot — level (normal/alert/action/fire), obscuration_pct, zone_id, scenarios: VESDA_ALERT / VESDA_FIRE | [x] |
| 2.5 | Extend PduBot — per-phase kW + amps (A/B/C), imbalance_pct, scenario: PHASE_IMBALANCE | [x] |
| 2.6 | Extend WaterLeakBot — floor_zone, under_floor, near_crac metadata | [x] |
| 2.7 | Topology update — generators, ats, chillers, vesda zones, extra leak sensors | [x] |
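
The ChillerBot fields in 2.3 (chilled water supply/return temperature, flow_gpm, cop) are physically linked, so a simulator or a consistency check can derive one from the others. The sketch below shows that relationship under stated assumptions: plain water with no glycol, US gallons per minute, and hypothetical function names. It is not taken from the ChillerBot source.

```python
# Assumed physical constants for plain chilled water (no glycol).
LITRES_PER_SEC_PER_US_GPM = 3.785411784 / 60.0  # US gal/min -> L/s
WATER_CP_KJ_PER_KG_K = 4.186                    # specific heat of water
WATER_DENSITY_KG_PER_L = 1.0                    # approximation at CHW temperatures


def chw_cooling_kw(chw_supply_c: float, chw_return_c: float, flow_gpm: float) -> float:
    """Heat removed by the chilled water loop in kW (mass flow x cp x delta-T)."""
    delta_t = chw_return_c - chw_supply_c
    mass_flow_kg_s = flow_gpm * LITRES_PER_SEC_PER_US_GPM * WATER_DENSITY_KG_PER_L
    return mass_flow_kg_s * WATER_CP_KJ_PER_KG_K * delta_t


def chiller_cop(cooling_kw: float, electrical_input_kw: float) -> float:
    """Coefficient of performance: cooling delivered per unit of electrical input."""
    if electrical_input_kw <= 0:
        raise ValueError("electrical input must be positive")
    return cooling_kw / electrical_input_kw
```

For example, 1000 gpm with a 7 °C supply and 12 °C return works out to roughly 1320 kW of cooling.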

Phase 3 — Backend API Expansion

| # | Endpoint | Status |
|---|----------|--------|
| 3.1 | GET /api/generator/status | [x] |
| 3.2 | GET /api/power/ats | [x] |
| 3.3 | GET /api/power/phase | [x] |
| 3.4 | GET /api/power/redundancy | [x] |
| 3.5 | GET /api/cooling/status (chiller) | [x] |
| 3.6 | GET /api/cooling/history (COP + capacity over time) | [x] |
| 3.7 | GET /api/fire/status (VESDA zones) | [x] |
| 3.8 | GET /api/leak/status (with location metadata) | [x] |
| 3.9 | GET /api/power/utility (grid import, tariff, monthly kWh) | [x] |
| 3.10 | GET /api/reports/energy (kWh cost, PUE 30-day trend) | [x] |
| 3.11 | Extend cooling/{crac_id} detail — add airflow_cfm | [x] (was already done in env.py) |

Phase 4 — Existing Pages Wired Up (uses Phase 2+3 data)

| # | Page | Improvement | Status |
|---|------|-------------|--------|
| 4.1 | Dashboard | Generator status KPI card | [x] |
| 4.2 | Dashboard | Leak detection KPI card | [x] |
| 4.3 | Dashboard | UPS worst-case runtime card | deferred (UPS runtime already shown on Power page) |
| 4.4 | Power | Generator section | [x] |
| 4.5 | Power | ATS transfer switch panel | [x] |
| 4.6 | Power | PDU branch circuit section | [x] (phase imbalance table) |
| 4.7 | Power | Phase imbalance warning on UPS cards | [x] |
| 4.8 | Power | Power redundancy level indicator | [x] |
| 4.9 | Cooling | COP trend chart per CRAC | [x] (in CRAC detail sheet) |
| 4.10 | Cooling | Chiller plant summary panel | [x] |
| 4.11 | Cooling | Predictive filter replacement estimate | [x] |
| 4.12 | Cooling | Airflow CFM tile in fleet summary | [x] |
| 4.13 | Environmental | Leak sensor map panel | [x] |
| 4.14 | Environmental | VESDA/smoke status panel | [x] |
| 4.15 | Floor Map | Leak sensor overlay layer | [x] (panel below map) |
| 4.16 | Floor Map | Power feed (A/B) overlay layer | [x] |
| 4.17 | Floor Map | Humidity 3rd overlay | [x] (done in Phase 1) |
| 4.18 | Capacity | N+1 cooling margin indicator | [x] |
| 4.19 | Capacity | Capacity runway chart | [x] |
| 4.20 | Alarms | Generator alarm category | [x] (alarm engine raises gen alarms automatically) |
| 4.21 | Alarms | Leak alarm category with floor map link | [x] (alarm engine already handles leak) |
| 4.22 | Alarms | Fire/VESDA alarm category | [x] (alarm engine raises vesda_level alarms) |
| 4.23 | Assets | PDU as asset type | [x] (PDU phase monitoring section in assets grid) |
| 4.24 | Assets | Rack elevation diagram in RackDetailSheet | [x] (already implemented as RackDiagram) |
| 4.25 | Reports | PUE 30-day trend graph | [x] (daily IT kW trend + PUE estimated) |
| 4.26 | Reports | Energy cost section | [x] |

Phase 5 — New Pages

| # | Page | Status |
|---|------|--------|
| 5.1 | Generator & Power Path | [x] |
| 5.2 | Leak Detection | [x] |
| 5.3 | Fire & Life Safety | [x] |

Phase 6 — Low Priority & Polish

| # | Item | Status |
|---|------|--------|
| 6.1 | Alarms: assigned-to column + maintenance window suppression | [x] (assigned-to with localStorage) |
| 6.2 | Alarms: root cause correlation | [x] (5-rule RootCausePanel above stat cards) |
| 6.3 | Assets: warranty expiry + lifecycle status | [x] (lifecycle status column added) |
| 6.4 | Assets: CSV import/export for CMDB | [x] (CSV export added) |
| 6.5 | Reports: comparison period (this week vs last) | [x] |
| 6.6 | Reports: scheduled PDF email | [ ] |
| 6.7 | New page: Network Infrastructure | [x] |
| 6.8 | New page: Energy & Sustainability | [x] |
| 6.9 | New page: Maintenance windows | [x] |
| 6.10 | Environmental: particle count (ISO 14644) | [ ] |
| 6.11 | Dashboard: room quick-status grid (Hall A / Hall B avg temp, power, CRAC state) — visual rack-grid thumbnail deferred to backlog | [x] |
| 6.12 | Floor Map: zoom/pan + CRAC coverage shading | [ ] |

Phase 7 — Untracked Additions

| # | Item | Status |
|---|------|--------|
| 7.1 | Settings page — Profile, Notifications, Thresholds, Site Config tabs | [x] |
| 7.2 | Floor layout editor — server-side persistence via site_config table (PUT/GET /api/floor-layout) | [x] |
| 7.3 | Rack naming convention updated to SG1A01.xx / SG1B01.xx format across all topology files | [x] |
| 7.4 | 80-rack topology — Hall A and Hall B each have 2 rows × 20 racks | [x] |


Dashboard (/dashboard)

| # | Type | Improvement | Priority |
|---|------|-------------|----------|
| 1 | Sensor | Add Generator status KPI card (fuel %, run-hours, transfer state) | High |
| 2 | Sensor | Add Water/Leak Detection KPI card — badge showing any active leaks | High |
| 3 | Sensor | Add Raised floor differential pressure widget | Medium |
| 4 | Sensor | Show UPS state in KPI row (mains vs. battery, worst-case runtime) | High |
| 5 | Visual | Dashboard KPI row: add 5th card or replace PUE with site health score | Medium |
| 6 | Visual | Add mini floor map thumbnail as 4th bottom-row panel | Medium |
| 7 | Info | Show carbon intensity / CO2e alongside PUE | Low |
| 8 | Info | Add MTBF / uptime streak counter for critical infrastructure | Low |

Cooling (/cooling)

| # | Type | Improvement | Priority |
|---|------|-------------|----------|
| 1 | Sensor | Add Chiller plant metrics — CHW supply/return temps, flow rate, chiller COP, condenser pressure | High |
| 2 | Sensor | Add Cooling tower stats — approach temp, basin level, blow-down rate, fan speed | Medium |
| 3 | Sensor | Glycol/refrigerant level indicator per CRAC | High |
| 4 | Sensor | Airflow (CFM) per CRAC — not just fan % | Medium |
| 5 | Sensor | Condenser water inlet/outlet temperature for water-cooled units | Medium |
| 6 | Sensor | Raised floor tile differential pressure — 0.04 to 0.08 in. W.C. target range | High |
| 7 | Sensor | Hot/cold aisle containment breach indicator — door open, blanking panels | Medium |
| 8 | Sensor | Chilled water flow rate (GPM) and heat rejection kW | Medium |
| 9 | Visual | COP trend chart over time per unit (currently only static value) | High |
| 10 | Visual | Fleet summary: add total fleet airflow (CFM) tile | Medium |
| 11 | Visual | Add cooling efficiency vs. IT load scatter/trend chart | Medium |
| 12 | Info | Predictive filter replacement — estimated days until change-out based on dP rate of rise | Medium |
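
Item 12 estimates days until filter change-out from the rate of rise of filter dP. One way to do that is a least-squares line through recent samples, extrapolated to the change-out limit, as sketched below. The linear-growth assumption, the pascal unit, and the function name are all illustrative; the plan does not specify how the shipped estimate is computed.

```python
from typing import Optional, Sequence


def days_until_filter_change(
    days: Sequence[float],
    dp_pa: Sequence[float],
    change_out_pa: float,
) -> Optional[float]:
    """Estimate days until filter dP reaches the change-out limit.

    Fits a least-squares line through (day, dP) samples and extrapolates
    from the latest reading. Returns 0.0 if the latest reading is already
    at or above the limit, and None if dP is flat or falling.
    """
    if len(days) != len(dp_pa) or len(days) < 2:
        raise ValueError("need at least two paired samples")
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(dp_pa) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    if sxx == 0:
        raise ValueError("samples must cover more than one point in time")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, dp_pa))
    slope = sxy / sxx  # Pa per day
    if dp_pa[-1] >= change_out_pa:
        return 0.0
    if slope <= 0:
        return None
    return (change_out_pa - dp_pa[-1]) / slope
```

Returning None for a flat or falling trend lets the UI show "no estimate" instead of a misleading negative or infinite number.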

Power (/power)

| # | Type | Improvement | Priority |
|---|------|-------------|----------|
| 1 | Sensor | Add Generator status section — active/standby, fuel %, last test date, load kW | High |
| 2 | Sensor | Add ATS/STS transfer switch status — which feed active (Utility A/B), transfer time | High |
| 3 | Sensor | Add PDU branch circuit monitoring — per-phase kW, amps, trip status | High |
| 4 | Sensor | Power quality metrics — THD, voltage sag/swell events, neutral current | Medium |
| 5 | Sensor | Busway / overhead busbar load per tap-off box | Medium |
| 6 | Sensor | Utility metering — grid import kW, tariff period, cost/kWh, monthly kWh | Medium |
| 7 | Sensor | Phase imbalance per panel/UPS — flag >5% imbalance | High |
| 8 | Visual | UPS cards: add input voltage/frequency per phase, bypass mode status | Medium |
| 9 | Info | Add power redundancy level indicator — N, N+1, 2N — highlight single points of failure | High |
| 10 | Info | Annualised energy cost projection alongside kWh | Low |
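
Item 7 flags phase imbalance above 5%, but the plan does not define how imbalance is calculated. The sketch below uses a common convention (largest deviation from the three-phase average, as a percentage of that average). That definition and the function names are assumptions; only the 5% threshold comes from the table above.

```python
IMBALANCE_WARN_PCT = 5.0  # threshold from item 7: flag > 5 % imbalance


def phase_imbalance_pct(amps_a: float, amps_b: float, amps_c: float) -> float:
    """Largest deviation from the mean phase current, as a percentage of the mean.

    Assumed definition; the plan only names the imbalance_pct field.
    """
    phases = (amps_a, amps_b, amps_c)
    mean = sum(phases) / 3.0
    if mean == 0:
        return 0.0  # no load on any phase, nothing to flag
    return max(abs(p - mean) for p in phases) / mean * 100.0


def is_imbalanced(amps_a: float, amps_b: float, amps_c: float) -> bool:
    """True when imbalance exceeds the warning threshold."""
    return phase_imbalance_pct(amps_a, amps_b, amps_c) > IMBALANCE_WARN_PCT
```

With phase currents of 100 A, 104 A and 96 A the result is 4 %, which stays under the warning threshold; 100 A, 110 A and 90 A gives 10 % and would be flagged.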

Environmental (/environmental)

| # | Type | Improvement | Priority |
|---|------|-------------|----------|
| 1 | Sensor | Add Dew point derived value per room — approaching supply temp = condensation risk | High |
| 2 | Sensor | Add Water/leak detection sensors map — floor, under-floor, drip trays, pipe runs | High |
| 3 | Sensor | Smoke detector / VESDA status panel — aspirating detector alarm levels | High |
| 4 | Sensor | Raised floor pressure differential trend chart | Medium |
| 5 | Sensor | Hot aisle inlet temperature per rack row (return air) | Medium |
| 6 | Sensor | Server inlet temperature sensors from IPMI per device | Medium |
| 7 | Sensor | Particle count (ISO 14644 class) | Low |
| 8 | Visual | Heatmap: add humidity overlay toggle (currently separate chart only) | High |
| 9 | Visual | Add ASHRAE compliance table per rack — flag racks outside A1/A2 envelope | Medium |
| 10 | Visual | Add dew point vs. supply air temp chart with condensation risk zone | Medium |
| 11 | Info | Show absolute humidity (g/kg) alongside RH for ASHRAE compliance | Low |
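
Item 11 asks for humidity in g/kg alongside RH. Strictly, g/kg of dry air is the humidity ratio, and it can be derived from the same temperature and RH readings the page already has. The sketch below assumes sea-level pressure (1013.25 hPa) and a Magnus-type saturation pressure fit; both are assumptions, as are the function names.

```python
import math

STANDARD_PRESSURE_HPA = 1013.25  # assumed: sea-level atmosphere
MOLAR_MASS_RATIO_G_PER_KG = 621.945  # water vapour / dry air, scaled to g/kg


def saturation_vapour_pressure_hpa(temp_c: float) -> float:
    """Saturation vapour pressure over water (hPa), Magnus-type approximation."""
    return 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))


def humidity_ratio_g_per_kg(
    temp_c: float,
    rh_pct: float,
    pressure_hpa: float = STANDARD_PRESSURE_HPA,
) -> float:
    """Grams of water vapour per kilogram of dry air."""
    if not 0.0 <= rh_pct <= 100.0:
        raise ValueError("relative humidity must be in [0, 100]")
    vapour_hpa = rh_pct / 100.0 * saturation_vapour_pressure_hpa(temp_c)
    return MOLAR_MASS_RATIO_G_PER_KG * vapour_hpa / (pressure_hpa - vapour_hpa)
```

At 25 °C and 50 % RH this gives a little under 10 g/kg.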

Floor Map (/floor-map)

| # | Type | Improvement | Priority |
|---|------|-------------|----------|
| 1 | Sensor | Add leak sensor overlay — highlight tiles where water sensors are placed | High |
| 2 | Sensor | Add smoke/VESDA zone overlay | Medium |
| 3 | Sensor | Add PDU/power path overlay — show which feed (A/B) each rack is on | High |
| 4 | Visual | Add 3rd overlay: humidity | Medium |
| 5 | Visual | Add airflow arrows showing cold aisle → rack → hot aisle direction | Low |
| 6 | Visual | Show blank rack slots count on each rack tile (U available) | Medium |
| 7 | Visual | Add rack-level alarm badge as an overlay option | High |
| 8 | Visual | Add zoom/pan for larger floor plans | Medium |
| 9 | Info | Add CRAC coverage radius shading showing which racks each CRAC thermally serves | Medium |

Capacity (/capacity)

| # | Type | Improvement | Priority |
|---|------|-------------|----------|
| 1 | Visual | Add capacity runway chart — at current growth rate, weeks until power/cooling capacity hit | High |
| 2 | Sensor | Add U-space utilisation per rack — units occupied vs. total 42U | Medium |
| 3 | Sensor | Generator fuel capacity as a capacity dimension | Medium |
| 4 | Info | Thermal capacity per CRAC vs. current IT load — N+1 cooling margin | High |
| 5 | Info | Add growth projection input — operator enters expected kW/month to forecast capacity date | Medium |
| 6 | Visual | Cross-room comparison radar chart (Power %, Cooling %, Space %) | Medium |
| 7 | Visual | Show stranded power total in kW (not just per-rack list) | Medium |
| 8 | Sensor | Weight capacity per rack — floor load (kg/m2) | Low |
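
Items 1 and 5 both reduce to the same arithmetic: remaining headroom divided by growth rate. A minimal sketch follows, assuming linear growth in kW per month and converting to weeks for the runway chart. The function name and the linear model are assumptions, not a description of the shipped chart.

```python
from typing import Optional

WEEKS_PER_MONTH = 52.0 / 12.0


def runway_weeks(
    current_kw: float,
    capacity_kw: float,
    growth_kw_per_month: float,
) -> Optional[float]:
    """Weeks until load reaches capacity at a constant monthly growth rate.

    Returns 0.0 if capacity is already reached and None if load is flat
    or shrinking (no capacity date to forecast).
    """
    if current_kw >= capacity_kw:
        return 0.0
    if growth_kw_per_month <= 0:
        return None
    months = (capacity_kw - current_kw) / growth_kw_per_month
    return months * WEEKS_PER_MONTH
```

The same function works for power and cooling; calling it once per dimension and taking the smaller result gives the binding constraint.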

Alarms (/alarms)

| # | Type | Improvement | Priority |
|---|------|-------------|----------|
| 1 | Sensor | Add Generator alarm category (fuel low, start fail, overload) | High |
| 2 | Sensor | Add Leak alarm category with direct link to leak sensor on floor map | High |
| 3 | Sensor | Add Fire/VESDA alarm category with severity escalation | High |
| 4 | Sensor | Add Network device alarm category (switch down, link fault, LACP failure) | Medium |
| 5 | Visual | Add escalation timer — how long critical alarm unacknowledged, colour ramp | High |
| 6 | Visual | Add MTTR stat card alongside existing stat cards | Medium |
| 7 | Visual | Alarm table: add "Assigned to" column | Low |
| 8 | Visual | Add alarm suppression / maintenance window toggle | Medium |
| 9 | Info | Root cause correlation — surface linked alarms (e.g. rack temp high + CRAC fan low) | Medium |
Assets (/assets)

| # | Type | Improvement | Priority |
|---|------|-------------|----------|
| 1 | Sensor | Per-device power draw from PDU outlet monitoring (not estimated) | High |
| 2 | Sensor | Server inlet temperature from IPMI/iDRAC per device | High |
| 3 | Sensor | Add PDUs as asset type with per-outlet monitoring | High |
| 4 | Sensor | Network device status (switch uptime, port count, active links) | Medium |
| 5 | Visual | Inventory table: add sortable columns (currently unsortable) | High |
| 6 | Visual | Add rack elevation diagram (visual U-space view) in RackDetailSheet | High |
| 7 | Visual | Add device age / warranty expiry column in inventory | Medium |
| 8 | Info | Add DCIM-style lifecycle status — Active / Decomm / Planned | Low |
| 9 | Info | Add asset import/export (CSV) for CMDB sync | Medium |

Reports (/reports)

| # | Type | Improvement | Priority |
|---|------|-------------|----------|
| 1 | Sensor | Add energy cost report — kWh, estimated cost at tariff, month-to-date | High |
| 2 | Visual | Add PUE trend graph — 30-day rolling PUE vs. target | High |
| 3 | Visual | Add cooling efficiency (kW IT / kW cooling) over time | Medium |
| 4 | Visual | Add alarm MTTR and alarm volume trend per week | Medium |
| 5 | Info | Add scheduled report configuration — email PDF daily/weekly | Medium |
| 6 | Info | Add comparison period — this week vs. last week | Medium |
| 7 | Info | Add sustainability section — CO2e, renewable fraction, WUE | Low |
| 8 | Info | Add SLA compliance section — uptime %, incidents, breach risk | Medium |
| 9 | Info | Expand CSV exports: PDU branch data, CRAC detailed logs, humidity history | Medium |
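
Items 1 and 2 rest on two simple ratios: PUE is total facility power divided by IT power, and energy cost is kWh multiplied by the tariff. The sketch below shows both, plus a per-day PUE series for the trend graph. Function names and the input shape (a list of daily total/IT kW pairs) are assumptions, and a flat tariff is assumed for simplicity.

```python
from typing import Iterable, List, Tuple


def pue(total_facility_kw: float, it_kw: float) -> float:
    """Power usage effectiveness: total facility power over IT power."""
    if it_kw <= 0:
        raise ValueError("IT load must be positive")
    return total_facility_kw / it_kw


def daily_pue_trend(daily_kw: Iterable[Tuple[float, float]]) -> List[float]:
    """PUE per day from (total_facility_kw, it_kw) pairs, rounded to 3 places."""
    return [round(pue(total_kw, it_kw), 3) for total_kw, it_kw in daily_kw]


def energy_cost(kwh: float, tariff_per_kwh: float) -> float:
    """Estimated cost at a flat tariff (assumed; real tariffs may vary by period)."""
    return kwh * tariff_per_kwh
```

A 30-day trend is then just `daily_pue_trend` applied to the last 30 daily pairs, plotted against the target line.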

New Pages to Build

| Page | Description | Priority |
|------|-------------|----------|
| Generator & Power Path | ATS status, generator load, fuel level, transfer switch history | High |
| Leak Detection | Site-wide leak sensor map, sensor status, historical events | High |
| Fire & Life Safety | VESDA levels, smoke detector zones, suppression system status | High |
| Network Infrastructure | Core/edge switch health, port utilisation, link status | Medium |
| Energy & Sustainability | kWh cost, PUE trend, CO2e, WUE | Medium |
| Maintenance | Planned outages, maintenance windows, alarm suppression | Low |