Enterprise · AI

AI-Powered HVAC Analytics

At a glance: Led three rounds of research to turn an AI HVAC platform from a distrusted “report” into a transparent daily tool. Trust climbed from 50 to 75, and the UX-Lite score rose past the global benchmark.

Facility managers were stuck reacting to HVAC failures instead of preventing them. We built an AI platform that diagnoses performance issues and surfaces the highest-impact ones. I joined after an agency's initial design and led research and design to validate and refine it across three rounds of testing.

Role

Lead Researcher & UX Designer

Platform

Cloud web platform

Team

Lead designer · UI designer · PMs · Engineering

71.4

UX-Lite score, up from 66.7 and above the 68 global benchmark

50 → 75

Trust & credibility rating across the testing rounds

3

Rounds of moderated 1:1 testing, each feeding the next iteration

01 · Research

A three-round research strategy

I defined and led three rounds of moderated testing and surveys to validate and refine the product, translating findings into feasible AI improvements between rounds.

Ran journey mapping with stakeholders, weaving new tasks into the existing facility-manager journey.
Recruited facility managers, support technicians, and account managers who influence adoption.
Ran scenario-based 1:1 sessions probing comprehension, trust, and usability.
Synthesized findings into insight reports and prioritized changes with product.
Partnered with engineering on model feasibility, flagging where requests exceeded AI capability.
Implemented iterations with a UI designer and validated them the next round.

Moderated 1:1 sessions with facility managers and administered surveys to remaining participants.

Insight reports translated findings into prioritized changes.

02 · Iteration

What each round taught us

Round 1: Initial testing

Learned

Liked the modern interface, but hit onboarding & navigation friction
Distrusted incomplete data; wanted scores and faults explained
Found red too alarming; low scores in warning colors read like alarms

Changed

Exposed that scores are calculated from automated tests
Added transparent AI root-cause explanations; retrained for common faults
Swapped red, yellow, green for neutral purple

Round 2: Validation testing

Learned

Trust was inconsistent: high when insights matched reality, low when they seemed irrelevant
Wanted repair specifics
Too many clicks to reach daily-operations tools

Changed

Reframed outputs as “early indicators,” not “predictions”
Cut clicks to operations tools
Surfaced critical information upfront

Round 3: Final refinement

Learned

Preferred opportunity-focused language and “next steps” over “fixes”
Wanted Key Insights more prominent and clearer navigation
Found vague labels like “Take Action” unhelpful

Changed

Rewrote messaging around optimization; moved Key Insights to the top
Added breadcrumbs and trend charts; relabeled to “Investigate” and “Review”
Exposed the data, thresholds and rules behind the AI; added a feedback loop

Navigating a business constraint: users wanted actionable repair recommendations, but our account-management team worried this would cannibalize the company's paid service offerings. I facilitated alignment across product and account management on a compromise: the AI diagnoses issues and explains root causes, then directs users to professional service or technical documentation rather than prescribing the fix.

03 · Solution

An AI that shows its work

The refined platform makes the AI's thinking visible and keeps people in control. Every score can be drilled into to reveal the data, thresholds and rules behind it, so nothing reads like a black box. Findings are framed as opportunities rather than verdicts, the ones that matter most rise to the top, and trends show how performance shifts over time. And when the model isn't certain, people can push back, flagging what looks off instead of taking the answer on faith.
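To make the idea of a drillable score concrete, here is a minimal sketch of how a score could carry its own evidence. It is an illustration only, not the platform's actual data model: the class names, fields, pass rule, scoring rule, and sample readings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RuleCheck:
    """One automated test behind a score (all fields are hypothetical)."""
    name: str
    measured: float
    threshold: float
    unit: str

    @property
    def passed(self) -> bool:
        # Assumption: a check passes when the reading is at or below its threshold.
        return self.measured <= self.threshold


@dataclass
class ExplainedScore:
    """A score that keeps the data, thresholds and user feedback behind it."""
    system: str
    checks: list[RuleCheck]
    user_flags: list[str] = field(default_factory=list)

    @property
    def score(self) -> int:
        # Assumption: the score is the share of passed checks, scaled to 0-100.
        if not self.checks:
            return 0
        return round(100 * sum(c.passed for c in self.checks) / len(self.checks))

    def explain(self) -> list[str]:
        # One plain-language line per check, so the score is never a black box.
        return [
            f"{c.name}: measured {c.measured} {c.unit}, "
            f"threshold {c.threshold} {c.unit}, "
            f"{'passed' if c.passed else 'needs review'}"
            for c in self.checks
        ]

    def flag(self, note: str) -> None:
        # Feedback loop: the user records what looks off.
        self.user_flags.append(note)


# Made-up readings for illustration only.
ahu = ExplainedScore(
    system="Air handler 1",
    checks=[
        RuleCheck("Supply air temperature deviation", 1.5, 2.0, "°C"),
        RuleCheck("Damper position error", 12.0, 5.0, "%"),
    ],
)
ahu.flag("Damper sensor was replaced last week; reading may be stale.")
print(ahu.score)
for line in ahu.explain():
    print(line)
```

Keeping the checks and the user's flags on the same object as the score means the drill-down view and the feedback loop read from one source, which is one way to keep the explanation from drifting away from the number it explains.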

An in-app explainer defines what each score range means, from "Great" to "Investigate."

Trend charts show performance over time.

04 · Impact

Outcome & reflection

Between Round 1 and Round 3, the UX-Lite score rose from 66.7 to 71.4 (above the 68 global benchmark), and trust & credibility climbed from 50 to 75. User sentiment shifted from "needs significant improvements" to "mostly ready for launch with minor tweaks."
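For readers unfamiliar with the metric, the sketch below shows how a UX-Lite score is commonly calculated: two 5-point ratings (ease of use and usefulness) are each rescaled to 0-100 and averaged. This assumes the standard 5-point scoring; the case study does not state the exact method used, and the function names and sample responses are made up for illustration.

```python
from statistics import mean


def ux_lite_score(ease: int, usefulness: int) -> float:
    """Score one response on a 0-100 scale from two 5-point ratings."""
    for rating in (ease, usefulness):
        if not 1 <= rating <= 5:
            raise ValueError("ratings must be between 1 and 5")
    # Rescale each item from 1-5 to 0-100, then average the two items.
    return mean((rating - 1) * 25 for rating in (ease, usefulness))


def study_score(responses: list[tuple[int, int]]) -> float:
    """Average the per-response scores for one testing round."""
    return round(mean(ux_lite_score(e, u) for e, u in responses), 1)


# Made-up (ease, usefulness) ratings, not data from the study.
sample = [(4, 4), (5, 4), (3, 3)]
print(study_score(sample))  # 70.8
```

On this scale, the reported move from 66.7 to 71.4 is a gain of 4.7 points, which carries the score from just below the 68 benchmark to above it.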

More than the numbers, the project shaped how I approach AI work. I start with research, I make the model's reasoning something users can actually see, and I keep a person in control whenever the stakes are high.

The original analytics dashboard: one overall building score with a flat list of equipment rows.

The redesigned dashboard: a plain-language score, high-impact key insights and per-system breakdowns.

From an unexplained score to a transparent, action-oriented view of building health.

05 · Reflection

What I took away

Trust through transparency. Trust grows when the AI shows which data, thresholds and logic informed each insight.
Frame the AI as an assistant. Wording insights as "likely causes" and "suggested next steps" left room for the user's own judgment instead of dictating the answer.
Design around AI constraints. Asking "can the model reliably do this?" early let me design alternatives (overrides, feedback, transparent disclaimers) that preserved trust while the model improved.
Involve business stakeholders early. Collaborating with account management upfront prevented late-stage pivots and produced a solution everyone could support.
