Andrew Foster, product director at IOTech, looks at how AI at the edge can fix dumb data in buildings.

Smart buildings are becoming more connected than ever. But behind the sleek dashboards and intelligent automation lies a persistent challenge that’s slowing progress: the data itself.

Building systems — HVAC, lighting, metering, security, and access control — often operate in silos. They generate data in different formats, with inconsistent labeling and structure. Even basic contextual information like “what is this data stream measuring, and where?” is often missing or manually entered.

Before any of that information can be used for building analytics, optimization, or integration into a centralized BMS, it has to be cleaned, tagged, and normalized. This is typically a manual and time-consuming process that involves aligning data with semantic models such as Project Haystack and Brick. The lack of interoperability across devices continues to drive up engineering costs and prevent true plug-and-play integration — especially when deploying systems at scale.
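To make the tagging step concrete, the sketch below shows roughly what normalizing a single BMS point can involve. The point name, field names, and tag set are illustrative only; real Project Haystack models define a much richer vocabulary of tags and relationships.

```python
# Illustrative only: a cryptic vendor point, before and after manual
# normalization. Names and tags here are hypothetical examples.

raw_point = {
    "name": "AHU2_SAT",   # cryptic vendor label: supply air temp?
    "value": 55.4,
    "units": "",          # units are often missing at the source
}

# After normalization: a human-readable name, explicit units, and
# Haystack-style marker tags saying what is measured and where.
tagged_point = {
    "dis": "AHU-2 Supply Air Temp",
    "curVal": 55.4,
    "unit": "°F",
    "tags": ["point", "sensor", "temp", "air", "discharge"],
    "equipRef": "AHU-2",   # links the point to its parent equipment
}

assert "temp" in tagged_point["tags"]
```

Multiplied across thousands of points per building, doing this by hand is exactly the bottleneck described above.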

A hidden bottleneck with real-world costs

Inconsistent device data may not grab headlines, but it has far-reaching consequences. When teams have to spend days or weeks manually preparing data just to get a system online, project timelines stretch and labor costs spike.

More importantly, the longer this manual process remains in place, the harder it becomes to scale smart building initiatives across portfolios. Analytics platforms can’t deliver useful insights if the input data lacks structure. Predictive maintenance systems are limited without context. And energy optimization goals are harder to reach when performance data is fragmented or ambiguous.

Manual tagging is also prone to human error, especially across complex environments where thousands of data points are being pulled from different subsystems. That leads to inconsistent labels, incomplete metadata, and delayed diagnostics — all of which impact the performance of higher-level systems that depend on accurate, real-time data.

In a space that’s increasingly focused on automation, this manual bottleneck is holding the industry back.

The shift toward edge intelligence

One approach gaining traction is the use of AI and semantic modeling at the edge — directly where data is generated. By processing and tagging data locally, building operators can standardize and contextualize information in real time, rather than cleaning it up after the fact.

This model shifts the burden of data preparation closer to the source. Instead of sending raw, unstructured data to the cloud for cleaning and organization, edge platforms can normalize and align the data at the point of ingestion. That means better data flowing through the system from the start — and a significant reduction in setup time for integrators and automation specialists.
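As a rough illustration of normalization at the point of ingestion, the sketch below maps two hypothetical vendor payloads onto one canonical record before anything is sent upstream. The payload shapes and the canonical schema are assumptions for the example, not any particular platform's format.

```python
# Hypothetical sketch: raw readings arrive in whatever shape each
# device emits; the edge layer converts them to one canonical record.

def normalize(raw: dict) -> dict:
    """Map a vendor-specific reading onto a canonical record."""
    if "tempF" in raw:                     # vendor A: Fahrenheit, verbose keys
        celsius = (raw["tempF"] - 32) * 5 / 9
        device = raw["dev"]
    elif "t" in raw:                       # vendor B: Celsius, terse keys
        celsius = raw["t"]
        device = raw["id"]
    else:
        raise ValueError("unknown payload shape")
    return {
        "device": device,
        "measure": "air-temp",
        "value_c": round(celsius, 2),
        "unit": "°C",
    }

print(normalize({"dev": "ahu-1", "tempF": 68.0}))  # value_c: 20.0
print(normalize({"id": "vav-12", "t": 21.5}))      # value_c: 21.5
```

Everything downstream of this step sees a single, predictable structure regardless of which vendor produced the reading.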

Several technology providers are embedding this functionality in edge platforms that sit on building gateways or controllers. These platforms can ingest device-level data and automatically apply semantic models to align it with frameworks like Project Haystack and Brick. This allows data to be streamed in a usable format from the moment it’s created — enabling faster integration, reduced configuration effort, and smoother interoperability across systems.
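One simple way to picture automatic semantic tagging is pattern rules that infer Haystack-style tags from common point-naming conventions. The rules below are a hypothetical, minimal sketch; production edge platforms use far richer models (and increasingly ML) to do this.

```python
import re

# Illustrative rule table: naming-convention patterns -> inferred tags.
# Patterns and tags are hypothetical examples of common conventions.
RULES = [
    (re.compile(r"SAT|DAT", re.I), {"temp", "air", "discharge"}),
    (re.compile(r"RAT",     re.I), {"temp", "air", "return"}),
    (re.compile(r"ZN.?T",   re.I), {"temp", "air", "zone"}),
    (re.compile(r"KW",      re.I), {"power", "elec"}),
]

def auto_tag(point_name: str) -> set:
    """Infer a tag set for a point from its raw name."""
    tags = {"point"}
    for pattern, extra in RULES:
        if pattern.search(point_name):
            tags |= extra
    return tags

print(sorted(auto_tag("AHU1_SAT")))  # ['air', 'discharge', 'point', 'temp']
```

Running such rules at the gateway means each point arrives upstream already carrying its semantic context, which is the "usable from the moment it's created" property described above.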

One example is the growing adoption of edge software based on the open-source EdgeX Foundry framework. EdgeX, a project under the Linux Foundation’s LF Edge umbrella, provides a flexible, vendor-neutral platform that supports device interoperability and application portability at the edge. EdgeX Foundry and its commercial derivatives — including IOTech’s Edge Central — are seeing increased use in building automation, where they help streamline the normalization and semantic tagging of device data in real time.

This approach is helping building operators solve the messy data challenge without relying on custom-engineered integrations for every deployment.

Lessons from the field

Facility operators working across multiple sites are beginning to adopt this edge-first model to accelerate system integration and improve long-term scalability. For example, CBRE, one of the world’s largest real estate services firms, has been using edge-based approaches with integrated AI to help automate data normalization across diverse building systems. By aligning data from the outset, they’re able to roll out smart building technologies more quickly across complex portfolios.

Similarly, SHIFT Energy, a provider of machine-learning software for predictive operation and control of central plants and HVAC, has integrated edge platforms into its workflows to ensure consistent, structured data across client sites. This consistency allows them to feed clean inputs into their analytics and automation engines from day one — improving performance without the usual integration delays.

These efforts illustrate how standardizing and contextualizing operational data at the edge is helping building teams avoid the bottlenecks of manual tagging. Faster onboarding of devices, reduced reliance on engineering resources, and cleaner inputs for analytics are proving to be tangible wins in both large-scale facility management and energy efficiency projects.

A foundation for smarter automation

As building systems grow more complex and the pressure to reduce energy consumption intensifies, the need for structured, high-quality data will only increase. AI-driven edge solutions offer a way to meet that need by embedding intelligence where it’s most efficient — at the source.

Rather than relying solely on cloud-based processing or manual workflows, smart buildings can now normalize and tag data in real time, using shared industry models. This not only simplifies deployment, but also lays the groundwork for more sophisticated use cases such as cross-system orchestration, adaptive energy management, and AI-driven fault diagnostics.

What’s emerging is a more scalable model — one that allows systems integrators, building owners, and solution providers to reuse data models across projects and streamline onboarding without starting from scratch each time.

Looking ahead

The promise of smart buildings has never just been about installing more sensors. It’s about making the data from those sensors truly usable — quickly, consistently, and at scale.

As the industry moves toward deeper integration of systems, more AI-driven automation, and tighter sustainability mandates, the quality and readiness of data will only become more important. Buildings are expected to adapt in real time, respond to occupant behavior, optimize energy use hour by hour, and self-diagnose issues before they escalate. None of that is possible without structured, contextualized data flowing freely between systems.

The shift to edge intelligence is not a passing trend. It reflects a broader architectural transformation in how building systems are managed and connected. By enabling real-time data normalization and semantic tagging at the point of origin, building operators can reduce costs, speed up integration, and position themselves to take advantage of whatever technologies come next — from advanced fault detection to AI-powered occupancy optimization.

Standards like Project Haystack and Brick will continue to play a role in enabling semantic interoperability. But standards alone are not enough. What’s needed is a practical, scalable way to apply those standards automatically — and edge-based AI solutions are emerging as one of the most effective ways to do that.

For building owners, integrators, and solution providers, now is the time to evaluate how data is handled at every level of the stack. Because no matter how advanced the application layer becomes, the entire system still depends on getting the data layer right.