diagram.mmd — IoT Telemetry Pipeline flowchart

An IoT telemetry pipeline is the cloud-side infrastructure that receives high-throughput sensor messages from connected devices, routes them through stream processing and storage layers, and delivers outputs to dashboards, alert systems, and downstream analytics consumers.

Telemetry pipelines must handle scale and heterogeneity simultaneously. A deployment with 100,000 devices each sending one message per second produces 100,000 inbound messages per second. Messages vary in schema by device type and firmware version. The pipeline must be elastic, schema-tolerant, and able to maintain low end-to-end latency (typically < 5 seconds from device to dashboard).
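As a back-of-envelope check on that scale, shard count for the ingest stream can be estimated from message rate and payload size. This is a sketch: the per-shard limits below (1 MiB/s and 1,000 records/s, Kinesis-like) and the 1 KiB average payload are illustrative assumptions, not vendor guarantees.

```python
# Rough capacity estimate for the ingest stream.
# Assumed limits: 1 MiB/s and 1,000 records/s per shard (Kinesis-like).

def shards_needed(devices: int, msgs_per_sec: float, payload_bytes: int,
                  shard_bytes_per_sec: int = 1 << 20,
                  shard_records_per_sec: int = 1000) -> int:
    msg_rate = devices * msgs_per_sec
    byte_rate = msg_rate * payload_bytes
    by_records = -(-msg_rate // shard_records_per_sec)  # ceiling division
    by_bytes = -(-byte_rate // shard_bytes_per_sec)
    return int(max(by_records, by_bytes))

print(shards_needed(100_000, 1.0, 1024))  # 100k msg/s at 1 KiB → 100
```

At this example's assumed payload size the record-rate limit, not the byte rate, is the binding constraint.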

The ingestion tier is a managed IoT broker (AWS IoT Core, Azure IoT Hub) or a self-hosted MQTT cluster that accepts authenticated device connections. The broker fans messages into a message stream — Kinesis, Kafka, or Pub/Sub — which provides durability, replay capability, and backpressure isolation between producers and consumers.
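One reason the stream sits between broker and consumers is per-device ordering: hashing the device ID to a partition keeps each device's messages in sequence while spreading aggregate load across shards. A minimal sketch, where the hash choice and shard count are illustrative assumptions:

```python
import hashlib

def partition_for(device_id: str, num_shards: int) -> int:
    """Stable device-to-shard mapping: the same device always lands on
    the same shard, so its messages stay ordered relative to each other."""
    digest = hashlib.md5(device_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Every message from the same device routes identically:
assert partition_for("pump-0042", 16) == partition_for("pump-0042", 16)
```

Managed streams apply the same idea internally when you supply a partition key; the point is to choose the device ID (not, say, a timestamp) as that key.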

Stream processors (Apache Flink, Spark Streaming, or cloud-native equivalents) consume from the stream in real time. They perform schema validation and normalisation, device-level enrichment (looking up metadata in a device registry), windowed aggregation (e.g. computing 1-minute averages), and anomaly scoring. Valid records are written to a time-series database (InfluxDB, TimescaleDB, or cloud equivalents). Records flagged as anomalous route to an alerting service that evaluates severity rules and dispatches notifications.
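The windowing-plus-scoring step can be sketched in pure Python as a stand-in for what a Flink or Spark job would do. The record shape and the spread-based anomaly rule are illustrative assumptions; a real deployment would use the framework's window operators and a tuned detector.

```python
from collections import defaultdict
from statistics import mean

def aggregate_1min(records, threshold=3.0):
    """Tumbling 1-minute windows per device, plus a crude anomaly flag.
    Each record is assumed to look like {"device", "ts" (epoch seconds),
    "value"}. Windows whose value spread exceeds `threshold` are flagged
    for the alerting path."""
    windows = defaultdict(list)  # (device_id, minute) -> values
    for r in records:
        windows[(r["device"], r["ts"] // 60)].append(r["value"])

    out = []
    for (device, minute), values in sorted(windows.items()):
        out.append({
            "device": device,
            "window_start": minute * 60,
            "avg": mean(values),
            "count": len(values),
            "anomalous": max(values) - min(values) > threshold,
        })
    return out
```

The `anomalous` flag models the fork in the diagram: flagged windows go to the alert service, the rest go straight to the time-series store.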

Periodically, a batch ETL job reads the time-series store and writes rolled-up data into a data warehouse for long-term analytics. A metadata store (device registry, site hierarchy, equipment catalogue) is queried both by stream processors and dashboards to enrich raw telemetry with business context. For the upstream pipeline feeding this architecture, see IoT Sensor Data Pipeline. For aggregation patterns on the stored data, see IoT Data Aggregation. For event streaming architecture at the messaging layer, see messaging/event-streaming-architecture.
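The batch rollup can be sketched as a weighted regrouping of minute-level rows into hourly rows. The input schema (per-minute average plus sample count) is an assumption for illustration; weighting by count keeps sparse minutes from skewing the hourly figure.

```python
from collections import defaultdict

def rollup_hourly(minute_rows):
    """Roll 1-minute averages up to hourly rows for the warehouse.
    Each input row is assumed to carry {"device", "window_start"
    (epoch seconds), "avg", "count"}."""
    acc = defaultdict(lambda: [0.0, 0])  # (device, hour) -> [weighted_sum, n]
    for row in minute_rows:
        key = (row["device"], row["window_start"] // 3600)
        acc[key][0] += row["avg"] * row["count"]
        acc[key][1] += row["count"]
    return [
        {"device": d, "hour_start": h * 3600, "avg": s / n, "samples": n}
        for (d, h), (s, n) in sorted(acc.items())
    ]
```

In production this logic would typically live in a scheduled warehouse query or Spark job rather than application code, but the count-weighted average is the part that is easy to get wrong.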


Frequently asked questions

What is an IoT telemetry pipeline?
An IoT telemetry pipeline is the cloud-side infrastructure that receives high-throughput sensor messages from connected devices, routes them through stream processing and storage layers, and delivers outputs to dashboards, alert systems, and downstream analytics consumers. It is responsible for scale, schema tolerance, enrichment, and end-to-end latency from device to dashboard.

How does an IoT telemetry pipeline work?
A managed IoT broker or MQTT cluster accepts authenticated device connections and fans messages into a durable message stream such as Kafka or Kinesis. Stream processors consume in real time to validate, enrich, aggregate, and score messages. Valid records land in a time-series database. Anomalous records route to an alerting service. Periodically a batch ETL job rolls up stored data into a warehouse for long-term analytics.

When do you need a dedicated telemetry pipeline?
A dedicated pipeline is appropriate when device count and message rate exceed what a simple HTTPS ingest endpoint can handle, when you need schema evolution tolerance across different firmware versions, when end-to-end latency requirements demand stream processing rather than batch, or when downstream consumers — dashboards, alert engines, ML training — need to consume the same data independently without coupling.

What are common mistakes when building one?
Common mistakes include coupling schema tightly to device firmware so a firmware update breaks the pipeline, using a message stream without configuring sufficient retention to allow consumer replay during outages, performing expensive enrichment synchronously in the hot path rather than asynchronously, and not separating the alerting evaluation path from the storage write path so a slow alert rule delays data persistence.

How does a telemetry pipeline differ from a sensor data pipeline?
A sensor data pipeline operates on or near the device: it handles sampling, noise filtering, feature extraction, and MQTT publish from firmware or an edge node. A telemetry pipeline operates in the cloud: it handles broker ingestion, stream processing, time-series storage, and alert routing. The sensor pipeline produces the messages; the telemetry pipeline consumes and routes them. Both are required in a complete IoT system.
```mermaid
flowchart LR
    Devices[IoT Devices\n100k+ endpoints] --> Broker[IoT Broker\nAWS IoT / Azure IoT Hub]
    Broker --> Stream[Message stream\nKinesis / Kafka / Pub-Sub]
    Stream --> SP[Stream processor\nFlink / Spark Streaming]
    Stream --> RawStore[(Raw message archive\nS3 / GCS cold storage)]
    SP --> Validate{Schema valid?}
    Validate -->|No| DLQ[Dead-letter queue\nfor inspection]
    Validate -->|Yes| Enrich[Enrich with device metadata\nfrom device registry]
    Enrich --> Window[Windowed aggregation\n1-min / 5-min averages]
    Window --> TSDB[(Time-series database\nInfluxDB / TimescaleDB)]
    Window --> AnomalyCheck{Anomaly score\nhigh?}
    AnomalyCheck -->|Yes| Alert[Alert service\nPagerDuty / SNS]
    AnomalyCheck -->|No| TSDB
    TSDB --> Dashboard[Grafana dashboard\nReal-time visualisation]
    TSDB --> BatchETL[Nightly batch ETL]
    BatchETL --> Warehouse[(Data warehouse\nBigQuery / Redshift)]
    DevRegistry[(Device registry\nMetadata store)] --> Enrich
```