Real-Time, All the Time: How We Streamed Data from Everywhere into Dashboards and APIs in Seconds
From hourly dashboards to sub-second APIs — how we built a blazing-fast, cost-efficient analytics pipeline using cloud-native tools and open-source tech.
Why We Built This
The ops team and analysts at CARS24 used to rely on hourly data syncs into Snowflake, visualized in Tableau. Meanwhile, internal services and APIs needing fresh insights had to deal with stale or incomplete data — not ideal when seconds matter.
We needed real-time visibility and instant data access — for both human dashboards and real-time DS/ML APIs.
So, we built a real-time data ingestion and analytics pipeline on GKE using:
- Google Pub/Sub
- Apache Kafka
- StarRocks
- MinIO + Google Cloud Storage
- Apache Superset
Now, the same live data that fuels dashboards also powers low-latency APIs for DS/ML use-cases — unlocking new automation, intelligence, and responsiveness across systems.
Data Ingestion/Delivery Architecture
Step 1: Unified Ingestion with Google Pub/Sub
We start by streaming data into Google Pub/Sub, which acts as our real-time message broker. This includes:
- BigQuery exports/CDC pushed as events
- Internal APIs pushing app metrics and events
- Third-party vendor data feeds
This gave us a consistent entry point for all inbound data, regardless of origin. Pub/Sub acts as the unified, decoupled transport layer that feeds the rest of our pipeline.
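To make this concrete, here is a minimal sketch of how an internal service can push an event into Pub/Sub. The project ID, topic name, and event fields are placeholders rather than our actual schema.

```python
import json

from google.cloud import pubsub_v1  # pip install google-cloud-pubsub

# Placeholder project and topic names, used purely for illustration.
PROJECT_ID = "cars24-analytics"
TOPIC_ID = "ops-events"

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path(PROJECT_ID, TOPIC_ID)


def publish_event(event: dict) -> str:
    """Serialize an event as JSON and publish it to Pub/Sub."""
    data = json.dumps(event).encode("utf-8")
    # publish() returns a future; result() blocks until Pub/Sub acknowledges the message.
    future = publisher.publish(topic_path, data=data, source="internal-api")
    return future.result()


if __name__ == "__main__":
    message_id = publish_event({"lead_id": 12345, "stage": "inspection_done"})
    print(f"published message {message_id}")
```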
Step 2: From Pub/Sub to Kafka (Yes, Both!)
Why both? Because:
- Pub/Sub is great for ingesting external data.
- Kafka gives us better buffering, replayability, and tooling for downstream processing.
We use the Pub/Sub Source Connector to stream messages directly into Kafka topics. This makes the system more flexible and robust — with Kafka acting as a durable message bus between ingestion and analytics.
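For illustration, wiring this up is essentially one REST call to the Kafka Connect cluster. The connector class comes from Google's Pub/Sub Kafka connector; the endpoint, project, subscription, and topic names below are placeholders.

```python
import requests  # pip install requests

# Placeholder Kafka Connect endpoint; the Connect workers run inside our GKE cluster.
CONNECT_URL = "http://kafka-connect:8083/connectors"

connector = {
    "name": "pubsub-to-kafka-ops-events",
    "config": {
        # Source connector class from Google's Pub/Sub Kafka connector.
        "connector.class": "com.google.pubsub.kafka.source.CloudPubSubSourceConnector",
        "tasks.max": "4",
        # Pub/Sub project and subscription to pull from (placeholders).
        "cps.project": "cars24-analytics",
        "cps.subscription": "ops-events-sub",
        # Kafka topic the messages are written to.
        "kafka.topic": "ops_events",
        "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    },
}

resp = requests.post(CONNECT_URL, json=connector, timeout=10)
resp.raise_for_status()
print(resp.json())
```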
Step 3: Real-Time Ingestion into StarRocks
Here’s where the magic happens.
We use StarRocks’ Routine Load feature to ingest Kafka topics continuously. No more batch jobs, no more delays — just real-time data landing in StarRocks tables, ready to be queried.
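A trimmed-down Routine Load job looks roughly like the sketch below. Since the StarRocks frontend speaks the MySQL protocol, it can be submitted from Python with a standard MySQL client; the database, table, topic, and broker names are placeholders, and real jobs carry more column mappings and tuning properties.

```python
import pymysql  # pip install pymysql; the StarRocks FE speaks the MySQL protocol

conn = pymysql.connect(host="starrocks-fe", port=9030, user="etl", password="***", database="ops")

# Continuously consume the ops_events Kafka topic into the ops_events table.
# Assumes the JSON keys in each message match the table's column names.
ROUTINE_LOAD_SQL = """
CREATE ROUTINE LOAD ops.load_ops_events ON ops_events
PROPERTIES (
    "format" = "json",
    "desired_concurrent_number" = "3"
)
FROM KAFKA (
    "kafka_broker_list" = "kafka-0.kafka:9092",
    "kafka_topic" = "ops_events",
    "property.kafka_default_offsets" = "OFFSET_END"
)
"""

with conn.cursor() as cur:
    cur.execute(ROUTINE_LOAD_SQL)
    # Job health can then be checked with: SHOW ROUTINE LOAD FOR ops.load_ops_events
conn.close()
```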
Why StarRocks?
- Blazing fast query performance
- Native support for Kafka ingestion
- Scales horizontally
- Great for real-time analytics workloads
- Materialized views that further improve query performance
We run StarRocks in shared-data mode, which separates compute from storage. This lets us autoscale Compute Nodes (CNs) based on real-time load using Kubernetes HPA.
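As a rough sketch of the autoscaling piece, the HPA below targets a hypothetical starrocks-cn StatefulSet on CPU utilization; the names, namespace, and thresholds are illustrative, not our production values.

```python
from kubernetes import client, config, utils  # pip install kubernetes

config.load_kube_config()  # use load_incluster_config() when running inside the cluster
api_client = client.ApiClient()

# HPA targeting a hypothetical StarRocks CN StatefulSet; names and thresholds are placeholders.
hpa_manifest = {
    "apiVersion": "autoscaling/v2",
    "kind": "HorizontalPodAutoscaler",
    "metadata": {"name": "starrocks-cn-hpa", "namespace": "starrocks"},
    "spec": {
        "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "StatefulSet", "name": "starrocks-cn"},
        "minReplicas": 2,
        "maxReplicas": 10,
        "metrics": [
            {
                "type": "Resource",
                "resource": {"name": "cpu", "target": {"type": "Utilization", "averageUtilization": 70}},
            }
        ],
    },
}

utils.create_from_dict(api_client, hpa_manifest)
```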
Step 4: Smart Storage with MinIO & GCS
Storage costs can escalate quickly, especially with SSDs. So we introduced tiered storage:
- Hot Data → Stored on MinIO (an S3-compatible object store inside our cluster)
- Cold Data → Offloaded to Google Cloud Storage (GCS) to save cost
StarRocks integrates smoothly with this setup. Queries hit hot storage for recent data and fetch older data from cold storage only when needed — with minimal performance impact. StarRocks also has a built-in data tiering feature that can move data from SSD to HDD; it's useful when you want tiered data to come back faster than it would from object storage, but it comes at the cost of provisioning HDDs.
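A simplified version of how a MinIO-backed storage volume can be declared in shared-data mode is sketched below; the bucket, endpoint, and credentials are placeholders, and the property names should be verified against the StarRocks version you run.

```python
import pymysql  # the StarRocks FE speaks the MySQL protocol

conn = pymysql.connect(host="starrocks-fe", port=9030, user="admin", password="***")

# Storage volume backed by the in-cluster MinIO (S3-compatible); bucket, endpoint, and
# credentials are placeholders.
STORAGE_VOLUME_SQL = """
CREATE STORAGE VOLUME minio_hot
TYPE = S3
LOCATIONS = ("s3://starrocks-hot/")
PROPERTIES (
    "aws.s3.endpoint" = "http://minio.minio.svc.cluster.local:9000",
    "aws.s3.region" = "us-east-1",
    "aws.s3.use_instance_profile" = "false",
    "aws.s3.access_key" = "<access-key>",
    "aws.s3.secret_key" = "<secret-key>"
)
"""

with conn.cursor() as cur:
    cur.execute(STORAGE_VOLUME_SQL)
    # Individual tables can then opt in via PROPERTIES ("storage_volume" = "minio_hot").
conn.close()
```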
💰 Result: Big storage savings, no compromise on query performance.
Step 5: Real-Time Dashboards with Superset
Finally, we plugged Apache Superset into StarRocks.
Superset is fast, open-source, and user-friendly. It lets the team:
- Build and view dashboards in real time
- Run complex queries with sub-second response times
- Skip the wait for the Snowflake sync or Tableau refresh
- Work on a fully open-source analytics stack (bye-bye, Tableau license fees!)
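Under the hood, Superset talks to StarRocks through a SQLAlchemy URI. The snippet below is a small sanity check of such a connection, assuming the starrocks SQLAlchemy dialect; the user, host, database, and table are placeholders, and the same URI goes into Superset's database connection form.

```python
from sqlalchemy import create_engine, text  # pip install sqlalchemy starrocks

# Placeholder URI; the same string goes into Superset's "Connect a database" form.
engine = create_engine("starrocks://superset_ro:***@starrocks-fe:9030/ops")

with engine.connect() as conn:
    freshness = conn.execute(
        text("SELECT count(*) FROM ops_events WHERE event_time >= NOW() - INTERVAL 5 MINUTE")
    ).scalar()
    print(f"events ingested in the last 5 minutes: {freshness}")
```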
Before vs. After
Key Takeaways
- Decoupling ingestion from processing via Pub/Sub and Kafka gave us flexibility and reliability.
- StarRocks provided the perfect balance of real-time performance and cost efficiency.
- Tiered storage with MinIO and GCS dramatically cut down our storage costs.
- Superset + StarRocks made real-time analytics truly self-service for the Ops team.
This pipeline has completely transformed how we operate. Our Ops team no longer waits for stale dashboards — they act on live data. And it’s not just people who benefit:
Our real-time DS/ML APIs now fetch up-to-the-second data directly from StarRocks, powering internal tools, automations, and services that rely on fast, accurate insights. Whether it’s a live personalization service or a backend service making decisions based on operational metrics — they’re all plugged into the same real-time engine.
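As an illustration of that pattern, a low-latency endpoint can query StarRocks directly over the MySQL protocol. The service, table, and columns below are hypothetical, not one of our actual APIs.

```python
import pymysql
from fastapi import FastAPI  # pip install fastapi uvicorn pymysql

app = FastAPI()


def query_one(sql: str, params: tuple):
    """Run a single parameterized query against StarRocks over the MySQL protocol."""
    conn = pymysql.connect(host="starrocks-fe", port=9030, user="api_ro", password="***", database="ops")
    try:
        with conn.cursor() as cur:
            cur.execute(sql, params)
            return cur.fetchone()
    finally:
        conn.close()


@app.get("/leads/{lead_id}/latest-stage")
def latest_stage(lead_id: int):
    # Hypothetical endpoint: return the most recent pipeline stage for a lead, straight from live data.
    row = query_one(
        "SELECT stage, event_time FROM ops_events WHERE lead_id = %s ORDER BY event_time DESC LIMIT 1",
        (lead_id,),
    )
    if row is None:
        return {"lead_id": lead_id, "stage": None}
    return {"lead_id": lead_id, "stage": row[0], "event_time": str(row[1])}
```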
We’ve reduced costs, improved performance, and unlocked a new level of agility and intelligence across the organization.
Future Work & Limitations
While our real-time pipeline has significantly improved visibility and responsiveness across teams, it’s still evolving — and there are a few areas we’re actively exploring and others where trade-offs exist.
Schema Evolution & Data Contracts
Right now, handling schema changes across producers (Pub/Sub, Kafka) and consumers (StarRocks, APIs) is manual and fragile. We’re exploring:
- Schema registry integration (e.g., Confluent Schema Registry)
- Enforced data contracts between producers and downstream consumers
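A sketch of what a schema-registry-backed producer could look like with Confluent's Python client is below; the schema, registry URL, and topic are illustrative, and this is one possible approach rather than something we run in production.

```python
from confluent_kafka import SerializingProducer  # pip install "confluent-kafka[avro]"
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer

# Illustrative Avro contract; records that don't conform to the schema fail at
# serialization time instead of silently breaking downstream consumers.
OPS_EVENT_SCHEMA = """
{
  "type": "record",
  "name": "OpsEvent",
  "fields": [
    {"name": "lead_id", "type": "long"},
    {"name": "stage", "type": "string"},
    {"name": "event_time", "type": "string"}
  ]
}
"""

registry = SchemaRegistryClient({"url": "http://schema-registry:8081"})
serializer = AvroSerializer(registry, OPS_EVENT_SCHEMA)

producer = SerializingProducer({
    "bootstrap.servers": "kafka-0.kafka:9092",
    "value.serializer": serializer,
})

producer.produce(
    topic="ops_events",
    value={"lead_id": 12345, "stage": "inspection_done", "event_time": "2025-01-01T00:00:00Z"},
)
producer.flush()
```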
High-Cardinality Metrics & Joins
StarRocks handles high concurrency and heavy ingestion well, but:
- Joins across large datasets can get expensive in real time
- High-cardinality dimensions (e.g., user-level analytics) need careful modeling to avoid bloated tables or slow queries
We’re experimenting with materialized views, pre-aggregations, and even hybrid models (mixing real-time and batch where needed).
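As one example of the pre-aggregation direction, an asynchronous materialized view in StarRocks can roll events up per minute so dashboards hit a small rollup instead of the raw event table. The view, table, columns, and refresh interval below are placeholders.

```python
import pymysql  # submitted over the MySQL protocol, like the Routine Load job above

conn = pymysql.connect(host="starrocks-fe", port=9030, user="etl", password="***", database="ops")

# Asynchronous materialized view that pre-aggregates events per minute and stage.
MV_SQL = """
CREATE MATERIALIZED VIEW ops.stage_counts_per_minute
REFRESH ASYNC EVERY (INTERVAL 1 MINUTE)
AS
SELECT
    date_trunc('minute', event_time) AS minute_bucket,
    stage,
    count(*) AS events
FROM ops.ops_events
GROUP BY date_trunc('minute', event_time), stage
"""

with conn.cursor() as cur:
    cur.execute(MV_SQL)
conn.close()
```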
Lakehouse Integration
As we continue to scale our real-time data platform, we’re exploring how to integrate it with lakehouse architecture — to bridge the gap between streaming, historical, and analytical workloads.
Currently, StarRocks serves as our high-speed analytical engine, but it doesn’t store long-term, large-volume datasets. Integrating with a lakehouse would enable us to:
- Persist raw + processed data in an open format (e.g., Parquet, Iceberg, or Delta Lake)
- Enable retrospective analysis beyond the retention period of StarRocks
- Run AI/ML workflows and batch analytics on historical data without overloading our real-time pipeline
- Support replays or reprocessing from deep storage when business logic changes
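One way this could look is an Iceberg external catalog registered in StarRocks, so real-time tables and lake tables are queryable side by side. The catalog name and metastore endpoint below are illustrative, and this remains an exploration rather than something we run today.

```python
import pymysql  # again over the MySQL protocol

conn = pymysql.connect(host="starrocks-fe", port=9030, user="admin", password="***")

# External catalog pointing at an Iceberg lakehouse via a Hive metastore; the catalog name
# and metastore URI are placeholders, and property names should be checked against your version.
ICEBERG_CATALOG_SQL = """
CREATE EXTERNAL CATALOG lake
PROPERTIES (
    "type" = "iceberg",
    "iceberg.catalog.type" = "hive",
    "hive.metastore.uris" = "thrift://hive-metastore:9083"
)
"""

with conn.cursor() as cur:
    cur.execute(ICEBERG_CATALOG_SQL)
    # Historical data could then be queried in place, e.g. SELECT ... FROM lake.<db>.<table>
conn.close()
```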
The long-term goal is to have a fully unified architecture:
Data is streamed, stored, queried, and analyzed — across real-time and historical timelines — from one logical platform.