Solution ideas
This article describes a solution idea. Your cloud architect can use this guidance to help visualize the major components for a typical implementation of this architecture. Use this article as a starting point to design a well-architected solution that aligns with your workload's specific requirements.
This article presents a solution for using Azure Kubernetes Service (AKS) to quickly process and analyze a large volume of streaming data from devices.
*ApacheĀ®, Apache Kafka, and Apache Spark are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks. Splunk is a registered trademark of Cisco. *
Architecture
Download a Visio file of this architecture.
Dataflow
- Sensors generate data and stream it to Azure API Management.
- An AKS cluster runs microservices that are deployed as containers behind a service mesh. The containers are built by using a DevOps process. The container images are stored in Azure Container Registry.
- An ingest service in AKS stores data in Azure Cosmos DB.
- Asynchronously, an analysis service in AKS receives the data and streams it to Apache Kafka on Azure HDInsight.
- Data scientists use machine learning models on Azure HDInsights and the Splunk platform to analyze the data.
- A processing service in AKS processes the data and stores the results in Azure Database for PostgreSQL. The service also caches the data in Azure Cache for Redis.
- A web app that runs in Azure App Service creates visualizations of the results.
Components
The solution uses the following key technologies:
- API Management
- App Service
- Azure Cache for Redis
- Container Registry
- Azure Cosmos DB
- Azure Database for PostgreSQL
- HDInsight
- AKS
- Azure Pipelines
Scenario details
This solution is a good fit for a scenario that involves millions of data points, where data sources include Internet of Things (IoT) devices, sensors, and vehicles. In such a situation, processing the large volume of data is one challenge. Quickly analyzing the data is another demanding task, as organizations seek to gain insight into complex scenarios.
Containerized microservices in AKS form a key part of the solution. These self-contained services ingest and process the real-time data stream. They also scale as needed. The containers' portability makes it possible for the services to run in different environments and process data from multiple sources. To develop and deploy the microservices, DevOps and continuous integration/continuous delivery (CI/CD) are used. These approaches shorten the development cycle.
To store the ingested data, the solution uses Azure Cosmos DB. This database elastically scales throughput and storage, which makes it a good choice for large volumes of data.
The solution also uses Apache Kafka. This low-latency streaming platform handles real-time data feeds at extremely high speeds.
Another key solution component is Azure HDInsight, which is a managed cloud service that enables you to efficiently process massive amounts of data using the most popular open source frameworks. Azure HDInsight simplifies running big data frameworks in large volume and velocity while using Apache Spark in Azure. Splunk helps in the data analysis process. Splunk creates visualizations from real-time data and provides business intelligence.
Potential use cases
This solution benefits the following areas:
- Vehicle safety, especially in the automotive industry
- Customer service in retail and other industries
- Healthcare cloud solutions
- Financial technology solutions in the finance industry
Next steps
Product documentation:
- About Azure Cache for Redis
- What is Azure API Management?
- App Service overview
- Azure Kubernetes Service
- Introduction to Azure container registry
- Welcome to Azure Cosmos DB
- What is Azure Database for PostgreSQL?
- What is Azure HDInsight?
- What is Azure Pipelines?
Microsoft training modules:
- Build and store container images with Azure Container Registry
- Configure Azure App Service plans
- Work with Azure Cosmos DB
- Create and connect to an Azure Database for PostgreSQL
- Develop for Azure Cache for Redis
- Explore API Management
- Introduction to Azure HDInsight