
Real-time Data Processing and Insights from Seoul Bus Data

Date: 2023-04-03
Stacks: Kafka, Druid, Superset, MySQL, React

GitHub Link

https://github.com/KEA-ACCELER/kafka-druid-superset

📢 Presentation Video

https://youtu.be/kDg0ZoLWLRs

📖 Presentation Material

⭐️ Project Overview

This project built a system that collects, processes, analyzes, and visualizes real-time bus boarding, alighting, and bus stop data using Kafka, Kafka Streams, Druid, and Superset. The system was built in the following steps:

  • First, Kafka was set up as a message bus to collect and deliver real-time bus boarding, alighting, and bus stop data. Kafka is known for its high throughput and scalability and can integrate with a wide range of data sources.
  • Next, Kafka Streams was used to process the streaming boarding, alighting, and bus stop data. Kafka Streams is a client library for processing data held in Kafka, and it supports complex business logic. For example, it can compute and publish real-time statistics such as passenger counts, boarding ratios per bus stop, and bus operating status.
  • Then, Druid served as the real-time analytical database for the boarding, alighting, and bus stop data. Druid is an open-source, high-performance database designed for real-time analytics that can query and aggregate large volumes of data quickly. It can ingest and index data from Kafka in real time and provides a variety of aggregation functions and filters.
  • Finally, Superset was used to build a BI platform that visualizes the boarding, alighting, and bus stop data in charts and dashboards. Superset is an open-source BI platform that connects to Druid, offers a user-friendly interface with many chart types, and supports dashboards that refresh automatically.
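The per-stop statistics mentioned in the Kafka Streams step can be illustrated with a small, library-free sketch. It is not the project's code: the `RideEvent` fields, the 60-second tumbling window, and the definition of the boarding ratio (boardings divided by boardings plus alightings) are all assumptions made for illustration, and the events are plain Python objects rather than records read from a Kafka topic.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RideEvent:
    """One boarding/alighting record. Field names are hypothetical."""
    stop_id: str
    timestamp: datetime
    boarded: int
    alighted: int


def window_start(ts: datetime, window_seconds: int) -> datetime:
    """Floor a timestamp to the start of its tumbling window (UTC)."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % window_seconds, tz=timezone.utc)


def aggregate(events, window_seconds=60):
    """Sum boardings and alightings per (stop, window) and derive a boarding ratio."""
    totals = defaultdict(lambda: {"boarded": 0, "alighted": 0})
    for event in events:
        key = (event.stop_id, window_start(event.timestamp, window_seconds))
        totals[key]["boarded"] += event.boarded
        totals[key]["alighted"] += event.alighted

    results = {}
    for key, counts in totals.items():
        moved = counts["boarded"] + counts["alighted"]
        # Assumed definition: share of passenger movements that were boardings.
        ratio = counts["boarded"] / moved if moved else 0.0
        results[key] = {**counts, "boarding_ratio": ratio}
    return results


if __name__ == "__main__":
    sample = [
        RideEvent("stop-A", datetime(2023, 4, 3, 8, 0, 10, tzinfo=timezone.utc), 3, 1),
        RideEvent("stop-A", datetime(2023, 4, 3, 8, 0, 50, tzinfo=timezone.utc), 1, 3),
        RideEvent("stop-A", datetime(2023, 4, 3, 8, 1, 5, tzinfo=timezone.utc), 2, 0),
    ]
    for (stop, start), stats in sorted(aggregate(sample).items()):
        print(stop, start.isoformat(), stats)
```

In a real deployment this grouping would typically be expressed as a windowed aggregation in Kafka Streams, with the results written to an output topic for Druid to ingest. The sketch only shows the shape of the computation.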
👬 Team Composition

Team Members: 5

🔨 Responsibilities

  • Development of Front-End Page with React
  • Structuring the Model Architecture
  • Creating Insights from data
  • Producing SQL Queries for Insights
  • Analysing the Insights
⚒️ Technologies and Libraries Used

💡 Reflections

  • Through this system, we gained real-time insights into bus boarding, alighting, and bus stop data that can support more efficient traffic management and service improvements. The project gave us hands-on experience in real-time data analysis and visualization. We plan to keep extending the system built in this project.
  • The process of receiving large amounts of data in real-time, visualizing it by topic, and gaining data insights was intriguing and novel. It sparked my interest in using this technology for even larger datasets and services.