CONSTRUCTION OF AN ETL PIPELINE FOR TRAFFIC DATA ANALYSIS USING PUBLIC SOURCES
Abstract
The growth of vehicle traffic in cities has intensified urban mobility challenges, requiring data-driven solutions to support traffic planning and management. This work presents the development of an Extract, Transform, Load (ETL) pipeline for the automated processing of public traffic data from the city of Curitiba. The solution was implemented with open-source technologies (Python, Apache Airflow, and MySQL) to provide scalability, automation, and integration across different data sources. The results show that the pipeline effectively collects, transforms, and stores the data, enabling analytical visualizations in Power BI that highlight spatial and temporal patterns of traffic occurrences. The study reinforces the importance of Data Engineering as a tool to support decision-making and the optimization of urban mobility.
